Demystifying Regular Expressions
If you miss an attacker on your network, it’s probably not because you don’t have enough data. It’s more likely that you have too much data.
I hate when I know the data I want is there, but I just can’t find it.
This happened to me a lot early in my career, particularly when writing Snort signatures or searching for things in a SIEM. Most tools have limited native matching capability, so it always came down to writing regular expressions to dig deeper. I avoided them like the plague.
Seriously, a human wrote this?
Is this real life?
Regular expressions are confusing, hard to read, and it seems like you have to learn a whole new programming language just to make sense of them. My struggles with regex went on for years.
Eventually, I’d run into scenarios where I was at an impasse. Situations like:
- Writing a signature to detect an exploit kit, but it involved matching a complex HTTP URI string — I feel like the malware authors do this just to mess with me.
- Searching through authentication logs to match specifically formatted usernames, like steve7.smith.sales — seriously, why did sales hire so many Steve Smiths?!
- Parsing threat intelligence feeds to match and remove problematic indicators — I’m looking at you 127.0.0.1.
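To make the last two scenarios concrete, here is a minimal Python sketch. The username convention (first name, optional digits, last name, department) and the sample indicators are assumptions invented for illustration; a real environment would need patterns tuned to its own naming rules and feed format.

```python
import re

# Assumed convention: first name, optional digits, ".", last name, ".", department
# (for example steve7.smith.sales). Adjust to your own naming scheme.
USERNAME_RE = re.compile(r"[a-z]+\d*\.[a-z]+\.[a-z]+")

# Anything in 127.0.0.0/8 is loopback and rarely belongs in an indicator feed.
# The dots are escaped so they match a literal "." rather than any character.
LOOPBACK_RE = re.compile(r"127\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def is_formatted_username(name: str) -> bool:
    """Return True only if the whole string follows the assumed convention."""
    return USERNAME_RE.fullmatch(name) is not None


def drop_loopback(indicators):
    """Return the indicators with loopback addresses removed."""
    return [i for i in indicators if LOOPBACK_RE.fullmatch(i) is None]


usernames = ["steve7.smith.sales", "steve.smith", "admin"]
matched = [u for u in usernames if is_formatted_username(u)]

feed = ["127.0.0.1", "203.0.113.7", "10.127.0.1"]
cleaned = drop_loopback(feed)
```

Using `fullmatch` rather than `search` is a deliberate choice here: it keeps a pattern from matching a fragment buried inside a longer string, which is a common source of false positives.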
I’d just brush off my shortcoming and focus on other problems, or spend a lot of time manually digging through data in my terminal or Excel. When I absolutely had to figure something out, I would dig around until I found a similar regex and modify it until it fit my use case. I didn’t understand what I was doing or what I had created. It got me through, but it also resulted in a lot of wasted time and false positives.
I was in denial, but I eventually realized a harsh truth.
Writing regular expressions is a critical skill for any security analyst. If you can’t do it, you won’t be able to fully do the things you need to hunt down attackers and investigate incidents.
My inability to write regular expressions was holding me back from doing things that were essential for my job and I wasn’t going to be a great analyst until I figured it out.
For the next several nights I locked my door, turned my phone off, stocked up on Coke Zero, and started researching everything I could about regular expressions. I read about how they were formed, what the components were, and different implementations. After several days of research, I tried to apply what I had learned in practical situations.
You know what happened? Nothing. I wasn’t any better at getting useful results. Not only that, I had so much extra nonsense floating around in my head it only slowed me down further and made me more confused. What was wrong?
Here’s the thing…
All the material I was consuming wasn’t written for me.
It was written for career developers, engineers, and research scientists…but not for security practitioners. Literally, none of the examples I had found would help me detect the latest Sofacy variant or find non-standard user agents in outbound HTTP requests.
I needed something that would connect the theory to the practical in a meaningful way. Unfortunately, that didn’t exist at the time.
When I started learning about regex, I set a goal: I didn’t want to be intimidated by regular expressions anymore.
And now I want to teach you how not to be intimidated by them too, because your ability to write and read regular expressions quickly and effectively is one of the greatest career-enhancing skills I can give you.
If you’ve ever avoided writing a detection rule because the engine’s native matching capability wasn’t robust enough…
If you’ve ever exported data to Excel and manually gone through it because you couldn’t get your query specific enough…
If you’ve ever had your eyes glaze over while trying to understand a complex search or rule…
Then it’s time to learn a new skill.
Most security analysis and detection tools support regular expressions to make up for the limits of their native matching. This means that if you can write regular expressions, you can search far more precisely than a tool’s built-in options allow. This applies to IDS engines, SIEMs, and even command-line tools like grep.
The phrase “searching for a needle in a haystack” is overused, but it’s a serious component of what security analysts do. A large part of our success is contingent on being able to search through large repositories of data and match things that meet very specific criteria.
Demystifying Regular Expressions will help you do exactly that.
Demystifying Regular Expressions is designed specifically for security practitioners so that you will get the maximum amount of value by learning regular expressions using scenarios you’ll actually encounter in the real world.
You’ll go through specific examples related to:
- Writing host-based detection with YARA
- Searching through line-based logs with grep
- Writing detection rules with Snort
- Matching host logs in SIEMs like LogRhythm
Writing regular expressions is easier when you have the right tools. The course will introduce you to tools like Espresso and RegEx Buddy and show how they can ease the development process and help you understand and troubleshoot complex expressions.
Demystifying Regular Expressions also comes with a free license for RegEx Buddy, one of my favorite tools for creating the expressions I use.
If you’ve tried to understand regular expressions but never got quite comfortable with writing them, then this is the course for you.
In this course, you’ll learn:
- The most common uses of regular expressions and how to apply them in places you weren’t even aware of.
- The process of iteratively building and testing regular expressions for things you want to match.
- Techniques for overcoming common gotchas like dealing with whitespace.
- How to evaluate the efficiency of an expression by the number of steps it takes to match.
- A definitive guide to escaping so you’ll know when and how to do it.
- How quantifiers can be used to match a specific number of occurrences.
- How to use capture groups to reference specific matched content and perform additional operations on it.
- Complex behavioral structures like lookarounds and conditionals.
- The use of modifiers to control case sensitivity, enable free-spacing, or match in single-line mode.
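Several of the topics above can be seen together in one short sketch. This one uses Python’s `re` module; the log line is invented for illustration, and the patterns are one possible way to write them rather than the course’s own examples.

```python
import re

# Hypothetical authentication log line, made up for this example.
line = "Failed password for ADMIN from 203.0.113.7 port 51514"

# Escaping and quantifiers: \. is a literal dot, \d{1,3} is one to three
# digits, and {3} repeats the non-capturing group exactly three times.
ip_re = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")
ips = ip_re.findall(line)

# Capture groups and modifiers: named groups pull out the pieces we care
# about, IGNORECASE ignores letter case, and VERBOSE enables free-spacing.
login_re = re.compile(
    r"""
    failed\ password\ for\s+          # spaces must be escaped in verbose mode
    (?P<user>\w+)                     # capture group: the account name
    \s+from\s+
    (?P<ip>\d{1,3}(?:\.\d{1,3}){3})   # capture group: dotted-quad address
    """,
    re.IGNORECASE | re.VERBOSE,
)
login = login_re.search(line)
user = login.group("user") if login else None
source_ip = login.group("ip") if login else None

# Lookaround: the lookbehind requires "port " before the digits without
# making it part of the match.
port_re = re.compile(r"(?<=port )\d+")
port_match = port_re.search(line)
port = port_match.group(0) if port_match else None
```

Note that `\d{1,3}` also accepts values such as 999, so this pattern checks the shape of an address, not whether it is valid. Whether that trade-off is acceptable depends on the data you are searching.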
You can view the full course syllabus here.
Demystifying Regular Expressions includes:
- Over 5 hours of demonstration videos. These videos will break down the individual components of regular expressions and demonstrate how to use them.
- A free RegEx Buddy license. This is one of my favorite tools for building, testing, and troubleshooting my expressions. The tool retails for $50.
- Hands-on labs to help you develop and test your skills. You’ll start with individual labs built around specific concepts and eventually work your way up to practical labs that mirror scenarios you’ll encounter on the job.
- Participation in our student charitable profit sharing program. A few times a year we designate a portion of our proceeds for charitable causes. AND students get to take part in nominating charities that are important to them to receive these donations.
- 6-month access to course video lectures and lab exercises. You can extend access later if you need more time.
- A Certificate of Completion
- 10 Continuing Education Credits (CPEs/CEUs)
Meet the Instructor – Darrel Rendell
Darrel Rendell has worked in InfoSec for over a decade, being fortunate to fall out of college and into the arms of a major antivirus vendor. He has since had stints with multiple prominent InfoSec names and is currently the Principal Threat Intelligence Analyst with PhishMe.
Darrel has been writing regular expressions longer than he could program. He’s used almost every mainstream flavour, dialect, engine, and iteration available. He’s even built regular expression generators and his own engine!
Frequently Asked Questions
Q: I’m not a programmer. Will this be over my head?
A: This course is designed specifically for security practitioners and assumes no prior programming knowledge.
Q: Is this course live?
A: This is NOT a live course. It’s an online video course you can take at your own pace.
Q: How long do I have access to the course material?
A: You have access to the course for six months following your purchase date.
Q: How much time does it take to do this course?
A: We recommend taking the course over a 2 to 3-week period. You can go through all the material in a couple of days but taking time to complete the labs and practice what you’re learning is critical for retention.
Q: How many CPEs/CEUs is this course worth?
A: Organizations calculate continuing education credits in different ways, but they are often based on the length of the training. This course averages 10 hours of video+lab work.
“The course does a great job of explaining and easing into things. The most useful thing I learned was about efficiency and lookaheads/lookbehinds. The latter open quite a few possibilities for capturing that I hadn’t considered before. I was surprised to learn about the different applications for Regex. I tend to think of it in a very specific use-case (Splunk) and it was a neat experience seeing other applications that use it.” – Adam Army
“Demystifying Regular Expressions is a great course for understanding regular expressions and applications to information security. It surprised me how easy regular expressions are if the basics are understood.” – Kevin Stone