20. Regular Expressions
20.1 Overview

JMeter includes the pattern matching software Apache Jakarta ORO.

There is some documentation for this on the Jakarta web-site, for example a summary of the pattern matching characters.

There is also documentation on an older incarnation of the product in the OROMatcher User's guide, which might prove useful.

The pattern matching is very similar to the pattern matching in Perl. A full installation of Perl will include plenty of documentation on regular expressions; look for perlrequick, perlretut, perlre and perlreref.

It is worth stressing the difference between "contains" and "matches", as used on the Response Assertion test element:

  • "contains" means that the regular expression matched at least some part of the target, so 'alphabet' "contains" 'ph.b.' because the regular expression matches the substring 'phabe'.
  • "matches" means that the regular expression matched the whole target. So 'alphabet' is "matched" by 'al.*t'.

In this case, it is equivalent to wrapping the regular expression in ^ and $, viz '^al.*t$'.

However, this is not always the case. For example, the regular expression 'alp|.lp.*' is "contained" in 'alphabet', but does not match 'alphabet'.

Why? Because when the pattern matcher finds the sequence 'alp' in 'alphabet', it stops trying any other combinations - and 'alp' is not the same as 'alphabet', as it does not include 'habet'.
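
The basic distinction between "contains" and "matches" can be sketched outside JMeter. The example below is only an illustration: it uses Python's standard re module as a stand-in for the ORO matcher, on the assumption that re.search corresponds to "contains" and re.fullmatch to "matches". The alternation caveat described above is specific to how the ORO matcher evaluates "matches" and may not reproduce in other engines, so it is deliberately left out.

```python
import re

target = "alphabet"

# "contains": the pattern may match any substring of the target.
contains = re.search(r"ph.b.", target)

# "matches": the pattern must account for the whole target.
matches = re.fullmatch(r"al.*t", target)

# For this pattern, anchoring by hand with ^ and $ gives the same result.
anchored = re.search(r"^al.*t$", target)

# 'ph.b.' is contained in the target but does not match all of it.
partial_only = re.fullmatch(r"ph.b.", target)

print(contains.group(0))   # phabe
print(matches.group(0))    # alphabet
print(anchored.group(0))   # alphabet
print(partial_only)        # None
```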

Note: unlike Perl, do not enclose the regular expression in //. To use the Perl modifiers (i, s, m, x etc.) without a trailing /, use Perl5 extended regular expressions: for example, /abc/i becomes (?i)abc.
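
As a rough illustration of the inline modifier syntax, the sketch below again uses Python's re module rather than ORO (an assumption made for the sake of a runnable example; both accept a leading (?i)). The sample text is made up.

```python
import re

sample = "xxABCxx"  # hypothetical input

# (?i) at the start of the pattern plays the role of Perl's trailing /i.
case_insensitive = re.search(r"(?i)abc", sample) is not None

# Without the modifier the match is case-sensitive and fails here.
case_sensitive = re.search(r"abc", sample) is not None

print(case_insensitive)  # True
print(case_sensitive)    # False
```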


20.2 Examples

Extract single string

Suppose you want to match the following portion of a web-page:

name="file" value="readme.txt"

and you want to extract readme.txt.

A suitable regular expression would be:

name="file" value="(.+?)"

The special characters above are:

  • ( and ) - these enclose the portion of the match string to be returned
  • . - match any character
  • + - one or more times
  • ? - don't be greedy, i.e. stop when first match succeeds

Note: without the ?, the .+ would continue past the first " until it found the last possible " - probably not what was intended.
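
The effect of the ? can be seen in a small sketch. It uses Python's re module as a stand-in for the JMeter matcher, and the trailing size="12" attribute is a hypothetical addition so that the line contains a later " for the greedy version to run on to.

```python
import re

# Hypothetical page fragment; size="12" is added only for illustration.
page = 'name="file" value="readme.txt" size="12"'

# Non-greedy: the group stops at the first closing quote.
lazy = re.search(r'name="file" value="(.+?)"', page)

# Greedy: the group runs on to the last quote on the line.
greedy = re.search(r'name="file" value="(.+)"', page)

print(lazy.group(1))    # readme.txt
print(greedy.group(1))  # readme.txt" size="12
```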

Extract multiple strings

Suppose you want to match the following portion of a web-page:

name="file.name" value="readme.txt"

and you want to extract file.name and readme.txt.

A suitable regular expression would be:

name="(.+?)" value="(.+?)"

This would create 2 groups, which could be used in the JMeter Regular Expression Extractor template as $1$ and $2$.

The JMeter Regex Extractor saves the values of the groups in additional variables.

For example, assume:

  • Reference Name: MYREF
  • Regex: name="(.+?)" value="(.+?)"
  • Template: $1$$2$

Note: do not enclose the regular expression in / /.

The following variables would be set:

  • MYREF: file.namereadme.txt
  • MYREF_g0: name="file.name" value="readme.txt"
  • MYREF_g1: file.name
  • MYREF_g2: readme.txt

These variables can be referred to later on in the JMeter test plan, as ${MYREF}, ${MYREF_g1} etc.
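
To make the naming scheme concrete, the sketch below rebuilds the same set of variables by hand. It is not JMeter code: extract_variables is a hypothetical helper, Python's re module stands in for the JMeter matcher, and the template handling is a simplified guess that only replaces $n$ placeholders with group n.

```python
import re

def extract_variables(ref_name, regex, template, text):
    """Hypothetical helper that mimics the variables listed above."""
    match = re.search(regex, text)
    if match is None:
        return {}
    # Replace each $n$ placeholder in the template with group n.
    value = re.sub(r"\$(\d+)\$",
                   lambda m: match.group(int(m.group(1))),
                   template)
    variables = {ref_name: value}
    # Group 0 is the whole match; groups 1..n are the bracketed parts.
    for index in range(match.re.groups + 1):
        variables[f"{ref_name}_g{index}"] = match.group(index)
    return variables

page = 'name="file.name" value="readme.txt"'
variables = extract_variables(
    "MYREF", r'name="(.+?)" value="(.+?)"', "$1$$2$", page)

for name, value in variables.items():
    print(f"{name}: {value}")
```

Run as written, this prints the same four name and value pairs as the list above.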


20.3 Line mode

The pattern matching behaves in slightly different ways, depending on the setting of the multi-line and single-line modifiers.

There are four possible combinations:

  • Default behavior. '.' matches any character except "\n". ^ matches only at the beginning of the string and $ matches only at the end or before a newline at the end.
  • Single-line modifier (?s): Treat string as a single long line. '.' matches any character, even "\n". ^ matches only at the beginning of the string and $ matches only at the end or before a newline at the end.
  • Multi-line modifier (?m): Treat string as a set of multiple lines. '.' matches any character except "\n". ^ and $ are able to match at the start or end of any line within the string.
  • Both modifiers (?sm): Treat string as a single long line, but detect multiple lines. '.' matches any character, even "\n". ^ and $, however, are able to match at the start or end of any line within the string.
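
The four combinations can be compared side by side with a short sketch. It assumes Python's re module as a stand-in for the JMeter matcher (the (?s) and (?m) modifiers have the same Perl-derived meaning for the two probes used here), and the two-line sample string is made up.

```python
import re

text = "one\ntwo\n"  # hypothetical two-line input

results = {}
for label, prefix in [("default", ""), ("(?s)", "(?s)"),
                      ("(?m)", "(?m)"), ("(?sm)", "(?sm)")]:
    # Does '.' match the newline between the two lines?
    dot_crosses_newline = re.search(prefix + r"one.two", text) is not None
    # Can ^ and $ anchor on a line inside the string?
    anchors_inner_line = re.search(prefix + r"^two$", text) is not None
    results[label] = (dot_crosses_newline, anchors_inner_line)

for label, (dot, anchors) in results.items():
    print(f"{label:8} dot matches newline: {dot!s:5}  "
          f"^/$ match inner line: {anchors}")

# In the default mode $ still matches before the final newline.
print(re.search(r"two$", text) is not None)  # True
```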




Copyright © 1999-2006, Apache Software Foundation Updated: $Date: 2006-05-07 22:21:01 +0100 (Sun, 07 May 2006) $