Unnamed capture groups are broken

Description

Some regexp patterns that include an unnamed capture group before a named one seem to cause the indices of the named groups to get out of sync, so the matcher no longer retrieves the correct matched value.

For example, the regular expression 'a(a|c)(?<named>b)'
applied to the input "acb" (without quotes)

causes matcher.group("named") to return "c" instead of the expected "b".
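
For reference, here is a minimal sketch of the expected values. It assumes the JDK's built-in java.util.regex (Java 7 or later) as a stand-in, not the library this issue is filed against; the class and method names are made up for illustration. It suggests where the "c" comes from: it is the content of group 1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; uses java.util.regex only, as a reference for the expected behavior.
class ExpectedNamedGroup {
    // Returns { group(1), group(2), group("named") } for a full match of input against regex.
    static String[] groupsFor(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            throw new IllegalStateException("input does not match: " + input);
        }
        return new String[] { m.group(1), m.group(2), m.group("named") };
    }

    public static void main(String[] args) {
        String[] g = groupsFor("a(a|c)(?<named>b)", "acb");
        System.out.println("group(1)       = " + g[0]);
        System.out.println("group(2)       = " + g[1]);
        System.out.println("group(\"named\") = " + g[2]);
    }
}
```

Here group(1) holds the unnamed '(a|c)' match and group(2) holds the named match, so a lookup that resolves "named" to index 1 would return the wrong text.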

Upon further inspection, it appears that the group name extractor function assigns the group "named" index 1 by inserting it at the front of its list. The underlying regexp, however, numbers '(a|c)' as group 1 and the "named" group as group 2, which is why the indices are out of sync.
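
One way the name-to-index mapping could be kept in sync is to count every capturing parenthesis, named or not, while scanning the pattern. The following is only a sketch of that idea, not the library's actual extractor: the class and method names are hypothetical, and it deliberately ignores character classes such as [(] and \Q...\E quoting.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the extractor used by the library in this issue.
class GroupIndexSketch {
    // Maps each named group to its 1-based capture index, counting unnamed groups too.
    static Map<String, Integer> namedGroupIndexes(String regex) {
        Map<String, Integer> indexes = new LinkedHashMap<>();
        int groupCount = 0;
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') { i++; continue; }  // skip the escaped character
            if (c != '(') continue;
            boolean special = i + 1 < regex.length() && regex.charAt(i + 1) == '?';
            if (!special) { groupCount++; continue; }  // plain unnamed capturing group
            // "(?<name>" captures; "(?<=" and "(?<!" are lookbehinds and do not
            if (regex.startsWith("(?<", i) && i + 3 < regex.length()
                    && regex.charAt(i + 3) != '=' && regex.charAt(i + 3) != '!') {
                int end = regex.indexOf('>', i);
                if (end < 0) continue;  // malformed group name; leave it unmapped
                groupCount++;
                indexes.put(regex.substring(i + 3, end), groupCount);
            }
        }
        return indexes;
    }

    // Convenience lookup; returns -1 when the name is not present.
    static int indexOf(String regex, String name) {
        Integer index = namedGroupIndexes(regex).get(name);
        return index == null ? -1 : index;
    }
}
```

With this counting, the "named" group in 'a(a|c)(?<named>b)' resolves to index 2, matching the numbering of the underlying regexp.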

A workaround is to make all unnamed groups non-capturing, i.e. use (?:...) instead of (...).
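
A short sketch of why the workaround helps, again assuming the JDK's java.util.regex as a stand-in rather than the affected library (class and method names are hypothetical): once the unnamed group is non-capturing, the named group is the only capturing group, so it really is group 1.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; java.util.regex is used only to show the group numbering.
class NonCapturingWorkaround {
    // Number of capturing groups the pattern defines.
    static int groupCount(String regex) {
        return Pattern.compile(regex).matcher("").groupCount();
    }

    // Text captured by group 1 for a full match of input against regex.
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            throw new IllegalStateException("input does not match: " + input);
        }
        return m.group(1);
    }

    public static void main(String[] args) {
        String workaround = "a(?:a|c)(?<named>b)";
        System.out.println("capturing groups: " + groupCount(workaround));
        System.out.println("group(1): " + firstGroup(workaround, "acb"));
    }
}
```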

Environment

None

Status

Assignee

TonyT

Reporter

Anonymous

Labels

None

Fix versions

Priority

Medium