On 21.03.2012 21:13, Dmitry Olshansky wrote:
On 21.03.2012 20:05, James Oliphant wrote:
While following the regex discussion, I have been compiling the examples
to help with my understanding of how it works.

From Dmitry's example page:
http://blackwhale.github.com/regular-expression.html
and from the dlang.org website:
http://dlang.org/phobos/std_regex.html

std.regex.replace calls a delegate
auto delegate(Captures!string)
which does not compile. The definition in Phobos for Captures is
struct Captures(R,DIndex)
and for the purposes of these examples changing the delegate to
auto delegate(Captures!(string,uint))
seems to work. Is this correct?


Mm-hm, it means the fix to use size_t by default is upstream but not
in 2.058, I think. The user should not need to specify the index type;
it is a hook for future extension.
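
For reference, a minimal sketch of the single-parameter form, assuming a newer Phobos in which Captures takes only the string type and replaceAll is available (neither is in 2.058, where replace with the "g" flag plays that role). The names shout and upcaseWords are made up for the illustration.

```d
import std.regex : Captures, regex, replaceAll;
import std.uni : toUpper;

// Hypothetical helper: called once per match with that match's captures.
// No index type is spelled out; the default is used.
string shout(Captures!string m)
{
    return toUpper(m.hit); // m.hit is the whole matched slice
}

// Hypothetical wrapper so the example is easy to call.
string upcaseWords(string text)
{
    return replaceAll!(shout)(text, regex(r"\w+"));
}

void main()
{
    assert(upcaseWords("ranges are hot") == "RANGES ARE HOT");
}
```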


In another example on Dmitry's page that starts:
auto m = match("Ranges are hot!", r"(\w)\w*(\w)"); //at least 3 "word" symbols
The output from the example is "Ranges, R, s", but I don't quite
understand why those were the matches in this case.


Ok, \w matches any single word character, that is alpha, numeric or one
of a few other oddities*.
Now (\w) captures 1 character into the 1st _submatch_ ('R').
\w* consumes the rest of the word, then backs off one character so that
the next (\w) can match.
The second (\w) thus captures the last char ('s') into the 2nd _submatch_.
captures lists the submatches captured during one match; [0] is the whole
match.
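
A small sketch of reading the submatches of one match, assuming the newer matchFirst entry point (with the 2012 API the same values come from match(...).captures). The helper name firstWordParts is hypothetical.

```d
import std.regex : matchFirst, regex;

// Hypothetical helper: returns [whole match, 1st submatch, 2nd submatch],
// or null when the pattern does not match at all.
string[] firstWordParts(string text)
{
    auto c = matchFirst(text, regex(r"(\w)\w*(\w)"));
    if (c.empty)
        return null;
    // c[0] is the same slice as c.hit; c[1] and c[2] are the two groups.
    return [c[0], c[1], c[2]];
}

void main()
{
    assert(firstWordParts("Ranges are hot!") == ["Ranges", "R", "s"]);
}
```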

I get that people tend to think I was about to show multiple
_matches_ here, but that belongs to the next chapter. Here I was just
showing how to work with submatches; that needs to be stressed somehow.


Oh wait, it's in this chapter :) I probably should make more noise about
the "g" flag, and separate submatches from the range of matches more cleanly.
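
To make the distinction concrete, here is a sketch that walks the range of matches rather than the submatches of one match. It assumes the newer matchAll function; with the 2012 API the equivalent is match with a regex built with the "g" flag. The helper name allWords is hypothetical.

```d
import std.regex : matchAll, regex;

// Hypothetical helper: collects the whole-match slice of every match.
string[] allWords(string text)
{
    string[] hits;
    // Each element of the range is a Captures for one match,
    // with its own [0], [1] and [2].
    foreach (c; matchAll(text, regex(r"(\w)\w*(\w)")))
        hits ~= c.hit;
    return hits;
}

void main()
{
    assert(allWords("Ranges are hot!") == ["Ranges", "are", "hot"]);
}
```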


*This is an enormously useful tool for getting info on Unicode stuff, and
regex in particular:
http://unicode.org/cldr/utility/index.jsp


Also, does the regular expression imply a match of at least 2 "word"
symbols, since \w* means match 0 or more "word" symbols?

Yup, that's right, at least 2. I should correct the wording.
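
A quick check of the "at least 2" reading, again assuming the newer matchFirst function; hasTwoWordChars is a made-up name for the illustration.

```d
import std.regex : matchFirst, regex;

// Hypothetical helper: true when the pattern finds two adjacent-or-joined
// word characters, i.e. a run of at least 2 of them.
bool hasTwoWordChars(string text)
{
    return !matchFirst(text, regex(r"(\w)\w*(\w)")).empty;
}

void main()
{
    assert(!hasTwoWordChars("a"));   // one word character is not enough
    assert(hasTwoWordChars("ab"));   // \w* matches zero characters here
    assert(!hasTwoWordChars("a b")); // the space breaks the run
}
```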


These newsgroups are a great resource, keep up the great work!

You are welcome.



--
Dmitry Olshansky
