On Jan 30, 2018 12:05 AM, Ori Bernstein <o...@eigenstate.org> wrote:
>
> On Mon, 29 Jan 2018 23:23:18 -0600, Edgar Pettijohn <ed...@pettijohn-web.com>
> wrote:
>
> > I'm trying to use patterns.c for some pattern matching. The manual
> > mentions captures using "()" around what you want to capture. I don't
> > see how to get at the data though. Here is a sample program.
> >
> > #include <stdio.h>
> > #include "patterns.h"
> >
> > int
> > main(int argc, char *argv[])
> > {
> > 	const char *errstr = NULL;
> > 	const char *string = "the quick the brown the fox";
> > 	const char *pattern = "the";
> > 	int ret;
> > 	struct str_match match;
> >
> > 	ret = str_match(string, pattern, &match, &errstr);
> >
> > 	if (errstr != NULL)
> > 		printf("%s\n", errstr);
> > 	else
> > 		printf("number of matches %d\n", match.sm_nmatch);
> >
> > 	return 0;
> > }
> >
> > It prints 2 which I was expecting 3. I've tried multiple other patterns
> > and it seems the answer is always 2. Which leads me to believe I'm doing
> > something wrong. Any assistance appreciated.
> >
> >
> > Thanks,
> >
> >
> > Edgar
>
> The code is looking for a match of the pattern in the string, not all matches
> of the pattern in the string. It also makes the (IMO, surprising) decision
> that not having any capture groups in the pattern implies capturing the whole
> pattern. The whole string goes into the first match.
>
> So, in your case, you're matching:
>
>     "the quick the brown the fox";
>      ^^^
>
> Accordingly:
>
>     matches.sm_match[0] = "the quick the brown the fox"
>     matches.sm_match[1] = "the"
>
> If you had 'quick', you'd get similar behavior:
>
>     "the quick the brown the fox";
>          ^^^^^
>
> Equivalently, putting the whole pattern in '()' will match the same thing:
>
>     pattern = "(quick)"
>
> But multiple parens will match their substrings:
>
>     pattern = "(qu)ick (the)"
>
>     "the quick the brown the fox";
>          ^^    ^^^
>     matches.sm_match[0] = "the quick the brown the fox"
>     matches.sm_match[1] = "qu"
>     matches.sm_match[2] = "the"
>
> The choice to capture implicitly, I think, is confusing, but the behavior
> seems to me to be correct.
>
> --
> Ori Bernstein

Thanks. Makes sense now. Probably would have figured it out for myself if
I'd printed out matches.sm_match[0], etc. Live and learn.

Edgar
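
For anyone who lands on this thread wanting the count of 3 that the original
program expected: as explained above, str_match() reports one match plus its
captures, not every occurrence. For a fixed literal such as "the", one
possible workaround is a plain substring loop. The sketch below uses only the
standard strstr() and does not touch patterns.c at all, so it does no pattern
matching. The name count_occurrences is a hypothetical helper made up for
this example.

```c
#include <stddef.h>
#include <string.h>

/*
 * Count non-overlapping occurrences of the literal substring `needle`
 * in `haystack`. This is ordinary substring search with strstr(), not
 * pattern matching: every character in `needle` is taken literally.
 * count_occurrences is a hypothetical helper, not part of patterns.c.
 */
size_t
count_occurrences(const char *haystack, const char *needle)
{
	size_t n = 0;
	size_t len = strlen(needle);
	const char *p = haystack;

	/* An empty needle matches everywhere and would never advance. */
	if (len == 0)
		return 0;

	while ((p = strstr(p, needle)) != NULL) {
		n++;
		p += len;	/* step past this hit so hits do not overlap */
	}
	return n;
}
```

Called as count_occurrences("the quick the brown the fox", "the"), this
returns 3. It only helps when the thing being counted is a fixed string; for
real patterns, the single-match behavior described above still applies.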