> When parsing structures which can extend over newlines and/or support
> weird quoting rules (like csv), it's almost inevitable.
When I need to parse, say, backslash escapes (and sscanf's %O doesn't
do the right thing), I start by splitting on \ and then investigate
the pieces. Alternatively I use sscanf in more or less the same
way, i.e. scan data up to a character, do stuff with it, then loop.
E.g.:
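  String.Buffer out = String.Buffer();
  while (1) {
    // pre: text before the next backslash, esc: the escaped char,
    // in: whatever remains after it.
    int res = sscanf (in, "%[^\\]\\%c%s", string pre, int esc, in);
    out->add (pre);
    if (res != 3) break;  // No complete escape left.
    switch (esc) {
      // Escaped backslash or quote: emit the char itself.
      case '\\': case '"': out->putchar (esc); break;
      // ...
    }
  }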
I almost always use tricks like that to avoid single stepping chars at
the Pike level. Even though the loop above actually is O(n^2) (every
pass copies the remainder back into "in"), I believe it's faster than
stepping char by char on moderately small strings.
But then again, I also ended up with integer chars in my example
above, so maybe we're talking about the same approach after all.
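
For comparison, the split-first variant I mentioned could look roughly
like this. It's an untested sketch: the function name unescape and the
choice to handle only \n, \" and \\ (passing any other escaped char
through unchanged) are assumptions made for illustration.

```pike
string unescape (string in)
{
  // Every piece after the first one starts right after a backslash.
  array(string) parts = in / "\\";
  String.Buffer out = String.Buffer();
  out->add (parts[0]);
  for (int i = 1; i < sizeof (parts); i++) {
    string piece = parts[i];
    if (piece == "") {
      // Two adjacent backslashes, i.e. an escaped backslash (or a lone
      // trailing one). The piece that follows is plain text.
      out->putchar ('\\');
      if (++i < sizeof (parts)) out->add (parts[i]);
      continue;
    }
    switch (piece[0]) {
      case 'n': out->putchar ('\n'); break;
      case '"': out->putchar ('"'); break;
      default:  out->putchar (piece[0]);  // Unknown escape: keep the char.
    }
    out->add (piece[1..]);
  }
  return out->get();
}
```

The empty pieces are the awkward part here: splitting "a\\b" on the
backslash gives ({ "a", "", "b" }), so the loop has to notice the empty
piece and treat the next one as plain text instead of as an escape.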