05-Nov-2013 20:55, Philippe Sigaud wrote:
On Tue, Nov 5, 2013 at 3:54 PM, Dmitry Olshansky <dmitry.o...@gmail.com> wrote:
I was also toying with the idea of exposing a Builder interface for
std.regex. But push/pop are IMHO better designed out implicitly:
auto re = atom('x').star(charClass(unicode.Letter), atom('y')).build();
... and letting the nesting be explicit.
This is the same as:
auto re = regex(`x(?:\p{L}y)*`);
Aimed at apps/libs that build regular expressions anyway and have
no need for a textual parser.
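A minimal sketch of how such a builder surface could look. Everything here is an assumption made for illustration: `Node`, `atom`, `unicodeClass`, `star` and `build` are hypothetical names, not std.regex API. The sketch also takes a shortcut by accumulating pattern text and handing it to the existing `regex()` parser, whereas the builder discussed above would emit the engine's internal representation directly and skip the textual parser.

```d
import std.regex : regex, matchFirst;

// Hypothetical builder node: it only accumulates pattern text.
struct Node
{
    string pattern;

    // Append a non-capturing group made of `parts`, repeated zero or more times.
    Node star(Node[] parts...)
    {
        string inner;
        foreach (p; parts)
            inner ~= p.pattern;
        return Node(pattern ~ "(?:" ~ inner ~ ")*");
    }

    // Shortcut for this sketch: compile through the textual parser.
    auto build()
    {
        return regex(pattern);
    }
}

// Single literal character; a real builder would escape metacharacters.
Node atom(char c)
{
    return Node("" ~ c);
}

// Unicode property class by name, e.g. "L" for letters.
Node unicodeClass(string name)
{
    return Node(`\p{` ~ name ~ `}`);
}

unittest
{
    // Builds the text x(?:\p{L}y)* from nested calls.
    auto re = atom('x').star(unicodeClass("L"), atom('y')).build();
    assert(re.build is null || true); // placeholder removed below
}
```

The nesting of the calls mirrors the nesting of the groups, so no explicit push/pop is needed. The unittest block can be run with `dmd -unittest -main -run file.d`.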
Another possible advantage is being able to reference external names inside
your construction, thus naming other regexen or referencing external variables
to deposit backreferences inside them.
Actually it's a bad, bad idea. It has real potential to destroy all of the
engine's optimization opportunities and performance guarantees (such as
matching in linear time, which even today holds only when no funky
extensions are used).
After all, I'm in the curious position of having to do some of the work at
run time as well, where you can't always just generate some D code ;)
What would be really nice, though, is to let users register their own
dictionary of 'tokens'. Then things like an IPv4 pattern or a
domain name pattern become as simple as the `\d` pieces they use today
(say, with \i{user-defined-name}).
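A rough sketch of the user-visible effect, under stated assumptions: std.regex has no `\i{...}` escape, so the hypothetical helper `expandTokens` below emulates it by textual substitution before the pattern reaches `regex()`. A built-in token dictionary would presumably resolve the names inside the parser instead.

```d
import std.array : replace;
import std.regex : regex, matchFirst;

// Hypothetical helper: expands every \i{name} in `pattern` using `tokens`.
// Each definition is wrapped in a non-capturing group so that a following
// quantifier applies to the whole token.
string expandTokens(string pattern, string[string] tokens)
{
    foreach (name, definition; tokens)
        pattern = pattern.replace(`\i{` ~ name ~ `}`, "(?:" ~ definition ~ ")");
    return pattern;
}

unittest
{
    // Deliberately loose IPv4 pattern, for illustration only.
    auto tokens = ["ipv4": `\d{1,3}(?:\.\d{1,3}){3}`];
    auto re = regex(expandTokens(`^host=\i{ipv4}$`, tokens));
    assert(!matchFirst("host=192.168.0.1", re).empty);
    assert(matchFirst("host=example", re).empty);
}
```

Because the tokens expand to ordinary regular pieces, this kind of extension keeps the pattern regular, unlike referencing arbitrary external variables.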
> All in all, to get a regex
> construct that can interact with the external world.
Well, I can think of some rather interesting ways to do it even w/o tying in
external stuff as building blocks. It's rather about making std.regex
itself less rigid and more lean (as in cheap to invoke). Then external
modules may slice and dice its primitives as they see fit.
What ANTLR does is a similar technique: a regular lookahead to
resolve ambiguity in the grammar (implicitly). A lot like LL(k) but
with unlimited length (so-called LL(*)). Of course, it generates
LL(k) disambiguation where possible, then LL(*), and failing that the
usual backtracking.
I have liked that idea ever since the author added it to ANTLR, but I have
never used it.
I wonder whether that can be implemented inside another parser generator
or if it uses some specific-to-ANTLR internal machinery.
I don't think there is much in it that is specific to ANTLR. You would have
to accept, though, that it's no longer a PEG but rather some hybrid top-down
EBNF parser that resolves ambiguities.
I worry that the greater threat to good AST manipulation tools in D is a
lack of free time, and not the DMD bugs as much.
Good for you I guess, my developments in a related area are still
blocked :(
Walter is far from convinced that AST manipulation is a good thing. You
would have to convince him first.
I thought it was about tools that work with D code, such as lints,
refactoring tools, etc.
--
Dmitry Olshansky