05-Nov-2013 20:55, Philippe Sigaud wrote:
On Tue, Nov 5, 2013 at 3:54 PM, Dmitry Olshansky <dmitry.o...@gmail.com> wrote:


    I was also toying with the idea of exposing a Builder interface for
    std.regex. But IMHO push/pop are better designed out of it:

    auto re =
    atom('x').star(charClass(__unicode.Letter),atom('y')).__build();

    ... and letting the nesting be explicit.

    This is the same as:
    auto re = regex(`x(?:\p{L}y)*`);

    Aimed at apps/libs that build regular expressions anyway and have
    no need for a textual parser.
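
A minimal sketch of what such a builder could look like, assuming it simply assembles pattern text and hands it to std.regex. The names `Node`, `atom`, `unicodeClass`, `star` and `build` are hypothetical and are not part of std.regex; a real builder would more likely emit the engine's internal representation directly and would escape metacharacters.

```d
import std.regex : regex, matchFirst;

// Hypothetical builder node: it only carries the pattern text built so far.
struct Node
{
    string pattern;

    // Appends a zero-or-more repetition of the given parts as one
    // non-capturing group; the nesting is explicit in the call itself.
    Node star(Node[] parts...)
    {
        string inner;
        foreach (p; parts)
            inner ~= p.pattern;
        return Node(pattern ~ "(?:" ~ inner ~ ")*");
    }

    // Hands the assembled text to std.regex.
    auto build()
    {
        return regex(pattern);
    }
}

// A single literal character (this sketch does not escape metacharacters).
Node atom(char c)
{
    string s;
    s ~= c;
    return Node(s);
}

// A Unicode property class such as \p{L}.
Node unicodeClass(string name)
{
    return Node(`\p{` ~ name ~ `}`);
}
```

With these helpers, `atom('x').star(unicodeClass("L"), atom('y')).build()` produces the same pattern text as `x(?:\p{L}y)*`.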

Another possible advantage is to reference external names inside your
construction, thus naming other regexen or referencing external variables
to deposit backreferences inside them.

Actually it's a bad, bad idea. It has great potential to destroy all of the engine's optimization opportunities and performance guarantees (like matching in linear time, which even today only holds when no funky extensions are used).

After all, I'm in the curious position of having to do some work at run time as well, where you can't always just generate some D code ;)

What would be really nice, though, is to let users register their own dictionary of 'tokens'. Then things like an IPv4 pattern or a domain name pattern become as simple as the `\d` pieces they use today (say with \i{user-defined-name}).
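
As a rough illustration of the token dictionary idea, here is a sketch that expands `\i{name}` references in the pattern text before it reaches std.regex. This is only a textual preprocessing step under my own assumptions: `expandTokens` and `tokenRegex` are hypothetical names, std.regex itself has no `\i{...}` syntax, and the sketch does not handle an escaped backslash in front of `i{`.

```d
import std.algorithm : findSplit;
import std.exception : enforce;
import std.regex : regex, matchFirst;

// Replaces every \i{name} in `pattern` with the registered definition,
// wrapped in a non-capturing group so it behaves like a single atom.
string expandTokens(string pattern, string[string] tokens)
{
    string result;
    string rest = pattern;
    while (true)
    {
        auto open = rest.findSplit(`\i{`);
        result ~= open[0];
        if (open[1].length == 0)
            break; // no further references

        auto close = open[2].findSplit("}");
        enforce(close[1].length != 0, "unterminated \\i{...} reference");

        auto definition = close[0] in tokens;
        enforce(definition !is null, "unknown token: " ~ close[0]);

        result ~= "(?:" ~ *definition ~ ")";
        rest = close[2];
    }
    return result;
}

// Convenience wrapper: expand first, then compile as usual.
auto tokenRegex(string pattern, string[string] tokens)
{
    return regex(expandTokens(pattern, tokens));
}
```

For example, with `["octet": `\d{1,3}`]` registered, the pattern `\i{octet}\.\i{octet}` expands to `(?:\d{1,3})\.(?:\d{1,3})`.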

> All in all, to get a regex
> construct that can interact with the external world.

Well, I can think of some rather interesting ways to do it even w/o tying in external stuff as building blocks. It's rather about making std.regex itself less rigid and leaner (as in cheap to invoke). Then external modules may slice and dice its primitives as they see fit.


    What ANTLR does is a similar technique: a regular lookahead to
    resolve ambiguity in the grammar (implicitly). A lot like LL(k) but
    with unlimited length (so-called LL(*)). Of course, it generates
    LL(k) disambiguation where possible, then LL(*), and failing that the
    usual backtracking.

I liked that idea since the author added it to ANTLR, but I never used
it since.
I wonder whether that can be implemented inside another parser generator
or if it uses some specific-to-ANTLR internal machinery.

I don't think there is much that is specific to ANTLR in it. You would, though, have to accept that it's no longer a PEG but rather some hybrid top-down EBNF parser that resolves ambiguities.
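
To make the regular lookahead idea concrete, here is a small sketch under my own assumptions (it is not how ANTLR generates its predictors). Two hypothetical alternatives, a class declaration and an interface declaration, share a modifier prefix of unbounded length, so no fixed k is enough. A regular expression scans past the modifiers and picks the alternative without consuming any input. `Alt` and `predict` are made-up names.

```d
import std.regex : regex, matchFirst;

// The alternatives the parser has to choose between.
enum Alt { classDecl, interfaceDecl, none }

// Regular lookahead: skip any number of modifiers, then inspect the
// keyword that follows. The input itself is not consumed; the parser
// proper would start again from the beginning of `input`.
Alt predict(string input)
{
    auto m = matchFirst(input,
        regex(`^(?:(?:public|static|final)\s+)*(class|interface)\b`));
    if (m.empty)
        return Alt.none;
    return m[1] == "class" ? Alt.classDecl : Alt.interfaceDecl;
}
```

A generator would only fall back to this kind of predictor where a fixed number of tokens cannot tell the alternatives apart.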

        I worry that the greater threat to good AST manipulation tools
        in D is a
        lack of free time, and not the DMD bugs as much.


    Good for you I guess, my developments in a related area are still
    blocked :(

Walter is far from convinced that AST manipulation is a good thing. You
would have to convince him first.

I thought it was about tools that work with D code, like, say, lints, refactoring tools, etc.


--
Dmitry Olshansky
