Current state: this works:

///////////////////
open Regdef;
{
  var digit = Charset "9";
  val digits = Rpt (digit, 1, -1);
  var letter = Charset "x";
  var us = Regdef::String "_";
  var id = Seqs (list (
    Alts (list(us,letter)),
    Rpt( Alts(list(letter,digit,us)),0,-1)));
  println$ render id;
};

// alternate encoding
var digit = regexp(charset "9" );
regdef digits = digit+;
var letter = regexp(charset "x");
var us = regexp("_");
regdef id = (us|letter)(letter|digit|us)*;
val pl = regexp ( perl ( "(?:_|[x])(?:[x]|[9]|_)*" ));
println$ render id;
println$ render pl;
val dd = regexp ( regex ( Alts(list(letter,digit,us))));
println$ render dd;

var r = RE2 (render (Group id));
var n = r.NumberOfCapturingGroups;
println$ "Captures = "+str n;
var s = "x_9";
var v = _ctor_varray[StringPiece]$ (n+1).ulong, StringPiece "";
var res = Re2::Match(r,StringPiece s,0, ANCHOR_START, v.stl_begin, v.len.int);
println$ res;
println$ v;
///////////////////

What we have here is some tests for making regexps in various ways. The first block uses the regex type constructors. The second lot uses the syntax extension. Finally, I actually test that the result will work with RE2, which it does:

(?:_|[x])(?:[x]|[9]|_)*
(?:_|[x])(?:[x]|[9]|_)*
(?:_|[x])(?:[x]|[9]|_)*
[x]|[9]|_
Captures = 1
true
varray(x_9, x_9)

Obviously this will all be streamlined a bit more. Two things are in mind:

  x in regset

will test if a string is a member of a regular set. A bit more ambitious:

  match s with
  | regexp "..." => ...
  ..

This will require a change to the compiler itself. There are two problems here.

The first is to consider how match works. Currently it uses two functions: a match checker (does it match?) and a match handler (so now, extract the match variables and do something with them). For regexps we don't want to waste time checking and then re-analysing to extract the sub-group variables, at least I don't think so. On the other hand, it is actually a lot faster to do a match without extraction than with it, so perhaps we do want to do the matching twice.

The second problem is exactly how to get the sub-groups out. We could use an array here. We could also have named groups and a string-indexed, string-valued dictionary.

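To make the array and dictionary options concrete, here is a minimal sketch outside Felix. It uses Python's standard `re` module purely as a stand-in for RE2, so none of this is the Felix or RE2 API. The group names `head` and `tail` and the three helper functions are made up for illustration. It also keeps a check-only function separate from the extracting ones, along the lines of the checker/handler split described above.

```python
import re

# The identifier pattern rendered above, with two named capturing
# groups added. The names "head" and "tail" are hypothetical.
ID = re.compile(r"(?P<head>_|[x])(?P<tail>(?:[x]|[9]|_)*)")

def matches(s):
    """Checker: does the pattern match at the start of s? No extraction."""
    return ID.match(s) is not None

def groups_as_array(s):
    """Handler, option 1: positional array, index 0 is the whole match."""
    m = ID.match(s)
    if m is None:
        return None
    return [m.group(0), *m.groups()]

def groups_as_dict(s):
    """Handler, option 2: string-indexed, string-valued dictionary."""
    m = ID.match(s)
    if m is None:
        return None
    return m.groupdict()
```

For the input "x_9" used in the test above, the array form holds the whole match followed by the two sub-groups, and the dictionary form holds the same two sub-groups keyed by name.
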
A third option is to use actual Felix variables (as all the other matches do), but this is harder to do and is *impossible* if the match case expression is variable: it can only work if the regexp is a constant literal (obviously, since we need to know the variable names).

Anyhow: whilst it is by no means finished, the proof of principle integration is now done.

--
john skaller
skal...@users.sourceforge.net

_______________________________________________
Felix-language mailing list
Felix-language@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/felix-language