An APL wrapper (⎕regexp[OP]) around a simple API like this one would be great
(rune means a Unicode character):

https://9fans.github.io/plan9port/man/man3/regexp.html

One can build more APL functions out of these without much performance penalty.

On the other hand, if APL provided a DFA implementation (cf. J's dyadic ;:)

http://www.jsoftware.com/help/dictionary/d332.htm

one could probably write the regular expression engine within an APL function
with minimal performance loss.

> On Sep 20, 2017, at 2:47 PM, Juergen Sauermann <[email protected]> wrote:
>
> Hi Elias,
>
> I am generally in favour of supporting regular expressions in GNU APL.
>
> We should do that in a way that is compatible with the way in which the most
> commonly used libraries do that (even if they are lacking some features that
> more exotic libraries may have). Unfortunately I do not have a full overview
> of all (or even any) existing libraries. I personally love grep and hate perl
> (the latter not only because of their regexes).
>
> I would like to avoid constructs like s/aaa/bbb/ where operations are kind of
> text-encoded into strings. That is, IMHO, a hack-ish programming style and
> should be replaced by a more APL-like syntax such as
> 'aaa' ⎕REX['s'] 'bbb' or maybe 's' ⎕REX 'aaa' 'bbb'.
>
> Or, if the number of operations is small (perl seems to have only 2, not
> counting the translate which is already covered by other APL functions), then
> we could also have different ⎕-functions for them and thus avoid a third
> argument.
>
> Everybody else, please feel invited to join the discussion.
>
> Best Regards,
> Jürgen Sauermann
>
>
> On 09/20/2017 05:59 AM, Elias Mårtenson wrote:
>> On several occasions, I have felt that built-in regex support in GNU APL
>> would be very helpful.
>>
>> Implementing it should be rather simple, but I'd like to discuss how such an
>> API should look in order for it to be as useful as possible.
>>
>> I was thinking of the following form:
>>
>>       regex ⎕Regex string
>>
>> The way I envision this to work is to have the function return ⍬ if there
>> is no match, or a string containing the match if there is one:
>>
>>       'f..' ⎕Regex 'xzooy'
>> ┏⊖┓
>> ┃0┃
>> ┗━┛
>>       'f..' ⎕Regex 'xfooy'
>> 'foo'
>>
>> If the regex has subexpressions, those matches should be returned as
>> individual strings:
>>
>>       '([0-9]+)-([0-9]+)-([0-9]+)' ⎕Regex '2017-01-02'
>> ┏→━━━━━━━━━━━━━━━┓
>> ┃"2017" "01" "02"┃
>> ┗∊━━━━━━━━━━━━━━━┛
>>
>> This would be a very useful API, and reasonably easy to implement by simply
>> calling into the standard regcomp() call:
>> http://pubs.opengroup.org/onlinepubs/009695399/functions/regcomp.html
>>
>> What do you think? Is this a reasonable way to implement it? Any suggestions
>> about alternative APIs?
>>
>> Regards,
>> Elias
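
For reference, the semantics proposed in the quoted message can be sketched
directly on top of POSIX regcomp()/regexec(). This is only a rough sketch and
not GNU APL code: the helper name regex_match, the MAX_GROUPS limit, the use of
REG_EXTENDED, and the choice to return an empty string for a subexpression that
did not take part in the match are all assumptions made for illustration.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

#define MAX_GROUPS 10   /* assumed limit: whole match plus 9 subexpressions */

/* Hypothetical helper (not part of GNU APL).
 * Returns the number of strings stored in out[]:
 *   0  if there is no match (the empty result in the proposal),
 *   1  with the whole match if the pattern has no subexpressions,
 *   N  with one string per subexpression otherwise,
 *  -1  if the pattern does not compile.
 * The caller owns the returned strings and must free() each out[i]. */
int regex_match(const char *pattern, const char *subject,
                char *out[], int max_out)
{
    regex_t re;
    regmatch_t m[MAX_GROUPS];
    int count = 0;

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;

    if (regexec(&re, subject, MAX_GROUPS, m, 0) == 0) {
        /* re_nsub is the number of parenthesised subexpressions.
         * With none, report m[0] (the whole match); otherwise m[1..re_nsub]. */
        size_t first = re.re_nsub > 0 ? 1 : 0;
        for (size_t i = first;
             i <= re.re_nsub && i < MAX_GROUPS && count < max_out; i++) {
            /* rm_so is -1 for a group that did not participate:
             * keep its position but return an empty string. */
            size_t len = m[i].rm_so >= 0
                       ? (size_t)(m[i].rm_eo - m[i].rm_so) : 0;
            char *s = malloc(len + 1);
            if (s == NULL)
                break;
            if (len > 0)
                memcpy(s, subject + m[i].rm_so, len);
            s[len] = '\0';
            out[count++] = s;
        }
    }

    regfree(&re);
    return count;
}
```

Called with the three examples from the proposal, this should give no strings
for 'f..' against 'xzooy', the single string "foo" against 'xfooy', and the
three strings "2017", "01" and "02" for the date pattern. A real ⎕Regex would
also have to convert between APL's Unicode characters and the byte strings
that regexec() works on, which this sketch ignores.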
