-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 30-12-11 09:32, Eli Barzilay wrote:
> This doesn't look like an issue that is related to guile, just that
> he chose python as the goal...  The first other random example I
> tried was `split-string' in Emacs, which did the same thing as
> Racket.

They may have chosen Python's version as the goal, but it doesn't look
like they have looked very hard yet at what else is out there, probably
because they expect compatibility between most implementations.

>
>> Welcome to Racket v5.2.0.7.
>>> (regexp-split "([^0-9])" "123+456*/")
>> '("123" "456" "" "")
>>
>> should it be considered a bug in racket that it doesn't support
>> capturing groups in regexp-split?
>
> No.
>
>
>> Without the capturing group the results are identical: [...]
>
> Which is expected.

Good. I was just establishing a baseline here, and it is good that some
compatibility is *expected*. How nice is that? Since we're expecting
compatibility between Python and Racket, I guess it goes without saying
that Racket's and Guile's regexp-split should be compatible as well.
R7RS Large may standardize a regular expression library, and we can
make that easier by reducing incompatibilities between Schemes. We can
all grow from examining our incompatibilities, discussing them and
sometimes resolving them.

> Python does something which is IMO very weird:
>
>>>> re.split("([^0-9])", "123+456*/")
> ['123', '+', '456', '*', '', '/', '']
>
> It's even more confusing with multiple patterns:
>
>>>> re.split("([^0-9]([0-9]))", "123+456*/")
> ['123', '+4', '4', '56*/']
>
> There's probably uses for that -- at least for the simple version
> with a single group around the whole regexp, but that's some hybrid
> of `regexp-split' and `regexp-match*': it returns something that
> interleaves them, which can be useful, but I'd rather see it with
> a different name.

Yes, I find it a bit weird as well. You don't lose anything by
supporting it, though, since you can always use a non-capturing group
to get the plain splitting behaviour. Still, I agree that it can be
considered an inappropriate extension of the meaning of regexp-split.
I'll be sure to raise these issues on the guile list.

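To make the non-capturing-group point concrete, here is a minimal Python sketch using only the standard `re` module and the same input string as above. The variable names are just for illustration; the patterns are the ones quoted in this thread, plus a `(?:...)` variant of the first.

```python
import re

text = "123+456*/"

# Capturing group: the matched separators are interleaved with the pieces.
with_group = re.split(r"([^0-9])", text)

# Non-capturing group: the separators are dropped, which gives the same
# list that Racket's regexp-split returned for the capturing pattern.
without_group = re.split(r"(?:[^0-9])", text)

# Nested groups: every group of each match is inserted, outer group first.
nested = re.split(r"([^0-9]([0-9]))", text)

print(with_group)     # ['123', '+', '456', '*', '', '/', '']
print(without_group)  # ['123', '456', '', '']
print(nested)         # ['123', '+4', '4', '56*/']
```

So anyone who wants the plain splitting behaviour in Python can get it by writing `(?:...)` instead of `(...)`.
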
> We've talked semi-recently about adding an option to
> `regexp-match*' so it can return the lists of matches for each
> pattern, perhaps add another option for returning the unmatched
> sequences between them, and give the whole thing a new name?
> (Something that indicates it being the multitool version of all of
> these.)

Interesting.

Marijn
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iEYEARECAAYFAk79i1QACgkQp/VmCx0OL2zI4gCgtLLd3b6vgzaksYSA7wsZksHA
yeIAoJJ6G7AcimN3OhtxFMvN8Xf7TdrH
=1+Ax
-----END PGP SIGNATURE-----
_________________________
  Racket Developers list:
  http://lists.racket-lang.org/dev