I've implemented a new `regexp-explode' function. It accepts the same arguments as `regexp-match*' and `regexp-split', but with two additional keyword arguments:

* #:select-match

  If this is #t (the default) then the result includes the lists of
  results from the sub-matches.  It can also be #f to not include them,
  or a "selector function" that chooses a specific one (e.g., `car') or
  returns a different list of matches (e.g., `cdr').

* #:select-gap

  This is just a boolean flag -- if it's #t (the default), the strings
  between the matches are returned as well, interleaved with the (lists
  of) matches; otherwise they're omitted.

So by default, you get the information that `regexp-split' returns,
interleaved with the full results of matching.  Examples:

  -> (regexp-explode #rx"[^0-9]([^0-9])?" "0+1.*2")
  '("0" ("+" #f) "1" (".*" "*") "2")
  -> (regexp-explode #rx"[^0-9]([^0-9])?" "0+1.*2"
                     #:select-match car #:select-gap #f)
  '("+" ".*")
  -> (regexp-explode #rx"[^0-9]([^0-9])?" "0+1.*2" #:select-match cadr)
  '("0" #f "1" "*" "2")

*** Minor poll: I'm not too happy with that `select-gap' name.  Any
    suggestions for a better name?

But the obvious next function to implement, `regexp-explode-positions',
complicates things a little.  There's no point in giving it the same
interface -- the gaps are useless there since they're easily inferred
from the matches (as seen by the lack of a `regexp-split-positions'
function).

So, a possible alternative that I thought about is to add a
`#:select-match' keyword to `regexp-match-positions*' instead, so it
can return the list of position matches in a similar way.  However,
that would lead to another problem: it would be bad to have a keyword
argument for `regexp-match-positions*' which is not accepted by
`regexp-match*'.  A solution to that is to add it to `regexp-match*'
too, but then there's little point in `regexp-explode'...

So the options that I see are:

1. Drop the new `regexp-explode' name, and instead have this
   functionality folded into `regexp-match*', which will get the two
   new keywords with a default of #f for `#:select-gap', and `car' for
   `#:select-match'.
   Similarly, add `#:select-match' to `regexp-match-positions*', but
   not `#:select-gap'.

1a. Minor variation: insist on uniformity, and include a `#:select-gap'
    keyword for `regexp-match-positions*' too.

2. Same as #1, but also have `regexp-explode', which is now the same as
   `regexp-match*' but with different defaults for the two keywords.

2a. Same variation as in #1a.

3. Do not extend the interface of existing functions -- have only the
   new `regexp-explode' have the added functionality.  For the
   positions version, add a `regexp-explode-positions', without a
   `#:select-gap' keyword.  The possible advantage here is that the
   (already complicated) output type of `regexp-match*' stays the same,
   and `regexp-explode' gets the much more complicated one.

3a. Same as #3, but with `#:select-gap' for `regexp-explode-positions'.

I'm now leaning towards #1.  Any votes for other options, or maybe
something different?

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!
_________________________
  Racket Developers list:
  http://lists.racket-lang.org/dev
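
For reference, the behavior of `regexp-explode' described above can be sketched roughly as follows. This is only an illustration built on `regexp-match-positions', not the actual implementation. It assumes a string input, ignores the optional start/end arguments that `regexp-match*' accepts, and simply refuses patterns that match the empty string. The helper name `positions->strings' is made up for the sketch.

```racket
#lang racket/base

;; Hypothetical sketch of the proposed `regexp-explode' (string input only).
(define (regexp-explode rx str
                        #:select-match [select-match #t]
                        #:select-gap   [select-gap? #t])
  ;; Convert one match (a list of position pairs, with #f for a
  ;; sub-pattern that did not participate) into the matched strings.
  (define (positions->strings ps)
    (for/list ([p (in-list ps)])
      (and p (substring str (car p) (cdr p)))))
  (let loop ([start 0] [acc '()])
    (define ps (regexp-match-positions rx str start))
    (cond
      [(not ps)
       ;; No more matches: add the trailing gap (if wanted) and finish.
       (reverse (if select-gap? (cons (substring str start) acc) acc))]
      [else
       (define m-start (caar ps))
       (define m-end   (cdar ps))
       ;; Simplification: empty matches would loop forever, so reject them.
       (when (= m-start m-end)
         (error 'regexp-explode "empty matches are not handled in this sketch"))
       ;; The gap is the text between the previous match and this one.
       (define with-gap
         (if select-gap? (cons (substring str start m-start) acc) acc))
       (define strs (positions->strings ps))
       (define with-match
         (cond [(not select-match) with-gap]
               [(procedure? select-match) (cons (select-match strs) with-gap)]
               [else (cons strs with-gap)]))
       (loop m-end with-match)])))
```

With this sketch, the three example calls shown earlier give the same results as listed there. The trailing gap is always included when gaps are selected, even if it is the empty string, which mirrors what `regexp-split' does.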