Sorry for the new thread, but this is a kind of a summary of the extensions that I think we're converging on, with a way to resolve the exact meaning of arguments. Please read through and reply if you see any problems with it. There are three specific questions, which are marked with [*1*]...[*3b*] -- I'd appreciate suggestions for them.

Starting with the problem of the argument, I think that the best choice
is to go with plain strings -- no implicit `+', and no strings as bags
of characters.  (This is option (c) in the other thread.)  Two
rationales:

* These functions are supposed to be simple, so a simple rule like that
  works nicely to that end.

* It uses strings for what they are: an ordered sequence of characters.
  Going with a bag-of-characters is really abusing the string type.
  Adding an implicit `+' complicates things since it interprets a
  string as a kind of a pattern.

But to allow other uses, make these arguments a string *or* a regexp,
where a regexp is taken as-is.  This leads to another simplicity point
in this design:

* These functions are mostly similar to the regexp ones, except that
  the implicit coercion from a string to a pattern happens with
  `regexp-quote' rather than with `regexp'.

It also means that when you want something that is not a plain string,
you just use a regexp.  This doesn't necessarily go back to the full
regexp versions with the implied complexity.  A few examples:

* The default argument for a pattern that serves as a separator (as in
  `string-trim' and `string-split') is a regexp: #px"\\s+".  So newbies
  get to use regexps without learning them.

* If there's a need for something different in the future, say one or
  more spaces and tabs, then making regexps a valid input means that we
  could add a binding for such regexps, so newbies can now do something
  like:

    (string-split string spaces-or-tabs)

  and still not worry about regexps.

* Even if there's some obvious need for the bag-of-chars thing, it
  could be added as a function:

    (string-split string (either " " "\t"))

Note that I'm not suggesting adding these last two items -- I'm just
saying that accepting regexps means that such extensions are easier to
do in the future.
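
To make the string-or-regexp rule concrete, here is a minimal sketch of
the coercion described above.  The helper name `sep->regexp' is made up
for illustration and is not part of the proposal, and the
`spaces-or-tabs' binding is only the hypothetical one from the example
above; the sketch uses nothing beyond `racket/base'.

```racket
#lang racket/base

;; Hypothetical helper (not part of the proposed API): turn a separator
;; argument into a regexp.  A regexp is taken as-is; a plain string is
;; quoted with `regexp-quote', so it matches only itself.
(define (sep->regexp sep)
  (cond [(regexp? sep) sep]
        [(string? sep) (regexp (regexp-quote sep))]
        [else (raise-argument-error 'sep->regexp
                                    "(or/c string? regexp?)" sep)]))

;; A string separator is literal: "." means a dot, not "any character",
;; and "+" means a plus sign, not "one or more".
(regexp-match? (sep->regexp ".") "a.b")
(regexp-match? (sep->regexp ".") "ab")
(regexp-split (sep->regexp "+") "1+2+3")

;; A regexp separator keeps its pattern meaning.
(define spaces-or-tabs #px"[ \t]+") ; illustrative binding only
(regexp-split (sep->regexp spaces-or-tabs) "a \t b")
```

The point of the design shows up in the two branches: the only
difference from the regexp functions is that a string goes through
`regexp-quote' before it becomes a pattern.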

The suggested functions are (these are skeletons, they'll also have
keyword arguments for some more tweaks):

  (string-trim str [sep #px"\\s+"])

    Removes occurrences of `sep' from the beginning and end of `str'.
    (Keywords can make it do only one side.)  This is already
    implemented (but not pushed).  I will need to change it though, in
    subtle ways due to the new meaning of the `sep' argument.

  (string-normalize-spaces str [sep #px"\\s+"])

    Replaces occurrences of `sep' with a space, trimming it at the
    edges.  (Keywords can disable the trimming, and can make it use a
    different character to substitute.)  This is also already
    implemented but will need to change in subtle ways, like the
    previous one.

  (string-split str [sep #px"\\s+"])

    Splits `str' on occurrences of `sep'.  Unclear whether it should do
    that with or without trimming, which affects keeping a first/last
    empty part.

    [*1*] Possible solution: make it take a `#:trim?' keyword, in
          analogy to `string-normalize-spaces'.  This would make `#t'
          the obvious choice for a default, which means that

            (string-split ",,foo, bar," ",") -> '("foo" " bar")

  (string-replace str from to [start 0] [end (string-length str)])

    Simple wrapper that quotes the `from' and `to'.  Note the different
    argument order, which is supposed to make this more like common
    functions.  Another rationale for this difference: these functions
    focus on the string, rather than on the regexp.

  (string-index str sub [start 0] [end (string-length str)])

    Looks for occurrences of `sub' in `str', returns the index if
    found, #f otherwise.

    [*2*] I'm not sure about the name, maybe `string-index-of' is
          better?

  (list-index list elt)

    Looks for `elt' in `list'.  This is a possible extension for
    `racket/list' that would be kind of obvious once the above are
    added.

    [*3*] I'm not sure if it should be added, but IIRC it was requested
          a few times.  If it does get added, then there's another
          question for how far the analogy goes:

    [*3a*] Should it take a start/end index too?

    [*3b*] Should it take a list of elements and look for a matching
           sublist instead (which is not a function that is common to
           ask for, AFAICT)?

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))
          Eli Barzilay: http://barzilay.org/
          Maze is Life!
_________________________
  Racket Developers list:
  http://lists.racket-lang.org/dev