I'd written a re-split before discovering Stuart's in
clojure.contrib.str-utils. Mine's a little different in that it's
lazy and the seq it returns includes the parts that match the pattern
as well as the parts in between:
user=> (my-re-split #"[0-9]+" "abc123def456")
("abc" "123" "def" "456")
user=> (re-split #"[0-9]+" "abc123def456")
("abc" "def")
It's easy to use seq functions or destructuring to pick out the pieces you want:
user=> (take-nth 2 (my-re-split #"[0-9]+" "abc123def456"))
("abc" "def")
That's just like re-split from str-utils. Or you can get just the separators:
user=> (take-nth 2 (rest (my-re-split #"[0-9]+" "abc123def456")))
("123" "456")
Or both:
user=> (for [[othr num] (partition 2 (my-re-split #"[0-9]+" "abc123def456"))]
         {:othr othr :num num})
({:num "123", :othr "abc"} {:num "456", :othr "def"})
So, should I add my-re-split to str-utils? If so, is it similar
enough to re-split to replace the existing one? Otherwise, what would
be a good name for it?
Here's the implementation (this is what the examples above call my-re-split):
(defn re-split
  [#^java.util.regex.Pattern re #^CharSequence cs]
  (let [m (re-matcher re cs)]
    ((fn step [prevend]
       (if (.find m)
         (lazy-cons (.subSequence cs prevend (.start m))
                    (lazy-cons (re-groups m)
                               (step (+ (.start m) (count (.group m))))))
         (when (< prevend (.length cs))
           (list (.subSequence cs prevend (.length cs))))))
     0)))
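For anyone reading this on a later Clojure: lazy-cons was replaced by lazy-seq, and the #^ type-hint syntax by ^. A rough equivalent might look like the sketch below. It is only a sketch under those assumptions; the name my-re-split is borrowed from the examples above, and .end is used in place of adding the match length to .start, which should give the same index.

```clojure
;; Sketch only: a lazy-seq version of the function above, assuming a
;; Clojure release where lazy-cons is no longer available.
(defn my-re-split
  "Returns a lazy seq of the pieces of cs, alternating between the text
  between matches of re and the matches themselves."
  [^java.util.regex.Pattern re ^CharSequence cs]
  (let [m (re-matcher re cs)]
    ((fn step [prevend]
       (lazy-seq
         (if (.find m)
           ;; The matcher is mutable, so read everything needed from the
           ;; current match before the rest of the seq is realized.
           (let [start (.start m)
                 end   (.end m)
                 match (re-groups m)]
             (cons (.subSequence cs prevend start)
                   (cons match (step end))))
           ;; No more matches: emit the trailing text, if there is any.
           (when (< prevend (.length cs))
             (list (.subSequence cs prevend (.length cs)))))))
     0)))
```

Because the matcher is advanced only when more of the seq is realized, the result should be consumed from a single thread and not walked while the same matcher is used elsewhere.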
--Chouser