re-split

Chouser Mon, 08 Sep 2008 20:59:51 -0700

I'd written a re-split before discovering Stuart's in
clojure.contrib.str-utils.  Mine's a little different in that it's
lazy and the seq it returns includes the parts that match the pattern
as well as the parts in between:


user=> (my-re-split #"[0-9]+" "abc123def456")
("abc" "123" "def" "456")
user=> (re-split #"[0-9]+" "abc123def456")
("abc" "def")

It's easy to use seq functions or destructuring to pick out the pieces you want:

user=> (take-nth 2 (my-re-split #"[0-9]+" "abc123def456"))
("abc" "def")

That's just like re-split from str-utils.  Or you can get just the separators:

user=> (take-nth 2 (rest (my-re-split #"[0-9]+" "abc123def456")))
("123" "456")

Or both:

user=> (for [[othr num] (partition 2 (my-re-split #"[0-9]+" "abc123def456"))]
         {:othr othr :num num})
({:num "123", :othr "abc"} {:num "456", :othr "def"})
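
One caveat with the partition approach, sketched here on a literal seq rather than on real my-re-split output: partition drops a trailing incomplete group, so input that ends with non-matching text would lose its last piece. The name "pieces" is only a stand-in for what the implementation would return for "abc123def", and the padded arity of partition is assumed to be available in the Clojure version in use.

```clojure
;; Literal stand-in for the result of splitting "abc123def" on #"[0-9]+"
(def pieces '("abc" "123" "def"))

;; Plain partition silently drops the incomplete final pair
(partition 2 pieces)
;; => (("abc" "123"))

;; The 4-argument arity pads the final pair instead of dropping it
(partition 2 2 [nil] pieces)
;; => (("abc" "123") ("def" nil))
```

With the padded form, the destructured num is simply nil for the last piece.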

So, should I add my-re-split to str-utils?  If so, is it similar
enough to re-split to replace the existing one?  Otherwise, what would
be a good name for it?

Here's the implementation (the examples above call it my-re-split):

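(defn re-split
  [#^java.util.regex.Pattern re #^CharSequence cs]
  (let [m (re-matcher re cs)]
    ((fn step [prevend]
       (if (.find m)
         ;; capture the match state now: lazy-cons delays both of its
         ;; arguments, and a later .find would change what m reports
         (let [start (.start m)
               end   (.end m)
               match (re-groups m)]
           (lazy-cons (.subSequence cs prevend start)
                      (lazy-cons match (step end))))
         ;; no more matches: emit whatever text is left, if any
         (when (< prevend (.length cs))
           (list (.subSequence cs prevend (.length cs))))))
     0)))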
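
For readers on a later Clojure release, where lazy-cons has been replaced by lazy-seq, the same idea might be sketched roughly as follows. This is an illustrative port under that assumption, not code from str-utils; the name re-split-all is hypothetical, and the newer ^ type-hint syntax is assumed.

```clojure
;; Sketch: split cs on re, keeping both the matches and the text between them.
;; re-split-all is a hypothetical name chosen for this example.
(defn re-split-all
  [^java.util.regex.Pattern re ^CharSequence cs]
  (let [m (re-matcher re cs)]
    ((fn step [prevend]
       (lazy-seq
         (if (.find m)
           ;; Read the mutable matcher state before building the rest of the seq
           (let [start (.start m)
                 end   (.end m)
                 match (re-groups m)]
             (cons (.subSequence cs prevend start)
                   (cons match (step end))))
           ;; No more matches: emit the remaining text, if any
           (when (< prevend (.length cs))
             (list (.subSequence cs prevend (.length cs)))))))
     0)))

(re-split-all #"[0-9]+" "abc123def456")
;; => ("abc" "123" "def" "456")
```

Because the whole step body sits inside lazy-seq, each .find call happens only when the next element is requested, so the matcher advances in sequence order.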

--Chouser
