Looks interesting and maybe even very useful. Why not put your code on
GitHub or some other public repo of your liking? It's much nicer than
pasting all this code ;)

On Mon, Mar 23, 2009 at 9:18 PM, Sean <francoisdev...@gmail.com> wrote:

>
> Hello Everyone,
> I've been reviewing the str-utils package, and I'd like to propose a
> few changes to the library.  I've included the code at the bottom.
>
> USE MULTI-METHODS
>
> I'd like to propose rewriting the following methods to use
> multimethods.  Every method will take an input called input-string.
>
> *re-split[input-string & remaining-inputs](...)*
>
> The remaining inputs can be dispatched based on a regex pattern, a
> list of patterns, or a map.
>
> regex pattern - splits a string into a list, like it does now.
> e.g. (re-split "1 2 3\n4 5 6" #"\n") => ("1 2 3" "4 5 6")
>
> list - this splits each element either like a map or a regex.  The map
> operator is applied recursively to each element
> e.g. (re-split "1 2 3\n4 5 6" (list #"\n" #"\s+")) => (("1" "2" "3")
> ("4" "5" "6"))
>
> map - this splits each element based on the inputs of the map.  It is
> how options are passed to the method.
> e.g. (re-split "1 2 3" {:pattern #"\s+" :limit 2
> :marshal-fn #(java.lang.Double/parseDouble %)}) => (1.0 2.0)
> The :pattern and :limit options are relatively straightforward.
> The :marshal-fn is mapped after the string is split.
>
> These items can be chained together, as the following example shows
> e.g. (re-split "1 2 3\n4 5 6" (list #"\n" {:pattern #"\s+" :limit
> 2 :marshal-fn #(java.lang.Double/parseDouble %)})) => ((1.0 2.0) (4.0
> 5.0))
>
> In my opinion, the :marshal-fn is best used at the end of the list.
> However, it could be used earlier in the list, but an exception will
> most likely be thrown.
>
>
> *re-partition[input-string & remaining-inputs]
>
> This method behaves like the original re-partition method, with the
> remaining-inputs being either a list or a pattern.  I don't see a
> need to change the behavior of this method at the moment.
>
> *re-gsub[input-string & remaining-inputs]
>
> This method can take a list or two atoms as the remaining inputs.
>
> Two atoms -
> e.g. (re-gsub "1 2 3 4 5 6" #"\s" "") => "123456"
>
> A paired list
> e.g. (re-gsub "1 2 3 4 5 6" '((#"\s" "") (#"\d" "D"))) => "DDDDDD"
>
> *re-sub[input-string & remaining-inputs]
>
> Again, this method can take a list or two atoms as the remaining
> inputs.
>
> Two atoms
> e.g. (re-sub "1 2 3 4 5 6" #"\d" "D") => "D 2 3 4 5 6"
>
> A paired list
> e.g. (re-sub "1 2 3 4 5 6" '((#"\d" "D") (#"\d" "E"))) => "D E 3 4 5 6"
>
> NEW PARSING HELPERS
> I've created four methods: str-before, str-before-inc, str-after, and
> str-after-inc.  They are designed to help strip off the parts of a
> string before or after a regex match.
>
> (str-before "Clojure Is Awesome" #"\s") => "Clojure"
> (str-before-inc "Clojure Is Awesome" #"\s") => "Clojure "
> (str-after "Clojure Is Awesome" #"\s") => "Is Awesome"
> (str-after-inc "Clojure Is Awesome" #"\s") => " Is Awesome"
>
> These methods can be used to help parse strings
>
> (str-before (str-after "<h4 ... >" #"<h4") #">") => ;the stuff in the
> middle
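
A minimal sketch of that composition, for anyone who wants to try it without the proposed library. The two helpers here are hypothetical stand-ins written on top of clojure.string/split (an assumption; the proposal itself builds them on re-partition), and the input string is made up for illustration.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-ins for the proposed helpers. A split limit of 2
;; means only the first match of the regex is considered.
(defn str-before [s re]
  (first (str/split s re 2)))

(defn str-after [s re]
  ;; When the regex does not match there is no second piece, so return "".
  (or (second (str/split s re 2)) ""))

;; Take what follows "<h4", then keep what precedes the first ">".
(defn h4-attributes [tag]
  (str-before (str-after tag #"<h4") #">"))

(h4-attributes "<h4 class=\"title\">")
```

Both regex arguments are patterns here, which matches how the proposed helpers dispatch on their second argument.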
>
> NEW INFLECTORS
> I've added a few inflectors that I am familiar with from Rails.  My
> apologies if their origin is another language.  I'd be interested in
> knowing where the methods originated.
>
> str-reverse
> This method reverses a string
> e.g. (str-reverse "Clojure") => "erujolC"
>
> trim
> This is a convenience wrapper for the trim method java supplies
> e.g. (trim "  Clojure  ") => "Clojure"
>
> strip
> This is an alias for trim.  I accidentally switch between *trim* and
> *strip* all the time.
> e.g. (strip "  Clojure  ") => "Clojure"
>
> ltrim
> This method removes the leading whitespace
> e.g. (ltrim "  Clojure  ") => "Clojure  "
>
> rtrim
> This method removes the trailing whitespace
> e.g. (rtrim "  Clojure  ") => "  Clojure"
>
> downcase
> This is a convenience wrapper for the toLowerCase method java supplies
> e.g. (downcase "Clojure") => "clojure"
>
> upcase
> This is a convenience wrapper for the toUpperCase method java supplies
> e.g. (upcase "Clojure") => "CLOJURE"
>
> capitalize
> This method capitalizes a string
> e.g. (capitalize "clojure") => "Clojure"
>
> titleize, camelize, dasherize, underscore
> These methods manipulate "sentences", producing a consistent output.
> Check the unit tests for more examples
> (titleize "clojure iS Awesome") => "Clojure Is Awesome"
> (camelize "clojure iS Awesome") => "clojureIsAwesome"
> (dasherize "clojure iS Awesome") => "clojure-is-awesome"
> (underscore "clojure iS Awesome") => "clojure_is_awesome"
>
> *FINAL THOUGHTS*
> There are three more methods, str-join, chop, and chomp that were
> already in str-utils.  I changed the implementation of the methods, but
> the behavior should be the same.
>
> There is a big catch with my proposed change.  The signatures of
> re-split, re-partition, re-gsub and re-sub change.  They will not be
> backwards compatible, and will break code.  However, I think the
> flexibility is worth it.
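
To make the breakage concrete, here is a rough sketch of a transition shim. It assumes the existing str-utils functions take the pattern first and the string second, while the proposal puts the string first. The name re-split-compat is hypothetical and not part of either library.

```clojure
(import 'java.util.regex.Pattern)

;; Hypothetical shim, not part of the proposal: accepts either argument
;; order by checking which argument is the compiled pattern.
(defn re-split-compat
  [a b]
  (if (instance? Pattern a)
    (seq (.split ^Pattern a b))     ; assumed old order: (pattern string)
    (seq (.split ^Pattern b a))))   ; proposed order:    (string pattern)

(re-split-compat #"\n" "1 2 3\n4 5 6")
(re-split-compat "1 2 3\n4 5 6" #"\n")
```

A shim like this only covers the single-pattern case. The list and map forms exist only in the proposed signature, so they need no compatibility path.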
>
> *TO-DOs*
> There are a few more things I'd like to add, but that could be done at a
> later date.
>
> *Add more inflectors
>
> The following additions become pretty easy if the proposed re-gsub is
> included:
>
> *Add HTML-escape function (like Rails' h method)
> *Add Javascript-escape function (like Rails' javascript-escape method)
> *Add SQL-escape function
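
As a rough illustration of why a paired-list re-gsub makes these easy, here is a self-contained sketch of an HTML escape. It does not use the proposed re-gsub. The helper replace-all-pairs is a hypothetical stand-in built on clojure.string/replace, and the set of escaped characters is an assumption, not a complete escaping policy.

```clojure
(require '[clojure.string :as str])

;; Hypothetical stand-in for a paired-list re-gsub: applies each
;; [regex replacement] pair to the string, left to right.
(defn replace-all-pairs [s pairs]
  (reduce (fn [acc [re replacement]]
            (str/replace acc re replacement))
          s
          pairs))

(defn html-escape [s]
  ;; "&" must be handled first, otherwise the ampersands introduced by
  ;; the later replacements would be escaped a second time.
  (replace-all-pairs s [[#"&"  "&amp;"]
                        [#"<"  "&lt;"]
                        [#">"  "&gt;"]
                        [#"\"" "&quot;"]]))

(html-escape "<a href=\"x\">Tom & Jerry</a>")
```

The ordering of the pairs matters, which is one argument for keeping the pairs in a sequential collection rather than a map.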
>
> Okay, that's everything I can think of for now.  I'd like to thank
> Stuart Sierra and all of the contributors to this library.  This is
> possible because I'm standing on their shoulders.
>
> Oh, and I apologize for not putting this up on github, especially
> after I asked someone else to do the same yesterday.  I'll try not to
> be so hypocritical going forward.
>
> *CODE*
>
> (ns devlinsf.str-utils)
>
> ;;; String Merging & Slicing
>
> (defn str-join
>  "Returns a string of all elements in 'sequence', separated by
>  'separator'.  Like Perl's 'join'."
>  [separator sequence]
>  (apply str (interpose separator sequence)))
>
>
> (defmulti re-split (fn[input-string & remaining-inputs] (class (first
> remaining-inputs))))
>
> (defmethod re-split java.util.regex.Pattern
>  ([string #^java.util.regex.Pattern pattern] (seq (. pattern (split
> string)))))
>
> (defmethod re-split clojure.lang.PersistentList
>  [input-string patterns]
>  (let [reversed (reverse patterns)
>        pattern (first reversed)
>        remaining (rest reversed)]
>    (if (empty? remaining)
>      (re-split input-string pattern)
>      (map #(re-split % pattern) (re-split input-string (reverse
> remaining))))))
>
> (defmethod re-split clojure.lang.PersistentArrayMap
>  [input-string map-options]
>  (cond (:limit map-options) (take (:limit map-options) (re-split
> input-string (dissoc map-options :limit)))
>        (:marshal-fn map-options) (map (:marshal-fn map-options) (re-split
> input-string (dissoc map-options :marshal-fn)))
>        'true (re-split input-string (:pattern map-options))))
>
> (defmulti re-partition (fn[input-string & remaining-inputs] (class
> (first remaining-inputs))))
>
> (defmethod re-partition java.util.regex.Pattern
>  [string #^java.util.regex.Pattern re]
>  (let [m (re-matcher re string)]
>    ((fn step [prevend]
>       (lazy-seq
>        (if (.find m)
>          (cons (.subSequence string prevend (.start m))
>                (cons (re-groups m)
>                      (step (+ (.start m) (count (.group m))))))
>          (when (< prevend (.length string))
>            (list (.subSequence string prevend (.length string)))))))
>     0)))
>
> (defmethod re-partition clojure.lang.PersistentList
>  [input-string patterns]
>  (let [reversed (reverse patterns)
>        pattern (first reversed)
>        remaining (rest reversed)]
>    (if (empty? remaining)
>      (re-partition input-string pattern)
>      (map #(re-partition % pattern) (re-partition input-string
> (reverse remaining))))))
>
> (defmulti re-gsub (fn[input-string & remaining-inputs] (class (first
> remaining-inputs))))
>
> (defmethod re-gsub java.util.regex.Pattern
>  [#^String string #^java.util.regex.Pattern regex replacement]
>  (if (ifn? replacement)
>    (let [parts (vec (re-partition string regex))]
>      (apply str
>             (reduce (fn [parts match-idx]
>                       (update-in parts [match-idx] replacement))
>                     parts (range 1 (count parts) 2))))
>    (.. regex (matcher string) (replaceAll replacement))))
>
> (defmethod re-gsub clojure.lang.PersistentList
>  [input-string regex-pattern-pairs]
>  (let [reversed (reverse regex-pattern-pairs)
>        pair (first reversed)
>        remaining (rest reversed)]
>    (if (empty? remaining)
>      (re-gsub input-string (first pair) (second pair))
>      (re-gsub (re-gsub input-string (reverse remaining)) (first pair)
> (second pair)))))
>
>
> (defmulti re-sub (fn[input-string & remaining-inputs] (class (first
> remaining-inputs))))
>
> (defmethod re-sub java.util.regex.Pattern
>  [#^String string #^java.util.regex.Pattern regex replacement ]
>  (if (ifn? replacement)
>    (let [m (re-matcher regex string)]
>      (if (.find m)
>        (str (.subSequence string 0 (.start m))
>             (replacement (re-groups m))
>             (.subSequence string (.end m) (.length string)))
>        string))
>    (.. regex (matcher string) (replaceFirst replacement))))
>
> (defmethod re-sub clojure.lang.PersistentList
>  [input-string regex-pattern-pairs]
>  (let [reversed (reverse regex-pattern-pairs)
>        pair (first reversed)
>        remaining (rest reversed)]
>    (if (empty? remaining)
>      (re-sub input-string (first pair) (second pair))
>      (re-sub (re-sub input-string (reverse remaining)) (first pair)
> (second pair)))))
>
> ;;; Parsing Helpers
> (defn str-before [input-string regex]
>  (let [matches (re-partition input-string regex)]
>    (first matches)))
>
> (defn str-before-inc [input-string regex]
>  (let [matches (re-partition input-string regex)]
>    (str (first matches) (second matches))))
>
> (defn str-after [input-string regex]
>  (let [matches (re-partition input-string regex)]
>    (str-join "" (rest (rest matches)))))
>
> (defn str-after-inc [input-string regex]
>  (let [matches (re-partition input-string regex)]
>    (str-join "" (rest matches))))
>
>
> ;;; Inflectors
> ;;; These methods only take the input string.
> (defn str-reverse
>  "This method accepts a string and returns the reversed string as a
> result"
>  [input-string]
>  (apply str (reverse input-string)))
>
>
> (defn upcase
>  "Converts the entire string to upper case"
>  [input-string]
>  (. input-string toUpperCase))
>
> (defn downcase
>  "Converts the entire string to lower case"
>  [input-string]
>  (. input-string toLowerCase))
>
> (defn trim
>  "Shortcut for String.trim"
>  [input-string]
>  (. input-string trim))
>
> (defn strip
>  "Alias for trim, like Ruby."
>  [input-string]
>  (trim input-string))
>
> (defn ltrim
>  "This method chops all of the leading whitespace."
>  [input-string]
>  (re-sub input-string #"^\s+" ""))
>
> (defn rtrim
>  "This method chops all of the trailing whitespace."
>  [input-string]
>  (re-sub input-string #"\s+$" ""))
>
> (defn chop
>  "Removes the last character of string."
>  [input-string]
>  (subs input-string 0 (dec (count input-string))))
>
> (defn chomp
>  "Removes all trailing newline \\n or return \\r characters from
>  string.  Note: String.trim() is similar and faster."
>  [input-string]
>  (re-sub input-string #"[\r\n]+$" ""))
>
> (defn capitalize
>  "This method turns a string into a capitalized version, Xxxx"
>  [input-string]
>  (str-join "" (list
>                (upcase (str (first input-string)))
>                (downcase (apply str (rest input-string))))))
>
> (defn titleize
>  "This method takes an input string, splits it across whitespace,
> dashes, and underscores.  Each word is capitalized, and the result is
> joined with \" \"."
>  [input-string]
>  (let [words (re-split input-string #"[\s_-]+")]
>    (str-join " " (map capitalize words))))
>
> (defn camelize
>  "This method takes an input string, splits it across whitespace,
> dashes, and underscores.  The first word is downcased, the rest are
> capitalized, and the result is joined with \"\"."
>  [input-string]
>  (let [words (re-split input-string #"[\s_-]+")]
>    (str-join "" (cons (downcase (first words)) (map capitalize (rest
> words))))))
>
> (defn dasherize
>  "This method takes an input string, splits it across whitespace,
> dashes, and underscores.  Each word is downcased, and the result is
> joined with \"-\"."
>  [input-string]
>  (let [words (re-split input-string #"[\s_-]+")]
>    (str-join "-" (map downcase words))))
>
> (defn underscore
>  "This method takes an input string, splits it across whitespace,
> dashes, and underscores.  Each word is downcased, and the result is
> joined with \"_\"."
>  [input-string]
>  (let [words (re-split input-string #"[\s_-]+")]
>    (str-join "_" (map downcase words))))
>
> ;;; Escapees
>
> ;TO-DO
>
> ;(defn sql-escape[x])
> ;(defn html-escape[x])
> ;(defn javascript-escape[x])
> ;(defn pdf-escape)
>
>
> *UNIT TESTS*
> (ns devlinsf.test-contrib.str-utils
>    (:use clojure.contrib.test-is
>          devlinsf.str-utils))
>
> (deftest test-str-reverse
>  (is (= (str-reverse "Clojure") "erujolC")))
>
> (deftest test-downcase
>  (is (= (downcase "Clojure") "clojure")))
>
> (deftest test-upcase
>  (is (= (upcase "Clojure") "CLOJURE")))
>
> (deftest test-trim
>  (is (= (trim "  Clojure  ") "Clojure")))
>
> (deftest test-strip
>  (is (= (strip "  Clojure  ") "Clojure")))
>
> (deftest test-ltrim
>  (is (= (ltrim "  Clojure  ") "Clojure  ")))
>
> (deftest test-rtrim
>  (is (= (rtrim "  Clojure  ") "  Clojure")))
>
> (deftest test-chop
>  (is (= (chop "Clojure") "Clojur")))
>
> (deftest test-chomp
>  (is (= (chomp "Clojure \n") "Clojure "))
>  (is (= (chomp "Clojure \r") "Clojure "))
>  (is (= (chomp "Clojure \n\r") "Clojure ")))
>
> (deftest test-capitalize
>  (is (= (capitalize "clojure") "Clojure")))
>
> (deftest test-titleize
>  (let [expected-string "Clojure Is Awesome"]
>    (is (= (titleize "clojure is awesome") expected-string))
>    (is (= (titleize "clojure   is  awesome") expected-string))
>    (is (= (titleize "CLOJURE IS AWESOME") expected-string))
>    (is (= (titleize "clojure-is-awesome") expected-string))
>    (is (= (titleize "clojure- _ is---awesome") expected-string))
>    (is (= (titleize "clojure_is_awesome") expected-string))))
>
> (deftest test-camelize
>  (let [expected-string "clojureIsAwesome"]
>    (is (= (camelize "clojure is awesome") expected-string))
>    (is (= (camelize "clojure   is  awesome") expected-string))
>    (is (= (camelize "CLOJURE IS AWESOME") expected-string))
>    (is (= (camelize "clojure-is-awesome") expected-string))
>    (is (= (camelize "clojure- _ is---awesome") expected-string))
>    (is (= (camelize "clojure_is_awesome") expected-string))))
>
> (deftest test-underscore
>  (let [expected-string "clojure_is_awesome"]
>    (is (= (underscore "clojure is awesome") expected-string))
>    (is (= (underscore "clojure   is  awesome") expected-string))
>    (is (= (underscore "CLOJURE IS AWESOME") expected-string))
>    (is (= (underscore "clojure-is-awesome") expected-string))
>    (is (= (underscore "clojure- _ is---awesome") expected-string))
>    (is (= (underscore "clojure_is_awesome") expected-string))))
>
> (deftest test-dasherize
>  (let [expected-string "clojure-is-awesome"]
>    (is (= (dasherize "clojure is awesome") expected-string))
>    (is (= (dasherize "clojure   is  awesome") expected-string))
>    (is (= (dasherize "CLOJURE IS AWESOME") expected-string))
>    (is (= (dasherize "clojure-is-awesome") expected-string))
>    (is (= (dasherize "clojure- _ is---awesome") expected-string))
>    (is (= (dasherize "clojure_is_awesome") expected-string))))
>
> (deftest test-str-before
>  (is (= (str-before "Clojure Is Awesome" #"Is") "Clojure ")))
>
> (deftest test-str-before-inc
>  (is (= (str-before-inc "Clojure Is Awesome" #"Is") "Clojure Is")))
>
> (deftest test-str-after
>  (is (= (str-after "Clojure Is Awesome" #"Is") " Awesome")))
>
> (deftest test-str-after-inc
>  (is (= (str-after-inc "Clojure Is Awesome" #"Is") "Is Awesome")))
>
> (deftest test-str-join
>  (is (= (str-join " " '("A" "B")) "A B")))
>
> (deftest test-re-split-single-regex
>  (let [source-string "1\t2\t3\n4\t5\t6"]
>    (is (= (re-split source-string #"\n") '("1\t2\t3" "4\t5\t6")))))
>
> (deftest test-re-split-single-map
>  (let [source-string "1\t2\t3\n4\t5\t6"]
>    (is (= (re-split source-string {:pattern #"\n"}) '("1\t2\t3"
> "4\t5\t6")))
>    (is (= (re-split source-string {:pattern #"\n" :limit 1})
> '("1\t2\t3")))
>    (is (= (re-split source-string {:pattern #"\n" :marshal-fn #(str %
> "\ta")}) '("1\t2\t3\ta" "4\t5\t6\ta")))
>    (is (= (re-split source-string {:pattern #"\n" :limit 1
> :marshal-fn #(str % "\ta")}) '("1\t2\t3\ta")))
>    ))
>
> (deftest test-re-split-single-element-list
>  (let [source-string "1\t2\t3\n4\t5\t6"]
>    (is (= (re-split source-string (list #"\n")) '("1\t2\t3"
> "4\t5\t6")))))
>
> (deftest test-re-split-pure-list
>  (let [source-string "1\t2\t3\n4\t5\t6"]
>    (is (= (re-split source-string (list #"\n" #"\t")) '(("1" "2" "3")
> ("4" "5" "6"))))))
>
> (deftest test-re-split-mixed-list
>  (let [source-string "1\t2\t3\n4\t5\t6"]
>    (is (= (re-split source-string (list {:pattern #"\n" :limit 1}
> #"\t")) '(("1" "2" "3"))))
>    (is (= (re-split source-string (list {:pattern #"\n" :limit 1}
> {:pattern #"\t" :limit 2})) '(("1" "2"))))
>    (is (= (re-split source-string (list
>                                    {:pattern #"\n" :limit 1}
>                                    {:pattern #"\t" :limit 2
>                                     :marshal-fn #(java.lang.Double/parseDouble %)}))
>           '((1.0 2.0))))
>    (is (= (re-split source-string (list
>                                    {:pattern #"\n"}
>                                    {:pattern #"\t" :marshal-fn
> #(java.lang.Double/parseDouble
> %)}))
>           '((1.0 2.0 3.0) (4.0 5.0 6.0))))
>    (is (= (map #(reduce + %) (re-split source-string (list
>                                                       {:pattern #"\n"}
>                                                       {:pattern #"\t"
>                                                        :marshal-fn #(java.lang.Double/parseDouble %)})))
>           '(6.0 15.0)))
>    (is (= (reduce +(map #(reduce + %) (re-split source-string (list
>                                                                {:pattern
> #"\n"}
>                                                                {:pattern
> #"\t" :marshal-fn #(java.lang.Double/parseDouble
> %)}))))
>           '21.0))
>    ))
>
> (deftest test-re-partition
>  (is (= (re-partition "Clojure Is Awesome" #"\s+") '("Clojure" " "
> "Is" " " "Awesome"))))
>
> (deftest test-re-gsub
>  (let [source-string "1\t2\t3\n4\t5\t6"]
>    (is (= (re-gsub source-string #"\s+" " ") "1 2 3 4 5 6"))
>    (is (= (re-gsub source-string '((#"\s+" " "))) "1 2 3 4 5 6"))
>    (is (= (re-gsub source-string '((#"\s+" " ") (#"\d" "D")))
>           "D D D D D D"))))
>
> (deftest test-re-sub
>  (let [source-string "1 2 3 4 5 6"]
>    (is (= (re-sub source-string #"\d" "D") "D 2 3 4 5 6"))
>    (is (= (re-sub source-string '((#"\d" "D") (#"\d" "E")))
>           "D E 3 4 5 6"))))
>
> >
>
