Thanks, I should have done that tbh. My code is on GitHub: https://github.com/andrewchambers/ccc/blob/master/src/ccc/lex.clj . I don't think it compiles or runs on master currently, though. If anyone is interested, I'm trying to test the feasibility, size, and maintainability of a Clojure (or ClojureScript) version of this guy's tiny C compiler, https://github.com/rui314/8cc . Basically, I want to compare functional programming with what I consider excellent C code.
Cheers,

On Monday, May 5, 2014 4:43:27 PM UTC+12, Atamert Ölçgen wrote:
>
> I created a gist of your code for better readability, I hope you don't
> mind.
>
> https://gist.github.com/muhuk/7c4a2b8db63886e2a9cd
>
>
> On Mon, May 5, 2014 at 12:36 PM, Andrew Chambers <andrewc...@gmail.com> wrote:
>
>> I've been trying to make a tokenizer/lexer for a project of mine and came
>> up with the following code.
>> I've modelled the stream of characters as a seq/lazy-seq of chars, which is
>> then converted to a lazy-seq of token objects.
>> I'm relatively happy with how idiomatic and functional the code seems;
>> however, when benchmarked, the code takes about 30 seconds on Clojure (after
>> I increase the heap to 1 gig) to process a 30 meg file, and over 1 minute
>> 30 seconds with ClojureScript. This is in contrast to about 0.1 to 0.5
>> seconds or less in C. Is there any idiomatic way to process the file
>> without being a factor of 100 times slower than C?
>>
>> Also, is there a tool for Clojure similar to gprof for C?
>>
>>
>> Each function takes in a char seq and returns both a token and the seq
>> after it's been advanced.
>>
>> (defn match-ident
>>   [cs]
>>   (let [start (first cs)]
>>     (if (ident-first-char? start)
>>       (let [identseq (cons start (take-while ident-tail-char? (rest cs)))
>>             ^String ident (apply str identseq)]
>>         [(drop (.length ident) cs) [:ident ident]]))))
>>
>>
>> (defn match-num
>>   [cs]
>>   (if (digit? (first cs))
>>     (let [numseq (take-while digit? cs)
>>           ^String numstr (apply str numseq)
>>           retseq (drop (.length numstr) cs)]
>>       (if (= (first retseq) \.)
>>         nil
>>         [retseq [:number numstr]]))))
>>
>> (defn match-ws
>>   [cs]
>>   (if (whitespace-char? (first cs))
>>     (let [wsseq (take-while whitespace-char? cs)
>>           ^String wsstr (apply str wsseq)
>>           retseq (drop (.length wsstr) cs)]
>>       [retseq [:ws wsstr]])))
>>
>>
>> ...
>>
>> (defn next-token
>>   [cs]
>>   (or (match-ident cs)
>>       (match-ws cs)
>>       (match-punct cs)
>>       (match-num cs)
>>       (match-eof cs)
>>       (match-unknown cs)))
>>
>> ;; Here I build the lazy seq of tokens.
>>
>> (defn token-seq
>>   [cs]
>>   (let [[newcs tok] (next-token cs)]
>>     (lazy-seq (cons tok (token-seq newcs)))))
>>
>>
>> Cheers,
>> Andrew Chambers
>>
>>
>> --
>> You received this message because you are subscribed to the Google
>> Groups "Clojure" group.
>> To post to this group, send email to clo...@googlegroups.com
>> Note that posts from new members are moderated - please be patient with
>> your first post.
>> To unsubscribe from this group, send email to
>> clojure+u...@googlegroups.com
>> For more options, visit this group at
>> http://groups.google.com/group/clojure?hl=en
>> ---
>> You received this message because you are subscribed to the Google Groups
>> "Clojure" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to clojure+u...@googlegroups.com.
>> For more options, visit https://groups.google.com/d/optout.
>>
>
>
>
> --
> Kind Regards,
> Atamert Ölçgen
>
> -+-
> --+
> +++
>
> www.muhuk.com
>