Re: ANN: Gloss, a byte-format DSL

Zach Tellman Mon, 10 Jan 2011 14:02:08 -0800

I don't know if your example codec is as simple as your real problem,
but here's a codec that will work for the string you provided:


(repeated
  (string :utf-8 :delimiters ["\n" "\n\0"])
  :delimiters ["\n\0"] :strip-delimiters? false)

This terminates the whole sequence only on \n\0, but doesn't strip out
the terminator so that it can be used by the last string as well.
Right now this will not properly encode (the last token will be \n\n
\0, so there will be a spurious empty string at the end of the list),
but I plan on fixing that soon.
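
To make the intended decode result concrete, here is a plain-Clojure
sketch that does not use Gloss at all. The name decode-lines is
hypothetical, and it assumes the whole input is already in memory as a
string that contains the \n\0 terminator:

```clojure
(require '[clojure.string :as str])

;; Hypothetical helper, not part of Gloss: mimics what the codec above is
;; meant to decode, on an in-memory string rather than on byte buffers.
(defn decode-lines
  [^String s]
  (let [end (.indexOf s "\n\0")]
    (when (neg? end)
      (throw (IllegalArgumentException. "no \\n\\0 terminator found")))
    ;; Keep the \n that starts the terminator, since it also ends the last
    ;; string. A limit of -1 keeps empty strings; pop drops the empty piece
    ;; that follows the final \n.
    (pop (str/split (subs s 0 (inc end)) #"\n" -1))))

(decode-lines "abc\ndef\nghi\n\0")
```

That call should return ["abc" "def" "ghi"]: the sequence ends only at
\n\0, and the last string still gets its \n delimiter.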

Zach

On Jan 10, 11:47 am, "pepijn (aka fliebel)" <pepijnde...@gmail.com>
wrote:
> Oh, right. My code does have a start and an end. I'm using header for
> the start. So the only way Gloss could do this is with a fn that
> combines header and repeated into one?
>
> On Jan 10, 12:29 pm, Ken Wesson <kwess...@gmail.com> wrote:
>
> > On Mon, Jan 10, 2011 at 4:52 AM, pepijn (aka fliebel)
> > <pepijnde...@gmail.com> wrote:
> > > The latter. The string in my example is \n terminated, so it should
> > > read past the \0, and then the outer list should terminate on the last
> > > \0.
>
> > > My point is that I need to parse recursive structures, which of course
> > > contain the same terminator as the outer one.
>
> > > What if I wanted to parse null terminated lists of null terminated
> > > lists of ... of bytes? Like so:
>
> > > abc\0def\0\0
> > > def\0ghi\0jkl\0\0
> > > \0
>
> > > Your implementation would just make the outer list read to the first
> > > \0 and call it a day. What I need is that it only checks for a
> > > terminator on the outer list after reading a full inner list, and keep
> > > doing that, until it finds a terminator right after the inner list.
>
> > That's not going to work unless all leaves are at the same depth and
> > there are no empty lists at any depth. If the leaves are all at the
> > same depth:
>
> > (defn unflatten
> >   ([in-seq sentinel]
> >     (unflatten (seq in-seq) sentinel [[]] 0))
> >   ([in-seq sentinel stack height]
> >     (if in-seq
> >       (let [[f & r] in-seq]
> >         (if (= f sentinel)
> >           (let [h (inc height)
> >                 l1 (get stack h)
> >                 l1 (if l1 l1 [])
> >                 l2 (get stack height)
> >                 s1 (assoc stack h (conj l1 l2))
> >                 s2 (assoc s1 height [])]
> >             (recur r sentinel s2 (inc height)))
> >           (let [l1 (get stack 0)
> >                 s1 (assoc stack 0 (conj l1 f))]
> >             (recur r sentinel s1 0))))
> >       (peek stack))))
>
> > user=> (unflatten [1 2 0 3 4 0 0 5 6 0 7 8 0 0] 0)
> > [[[1 2] [3 4]]
> >  [[5 6] [7 8]]]
>
> > Otherwise you're going to need to have a start-of-list delimiter as
> > well as an end-of-list delimiter:
>
> > (defn unflatten
> >   ([in-seq start-del end-del]
> >     (unflatten (seq in-seq) start-del end-del (list [])))
> >   ([in-seq s e stack]
> >     (if in-seq
> >       (let [[f & r] in-seq]
> >         (condp = f
> >           e (let [[x y & z] stack]
> >               (if y
> >                 (recur r s e (conj z (conj y x)))
> >                 (throw (Error.))))
> >           s (recur r s e (conj stack []))
> >           (let [[x & y] stack]
> >             (recur r s e (conj y (conj x f))))))
> >       (if (> (count stack) 1)
> >         (throw (Error.))
> >         (first stack)))))
>
> > user=> (unflatten "(((ab)(cd))(ef)(g(hi)))" \( \))
> > [[[[\a \b] [\c \d]]
> >   [\e \f]
> >   [\g [\h \i]]]]

