Let's say I want to make a small parser for a small language. The
language is made up of values of two kinds: arrays and bits. An array
can only be of the form "[x,y]", where x and y are values. A bit can
only be of the form "0" or "1". In other words, the language's strings
can look like this:
- 0
- 1
- [0,0]
- [1,0]
- [1,1]
- [[0,0],1]
To build a parser, I define three metafunctions (conc, alt, and
literal) that create rules. A rule is a function that takes a
collection of tokens (in this case, characters) and returns a
two-element vector: the structure parsed so far and the remaining
tokens. If a rule receives tokens that are invalid for it, it returns
nil instead, and that nil propagates back up the stack of function
calls.
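As a minimal sketch of that contract, here is a hand-written rule that accepts any single token. The name any-token is hypothetical and exists only for illustration. The sketch assumes a current Clojure, where next returns nil once the tokens are exhausted.

```clojure
;; Hypothetical rule, for illustration only: accepts any single token.
;; Returns [product remaining-tokens], or nil when there is nothing to consume.
(defn any-token [tokens]
  (when-let [s (seq tokens)]
    [(first s) (next s)]))

(any-token (seq "ab")) ; => [\a (\b)]
(any-token (seq "a"))  ; => [\a nil]
(any-token nil)        ; => nil
```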
Clojure
user=> (defn conc [& subrules]
         (fn [tokens]
           (loop [subrule-queue (seq subrules)
                  remaining-tokens (seq tokens)
                  products []]
             (if (nil? subrule-queue)
               [products remaining-tokens]
               (let [[subrule-products subrule-remainder :as subrule-result]
                     ((first subrule-queue) remaining-tokens)]
                 (when-not (nil? subrule-result)
                   (recur (rest subrule-queue)
                          subrule-remainder
                          (conj products subrule-products))))))))
#'user/conc
user=> (defn alt [& subrules]
         (fn [tokens]
           (some #(% tokens) subrules)))
#'user/alt
user=> (defn literal [literal-token]
         (fn [tokens]
           (let [first-token (first tokens), remainder (rest tokens)]
             (when (= first-token literal-token)
               [first-token remainder]))))
#'user/literal
user=> (def on (literal \1))
#'user/on
user=> (def off (literal \0))
#'user/off
user=> (def bit (alt on off))
#'user/bit
user=> (bit (seq "1, 0"))
[\1 (\, \space \0)]
user=> (bit (seq "starst"))
nil
user=> (def array-start (literal \[))
#'user/array-start
user=> (array-start (seq "[1, 2, 3]"))
[\[ (\1 \, \space \2 \, \space \3 \])]
user=> (def array-end (literal \]))
#'user/array-end
user=> (def array-sep (literal \,))
#'user/array-sep
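To show how the pieces compose before any recursion is involved, here is a self-contained sketch of a non-nested pair of bits. The combinators are restated compactly so the snippet runs on its own. It assumes a current Clojure, where rest returns an empty sequence rather than nil, so the sketch uses next instead. The name flat-pair is hypothetical.

```clojure
;; Compact restatements of the combinators, so this runs on its own.
;; Assumption: a current Clojure, where `rest` never returns nil, hence `next`.
(defn literal [literal-token]
  (fn [tokens]
    (when (= (first tokens) literal-token)
      [(first tokens) (next tokens)])))

(defn alt [& subrules]
  (fn [tokens]
    (some #(% tokens) subrules)))

(defn conc [& subrules]
  (fn [tokens]
    (loop [queue (seq subrules), remaining (seq tokens), products []]
      (if (nil? queue)
        [products remaining]
        ;; A nil result from any subrule ends the loop and yields nil.
        (when-let [[product remainder] ((first queue) remaining)]
          (recur (next queue) remainder (conj products product)))))))

(def bit (alt (literal \0) (literal \1)))

;; Hypothetical rule: a flat pair of bits, with no nesting.
(def flat-pair
  (conc (literal \[) bit (literal \,) bit (literal \])))

(flat-pair (seq "[1,0]"))  ; => [[\[ \1 \, \0 \]] nil]
(flat-pair (seq "[1,0]x")) ; => [[\[ \1 \, \0 \]] (\x)]
(flat-pair (seq "[1;0]"))  ; => nil
```

Nothing here refers back to itself, so every subrule already exists when conc is called.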
Now, value and array are mutually recursive. Because Clojure
evaluates a variable reference immediately when it appears outside a
function body, and barring macros, I have to put either the value rule
or the array rule into a wrapper function. This does not behave as I
expect, though:
user=> (declare value)
#'user/value
user=> (defn array []
         ; I'm defining it as a function because otherwise I'd get an
         ; unbound variable exception.
         (conc array-start value array-sep value array-end))
#'user/array
user=> (def value (conc (array) bit))
#'user/value
user=> (def value (alt (array) bit))
#'user/value
user=> (value (seq "0"))
[\0 nil]
user=> (value (seq "[1,0]"))
nil
user=> (value (seq "[0,0]")) ; It should accept it, because (array) accepts it. But it doesn't:
nil
user=> ((array) (seq "[0,0]")) ; This works as intended:
[[\[ \0 \, \0 \]] nil]
user=> (value (seq "[0,3]")) ; This should return nil, but a weird argument exception is raised instead:
java.lang.IllegalArgumentException: Key must be integer (NO_SOURCE_FILE:0)
user=> ((array) (seq "[0,3]")) ; This is what I want:
nil
Can anybody shed light on why (value (seq "[0,0]")) and (value (seq
"[0,3]")) do not work as intended? This is a big problem with my
parser library, since dealing with two or more rules that refer to
each other is a huge pain. I'm considering switching my parser library
to macros if that could fix it more easily somehow.
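The evaluation-order behavior mentioned above can be reproduced in isolation. The sketch below is not a diagnosis of the transcript; it only illustrates when a var named value is looked up. All rule names other than value (old-rule, new-rule, eager-rule, deferred-rule, var-rule) are hypothetical, and the sketch assumes a current Clojure with default compiler settings.

```clojure
;; Hypothetical stand-ins for a rule; each returns [product remainder] or nil.
(defn old-rule [_tokens] nil)
(defn new-rule [tokens] [(first tokens) (next tokens)])

(def value old-rule)

;; Eager: the current value of `value` is read once, right here.
(def eager-rule value)

;; Deferred: `value` sits inside a function body, so it is looked up on every call.
(defn deferred-rule [tokens] (value tokens))

;; Also deferred: a var is itself callable and invokes whatever it currently holds.
(def var-rule #'value)

;; Rebinding `value` afterwards only affects the deferred forms.
(def value new-rule)

(eager-rule (seq "0"))    ; => nil
(deferred-rule (seq "0")) ; => [\0 nil]
(var-rule (seq "0"))      ; => [\0 nil]
```

In this sketch a rule that is built by passing value as an argument keeps whatever value held at that moment, while a reference inside a function body or through #'value sees later definitions.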