For instance, if your 'dsl' was YAML but a value could look like a 'ruby
block', then you would just need to parse YAML. Your parser wouldn't
have to care that a value happened to look like a 'ruby block'; that's
just payload like any other. And once parsed, you could investigate the
payloads using imperative code, or regexps, to see which
ones looked like 'a ruby block', if it mattered to pull those out.
And of course there's already a YAML parser built into ruby, you
wouldn't need to write one.
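Here's a rough sketch of what I mean, just to make it concrete. The
document contents, the key names, and the block_like_payloads helper are
all made up for illustration, and the regexp is a crude heuristic, not a
Ruby parser:

```ruby
require 'yaml'

# Hypothetical DSL document; the keys and payloads are invented examples.
doc = <<~YAML
  greeting: hello world
  filter: "{ |x| x > 3 }"
  mapper: "do |x| x * 2 end"
YAML

# Parse the YAML first, then pick out the string values that merely *look*
# like a Ruby block. The YAML parser never needs to know about blocks.
def block_like_payloads(yaml_text)
  # Crude heuristic: the whole value is wrapped in { ... } or do ... end.
  block_like = /\A\s*(?:\{.*\}|do\b.*\bend)\s*\z/m
  data = YAML.safe_load(yaml_text)
  data.select { |_key, value| value.is_a?(String) && value.match?(block_like) }
end

p block_like_payloads(doc).keys
```

What you then do with those payloads (eval them, reject them, whatever)
is a separate decision from parsing the outer structure.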
If your "dsl" is not YAML but is your own home-built thing, you could
try the same basic approach: structure it so your parser doesn't
actually have to recognize a 'block'; it's just a payload in the larger
structure.
One way or another, I suspect you want to re-think the nature of your
"dsl" to be easier to work with.
Ironically, however, Parslet itself demonstrates the utility of using
Plain Old Ruby for your "dsl" -- note how a parslet grammar is just ruby
code; it's not parsed by anything other than ruby itself. Again,
there's a reason many ruby libraries take this approach: writing your
own parser for a 'dsl' instead can be a lot of cost for little benefit
compared to just structuring your API such that plain old ruby is a
decent "dsl".
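To illustrate the plain-old-ruby approach, here's a minimal sketch. The
RuleSet class and its rule method are hypothetical names I'm inventing
for the example, not anything from Parslet or your code:

```ruby
# Hypothetical mini-DSL: RuleSet and `rule` are illustrative names only.
class RuleSet
  attr_reader :rules

  def initialize(&definition)
    @rules = {}
    # Run the caller's block with self set to this RuleSet, so bare
    # `rule ...` calls inside it land on the method below.
    instance_eval(&definition) if definition
  end

  # Each rule stores a real Ruby block; ruby itself has already parsed it.
  def rule(name, &body)
    @rules[name] = body
  end
end

rules = RuleSet.new do
  rule(:double) { |x| x * 2 }

  rule :big do |x|
    x > 3 if x.is_a?(Integer) # a trailing `if` needs no special handling
  end
end

p rules.rules[:double].call(21)
p rules.rules[:big].call(5)
```

Both the curly-brace and do..end forms work, including a suffix "if",
without any "end" balancing on your side, because ruby is doing the
parsing.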
On 5/19/2011 10:31 AM, Jonathan Rochkind wrote:
> So, do you think your task ends up being basically writing a parser for
> the ruby language? That's obviously a somewhat hard problem. :)
>
> What does the "DSL" you are embedding these 'blocks' in look like?
> There may be an easier way of approaching the parsing, knowing the
> context. Or you may be able to alter your "DSL" to make the parsing
> easier, by using more unambiguously parseable things in the rest of the DSL.
>
> But there's a reason that many ruby libraries use plain old ruby code as
> a "DSL" -- you already have a ruby parser, ruby itself, and don't need
> to write one.
>
> Jonathan
>
> On 5/18/2011 5:25 PM, Joe Hellerstein wrote:
>> The plot thickens when we consider handling "do..end" blocks in Ruby.
>>
>> To extend our curly-brace-balancing scheme, we'd need to balance all uses of
>> "end" inside a Ruby block. That is fine -- there's only a small set of Ruby
>> expressions that end in "end" and I can enumerate their starting words (if,
>> unless, while, until, case, for, class, module, def.)
>>
>> The problem is the way Ruby allows the use of "if"/"unless" at the end of a
>> Ruby statement without a matching "end". To do my "end" balancing right and
>> handle these cases, I'd need to recognize Ruby statements so I could figure
>> out that the "if"/"unless" logic was a suffix. Blech.
>>
>> Any further thoughts?
>>
>> J
>>
>>
>> On May 18, 2011, at 12:39 PM, Joe Hellerstein wrote:
>>
>>> Nicely done and thank you! As long as I don't mind enforcing
>>> delimiter-balancing in the thing I'm gobbling up (and in this case I
>>> don't), your trick works.
>>>
>>> Fixed example below for future ref.
>>>
>>> Joe
>>>
>>> require 'rubygems'
>>> require 'parslet'
>>>
>>> class Mini < Parslet::Parser
>>>   rule(:lbrace)  { str('{') >> space? }
>>>   rule(:rbrace)  { str('}') >> space? }
>>>   rule(:word)    { match['a-z'].repeat(1) >> space? }
>>>   rule(:space)   { match('\s').repeat(1) }
>>>   rule(:space?)  { space.maybe }
>>>   rule(:block)   { lbrace >> (content | block).repeat(1) >> rbrace }
>>>   rule(:content) { match['^{}'] }
>>>
>>>   rule(:stmt) { (block.as(:block) | word).repeat }
>>>   root :stmt
>>> end
>>>
>>> def parse(str)
>>>   mini = Mini.new
>>>   print "Parsing #{str}: "
>>>
>>>   p mini.parse(str)
>>> rescue Parslet::ParseFailed => error
>>>   puts error, mini.root.error_tree
>>> end
>>>
>>> parse "joe is here {hi {it's} joe}"
>>>
>>>
>>>
>>> On May 18, 2011, at 11:47 AM, Jonathan Rochkind wrote:
>>>
>>>> I.e., something like this maybe (just typing it into the email client, don't
>>>> know if it even compiles, let alone works, but may give you some ideas).
>>>>
>>>> rule :block do
>>>>   str('{') >> content.maybe >> block.maybe >> content.maybe >>
>>>>     str('}')
>>>> end
>>>>
>>>> rule :content do
>>>>   match['^{}'].repeat
>>>> end
>>>>
>>>> The trick I think might work is the recursive call to block that will
>>>> allow a (balanced) block to be inside a block, but otherwise we don't
>>>> allow '{' or '}'. Except I actually have no idea if this will actually
>>>> work, heh, but some ideas to work with.
>>>>