I always favor composition over inheritance (think of it as the
strategy pattern). See
http://stackoverflow.com/questions/243274/best-practice-with-unit-testing-abstract-classes/2947823#2947823
The key here is that rule definitions are nothing special. "rule" is just
a method that defines a method that returns a parser. :)
The parser >> operator takes a parser and returns a parser, so your rule
body is just defining a parser too.
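To make the "takes a parser and returns a parser" point concrete, here is a toy stand-in written in plain Ruby. It is not Parslet's implementation; ToyParser, lit and parses? are made-up names for illustration, and the sketch only shows why chaining with >> always leaves you holding another parser.

```ruby
# Toy stand-in for a parser atom (hypothetical; NOT Parslet's real code).
class ToyParser
  # matcher: block taking the input string and returning the unconsumed
  # rest of the string on success, or nil on failure.
  def initialize(&matcher)
    @matcher = matcher
  end

  def match(input)
    @matcher.call(input)
  end

  # Sequence operator: takes another parser and returns a NEW parser
  # that runs self first, then other on whatever input is left.
  def >>(other)
    ToyParser.new do |input|
      rest = match(input)
      rest && other.match(rest)
    end
  end

  # True only if the parser consumes the whole input.
  def parses?(input)
    match(input) == ''
  end
end

# Hypothetical helper: a parser that matches one literal string.
def lit(text)
  ToyParser.new { |input| input.start_with?(text) ? input[text.size..-1] : nil }
end

# The result of chaining is itself a parser, so it can be chained again.
day_month = lit('01') >> lit('/') >> lit('02')
```

Because day_month is just another parser object, it can be returned from a method, stored in a variable, or passed around like any other value.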
You can even do stuff like this (though I would prefer to pass in a date
parser myself):
def date_parser
  if @region == :uk
    day >> forwardslash >> ukmonth >> forwardslash >> year
  else
    usmonth >> forwardslash >> day >> forwardslash >> year
  end
end
You can then use it in rules like any other parser:
rule(:shorteeshort) {
  (space? >> at >> actor.as(:mainactor) >>
   action >>
   amountnum >> amountunits >> date_parser).as(:shortee)
}
---
"No man is an island... except Philip"
On Wed, Feb 6, 2013 at 9:15 AM, Jeremy Nevill wrote:
> Hi,
>
> Thanks for the quick replies… I think that the oo route is attractive as
> that would be easier to maintain and feel right.
>
> Here's my rather unsubtle excerpt for messages I know to be in UK format:
>
> rule(:shorteeukshort) {
> (space? >> at >> actor.as(:mainactor) >>
> action >>
> amountnum >> amountunits >>
> day >> forwardslash >>
> ukmonth >> forwardslash >> year).as(:shortee)
> }
>
> A lot of refactoring opportunity to be had… quite why I haven't extracted
> out the date format bit yet I don't know.
>
> Just for the record I've been doing a bit of performance testing on
> sending messages into my Shortee parser and it easily parses about
> 300/second when hosted in a rails app on my Macbook Pro using MongoDB as
> the backend.
>
> Thanks again, most appreciated.
>
> Regards,
>
> Jeremy
>
>
>
> On 5 Feb 2013, at 21:58, Jonathan Rochkind wrote:
>
> > 1) Worth investigating the 'chronic' gem, rather than writing a parser
> > yourself. It can handle all sorts of natural language ish date formats.
> >
> > 2) But for your actual question. Parslet parsers are ordinary ruby
> > classes. They can inherit from each other, as well as use modules.
> >
> > So you could start out with an abstract parser class that lacks the rule
> > for dates, and then have two subclasses, US and UK, both of which
> > define their own date rule.
> >
> > Or you could use any other implementation sharing OO design. For
> > instance, start out defining the US one as complete, then have the UK
> > one sub-class it and over-ride the relevant date rule. Or put the bulk
> > of your parser (without US/UK specific rules) in a ruby module, then
> > have both the US and UK ones 'include' that module, and supply their own
> > locale specific rules.
> >
> > I haven't actually tried any of these things recently, but they should
> > all work, something along those lines. That parslet parsers are just
> > ordinary ruby classes, to which you can apply ordinary ruby language
> > composition features, is one of the very strong points about parslet
> > in my opinion.
> >
> > On 2/5/2013 4:46 PM, Jeremy Nevill wrote:
> >> First of all, great parser and documentation which has helped me make
> >> the leap from regex to proper parsing.
> >>
> >> We're using Parslet to parse our Shortee event message format,
> >> sample messages being:
> >>
> >> @JeremyNevill ate 1lambchop 01/02/2013
> >> @JeremyNevill walked @Rover 3miles 12/dec/2012
> >>
> >> I have the parser working nicely, extracting the message entities
> >> defined in my syntax: https://github.com/JeremyNevill/shortee
> >>
> >> Now the issue I have is how to handle ambiguous dates as we have both
> >> US date format and UK date format clients:
> >>
> >> e.g. 01/02/2013 in the UK is 1st/Feb/2013 but 2nd/Jan/2013 in the US
> >>
> >> At present I have 2 very similar parsers, one that handles UK dates,
> >> the other that handles US… this is not very DRY and I'm wondering if
> >> there is a better way to go… maybe appending the date format required
> >> to the message when it gets sent into the parser.
> >>
> >> Any help will be most appreciated as I'm a bit stumped on the preferred
> >> method for problems like this.
> >>
> >> Regards,
> >>
> >> Jeremy Nevill
> >> www.nevill.net
> >>
>
>