RE: [ruby.parslet] International Date Formats

Jonathan Rochkind Tue, 05 Feb 2013 15:23:53 -0800

"prefer composition over inheritance" is indeed a good principle.

But I don't understand how you actually do what you're saying in this example, 
in practice. Can you provide a more complete example? Of two parsers that share 
most of their grammar, except for a different date rule?  (Or am I 
misunderstanding that as the path to the goal?)

I don't know how to share the bulk of the logic between two different parsers 
in a DRY way, except by using either inheritance or a mix-in (a mix-in is 
really just a kind of inheritance too). Your example doesn't show this. But 
I'm curious to see how to do it, if there is a way! Can you share?
________________________________
From: [email protected] [[email protected]] on behalf of 
Nigel Thorne [[email protected]]
Sent: Tuesday, February 05, 2013 6:02 PM
To: [email protected]
Subject: Re: [ruby.parslet] International Date Formats

I always favor composition over inheritance. (Think of it as the strategy 
pattern.) See 
http://stackoverflow.com/questions/243274/best-practice-with-unit-testing-abstract-classes/2947823#2947823

The key here is that rule definitions are nothing special. They are just 
methods that define methods that return parsers. :)

The parser >> operator takes a parser and returns a parser, so your rule body 
is just defining a parser too.

You can even do stuff like this (though I would prefer to pass in a date 
parser myself):

def date_parser
  if @region == :uk
    day >> forwardslash >> ukmonth >> forwardslash >> year
  else
    usmonth >> forwardslash >> day >> forwardslash >> year
  end
end

and use it in rules like any other parser.

rule(:shorteeshort) {
  (space? >> at >> actor.as(:mainactor) >>
    action >>
    amountnum >> amountunits >> date_parser).as(:shortee)
}

---
"No man is an island... except Philip"

On Wed, Feb 6, 2013 at 9:15 AM, Jeremy Nevill <[email protected]> wrote:
Hi,

Thanks for the quick replies… I think the OO route is attractive, as it would 
be easier to maintain and feels right.

Here's my rather unsubtle excerpt for messages I know to be in UK format:

rule(:shorteeukshort) {
  (space? >> at >> actor.as(:mainactor) >>
    action >>
    amountnum >> amountunits >>
    day >> forwardslash >>
    ukmonth >> forwardslash >> year).as(:shortee)
}

A lot of refactoring opportunity to be had… quite why I haven't extracted out 
the date format bit yet I don't know.

Just for the record, I've been doing a bit of performance testing on sending 
messages into my Shortee parser, and it easily parses about 300/second when 
hosted in a Rails app on my MacBook Pro using MongoDB as the backend.

Thanks again, most appreciated.

Regards,

Jeremy

On 5 Feb 2013, at 21:58, Jonathan Rochkind <[email protected]> wrote:

> 1) Worth investigating the 'chronic' gem, rather than writing a parser
> yourself. It can handle all sorts of natural language ish date formats.
>
> 2) But for your actual question. Parslet parsers are ordinary ruby
> classes. They can inherit from each other, as well as use modules.
>
> So you could start out with an abstract parser class that lacks the rule
> for dates, and then have two subclasses,  US and UK, both of which
> define their own date rule.
>
> Or you could use any other OO design for sharing implementation. For
> instance, start out defining the US one as complete, then have the UK
> one sub-class it and over-ride the relevant date rule. Or put the bulk
> of your parser (without US/UK specific rules) in a ruby module, then
> have both the US and UK ones 'include' that module, and supply their own
> locale specific rules.
>
> I haven't actually tried any of these things recently, but something
> along those lines should work. Parslet parsers are just ordinary ruby
> classes, so you can use ordinary ruby language composition features
> with them, which is one of the very strong points of parslet in my
> opinion.
>
> On 2/5/2013 4:46 PM, Jeremy Nevill wrote:
>> First of all, great parser and documentation which has helped me make the 
>> leap from regex to proper parsing.
>>
>> We're using Parslet to parse our Shortee event message format, sample 
>> messages being:
>>
>> @JeremyNevill ate 1lambchop 01/02/2013
>> @JeremyNevill walked @Rover 3miles 12/dec/2012
>>
>> I have the parser working nicely, extracting the message entities defined in 
>> my syntax:  https://github.com/JeremyNevill/shortee
>>
>> Now the issue I have is how to handle ambiguous dates as we have both US 
>> date format and UK date format clients:
>>
>> e.g. 01/02/2013 in the UK is 1st/Feb/2013 but 2nd/Jan/2013 in the US
>>
>> At present I have 2 very similar parsers, one that handles UK dates, the 
>> other that handles US… this is not very DRY and I'm wondering if there is a 
>> better way to go…maybe appending the date format required to the message 
>> when it gets sent into the parser.
>>
>> Any help will be most appreciated as I'm a bit stumped on the preferred 
>> method for problems like this.
>>
>> Regards,
>>
>> Jeremy Nevill
>> www.nevill.net
>>
