Re: [Pharo-dev] Pratt Parsers for PetitParser

Camille Fri, 12 Jun 2015 01:53:09 -0700

> On 12 Jun 2015, at 09:36, Jan Kurš <k...@iam.unibe.ch> wrote:
> 
> Hi, 
> 
> Did you tried to inspect the PPFailure in the Moose? There is a tab with a 
> tree structure giving you a pretty good overview, what is going wrong... Or 
> as an alternative, one can call:
> myParser enableDebug parse:myInput
>


I didn’t know about these features, there are cool :) Thanks!

> Otherwise, PetitParser really needs some nice error reporting system. I would 
> have integrated one, but I am not aware of any suitable solution :(

I would be happy to discuss any idea you’ll have on that subject :)

> 
> Cheers,
> Jan
> 
> On Fri, Jun 12, 2015 at 8:07 AM Francisco Garau <francisco.ga...@gmail.com 
> <mailto:francisco.ga...@gmail.com>> wrote:
> 
> Hi Camile
> 
> On 11 Jun 2015, at 10:47, Camille <camille.ter...@gmail.com 
> <mailto:camille.ter...@gmail.com>> wrote:
> 
>> 
>>> On 11 Jun 2015, at 09:30, Jan Kurš <k...@iam.unibe.ch 
>>> <mailto:k...@iam.unibe.ch>> wrote:
>>> 
>>> That sounds really cool and useful extension.
>>> 
>> Thank you Jan!
>>> Regarding the furthest failure, the core of the problem is the distinction 
>>> between an error and a failure. Error reports on a problem in the input, 
>>> while failure is information for choice parser combinator. In general, the 
>>> furthest failure is a better approximation of an error than the last 
>>> failure, so we use it.
>>> 
>>> I am not sure what exactly is the problem in case of PrattParser. I guess 
>>> the last failure gives better results for a user?
>>> 
>> Yes, with furthest failure I get errors from tokenization instead of my 
>> errors. 
>> For exemple with the calculator grammar I gave in my first mail when I parse 
>> ‘1+’ the furthestFailure is ‘$- expected at 2’ whereas I return ‘expression 
>> expected at 2’ because there's a whole expression missing. Same thing with 
>> ‘(1+2’ that returns ‘digit expected at 4’ instead of ‘$) expected at 4’.
>> 
>> But furthest failure gives wrong messages in other cases to. 
>> Consider this sequence parser:
>> 
>> keyword := #letter asParser plus , $: asParser.
>> keyword parse: ‘foo'
>> 
>> This returns 'letter expected at 3’, but no matter how many letters I add to 
>> the end I’ll still get ‘letter expected’. 
>> I want to get what is really missing: '$: expected at 3’.
> 
> Any of those error messages wouldn't be too bad if the failing production 
> rule were also mentioned. Something like "keyword rule failed with xx 
> expected at 3"
> 
> But yeah, that would require annotating the production rules which can 
> pollute the clarity of the grammar definition...
> 
> 
>> Maybe returning the “latest furthest failure” instead of the “first furthest 
>> failure” could solves the problem here (i.e. replacing > with >= in 
>> PPContext>>#noteFailure:)?
>>> One has to consider a pratt parser included in the normal parser, e. g. 
>>> Expressions parsed by pratt in a Java Grammar. Depending where an error 
>>> occurs, different strategy for choosing the proper failure is necessary :-/
>>> 
>> Indeed, my hack (redefining #parseWithContext:) works only when the Pratt 
>> parser is the top parser, but a soon as I compose it I’m screwed because 
>> only parseOn: is sent to the Pratt parser.
>> That’s why I wonder if letting the parser decide what to return wouldn’t 
>> solve the problem: by default the furthest failure but special parsers can 
>> still decide.
>>> Regarding tokenization, there is a message token, that returns 
>>> PPTokenParser, which transforms a parsed input into the PPToken object. 
>>> Perhaps this might be helpful?
>>> 
>> The Pratt tokens are special: a token points back to the parser that 
>> generated it (its “kind”).
>> PPTokenKind subclasses PPFlatteningParser and generates instances of 
>> PPPrattToken.
>> A PPTokenKind stores the precedence, the action to be executed when a token 
>> of this kind is met at the start of an expression (for terminals and 
>> prefixes) and the action to be executed when a token is met in the middle of 
>> an expression (for postfixes and infixes).
>> 
>> Cheers,
>> Camille
>>> Cheers Jan
>>> 
>>> 
>>> On Wed, Jun 10, 2015, 20:52 Richard Sargent 
>>> <richard.sarg...@gemtalksystems.com 
>>> <mailto:richard.sarg...@gemtalksystems.com>> wrote:
>>> camille teruel wrote
>>> >> On 10 Jun 2015, at 19:11, Chris Cunningham &lt;
>>> 
>>> > cunningham.cb@
>>> 
>>> > &gt; wrote:
>>> >>
>>> >> Inteteresting....
>>> >>
>>> >> On Wed, Jun 10, 2015 at 9:36 AM, Camille &lt;
>>> 
>>> > camille.teruel@
>>> 
>>> >  &lt;mailto:
>>> 
>>> > camille.teruel@
>>> 
>>> > &gt;> wrote:
>>> >> Hello Pharoers and Moosers,
>>> >>
>>> >> I did a Pratt parser extension for PetitParser.
>>> >>
>>> >>
>>> > <snip>
>>> >
>>> >> @PP Devs:
>>> >> I had trouble with the PPContext furthestFailure that is taken into
>>> >> account instead of the failures I return, so I had to redefine
>>> >> #parseWithContext: to return the failures I want. The results given by
>>> >> furthestFailure were not very meaningful in my case (the same is true for
>>> >> PPExpressionParser btw).
>>> >> But I guess it was introduced because it gives good results in other
>>> >> cases.
>>> >> So would it be possible to change this behavior to let the parser decide
>>> >> if it returns the furthestFailure or the original failure?
>>> >>
>>> >> The intent behind the furthestFailure is that it give the failure that
>>> >> gets the furthest into the source stream.  It is most useful when there
>>> >> are embedded choice operators in the parser - the original/pre furthest
>>> >> behaviour would return the last failure, which depending on the incoming
>>> >> stream and the order of the choice options could be significantly not
>>> >> useful.
>>> >>
>>> >> I ran into this when working with the sql parser, which started off with
>>> >> the outer choice of (by memory):
>>> >>    ^ selectStatement / insertStatement / updateStatement /
>>> >> deleteStatement
>>> >> If I was trying to part a select statement that had an error at the very
>>> >> end of the statement, the parser would return an error talking about how
>>> >> the incoming stream failed in deleteStatement.  Not useful.
>>> >>
>>> >> I would be saddened if this further failure was not available.
>>> >
>>> > Yes in that case returning the furthest failure gives better results.
>>> > However, this don’t give meaningful messages in all cases.
>>> > For exemple with the calculator I gave in my previous message, if I parse
>>> > ‘1+’ I want to get ‘expression expected at: 2’ but instead it returns ‘$-
>>> > expected at 2'.
>>> > I’m not proposing to remove this feature but to let parsers decide to use
>>> > it or not.
>>> > Something like (changes in bold):
>>> >
>>> > PPParser>>parseWithContext: context
>>> >       | result |
>>> >       context initializeFor: self.
>>> >       result := self parseOn: context.
>>> >
>>> >       "Return the furthest failure, it gives better results than the last
>>> > failure"
>>> >       (result isPetitFailure and: [ self wantsFurthestFailure and: [ 
>>> > context
>>> > furthestFailure notNil ] ])
>>> >               ifTrue: [ ^ context furthestFailure ].
>>> >       ^ result
>>> 
>>> This screams at me. Why not just delegate to the context and use a context
>>> that returns the preferred failure? e.g. end with:
>>> ^context preferredResultFor: result.
>>> 
>>> 
>>> >
>>> > PPParser>>wantsFurthestFailure
>>> >       ^ true
>>> >
>>> > Like this, one can return the failures he wants.
>>> >
>>> > PPPrattParser>>wantsFurthestFailure
>>> >       ^ false
>>> >
>>> >
>>> > Camille
>>> >
>>> >>
>>> >> -cbc
>>> 
>>> 
>>> 
>>> 
>>> 
>>> --
>>> View this message in context: 
>>> http://forum.world.st/Pratt-Parsers-for-PetitParser-tp4831456p4831486.html 
>>> <http://forum.world.st/Pratt-Parsers-for-PetitParser-tp4831456p4831486.html>
>>> Sent from the Pharo Smalltalk Developers mailing list archive at Nabble.com 
>>> <http://nabble.com/>.
>>> 
>>

Re: [Pharo-dev] Pratt Parsers for PetitParser

Reply via email to