Re: [Haskell-cafe] Troubles understanding Parsec Error Handling

2012-05-30 Thread Kevin Charter
Hi Matthias,

On Wed, May 30, 2012 at 1:36 PM, Matthias Hörmann wrote:

> parseTest (do; r1 <- anyOf ["Hello", "Hallo", "Foo", "HallofFame"]; r2 <-
> string "bla"; return (r1, r2)) "Hallofbla"
>
> which prints this:
>
> parse error at (line 1, column 8):unknown parse error
>
> And my question about this is made up of two parts
>
> 1. Why doesn't it print my "unexpected" message but instead says unknown
> parse error
> 2. Why is the location in the text off (I would expect it to fail at
> column 6 (first character beyond the result it could return) or 7 (first
> character that makes the string no prefix of any acceptable string)
>

What version of parsec 3 are you using? In version 3.1.1, I get (using
Text.Parsec.String instead of Text.Parsec.Text):

parse error at (line 1, column 1):
unexpected "Hallofb", expecting one of ["Hello","Hallo","Foo","HallofFame"]

which is what I would have expected, bearing in mind that 'try p' pretends
that it hasn't consumed input when 'p' fails.

I don't think you need to use 'try' in your 'anyOf' function, but you'll
have to change it to handle seeing the end of input if the one-character
look-ahead fails.

Kevin
-- 
Kevin Charter
kevin.char...@acm.org
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Troubles understanding Parsec Error Handling

2012-05-30 Thread Kevin Charter
On Wed, May 30, 2012 at 3:11 PM, Kevin Charter  wrote:

> What version of parsec 3 are you using? In version 3.1.1, I get (using
> Text.Parsec.String instead of Text.Parsec.Text):
>

Ah, answered my own question. I gather you're using 3.1.2, since it's the
first and so far only version with the Text.Parsec.Text module.

Kevin
-- 
Kevin Charter
kevin.char...@acm.org
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Troubles understanding Parsec Error Handling

2012-05-30 Thread Antoine Latter
On Wed, May 30, 2012 at 4:18 PM, Kevin Charter  wrote:
> On Wed, May 30, 2012 at 3:11 PM, Kevin Charter  wrote:
>>
>> What version of parsec 3 are you using? In version 3.1.1, I get (using
>> Text.Parsec.String instead of Text.Parsec.Text):
>
>
> Ah, answered my own question. I gather you're using 3.1.2, since it's the
> first and so far only version with the Text.Parsec.Text module.
>

We changed how 'try' handled errors in some cases in between 3.1.1 and
3.1.2. I'll take a look at this.

Antoine

> Kevin
> --
> Kevin Charter
> kevin.char...@acm.org
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Troubles understanding Parsec Error Handling

2012-05-30 Thread Roman Cheplyaka
* Matthias Hörmann  [2012-05-30 21:36:13+0200]
> And my question about this is made up of two parts
> 
> 1. Why doesn't it print my "unexpected" message but instead says unknown
> parse error
> 2. Why is the location in the text off (I would expect it to fail at column
> 6 (first character beyond the result it could return) or 7 (first character
> that makes the string no prefix of any acceptable string)

Thanks for reporting. This is a regression introduced by me in this patch:

Sun Feb 20 18:24:22 EET 2011  Roman Cheplyaka 
  * Choose the longest match when merging error messages

The source of the regression is that parsec sometimes generates dummy (aka
"unknown") error messages when no actual error has occurred. In your
case the dummy error has a "bigger" position because it was generated
by anyChar inside lookAhead.

So, when merging errors, before simply looking at the positions we
should check if one of them is dummy and just ignore it. The patch is
attached.

With this patch your code prints:

parse error at (line 1, column 7):
unexpected "Hallofb", expecting one of ["Hello","Hallo","Foo","HallofFame"]

This is probably still somewhat confusing to a user of your code
(there's no "Hallofb" starting at column 7), but is correct from
Parsec's point of view, because you generated this message while looking
at the 7th character.

-- 
Roman I. Cheplyaka :: http://ro-che.info/
1 patch for repository http://code.haskell.org/parsec3:

Thu May 31 01:38:09 EEST 2012  Roman Cheplyaka 
  * When merging error messages, prefer known messages to unknown ones
  
  This fixes a regression introduced by:
  
  Sun Feb 20 18:24:22 EET 2011  Roman Cheplyaka 
* Choose the longest match when merging error messages
  
  The source of the regression is that parsec sometimes generates dummy (aka
  "unknown") error messages when no actual error has occurred.
  
  So, when merging errors, before simply looking at the positions we should 
check
  if one of them is unknown and just ignore it.
  
  Reported by Matthias Hörmann.

New patches:

[When merging error messages, prefer known messages to unknown ones
Roman Cheplyaka **20120530223809
 Ignore-this: 1cfcc0a8d1cbfd183a3897e79c320c22
 
 This fixes a regression introduced by:
 
 Sun Feb 20 18:24:22 EET 2011  Roman Cheplyaka 
   * Choose the longest match when merging error messages
 
 The source of the regression is that parsec sometimes generates dummy (aka
 "unknown") error messages when no actual error has occurred.
 
 So, when merging errors, before simply looking at the positions we should check
 if one of them is unknown and just ignore it.
 
 Reported by Matthias Hörmann.
] {
hunk ./Text/Parsec/Error.hs 137
 = ParseError pos (msg : filter (msg /=) msgs)
 
 mergeError :: ParseError -> ParseError -> ParseError
-mergeError (ParseError pos1 msgs1) (ParseError pos2 msgs2)
+mergeError e1@(ParseError pos1 msgs1) e2@(ParseError pos2 msgs2)
+-- prefer meaningful errors
+| null msgs2 && not (null msgs1) = e1
+| null msgs1 && not (null msgs2) = e2
+| otherwise
 = case pos1 `compare` pos2 of
 -- select the longest match
 EQ -> ParseError pos1 (msgs1 ++ msgs2)
hunk ./Text/Parsec/Error.hs 145
-GT -> ParseError pos1 msgs1
-LT -> ParseError pos2 msgs2
+GT -> e1
+LT -> e2
 
 instance Show ParseError where
 show err
}

Context:

[TAG 3.1.2
Antoine Latter **20111008182138
 Ignore-this: 96361fd74cad3d51b4213e0bcd91cdf3
] 
[version bump for release
Antoine Latter **20111008181844
 Ignore-this: 9c28994644744eaf375d9c5d75d2b201
] 
[add Stream Text instances
Antoine Latter **20111008181718
 Ignore-this: fcf1bc6a54bae9936669e28047c4f736
] 
[Fix reserved name recognition for case-insensitive languages.
Antoine Latter **20111008180454
 Ignore-this: aed4027b1f273913f7586208e5a6f82c
] 
[Documentation fix
Roman Cheplyaka **20111228222953
 Ignore-this: 2d226ed7cde7a8322be04f5188957eb2
] 
[lookAhead: do not consume input on success; update documentation
Roman Cheplyaka **20110220162920
 Ignore-this: e884771490209b93e9fec044543a18ef
] 
[try: do not reset the error position
Roman Cheplyaka **20110220162449
 Ignore-this: 8508bc41fc6dcd9b7c06aac762f12c71
] 
[Choose the longest match when merging error messages
Roman Cheplyaka **20110220162422
 Ignore-this: 54e2733159a1574abb229e09ff6935c1
] 
[TAG 3.1.1
Antoine Latter **20110129160030
 Ignore-this: 42ddc9e7316d68945c2c1260c2acd403
] 
Patch bundle hash:
09fc71cdc9e86dc672f19bc8fb939103cae782bb
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Troubles understanding Parsec Error Handling

2012-05-30 Thread Kevin Charter
Hi Antoine and Roman,

On Wed, May 30, 2012 at 4:14 PM, Antoine Latter  wrote:

> We changed how 'try' handled errors in some cases in between 3.1.1 and
> 3.1.2. I'll take a look at this.
>
> Antoine
>

Thanks for confirming -- I tried 3.1.2 and got the same result as Matthias.
And Roman, thanks for the light-speed patch! I was about to say I had an
example that showed the problem might actually have to do with 'lookAhead'
rather than 'try', and then I saw your message.

Kevin
-- 
Kevin Charter
kevin.char...@acm.org
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Troubles understanding Parsec Error Handling

2012-05-31 Thread Matthias Hörmann
Hello

Thanks for the quick help with this. I thought about the idea that
lookAhead might be the cause of the
positioning bug but then discarded that idea because I thought lookAhead
should never lead to an error
past wherever the input position is now considering it doesn't consume any
input.

I am aware of the issue with the error message position and the output, I
was still working on improving
that when I was puzzled by the fact that the error message I specified
wasn't even returned to me.

As for try, I believe I need it to make sure the input I consume one
character at the time, before I know
if I will reach another valid match (or any at all) does not stay consumed
when my parser fails.

I am still very much in the experimental phase as far as writing Parsec
combinators beyond very simple
stuff is concerned so I am open for suggestions on how to improve it in a
way that doesn't need "try".

Thanks again for all the help and especially for the patch. After applying
it I do get the error message I
specified.

I noticed there are still some other problems in the code. In particular it
doesn't work as intended in cases
like this one:

parseTest (do; r1 <- anyOf ["Hello", "Hallo", "Foo", "HallofFame"]; r2 <-
string "fbla"; return (r1, r2)) "Hallofbla"

where it should (according to my goal) return no parse error but instead
accept "Hallo" and allow the string parser
to consume the rejected suffix but I will try to fix that.

Matthias Hoermann
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Troubles understanding Parsec Error Handling

2012-05-31 Thread Roman Cheplyaka
* Matthias Hörmann  [2012-05-31 10:40:31+0200]
> I noticed there are still some other problems in the code. In particular it
> doesn't work as intended in cases
> like this one:
> 
> parseTest (do; r1 <- anyOf ["Hello", "Hallo", "Foo", "HallofFame"]; r2 <-
> string "fbla"; return (r1, r2)) "Hallofbla"
> 
> where it should (according to my goal) return no parse error but instead
> accept "Hallo" and allow the string parser
> to consume the rejected suffix but I will try to fix that.

This looks more like a job for regular expressions.
E.g. using the regex-applicative package:

> let anyOf = foldr1 (<|>) . map string
> let re = (,) <$> anyOf ["Hello", "Hallo", "Foo", "HallofFame"] <*> string 
"fbla"
> "Hallofbla" =~ re

Just ("Hallo","fbla")

Theoretically regular expressions also do the kind of optimization that
you achieve with a trie, but this particular engine doesn't. Nevertheless,
it may be a good base for your own engine.

-- 
Roman I. Cheplyaka :: http://ro-che.info/

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Troubles understanding Parsec Error Handling

2012-06-11 Thread Antoine Latter
On Wed, May 30, 2012 at 5:47 PM, Roman Cheplyaka  wrote:
>
> With this patch your code prints:
>
>    parse error at (line 1, column 7):
>    unexpected "Hallofb", expecting one of ["Hello","Hallo","Foo","HallofFame"]
>

Hi folks,

Roman's patch has been included in the newly-released parsec 3.1.3:

http://hackage.haskell.org/package/parsec-3.1.3

Enjoy,

Antoine

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe