Stephane Bortzmeyer wrote:
On Mon, Jul 31, 2006 at 06:51:27PM +0100,
Chris Kuklewicz <[EMAIL PROTECTED]> wrote a message of 102 lines which said:

minilang = do
      char 'a'
      optional (try (do {comma ; char 'b'}))
      optional (do {comma ; char 'c'})
      eof
      return "OK"

I now have a new problem which was hidden beneath. If the language
authorizes "a,bb" and "a,bbc", "a,bbc" is not accepted by my parser
since it already accepted "a,bb" and the "c" which is left triggers a
syntax error.

This time, "try" believes it succeeded but should not. I need more
look-ahead but I'm not sure how?

The problem is mentioned here:

http://www.cs.uu.nl/people/daan/download/parsec/parsec.html#notFollowedBy

Your whole parser is indeed failing, and again it is because of the "failing after consuming some input" issue. For "a,bbc" your "bb" token parser consumes the "bb" and then the dangling "c" causes the error.

So you cannot commit to consuming the "bb" unless you know the rest of the string is okay. There are a few ways to accomplish this. The first would be to test whether "bb" is followed by "eof" or "comma" before accepting it. Another solution is to try and parse what follows "bb" before accepting "bb".

A small fix would look like:

minilang' = do
       string "a"
       optional (try $ do {comma ; string "bb"; endToken})
       optional (do {comma ; string "bbc"})
       eof
       return "OK"
  where endToken = eof <|> lookAhead (comma >> return ())

A more general fix looks like this:

stringLang :: [String] -> GenParser Char st [String]
stringLang items = polyLang comma (map string items)

listLang :: [Char] -> GenParser Char st [Char]
listLang items = polyLang comma (map char items)


The first version of polyLang uses the "test eof or comma before accepting" strategy:

polyLang :: (Show element,Show token) => GenParser element state ignore -> [GenParser element state token] -> GenParser element state [token]
polyLang _ [] = eof >> return []
polyLang separator input = (use input) <|> polyLang separator (tail input)
where use (opX:xs) = do (x,test) <- try (do x <- opX
                              test <- more
                              when test (separator >> return ())
                              return (x,test))
rest <- if test then (loop xs <|> unexpected ("(problem after "++show x++")")) else return []
          return (x:rest)
        more = option True (eof >> return False)
        loop [] = (unexpected "cannot parse")
        loop input' = use input' <|> loop (tail input')

The second version polyLang' uses the "test rest of input before accepting" strategy:

polyLang' :: (Show element,Show token) => GenParser element state ignore -> [GenParser element state token] -> GenParser element state [token]
polyLang' _ [] = eof >> return []
polyLang' separator input = (use input) <|> polyLang' separator (tail input)
  where use (opX:xs) = try (do x <- opX
                               test <- more
                               rest <- if test
                                         then separator >> (loop xs <|> unexpected ("(problem 
after "++show x++")"))
                                         else return []
                               return (x:rest))
        more = option True (eof >> return False)
        loop [] = (unexpected "cannot parse")
        loop input' = use input' <|> loop (tail input')

It works:

*Main> run (stringLang ["a","bb","bbc"]) "a,bbc"
["a","bbc"]

The error reporting gets a bit strange, and is different between the two versions of polyLang'

*Main> run (polyLang comma (map string ["a","bb","bbc","dd"])) "a,bbc,bb"
parse error at (line 1, column 7):
unexpected cannot parse or (problem after "bbc")
expecting "dd"

*Main> run (polyLang' comma (map string ["a","bb","bbc","d"])) "a,bbc,bb"
parse error at (line 1, column 1):
unexpected "c", cannot parse, (problem after "bbc"), (problem after "a") or "a"
expecting end of input, ",", "dd", "bb" or "bbc"

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to