Re: [Haskell-cafe] Optional EOF in Parsec.
Kannan Goundan wrote: I'm writing a parser with Parsec. In the input language, elements of a sequence are separated by commas: [1, 2, 3] However, instead of a comma, you can also use an EOL: [1, 2 3] Anywhere else, EOL is considered ignorable whitespace. So it's not as simple as just making EOL a token and looking for (comma | eol). Hi Kannan, let's construct the parser top-down. On the top level, you have opening and closing characters, '[' and ']'. Parsec has a function for that: between (char '[') (char ']) And what's in between? A list of elements separated by something. Parsec provides a sepBy function for that: element `sepBy` separator which parses a list of elements separated by separator. What's your separator? Well it's either ',' or a new line and spaces before and after that: mySpaces (newline | char ',') mySpaces -- [1] Let's combine what we've got: myListOf :: (Parsec String () a) - Parsec String () [a] myListOf elem = between (char '[') (char ']') (elem `sepBy` (mySpaces (newline | char ',') mySpaces)) where mySpaces = many (oneOf ( \t)) And test it in ghci: *Main parseTest (myListOf anyChar) [a , b, d ,d\np] abddp Hope this helps! Stephan PS: The important thing is that there are a lot solutions for tricky situations (like yours) in Text.Parsec.Combinator (especially the sepBy and many families). Knowing them can save a lot of work :) [1] I don't use parsec's spaces function because it also accepts newline characters. I've implemented this functionality in a hand-written parser (basically a hack that keeps track of whether the last read token was preceded by an EOL, without making EOL itself a token). Does anybody have ideas about how to do this with Parsec? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe -- Früher hieß es ja: Ich denke, also bin ich. Heute weiß man: Es geht auch so. - Dieter Nuhr ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Optional EOF in Parsec.
Kannan Goundan wrote: I've implemented this functionality in a hand-written parser (basically a hack that keeps track of whether the last read token was preceded by an EOL, without making EOL itself a token). Does anybody have ideas about how to do this with Parsec? You can do exactly the same with Parsec: * create a lexer that yields a [Token], including EOL tokens; * write a function of type [Token] - [(Token, Bool)] that discards EOLs and tells for each token whether it was preceded by a (now discarded) EOL; * write your pToken :: Token - Parsec Token function (I omitted some type variables there) that recognises one (Token, Bool)-tuple from the input stream. Or, perhaps easier: * create a lexer that yields a [Token], including EOL tokens; * write a function of type [Token] - [Token] that discards only those EOL tokens that aren't needed -- for example, those EOL tokens that occur when there are no open ['s, then parse those EOL's explicitly in your parser. Hope this helps, Martijn. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Optional EOF in Parsec.
I'm writing a parser with Parsec. In the input language, elements of a sequence are separated by commas: [1, 2, 3] However, instead of a comma, you can also use an EOL: [1, 2 3] Anywhere else, EOL is considered ignorable whitespace. So it's not as simple as just making EOL a token and looking for (comma | eol). I've implemented this functionality in a hand-written parser (basically a hack that keeps track of whether the last read token was preceded by an EOL, without making EOL itself a token). Does anybody have ideas about how to do this with Parsec? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Optional EOF in Parsec.
On Fri, Apr 3, 2009 at 8:17 PM, Kannan Goundan kan...@cakoose.com wrote: I'm writing a parser with Parsec. In the input language, elements of a sequence are separated by commas: [1, 2, 3] However, instead of a comma, you can also use an EOL: [1, 2 3] Anywhere else, EOL is considered ignorable whitespace. So it's not as simple as just making EOL a token and looking for (comma | eol). Untested, but hopefully enough so you get an idea of where to start: -- End of line parser. Consumes the carriage return, if present. eol :: Parser () eol = eof | char '\n' -- list-element separator. listSep :: Parser () listSep = eol | (char ',' spaces) -- list parser. The list may be empty - denoted by [] myListOf :: Parser a - Parser [a] myListOf p = char '[' sepBy p listSep = \vals - char ']' return vals This would probably be better off with a custom version of the 'spaces' parser that didn't parse newlines. Antoine ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe