Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
2009/12/16 Jason Dusek jason.du...@gmail.com: What is the relationship between the Parsec API, Applicative and Alternative? Is the only point of overlap `|`? Hello everyone, Lots of functions in Text.ParserCombinators.Parsec.Combinator can be defined with only with an obligation on Applicative and no need to access the token stream, parser state. etc. No surprise of course, as the equivalents were available in UU parsing. Here are some if the ones form Text.ParserCombinators.Parsec.Combinator defined just with Applicative obligations that I did for a two continuation parser monad a while ago. I was a bit surprised no-one had put something similar up on Hackage, maybe they have but its not yet indexed by Hayoo. The code is a bit old and might be some way off optimal, though the combinators do show the utility of applicative cons (:). Parsec's character parsers (Text.ParserCombinators.Parsec.Char) need access to the input. My attempts to define similar ones with only type class obligations rather than some concrete character type were dismal failures. Best wishes Stephen -- Applicative cons (:) :: Applicative f = f a - f [a] - f [a] (:) p1 p2 = (:) $ p1 * p2 choice :: Alternative f = [f a] - f a choice = foldr (|) empty count :: Applicative f = Int - f a - f [a] count i p | i = 0= pure [] | otherwise = p : count (i-1) p between :: Applicative f = f open - f close - f a - f a between o c a = o * a * c option :: Alternative f = a - f a - f a option x p = p | pure x optionMaybe :: Alternative f = f a - f (Maybe a) optionMaybe = optional -- aka Parsecs /optional/ optionUnit :: Alternative f = f a - f () optionUnit p = () $ p | pure () skipMany1 :: Alternative f = f a - f () skipMany1 p = p * skipMany p skipMany :: Alternative f = f a - f () skipMany p = many_p where many_p = some_p | pure () some_p = p * many_p -- | @many1@ an alias for @s...@. many1 :: Alternative f = f a - f [a] many1 = some sepBy :: Alternative f = f a - f b - f [a] sepBy p sep = sepBy1 p sep | pure [] sepBy1 :: Alternative f = f a - f b - f [a] sepBy1 p sep = p : step where step = (sep * p) : step | pure [] sepEndBy :: Alternative f = f a - f b - f [a] sepEndBy p sep = sepEndBy1 p sep | pure [] sepEndBy1 :: Alternative f = f a - f b - f [a] sepEndBy1 p sep = (p * sep) : step where step = (p * sep) : step | pure [] manyTill :: Alternative f = f a - f b - f [a] manyTill p end = step | pure [] where step = p : (step | (pure [] $ end)) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
2009/12/16 Stephen Tetley stephen.tet...@gmail.com: 2009/12/16 Jason Dusek jason.du...@gmail.com: What is the relationship between the Parsec API, Applicative and Alternative? Is the only point of overlap `|`? Lots of functions in Text.ParserCombinators.Parsec.Combinator can be defined with only with an obligation on Applicative and no need to access the token stream, parser state. etc. No surprise of course, as the equivalents were available in UU parsing. Is UU parsing more Applicative aware, then? When you say obligation, what do you mean? -- Jason Dusek ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
Hi Jason UU parsing somewhat invented the Applicative style - it defined the usual combinators from Control.Applicative ($), (*), (*), (*) etc. but didn't have an 'Applicative' type class. By obligation, I mean relying only on the Applicative class for the derived operations, here manyTill, choice etc. Best wishes Stephen 2009/12/16 Jason Dusek jason.du...@gmail.com: ... Is UU parsing more Applicative aware, then? When you say obligation, what do you mean? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
Hi everyone, While you are discussing performance of parsing combinator libraries, I though I'd mention parsimony, available from Hackage. It has as good performance as parsec v2 but it also has support for different buffer types (e.g., byte strings, including support for utf8 decoding, etc) which is similar to parsec v3. I also think that it has a slightly simpler API. -Iavor On Wed, Dec 16, 2009 at 4:45 AM, Stephen Tetley stephen.tet...@gmail.com wrote: Hi Jason UU parsing somewhat invented the Applicative style - it defined the usual combinators from Control.Applicative ($), (*), (*), (*) etc. but didn't have an 'Applicative' type class. By obligation, I mean relying only on the Applicative class for the derived operations, here manyTill, choice etc. Best wishes Stephen 2009/12/16 Jason Dusek jason.du...@gmail.com: ... Is UU parsing more Applicative aware, then? When you say obligation, what do you mean? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
On Sun, Dec 6, 2009 at 8:40 AM, Antoine Latter aslat...@gmail.com wrote: Well, the more eyes and test cases we can get on the new code the better. I'd also like to encourage anyone who's familiar with the idea of how parsec works to take a look at my changes to make sure they're comprehensible. Where would I look to find a darcs repo with your changes? It would be great to see parsec3 finally replace parsec2. Besides the performance issue, are there any other considerations keeping it from becoming the default? (I'd also like to write up a patch to add Data.Text support, not surprisingly.) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
Hi Bryan, On Tue, Dec 15, 2009 at 3:13 PM, Bryan O'Sullivan b...@serpentine.com wrote: Where would I look to find a darcs repo with your changes? It would be great to see parsec3 finally replace parsec2. Besides the performance issue, are there any other considerations keeping it from becoming the default? (I'd also like to write up a patch to add Data.Text support, not surprisingly.) This is the darcs repository, it's Antoine's work: http://community.haskell.org/~aslatter/code/parsec/cps I don't know about the second question, I think performance and just inertia were the big factors. I don't know, for example, if it's like quickcheck, where the new version is missing a few things that the old one provided. Everything appears to work for Pandoc though. Data.Text support would be wonderful! Paulo ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
On Tue, Dec 15, 2009 at 3:13 PM, Bryan O'Sullivan b...@serpentine.com wrote: Besides the performance issue, are there any other considerations keeping it from becoming the default? One thing that makes me a bit hesitant is that it's a pretty big change to the core parser data structure, to the extent that I'm not sure I should even call it Parsec. Reading through the Parsec technical report, one of the innovations that made Parsec what it was is that it introduced a new way of returning the four possible parse results Previous work with parsers returning good error message indicated parse reults with the following sort of data structure: data ParseResult s a = EmptyOk s a-- parsed ok but did not consume any input | EmptyError ErrorMessage -- did not parse okay, but did not consume any input | ConsumedOk s a -- parsed ok and consumed input | ConsumedError ErrorMessage -- did not parse okay and consumed input To something like: data Consumed a = Empty a | Consumed a data Reply s a = Ok s a | Error ErrorMessage data ParseResult s a = Consumed (Reply s a) This change allows us to determine whether a parser consumes input before forcing the computation that determines if the parser succeeds, improving performance and getting rid of a space leak. My branch takes us back to returning the four flattened results, but offers them as four continuations to take during parsing. So my biggest reservation is if I can even call my branch the same parser as Parsec. Antoine ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
So my biggest reservation is if I can even call my branch the same parser as Parsec. Maybe we could go the way regexes: say parsec is an API, and allow multiple backends with different characteristics as long as they basically implement that API. In practice, a lot of parser combinator libraries have something close to the parsec API already. They don't have to be completely compatible, i.e. you might have a 'parse' function take different arguments, but it would be practical to have a standard set of core combinators. Or come up with a new name for the API or call in parseclike or something :) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
What is the relationship between the Parsec API, Applicative and Alternative? Is the only point of overlap `|`? -- Jason Dusek ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
On Fri, Dec 4, 2009 at 7:47 PM, Paulo Tanimoto tanim...@arizona.edu wrote: Great! Antoine, other tests we should do? Derek, I apologize if you're already following this, but can you give us your opinion? Paulo Well, the more eyes and test cases we can get on the new code the better. I'd also like to encourage anyone who's familiar with the idea of how parsec works to take a look at my changes to make sure they're comprehensible. Antoine ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
On Friday December 4, 2009, John MacFarlane wrote: I used criterion to compare pandoc compiled with parsec2 to pandoc compiled with your version of parsec3. (The benchmark is converting testsuite.txt from markdown to HTML.) The difference was minor: parsec2: mean: 67.66576 ms, lb 67.56722 ms, ub 67.88983 ms, ci 0.950 std dev: 722.3878 us, lb 323.0343 us, ub 1.356013 ms, ci 0.950 parsec3: mean: 68.20847 ms, lb 68.16387 ms, ub 68.26284 ms, ci 0.950 std dev: 252.7773 us, lb 204.5512 us, ub 325.2424 us, ci 0.950 So, once you release the new parsec3, I am prepared to remove the parsec 3 restriction from the libraries I maintain: pandoc, highlighting-kate, filestore, gitit, and yst. I don't know what the performance of the current parsec3 is compared to parsec2. It would be helpful if you could run your benchmark for that also and include it. If the only issue is performance, I respectfully request that you remove the parsec 3 requirement even before the new version of parsec3 is released. Thank you. Howard ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
+++ Howard B. Golden [Dec 05 09 13:36 ]: On Friday December 4, 2009, John MacFarlane wrote: I used criterion to compare pandoc compiled with parsec2 to pandoc compiled with your version of parsec3. (The benchmark is converting testsuite.txt from markdown to HTML.) The difference was minor: parsec2: mean: 67.66576 ms, lb 67.56722 ms, ub 67.88983 ms, ci 0.950 std dev: 722.3878 us, lb 323.0343 us, ub 1.356013 ms, ci 0.950 parsec3: mean: 68.20847 ms, lb 68.16387 ms, ub 68.26284 ms, ci 0.950 std dev: 252.7773 us, lb 204.5512 us, ub 325.2424 us, ci 0.950 So, once you release the new parsec3, I am prepared to remove the parsec 3 restriction from the libraries I maintain: pandoc, highlighting-kate, filestore, gitit, and yst. I don't know what the performance of the current parsec3 is compared to parsec2. It would be helpful if you could run your benchmark for that also and include it. parsec 2.1.0.1 from HackageDB: mean: 67.71456 ms, lb 67.65181 ms, ub 67.82660 ms, ci 0.950 std dev: 416.1303 us, lb 274.0063 us, ub 761.6995 us, ci 0.950 parsec 3.0.1 from HackageDB: mean: 188.5380 ms, lb 188.3217 ms, ub 188.7615 ms, ci 0.950 std dev: 1.136199 ms, lb 964.3489 us, ub 1.366720 ms, ci 0.950 parsec 3.0.1 from Antoine: mean: 69.29665 ms, lb 69.22450 ms, ub 69.48016 ms, ci 0.950 std dev: 551.3562 us, lb 263.7954 us, ub 1.156183 ms, ci 0.950 If the only issue is performance, I respectfully request that you remove the parsec 3 requirement even before the new version of parsec3 is released. Thank you. Sorry, I don't want to do that. Lots of people have both parsec-2 and parsec-3 installed, and if I remove the restriction, their pandoc (and gitit and...) will be much slower unless they take special steps. John ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
On Mon, Nov 23, 2009 at 12:29 PM, Antoine Latter aslat...@gmail.com wrote: I finally had some time to test it. After running it multiple times (of course, it would be nice to use criterion here), I'm getting numbers in this neighborhood: I used criterion to compare pandoc compiled with parsec2 to pandoc compiled with your version of parsec3. (The benchmark is converting testsuite.txt from markdown to HTML.) The difference was minor: parsec2: mean: 67.66576 ms, lb 67.56722 ms, ub 67.88983 ms, ci 0.950 std dev: 722.3878 us, lb 323.0343 us, ub 1.356013 ms, ci 0.950 parsec3: mean: 68.20847 ms, lb 68.16387 ms, ub 68.26284 ms, ci 0.950 std dev: 252.7773 us, lb 204.5512 us, ub 325.2424 us, ci 0.950 So, once you release the new parsec3, I am prepared to remove the parsec 3 restriction from the libraries I maintain: pandoc, highlighting-kate, filestore, gitit, and yst. John ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
Hi John, On Fri, Dec 4, 2009 at 3:09 PM, John MacFarlane j...@berkeley.edu wrote: I used criterion to compare pandoc compiled with parsec2 to pandoc compiled with your version of parsec3. (The benchmark is converting testsuite.txt from markdown to HTML.) The difference was minor: parsec2: mean: 67.66576 ms, lb 67.56722 ms, ub 67.88983 ms, ci 0.950 std dev: 722.3878 us, lb 323.0343 us, ub 1.356013 ms, ci 0.950 parsec3: mean: 68.20847 ms, lb 68.16387 ms, ub 68.26284 ms, ci 0.950 std dev: 252.7773 us, lb 204.5512 us, ub 325.2424 us, ci 0.950 So, once you release the new parsec3, I am prepared to remove the parsec 3 restriction from the libraries I maintain: pandoc, highlighting-kate, filestore, gitit, and yst. John Great! Antoine, other tests we should do? Derek, I apologize if you're already following this, but can you give us your opinion? Paulo ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
On Fri, Dec 4, 2009 at 1:09 PM, John MacFarlane j...@berkeley.edu wrote: On Mon, Nov 23, 2009 at 12:29 PM, Antoine Latter aslat...@gmail.com wrote: I finally had some time to test it. After running it multiple times (of course, it would be nice to use criterion here), I'm getting numbers in this neighborhood: I used criterion to compare pandoc compiled with parsec2 to pandoc compiled with your version of parsec3. (The benchmark is converting testsuite.txt from markdown to HTML.) The difference was minor: Very nice, I was interested in parsec 3 but scared off by the reports of slowness, as I'm sure many others were. Is there any document out there describing the differences between 2 and 3? I gathered 3 allows more flexibility wrt the input, so you can more easily use ByteString or Text, but it would be nice to have a doc saying what the new features are and why we should be interested in upgrading. The old parsec docs were out of date even for parsec 2, and looks like they haven't been updated. The new ones look like they use haddock which is great, that was a gripe I had about the old doc. However, the haddock docs are less friendly than the old doc. So my suggestion is to paste the old introduction (with Daan permission, of course) or something similar into the Text.Parsec description field, along with links to more detailed descriptions and tutorial in the style of v2 on haskell.org along with a 2 vs. 3 doc, even if they're sketchy and brief. Or if it's ok I could just send some darcs patches :) ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Optimizing Parsec 3 -- was: Wiki software?
On Fri, Dec 4, 2009 at 11:01 PM, Evan Laforge qdun...@gmail.com wrote: On Fri, Dec 4, 2009 at 1:09 PM, John MacFarlane j...@berkeley.edu wrote: On Mon, Nov 23, 2009 at 12:29 PM, Antoine Latter aslat...@gmail.com wrote: I finally had some time to test it. After running it multiple times (of course, it would be nice to use criterion here), I'm getting numbers in this neighborhood: I used criterion to compare pandoc compiled with parsec2 to pandoc compiled with your version of parsec3. (The benchmark is converting testsuite.txt from markdown to HTML.) The difference was minor: Very nice, I was interested in parsec 3 but scared off by the reports of slowness, as I'm sure many others were. Is there any document out there describing the differences between 2 and 3? I gathered 3 allows more flexibility wrt the input, so you can more easily use ByteString or Text, but it would be nice to have a doc saying what the new features are and why we should be interested in upgrading. Basically, the main (only) significant changes are that Parsec 3 provides a monad transformer rather than just a monad and the input has been generalized to take arbitrary Streams rather than lists of tokens. The old parsec docs were out of date even for parsec 2, and looks like they haven't been updated. The new ones look like they use haddock which is great, that was a gripe I had about the old doc. However, the haddock docs are less friendly than the old doc. So my suggestion is to paste the old introduction (with Daan permission, of course) or something similar into the Text.Parsec description field, along with links to more detailed descriptions and tutorial in the style of v2 on haskell.org along with a 2 vs. 3 doc, even if they're sketchy and brief. The Parsec Letter applies to Parsec 3 readily. The only thing that needs changing is the module names and possibly one or two function names which the haddock documentation should readily point out. The letter obviously does not cover the new features of Parsec 3; for that there is only the haddock at this point. It would be nice to update the Parsec letter but that would ideally require the document source and necessarily require Daan's permission. Unfortunately, no one has been able to get in touch with Daan on this issue to my knowledge. Or if it's ok I could just send some darcs patches :) You can certainly email me patches and I'll likely apply them if there are no copyright or licensing issues. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe