Dear Doaitse,
In the days since my original post I had already come to favor the
uu-parsing package. I have printed the report and read it every day to
figure out how to use it. I cannot follow everything yet, and also hope
that won't be necessary in order to use it. :-)
My progress is a bit slow, but I'm not giving up. What I do like most,
over the other combinatory packages, is the approach of using
"breadth-first" when it comes to choice, the idea is certainly
enlightening. The packages capability to do "online- / partial parsing"
is essential for me.
I am a bit surprised about it's "raw" state. The basic combinators and
primitives are there but combinators like pChain, pDigit etc. are not
predefined and merely present in the examples package.
I had gotten quite comfortable with parsec and need to find the right
way to "translate" my parsec code to your package.
Anyway let me thank you for your work, I really appreciate it very much.
Günther
S. Doaitse Swierstra schrieb:
Lazy parsing has been the default for the last ten years in uulib, and
is now available in the simple uu-parsinglib
(http://hackage.haskell.org/cgi-bin/hackage-scripts/package/uu-parsinglib).
The whole design of the latter in described in a technical report to
which references are given on the web page. It provides also error
correction, the ability to use several different kinds of input
tokens, and (with some help) ambiguities. If speed is an issue you can
insert extra hints which locally change the breadth-first parsing
process locally into a somewhat more depth-first form. When compared
with Parsec the good news is that usually you do not have to put
annotations to get nice results.
The older uulib version also performs an abstract interpretation which
basically changes the search for which alternative to take from a
linear to a logarithmic complexity, but does not provide a monadic
structure, in which you use results recognised thus far to construct
new parsers.
Both the old uulib version and the new version have always had an
applicative interface.
In the near future elements of the abstract interpretation of the old
uulib version will migrate into the new version. It is the advent of
GADT's which made this new version feasable.
An example of the error correction at work at the following example code:
pa, pb, paz :: P_m (Str Char) [Char]
pa = lift <$> pSym 'a'
pb = lift <$> pSym 'b'
p <++> q = (++) <$> p <*> q
pa2 = pa <++> pa
pa3 = pa <++> pa2
pCount p = (\ a b -> b+1) <$> p <*> pCount p <<|> pReturn 0
pExact 0 p = pReturn []
pExact n p = (:) <$> p <*> pExact (n-1) p
paz = pMany (pSym ('a', 'z'))
paz' = pSym (\t -> 'a' <= t && t <= 'z', "a .. z", 'k')
main :: IO ()
main = do print (test pa "a")
print (test pa "b")
print (test pa2 "bbab")
print (test pa "ba")
print (test pa "aa")
print (test (do l <- pCount pa
pExact l pb) "aaacabbb")
print (test (amb ( (++) <$> pa2 <*> pa3 <|> (++) <$> pa3 <*> pa2))
"aaabaa")
print (test paz "ab1z7")
print (test paz' "m")
print (test paz' "")
is
loeki:~ doaitse$ ghci -package uu-parsinglib
GHCi, version 6.10.1: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer ... linking ... done.
Loading package base ... linking ... done.
Loading package syb ... linking ... done.
Loading package array-0.2.0.0 ... linking ... done.
Loading package filepath-1.1.0.1 ... linking ... done.
Loading package old-locale-1.0.0.1 ... linking ... done.
Loading package old-time-1.0.0.1 ... linking ... done.
Loading package unix-2.3.1.0 ... linking ... done.
Loading package directory-1.0.0.2 ... linking ... done.
Loading package process-1.0.1.1 ... linking ... done.
Loading package random-1.0.0.1 ... linking ... done.
Loading package haskell98 ... linking ... done.
Loading package uu-parsinglib-2.0.0 ... linking ... done.
Prelude> :m Text.ParserCombinators.UU.Examples
Prelude Text.ParserCombinators.UU.Examples> main
("a",[])
("a",[
Deleted 'b' at position 0 expecting one of ["'a'"],
Inserted 'a' at position 1 expecting one of ["'a'"]])
("aa",[
Deleted 'b' at position 0 expecting one of ["'a'"],
Deleted 'b' at position 1 expecting one of ["'a'"],
Deleted 'b' at position 3 expecting one of ["'a'"],
Inserted 'a' at position 4 expecting one of ["'a'"]])
("a",[
Deleted 'b' at position 0 expecting one of ["'a'"]])
("a",[
The token 'a'was not consumed by the parsing process.])
(["b","b","b","b"],[
Deleted 'c' at position 3 expecting one of ["'a'","'b'"],
Inserted 'b' at position 8 expecting one of ["'b'"]])
(["aaaaa"],[
Deleted 'b' at position 3 expecting one of ["'a'","'a'"]])
("abz",[
Deleted '1' at position 2 expecting one of ["'a'..'z'"],
The token '7'was not consumed by the parsing process.])
('m',[])
('k',[
Inserted 'k' at position 0 expecting one of ["a .. z"]])
Prelude Text.ParserCombinators.UU.Examples>
Doaitse Swierstra
On 27 mei 2009, at 01:52, GüŸnther Schmidt wrote:
Hi all,
is it possible to do lazy parsing with Parsec? I understand that one
can do that with polyparse, don't know about uulib, but I happen to
be already somewhat familiar with Parsec, so before I do switch to
polyparse I rather make sure I actually have to.
The files it has to parse is anywhere from 500 MB to 5 GB.
Günther
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe