Re: Arrow Classes
Ashley Yakeley wrote: Wolfgang Jeltsch wrote: This brings me to another point. One year ago we had a discussion on The Haskell Mailing List concerning arrows. (The subject of the mails was just arrows.) The point was that it seemed strange to me that first and second are included in the basic arrow class Arrow while left and right have their extra class ArrowChoice. Not only that it seemed strange to me but it made impossible to make Parser baseMonad an instance of Arrow. Parser baseMonad has nice implementations of pure and () but none of first or second. I agree. My own Arrow module hierarchy looks more or less like this: class Compositor comp where [...] class (Compositor arrow) = Arrow arrow where [...] class (Arrow arrow) = ProductArrow arrow where [...] class (Arrow arrow) = CoproductArrow arrow where [...] class (ProductArrow arrow,CoproductArrow arrow) = FullArrow arrow instance (ProductArrow arrow,CoproductArrow arrow) = FullArrow arrow class (Arrow arrow) = ArrowFix arrow where [...] class (FullArrow arrow) = ApplyArrow arrow where [...] On that topic, see below for what mine looks like (from HXML, URL: http://www.flightlab.com/~joe/hxml/ ). I started off with Hughes' conventions, but for some reason could never remember the difference between and ***, or between ||| and +++. I found , , |||, | to have better mnemonic value. This also frees up +++ for ArrowPlus, which -- in HXML applications -- is frequently used and should thus be easy to type. When using the ArrowChoice operators, I kept tripping over all the 'Either' coproduct types, so added some syntactic sugar (borrowed from HaXML): data Choice a = a : a class (Arrow a) = ArrowChoice a where [ ... ] ( ?) :: (b - Bool) - Choice (a b c) - a b c (?) :: a b Bool- Choice (a b c) - a b c I found p ? f : g much more pleasant to use. (I also like the idea of splitting the product operators out of the base Arrow class -- will consider doing that in my library). -- infixr 5 +++ infixr 3 , infixr 2 |, |||, ?, ?, : infixl 1 class Arrow a where arr :: (b - c) - a b c () :: a b c - a c d - a b d apfst :: a b c - a (b,x) (c,x) apsnd :: a b c - a (x,b) (x,c) () :: a b c - a d e - a (b,d) (c,e) () :: a b c - a b d - a b (c,d) liftA2 :: (b - c - d) - a e b - a e c - a e d aConst :: c - a b c idArrow :: a b b -- Minimal implementation: arr, , apfst or data Choice a = a : a class (Arrow a) = ArrowChoice a where apl :: a b c - a (Either b d) (Either c d) apr :: a b c - a (Either d b) (Either d c) (|) :: a b c - a d e - a (Either b d) (Either c e) (|||) :: a b c - a d c - a (Either b d) c ( ?) :: (b - Bool) - Choice (a b c) - a b c (?) :: a b Bool- Choice (a b c) - a b c -- Minimal implementation: | or apl class (Arrow a) = ArrowApply a where app :: a (a b c,b) c class (Arrow a) = ArrowZero a where aZero :: a b c aMaybe :: a (Maybe c) c aGuard :: (b - Bool) - a b b class (Arrow a) = ArrowPlus a where (+++) :: a b c - a b c - a b c --Joe English [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Is there a name for this structure?
Not really a Haskell question, but someone here might know the answer... Suppose you have two morphisms f : A - B and g : B - A such that neither (f . g) nor (g . f) is the identity, but satisfying (f . g . f) = f. Is there a conventional name for this? Alternately, same question, but f and g are functors and A and B categories. In some cases (g . f . g) is also equal to g; is there a name for this as well? I find myself running into pairs of functions with this property over and over again, and am looking for a short way to describe the property... Thanks, --Joe English [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
ANNOUNCE: HXML 0.2, XML parser for Haskell
Announcing HXML version 0.2, a non-validating XML parser written in Haskell. It is designed for space-efficiency, taking advantage of lazy evaluation to reduce memory requirements. HXML is available at: URL: http://www.flightlab.com/~joe/hxml The current version is 0.2, and is pre-beta quality. HXML has been tested with GHC 5.02, NHC 1.10, and various recent versions of Hugs 98. Changes in version 0.2: + New Arrow-based combinator library + Added support for CDATA sections + New function parseDocument recognizes (and ignores) the document prolog (XML and DOCTYPE declarations) + Several data structures and public functions have been renamed + Space fault in comment parsing fixed Please contact Joe English [EMAIL PROTECTED] with any questions, comments, or bug reports. --Joe English [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Lazy Parsing
Brandon Michael Moore wrote: I'm wondering if there are any libraries out there for creating parsers that lazily build up their result. I know I could thread the remaining input through a parser by hand, but it seems like someone should have already done it. This turns out to be rather difficult to do in the general case (but see below -- XML is a special case). If you have type Parser sym result = [sym] - Maybe (result, [sym]) a Parser can't decide whether to return 'Just (result,rest)' or 'Nothing' until it has successfully parsed the complete result. So pattern matching on the parser's return value will force the entire production. Variations on the theme -- Either instead of Maybe, list-of-successes, continuation-passing combinators, etc -- all face a similar problem. However, if your top-level grammar is of the form: things :: empty | thing things {- == thing* -} then instead of: case runParser (pMany pThing) input of Just (result,[]) - ... you can use something like unfoldr (runParser pThing) input to build the result list incrementally. This will be less eager; instead of parsing and returning an entire list of Things, it parses one Thing at a time. Another thing to watch out for is heap drag. The list-of-successes approach tends to retain the entire input, just in case the parser needs to backtrack. Parsec [1] and UU_Parsing [?] solve this by severely restricting the amount of required lookahead. I'd like to be able to turn a stream of XML into a lazy tree of tags (probably Maybe tags, or Either errors tags), but I don't think HaXml and the like do that sort of thing. That's exactly how HXML [2] works. The parser returns a lazy list of tokens (analogous to SAX events), which are folded up into a tree by a separate function. In addition it uses a CPS parser library so (as with Parsec), there is minimal heap drag. [1] Parsec: URL: http://www.cs.ruu.nl/~daan/parsec.html [1] HXML: URL: http://www.flightlab.com/~joe/hxml (Note: HXML release 0.2 will be ready Real Soon Now, and there have been many incompatible changes since 0.1. The main thing left to be finished is the documentation, if you can live without that let me know and I'll put a snapshot up.) --Joe English [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: HaXml, memory usage and segmentation fault
An update on Dmitry's problems with HaXml memory usage: + Compiling HaXml and the driver program with ghc -O helps a *lot*. + Using the version of HaXml that comes preinstalled with GHC (-package text) helps even more. There is a slight difference in the 'Pretty' module (which is used to print the output) between the two versions. + I wrote an adapter that converts my parser's XML representation into HaXml's, so you can use it as a drop-in replacement. This helps some, but not enough. The heap profile using HaXml 1.02 has two large humps: the first from parsing the input, and the second from pretty-printing the output. (With the GHC version of HaXml the second hump is about half as tall as with the official HaXml version). With the new parser, only the smaller hump remains. + Figuring that using a pretty-printer is overkill, I replaced it with a quick hack that converts the HaXml representation _back_ into my representation and feeds it to a serializer that I had previously written. This improves things some more: the identity transformation 'processXmlWith keep' now has a flat heap profile. + Unfortunately, Dmitry's original program still has a space leak. I suspect that the HaXml combinators (or, more likely, the HaXml internal representation) are not as space-efficient as I had originally thought, since when I rewrote Dmitry's test case to use the new parser's internal representation directly I again got a flat heap profile -- there doesn't seem to be anything wrong with the structure of the original program. The code will be ready to release Real Soon Now; I'll keep you posted. --Joe English [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: HaXml, memory usage and segmentation fault
Dmitry Astapov wrote: JE and the Hugs interpreter just isn't designed to work with large JE inputs. Try compiling the program instead. well, ghc-5.02 seems to dislike something inside XmlLib.hs - it could not find interface defs file for modules IOExts .. I plan to look more deeply into it though. I got it to compile with ghc 5.02 using ghc --make -package lang translate.hs The compiled version succeeds, but on a large document it uses a *lot* of memory and starts paging pretty badly. JE Try the identity transform 'main = processXmlWith keep' on your sample JE document and see if that runs out of heap too. If so, there's not JE much you can do short of replacing the HaXml parser. I tried this as well, modifying your program to use an XML parser I wrote a while ago that has better laziness properties than the HaXML one. Alas, my parser also suffers from a space leak under Hugs, so this only deferred the problem. Under ghc/ghci, though, it has modest memory requirements and runs without paging. --Joe English [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: HaXml, memory usage and segmentation fault
Dmitry Astapov wrote: JE I tried this as well, modifying your program to use an XML parser I JE wrote a while ago that has better laziness properties than the HaXML JE one. Alas, my parser also suffers from a space leak under Hugs, so JE this only deferred the problem. Under ghc/ghci, though, it has modest JE memory requirements and runs without paging. Is it's distribution restricted? Is it possible to get it somwhere, use it, patch it, etc? If you don't mind a complete lack of documentation, sure :-) The code is alpha quality; there are a few missing features and a couple of things that it just gets wrong, but it's basically working. I'll package it up and put it on the Web when I get a chance. This may take a day or two... --Joe English [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: HaXml, memory usage and segmentation fault
Dmitry Astapov wrote: I have Hugs version February 2001, HaXml version 1.02 and this program: [...] This program can process following file: ?xml version='1.0'? invoice [... one customer containing two contracts ... ] /invoice Now increase amount of customers to 10, and amount of contracts within each customer to 999. After that, runhugs -h600 translate.hs invoice.xml invoice.html dumps core :( What's the reason: bug in hugs, bug in HaXml, or my own bad programming techniques? More an inappropriate use of Hugs -- 10 customers with 999 contracts each is a moderately large input file, and the Hugs interpreter just isn't designed to work with large inputs. Try compiling the program instead. The other issue is that HaXml's XML parser is insufficiently lazy (although the rest of HaXml has very nice strictness properties). For instance, there's no reason why your program shouldn't run in near-constant space, but due to the way the parser is structured it won't begin producing any output until the entire input document has been read. Try the identity transform 'main = processXmlWith keep' on your sample document and see if that runs out of heap too. If so, there's not much you can do short of replacing the HaXml parser. --Joe English [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Multithreaded stateful software
Mark Carroll wrote: One of the projects I have coming up is a multi-threaded server that manages many clients in performing a distributed computation using a number of computers. [...] (a) This really isn't what Haskell was designed for, and if I try to write this in Haskell I'll never want to touch it again. (b) This project is quite feasible in Haskell but when it's done I'll feel I should have just used Java or something. (c) Haskell's monads, concurrency stuff and TCP/IP libraries are really quite powerful and useful, and I'll be happy I picked Haskell for the task. There's also: (d) You end up learning all sorts of new things about distributed processing (as well as Haskell) and, armed with the new knowledge, future problems of the same nature will be easier to solve no matter what language you use. That's what usually happens to me. (Personally, if I had this project coming up, I'd use it as an excuse to finally learn Erlang...) --Joe English [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: A sample revised prelude for numeric classes
Dylan Thurston wrote: I've started writing up a more concrete proposal for what I'd like the Prelude to look like in terms of numeric classes. I like this proposal a lot. The organization is closer to traditional mathematical structures than the current Prelude, but not as intimidating as Mechveliani's Basic Algebra Proposal. A very nice balance, IMO. A couple of requests: module Lattice where class Lattice a where meet, join :: a - a - a Could this be split into class SemiLattice a where join :: a - a - a and class (SemiLattice a) = Lattice a where meet :: a - a - a I run across a lot of structures which could usefully be modeled as semilattices, but lack a 'meet' operation. It would be reasonable to make Ord a subclass of this, but it would probably complicate the class heirarchy too much for the gain. In a similar vein, I'd really like to see the Ord class split up: class PartialOrder a where (), () :: a - a - Bool class (Eq a, PartialOrder a) = Ord a where compare:: a - a - Ordering (=), (=) :: a - a - Bool max, min :: a - a - a Perhaps it would make sense for PartialOrder to be a superclass of Lattice? --Joe English [EMAIL PROTECTED] ___ Haskell-Cafe mailing list [EMAIL PROTECTED] http://www.haskell.org/mailman/listinfo/haskell-cafe