Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
> > Message: 7 > Date: Tue, 29 Mar 2011 22:39:12 -0400 > From: wren ng thornton > Subject: Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8 > To: haskell-cafe@haskell.org > Message-ID: <4d9297d0.7060...@freegeek.org> > Content-Type: text/plain; charset=ISO-8859-1; format=flowed > > On 3/29/11 4:40 AM, o...@okmij.org wrote: > > Wren Thornton wrote: > >> This is often conflated with the iteratee throwing an error/exception, > >> which is wrong because we should distinguish between bad program > >> states and argument passing. > > > > I guess this is a matter of different points of view on exceptions. > > The problem is not so much the exceptions per se (one goto is about as > good as any other), it has more to do with the fact that important > things are being left out of the types. > > One of the great things about Haskell is that you can lean so heavily on > the type system to protect yourself when refactoring, designing by > contract, etc. However, if there's an unspoken code of communication > between specific enumerators and iteratees, it's very easy to break > things. This is why the communication should be captured in the types, > regardless of the control-flow mechanism used to implement that > communication. I'd like the static guarantee that whatever special > requests my iteratee could make, its enumerator is in a position to > fulfill those requests (or die trying). Allowing for the iteratee to be > paired with an enumerator which is incapable of handling its requests is > a type error and should be treated as such. > This has long been a goal of mine for iteratee (since before the 0.4 release), although I haven't really done any work on it. Maybe it's time to see if I can get a more satisfactory implementation. > > Wren Thornton wrote: > >> In an ideal framework the producers, transformers, and consumers of > >> stream data would have a type parameter indicating the up-stream > >> communication they support or require (in addition to the type > >> parameters for stream type, result type, and side-effect type). > > > > Very true. Currently the design of Iteratees quite resembles that of > > Control.Exception: everything can throw SomeException. Ideally one > > would like to be more precise, and specify what exceptions or sorts of > > exceptions could be thrown -- by Iteratees, and by ordinary Haskell > > functions. The design of a good effect system is still the topic of > > active research, although there are some encouraging results. > > Yeah, I'm not a big fan of extensible exceptions either. Don't get me > wrong, it's an awesome hack and it's far cleaner than the Java approach; > but it still goes against my sensibilities. > I've come around to the view that exceptions are a bad idea (in Haskell), and just making everything explicit (via Maybe, ErrorT, explicit-exceptions, or otherwise) is the best approach at present. John L. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On 3/29/11 4:40 AM, o...@okmij.org wrote: Wren Thornton wrote: This is often conflated with the iteratee throwing an error/exception, which is wrong because we should distinguish between bad program states and argument passing. I guess this is a matter of different points of view on exceptions. The problem is not so much the exceptions per se (one goto is about as good as any other), it has more to do with the fact that important things are being left out of the types. One of the great things about Haskell is that you can lean so heavily on the type system to protect yourself when refactoring, designing by contract, etc. However, if there's an unspoken code of communication between specific enumerators and iteratees, it's very easy to break things. This is why the communication should be captured in the types, regardless of the control-flow mechanism used to implement that communication. I'd like the static guarantee that whatever special requests my iteratee could make, its enumerator is in a position to fulfill those requests (or die trying). Allowing for the iteratee to be paired with an enumerator which is incapable of handling its requests is a type error and should be treated as such. Wren Thornton wrote: In an ideal framework the producers, transformers, and consumers of stream data would have a type parameter indicating the up-stream communication they support or require (in addition to the type parameters for stream type, result type, and side-effect type). Very true. Currently the design of Iteratees quite resembles that of Control.Exception: everything can throw SomeException. Ideally one would like to be more precise, and specify what exceptions or sorts of exceptions could be thrown -- by Iteratees, and by ordinary Haskell functions. The design of a good effect system is still the topic of active research, although there are some encouraging results. Yeah, I'm not a big fan of extensible exceptions either. Don't get me wrong, it's an awesome hack and it's far cleaner than the Java approach; but it still goes against my sensibilities. I think a big part of the problem is that we don't have a good type theory for coroutines. The idea of functions that "never return" just doesn't cut it. And conflating legitimate control-flow manipulation with bottom doesn't either. But, as of yet, that's all we've got. -- Live well, ~wren ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Antoine Latter wrote: > It looks like we can't quite fit Enumeratee into the Category > typeclass (without newtypes, at least). That's a shame. Yeah. Intuitively it looks like iteratees and enumeratees are excellent candidates for Category and even Arrow. Unfortunately they can either be monad transformers or arrows. You can't mix without, as you said, a newtype. That's very unfortunate. Greets, Ertugrul -- nightmare = unsafePerformIO (getWrongWife >>= sex) http://ertes.de/ ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Hello, >> (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m >> b >> enum =$ iter = joinI (enum $$ iter) >> >> ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b -> >> Enumerator ai m b >> ($=) = joinE >> -- > > The operators sound good to me. My only request would be to put in a > usage example in the documentation. I'd be happy to write one if you'd > like. Personally, I think that =$ will *greatly* clean up my code. I have a tutorial to describe how to use the enumerator library in Japanese. Since it is popular among the Haskell community in Japan, I guess it's worth translating into English. So, I did. http://www.mew.org/~kazu/proj/enumerator/ This tutorial explains how to use (=$) and ($=) as well as other operators(($$), (<==<), (>>=)). Of course, my English is broken. If English native speakers will kindly correct broken grammar, it would be appreciated. I'm reachable by e-mail or twitter (@kazu_yamamoto). --Kazu ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Tue, Mar 29, 2011 at 6:15 PM, Ertugrul Soeylemez wrote: > Hello John, > > Sorry that I'm late. And honestly one day for request submissions is a > bit narrow. > > I have a request, too: Right now it is difficult to compose > enumeratees. An equivalent of (.) for enumeratees would be great. So > instead of: > > joinI $ e1 $$ joinI $ e2 $$ iter > > one could write > > let e = e1 .= e2 in e =$ iter > > I would appreciate a 0.4.10 with such a composition operator. > > > Greets, > Ertugrul > It looks like we can't quite fit Enumeratee into the Category typeclass (without newtypes, at least). That's a shame. Antoine ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Hello John, Sorry that I'm late. And honestly one day for request submissions is a bit narrow. I have a request, too: Right now it is difficult to compose enumeratees. An equivalent of (.) for enumeratees would be great. So instead of: joinI $ e1 $$ joinI $ e2 $$ iter one could write let e = e1 .= e2 in e =$ iter I would appreciate a 0.4.10 with such a composition operator. Greets, Ertugrul John Millikin wrote: > Since the release, a couple people have sent in feature requests, so I'm > going to put out 0.4.9 in a day or so. > > New features will be: > > - tryIO: runs an IO computation, and converts any exceptions into > ``throwError`` calls (requested by Kazu Yamamoto) > > - checkContinue: encapsulates a common pattern (loop (Continue k) = ...) > when defining enumerators > > - mapAccum and mapAccum: sort of like map and mapM, except the step function > is stateful (requested by Long Huynh Huu) > > Anyone else out there sitting on a request? Please send them in -- I am > always happy to receive them, even if they must be declined. > > --- > > Also, I would like to do a quick poll regarding operators. > > 1. It has been requested that I add operator aliases for joinI and joinE. > > 2. There have been complaints that the library defines too many operators > (currently, 5). > > Do any existing enumerator users, or anyone for that matter, have an opinion > either way? > > The proposed operators are: > > -- > infixr 0 =$ > infixr 0 $= > > (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m > b > enum =$ iter = joinI (enum $$ iter) > > ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b -> > Enumerator ai m b > ($=) = joinE > -- > > -- nightmare = unsafePerformIO (getWrongWife >>= sex) http://ertes.de/ ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Thanks, I look forward to being able to use these new operators! On Tue, Mar 29, 2011 at 8:22 PM, John Millikin wrote: > 0.4.9 has been uploaded to cabal, with the new operators. Changes are in the > replied-to post (and also quoted below), plus the new operators proposed by > Kazu Yamamoto. > > Here's the corresponding docs (they have examples!) > > -- > -- | @enum =$ iter = 'joinI' (enum $$ iter)@ > -- > -- “Wraps” an iteratee /inner/ in an enumeratee /wrapper/. > -- The resulting iteratee will consume /wrapper/’s input type and > -- yield /inner/’s output type. > -- > -- Note: if the inner iteratee yields leftover input when it finishes, > -- that extra will be discarded. > -- > -- As an example, consider an iteratee that converts a stream of > UTF8-encoded > -- bytes into a single 'TL.Text': > -- > -- > consumeUTF8 :: Monad m => Iteratee ByteString m Text > -- > -- It could be written with either 'joinI' or '(=$)': > -- > -- > import Data.Enumerator.Text as ET > -- > > -- > consumeUTF8 = joinI (decode utf8 $$ ET.consume) > -- > consumeUTF8 = decode utf8 =$ ET.consume > -- > -- Since: 0.4.9 > > -- | @enum $= enee = 'joinE' enum enee@ > -- > -- “Wraps” an enumerator /inner/ in an enumeratee /wrapper/. > -- The resulting enumerator will generate /wrapper/’s output type. > -- > -- As an example, consider an enumerator that yields line character counts > -- for a text file (e.g. for source code readability checking): > -- > -- > enumFileCounts :: FilePath -> Enumerator Int IO b > -- > -- It could be written with either 'joinE' or '($=)': > -- > -- > import Data.Text as T > -- > import Data.Enumerator.List as EL > -- > import Data.Enumerator.Text as ET > -- > > -- > enumFileCounts path = joinE (enumFile path) (EL.map T.length) > -- > enumFileCounts path = enumFile path $= EL.map T.length > -- > -- Since: 0.4.9 > -- > > Minor release note -- 0.4.9 and 0.4.9.1 are the exact same code; I just > forgot a @ in one of the new docs and had to re-upload so Hackage would > haddock properly. There is no difference in behavior. > > On Monday, March 28, 2011 10:50:45 PM UTC-7, John Millikin wrote: >> >> Since the release, a couple people have sent in feature requests, so I'm >> going to put out 0.4.9 in a day or so. >> >> New features will be: >> >> - tryIO: runs an IO computation, and converts any exceptions into >> ``throwError`` calls (requested by Kazu Yamamoto) >> >> - checkContinue: encapsulates a common pattern (loop (Continue k) = ...) >> when defining enumerators >> >> - mapAccum and mapAccum: sort of like map and mapM, except the step >> function is stateful (requested by Long Huynh Huu) >> >> Anyone else out there sitting on a request? Please send them in -- I am >> always happy to receive them, even if they must be declined. >> >> --- >> >> Also, I would like to do a quick poll regarding operators. >> >> 1. It has been requested that I add operator aliases for joinI and joinE. >> >> 2. There have been complaints that the library defines too many operators >> (currently, 5). >> >> Do any existing enumerator users, or anyone for that matter, have an >> opinion either way? >> >> The proposed operators are: >> >> -- >> infixr 0 =$ >> infixr 0 $= >> >> (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao >> m b >> enum =$ iter = joinI (enum $$ iter) >> >> ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b >> -> Enumerator ai m b >> ($=) = joinE >> -- >> > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
0.4.9 has been uploaded to cabal, with the new operators. Changes are in the replied-to post (and also quoted below), plus the new operators proposed by Kazu Yamamoto. Here's the corresponding docs (they have examples!) -- -- | @enum =$ iter = 'joinI' (enum $$ iter)@ -- -- “Wraps” an iteratee /inner/ in an enumeratee /wrapper/. -- The resulting iteratee will consume /wrapper/’s input type and -- yield /inner/’s output type. -- -- Note: if the inner iteratee yields leftover input when it finishes, -- that extra will be discarded. -- -- As an example, consider an iteratee that converts a stream of UTF8-encoded -- bytes into a single 'TL.Text': -- -- > consumeUTF8 :: Monad m => Iteratee ByteString m Text -- -- It could be written with either 'joinI' or '(=$)': -- -- > import Data.Enumerator.Text as ET -- > -- > consumeUTF8 = joinI (decode utf8 $$ ET.consume) -- > consumeUTF8 = decode utf8 =$ ET.consume -- -- Since: 0.4.9 -- | @enum $= enee = 'joinE' enum enee@ -- -- “Wraps” an enumerator /inner/ in an enumeratee /wrapper/. -- The resulting enumerator will generate /wrapper/’s output type. -- -- As an example, consider an enumerator that yields line character counts -- for a text file (e.g. for source code readability checking): -- -- > enumFileCounts :: FilePath -> Enumerator Int IO b -- -- It could be written with either 'joinE' or '($=)': -- -- > import Data.Text as T -- > import Data.Enumerator.List as EL -- > import Data.Enumerator.Text as ET -- > -- > enumFileCounts path = joinE (enumFile path) (EL.map T.length) -- > enumFileCounts path = enumFile path $= EL.map T.length -- -- Since: 0.4.9 -- Minor release note -- 0.4.9 and 0.4.9.1 are the exact same code; I just forgot a @ in one of the new docs and had to re-upload so Hackage would haddock properly. There is no difference in behavior. On Monday, March 28, 2011 10:50:45 PM UTC-7, John Millikin wrote: > > Since the release, a couple people have sent in feature requests, so I'm > going to put out 0.4.9 in a day or so. > > New features will be: > > - tryIO: runs an IO computation, and converts any exceptions into > ``throwError`` calls (requested by Kazu Yamamoto) > > - checkContinue: encapsulates a common pattern (loop (Continue k) = ...) > when defining enumerators > > - mapAccum and mapAccum: sort of like map and mapM, except the step > function is stateful (requested by Long Huynh Huu) > > Anyone else out there sitting on a request? Please send them in -- I am > always happy to receive them, even if they must be declined. > > --- > > Also, I would like to do a quick poll regarding operators. > > 1. It has been requested that I add operator aliases for joinI and joinE. > > 2. There have been complaints that the library defines too many operators > (currently, 5). > > Do any existing enumerator users, or anyone for that matter, have an > opinion either way? > > The proposed operators are: > > -- > infixr 0 =$ > infixr 0 $= > > (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m > b > enum =$ iter = joinI (enum $$ iter) > > ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b -> > Enumerator ai m b > ($=) = joinE > -- > >___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Mar 29, 2011 10:42 AM, wrote: > > I'm keen to hear of the example that seem to require Iteratee's > allocating additional resources. I'd really like to see if any of such > cases can be cast it terms of regions, implemented via iterated > Iteratee transformers. Hello Oleg, The first example which comes to mind - because I recently implemented this - is HTTP uploads: the input stream can hold N files (the value of N not known ahead of time), and each upload can go to a separate temporary file. If the request stream fails early (e.g. client browser closes connection), any already-created temp files and any open file handles must be cleaned up. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Wren Thornton wrote: > This is often conflated with the iteratee throwing an error/exception, > which is wrong because we should distinguish between bad program > states and argument passing. I guess this is a matter of different points of view on exceptions. I am a fan of the model of (effectful) computation proposed by Cartwright and Felleisen a while ago: http://okmij.org/ftp/Computation/monads.html#ExtensibleDS In their model, all the computation is done by throwing resumable exceptions -- including the pure computation such as arithmetic and CBV/CBN applications. The similarity of ExtensibleDS.hs with Iteratee should be quite noticeable, especially regarding the part about throwing `errors.' The relation with co-routines is hard to miss: in fact, Iteratees are built upon co-routines, which are being resumed by the enumerator. The error message is the additional piece of data that is being associated with `yield', telling the enumerator to do something extra rather than mere getting the next piece of data and resuming the co-routine. The co-routine is the simplest part of the iteratee; it is the plumbing that takes a long time to engineer. John A. De Goes: > 2. Error recovery is ill-defined because errors do not describe > what portion of the input they have already consumed; I'm confused about this complaint: if an iteratee encounters an unusual condition or just has a special request, it sends a message that eventually propagates to the responsible enumerator. That enumerator knows how much data it has sent down to iteratees. The RandomIO module http://okmij.org/ftp/Haskell/Iteratee/RandomIO.hs is a good illustration: when the enumerator receives the seek request, it checks if the desired stream offset corresponds to the data already in the current IO buffer. If so, no IO is performed and the iteratee is resumed with the existing buffer data. The tests in the file check for that. Iteratee knows nothing about the buffer or if there is a buffer. Wren Thornton wrote: > In an ideal framework the producers, transformers, and consumers of > stream data would have a type parameter indicating the up-stream > communication they support or require (in addition to the type > parameters for stream type, result type, and side-effect type). Very true. Currently the design of Iteratees quite resembles that of Control.Exception: everything can throw SomeException. Ideally one would like to be more precise, and specify what exceptions or sorts of exceptions could be thrown -- by Iteratees, and by ordinary Haskell functions. The design of a good effect system is still the topic of active research, although there are some encouraging results. John A. De Goes: > 3. Iteratees sometimes need to manage resources, but they're not > designed to do so which leads to hideous workarounds; Gregory Collins: > The thing which I find is missing the most from enumerator as it > stands is not this -- it's the fact that Iteratees sometimes need to > allocate resources which need explicit manual deallocation (i.e. > sockets, file descriptors, mmaps, etc), but because Enumerators are > running the show, there is no "local" way to ensure that the > cleanup/bracket routines get run on error. I used to think that processing several inputs at different paces was indeed a stumbling block. It seemed that an iteratee needed to open a separate file, which it is indeed ill-equipped to do. Fortunately, that difficulty has been overcome, surprisingly in a natural way with no changes to the library: http://okmij.org/ftp/Streams.html#2enum1iter The pleasant surprise is that we can iterate (no pun intended) Iteratee monad transformers, just as we did with (IORT s) monad transformer in the Lightweight Monadic Regions. Thus we maintain the region-like discipline of managing resources. I'm keen to hear of the example that seem to require Iteratee's allocating additional resources. I'd really like to see if any of such cases can be cast it terms of regions, implemented via iterated Iteratee transformers. John A. De Goes: > 1. It does not make sense in general to bind with an iteratee that > has already consumed input, but there's no type-level difference > between a "virgin" iteratee and one that has already consumed input; I'm not sure I follow. Why should it make a difference between a virgin iteratee and the one that consumed some input. One should think of the Iteratee as two arguments of fold (f and z) bundled together. Why the function being folded over should care how many times it has been applied to input data? It is a pure function, transforming state plus input to a new state. The useful laws of fold hold precisely because the function f is pure and doesn't care. A detailed example showing why you think you need this distinction would be appreciated. > 4. Iteratees cannot incrementally produce output, it's all or > nothing, which makes them terrible for many real world problems >
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Tue, Mar 29, 2011 at 7:50 AM, John Millikin wrote: > Since the release, a couple people have sent in feature requests, so I'm > going to put out 0.4.9 in a day or so. > > New features will be: > > - tryIO: runs an IO computation, and converts any exceptions into > ``throwError`` calls (requested by Kazu Yamamoto) > > - checkContinue: encapsulates a common pattern (loop (Continue k) = ...) > when defining enumerators > > - mapAccum and mapAccum: sort of like map and mapM, except the step function > is stateful (requested by Long Huynh Huu) > > Anyone else out there sitting on a request? Please send them in -- I am > always happy to receive them, even if they must be declined. > > --- > > Also, I would like to do a quick poll regarding operators. > > 1. It has been requested that I add operator aliases for joinI and joinE. > > 2. There have been complaints that the library defines too many operators > (currently, 5). > > Do any existing enumerator users, or anyone for that matter, have an opinion > either way? > > The proposed operators are: > > -- > infixr 0 =$ > infixr 0 $= > > (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m > b > enum =$ iter = joinI (enum $$ iter) > > ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b -> > Enumerator ai m b > ($=) = joinE > -- The operators sound good to me. My only request would be to put in a usage example in the documentation. I'd be happy to write one if you'd like. Personally, I think that =$ will *greatly* clean up my code. Michael ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Since the release, a couple people have sent in feature requests, so I'm going to put out 0.4.9 in a day or so. New features will be: - tryIO: runs an IO computation, and converts any exceptions into ``throwError`` calls (requested by Kazu Yamamoto) - checkContinue: encapsulates a common pattern (loop (Continue k) = ...) when defining enumerators - mapAccum and mapAccum: sort of like map and mapM, except the step function is stateful (requested by Long Huynh Huu) Anyone else out there sitting on a request? Please send them in -- I am always happy to receive them, even if they must be declined. --- Also, I would like to do a quick poll regarding operators. 1. It has been requested that I add operator aliases for joinI and joinE. 2. There have been complaints that the library defines too many operators (currently, 5). Do any existing enumerator users, or anyone for that matter, have an opinion either way? The proposed operators are: -- infixr 0 =$ infixr 0 $= (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m b enum =$ iter = joinI (enum $$ iter) ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b -> Enumerator ai m b ($=) = joinE -- ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
This isn't quite what I'm after. I want to pull chunks on demand (i.e. have control over both the input and the output). Enumeratees don't allow me to do that. Regards, John A. De Goes Twitter: @jdegoes LinkedIn: http://linkedin.com/in/jdegoes On Mar 27, 2011, at 7:58 PM, John Millikin wrote: > If the receiver can only accept very small chunks, you can put a rechunking > stage in between the compression and iteratee: > > --- > verySmallChunks :: Monad m => Enumeratee ByteString ByteString m b > verySmallSchunks = sequence (take 10) > --- > > Resending is slightly more complex -- if the other end can say "resend that > last chunk", then it should be easy enough, but "resend the last 2 bytes of > that chunk you sent 5 minutes ago" would be much harder. What is your use > case? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Now THAT"s what I'm talking about. Augment such a solution with interruptible & resumable data producers, and I'd have everything I need. Regards, John A. De Goes Twitter: @jdegoes LinkedIn: http://linkedin.com/in/jdegoes On Mar 27, 2011, at 10:54 PM, wren ng thornton wrote: > In an ideal framework the producers, transformers, and consumers of stream > data would have a type parameter indicating the up-stream communication they > support or require (in addition to the type parameters for stream type, > result type, and side-effect type). That way clients can just define an ADT > for their communication protocol, and be done with it. There may still be > issues with the Expression Problem, but at least those are pushed out of the > stream processing framework itself which really shouldn't care about the > types of communication used. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Mar 28, 2011, at 12:54 AM, wren ng thornton wrote: On 3/27/11 9:58 PM, John Millikin wrote: Resending is slightly more complex -- if the other end can say "resend that last chunk", then it should be easy enough, but "resend the last 2 bytes of that chunk you sent 5 minutes ago" would be much harder. What is your use case? This does highlight one of the restrictions I've lamented about the iteratee framework. Namely that the current versions I've seen place unnecessary limitations on the communication the iteratee is allowed to give to the enumerator/enumeratees above it. This is often conflated with the iteratee throwing an error/exception, which is wrong because we should distinguish between bad program states and argument passing. Moreover the type system doesn't capture the kinds of communication iteratees assume of their enumerators/enumeratees, nor the kinds of communication supported by the enumerators/ enumeratees, which means that failure to hook them up in the right (non-typechecked) way /does/ constitute an error. The one example that tends to be supported is the iteratee requesting that the enumerator/enumeratees seek to a given position in a file. Which is a good example, but it's not the only one. Requesting the resending of chunks is another good example. But there's no limit to the reasonable kinds of communication an iteratee could want. In an ideal framework the producers, transformers, and consumers of stream data would have a type parameter indicating the up-stream communication they support or require (in addition to the type parameters for stream type, result type, and side-effect type). That way clients can just define an ADT for their communication protocol, and be done with it. There may still be issues with the Expression Problem, but at least those are pushed out of the stream processing framework itself which really shouldn't care about the types of communication used. It's somewhat outdated and underdeveloped (I was writing for myself so I never really bothered finishing it), but I wrote an exploration of iteratee semantics[1] a while back in which I specified an iteratee as a monad-transformer stack involving, at its core, the "PromptT" or "ProgramT" monad transformers (as far as I know, the same could be done with the "Coroutine" monad). I personally found that construction far more lucid than the usual ad-hoc view, and it also makes it very clear how the model can be trivially extended to support additional operations such as these. Based on what I learned while writing that (and on the similarity between coroutines and the concepts I used), I strongly agree with Mario Blažević, suggestion to look at his monad-coroutine library as a way of understanding where they fit in some larger design space. I would even go so far as to suggest that something like it could be considered as either a replacement for iteratees or as the underlying implementation of an iteratee library, because the concept not only subsumes iteratees and enumerators, but also delegates control to code that can be independent of both rather than simply reversing the "conventional" iterator concept. I believe it also subsumes iterators and whatever their corresponding parts are called. As he mentions, his implementation does not come with all the "plumbing", but I think it would be worthwhile to create that plumbing, because either coroutines or "operational monads" may very well be the basis needed to develop a "grand unified theory" of composable stream processing. If nothing else, the isomorphisms in his coroutine-enumerator[2] and coroutine-iteratee[3] packages seem to give a much more direct and useful iteratee semantics than I've seen given anywhere else, and at the same time they are much more readily extended to cover additional operations. -- James 1. https://github.com/mokus0/junkbox/tree/master/Papers/HighLevelIteratees 2. http://hackage.haskell.org/package/coroutine-enumerator 3. http://hackage.haskell.org/package/coroutine-iteratee ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Mon, Mar 28, 2011 at 8:06 AM, John Millikin wrote: > On Sunday, March 27, 2011 9:45:23 PM UTC-7, Ertugrul Soeylemez wrote: >> >> > For setting a global timeout on an entire session, it's better to wrap >> > the ``run_`` call with ``System.Timeout.timeout`` -- this is more >> > efficient than testing the time on every chunk, and does not require a >> > specialised enumerator. >> It may be more efficient, but I don't really like it. I like robust >> >> applications, and to me killing a thread is always a mistake, even if >> the thread is kill-safe. >> > ``timeout`` doesn't kill the thread, it just returns ``Nothing`` if the > computation took longer than expected. > Timeout does kill the thread that is used for timing out :-). The thread that measures the timeout throws an exception to the worker thread that's being monitored. Either way you're interrupting a thread. Kill it or toss an exception at it, I don't see the difference really. Dave > > ___ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe > > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Sunday, March 27, 2011 9:45:23 PM UTC-7, Ertugrul Soeylemez wrote: > > > For setting a global timeout on an entire session, it's better to wrap > > the ``run_`` call with ``System.Timeout.timeout`` -- this is more > > efficient than testing the time on every chunk, and does not require a > > specialised enumerator. > It may be more efficient, but I don't really like it. I like robust > > applications, and to me killing a thread is always a mistake, even if > the thread is kill-safe. > ``timeout`` doesn't kill the thread, it just returns ``Nothing`` if the computation took longer than expected. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On 3/27/11 9:58 PM, John Millikin wrote: Resending is slightly more complex -- if the other end can say "resend that last chunk", then it should be easy enough, but "resend the last 2 bytes of that chunk you sent 5 minutes ago" would be much harder. What is your use case? This does highlight one of the restrictions I've lamented about the iteratee framework. Namely that the current versions I've seen place unnecessary limitations on the communication the iteratee is allowed to give to the enumerator/enumeratees above it. This is often conflated with the iteratee throwing an error/exception, which is wrong because we should distinguish between bad program states and argument passing. Moreover the type system doesn't capture the kinds of communication iteratees assume of their enumerators/enumeratees, nor the kinds of communication supported by the enumerators/enumeratees, which means that failure to hook them up in the right (non-typechecked) way /does/ constitute an error. The one example that tends to be supported is the iteratee requesting that the enumerator/enumeratees seek to a given position in a file. Which is a good example, but it's not the only one. Requesting the resending of chunks is another good example. But there's no limit to the reasonable kinds of communication an iteratee could want. In an ideal framework the producers, transformers, and consumers of stream data would have a type parameter indicating the up-stream communication they support or require (in addition to the type parameters for stream type, result type, and side-effect type). That way clients can just define an ADT for their communication protocol, and be done with it. There may still be issues with the Expression Problem, but at least those are pushed out of the stream processing framework itself which really shouldn't care about the types of communication used. -- Live well, ~wren ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
John Millikin wrote: > Good idea -- I've added an ``enumSocketTimed`` and ``iterSocketTimed`` > to the network-enumerator package at < > http://hackage.haskell.org/package/network-enumerator > >. ``enumSocketTimed`` is equivalent to your ``enumHandleTimeout``, > but instead of Handle uses the more efficient Socket type. For simple applications working with handles is much more convenient, so I decided to implement a timed handle enumerator instead of a socket enumerator. Perhaps it would be a good idea to add your 'enumSocketTimed' and 'iterSocketTimed' to my netlines package, too. Also I should add 'iterHandleTimeout'. > For setting a global timeout on an entire session, it's better to wrap > the ``run_`` call with ``System.Timeout.timeout`` -- this is more > efficient than testing the time on every chunk, and does not require a > specialised enumerator. It may be more efficient, but I don't really like it. I like robust applications, and to me killing a thread is always a mistake, even if the thread is kill-safe. Greets, Ertugrul -- nightmare = unsafePerformIO (getWrongWife >>= sex) http://ertes.de/ ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
If the receiver can only accept very small chunks, you can put a rechunking stage in between the compression and iteratee: --- verySmallChunks :: Monad m => Enumeratee ByteString ByteString m b verySmallSchunks = sequence (take 10) --- Resending is slightly more complex -- if the other end can say "resend that last chunk", then it should be easy enough, but "resend the last 2 bytes of that chunk you sent 5 minutes ago" would be much harder. What is your use case? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
This formulation does not let me control the production of compressed chunks independently from the provision of input; a receiver may only be capable of consuming a tiny amount at a time, and I may have to resend some chunks. Which is the whole point: iteratee & friends are lopsided. They provide excellent control of an input stream to the iteratee, but there is no structure permitting equivalent control of the output stream. Regards, John A. De Goes Twitter: @jdegoes LinkedIn: http://linkedin.com/in/jdegoes On Mar 27, 2011, at 3:22 PM, wren ng thornton wrote: > On 3/27/11 11:38 AM, John A. De Goes wrote: >> >> Enumeratees solve some use cases but not others. Let's say you want to >> incrementally compress a 2 GB file. If you use an enumeratee to do this, >> your "transformer" iteratee has to do IO. I'd prefer an abstraction to >> incrementally and purely produce the output from a stream of input. > > I don't see why? In pseudocode we could have, > >enumRead2GBFile :: FilePath -> Enumerator IO ByteString >enumRead2GBFile file iter0 = do >fd <- open file >let loop iter = do >mline <- read fd >case mline of >Nothing -> return iter >Just line -> do >iter' <- feed iter line >if isDone iter' >then return iter' >else loop iter' >iterF <- loop iter0 >close fd >return iterF > >compress :: Monad m => Enumeratee m ByteString ByteString >compress = go state0 >where >go state = do >chunk <- get >let (state',hash) = compressify state chunk >put hash >go state' > >compressify :: Foo -> ByteString -> (Foo,ByteString) > > it's just a pipeline like function composition or shell pipes. There's no > reason intermediate points of the pipeline have do anything impure. > > -- > Live well, > ~wren > > ___ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On 3/27/11 11:38 AM, John A. De Goes wrote: Enumeratees solve some use cases but not others. Let's say you want to incrementally compress a 2 GB file. If you use an enumeratee to do this, your "transformer" iteratee has to do IO. I'd prefer an abstraction to incrementally and purely produce the output from a stream of input. I don't see why? In pseudocode we could have, enumRead2GBFile :: FilePath -> Enumerator IO ByteString enumRead2GBFile file iter0 = do fd <- open file let loop iter = do mline <- read fd case mline of Nothing -> return iter Just line -> do iter' <- feed iter line if isDone iter' then return iter' else loop iter' iterF <- loop iter0 close fd return iterF compress :: Monad m => Enumeratee m ByteString ByteString compress = go state0 where go state = do chunk <- get let (state',hash) = compressify state chunk put hash go state' compressify :: Foo -> ByteString -> (Foo,ByteString) it's just a pipeline like function composition or shell pipes. There's no reason intermediate points of the pipeline have do anything impure. -- Live well, ~wren ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Sunday, March 27, 2011 8:38:38 AM UTC-7, John A. De Goes wrote: > > > Enumeratees solve some use cases but not others. Let's say you want to > incrementally compress a 2 GB file. If you use an enumeratee to do this, > your "transformer" iteratee has to do IO. I'd prefer an abstraction to > incrementally and purely produce the output from a stream of input. There's no reason the transformer has to do IO. Right now a lot of the interesting enumerator-based packages are actually bindings to C libraries, so they are forced to use IO, but there's nothing inherent in the enumeratee design to require it. For example, the text codec enumeratees "encode" and "decode" in Data.Enumerator.Text are pure. I'm working on ideas for writing pure enumeratees to bound libraries, but they will likely only work if the underlying library fully exposes its state, like zlib. Libraries with private or very complex internal states, such as libxml or expat, will probably never be implementable in pure enumeratees. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Mar 26, 2011, at 4:24 PM, Mario Blažević wrote: > On 11-03-26 04:33 PM, John A. De Goes wrote: > Out of curiosity, have you looked at the monad-coroutine library? It's a > more generic and IMO much cleaner model, though I wouldn't recommend it as a > replacement because the enumerator and iteratee libraries come with more > predefined plumbing. I think your point #1 still stands, but others can all > be made to disappear - as long as you define your suspension functors > properly. I haven't looked at it. I will take a look. > Do you mean a sort of a transducer monad transformer or an actual finite > state machine? The latter would seem rather restrictive. Yes, I mean transducer monad transformer, especially if you equate "mealy machine" with "finite state machine". I equate mealy machine with "two-taped transducer". Regards, John A. De Goes Twitter: @jdegoes LinkedIn: http://linkedin.com/in/jdegoes ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Enumeratees solve some use cases but not others. Let's say you want to incrementally compress a 2 GB file. If you use an enumeratee to do this, your "transformer" iteratee has to do IO. I'd prefer an abstraction to incrementally and purely produce the output from a stream of input. Regards, John A. De Goes Twitter: @jdegoes LinkedIn: http://linkedin.com/in/jdegoes On Mar 26, 2011, at 3:12 PM, wren ng thornton wrote: > On 3/26/11 4:33 PM, John A. De Goes wrote: >> 4. Iteratees cannot incrementally produce output, it's all or nothing, which >> makes them terrible for many real world problems that require both >> incremental input and incremental output. > > For this one, enumeratees are the proposed solution. But for some reason > enumeratees are oft overlooked. > > -- > Live well, > ~wren > > ___ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Hello Ertugrul Söylemez, Good idea -- I've added an ``enumSocketTimed`` and ``iterSocketTimed`` to the network-enumerator package at < http://hackage.haskell.org/package/network-enumerator >. ``enumSocketTimed`` is equivalent to your ``enumHandleTimeout``, but instead of Handle uses the more efficient Socket type. For setting a global timeout on an entire session, it's better to wrap the ``run_`` call with ``System.Timeout.timeout`` -- this is more efficient than testing the time on every chunk, and does not require a specialised enumerator. The signatures/docs are: -- | Enumerate binary data from a 'Socket', using 'recv'. The socket must -- be connected. -- -- The buffer size should be a small power of 2, such as 4096. -- -- If any call to 'recv' takes longer than the timeout, 'enumSocketTimed' -- will throw an error. To add a timeout for the entire session, wrap the -- call to 'E.run' in 'timeout'. -- -- Since: 0.1.2 enumSocketTimed :: MonadIO m => Integer -- ^ Buffer size -> Integer -- ^ Timeout, in microseconds -> S.Socket -> E.Enumerator B.ByteString m b -- | Write data to a 'S.Socket', using 'sendMany'. The socket must be connected. -- -- If any call to 'sendMany' takes longer than the timeout, 'iterSocketTimed' -- will throw an error. To add a timeout for the entire session, wrap the -- call to 'E.run' in 'timeout'. -- -- Since: 0.1.2 iterSocketTimed :: MonadIO m => Integer -- ^ Timeout, in microseconds -> S.Socket -> E.Iteratee B.ByteString m () ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Hello John, great stuff! Many thanks. I can't await the 0.5 release of your library. By the way, I believe that the 'enumHandleSession' and 'enumHandleTimeout' enumerators from the 'netlines' library really belong into this one. You will want to have timeout support, when reading from a network handle. Keep up the good work! Greets, Ertugrul John Millikin wrote: > - > Enumerators are an efficient, predictable, and safe alternative to > lazy I/O. Discovered by Oleg Kiselyov, they allow large datasets to be > processed in near–constant space by pure code. Although somewhat more > complex to write, using enumerators instead of lazy I/O produces more > correct programs. > > http://hackage.haskell.org/package/enumerator > http://john-millikin.com/software/enumerator/ > - > > Hello -cafe, > > It's been a while since the last point release of enumerator. This one > is sufficiently large that I think folks might want to know about it, > and since I try not to spam too many announcements, I'll give a quick > rundown on major changes in other 0.4.x versions as well. > > First, most of what I call "list analogues" -- enumerator-based > versions of 'head', 'take', 'map', etc -- have been separated into > three modules (Data.Enumerator.List, .Binary, and .Text) depending on > what sorts of data they operate on. This separation has been an > ongoing process throughout 0.4.x releases, and I think it's now > complete. The old names in Data.Enumerator will continue to exist in > 0.4.x versions, but will be removed in 0.5. > > Second, Gregory Collins and Ertugrul Soeylemez found a space leak in > Iteratee's (>>=), which could cause eventual space exhaustion in some > circumstances. If you use enumerators to process very large or > infinite streams, you probably want to upgrade to version 0.4.7 or > higher. > > Third, the source code PDF has seen some substantial improvement -- if > you're interested in how the library is implemented, or have insomnia, > read it at < http://john-millikin.com/software/enumerator/enumerator_0.4.8.pdf > > > > Finally, there is a known issue in the current encoding of iteratees > -- if an iteratee yields extra data but never consumed anything, that > iteratee will violate the monad law of associativity. Oleg has updated > his implementations to fix this problem, but since it would break a > *lot* of dependent libraries, I'm holding off until the vague future > of version 0.5. Since iteratees that yield extra data they didn't > consume are invalid anyway, I hope this problem will not cause too > much inconvenience. > > New features > - > > * Range-limited binary file enumeration (requested + initial patch by > Bardur Arantsson). > > * splitWhen , based on the "split" package < > http://hackage.haskell.org/package/split > > > * 0.4.6: Typeable instances for most types (requested by Michael Snoyman) > > * 0.4.5: joinE , which simplifies enumerator/enumeratee composition > (requested by Michael Snoyman) > > ___ > Libraries mailing list > librar...@haskell.org > http://www.haskell.org/mailman/listinfo/libraries -- nightmare = unsafePerformIO (getWrongWife >>= sex) http://ertes.de/ ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On 11-03-26 04:33 PM, John A. De Goes wrote: I noticed this problem some time ago. Beyond just breaking monadic associativity, there are many other issues with standard definitions of iteratees: 1. It does not make sense in general to bind with an iteratee that has already consumed input, but there's no type-level difference between a "virgin" iteratee and one that has already consumed input; 2. Error recovery is ill-defined because errors do not describe what portion of the input they have already consumed; 3. Iteratees sometimes need to manage resources, but they're not designed to do so which leads to hideous workarounds; 4. Iteratees cannot incrementally produce output, it's all or nothing, which makes them terrible for many real world problems that require both incremental input and incremental output. Overall, I regard iteratees as only a partial success. They're leaky and somewhat unsafe abstractions. Out of curiosity, have you looked at the monad-coroutine library? It's a more generic and IMO much cleaner model, though I wouldn't recommend it as a replacement because the enumerator and iteratee libraries come with more predefined plumbing. I think your point #1 still stands, but others can all be made to disappear - as long as you define your suspension functors properly. I'm experimenting with Mealy machines because I think they have more long-term promise to solve the problems of iteratees. Do you mean a sort of a transducer monad transformer or an actual finite state machine? The latter would seem rather restrictive. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On 2011-03-26, Gregory Collins wrote: >> Since the API is being broken anyway, I'm also going to take the >> opportunity to change the Stream type so it can represent "EOF + some >> data". That should allow lots of interesting behaviors, such as >> arbitrary lookahead. > > The thing which I find is missing the most from enumerator as it > stands is not this -- it's the fact that Iteratees sometimes need to > allocate resources which need explicit manual deallocation (i.e. > sockets, file descriptors, mmaps, etc), but because Enumerators are > running the show, there is no "local" way to ensure that the > cleanup/bracket routines get run on error. This hurts composability, > because you are forced to either allocate these resources outside the > body of the enumerator (where you can bracket "run_") or play > finalizer-on-mvar tricks with the garbage collector. This kind of > sucks. I agree that it sucks, but it's a tradeoff of the left-fold enumerator design. Potential solutions are welcome. > The iteratee package has an error constructor on the Stream type for > this purpose; I think you could do that -- with the downside that you > need to pattern-match against another constructor in mainline code, > hurting performance -- or is there some other reasonable way to deal > with it? I don't think this would help. Remember that the iteratee has *no* control whatsoever over its lifetime. There is no guarantee that a higher-level enumerator or enumeratee will actually feed it data until it has enough; the computation can be interrupted at any level. Looking at the iteratee package's Stream constructor, I think it doesn't do what you think it does. While it might help with resource management in a specific case, it won't help if (for example) an enumeratee above your iteratee decides to yield. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On 3/26/11 4:33 PM, John A. De Goes wrote: 4. Iteratees cannot incrementally produce output, it's all or nothing, which makes them terrible for many real world problems that require both incremental input and incremental output. For this one, enumeratees are the proposed solution. But for some reason enumeratees are oft overlooked. -- Live well, ~wren ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
I noticed this problem some time ago. Beyond just breaking monadic associativity, there are many other issues with standard definitions of iteratees: 1. It does not make sense in general to bind with an iteratee that has already consumed input, but there's no type-level difference between a "virgin" iteratee and one that has already consumed input; 2. Error recovery is ill-defined because errors do not describe what portion of the input they have already consumed; 3. Iteratees sometimes need to manage resources, but they're not designed to do so which leads to hideous workarounds; 4. Iteratees cannot incrementally produce output, it's all or nothing, which makes them terrible for many real world problems that require both incremental input and incremental output. Overall, I regard iteratees as only a partial success. They're leaky and somewhat unsafe abstractions. I'm experimenting with Mealy machines because I think they have more long-term promise to solve the problems of iteratees. Regards, John A. De Goes Twitter: @jdegoes LinkedIn: http://linkedin.com/in/jdegoes On Mar 26, 2011, at 1:03 PM, John Millikin wrote: > On Mar 26, 10:46 am, Michael Snoyman wrote: >> As far as the left-over data in a yield issue: does that require a >> breaking API change, or a change to the definition of >>= which would >> change semantics?? > > It requires a pretty serious API change, as the definition of > 'Iteratee' itself is at fault. Unfortunately, Oleg's new definitions > also have problems (they can yield extra on a continue step), so I'm > at a bit of a loss as to what to do. Either way, underlying primitives > allow users to create iteratees with invalid/undefined behavior. Not > very Haskell-y. > > All of the new high-level functions added in recent versions are part > of an attempted workaround. I'd like to move the Iteratee definitions > themselves to a ``Data.Enumerator.Internal`` module, and add some > words discouraging their direct use. There would still be some API > breaks (the >>== , $$, and >==> operators would go away) but at least > clients wouldn't be subjected to a complete rewrite. > > Since the API is being broken anyway, I'm also going to take the > opportunity to change the Stream type so it can represent "EOF + some > data". That should allow lots of interesting behaviors, such as > arbitrary lookahead. > > ___ > Haskell-Cafe mailing list > Haskell-Cafe@haskell.org > http://www.haskell.org/mailman/listinfo/haskell-cafe > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Sat, Mar 26, 2011 at 8:03 PM, John Millikin wrote: > On Mar 26, 10:46 am, Michael Snoyman wrote: >> As far as the left-over data in a yield issue: does that require a >> breaking API change, or a change to the definition of >>= which would >> change semantics?? > > It requires a pretty serious API change, as the definition of > 'Iteratee' itself is at fault. Unfortunately, Oleg's new definitions > also have problems (they can yield extra on a continue step), so I'm > at a bit of a loss as to what to do. Either way, underlying primitives > allow users to create iteratees with invalid/undefined behavior. Not > very Haskell-y. You can also write an iteratee which doesn't move to the "Done" state when it gets EOF. This is equally "bad", i.e. not actually much of a problem in practice. > Since the API is being broken anyway, I'm also going to take the > opportunity to change the Stream type so it can represent "EOF + some > data". That should allow lots of interesting behaviors, such as > arbitrary lookahead. The thing which I find is missing the most from enumerator as it stands is not this -- it's the fact that Iteratees sometimes need to allocate resources which need explicit manual deallocation (i.e. sockets, file descriptors, mmaps, etc), but because Enumerators are running the show, there is no "local" way to ensure that the cleanup/bracket routines get run on error. This hurts composability, because you are forced to either allocate these resources outside the body of the enumerator (where you can bracket "run_") or play finalizer-on-mvar tricks with the garbage collector. This kind of sucks. The iteratee package has an error constructor on the Stream type for this purpose; I think you could do that -- with the downside that you need to pattern-match against another constructor in mainline code, hurting performance -- or is there some other reasonable way to deal with it? G -- Gregory Collins ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
On Mar 26, 10:46 am, Michael Snoyman wrote: > As far as the left-over data in a yield issue: does that require a > breaking API change, or a change to the definition of >>= which would > change semantics?? It requires a pretty serious API change, as the definition of 'Iteratee' itself is at fault. Unfortunately, Oleg's new definitions also have problems (they can yield extra on a continue step), so I'm at a bit of a loss as to what to do. Either way, underlying primitives allow users to create iteratees with invalid/undefined behavior. Not very Haskell-y. All of the new high-level functions added in recent versions are part of an attempted workaround. I'd like to move the Iteratee definitions themselves to a ``Data.Enumerator.Internal`` module, and add some words discouraging their direct use. There would still be some API breaks (the >>== , $$, and >==> operators would go away) but at least clients wouldn't be subjected to a complete rewrite. Since the API is being broken anyway, I'm also going to take the opportunity to change the Stream type so it can represent "EOF + some data". That should allow lots of interesting behaviors, such as arbitrary lookahead. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
Great work as usual John. I'm actually very happy to see enumHandleRange: the next version of WAI will support partial files, and I just implemented my own version of enumHandleRange over there. I will gladly switch to your (most likely more correct) version. As far as the left-over data in a yield issue: does that require a breaking API change, or a change to the definition of >>= which would change semantics?? Michael On Sat, Mar 26, 2011 at 7:39 AM, John Millikin wrote: > - > Enumerators are an efficient, predictable, and safe alternative to > lazy I/O. Discovered by Oleg Kiselyov, they allow large datasets to be > processed in near–constant space by pure code. Although somewhat more > complex to write, using enumerators instead of lazy I/O produces more > correct programs. > > http://hackage.haskell.org/package/enumerator > http://john-millikin.com/software/enumerator/ > - > > Hello -cafe, > > It's been a while since the last point release of enumerator. This one > is sufficiently large that I think folks might want to know about it, > and since I try not to spam too many announcements, I'll give a quick > rundown on major changes in other 0.4.x versions as well. > > First, most of what I call "list analogues" -- enumerator-based > versions of 'head', 'take', 'map', etc -- have been separated into > three modules (Data.Enumerator.List, .Binary, and .Text) depending on > what sorts of data they operate on. This separation has been an > ongoing process throughout 0.4.x releases, and I think it's now > complete. The old names in Data.Enumerator will continue to exist in > 0.4.x versions, but will be removed in 0.5. > > Second, Gregory Collins and Ertugrul Soeylemez found a space leak in > Iteratee's (>>=), which could cause eventual space exhaustion in some > circumstances. If you use enumerators to process very large or > infinite streams, you probably want to upgrade to version 0.4.7 or > higher. > > Third, the source code PDF has seen some substantial improvement -- if > you're interested in how the library is implemented, or have insomnia, > read it at < http://john-millikin.com/software/enumerator/enumerator_0.4.8.pdf >> > > Finally, there is a known issue in the current encoding of iteratees > -- if an iteratee yields extra data but never consumed anything, that > iteratee will violate the monad law of associativity. Oleg has updated > his implementations to fix this problem, but since it would break a > *lot* of dependent libraries, I'm holding off until the vague future > of version 0.5. Since iteratees that yield extra data they didn't > consume are invalid anyway, I hope this problem will not cause too > much inconvenience. > > New features > - > > * Range-limited binary file enumeration (requested + initial patch by > Bardur Arantsson). > > * splitWhen , based on the "split" package < > http://hackage.haskell.org/package/split > > > * 0.4.6: Typeable instances for most types (requested by Michael Snoyman) > > * 0.4.5: joinE , which simplifies enumerator/enumeratee composition > (requested by Michael Snoyman) > > ___ > Libraries mailing list > librar...@haskell.org > http://www.haskell.org/mailman/listinfo/libraries > ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe