Re: [Haskell-cafe] [iteratee] empty chunk as special case of input
2011/7/14 John Lato:
> Sorry for the followup, but I forgot about one other important reason
> (probably the real reason) for the nullC case in bindIteratee. Note
> what happens in the regular case: the iteratee is run, and if it's in
> a completed state, the result is passed to the bound function (in the
> "m_done" line), which is then also run. Examine what happens if the
> inner iteratee is also complete:
>
>> const . flip onDone stream
>
> which would be more clearly written as
>
>> \b _str -> onDone b stream
>
> so in this case the leftover stream result from the first iteratee
> (stream) is used as the result of the second iteratee, and the
> leftover stream from the second iteratee (_str) is discarded.
> This doesn't seem right; what should happen is that the two streams
> should be appended somehow.

Yes, I see. From this point of view, ignoring the second iteratee's
leftover stream is neither worse nor better than the other possible
choices, such as ignoring the first iteratee's stream or appending the
two together somehow.

I thought about it, and now it seems that this whole problem exists
because an iteratee can jump into the done state without processing any
data. I came to iteratees from the IncrementalGet library (binary-strict
package) and thought that they used similar concepts, but now I see a
big difference: IncrementalGet's approach doesn't allow such a state
change. This is how they define /Get/ (an iteratee-like structure):

newtype Get r a = Get { unGet :: S -> (a -> S -> IResult r) -> IResult r }

data IResult a = IFailed S String
               | IFinished S a
               | IPartial (B.ByteString -> IResult a)

data S = S ... -- contains data chunk (bytestring) and some other state holders

unGet has a design similar to the onDone branch, but onCont is hidden
inside IResult. So the user can't obtain a result without providing a
stream as input. Well, there is also some black magic there, but I
think it makes it impossible to have two conflicting iteratees of the
kind bindIteratee may encounter.

I would like to compare those approaches and decide which is "better"
(it depends on the task, of course, but how?). binary-strict's code is
easier to understand, but iteratees are more general and offer more
features, including very powerful stream transformations. Is it a good
idea to merge those approaches somehow? For example, if I replace
IncrementalGet's hardcoded stream type with a type variable like
iteratees do, will I be able to implement convStream on top of Get, do
you think? What about enumeratees?

By the way, the iteratee package contains itertut.lhs, a very good
tutorial, thanks! It says that CPS was used to eliminate constructors.
Do you think I may hope that one day the compiler will be able to
transform the constructor-based approach introduced there into CPS
automatically?

Thanks,
Sergey

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
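On the itertut.lhs remark that CPS was used to eliminate constructors: a minimal sketch, assuming a String-only stream and no monad layer, of how a constructor-based iteratee and its CPS encoding can be converted into each other by hand. IterD, IterK, toK, fromK, headD, feedD and roundTrip are names made up for this illustration; they are not taken from the iteratee package or from itertut.lhs, and the sketch says nothing about what a compiler could do automatically.

```haskell
{-# LANGUAGE RankNTypes #-}
-- Toy model only: String streams, no monad layer, no error state.

data Stream = EOF | Chunk String
  deriving (Eq, Show)

-- Constructor-based iteratee: either finished, or waiting for input.
data IterD a
  = Done a Stream
  | Cont (Stream -> IterD a)

-- CPS iteratee: the two constructors become two continuation arguments.
newtype IterK a = IterK
  { runIterK :: forall r.
         (a -> Stream -> r)           -- plays the role of Done
      -> ((Stream -> IterK a) -> r)   -- plays the role of Cont
      -> r
  }

-- Each constructor maps to a call of the matching continuation.
toK :: IterD a -> IterK a
toK (Done a s) = IterK (\onDone _ -> onDone a s)
toK (Cont k)   = IterK (\_ onCont -> onCont (toK . k))

-- Passing the constructors as continuations rebuilds the data form.
fromK :: IterK a -> IterD a
fromK it = runIterK it Done (\k -> Cont (fromK . k))

-- Take the first character of the stream.
headD :: IterD Char
headD = Cont step
  where
    step (Chunk (c:cs)) = Done c (Chunk cs)
    step _              = Cont step  -- empty chunk or EOF: keep waiting

-- Feed one stream value to an iteratee that is still waiting.
feedD :: Stream -> IterD a -> IterD a
feedD s (Cont k) = k s
feedD _ done     = done

-- Round trip through the CPS form, then feed one chunk.
roundTrip :: Maybe (Char, Stream)
roundTrip = case feedD (Chunk "abc") (fromK (toK headD)) of
  Done c rest -> Just (c, rest)   -- Just ('a', Chunk "bc")
  Cont _      -> Nothing
```

In this model the CPS form gives a consumer no constructor to pattern-match on; it has to supply both continuations, which is the sense in which the constructors are "eliminated".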
Re: [Haskell-cafe] [iteratee] empty chunk as special case of input
Sorry for the followup, but I forgot about one other important reason
(probably the real reason) for the nullC case in bindIteratee. Note
what happens in the regular case: the iteratee is run, and if it's in
a completed state, the result is passed to the bound function (in the
"m_done" line), which is then also run. Examine what happens if the
inner iteratee is also complete:

> const . flip onDone stream

which would be more clearly written as

> \b _str -> onDone b stream

so in this case the leftover stream result from the first iteratee
(stream) is used as the result of the second iteratee, and the
leftover stream from the second iteratee (_str) is discarded.
This doesn't seem right; what should happen is that the two streams
should be appended somehow.

It works because at this stage an iteratee won't have been enumerated
over (by the current stream at least), so it can't have any leftover
data, just a null chunk. But bindIteratee explicitly checks for the
null chunk case also, so that's not a problem. If the iteratee was
enumerated over by another stream and therefore does have leftover
data, then since that data isn't part of the current stream it's
rightfully discarded anyway.

This is why your function produced an unexpected result; it's in a
completed state without having been enumerated over, but also has
leftover data, which bindIteratee ignores.

Now that I've thought about it, I'm not convinced this is always
correct; in particular I suspect it of being responsible for a slightly
convoluted implementation of enumFromCallbackCatch. I'll have to expend
more brain cells on it, I think.

John L.

On Thu, Jul 14, 2011 at 1:15 AM, John Lato wrote:
> Hi Sergey,
>
> iteratee (the package) uses a null chunk to signify that no further
> stream data is available within the iteratee, that is, at some point
> the stream has been entirely consumed. Therefore, if any of the
> composed iteratees haven't run to completion, they need to get more
> data from an enumerator. Thus 'bindIteratee' has the nullC guard in
> the definition as an optimization; there's no need to send the null
> chunk to bound iteratees because in most cases they won't be able to
> do anything with it.
>
> I've recently considered removing this, but at present when I take it
> out some unit tests fail and I haven't had time to explore further.
> Since this would have other benefits I would like to do so provided it
> doesn't strongly impact performance. Rather than simply removing the
> case I could add a null case to the Stream type, but that could cause
> some extra work for users.
>
> Also, one rule for writing iteratees is that they shouldn't put
> elements into the stream. Doing so may cause various transformers to
> behave incorrectly. If you want to modify a stream rather than simply
> consuming elements, the correct approach is to create an enumeratee
> (stream transformer).
>
> John L.
>
> On Wed, Jul 13, 2011 at 11:00 PM, Sergey Mironov wrote:
>> Hi community, hi John. I have found myself reading the
>> bindIteratee[1] function for several days; there is something that
>> keeps me from completely understanding the concept. The most
>> noticeable thing is the /nullC/ guard in the definition. To
>> demonstrate the consequences of this solution, let me define an
>> iteratee like
>>
>> myI = Iteratee $ \onDone _ -> onDone 'a' (Chunk "xyz")
>>
>> It is a bit unusual, since myI replaces the real stream with a fake
>> one (xyz). Now let's define two actions that produce different
>> results in an unusual manner:
>>
>> printI i = enumPure1Chunk ['a'..'g'] i >>= run >>= print
>>
>> i1 = (return 'b' >> myI >> I.head) -- myI substitutes the stream,
>>                                    -- last /I.head/ produces 'x', OK
>> i2 = (I.head >> myI >> I.head) -- produces 'b'! I expected another
>>                                -- 'x' here but myI's stream was ignored by >>=
>>
>> Well, I understand that this is probably the expected behaviour, but
>> what is it for? Why can't we handle null input like non-null input?
>> The iteratee may just stay in its current state in that case.
>>
>> Thanks in advance
>> Sergey
>>
>> --
>> [1] - bindIteratee (basically, >>=) code from Data.Iteratee.Base.hs
>>
>> bindIteratee :: (Monad m, Nullable s)
>>     => Iteratee s m a
>>     -> (a -> Iteratee s m b)
>>     -> Iteratee s m b
>> bindIteratee = self
>>     where
>>         self m f = Iteratee $ \onDone onCont ->
>>             let m_done a (Chunk s)
>>                   | nullC s = runIter (f a) onDone onCont
>>                 m_done a stream = runIter (f a) (const . flip onDone stream) f_cont
>>                   where f_cont k Nothing = runIter (k stream) onDone onCont
>>                         f_cont k e = onCont k e
>>             in runIter m m_done (onCont . (flip self f .))
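The two spellings of the completion callback compared in the followup above can be checked against each other in isolation. A small sketch, with keepFirstPF, keepFirst and demo as hypothetical names and the plain tuple constructor standing in for onDone:

```haskell
-- Point-free form, as it appears in the "m_done" line of bindIteratee.
keepFirstPF :: (b -> s -> r) -> s -> (b -> s -> r)
keepFirstPF onDone stream = const . flip onDone stream

-- Expanded form: the second leftover (_str) is dropped and the first
-- leftover (stream) is handed to onDone instead.
keepFirst :: (b -> s -> r) -> s -> (b -> s -> r)
keepFirst onDone stream = \b _str -> onDone b stream

-- Using (,) as onDone makes it visible which leftover survives.
demo :: [(Char, String)]
demo =
  [ keepFirstPF (,) "first leftover" 'b' "second leftover"
  , keepFirst   (,) "first leftover" 'b' "second leftover"
  ]
-- Both entries are ('b', "first leftover").
```

Both functions produce the same pair, so the leftover of the second computation never reaches the outer onDone in either spelling.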
Re: [Haskell-cafe] [iteratee] empty chunk as special case of input
Hi Sergey,

iteratee (the package) uses a null chunk to signify that no further
stream data is available within the iteratee, that is, at some point
the stream has been entirely consumed. Therefore, if any of the
composed iteratees haven't run to completion, they need to get more
data from an enumerator. Thus 'bindIteratee' has the nullC guard in
the definition as an optimization; there's no need to send the null
chunk to bound iteratees because in most cases they won't be able to
do anything with it.

I've recently considered removing this, but at present when I take it
out some unit tests fail and I haven't had time to explore further.
Since this would have other benefits I would like to do so provided it
doesn't strongly impact performance. Rather than simply removing the
case I could add a null case to the Stream type, but that could cause
some extra work for users.

Also, one rule for writing iteratees is that they shouldn't put
elements into the stream. Doing so may cause various transformers to
behave incorrectly. If you want to modify a stream rather than simply
consuming elements, the correct approach is to create an enumeratee
(stream transformer).

John L.

On Wed, Jul 13, 2011 at 11:00 PM, Sergey Mironov wrote:
> Hi community, hi John. I have found myself reading the
> bindIteratee[1] function for several days; there is something that
> keeps me from completely understanding the concept. The most
> noticeable thing is the /nullC/ guard in the definition. To
> demonstrate the consequences of this solution, let me define an
> iteratee like
>
> myI = Iteratee $ \onDone _ -> onDone 'a' (Chunk "xyz")
>
> It is a bit unusual, since myI replaces the real stream with a fake
> one (xyz). Now let's define two actions that produce different
> results in an unusual manner:
>
> printI i = enumPure1Chunk ['a'..'g'] i >>= run >>= print
>
> i1 = (return 'b' >> myI >> I.head) -- myI substitutes the stream,
>                                    -- last /I.head/ produces 'x', OK
> i2 = (I.head >> myI >> I.head) -- produces 'b'! I expected another
>                                -- 'x' here but myI's stream was ignored by >>=
>
> Well, I understand that this is probably the expected behaviour, but
> what is it for? Why can't we handle null input like non-null input?
> The iteratee may just stay in its current state in that case.
>
> Thanks in advance
> Sergey
>
> --
> [1] - bindIteratee (basically, >>=) code from Data.Iteratee.Base.hs
>
> bindIteratee :: (Monad m, Nullable s)
>     => Iteratee s m a
>     -> (a -> Iteratee s m b)
>     -> Iteratee s m b
> bindIteratee = self
>     where
>         self m f = Iteratee $ \onDone onCont ->
>             let m_done a (Chunk s)
>                   | nullC s = runIter (f a) onDone onCont
>                 m_done a stream = runIter (f a) (const . flip onDone stream) f_cont
>                   where f_cont k Nothing = runIter (k stream) onDone onCont
>                         f_cont k e = onCont k e
>             in runIter m m_done (onCont . (flip self f .))
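The advice to modify a stream with an enumeratee, instead of pushing elements back from an iteratee, might look roughly like the following. This is a sketch over a simplified constructor-based model, not the iteratee package's API: Iter, Enumeratee, mapChunksModel, headI, feed, result and upperHead are illustrative names, and error handling and the monad layer are left out.

```haskell
import Data.Char (toUpper)

-- Toy model only; the real package is CPS-based and monadic.
data Stream s = EOF | Chunk s

data Iter s a
  = Done a (Stream s)
  | Cont (Stream s -> Iter s a)

-- A stream transformer: wraps an iteratee over the inner stream type
-- and yields an iteratee over the outer stream type.
type Enumeratee sFrom sTo a = Iter sTo a -> Iter sFrom (Iter sTo a)

-- Apply a function to every chunk before the inner iteratee sees it.
mapChunksModel :: Monoid sFrom => (sFrom -> sTo) -> Enumeratee sFrom sTo a
mapChunksModel f = go
  where
    go (Cont k) = Cont (step k)
    go inner    = Done inner (Chunk mempty)  -- inner finished: stop transforming
    step k (Chunk s) = go (k (Chunk (f s)))
    step k EOF       = Done (Cont k) EOF

-- An ordinary consumer; it only reads from its stream, never writes to it.
headI :: Iter String Char
headI = Cont step
  where
    step (Chunk (c:cs)) = Done c (Chunk cs)
    step _              = Cont step

feed :: Stream s -> Iter s a -> Iter s a
feed s (Cont k) = k s
feed _ done     = done

result :: Iter s a -> Maybe a
result (Done a _) = Just a
result (Cont _)   = Nothing

-- headI sees the upper-cased stream, so the result is Just 'A'.
upperHead :: Maybe Char
upperHead =
  result (feed (Chunk "abc") (mapChunksModel (map toUpper) headI)) >>= result
```

The consumer stays a pure consumer here; the change to the data lives entirely in the transformer that sits between the source and the consumer.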
[Haskell-cafe] [iteratee] empty chunk as special case of input
Hi community, hi John. I have found myself reading the bindIteratee[1]
function for several days; there is something that keeps me from
completely understanding the concept. The most noticeable thing is the
/nullC/ guard in the definition. To demonstrate the consequences of
this solution, let me define an iteratee like

myI = Iteratee $ \onDone _ -> onDone 'a' (Chunk "xyz")

It is a bit unusual, since myI replaces the real stream with a fake one
(xyz). Now let's define two actions that produce different results in
an unusual manner:

printI i = enumPure1Chunk ['a'..'g'] i >>= run >>= print

i1 = (return 'b' >> myI >> I.head) -- myI substitutes the stream,
                                   -- last /I.head/ produces 'x', OK
i2 = (I.head >> myI >> I.head) -- produces 'b'! I expected another
                               -- 'x' here but myI's stream was ignored by >>=

Well, I understand that this is probably the expected behaviour, but
what is it for? Why can't we handle null input like non-null input? The
iteratee may just stay in its current state in that case.

Thanks in advance
Sergey

--
[1] - bindIteratee (basically, >>=) code from Data.Iteratee.Base.hs

bindIteratee :: (Monad m, Nullable s)
    => Iteratee s m a
    -> (a -> Iteratee s m b)
    -> Iteratee s m b
bindIteratee = self
    where
        self m f = Iteratee $ \onDone onCont ->
            let m_done a (Chunk s)
                  | nullC s = runIter (f a) onDone onCont
                m_done a stream = runIter (f a) (const . flip onDone stream) f_cont
                  where f_cont k Nothing = runIter (k stream) onDone onCont
                        f_cont k e = onCont k e
            in runIter m m_done (onCont . (flip self f .))
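The i1/i2 difference described above can be reproduced without the package in a small toy model. The sketch below is an assumption-heavy simplification: the stream is String-only, there is no monad layer, errors are plain strings, and enumPure1Chunk and run are pure stand-ins loosely modelled on the functions of the same name. bindI copies the shape of the bindIteratee listing (with null in place of nullC); returnI, andThen, headI and runOnInput are names invented for the illustration.

```haskell
{-# LANGUAGE RankNTypes #-}
-- Toy model, NOT the iteratee package: String streams, no monad layer.

data Stream = EOF | Chunk String

newtype Iteratee a = Iteratee
  { runIter :: forall r.
         (a -> Stream -> r)                             -- onDone
      -> ((Stream -> Iteratee a) -> Maybe String -> r)  -- onCont
      -> r
  }

idone :: a -> Stream -> Iteratee a
idone a s = Iteratee (\onDone _ -> onDone a s)

icont :: (Stream -> Iteratee a) -> Maybe String -> Iteratee a
icont k e = Iteratee (\_ onCont -> onCont k e)

-- Same shape as the quoted bindIteratee, including the null-chunk guard.
bindI :: Iteratee a -> (a -> Iteratee b) -> Iteratee b
bindI m f = Iteratee (\onDone onCont ->
  let m_done a (Chunk s)
        | null s = runIter (f a) onDone onCont
      m_done a stream = runIter (f a) (const . flip onDone stream) f_cont
        where f_cont k Nothing = runIter (k stream) onDone onCont
              f_cont k e       = onCont k e
  in runIter m m_done (onCont . (flip bindI f .)))

-- Stand-in for return: done, with a null chunk as leftover.
returnI :: a -> Iteratee a
returnI a = idone a (Chunk "")

-- Stand-in for (>>).
andThen :: Iteratee a -> Iteratee b -> Iteratee b
andThen m k = bindI m (const k)
infixl 1 `andThen`

-- Stand-in for I.head.
headI :: Iteratee Char
headI = icont step Nothing
  where
    step (Chunk [])     = icont step Nothing
    step (Chunk (c:cs)) = idone c (Chunk cs)
    step EOF            = icont step (Just "EOF")

-- The unusual iteratee from the message: done, but with a fake leftover.
myI :: Iteratee Char
myI = idone 'a' (Chunk "xyz")

enumPure1Chunk :: String -> Iteratee a -> Iteratee a
enumPure1Chunk str it = runIter it idone onCont
  where
    onCont k Nothing = k (Chunk str)
    onCont k e       = icont k e

run :: Iteratee a -> Either String a
run it = runIter it onDone onCont
  where
    onDone x _         = Right x
    onCont k Nothing   = runIter (k EOF) onDone onCont'
    onCont _ (Just e)  = Left e
    onCont' _ Nothing  = Left "iteratee still wants input after EOF"
    onCont' _ (Just e) = Left e

i1, i2 :: Iteratee Char
i1 = returnI 'b' `andThen` myI `andThen` headI  -- first leftover is null
i2 = headI `andThen` myI `andThen` headI        -- first leftover is "bcdefg"

runOnInput :: Iteratee a -> Either String a
runOnInput = run . enumPure1Chunk ['a'..'g']
-- runOnInput i1 gives Right 'x'; runOnInput i2 gives Right 'b'.
```

In i1 the first computation leaves a null chunk, so the guard runs myI with the outer continuations and its "xyz" becomes the current stream. In i2 the first headI leaves "bcdefg", so the second clause of m_done applies and myI's "xyz" is dropped in favour of that leftover.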