Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-31 Thread John Lato
>
> Message: 7
> Date: Tue, 29 Mar 2011 22:39:12 -0400
> From: wren ng thornton 
> Subject: Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8
> To: haskell-cafe@haskell.org
> Message-ID: <4d9297d0.7060...@freegeek.org>
> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>
> On 3/29/11 4:40 AM, o...@okmij.org wrote:
> > Wren Thornton wrote:
> >> This is often conflated with the iteratee throwing an error/exception,
> >> which is wrong because we should distinguish between bad program
> >> states and argument passing.
> >
> > I guess this is a matter of different points of view on exceptions.
>
> The problem is not so much the exceptions per se (one goto is about as
> good as any other), it has more to do with the fact that important
> things are being left out of the types.
>
> One of the great things about Haskell is that you can lean so heavily on
> the type system to protect yourself when refactoring, designing by
> contract, etc. However, if there's an unspoken code of communication
> between specific enumerators and iteratees, it's very easy to break
> things. This is why the communication should be captured in the types,
> regardless of the control-flow mechanism used to implement that
> communication. I'd like the static guarantee that whatever special
> requests my iteratee could make, its enumerator is in a position to
> fulfill those requests (or die trying). Allowing for the iteratee to be
> paired with an enumerator which is incapable of handling its requests is
> a type error and should be treated as such.
>

This has long been a goal of mine for iteratee (since before the 0.4
release), although I haven't really done any work on it.  Maybe it's time to
see if I can get a more satisfactory implementation.


> > Wren Thornton wrote:
> >> In an ideal framework the producers, transformers, and consumers of
> >> stream data would have a type parameter indicating the up-stream
> >> communication they support or require (in addition to the type
> >> parameters for stream type, result type, and side-effect type).
> >
> > Very true. Currently the design of Iteratees quite resembles that of
> > Control.Exception: everything can throw SomeException. Ideally one
> > would like to be more precise, and specify what exceptions or sorts of
> > exceptions could be thrown -- by Iteratees, and by ordinary Haskell
> > functions. The design of a good effect system is still the topic of
> > active research, although there are some encouraging results.
>
> Yeah, I'm not a big fan of extensible exceptions either. Don't get me
> wrong, it's an awesome hack and it's far cleaner than the Java approach;
> but it still goes against my sensibilities.
>

I've come around to the view that exceptions are a bad idea (in Haskell),
and just making everything explicit (via Maybe, ErrorT, explicit-exceptions,
or otherwise) is the best approach at present.

John L.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-29 Thread wren ng thornton

On 3/29/11 4:40 AM, o...@okmij.org wrote:

Wren Thornton wrote:

This is often conflated with the iteratee throwing an error/exception,
which is wrong because we should distinguish between bad program
states and argument passing.


I guess this is a matter of different points of view on exceptions.


The problem is not so much the exceptions per se (one goto is about as 
good as any other), it has more to do with the fact that important 
things are being left out of the types.


One of the great things about Haskell is that you can lean so heavily on 
the type system to protect yourself when refactoring, designing by 
contract, etc. However, if there's an unspoken code of communication 
between specific enumerators and iteratees, it's very easy to break 
things. This is why the communication should be captured in the types, 
regardless of the control-flow mechanism used to implement that 
communication. I'd like the static guarantee that whatever special 
requests my iteratee could make, its enumerator is in a position to 
fulfill those requests (or die trying). Allowing for the iteratee to be 
paired with an enumerator which is incapable of handling its requests is 
a type error and should be treated as such.




Wren Thornton wrote:

In an ideal framework the producers, transformers, and consumers of
stream data would have a type parameter indicating the up-stream
communication they support or require (in addition to the type
parameters for stream type, result type, and side-effect type).


Very true. Currently the design of Iteratees quite resembles that of
Control.Exception: everything can throw SomeException. Ideally one
would like to be more precise, and specify what exceptions or sorts of
exceptions could be thrown -- by Iteratees, and by ordinary Haskell
functions. The design of a good effect system is still the topic of
active research, although there are some encouraging results.


Yeah, I'm not a big fan of extensible exceptions either. Don't get me 
wrong, it's an awesome hack and it's far cleaner than the Java approach; 
but it still goes against my sensibilities.


I think a big part of the problem is that we don't have a good type 
theory for coroutines. The idea of functions that "never return" just 
doesn't cut it. And conflating legitimate control-flow manipulation with 
bottom doesn't either. But, as of yet, that's all we've got.


--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-29 Thread Ertugrul Soeylemez
Antoine Latter  wrote:

> It looks like we can't quite fit Enumeratee into the Category
> typeclass (without newtypes, at least). That's a shame.

Yeah.  Intuitively it looks like iteratees and enumeratees are excellent
candidates for Category and even Arrow.  Unfortunately they can either
be monad transformers or arrows.  You can't mix without, as you said, a
newtype.  That's very unfortunate.


Greets,
Ertugrul


-- 
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-29 Thread 山本和彦
Hello,

>> (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m
>> b
>> enum =$ iter = joinI (enum $$ iter)
>>
>> ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b ->
>> Enumerator ai m b
>> ($=) = joinE
>> --
> 
> The operators sound good to me. My only request would be to put in a
> usage example in the documentation. I'd be happy to write one if you'd
> like. Personally, I think that =$ will *greatly* clean up my code.

I have a tutorial to describe how to use the enumerator library in
Japanese. Since it is popular among the Haskell community in Japan, I
guess it's worth translating into English. So, I did.

http://www.mew.org/~kazu/proj/enumerator/

This tutorial explains how to use (=$) and ($=) as well as other
operators(($$), (<==<), (>>=)).

Of course, my English is broken. If English native speakers will
kindly correct broken grammar, it would be appreciated.

I'm reachable by e-mail or twitter (@kazu_yamamoto).

--Kazu

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-29 Thread Antoine Latter
On Tue, Mar 29, 2011 at 6:15 PM, Ertugrul Soeylemez  wrote:
> Hello John,
>
> Sorry that I'm late.  And honestly one day for request submissions is a
> bit narrow.
>
> I have a request, too:  Right now it is difficult to compose
> enumeratees.  An equivalent of (.) for enumeratees would be great.  So
> instead of:
>
>    joinI $ e1 $$ joinI $ e2 $$ iter
>
> one could write
>
>    let e = e1 .= e2 in e =$ iter
>
> I would appreciate a 0.4.10 with such a composition operator.
>
>
> Greets,
> Ertugrul
>

It looks like we can't quite fit Enumeratee into the Category
typeclass (without newtypes, at least). That's a shame.

Antoine

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-29 Thread Ertugrul Soeylemez
Hello John,

Sorry that I'm late.  And honestly one day for request submissions is a
bit narrow.

I have a request, too:  Right now it is difficult to compose
enumeratees.  An equivalent of (.) for enumeratees would be great.  So
instead of:

joinI $ e1 $$ joinI $ e2 $$ iter

one could write

let e = e1 .= e2 in e =$ iter

I would appreciate a 0.4.10 with such a composition operator.


Greets,
Ertugrul


John Millikin  wrote:

> Since the release, a couple people have sent in feature requests, so I'm 
> going to put out 0.4.9 in a day or so.
> 
> New features will be:
> 
> - tryIO: runs an IO computation, and converts any exceptions into 
> ``throwError`` calls (requested by Kazu Yamamoto)
> 
> - checkContinue: encapsulates a common pattern (loop (Continue k) = ...) 
> when defining enumerators
> 
> - mapAccum and mapAccum: sort of like map and mapM, except the step function 
> is stateful (requested by Long Huynh Huu)
> 
> Anyone else out there sitting on a request? Please send them in -- I am 
> always happy to receive them, even if they must be declined.
> 
> ---
> 
> Also, I would like to do a quick poll regarding operators.
> 
> 1. It has been requested that I add operator aliases for joinI and joinE.
> 
> 2. There have been complaints that the library defines too many operators 
> (currently, 5).
> 
> Do any existing enumerator users, or anyone for that matter, have an opinion 
> either way?
> 
> The proposed operators are:
> 
> --
> infixr 0 =$
> infixr 0 $=
> 
> (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m 
> b
> enum =$ iter = joinI (enum $$ iter)
> 
> ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b -> 
> Enumerator ai m b
> ($=) = joinE
> --
> 
> 


-- 
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-29 Thread Michael Snoyman
Thanks, I look forward to being able to use these new operators!

On Tue, Mar 29, 2011 at 8:22 PM, John Millikin  wrote:
> 0.4.9 has been uploaded to cabal, with the new operators. Changes are in the
> replied-to post (and also quoted below), plus the new operators proposed by
> Kazu Yamamoto.
>
> Here's the corresponding docs (they have examples!)
>
> --
> -- | @enum =$ iter = 'joinI' (enum $$ iter)@
> --
> -- “Wraps” an iteratee /inner/ in an enumeratee /wrapper/.
> -- The resulting iteratee will consume /wrapper/’s input type and
> -- yield /inner/’s output type.
> --
> -- Note: if the inner iteratee yields leftover input when it finishes,
> -- that extra will be discarded.
> --
> -- As an example, consider an iteratee that converts a stream of
> UTF8-encoded
> -- bytes into a single 'TL.Text':
> --
> -- > consumeUTF8 :: Monad m => Iteratee ByteString m Text
> --
> -- It could be written with either 'joinI' or '(=$)':
> --
> -- > import Data.Enumerator.Text as ET
> -- >
> -- > consumeUTF8 = joinI (decode utf8 $$ ET.consume)
> -- > consumeUTF8 = decode utf8 =$ ET.consume
> --
> -- Since: 0.4.9
>
> -- | @enum $= enee = 'joinE' enum enee@
> --
> -- “Wraps” an enumerator /inner/ in an enumeratee /wrapper/.
> -- The resulting enumerator will generate /wrapper/’s output type.
> --
> -- As an example, consider an enumerator that yields line character counts
> -- for a text file (e.g. for source code readability checking):
> --
> -- > enumFileCounts :: FilePath -> Enumerator Int IO b
> --
> -- It could be written with either 'joinE' or '($=)':
> --
> -- > import Data.Text as T
> -- > import Data.Enumerator.List as EL
> -- > import Data.Enumerator.Text as ET
> -- >
> -- > enumFileCounts path = joinE (enumFile path) (EL.map T.length)
> -- > enumFileCounts path = enumFile path $= EL.map T.length
> --
> -- Since: 0.4.9
> --
>
> Minor release note -- 0.4.9 and 0.4.9.1 are the exact same code; I just
> forgot a @ in one of the new docs and had to re-upload so Hackage would
> haddock properly. There is no difference in behavior.
>
> On Monday, March 28, 2011 10:50:45 PM UTC-7, John Millikin wrote:
>>
>> Since the release, a couple people have sent in feature requests, so I'm
>> going to put out 0.4.9 in a day or so.
>>
>> New features will be:
>>
>> - tryIO: runs an IO computation, and converts any exceptions into
>> ``throwError`` calls (requested by Kazu Yamamoto)
>>
>> - checkContinue: encapsulates a common pattern (loop (Continue k) = ...)
>> when defining enumerators
>>
>> - mapAccum and mapAccum: sort of like map and mapM, except the step
>> function is stateful (requested by Long Huynh Huu)
>>
>> Anyone else out there sitting on a request? Please send them in -- I am
>> always happy to receive them, even if they must be declined.
>>
>> ---
>>
>> Also, I would like to do a quick poll regarding operators.
>>
>> 1. It has been requested that I add operator aliases for joinI and joinE.
>>
>> 2. There have been complaints that the library defines too many operators
>> (currently, 5).
>>
>> Do any existing enumerator users, or anyone for that matter, have an
>> opinion either way?
>>
>> The proposed operators are:
>>
>> --
>> infixr 0 =$
>> infixr 0 $=
>>
>> (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao
>> m b
>> enum =$ iter = joinI (enum $$ iter)
>>
>> ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b
>> -> Enumerator ai m b
>> ($=) = joinE
>> --
>>
>

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-29 Thread John Millikin
0.4.9 has been uploaded to cabal, with the new operators. Changes are in the 
replied-to post (and also quoted below), plus the new operators proposed by 
Kazu Yamamoto.

Here's the corresponding docs (they have examples!)

--
-- | @enum =$ iter = 'joinI' (enum $$ iter)@
--
-- “Wraps” an iteratee /inner/ in an enumeratee /wrapper/.
-- The resulting iteratee will consume /wrapper/’s input type and
-- yield /inner/’s output type.
--
-- Note: if the inner iteratee yields leftover input when it finishes,
-- that extra will be discarded.
--
-- As an example, consider an iteratee that converts a stream of 
UTF8-encoded
-- bytes into a single 'TL.Text':
--
-- > consumeUTF8 :: Monad m => Iteratee ByteString m Text
--
-- It could be written with either 'joinI' or '(=$)':
--
-- > import Data.Enumerator.Text as ET
-- >
-- > consumeUTF8 = joinI (decode utf8 $$ ET.consume)
-- > consumeUTF8 = decode utf8 =$ ET.consume
--
-- Since: 0.4.9

-- | @enum $= enee = 'joinE' enum enee@
--
-- “Wraps” an enumerator /inner/ in an enumeratee /wrapper/.
-- The resulting enumerator will generate /wrapper/’s output type.
--
-- As an example, consider an enumerator that yields line character counts
-- for a text file (e.g. for source code readability checking):
--
-- > enumFileCounts :: FilePath -> Enumerator Int IO b
--
-- It could be written with either 'joinE' or '($=)':
--
-- > import Data.Text as T
-- > import Data.Enumerator.List as EL
-- > import Data.Enumerator.Text as ET
-- >
-- > enumFileCounts path = joinE (enumFile path) (EL.map T.length)
-- > enumFileCounts path = enumFile path $= EL.map T.length
--
-- Since: 0.4.9
--

Minor release note -- 0.4.9 and 0.4.9.1 are the exact same code; I just 
forgot a @ in one of the new docs and had to re-upload so Hackage would 
haddock properly. There is no difference in behavior.

On Monday, March 28, 2011 10:50:45 PM UTC-7, John Millikin wrote:
>
> Since the release, a couple people have sent in feature requests, so I'm 
> going to put out 0.4.9 in a day or so.
>
> New features will be:
>
> - tryIO: runs an IO computation, and converts any exceptions into 
> ``throwError`` calls (requested by Kazu Yamamoto)
>
> - checkContinue: encapsulates a common pattern (loop (Continue k) = ...) 
> when defining enumerators
>
> - mapAccum and mapAccum: sort of like map and mapM, except the step 
> function is stateful (requested by Long Huynh Huu)
>
> Anyone else out there sitting on a request? Please send them in -- I am 
> always happy to receive them, even if they must be declined.
>
> ---
>
> Also, I would like to do a quick poll regarding operators.
>
> 1. It has been requested that I add operator aliases for joinI and joinE.
>
> 2. There have been complaints that the library defines too many operators 
> (currently, 5).
>
> Do any existing enumerator users, or anyone for that matter, have an 
> opinion either way?
>
> The proposed operators are:
>
> --
> infixr 0 =$
> infixr 0 $=
>
> (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m 
> b
> enum =$ iter = joinI (enum $$ iter)
>
> ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b -> 
> Enumerator ai m b
> ($=) = joinE
> --
>
>___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-29 Thread Gregory Collins
On Mar 29, 2011 10:42 AM,  wrote:
>

> I'm keen to hear of the example that seem to require Iteratee's
> allocating additional resources. I'd really like to see if any of such
> cases can be cast it terms of regions, implemented via iterated
> Iteratee transformers.

Hello Oleg,

The first example which comes to mind - because I recently implemented this
- is HTTP uploads: the input stream can hold N files (the value of N not
known ahead of time), and each upload  can go to a separate temporary file.

If the request stream fails early (e.g. client browser closes connection),
any already-created temp files and any open file handles must be cleaned up.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-29 Thread oleg

Wren Thornton wrote:
> This is often conflated with the iteratee throwing an error/exception,
> which is wrong because we should distinguish between bad program
> states and argument passing.

I guess this is a matter of different points of view on exceptions. I
am a fan of the model of (effectful) computation proposed by
Cartwright and Felleisen a while ago:
http://okmij.org/ftp/Computation/monads.html#ExtensibleDS
In their model, all the computation is done by throwing resumable
exceptions -- including the pure computation such as arithmetic and
CBV/CBN applications. The similarity of ExtensibleDS.hs with Iteratee
should be quite noticeable, especially regarding the part about
throwing `errors.'

The relation with co-routines is hard to miss: in fact, Iteratees are
built upon co-routines, which are being resumed by the enumerator. The
error message is the additional piece of data that is being associated
with `yield', telling the enumerator to do something extra rather than
mere getting the next piece of data and resuming the co-routine. The
co-routine is the simplest part of the iteratee; it is the plumbing
that takes a long time to engineer.

John A. De Goes:
> 2. Error recovery is ill-defined because errors do not describe
> what portion of the input they have already consumed;

I'm confused about this complaint: if an iteratee encounters an
unusual condition or just has a special request, it sends a message
that eventually propagates to the responsible enumerator. That
enumerator knows how much data it has sent down to iteratees. The
RandomIO module 
http://okmij.org/ftp/Haskell/Iteratee/RandomIO.hs
is a good illustration: when the enumerator receives the seek request,
it checks if the desired stream offset corresponds to the data already
in the current IO buffer. If so, no IO is performed and the iteratee
is resumed with the existing buffer data. The tests in the file check
for that. Iteratee knows nothing about the buffer or if there is a
buffer.

Wren Thornton wrote:
> In an ideal framework the producers, transformers, and consumers of
> stream data would have a type parameter indicating the up-stream
> communication they support or require (in addition to the type
> parameters for stream type, result type, and side-effect type).
Very true. Currently the design of Iteratees quite resembles that of
Control.Exception: everything can throw SomeException. Ideally one
would like to be more precise, and specify what exceptions or sorts of
exceptions could be thrown -- by Iteratees, and by ordinary Haskell
functions. The design of a good effect system is still the topic of
active research, although there are some encouraging results.


John A. De Goes:
>  3. Iteratees sometimes need to manage resources, but they're not
>  designed to do so which leads to hideous workarounds;

Gregory Collins:
> The thing which I find is missing the most from enumerator as it
> stands is not this -- it's the fact that Iteratees sometimes need to
> allocate resources which need explicit manual deallocation (i.e.
> sockets, file descriptors, mmaps, etc), but because Enumerators are
> running the show, there is no "local" way to ensure that the
> cleanup/bracket routines get run on error.

I used to think that processing several inputs at different paces was
indeed a stumbling block. It seemed that an iteratee needed to open a
separate file, which it is indeed ill-equipped to do. Fortunately,
that difficulty has been overcome, surprisingly in a natural way with
no changes to the library:
http://okmij.org/ftp/Streams.html#2enum1iter

The pleasant surprise is that we can iterate (no pun intended)
Iteratee monad transformers, just as we did with (IORT s) monad
transformer in the Lightweight Monadic Regions. Thus we maintain the
region-like discipline of managing resources.

I'm keen to hear of the example that seem to require Iteratee's
allocating additional resources. I'd really like to see if any of such
cases can be cast it terms of regions, implemented via iterated
Iteratee transformers.

John A. De Goes:
> 1. It does not make sense in general to bind with an iteratee that
> has already consumed input, but there's no type-level difference
> between a "virgin" iteratee and one that has already consumed input;

I'm not sure I follow. Why should it make a difference between a
virgin iteratee and the one that consumed some input. One should think
of the Iteratee as two arguments of fold (f and z) bundled together. Why
the function being folded over should care how many times it has been
applied to input data? It is a pure function, transforming state
plus input to a new state. The useful laws of fold hold precisely
because the function f is pure and doesn't care.

A detailed example showing why you think you need this distinction
would be appreciated.

> 4. Iteratees cannot incrementally produce output, it's all or
> nothing, which makes them terrible for many real world problems
>  

Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-28 Thread Michael Snoyman
On Tue, Mar 29, 2011 at 7:50 AM, John Millikin  wrote:
> Since the release, a couple people have sent in feature requests, so I'm
> going to put out 0.4.9 in a day or so.
>
> New features will be:
>
> - tryIO: runs an IO computation, and converts any exceptions into
> ``throwError`` calls (requested by Kazu Yamamoto)
>
> - checkContinue: encapsulates a common pattern (loop (Continue k) = ...)
> when defining enumerators
>
> - mapAccum and mapAccum: sort of like map and mapM, except the step function
> is stateful (requested by Long Huynh Huu)
>
> Anyone else out there sitting on a request? Please send them in -- I am
> always happy to receive them, even if they must be declined.
>
> ---
>
> Also, I would like to do a quick poll regarding operators.
>
> 1. It has been requested that I add operator aliases for joinI and joinE.
>
> 2. There have been complaints that the library defines too many operators
> (currently, 5).
>
> Do any existing enumerator users, or anyone for that matter, have an opinion
> either way?
>
> The proposed operators are:
>
> --
> infixr 0 =$
> infixr 0 $=
>
> (=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m
> b
> enum =$ iter = joinI (enum $$ iter)
>
> ($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b ->
> Enumerator ai m b
> ($=) = joinE
> --

The operators sound good to me. My only request would be to put in a
usage example in the documentation. I'd be happy to write one if you'd
like. Personally, I think that =$ will *greatly* clean up my code.

Michael

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-28 Thread John Millikin
Since the release, a couple people have sent in feature requests, so I'm 
going to put out 0.4.9 in a day or so.

New features will be:

- tryIO: runs an IO computation, and converts any exceptions into 
``throwError`` calls (requested by Kazu Yamamoto)

- checkContinue: encapsulates a common pattern (loop (Continue k) = ...) 
when defining enumerators

- mapAccum and mapAccum: sort of like map and mapM, except the step function 
is stateful (requested by Long Huynh Huu)

Anyone else out there sitting on a request? Please send them in -- I am 
always happy to receive them, even if they must be declined.

---

Also, I would like to do a quick poll regarding operators.

1. It has been requested that I add operator aliases for joinI and joinE.

2. There have been complaints that the library defines too many operators 
(currently, 5).

Do any existing enumerator users, or anyone for that matter, have an opinion 
either way?

The proposed operators are:

--
infixr 0 =$
infixr 0 $=

(=$) :: Monad m => Enumeratee ao ai m b -> Iteratee ai m b -> Iteratee ao m 
b
enum =$ iter = joinI (enum $$ iter)

($=) :: Monad m => Enumerator ao m (Step ai m b) -> Enumeratee ao ai m b -> 
Enumerator ai m b
($=) = joinE
--

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-28 Thread John A. De Goes

This isn't quite what I'm after. I want to pull chunks on demand (i.e. have 
control over both the input and the output). Enumeratees don't allow me to do 
that.

Regards,

John A. De Goes
Twitter: @jdegoes 
LinkedIn: http://linkedin.com/in/jdegoes

On Mar 27, 2011, at 7:58 PM, John Millikin wrote:

> If the receiver can only accept very small chunks, you can put a rechunking 
> stage in between the compression and iteratee:
> 
> ---
> verySmallChunks :: Monad m => Enumeratee ByteString ByteString m b
> verySmallSchunks = sequence (take 10)
> ---
> 
> Resending is slightly more complex -- if the other end can say "resend that 
> last chunk", then it should be easy enough, but "resend the last 2 bytes of 
> that chunk you sent 5 minutes ago" would be much harder. What is your use 
> case?


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-28 Thread John A. De Goes

Now THAT"s what I'm talking about. Augment such a solution with interruptible & 
resumable data producers, and I'd have everything I need.

Regards,

John A. De Goes
Twitter: @jdegoes 
LinkedIn: http://linkedin.com/in/jdegoes

On Mar 27, 2011, at 10:54 PM, wren ng thornton wrote:

> In an ideal framework the producers, transformers, and consumers of stream 
> data would have a type parameter indicating the up-stream communication they 
> support or require (in addition to the type parameters for stream type, 
> result type, and side-effect type). That way clients can just define an ADT 
> for their communication protocol, and be done with it. There may still be 
> issues with the Expression Problem, but at least those are pushed out of the 
> stream processing framework itself which really shouldn't care about the 
> types of communication used.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-28 Thread James Cook

On Mar 28, 2011, at 12:54 AM, wren ng thornton wrote:


On 3/27/11 9:58 PM, John Millikin wrote:
Resending is slightly more complex -- if the other end can say  
"resend that
last chunk", then it should be easy enough, but "resend the last 2  
bytes of
that chunk you sent 5 minutes ago" would be much harder. What is  
your use

case?


This does highlight one of the restrictions I've lamented about the  
iteratee framework. Namely that the current versions I've seen place  
unnecessary limitations on the communication the iteratee is allowed  
to give to the enumerator/enumeratees above it. This is often  
conflated with the iteratee throwing an error/exception, which is  
wrong because we should distinguish between bad program states and  
argument passing. Moreover the type system doesn't capture the kinds  
of communication iteratees assume of their enumerators/enumeratees,  
nor the kinds of communication supported by the enumerators/ 
enumeratees, which means that failure to hook them up in the right  
(non-typechecked) way /does/ constitute an error.


The one example that tends to be supported is the iteratee  
requesting that the enumerator/enumeratees seek to a given position  
in a file. Which is a good example, but it's not the only one.  
Requesting the resending of chunks is another good example. But  
there's no limit to the reasonable kinds of communication an  
iteratee could want.


In an ideal framework the producers, transformers, and consumers of  
stream data would have a type parameter indicating the up-stream  
communication they support or require (in addition to the type  
parameters for stream type, result type, and side-effect type). That  
way clients can just define an ADT for their communication protocol,  
and be done with it. There may still be issues with the Expression  
Problem, but at least those are pushed out of the stream processing  
framework itself which really shouldn't care about the types of  
communication used.


It's somewhat outdated and underdeveloped (I was writing for myself so  
I never really bothered finishing it), but I wrote an exploration of  
iteratee semantics[1] a while back in which I specified an iteratee as  
a monad-transformer stack involving, at its core, the "PromptT" or  
"ProgramT" monad transformers (as far as I know, the same could be  
done with the "Coroutine" monad).  I personally found that  
construction far more lucid than the usual ad-hoc view, and it also  
makes it very clear how the model can be trivially extended to support  
additional operations such as these.


Based on what I learned while writing that (and on the similarity  
between coroutines and the concepts I used), I strongly agree with  
Mario Blažević, suggestion to look at his monad-coroutine library as  
a way of understanding where they fit in some larger design space.  I  
would even go so far as to suggest that something like it could be  
considered as either a replacement for iteratees or as the underlying  
implementation of an iteratee library, because the concept not only  
subsumes iteratees and enumerators, but also delegates control to code  
that can be independent of both rather than simply reversing the  
"conventional" iterator concept.  I believe it also subsumes iterators  
and whatever their corresponding parts are called.


As he mentions, his implementation does not come with all the  
"plumbing", but I think it would be worthwhile to create that  
plumbing, because either coroutines or "operational monads" may very  
well be the basis needed to develop a "grand unified theory" of  
composable stream processing.  If nothing else, the isomorphisms in  
his coroutine-enumerator[2] and coroutine-iteratee[3] packages seem to  
give a much more direct and useful iteratee semantics than I've seen  
given anywhere else, and at the same time they are much more readily  
extended to cover additional operations.


-- James

1. https://github.com/mokus0/junkbox/tree/master/Papers/HighLevelIteratees
2. http://hackage.haskell.org/package/coroutine-enumerator
3. http://hackage.haskell.org/package/coroutine-iteratee
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-28 Thread David Leimbach
On Mon, Mar 28, 2011 at 8:06 AM, John Millikin  wrote:

> On Sunday, March 27, 2011 9:45:23 PM UTC-7, Ertugrul Soeylemez wrote:
>>
>> > For setting a global timeout on an entire session, it's better to wrap
>> > the ``run_`` call with ``System.Timeout.timeout`` -- this is more
>> > efficient than testing the time on every chunk, and does not require a
>> > specialised enumerator.
>> It may be more efficient, but I don't really like it.  I like robust
>>
>> applications, and to me killing a thread is always a mistake, even if
>> the thread is kill-safe.
>>
> ``timeout`` doesn't kill the thread, it just returns ``Nothing`` if the
> computation took longer than expected.
>


Timeout does kill the thread that is used for timing out :-).  The
thread that measures the timeout throws an exception to the worker
thread that's being monitored.

Either way you're interrupting a thread.  Kill it or toss an exception
at it, I don't see the difference really.

Dave



>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-28 Thread John Millikin
On Sunday, March 27, 2011 9:45:23 PM UTC-7, Ertugrul Soeylemez wrote:
>
> > For setting a global timeout on an entire session, it's better to wrap
> > the ``run_`` call with ``System.Timeout.timeout`` -- this is more
> > efficient than testing the time on every chunk, and does not require a
> > specialised enumerator.
> It may be more efficient, but I don't really like it.  I like robust
>
> applications, and to me killing a thread is always a mistake, even if
> the thread is kill-safe.
>
``timeout`` doesn't kill the thread, it just returns ``Nothing`` if the 
computation took longer than expected.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-27 Thread wren ng thornton

On 3/27/11 9:58 PM, John Millikin wrote:

Resending is slightly more complex -- if the other end can say "resend that
last chunk", then it should be easy enough, but "resend the last 2 bytes of
that chunk you sent 5 minutes ago" would be much harder. What is your use
case?


This does highlight one of the restrictions I've lamented about the 
iteratee framework. Namely that the current versions I've seen place 
unnecessary limitations on the communication the iteratee is allowed to 
give to the enumerator/enumeratees above it. This is often conflated 
with the iteratee throwing an error/exception, which is wrong because we 
should distinguish between bad program states and argument passing. 
Moreover the type system doesn't capture the kinds of communication 
iteratees assume of their enumerators/enumeratees, nor the kinds of 
communication supported by the enumerators/enumeratees, which means that 
failure to hook them up in the right (non-typechecked) way /does/ 
constitute an error.


The one example that tends to be supported is the iteratee requesting 
that the enumerator/enumeratees seek to a given position in a file. 
Which is a good example, but it's not the only one. Requesting the 
resending of chunks is another good example. But there's no limit to the 
reasonable kinds of communication an iteratee could want.


In an ideal framework the producers, transformers, and consumers of 
stream data would have a type parameter indicating the up-stream 
communication they support or require (in addition to the type 
parameters for stream type, result type, and side-effect type). That way 
clients can just define an ADT for their communication protocol, and be 
done with it. There may still be issues with the Expression Problem, but 
at least those are pushed out of the stream processing framework itself 
which really shouldn't care about the types of communication used.


--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-27 Thread Ertugrul Soeylemez
John Millikin  wrote:

> Good idea -- I've added an ``enumSocketTimed`` and ``iterSocketTimed``
> to the network-enumerator package at <
> http://hackage.haskell.org/package/network-enumerator
> >. ``enumSocketTimed`` is equivalent to your ``enumHandleTimeout``,
> but instead of Handle uses the more efficient Socket type.

For simple applications working with handles is much more convenient, so
I decided to implement a timed handle enumerator instead of a socket
enumerator.

Perhaps it would be a good idea to add your 'enumSocketTimed' and
'iterSocketTimed' to my netlines package, too.  Also I should add
'iterHandleTimeout'.


> For setting a global timeout on an entire session, it's better to wrap
> the ``run_`` call with ``System.Timeout.timeout`` -- this is more
> efficient than testing the time on every chunk, and does not require a
> specialised enumerator.

It may be more efficient, but I don't really like it.  I like robust
applications, and to me killing a thread is always a mistake, even if
the thread is kill-safe.


Greets,
Ertugrul


-- 
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-27 Thread John Millikin
If the receiver can only accept very small chunks, you can put a rechunking 
stage in between the compression and iteratee:

---
verySmallChunks :: Monad m => Enumeratee ByteString ByteString m b
verySmallSchunks = sequence (take 10)
---

Resending is slightly more complex -- if the other end can say "resend that 
last chunk", then it should be easy enough, but "resend the last 2 bytes of 
that chunk you sent 5 minutes ago" would be much harder. What is your use 
case?
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-27 Thread John A. De Goes

This formulation does not let me control the production of compressed chunks 
independently from the provision of input; a receiver may only be capable of 
consuming a tiny amount at a time, and I may have to resend some chunks. Which 
is the whole point: iteratee & friends are lopsided. They provide excellent 
control of an input stream to the iteratee, but there is no structure 
permitting equivalent control of the output stream.

Regards,

John A. De Goes
Twitter: @jdegoes 
LinkedIn: http://linkedin.com/in/jdegoes

On Mar 27, 2011, at 3:22 PM, wren ng thornton wrote:

> On 3/27/11 11:38 AM, John A. De Goes wrote:
>> 
>> Enumeratees solve some use cases but not others. Let's say you want to 
>> incrementally compress a 2 GB file. If you use an enumeratee to do this, 
>> your "transformer" iteratee has to do IO. I'd prefer an abstraction to 
>> incrementally and purely produce the output from a stream of input.
> 
> I don't see why? In pseudocode we could have,
> 
>enumRead2GBFile :: FilePath -> Enumerator IO ByteString
>enumRead2GBFile file iter0 = do
>fd <- open file
>let loop iter = do
>mline <- read fd
>case mline of
>Nothing -> return iter
>Just line -> do
>iter' <- feed iter line
>if isDone iter'
>then return iter'
>else loop iter'
>iterF <- loop iter0
>close fd
>return iterF
> 
>compress :: Monad m => Enumeratee m ByteString ByteString
>compress = go state0
>where
>go state = do
>chunk <- get
>let (state',hash) = compressify state chunk
>put hash
>go state'
> 
>compressify :: Foo -> ByteString -> (Foo,ByteString)
> 
> it's just a pipeline like function composition or shell pipes. There's no 
> reason intermediate points of the pipeline have do anything impure.
> 
> -- 
> Live well,
> ~wren
> 
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-27 Thread wren ng thornton

On 3/27/11 11:38 AM, John A. De Goes wrote:


Enumeratees solve some use cases but not others. Let's say you want to incrementally 
compress a 2 GB file. If you use an enumeratee to do this, your "transformer" 
iteratee has to do IO. I'd prefer an abstraction to incrementally and purely produce the 
output from a stream of input.


I don't see why? In pseudocode we could have,

enumRead2GBFile :: FilePath -> Enumerator IO ByteString
enumRead2GBFile file iter0 = do
fd <- open file
let loop iter = do
mline <- read fd
case mline of
Nothing -> return iter
Just line -> do
iter' <- feed iter line
if isDone iter'
then return iter'
else loop iter'
iterF <- loop iter0
close fd
return iterF

compress :: Monad m => Enumeratee m ByteString ByteString
compress = go state0
where
go state = do
chunk <- get
let (state',hash) = compressify state chunk
put hash
go state'

compressify :: Foo -> ByteString -> (Foo,ByteString)

it's just a pipeline like function composition or shell pipes. There's 
no reason intermediate points of the pipeline have do anything impure.


--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-27 Thread John Millikin
On Sunday, March 27, 2011 8:38:38 AM UTC-7, John A. De Goes wrote:
>
>
> Enumeratees solve some use cases but not others. Let's say you want to 
> incrementally compress a 2 GB file. If you use an enumeratee to do this, 
> your "transformer" iteratee has to do IO. I'd prefer an abstraction to 
> incrementally and purely produce the output from a stream of input.


There's no reason the transformer has to do IO. Right now a lot of the 
interesting enumerator-based packages are actually bindings to C libraries, 
so they are forced to use IO, but there's nothing inherent in the enumeratee 
design to require it.

For example, the text codec enumeratees "encode" and "decode" in 
Data.Enumerator.Text are pure.

I'm working on ideas for writing pure enumeratees to bound libraries, but 
they will likely only work if the underlying library fully exposes its 
state, like zlib. Libraries with private or very complex internal states, 
such as libxml or expat, will probably never be implementable in pure 
enumeratees.
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-27 Thread John A. De Goes
On Mar 26, 2011, at 4:24 PM, Mario Blažević wrote:
> On 11-03-26 04:33 PM, John A. De Goes wrote:
> Out of curiosity, have you looked at the monad-coroutine library? It's a 
> more generic and IMO much cleaner model, though I wouldn't recommend it as a 
> replacement because the enumerator and iteratee libraries come with more 
> predefined plumbing. I think your point #1 still stands, but others can all 
> be made to disappear - as long as you define your suspension functors 
> properly.

I haven't looked at it. I will take a look.

> Do you mean a sort of a transducer monad transformer or an actual finite 
> state machine? The latter would seem rather restrictive.


Yes, I mean transducer monad transformer, especially if you equate "mealy 
machine" with "finite state machine". I equate mealy machine with "two-taped 
transducer".

Regards,

John A. De Goes
Twitter: @jdegoes 
LinkedIn: http://linkedin.com/in/jdegoes



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-27 Thread John A. De Goes

Enumeratees solve some use cases but not others. Let's say you want to 
incrementally compress a 2 GB file. If you use an enumeratee to do this, your 
"transformer" iteratee has to do IO. I'd prefer an abstraction to incrementally 
and purely produce the output from a stream of input.

Regards,

John A. De Goes
Twitter: @jdegoes 
LinkedIn: http://linkedin.com/in/jdegoes

On Mar 26, 2011, at 3:12 PM, wren ng thornton wrote:

> On 3/26/11 4:33 PM, John A. De Goes wrote:
>> 4. Iteratees cannot incrementally produce output, it's all or nothing, which 
>> makes them terrible for many real world problems that require both 
>> incremental input and incremental output.
> 
> For this one, enumeratees are the proposed solution. But for some reason 
> enumeratees are oft overlooked.
> 
> -- 
> Live well,
> ~wren
> 
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-26 Thread John Millikin
Hello Ertugrul Söylemez,

Good idea -- I've added an ``enumSocketTimed`` and ``iterSocketTimed`` to 
the network-enumerator package at < 
http://hackage.haskell.org/package/network-enumerator >. ``enumSocketTimed`` 
is equivalent to your ``enumHandleTimeout``, but instead of Handle uses the 
more efficient Socket type.

For setting a global timeout on an entire session, it's better to wrap the 
``run_`` call with ``System.Timeout.timeout`` -- this is more efficient than 
testing the time on every chunk, and does not require a specialised 
enumerator.

The signatures/docs are:



-- | Enumerate binary data from a 'Socket', using 'recv'. The socket must
-- be connected.
--
-- The buffer size should be a small power of 2, such as 4096.
--
-- If any call to 'recv' takes longer than the timeout, 'enumSocketTimed'
-- will throw an error. To add a timeout for the entire session, wrap the
-- call to 'E.run' in 'timeout'.
--
-- Since: 0.1.2
enumSocketTimed :: MonadIO m
=> Integer -- ^ Buffer size 
-> Integer -- ^ Timeout, in microseconds
-> S.Socket
-> E.Enumerator B.ByteString m b

-- | Write data to a 'S.Socket', using 'sendMany'. The socket must be 
connected.
--
-- If any call to 'sendMany' takes longer than the timeout, 
'iterSocketTimed'
-- will throw an error. To add a timeout for the entire session, wrap the
-- call to 'E.run' in 'timeout'.
--
-- Since: 0.1.2
iterSocketTimed :: MonadIO m
=> Integer -- ^ Timeout, in microseconds
-> S.Socket
-> E.Iteratee B.ByteString m ()


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-26 Thread Ertugrul Soeylemez
Hello John,

great stuff!  Many thanks.  I can't await the 0.5 release of your
library.  By the way, I believe that the 'enumHandleSession' and
'enumHandleTimeout' enumerators from the 'netlines' library really
belong into this one.  You will want to have timeout support, when
reading from a network handle.

Keep up the good work!


Greets,
Ertugrul


John Millikin  wrote:

> -
> Enumerators are an efficient, predictable, and safe alternative to
> lazy I/O. Discovered by Oleg Kiselyov, they allow large datasets to be
> processed in near–constant space by pure code. Although somewhat more
> complex to write, using enumerators instead of lazy I/O produces more
> correct programs.
> 
> http://hackage.haskell.org/package/enumerator
> http://john-millikin.com/software/enumerator/
> -
> 
> Hello -cafe,
> 
> It's been a while since the last point release of enumerator. This one
> is sufficiently large that I think folks might want to know about it,
> and since I try not to spam too many announcements, I'll give a quick
> rundown on major changes in other 0.4.x versions as well.
> 
> First, most of what I call "list analogues" -- enumerator-based
> versions of 'head', 'take', 'map', etc -- have been separated into
> three modules (Data.Enumerator.List, .Binary, and .Text) depending on
> what sorts of data they operate on. This separation has been an
> ongoing process throughout 0.4.x releases, and I think it's now
> complete. The old names in Data.Enumerator will continue to exist in
> 0.4.x versions, but will be removed in 0.5.
> 
> Second, Gregory Collins and Ertugrul Soeylemez found a space leak in
> Iteratee's (>>=), which could cause eventual space exhaustion in some
> circumstances. If you use enumerators to process very large or
> infinite streams, you probably want to upgrade to version 0.4.7 or
> higher.
> 
> Third, the source code PDF has seen some substantial improvement -- if
> you're interested in how the library is implemented, or have insomnia,
> read it at < http://john-millikin.com/software/enumerator/enumerator_0.4.8.pdf
> >
> 
> Finally, there is a known issue in the current encoding of iteratees
> -- if an iteratee yields extra data but never consumed anything, that
> iteratee will violate the monad law of associativity. Oleg has updated
> his implementations to fix this problem, but since it would break a
> *lot* of dependent libraries, I'm holding off until the vague future
> of version 0.5. Since iteratees that yield extra data they didn't
> consume are invalid anyway, I hope this problem will not cause too
> much inconvenience.
> 
> New features
> -
> 
> * Range-limited binary file enumeration (requested + initial patch by
> Bardur Arantsson).
> 
> * splitWhen , based on the "split" package <
> http://hackage.haskell.org/package/split >
> 
> * 0.4.6: Typeable instances for most types (requested by Michael Snoyman)
> 
> * 0.4.5: joinE , which simplifies enumerator/enumeratee composition
> (requested by Michael Snoyman)
> 
> ___
> Libraries mailing list
> librar...@haskell.org
> http://www.haskell.org/mailman/listinfo/libraries



-- 
nightmare = unsafePerformIO (getWrongWife >>= sex)
http://ertes.de/



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-26 Thread Mario Blažević

On 11-03-26 04:33 PM, John A. De Goes wrote:


I noticed this problem some time ago. Beyond just breaking monadic 
associativity, there are many other issues with standard definitions 
of iteratees:


1. It does not make sense in general to bind with an iteratee that
has already consumed input, but there's no type-level difference
between a "virgin" iteratee and one that has already consumed input;

2. Error recovery is ill-defined because errors do not describe
what portion of the input they have already consumed;

3. Iteratees sometimes need to manage resources, but they're not
designed to do so which leads to hideous workarounds;

4. Iteratees cannot incrementally produce output, it's all or
nothing, which makes them terrible for many real world problems
that require both incremental input and incremental output.


Overall, I regard iteratees as only a partial success. They're leaky 
and somewhat unsafe abstractions.


Out of curiosity, have you looked at the monad-coroutine library? 
It's a more generic and IMO much cleaner model, though I wouldn't 
recommend it as a replacement because the enumerator and iteratee 
libraries come with more predefined plumbing. I think your point #1 
still stands, but others can all be made to disappear - as long as you 
define your suspension functors properly.



I'm experimenting with Mealy machines because I think they have more 
long-term promise to solve the problems of iteratees.


Do you mean a sort of a transducer monad transformer or an actual 
finite state machine? The latter would seem rather restrictive.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-26 Thread John Millikin
On 2011-03-26, Gregory Collins  wrote:
>> Since the API is being broken anyway, I'm also going to take the
>> opportunity to change the Stream type so it can represent "EOF + some
>> data". That should allow lots of interesting behaviors, such as
>> arbitrary lookahead.
>
> The thing which I find is missing the most from enumerator as it
> stands is not this -- it's the fact that Iteratees sometimes need to
> allocate resources which need explicit manual deallocation (i.e.
> sockets, file descriptors, mmaps, etc), but because Enumerators are
> running the show, there is no "local" way to ensure that the
> cleanup/bracket routines get run on error. This hurts composability,
> because you are forced to either allocate these resources outside the
> body of the enumerator (where you can bracket "run_") or play
> finalizer-on-mvar tricks with the garbage collector. This kind of
> sucks.

I agree that it sucks, but it's a tradeoff of the left-fold enumerator
design. Potential solutions are welcome.

> The iteratee package has an error constructor on the Stream type for
> this purpose; I think you could do that -- with the downside that you
> need to pattern-match against another constructor in mainline code,
> hurting performance -- or is there some other reasonable way to deal
> with it?

I don't think this would help. Remember that the iteratee has *no*
control whatsoever over its lifetime. There is no guarantee that a
higher-level enumerator or enumeratee will actually feed it data until
it has enough; the computation can be interrupted at any level.

Looking at the iteratee package's Stream constructor, I think it
doesn't do what you think it does. While it might help with resource
management in a specific case, it won't help if (for example) an
enumeratee above your iteratee decides to yield.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-26 Thread wren ng thornton

On 3/26/11 4:33 PM, John A. De Goes wrote:

4. Iteratees cannot incrementally produce output, it's all or nothing, which 
makes them terrible for many real world problems that require both incremental 
input and incremental output.


For this one, enumeratees are the proposed solution. But for some reason 
enumeratees are oft overlooked.


--
Live well,
~wren

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-26 Thread John A. De Goes

I noticed this problem some time ago. Beyond just breaking monadic 
associativity, there are many other issues with standard definitions of 
iteratees:

1. It does not make sense in general to bind with an iteratee that has already 
consumed input, but there's no type-level difference between a "virgin" 
iteratee and one that has already consumed input;

2. Error recovery is ill-defined because errors do not describe what portion of 
the input they have already consumed;

3. Iteratees sometimes need to manage resources, but they're not designed to do 
so which leads to hideous workarounds;

4. Iteratees cannot incrementally produce output, it's all or nothing, which 
makes them terrible for many real world problems that require both incremental 
input and incremental output.

Overall, I regard iteratees as only a partial success. They're leaky and 
somewhat unsafe abstractions.

I'm experimenting with Mealy machines because I think they have more long-term 
promise to solve the problems of iteratees.

Regards,

John A. De Goes
Twitter: @jdegoes 
LinkedIn: http://linkedin.com/in/jdegoes

On Mar 26, 2011, at 1:03 PM, John Millikin wrote:

> On Mar 26, 10:46 am, Michael Snoyman  wrote:
>> As far as the left-over data in a yield issue: does that require a
>> breaking API change, or a change to the definition of >>= which would
>> change semantics??
> 
> It requires a pretty serious API change, as the definition of
> 'Iteratee' itself is at fault. Unfortunately, Oleg's new definitions
> also have problems (they can yield extra on a continue step), so I'm
> at a bit of a loss as to what to do. Either way, underlying primitives
> allow users to create iteratees with invalid/undefined behavior. Not
> very Haskell-y.
> 
> All of the new high-level functions added in recent versions are part
> of an attempted workaround. I'd like to move the Iteratee definitions
> themselves to a ``Data.Enumerator.Internal`` module, and add some
> words discouraging their direct use. There would still be some API
> breaks (the >>== , $$, and >==> operators would go away) but at least
> clients wouldn't be subjected to a complete rewrite.
> 
> Since the API is being broken anyway, I'm also going to take the
> opportunity to change the Stream type so it can represent "EOF + some
> data". That should allow lots of interesting behaviors, such as
> arbitrary lookahead.
> 
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
> 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-26 Thread Gregory Collins
On Sat, Mar 26, 2011 at 8:03 PM, John Millikin  wrote:
> On Mar 26, 10:46 am, Michael Snoyman  wrote:
>> As far as the left-over data in a yield issue: does that require a
>> breaking API change, or a change to the definition of >>= which would
>> change semantics??
>
> It requires a pretty serious API change, as the definition of
> 'Iteratee' itself is at fault. Unfortunately, Oleg's new definitions
> also have problems (they can yield extra on a continue step), so I'm
> at a bit of a loss as to what to do. Either way, underlying primitives
> allow users to create iteratees with invalid/undefined behavior. Not
> very Haskell-y.

You can also write an iteratee which doesn't move to the "Done" state
when it gets EOF. This is equally "bad", i.e. not actually much of a
problem in practice.

> Since the API is being broken anyway, I'm also going to take the
> opportunity to change the Stream type so it can represent "EOF + some
> data". That should allow lots of interesting behaviors, such as
> arbitrary lookahead.

The thing which I find is missing the most from enumerator as it
stands is not this -- it's the fact that Iteratees sometimes need to
allocate resources which need explicit manual deallocation (i.e.
sockets, file descriptors, mmaps, etc), but because Enumerators are
running the show, there is no "local" way to ensure that the
cleanup/bracket routines get run on error. This hurts composability,
because you are forced to either allocate these resources outside the
body of the enumerator (where you can bracket "run_") or play
finalizer-on-mvar tricks with the garbage collector. This kind of
sucks.

The iteratee package has an error constructor on the Stream type for
this purpose; I think you could do that -- with the downside that you
need to pattern-match against another constructor in mainline code,
hurting performance -- or is there some other reasonable way to deal
with it?

G
-- 
Gregory Collins 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-26 Thread John Millikin
On Mar 26, 10:46 am, Michael Snoyman  wrote:
> As far as the left-over data in a yield issue: does that require a
> breaking API change, or a change to the definition of >>= which would
> change semantics??

It requires a pretty serious API change, as the definition of
'Iteratee' itself is at fault. Unfortunately, Oleg's new definitions
also have problems (they can yield extra on a continue step), so I'm
at a bit of a loss as to what to do. Either way, underlying primitives
allow users to create iteratees with invalid/undefined behavior. Not
very Haskell-y.

All of the new high-level functions added in recent versions are part
of an attempted workaround. I'd like to move the Iteratee definitions
themselves to a ``Data.Enumerator.Internal`` module, and add some
words discouraging their direct use. There would still be some API
breaks (the >>== , $$, and >==> operators would go away) but at least
clients wouldn't be subjected to a complete rewrite.

Since the API is being broken anyway, I'm also going to take the
opportunity to change the Stream type so it can represent "EOF + some
data". That should allow lots of interesting behaviors, such as
arbitrary lookahead.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: enumerator 0.4.8

2011-03-26 Thread Michael Snoyman
Great work as usual John. I'm actually very happy to see
enumHandleRange: the next version of WAI will support partial files,
and I just implemented my own version of enumHandleRange over there. I
will gladly switch to your (most likely more correct) version.

As far as the left-over data in a yield issue: does that require a
breaking API change, or a change to the definition of >>= which would
change semantics??

Michael

On Sat, Mar 26, 2011 at 7:39 AM, John Millikin  wrote:
> -
> Enumerators are an efficient, predictable, and safe alternative to
> lazy I/O. Discovered by Oleg Kiselyov, they allow large datasets to be
> processed in near–constant space by pure code. Although somewhat more
> complex to write, using enumerators instead of lazy I/O produces more
> correct programs.
>
> http://hackage.haskell.org/package/enumerator
> http://john-millikin.com/software/enumerator/
> -
>
> Hello -cafe,
>
> It's been a while since the last point release of enumerator. This one
> is sufficiently large that I think folks might want to know about it,
> and since I try not to spam too many announcements, I'll give a quick
> rundown on major changes in other 0.4.x versions as well.
>
> First, most of what I call "list analogues" -- enumerator-based
> versions of 'head', 'take', 'map', etc -- have been separated into
> three modules (Data.Enumerator.List, .Binary, and .Text) depending on
> what sorts of data they operate on. This separation has been an
> ongoing process throughout 0.4.x releases, and I think it's now
> complete. The old names in Data.Enumerator will continue to exist in
> 0.4.x versions, but will be removed in 0.5.
>
> Second, Gregory Collins and Ertugrul Soeylemez found a space leak in
> Iteratee's (>>=), which could cause eventual space exhaustion in some
> circumstances. If you use enumerators to process very large or
> infinite streams, you probably want to upgrade to version 0.4.7 or
> higher.
>
> Third, the source code PDF has seen some substantial improvement -- if
> you're interested in how the library is implemented, or have insomnia,
> read it at < http://john-millikin.com/software/enumerator/enumerator_0.4.8.pdf
>>
>
> Finally, there is a known issue in the current encoding of iteratees
> -- if an iteratee yields extra data but never consumed anything, that
> iteratee will violate the monad law of associativity. Oleg has updated
> his implementations to fix this problem, but since it would break a
> *lot* of dependent libraries, I'm holding off until the vague future
> of version 0.5. Since iteratees that yield extra data they didn't
> consume are invalid anyway, I hope this problem will not cause too
> much inconvenience.
>
> New features
> -
>
> * Range-limited binary file enumeration (requested + initial patch by
> Bardur Arantsson).
>
> * splitWhen , based on the "split" package <
> http://hackage.haskell.org/package/split >
>
> * 0.4.6: Typeable instances for most types (requested by Michael Snoyman)
>
> * 0.4.5: joinE , which simplifies enumerator/enumeratee composition
> (requested by Michael Snoyman)
>
> ___
> Libraries mailing list
> librar...@haskell.org
> http://www.haskell.org/mailman/listinfo/libraries
>

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe