Re: [Haskell-cafe] The Typeable class is changing

2011-07-11 Thread dm-list-haskell-cafe
At Mon, 11 Jul 2011 21:18:30 +0300,
Yitzchak Gale wrote:
> 
> Please respond to this thread if you own a package
> that will be affected by this change.

iterIO uses mkTyCon for the simple reason that ((Typeable t,
Typeable1 m) => Iter t m) is Typeable1, and there is no automatic way
of deriving Typeable1.
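
To make this concrete, the instance in question looks roughly like
the following (a sketch of the general shape, not iterIO's actual
code; the TyCon string is illustrative):

  {-# LANGUAGE ScopedTypeVariables #-}
  import Data.Typeable

  instance (Typeable t, Typeable1 m) => Typeable1 (Iter t m) where
    typeOf1 _ = mkTyConApp (mkTyCon "Data.IterIO.Iter")
                  [typeOf (undefined :: t), typeOf1 (undefined :: m ())]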

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] IterIO: How to write use my inumReverse

2011-07-04 Thread dm-list-haskell-cafe
At Mon, 4 Jul 2011 20:36:33 +1000,
John Ky wrote:
> 
> Hi Haskell Cafe,
> 
>       enum |$ inumLines .| inumReverse .| inumUnlines .| iter
> ...
>
> iterLines :: (Monad m) => Iter L.ByteString m [L.ByteString]
> iterLines = do
>   line <- lineI
>   return [line]
>
> iterUnlines :: (Monad m) => Iter [L.ByteString] m L.ByteString
> iterUnlines = (L.concat . (++ [C.pack "\n"])) `liftM` dataI
>
> iterReverse :: (Monad m) => Iter [L.ByteString] m [L.ByteString]
> iterReverse = do
>   lines <- dataI
>   return (map L.reverse lines)
>
> inumLines = mkInum iterLines
> inumUnlines = mkInum iterUnlines
> inumReverse = mkInum iterReverse
> 
> It all works fine.
> 
> My question is: Is it possible to rewrite inumReverse to be this:
> 
> iterReverse :: (Monad m) => Iter L.ByteString m L.ByteString
> iterReverse = do
>   line <- dataI
>   return (L.reverse line)
>
> inumReverse = mkInum iterReverse
> 
> And still be able to use it in the line:
> 
> enum |$ inumLines .| {-- inumReverse goes in here somehow --} .|
> inumUnlines .| iter
> 
> The reason I ask is that the Haskell function reverse has the type [a] -> [a],
> not  [[a]] -> [[a]].
> 
> I thought perhaps the alternative inumReverse is cleaner than the original as
> it behaves more similarly to Haskell's own reverse function.

I'm not sure what you are trying to achieve.  If you want an iter that
works on L.ByteStrings, then you can say:

 iterReverse :: (Monad m) => Iter L.ByteString m L.ByteString
 iterReverse = do
   line <- lineI
   return (L.reverse line)

In that case you don't need inumLines and inumUnlines.  If, however,
you want the type to be [L.ByteString], and you would rather do this
one line at a time, instead of calling map, then you could do
something like the following:

 iterReverse :: (Monad m) => Iter [L.ByteString] m [L.ByteString]
 iterReverse = do
   line <- headI
   return [L.reverse line]

But the code you have above should also work, so it all depends on
what you are trying to achieve.

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Strange context reduction with partial application and casting

2011-07-03 Thread dm-list-haskell-cafe
At Sat, 2 Jul 2011 17:23:50 -0400,
Brent Yorgey wrote:
> 
> On Sat, Jul 02, 2011 at 09:02:13PM +0200, Daniel Fischer wrote:
> > 
> > - disabling the monomorphism restriction
> > :set -XNoMonomorphismRestriction
> > let g = f
> 
> This is the recommended solution.  The confusion caused by the MR far
> outweighs its occasional benefits.

Recommended by some people, but by no means everyone.  

For instance, Vytiniotis, Peyton Jones, and Schrijvers make a good
argument that the monomorphism restriction should effectively be
expanded to include both pattern and function bindings in let and
where clauses:

http://research.microsoft.com/pubs/102483/tldi10-vytiniotis.pdf

The above paper is currently implemented in GHC, and on by default if
you enable GADTs.  (So if you use GADTs, you would additionally need
-XNoMonoLocalBinds to recover generalized local bindings.)  Moreover,
even with -XNoMonoLocalBinds, you still run into the fact that
bindings are not generalized within a declaration group, which could
lead to confusion.  In particular, consider the following program:

{-# LANGUAGE NoMonomorphismRestriction #-}
x = 2 -- The type of x is:  Num a => a
y = (x + y) :: Int

The type of x is what you would expect without the monomorphism
restriction.  Now say x has a gratuitous use of y:

{-# LANGUAGE NoMonomorphismRestriction #-}
x = 2 where _ = y -- The type of x is:  Int
y = (x + y) :: Int

If you want x to be polymorphic in this case, you have to give it a
type signature anyway:

{-# LANGUAGE NoMonomorphismRestriction #-}
x :: Num a => a
x = 2 where _ = y
y = (x + y) :: Int

Thus, what I would recommend, instead of -XNoMonoLocalBinds, is to
give type signatures to your polymorphic bindings.  This makes the
code more readable.  It has the disadvantage that Haskell doesn't
allow you to name monomorphic type variables, which, for local
bindings, can require either the use of -XScopedTypeVariables or
giving extra function arguments whose only purpose is to bring a type
variable into scope.  But both of those are probably more future-proof
than -XNoMonomorphismRestriction.
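
For reference, the ScopedTypeVariables route looks something like
this (an illustrative example, not code from the thread):

  {-# LANGUAGE ScopedTypeVariables #-}

  -- The explicit forall brings 'a' into scope, so the local binding
  -- can name an otherwise-unnameable type variable.
  sumPairs :: forall a. Num a => [(a, a)] -> a
  sumPairs xs = sum (map add xs)
    where add :: (a, a) -> a  -- this 'a' is the one bound above
          add (u, v) = u + v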

David

_______
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Patterns for processing large but finite streams

2011-06-30 Thread dm-list-haskell-cafe
At Fri, 1 Jul 2011 09:39:32 +0400,
Eugene Kirpichov wrote:
> 
> Hi,
> 
> I'm rewriting timeplot to avoid holding the whole input in memory, and
> naturally a problem arises:
> 
> How to represent large but finite streams and functions that process
> them, returning other streams or some kinds of aggregate values?
> 
> Examples:
> * Adjacent differences of a stream of numbers
> * Given a stream of numbers with times, split it into buckets by time
> of given width and produce a stream of (bucket, 50%,75% and 90%
> quantiles in this bucket)
> * Sum a stream of numbers
> 
> Is this, perhaps, what comonads are for? Or iteratees?

Sounds like a good job for iteratees.  Summing a stream of numbers is
just an Iteratee.  Transcoding a stream into another stream is a job
for an Inum (Iteratee-enumerator) or enumeratee, depending on which
package's nomenclature you use.  You have three implementations to
choose from:

 - http://hackage.haskell.org/package/iteratee    (original)

 - http://hackage.haskell.org/package/enumerator  (John Millikin's re-write)

 - http://hackage.haskell.org/package/iterIO  (my 3rd-generation attempt)
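
For instance, summing comes down to a short Iter.  Here is a minimal
sketch in iterIO terms (it assumes iterIO's atEOFI and dataI, plus a
ChunkData instance for lists of numbers; enumPure in the comment is
iterIO's pure-input Onum):

  sumI :: (Monad m) => Iter [Double] m Double
  sumI = loop 0
    where loop acc = do
            eof <- atEOFI
            if eof
              then return acc
              else do xs <- dataI
                      loop (acc + sum xs)

  -- e.g.:  enumPure [1.0, 2.0, 3.0] |$ sumI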

David

_______
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Confused about my IterIO code

2011-06-30 Thread dm-list-haskell-cafe
At Thu, 30 Jun 2011 23:53:02 +1000,
John Ky wrote:
> 
> But all I've done is:
> 
> enum |$ inumReverseLines .| iter
>
> inumReverseLines = mkInum $ do
>   line <- lineI
>   return (L.reverse (L.concat [line, C.pack "\n"]))

mkInum repeatedly invokes its iter argument so as to keep producing
chunks.  If you want to reverse only one line, it might be easiest to
use something along the lines of:

mkInumM $ do
  line <- lineI
  ifeed (L.reverse (L.concat [line, C.pack "\n"]))

mkInumM is a more manual Inum construction function that doesn't
automatically do things like loop or handle EOF conditions.
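
If you did want the looping behavior back while keeping mkInumM, you
could also spell the loop out by hand, along these lines (a sketch
assuming iterIO's atEOFI; unless is from Control.Monad; mkInum does
roughly this for you):

  mkInumM loop
    where loop = do
            eof <- atEOFI
            unless eof $ do
              line <- lineI
              -- a fuller version would stop when ifeed returns True
              _ <- ifeed (L.reverse (L.concat [line, C.pack "\n"]))
              loop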

David

_______
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How to flush with IterIO in echo server

2011-06-29 Thread dm-list-haskell-cafe
At Wed, 29 Jun 2011 21:13:47 +1000,
John Ky wrote:
> 
> Hi Haskell Cafe,
> 
> I've written an echo server using just sockets:
> 
> ...
> 
> When I send text to it, it will echo it back immediately after my newline.
> 
> I then modified it to user IterIO:
> 
> import Control.Concurrent
> import Control.Exception
> import Control.Monad
> import Control.Monad.Trans
> import Data.IterIO
> import Data.IterIO.Inum
> import Network
> import System.IO
> import System.IO.Error (isEOFError)
> import qualified Data.ByteString.Lazy as L
>
> iterHandle' :: (MonadIO m) => Handle -> IO (Iter L.ByteString m (), Onum
> L.ByteString m a)
> iterHandle' = iterHandle
>
> main = withSocketsDo $ do
>   sListen <- listenOn (PortNumber 8000)
>   putStrLn "Listening on Port 8000"
>   forkIO $ forever $ do
>     (sSession, hostname, port) <- accept sListen
>     putStrLn ("Connected to " ++ hostname ++ ":" ++ show port)
>     forkIO $ do
>       (iter, enum) <- iterHandle' sSession
>       enum |$ iter
>       return ()
>   putStrLn "Press Ctrl-D to quit."
>   exitOnCtrlD
>
> exitOnCtrlD = try getLine >>= either
>   (\e -> unless (isEOFError e) $ ioError e)
>   (const exitOnCtrlD)
> 
> It works, however it doesn't send anything back to me until end of file.
> 
> I fixed that problem with my sockets version by flushing after each line, but
> I don't know if IterIO will let me flush on every newline.

The buffering is actually happening in the Handle code.  One way to
avoid this is to change your code to call hSetBuffering as follows:

  hSetBuffering sSession NoBuffering
  (iter, enum) <- iterHandle' sSession

This is mentioned in the documentation of handleI, but arguably should
also be there in the docs for iterHandle.  I think I'll add a mention
there.

If you use Sockets and call iterStream, that should also avoid the
problem.

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] IterIO type restricted functions

2011-06-28 Thread dm-list-haskell-cafe
At Wed, 29 Jun 2011 10:11:26 +1000,
John Ky wrote:
> 
> Hi all,
> 
> From the IterIO tutorial:
> 
> enumFile' is like enumFile above, but type restricted to data in the lazy
> ByteString format, which is more efficient than plain Strings. (enumFile
> supports multiple types, but in this example there is not enough
> information for Haskell to choose one of them, so we must use enumfile' or
> use :: to specify a type explicitly.)
> 
> Which is fine, but shouldn't there also be iterHandle' and iterStream'?
> 
> Also I'm guessing the f is a typo that should be F.

Thanks, just fixed the typo in git.

I guess the logic is that these are super-easy to define in your own
code, so better not to pollute the namespace with lots of symbols.
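
For example, John's echo server from an earlier thread defines a
type-restricted handle function in one line:

 iterHandle' :: (MonadIO m) => Handle -> IO (Iter L.ByteString m (), Onum L.ByteString m a)
 iterHandle' = iterHandle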

The same logic also applies to enumFile'.  However, in testing I found
that I often wanted something like enumFile' to prototype something
quick and dirty, just to test from a file.  So I added it as a
convenience to myself.

I don't have very strong feelings either way.  If enumFile' is
inconsistent, my inclination would be to get rid of enumFile' rather
than add iterHandle' etc.  That way, if someone wants to do everything
in terms of strict byte strings, or text, or whatever, they can just
define the primed versions to be whatever data format they prefer.

I guess the reason it doesn't feel too horrible as-is is that enumFile
is already a convenience function effectively combining openFile with
enumHandle.

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What is a "simple pattern binding"?

2011-06-26 Thread dm-list-haskell-cafe
At Sun, 26 Jun 2011 21:15:06 +0100,
Paterson, Ross wrote:
> 
> > True.  That ambiguity could be avoided by adding the word "declaration"
> > after "type signature".
> 
> On second thoughts, this is unnecessary.  The Report consistently uses
> "expression type signature" for the expression and "type signature"
> for the declaration.

I already sent the haskell-prime mailing list a proposal for the
following wording:

A binding b1 depends on a binding b2 in the same list of
declarations if either

1. b1 contains a free identifier v, v is bound by b2, and the
   list of declarations does not contain a type signature for
   v; or

http://www.haskell.org/pipermail/haskell-prime/2011-June/003482.html

I think this is clearer, but it might make more sense to discuss on
haskell-prime.

David

_______
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What is a "simple pattern binding"?

2011-06-26 Thread dm-list-haskell-cafe
At Sun, 26 Jun 2011 09:31:05 +0100,
Paterson, Ross wrote:
> 
> Indeed, the Report has two problems:
> 
> Sections 4.4.3.2 and 4.5.5 have different definitions of "simple pattern".
> This has been there since section 4.5.5 (Monomorphism Restriction) was
> added in Haskell 1.1.  But then the only technical use of the term is
> in section 4.5.5.
>
> When the definition of declaration group (section 4.5.1) was changed in
> Haskell 2010 to break dependencies on type signatures, Rule 1 of the
> Monomorphism Restriction (section 4.5.5), while not incorrect, became
> partially redundant and overly complex.  It could have been simplified
> along the lines you describe.

Aha!  This is starting to make sense!  Indeed the Haskell98 text is
far clearer, and when I look at the differences in section 4.5.1, I
start to understand what the committee meant.  Still, the clause

b1 contains a free identifier that has no type signature and
is bound by b2

applies the phrase "has no type signature" to the identifier, not to
the binding.  Such phrasing does not exclude expression
type-signatures.  I presume that in the following code, binding b1
does not depend on b2:

(x, y) = (z, 1)  -- call this binding b1
(z, _) = (2, y)  -- call this binding b2
w = 1 + (z :: Double)

So my reading was that they meant "has no type signature *in b1*".
I take it that your reading is that they meant:

b1 contains a free identifier that is bound by b2 and b2 is
accompanied by a type signature for that identifier

I think, given the ambiguities here, it's worth filing a ticket on the
haskell' web site.

Thank you.
David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What is a "simple pattern binding"?

2011-06-25 Thread dm-list-haskell-cafe
At Sun, 26 Jun 2011 01:41:01 +0100,
Paterson, Ross wrote:
> 
> > I thought "no type signature" meant no type signature inside b1.
> 
> No, it means no type signature for the variable.
> 
> > Otherwise, you are saying nothing could depend on a binding with a
> > type signature.  By that logic, there can be no mutual dependence,
> > and so every declaration with a type signature is its own (singleton)
> > declaration group.
> 
> A pattern binding can bind more than one variable.  If all the variables
> bound by a binding have type signatures, that binding is indeed a
> singleton declaration group.

If this is the case, then multiple sentences in the 2010 report don't
make sense, though the way in which they don't make sense sort of
depends on what "simple pattern binding" means.  Which of the
following constitute a simple pattern binding?

a.  a | False = undefined
      | otherwise = \x -> x

b.  Just b = Just (\x -> x)

c.  Just c | False = undefined
           | otherwise = Just (\x -> x)

d.  (d, d') = (\x -> x, d)

e.  (e, e') | False = undefined
            | otherwise = (\x -> x, e)

If it's any clue, GHC infers a polymorphic type for a only.  It infers
type "GHC.Prim.Any -> GHC.Prim.Any" for the others.  Moreover, GHC
accepts the type signature "a :: t -> t", but rejects such a
polymorphic signature for the other variables, and also rejects
programs such as:

Just b = Just (\x -> x)

f :: (Show a) => a -> a
f = b -- illegal

So let's work under the assumption that a is a simple pattern binding,
and the others are not.  If you have a different definition, I'll make
a different argument.  (Note also that if you agree with this
definition, then there is a bug in section 4.4.3.2 of the report,
since a is not of the form "p = e".  But if you take the 4.4.3.2
definition, then I'll argue section 4.5.5 has a bug.)

Let's posit a definition that accepts only a (and in particular that
rejects d and e).  Such a definition is further supported by the
phrase "a simple pattern binding is a pattern binding in which the
pattern consists of only a single variable" (from section 4.5.5).

If we accept that a simple pattern binding cannot bind more than one
variable, then the definition of the monomorphism restriction in
section 4.5.5 is not consistent with your interpretation of the term
declaration group.  After all, given our posited definition, the
following language in 4.5.5:

(a): every variable in the group is bound by a function binding
     or a simple pattern binding (Section 4.4.3.2), and

(b): an explicit type signature is given for every variable in
     the group that is bound by simple pattern binding.

should instead read:

   (a): every binding is a function binding, or

   (b): the group consists of a simple pattern binding with an
explicit type signature.

In particular, why would the report say "an explicit type signature is
given for EVERY variable" when there can be only one such variable?

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What is a "simple pattern binding"?

2011-06-25 Thread dm-list-haskell-cafe
At Sun, 26 Jun 2011 00:17:12 +0100,
Paterson, Ross wrote:
> 
> > > > g1 x y z = if x>y then show x ++ show z else g2 y x
> > > >
> > > > g2 :: (Show a, Ord a) => a -> a -> String
> > > > g2 | False = \p q -> g1 q p ()
> > > >    | otherwise = \p q -> g1 q p 'a'
> > > >    where x = True
> > >
> > > It appears to me that GHC is justified. According to 4.5.1 and 4.5.2, g1
> > > by itself constitutes a declaration group. It is considered by itself
> > > and is generalized prior to combining it with g2.
> 
> > Great, now I'm even more confused.  4.5.1 says:
> 
> > A binding b1 depends on a binding b2 in the same list of
> > declarations if either
> 
> >  1. b1 contains a free identifier that has no type signature
> > and is bound by b2, or
> 
> >  2. b1 depends on a binding that depends on b2.
> 
> > A declaration group is a minimal set of mutually dependent
> > bindings. Hindley-Milner type inference is applied to each
> > declaration group in dependency order.
> 
> > So here the first binding (of g1) contains the free identifier g2,
> > which is bound by the second binding.  Conversely, the second binding
> > contains g1 free.  So the two bindings are mutually dependent, no?
> 
> No, the binding of g1 doesn't depend on the binding of g2, because g2
> has a type signature (clause 1).

I thought "no type signature" meant no type signature inside b1.
Otherwise, you are saying nothing could depend on a binding with a
type signature.  By that logic, there can be no mutual dependence, and
so every declaration with a type signature is its own (singleton)
declaration group.

But this can't be what the committee was thinking given the following
language in section 4.5.2:

If the programmer supplies explicit type signatures for more
than one variable in a declaration group, the contexts of
these signatures must be identical up to renaming of the type
variables.

Such a restriction would be vacuous if every type signature created a
singleton declaration group.

Moreover, section 4.5.5 is also inconsistent with such an
interpretation:

The monomorphism restriction

Rule 1.

 We say that a given declaration group is unrestricted if
 and only if:

 (a): every variable in the group is bound by a function
      binding or a simple pattern binding (Section
      4.4.3.2), and

 (b): an explicit type signature is given for every
      variable in the group that is bound by simple
      pattern binding.

If every binding with an explicit type signature is its own
declaration group, then why isn't the monomorphism restriction stated
more simply as follows?

   (a): every binding is a function binding, or

   (b): the group consists of a pattern binding with an explicit type
signature.

When they say "an explicit type signature is given for every variable
in the group..." they have to be thinking there may be more than one
of them.

> The type of g1 is inferred using the declared type of g2.  Then that
> type is used in inferring a type for g2, which will be compared with
> its declared signature.

Thanks for the reply, but now I'm even more confused.

Perhaps I should ask if someone can give me a better definition of
declaration group, ideally with support in the language spec...

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What is a "simple pattern binding"?

2011-06-25 Thread dm-list-haskell-cafe
At Sat, 25 Jun 2011 14:20:52 -0400,
Scott Turner wrote:
> 
> > g1 x y z = if x>y then show x ++ show z else g2 y x
> > 
> > g2 :: (Show a, Ord a) => a -> a -> String
> > g2 | False = \p q -> g1 q p ()
> >    | otherwise = \p q -> g1 q p 'a'
> >    where x = True
>
> It appears to me that GHC is justified. According to 4.5.1 and 4.5.2, g1
> by itself constitutes a declaration group. It is considered by itself
> and is generalized prior to combining it with g2.
> 
> I agree that the report is confusing in its use of "simple pattern binding".

Great, now I'm even more confused.  4.5.1 says:

A binding b1 depends on a binding b2 in the same list of
declarations if either

 1. b1 contains a free identifier that has no type signature
and is bound by b2, or

 2. b1 depends on a binding that depends on b2.

A declaration group is a minimal set of mutually dependent
bindings. Hindley-Milner type inference is applied to each
declaration group in dependency order.

So here the first binding (of g1) contains the free identifier g2,
which is bound by the second binding.  Conversely, the second binding
contains g1 free.  So the two bindings are mutually dependent, no?

In fact, section 4.5.2 goes on to use the following example for a
declaration group:

  f x = let g1 x y = if x>y then show x else g2 y x
            g2 p q = g1 q p
        in ...

This example is very close to the code I gave.  How can my example
have two declaration groups when this example has only one?

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] What is a "simple pattern binding"?

2011-06-25 Thread dm-list-haskell-cafe
Section 4.4.3.2 of the 2010 Haskell report says:

A simple pattern binding has form p = e. The pattern p is
matched “lazily” as an irrefutable pattern, as if there were
an implicit ~ in front of it.

This makes it sound as though p is a pattern, which I assume means
what section 3.17 defines as the non-terminal "pat".

pat -> lpat qconop pat | lpat

This is further suggested by the explicit mention of ~, which would be
redundant if p had to be a var, since variables always match
(according to section 3.17.2 rule 1).  So my reading of section
4.4.3.2 is that the following is considered a simple pattern binding
(because it has no guards):

(f, g) = (\x -> x, f)


However, section 4.5.5 seems to contradict this.  It reads:

Recall that a variable is bound by either a function binding
or a pattern binding, and that a simple pattern binding is a
pattern binding in which the pattern consists of only a single
variable (Section 4.4.3).

Moreover, it goes on to give an example and explanation:

[(n,s)] = reads t
... Hence, when non-simple pattern bindings are used

This text makes it sound as though a "simple pattern binding" can have
only a single variable to the left of the = sign, meaning:

f = \x -> x


Further confusing things, GHC accepts the following:

g1 x y z = if x>y then show x ++ show z else g2 y x

g2 :: (Show a, Ord a) => a -> a -> String
g2 | False = \p q -> g1 q p ()
   | otherwise = \p q -> g1 q p 'a'
   where x = True


and infers type:

g1 :: (Show a, Show a1, Ord a1) => a1 -> a1 -> a -> [Char]

According to 4.4.3.2, g2 definitely does not have a simple pattern
binding, as its binding is not of the form p = e where p is a pattern.
Yet by section 4.5.5, if g2 were not considered a simple pattern
binding, the constrained type variables in the binding group
containing g1 and g2 (in particular the inferred type (Show a => a) of
z in g1) would not be allowed to be generalized.

So is section 4.4.3.2 of the Haskell 2010 report just wrong?  Or is
GHC allowing code prohibited by the standard?  Or am I somehow
misreading the standard?

Anyway, if someone can provide a less ambiguous definition of the term
"simple pattern binding", I would appreciate it, particularly if you
can point to support for your definition in the Haskell 2010 report...

Thanks,
David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Iteratees again (Was: How on Earth Do You Reason about Space?)

2011-06-02 Thread dm-list-haskell-cafe
At Thu, 02 Jun 2011 13:52:52 +0200,
Ketil Malde wrote:
> 
> I have a bunch of old code, parsers etc, which are based on the
> 'readFile' paradigm:
> 
>   type Str = Data.ByteString.Lazy.Char8.ByteString -- usually
> 
>   decodeFoo :: Str -> Foo
>   encodeFoo :: Foo -> Str
> 
>   readFoo = fmap decodeFoo . readFile
>   writeFoo f = writeFile f . encodeFoo
>   hReadFoo = fmap decodeFoo . hGetContents
>   :
>   (etc)
> 
> This works pretty well, as long as Foo is strict enough that you don't
> retain all or huge parts of input, and as long as you can process input
> in a forward, linear fashion.  And, like my frequency count above, I
> can't really see how this can be made much simpler.

This is fine if you never have parse errors and always read to the end
of the file.  Otherwise, the code above is incorrect and ends up
leaking file descriptors.  In general, it is very hard to write
parsers that parse every possible input and never fail.  Thus, for
anything other than a toy program, your code actually has to be:

readFoo path = bracket (openFile path ReadMode) hClose $
    hGetContents >=> \s -> return $! decodeFoo s

Which is still not guaranteed to work if Foo contains thunks, so then
you end up having to write:

readFoo path = bracket (openFile path ReadMode) hClose $ \h -> do
  s <- hGetContents h
  let foo = decodeFoo s
  deepseq foo $ return foo

Or, finally, what a lot of code falls back to, inserting gratuitous
calls to length:

readFoo path = bracket (openFile path ReadMode) hClose $ \h -> do
  s <- hGetContents h
  length s `seq` return (decodeFoo s)

The equivalent code with the iterIO package would be:

readFoo path = enumFile path |$ fooI

which seems a lot simpler to me...

> Would there be any great advantage to rewriting my stuff to use
> iterators?  Or at least, use iterators for new stuff?

In addition to avoiding edge cases like leaked file descriptors and
memory, one of the things I discovered in implementing iterIO is that
it's really handy to have your I/O functions be the same as your
parsing combinators.  So iteratees might actually admit a far simpler
implementation of decodeFoo/fooI.

More specifically, imagine that you have decodeFoo, and now want to
implement decodeBar where a Bar includes some Foos.  Unfortunately,
having an implementation of decodeFoo in-hand doesn't help you
implement decodeBar.  You'd have to re-write your function to return
residual input, maybe something like:

decodeFooReal :: String -> (Foo, String)

decodeFoo :: String -> Foo
decodeFoo = fst . decodeFooReal

and now you implement decodeBar in terms of decodeFooReal, but you
have to pass around residual input explicitly, handle parsing failures
explicitly, etc.
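
Concretely, the plumbing looks something like this (Bar and the
two-Foo format are stand-ins for illustration; failure handling is
still not shown):

 data Bar = Bar Foo Foo

 decodeBarReal :: String -> (Bar, String)
 decodeBarReal s0 =
   let (foo1, s1) = decodeFooReal s0   -- thread residual input by hand
       (foo2, s2) = decodeFooReal s1
   in (Bar foo1 foo2, s2)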

> As I see it, iterators are complex and the dust is only starting to
> settle on implementations and interfaces, and will introduce more
> dependencies.  So my instinct is to stick with the worse-is-better
> approach, but I'm willing to be educated.

I fully agree with the point about dependencies and waiting for the
dust to settle, though I hope a lot of that changes in a year or so.
However, iterIO should already significantly reduce the complexity.

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Enabling GADTs breaks Rank2Types code compilation - Why?

2011-05-31 Thread dm-list-haskell-cafe
At Tue, 31 May 2011 21:30:01 -0500,
austin seipp wrote:
> 
> The short story is thus: when you turn on GADTs, it also now turns on
> another extension implicitly (MonoLocalBinds) which restricts let
> generalization...
> 
> You can find a little more info about the change here:
> 
> http://hackage.haskell.org/trac/ghc/blog/LetGeneralisationInGhc7

Thanks for the precise response I needed.

It definitely felt like I was running up against something like the
monomorphism restriction, but my bindings were function and not
pattern bindings, so I couldn't understand what was going on.  I had
even gone and re-read the GHC documentation
(http://www.haskell.org/ghc/docs/7.0-latest/html/users_guide/data-type-extensions.html#gadt),
which says that -XGADTs enables -XRelaxedPolyRec, but makes no mention
of -XMonoLocalBinds.

It might save users some frustration if the GHC manual and/or the
error message mentioned something about let bindings being monomorphic
by default.

On a related note, I already started fixing this in my code by
enabling ScopedTypeVariables, as it's too much of a pain to do this
without that extension.

I usually try to use the minimum number of extensions possible to
future-proof my code.  However, is it reasonable to conclude that if
I'm going to use GADTs anyway, then additionally enabling
ScopedTypeVariables doesn't really make my code any less future-proof?

Thanks,
David

_______
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Enabling GADTs breaks Rank2Types code compilation - Why?

2011-05-31 Thread dm-list-haskell-cafe
I'm using GHC 7.0.2 and running into a compiler error that I cannot
understand.  Can anyone shed light on the issue for me?  The code does
not make use of GADTs and compiles just fine without them.  But when I
add a {-# LANGUAGE GADTs #-} pragma, it fails to compile.

Here is the code:



{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE GADTs #-}

mkUnit :: (forall t. t -> m t) -> m ()
mkUnit f = f ()

data Const b a = Const { unConst :: b }

sillyId :: r -> r
sillyId r = unConst (mkUnit mkConst_r) -- This line can't compile with GADTs
  where mkConst_r t = mkConst r t
        mkConst :: b -> t -> Const b t
        mkConst b _ = Const b



And the error I get is:

Couldn't match type `t0' with `t'
  because type variable `t' would escape its scope
This (rigid, skolem) type variable is bound by
  a type expected by the context: t -> Const r t
The following variables have types that mention t0
  mkConst_r :: t0 -> Const r t0
(bound at /u/dm/hack/hs/gadt.hs:11:11)
In the first argument of `mkUnit', namely `mkConst_r'
In the first argument of `unConst', namely `(mkUnit mkConst_r)'
In the expression: unConst (mkUnit mkConst_r)

I've found several workarounds for the issue, but I'd like to
understand what the error message means and why it is caused by GADTs.

Thanks in advance for any help.

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Haskell memory model (was iterIO-0.1)

2011-05-18 Thread dm-list-haskell-cafe
At Wed, 18 May 2011 09:56:22 +0100,
Simon Marlow wrote:
> 
> Ok.  I'm not sure how feasible RCU is with IORefs, or even whether it's 
> necessary.  After all, the usual pattern of having readers use readIORef 
> while writers use atomicModifyIORef gives the RCU cost model (zero 
> overhead for readers, expensive writes) with far less complexity. 
> Garbage collection does the job of reclamation automatically.  Have I 
> missed something here?

Right, that's what I was calling RCU.  Usually the hard part in RCU is
the garbage collection.  Obviously if you needed to do something else
like close a file handle, then IORefs are not sufficient.  But for a
lot of applications of RCU, IORefs plus garbage collection should be
sufficient.
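
In code, the pattern we're discussing is just the following (a
minimal sketch; the routing-table type is illustrative):

 import Data.IORef

 type RoutingTable = [(String, String)]  -- stand-in for real state

 -- readers: cheap, lock-free
 lookupRoute :: IORef RoutingTable -> String -> IO (Maybe String)
 lookupRoute ref k = fmap (lookup k) (readIORef ref)

 -- writers: atomic pointer swap; old versions reclaimed by the GC
 addRoute :: IORef RoutingTable -> (String, String) -> IO ()
 addRoute ref r = atomicModifyIORef ref (\t -> (r : t, ()))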

> A slight improvement over this scheme is to use TVar with readTVarIO for 
> the readers (no transaction involved), and transactions for the writers. 
>   This greatly increases the scope of what a writer can do, since they 
> can perform an update on a bunch of state at the same time.

Good point.

> There is an operational semantics in the Concurrent Haskell paper that 
> does not admit the behaviour you describe, but I'll add something to the 
> docs to that effect.

Ah, you got me.  I probably should have looked at that paper, which is
linked to from Control.Concurrent.  Still, in some cases (not
necessarily here), papers are static and code continues to evolve, so
it's nice to have things documented in haddock as well.

> That's a good point - readMVar cannot be optimised to avoid the lock. In 
> fact, readMVar is just
> 
>readMVar m = do x <- takeMVar m; putMVar m x; return x
> 
> although there have been suggestions that we should make it atomic.  If 
> we were to do so, it would still have to use a barrier to avoid reordering.

What would be even cooler would be if swapMVar could be made atomic.
Or better yet, if there could be a compareAndSwapMVar, since on some
architectures (though not x86) that could be a single instruction and
allow for truly wait-free data types.  (That might not be possible
without sacrificing referential transparency, since the obvious
implementation would involve comparing pointers rather than values.)

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Haskell memory model (was iterIO-0.1)

2011-05-16 Thread dm-list-haskell-cafe
At Mon, 16 May 2011 22:31:14 +0100,
Simon Marlow wrote:
> 
> Good example - so it looks like we don't get full sequential consistency 
> on x86 (actually I'd been thinking only about write ordering and 
> forgetting that reads could be reordered around writes).
> 
> But that's bad because it means Haskell has a memory model, and we have 
> to say what it is, or at least say that ordering is undefined.

Right.  So I think the memory model is something along the lines of
the no-crash property you mentioned--i.e., readIORef will return some
value written with writeIORef and not a mish-mash of multiple
writes--combined with the model of the underlying hardware.  Maybe
atomicModifyIORef should serve as a barrier, too.

>   But I've never heard anyone claim that a prerequisite to Haskell being 
> useful as a parallel programming language is a well-defined memory 
> model.  I think there's a couple of reasons for that:
> 
>- deterministic parallel programming models (e.g. Strategies,
>  monad-par) don't care about memory models.  These are the
>  first port of call for parallel programming.

Okay, well, I make this claim as a C/C++ programmer more used to
writing low-level/kernel code than functional code.  So I'm thinking
less of things like deterministic scientific codes and more along the
lines of network servers processing lots of messages and other
asynchronous events happening in a non-deterministic order anyway.

I think several programming patterns would be useful in Haskell that
require some understanding of the memory model.  One that particularly
jumps to mind is the read-copy-update (RCU) pattern for frequently
accessed but seldom updated data (configuration state, routing tables,
etc.)

As you've described them, IORefs are well suited to such a pattern
because reads are very cheap and updates happen through an atomic
pointer write.  But if the documentation doesn't say that readIORef is
supposed to be cheap (or imply so by mentioning that readIORef exposes
the underlying hardware's memory consistency), then there's no way to
tell that IORefs are suitable for RCU, so people may think they have
to do something uglier using peek and poke.

>- If you have to use concurrency, then none of MVars,
>  atomicModifyIORef or STM care about memory models either.
> 
> So the memory model only becomes visible when you use concurrency with 
> shared IORefs (without atomicModifyIORef) or bare peek/poke, which is 
> pretty rare and easily avoided.

Actually:


http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Concurrent-MVar.html

There's nothing in the documentation for MVars that says anything
about sequential consistency.  If you take my example from the
previous email, replace writeIORef with (\p v -> modifyMVar_ p $
\_ -> return v), and replace all other occurrences of IORef with
MVar: nothing in the docs suggests you won't see the "critical
section" message printed twice.
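
Spelled out, that MVar variant of the maybePrint example reads:

 import Control.Concurrent.MVar
 import Control.Monad (unless)

 maybePrint :: MVar Bool -> MVar Bool -> IO ()
 maybePrint myRef yourRef = do
   modifyMVar_ myRef (\_ -> return True)
   yourVal <- readMVar yourRef
   unless yourVal $ putStrLn "critical section"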

Presumably modifyMVar must take a spinlock.  Moreover, to be correct
on x86, the spinlock must execute an LFENCE after acquiring the lock
and an SFENCE prior to releasing the lock.  But does readMVar acquire
the spinlock, or is it optimized to take advantage of pointer-sized
writes being atomic?

One can argue that an optimized readMVar is better, because you can
always force serialization with modifyMVar.  Or one can argue that,
for consistency, readMVar should be identical to a takeMVar followed
by a putMVar, since people can use IORefs for less deterministic
behavior.  The latter is certainly what the *current* code does, but
until the behavior is documented, I'd worry about some later version
of ghc optimizing readMVar and changing the consistency.

Systems have memory models for a reason; you can't get away from them
entirely for all applications.  Haskell's strength, I think, is in
making sure that 99+% of code can't possibly depend on the memory
model.  For functional and ST code, you don't even need to look at the
code to know that this is true--safety is guaranteed by the types
(modulo some unsafe stuff I hope will be even easier to detect with
ghc 7.2...).  But for that last little bit of tricky code, the best
you can do is specify the behavior of the building blocks and maybe
provide some useful architecture-independent wrappers for things
(e.g., abstracted memory barriers).

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-16 Thread dm-list-haskell-cafe
At Tue, 17 May 2011 02:18:55 +1000,
Bernie Pope wrote:
> 
>  http://augustss.blogspot.com/2011/04/ugly-memoization-heres-problem-that-i.html
> 
> He says that "There's no guarantee about readIORef and writeIORef when doing
> multi-threading.". But I was wondering if that was true, and if it were, what
> the consequences would be. If you read his reply to my question on the blog,
> then I believe that he was saying that sequential consistency was not
> guaranteed.

While I don't know how IORefs work and I'd love to understand this
better, I can't imagine any IORef implementation in which memoIO (in
the blog post) would give the wrong answer on x86.  It might, of
course, cause f x to be evaluated multiple times on the same x.

However, on other CPUs (e.g., the DEC alpha), there could maybe, maybe
be issues.  Though I'm not sure, since to avoid crashes, the alpha
implementation of IORef would need to include a memory barrier.  The
question is whether there is an architecture in which IORef avoids
crashes AND memoIO can give you the wrong answer.  Also, if Simon's
original post means that IORef operations all contain barrier
instructions, it could be that memoIO is simply correct and the blog
post is simply wrong about needing MVars.

David

_______
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-16 Thread dm-list-haskell-cafe
At Mon, 16 May 2011 10:56:02 +0100,
Simon Marlow wrote:
>
> Yes, it's not actually documented as far as I know, and we should fix 
> that.  But if you think about it, sequential consistency is really the 
> only sensible policy: suppose one processor creates a heap object and 
> writes a reference to it in the IORef, then another processor reads the 
> IORef.  The writes that created the heap object must be visible to the 
> second processor, otherwise it will encounter uninitialised memory and 
> crash.  So sequential consistency is necessary to ensure concurrent 
> programs can't crash.
> 
> Now perhaps it's possible to have a relaxed memory model that provides
> the no-crashes guarantee but still allows IORef writes to be reordered
> (e.g. some kind of causal consistency).  That might be important if
> there is some processor architecture that provides that memory model, but
> as far as I know there isn't.

Actually, in your heap object example, it sounds like you only really
care about preserving program order, rather than write atomicity.
Thus, you can get away with less-than-sequential consistency and not
crash.

The x86 is an example of a relaxed memory model that provides the
no-crashes guarantee you are talking about.  Specifically, the x86
deviates from sequential consistency in two ways

  1. A load can finish before an earlier store to a different memory
 location.  [intel, Sec. 8.2.3.4]

  2. A thread can read its own writes early. [intel, 8.2.3.5]

  [Section references are to the intel architecture manual, vol 3a:
   http://www.intel.com/Assets/PDF/manual/253668.pdf]

One could imagine an implementation of IORefs that relies on the fact
that pointer writes are atomic and that program order is preserved to
avoid mutex overhead for most calls.  E.g.:

  struct IORef {
    spinlock_t lock;    /* Only ever used by atomicModifyIORef */
    HaskellValue *val;  /* Updated atomically because pointer-sized
                           writes are atomic */
  };

  HaskellValue *
  readIORef (struct IORef *ref)
  {
    return ref->val;
  }

  void
  writeIORef (struct IORef *ref, HaskellValue *val)
  {
    /* Note that if *val was initialized in the same thread, then by
     * the time another CPU sees ref->val, it will also see the
     * correct contents of *ref->val, because stores are seen in a
     * consistent order by other processors [intel, Sec. 8.2.3.7].
     *
     * If *val was initialized in a different thread, then since this
     * thread has seen it, other threads will too, because x86
     * guarantees stores are transitively visible [intel, Sec. 8.2.3.6].
     */
    ref->val = val;
  }

  /* modifyIORef is built out of readIORef and writeIORef */

  HaskellValue *
  atomicModifyIORef (struct IORef *ref, HaskellFunction *f)
  {
    HaskellValue *result;
    spinlock_acquire (&ref->lock);

    result = modifyIORef (ref, f);

    spinlock_release (&ref->lock);
    return result;
  }

This is actually how I assumed IORefs worked.  But then consider the
following program:

  maybePrint :: IORef Bool -> IORef Bool -> IO ()
  maybePrint myRef yourRef = do
    writeIORef myRef True
    yourVal <- readIORef yourRef
    unless yourVal $ putStrLn "critical section"

  main :: IO ()
  main = do
    r1 <- newIORef False
    r2 <- newIORef False
    forkOS $ maybePrint r1 r2
    forkOS $ maybePrint r2 r1
    threadDelay 100

Under sequential consistency, the string "critical section" should be
output at most once.  However, with the above IORef implementation on
x86, since a read can finish before a write to a different location,
both threads might see False for the value of yourVal.

To prevent this deviation from sequential consistency, you would need
to do something like stick an MFENCE instruction at the end of
writeIORef, and that would slow down the common case where you don't
care about sequential consistency.  In fact, I would argue that if you
care about S.C., you should either be using atomicModifyIORef or
MVars.

Can you explain what actually happens inside the real IORef
implementation?

As an aside, these days one sees a lot of hand-wringing over the fact
that CPU clock rates have been flat for a while and the only way to
get more performance is through parallelism.  "How are we going to
teach programmers to write concurrent code when it's so hard to write
and debug?" I've heard numerous people ask.

Haskell could be a major step in the right direction, since in the
absence of variables, it's impossible to have data races.  (You can
still have deadlock and other kinds of race condition, such as the one
in maybePrint above, if you had my definition of IORef, but data races
are by far the most pernicious concurrency problems.)  Of course, the
key to making Haskell useful in a parallel setting is that things like
the memory model are clearly specified.

Re: [Haskell-cafe] Sending messages up-and-down the iteratee-enumerator chain [Was: iterIO-0.1]

2011-05-13 Thread dm-list-haskell-cafe
At Fri, 13 May 2011 02:57:38 -0700 (PDT),
o...@okmij.org wrote:
> 
> The code described in this message does exactly that.

Hey, Oleg.  This is really cool!  In particular, your Bindable class
has the potential to unify a whole bunch of request types and both
simplify and generalize code.  Also, Sum is clearly a more elegant
solution that just requiring everything to be Typeable.  It may solve
some problems I had where I wanted to send messages in exceptions that
contained types I didn't know to be Typeable.  I need to digest the
code a bit more, but it may make sense for me to use this technique in
a future version of iterIO.  (Much of iterIO is obviously inspired by
your stuff as it is.)

However, I still have two questions.  First, the Iter type in your
message seems more like your first iteratee implementation, which is
the approach iterIO and enumerator now take.  I wonder if it's
possible to implement something like Tell your current, CPS-based
iteratee.  Part of the reason I didn't take a CPS-based approach for
Iter was that I couldn't get the upward control requests to work.
(Also I wanted pure iteratees, which reduced the gain from CPS.)

A challenge for Tell is that you need to know the size of buffered
data and not move the input stream.  So the control handler needs to
decide what happens to residual data (since Seek does flush the
input).  Conceptually, it seems like it ought to be doable to pass
residual data up and down the enumerator/iteratee stack in a CPS
style.  But when I try to represent residual input as something like:

  data Input r m s = forall a. Input ((Stream s -> Iteratee s m a) -> m r)

I just can't get the types to work out.

The second question is what happens to residual data for downstream
requests.  In the prototype code of your message, the Stream is over
Chars, which are not a Monoid.  In practice, you obviously want
iteratees to be able to look arbitrarily far ahead--for instance an
iteratee that returns a number of digits that is a multiple of 8 might
have 8 characters of residual data (if the first 7 are digits).

So what I'm stuck on is figuring out the right way to sequence the
downstream requests with respect to the input data, particularly when
you have enumeratees transcoding from one type to the other.  Any
thoughts?

Thanks,
David

_______
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-12 Thread dm-list-haskell-cafe
At Thu, 12 May 2011 16:45:02 +0100,
Simon Marlow wrote:
> 
> >> There are no locks here, thanks to the message-passing implementation we
> >> use for throwTo between processors.
> >
> > Okay, that sounds good.  So then there is no guarantee about ordering
> > of throwTo exceptions?  That seems like a good thing since there are
> > other mechanisms for synchronization.
> 
> What kind of ordering guarantee did you have in mind?  We do guarantee 
> that in
> 
> throwTo t e1
> throwTo t e2
> 
> Thread t will receive e1 before e2 (obviously, because throwTo is 
> synchronous and only returns when the exception has been raised).
> ...
> Pending exceptions are processed in LIFO order (for no good reason other 
> than performance)...

I mean, suppose you have three CPUs, A, B, and C running threads ta,
tb, and tc.  Then should the following order of events be permitted?

A                  B                  C
throwTo tc e1
throwTo tb e2
                   catch e2
                   poke p x
                                      peek p (sees x)
                                      catch e1

I would argue that this is just fine, that one should rely on MVars if
one cares about ordering.  But I'm not sure what "Pending exceptions
are processed in LIFO order" means in the presence of relaxed memory
consistency.

The reason I'm asking is that I want to make sure I never end up
having to pay the overhead of an MFENCE instruction or equivalent
every time I use unmaskAsyncExceptions#...

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-11 Thread dm-list-haskell-cafe
At Wed, 11 May 2011 13:02:21 +0100,
Simon Marlow wrote:
> 
> > However, if there's some simpler way to guarantee that>>= is the
> > point where exceptions are thrown (and might be the case for GHC in
> > practice), then I basically only need to update the docs.  If someone
> > with more GHC understanding could explain how asynchronous exceptions
> > work, I'd love to hear it...
> 
> There's no guarantee of the form that you mention - asynchronous 
> exceptions can occur anywhere.  However, there might be a way to do what 
> you want (disclaimer: I haven't looked at the implementation of iterIO).
> 
> Control.Exception will have a new operation in 7.2.1:
> 
>allowInterrupt :: IO ()
>allowInterrupt = unsafeUnmask $ return ()
> 
> which allows an asynchronous exception to be thrown inside mask (until 
> 7.2.1 you can define it yourself, unsafeUnmask comes from GHC.IO).

So to answer my own question from earlier, I did a bit of
benchmarking, and it seems that on my machine (a 2.4 GHz Intel Xeon
3060, running linux 2.6.38), I get the following costs:

  9 ns - return () :: IO ()        -- baseline (meaningless in itself)
 13 ns - unsafeUnmask $ return ()  -- with interrupts enabled
 18 ns - unsafeUnmask $ return ()  -- inside a mask_

 13 ns - ffi                       -- a null FFI call (getpid cached by libc)
 18 ns - unsafeUnmask ffi          -- with interrupts enabled
 22 ns - unsafeUnmask ffi          -- inside a mask_

131 ns - syscall                   -- getppid through FFI
135 ns - unsafeUnmask syscall      -- with interrupts enabled
140 ns - unsafeUnmask syscall      -- inside a mask_
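
(For the curious, the numbers come from a loop along the following
lines--a reconstruction, not the exact harness; the FFI import of
getpid is the "null" call mentioned above:)

 {-# LANGUAGE ForeignFunctionInterface #-}
 import Control.Exception (mask_)
 import Control.Monad (replicateM_)
 import Foreign.C.Types (CInt)
 import GHC.IO (unsafeUnmask)
 import System.CPUTime (getCPUTime)

 foreign import ccall unsafe "getpid" c_getpid :: IO CInt

 timeIt :: String -> IO () -> IO ()
 timeIt label act = do
   let n = 10000000 :: Int
   t0 <- getCPUTime
   replicateM_ n act
   t1 <- getCPUTime
   -- getCPUTime is in picoseconds; report ns per iteration
   putStrLn $ label ++ ": "
     ++ show ((t1 - t0) `div` (fromIntegral n * 1000)) ++ " ns"

 main :: IO ()
 main = do
   timeIt "return ()" (return ())
   timeIt "unsafeUnmask" (unsafeUnmask (return ()))
   mask_ $ timeIt "unsafeUnmask in mask_" (unsafeUnmask (return ()))
   timeIt "ffi" (c_getpid >> return ())
   timeIt "unsafeUnmask ffi" (unsafeUnmask (c_getpid >> return ()))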

So it seems that the cost of calling unsafeUnmask inside every liftIO
would be about 22 cycles per liftIO invocation, which seems eminently
reasonable.  You could then safely run your whole program inside a big
mask_ and not worry about exceptions happening between >>=
invocations.  Though truly compute-intensive workloads could suffer,
the kind of applications targeted by iterIO will spend most of their
time doing I/O, so this shouldn't be an issue.

Better yet, for programs that don't use asynchronous exceptions, if
you don't put your whole program inside a mask_, the cost drops
roughly in half.  It's hard to imagine any real application whose
performance would take a significant hit because of an extra 11 cycles
per liftIO.

Is there anything I'm missing?  For instance, my machine only has one
CPU, and the tests all ran with one thread.  Does
unmaskAsyncExceptions# acquire a spinlock that could lock the memory
bus?  Or is there some other reason unsafeUnmask could become
expensive on NUMA machines, or in the presence of concurrency?

Thanks,
David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-11 Thread dm-list-haskell-cafe
At Wed, 11 May 2011 13:02:21 +0100,
Simon Marlow wrote:
> 
> There's no guarantee of the form that you mention - asynchronous 
> exceptions can occur anywhere.  However, there might be a way to do what 
> you want (disclaimer: I haven't looked at the implementation of iterIO).
> 
> Control.Exception will have a new operation in 7.2.1:
> 
>allowInterrupt :: IO ()
>allowInterrupt = unsafeUnmask $ return ()
> 
> which allows an asynchronous exception to be thrown inside mask (until 
> 7.2.1 you can define it yourself, unsafeUnmask comes from GHC.IO).

Ah.  I didn't know about unsafeUnmask.  Is unmaskAsyncExceptions# low
enough overhead that it would be reasonable to wrap every invocation
of liftIO in unsafeUnmask?

I'm now thinking it might be reasonable to execute all liftIO actions
inside unsafeUnmask (with maybe a special liftIOmasked function for
those few places where you don't want asynchronous exceptions).  Most
of the uses of mask are because you need two or more binds to execute
without interruption, e.g.:

bracket before after thing =
  mask $ \restore -> do
    a <- before
    -- Big problem if exception happens here --
    r <- restore (thing a) `onException` after a
    _ <- after a
    return r

But when bind sites are the only place an exception can be thrown,
things get a lot simpler.  For instance, it is perfectly reasonable to
write:

bracket before after thing = do
  a <- before
  thing a `finallyI` after a

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-10 Thread dm-list-haskell-cafe
sing IO to happen.

Part of the reason iterIO doesn't have this problem is that iterIO's
Chunk structure (which is vaguely equivalent to iteratee's Stream) is
a Monoid, so it's really easy to save up multiple chunks of residual
and "ungotten" data.  Every Iter is passed all buffered data of its
input type in its entirety (and the inner pipeline stages can actually
un-transcode data to make this true across data types).  But that's
also what makes downstream control messages harder, because
there's no way to represent exceptions at particular points in the
input stream, just an EOF marker at the very end.

> I like the MonadTrans implementation a lot...

Thanks,
David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Type-class conditional behavior

2011-05-08 Thread dm-list-haskell-cafe
At Sat, 7 May 2011 22:14:27 -0700,
Nicholas Tung wrote:
> 
> Dear all,
> 
>     I'd like to write a function "maybeShow :: a -> Maybe String", which runs
> "show" if its argument is of class Show.

You can't do this, because in general there is no way to know whether
an arbitrary object a is of class Show.  In fact, in the worst case,
you could even have two different instances of Show for the same type
defined in two different modules of your program.  Obviously you can't
import both modules with both instances into the same module, but what
if you imported neither--how would the compiler know where to find
the Show instance, or which one to use?

The best you could hope for is to run show if type a is *known* to be
in class Show at your call site.  But that would lead to some pretty
weird behavior.  For instance, the following two functions would be
different--f1 would always return Just, and f2 would always return
Nothing, which is why I assume no combination of LANGUAGE pragmas
would allow it:

f1 :: (Show a) => a -> Maybe String
f1 = maybeShow

f2 :: a -> Maybe String
f2 = maybeShow

In fact, I suspect that your arrow example is more like f2, in that
you don't have a Show dictionary around, so maybeShow will always
return nothing.

Is there any way you can pass the function around explicitly, as in:

data AV t where
  AVLeft :: AV (a, a -> Maybe String)
         -> AV (Either (a, a -> Maybe String) b)

It is also possible to pass dictionaries around explicitly using the
ExistentialQuantification extension (which is required by the standard
library exception mechanism, so is probably a reasonably safe one to
rely on).  Can you do something like the following?

{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE GADTs #-}

data Showable a = (Show a) => Showable a

data AV t where
  AVLeft :: AV (Showable a) -> AV (Either (Showable a) b)
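
Pattern-matching on the Showable constructor then brings the captured
Show dictionary back into scope, so, as a sketch:

 showIt :: Showable a -> String
 showIt (Showable x) = show x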

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-07 Thread dm-list-haskell-cafe
At Sat, 07 May 2011 21:50:13 +0100,
Maciej Marcin Piechotka wrote:
> 
> Sorry for third post but I wonder why the many instances are restricted
> by Monad.

It would be great if Functor were a superclass of Monad.  However,
since it isn't, and since I can't think of anything particularly
useful to do with Iters over Functors that aren't also Monads, I'd
rather just pass one dictionary around than two.  So my convention
throughout the library is that m has to be a Monad but doesn't have to
be a Functor.

In general, I try to place as few requirements in the contexts of
functions as possible.  However, I also want to be able to call most
functions from most other ones.  If some of the useful low-level
functions end up requiring Functor, then most functions in the library
are going to end up requiring (Functor m, Monad m) => instead of
(Monad m) =>, which will actually end up increasing the amount of
stuff in contexts.

(Of course, (Iter t m) itself is an Applicative Functor, even when m
is just a Monad.  I make use of that in the parsing module.)
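
The underlying pattern is the generic one (a sketch, not iterIO's
actual instances): any Monad determines a Functor and an Applicative,
so only the Monad dictionary ever needs to be passed around.

 import Control.Applicative (Applicative(..))
 import Control.Monad (ap, liftM)

 newtype Wrap m a = Wrap { unWrap :: m a }

 -- Functor and Applicative derived entirely from m's Monad
 -- operations.
 instance (Monad m) => Functor (Wrap m) where
   fmap f (Wrap m) = Wrap (liftM f m)

 instance (Monad m) => Applicative (Wrap m) where
   pure = Wrap . return
   Wrap f <*> Wrap x = Wrap (f `ap` x)

 instance (Monad m) => Monad (Wrap m) where
   return = pure
   Wrap m >>= k = Wrap (m >>= unWrap . k)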

David

_______
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Fri, 06 May 2011 21:27:21 -0400,
Mario Blažević wrote:
> 
> > I'd been thinking about using the terms Source and Sink, but Source is
> > very overloaded, and "SinkSource" doesn't exactly roll off the tongue
> > or evoke a particularly helpful intuition.
> 
>  The SCC package happens to use Source and Sink names as well. They 
> are used not for coroutines directly, but instead for references to 
> coroutines of the appropriate type. Every consumer thus owns a Source 
> from which it fetches its input, and that Source is always bound to 
> another coroutine that yields those values through a Sink. Source and 
> Sink are a passive handle to a Producer and Consumer. I may be 
> subjective, but I find this use of the terms very fitting.

You mean fitting for references to coroutines, or fitting for the
replacement names for Enumerator/Iteratee?

If there's overwhelming consensus, I would certainly consider changing
the names in the iterIO library, but it's a pretty big change...

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Sat, 07 May 2011 00:09:46 +0200,
Henk-Jan van Tuyl wrote:
> 
> >> On Fri, May 6, 2011 at 10:44 AM, Henk-Jan van Tuyl   
> >> wrote:
> >> > iterIO cannot be compiled on Windows, because it depends on the  
> >> package
> >> > unix.
> [...]
> > I'd obviously love to make my stuff work on Windows, but probably lack
> > the experience to do it on my own.  Suggestions and help are of course
> > welcome...
> 
> Is the unix-compat package any good?

Thanks for the suggestion.  I'm not sure I totally understand how to
use unix-compat, though.  It gives me calls like

 getFdStatus :: Fd -> IO FileStatus

which is one of the things I need.  But how do I get an Fd in the
first place?  (unix-compat seems to have no equivalent of openFd.)

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Sat, 7 May 2011 01:15:25 +1000,
Alex Mason wrote:
> 
> Hi All,
> 
> I really love the look of this package, but if this is going be
> *the* iteratee package, I would absolutely love to see it fix some
> of the biggest mistakes in the other iteratee packages, soecifically
> naming. A change in naming for the terms iteratee, enumerator and
> enumeratee would go a hell of a long way here; Peaker on #haskell
> suggested Consumer/Producer/Transformer, and there is a lot of
> agreement in the channel that these are vastly better names. They’re
> also far less intimidating to users.
> 
> I personally feel that maybe Transformer isn't such a great name
> (being closely associated with monad transformers), and that maybe
> something like Mapper would be better, but I'm by no means in love
> with that name either. More people in #haskell seem to like
> Transformer, and I don't think my argument against it is very
> strong, so the hivemind seems to have settled on the
> Producer/Transformer/Consumer trilogy.
> 
> I'd love to hear thoughts on the issue, especially from David.

This is a question I struggled a lot with.  I definitely agree that
the terms are pretty intimidating to new users.

At least one thing I've concluded is that it really should be
presented as two concepts, rather than three.  So we should talk
about, e.g., producers, consumers, and pipeline stages that do both.

I'd been thinking about using the terms Source and Sink, but Source is
very overloaded, and "SinkSource" doesn't exactly roll off the tongue
or evoke a particularly helpful intuition.

In the end, I decided just to come up with new terms that wouldn't
carry any pre-conceptions (e.g., what's an "Inum"?), and then build
the intuition through copious documentation...

I'm open to suggestion here.  I've already overhauled the naming
conventions in the library once.  Initially I used the names EnumI and
EnumO for Inum and Onum.  I think the old names were much worse,
especially since Enum is a fundamental typeclass that has absolutely
nothing to do with enumerators.

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Fri, 6 May 2011 10:54:16 -0300,
Felipe Almeida Lessa wrote:
> 
> On Fri, May 6, 2011 at 10:44 AM, Henk-Jan van Tuyl  wrote:
> > iterIO cannot be compiled on Windows, because it depends on the package
> > unix.
> 
> That's a big showstopper.  I wonder if the package split I recommend
> could solve this issue, or if it's something deeper.

It's actually worse than this, unfortunately.

The unix package dependency is mostly there for efficiency.  For the
HTTP package, in order to handle things like directories,
If-Modified-Since, and Content-Length, I need to look at file
attributes.  The platform-independent code lets me do this, but I
would have to make many more system calls.  Also, I would have a
slight race condition, because it's hard to get the attributes of the
file you actually opened (to make sure the length hasn't changed,
etc), while the unix package gets me access to both stat and fstat.
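
For illustration, the stat-versus-fstat point looks roughly like this
with the unix package (a sketch; the openFd signature here matches
older versions of unix, where the file mode is a separate Maybe
argument):

 import System.Posix.Files (fileSize, getFdStatus)
 import System.Posix.IO

 main :: IO ()
 main = do
   fd <- openFd "/etc/hosts" ReadOnly Nothing defaultFileFlags
   -- fstat on the descriptor we actually opened, so the attributes
   -- cannot race with a rename or replacement of the path.
   st <- getFdStatus fd
   print (fileSize st)
   closeFd fd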

This has all been abstracted away by the FileSystemCalls class, so if
there's a way to implement those five functions on Windows, we could
move defaultFileSystemCalls to its own module (or even its own
package), and solve the problem without sacrificing performance or
correctness on unix.

Unfortunately, there are two worse unix dependencies:

 1) I'm using the network IO package to do IO on ByteStrings, and the
    network library claims this doesn't work on Windows.

 2) Proper implementation of many network protocols requires the
    ability to send a TCP FIN segment without closing the underlying
    file descriptor (so you can still read from it).  Thus, I'm using
    FFI to call the shutdown() system call on the file descriptors of
    Handles.  I have no idea how to make this work on Windows.
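
For concreteness, the binding in question is only a few lines on
POSIX systems (a sketch; the constant 1 is SHUT_WR, and nothing here
addresses Windows):

 import Foreign.C.Error (throwErrnoIfMinus1_)
 import Foreign.C.Types (CInt)
 import System.Posix.Types (Fd(..))

 foreign import ccall unsafe "sys/socket.h shutdown"
   c_shutdown :: CInt -> CInt -> IO CInt

 -- Send a TCP FIN by shutting down only the write side; the
 -- descriptor stays open for reading.
 shutdownWrite :: Fd -> IO ()
 shutdownWrite (Fd fd) = throwErrnoIfMinus1_ "shutdown" (c_shutdown fd 1)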

I'm hoping that time eventually solves problem #1.  As for problem #2,
the ideal solution would be to get something like hShutdown into the
system libraries.

I'd obviously love to make my stuff work on Windows, but probably lack
the experience to do it on my own.  Suggestions and help are of course
welcome...

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Fri, 6 May 2011 10:10:26 -0300,
Felipe Almeida Lessa wrote:
> 
> He also says that the enumerator package's Enumerators aren't
> iteratees, only iterIO's enumerators are.  Well, that's not what I'm
> reading:
> 
>   -- from enumerator package
>   newtype Iteratee a m b = Iteratee {runIteratee :: m (Step a m b)}
>   type Enumerator a m b = Step a m b -> Iteratee a m b
>   type Enumeratee ao ai m b = Step ai m b -> Iteratee ao m (Step ai m b)
> 
>   -- from iterIO package
>   newtype Iter t m a = Iter {runIter :: Chunk t -> IterR t m a}
>   type Inum tIn tOut m a = Iter tOut m a -> Iter tIn m (IterR tOut m a)
>   type Onum t m a = Inum () t m a
> 
> The enumerator package's Enumerator *is* an iteratee, an so is its
> Enumeratee.

Strictly speaking, I guess that's precise if you look at the type of
Enumerator.  However, it's not really an iteratee in the spirit of
iteratees, since it isn't really a data sink and has no input type.

> The only real difference is that iterIO represents
> enumerators as enumeratees from () to something.  In enumerator
> package terms, that would be
> 
>   -- enumerator package's enumerator if it was iterIO's :)
>   -- note that Inum's "tIn" and "tOut" are reversed w.r.t. Enumeratee's
>   -- "ao" and "ai"
>   type Enumerator a m b = Enumeratee () a m b
> 
> Whether this representation is better or worse isn't clear for me.

Exactly.  The reason it's better (and for a long time my library was
more like the enumerator one) is that the mechanics of uniform error
handling are complex enough as it is.  When enumerators and
enumeratees are two different types, you need two different mechanisms
for constructing them, and then have to worry about handling errors in
the two different cases.  I found that unifying enumerators and
enumeratees (or Inums and Onums as I call them) significantly
simplified a lot of code.

> Now, one big problem that iterIO has that enumerator hasn't, is that
> iterIO is a *big* library with many dependencies, including OpenSSL.
> IMHO, that package should be split into many others.

Yes, this is definitely true.

> So, in the enumerator vs. iterIO challenge, the only big differences I see 
> are:
> 
>  a) iterIO has a different exception handling mechanism.
>  b) iterIO can have pure iteratees that don't touch the monad.
>  c) iterIO's iteratees can send control messages to their enumerators.
>  d) iterIO's enumerators are enumeratees, but enumerator's enumerators
> are simpler.
>  e) enumerator has fewer dependencies.
>  f) enumerator uses conventional nomenclature.
>  g) enumerator is Haskell 98, while iterIO needs many extensions (e.g.
> MPTC and functional dependencies).
> 
> Anything that I missed?
> 
> The bottomline: the biggest advantage I see right now in favor of
> iterIO is c),

I basically agree with this list, but think you are underestimating
the value of a.  I would rank a as the most important difference
between the packages.  (a is also the reason for d.)

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators

2011-05-06 Thread dm-list-haskell-cafe
At Fri, 6 May 2011 10:15:50 +0200,
Gregory Collins wrote:
> 
> Hi David,
> 
> Re: this comment from catchI:
> 
> > It is not possible to catch asynchronous exceptions, such as
> > lazily evaluated divide-by-zero errors, the throw function, or
> > exceptions raised by other threads using throwTo if those
> > exceptions might arrive anywhere outside of a liftIO call.
> 
> It might be worth investigating providing a version which can catch
> asynchronous exceptions if the underlying monad supports it (via
> MonadCatchIO or something similar). One of the most interesting
> advantages I can see for IterIO over the other iteratee
> implementations is that you actually have some control over resource
> usage -- not being able to catch asynchronous exceptions nullifies
> much of that advantage. A clear use case for this is timeouts on
> server threads, where you typically throw a TimeoutException exception
> to the handling thread using "throwTo" if the timeout is exceeded.

Excellent point.  There's actually a chance that iterIO already
catches those kinds of exceptions, but I wasn't sure enough about how
the Haskell runtime works to make that claim.  I've noticed in
practice that asynchronous exceptions tend to come exactly when I
execute the IO >>= operation.  If that's true, then since each IO >>=
is wrapped in a try block, the exceptions will all be caught (well,
not divide by zero, but things like throwTo, which I think are more
important).

One way I was thinking of implementing this was wrapping the whole
execution in block, and then calling unblock (unless iterIO's own
hypothetical block function is called) for every invocation of liftIO.
Unfortunately, the block and unblock functions now seem to be
deprecated, and the replacement mask/unmask ones would not be as
amenable to this technique.
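
With the newer API, the closest equivalent looks something like this
(a sketch of the general shape only; the names are hypothetical, not
iterIO's): mask hands you a scoped restore function rather than a
global unblock, which is what makes it less amenable to wrapping
every liftIO.

 import Control.Exception (mask)

 -- Run a sequence of pipeline steps masked, wrapping each step in
 -- restore so asynchronous exceptions can only arrive inside a step.
 runSteps :: [IO ()] -> IO ()
 runSteps steps = mask $ \restore -> mapM_ restore steps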

However, if there's some simpler way to guarantee that >>= is the
point where exceptions are thrown (as might be the case for GHC in
practice), then I basically only need to update the docs.  If someone
with more GHC understanding could explain how asynchronous exceptions
work, I'd love to hear it...

> Another question re: resource cleanup: in the docs I see:
> 
> > Now suppose inumHttpBody fails (most likely because it receives an
> > EOF before reading the number of bytes specified in the
> > Content-Length header). Because inumHttpBody is fused to handler,
> > the failure will cause handler to receive an EOF, which will cause
> > foldForm to fail, which will cause handleI to receive an EOF and
> > return, which will ensure hClose runs and the file handle h is not
> > leaked.
> 
> > Once the EOFs have been processed, the exception will propagate
> > upwards making inumHttpServer fail, which in turn will send an EOF
> > to iter. Then the exception will cause enum to fail, after which
> > sock will be closed. In summary, despite the complex structure of
> > the web server, because all the components are fused together with
> > pipe operators, corner cases like this just work with no need to
> > worry about leaked file descriptors.
> 
> Could you go into a little bit of detail about the mechanism behind this?

Yes, absolutely.  This relies on the fact that an Inum must always
return its target Iter, even when the Inum fails.  This invariant is
ensured by the two Inum construction functions, mkInumC and mkInumM,
which catch exceptions thrown by the "codec" iteratee and add in the
state of the target iteratee.

Now when you execute code like "inum .| iter", the immediate result of
running inum is "IterR tIn m (IterR tOut m a)"--i.e., the result of an
iteratee returning the result of an iteratee (because Inums are
iteratees, too).  If the Inum failed, then the outer IterR will use
the Fail constructor:

Fail !IterFail !(Maybe a) !(Maybe (Chunk t))

Where the "Maybe a" will be a "Maybe (IterR tOut m b)", and, because
of the Inum invariant, will be Just an actual result.  .| then must
translate the inner iteratee result to the appropriate return type for
the Inum (since the Inum's type (IterR tIn m ...) is different from
the Iter's (Iter tOut m ...)).  This happens through the internal
function joinR, which says:

joinR (Fail e (Just i) c) = flip onDoneR (runR i) $ \r ->
    case r of
      Done a _    -> Fail e (Just a) c
      Fail e' a _ -> Fail e' a c
      _           -> error "joinR"

Where the 'runR' function basically keeps feeding EOF to an Iter (and
executing its monadic actions and rejecting its control requests)
until it returns a result, at which point the result's residual input
can be discarded and replaced with the residual input of the Inum.
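
As a toy model of that runR behavior (simplified shapes, not the
library's actual types):

 -- Nothing plays the role of EOF.
 data Step i a = Done a | Needs (Maybe i -> Step i a)

 -- Keep feeding EOF until the computation produces a result.
 feedEOF :: Step i a -> a
 feedEOF (Done a)  = a
 feedEOF (Needs k) = feedEOF (k Nothing)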

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: borked windows environment, want to start over

2010-11-19 Thread list+haskell
On 11/18/2010 5:02 PM, Michael Litchard wrote:
> On Tue, Nov 16, 2010 at 4:39 PM, Michael Litchard  wrote:
>> I think I may have borked things good using cygwin. I want to remove
>> it and do a clean install of haskell platform w/out cygwin. What do I
>> need to do to make sure all configuration files have been removed?
> 
> Hmm, I wasn't precise enough in my question. My concern is there are
> configuration files related to the windows haskell-platform install
> that I don't know about , that need to be removed prior to doing a
> clean install. Or is it just a matter of doing a standard windows
> uninstall, will that take care of things?

As for Cygwin interacting with your native Windows Haskell Platform
installation, I think the only thing you have to worry about is if you
manually added your Cygwin bin directories to the Windows %PATH% (this
won't be done by the Cygwin installer, it's something you would have had
to do yourself).  That can cause Cabal to get confused, in my
experience, depending on the precedence of your Cygwin path entries.

Cygwin is generally pretty good about not messing with the host Windows
environment...  deleting the c:\cygwin directory and removing any
(Windows) %PATH% entries is generally sufficient to fully uninstall it.
 If you created any cygwin-based Windows services (sshd, etc.) you'll
have to delete those too, and there are a couple registry keys you could
delete if you want, but aside from %PATH% there's nothing that should
really mess with Haskell Platform...

When you say you may have borked things good with cygwin, what specific
problem(s) are you referring to?

-- 
Mark Shroyer
http://markshroyer.com/contact/
_______
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe