Re: [Haskell-cafe] The Typeable class is changing
At Mon, 11 Jul 2011 21:18:30 +0300, Yitzchak Gale wrote:
>
> Please respond to this thread if you own a package
> that will be affected by this change.

iterIO uses mkTyCon for the simple reason that
((Typeable t, Typeable m) => Iter t m) is Typeable1 and there is no
automatic way of deriving Typeable1.

David
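For concreteness, here is a sketch of the kind of hand-written
instance mkTyCon makes possible (hedged: this is illustrative rather
than the exact instance in iterIO, it assumes the pre-GHC-7.8
Typeable API, and it uses a stand-in Iter type to stay
self-contained):

    import Data.Typeable

    -- Stand-in with the same shape as iterIO's Iter, to keep the
    -- sketch self-contained (the real type is more elaborate).
    newtype Iter t m a = Iter (t -> m a)

    iterTyCon :: TyCon
    iterTyCon = mkTyCon "Data.IterIO.Iter"

    -- A hand-written Typeable1 instance: build the representation
    -- of "Iter t m" from the representations of t and m.
    instance (Typeable t, Typeable1 m) => Typeable1 (Iter t m) where
        typeOf1 it = mkTyConApp iterTyCon [typeOf (tOf it), typeOf1 (mOf it)]
          where tOf :: Iter t' m' a' -> t'
                tOf _ = undefined
                mOf :: Iter t' m' a' -> m' ()
                mOf _ = undefined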
Re: [Haskell-cafe] IterIO: How to write use my inumReverse
At Mon, 4 Jul 2011 20:36:33 +1000, John Ky wrote:
>
> Hi Haskell Cafe,
>
>   enum |$ inumLines .| inumReverse .| inumUnlines .| iter
>   ...
>
>   iterLines :: (Monad m) => Iter L.ByteString m [L.ByteString]
>   iterLines = do
>     line <- lineI
>     return [line]
>
>   iterUnlines :: (Monad m) => Iter [L.ByteString] m L.ByteString
>   iterUnlines = (L.concat . (++ [C.pack "\n"])) `liftM` dataI
>
>   iterReverse :: (Monad m) => Iter [L.ByteString] m [L.ByteString]
>   iterReverse = do
>     lines <- dataI
>     return (map L.reverse lines)
>
>   inumLines = mkInum iterLines
>   inumUnlines = mkInum iterUnlines
>   inumReverse = mkInum iterReverse
>
> It all works fine.
>
> My question is: Is it possible to rewrite inumReverse to be this:
>
>   iterReverse :: (Monad m) => Iter L.ByteString m L.ByteString
>   iterReverse = do
>     line <- dataI
>     return (L.reverse line)
>
>   inumReverse = mkInum iterReverse
>
> And still be able to use it in the line:
>
>   enum |$ inumLines .| {-- inumReverse goes in here somehow --} .|
>   inumUnlines .| iter
>
> The reason I ask is that the Haskell function reverse has the type
> [a] -> [a], not [[a]] -> [[a]].
>
> I thought perhaps the alternative inumReverse is cleaner than the
> original as it behaves more similarly to Haskell's own reverse
> function.

I'm not sure what you are trying to achieve.  If you want an iter
that works on L.ByteStrings, then you can say:

    iterReverse :: (Monad m) => Iter L.ByteString m L.ByteString
    iterReverse = do
      line <- lineI
      return (L.reverse line)

In that case you don't need inumLines and inumUnlines.

If, however, you want the type to be [L.ByteString], and you would
rather do this one line at a time, instead of calling map, then you
could do something like the following:

    iterReverse :: (Monad m) => Iter [L.ByteString] m [L.ByteString]
    iterReverse = do
      line <- headI
      return [L.reverse line]

But the code you have above should also work, so it all depends on
what you are trying to achieve.

David
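For completeness, the one-line-at-a-time variant drops straight into
the original pipeline (a sketch reusing the definitions above):

    inumReverse = mkInum iterReverse   -- the headI version above

    enum |$ inumLines .| inumReverse .| inumUnlines .| iter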
Re: [Haskell-cafe] Strange context reduction with partial application and casting
At Sat, 2 Jul 2011 17:23:50 -0400, Brent Yorgey wrote:
>
> On Sat, Jul 02, 2011 at 09:02:13PM +0200, Daniel Fischer wrote:
> >
> > - disabling the monomorphism restriction
> >   :set -XNoMonomorphismRestriction
> >   let g = f
>
> This is the recommended solution.  The confusion caused by the MR
> far outweighs its occasional benefits.

Recommended by some people, but by no means everyone.  For instance,
Vytiniotis, Peyton Jones, and Schrijvers make a good argument that
the monomorphism restriction should effectively be expanded to
include both pattern and function bindings in let and where clauses:

    http://research.microsoft.com/pubs/102483/tldi10-vytiniotis.pdf

The above paper is currently implemented in GHC, and on by default if
you enable GADTs.  (So you would additionally need -XNoMonoLocalBinds
if you wanted to use GADTs.)

Moreover, even with -XNoMonoLocalBinds, you still run into the fact
that bindings are not generalized within a declaration group, which
could lead to confusion.  In particular, consider the following
program:

    {-# LANGUAGE NoMonomorphismRestriction #-}
    x = 2                   -- The type of x is:  Num a => a
    y = (x + y) :: Int

The type of x is what you would expect without the monomorphism
restriction.  Now say x has a gratuitous use of y:

    {-# LANGUAGE NoMonomorphismRestriction #-}
    x = 2 where _ = y       -- The type of x is:  Int
    y = (x + y) :: Int

If you want x to be polymorphic in this case, you have to give it a
type signature anyway:

    {-# LANGUAGE NoMonomorphismRestriction #-}
    x :: Num a => a
    x = 2 where _ = y
    y = (x + y) :: Int

Thus, what I would recommend, instead of -XNoMonoLocalBinds, is to
give type signatures to your polymorphic bindings.  This makes the
code more readable.  It has the disadvantage that Haskell doesn't
allow you to name monomorphic type variables, which, for local
bindings, can require either the use of -XScopedTypeVariables or
giving extra function arguments whose only purpose is to bring a type
variable into scope.  But both of those are probably more
future-proof than -XNoMonomorphismRestriction.

David
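To illustrate the last point, a hedged example (pairWith is an
invented name) of using -XScopedTypeVariables to name the type
variable a local signature needs:

    {-# LANGUAGE ScopedTypeVariables #-}

    pairWith :: forall a. a -> [a] -> [(a, a)]
    pairWith x = map pair
      where
        -- Without ScopedTypeVariables this local signature is
        -- impossible to write, because 'a' would be read as a fresh
        -- variable rather than the outer one:
        pair :: a -> (a, a)
        pair y = (x, y)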
Re: [Haskell-cafe] Patterns for processing large but finite streams
At Fri, 1 Jul 2011 09:39:32 +0400, Eugene Kirpichov wrote:
>
> Hi,
>
> I'm rewriting timeplot to avoid holding the whole input in memory,
> and naturally a problem arises:
>
> How to represent large but finite streams and functions that
> process them, returning other streams or some kinds of aggregate
> values?
>
> Examples:
> * Adjacent differences of a stream of numbers
> * Given a stream of numbers with times, split it into buckets by
>   time of given width and produce a stream of (bucket, 50%, 75% and
>   90% quantiles in this bucket)
> * Sum a stream of numbers
>
> Is this, perhaps, what comonads are for?  Or iteratees?

Sounds like a good job for iteratees.  Summing a stream of numbers is
just an Iteratee.  Transcoding a stream into another stream is a job
for an Inum (Iteratee-enumerator) or enumeratee, depending on which
package's nomenclature you use.

You have three implementations to choose from:

 - http://hackage.haskell.org/package/iteratee (original)
 - http://hackage.haskell.org/package/enumerator (John Millikin's re-write)
 - http://hackage.haskell.org/package/iterIO (my 3rd-generation attempt)

David
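As a rough sketch of what the "sum" example might look like in iterIO
(hedged: this assumes atEOFI and dataI behave as described in the
iterIO docs--dataI returning the next non-empty chunk--and the Show
constraint is assumed because iterIO's ChunkData instance for lists
appears to require it):

    import Data.IterIO

    -- Fold over successive chunks of a stream of numbers.
    sumI :: (Monad m, Show a, Num a) => Iter [a] m a
    sumI = go 0
      where go acc = do
              done <- atEOFI
              if done
                then return acc
                else do xs <- dataI
                        go (acc + sum xs)

The adjacent-differences example would similarly be a short Inum
built with mkInumM, carrying the previous element as loop state.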
Re: [Haskell-cafe] Confused about my IterIO code
At Thu, 30 Jun 2011 23:53:02 +1000, John Ky wrote:
>
> But all I've done is:
>
>   enum |$ inumReverseLines .| iter
>
>   inumReverseLines = mkInum $ do
>     line <- lineI
>     return (L.reverse (L.concat [line, C.pack "\n"]))

mkInum repeatedly invokes its iter argument so as to keep producing
chunks.  If you want to reverse only one line, it might be easiest to
use something along the lines of:

    mkInumM $ do
      line <- lineI
      ifeed (L.reverse (L.concat [line, C.pack "\n"]))

mkInumM is a more manual Inum construction function that doesn't
automatically do things like loop or handle EOF conditions.

David
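Fleshed out as a standalone definition, the suggestion might read as
follows (a sketch: the name inumReverseLine is invented, and the Inum
type synonym is used as in the iterIO docs):

    import qualified Data.ByteString.Lazy as L
    import qualified Data.ByteString.Lazy.Char8 as C
    import Data.IterIO
    import Data.IterIO.Inum

    -- Reverses exactly one line and then finishes, because mkInumM
    -- does not loop on its own.
    inumReverseLine :: (Monad m) => Inum L.ByteString L.ByteString m a
    inumReverseLine = mkInumM $ do
      line <- lineI
      ifeed (L.reverse (L.concat [line, C.pack "\n"]))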
Re: [Haskell-cafe] How to flush with IterIO in echo server
At Wed, 29 Jun 2011 21:13:47 +1000, John Ky wrote:
>
> Hi Haskell Cafe,
>
> I've written an echo server using just sockets:
>
> ...
>
> When I send text to it, it will echo it back immediately after my
> newline.
>
> I then modified it to use IterIO:
>
>   import Control.Concurrent
>   import Control.Exception
>   import Control.Monad
>   import Control.Monad.Trans
>   import Data.IterIO
>   import Data.IterIO.Inum
>   import Network
>   import System.IO
>   import System.IO.Error (isEOFError)
>   import qualified Data.ByteString.Lazy as L
>
>   iterHandle' :: (MonadIO m) => Handle
>               -> IO (Iter L.ByteString m (), Onum L.ByteString m a)
>   iterHandle' = iterHandle
>
>   main = withSocketsDo $ do
>     sListen <- listenOn (PortNumber 8000)
>     putStrLn "Listening on Port 8000"
>     forkIO $ forever $ do
>       (sSession, hostname, port) <- accept sListen
>       putStrLn ("Connected to " ++ hostname ++ ":" ++ show port)
>       forkIO $ do
>         (iter, enum) <- iterHandle' sSession
>         enum |$ iter
>         return ()
>     putStrLn "Press Ctrl-D to quit."
>     exitOnCtrlD
>
>   exitOnCtrlD = try getLine >>= either
>     (\e -> unless (isEOFError e) $ ioError e)
>     (const exitOnCtrlD)
>
> It works, however it doesn't send anything back to me until end of
> file.
>
> I fixed that problem with my sockets version by flushing after each
> line, but I don't know if IterIO will let me flush on every newline.

The buffering is actually happening in the Handle code.  One way to
avoid this is to change your code to call hSetBuffering as follows:

    hSetBuffering sSession NoBuffering
    (iter, enum) <- iterHandle' sSession

This is mentioned in the documentation of handleI, but arguably
should also be there in the docs for iterHandle.  I think I'll add a
mention there.

If you use Sockets and call iterStream, that should also avoid the
problem.

David
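Concretely, the inner thread of the program above would become
(sketch):

    forkIO $ do
      -- Disable Handle-level buffering before wrapping the session
      -- in iteratee adapters, so each line is echoed immediately:
      hSetBuffering sSession NoBuffering
      (iter, enum) <- iterHandle' sSession
      enum |$ iter
      return ()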
Re: [Haskell-cafe] IterIO type restricted functions
At Wed, 29 Jun 2011 10:11:26 +1000, John Ky wrote:
>
> Hi all,
>
> From the IterIO tutorial:
>
>   enumFile' is like enumFile above, but type restricted to data in
>   the lazy ByteString format, which is more efficient than plain
>   Strings.  (enumFile supports multiple types, but in this example
>   there is not enough information for Haskell to choose one of
>   them, so we must use enumfile' or use :: to specify a type
>   explicitly.)
>
> Which is fine, but shouldn't there also be iterHandle' and
> iterStream'?
>
> Also I'm guessing the f is a typo that should be F.

Thanks, just fixed the typo in git.

I guess the logic is that these are super-easy to define in your own
code, so better not to pollute the namespace with lots of symbols.
The same logic also applies to enumFile'.  However, in testing I
found that I often wanted something like enumFile' to prototype
something quick and dirty, just to test from a file.  So I added it
as a convenience to myself.

I don't have very strong feelings either way.  If enumFile' is
inconsistent, my inclination would be to get rid of enumFile' rather
than add iterHandle' etc.  That way, if someone wants to do
everything in terms of strict byte strings, or text, or whatever,
they can just define the primed versions to be whatever data format
they prefer.

I guess the reason it doesn't feel too horrible as-is is that
enumFile is already a convenience function effectively combining
openFile with enumHandle.

David
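For reference, such a primed definition really is a one-liner; a
sketch matching the tutorial's description of enumFile':

    import Control.Monad.Trans (MonadIO)
    import qualified Data.ByteString.Lazy as L
    import Data.IterIO

    -- enumFile restricted to lazy ByteStrings, exactly as the
    -- tutorial text above describes.
    enumFile' :: (MonadIO m) => FilePath -> Onum L.ByteString m a
    enumFile' = enumFile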
Re: [Haskell-cafe] What is a "simple pattern binding"?
At Sun, 26 Jun 2011 21:15:06 +0100, Paterson, Ross wrote:
>
> > True.  That ambiguity could be avoided by adding the word
> > "declaration" after "type signature".
>
> On second thoughts, this is unnecessary.  The Report consistently
> uses "expression type signature" for the expression and "type
> signature" for the declaration.

I already sent the haskell-prime mailing list a proposal for the
following wording:

    A binding b1 depends on a binding b2 in the same list of
    declarations if either

      1. b1 contains a free identifier v, v is bound by b2, and the
         list of declarations does not contain a type signature for
         v; or

    http://www.haskell.org/pipermail/haskell-prime/2011-June/003482.html

I think this is clearer, but it might make more sense to discuss on
haskell-prime.

David
Re: [Haskell-cafe] What is a "simple pattern binding"?
At Sun, 26 Jun 2011 09:31:05 +0100, Paterson, Ross wrote:
>
> Indeed, the Report has two problems:
>
> Sections 4.4.3.2 and 4.5.5 have different definitions of "simple
> pattern".  This has been there since section 4.5.5 (Monomorphism
> Restriction) was added in Haskell 1.1.  But then the only technical
> use of the term is in section 4.5.5.
>
> When the definition of declaration group (section 4.5.1) was
> changed in Haskell 2010 to break dependencies on type signatures,
> Rule 1 of the Monomorphism Restriction (section 4.5.5), while not
> incorrect, became partially redundant and overly complex.  It could
> have been simplified along the lines you describe.

Aha!  This is starting to make sense!  Indeed the Haskell98 text is
far clearer, and when I look at the differences in section 4.5.1, I
start to understand what the committee meant.

Still, the clause

    b1 contains a free identifier that has no type signature and is
    bound by b2

applies the phrase "has no type signature" to the identifier, not to
the binding.  Such phrasing does not exclude expression
type-signatures.  I presume that in the following code, binding b1
does not depend on b2:

    (x, y) = (z, 1)          -- call this binding b1
    (z, _) = (2, y)          -- call this binding b2
    w = 1 + (z :: Double)

So my reading was that they meant "has no type signature *in b1*".
I take it that your reading is that they meant:

    b1 contains a free identifier that is bound by b2 and b2 is
    accompanied by a type signature for that identifier

I think, given the ambiguities here, it's worth filing a ticket on
the haskell' web site.

Thank you.

David
Re: [Haskell-cafe] What is a "simple pattern binding"?
At Sun, 26 Jun 2011 01:41:01 +0100, Paterson, Ross wrote:
>
> > I thought "no type signature" meant no type signature inside b1.
>
> No, it means no type signature for the variable.
>
> > Otherwise, you are saying nothing could depend on a binding with
> > a type signature.  By that logic, there can be no mutual
> > dependence, and so every declaration with a type signature is its
> > own (singleton) declaration group.
>
> A pattern binding can bind more than one variable.  If all the
> variables bound by a binding have type signatures, that binding is
> indeed a singleton declaration group.

If this is the case, then multiple sentences in the 2010 report don't
make sense, though the way in which they don't make sense sort of
depends on what "simple pattern binding" means.  Which of the
following constitute a simple pattern binding?

    a. a | False = undefined
         | otherwise = \x -> x

    b. Just b = Just (\x -> x)

    c. Just c | False = undefined
              | otherwise = Just (\x -> x)

    d. (d, d') = (\x -> x, d)

    e. (e, e') | False = undefined
               | otherwise = (\x -> x, e)

If it's any clue, GHC infers a polymorphic type for a only.  It
infers type "GHC.Prim.Any -> GHC.Prim.Any" for the others.  Moreover,
GHC accepts the type signature "a :: t -> t", but rejects such a
polymorphic signature for the other variables, and also rejects
programs such as:

    Just b = Just (\x -> x)

    f :: (Show a) => a -> a
    f = b                      -- illegal

So let's work under the assumption that a is a simple pattern
binding, and the others are not.  If you have a different definition,
I'll make a different argument.  (Note also that if you agree with
this definition, then there is a bug in section 4.4.3.2 of the
report, since a is not of the form "p = e".  But if you take the
4.4.3.2 definition, then I'll argue section 4.5.5 has a bug.)

Let's posit a definition that accepts only a (and in particular that
rejects d and e).  Such a definition is further supported by the
phrase "a simple pattern binding is a pattern binding in which the
pattern consists of only a single variable" (from section 4.5.5).

If we accept that a simple pattern binding cannot bind more than one
variable, then the definition of the monomorphism restriction in
section 4.5.5 is not consistent with your interpretation of the term
declaration group.  After all, given our posited definition, the
following language in 4.5.5:

    (a): every variable in the group is bound by a function binding
         or a simple pattern binding (Section 4.4.3.2), and

    (b): an explicit type signature is given for every variable in
         the group that is bound by simple pattern binding.

should instead read:

    (a): every binding is a function binding, or

    (b): the group consists of a simple pattern binding with an
         explicit type signature.

In particular, why would the report say "an explicit type signature
is given for EVERY variable" when there can be only one such
variable?

David
Re: [Haskell-cafe] What is a "simple pattern binding"?
At Sun, 26 Jun 2011 00:17:12 +0100, Paterson, Ross wrote:
>
> > > > g1 x y z = if x>y then show x ++ show z else g2 y x
> > > >
> > > > g2 :: (Show a, Ord a) => a -> a -> String
> > > > g2 | False = \p q -> g1 q p ()
> > > >    | otherwise = \p q -> g1 q p 'a'
> > > >    where x = True
> > >
> > > It appears to me that GHC is justified.  According to 4.5.1 and
> > > 4.5.2, g1 by itself constitutes a declaration group.  It is
> > > considered by itself and is generalized prior to combining it
> > > with g2.
> >
> > Great, now I'm even more confused.  4.5.1 says:
> >
> >     A binding b1 depends on a binding b2 in the same list of
> >     declarations if either
> >
> >     1. b1 contains a free identifier that has no type signature
> >        and is bound by b2, or
> >
> >     2. b1 depends on a binding that depends on b2.
> >
> >     A declaration group is a minimal set of mutually dependent
> >     bindings.  Hindley-Milner type inference is applied to each
> >     declaration group in dependency order.
> >
> > So here the first binding (of g1) contains the free identifier
> > g2, which is bound by the second binding.  Conversely, the second
> > binding contains g1 free.  So the two bindings are mutually
> > dependent, no?
>
> No, the binding of g1 doesn't depend on the binding of g2, because
> g2 has a type signature (clause 1).

I thought "no type signature" meant no type signature inside b1.
Otherwise, you are saying nothing could depend on a binding with a
type signature.  By that logic, there can be no mutual dependence,
and so every declaration with a type signature is its own (singleton)
declaration group.

But this can't be what the committee was thinking given the following
language in section 4.5.2:

    If the programmer supplies explicit type signatures for more
    than one variable in a declaration group, the contexts of these
    signatures must be identical up to renaming of the type
    variables.

Such a restriction would be vacuous if every type signature created a
singleton declaration group.  Moreover, section 4.5.5 is also
inconsistent with such an interpretation:

    The monomorphism restriction

    Rule 1.  We say that a given declaration group is unrestricted
    if and only if:

      (a): every variable in the group is bound by a function
           binding or a simple pattern binding (Section 4.4.3.2),
           and

      (b): an explicit type signature is given for every variable in
           the group that is bound by simple pattern binding.

If every binding with an explicit type signature is its own
declaration group, then why isn't the monomorphism restriction stated
more simply as follows?

    (a): every binding is a function binding, or

    (b): the group consists of a pattern binding with an explicit
         type signature.

When they say "an explicit type signature is given for every variable
in the group..." they have to be thinking there may be more than one
of them.

> The type of g1 is inferred using the declared type of g2.  Then
> that type is used in inferring a type for g2, which will be
> compared with its declared signature.

Thanks for the reply, but now I'm even more confused.  Perhaps I
should ask if someone can give me a better definition of declaration
group, ideally with support in the language spec...

David
Re: [Haskell-cafe] What is a "simple pattern binding"?
At Sat, 25 Jun 2011 14:20:52 -0400, Scott Turner wrote:
>
> > g1 x y z = if x>y then show x ++ show z else g2 y x
> >
> > g2 :: (Show a, Ord a) => a -> a -> String
> > g2 | False = \p q -> g1 q p ()
> >    | otherwise = \p q -> g1 q p 'a'
> >    where x = True
>
> It appears to me that GHC is justified.  According to 4.5.1 and
> 4.5.2, g1 by itself constitutes a declaration group.  It is
> considered by itself and is generalized prior to combining it with
> g2.
>
> I agree that the report is confusing in its use of "simple pattern
> binding".

Great, now I'm even more confused.  4.5.1 says:

    A binding b1 depends on a binding b2 in the same list of
    declarations if either

    1. b1 contains a free identifier that has no type signature and
       is bound by b2, or

    2. b1 depends on a binding that depends on b2.

    A declaration group is a minimal set of mutually dependent
    bindings.  Hindley-Milner type inference is applied to each
    declaration group in dependency order.

So here the first binding (of g1) contains the free identifier g2,
which is bound by the second binding.  Conversely, the second binding
contains g1 free.  So the two bindings are mutually dependent, no?

In fact, section 4.5.2 goes on to use the following example for a
declaration group:

    f x = let g1 x y = if x>y then show x else g2 y x
              g2 p q = g1 q p
          in ...

This example is very close to the code I gave.  How can my example
have two declaration groups when this example has only one?

David
[Haskell-cafe] What is a "simple pattern binding"?
Section 4.4.3.2 of the 2010 Haskell report says:

    A simple pattern binding has form p = e.  The pattern p is
    matched “lazily” as an irrefutable pattern, as if there were an
    implicit ~ in front of it.

This makes it sound as though p is a pattern, which I assume means
what section 3.17 defines as the non-terminal "pat".

    pat -> lpat qconop pat | lpat

This is further suggested by the explicit mention of ~, which would
be redundant if p had to be a var, since variables always match
(according to section 3.17.2 rule 1).

So my reading of section 4.4.3.2 is that the following is considered
a simple pattern binding (because it has no guards):

    (f, g) = (\x -> x, f)

However, section 4.5.5 seems to contradict this.  It reads:

    Recall that a variable is bound by either a function binding or
    a pattern binding, and that a simple pattern binding is a
    pattern binding in which the pattern consists of only a single
    variable (Section 4.4.3).

Moreover, it goes on to give an example and explanation:

    [(n,s)] = reads t
    ...
    Hence, when non-simple pattern bindings are used

This text makes it sound as though a "simple pattern binding" can
have only a single variable to the left of the = sign, meaning:

    f = \x -> x

Further confusing things, GHC accepts the following:

    g1 x y z = if x>y then show x ++ show z else g2 y x

    g2 :: (Show a, Ord a) => a -> a -> String
    g2 | False = \p q -> g1 q p ()
       | otherwise = \p q -> g1 q p 'a'
       where x = True

and infers type:

    g1 :: (Show a, Show a1, Ord a1) => a1 -> a1 -> a -> [Char]

According to 4.4.3.2, g2 definitely does not have a simple pattern
binding, as its binding is not of the form p = e where p is a
pattern.  Yet by section 4.5.5, if g2 were not considered a simple
pattern binding, the constrained type variables in the binding group
containing g1 and g2 (in particular the inferred type (Show a => a)
of z in g1) would not be allowed to be generalized.

So is section 4.4.3.2 of the Haskell 2010 report just wrong?  Or is
GHC allowing code prohibited by the standard?  Or am I somehow
misreading the standard?

Anyway, if someone can provide a less ambiguous definition of the
term "simple pattern binding", I would appreciate it, particularly
if you can point to support for your definition in the Haskell 2010
report...

Thanks,
David
Re: [Haskell-cafe] Iteratees again (Was: How on Earth Do You Reason about Space?)
At Thu, 02 Jun 2011 13:52:52 +0200, Ketil Malde wrote:
>
> I have a bunch of old code, parsers etc, which are based on the
> 'readFile' paradigm:
>
>   type Str = Data.ByteString.Lazy.Char8.ByteString -- usually
>
>   decodeFoo :: Str -> Foo
>   encodeFoo :: Foo -> Str
>
>   readFoo = decodeFoo . readFile
>   writeFoo f = writeFile f . encodeFoo
>   hReadFoo = decodeFoo . hRead
>   :
>   (etc)
>
> This works pretty well, as long as Foo is strict enough that you
> don't retain all or huge parts of input, and as long as you can
> process input in a forward, linear fashion.  And, like my frequency
> count above, I can't really see how this can be made much simpler.

This is fine if you never have parse errors and always read to the
end of the file.  Otherwise, the code above is incorrect and ends up
leaking file descriptors.  In general, it is very hard to write
parsers that parse every possible input and never fail.  Thus, for
anything other than a toy program, your code actually has to be:

    readFoo path = bracket (openFile path ReadMode) hClose $
                   hGetContents >=> (\s -> return $! decodeFoo s)

Which is still not guaranteed to work if Foo contains thunks, so
then you end up having to write:

    readFoo path = bracket (openFile path ReadMode) hClose $ \h -> do
      s <- hGetContents h
      let foo = decodeFoo s
      deepseq foo $ return foo

Or, finally, what a lot of code falls back to, inserting gratuitous
calls to length:

    readFoo path = bracket (openFile path ReadMode) hClose $ \h -> do
      s <- hGetContents h
      length s `seq` return (decodeFoo s)

The equivalent code with the iterIO package would be:

    readFoo path = enumFile path |$ fooI

which seems a lot simpler to me...

> Would there be any great advantage to rewriting my stuff to use
> iterators?  Or at least, use iterators for new stuff?

In addition to avoiding edge cases like leaked file descriptors and
memory, one of the things I discovered in implementing iterIO is that
it's really handy to have your I/O functions be the same as your
parsing combinators.  So iteratees might actually admit a far simpler
implementation of decodeFoo/fooI.

More specifically, imagine that you have decodeFoo, and now want to
implement decodeBar where a Bar includes some Foos.  Unfortunately,
having an implementation of decodeFoo in-hand doesn't help you
implement decodeBar.  You'd have to re-write your function to return
residual input, maybe something like:

    decodeFooReal :: String -> (Foo, String)

    decodeFoo :: String -> Foo
    decodeFoo = fst . decodeFooReal

and now you implement decodeBar in terms of decodeFooReal, but you
have to pass around residual input explicitly, handle parsing
failures explicitly, etc.

> As I see it, iterators are complex and the dust is only starting to
> settle on implementations and interfaces, and will introduce more
> dependencies.  So my instinct is to stick with the worse-is-better
> approach, but I'm willing to be educated.

I fully agree with the point about dependencies and waiting for the
dust to settle, though I hope a lot of that changes in a year or so.
However, iterIO should already significantly reduce the complexity.

David
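For contrast, a hedged sketch of how an iteratee-style parser
composes (Bar and its shape are invented for illustration; Foo and
fooI are the names used in the message above):

    import qualified Data.ByteString.Lazy as L
    import Data.IterIO

    data Bar = Bar Foo Foo   -- invented shape; a Bar holding two Foos

    -- Residual input and failure are handled by the Iter monad, so
    -- the Bar parser just runs fooI twice in sequence:
    barI :: (Monad m) => Iter L.ByteString m Bar
    barI = do
      f1 <- fooI
      f2 <- fooI
      return (Bar f1 f2)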
Re: [Haskell-cafe] Enabling GADTs breaks Rank2Types code compilation - Why?
At Tue, 31 May 2011 21:30:01 -0500, austin seipp wrote:
>
> The short story is thus: when you turn on GADTs, it also now turns
> on another extension implicitly (MonoLocalBinds) which restricts
> let generalization...
>
> You can find a little more info about the change here:
>
> http://hackage.haskell.org/trac/ghc/blog/LetGeneralisationInGhc7

Thanks for the precise response I needed.  It definitely felt like I
was running up against something like the monomorphism restriction,
but my bindings were function and not pattern bindings, so I couldn't
understand what was going on.

I had even gone and re-read the GHC documentation
(http://www.haskell.org/ghc/docs/7.0-latest/html/users_guide/data-type-extensions.html#gadt),
which says that -XGADTs enables -XRelaxedPolyRec, but makes no
mention of -XMonoLocalBinds.  It might save users some frustration if
the GHC manual and/or the error message mentioned something about let
bindings being monomorphic by default.

On a related note, I already started fixing this in my code by
enabling ScopedTypeVariables, as it's too much of a pain to do this
without that extension.  I usually try to use the minimum number of
extensions possible to future-proof my code.  However, is it
reasonable to conclude that if I'm going to use GADTs anyway, then
additionally enabling ScopedTypeVariables doesn't really make my code
any less future-proof?

Thanks,
David
[Haskell-cafe] Enabling GADTs breaks Rank2Types code compilation - Why?
I'm using GHC 7.0.2 and running into a compiler error that I cannot
understand.  Can anyone shed light on the issue for me?

The code does not make use of GADTs and compiles just fine without
them.  But when I add a {-# LANGUAGE GADTs #-} pragma, it fails to
compile.  Here is the code:

    {-# LANGUAGE Rank2Types #-}
    {-# LANGUAGE GADTs #-}

    mkUnit :: (forall t. t -> m t) -> m ()
    mkUnit f = f ()

    data Const b a = Const { unConst :: b }

    sillyId :: r -> r
    sillyId r = unConst (mkUnit mkConst_r) -- This line can't compile with GADTs
      where mkConst_r t = mkConst r t

    mkConst :: b -> t -> Const b t
    mkConst b _ = Const b

And the error I get is:

    Couldn't match type `t0' with `t'
      because type variable `t' would escape its scope
    This (rigid, skolem) type variable is bound by
      a type expected by the context: t -> Const r t
    The following variables have types that mention t0
      mkConst_r :: t0 -> Const r t0 (bound at /u/dm/hack/hs/gadt.hs:11:11)
    In the first argument of `mkUnit', namely `mkConst_r'
    In the first argument of `unConst', namely `(mkUnit mkConst_r)'
    In the expression: unConst (mkUnit mkConst_r)

I've found several workarounds for the issue, but I'd like to
understand what the error message means and why it is caused by
GADTs.

Thanks in advance for any help.

David
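For reference, one workaround of the kind alluded to (per the reply
earlier in this thread, GADTs implies MonoLocalBinds, so the local
binding is no longer generalized; giving it an explicit polymorphic
signature restores the old behavior, and naming r in that signature
requires ScopedTypeVariables).  A sketch:

    {-# LANGUAGE Rank2Types #-}
    {-# LANGUAGE GADTs #-}
    {-# LANGUAGE ScopedTypeVariables #-}

    mkUnit :: (forall t. t -> m t) -> m ()
    mkUnit f = f ()

    data Const b a = Const { unConst :: b }

    mkConst :: b -> t -> Const b t
    mkConst b _ = Const b

    -- The explicit forall brings r into scope so mkConst_r can be
    -- given the polymorphic signature that is no longer inferred:
    sillyId :: forall r. r -> r
    sillyId r = unConst (mkUnit mkConst_r)
      where mkConst_r :: t -> Const r t
            mkConst_r t = mkConst r t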
Re: [Haskell-cafe] Haskell memory model (was iterIO-0.1)
At Wed, 18 May 2011 09:56:22 +0100, Simon Marlow wrote:
>
> Ok.  I'm not sure how feasible RCU is with IORefs, or even whether
> it's necessary.  After all, the usual pattern of having readers use
> readIORef while writers use atomicModifyIORef gives the RCU cost
> model (zero overhead for readers, expensive writes) with far less
> complexity.  Garbage collection does the job of reclamation
> automatically.  Have I missed something here?

Right, that's what I was calling RCU.  Usually the hard part in RCU
is the garbage collection.  Obviously if you needed to do something
else like close a file handle, then IORefs are not sufficient.  But
for a lot of applications of RCU, IORefs plus garbage collection
should be sufficient.

> A slight improvement over this scheme is to use TVar with
> readTVarIO for the readers (no transaction involved), and
> transactions for the writers.  This greatly increases the scope of
> what a writer can do, since they can perform an update on a bunch
> of state at the same time.

Good point.

> There is an operational semantics in the Concurrent Haskell paper
> that does not admit the behaviour you describe, but I'll add
> something to the docs to that effect.

Ah, you got me.  I probably should have looked at that paper, which
is linked to from Control.Concurrent.  Still, in some cases (not
necessarily here), papers are static and code continues to evolve, so
it's nice to have stuff documented in haddock as well.

> That's a good point - readMVar cannot be optimised to avoid the
> lock.  In fact, readMVar is just
>
>    readMVar m = do x <- takeMVar m; putMVar m x; return x
>
> although there have been suggestions that we should make it atomic.
> If we were to do so, it would still have to use a barrier to avoid
> reordering.

What would be even cooler would be if swapMVar could be made atomic.
Or better yet, if there could be a compareAndSwapMVar, since on some
architectures (though not x86) that could be a single instruction and
allow for truly wait-free data types.  (That might not be possible
without sacrificing referential transparency, since the obvious
implementation would involve comparing pointers rather than values.)

David
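For concreteness, a minimal sketch of the IORef-based RCU pattern
under discussion (Config and the function names are invented for
illustration; the point is that readers pay one pointer read while
writers pay one atomic swap, with GC doing reclamation):

    import Data.IORef

    -- Hypothetical rarely-updated shared state.
    newtype Config = Config [(String, String)]

    -- Readers: a single cheap pointer read, no locks.
    readConfig :: IORef Config -> IO Config
    readConfig = readIORef

    -- Writers: copy and update behind a single atomic pointer swap;
    -- old versions are reclaimed by the garbage collector once no
    -- reader still references them.
    updateConfig :: IORef Config -> (Config -> Config) -> IO ()
    updateConfig ref f = atomicModifyIORef ref (\c -> (f c, ()))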
[Haskell-cafe] Haskell memory model (was iterIO-0.1)
At Mon, 16 May 2011 22:31:14 +0100, Simon Marlow wrote:
>
> Good example - so it looks like we don't get full sequential
> consistency on x86 (actually I'd been thinking only about write
> ordering and forgetting that reads could be reordered around
> writes).
>
> But that's bad because it means Haskell has a memory model, and we
> have to say what it is, or at least say that ordering is undefined.

Right.  So I think the memory model is something along the lines of
the no-crash property you mentioned--i.e., readIORef will return some
value written with writeIORef and not a mish-mash of multiple
writes--combined with the model of the underlying hardware.  Maybe
atomicModifyIORef should serve as a barrier, too.

> But I've never heard anyone claim that a prerequisite to Haskell
> being useful as a parallel programming language is a well-defined
> memory model.  I think there's a couple of reasons for that:
>
>   - deterministic parallel programming models (e.g. Strategies,
>     monad-par) don't care about memory models.  These are the
>     first port of call for parallel programming.

Okay, well, I make this claim as a C/C++ programmer more used to
writing low-level/kernel code than functional code.  So I'm thinking
less of things like deterministic scientific codes and more along the
lines of network servers processing lots of messages and other
asynchronous events happening in a non-deterministic order anyway.

I think several programming patterns would be useful in Haskell that
require some understanding of the memory model.  One that
particularly jumps to mind is the read-copy-update (RCU) pattern for
frequently accessed but seldom updated data (configuration state,
routing tables, etc.)  As you've described them, IORefs are well
suited to such a pattern because reads are very cheap and updates
happen through an atomic pointer write.  But if the documentation
doesn't say that readIORef is supposed to be cheap (or imply so by
mentioning that readIORef exposes the underlying hardware's memory
consistency), then there's no way to tell that IORefs are suitable
for RCU, so people may think they have to do something uglier using
peek and poke.

>   - If you have to use concurrency, then none of MVars,
>     atomicModifyIORef or STM care about memory models either.
>
> So the memory model only becomes visible when you use concurrency
> with shared IORefs (without atomicModifyIORef) or bare peek/poke,
> which is pretty rare and easily avoided.

Actually:

    http://hackage.haskell.org/packages/archive/base/latest/doc/html/Control-Concurrent-MVar.html

There's nothing in the documentation for MVars that says anything
about sequential consistency.  If you take my example from the
previous email, replace writeIORef with (\p v -> modifyMVar_ p $
return v), replace all other occurrences of IORef with MVar, nothing
in the docs suggests you won't see the "critical section" message
printed twice.

Presumably modifyMVar must take a spinlock.  Moreover, to be correct
on x86, the spinlock must execute an LFENCE after acquiring the lock
and an SFENCE prior to releasing the lock.  But does readMVar acquire
the spinlock, or is it optimized to take advantage of pointer-sized
writes being atomic?

One can argue that an optimized readMVar is better, because you can
always force serialization with modifyMVar.  Or one can argue that,
for consistency, readMVar should be identical to a takeMVar followed
by a putMVar, since people can use IORefs for less deterministic
behavior.  The latter is certainly what the *current* code does, but
until the behavior is documented, I'd worry about some later version
of ghc optimizing readMVar and changing the consistency.

Systems have memory models for a reason; you can't get away from them
entirely for all applications.  Haskell's strength, I think, is in
making sure that 99+% of code can't possibly depend on the memory
model.  For functional and ST code, you don't even need to look at
the code to know that this is true--safety is guaranteed by the types
(modulo some unsafe stuff I hope will be even easier to detect with
ghc 7.2...).  But for that last little bit of tricky code, the best
you can do is specify the behavior of the building blocks and maybe
provide some useful architecture-independent wrappers for things
(e.g., abstracted memory barriers).

David
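For reference, the transformed example reads as follows (a sketch;
the original IORef version appears in full later in this digest, and
the lambda in the description above is rendered here as an explicit
ignore-the-old-value function):

    import Control.Concurrent.MVar
    import Control.Monad (unless)

    -- maybePrint with IORef replaced by MVar and writeIORef by
    -- modifyMVar_, exactly as described above:
    maybePrint :: MVar Bool -> MVar Bool -> IO ()
    maybePrint myVar yourVar = do
      modifyMVar_ myVar (\_ -> return True)
      yourVal <- readMVar yourVar
      unless yourVal $ putStrLn "critical section"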
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Tue, 17 May 2011 02:18:55 +1000, Bernie Pope wrote:
>
> http://augustss.blogspot.com/2011/04/ugly-memoization-heres-problem-that-i.html
>
> He says that "There's no guarantee about readIORef and writeIORef
> when doing multi-threading.".  But I was wondering if that was
> true, and if it were, what the consequences would be.  If you read
> his reply to my question on the blog, then I believe that he was
> saying that sequential consistency was not guaranteed.

While I don't know how IORefs work and I'd love to understand this
better, I can't imagine any IORef implementation in which memoIO (in
the blog post) would give the wrong answer on x86.  It might, of
course, cause f x to be evaluated multiple times on the same x.
However, on other CPUs (e.g., the DEC alpha), there could maybe,
maybe be issues.  Though I'm not sure, since to avoid crashes, the
alpha implementation of IORef would need to include a memory barrier.
The question is whether there is an architecture in which IORef
avoids crashes AND memoIO can give you the wrong answer.

Also, if Simon's original post means that IORef operations all
contain barrier instructions, it could be that memoIO is simply
correct and the blog post is simply wrong about needing MVars.

David
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Mon, 16 May 2011 10:56:02 +0100, Simon Marlow wrote:
>
> Yes, it's not actually documented as far as I know, and we should
> fix that.  But if you think about it, sequential consistency is
> really the only sensible policy: suppose one processor creates a
> heap object and writes a reference to it in the IORef, then another
> processor reads the IORef.  The writes that created the heap object
> must be visible to the second processor, otherwise it will
> encounter uninitialised memory and crash.  So sequential
> consistency is necessary to ensure concurrent programs can't crash.
>
> Now perhaps it's possible to have a relaxed memory model that
> provides the no-crashes guarantee but still allows IORef writes to
> be reordered (e.g. some kind of causal consistency).  That might be
> important if there is some processor architecture that provides
> that memory model, but as far as I know there isn't.

Actually, in your heap object example, it sounds like you only really
care about preserving program order, rather than write atomicity.
Thus, you can get away with less-than-sequential consistency and not
crash.

The x86 is an example of a relaxed memory model that provides the
no-crashes guarantee you are talking about.  Specifically, the x86
deviates from sequential consistency in two ways:

  1. A load can finish before an earlier store to a different memory
     location.  [intel, Sec. 8.2.3.4]

  2. A thread can read its own writes early.  [intel, Sec. 8.2.3.5]

[Section references are to the intel architecture manual, vol 3a:
http://www.intel.com/Assets/PDF/manual/253668.pdf]

One could imagine an implementation of IORefs that relies on the fact
that pointer writes are atomic and that program order is preserved to
avoid mutex overhead for most calls.  E.g.:

    struct IORef {
      spinlock_t lock;   /* Only ever used by atomicModifyIORef */
      HaskellValue *val; /* Updated atomically because pointer-sized
                            writes are atomic */
    };

    HaskellValue *
    readIORef (struct IORef *ref)
    {
      return ref->val;
    }

    void
    writeIORef (struct IORef *ref, HaskellValue *val)
    {
      /* Note that if *val was initialized in the same thread, then
       * by the time another CPU sees ref->val, it will also see the
       * correct contents of *ref->val, because stores are seen in a
       * consistent order by other processors [intel, Sec. 8.2.3.7].
       *
       * If *val was initialized in a different thread, then since
       * this thread has seen it, other threads will too, because
       * x86 guarantees stores are transitively visible [intel,
       * Sec. 8.2.3.6].
       */
      ref->val = val;
    }

    /* modifyIORef is built out of readIORef and writeIORef */

    HaskellValue *
    atomicModifyIORef (struct IORef *ref, HaskellFunction *f)
    {
      HaskellValue *result;
      spinlock_acquire (&ref->lock);
      result = modifyIORef (ref, f);
      spinlock_release (&ref->lock);
      return result;
    }

This is actually how I assumed IORefs worked.  But then consider the
following program:

    maybePrint :: IORef Bool -> IORef Bool -> IO ()
    maybePrint myRef yourRef = do
      writeIORef myRef True
      yourVal <- readIORef yourRef
      unless yourVal $ putStrLn "critical section"

    main :: IO ()
    main = do
      r1 <- newIORef False
      r2 <- newIORef False
      forkOS $ maybePrint r1 r2
      forkOS $ maybePrint r2 r1
      threadDelay 100

Under sequential consistency, the string "critical section" should be
output at most once.  However, with the above IORef implementation on
x86, since a read can finish before a write to a different location,
both threads might see False for the value of yourVal.

To prevent this deviation from sequential consistency, you would need
to do something like stick an MFENCE instruction at the end of
writeIORef, and that would slow down the common case where you don't
care about sequential consistency.  In fact, I would argue that if
you care about S.C., you should either be using atomicModifyIORef or
MVars.

Can you explain what actually happens inside the real IORef
implementation?

As an aside, these days one sees a lot of hand-wringing over the fact
that CPU clock rates have been flat for a while and the only way to
get more performance is through parallelism.  "How are we going to
teach programmers to write concurrent code when it's so hard to
write and debug?" I've heard numerous people ask.

Haskell could be a major step in the right direction, since in the
absence of variables, it's impossible to have data races.  (You can
still have deadlock and other kinds of race condition, such as the
one in maybePrint above, if you had my definition of IORef, but data
races are by far the most pernicious concurrency problems.)  Of
course, the key to making Haskell useful in a parallel setting is
that things like the memory
Re: [Haskell-cafe] Sending messages up-and-down the iteratee-enumerator chain [Was: iterIO-0.1]
At Fri, 13 May 2011 02:57:38 -0700 (PDT), o...@okmij.org wrote:
>
> The code described in this message does exactly that.

Hey, Oleg.  This is really cool!  In particular, your Bindable class
has the potential to unify a whole bunch of request types and both
simplify and generalize code.  Also, Sum is clearly a more elegant
solution than just requiring everything to be Typeable.  It may solve
some problems I had where I wanted to send messages in exceptions
that contained types I didn't know to be Typeable.

I need to digest the code a bit more, but it may make sense for me to
use this technique in a future version of iterIO.  (Much of iterIO is
obviously inspired by your stuff as it is.)  However, I still have
two questions.

First, the Iter type in your message seems more like your first
iteratee implementation, which is the approach iterIO and enumerator
now take.  I wonder if it's possible to implement something like Tell
with your current, CPS-based iteratee.  Part of the reason I didn't
take a CPS-based approach for Iter was that I couldn't get the upward
control requests to work.  (Also I wanted pure iteratees, which
reduced the gain from CPS.)

A challenge for Tell is that you need to know the size of buffered
data and not move the input stream.  So the control handler needs to
decide what happens to residual data (since Seek does flush the
input).  Conceptually, it seems like it ought to be doable to pass
residual data up and down the enumerator/iteratee stack in a CPS
style.  But when I try to represent residual input as something like:

    data Input r m s = forall a. Input ((Stream s -> Iteratee s m a) -> m r)

I just can't get the types to work out.

The second question is what happens to residual data for downstream
requests.  In the prototype code of your message, the Stream is over
Chars, which are not a Monoid.  In practice, you obviously want
iteratees to be able to look arbitrarily far ahead--for instance an
iteratee that returns a number of digits that is a multiple of 8
might have 8 characters of residual data (if the first 7 are digits).

So what I'm stuck on is figuring out the right way to sequence the
downstream requests with respect to the input data, particularly when
you have enumeratees transcoding from one type to the other.

Any thoughts?

Thanks,
David
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Thu, 12 May 2011 16:45:02 +0100, Simon Marlow wrote:
>
> >> There are no locks here, thanks to the message-passing
> >> implementation we use for throwTo between processors.
> >
> > Okay, that sounds good.  So then there is no guarantee about
> > ordering of throwTo exceptions?  That seems like a good thing
> > since there are other mechanisms for synchronization.
>
> What kind of ordering guarantee did you have in mind?  We do
> guarantee that in
>
>    throwTo t e1
>    throwTo t e2
>
> Thread t will receive e1 before e2 (obviously, because throwTo is
> synchronous and only returns when the exception has been raised).
> ...
> Pending exceptions are processed in LIFO order (for no good reason
> other than performance)...

I mean, suppose you have three CPUs, A, B, and C running threads ta,
tb, and tc.  Then should the following order of events be permitted?

    A                  B                C
    throwTo tc e1
    throwTo tb e2
                       catch e2
                       poke p x
                                        peek p (sees x)
                                        catch e1

I would argue that this is just fine, that one should rely on MVars
if one cares about ordering.  But I'm not sure what "Pending
exceptions are processed in LIFO order" means in the presence of
relaxed memory consistency.

The reason I'm asking is that I want to make sure I never end up
having to pay the overhead of an MFENCE instruction or equivalent
every time I use unmaskAsyncExceptions#...

David
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Wed, 11 May 2011 13:02:21 +0100, Simon Marlow wrote:
>
> > However, if there's some simpler way to guarantee that >>= is the
> > point where exceptions are thrown (and might be the case for GHC
> > in practice), then I basically only need to update the docs.  If
> > someone with more GHC understanding could explain how
> > asynchronous exceptions work, I'd love to hear it...
>
> There's no guarantee of the form that you mention - asynchronous
> exceptions can occur anywhere.  However, there might be a way to do
> what you want (disclaimer: I haven't looked at the implementation
> of iterIO).
>
> Control.Exception will have a new operation in 7.2.1:
>
>    allowInterrupt :: IO ()
>    allowInterrupt = unsafeUnmask $ return ()
>
> which allows an asynchronous exception to be thrown inside mask
> (until 7.2.1 you can define it yourself, unsafeUnmask comes from
> GHC.IO).

So to answer my own question from earlier, I did a bit of
benchmarking, and it seems that on my machine (a 2.4 GHz Intel Xeon
3060, running linux 2.6.38), I get the following costs:

      9 ns - return () :: IO ()       -- baseline (meaningless in itself)
     13 ns - unsafeUnmask $ return () -- with interrupts enabled
     18 ns - unsafeUnmask $ return () -- inside a mask_

     13 ns - ffi                      -- a null FFI call (getpid cached by libc)
     18 ns - unsafeUnmask ffi         -- with interrupts enabled
     22 ns - unsafeUnmask ffi         -- inside a mask_

    131 ns - syscall                  -- getppid through FFI
    135 ns - unsafeUnmask syscall     -- with interrupts enabled
    140 ns - unsafeUnmask syscall     -- inside a mask_

So it seems that the cost of calling unsafeUnmask inside every liftIO
would be about 22 cycles per liftIO invocation, which seems eminently
reasonable.  You could then safely run your whole program inside a
big mask_ and not worry about exceptions happening between >>=
invocations.  Though truly compute-intensive workloads could have
issues, the kind of applications targeted by iterIO will spend most
of their time doing I/O, so this shouldn't be an issue.

Better yet, for programs that don't use asynchronous exceptions, if
you don't put your whole program inside a mask_, the cost drops
roughly in half.  It's hard to imagine any real application whose
performance would take a significant hit because of an extra 11
cycles per liftIO.

Is there anything I'm missing?  For instance, my machine only has one
CPU, and the tests all ran with one thread.  Does
unmaskAsyncExceptions# acquire a spinlock that could lock the memory
bus?  Or is there some other reason unsafeUnmask could become
expensive on NUMA machines, or in the presence of concurrency?

Thanks,
David
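A minimal sketch of the scheme being benchmarked here (hedged:
liftIOunmasked is an invented name; unsafeUnmask comes from GHC.IO,
as noted in Simon's message):

    import Control.Monad.Trans (MonadIO, liftIO)
    import GHC.IO (unsafeUnmask)

    -- Run an IO action with asynchronous exceptions re-enabled, so
    -- that a program otherwise running under mask_ only sees async
    -- exceptions at liftIO (and hence >>=) boundaries.
    liftIOunmasked :: (MonadIO m) => IO a -> m a
    liftIOunmasked = liftIO . unsafeUnmask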
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Wed, 11 May 2011 13:02:21 +0100, Simon Marlow wrote:
>
> There's no guarantee of the form that you mention - asynchronous
> exceptions can occur anywhere.  However, there might be a way to do
> what you want (disclaimer: I haven't looked at the implementation
> of iterIO).
>
> Control.Exception will have a new operation in 7.2.1:
>
>    allowInterrupt :: IO ()
>    allowInterrupt = unsafeUnmask $ return ()
>
> which allows an asynchronous exception to be thrown inside mask
> (until 7.2.1 you can define it yourself, unsafeUnmask comes from
> GHC.IO).

Ah.  I didn't know about unsafeUnmask.  Is unmaskAsyncExceptions# low
enough overhead that it would be reasonable to wrap every invocation
of liftIO in unsafeUnmask?

I'm now thinking it might be reasonable to execute all liftIO actions
inside unsafeUnmask (with maybe a special liftIOmasked function for
those few places where you don't want asynchronous exceptions).  Most
of the uses of mask are because you need two or more binds to execute
without interruption, e.g.:

    bracket before after thing = mask $ \restore -> do
      a <- before
      -- Big problem if exception happens here --
      r <- restore (thing a) `onException` after a
      _ <- after a
      return r

But when bind sites are the only place an exception can be thrown,
things get a lot simpler.  For instance, it is perfectly reasonable
to write:

    bracket before after thing = do
      a <- before
      thing a `finallyI` after a

David
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
sing IO to happen.

Part of the reason iterIO doesn't have this problem is that iterIO's
Chunk structure (which is vaguely equivalent to iteratee's Stream) is
a Monoid, so it's really easy to save up multiple chunks of residual
and "ungotten" data.  Every Iter is passed all buffered data of its
input type in its entirety (and the inner pipeline stages can
actually un-transcode data to make this true across data types).  But
that's also what makes downstream control messages harder, because
there's no way to represent exceptions at particular points in the
input stream, just an EOF marker at the very end.

> I like the MonadTrans implementation a lot...

Thanks,
David
Re: [Haskell-cafe] Type-class conditional behavior
At Sat, 7 May 2011 22:14:27 -0700, Nicholas Tung wrote:
>
> Dear all,
>
> I'd like to write a function "maybeShow :: a -> Maybe String",
> which runs "show" if its argument is of class Show.

You can't do this, because in general there is no way to know whether
an arbitrary object a is of class Show.  In fact, in the worst case,
you could even have two different instances of Show for the same type
defined in two different modules of your program.  Obviously you
can't import both modules with both instances into the same module,
but what if you didn't import either--how would the compiler know
where to find the Show function or which one to use?

The best you could hope for is to run show if type a is *known* to be
in class Show at your call site.  But that would lead to some pretty
weird behavior.  For instance, the following two functions would be
different--f1 would always return Just, and f2 would always return
Nothing, which is why I assume no combination of LANGUAGE pragmas
would allow it:

    f1 :: (Show a) => a -> Maybe String
    f1 = maybeShow

    f2 :: a -> Maybe String
    f2 = maybeShow

In fact, I suspect that your arrow example is more like f2, in that
you don't have a Show dictionary around, so maybeShow will always
return Nothing.

Is there any way you can pass the function around explicitly, as in:

    data AV t where
      AVLeft :: AV (a, a -> Maybe String)
             -> AV (Either (a, a -> Maybe String) b)

It is also possible to pass dictionaries around explicitly using the
ExistentialQuantification extension (which is required by the
standard library exception mechanism, so is probably a reasonably
safe one to rely on).  Can you do something like the following?

    {-# LANGUAGE ExistentialQuantification #-}
    {-# LANGUAGE GADTs #-}

    data Showable a = (Show a) => Showable a

    data AV t where
      AVLeft :: AV (Showable a) -> AV (Either (Showable a) b)

David
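With the constraint on the constructor, pattern matching brings the
Show dictionary back into scope, so a show function for wrapped
values is trivial (a sketch; showAV is an invented name):

    {-# LANGUAGE ExistentialQuantification #-}
    {-# LANGUAGE GADTs #-}

    data Showable a = (Show a) => Showable a

    -- Matching on Showable makes the stored Show dictionary
    -- available, for any wrapped type:
    showAV :: Showable a -> String
    showAV (Showable x) = show x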
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Sat, 07 May 2011 21:50:13 +0100, Maciej Marcin Piechotka wrote:
>
> Sorry for third post but I wonder why the many instances are
> restricted by Monad.

It would be great if Functor were a superclass of Monad.  However,
since it isn't, and since I can't think of anything particularly
useful to do with Iters over Functors that aren't also Monads, I'd
rather just pass one dictionary around than two.

So my convention throughout the library is that m has to be a Monad
but doesn't have to be a Functor.  In general, I try to place as few
requirements in the contexts of functions as possible.  However, I
also want to be able to call most functions from most other ones.  If
some of the useful low-level functions end up requiring Functor, then
most functions in the library are going to end up requiring
(Functor m, Monad m) => instead of (Monad m) =>, which will actually
end up increasing the amount of stuff in contexts.

(Of course, (Iter t m) is itself an Applicative Functor, even when m
is just a Monad.  I make use of that in the parsing module.)

David
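This works out because a Monad dictionary already determines the
Functor and Applicative operations; a sketch of the standard
construction, using a stand-in type so as not to clash with iterIO's
real instances:

    import Control.Applicative (Applicative(..))
    import Control.Monad (ap, liftM)

    -- Stand-in wrapper; iterIO's real Iter is more elaborate, but
    -- the instances below only ever use the Monad dictionary, which
    -- is the point.
    newtype IterLike m a = IterLike { runIterLike :: m a }

    instance Monad m => Monad (IterLike m) where
      return = IterLike . return
      m >>= f = IterLike (runIterLike m >>= runIterLike . f)

    instance Monad m => Functor (IterLike m) where
      fmap = liftM

    instance Monad m => Applicative (IterLike m) where
      pure  = return
      (<*>) = ap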
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Fri, 06 May 2011 21:27:21 -0400, Mario Blažević wrote:
>
> > I'd been thinking about using the terms Source and Sink, but Source is
> > very overloaded, and "SinkSource" doesn't exactly roll off the tongue
> > or evoke a particularly helpful intuition.
>
> The SCC package happens to use the Source and Sink names as well. They
> are used not for coroutines directly, but instead for references to
> coroutines of the appropriate type. Every consumer thus owns a Source
> from which it fetches its input, and that Source is always bound to
> another coroutine that yields those values through a Sink. Source and
> Sink are passive handles to a Producer and a Consumer. I may be
> subjective, but I find this use of the terms very fitting.

You mean fitting for references to coroutines, or fitting as replacement names for Enumerator/Iteratee? If there's overwhelming consensus, I would certainly consider changing the names in the iterIO library, but it's a pretty big change...

David
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Sat, 07 May 2011 00:09:46 +0200, Henk-Jan van Tuyl wrote:
>
> >> On Fri, May 6, 2011 at 10:44 AM, Henk-Jan van Tuyl
> >> wrote:
> >> > iterIO cannot be compiled on Windows, because it depends on the
> >> > package unix.
> [...]
> > I'd obviously love to make my stuff work on Windows, but probably lack
> > the experience to do it on my own. Suggestions and help are of course
> > welcome...
>
> Is the unix-compat package any good?

Thanks for the suggestion. I'm not sure I totally understand how to use unix-compat, though. It gives me calls like

    getFdStatus :: Fd -> IO FileStatus

which is one of the things I need. But how do I get an Fd in the first place? (unix-compat seems to have no equivalent of openFd.)

David
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Sat, 7 May 2011 01:15:25 +1000, Alex Mason wrote:
>
> Hi All,
>
> I really love the look of this package, but if this is going to be
> *the* iteratee package, I would absolutely love to see it fix some
> of the biggest mistakes in the other iteratee packages, specifically
> naming. A change in naming for the terms iteratee, enumerator and
> enumeratee would go a hell of a long way here; Peaker on #haskell
> suggested Consumer/Producer/Transformer, and there is a lot of
> agreement in the channel that these are vastly better names. They’re
> also far less intimidating to users.
>
> I personally feel that maybe Transformer isn't such a great name
> (being closely associated with monad transformers), and that maybe
> something like Mapper would be better, but I'm by no means in love
> with that name either. More people in #haskell seem to like
> Transformer, and I don't think my argument against it is very
> strong, so the hivemind seems to have settled on the
> Producer/Transformer/Consumer trilogy.
>
> I'd love to hear thoughts on the issue, especially from David.

This is a question I struggled a lot with. I definitely agree that the terms are pretty intimidating to new users. At least one thing I've concluded is that it really should be presented as two concepts, rather than three. So we should talk about, e.g., producers, consumers, and pipeline stages that do both.

I'd been thinking about using the terms Source and Sink, but Source is very overloaded, and "SinkSource" doesn't exactly roll off the tongue or evoke a particularly helpful intuition. In the end, I decided just to come up with new terms that wouldn't carry any pre-conceptions (e.g., what's an "Inum"?), and then build the intuition through copious documentation...

I'm open to suggestion here. I've already overhauled the naming conventions in the library once. Initially I used the names EnumI and EnumO for Inum and Onum. I think the old names were much worse, especially since Enum is a fundamental typeclass that has absolutely nothing to do with enumerators.

David
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Fri, 6 May 2011 10:54:16 -0300, Felipe Almeida Lessa wrote:
>
> On Fri, May 6, 2011 at 10:44 AM, Henk-Jan van Tuyl wrote:
> > iterIO cannot be compiled on Windows, because it depends on the package
> > unix.
>
> That's a big showstopper. I wonder if the package split I recommend
> could solve this issue, or if it's something deeper.

It's actually worse than this, unfortunately. The unix package dependency is mostly there for efficiency. For the HTTP package, in order to handle things like directories, If-Modified-Since, and Content-Length, I need to look at file attributes. The platform-independent code lets me do this, but I would have to make many more system calls. Also, I would have a slight race condition, because it's hard to get the attributes of the file you actually opened (to make sure the length hasn't changed, etc.), while the unix package gets me access to both stat and fstat. This has all been abstracted away by the FileSystemCalls class, so if there's a way to implement those five functions on Windows, we could move defaultFileSystemCalls to its own module (or even its own package), and solve the problem without sacrificing performance or correctness on unix.

Unfortunately, there are two worse unix dependencies:

1) I'm using the network package to do IO on ByteStrings, and the network library claims this doesn't work on Windows.

2) Proper implementation of many network protocols requires the ability to send a TCP FIN segment without closing the underlying file descriptor (so you can still read from it). Thus, I'm using FFI to call the shutdown() system call on the file descriptors of Handles. I have no idea how to make this work on Windows.

I'm hoping that time eventually solves problem #1. As for problem #2, the ideal solution would be to get something like hShutdown into the system libraries.

I'd obviously love to make my stuff work on Windows, but probably lack the experience to do it on my own. Suggestions and help are of course welcome...
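For reference, the FFI binding in #2 boils down to something like the following sketch (my illustration, assuming POSIX shutdown(2); not iterIO's actual code):

    {-# LANGUAGE ForeignFunctionInterface #-}

    import Foreign.C.Error (throwErrnoIfMinus1_)
    import Foreign.C.Types (CInt)
    import System.Posix.Types (Fd (..))

    foreign import ccall unsafe "sys/socket.h shutdown"
        c_shutdown :: CInt -> CInt -> IO CInt

    -- send a TCP FIN but leave the descriptor open for reading;
    -- 1 is SHUT_WR on common POSIX systems
    sendFIN :: Fd -> IO ()
    sendFIN (Fd fd) = throwErrnoIfMinus1_ "shutdown" (c_shutdown fd 1)

David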
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Fri, 6 May 2011 10:10:26 -0300, Felipe Almeida Lessa wrote:
>
> He also says that the enumerator package's Enumerators aren't
> iteratees, only iterIO's enumerators are. Well, that's not what I'm
> reading:
>
> -- from enumerator package
> newtype Iteratee a m b = Iteratee {runIteratee :: m (Step a m b)}
> type Enumerator a m b = Step a m b -> Iteratee a m b
> type Enumeratee ao ai m b = Step ai m b -> Iteratee ao m (Step ai m b)
>
> -- from iterIO package
> newtype Iter t m a = Iter {runIter :: Chunk t -> IterR t m a}
> type Inum tIn tOut m a = Iter tOut m a -> Iter tIn m (IterR tOut m a)
> type Onum t m a = Inum () t m a
>
> The enumerator package's Enumerator *is* an iteratee, and so is its
> Enumeratee.

Strictly speaking, I guess that's precise if you look at the type of Enumerator. However, it's not really an iteratee in the spirit of iteratees, since it isn't really a data sink and has no input type.

> The only real difference is that iterIO represents
> enumerators as enumeratees from () to something. In enumerator
> package terms, that would be
>
> -- enumerator package's enumerator if it was iterIO's :)
> -- note that Inum's "tIn" and "tOut" are reversed w.r.t. Enumeratee's
> -- "ao" and "ai"
> type Enumerator a m b = Enumeratee () a m b
>
> Whether this representation is better or worse isn't clear to me.

Exactly. The reason it's better (and for a long time my library was more like the enumerator one) is that the mechanics of uniform error handling are complex enough as it is. When enumerators and enumeratees are two different types, you need two different mechanisms for constructing them, and then have to worry about handling errors in the two different cases. I found that unifying enumerators and enumeratees (or Inums and Onums as I call them) significantly simplified a lot of code.

> Now, one big problem that iterIO has that enumerator hasn't, is that
> iterIO is a *big* library with many dependencies, including OpenSSL.
> IMHO, that package should be split into many others.

Yes, this is definitely true.

> So, in the enumerator vs. iterIO challenge, the only big differences I see
> are:
>
> a) iterIO has a different exception handling mechanism.
> b) iterIO can have pure iteratees that don't touch the monad.
> c) iterIO's iteratees can send control messages to their enumerators.
> d) iterIO's enumerators are enumeratees, but enumerator's enumerators
> are simpler.
> e) enumerator has fewer dependencies.
> f) enumerator uses conventional nomenclature.
> g) enumerator is Haskell 98, while iterIO needs many extensions (e.g.
> MPTC and functional dependencies).
>
> Anything that I missed?
>
> The bottom line: the biggest advantage I see right now in favor of
> iterIO is c),

I basically agree with this list, but think you are underestimating the value of a. I would rank a as the most important difference between the packages. (a also is the reason for d.)
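To make the unification concrete, here is a small sketch in the style of the examples earlier in this thread (it assumes iterIO's mkInum, dataI, and enumPure; the names are mine, not library code): one construction mechanism yields both pipeline stages and pure data sources.

    import Control.Monad (liftM)
    import Data.Char (toUpper)
    import Data.IterIO
    import qualified Data.ByteString.Lazy.Char8 as L

    -- an Inum (enumeratee) built with mkInum...
    inumUpper :: Monad m => Inum L.ByteString L.ByteString m a
    inumUpper = mkInum (L.map toUpper `liftM` dataI)

    -- ...and an Onum (enumerator) is just an Inum with input type ()
    onumHello :: Monad m => Onum L.ByteString m a
    onumHello = enumPure (L.pack "hello")

David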
Re: [Haskell-cafe] ANNOUNCE: iterIO-0.1 - iteratee-based IO with pipe operators
At Fri, 6 May 2011 10:15:50 +0200, Gregory Collins wrote:
>
> Hi David,
>
> Re: this comment from catchI:
>
> > It is not possible to catch asynchronous exceptions, such as
> > lazily evaluated divide-by-zero errors, the throw function, or
> > exceptions raised by other threads using throwTo if those
> > exceptions might arrive anywhere outside of a liftIO call.
>
> It might be worth investigating providing a version which can catch
> asynchronous exceptions if the underlying monad supports it (via
> MonadCatchIO or something similar). One of the most interesting
> advantages I can see for IterIO over the other iteratee
> implementations is that you actually have some control over resource
> usage -- not being able to catch asynchronous exceptions nullifies
> much of that advantage. A clear use case for this is timeouts on
> server threads, where you typically throw a TimeoutException to the
> handling thread using "throwTo" if the timeout is exceeded.

Excellent point. There's actually a chance that iterIO already catches those kinds of exceptions, but I wasn't sure enough about how the Haskell runtime works to make that claim. I've noticed in practice that asynchronous exceptions tend to come exactly when I execute the IO >>= operation. If that's true, then since each IO >>= is wrapped in a try block, the exceptions will all be caught (well, not divide by zero, but things like throwTo, which I think are more important).

One way I was thinking of implementing this was wrapping the whole execution in block, and then calling unblock (unless iterIO's own hypothetical block function is called) for every invocation of liftIO. Unfortunately, the block and unblock functions now seem to be deprecated, and the replacement mask/unmask ones would not be as amenable to this technique. However, if there's some simpler way to guarantee that >>= is the point where exceptions are thrown (and that might be the case for GHC in practice), then I basically only need to update the docs. If someone with more GHC understanding could explain how asynchronous exceptions work, I'd love to hear it...
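In mask terms, the idea would look roughly like the following sketch (my illustration, ignoring the nesting issues just mentioned; not iterIO code): asynchronous exceptions can only be delivered inside restore, which is exactly where the try can catch them.

    import Control.Exception (SomeException, mask, try)

    -- run a liftIO action with async exceptions masked everywhere
    -- except inside the action itself, so a throwTo can only land
    -- where we are prepared to catch it
    runLiftIO :: IO a -> IO (Either SomeException a)
    runLiftIO io = mask $ \restore -> try (restore io)

> Another question re: resource cleanup: in the docs I see:
>
> > Now suppose inumHttpBody fails (most likely because it receives an
> > EOF before reading the number of bytes specified in the
> > Content-Length header). Because inumHttpBody is fused to handler,
> > the failure will cause handler to receive an EOF, which will cause
> > foldForm to fail, which will cause handleI to receive an EOF and
> > return, which will ensure hClose runs and the file handle h is not
> > leaked.
>
> > Once the EOFs have been processed, the exception will propagate
> > upwards making inumHttpServer fail, which in turn will send an EOF
> > to iter. Then the exception will cause enum to fail, after which
> > sock will be closed. In summary, despite the complex structure of
> > the web server, because all the components are fused together with
> > pipe operators, corner cases like this just work with no need to
> > worry about leaked file descriptors.
>
> Could you go into a little bit of detail about the mechanism behind this?

Yes, absolutely. This relies on the fact that an Inum must always return its target Iter, even when the Inum fails. This invariant is ensured by the two Inum construction functions, mkInumC and mkInumM, which catch exceptions thrown by the "codec" iteratee and add in the state of the target iteratee.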
Now when you execute code like "inum .| iter", the immediate result of running inum is "IterR tIn m (IterR tOut m a)"--i.e., the result of an iteratee returning the result of an iteratee (because Inums are iteratees, too). If the Inum failed, then the outer IterR will use the Fail constructor:

    Fail !IterFail !(Maybe a) !(Maybe (Chunk t))

where the "Maybe a" will be a "Maybe (IterR tOut m b)" and, because of the Inum invariant, will be Just an actual result.

.| then must translate the inner iteratee result to the appropriate return type for the Inum (since the Inum's type (IterR tIn m ...) is different from the Iter's (Iter tOut m ...)). This happens through the internal function joinR, which says:

    joinR (Fail e (Just i) c) = flip onDoneR (runR i) $ \r ->
        case r of
          Done a _    -> Fail e (Just a) c
          Fail e' a _ -> Fail e' a c
          _           -> error "joinR"

where the 'runR' function basically keeps feeding EOF to an Iter (and executing its monadic actions and rejecting its control requests) until it returns a result, at which point the result's residual input can be discarded and replaced with the residual input of the Inum.

David
[Haskell-cafe] Re: borked windows environment, want to start over
On 11/18/2010 5:02 PM, Michael Litchard wrote:
> On Tue, Nov 16, 2010 at 4:39 PM, Michael Litchard wrote:
>> I think I may have borked things good using cygwin. I want to remove
>> it and do a clean install of haskell platform w/out cygwin. What do I
>> need to do to make sure all configuration files have been removed?
>
> Hmm, I wasn't precise enough in my question. My concern is there are
> configuration files related to the windows haskell-platform install
> that I don't know about, that need to be removed prior to doing a
> clean install. Or is it just a matter of doing a standard windows
> uninstall, will that take care of things?

As for Cygwin interacting with your native Windows Haskell Platform installation, I think the only thing you have to worry about is if you manually added your Cygwin bin directories to the Windows %PATH% (this won't be done by the Cygwin installer, it's something you would have had to do yourself). That can cause Cabal to get confused, in my experience, depending on the precedence of your Cygwin path entries.

Cygwin is generally pretty good about not messing with the host Windows environment... deleting the c:\cygwin directory and removing any (Windows) %PATH% entries is generally sufficient to fully uninstall it. If you created any Cygwin-based Windows services (sshd, etc.) you'll have to delete those too, and there are a couple of registry keys you could delete if you want, but aside from %PATH% there's nothing that should really mess with Haskell Platform...

When you say you may have borked things good with cygwin, what specific problem(s) are you referring to?

--
Mark Shroyer
http://markshroyer.com/contact/