Re: [Haskell-cafe] Explaining monads
Ronald Guida wrote:
> Given the question "What is a Monad", I would have to say "A Monad is
> a device for sequencing side-effects."

peterv <[EMAIL PROTECTED]> wrote:
> "Side-effects" is a piece of linguistic cruft played fast-and-loose
> by too many people in this game. "Sequencing" suffers the same
> disease.

Gregory Propf wrote:
> I made this mistake myself at first too. It seems that the Monad =
> "side effect machine" error is common to Haskell newbies. Probably
> to do with the fact that the first thing every programmer wants to
> do is write a hello world program, and for that you need the IO
> monad, which requires some explanation of how a Monad can allow for
> side effects (at least the IO Monad). - Greg

Earlier in this thread, I had a conversation with several people regarding monads and arrows. My goal was to try to come up with a brief explanation. I realized that "sequencing side-effects" is a simplistic and incorrect view, so now I'm thinking in terms of DSELs.

I have heard that writing a monad tutorial is something people do when they finally understand monads. I interpret this observation to mean that either (1) monads (and arrows) are just difficult things, or (2) most of the existing explanations and tutorials are somehow inadequate or incomplete. My present goal is to understand monads well enough to be able to explain them to others. I wonder if it's possible to create a tutorial that explains monads well enough so that they just "make sense" or "click" for people.

Here is the brief explanation I came up with:

> Arrows and monads are abstract data types used to construct Domain
> Specific Embedded Languages (DSELs) within Haskell. A simple arrow
> provides a closed DSEL. A monad is a special type of arrow that
> creates an open DSEL by allowing users to embed arbitrary Haskell
> within it.

Is this an accurate explanation?
I hate to feed a fire, but is "Domain Specific Embedded Language" a well-defined phrase, or is it just another example of linguistic cruft? If DSEL is cruft, then is there a better way to briefly explain monads and arrows? Also, is this a /useful/ explanation, or have I simply hidden the complexity by invoking the concepts of ADTs and DSELs? -- Ron ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
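Ron's "open DSEL" phrasing can be made concrete with a toy sketch (not from the thread): in do-notation, steps of a little monadic "language" mix freely with arbitrary embedded Haskell. The names `safeDiv` and `calc` are invented purely for illustration.

```haskell
-- A tiny "open DSEL" sketch using Maybe: monadic steps of the
-- embedded language interleave with ordinary Haskell (let-bindings,
-- pure arithmetic).  All names here are illustrative.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

calc :: Int -> Int -> Maybe Int
calc a b = do
  q <- safeDiv a b        -- a step in the embedded "language"
  let doubled = 2 * q     -- arbitrary Haskell embedded in between
  safeDiv doubled (q + 1) -- another step in the "language"

-- calc 10 2 == Just 1, calc 10 0 == Nothing
```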
[Haskell-cafe] Small proof intro
I put together a small intro lesson on proving Haskell code using QuickCheck, equational reasoning and Isabelle/HOL. It's very elementary, but might be of interest to some people here. http://www.thenewsh.com/%7Enewsham/formal/reverse/ Feedback is appreciated.

Tim Newsham
http://www.thenewsh.com/~newsham/
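The lesson itself isn't reproduced in this message; as a flavour of what it covers, here is a sketch of the classic `reverse` properties, spot-checked by brute force on small lists (the linked page uses QuickCheck and Isabelle/HOL for real testing and proof).

```haskell
-- Two classic properties of reverse.  Checking a handful of small
-- lists is no proof, of course -- that is what equational reasoning
-- and Isabelle/HOL are for in the lesson.
prop_involutive :: [Int] -> Bool
prop_involutive xs = reverse (reverse xs) == xs

prop_distributive :: [Int] -> [Int] -> Bool
prop_distributive xs ys = reverse (xs ++ ys) == reverse ys ++ reverse xs

-- a few sample inputs of increasing length
samples :: [[Int]]
samples = [ take n [1, 3 ..] | n <- [0 .. 6] ]
```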
Re: [Haskell-cafe] A few questions on primes generating.
G'day all.

Quoting Andrew Coppin <[EMAIL PROTECTED]>:
> So many other programming languages allow weird things to happen with
> numeric overflows... it would be nice if Haskell didn't.

It would be nice if CPUs supported trapping integer arithmetic.

Cheers,
Andrew Bromage
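For readers wondering what untrapped overflow looks like in Haskell today, a small sketch: fixed-width `Int32` silently wraps around, while arbitrary-precision `Integer` gives the exact result; neither traps.

```haskell
import Data.Int (Int32)

-- Fixed-width arithmetic wraps around on overflow instead of
-- trapping...
wrapped :: Int32
wrapped = maxBound + 1                        -- wraps to minBound

-- ...whereas Integer is arbitrary-precision and stays exact.
exact :: Integer
exact = fromIntegral (maxBound :: Int32) + 1  -- 2147483648, no wrap
```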
[Haskell-cafe] Re: Re: Re: Language support for imperative code. Was: Re: monad subexpressions
Isaac Dupree wrote:
> Benjamin Franksen wrote:
>> I'd be careful. Introducing a network connection into the equation
>> makes the object (its methods) susceptible to a whole new bunch of
>> failure modes; think indefinite delays, connection loss, network
>> buffer overflow, etc etc. It may be a mistake to abstract all that
>> away; in fact I am convinced that the old Unix habit of sweeping all
>> these failure modes and potentially long delays under a big carpet
>> named 'file abstraction' was a bad idea to begin with. The ages old
>> and still not solved problems with web browsers hanging indefinitely
>> (w/o allowing any GUI interaction) while name resolution waits for
>> completion is only the most prominent example.
>
> IMO it's just a terribly stupid bug in the best web browsers. Maybe
> inefficient, poorly, or not-at-all-used multithreading?

Explicitly creating a (system) thread with all the overhead (in computing resources, as well as code complexity) only because the system interface is broken? Yes, of course, necessary, but not nice. An extra parameter for a continuation would be a lot more light-weight and would also make explicit that we must expect the call to be delayed. I think the main reason why systems don't regularly employ this scheme is that it is so tedious to work with closures in low-level languages like C.

> "file abstraction" has its points. We just need a (type-level?)
> clear-to-program-with distinction between operations that may block
> indefinitely, and operations that have particular bounds on their
> difficulty. Although, modern OSes try to balance too many things, don't
> usually make any such hard real-time guarantees, in favor of everything
> turning out more-or-less correct eventually. Back to "file abstraction"
> - well, considering the benefits of mounting remote systems as a
> filesystem. The hierarchy abstraction of the filesystem didn't keep the
> same performance characteristics... And all kinds of potential problems
> result too, when the connection breaks down!

Indeed, as I have experienced multiple times: NFS clients completely hanging for minutes, enforcing a coffee break for the whole office! Not that I would mind a coffee break now and then, but it tends to happen in the middle of an attempt to fix a critical bug in the production system...

> How do you program with all those error conditions explicitly? It is
> difficult. You need libraries to do it well - and I'm not at all sure
> whether there exist such libraries yet! I mean, programs are much too
> complicated already without infesting them with a lot of special cases.

What I would like to have is a clear distinction, apparent in the type, between actions that can be expected to terminate fast and with certainty (apart from broken hardware, that is) and others which are inherently insecure and may involve considerable or even indefinite delays. The latter should accept a continuation argument. However, there is obviously a judgement call involved here. Thus, the system should be flexible enough to allow treating the same resource either as one or the other, depending on the demands of the application. There may be situations where a name lookup can safely be treated as a synchronous operation (e.g. a script run as a cron job); in other situations one might need to regard even local bus access to some I/O card as asynchronous.

>> indefinite delays
> I can create with `someCommand | haskellProgram` too

Yes, user input as well as reading from a pipe should be handled like a network connection: call my continuation whenever input is available.

>> connection loss
> Is there a correct way to detect this?

There are many ways, typically involving some sort of beacons. Anyway, if all else fails the operation times out.

> I find it rather odd when I lose my IRC connection for a moment and
> then it comes back a moment later (Wesnoth games are worse, apparently,
> as they don't reconnect automatically). I often prefer considering them
> an indefinite delay.

Right: the user (the application) is in the best position to decide how long to wait before an operation times out.

>> network buffer overflow
> that is: too much input, not processing it fast enough? (or similar).

Yeah, though usually the other way around, i.e. too much output and the system can't send it fast enough (maybe because the other side is slow in accepting data, or because the connection is bad, or whatever).

> Memory size limitations are considerably unhandled in programs of all
> sorts, not just networked ones, though they (networked) may suffer the
> most.

It is usually not a problem with modern desktop or server systems, rather with so-called 'real-time' OSes, where everything tends to be statically allocated.

> We wish we had true unlimited-memory Turing machines :) ...this
> is possibly the most difficult issue to deal with formally. Probably
> requires limiting input data rates artificially.
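Benjamin's "extra parameter for a continuation" can be sketched in a few lines. `lookupName` and `report` below are invented stand-ins, not a real API; a real implementation would return immediately and invoke the continuation later from an event loop.

```haskell
-- A sketch of the continuation-passing interface proposed above:
-- a potentially slow operation takes an explicit continuation instead
-- of returning directly, so the possible delay is visible in the type.
-- 'lookupName' fakes an instant result; in reality the continuation
-- would fire whenever the answer arrives.
lookupName :: String -> (Either String String -> r) -> r
lookupName host k = k (Right (host ++ " resolved"))

-- the client supplies what should happen once the answer is in
report :: Either String String -> String
report (Left err)   = "lookup failed: " ++ err
report (Right addr) = addr

-- lookupName "example.org" report == "example.org resolved"
```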
Re: [Haskell-cafe] Re: Re: Explaining monads
On 8/13/07, Benjamin Franksen <[EMAIL PROTECTED]> wrote:
> Ok, I admit defeat now ;-) Monads in general /allow/ sequencing of
> (certain) effects, but it is not necessary for a monad to do so.

I'm sure I said that 300 or so posts ago :-)

Here's an example that might help make some of this explicit. Suppose I have a function f :: a -> b -> c. Now suppose I have some values "in a monad":

    x :: m a
    y :: m b

How do I lift f so I can apply it to x and y and get a value of type m c? Obviously I can't just write f x y. There's an obvious solution that comes to mind:

    do
      x' <- x
      y' <- y
      return $ f x' y'

But there's another solution:

    do
      y' <- y
      x' <- x
      return $ f x' y'

So in order to lift f I was forced to make an explicit decision about the order in which I wanted to process x and y. (I think of >>= as a kind of bottleneck that forces you to compute the arguments to f one at a time in a particular order - but there are already too many metaphors flying around so maybe you should ignore that.) Information about that decision now becomes available to the monad's implementation. If you want, you can use this information to implement 'sequencing' (whatever that means). But you're also free to ignore it. If you do ignore it then you have a commutative monad, with Reader being the best known example.

It's just like ordinary monoids. The very act of writing x*y imposes an order on x and y. You can use this to define binary operators that can make use of this ordering. But the most familiar monoid operators (addition and multiplication of course) are commutative and order doesn't matter. It'd be cool to have a notation that didn't force you to put one argument or the other first. (Well, I know of one, but it's a bit wacky: http://www.lawsofform.org/interpretations.html)
--
Dan
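Dan's two lifts can be written down and compared directly. The function monad `((->) r)` from base serves as Reader here: being commutative, it cannot tell the two orderings apart, while the list monad can.

```haskell
-- Dan's two ways of lifting f :: a -> b -> c into a monad.
liftXY, liftYX :: Monad m => (a -> b -> c) -> m a -> m b -> m c
liftXY f x y = do { x' <- x; y' <- y; return (f x' y') }
liftYX f x y = do { y' <- y; x' <- x; return (f x' y') }

-- In Reader (here the function monad ((->) Int)) both actions just
-- read the shared environment, so the chosen order is unobservable.
readerSame :: Bool
readerSame = liftXY (+) (* 2) (+ 10) 5 == liftYX (+) (* 2) (+ 10) 5

-- In the list monad the order leaks out in the order of the results.
listPairs1, listPairs2 :: [(Int, Char)]
listPairs1 = liftXY (,) [1, 2] "ab"  -- [(1,'a'),(1,'b'),(2,'a'),(2,'b')]
listPairs2 = liftYX (,) [1, 2] "ab"  -- [(1,'a'),(2,'a'),(1,'b'),(2,'b')]
```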
Re: [Haskell-cafe] Explaining monads
I made this mistake myself at first too. It seems that the Monad = "side effect machine" error is common to Haskell newbies. Probably to do with the fact that the first thing every programmer wants to do is write a hello world program and for that you need the IO Monad which requires some explanation of how a Monad can allow for side effects (at least the IO Monad). - Greg

- Original Message
From: peterv <[EMAIL PROTECTED]>
To: Kim-Ee Yeoh <[EMAIL PROTECTED]>; haskell-cafe@haskell.org
Sent: Monday, August 13, 2007 10:31:48 AM
Subject: RE: [Haskell-cafe] Explaining monads

Ronald Guida wrote:
> Given the question "What is a Monad", I would have to say "A Monad is
> a device for sequencing side-effects."

There are side-effects and there are side-effects. If the only monad you use is Maybe, the only side-effect you get is a slight warming of the CPU. "Side-effects" is a piece of linguistic cruft played fast-and-loose by too many people in this game. "Sequencing" suffers the same disease.
[Haskell-cafe] Re: Re: Explaining monads
Derek Elkins wrote:
> On Mon, 2007-08-13 at 22:29 +0200, Benjamin Franksen wrote:
>> Brian Brunswick wrote:
>>> One thing that I keep seeing people say (not you), is that monads
>>> /sequence/ side effects. This is wrong, or at least a limited
>>> picture.
>>>
>>> /All/ of the above structures are about combining compatible things
>>> together in a row.
>>> /None/ of them force any particular order of evaluation - that all
>>> comes from the particular instance. So its only a particular feature
>>> of IO that it sequences the side effects. Others don't - we can have
>>> a lazy State monad that just builds up big thunks.
>>
>> I am a bit astonished.
>>
>> Let's take the simplest example: Maybe. The effect in question is the
>> premature abortion of a computation (when Nothing is returned). And of
>> course Maybe sequences these effects, that's what you use it for: the
>> _first_ action to be encountered that returns Nothing aborts the
>> computation. Clearly sequencing goes on here.
>
> You are wrong. Proof: Let's take a simpler example: Identity. QED

I don't accept this proof. Note the wording: 'Monads sequence (certain, monad specific) effects'. Identity has no effects, ergo no sequencing has to happen. I didn't claim that /all/ monadic actions are (necessarily) sequenced.

> This also disproves David Roundy's statement that
> do x <- return 2; undefined; return (x*x) will hit bottom.
>
> Reader also does not sequence its "actions".

Ok, I admit defeat now ;-) Monads in general /allow/ sequencing of (certain) effects, but it is not necessary for a monad to do so.

> Writer is a kind of funny example.

In which way?

> Certainly, any monad instance where (>>=) needs its first argument to
> determine whether to continue, e.g. Maybe, Either, IO, Parser, Cont,
> List, Tree will clearly need to force its first argument before
> continuing, but that's just the nature of the situation.

Don't forget State: clearly it sequences actions even though it always continues. But maybe 'sequencing' is too strong a word: just as with Reader, a sequence of reads (with no writes in between) may actually happen in any order; State imposes strict order on groups of adjacent reads and all (single) writes, correct?

Ok, I think I understand where I was misled: I took the means for the end. There are many monads that impose a certain order on (some) effects; but this is done only as far as necessary to maintain X, whatever X may be; maybe X is just the monad laws? What about: The imperative way always imposes a sequencing of actions, period. Monads in Haskell (except IO, which is just imperative programming) allow us to impose ordering on effects only partially, ideally only where necessary.

Thanks for further enlightening me. Hey, I just said that sequencing is a means, not an end, but maybe even this is not necessarily true. I imagine a 'Sequence Monad' whose only effect would be to order evaluation... would that be possible? Or useful?

Cheers
Ben
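A concrete check of the Maybe point from this exchange: the first `Nothing` really does abort the computation, so a later `error` is never forced.

```haskell
-- Maybe sequences its one effect (premature abortion): the first
-- Nothing encountered stops the computation, so the error below is
-- never reached and never evaluated.
aborted :: Maybe Int
aborted = do
  x <- Just 1
  _ <- (Nothing :: Maybe ())          -- computation stops here
  return (x + error "never forced")   -- never reached

completed :: Maybe Int
completed = do
  x <- Just 1
  y <- Just 2
  return (x + y)

-- aborted == Nothing, completed == Just 3
```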
Re: [Haskell-cafe] Re: Explaining monads
On 8/13/07, David Roundy <[EMAIL PROTECTED]> wrote:
| Try executing:
|
| do { x <- return 2; undefined; return (x*x); }
|
| in any monad you like

It's not just the identity monad:

    Prelude> :m +Control.Monad.State
    Prelude Control.Monad.State> flip evalState () $ do { x <- return 2; undefined; return (x*x); }
    4

Regards,
Arie
Re: [Haskell-cafe] Re: Re: Language support for imperative code. Was: Re: monad subexpressions
On Tue, 14 Aug 2007, Benjamin Franksen wrote:
...
> I'd be careful. Introducing a network connection into the equation makes
> the object (its methods) susceptible to a whole new bunch of failure
> modes; think indefinite delays, connection loss, network buffer
> overflow, etc etc. It may be a mistake to abstract all that away; in
> fact I am convinced that the old Unix habit of sweeping all these
> failure modes and potentially long delays under a big carpet named 'file
> abstraction' was a bad idea to begin with. The ages old and still not
> solved problems with web browsers hanging indefinitely (w/o allowing any
> GUI interaction) while name resolution waits for completion is only the
> most prominent example.

Ironically, the place where all this sweeping under the carpet has caused me personally the most irritation is one of the most appealing file abstractions - remote disk filesystems, NFS.

In any case, I agree (I think) that a sophisticated user interface needs to deal with time. I think that's a key motivation for reactive object approaches. It has to be considered part of the equation, along with the rest of the I/O situation, if you're trying to reason about it that way.

Donn Cave, [EMAIL PROTECTED]
Re: [Haskell-cafe] Re: Explaining monads
On 8/13/07, David Roundy <[EMAIL PROTECTED]> wrote:
> Try executing:
>
> do { x <- return 2; undefined; return (x*x); }
>
> in any monad you like

    instance Monad M where
        return a = M a
        ~(M a) >>= f = f a

Or is that cheating?
--
Dan
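For anyone trying Dan's instance at home, here it is spelled out in full (with the `Functor` and `Applicative` instances modern GHC requires): the lazy pattern `~(M a)` means `(>>=)` never inspects its left argument, so the `undefined` is simply never looked at.

```haskell
-- Dan's "cheating" monad, made compilable.  The lazy pattern ~(M a)
-- in (>>=) means the left argument is never forced.
data M a = M a

runM :: M a -> a
runM (M a) = a

instance Functor M where
  fmap f ~(M a) = M (f a)

instance Applicative M where
  pure = M
  ~(M f) <*> ~(M a) = M (f a)

instance Monad M where
  ~(M a) >>= f = f a

example :: Int
example = runM $ do
  x <- return 2
  _ <- (undefined :: M ())  -- never forced, thanks to the lazy pattern
  return (x * x)
-- example evaluates to 4
```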
Re: [Haskell-cafe] Re: Re: Language support for imperative code. Was: Re: monad subexpressions
Benjamin Franksen wrote:
> I'd be careful. Introducing a network connection into the equation makes
> the object (its methods) susceptible to a whole new bunch of failure
> modes; think indefinite delays, connection loss, network buffer
> overflow, etc etc. It may be a mistake to abstract all that away; in
> fact I am convinced that the old Unix habit of sweeping all these
> failure modes and potentially long delays under a big carpet named 'file
> abstraction' was a bad idea to begin with. The ages old and still not
> solved problems with web browsers hanging indefinitely (w/o allowing any
> GUI interaction) while name resolution waits for completion is only the
> most prominent example.

IMO it's just a terribly stupid bug in the best web browsers. Maybe inefficient, poorly, or not-at-all-used multithreading?

"file abstraction" has its points. We just need a (type-level?) clear-to-program-with distinction between operations that may block indefinitely, and operations that have particular bounds on their difficulty. Although, modern OSes try to balance too many things, don't usually make any such hard real-time guarantees, in favor of everything turning out more-or-less correct eventually. Back to "file abstraction" - well, considering the benefits of mounting remote systems as a filesystem. The hierarchy abstraction of the filesystem didn't keep the same performance characteristics... And all kinds of potential problems result too, when the connection breaks down!

How do you program with all those error conditions explicitly? It is difficult. You need libraries to do it well - and I'm not at all sure whether there exist such libraries yet! I mean, programs are much too complicated already without infesting them with a lot of special cases.

> indefinite delays
I can create with `someCommand | haskellProgram` too

> connection loss
Is there a correct way to detect this? I find it rather odd when I lose my IRC connection for a moment and then it comes back a moment later (Wesnoth games are worse, apparently, as they don't reconnect automatically). I often prefer considering them an indefinite delay.

> network buffer overflow
that is: too much input, not processing it fast enough? (or similar). Memory size limitations are considerably unhandled in programs of all sorts, not just networked ones, though they (networked) may suffer the most. We wish we had true unlimited-memory Turing machines :) ...this is possibly the most difficult issue to deal with formally. Probably requires limiting input data rates artificially.

Isaac
RE: [Haskell-cafe] Explaining monads
On Mon, 2007-08-13 at 19:31 +0200, peterv wrote:
> When I read "side-effects", I understand it as "unwanted effects", like
> "aliasing", and "effects depending on the order of execution". I'm not
> sure if my understanding here is correct. I hope Haskell does not allow
> "side-effects" but only "effects", meaning the monads do not allow you
> to write the typical ill-behaving code you get when doing real
> imperative programming, enforcing a single wiring of execution, not
> allowing the capture of the RealWorld object. In Concurrent Clean
> special compiler support is present to enforce "uniqueness typing", and
> in Haskell special compiler support is available to make the RealWorld
> object not available at runtime (so e.g. you can't put the RealWorld
> object in a list). Is this correct?

Monads certainly do allow "side-effects" in that way. Heck, without allowing aliasing there wouldn't be much to compel the use of mutability. Aliasing is the great boon and great curse of imperative programming.

Order of execution issues are tricky. On the one hand, you must always (directly or indirectly) specify the order of execution, so technically there's no uncertainty. On the other hand, there are two distinct definitions of liftM2:

    liftM2'1 f ma mb = do a <- ma; b <- mb; return (f a b)
    -- and
    liftM2'2 f ma mb = do b <- mb; a <- ma; return (f a b)

Note that order of -evaluation- is moot (at least as much as it is for any term). Laziness does not come into it (unless you use lazy IO, in which case you get what you deserve.)

> BTW: What is the correct word in Haskell for "object"? I mean the (lazy)
> value you get when evaluating a data constructor?

What you said doesn't really make much sense, but I think the(/a) word you want is a "thunk".
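Derek's two liftM2 variants observably differ in any monad where order matters. A minimal hand-rolled Writer over `String` (a sketch, not the mtl one) makes the difference visible in the accumulated log.

```haskell
-- A minimal Writer over String, just enough to observe ordering.
data W a = W String a deriving (Eq, Show)

instance Functor W where
  fmap f (W s a) = W s (f a)
instance Applicative W where
  pure = W ""
  W s f <*> W t a = W (s ++ t) (f a)
instance Monad W where
  W s a >>= f = let W t b = f a in W (s ++ t) b

say :: String -> W ()
say s = W s ()

-- Derek's two definitions, verbatim.
liftM2'1, liftM2'2 :: Monad m => (a -> b -> c) -> m a -> m b -> m c
liftM2'1 f ma mb = do { a <- ma; b <- mb; return (f a b) }
liftM2'2 f ma mb = do { b <- mb; a <- ma; return (f a b) }

ab, ba :: W ()
ab = liftM2'1 (\_ _ -> ()) (say "a") (say "b")  -- log "ab"
ba = liftM2'2 (\_ _ -> ()) (say "a") (say "b")  -- log "ba"
```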
Re: [Haskell-cafe] A few questions on primes generating.
On 13/08/07, Pekka Karjalainen <[EMAIL PROTECTED]> wrote:
> On 8/13/07, L.Guo <[EMAIL PROTECTED]> wrote:
> > Hi All:
>
> Hello,
>
> > I am reading http://www.haskell.org/haskellwiki/Prime_numbers
> > The code in sector "1 Bitwise prime sieve".
> > I have 3 questions about it.
> >
> > 1) In function go, what does the number 46340 mean ? Is it
> > sqrt(MAX_LONG) ?
>
> Yes, it appears so. In a 32 bit implementation I get:
>
> Prelude> sqrt $ fromIntegral (maxBound :: Int)
> 46340.950001051984
>
> > 2) We have this type definition :
> > pureSieve :: Int -> Int
> > Why there is no error (type mismatch) of this call in func main :
> > pureSieve 1000
>
> If you have integer literals in your program, the compiler sees a
> fromInteger in front of them. So the value is just converted to type
> Int automatically, because that is expected here.
>
> You can give different numeric default declarations in your own
> modules. Please see sections 10.3 (for overloaded literals) and 10.4
> (for defaults) here:
> http://www.haskell.org/tutorial/numbers.html
>
> Sometimes you can get an overflow like this:
>
> Prelude> 1000 :: Int
> -159383552
>
> > 3) In main again, what does expression [| x |] mean ? Why this cannot
> > be executed in GHCi ?
>
> It's Template Haskell, and is used there for some kind of optimisation
> (I think). Template Haskell needs to be enabled with a command line
> switch for it to work. Please see the documentation for more
> information. It's section 7.6 in your User's Guide.
>
> Though in this case you can probably just remove it to try out the
> program. Perhaps someone else can explain what actual effect it has
> here.

I think it just computes a single function call to pureSieve at compile time. I believe its origin is from making a point that when you stop comparing apples to apples it's easy to cheat (this code comes from a discussion on this list where someone insisted on adding optimizations to the, admittedly naive, algorithm in C# and comparing it without making the same optimizations in Haskell -- so someone, I forget who but I'm sure a search will turn it up, wrote a quick Template Haskell "optimization" to make a point that we don't really get useful results unless we compare the same algorithms in both languages). In general you should probably ignore that TH bit and just call pureSieve normally.

--
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862
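The magic number from question 1 can be checked directly: 46340 is `floor (sqrt maxBound)` for a 32-bit `Int`, i.e. the largest value whose square still fits before overflow.

```haskell
import Data.Int (Int32)

-- 46340 is the largest n with n^2 <= maxBound for a 32-bit Int;
-- testing factors beyond it would overflow the square.
limit :: Integer
limit = floor (sqrt (fromIntegral (maxBound :: Int32)) :: Double)

nextWouldOverflow :: Bool
nextWouldOverflow =
  (limit + 1) ^ (2 :: Int) > fromIntegral (maxBound :: Int32)
-- limit == 46340, and 46341^2 no longer fits in 32 bits
```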
Re: [Haskell-cafe] Re: Explaining monads
On Mon, 2007-08-13 at 22:29 +0200, Benjamin Franksen wrote:
> Brian Brunswick wrote:
> > One thing that I keep seeing people say (not you), is that monads
> > /sequence/ side effects. This is wrong, or at least a limited picture.
> >
> > /All/ of the above structures are about combining compatible things
> > together in a row.
> > /None/ of them force any particular order of evaluation - that all
> > comes from the particular instance. So its only a particular feature
> > of IO that it sequences the side effects. Others don't - we can have
> > a lazy State monad that just builds up big thunks.
>
> I am a bit astonished.
>
> Let's take the simplest example: Maybe. The effect in question is the
> premature abortion of a computation (when Nothing is returned). And of
> course Maybe sequences these effects, that's what you use it for: the
> _first_ action to be encountered that returns Nothing aborts the
> computation. Clearly sequencing goes on here.

You are wrong. Proof: Let's take a simpler example: Identity. QED

This also disproves David Roundy's statement that

    do x <- return 2; undefined; return (x*x)

will hit bottom.

Reader also does not sequence its "actions". Writer is a kind of funny example.

Certainly, any monad instance where (>>=) needs its first argument to determine whether to continue, e.g. Maybe, Either, IO, Parser, Cont, List, Tree will clearly need to force its first argument before continuing, but that's just the nature of the situation.
Re: [Haskell-cafe] Re: Explaining monads
On 13/08/07, David Roundy <[EMAIL PROTECTED]> wrote:
> Try executing:
>
> do { x <- return 2; undefined; return (x*x); }
>
> in any monad you like, and you'll find that regardless of the *data*
> dependencies (the return value of this monadic action is unambiguous),
> the undefined is evaluated *before* the value 4 is returned.

Prelude> :m + Control.Monad.Identity
Prelude Control.Monad.Identity> runIdentity $ do { x <- return 2; undefined; return (x*x); }
Loading package mtl-1.0 ... linking ... done.
4

--
[EMAIL PROTECTED]
Re: [Haskell-cafe] Re: Explaining monads
David Roundy wrote:
> It's the *effect* of a monad, not the *side* effect. The type of >>=
> defines this dependency. And when you have a chain of dependencies,
> that is sometimes referred to as a sequence. True, it's not mystical,
> but it's still sequenced.

How can a Haskell type define a data dependency? Haskell functions are non-strict, so they may choose to ignore arguments, and ignored arguments are not evaluated at all.

> Try executing:
>
> do { x <- return 2; undefined; return (x*x); }
>
> in any monad you like, and you'll find that regardless of the *data*
> dependencies (the return value of this monadic action is unambiguous),
> the undefined is evaluated *before* the value 4 is returned.

> runIdentity $ do {x <- return 2; undefined; return (x * x) }
4

Tillmann
[Haskell-cafe] Re: Re: Explaining monads
Brandon S. Allbery KF8NH wrote:
> On Aug 13, 2007, at 16:29 , Benjamin Franksen wrote:
>> Let's take the simplest example: Maybe. The effect in question is the
>> premature abortion of a computation (when Nothing is returned). And of
>> course Maybe sequences these effects, that's what you use it for: the
>> _first_ action to be encountered that returns Nothing aborts the
>> computation. Clearly sequencing goes on here.
>
> Clearly it does, but not as a side effect of the *monad*. It's
> ordinary Haskell data dependencies at work here, not some mystical
> behavior of a monad.

I can't remember claiming that Monads have any mystic abilities. In fact, these Monads are implemented in such a way that their definition /employs/ data dependencies to enforce a certain sequencing of effects. I think that is exactly the point, isn't it?

>> What about State? The effect is reading/writing the state. Again, the
>> State Monad takes care that these effects get sequenced, and again
>> that's what you expect it to do for you.
>
> No, I expect it to carry a value around for me. If I carry that
> value around myself instead of relying on the monad to do it for me,
> *the calculation still gets sequenced by the data dependencies*.

Of course, you can unfold (i.e. inline) bind and return (or never use them in the first place). Again, nobody claimed Monads do the sequencing by employing Mystic Force (tm); almost all Monads can be implemented in plain Haskell, nevertheless they sequence certain effects -- You Could Have Invented Monads Yourself ;-)

The Monad merely captures the idiom, abstracts it and ideally implements it in a library for your convenience and for the benefit of those trying to understand what your code is supposed to achieve. She reads 'StateM ...' and immediately sees 'ah, there he wants to use some data threaded along for reading and writing'.

Cheers
Ben
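Benjamin's point - that the sequencing is implemented with ordinary data dependencies - is easy to see in a hand-rolled State (a sketch, not the mtl implementation): each bind threads the state produced by the previous action into the next one.

```haskell
-- A minimal State monad.  The "sequencing" is nothing but the data
-- dependency on s' in (>>=).
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f m = State $ \s -> let (a, s') = runState m s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  mf <*> ma = State $ \s ->
    let (f, s')  = runState mf s
        (a, s'') = runState ma s'
    in (f a, s'')

instance Monad (State s) where
  m >>= f = State $ \s ->
    let (a, s') = runState m s  -- the next action sees s', hence order
    in runState (f a) s'

tick :: State Int Int
tick = State $ \n -> (n, n + 1)

twoTicks :: (Int, Int)
twoTicks = fst $ runState (do { a <- tick; b <- tick; return (a, b) }) 0
-- == (0, 1): the second tick depends on the state the first produced
```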
[Haskell-cafe] Re: Re: Language support for imperative code. Was: Re: monad subexpressions
Brian Hulley wrote:
> Brian Hulley wrote:
>> apfelmus wrote:
>>> Brian Hulley schrieb:
>>>> main = do
>>>>     buffer <- createBuffer
>>>>     edit1 <- createEdit buffer
>>>>     edit2 <- createEdit buffer
>>>>     splitter <- createSplitter (wrapWidget edit1) (wrapWidget edit2)
>>>>     runMessageLoopWith splitter
>>>> ...
>>>> Thus the ability to abstract mutable state gives to my mind by far
>>>> the best solution.
>>>
>>> I'm not sure whether mutable state is the real goodie here. I think
>>> it's the ability to independently access parts of a compound state.
>>> http://www.st.cs.ru.nl/papers/2005/eves2005-FFormsIFL04.pdf
>>
>> This is indeed a real key to the problem.
>
> Of course this is only one aspect of the problem...
>
> Thinking about this a bit more, and just so this thought is recorded
> for posterity (!) and for the benefit of anyone now or in a few hundred
> years time, trying to solve "Fermat's last GUI", the object oriented
> solution allows the buffer object to do anything it wants, so that it
> could negotiate a network connection and implement the interface based
> on a shared network buffer for example, without needing any changes to
> the client code above, so a functional gui would need to have the same
> flexibility to compete with the OO solution.

I'd be careful. Introducing a network connection into the equation makes the object (its methods) susceptible to a whole new bunch of failure modes; think indefinite delays, connection loss, network buffer overflow, etc etc. It may be a mistake to abstract all that away; in fact I am convinced that the old Unix habit of sweeping all these failure modes and potentially long delays under a big carpet named 'file abstraction' was a bad idea to begin with. The ages old and still not solved problems with web browsers hanging indefinitely (w/o allowing any GUI interaction) while name resolution waits for completion is only the most prominent example.

Cheers
Ben
Re: [Haskell-cafe] Re: Language support for imperative code. Was: Re: monad subexpressions
Brian Hulley wrote: Thinking about this a bit more, and just so this thought is recorded for posterity (!) and for the benefit of anyone now or in a few hundred years time, trying to solve "Fermat's last GUI", the object oriented solution allows the buffer object to do anything it wants, so that it could negotiate a network connection and implement the interface based on a shared network buffer for example, without needing any changes to the client code above, so a functional gui would need to have the same flexibility to compete with the OO solution. Probably it would be parametric in the input mechanism, somehow. (A Haskell approach might use type classes, slightly obscuring the parametricity..) Another thing that would be interesting would be to have a formal treatment of what is supposed to happen in a gui. For example, when you move the mouse over a control which has become dirty (ie needs to be re-rendered because its state is now inconsistent), what should the control do? Should it respond as if the new state were already visible to the user, or should it interpret the mouse position according to the last state that was rendered, or should it just ignore all mouse events until the next time it gets rendered? This is not a trivial question because you could imagine an animated control where the user might naturally be following the movement, whereas when the user clicks on a cell in a spreadsheet when the cells to the left have now expanded due to a change in data thus moving the cell along (but where this updated model has not yet been re-rendered) the user might be irritated at the wrong cell being selected... It's tricky little issues like this that I haven't found any documentation for anywhere, and which would make a proper mathematical treatment of interaction with a gui very useful, regardless of whether it is implemented in OOP or functional style. 
Jef Raskin (the late interface designer, author of _The Humane Interface_) would probably say that anything with such importance to user decisions should be rendered within a tenth of a second. Computers fifteen years ago could sometimes do it! Fancy details can be filled in later if it takes that long. Of course that completely dodges the mathematical question... in which human response time should really be taken into account too! Humans really are not like machines and are not all alike either! Oh no, do we need psychological formalisms? Firefox suffers the above problems badly, alas - the "Stop" button is half useless because it takes such a long time to even notice you pressed it, etc... Reading up on user interface design principles, as well as thinking functionally, is probably a useful approach - although not everything that you read will agree or be right. GUIs as a whole are very complicated - it is quite arguable that they are simply the wrong interface - however, some of the world's people are fortunate enough to be accustomed to them already, which complicates matters considerably. Isaac ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Explaining monads
On Mon, Aug 13, 2007 at 05:13:01PM -0400, Brandon S. Allbery KF8NH wrote: > On Aug 13, 2007, at 16:29 , Benjamin Franksen wrote: > >Let's take the simplest example: Maybe. The effect in question is the > >premature abortion of a computation (when Nothing is returned). And of > >course Maybe sequences these effects, that's what you use it for: the > >_first_ action to be encountered that returns Nothing aborts the > >computation. Clearly sequencing goes on here. > > Clearly it does, but not as a side effect of the *monad*. It's ordinary > Haskell data dependencies at work here, not some mystical behavior of a > monad. It's the *effect* of a monad, not the *side* effect. The type of >>= defines this dependency. And when you have a chain of dependencies, that is sometimes referred to as a sequence. True, it's not mystical, but it's still sequenced. Try executing: do { x <- return 2; undefined; return (x*x); } in any monad you like, and you'll find that regardless of the *data* dependencies (the return value of this monadic action is unambiguous), the undefined is evaluated *before* the value 4 is returned. -- David Roundy Department of Physics Oregon State University ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
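For what it's worth, the example above can be written out concretely in the Maybe monad (a sketch only; Maybe is chosen because its (>>=) is easy to inspect, and other monads may behave differently): even though the final result never uses the middle action, (>>=) pattern-matches on it, so the undefined is forced.

```haskell
import Control.Exception (SomeException, evaluate, try)

-- Desugaring of: do { x <- return 2; undefined; return (x*x) } in Maybe.
-- Maybe's (>>=) scrutinizes its left argument, so the undefined in the
-- middle is forced even though the final value does not depend on it.
action :: Maybe Int
action = return 2 >>= \x -> undefined >>= \_ -> return (x * x)

main :: IO ()
main = do
  r <- try (evaluate action) :: IO (Either SomeException (Maybe Int))
  putStrLn $ case r of
    Left _  -> "undefined was forced before 4 could be returned"
    Right v -> "no exception: " ++ show v
```

Running this takes the Left branch: the chain of (>>=) applications, not any data dependency on the result, imposes the order in which the actions are inspected.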
Re: [Haskell-cafe] A few questions on primes generating.
Andrew Coppin wrote: Stefan O'Rear wrote: Also, large numbers don't (this is arguably a bug...) have restricted types: [EMAIL PROTECTED]:~$ ghc -e '100 :: Int' -1486618624 So many other programming languages allow weird things to happen with numeric overflows... it would be nice if Haskell didn't. Shall we have a GHC warning when it can detect such cases, checking against either the bounds in the Haskell report or those of the target GHC is currently compiling for? (Of course someone would have to implement it, there would always be cases it didn't catch, and the (non-portable) wrap-around behavior is sometimes desired...) Hugs often does throw an exception (runtime error) rather than allow Int overflow (but not always). Isaac ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] out of core computing in haskell
In the second case, if your numerical data is just a big array, why not just mmap it? Unless you're running on a 32 bit machine, or have a *seriously* large amount of data, that seems like the easiest option. Although if you're talking about mutable data, that's a whole 'nother can of worms... but that's true even if your data fits into memory. You could look at Data.ByteString to get some idea as to how to wrap a pure interface around a disk-backed chunk of read-only memory. Or just read about the ffi, it's not too hard to figure out. David On Mon, Aug 13, 2007 at 01:27:03PM -0700, Carter T Schonwald wrote: > The main two use cases I have in mind are > 1) really really really big abstract syntax trees or proof trees (a la > compilers or theorem provers) > 2) random access to numerical data > > does that help clarify what i'm asking about? In each case, is there a > standard way of dealing with this? > in the case of (1) the sensible way seems to me to do some sort of zipper > representation which loads adjacent nodes to some depth surrounding your > current location in the tree, and in the case of (2) it seems like the > sensible way would be load subsequences of the data into memory. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Language support for imperative code. Was: Re: monad subexpressions
Brian Hulley wrote:
> apfelmus wrote:
>> Brian Hulley schrieb:
>>> main = do
>>>   buffer <- createBuffer
>>>   edit1 <- createEdit buffer
>>>   edit2 <- createEdit buffer
>>>   splitter <- createSplitter (wrapWidget edit1) (wrapWidget edit2)
>>>   runMessageLoopWith splitter
>>> ...
>>> Thus the ability to abstract mutable state gives to my mind by far the best solution.
>>
>> I'm not sure whether mutable state is the real goodie here. I think it's the ability to independently access parts of a compound state. http://www.st.cs.ru.nl/papers/2005/eves2005-FFormsIFL04.pdf
>
> This is indeed a real key to the problem.

Of course this is only one aspect of the problem...

Thinking about this a bit more, and just so this thought is recorded for posterity (!) and for the benefit of anyone now or in a few hundred years time, trying to solve "Fermat's last GUI", the object oriented solution allows the buffer object to do anything it wants, so that it could negotiate a network connection and implement the interface based on a shared network buffer for example, without needing any changes to the client code above, so a functional gui would need to have the same flexibility to compete with the OO solution. Another thing that would be interesting would be to have a formal treatment of what is supposed to happen in a gui. For example, when you move the mouse over a control which has become dirty (ie needs to be re-rendered because its state is now inconsistent), what should the control do? Should it respond as if the new state were already visible to the user, or should it interpret the mouse position according to the last state that was rendered, or should it just ignore all mouse events until the next time it gets rendered?
This is not a trivial question because you could imagine an animated control where the user might naturally be following the movement, whereas when the user clicks on a cell in a spreadsheet when the cells to the left have now expanded due to a change in data thus moving the cell along (but where this updated model has not yet been re-rendered) the user might be irritated at the wrong cell being selected... It's tricky little issues like this that I haven't found any documentation for anywhere, and which would make a proper mathematical treatment of interaction with a gui very useful, regardless of whether it is implemented in OOP or functional style. Anyway just a thought, Brian. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Help with a project design
Andrea Rossato wrote: > The task this library should do is simple: given an xml object > (representing a bibliographic reference), render it with rules stored > in a different xml object (the citation style). While I think I can > find solutions for this problem - the rendering -, what I find > difficult is the design of the reference xml objects. > > Bibliographic entries have different types, which must be rendered > differently. These types can be classified into 3 main classes (books, > articles, parts of a book) that can be rendered with the same methods. > That seems to fit Haskell perfectly. > > Now, I basically see 2 approaches: > > 1. create some data structures (most parts of which are common) to map >different types of bibliographic entries, and create the needed >classes with the render methods; > > 2. keep the xml objects as xml and create an abstract interface to the >xml objects to get the data required for rendering and classifying >the xml objects. This way I would have to: >- create data types to store different types of xml objects (data > Book = Book XmlTree, data Article, etc.): these data types > represent my reference classes; >- create a class of 'render'-able types with the render method and > define the instances; >- create an existential type to set the type of the xml objects > with some kind of setType :: XmlTree -> ExistentialContainer I may not be overly qualified (and experienced with Haskell) to give you advice, so take what follows with caution. I would definitely prefer choice 1 over 2. I think it is very important to design the data structure independently of any external representation of that data. XML is a fine way to externally represent data, but this should not influence your choice of data structure. I'd rather keep the possibility of alternative representations in the back of my head, and make the data structure general enough that they could be added w/o disrupting your main algorithms.
Abstraction can be added later; if you find that you need to maintain invariants for your bibliographic data that cannot be easily expressed in the type itself, then you might consider making your data type abstract, i.e. put it into a module of its own and export only an API. > I think that the first approach is not abstract enough and requires a > lot of boilerplate code to translate into a Haskell type a specific > type of bibliographic entry. A certain amount of boilerplate may be unavoidable. I never found this to be a serious obstacle, but again that may be due to my limited experience. It is a bit tedious to write but OTOH may even serve you as a 'finger exercise'. If it really gets out of hand, 'scrap' it in some way ;) I recommend the Uniplate approach because it is very easy to understand, performs well, and requires the least amount of extensions. OK, you have been warned... Cheers Ben ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
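To make choice 1 concrete, here is a minimal sketch of what the data types and render class might look like. Every name here is invented for illustration — none of it is taken from Andrea's actual code, and a real library would carry many more fields.

```haskell
-- Fields shared by all entry types.
data Common = Common
  { author :: String
  , title  :: String
  , year   :: Int
  }

-- One constructor per class of bibliographic entry.
data Entry
  = Book    Common String         -- publisher
  | Article Common String Int     -- journal, page
  | InBook  Common String         -- title of the containing book

-- The render class from approach 1; a real citation style would be
-- a parameter here rather than hard-coded formatting.
class Render a where
  render :: a -> String

instance Render Entry where
  render (Book c pub) =
    author c ++ ", " ++ title c ++ " (" ++ pub ++ ", " ++ show (year c) ++ ")"
  render (Article c jnl pg) =
    author c ++ ", \"" ++ title c ++ "\", " ++ jnl ++ ", p. " ++ show pg
  render (InBook c book) =
    author c ++ ", \"" ++ title c ++ "\", in " ++ book

main :: IO ()
main = putStrLn (render (Book (Common "Bird" "Introduction to FP" 1998) "Prentice Hall"))
```

The boilerplate Andrea worries about is the code that walks the XML and fills in Common and Entry; the rendering side stays small.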
Re: [Haskell-cafe] Re: Explaining monads
On Aug 13, 2007, at 16:29 , Benjamin Franksen wrote:
> Let's take the simplest example: Maybe. The effect in question is the premature abortion of a computation (when Nothing is returned). And of course Maybe sequences these effects, that's what you use it for: the _first_ action to be encountered that returns Nothing aborts the computation. Clearly sequencing goes on here.

Clearly it does, but not as a side effect of the *monad*. It's ordinary Haskell data dependencies at work here, not some mystical behavior of a monad.

> What about State? The effect is reading/writing the state. Again, the State Monad takes care that these effects get sequenced, and again that's what you expect it to do for you.

No, I expect it to carry a value around for me. If I carry that value around myself instead of relying on the monad to do it for me, *the calculation still gets sequenced by the data dependencies*.

-- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED] system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED] electrical and computer engineering, carnegie mellon university KF8NH ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: Explaining monads
Brian Brunswick wrote: > One thing that I keep seeing people say (not you), is that monads /sequence/ > side effects. This is wrong, or at > least a limited picture. > > /All/ of the above structures are about combining compatible things things > together in a row. > /None/ of them force any particular order of evaluation - that all comes > from the particular instance. So its > only a particular feature of IO that it sequences the side effects. Others > don't - we can have a lazy State > monad that just builds up big thunks. I am a bit astonished. Let's take the simplest example: Maybe. The effect in question is the premature abortion of a computation (when Nothing is returned). And of course Maybe sequences these effects, that's what you use it for: the _first_ action to be encountered that returns Nothing aborts the computation. Clearly sequencing goes on here. Similar with the Error Monad (i.e. Either Err, for some Err type). I won't talk about List Monad because I always had difficulty understanding the List Monad. What about State? The effect is reading/writing the state. Again, the State Monad takes care that these effects get sequenced, and again that's what you expect it to do for you. And so on... This is -- of course -- not a proof, so maybe there /are/ Monads that don't sequence (their) effects. I'd be most interested to see an example, if there is one, to bring myself nearer to the -- unattainable -- goal of full enlightenment wrt Monads. Cheers Ben ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
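The Maybe behavior described above can be sketched in a few lines ('safeDiv' and 'calc' are made-up helpers for illustration): the first action in the chain to return Nothing aborts everything after it.

```haskell
-- Division that signals failure instead of crashing.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

-- A chain of Maybe actions: the first Nothing short-circuits the rest.
calc :: Int -> Maybe Int
calc n = do
  a <- safeDiv 100 n     -- aborts the whole chain when n == 0
  b <- safeDiv a 2       -- never reached in that case
  return (a + b)

main :: IO ()
main = do
  print (calc 5)   -- Just 30
  print (calc 0)   -- Nothing: the later actions never run
```

Whether one calls this "sequencing" or "a data dependency threaded through (>>=)" is exactly the terminological disagreement in this thread; the observable behavior is the same either way.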
Re: [Haskell-cafe] out of core computing in haskell
The main two use cases I have in mind are 1) really really really big abstract syntax trees or proof trees (a la compilers or theorem provers) 2) random access to numerical data does that help clarify what i'm asking about? In each case, is there a standard way of dealing with this? in the case of (1) the sensible way seems to me to do some sort of zipper representation which loads adjacent nodes to some depth surrounding your current location in the tree, and in the case of (2) it seems like the sensible way would be to load subsequences of the data into memory. - Original Message From: Andrew Coppin <[EMAIL PROTECTED]> To: haskell-cafe@haskell.org Sent: Monday, August 13, 2007 3:55:57 PM Subject: Re: [Haskell-cafe] out of core computing in haskell Carter T Schonwald wrote: > Hello Everyone, > I'm not quite sure if I'm posing this question correctly, but what > facilities currently exist in Haskell to nicely deal with > data structures that won't fit within a given machine's RAM? > And if there are no such facilities, what would it take to fix that? If you just want to process a big chunk of data from one end to the other without loading it all into RAM at once... that's fairly easy. Haskell is a lazy language. By playing with functions such as getContents, you can automatically load the data as you access it. No tricky programming required. If you're asking for something more specific -- e.g., "how do I talk to a standard database like Oracle / MySql / et al.", there are a couple of libraries for that. (Unfortunately, no one standard one.) See Stefan's answer. If you'd like to be more specific about what you'd like to do... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Dynamic thread management?
On Aug 13, 2007, at 2:53 PM, Mitar wrote: Hi! I am thinking about a model where you would have only n threads on a n-core (or processor) machine. They would be your worker threads and you would spawn them only once (at the beginning of the program) and then just delegate work between them. On 8/13/07, Jan-Willem Maessen <[EMAIL PROTECTED]> wrote: The problem here is that while Cilk spawns are incredibly cheap, they're still more than a simple procedure call (2-10x as expensive if my fading memory serves me rightly). Let's imagine we have a nice, parallelizable computation that we've expressed using recursive subdivision (the Cilk folks like to use matrix multiplication as an example here). Near the leaves of that computation we still spend the majority of our time paying the overhead of spawning. So we end up actually imposing a depth bound, and writing two versions of our computation---the one that spawns, which we use for coarse-grained computations, and the one that doesn't, which we use when computation gets fine-grained. It makes a really big difference in practice. But this could be done at the runtime too. If the lazy-non-evaluated-yet chunk is "big" then divide it into a few parts and run each part in its thread. But if the chunk is small (you are at the end of the evaluation and you already evaluated necessary subexpressions) you do it in the thread which encounters this situation (which is probably near the end of the program or the end of the monadic IO action). I didn't make my point very well. The hard part is determining exactly when a chunk is "big" or "small" without actually computing its value. Consider recursive fib (which I use only because it's easy to understand, and has been used as an example of this problem by the Cilk implementors): fib n = if n <= 1 then n else fib (n-1) + fib (n-2) Imagine we're computing (fib 30). We can divide and conquer; plenty of parallelism there! 
But do most calls to fib represent enough work to justify running them in parallel? No, because most of the calls are to (fib 0) or (fib 1)! We should only pay the spawn cost up to a certain bound --- say n >= 5 --- and then run serially for smaller n. This has a dramatic effect on how fast fib runs, but of course the best possible choice of bound is going to be machine-dependent. We can instrument our program and have some chance of doing OK for fib :: Int -> Int; it's not at all obvious what to do for: myFunction :: [Int] -> (Int -> Bool) -> [Frob] In effect, I end up needing to write myFunction with a built-in bound on computation, and I need to do it in such a way that the underlying systems knows that one branch should be serial and the other branch parallel. This is the problem I was attempting to allude to above. Yes, we can decide on a function-by-function or callsite-by-callsite basis that "enough work" is being done to justify parallelism. But the answer to this question is often "maybe", or even "no" (as in fib). Yes, you have parMap but the problem I saw with it (and please correct me if I am wrong) is that it spawns a new thread for every application of the function to the element? But what if functions are small? Then this is quite an overhead. And you have to know this in advance if you want to use something else than the default parMap which is not always possible (if we are passing a function as an argument to the function which calls map). For example: calculate f xs = foldl (+) 0 $ map f xs -- or foldr, I am not sure You seem to be arguing that we can pick the right implementation of "map" and "fold" (serial or parallel) here if we only knew that xs was "big enough" and f "expensive enough". I agree. But that begs the question: let's assume "calculate" is a function that's called from numerous places, with a mixture of "big" and "small" arguments. 
Now I need two versions of calculate, and I need to decide at each call site whether to call "big calculate" or "small calculate". We also need to make sure any eagerness we introduce is semantically sound, too, but I think we've got a pretty good handle on that part in practice between my work on resource- bounded evaluation in Eager Haskell, Rob Ennals' work on eager evaluation in GHC, and the Singh & Harris paper. That isn't to say that any of this is impossible---but it's going to take a while to figure out how to get it right [it'd be a great Ph.D. project to even knock off the low-hanging fruit like fib and recursive matrix multiply]. Meanwhile, we also need to work hard educating programmers how to write code that'll run in parallel in the first place, and giving them the libraries that'll make it easy. -Jan-Willem Maessen ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
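The depth-bound idea under discussion can be sketched directly in Haskell with par/pseq (this needs the Control.Parallel module from GHC's parallel support; the cutoff of 20 is an arbitrary, machine-dependent guess — which is precisely the problem Jan-Willem is pointing at):

```haskell
import Control.Parallel (par, pseq)

-- Spawn sparks only while n is large; fall back to the plain serial
-- version below the cutoff, exactly like the two-version Cilk idiom.
fib :: Int -> Integer
fib n = go n
  where
    cutoff = 20
    go k
      | k <= 1     = fromIntegral k
      | k < cutoff = serial k
      | otherwise  = x `par` (y `pseq` (x + y))
      where
        x = go (k - 1)
        y = go (k - 2)
    serial k
      | k <= 1    = fromIntegral k
      | otherwise = serial (k - 1) + serial (k - 2)

main :: IO ()
main = print (fib 30)   -- 832040
```

Compiled with -threaded and run with +RTS -N, the spawning version is only used near the root of the call tree; everything below the cutoff pays no spark overhead at all.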
Re: [Haskell-cafe] out of core computing in haskell
Carter T Schonwald wrote: Hello Everyone, I'm not quite sure if I'm posing this question correctly, but what facilities currently exist in Haskell to nicely deal with data structures that won't fit within a given machine's RAM? And if there are no such facilities, what would it take to fix that? If you just want to process a big chunk of data from one end to the other without loading it all into RAM at once... that's fairly easy. Haskell is a lazy language. By playing with functions such as getContents, you can automatically load the data as you access it. No tricky programming required. If you're asking for something more specific -- e.g., "how do I talk to a standard database like Oracle / MySql / et al.", there are a couple of libraries for that. (Unfortunately, no one standard one.) See Stefan's answer. If you'd like to be more specific about what you'd like to do... ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
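A small sketch of the lazy-streaming point made above: because readFile (like getContents) is lazy, counting the lines of a large file runs in roughly constant space — the file is consumed as the count demands it. The file name and size here are toy stand-ins.

```haskell
main :: IO ()
main = do
  -- toy stand-in for a file too large to hold comfortably in RAM
  writeFile "big.txt" (unlines (map show [1 .. 100000 :: Int]))
  contents <- readFile "big.txt"       -- lazy: read on demand
  print (length (lines contents))      -- consumes the file chunk by chunk
```

The usual caveats of lazy IO apply (handles held open until the string is fully consumed, read errors surfacing at evaluation time rather than at the readFile call), which is part of why the other replies in this thread reach for explicit chunking instead.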
Re: [Haskell-cafe] out of core computing in haskell
On Mon, Aug 13, 2007 at 12:29:25PM -0700, Carter T Schonwald wrote: > Hello Everyone, > I'm not quite sure if I'm posing this question correctly, but what > facilities currently exist in Haskell to nicely deal with > data structures that won't fit within a given machine's RAM? And if > there are no such facilities, what would it take to fix that? You're asking correctly, and according to the Hackage list: http://hackage.haskell.org/packages/archive/pkg-list.html we have: * anydbm library and program: Interface for DBM-like database systems * BerkeleyDB library: Bindings for Berkeley DB v1.x * haskelldb library: SQL unwrapper for Haskell. * HDBC library * hsql library * PostgreSQL library: Thin wrapper over the C postgresql library We also have Data.Binary and Data.ByteString, but those are more useful for building data stores than as data stores themselves. Stefan ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] out of core computing in haskell
Hello Everyone, I'm not quite sure if I'm posing this question correctly, but what facilities currently exist in Haskell to nicely deal with data structures that won't fit within a given machine's RAM? And if there are no such facilities, what would it take to fix that? thanks -Carter ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] A few questions on primes generating.
Stefan O'Rear wrote: Also, large numbers don't (this is arguably a bug...) have restricted types: [EMAIL PROTECTED]:~$ ghc -e '100 :: Int' -1486618624 So many other programming languages allow weird things to happen with numeric overflows... it would be nice if Haskell didn't. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
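The silent wrap-around can be seen without any large literals (assuming GHC's fixed-width Int; the Haskell report only requires Int to cover at least the range [-2^29, 2^29-1], so the exact width is implementation-defined):

```haskell
main :: IO ()
main = do
  print (maxBound :: Int)                            -- platform-dependent
  print ((maxBound :: Int) + 1)                      -- silently wraps around
  print ((maxBound :: Int) + 1 == (minBound :: Int))
```

With GHC the last line prints True: overflow wraps to minBound rather than raising an error, which is exactly the behavior being complained about here. Switching to Integer avoids the problem at some cost in speed.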
Re: [Haskell-cafe] Dynamic thread management?
Hi! I am thinking about a model where you would have only n threads on an n-core (or processor) machine. They would be your worker threads and you would spawn them only once (at the beginning of the program) and then just delegate work between them. On 8/13/07, Jan-Willem Maessen <[EMAIL PROTECTED]> wrote: > The problem here is that while Cilk spawns are incredibly cheap, > they're still more than a simple procedure call (2-10x as expensive > if my fading memory serves me rightly). Let's imagine we have a > nice, parallelizable computation that we've expressed using recursive > subdivision (the Cilk folks like to use matrix multiplication as an > example here). Near the leaves of that computation we still spend > the majority of our time paying the overhead of spawning. So we end > up actually imposing a depth bound, and writing two versions of our > computation---the one that spawns, which we use for coarse-grained > computations, and the one that doesn't, which we use when computation > gets fine-grained. It makes a really big difference in practice. But this could be done at runtime too. If the lazy, not-yet-evaluated chunk is "big" then divide it into a few parts and run each part in its own thread. But if the chunk is small (you are at the end of the evaluation and you already evaluated the necessary subexpressions) you do it in the thread which encounters this situation (which is probably near the end of the program or the end of the monadic IO action). And this threshold for choosing whether or not to delegate could be determined at runtime too. In combination with some transactional memory or some other trick behind this delegation, this could probably be made possible. We could also hint to the runtime that a function will probably take a long time to compute (even if it is lazy) by making a type for such functions which would signal this. Of course this could also make things worse if used improperly.
But sometimes you know that you will be running a map of a time-consuming function. Yes, you have parMap, but the problem I saw with it (and please correct me if I am wrong) is that it spawns a new thread for every application of the function to an element. But what if the functions are small? Then this is quite an overhead. And you have to know this in advance if you want to use something other than the default parMap, which is not always possible (if we are passing a function as an argument to the function which calls map). For example: calculate f xs = foldl (+) 0 $ map f xs -- or foldr, I am not sure And I would like to see something like this: it gets to the point when we need to evaluate this function call, for some "big" f and some "big" list of xs, so the thread which gets to it starts evaluating the first value, and when it starts with another (it is a recursive call so it is a "similar" evaluation) it sees that the other thread is empty and the function would probably take a long time (it speculates), so it delegates it there and continues with the third element via a dummy recursive call to the function, in this case foldl (dummy because it did not really evaluate everything at the previous level). Now, if the second thread finishes first, it goes to the next element (recursive call) but sees that it is already (still) evaluating, so it gets to the fourth. Otherwise, if the first thread finishes first, it just goes to the next element. This would be some kind of speculative evaluation. If CPUs know how to do that, why couldn't we at a higher level? It would also be an interesting lazy evaluation of the, in this example, foldl function. The system would make a recursive call but would just go another call deeper if it finds that it is impossible (because it is still evaluating somewhere else) to evaluate it. And at every level it finishes, it will check previous levels to see if it can collapse them (and maybe prematurely finish the whole thing).
It would be as if I unwound the loop (in this case recursion), evaluated everything (on many threads) and then (tried to) calculate the value. If it finishes prematurely ... sad, but for "big" lists and "big" functions it would be a net win. And the size of this window (unwinding) would be a heuristic, speculative function of the number of available threads (cores, processors) and of how long previous evaluations of the same function have taken. I really messed this explanation up. Or maybe it is completely wrong and this is why it looks like a mess to me. :-) Mitar ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
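Short of the adaptive runtime scheme Mitar sketches, today's Strategies API already offers a manual middle ground for the parMap overhead he describes: chunking, so there is one spark per batch of elements instead of one per element. This is only a sketch — the chunk size of 1000 is a guess, and it requires the parallel package's modern Control.Parallel.Strategies interface:

```haskell
import Control.Parallel.Strategies (parListChunk, rseq, using)

-- One spark per 1000-element chunk rather than one per element,
-- amortizing the spawn overhead that makes naive parMap costly
-- for cheap functions.
calculate :: (Int -> Int) -> [Int] -> Int
calculate f xs = sum (map f xs `using` parListChunk 1000 rseq)

main :: IO ()
main = print (calculate (\x -> x * x) [1 .. 10000])
```

The chunk size plays the same role as the fib cutoff earlier in the thread: a static, machine-dependent guess standing in for the runtime speculation Mitar would like the system to do automatically.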
Re: [Haskell-cafe] GHC optimisations
Stefan O'Rear wrote: On Mon, Aug 13, 2007 at 07:35:31PM +0100, Andrew Coppin wrote: (I once compiled a program that used the GHC API. The final binary was several times larger than ghc.exe...) GHC is a particularly bad case because what it does is determined by the settings of a bunch of switches in the configuration data. Of course, GHC isn't smart enough to perform inter-module control flow analysis, so even with -split-objs you'd probably still link most of GHC. Is it likely that GHC will ever become "smart enough" to do that kind of analysis? (I imagine this is going to be especially fun with precompiled libraries...) This is called the Constructed Product Return (CPR) analysis, and it applies to all types with one constructor (in archaic jargon, product types). Right. So it doesn't have to have strict fields or anything? Just has to have exactly one constructor? Yep, CPR is completely independent of strictness, however the returned product must be *new*, since returning an old object by value risks losing sharing (and thus creating large memory leaks). Right, OK. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC optimisations
On Mon, Aug 13, 2007 at 07:35:31PM +0100, Andrew Coppin wrote: > Not related to optimisation, but... is there some switch to warn you if > something gets removed? (Presumably this means you forgot to export > something, or you haven't coded the bit that calls it yet or something.) -fwarn-unused-binds (included in -Wall) > (I once compiled a program that used the GHC API. The final binary was > several times larger than ghc.exe...) GHC is a particularly bad case because what it does is determined by the settings of a bunch of switches in the configuration data. Of course, GHC isn't smart enough to perform inter-module control flow analysis, so even with -split-objs you'd probably still link most of GHC. >>> I read somewhere that if one function returns a tuple, and the caller >>> immediately extracts the values from the tuple, GHC tries to optimise >>> away the tuple - so it's as if the function can just return multiple >>> values at once. Is this true? Does it apply only to tuples, or to all >>> types? >> >> This is called the Constructed Product Return (CPR) analysis, and it >> applies to all types with one constructor (in archaic jargon, product >> types). > > Right. So it doesn't have to have strict fields or anything? Just has to > have exactly one constructor? Yep, CPR is completely independent of strictness; however, the returned product must be *new*, since returning an old object by value risks losing sharing (and thus creating large memory leaks). Stefan ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC optimisations
Stefan O'Rear wrote: On Mon, Aug 13, 2007 at 07:05:03PM +0100, Andrew Coppin wrote: Does GHC do dead code elimination? Yes - all unused let-binders are removed. Not related to optimisation, but... is there some switch to warn you if something gets removed? (Presumably this means you forgot to export something, or you haven't coded the bit that calls it yet or something.) Will the generated executable contain the code for g and h, even though they aren't called? Or does the unused code get eliminated? How about the standard libraries? By default, GHC generates one object file per source module, so if any values from that module are used, the entire module is linked in. Right, OK. I had a feeling that was the case. (I once compiled a program that used the GHC API. The final binary was several times larger than ghc.exe...) With the -split-objs flag, GHC generates one object file per top-level function, sometimes significantly reducing the size of executables. This is not the default because it puts tremendous strain on the linker. It is common to spend several *minutes* linking base when compiling ghc, and it's not unheard of for ar to simply run out of memory and die. Ouch! o_O Probably simpler for me, as a human, to split the program into a sensible module structure... ;-) I read somewhere that if one function returns a tuple, and the caller immediately extracts the values from the tuple, GHC tries to optimise away the tuple - so it's as if the function can just return multiple values at once. Is this true? Does it apply only to tuples, or to all types? This is called the Constructed Product Return (CPR) analysis, and it applies to all types with one constructor (in archaic jargon, product types). Right. So it doesn't have to have strict fields or anything? Just has to have exactly one constructor? ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC optimisations
On Mon, Aug 13, 2007 at 07:05:03PM +0100, Andrew Coppin wrote: > Does GHC do dead code elimination? Yes - all unused let-binders are removed. > I observe that if you take a module and edit it so that some function is > now no longer exported or called by any exported function, the size of the > *.o file seems to go down. This suggests that dead code within a single > module is eliminated. > > However... how about this? > > module Foo where > > f = ... > g = ... > h = ... > > module Main where > > import Foo > > main = ...f... > > Will the generated executable contain the code for g and h, even though > they aren't called? Or does the unused code get eliminated? How about the > standard libraries? By default, GHC generates one object file per source module, so if any values from that module are used, the entire module is linked in. With the -split-objs flag, GHC generates one object file per top level function, sometimes significantly reducing the size of executables. This is not the default because it puts tremendous strain on the linker. It is common to spend several *minutes* linking base when compiling ghc, and it's not unheard of for ar to simply run out of memory and die. > I read somewhere that if one function returns a tuple, and the caller > immediately extracts the values from the tuple, GHC tries to optimise away > the tuple - so it's as if the function can just return multiple values at > once. Is this true? Does it apply only to tuples, or to all types? This is called the Constructed Product Return (CPR) analysis, and it applies to all types with one constructor (in archaic jargon, product types). For instance, (+) (I# x#) (I# y#) = I# (x# +# y#) foo = case 2 + 2 of I# x# -> ... is transformed into: zdzp (I# x#) (I# y#) = (x# +# y#) foo = case 2 + 2 of x# -> ... This is one of the two main pillars of arithmetic unboxing, alongside and equally important to strictness analysis.
Stefan
Re: [Haskell-cafe] Re: zip3, zip4 ... -> zipn?
On 8/11/07, Per Vognsen <[EMAIL PROTECTED]> wrote: > > Applicative functors can indeed help: > > (,,,) <$> [1,2,3] <*> [-1,0,1] <*> [1,1,1] <*> [0,2,6] > > You just use n-1 commas when you want the effect of zipn. Actually, that's not quite right, since that uses the applicative functor related to the list monad (so you get a list of 4-tuples of all possible combinations, rather than a 4-way zip). To get the zip behavior, you need to add a ZipList constructor in front of all the lists, and then apply getZipList at the end to extract. -Brent
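To make Brent's point concrete, here is a minimal runnable sketch contrasting the two Applicative behaviours (the helper name zip4' is made up for illustration):

```haskell
import Control.Applicative (ZipList (..))

-- list Applicative: all combinations (the behaviour Brent warns about)
combos :: [(Int, Int)]
combos = (,) <$> [1, 2] <*> [10, 20]

-- ZipList Applicative: positional zipping, i.e. an n-ary zip
zip4' :: [a] -> [b] -> [c] -> [d] -> [(a, b, c, d)]
zip4' as bs cs ds =
  getZipList ((,,,) <$> ZipList as <*> ZipList bs <*> ZipList cs <*> ZipList ds)

main :: IO ()
main = do
  print combos
  print (zip4' [1,2,3] [-1,0,1] [1,1,1] [0,2,6 :: Int])
```

Running it shows the list instance producing 2*2 pairs while the ZipList version behaves exactly like a zip4.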
[Haskell-cafe] GHC optimisations
Does GHC do dead code elimination? I observe that if you take a module and edit it so that some function is now no longer exported or called by any exported function, the size of the *.o file seems to go down. This suggests that dead code within a single module is eliminated. However... how about this? module Foo where f = ... g = ... h = ... module Main where import Foo main = ...f... Will the generated executable contain the code for g and h, even though they aren't called? Or does the unused code get eliminated? How about the standard libraries? I read somewhere that if one function returns a tuple, and the caller immediately extracts the values from the tuple, GHC tries to optimise away the tuple - so it's as if the function can just return multiple values at once. Is this true? Does it apply only to tuples, or to all types?
RE: [Haskell-cafe] Explaining monads
When I read "side-effects", I understand it as "unwanted effects", like "aliasing", and "effects depending on the order of execution". I'm not sure if my understanding here is correct. I hope Haskell does not allow "side-effects" but only "effects", meaning the monads do not allow you to write the typical ill-behaving code you get when doing real imperative programming, enforcing a single wiring of execution, not allowing the capture of the RealWorld object. In Concurrent Clean special compiler support is present to enforce "uniqueness typing", and in Haskell special compiler support is available to make the RealWorld object not available at runtime (so e.g. you can't put the RealWorld object in a list). Is this correct? BTW: What is the correct word in Haskell for "object"? I mean the (lazy) value you get when evaluating a data constructor? -Original Message- From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Kim-Ee Yeoh Sent: Monday, August 13, 2007 15:30 To: haskell-cafe@haskell.org Subject: Re: [Haskell-cafe] Explaining monads Ronald Guida wrote: > > Given the question "What is a Monad", I would have to say "A Monad is > a device for sequencing side-effects." > There are side-effects and there are side-effects. If the only monad you use is Maybe, the only side-effect you get is a slight warming of the CPU. Dave Menendez pointed to that fine Wadler link earlier. Please read it. To wit, in Section 2: "Explaining Monads" the "essence of an algorithm can become buried under the plumbing required to carry data from its point of creation to its point of use." Monads can help keep the clarity of your code untrammelled by providing implicit plumbing, "side-channels" if you prefer, when data is moved around. In fact if you follow Wadler all the way to his monadic expression evaluator, you see that you could modularize your code in awesomely cool ways. 
You get to see how the kernel of the expression evaluator could be written for a generic monad and compiled once-and-for-all. Any additional feature (the "variations") is coded by enriching the monad. Monads are powerful devices for modularizing code. Available for free. Only in Haskell (thanks to type classes!). Get them today. "Side-effects" is a piece of linguistic cruft played fast-and-loose by too many people in this game. "Sequencing" suffers the same disease. -- View this message in context: http://www.nabble.com/Explaining-monads-tf4244948.html#a12126170 Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.
Re: [Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell
On Mon, Aug 13, 2007 at 01:27:45PM -0300, Isaac Dupree wrote: > David Roundy wrote: > >The only cost is that > >this syntax relies on the do notation, and thus makes the desugaring of > >that do notation slightly more complicated when used. > > If I understand correctly, > > do > blah > f (do >foo >bar (<- action) > ) > blah > > has an ambiguity: which do-block is the action bound in? I can easily > imagine myself being frustrated at having to refactor my code if the > defined answer is not the one I want at the moment. It doesn't have an ambiguity, because it's defined to be bound in the innermost do block. This isn't a new concept, the <- syntax in the existing do notation has the same behavior. -- David Roundy Department of Physics Oregon State University
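For reference, (<- action) is only a proposed syntax, not valid Haskell. Under David's innermost-do reading, Isaac's fragment would desugar to something like the sketch below, where all the names (blah, foo, action, bar, f) are placeholders:

```haskell
-- The proposed "bar (<- action)" inside the inner do-block binds in
-- the *innermost* do, i.e. it means roughly this valid Haskell:
example :: IO ()
example = do
  blah
  f (do foo
        x <- action   -- bound in the inner do, per David's reading
        bar x)
  blah
  where
    blah   = return ()
    foo    = putStrLn "foo"
    action = return "it"
    bar s  = putStrLn ("bar " ++ s)
    f      = id

main :: IO ()
main = example
```

Binding in the *outer* do instead would correspond to hoisting `x <- action` above the call to f, which is the refactoring Isaac worries about.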
Re: [Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell
David Roundy wrote: The only cost is that this syntax relies on the do notation, and thus makes the desugaring of that do notation slightly more complicated when used. If I understand correctly, do blah f (do foo bar (<- action) ) blah has an ambiguity: which do-block is the action bound in? I can easily imagine myself being frustrated at having to refactor my code if the defined answer is not the one I want at the moment. Isaac
[Haskell-cafe] xpath types for SYB/generic haskell type system?
The SYB papers provide really powerful functions for accessing and manipulating values in arbitrarily shaped containers. The cost of this capability appears to be loss of type checking. For example, gfindtype x returns a Maybe y. Given that the type checker would actually have to know whether or not x contains a y inside of it, is there a way to annotate a gfindtype sort of function so that it just returns a value and, if applied at the wrong type, produces a compiler-enforced error? It may not be possible in this version of Haskell, but it seems like there is no technical reason you could not have partial type annotations that describe the traversal strategies described in SYB. Perhaps it is a type version of an XPath expression, e.g. myFindType :: (.//y) x => x -> y The (.//y) x says that y is a type nested somewhere in x. Note, since this is happening at compile time, this capability will still not prevent you from doing a (fromJust Nothing), but it still seems super valuable if you are doing generic Haskell type stuff. Is there a mathematical reason why this wouldn't work? -Alex-
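The loss of static checking Alex describes is easy to see with cast from Data.Typeable in base, the mechanism the SYB combinators are built on (a sketch; gfindtype itself lives in the syb package):

```haskell
import Data.Typeable (cast)

-- cast :: (Typeable a, Typeable b) => a -> Maybe b
-- The compiler cannot reject a mismatched target type at compile
-- time; a wrong type just yields Nothing at runtime.
main :: IO ()
main = do
  print (cast (42 :: Int) :: Maybe Int)   -- types match
  print (cast (42 :: Int) :: Maybe Bool)  -- types don't match
```

The second call is perfectly well typed as far as the compiler is concerned, which is exactly the gap an xpath-style annotation would close.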
Re: [Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell
On Mon, Aug 13, 2007 at 05:39:34PM +0200, apfelmus wrote: > Stefan O'Rear schrieb: >> On Mon, Aug 13, 2007 at 04:35:12PM +0200, apfelmus wrote: >>> My assumption is that we have an equivalence >>> >>> forall a,b . m (a -> m b) ~ (a -> m b) >>> >>> because any side effect executed by the extra m on the outside can well >>> be delayed until we are supplied a value a. Well, at least when all >>> arguments are fully applied, for some notion of "fully applied" >> (\a x -> a >>= ($ x)) ((\f -> return f) X) ==> (β) >> (\a x -> a >>= ($ x)) (return X) ==> (β) >> (\x -> (return X) >>= ($ x)) ==> (monad law) >> (\x -> ($ x) X)==> (β on the sugar-hidden >> 'flip') >> (\x -> X x)==> (η) >> X >> Up to subtle strictness bugs arising from my use of η :), you're safe. > > Yes, but that's only one direction :) The other one is the problem: > > return . (\f x -> f >>= ($ x)) =?= id > > Here's a counterexample > > f :: IO (a -> IO a) > f = writeAHaskellProgram >> return return > > f' :: IO (a -> IO a) > f' = return $ (\f x -> f >>= ($ x)) $ f > ==> (β) > return $ \x -> (writeAHaskellProgram >> return return) >>= ($ x) > ==> (BIND) > return $ \x -> writeAHaskellProgram >> (return return >>= ($ x)) > ==> (LUNIT) > return $ \x -> writeAHaskellProgram >> (($ x) return) > ==> (β) > return $ \x -> writeAHaskellProgram >> return x > > Those two are different, because > > clever = f >> return () = writeAHaskellProgram > clever' = f' >> return () = return () > > are clearly different ;) I figured that wouldn't be a problem since our values don't escape, and the functions we define all respect the embedding... 
More formally:

Projections and injections:
  proj ma = \x -> ma >>= \fn' -> fn' x
  inj fn  = return fn

Define an equivalence relation:
  ma ≡ mb <-> proj ma = proj mb

Projection respects equivalence:
  ma ≡ mb -> proj ma = proj mb   (intro ->)
  ma ≡ mb => proj ma = proj mb   (equiv def)
  proj ma = proj mb => proj ma = proj mb   (assumption)

Application:
  (@) ma1 = \x -> join (proj ma1 x)

Application respects equivalence:
  ma1 ≡ ma2 -> (@) ma1 = (@) ma2   (intro ->)
  ma1 ≡ ma2 => (@) ma1 = (@) ma2   (β)
  ma1 ≡ ma2 => (\x -> join (proj ma1 x)) = (\x -> join (proj ma2 x))   (extensionality)
  ma1 ≡ ma2 => join (proj ma1 x) = join (proj ma2 x)   (application respects = left)
  ma1 ≡ ma2 => proj ma1 x = proj ma2 x   (application respects = right)
  ma1 ≡ ma2 => proj ma1 = proj ma2   (lemma)

Stefan
Re: [Haskell-cafe] A few questions on primes generating.
L.Guo wrote: > Because 10,000,000 is too large for an Int On my pitiful system, > maxBound::Int 2147483647 is certainly greater than 10,000,000. > it is always in type of Integer or some higher level data type. Haskell doesn't do static checking like that. In GHC on my system (where 10,000,000,000 is too large for an Int - note ten zeroes), > 10000000000::Int 1410065408 (Hugs gives "Program error: arithmetic overflow") Isaac
[Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell
Stefan O'Rear schrieb: On Mon, Aug 13, 2007 at 04:35:12PM +0200, apfelmus wrote: My assumption is that we have an equivalence forall a,b . m (a -> m b) ~ (a -> m b) because any side effect executed by the extra m on the outside can well be delayed until we are supplied a value a. Well, at least when all arguments are fully applied, for some notion of "fully applied" (\a x -> a >>= ($ x)) ((\f -> return f) X) ==> (β) (\a x -> a >>= ($ x)) (return X) ==> (β) (\x -> (return X) >>= ($ x)) ==> (monad law) (\x -> ($ x) X)==> (β on the sugar-hidden 'flip') (\x -> X x)==> (η) X Up to subtle strictness bugs arising from my use of η :), you're safe. Yes, but that's only one direction :) The other one is the problem: return . (\f x -> f >>= ($ x)) =?= id Here's a counterexample f :: IO (a -> IO a) f = writeAHaskellProgram >> return return f' :: IO (a -> IO a) f' = return $ (\f x -> f >>= ($ x)) $ f ==> (β) return $ \x -> (writeAHaskellProgram >> return return) >>= ($ x) ==> (BIND) return $ \x -> writeAHaskellProgram >> (return return >>= ($ x)) ==> (LUNIT) return $ \x -> writeAHaskellProgram >> (($ x) return) ==> (β) return $ \x -> writeAHaskellProgram >> return x Those two are different, because clever = f >> return () = writeAHaskellProgram clever' = f' >> return () = return () are clearly different ;) Regards, apfelmus
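apfelmus's counterexample can be run directly; in this sketch writeAHaskellProgram is stubbed out with a putStrLn (the stub body is invented for illustration):

```haskell
-- f carries its effect *outside* the returned function; f' pushes it
-- inside. The two are observably different, as apfelmus derives.
writeAHaskellProgram :: IO ()
writeAHaskellProgram = putStrLn "main = putStrLn \"hello\""

f :: IO (a -> IO a)
f = writeAHaskellProgram >> return return

f' :: IO (a -> IO a)
f' = return $ \x -> f >>= ($ x)

main :: IO ()
main = do
  _ <- f   -- prints: the effect runs when f itself is executed
  _ <- f'  -- silent: the effect is delayed until the result is applied
  return ()
```

Only one line is printed, which is exactly the clever vs clever' distinction.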
Re: [Haskell-cafe] A few questions on primes generating.
On Mon, Aug 13, 2007 at 11:23:59PM +0800, L.Guo wrote: > Because 10,000,000 is too large for an Int, it is always in type of Integer or > some higher level data type. In Haskell, Int always supports at least -536,870,912 to 536,870,911. Also, large numbers don't (this is arguably a bug...) have restricted types: [EMAIL PROTECTED]:~$ ghc -e '1000000000000000000 :: Int' -1486618624 [EMAIL PROTECTED]:~$ Stefan
Re: [Haskell-cafe] A few questions on primes generating.
Because 10,000,000 is too large for an Int, it is always in type of Integer or some higher level data type. -- L.Guo 2007-08-13 - From: Alexis Hazell At: 2007-08-13 22:46:46 Subject: Re: [Haskell-cafe] A few questions on primes generating. On Tuesday 14 August 2007 00:22, L.Guo wrote: > 2) We have this type definition : > pureSieve :: Int -> Int >Why there is no error (type mismatch) of this call in func main : > pureSieve 10000000 The Haskell Report says that an Int covers at least the range [- 2^29, 2^29 - 1], which that number is well within . . . . why do you think it should report a type error? Alexis.
Re: [Haskell-cafe] A few questions on primes generating.
On 8/13/07, L.Guo <[EMAIL PROTECTED]> wrote: > Hi All: Hello, > > I am reading http://www.haskell.org/haskellwiki/Prime_numbers > > The code in sector "1 Bitwise prime sieve". > > I have 3 questions about it. > > 1) In function go, what does the number 46340 mean ? Is it sqrt(MAX_LONG) ? Yes, it appears so. In a 32 bit implementation I get: Prelude> sqrt $ fromIntegral (maxBound :: Int) 46340.950001051984 > 2) We have this type definition : > pureSieve :: Int -> Int >Why there is no error (type mismatch) of this call in func main : > pureSieve 10000000 If you have integer literals in your program, the compiler sees a fromInteger in front of them. So the value is just converted to type Int automatically, because that is expected here. You can give different numeric default declarations in your own modules. Please see sections 10.3 (for overloaded literals) and 10.4 (for defaults) here: http://www.haskell.org/tutorial/numbers.html Sometimes you can get an overflow like this: Prelude> 1000 :: Int -159383552 > 3) In main again, what does expression [| x |] mean ? Why this cannot be > executed in > GHCi ? It's Template Haskell, and is used there for some kind of optimisation (I think). Template Haskell needs to be enabled with a command line switch for it to work. Please see the documentation for more information. It's section 7.6 in your User's Guide. Though in this case you can probably just remove it to try out the program. Perhaps someone else can explain what actual effect it has here. Pekka
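The wrap-around Pekka and Isaac describe happened on 32-bit GHC, where Int is 32 bits wide. It can be reproduced deterministically on any platform with Int32 (a sketch, not from the original thread):

```haskell
import Data.Int (Int32)

-- fromInteger for Int32 wraps modulo 2^32, so the overflow Pekka
-- shows in GHCi is reproducible regardless of the host's Int width.
main :: IO ()
main = do
  print (maxBound :: Int32)               -- upper end of a 32-bit Int
  print (fromInteger (10 ^ 10) :: Int32)  -- 10^10 wraps: 10^10 mod 2^32
```

The second line prints 1410065408, matching Isaac's `10000000000::Int` result on his 32-bit system.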
Re: [Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell
On Mon, Aug 13, 2007 at 04:35:12PM +0200, apfelmus wrote: > My assumption is that we have an equivalence > > forall a,b . m (a -> m b) ~ (a -> m b) > > because any side effect executed by the extra m on the outside can well be > delayed until we are supplied a value a. Well, at least when all arguments > are fully applied, for some notion of "fully applied" (\a x -> a >>= ($ x)) ((\f -> return f) X) ==> (β) (\a x -> a >>= ($ x)) (return X) ==> (β) (\x -> (return X) >>= ($ x)) ==> (monad law) (\x -> ($ x) X)==> (β on the sugar-hidden 'flip') (\x -> X x)==> (η) X Up to subtle strictness bugs arising from my use of η :), you're safe. Stefan
Re: [Haskell-cafe] A few questions on primes generating.
On Tuesday 14 August 2007 00:22, L.Guo wrote: > 2) We have this type definition : > pureSieve :: Int -> Int >Why there is no error (type mismatch) of this call in func main : > pureSieve 10000000 The Haskell Report says that an Int covers at least the range [- 2^29, 2^29 - 1], which that number is well within . . . . why do you think it should report a type error? Alexis.
Re: [Haskell-cafe] Dynamic thread management?
On Aug 11, 2007, at 12:35 PM, Brian Hurt wrote: You guys might also want to take a look at the Cilk programming language, and how it managed threads. If you know C, learning Cilk is about 2 hours of work, as it's C with half a dozen extra keywords and a few new concepts. I'd love to see Cilk - C + Haskell as a programming language. It was called pH, and we (meaning Alejandro Caro and myself) implemented it back in the mid/late 90's using Lennart Augustsson's hbcc front end (which he hacked to add a bunch of pH-specific syntax). Arvind and Nikhil wrote a textbook on pH programming. There are two problems, still: one is that laziness means you can't actually prove you need something until very close to the time you actually want it. By the time I know that I'm adding f x to g y, it's probably too late to usefully run them in parallel (unless they're both *very* large). We used eager evaluation in pH---to the point that we actually gave up the ability to manipulate infinite lazy data structures. In NDP they've done much the same thing, first instrumenting the program to see that the eagerness they introduce won't disrupt execution. Even the "par" annotation has this feel: we are telling the implementation that it's OK to do some computation even if it isn't yet obvious that we'll need the results. The second problem is controlling the overhead. More on this below. The key idea of Cilk is that it's easier to deparallelize than it is to parallelize, especially automatically. So the idea is that the program is written incredibly parallel, with huge numbers of microthreads, which are (on average) very cheap to spawn. The runtime then deparallelizes the microthreads, running multiple microthreads sequentially within a single real thread (a worker thread). Microthreads that live their entire life within a single real thread are cheap to spawn (as in "not much more expensive than a normal function call" cheap). 
The problem here is that while Cilk spawns are incredibly cheap, they're still more than a simple procedure call (2-10x as expensive if my fading memory serves me rightly). Let's imagine we have a nice, parallelizable computation that we've expressed using recursive subdivision (the Cilk folks like to use matrix multiplication as an example here). Near the leaves of that computation we still spend the majority of our time paying the overhead of spawning. So we end up actually imposing a depth bound, and writing two versions of our computation---the one that spawns, which we use for coarse-grained computations, and the one that doesn't, which we use when computation gets fine-grained. It makes a really big difference in practice. The programmer is free to use this trick in any programming language. But we haven't yet figured out a way to *avoid* the need to do so. This continues to worry me to this day, because making the right choices is black magic and specific to a particular combination of algorithm and machine. That said, there is some reason for optimism: the overhead of creating work in Cilk is comparable to the overhead of creating a thunk in Haskell. The problem that Cilk runs into is that it's, well, C. It doesn't deal with contention well at all - a single microthread blocking blocks the whole worker thread - meaning, among other things, that you can have "false deadlocks", where one microthread blocks on another microthread in the same real thread, and thus acts like it's deadlocked even though it really isn't. This is actually a fundamental problem with the threading model: there is no guarantee of fairness using work stealing, so if you do something that requires fair scheduling you get into a lot of trouble fast. It's not fair to blame C for this. You have to be very careful to define the interaction between fair IO-style threads and unfair get-my-work-done threads.
You have greatly increased the likelihood of race conditions as well (mutable data and multithreading just don't mix). Plus you have all the normal fun you have with C bugs - wild pointers, buffer overruns, etc. This, however, *is* C's fault. :-) More on pH: we got our programs to scale, but had troubles going past 8 threads. We found ourselves limited by a non-parallel GC (my fault, but labor-intensive to get right) and the absence of parallelism in the underlying algorithms. For the latter problem there simply is no magic bullet. -Jan-Willem Maessen
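The depth-bound trick Jan-Willem describes can be sketched in Haskell using plain forkIO from base. A forkIO thread is far heavier than a Cilk spawn, so this is purely illustrative of the coarse-grained/fine-grained split; all names here are invented:

```haskell
import Control.Concurrent (forkIO, newEmptyMVar, putMVar, takeMVar)

-- fine-grained version: plain recursion, no spawn overhead
sfib :: Int -> Integer
sfib n | n < 2     = toInteger n
       | otherwise = sfib (n - 1) + sfib (n - 2)

-- coarse-grained version: spawn only near the root of the recursion
-- tree, where the work dwarfs the cost of a spawn; below the depth
-- bound, fall back to the serial version
pfib :: Int -> Int -> IO Integer
pfib depth n
  | n < 2      = return (toInteger n)
  | depth <= 0 = return (sfib n)
  | otherwise  = do
      v <- newEmptyMVar
      _ <- forkIO (pfib (depth - 1) (n - 1) >>= putMVar v)
      b <- pfib (depth - 1) (n - 2)
      a <- takeMVar v
      return (a + b)

main :: IO ()
main = pfib 3 20 >>= print
```

The depth bound (3 here) is exactly the black-magic knob the message complains about: the right value depends on the algorithm and the machine.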
[Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell
Benjamin Franksen wrote: As has been already mentioned in this thread, in http://www.soi.city.ac.uk/~ross/papers/Applicative.html Conor McBride and Ross Paterson invent/explain a new type class that is now part of the base package (Control.Applicative). They also use/propose syntactic sugar for it, i.e. pure f <*> u1 <*> ... <*> un ~~> (| f u1 ... un |) (I just made up the symbols '(|' and '|)', the concrete syntax would have to be fixed by people more knowledgeable than me.) The problem with [| and |] lifted to monads is that this only works for fully applied arguments, i.e. that handle :: IO Handle string :: IO String [| hPutStr handle string |] :: IO () works but [| hPutStr handle |] = join (return hPutStr `ap` handle) ^= join ((f :: m (a -> b -> m c)) `ap` (x :: m a)) = join ( y :: m (b -> m c)) is a type error. I think this is also what makes the (<- action) proposal so non-local and what is the source of this whole discussion. The core problem is: Functions like a -> b -> m c can't be partially applied to monadic actions like m a without specifying the number of arguments in advance. In other words, such functions aren't curried correctly. Clearly, LiftMn specifies the number of arguments. But _both_ the (<-) proposal and idiom brackets specify the number of arguments too! Namely by requiring that all arguments are fully applied. So, neither one is capable of partially applying the first argument without saturating the call; you can only write handle :: IO Handle -- define putStr in terms of the above hPutStr putStr :: String -> IO () putStr = \x -> [| hPutStr handle (return x) |] putStr = \x -> do { hPutStr (<- handle) x } One way to get currying for monads is to work with functions m a -> m b -> m c However, this type is larger than a -> b -> m c , i.e.
the function from :: Monad m => (a -> b -> m c) -> (m a -> m b -> m c) from f ma mb = ma >>= \a -> mb >>= \b -> f a b is not surjective since we could perform the actions in a different order from2 f ma mb = mb >>= \b -> ma >>= \a -> f a b In other words, if someone gives you a value of type m a -> m b -> m c , then you can't convert it to a -> b -> m c and back without risking that you end up with a different result. But there is another type suitable for currying m (a -> m (b -> m c)) which I believe is in some way equivalent to a -> b -> m c from :: Monad m => (a -> b -> m c) -> m (a -> m (b -> m c)) from f = return $ \a -> return $ \b -> f a b to :: Monad m => m (a -> m (b -> m c)) -> (a -> b -> m c) to f a b = f >>= \g -> g a >>= \h -> h b but I'm not sure. My assumption is that we have an equivalence forall a,b . m (a -> m b) ~ (a -> m b) because any side effect executed by the extra m on the outside can well be delayed until we are supplied a value a. Well, at least when all arguments are fully applied, for some notion of "fully applied" Anyway, here's how to curry with that type: (@) :: Monad m => m (a -> m b) -> (m a -> m b) (@) f x = join (f `ap` x) hPutStr :: IO (Handle -> IO (String -> IO ())) handle :: IO Handle putStr :: IO (String -> IO ()) putStr = hPutStr @ handle With the infix type synonym type (~>) a b = a -> IO b we can also write hPutStr :: IO (Handle ~> String ~> () ) putStr :: IO (String ~> () ) This is of course the Kleisli-Arrow which explains why currying works. Regards, apfelmus
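apfelmus's monadic application operator can be tried out almost as written, except that @ is reserved syntax in Haskell (as-patterns), so the runnable sketch below spells it @@; addM is an invented stand-in for the hPutStr example:

```haskell
import Control.Monad (ap, join)

-- apfelmus's (@), renamed: curried application for effectful functions
(@@) :: Monad m => m (a -> m b) -> m a -> m b
f @@ x = join (f `ap` x)

-- a curried two-argument function in the m (a -> m (b -> m c)) style
addM :: IO (Int -> IO (Int -> IO Int))
addM = return $ \a -> return $ \b -> return (a + b)

main :: IO ()
main = do
  r <- (addM @@ return 2) @@ return 3  -- partial application works:
  print r                              -- (addM @@ return 2) :: IO (Int -> IO Int)
```

Each @@ peels one argument off, so partial applications like `addM @@ return 2` are first-class values, which is exactly the currying the message is after.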
[Haskell-cafe] A few questions on primes generating.
Hi All: I am reading http://www.haskell.org/haskellwiki/Prime_numbers The code in sector "1 Bitwise prime sieve". I have 3 questions about it. 1) In function go, what does the number 46340 mean ? Is it sqrt(MAX_LONG) ? 2) We have this type definition : pureSieve :: Int -> Int Why there is no error (type mismatch) of this call in func main : pureSieve 10000000 3) In main again, what does expression [| x |] mean ? Why this cannot be executed in GHCi ? Thanks for any advice. Regards -- L.Guo 2007-08-13
Re: [Haskell-cafe] Explaining monads
Ronald Guida wrote: > > Given the question "What is a Monad", I would have to say "A Monad is > a device for sequencing side-effects." > There are side-effects and there are side-effects. If the only monad you use is Maybe, the only side-effect you get is a slight warming of the CPU. Dave Menendez pointed to that fine Wadler link earlier. Please read it. To wit, in Section 2: "Explaining Monads" the "essence of an algorithm can become buried under the plumbing required to carry data from its point of creation to its point of use." Monads can help keep the clarity of your code untrammelled by providing implicit plumbing, "side-channels" if you prefer, when data is moved around. In fact if you follow Wadler all the way to his monadic expression evaluator, you see that you could modularize your code in awesomely cool ways. You get to see how the kernel of the expression evaluator could be written for a generic monad and compiled once-and-for-all. Any additional feature (the "variations") is coded by enriching the monad. Monads are powerful devices for modularizing code. Available for free. Only in Haskell (thanks to type classes!). Get them today. "Side-effects" is a piece of linguistic cruft played fast-and-loose by too many people in this game. "Sequencing" suffers the same disease. -- View this message in context: http://www.nabble.com/Explaining-monads-tf4244948.html#a12126170 Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.
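A minimal sketch of the Wadler-style evaluator Kim-Ee refers to, with Maybe supplying the failure plumbing (Expr and eval here are illustrative, not Wadler's exact code):

```haskell
-- The Maybe monad threads possible failure (division by zero)
-- through the evaluator, keeping the kernel free of explicit
-- case-analysis plumbing.
data Expr = Lit Int | Div Expr Expr

eval :: Expr -> Maybe Int
eval (Lit n)   = return n
eval (Div x y) = do
  a <- eval x
  b <- eval y
  if b == 0 then Nothing else return (a `div` b)

main :: IO ()
main = do
  print (eval (Div (Lit 6) (Lit 2)))  -- succeeds
  print (eval (Div (Lit 6) (Lit 0)))  -- fails, no plumbing in sight
```

Enriching the monad (State for a count of divisions, Writer for a trace, and so on) adds Wadler's "variations" without touching the eval kernel.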
[Haskell-cafe] Re: Language support for imperative code. Was: Re: monad subexpressions
Isaac Dupree schrieb: apfelmus wrote: Mutable data structures in the sense of ephemeral (= not persistent = update in-place) data structure indeed do introduce the need to work in ST since the old version is - by definition - not available anymore. Not in the quantum/information-theoretic sense, not necessarily. Consider import Control.Monad.ST import Data.STRef main = print (runST (do r <- newSTRef 1 notUnavailable <- readSTRef r writeSTRef r 5 return notUnavailable )) I'm not sure what this has to do with quantum mechanics ;) but you're right, I forgot that. This means that either STRefs cannot be updated in-place or that every read operation copies the contents or something like that. In any case, simple values like Ints or Bools are rather uninteresting, update in-place is only important for larger structures like arrays. Here, ST does updates in-place and retaining an array will copy it. Regards, apfelmus
Re: [Haskell-cafe] IO within parser
Malte Milatz wrote: Sam Hughes <[EMAIL PROTECTED]>, Sun, 12 Aug 2007 20:12:55 -0400: [A parser like Parsec, with monad transformers:] $ darcs get http://samuelhughes.com/darcs/partran/ Is this related in any way to the following GSoC project? http://code.google.com/soc/2007/haskell/appinfo.html?csaid=B97EF4562EF3B244 No, I just wanted it for my own purposes. - Sam
Re: [Haskell-cafe] Explaining monads
On 11/08/07, Ronald Guida <[EMAIL PROTECTED]> wrote:
> Here's my interpretation of the table:
>
> Structure   | Subject | Action   | Verb     | Result
> ------------+---------+----------+----------+-------
> function    | a       | a->b     | flip ($) | b
> Functor     | f a     | a->b     | <$>      | f b
> Applicative | f a     | f (a->b) | flip <*> | f b
> Monad       | m a     | a->m b   | >>=      | m b
> Comonad     | w a     | w a->b   | =>>      | w b
> Arrow       | a b c   | a c d    | >>>      | a b d

Nice. Kim-Ee Yeoh wrote: > > ... I think you'll find that each of those structures have their > privileged place in your code. > > Agreed. I'm still a beginner; I'm not sure how to choose one > structure over another, at least not yet. But that's because ... > > > Monads are undoubtedly more pervasive, and that could be because there > > aren't as many arrow and comonad tutorials, atomic ones or otherwise. And I'm trying to say that these shouldn't be separate tutorials at all - it's much more instructive to compare and contrast. Moreover, Comonad isn't even in the standard libraries (Hoogle returns > no results for it). > > When I searched for tutorials on monads, I found lots of them. In > fact, I have heard that writing (yet another) monad tutorial is part > of standard Haskell initiation. That's what I was doing above :-) One thing that I keep seeing people say (not you), is that monads /sequence/ side effects. This is wrong, or at least a limited picture. /All/ of the above structures are about combining compatible things together in a row. /None/ of them force any particular order of evaluation - that all comes from the particular instance. So it's only a particular feature of IO that it sequences the side effects. Others don't - we can have a lazy State monad that just builds up big thunks. IO could be implemented as any of the above structures, and still be perfectly able to keep things in order.
Indeed, uniqueness types as in Clean are arguably just the first one - function composition.

Functor IO would be really boring - we could just perform a sequence of actions with no choices at all. (But the whole sequence could be repeated, and I guess the structure could be nested for fixed loops.)

The key to the choice of IO as a Monad comes back to the argument about 'simplicity', or whatever we want to call it - I agree it's rather context dependent, and indeed I was rather flippant at the end of my first message.

But let's look at the nature of the actual things being sequenced - the actions above. In only 3 cases are the actions simple enough to take a single /a/ argument:

  function  a -> b;  Functor  a -> b;  Monad  a -> m b

In function and functor, the action takes no part in the complexity and doesn't know about it. In function application the action gets used (possibly) once; in functor and monad, possibly many times. Only in Monad does the action have a say in the complexity, by returning a possibly non-trivial m b. So that's the reason Monads are so handy - the combined actions are simple, but they can at least participate, and influence the flow of control if we are talking about an IO-like Monad.

> Also, arrows are supposed to be more general than both monads and
> comonads. If I could just figure out what each structure (functor,
> monad, comonad, arrow) is doing, and compare and contrast them, then I
> probably will have made leaps of understanding. I have a sense that
> tells me that the table above and the data structures below probably
> start to scratch the surface of understanding.

Arrows are most general because they have full access to the complexity going on in the structure. Each arrow can do arbitrarily complex (possibly bidirectional) negotiation with its input, and, possibly asynchronously, arbitrarily complex negotiation with its output.
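The "action gets a say" point can be made concrete. A sketch (the names `safeDiv` and `pipeline` are invented for the example): because each step returns an `m b`, an intermediate `Nothing` decides whether the rest of the computation runs at all - something no Functor action could do:

```haskell
-- Sketch: in a monad the action's returned result (Just/Nothing here)
-- steers the flow of control; a Functor action cannot abort a pipeline.
safeDiv :: Int -> Int -> Maybe Int
safeDiv _ 0 = Nothing
safeDiv x y = Just (x `div` y)

pipeline :: Int -> Maybe Int
pipeline n = Just n >>= safeDiv 100 >>= safeDiv 1000
```

`pipeline 0` aborts at the first step; `pipeline 200` succeeds at the first step but the `Just 0` it produces makes the second step abort.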
Any sort of data can flow any way at any time; the only restriction is that for an 'Arrow a b' object, the input side uses a's and the output side uses b's. Compare a monad - the input must be single, simple a's. All that can happen is that the function gets called multiple times.

--
[EMAIL PROTECTED]
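That input/output restriction is easiest to see with plain functions, the simplest Arrow instance. A minimal sketch (the names are invented): `(>>>)` only glues arrows whose ends match, and `(&&&)` fans one input out to two outputs:

```haskell
-- Plain functions as Arrows: an arrow from Double to String can only be
-- composed with arrows whose types line up at each end.
import Control.Arrow ((>>>), (&&&))

celsiusToF :: Double -> Double
celsiusToF c = c * 9 / 5 + 32

describe :: Double -> String
describe f = if f > 86 then "hot" else "mild"

report :: Double -> String
report = celsiusToF >>> describe        -- ends match: Double, then String

both :: Double -> (Double, String)
both = celsiusToF &&& report            -- fan the input out to two arrows
```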
Re: [Haskell-cafe] Suffix tree
G'day all.

Quoting Jon Harrop <[EMAIL PROTECTED]>:
> Does anyone have any Haskell code implementing suffix trees? I'm
> particularly interested in high-performance construction.

No, but I have this: http://citeseer.ist.psu.edu/giegerich95comparison.html

First person to implement them gets the eternal gratitude of hackage.

Cheers,
Andrew Bromage
Re: [Haskell-cafe] Suffix tree
On Sun, 2007-08-12 at 23:09 +0100, Jon Harrop wrote:
> Suffix trees are a data structure used to search for a substring of
> length "m" in a string of length "n" in O(m) time. Suffix trees can
> also be used for efficient approximate searches. This data structure
> is of particular importance in bioinformatics.

Suffix trees (and their sibling, suffix arrays) are cool. Basically a suffix tree is a trie of all suffixes, but where any nodes with only one child are collapsed. They're easy to construct through sorting; linear-time construction is trickier, and perhaps the biggest problem is keeping memory use reasonable.

A suffix array is a sorted array of suffix indices, so it's more compact, but takes O(m log n) to search. There's also the enhanced suffix array, which provides the same functionality as the suffix tree but in approximately the space of a plain suffix array (something like 12n bytes for the tree vs 5n bytes, IIRC).

> Does anyone have any Haskell code implementing suffix trees? I'm
> particularly interested in high-performance construction.

I was using a suffix tree to index some data, but I discovered I only needed the tree to constant depth, so - shame on me - I ended up simply sorting the suffixes in one case, and storing hashed words in a Map in another.

I've also toyed with FFI'ing to compressed suffix arrays, but in the end this wasn't as successful as I'd hoped. For straightforward suffix array construction, this is probably the fastest way to go about it, though.

Linear-time suffix trees are a bit complicated to implement in a functional language, as they depend on additional links in the tree during construction. I'm sure this can be done using IO/STRefs, but it would of course be cooler if it could be done tying-the-knot style.

Anyway, I have a handful of shaky FFI bindings, a reasonable bibtex file, and lots of good will and encouragement. Let me know if you have use for any of them!

-k
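For reference, the "simply sorting the suffixes" fallback mentioned above fits in a few lines. A naive O(n^2 log n) sketch (the function names are my own, and this is nowhere near the high-performance construction Jon asked about):

```haskell
-- Naive suffix array: indices of all non-empty suffixes, sorted
-- lexicographically. Fine for short strings; real workloads want a
-- linear-time construction.
import Data.List (isPrefixOf, sortBy, tails)
import Data.Ord (comparing)

suffixArray :: String -> [Int]
suffixArray s =
  map snd . sortBy (comparing fst) $ zip (init (tails s)) [0 ..]

-- A proper O(m log n) search would binary-search the sorted suffixes;
-- a linear scan is used here for brevity.
containsSub :: String -> String -> Bool
containsSub s p = any (p `isPrefixOf`) [drop i s | i <- suffixArray s]
```

For example, `suffixArray "banana"` sorts the six suffixes and yields the starting indices in lexicographic order of suffix.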