Re: [Haskell-cafe] Explaining monads

2007-08-13 Thread Ronald Guida

Ronald Guida wrote:
> Given the question "What is a Monad", I would have to say "A Monad is
> a device for sequencing side-effects."

peterv <[EMAIL PROTECTED]> wrote:
> "Side-effects" is a piece of linguistic cruft played fast-and-loose
> by too many people in this game. "Sequencing" suffers the same
> disease.

Gregory Propf wrote:
> I made this mistake myself at first too.  It seems that the Monad =
> "side effect machine" error is common to Haskell newbies.  Probably
> to do with the fact that the first thing every programmer wants to
> do is write a hello world program and for that you need the IO Monad
> which requires some explanation of how a Monad can allow for side
> effects (at least the IO Monad). - Greg

Earlier in this thread, I had a conversation with several people
regarding monads and arrows.  My goal was to try to come up with a
brief explanation.  I realized that "sequencing side-effects" is a
simplistic and incorrect view, so now I'm thinking in terms of DSELs.

I have heard that writing a monad tutorial is something people do when
they finally understand monads.  I interpret this observation to mean
that either (1) monads (and arrows) are just difficult things, or (2)
most of the existing explanations and tutorials are somehow inadequate
or incomplete.

My present goal is to understand monads well enough to be able to
explain them to others.  I wonder if it's possible to create a
tutorial that explains monads well enough so that they just "make
sense" or "click" for people.

Here is the brief explanation I came up with:
> Arrows and monads are abstract data types used to construct Domain
> Specific Embedded Languages (DSELs) within Haskell.  A simple arrow
> provides a closed DSEL.  A monad is a special type of arrow that
> creates an open DSEL by allowing users to embed arbitrary Haskell
> within it.

Is this an accurate explanation?  I hate to feed a fire, but is
"Domain Specific Embedded Language" a well-defined phrase, or is it
just another example of linguistic cruft? If DSEL is cruft, then is
there a better way to briefly explain monads and arrows?

Also, is this a /useful/ explanation, or have I simply hidden the
complexity by invoking the concepts of ADTs and DSELs?

-- Ron



[Haskell-cafe] Small proof intro

2007-08-13 Thread Tim Newsham
I put together a small intro lesson on proving haskell code using 
quickcheck, equational reasoning and Isabelle/HOL. Its very elementary, 
but might be of interest to some people here.


  http://www.thenewsh.com/%7Enewsham/formal/reverse/

Feedback is appreciated.

Tim Newsham
http://www.thenewsh.com/~newsham/


Re: [Haskell-cafe] A few questions on primes generating.

2007-08-13 Thread ajb

G'day all.

Quoting Andrew Coppin <[EMAIL PROTECTED]>:


So many other programming languages allow weird things to happen with
numeric overflows... it would be nice if Haskell didn't.


It would be nice if CPUs supported trapping integer arithmetic.
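Short of hardware support, a rough software stand-in is easy to sketch in
plain Haskell (checkedPlus is a made-up name): do the arithmetic in Integer
and check the Int bounds afterwards.

    -- Checked addition: Nothing on overflow instead of silent wrap-around.
    checkedPlus :: Int -> Int -> Maybe Int
    checkedPlus a b
      | r < toInteger (minBound :: Int) || r > toInteger (maxBound :: Int) = Nothing
      | otherwise = Just (fromInteger r)
      where r = toInteger a + toInteger b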

Cheers,
Andrew Bromage


[Haskell-cafe] Re: Re: Re: Language support for imperative code. Was: Re: monad subexpressions

2007-08-13 Thread Benjamin Franksen
Isaac Dupree wrote:
> Benjamin Franksen wrote:
>> I'd be careful. Introducing a network connection into the equation makes the
>> object (its methods) susceptible to a whole new bunch of failure modes;
>> think indefinite delays, connection loss, network buffer overflow, etc etc.
>> It may be a mistake to abstract all that away; in fact I am convinced that
>> the old Unix habit of sweeping all these failure modes and potentially long
>> delays under a big carpet named 'file abstraction' was a bad idea to begin
>> with. The ages old and still not solved problems with web browsers hanging
>> indefinitely (w/o allowing any GUI interaction) while name resolution waits
>> for completion is only the most prominent example.
> 
> IMO it's just a terribly stupid bug in the best web browsers.  Maybe 
> inefficient, poorly, or not-at-all-used multithreading?

Explicitly creating a (system) thread with all the overhead (in computing
resources, as well as code complexity) only because the system interface is
broken? Yes, of course, necessary, but not nice. An extra parameter for a
continuation would be a lot more light-weight and would also make explicit
that we must expect the call to be delayed.

I think the main reason why systems don't regularly employ this scheme is
that it is so tedious to work with closures in low-level languages like C.

> "file abstraction" has its points. We just need a (type-level?) 
> clear-to-program-with distinction between operations that may block 
> indefinitely, and operations that have particular bounds on their 
> difficulty.  Although, modern OSes try to balance too many things, don't 
> usually make any such hard real-time guarantees, in favor of everything 
> turning out more-or-less correct eventually.  Back to "file abstraction" 
> - well, considering the benefits of mounting remote systems as a 
> filesystem.  The hierarchy abstraction of the filesystem didn't stay the 
> same performance characteristics... And all kinds of potential problems 
> result too, when the connection breaks down!

Indeed, as I have experienced multiple times: NFS clients completely hanging
for minutes, enforcing coffee break for the whole office! Not that I would
mind a coffee break now or then, but it tends to happen in the middle of an
attempt to fix a critical bug in the production system...

> How do you program with all those error conditions explicitly? It is 
> difficult. You need libraries to do it well - and I'm not at all sure 
> whether there exist such libraries yet!  I mean, programs are much too 
> complicated already without infesting them with a lot of special cases.

What I would like to have is a clear distinction, apparent in the type,
between actions that can be expected to terminate fast and with certainty
(apart from broken hardware, that is) and others which are inherently
insecure and may involve considerable or even indefinite delays. The latter
should accept a continuation argument.
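A minimal sketch of what such a type distinction could look like (all names
here are invented; this is not a worked-out design): fast, certain actions
stay in plain IO, while potentially unbounded ones return immediately and
hand their result to a continuation later.

    newtype Async a = Async { runAsync :: (Either IOError a -> IO ()) -> IO () }

    -- e.g. a name lookup that never blocks the caller:
    lookupHostAsync :: String -> Async String
    lookupHostAsync host = Async $ \k ->
      -- a real implementation would hand the work to an event loop or a
      -- worker thread and invoke k when the answer (or the failure) arrives
      k (Right ("address of " ++ host))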

However, there is obviously a judgement call involved here. Thus, the system
should be flexible enough to allow to treat the same resource either as one
or the other, depending on the demands of the application. There may be
situations where a name lookup can safely be treated as a synchronous
operation (e.g. a script run as a cron job); in other situations one might
need to regard even local bus access to some I/O card as asynchronous.

>  > indefinite delays
> I can create with `someCommand | haskellProgram` too

Yes, user input as well as reading from a pipe should be handled like a
network connection: call my continuation whenever input is available.

>  > connection loss
> Is there a correct way to detect this? 

There are many ways, typically involving some sort of beacons. Anyway, if
all else fails the operation times out.

> I find it rather odd when I lose  
> my IRC connection for a moment and then it comes back a moment later 
> (Wesnoth games are worse, apparently, as they don't reconnect 
> automatically).  I often prefer considering them an indefinite delay.

Right: The user (the application) is in the best position to decide how long
to wait before an operation times out.

>  > network buffer overflow
> that is: too much input, not processing it fast enough? (or similar). 

Yeah, though usually the other way around, i.e. too much output and the
system can't send it fast enough (maybe because the other side is slow in
accepting data, or because the connection is bad, or whatever).

> Memory size limitations are considerably unhandled in programs of all 
> sorts, not just networked ones, though they(networked) may suffer the 
> most.

It is usually not a problem with modern desktop or server systems, rather
with so called 'real-time' OSes, where everything tends to be statically
allocated.

> We wish we had true unlimited-memory Turing machines :) ...this  
> is possibly the most difficult issue to deal with formally.  Probably 
> requires limiting input data rates artificially.

Th

Re: [Haskell-cafe] Re: Re: Explaining monads

2007-08-13 Thread Dan Piponi
On 8/13/07, Benjamin Franksen <[EMAIL PROTECTED]> wrote:

> Ok, I admit defeat now ;-) Monads in general /allow/ sequencing of (certain)
> effects, but it is not necessary for a monad to do so.

I'm sure I said that 300 or so posts ago :-)

Here's an example that might help make some of this explicit.

Suppose I have a function f :: a -> b -> c.
Now suppose I have some values "in a monad":
x :: m a
y :: m b

How do I lift f so I can apply it to x and y and get a value of type m
c? Obviously I can't just write f x y.

There's an obvious solution that comes to mind:

do
  x' <- x
  y' <- y
  return $ f x' y'

But there's another solution:

do
  y' <- y
  x' <- x
  return $ f x' y'

So in order to lift f I was forced to make an explicit decision about
the order in which I wanted to process x and y. (I think of >>= as a
kind of bottleneck that forces you to compute the arguments to f one
at a time in a particular order - but there are already too many
metaphors flying around so maybe you should ignore that.) Information
about that decision now becomes available to the monad's
implementation.

If you want, you can use this information to implement 'sequencing'
(whatever that means). But you're also free to ignore it. If you do
ignore it then you have a commutative monad, with Reader being the
best known example.
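A small sketch to make that concrete (the names liftXY, liftYX, readerSame
and stateDiffer are invented for this example): the two lift orderings are
indistinguishable in the commutative Reader monad, but observable in State.

    import Control.Monad.Reader (Reader, ask, runReader)
    import Control.Monad.State  (State, get, modify, evalState)

    -- the two orderings from the text, written out
    liftXY, liftYX :: Monad m => (a -> b -> c) -> m a -> m b -> m c
    liftXY f x y = do { x' <- x; y' <- y; return (f x' y') }
    liftYX f x y = do { y' <- y; x' <- x; return (f x' y') }

    -- Reader is commutative: both orderings give the same answer
    readerSame :: (Int, Int)
    readerSame = ( runReader (liftXY (+) ask ask) 1
                 , runReader (liftYX (+) ask ask) 1 )      -- (2,2)

    -- State is not: the ordering decision shows up in the result
    stateDiffer :: ((Int, Int), (Int, Int))
    stateDiffer = ( evalState (liftXY (,) tick tick) 0     -- (0,1)
                  , evalState (liftYX (,) tick tick) 0 )   -- (1,0)
      where tick = do { n <- get; modify (+ 1); return n }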

It's just like ordinary monoids. The very act of writing x*y imposes
an order on x and y. You can use this to define binary operators that
can make use of this ordering. But the most familiar monoid operators
(addition and multiplication of course) are commutative and order
doesn't matter. It'd be cool to have a notation that didn't force you
to put one argument or the other first. (Well, I know of one, but it's
a bit wacky: http://www.lawsofform.org/interpretations.html)
--
Dan


Re: [Haskell-cafe] Explaining monads

2007-08-13 Thread Gregory Propf
I made this mistake myself at first too.  It seems that the Monad = "side 
effect machine" error is common to Haskell newbies.  Probably to do with the 
fact that the first thing every programmer wants to do is write a hello world 
program and for that you need the IO Monad which requires some explanation of 
how a Monad can allow for side effects (at least the IO Monad). - Greg

- Original Message 
From: peterv <[EMAIL PROTECTED]>
To: Kim-Ee Yeoh <[EMAIL PROTECTED]>; haskell-cafe@haskell.org
Sent: Monday, August 13, 2007 10:31:48 AM
Subject: RE: [Haskell-cafe] Explaining monads


Ronald Guida wrote:
> 
> Given the question "What is a Monad", I would have to say "A Monad is
> a device for sequencing side-effects."
> 

There are side-effects and there are side-effects. If the only
monad you use is Maybe, the only side-effect you get is a slight
warming of the CPU.



"Side-effects" is a piece of linguistic cruft played fast-and-loose
by too many people in this game. "Sequencing" suffers the same 
disease.

 





   



[Haskell-cafe] Re: Re: Explaining monads

2007-08-13 Thread Benjamin Franksen
Derek Elkins wrote:
> On Mon, 2007-08-13 at 22:29 +0200, Benjamin Franksen wrote:
>> Brian Brunswick wrote:
>> > One thing that I keep seeing people say (not you), is that
>> > monads /sequence/
>> > side effects. This is wrong, or at
>> > least a limited picture.
>> > 
>> > /All/ of the above structures are about combining compatible things
>> > together in a row.
>> > /None/ of them force any particular order of evaluation - that all comes
>> > from the particular instance. So it's
>> > only a particular feature of IO that it sequences the side effects. Others
>> > don't - we can have a lazy State
>> > monad that just builds up big thunks.
>> 
>> I am a bit astonished.
>> 
>> Let's take the simplest example: Maybe. The effect in question is the
>> premature abortion of a computation (when Nothing is returned). And of
>> course Maybe sequences these effects, that's what you use it for: the
>> _first_ action to be encountered that returns Nothing aborts the
>> computation. Clearly sequencing goes on here.
> 
> You are wrong. Proof: Let's take a simpler example: Identity.  QED

I don't accept this proof. Note the wording: 'Monads sequence (certain,
monad specific) effects'. Identity has no effects, ergo no sequencing has
to happen. I didn't claim that /all/ monadic actions are (necessarily)
sequenced.

> This also disproves David Roundy's statement that 
> do x <- return 2; undefined; return (x*x) will hit bottom.
> 
> Reader also does not sequence its "actions". 

Ok, I admit defeat now ;-) Monads in general /allow/ sequencing of (certain)
effects, but it is not necessary for a monad to do so.

> Writer is a kind of funny example.

In which way?

> Certainly, any monad instance where (>>=) needs its first argument to
> determine whether to continue, e.g. Maybe, Either, IO, Parser, Cont,
> List, Tree will clearly need to force its first argument before
> continuing, but that's just the nature of the situation.

Don't forget State, clearly it sequences actions even though it always
continues; but maybe 'sequencing' is too strong a word: Just as with
Reader, a sequence of reads (with no writes in between) may actually happen
in any order, State imposes strict order on groups of adjacent reads and
all (single) writes, correct?

Ok, I think I understand where I was misled: I took the means for the end.

There are many monads that impose a certain order on (some) effects; but
this is done only as far as necessary to maintain X, whatever X may be,
maybe X is just the monad laws?

What about: The imperative way always imposes a sequencing of actions,
period. Monads in Haskell (except IO which is just imperative programming)
allow us to impose ordering on effects only partially, ideally only where
necessary.

Thanks for further enlightening me.

Hey, I just said that sequencing is a means, not an end, but maybe even this
is not necessarily true. I imagine a 'Sequence Monad' whose only effect
would be to order evaluation... would that be possible? Or useful?

Cheers
Ben



Re: [Haskell-cafe] Re: Explaining monads

2007-08-13 Thread Arie Peterson

On 8/13/07, David Roundy <[EMAIL PROTECTED]> wrote:

| Try executing:
|
|   do { x <- return 2; undefined; return (x*x); }
|
| in any monad you like

It's not just the identity monad:

Prelude> :m +Control.Monad.State
Prelude Control.Monad.State> flip evalState () $ do { x <- return 2;
undefined; return (x*x); }
4


Regards,

Arie



Re: [Haskell-cafe] Re: Re: Language support for imperative code. Was: Re: monad subexpressions

2007-08-13 Thread Donn Cave
On Tue, 14 Aug 2007, Benjamin Franksen wrote:
...
> I'd be careful. Introducing a network connection into the equation makes the
> object (its methods) susceptible to a whole new bunch of failure modes;
> think indefinite delays, connection loss, network buffer overflow, etc etc.
> It may be a mistake to abstract all that away; in fact I am convinced that
> the old Unix habit of sweeping all these failure modes and potentially long
> delays under a big carpet named 'file abstraction' was a bad idea to begin
> with. The ages old and still not solved problems with web browsers hanging
> indefinitely (w/o allowing any GUI interaction) while name resolution waits
> for completion is only the most prominent example.

Ironically, the place where all this sweeping under the carpet has caused
me personally the most irritation is one of the most appealing file
abstractions - remote disk filesystems, NFS.

In any case, I agree (I think) that a sophisticated user interface needs
to deal with time.  I think that's a key motivation for reactive object
approaches.   It has to be considered part of the equation, along with the
rest of the I/O situation, if you're trying to reason about it that way.

Donn Cave, [EMAIL PROTECTED]



Re: [Haskell-cafe] Re: Explaining monads

2007-08-13 Thread Dan Piponi
On 8/13/07, David Roundy <[EMAIL PROTECTED]> wrote:
> Try executing:
>
>   do { x <- return 2; undefined; return (x*x); }
>
> in any monad you like

data M a = M a

instance Monad M where
    return a = M a
    ~(M a) >>= f = f a

Or is that cheating?
--
Dan


Re: [Haskell-cafe] Re: Re: Language support for imperative code. Was: Re: monad subexpressions

2007-08-13 Thread Isaac Dupree

Benjamin Franksen wrote:

I'd be careful. Introducing a network connection into the equation makes the
object (its methods) susceptible to a whole new bunch of failure modes;
think indefinite delays, connection loss, network buffer overflow, etc etc.
It may be a mistake to abstract all that away; in fact I am convinced that
the old Unix habit of sweeping all these failure modes and potentially long
delays under a big carpet named 'file abstraction' was a bad idea to begin
with. The ages old and still not solved problems with web browsers hanging
indefinitely (w/o allowing any GUI interaction) while name resolution waits
for completion is only the most prominent example.


IMO it's just a terribly stupid bug in the best web browsers.  Maybe 
inefficient, poorly, or not-at-all-used multithreading?


"file abstraction" has its points. We just need a (type-level?) 
clear-to-program-with distinction between operations that may block 
indefinitely, and operations that have particular bounds on their 
difficulty.  Although, modern OSes try to balance too many things, don't 
usually make any such hard real-time guarantees, in favor of everything 
turning out more-or-less correct eventually.  Back to "file abstraction" 
- well, considering the benefits of mounting remote systems as a 
filesystem.  The hierarchy abstraction of the filesystem didn't keep the 
same performance characteristics... And all kinds of potential problems 
result too, when the connection breaks down!


How do you program with all those error conditions explicitly? It is 
difficult. You need libraries to do it well - and I'm not at all sure 
whether there exist such libraries yet!  I mean, programs are much too 
complicated already without infesting them with a lot of special cases.


> indefinite delays
I can create with `someCommand | haskellProgram` too
> connection loss
Is there a correct way to detect this? I find it rather odd when I lose 
my IRC connection for a moment and then it comes back a moment later 
(Wesnoth games are worse, apparently, as they don't reconnect 
automatically).  I often prefer considering them an indefinite delay.

> network buffer overflow
that is: too much input, not processing it fast enough? (or similar). 
Memory size limitations are considerably unhandled in programs of all 
sorts, not just networked ones, though they(networked) may suffer the 
most.  We wish we had true unlimited-memory Turing machines :) ...this 
is possibly the most difficult issue to deal with formally.  Probably 
requires limiting input data rates artificially.



Isaac


RE: [Haskell-cafe] Explaining monads

2007-08-13 Thread Derek Elkins
On Mon, 2007-08-13 at 19:31 +0200, peterv wrote:
> When I read "side-effects", I understand it as "unwanted effects", like
> "aliasing", and "effects depending on the order of execution". I'm not sure
> if my understanding here is correct. I hope Haskell does not allow
> "side-effects" but only "effects", meaning the monads do not allow you to
> write the typical ill-behaving code you get when doing real imperative
> programming, enforcing a single wiring of execution, not allowing the
> capture of the RealWorld object. In Concurrent Clean special compiler
> support is present to enforce "uniqueness typing", and in Haskell special
> compiler support is available to make the RealWorld object not available at
> runtime (so e.g. you can't put the RealWorld object in a list). Is this
> correct? 

Monads certainly do allow "side-effects" in that way.  Heck, without
allowing aliasing there wouldn't be much to compel the use of
mutability.  Aliasing is the great boon and great curse of imperative
programming.  Order of execution issues are tricky.  On the one hand,
you must always (directly or indirectly) specify the order of execution,
so technically there's no uncertainty. On the other hand, there are two
distinct definitions of liftM2,
liftM2'1 f ma mb = do a <- ma; b <- mb; return (f a b) -- and
liftM2'2 f ma mb = do b <- mb; a <- ma; return (f a b)
Note that order of -evaluation- is moot (at least as much as it is for
any term).  Laziness does not come into it (unless you use lazy IO, in
which case you get what you deserve.)

> BTW: What is the correct word in Haskell for "object"? I mean the (lazy)
> value you get when evaluating a data constructor? 

What you said doesn't really make much sense, but I think the(/a) word you want 
is a "thunk".



Re: [Haskell-cafe] A few questions on primes generating.

2007-08-13 Thread Sebastian Sylvan
On 13/08/07, Pekka Karjalainen <[EMAIL PROTECTED]> wrote:
> On 8/13/07, L.Guo <[EMAIL PROTECTED]> wrote:
> > Hi All:
>
> Hello,
>
> >
> > I am reading http://www.haskell.org/haskellwiki/Prime_numbers
> >
> > The code in sector "1 Bitwise prime sieve".
> >
> > I have 3 questions about it.
> >
> > 1) In function go, what does the number 46340 mean ? Is it sqrt(MAX_LONG) ?
>
> Yes, it appears so. In a 32 bit implementation I get:
>
> Prelude> sqrt $ fromIntegral (maxBound :: Int)
> 46340.950001051984
>
> > 2) We have this type definition :
> > pureSieve :: Int -> Int
> >Why there is no error (type mismatch) of this call in func main :
> > pureSieve 1000
>
> If you have integer literals in your program, the compiler sees a
> fromInteger in front of them. So the value is just converted to type
> Int automatically, because that is expected here.
>
> You can give different numeric default declarations in your own
> modules. Please see sections 10.3 (for overloaded literals) and 10.4
> (for defaults) here:
> http://www.haskell.org/tutorial/numbers.html
>
> Sometimes you can get an overflow like this:
>
> Prelude> 1000 :: Int
> -159383552
>
> > 3) In main again, what does the expression [| x |] mean ? Why can this
> > not be executed in GHCi ?
>
> It's Template Haskell, and is used there for some kind of optimisation
> (I think). Template Haskell needs to be enabled with a command line
> switch for it to work. Please see the documentation for more
> information. It's section 7.6 in your User's Guide.
>
> Though in this case you can probably just remove it to try out the
> program. Perhaps someone else can explain what actual effect it has
> here.

I think it just computes a single function call to pureSieve at
compile time. I believe its origin is from making a point that when
you stop comparing apples to apples it's easy to cheat (this code
comes from a discussion on this list where someone insisted on adding
optimizations to the, admittedly naive, algorithm in C# and comparing
it without making the same optimizations in Haskell -- so someone, I
forget who, but I'm sure a search will turn it up, wrote a quick template
Haskell "optimization" to make a point that we don't really get useful
results unless we compare the same algorithms in both languages).

In general you should probably ignore that TH bit and just call
pureSieve normally.
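For illustration, a minimal sketch of the compile-time trick, with a cheap
sum standing in for a call like pureSieve (this is not the wiki's code, just
an assumed-equivalent example):

    {-# LANGUAGE TemplateHaskell #-}
    module Main where

    -- The let is evaluated while GHC compiles the module; [| x |] then turns
    -- the already-computed Int into a literal in the compiled program.
    answer :: Int
    answer = $( let x = sum [1 .. 10000 :: Int] in [| x |] )

    main :: IO ()
    main = print answer   -- prints 50005000 without redoing the sum at run time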

-- 
Sebastian Sylvan
+44(0)7857-300802
UIN: 44640862


Re: [Haskell-cafe] Re: Explaining monads

2007-08-13 Thread Derek Elkins
On Mon, 2007-08-13 at 22:29 +0200, Benjamin Franksen wrote:
> Brian Brunswick wrote:
> > One thing that I keep seeing people say (not you), is that
> > monads /sequence/
> > side effects. This is wrong, or at
> > least a limited picture.
> > 
> > /All/ of the above structures are about combining compatible things
> > together in a row.
> > /None/ of them force any particular order of evaluation - that all comes
> > from the particular instance. So it's
> > only a particular feature of IO that it sequences the side effects. Others
> > don't - we can have a lazy State
> > monad that just builds up big thunks.
> 
> I am a bit astonished.
> 
> Let's take the simplest example: Maybe. The effect in question is the
> premature abortion of a computation (when Nothing is returned). And of
> course Maybe sequences these effects, that's what you use it for: the
> _first_ action to be encountered that returns Nothing aborts the
> computation. Clearly sequencing goes on here.

You are wrong. Proof: Let's take a simpler example: Identity.  QED
This also disproves David Roundy's statement that 
do x <- return 2; undefined; return (x*x) will hit bottom.

Reader also does not sequence its "actions".  Writer is a kind of funny
example.

Certainly, any monad instance where (>>=) needs its first argument to
determine whether to continue, e.g. Maybe, Either, IO, Parser, Cont,
List, Tree will clearly need to force its first argument before
continuing, but that's just the nature of the situation.



Re: [Haskell-cafe] Re: Explaining monads

2007-08-13 Thread Brian Brunswick
On 13/08/07, David Roundy <[EMAIL PROTECTED]> wrote:
>
> Try executing:
>
>   do { x <- return 2; undefined; return (x*x); }
>
> in any monad you like, and you'll find that regardless of the *data*
> dependencies (the return value of this monadic action is unambiguous), the
> undefined is evaluated *before* the value 4 is returned.
> --
>
>
Prelude> :m + Control.Monad.Identity
Prelude Control.Monad.Identity> runIdentity $ do { x <- return 2; undefined;
return (x*x); }
Loading package mtl-1.0 ... linking ... done.
4

-- 
[EMAIL PROTECTED]


Re: [Haskell-cafe] Re: Explaining monads

2007-08-13 Thread Tillmann Rendel

David Roundy wrote:

It's the *effect* of a monad, not the *side* effect.  The type of >>=
defines this dependency.  And when you have a chain of dependencies, that
is sometimes referred to as a sequence.  True, it's not mystical, but it's
still sequenced.


How can a Haskell type define a data dependency? Haskell functions are 
non-strict, so they may choose to ignore arguments, and ignored 
arguments are not evaluated at all.



Try executing:

  do { x <- return 2; undefined; return (x*x); }

in any monad you like, and you'll find that regardless of the *data*
dependencies (the return value of this monadic action is unambiguous), the
undefined is evaluated *before* the value 4 is returned.


> runIdentity $ do {x <- return 2; undefined; return (x * x) }
4

  Tillmann


[Haskell-cafe] Re: Re: Explaining monads

2007-08-13 Thread Benjamin Franksen
Brandon S. Allbery KF8NH wrote:
> On Aug 13, 2007, at 16:29 , Benjamin Franksen wrote:
>> Let's take the simplest example: Maybe. The effect in question is the
>> premature abortion of a computation (when Nothing is returned). And of
>> course Maybe sequences these effects, that's what you use it for: the
>> _first_ action to be encountered that returns Nothing aborts the
>> computation. Clearly sequencing goes on here.
> 
> Clearly it does, but not as a side effect of the *monad*.  It's  
> ordinary Haskell data dependencies at work here, not some mystical  
> behavior of a monad.

I can't remember claiming that Monads have any mystic abilities. In fact,
these Monads are implemented in such a way that their definition /employs/
data dependencies to enforce a certain sequencing of effects. I think that
is exactly the point, isn't it?

>> What about State? The effect is reading/writing the state. Again,  
>> the State
>> Monad takes care that these effects get sequenced, and again that's  
>> what
>> you expect it to do for you.
> 
> No, I expect it to carry a value around for me.  If I carry that  
> value around myself instead of relying on the monad to do it for me,  
> *the calculation still gets sequenced by the data dependencies*.

Of course, you can unfold (i.e. inline) bind and return (or never use them
in the first place). Again, nobody claimed Monads do the sequencing by
employing Mystic Force (tm); almost all Monads can be implemented in plain
Haskell, nevertheless they sequence certain effects -- You Could Have
Invented Them Monads Yourself ;-)

The Monad merely captures the idiom, abstracts it and ideally implements it
in a library for your convenience and for the benefit of those trying to
understand what your code is supposed to achieve. She reads 'StateM ...'
and immediately sees 'ah, there he wants to use some data threaded along
for reading and writing'.

Cheers
Ben



[Haskell-cafe] Re: Re: Language support for imperative code. Was: Re: monad subexpressions

2007-08-13 Thread Benjamin Franksen
Brian Hulley wrote:
> Brian Hulley wrote:
>> apfelmus wrote:
>>> Brian Hulley schrieb:
main = do
    buffer <- createBuffer
    edit1 <- createEdit buffer
    edit2 <- createEdit buffer
    splitter <- createSplitter (wrapWidget edit1) (wrapWidget edit2)
    runMessageLoopWith splitter

 ... Thus the ability to abstract mutable state gives to my mind by 
 far the best solution.
>>>
>>> I'm not sure whether mutable state is the real goodie here. I think 
>>> it's the ability to indpendently access parts of a compound state.
>>>   http://www.st.cs.ru.nl/papers/2005/eves2005-FFormsIFL04.pdf
>>
>> This is indeed a real key to the problem.
> Of course this is only one aspect of the problem...
> 
> Thinking about this a bit more, and just so this thought is recorded for 
> posterity (!) and for the benefit of anyone now or in a few hundred 
> years time, trying to solve "Fermat's last GUI", the object oriented 
> solution allows the buffer object to do anything it wants, so that it 
> could negotiate a network connection and implement the interface based 
> on a shared network buffer for example, without needing any changes to 
> the client code above, so a functional gui would need to have the same 
> flexibility to compete with the OO solution.

I'd be careful. Introducing a network connection into the equation makes the
object (its methods) susceptible to a whole new bunch of failure modes;
think indefinite delays, connection loss, network buffer overflow, etc etc.
It may be a mistake to abstract all that away; in fact I am convinced that
the old Unix habit of sweeping all these failure modes and potentially long
delays under a big carpet named 'file abstraction' was a bad idea to begin
with. The ages old and still not solved problems with web browsers hanging
indefinitely (w/o allowing any GUI interaction) while name resolution waits
for completion is only the most prominent example.

Cheers
Ben



Re: [Haskell-cafe] Re: Language support for imperative code. Was: Re: monad subexpressions

2007-08-13 Thread Isaac Dupree

Brian Hulley wrote:
Thinking about this a bit more, and just so this thought is recorded for 
posterity (!) and for the benefit of anyone now or in a few hundred 
years time, trying to solve "Fermat's last GUI", the object oriented 
solution allows the buffer object to do anything it wants, so that it 
could negotiate a network connection and implement the interface based 
on a shared network buffer for example, without needing any changes to 
the client code above, so a functional gui would need to have the same 
flexibility to compete with the OO solution.


Probably it would be parametric in the input mechanism, somehow.  (A 
Haskell approach might use type classes, slightly obscuring the 
parametricity..)


Another thing that would be interesting would be to have a formal 
treatment of what is supposed to happen in a gui. For example, when you 
move the mouse over a control which has become dirty (ie needs to be 
re-rendered because its state is now inconsistent), what should the 
control do? Should it respond as if the new state were already visible 
to the user, or should it interpret the mouse position according to the 
last state that was rendered, or should it just ignore all mouse events 
until the next time it gets rendered? This is not a trivial question 
because you could imagine an animated control where the user might 
naturally be following the movement, whereas when the user clicks on a 
cell in a spreadsheet when the cells to the left have now expanded due 
to a change in data thus moving the cell along (but where this updated 
model has not yet been re-rendered) the user might be irritated at the 
wrong cell being selected... It's tricky little issues like this that I 
haven't found any documentation for anywhere, and which would make a 
proper mathematical treatment of interaction with a gui very useful, 
regardless of whether it is implemented in OOP or functional style.


Jef Raskin (late interface designer, author of _The Humane Interface_) 
would probably say that anything with such importance to user decisions, 
should be rendered within a tenth of a second.  Computers fifteen years 
ago could sometimes do it!  Fancy details can be filled in later if it 
takes that long.


Of course that completely dodges the mathematical question... in which 
human response time should really be taken into account too! Humans 
really are not like machines and are not all alike either!  Oh no, do we 
need psychological formalisms?


Firefox suffers the above problems badly, alas - the "Stop" button is 
half useless because it doesn't even noticed you pressed it for such a 
long time, etc...


Reading up on user interface design principles as well as thinking 
functionally, is probably a useful approach - although not everything 
that you read will agree or be right.  The whole concept of GUIs - they 
are very complicated - it is quite arguable that they are just a wrong 
interface - however, some of the world's people are fortunate enough to 
be accustomed to them already, which complicates matters considerably.



Isaac


Re: [Haskell-cafe] Re: Explaining monads

2007-08-13 Thread David Roundy
On Mon, Aug 13, 2007 at 05:13:01PM -0400, Brandon S. Allbery KF8NH wrote:
> On Aug 13, 2007, at 16:29 , Benjamin Franksen wrote:
> >Let's take the simplest example: Maybe. The effect in question is the
> >premature abortion of a computation (when Nothing is returned). And of
> >course Maybe sequences these effects, that's what you use it for: the
> >_first_ action to be encountered that returns Nothing aborts the
> >computation. Clearly sequencing goes on here.
> 
> Clearly it does, but not as a side effect of the *monad*.  It's ordinary
> Haskell data dependencies at work here, not some mystical behavior of a
> monad.

It's the *effect* of a monad, not the *side* effect.  The type of >>=
defines this dependency.  And when you have a chain of dependencies, that
is sometimes referred to as a sequence.  True, it's not mystical, but it's
still sequenced.

Try executing:

  do { x <- return 2; undefined; return (x*x); }

in any monad you like, and you'll find that regardless of the *data*
dependencies (the return value of this monadic action is unambiguous), the
undefined is evaluated *before* the value 4 is returned.
-- 
David Roundy
Department of Physics
Oregon State University


Re: [Haskell-cafe] A few questions on primes generating.

2007-08-13 Thread Isaac Dupree

Andrew Coppin wrote:

Stefan O'Rear wrote:

Also, large numbers don't (this is arguably a bug...) have restricted
types:

[EMAIL PROTECTED]:~$ ghc -e '100 :: Int'
-1486618624
  


So many other programming languages allow weird things to happen with 
numeric overflows... it would be nice if Haskell didn't.


Shall we have a GHC warning if it can detect those cases, either the 
Haskell report bounds or that of the target GHC is currently compiling 
to? (of course someone would have to implement it, and there would 
always be cases it didn't check, and the (non-portable) behavior is 
sometimes desired...)


Hugs often does throw an exception (runtime error) rather than allow Int 
overflow (but not always).


Isaac


Re: [Haskell-cafe] out of core computing in haskell

2007-08-13 Thread David Roundy
In the second case, if your numerical data is just a big array, why not
just mmap it? Unless you're running on a 32 bit machine, or have a
*seriously* large amount of data, that seems like the easiest option.
Although if you're talking about mutable data, that's a whole 'nother can
of worms... but that's true even if your data fits into memory.

You could look at Data.ByteString to get some idea as to how to wrap a pure
interface around a disk-backed chunk of read-only memory.  Or just read
about the ffi, it's not too hard to figure out.
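A rough sketch of the lazy-ByteString route (sumBytes is a made-up example):
the file is read in chunks on demand, so data larger than RAM can still be
folded over in roughly constant space.

    import qualified Data.ByteString.Lazy as BL

    -- strict left fold over a lazily-read file
    sumBytes :: FilePath -> IO Int
    sumBytes path = do
      bytes <- BL.readFile path
      return (BL.foldl' (\acc w -> acc + fromIntegral w) 0 bytes)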

David

On Mon, Aug 13, 2007 at 01:27:03PM -0700, Carter T Schonwald wrote:
> The main two use cases I have in mind are 
> 1) really really really big abstract syntax trees or proof trees (a la
> compilers or theorem provers)
> 2) random access to numerical data 
> 
> does that help clarify what i'm asking about? In each case, is there a
> standard way of dealing with this?
> in the case of (1) the sensible way seems to me to do some sort of zipper
> representation which loads adjacent nodes to some depth surrounding your
> current location in the tree, and in the case of (2) it seems like the
> sensible way would be load subsequences of the data into memory.


Re: [Haskell-cafe] Re: Language support for imperative code. Was: Re: monad subexpressions

2007-08-13 Thread Brian Hulley

Brian Hulley wrote:

apfelmus wrote:

Brian Hulley schrieb:

    main = do
        buffer <- createBuffer
        edit1 <- createEdit buffer
        edit2 <- createEdit buffer
        splitter <- createSplitter (wrapWidget edit1) (wrapWidget edit2)
        runMessageLoopWith splitter

... Thus the ability to abstract mutable state gives to my mind by 
far the best solution.


I'm not sure whether mutable state is the real goodie here. I think 
it's the ability to indpendently access parts of a compound state.

  http://www.st.cs.ru.nl/papers/2005/eves2005-FFormsIFL04.pdf


This is indeed a real key to the problem.

Of course this is only one aspect of the problem...

Thinking about this a bit more, and just so this thought is recorded for 
posterity (!) and for the benefit of anyone now or in a few hundred 
years time, trying to solve "Fermat's last GUI", the object oriented 
solution allows the buffer object to do anything it wants, so that it 
could negotiate a network connection and implement the interface based 
on a shared network buffer for example, without needing any changes to 
the client code above, so a functional gui would need to have the same 
flexibility to compete with the OO solution.


Another thing that would be interesting would be to have a formal 
treatment of what is supposed to happen in a gui. For example, when you 
move the mouse over a control which has become dirty (ie needs to be 
re-rendered because its state is now inconsistent), what should the 
control do? Should it respond as if the new state were already visible 
to the user, or should it interpret the mouse position according to the 
last state that was rendered, or should it just ignore all mouse events 
until the next time it gets rendered? This is not a trivial question 
because you could imagine an animated control where the user might 
naturally be following the movement, whereas when the user clicks on a 
cell in a spreadsheet when the cells to the left have now expanded due 
to a change in data thus moving the cell along (but where this updated 
model has not yet been re-rendered) the user might be irritated at the 
wrong cell being selected... It's tricky little issues like this that I 
haven't found any documentation for anywhere, and which would make a 
proper mathematical treatment of interaction with a gui very useful, 
regardless of whether it is implemented in OOP or functional style.


Anyway just a thought,

Brian.




[Haskell-cafe] Re: Help with a project design

2007-08-13 Thread Benjamin Franksen
Andrea Rossato wrote:
> The task this library should do is simple: given an xml object
> (representing a bibliographic reference), render it with rules stored
> in a different xml object (the citation style). While I think I can
> find solutions for this problem - the rendering -, what I find
> difficult is the design of the reference xml objects.
> 
> Bibliographic entries have different types, which must be rendered
> differently. These types can be classified into 3 main classes (books,
> articles, parts of a book) that can be rendered with the same methods.
> That seems to fit Haskell perfectly.
> 
> Now, I basically see 2 approaches:
> 
> 1. create some data structures (most part of them is common) to map
>different types of bibliographic entries, and create the needed
>classes with the render methods;
> 
> 2. keep the xml objects as xml and create an abstract interface to the
>xml objects to get the data required for rendering and classifying
>the xml objects. This way I would have to:
>- create data types to store different types of xml objects (data
>  Book = Book XmlTree, data Artilce, etc.): these data types
>  represent my reference classes;
>- create a class of 'render'-able types with the render method and
>  define the instances;
>- create an existential type to set the type of the xml objects 
>  with some kind of setType :: XmlTree -> ExistentialContainer

I may not be overly qualified (and experienced with Haskell) to give you
advice, so take what follows with caution.

I would definitely prefer choice 1 over 2. I think it is very important to
design the data structure independent from any external representation of
that data. XML is a fine way to externally represent data, but this should
not influence your choice of data structure. I'd rather keep the
possibility of alternative representations in the back of my head, and make
the data structure general enough that they could be added w/o disrupting
your main algorithms.
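A very rough sketch of what choice 1 might look like (all type and field
names here are invented, not taken from your XML):

    -- plain data types for the three reference classes, sharing common fields
    data Common = Common { title :: String, authors :: [String], year :: Int }

    data Entry
      = Book     { common :: Common, publisher :: String }
      | Article  { common :: Common, journal :: String, pages :: (Int, Int) }
      | BookPart { common :: Common, bookTitle :: String, chapter :: Int }

    type Style = String   -- stand-in for the rules parsed from the citation-style XML

    class Render a where
      render :: Style -> a -> String

    instance Render Entry where
      render _ e = title (common e) ++ " (" ++ show (year (common e)) ++ ")"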

Abstraction can be added later; if you find that you need to maintain
invariants for you bibliographic data that cannot be easily expressed in
the type itself, then you might consider to make your data type abstract,
i.e. put it into a module of its own and export only an API.

> I think that the first approach is not abstract enough and requires a
> lot of boilerplate code to translate into a Haskell type a specific
> type of bibliographic entry. 

A certain amount of boilerplate may be unavoidable. I never found this to
be a serious obstacle, but again that may be due to my limited experience.
It is a bit tedious to write but OTOH may even serve you as 'finger
exercise'. If it really gets out of hand, 'scrap' it in some way ;) I
recommend the Uniplate approach because it is very easy to understand,
performs good, and requires the least amount of extensions.

OK, you have been warned...

Cheers
Ben



Re: [Haskell-cafe] Re: Explaining monads

2007-08-13 Thread Brandon S. Allbery KF8NH


On Aug 13, 2007, at 16:29 , Benjamin Franksen wrote:


Let's take the simplest example: Maybe. The effect in question is the
premature abortion of a computation (when Nothing is returned). And of
course Maybe sequences these effects, that's what you use it for: the
_first_ action to be encountered that returns Nothing aborts the
computation. Clearly sequencing goes on here.


Clearly it does, but not as a side effect of the *monad*.  It's  
ordinary Haskell data dependencies at work here, not some mystical  
behavior of a monad.


What about State? The effect is reading/writing the state. Again,  
the State
Monad takes care that these effects get sequenced, and again that's  
what

you expect it to do for you.


No, I expect it to carry a value around for me.  If I carry that  
value around myself instead of relying on the monad to do it for me,  
*the calculation still gets sequenced by the data dependencies*.


--
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats] [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon universityKF8NH




[Haskell-cafe] Re: Explaining monads

2007-08-13 Thread Benjamin Franksen
Brian Brunswick wrote:
> One thing that I keep seeing people say (not you), is that
> monads /sequence/
> side effects. This is wrong, or at
> least a limited picture.
> 
> /All/ of the above structures are about combining compatible things
> together in a row.
> /None/ of them force any particular order of evaluation - that all comes
> from the particular instance. So it's
> only a particular feature of IO that it sequences the side effects. Others
> don't - we can have a lazy State
> monad that just builds up big thunks.

I am a bit astonished.

Let's take the simplest example: Maybe. The effect in question is the
premature abortion of a computation (when Nothing is returned). And of
course Maybe sequences these effects, that's what you use it for: the
_first_ action to be encountered that returns Nothing aborts the
computation. Clearly sequencing goes on here.

Similar with the Error Monad (i.e. Either Err, for some Err type).

I won't talk about List Monad because I always had difficulty understanding
the List Monad.

What about State? The effect is reading/writing the state. Again, the State
Monad takes care that these effects get sequenced, and again that's what
you expect it to do for you.

And so on...

This is -- of course -- not a proof, so maybe there /are/ Monads that don't
sequence (their) effects. I'd be most interested to see an example, if
there is one, to bring myself nearer to the -- unattainable -- goal of full
enlightenment wrt Monads.

Cheers
Ben



Re: [Haskell-cafe] out of core computing in haskell

2007-08-13 Thread Carter T Schonwald
The main two use cases I have in mind are 
1) really really really big abstract syntax trees or proof trees (a la 
compilers or theorem provers)
2) random access to numerical data 

does that help clarify what i'm asking about? In each case, is there a standard 
way of dealing with this?
in the case of (1) the sensible way seems to me to do some sort of zipper 
representation which loads adjacent nodes to some depth surrounding your 
current location in the tree,
and in the case of (2) it seems like the sensible way would be load 
subsequences of the data into memory.


- Original Message 
From: Andrew Coppin <[EMAIL PROTECTED]>
To: haskell-cafe@haskell.org
Sent: Monday, August 13, 2007 3:55:57 PM
Subject: Re: [Haskell-cafe] out of core computing in haskell

Carter T Schonwald wrote:
> Hello Everyone,
> I'm not quite sure if I'm posing this question correctly, but what 
> facilities currently exist in haskell to nicely deal with 
> datastructures that won't fit within a given machine's ram?
> And if there are no such facilities, what would it take to fix that?

If you just want to process a big chunk of data from one end to the 
other without loading it all into RAM at once... that's fairly easy. 
Haskell is a lazy language. By playing with functions such as 
getContents, you can automatically load the data as you access it. No 
tricky programming required.

If you're asking for something more specific -- e.g., "how do I talk to 
a standard database like Oracle / MySql / et al.", there are a couple of 
libraries for that. (Unfortunately, no one standard one.) See Stefan's 
answer.

If you'd like to be more specific about what you'd like to do...




Re: [Haskell-cafe] Dynamic thread management?

2007-08-13 Thread Jan-Willem Maessen


On Aug 13, 2007, at 2:53 PM, Mitar wrote:


Hi!

I am thinking about a model where you would have only n threads on a
n-core (or processor) machine. They would be your worker threads and
you would spawn them only once (at the beginning of the program) and
then just delegate work between them.

On 8/13/07, Jan-Willem Maessen <[EMAIL PROTECTED]> wrote:

The problem here is that while Cilk spawns are incredibly cheap,
they're still more than a simple procedure call (2-10x as expensive
if my fading memory serves me rightly).  Let's imagine we have a
nice, parallelizable computation that we've expressed using recursive
subdivision (the Cilk folks like to use matrix multiplication as an
example here).  Near the leaves of that computation we still spend
the majority of our time paying the overhead of spawning.  So we end
up actually imposing a depth bound, and writing two versions of our
computation---the one that spawns, which we use for coarse-grained
computations, and the one that doesn't, which we use when computation
gets fine-grained.  It makes a really big difference in practice.


But this could be done at the runtime too. If the
lazy-non-evaluated-yet chunk is "big" then divide it into a few parts
and run each part in its thread. But if the chunk is small (you are at
the end of the evaluation and you already evaluated necessary
subexpressions) you do it in the thread which encounters this
situation (which is probably near the end of the program or the end of
the monadic IO action).


I didn't make my point very well.  The hard part is determining  
exactly when a chunk is "big" or "small" without actually computing  
its value.  Consider recursive fib (which I use only because it's  
easy to understand, and has been used as an example of this problem  
by the Cilk implementors):


fib n = if n <= 1 then n else fib (n-1) + fib (n-2)

Imagine we're computing (fib 30).  We can divide and conquer; plenty  
of parallelism there!  But do most calls to fib represent enough work  
to justify running them in parallel?  No, because most of the calls  
are to (fib 0) or (fib 1)!  We should only pay the spawn cost up to a  
certain bound --- say n >= 5 --- and then run serially for smaller  
n.  This has a dramatic effect on how fast fib runs, but of course  
the best possible choice of bound is going to be machine-dependent.
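A rough sketch of that depth bound using par/pseq (pfib and sfib are made-up
names, and 5 is an arbitrary cutoff): spawn sparks only while the argument is
"big enough", then fall back to the plain serial definition.

    import Control.Parallel (par, pseq)

    pfib :: Int -> Int
    pfib n
      | n < 5     = sfib n                      -- serial below the cutoff
      | otherwise = x `par` (y `pseq` (x + y))  -- evaluate both branches in parallel
      where
        x = pfib (n - 1)
        y = pfib (n - 2)

    sfib :: Int -> Int
    sfib n = if n <= 1 then n else sfib (n - 1) + sfib (n - 2)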


We can instrument our program and have some chance of doing OK for  
fib :: Int -> Int; it's not at all obvious what to do for:

   myFunction :: [Int] -> (Int -> Bool) -> [Frob]

In effect, I end up needing to write myFunction with a built-in bound  
on computation, and I need to do it in such a way that the underlying  
systems knows that one branch should be serial and the other branch  
parallel.  This is the problem I was attempting to allude to above.   
Yes, we can decide on a function-by-function or callsite-by-callsite  
basis that "enough work" is being done to justify parallelism.  But  
the answer to this question is often "maybe", or even "no" (as in fib).



Yes, you have parMap but the problem I saw with it (and please correct
me if I am wrong) is that it spawns a new thread for every application
of the function to the element?



But what if functions are small? Then
this is quite an overhead. And you have to know this in advance if you
want to use something else than the default parMap which is not always
possible (if we are passing a function as an argument to the function
which calls map).

For example:

calculate f xs = foldl (+) 0 $ map f xs -- or foldr, I am not sure


You seem to be arguing that we can pick the right implementation of  
"map" and "fold" (serial or parallel) here if we only knew that xs  
was "big enough" and f "expensive enough".


I agree.  But that begs the question: let's assume "calculate" is a  
function that's called from numerous places, with a mixture of "big"  
and "small" arguments.  Now I need two versions of calculate, and I  
need to decide at each call site whether to call "big calculate" or  
"small calculate".  We also need to make sure any eagerness we  
introduce is semantically sound, too, but I think we've got a pretty  
good handle on that part in practice between my work on resource- 
bounded evaluation in Eager Haskell, Rob Ennals' work on eager  
evaluation in GHC, and the Singh & Harris paper.


That isn't to say that any of this is impossible---but it's going to  
take a while to figure out how to get it right [it'd be a great Ph.D.  
project to even knock off the low-hanging fruit like fib and  
recursive matrix multiply].


Meanwhile, we also need to work hard educating programmers how to  
write code that'll run in parallel in the first place, and giving  
them the libraries that'll make it easy.


-Jan-Willem Maessen





Re: [Haskell-cafe] out of core computing in haskell

2007-08-13 Thread Andrew Coppin

Carter T Schonwald wrote:

Hello Everyone,
I'm not quite sure if I'm posing this question correctly, but what 
facilities currently exist in haskell to nicely deal with 
datastructures that won't fit within a given machine's ram?

And if there are no such facilities, what would it take to fix that?


If you just want to process a big chunk of data from one end to the 
other without loading it all into RAM at once... that's fairly easy. 
Haskell is a lazy language. By playing with functions such as 
getContents, you can automatically load the data as you access it. No 
tricky programming required.
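A minimal sketch of that style:

    -- getContents reads stdin lazily, so the count below runs in roughly
    -- constant space however large the input is
    main :: IO ()
    main = do
      contents <- getContents
      print (length (lines contents))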


If you're asking for something more specific -- e.g., "how do I talk to 
a standard database like Oracle / MySql / et al.", there are a couple of 
libraries for that. (Unfortunately, no one standard one.) See Stefan's 
answer.


If you'd like to be more specific about what you'd like to do...




Re: [Haskell-cafe] out of core computing in haskell

2007-08-13 Thread Stefan O'Rear
On Mon, Aug 13, 2007 at 12:29:25PM -0700, Carter T Schonwald wrote:
> Hello Everyone,

> I'm not quite sure if I'm posing this question correctly, but what
> facilities currently exist in haskell to nicely deal with
> datastructures that won't fit within a given machine's ram?  And if
> there are no such facilities, what would it take to fix that?

You're asking correctly, and according to the Hackage list:

http://hackage.haskell.org/packages/archive/pkg-list.html

we have:

 * anydbm library and program: Interface for DBM-like database systems
 * BerkeleyDB library: Bindings for Berkeley DB v1.x
 * haskelldb library: SQL unwrapper for Haskell.
 * HDBC library
 * hsql library
 * PostgreSQL library: Thin wrapper over the C postgresql library

We also have Data.Binary and Data.ByteString, but those are more useful
for building data stores, than as data stores themselves.

Stefan




[Haskell-cafe] out of core computing in haskell

2007-08-13 Thread Carter T Schonwald
Hello Everyone,
I'm not quite sure if I'm posing this question correctly, but what facilities 
currently exist in haskell to nicely deal with datastructures that won't fit 
within a given machine's ram? 
And if there are no such facilities, what would it take to fix that? 

thanks
-Carter




Re: [Haskell-cafe] A few questions on primes generating.

2007-08-13 Thread Andrew Coppin

Stefan O'Rear wrote:

Also, large numbers don't (this is arguably a bug...) have restricted
types:

[EMAIL PROTECTED]:~$ ghc -e '100 :: Int'
-1486618624
  


So many other programming languages allow weird things to happen with 
numeric overflows... it would be nice if Haskell didn't.




Re: [Haskell-cafe] Dynamic thread management?

2007-08-13 Thread Mitar
Hi!

I am thinking about a model where you would have only n threads on an
n-core (or n-processor) machine. They would be your worker threads and
you would spawn them only once (at the beginning of the program) and
then just delegate work between them.
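
Roughly this shape, as a minimal sketch (runWithWorkers is a name I just
made up, and it ignores collecting results and shutting the pool down):

  import Control.Concurrent (forkIO)
  import Control.Concurrent.Chan (newChan, readChan, writeChan)
  import Control.Monad (forever, join, replicateM_)

  -- Spawn n workers once; afterwards we only delegate by pushing jobs
  -- (plain IO actions here) into a shared channel that the workers drain.
  runWithWorkers :: Int -> [IO ()] -> IO ()
  runWithWorkers n jobs = do
      ch <- newChan
      replicateM_ n (forkIO (forever (join (readChan ch))))
      mapM_ (writeChan ch) jobs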

On 8/13/07, Jan-Willem Maessen <[EMAIL PROTECTED]> wrote:
> The problem here is that while Cilk spawns are incredibly cheap,
> they're still more than a simple procedure call (2-10x as expensive
> if my fading memory serves me rightly).  Let's imagine we have a
> nice, parallelizable computation that we've expressed using recursive
> subdivision (the Cilk folks like to use matrix multiplication as an
> example here).  Near the leaves of that computation we still spend
> the majority of our time paying the overhead of spawning.  So we end
> up actually imposing a depth bound, and writing two versions of our
> computation---the one that spawns, which we use for coarse-grained
> computations, and the one that doesn't, which we use when computation
> gets fine-grained.  It makes a really big difference in practice.

But this could be done at runtime too. If the
lazy, not-yet-evaluated chunk is "big" then divide it into a few parts
and run each part in its own thread. But if the chunk is small (you are at
the end of the evaluation and you have already evaluated the necessary
subexpressions) you do it in the thread which encounters this
situation (which is probably near the end of the program or the end of
the monadic IO action).

And the threshold at which you choose whether or not to delegate could be
determined at runtime too.

In combination with some transactional memory or some other trick
which would be behind this delegation this could be probably possible.

We could also hint to the runtime that a function will probably take a
long time to compute (even if it is lazy) by making a type for such
functions which signals this. Of course this could also make
things worse if used improperly. But sometimes you know that you will
be running a map of a time-consuming function.

Yes, you have parMap but the problem I saw with it (and please correct
me if I am wrong) is that it spawns a new thread for every application
of the function to the element? But what if functions are small? Then
this is quite an overhead. And you have to know this in advance if you
want to use something else than the default parMap which is not always
possible (if we are passing a function as an argument to the function
which calls map).

For example:

calculate f xs = foldl (+) 0 $ map f xs -- or foldr, I am not sure

And I would like to see something like this: it gets to the point when
we need to evaluate this function call, for some "big" f and some
"big" list of xs, so the thread which gets to it starts evaluating the
first value, and when it starts with another (it is a recursive call, so
it is a "similar" evaluation) it sees that the other thread is empty
and that the function will probably take a long time (it speculates), so it
delegates it there and continues with the third element - that is, a
dummy recursive call to the function, in this case foldl (dummy
because it did not really evaluate everything at the previous level).
Now, if the second thread finishes first, it goes to the next element
(recursive call) but sees that it is already (still) being evaluated, so it
gets to the fourth. Otherwise, if the first thread finishes first, it just
goes to the next element.

This would be some kind of speculative evaluating. If the CPUs know
how to do that why would not we at the higher level?

It would also be an interesting lazy evaluation of, in this
example, the foldl function. The system would make a recursive call but
would just go another call deeper if it finds that it is impossible
(because it is still being evaluated somewhere else) to evaluate it. And at
every level it finishes, it would check the previous levels to see if it can
collapse them (and maybe prematurely finish the whole thing).

It would be as if I unwind the loop (in this case the recursion),
evaluate everything (on many threads) and then (try to) calculate the
value. If it finishes prematurely ... sad, but for "big" lists and
"big" functions it would be a win. And the size of this window
(the unwinding) would be a heuristic, speculative function of the number of
available threads (cores, processors) and of information about how long
previous evaluations of the same function have taken.
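
(In the meantime one can at least coarsen the granularity by hand; a rough
sketch of a chunked parallel map using par and pseq from Control.Parallel,
where chunk and parMapChunked are names I just made up:)

  import Control.Parallel (par, pseq)

  -- Split a list into chunks of size n.
  chunk :: Int -> [a] -> [[a]]
  chunk _ [] = []
  chunk n xs = let (h, t) = splitAt n xs in h : chunk n t

  -- One spark per chunk instead of one per element, so the spawning
  -- overhead is paid only once every n applications of f.
  parMapChunked :: Int -> (a -> b) -> [a] -> [b]
  parMapChunked n f = concat . go . map (map f) . chunk n
    where
      go []     = []
      go (c:cs) = let rest = go cs
                  in  seqList c `par` (rest `pseq` (c : rest))
      seqList ys = foldr seq () ys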

I really messed this explanation up. Or maybe it is completely wrong
and this is why it looks like a mess to me. :-)


Mitar


Re: [Haskell-cafe] GHC optimisations

2007-08-13 Thread Andrew Coppin

Stefan O'Rear wrote:

On Mon, Aug 13, 2007 at 07:35:31PM +0100, Andrew Coppin wrote:
  
(I once compiled a program that used the GHC API. The final binary was 
several times larger than ghc.exe...)



GHC is a particularly bad case because what it does is determined by the
settings of a bunch of switches in the configuration data.  Of course,
GHC isn't smart enough to perform inter-module control flow analysis, so
even with -split-objs you'd probably still link most of GHC.
  


Is it likely that GHC will ever become "smart enough" to do that kind of 
analysis? (I imagine this is going to be especially fun with precompiled 
libraries...)



This is called the Constructed Product Return (CPR) analysis, and it
applies to all types with one constructor (in archaic jargon, product
types).
  
Right. So it doesn't have to have strict fields or anything? Just has to 
have exactly one constructor?



Yep, CPR is completely independent of strictness, however the returned
product must be *new*, since returning an old object by value risks
losing sharing (and thus creating large memory leaks).
  


Right, OK.



Re: [Haskell-cafe] GHC optimisations

2007-08-13 Thread Stefan O'Rear
On Mon, Aug 13, 2007 at 07:35:31PM +0100, Andrew Coppin wrote:
> Not related to optimisation, but... is there some switch to warn you if 
> something gets removed? (Presumably this means you forgot to export 
> something, or you haven't coded the bit that calls it yet or something.)

-fwarn-unused-binds (included in -Wall)

> (I once compiled a program that used the GHC API. The final binary was 
> several times larger than ghc.exe...)

GHC is a particularly bad case because what it does is determined by the
settings of a bunch of switches in the configuration data.  Of course,
GHC isn't smart enough to perform inter-module control flow analysis, so
even with -split-objs you'd probably still link most of GHC.

>>> I read somewhere that if one function returns a tuple, and the caller 
>>> immediately extracts the values from the tuple, GHC tries to optimise 
>>> away the tuple - so it's as if the function can just return multiple 
>>> values at once. Is this true? Does it apply only to tuples, or to all 
>>> types?
>>
>> This is called the Constructed Product Return (CPR) analysis, and it
>> applies to all types with one constructor (in archaic jargon, product
>> types).
>
> Right. So it doesn't have to have strict fields or anything? Just has to 
> have exactly one constructor?

Yep, CPR is completely independent of strictness, however the returned
product must be *new*, since returning an old object by value risks
losing sharing (and thus creating large memory leaks).

Stefan




Re: [Haskell-cafe] GHC optimisations

2007-08-13 Thread Andrew Coppin

Stefan O'Rear wrote:

On Mon, Aug 13, 2007 at 07:05:03PM +0100, Andrew Coppin wrote:
  

Does GHC do dead code elimination?



Yes - all unused let-binders are removed.
  


Not related to optimisation, but... is there some switch to warn you if 
something gets removed? (Presumably this means you forgot to export 
something, or you haven't coded the bit that calls it yet or something.)


Will the generated executable contain the code for g and h, even though 
they aren't called? Or does the unused code get eliminated? How about the 
standard libraries?



By default, GHC generates one object file per source module, so if any
values from that module are used, the entire module is linked in.


Right, OK. I had a feeling that was the case.

(I once compiled a program that used the GHC API. The final binary was 
several times larger than ghc.exe...)



With
the -split-objs flag, GHC generates one object file per top level
function, sometimes significantly reducing the size of executables.
This is not the default because it puts tremendous strain on the linker.
It is common to spend several *minutes* linking base when compiling ghc,
and it's not unheard of for ar to simply run out of memory and die.
  


Ouch! o_O

Probably simpler for me, as a human, to split the program into a 
sensible module structure... ;-)


I read somewhere that if one function returns a tuple, and the caller 
immediately extracts the values from the tuple, GHC tries to optimise away 
the tuple - so it's as if the function can just return multiple values at 
once. Is this true? Does it apply only to tuples, or to all types?



This is called the Constructed Product Return (CPR) analysis, and it
applies to all types with one constructor (in archaic jargon, product
types).
  


Right. So it doesn't have to have strict fields or anything? Just has to 
have exactly one constructor?




Re: [Haskell-cafe] GHC optimisations

2007-08-13 Thread Stefan O'Rear
On Mon, Aug 13, 2007 at 07:05:03PM +0100, Andrew Coppin wrote:
> Does GHC do dead code elimination?

Yes - all unused let-binders are removed.

> I observe that if you take a module and edit it so that some function is 
> now no longer exported or called by any exported function, the size of the 
> *.o file seems to go down. This suggests that dead code within a single 
> module is eliminated.
>
> However... how about this?
>
>  module Foo where
>
>  f = ...
>  g = ...
>  h = ...
>
>  module Main where
>
>  import Foo
>
>  main = ...f...
>
> Will the generated executable contain the code for g and h, even though 
> they aren't called? Or does the unused code get eliminated? How about the 
> standard libraries?

By default, GHC generates one object file per source module, so if any
values from that module are used, the entire module is linked in.  With
the -split-objs flag, GHC generates one object file per top level
function, sometimes significantly reducing the size of executables.
This is not the default because it puts tremendous strain on the linker.
It is common to spend several *minutes* linking base when compiling ghc,
and it's not unheard of for ar to simply run out of memory and die.

> I read somewhere that if one function returns a tuple, and the caller 
> immediately extracts the values from the tuple, GHC tries to optimise away 
> the tuple - so it's as if the function can just return multiple values at 
> once. Is this true? Does it apply only to tuples, or to all types?

This is called the Constructed Product Return (CPR) analysis, and it
applies to all types with one constructor (in archaic jargon, product
types).

For instance,

(+) (I# x#) (I# y#) = I# (x# +# y#)

foo = case 2 + 2 of I# x# -> ...

is transformed into:

zdzp (I# x#) (I# y#) = (x# +# y#)

foo = case 2 + 2 of x# -> ...

This is one of the two main pillars of arithmetic unboxing, alongside
and equally important to strictness analysis.

Stefan




Re: [Haskell-cafe] Re: zip3, zip4 ... -> zipn?

2007-08-13 Thread Brent Yorgey
On 8/11/07, Per Vognsen <[EMAIL PROTECTED]> wrote:
>
> Applicative functors can indeed help:
>
> (,,,) <$> [1,2,3] <*> [-1,0,1] <*> [1,1,1] <*> [0,2,6]
>
> You just use n-1 commas when you want the effect of zipn.


Actually, that's not quite right, since that uses the applicative functor
related to the list monad (so you get a list of 4-tuples of all possible
combinations, rather than a 4-way zip).  To get the zip behavior, you need
to add a ZipList constructor in front of all the lists, and then apply
getZipList at the end to extract.
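
For example (a small sketch; zip4' is just a name I made up here):

  import Control.Applicative (ZipList(..), (<$>), (<*>))

  zip4' :: [a] -> [b] -> [c] -> [d] -> [(a, b, c, d)]
  zip4' as bs cs ds = getZipList $
      (,,,) <$> ZipList as <*> ZipList bs <*> ZipList cs <*> ZipList ds

  -- zip4' [1,2,3] [-1,0,1] [1,1,1] [0,2,6]
  --   == [(1,-1,1,0),(2,0,1,2),(3,1,1,6)]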

-Brent


[Haskell-cafe] GHC optimisations

2007-08-13 Thread Andrew Coppin

Does GHC do dead code elimination?

I observe that if you take a module and edit it so that some function is 
now no longer exported or called by any exported function, the size of 
the *.o file seems to go down. This suggests that dead code within a 
single module is eliminated.


However... how about this?

 module Foo where

 f = ...
 g = ...
 h = ...

 module Main where

 import Foo

 main = ...f...

Will the generated executable contain the code for g and h, even though 
they aren't called? Or does the unused code get eliminated? How about 
the standard libraries?




I read somewhere that if one function returns a tuple, and the caller 
immediately extracts the values from the tuple, GHC tries to optimise 
away the tuple - so it's as if the function can just return multiple 
values at once. Is this true? Does it apply only to tuples, or to all types?




RE: [Haskell-cafe] Explaining monads

2007-08-13 Thread peterv
When I read "side-effects", I understand it as "unwanted effects", like
"aliasing" and "effects depending on the order of execution". I'm not sure
if my understanding here is correct. I hope Haskell does not allow
"side-effects" but only "effects", meaning the monads do not let you write
the typical ill-behaving code you get when doing real imperative
programming: they enforce a single wiring of execution and do not allow you
to capture the RealWorld object. In Concurrent Clean special compiler
support is present to enforce "uniqueness typing", and in Haskell special
compiler support is available to make the RealWorld object unavailable at
runtime (so e.g. you can't put the RealWorld object in a list). Is this
correct?
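
My mental model is something like this toy definition (certainly not GHC's
actual one; all names here are invented):

  newtype World = World ()     -- stand-in for the primitive RealWorld

  newtype MyIO a = MyIO (World -> (a, World))

  -- Users only ever combine MyIO actions; as long as the MyIO constructor
  -- is not exported, the World token itself can never be put in a list or
  -- duplicated by ordinary code.
  returnIO :: a -> MyIO a
  returnIO a = MyIO (\w -> (a, w))

  bindIO :: MyIO a -> (a -> MyIO b) -> MyIO b
  bindIO (MyIO m) k = MyIO (\w ->
      let (a, w') = m w
          MyIO m' = k a
      in  m' w')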

BTW: What is the correct word in Haskell for "object"? I mean the (lazy)
value you get when evaluating a data constructor? 

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Kim-Ee Yeoh
Sent: Monday, August 13, 2007 15:30
To: haskell-cafe@haskell.org
Subject: Re: [Haskell-cafe] Explaining monads



Ronald Guida wrote:
> 
> Given the question "What is a Monad", I would have to say "A Monad is
> a device for sequencing side-effects."
> 

There are side-effects and there are side-effects. If the only
monad you use is Maybe, the only side-effect you get is a slight
warming of the CPU.

Dave Menendez pointed to that fine Wadler link earlier. Please read
it. To wit, in Section 2: "Explaining Monads" the "essence of an
algorithm can become buried under the plumbing required to carry
data from its point of creation to its point of use." Monads can
help keep the clarity of your code untrammelled by providing
implicit plumbing, "side-channels" if you prefer, when data is
moved around.

In fact if you follow Wadler all the way to his monadic expression
evaluator, you see that you could modularize your code in awesomely
cool ways. You get to see how the kernel of the expression
evaluator could be written for a generic monad and compiled
once-and-for-all. Any additional feature (the "variations") is
coded by enriching the monad.

Monads are powerful devices for modularizing code. Available for
free. Only in Haskell (thanks to type classes!). Get them today.

"Side-effects" is a piece of linguistic cruft played fast-and-loose
by too many people in this game. "Sequencing" suffers the same 
disease.

-- 
View this message in context:
http://www.nabble.com/Explaining-monads-tf4244948.html#a12126170
Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.



Re: [Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell

2007-08-13 Thread David Roundy
On Mon, Aug 13, 2007 at 01:27:45PM -0300, Isaac Dupree wrote:
> David Roundy wrote:
> >The only cost is that
> >this syntax relies on the do notation, and thus makes the desugaring of
> >that do notation slightly more complicated when used.
> 
> If I understand correctly,
> 
>  do
>   blah
>   f (do
>foo
>bar (<- action)
>  )
>   blah
> 
> has an ambiguity: which do-block is the action bound in?  I can easily 
> imagine myself being frustrated at having to refactor my code if the 
> defined answer is not the one I want at the moment.

It doesn't have an ambiguity, because it's defined to be bound in the
innermost do loop.  This isn't a new concept, the <- syntax in the existing
do notation has the same behavior.
-- 
David Roundy
Department of Physics
Oregon State University


Re: [Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell

2007-08-13 Thread Isaac Dupree

David Roundy wrote:

The only cost is that
this syntax relies on the do notation, and thus makes the desugaring of
that do notation slightly more complicated when used.


If I understand correctly,

 do
  blah
  f (do
   foo
   bar (<- action)
 )
  blah

has an ambiguity: which do-block is the action bound in?  I can easily 
imagine myself being frustrated at having to refactor my code if the 
defined answer is not the one I want at the moment.


Isaac


[Haskell-cafe] xpath types for SYB/generic haskell type system?

2007-08-13 Thread Alex Jacobson
The SYB papers provide really powerful functions for accessing and 
manipulating values in arbitrarily shaped containers.


The cost of this capability appears to be a loss of type checking.  For 
example, gfindtype x returns a Maybe y.
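
To make that concrete, a small example of what I mean (Person and Age are
made up; gfindtype comes from Data.Generics):

  {-# OPTIONS_GHC -fglasgow-exts #-}   -- for deriving Data and Typeable
  import Data.Generics (Data, Typeable, gfindtype)

  data Person = Person String Age   deriving (Show, Data, Typeable)
  data Age    = Age Int             deriving (Show, Data, Typeable)

  findAge :: Person -> Maybe Age
  findAge = gfindtype                -- Just (Age ...) at run time

  findBool :: Person -> Maybe Bool   -- type checks all the same,
  findBool = gfindtype               -- but can only ever be Nothing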


Given that the type checker actually has to know whether or not x 
actually contains a y inside of it, is there a way to annotate a 
gfindtype-style function so that it just returns a value, and the compiler 
enforces an error if it is applied at the wrong type?


It may not be in this version of Haskell, but it seems like there is no 
technical reason you could not have partial type annotations that 
describe the traversal strategies described in SYB.  Perhaps it is a 
type-level version of an xpath expression, e.g.


  myFindType::(.//y) x => x->y

The (.//y) x says that y is a type nested somewhere in x.

Note, since this is happening at compile time, this capability will 
still not prevent you from doing a (fromJust Nothing), but it still 
seems super valuable if you are doing Generic Haskell-style stuff?


Is there a mathematical reason why this wouldn't work?

-Alex-


Re: [Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell

2007-08-13 Thread Stefan O'Rear
On Mon, Aug 13, 2007 at 05:39:34PM +0200, apfelmus wrote:
> Stefan O'Rear schrieb:
>> On Mon, Aug 13, 2007 at 04:35:12PM +0200, apfelmus wrote:
>>> My assumption is that we have an equivalence
>>>
>>>   forall a,b . m (a -> m b) ~ (a -> m b)
>>>
>>> because any side effect executed by the extra m on the outside can well 
>>> be delayed until we are supplied a value a. Well, at least when all 
>>> arguments are fully applied, for some notion of "fully applied"
>> (\a x -> a >>= ($ x)) ((\f -> return f) X) ==> (β)
>> (\a x -> a >>= ($ x)) (return X)   ==> (β)
>> (\x -> (return X) >>= ($ x))   ==> (monad law)
>> (\x -> ($ x) X)==> (β on the sugar-hidden 
>> 'flip')
>> (\x -> X x)==> (η)
>> X
>> Up to subtle strictness bugs arising from my use of η :), you're safe.
>
> Yes, but that's only one direction :) The other one is the problem:
>
>  return . (\f x -> f >>= ($ x)) =?= id
>
> Here's a counterexample
>
>  f :: IO (a -> IO a)
>  f = writeAHaskellProgram >> return return
>
>  f' :: IO (a -> IO a)
>  f' = return $ (\f x -> f >>= ($ x)) $ f
>  ==> (β)
>   return $ \x -> (writeAHaskellProgram >> return return) >>= ($ x)
>  ==> (BIND)
>   return $ \x -> writeAHaskellProgram >> (return return >>= ($ x))
>  ==> (LUNIT)
>   return $ \x -> writeAHaskellProgram >> (($ x) return)
>  ==> (β)
>   return $ \x -> writeAHaskellProgram >> return x
>
> Those two are different, because
>
>  clever  = f  >> return () = writeAHaskellProgram
>  clever' = f' >> return () = return ()
>
> are clearly different ;)

I figured that wouldn't be a problem since our values don't escape, and
the functions we define all respect the embedding... More formally:

Projections and injections:

proj ma = \x -> ma >>= \ fn' -> fn' x
inj  fn = return fn

Define an equivalence relation:

ma ≡ mb <-> proj ma = proj mb

Projection respects equivalence:

ma ≡ mb -> proj ma = proj mb(intro ->)
ma ≡ mb => proj ma = proj mb(equiv def)
proj ma = proj mb => proj ma = proj mb  (assumption)

Application:

(@) ma1 = \x -> join (proj ma1 x)

Application respects equivalence:

ma1 ≡ ma2 -> (@) ma1 = (@) ma2                                       (intro ->)
ma1 ≡ ma2 => (@) ma1 = (@) ma2                                       (β)
ma1 ≡ ma2 => (\x -> join (proj ma1 x)) = (\x -> join (proj ma2 x))   (extensionality)
ma1 ≡ ma2 => join (proj ma1 x) = join (proj ma2 x)                   (application respects = left)
ma1 ≡ ma2 => proj ma1 x = proj ma2 x                                 (application respects = right)
ma1 = ma2 => proj ma1 = proj ma2                                     (lemma)

Stefan




Re: [Haskell-cafe] A few questions on primes generating.

2007-08-13 Thread Isaac Dupree
L.Guo wrote:
> Because 10,000,000 is too large for a Int

On my pitiful system,
> maxBound::Int
2147483647
is certainly greater than
1000
.

> it is always in type of Integer or some higher level data type.

Haskell doesn't do static checking like that. In GHC on my system (where
10,000,000,000 is too large for an Int - note ten zeroes),
> 100::Int
1410065408
(Hugs gives "Program error: arithmetic overflow")


Isaac


[Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell

2007-08-13 Thread apfelmus

Stefan O'Rear schrieb:

On Mon, Aug 13, 2007 at 04:35:12PM +0200, apfelmus wrote:

My assumption is that we have an equivalence

  forall a,b . m (a -> m b) ~ (a -> m b)

because any side effect executed by the extra m on the outside can well be 
delayed until we are supplied a value a. Well, at least when all arguments 
are fully applied, for some notion of "fully applied"


(\a x -> a >>= ($ x)) ((\f -> return f) X) ==> (β)
(\a x -> a >>= ($ x)) (return X)   ==> (β)
(\x -> (return X) >>= ($ x))   ==> (monad law)
(\x -> ($ x) X)==> (β on the sugar-hidden 'flip')
(\x -> X x)==> (η)
X

Up to subtle strictness bugs arising from my use of η :), you're safe.


Yes, but that's only one direction :) The other one is the problem:

 return . (\f x -> f >>= ($ x)) =?= id

Here's a counterexample

 f :: IO (a -> IO a)
 f = writeAHaskellProgram >> return return

 f' :: IO (a -> IO a)
 f' = return $ (\f x -> f >>= ($ x)) $ f
 ==> (β)
  return $ \x -> (writeAHaskellProgram >> return return) >>= ($ x)
 ==> (BIND)
  return $ \x -> writeAHaskellProgram >> (return return >>= ($ x))
 ==> (LUNIT)
  return $ \x -> writeAHaskellProgram >> (($ x) return)
 ==> (β)
  return $ \x -> writeAHaskellProgram >> return x

Those two are different, because

 clever  = f  >> return () = writeAHaskellProgram
 clever' = f' >> return () = return ()

are clearly different ;)

Regards,
apfelmus



Re: [Haskell-cafe] A few questions on primes generating.

2007-08-13 Thread Stefan O'Rear
On Mon, Aug 13, 2007 at 11:23:59PM +0800, L.Guo wrote:
> Because 10,000,000 is too large for a Int, it is always in type of Integer or 
> some higher level data type.

In Haskell, Int always supports at least -536,870,912 to 536,870,911.

Also, large numbers don't (this is arguably a bug...) have restricted
types:

[EMAIL PROTECTED]:~$ ghc -e '100 :: Int'
-1486618624
[EMAIL PROTECTED]:~$ 

Stefan




Re: [Haskell-cafe] A few questions on primes generating.

2007-08-13 Thread L.Guo
Because 10,000,000 is too large for an Int, it is always of type Integer or 
some higher-level data type.

--   
L.Guo
2007-08-13

-
From: Alexis Hazell
At: 2007-08-13 22:46:46
Subject: Re: [Haskell-cafe] A few questions on primes generating.

On Tuesday 14 August 2007 00:22, L.Guo wrote:

> 2) We have this type definition :
> pureSieve :: Int -> Int
>Why there is no error (type mismatch) of this call in func main :
> pureSieve 1000

The Haskell Report says that an Int covers at least the range [- 2^29, 2^29 - 
1], which that number is well within . . . . why do you think it should 
report a type error? 


Alexis.


Re: [Haskell-cafe] A few questions on primes generating.

2007-08-13 Thread Pekka Karjalainen
On 8/13/07, L.Guo <[EMAIL PROTECTED]> wrote:
> Hi All:

Hello,

>
> I am reading http://www.haskell.org/haskellwiki/Prime_numbers
>
> The code in sector "1 Bitwise prime sieve".
>
> I have 3 questions about it.
>
> 1) In function go, what does the number 46340 mean ? Is it sqrt(MAX_LONG) ?

Yes, it appears so. In a 32 bit implementation I get:

Prelude> sqrt $ fromIntegral (maxBound :: Int)
46340.950001051984

> 2) We have this type definition :
> pureSieve :: Int -> Int
>Why there is no error (type mismatch) of this call in func main :
> pureSieve 1000

If you have integer literals in your program, the compiler sees a
fromInteger in front of them. So the value is just converted to type
Int automatically, because that is expected here.

You can give different numeric default declarations in your own
modules. Please see sections 10.3 (for overloaded literals) and 10.4
(for defaults) here:
http://www.haskell.org/tutorial/numbers.html

Sometimes you can get an overflow like this:

Prelude> 1000 :: Int
-159383552

> 3) In main again, what does expression [| x |] mean ? Why this cannot be 
> execute in
> GHCi ?

It's Template Haskell, and is used there for some kind of optimisation
(I think). Template Haskell needs to be enabled with a command line
switch for it to work. Please see the documentation for more
information. It's section 7.6 in your User's Guide.

Though in this case you can probably just remove it to try out the
program. Perhaps someone else can explain what actual effect it has
here.
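
My guess is that on the wiki the quote sits inside a splice, so that the
sieve actually runs at compile time. A tiny illustration of that trick
(expensive is just a stand-in, not the wiki's pureSieve):

  {-# LANGUAGE TemplateHaskell #-}
  module Main where

  expensive :: Int
  expensive = sum [1 .. 10000]

  -- Quoting the let-bound x inside $( ... ) forces it to be computed while
  -- the splice is expanded, i.e. at compile time; the compiled program
  -- merely prints a literal constant.
  main :: IO ()
  main = print $( let x = expensive in [| x |] )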

Pekka


Re: [Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell

2007-08-13 Thread Stefan O'Rear
On Mon, Aug 13, 2007 at 04:35:12PM +0200, apfelmus wrote:
> My assumption is that we have an equivalence
>
>   forall a,b . m (a -> m b) ~ (a -> m b)
>
> because any side effect executed by the extra m on the outside can well be 
> delayed until we are supplied a value a. Well, at least when all arguments 
> are fully applied, for some notion of "fully applied"


(\a x -> a >>= ($ x)) ((\f -> return f) X) ==> (β)
(\a x -> a >>= ($ x)) (return X)   ==> (β)
(\x -> (return X) >>= ($ x))   ==> (monad law)
(\x -> ($ x) X)==> (β on the sugar-hidden 'flip')
(\x -> X x)==> (η)
X

Up to subtle strictness bugs arising from my use of η :), you're safe.

Stefan




Re: [Haskell-cafe] A few questions on primes generating.

2007-08-13 Thread Alexis Hazell
On Tuesday 14 August 2007 00:22, L.Guo wrote:

> 2) We have this type definition :
> pureSieve :: Int -> Int
>Why there is no error (type mismatch) of this call in func main :
> pureSieve 1000

The Haskell Report says that an Int covers at least the range [- 2^29, 2^29 - 
1], which that number is well within . . . . why do you think it should 
report a type error? 


Alexis.


Re: [Haskell-cafe] Dynamic thread management?

2007-08-13 Thread Jan-Willem Maessen


On Aug 11, 2007, at 12:35 PM, Brian Hurt wrote:



> You guys might also want to take a look at the Cilk programming
> language, and how it managed threads.  If you know C, learning Cilk
> is about 2 hours of work, as it's C with half a dozen extra
> keywords and a few new concepts.  I'd love to see Cilk - C +
> Haskell as a programming language.


It was called pH, and we (meaning Alejandro Caro and myself)  
implemented it back in the mid/late 90's using Lennart Augustsson's  
hbcc front end (which he hacked to add a bunch of pH-specific  
syntax).  Arvind and Nikhil wrote a textbook on pH programming.


There are two problems, still: one is that laziness means you can't  
actually prove you need something until very close to the time you  
actually want it.  By the time I know that I'm adding f x to g y,  
it's probably too late to usefully run them in parallel (unless  
they're both *very* large).  We used eager evaluation in pH---to the  
point that we actually gave up the ability to manipulate infinite  
lazy data structures.  In NDP they've done much the same thing, first  
instrumenting the program to see that the eagerness they introduce  
won't disrupt execution.  Even the "par" annotation has this feel: we  
are telling the implementation that it's OK to do some computation  
even if it isn't yet obvious that we'll need the results.


The second problem is controlling the overhead.  More on this below.

> The key idea of Cilk is that it's easier to deparallelize than it
> is to parallelize, especially automatically.  So the idea is that
> the program is written incredibly parallel, with huge numbers of
> microthreads, which are (on average) very cheap to spawn.  The
> runtime then deparallelizes the microthreads, running multiple
> microthreads sequentially within a single real thread (a worker
> thread).  Microthreads that live their entire life within a single
> real thread are cheap to spawn (as in "not much more expensive than
> a normal function call" cheap).


The problem here is that while Cilk spawns are incredibly cheap,  
they're still more than a simple procedure call (2-10x as expensive  
if my fading memory serves me rightly).  Let's imagine we have a  
nice, parallelizable computation that we've expressed using recursive  
subdivision (the Cilk folks like to use matrix multiplication as an  
example here).  Near the leaves of that computation we still spend  
the majority of our time paying the overhead of spawning.  So we end  
up actually imposing a depth bound, and writing two versions of our  
computation---the one that spawns, which we use for coarse-grained  
computations, and the one that doesn't, which we use when computation  
gets fine-grained.  It makes a really big difference in practice.
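
(A rough sketch of what this looks like in Haskell, using par and pseq from
Control.Parallel; pfib and sfib are made-up names, and picking the cutoff is
exactly the hand-tuning just described:)

  import Control.Parallel (par, pseq)

  -- Spawn sparks only near the root of the call tree; below the cutoff,
  -- fall back to the plain sequential version.
  pfib :: Int -> Int -> Integer
  pfib depth n
    | n < 2      = fromIntegral n
    | depth <= 0 = sfib n
    | otherwise  = x `par` (y `pseq` (x + y))
    where
      x = pfib (depth - 1) (n - 1)
      y = pfib (depth - 1) (n - 2)

  sfib :: Int -> Integer
  sfib n | n < 2     = fromIntegral n
         | otherwise = sfib (n - 1) + sfib (n - 2)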


The programmer is free to use this trick in any programming  
language.  But we haven't yet figured out a way to *avoid* the need  
to do so.  This continues to worry me to this day, because making the  
right choices is black magic and specific to a particular combination  
of algorithm and machine.


That said, there is some reason for optimism: the overhead of  
creating work in Cilk is comparable to the overhead of creating a  
thunk in Haskell.


> The problem that Cilk runs into is that it's, well, C.  It doesn't
> deal with contention well at all - a single microthread blocking
> blocks the whole worker thread - meaning, among other things, that
> you can have "false deadlocks", where one microthread blocks on
> another microthread in the same real thread, and thus acts like
> it's deadlocked even though it really isn't.


This is actually a fundamental problem with the threading model:  
there is no guarantee of fairness using work stealing, so if you do  
something that requires fair scheduling you get into a lot of trouble  
fast.  It's not fair to blame C for this.  You have to be very  
careful to define the interaction between fair IO-style threads and  
unfair get-my-work-done threads.


> You have greatly increased the likelihood of race conditions as well
> (mutable data and multithreading just don't mix).  Plus you have
> all the normal fun you have with C bugs - wild pointers, buffer
> overruns, etc.


This, however, *is* C's fault. :-)

More on pH: we got our programs to scale, but had troubles going past  
8 threads.  We found ourselves limited by a non-parallel GC (my  
fault, but labor-intensive to get right) and the absence of  
parallelism in the underlying algorithms.  For the latter problem  
there simply is no magic bullet.


-Jan-Willem Maessen





[Haskell-cafe] Re: a regressive view of support for imperative programming in Haskell

2007-08-13 Thread apfelmus

Benjamin Franksen wrote:

As has been already mentioned in this thread, in
http://www.soi.city.ac.uk/~ross/papers/Applicative.html Conor McBride and
Ross Paterson invent/explain a new type class that is now part of the base
package (Control.Applicative). They also use/propose syntactic sugar for
it, i.e.

pure f <*> u1 <*> ... <*> un

~~> (| f u1 ... un |)

(I just made up the symbols '(|' and '|)', the concrete syntax would have to
be fixed by people more knowledgeable than me.)


The problem with [| and |] lifted to monads is that this only works for
fully applied arguments, i.e. that

  handle :: IO Handle
  string :: IO String

  [| hPutStr handle string |] :: IO ()

works but

   [| hPutStr handle |]
  = join (return hPutStr `ap` handle)
 ^= join ((f :: m (a -> b -> m c)) `ap` (x :: m a))
  = join ( y :: m (b -> m c))

is a type error.

I think this is also what makes the (<- action) proposal so non-local
and what is the source of this whole discussion. The core problem is:

  Functions like  a -> b -> m c  can't be partially applied to
  monadic actions like  m a  without specifying the number of
  arguments in advance. In other words, such functions aren't
  curried correctly.

Clearly, LiftMn specifies the number of arguments. But _both_ the (<-) 
proposal and idiom brackets specify the number of arguments too! Namely 
by requiring that all arguments are fully applied. So, neither one is 
capable of partially applying the first argument without saturating the 
call, you can only write


  handle :: IO Handle

-- define putStr in terms of the above hPutStr
  putStr :: String -> IO ()
  putStr = \x -> [| hPutStr handle (return x) |]
  putStr = \x -> do { hPutStr (<- handle) x }


One way to get currying for monads is to work with functions

  m a -> m b -> m c

However, this type is larger than  a -> b -> m c  , i.e. the function

  from :: Monad m => (a -> b -> m c) -> (m a -> m b -> m c)
  from f ma mb = ma >>= \a -> mb >>= \b -> f a b

is not surjective since we could perform the actions in a different order

  from2 f ma mb = mb >>= \b -> ma >>= \a -> f a b

In other words, if someone gives you a value of type  m a -> m b -> m c 
 , then you can't convert it to  a -> b -> m c  and back without 
risking that you end up with a different result.



But there is another type suitable for currying

  m (a -> m (b -> m c))

which I believe is in some way equivalent to  a -> b -> m c

  from :: Monad m => (a -> b -> m c) -> m (a -> m (b -> m c))
  from f = return $ \a -> return $ \b -> f a b

  to   :: Monad m => m (a -> m (b -> m c)) -> (a -> b -> m c)
  to f a b = f >>= \g -> g a >>= \h -> h b

but I'm not sure. My assumption is that we have an equivalence

  forall a,b . m (a -> m b) ~ (a -> m b)

because any side effect executed by the extra m on the outside can well 
be delayed until we are supplied a value a. Well, at least when all 
arguments are fully applied, for some notion of "fully applied"


Anyway, here's how to curry with that type:

  (@) :: Monad m => m (a -> m b) -> (m a -> m b)
  (@) f x = join (f `ap` x)

  hPutStr :: IO (Handle -> IO (String -> IO ()))
  handle  :: IO Handle

  putStr :: IO (String -> IO ())
  putStr = hPutStr @ handle

With the infix type synonym

  type (~>) a b = a -> IO b

we can also write

  hPutStr :: IO (Handle ~> String ~> () )
  putStr  :: IO (String ~> () )

This is of course the Kleisli-Arrow which explains why currying works.


Regards,
apfelmus



[Haskell-cafe] A few questions on primes generating.

2007-08-13 Thread L.Guo
Hi All:

I am reading http://www.haskell.org/haskellwiki/Prime_numbers

The code in sector "1 Bitwise prime sieve".

I have 3 questions about it.

1) In function go, what does the number 46340 mean ? Is it sqrt(MAX_LONG) ?
2) We have this type definition :
pureSieve :: Int -> Int
   Why there is no error (type mismatch) of this call in func main :
pureSieve 1000
3) In main again, what does the expression [| x |] mean? Why can this not
be executed in GHCi?

Thanks for any advice.

Regards
--
L.Guo
2007-08-13



Re: [Haskell-cafe] Explaining monads

2007-08-13 Thread Kim-Ee Yeoh


Ronald Guida wrote:
> 
> Given the question "What is a Monad", I would have to say "A Monad is
> a device for sequencing side-effects."
> 

There are side-effects and there are side-effects. If the only
monad you use is Maybe, the only side-effect you get is a slight
warming of the CPU.

Dave Menendez pointed to that fine Wadler link earlier. Please read
it. To wit, in Section 2: "Explaining Monads" the "essence of an
algorithm can become buried under the plumbing required to carry
data from its point of creation to its point of use." Monads can
help keep the clarity of your code untrammelled by providing
implicit plumbing, "side-channels" if you prefer, when data is
moved around.

In fact if you follow Wadler all the way to his monadic expression
evaluator, you see that you could modularize your code in awesomely
cool ways. You get to see how the kernel of the expression
evaluator could be written for a generic monad and compiled
once-and-for-all. Any additional feature (the "variations") is
coded by enriching the monad.

Monads are powerful devices for modularizing code. Available for
free. Only in Haskell (thanks to type classes!). Get them today.

"Side-effects" is a piece of linguistic cruft played fast-and-loose
by too many people in this game. "Sequencing" suffers the same 
disease.

-- 
View this message in context: 
http://www.nabble.com/Explaining-monads-tf4244948.html#a12126170
Sent from the Haskell - Haskell-Cafe mailing list archive at Nabble.com.



[Haskell-cafe] Re: Language support for imperative code. Was: Re: monad subexpressions

2007-08-13 Thread apfelmus

Isaac Dupree schrieb:

apfelmus wrote:
Mutable data structures in the sense of ephemeral (= not persistent = 
update in-place) data structure indeed do introduce the need to work 
in ST since the old version is - by definition - not available anymore. 


Not in the quantum/information-theoretic sense, not necessarily. Consider

import Control.Monad.ST
import Data.STRef
main = print (runST (do
   r <- newSTRef 1
   notUnavailable <- readSTRef r
   writeSTRef r 5
   return notUnavailable
 ))


I'm not sure what this has to do with quantum mechanics ;) but you're 
right, I forgot that. This means that either STRefs cannot be updated 
in-place or that every read operation copies the contents or something 
like that.


In any case, simple values like Ints or Bools are rather uninteresting, 
update in-place is only important for larger structures like arrays. 
Here, ST does updates in-place and retaining an array will copy it.


Regards,
apfelmus



Re: [Haskell-cafe] IO within parser

2007-08-13 Thread Sam Hughes

Malte Milatz wrote:

Sam Hughes <[EMAIL PROTECTED]>, Sun, 12 Aug 2007 20:12:55 -0400:
[A parser like Parsec, with monad transformers:]

$ darcs get http://samuelhughes.com/darcs/partran/


Is this related in any way to the following GSoC project?

http://code.google.com/soc/2007/haskell/appinfo.html?csaid=B97EF4562EF3B244




No, I just wanted it for my own purposes.

- Sam


Re: [Haskell-cafe] Explaining monads

2007-08-13 Thread Brian Brunswick
On 11/08/07, Ronald Guida <[EMAIL PROTECTED]> wrote:
>
>
> Here's my interpretation of the table:
>
> ------------------------------------------------------------
> Structure   | Subject  |  Action    |  Verb      |  Result
> ------------+----------+------------+------------+----------
> function    |  a       |  a->b      |  flip ($)  |  b
> Functor     |  f a     |  a->b      |  <$>       |  f b
> Applicative |  f a     |  f (a->b)  |  flip <*>  |  f b
> Monad       |  m a     |  a->m b    |  >>=       |  m b
> Comonad     |  w a     |  w a->b    |  =>>       |  w b
> Arrow       |  a b c   |  a c d     |  >>>       |  a b d
> ------------------------------------------------------------



Nice

Kim-Ee Yeoh wrote:
> > ... I think you'll find that each of those structures have their
> > privileged place in your code.
>
> Agreed.  I'm still a beginner; I'm not sure how to choose one
> structure over another, at least not yet.  But that's because ...
>
> > Monads are undoubtedly more pervasive, and that could be because there
> > aren't as many arrow and comonad tutorials, atomic ones or otherwise.


And I'm trying to say that these shouldn't be separate tutorials at all -
it's much more instructive to compare and contrast.


> Moreover, Comonad isn't even in the standard libraries (Hoogle returns
> no results for it).
>
> When I searched for tutorials on monads, I found lots of them.  In
> fact, I have heard that writing (yet another) monad tutorial is part
> of standard Haskell initiation.



That's what I was doing above :-)




One thing that I keep seeing people say (not you) is that monads /sequence/
side effects. This is wrong, or at least a limited picture.

/All/ of the above structures are about combining compatible things
together in a row.
/None/ of them force any particular order of evaluation - that all comes
from the particular instance. So it's only a particular feature of IO that
it sequences the side effects. Others don't - we can have a lazy State
monad that just builds up big thunks.

IO could be implemented as any of the above structures, and still be
perfectly able to keep things in order. Indeed, uniqueness types, as in
Clean, are arguably just the first one - function composition.

A Functor IO would be really boring - we could just perform a sequence of
actions with no choices at all.
(But the whole sequence could be repeated, and I guess the Structure could
be nested for fixed loops.)

The key to the choice of IO as a Monad comes back to the argument about
'simplicity' or whatever we want to call it - I agree it's rather
context-dependent, and indeed I was rather flippant at the end of my first
message.

But lets look at the nature of the actual things being sequenced, the
actions above.

In only 3 cases are the actions simple enough to take a single /a/ argument.
Function a->b; Functor a->b;  Monad a->m b

In function and functor, the action takes no part in the complexity, doesn't
know about it.
In function application the action gets used (possibly) once, in functor and
monad possibly many times.
Only in Monad does the action have a say in the complexity, by returning a
possibly non-trivial m b.

So that's the reason Monads are so handy - the combined actions are simple,
but they can at least participate - influence the flow of control if we are
talking about an IO-like Monad.
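
A tiny illustration of that difference, using Maybe as the structure:

  import Control.Applicative ((<$>), (<*>))

  viaFunctor :: Maybe Int           -- the action (+1) is oblivious
  viaFunctor = (+ 1) <$> Just 41

  viaApplicative :: Maybe Int
  viaApplicative = (+) <$> Just 1 <*> Just 41

  viaMonad :: Maybe Int             -- only here can the action inspect x
  viaMonad = Just 41 >>= \x ->      -- and choose to fail
      if x > 0 then Just (x + 1) else Nothing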

> Also, arrows are supposed to be more general than both monads and
> comonads.  If I could just figure out what each structure (functor,
> monad, comonad, arrow) is doing, and compare and contrast them, then I
> probably will have made leaps of understanding.  I have a sense that
> tells me that the table above and the data structures below probably
> start to scratch the surface of understanding.
>
> --
>
>
Arrows are most general because they have full access to the complexity
going on in the
structure. Each arrow can do arbitrarily complex (possibly bidirectional)
negotiation with its
input, and possibly asynchronously arbitrarily complex negotiation with its
output. Any sort of
data can flow any way at any time, the only restriction is that for an
'Arrow a b' object, the
input side uses a's and the output b's.

Compare a monad - the input must be single, simple a's. All that can happen
is that the function gets called multiple times.

-- 
[EMAIL PROTECTED]


Re: [Haskell-cafe] Suffix tree

2007-08-13 Thread ajb
G'day all.

Quoting Jon Harrop <[EMAIL PROTECTED]>:

> Does anyone have any Haskell code implementing suffix trees? I'm particularly
> interested in high-performance construction.

No, but I have this:

http://citeseer.ist.psu.edu/giegerich95comparison.html

First person to implement them gets the eternal gratitude of hackage.

Cheers,
Andrew Bromage


Re: [Haskell-cafe] Suffix tree

2007-08-13 Thread Ketil Malde
On Sun, 2007-08-12 at 23:09 +0100, Jon Harrop wrote:
> Suffix trees are a data structure used to search for a substring of length 
> "m" 
> in a string of length "n" in O(m) time. Suffix trees can also be used for 
> efficient approximate searches. This data structure is of particular 
> importance in bioinformatics.

Suffix trees (and their sibling, suffix arrays) are cool.  Basically a
suffix tree is a trie, but where any nodes with only one child are
collapsed.  Easy to construct through sorting, linear time is trickier,
and perhaps the biggest problem is keeping memory use reasonable.

Suffix arrays is a sorted array of indices, so it's more compact, but
takes O(m log n) to search.  There's also the enhanced suffix array,
which provides the same functionality as the suffix tree, but uses
approximately the same space as well (something like 12n bytes vs 5n
bytes, IIRC).
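
For what it's worth, the sort-and-binary-search version is only a few lines
(a rough sketch; suffixArray and containsSub are made-up names, and this
construction is O(n^2 log n), not linear):

  import Data.Array (listArray, (!))
  import Data.List (sortBy)
  import Data.Ord (comparing)

  -- Indices of s, sorted by the suffix starting at each index.
  suffixArray :: String -> [Int]
  suffixArray s = sortBy (comparing (`drop` s)) [0 .. length s - 1]

  -- O(m log n) substring test: binary search over the sorted suffixes.
  containsSub :: String -> String -> Bool
  containsSub s p = go 0 (n - 1)
    where
      n  = length s
      sa = listArray (0, n - 1) (suffixArray s)
      go l h
        | l > h     = False
        | otherwise =
            let m   = (l + h) `div` 2
                suf = drop (sa ! m) s
            in  case compare (take (length p) suf) p of
                  LT -> go (m + 1) h
                  GT -> go l (m - 1)
                  EQ -> True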

> Does anyone have any Haskell code implementing suffix trees? I'm particularly 
> interested in high-performance construction.

I was using a suffix tree to index some data, but I discovered I only
needed the tree to constant depth, so - shame on me - I ended up simply
sorting the suffixes in one case, and storing hashed words in a Map in
another.

I've also toyed with FFI'ing to comressed suffix arrays, but in the end,
this wasn't as successful as I'd hoped.  For straightforward suffix
array construction, this is probably the fastest way to go about it,
though.

Linear time suffix trees are a bit complicated to implement in a
functional language, as they depend on additional links in the tree
during construction.  I'm sure this can be done using IO/STrefs, but it
would of course be cooler if it could be done tying-the-knot style. 

Anyway, I have a handful of shaky FFI bindings, a reasonable bibtex
file, and lots of good will and encouragement.  Let me know if you have
use for any of them!

-k
