Beginners Digest, Vol 58, Issue 36

beginners-request Tue, 23 Apr 2013 21:13:59 -0700

Send Beginners mailing list submissions to
        [email protected]

To subscribe or unsubscribe via the World Wide Web, visit
        http://www.haskell.org/mailman/listinfo/beginners
or, via email, send a message with subject or body 'help' to
        [email protected]

You can reach the person managing the list at
        [email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Beginners digest..."

Today's Topics:

   1.  What is the functional way of implementing a function that
      takes a long time to execute? (Costello, Roger L.)
   2. Re:  What is the functional way of implementing a function
      that takes a long time to execute? (Ertugrul S?ylemez)
   3.  How to avoid repeating a type restriction from   a data
      constructor (gs)
   4. Re:  How to avoid repeating a type restriction    from a data
      constructor (Daniel Fischer)
   5. Re:  What is the functional way of implementing a function
      that takes a long time to execute? (Costello, Roger L.)
   6. Re:  Editor choices (Keshav Kini)
   7. Re:  What is the functional way of implementing a function
      that takes a long time to execute? (Ertugrul S?ylemez)

----------------------------------------------------------------------

Message: 1
Date: Tue, 23 Apr 2013 12:29:57 +0000
From: "Costello, Roger L." <[email protected]>
Subject: [Haskell-beginners] What is the functional way of
        implementing a function that takes a long time to execute?
To: "[email protected]" <[email protected]>
Message-ID:
        <[email protected]>
Content-Type: text/plain; charset="us-ascii"

Hi Folks,

Suppose a function takes a long time to do its work.

Perhaps it takes minutes or even hours to complete. 

While it is crunching along, it would be nice to have some insight into its 
status such as (1) how close is it to completing? (2) what part of the task is 
it currently working on?

It might even be nice to be notified when it is finished.

What is the functional way of implementing the function?

/Roger

------------------------------

Message: 2
Date: Tue, 23 Apr 2013 15:28:49 +0200
From: Ertugrul S?ylemez <[email protected]>
Subject: Re: [Haskell-beginners] What is the functional way of
        implementing a function that takes a long time to execute?
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="us-ascii"

"Costello, Roger L." <[email protected]> wrote:

> Suppose a function takes a long time to do its work.
>
> Perhaps it takes minutes or even hours to complete.
>
> While it is crunching along, it would be nice to have some insight
> into its status such as (1) how close is it to completing? (2) what
> part of the task is it currently working on?
>
> It might even be nice to be notified when it is finished.
>
> What is the functional way of implementing the function?

The traditional way in Haskell to see intermediate results is not to
produce the final result only, but to produce a list of intermediate
results, the last of which is the final result.  The trick is to apply
what we call corecursion, which basically means:  Wrap the recursion in
an (ideally nonstrict) data constructor cell.  The function

    sqrtApprox :: Rational -> Rational

then becomes:

    sqrtApprox :: Rational -> [Rational]

You know that you have done it properly if you used proper corecursion,
which looks similar to this:

    loop x y = (x, y) : loop x' y'
        where
        x' = {- ... -}
        y' = {- ... -}

The important part is that the recursion is the right argument of the
constructor (:) and that the constructor application is the last thing
that happens.  You can make sure that you have done it properly and even
get some nice deforestation optimizations by using one of the predefined
corecursion operators like 'unfoldr', 'iterate', etc.

Sometimes you'll want to encode an algorithm even as a composition of
predefined corecursive formulas like 'map', 'filter', 'tails', etc.  For
example you may have a list [1,2,3,4,5,6,7,8,9] to begin with and you
want to encode the sum of the products of three consecutive values,
1*2*3 + 4*5*6 + 7*8*9 + 10:

    takeWhile (not . null) . map (take 3) . iterate (drop 3)

Now how do we produce online statistics while this algorithm calculates
its result?  This is the easiest part, but there is a catch.  You have a
corecursively produced list, which ensures that the list is lazy enough.
All you have to do now is to consume the list as part of an IO action.
The foldM combinator is most helpful here:

    sumStats :: (Num a) => [a] -> IO a
    sumStats = foldM f 0
        where
        f s x = do
            putStrLn ("Sum so far: " ++ s)
            return (s + x)

As said there is a catch.  In your recursive consumer you actually have
to make sure that the intermediate result is actually calculated.  This
is important, because otherwise your consumer doesn't actually force the
calculation, but really just builds up a large unevaluated expression,
which is only evaluated at the very end.

The easiest way to ensure this is to just do what I did in sumStats:
Print the intermediate result (you may call it 'state') or some value
derived from it (make sure that the value depends on the entire state).
If you don't want to perform some IO action with the state you can also
just be strict.  There are many ways to be strict, my favorite being

    f s x = do
        {- ... -}
        return $! s + x

but you can also use the BangPatterns extension.

The basic idea of all this is that you turn an opaque monolithic formula
into a stream processing formula.  This not only makes it more flexible,
but may also help you understand the original problem better and split
it into independent modules.  Remember that you can always compose
stream processors using regular function composition as done above.

Once you are comfortable with using lists for stream processing you can
also use an actual stream processing abstraction.  My personal favorite
right now is the 'pipes' library, but there are other useful libraries
including the other modern library 'conduit' as well as the traditional
'enumerator' library.

This of course does not end with streams.  Streams are for algorithms
with a linear execution paths, but not every algorithm follows that.  If
your formula naturally follows multiple paths, there is nothing wrong
with corecursively producing and recursively consuming trees or graphs,
for example.  You just need unfoldTree+foldTreeM or
unfoldGraph+foldGraphM for that.  This works with every algebraic data
structure.

As a final remark, don't worry about the performance.  Haskell is a lazy
language.  Done properly at least the intermediate lists will be
optimized away by the compiler and the resulting machine code is close
to what a C compiler would have produced for the original monolithic
formula randomly interspersed with statistics printing commands.

Greets
Ertugrul

-- 
Not to be or to be and (not to be or to be and (not to be or to be and
(not to be or to be and ... that is the list monad.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20130423/83bd322b/attachment-0001.pgp>

------------------------------

Message: 3
Date: Tue, 23 Apr 2013 15:05:05 +0000 (UTC)
From: gs <[email protected]>
Subject: [Haskell-beginners] How to avoid repeating a type restriction
        from    a data constructor
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset=us-ascii

In
http://code.accursoft.com/binding/src/c13ccbbec0ba8e326369ff2252863f20a891ef76/binding-core/src/Data/Binding/Simple.hs
there's a data constructor
    data Source v a = Variable v => Source {bindings :: v [Binding a], var
:: v a}
Source is used many times in this file and in
http://code.accursoft.com/binding/src/c13ccbbec0ba8e326369ff2252863f20a891ef76/binding-core/src/Data/Binding/List.hs,
and each time, I have to repeat the context Variable v. Is there any way to
avoid this redundancy?

------------------------------

Message: 4
Date: Tue, 23 Apr 2013 17:31:59 +0200
From: Daniel Fischer <[email protected]>
Subject: Re: [Haskell-beginners] How to avoid repeating a type
        restriction     from a data constructor
To: [email protected]
Cc: gs <[email protected]>
Message-ID: <[email protected]>
Content-Type: text/plain; charset="us-ascii"

On Tuesday 23 April 2013, 15:05:05, gs wrote:
> In
> http://code.accursoft.com/binding/src/c13ccbbec0ba8e326369ff2252863f20a891ef
> 76/binding-core/src/Data/Binding/Simple.hs there's a data constructor
>     data Source v a = Variable v => Source {bindings :: v [Binding a], var
> 
> :: v a}
> 
> Source is used many times in this file and in
> http://code.accursoft.com/binding/src/c13ccbbec0ba8e326369ff2252863f20a891ef
> 76/binding-core/src/Data/Binding/List.hs, and each time, I have to repeat
> the context Variable v. Is there any way to avoid this redundancy?
> 

Use a GADT,

{-# LANGUAGE GADTs #-}

data Source x y where
    Source :: Variable v => { bindings :: v [Binding a], var :: v a }
                     -> Source v a

The `Variable v` context becomes available by pattern-matching on the 
constructor `Source` (but not by using the field names to deconstruct a value 
of type `Source v a`!). With the original datatype context, that wasn't so 
(thus datatype contexts were mostly unhelpful; they have been removed from the 
language, and now require the DatatypeContexts extension in GHC).

If you can't change the definition of the datatype, you can't avoid the 
redundancy, you have to mention the `Variable v` context everywhere Source is 
used (unless you omit type signatures altogether).

------------------------------

Message: 5
Date: Tue, 23 Apr 2013 22:55:48 +0000
From: "Costello, Roger L." <[email protected]>
Subject: Re: [Haskell-beginners] What is the functional way of
        implementing a function that takes a long time to execute?
To: "The Haskell-Beginners Mailing List - Discussion of primarily
        beginner-level topics related to Haskell" <[email protected]>
Message-ID:
        <[email protected]>
Content-Type: text/plain; charset="iso-8859-1"

Hello Ertugrul,

Thank you for your detailed explanation. I am not sure that I understand all of 
its subtleties.

I am wondering if you would be willing to provide us with a simple, complete 
example that we can try out on our own machines? 

/Roger

-----Original Message-----
From: [email protected] [mailto:[email protected]] On 
Behalf Of Ertugrul S?ylemez
Sent: Tuesday, April 23, 2013 9:29 AM
To: [email protected]
Subject: Re: [Haskell-beginners] What is the functional way of implementing a 
function that takes a long time to execute?

"Costello, Roger L." <[email protected]> wrote:

> Suppose a function takes a long time to do its work.
>
> Perhaps it takes minutes or even hours to complete.
>
> While it is crunching along, it would be nice to have some insight
> into its status such as (1) how close is it to completing? (2) what
> part of the task is it currently working on?
>
> It might even be nice to be notified when it is finished.
>
> What is the functional way of implementing the function?

The traditional way in Haskell to see intermediate results is not to
produce the final result only, but to produce a list of intermediate
results, the last of which is the final result.  The trick is to apply
what we call corecursion, which basically means:  Wrap the recursion in
an (ideally nonstrict) data constructor cell.  The function

    sqrtApprox :: Rational -> Rational

then becomes:

    sqrtApprox :: Rational -> [Rational]

You know that you have done it properly if you used proper corecursion,
which looks similar to this:

    loop x y = (x, y) : loop x' y'
        where
        x' = {- ... -}
        y' = {- ... -}

The important part is that the recursion is the right argument of the
constructor (:) and that the constructor application is the last thing
that happens.  You can make sure that you have done it properly and even
get some nice deforestation optimizations by using one of the predefined
corecursion operators like 'unfoldr', 'iterate', etc.

Sometimes you'll want to encode an algorithm even as a composition of
predefined corecursive formulas like 'map', 'filter', 'tails', etc.  For
example you may have a list [1,2,3,4,5,6,7,8,9] to begin with and you
want to encode the sum of the products of three consecutive values,
1*2*3 + 4*5*6 + 7*8*9 + 10:

    takeWhile (not . null) . map (take 3) . iterate (drop 3)

Now how do we produce online statistics while this algorithm calculates
its result?  This is the easiest part, but there is a catch.  You have a
corecursively produced list, which ensures that the list is lazy enough.
All you have to do now is to consume the list as part of an IO action.
The foldM combinator is most helpful here:

    sumStats :: (Num a) => [a] -> IO a
    sumStats = foldM f 0
        where
        f s x = do
            putStrLn ("Sum so far: " ++ s)
            return (s + x)

As said there is a catch.  In your recursive consumer you actually have
to make sure that the intermediate result is actually calculated.  This
is important, because otherwise your consumer doesn't actually force the
calculation, but really just builds up a large unevaluated expression,
which is only evaluated at the very end.

The easiest way to ensure this is to just do what I did in sumStats:
Print the intermediate result (you may call it 'state') or some value
derived from it (make sure that the value depends on the entire state).
If you don't want to perform some IO action with the state you can also
just be strict.  There are many ways to be strict, my favorite being

    f s x = do
        {- ... -}
        return $! s + x

but you can also use the BangPatterns extension.

The basic idea of all this is that you turn an opaque monolithic formula
into a stream processing formula.  This not only makes it more flexible,
but may also help you understand the original problem better and split
it into independent modules.  Remember that you can always compose
stream processors using regular function composition as done above.

Once you are comfortable with using lists for stream processing you can
also use an actual stream processing abstraction.  My personal favorite
right now is the 'pipes' library, but there are other useful libraries
including the other modern library 'conduit' as well as the traditional
'enumerator' library.

This of course does not end with streams.  Streams are for algorithms
with a linear execution paths, but not every algorithm follows that.  If
your formula naturally follows multiple paths, there is nothing wrong
with corecursively producing and recursively consuming trees or graphs,
for example.  You just need unfoldTree+foldTreeM or
unfoldGraph+foldGraphM for that.  This works with every algebraic data
structure.

As a final remark, don't worry about the performance.  Haskell is a lazy
language.  Done properly at least the intermediate lists will be
optimized away by the compiler and the resulting machine code is close
to what a C compiler would have produced for the original monolithic
formula randomly interspersed with statistics printing commands.

Greets
Ertugrul

-- 
Not to be or to be and (not to be or to be and (not to be or to be and
(not to be or to be and ... that is the list monad.

------------------------------

Message: 6
Date: Tue, 23 Apr 2013 18:30:59 -0700
From: Keshav Kini <[email protected]>
Subject: Re: [Haskell-beginners] Editor choices
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain

Hollister Herhold <[email protected]> writes:
> I like haskell-mode quite a bit. But the divide does run deep indeed.
>
> You can always run emacs in vi-mode.   :)

I do my Haskell coding with haskell-mode + evil-mode, which gives me
something pretty close to the familiar vim experience plus the power of
haskell-mode.

-Keshav

------------------------------

Message: 7
Date: Wed, 24 Apr 2013 06:12:59 +0200
From: Ertugrul S?ylemez <[email protected]>
Subject: Re: [Haskell-beginners] What is the functional way of
        implementing a function that takes a long time to execute?
To: [email protected]
Message-ID: <[email protected]>
Content-Type: text/plain; charset="us-ascii"

"Costello, Roger L." <[email protected]> wrote:

> Thank you for your detailed explanation. I am not sure that I
> understand all of its subtleties.
>
> I am wondering if you would be willing to provide us with a simple,
> complete example that we can try out on our own machines?

I have written the Lucas-Lehmer primality test in various variants in
the this paste:

    <http://hpaste.org/86449>

The gist of it is that for a prime p you generate a sequence, pick a
particular element from it and compare it to zero.  If it is zero, then
2^p - 1 is a prime number.  In the paste the llSeq function is universal
to both the pure algorithm (llTest) that only computes the result as
well as the stats variant (llTestStats) that is an IO action and tells
you how many iterations are left to go.

For comparison I have also written a monolithic variant (llTestMono)
that doesn't use the stream processing style and shows how you would
implement the same algorithm in an imperative language.  This is
monolithic in that there is no flexibility here.

They are all about equal in speed.  Just compile and run with the
argument 10007:

    ./lucas-lehmer 10007

This will test whether the number 2^10007 - 1 is prime.  It should be
finished in less than a second and conclude with False.  The argument
11213 will conclude with True.

Greets,
Ertugrul

-- 
Not to be or to be and (not to be or to be and (not to be or to be and
(not to be or to be and ... that is the list monad.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: 
<http://www.haskell.org/pipermail/beginners/attachments/20130424/2ecea472/attachment.pgp>

------------------------------

_______________________________________________
Beginners mailing list
[email protected]
http://www.haskell.org/mailman/listinfo/beginners

End of Beginners Digest, Vol 58, Issue 36
*****************************************

Beginners Digest, Vol 58, Issue 36

Reply via email to