Re: [Haskell-cafe] What's the advantage of writing Haskell this way?

2011-05-31 Thread John Ky
Thanks Malcom.

I suspected that much, so I added it:

data Stream m a
= Chunks (m a)
| EOF
deriving (Show, Eq)

instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a) where
mempty = Chunks mempty
mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
mappend _ _ = EOF

instance (Monad m, MonadPlus m) => Monad (Stream m) where
return = Chunks . return
Chunks xs >>= f = mconcat (fmap f xs)
EOF >>= _ = EOF

This gives me the error:

Iteratee.hs:30:10:
Non type-variable argument in the constraint: Monoid (m a)
(Use -XFlexibleContexts to permit this)
In the context: (Monad m, MonadPlus m, Monoid (m a))
While checking the context of an instance declaration
In the instance declaration for `Monoid (Stream m a)'

So I run with the new flag:

ghci -XFlexibleContexts Iteratee.hs

Then I get the following error instead:

Iteratee.hs:37:43:
Could not deduce (m ~ [])
from the context (Monad m, MonadPlus m)
  bound by the instance declaration at Iteratee.hs:35:10-51
  `m' is a rigid type variable bound by
  the instance declaration at Iteratee.hs:35:17
Expected type: [a]
  Actual type: m a
In the second argument of `fmap', namely `xs'
In the first argument of `mconcat', namely `(fmap f xs)'
In the expression: mconcat (fmap f xs)

Which is complaining about the line I highlighted above.  So I try:

data Stream m a
= Chunks (m a)
| EOF
deriving (Show, Eq)

instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a) where
mempty = Chunks mempty
mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
mappend _ _ = EOF

instance (Monad m, MonadPlus m, Monoid (m a)) => Monad (Stream m) where
return = Chunks . return
Chunks xs >>= f = mconcat (fmap f xs)
EOF >>= _ = EOF

But the same trick doesn't work:

Iteratee.hs:35:10:
Variable occurs more often in a constraint than in the instance head
  in the constraint: Monoid (m a)
(Use -XUndecidableInstances to permit this)
In the instance declaration for `Monad (Stream m)'

Is that because I don't use a on the right hand side of =>?

Cheers,

-John

On 31 May 2011 15:54, Malcolm Wallace  wrote:

>
> instance (Monad m, MonadPlus m) => Monoid (Stream m a) where
>
>  mempty = Chunks mempty
> mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
>  mappend _ _ = EOF
>
>
> Iteratee.hs:28:25:
> No instance for (Monoid (m a))
>   arising from a use of `mempty'
>
>
> There is a clue in the first part of the error message.  Add the required
> instance as part of the predicate:
>
> instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a) where
> ...
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What's the advantage of writing Haskell this way?

2011-05-31 Thread Lyndon Maydwell
I think this is because mconcat expects a list.

On Tue, May 31, 2011 at 3:31 PM, John Ky  wrote:
> Thanks Malcom.
> I suspected that much, so I added it:
> data Stream m a
> = Chunks (m a)
> | EOF
> deriving (Show, Eq)
> instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a) where
> mempty = Chunks mempty
> mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
> mappend _ _ = EOF
> instance (Monad m, MonadPlus m) => Monad (Stream m) where
> return = Chunks . return
> Chunks xs >>= f = mconcat (fmap f xs)
> EOF >>= _ = EOF
> This gives me the error:
> Iteratee.hs:30:10:
>     Non type-variable argument in the constraint: Monoid (m a)
>     (Use -XFlexibleContexts to permit this)
>     In the context: (Monad m, MonadPlus m, Monoid (m a))
>     While checking the context of an instance declaration
>     In the instance declaration for `Monoid (Stream m a)'
> So I run with the new flag:
> ghci -XFlexibleContexts Iteratee.hs
> Then I get the following error instead:
> Iteratee.hs:37:43:
>     Could not deduce (m ~ [])
>     from the context (Monad m, MonadPlus m)
>       bound by the instance declaration at Iteratee.hs:35:10-51
>       `m' is a rigid type variable bound by
>           the instance declaration at Iteratee.hs:35:17
>     Expected type: [a]
>       Actual type: m a
>     In the second argument of `fmap', namely `xs'
>     In the first argument of `mconcat', namely `(fmap f xs)'
>     In the expression: mconcat (fmap f xs)
> Which is complaining about the line I highlighted above.  So I try:
> data Stream m a
> = Chunks (m a)
> | EOF
> deriving (Show, Eq)
> instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a) where
> mempty = Chunks mempty
> mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
> mappend _ _ = EOF
> instance (Monad m, MonadPlus m, Monoid (m a)) => Monad (Stream m) where
> return = Chunks . return
> Chunks xs >>= f = mconcat (fmap f xs)
> EOF >>= _ = EOF
> But the same trick doesn't work:
> Iteratee.hs:35:10:
>     Variable occurs more often in a constraint than in the instance head
>       in the constraint: Monoid (m a)
>     (Use -XUndecidableInstances to permit this)
>     In the instance declaration for `Monad (Stream m)'
> Is that because I don't use a on the right hand side of =>?
> Cheers,
> -John
> On 31 May 2011 15:54, Malcolm Wallace  wrote:
>>
>> instance (Monad m, MonadPlus m) => Monoid (Stream m a) where
>>
>> mempty = Chunks mempty
>> mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
>> mappend _ _ = EOF
>>
>> Iteratee.hs:28:25:
>>     No instance for (Monoid (m a))
>>       arising from a use of `mempty'
>>
>> There is a clue in the first part of the error message.  Add the required
>> instance as part of the predicate:
>> instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a) where
>> ...
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What's the advantage of writing Haskell this way?

2011-05-31 Thread Yves Parès
Maybe you are looking for a more generic way to concatenate it:
There is 
fold::
(Foldable t, Monoid m) => t
m -> 
min
Data.Foldable, but it would add another Foldable constraint.

You search a function like:
concatMPlus :: (MonadPlus m, Monoid a) => m a -> a
but this cannot exist ;) ("m a -> m a" would, but not "m a -> a")


2011/5/31 Lyndon Maydwell 

> I think this is because mconcat expects a list.
>
> On Tue, May 31, 2011 at 3:31 PM, John Ky  wrote:
> > Thanks Malcom.
> > I suspected that much, so I added it:
> > data Stream m a
> > = Chunks (m a)
> > | EOF
> > deriving (Show, Eq)
> > instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a)
> where
> > mempty = Chunks mempty
> > mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
> > mappend _ _ = EOF
> > instance (Monad m, MonadPlus m) => Monad (Stream m) where
> > return = Chunks . return
> > Chunks xs >>= f = mconcat (fmap f xs)
> > EOF >>= _ = EOF
> > This gives me the error:
> > Iteratee.hs:30:10:
> > Non type-variable argument in the constraint: Monoid (m a)
> > (Use -XFlexibleContexts to permit this)
> > In the context: (Monad m, MonadPlus m, Monoid (m a))
> > While checking the context of an instance declaration
> > In the instance declaration for `Monoid (Stream m a)'
> > So I run with the new flag:
> > ghci -XFlexibleContexts Iteratee.hs
> > Then I get the following error instead:
> > Iteratee.hs:37:43:
> > Could not deduce (m ~ [])
> > from the context (Monad m, MonadPlus m)
> >   bound by the instance declaration at Iteratee.hs:35:10-51
> >   `m' is a rigid type variable bound by
> >   the instance declaration at Iteratee.hs:35:17
> > Expected type: [a]
> >   Actual type: m a
> > In the second argument of `fmap', namely `xs'
> > In the first argument of `mconcat', namely `(fmap f xs)'
> > In the expression: mconcat (fmap f xs)
> > Which is complaining about the line I highlighted above.  So I try:
> > data Stream m a
> > = Chunks (m a)
> > | EOF
> > deriving (Show, Eq)
> > instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a)
> where
> > mempty = Chunks mempty
> > mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
> > mappend _ _ = EOF
> > instance (Monad m, MonadPlus m, Monoid (m a)) => Monad (Stream m) where
> > return = Chunks . return
> > Chunks xs >>= f = mconcat (fmap f xs)
> > EOF >>= _ = EOF
> > But the same trick doesn't work:
> > Iteratee.hs:35:10:
> > Variable occurs more often in a constraint than in the instance head
> >   in the constraint: Monoid (m a)
> > (Use -XUndecidableInstances to permit this)
> > In the instance declaration for `Monad (Stream m)'
> > Is that because I don't use a on the right hand side of =>?
> > Cheers,
> > -John
> > On 31 May 2011 15:54, Malcolm Wallace  wrote:
> >>
> >> instance (Monad m, MonadPlus m) => Monoid (Stream m a) where
> >>
> >> mempty = Chunks mempty
> >> mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
> >> mappend _ _ = EOF
> >>
> >> Iteratee.hs:28:25:
> >> No instance for (Monoid (m a))
> >>   arising from a use of `mempty'
> >>
> >> There is a clue in the first part of the error message.  Add the
> required
> >> instance as part of the predicate:
> >> instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a)
> where
> >> ...
> >
> > ___
> > Haskell-Cafe mailing list
> > Haskell-Cafe@haskell.org
> > http://www.haskell.org/mailman/listinfo/haskell-cafe
> >
> >
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Please add a method for optimized concat to the Semigroup class

2011-05-31 Thread Yitzchak Gale
Edward Kmett wrote:
> I felt I should probably mention that ultimately what was done is I moved
> NonEmpty all the way down into semigroups and chose
>> sconcat :: NonEmpty a -> a
> at it was the closest analogue to the current mconcat behavior.
> So, request accomodated. ;)

Indeed, this is an excellent solution. Thanks!

-Yitz

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What's the advantage of writing Haskell this way?

2011-05-31 Thread Lyndon Maydwell
Heh. Looks like there will be about five class constraints, but it
will still be more general.

There must be some higher level abstraction that is less ugly.

On Tue, May 31, 2011 at 3:45 PM, Yves Parès  wrote:
> Maybe you are looking for a more generic way to concatenate it:
> There is fold :: (Foldable t, Monoid m) => t m -> m in Data.Foldable, but it
> would add another Foldable constraint.
>
> You search a function like:
> concatMPlus :: (MonadPlus m, Monoid a) => m a -> a
> but this cannot exist ;) ("m a -> m a" would, but not "m a -> a")
>
>
> 2011/5/31 Lyndon Maydwell 
>>
>> I think this is because mconcat expects a list.
>>
>> On Tue, May 31, 2011 at 3:31 PM, John Ky  wrote:
>> > Thanks Malcom.
>> > I suspected that much, so I added it:
>> > data Stream m a
>> > = Chunks (m a)
>> > | EOF
>> > deriving (Show, Eq)
>> > instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a)
>> > where
>> > mempty = Chunks mempty
>> > mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
>> > mappend _ _ = EOF
>> > instance (Monad m, MonadPlus m) => Monad (Stream m) where
>> > return = Chunks . return
>> > Chunks xs >>= f = mconcat (fmap f xs)
>> > EOF >>= _ = EOF
>> > This gives me the error:
>> > Iteratee.hs:30:10:
>> >     Non type-variable argument in the constraint: Monoid (m a)
>> >     (Use -XFlexibleContexts to permit this)
>> >     In the context: (Monad m, MonadPlus m, Monoid (m a))
>> >     While checking the context of an instance declaration
>> >     In the instance declaration for `Monoid (Stream m a)'
>> > So I run with the new flag:
>> > ghci -XFlexibleContexts Iteratee.hs
>> > Then I get the following error instead:
>> > Iteratee.hs:37:43:
>> >     Could not deduce (m ~ [])
>> >     from the context (Monad m, MonadPlus m)
>> >       bound by the instance declaration at Iteratee.hs:35:10-51
>> >       `m' is a rigid type variable bound by
>> >           the instance declaration at Iteratee.hs:35:17
>> >     Expected type: [a]
>> >       Actual type: m a
>> >     In the second argument of `fmap', namely `xs'
>> >     In the first argument of `mconcat', namely `(fmap f xs)'
>> >     In the expression: mconcat (fmap f xs)
>> > Which is complaining about the line I highlighted above.  So I try:
>> > data Stream m a
>> > = Chunks (m a)
>> > | EOF
>> > deriving (Show, Eq)
>> > instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a)
>> > where
>> > mempty = Chunks mempty
>> > mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
>> > mappend _ _ = EOF
>> > instance (Monad m, MonadPlus m, Monoid (m a)) => Monad (Stream m) where
>> > return = Chunks . return
>> > Chunks xs >>= f = mconcat (fmap f xs)
>> > EOF >>= _ = EOF
>> > But the same trick doesn't work:
>> > Iteratee.hs:35:10:
>> >     Variable occurs more often in a constraint than in the instance head
>> >       in the constraint: Monoid (m a)
>> >     (Use -XUndecidableInstances to permit this)
>> >     In the instance declaration for `Monad (Stream m)'
>> > Is that because I don't use a on the right hand side of =>?
>> > Cheers,
>> > -John
>> > On 31 May 2011 15:54, Malcolm Wallace  wrote:
>> >>
>> >> instance (Monad m, MonadPlus m) => Monoid (Stream m a) where
>> >>
>> >> mempty = Chunks mempty
>> >> mappend (Chunks xs) (Chunks ys) = Chunks (xs `mappend` ys)
>> >> mappend _ _ = EOF
>> >>
>> >> Iteratee.hs:28:25:
>> >>     No instance for (Monoid (m a))
>> >>       arising from a use of `mempty'
>> >>
>> >> There is a clue in the first part of the error message.  Add the
>> >> required
>> >> instance as part of the predicate:
>> >> instance (Monad m, MonadPlus m, Monoid (m a)) => Monoid (Stream m a)
>> >> where
>> >> ...
>> >
>> > ___
>> > Haskell-Cafe mailing list
>> > Haskell-Cafe@haskell.org
>> > http://www.haskell.org/mailman/listinfo/haskell-cafe
>> >
>> >
>>
>> ___
>> Haskell-Cafe mailing list
>> Haskell-Cafe@haskell.org
>> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Can't figure out cmap in hmatrix

2011-05-31 Thread Alberto Ruiz

On 05/30/2011 10:33 PM, Carter Schonwald wrote:

this is actually a bug in the type of cmap, a fix is due in the next
release (at least thats what Alberto indicated to me when I asked about
this a monthish ago) (note how you have the container type c e, but we
want c a and c b ). Instead  use the vector map or matrix map ops directly

cheers
-Carter schonwald


I have just uploaded to Hackage the bug-fixed version.

Thanks,
Alberto


On Mon, May 30, 2011 at 3:27 PM, Mats Klingberg mailto:maklingb...@gmail.com>> wrote:

Hello,

I'm playing around a bit with the hmatrix package
(http://hackage.haskell.org/package/hmatrix) but can't quite figure
out how to make the cmap function in Numeric.Container work.

An example:

ghci> import Numeric.LinearAlgebra
ghci> let v = fromList [1.0,2.0,3.0]
ghci> v
fromList [1.0,2.0,3.0] :: Data.Vector.Storable.Vector
ghci> :t v
v :: Vector Double
ghci> cmap sqrt v

:1:1:
No instance for (Container Vector e0)
  arising from a use of `cmap'
Possible fix: add an instance declaration for (Container Vector e0)
In the expression: cmap sqrt v
In an equation for `it': it = cmap sqrt v

ghci> :t cmap
cmap
  :: (Container c e, Element b, Element a) => (a -> b) -> c a -> c b

There is an instance for (Container Vector Double) but I assume that
since the signature of cmap doesn't mention the type variable 'e'
GHCi can't infer it. Googling hasn't helped me so far, except for
digging up another post to this list with the same (?) problem, but
no answer:
http://www.haskell.org/pipermail/haskell-cafe/2011-April/091390.html

Is there a way to tell GHC what instance to use, or how should cmap
be used?

Thanks!
Mats



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org 
http://www.haskell.org/mailman/listinfo/haskell-cafe




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub class and model expansion

2011-05-31 Thread Patrick Browne
Continuing the thread on model expansion.
I have changed the example trying to focus on expanding models of M in G
Why is the operation ! ok or RHS but not visible on LHS of G?
The equation itself does not seem to suffer from the dependent type
problem of my previous post.

class M a where
 (!)  ::  a -> a -> a
 e :: a


class M a => G a where
 (!-)  ::  a -> a -> a
-- OK in G
 a !- e = e ! a

-- Switching LHS/RHS is not OK in G but fine in S
--  if I declare both ! and !- in S
-- ! is not a (visible) method of class `G'
-- a ! e = e !- a


Thanks,
Pat







On 30/05/2011 00:47, Brandon Moore wrote:
>> From: Patrick Brown; Sent: Sunday, May 29, 2011 5:42 PM
> 
>> Subject: Re: [Haskell-cafe] Sub class and model expansion
>>
>> Ryan,
>> Thank you for your helpful reply.
>> I have no real understanding of dependent types (DT)
>> From the web is seems that DTs are types that depend on *values*.
>> How does the concept of DT relate to the equation from my example?
>>
  -- inverse a ! a = e
>>
>> What type is depending on what value?
> 
> You want part of the group interface to be a proof
> that your inverse function agrees property with the
> monoid operation and unit.
> 
> If that's going to be a value, it has to have some type.
> If that type is anything like "Proof that inverse a ! a = e",
> then the type mentions, or "depends on" the values
> inverse, (!), and e.
> 
> You can see exactly this in the Agda standard library,
> in the definitions IsMonoid and IsGroup in
> Algebra.Structures, which define records containing
> proofs that a set of operations actually forms a monoid
> or group.
> 
> 
> You could probably avoid the apparent circularity
> of full dependent types if you split up a language
> into values which can have types, and a separate
> pair of propositions and proofs which can refer to
> values and types, but not to themselves.
> 
> I think that's basically the organization of any system
> where you use a separate prover to prove things about
> programs in some previously-existing programming
> language, but I haven't stumbled across any examples
> where that sort of organization was designed into
> a single language from the start.
> 
> 
> Brandon
> 


This message has been scanned for content and viruses by the DIT Information 
Services E-Mail Scanning Service, and is believed to be clean. http://www.dit.ie

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] PhD studentship in Nottingham

2011-05-31 Thread Graham Hutton
Dear all,

I are currently advertising a PhD studentship in functional
programming -- the advert is copied below, and attached.  If
you know of any good candidates who many be interested in this,
or there is a local mailing list for advertising such things,
I would be much obliged if you could pass on the advert.

Best wishes,

Graham

+--+

 PhD Studentship in Functional Programming

School of Computer Science
   University of Nottingham, UK

Applications are invited for a fully-funded PhD studentship in
the Functional Programming Lab in the School of Computer Science
at the University of Nottingham, starting on 1st October 2011,
under the supervision of Prof Graham Hutton.

The suggested topic for the studentship is to develop a practical
implementation of the worker/wrapper transformation, a simple but
powerful unifying paradigm for program optimisation, but other
ideas for possible topics are welcome too.  The studentship is
for three and a half years, and includes a maintenance grant
of 13,590 UK pounds per year, and tuition fees.

Applicants for the position will require a first-class Honours
degree (or equivalent) in Computer Science and/or Mathematics,
some experience in functional programming, and an aptitude for
mathematical subjects.Additional desirable attributes include
a Masters degree, and/or some experience in programming language
semantics, implementation and optimisation.  Due to the nature
of the funding, the position is only open to UK citizens, or EU
citizens who have resided in the UK for the last three years.

The successful applicant will work under the supervision of Prof
Graham Hutton in the FP Lab in Nottingham, a leading centre for
research on functional programming.  The group currently comprises
5 academic staff, 1 research fellow, and 10 PhD students.

In order to apply, please submit the following to Prof Graham
Hutton (graham.hut...@nottingham.ac.uk) by 24th June 2011: an
up-to-date copy of your CV (including the results of all your
University examinations to date) along with a brief covering
letter that describes your experience in functional programming,
your reasons for wishing to pursue a PhD in this area, and any
ideas you have regarding possible research directions.

Closing date for applications: 24th June 2011.

+--+


--  
Graham Hutton
Functional Programming Lab
School of Computer Science
University of Nottingham, UK
http://www.cs.nott.ac.uk/~gmh


This message and any attachment are intended solely for the addressee and may 
contain confidential information. If you have received this message in error, 
please send it back to me, and immediately delete it.   Please do not use, copy 
or disclose the information contained in this message or in any attachment.  
Any views or opinions expressed by the author of this email do not necessarily 
reflect the views of the University of Nottingham.

This message has been checked for viruses but the contents of an attachment
may still contain software viruses which could damage your computer system:
you are advised to perform your own checks. Email communications with the
University of Nottingham may be monitored as permitted by UK legislation.

phd-advert.txt
Description: phd-advert.txt
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANN: quickcheck-properties

2011-05-31 Thread Mario Blažević

On 11-05-30 05:05 AM, Alexey Khudyakov wrote:

On 30.05.2011 12:26, Bas van Dijk wrote:

On 30 May 2011 00:14, Alexey Khudyakov wrote:

It always puzzled me why there are no packages for for testing general
type classes laws. (Monoid laws, monad laws etc). It looks like ideal
case for quickcheck and smallcheck.


How about 'checkers' by Conal Elliott:
http://hackage.haskell.org/package/checkers


We really need better search on hackage than C-f in browser. I didn't
find them. Thank you for pointers.


	When I needed the very same thing a few months ago, I discovered 
checkers by using the reverse dependencies list for QuickCheck:



http://bifunctor.homelinux.net/~roel/cgi-bin/hackage-scripts/revdeps/QuickCheck-2.4.1.1#direct

	That helped a lot, though finding checkers in the list still wasn't a 
breeze.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Aleksandar Dimitrov
Dear Cafe,

(Excuse the probably very ranty email; I am, unfortunately, at the end of my
wits, and I hope that as fellow programmers, you will understand that this is
among the most dreadful situations for our kind to be in.)

Say, we have an input file that contains a word per line. I want to find all
unigrams (unique words) in that file, and associate with them the amount of
times they occurred in the file. This would allow me, for example, to make a
list of word frequencies in a given text.

Simple enough task. Here's an implementation using iteratees (lazy IO is evil)
and unordered-containers' Data.HashMap.Strict, which enforces WHNF in values
and keys:

> import qualified Data.ByteString.Char8 as S
> import qualified Data.Iteratee as I
> import Data.Iteratee.IO
> 
> import qualified Data.HashMap.Strict as T
> 
> import Data.Iteratee.Char
> import Data.List (foldl')
> import Data.Char (toLower)
> 
> import Data.Ord (comparing)
> import Data.List (sortBy)
> import System.Environment (getArgs)
> 
> type Wordcounts = T.HashMap S.ByteString Int
> 
> f' :: Monad m => I.Iteratee S.ByteString m Wordcounts
> f' = I.joinI $ enumLinesBS (I.liftI $ step T.empty)
> where step t (I.Chunk str) = I.liftI (step $! foldl' maybeIncrement t str)
>   step t stream = I.idone t stream
>   maybeIncrement t s
>   | s == S.empty = t
>   | otherwise= {-# SCC "m-I-other" #-} T.insertWith (+) s 1 t
> 
> main :: IO ()
> main = getArgs >>= fileDriverVBuf 65536 f'.head >>= print.prettyList
> where prettyList = -- sortBy (comparing snd) . T.toList
>T.keys

Some lines are empty, and I don't want them to be recorded, so that's why
maybeIncrement is necessary.
hpaste of this code: http://hpaste.org/47300/spaceleak (ignore convert, that's
yet another issue.)

Now, here's some observations: on a 75M input file (minuscule, compared to what
I actually need) this program will eat 30M of heap space (says profiling) and
return in 14 secs.

I have two problems with that: a) that's too much heap space, b) the actual 
memory
residency is *much* worse.

ad b) average memory residency is at 38MB (this is OK, given heap consumption)
but max residency is at 130MB, which is unacceptable to me (remember that I need
to run this on files *much* bigger than just 75M.)

<>

In fact, htop reports total memory residency of the program at around 320 MB at
the end of its life-cycle (compiled and ran without profiling options.) I tried
running this program on a 1.4GB file, and had to kill it when at 3.5GB memory
consumption, my computer started paging. The actual hash-map, however, shouldn't
be much bigger than in the 75MB case (I expect about twice the size,) since the
input is natural language. I redirected the output of the program (showing a
list of assoc pairs that were in the hash map) to a file, and that file measured
11MB in the case of a 75MB corpus, and 40MB when I ran the program on a 1.4GB
corpus (I had to use a machine with 24GB of RAM to be able to run this.)

ad a) heap consumption is too high for two reasons: firstly, the actual data I
care about is much less than there's data on the heap. Secondly, about half the
heap space is in LAG state. Here are profiles that will illustrate this:
http://imgur.com/wBWmJ&XN1mW

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Aleksandar Dimitrov
On Tue, May 31, 2011 at 06:10:00PM +0200, Aleksandar Dimitrov wrote:
> ad a) heap consumption is too high for two reasons: firstly, the actual data I
> care about is much less than there's data on the heap. Secondly, about half 
> the
> heap space is in LAG state. Here are profiles that will illustrate this:
> http://imgur.com/wBWmJ&XN1mW

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread malcolm.wallace
ad a) heap consumption is too high for two reasons: firstly, the actual data I
care about is much less than there's data on the heap. Secondly, about half the
heap space is in LAG state. Here are profiles that will illustrate this:
http://imgur.com/wBWmJ&XN1mW

Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Aleksandar Dimitrov
> In Lag/Drag/Void/Use profiling, Lag is actually heap cells that are created
> too _early_.  (Drag are those that are kept for longer than necessary.)  Lots
> of Lag generally means your program is too strict - it is forcing structure
> long before it needs to.  To fix it, you need to make things lazier.  My first
> suspicion would fall on ByteString.

Indeed, thank you, I mixed those up. I cannot use lazy byte strings here,
because of the way Data.Iteratee.Char's enumLinesBS works (it takes strict
byte strings.)

The only other strictness in there is ($!) and foldl'. The latter is necessary
for the program to even run (i.e. not run out of stack space.)

The strict application in step's argument seems necessary, since without it, the
program consumes 1200 MB of RAM (on my 75MB test data,) and takes very very 
long.
The hb profile indicates that a lot of data is allocated up front, and then
gradually eliminated. Interestingly, removing ($!) here seemed to *introduce*
unnecessary strictness. Here's the hb profile without ($!):
http://imgur.com/Ex7Pd I don't understand what is happening here :-\ I only just
started using iteratees.

Regards,
Aleks


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Johan Tibell
Hi Aleksandar,

On Tue, May 31, 2011 at 6:10 PM, Aleksandar Dimitrov
 wrote:
> Say, we have an input file that contains a word per line. I want to find all
> unigrams (unique words) in that file, and associate with them the amount of
> times they occurred in the file. This would allow me, for example, to make a
> list of word frequencies in a given text.

Here's how I would do it:

{-# LANGUAGE BangPatterns #-}
module Ngram (countUnigrams) where

import qualified Data.ByteString as S
import qualified Data.HashMap.Strict as M
import System.IO

foldLines :: (a -> S.ByteString -> a) -> a -> Handle -> IO a
foldLines f z0 h = go z0
  where
go !z = do
eof <- hIsEOF h
if eof
then return z
else do
line <- S.hGetLine h
go $ f z line
{-# INLINE foldLines #-}

-- Example use
countUnigrams :: IO (M.HashMap S.ByteString Int)
countUnigrams = foldLines (\ m s -> M.insertWith (+) s 1 m) M.empty stdin

> RANT
>
> I have tried and tried again to avoid writing programs in Haskell that would
> leak space like BP likes to leak oil. However, I have yet to produce a single
> instance of a program that would do anything at all and at the same time 
> consume
> less memory than there is actual data in the input file.
>
> It is very disconcerting to me that I seem to be unable, even after quite some
> practice, to identify space leaks in trivial programs like the above. I know 
> of
> no good resource to educate myself in that regard. I have read the GHC manual,
> RWH's chapter on profiling, also "Inside T5"'s recent series on the Haskell
> heap, but no dice. Even if I can clearly see the exact line where at least 
> some
> of the leaking happens (as I can in this case,) it seems impossible for me to
> prevent it.
>
> *thank you very much* for reading this far. This is probably a mostly useless
> email anyhow, I just had to get it off my chest. Maybe, just maybe, someone
> among you will have a crucial insight that will save Haskell for me :-) But
> currently, I see no justification to not start my next project in Lua, Python 
> or
> Java. Sure, Haskell's code is pretty, and it's fun, but if I can't actually
> *run* it, why bother?  (Yes, this isn't the first time I've ran into this
> problem …)

We definitely need more accessible material on how to reliably write
fast Haskell code. There are those among us who can, but it shouldn't
be necessary to learn it in the way they did (i.e. by lots of
tinkering, learning from the elders, etc). I'd like to write a 60 (or
so) pages tutorial on the subject, but haven't found the time.

In addition to RWH, perhaps the slides from the talk on
high-performance Haskell I gave could be useful:


http://blog.johantibell.com/2010/09/slides-from-my-high-performance-haskell.html

Cheers,
Johan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Sub class and model expansion

2011-05-31 Thread Ryan Ingram
On Tue, May 31, 2011 at 1:40 AM, Patrick Browne wrote:

> Continuing the thread on model expansion.
> I have changed the example trying to focus on expanding models of M in G
> Why is the operation ! ok or RHS but not visible on LHS of G?
> The equation itself does not seem to suffer from the dependent type
> problem of my previous post.
>
> class M a where
>  (!)  ::  a -> a -> a
>  e :: a
>
>
> class M a => G a where
>  (!-)  ::  a -> a -> a
> -- OK in G
>  a !- e = e ! a
>

This doesn't do what you think.  This is equivalent to

(!-) = (\a e -> e ! a)

That is, "e" is lambda-bound here, not the same "e" from M.  In this case
you've defined (!-) as "flip (!)"

When you define functions in a class declaration, you are just defining a
default implementation.  Generally this is used when you can implement some
functions in terms of the others, such as:

class Eq a where
   (==) :: a -> a -> a
   (/=) :: a -> a -> a

   a == b = not (a /= b)
   a /= b = not (a == b)

Now someone making an instance of this class need only define (==) or (/=);
the other one will be defined via the default instance.  (This has the
somewhat undesirable property that 'instance Eq X where' with no methods is
a valid instance declaration but in that instance == and /= are infinite
loops)

Maybe this will help you think about this: what code do you expect to write
for instances of this class?

instance M Int where
  e = 0
  (!) = +

instance G Int where
  -- do I need to write any code here?

It seems to me you are expecting the compiler to automatically derive the
definition 'inverse = negate'; there's no general way for the compiler to do
so, since it doesn't know the structure of your type.

Functions in Haskell, unlike, say, Prolog, only go from left of the = to the
right of the =.  Thanks to referential transparency, you can go backwards
from the point of view of proving properties about your code, but during
evaluation term rewriting only happens from left to right.

  -- ryan
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Edward Z. Yang
Hello Aleksandar,

It is possible that the iteratees library is space leaking; I recall some
recent discussion to this effect.  Your example seems simple enough that
you might recompile with a version of iteratees that has -auto-all enabled.
Unfortunately, it's not really a safe bet to assume your libraries are
leak free, and if you've pinpointed it down to a single line, and there
doesn't seem a way to squash the leak, I'd bet it's the library's fault.

Edward

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] *GROUP HUG*

2011-05-31 Thread Ozgur Akgun
Evan,

On 24 May 2011 19:57, Evan Laforge  wrote:

> On the catMaybes thing, I have a function 'mapMaybe = Maybe.catMaybes
> . map'.  I turns out I only ever used catMaybes after mapping a Maybe
> function, so I hardly ever use catMaybes anymore.  I suppose it should
> have been maybeMap for consistency with concatMap.
>

Just wanted to point out, that function is already defined in Data.Maybe:

http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-Maybe.html#v:mapMaybe

Hugs :)

Ozgur
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] *GROUP HUG*

2011-05-31 Thread Evan Laforge
On Tue, May 31, 2011 at 11:31 AM, Ozgur Akgun  wrote:
> Evan,
> On 24 May 2011 19:57, Evan Laforge  wrote:
>>
>> On the catMaybes thing, I have a function 'mapMaybe = Maybe.catMaybes
>> . map'.  I turns out I only ever used catMaybes after mapping a Maybe
>> function, so I hardly ever use catMaybes anymore.  I suppose it should
>> have been maybeMap for consistency with concatMap.
>
> Just wanted to point out, that function is already defined in Data.Maybe:
> http://hackage.haskell.org/packages/archive/base/latest/doc/html/Data-Maybe.html#v:mapMaybe
> Hugs :)

Indeed it is, somehow I missed that one and wrote my own instead.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Brandon Moore
I can't reproduce heap usage growing with the
size of the input file.

I made a word list from Project Gutenberg's
copy of "War and Peace" by

tr -sc '[[:alpha:]]' '\n' < pg2600.txt > words.txt

Using 1, 25, or 1000 repetitions of this ~3MB wordlist
shows about 100MB of address space used according
to top, and no more than 5MB or so of haskell heap
used according to the memory profile, with a flat
memory profile.


Is your memory usage growing with the size of the input
file, or the size of the histogram?

I was worried data sharing might mean your keys
retain entire 64K chunks of the input. However, it
seems enumLines depends on the StringLike ByteString
instance, which just converts to and from String.
That can't be efficient, but I suppose it avoids excessive sharing.

The other thing that occurs to me is that the total size of
your keys would also be approximately the size of the input
file if you were using plain text without each word split onto
a separate line.

Brandon

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Brandon Moore
Wait, do ByteStrings show up on a heap profile, if the space is
allocated with malloc?

Anyway, I think my tests still show that the memory used by the
process doesn't grow simply by adding more data, if you
are no longer added keys to the map.



- Original Message -
> From: Brandon Moore 
> To: Aleksandar Dimitrov ; 
> "haskell-cafe@haskell.org" 
> Cc: 
> Sent: Tuesday, May 31, 2011 1:43 PM
> Subject: Re: [Haskell-cafe] How on Earth Do You Reason about Space?
> 
> I can't reproduce heap usage growing with the
> size of the input file.
> 
> I made a word list from Project Gutenberg's
> copy of "War and Peace" by
> 
> tr -sc '[[:alpha:]]' '\n' < pg2600.txt > words.txt
> 
> Using 1, 25, or 1000 repetitions of this ~3MB wordlist
> shows about 100MB of address space used according
> to top, and no more than 5MB or so of haskell heap
> used according to the memory profile, with a flat
> memory profile.
> 
> 
> Is your memory usage growing with the size of the input
> file, or the size of the histogram?
> 
> I was worried data sharing might mean your keys
> retain entire 64K chunks of the input. However, it
> seems enumLines depends on the StringLike ByteString
> instance, which just converts to and from String.
> That can't be efficient, but I suppose it avoids excessive sharing.
> 
> The other thing that occurs to me is that the total size of
> your keys would also be approximately the size of the input
> file if you were using plain text without each word split onto
> a separate line.
> 
> Brandon
>

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Thoughts about currying and optimisations

2011-05-31 Thread Yves Parès
Hello Café,
An idea came to me: unless the compiler notices that stuffA and stuffB are
equivalent, would it be correct to suppose that A is better than B?

stuffA x = if someComputationallyExpensiveTest x
then doSomething else doSomethingElse

stuffB x y = if someComputationallyExpensiveTest x
then doSomething y else doSomethingElse y

I explain: in stuffA, the function only depends on x, so when doing:
a = stuffA xxx
runs the expensive test once and for all, and a can directly be bound to
doSomething or doSomethingElse
so calling after:
a foo
a bar
won't run the test

Whereas:
b = stuffB xxx
is valid due to curryfication, but:
b foo
b bar
will both run the expensive test
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] *GROUP HUG*

2011-05-31 Thread Alberto G. Corona
fluency in Scala is an industry asset, since it runs in the Java VM,
while Haskell is an academic asset as well as a fun asset.

The value of  an industry asset grows with the  lack of competence of
others. Therefore competing guys are not welcome. There are enoug
crocodiles  in the pond.

Alberto.

2011/5/24 Ketil Malde :
> Juan Daugherty  writes:
>
>> Every computing culture is different.
>
> Definitely.  I've just given up on several "cultures" which are just too
> vitriolic.  Scala is on my list of interesting stuff to look at some
> day, but if I'm going to be flamed for asking questions about the
> language, I can easily find something else to fill my time with.
>
>> Being in the habit of asking questions you should be able to answer yourself
>> is not a good idea.
>
> Maybe not.
>
> Is it a good idea to flame people who do this?
>
> If you look at two case studies: the Scala thread where Gregory was
> involved, and the recent mail here, basically asking for 'catMaybes'.
>
> Replying "whoosh", or otherwise producing non-informative, arrogant
> feedback results in a long thread, culminating in a more useful answer,
> as well as a lot of noise.
>
> Replying with a pointer to 'catMaybes' resulted in (most likely) the
> author going off to finish/improve his program, and some more
> interesting discussion on alternative ways to do this.
>
> The point is that at face value, being rude and arrogant may drive away
> naive questions, but is much more likely to result in endless threads of
> discussions of etiquette, usually laced with ample amounts of
> hostility.  This actually decreases signal to noise.
>
> Also it not only drives away the naive questions, it drives away the
> people asking them.  People who might at some point become informed,
> contributing members of the community.
>
> I don't know, maybe Scala is big enough that they can afford to behave
> that way.  Some people quit haskell-cafe for other (better policed?)
> forums, so perhaps we are too liberal?  I hope not.
>
>> Although Haskell comm. is necessarily welcoming due to the learning
>> curve and lack of popular adoption there are limits here too.
>
> My theory is that flaming cultures arise around people who are
> technically brilliant, but somewhat lacking socially, either through
> arrogance or ineptitude.  Members of communities where such people
> become central respect them for their brilliance, and then emulate, echo
> or support them.  (I guess the implicit rationale is that if the smart
> people are assholes, being an asshole will make people - or myself -
> think I'm smart, too).
>
> Why have we managed to avoid this?  Partly because of the heavy academic
> slant, usually academia tends to reserved politeness.  Also, there's a
> lot of theory floating around, so although I might get impatient with
> some people, I can't really grow arrogant, since there's so *much* I'm
> obviously completely ignorant at.
>
> (Sorry about the long off-topic rant.)
>
> -k
> --
> If I haven't seen further, it is by standing in the footprints of giants
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Thoughts about currying and optimisations

2011-05-31 Thread Edward Z. Yang
I believe this transformation is called the 'full laziness' optimization.
It can introduce space leaks if the computationally expensive test is
replaced with a reference to a space expensive value.

Edward

Excerpts from Yves Parès's message of Tue May 31 15:14:07 -0400 2011:
> Hello Café,
> An idea came to me: unless the compiler notices that stuffA and stuffB are
> equivalent, would it be correct to suppose that A is better than B?
> 
> stuffA x = if someComputationallyExpensiveTest x
> then doSomething else doSomethingElse
> 
> stuffB x y = if someComputationallyExpensiveTest x
> then doSomething y else doSomethingElse y
> 
> I explain: in stuffA, the function only depends on x, so when doing:
> a = stuffA xxx
> runs the expensive test once and for all, and a can directly be bound to
> doSomething or doSomethingElse
> so calling after:
> a foo
> a bar
> won't run the test
> 
> Whereas:
> b = stuffB xxx
> is valid due to curryfication, but:
> b foo
> b bar
> will both run the expensive test

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Can't figure out cmap in hmatrix

2011-05-31 Thread Mats Klingberg

31 maj 2011 kl. 09.59 Alberto Ruiz wrote:
> I have just uploaded to Hackage the bug-fixed version.

That works fine. Thanks for a nice package!
Mats

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] *GROUP HUG*

2011-05-31 Thread Adrien Haxaire

Le 31/05/2011 21:15, Alberto G. Corona a écrit :

Haskell is an academic asset as well as a fun asset.


I fully agree. These are two of the three reasons which made me choose 
haskell as the functional language to learn. Coding fortran all day, I 
wanted a new approach on programming. The strong scientific roots of 
haskell would give me stuff to learn and discover for a lot of time. The 
atmosphere/halo around haskell was intriguing too: "come on! it's fun! i 
can write foldr with foldl!" is not the kind of enthusiasm I was used too :)


The third reason is, as you already now, the community. Never have seen 
so much encouragements, help, time, humility, jokes,... crossing the gap 
takes some time, but when you feel that lots of people are glad you're 
here, it's just constant joy.


group hug !

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Scott Lawrence
I was under the impression that operations performed in monads (in this
case, the IO monad) were lazy. (Certainly, every time I make the
opposite assumption, my code fails :P .) Which doesn't explain why the
following code fails to terminate:

  iRecurse :: (Num a) => IO a
  iRecurse = do
recurse <- iRecurse
return 1

  main = (putStrLn . show) =<< iRecurse

Any pointers to a good explanation of when the IO monad is lazy?


=== The long story ===

I wrote a function unfold with type signature (([a] -> a) -> [a]), for
generating a list in which each element can be calculated from all of
the previous elements.

  unfold :: ([a] -> a) -> [a]
  unfold f = unfold1 f []

  unfold1 :: ([a] -> a) -> [a] -> [a]
  unfold1 f l = f l : unfold1 f (f l : l)

Now I'm attempting to do the same thing, except where f returns a monad.
(My use case is when f randomly selects the next element, i.e. text
generation from markov chains.) So I want

  unfoldM1 :: (Monad m) => ([a] -> m a) -> [a] -> m [a]

My instinct, then, would be to do something like:

  unfoldM1 f l = do
next <- f l
rest <- unfoldM1 f (next : l)
return (next : rest)

But that, like iRecurse above, doesn't work.



signature.asc
Description: OpenPGP digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Artyom Kazak

Suppose iRecurse looks like this:
  iRecurse = do
x <- launchMissiles
r <- iRecurse
return 1

As x is never needed, launchMissiles will never execute. It obviously is  
not what is needed.


But in Haskell, standart file input|output is often lazy. It's a  
combination of buffering and special tricks, not the usual rule.


Scott Lawrence  писал(а) в своём письме Tue, 31 May 2011  
22:49:02 +0300:



I was under the impression that operations performed in monads (in this
case, the IO monad) were lazy. (Certainly, every time I make the
opposite assumption, my code fails :P .) Which doesn't explain why the
following code fails to terminate:

  iRecurse :: (Num a) => IO a
  iRecurse = do
recurse <- iRecurse
return 1

  main = (putStrLn . show) =<< iRecurse

Any pointers to a good explanation of when the IO monad is lazy?


=== The long story ===

I wrote a function unfold with type signature (([a] -> a) -> [a]), for
generating a list in which each element can be calculated from all of
the previous elements.

  unfold :: ([a] -> a) -> [a]
  unfold f = unfold1 f []

  unfold1 :: ([a] -> a) -> [a] -> [a]
  unfold1 f l = f l : unfold1 f (f l : l)

Now I'm attempting to do the same thing, except where f returns a monad.
(My use case is when f randomly selects the next element, i.e. text
generation from markov chains.) So I want

  unfoldM1 :: (Monad m) => ([a] -> m a) -> [a] -> m [a]

My instinct, then, would be to do something like:

  unfoldM1 f l = do
next <- f l
rest <- unfoldM1 f (next : l)
return (next : rest)

But that, like iRecurse above, doesn't work.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Scott Lawrence
On 05/31/2011 04:20 PM, Artyom Kazak wrote:
> Suppose iRecurse looks like this:
>   iRecurse = do
> x <- launchMissiles
> r <- iRecurse
> return 1
> 
> As x is never needed, launchMissiles will never execute. It obviously is
> not what is needed.

Prelude> let launchMissiles = putStrLn "UH OH" >> return 1
Prelude> let iRecurse = launchMissiles >> return 1
Prelude> iRecurse
UH OH
1
Prelude>

Looks like launchMissiles /does/ execute, even though x is (obviously)
never needed.




signature.asc
Description: OpenPGP digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] *GROUP HUG*

2011-05-31 Thread Yves Parès
> "come on! it's fun! i can write foldr with foldl!"

And when you try to explain that to your java-ITC-formatted friends, they
utterly surprisingly seem not to care about it ^^


2011/5/31 Adrien Haxaire 

> Le 31/05/2011 21:15, Alberto G. Corona a écrit :
>
>  Haskell is an academic asset as well as a fun asset.
>>
>
> I fully agree. These are two of the three reasons which made me choose
> haskell as the functional language to learn. Coding fortran all day, I
> wanted a new approach on programming. The strong scientific roots of haskell
> would give me stuff to learn and discover for a lot of time. The
> atmosphere/halo around haskell was intriguing too: "come on! it's fun! i can
> write foldr with foldl!" is not the kind of enthusiasm I was used too :)
>
> The third reason is, as you already now, the community. Never have seen so
> much encouragements, help, time, humility, jokes,... crossing the gap takes
> some time, but when you feel that lots of people are glad you're here, it's
> just constant joy.
>
> group hug !
>
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Anthony Cowley
On Tue, May 31, 2011 at 3:49 PM, Scott Lawrence  wrote:
> I was under the impression that operations performed in monads (in this
> case, the IO monad) were lazy. (Certainly, every time I make the
> opposite assumption, my code fails :P .) Which doesn't explain why the
> following code fails to terminate:
>
>  iRecurse :: (Num a) => IO a
>  iRecurse = do
>    recurse <- iRecurse
>    return 1
>
>  main = (putStrLn . show) =<< iRecurse
>
> Any pointers to a good explanation of when the IO monad is lazy?

import System.IO.Unsafe

iRecurse :: (Num a) => IO a
iRecurse = do
  recurse <- unsafeInterleaveIO iRecurse
  return 1

More interesting variations of this leave you with questions of
whether or not the missles were launched, or, worse yet, was data
actually read from the file handle?

Anthony

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Yves Parès
No, I think Artyom meant "assuming IO is lazy".
He intended to show that, indeed, it is not, or else side-effects would
never be performed


2011/5/31 Scott Lawrence 

> On 05/31/2011 04:20 PM, Artyom Kazak wrote:
> > Suppose iRecurse looks like this:
> >   iRecurse = do
> > x <- launchMissiles
> > r <- iRecurse
> > return 1
> >
> > As x is never needed, launchMissiles will never execute. It obviously is
> > not what is needed.
>
> Prelude> let launchMissiles = putStrLn "UH OH" >> return 1
> Prelude> let iRecurse = launchMissiles >> return 1
> Prelude> iRecurse
> UH OH
> 1
> Prelude>
>
> Looks like launchMissiles /does/ execute, even though x is (obviously)
> never needed.
>
>
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Thoughts about currying and optimisations

2011-05-31 Thread Yves Parès
> It can introduce space leaks if the computationally expensive test is
replaced with a reference to a space expensive value.

You mean if the rest of stuffX's body keeps a reference to that value, I
suppose? (I suppose, or else that value would be useless).
Ok, so GHC does detect that case and optimizes it.


2011/5/31 Edward Z. Yang 

> I believe this transformation is called the 'full laziness' optimization.
> It can introduce space leaks if the computationally expensive test is
> replaced with a reference to a space expensive value.
>
> Edward
>
> Excerpts from Yves Parès's message of Tue May 31 15:14:07 -0400 2011:
> > Hello Café,
> > An idea came to me: unless the compiler notices that stuffA and stuffB
> are
> > equivalent, would it be correct to suppose that A is better than B?
> >
> > stuffA x = if someComputationallyExpensiveTest x
> > then doSomething else doSomethingElse
> >
> > stuffB x y = if someComputationallyExpensiveTest x
> > then doSomething y else doSomethingElse y
> >
> > I explain: in stuffA, the function only depends on x, so when doing:
> > a = stuffA xxx
> > runs the expensive test once and for all, and a can directly be bound to
> > doSomething or doSomethingElse
> > so calling after:
> > a foo
> > a bar
> > won't run the test
> >
> > Whereas:
> > b = stuffB xxx
> > is valid due to curryfication, but:
> > b foo
> > b bar
> > will both run the expensive test
>
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Decimal type-level arithmetic.

2011-05-31 Thread Serguey Zefirov
I would like to present my version of type arithmetic with decimal
encoding: http://thesz.mskhug.ru/svn/hhdl/TyleA.hs

It is not worth Cabal package in its current state, but I hope it
would be useful for someone.

It is easy to use, just say Plus (D1 :. D2 :. D0) D8 to get a type of
128. Or you can say $(tySize 128) if you're not afraid to  use
Template Haskell.

I tested it on my current project, it works quite well with numbers
around 128..512, about 10-20 times faster than Peano numbers.

As operations over this representation are not lazy (Peano numbers
oeprations are lazy for one argument of Plus and for part of one
argument of Max), it is not well suited for big number of operations
done at once.

I encountered it when I autogenerated a big arithmetic expression for
big polymorphic data type. I think it is quite rare situation.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Artyom Kazak
Scott Lawrence  писал(а) в своём письме Tue, 31 May 2011  
23:29:49 +0300:



On 05/31/2011 04:20 PM, Artyom Kazak wrote:

Suppose iRecurse looks like this:
  iRecurse = do
x <- launchMissiles
r <- iRecurse
return 1

As x is never needed, launchMissiles will never execute. It obviously is
not what is needed.

Prelude> let launchMissiles = putStrLn "UH OH" >> return 1
Prelude> let iRecurse = launchMissiles >> return 1
Prelude> iRecurse
UH OH
1
Prelude>
Looks like launchMissiles /does/ execute, even though x is (obviously)
never needed.


Oh, sorry. I was unclear. I have meant "assuming IO is lazy", as Yves  
wrote.


And saying "some hacks" I meant unsafeInterleaveIO, which lies beneath the  
laziness of, for example, getContents.


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Gregory Crosswhite

On 5/31/11 12:49 PM, Scott Lawrence wrote:

I was under the impression that operations performed in monads (in this
case, the IO monad) were lazy.


Whether they are lazy or not depends entirely on the definition of the 
monad.  For example, if you look up the ST and State monads you will 
find that they come in strict and lazy flavors.


As a general rule, operations in the IO monad are strict except for very 
special cases which are explicitly labeled as such, e.g. 
unsafeInterleaveIO, lazyIO, etc.


FYI, in GHC the definition of IO is at


http://www.haskell.org/ghc/docs/7.0.3/html/libraries/ghc-prim-0.2.0.0/src/GHC-Types.html#IO


You can tell it is strict because the result of the map is an unboxed 
tuple, which is strict (at least, if I understand correctly :-) ).


If you are curious, State# and RealWorld are defined here:


http://www.haskell.org/ghc/docs/7.0.3/html/libraries/ghc-prim-0.2.0.0/src/GHC-Prim.html#State.


State# and RealWorld do not contain data constructors because they are 
not intended to contain data but rather to parametrize types --- that is 
to say, you can think of IO as being a special case of the strict ST 
transformer which uses a special type tag to keep different ST threads 
separate (even though this type is never instantiated), and in the case 
of IO the state tag is RealWorld.


So in short, monads need not be strict but often are, and in particular 
IO is designed to be strict because it is essentially just a special 
case of the strict ST monad.


Cheers,
Greg

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Scott Lawrence
On 05/31/2011 04:48 PM, Artyom Kazak wrote:
> 
> Oh, sorry. I was unclear. I have meant "assuming IO is lazy", as Yves
> wrote.

Ah, ok. That makes more sense.

> 
> And saying "some hacks" I meant unsafeInterleaveIO, which lies beneath
> the laziness of, for example, getContents.

Which explains why assuming getContents is strict has never worked for me.

I'm trying to implement unfoldM1 without using unsafeIO, if possible. Since

  unfoldM1 f l = do
next <- f l
rest <- unfoldM1 f (next : l)
return (next : rest)

obviously won't work, I've been trying to use fmap

  unfoldM1 :: (Functor m, Monad m) => ([a] -> m a) -> [a] -> m [a]
  unfoldM1 f l = do
next <- f l
fmap (next :) $ unfoldM1 f (next : l)

Evaluation here also doesn't terminate (or, (head $ unfoldM (return .
head)) doesn't), although I can't figure out why. fmap shouldn't need to
fully evaluate a list to prepend an element, right?



signature.asc
Description: OpenPGP digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Daniel Fischer
On Tuesday 31 May 2011 22:35:26, Yves Parès wrote:
> He intended to show that, indeed, it is not, or else side-effects would
> never be performed

On the other hand, IO is lazy in the values it produces.
Going with the IO a = State RealWorld a fiction, IO is state-strict but 
value-lazy. The side-effects affect the state, hence are performed, the 
values are only evaluated to the extent required to determine the state.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Stephen Tetley
2011/5/31 Scott Lawrence :

> Evaluation here also doesn't terminate (or, (head $ unfoldM (return .
> head)) doesn't), although I can't figure out why. fmap shouldn't need to
> fully evaluate a list to prepend an element, right?

I'm afriad fmap doesn't get to choose - if the monad is strict then
both definitions are equivalent (probably...).

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Decimal type-level arithmetic.

2011-05-31 Thread Henning Thielemann


On Wed, 1 Jun 2011, Serguey Zefirov wrote:


I would like to present my version of type arithmetic with decimal
encoding: http://thesz.mskhug.ru/svn/hhdl/TyleA.hs


How does it compare to
  http://hackage.haskell.org/package/type-level
?

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Scott Lawrence
Apparently:

Prelude> let r = (fmap (1:) r) :: IO [Integer]
Prelude> fmap (take 5) r
*** Exception: stack overflow

Thanks - I'll just have to stay out of IO for this, then.

On Tue, May 31, 2011 at 17:05, Stephen Tetley  wrote:
> 2011/5/31 Scott Lawrence :
>
>> Evaluation here also doesn't terminate (or, (head $ unfoldM (return .
>> head)) doesn't), although I can't figure out why. fmap shouldn't need to
>> fully evaluate a list to prepend an element, right?
>
> I'm afriad fmap doesn't get to choose - if the monad is strict then
> both definitions are equivalent (probably...).
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>



-- 
Scott Lawrence

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Decimal type-level arithmetic.

2011-05-31 Thread Serguey Zefirov
2011/6/1 Henning Thielemann :
>
> On Wed, 1 Jun 2011, Serguey Zefirov wrote:
>> I would like to present my version of type arithmetic with decimal
>> encoding: http://thesz.mskhug.ru/svn/hhdl/TyleA.hs
> How does it compare to
>  http://hackage.haskell.org/package/type-level
> ?

My version is slightly simpler to use, I think, because type-level
uses functional dependencies: "Type-level functions are implemented
using functional dependencies of multi parameter type classes."

I use type families.

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Aleksandar Dimitrov
Hi Johan,

> Here's how I would do it:

I implemented your method, with these minimal changes (i.e. just using a main
driver in the same file.)

> countUnigrams :: Handle -> IO (M.Map S.ByteString Int)
> countUnigrams = foldLines (\ m s -> M.insertWith (+) s 1 m) M.empty
> 
> main :: IO ()
> main = do (f:_) <- getArgs
>   openFile f ReadMode >>= countUnigrams >>= print . M.toList

It seems to perform about 3x worse than the iteratee method in terms of time,
and worse in terms of space :-( On Brandon's War & Peace example, hGetLine uses
1.565 seconds for the small file, whereas my iteratee method uses 1.085s for the
small file, and around 2 minutes for the large file.

For the large file, the code above starts consuming around 2.5GB of RAM,
so it clearly has a space leak somewhere. Where, I don't know.

If you want to try it out, here's a short command line to make a test corpus the
way Brandon made one:

+++

wget 'http://www.gutenberg.org/files/2600/2600.zip';
unzip 2600.zip;
touch wnp100.txt;
for i in {1..100}; do echo -n "$i "; cat 2600.txt >> wnp100.txt; done;
echo "Done.

+++

Note, that, as I detailed in my prior email to Brandon, even if you do end up
with a (supposedly) non-leaking program for this example corpus, that doesn't
mean it'll scale well to real world data.

I also tried sprinkling strictness annotations throughout your above code, but I
failed to produce good results :-(

> We definitely need more accessible material on how to reliably write
> fast Haskell code. There are those among us who can, but it shouldn't
> be necessary to learn it in the way they did (i.e. by lots of
> tinkering, learning from the elders, etc). I'd like to write a 60 (or
> so) pages tutorial on the subject, but haven't found the time.

I'd be an eager reader :-) Please do announce it on -cafe or the "usual places"
should you ever come around to writing it!

I, unfortunately, don't really have any contact to "the elders," apart from what
I read on their respective blogs…

> In addition to RWH, perhaps the slides from the talk on
> high-performance Haskell I gave could be useful:
> 
> 
> http://blog.johantibell.com/2010/09/slides-from-my-high-performance-haskell.html

Thanks, I'll give it a look later tomorrow!

Regards,
Aleks

PS: Sorry I didn't answer you in #haskell, I ended up having to go afk for a
short while. Thanks for all your help!


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Aleksandar Dimitrov
On Tue, May 31, 2011 at 11:43:27AM -0700, Brandon Moore wrote:
> I can't reproduce heap usage growing with the
> size of the input file.
> 
> I made a word list from Project Gutenberg's
> copy of "War and Peace" by
> 
> tr -sc '[[:alpha:]]' '\n' < pg2600.txt > words.txt
> 
> Using 1, 25, or 1000 repetitions of this ~3MB wordlist
> shows about 100MB of address space used according
> to top, and no more than 5MB or so of haskell heap
> used according to the memory profile, with a flat
> memory profile.

This will lead to very small variance in your data. The effect I'm experiencing
is so small in this case, that it's barely observable (but it is, see below.)

> Is your memory usage growing with the size of the input
> file, or the size of the histogram?

While the histogram is naturally growing with the size of the input file, memory
usage seems to be proportional mainly to the histogram. It is clear that, due to
the effect of the Long Tail, the histogram is going to constantly grow in a real
setting, as opposed to just replicating the same data.  In your test case, the
histogram is *not* growing with the size of the input file.

The memory usage is proportional to the histogram, which is proportional to the
file size. That is not the problem. The problem is, that, compared to the size
of the histogram, the memory consumption is *inadequately* high. Here's some
more data, using your Tolstoi example:

du file.txt
344Mfile.txt

<>

As you can see, memory residency is at 8 MB avg, 10 MB max. This is the
War&Peace file, replicated 100 times. Let's look at the result for the file
*without* first replicating it 100 times:

du words.txt
3.0Mwords.txt

<>

4.8MB avg, 9.1 MB max. It seems input file size *does* have an effect of sorts,
but it's negligible. What is more interesting is this: the file is a whopping
3MB big. How on earth does the program consume almost 5 MB *on average*? This is
*not* constant memory usage. This is memory usage trivial enough to not be worth
a fuss for small inputs, but unfortunately, it gets larger as soon as you
increase the file size (in a realistic fashion: i.e. you'll also increase the
unigram count.)

> I was worried data sharing might mean your keys
> retain entire 64K chunks of the input. However, it
> seems enumLines depends on the StringLike ByteString
> instance, which just converts to and from String.

Ouch, that sounds like something worth fixing.

> The other thing that occurs to me is that the total size of
> your keys would also be approximately the size of the input
> file if you were using plain text without each word split onto
> a separate line.

Well, I am not. The corpus is a word-per-line corpus, I'm reading a word per
line, and adding that to my hash map. This should never result in a data
structure even close to the size of the original corpus.

It could be, in a very unrealistic worst case scenario. But even a corpus of
30GB of Poe and Heidegger isn't going to make that happen. Furthermore, mine is 
not
such a scenario at all. As I said, if you reduce the corpus to a set of words
(i.e. a set of unigrams) you get a 40MB file from a 1.4GB corpus. Why is it,
that in order to create that 40MB file from a 1.4GB corpus, my trivial little
program needs somewhere north of 6-8 GB of RAM?

In this trivial example for War and Peace, why is it that in order to create the
unigram table for War and Peace, which ends up being a mere 201KB big, we're
chomping through 5MB on average, and nearly 10MB max? That's at least 25 times
more than we actually *should* have (yes, I know that the RTS is somewhere
there, too, but I think it's not a problem to ignore that for now.)

Regards,
Aleks


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Aleksandar Dimitrov
On Tue, May 31, 2011 at 02:13:14PM -0400, Edward Z. Yang wrote:
> It is possible that the iteratees library is space leaking; I recall some
> recent discussion to this effect.  Your example seems simple enough that
> you might recompile with a version of iteratees that has -auto-all enabled.

If I understand you correctly, you imply that I should try compiling iteratee
with profiling, no? I did install the iteratee library with profiling support (I
have the cabal profiling flag globally set in my cabal config,) but my profiles
so far seem to be blaming LAGging ByteStrings and HashMaps.

I, unfortunately, do not know how I would test iteratee itself for a space leak
here.

> Unfortunately, it's not really a safe bet to assume your libraries are
> leak free, and if you've pinpointed it down to a single line, and there
> doesn't seem a way to squash the leak, I'd bet it's the library's fault.

Since my knowledge of Haskell, and, in particular, high-performance Haskell, is
very lacking, my current m.o. is to blame myself :-) It might be iteratee, but
unfortunately, I have not found something that gives me better performance than
iteratee yet.

Regards,
Aleks


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread John Lato
From: "Edward Z. Yang" 

>
> Hello Aleksandar,
>
> It is possible that the iteratees library is space leaking; I recall some
> recent discussion to this effect.  Your example seems simple enough that
> you might recompile with a version of iteratees that has -auto-all enabled.
> Unfortunately, it's not really a safe bet to assume your libraries are
> leak free, and if you've pinpointed it down to a single line, and there
> doesn't seem a way to squash the leak, I'd bet it's the library's fault.
>
> Edward
>

I can't reproduce the space leak here.  I tried Aleksander's original code,
my iteratee version, the Ngrams version posted by Johan Tibell, and a lazy
bytestring version.

my iteratee version (only f' has changed from Aleksander's code):

f' :: Monad m => I.Iteratee S.ByteString m Wordcounts
f' = I.joinI $ (enumLinesBS I.><> I.filter (not . S.null)) $ I.foldl' (\t s
-> T.insertWith (+) s 1 t) T.empty

my lazy bytestring version

> import Data.Iteratee.Char
> import Data.List (foldl')import Data.Char (toLower)
>
> import Data.Ord (comparing)
> import Data.List (sortBy)
> import System.Environment (getArgs)
> import qualified Data.ByteString.Lazy.Char8 as L
> import qualified Data.HashMap.Strict as T
>
> f'2 = foldl' (\t s -> T.insertWith (+) s 1 t) T.empty . filter (not .
L.null) . L.lines
>
> main2 :: IO ()
> main2 = getArgs >>= L.readFile .head >>= print . T.keys . f'2

None of these leak space for me (all compiled with ghc-7.0.3 -O2).
Performance was pretty comparable for every version, although Aleksander's
original did seem to have a very small edge.

As someone already pointed out, keep in mind that this will use a lot of
memory anyway, unless there's a lot of repetition of words.

I'd be happy to help track down a space leak in iteratee, but for now I'm
not seeing one.

Best,
John Lato
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Antoine Latter
On Tue, May 31, 2011 at 2:49 PM, Scott Lawrence  wrote:
> I was under the impression that operations performed in monads (in this
> case, the IO monad) were lazy. (Certainly, every time I make the
> opposite assumption, my code fails :P .) Which doesn't explain why the
> following code fails to terminate:
>
>  iRecurse :: (Num a) => IO a
>  iRecurse = do
>    recurse <- iRecurse
>    return 1
>
>  main = (putStrLn . show) =<< iRecurse
>
> Any pointers to a good explanation of when the IO monad is lazy?
>
>
> === The long story ===
>
> I wrote a function unfold with type signature (([a] -> a) -> [a]), for
> generating a list in which each element can be calculated from all of
> the previous elements.
>
>  unfold :: ([a] -> a) -> [a]
>  unfold f = unfold1 f []
>
>  unfold1 :: ([a] -> a) -> [a] -> [a]
>  unfold1 f l = f l : unfold1 f (f l : l)
>
> Now I'm attempting to do the same thing, except where f returns a monad.
> (My use case is when f randomly selects the next element, i.e. text
> generation from markov chains.) So I want
>
>  unfoldM1 :: (Monad m) => ([a] -> m a) -> [a] -> m [a]
>
> My instinct, then, would be to do something like:
>
>  unfoldM1 f l = do
>    next <- f l
>    rest <- unfoldM1 f (next : l)
>    return (next : rest)
>
> But that, like iRecurse above, doesn't work.
>

You could use a different type:

> type IOStream a = (a, IO (IOStream a))

> unfold :: ([a] -> IO a) -> IO (IOStream a)
> unfold f =
> let go prev = do
>   next <- f prev
>   return (next, go (next:prev))
> in do
>   z <- f []
>   go [z]

> toList :: Int -> IOStream a -> IO [a]
> toList 0 _ = return []
> toList n (x,rest) = do
>   xs <- toList (n-1) rest
>   return (x:xs)

Antoine

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Antoine Latter
On Tue, May 31, 2011 at 6:10 PM, Antoine Latter  wrote:
>
> You could use a different type:
>
>> type IOStream a = (a, IO (IOStream a))
>
>> unfold :: ([a] -> IO a) -> IO (IOStream a)
>> unfold f =
>>     let go prev = do
>>           next <- f prev
>>           return (next, go (next:prev))
>>     in do
>>       z <- f []
>>       go [z]
>
>> toList :: Int -> IOStream a -> IO [a]
>> toList 0 _ = return []
>> toList n (x,rest) = do
>>   xs <- toList (n-1) rest
>>   return (x:xs)
>

Let's pretend I did that right:

> toList :: Int -> IOStream a -> IO [a]
> toList 0 _ = return []
> toList 1 (x,_) = return [x]
> toList n (x,r) = do
>   rest <- r
>   xs <- toList (n-1) rest
>   return (x:xs)

Antoine

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Enabling GADTs breaks Rank2Types code compilation - Why?

2011-05-31 Thread dm-list-haskell-cafe
I'm using GHC 7.0.2 and running into a compiler error that I cannot
understand.  Can anyone shed light on the issue for me?  The code does
not make use of GADTs and compiles just fine without them.  But when I
add a {-# LANGUAGE GADTs #-} pragma, it fails to compile.

Here is the code:



{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE GADTs #-}

mkUnit :: (forall t. t -> m t) -> m ()
mkUnit f = f ()

data Const b a = Const { unConst :: b }

sillyId :: r -> r
sillyId r = unConst (mkUnit mkConst_r) -- This line can't compile with GADTs
where mkConst_r t = mkConst r t
  mkConst :: b -> t -> Const b t
  mkConst b _ = Const b



And the error I get is:

Couldn't match type `t0' with `t'
  because type variable `t' would escape its scope
This (rigid, skolem) type variable is bound by
  a type expected by the context: t -> Const r t
The following variables have types that mention t0
  mkConst_r :: t0 -> Const r t0
(bound at /u/dm/hack/hs/gadt.hs:11:11)
In the first argument of `mkUnit', namely `mkConst_r'
In the first argument of `unConst', namely `(mkUnit mkConst_r)'
In the expression: unConst (mkUnit mkConst_r)

I've found several workarounds for the issue, but I'd like to
understand what the error message means and why it is caused by GADTs.

Thanks in advance for any help.

David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] How on Earth Do You Reason about Space?

2011-05-31 Thread Aleksandar Dimitrov
On Tue, May 31, 2011 at 11:30:06PM +0100, John Lato wrote:
> I can't reproduce the space leak here.  I tried Aleksander's original code,
> my iteratee version, the Ngrams version posted by Johan Tibell, and a lazy
> bytestring version.

I unfortunately can't post the actual corpus here, because it's copyrighted. But
there's plenty of ways to retrieve large amounts of data from the Internet. See
below.


> f' :: Monad m => I.Iteratee S.ByteString m Wordcounts
> f' = I.joinI $ (enumLinesBS I.><> I.filter (not . S.null)) $ I.foldl' (\t s
> -> T.insertWith (+) s 1 t) T.empty

Neat, folding the specialised function into the enumeratee is nifty! One can
tell that my experience with iteratee/enumerator has been only a day's worth for
now :-\

> None of these leak space for me (all compiled with ghc-7.0.3 -O2).
> Performance was pretty comparable for every version, although Aleksander's
> original did seem to have a very small edge.

How big were your input corpora?

Here's an absolutely evil shell script that is going to make me donate money to
project Gutenberg. It will gather a corpus that is still very small, but at
least realistic in its distributional properties.

+++ scrape.sh

#!/bin/sh

textfile=all_text.txt

touch $textfile

text=0
size=0
for i in `seq 10 300`; do
wget -q "http://www.gutenberg.org/files/$i/$i.zip";
if [ -f $i.zip ]; then
unzip -qq $i.zip
tr -sc '[[:alpha:]]' '\n' < $i.txt >> $textfile
text=`dc -e "$text 1 + p"`
size=`du -h $textfile | cut -f 1`
rm -f $i.zip $i.txt
fi
echo -n "\rFound $text of $i texts, total $size."
done
echo "\rFound $text texts, total $size"

+++

It'll take a while to run, and the resulting corpus is just a fraction of what
I'm using, but it serves well to illustrate the problem. If you want, you can
increase the amount of data gathered by simply tweaking the numerical range in
the seq statement. (If you make it gobble up more bandwidth, it might be polite
to put a sleep somewhere in the inner loop. I'd host the resulting file myself,
but I don't know if there aren't any conditions on that.)

If you did not tweak the script, it should've gathered some 100MB of data.

Running my unmodified original program, htop records 320MB of RAM usage towards
the end (*without* profiling being enabled.)

So it seems that I can't get rid of a factor of around 3x the input file size.
Luckily, the dependency seems to be linear. Here's some profiling:

<>
../src/cafe/tools/iterTable 106M_text.txt +RTS -tstderr  50.44s user 1.50s 
system 99% cpu 52.064 total

ghc itself reports 38MB avg (can live with that,) and 140MB max (too much.)

Redirecting the program's output to a file will yield a mere 2.2M for the data
gathered by the above script. Since those 2.2M of data are all I care about, why
do I need so much more RAM to compute them?

Are my demands unreasonable?

> I'd be happy to help track down a space leak in iteratee, but for now I'm
> not seeing one.

Thank you for your offer! Maybe I'm just seeing ghosts, and there is no space
leak. But I really do think that the program is eating too much RAM.

Regards,
Aleks


signature.asc
Description: Digital signature
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Lazy Evaluation in Monads

2011-05-31 Thread Albert Y. C. Lai

On a tangent, not doing IO, but food for thought:

{-# LANGUAGE FlexibleContexts #-}

import Control.Monad.State.Lazy as N
import Control.Monad.State.Strict as S

gen :: (MonadState [()] m) => m ()
gen = do
  gen
  modify (() :)

many = take 3 (N.execState gen [])
none = take 3 (S.execState gen [])


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Enabling GADTs breaks Rank2Types code compilation - Why?

2011-05-31 Thread austin seipp
Hi David,

It seems to be a result of the new typechecker and more specifically
the new behavior for GADTs in GHC 7.

The short story is thus: when you turn on GADTs, it also now turns on
another extension implicitly (MonoLocalBinds) which restricts let
generalization. More specifically, it causes GHC to not generalize the
inferred type of polymorphic let bindings (and where clauses too.)
This results in the error you see. GHC telling you that the type of
the argument to mkUnit (in this case mkConst_r) is not polymorphic
enough.

The correct fix is to simply give an explicit type to any polymorphic
let binding. This will let GHC happily compile your program with GADTs
with otherwise no changes needed for correctness. The reasoning for
this behavior is because SPJ et al found it a lot more difficult to
build a robust type inference engine, when faced with non-monomorphic
local bindings. Luckily the overall impact of such a change was
measured to be relatively small, but indeed, it will break some
existing programs. The changes aren't huge though - this program
successfully typechecks with GHC 7.0.3:

---
{-# LANGUAGE Rank2Types #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE GADTs #-}

module Main where

mkUnit :: (forall t. t -> m t) -> m ()
mkUnit f = f ()

data Const b a = Const { unConst :: b }

sillyId :: forall r. r -> r
sillyId r = unConst (mkUnit mkConst_r) -- This line can't compile with GADTs
  where mkConst_r :: t -> Const r t
mkConst_r t = mkConst r t
mkConst :: b -> t -> Const b t
mkConst b _ = Const b

main :: IO ()
main = return ()
---

There is also another 'fix' that may work, which Simon talks about
himself: you can enable GADTs, but turn off enforced monomorphic local
bindings by giving the extension 'NoMonoLocalBinds'. This will merely
tell GHC to try anyway, but it'd be better to just update your
program, as you can't give as many guarantees about the type
inferencer when you give it such code.

You can find a little more info about the change here:

http://hackage.haskell.org/trac/ghc/blog/LetGeneralisationInGhc7

And in Simon's paper ("Let should not be generalized.")

Hope it helps,

On Tue, May 31, 2011 at 6:42 PM,   wrote:
> I'm using GHC 7.0.2 and running into a compiler error that I cannot
> understand.  Can anyone shed light on the issue for me?  The code does
> not make use of GADTs and compiles just fine without them.  But when I
> add a {-# LANGUAGE GADTs #-} pragma, it fails to compile.
>
> Here is the code:
>
> 
>
> {-# LANGUAGE Rank2Types #-}
> {-# LANGUAGE GADTs #-}
>
> mkUnit :: (forall t. t -> m t) -> m ()
> mkUnit f = f ()
>
> data Const b a = Const { unConst :: b }
>
> sillyId :: r -> r
> sillyId r = unConst (mkUnit mkConst_r) -- This line can't compile with GADTs
>    where mkConst_r t = mkConst r t
>          mkConst :: b -> t -> Const b t
>          mkConst b _ = Const b
>
> 
>
> And the error I get is:
>
>    Couldn't match type `t0' with `t'
>      because type variable `t' would escape its scope
>    This (rigid, skolem) type variable is bound by
>      a type expected by the context: t -> Const r t
>    The following variables have types that mention t0
>      mkConst_r :: t0 -> Const r t0
>        (bound at /u/dm/hack/hs/gadt.hs:11:11)
>    In the first argument of `mkUnit', namely `mkConst_r'
>    In the first argument of `unConst', namely `(mkUnit mkConst_r)'
>    In the expression: unConst (mkUnit mkConst_r)
>
> I've found several workarounds for the issue, but I'd like to
> understand what the error message means and why it is caused by GADTs.
>
> Thanks in advance for any help.
>
> David
>
> ___
> Haskell-Cafe mailing list
> Haskell-Cafe@haskell.org
> http://www.haskell.org/mailman/listinfo/haskell-cafe
>



-- 
Regards,
Austin

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Enabling GADTs breaks Rank2Types code compilation - Why?

2011-05-31 Thread dm-list-haskell-cafe
At Tue, 31 May 2011 21:30:01 -0500,
austin seipp wrote:
> 
> The short story is thus: when you turn on GADTs, it also now turns on
> another extension implicitly (MonoLocalBinds) which restricts let
> generalization...
> 
> You can find a little more info about the change here:
> 
> http://hackage.haskell.org/trac/ghc/blog/LetGeneralisationInGhc7

Thanks for the precise response I needed.

It definitely felt like I was running up against something like the
monomorphism restriction, but my bindings were function and not
pattern bindings, so I couldn't understand what was going on.  I had
even gone and re-read the GHC documentation
(http://www.haskell.org/ghc/docs/7.0-latest/html/users_guide/data-type-extensions.html#gadt),
which says that -XGADTs enables -XRelaxedPolyRec, but makes no mention
of -XMonoLocalBinds.

It might save users some frustration if the GHC manual and/or the
error message mentioned something about let bindings being monomorphic
by default.

On a related note, I already started fixing this in my code by
enabling ScopedTypeVariables, as it's too much of a pain to do this
without that extension.

I usually try to use the minimum number of extensions possible to
future-proof my code.  However, is it reasonable to conclude that if
I'm going to use GADTs anyway, then additionally enabling
ScopedTypeVariables doesn't really make my code any less future-proof?

Thanks,
David

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Computing the memory footprint of a HashMap ByteString Int (Was: How on Earth Do You Reason about Space?)

2011-05-31 Thread Johan Tibell
Hi Aleksandar,

I thought it'd be educational to do some back-of-the-envelope
calculations to see how much memory we'd expect to use to store words
in a HashMap ByteString Int. First, lets start by looking at how much
memory one ByteString uses. Here's the definition of ByteString [1]:

data ByteString = PS {-# UNPACK #-} !(ForeignPtr Word8)
 {-# UNPACK #-} !Int-- offset
 {-# UNPACK #-} !Int-- length

The two Int fields are used to support O(1) slicing of ByteStrings.

We also need the definitions of ForeignPtr [2] and Int.

data ForeignPtr a = ForeignPtr Addr# ForeignPtrContents

data ForeignPtrContents
= PlainForeignPtr !(IORef (Finalizers, [IO ()]))
| MallocPtr  (MutableByteArray# RealWorld) !(IORef
(Finalizers, [IO ()]))
| PlainPtr   (MutableByteArray# RealWorld)

data Int = I# Int#

The UNPACK indicates to the compiler that it should unpack the
contents of a constructor field into the constructor itself, removing
a level of indirection. We'll end up with a structure that looks like
this:

data ByteString = PS Addr# ForeignPtrContents
 Int#-- offset
 Int#-- length

To compute the size of a ByteString, count the number of constructor
fields and add one (for the constructor itself). This is how many
words the value is going to use. In this case it's 5 words. In
addition we need to add the size of the ForeignPtrContents
constructor, which happens to be a PlainPtr in this case, so we add
two more words.

Finally we need to look at the definition of MutableByteArray# [3],
which is implemented by a C struct named StgArrWords:

typedef struct {
StgHeader  header;
StgWordbytes;
StgWordpayload[FLEXIBLE_ARRAY];
} StgArrWords;

The StgHeader takes one word (when compiling with profiling turned
off) so StgArrWords takes 2 words, plus the actual payload.

If we add it all up we get 9 words, plus the size of the payload (i.e.
the length of a word encoded using UTF-8 in your case).

Now lets look at the definition of HashMap [4]:

data HashMap k v
= Bin {-# UNPACK #-} !SuffixMask
  !(HashMap k v)
  !(HashMap k v)
| Tip {-# UNPACK #-} !Hash
  {-# UNPACK #-} !(FL.FullList k v)
| Nil

We also need the definition of FullList [5]:

data FullList k v = FL !k v !(List k v)
data List k v = Nil | Cons !k v !(List k v)

For the sake of this discussion lets assume the tree is perfectly
balanced. For a HashMap of size N this means that we have N leaves
(Tip) and N-1 interior nodes (Bin). Each Bin takes 4 words.

The size of the Tip depends on the number of hash collisions. These
are quite rare so lets assume that the FullList only has one element.
Also, the Nil constructor is free as it can be shared by all instances
in the program. After applying the UNPACK pragmas the Tip constructor
looks like:

| Tip Int# !k v !(List k v)

This takes another 5 words.

Now when we know the overhead of both Bin and Tip we can compute the
overhead per key/value pair as: (5N + 4(N-1)) / N = 9 - 4/N ~= 9
words.

Given that an Int (not an Int#) takes 2 words, we can approximate the
memory cost of a key/value pair in a HashMap ByteString Int as

(9 + 9 + 2) * MachineWordSize + AvgByteStringLength

bytes.

For example, the average English word length is 5 characters, if you
include stop words. We'd expect to use

(9 + 9 + 2) * 8 + 5 = 165

bytes per unique word in our input corpus, on a 64-bit machine. Plus
any GC overhead. This is probably more than one would expect.

I'm working on switching HashMap to use another data structure, a Hash
Array Mapped Trie, in its implementation. This will bring the overhead
down from 9 words to about 4 words.

You could try using the lower overhead SmallString type from the
smallstring package [6]. It has an overhead of 4 words per string
(instead of 9 like ByteString). There's some runtime overhead involved
in converting a value (i.e. Text) to a SmallString. I don't know if
this overhead will be noticeable in your program.

1. http://code.haskell.org/bytestring/Data/ByteString/Internal.hs
2. https://github.com/ghc/packages-base/blob/master/GHC/ForeignPtr.hs
3. https://github.com/ghc/ghc/blob/master/includes/rts/storage/Closures.h
4. 
https://github.com/tibbe/unordered-containers/blob/master/Data/HashMap/Common.hs
5. 
https://github.com/tibbe/unordered-containers/blob/master/Data/FullList/Lazy.hs
6. http://hackage.haskell.org/package/smallstring

P.S. If your program indeed has a space leak this won't help you, but
it's a good way to figure out if your program uses a reasonable amount
of memory.

Cheers,
Johan

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe