Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-24 Thread C K Kashyap
Thanks for the pointer, Mukesh. I'll go over the blog.

Switching the XML parser to another one from Hackage (the xml package)
helped, but not fully. I think I would need to switch to ByteString. For
now, I have split the program into smaller programs, and that seems to work.

Regards,
Kashyap


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-22 Thread mukesh tiwari
Hi Kashyap,
I am not sure whether this solves your problem, but try using ByteString
rather than String in

parseXML' :: String -> XMLAST
parseXML' str = f ast
  where
    ast = parse (spaces >> xmlParser) "" str
    f (Right x) = x
    f (Left _)  = CouldNotParse


Also see this post [1], "My Space is Leaking".

Regards,
Mukesh Tiwari

[1] http://www.mega-nerd.com/erikd/Blog/
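
A minimal sketch of the ByteString suggestion, assuming the parsec package's
Text.Parsec.ByteString module. The parser tagName is a hypothetical stand-in
for the xmlParser in RSXP.hs, which is not shown in this thread, and
parseTag/parseTagFile are illustrative names. Only the input type changes
relative to parseXML'.

```haskell
import qualified Data.ByteString.Char8 as B
import Text.Parsec (ParseError, char, letter, many1, parse, spaces)
import Text.Parsec.ByteString (Parser)

-- Hypothetical stand-in for xmlParser: parses an opening tag such as "<bug>"
-- and returns its name.
tagName :: Parser String
tagName = char '<' *> many1 letter <* char '>'

-- Same shape as parseXML', but the input is a strict ByteString instead of
-- a lazily read String.
parseTag :: B.ByteString -> Either ParseError String
parseTag = parse (spaces >> tagName) "<input>"

-- B.readFile reads the whole file and closes the handle before returning,
-- so no half-read handle is left behind.
parseTagFile :: FilePath -> IO (Either ParseError String)
parseTagFile path = parseTag <$> B.readFile path
```

Whether this removes the growth seen in the profile depends on what the real
parser retains; it only takes the lazy String out of the picture.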




Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-22 Thread C K Kashyap
Oops... I sent out the earlier message accidentally.

I did some profiling and generated the attached PDF. It shows unhealthy
growth in my XML parser:
https://github.com/ckkashyap/haskell-perf-repro/blob/master/RSXP.hs
I must not be using Parsec efficiently.

Regards,
Kashyap






driver.pdf
Description: Adobe PDF document


Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-22 Thread C K Kashyap
I did some profiling and generated this PDF. It shows unhealthy growth in
my XML parser.





Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-22 Thread C K Kashyap
Hi folks,

I've run into more issues with my report generation tool. I'd really
appreciate some help.

I've created a repro project on github to demonstrate the problem.
git://github.com/ckkashyap/haskell-perf-repro.git

There is a template xml file that needs to be replicated several times
(3000 or so) under the data directory and then "driver" needs to be run.
The memory used by driver keeps growing until it runs out of memory.

Also, I'd appreciate some tips on how to go about debugging this situation.
I am on Windows.


Regards,
Kashyap




Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-19 Thread Kim-Ee Yeoh
On Tue, Mar 19, 2013 at 2:01 PM, Konstantin Litvinenko
 wrote:
> Yes. You (and Dan) are totally right. 'Let' just bind expression, not
> evaluating it. Dan's evaluate trick force rnf to run before hClose. As I
> said - it's tricky part especially for newbie like me :)

To place this in perspective, one only needs to descend one or two
more layers before the semantics starts confusing even experts.

Whereas the difference between seq and evaluate shouldn't be too hard
to grasp, that between evaluate and (return $!) is considerably more
subtle, as Edward Yang notified us 10 days ago. See the thread titled
To seq or not to seq.
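
A small sketch of the evaluate versus (return $!) distinction mentioned
above, using only Control.Exception from base. The names boom, viaEvaluate,
viaReturn and demo are made up for illustration. The point is when the
argument gets forced: when the IO action is run, or already when the action
value itself is forced.

```haskell
import Control.Exception (ErrorCall, evaluate, try)

-- A value that throws when forced.
boom :: Int
boom = error "boom"

-- 'evaluate boom' is an ordinary IO action. seq-ing the action itself does
-- not force 'boom'; that only happens if the action is actually run, and
-- here it is discarded.
viaEvaluate :: IO ()
viaEvaluate = evaluate boom `seq` return ()

-- 'return $! boom' is 'boom `seq` return boom', so seq-ing the action
-- already forces 'boom'.
viaReturn :: IO ()
viaReturn = (return $! boom :: IO Int) `seq` return ()

-- Runs both and reports, for each, whether an exception was thrown.
demo :: IO (Bool, Bool)
demo = do
  r1 <- try viaEvaluate :: IO (Either ErrorCall ())
  r2 <- try viaReturn :: IO (Either ErrorCall ())
  return (threw r1, threw r2)
  where
    threw = either (const True) (const False)
```

Under these definitions only viaReturn should throw, which matches the
equations given in the Control.Exception documentation for evaluate.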

-- Kim-Ee



Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-19 Thread Konstantin Litvinenko

On 03/19/2013 07:12 AM, Edward Kmett wrote:

> Konstantin,
>
> Please allow me to elaborate on Dan's point -- or at least the point
> that I believe that Dan is making.
>
> Using,
>
> let bug = Control.DeepSeq.rnf str `seq` fileContents2Bug str
>
> or ($!!) will create a value that *when forced* causes the rnf to occur.
>
> As you don't look at bug until much later, this causes the same problem as
> before!

Yes. You (and Dan) are totally right. 'let' just binds an expression; it
does not evaluate it. Dan's evaluate trick forces rnf to run before hClose.
As I said, it's a tricky part, especially for a newbie like me :)






Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread Edward Kmett
Konstantin,

Please allow me to elaborate on Dan's point -- or at least the point that I
believe that Dan is making.

Using,

let bug = Control.DeepSeq.rnf str `seq` fileContents2Bug str


or ($!!) will create a value that *when forced* causes the rnf to occur.

As you don't look at bug until much later this causes the same problem as
before!

His addition of evaluate forces the rnf to happen before proceeding.

On a more ad hoc basis you might say

let !bug = fileContents2Bug $!! str

but without the bang pattern or the evaluate (which is arguably strictly
better, er, no pun intended, from a semantics perspective), nothing
happens until someone inspects bug.

With the code as structured this doesn't happen until it is too late.
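
The three variants discussed in this message can be put side by side as
follows. This is a sketch under assumptions: fileContents2Bug here is a
hypothetical stand-in (it just counts lines), and the explicit
openFile/hClose structure is a guess at the shape of the original paste,
which is not reproduced in this thread.

```haskell
{-# LANGUAGE BangPatterns #-}
import Control.DeepSeq (($!!))
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hClose, hGetContents, openFile)

-- Hypothetical stand-in for fileContents2Bug from the paste.
fileContents2Bug :: String -> Int
fileContents2Bug = length . lines

-- Plain let: only binds a thunk. Nothing has been read when hClose runs, so
-- forcing the result later hits a closed handle (truncated contents or an
-- exception, depending on the base version).
readBugLet :: FilePath -> IO Int
readBugLet path = do
  h <- openFile path ReadMode
  str <- hGetContents h
  let bug = fileContents2Bug $!! str
  hClose h
  return bug

-- Bang pattern: the binding is forced at this point in the do block, so the
-- file is fully read before hClose.
readBugBang :: FilePath -> IO Int
readBugBang path = do
  h <- openFile path ReadMode
  str <- hGetContents h
  let !bug = fileContents2Bug $!! str
  hClose h
  return bug

-- evaluate: the forcing is an IO step of its own, sequenced before hClose.
readBugEvaluate :: FilePath -> IO Int
readBugEvaluate path = do
  h <- openFile path ReadMode
  str <- hGetContents h
  bug <- evaluate (fileContents2Bug $!! str)
  hClose h
  return bug
```

Only the last two are safe to map over a long list of files; readBugLet is
included purely to show the failure mode.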

-Edward



Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread Konstantin Litvinenko

On 03/18/2013 06:06 PM, Dan Doel wrote:

> Do note that deepSeq alone won't (I think) change anything in your
> current code. bug will deepSeq the file contents.

rnf fully evaluates 'bug' by reading the whole file. Later, hClose closes
the handle and we are done. Not reading all of the content leaves a
semi-closed handle, which is leaked in that case: the handle stays open
until the lazy list from hGetContents hits the end.

> And the cons will
> seq bug. But nothing is evaluating the cons. And further, the cons
> isn't seqing the tail, so none of that will collapse, either. So the
> file descriptors will still all be opened at once.
>
> Probably the best solution if you choose to go this way is:
>
>  bug <- evaluate (fileContents2Bug $!! str)
>
> which ties the evaluation of the file contents into the IO execution.
> At that point, deepSeqing the file is probably unnecessary, though,
> because evaluating the bug will likely allow the file contents to be
> collected.

evaluate does the same as $!: it evaluates its argument to WHNF. That won't
help in any way. Executing in the IO monad doesn't imply strictness. That's
why mixing lazy hGetContents with strict openFile/hClose is so tricky.






Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread Dan Doel
Do note that deepSeq alone won't (I think) change anything in your
current code. bug will deepSeq the file contents. And the cons will
seq bug. But nothing is evaluating the cons. And further, the cons
isn't seqing the tail, so none of that will collapse, either. So the
file descriptors will still all be opened at once.

Probably the best solution if you choose to go this way is:

bug <- evaluate (fileContents2Bug $!! str)

which ties the evaluation of the file contents into the IO execution.
At that point, deepSeqing the file is probably unnecessary, though,
because evaluating the bug will likely allow the file contents to be
collected.



Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread C K Kashyap
Thanks Konstantin ... I'll try that out too...



Regards,
Kashyap




Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread Ivan Lazar Miljenovic
On 18 March 2013 21:01, Konstantin Litvinenko  wrote:
> Your problem is in
>
> let bug = ($!) fileContents2Bug str
>
> ($!) evaluates only to WHNF and you need NF. The above evaluates just up to
> the first character of the file, not the whole content. To fully evaluate
> 'str' you need something like
>
> let bug = Control.DeepSeq.rnf str `seq` fileContents2Bug str

Or use $!! from Control.DeepSeq.




-- 
Ivan Lazar Miljenovic
ivan.miljeno...@gmail.com
http://IvanMiljenovic.wordpress.com



Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-18 Thread Konstantin Litvinenko

On 03/17/2013 07:08 AM, C K Kashyap wrote:

> I am working on an automation that periodically fetches bug data from
> our bug tracking system and creates static HTML reports. Things worked
> fine when the bugs were on the order of 200 or so. Now I am trying to
> run it against 3000 bugs and suddenly I see things like too many open
> handles, out of memory, etc.
>
> Here's the code snippet - http://hpaste.org/84197
>
> It's a small snippet and I've put in comments stating how I run into
> "out of file handles" or simply the file not getting read due to lazy IO.
>
> I realize that sprinkling ($!) around by trial and error is going to be
> futile. I'd appreciate some pointers to tools I could use to get
> some idea of which expressions are building up huge thunks.

Your problem is in

let bug = ($!) fileContents2Bug str

($!) evaluates only to WHNF and you need NF. The above evaluates just up to
the first character of the file, not the whole content. To fully evaluate
'str' you need something like


let bug = Control.DeepSeq.rnf str `seq` fileContents2Bug str
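
The WHNF/NF difference can be seen without any file I/O. In this sketch the
String named partial is a made-up stand-in for lazily read file contents:
its first cons cell exists, but its tail throws when forced. The names
whnfOnly, fullNF and demo are illustrative only.

```haskell
import Control.DeepSeq (rnf)
import Control.Exception (ErrorCall, evaluate, try)

-- First character is available; everything after it throws when forced.
partial :: String
partial = 'a' : error "rest of the file was never read"

-- seq (and therefore $!) stops at the outermost constructor: WHNF.
whnfOnly :: IO (Either ErrorCall ())
whnfOnly = try (evaluate (partial `seq` ()))

-- rnf walks the entire list: normal form.
fullNF :: IO (Either ErrorCall ())
fullNF = try (evaluate (rnf partial))

-- Reports, for each strategy, whether forcing succeeded.
demo :: IO (Bool, Bool)
demo = do
  a <- whnfOnly
  b <- fullNF
  return (ok a, ok b)
  where
    ok = either (const False) (const True)
```

Here the seq version succeeds because it never looks past the first cons
cell, while the rnf version runs into the error in the tail.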







Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-17 Thread C K Kashyap
Thanks everyone,

Dan, mapMI worked for me ...

Regards,
Kashyap




Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-17 Thread Petr Pudlák
Hi Kashyap,

you could also use iteratees or conduits for a task like that. The beauty
of such libraries is that they can ensure that a resource is always
properly disposed of. See this simple example:
https://gist.github.com/anonymous/5183107
It prints the first line of each file given as an argument. After each line
is printed, the `fileConduit` pipe ensures that the handle is closed. It
also makes the program nicely composable.

Best regards,
Petr


import Control.Monad
import Control.Monad.Trans.Class
import Control.Monad.IO.Class
import Data.Conduit
import Data.Conduit.List
import System.Environment
import System.IO

{- | Accept file paths on input, output opened file handle, and ensure that the
 - handle is always closed after its downstream pipe finishes whatever work on
 - it. -}
fileConduit :: MonadResource m => IOMode -> Conduit FilePath m Handle
fileConduit mode = awaitForever process
  where
    process file = bracketP (openFile file mode) closeWithMsg yield
    closeWithMsg h = do
        putStrLn "Closing file"
        hClose h

{- | Print the first line from each handle on input. Don't care about the
 - handle. -}
firstLine :: MonadIO m => Sink Handle m ()
firstLine = awaitForever (liftIO . (hGetLine >=> putStrLn))

main = do
    args <- getArgs
    runResourceT $ sourceList args =$= fileConduit ReadMode $$ firstLine






Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-17 Thread Dan Doel
One thing that typically isn't mentioned in these situations is that
you can add more laziness. I'm unsure if it would work from just your
snippet, but it might.

The core problem is that something like:

mapM readFile names

will open all the files at once. Applying any processing to the file
contents is irrelevant unless the results of that processing are
evaluated sufficiently to allow the file to be closed.

Now, most people will tell you that this means lazy I/O is evil, and
you should make it all strict. But, consider an analogous situation
where instead of opening a file handle, we do something that allocates
a lot of memory, and can only free it after processing. We'd run out
of memory allocating 3,000 * X, but X alone is fine. Then people
usually suggest delaying the allocation until you need it, i.e. lazy
evaluation.

Unfortunately, there's no combinator for this in the standard
libraries, but you can write one:

import System.IO.Unsafe (unsafeInterleaveIO)

mapMI :: (a -> IO b) -> [a] -> IO [b]
mapMI _ [] = return []
-- You can play with this case a bit. This will open a file for the head of
-- the list, and then when each subsequent cons cell is inspected. You could
-- probably interleave 'f x' as well.
mapMI f (x:xs) = do
  y  <- f x
  ys <- unsafeInterleaveIO (mapMI f xs)
  return (y:ys)

Now, mapMI readFile only opens the handle when you match on the list,
so if you process the list incrementally, it will open the file
handles one-by-one.
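
To see the one-by-one behaviour without touching the file system, here is a
small sketch that counts how often the action has run. mapMI is repeated
from above so the block is self-contained; the counter, the open stand-in
for readFile, and demo are made up for illustration.

```haskell
import Control.Exception (evaluate)
import Data.IORef (modifyIORef', newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- mapMI as defined above.
mapMI :: (a -> IO b) -> [a] -> IO [b]
mapMI _ [] = return []
mapMI f (x:xs) = do
  y  <- f x
  ys <- unsafeInterleaveIO (mapMI f xs)
  return (y:ys)

-- Returns how many times the action has run after only the first two
-- results of a five-element input have been consumed.
demo :: IO Int
demo = do
  counter <- newIORef (0 :: Int)
  -- Stand-in for readFile: bump the counter and return a dummy result.
  let open n = modifyIORef' counter (+ 1) >> return (n * 10)
  ys <- mapMI open [1 .. 5 :: Int]
  _ <- evaluate (sum (take 2 ys))  -- inspect just two cons cells
  readIORef counter
```

With plain mapM the counter would already be 5 before any result is looked
at; with mapMI it only advances as the list is consumed.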

As an aside, you should never use hClose when doing lazy I/O. That's
kind of like solving the above "I've allocated too much memory"
problem with "just overwrite some expensive stuff with some other
cheap stuff to free up space."

-- Dan




Re: [Haskell-cafe] Need some advice around lazy IO

2013-03-16 Thread Carlo Hamalainen
On Sun, Mar 17, 2013 at 3:08 PM, C K Kashyap  wrote:

> It's a small snippet and I've put in the comments stating how I run into
> "out of file handles" or simply file not getting read due to lazy IO.
>
> I realize that putting ($!) using a trial/error approach is going to be
> futile. I'd appreciate some pointers into the tools I could use to get some
> idea of which expressions are building up huge thunks.
>

Have you tried System.IO.Strict's readFile? I had similar problems (too
many file handles) and fixed it with

import qualified System.IO.Strict as S

and then using S.readFile instead of the standard prelude's readFile.

This is where I used the strict IO readFile in my toy project:
https://github.com/carlohamalainen/checker/blob/master/Checker.hs
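
For readers who would rather not add a dependency, the idea behind a strict
readFile can be sketched with base alone. This is an illustration under
assumptions, not the code of System.IO.Strict; readFileStrict and
readAllStrict are hypothetical names.

```haskell
import Control.Exception (evaluate)
import System.IO (IOMode (ReadMode), hGetContents, withFile)

-- Reads the whole file before returning, so the handle is already closed
-- by the time the caller sees the contents.
readFileStrict :: FilePath -> IO String
readFileStrict path = withFile path ReadMode $ \h -> do
  contents <- hGetContents h
  _ <- evaluate (length contents)  -- walk the list to the end, reading everything
  return contents

-- Reads many files one after another; at most one handle is open at a time.
readAllStrict :: [FilePath] -> IO [String]
readAllStrict = mapM readFileStrict
```

This trades handles for memory: every file's contents stay in memory for as
long as the resulting list is retained, so with thousands of files it may be
better to process each file's contents right after reading it.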

-- 
Carlo Hamalainen
http://carlo-hamalainen.net


[Haskell-cafe] Need some advice around lazy IO

2013-03-16 Thread C K Kashyap
Hi,

I am working on an automation that periodically fetches bug data from our
bug tracking system and creates static HTML reports. Things worked fine
when the bugs were on the order of 200 or so. Now I am trying to run it
against 3000 bugs and suddenly I see things like too many open handles,
out of memory, etc.

Here's the code snippet - http://hpaste.org/84197

It's a small snippet and I've put in comments stating how I run into
"out of file handles" or simply the file not getting read due to lazy IO.

I realize that sprinkling ($!) around by trial and error is going to be
futile. I'd appreciate some pointers to tools I could use to get some
idea of which expressions are building up huge thunks.


Regards,
Kashyap