Re: [Haskell-cafe] Re: Over-allocation

2007-12-08 Thread Don Stewart
gracjanpolak:
> Gracjan Polak  gmail.com> writes:
> 
> > 
> > Don Stewart  galois.com> writes:
> > > 
> > > ByteStrings have all the same operations as lists though, so you can
> > > index, compare and take substrings, with the benefit that the underlying
> > > string will be shared, not copied. And only use 1 byte per element.
> > 
> > Is there any parser built directly over ByteString that I could look at?
> > 
> > Or maybe somebody implemented something like Text.ParserCombinators.ReadP
> > for ByteString?
> > 
> > At first sight it seems doable, so there is light at the end of the
> > tunnel :)
> > 
> 
> Just a success report: after 58 min of coding I got a kind of ReadP parser over
> ByteString working and my memory usage went down from 1500MB to... 1.2MB! Over
> 1000 times better! Incredible!
> 
> Thanks for the suggestion to do it with ByteStrings!
> 
> I hope to publish it when I clean it up enough!

That is really awesome!
Sharing input strings around with bytestrings really should lead to
excellent memory savings in parsing. I'm glad we see this confirmed.

Will you be releasing the code soon?

-- Don
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: Over-allocation

2007-11-30 Thread Simon Marlow

Gracjan Polak wrote:


> My program is eating too much memory:
>
> copyfile source.txt dest.txt +RTS -sstderr
> Reading file...
> Reducing structure...
> Writting file...
> Done in 20.277s
> 1,499,778,352 bytes allocated in the heap
> 2,299,036,932 bytes copied during GC (scavenged)
> 1,522,112,856 bytes copied during GC (not scavenged)
>    17,846,272 bytes maximum residency (198 sample(s))
>
>        2860 collections in generation 0 ( 10.37s)
>         198 collections in generation 1 (  8.35s)
>
>          50 Mb total memory in use
>
>   INIT  time    0.00s  (  0.00s elapsed)
>   MUT   time    1.26s  (  1.54s elapsed)
>   GC    time   18.72s  ( 18.74s elapsed)
>   EXIT  time    0.00s  (  0.00s elapsed)
>   Total time   19.98s  ( 20.28s elapsed)


ooh.  May I have your program (the unfixed version) for benchmarking the 
parallel GC?


Cheers,
Simon, currently collecting space leaks


[Haskell-cafe] Re: Over-allocation

2007-11-22 Thread Gracjan Polak
Gracjan Polak  gmail.com> writes:

> 
> Don Stewart  galois.com> writes:
> > 
> > ByteStrings have all the same operations as lists though, so you can
> > index, compare and take substrings, with the benefit that the underlying
> > string will be shared, not copied. And only use 1 byte per element.
> 
> Is there any parser built directly over ByteString that I could look at?
> 
> Or maybe somebody implemented something like Text.ParserCombinators.ReadP for
> ByteString?
> 
> At first sight it seems doable, so there is light at the end of the
> tunnel :)
> 

Just a success report: after 58 min of coding I got a kind of ReadP parser over
ByteString working and my memory usage went down from 1500MB to... 1.2MB! Over
1000 times better! Incredible!

Thanks for the suggestion to do it with ByteStrings!

I hope to publish it when I clean it up enough!
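
The parser itself is not shown in this thread. Purely as an illustration, here
is a minimal sketch of what a ReadP-style parser over a strict ByteString might
look like. Everything in it (Parser, munch1, skipSpaces, number, pair) is a
hypothetical name chosen for this example, and unlike the real ReadP it is
deterministic: it returns at most one result instead of all possible parses.

```haskell
import qualified Data.ByteString.Char8 as BC
import Data.Char (isDigit, isSpace)

-- A parser consumes a prefix of the input and returns a result together with
-- the remaining input. The remainder is a slice of the original buffer.
newtype Parser a = Parser
  { runParser :: BC.ByteString -> Maybe (a, BC.ByteString) }

instance Functor Parser where
  fmap f (Parser p) = Parser $ \s -> case p s of
    Nothing      -> Nothing
    Just (a, s') -> Just (f a, s')

instance Applicative Parser where
  pure a = Parser $ \s -> Just (a, s)
  Parser pf <*> Parser pa = Parser $ \s -> case pf s of
    Nothing      -> Nothing
    Just (f, s') -> case pa s' of
      Nothing       -> Nothing
      Just (a, s'') -> Just (f a, s'')

instance Monad Parser where
  Parser p >>= f = Parser $ \s -> case p s of
    Nothing      -> Nothing
    Just (a, s') -> runParser (f a) s'

-- Consume one or more characters satisfying the predicate (compare ReadP's
-- munch1). The token is returned as a ByteString slice, not as a list.
munch1 :: (Char -> Bool) -> Parser BC.ByteString
munch1 ok = Parser $ \s ->
  let (tok, rest) = BC.span ok s
  in if BC.null tok then Nothing else Just (tok, rest)

-- Drop leading whitespace; always succeeds.
skipSpaces :: Parser ()
skipSpaces = Parser $ \s -> Just ((), BC.dropWhile isSpace s)

-- Parse an unsigned decimal number, skipping leading whitespace.
number :: Parser Int
number = do
  skipSpaces
  digits <- munch1 isDigit
  case BC.readInt digits of
    Just (n, _) -> pure n
    Nothing     -> Parser (const Nothing)

-- Parse two numbers separated by optional whitespace.
pair :: Parser (Int, Int)
pair = (,) <$> number <*> number
```

In GHCi, `runParser pair (BC.pack "12 34rest")` yields the two numbers plus the
unconsumed remainder. No intermediate [Word8] or String is built for the tokens.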

-- 
Gracjan





[Haskell-cafe] Re: Over-allocation

2007-11-22 Thread Gracjan Polak
Don Stewart  galois.com> writes:
> 
> ByteStrings have all the same operations as lists though, so you can
> index, compare and take substrings, with the benefit that the underlying
> string will be shared, not copied. And only use 1 byte per element.

Is there any parser built directly over ByteString that I could look at?

Or maybe somebody implemented something like Text.ParserCombinators.ReadP for
ByteString?

At first sight it seems doable, so there is light at the end of the
tunnel :)

-- 
Gracjan




Re: [Haskell-cafe] Re: Over-allocation

2007-11-21 Thread Don Stewart
gracjanpolak:
> Ketil Malde  ii.uib.no> writes:
> > 
> > Then you get the memory behavior you ask for.  Unevaluated strings are
> > extremely expensive, something like 12 bytes per char on 32 bit, twice
> > that on 64 bits, and then you need GC overhead, etc.  ByteStrings are
> > much better, but you then probably need to implement your own XML
> > parsing. 
> > 
> 
> My lazy chunks have type ByteString -> Object. Only internally they use
> ByteString.unpack to get the list of Word8s to parse them.
> 
> My parser is totally my own so I can do anything I wish. Except it is hard for
> me to imagine a parser working on something other than [Word8]. How do I do this?

ByteStrings have all the same operations as lists though, so you can
index, compare and take substrings, with the benefit that the underlying
string will be shared, not copied. And only use 1 byte per element.
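
A minimal sketch of the operations mentioned above, using
Data.ByteString.Char8 from the bytestring package. The input string and the
names input, firstChar, tagName and isTag are made up for illustration:

```haskell
import qualified Data.ByteString.Char8 as BC

-- Hypothetical input, for illustration only.
input :: BC.ByteString
input = BC.pack "<tag>hello</tag>"

-- Indexing is O(1) and allocates no list cells.
firstChar :: Char
firstChar = BC.index input 0

-- take and drop on a strict ByteString are O(1) slices: the result refers to
-- the same underlying buffer as 'input' with a different offset and length.
tagName :: BC.ByteString
tagName = BC.take 3 (BC.drop 1 input)

-- Slices can be compared directly, without unpacking to a list.
isTag :: Bool
isTag = tagName == BC.pack "tag"
```

One consequence of sharing is that a small slice keeps the whole original
buffer alive; BC.copy makes an independent copy when that matters.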

> So how do I get rid of those (:) and W8# that are allocated everywhere on my 
> heap?
> 
> Thanks for the suggestion for -hd, really useful option!


[Haskell-cafe] Re: Over-allocation

2007-11-21 Thread Gracjan Polak
Ketil Malde  ii.uib.no> writes:
> 
> Then you get the memory behavior you ask for.  Unevaluated strings are
> extremely expensive, something like 12 bytes per char on 32 bit, twice
> that on 64 bits, and then you need GC overhead, etc.  ByteStrings are
> much better, but you then probably need to implement your own XML
> parsing. 
> 

My lazy chunks have type ByteString -> Object. Only internally they use
ByteString.unpack to get the list of Word8s to parse them.

My parser is totally my own so I can do anything I wish. Except it is hard for
me to imagine a parser working on something other than [Word8]. How do I do this?

So how do I get rid of those (:) and W8# that are allocated everywhere on my 
heap?

Thanks for the suggestion for -hd, really useful option!

-- 
Gracjan






Re: [Haskell-cafe] Re: Over-allocation

2007-11-21 Thread Ketil Malde
Gracjan Polak <[EMAIL PROTECTED]> writes:

> I tried both Map and IntMap and there was no difference in total memory usage
> or usage pattern. Seems I'm already strict enough.

This only proves Map and IntMap are equally strict, or in other words,
they are both lazy in the elements.
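
To illustrate the point about laziness in the elements, here is a small sketch.
It assumes a containers version that ships Data.IntMap.Strict (0.5 or later,
which is newer than this thread); the entries are invented for the example.

```haskell
import qualified Data.IntMap as Lazy
import qualified Data.IntMap.Strict as Strict

-- Hypothetical (key, value) pairs; the second value must never be evaluated.
entries :: [(Int, Int)]
entries = [(1, 10), (2, undefined)]

-- The lazy map stores the undefined value as an unevaluated thunk, so asking
-- for the size succeeds without touching it.
lazySize :: Int
lazySize = Lazy.size (Lazy.fromList entries)

-- The strict map evaluates every value to weak head normal form on insertion,
-- so building it from 'entries' would raise the undefined error instead.
strictSize :: [(Int, Int)] -> Int
strictSize = Strict.size . Strict.fromList
```

Note that weak head normal form of a String is only its first cons cell, so a
strict map alone would not fully evaluate String values.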

> Values are left lazy till the point where they are forced, and that is at
> write-out in my current exercise. I'd want to leave them lazy as in more
> involved transformation not all of them will be needed.

Then you get the memory behavior you ask for.  Unevaluated strings are
extremely expensive, something like 12 bytes per char on 32 bit, twice
that on 64 bits, and then you need GC overhead, etc.  ByteStrings are
much better, but you then probably need to implement your own XML
parsing. 

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants


Re: [Haskell-cafe] Re: Over-allocation

2007-11-21 Thread Ketil Malde
Gracjan Polak <[EMAIL PROTECTED]> writes:

> The problem is that my prog allocates a lot just to free it immediately
> after. But what?

Use +RTS -hd instead, which will tell you the constructor.

I bet you'll find it's (:), and that you are retaining a load of Chars
from your input file, pending evaluation of the elements from your
map.

-k
-- 
If I haven't seen further, it is by standing in the footprints of giants


[Haskell-cafe] Re: Over-allocation

2007-11-21 Thread Gracjan Polak
Ketil Malde  ii.uib.no> writes:

> 
> Gracjan Polak  gmail.com> writes:
> 
> > let entries = IntMap.fromList (map (\(a,b,c) -> (a,c)) (concat p))
> 
> Gut reaction: Map is lazy in its values (but probably not the key,
> which are checked for order), so you should force the 'c' before
> inserting it in the map.  (There's probably a strict fromList or
> IntMap somewhere?) 

I tried both Map and IntMap and there was no difference in total memory usage or
usage pattern. Seems I'm already strict enough.

Values are left lazy till the point where they are forced, and that is at
write-out in my current exercise. I'd want to leave them lazy as in more
involved transformation not all of them will be needed.

-- 
Gracjan




[Haskell-cafe] Re: Over-allocation

2007-11-21 Thread Gracjan Polak
Stefan O'Rear  cox.net> writes:
> 
> Note that heap profiling is even more a black art than time profiling;
> you may need to do a lot of experimentation to find an enlightening
> profile.
> 

Black art indeed... I did -hc, looked at the postscript generated from every
angle I could and it looks like this:

    /|    /|    /|
   / |   / |   / |
  /  |  /  |  /  |
 /   | /   | /   |
/    |/    |/    |

This is only xparse, other functions are unimportant and aren't even visible on
the graph.

My xparse allocates a lot of memory which is then almost all freed at the very
next occasion by GC. Seems I do not have space leaks.

The problem is that my prog allocates a lot just to free it immediately after.
But what?

-- 
Gracjan

