Re: pinned byte arrays
Pinning the arrays gives the GC much less flexibility, especially if your objects are small. It means the GC can't move things to compact the heap, and you end up with lots of holes between heap objects where things were collected but the space couldn't be reused because other objects didn't happen to fit. You end up with a very fragmented heap.

You're essentially preventing the compacting GC from doing any compacting at all. Why can't it compact? Because you've told it that it isn't allowed to move anything! Honestly, I don't see why you would expect pinning all these little arrays to make things any better. It just means they can't be moved around; it doesn't mean that magically they never have to be considered or freed by the GC. You only want to pin if you must for some external reason, or if the object is very big and therefore takes a significant amount of time to move.

Duncan

On Wed, 2006-10-18 at 19:50 +0400, Bulat Ziganshin wrote:
> Hello glasgow-haskell-users,
>
> i have a program that holds a lot of strings (filenames) in memory and
> uses my own packed-string library to represent these strings. this
> library uses newByteArray# to allocate strings. in my benchmark run
> (with 300,000 filenames in memory) the program prints the following
> statistics:
>
>   37,502,956 bytes maximum residency (16 sample(s))
>   2928 collections in generation 0 (  6.90s)
>     16 collections in generation 1 ( 12.20s)
>   65 Mb total memory in use
>
> most of this memory is occupied by filenames. note that 28 mb of memory
> is used just by the GC procedure (i use the compacting GC)
>
> then i thought that using pinned byte arrays should significantly
> improve memory usage, because these arrays can't be moved around and
> therefore, i thought, they would not be involved in GC. the amount
> of data which must be compacted by the GC would decrease, and the amount
> of memory wasted by the GC should also decrease
>
> so i replaced newByteArray# with newPinnedByteArray# in my packed
> string library and saw the following:
>
>   53,629,344 bytes maximum residency (14 sample(s))
>   2904 collections in generation 0 (  6.97s)
>     14 collections in generation 1 ( 13.16s)
>   100 Mb total memory in use
>
> why? why did memory usage increase? why are another 47 megs just
> wasted? how does the GC work with pinned byte arrays? and why were
> the GC times almost the same - i thought that GC for pinned data
> should be very different from GC for unpinned byte arrays

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
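[Editorial sketch, not part of the original thread: the change under discussion is a single primop. As I understand GHC's storage manager, pinned byte arrays are allocated in memory blocks that can never be moved, so a whole block stays occupied while any one small pinned array in it is alive; that is the fragmentation Duncan describes. The function names below are mine.]

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts
import GHC.IO (IO (..))

-- Allocate a movable byte array in the normal (copying/compacting) heap.
allocUnpinned :: Int -> IO ()
allocUnpinned (I# n) = IO $ \s ->
  case newByteArray# n s of
    (# s', _arr #) -> (# s', () #)

-- Identical except for one primop: the array is pinned and the GC may
-- never move it, so its block can only be freed when it is wholly dead.
allocPinned :: Int -> IO ()
allocPinned (I# n) = IO $ \s ->
  case newPinnedByteArray# n s of
    (# s', _arr #) -> (# s', () #)
```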
Re: Benchmarking GHC
On Thu, 2006-10-19 at 21:10 +0400, Bulat Ziganshin wrote:
> btw, writing this message i thought that a
> -fconvert-strings-to-ByteStrings option would give a significant boost
> to many programs without rewriting them :)

This kind of data refinement has a side condition on the strictness of the function. You need to know that your list function is strict in the spine and elements before it can be swapped for a version that operates on packed strings. If it's merely strict in the spine then one could switch to lazy arrays. There's also the possibility of using lists that are strict in the elements.

It'd be an interesting research topic to see if this strictness analysis and transformation could be done automatically, and indeed whether it is properly meaning-preserving.

Duncan
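[Editorial sketch, example mine rather than Duncan's, of the strictness side condition: a lazy list function may consume only part of an infinite spine, while the packed-string version must force the entire string before it can do anything.]

```haskell
import qualified Data.ByteString.Char8 as B

-- Works on infinite input: only 5 cons cells are ever demanded.
onList :: String -> String
onList = take 5

-- Fine once the input is already packed, but the conversion step
-- B.pack (cycle "ab") would force an infinite spine and never return,
-- so this version is not a valid replacement for onList in general.
onPacked :: B.ByteString -> B.ByteString
onPacked = B.take 5

demoLazy :: String
demoLazy = onList (cycle "ab")   -- "ababa"
```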
Re[2]: Benchmarking GHC
Hello Simon,

Thursday, October 19, 2006, 6:40:54 PM, you wrote:
> These days -O2, which invokes the SpecConstr pass, can have a big
> effect, but only on some programs.

it also enables -optc-O2. so, answering Neil's question: -O2 -funbox-strict-fields

(sidenote to SPJ: -funbox-simple-strict-fields may be a good way to make this a _safe_ optimization)

the RTS option -A10m may also be helpful (even with 6.6), so you may want to run each program twice - with and without this option - and select the best run

btw, writing this message i thought that a -fconvert-strings-to-ByteStrings option would give a significant boost to many programs without rewriting them :)

--
Best regards,
 Bulat                            mailto:[EMAIL PROTECTED]
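[Editorial sketch of how Bulat's suggested flags would be tried; the program and file names are placeholders, and on modern GHC the binary must additionally be built with -rtsopts before it will accept RTS flags.]

```shell
# Compile with the flags recommended in this thread.
ghc -O2 -funbox-strict-fields Main.hs -o prog

# Run twice and compare the GC statistics printed by +RTS -s.
./prog +RTS -s -RTS          # default allocation area
./prog +RTS -A10m -s -RTS    # 10 MB allocation area, as Bulat suggests
```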
recursive commentaries :)
Hello glasgow-haskell-users,

http://hackage.haskell.org/trac/ghc/wiki/Commentary contains a link 'The old GHC Commentary' which points to the http://www.cse.unsw.edu.au/~chak/haskell/ghc/comm/ page. guess what this page contains? :)

it would be great to restore the link to the old commentary

--
Best regards,
 Bulat                            mailto:[EMAIL PROTECTED]
RE: Benchmarking GHC
These days -O2, which invokes the SpecConstr pass, can have a big effect, but only on some programs.

Simon

| -----Original Message-----
| From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
| On Behalf Of Neil Mitchell
| Sent: 19 October 2006 11:22
| To: Donald Bruce Stewart
| Cc: GHC Users Mailing List
| Subject: Re: Benchmarking GHC
|
| Hi
|
| > > One thing that IME makes a difference is -funbox-strict-fields. It's
| > > probably better to use pragmas for this, though. Another thing to
| > > consider is garbage collection RTS flags, those can sometimes make a
| > > big difference.
|
| I _don't_ want to speed up a particular program by modifying it, I
| want to take a set of existing programs which are treated as black
| boxes, and compile them all with the same flags. I don't want to
| experiment to see which flags give the best particular result on a per
| program basis, or even for the benchmark as a whole, I just want to
| know what the "standard recommendation" is for people who want fast
| code without having to understand anything.
|
| > All this and more on the under-publicised Performance wiki,
| > http://haskell.org/haskellwiki/Performance
|
| It's a very good resource, and I've read it before :)
|
| Another way to treat my question is: the wiki says "Of course, if a
| GHC compiled program runs slower than the same program compiled with
| another Haskell compiler, then it's definitely a bug" - in this
| sentence, what does the command line look like in the GHC compiled
| case?
|
| Thanks
|
| Neil
Re[2]: Benchmarking GHC
Hello Ketil,

Thursday, October 19, 2006, 11:05:48 AM, you wrote:
> One thing that IME makes a difference is -funbox-strict-fields. It's
> probably better to use pragmas for this, though. Another thing to
> consider is garbage collection RTS flags, those can sometimes make a
> big difference.

yes, it's better to unbox individual fields. i had a program where this flag led to a significant memory usage increase. something like this:

  data T1 = T1 ...   -- many fields
  data T2 = T2 !T1 !T1 !T1

  make t1 = T2 t1 t1 t1

--
Best regards,
 Bulat                            mailto:[EMAIL PROTECTED]
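[Editorial sketch: Bulat elides T1's fields, so the concrete fields below are mine. It shows why the flag can *increase* memory: with -funbox-strict-fields (or UNPACK pragmas, used here so the example is self-contained) each strict !T1 field of T2 stores its own copy of T1's payload, so the three fields no longer share one heap object.]

```haskell
-- Two strict Int fields stand in for Bulat's "many fields".
data T1 = T1 !Int !Int
  deriving Show

-- Boxed: three pointers to one shared T1.
-- Unpacked (with -O): three independent copies of T1's two Ints.
data T2 = T2 {-# UNPACK #-} !T1 {-# UNPACK #-} !T1 {-# UNPACK #-} !T1
  deriving Show

mk :: T1 -> T2
mk t1 = T2 t1 t1 t1   -- sharing of t1 is lost once the fields are unpacked
```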
Re: Benchmarking GHC
Hello Neil,

Wednesday, October 18, 2006, 10:49:37 PM, you wrote:
> * At the moment, -O2 is unlikely to produce better code than -O.

the ghc manual is full of text that was written 10 or more years ago :)

> * When we want to go for broke, we tend to use -O2 -fvia-C
> From this I guess the answer is "-O2 -fvia-C"?

i just wanted to check whether plain "-O2" does the same

--
Best regards,
 Bulat                            mailto:[EMAIL PROTECTED]
Re: [Haskell] Expecting more inlining for bit shifting
On Wed, Oct 18, 2006 at 07:00:18AM -0400, [EMAIL PROTECTED] wrote:
> I'm not sure this approach is best. In my case the ... needs to be the
> entire body of the shift code. It would be ridiculous to have two copies
> of the same code. What would be better is a hint pragma that says,
> ``inline me if the following set of parameters are literals''.

not at all:

  {-# RULES "shift/const-inline"
     forall x y# . shift x y# = inline shift x y# #-}

of course, you would still need to make sure the body of shift is available. (perhaps instances of inline in rules should force the argument to appear in the .hi file in full)

a hint pragma would be trickier to implement and not as flexible, I would think. for instance, what if you want to inline only when one argument is an arbitrary constant, but the other arg is a certain value?

John

--
John Meacham - ⑆repetae.net⑆john⑈
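[Editorial sketch of John's trick in compilable form. The function myShift is a stand-in of mine, not the shift from the thread, and this simplified rule inlines at *every* call site rather than only at constant arguments. GHC.Exts.inline on the rule's right-hand side substitutes the function's body at the rewritten call; the phase-controlled NOINLINE keeps myShift recognizable so the rule can fire first. Rules only fire with -O.]

```haskell
import Data.Bits (shiftL)
import GHC.Exts  (inline)

myShift :: Int -> Int -> Int
myShift x n = x `shiftL` n
-- Don't inline before phase 1, so calls still look like `myShift x n`
-- when the rewrite rule below is matched against them.
{-# NOINLINE [1] myShift #-}

{-# RULES
"myShift/inline" forall x n. myShift x n = inline myShift x n
  #-}
```

With or without -O the program computes the same result; the rule only changes whether the body is substituted at the call site.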
pinned byte arrays
Hello glasgow-haskell-users,

i have a program that holds a lot of strings (filenames) in memory and uses my own packed-string library to represent these strings. this library uses newByteArray# to allocate strings. in my benchmark run (with 300,000 filenames in memory) the program prints the following statistics:

  37,502,956 bytes maximum residency (16 sample(s))
  2928 collections in generation 0 (  6.90s)
    16 collections in generation 1 ( 12.20s)
  65 Mb total memory in use

most of this memory is occupied by filenames. note that 28 mb of memory is used just by the GC procedure (i use the compacting GC)

then i thought that using pinned byte arrays should significantly improve memory usage, because these arrays can't be moved around and therefore, i thought, they would not be involved in GC. the amount of data which must be compacted by the GC would decrease, and the amount of memory wasted by the GC should also decrease

so i replaced newByteArray# with newPinnedByteArray# in my packed string library and saw the following:

  53,629,344 bytes maximum residency (14 sample(s))
  2904 collections in generation 0 (  6.97s)
    14 collections in generation 1 ( 13.16s)
  100 Mb total memory in use

why? why did memory usage increase? why are another 47 megs just wasted? how does the GC work with pinned byte arrays? and why were the GC times almost the same - i thought that GC for pinned data should be very different from GC for unpinned byte arrays

--
Best regards,
 Bulat                            mailto:[EMAIL PROTECTED]
Re: Benchmarking GHC
Hi

> One thing that IME makes a difference is -funbox-strict-fields. It's
> probably better to use pragmas for this, though. Another thing to
> consider is garbage collection RTS flags, those can sometimes make a
> big difference.

I _don't_ want to speed up a particular program by modifying it, I want to take a set of existing programs which are treated as black boxes, and compile them all with the same flags. I don't want to experiment to see which flags give the best particular result on a per program basis, or even for the benchmark as a whole, I just want to know what the "standard recommendation" is for people who want fast code without having to understand anything.

> All this and more on the under-publicised Performance wiki,
> http://haskell.org/haskellwiki/Performance

It's a very good resource, and I've read it before :)

Another way to treat my question is: the wiki says "Of course, if a GHC compiled program runs slower than the same program compiled with another Haskell compiler, then it's definitely a bug" - in this sentence, what does the command line look like in the GHC compiled case?

Thanks

Neil
Re: Benchmarking GHC
ketil+haskell:
> "Neil Mitchell" <[EMAIL PROTECTED]> writes:
>
> > I want to benchmark GHC vs some other Haskell compilers, what flags
> > should I use?
> >
> > [...] I guess the answer is "-O2 -fvia-C"?
>
> I tend to use -O2, but haven't really tested it against plain -O.
> From what I've seen -fvia-C is sometimes faster, sometimes slower; I
> tend to cross my fingers and hope the compiler uses sensible defaults
> on the current architecture.
>
> One thing that IME makes a difference is -funbox-strict-fields. It's
> probably better to use pragmas for this, though. Another thing to
> consider is garbage collection RTS flags, those can sometimes make a
> big difference.

All this and more on the under-publicised Performance wiki,

  http://haskell.org/haskellwiki/Performance

-- Don
Re: Benchmarking GHC
"Neil Mitchell" <[EMAIL PROTECTED]> writes:

> I want to benchmark GHC vs some other Haskell compilers, what flags
> should I use?
>
> [...] I guess the answer is "-O2 -fvia-C"?

I tend to use -O2, but haven't really tested it against plain -O. From what I've seen -fvia-C is sometimes faster, sometimes slower; I tend to cross my fingers and hope the compiler uses sensible defaults on the current architecture.

One thing that IME makes a difference is -funbox-strict-fields. It's probably better to use pragmas for this, though. Another thing to consider is garbage collection RTS flags; those can sometimes make a big difference.

-k
--
If I haven't seen further, it is by standing in the footprints of giants
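[Editorial sketch of the candidate "black box" command lines the thread converges on; file and program names are placeholders.]

```shell
# The three configurations being compared in this thread.
ghc -O       Main.hs -o bench-O
ghc -O2      Main.hs -o bench-O2
ghc -O2 -fvia-C Main.hs -o bench-viac

# Run each on the same input and compare wall-clock times.
time ./bench-O
time ./bench-O2
time ./bench-viac
```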