RE: [Haskell-cafe] using Map rather than FiniteMap
On 25 January 2005 23:27, S. Alexander Jacobson wrote: Oops. It pays to check your checking code before making posts like this. After actually running the correct test, I am still getting semi-ridiculous space behavior (6k/pair)! import qualified Map zipped =zip [1..] [1..10]::[(Int,Int)] untup f (x,y) = f x y produce = foldr (untup Map.insert) Map.empty zipped fm = length $ Map.keys produce main = print $ fm Has this profile: example +RTS -p -K5M -RTS total time =5.10 secs (255 ticks @ 20 ms) total alloc = 612,534,236 bytes (excludes profiling overheads) COST CENTREMODULE %time %alloc balanceMap 71.8 72.8 insert Map 12.2 10.8 size Map9.09.7 binMap2.42.2 rotateRMap1.60.3 zipped Main 0.81.9 Notice the 612Mb for storing a mapping from Int to Int!! Notice that's 612Mb *allocation* to create the finite map and deconstruct it into a list. Not 612Mb of live heap to store the finite map. If you make sure everything is evaluated, and examine the heap profile I'm sure you'll find that the finite map is taking a reasonable amount of space (perhaps 20-30 bytes per node for storing integers, I guess). To get a rough idea of the total live heap without profiling, you can run the program with +RTS -sstderr and look at the memory in use figure. Cheers, Simon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: [Haskell-cafe] using Map rather than FiniteMap
On 26 January 2005 14:29, S. Alexander Jacobson wrote: Ah, ok. So I ran the code with 10 items, 5 items, and 25000 items and got total memory in use of 28Mb, 15Mb, and 8Mb respectively. That comes to 260-280 bytes per record which is still an order of magnitude higher than the 20-30 bytes per record we would expect. When using the ordinary 2-generation collector, memory in use will tend to be 2-3 times larger than the actual residency, because this is a copying collector. On the other hand, I found 10mb, 5mb, and 2.5mb maximum residency, but that is still ~100 bytes per record. Right. Lastly, I tried example +RTS -p -K5M -hc and then looked at the resulting graph (attachment #2) and it shows a total of 1.6Mb heap allocated and if that data is correct, it does correspond roughly to our 20-30 bytes per record estimate. That profile doesn't include the stack, which sounds reasonable. Regarding stack, I tried example +RTS -p -K5M -xt -hc and then ran hp2ps and looked at the resulting graph (attachment #1) SYSTEM appears to use 4mb of stack at the very beginning (presumably to create zipped), but I don't know why it would. Then this stack is consumed by the rest of the program. Stacks never get smaller in GHC, only bigger. If you need 4Mb of stack at one point in your program, you will forever have a 4Mb stack after that (fixing this wouldn't really buy you much; the memory doesn't get traversed or copied by the GC - but one thing you could do is madvise() to tell the OS it doesn't have to page the memory, though). I haven't looked at your code in detail, hopefully someone else can shed some light on why you're building up such a large stack in the first place. Cheers, Simon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] using Map rather than FiniteMap
S. Alexander Jacobson wrote: zipped =zip [1..] [1..10]::[(Int,Int)] untup f (x,y) = f x y produce = foldr (untup Map.insert) Map.empty zipped fm = length $ Map.keys produce main = print $ fm Has this profile: example +RTS -p -K5M -RTS total time =5.10 secs (255 ticks @ 20 ms) total alloc = 612,534,236 bytes (excludes profiling overheads) COST CENTREMODULE %time %alloc balanceMap 71.8 72.8 insert Map 12.2 10.8 size Map9.09.7 binMap2.42.2 rotateRMap1.60.3 zipped Main 0.81.9 I get similar results without optimizations and much better ones using only about 160MB with -O. Cheers Christian -- unoptimized testMap +RTS -p -K5m -RTS total time =6.34 secs (317 ticks @ 20 ms) total alloc = 612,549,684 bytes (excludes profiling overheads) COST CENTREMODULE %time %alloc balanceCommon.Lib.Map74.1 72.8 insert Common.Lib.Map 9.8 10.8 size Common.Lib.Map 9.59.7 binCommon.Lib.Map 1.62.2 zipped Main 1.31.9 -- optimized results ghc --make TestMap.hs -O -prof -auto-all -o testMap testMap +RTS -p -K5m -RTS total time =1.22 secs (61 ticks @ 20 ms) total alloc = 159,737,668 bytes (excludes profiling overheads) COST CENTREMODULE %time %alloc insert Common.Lib.Map39.30.5 balanceCommon.Lib.Map24.6 47.4 size Common.Lib.Map21.3 34.1 foldR Common.Lib.Map 3.32.5 zipped Main 3.36.5 untup Main 3.30.8 produceMain 3.30.8 singleRCommon.Lib.Map 1.60.0 toAscList Common.Lib.Map 0.01.5 single Common.Lib.Map 0.01.5 keys Common.Lib.Map 0.01.5 binCommon.Lib.Map 0.03.0 ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] using Map rather than FiniteMap
On Tue, 25 Jan 2005, David Menendez wrote: Does having 'zipped' at the top level mean that the program is keeping the entire 100,000-element list in memory? I don't know, but I tested with zipped at the top, in the where, and it appears to make no performance or memory difference. Also, would performance improve if you used Map.fromList? How about Map.fromAscList? HUGE IMPROVEMENT. The old code had maximum residency of 11Mb and running time of 41s. The code using Map.fromList has max. residency of 6.4Mb and running time of 5.2s! I guess foldlStrict is HUGELY more efficient than than foldr. (Why isn't it in the prelude?) But even using foldrStrict (fromList), we are still using 60 bytes per item and have ~30 bytes unaccounted for. Any hints? import qualified Map main = print $ length $ Map.keys theMap where zipped =zip [1..] [1..10]::[(Int,Int)] theMap = Map.fromList zipped -Alex- __ S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com S. Alexander Jacobson writes: After actually running the correct test, I am still getting semi-ridiculous space behavior (6k/pair)! import qualified Map zipped =zip [1..] [1..10]::[(Int,Int)] untup f (x,y) = f x y produce = foldr (untup Map.insert) Map.empty zipped fm = length $ Map.keys produce main = print $ fm Two questions I'm currently too lazy to investigate myself: -- David Menendez [EMAIL PROTECTED] | In this house, we obey the laws http://www.eyrie.org/~zednenem |of thermodynamics! ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: [Haskell-cafe] using Map rather than FiniteMap
Ah, ok. So I ran the code with 10 items, 5 items, and 25000 items and got total memory in use of 28Mb, 15Mb, and 8Mb respectively. That comes to 260-280 bytes per record which is still an order of magnitude higher than the 20-30 bytes per record we would expect. On the other hand, I found 10mb, 5mb, and 2.5mb maximum residency, but that is still ~100 bytes per record. Lastly, I tried example +RTS -p -K5M -hc and then looked at the resulting graph (attachment #2) and it shows a total of 1.6Mb heap allocated and if that data is correct, it does correspond roughly to our 20-30 bytes per record estimate. Regarding stack, I tried example +RTS -p -K5M -xt -hc and then ran hp2ps and looked at the resulting graph (attachment #1) SYSTEM appears to use 4mb of stack at the very beginning (presumably to create zipped), but I don't know why it would. Then this stack is consumed by the rest of the program. Can you provide some guidance on interpreting this data? -Alex- __ S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com On Wed, 26 Jan 2005, Simon Marlow wrote: On 25 January 2005 23:27, S. Alexander Jacobson wrote: Oops. It pays to check your checking code before making posts like this. After actually running the correct test, I am still getting semi-ridiculous space behavior (6k/pair)! import qualified Map zipped =zip [1..] [1..10]::[(Int,Int)] untup f (x,y) = f x y produce = foldr (untup Map.insert) Map.empty zipped fm = length $ Map.keys produce main = print $ fm Has this profile: example +RTS -p -K5M -RTS total time =5.10 secs (255 ticks @ 20 ms) total alloc = 612,534,236 bytes (excludes profiling overheads) COST CENTREMODULE %time %alloc balanceMap 71.8 72.8 insert Map 12.2 10.8 size Map9.09.7 binMap2.42.2 rotateRMap1.60.3 zipped Main 0.81.9 Notice the 612Mb for storing a mapping from Int to Int!! Notice that's 612Mb *allocation* to create the finite map and deconstruct it into a list. Not 612Mb of live heap to store the finite map. If you make sure everything is evaluated, and examine the heap profile I'm sure you'll find that the finite map is taking a reasonable amount of space (perhaps 20-30 bytes per node for storing integers, I guess). To get a rough idea of the total live heap without profiling, you can run the program with +RTS -sstderr and look at the memory in use figure. Cheers, Simon example.ps Description: PostScript document example2.ps Description: PostScript document ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] using Map rather than FiniteMap
Just did a search after my last post and learned that FiniteMap is bad. Discoverd that Data.Map is the intended replacement. Downloaded it and modified it to work with 6.2. Blazingly fast! Yay. Please ignore the prior message. -Alex- __ S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] using Map rather than FiniteMap
S. Alexander Jacobson [EMAIL PROTECTED] writes: Just did a search after my last post and learned that FiniteMap is bad. Discoverd that Data.Map is the intended replacement. Downloaded it and modified it to work with 6.2. Blazingly fast! Oh? I was aware that Data.Map was supposed to be better, but not that FiniteMap was particularly bad. Is there any information about when, why and by how much Data.Map is better? -kzm -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] using Map rather than FiniteMap
I didn't find any such information. I just decided to look at the FiniteMap source code in CVS and discovered in the comments that it was deprecated in favor of Data.Map. So I downloaded the new Data.Map and Data.Set and ran the code I posted before. I executed basically instantly and used minimal memory. I think Map picked up a lot of speed because it doesn't spaceleak, but am not sure. -Alex- On Tue, 25 Jan 2005, Ketil Malde wrote: S. Alexander Jacobson [EMAIL PROTECTED] writes: Just did a search after my last post and learned that FiniteMap is bad. Discoverd that Data.Map is the intended replacement. Downloaded it and modified it to work with 6.2. Blazingly fast! Oh? I was aware that Data.Map was supposed to be better, but not that FiniteMap was particularly bad. Is there any information about when, why and by how much Data.Map is better? -kzm -- If I haven't seen further, it is by standing in the footprints of giants __ S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] using Map rather than FiniteMap
Just did a search after my last post and learned that FiniteMap is bad. Discoverd that Data.Map is the intended replacement. Downloaded it and modified it to work with 6.2. Blazingly fast! Yay. Hi, just curious, How much trouble was getting it to work with ghc 6.2 and adapting your program to use the API of Data.Map instead of Data.FiniteMap? J.A. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] using Map rather than FiniteMap
Oops. It pays to check your checking code before making posts like this. After actually running the correct test, I am still getting semi-ridiculous space behavior (6k/pair)! import qualified Map zipped =zip [1..] [1..10]::[(Int,Int)] untup f (x,y) = f x y produce = foldr (untup Map.insert) Map.empty zipped fm = length $ Map.keys produce main = print $ fm Has this profile: example +RTS -p -K5M -RTS total time =5.10 secs (255 ticks @ 20 ms) total alloc = 612,534,236 bytes (excludes profiling overheads) COST CENTREMODULE %time %alloc balanceMap 71.8 72.8 insert Map 12.2 10.8 size Map9.09.7 binMap2.42.2 rotateRMap1.60.3 zipped Main 0.81.9 Notice the 612Mb for storing a mapping from Int to Int!! Also 5M of stack is required to make this work. Can anyone help? -Alex- __ S. Alexander Jacobson tel:917-770-6565 http://alexjacobson.com On Tue, 25 Jan 2005, Jorge Adriano Aires wrote: Just did a search after my last post and learned that FiniteMap is bad. Discoverd that Data.Map is the intended replacement. Downloaded it and modified it to work with 6.2. Blazingly fast! Yay. Hi, just curious, How much trouble was getting it to work with ghc 6.2 and adapting your program to use the API of Data.Map instead of Data.FiniteMap? J.A. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] using Map rather than FiniteMap
S. Alexander Jacobson writes: After actually running the correct test, I am still getting semi-ridiculous space behavior (6k/pair)! import qualified Map zipped =zip [1..] [1..10]::[(Int,Int)] untup f (x,y) = f x y produce = foldr (untup Map.insert) Map.empty zipped fm = length $ Map.keys produce main = print $ fm Two questions I'm currently too lazy to investigate myself: Does having 'zipped' at the top level mean that the program is keeping the entire 100,000-element list in memory? Also, would performance improve if you used Map.fromList? How about Map.fromAscList? -- David Menendez [EMAIL PROTECTED] | In this house, we obey the laws http://www.eyrie.org/~zednenem |of thermodynamics! ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe