Re: [Haskell-cafe] [Newbie] Problem with Data.Map (or something else ?)
2008/4/1, Chaddaï Fouché [EMAIL PROTECTED]: 2008/3/31, Bruno Carnazzi [EMAIL PROTECTED]: Dears Haskellers, As an Haskell newbie, I'm learning Haskell by trying to resolve Euler Project problems (http://projecteuler.net/ ). I'm hanging on problem 14 (Collatz problem). I've written the following program... Which does not end in a reasonable time :( My algorithm seems ok to me but I see that memory consumption is gigantic... Is this a memory problem with Data.Map ? Or an infinite loop ? (Where ?) In a more general way, how can I troubleshoot these kind of problem ? Others have pointed potential source of memory leaks, but I must say that using Data.Map for the cache in the first place appear to me as a very bad idea... Data.Map by nature take much more place than necessary. You have an integer index, why not use an array instead ? Because I don't know anything about arrays in Haskell. Thank you for pointing this, I have to read some more Haskell manuals :) import Data.Array import Data.List import Data.Ord syrs n = a where a = listArray (1,n) $ 0:[ syr n x | x - [2..n]] syr n x = if x' = n then a ! x' else 1 + syr n x' where x' = if even x then x `div` 2 else 3 * x + 1 main = print $ maximumBy (comparing snd) $ assocs $ syrs 100 The logic and the complexity in this algorithm is comparable to mine but the performance difference is huge, which is not very intuitive in my mind (There is no 1+1+1+1+1... problem with array ?) This solution takes 2 seconds (on my machine) to resolve the problem. On the other hand, now that I have read your solution, I see that using Map was the least of the problem... All those Map.map, while retaining the original Map... Your solution is too clever (twisted) for its own good, I suggest you aim for simplicity next time. -- Jedaï Thank you, Bruno. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [Newbie] Problem with Data.Map (or something else ?)
2008/4/1, Bruno Carnazzi [EMAIL PROTECTED]: Because I don't know anything about arrays in Haskell. Thank you for pointing this, I have to read some more Haskell manuals :) A good place to learn about Haskell's array (which come in many flavours) is this wiki page : http://www.haskell.org/haskellwiki/Modern_array_libraries import Data.Array import Data.List import Data.Ord syrs n = a where a = listArray (1,n) $ 0:[ syr n x | x - [2..n]] syr x = if x' = n then 1 + a ! x' else 1 + syr x' where x' = if even x then x `div` 2 else 3 * x + 1 main = print $ maximumBy (comparing snd) $ assocs $ syrs 100 The logic and the complexity in this algorithm is comparable to mine but the performance difference is huge, which is not very intuitive in my mind (There is no 1+1+1+1+1... problem with array ?) Array or Map isn't really the problem here (my algorithm with a Map instead only take 6s to find the solution) as I thought at first. The main problem in your code I think is that because of Map.map, you create multiple copies of your smaller Maps in memory and union force them to materialize, while the fact that you don't evaluate the value means the GC won't collect them. Anyway, your algorithm by itself is pretty slow I think, since for every step to a number which is not already recorded you must add 1 to all the numbers you passed on the way. -- Jedaï ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] [Newbie] Problem with Data.Map (or something else ?)
Dears Haskellers, As an Haskell newbie, I'm learning Haskell by trying to resolve Euler Project problems (http://projecteuler.net/ ). I'm hanging on problem 14 (Collatz problem). I've written the following program... Which does not end in a reasonable time :( My algorithm seems ok to me but I see that memory consumption is gigantic... Is this a memory problem with Data.Map ? Or an infinite loop ? (Where ?) In a more general way, how can I troubleshoot these kind of problem ? Here's the code : import qualified Data.List as List import qualified Data.Map as Map f n | even n = n `div` 2 | otherwise = 3 * n + 1 chain m n = let chain' cn cm | Map.member cn m = Map.map (+ (m Map.! cn)) cm | otherwise = chain' (f cn) $! Map.insert cn 1 (Map.map (+1) cm) in chain' n Map.empty chains n = List.foldl' (\m i - Map.union m (chain m i)) (Map.singleton 1 1) [2..n] maxCollatz c1@(_,l1) c2@(_,l2) | l1 l2 = c2 | otherwise = c1 maxChain = List.foldl' maxCollatz (0,0) . Map.toList . chains main = let n = 100 in putStrLn $ show $ maxChain n Hope someone can help me, I really don't see what is th problem... Best regards, Bruno. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [Newbie] Problem with Data.Map (or something else ?)
Hello Bruno, Monday, March 31, 2008, 7:51:43 PM, you wrote: I've written the following program... Which does not end in a reasonable time :( My algorithm seems ok to me but I see that memory consumption is gigantic... Is this a memory problem with Data.Map ? Or an infinite loop ? (Where ?) In a more general way, how can I troubleshoot these kind of problem ? first step is to reduce n and see whether program will finish and how memreqs depends on value of n main = let n = 100 -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [Newbie] Problem with Data.Map (or something else ?)
The program ends for values up to 40 : *Main :set +s *Main maxChain 1000 (871,179) (0.09 secs, 3697648 bytes) *Main maxChain 1 (6171,262) (0.73 secs, 31560008 bytes) *Main maxChain 10 (77031,351) (9.31 secs, 347122064 bytes) *Main maxChain 20 (156159,383) (19.32 secs, 709303708 bytes) This one take about 10 minutes and swap a lot *Main maxChain 30 (230631,443) (38.02 secs, 1083800124 bytes) This one swap a lot and does not end in less than 10 minutes *Main maxChain 40 The ratio memreq/n seems to be more or less constant : *Main 3697648/1000 3697.648 *Main 31560008/1 3156.0008 *Main 347122064/10 3471.22064 *Main 709303708/20 3546.51854 *Main 1083800124/30 3612.66708 Thank you, Bruno. 2008/3/31, Bulat Ziganshin [EMAIL PROTECTED]: Hello Bruno, Monday, March 31, 2008, 7:51:43 PM, you wrote: I've written the following program... Which does not end in a reasonable time :( My algorithm seems ok to me but I see that memory consumption is gigantic... Is this a memory problem with Data.Map ? Or an infinite loop ? (Where ?) In a more general way, how can I troubleshoot these kind of problem ? first step is to reduce n and see whether program will finish and how memreqs depends on value of n main = let n = 100 -- Best regards, Bulatmailto:[EMAIL PROTECTED] ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [Newbie] Problem with Data.Map (or something else ?)
Bruno Carnazzi [EMAIL PROTECTED] writes: The program ends for values up to 40 : Wild guess here - I know nothing about the problem, and haven't examined your program in detail - but could it be that you default to Int, and that it wraps silently at some power of two, thereby making your algorithm wrap around? Try to stick some 'Integer' type annotations in there, and see if it helps. -k (cetera censeo...) -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [Newbie] Problem with Data.Map (or something else ?)
I've done this modification with no more success : import qualified Data.List as List import qualified Data.Map as Map f :: Integer - Integer f n | even n = n `div` 2 | otherwise = 3 * n + 1 chain m n = let chain' cn cm | Map.member cn m = Map.map (+ (m Map.! cn)) cm | otherwise = chain' (f cn) $! Map.insert cn 1 (Map.map (+1) cm) in chain' n Map.empty chains n = List.foldl' (\m i - Map.union m (chain m i)) (Map.singleton 1 1) [2..n] maxCollatz c1@(_,l1) c2@(_,l2) | l1 l2 = c2 | otherwise = c1 maxChain = List.foldl' maxCollatz (0,0) . Map.toList . chains main = let n = 100 in putStrLn $ show $ maxChain n Best regards, Bruno. 2008/3/31, Ketil Malde [EMAIL PROTECTED]: Bruno Carnazzi [EMAIL PROTECTED] writes: The program ends for values up to 40 : Wild guess here - I know nothing about the problem, and haven't examined your program in detail - but could it be that you default to Int, and that it wraps silently at some power of two, thereby making your algorithm wrap around? Try to stick some 'Integer' type annotations in there, and see if it helps. -k (cetera censeo...) -- If I haven't seen further, it is by standing in the footprints of giants ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [Newbie] Problem with Data.Map (or something else ?)
On Mon, Mar 31, 2008 at 6:00 PM, Bruno Carnazzi [EMAIL PROTECTED] wrote: I've done this modification with no more success : import qualified Data.List as List import qualified Data.Map as Map f :: Integer - Integer f n | even n = n `div` 2 | otherwise = 3 * n + 1 chain m n = let chain' cn cm | Map.member cn m = Map.map (+ (m Map.! cn)) cm | otherwise = chain' (f cn) $! Map.insert cn 1 (Map.map (+1) cm) in chain' n Map.empty This function raises a red flag for me. The collatz sequence gets _very_ big, with very many distinct numbers. You are saving the length of the resulting chain for each of those numbers, which is going to take a lot of memory. But Map is also lazy in its values, so the values you are storing once chain finishes will look like: 1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1+1 Instead of 22, which is taking quite a lot of memory as well. My impression is that the caching approach is just a bad idea. It is not necessary to solve the problem efficiently; a brute force approach runs in under 1 minute in constant memory for me. Try to simplify your approach. I'd suggest generating a list representing the collatz sequence starting at the number, then just taking its 'length'. Do that for each number and find the maximum. There should be no need for Data.Map. Luke ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [Newbie] Problem with Data.Map (or something else ?)
2008/3/31, Bruno Carnazzi [EMAIL PROTECTED]: Dears Haskellers, As an Haskell newbie, I'm learning Haskell by trying to resolve Euler Project problems (http://projecteuler.net/ ). I'm hanging on problem 14 (Collatz problem). I've written the following program... Which does not end in a reasonable time :( My algorithm seems ok to me but I see that memory consumption is gigantic... Is this a memory problem with Data.Map ? Or an infinite loop ? (Where ?) In a more general way, how can I troubleshoot these kind of problem ? Others have pointed potential source of memory leaks, but I must say that using Data.Map for the cache in the first place appear to me as a very bad idea... Data.Map by nature take much more place than necessary. You have an integer index, why not use an array instead ? import Data.Array import Data.List import Data.Ord syrs n = a where a = listArray (1,n) $ 0:[ syr n x | x - [2..n]] syr n x = if x' = n then a ! x' else 1 + syr n x' where x' = if even x then x `div` 2 else 3 * x + 1 main = print $ maximumBy (comparing snd) $ assocs $ syrs 100 This solution takes 2 seconds (on my machine) to resolve the problem. On the other hand, now that I have read your solution, I see that using Map was the least of the problem... All those Map.map, while retaining the original Map... Your solution is too clever (twisted) for its own good, I suggest you aim for simplicity next time. -- Jedaï ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] [Newbie] Problem with Data.Map (or something else ?)
chaddai.fouche: 2008/3/31, Bruno Carnazzi [EMAIL PROTECTED]: Dears Haskellers, As an Haskell newbie, I'm learning Haskell by trying to resolve Euler Project problems (http://projecteuler.net/ ). I'm hanging on problem 14 (Collatz problem). I've written the following program... Which does not end in a reasonable time :( My algorithm seems ok to me but I see that memory consumption is gigantic... Is this a memory problem with Data.Map ? Or an infinite loop ? (Where ?) In a more general way, how can I troubleshoot these kind of problem ? Others have pointed potential source of memory leaks, but I must say that using Data.Map for the cache in the first place appear to me as a very bad idea... Data.Map by nature take much more place than necessary. You have an integer index, why not use an array instead ? import Data.Array import Data.List import Data.Ord syrs n = a where a = listArray (1,n) $ 0:[ syr n x | x - [2..n]] syr n x = if x' = n then a ! x' else 1 + syr n x' where x' = if even x then x `div` 2 else 3 * x + 1 main = print $ maximumBy (comparing snd) $ assocs $ syrs 100 This solution takes 2 seconds (on my machine) to resolve the problem. On the other hand, now that I have read your solution, I see that using Map was the least of the problem... All those Map.map, while retaining the original Map... Your solution is too clever (twisted) for its own good, I suggest you aim for simplicity next time. and if its got Int indices, Data.IntMap is always a better option than Data.Map and usually outperforms the HashTable type, while being pure. ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe