Send Beginners mailing list submissions to beginners@haskell.org To subscribe or unsubscribe via the World Wide Web, visit http://www.haskell.org/mailman/listinfo/beginners or, via email, send a message with subject or body 'help' to beginners-requ...@haskell.org
You can reach the person managing the list at beginners-ow...@haskell.org When replying, please edit your Subject line so it is more specific than "Re: Contents of Beginners digest..." Today's Topics: 1. Excess mem consumption in file IO task (Moritz Tacke) 2. Re: Excess mem consumption in file IO task (Ertugrul Soeylemez) 3. Some confusion still with a YAHT example (David Schonberger) 4. Re: Some confusion still with a YAHT example (Daniel Fischer) 5. Re: Some confusion still with a YAHT example (Brandon S. Allbery KF8NH) ---------------------------------------------------------------------- Message: 1 Date: Tue, 6 Jan 2009 14:27:35 +0100 From: "Moritz Tacke" <t...@informatik.uni-freiburg.de> Subject: [Haskell-beginners] Excess mem consumption in file IO task To: beginners@haskell.org Message-ID: <814cf9120901060527o6e6a124ckf9c201188c9d8...@mail.gmail.com> Content-Type: text/plain; charset=ISO-8859-1 Hi! I have some resource problems when extracting data from a file. The task is as follows: I have a huge (500MB) binary file, containing some interesting parts and lots of rubbish. Furthermore, there is a directory that tells me the parts of the file (first- and last byte index) that contain the substrings I need. My approach to do this is to open the file and to pass the list of addresses along with the handle to a function that processes the list step-by-step and calls a subfunction which uses the handle to seek the start position of the interesting block, reads the block into a bytestring (lazy or not, didn't make any difference here) and calls the function that scans this byte string for the interesting part. Using this approach - which results in a data structure with an approximate size of 10 MB - the program uses hundreds of megabytes of RAM, which forces my computer to swap (with the obvious results...). I have right now two main suspects: The recursive function is tail-recursive, but I don't know whether the usual way to write these functions (with an accumulator etc) works in monadic code (the stage is, of course, the IO monad, and I am using the do-notation as I don't like the only other way I know, writing lambdas and lambdas and lambdas into the function body). The other problem I can imagine is the passing-around of the file handle, and the subsequent reading of byte strings: Are those strings somehow attached to the handle, and does the handle work in a different way than I expected, i.e. is the handle copied while using it as an argument for another function, and exists something like a register of handles that keeps the connection upright and, therefore, excludes the (handle, string)-chunk from garbage collection? I have, of course, been experimenting with the "seq" - function, but, honestly, I am not sure whether I got it right. Does a call to "identity $! (function arguments ...)" force the full evaluation of the function? Greetings! Moritz ------------------------------ Message: 2 Date: Wed, 7 Jan 2009 09:54:00 +0100 From: Ertugrul Soeylemez <e...@ertes.de> Subject: [Haskell-beginners] Re: Excess mem consumption in file IO task To: beginners@haskell.org Message-ID: <20090107095400.74a0d...@tritium.xx> Content-Type: text/plain; charset=US-ASCII "Moritz Tacke" <t...@informatik.uni-freiburg.de> wrote: > I have some resource problems when extracting data from a file. The > task is as follows: I have a huge (500MB) binary file, containing some > interesting parts and lots of rubbish. Furthermore, there is a > directory that tells me the parts of the file (first- and last byte > index) that contain the substrings I need. My approach to do this is > to open the file and to pass the list of addresses along with the > handle to a function that processes the list step-by-step and calls a > subfunction which uses the handle to seek the start position of the > interesting block, reads the block into a bytestring (lazy or not, > didn't make any difference here) and calls the function that scans > this byte string for the interesting part. Using this approach - which > results in a data structure with an approximate size of 10 MB - the > program uses hundreds of megabytes of RAM, which forces my computer to > swap (with the obvious results...). You may want to post the relevant parts of your source code on hpaste.org for reference. > I have right now two main suspects: The recursive function is > tail-recursive, but I don't know whether the usual way to write these > functions (with an accumulator etc) works in monadic code (the stage > is, of course, the IO monad, and I am using the do-notation as I don't > like the only other way I know, writing lambdas and lambdas and > lambdas into the function body). The other problem I can imagine is > the passing-around of the file handle, and the subsequent reading of > byte strings: Are those strings somehow attached to the handle, and > does the handle work in a different way than I expected, i.e. is the > handle copied while using it as an argument for another function, and > exists something like a register of handles that keeps the connection > upright and, therefore, excludes the (handle, string)-chunk from > garbage collection? Usually no, unless you read the file with a lazy read function like hGetContents. And the normal notation and the do-notation are equivalent. When compiling, the do-notation is simply translated to the normal notation. > I have, of course, been experimenting with the "seq" - function, but, > honestly, I am not sure whether I got it right. Does a call to > "identity $! (function arguments ...)" force the full evaluation of > the function? No, a `seq` b says that before evaluating 'b', 'a' should be evaluated. The function itself may treat its arguments lazily, which makes a difference, when it's recursive. Greets, Ertugrul. -- nightmare = unsafePerformIO (getWrongWife >>= sex) http://blog.ertes.de/ ------------------------------ Message: 3 Date: Wed, 7 Jan 2009 16:31:04 -0800 From: David Schonberger <llp_...@hotmail.com> Subject: [Haskell-beginners] Some confusion still with a YAHT example To: <beginners@haskell.org> Message-ID: <bay126-w305edc182594e7c68032c29a...@phx.gbl> Content-Type: text/plain; charset="iso-8859-1" Hello all. This is my first post and I'm essentially a rank beginner with Haskell, having used it briefly some 7 years ago. Glad to be back but struggling with syntax and understanding error messages. This post actually ties to a prior post, http://www.haskell.org/pipermail/beginners/2008-December/000583.html My troubles are with exercise 3.10, p45 of from HDIII's "Yet Another Haskell Tutorial" (YAHT). The exercise states: Write a program that will repeatedly ask the user for numbers until she types in zero, at which point it will tell her the sum of all the numbers, the product of all the numbers, and, for each number, its factorial. For instance, a session might look like: Note that the sample session accompanying the exercise suggests a session such as: Give me a number (or 0 to stop): 5 Give me a number (or 0 to stop): 8 Give me a number (or 0 to stop): 2 Give me a number (or 0 to stop): 0 The sum is 15 The product is 80 5 factorial is 120 8 factorial is 40320 2 factorial is 2 The following code handles the sum and products pieces--I built the module incrementally--but fails on the factorial part. In fact my current code, if it worked, would only output something like: The sum is 15 The product is 80 120 40320 2 But I'm not even getting that much. Here's the code: --begin code module AskForNumbers whereimport IO askForNums = do putStrLn "Enter a pos int or 0 to end: " numStr <- getLine let num = read numStr if num == 0 then return [] else do rest <- askForNums return (num:rest) listFactorial l = if length l == 0 then return 1 else do fact (head l) listFactorial (tail l) fact n = if n == 0 then return 1 else return foldr (*) 1 [1..n] f = do nums <- askForNums putStr ("Sum is " ++ (show (foldr (+) 0 nums)) ++ "\n") putStr ("Product is " ++ (show (foldr (*) 1 nums)) ++ "\n") listFactorial nums --end code Here is the error msg I get when I load into WinHugs (Sept 2006 version): ERROR file:.\AskForNumbers.hs:22 - Ambiguous type signature in inferred type *** ambiguous type : (Num a, Num [a], Monad ((->) [b]), Num c, Num (b -> [a] -> [a]), Enum a, Monad ((->) (c -> c -> c))) => a -> [b] -> [a] *** assigned to : fact Ambiguous type assigned to fact. Ok. Not sure what to make of that or how to correct it. I though I was passing fact an Integer, since I think nums is a list of Integers, since the sum and product lines in f work ok. Help? Thanks. David -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.haskell.org/pipermail/beginners/attachments/20090107/525c4d9a/attachment-0001.htm ------------------------------ Message: 4 Date: Thu, 8 Jan 2009 03:35:39 +0100 From: Daniel Fischer <daniel.is.fisc...@web.de> Subject: Re: [Haskell-beginners] Some confusion still with a YAHT example To: David Schonberger <llp_...@hotmail.com>, <beginners@haskell.org> Message-ID: <200901080335.39450.daniel.is.fisc...@web.de> Content-Type: text/plain; charset="iso-8859-1" Am Donnerstag, 8. Januar 2009 01:31 schrieb David Schonberger: > Hello all. This is my first post and I'm essentially a rank beginner with > Haskell, having used it briefly some 7 years ago. Glad to be back but > struggling with syntax and understanding error messages. > > This post actually ties to a prior post, > http://www.haskell.org/pipermail/beginners/2008-December/000583.html > > My troubles are with exercise 3.10, p45 of from HDIII's "Yet Another > Haskell Tutorial" (YAHT). The exercise states: > > Write a program that will repeatedly ask the user for numbers until she > types in zero, at which point it will tell her the sum of all the numbers, > the product of all the numbers, and, for each number, its factorial. For > instance, a session might look like: > > Note that the sample session accompanying the exercise suggests a session > such as: > > Give me a number (or 0 to stop): > 5 > Give me a number (or 0 to stop): > 8 > Give me a number (or 0 to stop): > 2 > Give me a number (or 0 to stop): > 0 > The sum is 15 > The product is 80 > 5 factorial is 120 > 8 factorial is 40320 > 2 factorial is 2 > > The following code handles the sum and products pieces--I built the module > incrementally--but fails on the factorial part. In fact my current code, if > it worked, would only output something like: > > The sum is 15 > The product is 80 > 120 > 40320 > 2 > > But I'm not even getting that much. Here's the code: > > --begin code > > module AskForNumbers whereimport IO askForNums = do putStrLn "Enter a pos > int or 0 to end: " numStr <- getLine let num = read numStr if num == 0 > then return [] else do rest <- askForNums return (num:rest) > > listFactorial l = if length l == 0 then return 1 else do fact (head l) > listFactorial (tail l) > > fact n = if n == 0 then return 1 else return foldr (*) 1 [1..n] f = do > nums <- askForNums putStr ("Sum is " ++ (show (foldr (+) 0 nums)) ++ "\n") > putStr ("Product is " ++ (show (foldr (*) 1 nums)) ++ "\n") listFactorial > nums > > --end code > > Here is the error msg I get when I load into WinHugs (Sept 2006 version): > > ERROR file:.\AskForNumbers.hs:22 - Ambiguous type signature in inferred > type *** ambiguous type : (Num a, Num [a], Monad ((->) [b]), Num c, Num (b > -> [a] -> [a]), Enum a, Monad ((->) (c -> c -> c))) => a -> [b] -> [a] *** > assigned to : fact > > Ambiguous type assigned to fact. Ok. Not sure what to make of that or how > to correct it. I though I was passing fact an Integer, since I think nums > is a list of Integers, since the sum and product lines in f work ok. > > Help? Thanks. Okay, that's a nice one :) Don't be disappointed by the rather prosaic reason for it. First, if you don't already know what ambiguous type means: in the context "(Num a, ..., Monad ((->) (c -> c -> c)))" there appears a type variable, c, which doesn't appear on the right hand side a -> [b] -> [a]. I will not explain how that monstrous type arises, though it is a nice exercise in type inferring. module AskForNumbers where import IO askForNums = do putStrLn "Enter a pos int or 0 to end: " numStr <- getLine let num = read numStr if num == 0 then return [] else do rest <- askForNums return (num:rest) listFactorial l = if length l == 0 then return 1 else do fact (head l) listFactorial (tail l) ^^^^ There are several things that can be improved here. First, don't use "if length l == 0". That has to traverse the whole list and if that's long (or even infinite), you're doing a lot of superfluous work. To check for an empty list, use "null", so "if null l then ...", that's far more efficient because null is defined as null [] = True null _ = False Also, it would be better to define listFactorial by pattern matching: listFactorial [] = ... listFactorial (k:ks) = do fact k listFactorial ks Next, why do you return 1 for an empty list? Probably to match the type of the other branch, but that that isn't so well chosen either. And since fact only returns something and does nothing else, listFactorial actually is "do something without any effects and then return 1", that isn't what you want. What you want is that for each k in the list it prints k factorial is ... That suggests that you either let fact do the output, so that fact would get the type Int(eger) -> IO () or let fact be a pure function calculating the factorial and have listFactorial (k:ks) = do putStrLn (show k ++ " factorial is " ++ show (fact k)) listFactorials ks f = do nums <- askForNums putStr ("Sum is " ++ (show (foldr (+) 0 nums)) ++ "\n") putStr ("Product is " ++ (show (foldr (*) 1 nums)) ++ "\n") listFactorial nums There are library functions "sum" and "product", you could use them. Instead of putStr (some string ++ "\n") you could use putStrLn (some string) but all this is just a matter of personal taste. Now comes the culprit: fact n = if n == 0 then return 1 else return foldr (*) 1 [1 .. n] The else-branch should be return ( foldr (*) 1 [1 .. n] ) or return $ foldr (*) 1 [1 .. n] Without the parentheses or ($), it is parsed as ( ( (return foldr) (*) ) 1 ) [1 .. n], which leads to the interesting type of the error message. If you're interested, I could explain how hugs infers that. > > David HTH, Daniel ------------------------------ Message: 5 Date: Thu, 8 Jan 2009 00:17:30 -0500 From: "Brandon S. Allbery KF8NH" <allb...@ece.cmu.edu> Subject: Re: [Haskell-beginners] Some confusion still with a YAHT example To: David Schonberger <llp_...@hotmail.com> Cc: beginners@haskell.org Message-ID: <2500c059-cc26-4297-9d44-3874a9295...@ece.cmu.edu> Content-Type: text/plain; charset="us-ascii" I see there's another answer about this but it misses one key point which is likely the cause of your (and WinHugs') confusion. On 2009 Jan 7, at 19:31, David Schonberger wrote: > listFactorial l = > if length l == 0 > then return 1 > else do > fact (head l) > listFactorial (tail l) > > fact n = > if n == 0 > then return 1 > else return foldr (*) 1 [1..n] "return" doesn't mean what it does in other languages. It means "wrap this value in a monad". Since neither of the above functions involves a monad at all, the "return"s confuse Hugs into thinking you're doing something more complex than you actually are (hence the scary-looking type inference). Ignoring (most) other optimizations to the code, since the other message hit them: > listFactorial [] = [1] > listFactorial (x:xs) = fact x : listFactorial xs > > fact 0 = 1 > fact n = foldr (*) 1 [1..n] > > f = do > nums <= askForNums > putStr ("Sum is " ++ (show (foldr (+) 0 nums)) ++ "\n") > putStr ("Product is " ++ (show (foldr (*) 1 nums)) ++ "\n") > print $ listFactorial nums If you'd rather keep your original definition of "f" and put listFactorial in IO, then: > listFactorial [] = print 1 > listFactorial (x:xs) = do > print $ fact x > listFactorial xs -- brandon s. allbery [solaris,freebsd,perl,pugs,haskell] allb...@kf8nh.com system administrator [openafs,heimdal,too many hats] allb...@ece.cmu.edu electrical and computer engineering, carnegie mellon university KF8NH -------------- next part -------------- An HTML attachment was scrubbed... URL: http://www.haskell.org/pipermail/beginners/attachments/20090108/323e0faa/attachment.htm ------------------------------ _______________________________________________ Beginners mailing list Beginners@haskell.org http://www.haskell.org/mailman/listinfo/beginners End of Beginners Digest, Vol 7, Issue 7 ***************************************