Send Beginners mailing list submissions to beginners@haskell.org To subscribe or unsubscribe via the World Wide Web, visit http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners or, via email, send a message with subject or body 'help' to beginners-requ...@haskell.org
You can reach the person managing the list at beginners-ow...@haskell.org When replying, please edit your Subject line so it is more specific than "Re: Contents of Beginners digest..." Today's Topics: 1. Fwd: CAF taking all the memory (Mahdi) 2. using HS to writing/managing a selfmade filesystem on a real partition? (Silent Leaf) ---------------------------------------------------------------------- Message: 1 Date: Mon, 9 May 2016 18:41:03 +0430 From: Mahdi <mdiba...@aol.com> To: beginners@haskell.org Subject: [Haskell-beginners] Fwd: CAF taking all the memory Message-ID: <6d06e900-40d3-4cd1-814d-da821658d...@aol.com> Content-Type: text/plain; charset="utf-8" > Begin forwarded message: > > From: Mahdi <mdiba...@aol.com> > Subject: Re: [Haskell-beginners] CAF taking all the memory > Date: May 8, 2016 at 18:06:29 GMT+4:30 > To: Ben Gamari <b...@smart-cactus.org> > > Wow, This is a great explanation! Thank you! > I see your point on using lazy lists (although I was not aware of the > specific problems you mentioned). > So, I switched over to HMatrix [0], (the libraries you mentioned added a lot > of complexity to the code) > which uses unboxed arrays for matrices. > > After refactoring my code to use HMatrix, I ran the program, and the same > thing happens, > the speed hasn?t changed much either, then I tried to take another profile to > see if anything has changed, > but I couldn?t as described here [1]. > > But my main question is, why would Haskell even store all this data? There > are no side-effects, so > there should be no memory leak in my code. > From what I read online, Haskell is trying to avoid re-calculating the > functions by caching their > input/outputs (they?re pure, so this can be done), but I hoped there would be > a way to disable this, > unfortunately there doesn?t seem to be a way (or there is?). > If that?s the case, then using libraries won?t solve the problem, and that?s > frustrating. > > * I also tried compiling with -O0, but no luck. > > [0] http://dis.um.es/~alberto/hmatrix/hmatrixtut.html > [1] https://github.com/albertoruiz/hmatrix/issues/190 > > Thank you Ben! > > - Mahdi > >> On May 3, 2016, at 14:36, Ben Gamari <b...@smart-cactus.org> wrote: >> >> Mahdi <mdiba...@aol.com> writes: >> >>> Hello there, >>> >> Hi! >> >>> I?m pretty new to Haskell and I?m trying to write a Neural Network in >>> Haskell for educational purposes. >>> >>> So, I have my neural network working, it can learn XOR in 500 >>> iterations, but the thing is, if I increase the iterations to >>> something like 5 milion times, the process just drains my RAM until >>> either it?s killed or the OS drowns. Here is the code: [0] >>> >>> I searched online and tried to get information on why this is >>> happening, I profiled the memory usage and found that the memory is >>> taken by CAF, searching online, >>> >> Great! The next question you should then ask is "which CAF"? A CAF is >> simply a top-level "constant" in your program. Indeed it sounds like you >> have not defined any cost-centers, which means that GHC will attribute >> the entire cost of your program to the ambiguous cost-center "CAF" >> (which in this case really just means "the whole program"). >> >> As discussed in the users guide [1] one way to define cost-centers >> within your program is to manually annotate expressions with SCC >> pragmas. However, in this case we simply want to let GHC do this for us, >> for which we can use the `-fprof-auto -fprof-cafs` flags (which >> automatically annotate top-level definitions and CAFs with >> cost-centers), >> >> $ ghc -O examples/xor.hs -fprof-auto -fprof-cafs >> >> Now your program should give a much more useful profile (run having set >> the iteration count to 50000), >> >> $ time examples/xor +RTS -p >> [[0.99548584],[4.5138146e-3],[0.9954874],[4.513808e-3]] >> >> real 0m4.019s >> user 0m0.004s >> sys 0m4.013s >> $ cat xor.prof >> Tue May 3 11:15 2016 Time and Allocation Profiling Report (Final) >> >> xor +RTS -p -RTS >> >> total time = 1.11 secs (1107 ticks @ 1000 us, 1 processor) >> total alloc = 1,984,202,600 bytes (excludes profiling overheads) >> >> COST CENTRE MODULE %time %alloc >> >> matrixMap Utils.Math 21.4 26.8 >> matadd Utils.Math 20.8 22.7 >> matmul.\ Utils.Math 10.4 16.0 >> dot Utils.Math 7.1 13.4 >> column Utils.Math 7.0 2.9 >> dot.\ Utils.Math 6.9 1.9 >> rowPairs Utils.Math 5.8 6.5 >> sigmoid' NN 4.7 0.8 >> train.helper Main 4.0 1.3 >> sigmoid NN 3.3 0.8 >> matmul Utils.Math 2.2 2.0 >> hadamard Utils.Math 1.8 2.1 >> columns Utils.Math 1.2 1.3 >> >> >> individual inherited >> COST CENTRE MODULE no. entries %time >> %alloc %time %alloc >> >> MAIN MAIN 79 0 0.0 >> 0.0 100.0 100.0 >> [snip] >> CAF:main3 Main 143 0 0.0 >> 0.0 100.0 100.0 >> (...) Main 162 1 0.0 >> 0.0 100.0 100.0 >> train Main 163 1 0.0 >> 0.0 100.0 100.0 >> train.helper Main 164 50001 4.0 >> 1.3 100.0 100.0 >> train.helper.hweights Main 258 50001 0.5 >> 0.0 0.5 0.0 >> train.helper.oweights Main 235 50001 0.4 >> 0.0 0.4 0.0 >> train.helper.oback Main 207 50000 0.3 >> 0.1 19.0 20.9 >> backward' NN 208 50000 0.3 >> 0.6 18.7 20.8 >> [snip] >> >> So, here we see that costs are actually spread throughout the program. >> >> Without diving any deeper into this particular program it's hard to give >> more guidance however I will say that your lazy list Matrix >> representation is very rarely the right choice for even toy linear >> algebra problems. >> >> First, consider the fact that even just a list cons constructor requires >> three words (an info table pointer, a pointer to payload, and a pointer >> to tail) plus the size of the payload (which in the case of an evaluated >> `Float` is 2 words: one info table pointer and the floating point value >> itself). So, a list of n *evaluated* `Float`s (which would require only >> 4*n bytes if packed densely) will require 40*n bytes if represented as a >> lazy list. >> >> Then, consider the fact that indexing into a lazy list is an O(n) >> operation: this means that your `Math.column` operation on an n x m >> matrix may be O(n*m). Even worse, `Math.columns`, as used by >> `Math.matmul` is O(n * m!). >> >> Finally, consider the fact that whenever you "construct" a lazy list you >> aren't actually performing any computation: you are actually >> constructing a single thunk which represents the entire result; however, >> if you then go to index into the middle of that list you will end up >> constructing n cons cells and a thunk for the payload of each. In the >> case of primitive linear algebra operations the cost of constructing >> this payload thunk can be greater than simply computing the result. >> >> For these reasons I wouldn't recommend that lazy lists are used in this >> way. If you have a dense matrix use an array (probably even unboxed; >> see, for instance, the `array`, `vector`, and `repa` libraries); if >> you have a sparse matrix then use an appropriate sparse representation >> (although sadly good sparse linear algebra tools are hard to come by in >> Haskell) Not only will the result be significantly more efficient in >> space and time but the runtime behavior of the program will be >> significantly easier to follow since you can more easily ensure that >> evaluation occurs when you expect it to. >> >> Hopefully this helps. Good luck and let us know if there are further >> issues. >> >> Cheers, >> >> - Ben >> >> >> [1] >> http://downloads.haskell.org/~ghc/master/users-guide//profiling.html#cost-centres-and-cost-centre-stacks > -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.haskell.org/pipermail/beginners/attachments/20160509/b7ecd57c/attachment-0001.html> ------------------------------ Message: 2 Date: Mon, 9 May 2016 20:50:14 +0200 From: Silent Leaf <silent.le...@gmail.com> To: The Haskell-Beginners Mailing List - Discussion of primarily beginner-level topics related to Haskell <beginners@haskell.org> Subject: [Haskell-beginners] using HS to writing/managing a selfmade filesystem on a real partition? Message-ID: <cagfccjn27ga1-vff_iguci6sny_poneutdzbn2yf904jec1...@mail.gmail.com> Content-Type: text/plain; charset="utf-8" Mostly all in the title. I have a project of developing a personal filesystem, possibly at first virtual (the file(s) representing a virtual partition formatted with my filesystem, would be saved in a host filesys, eg ext4 or whatever), but probably in the end not virtual, directly working on the contents of a real partition. Can haskell do that kind of thing, aka writing data on a partition directly (without using a known filesys), etc? Is it at least more or less adapted for this task (not talking about performances, unless the consequences be a *really* slow filesys), aka doable, easily doable, relatively speaking (aka not worse than with another language)? Incidentally, if i wanted Linux to recognize the filesys, i've heard one has to write a module and put it in connection with the kernel or something. could haskell do that? if that's a "no" somewhere for one of my questions, which parts can't be written in haskell (without horrible performances or code very very hard to write), and can they be written in C (or whatever) as foreign functions? which parts would that represent for the whole program? Thanks a lot in advance! PS: just in case, tips on sources of information on how to do any of the above will be extremely appreciated! (even if it's in, say C, for that matter, providing there's a way to translate the steps into a haskell program) -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://mail.haskell.org/pipermail/beginners/attachments/20160509/60559f2c/attachment.html> ------------------------------ Subject: Digest Footer _______________________________________________ Beginners mailing list Beginners@haskell.org http://mail.haskell.org/cgi-bin/mailman/listinfo/beginners ------------------------------ End of Beginners Digest, Vol 95, Issue 15 *****************************************