Re: [Haskell-cafe] Performance Tuning darcs (a real shootout?)
On Mon, Jan 23, 2006 at 08:37:55PM -0800, Jason Dagit wrote:
> On Jan 23, 2006, at 3:33 AM, Bulat Ziganshin wrote:
>> what you mean? afaik, there is no standard FastPackedString implementation, but there is some library that with minimal modifications used in darcs, jhc and many other apps
>
> I considered the version at Don Stewart's web site to be the standard, perhaps that was silly of me.

Actually, FastPackedString originated in darcs, and Don separated it out as a library, making minimal modifications... :)

FastPackedString started out with the Data.PackedString code, which I modified to hold the data in a ForeignPtr and to treat the data as always being 8 bit words. Plus the mmap stuff and splitting. I renamed it when I deviated from the Data.PackedString interface (which is the module I used to use).

>> it seems that Ian just used this as memory/time-efficient alternative for hGetContents. reading from memory-mapped file may be done as pure computation if the whole file is mapped. is this used in darcs?
>
> I'm not sure, I have looked at the code but I can't tell. I think that was the point with the mmap'd files. There are several layers of abstraction at work here. Slurpies, PackedStrings, (custom) Lazy Reader monad, and maybe some others.

Patch files are normally stored compressed, so we can't mmap them and treat them nicely. For crazy-size repositories one might prefer to store them uncompressed, at least on one's working repository (presumably not in the publicly available repo). If the patches aren't compressed, they are indeed mmapped as an entire file.

And alas, even I don't have much time at the moment to help with optimizing darcs.

-- 
David Roundy
http://www.darcs.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
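The pattern David describes (get the whole uncompressed file into one packed buffer, then treat splitting as a pure computation) can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the modern `bytestring` package, which grew out of FastPackedString, and a plain strict `readFile` in place of mmap. `patchLines` and `loadLines` are hypothetical names, not darcs functions.

```haskell
import qualified Data.ByteString as B
import qualified Data.ByteString.Char8 as BC

-- | Pure splitting over bytes that are already in memory.
-- (Hypothetical helper; darcs has its own splitting code.)
patchLines :: B.ByteString -> [B.ByteString]
patchLines = BC.lines

-- | Read the whole file strictly, then split it purely.
-- Assumption: a strict read stands in for the mmap step here;
-- B.readFile does not itself memory-map the file.
loadLines :: FilePath -> IO [B.ByteString]
loadLines path = patchLines <$> B.readFile path
```

Once the bytes are loaded, everything downstream of `patchLines` is ordinary pure code, which is the property Bulat asks about. The cost is that the whole file is resident at once, unlike a chunked read.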
Re: [Haskell-cafe] Re: Performance Tuning darcs (a real shootout?)
On Jan 24, 2006, at 1:55 AM, Simon Marlow wrote:
> You can get a quick picture of heap usage with +RTS -Sstderr, by the way. To find out what's actually in that heap, you'll need heap profiling (as you know).
>
> [snip]
>
> Yes, GHC's heap is mmap()'d anonymously. You really need to find out whether the space leak is mmap()'d by GHC's runtime, or by darcs itself - +RTS -Sstderr or profiling will tell you about GHC's memory usage.

Ah, I had been using little s, but I forgot about the existence of big S. I'll try to include some profiles and the knowledge gained by using it. I wish I could work on that right now, but chances are it will be Monday or Tuesday before I get to look at it again.

> I'd start by using heap profiling to track down what the space leak consists of, and hopefully to give you enough information to diagnose it. Let's see some heap profiles!

Yes!

> Presumably the space leak is just as visible with smaller patches, so you don't need the full 300M patch to investigate it.

This is true, I've had problems with even 30mb patches. I guess I liked using the 300mb patch because it emphasized and exaggerated the performance problem, and often I left the profile running on one machine while I went off studying the code on another. But it's a good suggestion for when I want to be able to iterate or get my results sooner.

> I don't usually resort to -ddump-simpl until I'm optimising the inner loop, use profiling to find out where the inner loops actually *are* first.

Point taken.

>> Are there tools or techniques that can help me understand why the memory consumption peaks when applying a patch? Is it foolish to think that lazy evaluation is the right approach?
>
> Since you asked, I've never been that keen on mixing laziness and I/O. Your experiences have strengthened that conviction - if you want strict control over resource usage, laziness is always going to be problematic. Sure it's great if you can get it right, the code is shorter and runs in small constant space. But can you guarantee that it'll still have the same memory behaviour with the next version of the compiler? With a different compiler?

And I've heard others say that laziness adds enough unpredictability that it makes optimizing just that much trickier. I guess this may be one of the cases where the trickiness outweighs the elegance.

>> I'm looking for advice or help in optimizing darcs in this case. I guess this could be viewed as a challenge for people that felt like the micro benchmarks of the shootout were unfair to Haskell. Can we demonstrate that Haskell provides good performance in the real-world when working with large files? Ideally, darcs could easily work with a patch that is 10GB in size using only a few megs of ram if need be and doing so in about the time it takes to read the file once or twice and gzip it.
>
> I'd love to help you look into it, but I don't really have the time. I'm happy to help out with advice where possible, though.

Several people, including droundy himself, have spoken up and said "I'd help, but I'm busy." This is fine; when I said help I was thinking of advice like you gave. It was a poor choice of phrasing on my part. I can work ghc and stare at lines of code, but sometimes I need guidance since I'm mostly out of my league in this case.

Thanks,
Jason
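To make the "strict control over resource usage" side of this exchange concrete, here is a minimal sketch of reading a large file in fixed-size strict chunks with a strict accumulator, so that only one chunk needs to be live at a time. It is not how darcs reads patches; it assumes the modern `bytestring` package, and `countNewlines` and the 64 KiB chunk size are illustrative choices.

```haskell
{-# LANGUAGE BangPatterns #-}

import qualified Data.ByteString as B
import System.IO (Handle, IOMode (ReadMode), withBinaryFile)

-- | Count newline bytes in a file without lazy I/O.
-- Each iteration reads one strict chunk and folds it into a strict
-- counter, so earlier chunks can be garbage-collected immediately.
countNewlines :: FilePath -> IO Int
countNewlines path = withBinaryFile path ReadMode (go 0)
  where
    chunkSize = 64 * 1024  -- illustrative size, not tuned

    go :: Int -> Handle -> IO Int
    go !acc h = do
      chunk <- B.hGet h chunkSize
      if B.null chunk
        then pure acc                          -- end of file
        else go (acc + B.count 10 chunk) h     -- 10 is the byte '\n'
```

One way to check whether a program like this stays flat in memory is to compile it with a small `main` and run it with `+RTS -Sstderr`, the flag Simon mentions, on inputs of different sizes.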
[Haskell-cafe] darcs: the first haskell tool more popular than Haskell itself?
Greetings,

Debian has a system called popularity-contest, which is an opt-in survey of package use. I was curious to see the ranking of darcs among Haskell implementations themselves. The results: Darcs ranks higher than ghc and hugs.

#name     is the package name;
#inst     is the number of people who installed this package;
#vote     is the number of people who use this package regularly;
#old      is the number of people who installed, but don't use this package
#         regularly;
#recent   is the number of people who upgraded this package recently;
#no-files is the number of people whose entry didn't contain enough
#         information (atime and ctime were 0).
#rank name   inst  vote   old recent no-files (maintainer)
138   darcs   563   159   280    124        0 (Isaac Jones)
(51)  hugs    304   119   157     28        0 (Isaac Jones)
321   ghc6    194    81    80     33        0 (Ian Lynagh)

Hugs is listed in a different category, so the ranking is off.

Ian Lynagh pointed me at this nice graph showing the historical installation of the three packages:

http://people.debian.org/~igloo/popcon-graphs/index.php?packages=ghc6,darcs,hugs&show_installed=on&want_percent=on&want_legend=on&beenhere=1

peace,
isaac
Re: [Haskell-cafe] darcs: the first haskell tool more popular than Haskell itself?
(note: these are just some random thoughts to get the day started :-))

hi,

is this really surprising? compilers (for any language) are only of interest to developers, while most applications written in the language have a wide user base, so i would assume that there are many situations where this holds (internet explorer more popular than visual c++?). of course, darcs itself is a tool aimed at developers, but again, haskell developers are a subset of all the developers that can use it.

yet another thing (and this is debian specific) is that i use the darcs distributed with debian, which is old but works fine, but i don't use the ghc distributed with debian because it is old, and somewhat broken.

still, it is good to know that a tool written in haskell is doing so well, but this is not surprising either, because after all, haskell is a pretty cool programming language.

-iavor
Re: [Haskell-cafe] darcs: the first haskell tool more popular than Haskell itself?
On 1/24/06, Jared Updike wrote:
> What happened to "Avoid success at all costs"? [1]
>
> Jared.
>
> [1] seventh slide, Simon Peyton Jones, http://research.microsoft.com/Users/simonpj/papers/haskell-retrospective/

I was unaware of that motto. It looks like we'd better do something to make darcs harder to use.

Josh
Re: [Haskell-cafe] darcs: the first haskell tool more popular than Haskell itself?
Iavor Diatchki writes:

(snip)

> yet another thing (and this is debian specific) is that i use the darcs distributed with debian, which is old but works fine,

The darcs shipped w/ Debian Stable might be oldish, but that's the darcs that was available when Debian was frozen. The Debian unstable version is 1.0.5 (the newest), and testing should be there soon. If you're using stable, you can configure your system so that you can say:

  apt-get install darcs/unstable

So it'll just get the unstable darcs and keep everything else as stable.

peace,
isaac
[Haskell-cafe] Re: Guess what ... Tutorial uploaded! :)
Evening, Ben.

Ben Rudiak-Gould 00:46 24/1/2006 wrote:

 BR> Dmitry Astapov wrote:
 >> http://www.haskell.org/hawiki/HitchhickersGuideToTheHaskell

 BR> I like the approach too, but the section on IO actions, which I'm
 BR> reading now, is not correct. It's not true that a <- someAction
 BR> has the effect of associating a with someAction, with the actual
 BR> work deferred until later.

Ahem.. I got a little carried away on the topic of getContents, apparently :(

 BR> All of the IO associated with someAction happens at the program
 BR> point where a <- someAction appears, whether or not it's needed
 BR> later. getContents is a rare exception to this rule.

But of course. Now I'll have to think how to back out of the mess I created :)

-- 
Dmitry Astapov //ADEpt
GPG KeyID/fprint: F5D7639D/CA36 E6C4 815D 434D 0498 2B08 7867 4860 F5D7 639D
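Ben's point can be checked with a small experiment: bind the result of an action, never use it, and observe that its side effect has happened anyway. The sketch below uses an `IORef` counter as the visible effect; `tick` and `demo` are hypothetical names chosen for illustration.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- | An action with a visible side effect: bump the counter and
-- return its new value.
tick :: IORef Int -> IO Int
tick ref = do
  modifyIORef' ref (+ 1)
  readIORef ref

-- | Run 'tick' twice and ignore both results. The effects still
-- happen at the points where the binds appear, so the counter ends
-- up at 2 even though the bound names are never used.
demo :: IO Int
demo = do
  ref <- newIORef 0
  _unused <- tick ref
  _alsoUnused <- tick ref
  readIORef ref
```

Evaluating `demo` in GHCi yields 2. If `a <- someAction` merely associated a name with deferred work, the counter would still be 0. Lazy `getContents` behaves differently because it is deliberately implemented to defer its reads.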
[Haskell-cafe] Re: darcs: the first haskell tool more popular than Haskell itself?
Josh Hoyt wrote in gmane.comp.lang.haskell.cafe:
> On 1/24/06, Jared Updike wrote:
>> What happened to "Avoid success at all costs"?
>> http://research.microsoft.com/Users/simonpj/papers/haskell-retrospective/
>
> I was unaware of that motto. It looks like we'd better do something to make darcs harder to use.

Time for a coalgebra of patches?

-- 
Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig
Can't sleep, clown will eat me.
---
Unlike you I get Windows shoved down my throat at work.
Ooh, that's a pane in the neck.