Re: [Haskell-cafe] Performance Tuning darcs (a real shootout?)

2006-01-24 Thread David Roundy
On Mon, Jan 23, 2006 at 08:37:55PM -0800, Jason Dagit wrote:
 On Jan 23, 2006, at 3:33 AM, Bulat Ziganshin wrote:
 what do you mean? afaik, there is no standard FastPackedString
 implementation, but there is a library that, with minimal
 modifications, is used in darcs, jhc and many other apps
 
 I considered the version at Don Stewart's web site to be the standard,
 perhaps that was silly of me.

Actually, FastPackedString originated in darcs, and Don separated it out as
a library, making minimal modifications...  :) FastPackedString started out
with the Data.PackedString code, which I modified to hold the data in a
ForeignPtr and to treat the data as always being 8 bit words.  Plus the
mmap stuff and splitting.  I renamed it when I deviated from the
Data.PackedString interface (which is the module I used to use).
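
As a rough illustration of the design described above (bytes held behind a ForeignPtr, always treated as 8 bit words, with cheap splitting), here is a minimal sketch. The type `Packed` and the functions `packP`, `unpackP` and `splitAtP` are hypothetical names chosen for this example; they are not the actual darcs or FastPackedString definitions, and the mmap part is left out entirely.

```haskell
import Data.Char (chr, ord)
import Data.Word (Word8)
import Foreign.ForeignPtr (ForeignPtr, mallocForeignPtrBytes, withForeignPtr)
import Foreign.Storable (peekByteOff, pokeByteOff)

-- Hypothetical packed string: a shared byte buffer plus an offset and a length.
-- This is an assumption for illustration, not the real FastPackedString type.
data Packed = Packed !(ForeignPtr Word8) !Int !Int

-- Copy a String into a freshly allocated buffer, truncating each Char to 8 bits.
packP :: String -> IO Packed
packP s = do
  let n = length s
  fp <- mallocForeignPtrBytes n
  withForeignPtr fp $ \p ->
    mapM_ (\(i, c) -> pokeByteOff p i (fromIntegral (ord c) :: Word8))
          (zip [0 ..] s)
  return (Packed fp 0 n)

-- Read the bytes back out as Chars in the range 0..255.
unpackP :: Packed -> IO String
unpackP (Packed fp off n) =
  withForeignPtr fp $ \p ->
    mapM (\i -> do
            w <- peekByteOff p (off + i) :: IO Word8
            return (chr (fromIntegral w)))
         [0 .. n - 1]

-- Split without copying: both halves point into the same buffer.
splitAtP :: Int -> Packed -> (Packed, Packed)
splitAtP k (Packed fp off n) =
  (Packed fp off k', Packed fp (off + k') (n - k'))
  where
    k' = max 0 (min k n)
```

Because splitting only adjusts the offset and length, a large buffer can be cut into many pieces without copying any bytes, which is presumably the property that matters for large patch files.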

 it seems that Ian just used this as a memory/time-efficient alternative
 to hGetContents. reading from a memory-mapped file may be done as a pure
 computation if the whole file is mapped. is this used in darcs?
 
 I'm not sure, I have looked at the code but I can't tell.  I think  
 that was the point with the mmap'd files.  There are several layers  
 of abstraction at work here.  Slurpies, PackedStrings, (custom) Lazy  
 Reader monad, and maybe some others.

Patch files are normally stored compressed, so we can't mmap them and treat
them nicely.  For crazy-size repositories one might prefer to store them
uncompressed, at least on one's working repository (presumably not in the
publicly available repo).  If the patches aren't compressed, they are
indeed mmapped as an entire file.

And alas, even I don't have much time at the moment to help with optimizing
darcs.
-- 
David Roundy
http://www.darcs.net
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Performance Tuning darcs (a real shootout?)

2006-01-24 Thread Jason Dagit


On Jan 24, 2006, at 1:55 AM, Simon Marlow wrote:



You can get a quick picture of heap usage with +RTS -Sstderr, by  
the way.  To find out what's actually in that heap, you'll need  
heap profiling (as you know).

[snip]
Yes, GHC's heap is mmap()'d anonymously.  You really need to find  
out whether the space leak is mmap()'d by GHC's runtime, or by  
darcs itself - +RTS -Sstderr or profiling will tell you about GHC's  
memory usage.


Ah, I had been using little s, but I forgot about the existence of  
big S.  I'll try to include some profiles and the knowledge gained by  
using it.  I wish I could work on that right now but chances are it  
will be Monday or Tuesday before I get to look at it again.




I'd start by using heap profiling to track down what the space leak  
consists of, and hopefully to give you enough information to  
diagnose it.  Let's see some heap profiles!


Yes!
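
For readers who have not produced a heap profile before, here is a minimal sketch of the kind of leak such a profile makes visible. It is a toy, not darcs code, and the names `sumLazy` and `sumStrict` are made up for this example. With a reasonably recent GHC, one way to try it is to add a `main` such as `main = print (sumLazy [1 .. 10000000])`, compile with `ghc -prof -fprof-auto -rtsopts`, run with `+RTS -hc -Sstderr`, and render the resulting `.hp` file with `hp2ps`. Note that with optimisation enabled GHC may make the lazy version strict on its own, so compile without `-O` to see the difference.

```haskell
import Data.List (foldl')

-- Lazy left fold: the accumulator is not forced, so (without optimisation)
-- it grows into a chain of suspended additions as long as the input list.
sumLazy :: [Int] -> Int
sumLazy = foldl (+) 0

-- Strict left fold: the accumulator is forced at every step,
-- so the fold runs in constant space.
sumStrict :: [Int] -> Int
sumStrict = foldl' (+) 0
```

Both functions return the same result; only the heap profile tells them apart.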



Presumably the space leak is just as visible with smaller patches,  
so you don't need the full 300M patch to investigate it.


This is true, I've had problems with even 30mb patches.  I guess I  
liked using the 300mb patch because it emphasized and exaggerated the  
performance and often I left the profile running on one machine while  
I went off studying the code on another.  But, it's a good suggestion  
for when I want to be able to iterate or get my results sooner.




I don't usually resort to -ddump-simpl until I'm optimising the
inner loop; use profiling to find out where the inner loops
actually *are* first.


Point taken.

Are there tools or techniques that can help me understand why the  
memory consumption peaks when applying a patch?  Is it foolish to  
think that lazy evaluation is the right approach?


Since you asked, I've never been that keen on mixing laziness and
I/O. Your experiences have strengthened that conviction - if you want
strict control over resource usage, laziness is always going to be  
problematic.  Sure it's great if you can get it right, the code is  
shorter and runs in small constant space.  But can you guarantee  
that it'll still have the same memory behaviour with the next  
version of the compiler?  With a different compiler?


And I've heard others say that laziness adds enough unpredictability  
that it makes optimizing just that much trickier.  I guess this may  
be one of the cases where the trickiness outweighs the elegance.
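
To make the strict alternative concrete, here is a minimal sketch of reading a file in fixed-size chunks with an explicit loop, so that memory use is bounded by the buffer size regardless of file size. It uses only `hGetBuf` and friends from base; `countNewlines`, `loop`, `countIn` and the 64 KiB `chunkSize` are names and values picked for this example and have nothing to do with how darcs actually reads patches.

```haskell
{-# LANGUAGE BangPatterns #-}
import Data.Word (Word8)
import Foreign.Marshal.Alloc (allocaBytes)
import Foreign.Ptr (Ptr)
import Foreign.Storable (peekByteOff)
import System.IO (Handle, IOMode (ReadMode), hGetBuf, withBinaryFile)

-- Size of the single reusable buffer (an arbitrary choice for this sketch).
chunkSize :: Int
chunkSize = 64 * 1024

-- Count newline bytes in a file using one fixed-size buffer.
countNewlines :: FilePath -> IO Int
countNewlines path =
  withBinaryFile path ReadMode $ \h ->
    allocaBytes chunkSize $ \buf -> loop h buf 0

-- Refill the buffer until hGetBuf reports end of file (0 bytes read).
loop :: Handle -> Ptr Word8 -> Int -> IO Int
loop h buf !acc = do
  n <- hGetBuf h buf chunkSize
  if n == 0
    then return acc
    else do
      k <- countIn buf n
      loop h buf (acc + k)

-- Count newline bytes (10) among the first n bytes of the buffer.
countIn :: Ptr Word8 -> Int -> IO Int
countIn buf n = go 0 0
  where
    go !i !acc
      | i >= n = return acc
      | otherwise = do
          w <- peekByteOff buf i :: IO Word8
          go (i + 1) (if w == 10 then acc + 1 else acc)
```

The code is longer than a lazy `hGetContents` one-liner, but its memory behaviour does not depend on the compiler noticing that the input can be consumed incrementally.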




I'm looking for advice or help in optimizing darcs in this case.   
I guess this could be viewed as a challenge for people that felt  
like the micro benchmarks of the shootout were unfair to Haskell.   
Can we demonstrate that Haskell provides good performance in the  
real-world when working with large files?  Ideally, darcs could  
easily work with a patch that is 10GB in size using only a few  
megs of ram if need be and doing so in about the time it takes
to read the file once or twice and gzip it.


I'd love to help you look into it, but I don't really have the  
time. I'm happy to help out with advice where possible, though.


Several people have spoken up and said "I'd help but I'm busy",
including droundy himself.  This is fine, when I said help I was
thinking of advice like you gave.  It was a poor choice of phrasing  
on my part.  I can work ghc and stare at lines of code, but sometimes  
I need guidance since I'm mostly out of my league in this case.


Thanks,
Jason



[Haskell-cafe] darcs: the first haskell tool more popular than Haskell itself?

2006-01-24 Thread Isaac Jones
Greetings,

Debian has a system called popularity-contest, which is an opt-in
survey of package use.  I was curious to see the ranking of darcs
among Haskell implementations themselves.  The results: Darcs ranks
higher than ghc and hugs.

#name is the package name;
#inst is the number of people who installed this package;
#vote is the number of people who use this package regularly;
#old is the number of people who installed, but don't use this package
#  regularly;
#recent is the number of people who upgraded this package recently;
#no-files is the number of people whose entry didn't contain enough
#   information (atime and ctime were 0).


#rank  name   inst  vote  old  recent  no-files  (maintainer)
138    darcs   563   159  280     124         0  (Isaac Jones)
(51)   hugs    304   119  157      28         0  (Isaac Jones)
321    ghc6    194    81   80      33         0  (Ian Lynagh)

Hugs is listed in a different category, so the ranking is off.

Ian Lynagh pointed me at this nice graph showing the historical
installation of the three packages:

http://people.debian.org/~igloo/popcon-graphs/index.php?packages=ghc6,darcs,hugs&show_installed=on&want_percent=on&want_legend=on&beenhere=1

peace,

  isaac


Re: [Haskell-cafe] darcs: the first haskell tool more popular than Haskell itself?

2006-01-24 Thread Iavor Diatchki
(note: these are just some random thoughts to get the day started :-))
hi,
is this really surprising?  compilers (for any language) are only of
interest to developers, while most applications written in the
language have a wide user base, so i would assume that there are many
situations where this holds (internet explorer more popular than
visual c++?).  of course, darcs itself is a tool aimed at developers,
but again, haskell developers are a subset of all the developers that
can use it.  yet another thing (and this is debian specific) is that i
use the darcs distributed with debian, which is old but works fine,
but i don't use the ghc distributed with debian because it is old, and
somewhat broken. still, it is good to know that a tool written in
haskell is doing so well, but this is not surprising either, because
after all, haskell is a pretty cool programming language.
-iavor



Re: [Haskell-cafe] darcs: the first haskell tool more popular than Haskell itself?

2006-01-24 Thread Josh Hoyt
On 1/24/06, Jared Updike [EMAIL PROTECTED] wrote:
 What happened to "Avoid success at all costs"? [1]

   Jared.

 [1] seventh slide, Simon Peyton Jones,
 http://research.microsoft.com/Users/simonpj/papers/haskell-retrospective/

I was unaware of that motto. It looks like we'd better do something to
make darcs harder to use.

Josh


Re: [Haskell-cafe] darcs: the first haskell tool more popular than Haskell itself?

2006-01-24 Thread Isaac Jones
Iavor Diatchki [EMAIL PROTECTED] writes:

(snip)
 yet another thing (and this is debian specific) is that i
 use the darcs distributed with debian, which is old but works fine,

The darcs shipped w/ Debian Stable might be oldish, but that's the
darcs that was available when Debian was frozen.  The Debian unstable
version is 1.0.5 (the newest), and testing should be there soon.

If you're using stable, you can configure your system so that you can
say:
apt-get install darcs/unstable

So it'll just get the unstable darcs and keep everything else as
stable.

peace,

  isaac


[Haskell-cafe] Re: Guess what ... Tutorial uploaded! :)

2006-01-24 Thread Dmitry Astapov

Evening, Ben. 

Ben Rudiak-Gould [EMAIL PROTECTED] 00:46 24/1/2006 wrote:

 BR Dmitry Astapov wrote:
 http://www.haskell.org/hawiki/HitchhickersGuideToTheHaskell

 BR I like the approach too, but the section on IO actions, which I'm
 BR reading now, is not correct. It's not true that a <- someAction
 BR has the effect of associating a with someAction, with the actual
 BR work deferred until later.

Ahem.. I got a little carried away on the topic of getContents,
apparently :(


 BR All of the IO associated with someAction happens at the program
 BR point where a <- someAction appears, whether or not it's needed
 BR later. getContents is a rare exception to this rule.

But of course. Now I'll have to think how to back out of the mess I
created :)
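
The distinction Ben draws can be checked with a small experiment. The sketch below is not taken from the tutorial; `demo` and `tick` are names invented for it. It assumes that `unsafeInterleaveIO` from base is a fair stand-in for the deferral that `getContents` performs (GHC's lazy input is built on it), which makes the two behaviours easy to compare side by side.

```haskell
import Control.Exception (evaluate)
import Data.IORef (modifyIORef, newIORef, readIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Returns the counter value observed at three points:
-- after an ordinary bind, after a deferred bind, and after forcing the
-- deferred result.
demo :: IO (Int, Int, Int)
demo = do
  counter <- newIORef (0 :: Int)
  let tick = modifyIORef counter (+ 1) >> readIORef counter
  -- Ordinary bind: the effect runs right here, although the result is unused.
  _unused <- tick
  afterEager <- readIORef counter
  -- Deferred bind: the effect is postponed until the result is demanded.
  deferred <- unsafeInterleaveIO tick
  beforeForce <- readIORef counter
  -- Demanding the value is what finally runs the postponed effect.
  _ <- evaluate deferred
  afterForce <- readIORef counter
  return (afterEager, beforeForce, afterForce)
```

Running `demo` yields `(1, 1, 2)`: the ordinary bind has already bumped the counter by the time the next line runs, while the deferred action only takes effect once its result is forced.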

-- 
Dmitry Astapov //ADEpt
GPG KeyID/fprint: F5D7639D/CA36 E6C4 815D 434D 0498  2B08 7867 4860 F5D7 639D



[Haskell-cafe] Re: darcs: the first haskell tool more popular than Haskell itself?

2006-01-24 Thread Chung-chieh Shan
Josh Hoyt [EMAIL PROTECTED] wrote in article [EMAIL PROTECTED] in 
gmane.comp.lang.haskell.cafe:
 On 1/24/06, Jared Updike [EMAIL PROTECTED] wrote:
  What happened to "Avoid success at all costs"?
  http://research.microsoft.com/Users/simonpj/papers/haskell-retrospective/
 I was unaware of that motto. It looks like we'd better do something to
 make darcs harder to use.

Time for a coalgebra of patches?

-- 
Edit this signature at http://www.digitas.harvard.edu/cgi-bin/ken/sig
Can't sleep, clown will eat me.
---
Unlike you I get Windows shoved down my throat at work.
Ooh, that's a pane in the neck.
