Re: [Haskell-cafe] MD5? (was: Haskell performance question)
I made minor changes, fixing up my chunking function (finally) and thus eliminating the space leak. Performance is now under 3x that of C! Yay! Also, nano-md5 benched at 1.15x C (for files small enough for strict ByteStrings to do OK).

Get the code:

    darcs get http://code.haskell.org/~tommd/pureMD5

On the 2GB benchmark it is even more competitive (see my blog on sequence.complete.org). Let me know if you get significantly different results (and you will if your IO doesn't horribly bottleneck you like it does on my laptop).

-Tom

> You might like to test against,
>
>   http://hackage.haskell.org/cgi-bin/hackage-scripts/package/nano-md5-0.1
>
> which is a strict bytestring openssl binding.
>
> -- Don

--
"The philosophy behind your actions should never change, on the other hand, the practicality of them is never constant." - Thomas Main DuBuisson
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe
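The chunking step mentioned in this message can be illustrated with a minimal sketch. This is not the pureMD5 code; the names `blockSize` and `blocks` are hypothetical, and the only fact assumed about MD5 is that it consumes its input in 64-byte blocks. The point is that the blocks are produced lazily, so a consumer that forces them one at a time and then drops them does not need the whole file in memory.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.Int (Int64)

-- MD5 processes its input in 64-byte (512-bit) blocks.
blockSize :: Int64
blockSize = 64

-- Lazily split the input into 64-byte blocks. The final block may be
-- shorter than 64 bytes; padding it is a separate step not shown here.
-- (Hypothetical helper, not taken from pureMD5.)
blocks :: L.ByteString -> [L.ByteString]
blocks bs
  | L.null bs = []
  | otherwise =
      let (block, rest) = L.splitAt blockSize bs
      in  block : blocks rest
```

To avoid a space leak, the consumer should fold over `blocks` with a strict accumulator (for example `foldl'`) and should not keep a second reference to the original input.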
Re: [Haskell-cafe] MD5? (was: Haskell performance question)
On Thu, Nov 08, 2007 at 06:14:20PM -0500, Thomas M. DuBuisson wrote:
> Glad you asked!
>
> http://sequence.complete.org/node/367
>
> I just posted that last night! Once I get a community.haskell.org
> login I will put the code on darcs.
>
> The short of it:
> 1) The code is still ugly, I haven't been motivated to clean.
> 2) Manually unrolled, it is ~ 6 times slower than C
> 3) When rolled it is still much slower than that
> 4) There is some optimizer bug in GHC - this code could be 2x faster, I
>    feel certain.
> 5) I benchmarked using a 200MB file, so I think it will handle whatever.

Why did you put yourself through all this pain when you could have just copied the code from md5sum(1), removed the main function, and foreign imported its buffer accumulator, wrapping it as a function over lazy bytestrings? We have the best foreign function interface in the world. Reinventing wheels is stupid, especially if the existing wheels are this easy to use.

Stefan
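A rough sketch of the wrapping Stefan describes, under stated assumptions: the foreign import itself appears only as a comment, because its name and signature depend on which C source is linked in (`md5_update` and `Ctx` are hypothetical, not taken from md5sum). What the sketch does show is the marshalling side: each strict chunk of a lazy ByteString is handed to a buffer-consuming action without copying. `feedChunks` and `totalBytes` are illustrative names.

```haskell
import qualified Data.ByteString.Lazy as L
import Data.ByteString.Unsafe (unsafeUseAsCStringLen)
import Data.IORef (modifyIORef', newIORef, readIORef)
import Foreign.C.String (CStringLen)

-- With a C implementation linked in, the update step would be a foreign
-- import along these lines (hypothetical names):
--
--   foreign import ccall unsafe "md5_update"
--     c_md5_update :: Ptr Ctx -> Ptr CChar -> CSize -> IO ()

-- Pass every strict chunk of the lazy input to a (pointer, length)
-- consumer. No copy is made, so the consumer must neither modify the
-- buffer nor keep the pointer after it returns.
feedChunks :: (CStringLen -> IO ()) -> L.ByteString -> IO ()
feedChunks update =
  mapM_ (\chunk -> unsafeUseAsCStringLen chunk update) . L.toChunks

-- Stand-in consumer used only to exercise feedChunks: it adds up the
-- chunk lengths where a real binding would call the C update routine.
totalBytes :: L.ByteString -> IO Int
totalBytes input = do
  total <- newIORef 0
  feedChunks (\(_, len) -> modifyIORef' total (+ len)) input
  readIORef total
```

Because the chunks are consumed one at a time in `IO`, feeding the result of `L.readFile` through `feedChunks` should run in roughly constant memory, provided nothing else holds on to the input.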
Re: [Haskell-cafe] MD5? (was: Haskell performance question)
thomas.dubuisson:
> Glad you asked!
>
> http://sequence.complete.org/node/367
>
> I just posted that last night! Once I get a community.haskell.org
> login I will put the code on darcs.

Cool. I'll look at this.

You might like to test against,

    http://hackage.haskell.org/cgi-bin/hackage-scripts/package/nano-md5-0.1

which is a strict bytestring openssl binding.

-- Don
Re: [Haskell-cafe] MD5? (was: Haskell performance question)
Glad you asked!

http://sequence.complete.org/node/367

I just posted that last night! Once I get a community.haskell.org login I will put the code on darcs.

The short of it:
1) The code is still ugly, I haven't been motivated to clean.
2) Manually unrolled, it is ~ 6 times slower than C
3) When rolled it is still much slower than that
4) There is some optimizer bug in GHC - this code could be 2x faster, I feel certain.
5) I benchmarked using a 200MB file, so I think it will handle whatever.

Thomas DuBuisson

On Thu, 2007-11-08 at 22:14, Andrew Coppin wrote:
> Don Stewart wrote:
> > dpiponi:
> >
> >> I was getting about 1.5s for the Haskell program and about 0.08s for
> >> the C one with the same n=10,000,000.
> >
> > I'm sure we can do better than that!
>
> That's the spirit! :-D
>
> Speaking of which [yes, I'm going to totally hijack this thread now...],
> does anybody have a Haskell MD5 hash implementation that goes fast?
> IIRC, I found one in MissingH, and it worked great. Except that as soon
> as you feed it a 10 MB file, the standard Unix "md5sum" executable takes
> about 0.001s to do it, and the Haskell version goes crazy and starts
> eating virtual memory like candy. o_O (Although given a few minutes it
> *does* produce the correct answer. But given that I want to run it over
> an entire CD...)
>
> Given the choice, I'd *like* to find a fast 100% Haskell implementation
> - but failing that, (nice) bindings to a fast C implementation will do I
> guess. (I *only* need to compute MD5 hashes for files on disk. I don't
> need to do anything more fancy than that...)
Re: [Haskell-cafe] MD5? (was: Haskell performance question)
andrewcoppin:
> Don Stewart wrote:
> >dpiponi:
> >
> >>I was getting about 1.5s for the Haskell program and about 0.08s for
> >>the C one with the same n=10,000,000.
> >
> >I'm sure we can do better than that!
>
> That's the spirit! :-D
>
> Speaking of which [yes, I'm going to totally hijack this thread now...],
> does anybody have a Haskell MD5 hash implementation that goes fast?
> IIRC, I found one in MissingH, and it worked great. Except that as soon
> as you feed it a 10 MB file, the standard Unix "md5sum" executable takes
> about 0.001s to do it, and the Haskell version goes crazy and starts
> eating virtual memory like candy. o_O (Although given a few minutes it
> *does* produce the correct answer. But given that I want to run it over
> an entire CD...)
>
> Given the choice, I'd *like* to find a fast 100% Haskell implementation
> - but failing that, (nice) bindings to a fast C implementation will do I
> guess. (I *only* need to compute MD5 hashes for files on disk. I don't
> need to do anything more fancy than that...)

Start with a fast C version, and translate that into code over ByteStrings. If it's not within 2x, call the bytestring hackers hotline, which is on the wiki.

-- Don
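As a small illustration of what "translating C into code over ByteStrings" can look like, here is a sketch of one step a C MD5 performs on every block: reading the 64 input bytes as sixteen 32-bit little-endian words. The helper names `word32le` and `blockWords` are hypothetical and not taken from any of the packages mentioned in this thread.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftL, (.|.))
import Data.Word (Word32)

-- Decode the 32-bit little-endian word that starts at byte offset i.
-- The caller must ensure that bytes i .. i+3 exist.
word32le :: B.ByteString -> Int -> Word32
word32le bs i =
      byte 0
  .|. (byte 1 `shiftL` 8)
  .|. (byte 2 `shiftL` 16)
  .|. (byte 3 `shiftL` 24)
  where
    byte k = fromIntegral (B.index bs (i + k)) :: Word32

-- Decode every complete 4-byte group of a strict block; a full
-- 64-byte MD5 block yields 16 words.
blockWords :: B.ByteString -> [Word32]
blockWords block =
  [ word32le block (4 * j) | j <- [0 .. B.length block `div` 4 - 1] ]
```

For speed, a real implementation would more likely use unchecked indexing or direct pointer reads instead of building a list; the version above favours clarity over performance.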
[Haskell-cafe] MD5? (was: Haskell performance question)
Don Stewart wrote:
> dpiponi:
>
>> I was getting about 1.5s for the Haskell program and about 0.08s for
>> the C one with the same n=10,000,000.
>
> I'm sure we can do better than that!

That's the spirit! :-D

Speaking of which [yes, I'm going to totally hijack this thread now...], does anybody have a Haskell MD5 hash implementation that goes fast? IIRC, I found one in MissingH, and it worked great. Except that as soon as you feed it a 10 MB file, the standard Unix "md5sum" executable takes about 0.001s to do it, and the Haskell version goes crazy and starts eating virtual memory like candy. o_O (Although given a few minutes it *does* produce the correct answer. But given that I want to run it over an entire CD...)

Given the choice, I'd *like* to find a fast 100% Haskell implementation - but failing that, (nice) bindings to a fast C implementation will do I guess. (I *only* need to compute MD5 hashes for files on disk. I don't need to do anything more fancy than that...)