Re: [Haskell-cafe] Re: GHC predictability
Don Stewart wrote:
> ndmitchell:
>>> 2. Does anybody know how to actually read GHC's Core output anyway?
>>
>> There is one difference from standard Haskell I am aware of. In Core,
>> "case x of _ -> 1" will evaluate x; in Haskell it won't. Other than
>> that, it's just Haskell, but without pattern matching and only cases -
>> hence the rather high number of cases.
>
> Well, 'let's too, which bind either unlifted or lifted values. If
> they're binding lifted things, that's a thunk being allocated.

See, somebody needs to write all this down - in language comprehensible
to people who don't already know what an unlifted value is. ;-)
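The source-Haskell half of that difference is easy to check for
yourself; here is a minimal sketch (the function name is purely
illustrative, and the Core behaviour is as stated above):

    -- In source Haskell a wildcard case does NOT force its scrutinee,
    -- so passing bottom is harmless here:
    lazyCase :: a -> Int
    lazyCase x = case x of _ -> 1

    main :: IO ()
    main = print (lazyCase (undefined :: Int))  -- prints 1
    -- In Core, "case x of _ -> 1" evaluates x to WHNF first, so the
    -- corresponding Core term applied to bottom would diverge.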
Re: [Haskell-cafe] Re: GHC predictability
Neil Mitchell wrote:
> Hi
>
>> 1. What is ghc-core?
>
> You actually answer this question as part of question 2. Think of it
> as simple Haskell with some additional bits.

I rephrase: I know what GHC's Core language is. But Dons said "I
suggest you install ghc-core", which suggests the existence of an item
of software named "ghc-core". This I know nothing about...
Re: [Haskell-cafe] Re: GHC predictability
Richard A. O'Keefe wrote:
> On 14 May 2008, at 8:58 am, Andrew Coppin wrote:
>> What I'm trying to say [and saying very badly] is that Haskell is an
>> almost terrifyingly subtle language.
>
> Name me a useful programming language that isn't. Simply interchanging
> two for-loops, from
>
>     for (i = 0; i < N; i++) for (j = 0; j < N; j++)
> to
>     for (j = 0; j < N; j++) for (i = 0; i < N; i++)
>
> when marching over an array, can easily slow you down by nearly two
> orders of magnitude in C. [Hint: read "What every computer scientist
> needs to know about memory".]

OK, well *that* is news... :-o

I would suggest that for heap-allocated data that isn't an array, both
cases will be equally unperformant. I can't prove that though...

> Unexpectedly slow is better than inexplicably buggy.

+184 o_O
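The loop-interchange effect above can be reproduced in Haskell with
unboxed arrays; a rough sketch (names and the array size are
illustrative; compile with -O, and expect the gap to be smaller than
in a tight C loop, since list-comprehension overhead dilutes it):

    import Data.Array.Unboxed

    n :: Int
    n = 2000

    grid :: UArray (Int, Int) Double
    grid = listArray ((0, 0), (n - 1, n - 1)) (repeat 1)

    -- (Int, Int) indices are laid out row-major (the last index varies
    -- fastest), so keeping j innermost walks memory sequentially,
    -- while swapping the indices strides across whole rows:
    rowOrder, colOrder :: Double
    rowOrder = sum [ grid ! (i, j) | i <- [0 .. n-1], j <- [0 .. n-1] ]
    colOrder = sum [ grid ! (j, i) | i <- [0 .. n-1], j <- [0 .. n-1] ]

    main :: IO ()
    main = print (rowOrder + colOrder)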
Re: [Haskell-cafe] Re: GHC predictability
On 2008 May 14, at 14:23, Andrew Coppin wrote:
> Neil Mitchell wrote:
>>> 1. What is ghc-core?
>>
>> You actually answer this question as part of question 2. Think of it
>> as simple Haskell with some additional bits.
>
> I rephrase: I know what GHC's Core language is. But Dons said "I
> suggest you install ghc-core", which suggests the existence of an
> item of software named "ghc-core". This I know nothing about...

Any time someone mentions something that sounds like a Haskell package,
http://hackage.haskell.org is your friend. And there I find, in the
Development category:

    ghc-core program: Display GHC's core and assembly output in a pager

IIRC it's just a prettyprinter/colorizer to make it easier to follow
and understand GHC Core.

-- 
brandon s. allbery [solaris,freebsd,perl,pugs,haskell] [EMAIL PROTECTED]
system administrator [openafs,heimdal,too many hats]   [EMAIL PROTECTED]
electrical and computer engineering, carnegie mellon university   KF8NH
Re: [Haskell-cafe] Re: GHC predictability
On Mon, 2008-05-12 at 19:30 -0700, Don Stewart wrote:
>> I offer up the following example:
>>
>>     mean xs = sum xs / length xs
>>
>> Now try, say, "mean [1 .. 1e9]", and watch GHC eat several GB of
>> RAM. (!!)
>
> But you know why, don't you?
>
>> sat down and spent the best part of a day writing an MD5
>> implementation. Eventually I got it so that all the test vectors
>> work right. (Stupid little-endian nonsense... mutter mutter...) When
>> I tried it on a file containing more than 1 MB of data... o dear...
>
> Did you use Data.Binary or the other libraries for efficient access
> to gigabytes of data?
>
>> Of course, the first step in any serious attempt at performance
>> improvement is to actually profile the code to figure out where the
>> time is being spent.
>
> Well, actually, for our 'mean()' case, it means just using the right
> structures. Arrays for example:
>
>     import Data.Array.Vector
>     import Text.Printf
>
>     mean :: UArr Double -> Double
>     mean arr = b / fromIntegral a
>       where
>         k (n :*: s) a = n+1 :*: s+a
>         a :*: b = foldlU k (0 :*: 0) arr :: (Int :*: Double)
>
>     main = printf "%f\n" . mean $ enumFromToFracU 1 1e9
>
> For example,
>
>     $ time ./A
>     5.067109e8
>     ./A  3.53s user 0.00s system 99% cpu 3.532 total
>
> Try allocating an array of doubles in C, and getting similar results.
> (The compiler is optimising this to:
>
>     Main_zdszdwfold_info:
>         leaq    32(%r12), %rax
>         cmpq    %r15, %rax
>         movq    %rax, %r12
>         ja      .L10
>         movsd   .LC0(%rip), %xmm0
>         ucomisd %xmm5, %xmm0
>         jae     .L12
>         movq    %rsi, (%rax)
>         movq    $base_GHCziFloat_Dzh_con_info, -24(%rax)
>         movsd   %xmm6, -16(%rax)
>         movq    $base_GHCziBase_Izh_con_info, -8(%rax)
>         leaq    -7(%rax), %rbx
>         leaq    -23(%rax), %rsi
>         jmp     *(%rbp)
>     .L12:
>         movapd  %xmm6, %xmm0
>         addq    $1, %rsi
>         subq    $32, %r12
>         addsd   %xmm5, %xmm0
>         addsd   .LC2(%rip), %xmm5
>         movapd  %xmm0, %xmm6
>         jmp     Main_zdszdwfold_info
> )
>
> Not even any garbage collection. This should be pretty much the same
> performance-wise as unoptimised C.
>
>> almost any nontrivial program you write spends 60% or more of its
>> time doing GC rather than actual work.
>
> Ok, you're doing something very wrong. GC time is typically less than
> 15% of the running time of typical work programs I hack on. I bet
> you're using lists inappropriately?
>
>> I find it completely impossible to write code that doesn't crawl
>> along at a snail's pace. Even when I manage to make it faster, I
>> usually have no clue why.
>
> I think there is a problem that few people are taking the time to
> understand the compilation model of Haskell, while they've had the C
> runtime behaviour drilled into their brains since college. Until you
> sit down and understand what your Haskell code means, it will be very
> hard to reason about optimisations, unfortunately. GHC does what it
> does well, and it's predictable. As long as you understand the
> operations you're trying to make predictions about. I suggest
> installing ghc-core, and looking at how your code is compiled. Try
> many examples, and have a good knowledge of the STG paper.

My experience has been that simply understanding lazy/normal order
evaluation is enough to catch most of the big performance problems
beginners complain about. If you can't evaluate Haskell by hand at the
source level, you have no hope of understanding performance issues.

For going more for C-like speed, your advice seems more appropriate.
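The naive mean leaks because sum and length each traverse the list, so
the whole billion-element list is retained between the two passes. The
same fix works even without Data.Array.Vector: one pass, strict
accumulators. A minimal sketch, assuming only base (names are
illustrative, not from the thread):

    {-# LANGUAGE BangPatterns #-}

    -- One pass, strict accumulators: the list is consumed as it is
    -- generated, so nothing forces GHC to retain it.
    mean :: [Double] -> Double
    mean xs = s / fromIntegral n
      where
        (s, n) = go 0 (0 :: Int) xs
        go !acc !len []       = (acc, len)
        go !acc !len (y : ys) = go (acc + y) (len + 1) ys

    main :: IO ()
    main = print (mean [1 .. 1e9])  -- runs in constant space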
Re: [Haskell-cafe] Re: GHC predictability
derek.a.elkins:
> [... full quote of the previous message snipped ...]
>
> My experience has been that simply understanding lazy/normal order
> evaluation is enough to catch most of the big performance problems
> beginners complain about. If you can't evaluate Haskell by hand at
> the source level, you have no hope of understanding performance
> issues.
>
> For going more for C-like speed, your advice seems more appropriate.

Yes, I agree. You must first be able to reason about the evaluation
strategy on paper. If you want to get C-like speed, you should
understand Core. Thankfully, the knowledge of how to do this, and
tools to help, are improving.

-- Don
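A concrete instance of "reasoning about the evaluation strategy on
paper" is the classic hand reduction of a lazy left fold (foldlR below
is the Haskell Report definition of foldl, which is how sum over lists
was defined in base at the time); it shows exactly the thunk build-up
behind the mean example:

    -- The Report definition of foldl:
    foldlR :: (b -> a -> b) -> b -> [a] -> b
    foldlR _ z []       = z
    foldlR f z (x : xs) = foldlR f (f z x) xs

    -- Hand evaluation of foldlR (+) 0 [1,2,3] under lazy evaluation;
    -- the accumulator is never forced until the very end:
    --
    --   foldlR (+) 0 [1,2,3]
    --   = foldlR (+) (0 + 1) [2,3]
    --   = foldlR (+) ((0 + 1) + 2) [3]
    --   = foldlR (+) (((0 + 1) + 2) + 3) []
    --   = ((0 + 1) + 2) + 3
    --
    -- For a billion elements that nesting is a billion-thunk chain:
    -- the "several GB of RAM" seen earlier in the thread.

    main :: IO ()
    main = print (foldlR (+) 0 [1 .. 10 :: Int])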
[Haskell-cafe] Re: GHC predictability
Darrin Thompson [EMAIL PROTECTED] wrote:
> On Tue, May 13, 2008 at 2:20 AM, Don Stewart [EMAIL PROTECTED] wrote:
>> Note the use of strict pairs. Key to ensuring the accumulators end
>> up in registers. The performance difference here is due to foldl
>> (and all left folds) not fusing in normal build/foldr fusion. The
>> vector version runs about the same speed as unoptimised C.
>
> These tricks going into Real World Haskell? When you say someone
> needs to get familiar with the STG paper it scares me (a beginner)
> off a little, and I've been making an effort to approach the papers.
> I could barely understand the Fusion one, and getting familiar with
> compiler internals sounds like something I'd not be ready for.
> Probably if I really looked at ghc-core I'd be pleasantly surprised,
> but I'm totally biased against even looking. Gcc is hard to read,
> thus ghc is also. So while you are right about all this when you say
> it, I think your goal is to persuade. RWH has some of the best
> practical prose I've read yet. Well done there. Hopefully chapter 26
> will be crammed full of this stuff?

You know, sometimes I wish this would be the Eve forums, so that I
could just answer "FAIL". Anyway, the goal of the Haskell community is
to prevent success at any cost, so anything that is done to ease
things for noobs that is not purely meant to prevent anyone from
asking questions will be warded off by automatic defence systems of
the big ivory tower, which are reinforced faster than you can ever
hope to understand any topic.

To get a bit more on-topic: I currently completely fail to implement a
layout rule in Parsec because I don't understand its inner workings,
and thus constantly mess up my state. Parsec's ease of usage is
deceiving as soon as you use more than combinators: suddenly the
plumbing becomes important, and hackage is full of such things.
Haskell has potentially infinite learning curves, and each one of them
usually represents a wall. To make them crumble, you have to get used
to not understanding anything until you understand everything.

-- 
(c) this sig last receiving data processing entity. Inspect headers
for past copyright information. All rights reserved. Unauthorised
copying, hiring, renting, public performance and/or broadcasting of
this signature prohibited.
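For the layout problem mentioned above, one workable approach is to
keep a stack of indentation columns in Parsec's user state. A minimal
sketch, written against the modern Text.Parsec module (names are
illustrative, and it ignores tabs and the many real subtleties of a
full layout rule):

    import Text.Parsec

    -- User state: a stack of indentation columns, innermost first.
    type P a = Parsec String [Int] a

    currentCol :: P Int
    currentCol = sourceColumn <$> getPosition

    -- Open a block at the current column; close it by popping.
    openBlock, closeBlock :: P ()
    openBlock  = currentCol >>= \c -> modifyState (c :)
    closeBlock = modifyState (drop 1)

    -- Succeed only if the current position is at or beyond the
    -- enclosing block's indentation.
    checkIndent :: P ()
    checkIndent = do
      c     <- currentCol
      stack <- getState
      case stack of
        (top : _) | c >= top -> return ()
        _                    -> parserFail "bad indentation"

    -- Usage: start with an empty stack, e.g.
    --   runParser someParser [] "<layout>" input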
Re: [Haskell-cafe] Re: GHC predictability
ndmitchell:
> Hi
>
>> 1. What is ghc-core?
>
> You actually answer this question as part of question 2. Think of it
> as simple Haskell with some additional bits.
>
>> 2. Does anybody know how to actually read GHC's Core output anyway?
>> To me, it looks almost exactly like very, very complicated Haskell
>> source with a suspicious concentration of case expressions - but I
>> understand that in the Core language, many constructs actually mean
>> something quite different.
>
> There is one difference from standard Haskell I am aware of. In Core,
> "case x of _ -> 1" will evaluate x; in Haskell it won't. Other than
> that, it's just Haskell, but without pattern matching and only cases -
> hence the rather high number of cases.

Well, 'let's too, which bind either unlifted or lifted values. If
they're binding lifted things, that's a thunk being allocated.

-- Don
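A source-level approximation of that let/case distinction, using trace
to observe when a binding is actually forced (seq stands in for a
Core-style evaluating case; names are illustrative):

    import Debug.Trace (trace)

    main :: IO ()
    main = do
      -- let binds a thunk: x is never demanded, so the trace message
      -- for x never appears.
      let x = trace "thunk for x forced" (1 + 1 :: Int)
      print (const 0 x)
      -- seq forces y to WHNF first, the way a Core case would, so
      -- this trace message does appear.
      let y = trace "thunk for y forced" (2 + 2 :: Int)
      y `seq` print (const 0 y)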
Re: [Haskell-cafe] Re: GHC predictability
Andrew Coppin wrote:
> 2. Does anybody know how to actually read GHC's Core output anyway?
> To me, it looks almost exactly like very, very complicated Haskell
> source with a suspicious concentration of case expressions - but I
> understand that in the Core language, many constructs actually mean
> something quite different.

I can read most of GHC Core. I was never taught, and I never asked. I
did not read GHC's source code. I just guessed.

I agree with your assessment about the domination by case expressions.
They spell out the evaluation order. You need truckloads of them.

I slightly agree with your assessment about "complicated", but I call
it "detailed and tedious". This is an assembly language for functional
programming. If it weren't detailed and tedious, it would have no
right to exist.
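To make "they spell out the evaluation order" concrete, here is a tiny
function and, in the comment, roughly the Core GHC produces for it
with -O (schematic, hand-written from memory of -ddump-simpl output;
exact names and annotations vary by GHC version):

    addOne :: Int -> Int
    addOne x = x + 1

    main :: IO ()
    main = print (addOne 41)

    -- Approximate Core: the case forces x to WHNF and unpacks the raw
    -- machine integer before the primitive addition happens:
    --
    --   addOne = \ (x :: Int) ->
    --     case x of
    --       I# x# -> I# (+# x# 1#)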
Re: [Haskell-cafe] Re: GHC predictability
On 14 May 2008, at 8:58 am, Andrew Coppin wrote:
> What I'm trying to say [and saying very badly] is that Haskell is an
> almost terrifyingly subtle language.

Name me a useful programming language that isn't. Simply interchanging
two for-loops, from

    for (i = 0; i < N; i++) for (j = 0; j < N; j++)
to
    for (j = 0; j < N; j++) for (i = 0; i < N; i++)

when marching over an array, can easily slow you down by nearly two
orders of magnitude in C. [Hint: read "What every computer scientist
needs to know about memory".] For a real shock, take a look at what
your C++ templates are doing...

There's one big difference between Haskell and language T (my other
preferred language). Seemingly insignificant changes in Haskell can
kill performance, but seemingly insignificant changes in language T
can take you into territory the library designers never thought of,
where there are lions, tigers, and bears in abundance.

Unexpectedly slow is better than inexplicably buggy.