[Haskell-cafe] Re: [Haskell] specification of sum
On 2005-11-02, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote: (Moving this to the cafe.) G'day all. Quoting Cale Gibbard [EMAIL PROTECTED]: We already do rely on them in most cases. Of course, not every property can be proved by the compiler, but many pieces of code are going to assume quite a lot. Agreed. I think that the assumption that (+) and (*) in Num define something like a ring on the given type is a sensible one. I'm not so certain. Octonian multiplication, to pick one example, is not associative, but I'd like to be able to use (*) nonetheless. (*) is already defined as being left-associative in H98. I'm actually working with non-associative structures at the moment where the operation is usually considerd multiplication, but I still wouldn't want to use (*) for it. Too much hassle for normal uses. -- Aaron Denney -- ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
RE: [Haskell] specification of sum
On 02 November 2005 17:06, Scherrer, Chad wrote: Surely not... sum is defined by Haskell 98 as: sum = foldl (+) 0 and this is exactly what GHC provides. Furthermore we have specialised strict versions for Int and Integer. I'd been using ghci for testing along the way and getting terrible results; does the specialization only apply to ghc per se? Yes, you need 'ghc -O' to get the optimised versions. The performance characteristics of unoptimised (including interpreted) code are often quite different. You can load the optimised module into GHCi, of course. Cheers, Simon ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
RE: [Haskell] specification of sum
On 02 November 2005 00:20, Lennart Augustsson wrote: Furthermore, ghc has a WRONG definition of sum. Surely not... sum is defined by Haskell 98 as: sum = foldl (+) 0 and this is exactly what GHC provides. Furthermore we have specialised strict versions for Int and Integer. Also, we shouldn't be turning overloaded functions into class methods purely for the purposes of providing optimised versions; that's what the SPECIALISE pragma is for. Cheers, Simon ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
RE: [Haskell] specification of sum
Surely not... sum is defined by Haskell 98 as: sum = foldl (+) 0 and this is exactly what GHC provides. Furthermore we have specialised strict versions for Int and Integer. I'd been using ghci for testing along the way and getting terrible results; does the specialization only apply to ghc per se? Cheers, Simon Also, Cale, I was thinking about your comment about formal power series, and I don't see that (+) should not be strict in this case. In particular, if they are represented as infinite lists, I agree that zipWith (+) works just fine, though it is strict but lazy. Here is the strictness: zipWith (+) undefined [1,2,3] == undefined zipWith (+) [1,2,3] undefined == undefined And here is the laziness: head $ zipWith (+) (1:undefined) (2:undefined) == 3 Or am I missing something? -Chad ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] specification of sum
On 02/11/05, Scherrer, Chad [EMAIL PROTECTED] wrote: Surely not... sum is defined by Haskell 98 as: sum = foldl (+) 0 and this is exactly what GHC provides. Furthermore we have specialised strict versions for Int and Integer. I'd been using ghci for testing along the way and getting terrible results; does the specialization only apply to ghc per se? Cheers, Simon Also, Cale, I was thinking about your comment about formal power series, and I don't see that (+) should not be strict in this case. In particular, if they are represented as infinite lists, I agree that zipWith (+) works just fine, though it is strict but lazy. Here is the strictness: zipWith (+) undefined [1,2,3] == undefined zipWith (+) [1,2,3] undefined == undefined And here is the laziness: head $ zipWith (+) (1:undefined) (2:undefined) == 3 Or am I missing something? -Chad Oh, well that's true, but I suppose that by strictness I mean that it doesn't completely force the evaluation of its arguments to normal form, as shown in the second example. If it was more strict, that would also be undefined, as would be the sum of any two infinite lists. I suppose it's somewhat of a matter of perspective: are you passing it a list, or a cons cell? In the case of power series, I was thinking of the whole power series as the parameter, not just the outermost data constructor. - Cale ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] specification of sum
Simon Marlow wrote: On 02 November 2005 00:20, Lennart Augustsson wrote: Furthermore, ghc has a WRONG definition of sum. Surely not... sum is defined by Haskell 98 as: sum = foldl (+) 0 and this is exactly what GHC provides. Furthermore we have specialised strict versions for Int and Integer. Also, we shouldn't be turning overloaded functions into class methods purely for the purposes of providing optimised versions; that's what the SPECIALISE pragma is for. You are absolutly right, sum is defined with foldl. I wonder why my hbc prelude had it defined with foldr? (This should teach me not to look at bit rotted code.) -- Lennart ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] specification of sum
On Wed, Nov 02, 2005 at 11:18:13AM -, Simon Marlow wrote: Also, we shouldn't be turning overloaded functions into class methods purely for the purposes of providing optimised versions; that's what the SPECIALISE pragma is for. I am a little torn on the issue, on one hand, if it is purely for performance, then yeah, that makes sense, and SPECIALISE is a pretty key pragma for any compiler. (so much so that I have 5 variations on it in jhc :) ). however, having a default of (+) a b = sum [a,b] might be useful if sum and product are more straightforward to define for an instance than + and *. however, I don't know if this ever actually occurs in practice. John -- John Meacham - ⑆repetae.net⑆john⑈ ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] specification of sum
On 11/1/05, Scherrer, Chad [EMAIL PROTECTED] wrote: I was wondering... In my experience, it's worked much better to use sum' = foldl' (+) 0 than the built-in sum function, which leaks memory like crazy for large input lists. I'm guessing the built-in definition is sum = foldr (+) 0 But as far as I know, (+) is always strict, so foldl' seems much more natural to me. Is there a case where the build-in definition is preferable? The library definition in ghc actually uses foldl. It's conceivable that you may not want (+) to be non-strict for certain data types. The question then becomes, is there a case where you want _sum_ to be non-strict? /S -- Sebastian Sylvan +46(0)736-818655 UIN: 44640862 ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] specification of sum
On 01/11/05, Sebastian Sylvan [EMAIL PROTECTED] wrote: On 11/1/05, Scherrer, Chad [EMAIL PROTECTED] wrote: I was wondering... In my experience, it's worked much better to use sum' = foldl' (+) 0 than the built-in sum function, which leaks memory like crazy for large input lists. I'm guessing the built-in definition is sum = foldr (+) 0 But as far as I know, (+) is always strict, so foldl' seems much more natural to me. Is there a case where the build-in definition is preferable? The library definition in ghc actually uses foldl. It's conceivable that you may not want (+) to be non-strict for certain data types. The question then becomes, is there a case where you want _sum_ to be non-strict? I'd argue that the likely answer is no, given that (+) is an operation in an Abelian group and not a monoid. Regardless of whether (+) is strict, when you compute a sum you will always have to consume the entire list. This is because until you have observed the last element of the list, nothing can be said about the final result. To see this, it's enough to see that even if the sum of all the other elements of the list is g, the final element could be -g + h, which would give a final result of h, regardless of what g is. On the other hand, you don't always want product to be strict, since (*) is a monoid operation, and so you actually can have information about the final result long before the list is completely consumed. As a simple example, consider computing the product of [0..1000]. Unfortunately, giving a general definition to product seems to preclude the possibility of allowing for this kind of optimisation in general, since in general, ring elements aren't comparable for equality (though Eq is a superclass of Num, you can't always define (==) sensibly) and even if you can test for equality, without knowing anything extra about your ring, you can't use anything else about the specific monoid to help shortcut the calculation. - Cale ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] specification of sum
On 01/11/05, Scherrer, Chad [EMAIL PROTECTED] wrote: I was wondering... In my experience, it's worked much better to use sum' = foldl' (+) 0 than the built-in sum function, which leaks memory like crazy for large input lists. I'm guessing the built-in definition is sum = foldr (+) 0 But as far as I know, (+) is always strict, so foldl' seems much more natural to me. Is there a case where the build-in definition is preferable? Chad Scherrer Computational Mathematics Group Pacific Northwest National Laboratory You don't always want (+) to be strict. Consider working with the ring of formal power series over, say, the integers. You don't want (+) to force the evaluation of an infinite formal summation which is passed to it, since that's an infinite loop, so it will have to be non-strict, somewhat like zipWith (+) over the lists of coefficients. - Cale ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] specification of sum
Cale Gibbard wrote: On 01/11/05, Sebastian Sylvan [EMAIL PROTECTED] wrote: On 11/1/05, Scherrer, Chad [EMAIL PROTECTED] wrote: I was wondering... In my experience, it's worked much better to use sum' = foldl' (+) 0 than the built-in sum function, which leaks memory like crazy for large input lists. I'm guessing the built-in definition is sum = foldr (+) 0 But as far as I know, (+) is always strict, so foldl' seems much more natural to me. Is there a case where the build-in definition is preferable? The library definition in ghc actually uses foldl. It's conceivable that you may not want (+) to be non-strict for certain data types. The question then becomes, is there a case where you want _sum_ to be non-strict? I'd argue that the likely answer is no, given that (+) is an operation in an Abelian group and not a monoid. Regardless of whether (+) is strict, when you compute a sum you will always have to consume the entire list. (+) is not an operation in an Abelian group. (+) is a function with type a-a-a for some particular a. Haskell makes no mention about (+) having any other properties. Furthermore, ghc has a WRONG definition of sum. You cannot change a foldr into a foldl, unless the operator is associative, and (+) does not have to be. -- Lennart PS. I think additional properties of class methods would be great, but since Haskell cannot enforce them we should not rely on them. ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
RE: [Haskell] specification of sum
You don't always want (+) to be strict. Consider working with the ring of formal power series over, say, the integers. You don't want (+) to force the evaluation of an infinite formal summation which is passed to it, since that's an infinite loop, so it will have to be non-strict, somewhat like zipWith (+) over the lists of coefficients. - Cale Hmm, this is a good point, but for most people, It seems like the most common usage would be to add up a list of actual concrete numbers, and the resulting memory leak in the code using sum is at least a minor annoyance. It's hard to say how much time a given newbie will take to catch this nuance. Since sum' = foldl' (+) 0 (like foldl', the ' means strict ) is so often preferable, I'll go so far as to suggest it be included it in upcoming versions of Data.List. That way it would be hard to miss, and would remove what could otherwise be a very common stumbling block for anyone doing numerical work with Haskell. I haven't used product so extensively, but I suspect there may be similar issues with it? -Chad ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] specification of sum
On 01/11/05, Lennart Augustsson [EMAIL PROTECTED] wrote: Cale Gibbard wrote: On 01/11/05, Sebastian Sylvan [EMAIL PROTECTED] wrote: On 11/1/05, Scherrer, Chad [EMAIL PROTECTED] wrote: I was wondering... In my experience, it's worked much better to use sum' = foldl' (+) 0 than the built-in sum function, which leaks memory like crazy for large input lists. I'm guessing the built-in definition is sum = foldr (+) 0 But as far as I know, (+) is always strict, so foldl' seems much more natural to me. Is there a case where the build-in definition is preferable? The library definition in ghc actually uses foldl. It's conceivable that you may not want (+) to be non-strict for certain data types. The question then becomes, is there a case where you want _sum_ to be non-strict? I'd argue that the likely answer is no, given that (+) is an operation in an Abelian group and not a monoid. Regardless of whether (+) is strict, when you compute a sum you will always have to consume the entire list. (+) is not an operation in an Abelian group. (+) is a function with type a-a-a for some particular a. Haskell makes no mention about (+) having any other properties. Furthermore, ghc has a WRONG definition of sum. You cannot change a foldr into a foldl, unless the operator is associative, and (+) does not have to be. -- Lennart PS. I think additional properties of class methods would be great, but since Haskell cannot enforce them we should not rely on them. We already do rely on them in most cases. Of course, not every property can be proved by the compiler, but many pieces of code are going to assume quite a lot. For example, the relation defined by (=) in the Ord class is generally assumed to be at least a partial order, and in most cases, a total ordering, but nothing about its type tells you that. Functions will simply not work as intended when = is not a partial order (in some cases resulting in nontermination!), and there are quite a few places where it's even assumed to be total. Similarly, (==) is assumed to be an equivalence relation. The monoid operations in Data.Monoid are assumed to define a monoid. The monad laws are not enforced, but we rely on those too. In reality, classes come with proof obligations -- it would be nice if the compiler could somehow summarise these, but I think it's too much at this point to expect it to find a proof in most practical cases. I think that the assumption that (+) and (*) in Num define something like a ring on the given type is a sensible one. It would be better to call the operations you had something else if they didn't (at least approximately) satisfy the ring axioms, as people are going to look at your notation and assume things like distributivity, etc. hold at least in approximation. (With types like Float, you're not going to get half of them exactly, but they'll still usually come close to holding, as you're emulating some fragment of the real numbers.) - Cale ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] specification of sum
The solution would be to bring 'sum' and 'product' into the Num class so the most efficient version for each type can be used and the default is no worse than the current non-class versions. (this is even pretty much completly backwards compatable so could be considered for haskell 06) I'd also like to see 'join' and 'ap' added to Monad while we are at it. John -- John Meacham - ⑆repetae.net⑆john⑈ ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell] specification of sum
On 01/11/05, John Meacham [EMAIL PROTECTED] wrote: The solution would be to bring 'sum' and 'product' into the Num class so the most efficient version for each type can be used and the default is no worse than the current non-class versions. (this is even pretty much completly backwards compatable so could be considered for haskell 06) I'd also like to see 'join' and 'ap' added to Monad while we are at it. John Yes! I completely agree :) - Cale ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
[Haskell-cafe] Re: [Haskell] specification of sum
(Moving this to the cafe.) G'day all. Quoting Cale Gibbard [EMAIL PROTECTED]: We already do rely on them in most cases. Of course, not every property can be proved by the compiler, but many pieces of code are going to assume quite a lot. Agreed. I think that the assumption that (+) and (*) in Num define something like a ring on the given type is a sensible one. I'm not so certain. Octonian multiplication, to pick one example, is not associative, but I'd like to be able to use (*) nonetheless. Cheers, Andrew Bromage ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe