Re: [Haskell-cafe] Strict type system allows for a maximum number of programming errors to be caught at compile time.
On 04/05/2010, at 13:30, Luke Palmer wrote: On Mon, May 3, 2010 at 11:07 AM, Kyle Murphy orc...@gmail.com wrote: The fact that it doesn't is proof enough that there's a problem with it even if that problem is simply that the types you're using aren't exactly correct. Further, I'd argue that in the first instance with a non-strict type system, the instance of wrong code that compiles would be higher. The only argument to support non-strict typing would be if you could show that it takes less time to track down runtime bugs than it does to fix compile time type errors, and any such claim I'd be highly skeptical of. Clearly. But many people believe in this methodology, and use test suites and code coverage instead of types. Indeed, such practices are essentially empirical type checking, and they afford the advantage that their verification is much more expressive (however less reliable) than our static type system, because they may use arbitrary code to express their predicates. I don't think it's a question of types vs. testing. Rather, it's types + testing vs. just testing. How is the latter more expressive than the former for defining properties of programs? Also, testing loses a lot of appeal once you start dealing with concurrent programs. Testing for this program doesn't have race conditions isn't exactly easy. You want as many static guarantees as possible. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Parallel Haskell: 2-year project to push real world use
On 03/05/2010, at 22:04, Johan Tibell wrote: On Mon, May 3, 2010 at 11:12 AM, Simon Peyton-Jones simo...@microsoft.com wrote: | Does this mean DPH is ready for abuse? | | The wiki page sounds pretty tentative, but it looks like it's been a while | since it's been updated. | | http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell In truth, nested data parallelism has taken longer than we'd hoped to be ready for abuse :-). We have not lost enthusiasm though -- Manuel, Roman, Gabi, Ben, and I talk on the phone each week about it. I think we'll have something usable by the end of the summer. That's very encouraging! I think people (me included) have gotten the impression that the project ran into problems so challenging that it stalled. Perhaps a small status update once in a while would give people a better idea of what's going on. :) We keep running into challenging problems (the last one was a volcano) but we never stall. Things like the new GHC inliner, vector, repa are all part of DPH work. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Parallel Haskell: 2-year project to push real world use
On 04/05/2010, at 09:21, Christian Höner zu Siederdissen wrote: Hi, on that topic, consider this (rather trivial) array: a = array (1,10) [ (i,f i) | i <- [1..10]] where f 1 = 1 f 2 = 1 f i = a!(i-1) + a!(i-2) (aah, school ;) Right now, I am abusing vector in ST by doing this: a <- new; a' <- freeze a; forM_ [3..10] $ \i -> do write a (a'!(i-1) + a!(i-2)) Let's say I wanted to do something like this in dph (or repa), does that work? We are actually using this for RNA folding algorithms that are at least O(n^3) time. For some of the more advanced stuff, it would be really nice if we could just parallelize. Do you really just need a prefix sum? These are easily parallelisable if the operator is associative. For instance, you could implement the Fibonacci sequence as: mapP fst $ scanP (\(a,b) _ -> (a+b,a)) (1,0) $ replicateP n (0,0) and DPH would parallelise it. That's how I would write the above with vector as well. To summarise: I need arrays that allow in-place updates. In-place updates + parallelism = bad! That's oversimplifying, of course. But the transformations underlying DPH, for instance, simply don't work in the presence of side effects. Otherwise, most libraries that do heavy stuff (O(n^3) or worse) are using vector right now. On a single core, it performs really great -- even compared to C-code that has been optimized a lot. That's great to know! Do you (or anyone else) by any chance have any benchmarks you could share? At the moment, I'm only benchmarking vector with a couple of rather simplistic algorithms which is a bit of a problem. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
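A rough sketch of the same scan-based formulation with the vector package (this assumes a vector version that provides prescanl, the exclusive left scan analogous to scanP; it is an illustration, not code from the thread):

    import qualified Data.Vector.Unboxed as U

    -- First n Fibonacci numbers, built by scanning over a dummy vector just as
    -- in the scanP version above.
    fibs :: Int -> U.Vector Int
    fibs n = U.map fst
           $ U.prescanl (\(a, b) _ -> (a + b, a)) (1, 0)
           $ U.replicate n (0 :: Int, 0 :: Int)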
Re: [Haskell-cafe] Haskell and the Software design process
On 03/05/2010, at 06:02, Jaco van Iterson wrote: I was just wondering what methods are best to design/model the software in bigger projects when you are planning to use Haskell. Is there no difference compared to other languages? Are there any Haskell tools? In addition to what Don said, here are a couple of things I've learned. This is just from personal experience so YMMV. Design in Haskell is much more often bottom-up than in, say, traditional OO where it's frequently top-down all the way. I believe this is mainly due to purity. When you have some kind of global state, your design process often has to be top-down because of intricate interactions between program components which modify that state. Designing Haskell software tends to involve much fewer diagrams than OO. Your most important design tool is the type system. You can often express large chunks of your design through types and have the compiler check and enforce them. Fiddling with types is often part of the design process and should be treated accordingly. If you stumble on a useful design pattern, think about how to encode it in the type system (this is quite different from OO patterns). Higher-order functions and type classes are very powerful tools for reducing coupling and for implementing design patterns. Prototyping is very cheap and easy. Writing prototypes and playing with them in ghci allows you to see how your subsystems will behave and adjust the design accordingly. In general, you ought to write code (esp. type signatures) while designing. Some libraries/subsystems will evolve into or start out as EDSLs. This is good and should be encouraged. Identifying EDSLs that would be useful for implementing your software is an important step in the design process. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers
On 24/04/2010, at 22:42, Roman Leshchinskiy wrote: On 24/04/2010, at 22:06, Barak A. Pearlmutter wrote: I was thinking of this: data T = T Double deriving ( Eq, Ord ) ... GHC basically produces instance Ord T where compare (T x) (T y) = compare x y t < u = compare t u == LT That is indeed what it does. Which is a plain old bug, since it leads to inconsistent behaviour between wrapped vs unwrapped values. *Main> T (0/0) == T (0/0) False *Main> T (0/0) < T (0/0) False *Main> T (0/0) > T (0/0) True *Main> (0/0) > (0/0) False Urgh. You're right, I hadn't thought of this. Would you care to submit a bug report? I submitted one but on further reflection, this is not so simple. Let's look at pairs as an example. At the moment, (>) is implemented basically like this: (a,b) > (c,d) = case compare a c of LT -> False; EQ -> b > d; GT -> True Of course, this means that (0/0,'a') > (0/0,'a'). So we could change the implementation: (a,b) > (c,d) = a > c || (a == c && b > d) But now we compare a to c twice which is very bad for, say, ([Int],Int). Clearly, we want to use the first definition but it leads to inconsistent results for Doubles. I don't see how to solve this while keeping IEEE semantics of silent NaNs. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The instability of Haskell libraries
On 24/04/2010, at 07:29, Don Stewart wrote: Oh, the Platform has very strict standards about APIs, What is an API? The package versioning policy only seems to talk about types and function signatures. John's old-locale example shows that this is not enough. Would it perhaps make sense for at least the Platform to require packages to have unit tests and to require versions to be bumped whenever those change (sufficiently)? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The instability of Haskell libraries
On 24/04/2010, at 18:06, Ivan Lazar Miljenovic wrote: Roman Leshchinskiy r...@cse.unsw.edu.au writes: On 24/04/2010, at 07:29, Don Stewart wrote: Oh, the Platform has very strict standards about APIs, What is an API? The package versioning policy only seems to talk about types and function signatures. John's old-locale example shows that this is not enough. I would think that the API is all the functions/classes/datatypes/instances/etc. exported from the library in combination with their types. So the semantics of those functions doesn't matter at all? Would it perhaps make sense for at least the Platform to require packages to have unit tests and to require versions to be bumped whenever those change (sufficiently)? I don't get this; just because someone changes a unit test (because they thought of a new case, etc.) they should bump the package version even if all the changes were internal and not exported? Adding new tests (i.e., new postconditions) doesn't change the API. Loosening preconditions doesn't, either. Also, the tests would only cover the exposed part of the library, of course. Internal tests are of no concern to the library's clients. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] The instability of Haskell libraries
On 24/04/2010, at 18:54, Ivan Lazar Miljenovic wrote: Roman Leshchinskiy r...@cse.unsw.edu.au writes: On 24/04/2010, at 18:06, Ivan Lazar Miljenovic wrote: I would think that the API is all the functions/classes/datatypes/instances/etc. exported from the library in combination with their types. So the semantics of those functions doesn't matter at all? What do you refer to by semantics? Can you provide an example of when what you consider to be the API to change when the functions, types, etc. don't? John Goerzen gave one in the very first post of this thread: the fix to old-locale which didn't change any types but apparently changed the behaviour of a function quite drastically. Another example would be a change to the Ord instances for Float and Double which would have compare raise an exception on NaNs as discussed in a different thread on this list. Another one, which is admittedly silly but demonstrates my point, would be changing the implementation of map to map _ _ = [] In general, any significant tightening/changing of preconditions and loosening/changing of postconditions would qualify. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers
On 24/04/2010, at 19:56, Barak A. Pearlmutter wrote: And yet a lot of generic code is written in terms of compare. That can be an advantage, because often that code *should* blow up when it gets a NaN. E.g., sorting a list of Floats which includes a NaN. However, often you will know that the list doesn't contain NaNs and will still have to pay a performance penalty. It's a question of what the right default is - safety or performance. In the case of floating point numbers, I'm leaning towards performance. That said, I would be very much in favour of providing a SafeFloat or whatever type with much safer semantics than IEEE floats and trying to get people to use that type by default unless they really need the performance. Even deriving(Ord) only produces compare and relies on standard definitions for other methods. I don't think that's actually a problem. Surely the IEEE Floating Point types would give their own definitions of not just compare but also <, <=, etc., overriding the problematic deriving(Ord) definitions of comparison in terms of compare and vice-versa. I was thinking of this: data T = T Double deriving ( Eq, Ord ) Unless I'm mistaken, at the moment GHC basically produces instance Ord T where compare (T x) (T y) = compare x y t < u = compare t u == LT ... That is, all comparisons on T would be paying the NaN performance tax. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
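A minimal sketch of the SafeFloat idea floated above (the name SafeDouble and the design are illustrative assumptions, not an existing API): a wrapper whose compare refuses to invent an ordering for NaN, at the cost of the extra checks discussed in this thread.

    -- Eq keeps IEEE semantics (NaN /= NaN); only the ordering is made strict.
    newtype SafeDouble = SafeDouble Double
      deriving (Eq, Show)

    instance Ord SafeDouble where
      compare (SafeDouble x) (SafeDouble y)
        | isNaN x || isNaN y = error "compare: no ordering for NaN"
        | x < y              = LT
        | x == y             = EQ
        | otherwise          = GT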
Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers
On 24/04/2010, at 22:06, Barak A. Pearlmutter wrote: Currently the standard prelude has default definition: ... compare x y | x == y = EQ | x <= y = LT | otherwise = GT I'd suggest [...] compare x y | x < y = LT | x == y = EQ | x > y = GT | otherwise = error "no consistent ordering" It is not clear to me that this would cause a measurable performance hit in the case of floating point numbers. We're talking about at most two extra instructions: a compare and a conditional branch. The problem is not so much the additional instructions. Rather, it's the fact that compare for Float and Double can fail at all which inhibits some optimisations. For instance, GHC is free to eliminate the comparison in (x `compare` y) `seq` a but wouldn't be with your change. It doesn't actually do that at the moment, which looks like an optimiser deficiency to me. But in any case, the property "can fail" has a significant effect on optimisations sometimes. I was thinking of this: data T = T Double deriving ( Eq, Ord ) ... GHC basically produces instance Ord T where compare (T x) (T y) = compare x y t < u = compare t u == LT That is indeed what it does. Which is a plain old bug, since it leads to inconsistent behaviour between wrapped vs unwrapped values. *Main> T (0/0) == T (0/0) False *Main> T (0/0) < T (0/0) False *Main> T (0/0) > T (0/0) True *Main> (0/0) > (0/0) False Urgh. You're right, I hadn't thought of this. Would you care to submit a bug report? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers
On 23/04/2010, at 01:34, Barak A. Pearlmutter wrote: I'd suggest that compare involving a NaN should yield error "violation of the law of the excluded middle" Please think of the poor guys trying to write high-performance code in Haskell! Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers
On 24/04/2010, at 07:15, Barak A. Pearlmutter wrote: In all seriousness, I think it is reasonable when isNaN x for x < C, x == C, x > C, C < x, C == x, C > x to all be False, for all floats C, even C=x, as a sort of efficient weak Bool bottom. This is what the FP hardware does --- so it is very efficient. But if you force the system to choose one of the three, which is what compare x C is doing, I think the result should be _|_. Because there is no way to choose, no reasonable Ordering to return. It is possible to write generic Ord n => code under these conditions, if you're careful to case out <, ==, >, when you don't want a NaN to kill the computation, and when necessary handle the case that all three come out false. That's what good numeric programmers actually do. But compare giving a wrong Ordering is an invitation to get it wrong. And yet a lot of generic code is written in terms of compare. Even deriving(Ord) only produces compare and relies on standard definitions for other methods. Don't get me wrong, I don't think the current situation is ideal (although it doesn't seem all that bad to me). But this change would have far-reaching implications for performance which ought to be evaluated before it can be seriously considered, in my opinion. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
index*OffAddr
In package vector, primitive vectors (the ones that Data.Vector.Unboxed is built on top of) are represented as follows (ByteArray and friends are wrappers for various GHC primitives provided by package primitive): data Vector a = Vector Int Int ByteArray (the three fields are the offset into the ByteArray, the length, and the data) This representation supports cheap slicing which is quite crucial. However, indexing into such vectors is a bit more expensive than necessary: index (Vector i _ arr) j = indexByteArray arr (i+j) Ultimately, this requires 2 additions to get the element's address: base address of the ByteArray + ((i + j) * size of element) I'd like to always allocate pinned ByteArrays and store the starting address of the vector instead of the offset: data Vector a = Vector Addr Int ByteArray This would make indexing cheaper as it would require only one addition: index (Vector addr _ _) i = indexOffAddr addr i This is quite a big deal if indexing happens in an inner loop (some algorithms become up to 20% faster). Of course, the backend could optimise the offset-based version by performing partial redundancy elimination but it doesn't and it probably wouldn't get all interesting cases even if it did. So the second version is better. The problem is that I can't implement it because I must touch the ByteArray after accessing the memory. This results in code like this which hides the constructor, breaking various optimisations: case indexIntOffAddr# addr# i# of { n# -> case touch# arr# realWorld# of { _ -> I# n# }} After thinking about this for a while, I came up with two possible solutions. One is to provide a pure version of touch#: use# :: o -> o' -> o' such that use# x y = y. This would change the code above to: I# (use# arr# (indexIntOffAddr# addr# i#)) I don't know how to implement this, though, because use# would have to be able to return arbitrary (unboxed) types and the code generator doesn't really seem to support this. A perhaps simpler solution is to add a new set of primitives: indexIntOffAddrUsing# :: o -> Addr# -> Int# -> Int# ... These would take an additional argument which they'd touch and otherwise ignore. The code would then become: I# (indexIntOffAddrUsing# arr# addr# i#) Incidentally, the index*OffAddr# primitives don't seem to be used anywhere. Any thoughts? Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: [Haskell-cafe] vector recycling
On 18/04/2010, at 08:07, Ben wrote: On Fri, Apr 16, 2010 at 11:19 PM, Roman Leshchinskiy r...@cse.unsw.edu.au wrote: That said, it would be quite possible to provide something like the following: fold_inplace :: Vector v a => (v a -> b -> v a) -> v a -> [b] -> v a as far as i understand there would be two ways of writing such a function : 1) to use mutable vectors monadically underneath and hide them inside some kind of unsafeX, or 2) to give a specialized fold with sufficient hints to the compiler to use the rewriting framework. Right, I meant 2. I'm not saying it's necessarily a good idea, just that it would be possible. This could use the recycling framework to safely do as much in-place as possible while still preserving a purely functional interface. I have to think about it. Really, this looks like just a poor man's substitute for linear types. although i am supposed to know something about category theory, since my training is in math, i don't know about girard's later work. is there a short precis you can give (or a pointer?) This is a nice introduction: http://homepages.inf.ed.ac.uk/wadler/papers/linear/linear.ps Also, Clean's uniqueness types are quite similar. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector recycling
On 17/04/2010, at 13:32, Ben wrote: module Main where import qualified Data.Vector.Generic as V import qualified Data.Vector.Unboxed as UV type Vec = UV.Vector Double axpy :: Double -> Vec -> Vec -> Vec axpy a x y = V.zipWith (+) (V.map (* a) x) y sumVecs :: [(Double, Vec)] -> Vec sumVecs axs = let (a, x) = head axs in foldl adder (V.map (* a) x) (tail axs) where adder :: Vec -> (Double, Vec) -> Vec adder v1 (a, x) = axpy a x v1 how do i write this in a way which ensures recycling / fusion, e.g. in-place updates? Unfortunately, recycling won't help you here. It is a purely local optimisation which doesn't work across function boundaries (inlining notwithstanding) and recursive calls. Your best bet is to use a mutable vector and do the fold in the ST monad. That said, it would be quite possible to provide something like the following: fold_inplace :: Vector v a => (v a -> b -> v a) -> v a -> [b] -> v a This could use the recycling framework to safely do as much in-place as possible while still preserving a purely functional interface. I have to think about it. Really, this looks like just a poor man's substitute for linear types. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
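A sketch of the mutable-vector-in-ST version suggested above (it assumes all vectors have the same length and uses the Data.Vector.Unboxed.Mutable interface as found in current vector; treat the exact names as assumptions):

    import Control.Monad (forM_)
    import Control.Monad.ST (runST)
    import qualified Data.Vector.Unboxed as U
    import qualified Data.Vector.Unboxed.Mutable as M

    type Vec = U.Vector Double

    -- Accumulate a*x into one mutable vector, then freeze it once at the end.
    sumVecs :: [(Double, Vec)] -> Vec
    sumVecs []               = U.empty
    sumVecs ((a0, x0) : axs) = runST $ do
      acc <- M.new (U.length x0)
      forM_ [0 .. U.length x0 - 1] $ \i ->
        M.write acc i (a0 * U.unsafeIndex x0 i)
      forM_ axs $ \(a, x) ->
        forM_ [0 .. U.length x - 1] $ \i -> do
          v <- M.read acc i
          M.write acc i (v + a * U.unsafeIndex x i)
      U.unsafeFreeze acc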
Re: [Haskell-cafe] Strange error with type classes + associated types
On 17/04/2010, at 11:00, Conal Elliott wrote: I'm unsure now, but I think I tried making Basis a data type (not syn) and ran into the problem I mentioned above. The Basis *synonyms* also have HasTrie instances, which is crucially important. If we switch to (injective) data types, then we lose the HasTrie instances. I'd be okay with defining HasTrie instances (preferably via deriving) for the associated Basis data types, but I couldn't figure out how to. Maybe it's not possible currently, or maybe I just didn't know how. Could you perhaps make (:-*) a proper type rather than a synonym? That would help with the ambiguity. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Floyd Warshall performance (again)
On 16/04/2010, at 18:06, Mathieu Boespflug wrote: shortestPath :: [(Int, Int, Int)] -> UArray Int Int shortestPath g = runSTUArray $ do let mnew = newArray (0, SIZE * SIZE) 1 mread arr i j = unsafeRead arr (i * SIZE + j) mwrite arr i j x = unsafeWrite arr (i * SIZE + j) x unsafeIOToST $ hSetBuffering stdout LineBuffering unsafeIOToST $ putStrLn "Allocating ..." pm <- mnew unsafeIOToST $ putStrLn "Allocating ... done" let loop1 SIZE = return () loop1 k = let loop2 SIZE = return () loop2 i = let loop3 SIZE = return () loop3 j = do xij <- mread pm i j xik <- mread pm i k xkj <- mread pm k j mwrite pm i j (min xij (xik + xkj)) loop3 (j + 1) in loop3 0 loop2 (i + 1) in loop2 0 loop1 (k + 1) loop1 0 return pm In general, GHC doesn't like nested loops. You might want to try the following structure: loop1 SIZE = return () loop1 k = loop2 k 0 loop2 k SIZE = loop1 (k+1) loop2 k i = loop3 k i 0 loop3 k i SIZE = loop2 k (i+1) loop3 k i j = do ... loop3 k i (j+1) And, as Max suggested, the llvm backend ought to improve things. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
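The suggested flattened loop structure, written out as a complete sketch with an ordinary Int constant instead of the CPP-defined SIZE (the constant, the initial value 1000000 and the omitted graph initialisation are assumptions made purely for illustration):

    import Data.Array.Base (unsafeRead, unsafeWrite)
    import Data.Array.ST (newArray, runSTUArray)
    import Data.Array.Unboxed (UArray)

    size :: Int
    size = 1500

    shortestPath :: UArray Int Int
    shortestPath = runSTUArray $ do
      pm <- newArray (0, size * size - 1) 1000000
      let mread  i j   = unsafeRead  pm (i * size + j)
          mwrite i j x = unsafeWrite pm (i * size + j) x
          loop1 k | k == size = return ()
                  | otherwise = loop2 k 0
          loop2 k i | i == size = loop1 (k + 1)
                    | otherwise = loop3 k i 0
          loop3 k i j
            | j == size = loop2 k (i + 1)
            | otherwise = do
                xij <- mread i j
                xik <- mread i k
                xkj <- mread k j
                mwrite i j (min xij (xik + xkj))
                loop3 k i (j + 1)
      loop1 0
      return pm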
Re: Can't install Criterion package on ghc ..
On 15/04/2010, at 02:55, John Lato wrote: The problem isn't with criterion itself, but with vector-algorithms. The vector library relies heavily on type families, which have dodgy support in ghc-6.10. As a matter of fact, this particular problem is easy to fix by adding a couple of type signatures. I used to run into it frequently back when I was still working with 6.10. I'll send a patch. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: [Haskell-cafe] Vector to Monadic Stream and back, how?
On 14/04/2010, at 09:05, Xiao-Yong Jin wrote: I want to use 'mapM' on Data.Vector.Vector, but it looks like the only 'mapM' defined is in Data.Vector.Fusion.Stream.Monadic. I'm able to use 'stream' and 'liftStream' to convert a 'Vector' to a monadic stream, on which I can use 'mapM'. But I couldn't find a way to convert the monadic stream back to Vector without using an intermediate list. I don't think I understand the internal of monadic stream that much. But it looks not so fusion to me. Is it the only way back to Vector? Unfortunately, it's not at all clear to me how to implement mapM on vectors/arrays without going through an intermediate list for arbitrary monads (it's easy for ST and IO). The next version of vector will include mapM and friends but it will probably be implemented much like you describe, with appropriate specialisations for ST and IO. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
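A sketch of how the IO specialisation can avoid the intermediate list, along the lines described above (the name mapMIO is made up for illustration; it uses the mutable API from Data.Vector.Mutable):

    import qualified Data.Vector as V
    import qualified Data.Vector.Mutable as M

    -- Run the action for each element in order, write the results into a
    -- mutable vector, and freeze the result once at the end.
    mapMIO :: (a -> IO b) -> V.Vector a -> IO (V.Vector b)
    mapMIO f xs = do
      ys <- M.new (V.length xs)
      let go i | i >= V.length xs = V.unsafeFreeze ys
               | otherwise        = do
                   y <- f (V.unsafeIndex xs i)
                   M.write ys i y
                   go (i + 1)
      go 0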
Re: [Haskell-cafe] Strange error with type classes + associated types
On 15/04/2010, at 00:30, Brent Yorgey wrote: On Wed, Apr 14, 2010 at 09:51:52AM +0100, Stephen Tetley wrote: On 14 April 2010 03:48, Brent Yorgey byor...@seas.upenn.edu wrote: Can someone more well-versed in the intricacies of type checking with associated types explain this? Or is this a bug in GHC? If you take the definition of append out of the class - GHCi will give it a type: append (Affine a2 b2) (Affine a1 b1) = Affine (a2 *.* a1) (lapply a2 b1 ^+^ b2) *VectorSpace> :t append append :: (Scalar v ~ Scalar v1, Basis v ~ Basis u, Basis v1 ~ Basis v, VectorSpace v1, HasTrie (Basis v), HasBasis v, HasBasis u) => Affine v1 -> Affine v -> Affine v1 Right, this seems weird to me. Why is there still a 'u' mentioned in the constraints? Actually, I don't even see why there ought to be both v and v1. The type of (*.*) mentions three type variables, u, v, and w: (*.*) :: (HasBasis u, HasTrie (Basis u), HasBasis v, HasTrie (Basis v), VectorSpace w, Scalar v ~ Scalar w) => (v :-* w) -> (u :-* v) -> u :-* w Note that (:-*) is a type synonym: type (:-*) u v = MSum (Basis u :-: v) Substituting this into the type of (*.*), we get: (*.*) :: ... => MSum (Basis v :-: w) -> MSum (Basis u :-: v) -> MSum (Basis u :-: w) Now, Basis is an associated type: class VectorSpace v => HasBasis v where type Basis v ... This means that there is no way to obtain u from Basis u. Since u only ever occurs as an argument to Basis, a type family, it can never be unified with anything. This, in turn, means that there is no way to call (*.*) at all (unless I'm severely mistaken). Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
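A tiny, self-contained illustration of the non-injectivity point made above (all names are made up; it needs TypeFamilies):

    {-# LANGUAGE TypeFamilies #-}

    type family F a
    type instance F Int  = Bool
    type instance F Char = Bool

    -- g :: F a -> F a
    -- g x = x
    --
    -- A call such as 'g True' can never pin down 'a' (both F Int and F Char
    -- reduce to Bool), so 'a' cannot be unified with anything -- the same
    -- reason the 'u' in 'Basis u' above can never be instantiated.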
Re: [Haskell-cafe] what are the safety conditions for unsafeIOToST
On 08/04/2010, at 01:38, Henning Thielemann wrote: On Apr 6, 2010, at 5:30 PM, Roman Leshchinskiy wrote: In fact, the only safe-ish use for it I have found is to use Storable-related functions in ST, hoping that the instances don't actually use any real IO functionality. Arguably, this shouldn't be necessary as Storable should live in ST anyway. But Storable in ST monad would be still dangerous, because pointers may point to non-allocated memory or point outside of an array. I don't think that's the kind of safety the original poster had in mind. You can have invalid memory accesses even in pure code but that's ok since we know what the semantics is: bottom. I understood the question to be about the conditions under which unsafeIOToST can violate referential transparency. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] what are the safety conditions for unsafeIOToST
On 07/04/2010, at 07:33, Nicolas Frisby wrote: I haven't been able to find it via Google or Haddock. An old message suggests it was just a matter of exceptions? I don't think that's correct. You can implement unsafePerformIO in terms of unsafeIOToST: unsafePerformIO :: IO a -> a unsafePerformIO p = runST (unsafeIOToST p) In fact, the only safe-ish use for it I have found is to use Storable-related functions in ST, hoping that the instances don't actually use any real IO functionality. Arguably, this shouldn't be necessary as Storable should live in ST anyway. I only want to use the IO for generating Data.Uniques to pair with STRefs in order to make a map of them. I'm guessing this would be a safe use since it's exception free (... right?). It's hard to tell without looking at your code. But if you are generating Uniques in ST then it's probably unsafe: foo :: () -> Unique foo _ = runST (unsafeIOToST newUnique) What's the value of foo ()? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: replicateM over vectors
On 04/04/2010, at 05:33, Chad Scherrer wrote: Roman Leshchinskiy rl at cse.unsw.edu.au writes: Ah. I missed that. Then your best bet is probably replicate n action = munstream v $ Fusion.Stream.Monadic.generateM n (const action) $ new n It's uglier than it should be but vector simply doesn't define the right combinators for this at the moment. I'm having trouble getting this to typecheck. I'll reread your Recycle Your Arrays paper; maybe then it will make more sense. Ugh. I shouldn't write emails while frantically scrambling to make a conference deadline. What I meant is this: replicate n action = do { v <- new n; v' <- munstream v (generateM n (const action)); return v' } Sorry for the confusion. There are two things one would have to do. First, add a function to Generic.New which initialises a New from a Monadic.Stream and fusion rules for it. That's easy. The hard part is to generalise New to work with arbitrary monads: at the moment it is defined as: data New a = New (forall mv s. MVector mv a => ST s (mv s a)) This is because its basic reason for existence is to be passed to Vector.new which then does a runST to produce an immutable vector. It is perhaps possible to make New more general but it's quite tricky. I'll think about it after the ICFP deadline. But the m I'm interested in happens to be ST. Sounds like it's still easy in principle, but not immediate. Is that right? Not really. The big step is getting from the type above to a fixed s so that you can use it in a particular ST computation. It's not just a question of making the types work, either. I also have to convince myself that it is actually safe to do so (in particular, that the rewrite rules in the library can't break things). Getting from there to IO is very easy. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] replicateM over vectors
On 02/04/2010, at 12:16, Don Stewart wrote: Chad.Scherrer: Hi, I'd like to be able to do replicateM, but over a vector instead of a list. Right now I'm doing this: The operation you are looking for is called newWith. It probably should be called replicate. Roman? Can we generate frozen arrays for monadic generators, and still fuse in the current New/Mutable/MStream architecture? For monadic stuff, fusion happens on things of type New. For instance, you could write this (I'm omitting the class contexts and Data.Vector prefixes): replicate :: Int -> a -> New a replicate n x = Generic.New.unstream (Fusion.Stream.replicate n x) and then either Mutable.run (replicate n x) to get a mutable vector or new (replicate n x) to get an immutable one. You could also chain operations on New, including monadic ones: v <- Mutable.run $ Generic.New.transform (Fusion.Stream.Monadic.mapM f) $ replicate n x and this ought to fuse. Note that the New stuff is quite rough and only really intended for internal use at the moment. I wanted to get the normal APIs working properly first. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] replicateM over vectors
On 02/04/2010, at 13:01, Don Stewart wrote: rl: replicate :: Int -> a -> New a replicate n x = Generic.New.unstream (Fusion.Stream.replicate n x) and then either Mutable.run (replicate n x) to get a mutable vector or new (replicate n x) Hmm, but here 'a' is pure. I don't think he wants newWith :: (PrimMonad m, MVector v a) => Int -> a -> m (v (PrimState m) a) but more: newWithM :: (PrimMonad m, MVector v a) => Int -> m a -> m (v (PrimState m) a) Ah. I missed that. Then your best bet is probably replicate n action = munstream v $ Fusion.Stream.Monadic.generateM n (const action) $ new n It's uglier than it should be but vector simply doesn't define the right combinators for this at the moment. to get an immutable one. You could also chain operations on New, including monadic ones: v <- Mutable.run $ Generic.New.transform (Fusion.Stream.Monadic.mapM f) $ replicate n x Oh, that's interesting. But what if we want to fill directly with the monadic action? We wouldn't want mapM (const a) $ replicate n undefined So how do we best do a fusible, e.g.: replicateM :: G.Vector v a => Int -> IO a -> IO (v a) There are two things one would have to do. First, add a function to Generic.New which initialises a New from a Monadic.Stream and fusion rules for it. That's easy. The hard part is to generalise New to work with arbitrary monads: at the moment it is defined as: data New a = New (forall mv s. MVector mv a => ST s (mv s a)) This is because its basic reason for existence is to be passed to Vector.new which then does a runST to produce an immutable vector. It is perhaps possible to make New more general but it's quite tricky. I'll think about it after the ICFP deadline :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Data Structures GSoC
On 31/03/2010, at 18:14, Achim Schneider wrote: We have a lot of useful interfaces (e.g. ListLike, Edison), but they don't seem to enjoy wide-spread popularity. Perhaps that's an indication that we need different interfaces? IMO, huge classes which generalise every useful function we can think of just isn't the right approach. We need small interfaces between containers and algorithms. In fact, the situation is perhaps somewhat similar to C++ where by providing exactly that the STL has been able to replace OO-style collection libraries which never really worked all that well. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Shootout update
I'm wondering... Since the DPH libraries are shipped with GHC by default are we allowed to use them for the shootout? Roman On 30/03/2010, at 19:25, Simon Marlow wrote: The shootout (sorry, Computer Language Benchmarks Game) recently updated to GHC 6.12.1, and many of the results got worse. Isaac Gouy has added the +RTS -qg flag to partially fix it, but that turns off the parallel GC completely and we know that in most cases better results can be had by leaving it on. We really need to tune the flags for these benchmarks properly. http://shootout.alioth.debian.org/u64q/haskell.php It may be that we have to back off to +RTS -N3 in some cases to avoid the last-core problem (http://hackage.haskell.org/trac/ghc/ticket/3553), at least until 6.12.2. Any volunteers with a quad-core to take a look at these programs and optimise them for 6.12.1? Cheers, Simon ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC vs GCC vs JHC
On 29/03/2010, at 02:27, Lennart Augustsson wrote: Does anything change if you swap the first two rhss? No, not as far as I can tell. On Sun, Mar 28, 2010 at 1:28 AM, Roman Leshchinskiy r...@cse.unsw.edu.au wrote: On 28/03/2010, at 09:47, Lennart Augustsson wrote: It's important to switch from mod to rem. This can be done by a simple abstract interpretation. Also, changing the definition of rem from a `rem` b | b == 0 = divZeroError | a == minBound && b == (-1) = overflowError | otherwise = a `remInt` b to a `rem` b | b == 0 = divZeroError | b == (-1) && a == minBound = overflowError | otherwise = a `remInt` b speeds up the GHC version by about 20%. Figuring out why is left as an exercise to the reader :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC vs GCC
On 28/03/2010, at 01:36, Jan-Willem Maessen wrote: It's worth pointing out that there's a bit of bang-pattern mysticism going on in this conversation (which has not been uncommon of late!). A non-buggy strictness analyzer should expose the strictness of these functions without difficulty. Actually, rangeJ is lazy in i and rangeK is lazy in i and j. GHC does unbox everything important here but that needs more optimisations than just strictness analysis. You are right, though, that GHC doesn't need bang patterns here. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC vs GCC vs JHC
On 27/03/2010, at 05:27, John Meacham wrote: Here are jhc's timings for the same programs on my machine. gcc and ghc both used -O3 and jhc had its full standard optimizations turned on. jhc: ./hs.out 5.12s user 0.07s system 96% cpu 5.380 total gcc: ./a.out 5.58s user 0.00s system 97% cpu 5.710 total ghc: ./try 31.11s user 0.00s system 96% cpu 32.200 total I really don't understand these GHC numbers. I get about 3s for the C version, about 5s for GHC with rem and about 7.5s for GHC with mod. Is this perhaps on a 64-bit system? What is sizeof(int) in C and sizeOf (undefined :: Int) in Haskell? That said, I suspect the only thing this benchmark really measures is how fast the various compilers can compute i * i + j * j + k * k `mod` 7. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC vs GCC vs JHC
On 28/03/2010, at 11:07, John Meacham wrote: I have not thoroughly checked it, but I think there are a couple things going on here: It could also be worthwhile to float out (i*i + j*j) in rangeK instead of computing it in every loop iteration. Neither ghc nor gcc can do this; if jhc can then that might explain the performance difference (although I would expect it to be larger in this case). Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] GHC vs GCC vs JHC
On 28/03/2010, at 09:47, Lennart Augustsson wrote: It's important to switch from mod to rem. This can be done by a simple abstract interpretation. Also, changing the definition of rem from a `rem` b | b == 0 = divZeroError | a == minBound && b == (-1) = overflowError | otherwise = a `remInt` b to a `rem` b | b == 0 = divZeroError | b == (-1) && a == minBound = overflowError | otherwise = a `remInt` b speeds up the GHC version by about 20%. Figuring out why is left as an exercise to the reader :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Performance question
On 19/03/2010, at 08:48, Daniel Fischer wrote: On Thursday 18 March 2010 21:57:34, Daniel Fischer wrote: Contrary to my expectations, however, using unboxed arrays is slower than straight arrays (in my tests). However, a few {-# SPECIALISE #-} pragmas set the record straight. This is because without specialising, unsafeAt is a straight (inlineable) function call for boxed arrays but is overloaded and hence much slower for unboxed ones. In general, unboxed arrays tend to be slower in generic code. The only real solution is making functions such as binarySearch INLINE. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
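A sketch of the kind of generic binary search being discussed, with a SPECIALISE pragma so the unboxed-array case does not go through the overloaded unsafeAt (the function and its exact semantics are illustrative assumptions, not the code from the thread):

    import Data.Array.Base (unsafeAt)
    import Data.Array.IArray (IArray, bounds)
    import Data.Array.Unboxed (UArray)

    {-# SPECIALISE binarySearch :: UArray Int Int -> Int -> Int #-}
    -- Returns the first index whose element is >= key (hi+1 if there is none).
    binarySearch :: (IArray a e, Ord e) => a Int e -> e -> Int
    binarySearch arr key = go lo (hi + 1)
      where
        (lo, hi) = bounds arr
        go l u
          | l >= u    = l
          | otherwise =
              let m = (l + u) `div` 2
              in if unsafeAt arr (m - lo) < key then go (m + 1) u else go l m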
Re: Feedback request: priority queues in containers
On 17/03/2010, at 03:16, Louis Wasserman wrote: I'm not willing to do this sort of typeclass wrapper thing, primarily because nothing else in containers does -- even though we might have a Mapping type class that handles both IntMap and Map, we don't. I'm inclined to let that design choice stand, as far as containers is concerned. It would make perfect sense to write a new package with such a type class and offering instances for the containers priority queue implementations, but I prefer to stick with the style that containers already seems to use -- that is, exporting separate modules without a unifying type class, but with nearly-identical method signatures. FWIW, vector does both. It defines most vector operations generically and then exports appropriate specialisations for each concrete vector type. I think this is the most flexible and convenient approach. I just wish Haskell had some kind of support for it. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: [Haskell-cafe] vector stream fusion, inlining and compilation time
On 06/03/2010, at 03:10, stefan kersten wrote: i'm still curious, though, why my three versions of direct convolution perform so differently (see attached file). in particular, i somehow expected conv_3 to be the slowest and conv_2 to perform similar to conv_1. any ideas? i haven't had a look at the core yet, mainly because i'm lacking the expertise ... Hmm, one problem is that the current definition of reverse is suboptimal to say the least. I'll fix that. Could you perhaps send me your complete benchmark? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: [Haskell] Recursive definition of fibonacci with Data.Vector
On 08/03/2010, at 12:17, Alexander Solla wrote: GHC even optimizes it to: fib = fib Sounds like an implementation bug, not an infinite dimensional vector space bug. My guess is that strictness is getting in the way, and forcing what would be a lazy call to fib in the corresponding list code -- fib = 0 : 1 : (zipWith (+) fib (tail fib)) -- into a strict one. In fact, I'm pretty sure that's what the problem is: data Vector a = Vector {-# UNPACK #-} !Int {-# UNPACK #-} !Int {-# UNPACK #-} !(Array a) The problem is that you have to allocate an Array of a specific length when creating a Vector. Arrays are finite by definition. It's not a bug, it's a feature. Note that in the context of package vector, vector means a 1-dimensional, 0-indexed array. This is not unusual - see, for instance, the standard C++ library. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
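Since a Vector is a finite array, the self-referential list-style definition cannot be transplanted directly; a finite prefix has to be built instead. A sketch using constructN, which lets each element be defined in terms of the already-constructed prefix (constructN was added to Data.Vector after this exchange, so its availability is an assumption):

    import qualified Data.Vector as V

    -- fib n contains the first n Fibonacci numbers.
    fib :: Int -> V.Vector Integer
    fib n = V.constructN n step
      where
        step v
          | V.length v < 2 = fromIntegral (V.length v)            -- 0, 1
          | otherwise      = v V.! (V.length v - 1) + v V.! (V.length v - 2)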
Re: [Haskell-cafe] vector stream fusion, inlining and compilation time
On 05/03/2010, at 04:34, stefan kersten wrote: i've been hunting down some performance problems in DSP code using vector and the single most important transformation seems to be throwing in INLINE pragmas for any function that uses vector combinators and is to be called from higher-level code. failing to do so seems to prevent vector operations from being fused and results in big performance hits (the good news is that the optimized code is quite competitive). does anybody have some more info about the do's and don'ts when programming with vector? This is a general problem when working with RULES-based optimisations. Here is an example of what happens: suppose we have foo :: Vector Int -> Vector Int foo xs = map (+1) xs Now, GHC will generate a nice tight loop for this but if in a different module, we have something like this: bar xs = foo (foo xs) then this won't fuse because (a) foo won't be inlined and (b) even if GHC did inline here, it would inline the nice tight loop which can't possibly fuse instead of the original map which can. By slapping an INLINE pragma on foo, you're telling GHC to (almost) always inline the function and to use the original definition for inlining, thus giving it a chance to fuse. GHC could be a bit cleverer here (perhaps by noticing that the original definition is small enough to inline and keeping it) but in general, you have to add INLINE pragmas in such cases if you want to be sure your code fuses. A general-purpose mechanism for handling situations like this automatically would be great but we haven't found a good one so far. the downside after adding the INLINE pragmas is that now some of my modules take _really_ long to compile (up to a couple of minutes); any ideas where i can start looking to bring the compilation times down again? Alas, stream fusion (and fusion in general, I guess) requires what I would call whole loop compilation - you need to inline everything into loops. That tends to be slow. I don't know what your code looks like but you could try to control inlining a bit more. For instance, if you have something like this: foo ... = ... map f xs ... where f x = ... you could tell GHC not to inline f until fairly late in the game by adding {-# INLINE [0] f #-} to the where clause. This helps sometimes. i'm compiling with -O2 -funbox-strict-fields instead of -Odph (with ghc 6.10.4 on OSX 10.4), because it's faster for some of my code, but -O2 vs. -Odph doesn't make a noticeable difference in compilation time. If you're *really* interested in performance, I would suggest using GHC head. It really is much better for this kind of code (although not necessarily faster wrt compilation times). This is what -Odph does: -O2 (optimise as much as possible), -fno-method-sharing (sharing specialisation defeats fusion sometimes), -fdicts-cheap (always inline dictionaries), -fmax-simplifier-iterations=20 (this is necessary sometimes), -fsimplifier-phases=3 (we use an additional simplifier phase for fusion), -fno-spec-constr-threshold (run SpecConstr even for big loops), -fno-spec-constr-count (SpecConstr as much as possible). I'm surprised -Odph doesn't produce faster code than -O2. In any case, you could try turning these flags on individually (esp. -fno-method-sharing and the spec-constr flags) to see how they affect performance and compilation times. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
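A minimal sketch of the pragma placement described above, for a function meant to fuse at its call sites in other modules (the module and function names are made up):

    module Loops (foo) where

    import qualified Data.Vector.Unboxed as U

    -- Without this pragma, callers in other modules only see the already
    -- compiled loop for foo, and further vector operations cannot fuse with it.
    {-# INLINE foo #-}
    foo :: U.Vector Int -> U.Vector Int
    foo xs = U.map (+ 1) xs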
Re: Removing/deprecating -fvia-c
On 17/02/2010, at 18:37, Isaac Dupree wrote: LLVM and GCC are open-source projects that are improving over time... is there any particular reason we expect GCC to have poor numeric performance forever? Past experience :-) GCC has been around for a while and if it doesn't optimise numeric code well by now, there is no reason to believe that it ever will. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
[Haskell] ANN: vector 0.5
Hi everyone, I am pleased to announce the release of version 0.5 of package vector, a high-performance Haskell array library with a powerful loop fusion framework. The main highlights compared to previous versions are: * DPH-style unboxed vectors (in Data.Vector.Unboxed) which use associated types to select the appropriate unboxed representation depending on the type of the elements. * A redesigned interface between mutable and immutable vectors. In particular, the popular unsafeFreeze primitive is now supported for all vector types. * Many new operations on both immutable and mutable vectors. * Significant performance improvements. The library comes with a fairly complete testsuite (mainly thanks to Max Bolingbroke) and is quite stable by now. Barring various disasters, I expect to release version 1.0 in the next 3 to 4 months. The release is accompanied by a new version of the NoSlow array benchmark suite. A few quite meaningless preliminary benchmarks are available at: http://unlines.wordpress.com/2010/02/15/vector-0-5-is-here-2 Both vector and NoSlow are on Hackage. Hackers can get the darcs repos from http://code.haskell.org/vector and http://code.haskell.org/NoSlow Enjoy! Roman___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
Re: [Haskell-cafe] vector to uvector and back again
On 12/02/2010, at 23:28, Dan Doel wrote: On Thursday 11 February 2010 8:54:15 pm Dan Doel wrote: On Thursday 11 February 2010 12:43:10 pm stefan kersten wrote: On 10.02.10 19:03, Bryan O'Sullivan wrote: I'm thinking of switching the statistics library over to using vector. that would be even better of course! an O(0) solution, at least for me ;) let me know if i can be of any help (e.g. in testing). i suppose uvector-algorithms would also need to be ported to vector, then. I could do this. To this end, I've done a preliminary port of the library, such that all the modules compile. I've just used safe operations so far, so it's probably a significant decrease in performance over the 0.2 uvector-algorithms (unless perhaps you turn off the bounds checking flag), but it's a start. It can be gotten with: darcs get http://code.haskell.org/~dolio/vector-algorithms That's great, thanks! FWIW, vector has two kinds of bounds checks: real ones which catch invalid indices supplied by the user (on by default) and internal ones which catch bugs in the library (off by default since the library is, of course, bug-free ;-). I guess you'd eventually want to use the latter but not the former; that's exactly what unsafe operations provide. I only encountered a couple snags during the porting so far: * swap isn't exported from D.V.Generic.Mutable, so I'm using my own. Ah, I'll export it. Also, I gladly accept patches :-) * I use a copy with an offset into the from and to arrays, and with a length (this is necessary for merge sort). However, I only saw a whole array copy (and only with identical sizes) in vector (so I wrote my own again). That's actually a conscious decision. Since vectors support O(1) slicing, you can simply copy a slice of the source vector into a slice of the target vector. * Some kind of thawing of immutable vectors into mutable vectors, or other way to copy the former into the latter would be useful. Right now I'm using unstream . stream, but I'm not sure that's the best way to do it. At the moment, it is (although it ought to be wrapped in a nicer interface). Something like memcpy doesn't work for Data.Vector.Unboxed because the ByteArrays aren't pinned. I don't really want to provide thawing until someone convinces me that it is actually useful. BTW, vector also supports array recycling so you could implement true in-place sorting for fused pipelines. Something like map (+1) . sort . update xs wouldn't allocate any temporary arrays in that case. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
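A sketch of the offset copy expressed with O(1) slicing, as suggested above: copy len elements starting at srcOff in src into the positions starting at dstOff in dst (written against the current Data.Vector.Generic.Mutable interface; treat the exact names as assumptions):

    import Control.Monad.ST (ST)
    import qualified Data.Vector.Generic.Mutable as M

    -- The two regions are assumed not to overlap.
    copyOff :: M.MVector v a
            => v s a -> Int   -- destination and offset into it
            -> v s a -> Int   -- source and offset into it
            -> Int            -- number of elements to copy
            -> ST s ()
    copyOff dst dstOff src srcOff len =
      M.copy (M.slice dstOff len dst) (M.slice srcOff len src)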
Re: [Haskell-cafe] vector to uvector and back again
On 12/02/2010, at 12:39, Don Stewart wrote: bos: I'm thinking of switching the statistics library over to using vector. uvector is pretty bit-rotted in comparison to vector at this point, and it's really seeing no development, while vector is The Shiny Future. Roman, would you call the vector library good enough to use in production at the moment? uvector's not seeing much development, but at least in the last round of benchmarks it was still consistently faster -- since it's been micro-optimized. FWIW, the development version of vector is usually faster than both uvector and dph-prim-seq, at least for the development version of NoSlow. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector to uvector and back again
On 12/02/2010, at 12:40, Don Stewart wrote: rl: On 11/02/2010, at 05:03, Bryan O'Sullivan wrote: I'm thinking of switching the statistics library over to using vector. uvector is pretty bit-rotted in comparison to vector at this point, and it's really seeing no development, while vector is The Shiny Future. Roman, would you call the vector library good enough to use in production at the moment? Yes, with the caveat that I haven't really used it in production code (I have tested and benchmarked it, though). BTW, I'll release version 0.5 as soon as get a code.haskell.org account and move the repo there. That's the main problem. I think we could move to vector as a whole, if the suite of testing/ performance/documentation stuff from uvector was ported. Hmm, I'm not sure what you mean here. Mostly thanks to Max Bolingbroke's efforts, vector has a fairly extensive testsuite. I benchmark it a lot (with NoSlow) and haven't found any significant performance problems in a while. As to documentation, there are comments for most of the functions :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector to uvector and back again
On 12/02/2010, at 12:54, Dan Doel wrote: I also notice that vector seems to have discarded the idea of Vec (A * B) = Vec A * Vec B Oh no, it hasn't. In contrast to uvector/DPH, which use a custom strict tuple type for rather outdated reasons, vector uses normal tuples. For instance, Data.Vector.Unboxed.Vector (a,b,c) is internally represented as a triple of unboxed vectors of a, b and c. In general, vector supports 4 kinds of arrays at the moment: Data.Vector.Primitive wrappers around ByteArray#, can store primitive types Data.Vector.Unboxed uses type families, can store everything D.V.Primitive can plus tuples and can be extended for user-defined types Data.Vector.Storable wrappers around ForeignPtr, can store Storable things Data.Vector boxed arrays Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector to uvector and back again
On 12/02/2010, at 13:49, Don Stewart wrote: rl: On 12/02/2010, at 12:39, Don Stewart wrote: bos: I'm thinking of switching the statistics library over to using vector. uvector is pretty bit-rotted in comparison to vector at this point, and it's really seeing no development, while vector is The Shiny Future. Roman, would you call the vector library good enough to use in production at the moment? uvector's not seeing much development, but at least in the last round of benchmarks it was still consistently faster -- since it's been micro-optimized. FWIW, the development version of vector is usually faster than both uvector and dph-prim-seq, at least for the development version of NoSlow. Ah ha -- that's useful. Public benchmarks soon? In time for the Zurich Hackathon?? (March 20) I've been trying to find the time to put the benchmarks on my blog since the beginning of January but, alas, unsuccessfully so far. In any case, vector and NoSlow currently live in http://www.cse.unsw.edu.au/~rl/code/darcs/vector http://www.cse.unsw.edu.au/~rl/code/darcs/NoSlow If Roman declares the vector to be faster -- my main concern here for flat uarrays -- and makes the repo available so we can work on it, I'd be willing to merge uvector's tests and docs and extra array operations in. It is generally faster than dph-prim-seq. Benchmarking against uvector is a bit difficult because it's missing operations necessary for implementing most of the algorithms in NoSlow (in particular, bulk updates). For the ones that uvector supports, vector tends to be faster. BTW, this is for unsafe operations which don't use bounds checking. Bounds checking can make things a little slower but often doesn't cost anything as long as only collective operations are used. Sometimes it makes things faster which means that the simplifier still gets confused in some situations. There are also some significant differences between 6.12 and the HEAD (the HEAD is much more predictable). In general, I find it hard to believe that the performance differences I'm seeing really matter all that much in real-world programs. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] vector to uvector and back again
On 11/02/2010, at 05:03, Bryan O'Sullivan wrote: I'm thinking of switching the statistics library over to using vector. uvector is pretty bit-rotted in comparison to vector at this point, and it's really seeing no development, while vector is The Shiny Future. Roman, would you call the vector library good enough to use in production at the moment? Yes, with the caveat that I haven't really used it in production code (I have tested and benchmarked it, though). BTW, I'll release version 0.5 as soon as get a code.haskell.org account and move the repo there. Roman On Wed, Feb 10, 2010 at 9:59 AM, stefan kersten s...@k-hornz.de wrote: hi, i've been using the vector [1] library for implementing some signal processing algorithms, but now i'd like to use the statistics [2] package on my data, which is based on the uvector [3] library. is there a (straightforward) way of converting between vectors and uvectors, preferrably O(1)? thanks, sk [1] http://hackage.haskell.org/package/vector [2] http://hackage.haskell.org/package/statistics [3] http://hackage.haskell.org/package/uvector ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Restrictions on associated types for classes
On 18/12/2009, at 00:37, Stephen Lavelle wrote: Given class MyClass k where type AssociatedType k :: * Is there a way of requiring AssociatedType be of class Eq, say? This works with -XFlexibleContexts: class Eq (AssociatedType k) => MyClass k where type AssociatedType k :: * Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
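Spelled out as a complete module (my own sketch; the Int instance is made up purely for illustration):

  {-# LANGUAGE TypeFamilies, FlexibleContexts #-}

  class Eq (AssociatedType k) => MyClass k where
    type AssociatedType k :: *
    toAssociated :: k -> AssociatedType k

  -- a made-up instance showing the superclass constraint being satisfied
  instance MyClass Int where
    type AssociatedType Int = Bool
    toAssociated = (> 0)

  -- the Eq constraint on AssociatedType k is available wherever MyClass k is
  sameAssociated :: MyClass k => k -> k -> Bool
  sameAssociated x y = toAssociated x == toAssociated y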
Re: [Haskell-cafe] Boxed Mutable Arrays
On 15/12/2009, at 06:53, Brad Larsen wrote: On another note, does this (or perhaps better phrased, will this) bug also affect Data Parallel Haskell? Luckily, no. DPH represents arrays of user-defined types by unboxed arrays (that's essentially what the vectoriser does). It does use boxed arrays in a couple of places internally but they are small. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Re: [Haskell] ANN: NoSlow - Microbenchmarks for array libraries
On 28/11/2009, at 07:45, Henning Thielemann wrote: Is there also a darcs repository? Yes, http://www.cse.unsw.edu.au/~rl/code/darcs/NoSlow. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell] ANN: NoSlow - Microbenchmarks for array libraries
I'm pleased to announce NoSlow, a nascent benchmark suite for various array (and list) libraries with particular emphasis on finding out how well they fuse things. At the moment, it knows about * standard lists * primitive DPH arrays (dph-prim-seq) * uvector * vector * storablevector It compiles and runs (using Bryan's criterion) a fairly random collection of very small loop kernels for each of those and produces a lot of data which it then uses to generate ugly HTML tables. In the future, it will have more benchmarks, more complex benchmarks and much prettier tables. You can get more information (including the ugly tables) from my blog http://unlines.wordpress.com/2009/11/27/noslow and NoSlow itself from Hackage http://hackage.haskell.org/package/NoSlow Any help would be highly appreciated as I don't have a lot of time to work on it. Roman ___ Haskell mailing list Haskell@haskell.org http://www.haskell.org/mailman/listinfo/haskell
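For a flavour of what such a loop-kernel benchmark looks like, here is a sketch of mine (not actual NoSlow code) using criterion and vector:

  import Criterion.Main
  import qualified Data.Vector.Unboxed as V

  -- a tiny kernel that exercises map/sum fusion
  sumSq :: V.Vector Double -> Double
  sumSq = V.sum . V.map (\x -> x * x)

  main :: IO ()
  main = defaultMain
    [ bench "vector/sumSq" (whnf sumSq (V.enumFromTo 1 100000)) ]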
Re: [Haskell-cafe] Status of TypeDirectedNameResolution proposal?
On 18/11/2009, at 21:10, Simon Peyton-Jones wrote: Yes I think it can, although you are right to point out that I said nothing about type inference. One minor thing is that you've misunderstood the proposal a bit. It ONLY springs into action when there's a dot. So you'd have to write bar1 x = x.foo bar2 x = x.foo Yes, that's what I meant to write, silly me. I promise to pay more attention next time. OK so now it works rather like type functions. Suppose the types with which foo was in scope were foo :: Int -> Int foo :: Bool -> Char Now imagine that we had a weird kind of type function type instance TDNR_foo Int = Int -> Int type instance TDNR_foo Bool = Bool -> Char Each 'foo' gives a type instance for TDNR_foo, mapping the type of the first argument to the type of that foo. Hmm... GHC doesn't allow this: type instance TDNR_foo () = forall a. () -> a -> a IIUC this restriction is necessary to guarantee termination. Given your analogy, wouldn't this proposal run into similar problems? | Another example: suppose we have | | data T a where | TInt :: T Int | TBool :: T Bool | | foo :: T Int -> u | foo :: T Bool -> u | | bar :: T a -> u | bar x = case x of | TInt -> foo x | TBool -> foo x | | Here, (foo x) calls different functions in the two alternatives, right? To be | honest, that's not something I'd like to see in Haskell. You mean x.foo and x.foo, right? Then yes, certainly. Of course that's already true of type classes: data T a where T1 :: Show a => T a T2 :: Show a => T a bar :: a -> T a -> String bar x y = case y of T1 -> show x T2 -> show x Then I get different show's. How so? Surely you'll get the same Show instance in both cases unless you have conflicting instances in your program? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
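To make the type-class comparison concrete, the Show example as a self-contained module (my sketch): matching on T1 or T2 brings the same Show a dictionary into scope, so both branches necessarily use the same instance.

  {-# LANGUAGE GADTs #-}

  data T a where
    T1 :: Show a => T a
    T2 :: Show a => T a

  -- no Show constraint needed on bar; the GADT match provides it
  bar :: a -> T a -> String
  bar x y = case y of
    T1 -> show x
    T2 -> show x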
Re: [Haskell-cafe] Status of TypeDirectedNameResolution proposal?
Simon, have you given any thought to how this interacts with type system extensions, in particular with GADTs and type families? The proposal relies on being able to find the type of a term but it's not entirely clear to me what that means. Here is an example: foo :: F Int -> Int foo :: Int -> Int bar1 :: Int -> Int bar1 = foo bar2 :: Int ~ F Int => Int -> Int bar2 = foo IIUC, bar1 is ok but bar2 isn't. Do we really want to have such a strong dependency between name lookup and type inference? Can name lookup be specified properly without also having to specify the entire inference algorithm? Another example: suppose we have data T a where TInt :: T Int TBool :: T Bool foo :: T Int -> u foo :: T Bool -> u bar :: T a -> u bar x = case x of TInt -> foo x TBool -> foo x Here, (foo x) calls different functions in the two alternatives, right? To be honest, that's not something I'd like to see in Haskell. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] poor perfomance of indexU in uvector package
On 16/11/2009, at 22:46, Alexey Khudyakov wrote: Problems begin when you need a non-contiguous block. Easiest way to do so is indexing. FWIW, this operation is called backpermute and is probably exported as bpermute in uvector. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
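In vector the corresponding bulk operation is backpermute; a minimal sketch of gathering a non-contiguous block by index (my example, not from the thread):

  import qualified Data.Vector.Unboxed as V

  -- pick out the elements at the given positions in one collective operation
  gather :: V.Vector Double -> V.Vector Int -> V.Vector Double
  gather = V.backpermute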
Re: Inliner behaviour - tiny changes lead to huge performance differences
On 13/11/2009, at 18:04, Bryan O'Sullivan wrote: main = do args <- getArgs forM_ args $ \a -> do s <- B.readFile a let t = T.decodeUtf8 s print (T.length t) The streamUtf8 function looks roughly like this: streamUtf8 :: OnDecodeError -> ByteString -> Stream Char streamUtf8 onErr bs = Stream next 0 (maxSize l) where l = B.length bs next i | i >= l = Done | U8.validate1 x1 = Yield (unsafeChr8 x1) (i+1) | {- etc. -} {-# INLINE [0] streamUtf8 #-} The values being Yielded from the inner function are, as you can see, themselves constructed by functions. Originally, with the inner next function manually marked as INLINE, I found that functions like unsafeChr8 were not being inlined by GHC, and performance was terrible due to the amount of boxing and unboxing happening in the inner loop. Let's see if I understand this correctly. In your code, decodeUtf8 calls streamUtf8. They both get inlined into main but then unsafeChr8 does not. Correct? If so, are you sure that unsafeChr8 is really called in the simplified code? IIUC, this isn't necessary if you don't actually inspect the Chars (which length presumably doesn't). So perhaps GHC removes the call altogether? If not, what does it do with the result? I somehow stumbled on the idea of removing the INLINE annotation from next, and performance suddenly improved by a significant integer multiple. This caused the body of streamUtf8 to be inlined into my test program, as I hoped. Or are you saying that it's streamUtf8 that isn't getting inlined into main? length :: Text -> Int length t = Stream.length (Stream.stream t) {-# INLINE length #-} And the streaming length is: length :: Stream Char -> Int length = S.lengthI {-# INLINE[1] length #-} And the lengthI function is defined more generally, in the hope that I could use it for both Int and Int64 lengths: lengthI :: Integral a => Stream Char -> a lengthI (Stream next s0 _len) = loop_length 0 s0 where loop_length !z s = case next s of Done -> z Skip s' -> loop_length z s' Yield _ s' -> loop_length (z + 1) s' {-# INLINE[0] lengthI #-} Unfortunately, although lengthI is inlined into the Int-typed streaming length function, that function is not in turn marked with __inline_me in simplifier output, so the length/decodeUtf8 loops do not fuse. The code is pretty fast, but there's still a lot of boxing and unboxing happening for all the Yields. Does changing the definition of length to length = id S.lengthI help? GHC used to have a bug in this area but I haven't been bitten by it for quite some time. Also, I wonder how Stream.stream is defined. Is it strict in Text? If it isn't, does making it strict help? All of these flip-flops in inliner behaviour are very difficult to understand, and they seem to be exceedingly fragile. Should I expect the situation to be better with the new inliner in 6.12? I suspect that the fragility you are seeing is just a symptom of a problem in how the UTF-8 library implements stream fusion. It's a bit tricky to get everything right. Generally, I've found the simplifier to be quite stable and predictable in the last year or so. Simon is working hard on making it even better. If you have a spare minute, perhaps you could try the HEAD with the new inliner and see if that helps? Although I somewhat doubt it, to be honest. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
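For readers who haven't seen the machinery being debugged here, a toy version of the stream-fusion plumbing (entirely my own sketch, much simplified compared to what the text package actually does): producers and consumers inline in a late phase so that the stream/unstream rewrite rule gets a chance to fire first.

  {-# LANGUAGE ExistentialQuantification, BangPatterns #-}
  module ToyFusion where

  data Step s a = Done | Skip s | Yield a s

  data Stream a = forall s. Stream (s -> Step s a) s

  streamList :: [a] -> Stream a
  streamList xs0 = Stream next xs0
    where
      next []     = Done
      next (x:xs) = Yield x xs
  {-# INLINE [0] streamList #-}

  unstreamList :: Stream a -> [a]
  unstreamList (Stream next s0) = go s0
    where
      go s = case next s of
               Done       -> []
               Skip s'    -> go s'
               Yield x s' -> x : go s'
  {-# INLINE [0] unstreamList #-}

  lengthS :: Stream a -> Int
  lengthS (Stream next s0) = go 0 s0
    where
      go !n s = case next s of
                  Done       -> n
                  Skip s'    -> go n s'
                  Yield _ s' -> go (n + 1) s'
  {-# INLINE [0] lengthS #-}

  -- if everything inlines at the right time, a consumer applied to
  -- streamList (unstreamList s) never builds the intermediate list
  {-# RULES "streamList/unstreamList" forall s. streamList (unstreamList s) = s #-}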
Re: Re[2]: [Haskell-cafe] What's the deal with Clean?
On 05/11/2009, at 04:01, Bulat Ziganshin wrote: oh, can we stop saying about shootout? if you want to see speed of pure haskell code, look at papers about fast arrays/strings - their authors have measured that lazy lists are hundreds times slower than idiomatic C code. is use of lazy lists counted as mistake too and paper authors had too small haskell experience? In the papers I coauthored, I don't think we measured any such thing. What we measured was that in algorithms that are best implemented with (unboxed) arrays, using boxed lists is going to cost you. That's not a very surprising conclusion and it's by no means specific to Haskell. The problem was/is the lack of nice purely declarative array libraries but that is changing, albeit slowly. It's a question of using the right data structure for the algorithm, not a C vs. Haskell thing. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] What's the deal with Clean?
On 04/11/2009, at 13:23, Daniel Peebles wrote: In the presence of fusion (as is the case in uvector), it's hard to give meaningful time complexities for operations as they depend on what operations they are paired with. We need to think of a better way to express this behavior in the documentation though. I have to disagree here. Fusion never makes the complexity of operations worse. If it does, it's a bug. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] What's the deal with Clean?
On 04/11/2009, at 13:12, brian wrote: indexU :: UA e => UArr e -> Int -> e O(n). indexU extracts an element out of an immutable unboxed array. This is a typo (unless Don inserted a nop loop into the original DPH code). Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] What's the deal with Clean?
On 04/11/2009, at 13:35, wren ng thornton wrote: Roman Leshchinskiy wrote: On 04/11/2009, at 13:23, Daniel Peebles wrote: In the presence of fusion (as is the case in uvector), it's hard to give meaningful time complexities for operations as they depend on what operations they are paired with. We need to think of a better way to express this behavior in the documentation though. I have to disagree here. Fusion never makes the complexity of operations worse. If it does, it's a bug. I think the point was more that the relevant complexity bound can change in the presence of fusion. For a poor example: the first map over a list is O(n) but all subsequent ones in a chain of maps are O(1) with fusion. I'm sure there are better examples than that, but you get the idea. Some people may care to know about that latter complexity rather than just the independent complexity. I think asymptotic complexity is the wrong tool for what you're trying to do. You implement your algorithm using operations with known complexities. This allows you to compute the complexity of the entire algorithm. That's all you can use operation complexities for. The compiler is then free to optimise the algorithm as it sees fit but is supposed to preserve (or improve) its complexity. It is not guaranteed or even supposed to preserve the original operations. To stay with your example, each of the two maps is linear regardless of whether fusion happens. Executing the two maps, be it one after another or interlocked, is linear simply because O(n) + O(n) = O(n), not because of fusion. Essentially, you're trying to use complexity to describe an optimisation which doesn't actually affect the complexity. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
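As a concrete illustration of the point about complexity (my own example, using vector): with or without fusion the pipeline below does O(n) work; fusion merely eliminates the intermediate array, it does not turn the second traversal into an O(1) operation.

  import qualified Data.Vector.Unboxed as V

  pipeline :: V.Vector Int -> V.Vector Int
  pipeline = V.map (+ 1) . V.map (* 2)   -- fuses into a single O(n) loop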
Re: [Haskell-cafe] What's the deal with Clean?
On 04/11/2009, at 14:07, Gregory Crosswhite wrote: Actually, it's not a typo. If you look at the source, what you'll see is indexU arr n = indexS (streamU arr) n I suspect it gets rewritten back to the O(1) version somewhere after it has had a chance to fuse. If not, then it's a bug. In the vector package, I do this instead, though: indexU arr n = O(1) implementation {-# RULES "indexU/unstreamU" forall s n. indexU (unstreamU s) n = indexS s n #-} Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Arrays in Clean and Haskell
On 04/11/2009, at 14:38, Philippos Apolinarius wrote: And here comes the reason for writing this article. In the previous version of the Gauss elimination algorithm, I have imported Data.Array.IO. I also wrote a version of the program that imports Data.Array.ST. The problem is that I don't know how to read an STUArray from a file, process it, and write it back to a file. Why don't you use the IOUArray directly instead of converting it to STUArray and back? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
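A minimal sketch of what processing the IOUArray directly might look like (the in-place update below is made up; the file reading and writing would wrap around it in whatever format the original program uses):

  import Control.Monad (forM_)
  import Data.Array.IO

  -- scale every element of the array in place
  scaleAll :: Double -> IOUArray Int Double -> IO ()
  scaleAll k arr = do
    (lo, hi) <- getBounds arr
    forM_ [lo .. hi] $ \i -> do
      x <- readArray arr i
      writeArray arr i (k * x)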
Re: [Haskell-cafe] Stream-fusion without the lists
On 12/05/2009, at 14:45, Reiner Pope wrote: The Stream datatype seems to be much better suited to representing loops than the list datatype is. So, instead of programming with the lists, why don't we just use the Stream datatype directly? I think the main reason is that streams don't store data and therefore don't support sharing. That is, in let xs = map f ys in (sum xs, product xs) the elements of xs will be computed once if it is a list but twice if it is a stream. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Really bad code for single method dictionaries?
On 27/03/2009, at 18:32, Don Stewart wrote: I don't think this is still the case. Roman, do you remember? Hmm, not really. I recall that there was some sort of problem which I didn't have time to investigate then but it's been so long... Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Loop unrolling + fusion ?
On 07/03/2009, at 09:26, Claus Reinke wrote: My preferred spec would be roughly {-# NOINLINE f #-} as now {-# INLINE f #-} works as now, which is for non-recursive f only (might in future be taken as go-ahead for analysis-based recursion unfolding) {-# INLINE f PEEL n #-} inline calls *into* recursive f (called loop peeling for loops) {-# INLINE f UNROLL m #-} inline recursive calls to f *inside* f (called loop unrolling for loops) {-# INLINE f PEEL n UNROLL m #-} combine the previous two The problem here is that this only works for directly recursive functions which I, for instance, don't normally use in high-performance code. Most of my loops are pipelines of collective combinators like map, filter, fold etc. because these are the ones that can be fused automatically. Unless I'm misunderstanding something, this approach doesn't handle such cases. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
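The kind of loop meant here, sketched with vector (my example): after fusion this becomes one tight loop, but there is no single user-written recursive function to hang a PEEL or UNROLL pragma on.

  import qualified Data.Vector.Unboxed as V

  sumSqPos :: V.Vector Double -> Double
  sumSqPos = V.foldl' (+) 0 . V.map (\x -> x * x) . V.filter (> 0)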
Re: [Haskell-cafe] bytestring vs. uvector
On 09/03/2009, at 11:47, Claus Reinke wrote: Btw, have any of the Haskell array optimization researchers considered fixpoints yet? This, for instance, is a very nice paper: http://www.pllab.riec.tohoku.ac.jp/~ohori/research/OhoriSasanoPOPL07.pdf However, in the context of high-performance array programming explicit recursion is bad because it is very hard if not impossible to parallelise automatically except in fairly trivial cases. And if your array program is not parallelisable then you don't really care about performance all that much :-) Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Loop unrolling + fusion ?
On 01/03/2009, at 04:49, Don Stewart wrote: So now, since we've gone to such effort to produce a tiny loop like this, can't we unroll it just a little? Sadly, my attempts to get GCC to trigger its loop unroller on this guy haven't succeeded. -funroll-loops and -funroll-all-loops don't touch it, That's because the C produced by GHC doesn't look like a loop to GCC. This can be fixed but given that we are moving away from -fvia-C anyway, it probably isn't worth doing. Anyone think of a way to apply Claus' TH unroller, or somehow convince GCC it is worth unrolling this guy, so we get the win of both aggressive high level fusion, and aggressive low level loop optimisations? The problem with low-level loop optimisations is that in general, they should be done at a low level. Core is much too early for this. To find out whether and how much to unroll a particular loop, you must take things like register pressure and instruction scheduling into account. IMO, the backend is the only reasonable place to do these optimisations. Using an existing backend like LLVM would really help here. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: [Haskell-cafe] Re: Go Haskell! - array libraries
On 30/11/2008, at 11:36, Don Stewart wrote: Should mutable arrays have list-like APIs? All the usual operations, just in-place and destructive where appropriate? I don't know. To be honest, I don't think that the term mutable array describes a single data structure. For instance, one of the central questions which unveils a whole bunch of design possibilities is: can mutable arrays be concatenated and how does that work if yes? Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Go Haskell! - array libraries
On 29/11/2008, at 11:49, Claus Reinke wrote: Yes, it is very difficult. A sensible API for a standard array library is something that needs more research. FWIW, I don't know of any other language that has what I'd like to see in Haskell. C++ probably comes closest but they have it easy - they don't do fusion. I assume you've looked at SAC? http://www.sac-home.org/ Yes. They have it even easier - they don't have polymorphism, they don't have higher-order functions, they don't have boxing and laziness and in a sense, they don't even do general-purpose programming, just scientific algorithms. And they have no existing language to integrate their stuff with. This is not to imply that their work isn't interesting and valuable; IMO, it just doesn't really help us when it comes to designing a Haskell array library. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Go Haskell! - array libraries
On 30/11/2008, at 02:43, Brad Larsen wrote: On Fri, 28 Nov 2008 19:00:38 -0500, Roman Leshchinskiy [EMAIL PROTECTED] wrote: On 29/11/2008, at 10:47, Claus Reinke wrote: [...] And would it be difficult for you all to agree on a standard API, to make switching between the alternatives easy (if it is indeed impossible to unify their advantages in a single library, the reasons for which should also be documented somewhere)? Yes, it is very difficult. A sensible API for a standard array library is something that needs more research. FWIW, I don't know of any other language that has what I'd like to see in Haskell. C++ probably comes closest but they have it easy - they don't do fusion. [...] Would you elaborate on what you'd like to see in an array library? I'd like to have a library which is efficient (in particular, implements aggressive fusion), is roughly equivalent to the current standard list library in power and supports strict/unboxed/mutable arrays. It should also provide a generic framework for implementing new kinds of arrays. And eventually, it should also be usable in high-performance and parallel algorithms. And perhaps which C++ array library you are thinking of? Your C++ comment caught my attention, and now I'm curious. Surely you don't mean C-style arrays. :-D No, I meant vector, basic_string and deque which are provided by the standard library. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Data parallelism doesn't seem to work on windows...
On 30/11/2008, at 10:43, Sebastian Sylvan wrote: This, on the other hand, does not use more than one core: -- compiler command line (from shootout code): ghc --make -fcpr-off -threaded -fdph-par -package dph-base -Odph -XPArr parr2.hs -- execution as before main = print $ [: True | n <- [: 1000 .. 5000 :], fac n == 0 :] Unfortunately, that's not enough to get parallelism. You also need to -fvectorise the computation, i.e., everything that comes after $ (but not print because that can't be vectorised yet). At the moment, this means that you have to split your code in two modules because -fvectorise is a module-wide flag. Please take a look at dph/examples/dotp to see how this is done. Sorry that this is so inconvenient at the moment but we're working on it! Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
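To make the two-module split concrete, the vectorised side looks roughly like the dotp example shipped with dph (reconstructed from memory, so treat the exact module names, flags and extensions as approximate); a second, unvectorised module then imports dotp and does the printing:

  {-# LANGUAGE PArr, ParallelListComp #-}
  {-# OPTIONS_GHC -fvectorise #-}
  module DotPVect (dotp) where

  import Data.Array.Parallel.Prelude
  import Data.Array.Parallel.Prelude.Double
  import qualified Prelude

  -- (*) and sumP here come from the DPH prelude, not from Prelude
  dotp :: [:Double:] -> [:Double:] -> Double
  dotp xs ys = sumP [: x * y | x <- xs | y <- ys :]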
Re: [Haskell-cafe] Re: Go Haskell! - array libraries
On 30/11/2008, at 08:32, Andrew Coppin wrote: Henning Thielemann wrote: I suspect that this particular function is less useful than you think. It saves one allocation and might be faster since it uses less cache, but on the other hand, it cannot be fused. Hmm, I haven't seen your original message but I suspect you are talking about in-place map. In that case, this is not entirely true. Shameless plug: http://www.cse.unsw.edu.au/~rl/publications/recycling.html I think in-place array updates are only sensible for writing array elements in really random order. As long as you can formulate your algorithm the way: read from random indices, but write a complete array from left to right, there is almost no need for mutable arrays. Many array algorithms cannot really be written in this way. I think we do need mutable arrays and they should provide much more than just read/write. How to integrate them nicely with immutable arrays is not really clear, though. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Go Haskell!
On 28/11/2008, at 20:04, Simon Marlow wrote: So we have two vector libraries, vector and uvector, which have a lot in common - they are both single-dimension array types that support unboxed instances and have list-like operations with fusion. They ought to be unified, really. Yes. This shouldn't be too hard to do since both libraries are based on the internal DPH arrays. Although I have to admit that I never really looked at Don's code and have no idea how much he has changed. But it's more than that. The basic idea behind vector is to provide a common framework for normal arrays, ByteString, StorableVector etc. It's not finished by a long shot but (unsurprisingly) I think it goes in the right direction. The proliferation of array-like libraries is counterproductive. The main difference between these libraries and Haskell's arrays is the Ix class. So perhaps Haskell's arrays should be reimplemented on top of the low-level vector libraries? The Ix class is the root cause of the problems with optimising the standard array libraries. Yes, Haskell arrays should be based on a lower-level array library. I would also argue that they should be redesigned and given a more streamlined and efficient interface. The Ix class is not the only problem wrt efficiency. In particular, the H98 array library relies quite heavily on lists which makes implementing any kind of array fusion rather hard. In contrast to Ix, this is completely unnecessary, IMO. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Go Haskell! - array libraries
On 29/11/2008, at 08:43, Andrew Coppin wrote: What *I* propose is that somebody [you see what I did there?] should sit down, take stock of all the multitudes of array libraries, what features they have, what obvious features they're missing, and think up a good API from scratch. Once we figure out what the best way to arrange all this stuff is, *then* we attack the problem of implementing it for real. That is the idea behind vector. I don't know how good it is but it's the best I could come up with (or will be once it's finished). That said, I don't think there is such a thing as a perfect array API. Different libraries serve different purposes. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Re: Go Haskell! - array libraries
On 29/11/2008, at 10:47, Claus Reinke wrote: But I don't want Perl, I want a well designed language and well designed libraries. I think it's fine to let libraries proliferate, but at some point you also need to step back and abstract. -- Lennart Especially so if the free marketeers claim there is something fundamentally wrong with the standard libraries and language, as in the case of arrays. When someone did that nice little survey of the last bunch of array libraries (Bulat, I think? now in the wiki book), I was hoping there would be a grand unification soon. Instead, it seems that those who have worked most with Haskell arrays recently have simply abandoned all of the standard array libraries. Okay, why not, if there are good reasons. But can't you document those reasons, for each of your alternative proposals, so that people have some basis on which to choose (other than who has the loudest market voice;-)? I think so far, it's always been the same two reasons: efficiency and ease of use. H98 arrays are inherently inefficient and IMO not very easy to use, at least not for the things that I'm doing. And would it be difficult for you all to agree on a standard API, to make switching between the alternatives easy (if it is indeed impossible to unify their advantages in a single library, the reasons for which should also be documented somewhere)? Yes, it is very difficult. A sensible API for a standard array library is something that needs more research. FWIW, I don't know of any other language that has what I'd like to see in Haskell. C++ probably comes closest but they have it easy - they don't do fusion. And what is wrong about Simon's suggestion, to use the standard array lib APIs on top of your implementations? Again, IMO H98 arrays are only suitable for a very restricted set of array algorithms. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: [Haskell-cafe] Need machine for DPH benchmarking
Hi again, a big thank you to all the people who offered us machines. I think we should be fine now. World domination is just around the corner! Roman On 27/11/2008, at 18:45, Roman Leshchinskiy wrote: Hi all, we, the DPH team, are at the moment in the very unfortunate situation of not having a proper machine for running our benchmarks on. Could a kind soul maybe give us (i.e., me) access to a quadcore or 2xquadcore x86 Linux or OS X machine? I only need to build ghc on it and run small benchmarks which never take more than a couple of minutes, maybe once every couple of days or so. We do need to use all cores, though, so no other CPU-intensive processes can be running during benchmarking. This is only for a week or two, until we get our own machine. We would be eternally grateful and won't forget you when DPH takes over the world. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] Need machine for DPH benchmarking
Hi all, we, the DPH team, are at the moment in the very unfortunate situation of not having a proper machine for running our benchmarks on. Could a kind soul maybe give us (i.e., me) access to a quadcore or 2xquadcore x86 Linux or OS X machine? I only need to build ghc on it and run small benchmarks which never take more than a couple of minutes, maybe once every couple of days or so. We do need to use all cores, though, so no other CPU-intensive processes can be running during benchmarking. This is only for a week or two, until we get our own machine. We would be eternally grateful and won't forget you when DPH takes over the world. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
Re: Repair to floating point enumerations?
On 15/10/2008, at 20:41, Malcolm Wallace wrote: Phil proposes that, although retaining the instances of Enum for Float and Double, we simplify the definitions of the numericEnumFrom family: numericEnumFromThenTo :: (Fractional a, Ord a) => a -> a -> a -> [a] numericEnumFrom = iterate (+1) numericEnumFromThen n m = iterate (+(m-n)) n numericEnumFromTo n m = takeWhile (<= m) (numericEnumFrom n) numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m) I'd like to raise the following two points in this context. Firstly, an array library which attempts to provide reasonable counterparts to the list functions would want to define array versions of enumFromTo and enumFromThenTo. To be efficient, it must be able to determine the expected number of elements reasonably fast (in constant time). Secondly, a parallel array library (such as the one provided by DPH) also needs to be able to generate, say, the first half of the array on one processor and the second half on another. This requires enumFromTo to obey some form of the following law for some f and g: enumFromTo m n = enumFromTo m (f m n) ++ enumFromTo (g m n) n It might make sense to try to provide this for Float and Double for the sake of consistency with future libraries. I'm not sure how easy it is to do, though. Roman ___ Haskell-prime mailing list Haskell-prime@haskell.org http://www.haskell.org/mailman/listinfo/haskell-prime
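For integral types the law holds with f m n = (m + n) `div` 2 and g m n = (m + n) `div` 2 + 1; here is a small QuickCheck property (my sketch) stating the splitting behaviour that a Float/Double version would have to mimic:

  import Test.QuickCheck (quickCheck)

  -- enumFromTo m n == enumFromTo m k ++ enumFromTo (k + 1) n, k = midpoint
  prop_splitEnum :: Integer -> Integer -> Bool
  prop_splitEnum m n =
    enumFromTo m n == enumFromTo m k ++ enumFromTo (k + 1) n
    where k = (m + n) `div` 2

  main :: IO ()
  main = quickCheck prop_splitEnum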
Re: Strictness in data declaration not matched in assembler?
On 16/10/2008, at 21:34, Simon Peyton-Jones wrote: For strict *constructors*, on the other hand, we *do* guarantee to evaluate the argument before building the constructor. We generate a wrapper thus wC = \a b. case a of { a' -> C a' b } (Remember 'case' always evaluates in Core.) So for strict constructors we could take advantage of the known evaluated-ness of the result to avoid the test. BUT people who care probably UNPACK their strict fields too, which is even better. The time you can't do that is for sum types data T = MkT ![Int] You also can't do it for polymorphic components. I've used code like: data T a = MkT !a foo :: T (a,b) -> a foo (MkT (x,y)) = x Here, unpacking doesn't work but foo could still access the components of the pair directly. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Some initial results with DPH
On 23/09/2008, at 14:59, Roman Leshchinskiy wrote: dotp :: [:Int:] -> [:Int:] -> Int dotp v w = I.sumP [: (I.*) x y | x <- v, y <- w :] The way the vectoriser works at the moment, it will repeat the array w (lengthP v) times, i.e., create an array of length (lengthP v * lengthP w). This is quite unfortunate and needs to be fused away but isn't at the moment. The only advice I can give is to stay away from array comprehensions for now. They work but are extremely slow. This definition should work fine: dotp v w = I.sumP (zipWithP (I.*) v w) Actually, I didn't pay attention when I wrote this. The two are not equivalent, of course. Only the second one computes the dot product. With comprehensions, you'd have to write dotp v w = I.sumP [: (I.*) x y | x <- v | y <- w :] I suspect that will perform reasonably even now. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Some initial results with DPH
Hi Austin, first of all, thanks a lot for taking the time to report your results! On 23/09/2008, at 11:48, Austin Seipp wrote: * The vectorise pass boosts compilation times *a lot*. I don't think this is exactly unwarranted since it seems like a pretty complicated transformation, but while making the primitive version using just the unlifted interface the compilation takes about 1.5 seconds, for the vectorised version it's on the order of 15 seconds. For something as trivial as this dot-product thing, that's a bit of a compilation time, though. The problem here is not the vectoriser but rather the subsequent optimisations. The vectoriser itself is (or should be - I haven't really timed it, to be honest) quite fast. It generates very complex code, however, which GHC takes a lot of time to optimise. We'll improve the output of the vectoriser eventually, but not before it is complete. For the moment, there is no solution for this, I'm afraid. * It's pretty much impossible to use ghc-core to examine the output core of the vectorised version - I let it run and before anything started showing up in `less` it was already using on the order of 100mb of memory. If I just add -ddump-simpl to the command line, the reason is obvious: the core generated is absolutely huge. Yes. Again, this is something we'll try to improve eventually. * For the benchmark included, the vectorised ver. spends about 98% of its time from what I can see in the GC before it dies from stack overflow. I haven't tried something like +RTS -A1G -RTS yet, though. IIUC, the code is dotp :: [:Int:] -> [:Int:] -> Int dotp v w = I.sumP [: (I.*) x y | x <- v, y <- w :] The way the vectoriser works at the moment, it will repeat the array w (lengthP v) times, i.e., create an array of length (lengthP v * lengthP w). This is quite unfortunate and needs to be fused away but isn't at the moment. The only advice I can give is to stay away from array comprehensions for now. They work but are extremely slow. This definition should work fine: dotp v w = I.sumP (zipWithP (I.*) v w) * The vectoriser is really, really touchy. For example, the below code sample works (from DotPVect.hs): import Data.Array.Parallel.Prelude.Int as I dotp :: [:Int:] -> [:Int:] -> Int dotp v w = I.sumP [: (I.*) x y | x <- v, y <- w :] This, however, does not work: dotp :: [:Int:] -> [:Int:] -> Int dotp v w = I.sumP [: (Prelude.*) x y | x <- v, y <- w :] This is because the vectorised code needs to call the vectorised version of (*). Internally, the vectoriser has a hardwired mapping from top-level functions to their vectorised versions. That is, it knows that it should replace calls to (Data.Array.Parallel.Prelude.Int.*) by calls to Data.Array.Parallel.Prelude.Base.Int.plusV. There is no vectorised version of (Prelude.*), however, and there won't be one until we can vectorise the Prelude. In fact, the vectoriser doesn't even support classes at the moment. So the rule of thumb is: unless it's in Data.Array.Parallel.Prelude or you wrote and vectorised it yourself, it will choke the vectoriser. I also ran into a few other errs relating to the vectoriser dying - if I can find some I'll reply to this with some results. Please do! And please keep using DPH and reporting your results, that is really useful to us! FWIW, we'll include some DPH documentation in 6.10 but it still has to be written... Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Build system idea
On 28/08/2008, at 19:27, Simon Peyton-Jones wrote: Duncan, I'm not following every detail here, but it's clear that you have some clear mental infrastructure in your head that informs and underpins the way Cabal is. Cabal takes the view that..., has principles, and is clearly partitioned internally. These things are clear to you, but my sense is that they are *not* clear even to other well-informed people. (I exclude myself from this group.) It's like the Loch Ness monster: the bits above the waves make sense only when you get an underwater picture that shows you the monster underneath that explains why the humps surface in the way they do. FWIW, I fully agree with this (although I'm not especially well-informed in this particular area). It would be immensely helpful if Cabal's philosophy was described somewhere. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Build system idea
On 28/08/2008, at 21:10, Ian Lynagh wrote: On Thu, Aug 28, 2008 at 10:27:22AM +0100, Simon Peyton-Jones wrote: PS: concerning your last point, about separating the Simple build system, that might indeed be good. Indeed, the GHC plan described here http://hackage.haskell.org/trac/ghc/wiki/Design/BuildSystem is (I think) precisely using the declarative part but not the build-system part. The * Use Cabal for Haddocking, installing, and anything else we need to do. bullet point uses the build system part. Hmm, from the previous discussion I got the impression that (large parts of) this functionality would be extracted from Simple and could then be used by other build systems. Is this wrong? Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Build system idea
On 28/08/2008, at 23:59, Simon Marlow wrote: The important thing about Cabal's way of specifying dependencies is that they can be made sound with not much difficulty. If I say that my package depends on base==3.0 and network==1.0, then I can guarantee that as long as those dependencies are present then my package will build. (but but but... I hear you say - don't touch that keyboard yet!) Suppose you used autoconf tests instead. You might happen to know that Network.Socket.blah was added at some point and write a test for that, but alas if you didn't also write a test for Network.Socket.foo (which your code uses but ends up getting removed in network-1.1) then your code breaks. Autoconf doesn't help you make your configuration sound, and you get no prior guarantee that your code will build. Cabal doesn't give this guarantee, either, since it allows you to depend on just network or on network > x. To be perfectly honest, I think neither autoconf's approach (free-form feature tests) nor Cabal's (version-based dependencies) really works for all important use cases. And I have to disagree with what you write below - I think both systems are fundamentally flawed. As I said before, what does (mostly) work IMO is depending on interfaces which are independent of packages. Being required to specify the exact interface you depend on solves the problem you describe above. It also solves the problem of name clashes with functions defined in later versions of a package. And it is still nicely declarative. Both systems are flawed, but neither fundamentally. For Cabal I think it would be interesting to look into using more precise dependencies (module.identifier::type, rather than package-version) and have them auto-generated. But this has difficult implications: implementing cabal-install's installation plans becomes much harder, for example. Interesting. From our previous discussion I got the impression that you wouldn't like something like this. :-) Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Build system idea
On 29/08/2008, at 03:11, Ian Lynagh wrote: On Fri, Aug 29, 2008 at 12:57:59AM +1000, Roman Leshchinskiy wrote: On 28/08/2008, at 21:10, Ian Lynagh wrote: On Thu, Aug 28, 2008 at 10:27:22AM +0100, Simon Peyton-Jones wrote: PS: concerning your last point, about separating the Simple build system, that might indeed be good. Indeed, the GHC plan described here http://hackage.haskell.org/trac/ghc/wiki/Design/BuildSystem is (I think) precisely using the declarative part but not the build-system part. The * Use Cabal for Haddocking, installing, and anything else we need to do. bullet point uses the build system part. Hmm, from the previous discussion I got the impression that (large parts of) this functionality would be extracted from Simple and could then be used by other build systems. Is this wrong? I thought that the proposal was to split Cabal into the declarative package specification part, and the how to build the package part? If so, then surely how to run haddock on the sources belongs in the how to build the package part? Ignore me, I misunderstood your original mail. Sorry for the confusion. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Build system idea
On 29/08/2008, at 01:31, Simon Marlow wrote: Roman Leshchinskiy wrote: On 28/08/2008, at 23:59, Simon Marlow wrote: The important thing about Cabal's way of specifying dependencies is that they can be made sound with not much difficulty. If I say that my package depends on base==3.0 and network==1.0, then I can guarantee that as long as those dependencies are present then my package will build. (but but but... I hear you say - don't touch that keyboard yet!) Suppose you used autoconf tests instead. You might happen to know that Network.Socket.blah was added at some point and write a test for that, but alas if you didn't also write a test for Network.Socket.foo (which your code uses but ends up getting removed in network-1.1) then your code breaks. Autoconf doesn't help you make your configuration sound, and you get no prior guarantee that your code will build. Cabal doesn't give this guarantee, either, since it allows you to depend on just network or on networkx. Indeed. That's why I was careful not to say that Cabal gives you the guarantee, only that it's easy to achieve it. True, it's easy to specify. But IIUC, if you do so you have to update your package whenever any of the packages you depend on changes even if that change doesn't affect you. This is a very high (if not prohibitive) cost and one which the autoconf model doesn't force on you. Both systems are flawed, but neither fundamentally. For Cabal I think it would be interesting to look into using more precise dependencies (module.identifier::type, rather than package- version) and have them auto-generated. But this has difficult implications: implementing cabal-install's installation plans becomes much harder, for example. Interesting. From our previous discussion I got the impression that you wouldn't like something like this. :-) Sorry for giving that impression. Yes I'd like to solve the problems that Cabal dependencies have, but I don't want the solution to be too costly - first-class interfaces seem too heavyweight to me. But I do agree with most of the arguments you gave in their favour. I'm not sure what you mean by first-class interfaces. Surely, if you specify the interfaces you depend on you'll want to share and reuse those specifications. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Version control systems
On 16/08/2008, at 00:12, Ian Lynagh wrote: On Fri, Aug 15, 2008 at 11:12:20AM +1000, Manuel M T Chakravarty wrote: Moreover, as I wrote a few times before, some reasons for switching in the first place are invalidated by not having the core libraries in git, too. For example, one complaint about darcs is that it either doesn't build (on the Sun Solaris T1 and T2 machines) I don't remember seeing this mentioned before, and googling for Solaris T1 darcs doesn't find anything. What goes wrong? I'd expect darcs to build anywhere GHC does. I only vaguely remember what was wrong but IIRC, the problem was that darcs 1.0.? didn't build with GHC 6.8.? because of some incompatibility in the libs and darcs 2 built ok but didn't work, probably because of libcurl issues. At that point I gave up. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Build system idea
On 14/08/2008, at 06:32, Duncan Coutts wrote: On Wed, 2008-08-13 at 22:47 +1000, Roman Leshchinskiy wrote: Again, I'm not arguing against a build system written in Haskell. I'd just like it to be completely separated from Haskell's packaging system. In particular, polluting a package description with build information seems wrong to me. There is a huge overlap of course. The things needed to build a package tend to be the dependencies. The ability to automatically extract the dependencies from a package description is crucial as it is what enables automatic package management either directly or by conversion to distro packages. Tools like automake + autoconf do not give us that. Right. Dependencies are part of a package description. That's what Cabal should do. It should provide a nice clean interface to the dependencies stuff for the build system to use. I don't think it does that at the moment; IIUC, it is all done by Distribution.Simple. There is of course some separation possible, which in Cabal roughly corresponds to the stuff under Distribution.Simple vs everything else. We could split those two aspects into separate packages but it's not clear to me that we'd gain much by doing that. My point isn't really about distribution, it's about coupling. My concern is that the syntax of .cabal files is increasingly based on what Distribution.Simple needs. This effectively makes all other build systems second class. It also loses us clean package descriptions which is what .cabal files should be. It's not too bad at the moment but will get worse as Distribution.Simple gets more complex since it will need more and more information. Just as an example, consider something like ld-options. This is obviously not a dependency and is basically only documented by how it is used by Distribution.Simple. It shouldn't be in .cabal, IMO. If a build system needs this information, it should be provided somewhere else. There is still the Make build type which we could improve if people want it. That allows the declarative stuff to be given in the .cabal file (so that package managers can do their thing) and all the building is delegated to make. People have not shown any interest in this so it's never been improved much. The obvious disadvantage of using it is that you have to do a lot of work to make your build system do all the things that users expect. But that is precisely my (other) point. A lot of that work is really unnecessary and could be done by Cabal since it only or mostly depends on the package information. Instead, it is implemented somewhere in Distribution.Simple and not really usable from the outside. For instance, a lot of the functionality of setup sdist, setup register and so on could be implemented generically and used by a make-based build system as well. Also, there is no easy way for build systems to work with the declarative stuff because a lot of that functionality is, again, part of Distribution.Simple. IMO, this is a direct result of the tight coupling between the package management and build system parts of Cabal. The other problem, of course, is that it isn't clear what exactly a build system should provide. IIUC, that's what Building and installing a package in the Cabal manual defines but there, we have things like this: setup test Run the test suite specified by the runTests field of Distribution.Simple.UserHooks. See Distribution.Simple for information about creating hooks and using defaultMainWithHooks. 
As a matter of fact, a lot of Cabal is documented in terms of what Distribution.Simple does. Again, this effectively shuts out other build systems. I'm sorry if this all sounds too negative, it shouldn't really. I think you guys have done a great job in implementing a system which is obviously very important to the community. I just somewhat disagree with the direction in which it is heading now. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Build system idea
On 14/08/2008, at 18:01, Simon Marlow wrote: Roman Leshchinskiy wrote: But that is precisely my (other) point. A lot of that work is really unnecessary and could be done by Cabal since it only or mostly depends on the package information. Instead, it is implemented somewhere in Distribution.Simple and not really usable from the outside. For instance, a lot of the functionality of setup sdist, setup register and so on could be implemented generically and used by a make-based build system as well. That's exactly what I'm proposing we do in GHC: re-use Cabal's setup register and some of the other parts of the simple build system in a make-based build system for packages. It might require a bit of refactoring of Cabal, but I don't expect it to be a major upheaval at all. Ah! I hadn't realised that you are going to reuse Cabal functionality. You wrote "Extract the code from Cabal that generates Makefiles" so I thought you won't be really using anything from Cabal. I think what you're proposing is mostly a matter of abstracting parts of Cabal with cleaner and more modular APIs, which is absolutely a good thing, but doesn't require a fundamental redesign. The tight coupling and lack of separation between Cabal's generic parts and the simple build system is somewhat accidental (lazy implementors :-), and is actually a lot better than it used to be thanks to the work Duncan has put in. I'm sure it'll improve further over time. IMO, getting this right is absolutely crucial for Cabal's usability and should be the primary short-term goal. Then again, I guess I should contribute code instead of opinions already :-) The other part of your complaint is that the BuildInfo is in the .cabal file along with the PackageDescription (the types are pretty well separated internally). Again I don't think there's anything fundamental here, and in fact some packages have separate .buildinfo files. Well, it is fundamental in the sense that this is how Cabal is used (and is supposed to be used) at the moment. It is good that Cabal separates these things internally but the separation should be enforced in the external interface, as well. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Build system idea
On 13/08/2008, at 17:47, Simon Marlow wrote: Roman Leshchinskiy wrote: On 12/08/2008, at 20:11, Simon Marlow wrote: - Extract the code from Cabal that generates Makefiles, and treat it as part of the GHC build system. Rather than generating a Makefile complete with build rules, we generate a Makefile that just has the package-specific metadata (list of modules, etc.), and put the code to actually build the package in the GHC build system. Sounds good. It would be nice if the .cabal parser from Cabal could be made into a separate, stable library which ghc (and nhc?) could use. This makes me wonder, though. Wouldn't this model make more sense for Cabal in general than the current approach of duplicating the functionality of autoconf, make and other stuff? If it works for ghc, it ought to work for other projects, too. Cabal as a preprocessor seems much more attractive to me than as a universal build system. So packages would be required to provide their own build system? That sounds like it would make it a lot harder for people to just create a package that others can use. The ease of making a Cabal package has I think a lot to do with the wealth of software available on Hackage. Of course there should be a standard build system for simple packages. It could be part of Cabal or a separate tool (for which Cabal could, again, act as a preprocessor). GHC is a special case: we already need a build system for other reasons. I agree. I just don't think that adding a full-fledged build system to Cabal is the solution. In my experience, huge monolithic tools which try to do everything never work well. I much prefer small, modular tools. A Haskell-based build system is an interesting project but why does it have to be a part of Cabal? It was a design decision early on with Cabal that we didn't want to rely on the target system having a Unix-like build environment. You might disagree with this, but it certainly has some value: a Windows user can download GHC and immediately start building and installing external packages without having to install Cygwin. I agree with this decision but IIUC, this only really works for simple (wrt building) packages which don't even use configure. Making Cabal into a modular preprocessor and providing a thin wrapper for ghc --make which can act as a target for Cabal would achieve this just as well. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Build system idea
On 13/08/2008, at 20:34, Simon Marlow wrote: Roman Leshchinskiy wrote: Of course there should be a standard build system for simple packages. It could be part of Cabal or a separate tool (for which Cabal could, again, act as a preprocessor). GHC is a special case: we already need a build system for other reasons. I agree. I just don't think that adding a full-fledged build system to Cabal is the solution. In my experience, huge monolithic tools which try to do everything never work well. I much prefer small, modular tools. A Haskell-based build system is an interesting project but why does it have to be a part of Cabal? Hmm, but you said above "there should be a standard build system for simple packages. It could be part of Cabal". On second thought, it shouldn't be part of Cabal :-) Cabal has two parts: some generic infrastructure, and a simple build system (under Distribution.Simple) that suffices for most packages. We distribute them together only because it's convenient; you don't have to use the simple build system if you don't want to. My impression of Cabal is that it is a build system with a bit of generic infrastructure. In particular, a large part of the .cabal syntax is specified in terms of this build system and some of it only really makes sense for this build system. I think perhaps you're objecting to the fact that the simple build system isn't so simple, and we keep adding more functionality to it. This is true, but the alternative - forcing some packages to provide their own build system - seems worse to me. Cabal packages do provide their own build system; it's just that they use Cabal syntax instead of, say, make. The advantage of doing this is, of course, that Cabal's syntax is simpler. Adding things to the simple build system erodes this advantage. Complex projects will still have complex build systems - the complexity will be in the .cabal files. If Cabal's goal is to be able to build any project it will basically have to duplicate the functionality of autoconf, automake, libtool, make and a couple of other tools *and* be just as flexible. I think this is neither realistic nor necessary. So where do we stop? And what about the packages that Cabal won't support when we stop? IMO, we should have stopped some time ago. A .cabal file should describe a package, not how to build it. Building should be handled by different tools with a clear interface between them and Cabal. If the build system of choice needs additional information, then that information should be provided in a separate file and not in the package description. Again, I'm not arguing against a build system written in Haskell. I'd just like it to be completely separated from Haskell's packaging system. In particular, polluting a package description with build information seems wrong to me. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Version control systems
On 10/08/2008, at 14:40, Manuel M T Chakravarty wrote: Personally, I am more than happy to stay with darcs, too, but my understanding was that at least the Simons decided that we are going to move from darcs to git. All I am saying is that whatever vcs ghc uses, you need to be able to *easily* get, modify, and commit patches to the HEAD and the boot libs with *just one* vcs. Using two vcs is going to make the current situation worse, not better. I suspect that if GHC switches to git, it will become the standard vcs in the Haskell community sooner or later. Expecting that people (especially newcomers) will use different vcs for different libraries/compilers is just unrealistic. Really, why should they? Any advantages in usability that darcs might have over git will be overshadowed by the inconvenience of having to remember two different sets of commands. I expect that many new projects will use git and old projects will start switching to it over time. So if the move is made, it should IMO include as big a chunk of the infrastructure as possible. Eventually, it will migrate to git anyway and the earlier it does, the simpler life will be for the developers. As to whether the switch should be made at all, I'm not sure. I've had my share of problems with darcs and I don't think it's suitable for a project of GHC's size at the moment. On the other hand, I suspect that a mixture of git and darcs repos will be even more problematic than what we have now. Maybe investing some time in fixing the most obvious darcs problems would be a better solution? Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Version control systems
On 10/08/2008, at 05:38, Don Stewart wrote: Instead, if we just use ubiquitous, common tools -- like git -- for everything, we minimise the pain for people, and sit firmly in the mainstream of open source. While I agree with this in general, I'm not sure it really applies to vcs (especially darcs) all that much. I don't think anyone who has ever worked with a vcs will need more than a day to learn how to use darcs (or any other sane vcs, for that matter). Really, the problem with darcs is not that it is not mainstream; rather, it's just that it simply doesn't work sometimes. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Build system woes
On 30/07/2008, at 23:46, Simon Marlow wrote: We can talk about the general issues on IRC. But I thought I'd answer a few of the specific questions quickly: Thanks, Simon! Roman Leshchinskiy wrote: I don't think I understand how GHC itself is built any longer, either. What does cabal-bin do? It's effectively a pre-compiled Setup.hs for packages that have no need for a custom Setup.hs. cabal-install does the same job for end-users, but we can't rely on having cabal-install in the GHC build system. I see. So it looks at the Build-Type in the package description and calls the right defaultMain if it's not Custom. And if my Setup.hs isn't standard then it's my responsibility to set the Build-Type to Custom in the .cabal file and this isn't checked, right? I wonder, though. What exactly does this buy us? Duncan says: The point being that linking default Setup.hs scripts all the time is a waste (especially since it doesn't parallelise). But the time spent in compiling those Setup.hs is negligible and I don't understand the "it doesn't parallelise" bit. What is runghc.wrapper? It's a template used to make a shell wrapper for a binary. There seems to be new functionality in Cabal to support this. I see. Is runghc the only program we do this for? Or will others be added gradually? What is the difference between make.library.* and build.library.* and why do we need both? This one is my fault. build.* runs 'cabal-bin build', whereas make.* runs 'cabal-bin makefile' followed by 'make'. That is, build.* ends up using ghc --make, whereas make.* ends up using a traditional makefile with ghc -M for dependencies and individual single-module compilations. The latter was added so that we could (a) use make -j and (b) compile single modules for testing/debugging purposes. Ok, this answers the following question: (and why doesn't one of them work for dph)? Because it does non-standard stuff in Setup.hs and hence no Makefile can be generated. Hence the SUBDIRS_BUILD hackery in libraries/Makefile, I assume. Thanks again! Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
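For reference, the Build-Type distinction discussed above looks roughly like this in practice (a minimal sketch; the post-build hook is just a stand-in for whatever non-standard work a package actually does):

  -- Build-Type: Simple packages need only the stock script, which
  -- cabal-bin can skip compiling entirely:
  --
  --   import Distribution.Simple
  --   main = defaultMain
  --
  -- A package doing non-standard things must declare Build-Type: Custom
  -- in its .cabal file and ship a real Setup.hs, for example:

  import Distribution.Simple

  main :: IO ()
  main = defaultMainWithHooks simpleUserHooks
    { postBuild = \_args _flags _pkg _lbi ->
        putStrLn "extra, non-standard build step goes here"
    }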
Re: [Haskell-cafe] uvector and the stream interface
Jules Bean wrote: It would also be helpful to have someone explain why we have: Ptr a, ByteString, IOUArray, IOCArray, Data.Storable.StorableArray, UArr. Of course, I know the answers to some of those questions, ByteString is obviously less polymorphic than all the others there, and Ptr a doesn't contain size information. But it seems we have a rapidly bifurcating profusion of 'typed interfaces to chunks of memory' with no obvious consistency to their naming scheme and I think it's starting to get confusing... I think the main reason for this is the lack of a generic infrastructure for efficient arrays which means that everyone is rolling their own. My hope is that the Data Parallel Haskell project will eventually provide such an infrastructure (but I'm biased, of course). We need it anyway and we are perhaps in the best position to do it at the moment. Unfortunately, we don't have too much time to work on it so development is slow. Still, I think (obviously) that the vector library is a step in the right direction. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
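To make the "typed interfaces to chunks of memory" point concrete, the same ten bytes can be materialised through several of the interfaces listed above, each with its own API and naming conventions. A quick sketch using today's module names, which may differ slightly from the libraries as they were at the time:

  import qualified Data.ByteString as B
  import Data.Array.IO (IOUArray, newListArray, readArray)
  import Foreign.Marshal.Array (newArray)
  import Foreign.Ptr (Ptr)
  import Foreign.Storable (peek)
  import Data.Word (Word8)

  main :: IO ()
  main = do
    let bytes = [0 .. 9] :: [Word8]
    -- Three different typed views of essentially the same chunk of memory:
    let bs = B.pack bytes                                        -- ByteString
    arr <- newListArray (0, 9) bytes :: IO (IOUArray Int Word8)  -- IOUArray
    ptr <- newArray bytes :: IO (Ptr Word8)                      -- bare Ptr, no size attached
    x <- readArray arr 0
    y <- peek ptr
    print (B.head bs, x, y)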
Re: [Haskell-cafe] uvector and the stream interface
stefan kersten wrote: (2) personally i much prefer the list-like interface provided by the stream-fusion powered libraries (ndp, uvector, vector). can't the stream-fusion framework and correspondingly the vector interface be separated from the memory representation, provided a particular concrete representation comes up with a stream/unstream pair? then it would be easy to swap out the underlying representation according to the required characteristics. The vector library does this. The problem is that just stream/unstream isn't sufficient to get good performance. In fact, stream fusion alone is suboptimal for many array algorithms since it can't handle array operations which require random access. In particular, it isn't enough for DPH. We need a more powerful framework which, hopefully, will also be useful for non-DPH arrays. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
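A rough sketch of what "coming up with a stream/unstream pair" means in the stream-fusion framework. The names follow the stream fusion papers; the real vector internals are organised differently and do considerably more (size hints, monadic streams, recycling), and boxed Data.Vector is used here purely as an example representation:

  {-# LANGUAGE ExistentialQuantification #-}
  import qualified Data.Vector as V

  -- Streams in the style of the stream fusion papers.
  data Step s a = Yield a s | Skip s | Done
  data Stream a = forall s. Stream (s -> Step s a) s

  -- A concrete representation joins the framework by providing a way in
  -- and a way out; ByteString, UArr etc. would each supply their own pair.
  stream :: V.Vector a -> Stream a
  stream v = Stream step 0
    where
      step i | i < V.length v = Yield (V.unsafeIndex v i) (i + 1)
             | otherwise      = Done
  {-# NOINLINE stream #-}

  unstream :: Stream a -> V.Vector a
  unstream (Stream step s0) = V.fromList (go s0)
    where
      go s = case step s of
               Yield x s' -> x : go s'
               Skip s'    -> go s'
               Done       -> []
  {-# NOINLINE unstream #-}

  -- The key rewrite rule: composing combinators never materialises the
  -- intermediate array.
  {-# RULES "stream/unstream" forall s. stream (unstream s) = s #-}

  -- Combinators are then written once, against streams:
  mapS :: (a -> b) -> Stream a -> Stream b
  mapS f (Stream step s0) = Stream step' s0
    where
      step' s = case step s of
                  Yield x s' -> Yield (f x) s'
                  Skip s'    -> Skip s'
                  Done       -> Done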
Re: [Haskell-cafe] ANNOUNCE: vector 0.1 (efficient arrays with lots of fusion)
Levi Stephen wrote: On Sun, Jul 13, 2008 at 12:31 AM, Roman Leshchinskiy [EMAIL PROTECTED] wrote: Hi all, the vector library will eventually provide fast, Int-indexed arrays with a powerful fusion framework. It's very immature at the moment (I haven't tested most of the code) and implements just a few combinators but I thought releasing early wouldn't hurt. Use at your own risk and expect things to break horribly! Sounds interesting. How does this compare to the uvector library? IIUC, uvector is based on an older version of the DPH libraries and only provides unboxed arrays. On the other hand, it's much more stable and has more functionality. I expect that the two libraries will be merged eventually. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
[Haskell-cafe] ANNOUNCE: vector 0.1 (efficient arrays with lots of fusion)
Hi all, the vector library will eventually provide fast, Int-indexed arrays with a powerful fusion framework. It's very immature at the moment (I haven't tested most of the code) and implements just a few combinators but I thought releasing early wouldn't hurt. Use at your own risk and expect things to break horribly! What it provides: * Boxed and unboxed arrays with a generic interface and a very basic set of combinators. * A powerful loop fusion framework. It is based on stream fusion but already goes beyond that (in particular, it can do some things in-place) and will be significantly extended in the future. * Extensibility. * Rudimentary documentation. * Bugs. The code is based on the DPH libraries but is much more generic and a *lot* simpler. This has only been made possible by the tremendous progress in GHC's simplifier in the recent months/years (thanks Simon!). Consequently, you'll need a recent development version of GHC to build this, 6.8 won't work. It might be able to compile the library but you'll get terrible code, much worse than with lists. If you want to try it out, compile with -O2 -fno-spec-constr-count and please don't forget to let me know about things that didn't work (or, miraculously, did). Grab it from http://hackage.haskell.org/cgi-bin/hackage-scripts/package/vector or darcs get http://darcs.haskell.org/vector Again, special thanks to Simon for doing such a wonderful job with the optimiser. Enjoy, Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
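A small taste of the intended programming model, written against the present-day unboxed interface; the 0.1 combinator set was far smaller, so treat the exact names as illustrative:

  import qualified Data.Vector.Unboxed as U

  -- With fusion, enumFromTo, the two maps and sum should melt into a
  -- single loop with no intermediate vectors allocated.
  sumSquares :: Int -> Double
  sumSquares n = U.sum (U.map (\x -> x * x) (U.map fromIntegral (U.enumFromTo 1 n)))

  main :: IO ()
  main = print (sumSquares 100)

As noted above, this only pays off when compiled with -O2 (plus, at the time, -fno-spec-constr-count).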
Re: [Haskell-cafe] ANNOUNCE: vector 0.1 (efficient arrays with lots of fusion)
Sebastian Sylvan wrote: Is there any more (easily-digested, like a paper) information available about this? Specifically what things can happen in-place, and future extensions... Apart from the stream fusion papers, unfortunately no, it's all very much work in progress. Basically, at the moment it will avoid allocating some unnecessary arrays even if no loop fusion happens. So for instance, in something like map f (xs // us) the two loops can't be fused ((//) is the same update operation as in Data.Array). Usually, you'd allocate one array for (xs // us) and then another one for the map. This library will do the map in-place if f doesn't change the type of the elements. This is pretty important at least for data-parallel code. I'll have to see how far this scales. In the future, the fusion system will also be able to handle multiple directions of traversals and permutations (it can only do left-to-right traversals at the moment). The Rewriting Haskell Strings paper talks a bit about that but DPH has much more complex requirements so I'll have to use a different approach which I haven't implemented so far. Roman ___ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe
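Spelled out by hand, the difference amounts to the two definitions below (using today's vector API purely for illustration; the point of the fusion framework is that something like the second version is produced automatically from code written like the first):

  import qualified Data.Vector as V
  import qualified Data.Vector.Mutable as M
  import Control.Monad (forM_)

  -- The naive reading of map f (xs // us): one array for the update,
  -- a second one for the map.
  naive :: (a -> a) -> V.Vector a -> [(Int, a)] -> V.Vector a
  naive f xs us = V.map f (xs V.// us)

  -- Doing the map in place: one fresh copy of xs, the updates written
  -- into it, and f applied to every slot of that same buffer. This is
  -- only possible because f does not change the element type.
  inplace :: (a -> a) -> V.Vector a -> [(Int, a)] -> V.Vector a
  inplace f xs us = V.modify step xs
    where
      step mv = do
        forM_ us $ \(i, x) -> M.write mv i x
        forM_ [0 .. M.length mv - 1] $ \i -> M.write mv i . f =<< M.read mv i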
Re: Generalized phase control for GHC
Simon Peyton-Jones wrote: Ah -- Roman you mean you want to add a phase-ordering constraint at some time *other* than when you declare one or other of the phases. Are you sure this is important? Fairly. I've explained why in a follow-up to Max's message. Also, why do you want phase aliases? Modularity. Let's take NDP fusion as an example. We have 2 logically distinct fusion phases, one for fusing distributed stuff and one for stream fusion (let's call them DIST and STREAM). At the moment, the two are performed simultaneously, i.e., we have DIST = STREAM. However, I suspect that having DIST < STREAM might produce better results. This means that I'd like to use INLINE DIST when implementing distributed fusion, INLINE STREAM for stream fusion and only have one place where I declare DIST = STREAM or DIST < STREAM. I don't quite see how to achieve this without aliases. This will be even more of a problem once I add additional fusion layers. Incidentally, this is also an example of why adding ordering constraints on already declared phases is useful. Neither of the two fusion systems really depends on the other so it would be nice to be able to put the DIST < STREAM or DIST = STREAM declaration into a module which *integrates* the two. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
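For contrast, this is roughly how a DIST-before-STREAM arrangement has to be expressed with today's numeric phases. The functions below are throwaway stubs; the point is that the relative order lives in hard-coded numbers repeated across both libraries rather than in a single declaration:

  module PhaseSketch where

  -- Throwaway stubs, only here to carry the pragmas.
  data    Dist   a = Dist   [a]
  newtype Stream a = Stream [a]

  splitD :: [a] -> Dist a
  splitD = Dist
  {-# NOINLINE splitD #-}

  joinD :: Dist a -> [a]
  joinD (Dist xs) = xs
  {-# NOINLINE joinD #-}

  stream :: [a] -> Stream a
  stream = Stream
  {-# NOINLINE stream #-}

  unstream :: Stream a -> [a]
  unstream (Stream xs) = xs
  {-# NOINLINE unstream #-}

  -- GHC's simplifier runs phases 2, 1, 0 in that order. "DIST before
  -- STREAM" is encoded by letting the DIST rule fire from phase 2 on,
  -- while the STREAM rule only becomes active from phase 1, so the
  -- distributed layer gets a whole phase to itself first.
  {-# RULES "joinD/splitD"    [2] forall xs. joinD (splitD xs)    = xs #-}
  {-# RULES "unstream/stream" [1] forall ys. unstream (stream ys) = ys #-}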
Re: Generalized phase control for GHC
Max Bolingbroke wrote: Hi Roman, Three things. Firstly, what would lenient ordering be useful for? You probably had a specific use case in mind? I suspect that when you have multiple plugins all specifying constraints on the phase ordering independently it is possible to end up in a situation where using each plugin individually results in a consistent phase ordering but a combination of multiple plugins causes a cycle to appear in the order. Hence it is useful to let plugin authors flag up dependencies they don't need to have so we can intelligently break the cycle using those constraints rather than just giving up. Admittedly I only have a superstition that this will be a practical problem. If you don't need a dependency and it can be ignored anyway, why would you want to specify it in the first place? I just can't quite imagine a situation in which I would use this. As to cycles, I think just aborting if they occur would be ok. Secondly, I think it is quite important to be able to specify dependencies for already declared phases. That is, I (probably) want to be able to say in module C that phase X from module A should come before phase Y from module B. This is interesting. You're right that it's not possible with this system: to support this you would need a class + instance flavour rather than a single-declaration flavour for phases. Do you have a practical example in mind of how you would use it? Yeah, I'd prefer the class/instance model. An example: suppose you have two independent libraries which implement two optimisations and you want one of them to run before another. I think we really need this if we want to compose optimisations. I suspect that's not quite possible with your current design. Lastly, in addition to < and > I'd also like to have = (i.e., phase X = phase Y which would make X an alias for Y). Ah! This would also let you do what you wanted with your second point: module C where {-# PHASE XAlias = A.X, B.Y #-} True. It's a bit of a hack, though :-) Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
Re: Generalized phase control for GHC
Max Bolingbroke wrote: If you don't need a dependency and it can be ignored anyway, why would you want to specify it in the first place? I just can't quite imagine a situation in which I would use this. I think it makes sense because many of the inter-pass dependencies we have in the GHC pipeline today are actually somewhat lenient. For example, something quite sensible will happen if we do CSE before full laziness, it will simply lead to less efficient code. In contrast, it's imperative that we perform strictness analysis before applying worker-wrapper based on the annotations it adds. I see. I suspect I'd prefer if GHC asked me to manually resolve such conflicts rather than silently generating suboptimal code but then again, perhaps not. In any case, my comments mostly apply to phase control for inlining and rule matching; other passes probably have different requirements which I haven't thought about at all. Yeah, I'd prefer the class/instance model. An example: suppose you have two independent libraries which implement two optimisations and you want one of them to run before another. I think we really need this if we want to compose optimisations. This is an interesting observation. So, you imagine that the user program itself might import the compiler plugins and add a constraint between the phases they export. I'm not sure how many users would really want to do this, but I can see it being useful. What I had in mind is a library/plugin which uses other libraries/plugins. For instance, let's say we have two libraries which both implement rule-based fusion, one for some high-level stuff and one for something low level. Now, a third library which uses the first two might want to specify that high-level fusion should be done before low-level fusion. Or consider something like loop optimisations, where you might have separate plugins which implement unrolling, PRE and so on and one plugin which aggregates all these to provide a complete loop optimiser. It's all about compositionality, really. Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
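Under the proposed named-phase scheme, the aggregating library from the loop-optimiser example might look something like this. All of the syntax, module names and phase names below are hypothetical: this is the feature being asked for, not something any released GHC accepts.

  -- Hypothetical: proposed pragma syntax, made-up modules and phases.
  module LoopOptimiser where

  import Unrolling ()   -- suppose it declares {-# PHASE UNROLL #-} and its rules
  import PRE       ()   -- suppose it declares {-# PHASE PRE #-} and its rules

  -- The aggregating library is the one place that fixes the relative
  -- order of the two otherwise independent optimisations:
  {-# PHASE Unrolling.UNROLL < PRE.PRE #-}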
Re: Generalized phase control for GHC
Hi Max, sorry for replying so late, I'd completely forgotten about this. I would be interested in feedback on the design before the implementation is complete and in the wild. I'm especially interested in hearing if you believe that loss of support for numeric phase numbers > 2 is a problem, as this is the only breaking change that I'm proposing. Three things. Firstly, what would lenient ordering be useful for? You probably had a specific use case in mind? Secondly, I think it is quite important to be able to specify dependencies for already declared phases. That is, I (probably) want to be able to say in module C that phase X from module A should come before phase Y from module B. I suspect that's not quite possible with your current design. Lastly, in addition to < and > I'd also like to have = (i.e., phase X = phase Y which would make X an alias for Y). As to the numeric phase ids, I'm all in favour of dropping them altogether. For legacy code, you could provide wired-in names 0, 1 and 2 for now but I would deprecate them. Also, I'm one of the probably very few people who have ever used phase 3 and beyond and dropping support for that is perfectly fine with me. Thanks for working on this! Roman ___ Glasgow-haskell-users mailing list Glasgow-haskell-users@haskell.org http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
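For readers who haven't used the numeric scheme: activations name simplifier phases, which GHC runs as 2, 1, 0 by default, and "phase 3 and beyond" means raising the phase count and writing activations above 2. A minimal sketch (in current GHC the extra phases are enabled with a flag such as -fsimplifier-phases=4):

  module NumericPhases where

  combine :: [Int] -> Int
  combine = sum . map (+ 1)
  {-# INLINE [3] combine #-}   -- do not inline combine until phase 3 is reached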