Re: [Haskell-cafe] Strict type system allows for a maximum number of programming errors to be caught at compile time.

2010-05-04 Thread Roman Leshchinskiy
On 04/05/2010, at 13:30, Luke Palmer wrote:

 On Mon, May 3, 2010 at 11:07 AM, Kyle Murphy orc...@gmail.com wrote:
 
 The fact that it doesn't is proof enough that there's a problem
 with it even if that problem is simply that the types you're using aren't
 exactly correct. Further, I'd argue that in the first instance with a
 non-strict type system, the instance of wrong code that compiles would be
 higher. The only argument to support non-strict typing would be if you could
 show that it takes less time to track down runtime bugs than it does to fix
 compile time type errors, and any such claim I'd be highly skeptical of.
 
 Clearly.  But many people believe in this methodology, and use test
 suites and code coverage instead of types.  Indeed, such practices are
 essentially empirical type checking, and they afford the advantage
 that their verification is much more expressive (however less
 reliable) than our static type system, because they may use arbitrary
 code to express their predicates.

I don't think it's a question of types vs. testing. Rather, it's types + 
testing vs. just testing. How is the latter more expressive than the former for 
defining properties of programs?

Also, testing loses a lot of appeal once you start dealing with concurrent 
programs. Testing for "this program doesn't have race conditions" isn't exactly 
easy. You want as many static guarantees as possible.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Parallel Haskell: 2-year project to push real world use

2010-05-03 Thread Roman Leshchinskiy
On 03/05/2010, at 22:04, Johan Tibell wrote:

 On Mon, May 3, 2010 at 11:12 AM, Simon Peyton-Jones simo...@microsoft.com 
 wrote:
 | Does this mean DPH is ready for abuse?
 |
 | The wiki page sounds pretty tentative, but it looks like it's been awhile
 | since it's been updated.
 |
 | http://www.haskell.org/haskellwiki/GHC/Data_Parallel_Haskell
 
 In truth, nested data parallelism has taken longer than we'd hoped to be 
 ready for abuse :-).   We have not lost enthusiasm though -- Manuel, Roman, 
 Gabi, Ben, and I talk on the phone each week about it.  I think we'll have 
 something usable by the end of the summer.
 
 That's very encouraging! I think people (me included) have gotten the 
 impression that the project ran into problems so challenging that it stalled. 
 Perhaps a small status update once in a while would give people a better idea 
 of what's going on. :)

We keep running into challenging problems (the last one was a volcano) but we 
never stall. Things like the new GHC inliner, vector, repa are all part of DPH 
work.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Parallel Haskell: 2-year project to push real world use

2010-05-03 Thread Roman Leshchinskiy
On 04/05/2010, at 09:21, Christian Höner zu Siederdissen wrote:

 Hi,
 
 on that topic, consider this (rather trivial) array:
 
 a = array (1,10) [ (i,f i) | i <- [1..10]] where
  f 1 = 1
  f 2 = 1
  f i = a!(i-1) + a!(i-2)
 
 (aah, school ;)
 
 Right now, I am abusing vector in ST by doing this:
 
 a  <- new
 a' <- freeze a
 forM_ [3..10] $ \i -> do
   write a (a'!(i-1) + a!(i-2))
 
 Let's say I wanted to do something like this in dph (or repa), does that
 work? We are actually using this for RNA folding algorithms that are at
 least O(n^3) time. For some of the more advanced stuff, it would be
 really nice if we could just parallelize.

Do you really just need a prefix sum? These are easily parallelisable if the 
operator is associative. For instance, you could implement the Fibonacci 
sequence as:

mapP fst $ scanP (\(a,b) _ -> (a+b,a)) (1,0) $ replicateP n (0,0)

and DPH would parallelise it. That's how I would write the above with vector as 
well.
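For reference, a rough, untested sketch of the same computation with Data.Vector.Unboxed 
(the combinator names below are from the current vector API, not from this email):

  import qualified Data.Vector.Unboxed as U

  fibs :: Int -> U.Vector Int
  fibs n = U.map fst
         $ U.scanl (\(a,b) _ -> (a+b,a)) (1,0)
         $ U.replicate n ((0,0) :: (Int,Int))   -- scanl keeps the seed, so this has n+1 elements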

 To summarise: I need arrays that allow in-place updates.

In-place updates + parallelism = bad! That's oversimplifying, of course. But 
the transformations underlying DPH, for instance, simply don't work in the 
presence of side effects.

 Otherwise, most libraries that do heavy stuff (O(n^3) or worse) are
 using vector right now. On a single core, it performs really great --
 even compared to C-code that has been optimized a lot.

That's great to know! Do you (or anyone else) by any chance have any benchmarks 
you could share? At the moment, I'm only benchmarking vector with a couple of 
rather simplistic algorithms which is a bit of a problem.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] Haskell and the Software design process

2010-05-03 Thread Roman Leshchinskiy
On 03/05/2010, at 06:02, Jaco van Iterson wrote:

 I was just wondering what methods are best to design/model the software in 
 bigger projects when you are planning to use Haskell.
 Is there no difference compared to other languages? Are there any Haskell 
 tools?

In addition to what Don said, here are a couple of things I've learned. This is 
just from personal experience so YMMV.

Design in Haskell is much more often bottom-up than in, say, traditional OO 
where it's frequently top-down all the way. I believe this is mainly due to 
purity. When you have some kind of global state, your design process often has 
to be top-down because of intricate interactions between program components 
which modify that state.

Designing Haskell software tends to involve much fewer diagrams than OO. Your 
most important design tool is the type system. You can often express large 
chunks of your design through types and have the compiler check and enforce 
them. Fiddling with types is often part of the design process and should be 
treated accordingly. If you stumble on a useful design pattern, think about how 
to encode it in the type system (this is quite different from OO patterns).

Higher-order functions and type classes are very powerful tools for reducing 
coupling and for implementing design patterns.
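To make that concrete, a contrived sketch (all names invented): a small type class serves 
as the interface between two subsystems, and a higher-order function stands in for an 
OO-style strategy object.

  class Logger l where
    logMsg :: l -> String -> IO ()

  -- The reporting code depends only on the Logger interface and on a rendering
  -- function supplied by the caller, not on any concrete logger implementation.
  report :: Logger l => l -> (Double -> String) -> [Double] -> IO ()
  report logger render = mapM_ (logMsg logger . render)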

Prototyping is very cheap and easy. Writing prototypes and playing with them in 
ghci allows you to see how your subsystems will behave and adjust the design 
accordingly. In general, you ought to write code (esp. type signatures) while 
designing. 

Some libraries/subsystems will evolve into or start out as EDSLs. This is good 
and should be encouraged. Identifying EDSLs that would be useful for 
implementing your software is an important step in the design process.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers

2010-04-26 Thread Roman Leshchinskiy
On 24/04/2010, at 22:42, Roman Leshchinskiy wrote:

 On 24/04/2010, at 22:06, Barak A. Pearlmutter wrote:
 
 I was thinking of this:
 
 data T = T Double deriving ( Eq, Ord )
 
 ... GHC basically produces
 
 instance Ord T where
   compare (T x) (T y) = compare x y
   t < u = compare t u == LT
 
 That is indeed what it does.  Which is a plain old bug, since it leads
 to inconsistent behaviour between wrapped vs unwrapped values.
 
 *Main> T (0/0) == T (0/0)
 False
 *Main> T (0/0) < T (0/0)
 False
 *Main> T (0/0) > T (0/0)
 True
 *Main> (0/0) > (0/0)
 False
 
 Urgh. You're right, I hadn't thought of this. Would you care to submit a bug 
 report?

I submitted one but on further reflection, this is not so simple. Let's look at 
pairs as an example. At the moment, (>) is implemented basically like this:

  (a,b) > (c,d) = case compare a c of
                    LT -> False
                    EQ -> compare b d == GT
                    GT -> True

Of course, this means that (0/0,'a') > (0/0,'a') is True. So we could change the 
implementation:

   (a,b) > (c,d) = a > c || (a == c && b > d)

But now we compare a to c twice which is very bad for, say, ([Int],Int). 
Clearly, we want to use the first definition but it leads to inconsistent 
results for Doubles. I don't see how to solve this while keeping IEEE semantics 
of silent NaNs.
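To make the inconsistency concrete, here is a small self-contained check; gtCompare and 
gtShortCircuit are just throwaway names for the two candidate definitions above.

  gtCompare, gtShortCircuit :: (Double, Char) -> (Double, Char) -> Bool
  gtCompare (a,b) (c,d) = case compare a c of
                            LT -> False
                            EQ -> compare b d == GT
                            GT -> True
  gtShortCircuit (a,b) (c,d) = a > c || (a == c && b > d)

  main :: IO ()
  main = do
    let nan = 0/0 :: Double
    print (gtCompare      (nan,'a') (nan,'a'))   -- True: compare on NaN yields GT
    print (gtShortCircuit (nan,'a') (nan,'a'))   -- False: IEEE says NaN > NaN is False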

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] The instability of Haskell libraries

2010-04-24 Thread Roman Leshchinskiy
On 24/04/2010, at 07:29, Don Stewart wrote:

 Oh, the Platform has very strict standards about APIs,

What is an API? The package versioning policy only seems to talk about types 
and function signatures. John's old-locale example shows that this is not 
enough.

Would it perhaps make sense for at least the Platform to require packages to 
have unit tests and to require versions to be bumped whenever those change 
(sufficiently)?

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] The instability of Haskell libraries

2010-04-24 Thread Roman Leshchinskiy
On 24/04/2010, at 18:06, Ivan Lazar Miljenovic wrote:

 Roman Leshchinskiy r...@cse.unsw.edu.au writes:
 
 On 24/04/2010, at 07:29, Don Stewart wrote:
 
 Oh, the Platform has very strict standards about APIs,
 
 What is an API? The package versioning policy only seems to talk about
 types and function signatures. John's old-locale example shows that
 this is not enough.
 
 I would think that the API is all the
 functions/classes/datatypes/instances/etc. exported from the library in
 combination with their types.

So the semantics of those functions doesn't matter at all?

 Would it perhaps make sense for at least the Platform to require
 packages to have unit tests and to require versions to be bumped
 whenever those change (sufficiently)?
 
 I don't get this; just because someone changes a unit test (because they
 thought of a new case, etc.) they should bump the package version even
 if all the changes were internal and not exported?

Adding new tests (i.e., new postconditions) doesn't change the API. Loosening 
preconditions doesn't, either. Also, the tests would only cover the exposed 
part of the library, of course. Internal tests are of no concern to the 
library's clients.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] The instability of Haskell libraries

2010-04-24 Thread Roman Leshchinskiy
On 24/04/2010, at 18:54, Ivan Lazar Miljenovic wrote:

 Roman Leshchinskiy r...@cse.unsw.edu.au writes:
 
 On 24/04/2010, at 18:06, Ivan Lazar Miljenovic wrote:
 I would think that the API is all the
 functions/classes/datatypes/instances/etc. exported from the library in
 combination with their types.
 
 So the semantics of those functions doesn't matter at all?
 
 What do you refer to by semantics?  Can you provide an example of when
 what you consider to be the API to change when the functions, types,
 etc. don't?

John Goerzen gave one in the very first post of this thread: the fix to 
old-locale which didn't change any types but apparently changed the behaviour 
of a function quite drastically. Another example would be a change to the Ord 
instances for Float and Double which would have compare raise an exception on 
NaNs as discussed in a different thread on this list. Another one, which is 
admittedly silly but demonstrates my point, would be changing the 
implementation of map to

map _ _ = []

In general, any significant tightening/changing of preconditions and 
loosening/changing of postconditions would qualify. 
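For illustration, a QuickCheck-style property (sketch only) captures exactly the kind of 
contract that the type signature alone doesn't: the broken map above still has type 
(a -> b) -> [a] -> [b] but fails this property immediately.

  import Test.QuickCheck

  prop_mapPreservesLength :: [Int] -> Bool
  prop_mapPreservesLength xs = length (map (+1) xs) == length xs

  main :: IO ()
  main = quickCheck prop_mapPreservesLength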

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers

2010-04-24 Thread Roman Leshchinskiy
On 24/04/2010, at 19:56, Barak A. Pearlmutter wrote:

 And yet a lot of generic code is written in terms of compare.
 
 That's can be an advantage, because often that code *should* blow up
 when it gets a NaN.  E.g., sorting a list of Floats which includes a
 NaN.

However, often you will know that the list doesn't contain NaNs and will still 
have to pay a performance penalty. It's a question of what the right default is 
- safety or performance. In the case of floating point numbers, I'm leaning 
towards performance.

That said, I would be very much in favour of providing a SafeFloat or whatever 
type with much safer semantics than IEEE floats and trying to get people to use 
that type by default unless they really need the performance.
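Purely as an illustration of what such a type might look like (not a concrete proposal): 
a newtype whose compare refuses to invent an ordering for NaNs, along the lines of the 
compare suggested elsewhere in this thread.

  newtype SafeDouble = SafeDouble Double
    deriving (Eq, Show)

  instance Ord SafeDouble where
    compare (SafeDouble x) (SafeDouble y)
      | x < y     = LT
      | x == y    = EQ
      | x > y     = GT
      | otherwise = error "SafeDouble: no consistent ordering (NaN)"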

 Even deriving(Ord) only produces compare and relies on standard
 definitions for other methods.
 
 I don't think that's actually a problem.  Surely the IEEE Floating
 Point types would give their own definitions of not just compare but
 also , =, etc, overriding the problematic deriving(Ord) definitions
 of comparison in terms of compare and vice-versa.

I was thinking of this:

data T = T Double deriving ( Eq, Ord )

Unless I'm mistaken, at the moment GHC basically produces

instance Ord T where
  compare (T x) (T y) = compare x y
  t < u = compare t u == LT
  ...

That is, all comparisons on T would be paying the NaN performance tax.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers

2010-04-24 Thread Roman Leshchinskiy
On 24/04/2010, at 22:06, Barak A. Pearlmutter wrote:

 Currently the standard prelude has default definition:
 
...
    compare x y
      | x == y    =  EQ
      | x <= y    =  LT
      | otherwise =  GT
 
 I'd suggest
 
 [...]
 
    compare x y
      | x < y     =  LT
      | x == y    =  EQ
      | x > y     =  GT
      | otherwise =  error "no consistent ordering"
 
 It is not clear to me that this would cause a measurable performance
 hit in the case of floating point numbers.  We're talking about at
 most two extra instructions: a compare and a conditional branch.  The

The problem is not so much the additional instructions. Rather, it's the fact 
that compare for Float and Double can fail at all, which inhibits some 
optimisations. For instance, GHC is free to eliminate the comparison in 
(x `compare` y) `seq` a but wouldn't be with your change. It doesn't actually do 
that at the moment, which looks like an optimiser deficiency to me. But in any 
case, the "can fail" property sometimes has a significant effect on 
optimisations.

 I was thinking of this:
 
 data T = T Double deriving ( Eq, Ord )
 
 ... GHC basically produces
 
 instance Ord T where
  compare (T x) (T y) = compare x y
  t < u = compare t u == LT
 
 That is indeed what it does.  Which is a plain old bug, since it leads
 to inconsistent behaviour between wrapped vs unwrapped values.
 
 *Main> T (0/0) == T (0/0)
 False
 *Main> T (0/0) < T (0/0)
 False
 *Main> T (0/0) > T (0/0)
 True
 *Main> (0/0) > (0/0)
 False

Urgh. You're right, I hadn't thought of this. Would you care to submit a bug 
report?

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers

2010-04-23 Thread Roman Leshchinskiy
On 23/04/2010, at 01:34, Barak A. Pearlmutter wrote:

 I'd suggest that compare involving a NaN should yield
 
    error "violation of the law of the excluded middle"

Please think of the poor guys trying to write high-performance code in Haskell!

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Proper Handling of Exceptional IEEE Floating Point Numbers

2010-04-23 Thread Roman Leshchinskiy
On 24/04/2010, at 07:15, Barak A. Pearlmutter wrote:

 In all seriousness, I think it is reasonable when isNaN x for
 x < C
 x == C
 x > C
 C < x
 C == x
 C > x
 to all be False, for all floats C, even C == x, as a sort of efficient
 weak Bool bottom. This is what the FP hardware does --- so it is very
 efficient.
 
 But if you force the system to choose one of the three, which is what
 compare x C
 is doing, I think the result should be _|_.  Because there is no way
 to choose, no reasonable Ordering to return.
 
 It is possible to write generic Ord n => code under these
 conditions, if you're careful to case out <, ==, > when you don't want a
 NaN to kill the computation, and when necessary handle the case that
 all three come out false.  That's what good numeric programmers
 actually do.  But compare giving a wrong Ordering is an invitation
 to get it wrong.

And yet a lot of generic code is written in terms of compare. Even 
deriving(Ord) only produces compare and relies on standard definitions for 
other methods. Don't get me wrong, I don't think the current situation is ideal 
(although it doesn't seem all that bad to me). But this change would have 
far-reaching implications for performance which ought to be evaluated before it 
can be seriously considered, in my opinion.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


index*OffAddr

2010-04-21 Thread Roman Leshchinskiy
In package vector, primitive vectors (the ones that Data.Vector.Unboxed is 
built on top of) are represented as follows (ByteArray and friends are wrappers 
for various GHC primitives provided by package primitive):

data Vector a = Vector Int   -- offset into the ByteArray
   Int   -- length
   ByteArray -- data

This representation supports cheap slicing which is quite crucial. However, 
indexing into such vectors is a bit more expensive than necessary:

index (Vector i _ arr) j = indexByteArray arr (i+j)

Ultimately, this requires 2 additions to get the element's address:

  base address of the ByteArray + ((i + j) * size of element)

I'd like to always allocate pinned ByteArrays and store the starting address of 
the vector instead of the offset:

data Vector a = Vector Addr
   Int
   ByteArray

This would make indexing cheaper as it would require only one addition:

index (Vector addr _ _) i = indexOffAddr addr i

This is quite a big deal if indexing happens in an inner loop (some algorithms 
become up to 20% faster). Of course, the backend could optimise the 
offset-based version by performing partial redundancy elimination but it 
doesn't and it probably wouldn't get all interesting cases even if it did. So 
the second version is better.

The problem is that I can't implement it because I must touch the ByteArray 
after accessing the memory. This results in code like this which hides the 
constructor, breaking various optimisations:

  case indexIntOffAddr# addr# i# of { n# ->
  case touch# arr# realWorld# of { _ -> I# n# }}

After thinking about this for a while, I came up with two possible solutions. 
One is to provide a pure version of touch#:

  use# :: o -> o' -> o'

such that use# x y = y. This would change the code above to:

  I# (use# arr# (indexIntOffAddr# addr# i#))

I don't know how to implement this, though, because use# would have to be able 
to return arbitrary (unboxed) types and the code generator doesn't really seem 
to support this.

A perhaps simpler solution is to add a new set of primitives:

  indexIntOffAddrUsing# :: o -> Addr# -> Int# -> Int#
  ...

These would take an additional argument which they'd touch and otherwise 
ignore. The code would then become:

  I# (indexIntOffAddrUsing# arr# addr# i#)

Incidentally, the index*OffAddr# primitives don't seem to be used anywhere.

Any thoughts?

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] vector recycling

2010-04-18 Thread Roman Leshchinskiy
On 18/04/2010, at 08:07, Ben wrote:

 On Fri, Apr 16, 2010 at 11:19 PM, Roman Leshchinskiy r...@cse.unsw.edu.au 
 wrote:
 That said, it would be quite possible to provide something like the 
 following:
 
  fold_inplace :: Vector v a => (v a -> b -> v a) -> v a -> [b] -> v a
 
 as far as i understand there would be two ways of writing such a
 function : 1) to use mutable vectors monadically underneath and hide
 them inside some kind of unsafeX, or 2) to give a specialized fold
 with sufficient hints to the compiler to use the rewriting framework.

Right, I meant 2. I'm not saying it's necessarily a good idea, just that it 
would be possible.

 This could use the recycling framework to safely do as much in-place as 
 possible while still preserving a purely functional interface. I have to 
 think about it. Really, this looks like just a poor man's substitute for 
 linear types.
 
 although i am supposed to know something about category theory, since
 my training is in math, i don't know about girard's later work.  is
 there a short precis you can give (or a pointer?)

This is a nice introduction:

http://homepages.inf.ed.ac.uk/wadler/papers/linear/linear.ps

Also, Clean's uniqueness types are quite similar.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector recycling

2010-04-17 Thread Roman Leshchinskiy
On 17/04/2010, at 13:32, Ben wrote:

 module Main where
 
 import qualified Data.Vector.Generic as V
 import qualified Data.Vector.Unboxed as UV
 
 type Vec = UV.Vector Double
 
  axpy :: Double -> Vec -> Vec -> Vec
  axpy a x y = V.zipWith (+) (V.map (* a) x) y
 
  sumVecs :: [(Double, Vec)] -> Vec
  sumVecs axs =
      let (a, x) = head axs
      in  foldl adder (V.map (* a) x) (tail axs)
    where adder :: Vec -> (Double, Vec) -> Vec
          adder v1 (a, x) = axpy a x v1
 
 how to i write this in a way which ensures recycling / fusion, e.g.
 in-place updates?

Unfortunately, recycling won't help you here. It is a purely local optimisation 
which doesn't work across function boundaries (inlining notwithstanding) and 
recursive calls. Your best bet is to use a mutable vector and do the fold in 
the ST monad.
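A rough, untested sketch of that approach, assuming a reasonably recent 
Data.Vector.Unboxed API (create plus the usual mutable operations):

  import Control.Monad (forM_)
  import qualified Data.Vector.Unboxed as V
  import qualified Data.Vector.Unboxed.Mutable as M

  type Vec = V.Vector Double

  sumVecs :: [(Double, Vec)] -> Vec
  sumVecs []             = V.empty
  sumVecs ((a, x) : axs) = V.create $ do
    -- start from a*x, then accumulate each b*y into the same buffer
    acc <- M.new (V.length x)
    forM_ [0 .. V.length x - 1] $ \i ->
      M.write acc i (a * V.unsafeIndex x i)
    forM_ axs $ \(b, y) ->
      forM_ [0 .. M.length acc - 1] $ \i -> do
        v <- M.read acc i
        M.write acc i (v + b * V.unsafeIndex y i)
    return acc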

That said, it would be quite possible to provide something like the following:

fold_inplace :: Vector v a => (v a -> b -> v a) -> v a -> [b] -> v a

This could use the recycling framework to safely do as much in-place as 
possible while still preserving a purely functional interface. I have to think 
about it. Really, this looks like just a poor man's substitute for linear types.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Strange error with type classes + associated types

2010-04-17 Thread Roman Leshchinskiy
On 17/04/2010, at 11:00, Conal Elliott wrote:

 I'm unsure now, but I think I tried making Basis a data type (not syn) and 
 ran into the problem I mentioned above.  The Basis *synonyms* also have 
 HasTrie instances, which is crucially important.  If we switch to (injective) 
 data types, then we lose the HasTrie instances.  I'd be okay with defining 
 HasTrie instances (preferably via deriving) for the associated Basis data 
 types, but I couldn't figure out how to.  Maybe it's not possible currently, 
 or maybe I just didn't know how.

Could you perhaps make (:-*) a proper type rather than a synonym? That would 
help with the ambiguity.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Floyd Warshall performance (again)

2010-04-16 Thread Roman Leshchinskiy
On 16/04/2010, at 18:06, Mathieu Boespflug wrote:

 shortestPath :: [(Int, Int, Int)] -> UArray Int Int
 shortestPath g = runSTUArray $ do
   let mnew = newArray (0, SIZE * SIZE) 1
       mread arr i j = unsafeRead arr (i * SIZE + j)
       mwrite arr i j x = unsafeWrite arr (i * SIZE + j) x
   unsafeIOToST $ hSetBuffering stdout LineBuffering
   unsafeIOToST $ putStrLn "Allocating ..."
   pm <- mnew
   unsafeIOToST $ putStrLn "Allocating ... done"
   let loop1 SIZE = return ()
       loop1 k = let loop2 SIZE = return ()
                     loop2 i = let loop3 SIZE = return ()
                                   loop3 j = do
                                     xij <- mread pm i j
                                     xik <- mread pm i k
                                     xkj <- mread pm k j
                                     mwrite pm i j (min xij (xik + xkj))
                                     loop3 (j + 1)
                               in loop3 0 >> loop2 (i + 1)
                 in loop2 0 >> loop1 (k + 1)
   loop1 0
   return pm

In general, GHC doesn't like nested loops. You might want to try the following 
structure:

  loop1 SIZE = return ()
  loop1 k = loop2 k 0

  loop2 k SIZE = loop1 (k+1)
  loop2 k i = loop3 k i 0

  loop3 k i SIZE = loop2 k (i+1)
  loop3 k i j = do
      ...
      loop3 k i (j+1)

And, as Max suggested, the llvm backend ought to improve things.
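Filled out, the restructuring might look like this. It is a sketch only: it keeps the 
quoted initialisation, ignores the graph argument just as the quoted fragment does, and 
writes SIZE as an ordinary constant instead of a CPP macro.

  import Data.Array.Base (unsafeRead, unsafeWrite)
  import Data.Array.ST (newArray, runSTUArray)
  import Data.Array.Unboxed (UArray)

  size :: Int
  size = 1000

  shortestPath :: [(Int, Int, Int)] -> UArray Int Int
  shortestPath _g = runSTUArray $ do
    pm <- newArray (0, size * size) 1
    let mread  i j   = unsafeRead  pm (i * size + j)
        mwrite i j x = unsafeWrite pm (i * size + j) x

        loop1 k | k == size = return ()
                | otherwise = loop2 k 0
        loop2 k i | i == size = loop1 (k + 1)
                  | otherwise = loop3 k i 0
        loop3 k i j
          | j == size = loop2 k (i + 1)
          | otherwise = do
              xij <- mread i j
              xik <- mread i k
              xkj <- mread k j
              mwrite i j (min xij (xik + xkj))
              loop3 k i (j + 1)
    loop1 0
    return pm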

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Can't install Criterion package on ghc ..

2010-04-14 Thread Roman Leshchinskiy
On 15/04/2010, at 02:55, John Lato wrote:

 The problem isn't with criterion itself, but with vector-algorithms.
 The vector library relies heavily on type families, which have dodgy
 support in ghc-6.10.

As a matter of fact, this particular problem is easy to fix by adding a couple 
of type signatures. I used to run into it frequently back when I was still 
working with 6.10. I'll send a patch.

Roman
 

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] Vector to Monadic Stream and back, how?

2010-04-14 Thread Roman Leshchinskiy
On 14/04/2010, at 09:05, Xiao-Yong Jin wrote:

 I want to use 'mapM' on Data.Vector.Vector, but it looks
 like the only 'mapM' defined is in
 Data.Vector.Fusion.Stream.Monadic.  I'm able to use 'stream'
 and 'liftStream' to convert a 'Vector' to a monadic stream,
 on which I can use 'mapM'.  But I couldn't find a way to
 convert the monadic stream back to Vector without using an
 intermediate list.  I don't think I understand the internal
 of monadic stream that much.  But it looks not so fusion to
 me.  Is it the only way back to Vector?

Unfortunately, it's not at all clear to me how to implement mapM on 
vectors/arrays without going through an intermediate list for arbitrary monads 
(it's easy for ST and IO). The next version of vector will include mapM and 
friends but it will probably be implemented much like you describe, with 
appropriate specialisations for ST and IO.
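In the meantime, the via-a-list fallback is easy to write yourself; a sketch for boxed 
vectors (toList/fromList as in the vector API):

  import qualified Data.Vector as V

  mapMV :: Monad m => (a -> m b) -> V.Vector a -> m (V.Vector b)
  mapMV f v = do
    ys <- mapM f (V.toList v)
    return (V.fromList ys)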

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Strange error with type classes + associated types

2010-04-14 Thread Roman Leshchinskiy

On 15/04/2010, at 00:30, Brent Yorgey wrote:

 On Wed, Apr 14, 2010 at 09:51:52AM +0100, Stephen Tetley wrote:
 On 14 April 2010 03:48, Brent Yorgey byor...@seas.upenn.edu wrote:
 
 Can someone more well-versed in the intricacies of type checking with
 associated types explain this?  Or is this a bug in GHC?
 
 If you take the definition of append out out the class - GHCi will
 give it a type:
 
 append (Affine a2 b2) (Affine a1 b1) = Affine (a2 *.* a1) (lapply a2 b1 ^+^ 
 b2)
 
 *VectorSpace> :t append
 append
   :: (Scalar v ~ Scalar v1,
       Basis v ~ Basis u,
       Basis v1 ~ Basis v,
       VectorSpace v1,
       HasTrie (Basis v),
       HasBasis v,
       HasBasis u) =>
      Affine v1 -> Affine v -> Affine v1
 
 Right, this seems weird to me.  Why is there still a 'u' mentioned in
 the constraints?  Actually, I don't even see why there ought to be
 both v and v1.  The type of (*.*) mentions three type variables, u, v, and w:
 
 (*.*) :: (HasBasis u, HasTrie (Basis u),
           HasBasis v, HasTrie (Basis v),
           VectorSpace w,
           Scalar v ~ Scalar w)
       => (v :-* w) -> (u :-* v) -> u :-* w

Note that (:-*) is a type synonym:

type u :-* v = MSum (Basis u :->: v)

Substituting this into the type of (*.*), we get:

(*.*) :: ... => MSum (Basis v :->: w) -> MSum (Basis u :->: v)
             -> MSum (Basis u :->: w)

Now, Basis is an associated type:

class VectorSpace v => HasBasis v where
  type Basis v
  ...

This means that there is no way to obtain u from Basis u. Since u only ever 
occurs as an argument to Basis, a type family, it can never be unified with 
anything. This, in turn, means that there is no way to call (*.*) at all 
(unless I'm severely mistaken).
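A toy illustration of the problem (all names invented): two instances can map to the same 
Basis, so a type that mentions u only under the type family can never determine u.

  {-# LANGUAGE TypeFamilies #-}

  class HasBasis' v where
    type Basis' v

  instance HasBasis' Double where
    type Basis' Double = ()

  instance HasBasis' Float where
    type Basis' Float = ()

  -- A function of type 'Basis' u -> Int' could never be applied in a way that
  -- pins down u: Basis' Double and Basis' Float are both (), so u stays ambiguous.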

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] what are the safety conditions for unsafeIOToST

2010-04-07 Thread Roman Leshchinskiy
On 08/04/2010, at 01:38, Henning Thielemann wrote:

 On Apr 6, 2010, at 5:30 PM, Roman Leshchinskiy wrote:
  
 In fact, the only safe-ish use for it I have found is to use 
 Storable-related functions in ST, hoping that the instances don't actually 
 use any real IO functionality. Arguably, this shouldn't be necessary as 
 Storable should live in ST anyway.
 

 But Storable in ST monad would be still dangerous, because pointers may point 
 to non-allocated memory or point outside of an array.

I don't think that's the kind of safety the original poster had in mind. You 
can have invalid memory accesses even in pure code but that's ok since we know 
what the semantics is: bottom. I understood the question to be about the 
conditions under which unsafeIOToST can violate referential transparency.

Roman
 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] what are the safety conditions for unsafeIOToST

2010-04-06 Thread Roman Leshchinskiy
On 07/04/2010, at 07:33, Nicolas Frisby wrote:

 I haven't been able to find it via Google or Haddock. An old message
 suggests is was just a matter of exceptions?

I don't think that's correct. You can implement unsafePerformIO in terms 
unsafeIOToST:

unsafePerformIO :: IO a -> a
unsafePerformIO p = runST (unsafeIOToST p)

In fact, the only safe-ish use for it I have found is to use Storable-related 
functions in ST, hoping that the instances don't actually use any real IO 
functionality. Arguably, this shouldn't be necessary as Storable should live in 
ST anyway.

 I only want to use the IO for generating Data.Uniques to pair with
 STRefs in order to make a map of them. I'm guessing this would be a
 safe use since it's exception free (... right?).

It's hard to tell without looking at your code. But if you are generating 
Uniques in ST then it's probably unsafe:

foo :: () -> Unique
foo _ = runST (unsafeIOToST newUnique)

What's the value of foo ()?

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: replicateM over vectors

2010-04-04 Thread Roman Leshchinskiy
On 04/04/2010, at 05:33, Chad Scherrer wrote:

 Roman Leshchinskiy rl at cse.unsw.edu.au writes:
 
 Ah. I missed that. Then your best bet is probably
 
  replicate n action = munstream v $ Fusion.Stream.Monadic.generateM n (const action)
                     $ new n
 
 It's uglier that it should be but vector simply doesn't define the right
 combinators for this at the moment.
 
 I'm having trouble getting this to typecheck. I'll reread your Recycle Your
 Arrays paper; maybe then it will make more sense.

Ugh. I shouldn't write emails while frantically scrambling to make a conference 
deadline. What I meant is this:

replicate n action = do { v <- new n; v' <- munstream v (generateM n (const action)) }

Sorry for the confusion.

 There are two things one would have to do. First, add a function to
 Generic.New which initialises a New from a
 Monadic.Stream and fusion rules for it. That's easy. The hard part is to
 generalise New to work with
 arbitrary monads: at the moment it is defined as:
 
 data New a = New (forall mv s. MVector mv a => ST s (mv s a))
 
 This is because its basic reason for existence is to be passed to Vector.new
 which then does a runST to
 produce an immutable vector. It is perhaps possible to make New more general
 but it's quite tricky. I'll
 think about it after the ICFP deadline 
 
 But the m I'm interested in happens to be ST. Sounds like it's still easy in
 principle, but not immediate. Is that right?

Not really. The big step is getting from the type above to a fixed s so that 
you can use it in a particular ST computation. It's not just a question of 
making the types work, either. I also have to convince myself that it is 
actually safe to do so (in particular, that the rewrite rules in the library 
can't break things). Getting from there to IO is very easy.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] replicateM over vectors

2010-04-01 Thread Roman Leshchinskiy
On 02/04/2010, at 12:16, Don Stewart wrote:

 Chad.Scherrer:
 Hi,
 
 I'd like to be able to do replicateM, but over a vector instead of a list. 
 Right now I'm doing this:

The operation you are looking for is called newWith. It probably should be 
called replicate.

 Roman? Can we generate frozen arrays for monadic generators, and still fuse in
 the current New/Mutable/MStream architecture?

For monadic stuff, fusion happens on things of type New. For instance, you 
could write this (I'm omitting the class contexts and Data.Vector prefixes):

replicate :: Int -> a -> New a
replicate n x = Generic.New.unstream (Fusion.Stream.replicate n x)

and then either

  Mutable.run (replicate n x)

to get a mutable vector or

  new (replicate n x)

to get an immutable one. You could also chain operations on New, including 
monadic ones:

  v <- Mutable.run $ Generic.New.transform (Fusion.Stream.Monadic.mapM f)
                   $ replicate n x

and this ought to fuse.

Note that the New stuff is quite rough and only really intended for internal 
use at the moment. I wanted to get the normal APIs working properly first.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] replicateM over vectors

2010-04-01 Thread Roman Leshchinskiy

On 02/04/2010, at 13:01, Don Stewart wrote:

 rl:
 replicate :: Int -> a -> New a
 replicate n x = Generic.New.unstream (Fusion.Stream.replicate n x)
 
 and then either
 
  Mutable.run (replicate n x)
 
 to get a mutable vector or
 
  new (replicate n x)
 
 
 Hmm, but here 'a' is pure. I don't think he wants
 
    newWith :: (PrimMonad m, MVector v a) => Int -> a -> m (v (PrimState m) a)
 
 but more:
 
    newWithM :: (PrimMonad m, MVector v a) => Int -> m a -> m (v (PrimState m) a)

Ah. I missed that. Then your best bet is probably

replicate n action = munstream v $ Fusion.Stream.Monadic.generateM n (const action)
                   $ new n

It's uglier that it should be but vector simply doesn't define the right 
combinators for this at the moment.

 to get an immutable one. You could also chain operations on New, including 
 monadic ones:
 
  v <- Mutable.run $ Generic.New.transform (Fusion.Stream.Monadic.mapM f)
                   $ replicate n x
 
 
 Oh, that's interesting. But what if we want to fill directly with the monadic 
 action?
 We wouldn't
 
mapM (const a) $ replicate n undefined 
 
 So how do we best do a fusible, e.g.:
 
    replicateM :: G.Vector v a => Int -> IO a -> IO (v a)

There are two things one would have to do. First, add a function to Generic.New 
which initialises a New from a Monadic.Stream and fusion rules for it. That's 
easy. The hard part is to generalise New to work with arbitrary monads: at the 
moment it is defined as:

data New a = New (forall mv s. MVector mv a => ST s (mv s a))

This is because its basic reason for existence is to be passed to Vector.new 
which then does a runST to produce an immutable vector. It is perhaps possible 
to make New more general but it's quite tricky. I'll think about it after the 
ICFP deadline :-)

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Data Structures GSoC

2010-03-31 Thread Roman Leshchinskiy
On 31/03/2010, at 18:14, Achim Schneider wrote:

 We have a lot of useful interfaces (e.g. ListLike, Edison), but they
 don't seem to enjoy wide-spread popularity.

Perhaps that's an indication that we need different interfaces? IMO, huge 
classes which generalise every useful function we can think of just isn't the 
right approach. We need small interfaces between containers and algorithms. In 
fact, the situation is perhaps somewhat similar to C++ where by providing 
exactly that the STL has been able to replace OO-style collection libraries 
which never really worked all that well.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Shootout update

2010-03-31 Thread Roman Leshchinskiy
I'm wondering... Since the DPH libraries are shipped with GHC by default are we 
allowed to use them for the shootout?

Roman

On 30/03/2010, at 19:25, Simon Marlow wrote:

 The shootout (sorry, Computer Language Benchmarks Game) recently updated to 
 GHC 6.12.1, and many of the results got worse.  Isaac Gouy has added the +RTS 
 -qg flag to partially fix it, but that turns off the parallel GC completely 
 and we know that in most cases better results can be had by leaving it on.  
 We really need to tune the flags for these benchmarks properly.
 
 http://shootout.alioth.debian.org/u64q/haskell.php
 
 It may be that we have to back off to +RTS -N3 in some cases to avoid the 
 last-core problem (http://hackage.haskell.org/trac/ghc/ticket/3553), at least 
 until 6.12.2.
 
 Any volunteers with a quad-core to take a look at these programs and optimise 
 them for 6.12.1?
 
 Cheers,
   Simon
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] GHC vs GCC vs JHC

2010-03-29 Thread Roman Leshchinskiy
On 29/03/2010, at 02:27, Lennart Augustsson wrote:

 Does anything change if you swap the first two rhss?

No, not as far as I can tell.

 
 On Sun, Mar 28, 2010 at 1:28 AM, Roman Leshchinskiy r...@cse.unsw.edu.au 
 wrote:
 On 28/03/2010, at 09:47, Lennart Augustsson wrote:
 
 It's important to switch from mod to rem.  This can be done by a
 simple abstract interpretation.
 
 Also, changing the definition of rem from
 
 a `rem` b
  | b == 0                     = divZeroError
  | a == minBound && b == (-1) = overflowError
  | otherwise                  = a `remInt` b
 
 to
 
 a `rem` b
  | b == 0                     = divZeroError
  | b == (-1) && a == minBound = overflowError
  | otherwise                  = a `remInt` b
 
 speeds up the GHC version by about 20%. Figuring out why is left as an 
 exercise to the reader :-)
 
 Roman
 
 
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe
 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] GHC vs GCC

2010-03-27 Thread Roman Leshchinskiy
On 28/03/2010, at 01:36, Jan-Willem Maessen wrote:

 It's worth pointing out that there's a bit of bang-pattern mysticism going on 
 in this conversation (which has not been uncommon of late!).  A non-buggy 
 strictness analyzer should expose the strictness of these functions without 
 difficulty.

Actually, rangeJ is lazy in i and rangeK is lazy in i and j. GHC does unbox 
everything important here but that needs more optimisations than just 
strictness analysis. You are right, though, that GHC doesn't need bang patterns 
here.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] GHC vs GCC vs JHC

2010-03-27 Thread Roman Leshchinskiy
On 27/03/2010, at 05:27, John Meacham wrote:

 Here are jhc's timings for the same programs on my machine. gcc and ghc
 both used -O3 and jhc had its full standard optimizations turned on.
 
 jhc:
 ./hs.out  5.12s user 0.07s system 96% cpu 5.380 total
 
 gcc:
 ./a.out  5.58s user 0.00s system 97% cpu 5.710 total
 
 ghc:
 ./try  31.11s user 0.00s system 96% cpu 32.200 total

I really don't understand these GHC numbers. I get about 3s for the C version, 
about 5s for GHC with rem and about 7.5s for GHC with mod. Is this perhaps on a 
64-bit system? What is sizeof(int) in C and sizeOf (undefined :: Int) in 
Haskell?

That said, I suspect the only thing this benchmark really measures is how fast 
the various compilers can compute i * i + j * j + k * k `mod` 7.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] GHC vs GCC vs JHC

2010-03-27 Thread Roman Leshchinskiy
On 28/03/2010, at 11:07, John Meacham wrote:

 I have not thoroughly checked it, but I think there are a couple things
 going on here:

It could also be worthwhile to float out (i*i + j*j) in rangeK instead of 
computing it in every loop iteration. Neither ghc nor gcc can do this; if jhc 
can then that might explain the performance difference (although I would expect 
it to be larger in this case).
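The benchmark's rangeK isn't quoted here, so the shape below is a guess (bounds invented), 
purely to illustrate the hoisting; the `mod` 7 test is the one mentioned elsewhere in 
this thread.

  -- Hypothetical shape of the inner loop, before hoisting:
  rangeK :: Int -> Int -> Int -> Int -> Int
  rangeK i j k acc
    | k == 1000                      = acc
    | (i*i + j*j + k*k) `rem` 7 == 0 = rangeK i j (k+1) (acc+1)
    | otherwise                      = rangeK i j (k+1) acc

  -- With the loop-invariant i*i + j*j floated out of the k loop:
  rangeK' :: Int -> Int -> Int -> Int -> Int
  rangeK' i j = go (i*i + j*j)
    where
      go ij k acc
        | k == 1000               = acc
        | (ij + k*k) `rem` 7 == 0 = go ij (k+1) (acc+1)
        | otherwise               = go ij (k+1) acc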

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] GHC vs GCC vs JHC

2010-03-27 Thread Roman Leshchinskiy
On 28/03/2010, at 09:47, Lennart Augustsson wrote:

 It's important to switch from mod to rem.  This can be done by a
 simple abstract interpretation.

Also, changing the definition of rem from

a `rem` b
 | b == 0                     = divZeroError
 | a == minBound && b == (-1) = overflowError
 | otherwise                  = a `remInt` b

to

a `rem` b
 | b == 0                     = divZeroError
 | b == (-1) && a == minBound = overflowError
 | otherwise                  = a `remInt` b

speeds up the GHC version by about 20%. Figuring out why is left as an exercise 
to the reader :-)

Roman 


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Performance question

2010-03-18 Thread Roman Leshchinskiy
On 19/03/2010, at 08:48, Daniel Fischer wrote:

 Am Donnerstag 18 März 2010 21:57:34 schrieb Daniel Fischer:
 
 Contrary to my expectations, however, using unboxed arrays is slower
 than straight arrays (in my tests).
 
 
 However, a few {-# SPECIALISE #-} pragmas set the record straight.

This is because without specialising, unsafeAt is a straight (inlineable) 
function call for boxed arrays but is overloaded and hence much slower for 
unboxed ones. In general, unboxed arrays tend to be slower in generic code. The 
only real solution is making functions such as binarySearch INLINE.
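For illustration, the kind of generic function being discussed might look like the sketch 
below (the shape is assumed, not taken from the thread); without the INLINE, or a 
SPECIALISE per element type, unsafeAt remains an overloaded call when the array is a 
UArray.

  import Data.Array.Base (IArray, unsafeAt)

  -- index of the first element >= key in the sorted index range [lo, hi)
  binarySearch :: (IArray a e, Ord e) => a Int e -> e -> Int -> Int -> Int
  binarySearch arr key = go
    where
      go lo hi
        | lo >= hi  = lo
        | otherwise =
            let mid = (lo + hi) `div` 2
            in if unsafeAt arr mid < key
                 then go (mid + 1) hi
                 else go lo mid
  {-# INLINE binarySearch #-}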

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Feedback request: priority queues in containers

2010-03-17 Thread Roman Leshchinskiy
On 17/03/2010, at 03:16, Louis Wasserman wrote:

 I'm not willing to do this sort of typeclass wrapper thing, primarily because 
 nothing else in containers does -- even though we might have a Mapping type 
 class that handles both IntMap and Map, we don't.
 
 I'm inclined to let that design choice stand, as far as containers is 
 concerned.  It would make perfect sense to write a new package with such a 
 type class and offering instances for the containers priority queue 
 implementations, but I prefer to stick with the style that containers already 
 seems to use -- that is, exporting separate modules without a unifying type 
 class, but with nearly-identical method signatures.

FWIW, vector does both. It defines most vector operations generically and then 
exports appropriate specialisations for each concrete vector type. I think this 
is the most flexible and convenient approach. I just wish Haskell had some kind 
of support for it.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] vector stream fusion, inlining and compilation time

2010-03-07 Thread Roman Leshchinskiy
On 06/03/2010, at 03:10, stefan kersten wrote:

 i'm still curious, though, why my three versions of direct convolution perform
 so differently (see attached file). in particular, i somehow expected conv_3 
 to
 be the slowest and conv_2 to perform similar to conv_1. any ideas? i haven't 
 had
 a look at the core yet, mainly because i'm lacking the expertise ...

Hmm, one problem is that the current definition of reverse is suboptimal to say 
the least. I'll fix that.

Could you perhaps send me your complete benchmark?

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: [Haskell] Recursive definition of fibonacci with Data.Vector

2010-03-07 Thread Roman Leshchinskiy
On 08/03/2010, at 12:17, Alexander Solla wrote:

 GHC even optimizes it to:
 
fib = fib
 
 Sounds like an implementation bug, not an infinite dimensional vector space 
 bug.  My guess is that strictness is getting in the way, and forcing what 
 would be a lazy call to fib in the corresponding list code -- fib = 0 : 1 : 
 (zipWith (+) fib (tail fib)) -- into a strict one.
 
 In fact, I'm pretty sure that's what the problem is:
 
 data Vector a = Vector {-# UNPACK #-} !Int
                        {-# UNPACK #-} !Int
                        {-# UNPACK #-} !(Array a)

The problem is that you have to allocate an Array of a specific length when 
creating a Vector. Arrays are finite by definition. It's not a bug, it's a 
feature.

Note that in the context of package vector, vector means a 1-dimensional, 
0-indexed array. This is not unusual - see, for instance, the standard C++ 
library.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector stream fusion, inlining and compilation time

2010-03-04 Thread Roman Leshchinskiy
On 05/03/2010, at 04:34, stefan kersten wrote:

 i've been hunting down some performance problems in DSP code using vector and
 the single most important transformation seems to be throwing in INLINE 
 pragmas
 for any function that uses vector combinators and is to be called from
 higher-level code. failing to do so seems to prevent vector operations from
 being fused and results in big performance hits (the good news is that the
 optimized code is quite competitive). does anybody have some more info about 
 the
 do's and don'ts when programming with vector?

This is a general problem when working with RULES-based optimisations. Here is 
an example of what happens: suppose we have

foo :: Vector Int -> Vector Int
foo xs = map (+1) xs

Now, GHC will generate a nice tight loop for this but if in a different module, 
we have something like this:

bar xs = foo (foo xs)

then this won't fuse because (a) foo won't be inlined and (b) even if GHC did 
inline here, it would inline the nice tight loop which can't possibly fuse 
instead of the original map which can. By slapping an INLINE pragma on foo, 
you're telling GHC to (almost) always inline the function and to use the 
original definition for inlining, thus giving it a chance to fuse.
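Concretely, for the example above that just means:

  foo :: Vector Int -> Vector Int
  foo xs = map (+1) xs
  {-# INLINE foo #-}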

GHC could be a bit cleverer here (perhaps by noticing that the original 
definition is small enough to inline and keeping it) but in general, you have 
to add INLINE pragmas in such cases if you want to be sure your code fuses. A 
general-purpose mechanism for handling situations like this automatically would 
be great but we haven't found a good one so far.

 the downside after adding the INLINE pragmas is that now some of my modules 
 take
 _really_ long to compile (up to a couple of minutes); any ideas where i can
 start looking to bring the compilation times down again?

Alas, stream fusion (and fusion in general, I guess) requires what I would call 
whole loop compilation - you need to inline everything into loops. That tends 
to be slow. I don't know what your code looks like but you could try to control 
inlining a bit more. For instance, if you have something like this:

foo ... = ... map f xs ...
  where
    f x = ...

you could tell GHC not to inline f until fairly late in the game by adding

  {-# INLINE [0] f #-}

to the where clause. This helps sometimes.

 i'm compiling with -O2 -funbox-strict-fields instead of -Odph (with ghc 6.10.4
 on OSX 10.4), because it's faster for some of my code, but -O2 vs. -Odph 
 doesn't
 make a noticable difference in compilation time.

If you're *really* interested in performance, I would suggest using GHC head. 
It really is much better for this kind of code (although not necessarily faster 
wrt to compilation times).

This is what -Odph does:

-- -Odph is equivalent to
--
--    -O2                              optimise as much as possible
--    -fno-method-sharing              sharing specialisation defeats fusion
--                                     sometimes
--    -fdicts-cheap                    always inline dictionaries
--    -fmax-simplifier-iterations20    this is necessary sometimes
--    -fsimplifier-phases=3            we use an additional simplifier phase
--                                     for fusion
--    -fno-spec-constr-threshold       run SpecConstr even for big loops
--    -fno-spec-constr-count           SpecConstr as much as possible

I'm surprised -Odph doesn't produce faster code than -O2. In any case, you 
could try turning these flags on individually (esp. -fno-method-sharing and the 
spec-constr flags) to see how they affect performance and compilation times.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Removing/deprecating -fvia-c

2010-02-17 Thread Roman Leshchinskiy
On 17/02/2010, at 18:37, Isaac Dupree wrote:

 LLVM and GCC are open-source projects that are improving over time... is 
 there any particular reason we expect GCC to have poor numeric performance 
 forever?

Past experience :-) GCC has been around for a while and if it doesn't optimise 
numeric code well by now, there is no reason to believe that it ever will.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


[Haskell] ANN: vector 0.5

2010-02-15 Thread Roman Leshchinskiy
Hi everyone,

I am pleased to announce the release of version 0.5 of package vector, a 
high-performance Haskell array library with a powerful loop fusion framework. 
The main highlights compared to previous versions are:

  * DPH-style unboxed vectors (in Data.Vector.Unboxed) which use associated types
    to select the appropriate unboxed representation depending on the type of
    the elements.

  * A redesigned interface between mutable and immutable vectors. In particular,
    the popular unsafeFreeze primitive is now supported for all vector types.

  * Many new operations on both immutable and mutable vectors.

  * Significant performance improvements.

The library comes with a fairly complete testsuite (mainly thanks to Max 
Bolingbroke) and  is quite stable by now. Barring various disasters, I expect 
to release version 1.0 in the next 3 to 4 months.

The release is accompanied by a new version of the NoSlow array benchmark 
suite. A few quite meaningless preliminary benchmarks are available at:

  http://unlines.wordpress.com/2010/02/15/vector-0-5-is-here-2

Both vector and NoSlow are on Hackage. Hackers can get the darcs repos from

  http://code.haskell.org/vector

and

  http://code.haskell.org/NoSlow

Enjoy!

Roman___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: [Haskell-cafe] vector to uvector and back again

2010-02-12 Thread Roman Leshchinskiy

On 12/02/2010, at 23:28, Dan Doel wrote:

 On Thursday 11 February 2010 8:54:15 pm Dan Doel wrote:
 On Thursday 11 February 2010 12:43:10 pm stefan kersten wrote:
 On 10.02.10 19:03, Bryan O'Sullivan wrote:
 I'm thinking of switching the statistics library over to using vector.
 
 that would be even better of course! an O(0) solution, at least for me ;)
 let me know if i can be of any help (e.g. in testing). i suppose
 uvector-algorithms would also need to be ported to vector, then.
 
 I could do this.
 
 To this end, I've done a preliminary port of the library, such that all the 
 modules compile. I've just used safe operations so far, so it's probably a 
 significant decrease in performance over the 0.2 uvector-algorithms (unless 
 perhaps you turn off the bounds checking flag), but it's a start. It can be 
 gotten with:
 
  darcs get http://code.haskell.org/~dolio/vector-algorithms

That's great, thanks! FWIW, vector has two kinds of bounds checks: real ones 
which catch invalid indices supplied by the user (on by default) and internal 
ones which catch bugs in the library (off by default since the library is, of 
course, bug-free ;-). I guess you'd eventually want to use the latter but not 
the former; that's exactly what unsafe operations provide.

 I only encountered a couple snags during the porting so far:
 
  * swap isn't exported from D.V.Generic.Mutable, so I'm using my own.

Ah, I'll export it. Also, I gladly accept patches :-)

  * I use a copy with an offset into the from and to arrays, and with a
length (this is necessary for merge sort). However, I only saw a whole
array copy (and only with identical sizes) in vector (so I wrote my own
again).

That's actually a conscious decision. Since vectors support O(1) slicing, you 
can simply copy a slice of the source vector into a slice of the target vector.
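In other words, something along these lines should do an offset copy (a sketch against 
the generic mutable interface; exact signatures may differ between vector versions):

  import Control.Monad.Primitive (PrimMonad, PrimState)
  import qualified Data.Vector.Generic.Mutable as M

  -- copy 'len' elements starting at 'soff' in 'src' to offset 'doff' in 'dst'
  copySlice :: (PrimMonad m, M.MVector v a)
            => v (PrimState m) a -> Int    -- destination and its offset
            -> v (PrimState m) a -> Int    -- source and its offset
            -> Int                         -- number of elements
            -> m ()
  copySlice dst doff src soff len =
    M.copy (M.slice doff len dst) (M.slice soff len src)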

  * Some kind of thawing of immutable vectors into mutable vectors, or other
way to copy the former into the latter would be useful. Right now I'm
using unstream . stream, but I'm not sure that's the best way to do it.

At the moment, it is (although it ought to be wrapped in a nicer interface). 
Something like memcpy doesn't work for Data.Vector.Unboxed because the 
ByteArrays aren't pinned. I don't really want to provide thawing until someone 
convinces me that it is actually useful.

BTW, vector also supports array recycling so you could implement true in-place 
sorting for fused pipelines. Something like

  map (+1) . sort . update xs

wouldn't allocate any temporary arrays in that case.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector to uvector and back again

2010-02-11 Thread Roman Leshchinskiy
On 12/02/2010, at 12:39, Don Stewart wrote:

 bos:
 I'm thinking of switching the statistics library over to using vector. 
 uvector
 is pretty bit-rotted in comparison to vector at this point, and it's really
 seeing no development, while vector is The Shiny Future. Roman, would you 
 call
 the vector library good enough to use in production at the moment?
 
 uvector's not seeing much development, but at least in the last round of
 benchmarks it was still consistently faster -- since it's been
 micro-optimized.

FWIW, the development version of vector is usually faster than both uvector and 
dph-prim-seq, at least for the development version of NoSlow.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector to uvector and back again

2010-02-11 Thread Roman Leshchinskiy
On 12/02/2010, at 12:40, Don Stewart wrote:

 rl:
 On 11/02/2010, at 05:03, Bryan O'Sullivan wrote:
 
 I'm thinking of switching the statistics library over to using vector. 
 uvector is pretty bit-rotted in comparison to vector at this point, and 
 it's really seeing no development, while vector is The Shiny Future. Roman, 
 would you call the vector library good enough to use in production at the 
 moment?
 
 Yes, with the caveat that I haven't really used it in production code
 (I have tested and benchmarked it, though). BTW, I'll release version
 0.5 as soon as get a code.haskell.org account and move the repo there.
 
 
 That's the main problem. I think we could move to vector as a whole, if
 the suite of testing/ performance/documentation stuff from uvector was ported.

Hmm, I'm not sure what you mean here. Mostly thanks to Max Bolingbroke's 
efforts, vector has a fairly extensive testsuite. I benchmark it a lot (with 
NoSlow) and haven't found any significant performance problems in a while. As 
to documentation, there are comments for most of the functions :-)

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector to uvector and back again

2010-02-11 Thread Roman Leshchinskiy
On 12/02/2010, at 12:54, Dan Doel wrote:

 I also notice that vector seems to have discarded the idea of
 
  Vec (A * B) = Vec A * Vec B

Oh no, it hasn't. In contrast to uvector/DPH, which use a custom strict tuple 
type for  rather outdated reasons, vector uses normal tuples. For instance, 
Data.Vector.Unboxed.Vector (a,b,c) is internally represented as a triple of 
unboxed vectors of a, b and c. In general, vector supports 4 kinds of arrays at 
the moment:

Data.Vector.Primitive wrappers around ByteArray#, can store primitive types
Data.Vector.Unboxed   uses type families, can store everything 
D.V.Primitive can
  plus tuples and can be extended for user-defined types
Data.Vector.Storable  wrappers around ForeignPtr, can store Storable things
Data.Vector   boxed arrays
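
To illustrate the Unboxed case (a small sketch against the current
Data.Vector.Unboxed interface):

import qualified Data.Vector.Unboxed as U

main :: IO ()
main = do
  let ps       = U.fromList [(1, 1.0), (2, 2.0), (3, 3.0)] :: U.Vector (Int, Double)
      (is, ds) = U.unzip ps   -- O(1): just takes the representation apart
  print (U.sum is, U.sum ds)

Because ps is stored as an unboxed vector of Ints alongside an unboxed vector
of Doubles, unzip doesn't have to copy anything.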

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector to uvector and back again

2010-02-11 Thread Roman Leshchinskiy
On 12/02/2010, at 13:49, Don Stewart wrote:

 rl:
 On 12/02/2010, at 12:39, Don Stewart wrote:
 
 bos:
 I'm thinking of switching the statistics library over to using vector. 
 uvector
 is pretty bit-rotted in comparison to vector at this point, and it's really
 seeing no development, while vector is The Shiny Future. Roman, would you 
 call
 the vector library good enough to use in production at the moment?
 
 uvector's not seeing much development, but at least in the last round of
 benchmarks it was still consistently faster -- since it's been
 micro-optimized.
 
 FWIW, the development version of vector is usually faster than both
 uvector and dph-prim-seq, at least for the development version of
 NoSlow.
 
 Ah ha -- that's useful. Public benchmarks soon? In time for the Zurich
 Hackathon?? (March 20)

I've been trying to find the time to put the benchmarks on my blog since the 
beginning of January but, alas, unsuccessfully so far. In any case, vector and 
NoSlow currently live in

  http://www.cse.unsw.edu.au/~rl/code/darcs/vector
  http://www.cse.unsw.edu.au/~rl/code/darcs/NoSlow

 If Roman declares the vector to be faster -- my main concern here for
 flat uarrays -- and makes the repo available so we can work on it, I'd
 be willing to merge uvector's tests and docs and extra array operations
 in.

It is generally faster than dph-prim-seq. Benchmarking against uvector is a bit 
difficult because it's missing operations necessary for implementing most of 
the algorithms in NoSlow (in particular, bulk updates). For the ones that 
uvector supports, vector tends to be faster.

BTW, this is for unsafe operations which don't use bounds checking. Bounds 
checking can make things a little slower but often doesn't cost anything as 
long as only collective operations are used. Sometimes it makes things faster 
which means that the simplifier still gets confused in some situations. There 
are also some significant differences between 6.12 and the HEAD (the HEAD is 
much more predictable).

In general, I find it hard to believe that the performance differences I'm 
seeing really matter all that much in real-world programs.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] vector to uvector and back again

2010-02-10 Thread Roman Leshchinskiy
On 11/02/2010, at 05:03, Bryan O'Sullivan wrote:

 I'm thinking of switching the statistics library over to using vector. 
 uvector is pretty bit-rotted in comparison to vector at this point, and it's 
 really seeing no development, while vector is The Shiny Future. Roman, would 
 you call the vector library good enough to use in production at the moment?

Yes, with the caveat that I haven't really used it in production code (I have 
tested and benchmarked it, though). BTW, I'll release version 0.5 as soon as I 
get a code.haskell.org account and move the repo there.

Roman

 
 
 On Wed, Feb 10, 2010 at 9:59 AM, stefan kersten s...@k-hornz.de wrote:
 hi,
 
 i've been using the vector [1] library for implementing some signal processing
 algorithms, but now i'd like to use the statistics [2] package on my data, 
 which
 is based on the uvector [3] library. is there a (straightforward) way of
 converting between vectors and uvectors, preferrably O(1)?
 
 thanks,
 sk
 
 [1] http://hackage.haskell.org/package/vector
 [2] http://hackage.haskell.org/package/statistics
 [3] http://hackage.haskell.org/package/uvector
 ___
 Haskell-Cafe mailing list
 Haskell-Cafe@haskell.org
 http://www.haskell.org/mailman/listinfo/haskell-cafe
 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Restrictions on associated types for classes

2009-12-17 Thread Roman Leshchinskiy
On 18/12/2009, at 00:37, Stephen Lavelle wrote:

 Given
 
 class MyClass k where
  type AssociatedType k :: *
 
 Is there a way of requiring AssociatedType be of class Eq, say?

This works with -XFlexibleContexts:

class Eq (AssociatedType k) => MyClass k where
  type AssociatedType k :: *
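
An instance then just picks a type with an Eq instance for the associated type,
e.g. (a quick sketch):

instance MyClass Int where
  type AssociatedType Int = Bool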


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Boxed Mutable Arrays

2009-12-14 Thread Roman Leshchinskiy
On 15/12/2009, at 06:53, Brad Larsen wrote:

 On another note, does this (or perhaps better phrased, will this) bug
 also affect Data Parallel Haskell?

Luckily, no. DPH represents arrays of user-defined types by unboxed arrays 
(that's essentially what the vectoriser does). It does use boxed arrays in a 
couple of places internally but they are small.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Re: [Haskell] ANN: NoSlow - Microbenchmarks for array libraries

2009-11-27 Thread Roman Leshchinskiy
On 28/11/2009, at 07:45, Henning Thielemann wrote:

 Is there also a darcs repository?

Yes, http://www.cse.unsw.edu.au/~rl/code/darcs/NoSlow.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell] ANN: NoSlow - Microbenchmarks for array libraries

2009-11-26 Thread Roman Leshchinskiy
I'm pleased to announce NoSlow, a nascent benchmark suite for various array 
(and list) libraries with particular emphasis on finding out how well they fuse 
things. At the moment, it knows about

  * standard lists
  * primitive DPH arrays (dph-prim-seq)
  * uvector
  * vector
  * storablevector

It compiles and runs (using Brian's criterion) a fairly random collection of 
very small loop kernels for each of those and produces a lot of data which it 
then uses to generate ugly HTML tables. In the future, it will have more 
benchmarks, more complex benchmarks and much prettier tables.

You can get more information (including the ugly tables) from my blog

  http://unlines.wordpress.com/2009/11/27/noslow

and NoSlow itself from Hackage

  http://hackage.haskell.org/package/NoSlow

Any help would be highly appreciated as I don't have a lot of time to work on 
it.

Roman


___
Haskell mailing list
Haskell@haskell.org
http://www.haskell.org/mailman/listinfo/haskell


Re: [Haskell-cafe] Status of TypeDirectedNameResolution proposal?

2009-11-18 Thread Roman Leshchinskiy
On 18/11/2009, at 21:10, Simon Peyton-Jones wrote:

 Yes I think it can, although you are right to point out that I said nothing 
 about type inference.  One minor thing is that you've misunderstood the 
 proposal a bit.  It ONLY springs into action when there's a dot.  So you'd 
 have to write
   bar1 x = x.foo
   bar2 x = x.foo

Yes, that's what I meant to write, silly me. I promise to pay more attention 
next time.

 OK so now it works rather like type functions.  Suppose, the types with which 
 foo was in scope were
   foo :: Int -> Int
   foo :: Bool -> Char
 
 Now imagine that we had a weird kind of type function
 
   type instance TDNR_foo Int = Int -> Int
   type instance TDNR_foo Bool = Bool -> Char
 
 Each 'foo' gives a type instance for TDNR_foo, mapping the type of the first 
 argument to the type of that foo.

Hmm... GHC doesn't allow this:

type instance TDNR_foo () = forall a. () -> a -> a

IIUC this restriction is necessary to guarantee termination. Given your 
analogy, wouldn't this proposal run into similar problems?

 | Another example: suppose we have
 | 
 | data T a where
 |   TInt  :: T Int
 |   TBool :: T Bool
 | 
 | foo :: T Int -> u
 | foo :: T Bool -> u
 | 
 | bar :: T a -> u
 | bar x = case x of
 |   TInt  -> foo x
 |   TBool -> foo x
 | 
 | Here, (foo x) calls different functions in the two alternatives, right? To 
 be
 | honest, that's not something I'd like to see in Haskell.
 
 You mean x.foo and x.foo, right?  Then yes, certainly. 
 
 Of course that's already true of type classes:
 
   data T a where
 T1 :: Show a => T a
 T2 :: Show a => T a
 
   bar :: a -> T a -> String
   bar x y = case y of
   T1 -> show x
   T2 -> show x
 
 Then I get different show's.

How so? Surely you'll get the same Show instance in both cases unless you have 
conflicting instances in your program?

Roman
 

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Status of TypeDirectedNameResolution proposal?

2009-11-17 Thread Roman Leshchinskiy
Simon, have you given any thought to how this interacts with type system 
extensions, in particular with GADTs and type families? The proposal relies on 
being able to find the type of a term but it's not entirely clear to me what 
that means. Here is an example:

foo :: F Int -> Int
foo :: Int -> Int

bar1 :: Int -> Int
bar1 = foo

bar2 :: Int ~ F Int => Int -> Int
bar2 = foo

IIUC, bar1 is ok but bar2 isn't. Do we really want to have such a strong 
dependency between name lookup and type inference? Can name lookup be specified 
properly without also having to specify the entire inference algorithm?

Another example: suppose we have

data T a where
  TInt  :: T Int
  TBool :: T Bool

foo :: T Int -> u
foo :: T Bool -> u

bar :: T a -> u
bar x = case x of
  TInt  -> foo x
  TBool -> foo x

Here, (foo x) calls different functions in the two alternatives, right? To be 
honest, that's not something I'd like to see in Haskell.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] poor perfomance of indexU in uvector package

2009-11-16 Thread Roman Leshchinskiy
On 16/11/2009, at 22:46, Alexey Khudyakov wrote:

 Problems begin when you need a non-contiguous block. The easiest way to do so
 is indexing.

FWIW, this operation is called backpermute and is probably exported as bpermute 
in uvector.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Inliner behaviour - tiny changes lead to huge performance differences

2009-11-13 Thread Roman Leshchinskiy

On 13/11/2009, at 18:04, Bryan O'Sullivan wrote:


main = do
  args <- getArgs
  forM_ args $ \a -> do
    s <- B.readFile a
let t = T.decodeUtf8 s
print (T.length t)

The streamUtf8 function looks roughly like this:

streamUtf8 :: OnDecodeError -> ByteString -> Stream Char
streamUtf8 onErr bs = Stream next 0 (maxSize l)
where
  l = B.length bs
  next i
      | i >= l = Done
  | U8.validate1 x1 = Yield (unsafeChr8 x1) (i+1)
  | {- etc. -}
{-# INLINE [0] streamUtf8 #-}

The values being Yielded from the inner function are, as you can  
see, themselves constructed by functions.


Originally, with the inner next function manually marked as INLINE,  
I found that functions like unsafeChr8 were not being inlined by  
GHC, and performance was terrible due to the amount of boxing and  
unboxing happening in the inner loop.


Let's see if I understand this correctly. In your code, decodeUtf8  
calls streamUtf8. They both get inlined into main but then unsafeChr8  
does not. Correct?


If so, are you sure that unsafeChr8 is really called in the simplified  
code? IIUC, this isn't necessary if you don't actually inspect the  
Chars (which length presumably doesn't). So perhaps GHC removes the  
call altogether? If not, what does it do with the result?


I somehow stumbled on the idea of removing the INLINE annotation  
from next, and performance suddenly improved by a significant  
integer multiple. This caused the body of streamUtf8 to be inlined  
into my test program, as I hoped.


Or are you saying that it's streamUtf8 that isn't getting inlined into  
main?



length :: Text -> Int
length t = Stream.length (Stream.stream t)
{-# INLINE length #-}

And the streaming length is:

length :: Stream Char -> Int
length = S.lengthI
{-# INLINE[1] length #-}

And the lengthI function is defined more generally, in the hope that  
I could use it for both Int and Int64 lengths:


lengthI :: Integral a => Stream Char -> a
lengthI (Stream next s0 _len) = loop_length 0 s0
where
  loop_length !z s  = case next s of
   Done       -> z
   Skip    s' -> loop_length z s'
   Yield _ s' -> loop_length (z + 1) s'
{-# INLINE[0] lengthI #-}

Unfortunately, although lengthI is inlined into the Int-typed  
streaming length function, that function is not in turn marked with  
__inline_me in simplifier output, so the length/decodeUtf8 loops do  
not fuse. The code is pretty fast, but there's still a lot of boxing  
and unboxing happening for all the Yields.


Does changing the definition of length to

length = id S.lengthI

help? GHC used to have a bug in this area but I haven't been bitten by  
it for quite some time.


Also, I wonder how Stream.stream is defined. Is it strict in Text? If  
it isn't, does making it strict help?


All of these flip-flops in inliner behaviour are very difficult to  
understand, and they seem to be exceedingly fragile. Should I expect  
the situation to be better with the new inliner in 6.12?


I suspect that the fragility you are seeing is just a symptom of a  
problem in how the UTF-8 library implements stream fusion. It's a bit  
tricky to get everything right. Generally, I've found the simplifier  
to be quite stable and predictable in the last year or so. Simon is  
working hard on making it even better. If you have a spare minute,  
perhaps you could try the HEAD with the new inliner and see if that  
helps? Although I somewhat doubt it, to be honest.


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Re[2]: [Haskell-cafe] What's the deal with Clean?

2009-11-04 Thread Roman Leshchinskiy

On 05/11/2009, at 04:01, Bulat Ziganshin wrote:


oh, can we stop saying about shootout? if you want to see speed of
pure haskell code, look at papers about fast arrays/strings - their
authors have measured that lazy lists are hundreds times slower than
idiomatic C code. is use of lazy lists counted as mistake too and
paper authors had too small haskell experience?


In the papers I coauthored, I don't think we measured any such thing.  
What we measured was that in algorithms that are best implemented with  
(unboxed) arrays, using boxed lists is going to cost you. That's not a  
very surprising conclusion and it's by no means specific to Haskell.  
The problem was/is the lack of nice purely declarative array libraries  
but that changing, albeit slowly. It's a question of using the right  
data structure for the algorithm, not a C vs. Haskell thing.


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What's the deal with Clean?

2009-11-03 Thread Roman Leshchinskiy

On 04/11/2009, at 13:23, Daniel Peebles wrote:


In the presence of fusion (as is the case in uvector), it's hard to
give meaningful time complexities for operations as they depend on
what operations they are paired with. We need to think of a better way
to express this behavior in the documentation though.


I have to disagree here. Fusion never makes the complexity of  
operations worse. If it does, it's a bug.


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What's the deal with Clean?

2009-11-03 Thread Roman Leshchinskiy

On 04/11/2009, at 13:12, brian wrote:


  indexU :: UA e => UArr e -> Int -> e

 O(n). indexU extracts an element out of an immutable unboxed array.


This is a typo (unless Don inserted a nop loop into the original DPH  
code).


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What's the deal with Clean?

2009-11-03 Thread Roman Leshchinskiy

On 04/11/2009, at 13:35, wren ng thornton wrote:


Roman Leshchinskiy wrote:

On 04/11/2009, at 13:23, Daniel Peebles wrote:

In the presence of fusion (as is the case in uvector), it's hard to
give meaningful time complexities for operations as they depend on
what operations they are paired with. We need to think of a better  
way

to express this behavior in the documentation though.
I have to disagree here. Fusion never makes the complexity of  
operations worse. If it does, it's a bug.


I think the point was more that the relevant complexity bound can  
change in the presence of fusion. For a poor example: the first map  
over a list is O(n) but all subsequent ones in a chain of maps are  
O(1) with fusion. I'm sure there are better examples than that, but  
you get the idea. Some people may care to know about that latter  
complexity rather than just the independent complexity.


I think asymptotic complexity is the wrong tool for what you're trying  
to do. You implement your algorithm using operations with known  
complexities. This allows you to compute the complexity of the entire  
algorithm. That's all you can use operation complexities for. The  
compiler is then free to optimise the algorithm as it sees fit but is  
supposed to preserve (or improve) its complexity. It is not guaranteed  
or even supposed to preserve the original operations. To stay with  
your example, each of the two maps is linear regardless of whether  
fusion happens. Executing the two maps, be it one after another or  
interlocked, is linear simply because O(n) + O(n) = O(n), not because  
of fusion.


Essentially, you're trying to use complexity to describe an  
optimisation which doesn't actually affect the complexity.


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] What's the deal with Clean?

2009-11-03 Thread Roman Leshchinskiy

On 04/11/2009, at 14:07, Gregory Crosswhite wrote:

Actually, it's not a typo.  If you look at the source, what you'll  
see is


indexU arr n = indexS (streamU arr) n


I suspect it gets rewritten back to the O(1) version somewhere after  
it has had a chance to fuse. If not, then it's a bug. In the vector  
package, I do this instead, though:


indexU arr n = O(1) implementation

{-# RULES

"indexU/unstreamU"  forall s n. indexU (unstreamU s) n = indexS s n

#-}

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Arrays in Clean and Haskell

2009-11-03 Thread Roman Leshchinskiy

On 04/11/2009, at 14:38, Philippos Apolinarius wrote:

And here comes the reason for writing this article. In the previous  
version of the Gauss elimination algorithm, I have imported  
Data.Array.IO. I also wrote a version of the program that imports  
Data.Array.ST. The problem is that I  don't know how to read an  
STUArray from a file, process it, and write it back to a file.


Why don't you use the IOUArray directly instead of converting it to  
STUArray and back?
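
A minimal sketch of using IOUArray directly (standard Data.Array.IO operations;
the actual file reading/writing is elided):

import Data.Array.IO

process :: IO ()
process = do
  arr <- newListArray (0, 9) [0 .. 9] :: IO (IOUArray Int Double)
  x   <- readArray arr 3
  writeArray arr 3 (x + 1)    -- update in place
  getElems arr >>= print      -- or write the elements back out to a file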


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Stream-fusion without the lists

2009-05-12 Thread Roman Leshchinskiy

On 12/05/2009, at 14:45, Reiner Pope wrote:


The Stream datatype seems to be much better suited to representing
loops than the list datatype is. So, instead of programming with the
lists, why don't we just use the Stream datatype directly?


I think the main reason is that streams don't store data and therefore  
don't support sharing. That is, in


let xs = map f ys in (sum xs, product xs)

the elements of xs will be computed once if it is a list but twice if  
it is a stream.


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Really bad code for single method dictionaries?

2009-03-27 Thread Roman Leshchinskiy

On 27/03/2009, at 18:32, Don Stewart wrote:


I don't think this is still  the case.

Roman, do you remember?


Hmm, not really. I recall that there was some sort of problem which I  
didn't have time to investigate then but it's been so long...


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Loop unrolling + fusion ?

2009-03-08 Thread Roman Leshchinskiy

On 07/03/2009, at 09:26, Claus Reinke wrote:


My preferred spec would be roughly

{-# NOINLINE f #-}
  as now

{-# INLINE f #-}
  works as now, which is for non-recursive f only (might in future
  be taken as go-ahead for analysis-based recursion unfolding)

{-# INLINE f PEEL n #-}
  inline calls *into* recursive f (called loop peeling for loops)

{-# INLINE f UNROLL m #-}
  inline recursive calls to f *inside* f (called loop unrolling for loops)

{-# INLINE f PEEL n UNROLL m #-}
  combine the previous two


The problem here is that this only works for directly recursive  
functions which I, for instance, don't normally use in high- 
performance code. Most of my loops are pipelines of collective  
combinators like map, filter, fold etc. because these are the ones  
that can be fused automatically. Unless I'm misunderstanding  
something, this approach doesn't handle such cases.
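
For instance, a typical inner loop in the kind of code I mean looks roughly
like this (a sketch using the vector package):

import qualified Data.Vector.Unboxed as U

sumSquares :: U.Vector Double -> Double
sumSquares = U.sum . U.map (\x -> x * x) . U.filter (> 0)

There is no directly recursive function here to attach a PEEL/UNROLL pragma to;
after fusion, the loop is generated from the combinators themselves.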


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] bytestring vs. uvector

2009-03-08 Thread Roman Leshchinskiy

On 09/03/2009, at 11:47, Claus Reinke wrote:


Btw, have any of the Haskell array optimization researchers
considered fixpoints yet?


This, for instance, is a very nice paper:

http://www.pllab.riec.tohoku.ac.jp/~ohori/research/OhoriSasanoPOPL07.pdf

However, in the context of high-performance array programming explicit  
recursion is bad because it is very hard if not impossible to  
parallelise automatically except in fairly trivial cases. And if your  
array program is not parallelisable then you don't really care about  
performance all that much :-)


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Loop unrolling + fusion ?

2009-02-28 Thread Roman Leshchinskiy

On 01/03/2009, at 04:49, Don Stewart wrote:

So now, since we've gone to such effort to produce a tiny loop like  
this,
can't we unroll it just a little? Sadly, my attempts to get GCC to  
trigger

its loop unroller on this guy haven't succeeded. -funroll-loops and
-funroll-all-loops doesn't  touch it,


That's because the C produced by GHC doesn't look like a loop to GCC.  
This can be fixed but given that we are moving away from -fvia-C  
anyway, it probably isn't worth doing.


Anyone think of a way to apply Claus' TH unroller, or somehow  
convince GCC
it is worth unrolling this guy, so we get the win of both aggressive  
high level

fusion, and aggressive low level loop optimisations?


The problem with low-level loop optimisations is that in general, they  
should be done at a low level. Core is much too early for this. To  
find out whether and how much to unroll a particular loop, you must  
take things like register pressure and instruction scheduling into  
account. IMO, the backend is the only reasonable place to do these  
optimisations. Using an exisiting backend like LLVM would really help  
here.


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] Re: Go Haskell! - array libraries

2008-11-30 Thread Roman Leshchinskiy

On 30/11/2008, at 11:36, Don Stewart wrote:


Should mutable arrays have list-like APIs? All the usual operations,
just in-place and destructive where appropriate?


I don't know. To be honest, I don't think that the term mutable  
array describes a single data structure. For instance, one of the  
central questions which unveils a whole bunch of design possibilities  
is: can mutable arrays be concatenated and how does that work if yes?


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Go Haskell! - array libraries

2008-11-29 Thread Roman Leshchinskiy

On 29/11/2008, at 11:49, Claus Reinke wrote:

Yes, it is very difficult. A sensible API for a standard array  
library  is something that needs more research. FWIW, I don't know  
of any other  language that has what I'd like to see in Haskell. C++ 
probably comes  closest but they have it easy - they don't do  
fusion.


I assume you've looked at SAC? http://www.sac-home.org/


Yes. They have it even easier - they don't have polymorphism, they  
don't have higher-order functions, they don't have boxing and laziness  
and in a sense, they don't even do general-purpose programming, just  
scientific algorithms. And they have no existing language to integrate  
their stuff with. This is not to imply that their work isn't  
interesting and valuable; IMO, it just doesn't really help us when it  
comes to designing a Haskell array library.


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Go Haskell! - array libraries

2008-11-29 Thread Roman Leshchinskiy

On 30/11/2008, at 02:43, Brad Larsen wrote:

On Fri, 28 Nov 2008 19:00:38 -0500, Roman Leshchinskiy [EMAIL PROTECTED] 
 wrote:



On 29/11/2008, at 10:47, Claus Reinke wrote:

[...]

And would it be difficult for you all to agree on a standard API, to
make switching between the alternatives easy (if
it is indeed impossible to unify their advantages in a single  
library,

the reasons for which should also be documented somewhere)?


Yes, it is very difficult. A sensible API for a standard array  
library
is something that needs more research. FWIW, I don't know of any  
other

language that has what I'd like to see in Haskell. C++ probably comes
closest but they have it easy - they don't do fusion.

[...]

Would you elaborate on what you'd like to see in an array library?


I'd like to have a library which is efficient (in particular,  
implements aggressive fusion), is roughly equivalent to the current  
standard list library in power and supports strict/unboxed/mutable  
arrays. It should also provide a generic framework for implementing  
new kinds of arrays. And eventually, it should also be usable in high- 
performance and parallel algorithms.


 And perhaps which C++ array library you are thinking of?  Your C++  
comment caught my attention, and now I'm curious.  Surely you don't  
mean C-style arrays. :-D


No, I meant vector, basic_string and deque which are provided by the  
standard library.


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Data parallelism doesn't seem to work on windows...

2008-11-29 Thread Roman Leshchinskiy

On 30/11/2008, at 10:43, Sebastian Sylvan wrote:


This, on the other hand does not use more than one core:

 -- compiler command line (from shootout code): ghc --make -fcpr-off
 -threaded -fdph-par -package dph-base -Odph -XPArr parr2.hs

-- execution as before
main = print $ [: True | n <- [: 1000 .. 5000 :], fac n == 0 :]


Unfortunately, that's not enough to get parallelism. You also need to
-fvectorise the computation, i.e., everything that comes after $ (but
not print because that can't be vectorised yet). At the moment, this
means that you have to split your code in two modules because
-fvectorise is a module-wide flag. Please take a look at
dph/examples/dotp to see how this is done. Sorry that this is so
inconvenient at the moment but we're working on it!


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Go Haskell! - array libraries

2008-11-29 Thread Roman Leshchinskiy

On 30/11/2008, at 08:32, Andrew Coppin wrote:


Henning Thielemann wrote:
I suspect that this particular function is less useful than you  
think.

 It saves one allocation and might be faster since it uses less cache,
but on the other hand, it cannot be fused.


Hmm, I haven't seen your original message but I suspect you are  
talking about in-place map. In that case, this is not entirely true.  
Shameless plug:


http://www.cse.unsw.edu.au/~rl/publications/recycling.html


I think in-place array
updates are only sensible for writing array elements in really random
order. As long as you can formulate your algorithm so that it reads from
random indices but writes a complete array from left to right, there is
almost no need for mutable arrays.


Many array algorithms cannot really be written in this way. I think we  
do need mutable arrays and they should provide much more than just  
read/write. How to integrate them nicely with immutable arrays is not  
really clear, though.
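
A histogram is a standard example: the write positions depend on the data, so
there is no left-to-right write order to exploit. A sketch against the mutable
interface of the vector package (the names assume its current API):

import qualified Data.Vector.Unboxed as U
import qualified Data.Vector.Unboxed.Mutable as M

histogram :: Int -> U.Vector Int -> U.Vector Int
histogram nbins xs = U.create $ do
  h <- M.replicate nbins 0
  U.forM_ xs $ \x -> do
    c <- M.read h x           -- read the current count ...
    M.write h x (c + 1)       -- ... and bump it in place
  return h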


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Go Haskell!

2008-11-28 Thread Roman Leshchinskiy

On 28/11/2008, at 20:04, Simon Marlow wrote:

So we have two vector libraries, vector and uvector, which have a  
lot in common - they are both single-dimension array types that  
support unboxed instances and have list-like operations with  
fusion.  They ought to be unified, really.


Yes. This shouldn't be too hard to do since both libraries are based  
on the internal DPH arrays. Although I have to admit that I never  
really looked at Don's code and have no idea how much he has changed.


But it's more than that. The basic idea behind vector is to provide a  
common framework for normal arrays, ByteString, StorableVector etc.  
It's not finished by a long shot but (unsurprisingly) I think it goes  
in the right direction. The proliferation of array-like libraries is  
counterproductive.


The main difference between these libraries and Haskell's arrays is  
the Ix class.  So perhaps Haskell's arrays should be reimplemented  
on top of the low-level vector libraries?
The Ix class is the root cause of the problems with optimising the  
standard array libraries.


Yes, Haskell arrays should be based on a lower-level array library. I  
would also argue that they should be redesigned and given a more  
streamlined and efficient interface. The Ix class is not the only  
problem wrt efficiency. In particular, the H98 array library relies  
quite heavily on lists which makes implementing any kind of array  
fusion rather hard. In contrast to Ix, this is completely unnecessary,  
IMO.
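
For example, the exported H98 interface goes through lists at both ends (a
small sketch with Data.Array):

import Data.Array

squareAll :: Array Int Double -> Array Int Double
squareAll a = listArray (bounds a) (map (\x -> x * x) (elems a))

Here elems produces a list and listArray consumes one, so code written against
this interface builds intermediate lists, which are much harder to fuse away
than a pipeline of dedicated array combinators.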


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Go Haskell! - array libraries

2008-11-28 Thread Roman Leshchinskiy


On 29/11/2008, at 08:43, Andrew Coppin wrote:

What *I* propose is that somebody [you see what I did there?] should  
sit down, take stock of all the multitudes of array libraries, what  
features they have, what obvious features they're missing, and think  
up a good API from scratch. Once we figure out what the best way to  
arrange all this stuff is, *then* we attack the problem of  
implementing it for real.


That is the idea behind vector. I don't know how good it is but it's  
the best I could come up with (or will be once it's finished). That  
said, I don't think there is such a thing as a perfect array API.  
Different libraries serve different purposes.


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Re: Go Haskell! - array libraries

2008-11-28 Thread Roman Leshchinskiy

On 29/11/2008, at 10:47, Claus Reinke wrote:


But I don't want Perl, I want a well designed language and well
designed libraries.
I think it's find to let libraries proliferate, but at some point you
also need to step back and abstract.
-- Lennart


Especially so if the free marketeers claim there is something  
fundamentally wrong with the standard libraries and language, as in  
the case of arrays. When someone did that nice little survey of the  
last bunch of array libraries (Bulat, I think? now in the wiki  
book), I was hoping there would be a grand unification soon.  
Instead, it seems that those who have worked most with Haskell  
arrays recently have simply abandoned all of the standard array  
libraries.
Okay, why not, if there are good reasons. But can't you document  
those reasons, for each of your alternative proposals, so that  
people have some basis on which to choose (other than who has the  
loudest market voice;-)?


I think so far, it's always been the same two reasons: efficiency and  
ease of use. H98 arrays are inherently inefficient and IMO not very  
easy to use, at least not for the things that I'm doing.


And would it be difficult for you all to agree on a standard API, to  
make switching between the alternatives easy (if

it is indeed impossible to unify their advantages in a single library,
the reasons for which should also be documented somewhere)?


Yes, it is very difficult. A sensible API for a standard array library  
is something that needs more research. FWIW, I don't know of any other  
language that has what I'd like to see in Haskell. C++ probably comes  
closest but they have it easy - they don't do fusion.


And what is wrong about Simon's suggestion, to use the standard  
array lib APIs on top of your implementations?


Again, IMO H98 arrays are only suitable for a very restricted set of  
array algorithms.


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Need machine for DPH benchmarking

2008-11-27 Thread Roman Leshchinskiy

Hi again,

a big thank you to all the people who offered us machines. I think we  
should be fine now. World domination is just around the corner!


Roman

On 27/11/2008, at 18:45, Roman Leshchinskiy wrote:


Hi all,

we, the DPH team, are at the moment in the very unfortunate  
situation of not having a proper machine for running our benchmarks  
on. Could a kind soul maybe give us (i.e., me) access to a quadcore  
or 2xquadcore x86 Linux or OS X machine? I only need to build ghc on  
it and run small benchmarks which never take more than a couple of  
minutes, maybe once every couple of days or so. We do need to use  
all cores, though, so no other CPU-intensive processes can be  
running during benchmarking. This is only for a week or two, until  
we get our own machine. We would be eternally grateful and won't  
forget you when DPH takes over the world.


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] Need machine for DPH benchmarking

2008-11-26 Thread Roman Leshchinskiy

Hi all,

we, the DPH team, are at the moment in the very unfortunate situation  
of not having a proper machine for running our benchmarks on. Could a  
kind soul maybe give us (i.e., me) access to a quadcore or 2xquadcore  
x86 Linux or OS X machine? I only need to build ghc on it and run  
small benchmarks which never take more than a couple of minutes, maybe  
once every couple of days or so. We do need to use all cores, though,  
so no other CPU-intensive processes can be running during  
benchmarking. This is only for a week or two, until we get our own  
machine. We would be eternally grateful and won't forget you when DPH  
takes over the world.


Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Repair to floating point enumerations?

2008-10-17 Thread Roman Leshchinskiy

On 15/10/2008, at 20:41, Malcolm Wallace wrote:


Phil proposes that, although retaining the instances of Enum for Float
and Double, we simplify the definitions of the numericEnumFrom family:

  numericEnumFromThenTo   :: (Fractional a, Ord a) => a -> a -> a -> [a]

  numericEnumFrom =  iterate (+1)
  numericEnumFromThen n m =  iterate (+(m-n)) n
  numericEnumFromTo n m   =  takeWhile (<= m) (numericEnumFrom n)
  numericEnumFromThenTo n m p = takeWhile (<= p) (numericEnumFromThen n m)



I'd like to raise the following two points in this context.

Firstly, an array library which attempts to provide reasonable  
counterparts to the list functions would want to define array versions  
of enumFromTo and enumFromThenTo. To be efficient, it must be able to  
determine the expected number of elements reasonably fast (in constant  
time).
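
For instance, with the proposed semantics above and an increasing sequence
(m > n), the element count of numericEnumFromThenTo n m p is easy to compute
up front (a sketch, ignoring the usual floating-point rounding caveats):

enumCount :: RealFrac a => a -> a -> a -> Int
enumCount n m p
  | p < n     = 0
  | otherwise = floor ((p - n) / (m - n)) + 1

An array version would use this to allocate the result in one go instead of
discovering the length element by element.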


Secondly, a parallel array library (such as the one provided by DPH)  
also needs to be able to generate, say, the first half of the array on  
one processor and the second half on another. This requires enumFromTo  
to obey some form of the following law for some f and g:


  enumFromTo m n = enumFromTo m (f m n) ++ enumFromTo (g m n) n

It might make sense to try to provide this for Float and Double for  
the sake of consistency with future libraries. I'm not sure how easy  
it is to do, though.
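
For integral types the law clearly holds with the obvious midpoint split, e.g.
(a sketch):

splitEnum :: Int -> Int -> ([Int], [Int])
splitEnum m n = ([m .. mid], [mid + 1 .. n])
  where mid = m + (n - m) `div` 2

-- [m .. n] == uncurry (++) (splitEnum m n), i.e. f m n = mid and g m n = mid + 1

The question is what f and g should look like for Float and Double once
rounding enters the picture.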


Roman


___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: Strictness in data declaration not matched in assembler?

2008-10-16 Thread Roman Leshchinskiy

On 16/10/2008, at 21:34, Simon Peyton-Jones wrote:

For strict *constructors*, on the other hand, we *do* guarantee to  
evaluate the argument before building the constructor.  We generate  
a wrapper thus

   wC = \a b. case a of { a' -> C a' b }
(Remember 'case' always evaluates in Core.)  So for strict  
constructors we could take advantage of the known evaluated-ness of  
the result to avoid the test.


BUT people who care probably UNPACK their strict fields too, which  
is even better.  The time you can't do that is for sum types

   data T = MkT ![Int]


You also can't do it for polymorphic components. I've used code like:

  data T a = MkT !a

  foo :: T (a,b) -> a
  foo (MkT (x,y)) = x

Here, unpacking doesn't work but foo could still access the components  
of the pair directly.
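
For contrast, with a monomorphic field you'd just unpack (a sketch):

  data S = MkS {-# UNPACK #-} !(Int, Int)   -- the pair is flattened into the constructor

With the polymorphic MkT above that's not possible, but the strictness
annotation still pays off: foo can take the pair apart without an extra
evaluatedness check.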


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Some initial results with DPH

2008-09-23 Thread Roman Leshchinskiy

On 23/09/2008, at 14:59, Roman Leshchinskiy wrote:


dotp :: [:Int:] -> [:Int:] -> Int
dotp v w = I.sumP [: (I.*) x y | x <- v, y <- w :]


The way the vectoriser works at the moment, it will repeat the array  
w (lengthP v) times, i.e., create an array of length (lengthP v *  
lengthP w). This is quite unfortunate and needs to be fused away but  
isn't at the moment. The only advice I can give is to stay away from  
array comprehensions for now. They work but are extremely slow. This  
definition should work fine:


dotp v w = I.sumP (zipWithP (I.*) v w)


Actually, I didn't pay attention when I wrote this. The two are not  
equivalent, of course. Only the second one computes the dot product.  
With comprehensions, you'd have to write


dotp v w = I.sumP [: (I.*) x y | x <- v | y <- w :]

I suspect that will perform reasonably even now.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Some initial results with DPH

2008-09-22 Thread Roman Leshchinskiy

Hi Austin,

first of all, thanks a lot for taking the time to report your results!

On 23/09/2008, at 11:48, Austin Seipp wrote:


* The vectorise pass boosts compilation times *a lot*. I don't think
 this is exactly unwarrented since it seems like a pretty complicated
 transformation, but while making the primitive version using just
 the unlifted interface the compilation takes about 1.5 seconds, for
 the vectorised version it's on the order of 15 seconds. For
 something as trivial as this dot-product thing, that's a bit
 of a compilation time, though.


The problem here is not the vectoriser but rather the subsequent  
optimisations. The vectoriser itself is (or should be - I haven't  
really timed it, to be honest) quite fast. It generates very complex  
code, however, which GHC takes a lot of time to optimise. We'll  
improve the output of the vectoriser eventually, but not before it is  
complete. For the moment, there is no solution for this, I'm afraid.



* It's pretty much impossible to use ghc-core to examine the output
 core of the vectorised version - I let it run and before anything
 started showing up in `less` it was already using on the order of
 100mb of memory. If I just add -ddump-simpl to the command line, the
 reason is obvious: the core generated is absolutely huge.


Yes. Again, this is something we'll try to improve eventually.


* For the benchmark included, the vectorised ver. spends about 98% of
 its time from what I can see in the GC before it dies from stack
 overflow. I haven't tried something like +RTS -A1G -RTS yet, though.


IIUC, the code is


dotp :: [:Int:] -> [:Int:] -> Int
dotp v w = I.sumP [: (I.*) x y | x <- v, y <- w :]


The way the vectoriser works at the moment, it will repeat the array w  
(lengthP v) times, i.e., create an array of length (lengthP v *  
lengthP w). This is quite unfortunate and needs to be fused away but  
isn't at the moment. The only advice I can give is to stay away from  
array comprehensions for now. They work but are extremely slow. This  
definition should work fine:


dotp v w = I.sumP (zipWithP (I.*) v w)


* The vectoriser is really, really touchy. For example, the below code
 sample works (from DotPVect.hs):


import Data.Array.Parallel.Prelude.Int as I

dotp :: [:Int:] -> [:Int:] -> Int
dotp v w = I.sumP [: (I.*) x y | x <- v, y <- w :]


This however, does not work:


dotp :: [:Int:] -> [:Int:] -> Int
dotp v w = I.sumP [: (Prelude.*) x y | x <- v, y <- w :]


This is because the vectorised code needs to call the vectorised  
version of (*). Internally, the vectoriser has a hardwired mapping  
from top-level functions to their vectorised versions. That is, it  
knows that it should replace calls to  
(Data.Array.Parallel.Prelude.Int.*) by calls to  
Data.Array.Parallel.Prelude.Base.Int.plusV. There is no vectorised  
version of (Prelude.*), however, and there won't be one until we can  
vectorise the Prelude. In fact, the vectoriser doesn't even support  
classes at the moment. So the rule of thumb is: unless it's in  
Data.Array.Parallel.Prelude or you wrote and vectorised it yourself,  
it will choke the vectoriser.



I also ran into a few other errs relating to the vectoriser dying - if
I can find some I'll reply to this with some results.


Please do! And please keep using DPH and reporting your results, that  
is really useful to us!


FWIW, we'll include some DPH documentation in 6.10 but it still has to  
be written...


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Build system idea

2008-08-28 Thread Roman Leshchinskiy

On 28/08/2008, at 19:27, Simon Peyton-Jones wrote:

Duncan, I'm not following every detail here, but it's clear that you  
have some clear mental infrastructure in your head that informs and  
underpins the way Cabal is.   Cabal takes the view that..., has  
principles, and is clearly partitioned internally.


These things are clear to you, but my sense it that they are *not*  
clear even to other well-informed people.  (I exclude myself from  
this group.)  It's like the Loch Ness monster: the bits above the  
waves make sense only when you get an underwater picture that shows  
you the monster underneath that explains why the humps surface in  
the way they do.


FWIW, I fully agree with this (although I'm not especially well- 
informed in this particular area). It would be immensely helpful if  
Cabal's philosophy was described somewhere.


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Build system idea

2008-08-28 Thread Roman Leshchinskiy

On 28/08/2008, at 21:10, Ian Lynagh wrote:


On Thu, Aug 28, 2008 at 10:27:22AM +0100, Simon Peyton-Jones wrote:


PS: concerning your last point, about separating the Simple build  
system, that might indeed be good.  Indeed, the GHC plan described  
here http://hackage.haskell.org/trac/ghc/wiki/Design/BuildSystem is  
(I think) precisely using the declarative part but not the build- 
system part.


The
 * Use Cabal for Haddocking, installing, and anything else we need  
to do.

bullet point uses the build system part.


Hmm, from the previous discussion I got the impression that (large  
parts of) this functionality would be extracted from Simple and could  
then be used by other build systems. Is this wrong?


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Build system idea

2008-08-28 Thread Roman Leshchinskiy

On 28/08/2008, at 23:59, Simon Marlow wrote:

The important thing about Cabal's way of specifying dependencies is  
that they can be made sound with not much difficulty.  If I say that  
my package depends on base==3.0 and network==1.0, then I can  
guarantee that as long as those dependencies are present then my  
package will build.  (but but but... I hear you say - don't touch  
that keyboard yet!)


Suppose you used autoconf tests instead.  You might happen to know  
that Network.Socket.blah was added at some point and write a test  
for that, but alas if you didn't also write a test for  
Network.Socket.foo (which your code uses but ends up getting removed  
in network-1.1) then your code breaks.  Autoconf doesn't help you  
make your configuration sound, and you get no prior guarantee that  
your code will build.


Cabal doesn't give this guarantee, either, since it allows you to  
depend on just network or on network >= x. To be perfectly honest, I  
think neither autoconf's approach (free-form feature tests) nor  
Cabal's (version-based dependencies) really work for all important use  
cases. And I have to disagree with what you write below - I think both  
systems are fundamentally flawed.


As I said before, what does (mostly) work IMO is depending on  
interfaces which are independent of packages. Being required to  
specify the exact interface you depend on solves the problem you  
describe above. It also solves the problem of name clashes with  
functions defined in later versions of a package. And it is still  
nicely declarative.


Both systems are flawed, but neither fundamentally.  For Cabal I  
think it would be interesting to look into using more precise  
dependencies (module.identifier::type, rather than package-version)  
and have them auto-generated.  But this has difficult implications:  
implementing cabal-install's installation plans becomes much harder,  
for example.


Interesting. From our previous discussion I got the impression that  
you wouldn't like something like this. :-)


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Build system idea

2008-08-28 Thread Roman Leshchinskiy

On 29/08/2008, at 03:11, Ian Lynagh wrote:


On Fri, Aug 29, 2008 at 12:57:59AM +1000, Roman Leshchinskiy wrote:

On 28/08/2008, at 21:10, Ian Lynagh wrote:


On Thu, Aug 28, 2008 at 10:27:22AM +0100, Simon Peyton-Jones wrote:


PS: concerning your last point, about separating the Simple build
system, that might indeed be good.  Indeed, the GHC plan described
here http://hackage.haskell.org/trac/ghc/wiki/Design/BuildSystem is
(I think) precisely using the declarative part but not the build-
system part.


The
* Use Cabal for Haddocking, installing, and anything else we need
to do.
bullet point uses the build system part.


Hmm, from the previous discussion I got the impression that (large
parts of) this functionality would be extracted from Simple and could
then be used by other build systems. Is this wrong?


I thought that the proposal was to split Cabal into the declarative
package specification part, and the how to build the package part?

If so, then surely how to run haddock on the sources belongs in the
how to build the package part?


Ignore me, I misunderstood your original mail. Sorry for the  
confusion.


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Build system idea

2008-08-28 Thread Roman Leshchinskiy

On 29/08/2008, at 01:31, Simon Marlow wrote:


Roman Leshchinskiy wrote:

On 28/08/2008, at 23:59, Simon Marlow wrote:
The important thing about Cabal's way of specifying dependencies  
is that they can be made sound with not much difficulty.  If I say  
that my package depends on base==3.0 and network==1.0, then I can  
guarantee that as long as those dependencies are present then my  
package will build.  (but but but... I hear you say - don't  
touch that keyboard yet!)


Suppose you used autoconf tests instead.  You might happen to know  
that Network.Socket.blah was added at some point and write a test  
for that, but alas if you didn't also write a test for  
Network.Socket.foo (which your code uses but ends up getting  
removed in network-1.1) then your code breaks.  Autoconf doesn't  
help you make your configuration sound, and you get no prior  
guarantee that your code will build.
Cabal doesn't give this guarantee, either, since it allows you to  
depend on just network or on network >= x.


Indeed.  That's why I was careful not to say that Cabal gives you  
the guarantee, only that it's easy to achieve it.


True, it's easy to specify. But IIUC, if you do so you have to update  
your package whenever any of the packages you depend on changes even  
if that change doesn't affect you. This is a very high (if not  
prohibitive) cost and one which the autoconf model doesn't force on you.


Both systems are flawed, but neither fundamentally.  For Cabal I  
think it would be interesting to look into using more precise  
dependencies (module.identifier::type, rather than package- 
version) and have them auto-generated.  But this has difficult  
implications: implementing cabal-install's installation plans  
becomes much harder, for example.
Interesting. From our previous discussion I got the impression that  
you wouldn't like something like this. :-)


Sorry for giving that impression.  Yes I'd like to solve the  
problems that Cabal dependencies have, but I don't want the solution  
to be too costly - first-class interfaces seem too heavyweight to  
me.  But I do agree with most of the arguments you gave in their  
favour.


I'm not sure what you mean by first-class interfaces. Surely, if you  
specify the interfaces you depend on you'll want to share and reuse  
those specifications.


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Version control systems

2008-08-15 Thread Roman Leshchinskiy

On 16/08/2008, at 00:12, Ian Lynagh wrote:

On Fri, Aug 15, 2008 at 11:12:20AM +1000, Manuel M T Chakravarty  
wrote:


Moreover, as I wrote a few times before, some reasons for switching  
in

the first place are invalidated by not having the core libraries in
git, too.  For example, one complaint about darcs is that it either
doesn't build (on the Sun Solaris T1 and T2 machines)


I don't remember seeing this mentioned before, and googling for
   Solaris T1 darcs
doesn't find anything. What goes wrong? I'd expect darcs to build
anywhere GHC does.


I only vaguely remember what was wrong but IIRC, the problem was that  
darcs 1.0.? didn't build with GHC 6.8.? because of some  
incompatibility in the libs and darcs 2 built ok but didn't work,  
probably because of libcurl issues. At that point I gave up.


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Build system idea

2008-08-14 Thread Roman Leshchinskiy

On 14/08/2008, at 06:32, Duncan Coutts wrote:


On Wed, 2008-08-13 at 22:47 +1000, Roman Leshchinskiy wrote:


Again, I'm not arguing against a build system written in Haskell. I'd
just like it to be completely separated from Haskell's packaging
system. In particular, polluting a package description with build
information seems wrong to me.


There is a huge overlap of course. The things needed to build a  
package

tend to be the dependencies. The ability to automatically extract the
dependencies from a package description is crucial as it is what  
enables
automatic package management either directly or by conversion to  
distro

packages. Tools like automake + autoconf do not give us that.


Right. Dependencies are part of a package description. That's what  
Cabal should do. It should provide a nice clean interface to the  
dependencies stuff for the build system to use. I don't think it does  
that at the moment; IIUC, it is all done by Distribution.Simple.



There is of course some separation possible, which in Cabal roughly
corresponds to the stuff under Distribution.Simple vs everything else.
We could split those two aspects into separate packages but it's not
clear to me that we'd gain much by doing that.


My point isn't really about distribution, it's about coupling. My  
concern is that the syntax of .cabal files is increasingly based on  
what Distribution.Simple needs. This effectively makes all other build  
systems second class. It also loses us clean package descriptions  
which is what .cabal files should be. It's not too bad at the moment  
but will get worse as Distribution.Simple gets more complex since it  
will need more and more information.


Just as an example, consider something like ld-options. This is  
obviously not a dependency and is basically only documented by how it  
is used by Distribution.Simple. It shouldn't be in .cabal, IMO. If a  
build system needs this information, it should be provided somewhere  
else.


There is still the Make build type which we could improve if people  
want
it. That allows the declarative stuff to be given in the .cabal file  
(so

that package managers can do their thing) and all the building is
delegated to make. People have not shown any interest in this so it's
never been improved much. The obvious disadvantage of using it is that
you have to do a lot of work to make your build system do all the  
things

that users expect.


But that is precisely my (other) point. A lot of that work is really  
unnecessary and could be done by Cabal since it only or mostly depends  
on the package information. Instead, it is implemented somewhere in  
Distribution.Simple and not really usable from the outside. For  
instance, a lot of the functionality of setup sdist, setup register  
and so on could be implemented generically and used by a make-based  
build system as well. Also, there is no easy way for build systems to  
work with the declarative stuff because a lot of that functionality  
is, again, part of Distribution.Simple. IMO, this is a direct result  
of the tight coupling between the package management and build system  
parts of Cabal.


The other problem, of course, is that it isn't clear what exactly a  
build system should provide. IIUC, that's what Building and  
installing a package in the Cabal manual defines but there, we have  
things like this:


setup test

Run the test suite specified by the runTests field of  
Distribution.Simple.UserHooks. See Distribution.Simple for information  
about creating hooks and using defaultMainWithHooks.


As a matter of fact, a lot of Cabal is documented in terms of what  
Distribution.Simple does. Again, this effectively shuts out other  
build systems.


I'm sorry if this all sounds too negative, it shouldn't really. I  
think you guys have done a great job in implementing a system which is  
obviously very important to the community. I just somewhat disagree  
with the direction in which it is heading now.


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Build system idea

2008-08-14 Thread Roman Leshchinskiy

On 14/08/2008, at 18:01, Simon Marlow wrote:


Roman Leshchinskiy wrote:

But that is precisely my (other) point. A lot of that work is  
really unnecessary and could be done by Cabal since it only or  
mostly depends on the package information. Instead, it is  
implemented somewhere in Distribution.Simple and not really usable  
from the outside. For instance, a lot of the functionality of setup  
sdist, setup register and so on could be implemented generically  
and used by a make-based build system as well.


That's exactly what I'm proposing we do in GHC: re-use Cabal's setup  
register and some of the other parts of the simple build system in a  
make-based build system for packages.  It might require a bit of  
refactoring of Cabal, but I don't expect it to be a major upheaval  
at all.


Ah! I hadn't realised that you were going to reuse Cabal functionality.
You wrote "Extract the code from Cabal that generates Makefiles", so I
thought you wouldn't really be using anything from Cabal.


I think what you're proposing is mostly a matter of abstracting  
parts of Cabal with cleaner and more modular APIs, which is  
absolutely a good thing, but doesn't require a fundamental  
redesign.  The tight coupling and lack of separation between Cabal's  
generic parts and the simple build system is somewhat accidental  
(lazy implementors :-), and is actually a lot better than it used to  
be thanks to the work Duncan has put in.  I'm sure it'll improve  
further over time.


IMO, getting this right is absolutely crucial for Cabal's usability  
and should be the primary short-term goal. Then again, I guess I  
should contribute code instead of opinions already :-)


The other part of your complaint is that the BuildInfo is in  
the .cabal file along with the PackageDescription (the types are  
pretty well separated internally).  Again I don't think there's  
anything fundamental here, and in fact some packages have  
separate .buildinfo files.


Well, it is fundamental in the sense that this is how Cabal is used  
(and is supposed to be used) at the moment. It is good that Cabal  
separates these things internally but the separation should be  
enforced in the external interface, as well.


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Build system idea

2008-08-13 Thread Roman Leshchinskiy

On 13/08/2008, at 17:47, Simon Marlow wrote:


Roman Leshchinskiy wrote:

On 12/08/2008, at 20:11, Simon Marlow wrote:
- Extract the code from Cabal that generates Makefiles, and treat it as
  part of the GHC build system.  Rather than generating a Makefile
  complete with build rules, we generate a Makefile that just
  has the package-specific metadata (list of modules, etc.), and put
  the code to actually build the package in the GHC build system.
Sounds good. It would be nice if the .cabal parser from Cabal could  
be made into a separate, stable library which ghc (and nhc?) could  
use.
This makes me wonder, though. Wouldn't this model make more sense  
for Cabal in general than the current approach of duplicating the  
functionality of autoconf, make and other stuff? If it works for ghc,
it ought to work for other projects, too. Cabal as a preprocessor  
seems much more attractive to me than as a universal build system.


So packages would be required to provide their own build system?   
That sounds like it would make it a lot harder for people to just  
create a package that others can use.  The ease of making a Cabal  
package has I think a lot to do with the wealth of software  
available on Hackage.


Of course there should be a standard build system for simple packages.  
It could be part of Cabal or a separate tool (for which Cabal could,  
again, act as a preprocessor).


GHC is a special case: we already need a build system for other  
reasons.


I agree. I just don't think that adding a full-fledged build system to  
Cabal is the solution. In my experience, huge monolithic tools which  
try to do everything never work well. I much prefer small, modular  
tools. A Haskell-based build system is an interesting project but why  
does it have to be a part of Cabal?


It was a design decision early on with Cabal that we didn't want to  
rely on the target system having a Unix-like build environment.  You  
might disagree with this, but it certainly has some value: a Windows  
user can download GHC and immediately start building and installing  
external packages without having to install Cygwin.


I agree with this decision but IIUC, this only really works for simple  
(wrt building) packages which don't even use configure. Making Cabal  
into a modular preprocessor and providing a thin wrapper for ghc --make
which can act as a target for Cabal would achieve this just as
well.


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Build system idea

2008-08-13 Thread Roman Leshchinskiy

On 13/08/2008, at 20:34, Simon Marlow wrote:


Roman Leshchinskiy wrote:

Of course there should be a standard build system for simple  
packages. It could be part of Cabal or a separate tool (for which  
Cabal could, again, act as a preprocessor).
GHC is a special case: we already need a build system for other  
reasons.
I agree. I just don't think that adding a full-fledged build system  
to Cabal is the solution. In my experience, huge monolithic tools  
which try to do everything never work well. I much prefer small,  
modular tools. A Haskell-based build system is an interesting  
project but why does it have to be a part of Cabal?


Hmm, but you said above there should be a standard build system for  
simple packages.  It could be part of Cabal


On second thought, it shouldn't be part of Cabal :-)

Cabal has two parts: some generic infrastructure, and a simple  
build system (under Distribution.Simple) that suffices for most  
packages.  We distribute them together only because it's convenient;  
you don't have to use the simple build system if you don't want to.


My impression of Cabal is that it is a build system with a bit of  
generic infrastructure. In particular, a large part of the .cabal  
syntax is specified in terms of this build system and some of it only  
really makes sense for this build system.


I think perhaps you're objecting to the fact that the simple build  
system isn't so simple, and we keep adding more functionality to  
it.  This is true, but the alternative - forcing some packages to  
provide their own build system - seems worse to me.



Cabal packages do provide their own build system; it's just that they  
use Cabal syntax instead of, say, make. The advantage of doing this  
is, of course, that Cabal's syntax is simpler. Adding things to the  
simple build system erodes this advantage. Complex projects will  
still have complex build systems - the complexity will be in  
the .cabal files. If Cabal's goal is to be able to build any project  
it will basically have to duplicate the functionality of autoconf,  
automake, libtool, make and a couple of other tools *and* be just as  
flexible. I think this is neither realistic nor necessary. So where do  
we stop? And what about the packages that Cabal won't support when we  
stop?


IMO, we should have stopped some time ago. A .cabal file should  
describe a package, not how to build it. Building should be handled by  
different tools with a clear interface between them and Cabal. If the  
build system of choice needs additional information, then that  
information should be provided in a separate file and not in the  
package description.
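
(This is the sort of clear interface I mean: the declarative part is
already available programmatically, so an external build tool could do
something along these lines. Module and function names are from memory
and have moved around between Cabal versions; example.cabal is just a
stand-in path.)

  import Distribution.PackageDescription (package, packageDescription)
  import Distribution.PackageDescription.Parse (readPackageDescription)
  import Distribution.Verbosity (normal)

  -- A build tool can consume just the declarative package description
  -- and take its build instructions from somewhere else entirely.
  main :: IO ()
  main = do
    gpd <- readPackageDescription normal "example.cabal"
    print (package (packageDescription gpd))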


Again, I'm not arguing against a build system written in Haskell. I'd  
just like it to be completely separated from Haskell's packaging  
system. In particular, polluting a package description with build  
information seems wrong to me.


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Version control systems

2008-08-09 Thread Roman Leshchinskiy


On 10/08/2008, at 14:40, Manuel M T Chakravarty wrote:

Personally, I am more than happy to stay with darcs, too, but my  
understanding was that at least the Simons decided that we are going  
to move from darcs to git.  All I am saying is that whatever vcs ghc  
uses, you need to be able to *easily* get, modify, and commit  
patches to the HEAD and the boot libs with *just one* vcs.  Using  
two vcs is going to make the current situation worse, not better.



I suspect that if GHC switches to git, it will become the standard vcs  
in the Haskell community sooner or later. Expecting that people  
(especially newcomers) will use different vcs for different libraries/ 
compilers is just unrealistic. Really, why should they? Any advantages  
in usability that darcs might have over git will be overshadowed by  
the inconvenience of having to remember two different sets of  
commands. I expect that many new projects will use git and old  
projects will start switching to it over time. So if the move is made,  
it should IMO include as big a chunk of the infrastructure as  
possible. Eventually, it will migrate to git anyway and the earlier it  
does, the simpler life will be for the developers.


As to whether the switch should be made at all, I'm not sure. I've had  
my share of problems with darcs and I don't think it's suitable for a  
project of GHC's size at the moment. On the other hand, I suspect that  
a mixture of git and darcs repos will be even more problematic than  
what we have now. Maybe investing some time in fixing the most obvious  
darcs problems would be a better solution?


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Version control systems

2008-08-09 Thread Roman Leshchinskiy

On 10/08/2008, at 05:38, Don Stewart wrote:


Instead, if we just use ubiquitous, common tools -- like git -- for
everything, we minimise the pain for people, and sit firmly in the
mainstream of open source.


While I agree with this in general, I'm not sure it really applies to  
vcs (especially darcs) all that much. I don't think anyone who has  
ever worked with a vcs will need more than a day to learn how to use  
darcs (or any other sane vcs, for that matter). Really, the problem  
with darcs is not that it is not mainstream; rather, it's just that it  
simply doesn't work sometimes.


Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Build system woes

2008-07-30 Thread Roman Leshchinskiy


On 30/07/2008, at 23:46, Simon Marlow wrote:

We can talk about the general issues on IRC.  But I thought I'd  
answer a few of the specific questions quickly:


Thanks, Simon!


Roman Leshchinskiy wrote:

I don't think I understand how GHC itself is built any longer,  
either. What does cabal-bin do?


It's effectively a pre-compiled Setup.hs for packages that have no  
need for a custom Setup.hs.  cabal-install does the same job for end- 
users, but we can't rely on having cabal-install in the GHC build  
system.


I see. So it looks at the Build-Type in the package description and  
calls the right defaultMain if it's not Custom. And if my Setup.hs  
isn't standard then it's my responsibility to set the Build-Type to  
Custom in the .cabal file and this isn't checked, right?
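
(For reference, the standard Setup.hs that cabal-bin makes redundant for
non-Custom packages is just this:)

  -- The stock Setup.hs for Build-Type: Simple; cabal-bin calls the
  -- corresponding defaultMain directly instead of compiling this file.
  import Distribution.Simple

  main :: IO ()
  main = defaultMain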


I wonder, though. What exactly does this buy us? Duncan says:

The point being that linking default Setup.hs scripts all the time is a
waste (especially since it doesn't parallelise).


But the time spent compiling those Setup.hs files is negligible and I
don't understand the "it doesn't parallelise" bit.



What is runghc.wrapper?


it's a template used to make a shell wrapper for a binary.  There  
seems to be new functionality in Cabal to support this.


I see. Is runghc the only program we do this for? Or will others be  
added gradually?


What is the difference between make.library.* and build.library.*  
and why do we need both


This one is my fault. build.* runs 'cabal-bin build', whereas make.*  
runs 'cabal-bin makefile' followed by 'make'.  That is, build.* ends  
up using ghc --make, whereas make.* ends up using a traditional  
makefile with ghc -M for dependencies and individual single-module  
compilations.  The latter was added so that we could (a) use make -j  
and (b) compile single modules for testing/debugging purposes.


Ok, this answers the following question:


 (and why doesn't one of them work for dph)?


Because it does non-standard stuff in Setup.hs and hence no Makefile  
can be generated. Hence the SUBDIRS_BUILD hackery in libraries/ 
Makefile, I assume.


Thanks again!

Roman

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] uvector and the stream interface

2008-07-14 Thread Roman Leshchinskiy

Jules Bean wrote:


It would also be helpful to have someone explain why we have:

Ptr a
ByteString
IOUArray
IOCArray
Data.Storable.StorableArray
UArr

Of course, I know the answers to some of those questions, ByteString is 
obviously less polymorphic than all the others there, and Ptr a doesn't 
contain size information. But it seems we have a rapidly bifurcating 
profusion of 'typed interfaces to chunks of memory' with no obvious 
consistency to their naming scheme and I think it's starting to get 
confusing...


I think the main reason for this is the lack of a generic infrastructure 
for efficient arrays which means that everyone is rolling their own. My 
hope is that the Data Parallel Haskell project will eventually provide 
such an infrastructure (but I'm biased, of course). We need it anyway 
and we are perhaps in the best position to do it at the moment. 
Unfortunately, we don't have too much time to work on it so development 
is slow. Still, I think (obviously) that the vector library is a step in 
the right direction.


Roman

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] uvector and the stream interface

2008-07-14 Thread Roman Leshchinskiy

stefan kersten wrote:


(2) personally i much prefer the list-like interface provided by the 
stream-fusion powered libraries (ndp, uvector, vector). can't the 
stream-fusion framework and correspondingly the vector interface be 
separated from the memory representation, provided a particular concrete 
representation comes up with a stream/unstream pair? then it would be 
easy to swap out the underlying representation according to the required 
characteristics.


The vector library does this. The problem is that just stream/unstream 
isn't sufficient to get good performance. In fact, stream fusion alone 
is suboptimal for many array algorithms since it can't handle array 
operations which require random access. In particular, it isn't enough 
for DPH. We need a more powerful framework which, hopefully, will also 
be useful for non-DPH arrays.
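
(To make the limitation concrete, here is a minimal, self-contained
sketch of the stream/unstream scheme over lists; the real library does
the same over arrays and with far more machinery. Note that everything
is a left-to-right traversal, which is exactly why operations that need
random access don't fit into this framework.)

  {-# LANGUAGE ExistentialQuantification #-}

  -- A stream is a seed plus a stepper; Skip steps let operations like
  -- filter be expressed non-recursively.
  data Step s a = Done | Skip s | Yield a s

  data Stream a = forall s. Stream (s -> Step s a) s

  stream :: [a] -> Stream a
  stream xs0 = Stream next xs0
    where
      next []       = Done
      next (x : xs) = Yield x xs

  unstream :: Stream a -> [a]
  unstream (Stream next s0) = go s0
    where
      go s = case next s of
               Done       -> []
               Skip s'    -> go s'
               Yield x s' -> x : go s'

  mapS :: (a -> b) -> Stream a -> Stream b
  mapS f (Stream next s0) = Stream next' s0
    where
      next' s = case next s of
                  Done       -> Done
                  Skip s'    -> Skip s'
                  Yield x s' -> Yield (f x) s'

  -- Operations are written stream-to-stream and wrapped in
  -- stream/unstream; the rule below removes the intermediate structure
  -- when two such operations meet.
  mapL :: (a -> b) -> [a] -> [b]
  mapL f = unstream . mapS f . stream

  {-# RULES "stream/unstream" forall s. stream (unstream s) = s #-}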


Roman

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: vector 0.1 (efficient arrays with lots of fusion)

2008-07-13 Thread Roman Leshchinskiy

Levi Stephen wrote:

 On Sun, Jul 13, 2008 at 12:31 AM, Roman Leshchinskiy
[EMAIL PROTECTED] wrote:

Hi all,

the vector library will eventually provide fast, Int-indexed arrays with a
powerful fusion framework. It's very immature at the moment (I haven't
 tested most of the code) and implements just a few combinators but I
thought releasing early wouldn't hurt. Use at your own risk and expect
things to break horribly!



Sounds interesting. How does this compare to the uvector library?


IIUC, uvector is based on an older version of the DPH libraries and only 
provides unboxed arrays. On the other hand, it's much more stable and 
has more functionality. I expect that the two libraries will be merged 
eventually.


Roman

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


[Haskell-cafe] ANNOUNCE: vector 0.1 (efficient arrays with lots of fusion)

2008-07-12 Thread Roman Leshchinskiy

Hi all,

the vector library will eventually provide fast, Int-indexed arrays with 
a powerful fusion framework. It's very immature at the moment (I haven't 
 tested most of the code) and implements just a few combinators but I 
thought releasing early wouldn't hurt. Use at your own risk and expect 
things to break horribly!


What it provides:

  * Boxed and unboxed arrays with a generic interface and a very basic
set of combinators.

  * A powerful loop fusion framework. It is based on stream fusion but
already goes beyond that (in particular, it can do some things
in-place) and will be significantly extended in the future.

  * Extensibility.

  * Rudimentary documentation.

  * Bugs.

The code is based on the DPH libraries but is much more generic and a 
*lot* simpler. This has only been made possible by the tremendous 
progress in GHC's simplifier in the recent months/years (thanks Simon!). 
Consequently, you'll need a recent development version of GHC to build 
this, 6.8 won't work. It might be able to compile the library but you'll 
get terrible code, much worse than with lists.


If you want to try it out, compile with -O2 -fno-spec-constr-count and 
please don't forget to let me know about things that didn't work (or, 
miraculously, did).


Grab it from

http://hackage.haskell.org/cgi-bin/hackage-scripts/package/vector

or

darcs get http://darcs.haskell.org/vector

Again, special thanks to Simon for doing such a wonderful job with the 
optimiser.


Enjoy,

Roman
___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] ANNOUNCE: vector 0.1 (efficient arrays with lots of fusion)

2008-07-12 Thread Roman Leshchinskiy

Sebastian Sylvan wrote:


Is there any more (easily-digested, like a paper) information available 
about this? Specifically what things can happen in-place, and future 
extensions...


Apart from the stream fusion papers, unfortunately no, it's all very 
much work in progress. Basically, at the moment it will avoid allocating 
some unnecessary arrays even if no loop fusion happens. So for instance, 
in something like


  map f (xs // us)

the two loops can't be fused ((//) is the same update operation as in 
Data.Array). Usually, you'd allocate one array for (xs // us) and then 
another one for the map. This library will do the map in-place if f 
doesn't change the type of the elements. This is pretty important at 
least for data-parallel code. I'll have to see how far this scales.
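
(The same example spelled out with Data.Array, just to make the two
traversals explicit; vector's own interface differs, this is only an
analogy.)

  import Data.Array

  -- One pass builds (xs // us), a second pass maps f over the result.
  -- The point is that the second pass can reuse the freshly allocated
  -- array in place when f keeps the element type unchanged.
  updateThenMap :: (Int -> Int) -> Array Int Int -> [(Int, Int)] -> Array Int Int
  updateThenMap f xs us = fmap f (xs // us)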


In the future, the fusion system will also be able to handle multiple 
directions of traversals and permutations (it can only do left-to-right 
traversals at the moment). The Rewriting Haskell Strings paper talks a 
bit about that but DPH has much more complex requirements so I'll have 
to use a different approach which I haven't implemented so far.


Roman

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Generalized phase control for GHC

2008-07-07 Thread Roman Leshchinskiy

Simon Peyton-Jones wrote:


Ah -- Roman you mean you want to add a phase-ordering constraint at some time 
*other* than when you declare one or other of the phases.  Are you sure this is 
important?


Fairly. I've explained why in a follow-up to Max's message.


Also, why do you want phase aliases?


Modularity. Let's take NDP fusion as an example. We have 2 logically 
distinct fusion phases, one for fusing distributed stuff and one for 
stream fusion (let's call the DIST and STREAM). At the moment, the two 
are performed simultaneously, i.e., we have DIST = STREAM. However, I 
suspect that having DIST < STREAM might produce better results. This
means that I'd like to use INLINE DIST when implementing distributed 
fusion, INLINE STREAM for stream fusion and only have one place where I 
declare DIST = STREAM or DIST < STREAM. I don't quite see how to achieve
this without aliases. This will be even more of a problem once I add 
additional fusion layers.


Incidentially, this is also an example of why adding ordering 
constraints on already declared phases is useful. Neither of the two 
fusion systems really depends on the other so it would be nice to be 
able to put the DIST < STREAM or DIST = STREAM declaration into a module
which *integrates* the two.
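
(With today's numeric phases the ordering has to be hard-wired into
every single pragma, which is exactly the modularity problem; a sketch
with made-up combinator names, not necessarily the real DPH ones:)

  -- DIST-level rules fire in phase 2 while the distributed combinators
  -- are kept from inlining until phase 1; STREAM-level combinators then
  -- inline in phase 0 after their own rules have had phase 1 to fire.
  -- The definitions are placeholders over lists.
  splitD :: [a] -> [[a]]
  splitD xs = [xs]
  {-# INLINE [1] splitD #-}

  joinD :: [[a]] -> [a]
  joinD = concat
  {-# INLINE [1] joinD #-}

  mapS :: (a -> b) -> [a] -> [b]
  mapS = map
  {-# INLINE [0] mapS #-}

  {-# RULES "joinD/splitD" [2] forall xs. joinD (splitD xs) = xs #-}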


Roman
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Generalized phase control for GHC

2008-07-05 Thread Roman Leshchinskiy

Max Bolingbroke wrote:

Hi Roman,

Three things. Firstly, what would lenient ordering be useful for? You
probably had a specific use case in mind?


I suspect that when you have multiple plugins all specifying
constraints on the phase ordering independently it is possible to end
up in a situation where using each plugin individually results in a
consistent phase ordering but a combination of multiple plugins causes
a cycle to appear in the order. Hence it is useful to let plugin
authors flag up dependencies they don't need to have so we can
intelligently break the cycle using those constraints rather than just
giving up. Admittedly I only have a superstition that this will be a
practical problem.


If you don't need a dependency and it can be ignored anyway, why would 
you want to specify it in the first place? I just can't quite imagine a 
situation in which I would use this.


As to cycles, I think just aborting if they occur would be ok.


Secondly, I think it is quite
important to be able to specify dependencies for already declared phases.
That is, I (probably) want to be able to say in module C that phase X from
module A should come before phase Y from module B.


This in interesting. You're right that it's not possible with this
system: to support this you would need a more class + instance rather
than single declaration flavour for phases. Do you have a practical
example in mind of how you would use it?


Yeah, I'd prefer the class/instance model. An example: suppose you have 
two independent libraries which implement two optimisations and you want 
one of them to run before another. I think we really need this if we 
want to compose optimisations.



I suspect that's not
quite possible with your current design. Lastly, in addition to < and > I'd
also like to have = (i.e., phase X = phase Y which would make X an alias for
Y).


Ah! This would also let you do what you wanted with your second point:

module C where
{-# PHASE XAlias = A.X, < B.Y #-}


True. It's a bit of a hack, though :-)

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Generalized phase control for GHC

2008-07-05 Thread Roman Leshchinskiy

Max Bolingbroke wrote:

If you don't need a dependency and it can be ignored anyway, why would you
want to specify it in the first place? I just can't quite imagine a
situation in which I would use this.


I think it makes sense because many of the inter-pass dependencies we
have in the GHC pipeline today are actually somewhat lenient. For
example, something quite sensible will happen if we do CSE before full
laziness, it will simply lead to less efficient code. In contrast,
it's imperative that we perform strictness analysis before applying
worker-wrapper based on the annotations it adds.


I see. I suspect I'd prefer if GHC asked me to manually resolve such 
conflicts rather than silently generating suboptimal code but then 
again, perhaps not. In any case, my comments mostly apply to phase 
control for inlining and rule matching; other passes probably have 
different requirements which I haven't thought about at all.



Yeah, I'd prefer the class/instance model. An example: suppose you have two
independent libraries which implement two optimisations and you want one of
them to run before another. I think we really need this if we want to
compose optimisations.


This is an interesting observation. So, you imagine that the user
program itself might import the compiler plugins and add a constraint
between the phases they export. I'm not sure how many users would
really want to do this, but I can see it being useful.


What I had in mind is a library/plugin which uses other 
libraries/plugins. For instance, let's say we have two libraries which 
both implement rule-based fusion, one for some high-level stuff and one 
for something low level. Now, a third library which uses the first two 
might want to specify that high-level fusion should be done before 
low-level fusion. Or consider something like loop optimisations, where 
you might have separate plugins which implement unrolling, PRE and so on 
and one plugin which aggregates all these to provide a complete loop 
optimiser. It's all about compositionality, really.


Roman

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Generalized phase control for GHC

2008-07-04 Thread Roman Leshchinskiy

Hi Max,

sorry for replying so late, I'd completely forgotten about this.


I would be interested in feedback on the design before the
implementation is complete and in the wild. I'm especially interested
in hearing if you believe that loss of support for numeric phase
numbers  2 is a problem, as this is the only breaking change that I'm
proposing.


Three things. Firstly, what would lenient ordering be useful for? You 
probably had a specific use case in mind? Secondly, I think it is quite 
important to be able to specify dependencies for already declared 
phases. That is, I (probably) want to be able to say in module C that 
phase X from module A should come before phase Y from module B. I 
suspect that's not quite possible with your current design. Lastly, in 
addition to < and > I'd also like to have = (i.e., phase X = phase Y 
which would make X an alias for Y).


As to the numeric phase ids, I'm all in favour of dropping them 
altogether. For legacy code, you could provide wired-in names 0, 1 and 2 
for now but I would deprecate them. Also, I'm one of the probably very 
few people who have ever used phase 3 and beyond and dropping support 
for that is perfectly fine with me.


Thanks for working on this!

Roman

___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users

