Re: copyArray# bug

2012-10-09 Thread Roman Leshchinskiy
Johan Tibell wrote:
 Hi,

 I did quite a bit of work to make sure copyArray# and friends get
 unrolled if the number of elements to copy is a constant. Does this
 still work with the extra branch?

I would expect it to but I don't know. Does the testsuite check for this?

Roman






Re: copyArray# bug

2012-10-09 Thread Roman Leshchinskiy
Herbert Valerio Riedel wrote:
 Roman Leshchinskiy r...@cse.unsw.edu.au writes:


 [...]

 If I'm right then I would suggest not to use copyArray# and
 copyMutableArray# for GHC < 7.8.

 I've grepped today's

  http://hackage.haskell.org/cgi-bin/hackage-scripts/archive.tar

 for occurrences of those two primitives, and this resulted in the
 following matches:

 [...]

 ...so, are you saying, that those packages above are dangerous to use
 with GHC=7.6.1?

I don't know about the other packages (it depends entirely on what kind of
arrays they copy and how) but these particular functions in primitive
definitely are dangerous to use. I'll release a hotfix shortly.

Roman






Re: copyArray# bug

2012-10-08 Thread Roman Leshchinskiy
Simon Marlow wrote:
 On 06/10/2012 22:41, Roman Leshchinskiy wrote:
 I've been chasing a segfault in the dev version of vector and I think I
 finally traced it to a bug in the implementation of copyArray# and
 copyMutableArray#. More specifically, I think emitSetCards in
 StgCmmPrim.hs (and CgPrimOp.hs) will sometimes fail to mark the last
 card as dirty because in the current implementation, the number of cards
 to mark is computed solely from the number of copied elements while it
 really depends on which cards the first and the last elements belong to.
 That is, the number of elements to copy might be less than the number of
 elements per card but the copied range might still span two cards.

 The attached patch fixes this (and the segfault in vector) and also
 makes copyArray# return immediately if the number of elements to copy is
 0. Could someone who is familiar with the code please review it and tell
 me if it looks sensible. If it does, I'll make the same modification to
 CgPrimOp.hs (which has exactly the same code) and commit. Unfortunately,
 I have no idea how to write a testcase for this since the bug is only
 triggered in very specific circumstances.

 It seems that all released versions of GHC that implement
 copyArray#/copyMutableArray# have this problem. At least, vector's
 testsuite now segfaults with all of them in roughly the same place after
 recent modifications I've made (which involve calling copyArray# a lot).
 If I'm right then I would suggest not to use copyArray# and
 copyMutableArray# for GHC < 7.8.

 Nice catch!

 Just to make sure I'm understanding: the conditional you added is not
 just an optimisation, it is required because otherwise the memset() call
 will attempt to mark a single card. (this was the bug I fixed last
 time I touched this code, but I think I might have inadvertently
 introduced the bug you just fixed)

Yes, that's exactly right. I'll add a comment.

 Please go ahead and commit.  Note that CgPrimOp is scheduled for
 demolition very shortly, but the bug will need to be fixed there in the
 7.6 branch.

Will do. Thanks for taking a look!

Roman






copyArray# bug

2012-10-06 Thread Roman Leshchinskiy
I've been chasing a segfault in the dev version of vector and I think I 
finally traced it to a bug in the implementation of copyArray# and 
copyMutableArray#. More specifically, I think emitSetCards in 
StgCmmPrim.hs (and CgPrimOp.hs) will sometimes fail to mark the last 
card as dirty because in the current implementation, the number of cards 
to mark is computed solely from the number of copied elements while it 
really depends on which cards the first and the last elements belong to. 
That is, the number of elements to copy might be less than the number of 
elements per card but the copied range might still span two cards.
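
To illustrate the arithmetic (a sketch in plain Haskell; a card size of 128
elements is assumed purely for illustration and the names are mine, not GHC's):

  cardSize :: Int
  cardSize = 128

  card :: Int -> Int
  card i = i `div` cardSize

  -- current code: the number of cards is derived from the element count alone
  cardsCurrent :: Int -> Int -> Int
  cardsCurrent _off n = (n + cardSize - 1) `div` cardSize

  -- fixed: it depends on which cards the first and last element fall into
  cardsFixed :: Int -> Int -> Int
  cardsFixed off n = card (off + n - 1) - card off + 1

For example, cardsCurrent 127 2 == 1 but cardsFixed 127 2 == 2: the two copied
elements straddle a card boundary, so two cards must be marked dirty.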


The attached patch fixes this (and the segfault in vector) and also 
makes copyArray# return immediately if the number of elements to copy is 
0. Could someone who is familiar with the code please review it and tell 
me if it looks sensible. If it does, I'll make the same modification to 
CgPrimOp.hs (which has exactly the same code) and commit. Unfortunately, 
I have no idea how to write a testcase for this since the bug is only 
triggered in very specific circumstances.


It seems that all released versions of GHC that implement 
copyArray#/copyMutableArray# have this problem. At least, vector's 
testsuite now segfaults with all of them in roughly the same place after 
recent modifications I've made (which involve calling copyArray# a lot). 
If I'm right then I would suggest not to use copyArray# and 
copyMutableArray# for GHC < 7.8.


Roman

diff --git a/compiler/codeGen/StgCmmPrim.hs b/compiler/codeGen/StgCmmPrim.hs
index cbb2aa7..6c291f1 100644
--- a/compiler/codeGen/StgCmmPrim.hs
+++ b/compiler/codeGen/StgCmmPrim.hs
@@ -1069,27 +1069,30 @@ emitCopyArray :: (CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr -> CmmExpr
               -> FCode ()
 emitCopyArray copy src0 src_off0 dst0 dst_off0 n0 = do
     dflags <- getDynFlags
-    -- Passed as arguments (be careful)
-    src     <- assignTempE src0
-    src_off <- assignTempE src_off0
-    dst     <- assignTempE dst0
-    dst_off <- assignTempE dst_off0
     n       <- assignTempE n0
+    nonzero <- getCode $ do
+        -- Passed as arguments (be careful)
+        src     <- assignTempE src0
+        src_off <- assignTempE src_off0
+        dst     <- assignTempE dst0
+        dst_off <- assignTempE dst_off0
 
-    -- Set the dirty bit in the header.
-    emit (setInfo dst (CmmLit (CmmLabel mkMAP_DIRTY_infoLabel)))
+        -- Set the dirty bit in the header.
+        emit (setInfo dst (CmmLit (CmmLabel mkMAP_DIRTY_infoLabel)))
 
-    dst_elems_p <- assignTempE $ cmmOffsetB dflags dst (arrPtrsHdrSize dflags)
-    dst_p <- assignTempE $ cmmOffsetExprW dflags dst_elems_p dst_off
-    src_p <- assignTempE $ cmmOffsetExprW dflags (cmmOffsetB dflags src (arrPtrsHdrSize dflags)) src_off
-    bytes <- assignTempE $ cmmMulWord dflags n (mkIntExpr dflags (wORD_SIZE dflags))
+        dst_elems_p <- assignTempE $ cmmOffsetB dflags dst (arrPtrsHdrSize dflags)
+        dst_p <- assignTempE $ cmmOffsetExprW dflags dst_elems_p dst_off
+        src_p <- assignTempE $ cmmOffsetExprW dflags (cmmOffsetB dflags src (arrPtrsHdrSize dflags)) src_off
+        bytes <- assignTempE $ cmmMulWord dflags n (mkIntExpr dflags (wORD_SIZE dflags))
 
-    copy src dst dst_p src_p bytes
+        copy src dst dst_p src_p bytes
 
-    -- The base address of the destination card table
-    dst_cards_p <- assignTempE $ cmmOffsetExprW dflags dst_elems_p (loadArrPtrsSize dflags dst)
+        -- The base address of the destination card table
+        dst_cards_p <- assignTempE $ cmmOffsetExprW dflags dst_elems_p (loadArrPtrsSize dflags dst)
 
-    emitSetCards dst_off dst_cards_p n
+        emitSetCards dst_off dst_cards_p n
+
+    emit =<< mkCmmIfThen (cmmNeWord dflags n (mkIntExpr dflags 0)) nonzero
 
 -- | Takes an info table label, a register to return the newly
 -- allocated array in, a source array, an offset in the source array,
@@ -1142,10 +1145,11 @@ emitSetCards :: CmmExpr -> CmmExpr -> CmmExpr -> FCode ()
 emitSetCards dst_start dst_cards_start n = do
     dflags <- getDynFlags
     start_card <- assignTempE $ card dflags dst_start
+    end_card <- assignTempE $ card dflags (cmmSubWord dflags (cmmAddWord dflags dst_start n) (mkIntExpr dflags 1))
     emitMemsetCall (cmmAddWord dflags dst_cards_start start_card)
-        (mkIntExpr dflags 1)
-        (cardRoundUp dflags n)
-        (mkIntExpr dflags 1) -- no alignment (1 byte)
+        (mkIntExpr dflags 1)
+        (cmmAddWord dflags (cmmSubWord dflags end_card start_card) (mkIntExpr dflags 1))
+        (mkIntExpr dflags 1) -- no alignment (1 byte)
 
 -- Convert an element index to a card index
 card :: DynFlags -> CmmExpr -> CmmExpr


Re: [Haskell-cafe] Call for discussion: OverloadedLists extension

2012-09-25 Thread Roman Leshchinskiy
Simon Peyton-Jones wrote:
 |  I remember a similar discussion a few years ago. The question of
 whether
 |  or not overloading list literals is a good idea notwithstanding, the
 problem
 |  with this is that fromList for vectors is highly inefficient. So if
 |  something like this gets implemented and if vector/array literals are
 one
 |  of the main motivations then I really hope there will be no lists
 |  involved.

 Would you like to remind us why it is so inefficient?  Can't the vector
 construction be a fold over the list?  Ah... you need to know the *length*
 of the list, don't you?  So that you can allocate a suitably-sized vector.
  Which of course we do for literal lists.

 So what if fromList went
   fromList :: Int -> [b] -> a b
 where the Int is the length of the list?

That's part of the problem. There are really two aspects to it. Firstly, a
naive list-based implementation would be a loop. But when I write ([x,y]
:: Vector Double) somewhere in an inner loop in my program, I *really*
don't want a loop with two iterations at runtime - I want just an
allocation and two writes. I suppose this could be solved by doing
something like this:

  {-# INLINE fromList #-}
  fromList [] = V.empty
  fromList [x] = V.singleton x
  fromList [x,y] = ...
  -- and so on up to 8? 16? 32?
  fromList xs = fromList_loop xs

But it's ugly and, more importantly, inlines a huge term for every literal.

The other problem is with literals where all values are known at compile
time. Suppose I have ([2.5,1.4] :: Vector Double) in an inner loop. Here,
I don't want a complicated CAF for the constant vector which would have to
be entered on every loop iteration. I'd much rather just have a pointer to
the actual data somewhere in memory and use that. This is more or less
what happens for strings at the moment, even though you have to use
rewrite rules to get at the pointer which, in my opinion, is neither ideal
nor really necessary. IMO, the right design shouldn't rely on rewrite
rules. Also, strings give you an Addr# whereas vector supports ByteArray#,
too.

Since enumerated literals have been mentioned in a different post, I'll
just mention that the Enum class as it is now can't support those
efficiently for arrays because there is no way to determine either the
length or the nth element of [x..y] in constant time. This would have to
be fixed.

Roman






Re: [Haskell-cafe] Call for discussion: OverloadedLists extension

2012-09-25 Thread Roman Leshchinskiy
Simon Peyton-Jones wrote:

 | pointer to the actual data somewhere in memory and use that. This is
 | more or less what happens for strings at the moment, even though you
 | have to use rewrite rules to get at the pointer which, in my opinion, is
 | neither ideal nor really necessary. IMO, the right design shouldn't
 | rely on rewrite rules. Also, strings give you an Addr# whereas vector
 | supports ByteArray#, too.

 If it's not necessary, I wonder if you have an idea for the right
 design?

For strings, we could have something like this:

data StringPtr

stringFromStringPtr :: StringPtr -> Int -> String
unsafeStringPtrToPtr :: StringPtr -> Ptr CChar

class IsString a where
  fromString :: String -> a
  fromStringPtr :: StringPtr -> Int -> a
  fromStringPtr p n = fromString $ stringFromStringPtr p n

"abc" would then desugar to fromStringPtr (address of "abc") 3. Note that
we couldn't just use Ptr CChar instead of StringPtr because stringFromPtr
would only be safe if the data that the pointer references never changes.
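
As a sketch of how an instance could use the pointer directly (this assumes
the class and StringPtr primitives above; unsafePackCStringLen is from
Data.ByteString.Unsafe and does not copy, which is fine here because the
bytes of a literal never change):

  import System.IO.Unsafe        (unsafePerformIO)
  import Data.ByteString         (ByteString)
  import qualified Data.ByteString.Char8 as BC
  import Data.ByteString.Unsafe  (unsafePackCStringLen)

  -- assumes the proposed IsString class and StringPtr primitives above
  instance IsString ByteString where
    fromString        = BC.pack
    fromStringPtr p n =
      unsafePerformIO $ unsafePackCStringLen (unsafeStringPtrToPtr p, n)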

It's much trickier for general-purpose arrays. It's also much trickier to
support both Ptr and ByteArray. I'd have to think about how to do that.

Roman






Re: [Haskell-cafe] Call for discussion: OverloadedLists extension

2012-09-24 Thread Roman Leshchinskiy
Michael Snoyman wrote:

 The simplest example I can think of is allowing easier usage of Vector:

 [1, 2, 3] :: Vector Int

 In order to allow this, we could use a typeclass approach similar to
 how OverloadedStrings works:

 class IsList a where
     fromList :: [b] -> a b
 instance IsList Vector where
     fromList = V.fromList
 foo :: Vector Int
 foo = fromList [1, 2, 3]

I remember a similar discussion a few years ago. The question of whether
or not overloading list literals is a good idea notwithstanding, the problem
with this is that fromList for vectors is highly inefficient. So if
something like this gets implemented and if vector/array literals are one
of the main motivations then I really hope there will be no lists
involved.

Roman






Re: [Haskell-cafe] Dynamic Programming with Data.Vector

2012-09-18 Thread Roman Leshchinskiy
Myles C. Maxfield wrote:
 Aha there it is! Thanks so much. I didn't see it because it's under the
 Unfolding section instead of the Construction section.

You're quite right, having a separate Unfolding section isn't the best
idea. I'll fix this.

Roman

 On Mon, Sep 17, 2012 at 6:07 AM, Roman Leshchinskiy
 r...@cse.unsw.edu.auwrote:

 Myles C. Maxfield wrote:
 
  Overall, I'm looking for a function, similar to Data.Vector's
 'generate'
  function, but instead of the generation function taking the
 destination
  index, I'd like it to take the elements that have previously been
  constructed. Is there such a function? If there isn't one, is this
 kind
 of
  function feasible to write? If such a function doesn't exist and is
  feasible to write, I'd be happy to try to write and contribute it.

 Indeed there is, it's called constructN (or constructrN if you want to
 construct it right to left).

 Roman










Re: [Haskell-cafe] Dynamic Programming with Data.Vector

2012-09-17 Thread Roman Leshchinskiy
Myles C. Maxfield wrote:

 Overall, I'm looking for a function, similar to Data.Vector's 'generate'
 function, but instead of the generation function taking the destination
 index, I'd like it to take the elements that have previously been
 constructed. Is there such a function? If there isn't one, is this kind of
 function feasible to write? If such a function doesn't exist and is
 feasible to write, I'd be happy to try to write and contribute it.

Indeed there is, it's called constructN (or constructrN if you want to
construct it right to left).
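
For reference, a small example of the kind of dependent construction
constructN supports (a sketch; each new element is computed from the prefix
that has already been built):

  import qualified Data.Vector as V

  fibs :: Int -> V.Vector Integer
  fibs n = V.constructN n step
    where
      step prefix
        | V.length prefix < 2 = 1
        | otherwise           = prefix V.! (V.length prefix - 1)
                              + prefix V.! (V.length prefix - 2)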

Roman






Re: [Haskell-cafe] Performance with do notation, mwc-random and unboxed vector

2012-06-13 Thread Roman Leshchinskiy
On 12 Jun 2012, at 12:52, Dmitry Dzhus d...@dzhus.org wrote:

 12.06.2012, 01:08, Roman Leshchinskiy r...@cse.unsw.edu.au:
 
 perhaps the state hack is getting in the way.
 
 I don't quite understand the internals of this yet, but `-fno-state-hack` 
 leads to great performance in both cases!
 How safe is that?

It doesn't change the semantics of your program but it can make it 
significantly slower (or faster, as in this case). The various state hack 
related tickets on trac might give you an idea of what is happening here.
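
The flag can also be scoped to a single module by putting it in a pragma at
the top of that module instead of on the command line:

  {-# OPTIONS_GHC -fno-state-hack #-}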

We really need some proper arity analysis!

Roman





Re: [Haskell-cafe] vector operations

2012-06-11 Thread Roman Leshchinskiy
On 11/06/2012, at 18:52, Evan Laforge wrote:

 On Tue, May 29, 2012 at 12:52 PM, Roman Leshchinskiy r...@cse.unsw.edu.au 
 wrote:
 
 Vector should definitely fuse this, if it doesn't it's a bug. Please report 
 if it doesn't for you. To verify, just count the number of letrecs in the 
 optimised Core. You'll see one letrec if it has been fused and two if it 
 hasn't.
 
 I see two letrecs in find_before2, but both of them are on findIndex.
 I only have one findIndex so I'm not sure what's going on.  The first
 one calls the second, but there's a boxed Either argument in there,
 which must be coming out of vector internals.

Hmm, which version of GHC and what compiler flags are you using? I'm not 
familiar with ghc-core, maybe that's doing something wrong. Just run ghc -O2 
-ddump-simpl and look at the output. Below is the code I'm getting for 
find_before2 with 7.4.2. As you can see, everything has been fused (although I 
notice that GHC isn't pushing x_a11p and y1_a124 into the branches for some 
reason, looks like a new regression but not a particularly bad one and nothing 
to do with fusion).

find_before2_rkk :: Int -> Vector Int -> Int
[GblId, Arity=2, Caf=NoCafRefs, Str=DmdType LU(LLL)m]
find_before2_rkk =
  \ (n_arE :: Int) (vec_arF :: Vector Int) ->
    case vec_arF `cast` ...
    of _ { Vector ipv_s2Jf ipv1_s2Jg ipv2_s2Jh ->
    case n_arE of _ { I# y_a11t ->
    case ># 0 y_a11t of _ {
      False ->
        letrec {
          $sfindIndex_loop_s2Qz [Occ=LoopBreaker]
            :: Int# -> Int# -> Int# -> Id (Maybe Int)
          [LclId, Arity=3, Str=DmdType LLL]
          $sfindIndex_loop_s2Qz =
            \ (sc_s2Q8 :: Int#) (sc1_s2Q9 :: Int#) (sc2_s2Qa :: Int#) ->
              case >=# sc_s2Q8 ipv1_s2Jg of _ {
                False ->
                  case indexIntArray# ipv2_s2Jh (+# ipv_s2Jf sc_s2Q8)
                  of wild_a2JM { __DEFAULT ->
                  let {
                    x_a11p [Dmd=Just L] :: Int#
                    [LclId, Str=DmdType]
                    x_a11p = +# sc1_s2Q9 wild_a2JM } in
                  case ># x_a11p y_a11t of _ {
                    False ->
                      $sfindIndex_loop_s2Qz (+# sc_s2Q8 1) x_a11p (+# sc2_s2Qa 1);
                    True -> (Just @ Int (I# sc2_s2Qa)) `cast` ...
                  }
                  };
                True -> (Nothing @ Int) `cast` ...
              }; } in
        case ($sfindIndex_loop_s2Qz 0 0 1) `cast` ... of _ {
          Nothing -> lvl_r2QO;
          Just i_arH ->
            case i_arH of _ { I# x_a11Q ->
            let {
              y1_a124 [Dmd=Just L] :: Int#
              [LclId, Str=DmdType]
              y1_a124 = -# x_a11Q 1 } in
            case <=# 0 y1_a124 of _ {
              False -> lvl_r2QO;
              True -> I# y1_a124
            }
            }
        };
      True -> lvl_r2QO
    }
    }
    }

Roman




Re: [Haskell-cafe] Performance with do notation, mwc-random and unboxed vector

2012-06-11 Thread Roman Leshchinskiy
On 11/06/2012, at 10:38, Dmitry Dzhus wrote:

 Consider this simple source where we generate an unboxed vector with a million
 pseudo-random numbers:
 
  --8<---
 import qualified Data.Vector.Unboxed as VU
 
 import System.Random.MWC
 import System.Random.MWC.Distributions (standard)
 
 count = 1000000
 
 main = do
  g <- create
  e' <- VU.replicateM count $ standard g
  return ()
  --8<---
 
 Being compiled with -O2, this runs for 0.052 s on my machine.
 
 Changing the replicateM line to use do notation pushes the runtime up to 
 11.257 s!
 See below:
 
  --8<---
 import qualified Data.Vector.Unboxed as VU
 
 import System.Random.MWC
 import System.Random.MWC.Distributions (standard)
 
 count = 1000000
 
 main = do
  g <- create
  e' <- VU.replicateM count $ do
   v <- standard g
   return v
  return ()
  --8<---

The former essentially generates this:

  replicateM n ((letrec f = ... in f) `cast` ...)

and the latter this:

  replicateM n (\(s :: State# RealWorld) -> (letrec f = ... in f s) `cast` ...)

I'd look further into this but mwc-random just inlines too much stuff. Could 
you perhaps find a smaller example that doesn't use mwc-random? In any case, it 
looks like a GHC bug, perhaps the state hack is getting in the way.

Roman





Re: [Haskell-cafe] vector operations

2012-05-29 Thread Roman Leshchinskiy
On 29/05/2012, at 19:49, Evan Laforge wrote:

 Good question.. I copied both to a file and tried ghc-core, but it
 inlines big chunks of Data.Vector and I can't read it very well, but
 it looks like the answer is no, it still builds the list of sums.
 I guess the next step is to benchmark and see how busy the gc is on
 each version.

Vector should definitely fuse this, if it doesn't it's a bug. Please report if 
it doesn't for you. To verify, just count the number of letrecs in the 
optimised Core. You'll see one letrec if it has been fused and two if it hasn't.

 But my impression was that stream fusion can't handle early aborts,
 which was why I was wondering why Vector lacks a foldAbort type
 function.

Stream fusion easily handles early aborts. There isn't anything like foldAbort 
precisely because it can be built out of existing operations at no extra cost.
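
For instance, a variant of the running-sum search from earlier in this thread
can be written with existing fusible combinators (a sketch using Data.Vector
names; after fusion it becomes a single loop that stops at the first match):

  import qualified Data.Vector as V

  findBefore :: (Num a, Ord a) => a -> V.Vector a -> Maybe Int
  findBefore n = V.findIndex (>= n) . V.postscanl' (+) 0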

Roman

 On Wed, May 23, 2012 at 5:13 AM, Jake McArthur jake.mcart...@gmail.com 
 wrote:
 Have you already verified that stream fusion won't just do this for you?
 
 On May 23, 2012 12:35 AM, Evan Laforge qdun...@gmail.com wrote:
 
 So I wanted to find the first index in a vector whose running sum is
 greater than a given number.
 
 The straightforward way is to create the running sum and then search:
 
  Vector.findIndex (>=target) (Vector.scanl' (+) 0 vector)
 
 But vectors are strict so it could do extra work, and what if I don't
 want to generate garbage?  I could do it with a fold, but it would
 have to have the ability to abort early.  Of course I could write such
 a fold myself using indexing:
 
 import qualified Data.Vector.Generic as Vector
 
  fold_abort :: (Vector.Vector v a) => (accum -> a -> Maybe accum) -> accum
      -> v a -> accum
  fold_abort f accum vec = go 0 accum
      where go i accum = maybe accum (go (i+1)) $ f accum =<< vec Vector.!? i
 
  find_before :: (Vector.Vector v a, Num a, Ord a) => a -> v a -> Int
  find_before n = fst . fold_abort go (0, 0)
      where
      go (i, total) a
          | total + a >= n = Nothing
          | otherwise = Just (i+1, total+a)
 
 So it's bigger and clunkier, but I would think it would be much more
 efficient (provided using Data.Vector.Generic won't inhibit inlining
 and unboxing).  But I'm a bit surprised there isn't already something
 like fold_abort... or is there?
 




Re: Unpack primitive types by default in data

2012-02-17 Thread Roman Leshchinskiy
Johan Tibell wrote:
 Hi all,

 I've been thinking about this some more and I think we should
 definitely unpack primitive types (e.g. Int, Word, Float, Double,
 Char) by default.

 The worry is that reboxing will cost us, but I realized today that at
 least one other language, Java, does this already today and even
 though it hurts performance in some cases, it seems to be a win on
 average. In Java all primitive fields get auto-boxed/unboxed when
 stored in polymorphic fields (e.g. in a HashMap which stores keys and
 fields as Object pointers.) This seems analogous to our case, except
 we might also unbox when calling lazy functions.

I'm not convinced that this is a good idea because it doesn't treat all
types equally. The comparison with Java is problematic, IMO, because in
Java 'int' is always called 'int' whereas in Haskell, it might be called
many different things.

To better understand the proposal, which of the types below would you want
to be unboxed automatically?

data A = A Int#
newtype B = B A
data C = C !B
data D = D !C
data E = E !()
data F = F !D

Roman






Re: Unpack primitive types by default in data

2012-02-17 Thread Roman Leshchinskiy
Jean-Marie Gaillourdet wrote:
 Hi,

 On 17.02.2012, at 09:52, Roman Leshchinskiy wrote:

 Johan Tibell wrote:

 The worry is that reboxing will cost us, but I realized today that at
 least one other language, Java, does this already today and even
 though it hurts performance in some cases, it seems to be a win on
 average. In Java all primitive fields get auto-boxed/unboxed when
 stored in polymorphic fields (e.g. in a HashMap which stores keys and
 fields as Object pointers.) This seems analogous to our case, except
 we might also unbox when calling lazy functions.

 I'm not convinced that this is a good idea because it doesn't treat all
 types equally. The comparison with Java is problematic, IMO, because in
 Java 'int' is always called 'int' whereas in Haskell, it might be called
 many different things.

 Actually, there are two types for every primitive type in Java:
 int / java.lang.Integer
 byte / java.lang.Byte
  char / java.lang.Character
 float / java.lang.Float
 double / java.lang.Double
 ...

True, I didn't phrase that right at all. AFAIK, in Java, there is a
one-to-one correspondence between a primitive type and its boxed version.
So you can say that boxed integers will be unboxed when necessary and it's
clear what that means. But in Haskell, there is no such one-to-one
correspondence. This is a very good thing but it makes specifying and
understanding what will be unboxed when much harder.

Roman






Re: Unpack primitive types by default in data

2012-02-17 Thread Roman Leshchinskiy
On 17/02/2012, at 17:51, Johan Tibell wrote:

 On Fri, Feb 17, 2012 at 12:52 AM, Roman Leshchinskiy r...@cse.unsw.edu.au 
 wrote:
 I'm not convinced that this is a good idea because it doesn't treat all
 types equally. The comparison with Java is problematic, IMO, because in
 Java 'int' is always called 'int' whereas in Haskell, it might be called
 many different things.
 
 To better understand the proposal, which of the types below would you want
 to be unboxed automatically?
 
 data A = A Int#
 newtype B = B A
 data C = C !B
 data D = D !C
 data E = E !()
 data F = F !D
 
 All of the above. Put in other words: all fields whose final
 representation type could be the size of a pointer if we unpacked
 enough.

Ok, that makes sense. I would include Double# and Int64# in this list. For the 
simple reason that if the target audience are beginners they will have a hard 
time figuring out why their programs run 20x faster with Float/Int than with 
Double/Int64 instead of just 2x faster.

Roman





Re: Changes to Typeable

2012-02-14 Thread Roman Leshchinskiy
On 13/02/2012, at 11:10, Simon Peyton-Jones wrote:

 |  Should there perhaps be a NewTypeable module which could then be renamed
 |  into Typeable once it is sufficiently well established?
 
 I started with that idea, but there would be a 2-stage process:
  * Step 1: (when PolyTypeable becomes available) People change to import 
 Data.PolyTypeable
 * Step 2: (when PolyTypeable becomes Typeable) People change back to 
 Data.Typeable

The problem is that libraries generally have to support multiple versions of 
GHC and this would become harder. But that isn't too bad, preprocessor magic 
solves it. It would be easier if we could define Typeable1 etc. as an alias for 
Typeable (since they now mean the same thing) but we don't have class aliases.

My main objection is still the fact that a central library will now rely on a 
highly experimental language feature which isn't even really available in a GHC 
release yet (my understanding is that support for polykinds in 7.4 is shaky at 
best). IMO, this should be avoided as a matter of policy. I realise that others 
are much less conservative than me in this respect, though.

Roman





Re: Changes to Typeable

2012-02-12 Thread Roman Leshchinskiy
On 12/02/2012, at 03:21, Iavor Diatchki wrote:

 PS: I wouldn't worry too much about breaking existing code, as long as
 derived Typeable instances continue to work---I never provide custom
  ones and, in fact, I think that GHC should not allow them or, at least,
 give a stern warning when it sees one.

There is no easy way to have derived Typeable instances for data families. 
Standalone deriving works, but only since 7.2 and only like this:

data family T a
deriving instance Typeable1 T

Presumably, this should be Typeable rather than Typeable1 with the new design.

Roman





Re: Proposal: require spaces around the dot operator

2012-02-12 Thread Roman Leshchinskiy
On 12/02/2012, at 02:39, Greg Weber wrote:

 This proposal stands on its own
 * the dot operator is inconsistent with Module function selection.
 * we are allowed the option of expanding the usage of the dot without
 spaces if this proposal goes forward.
 
 The point is that we will decide whether or not to expand the usage of
 the dot in the *future*. We could decide on a completely different
 usage than record field selection.

Then we will have broken a lot of code just to remove a tiny inconsistency from 
the language. That really doesn't sound like a good idea to me.

 If this proposal is not compelling enough on its own we should merge
 it with other proposals and discuss them together as a single new
 concrete proposal.

Personally, I would much prefer this.

BTW, after looking through the relevant Wiki pages I think the proposal is 
actually underspecified. The TDNR page introduces a new lexeme for .var_id. 
This is quite easy to integrate into the grammar and to parse but it means that 
(f).(g) and (f. g) still both parse as function composition applied to f and g. 
The DotOperator page, however, seems to require that neither of these parse as 
function composition. But in that case, what does the grammar look like for 
'.'? More specifically, what does Text.Read.lex return for the two examples?

Roman





Re: Proposal: require spaces around the dot operator

2012-02-12 Thread Roman Leshchinskiy
On 12/02/2012, at 02:42, Isaac Dupree wrote:

 Does it help your concern about breaking existing code to make sure this 
 proposal has a LANGUAGE flag? (-XDotSpaces or such)
 
 (I'm guessing that helps somewhat but not very satisfactorily; the more 
 default and standard it becomes, the more often it tends to break code 
 anyway.)

I'm actually not sure why anyone would want to turn on the flag if all it does 
is render legal Haskell code invalid. I really doubt it would become widely 
used. So while it would address my concern in a sense, I don't really see the 
point of introducing it.

Roman





Re: Changes to Typeable

2012-02-11 Thread Roman Leshchinskiy
On 10/02/2012, at 23:30, John Meacham wrote:

 something I have thought about is perhaps a special syntax for Proxy, like
 {:: Int -> Int } is short for (Proxy :: Proxy (Int -> Int)). not sure whether
 that is useful enough in practice though, but could be handy if we are 
 throwing
 around types a lot.

We really need explicit syntax for type application. There are already a lot of 
cases where we have to work around not having it (e.g., Storable) and with the 
new extensions, there are going to be more and more of those.

Roman





Re: Proposal: require spaces around the dot operator

2012-02-11 Thread Roman Leshchinskiy
On 10/02/2012, at 02:41, Greg Weber wrote:

 There are 2 compelling reasons I know of to prefer dot for record access
 1) follows an almost universal convention in modern programming languages
 2) is consistent with using the dot to select functions from module 
 name-spaces

I don't understand what you mean by consistent in 2). The TDNR proposal quite 
explicitly says that X.f and (X).f mean very different things. This isn't 
consistent, IMO, it's actually exactly the same inconsistency we have now.

 We can have a lot of fun bike-shedding about what operator we would
 prefer were these constraints not present. Personally I wouldn't care.
 However, I find either one of these 2 points reason enough to use the
 dot for record field access, and even without a better record system
 the second point is reason enough to not use dot for function
 composition.
 
 It is somewhat convenient to argue that it is too much work and
 discussion for something one is discussing against. The only point
 that should matter is how existing Haskell code is effected.

Huge amounts of existing Haskell code are broken by this. To me, that should 
override all other considerations. If Haskell is to be seen as a real-world 
programming language, then breaking code in this way simply shouldn't be 
acceptable.

I also don't really understand why it makes sense to take clear, concise and 
well-established syntax away from a very frequently used language feature and 
use it for (at least in my experience) a less widely used language feature 
without an equally clear and concise alternative for the former. Isn't the net 
effect less readable code?

As an aside, could -> be used for field access? I don't think it introduces any 
ambiguities (although it's late and I'm probably mistaken) and there are 
well-known precedents in other programming languages.

Roman





Re: Proposal: require spaces around the dot operator

2012-02-11 Thread Roman Leshchinskiy
On 12/02/2012, at 01:29, Nate Soares wrote:

  If -> was introduced for accessing fields, we'd have to discuss whether it 
 should have spaces around it. I'd lean towards requiring that it have no 
 spaces when used for field access, for symmetry with . when used for module 
 access.

I'm not sure spaces matter in this case; -> is a reserved token, so we would just 
have expressions of the form expr -> field with no special lexical rules. BTW, 
if -> doesn't work for some reason then there is also => which AFAIK isn't used 
in expressions at all currently.

Roman





Re: Proposal: require spaces around the dot operator

2012-02-11 Thread Roman Leshchinskiy
On 12/02/2012, at 02:04, Greg Weber wrote:

 I am sorry that I made the huge mistake in referencing future possible
 proposals. If this proposal passes, that has no bearing on whether the
 other proposals would pass, it just makes them possible.
 
 Please help me fix my error by stopping all discussions of future
 proposals and focusing solely on the one at hand.

But if we don't consider those future proposals, then what is the justification 
for this one? It does break existing code so there must be some fairly 
compelling arguments for it. I don't think it can be considered in isolation.

Roman





Re: [Haskell-cafe] strict version of Haskell - does it exist?

2012-01-30 Thread Roman Leshchinskiy
Marc Weber wrote:
 Replying to all replies at once:

 Malcolm Wallace
  At work, we have a strict version of Haskell
  :-) which proves that it is worth thinking about it.

But doesn't necessarily prove that it's a good idea.

    Just (Item id ua t k v) -> M.insertWith (+) k 1 st

Does replacing this by insertWith' help?
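
That is, something like this (a sketch; insertWith' forces the combined value
before it is stored, so no chain of (+) thunks builds up in the map):

  import qualified Data.Map as M

  bump :: Ord k => k -> M.Map k Int -> M.Map k Int
  bump k m = M.insertWith' (+) k 1 m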

Roman







Re: [Haskell-cafe] Unboxed Rationals?

2012-01-12 Thread Roman Leshchinskiy
On 12/01/2012, at 21:01, Artyom Kazak wrote:

 Yves Parès limestr...@gmail.com wrote in his message of Thu, 12 Jan 2012 13:14:16 +0200:
 
  uvector is deprecated, its functionality has been ported into vector.
 
 Yes, but a Ratio a instance hasn't been ported.

FWIW, vector isn't a port of uvector in any sense, shape or form. Rather, 
uvector was a fork of a very old version of an internal DPH package, whereas 
vector is a from-scratch implementation of arrays based on the experience 
gained while working on DPH.

Vector is an open-source project and has a trac. If you need something, open a 
ticket or better yet, send me patches! Emails sometimes work, too :-)

Roman





Re: [Haskell-cafe] Unboxed Rationals?

2012-01-11 Thread Roman Leshchinskiy
On 11/01/2012, at 17:00, Artyom Kazak wrote:

 In fact, I am surprised that Data.Vector doesn't have a Ratio
 instance, but has a Complex instance. Any ideas, why?

Nobody has asked for it so far.

Roman





Re: [Haskell-cafe] Documenting strictness properties for Data.Map.Strict

2011-11-18 Thread Roman Leshchinskiy
Johan Tibell wrote:

   map (\ v -> undefined)  ==  undefined
   mapKeys (\ k -> undefined)  ==  undefined

Not really related to the question but I don't really understand how these
properties can possibly hold. Shouldn't it be:

  map (\v -> undefined) x = undefined

And even then, does this really hold for empty maps?

Roman






Re: [Haskell-cafe] Stream fusion

2011-11-18 Thread Roman Leshchinskiy
Yves Parès wrote:

 While re-reading RealWorldHaskell, chapter 25, I saw that -- unlike I
 believed -- loop fusion wasn't activated by default under GHC for lists
 (but that module Data.List.Stream from package stream-fusion could provide
 it).

Note that stream fusion is only one way to do fusion. For lists, GHC uses
foldr/build fusion which is a different approach but still fuses loops.
You get this by default when compiling with optimisations.
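
For reference, the rule (from GHC.Base) is essentially:

  {-# RULES
  "fold/build"  forall k z (g :: forall b. (a -> b -> b) -> b -> b) .
                foldr k z (build g) = g k z
    #-}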

 Is that still the case? If not, then are there some cases of list
 processing where loop fusion would be a bad thing? (Ergo cases where you
 should stick to Prelude/Data.List functions and not use Data.List.Stream
 implementation)

I'm not sure if anybody has actually benchmarked the stream-fusion package
with a modern GHC. I suspect it wouldn't hold up well, too many things
have changed in the compiler since it was written. So I'm not really sure
you should be using it at all. Chances are, if you really care about
having tight loops you shouldn't be using lists at all.

Roman






Re: [Haskell-cafe] Data.Vector.Mutable.mapM

2011-10-23 Thread Roman Leshchinskiy
Joachim Breitner wrote:
 Hi,

 I’m considering changing some performance critical code from Vector to
 MVector, hopefully avoiding a lot of copying and garbage collecting. But
 it seems that the Data.Vector.Mutable interface at
 http://hackage.haskell.org/packages/archive/vector/0.9/doc/html/Data-Vector-Mutable.html
  is quite limited; e.g. I am missing simple functions having type
  modifyM :: PrimMonad m => (a -> m a) -> MVector (PrimState m) a -> m ()
 that would do something with each element in the vector.

At the moment, the best way to do this is:

modifyM = Data.Vector.Generic.Mutable.transform
        . Data.Vector.Fusion.Stream.Monadic.mapM

Note that transform will return a new vector but that is guaranteed to be
a slice of the original one. Since mapM doesn't change the number of
elements, you can safely ignore the return value as it will always be your
original vector.
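
Putting it together, a self-contained version of that definition could look
like this (a sketch; the wrapper just discards the returned slice as described
above):

  import Control.Monad (void)
  import Control.Monad.Primitive (PrimMonad, PrimState)
  import qualified Data.Vector.Generic.Mutable as GM
  import qualified Data.Vector.Fusion.Stream.Monadic as S

  modifyM :: (PrimMonad m, GM.MVector v a)
          => (a -> m a) -> v (PrimState m) a -> m ()
  modifyM f v = void $ GM.transform (S.mapM f) v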

 Is this an indication that such use is actually not desired, or is it
 just the case that nobody has developed that yet?

The latter. I need to come up with a nice mechanism for specifying loops
over mutable vectors but this isn't entirely trivial and I haven't had
enough time to really work on this lately.

Roman






Re: [Haskell-cafe] ANNOUNCE: vector-bytestring-0.0.0.0

2011-10-19 Thread Roman Leshchinskiy
Conrad Parker wrote:
 On 15 October 2011 23:18, Ivan Lazar Miljenovic
 ivan.miljeno...@gmail.com wrote:
 On 16 October 2011 01:15, Bas van Dijk v.dijk@gmail.com wrote:

 I agree that you shouldn't use ByteStrings or Vectors of Word8s for
 Unicode strings. However I can imagine that for quick sessions in ghci
 it can be quite handy if they are shown as strings. For example,
 currently we have:

 import Network.HTTP.Enumerator
 simpleHttp http://code.haskell.org/~basvandijk/;
 Chunk html\nheadtitleBas van
 Dijk/title/head\nbody\nh1Bas van Dijk/h1\n\npEmail: a
 href=\mailto://v.dijk@gmail.com\;v.dijk@gmail.com/a/p\n\npNick
 on IRC: ttbasvandijk/tt/p\n\na
 href=\http://www.haskellers.com/user/basvandijk/\;\n  img
 src=\http://www.haskellers.com/static/badge.png\; \n       alt=\I'm
 a Haskeller\\n       border=\0\\n/a\n\npSee my a
 href=\https://github.com/basvandijk\;GitHub/a page for a list of
 projects I work on./p\n\n/body\n/html\n Empty

 If ByteStrings were not shown as strings this would look like:

 Chunk ( fromList
 [60,104,116,109,108,62,10,60,104,101,97,100,62,60,116,105,116,108,101,62,66,97,115,32,118,97,110,32,68,105,106,107,60,47,116,105,116,108,101,62,60,47,104,101,97,100,62,10,60,98,111,100,121,62,10,60,104,49,62,66,97,115,32,118,97,110,32,68,105,106,107,60,47,104,49,62,10,10,60,112,62,69,109,97,105,108,58,32,60,97,32,104,114,101,102,61,34,109,97,105,108,116,111,58,47,47,118,46,100,105,106,107,46,98,97,115,64,103,109,97,105,108,46,99,111,109,34,62,118,46,100,105,106,107,46,98,97,115,64,103,109,97,105,108,46,99,111,109,60,47,97,62,60,47,112,62,10,10,60,112,62,78,105,99,107,32,111,110,32,73,82,67,58,32,60,116,116,62,98,97,115,118,97,110,100,105,106,107,60,47,116,116,62,60,47,112,62,10,10,60,97,32,104,114,101,102,61,34,104,116,116,112,58,47,47,119,119,119,46,104,97,115,107,101,108,108,101,114,115,46,99,111,109,47,117,115,101,114,47,98,97,115,118,97,110,100,105,106,107,47,34,62,10,32,32,60,105,109,103,32,115,114,99,61,34,104,116,116,112,58,47,47,119,119,119,46,104,97,115,107,101,108,108,101,114,115,46,99,111,109,47,115,116,97,116,105,99,47,98,97,100,103,101,46,112,110,103,34,32,10,32,32,32,32,32,32,32,97,108,116,61,34,73,39,109,32,97,32,72,97,115,107,101,108,108,101,114,34,10,32,32,32,32,32,32,32,98,111,114,100,101,114,61,34,48,34,62,10,60,47,97,62,10,10,60,112,62,83,101,101,32,109,121,32,60,97,32,104,114,101,102,61,34,104,116,116,112,115,58,47,47,103,105,116,104,117,98,46,99,111,109,47,98,97,115,118,97,110,100,105,106,107,34,62,71,105,116,72,117,98,60,47,97,62,32,112,97,103,101,32,102,111,114,32,97,32,108,105,115,116,32,111,102,32,112,114,111,106,101,99,116,115,32,73,32,119,111,114,107,32,111,110,46,60,47,112,62,10,10,60,47,98,111,100,121,62,10,60,47,104,116,109,108,62,10])
 Empty

 Personally, I don't work in ghci that often so I don't care that much
 if we have or don't have specialized Show instances for Vectors of
 Word8s.

 So what do other people think about this?

 Actually, for my current use case of Bytestrings (binary encoding of
 graphs using existing encoding schemes), I would prefer this
 [Word8]-based Show instance as it would help with my debugging, since
 the output looks along the lines of: Chunk (fromList
 [3,2,3,0,3,1,3,0,2,2,1,0]).  I am the first to admit that my use case
 is probably different from others though.


 And I often work with mixed text/binary data (eg. text annotations in
 video streams). I'd want the Show/Read instances to be in the form of
 a hexdump with char representation alongside (like xxd or od -xc
 output). It roundtrips well, so why not? :-)

So it seems that (1) people have very different requirements and (2) the
Show instance only really matters for debugging in ghci. Here is a
thought. What if ghci allowed Show instances to be overridden dynamically?
So you could put your preferred Show instance for Vector Word8 in you
.ghci file and ghci would use that when displaying stuff (but not when
actually evaluating things). Would that solve most of the problems without
messing with vector's Show instances?

Roman






Re: [Haskell-cafe] ANNOUNCE: vector-bytestring-0.0.0.0

2011-10-19 Thread Roman Leshchinskiy
Ivan Lazar Miljenovic wrote:
 On 19 October 2011 22:09, Roman Leshchinskiy r...@cse.unsw.edu.au wrote:

 So it seems that (1) people have very different requirements and (2) the
 Show instance only really matters for debugging in ghci. Here is a
 thought. What if ghci allowed Show instances to be overridden
 dynamically?
  So you could put your preferred Show instance for Vector Word8 in your
 .ghci file and ghci would use that when displaying stuff (but not when
 actually evaluating things). Would that solve most of the problems
 without
 messing with vector's Show instances?

 Would this hypothetical ghci feature also work for cases where you
 have a ByteString as part of another type that derives Show and Read?

Yes. The idea would be to evaluate the expression, then build the Show
instance for the type of the result taking the ghci overrides into account
and then use that to display the result. I have to admit that I have no
idea how difficult it would be to do this but surely it can't be that
hard.

Roman






Re: [Haskell-cafe] ANNOUNCE: vector-bytestring-0.0.0.0

2011-10-17 Thread Roman Leshchinskiy
Michael Snoyman wrote:
 On Mon, Oct 17, 2011 at 12:14 PM, Bas van Dijk v.dijk@gmail.com
 wrote:

 My idea is that when vector-bytestring is as fast as bytestring, it
 can replace it. When that happens it doesn't matter if users use the
 vector interface. I would even recommend it over using the bytestring
 interface so that bytestring can eventually be deprecated in favor of
 vector.

 +1. I'm in favor of using the OverlappingInstances/no newtype and
 specialized Show instance. I think that, if there was *ever* a case
 where OverlappingInstances was a good fit, it's this one. We're
 talking about a single module exporting both the base and overlapped
 instance, so which instance gets used should be completely decidable.
 (Unless of course someone defines an orphan instance elsewhere, but
 that's a different issue IMO.) And even in a worst-case-scenario where
 somehow we get the wrong instance, we're only talking about output
 used as a debugging aid, so the damage is minimal.

So suppose we change the Show and Read instances for Storable vectors of
Word8 and Char. What happens with unboxed and boxed vectors of these
types? Should these be changed as well? If not, why not?

Roman






Re: [Haskell-cafe] ANNOUNCE: vector-bytestring-0.0.0.0

2011-10-15 Thread Roman Leshchinskiy
On 14/10/2011, at 12:37, Bas van Dijk wrote:

 If there's need for a specific Show instance for Vectors of Word8s we
 can always add one directly to vector. (Roman, what are your thoughts
 on this?)

Personally, I think that ByteString and especially Vector Word8 aren't strings 
and shouldn't be treated as such. But I wouldn't be strongly against showing 
them as strings. However, I *am* strongly against using UndecidableInstances in 
vector and I don't see how to implement this without using them.

Roman





Re: [Haskell-cafe] ANNOUNCE: vector-bytestring-0.0.0.0

2011-10-15 Thread Roman Leshchinskiy

On 15/10/2011, at 12:26, Roman Leshchinskiy wrote:

 On 14/10/2011, at 12:37, Bas van Dijk wrote:
 
 If there's need for a specific Show instance for Vectors of Word8s we
 can always add one directly to vector. (Roman, what are your thoughts
 on this?)
 
 Personally, I think that ByteString and especially Vector Word8 aren't 
 strings and shouldn't be treated as such. But I wouldn't be strongly against 
 showing them as strings. However, I *am* strongly against using 
 UndecidableInstances in vector and I don't see how to implement this without 
 using them.

I meant OverlappingInstances, of course. To clarify, I would still consider it 
if everybody thinks it's a really good idea.

Roman





Re: [Haskell-cafe] Combining stream and list fusion

2011-10-12 Thread Roman Leshchinskiy
Bas van Dijk wrote:
 Hello,

 I'm trying to make the following faster:

 Data.Vector.Generic.fromList list

 where 'list' is some expression yielding a list.

Unfortunately, I don't think that's possible. The problem is that your
'list' will be expressed in terms of foldr/build and fromList would have
to produce a Stream, i.e., basically an unfoldr. But AFAIK, there is no
unfoldr/build fusion rule. There is one the other way round and vector
makes use of that in toList.

Roman






Re: Two Proposals

2011-10-06 Thread Roman Leshchinskiy
Manuel M T Chakravarty wrote:
 Roman Leshchinskiy:

 What data structures other than lists do we want to construct using list
 literals? I'm not really sure what the use cases are.

 Parallel arrays! (I want to get rid of our custom syntax.)

Why? Don't you think it is useful to have a visual indication of which
data structure you are using and what is going to be evaluated in
parallel?

In any case, if we want to get rid of the parallel array syntax, we have
to overload list literals, enumerations and list comprehensions. We have
the generic monadic desugaring for the latter but recovering an efficient
DPH program from that isn't trivial.

Roman






RE: Two Proposals

2011-10-05 Thread Roman Leshchinskiy
Simon Peyton-Jones wrote:

  I'm not sure if this plan would support [("fred",45), ("bill",22)] :: Map
 String Int.  Probably not.   Maybe that's a shortcoming... but such Maps
 are a rather surprising use of list literals.

What data structures other than lists do we want to construct using list
literals? I'm not really sure what the use cases are.

Roman






Re: Two Proposals

2011-10-04 Thread Roman Leshchinskiy
George Giorgidze wrote:

 This extension could also be used for giving data-parallel array literals
 instead of the special syntax used currently.

Unfortunately, it couldn't. DPH array literals don't (and can't really) go
through lists.

In general, if we are going to overload list literals then forcing the
desugaring to always go through lists seems wrong to me. There are plenty
of data structures where that might result in a significant performance
hit.

Roman






Re: Two Proposals

2011-10-04 Thread Roman Leshchinskiy
Yitzchak Gale wrote:
 Roman Leshchinskiy wrote:
 In general, if we are going to overload list literals then forcing the
 desugaring to always go through lists seems wrong to me. There are
 plenty
 of data structures where that might result in a significant performance
 hit.

 These are literals. So the lists will almost always be quite short,
 and they will be evaluated only once. So I don't think there will
 be that much of a performance hit normally.

Calling them literals is misleading, IMO. They won't necessarily be only
evaluated once:

f x = [x]

In DPH, it wasn't uncommon for certain benchmarks to spend 90% of the time
constructing arrays from [:x,y,z:] terms until we made a significant
effort to ensure that this doesn't happen. This is the only real data
point related to this that I have but it does indicate that making the
desugaring efficient is quite important.

 That said, my extension that allows them to be desugared
 at compile time would solve that issue if it arises.

Personally, I don't like having desugaring depend on TH at all. I'm not
sure there is a real need for it. This would, IMO, already be better
than fromList wrt efficiency:

class Cons a where
  type Elem a
  empty :: a
  cons  :: Elem a -> a -> a

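Under such a scheme, a literal like [x, y, z] would desugar to
cons x (cons y (cons z empty)), and an instance for, say, Data.Map could look
like this (a sketch; it needs TypeFamilies for the associated type):

  import qualified Data.Map as M

  instance Ord k => Cons (M.Map k v) where
    type Elem (M.Map k v) = (k, v)
    empty = M.empty
    cons (k, v) m = M.insert k v m
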
Roman






Re: [Haskell-cafe] instance Enum Double considered not entirely great?

2011-09-28 Thread Roman Leshchinskiy
On 25/09/2011, at 18:20, Chris Smith wrote:

 class Ord a => Range a where
    rangeFromTo :: a -> a -> [a] -- subsumes Ix.range / Enum.enumFromTo
    rangeFromThenTo :: a -> a -> a -> [a]
    inRange   :: (a, a) -> a -> Bool
 -- Does have instances for Float/Double.  List ranges desugar to this.
 -- Also has instances for tuples
 
 class Range a => InfiniteRange a where -- [1]
    rangeFrom :: a -> [a]
    rangeFromThen :: a -> a -> [a]
 -- Has instances for Float/Double
 -- No instances for tuples

I realise I'm slightly late to the discussion but IMO, the rangeFrom* (or 
enumFrom*) functions shouldn't be methods. Rather, a redesign of Enum should 
ensure that they can be defined generically for all types. The rationale is 
that other data structures (like arrays) want to provide similar functions 
without having to go through lists.

Roman





Re: [Haskell-cafe] instance Enum Double considered not entirely great?

2011-09-28 Thread Roman Leshchinskiy
On 28/09/2011, at 23:23, Ivan Lazar Miljenovic wrote:

 On 29 September 2011 07:56, Roman Leshchinskiy r...@cse.unsw.edu.au wrote:
 On 25/09/2011, at 18:20, Chris Smith wrote:
 
  class Ord a => Range a where
     rangeFromTo :: a -> a -> [a] -- subsumes Ix.range / Enum.enumFromTo
     rangeFromThenTo :: a -> a -> a -> [a]
     inRange   :: (a, a) -> a -> Bool
  -- Does have instances for Float/Double.  List ranges desugar to this.
  -- Also has instances for tuples
  
  class Range a => InfiniteRange a where -- [1]
     rangeFrom :: a -> [a]
     rangeFromThen :: a -> a -> [a]
 -- Has instances for Float/Double
 -- No instances for tuples
 
 I realise I'm slightly late to the discussion but IMO, the rangeFrom* (or 
 enumFrom*) functions shouldn't be methods. Rather, a redesign of Enum should 
 ensure that they can be defined generically for all types. The rationale is 
 that other data structures (like arrays) want to provide similar functions 
 without having to go through lists.
 
 Wouldn't this require something like the ListLike class?

Not at all. You could have something like:

class Enum a where
  enumFromToSize :: a -> a -> Integer
  advance :: a -> Integer -> a
  ...

And then [x..y] would desugar to map (advance x) (enumFromTo_Integer 0 $ 
enumFromToSize x y) where enumFromTo_Integer would be primitive. Of course, 
it's possible to design a much more efficient interface but this should give a 
general idea. An added benefit would be that you could generate the sequence in 
parallel (which is quite crucial for, e.g., DPH). Basically, the requirements 
would be that you can get the size of a range and compute the nth element of a 
range (or, equivalently, split the range) in constant time. Are there any Enum 
instances which don't satisfy this (apart from the broken floating point 
instances which *could* satisfy this)?
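
As a sketch of what that buys an array library (the class is renamed here so
that it doesn't clash with the existing Enum; no list is built at any point):

  import qualified Data.Vector as V

  class RangeEnum a where
    enumFromToSize :: a -> a -> Integer
    advance        :: a -> Integer -> a

  enumFromToVector :: RangeEnum a => a -> a -> V.Vector a
  enumFromToVector x y =
    V.generate (fromIntegral (enumFromToSize x y))
               (advance x . fromIntegral)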

As it stands, none of the array libraries that I've participated in designing 
and writing can use the Enum class properly (or, in the case of DPH, at all). 
For instance, vector has 230 lines of code (including comments) and 16 rules to 
implement enumFromTo (the vector version) halfway efficiently when the element 
type is known statically. I haven't bothered with enumFromThenTo so far. 
Interestingly, GHC's *list* library has to jump through similar hoops to make 
enumFromTo and enumFromThenTo work with foldr/build fusion (again, only when 
the element type is known statically). IMO, making enumFromThen and friends 
into methods just doesn't work, not even for lists really.

Roman



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Data.IArray rant

2011-09-06 Thread Roman Leshchinskiy
Jon Fairbairn wrote:
 Roman Leshchinskiy r...@cse.unsw.edu.au writes:

 No, arrays were not considered to be bad, they were designed
 with parallelism in mind.

I'm not sure how this can be the case if, as discussed below, most array
operations have to go through lists, an inherently sequential data
structure.

 It's rather that some considered the IArray API to be
 inadequate most of the time. Really, H98 arrays aren't very
 good at anything they do. For collective operations, you are
 supposed to convert the array to a list, work on the list and
 then convert it back to an array which just seems wrong.

 I am unconvinced that this is any more wrong than using a for
 loop in an imperative language.

Oh, definitely. In fact, I consider using a for loop even more wrong, you
really want to use collective operations instead whenever possible. But I
don't think for loops are the right benchmark for ease of use.

 Remember that the lists are
 lazy, so it’s misleading to say “convert the array to a list”
 since what happens most of the time is that elements are taken
 out of the array and passed to the processing function and then
 thrown away before the next element is processed.

Efficiency isn't even the biggest problem here. Whenever you want to
perform a collective operation on an array, you have to do the actual work
on an entirely different data structure. So I, as a programmer, have to
say: Convert this array to a list, then do something with the list and
then convert the resulting list back to an array. To me, at least, this
is very inconvenient and requires a context switch. No other widely used
container type requires you to do that.

 Multidimensional arrays can't be sliced and diced in the style
 of Repa or NumPy.

 I’m not familiar with Repa or NumPy, but what can they do that
 cannot be done with judicious use of ixmap, which is a very
 powerful mechanism.

Yes, it is quite powerful but not very convenient. Just as an example, if
A is a matrix, then A[3,:] gives you the 4th row and A[:,3] the 4th column
in NumPy. You can do that with ixmap but it's far more involved.

It also isn't powerful enough since it doesn't support certain highly
useful uses of shape polymorphism (an example is a generic concat which
decreases the dimensionality of any array by 1). This is discussed in
detail in http://www.cse.unsw.edu.au/~chak/papers/repa.pdf.

 In general, H98 arrays seem to have been designed with the
 goal of providing a container with O(1) indexing. They do
 that, I suppose, although they aren't very well integrated
 with the rest of the language

 Can you give examples?

My favourite is the interaction between arrays and Enum. If I want to
implement a generic enumFromTo m n that produces an array, I have to
create the list [m .. n], take its length (which forces the entire list
into memory) and then create an array of that length and fill it from the
list. There doesn't seem to be any other way of doing it. This is a
deficiency of Enum, really, but that's what I mean by not being well
integrated. There are other examples like this.
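
To illustrate the pattern (a hedged sketch using vector; enumVector is just an
illustrative name):

import qualified Data.Vector as V

enumVector :: Enum a => a -> a -> V.Vector a
enumVector m n = V.fromListN (length xs) xs
  where
    xs = [m .. n]  -- the length pass forces the spine, and the list is then
                   -- retained in full for the fill pass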

 and they have conceptual problems (such as requiring two
 bounds checks for one array access).

 Assuming that you mean that for safe array access where nothing
 is known about the index at compile time, since any sort of
 array has at least a beginning and an end, they all require two
 bounds checks. Once you do know something about the index, it’s
 a question of implementation.

That's not what I meant. You really need to do two full bounds checks, one
against inRange and one against the actual length of the array. Here is
the relevant comment from the GHC libraries:

Note [Double bounds-checking of index values]
~
When you index an array, a!x, there are two possible bounds checks we
might make:

  (A) Check that (inRange (bounds a) x) holds.

  (A) is checked in the method for 'index'

  (B) Check that (index (bounds a) x) lies in the range 0..n,
  where n is the size of the underlying array

  (B) is checked in the top-level function (!), in safeIndex.

Of course it *should* be the case that (A) holds iff (B) holds, but that
is a property of the particular instances of index, bounds, and inRange,
so GHC cannot guarantee it.

 * If you do (A) and not (B), then you might get a seg-fault,
   by indexing at some bizarre location.  Trac #1610

 * If you do (B) but not (A), you may get no complaint when you index
   an array out of its semantic bounds.  Trac #2120

At various times we have had (A) and not (B), or (B) and not (A); both
led to complaints.  So now we implement *both* checks (Trac #2669).

Roman




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Idiomatic usage of the fixpoint library

2011-09-05 Thread Roman Leshchinskiy
Roman Cheplyaka wrote:

 {-# LANGUAGE TypeFamilies, FlexibleContexts, UndecidableInstances,
 FlexibleInstances #-}
 import Data.Fixpoint

 newtype Expr = Expr { unExpr :: Pre Expr Expr }

 instance Functor (Pre Expr) => Fixpoint Expr where
 data Pre Expr a
 = Add a a
 | Const Int
 project = unExpr
 inject = Expr

 instance Functor (Pre Expr) where
 fmap f (Const x) = Const x
 fmap f (Add x1 x2) = Add (f x1) (f x2)

 eval = cata eval' where
 eval' (Const x) = x
 eval' (Add x1 x2) = x1 + x2

 There are some issues with this code, compared to simply using

 newtype Fix f = In { out :: f (Fix f) }

 to build an Expr.

 1. Since 'Pre' is a data (not type) family, we cannot simply make use of
a functor defined elsewhere. We need to define the functor inside the
instance declaration (or at least wrap an existing functor).

Yes, it would be nicer if it was a type family. There is a single reason
why this isn't the case but I find that reason pretty compelling: you
couldn't type hylo if it was.

 2. I wasn't able to derive the Functor instance, getting an error

 Derived instance `Functor (Pre Expr)'
   requires illegal partial application of data type family Pre
 In the data type instance declaration for `Pre'

That's really a GHC problem. There is no reason why it shouldn't be able
to do this.

 3. Having to use UndecidableInstances makes me feel a bit uncomfortable.

You don't need UndecidableInstances. Just get rid of the Functor (Pre
Expr) constraint on the Fixpoint Expr instance; it doesn't do anything
anyway.
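
Applied to the code above, the instance would simply read (as far as I can
tell, nothing else needs to change):

instance Fixpoint Expr where
    data Pre Expr a
        = Add a a
        | Const Int
    project = unExpr
    inject = Expr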

Roman




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Data.IArray rant

2011-09-03 Thread Roman Leshchinskiy
On 03/09/2011, at 03:04, Ivan Lazar Miljenovic wrote:

 On 3 September 2011 11:38, Evan Laforge qdun...@gmail.com wrote:
  The result is that my first contact with haskell
 arrays left me with the impression that they were complicated, hard to
 use, and designed for someone with different priorities than me.  Of
 course, Data.Vector didn't exist back then, but if someone were new to
 haskell now I would recommend they skip Data.IArray and head straight
 for vector.
 
 To an extent, I wonder how much of this has been that arrays were
 considered to be bad in Haskell, so no-one used them and no-one
 bothered to try and improve the API much (and instead went and created
 Vector, etc.).

It's rather that some considered the IArray API to be inadequate most of the 
time. Really, H98 arrays aren't very good at anything they do. For collective 
operations, you are supposed to convert the array to a list, work on the list 
and then convert it back to an array which just seems wrong. Multidimensional 
arrays can't be sliced and diced in the style of Repa or NumPy. In general, H98 
arrays seem to have been designed with the goal of providing a container with 
O(1) indexing. They do that, I suppose, although they aren't very well 
integrated with the rest of the language and they have conceptual problems 
(such as requiring two bounds checks for one array access). But requirements 
have shifted quite a bit since then. Now, people want to write real array 
programs and they want those to be fast. Personally, I don't know how to 
improve the H98 array API to provide this. You basically need to create a 
completely new API based on different principles.

Roman



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] attoparsec and vectors

2011-06-29 Thread Roman Leshchinskiy
Gregory Collins wrote:
 On Tue, Jun 28, 2011 at 6:20 PM, Eric Rasmussen ericrasmus...@gmail.com
 wrote:

 It runs quickly, but I naively thought I could outperform it by
 reworking many to build a vector directly, instead of having to build
 a list first and then convert it to a vector:

 manyVec :: Alternative f => f a -> f (V.Vector a)
 manyVec v = many_v
   where many_v = some_v <|> pure V.empty
         some_v = V.cons <$> v <*> many_v


 That's an O(n^2) loop, and a thunk leak to boot. If you don't know the
 size of the vector ahead of time, the only way I can think of to beat
 Vector.fromList is to use a mutable vector with a highwater mark,
 and double the size if you fill it. At the end, you'd use unsafeFreeze to
 turn the mutable vector into a pure one, and unsafeTake to truncate the
 vector into the correct size.

That's basically what fromList does. You could do this at a higher
abstraction level by generating a Stream rather than a list and then using
unstream to create a vector. I don't know if it's possible to do that with
attoparsec. But you'd only save allocating and deallocating a lazily
consumed list anyway. I'm not sure if it will even be noticeable compared
to how much parsing costs.

 For an example of a similar technique (minus the freezing part), I did
 a similar thing in the hashtables library:

You might be interested in 'grow' :-)

http://hackage.haskell.org/packages/archive/vector/0.7.1/doc/html/Data-Vector-Generic-Mutable.html#g:8
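
For illustration, a rough sketch of the doubling technique in terms of
vector's public API (the function name, the initial buffer size and the use of
a pure step function instead of a parser are all my own; treat it as a sketch,
not as how attoparsec does it):

{-# LANGUAGE BangPatterns #-}
import Control.Monad.ST (runST)
import qualified Data.Vector as V
import qualified Data.Vector.Mutable as M

unfoldVector :: (s -> Maybe (a, s)) -> s -> V.Vector a
unfoldVector step s0 = runST (M.new 16 >>= \buf -> go buf 0 s0)
  where
    go buf !i s = case step s of
      Nothing -> fmap (V.unsafeTake i) (V.unsafeFreeze buf)
      Just (x, s')
        | i < M.length buf -> M.unsafeWrite buf i x >> go buf (i + 1) s'
        | otherwise -> do
            buf' <- M.grow buf (M.length buf)  -- double the capacity
            M.unsafeWrite buf' i x
            go buf' (i + 1) s'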

Roman




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Addr# field in ForeignPtr

2011-06-01 Thread Roman Leshchinskiy
Hi all,

GHC defines ForeignPtr as:

data ForeignPtr a = ForeignPtr Addr# ForeignPtrContents
-- we cache the Addr# in the ForeignPtr object, but attach
-- the finalizer to the IORef (or the MutableByteArray# in
-- the case of a MallocPtr).  The aim of the representation
-- is to make withForeignPtr efficient; in fact, withForeignPtr
-- should be just as efficient as unpacking a Ptr, and multiple
-- withForeignPtrs can share an unpacked ForeignPtr.  Note
-- that touchForeignPtr only has to touch the ForeignPtrContents
-- object, because that ensures that whatever the finalizer is
-- attached to is kept alive.

Is it ok to modify the Addr# field? Or do the libraries assume that it always 
points to the start of the block described by ForeignPtrContents? Changing the 
Addr# field would be quite useful for pointing to the middle of a memory block. 
For instance, Storable vectors are currently defined as:

data Vector a = Vector {-# UNPACK #-} !(Ptr a)
   {-# UNPACK #-} !Int
   {-# UNPACK #-} !(ForeignPtr a)

If I could use the Addr# field of the ForeignPtr, I could get rid of the Ptr.
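
That is, the representation could shrink to something like this (hypothetical,
just to illustrate what getting rid of the Ptr would look like):

import Foreign.ForeignPtr (ForeignPtr)

data Vector a = Vector {-# UNPACK #-} !Int             -- length
                       {-# UNPACK #-} !(ForeignPtr a)  -- already offset to the
                                                       -- start of the slice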

Alternatively, I could define my own version of ForeignPtr if only 
ForeignPtrContents were exported (abstractly would be enough).

Roman



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: TypeFamilies vs. FunctionalDependencies type-level recursion

2011-05-31 Thread Roman Leshchinskiy
On 30/05/2011, at 00:55, dm-list-haskell-pr...@scs.stanford.edu wrote:

 I'm absolutely not advocating making overlapping instances (or, worse,
 overlapping types) part of Haskell', nor under the impression that the
 committee would ever consider doing so.  I'm just pointing out that
 right now OverlappingInstances are the only way to do recursive
 programming at the type level, for the specific reasons I outlined.  I
 hope that before FunctionalDependencies or TypeFamilies or any other
 type-level programming becomes part of Haskell', there is a way to
 differentiate base and recursive cases *without* overlapping
 instances.

FWIW, I don't think this is really about type-level recursion. You can do 
recursive programming with type families:

data Z
data S n

type family Plus m n
type instance Plus Z n = n
type instance Plus (S m) n = S (Plus m n)

It's deciding type equality via overlapping instances that is problematic here. 
But, as others have pointed out, this is somewhat dodgy anyway. I suppose what 
you really want is something like this:

data True
data False

type family Equal a b

Where Equal a b ~ True if and only if a and b are known to be the same type and 
Equal a b ~ False if and only if they are known to be different types. You 
could, in theory, get this by defining appropriate instances for all type 
constructors in a program:

type instance Equal Int Int = True
type instance Equal Int [a] = False
type instance Equal [a] Int = False
type instance Equal [a] [b] = Equal a b
...

But that's infeasible, of course. However, nothing prevents a compiler from 
providing this as a built-in. Arguably, this would be much cleaner than the 
solution based on fundeps and overlapping instances.
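
With such a built-in, ordinary (non-overlapping) type families could branch on
the result; a minimal sketch, assuming the Equal family and the True/False
types above:

type family If c t e
type instance If True  t e = t
type instance If False t e = e

-- recursive type-level code can then test equality without overlap, e.g.
-- by writing If (Equal a b) ... ... in the right-hand side of an instance.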

Roman



___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: [Haskell-cafe] Is fusion overrated?

2011-05-18 Thread Roman Leshchinskiy
Roman Cheplyaka wrote:

 Of course I don't claim that fusion is useless -- just trying to
 understand the problem it solves. Are we saving a few closures and cons
 cells here?

In addition to what everyone else said, fusion can be a big win when it
allows further optimisations. For instance, fusing map (+1) . map (+2) can
eliminate 1 addition per iteration. Even without taking allocation into
account, most of the reasons for why loop fusion is a worthwhile
optimisation in C apply to Haskell, too!
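
A minimal illustration (GHC's list fusion effectively turns the first form
into the second, and the simplifier can then combine the two additions in the
single remaining loop body):

twoPasses, onePass :: [Int] -> [Int]
twoPasses = map (+1) . map (+2)   -- two traversals, one addition each
onePass   = map ((+1) . (+2))     -- what fusion produces: one traversal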

Roman




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Broken ghc-7.0.3/vector combination?

2011-04-20 Thread Roman Leshchinskiy
Daniel Fischer wrote:

 Further investigation of the sorting code in vector-algorithms revealed
 no bugs there, and if the runtime was forced to keep a keen eye on the
 indices, by replacing unsafeRead/Write/Swap with their bounds-checked
 counterparts or by 'trace'ing enough of their uses, the NaNs did not
 appear.

Did you replace them in vector-algorithms or in vector itself?

 So, is it possible that some change in ghc-7.0.3 vs. the previous
 versions caused a bad interaction between ghc-optimisations and vector
 fusion resulting in bad vector reads/writes?

Am I right in assuming that this happens in code which uses only mutable
vectors? Fusion only works for immutable ones so it shouldn't really
affect things here.

Have you tried playing around with code generation flags like -msse2?

In any case, I would try to take a look at this if you tell me how to
reproduce.

Roman




___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Faster Array#/MutableArray# copies

2011-03-01 Thread Roman Leshchinskiy
Simon Marlow wrote:

 For small arrays like this maybe we should have a new array type that
 leaves out all the card-marking stuff too (or just use tuples, as Roman
 suggested).

Would it, in theory, be possible to have an unpacked array type? That
is, could we have constructors for which the length of the closure is
determined dynamically at runtime?

Roman




___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Faster Array#/MutableArray# copies

2011-03-01 Thread Roman Leshchinskiy
Simon Marlow wrote:
 On 01/03/2011 11:55, Roman Leshchinskiy wrote:


 Would it, in theory, be possible to have an unpacked array type? That
  is, could we have constructors for which the length of the closure is
 determined dynamically at runtime?

 Certainly, but the amount of effort to implement depends on what you
 want to support. e.g. do you want to support {-# UNPACK #-} on primitive
 array types in a constructor field?  That's probably quite hard.  I
 believe Duncan Coutts has been thinking along similar lines, we talked
 about it once.

I can see that supporting this would be rather hard:

data T a = T {-# UNPACK #-} (Array# a)

We would have to allow Array# to point to the middle of a closure and it's
far from obvious how to initialise this since we don't have Array#
literals.

 Or were you thinking of something more restricted?

Yes, I was thinking of some special syntax. Something along these lines:

data T a = T {a}

f x = T {x x x}
g (T {x y z}) = x
h (T xs) = xs{0}

I'm not seriously suggesting this syntax, this is just to demonstrate the
general idea. In the last function, it shouldn't be possible to do
anything with xs except indexing and taking the length.

This would be much easier, right?

Of course, we would also want this for byte arrays...

Roman




___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] ANN: unordered-containers - a new, faster hashing-based containers library

2011-02-23 Thread Roman Leshchinskiy
Johan Tibell wrote:

 I'm working on a patch that provides O(1) size right now. The trick is
 to define HashMap as:

 data HashMap k v = HM {-# UNPACK #-} !Int !(Tree k v)

Another possibility is:

data HashMap k v = HM Int !(Tree k v)

hashMap t = HM (treeSize t) t

That way size is O(n) on first use but O(1) afterwards. Then again, if
someone really needs this they can program it themselves. I've never
needed an O(1) size for maps.
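
With that representation the accessor is trivial, and laziness does the
caching (sketch, assuming the HM constructor above):

size :: HashMap k v -> Int
size (HM n _) = n  -- n stays an unevaluated treeSize t until first demanded,
                   -- and is a plain Int afterwards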

Roman




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Faster Array#/MutableArray# copies

2011-02-18 Thread Roman Leshchinskiy
Max Bolingbroke wrote:
 On 18 February 2011 01:18, Johan Tibell johan.tib...@gmail.com wrote:

 It seems like a sufficient solution for your needs would be for us to
 use the LTO support in LLVM to inline across module boundaries - in
 particular to inline primop implementations into their call sites. LLVM
 would then probably deal with unrolling small loops with statically known
 bounds.

Could we simply use this?

http://llvm.org/docs/LangRef.html#int_memcpy

Roman




___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Faster Array#/MutableArray# copies

2011-02-18 Thread Roman Leshchinskiy
Johan Tibell wrote:

 * Could we use built-in compiler rules to catch array copies of known
 length and replace them with e.g. unrolled loops? My particular use case
 involves copying small arrays (size: 1-32). Ideally this should be as fast
 as copying a tuple of the corresponding size but I'm pretty sure we're far
 off that goal.

Out of idle curiousity, couldn't you use tuples instead of arrays?

FWIW, I agree that doing something cleverer than just calling memcpy could
be very worthwhile. As Max points out, you could perhaps try to do
something with the LLVM backend.

Roman




___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Injective type families?

2011-02-16 Thread Roman Leshchinskiy
On 14/02/2011, at 21:28, Conal Elliott wrote:

 Is there a way to declare a type family to be injective?
 
 I have
 
  data Z
  data S n
 
  type family n :+: m
  type instance Z   :+: m = m
  type instance S n :+: m = S (n :+: m)

You could prove it :-)

class Nat n where
  induct :: p Z -> (forall m. p m -> p (S m)) -> p n

instance Nat Z where
  induct z _ = z

instance Nat n => Nat (S n) where
  induct z s = s (induct z s)

data P n1 n2 m where
  P :: (forall a. (m :+: n1) ~ (m :+: n2) => (n1 ~ n2 => a) -> a) -> P n1 n2 m

injective :: forall m n1 n2 a. (Nat m, (m :+: n1) ~ (m :+: n2))
          => n1 -> n2 -> m -> (n1 ~ n2 => a) -> a
injective _ _ _ x = case induct (P (\x -> x)) (\(P f) -> P f) :: P n1 n2 m of
  P f -> f x

This is a bit inefficient, of course, because it involves recursion. With a 
little bit of safe cheating, it is possible to get by without recursion, 
basically by making induction an axiom rather than proving it.

It would be nicer if the compiler could prove it for us, of course.

Roman




___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] rewrite rules to specialize function according to type class?

2011-02-15 Thread Roman Leshchinskiy
Max Bolingbroke wrote:
 2011/2/15 Simon Peyton-Jones simo...@microsoft.com:

 but currently any pragmas in a class decl are treated as attaching to
 the *default method*, not to the method selector:

 I see. I didn't realise that that was what was happening. Personally I
 find this a bit surprising, but I can see the motivation. Of course, a
 sensible alternative design would be to have them control the selectors,
 and then you could declare that you want your default methods to be
 inlined like this:

 {{{
 class MyClass a where
    foo :: a -> a
   foo = default_foo

 {-# INLINE default_foo #-}
 default_foo = ... big expression ...
 }}}

I wouldn't necessarily expect this to guarantee inlining for the same
reason that the following code doesn't guarantee that foo gets rewritten
to big:

foo = bar
{-# INLINE bar #-}
bar = big

It might work with the current implementation (I'm not even sure if it
does) but it would always look dodgy to me.

Also, what if I write:

class MyClass a where
  foo :: a -> a
  foo x = default_foo x

I assume this wouldn't guarantee inlining?

 In any event, perhaps it would be worth warning if you write an INLINE
 pragma for some identifier in a class declaration where no corresponding
 default method has been declared, in just the same way you would if you
 wrote an INLINE pragma for a non-existent binding?

+1

Roman




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] rewrite rules to specialize function according to type class?

2011-02-15 Thread Roman Leshchinskiy
Max Bolingbroke wrote:
 On 15 February 2011 11:23, Roman Leshchinskiy r...@cse.unsw.edu.au wrote:

 I wouldn't necessarily expect this to guarantee inlining for the same
 reason that the following code doesn't guarantee that foo gets rewritten
  to big:

 foo = bar
 {-# INLINE bar #-}
 bar = big

 It might work with the current implementation (I'm not even sure if it
 does) but it would always look dodgy to me.

 In this case there doesn't seem to be any point inlining anyway,
 because nothing is known about the context into which you are inlining.
 Nonetheless, what will happen (I think) is that any users of
 foo will get the definition of foo inlined (because that doesn't
 increase program size) so now they refer to bar instead. Now GHC can look
 at the use site of bar and the definition of bar and decide whether it is
 a good idea to inline.

Ah, but you assume that bar won't be inlined into foo first. Consider that
it is perfectly acceptable for GHC to generate this:

foo = big
{-# INLINE bar #-}
bar = big

We did ask to inline bar, after all.

 Basically, I expect the small RHS for the default in my class
 declaration to be inlined unconditionally, and then GHCs heuristics will
 determine how and when to inline the actual default definition (e.g.
 default_foo).

As soon as GHC generates a Core term for the RHS of the default method all
bets are off because it might inline default_foo into that term which
would make it too big to be inlined somewhere else. I thought you were
suggesting to treat foo = default_foo specially by not generating a
separate RHS for the default definition of foo and just rewriting it to
default_foo instead.

What it basically comes down to is a staging problem. You don't want
default_foo to be inlined into the RHS of foo before the latter is inlined
but the only way to achieve this is by marking foo as INLINE which is
precisely what you want to avoid.

Roman




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] rewrite rules to specialize function according to type class?

2011-02-15 Thread Roman Leshchinskiy
Max Bolingbroke wrote:
 On 15 February 2011 15:12, Roman Leshchinskiy r...@cse.unsw.edu.au wrote:

 Ah, but you assume that bar won't be inlined into foo first. Consider
 that it is perfectly acceptable for GHC to generate this:

 foo = big
 {-# INLINE bar #-}
 bar = big

 We did ask to inline bar, after all.


 Well, yes, but when considering the use site for foo don't we now
 inline the *original RHS* of foo? This recent change means that it doesn't
 matter whether bar gets inlined into foo first - use sites of foo will
 only get a chance to inline the bar RHS.

Only if foo has an INLINE pragma. Otherwise, GHC uses whatever RHS is
available when it wants to inline.

Roman




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] rewrite rules to specialize function according to type class?

2011-02-15 Thread Roman Leshchinskiy
Max Bolingbroke wrote:
 On 15 February 2011 16:45, Roman Leshchinskiy r...@cse.unsw.edu.au wrote:

 Only if foo has an INLINE pragma. Otherwise, GHC uses whatever RHS is
 available when it wants to inline.

 Ah, I see! Well yes, in that case my workaround is indeed broken in
 the way you describe, and there is no way to repair it because in my
 proposal you wouldn't be able to write an INLINE pragma on the actual
 default method definition.

There is an alternative, actually. When compiling a module with a function
that doesn't have an INLINE pragma, GHC uses its optimised rhs for
inlining in every stage and then records its unfolding for use in other
modules if it is small enough to be inlined. This has some unfortunate
(IMO) implications. Consider the following code:

{-# INLINE [1] f #-}
f = big
g = f
h = g

Will big be inlined into h? This depends on the module that h is defined
in. If it's in the same module as g, then g will most likely be inlined
into h in phase 2, i.e., before f has been inlined into g. Then, f will be
inlined into both g and h in phase 1. However, after f is inlined into g,
g's rhs becomes too big for inlining. So if h is defined in a different
module, g won't be inlined into it.

We could just as well say that a function's rhs should be recorded forever
as soon as it becomes small enough to be considered for inlining. So GHC
could notice that g is very small in phase 2 and basically add an
INLINABLE pragma to it at that point, regardless of what happens to its
rhs afterwards. This would ensure that inlining isn't affected by
splitting things into modules and would probably also make your proposal
work. But it would also result in a lot more inlining compared to now.

Roman




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Vector library

2011-02-14 Thread Roman Leshchinskiy
Pierre-Etienne Meunier wrote:

 This is mostly a question for Roman : how do you use your vector library
 with multi-dimensional arrays ? I mean, the array library from the
 standard libraries does something more intelligent than the C-like
 solution with indirections.

Vector doesn't include any support for multidimensional arrays. This is by
design as the library has exactly one purpose: to provide a fast
implementation of contiguous, one-dimensional arrays. As you point out, it
is well possible to build multidimensional arrays on top of it but that
would be a separate library. You might want to take a look at Repa
(http://hackage.haskell.org/package/repa) which does exactly that (it sits
on top of DPH which sits on top of vector). It also gives you parallelism.

FWIW, I don't think we've nailed the right API for multidimensional
arrays yet. It's a hard problem. But we are getting there.

Roman




___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Reform of the Monad, and Disruptive Change

2011-02-04 Thread Roman Leshchinskiy
On 04/02/2011, at 10:49, Dark Lord wrote:

 I thoroughly agree with this. However, in the event that this does not 
 happen, piecemeal fixes are better than none.

FWIW, I disagree. To put it bluntly, why is repeatedly breaking a lot of code 
better than not breaking it at all? Breaking a lot of code once might be ok 
because the benefits of fixing many issues probably outweigh the costs. But for 
each individual change (such as the Monad redesign), the costs far outweigh the 
benefits, IMO.

 (Seeing as the inertia in Haskell is such that Haskell 2011 was cancelled, 
 and Haskell Platform 2011 contains no new packages, such a task force doesn't 
 seem very likely.)

Introducing backwards-incompatible changes into a language standard *should* be 
hard.

Roman



___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: Reform of the Monad, and Disruptive Change

2011-02-04 Thread Roman Leshchinskiy
On 04/02/2011, at 10:08, Malcolm Wallace wrote:

 I suggested, and several people +1'd, that if we are making disruptive 
 changes to the standard libraries defined in the Language Report (especially 
 the Prelude), then we should aim to make a thorough job of cleaning up all 
 the cruft and redesigning in a single strike.  This means not just 
 rearranging the Monad hierarchy, but looking at I/O types, exceptions, the 
 default strictness of foldl, and much much more.

Is there a list of known issues for the standard libraries somewhere? Would it 
perhaps make sense to create a design bug tracker for them?

 Then (for instance) ghc could make a major release with the refreshed 
 libraries, and after a little experience in the field (and perhaps a few 
 patches), the libraries would then proceed to be blessed as part of the 
 subsequent language standard.

Perhaps GHC could be released with two sets of libraries. This would give 
people time to experiment without breaking existing code. It would also make 
implementing individual changes much easier.

Roman



___
Haskell-prime mailing list
Haskell-prime@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-prime


Re: Behavior of the inliner on imported class methods

2011-01-18 Thread Roman Leshchinskiy
Have you tried adding another (dummy) method to the class? GHC used to have
problems with optimising single-method classes.

Roman

On 18 Jan 2011, at 10:33, José Pedro Magalhães j...@cs.uu.nl wrote:

 Hello all,
 
 I fail to understand the behavior of the inliner in the following example:
 
 module M1 where
 
 class MyEnum a where myEnum :: [a]
 
 instance MyEnum () where myEnum = [()]
  
 module M2 where
 
 import M1
 
 f1 = map (\() -> 'p') [()]
 f2 = map (\() -> 'q') myEnum
 
 The generated core code for M2 with ghc-7.0.1 -O is:
 
 M2.f22 :: GHC.Types.Char
 [GblId,
  Caf=NoCafRefs,
  Str=DmdType m,
  Unf=Unf{Src=vanilla, TopLvl=True, Arity=0, Value=True,
  ConLike=True, Cheap=True, Expandable=True,
  Guidance=IF_ARGS [] 1 2}]
 M2.f22 = GHC.Types.C# 'q'
 
 M2.f11 :: GHC.Types.Char
 [GblId,
  Caf=NoCafRefs,
  Str=DmdType m,
  Unf=Unf{Src=vanilla, TopLvl=True, Arity=0, Value=True,
  ConLike=True, Cheap=True, Expandable=True,
  Guidance=IF_ARGS [] 1 2}]
 M2.f11 = GHC.Types.C# 'p'
 
 M2.f21 :: () -> GHC.Types.Char
 [GblId,
  Arity=1,
  Caf=NoCafRefs,
  Unf=Unf{Src=vanilla, TopLvl=True, Arity=1, Value=True,
  ConLike=True, Cheap=True, Expandable=True,
  Guidance=ALWAYS_IF(unsat_ok=True,boring_ok=True)}]
 M2.f21 =
    \ (ds_dch :: ()) -> case ds_dch of _ { () -> M2.f22 }
 
 M2.f2 :: [GHC.Types.Char]
 [GblId,
  Str=DmdType,
  Unf=Unf{Src=vanilla, TopLvl=True, Arity=0, Value=False,
  ConLike=False, Cheap=False, Expandable=False,
  Guidance=IF_ARGS [] 3 0}]
 M2.f2 =
   GHC.Base.map
 @ () @ GHC.Types.Char M2.f21 M1.$fMyEnum()_$cmyEnum
 
 M2.f1 :: [GHC.Types.Char]
 [GblId,
  Caf=NoCafRefs,
  Str=DmdType,
  Unf=Unf{Src=vanilla, TopLvl=True, Arity=0, Value=True,
  ConLike=True, Cheap=True, Expandable=True,
  Guidance=IF_ARGS [] 1 3}]
 M2.f1 =
   GHC.Types.:
 @ GHC.Types.Char M2.f11 (GHC.Types.[] @ GHC.Types.Char)
 
 So, why does the inliner fail to get rid of the map in f2, while correctly 
 ditching it in f1? Note that using two modules is essential here: if the 
 instance is in M2 (and thus becoming orphan), the inliner works correctly. 
 Adding INLINE/INLINABLE pragmas to myEnum doesn't improve things either. Is 
 this a bug, or is there a reason for this behavior?
 
 
 Thanks,
 Pedro
 ___
 Glasgow-haskell-users mailing list
 Glasgow-haskell-users@haskell.org
 http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Can't make sense of newArray# docs

2011-01-18 Thread Roman Leshchinskiy
On 18/01/2011, at 22:18, Johan Tibell wrote:

 The docs for newArray# states:
 
 Create a new mutable array of specified size (in bytes), in the
 specified state thread, with each element containing the specified
 initial value.

The docs are wrong.

 I'm trying to implement
 the following array type:
 
data MutableArray s a = Array { unArray :: !(MutableArray# s a) }

Have a look at the implementation of 
http://hackage.haskell.org/packages/archive/primitive/0.3.1/doc/html/Data-Primitive-Array.html

Roman



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: RFC: migrating to git

2011-01-13 Thread Roman Leshchinskiy
On 12 Jan 2011, at 23:31, Edward Z. Yang ezy...@mit.edu wrote:

 Excerpts from Roman Leshchinskiy's message of Wed Jan 12 18:20:25 -0500 2011:
 How would we get the current functionality of darcs-all pull? Is it even 
 possible?
 
 Here is the rebase-y workflow.

Thank you for making things clearer!

 
 # pull the latest patches for GHC, and sticks your patchset on top
 git pull --rebase
 # resolve any conflicts that occured during rebase
 # register any new submodules (if any)
 git submodule init
 # make your submodules reflect the latest version GHC has
 git submodule update --rebase

This doesn't pull in all base patches, though, just the ones that GHC depends 
on, right? How would I get all base patches?
 

 
Roman
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: RFC: migrating to git

2011-01-12 Thread Roman Leshchinskiy
On 12/01/2011, at 09:22, Simon Marlow wrote:

 On 11/01/2011 23:11, Roman Leshchinskiy wrote:
 
 A quick look at the docs seems to indicate that we'd need to do
 
 git pull
 git submodule update
 
 which doesn't look like a win over darcs-all. Also, I completely fail to 
 understand what git submodule update does. It doesn't seem to pull all 
 patches from the master repo. The git submodule docs are even worse than the 
 rest of the git docs which is rather discouraging.
 
 True, however the build system could automatically check whether you had 
 missed this step, because it could check the hashes.

That would be an improvement. How do you pull submodule patches which the main 
repo doesn't depend on, though? Out of curiosity, has anyone here used 
submodules for something similar to what we would need?

 Thomas says that it doesn't do automatic dependency tracking which looks 
 like a huge weakness to me. Personally, I haven't been able to successfully 
 unpull non-consecutive chunks of patches with git so far but I only tried 2 
 or 3 times before giving up.
 
 Right, not being able to automatically commute patches is a regression 
 compared to darcs.  Git isn't universally better than darcs, which is why 
 we're having this discussion - the question is, do the advantages outweigh 
 the disadvantages.

Oh, definitely, I wasn't implying than one is somehow objectively better than 
the other. All I'm saying is that darcs is much better suited to my personal 
workflow than git. Or at least the very small part of git that I've been able 
to figure out (I do have to say that I've probably read about 3x as much about 
git as I ever read about darcs, though).

Roman



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: RFC: migrating to git

2011-01-12 Thread Roman Leshchinskiy
On 12/01/2011, at 22:22, Iavor Diatchki wrote:

 When you issue the command git submodule update, you are telling git to 
 advance the sub-module repo to the expected version (i.e., where the 
 pointer points to).  The reason this does not happen automatically is that 
 you might have also made changes to the submodule, so you might want to do 
 some merging there, instead of just pulling.

Thank you so much for the explanation. Sadly, I'm still confused. Are you 
saying that submodule update is the wrong thing to do if I have changes in 
some of the submodules?

 One thing to note is that if we were to set things up with sub-modules, then 
 every now and then we would have to advance the GHC's expected pointer for 
 various libraries to the latest (or a newer) version.  Of course, we could 
 have a script do this but, at least in theory, when someone makes a commit 
 which updates the version of a sub-module, they are asserting that they 
 things ought to work with the newer version of the sub-module.

How would we get the current functionality of darcs-all pull? Is it even 
possible?

Suppose I want to hack on GHC and base (base is a submodule of GHC). For this, 
I want to:

  - pull the latest patches to both GHC and base
  - write code
  - record my patches in both GHC and base
  - pull again to get whatever patches have been pushed while I was hacking
  - validate
  - push my patches to both GHC and base

Which commands would accomplish this?

The git docs still don't make any sense to me. FWIW, I'd be very wary of using 
any features that are so badly documented. Or programs, for that matter.

Roman



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: RFC: migrating to git

2011-01-11 Thread Roman Leshchinskiy
On 11/01/2011, at 16:14, Tony Finch wrote:

 On Mon, 10 Jan 2011, Roman Leshchinskiy wrote:
 
 It also seems to make finding buggy patches rather hard.
 
 Have a look at `git bisect`.

I'm aware of git bisect. It doesn't do what I want. I usually have a pretty 
good idea of which patch(es) might have caused a problem and I want to unpull 
it and its dependencies. This is easy in darcs; I have no idea how to do that 
in git.

Roman



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: RFC: migrating to git

2011-01-11 Thread Roman Leshchinskiy
On 11/01/2011, at 21:41, Iavor Diatchki wrote:

 If GHC and the libraries on which it depends were in git (migrated, or 
 mirrored), then we could use git sub-modules to track the dependencies 
 between changes to GHC and changes to the libraries. 
 
 Roughly, the workflow would be like this:
 1. Make a change to the library and commit it.
 2. Make a change to GHC.
 3. Make a GHC commit which records the change and the dependency on the 
 commit in the library repository.

What about dependencies which go the other way? Actually, the dependency is 
often mutual: the GHC change won't work without the library change and the 
library change won't work without the GHC change. Does git support this?

 This is useful because when someone gets the changes to GHC, they would know 
 that they need to update their library as well (and there is tool support to 
 make all updates automatically). This kind of dependency is not at all 
 obvious with our current workflow.

IMO, darcs-all works pretty well. I don't think I ever really had problems with 
missing library patches.

 The same method works for going back to a previous state of the project, 
 where one can rewind the libraries to their old versions too.

This would be useful. Unfortunately, git's rewinding seems rather crippled 
compared to darcs.

Roman



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: RFC: migrating to git

2011-01-11 Thread Roman Leshchinskiy
On 11/01/2011, at 22:20, Simon Marlow wrote:

 On 11/01/11 21:57, Roman Leshchinskiy wrote:
 IMO, darcs-all works pretty well. I don't think I ever really had
 problems with missing library patches.
 
 I often see problems where someone has done 'darcs pull' rather than 
 './darcs-all pull' and ended up with a weird compilation error as a result.  
 If we could eliminate this source of errors, it would be a major win.

A quick look at the docs seems to indicate that we'd need to do

git pull
git submodule update

which doesn't look like a win over darcs-all. Also, I completely fail to 
understand what git submodule update does. It doesn't seem to pull all patches 
from the master repo. The git submodule docs are even worse than the rest of 
the git docs which is rather discouraging.

 This would be useful. Unfortunately, git's rewinding seems rather
 crippled compared to darcs.
 
 In what way?

Thomas says that it doesn't do automatic dependency tracking which looks like a 
huge weakness to me. Personally, I haven't been able to successfully unpull 
non-consecutive chunks of patches with git so far but I only tried 2 or 3 times 
before giving up.

Roman



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: RFC: migrating to git

2011-01-10 Thread Roman Leshchinskiy
On 10/01/2011, at 13:27, Simon Marlow wrote:

 On 10/01/2011 13:02, Max Bolingbroke wrote:
 However, I remember the last time this came up there were some issues
 that might make migration painful. From the top of my head:
 
 1) Some people expressed concern that they would have to use two
 revision control systems to work on GHC, because not all GHC
 dependencies would be git-based.
 
 It would be a prerequisite to switching that a GHC developer only has to use 
 one VCS.  So we either migrate dependencies to git, or mirror them in 
 GHC-specific git branches.

I'm not sure how that is going to work. It might well be possible to build GHC 
using only git. But most GHC developers also contribute to various libraries 
which are often quite intimately linked to GHC. In particular, GHC patches are 
often accompanied by library patches. Unless all those libraries switch to git, 
too, we'll have to use both git and darcs which would be *really* annoying.

Personally, I rather dislike git, mostly for the reasons that Malcolm already 
mentioned. Compared to darcs, it seems to get in the way much too often. It 
also seems to make finding buggy patches rather hard. But maybe I just don't 
know how to use it properly. In any case, a switch to git wouldn't deter me 
from contributing to GHC, but neither would a switch to any other VCS. I would 
certainly swear more often while developing, though.

Roman



___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] ghc/dph

2010-12-15 Thread Roman Leshchinskiy
On 14/12/2010, at 13:35, Johannes Waldmann wrote:

 I want to use dph (data parallel haskell) for a presentation.
 (Nothing fancy, just compile and run some demos.)
 What ghc version should I use and where do I get it?

That's a tricky question. We are currently working on getting DPH to work 
properly with GHC 7 but we aren't quite done yet. You might want to try 7.0.1 + 
the DPH sources from the darcs repo.

 I read the advice use HEAD but when I build
 from the 7.1.20101213 source snapshot,
 dph is not installed (should it be?)

Nowadays, you have to install DPH separately which isn't easy since we haven't 
released the packages yet. In any case, DPH doesn't work with the current HEAD 
at all because I haven't adapted it to the new superclass story yet.

Roman



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Type families again

2010-12-02 Thread Roman Leshchinskiy
On 2 Dec 2010, at 21:29, Andrew Coppin andrewcop...@btinternet.com wrote:

 Does anybody have any suggestions?

class Mappable t a b where
  type Rebind t a b
  map :: (a -> b) -> t -> Rebind t a b

This is based on an old C++ trick.
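
A hedged example instance for lists (assuming the class above; it needs a few
extensions):

{-# LANGUAGE TypeFamilies, MultiParamTypeClasses, FlexibleInstances #-}

instance Mappable [a] a b where
  type Rebind [a] a b = [b]
  map f = Prelude.map f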

Roman

 



___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: SpecConstr number of specializations (-fspec-constr-count)

2010-11-25 Thread Roman Leshchinskiy
On 25/11/2010, at 10:33, José Pedro Magalhães wrote:

 Is this a bug, or is the value of spec-constr-count being manipulated in some 
 way for certain passes?

spec-constr-count decreases for nested specialisations. For instance, if 
spec-constr-count is 6 and SpecConstr generates 2 specialisations for a 
function foo, then spec-constr-count will be 3 for all functions nested in foo. 
You can turn it off completely with -fno-spec-constr-count.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Loop optimisation with identical counters

2010-11-05 Thread Roman Leshchinskiy
On 05/11/2010, at 23:22, David Peixotto wrote:

 I spent some time looking at the code generated for llvm and the optimizations
 it can apply. There were quite a bit of details to examine and I wrote it up
 as blog post here:
 http://www.dmpots.com/blog/2010/11/05/optimizing-haskell-loops-with-llvm.html.

Nice! Thanks a lot for doing that!

 To summarize, I found that it is possible to get LLVM to do this
 transformation through a combination of tail-call elimination, inlining,
 induction variable optimization, and global value numbering. This works fine
 on x86_64 where we can pass parameters in registers, but fails to fully fire
 in i386 back end because LLVM gets caught up by aliasing problems because
 function parameters are passed on the stack. The possible aliasing between the
 stack pointer (Sp) and the function argument pointer (R1) prevented the full
 transformation, but it was still able to reduce the f loop to straight line 
 code.

Hmm... IIRC we agreed that Sp is never aliased in GHC-generated code and David 
Terei (I'm cc'ing here, not sure if he reads the list) made sure to include 
appropriate annotations in Haskell code. In fact, in your post Sp is passed as 
i32* noalias nocapture %Sp_Arg. Isn't that enough for LLVM to know that Sp 
isn't aliased?

 1. The ability of LLVM to optimize Haskell functions is limited by the calling
 convention. Particularly for i386, function arguments are passed on a stack
 that LLVM knows nothing about. The reads and writes to the stack look like
 arbitrary loads and stores. It has no notion of popping elements from the
 stack which makes it difficult to know when it is ok to eliminate stores to
 the stack. 

But shouldn't it just promote stack locations to registers?

 2. The possible aliasing introduced by casting integer arguments
 (R1-R6) to pointers limits the effectiveness of its optimizations.

Yes, that's a big problem. David tried to solve some of it by including noalias 
annotations but it's not clear what to do with, say, newly allocated ByteArrays 
which can't be aliased by anything.

Anyway, it's great to know that there are things we can improve to make LLVM 
optimise better.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Loop optimisation with identical counters

2010-11-05 Thread Roman Leshchinskiy
On 06/11/2010, at 00:28, David Peixotto wrote:

 Yes, the LLVM code has Sp, Hp, Base all annotated as noalias. I believe that 
 Sp, Hp, and Base should never alias, but a (boxed) R1 should always alias 
 with either Sp or Hp. I had a hard time determining exactly how LLVM uses the 
 noalias annotation, but playing with opt -print-alias-sets I saw that Sp was 
 a MayAlias with the pointers derived from R1. I would guess that casting an 
 int to a pointer (like we do for R1) makes that pointer MayAlias with 
 everything regardless of the noalias annotation.

Are you sure about R1 aliasing Sp? AFAIK, R1 points to a closure on the heap, 
not to a stack location. That is, it can alias pointers on the stack or Hp but 
it can't alias the Sp itself. I don't think Sp can be aliased by anything 
outside of the garbage collector.

Perhaps we shouldn't mark Hp as noalias, though.

 But shouldn't it just promote stack locations to registers?
 
 Yes, LLVM can and will promote the stack locations to registers, but since it 
 doesn't know that Sp is really a stack, it is difficult for it to tell when 
 it can avoid the writes back to the stack even though *we* know they will not 
 be visible once the function call returns.

Right, I meant GHC stack locations. Let me rephrase my question: shouldn't it 
just promote array locations to registers?

 It may profitable to write our own alias analysis pass for LLVM that encodes 
 our knowledge of what can alias in the GHC world view. It wouldn't be useful 
 for other LLVM clients, but might be a good option for us.

Actually, I think our aliasing properties should be fairly close to those of, 
say, Java. I wonder how LLVM deals with those.

 Yeah, I'm generally very impressed with what LLVM is able to do with the code 
 from GHC. Any help we can give it will just make it that much better!

I have to say I'm slightly disappointed with what LLVM does with tight loops 
generated by GHC. That's not necessarily LLVM's fault, you are quite right that 
we should probably give it more information.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Loop optimisation with identical counters

2010-11-05 Thread Roman Leshchinskiy
On 06/11/2010, at 02:27, Sebastian Fischer wrote:

 Interesting. This approach requires `f` to be inlined into its call site in 
 order to eliminate the redundant argument. This is different from the 
 proposal to provide a specialized version of `f` (where the arguments are 
 combined) which could be reused at different call sites.

Which proposal do you mean? I'm not sure something like that is feasible 
without knowing the call sites. You have to know which arguments to combine.

 How many call sites with identical arguments are there in the generated code 
 that triggered this discussion and in the stream fusion code that would 
 benefit from this optimization?

In stream fusion code, there is normally exactly one call site. I suspect that 
Christian's example has also been derived from stream fusion code.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Loop optimisation with identical counters

2010-11-03 Thread Roman Leshchinskiy


On 3 Nov 2010, at 10:45, Christian Hoener zu Siederdissen 
choe...@tbi.univie.ac.at wrote:

 Thanks, I'll do some measurements on this with ghc7.
 
 Gruss,
 Christian
 
 On 11/02/2010 01:23 PM, Simon Marlow wrote:
 On 02/11/2010 08:17, Christian Höner zu Siederdissen wrote:
 Hi,
 
 is the following problem a job for ghc or the code generation backend
 (llvm)?
 
 We are given this program:
 
 {-# LANGUAGE BangPatterns #-}
 
 module Main where
 
  f :: Int -> Int -> Int -> Int -> Int
  f !i !j !s !m
    | i == 0    = s+m
    | otherwise = f (i-1) (j-1) (s + i+1) (m + j*5)
 
  g :: Int -> Int
 g !k = f k k 0 0
 
 
  ff :: Int -> Int -> Int -> Int
  ff !i !s !m
    | i == 0    = s+m
    | otherwise = ff (i-1) (s + i+1) (m + i*5)
 
  gg :: Int -> Int
 gg !k = ff k 0 0
 
 main = do
   print $ g 20
   print $ gg 20
 
 
 Here, 'f' and 'g' are a representation of the code I have. Both counters
 'i' and 'j' in 'f' count from the same value with the same step size and
 terminate at the same time but are not reduced to just one counter. Can
 I reasonably expect this to be done by the code generator?
 'ff' represents what I would like to see.
 
 GHC doesn't have any optimisations that would do this currently,
 although it's possible that LLVM's loop optimisations might do this on
 the generated code for f.
 
 Cheers,
Simon
 
 
 
 Btw. look at the core, to see that indeed 'f' keep four arguments.
 Functions like 'f' are a result of vector-fusion at work but can be
 written by oneself as well. The point is that if 'f' gets reduced to
 'ff' then I can have this:
 
 fun k = zipWith (+) (map f1 $ mkIdxs k) (map f2 $ mkIdxs k)
 
 which makes for nicer code sometimes; but before rewriting I wanted to
 ask if that kills performance.
 
 
 Thanks,
 Christian
 
 
 
 ___
 Glasgow-haskell-users mailing list
 Glasgow-haskell-users@haskell.org
 http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
 
 
 ___
 Glasgow-haskell-users mailing list
 Glasgow-haskell-users@haskell.org
 http://www.haskell.org/mailman/listinfo/glasgow-haskell-users
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Loop optimisation with identical counters

2010-11-03 Thread Roman Leshchinskiy
LLVM doesn't eliminate the counters. FWIW, fixing this would improve 
performance of stream fusion code quite a bit. It's very easy to do in Core.

Roman

On 3 Nov 2010, at 10:45, Christian Hoener zu Siederdissen 
choe...@tbi.univie.ac.at wrote:

 Thanks, I'll do some measurements on this with ghc7.
 
 Gruss,
 Christian
 
 On 11/02/2010 01:23 PM, Simon Marlow wrote:
 On 02/11/2010 08:17, Christian Höner zu Siederdissen wrote:
 Hi,
 
 is the following problem a job for ghc or the code generation backend
 (llvm)?
 
 We are given this program:
 
 {-# LANGUAGE BangPatterns #-}
 
 module Main where
 
  f :: Int -> Int -> Int -> Int -> Int
  f !i !j !s !m
    | i == 0    = s+m
    | otherwise = f (i-1) (j-1) (s + i+1) (m + j*5)
 
  g :: Int -> Int
 g !k = f k k 0 0
 
 
  ff :: Int -> Int -> Int -> Int
  ff !i !s !m
    | i == 0    = s+m
    | otherwise = ff (i-1) (s + i+1) (m + i*5)
 
  gg :: Int -> Int
 gg !k = ff k 0 0
 
 main = do
   print $ g 20
   print $ gg 20
 
 
 Here, 'f' and 'g' are a representation of the code I have. Both counters
 'i' and 'j' in 'f' count from the same value with the same step size and
 terminate at the same time but are not reduced to just one counter. Can
 I reasonably expect this to be done by the code generator?
 'ff' represents what I would like to see.
 
 GHC doesn't have any optimisations that would do this currently,
 although it's possible that LLVM's loop optimisations might do this on
 the generated code for f.
 
 Cheers,
Simon
 
 
 
 Btw. look at the core, to see that indeed 'f' keep four arguments.
 Functions like 'f' are a result of vector-fusion at work but can be
 written by oneself as well. The point is that if 'f' gets reduced to
 'ff' then I can have this:
 
 fun k = zipWith (+) (map f1 $ mkIdxs k) (map f2 $ mkIdxs k)
 
 which makes for nicer code sometimes; but before rewriting I wanted to
 ask if that kills performance.
 
 
 Thanks,
 Christian
 
 
 
___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Add/Change #3557 (SIMD operations in GHC.Prim)

2010-10-26 Thread Roman Leshchinskiy
DPH doesn't provide SIMD primitives. Both DPH and vector could greatly benefit 
from having them available, though. It would be fantastic if you could add them.

Roman

On 26 Oct 2010, at 10:00, Alexander McPhail haskell.vivian.mcph...@gmail.com 
wrote:

 Hi,
 
 Does DPH resolve this issue or are SIMD primitives an orthogonal issue?
 
 I might take a look at this ticket if it is still relevant.
 
 Cheers,
 
 Vivian
 
 DISCLAIMER
 
 This transmission contains information that may be confidential. It is 
 intended for the named addressee only. Unless you are the named addressee you 
 may not copy or use it or disclose it to anyone else.
 
___
Glasgow-haskell-bugs mailing list
Glasgow-haskell-bugs@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-bugs


Re: [Haskell-cafe] Eta-expansion and existentials (or: types destroy my laziness)

2010-10-17 Thread Roman Leshchinskiy

On 16/10/2010, at 12:36, Max Bolingbroke wrote:

 On 16 October 2010 12:16, Roman Leshchinskiy r...@cse.unsw.edu.au wrote:
 eta :: Stream a -> Stream a
 eta s = Stream s next
   where
     next (Stream s next') = case next' s of
       Just (x,s') -> Just (x,Stream s' next')
       Nothing     -> Nothing
 
 Making GHC optimise stream code involving eta properly is hard :-)
 
 Good point, I don't exactly mean non-recursive for requirement 3) then
 - I mean an adjective with a fuzzier definition like GHC-optimisable
 :-)

I suspect the easiest way to achieve this is to expand the set of 
GHC-optimisable things until it includes eta :-)

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Eta-expansion and existentials (or: types destroy my laziness)

2010-10-16 Thread Roman Leshchinskiy

On 16/10/2010, at 12:00, Max Bolingbroke wrote:

 Hi Cafe,
 
 I've run across a problem with my use of existential data types,
 whereby programs using them are forced to become too strict, and I'm
 looking for possible solutions to the problem.
 
 I'm going to explain what I mean by using a literate Haskell program.
 First, the preliminaries:
 
 {-# LANGUAGE ExistentialQuantification #-}
 import Control.Arrow (second)
 import Unsafe.Coerce
 
 Let's start with a simple example of an existential data type:
 
 data Stream a = forall s. Stream s (s -> Maybe (a, s))
 
 [...]
 In fact, to define a correct cons it would be sufficient to have some
 function (eta :: Stream a -> Stream a) such that (eta s) has the same
 semantics as s, except that eta s /= _|_ for any s.

That's easy.

eta :: Stream a -> Stream a
eta s = Stream s next
  where
    next (Stream s next') = case next' s of
      Just (x,s') -> Just (x,Stream s' next')
      Nothing     -> Nothing

Making GHC optimise stream code involving eta properly is hard :-)

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Massive slowdown in mwc-random after switching to use of primitive package

2010-07-12 Thread Roman Leshchinskiy

On 11/07/2010, at 22:49, Bryan O'Sullivan wrote:

 On Sun, Jul 11, 2010 at 12:59 AM, Dan Doel dan.d...@gmail.com wrote:
 
 You're using GHC 6.12.x presumably?
 
 That's right.
  
 There are known performance problems with
 using abstract PrimMonads in that version (and, actually, just using IO as
 well).
 
 
 Ah, that's a shame. I'm surprised it would be affecting IO too!

FWIW, I tried a couple of different designs when writing primitive and the 
current one seems to work best with 6.12, at least for the things I'm using it 
for. Really, 6.12 just seems to be rather hopeless here.

 In the IO case for instance, checking the core tends to reveal lots of 
 casting
 between RealWorld and PrimState IO, despite the fact that those should be
 identical.
 
 
 I'd certainly noticed that the Core for the PrimMonad code was huge and 
 almost impossible to follow for the enormous amount of casting that was 
 taking place.

The head has -dsuppress-coercions which omits coercion terms when pretty 
printing Core. It would be easy to backport that to 6.12.

 I might revert both mwc-random and statistics back to using plain ST for now, 
 then, as having them run slightly faster than 1% of their former speed is a 
 bit painful :-( 

I fear that's the only sensible solution. I wouldn't throw away your current 
code, though, as 6.14 won't have these problems (hopefully).

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: [Haskell-cafe] Construction of short vectors

2010-06-27 Thread Roman Leshchinskiy
On 25/06/2010, at 06:41, Alexey Khudyakov wrote:

 Then constructors like the one below arise naturally, and I don't know how to
 write them properly. It's possible to use fromList, but then a list could be
 allocated, which is obviously wasteful.
 
 vector2 :: Double - Double - Vec2D
 vector2 x y = ...
 -- Vec2D is some wrapper around Vector data type

Your best bet is probably singleton x ++ singleton y. Unfortunately, GHC 
doesn't seem to provide any real way of specifying array literals.

 Another question: are there any specific problems with short vectors? They
 could be just 2 elements long. I mean performance problems.

A data type like this one should be faster:

data Vec2D = Vec2D {-# UNPACK #-} !Double {-# UNPACK #-} !Double

Firstly, this needs one less indirection for accessing the components. 
Secondly, GHC's simplifier knows a lot more about algebraic data types than it 
does about arrays so the above definition will often lead to better 
optimisations. Whether or not the difference is significant probably depends on 
the program.
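
For illustration, a rough sketch of both options (Vec2D' is a made-up name for
the unpacked variant; only the vector calls are real API):

import qualified Data.Vector.Unboxed as U

-- Option 1: a wrapper around a real vector, built without going
-- through an intermediate list.
newtype Vec2D = Vec2D (U.Vector Double)

vector2 :: Double -> Double -> Vec2D
vector2 x y = Vec2D (U.singleton x U.++ U.singleton y)

-- Option 2: a plain algebraic data type with unpacked components,
-- avoiding the extra indirection mentioned above.
data Vec2D' = Vec2D' {-# UNPACK #-} !Double {-# UNPACK #-} !Double

In practice you would pick one of the two; as noted above, the second tends to
optimise better for fixed, small dimensions.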

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] The mother of all functors/monads/categories

2010-06-27 Thread Roman Leshchinskiy
On 27/06/2010, at 19:54, Max Bolingbroke wrote:

 Q: What is the mother of all X, where X is some type class?
 A: It is a data type D such that:
 
 1. There exist total functions:
 
 lift :: X d => d a -> D a
 lower :: X d => D a -> d a

Are those universally quantified over d? If so, then none of your examples fit 
this definition. I assume you mean this:

lift :: X d => d a -> D d a
lower :: X d => D d a -> d a

In that case, isn't D just the dictionary for (X d) and a value of type (d a)? 
I.e., couldn't we always define it as:

data D d a where { D :: X d => d a -> D d a }
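
For concreteness, a sketch of that encoding with lift and lower, using Functor
in place of the abstract class X only so the example compiles:

{-# LANGUAGE GADTs #-}

data D d a where
  D :: Functor d => d a -> D d a

lift :: Functor d => d a -> D d a
lift = D

-- Pattern matching on D brings the dictionary back into scope.
lower :: D d a -> d a
lower (D x) = x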

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Bug with [Double]

2010-05-19 Thread Roman Leshchinskiy
On 19/05/2010, at 19:24, Dmitry Olshansky wrote:

 Prelude> [1,1+2/3..10]
 [1.0,1.6665,2.333,2.9996,3.666,4.332,4.998,5.664,6.33,6.9964,7.6625,8.329,8.995,9.66,10.327]
 
 -- It is a bug!

Unfortunately, it isn't. Section 6.3.4 of the Haskell report says:

For Float and Double, the semantics of the enumFrom family is given by the 
rules for Int above, except that the list terminates when the elements become 
greater than e3+i/2 for positive increment i, or when they become less than 
e3+i/2 for negative i.

In this case, i = 2/3 so the last value in the list is 10+1/3. The same applies 
to the other examples.

Personally, I consider the Enum class itself to be broken.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Bug with [Double]

2010-05-19 Thread Roman Leshchinskiy
On 19/05/2010, at 20:36, Ivan Lazar Miljenovic wrote:

 Roman Leshchinskiy r...@cse.unsw.edu.au writes:
 Personally, I consider the Enum class itself to be broken.
 
 Oh?  In what sense?

Firstly, the enumFrom* family of functions shouldn't be methods and the class 
itself should provide enough facilities for implementing them generically. GHC, 
for instance, specialises them for all primitive numeric types just to get 
foldr/build fusion to work. That shouldn't be necessary and doesn't help with 
overloaded code anyway. For instance, this generates an intermediate list:

foo :: Enum a => a -> a -> [Int]
foo a b = [fromEnum x | x <- [a..b]]

It's even worse when you want to implement similar functionality for other data 
structures. In vector, I basically had to duplicate all those specialisations 
to get decent performance. The generic case is horribly inefficient:

enumFromTo x y = fromList [x .. y]

There is no other sensible definition.

Secondly, it should be possible to compute the length and the nth element of 
[a..b] in constant time. At the moment, it's impossible to distribute [a..b] 
efficiently across multiple threads - you have to generate the entire list 
first and then split it into chunks. It's completely unclear to me what [:a .. 
b:] should mean in DPH, for instance.
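
To make that concrete, a hypothetical interface of the kind argued for above
might look like this (the names are invented for illustration; this is not an
existing proposal):

-- Enough structure to split [x..y] across threads without building it.
class Range a where
  rangeSize  :: a -> a -> Int       -- length of [x..y], in constant time
  rangeIndex :: a -> a -> Int -> a  -- nth element of [x..y], in constant time

instance Range Int where
  rangeSize  x y   = max 0 (y - x + 1)
  rangeIndex x _ i = x + i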

So basically, Enum only provides enough functionality to desugar [a..b] and 
friends and even here, it doesn't interact well with fusion. Of course, these 
concerns weren't relevant back when the class was designed. But it is really 
broken now, IMO.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Bug with [Double]

2010-05-19 Thread Roman Leshchinskiy
On 19/05/2010, at 23:44, Ben Millwood wrote:

 On Wed, May 19, 2010 at 10:57 AM, Serguey Zefirov sergu...@gmail.com wrote:
 
 PS
 Rationals:
  Prelude> [1,1+2/3..10] :: [Rational]
 [1 % 1,5 % 3,7 % 3,3 % 1,11 % 3,13 % 3,5 % 1,17 % 3,19 % 3,7 % 1,23 %
 3,25 % 3,9 % 1,29 % 3,31 % 3]
 
 Same result.
 
 This sounds like a bug to me. The section of the Haskell Report that
 deals with the Enum class mentions Float and Double, not Rational, and
 there's really no sensible reason why Rationals would exhibit this
 behaviour given that they don't have rounding error.

From Section 12.1 of the Library Report:

instance  (Integral a)  => Enum (Ratio a)  where
succ x   =  x+1
pred x   =  x-1
toEnum   =  fromIntegral
fromEnum =  fromInteger . truncate  -- May overflow
enumFrom =  numericEnumFrom  -- These numericEnumXXX functions
enumFromThen =  numericEnumFromThen  -- are as defined in Prelude.hs
enumFromTo   =  numericEnumFromTo   -- but not exported from it!
enumFromThenTo   =  numericEnumFromThenTo

The numericEnum functions are defined in Section 8 of the Language Report and 
have the semantics required for Float and Double.

Roman

___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Numerical Analysis

2010-05-17 Thread Roman Leshchinskiy
On 17/05/2010, at 05:17, Gregory Crosswhite wrote:

 As an aside, while there are advantages to writing numerical analysis 
 routines in Haskell, it might be a better strategy to instead link in something 
 like LAPACK and provide nice wrappers to it in Haskell, since this way you 
 can harness the work of the experts who have spent a lot of time perfecting 
 their code rather than re-inventing the wheel.

I don't think this is an either/or question. A good array library ought to 
provide BLAS, Lapack, FFTW etc. bindings *and* allow writing high-performance 
code in pure Haskell. I haven't implemented any of these bindings for vector 
only because I'm still deciding what to do with multidimensional arrays.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Numerical Analysis

2010-05-17 Thread Roman Leshchinskiy
On 17/05/2010, at 02:52, Pierre-Etienne Meunier wrote:

 You are quite right that vector only supports nested arrays but not 
 multidimensional ones. This is by design, however - the library's only goal 
 is to provide efficient one-dimensional, Int-indexed arrays. I'm thinking 
 about how to implement multidimensional arrays on top of vector but it's not 
 that easy. While repa is a step in that direction, I also need to support 
 mutable arrays and interoperability with C which complicates things 
 immensely.
 
 I understand. What complicates it even more (at least in what I imagine) is 
 that C uses the same syntax for multidimensional and nested arrays, and I do 
 not believe that for instance GHC's FFI allows for array types such as int 
 x[19][3].

Actually, it does since an argument of that type is equivalent to int *x. FWIW, 
I always say nested array when I mean that the individual subarrays can have 
different lengths as opposed to multidimensional ones where they are all the 
same. So the former are similar to int *x[].

 I was also wondering how to do linear algebra: an infinite number of types 
 would be needed to express all the constraints on matrix multiplication: we 
 need types such as "array of size m * n". Is there a way to generate these 
 automatically, for instance with Template Haskell? (Again, I know nothing of 
 Template Haskell either, sorry!)

Encoding the bounds in the type system is possible but rather messy. In 
general, simply saying the array has indices of type (Int,Int) and doing 
dynamic bounds checks when necessary seems to work best in Haskell.

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Numerical Analysis

2010-05-16 Thread Roman Leshchinskiy
On 16/05/2010, at 10:17, Pierre-Etienne Meunier wrote:

 I've also just noticed a lack in the vector library: multidimensional arrays 
 seem to require indirections like in Caml, whereas in C or in Data.Ix, there 
 is a way to avoid this. This is especially important for avoiding cache 
 misses with many dimensions, as well as for providing a clean interface. For 
 instance, if a 10x10 matrix is initialized improperly like 
 
 Data.Vector.replicate 10 $ Data.Vector.replicate 10 0
 
 The result is a total mess. Surely, every programmer knows that a computer 
 has got memory, and that this memory has to be allocated, but from what I 
 understand of Haskell, I would expect the interface and the RTS to do it for 
 me. And an integer multiplication, followed by an addition, is way cheaper 
 than accessing uncached memory. Or maybe I do not understand that pipelines, 
 hyperthreading and all that stuff would give you the same result?

You are quite right that vector only supports nested arrays but not 
multidimensional ones. This is by design, however - the library's only goal is 
to provide efficient one-dimensional, Int-indexed arrays. I'm thinking about 
how to implement multidimensional arrays on top of vector but it's not that 
easy. While repa is a step in that direction, I also need to support mutable 
arrays and interoperability with C which complicates things immensely.

That said, if all you need is a matrix it's quite easy to implement the 
necessary index calculations yourself. Also, since you are working with 
numerics I highly recommend that you use either Data.Vector.Unboxed or 
Data.Vector.Storable instead of Data.Vector as boxing tends to be prohibitively 
expensive in this domain.
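
As a sketch of the "index calculations yourself" route (Matrix and its
functions are made up for the example; only the vector calls are real):

import qualified Data.Vector.Unboxed as U

-- A dense, row-major matrix on top of an unboxed vector.
data Matrix a = Matrix
  { rows :: !Int
  , cols :: !Int
  , elts :: !(U.Vector a)   -- length rows * cols
  }

at :: U.Unbox a => Matrix a -> Int -> Int -> a
at m i j = elts m U.! (i * cols m + j)

matrix :: U.Unbox a => Int -> Int -> ((Int, Int) -> a) -> Matrix a
matrix r c f = Matrix r c (U.generate (r * c) (\k -> f (k `divMod` c)))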

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: [Haskell-cafe] Monadic style with Streams (as in Data.Array.Parallel.Stream)

2010-05-16 Thread Roman Leshchinskiy
On 16/05/2010, at 11:54, Mark Wassell wrote:

 Hi,
 
 This possibly might go against the spirit of what Stream programming is about 
 but I am having difficulties converting an imperative algorithm [1] into Haskell 
 and think it would be easier if I was able to write it in a monadic style 
 with operations to read and write from and to the streams.
 
 I first tried to approach it by delving into the innards of other Stream 
 functions to devise what I needed. I only got so far and the sticking point 
 was defining the Monad. I then approached it from the Monad side and although 
 what I have is workable, it probably isn't going to perform (for one it uses 
 fromStream and tailS on each read off the front of the stream).

Data.Array.Parallel.Stream serves only one purpose: to represent loops produced 
by DPH in such a way that the compiler is able to optimise them well. Putting a 
monad on top of that will very very likely break this. To be honest, I'm not 
sure why you need the monad anyway. I would expect compression/decompression to 
be pure functions of type Stream Word8 -> Stream Word8.

In any case, I would urgently recommend not to use Data.Array.Parallel.Stream 
for anything at this point. This whole subsystem will soon die of old age and 
be replaced by the much nicer stuff from package vector, specifically 
Data.Vector.Fusion.Stream and Data.Vector.Fusion.Stream.Monadic. Note that the 
latter implements monadic streams as described in 
http://www.cse.unsw.edu.au/~rl/publications/recycling.html. Perhaps those can 
be useful for you if you really need a monad.
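
For orientation, monadic streams in the style of the recycling paper look
roughly like this (a sketch from memory, not the library's exact definitions):

{-# LANGUAGE ExistentialQuantification #-}

data Step s a = Yield a s | Skip s | Done

data Stream m a = forall s. Stream (s -> m (Step s a)) s

-- Pull one element; enough to write stateful (de)compression loops
-- in whatever monad you need.
uncons :: Monad m => Stream m a -> m (Maybe (a, Stream m a))
uncons (Stream step s) = go s
  where
    go t = do
      r <- step t
      case r of
        Yield x t' -> return (Just (x, Stream step t'))
        Skip t'    -> go t'
        Done       -> return Nothing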

Roman


___
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe


Re: Custom reducing functions for DPH

2010-05-10 Thread Roman Leshchinskiy
On 11/05/2010, at 05:29, Edward Z. Yang wrote:

 Some of the important primitives offered by Data Parallel Haskell are
 reduction primitives such as sumP and prodP, which take a data parallel
 array and reduce it to a single value.  I was wondering what the current
 capabilities for end-users interested in implementing their own
 reduction primitives were.  In particular, if I want to run data parallel
 computations on strings, I will generally want a more exotic set of
 combining operators.
 
 thoughtpolice informed me that GHC 6.10 seemed to have sumP/prodP hard
 coded into the vectorisation and optimisation stages, so this didn't
 really seem possible in userspace.  I'm interested to know if this situation
 has changed.  No hard feelings if it hasn't; I'm really just playing around
 with DPH and seeing what it can do. :-)

Short answer: we will eventually allow arbitrary operators in folds and scans 
but we need some time to figure out how to do this efficiently and it's not 
really on the top of our todo list at the moment.

That said, there are no theoretical problems with allowing user-defined 
operators and we could, with some work, provide support for them now. The only 
real difficulty is efficiency. Consider that in (foldP f xs) the function f 
might itself do something in parallel. In that case, we have to eliminate the 
nesting. This is quite straightforward to do by splitting xs into two halves, ys 
and zs, and then computing (zipWithP f ys zs). Then, we split the result, zip 
again and so on until we are left with a one-element array. This works because 
zipWithP knows how to eliminate nested parallelism. In fact, we have just 
directly encoded the parallel tree reduction algorithm.
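
A list-based sketch of that reduction scheme, with plain lists and zipWith
standing in for parallel arrays and zipWithP:

-- Assumes f is associative; z is only used for the empty case.
treeFold :: (a -> a -> a) -> a -> [a] -> a
treeFold _ z []  = z
treeFold _ _ [x] = x
treeFold f z xs  = treeFold f z (zipWith f ys zs ++ rest)
  where
    n           = length xs `div` 2
    (ys, rest') = splitAt n xs
    (zs, rest)  = splitAt n rest'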

We can do *much* better if we know that f is purely sequential, though (e.g., 
if f is Int addition). Now, we just split xs into chunks, one per thread, 
compute the local sums (a tight, sequential loop on each thread) and then 
combine them. This is a huge performance win.
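
And a sketch of the sequential-operator case, using lists and the parallel
package's strategies only to show the shape (the chunking scheme and the
worker count are arbitrary assumptions here):

import Control.Parallel.Strategies (parMap, rseq)

-- Assumes f is associative with unit z, as for any parallel fold.
chunkedFold :: Int -> (a -> a -> a) -> a -> [a] -> a
chunkedFold workers f z xs =
    foldl f z (parMap rseq (foldl f z) (chunks xs))
  where
    size = max 1 ((length xs + workers - 1) `div` workers)
    chunks [] = []
    chunks ys = let (a, b) = splitAt size ys in a : chunks b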

For now, we have just provided specialised folds which we know are fast and 
which are easy to implement. When the rest of the system works reliably, we'll 
think about how to implement the general fold combinator with good performance 
for purely sequential combining operators.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: -O vs. -O2

2010-05-09 Thread Roman Leshchinskiy
On 07/05/2010, at 19:53, Simon Marlow wrote:

 On 05/05/2010 12:24, Roman Leshchinskiy wrote:
 Whenever I do cabal sdist on one of my projects, I get this warning:
 
 Distribution quality warnings: 'ghc-options: -O2' is rarely needed.
 Check that it is giving a real benefit and not just imposing longer
 compile times on your users.
 
 This finally got me curious and I did a nofib run to compare -O to
 -O2. The results are below (this is with the current HEAD).
 
 What were the compile-time results?

Gosh, I thought I included those.

-1 s.d.     +1.1%
+1 s.d.    +16.3%
Average     +8.4%

The full results are below. IMO, the increases are rather negligible.

Roman

Compile Times

---
Program                      log-O    log-O2
---

anna
   AbsConc3 0.23  +17.4%
  AbstractEval2 0.080.08
   AbstractMisc 0.110.12
  AbstractVals2 0.23  +17.4%
  Apply 0.100.12
BarakiConc3 0.170.18
 BarakiMeet 0.110.12
   BaseDefs 0.87   +8.0%
   Constructors 0.170.21
 Dependancy 0.24  +25.0%
 DomainExpr 0.120.13
EtaAbstract 0.120.13
FrontierDATAFN2 0.190.19
FrontierGENERIC 0.160.16
  FrontierMisc2 0.100.11
Inverse 0.160.16
LambdaLift5 0.42   +4.8%
   Main 0.31   +0.0%
MakeDomains 0.100.10
Monster 0.120.12
MyUtils 0.130.13
Parser2 0.64   +7.8%
PrettyPrint 0.130.13
   PrintResults 0.23   +8.7%
  ReadTable 0.160.17
   Simplify 0.100.10
 SmallerLattice 0.36  +13.9%
  StrictAn6 0.59  +11.9%
 SuccsAndPreds2 0.24   +4.2%
TExpr2DExpr 0.100.10
 TypeCheck5 0.70  +14.3%
  Utils 0.34  +14.7%

ansi
   Main 0.200.20

atom
   Main 0.160.16

awards
   Main 0.160.16
  QSort 0.040.05

banner
   Main 0.29   +3.4%

bernouilli
   Main 0.150.15

boyer
   Main 0.42   +4.8%

boyer2
Checker 0.060.07
Lisplikefns 0.150.16
   Main 0.080.08
 Rewritefns 0.070.09
   Rulebasetext 0.110.12

bspt
   BSPT 0.200.23
 Euclid 0.32   +3.1%
  EuclidGMS 0.150.16
GeomNum 0.110.11
   Init 0.130.13
  Input 0.40   +5.0%
  Interface 0.100.10
  Interpret 0.120.13
Libfuns 0.030.03
 MGRlib 0.090.09
   Main 0.060.06
  Merge 0.130.14
 Params 0.090.09
   Prog 0.060.06
  Rationals 0.180.19
 Render 0.120.12
 Stdlib 0.090.10

cacheprof
   Arch_x86 0.68   +2.9%
   Generics 0.190.20
   Main 2.13  +11.3%

calendar
   Main 0.24   +8.3%

cichelli
  Auxil 0.150.17
   Interval 0.030.03
Key 0.040.04
   Main 0.050.05
   Prog 0.120.12

circsim
   Main 0.63   +7.9%

clausify
   Main 0.23   +4.3%

comp_lab_zift
   Main 0.68

Re: -O vs. -O2

2010-05-09 Thread Roman Leshchinskiy
On 09/05/2010, at 07:50, Duncan Coutts wrote:

 On Wed, 2010-05-05 at 21:24 +1000, Roman Leshchinskiy wrote:
 Whenever I do cabal sdist on one of my projects, I get this warning:
 
 Distribution quality warnings:
 'ghc-options: -O2' is rarely needed. Check that it is giving a real benefit
 and not just imposing longer compile times on your users.
 
 This finally got me curious and I did a nofib run to compare -O to
 -O2. The results are below (this is with the current HEAD).
 
 Is there a real-world example of -O2 causing significantly longer
 compile times without providing a real benefit? If not, would it
 perhaps make sense for Cabal to use -O2 by default or even for GHC to
 make the two flags equivalent?
 
 It should be -O1 for default/balanced optimisations and -O2 for things
 involving a bigger tradeoff in terms of code size or compile time. so
 any optimisations in -O2 that GHC HQ believe are a no-brainer for the
 majority of packages should be moved into -O1.

Unless I'm mistaken, the only differences between -O1 and -O2 are SpecConstr and 
LiberateCase. These are quite heavily constrained by default (e.g., SpecConstr 
will not specialise big functions and will not generate more than 3 
specialisations for smaller ones).

 It's fine for people writing performance sensitive code to use -O2 in
 their packages. It's just not something we need to encourage for random
 packages. Before we added that warning, many package authors were not
 really thinking and just chucking in -O2 because "2 is bigger than 1 so
 it must be better, right?". There certainly used to be packages that took
 longer to compile, generated more code, and ran slower when using -O2.
 That was some time ago of course.

And yet that doesn't really happen for nofib programs. So if there are still 
examples of this, we should include them in nofib. Quite probably, they're just 
running into bugs in the simplifier which should be fixed.

Also, the Cabal warning only talks about compile times. If it's just between 
compile times and better performance, then the latter should be the default, 
IMO, since you only install things once and run them many times.

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


-O vs. -O2

2010-05-05 Thread Roman Leshchinskiy
Whenever I do cabal sdist on one of my projects, I get this warning:

Distribution quality warnings:
'ghc-options: -O2' is rarely needed. Check that it is giving a real benefit
and not just imposing longer compile times on your users.

This finally got me curious and I did a nofib run to compare -O to -O2. The 
results are below (this is with the current HEAD).

Is there a real-world example of -O2 causing significantly longer compile times 
without providing a real benefit? If not, would it perhaps make sense for Cabal 
to use -O2 by default or even for GHC to make the two flags equivalent?

Roman

NoFib Results


Program           Size    Allocs   Runtime   Elapsed

   anna  +2.5% -0.2%  0.08  0.11
   ansi  +0.0% +0.0%  0.00  0.01
   atom  +0.0% +0.0%  0.20 +0.0%
 awards  +0.0% +0.0%  0.00  0.01
 banner  +0.0% +0.0%  0.00  0.01
 bernouilli  +0.0% +0.0% +0.0% +0.5%
  boyer  +0.0% -0.3%  0.03  0.04
 boyer2  +0.2% +0.0%  0.00  0.01
   bspt  +0.4% +0.0%  0.00  0.02
  cacheprof  +0.6% +0.0% +0.0% +0.0%
   calendar  +0.2% +0.1%  0.00  0.01
   cichelli  +0.0% -0.0%  0.07  0.08
circsim  +0.2% -0.0% -1.4% +0.3%
   clausify  +0.2% -6.8%  0.03  0.04
  comp_lab_zift  +0.8% +0.7%  0.17  0.19
   compress  +0.0% +0.0%  0.14  0.16
  compress2  +2.3% -1.3%  0.14  0.17
constraints  +0.0% +0.0% -0.4% -0.0%
   cryptarithm1  +0.0% +0.0% +0.0% -0.3%
   cryptarithm2  +0.0% -4.7%  0.01  0.02
cse  +0.0% +0.0%  0.00  0.01
  eliza  +0.0% +0.0%  0.00  0.01
  event  +0.2% +1.5%  0.14  0.16
 exp3_8  +0.0% +0.0%  0.10  0.11
 expert  +0.4% +0.4%  0.00  0.01
fem  +0.5% +1.0%  0.02  0.02
fft  +0.2% -1.6%  0.03  0.04
   fft2  +0.2% +0.0%  0.06  0.07
   fibheaps  +0.2% +0.4%  0.03  0.04
   fish  +0.0% +0.0%  0.01  0.03
  fluid  +2.1% -1.9%  0.01  0.02
 fulsom  +0.2% +0.0% +0.0% -0.5%
 gamteb  +0.7% -2.1%  0.07  0.09
gcd  +0.0% -1.7%  0.02  0.02
gen_regexps  +0.0% +0.0%  0.00  0.01
 genfft  +0.0% -0.3%  0.03  0.03
 gg  +0.5% -0.8%  0.01  0.02
   grep  +0.2% +0.0%  0.00  0.01
 hidden  +0.3% +0.0% -3.2% -3.5%
hpg  +0.2% +0.0%  0.14 +0.6%
ida  +0.5% -0.7%  0.07  0.08
  infer  +0.0% +0.0%  0.04  0.05
integer  +0.0% +0.0% +0.0% +0.0%
  integrate  +0.0% +0.0% +0.0% +0.0%
knights  +0.7%-13.4%  0.00  0.01
   lcss  +0.0% -0.0% -0.8% -0.7%
   life  +0.0% +0.0%  0.20 -1.0%
   lift  +0.2% +3.6%  0.00  0.01
  listcompr  +0.0% +0.0%  0.09  0.12
   listcopy  +0.0% -0.0%  0.10  0.13
   maillist  +0.0% -0.2%  0.05 +0.2%
 mandel  +0.0% +0.0%  0.07  0.09
mandel2  +0.2%-67.2%  0.00  0.00
minimax  +0.0% +0.4%  0.00  0.01
mkhprog  +0.0% +0.0%  0.00  0.01
 multiplier  +0.0% +0.0%  0.09  0.11
   nucleic2  +0.0% +0.0%  0.06  0.07
   para  +0.9% +0.5% +6.4% +5.5%
  paraffins  +0.4% +0.2%  0.07  0.10
 parser  +0.5% +0.3%  0.03  0.04
parstof  +0.2% -2.5%  0.00  0.01
pic  +0.4% -0.8%  0.00  0.02
  power  +0.4% +0.0% +0.0% +0.8%
 pretty  +0.2% -2.9%  0.00  0.01
 primes  +0.0% +0.0%  0.05  0.06
  primetest  +0.2% +0.0% +0.0% +0.6%
 prolog  +0.2% -0.9%  0.00  0.01
 puzzle  +0.0% +0.0%  0.14  0.16
  

Re: Parallel Haskell: 2-year project to push real world use

2010-05-04 Thread Roman Leshchinskiy
On 04/05/2010, at 11:10, Christian Höner zu Siederdissen wrote:

 * Ben Lippmeier b...@ouroborus.net [04.05.2010 02:21]:
 
 You can certainly create an array with these values, but in the provided 
 code it looks like each successive array element has a serial dependency on 
 the previous two elements. How were you expecting it to parallelise?
 
 actually, in reality it is rather more complex: in a 2d array, each cell
 (i,j) requires a linear number of accesses to previously calculated
 cells that all have indices bounded by the current (i,j).
 
 One of the simplest codes is like this:
 
 forall i in [1..n]
   forall j in [i..n]
     set (i,j) to: minimum of (i,k)+(k,j) (forall k in [i+1..j-1])

Is this related to wavefront algorithms? Although those only access immediate 
neighbours IIRC.

In any case, vector could well provide an operation like this:

cant_think_of_a_name :: Vector v a => Int -> (v a -> a) -> v a

The function would take the initialised prefix of the vector (starting with 
empty) and produce the next element. This would require a bit of hackery 
underneath but the interface would be safe and pure. Would something like this 
be useful?
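
A pure sketch of the intended semantics: the generating function only ever
sees the prefix constructed so far. An in-place implementation would avoid
the repeated copying that snoc implies.

import qualified Data.Vector as V

constructByPrefix :: Int -> (V.Vector a -> a) -> V.Vector a
constructByPrefix n f = go V.empty
  where
    go v | V.length v >= n = v
         | otherwise       = go (V.snoc v (f v))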

 Here http://www.tbi.univie.ac.at/newpapers/Abstracts/98-06-009.ps.gz is
 a description of a parallel version of RNAfold.

IIUC, this parallelises processing of each diagonal but computes the diagonals 
one after another. Could you perhaps store each diagonal as a separate 
(parallel) array? That would make things much simpler.

 I can make my libraries available under GPLv3; they just need a bit of
 love. This gives you a moderately complex algorithm for which there is
 also a highly optimized C version (RNAfold -d2, in the Vienna RNA
 package).

That would be fantastic!

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


Re: Parallel Haskell: 2-year project to push real world use

2010-05-04 Thread Roman Leshchinskiy

On 04/05/2010, at 18:37, Christian Höner zu Siederdissen wrote:

 * Roman Leshchinskiy r...@cse.unsw.edu.au [04.05.2010 10:02]:
 On 04/05/2010, at 11:10, Christian Höner zu Siederdissen wrote:
 
 Here http://www.tbi.univie.ac.at/newpapers/Abstracts/98-06-009.ps.gz is
 a description of a parallel version of RNAfold.
 
 IIUC, this parallelises processing of each diagonal but computes the 
 diagonals one after another. Could you perhaps store each diagonal as a 
 separate (parallel) array? That would make things much simpler.
 
 That is no problem at all.

Really? So you could have a sequential list/array/whatever of flat parallel 
arrays, one per diagonal? And construct all elements of each diagonal in 
parallel? That would make parallelisation quite trivial...

Roman


___
Glasgow-haskell-users mailing list
Glasgow-haskell-users@haskell.org
http://www.haskell.org/mailman/listinfo/glasgow-haskell-users


  1   2   3   >