Re: [Haskell-cafe] Naive matrix multiplication with Accelerate

2012-12-04 Thread Alexander Solla
I don't mean to be blunt, but have you guys taken a course in linear
algebra?


Re: [Haskell-cafe] Naive matrix multiplication with Accelerate

2012-12-04 Thread Clark Gaebel
No. But that doesn't stop me from being curious about Accelerate. Might you
have a better explanation for what's happening here than Trevor's?

  - Clark



Re: [Haskell-cafe] Naive matrix multiplication with Accelerate

2012-12-04 Thread Alexander Solla
Well, an m x n matrix corresponds to a linear transformation whose image has
dimension at most min{m,n}.  In particular, this means that a 2x2 matrix
corresponds to a plane, a line, or the origin of 2-space, as a linear
subspace.  Which of those the matrix corresponds to depends on the matrix's
rank, which is the number of linearly independent columns (or rows) in
the matrix.

Do you really need to know /which/ plane or line a matrix corresponds to?
 If so, reduce it using Gaussian elimination and, if appropriate, compute
its eigenvectors or span.  Otherwise, think of it as a generic
plane/line/0-point.

Outer products illustrate more of these simple facts about linear algebra.
The product of an mx1 and a 1xn matrix is an mxn matrix with rank at most 1.
If you have trouble visualizing this, you are either missing those essential
facts (for the general picture of the product as a line or the origin), or
you need some computational details -- reducing the matrix using Gaussian
elimination and determining its span.
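
Concretely, a small worked example of that rank-1 fact (added here purely
for illustration):

\[
\begin{pmatrix} 1 \\ 2 \end{pmatrix}
\begin{pmatrix} 3 & 4 & 5 \end{pmatrix}
=
\begin{pmatrix} 3 & 4 & 5 \\ 6 & 8 & 10 \end{pmatrix}
\]

Every row is a multiple of (3 4 5) and every column is a multiple of (1 2)^T,
so the rank is 1: the product maps all of 3-space onto the line spanned by
(1 2)^T.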

As I said, I don't mean to be harsh, but playing with a vector algebra
package without understanding vectors is like playing with a calculator
without understanding multiplication.  You're better off learning what
multiplication represents first, before using a machine to do it fast.  So,
I can humbly recommend taking a course on the subject.  For example,
https://www.coursera.org/course/matrix



Re: [Haskell-cafe] Naive matrix multiplication with Accelerate

2012-12-04 Thread Alexander Solla
Sorry, I didn't realize that course was offered next year.  I read through
Matrices and Linear Algebra when I was in high school, and used Friedberg,
Insel, and Spence's Linear Algebra in college.



Re: [Haskell-cafe] Naive matrix multiplication with Accelerate

2012-12-04 Thread Trevor L. McDonell
Thanks for the insight, Alex.
In case you are concerned, Accelerate is not in fact a linear algebra library.



Re: [Haskell-cafe] Naive matrix multiplication with Accelerate

2012-12-04 Thread Clark Gaebel
Thanks! I'll read through Matrices and Linear Algebra over the next few
days.

  - Clark



Re: [Haskell-cafe] Naive matrix multiplication with Accelerate

2012-12-03 Thread Clark Gaebel
Ah. I see now. Silly Haskell making inefficient algorithms hard to write
and efficient ones easy. It's actually kind of annoying when learning, but
probably for the best.

Is there a good write-up of the algorithm you're using somewhere? The Repa
paper was very brief in its explanation, and I'm having trouble visualizing
the mapping of the 2D matrices into 3 dimensions.

  - Clark



Re: [Haskell-cafe] Naive matrix multiplication with Accelerate

2012-12-03 Thread Trevor L. McDonell
As far as I am aware, the only description is in the Repa paper. If you are
right, it really should be explained properly somewhere…

As a simpler example, here is the outer product of two vectors [1].

vvProd :: (IsNum e, Elt e) => Acc (Vector e) -> Acc (Vector e) -> Acc (Matrix e)
vvProd xs ys = A.zipWith (*) xsRepl ysRepl
  where
    n       = A.size xs
    m       = A.size ys

    xsRepl  = A.replicate (lift (Z :. All :. m  )) xs
    ysRepl  = A.replicate (lift (Z :. n   :. All)) ys

If we then `A.fold (+) 0` the matrix, it would reduce along each row producing 
a vector. So the first element of that vector is going to be calculated as 
(xs[0] * ys[0] + xs[0] * ys[1] +  … xs[0] * ys[m-1]). That's the idea we want 
for our matrix multiplication … but I agree, it is difficult for me to 
visualise as well.
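
For concreteness, here is a minimal usage sketch (illustrative only; it
assumes the vvProd above, the Matrix alias from my earlier mail, and the
reference interpreter):

import Data.Array.Accelerate as A
import Data.Array.Accelerate.Interpreter ( run )

-- Outer product of [1,2,3] and [1,1,1,1], folded along each row:
-- element i of the result is xs[i] * (ys[0] + ys[1] + ys[2] + ys[3]).
test :: Vector Float
test = run . A.fold (+) 0 $ vvProd (use xs) (use ys)
  where
    xs = fromList (Z :. 3) [1,2,3]   :: Vector Float
    ys = fromList (Z :. 4) [1,1,1,1] :: Vector Float
-- yields [4.0, 8.0, 12.0]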

I do the same sort of trick with the n-body demo to get all n^2 particle 
interactions.

-Trev


 [1]: http://en.wikipedia.org/wiki/Outer_product#Vector_multiplication




Re: [Haskell-cafe] Naive matrix multiplication with Accelerate

2012-12-02 Thread Trevor L. McDonell
Hi Clark,

The trick is that most accelerate operations work over multidimensional arrays, 
so you can still get around the fact that we are limited to flat 
data-parallelism only.

Here is matrix multiplication in Accelerate, lifted from the first Repa paper 
[1].


import Data.Array.Accelerate as A

type Matrix a = Array DIM2 a

matMul :: (IsNum e, Elt e) => Acc (Matrix e) -> Acc (Matrix e) -> Acc (Matrix e)
matMul arr brr
  = A.fold (+) 0
  $ A.zipWith (*) arrRepl brrRepl
  where
    Z :. rowsA :. _ = unlift (shape arr)  :: Z :. Exp Int :. Exp Int
    Z :. _ :. colsB = unlift (shape brr)  :: Z :. Exp Int :. Exp Int

    arrRepl = A.replicate (lift $ Z :. All   :. colsB :. All) arr
    brrRepl = A.replicate (lift $ Z :. rowsA :. All   :. All) (A.transpose brr)


If you use github sources rather than the hackage package, those intermediate 
replicates will get fused away.
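
For what it's worth, a minimal usage sketch (illustrative only, not part of
the code above), multiplying a 2x3 by a 3x2 matrix with the reference
interpreter:

import Data.Array.Accelerate.Interpreter ( run )

example :: Matrix Float
example = run $ matMul (use a) (use b)
  where
    a = fromList (Z :. 2 :. 3) [1,2,3,4,5,6] :: Matrix Float  -- [[1,2,3],[4,5,6]]
    b = fromList (Z :. 3 :. 2) [1,2,3,4,5,6] :: Matrix Float  -- [[1,2],[3,4],[5,6]]
-- yields the 2x2 matrix [[22,28],[49,64]]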


Cheers,
-Trev

 [1] http://www.cse.unsw.edu.au/~chak/papers/KCLPL10.html




On 03/12/2012, at 5:07 PM, Clark Gaebel cgae...@uwaterloo.ca wrote:

 Hello cafe,
 
 I've recently started learning about CUDA and heterogeneous programming,
 and have been using accelerate [1] to help me out. Right now, I'm running
 into trouble in that I can't call parallel code from sequential code.
 Turns out GPUs aren't exactly like Repa =P.
 
 Here's what I have so far:
 
 import qualified Data.Array.Accelerate as A
 import Data.Array.Accelerate ( (:.)(..)
                              , Acc
                              , Vector
                              , Scalar
                              , Elt
                              , fold
                              , slice
                              , constant
                              , Array
                              , Z(..), DIM1, DIM2
                              , fromList
                              , All(..)
                              , generate
                              , lift, unlift
                              , shape
                              )
 
 dotP :: (Num a, Elt a) => Acc (Vector a) -> Acc (Vector a) -> Acc (Scalar a)
 dotP xs ys = fold (+) 0 $ A.zipWith (*) xs ys
 
 type Matrix a = Array DIM2 a
 
 getRow :: Elt a => Int -> Acc (Matrix a) -> Acc (Vector a)
 getRow n mat = slice mat . constant $ Z :. n :. All
 
 -- Naive matrix multiplication:
 --
 -- index (i, j) is equal to the ith row of 'a' `dot` the jth row of 'b'
 matMul :: A.Acc (Matrix Double) -> A.Acc (Matrix Double) -> A.Acc (Matrix Double)
 matMul a b' = A.generate (constant $ Z :. nrows :. ncols) $
     \ix ->
       let (Z :. i :. j) = unlift ix
       in  getRow i a `dotP` getRow j b
   where
     b = A.transpose b' -- I assume row indexing is faster than column indexing...
     (Z :. nrows :.   _  ) = unlift $ shape a
     (Z :.   _   :. ncols) = unlift $ shape b
 
 
 This, of course, gives me errors right now because I'm calling getRow and 
 dotP from within the generation function, which expects Exp[ression]s, not 
 Acc[elerated computation]s.
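 
 As a minimal illustration of that Exp-versus-Acc split (a hypothetical
 snippet, for illustration only): the function given to generate runs once
 per element and may only build scalar Exp terms, so something like this
 typechecks, while an array-level call such as dotP in its place does not:
 
 squares :: Acc (Vector Int)
 squares = generate (constant (Z :. 10 :: DIM1))
                    (\ix -> let Z :. i = unlift ix :: Z :. A.Exp Int
                            in  i * i)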
 
 So maybe I need to replace that line with an inner for loop? Is there an easy 
 way to do that with Accelerate?
 
 Thanks for your help,
   - Clark
 
 [1] http://hackage.haskell.org/package/accelerate