Re: [Haskell-cafe] Naive matrix multiplication with Accelerate
I don't mean to be blunt, but have you guys taken a course in linear algebra?
Re: [Haskell-cafe] Naive matrix multiplication with Accelerate
No. But that doesn't stop me from being curious about Accelerate. Might you have a better explanation for what's happening here than Trevor's?

- Clark
Re: [Haskell-cafe] Naive matrix multiplication with Accelerate
Well, an m x n matrix corresponds to a linear transformation in at most min{m,n} dimensions. In particular, this means that a 2x2 matrix corresponds to a plane, a line, or the origin of 2-space, as a linear subspace. Which of those the matrix corresponds to depends on the matrix's rank, which is the number of linearly independent columns (or rows) in the matrix.

Do you really need to know /which/ plane or line a matrix corresponds to? If so, reduce it using Gaussian elimination and, if appropriate, compute its eigenvectors or span. Otherwise, think of it as a generic plane/line/0-point.

Outer products represent more of these simple facts about linear algebra: the product of an m x 1 matrix and a 1 x n matrix is an m x n matrix with rank at most 1. Trouble visualizing this means either that you are missing these essential facts (for the general picture of the product as a line or the origin), or that you actually need the computational details -- reducing the matrix using Gaussian elimination and determining its span.

As I said, I don't mean to be harsh, but playing with a vector algebra package without understanding vectors is like playing with a calculator without understanding multiplication. You're better off learning what multiplication represents first, before using a machine to do it fast. So, I can humbly recommend taking a course on the subject. For example, https://www.coursera.org/course/matrix
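[Editor's note: as a concrete illustration of the rank-at-most-one claim above, here is a minimal sketch in plain Haskell lists, independent of Accelerate; `outer` is a hypothetical helper for this example, not a library function.]

outer :: Num a => [a] -> [a] -> [[a]]
outer xs ys = [ [ x * y | y <- ys ] | x <- xs ]

-- > outer [1,2,3] [10,20]
-- [[10,20],[20,40],[30,60]]
--
-- Every row is a scalar multiple of ys (here 1, 2, and 3 times
-- [10,20]), so the rows span at most a single line: rank <= 1.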
Re: [Haskell-cafe] Naive matrix multiplication with Accelerate
Sorry, I didn't realize that course was offered next year. I read through Matrices and Linear Algebra when I was in high school, and used Friedberg, Insel, and Spence's Linear Algebra in college.
Re: [Haskell-cafe] Naive matrix multiplication with Accelerate
Thanks for the insight, Alex. In case you are concerned, Accelerate is not in fact a linear algebra library.
Re: [Haskell-cafe] Naive matrix multiplication with Accelerate
Thanks! I'll read through Matrices and Linear Algebra over the next few days.

- Clark
Re: [Haskell-cafe] Naive matrix multiplication with Accelerate
Ah. I see now. Silly Haskell, making inefficient algorithms hard to write and efficient ones easy. It's actually kind of annoying when learning, but probably for the best.

Is there a good write-up of the algorithm you're using somewhere? The Repa paper was very brief in its explanation, and I'm having trouble visualizing the mapping of the 2D matrices into 3 dimensions.

- Clark
Re: [Haskell-cafe] Naive matrix multiplication with Accelerate
As far as I am aware, the only description is in the Repa paper. I think you are right, though: it really should be explained properly somewhere…

As a simpler example, here is the outer product of two vectors [1]:

vvProd :: (IsNum e, Elt e) => Acc (Vector e) -> Acc (Vector e) -> Acc (Matrix e)
vvProd xs ys = A.zipWith (*) xsRepl ysRepl
  where
    n      = A.size xs
    m      = A.size ys
    xsRepl = A.replicate (lift (Z :. All :. m  )) xs
    ysRepl = A.replicate (lift (Z :. n   :. All)) ys

If we then `A.fold (+) 0` the matrix, it would reduce along each row, producing a vector. So the first element of that vector is going to be calculated as (xs[0] * ys[0] + xs[0] * ys[1] + … + xs[0] * ys[m-1]). That's the idea we want for our matrix multiplication… but I agree, it is difficult for me to visualise as well. I do the same sort of trick in the n-body demo to get all n^2 particle interactions.

-Trev

[1]: http://en.wikipedia.org/wiki/Outer_product#Vector_multiplication
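[Editor's note: to make the reduction concrete, here is a small hypothetical usage sketch, not from the original message. It assumes vvProd (and the Matrix alias) as defined above, plus the standard fromList, use, and interpreter run entry points of the accelerate package of that era.]

import Data.Array.Accelerate as A
import Data.Array.Accelerate.Interpreter ( run )

main :: IO ()
main = do
  let xs = fromList (Z :. 2) [1, 2]        :: Vector Float
      ys = fromList (Z :. 3) [10, 20, 30]  :: Vector Float
  -- The full 2x3 outer product: [10,20,30, 20,40,60]
  print $ run (vvProd (use xs) (use ys))
  -- Folding sums each row away: [60,120]. The first element is
  -- xs[0]*ys[0] + xs[0]*ys[1] + xs[0]*ys[2], exactly as described.
  print $ run (A.fold (+) 0 (vvProd (use xs) (use ys)))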
Re: [Haskell-cafe] Naive matrix multiplication with Accelerate
Hi Clark,

The trick is that most Accelerate operations work over multidimensional arrays, so you can still get around the fact that we are limited to flat data-parallelism only.

Here is matrix multiplication in Accelerate, lifted from the first Repa paper [1]:

import Data.Array.Accelerate as A

type Matrix a = Array DIM2 a

matMul :: (IsNum e, Elt e) => Acc (Matrix e) -> Acc (Matrix e) -> Acc (Matrix e)
matMul arr brr = A.fold (+) 0 $ A.zipWith (*) arrRepl brrRepl
  where
    Z :. rowsA :. _     = unlift (shape arr) :: Z :. Exp Int :. Exp Int
    Z :. _     :. colsB = unlift (shape brr) :: Z :. Exp Int :. Exp Int
    arrRepl = A.replicate (lift $ Z :. All   :. colsB :. All) arr
    brrRepl = A.replicate (lift $ Z :. rowsA :. All   :. All) (A.transpose brr)

If you use the github sources rather than the hackage package, those intermediate replicates will get fused away.

Cheers,
-Trev

[1] http://www.cse.unsw.edu.au/~chak/papers/KCLPL10.html

On 03/12/2012, at 5:07 PM, Clark Gaebel cgae...@uwaterloo.ca wrote:

Hello cafe,

I've recently started learning about CUDA and heterogeneous programming, and have been using accelerate [1] to help me out. Right now, I'm running into trouble in that I can't call parallel code from sequential code. Turns out GPUs aren't exactly like Repa =P.

Here's what I have so far:

import qualified Data.Array.Accelerate as A
import Data.Array.Accelerate
  ( (:.)(..), Acc, Vector, Scalar, Elt, fold, slice, constant
  , Array, Z(..), DIM1, DIM2, fromList, All(..), generate
  , lift, unlift, shape
  )
import Data.Array.Accelerate.Interpreter ( run )

dotP :: (Num a, Elt a) => Acc (Vector a) -> Acc (Vector a) -> Acc (Scalar a)
dotP xs ys = fold (+) 0 $ A.zipWith (*) xs ys

type Matrix a = Array DIM2 a

getRow :: Elt a => Int -> Acc (Matrix a) -> Acc (Vector a)
getRow n mat = slice mat . constant $ Z :. n :. All

-- Naive matrix multiplication:
--
-- index (i, j) is equal to the ith row of 'a' `dot` the jth row of 'b'
matMul :: A.Acc (Matrix Double) -> A.Acc (Matrix Double) -> A.Acc (Matrix Double)
matMul a b' = A.generate (constant $ Z :. nrows :. ncols) $ \ix ->
                let (Z :. i :. j) = unlift ix
                in  getRow i a `dotP` getRow j b
  where
    b = A.transpose b' -- I assume row indexing is faster than column indexing...

    (Z :. nrows :. _    ) = unlift $ shape a
    (Z :. _     :. ncols) = unlift $ shape b

This, of course, gives me errors right now because I'm calling getRow and dotP from within the generation function, which expects Exp[ression]s, not Acc[elerated computation]s. So maybe I need to replace that line with an inner for loop? Is there an easy way to do that with Accelerate?

Thanks for your help,
- Clark

[1] http://hackage.haskell.org/package/accelerate
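[Editor's note: for reference, here is a hypothetical end-to-end driver for Trevor's matMul, not part of the original thread. It assumes the 2012-era accelerate API and repeats matMul only so the sketch is self-contained; the added shape comments try to spell out the 2D-into-3D mapping Clark asked about.]

import Data.Array.Accelerate as A
import Data.Array.Accelerate.Interpreter ( run )

type Matrix a = Array DIM2 a

-- matMul exactly as in Trevor's message, with shape annotations.
matMul :: (IsNum e, Elt e) => Acc (Matrix e) -> Acc (Matrix e) -> Acc (Matrix e)
matMul arr brr = A.fold (+) 0 $ A.zipWith (*) arrRepl brrRepl
  where
    Z :. rowsA :. _     = unlift (shape arr) :: Z :. Exp Int :. Exp Int
    Z :. _     :. colsB = unlift (shape brr) :: Z :. Exp Int :. Exp Int
    -- For a 2x3 `arr` and a 3x4 `brr`, both replicates have shape
    -- Z :. 2 :. 4 :. 3. Index (i,j,k) of arrRepl holds arr[i,k], and
    -- index (i,j,k) of brrRepl holds brr[k,j], so zipWith multiplies
    -- row i of arr pointwise with column j of brr, and the fold sums
    -- away the innermost (k) axis, leaving the 2x4 product.
    arrRepl = A.replicate (lift $ Z :. All   :. colsB :. All) arr
    brrRepl = A.replicate (lift $ Z :. rowsA :. All   :. All) (A.transpose brr)

main :: IO ()
main = do
  let a = fromList (Z :. 2 :. 3) [1..6]  :: Matrix Float
      b = fromList (Z :. 3 :. 4) [1..12] :: Matrix Float
  -- Expected 2x4 result: [38,44,50,56, 83,98,113,128]
  print $ run (matMul (use a) (use b))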