Hi again. This is the system information: - Ubuntu 12.04 32-bit - Intel® Core™2 Duo CPU T5270 @ 1.40GHz × 2 - 2.9 GiB RAM
GHC version: - GHC 7.4.1 DPH libraries: - dph-base-0.6.1.1 - (dph-lifted-base-0.6.1.1) - (dph-lifted-vseg-0.6.1.2) - (dph-prim-interface-0.6.1.1) - (dph-prim-par-0.6.1.1) - (dph-prim-seq-0.6.1.1) Compilation flags: I'm using two combinations of flags, taken from different sources. In both cases results are identical: - From https://github.com/ghc/packages-dph: -rtsopts -threaded -fllvm -optlo-O3 -Odph -fcpr-off -fno-liberate-case -package dph-lifted-vseg - From dph-examples: -rtsopts -threaded -fllvm -Odph -package dph-lifted-vseg -fcpr-off -fno-liberate-case -fsimpl-tick-factor=1000 Execution flags: +RTS -N Tests: Computing the product of two 400*400 matrices takes 6.037993 seconds. Computing the product of two 600*600 matrices yields "out of memory (requested 1728053248 bytes)". DPH code: ------------------------------------------------------------------------------------- {-# NOINLINE matMult_wrapper #-} matMult_wrapper :: Matrix_wrapper -> Matrix_wrapper -> Matrix_wrapper matMult_wrapper mA mB = toPArrayP (mapP toPArrayP (matMult (fromNestedPArrayP mA) (fromNestedPArrayP mB))) matMult :: Matrix -> Matrix -> Matrix matMult mA mB = mapP (\row -> mapP (\col -> dotp row col) mB) mA -- I removed the call to transposeP, so It's no longer an actual matrix product dotp :: MVector -> MVector -> MMultType dotp row col = D.sumP (zipWithP (D.*) row col) ------------------------------------------------------------------------------------- I'm attaching the files with the full example code If there is any other information needed, please let me know Any help is very appreciated On Tue, Jul 10, 2012 at 9:51 AM, Mauro Blanco <blanco...@gmail.com> wrote: > Thanks for both answers > > I have used repa with the newer interface for the same example, but I > wanted to have another example using DPH. I know repa is more suited for > regular representations, but I wanted to express the same program in DPH > where I don´t have to worry of nested parallel computation. > > The transposeP did not seem to be the problem as only > executing transposeP on "big" matrices generated no memory issues. But, > something like this (on square matrices) still have memory problems: > matMult :: Matrix -> Matrix -> Matrix > matMult mA mB = mapP (\row -> mapP (\col -> dotp row col) *mB*) mA > > dotp :: MVector -> MVector -> MMultType > dotp row col = D.sumP (zipWithP (D.*) row col) > > Later today (or tomorrow) I will post exact OS, GHC and libraries version > as the command line options and execution information on the simplified > example. > > Thanks again > > > On Tue, Jul 10, 2012 at 8:43 AM, Manuel M T Chakravarty < > c...@cse.unsw.edu.au> wrote: > >> Firstly, especially when you are talking about performance, please >> provided detailed information on (a) the versions of the compiler and >> libraries that you used and (b) of the command line options that you used >> for compilation. >> >> Secondly, your function 'transposeP' doesn't make for a good nested >> data-parallel program. I haven't looked at the generated code, but I >> wouldn't be surprised if it doesn't optimise very well. >> >> The core benefit of nested data parallelism is that the sub-arrays in a >> nested array of arrays can be of varying size leading to irregular >> parallelism. However, that flexibility comes at a price, namely that it is >> a fairly inefficient representation for *rectangular arrays*, such as >> regular two-dimensional matrices (as in your example). It shouldn't be >> quite as inefficient as what you report, but it will never be competitive >> with a dedicated regular representation. >> >> Hence, for code, such as yours, we recommend to use our Repa library: >> http://hackage.haskell.org/package/repa >> >> It generates very fast code for regular array problems, see also >> http://www.cse.unsw.edu.au/~chak/papers/LCKP12.html >> >> Manuel >> >> >> mblanco <blanco...@gmail.com>: >> >> Hi, I'm trying to implement a matrix product example using DPH. This is >> the code: >> ------------------------------**------------------------------** >> ------------------------------**------------------------- >> type MMultType = Double >> type Matrix = [:[:MMultType:]:] >> type MVector = [:MMultType:] >> type Matrix_wrapper = PArray (PArray MMultType) >> >> {-# NOINLINE matMult_wrapper #-} >> matMult_wrapper :: Matrix_wrapper -> Matrix_wrapper -> Matrix_wrapper >> matMult_wrapper mA mB = toPArrayP (mapP toPArrayP (matMult >> (fromNestedPArrayP mA) (fromNestedPArrayP mB))) >> >> matMult :: Matrix -> Matrix -> Matrix >> matMult mA mB = mapP (\row -> mapP (\col -> dotp row col) (transposeP >> mB)) mA >> >> dotp :: MVector -> MVector -> MMultType >> dotp row col = D.sumP (zipWithP (D.*) row col) >> >> transposeP :: Matrix -> Matrix >> transposeP m = >> let >> h = lengthP m >> w = lengthP (m !: 0) >> rh = I.enumFromToP 0 (h I.- 1) >> rw = I.enumFromToP 0 (w I.- 1) >> in >> if h I.== 0 then [: :] >> else mapP (\y -> mapP (\x -> m !: x !: y) rh) rw >> ------------------------------**------------------------------** >> ------------------------------**------------------------- >> >> My problem is at execution time, on matrices of size 300*300 the program >> does finish (although it is very slow), but on 700*700 it consumes GBs of >> RAM until the process is aborted. >> >> In the paper "Work Efficient Higher-Order Vectorisation" it is explained >> that a work complexity problem (wich involved unnecesary array replication) >> was recently treated. So at first I thought the code implementation related >> to the paper had not been uploaded to hackage. But as I understand it must >> have been, as that seems to be the motive of the "dph-lifted-vseg" package. >> >> Does anybody notice the problem with the example or if the problem is >> related to the subject treated in the paper? >> >> Thanks in advance! >> _______________________________________________ >> Haskell-Cafe mailing list >> Haskell-Cafe@haskell.org >> http://www.haskell.org/mailman/listinfo/haskell-cafe >> >> >> > > > -- > Mauro Blanco > > -- Mauro Blanco
DPH_mmult.hs
Description: Binary data
DPH_mmult_wrapper.hs
Description: Binary data
Utils_mmult.hs
Description: Binary data
_______________________________________________ Haskell-Cafe mailing list Haskell-Cafe@haskell.org http://www.haskell.org/mailman/listinfo/haskell-cafe