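J does not do matrix multiplication this way (row-by-column). Instead, it works a row at a time: each row of the result is built up as a sum of rows of the right argument, scaled by the elements of one row of the left argument. See "Inner Product -- An Old/New Problem" <http://www.jsoftware.com/papers/innerproduct/ip.htm>.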
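To make the difference between the two loop orders concrete, here is a minimal Python sketch. It is only an illustration of the idea, not J's actual implementation; the function names and the list-of-lists matrix layout are my own assumptions.

```python
def matmul_row_by_column(a, b):
    """Textbook order: each result atom is row i of a dotted with column j of b."""
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            # Walking down column j of b touches a different row of b at every step.
            c[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return c


def matmul_row_at_a_time(a, b):
    """Row-at-a-time order: result row i is the sum of a[i][p] times row p of b."""
    n, k, m = len(a), len(b), len(b[0])
    c = []
    for i in range(n):
        row = [0] * m
        for p in range(k):
            scale = a[i][p]
            b_row = b[p]  # one contiguous row of b, read left to right
            for j in range(m):
                row[j] += scale * b_row[j]
        c.append(row)
    return c


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(matmul_row_by_column(a, b))
print(matmul_row_at_a_time(a, b))
```

Both functions do the same arithmetic; only the order of memory access differs. In the second version the inner loop never needs a whole column of B, only one row of B at a time.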

On Tue, Jun 21, 2016 at 7:11 PM, bill lam <[email protected]> wrote:

> Actually vector addition or multiplication are exceptional,
> since memory reference is already localised for scalar
> functions. For matrix multiplication, it needs to ensure a row
> from A and a column from B can be fit into the local memory of a
> core. My point is GPU needs fine tuning case by case.
>
> Tue, 21 Jun 2016, Henry Rich wrote:
> > Actually, I think matrix multiplication IS exceptional: at least it's
> > different from vector addition or multiplication. The operation takes
> > O(n^3) arithmetic operations to produce a result of O(n^2) atoms, so even if
> > the data-transfer is relatively expensive, a big reduction in time spent on
> > arithmetic may outweigh the cost of data management. For n>100 I'd expect
> > the GPU to be a winner.
> >
> > Henry Rich
> >
> > On 6/21/2016 9:43 PM, bill lam wrote:
> > > The main difficulty in using GPU is memory, not just memory
> > > bandwidth, but also how to pipe data into GPU and fine tuning
> > > block size so that memory reference can be localized within
> > > each core. matrix multiplication is no exception.
> > >
> > > Wed, 22 Jun 2016, JGeneral wrote:
> > > > in my tests with Arrayfire (bindings here:
> > > > https://github.com/Pascal-J/Jfire )
> > > >
> > > > what I found annoying was the JIT compilation step. I think Futhark
> > > > does away with this step, or at least provides a saveable version.
> > > >
> > > > all recent Intel/AMD chips have decent built in GPUs with low latency.
> > > >
> > > > Even on faster dedicated cards though, you can keep data/results
> > > > there if there is further processing to do.
> > > >
> > > > things like matrix multiplication and other similar tasks are 10x to
> > > > 100x faster (iirc) including the round trip back to cpu.
> > > >
> > > > ----- Original Message -----
> > > > From: bill lam <[email protected]>
> > > > To: 'Pascal Jasmin' via General <[email protected]>
> > > > Sent: Tuesday, June 21, 2016 8:26 PM
> > > > Subject: Re: [Jgeneral] GPU APL compiler work
> > > >
> > > > IMO benefit of using GPU for implementing APL (and J) primitives
> > > > is questionable. Most primitives are simple and the efficiency
> > > > of APL/J comes from processing large arrays. The time needed to
> > > > read/write GPU memory for large array is not justified
> > > > unless the job is highly looped eg, encoding/decoding jpeg.
> > > >
> > > > Mon, 20 Jun 2016, JGeneral wrote:
> > > > > Interesting recent projects,
> > > > >
> > > > > TAIL - typed array intermediate language
> > > > > http://www.elsman.com/pdf/array14_final.pdf
> > > > >
> > > > > uses structures very similar to J's internal noun format. (all of
> > > > > the items are the same anyway, though it perhaps only has int and
> > > > > double data types)
> > > > >
> > > > > Semantics for core operations are similar to J (take with negative
> > > > > index takes from the end)
> > > > >
> > > > > used with a SML apl to TAIL compiler
> > > > >
> > > > > https://github.com/melsman/apltail/
> > > > >
> > > > > A more interesting project is the Futhark language, and its
> > > > > leveraging of the above 2 projects to target GPUs, and extends
> > > > > datatypes to char, bool, tuples.
> > > > >
> > > > > Futhark feels higher level and cleaner than TAIL.
> > > > >
> > > > > spec paper: http://futhark-lang.org/publications/fhpc16.pdf
> > > > >
> > > > > more general overview/benchmark/example site:
> > > > >
> > > > > http://futhark-lang.org/index.html
> > > > >
> > > > > pretty much every link there is interesting.

----------------------------------------------------------------------
For information about J forums see http://www.jsoftware.com/forums.htm
