On 05/01/2013 16:23, Volker Braun wrote:
Fundamentally, the Xeon Phi programming model is not really that much
different from OpenCL/Cuda. You send data to the coprocessor card, run
some code there, and pull back the result to the host CPU. It doesn't
speed up anything that is not specifically targeted at the coprocessor
card.

If you want to use it, you first of all need a problem that is
sufficiently parallelizable. Write Xeon Phi code in C/C++, compile it
with the special compiler, wrap it into a shared library, load it into
Cython/Python.
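
(A minimal, hedged sketch of that workflow, using the Intel compiler's offload pragmas; the function names scale_on_mic and phi_scale are made up for illustration, and the exact pragma spelling may vary between compiler versions.)

    // Sketch only: a trivial kernel marked for the coprocessor.
    #include <cstddef>

    __attribute__((target(mic)))                     // compile this function for the card too
    void scale_on_mic(double *x, std::size_t n, double alpha)
    {
        #pragma omp parallel for                     // keep the many hardware threads busy
        for (std::size_t i = 0; i < n; ++i)
            x[i] *= alpha;                           // stride-1 loop the compiler can vectorize
    }

    // extern "C" entry point so the shared library can be loaded from
    // Cython or ctypes on the host.
    extern "C" void phi_scale(double *x, std::size_t n, double alpha)
    {
        // Ship x to the card, run the kernel there, pull the result back.
        #pragma offload target(mic) inout(x : length(n))
        scale_on_mic(x, n, alpha);
    }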

The Intel MKL basically does that, so if we get around to implementing
the proposal that I wrote earlier, then at least linear algebra would be
sped up on Stampede.
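
(A hedged aside, not a definitive recipe: MKL's "automatic offload" mode can push sufficiently large BLAS calls to the card without code changes; if memory serves it is toggled by the MKL_MIC_ENABLE environment variable. An ordinary DGEMM call like the following then becomes a candidate for offload; the matrix size is purely illustrative.)

    // Plain host-side DGEMM; with automatic offload enabled in the
    // environment (an assumption about the setup, check the MKL docs),
    // large enough calls may run on the coprocessor transparently.
    #include <mkl.h>
    #include <vector>

    int main()
    {
        const int n = 4096;                          // illustrative size only
        std::vector<double> A(n * n, 1.0), B(n * n, 2.0), C(n * n, 0.0);
        cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                    n, n, n, 1.0, A.data(), n, B.data(), n, 0.0, C.data(), n);
        return 0;
    }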

I have a little experience with the Xeon Phi (a.k.a. MIC): a pure C++ code of about 25,000 statements was ported to the Xeon Phi in 10 minutes: once the code works on a classical Intel machine with the Intel compiler, it works on the Xeon Phi (we have a joint project with people from Intel to port numerical codes to this platform). This is very nice and impressive. Also, complicated data structures can be used, so at first it seems very nice and easy.
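
(For what it is worth, a hedged sketch of the build step, assuming the Intel compiler; the -mmic flag produces a native binary for the card, and the exact flags may differ between compiler versions.)

    icpc -O3 -mmic -o mycode.mic mycode.cpp    # cross-compile the unchanged sources for the card
    # copy mycode.mic to the coprocessor and run it there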

But the devil is waiting for you: getting good performance is much more difficult, as everyone can imagine. My code is built with the TBB library, which seems to be a (the?) good choice for this architecture; on the first execution, the code ran 2 times slower than on a classical Intel machine (Sandy Bridge). The problem with performance is that 1) you must be sure to permanently have more than 60 threads available for running, and 2) you absolutely must use the 512-bit vector unit, and this is not so easy: what can be vectorized in Sage's libraries? OK, the sources will remain in C and C++, but vectorizing often means rewriting a large part of the code (see the sketch below).
Remember also that the Xeon Phi has only 8 GB of RAM.
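
To make points 1) and 2) concrete, here is a hedged sketch (not taken from my code; the names are mine) of the shape a TBB kernel should have on the card: an outer tbb::parallel_for to keep the 60+ cores (roughly 240 hardware threads) busy, and a plain stride-1 inner loop the Intel compiler can map onto the 512-bit unit.

    #include <tbb/parallel_for.h>
    #include <tbb/blocked_range.h>
    #include <cstddef>
    #include <vector>

    // y += a*x: TBB handles the threading, and the inner loop stays
    // trivially vectorizable (8 doubles per 512-bit register).
    void axpy(std::vector<double>& y, const std::vector<double>& x, double a)
    {
        tbb::parallel_for(tbb::blocked_range<std::size_t>(0, y.size(), 1024),
            [&](const tbb::blocked_range<std::size_t>& r) {
                for (std::size_t i = r.begin(); i != r.end(); ++i)
                    y[i] += a * x[i];
            });
    }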

One of the ports we tried was a classical ODE solver (Radau5) recoded in C++: the code evaluates the Jacobian matrix of some f: R^n -> R^n by finite differences. This can be vectorized and is not too difficult, but it works better if n is a multiple of 8 (because 512 = 8 x 64, i.e. eight 64-bit doubles per vector register).
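
As an illustration, a hedged sketch of such a finite-difference Jacobian, built column by column; the names are mine, not those of the actual Radau5 port.

    #include <cstddef>
    #include <vector>

    // J(:,j) ~= (f(x + h*e_j) - f(x)) / h, J stored column-major (n*n entries).
    // The inner loop over i is stride-1, so it is the natural target for the
    // 512-bit unit; with n a multiple of 8 each column fills whole registers.
    template <class F>   // F: void f(const std::vector<double>& x, std::vector<double>& fx)
    void jacobian_fd(F f, std::vector<double>& x, std::vector<double>& J, double h)
    {
        const std::size_t n = x.size();
        std::vector<double> f0(n), f1(n);
        f(x, f0);                                   // unperturbed evaluation
        for (std::size_t j = 0; j < n; ++j) {
            const double xj = x[j];
            x[j] = xj + h;
            f(x, f1);                               // perturbed evaluation
            x[j] = xj;
            for (std::size_t i = 0; i < n; ++i)     // one column per perturbation
                J[j * n + i] = (f1[i] - f0[i]) / h;
        }
    }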

But altogether, developing on this architecture is much more classical than developing with CUDA: for the old guys like me, it is a bit like programming on a Cray machine in 1990... and this is quite nice.

If there is some project to do something around Sage and the Xeon Phi, I am interested (there will be two of us this year). But which project?

t.d.
