Peter Skomoroch wrote:
> This reminds me of a similar issue I had.  What approaches do you take for
> large dense matrix multiplication in MPI, when the matrices are too large to
> fit into cluster memory?  If I hack up something to cache intermediate
> results to disk, the IO seems to drag everything to a halt and I'm looking
> for a better solution.  I'd like to use some libraries like PETSc, but how
> would you work around memory limitations like this (short of building a
> bigger cluster)?

Hi Peter:

Disk has a latency of 10^-3 seconds or so, and a bandwidth of 10^7 to 10^8 bytes/second. Compare that to physical RAM: latencies of 10^-7 seconds or less, and bandwidths of 10^9 to 10^10 bytes/second.

If you are going to do disk IO, pay that latency cost once for many pages, not once per page with seeks.

As with other streaming calculations, you will likely need some sort of double buffering. That said, disk IO is not really the answer here.
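To make the "pay the latency once, stream many pages" idea concrete, here is a minimal Python sketch of a panel-streamed multiply. The names `panel_matmul` and `read_panel` are hypothetical; a real out-of-core reader would issue one seek followed by one large sequential read per panel, and double-buffer the next panel while computing on the current one.

```python
# Sketch: out-of-core C = A*B, streaming A from "disk" in large row panels
# so the seek/latency cost is paid once per panel, not once per element.

def panel_matmul(read_panel, n_rows, panel_rows, B):
    """Yield rows of C = A*B; A arrives panel_rows rows at a time."""
    n_inner = len(B)          # A is n_rows x n_inner, B is n_inner x n_cols
    n_cols = len(B[0])
    for start in range(0, n_rows, panel_rows):
        # One "big sequential read" per panel.
        for a_row in read_panel(start, min(panel_rows, n_rows - start)):
            yield [sum(a_row[k] * B[k][j] for k in range(n_inner))
                   for j in range(n_cols)]

# Toy stand-in for disk: the whole matrix in a list. A real reader would
# do one seek followed by a large sequential read per panel.
A = [[1, 2], [3, 4], [5, 6]]
B = [[7, 8], [9, 10]]
reader = lambda start, count: A[start:start + count]
C = list(panel_matmul(reader, 3, 2, B))
# C == [[25, 28], [57, 64], [89, 100]]
```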

20+ years ago I worked on a large dense Markov matrix calculation where, after computing the relevant matrix elements and using them in the calculation, I would throw them away. Recomputing was cheaper (less time consuming) than spilling the elements to disk and then trying to recover them later. Then again, this was an IBM 3090 VF 180 ... so ...
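As a minimal sketch of that recompute-rather-than-store approach: a matrix-free product in which each element is regenerated from a formula when needed, so the matrix is never materialized. Here `element` is a hypothetical stand-in for whatever cheap-to-recompute expression defines your matrix.

```python
# Sketch: matrix-free y = A*x, with a(i, j) recomputed on demand instead
# of being spilled to disk and read back.

def element(i, j):
    # Hypothetical generator: a simple tridiagonal rule for illustration.
    return 2.0 if i == j else (1.0 if abs(i - j) == 1 else 0.0)

def matfree_matvec(n, x):
    """y = A*x with A never stored; elements recomputed per use."""
    return [sum(element(i, j) * x[j] for j in range(n)) for i in range(n)]

y = matfree_matvec(4, [1.0, 1.0, 1.0, 1.0])
# y == [3.0, 4.0, 4.0, 3.0]
```

This only wins when `element` is cheaper than the disk round-trip, which was exactly the trade-off above.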

Since you are doing matrix multiplication, I might suggest looking at the Golub and Van Loan bible, Matrix Computations, for some ideas. More to the point, matrix multiplication is decomposable. If you can reconstruct matrix elements easily (more quickly than storing and retrieving them), that may be a good method. Alternatively, if you can decompose the product far enough, or if the problem has some essential symmetry you can exploit in the matrix structure, that could help as well. Symmetries not only imply conservation laws; they also tend to reduce storage requirements.
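The decomposability point can be sketched as the classic blocked product C[i][j] = sum_k A[i][k]*B[k][j]: only one block each of A, B, and C needs to be resident at a time, and in practice blocks would live on different ranks or on disk. Block size `bs` here is purely illustrative.

```python
# Sketch: blocked matrix multiply; each C block is accumulated from pairs
# of A and B blocks, so working memory is O(bs^2), not O(n^2).

def block(M, bi, bj, bs):
    """Extract the (bi, bj) bs-by-bs block of M."""
    return [row[bj * bs:(bj + 1) * bs] for row in M[bi * bs:(bi + 1) * bs]]

def block_matmul(A, B, bs):
    n = len(A)            # assumes n is a multiple of bs, for brevity
    C = [[0] * n for _ in range(n)]
    for i in range(n // bs):
        for j in range(n // bs):
            for k in range(n // bs):
                Aik, Bkj = block(A, i, k, bs), block(B, k, j, bs)
                for r in range(bs):
                    for c in range(bs):
                        C[i * bs + r][j * bs + c] += sum(
                            Aik[r][t] * Bkj[t][c] for t in range(bs))
    return C

C2 = block_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]], 1)
# C2 == [[19, 22], [43, 50]]
```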

Out of curiosity, what size matrices are you using? I know some of the structural folks can, with large enough DoF problems, hit 10^8 or so on a side. Not dense, though (usually with a specific banded structure).

And that brings up another possibility. If you can perform a transform on your matrix to get it into a well-known form (banded, ...), the multiplications could go much faster.
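As a small illustration of why a banded form pays off: a product y = A*x only has to touch the O(n*p) entries within half-bandwidth p of the diagonal, rather than all n^2. This sketch stores the full matrix for clarity; a real implementation would store only the band (as in the Golub and Van Loan banded algorithms).

```python
# Sketch: y = A*x for a matrix with half-bandwidth p; only the band near
# the diagonal is visited.

def banded_matvec(A, x, p):
    n = len(A)
    return [sum(A[i][j] * x[j]
                for j in range(max(0, i - p), min(n, i + p + 1)))
            for i in range(n)]

T = [[2, 1, 0, 0],      # tridiagonal: half-bandwidth p = 1
     [1, 2, 1, 0],
     [0, 1, 2, 1],
     [0, 0, 1, 2]]
y = banded_matvec(T, [1, 2, 3, 4], 1)
# y == [4, 8, 12, 11]
```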


--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web  : http://www.scalableinformatics.com
       http://jackrabbit.scalableinformatics.com
phone: +1 734 786 8423
fax  : +1 866 888 3112
cell : +1 734 612 4615
_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
