Many thanks for your detailed comments on this issue. I have also found
that the main draw-back for my large matrices is cache utilization. I
will certainly try the Intel Math Kernel Library and perhaps few other
more. For my current project, portability is a very important issue since my
apps will have to run under Winsows/Linux/Mac.
----- Original Message -----
Sent: Tuesday, October 26, 2004 5:59
AM
Subject: RE: [msvc] Matrix
multiplication
Hi,
Can't say that I'm a great expert on this subject, but I've dug into
this in the past.
As
far as I recall the Strassen Algorithm does only work with square matrices as
you say. Also from dim recollections of the past the speed improvement of the
Strassen algorithm really isn't that great relative to the complexity of
implementing it (I think it scales as O(n^log2(7)) which is about O(n^2.8)
versus the brute-force algorithm's O(n^3). This is a dramatically poorer
complexity improvement than (say) switching between a bubblesort and a
quicksort. I do recall reading about algorithms by Coppersmith and Winograd
(well I have a note in an old log book that I've dug up!) that was a
touch more efficient, but again not vastly so.
To
be honest I was never convinced that the number of additions and
multiplications was really the limiting factor in such large matrix
multiplications. My suspicion was that the main performance limitation was the
fact that the matrices in question became far too large to fit into the cache
(or indeed the installed memory!), and consequently the brute force
algorithm's performance was being limited much more by hardware than by any
(overwhelming) inefficiency. My feeling was that if one could organise access
of the data, so that all the values being used were prefetched into the cache
one would gain a far more substantial performance boost than moving away from
the brute-force approach. In the end I gave up and used the Intel Math Kernel
Library (it was free back then). This implements standard BLAS
routines in a manner which always exceeded the performance of my
code. For what I wanted the peformance was acceptable, although I don't recall
ever needing matrices as large as yours.
The
more recent versions of the Intel Math Kernel Library aren't free (although I
think there's a free trial version so you can see if it gives you adequate
performance). But the newer versions are able to scale across multiple
processors in a machine (matrix multiplication is a very paralellisable
operation) and include optimised versions for all flavours of Pentium
processors, Itanium processors and also have been ported to
Linux.
Hope
that helps
Daniel
[Daniel Robinson] -----Original
Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]On Behalf Of Juan
Carlos
Sent: 25 October 2004 18:14
To:
[EMAIL PROTECTED]
Subject: [msvc] Matrix
multiplication
Hi all,
I have to multiply two matrices. The simple, brute force algorithm
works quite efficiently with small matrices (e.g. <100x100) but I need to
multiply matrices quite large > 2048 x 2048. Does anyone know any other
algoritm for matrix multiplication? The matrices can be non-square. The
Strassen algorithm seems to be a good method but if I remember correctly, it
does only work with square matrices (not 100% sure though).
Any help?
Regards,
Carlos
_______________________________________________
msvc mailing
list
[EMAIL PROTECTED]
See
http://beginthread.com/mailman/listinfo/msvc_beginthread.com for subscription
changes, and list archive.