Matt, Jed, thanks a lot for the discussion. Since a good ordering can minimize the bandwidth, I think it is really worth trying matrix partitioning / reordering. If that gives a factor-of-two increase in the flop rate, it is quite promising!
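To make the bandwidth point concrete, here is a minimal sketch in plain Python (a toy, not PETSc code and not anything taken from this thread). It assumes a 1D chain of 8 nodes numbered "by node type" (even nodes first, then odd nodes), which is only a stand-in for a naive hierarchical ordering, and applies a simple reverse Cuthill-McKee pass. The helper names `bandwidth`, `reverse_cuthill_mckee` and `naive_index` are made up for illustration. In PETSc itself one would presumably request an ordering through MatGetOrdering rather than hand-roll this.

```python
from collections import deque


def bandwidth(adj, order):
    """Half-bandwidth: max |pos[u] - pos[v]| over all edges under `order`."""
    pos = {v: k for k, v in enumerate(order)}
    return max((abs(pos[u] - pos[v]) for u in adj for v in adj[u]), default=0)


def reverse_cuthill_mckee(adj):
    """Simple RCM: breadth-first search from low-degree vertices, reversed."""
    def by_degree(v):
        return (len(adj[v]), v)

    visited, order = set(), []
    for start in sorted(adj, key=by_degree):
        if start in visited:
            continue
        visited.add(start)
        queue = deque([start])
        while queue:
            u = queue.popleft()
            order.append(u)
            # Visit unvisited neighbours in order of increasing degree.
            for w in sorted(adj[u], key=by_degree):
                if w not in visited:
                    visited.add(w)
                    queue.append(w)
    return order[::-1]


n = 8  # assumed toy size: a chain 0-1-2-...-7


def naive_index(p):
    """Hypothetical 'by node type' numbering: even nodes first, then odd."""
    return p // 2 if p % 2 == 0 else n // 2 + p // 2


# Adjacency of the chain, expressed in the naive numbering.
adj = {i: set() for i in range(n)}
for p in range(n - 1):
    a, b = naive_index(p), naive_index(p + 1)
    adj[a].add(b)
    adj[b].add(a)

naive_bw = bandwidth(adj, list(range(n)))
rcm_order = reverse_cuthill_mckee(adj)
rcm_bw = bandwidth(adj, rcm_order)
print("naive bandwidth:", naive_bw)
print("RCM bandwidth:  ", rcm_bw)
```

On this toy chain the naive numbering places neighbouring nodes about n/2 apart in the matrix, while the reordered one keeps them adjacent. Whether that translates into a real flop-rate gain depends on the actual matrix and machine, which this sketch says nothing about.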

On Thu, Dec 23, 2010 at 3:32 AM, Jed Brown <jed at 59a2.org> wrote:

> I disagree, there is easily a factor of two in flop/s between a naive
> ordering (e.g. hierarchical by node type in a finite element method) and a
> good low-bandwidth ordering.
>
> This is in the FUN3D papers and still true today, in my experience.
>
> Incomplete factorization is also very order dependent, as you note.
>
> Jed
>
> On Dec 22, 2010 5:03 PM, "Matthew Knepley" <knepley at gmail.com> wrote:
>
> On Wed, Dec 22, 2010 at 10:11 AM, Yongjun Chen <yjxd.chen at gmail.com>
> wrote:
> >
> > On Wed, Dec 22, 2010 at 6:53 PM, Satish Balay <balay at mcs.anl.gov> wrote:
> >>
> >> On Wed, 22 De...
>
> 1) To see a large gain, the ordering you start with would have to be very
> bad. Maybe it is. These orderings try to minimize bandwidth, which means
> minimize communication in the MatMult.
>
> 2) If you use incomplete factorization, the ordering can have a large
> effect on conditioning, so number of iterations, which does not improve
> scalability. This would impact scalability if you use a parallel IC,
> however all those packages reorder your matrix already.
>
> In short, I suspect this will not help a lot, except maybe with
> conditioning, which is what I was referring to in the quote.
>
> Matt
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more...
