On Thu, 2 Sep 2010 22:09:09 -0600 (GMT-06:00), Shri <abhyshr at mcs.anl.gov> wrote:
>
> > mpiexec -n 2 ./ex35 -X 202 -Y 102 -Z 102 -log_summary
> >
> > MatAssemblyBegin  8 1.0 3.4185e-01 4.1 0.00e+00 0.0 0.0e+00 0.0e+00 1.2e+01  2  0  0  0  9   2  0  0  0 15  0
> > MatAssemblyEnd    8 1.0 8.8343e-01 1.0 0.00e+00 0.0 2.0e+01 2.9e+04 3.6e+01  9  0 38  0 26   9  0 38  0 44  0
> > MatGetSubMatrice  2 1.0 1.0293e+00 1.1 0.00e+00 0.0 2.0e+01 9.6e+06 1.0e+01 10  0 38 51  7  10  0 38 51 12  0
> > MatLoad           1 1.0 2.3205e+00 1.0 0.00e+00 0.0 2.1e+01 9.0e+06 2.6e+01 23  0 40 50 19  23  0 40 50 32  0
> > MatView           1 1.0 2.1851e+00 1.3 0.00e+00 0.0 1.9e+01 9.9e+06 1.9e+01 19  0 37 50 14  19  0 37 50 23  0
> >
> Are these the results with the optimized petsc build?? It takes an
> indefinite amount of time to run ex35 with these options with petsc
> debug build. Hence Barry suggested to use a profiler and see what's
> going on in MatLoad_MPIAIJ.

That is with an optimized build. With a debug build, it runs in a
completely reasonable amount of time for the 100^3 problem, but the
202x102x102 size produces bad quicksort behavior (!):

[...]
#3064 0x00007fbef6a91e00 in PetscSortInt_Private (v=0x7fbee4299760, right=2101252) at /home/jed/petsc/src/sys/utils/sorti.c:42
#3065 0x00007fbef6a91e00 in PetscSortInt_Private (v=0x7fbee4299760, right=2101303) at /home/jed/petsc/src/sys/utils/sorti.c:42
#3066 0x00007fbef6a91e00 in PetscSortInt_Private (v=0x7fbee4299760, right=2101405) at /home/jed/petsc/src/sys/utils/sorti.c:42
#3067 0x00007fbef6a91e00 in PetscSortInt_Private (v=0x7fbee4299760, right=2101407) at /home/jed/petsc/src/sys/utils/sorti.c:42
#3068 0x00007fbef6a91e00 in PetscSortInt_Private (v=0x7fbee4299760, right=2101410) at /home/jed/petsc/src/sys/utils/sorti.c:42
#3069 0x00007fbef6a91e00 in PetscSortInt_Private (v=0x7fbee4299760, right=2101416) at /home/jed/petsc/src/sys/utils/sorti.c:42
#3070 0x00007fbef6a91e00 in PetscSortInt_Private (v=0x7fbee4299760, right=2101429) at /home/jed/petsc/src/sys/utils/sorti.c:42
#3071 0x00007fbef6a91e00 in PetscSortInt_Private (v=0x7fbee4299760, right=2101454) at /home/jed/petsc/src/sys/utils/sorti.c:42
#3072 0x00007fbef6a91e00 in PetscSortInt_Private (v=0x7fbee4299760, right=2101505) at /home/jed/petsc/src/sys/utils/sorti.c:42
#3073 0x00007fbef6a91e00 in PetscSortInt_Private (v=0x7fbee4299760, right=2101607) at /home/jed/petsc/src/sys/utils/sorti.c:42
#3074 0x00007fbef6a9219e in PetscSortInt (n=2101608, i=0x7fbee4299760) at /home/jed/petsc/src/sys/utils/sorti.c:81
#3075 0x00007fbef7f4786b in AOCreateBasic (comm=0xef0da0, napp=1050804, myapp=0x4b7b2e0, mypetsc=0x47788b0, aoout=0xf04fd0) at /home/jed/petsc/src/dm/ao/impls/basic/aobasic.c:274
#3076 0x00007fbef7f487e3 in AOCreateBasicIS (isapp=0x1077720, ispetsc=0x1079740, aoout=0xf04fd0) at /home/jed/petsc/src/dm/ao/impls/basic/aobasic.c:367
#3077 0x00007fbef8009836 in DAGetAO (da=0xf040e0, ao=0x7fffb44efb48) at /home/jed/petsc/src/dm/da/src/daindex.c:158
#3078 0x00007fbef7f59acf in MatView_MPI_DA (A=0xfd4f10, viewer=0xeefc60) at /home/jed/petsc/src/dm/da/utils/fdda.c:586
#3079 0x00007fbef729185f in MatView (mat=0xfd4f10, viewer=0xeefc60) at /home/jed/petsc/src/mat/interface/matrix.c:719
#3080 0x000000000040125c in main (argc=8, argv=0x7fffb44efdf8) at ex35.c:33

Presumably the difference between debug and optimized is that the bad
quicksort is less painful when there is no PetscFunctionBegin, but I'm
surprised that the difference is more than a factor of, say, 50. Maybe
the compiler does a significant transformation for the optimized case.

> > Is this unacceptably slow (do you still want me to profile MatLoad)?
> I don't know whether the MatLoad and MatView time can be regarded as fast
> or slow as I don't have anything to compare it with. I guess for a matrix of
> size 2M with roughly 20% nonzeros, reading approximately 0.4 M entries from
> disk in 2 seconds is acceptable, what do you think?

The file was 180 MiB, reading that in 2 seconds on my laptop sounds
reasonable to me.

Jed