On Fri, 29 Jan 2010 16:52:51 -0600, Barry Smith <bsmith at mcs.anl.gov> wrote:
> It is possible some times to "turn hardware prefetching off",
> possibly with Intel compiler options.

Hmm, I had never heard of this.  Apparently the instruction can only be
executed in ring 0, though the Linux kernel exposes this:

  http://stackoverflow.com/questions/784041/how-do-i-programatically-disable-hardware-prefetching

Anyway, I didn't bother with this.

> I owe you a beer if the difference is more than say 2 percent.

Would you accept a 30 percent speedup instead of a 2 percent slowdown?
Apply the attached patch, compile with GCC (I don't know if other
compilers have the same __builtin_prefetch), and compare the following.
In each pair of MatMult lines, the top result is before the patch and
the bottom result is after it.

./ex19 -ksp_type cgs -pc_type none -ksp_monitor -ksp_max_it 1000 -snes_max_it 1 -da_grid_x 50 -da_grid_y 50 -log_summary

MatMult             2001 1.0 5.3909e+00 1.0 3.03e+09 1.0 0.0e+00 0.0e+00 0.0e+00 41 41  0  0  0  82 81  0  0  0   563
MatMult             2001 1.0 3.9953e+00 1.0 3.03e+09 1.0 0.0e+00 0.0e+00 0.0e+00 38 41  0  0  0  77 81  0  0  0   759

./ex19 -ksp_type cgs -pc_type none -ksp_monitor -ksp_max_it 100 -snes_max_it 1 -da_grid_x 200 -da_grid_y 200 -log_summary

MatMult              201 1.0 7.9618e+00 1.0 4.98e+09 1.0 0.0e+00 0.0e+00 0.0e+00 28 38  0  0  0  60 77  0  0  0   626
MatMult              201 1.0 6.1575e+00 1.0 4.98e+09 1.0 0.0e+00 0.0e+00 0.0e+00 24 38  0  0  0  54 77  0  0  0   809

./ex19 -ksp_type cgs -pc_type none -ksp_monitor -ksp_max_it 100 -snes_max_it 1 -da_grid_x 300 -da_grid_y 300 -log_summary

MatMult              201 1.0 1.7829e+01 1.0 1.12e+10 1.0 0.0e+00 0.0e+00 0.0e+00 27 38  0  0  0  60 77  0  0  0   630
MatMult              201 1.0 1.3561e+01 1.0 1.12e+10 1.0 0.0e+00 0.0e+00 0.0e+00 24 38  0  0  0  53 77  0  0  0   828

This blows me away.

Jed
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inode-prefetch.patch
Type: text/x-patch
Size: 1131 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20100130/8564b0bf/attachment.bin>
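
For readers who cannot open the attachment, here is a minimal sketch of
how GCC's __builtin_prefetch might be used in a sparse matrix-vector
product.  This is NOT the contents of inode-prefetch.patch and not
PETSc's kernel: the function name csr_matvec_prefetch, the plain CSR
layout, and the choice to prefetch the start of the next row are all
assumptions made for illustration only.

```c
#include <assert.h>

/* Hypothetical CSR kernel computing y = A*x.  While working on row i it
 * asks the hardware to start loading the first entries of row i+1.
 * __builtin_prefetch(addr, rw, locality) is a GCC builtin: rw = 0 means
 * the data will be read, locality = 0 means it need not stay in cache
 * after use (matrix entries are touched once per product). */
static void csr_matvec_prefetch(int nrows, const int *rowptr,
                                const int *colidx, const double *val,
                                const double *x, double *y)
{
  assert(nrows >= 0);
  for (int i = 0; i < nrows; i++) {
    int start = rowptr[i];
    int end   = rowptr[i + 1];

    /* The prefetch is only a hint and never changes the result. */
    if (i + 1 < nrows) {
      __builtin_prefetch(&val[end], 0, 0);
      __builtin_prefetch(&colidx[end], 0, 0);
    }

    double sum = 0.0;
    for (int k = start; k < end; k++) {
      sum += val[k] * x[colidx[k]];
    }
    y[i] = sum;
  }
}
```

Whether a hint like this helps, and how far ahead to prefetch, depends
on the machine and the matrix, so it needs to be measured the same way
as the -log_summary runs above.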
