Hi, Chris,
Since the speedup is computed relative to the bandwidth achieved by a single MPI process, and a single process can drive all memory channels, the maximum speedup can only be determined by experiment, not derived from the number of memory channels.
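To make that definition concrete, here is a minimal sketch of a speedup computed against the single-process bandwidth. It is not the actual mpistream code: the function name `speedups` and the bandwidth numbers are hypothetical and chosen only for illustration.

```python
def speedups(bandwidth_mb_s):
    """Return the speedup of each run relative to the single-process run.

    bandwidth_mb_s maps the number of MPI processes to the measured
    bandwidth in MB/s; the entry for 1 process is the baseline.
    """
    base = bandwidth_mb_s[1]
    return {n: bw / base for n, bw in sorted(bandwidth_mb_s.items())}


# Hypothetical measurements in MB/s, NOT read from the attached plot.
measured = {1: 40_000.0, 8: 300_000.0, 16: 480_000.0, 32: 480_000.0}

result = speedups(measured)
for n, s in result.items():
    print(f"{n:3d} processes: speedup {s:.1f}")
print("maximum observed speedup:", max(result.values()))
```

Because the baseline is whatever one process happens to achieve, the ceiling of this ratio depends on the measured single-process bandwidth, so it has to be read off the experiment.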
--Junchao Zhang

On Mon, Oct 20, 2025 at 9:45 AM Klaij, Christiaan <[email protected]> wrote:

> Hi Junchao,
>
> Thanks for your answer. Regarding the speed-up what would you expect if not
> 24 out of 64, and why?
>
> Chris
>
> ________________________________________
>
> dr. ir. Christiaan Klaij | senior researcher
> Research & Development | CFD Development
> T +31 317 49 33 44 | http://www.marin.nl
>
> From: Junchao Zhang <[email protected]>
> Sent: Friday, October 17, 2025 5:01 PM
> To: Klaij, Christiaan
> Cc: PETSc users list
> Subject: Re: [petsc-users] interpreting petsc streams result
>
> Hi, Chris,
> I did have an MR
> https://gitlab.com/petsc/petsc/-/merge_requests/7651
> to improve mpistream. I should rework it after Barry's !6903. See my inlined
> comments to your questions
>
> On Fri, Oct 17, 2025 at 3:37 AM Klaij, Christiaan via petsc-users <
> [email protected]<mailto:[email protected]>> wrote:
> Attached is a petsc streams result kindly provided by a hardware
> vendor for a single compute node, dual socket, with two AMD EPYC
> 9355 processors. Each processor has 32 cores, 12 DDR5 memory
> channels and mem BW around 600 GB/s.
>
> * It is not immediately clear which line corresponds to which
> y-axis. Could future versions of petsc please color the axis
> label with the matching line color?
> definitely
>
> * Why would the achieved bandwidth be roughly 0.9 x 1e6 MB/s =
> 900 GB/s and not closer to 1200 GB/s?
> I recall it is actually not simple to get the theoretical max bandwidth.
> One has to use special SIMD instructions, compiler flags and streaming
> stores etc.
>
> * The speed-up seems to be 12 out of 64, provided multiples of 8
> cores are used. As expected given 12 memory channels?
> Maybe not, otherwise the speedup should be 24 as you have 24 channels.
>
> * Does the zig-zag pattern indicate a pinning problem, or is it
> unavoidable given the 8 core building block of this type of
> processor?
> I checked and found "make mpistream" uses --map-by core. I think we should
> use --map-by socket or --map-by l3cache.
>
> Chris
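
The arithmetic behind the bandwidth and channel-count questions in the original message can be written out as a small sketch. The input numbers are the approximate figures quoted in the thread; the variable names and the 1 GB = 1000 MB conversion are assumptions made for illustration.

```python
SOCKETS = 2
CHANNELS_PER_SOCKET = 12        # DDR5 memory channels per processor
PEAK_PER_SOCKET_GB_S = 600.0    # "around 600 GB/s" per processor
ACHIEVED_MB_S = 0.9e6           # "roughly 0.9 x 1e6 MB/s" from the plot

# Node-level totals for the dual-socket machine.
node_channels = SOCKETS * CHANNELS_PER_SOCKET
node_peak_gb_s = SOCKETS * PEAK_PER_SOCKET_GB_S

# Convert the achieved figure to GB/s (assuming 1 GB = 1000 MB).
achieved_gb_s = ACHIEVED_MB_S / 1000.0
fraction_of_peak = achieved_gb_s / node_peak_gb_s

print(f"memory channels on the node: {node_channels}")
print(f"nominal peak bandwidth:      {node_peak_gb_s:.0f} GB/s")
print(f"achieved bandwidth:          {achieved_gb_s:.0f} GB/s")
print(f"fraction of nominal peak:    {fraction_of_peak:.0%}")
```

Under these figures the benchmark reaches about three quarters of the nominal peak, and the node has 24 channels in total rather than 12.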
