On Fri, Oct 10, 2014 at 8:11 PM, Andres Freund <and...@2ndquadrant.com> wrote:
> On 2014-10-10 17:18:46 +0530, Amit Kapila wrote:
> > On Fri, Oct 10, 2014 at 1:27 PM, Andres Freund <and...@2ndquadrant.com>
> > wrote:
> > > > Observations
> > > > ----------------------
> > > > a. The patch performs really well (increase up to ~40%) in case all the
> > > > data fits in shared buffers (scale factor -100).
> > > > b. In case data doesn't fit in shared buffers, but fits in RAM
> > > > (scale factor -3000), there is performance increase up to 16 client count,
> > > > however after that it starts dipping (in above config up to ~4.4%).
> > >
> > > Hm. Interesting. I don't see that dip on x86.
> >
> > Is it possible that implementation of some atomic operation is costlier
> > for particular architecture?
>
> Yes, sure. And IIRC POWER improved atomics performance considerably for
> POWER8...
>
> > I have tried again for scale factor 3000 and could see the dip and this
> > time I have even tried with 175 client count and the dip is approximately
> > 5% which is slightly more than 160 client count.
>
> FWIW, the profile always looks like:

For my tests on Power8, the profile looks somewhat similar to the
profile you mention below; please see this mail:
http://www.postgresql.org/message-id/caa4ek1je9zblhsfiavhd18gdwxux21zfqpjgq_dz_zoa35n...@mail.gmail.com

However, on Power7 the profile looks different, as I posted earlier in
this thread.

> BTW, that profile *clearly* indicates we should make StrategyGetBuffer()
> smarter.

Yeah, even the bgreclaimer patch is able to achieve the same; however,
after that the contention moves somewhere else, as you can see in
the link above.

> > Here it goes..
> >
> > Lwlock_contention patches - client_count=128
> > ----------------------------------------------------------------------
> >
> > +   7.95%  postgres  postgres           [.] GetSnapshotData
> > +   3.58%  postgres  postgres           [.] AllocSetAlloc
> > +   2.51%  postgres  postgres           [.] _bt_compare
> > +   2.44%  postgres  postgres           [.] hash_search_with_hash_value
> > +   2.33%  postgres  [kernel.kallsyms]  [k] .__copy_tofrom_user
> > +   2.24%  postgres  postgres           [.] AllocSetFreeIndex
> > +   1.75%  postgres  postgres           [.] pg_atomic_fetch_add_u32_impl
>
> Uh. Huh? Normally that'll be inline. That's compiled with gcc? What were
> the compiler settings you used?

Nothing specific. For performance tests where I have to take profiles,
I use the following:

./configure --prefix=<installation_path> CFLAGS="-fno-omit-frame-pointer"
make

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com