On Fri, Sep 18, 2015 at 4:03 AM, Haribabu Kommi <kommi.harib...@gmail.com> wrote:
> On Thu, Sep 3, 2015 at 8:21 PM, Amit Kapila <amit.kapil...@gmail.com> wrote:
>>
>> Attached, find the rebased version of patch.
>
> Here are the performance test results:

Thanks, this is really interesting. I'm very surprised by how much kernel overhead this shows. I wonder where that's coming from. The writes to and reads from the shm_mq shouldn't need to touch the kernel at all except for page faults; that's why I chose this form of IPC. It could be that the signals which are sent for flow control are chewing up a lot of cycles, but if that's the problem, it's not very clear from here. copy_user_generic_string doesn't sound like something related to signals. And why all the kernel time in _spin_lock? Maybe perf -g would help us tease out where this kernel time is coming from.

Some of this may be due to rapid context switching. Suppose the master process is the bottleneck. Then each worker will fill up the queue and go to sleep. When the master reads a tuple, the worker has to wake up and write a tuple, and then it goes back to sleep. This might be an indication that we need a bigger shm_mq size. I think that would be worth experimenting with: if we double or quadruple or increase by 10x the queue size, what happens to performance?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
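
One rough way to check the context-switching theory, independent of perf, would be to sample the per-process counters that Linux exposes in /proc/<pid>/status while the query runs, once per queue size. The following is only a minimal sketch under that assumption: the function names are hypothetical, it is Linux-specific, and it is not part of the patch or of PostgreSQL.

```python
import time


def parse_ctxt_switches(status_text):
    """Extract the context-switch counters from the text of /proc/<pid>/status."""
    counts = {}
    for line in status_text.splitlines():
        # Lines look like "voluntary_ctxt_switches:\t150".
        key, _, value = line.partition(":")
        if key in ("voluntary_ctxt_switches", "nonvoluntary_ctxt_switches"):
            counts[key] = int(value.strip())
    return counts


def read_ctxt_switches(pid):
    """Read the counters for a live process (assumes a Linux /proc filesystem)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_ctxt_switches(f.read())


def ctxt_switch_delta(pid, seconds):
    """Return how much each counter grew over a sampling interval."""
    before = read_ctxt_switches(pid)
    time.sleep(seconds)
    after = read_ctxt_switches(pid)
    return {key: after[key] - before[key] for key in before}
```

Calling ctxt_switch_delta() with the PID of a worker or of the master backend during the test query gives a per-interval count. If the voluntary count for the workers is very high and falls as the queue size grows, that would be consistent with the fill-sleep-wake pattern described above; if it does not move, the kernel time is probably coming from somewhere else.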