On Tue, Sep 27, 2016 at 3:31 PM, Robert Haas <robertmh...@gmail.com> wrote:
> You seem to have erased the subject line from this email somehow.
I think that that was Gmail. Maybe that's new? Generally, I have to go
out of my way to change the subject line, so it seems unlikely that I
fat-fingered it. I wish that they'd stop changing things...

> On Tue, Sep 27, 2016 at 10:18 AM, Peter Geoghegan <p...@heroku.com> wrote:
>> No, a shm_mq is not a mutual exclusion mechanism. It's a queue!
>
> A data dependency and a lock aren't the same thing. If your point in
> saying "we need something like an LWLock" was actually "the leader
> will have to wait if it's merged all tuples from a worker and no more
> are immediately available" then I think that's a pretty odd way to
> make that point. To me, the first statement appears to be false,
> while the second is obviously true.

Okay. Sorry for being unclear.

> OK, well I'm not taking any position on whether what Heikki is
> proposing will turn out to be good from a performance point of view.
> My intuitions about sorting performance haven't turned out to be
> particularly good in the past. I'm only saying that if you do decide
> to queue the tuples passing from worker to leader in a shm_mq, you
> shouldn't read from the shm_mq objects in a round robin, but rather
> read multiple tuples at a time from the same worker whenever that is
> possible without blocking. If you read tuples from workers one at a
> time, you get a lot of context-switch thrashing, because the worker
> wakes up, writes one tuple (or even part of a tuple) and immediately
> goes back to sleep. Then it shortly thereafter does it again.
> Whereas if you drain the worker's whole queue each time you read from
> that queue, then the worker can wake up, refill it completely, and go
> back to sleep again. So you incur fewer context switches for the same
> amount of work. Or at least, that's how it worked out for Gather, and
> what I am saying is that it will probably work out the same way for
> sorting, if somebody chooses to try to implement this.

Maybe. This would involve overlapping multiple worker merges, performed
in parallel, with a single leader merge. I don't think it would be all
that great. There would be a trade-off around disk bandwidth utilization
too, so I doubt that the added complexity would pay for itself. But, of
course, I too have been wrong about sorting performance characteristics
in the past, so I can't be sure.

--
Peter Geoghegan
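
P.S. Purely for illustration, a leader-side loop that drains each
worker's queue in the way Robert describes might look roughly like the
sketch below. It uses the real shm_mq_receive() nowait interface, but
local_buffer_has_room(), stash_tuple() and wait_for_worker_data() are
made-up stand-ins for whatever the merge code would actually do with
the tuples -- this is a sketch of the idea, not code from any patch:

    #include "postgres.h"
    #include "storage/shm_mq.h"

    /* Hypothetical helpers, standing in for the real merge machinery. */
    extern bool local_buffer_has_room(int workernum);
    extern void stash_tuple(int workernum, void *data, Size nbytes);
    extern void wait_for_worker_data(void);

    /*
     * Drain as many tuples as are immediately available from each
     * worker's queue before moving on, instead of taking one tuple per
     * worker in a round robin.
     */
    static void
    drain_worker_queues(shm_mq_handle **queues, int nworkers)
    {
        int         nremaining = nworkers;

        while (nremaining > 0)
        {
            bool        got_any = false;
            int         i;

            for (i = 0; i < nworkers; i++)
            {
                if (queues[i] == NULL)
                    continue;       /* this worker already detached */

                /* Take everything that's ready, not just one message. */
                while (local_buffer_has_room(i))
                {
                    Size        nbytes;
                    void       *data;
                    shm_mq_result res;

                    res = shm_mq_receive(queues[i], &nbytes, &data, true);

                    if (res == SHM_MQ_SUCCESS)
                    {
                        stash_tuple(i, data, nbytes);
                        got_any = true;
                    }
                    else if (res == SHM_MQ_DETACHED)
                    {
                        queues[i] = NULL;   /* worker finished sending */
                        nremaining--;
                        break;
                    }
                    else        /* SHM_MQ_WOULD_BLOCK */
                        break;
                }
            }

            if (!got_any && nremaining > 0)
                wait_for_worker_data(); /* e.g. WaitLatch() + ResetLatch() */
        }
    }

The essential point is only that shm_mq_receive() is retried with
nowait = true until it reports SHM_MQ_WOULD_BLOCK, so each worker
wakeup can refill its queue in bulk rather than being woken once per
tuple.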