On 22 June 2017 at 09:07, Andres Freund <and...@anarazel.de> wrote:
> On 2017-06-22 09:03:05 +0800, Craig Ringer wrote:
>> On 22 June 2017 at 08:29, Andres Freund <and...@anarazel.de> wrote:
>>
>> > I.e. we're doing tiny write send() syscalls (they should be coalesced)
>>
>> That's likely worth doing, but can probably wait for a separate patch.
>
> I don't think so, we should get this right, it could have API influence.
>
>
>> The kernel will usually do some packet aggregation unless we use
>> TCP_NODELAY (which we don't and shouldn't), and the syscall overhead
>> is IMO not worth worrying about just yet.
>
> 1)
>                 /*
>                  * Select socket options: no delay of outgoing data for
>                  * TCP sockets, nonblock mode, close-on-exec. Fail if any
>                  * of this fails.
>                  */
>                 if (!IS_AF_UNIX(addr_cur->ai_family))
>                 {
>                     if (!connectNoDelay(conn))
>                     {
>                         pqDropConnection(conn, true);
>                         conn->addr_cur = addr_cur->ai_next;
>                         continue;
>                     }
>                 }
>
> 2) Even if nodelay weren't set, this can still lead to smaller packets
>    being sent, because you start sending normal sized tcp packets,
>    rather than jumbo ones, even if configured (pretty common these
>    days).
>
> 3) Syscall overhead is actually quite significant.

Fair enough, and *headdesk* re not checking NODELAY. I thought I'd
checked for our use of that before, but I must've remembered wrong.

We could use TCP_CORK but it's not portable and it'd be better to just
collect up a buffer to dispatch.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
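
For illustration only, here is a rough sketch of what "collect up a
buffer to dispatch" could look like in portable C, as opposed to
relying on TCP_CORK. Nothing here is taken from libpq or the patch under
discussion: OutBuf, out_buf_init(), out_buf_append(), out_buf_flush()
and the 8 kB size are all hypothetical names and values, and the sketch
assumes a blocking descriptor (a nonblocking socket would additionally
need EAGAIN handling). The nwrites counter exists only to make the
number of syscalls visible.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define OUT_BUF_SIZE 8192       /* arbitrary size chosen for the sketch */

/* Hypothetical output buffer; not libpq's actual structure. */
typedef struct OutBuf
{
    int     fd;                 /* destination socket or pipe (blocking) */
    size_t  used;               /* bytes currently queued */
    int     nwrites;            /* successful write() calls, for illustration */
    char    data[OUT_BUF_SIZE];
} OutBuf;

void
out_buf_init(OutBuf *buf, int fd)
{
    buf->fd = fd;
    buf->used = 0;
    buf->nwrites = 0;
}

/* Write out everything queued. Returns 0 on success, -1 on error. */
int
out_buf_flush(OutBuf *buf)
{
    size_t  off = 0;

    while (off < buf->used)
    {
        ssize_t n = write(buf->fd, buf->data + off, buf->used - off);

        if (n < 0)
        {
            if (errno == EINTR)
                continue;       /* interrupted, just retry */
            /* keep the unsent tail at the front of the buffer */
            memmove(buf->data, buf->data + off, buf->used - off);
            buf->used -= off;
            return -1;
        }
        buf->nwrites++;
        off += (size_t) n;
    }
    buf->used = 0;
    return 0;
}

/*
 * Queue a message instead of sending it immediately. A write() is only
 * issued when the buffer fills up. Returns 0 on success, -1 on error.
 */
int
out_buf_append(OutBuf *buf, const void *msg, size_t len)
{
    const char *p = msg;

    assert(buf != NULL && (msg != NULL || len == 0));

    while (len > 0)
    {
        size_t  room = OUT_BUF_SIZE - buf->used;
        size_t  chunk = len < room ? len : room;

        memcpy(buf->data + buf->used, p, chunk);
        buf->used += chunk;
        p += chunk;
        len -= chunk;

        if (buf->used == OUT_BUF_SIZE && out_buf_flush(buf) < 0)
            return -1;
    }
    return 0;
}
```

With something along these lines, a caller would out_buf_append() each
small protocol message and call out_buf_flush() once at a natural
boundary (for example, before waiting for a reply), so many tiny
messages go out in one syscall regardless of whether TCP_NODELAY is set.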