Peter T. Breuer wrote:
I set up pg to replace a plain gdbm database for my application.  But
even running on the same machine, via a unix socket:

* the pg database ran 100 times slower

For what operations? Bulk reads? 19-way joins?

Across the net it was

  * about 500 to 1000 times slower than local gdbm
with no CPU use to speak of.

Disk-intensive or memory intensive?

I'd heard that networked databases are slow.  I might have left it at
that if curiosity hadn't led me to write a network server for gdbm
databases, and talk to _that_ just to get a comparison.

Lo and behold and smack me with a corncob if it wasn't _slower_ than pg.

On a whim I mapped the network bandwidth per packet size with the NPtcp
suite, and got surprising answers ..  at 1500B, naturally, the bandwidth
was the full 10Mb/s (minus overheads, say 8.5Mb/s) of my pathetic little
local net.  At 100B the bandwidth available was only 25Kb/s.  At 10B,
you might as well use tin cans and taut string instead.
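That collapse at small payloads is consistent with a fixed per-packet stall, of the kind the Nagle algorithm plus delayed ACKs can produce for sub-MSS request/response traffic. Here's a rough back-of-envelope sketch (the 40ms stall is an assumed delayed-ACK-style timer, not a measurement; only the 10Mb/s line rate comes from the figures above):

```python
# Rough model: sub-MSS packets each pay a fixed stall (assumed 40 ms,
# a common delayed-ACK timer value); full-MSS packets go straight out,
# so they see essentially only wire time.  The 10 Mb/s line rate is
# from the post; the stall value is an assumption for illustration.

LINE_RATE = 10_000_000   # bits/s -- the 10Mb/s local net
MSS = 1460               # payload bytes in a full 1500B Ethernet frame
STALL = 0.040            # seconds, assumed per-packet stall for small writes

def effective_bw(payload_bytes):
    bits = payload_bytes * 8
    wire = bits / LINE_RATE
    stall = 0.0 if payload_bytes >= MSS else STALL
    return bits / (wire + stall)   # bits per second

for size in (10, 100, 1500):
    print(f"{size:5d}B -> {effective_bw(size) / 1000:.1f} Kb/s")
```

With those assumptions the model gives roughly 2Kb/s at 10B, 20Kb/s at 100B and the full line rate at 1500B -- the same order of magnitude as the NPtcp figures quoted above.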

This sounds like you're testing a single connection. You would expect "dead time" to dominate in that scenario. What happens when you have 50 simultaneous connections? Or do you think it's just packet overhead?

I also mapped the network flows using ntop, and yes, the average packet
size for both gdbm and pg in one direction was only about 100B or
so.  That's it!  Clearly there were a lot of short queries going out and
the answers were none too big either (I had a LIMIT 1 in all my PG
queries).

I'm not sure that 100B query-results are usually the bottleneck.
Why would you have LIMIT 1 on all your queries?

About 75% of traffic was in the 64-128B range while my application was
running, with the peak bandwidth in that range being about 75-125Kb/s
(and I do mean bits, not bytes).

None of this sounds like typical database traffic to me. Yes, there are lots of small result-sets, but there are also typically larger result-sets (several kilobytes) up to much larger ones (tens to hundreds of KB).

Soooo ... I took a look at my implementation of remote gdbm, and did
a very little work to aggregate outgoing transmissions together into
lumps.  Three lines added in two places.  At the level of the protocol
where I could tell how long the immediate conversation segment would be,
I "corked" the tcp socket before starting the segment and "uncorked" it
after the segment (for "cork", see tcp(7), setsockopt(2) and TCP_CORK in
Linux).
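For anyone who wants to try the same trick, a minimal sketch of cork/uncork around a multi-write protocol segment might look like this (the helper name send_corked is mine, and TCP_CORK is Linux-only, per tcp(7)):

```python
import socket

def send_corked(sock, chunks):
    """Send several small writes as one corked burst (Linux TCP_CORK)."""
    # Cork: the kernel holds partial frames instead of sending each write.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 1)
    try:
        for chunk in chunks:
            sock.sendall(chunk)
    finally:
        # Uncork: flush the queued data as (at most) full-MTU segments.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_CORK, 0)

# Tiny loopback demo: three protocol fragments leave as one burst.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
client = socket.create_connection(server.getsockname())
conn, _ = server.accept()
send_corked(client, [b"SELECT", b" 1", b" LIMIT 1"])
client.close()

received = b""
while True:
    data = conn.recv(4096)
    if not data:
        break
    received += data
conn.close()
server.close()
print(received)
```

The point of corking over plain Nagle is that the application, not the kernel's heuristics, decides where the segment boundary is: you cork when you know a multi-part message is starting and uncork when it's complete.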

I'm a bit puzzled, because I'd have thought the standard Nagle algorithm would manage this gracefully enough for short-query cases. There's no way (that I know of) for a backend to handle more than one query at a time.

Surprise, ...  I got a speed up of hundreds of times.  The same application
that crawled under my original rgdbm implementation and under PG now
maxed out the network bandwidth at close to a full 10Mb/s and 1200
pkts/s, at 10% CPU on my 700MHz client, and a bit less on the 1GHz
server.

So

  * Is that what is holding up postgres over the net too?  Lots of tiny
    packets?

I'm not sure your setup is typical, interesting though the figures are. Google a bit for pgbench perhaps and see if you can reproduce the effect with a more typical load. I'd be interested in being proved wrong.

And if so

  * can one fix it the way I fixed it for remote gdbm?

The speedup was hundreds of times. Can someone point me at the relevant
bits of pg code? A quick look seems to say that fe-*.c is interesting.
I need to find where the actual read and write on conn->sock is done.

You'll want to look in backend/libpq and interfaces/libpq I think (although I'm not a developer).

--
  Richard Huxton
  Archonet Ltd
