Re: [HACKERS] Horizontal scalability/sharding

Tomas Vondra Tue, 01 Sep 2015 11:05:53 -0700

Hi,

On 09/01/2015 07:17 PM, Robert Haas wrote:

On Tue, Sep 1, 2015 at 1:06 PM, Josh Berkus <j...@agliodbs.com> wrote:

You're assuming that our primary bottleneck for writes is IO. It's
not at present for most users, and it certainly won't be in the
future. You need to move your thinking on systems resources into
the 21st century, instead of solving the resource problems from 15
years ago.


Your experience doesn't match mine.  I find that it's frequently
impossible to get the system to use all of the available CPU
capacity, either because you're bottlenecked on locks or because you
are bottlenecked on the I/O subsystem, and with the locking
improvements in newer versions, the former is becoming less and less
common. Amit's recent work on scalability demonstrates this trend: he
goes looking for lock bottlenecks, and finds problems that only occur
at 128+ concurrent connections running full tilt.  The patches show
limited benefit - a few percentage points - at lesser concurrency
levels.  Either there are other locking bottlenecks that limit
performance at lower client counts but which mysteriously disappear
as concurrency increases, which I would find surprising, or the limit
is somewhere else.  I haven't seen any convincing evidence that the
I/O subsystem is the bottleneck, but I'm having a hard time figuring
out what else it could be.

Memory bandwidth, for example. It's quite difficult to spot, because the
intuition is that memory is fast, but thanks to improvements in storage
(and stagnation in RAM bandwidth), this is becoming a significant issue.
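
A rough way to get a feel for the numbers involved is to time large
in-memory copies and compare the result with the sequential throughput
of the storage. The sketch below is only an illustration, assuming that
a single-threaded buffer copy is an acceptable proxy for memory
bandwidth (a dedicated benchmark such as STREAM is far more rigorous).
The function name and its parameters are hypothetical, not part of any
PostgreSQL tooling.

```python
import time


def copy_bandwidth_gib_s(size_mib=64, repeats=5):
    """Estimate single-threaded memory copy bandwidth in GiB/s.

    Hypothetical helper for illustration only: it times copies of a
    large buffer and reports the best (fastest) run.
    """
    src = bytearray(size_mib * 1024 * 1024)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        dst = bytes(src)  # reads all of src and writes a same-sized copy
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
        del dst
    # Each copy moves len(src) bytes in and len(src) bytes out.
    moved = 2 * len(src)
    return moved / best / 2**30


if __name__ == "__main__":
    print(f"approx. copy bandwidth: {copy_bandwidth_gib_s():.1f} GiB/s")
```

If the figure is within a small factor of what the storage can deliver
sequentially, the "memory is fast" intuition is no longer a safe one for
that machine. A single process will usually not saturate the memory
bus, so this understates the hardware limit.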

Process-management overhead is another thing we tend to ignore, but once
you get to many processes all willing to work at the same time, you need
to account for that.

Of course, this applies differently to different sharding use cases. For
example, analytics workloads have serious issues with memory bandwidth,
but not so much with process-management overhead (because the number of
connections is usually about the number of cores). Use cases with many
clients (web-scale use cases) tend to run into both (all the processes
also have to share all the caches, killing them).

I don't know if sharding can help solve (or at least improve) these
issues. And if sharding in general can, I don't know if that still holds
for an FDW-based solution.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
