On Mon, Aug 31, 2015 at 4:16 PM, Josh Berkus <j...@agliodbs.com> wrote:
> First, let me put out there that I think the horizontal scaling project
> which has buy-in from the community and we're working on is infinitely
> better than the one we're not working on or is an underresourced fork.
> So we're in agreement on that. However, I think there's a lot of room
> for discussion; I feel like the FDW approach was decided in exclusive
> meetings involving a very small number of people. The FDW approach
> *may* be the right approach, but I'd like to see some rigorous
> questioning of that before it's final.

It seems to me that sharding consists of (1) breaking your data set up into shards, (2) possibly replicating some of those shards onto multiple machines, and then (3) being able to access the remote data from local queries. As far as (1) is concerned, we need declarative partitioning, which is being worked on by Amit Langote. As far as (2) is concerned, I hope and expect BDR, or technology derived therefrom, to eventually fill that need. As far as (3) is concerned, why wouldn't we use the foreign data wrapper interface, and specifically postgres_fdw? That interface was designed for the explicit purpose of allowing access to remote data sources, and a lot of work has been put into it, so it would be highly surprising if we decided to throw that away and develop something completely new from the ground up.

It's true that postgres_fdw doesn't do everything we need yet. The new join pushdown hooks aren't used by postgres_fdw yet, and the API itself has some bugs with EvalPlanQual handling. Aggregate pushdown is waiting on upper planner path-ification. DML pushdown doesn't exist yet, and the hooks that would enable pushdown of ORDER BY clauses to the remote side aren't being used by postgres_fdw. But all of these things have been worked on. Patches for many of them have already been posted. They have suffered from a certain amount of neglect by senior hackers, and perhaps also from a shortage of time on the part of the authors. But an awful lot of the work that is needed here has already been done, if only we could get it committed. Aggregate pushdown is a notable exception, but abandoning the foreign data wrapper approach in favor of something else won't fix that.

Postgres-XC developed a purpose-built system for talking to other nodes instead of using the FDW interface, for the very good reason that the FDW interface did not yet exist at the time that Postgres-XC was created.
But several people associated with the XC project have said, including one on this thread, that if it had existed, they probably would have used it. And it's hard to see why you wouldn't: with XC's approach, the remote data source is presumed to be PostgreSQL (or Postgres-XC/XL/X2/whatever); and you can only use the facility as part of a sharding solution. The FDW interface can talk to anything, and it can be used for stuff other than sharding, like making one remote table appear local because you just happen to want that for some reason. This makes the XC approach look rather brittle by comparison. I don't blame the XC folks for taking the shortest path between two points, but FDWs are better, and we ought to try to leverage that.

> Particularly, I'm concerned that we already have two projects in process
> aimed at horizontal scalability, and it seems like we could bring either
> (or both) projects to production quality MUCH faster than we could make
> an FDW-based solution work. These are:
>
> * pg_shard
> * BDR
>
> It seems worthwhile, just as a thought experiment, if we can get where
> we want using those, faster, or by combining those with new FDW features.

I think it's abundantly clear that we need a logical replication solution as part of any horizontal scalability story. People will want to do things like have 10 machines with each piece of data on 3 of them, and there won't be any reasonable way of doing that without logical replication. I assume that BDR, or some technology derived from it, will end up in core and solve that problem. I had actually hoped we were going to get that in 9.5, but it didn't happen that way. Still, I think that getting first single-master, and then eventually multi-master, logical replication in core is absolutely critical.
And not just for sharding specifically: replicating your whole database to several nodes and load-balancing your clients across them isn't sharding, but it does give you read scalability and is a good fit for people with geographically dispersed data with good geographical locality. I think a lot of people will want that.

I'm not quite sure yet how we can marry declarative partitioning and better FDW-pushdown and logical replication into one seamless, easy-to-deploy solution that requires very low administrator effort. But I am sure that each of those things, taken individually, is very useful, and that being able to construct a solution from those building blocks would be a big improvement over what we have today. I can't imagine that trying to do one monolithic project that provides all of those things, but only if you combine them in the specific way that the designer had in mind, is ever going to be successful. People _will_ want access to each of those features in an unbundled fashion. And trying to do them all together leads to trying to solve too many problems at once. I think the history of Postgres-XC is a cautionary tale there.

I don't really understand how pg_shard fits into this equation. It looks to me like it does some interesting things but, for example, it doesn't support JOIN pushdown, and suggests that you use the proprietary CitusDB engine if you need that. But I think JOIN pushdown is something we want to have in core, not something where we want to point people to proprietary alternatives. And it has some restrictions on INSERT statements - they have to contain only values which are constants or which can be folded to constants. I'm just guessing, but I bet that's probably due to some limitation which pg_shard, being out of core, has difficulty overcoming, but we can do better in core. Basically I guess I expect much of what pg_shard does to be subsumed as we improve FDWs, but maybe not all of it.
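
To make the "10 machines with each piece of data on 3 of them" case above a bit more concrete, here is a minimal Python sketch of the bookkeeping involved: hashing a key to a shard, and placing each shard on three machines. Everything in it is an assumption made up for illustration (the node names, the choice of 40 shards, the round-robin placement rule, and the function names); it is not how BDR, pg_shard, or postgres_fdw actually decide placement, and it says nothing about how the copies are kept in sync, which is the part that needs logical replication.

```python
import hashlib
from typing import Dict, List

# Hypothetical cluster layout: 10 machines, each shard stored on 3 of them.
MACHINES: List[str] = [f"node{i}" for i in range(10)]
NUM_SHARDS = 40   # assumed value, chosen only for the example
COPIES = 3


def shard_for_key(key: str, num_shards: int) -> int:
    """Map a row key to a shard number using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def replicas_for_shard(shard: int, machines: List[str], copies: int) -> List[str]:
    """Place a shard on `copies` consecutive machines, wrapping around."""
    if copies > len(machines):
        raise ValueError("cannot place more copies than there are machines")
    return [machines[(shard + i) % len(machines)] for i in range(copies)]


# Shard number -> machines holding a copy of that shard.
placement: Dict[int, List[str]] = {
    s: replicas_for_shard(s, MACHINES, COPIES) for s in range(NUM_SHARDS)
}


def nodes_for_key(key: str) -> List[str]:
    """Return the machines that hold the shard containing `key`."""
    return placement[shard_for_key(key, NUM_SHARDS)]


if __name__ == "__main__":
    print(nodes_for_key("customer:42"))
```

The lookup itself is the easy part. The sketch is only meant to show that partitioning (which shard), replication (which machines), and remote access (talking to one of those machines) are separable pieces, which is the argument for building them as independent features.
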
-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)