On Wed, Feb 24, 2016 at 11:34 PM, Bruce Momjian <br...@momjian.us> wrote:
> On Wed, Feb 24, 2016 at 12:17:28PM +0300, Alexander Korotkov wrote:
>> Hi, Bruce!
>>
>> The important point for me is to distinguish different kinds of plans:
>> an implementation plan and a research plan. If we're talking about an
>> implementation plan, then it should be proven that the proposed
>> approach works in this case, i.e., the research should already be
>> done. If we're talking about a research plan, then we should realize
>> that the result is unpredictable, and we would probably need to
>> dramatically change our approach.
>
> Yes, good point. I would say FDW-based sharding is certainly still a
> research approach, but an odd one, because we are adding code even
> while in research mode. I think that is possible because the FDW
> improvements have other uses beyond sharding.
>
> I think another aspect is that we already know that modifying the
> Postgres source code can produce a useful sharding solution --- XC, XL,
> Greenplum, and CitusDB all prove that, and pg_shard does it as a
> plugin. So, we know that with unlimited code changes, it is possible.
> What we don't know is whether it is possible with acceptable code
> changes, and how much of the feature set can be supported this way.
>
> We had a similar case with the Windows port, where SRA (my employer at
> the time) and NuSphere both had native Windows ports of Postgres, and
> they supplied source code to help with the port. So, in that case
> also, we knew a native Windows port was possible, and we (or at least
> I) could see the code that was required to do it. The big question was
> whether a native Windows port could be added in a community-acceptable
> way, and the community agreed we could try as long as we didn't make
> the code messier --- that was a success.
>
> For pg_upgrade, I had code from EDB (my employer at the time) that
> kind of worked but needed lots of polish, and again, I could do it in
> contrib as long as I didn't mess up the backend code --- that worked
> well too.
>
> So, I guess I am saying, the FDW/sharding thing is a research project,
> but one that is implementing code because of existing proven solutions
> and because the improvements benefit other use cases beyond sharding.
>
> Also, in the big picture, the existence of many Postgres forks, all
> doing sharding, indicates that there is demand for this capability,
> and if we can get some of this capability into Postgres we will
> increase the number of people using native Postgres. We might also be
> able to reduce the amount of duplicate work being done in all these
> forks and allow them to more easily focus on more advanced use cases.
>
>> These two things would work with FDWs:
>> 1) Pull data from the data nodes to the coordinator.
>> 2) Push computations down from the coordinator to the data nodes:
>> joins, aggregates, etc.
>> It's proven and clear. This is good. Another point is that these FDW
>> advances are useful by themselves. This is good too.
>>
>> However, the FDW model assumes that communication happens only
>> between the coordinator and a data node. A full-weight distributed
>> optimizer can't be built under this restriction, because it requires
>> every node to communicate with every other node whenever that makes a
>> distributed query faster. And as I understand it, the FDW approach
>> currently has no research and no particular plan for that.
>
> This is very true. I imagine cross-node connections will certainly
> complicate the implementation and lead to significant code changes,
> which might be unacceptable. I think we need to go with a
> non-cross-node implementation first, then, if that is accepted, we can
> start to think about what cross-node code changes would look like. It
> certainly would require FDW knowledge to exist on every shard. Some
> have suggested that FDWs wouldn't work well for cross-node connections,
> or wouldn't scale, and that we shouldn't be using them --- I am not
> sure what to think of that.
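To make the coordinator-only model above concrete, here is a minimal
sketch of both sides of it with postgres_fdw as it stands today; the
server, user mapping, and table names are all invented for
illustration:

    -- On the coordinator: declare one data node and a table on it.
    CREATE EXTENSION postgres_fdw;

    CREATE SERVER shard1 FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'node1.example.com', dbname 'app');

    CREATE USER MAPPING FOR CURRENT_USER SERVER shard1
        OPTIONS (user 'app', password 'secret');

    CREATE FOREIGN TABLE orders_s1 (
        id          bigint,
        customer_id bigint,
        amount      numeric
    ) SERVER shard1 OPTIONS (table_name 'orders');

    -- Pushdown: the qual is shipped to the data node, so only the
    -- matching rows cross the wire.
    SELECT * FROM orders_s1 WHERE customer_id = 42;

    -- Pull: with no aggregate pushdown, every row is fetched to the
    -- coordinator and the sum is computed locally.
    SELECT sum(amount) FROM orders_s1;

Anything that does not fit this shape, like shuffling rows directly
between two data nodes, has no place to hook in today.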
>> As I understand from Robert Haas's talk
>> (https://docs.google.com/viewer?a=v&pid=sites&srcid=ZGVmYXVsdGRvbWFpbnxyb2JlcnRtaGFhc3xneDo1ZmFhYzBhNjNhNzVhMDM0):
>>
>>     Before we consider repartitioning joins, we should probably get
>>     everything previously discussed working first.
>>     – Join Pushdown For Parallelism, FDWs
>>     – PartialAggregate/FinalizeAggregate
>>     – Aggregate Pushdown For Parallelism, FDWs
>>     – Declarative Partitioning
>>     – Parallel-Aware Append
>>
>> So, as I understand it, we never thought about the possibility of
>> data redistribution using FDWs. Perhaps something has changed since
>> that time, but I haven't heard about it.
>
> No, you didn't miss it. :-( We just haven't gotten to studying that
> yet. One possible outcome is that built-in Postgres has non-cross-node
> sharding, and forks of Postgres have cross-node sharding, again
> assuming cross-node sharding requires an unacceptable amount of code
> change. I don't think anyone knows the answer yet.
>
>> On Tue, Feb 23, 2016 at 7:43 PM, Bruce Momjian <br...@momjian.us> wrote:
>>
>>> Second, as part of this staged implementation, there are several use
>>> cases that will be shardable at first, and more complex ones only
>>> later. For example, here are some use cases and the technology they
>>> require:
>>>
>>> 1. Cross-node read-only queries on read-only shards using aggregate
>>> queries, e.g. data warehouse:
>>>
>>> This is the simplest to implement, as it requires neither a global
>>> transaction manager nor a global snapshot manager, and the number of
>>> rows returned from the shards is minimal because of the aggregates.
>>>
>>> 2. Cross-node read-only queries on read-only shards using
>>> non-aggregate queries:
>>>
>>> This will stress the coordinator to collect and process many
>>> returned rows, and will show how well the FDW transfer mechanism
>>> scales.
>>
>> FDWs would work for queries which fit the pull-pushdown model. I see
>> no plan to make other queries work.
>
> Yep, see above.
>
>>> 3. Cross-node read-only queries on read/write shards:
>>>
>>> This will require a global snapshot manager to make sure the shards
>>> return consistent data.
>>>
>>> 4. Cross-node read-write queries:
>>>
>>> This will require a global transaction manager and a global snapshot
>>> manager.
>>
>> At this point, it is unclear why you don't refer to the work done in
>> the direction of a distributed transaction manager (which is also a
>> distributed snapshot manager in your terminology):
>> http://www.postgresql.org/message-id/56bb7880.4020...@postgrespro.ru
>
> Yes, there is certainly great work being done on that. I should have
> included a URL for that --- glad you did. I wasn't aware it also was a
> distributed snapshot manager. :-) And again, as you said earlier, it
> is useful for more things than just FDW sharding.
>
>>> In 9.6, we will have FDW join and sort pushdown
>>> (http://thombrown.blogspot.com/2016/02/postgresql-96-part-1-horizontal-scalability.html).
>>> Unfortunately, I don't think we will have aggregate pushdown, so we
>>> can't test #1, but we might be able to test #2, even in 9.5. Also,
>>> we might have better partitioning syntax in 9.6.
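As a reference point for testing #2, the sharded-table shape that can
be built today is inheritance plus postgres_fdw children; a minimal
sketch, reusing the invented shard1 server from above plus a
hypothetical shard2, with made-up range boundaries per shard:

    -- Empty parent on the coordinator; the data lives on the shards.
    CREATE TABLE orders (
        id          bigint,
        customer_id bigint,
        amount      numeric
    );

    CREATE FOREIGN TABLE orders_shard1 (
        CHECK (customer_id >= 0 AND customer_id < 1000000)
    ) INHERITS (orders) SERVER shard1 OPTIONS (table_name 'orders');

    CREATE FOREIGN TABLE orders_shard2 (
        CHECK (customer_id >= 1000000)
    ) INHERITS (orders) SERVER shard2 OPTIONS (table_name 'orders');

    -- Use case #1: all rows are pulled from every shard and aggregated
    -- on the coordinator, since aggregate pushdown does not exist yet.
    SELECT customer_id, sum(amount) FROM orders GROUP BY customer_id;

    -- Use case #2: constraint exclusion prunes to a single shard, and
    -- the qual is pushed down to it.
    SELECT * FROM orders WHERE customer_id = 42;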
>>> We need things like parallel partition access and replicated lookup
>>> tables for more join pushdown.
>>>
>>> In a way, because these enhancements are useful independently of
>>> sharding, we have not tested to see how well an FDW sharding setup
>>> will work, and for which workloads.
>>
>> This is the point I agree on. I'm not objecting to any single FDW
>> advance, because each is useful by itself.
>>
>>> We know Postgres XC/XL works, and scales, but we also know they
>>> require too many code changes to be merged into Postgres (at least
>>> based on previous discussions). The FDW sharding approach is to
>>> enhance the existing features of Postgres to allow as much sharding
>>> as possible.
>>
>> This comparison doesn't seem correct to me. Postgres XC/XL supports
>> data redistribution between nodes, and I haven't heard a single idea
>> about supporting this with FDWs. You are comparing unequal things.
>
> Well, as far as I know, XC doesn't support data redistribution between
> nodes, and I have seen good benchmarks of it, as well as of XL.

XC does support that in 1.2 with a very basic approach (I coded that
years ago), though it takes an exclusive lock on the table involved.
And actually I think what I did in this case really sucked: the effort
was centralized on the Coordinator, which gathered and then
redistributed the tuples; at least tuples that did not need to move
were not moved at all.
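Roughly, the shape of that redistribution, written as illustrative SQL
rather than anything XC actually executes internally --- old_node() and
new_node() are hypothetical stand-ins for the row-to-node mapping
before and after the change of distribution:

    BEGIN;

    -- Everything else blocks while the table is reshuffled.
    LOCK TABLE orders IN ACCESS EXCLUSIVE MODE;

    -- The Coordinator pulls up only the rows whose node assignment
    -- changes; rows that stay put are never moved.
    CREATE TEMP TABLE moving AS
        SELECT * FROM orders
        WHERE new_node(customer_id) <> old_node(customer_id);

    -- Remove them from their old node ...
    DELETE FROM orders
        WHERE new_node(customer_id) <> old_node(customer_id);

    -- ... and re-insert them so they are routed to their new node.
    INSERT INTO orders SELECT * FROM moving;

    COMMIT;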
>>> Once that is done, we can see what workloads it covers and decide
>>> if we are willing to copy the volume of code necessary to implement
>>> all supported Postgres XC or XL workloads. (The Postgres XL license
>>> now matches the Postgres license,
>>> http://www.postgres-xl.org/2015/07/license-change-and-9-5-merge/.
>>> Postgres XC has always used the Postgres license.)

Postgres-XC used the GPL license first, and moved to the PostgreSQL
license exactly to allow Postgres core to reuse it later on if needed.
--
Michael