On Fri, Jun 1, 2018 at 11:29:43AM -0500, Merlin Moncure wrote:
> FWIW, Distributed analytical queries is the right market to be in.
> This is the field in which I work, and this is where the action is at.
> I am very, very, sure about this. My view is that many of the
> existing solutions to this problem (in particular hadoop class
> solutions) have major architectural downsides that make them
> inappropriate in use cases that postgres really shines at; direct
> hookups to low latency applications for example. postgres is
> fundamentally a more capable 'node' with its multiple man-millennia of
> engineering behind it. Unlimited vertical scaling (RAC etc) is
> interesting too, but this is not the way the market is moving as
> hardware advancements have reduced or eliminated the need for that in
> many spheres.
>
> The direction of the project is sound and we are on the cusp of the
> point where multiple independent coalescing features (FDW, logical
> replication, parallel query, executor enhancements) will open new
> scaling avenues that will not require trading off the many other
> benefits of SQL that competing contemporary solutions might. The
> broader development market is starting to realize this and that is a
> major driver of the recent upswing in popularity. This is benefiting
> me tremendously personally due to having gone 'all-in' with postgres
> almost 20 years ago :-D. (Time sure flies) These are truly
> wonderful times for the community.

I am coming in late, but I am glad we are having this conversation. We
have made great strides toward sharding while adding minimal
sharding-specific code. We can now see a time when we will complete the
minimal sharding-specific code tasks. Once we reach that point, we will
need to decide what sharding-specific code to add, and to do that, we
need to understand which direction to go in, and to do that, we need to
know the trade-offs.

While I am glad people know a lot about how other projects handle
sharding, these can only be guides to how Postgres will handle such
workloads. I think we need to get to a point where we have all of the
minimal sharding-specific code features done, at least as
proof-of-concept, and then test Postgres with various workloads like
OLTP/OLAP and read-write/read-only. This will tell us where
sharding-specific code will have the greatest impact.

What we don't want to do is to add a bunch of sharding-specific code
without knowing which workloads it benefits, and how many of our users
will actually use sharding. Some projects have done that, and it didn't
end well since they then had a lot of product complexity with little
user value.

-- 
  Bruce Momjian  <br...@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +