On Tue, Apr 5, 2016 at 3:55 AM, David Rowley
<david.row...@2ndquadrant.com> wrote:
>> I think it might be a good idea if these patches made less use of
>> bytea and exposed the numeric transition values as, say, a 2-element
>> array of numeric.
>
> Well, if you have a look at NumericAggState you can see it's not quite
> as simple as an array of numeric, unless of course you'd be willing to
> spend the extra cycles, use the extra memory, and bandwidth to convert
> those int64's to numeric too, then it could be made to work. To do as
> you describe properly, we'd need a composite type.

Uggh, yeah.  Yuck.

> hmm, isn't that why we have deserialisation functions? Do you see a
> use case where these won't be available?
...
> I've not yet read the design spec for sharding in Postgres. If there's
> something I can read over to get an idea of why you think this won't
> work, then maybe we can come to a good conclusion that way. But if I
> take a guess, then I'd have imagined that we'd not support sharding
> over different major versions, and if we really needed to change the
> serialisation format later, then we could do so. We could even put a
> note in the documents that the serialisation format may change between
> major versions.

Well, OK, so here was my thought.  For the sake of simplicity, let's
suppose that creating a sharded table works more or less like what you
can already do today: create a parent table with a non-inherited
CHECK (false) constraint and then create some inheritance children that
are foreign tables on various remote servers.  Give those children
CHECK constraints that explicate the partitioning scheme.  This is
probably not actually how we want this to work in detail (e.g. we
probably want declarative partitioning) but the details don't matter
very much for purposes of what I'm trying to explain here so let's
just ignore them for the moment.

Now, let's suppose that the user sets up a sharded table and then
says: SELECT a, SUM(b), AVG(c) FROM sometab GROUP BY a.  At this
point, what we'd like to have happen is that for each child foreign
table, we go and fetch partially aggregated results.  Those children
might be running any version of PostgreSQL - I was not assuming that
we'd insist on matching major versions, although of course that could
be done - and there would probably need to be a minimum version of
PostgreSQL anyway.  They could even be running some other database.
As long as they can spit out partial aggregates in a format that we
can understand, we can deserialize those aggregates and run combine
functions on them.
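
To make that flow concrete, here is a minimal Python sketch of partial
aggregation for AVG.  It is only an illustration under stated
assumptions: the AvgState class, the function names, and both wire
formats are invented for this example and are not PostgreSQL's actual
transition state or serialisation format.  Each "shard" produces a
(count, sum) state, ships it in either an array-like text form or an
opaque binary form, and the local side deserialises and combines.

```python
import struct
from dataclasses import dataclass
from decimal import Decimal
from functools import reduce


@dataclass
class AvgState:
    """Hypothetical transition state for AVG: row count plus running sum."""
    count: int
    total: Decimal


def transition(state, value):
    # Fold one input value into the state (runs on the remote side).
    return AvgState(state.count + 1, state.total + value)


def partial_aggregate(values):
    # Aggregate one shard's rows into a single partial state.
    state = AvgState(0, Decimal(0))
    for v in values:
        state = transition(state, Decimal(v))
    return state


def combine(a, b):
    # Merge two partial states (runs on the local side).
    return AvgState(a.count + b.count, a.total + b.total)


def finalize(state):
    # Turn the merged state into the final AVG; None for zero rows.
    return state.total / state.count if state.count else None


def serialize_text(state):
    # Array-like text form; made up here, merely resembling "{count,sum}".
    return "{%d,%s}" % (state.count, state.total)


def deserialize_text(text):
    count, total = text.strip("{}").split(",")
    return AvgState(int(count), Decimal(total))


def serialize_bytes(state):
    # Opaque binary form, also made up: 8-byte big-endian count followed
    # by the sum rendered as UTF-8 text.
    return struct.pack("!q", state.count) + str(state.total).encode()


def deserialize_bytes(blob):
    (count,) = struct.unpack("!q", blob[:8])
    return AvgState(count, Decimal(blob[8:].decode()))


def combine_all(states):
    # Fold any number of partial states into one.
    return reduce(combine, states, AvgState(0, Decimal(0)))


if __name__ == "__main__":
    shards = [[1, 2, 3], [4, 5]]
    wire = [serialize_text(partial_aggregate(rows)) for rows in shards]
    merged = combine_all(deserialize_text(w) for w in wire)
    print(wire, finalize(merged))
```

Either wire format carries the same information.  The difference is
only in who can produce it: the text form can be generated by anything
that can format two numbers, while the binary form requires the sender
to know the exact layout.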
But if the remote side is, say, MariaDB, it's probably much easier to
get it to spit out something that looks like a PostgreSQL array than
it is to make it spit out some bytea blob that's in an entirely
PostgreSQL-specific format.

Now, maybe that doesn't matter.  Even getting something like this
working with PostgreSQL as the remote side is going to be hard.
Moreover, for this to have any chance of working with anything other
than a compatible PostgreSQL server on the remote side, the FDW is
going to have to write bespoke code for each aggregate anyway, and
that code can always construct the requisite bytea blobs internally.
So who cares?  I can't point to any really tangible advantage of
having serialized transition states be human-readable, so maybe there
isn't one.  But I was thinking about it, for fear that there might be
some advantage that I'm missing.

> To be really honest, I'm quite worried that if I go and make this
> change then my time might be wasted as I really think making that
> change this late in the day is just setting this up for failure. I
> really don't think we can bat this patch over the Pacific Ocean too
> many times before we find ourselves inside the feature freeze. Of
> course, if you really think it's no good, that's different, it can't
> be committed, but "I think it might be better" does not seem like a
> solid enough argument for me to want to risk trying this and delaying
> further for that.

I think I agree.  Certainly, with what we've got here today, these are
not user-exposed, so I think we could certainly change them next time
around if need be.  If they ever become user-exposed, then maybe we
should think a little harder.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers