Tom, * Tom Lane (t...@sss.pgh.pa.us) wrote: > I've now taken this idea as far as building the required infrastructure > and revamping a couple of array operators to use it. There's a lot yet > to do, but I've done enough to get some preliminary ideas about > performance (see below).
Nice! > * Although I said above that everything owned by a deserialized object > has to live in a single memory context, I do have ideas about relaxing > that. The core idea would be to invent a "memory context reset/delete > callback" feature in mcxt.c. Then a deserialized object could register > such a callback on its own memory context, and use the callback to clean > up resources outside its context. This is potentially useful for instance > for something like PostGIS, where an object likely includes some data that > was allocated with malloc not palloc because it was created by library > functions that aren't Postgres-aware. Another likely use-case is for > deserialized objects representing composite types to maintain reference > counts on their tuple descriptors instead of having to copy said > descriptors into their private contexts. This'd be material for a > separate patch though. Being able to register a callback to be used on deletion of the context would certainly be very nice and strikes me as pretty independent of the rest of this. You've probably thought of this already, but registering the callback should probably allow the caller to pass in a pointer to be passed back to the callback function when the delete happens, so that there's a place for the metadata to be stored about what the callback function needs to clean up when it's called. > So that's the plan, and attached is a very-much-WIP patch that uses this > approach to speed up plpgsql array element assignments (and not a whole > lot else as yet). Here's the basic test case I've been using: I've not looked at the code at all as yet, but it makes sense to me. > With the attached patch, those timings drop to 80 and 150 ms respectively. And those numbers are pretty fantastic and would address an area we regularly get dinged on. > Still, if the worst-case slowdown is around 20% on trivially-sized arrays, > I'd gladly take that to have better performance on larger arrays. And I > think this example is close to the worst case for the patch's approach, > since it's testing small, fixed-element-length, no-nulls arrays, which is > what the existing code can handle without spending a lot of cycles. Agreed. > BTW, I'm not all that thrilled with the "deserialized object" terminology. > I found myself repeatedly tripping up on which form was serialized and > which de-. If anyone's got a better naming idea I'm willing to adopt it. Unfortunately, nothing comes to mind. Serialization is, at least, a pretty well understood concept and so the naming will likely make sense to newcomers, even if it's difficult to keep track of which is serialized and which is deserialized. > I'm not sure exactly how to push this forward. I would not want to > commit it without converting a significant number of array functions to > understand about deserialized inputs, and by the time I've finished that > work it's likely to be too late for 9.5. OTOH I'm sure that the PostGIS > folk would love to have this infrastructure in 9.5 not 9.6 so they could > make a start on fixing their issues. (Further down the pike, I'd plan to > look at adapting composite-type operations, JSONB, etc, to make use of > this approach, but that certainly isn't happening for 9.5.) > > Thoughts, advice, better ideas? I'm not really a big fan of putting an infrastructure out there for modules to use that we don't use ourselves (particularly when it's clear that there are places where we could/should be). On the other hand, this doesn't impact on-disk format and therefore I'm less worried that we'll end up with a release-critical issue when we're getting ready to put 9.5 out there. So, I'm on the fence about it. I'd love to see all of this in 9.5 with the array functions converted, but I don't think it'd be horrible if only a subset had been done in time for 9.5. The others aren't going to go anywhere and will still work. I do think it'd be better to have at least some core users of this new infrastructure rather than just putting it out there for modules to use but I agree it'd be a bit grotty to have only some of the array functions converted. Thanks! Stephen
signature.asc
Description: Digital signature