On Mon, Sep 16, 2013 at 11:43 AM, Heikki Linnakangas < hlinnakan...@vmware.com> wrote:
> On 15.09.2013 12:14, Alexander Korotkov wrote: > >> On Sat, Jun 29, 2013 at 12:56 PM, Heikki Linnakangas< >> hlinnakan...@vmware.com> wrote: >> >> There's a few open questions: >>> >>> 1. How are we going to handle pg_upgrade? It would be nice to be able to >>> read the old page format, or convert on-the-fly. OTOH, if it gets too >>> complicated, might not be worth it. The indexes are much smaller with the >>> patch, so anyone using GIN probably wants to rebuild them anyway, sooner >>> or >>> later. Still, I'd like to give it a shot. >>> >>> 2. The patch introduces a small fixed 32-entry index into the packed >>> items. Is that an optimal number? >>> >>> 3. I'd like to see some performance testing of insertions, deletions, and >>> vacuum. I suspect that maintaining the 32-entry index might be fairly >>> expensive, as it's rewritten on every update to a leaf page. >>> >> >> It appears that maintaining 32-entry index is really expensive because it >> required re-decoding whole page. This issue is fixed in attached version >> of >> patch by introducing incremental updating of that index. Benchmarks will >> be >> posted later today.. >> > > Great! Please also benchmark WAL replay; you're still doing > non-incremental update of the 32-entry index in ginRedoInsert. > > A couple of quick comments after a quick glance (in addition to the above): > > The varbyte encoding stuff is a quite isolated piece of functionality. And > potentially useful elsewhere, too. It would be good to separate that into a > separate .c/.h files. I'm thinking of > src/backend/access/gin/**packeditemptr.c, > which would contain ginDataPageLeafReadItemPointer**, > ginDataPageLeafWriteItemPointe**r and ginDataPageLeafGetItemPointerS**ize > functions. A new typedef for varbyte-encoded things would probably be good > too, something like "typedef char *PackedItemPointer". In the new .c file, > please also add a file-header comment explaining the encoding. > > The README really needs to be updated. The posting tree page structure is > a lot more complicated now, there needs to be a whole new section to > explain it. A picture would be nice, similar to the one in bufpage.h. > > It's a bit funny that we've clumped together all different kinds of GIN > insertions into one WAL record type. ginRedoInsert does completely > different things depending on what kind of a page it is. And the > ginXlogInsert struct also contains different kinds of things depending on > what kind of an insertion it is. It would be cleaner to split it into > three. (this isn't your patch's fault - it was like that before, too.) Apparently some bugs breaks index on huge updates independent on incremental update of the 32-entry. I'm in debugging now. That's why benchmarks are delayed. ------ With best regards, Alexander Korotkov.