On Tuesday, January 29, 2013 2:53 AM Heikki Linnakangas wrote:
> On 28.01.2013 15:39, Amit Kapila wrote:
> > Rebased the patch as per HEAD.
>
> I don't like the way heap_delta_encode has intimate knowledge of how
> the lz compression works. It feels like a violent punch through the
> abstraction layers.
>
> Ideally, you would just pass the old and new tuple to pglz as char *,
> and pglz code would find the common parts. But I guess that's too slow,
> as that's what I originally suggested and you rejected that approach.
> But even if that's not possible on performance grounds, we don't need
> to completely blow up the abstraction. pglz can still do the encoding -
> the caller just needs to pass it the attribute boundaries to consider
> for matches, so that it doesn't need to scan them byte by byte.
>
> I came up with the attached patch. I wrote it to demonstrate the API,
> I'm not 100% sure the result after decoding is correct.
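For reference, the entry point your demonstration patch adds looks
roughly like the sketch below. I have reconstructed the signature from
the call sites, so the exact parameter types (in particular the element
type of the offsets array) are my guess:

/*
 * Compress 'source', allowing back-references both into 'source' itself
 * and into 'history'.  'offsets' lists the attribute start offsets to
 * consider as match positions, so the encoder need not scan the history
 * byte by byte.  Assumed to return false, like pglz_compress(), when
 * compressing the input is not worthwhile.
 */
bool pglz_compress_with_history(const char *source, int32 slen,
                                const char *history, int32 hlen,
                                int32 *offsets, int32 noffsets,
                                PGLZ_Header *dest,
                                const PGLZ_Strategy *strategy);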
I have checked the patch code and found a few problems.

1. The history should be the old tuple, so the call below needs its
arguments swapped:

/* Wrong: compresses the old tuple using the new tuple as history.
return pglz_compress_with_history((char *) oldtup->t_data, oldtup->t_len,
                                  (char *) newtup->t_data, newtup->t_len,
                                  offsets, noffsets,
                                  (PGLZ_Header *) encdata, &strategy);
*/

/* Correct: compress the new tuple, with the old tuple as history. */
return pglz_compress_with_history((char *) newtup->t_data, newtup->t_len,
                                  (char *) oldtup->t_data, oldtup->t_len,
                                  offsets, noffsets,
                                  (PGLZ_Header *) encdata, &strategy);

2. The offsets array should hold the offset of the beginning of each
column, so it must be filled before advancing past the attribute, not
after:

    /* Record the start offset of this attribute first... */
    offsets[noffsets++] = off;

    /* ...then advance past it. */
    off = att_addlength_pointer(off, thisatt->attlen, tp + off);

    if (thisatt->attlen <= 0)
        slow = true;            /* can't use attcacheoff anymore */

    /* was here, one attribute too late:
    offsets[noffsets++] = off; */
}

Apart from this, some of the test cases are failing, which I still need
to check.

I have debugged the new code, and it appears to me that it will not be
as efficient as the current approach of the patch: it needs to build a
hash table over the history for reference and comparison, which adds
overhead compared to the existing approach. I am collecting the
performance and WAL-reduction data.

Is there another way the current patch code could be made better, so
that we don't need to change the encoding approach? I have a feeling
that the new approach might not be equally good performance-wise.

With Regards,
Amit Kapila.
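P.S. To illustrate the overhead I mean, here is a minimal sketch of a
pglz-style history pass. The names (seed_history, hist_hash, HIST_SIZE)
are mine, and real pglz chains colliding entries rather than keeping a
single slot per bucket; the point is only that every history position
must be hashed into the table before the first match can be found:

#include <stdint.h>

#define HIST_SIZE 4096          /* hypothetical table size */

/* Hash the next 4 input bytes, pglz-style (illustrative only). */
static uint32_t
hist_hash(const unsigned char *p)
{
    return (p[0] ^ (p[1] << 4) ^ (p[2] << 8) ^ (p[3] << 12)) % HIST_SIZE;
}

/*
 * Seed the lookup table with every position of the history (the old
 * tuple).  This loop runs on every update, before a single output byte
 * is produced, and its cost grows with the history length.
 */
static void
seed_history(const unsigned char *history, int hlen, int32_t *table)
{
    for (int i = 0; i < HIST_SIZE; i++)
        table[i] = -1;          /* mark every bucket empty */
    for (int i = 0; i + 4 <= hlen; i++)
        table[hist_hash(history + i)] = i;      /* latest position wins */
}

The column-by-column comparison in the current patch avoids this pass
entirely, because the attribute boundaries already tell it where the
candidate matches are.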