On 09.10.2013 02:04, Tomas Vondra wrote:
On 8.10.2013 21:59, Heikki Linnakangas wrote:
On 08.10.2013 17:47, Alexander Korotkov wrote:
Hi, Tomas!

On Sun, Oct 6, 2013 at 3:58 AM, Tomas Vondra<t...@fuzzy.cz>   wrote:

I've attempted to rerun the benchmarks tests I did a few weeks ago, but
   I got repeated crashes when loading the data (into a table with
tsvector+gin index).

Right before a crash, theres this message in the log:

     PANIC:  not enough space in leaf page!


Thanks for testing. Heikki's version of patch don't works for me too on
even much more simplier examples. I can try to get it working if he
answer
my question about GinDataLeafPageGetPostingList* macros.

The new macros in that patch version were quite botched. Here's a new
attempt.

Nope, still the same errors :-(

PANIC:  not enough space in leaf page!
LOG:  server process (PID 29722) was terminated by signal 6: Aborted
DETAIL:  Failed process was running: autovacuum: ANALYZE public.messages

I've continued hacking away at the patch, here's yet another version. I've done a lot of cleanup and refactoring to make the code more readable (I hope). I'm not sure what caused the panic you saw, but it's probably fixed now. Let me know if not.

Some notable changes since my last patch version:

* I changed the ginbtree interface so that isEnoughSpace method is no more. Instead, placeToPage returns false without doing anything if the item doesn't fit. It was slightly simpler this way when working with the compressed posting lists, as placeToPage had to calculate tuple sizes anyway to decide how many items fit on the page.

* I rewrote incrUpdateItemIndexes(). It now operates in two stages: 1. adjust the pageOffsets and 2. split the bin. I found it more readable this way.

* I changed the WAL format of insert records so that there is a separate struct for data-leaf, data-internal, and entry pages. The information recorded for each of those was so different that it was confusing to cram them all into the same struct.


I'm going to step back a bit from the details of the patch now. I think there's enough work for you, Alexander, until the next commitfest:

* Try to make the code also work with the old page format, so that you don't need to REINDEX after pg_upgrade.

* Documentation. The new compressed posting list format needs to be explained somewhere. At the top of ginpostinglist.c would be a good place. The README should be improved too.

* Fix whatever I broke (sorry)

Are we committed to go ahead with this patch in 9.4 timeframe, in one form or another? If we are, then I'd like to start committing refactoring patches that just move code around in preparation for the Patch, to make the review of the core of the patch easier. For example, merging the isEnoughSpace and placeToPage methods in the ginbtree interface. Without the patch, it's unnecessary code churn - it won't do any harm but it won't do any good either. I'm pretty confident that this patch will land in the 9.4 timeframe, so barring objections I'll start committing such refactorings.

- Heikki

Attachment: gin-packed-postinglists-10-heikki.patch.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to