On Wed, Jan 7, 2015 at 4:11 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:
> On Wed, Dec 3, 2014 at 2:37 AM, Robert Haas <robertmh...@gmail.com> wrote:
> > On Fri, Nov 28, 2014 at 4:27 AM, Alexander Korotkov
> > <aekorot...@gmail.com> wrote:
> >> On Fri, Nov 21, 2014 at 8:12 AM, Michael Paquier
> >> <michael.paqu...@gmail.com> wrote:
> >>> Please find attached a simple patch adding fillfactor as storage
> >>> parameter for GIN indexes. The default value is the same as the one
> >>> currently aka 100 to have the pages completely packed when a GIN
> >>> index is created.
> >>
> >> That's not true. Let us discuss it a little bit.
> >> [blah discussion]
> That's quite a nice explanation. Thanks!
>
> >> My summary is following:
> >> 1) In order to have fully correct support of fillfactor in GIN we
> >> need to rewrite GIN build algorithm.
> >> 2) Without rewriting GIN build algorithm, not much can be done with
> >> entry tree. However, you can implement some heuristics.
> TBH, I am not really planning to rewrite the whole code.
>
> >> 3) You definitely need to touch code that selects ratio of split in
> >> dataPlaceToPageLeaf (starting with if (!btree->isBuild)).
> OK I see, so for a split we need to have a calculation based on the
> fillfactor, with 75% by default.
>
> >> 4) GIN data pages are always compressed excepts pg_upgraded indexes
> >> from pre-9.4. Take care about it in following code.
> >>     if (GinPageIsCompressed(page))
> >>         freespace = GinDataLeafPageGetFreeSpace(page);
> >> +   else if (btree->isBuild)
> >> +       freespace = BLCKSZ * (100 - fillfactor) / 100;
> Hm. Simply reversing both conditions is fine, no?
>
> > This is a very interesting explanation; thanks for writing it up!
> > It does leave me wondering why anyone would want fillfactor for GIN at
> > all, even if they were willing to rewrite the build algorithm.
> Based on the explanation of Alexander, the current GIN code fills in a
> page at 75% for a split, and was doing even 50/50 pre-9.4 if I recall
> correctly. IMO a higher fillfactor makes sense for a GIN index that
> gets less random updates, no?
>
> I am attaching an updated patch, with the default fillfactor value at
> 75%, and with the page split code using the fillfactor rate.
> Thoughts?

A rewritten version of the patch is attached. I made the following changes:

1) I removed fillfactor handling from the entry tree, because in that
case the fillfactor parameter would only be an upper bound on actual page
fullness. That is very much like the GiST approach to fillfactor, but I
would rather remove fillfactor from GiST than spread that approach to
other AMs.
2) Fillfactor handling for posting trees is fixed.
3) The default fillfactor for GIN is reverted to 90. I didn't mean that
the default fillfactor should be 75%. I meant that the equilibrium state
of the tree would be about 75% fullness. But that doesn't mean we don't
want indexes to be better packed just after creation. Anyway, I don't see
a reason why the default fillfactor for the GIN btree should differ from
the default fillfactor of a regular btree.

------
With best regards,
Alexander Korotkov.
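
For readers following the thread, the free-space expression quoted in point 4, BLCKSZ * (100 - fillfactor) / 100, can be worked through in isolation. The sketch below is only an illustration and is not taken from the attached patch: the helper name sketch_reserved_space and the macro SKETCH_BLCKSZ are hypothetical, and the value 8192 assumes PostgreSQL's default block size.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Assumption: 8192 bytes, PostgreSQL's default block size. A real build
 * takes BLCKSZ from its configuration, so this macro is a stand-in.
 */
#define SKETCH_BLCKSZ 8192

/*
 * Hypothetical helper (not part of the patch): number of bytes that would
 * be left free on a page during index build for a given fillfactor,
 * following the expression quoted in the thread. Integer division
 * truncates, exactly as the quoted C expression would.
 */
size_t
sketch_reserved_space(int fillfactor)
{
    /* Assumption: fillfactor is a percentage in the range 0..100. */
    assert(fillfactor >= 0 && fillfactor <= 100);
    return (size_t) SKETCH_BLCKSZ * (size_t) (100 - fillfactor) / 100;
}
```

Under these assumptions a fillfactor of 90 reserves 819 bytes per page, 75 reserves 2048 bytes, and 100 reserves nothing, which is the "completely packed" case mentioned at the start of the thread.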
gin_fillfactor_3.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)