> > What would be wrong with letting it degrade to lossy?  I suppose the
> > reason it's trying to avoid that is to avoid having to recheck all the
> > rows on that page when it comes time to do the index insertion; but
> > surely having to do that is better than having arbitrary, unpredictable
> > failure conditions.
>
> No, I don't think that's it.  See here, beginning with "the problem
> with lossy tbm has two aspects":
>
> http://archives.postgresql.org/message-id/4974b002.3040...@sigaev.ru
Right. Some comments on those points:

 - The amgettuple interface has no way to work with a page-wide result
   instead of an exact ItemPointer: amgettuple cannot return just a block
   number the way amgetbitmap can.

It's not so difficult to teach GIN to return ItemPointers one by one from the pending list and so eliminate this point: GIN would not collect matched ItemPointers in a tidbitmap, but return them immediately. But:

 - Because of the concurrent vacuum process: while we scan the pending list,
   its contents could be transferred into the regular structure of the index,
   and then we would find the same tuple twice. Again, amgettuple has no
   protection from that; only amgetbitmap has it. So we need to filter the
   results from the regular GIN structure by the results from the pending
   list, and for that filtering we cannot use a lossy tbm.

Again, we are talking about the amgettuple interface. We need to filter the results from the regular GIN structure by the results from the pending list. Right now GIN does that by looking up each matched ItemPointer in a tidbitmap built from the pending list. We could use an ordered array of ItemPointers instead of a tidbitmap, but I don't believe it would take less memory. Rescanning the pending list instead is impossible for two reasons: a) it is too slow, and b) a concurrent cleanup process (vacuum, or now insert) could move tuples into the regular structure.

Would it be acceptable to add an option to tidbitmap that forbids it from becoming lossy? That would remove the need to disable the index scan in gincostestimate and to check for a non-lossy tidbitmap.

In the current version, cleanup of the pending list starts much earlier than the non-lossy limit is reached in typical use-cases. An insertion process will start cleanup when at least one of these triggers fires:
 - the number of heap tuples in the pending list could produce a lossy tidbitmap;
 - the total size of the pending list is greater than work_mem. This trigger was added to speed up bulk insertion (which is used in cleanup), because it collects the whole pending list in memory at once. And this trigger is stricter than the first one, because in GIN's typical use-case a heap tuple is rather big.

I believe a user could hit GIN's error about work_mem only intentionally:
- turn off autovacuum
- set a big work_mem
- populate a table with a GIN index (with the needed number of insertions)
- prepare a query that returns a lot of results (possibly with seqscan=off, because the cost of scanning the pending list grows quickly)
- decrease work_mem by at least ten times
- execute the query


--
Teodor Sigaev                                   E-mail: teo...@sigaev.ru
                                                   WWW: http://www.sigaev.ru/

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers