On 15/03/14 20:27, Heikki Linnakangas wrote:
That said, I didn't expect the difference to be quite that big when you're appending to the end of the table. When the new entries go to the end of the posting lists, you only need to recompress and WAL-log the last posting list, which is max 256 bytes long. But I guess that's still a lot more WAL than in the old format.

That could be optimized, but I figured we can live with it, thanks to the fastupdate feature. Fastupdate allows amortizing that cost over several insertions. But of course, you explicitly disabled that...

In a concurrent update environment, fastupdate as it is in 9.2 is not really useful. You may be able to batch up insertions, but you have no control over who ends up paying the debt. Doubling the amount of WAL from GIN indexing would be pretty tough for us: in 9.2 we generate roughly 1 TB of WAL per day and keep it for some weeks to be able to do PITR. The WAL is mainly due to GIN index updates as new data is added and needs to be searchable by users. We do run gzip, which cuts it down to 25-30%, before keeping it for too long, but doubling this is going to be a migration challenge.
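
To put rough numbers on that, here is a back-of-envelope sketch in Python. The 1 TB/day and 25-30% figures are the ones above; the three-week retention is only an assumed stand-in for "some weeks", and the function name and the factor of two are illustrative, not measurements.

```python
def retained_wal_tb(daily_wal_tb, compressed_fraction, retention_days, growth_factor=1.0):
    """Compressed WAL volume kept on disk for PITR, in TB.

    compressed_fraction is the size after gzip relative to the raw WAL
    (0.25 means the archive is 25% of the original size).
    """
    return daily_wal_tb * growth_factor * compressed_fraction * retention_days


DAILY_WAL_TB = 1.0              # "roughly 1 TB of WAL per day"
GZIP_FRACTIONS = (0.25, 0.30)   # gzip output is 25-30% of the raw size
RETENTION_DAYS = 21             # ASSUMPTION: "some weeks" taken as three weeks

# Compare today's volume (factor 1) with a hypothetical doubling (factor 2).
for factor in (1.0, 2.0):
    low, high = (
        retained_wal_tb(DAILY_WAL_TB, fraction, RETENTION_DAYS, factor)
        for fraction in GZIP_FRACTIONS
    )
    print(f"WAL x{factor:.0f}: {low:.2f} to {high:.2f} TB retained")
```

The retained volume scales linearly with the WAL rate, so any doubling of GIN WAL shows up one-to-one in archive storage and transfer, whatever the real retention window is.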

If fastupdate could be made to work in an environment where users are searching the index and manually updating it while 4+ backend processes update the index concurrently, it would be a real benefit.

The GIN index currently contains 70+ million records with an average tsvector of 124 terms.

--
Jesper .. trying to add some real-world info.



- Heikki





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
