Re: LSM tree for Postgres

Konstantin Knizhnik Tue, 04 Aug 2020 11:56:08 -0700



On 04.08.2020 20:44, Tomas Vondra wrote:

Unique indexes are not supported now.
And I do not see an acceptable solution here.
If we have to check for the presence of duplicates at insert time, then it will eliminate all the advantages of the LSM approach. And if we postpone the check to the moment of merge, then... I am afraid that it will be too late.
Ummm, but in your response to Stephen you said:

    But search locates not ANY record with specified key in top index
    but record which satisfies snapshot of the transaction. Why do we
    need more records if we know that there are no duplicates?

So how do you know there are no duplicates, if unique indexes are not
supported (and may not be for LSM)?


In the index AM I marked the Lsm3 index as not supporting unique constraints.
So it cannot be used to enforce a unique constraint.
But it is possible to specify "unique" in the index properties.

In this case it is the responsibility of the programmer to guarantee that there are no duplicates in the index. This option enables the search optimization: locate the first record satisfying the snapshot and do not touch the other indexes.
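
To illustrate the optimization, here is a minimal Python sketch. It is only a toy model, not Lsm3 code: the dictionaries, the "visible" flag (standing in for the snapshot check) and the function name lsm_lookup are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Toy model: an index maps a key to a list of row versions.
# "visible" stands in for "satisfies the snapshot of the transaction".
Index = Dict[int, List[dict]]


def lsm_lookup(indexes: List[Index], key: int, unique: bool) -> Tuple[List[dict], int]:
    """Return (visible rows found, number of indexes probed)."""
    found: List[dict] = []
    probed = 0
    for index in indexes:
        probed += 1
        for row in index.get(key, []):
            if row["visible"]:
                found.append(row)
        # With "unique" the caller promises there are no duplicates,
        # so the first visible match ends the search.
        if unique and found:
            break
    return found, probed


# Hypothetical layout: two top indexes followed by the main index.
top_active = {1: [{"value": "a", "visible": True}]}
top_merging = {2: [{"value": "b", "visible": True}]}
main = {3: [{"value": "c", "visible": True}]}
indexes = [top_active, top_merging, main]

rows_u, probed_u = lsm_lookup(indexes, 1, unique=True)
rows_n, probed_n = lsm_lookup(indexes, 1, unique=False)
print(probed_u, probed_n)
```

Without the flag every index has to be probed, because another visible record with the same key might live in any of them.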

Isn't it a bit suspicious that with more clients the throughput actually
drops significantly? Is this merely due to PoC stage, or is there some
inherent concurrency bottleneck?
My explanation is the following (I am not 100% sure that it is true): multiple clients insert records faster than the merge bgworker is able to merge them into the main index. This causes the top index to blow up, and as a result it does not fit in memory any more. So we lose the advantage of fast inserts. If we had N top indexes instead of just 2, we could keep the size of each top index small enough. But in that case search operations would have to merge N indexes, so search would be almost N times slower (the fact that each top index fits in memory doesn't mean that all of them fit in memory at the same time, so we would still have to read pages from disk during lookups in the top indexes).
Hmmm, maybe. Should be easy to verify by monitoring the size of the top
index, and limiting it to some reasonable value to keep good
performance. Something like gin_pending_list_limit I guess.
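
The hypothesis can be played with in a small Python model. This is a sketch under simplifying assumptions, not a measurement of Lsm3: time advances in ticks, there are two top indexes (one active, one being merged), and the rates, the threshold and the function name are invented for the example.

```python
def peak_top_index_size(insert_rate: int, merge_rate: int,
                        threshold: int, ticks: int) -> int:
    """Return the largest size the active top index reaches.

    insert_rate: entries added to the active top index per tick
    merge_rate:  entries the merge worker moves to the main index per tick
    threshold:   size at which the active top index is handed to the worker
    """
    active = 0    # entries in the active top index
    merging = 0   # entries still left in the top index being merged
    peak = 0
    for _ in range(ticks):
        active += insert_rate
        peak = max(peak, active)
        if merging > 0:
            merging = max(0, merging - merge_rate)
        # The swap is only possible once the previous merge has finished;
        # until then the active index keeps growing past the threshold.
        if merging == 0 and active >= threshold:
            merging, active = active, 0
    return peak


fast_merge = peak_top_index_size(insert_rate=100, merge_rate=200,
                                 threshold=1000, ticks=1000)
slow_merge = peak_top_index_size(insert_rate=100, merge_rate=50,
                                 threshold=1000, ticks=1000)
print(fast_merge, slow_merge)
```

In this model the active top index stays at the threshold as long as the merge keeps up, and grows without bound once inserts outpace the merge, which is the situation where it would stop fitting in memory.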

Lsm3 provides functions for getting the size of the active top index, explicitly forcing a merge of the top index, and waiting for completion of the merge operation.
One of the use cases of Lsm3 may be delayed update of indexes.

For some applications insert speed is critical: they cannot lose data which is received at a high rate. In this case, during working hours we insert data into the small index, and at night we initiate a merge of this index with the main index.
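
A toy Python model of this delayed-update pattern is sketched below. The class and method names are hypothetical and do not correspond to the actual Lsm3 functions; sorted lists stand in for the B-Tree indexes, and the merge is synchronous for simplicity.

```python
import bisect
import heapq


class DelayedMergeIndex:
    """Small top index absorbs inserts; main index changes only on merge."""

    def __init__(self) -> None:
        self.top: list = []    # small sorted list, assumed to fit in memory
        self.main: list = []   # large sorted list

    def insert(self, key: int) -> None:
        # Daytime path: only the small top index is touched.
        bisect.insort(self.top, key)

    def top_size(self) -> int:
        return len(self.top)

    def force_merge(self) -> None:
        # Nightly path: merge the two sorted runs into the main index.
        self.main = list(heapq.merge(self.main, self.top))
        self.top = []

    def contains(self, key: int) -> bool:
        # Lookups must consult both indexes until the merge has happened.
        for index in (self.top, self.main):
            i = bisect.bisect_left(index, key)
            if i < len(index) and index[i] == key:
                return True
        return False


idx = DelayedMergeIndex()
for key in (5, 1, 3):
    idx.insert(key)
size_before = idx.top_size()
idx.force_merge()
print(size_before, idx.top_size(), idx.main)
```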
