On 04.08.2020 20:44, Tomas Vondra wrote:
Unique indexes are not supported now.
And I do not see an acceptable solution here.
If we have to check for the presence of a duplicate at insert time, it will eliminate all the advantages of the LSM approach. And if we postpone the check until the merge, then... I am afraid it will be too late.


Ummm, but in your response to Stephen you said:

    But search locates not ANY record with specified key in top index
    but record which satisfies snapshot of the transaction. Why do we
    need more records if we know that there are no duplicates?

So how do you know there are no duplicates, if unique indexes are not
supported (and may not be for LSM)?


In the index AM I marked the Lsm3 index as not supporting unique constraints.
So it cannot be used to enforce a unique constraint.
But it is possible to specify "unique" in the index properties.
In this case it is the responsibility of the programmer to guarantee that there are no duplicates in the index. This option enables the search optimization mentioned above: locate the first record satisfying the snapshot and do not touch the other indexes.
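
A minimal Python sketch of this lookup shortcut, as a toy model and not Lsm3 code. The names `Entry`, `visible` and `lookup` are hypothetical, each index is modeled as a plain dictionary, and snapshot visibility is reduced to a comparison of transaction ids, which is far simpler than real MVCC rules.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Entry:
    key: int
    xmin: int   # id of the inserting transaction (toy visibility model)
    value: str


# Hypothetical stand-in for one index: key -> entries stored under that key.
Index = Dict[int, List[Entry]]


def visible(entry: Entry, snapshot_xid: int) -> bool:
    # Assumption: an entry is visible if it was inserted before the snapshot.
    return entry.xmin < snapshot_xid


def lookup(indexes: List[Index], key: int, snapshot_xid: int,
           unique: bool) -> Tuple[List[Entry], int]:
    """Probe indexes in order (top indexes first, main index last).

    Returns the visible matches and the number of indexes probed.
    With unique=True the search stops at the first visible match,
    relying on the caller's promise that no duplicates exist.
    """
    matches: List[Entry] = []
    probed = 0
    for index in indexes:
        probed += 1
        for entry in index.get(key, []):
            if visible(entry, snapshot_xid):
                matches.append(entry)
                if unique:
                    return matches, probed
    return matches, probed


top: Index = {1: [Entry(1, 10, "new")]}
main: Index = {2: [Entry(2, 3, "old")]}

# Declared unique: key 1 is found in the top index, the main index is skipped.
print(lookup([top, main], 1, 20, unique=True)[1])
# Not declared unique: every index has to be probed for the same key.
print(lookup([top, main], 1, 20, unique=False)[1])
```

If the promise of uniqueness is broken, the `unique=True` path silently returns only the first visible match, which is why the guarantee has to come from the application.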


Isn't it a bit suspicious that with more clients the throughput actually
drops significantly? Is this merely due to PoC stage, or is there some
inherent concurrency bottleneck?

My explanation is the following (I am not 100% sure that it is true): multiple clients insert records faster than the merge bgworker is able to merge them into the main index. This causes the top index to blow up, and as a result it no longer fits in memory. So we lose the advantage of fast inserts. If we have N top indexes instead of just 2, we can keep the size of each top index small enough. But in this case search operations will have to merge N indexes, and so search is almost N times slower (the fact that each top index fits in memory doesn't mean that all of them fit in memory at the same time, so we still have to read pages from disk during lookups in the top indexes).
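
The backlog argument can be illustrated with a small Python sketch. It is a toy queue model, not a measurement and not how the Lsm3 bgworker actually schedules merges: `simulate_top_index` and `first_overflow` are hypothetical names, and the rates and memory limit are made-up numbers chosen only to show the effect.

```python
from typing import List, Optional


def simulate_top_index(insert_rate: int, merge_rate: int,
                       seconds: int) -> List[int]:
    """Return the top-index size (in records) after each second.

    insert_rate: records per second inserted by all clients together
    merge_rate:  records per second the merge worker moves to the main index
    """
    size = 0
    history = []
    for _ in range(seconds):
        size += insert_rate            # clients add records
        size -= min(size, merge_rate)  # the worker drains what it can
        history.append(size)
    return history


def first_overflow(history: List[int], memory_limit: int) -> Optional[int]:
    """Return the first second at which the top index exceeds memory_limit."""
    for second, size in enumerate(history, start=1):
        if size > memory_limit:
            return second
    return None


# Inserts slower than the merge: the top index stays empty.
print(simulate_top_index(insert_rate=1000, merge_rate=1500, seconds=5))
# Inserts faster than the merge: the backlog grows by 1500 records per second.
backlog = simulate_top_index(insert_rate=3000, merge_rate=1500, seconds=5)
print(backlog, first_overflow(backlog, memory_limit=5000))
```

In this model adding clients only raises `insert_rate`; once it exceeds `merge_rate` the top index grows without bound and eventually crosses whatever memory limit is assumed.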


Hmmm, maybe. Should be easy to verify by monitoring the size of the top
index, and limiting it to some reasonable value to keep good
performance. Something like gin_pending_list_limit I guess.


Lsm3 provides functions for getting the size of the active top index, explicitly forcing a merge of the top index, and
waiting for completion of the merge operation.
One of the use cases of Lsm3 may be delayed update of indexes.
For some applications insert speed is critical: they cannot lose data which is received at a high rate. In this case, during working hours we insert data into the small index and at night initiate a merge of this index with the main index.
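
A minimal Python sketch of this delayed-merge pattern. It is only a toy model: the class `DelayedMergeIndex` and its methods are hypothetical and are not the Lsm3 functions, both indexes are plain dictionaries, and the merge runs synchronously, whereas Lsm3 performs it in a background worker.

```python
from typing import Dict, Optional


class DelayedMergeIndex:
    """Toy two-level index: a small top index plus a large main index."""

    def __init__(self) -> None:
        self.top: Dict[int, str] = {}
        self.main: Dict[int, str] = {}

    def insert(self, key: int, value: str) -> None:
        # Inserts touch only the small top index, so they stay cheap.
        self.top[key] = value

    def top_index_size(self) -> int:
        # Number of records waiting to be merged.
        return len(self.top)

    def merge(self) -> int:
        # Explicitly forced merge: move everything into the main index.
        moved = len(self.top)
        self.main.update(self.top)
        self.top.clear()
        return moved

    def search(self, key: int) -> Optional[str]:
        # Look in the top index first, then fall back to the main index.
        if key in self.top:
            return self.top[key]
        return self.main.get(key)


index = DelayedMergeIndex()

# "Working hours": only fast inserts into the top index.
for key in range(1000):
    index.insert(key, f"row-{key}")
print(index.top_index_size(), len(index.main))

# "Night": force the merge and check that the data is still searchable.
moved = index.merge()
print(moved, index.top_index_size(), index.search(42))
```

The same size check could also drive a threshold-based merge during the day if the top index grows faster than expected.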


