Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

Andrey Borodin Mon, 09 Apr 2018 22:25:20 -0700


> 9 апр. 2018 г., в 23:04, Heikki Linnakangas <hlinn...@iki.fi> написал(а):
> 
> On 09/04/18 18:21, Andrey Borodin wrote:
>>> 9 апр. 2018 г., в 19:50, Teodor Sigaev <teo...@sigaev.ru>
>>> написал(а):
>>>> 3. Why do we *not* lock the entry leaf page, if there is no
>>>> match? We still need a lock to remember that we probed for that
>>>> value and there was no match, so that we conflict with a tuple
>>>> that might be inserted later. At least #3 is a bug. The attached
>>>> patch adds an isolation test that demonstrates it. #1 and #2 are
>>>> weird, and cause unnecessary locking, so I think we should fix
>>>> those too, even if they don't lead to incorrect results.
>>> I can't find a hole here. Agree.
>> Please correct me if I'm wrong. Let's say we have posting trees for
>> word A and word B. We are looking for a document that contains both. We will 
>> read through all posting tree of A, but only through some
>> segments of B. If we will not find anything in B, we have to lock
>> only segments where we actually were looking, not all the posting
>> tree of B.
> 
> True, that works. It was not clear from the code or comments that that was 
> intended. I'm not sure if that's worthwhile, compared to locking just the 
> posting tree root block.
From the text search POV this is kind of bulky granularity: if you have 
frequent words like "the", "a", "in", conflicts are inevitable.
I'm not sure we have means for picking optimal granularity: should it be ranges 
of postings, ranges of pages of posting trees, entries, pages of entries or 
whole index.
Technically, [time for locking] should be less than [time of transaction 
retry]*[probability of conflict]. Holding this constraint we should minimize 
[time for locking] + [time of transaction retry]*[probability of conflict].
I suspect that [time for locking] is some orders of magnitude less than time of 
transaction. So, efforts should be skewed towards smaller granularity to reduce 
[probability of conflict].
But all this is not real math and have no strength of science.


> I'll let Teodor decide..
+1. I belive this is very close to optimal solution :)

Best regards, Andrey Borodin.

Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

Reply via email to