[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16771882#comment-16771882
 ] 

Vikas Saurabh commented on OAK-8046:
------------------------------------

Thanks for the review [~tmueller].

bq. LOG.debug was changed to LOG.info, was that intentional? LOG.info("Change 
in index version detected..."
Yes, I intentionally did that for 2 reasons:
* seeing logs with wrapping counter under same query would've been confusing 
otherwise (I think)... such line would likely make it clearer why the counter 
wrapped around
* I don't expect such cases to be too common so this likely won't cause lot of 
logging traffic and would be useful in cases where it shows up

bq. hasRewound() needs to be called at the right moment, always after next()
Umm... I guess it needs to be called between 2 next() calls. After or before 
would probably not matter, right?

bq. I think there is a risk that hasRewound() isn't always called at the right 
moment
I don't think I understand. Afaict, as long as FulltextPathCursor#next is the 
right place to log index traversal log then checking for hasRewound() in next() 
should be ok too, right?

bq. A future change might be: add a skip method (to speed up queries with 
offset). I'm not saying we need to add this.
Hmm... I've multiple thoughts on this - current case is ok wrt to local lucene 
I think. Otoh, remote index can't have such a concept of rewound - we really 
*cant* give the guarantees in remote index that we give in local lucene. Next 
thought is where do we mean to give "skip" method - from lucene pov each new 
batch of result is already given due to skipping. From query pov, skip only 
makes sense for the first batch of resutls.

bq. Maybe it would be simpler if the LuceneResultRowIterator would maintain the 
read count itself, and so instead of boolean hasRewound add a method 
getReadCount? So rename IteratorRewoundStateProvider to IteratorReadSizeProvider
I thought about it - but then each provider (hopefully ES somewhere in the 
future) would also need to do the same thing OR FulltextPathCursor would've to 
do check if underlying cursor provides size or not and support its own counter 
accordingly. In the end, I felt that simple flag to reset counter should be ok. 
That said, I'm ok to change it if you feel rather strongly about it (I don't 
have strong opinion one way or other... I just preferred current one slightly).

Btw, do note, resetted counter is technically incorrect in context of 
"nrt"/"sync" results. Without index re-open, the counter would first account 
for results from "nrt"/"sync" and then lucene index based results would be 
counted. When the index gets reopened, the first counted result is from lucene 
index. So, "10000" after re-open would technically be "10000+x" (x results were 
given due to "nrt"/"sync").
I initially thought about accounting for this - but I don't expect "nrt"/"sync" 
results to be huge in context where these numbers become relevant - so, the 
error in accountin, imo was tolerable (when compared to implementation 
complication that accurate accounting would've brought in).

> Result items are not always correctly counted against the configured read 
> limit if a query uses a lucene index 
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: OAK-8046
>                 URL: https://issues.apache.org/jira/browse/OAK-8046
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: lucene
>    Affects Versions: 1.8.7
>            Reporter: Georg Henzler
>            Assignee: Vikas Saurabh
>            Priority: Major
>         Attachments: OAK-8046.patch
>
>
> There are cases where an index is re-opened during query execution. In that 
> case, already returned entries are read again and skipped, so basically 
> counted twice. This should be fixed to only count entries once (see also [1])
> The issue most likely exists since the read limit was introduced with OAK-6875
> [1] 
> https://lists.apache.org/thread.html/dddf9834fee0bccb6e48f61ba2a01430e34fc0b464b12809f7dfe2eb@%3Coak-dev.jackrabbit.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to