[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2024-06-11 Thread Thomas Mueller (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17854260#comment-17854260
 ] 

Thomas Mueller commented on OAK-8046:
-

> in a content management system with hundreds of thousands of pages and 
> assets, keeping every query below 200 items is not always feasible

It is, using keyset pagination as documented in 
https://jackrabbit.apache.org/oak/docs/query/query-engine.html#keyset-pagination
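
A minimal sketch of the idea behind keyset pagination, assuming an in-memory 
sorted set as a stand-in for an ordered, indexed property. None of the names 
below are Oak or JCR API; they are made up for illustration. Each "query" 
continues strictly after the last key seen and reads at most one page, so no 
single query comes near the read limit. For the actual query syntax, see the 
documentation linked above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical in-memory stand-in for an ordered index; not Oak API.
class KeysetPaginationSketch {

    // Returns up to pageSize keys strictly greater than lastKey (null means "from the start").
    static List<String> nextPage(TreeSet<String> index, String lastKey, int pageSize) {
        Iterable<String> tail = (lastKey == null) ? index : index.tailSet(lastKey, false);
        List<String> page = new ArrayList<>();
        for (String key : tail) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(key);
        }
        return page;
    }

    // Reads all entries in small pages and returns how many entries were read in total.
    static int readAll(TreeSet<String> index, int pageSize) {
        int total = 0;
        String lastKey = null;
        while (true) {
            List<String> page = nextPage(index, lastKey, pageSize);
            if (page.isEmpty()) {
                break;
            }
            total += page.size();
            lastKey = page.get(page.size() - 1); // continue after the last key seen
        }
        return total;
    }

    public static void main(String[] args) {
        TreeSet<String> index = new TreeSet<>();
        for (int i = 0; i < 1000; i++) {
            index.add(String.format("/content/page-%04d", i));
        }
        System.out.println(readAll(index, 200));
    }
}
```

Because each page starts from a key rather than from an offset, a short-lived 
query does not have to skip over entries it has already returned.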

> There are even ootb features that read more nodes.

Queries that read more than 100'000 nodes need to be changed. This was done, 
for example, for the "sling alias" and "vanity path" queries in Apache Sling.

It is fine to read more than 200 nodes per query. It is not good to read more 
than 100'000 nodes.

> Plus how would you influence the time a query takes, besides setting a good 
> index definition

In reality this is not a problem for queries that read few nodes.

> Result items are not always correctly counted against the configured read 
> limit if a query uses a lucene index 
> ---
>
> Key: OAK-8046
> URL: https://issues.apache.org/jira/browse/OAK-8046
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Affects Versions: 1.8.7
>Reporter: Georg Henzler
>Assignee: Vikas Saurabh
>Priority: Major
> Fix For: 1.12.0, 1.10.1, 1.8.12
>
> Attachments: OAK-8046-take2.patch, OAK-8046.patch
>
>
> There are cases where an index is re-opened during query execution. In that 
> case, already returned entries are read again and skipped, so basically 
> counted twice. This should be fixed to only count entries once (see also [1])
> The issue most likely exists since the read limit was introduced with OAK-6875
> [1] 
> https://lists.apache.org/thread.html/dddf9834fee0bccb6e48f61ba2a01430e34fc0b464b12809f7dfe2eb@%3Coak-dev.jackrabbit.apache.org%3E



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2024-06-10 Thread Roy Teeuwen (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853746#comment-17853746
 ] 

Roy Teeuwen commented on OAK-8046:
--

[~thomasm] In a content management system with hundreds of thousands of pages 
and assets, keeping every query below 200 items is not always feasible. There 
are even out-of-the-box (ootb) features that read more nodes. Also, how would 
you influence the time a query takes, other than by setting a good index 
definition?



[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2024-06-10 Thread Thomas Mueller (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17853671#comment-17853671
 ] 

Thomas Mueller commented on OAK-8046:
-

>  I guess the only thing we can do is move this class to an ignored log file 

[~royteeuwen] No. It is best if queries read fewer than 200 nodes, and do so 
relatively quickly (within a second or so). 



[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2024-05-21 Thread Roy Teeuwen (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848198#comment-17848198
 ] 

Roy Teeuwen commented on OAK-8046:
--

[~thomasm]  Thanks for the prompt reply!

Seeing as we have more than 100.000 nodes / pages and we have processes which 
query for these pages, we are getting this log all the time. So I guess the 
only thing we can do is move this class to an ignored log file :(? It is very 
odd to get this at LOG.info level when we cannot do anything about it.



[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2024-05-21 Thread Thomas Mueller (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848193#comment-17848193
 ] 

Thomas Mueller commented on OAK-8046:
-

> Should a reindex be triggered

No. That won't help.



[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2024-05-21 Thread Thomas Mueller (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848192#comment-17848192
 ] 

Thomas Mueller commented on OAK-8046:
-

[~royteeuwen] It means that while the query was still running (and reading 
more nodes), the index was updated concurrently. Indexes are updated every ~5 
seconds.

It is best if queries read fewer than 200 nodes, and do so relatively quickly 
(within a second or so). If you have queries that read 100'000 or more nodes, 
it is quite easy to get into this situation. With fewer than 200 nodes, it is 
typically never a problem. (There is also the case where fewer than 200 nodes 
are read, but very slowly... but that is unlikely.)
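
To make the counting concrete, here is a toy model of what the issue 
description calls "counted twice". It is not Oak code, and the method and 
parameter names are made up. It assumes a single re-open during the query, 
after which the query runs again from the start and the already returned 
entries are read and skipped. The limit of 100'000 is only illustrative.

```java
// Toy model of the double counting discussed in this issue; not Oak code.
// All names here are hypothetical.
class ReopenCountingSketch {

    // Number of entries counted against the read limit when the index is
    // re-opened once, after returnedBeforeReopen results were already returned.
    // After the re-open the query runs again from the start, so the second
    // pass reads all totalResults entries (the old ones are read and skipped).
    static int countedReads(int totalResults, int returnedBeforeReopen, boolean resetOnReopen) {
        int counter = returnedBeforeReopen; // first pass
        if (resetOnReopen) {
            counter = 0; // count the re-run pass only
        }
        return counter + totalResults; // second pass, from the start
    }

    public static void main(String[] args) {
        int limit = 100_000; // illustrative read limit
        int withoutReset = countedReads(80_000, 60_000, false);
        int withReset = countedReads(80_000, 60_000, true);
        System.out.println(withoutReset + " exceeds limit: " + (withoutReset > limit));
        System.out.println(withReset + " exceeds limit: " + (withReset > limit));
    }
}
```

In this model, a query with 80'000 results that is interrupted by a re-open 
after 60'000 results counts 140'000 reads without a reset and 80'000 with one. 
A query that finishes within a second rarely overlaps with an index update at 
all.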



[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2024-05-21 Thread Roy Teeuwen (Jira)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848177#comment-17848177
 ] 

Roy Teeuwen commented on OAK-8046:
--

[~thomasm] [~catholicon] We see "Change in index version detected. Query 
would be performed without offset" logged 100.00 times per day on certain 
instances, but it is not clear to me what action is required to fix this. 
Can we, as consumers of Apache Oak (used in AEM), do anything to prevent this 
log line from occurring? Should a reindex be triggered, or something else? 



[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2019-02-20 Thread Vikas Saurabh (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16773481#comment-16773481
 ] 

Vikas Saurabh commented on OAK-8046:


Had to update test because aggregate lucene index (compatVersion=1) doesn't 
respect index tag (because I don't RTFM - limitation 1 mentioned at \[0]). Did 
that on trunk at [r1853997|https://svn.apache.org/r1853997].

Backported to 1.10 at [r1853991|https://svn.apache.org/r1853991] and 
[r1853998|https://svn.apache.org/r1853998]
Backported to 1.8 at [r1854003|https://svn.apache.org/r1854003]

\[0]: 
https://jackrabbit.apache.org/oak/docs/query/query-engine.html#Query_Option_Index_Tag

> Result items are not always correctly counted against the configured read 
> limit if a query uses a lucene index 
> ---
>
> Key: OAK-8046
> URL: https://issues.apache.org/jira/browse/OAK-8046
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: lucene
>Affects Versions: 1.8.7
>Reporter: Georg Henzler
>Assignee: Vikas Saurabh
>Priority: Major
> Fix For: 1.12, 1.11.0
>
> Attachments: OAK-8046-take2.patch, OAK-8046.patch
>
>
> There are cases where an index is re-opened during query execution. In that 
> case, already returned entries are read again and skipped, so basically 
> counted twice. This should be fixed to only count entries once (see also [1])
> The issue most likely exists since the read limit was introduced with OAK-6875
> [1] 
> https://lists.apache.org/thread.html/dddf9834fee0bccb6e48f61ba2a01430e34fc0b464b12809f7dfe2eb@%3Coak-dev.jackrabbit.apache.org%3E



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2019-02-20 Thread Thomas Mueller (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772742#comment-16772742
 ] 

Thomas Mueller commented on OAK-8046:
-

>  the reset counter is technically incorrect in context of "nrt"/"sync" results.

Hm, yes. I wonder if someone will open a new issue saying the limit is not 
respected...

Could you add a system property as well so for such cases it's possible to 
switch back to the old behavior?
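
A sketch of what such a switch could look like. The property name below is 
hypothetical and is not one that Oak defines; the point is only that the new 
behavior stays the default and the old one can be restored with 
-Dexample.lucene.resetReadCounterOnRewind=false.

```java
// Sketch only: the property name is hypothetical, not an Oak setting.
class CounterResetToggleSketch {

    static final String PROP = "example.lucene.resetReadCounterOnRewind";

    // New behavior (reset) is the default; only an explicit "false" disables it.
    static boolean resetOnRewind() {
        return Boolean.parseBoolean(System.getProperty(PROP, "true"));
    }

    // Returns the read count to continue with after the index was re-opened.
    static int countAfterRewind(int currentCount) {
        return resetOnRewind() ? 0 : currentCount;
    }

    public static void main(String[] args) {
        System.out.println(countAfterRewind(60_000));
    }
}
```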

> LOG.debug was changed to LOG.info
> I intentionally did that 

Ah yes that makes sense!

> remote index can't have such a concept of rewound - we really can't give the 
> guarantees in remote index that we give in local lucene.

I'm not quite sure. For remote indexes, the query could also be re-run, and 
internally we could skip already seen entries.

I think in either case, re-running a query (Lucene or remote) slows things 
down... I think iterating over large results should be avoided in the 
application. Maybe we should even make the LOG message a warning later on...





[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2019-02-19 Thread Thomas Mueller (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772735#comment-16772735
 ] 

Thomas Mueller commented on OAK-8046:
-

> rewoundCount instead of hasRewound.
Thanks! I think this is more robust now (by robust I mean that we can now 
change the code without risking introducing bugs in this area).

+1 for the patch.



[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2019-02-19 Thread Vikas Saurabh (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772505#comment-16772505
 ] 

Vikas Saurabh commented on OAK-8046:


[~tmueller],
{quote}
bq. I think there is a risk that hasRewound() isn't always called at the right 
moment
I don't think I understand. Afaict, as long as FulltextPathCursor#next is the 
right place to log index traversal log then checking for hasRewound() in next() 
should be ok too, right?
{quote}
Ok, I understood what you meant (we need to check when {{lastDoc}} is null and 
before pulling another result that'd set {{lastDoc}}) after I wrote the test. 
I've attached [^OAK-8046-take2.patch], which has {{rewoundCount}} instead of 
{{hasRewound}}.

I've added one test for a compatV2 index. I am somehow not able to figure out 
the compatV1 case. I'll add that case, and another which asserts that we still 
fail for genuine cases where the query should've failed.
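
For readers without the patch at hand, here is a toy model of why a 
monotonically increasing count is easier to consume than a boolean flag. It is 
not the code from the attached patch, and every class and field name is 
hypothetical. The source starts over from the first row once, to model a query 
re-run without offset; the cursor compares the count it last saw with the 
current one, so it does not matter at which exact moment it looks.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model with hypothetical names; not the code from the attached patch.
class RewoundCountSketch {

    // Emits rows; after reopenAfter rows it starts over from the first row once.
    static class RewindingSource {
        private final List<String> rows;
        private final int reopenAfter;
        private int pos = 0;
        private int rewoundCount = 0;

        RewindingSource(List<String> rows, int reopenAfter) {
            this.rows = rows;
            this.reopenAfter = reopenAfter;
        }

        boolean hasNext() {
            return pos < rows.size();
        }

        String next() {
            if (rewoundCount == 0 && pos == reopenAfter) {
                rewoundCount++; // never decreases, so it can be compared at any later time
                pos = 0;
            }
            return rows.get(pos++);
        }

        int rewoundCount() {
            return rewoundCount;
        }
    }

    // Counts raw reads, restarts the count after a rewind, and drops rows already seen.
    static class CountingCursor {
        private final RewindingSource source;
        private final Set<String> seen = new HashSet<>();
        private int lastRewoundCount = 0;
        private int readCount = 0;

        CountingCursor(RewindingSource source) {
            this.source = source;
        }

        List<String> readAll() {
            List<String> out = new ArrayList<>();
            while (source.hasNext()) {
                String row = source.next();
                if (source.rewoundCount() != lastRewoundCount) {
                    lastRewoundCount = source.rewoundCount();
                    readCount = 0; // count the re-run pass only
                }
                readCount++;
                if (seen.add(row)) {
                    out.add(row);
                }
            }
            return out;
        }

        int readCount() {
            return readCount;
        }
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add("/row-" + i);
        }
        CountingCursor cursor = new CountingCursor(new RewindingSource(rows, 4));
        System.out.println(cursor.readAll().size() + " rows, counted " + cursor.readCount());
    }
}
```

With a boolean flag, the consumer has to read and clear the flag between 
exactly the right two next() calls. With a count, a missed check is simply 
caught by the next comparison.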



[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2019-02-19 Thread Vikas Saurabh (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16771882#comment-16771882
 ] 

Vikas Saurabh commented on OAK-8046:


Thanks for the review [~tmueller].

bq. LOG.debug was changed to LOG.info, was that intentional? LOG.info("Change 
in index version detected..."
Yes, I intentionally did that for 2 reasons:
* seeing logs with a wrapping counter under the same query would've been 
confusing otherwise (I think)... such a line would likely make it clearer why 
the counter wrapped around
* I don't expect such cases to be too common, so this likely won't cause a lot 
of logging traffic and would be useful in cases where it shows up

bq. hasRewound() needs to be called at the right moment, always after next()
Umm... I guess it needs to be called between 2 next() calls. After or before 
would probably not matter, right?

bq. I think there is a risk that hasRewound() isn't always called at the right 
moment
I don't think I understand. Afaict, as long as FulltextPathCursor#next is the 
right place to log index traversal log then checking for hasRewound() in next() 
should be ok too, right?

bq. A future change might be: add a skip method (to speed up queries with 
offset). I'm not saying we need to add this.
Hmm... I've multiple thoughts on this - the current case is ok wrt local 
lucene, I think. Otoh, a remote index can't have such a concept of rewound - we 
really *can't* give the guarantees in a remote index that we give in local 
lucene. The next thought is where we mean to provide a "skip" method - from the 
lucene pov each new batch of results is already obtained by skipping. From the 
query pov, skip only makes sense for the first batch of results.

bq. Maybe it would be simpler if the LuceneResultRowIterator would maintain the 
read count itself, and so instead of boolean hasRewound add a method 
getReadCount? So rename IteratorRewoundStateProvider to IteratorReadSizeProvider
I thought about it - but then each provider (hopefully ES somewhere in the 
future) would also need to do the same thing, OR FulltextPathCursor would have 
to check whether the underlying cursor provides a size and support its own 
counter accordingly. In the end, I felt that a simple flag to reset the counter 
should be ok. That said, I'm ok to change it if you feel rather strongly about 
it (I don't have a strong opinion one way or the other... I just preferred the 
current one slightly).

Btw, do note that the reset counter is technically incorrect in the context of 
"nrt"/"sync" results. Without an index re-open, the counter would first account 
for results from "nrt"/"sync", and then lucene index based results would be 
counted. When the index gets reopened, the first counted result is from the 
lucene index. So, "1" after re-open would technically be "1+x" (x results were 
given due to "nrt"/"sync").
I initially thought about accounting for this, but I don't expect "nrt"/"sync" 
results to be huge in contexts where these numbers become relevant. So the 
error in accounting was, imo, tolerable (compared to the implementation 
complication that accurate accounting would've brought in).



[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2019-02-19 Thread Thomas Mueller (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16771675#comment-16771675
 ] 

Thomas Mueller commented on OAK-8046:
-

Thanks [~catholicon]! Feedback:
* LOG.debug was changed to LOG.info, was that intentional? LOG.info("Change in 
index version detected..."
* I think the logic is a bit hard to maintain. hasRewound() needs to be called 
at the right moment, always after next(). If we change FulltextPathCursor some 
more, I think there is a risk that hasRewound() isn't always called at the 
right moment. A future change might be: add a skip method (to speed up queries 
with offset). I'm not saying we need to add this... but I think conceptually 
hasRewound is a bit tricky.
* Maybe it would be simpler if the LuceneResultRowIterator would maintain the 
read count itself, and so instead of boolean hasRewound add a method 
getReadCount? So rename IteratorRewoundStateProvider to IteratorReadSizeProvider
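
A rough sketch of that alternative, to show the shape of it. Only the names 
IteratorReadSizeProvider and getReadCount come from the suggestion above; the 
rest is hypothetical and is not taken from the Oak code base. The iterator owns 
the counter and restarts it when it rewinds, so the cursor never has to observe 
a rewind at all.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "iterator keeps its own read count" alternative.
class ReadSizeProviderSketch {

    interface IteratorReadSizeProvider {
        int getReadCount();
    }

    // Starts over once after reopenAfter rows and restarts its own counter when it does.
    static class SelfCountingIterator implements IteratorReadSizeProvider {
        private final List<String> rows;
        private final int reopenAfter;
        private boolean rewound = false;
        private int pos = 0;
        private int readCount = 0;

        SelfCountingIterator(List<String> rows, int reopenAfter) {
            this.rows = rows;
            this.reopenAfter = reopenAfter;
        }

        boolean hasNext() {
            return pos < rows.size();
        }

        String next() {
            if (!rewound && pos == reopenAfter) {
                rewound = true;
                pos = 0;
                readCount = 0; // the rewind stays an internal detail of the iterator
            }
            readCount++;
            return rows.get(pos++);
        }

        @Override
        public int getReadCount() {
            return readCount;
        }
    }

    // The cursor side only asks for the current count when it checks the limit.
    static void checkReadLimit(IteratorReadSizeProvider provider, int limit) {
        if (provider.getReadCount() > limit) {
            throw new IllegalStateException("read more than " + limit + " entries");
        }
    }

    public static void main(String[] args) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            rows.add("/row-" + i);
        }
        SelfCountingIterator it = new SelfCountingIterator(rows, 4);
        while (it.hasNext()) {
            it.next();
            checkReadLimit(it, 10);
        }
        System.out.println(it.getReadCount());
    }
}
```

The trade-off is that every index implementation then has to provide the 
count itself.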



[jira] [Commented] (OAK-8046) Result items are not always correctly counted against the configured read limit if a query uses a lucene index

2019-02-18 Thread Vikas Saurabh (JIRA)


[ 
https://issues.apache.org/jira/browse/OAK-8046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16771436#comment-16771436
 ] 

Vikas Saurabh commented on OAK-8046:


[~tmueller] can you please review [^OAK-8046.patch]? I'm working on writing 
test cases in the meantime (it is turning out to be a bit hard to refresh the 
index without consuming the whole cursor).
