Re: [PR] Move HitQueue in TopScoreDocCollector to a LongHeap [lucene]

2025-06-05 Thread via GitHub
gf2121 merged PR #14714: URL: https://github.com/apache/lucene/pull/14714 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.apac

Re: [PR] Respect minCompetitiveScore in BlockMaxConjunctionBulkScorer [lucene]

2025-06-05 Thread via GitHub
gf2121 merged PR #14751: URL: https://github.com/apache/lucene/pull/14751

Re: [I] Dynamic pruning with DocValue skipper [lucene]

2025-06-05 Thread via GitHub
gf2121 closed issue #14666: Dynamic pruning with DocValue skipper URL: https://github.com/apache/lucene/issues/14666

Re: [PR] Dynamic pruning with DocValueSkipper [lucene]

2025-06-05 Thread via GitHub
gf2121 merged PR #14672: URL: https://github.com/apache/lucene/pull/14672

Re: [PR] Implement IndexedDISI#docIDRunEnd [lucene]

2025-06-05 Thread via GitHub
gf2121 merged PR #14753: URL: https://github.com/apache/lucene/pull/14753

Re: [PR] .editorconfig [lucene]

2025-06-05 Thread via GitHub
dsmiley commented on code in PR #14740: URL: https://github.com/apache/lucene/pull/14740#discussion_r2131348045 ## .editorconfig: ## @@ -0,0 +1,918 @@ +# EditorConfig: https://editorconfig.org +# for consistent code style configuration across editors/IDEs + +# top-most EditorCon

Re: [PR] [BlockJoin] Add ParentsChildrenBlockJoinQuery to support parent and c… [lucene]

2025-06-05 Thread via GitHub
karthi01ec commented on code in PR #14728: URL: https://github.com/apache/lucene/pull/14728#discussion_r2131286994 ## lucene/join/src/java/org/apache/lucene/search/join/ParentsChildrenBlockJoinQuery.java: ## @@ -0,0 +1,489 @@ +/* + * Licensed to the Apache Software Foundation (A

Re: [PR] Fixed incorrect Telugu normalization of vu వు to ma మ ( [lucene]

2025-06-05 Thread via GitHub
github-actions[bot] commented on PR #14699: URL: https://github.com/apache/lucene/pull/14699#issuecomment-2947243752 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] [BlockJoin] Add ParentsChildrenBlockJoinQuery to support parent and c… [lucene]

2025-06-05 Thread via GitHub
github-actions[bot] commented on PR #14728: URL: https://github.com/apache/lucene/pull/14728#issuecomment-2946622248 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

Re: [I] Nightly benchmark regression on 2025.05.01 [lucene]

2025-06-05 Thread via GitHub
jpountz commented on issue #14630: URL: https://github.com/apache/lucene/issues/14630#issuecomment-2946079858 Thank you @mikemccand!

Re: [I] Support multiple HNSW graphs backed by the same vectors [lucene]

2025-06-05 Thread via GitHub
msokolov commented on issue #14758: URL: https://github.com/apache/lucene/issues/14758#issuecomment-2945764977 I wonder if we could support this in a different way via some kind of "shadow field" that would refer to the same underlying vector data but could be indexed on a sub-set of docume

Re: [PR] IndexOrDocValuesQuery and IndexSortSortedNumericDocValuesRangeQuery should only be counted once when computing maxClauseCount [lucene]

2025-06-05 Thread via GitHub
github-actions[bot] commented on PR #14759: URL: https://github.com/apache/lucene/pull/14759#issuecomment-2945481292 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

[PR] IndexOrDocValuesQuery and IndexSortSortedNumericDocValuesRangeQuery should only be counted once when computing maxClauseCount [lucene]

2025-06-05 Thread via GitHub
iverase opened a new pull request, #14759: URL: https://github.com/apache/lucene/pull/14759 see https://github.com/apache/lucene/issues/14756 The idea proposed here is that both queries are just leaf queries and should only call once the method QueryVisitor#visitLeaf. For IndexOrDocVa
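The idea of reporting a wrapper query as a single leaf can be illustrated with a minimal sketch. Everything below is a hypothetical model for illustration: `Node`, `Visitor`, `Leaf`, `EitherOr`, and `countLeaves` are invented names, not Lucene's `Query` or `QueryVisitor` types, and the "before" branch is only an assumption about how a wrapper could end up counted twice.

```java
import java.util.List;

/** Hypothetical model for illustration; these are not Lucene's Query or QueryVisitor. */
final class LeafCountSketch {
  interface Visitor {
    void visitLeaf(Node leaf);
  }

  interface Node {
    void visit(Visitor visitor);
  }

  /** An ordinary leaf query: reports itself exactly once. */
  static final class Leaf implements Node {
    @Override
    public void visit(Visitor visitor) {
      visitor.visitLeaf(this);
    }
  }

  /** Wraps two equivalent executions of the same logical query. */
  static final class EitherOr implements Node {
    private final Node indexForm;
    private final Node docValuesForm;
    private final boolean countOnce;

    EitherOr(Node indexForm, Node docValuesForm, boolean countOnce) {
      this.indexForm = indexForm;
      this.docValuesForm = docValuesForm;
      this.countOnce = countOnce;
    }

    @Override
    public void visit(Visitor visitor) {
      if (countOnce) {
        // Proposed behavior: the wrapper reports itself as one leaf.
        visitor.visitLeaf(this);
      } else {
        // Assumed earlier behavior: both forms are visited, so both are counted.
        indexForm.visit(visitor);
        docValuesForm.visit(visitor);
      }
    }
  }

  /** Counts visitLeaf calls, the way a clause-count check might. */
  static int countLeaves(List<Node> clauses) {
    int[] count = {0};
    for (Node clause : clauses) {
      clause.visit(leaf -> count[0]++);
    }
    return count[0];
  }
}
```

In this model, a single `EitherOr` wrapper counts as two leaves when it forwards to both forms and as one leaf when it reports itself, which is the difference that matters for a clause-count limit.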

Re: [I] Nightly benchmark regression on 2025.05.01 [lucene]

2025-06-05 Thread via GitHub
mikemccand commented on issue #14630: URL: https://github.com/apache/lucene/issues/14630#issuecomment-2945352847 > It looks like nightly benchmarks only run every 2 days since May 13th, vs. every day before that. Is this because it now takes longer to run the benchmark? Oh sorry I thi

Re: [PR] Fix too many documents collected when only bool-filter condition is present [lucene]

2025-06-05 Thread via GitHub
jpountz commented on PR #14757: URL: https://github.com/apache/lucene/pull/14757#issuecomment-2945106263 Yup, I didn't want to bother @kkewwei with this so I did it myself, it's easy enough. :)

Re: [PR] Fix too many documents collected when only bool-filter condition is present [lucene]

2025-06-05 Thread via GitHub
uschindler commented on PR #14757: URL: https://github.com/apache/lucene/pull/14757#issuecomment-2945076224 He did that already.

Re: [PR] Fix too many documents collected when only bool-filter condition is present [lucene]

2025-06-05 Thread via GitHub
uschindler commented on PR #14757: URL: https://github.com/apache/lucene/pull/14757#issuecomment-2945073619 It's only about the version number. It was a message for @jpountz who merged this.

Re: [PR] Fix too many documents collected when only bool-filter condition is present [lucene]

2025-06-05 Thread via GitHub
kkewwei commented on PR #14757: URL: https://github.com/apache/lucene/pull/14757#issuecomment-2945056588 > The changes entry needs to be moved. @uschindler Kindly advise whether it should be placed under "Improvements"?

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-06-05 Thread via GitHub
uschindler commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2945028845 Thanks! Cool first step!

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-06-05 Thread via GitHub
dweiss commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2945007270 Merged. Thank you, @thecoop

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-06-05 Thread via GitHub
dweiss merged PR #14705: URL: https://github.com/apache/lucene/pull/14705

Re: [PR] Fix too many documents collected when only bool-filter condition is present [lucene]

2025-06-05 Thread via GitHub
uschindler commented on PR #14757: URL: https://github.com/apache/lucene/pull/14757#issuecomment-2945004920 The changes entry needs to be moved.

Re: [PR] Speed up findNextGEQ by aggresive stepping [lucene]

2025-06-05 Thread via GitHub
HUSTERGS commented on PR #14735: URL: https://github.com/apache/lucene/pull/14735#issuecomment-2944528692 > You are reading correctly, the high p-values suggest that the speedups are not statistically significant. Got it!

Re: [PR] Speed up findNextGEQ by aggresive stepping [lucene]

2025-06-05 Thread via GitHub
jpountz commented on PR #14735: URL: https://github.com/apache/lucene/pull/14735#issuecomment-2944502036 You are reading correctly, the high p-values suggest that the speedups are not statistically significant.

Re: [I] Support multiple HNSW graphs backed by the same vectors [lucene]

2025-06-05 Thread via GitHub
jpountz commented on issue #14758: URL: https://github.com/apache/lucene/issues/14758#issuecomment-2944618497 If filtered performance is critical, then I wonder if there could be better ways, e.g. using a vector search algorithm from the IVF family instead of HNSW, and configuring an index

Re: [PR] Implement IndexedDISI#docIDRunEnd [lucene]

2025-06-05 Thread via GitHub
HUSTERGS commented on PR #14753: URL: https://github.com/apache/lucene/pull/14753#issuecomment-2944556833 > Could you add a CHANGE entry under 9.3.0? Yes, I forgot that, pushed another commit

Re: [PR] Speed up findNextGEQ by aggresive stepping [lucene]

2025-06-05 Thread via GitHub
HUSTERGS closed pull request #14735: Speed up findNextGEQ by aggresive stepping URL: https://github.com/apache/lucene/pull/14735

Re: [PR] Speed up findNextGEQ by aggresive stepping [lucene]

2025-06-05 Thread via GitHub
HUSTERGS commented on PR #14735: URL: https://github.com/apache/lucene/pull/14735#issuecomment-2944491596 > I ran benchmarks locally and get similar results as yours. This suggests that this change is neutral in terms of performance. @jpountz Thanks for your reply! I noticed that the

Re: [PR] Respect minCompetitiveScore in BlockMaxConjunctionBulkScorer [lucene]

2025-06-05 Thread via GitHub
jpountz commented on code in PR #14751: URL: https://github.com/apache/lucene/pull/14751#discussion_r2128822718 ## lucene/core/src/java/org/apache/lucene/search/ScorerUtil.java: ## @@ -120,11 +120,22 @@ public int length() { } } + /** + * Filters competitive hits fr

Re: [PR] Fix too many documents collected when only bool-filter condition is present [lucene]

2025-06-05 Thread via GitHub
jpountz commented on PR #14757: URL: https://github.com/apache/lucene/pull/14757#issuecomment-2944438839 FYI I backported to branch_10x so that it's included in 10.3.

Re: [PR] Fix too many documents collected when only bool-filter condition is present [lucene]

2025-06-05 Thread via GitHub
jpountz commented on PR #14757: URL: https://github.com/apache/lucene/pull/14757#issuecomment-2944408425 Thank you!

Re: [I] Too many documents collected when only bool-filter condition is present [lucene]

2025-06-05 Thread via GitHub
jpountz closed issue #14755: Too many documents collected when only bool-filter condition is present URL: https://github.com/apache/lucene/issues/14755

Re: [PR] Fix too many documents collected when only bool-filter condition is present [lucene]

2025-06-05 Thread via GitHub
jpountz merged PR #14757: URL: https://github.com/apache/lucene/pull/14757

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-06-05 Thread via GitHub
jpountz commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2944387202 I'm fine with merging too.

Re: [PR] Speed up findNextGEQ by aggresive stepping [lucene]

2025-06-05 Thread via GitHub
jpountz commented on PR #14735: URL: https://github.com/apache/lucene/pull/14735#issuecomment-2944372773 I ran benchmarks locally and get similar results as yours. This suggests that this change is neutral in terms of performance.

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-06-05 Thread via GitHub
thecoop commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2944318250 There are some non-trivial subclasses (`TopDocs.MergeSortQueue`, `MultiTermsEnum.TermMergeQueue`, `NearSpansUnordered.SpanTotalLengthEndPositionWindow`), some queues that themselves are

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-06-05 Thread via GitHub
jpountz commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2944223363 > In general, I'm fine to making baby steps although it could also be developed into a larger patch that adds this additional "smaller-than" interface and makes the PQ final. I wonder wh

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-06-05 Thread via GitHub
jpountz commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2944211560 > I wasn't able to run OrdinalMapBenchmark I ran it on my machine. Got the following results on main (2 runs): ``` id: 651.43124 msec name: 955.36439 msec country_co

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-06-05 Thread via GitHub
jpountz commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2944121468 For reference, there is an interesting PR at https://github.com/apache/lucene/pull/14714 that replaces `PriorityQueue` with a `LongHeap` when computing top-k hits by score. Results sugge
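As a rough illustration of why a heap of primitive longs can stand in for a queue of hit objects, the sketch below packs a score and a doc ID into a single `long` and keeps the top k with a min-heap. This is a sketch under stated assumptions, not the code from the PR: `PackedTopK` and its methods are hypothetical names, the bit layout is an assumption, scores are assumed to be non-negative, and `java.util.PriorityQueue<Long>` is used only as a stand-in for a primitive long heap.

```java
import java.util.PriorityQueue;

/** Hypothetical sketch; not Lucene's LongHeap or TopScoreDocCollector. */
final class PackedTopK {
  // Packs a non-negative score (high 32 bits) and a doc ID (low 32 bits).
  // Higher scores compare greater; on equal scores, lower doc IDs compare greater.
  static long encode(float score, int doc) {
    return ((long) Float.floatToIntBits(score) << 32) | (Integer.MAX_VALUE - doc);
  }

  static float score(long packed) {
    return Float.intBitsToFloat((int) (packed >>> 32));
  }

  static int doc(long packed) {
    return Integer.MAX_VALUE - (int) packed;
  }

  // Returns the k best hits, best first. scores[doc] is the score of that doc.
  static long[] topK(float[] scores, int k) {
    if (k <= 0) {
      return new long[0];
    }
    PriorityQueue<Long> heap = new PriorityQueue<>(); // min-heap; stand-in for a long heap
    for (int doc = 0; doc < scores.length; doc++) {
      long packed = encode(scores[doc], doc);
      if (heap.size() < k) {
        heap.add(packed);
      } else if (packed > heap.peek()) {
        heap.poll(); // evict the current weakest hit
        heap.add(packed);
      }
    }
    long[] out = new long[heap.size()];
    for (int i = out.length - 1; i >= 0; i--) {
      out[i] = heap.poll(); // weakest comes out first, so fill from the end
    }
    return out;
  }
}
```

Because each hit is one comparable `long`, ordering needs only a primitive comparison and no per-hit object. The tie-break on doc ID is a design choice of this sketch.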

Re: [PR] feat: add automated backport workflow [lucene]

2025-06-05 Thread via GitHub
ayagmar commented on PR #14738: URL: https://github.com/apache/lucene/pull/14738#issuecomment-2943862049 Hello everyone, thanks for all the feedback. I will try to address the comments one by one over the weekend. BR

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-06-05 Thread via GitHub
thecoop commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2943669040 I wasn't able to run OrdinalMapBenchmark, but a few lucene-bench runs show no significant changes