Re: [I] Test failure in TestIndexWriter.testDeleteUnusedFiles on Windows 11 [lucene]

2025-05-23 Thread via GitHub
vsop-479 commented on issue #12524: URL: https://github.com/apache/lucene/issues/12524#issuecomment-2906268205 Fixed by https://github.com/apache/lucene/pull/14627 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the UR

Re: [I] Test failure in TestIndexWriter.testDeleteUnusedFiles on Windows 11 [lucene]

2025-05-23 Thread via GitHub
vsop-479 closed issue #12524: Test failure in TestIndexWriter.testDeleteUnusedFiles on Windows 11 URL: https://github.com/apache/lucene/issues/12524

Re: [PR] Make task executor non-final [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14524: URL: https://github.com/apache/lucene/pull/14524#issuecomment-2906085585 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] speed up numDeletesToMerge of SoftDeletesRetentionMergePolicy [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14531: URL: https://github.com/apache/lucene/pull/14531#issuecomment-2906085562 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] Avoid unnecessary comparison for CELL_CROSSES_QUERY cases [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14626: URL: https://github.com/apache/lucene/pull/14626#issuecomment-2905629336 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

Re: [PR] Avoid unnecessary comparison for CELL_CROSSES_QUERY cases [lucene]

2025-05-23 Thread via GitHub
jainankitk commented on code in PR #14626: URL: https://github.com/apache/lucene/pull/14626#discussion_r2105308858 ## lucene/core/src/java/org/apache/lucene/search/PointRangeQuery.java: ## @@ -154,16 +154,18 @@ private Relation relate(byte[] minPackedValue, byte[] maxPackedValu

Re: [PR] Reduce the number of comparisons when lowerPoint is equal to upperPoint [lucene]

2025-05-23 Thread via GitHub
jainankitk commented on PR #14625: URL: https://github.com/apache/lucene/pull/14625#issuecomment-2905595532 > On your benchmark run, I see no query with a speedup and a low p-value? I primarily wanted to ensure there is no regression. Do we have any benchmark queries with `low == high

Re: [PR] Speed up exhaustive execution of TermQuery [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14707: URL: https://github.com/apache/lucene/pull/14707#issuecomment-2905479822 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

Re: [PR] Fix WindowsFS test failure seen on Policeman Jenkins [lucene]

2025-05-23 Thread via GitHub
uschindler merged PR #14706: URL: https://github.com/apache/lucene/pull/14706

Re: [PR] Speed up exhaustive execution of TermQuery [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14707: URL: https://github.com/apache/lucene/pull/14707#issuecomment-2905443403 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

[PR] Speed up exhaustive execution of TermQuery [lucene]

2025-05-23 Thread via GitHub
gf2121 opened a new pull request, #14707: URL: https://github.com/apache/lucene/pull/14707 This tries to take advantage of the new method `Scorer#nextDocsAndScores` to speed up exhaustive execution of TermQuery. ``` TaskQPS baseline StdDevQPS my_mo

Re: [PR] Better vectorize score computations. [lucene]

2025-05-23 Thread via GitHub
rmuir commented on PR #14704: URL: https://github.com/apache/lucene/pull/14704#issuecomment-2905412399 before adding more complexity to scores can we resolve the current problem with scoring in jenkins? https://jenkins.thetaphi.de/job/Lucene-MMAPv2-Linux/3326/ ``` java.lang.AssertionE

Re: [PR] Fix the test failure seen on Policeman Jenkins [lucene]

2025-05-23 Thread via GitHub
uschindler commented on PR #14706: URL: https://github.com/apache/lucene/pull/14706#issuecomment-2905362603 Relates to #14627

Re: [PR] Fix the test failure seen on Policeman Jenkins [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14706: URL: https://github.com/apache/lucene/pull/14706#issuecomment-2905338898 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

[PR] Fix the test failure seen on Policeman Jenkins [lucene]

2025-05-23 Thread via GitHub
uschindler opened a new pull request, #14706: URL: https://github.com/apache/lucene/pull/14706 This fixes the following failure seen on Policeman Jenkins: ``` > junit.framework.AssertionFailedError: Expected exception InvalidPathException but no exception was thrown >

Re: [PR] Add a Faiss codec for KNN searches [lucene]

2025-05-23 Thread via GitHub
mikemccand commented on code in PR #14178: URL: https://github.com/apache/lucene/pull/14178#discussion_r2105066032 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/codecs/faiss/FaissKnnVectorsFormat.java: ## @@ -0,0 +1,93 @@ +/* + * Licensed to the Apache Software Foundation

[PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-05-23 Thread via GitHub
thecoop opened a new pull request, #14705: URL: https://github.com/apache/lucene/pull/14705 Related to https://github.com/apache/lucene/issues/11338#issuecomment-1224356220, this replaces some simple uses of `PriorityQueue` subclasses with ones using a `Comparator`. This reduces the boiler

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2905063283 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-05-23 Thread via GitHub
uschindler commented on code in PR #14705: URL: https://github.com/apache/lucene/pull/14705#discussion_r2105059932 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90CompoundFormat.java: ## @@ -105,29 +107,18 @@ public void write(Directory dir, SegmentInfo si, IO

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-05-23 Thread via GitHub
uschindler commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2905147301 > This reduces the boilerplate required, the number of classes defined,... But it does not reduce the number of loaded classes. The lambdas are still hidden classes.

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-05-23 Thread via GitHub
thecoop commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2905093783 Next stage is to run some perf tests to check this doesn't introduce a slowdown due to changes to the virtual method calls in key hotspots.

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2905082216 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14705: URL: https://github.com/apache/lucene/pull/14705#issuecomment-2905068125 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

Re: [PR] Swap out some simple PriorityQueue subclasses for one using a Comparator [lucene]

2025-05-23 Thread via GitHub
thecoop commented on code in PR #14705: URL: https://github.com/apache/lucene/pull/14705#discussion_r2105001181 ## lucene/core/src/java/org/apache/lucene/codecs/lucene90/Lucene90CompoundFormat.java: ## @@ -105,29 +107,18 @@ public void write(Directory dir, SegmentInfo si, IOCon

Re: [PR] Speed up exhaustive evaluation. [lucene]

2025-05-23 Thread via GitHub
jpountz commented on PR #14679: URL: https://github.com/apache/lucene/pull/14679#issuecomment-2904550017 Nightly reported a ~6% speedup on the `OrMany` task: https://benchmarks.mikemccandless.com/OrMany.html. I'll push an annotation.

Re: [PR] Better vectorize score computations. [lucene]

2025-05-23 Thread via GitHub
jpountz commented on code in PR #14704: URL: https://github.com/apache/lucene/pull/14704#discussion_r2104638649 ## lucene/core/src/java/org/apache/lucene/search/similarities/BM25Similarity.java: ## @@ -229,7 +249,39 @@ public float score(float freq, long encodedNorm) { //

Re: [PR] Better vectorize score computations. [lucene]

2025-05-23 Thread via GitHub
rmuir commented on code in PR #14704: URL: https://github.com/apache/lucene/pull/14704#discussion_r2104654751 ## lucene/core/src/java/org/apache/lucene/search/similarities/BM25Similarity.java: ## @@ -217,6 +221,19 @@ private static class BM25Scorer extends SimScorer { thi

Re: [PR] Better vectorize score computations. [lucene]

2025-05-23 Thread via GitHub
jpountz commented on PR #14704: URL: https://github.com/apache/lucene/pull/14704#issuecomment-2904496600 Great feedback once again @gf2121 !

Re: [PR] Better vectorize score computations. [lucene]

2025-05-23 Thread via GitHub
jpountz commented on PR #14704: URL: https://github.com/apache/lucene/pull/14704#issuecomment-2904498062 Here is what luceneutil has to say for exhaustive evaluation (totalHitsThreshold=Integer.MAX_VALUE): ``` TaskQPS baseline StdDevQPS my_modified

Re: [PR] Better vectorize score computations. [lucene]

2025-05-23 Thread via GitHub
gf2121 commented on code in PR #14704: URL: https://github.com/apache/lucene/pull/14704#discussion_r2104631515 ## lucene/core/src/java/org/apache/lucene/search/similarities/BM25Similarity.java: ## @@ -229,7 +249,39 @@ public float score(float freq, long encodedNorm) { //

Re: [PR] Refactor main top-n bulk scorers to evaluate hits in a more term-at-a-time fashion. [lucene]

2025-05-23 Thread via GitHub
jpountz commented on code in PR #14701: URL: https://github.com/apache/lucene/pull/14701#discussion_r2104624813 ## lucene/core/src/java/org/apache/lucene/search/BlockMaxConjunctionBulkScorer.java: ## @@ -118,98 +115,36 @@ private void scoreWindow( return; } -Sc

Re: [PR] Refactor main top-n bulk scorers to evaluate hits in a more term-at-a-time fashion. [lucene]

2025-05-23 Thread via GitHub
jpountz commented on code in PR #14701: URL: https://github.com/apache/lucene/pull/14701#discussion_r2104619608 ## lucene/core/src/java/org/apache/lucene/search/ScorerUtil.java: ## @@ -117,4 +119,69 @@ public int length() { return in.length(); } } + + static void

Re: [PR] Refactor main top-n bulk scorers to evaluate hits in a more term-at-a-time fashion. [lucene]

2025-05-23 Thread via GitHub
jpountz commented on code in PR #14701: URL: https://github.com/apache/lucene/pull/14701#discussion_r2104618311 ## lucene/core/src/java/org/apache/lucene/search/MaxScoreBulkScorer.java: ## @@ -394,31 +365,22 @@ void updateMaxWindowScores(int windowMin, int windowMax) throws IOE

Re: [PR] Better vectorize score computations. [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14704: URL: https://github.com/apache/lucene/pull/14704#issuecomment-2904443217 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

[PR] Better vectorize score computations. [lucene]

2025-05-23 Thread via GitHub
jpountz opened a new pull request, #14704: URL: https://github.com/apache/lucene/pull/14704 Existing vectorization of scores is a bit fragile since it relies on `SimScorer#score` being inlined in the for loops where it is called. This is currently the case in nightly benchmarks, but may not

Re: [PR] Refactor main top-n bulk scorers to evaluate hits in a more term-at-a-time fashion. [lucene]

2025-05-23 Thread via GitHub
gf2121 commented on code in PR #14701: URL: https://github.com/apache/lucene/pull/14701#discussion_r2104492611 ## lucene/core/src/java/org/apache/lucene/search/ScorerUtil.java: ## @@ -117,4 +119,69 @@ public int length() { return in.length(); } } + + static void

Re: [PR] Use a temporary repository location to download certain ecj versions ("drops") [lucene]

2025-05-23 Thread via GitHub
uschindler commented on PR #14703: URL: https://github.com/apache/lucene/pull/14703#issuecomment-2903999563 Thanks!

Re: [PR] Use a temporary repository location to download certain ecj versions ("drops") [lucene]

2025-05-23 Thread via GitHub
dweiss commented on PR #14703: URL: https://github.com/apache/lucene/pull/14703#issuecomment-2903970658 I have access to it from this infra issue - https://issues.apache.org/jira/browse/INFRA-26434

Re: [PR] Update the IOContext on IndexInput rather than the ReadAdvice [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14702: URL: https://github.com/apache/lucene/pull/14702#issuecomment-2903924729 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil

Re: [I] Remove Telugu normalization of vu వు to ma మ from IndicNormalizer [lucene]

2025-05-23 Thread via GitHub
praveen-d291 commented on issue #14659: URL: https://github.com/apache/lucene/issues/14659#issuecomment-2903917694 Hey @rmuir , Thanks for the explanation. I've been thinking about the TeluguAnalyzer's default behavior, and I believe we have a significant hidden issue. The analyzer b

Re: [PR] Update the IOContext on IndexInput rather than the ReadAdvice [lucene]

2025-05-23 Thread via GitHub
thecoop commented on code in PR #14702: URL: https://github.com/apache/lucene/pull/14702#discussion_r2104201080 ## lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsReader.java: ## @@ -124,7 +124,7 @@ public abstract void search( * * The default implementation retu

Re: [PR] Update the IOContext on IndexInput rather than the ReadAdvice [lucene]

2025-05-23 Thread via GitHub
github-actions[bot] commented on PR #14702: URL: https://github.com/apache/lucene/pull/14702#issuecomment-2903798515 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you wil