Re: [PR] deps(java): bump org.apache.commons:commons-compress from 1.27.1 to 1.28.0 [lucene]

2025-08-28 Thread via GitHub
github-actions[bot] commented on PR #15007: URL: https://github.com/apache/lucene/pull/15007#issuecomment-3235333580 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
nipunbatra8 commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2308577087 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/index/BandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
nipunbatra8 commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2308567962 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/index/BandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
nipunbatra8 commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2308528094 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/index/BandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Foundat

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
nipunbatra8 commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2308323694 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/index/BandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Foundat

Re: [I] Deadlock between IndexWriter, IndexWriterConfig, ConcurrentMergeScheduler [lucene]

2025-08-28 Thread via GitHub
ayinresh commented on issue #15119: URL: https://github.com/apache/lucene/issues/15119#issuecomment-3234589059 @drabe8 Not an expert, but I believe this is a type of class loading deadlock, which has occurred numerous times in the past (e.g. [LUCENE-6482](https://issues.apache.org/j

Re: [PR] [bug] Fix Locale usage during lower-casing [lucene]

2025-08-28 Thread via GitHub
sandeshkr419 commented on PR #15076: URL: https://github.com/apache/lucene/pull/15076#issuecomment-3234556642 @msfroh - Thanks for merging this. I'll get to adding the test coverage in a follow-up PR. @jainankitk Do we need to backport this as well to 10.x - otherwise 10.3 might

[PR] feat: implement asBulkSimScorer on FeatureFields's SimScorers [lucene]

2025-08-28 Thread via GitHub
AdityaTeltia opened a new pull request, #15137: URL: https://github.com/apache/lucene/pull/15137 Closes #15117 This change implements `asBulkSimScorer()` for FeatureField's SimScorers, similar to the optimization added for BM25SimScorer. The patch also includes: - Unit tes

Re: [PR] feat: implement asBulkSimScorer on FeatureFields's SimScorers [lucene]

2025-08-28 Thread via GitHub
github-actions[bot] commented on PR #15137: URL: https://github.com/apache/lucene/pull/15137#issuecomment-3234461932 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

Re: [PR] Fix off-heap byte vector scoring at query time [lucene]

2025-08-28 Thread via GitHub
kaivalnp commented on PR #14874: URL: https://github.com/apache/lucene/pull/14874#issuecomment-3234118400 I missed this earlier, but we saw an indexing throughput increase in a specific case for this change :) https://github.com/mikemccand/luceneutil/pull/452 -- This is an automated me

Re: [I] Improve FirstPassGroupingCollector to support early termination and pruning/skipping [lucene]

2025-08-28 Thread via GitHub
mikemccand commented on issue #15136: URL: https://github.com/apache/lucene/issues/15136#issuecomment-3233790261 +1 to somehow bring some love to Lucene's grouping collectors, to catch up with all the nice optimizations that the default collectors have! I don't know enough specifics t

Re: [I] Understand 2025/08/06 nightly benchy regression in KNN indexing [lucene]

2025-08-28 Thread via GitHub
mikemccand commented on issue #15079: URL: https://github.com/apache/lucene/issues/15079#issuecomment-3233930945 > I did not see many changes since Aug 21st, all looks stable, so I don't think the segment traces (using infostream) are really affecting indexing performance. +1, that is

Re: [I] Improve kNN behavior on permissive filters [lucene]

2025-08-28 Thread via GitHub
benwtrent commented on issue #15132: URL: https://github.com/apache/lucene/issues/15132#issuecomment-3233377028 One major drawback of post-filtering is the case when a filter is negatively correlated with the query's nearest neighbors. In that case, it's possible that we return fewer than `k
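The drawback described in the comment above can be illustrated with a toy example. This is only a sketch over one-dimensional points and brute-force search, not Lucene's kNN implementation; the class `PostFilterSketch` and its methods are hypothetical names introduced for illustration, and the pre-filtering variant is included only as an assumed point of comparison.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical illustration; none of these names are Lucene APIs.
final class PostFilterSketch {

  // Brute-force top-k over the ids accepted by `accept`, ordered by distance to `query`.
  static List<Integer> topK(float[] points, float query, int k, IntPredicate accept) {
    List<Integer> ids = new ArrayList<>();
    for (int i = 0; i < points.length; i++) {
      if (accept.test(i)) {
        ids.add(i);
      }
    }
    ids.sort(Comparator.comparingDouble((Integer i) -> Math.abs(points[i] - query)));
    return new ArrayList<>(ids.subList(0, Math.min(k, ids.size())));
  }

  // Post-filtering: find the k nearest neighbors first, then drop the ones the filter rejects.
  static List<Integer> postFilter(float[] points, float query, int k, IntPredicate accept) {
    List<Integer> hits = topK(points, query, k, i -> true);
    hits.removeIf(i -> !accept.test(i));
    return hits;
  }

  // Pre-filtering (assumed contrast): restrict the candidates first, then take the k nearest.
  static List<Integer> preFilter(float[] points, float query, int k, IntPredicate accept) {
    return topK(points, query, k, accept);
  }

  public static void main(String[] args) {
    float[] points = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
    // The filter only accepts points far from the query, so it is
    // negatively correlated with the query's nearest neighbors.
    IntPredicate farOnly = i -> points[i] >= 8f;
    System.out.println("post-filter hits: " + postFilter(points, 0f, 3, farOnly));
    System.out.println("pre-filter hits:  " + preFilter(points, 0f, 3, farOnly));
  }
}
```

With this data the post-filtered result has fewer than `k` entries because every one of the three nearest points is rejected by the filter, while the pre-filtered search still finds three accepted points.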

[I] Improve FirstPassGroupingCollector to support early termination and pruning/skipping [lucene]

2025-08-28 Thread via GitHub
alexmm-amzn opened a new issue, #15136: URL: https://github.com/apache/lucene/issues/15136 ### Description The `FirstPassGroupingCollector` [[1](https://github.com/apache/lucene/blob/main/lucene/grouping/src/java/org/apache/lucene/search/grouping/FirstPassGroupingCollector.java)] fro

Re: [PR] Move finishMerge to a finally block so it always runs [lucene]

2025-08-28 Thread via GitHub
dweiss commented on code in PR #14977: URL: https://github.com/apache/lucene/pull/14977#discussion_r2307142029 ## lucene/core/src/java/org/apache/lucene/index/SegmentMerger.java: ## @@ -225,7 +226,7 @@ private void mergeTerms(SegmentWriteState segmentWriteState, SegmentReadStat

Re: [PR] early exit in SortedNumericDocValuesRangeQuery & SortedSetDocValuesRangeQuery [lucene]

2025-08-28 Thread via GitHub
RamakrishnaChilaka commented on PR #15135: URL: https://github.com/apache/lucene/pull/15135#issuecomment-3233139036 Thanks for the review @jpountz. For example, if the docValues are sorted in ascending order, and lowerValue & upperValue are both greater than the max of the segment. We would cal

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
shubhamsrkdev commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2307011526 ## lucene/sandbox/src/test/org/apache/lucene/sandbox/index/TestBandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,382 @@ +/* + * Licensed to the Apache Software F

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
shubhamsrkdev commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2307039991 ## lucene/sandbox/src/test/org/apache/lucene/sandbox/index/TestBandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,382 @@ +/* + * Licensed to the Apache Software F

Re: [PR] Expose `SegmentSizeAndDocs` and `MergeScore` features in `TieredMergePolicy` [lucene]

2025-08-28 Thread via GitHub
jpountz commented on PR #15131: URL: https://github.com/apache/lucene/pull/15131#issuecomment-3233034855 Good question. I've had little success with customizing TieredMergePolicy in the past, because all its features tend to interact with one another. The `score()` method may be one excepti

Re: [PR] Make `flatVectorsFormat` injectable in `Lucene99HnswVectorsFormat` to allow custom format and scorers [lucene]

2025-08-28 Thread via GitHub
ChrisHegarty commented on PR #15090: URL: https://github.com/apache/lucene/pull/15090#issuecomment-3233056066 I can understand the reasoning behind this request - I encountered a somewhat similar situation in the past and had considered making a similar change, but didn't (for the same reas

Re: [PR] PostingsDecodingUtil: interchange loops to enable better memory access and SIMD vectorisation [lucene]

2025-08-28 Thread via GitHub
jpountz merged PR #15110: URL: https://github.com/apache/lucene/pull/15110

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
shubhamsrkdev commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2306919441 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/index/BandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] early exit in SortedNumericDocValuesRangeQuery & SortedSetDocValuesRangeQuery [lucene]

2025-08-28 Thread via GitHub
jpountz commented on PR #15135: URL: https://github.com/apache/lucene/pull/15135#issuecomment-3233006000 This doesn't seem to be saving much work, does it?

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
shubhamsrkdev commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2307033186 ## lucene/sandbox/src/test/org/apache/lucene/sandbox/index/TestBandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,382 @@ +/* + * Licensed to the Apache Software F

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
shubhamsrkdev commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2307007915 ## lucene/sandbox/src/test/org/apache/lucene/sandbox/index/TestBandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,382 @@ +/* + * Licensed to the Apache Software F

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
shubhamsrkdev commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2306995447 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/index/BandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
shubhamsrkdev commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2306993663 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/index/BandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
shubhamsrkdev commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2306947911 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/index/BandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Found

Re: [PR] early exit in SortedNumericDocValuesRangeQuery & SortedSetDocValuesRangeQuery [lucene]

2025-08-28 Thread via GitHub
github-actions[bot] commented on PR #15135: URL: https://github.com/apache/lucene/pull/15135#issuecomment-3232879390 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

Re: [PR] [Draft] Add BandwidthCappedMergeScheduler for enforcing a global merge bandwidth cap [lucene]

2025-08-28 Thread via GitHub
shubhamsrkdev commented on code in PR #14964: URL: https://github.com/apache/lucene/pull/14964#discussion_r2306941801 ## lucene/sandbox/src/java/org/apache/lucene/sandbox/index/BandwidthCappedMergeScheduler.java: ## @@ -0,0 +1,185 @@ +/* + * Licensed to the Apache Software Found

[PR] early exit in SortedNumericDocValuesRangeQuery & SortedSetDocValuesRangeQuery [lucene]

2025-08-28 Thread via GitHub
RamakrishnaChilaka opened a new pull request, #15135: URL: https://github.com/apache/lucene/pull/15135 Skipping segments if segment's min and max are out of range of the query
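The check described in this pull request amounts to an interval-disjointness test. The following is a minimal sketch of that idea, assuming inclusive query bounds and that the per-segment minimum and maximum are already known; `SegmentRangeSkip` and `canSkipSegment` are hypothetical names, not the code in the PR or any Lucene API.

```java
// Hypothetical illustration; not taken from the pull request.
final class SegmentRangeSkip {

  // Returns true when the segment's value range [segMin, segMax] cannot
  // intersect the inclusive query range [lower, upper], so no document
  // in the segment can match and the segment can be skipped.
  static boolean canSkipSegment(long segMin, long segMax, long lower, long upper) {
    return segMax < lower || segMin > upper;
  }

  public static void main(String[] args) {
    // Query range lies entirely above the segment's values.
    System.out.println(canSkipSegment(10, 20, 30, 40));
    // Query range overlaps the segment's values.
    System.out.println(canSkipSegment(10, 20, 15, 25));
  }
}
```

How much work such a check saves depends on how cheaply the segment's minimum and maximum can be obtained, which this sketch takes as given.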

Re: [PR] Remove even more boolean success flags [lucene]

2025-08-28 Thread via GitHub
thecoop commented on code in PR #15134: URL: https://github.com/apache/lucene/pull/15134#discussion_r2306709652 ## lucene/suggest/src/java/org/apache/lucene/search/suggest/analyzing/FreeTextSuggester.java: ## @@ -286,70 +282,64 @@ public void build(InputIterator iterator, double

Re: [PR] Remove even more boolean success flags [lucene]

2025-08-28 Thread via GitHub
thecoop commented on code in PR #15134: URL: https://github.com/apache/lucene/pull/15134#discussion_r2306702052 ## lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/SortingStrategy.java: ## @@ -107,11 +104,7 @@ public String next() throws IOException {

[PR] Remove even more boolean success flags [lucene]

2025-08-28 Thread via GitHub
thecoop opened a new pull request, #15134: URL: https://github.com/apache/lucene/pull/15134 Most of the remaining ones are in backwards-compatible codecs and the test framework

Re: [PR] Move finishMerge to a finally block so it always runs [lucene]

2025-08-28 Thread via GitHub
thecoop commented on code in PR #14977: URL: https://github.com/apache/lucene/pull/14977#discussion_r2306579645 ## lucene/core/src/java/org/apache/lucene/index/SegmentMerger.java: ## @@ -225,7 +226,7 @@ private void mergeTerms(SegmentWriteState segmentWriteState, SegmentReadSta