date:20250803

Re: [PR] Avoid reconstructing HNSW graph during singleton merges [lucene]

2025-08-03 Thread via GitHub

Pulkitg64 commented on PR #15003: URL: https://github.com/apache/lucene/pull/15003#issuecomment-3149313198 > I wonder if this optimization could be applied when there are more than 1 segment to merge by first applying deletions on the bigger segment to merge and then adding vectors from ot

Re: [PR] Backport: Remove full integrity check from SortingStoredFieldsConsumer [lucene]

2025-08-03 Thread via GitHub

martijnvg merged PR #15032: URL: https://github.com/apache/lucene/pull/15032 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@lucene.a

Re: [PR] MultiIndexMergeScheduler: a production multi-tenant merge scheduler [lucene]

2025-08-03 Thread via GitHub

vigyasharma commented on PR #15015: URL: https://github.com/apache/lucene/pull/15015#issuecomment-3149080542 Let us also make this class `@lucene.experimental`; in case we need to change some interfaces or class structure as we progress with #13883

[PR] Backport: Remove full integrity check from SortingStoredFieldsConsumer [lucene]

2025-08-03 Thread via GitHub

martijnvg opened a new pull request, #15032: URL: https://github.com/apache/lucene/pull/15032 Backporting #15001 to the 10.x branch. In write-heavy scenarios with significant stored field usage, the full integrity check that happens during flushing stored fields to disk when index so

Re: [I] Should SortingStoredFieldsConsumer do a full integrity check? [lucene]

2025-08-03 Thread via GitHub

martijnvg closed issue #14881: Should SortingStoredFieldsConsumer do a full integrity check? URL: https://github.com/apache/lucene/issues/14881

Re: [PR] Remove full integrity check from SortingStoredFieldsConsumer [lucene]

2025-08-03 Thread via GitHub

martijnvg merged PR #15001: URL: https://github.com/apache/lucene/pull/15001

Re: [PR] feat(nori): add metadata support to Korean tokenizer [lucene]

2025-08-03 Thread via GitHub

github-actions[bot] commented on PR #14969: URL: https://github.com/apache/lucene/pull/14969#issuecomment-3148816264 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your contributi

Re: [PR] MultiIndexMergeScheduler: a production multi-tenant merge scheduler [lucene]

2025-08-03 Thread via GitHub

vigyasharma commented on code in PR #15015: URL: https://github.com/apache/lucene/pull/15015#discussion_r2250184825 ## lucene/core/src/java/org/apache/lucene/index/MultiIndexMergeScheduler.java: ## @@ -0,0 +1,169 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] MultiIndexMergeScheduler: a production multi-tenant merge scheduler [lucene]

2025-08-03 Thread via GitHub

vigyasharma commented on code in PR #15015: URL: https://github.com/apache/lucene/pull/15015#discussion_r2250084919 ## lucene/CHANGES.txt: ## @@ -31,6 +29,8 @@ API Changes New Features - +* GITHUB#15015: MultiIndexMergeScheduler: a production multi-tenant

Re: [PR] MultiIndexMergeScheduler: a production multi-tenant merge scheduler [lucene]

2025-08-03 Thread via GitHub

vigyasharma commented on code in PR #15015: URL: https://github.com/apache/lucene/pull/15015#discussion_r2250081443 ## lucene/core/src/java/org/apache/lucene/index/MultiIndexMergeScheduler.java: ## @@ -0,0 +1,169 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under

Re: [PR] Add bulk off-heap scoring for float32 vectors [lucene]

2025-08-03 Thread via GitHub

ChrisHegarty commented on PR #14980: URL: https://github.com/apache/lucene/pull/14980#issuecomment-3148533122 > I wonder if it would be beneficial to return the "best" score from a scored block? It's possible that the caller can skip handling the scored block altogether if the best score ret

Re: [PR] Add bulk off-heap scoring for float32 vectors [lucene]

2025-08-03 Thread via GitHub

ChrisHegarty commented on PR #14980: URL: https://github.com/apache/lucene/pull/14980#issuecomment-3148532408 @mccullocht can you try this in your environment? I see good perf improvement without the need to write a custom scorer (in a different language). Note: when testing, there is just

Re: [PR] Add bulk off-heap scoring for float32 vectors [lucene]

2025-08-03 Thread via GitHub

github-actions[bot] commented on PR #14980: URL: https://github.com/apache/lucene/pull/14980#issuecomment-3148513640 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

Re: [PR] Add bulk off-heap scoring for float32 vectors [lucene]

2025-08-03 Thread via GitHub

github-actions[bot] commented on PR #14980: URL: https://github.com/apache/lucene/pull/14980#issuecomment-3148506272 This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog label to it and you will stop

Re: [PR] GroupVarInt Encoding Implementation for HNSW Graphs [lucene]

2025-08-03 Thread via GitHub

jpountz commented on PR #14932: URL: https://github.com/apache/lucene/pull/14932#issuecomment-3148266414 The change looks good to me, I have the same feedback as @kaivalnp. Should we try to run [knnPerfTest](https://github.com/mikemccand/luceneutil/blob/main/src/python/knnPerfTest.py) with

Re: [I] Faceting + Data Sketches [lucene]

2025-08-03 Thread via GitHub

jpountz commented on issue #15017: URL: https://github.com/apache/lucene/issues/15017#issuecomment-3148203349 > Since facet counting is a relatively light-weight operation This statement surprised me a bit since faceting tasks on nightly benchmarks run several times slower than top-k

16 matches
