Re: [PR] Reduce ArrayUtil#grow in decompress [lucene]

2024-01-22 Thread via GitHub
github-actions[bot] commented on PR #12996: URL: https://github.com/apache/lucene/pull/12996#issuecomment-1905061582 This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the d...@lucene.apache.org list. Thank you for your

Re: [PR] Backport SOLR-14765 to branch_8_11 [lucene-solr]

2024-01-22 Thread via GitHub
HoustonPutman commented on PR #2682: URL: https://github.com/apache/lucene-solr/pull/2682#issuecomment-1905034000 Given that I have yet to have a passing build, hopefully this helps! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to

Re: [PR] Fix TaxonomyIndexArray construction from empty index [lucene]

2024-01-22 Thread via GitHub
stefanvodita commented on PR #13028: URL: https://github.com/apache/lucene/pull/13028#issuecomment-1904887153 Thank you for finding this @msfroh! I pushed a commit that should fix the issue.

Re: [PR] Backport SOLR-14765 to branch_8_11 [lucene-solr]

2024-01-22 Thread via GitHub
itygh commented on PR #2682: URL: https://github.com/apache/lucene-solr/pull/2682#issuecomment-1904782055 This is an automatic vacation reply from QQ Mail. Hello, I am currently on vacation and cannot reply to your email personally. I will reply as soon as possible after my vacation ends.

[PR] Backport SOLR-14765 to branch_8_11 [lucene-solr]

2024-01-22 Thread via GitHub
risdenk opened a new pull request, #2682: URL: https://github.com/apache/lucene-solr/pull/2682 https://issues.apache.org/jira/browse/SOLR-14765 backports SOLR-14765 and follow up bug fix/test commits.

Re: [PR] Fix TaxonomyIndexArray construction from empty index [lucene]

2024-01-22 Thread via GitHub
msfroh commented on PR #13028: URL: https://github.com/apache/lucene/pull/13028#issuecomment-1904540802 Hey -- I was just telling a colleague about this change and he asked about the synchronization on `initChildrenSiblings`. I talked through the logic in `computeChildrenSiblings` and

Re: [PR] Fix TaxonomyIndexArray construction from empty index [lucene]

2024-01-22 Thread via GitHub
stefanvodita commented on code in PR #13028: URL: https://github.com/apache/lucene/pull/13028#discussion_r1462177725 ## lucene/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestTaxonomyIndexArrays.java: ## @@ -59,4 +64,22 @@ public void testMultiplesOfChunkSize() {

Re: [PR] Fix TaxonomyIndexArray construction from empty index [lucene]

2024-01-22 Thread via GitHub
msfroh commented on code in PR #13028: URL: https://github.com/apache/lucene/pull/13028#discussion_r1462133413 ## lucene/facet/src/test/org/apache/lucene/facet/taxonomy/directory/TestTaxonomyIndexArrays.java: ## @@ -59,4 +64,22 @@ public void testMultiplesOfChunkSize() {

Re: [PR] Fix NPE when sampling for quantization in Lucene99HnswScalarQuantizedVectorsFormat [lucene]

2024-01-22 Thread via GitHub
benwtrent commented on PR #13027: URL: https://github.com/apache/lucene/pull/13027#issuecomment-1904325653 @jpountz, I gotcha, pushed a change

Re: [PR] Fix NPE when sampling for quantization in Lucene99HnswScalarQuantizedVectorsFormat [lucene]

2024-01-22 Thread via GitHub
jpountz commented on PR #13027: URL: https://github.com/apache/lucene/pull/13027#issuecomment-1904218681 OK, taking an int in that method sounds fine to me. Can we still preserve the signature of `mergeOneField` to return `void` and re-compute the doc count before passing the doc count to

Re: [PR] Fix NPE when sampling for quantization in Lucene99HnswScalarQuantizedVectorsFormat [lucene]

2024-01-22 Thread via GitHub
benwtrent commented on PR #13027: URL: https://github.com/apache/lucene/pull/13027#issuecomment-1904205095 > Adding an @Override to MergedVectorValues that asserts size() is never called is a good protection idea. Well, that would be overkill actually, many things use `size()` and

[PR] Fix TaxonomyIndexArray construction from empty index [lucene]

2024-01-22 Thread via GitHub
stefanvodita opened a new pull request, #13028: URL: https://github.com/apache/lucene/pull/13028 In #12995, we introduced `allocateChunkedArray`, but had a misaligned assumption in the constructor for cases where the allocation size is 0. @msfroh - sorry I missed this check when we

Re: [PR] Fix NPE when sampling for quantization in Lucene99HnswScalarQuantizedVectorsFormat [lucene]

2024-01-22 Thread via GitHub
benwtrent commented on PR #13027: URL: https://github.com/apache/lucene/pull/13027#issuecomment-1904192755 @jpountz > I see that it needs to linearly scan all vectors anyway, so this shouldn't come at a performance penalty? We would then need to linearly scan all the vectors

Re: [PR] Fix NPE when sampling for quantization in Lucene99HnswScalarQuantizedVectorsFormat [lucene]

2024-01-22 Thread via GitHub
jpountz commented on PR #13027: URL: https://github.com/apache/lucene/pull/13027#issuecomment-1904174622 Good catch. I wonder what is the best place to compute size() correctly. I see you fixed the merge instances, but this is not how it's done elsewhere, see e.g. `FieldsConsumer#write`,

[PR] Propagate minimum competitive score in ReqOptSumScorer. [lucene]

2024-01-22 Thread via GitHub
jpountz opened a new pull request, #13026: URL: https://github.com/apache/lucene/pull/13026 If the required clause doesn't contribute scores, which typically happens if the required clause is a `FILTER` clause, then the minimum competitive score can be propagated directly to the optional

Re: [I] Move group-varint encoding/decoding logic to DataOutput/DataInput? [lucene]

2024-01-22 Thread via GitHub
jpountz commented on issue #12826: URL: https://github.com/apache/lucene/issues/12826#issuecomment-1903948450 Implemented!

Re: [I] Move group-varint encoding/decoding logic to DataOutput/DataInput? [lucene]

2024-01-22 Thread via GitHub
jpountz closed issue #12826: Move group-varint encoding/decoding logic to DataOutput/DataInput? URL: https://github.com/apache/lucene/issues/12826

Re: [I] Should we fail retrieving doc values on the wrong type? [lucene]

2024-01-22 Thread via GitHub
jpountz closed issue #12005: Should we fail retrieving doc values on the wrong type? URL: https://github.com/apache/lucene/issues/12005

Re: [I] Investigate slow fuzzy queries [lucene]

2024-01-22 Thread via GitHub
jpountz commented on issue #12456: URL: https://github.com/apache/lucene/issues/12456#issuecomment-1903947480 Closing: the slowdown has been mostly mitigated through other changes.

Re: [I] Investigate slow fuzzy queries [lucene]

2024-01-22 Thread via GitHub
jpountz closed issue #12456: Investigate slow fuzzy queries URL: https://github.com/apache/lucene/issues/12456

Re: [I] What if we pick up segments in segment size's ascending order in TieredMergePolicy.doFindMerges? [lucene]

2024-01-22 Thread via GitHub
jpountz commented on issue #13022: URL: https://github.com/apache/lucene/issues/13022#issuecomment-1903944692 Is `findMerges()` a bottleneck for your use-case? FWIW, the way `TieredMergePolicy` works is a feature, it allows it to greedily pack segments into merges in a way that allows

Re: [PR] [Minor] Document operation costs for stale workflow [lucene]

2024-01-22 Thread via GitHub
stefanvodita merged PR #13000: URL: https://github.com/apache/lucene/pull/13000

Re: [I] Taxonomy facets: can we change massive `int[]` for parent/child/sibling tree to paged/block `int[]` to reduce RAM pressure? [lucene]

2024-01-22 Thread via GitHub
stefanvodita closed issue #12989: Taxonomy facets: can we change massive `int[]` for parent/child/sibling tree to paged/block `int[]` to reduce RAM pressure? URL: https://github.com/apache/lucene/issues/12989

Re: [PR] Split taxonomy arrays across chunks [lucene]

2024-01-22 Thread via GitHub
stefanvodita merged PR #12995: URL: https://github.com/apache/lucene/pull/12995

[I] Reproducible failure in TestGrouping.testRandom [lucene]

2024-01-22 Thread via GitHub
easyice opened a new issue, #13025: URL: https://github.com/apache/lucene/issues/13025 ### Description > java.lang.AssertionError: expected.totalGroupedHitCount != actual.totalGroupedHitCount expected:<23> but was:<21> > at