[jira] [Resolved] (LUCENE-10421) Non-deterministic results from KnnVectorQuery?

2022-02-24 Thread Robert Muir (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-10421. -- Fix Version/s: 9.1 Resolution: Fixed > Non-deterministic results from KnnVectorQuery?

[jira] [Commented] (LUCENE-10421) Non-deterministic results from KnnVectorQuery?

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497932#comment-17497932 ] ASF subversion and git services commented on LUCENE-10421: -- Commit

[jira] [Commented] (LUCENE-10421) Non-deterministic results from KnnVectorQuery?

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497920#comment-17497920 ] ASF subversion and git services commented on LUCENE-10421: -- Commit

[GitHub] [lucene] rmuir merged pull request #686: LUCENE-10421: use Constant instead of relying upon timestamp

2022-02-24 Thread GitBox
rmuir merged pull request #686: URL: https://github.com/apache/lucene/pull/686 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail:

[GitHub] [lucene] LuXugang opened a new pull request #714: LUCENE-10439: update CHANGES.txt

2022-02-24 Thread GitBox
LuXugang opened a new pull request #714: URL: https://github.com/apache/lucene/pull/714 update CHANGES.txt for [LUCENE-10424](https://issues.apache.org/jira/browse/LUCENE-10424) and [LUCENE-10439](https://issues.apache.org/jira/browse/LUCENE-10439) .

[jira] [Commented] (LUCENE-10431) AssertionError in BooleanQuery.hashCode()

2022-02-24 Thread Michael Bien (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497915#comment-17497915 ] Michael Bien commented on LUCENE-10431: --- thanks for the tips. I might have found the cause. The

[jira] [Updated] (LUCENE-10441) ArrayIndexOutOfBoundsException during indexing

2022-02-24 Thread Peixin Li (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peixin Li updated LUCENE-10441: --- Description: Hi experts! I am facing an ArrayIndexOutOfBoundsException during indexing and

[jira] [Created] (LUCENE-10441) ArrayIndexOutOfBoundsException during indexing

2022-02-24 Thread Peixin Li (Jira)
Peixin Li created LUCENE-10441: -- Summary: ArrayIndexOutOfBoundsException during indexing Key: LUCENE-10441 URL: https://issues.apache.org/jira/browse/LUCENE-10441 Project: Lucene - Core Issue

[jira] (LUCENE-10394) Explore moving ByteBuffer(sData|Index)Input to absolute bulk gets

2022-02-24 Thread Gautam Worah (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10394 ] Gautam Worah deleted comment on LUCENE-10394: --- was (Author: gworah): I'll try to work on this soon. Looking into the ByteBuffer API in the meantime. > Explore moving

[jira] [Commented] (LUCENE-10394) Explore moving ByteBuffer(sData|Index)Input to absolute bulk gets

2022-02-24 Thread Gautam Worah (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497873#comment-17497873 ] Gautam Worah commented on LUCENE-10394: --- I'll try to work on this soon. Looking into the

[jira] [Commented] (LUCENE-9952) FacetResult#value can be inaccurate in SortedSetDocValueFacetCounts

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497757#comment-17497757 ] ASF subversion and git services commented on LUCENE-9952: - Commit

[jira] [Commented] (LUCENE-9952) FacetResult#value can be inaccurate in SortedSetDocValueFacetCounts

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497749#comment-17497749 ] ASF subversion and git services commented on LUCENE-9952: - Commit

[jira] [Commented] (LUCENE-10431) AssertionError in BooleanQuery.hashCode()

2022-02-24 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497718#comment-17497718 ] Uwe Schindler commented on LUCENE-10431: Sorry with builder pattern you can't create a BQ

[jira] [Commented] (LUCENE-10431) AssertionError in BooleanQuery.hashCode()

2022-02-24 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497716#comment-17497716 ] Uwe Schindler commented on LUCENE-10431: What was the exact query. Is it the only boolean one

[GitHub] [lucene] magibney commented on pull request #380: LUCENE-10171 - Fix dictionary-based OpenNLPLemmatizerFilterFactory caching issue

2022-02-24 Thread GitBox
magibney commented on pull request #380: URL: https://github.com/apache/lucene/pull/380#issuecomment-1050227228 This patch applies cleanly and all tests pass. I plan to commit this within the next few days, because i think it does improve things (targeting 9.1 release). But I want

[jira] [Commented] (LUCENE-10440) Reduce visibility of TaxonomyFacets and FloatTaxonomyFacets

2022-02-24 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497692#comment-17497692 ] Greg Miller commented on LUCENE-10440: -- PRs posted for this. The only point maybe worth calling

[GitHub] [lucene] gsmiller opened a new pull request #713: LUCENE-10440: Mark TaxonomyFacets and FloatTaxonomyFacets as deprecated

2022-02-24 Thread GitBox
gsmiller opened a new pull request #713: URL: https://github.com/apache/lucene/pull/713 This is a "backport" of #712, providing early `@Deprecation` notice.

[GitHub] [lucene] gsmiller opened a new pull request #712: LUCENE-10440: Reduce visibility of TaxonomyFacets and FloatTaxonomyFacets

2022-02-24 Thread GitBox
gsmiller opened a new pull request #712: URL: https://github.com/apache/lucene/pull/712 # Description These two classes are really implementation details, meant to hold common logic for our faceting implementations, but they are `public` and could be extended by users. It would be

[GitHub] [lucene] rmuir commented on pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
rmuir commented on pull request #709: URL: https://github.com/apache/lucene/pull/709#issuecomment-1050188471 > I don't think the grow(long) is necessary, we can always added to the IntersectVisitor instead. Maybe would be worthy to adjust how we call grow() in BKDReader#addAll as it does

[jira] [Created] (LUCENE-10440) Reduce visibility of TaxonomyFacets and FloatTaxonomyFacets

2022-02-24 Thread Greg Miller (Jira)
Greg Miller created LUCENE-10440: Summary: Reduce visibility of TaxonomyFacets and FloatTaxonomyFacets Key: LUCENE-10440 URL: https://issues.apache.org/jira/browse/LUCENE-10440 Project: Lucene - Core

[jira] [Assigned] (LUCENE-10440) Reduce visibility of TaxonomyFacets and FloatTaxonomyFacets

2022-02-24 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Miller reassigned LUCENE-10440: Assignee: Greg Miller > Reduce visibility of TaxonomyFacets and FloatTaxonomyFacets >

[GitHub] [lucene] jtibshirani commented on pull request #686: LUCENE-10421: use Constant instead of relying upon timestamp

2022-02-24 Thread GitBox
jtibshirani commented on pull request #686: URL: https://github.com/apache/lucene/pull/686#issuecomment-1050175765 Thanks @rmuir ! Are you okay to merge this? I got confused recently over a sometimes-reproducible test failure.

[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-02-24 Thread Ankit Jain (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497658#comment-17497658 ] Ankit Jain commented on LUCENE-10428: - {quote}I opened a pull request that doesn't fix the bug but

[GitHub] [lucene] rmuir commented on pull request #710: LUCENE-10311: Make FixedBitSet#approximateCardinality faster (and actually approximate).

2022-02-24 Thread GitBox
rmuir commented on pull request #710: URL: https://github.com/apache/lucene/pull/710#issuecomment-1050162660 also, another random suggestion for another day. I think it would be fine to have some logic like this at some point: ``` if (length < N) { return cardinality(); //

[jira] [Commented] (LUCENE-10391) Reuse data structures across HnswGraph invocations

2022-02-24 Thread Julie Tibshirani (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497648#comment-17497648 ] Julie Tibshirani commented on LUCENE-10391: --- Now that the benchmarks are running again, we

[jira] [Commented] (LUCENE-10438) Leverage Weight#count in lucene/facets

2022-02-24 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497616#comment-17497616 ] Greg Miller commented on LUCENE-10438: -- I experimented with this a bit for taxo- and ssdv-faceting

[jira] [Updated] (LUCENE-10391) Reuse data structures across HnswGraph invocations

2022-02-24 Thread Julie Tibshirani (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julie Tibshirani updated LUCENE-10391: -- Attachment: Screen Shot 2022-02-24 at 10.18.42 AM.png > Reuse data structures across

[GitHub] [lucene] rmuir commented on a change in pull request #710: LUCENE-10311: Make FixedBitSet#approximateCardinality faster (and actually approximate).

2022-02-24 Thread GitBox
rmuir commented on a change in pull request #710: URL: https://github.com/apache/lucene/pull/710#discussion_r814137771 ## File path: lucene/core/src/java/org/apache/lucene/util/FixedBitSet.java ## @@ -176,6 +176,30 @@ public int cardinality() { return (int)

[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497597#comment-17497597 ] Adrien Grand commented on LUCENE-10428: --- This is interesting indeed since query execution should

[jira] [Comment Edited] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-02-24 Thread Ankit Jain (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497553#comment-17497553 ] Ankit Jain edited comment on LUCENE-10428 at 2/24/22, 5:01 PM: --- {quote}By

[jira] [Comment Edited] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-02-24 Thread Ankit Jain (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497553#comment-17497553 ] Ankit Jain edited comment on LUCENE-10428 at 2/24/22, 5:00 PM: --- {quote}By

[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-02-24 Thread Ankit Jain (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497553#comment-17497553 ] Ankit Jain commented on LUCENE-10428: - {quote}By any chance, were you able to see what is the

[GitHub] [lucene] rmuir commented on pull request #710: LUCENE-10311: Make FixedBitSet#approximateCardinality faster (and actually approximate).

2022-02-24 Thread GitBox
rmuir commented on pull request #710: URL: https://github.com/apache/lucene/pull/710#issuecomment-1050050397 Since we made the method `abstract`, let's just have it forward to exact-cardinality for the `JavaUtilBitSet` used in the unit tests? It should fix the test issues. I agree

[jira] [Commented] (LUCENE-10427) OLAP likewise rollup during segment merge process

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497526#comment-17497526 ] Adrien Grand commented on LUCENE-10427: --- I know that the Elasticsearch team is looking into doing

[GitHub] [lucene] jpountz opened a new pull request #710: LUCENE-10311: Make FixedBitSet#approximateCardinality faster (and actually approximate).

2022-02-24 Thread GitBox
jpountz opened a new pull request #710: URL: https://github.com/apache/lucene/pull/710 This computes a pop count on a sample of the longs that back the bitset. Quick benchmarks suggest that this runs 5x-10x faster than `FixedBitSet#cardinality` depending on the length of the

[GitHub] [lucene] iverase commented on a change in pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
iverase commented on a change in pull request #709: URL: https://github.com/apache/lucene/pull/709#discussion_r814047234 ## File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java ## @@ -266,20 +224,12 @@ private void upgradeToBitSet() { public DocIdSet

[GitHub] [lucene] iverase commented on a change in pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
iverase commented on a change in pull request #709: URL: https://github.com/apache/lucene/pull/709#discussion_r814045946 ## File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java ## @@ -266,20 +224,12 @@ private void upgradeToBitSet() { public DocIdSet

[GitHub] [lucene] rmuir commented on a change in pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
rmuir commented on a change in pull request #709: URL: https://github.com/apache/lucene/pull/709#discussion_r814040808 ## File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java ## @@ -266,20 +224,12 @@ private void upgradeToBitSet() { public DocIdSet

[GitHub] [lucene] rmuir commented on a change in pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
rmuir commented on a change in pull request #709: URL: https://github.com/apache/lucene/pull/709#discussion_r814039139 ## File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java ## @@ -266,20 +224,12 @@ private void upgradeToBitSet() { public DocIdSet

[jira] [Commented] (LUCENE-10432) Add optional 'name' property to org.apache.lucene.search.Explanation

2022-02-24 Thread Andriy Redko (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497499#comment-17497499 ] Andriy Redko commented on LUCENE-10432: --- Yeah, it may not 100% cover everything (like the

[GitHub] [lucene] iverase edited a comment on pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
iverase edited a comment on pull request #709: URL: https://github.com/apache/lucene/pull/709#issuecomment-104552 I don't think the grow(long) is necessary, we can always added to the IntersectVisitor instead. Maybe would be worthy to adjust how we call grow() in BKDReader#addAll as

[GitHub] [lucene] iverase commented on pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
iverase commented on pull request #709: URL: https://github.com/apache/lucene/pull/709#issuecomment-104552 I don't think the grow(long) is necessary, we can always added to the IntersectVisitor instead. Maybe would be worthy to adjust how we call grow() in BKDReader#addAll as it does not need

[GitHub] [lucene] iverase commented on a change in pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
iverase commented on a change in pull request #709: URL: https://github.com/apache/lucene/pull/709#discussion_r813994000 ## File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java ## @@ -266,20 +224,12 @@ private void upgradeToBitSet() { public DocIdSet

[GitHub] [lucene] iverase commented on a change in pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
iverase commented on a change in pull request #709: URL: https://github.com/apache/lucene/pull/709#discussion_r813988648 ## File path: lucene/core/src/java/org/apache/lucene/util/DocIdSetBuilder.java ## @@ -266,20 +224,12 @@ private void upgradeToBitSet() { public DocIdSet

[GitHub] [lucene] rmuir commented on pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
rmuir commented on pull request #709: URL: https://github.com/apache/lucene/pull/709#issuecomment-1049967927 If we want to add the `grow(long)` sugar method that simply truncates to `Integer.MAX_VALUE` and clean up all the points callsites, or write a cool

[jira] [Commented] (LUCENE-10432) Add optional 'name' property to org.apache.lucene.search.Explanation

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497468#comment-17497468 ] Adrien Grand commented on LUCENE-10432: --- The bit I'm missing is how you would let Lucene know

[GitHub] [lucene] jpountz commented on pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
jpountz commented on pull request #709: URL: https://github.com/apache/lucene/pull/709#issuecomment-1049959940 That change makes sense to me. FWIW my recollection from profiling DocIdSetBuilder is that the deduplication logic is cheap and most of the time is spent in

[GitHub] [lucene] rmuir commented on pull request #692: LUCENE-10311: Different implementations of DocIdSetBuilder for points and terms

2022-02-24 Thread GitBox
rmuir commented on pull request #692: URL: https://github.com/apache/lucene/pull/692#issuecomment-1049948208 prototype: https://github.com/apache/lucene/pull/709

[GitHub] [lucene] rmuir commented on pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
rmuir commented on pull request #709: URL: https://github.com/apache/lucene/pull/709#issuecomment-1049948027 Here's a first stab of what i proposed on https://github.com/apache/lucene/pull/692 You can see how damaging the current cost() implementation is. As followup commits

[GitHub] [lucene] rmuir opened a new pull request #709: LUCENE-10311: remove complex cost estimation and abstraction leakage around it

2022-02-24 Thread GitBox
rmuir opened a new pull request #709: URL: https://github.com/apache/lucene/pull/709 Cost estimation drives the API complexity out of control, we don't need it. Hopefully i've cleared up all the API damage from this explosive leak. Instead, FixedBitSet.approximateCardinality() is

[jira] [Commented] (LUCENE-10428) getMinCompetitiveScore method in MaxScoreSumPropagator fails to converge leading to busy threads in infinite loop

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497422#comment-17497422 ] Adrien Grand commented on LUCENE-10428: --- Ouch this is bad. Note that in your code snippet,

[GitHub] [lucene] rmuir commented on pull request #692: LUCENE-10311: Different implementations of DocIdSetBuilder for points and terms

2022-02-24 Thread GitBox
rmuir commented on pull request #692: URL: https://github.com/apache/lucene/pull/692#issuecomment-1049905030 To try to be more helpful, here's what i'd propose. I can try to hack up a draft PR later if we want, if it is helpful. DocIdSetBuilder, remove complex cost estimation: *

[GitHub] [lucene] rmuir commented on pull request #692: LUCENE-10311: Different implementations of DocIdSetBuilder for points and terms

2022-02-24 Thread GitBox
rmuir commented on pull request #692: URL: https://github.com/apache/lucene/pull/692#issuecomment-1049894634 If this is literally all about "style" issue then let's be open and honest about that. I am fine with: ``` /** sugar: to just make code look pretty, nothing else */ public

[jira] [Commented] (LUCENE-10432) Add optional 'name' property to org.apache.lucene.search.Explanation

2022-02-24 Thread Andriy Redko (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497405#comment-17497405 ] Andriy Redko commented on LUCENE-10432: --- Thanks [~jpountz]  > I wonder if you have thought about

[GitHub] [lucene] iverase commented on pull request #692: LUCENE-10311: Different implementations of DocIdSetBuilder for points and terms

2022-02-24 Thread GitBox
iverase commented on pull request #692: URL: https://github.com/apache/lucene/pull/692#issuecomment-1049887302 32 bits will need to be discarded anyway, the issue is where. You either do it at the PointValues level by calling grow like: ``` visitor.grow((int)

[jira] [Commented] (LUCENE-10431) AssertionError in BooleanQuery.hashCode()

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497402#comment-17497402 ] Adrien Grand commented on LUCENE-10431: --- I've been staring at the code and at this stack trace

[jira] [Commented] (LUCENE-10432) Add optional 'name' property to org.apache.lucene.search.Explanation

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497399#comment-17497399 ] Adrien Grand commented on LUCENE-10432: --- [~reta] I wonder if you have thought about how queries

[GitHub] [lucene] rmuir commented on pull request #692: LUCENE-10311: Different implementations of DocIdSetBuilder for points and terms

2022-02-24 Thread GitBox
rmuir commented on pull request #692: URL: https://github.com/apache/lucene/pull/692#issuecomment-1049869937 > @rmuir We can remove the cost estimation, but it will not address the problem. I'll try to explain the problem differently in case it helps. I really think it will address

[jira] [Commented] (LUCENE-10438) Leverage Weight#count in lucene/facets

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497394#comment-17497394 ] Adrien Grand commented on LUCENE-10438: --- Solr indeed has a version of faceting that does this. I

[GitHub] [lucene] jpountz commented on pull request #692: LUCENE-10311: Different implementations of DocIdSetBuilder for points and terms

2022-02-24 Thread GitBox
jpountz commented on pull request #692: URL: https://github.com/apache/lucene/pull/692#issuecomment-1049857125 @rmuir We can remove the cost estimation, but it will not address the problem. I'll try to explain the problem differently in case it helps. DocIdSetBuilder takes doc IDs

[jira] [Updated] (LUCENE-10315) Speed up BKD leaf block ids codec by a 512 ints ForUtil

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-10315: -- Fix Version/s: (was: 9.1) > Speed up BKD leaf block ids codec by a 512 ints ForUtil >

[jira] [Reopened] (LUCENE-10315) Speed up BKD leaf block ids codec by a 512 ints ForUtil

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand reopened LUCENE-10315: --- > Speed up BKD leaf block ids codec by a 512 ints ForUtil >

[jira] [Resolved] (LUCENE-10439) Support multi-valued and multiple dimensions for count query in PointRangeQuery

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-10439. --- Fix Version/s: 9.1 Resolution: Fixed > Support multi-valued and multiple dimensions

[jira] [Commented] (LUCENE-10408) Better dense encoding of doc Ids in Lucene91HnswVectorsFormat

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497372#comment-17497372 ] ASF subversion and git services commented on LUCENE-10408: -- Commit

[jira] [Commented] (LUCENE-10382) Allow KnnVectorQuery to operate over a subset of liveDocs

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497373#comment-17497373 ] ASF subversion and git services commented on LUCENE-10382: -- Commit

[jira] [Commented] (LUCENE-10382) Allow KnnVectorQuery to operate over a subset of liveDocs

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497369#comment-17497369 ] ASF subversion and git services commented on LUCENE-10382: -- Commit

[GitHub] [lucene] jpountz merged pull request #702: LUCENE-10382: Use `IndexReaderContext#id` to check reader identity.

2022-02-24 Thread GitBox
jpountz merged pull request #702: URL: https://github.com/apache/lucene/pull/702

[jira] [Commented] (LUCENE-10408) Better dense encoding of doc Ids in Lucene91HnswVectorsFormat

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497367#comment-17497367 ] ASF subversion and git services commented on LUCENE-10408: -- Commit

[GitHub] [lucene] jpountz merged pull request #708: LUCENE-10408: Write doc IDs of KNN vectors as ints rather than vints.

2022-02-24 Thread GitBox
jpountz merged pull request #708: URL: https://github.com/apache/lucene/pull/708

[GitHub] [lucene] jpountz opened a new pull request #708: LUCENE-10408: Write doc IDs of KNN vectors as ints rather than vints.

2022-02-24 Thread GitBox
jpountz opened a new pull request #708: URL: https://github.com/apache/lucene/pull/708 Since doc IDs with a vector are loaded as an int[] in memory, this changes the on-disk format of vectors to align with the in-memory representation by using ints instead of vints to represent doc

[jira] [Commented] (LUCENE-10417) IntNRQ task performance decreased in nightly benchmark

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497280#comment-17497280 ] Adrien Grand commented on LUCENE-10417: --- FYI Elasticsearch was upgraded to a recent Lucene

[jira] [Commented] (LUCENE-10439) Support multi-valued and multiple dimensions for count query in PointRangeQuery

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497273#comment-17497273 ] ASF subversion and git services commented on LUCENE-10439: -- Commit

[jira] [Commented] (LUCENE-10439) Support multi-valued and multiple dimensions for count query in PointRangeQuery

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497272#comment-17497272 ] ASF subversion and git services commented on LUCENE-10439: -- Commit

[GitHub] [lucene] jpountz merged pull request #705: LUCENE-10439: Support multi-valued and multiple dimensions for count query in PointRangeQuery

2022-02-24 Thread GitBox
jpountz merged pull request #705: URL: https://github.com/apache/lucene/pull/705

[GitHub] [lucene] LuXugang commented on pull request #705: LUCENE-10439: Support multi-valued and multiple dimensions for count query in PointRangeQuery

2022-02-24 Thread GitBox
LuXugang commented on pull request #705: URL: https://github.com/apache/lucene/pull/705#issuecomment-1049639011 > We need two cases: > > * Checking whether all documents match and returning values.getDocCount(). This works when there are no deletions. > * Actually

[jira] [Commented] (LUCENE-10417) IntNRQ task performance decreased in nightly benchmark

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497267#comment-17497267 ] ASF subversion and git services commented on LUCENE-10417: -- Commit

[jira] [Commented] (LUCENE-10315) Speed up BKD leaf block ids codec by a 512 ints ForUtil

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497268#comment-17497268 ] ASF subversion and git services commented on LUCENE-10315: -- Commit

[GitHub] [lucene] gf2121 merged pull request #707: LUCENE-10417: Revert LUCENE-10315 (backport 9x)

2022-02-24 Thread GitBox
gf2121 merged pull request #707: URL: https://github.com/apache/lucene/pull/707

[jira] [Assigned] (LUCENE-10194) Should IndexWriter buffer KNN vectors on disk?

2022-02-24 Thread Mayya Sharipova (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10194?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mayya Sharipova reassigned LUCENE-10194: Assignee: Mayya Sharipova > Should IndexWriter buffer KNN vectors on disk? >

[jira] [Commented] (LUCENE-10417) IntNRQ task performance decreased in nightly benchmark

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497259#comment-17497259 ] ASF subversion and git services commented on LUCENE-10417: -- Commit

[jira] [Commented] (LUCENE-10315) Speed up BKD leaf block ids codec by a 512 ints ForUtil

2022-02-24 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17497260#comment-17497260 ] ASF subversion and git services commented on LUCENE-10315: -- Commit

[GitHub] [lucene] gf2121 merged pull request #706: LUCENE-10417: Revert "LUCENE-10315"

2022-02-24 Thread GitBox
gf2121 merged pull request #706: URL: https://github.com/apache/lucene/pull/706

[jira] [Resolved] (LUCENE-10435) Break loop early while checking whether DocValuesFieldExistsQuery can be rewrite to MatchAllDocsQuery

2022-02-24 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-10435. --- Fix Version/s: 9.1 Resolution: Fixed > Break loop early while checking whether

[GitHub] [lucene] gf2121 opened a new pull request #706: LUCENE-10417: Revert "LUCENE-10315"

2022-02-24 Thread GitBox
gf2121 opened a new pull request #706: URL: https://github.com/apache/lucene/pull/706 SIMD-optimization for BKD `DocIdsWriter` was introduced in https://github.com/apache/lucene/pull/652 in order to speed up decoding of docIDs, but it leads to the regression in nightly benchmark.