[GitHub] [lucene] gsmiller opened a new pull request #611: LUCENE-9952: Fix dim count inaccuracies in SSDV faceting when a dim is multi-valued

2022-01-17 Thread GitBox
gsmiller opened a new pull request #611: URL: https://github.com/apache/lucene/pull/611 # Description Dimension-level counts in SSDV faceting can be inaccurate when docs contain multiple values under the same dim (the doc will be counted multiple times towards the dim). This fixes t
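As a rough illustration of the inaccuracy described in this PR summary, the sketch below compares a dim count built by summing per-value counts with one that counts each document once. It uses plain Java collections rather than Lucene's faceting API; the class and method names (`DimCountSketch`, `naiveDimCount`, `perDocDimCount`) are hypothetical and are not taken from the PR.

```java
import java.util.List;

public class DimCountSketch {
    // Each inner list holds the values one document carries under a single dim.
    // Summing per-value counts means a doc with k values is counted k times.
    public static int naiveDimCount(List<List<String>> docs) {
        int count = 0;
        for (List<String> values : docs) {
            count += values.size();
        }
        return count;
    }

    // Counting per document: a doc contributes once if it has any value under the dim.
    public static int perDocDimCount(List<List<String>> docs) {
        int count = 0;
        for (List<String> values : docs) {
            if (!values.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<List<String>> docs = List.of(
            List.of("red", "blue"),   // multi-valued doc
            List.of("red"),           // single-valued doc
            List.<String>of());       // doc with no value under this dim
        System.out.println("naive dim count:   " + naiveDimCount(docs));
        System.out.println("per-doc dim count: " + perDocDimCount(docs));
    }
}
```

In this toy data the multi-valued document inflates the summed count, which is the kind of over-counting the PR description refers to; how the actual fix is implemented is not shown in this snippet.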

[jira] [Commented] (LUCENE-10368) Reduce visibility of IntTaxonomyFacets

2022-01-17 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477509#comment-17477509 ] Greg Miller commented on LUCENE-10368: -- Moved both of these PRs out of draft mode

[jira] [Updated] (LUCENE-10374) Track down the "browse" taxonomy faceting qps regression

2022-01-17 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Miller updated LUCENE-10374: - Priority: Minor (was: Major) > Track down the "browse" taxonomy faceting qps regression >

[jira] [Updated] (LUCENE-10374) Track down the "browse" taxonomy faceting qps regression

2022-01-17 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Miller updated LUCENE-10374: - Issue Type: Task (was: Improvement) > Track down the "browse" taxonomy faceting qps regression

[jira] [Commented] (LUCENE-10379) Count directly into the values array in FastTaxonomyFacetCounts#countAl

2022-01-17 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477507#comment-17477507 ] Greg Miller commented on LUCENE-10379: -- This has now been backported to 9.x. Resol

[jira] [Updated] (LUCENE-10379) Count directly into the values array in FastTaxonomyFacetCounts#countAl

2022-01-17 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Miller updated LUCENE-10379: - Fix Version/s: 9.1 10.0 (main) > Count directly into the values array in Fas

[jira] [Resolved] (LUCENE-10379) Count directly into the values array in FastTaxonomyFacetCounts#countAl

2022-01-17 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Miller resolved LUCENE-10379. -- Resolution: Fixed > Count directly into the values array in FastTaxonomyFacetCounts#countAl >

[GitHub] [lucene] gsmiller merged pull request #610: LUCENE-10379: Count directly into the dense values array in FastTaxonomyFacetCounts#countAll

2022-01-17 Thread GitBox
gsmiller merged pull request #610: URL: https://github.com/apache/lucene/pull/610 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr.

[jira] [Commented] (LUCENE-10379) Count directly into the values array in FastTaxonomyFacetCounts#countAl

2022-01-17 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477506#comment-17477506 ] ASF subversion and git services commented on LUCENE-10379: -- Co

[GitHub] [lucene] gsmiller opened a new pull request #610: LUCENE-10379: Count directly into the dense values array in FastTaxonomyFacetCounts#countAll

2022-01-17 Thread GitBox
gsmiller opened a new pull request #610: URL: https://github.com/apache/lucene/pull/610 Backport recent optimizations onto 9.x

[jira] [Commented] (LUCENE-10350) Avoid some null checking for FastTaxonomyFacetCounts#countAll()

2022-01-17 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477502#comment-17477502 ] ASF subversion and git services commented on LUCENE-10350: -- Co

[GitHub] [lucene] gsmiller merged pull request #609: revert part of LUCENE-10350 that got missed in the earlier revert

2022-01-17 Thread GitBox
gsmiller merged pull request #609: URL: https://github.com/apache/lucene/pull/609

[jira] [Commented] (LUCENE-10374) Track down the "browse" taxonomy faceting qps regression

2022-01-17 Thread Greg Miller (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477489#comment-17477489 ] Greg Miller commented on LUCENE-10374: -- As an update here, merging just part of th

[GitHub] [lucene] gsmiller commented on a change in pull request #606: LUCENE-10380: Further optimize FastTaxonomyFacetCounts#countAll by moving the liveDocs null check outside the loops

2022-01-17 Thread GitBox
gsmiller commented on a change in pull request #606: URL: https://github.com/apache/lucene/pull/606#discussion_r786337979 ## File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/FastTaxonomyFacetCounts.java ## @@ -126,23 +126,39 @@ private void countAll(IndexReade

[GitHub] [lucene] gsmiller commented on a change in pull request #606: LUCENE-10380: Further optimize FastTaxonomyFacetCounts#countAll by moving the liveDocs null check outside the loops

2022-01-17 Thread GitBox
gsmiller commented on a change in pull request #606: URL: https://github.com/apache/lucene/pull/606#discussion_r786337739 ## File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/FastTaxonomyFacetCounts.java ## @@ -126,23 +126,39 @@ private void countAll(IndexReade

[GitHub] [lucene] gsmiller commented on a change in pull request #606: LUCENE-10380: Further optimize FastTaxonomyFacetCounts#countAll by moving the liveDocs null check outside the loops

2022-01-17 Thread GitBox
gsmiller commented on a change in pull request #606: URL: https://github.com/apache/lucene/pull/606#discussion_r786337003 ## File path: lucene/facet/src/java/org/apache/lucene/facet/taxonomy/FastTaxonomyFacetCounts.java ## @@ -126,23 +126,39 @@ private void countAll(IndexReade

[GitHub] [lucene] mayya-sharipova commented on pull request #416: LUCENE-10054 Make HnswGraph hierarchical

2022-01-17 Thread GitBox
mayya-sharipova commented on pull request #416: URL: https://github.com/apache/lucene/pull/416#issuecomment-1014658302 Thanks everyone for the review. I am closing this PR in favour of https://github.com/apache/lucene/pull/608 that copies these changes into a new Lucene91Codec and Lucene91

[GitHub] [lucene] mayya-sharipova closed pull request #416: LUCENE-10054 Make HnswGraph hierarchical

2022-01-17 Thread GitBox
mayya-sharipova closed pull request #416: URL: https://github.com/apache/lucene/pull/416

[GitHub] [lucene] mayya-sharipova opened a new pull request #608: LUCENE-10054 Make HnswGraph hierarchical

2022-01-17 Thread GitBox
mayya-sharipova opened a new pull request #608: URL: https://github.com/apache/lucene/pull/608 Currently HNSW has only a single layer. This patch makes HNSW graph multi-layered. This PR is based on the following PRs: #250, #267, #287, #315, #536, #416 Main changes: -

[jira] [Commented] (LUCENE-10366) Reduce the number of valid checks for ByteBufferIndexInput#readVInt

2022-01-17 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477256#comment-17477256 ] Uwe Schindler commented on LUCENE-10366: This seems to make Hotspot inline the

[jira] [Commented] (LUCENE-10376) Roll up the loop in vint/vlong in DataInput

2022-01-17 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477250#comment-17477250 ] Uwe Schindler commented on LUCENE-10376: I will merge and backport the PR durin

[jira] [Commented] (LUCENE-10376) Roll up the loop in vint/vlong in DataInput

2022-01-17 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477249#comment-17477249 ] Uwe Schindler commented on LUCENE-10376: +1 > Roll up the loop in vint/vlong i

[jira] [Assigned] (LUCENE-10376) Roll up the loop in vint/vlong in DataInput

2022-01-17 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10376?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-10376: -- Assignee: Uwe Schindler > Roll up the loop in vint/vlong in DataInput >

[jira] [Commented] (LUCENE-10366) Reduce the number of valid checks for ByteBufferIndexInput#readVInt

2022-01-17 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477248#comment-17477248 ] Uwe Schindler commented on LUCENE-10366: In PR https://github.com/apache/lucene

[jira] [Assigned] (LUCENE-10366) Reduce the number of valid checks for ByteBufferIndexInput#readVInt

2022-01-17 Thread Uwe Schindler (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler reassigned LUCENE-10366: -- Assignee: Uwe Schindler > Reduce the number of valid checks for ByteBufferIndexInput

[GitHub] [lucene] uschindler commented on pull request #602: LUCENE-10376: Roll up the loop in vint/vlong in DataInput

2022-01-17 Thread GitBox
uschindler commented on pull request #602: URL: https://github.com/apache/lucene/pull/602#issuecomment-1014573645 I would also take this one.

[GitHub] [lucene] uschindler commented on pull request #592: LUCENE-10366: Override #readVInt and #readVLong for ByteBufferDataInput to avoid the abstraction confusion of #readByte.

2022-01-17 Thread GitBox
uschindler commented on pull request #592: URL: https://github.com/apache/lucene/pull/592#issuecomment-1014572990 Hi @jpountz, if you agree I'd merge this in and backport! Uwe

[GitHub] [lucene] gf2121 commented on a change in pull request #592: LUCENE-10366: Override #readVInt and #readVLong for ByteBufferDataInput to avoid the abstraction confusion of #readByte.

2022-01-17 Thread GitBox
gf2121 commented on a change in pull request #592: URL: https://github.com/apache/lucene/pull/592#discussion_r786025580 ## File path: lucene/CHANGES.txt ## @@ -126,6 +126,9 @@ Improvements Optimizations - +* LUCENE-10366: Override #readVInt and #readVLon

[GitHub] [lucene] uschindler commented on a change in pull request #592: LUCENE-10366: Override #readVInt and #readVLong for ByteBufferDataInput to avoid the abstraction confusion of #readByte.

2022-01-17 Thread GitBox
uschindler commented on a change in pull request #592: URL: https://github.com/apache/lucene/pull/592#discussion_r786023119 ## File path: lucene/CHANGES.txt ## @@ -126,6 +126,9 @@ Improvements Optimizations - +* LUCENE-10366: Override #readVInt and #read

[GitHub] [lucene] uschindler commented on a change in pull request #592: LUCENE-10366: Override #readVInt and #readVLong for ByteBufferDataInput to avoid the abstraction confusion of #readByte.

2022-01-17 Thread GitBox
uschindler commented on a change in pull request #592: URL: https://github.com/apache/lucene/pull/592#discussion_r786004689 ## File path: lucene/CHANGES.txt ## @@ -126,6 +126,9 @@ Improvements Optimizations - +* LUCENE-10366: Override #readVInt and #read

[GitHub] [lucene] uschindler commented on a change in pull request #592: LUCENE-10366: Override #readVInt and #readVLong for ByteBufferDataInput to avoid the abstraction confusion of #readByte.

2022-01-17 Thread GitBox
uschindler commented on a change in pull request #592: URL: https://github.com/apache/lucene/pull/592#discussion_r786003953 ## File path: lucene/CHANGES.txt ## @@ -126,6 +126,9 @@ Improvements Optimizations - +* LUCENE-10366: Override #readVInt and #read

[GitHub] [lucene] uschindler commented on pull request #592: LUCENE-10366: Reduce the number of valid checks for ByteBufferIndexInput#readVInt

2022-01-17 Thread GitBox
uschindler commented on pull request #592: URL: https://github.com/apache/lucene/pull/592#issuecomment-1014534594 OK, thanks, so let's keep the hack.

[GitHub] [lucene] gf2121 commented on pull request #592: LUCENE-10366: Reduce the number of valid checks for ByteBufferIndexInput#readVInt

2022-01-17 Thread GitBox
gf2121 commented on pull request #592: URL: https://github.com/apache/lucene/pull/592#issuecomment-1014529950 Here is the report of [making vint final](https://github.com/apache/lucene/compare/main...gf2121:make_vint_final?expand=1) (50 repeat * 20 JVM): ```

[jira] [Resolved] (LUCENE-10377) Replace 'sortPos' parameter in SortField.getComparator()

2022-01-17 Thread Alan Woodward (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alan Woodward resolved LUCENE-10377. Fix Version/s: 9.1 10.0 (main) Resolution: Fixed > Replace 'sor

[jira] [Commented] (LUCENE-10377) Replace 'sortPos' parameter in SortField.getComparator()

2022-01-17 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477142#comment-17477142 ] ASF subversion and git services commented on LUCENE-10377: -- Co

[GitHub] [lucene] uschindler commented on pull request #592: LUCENE-10366: Reduce the number of valid checks for ByteBufferIndexInput#readVInt

2022-01-17 Thread GitBox
uschindler commented on pull request #592: URL: https://github.com/apache/lucene/pull/592#issuecomment-1014394276 Although it is unrelated, could you make that one final for consistency? https://github.com/apache/lucene/blob/97d669bcdb5e302a4fd79c648d5f621d90a9ac3b/lucene/core/src/java/

[GitHub] [lucene] gf2121 commented on pull request #592: LUCENE-10366: Reduce the number of valid checks for ByteBufferIndexInput#readVInt

2022-01-17 Thread GitBox
gf2121 commented on pull request #592: URL: https://github.com/apache/lucene/pull/592#issuecomment-1014390791 Thanks @uschindler ! > I was also thinking about the following: Why not make readVInt/readVLong final in the DataInput class? This should be possible after the refactoring in

[jira] [Commented] (LUCENE-10377) Replace 'sortPos' parameter in SortField.getComparator()

2022-01-17 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477117#comment-17477117 ] ASF subversion and git services commented on LUCENE-10377: -- Co

[GitHub] [lucene] romseygeek merged pull request #603: LUCENE-10377: Replace 'sortPos' with 'enableSkipping' in SortField.getComparator()

2022-01-17 Thread GitBox
romseygeek merged pull request #603: URL: https://github.com/apache/lucene/pull/603

[GitHub] [lucene] uschindler commented on pull request #592: LUCENE-10366: Reduce the number of valid checks for ByteBufferIndexInput#readVInt

2022-01-17 Thread GitBox
uschindler commented on pull request #592: URL: https://github.com/apache/lucene/pull/592#issuecomment-1014376367 I was also thinking about the following: Why not make readVInt/readVLong final in the DataInput class? This should be possible after the refactoring in #602 ! We should

[GitHub] [lucene] uschindler commented on pull request #592: LUCENE-10366: Reduce the number of valid checks for ByteBufferIndexInput#readVInt

2022-01-17 Thread GitBox
uschindler commented on pull request #592: URL: https://github.com/apache/lucene/pull/592#issuecomment-1014373226 I will try out the same on `MemorySegmentIndexInput` (#518).

[GitHub] [lucene] iverase opened a new pull request #607: LUCENE-10288: Check BKD tree shape for lucene pre-8.6 1D indexes

2022-01-17 Thread GitBox
iverase opened a new pull request #607: URL: https://github.com/apache/lucene/pull/607 In LUCENE-9820 we changed the API for PointValues to expose methods to navigate the BKD tree. One important change is the ability to compute the number of points below any node of the tree. In order to c

[GitHub] [lucene] iverase commented on pull request #541: LUCENE-10315: Speed up BKD leaf block ids codec by a 512 ints ForUtil

2022-01-17 Thread GitBox
iverase commented on pull request #541: URL: https://github.com/apache/lucene/pull/541#issuecomment-1014284136 This is looking much better. I tried some of my benchmarks and for points it looks like: ``` |Approach||Index time (sec)||Force merge time (sec)||Index size (GB)||Reader

[jira] [Comment Edited] (LUCENE-10288) Are 1-dimensional kd trees in pre-86 indices always unbalanced trees?

2022-01-17 Thread Ignacio Vera (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477036#comment-17477036 ] Ignacio Vera edited comment on LUCENE-10288 at 1/17/22, 8:52 AM:

[jira] [Commented] (LUCENE-10288) Are 1-dimensional kd trees in pre-86 indices always unbalanced trees?

2022-01-17 Thread Ignacio Vera (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477036#comment-17477036 ] Ignacio Vera commented on LUCENE-10288: --- Yes, I think we can compute it cheaply b