[GitHub] [lucene] navneet1v commented on a diff in pull request #1017: LUCENE-10654: Add new ShapeDocValuesField for LatLonShape and XYShape

2022-07-12 Thread GitBox
navneet1v commented on code in PR #1017: URL: https://github.com/apache/lucene/pull/1017#discussion_r919668826 ## lucene/core/src/java/org/apache/lucene/document/ShapeDocValuesField.java: ## @@ -0,0 +1,844 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or

[GitHub] [lucene] zacharymorn commented on pull request #1018: LUCENE-10480: Use BulkScorer to limit BMMScorer to only top-level disjunctions

2022-07-12 Thread GitBox
zacharymorn commented on PR #1018: URL: https://github.com/apache/lucene/pull/1018#issuecomment-1182774748 Benchmark results with `wikinightly.tasks` boolean queries below: ``` Task QPS baseline StdDev QPS my_modified_version StdDev

[jira] [Comment Edited] (LUCENE-10480) Specialize 2-clauses disjunctions

2022-07-12 Thread Zach Chen (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566149#comment-17566149 ] Zach Chen edited comment on LUCENE-10480 at 7/13/22 5:09 AM: - {quote}I

[jira] [Commented] (LUCENE-10480) Specialize 2-clauses disjunctions

2022-07-12 Thread Zach Chen (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566149#comment-17566149 ] Zach Chen commented on LUCENE-10480: {quote}I wouldn't say blocker, but maybe we could give us time

[GitHub] [lucene] zacharymorn opened a new pull request, #1018: LUCENE-10480: Use BulkScorer to limit BMMScorer to only top-level disjunctions

2022-07-12 Thread GitBox
zacharymorn opened a new pull request, #1018: URL: https://github.com/apache/lucene/pull/1018 ### Description (or a Jira issue link if you have one) Use BulkScorer to limit BMMScorer to only top-level disjunctions Note: Tests update pending -- This is an automated message

[GitHub] [lucene] msokolov commented on pull request #947: LUCENE-10577: enable quantization of HNSW vectors to 8 bits

2022-07-12 Thread GitBox
msokolov commented on PR #947: URL: https://github.com/apache/lucene/pull/947#issuecomment-1182694202 OK, this last round of commits moves the new vector encoding parameter out of IndexableField and FieldInfo into Codec constructor and internally to the codec, in FieldEntry. It certainly

[GitHub] [lucene-jira-archive] mocobeta closed issue #38: StackOverflowException on certain issue descriptions and comment text

2022-07-12 Thread GitBox
mocobeta closed issue #38: StackOverflowException on certain issue descriptions and comment text URL: https://github.com/apache/lucene-jira-archive/issues/38

[GitHub] [lucene-jira-archive] mocobeta merged pull request #39: Stack overflows can occur when parsing Jira lists

2022-07-12 Thread GitBox
mocobeta merged PR #39: URL: https://github.com/apache/lucene-jira-archive/pull/39

[GitHub] [lucene] gsmiller merged pull request #1010: Specialize ordinal encoding for SortedSetDocValues

2022-07-12 Thread GitBox
gsmiller merged PR #1010: URL: https://github.com/apache/lucene/pull/1010

[jira] [Updated] (LUCENE-10654) New companion doc value format for LatLonShape and XYShape field types

2022-07-12 Thread Nick Knize (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Knize updated LUCENE-10654: Fix Version/s: 9.3 > New companion doc value format for LatLonShape and XYShape field types >

[jira] [Commented] (LUCENE-10649) Failure in TestDemoParallelLeafReader.testRandomMultipleSchemaGensSameField

2022-07-12 Thread Vigya Sharma (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566071#comment-17566071 ] Vigya Sharma commented on LUCENE-10649: --- Great, thanks for confirming Adrien. I'll open a PR with

[GitHub] [lucene] nknize opened a new pull request, #1017: LUCENE-10654: Add new ShapeDocValuesField for LatLonShape and XYShape

2022-07-12 Thread GitBox
nknize opened a new pull request, #1017: URL: https://github.com/apache/lucene/pull/1017 Adds new doc value field to support LatLonShape and XYShape doc values. The implementation is inspired by ComponentTree. A binary tree of tessellated components (point, line, or triangle) is

[GitHub] [lucene] Yuti-G commented on a diff in pull request #1013: LUCENE-10644: Facets#getAllChildren testing should ignore child order

2022-07-12 Thread GitBox
Yuti-G commented on code in PR #1013: URL: https://github.com/apache/lucene/pull/1013#discussion_r919502708 ## lucene/facet/src/test/org/apache/lucene/facet/FacetTestCase.java: ## @@ -264,4 +264,24 @@ protected void assertFloatValuesEquals(FacetResult a, FacetResult b) {

[jira] [Updated] (LUCENE-10654) New companion doc value format for LatLonShape and XYShape field types

2022-07-12 Thread Nick Knize (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Knize updated LUCENE-10654: Description: {{XYDocValuesField}} provides doc value support for {{XYPoint}}.

[jira] [Created] (LUCENE-10654) New companion doc value format for LatLonShape and XYShape field types

2022-07-12 Thread Nick Knize (Jira)
Nick Knize created LUCENE-10654: --- Summary: New companion doc value format for LatLonShape and XYShape field types Key: LUCENE-10654 URL: https://issues.apache.org/jira/browse/LUCENE-10654 Project:

[jira] [Commented] (LUCENE-10471) Increase the number of dims for KNN vectors to 2048

2022-07-12 Thread Mayya Sharipova (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566026#comment-17566026 ] Mayya Sharipova commented on LUCENE-10471: -- [~sstolpovskiy]  [~sokolov] Thanks for providing

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-07-12 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17566015#comment-17566015 ] Michael Sokolov commented on LUCENE-10577: -- OK, that makes sense to me – I'll see about moving

[GitHub] [lucene] jpountz commented on a diff in pull request #987: LUCENE-10627: Using CompositeByteBuf to Reduce Memory Copy

2022-07-12 Thread GitBox
jpountz commented on code in PR #987: URL: https://github.com/apache/lucene/pull/987#discussion_r918752313 ## lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java: ## @@ -257,9 +270,13 @@ private static class DeflateCompressor extends Compressor {

[GitHub] [lucene-jira-archive] mikemccand commented on pull request #33: Polish wording of Legacy Jira details header, and each comment footer

2022-07-12 Thread GitBox
mikemccand commented on PR #33: URL: https://github.com/apache/lucene-jira-archive/pull/33#issuecomment-1181660019 Sorry -- not pushed to the PR yet -- struggling w/ git ;)

[jira] [Commented] (LUCENE-10619) Optimize the writeBytes in TermsHashPerField

2022-07-12 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565873#comment-17565873 ] ASF subversion and git services commented on LUCENE-10619: -- Commit

[jira] [Resolved] (LUCENE-10619) Optimize the writeBytes in TermsHashPerField

2022-07-12 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-10619. --- Fix Version/s: 9.3 Resolution: Fixed > Optimize the writeBytes in TermsHashPerField

[GitHub] [lucene-jira-archive] mocobeta commented on pull request #39: Stack overflows can occur when parsing Jira lists

2022-07-12 Thread GitBox
mocobeta commented on PR #39: URL: https://github.com/apache/lucene-jira-archive/pull/39#issuecomment-1181804695 Thank you @mikemccand

[GitHub] [lucene-jira-archive] mikemccand commented on pull request #33: Polish wording of Legacy Jira details header, and each comment footer

2022-07-12 Thread GitBox
mikemccand commented on PR #33: URL: https://github.com/apache/lucene-jira-archive/pull/33#issuecomment-1181662032 OK don't merge this -- I somehow messed up and slurped in unrelated (already previously committed/pushed) changes. I have to drop off for now but will try to fix this a bit

[GitHub] [lucene-jira-archive] mocobeta commented on issue #38: StackOverflowException on certain issue descriptions and comment text

2022-07-12 Thread GitBox
mocobeta commented on issue #38: URL: https://github.com/apache/lucene-jira-archive/issues/38#issuecomment-1181803770 I'll merge it once I confirmed it parses all Jira without any errors. (I think nobody can review the quick and dirty fix...)

[GitHub] [lucene] tang-hi commented on pull request #966: LUCENE-10619: Optimize the writeBytes in TermsHashPerField

2022-07-12 Thread GitBox
tang-hi commented on PR #966: URL: https://github.com/apache/lucene/pull/966#issuecomment-1181886902 @jpountz thanks for the suggestion. I have changed testWriteBytes to write small chunks each time

[GitHub] [lucene-jira-archive] mikemccand commented on a diff in pull request #39: Stack overflows can occur when parsing Jira lists

2022-07-12 Thread GitBox
mikemccand commented on code in PR #39: URL: https://github.com/apache/lucene-jira-archive/pull/39#discussion_r919015037 ## migration/src/markup/lists.py: ## @@ -40,6 +40,11 @@ def action(self, tokens: ParseResults) -> str: for line in tokens: #

[GitHub] [lucene-jira-archive] mikemccand commented on pull request #33: Polish wording of Legacy Jira details header, and each comment footer

2022-07-12 Thread GitBox
mikemccand commented on PR #33: URL: https://github.com/apache/lucene-jira-archive/pull/33#issuecomment-1181657754 I pushed a small change to make a best-effort when we hit exceptions from the converter. Such comments look like this:

[jira] [Commented] (LUCENE-10619) Optimize the writeBytes in TermsHashPerField

2022-07-12 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565872#comment-17565872 ] ASF subversion and git services commented on LUCENE-10619: -- Commit

[GitHub] [lucene-jira-archive] mikemccand opened a new issue, #38: StackOverflowException on certain issue descriptions and comment text

2022-07-12 Thread GitBox
mikemccand opened a new issue, #38: URL: https://github.com/apache/lucene-jira-archive/issues/38 Spinoff from #33. Some issues' text hit a stack overflow exception, e.g. one of the comments on LUCENE-550: ``` (.venv) beast3:migration[polish_legacy_jira]$ python

[GitHub] [lucene-jira-archive] mikemccand commented on issue #38: StackOverflowException on certain issue descriptions and comment text

2022-07-12 Thread GitBox
mikemccand commented on issue #38: URL: https://github.com/apache/lucene-jira-archive/issues/38#issuecomment-1181596940 Note that it is pretty rare -- when I ran the full conversion, I saw four separate occurrences. Might not be so important to track down? We can just carry over the raw

[GitHub] [lucene] jpountz merged pull request #966: LUCENE-10619: Optimize the writeBytes in TermsHashPerField

2022-07-12 Thread GitBox
jpountz merged PR #966: URL: https://github.com/apache/lucene/pull/966

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-07-12 Thread Julie Tibshirani (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565871#comment-17565871 ] Julie Tibshirani commented on LUCENE-10577: --- I checked out the latest PR changes, and I like

[jira] [Comment Edited] (LUCENE-10577) Quantize vector values

2022-07-12 Thread Julie Tibshirani (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565919#comment-17565919 ] Julie Tibshirani edited comment on LUCENE-10577 at 7/12/22 4:23 PM:

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-07-12 Thread Julie Tibshirani (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565919#comment-17565919 ] Julie Tibshirani commented on LUCENE-10577: --- I wasn't suggesting making it entirely an

[jira] [Commented] (LUCENE-10628) Enable MatchingFacetSetCounts to use space partitioning data structures

2022-07-12 Thread Marc D'Mello (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565944#comment-17565944 ] Marc D'Mello commented on LUCENE-10628: --- Thanks for taking a look! As for the answer to your

[jira] [Commented] (LUCENE-10650) "after_effect": "no" was removed what replaces it?

2022-07-12 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565384#comment-17565384 ] Adrien Grand commented on LUCENE-10650: --- {{query.boost}} is the {{query.getBoost()}} from

[GitHub] [lucene] jpountz commented on pull request #987: LUCENE-10627: Using CompositeByteBuf to Reduce Memory Copy

2022-07-12 Thread GitBox
jpountz commented on PR #987: URL: https://github.com/apache/lucene/pull/987#issuecomment-1181718918 > if we only using compress method with variants ByteBuffersDataInput in LUCENE90, we can not using abstract method Compressor.compress, when we want to use other compression mode. I

[GitHub] [lucene-jira-archive] mikemccand opened a new issue, #37: Why are some Jira issues completely missing?

2022-07-12 Thread GitBox
mikemccand opened a new issue, #37: URL: https://github.com/apache/lucene-jira-archive/issues/37 Spinoff from #33. This is not a blocker for migration, more because I'm curious how Jira lost issues and how pervasive this problem might be -- maybe other Apache projects are affected?

[GitHub] [lucene-jira-archive] mikemccand commented on pull request #33: Polish wording of Legacy Jira details header, and each comment footer

2022-07-12 Thread GitBox
mikemccand commented on PR #33: URL: https://github.com/apache/lucene-jira-archive/pull/33#issuecomment-1181586767 And thank you for the quick fix!

[GitHub] [lucene-jira-archive] mikemccand commented on pull request #33: Polish wording of Legacy Jira details header, and each comment footer

2022-07-12 Thread GitBox
mikemccand commented on PR #33: URL: https://github.com/apache/lucene-jira-archive/pull/33#issuecomment-1181589626 > It looks like a bug introduced in [cfbc821](https://github.com/apache/lucene-jira-archive/commit/cfbc821390859a7053e43028325b6bc616ec2b5b). (I have postponed testing it

[GitHub] [lucene-jira-archive] mocobeta commented on issue #36: Can we parallelize the converter script?

2022-07-12 Thread GitBox
mocobeta commented on issue #36: URL: https://github.com/apache/lucene-jira-archive/issues/36#issuecomment-1181522090 https://docs.python.org/3/howto/logging-cookbook.html#logging-to-a-single-file-from-multiple-processes

[GitHub] [lucene-jira-archive] mikemccand commented on pull request #33: Polish wording of Legacy Jira details header, and each comment footer

2022-07-12 Thread GitBox
mikemccand commented on PR #33: URL: https://github.com/apache/lucene-jira-archive/pull/33#issuecomment-1181586644 > Sorry there should have been a "catch all" try~except clause. I made a quick fix in #35. No worries at all! No need to apologize!

[jira] [Commented] (LUCENE-10577) Quantize vector values

2022-07-12 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565914#comment-17565914 ] Michael Sokolov commented on LUCENE-10577: -- It would be nice if we could make this encoding an

[GitHub] [lucene-jira-archive] mikemccand commented on pull request #33: Polish wording of Legacy Jira details header, and each comment footer

2022-07-12 Thread GitBox
mikemccand commented on PR #33: URL: https://github.com/apache/lucene-jira-archive/pull/33#issuecomment-1181587514 > I'm also converting the whole Jira issue myself; it looks like it takes several hours... (recent changes to fix conversion errors could affect the conversion speed I

[GitHub] [lucene-jira-archive] mikemccand merged pull request #40: #27: polish the legacy Jira text added to the issue a bit

2022-07-12 Thread GitBox
mikemccand merged PR #40: URL: https://github.com/apache/lucene-jira-archive/pull/40

[GitHub] [lucene] jpountz commented on pull request #907: LUCENE-10357 Ghost fields and postings/points

2022-07-12 Thread GitBox
jpountz commented on PR #907: URL: https://github.com/apache/lucene/pull/907#issuecomment-1181518177 @shahrs87 Can you look into removing all other instances of `terms == Terms.EMPTY` or `terms != Terms.EMPTY` as well? To do this while keeping tests passing, I think you'll need to create

[GitHub] [lucene-jira-archive] mikemccand commented on issue #38: StackOverflowException on certain issue descriptions and comment text

2022-07-12 Thread GitBox
mikemccand commented on issue #38: URL: https://github.com/apache/lucene-jira-archive/issues/38#issuecomment-1181644356 > I'm trying to find other ways that do not cause infinite recursion while parsing lists correctly. Awesome, thanks @mocobeta!

[GitHub] [lucene] mayya-sharipova commented on a diff in pull request #992: LUCENE-10592 Build HNSW Graph on indexing

2022-07-12 Thread GitBox
mayya-sharipova commented on code in PR #992: URL: https://github.com/apache/lucene/pull/992#discussion_r919332844 ## lucene/core/src/java/org/apache/lucene/codecs/perfield/PerFieldKnnVectorsFormat.java: ## @@ -102,9 +104,22 @@ private class FieldsWriter extends

[GitHub] [lucene] mayya-sharipova commented on pull request #992: LUCENE-10592 Build HNSW Graph on indexing

2022-07-12 Thread GitBox
mayya-sharipova commented on PR #992: URL: https://github.com/apache/lucene/pull/992#issuecomment-1182388563 @jtibshirani @jpountz Thanks for your review. I've tried to address your comments, but it looks like we are still not clear how to organize `merge` and `flush` methods. Would be

[jira] [Commented] (LUCENE-10653) Should BlockMaxMaxscoreScorer rebuild its heap in bulk?

2022-07-12 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10653?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565380#comment-17565380 ] Adrien Grand commented on LUCENE-10653: --- +1 to doing a bulk heapify The fact that this scorer

[GitHub] [lucene] mayya-sharipova commented on a diff in pull request #992: LUCENE-10592 Build HNSW Graph on indexing

2022-07-12 Thread GitBox
mayya-sharipova commented on code in PR #992: URL: https://github.com/apache/lucene/pull/992#discussion_r919349095 ## lucene/core/src/java/org/apache/lucene/index/VectorValuesWriter.java: ## @@ -26,233 +26,153 @@ import org.apache.lucene.codecs.KnnVectorsWriter; import

[GitHub] [lucene-jira-archive] mocobeta commented on issue #38: StackOverflowException on certain issue descriptions and comment text

2022-07-12 Thread GitBox
mocobeta commented on issue #38: URL: https://github.com/apache/lucene-jira-archive/issues/38#issuecomment-1181776008 I opened #39. I cannot really explain _why the ad-hoc fix works_ but it works. I think there should be a better way though, it would be sufficient for the one-time batch.

[GitHub] [lucene] mayya-sharipova commented on a diff in pull request #992: LUCENE-10592 Build HNSW Graph on indexing

2022-07-12 Thread GitBox
mayya-sharipova commented on code in PR #992: URL: https://github.com/apache/lucene/pull/992#discussion_r919343914 ## lucene/core/src/java/org/apache/lucene/codecs/lucene93/Lucene93HnswVectorsWriter.java: ## @@ -266,65 +470,128 @@ private void writeMeta( } } -

[GitHub] [lucene] mayya-sharipova commented on a diff in pull request #992: LUCENE-10592 Build HNSW Graph on indexing

2022-07-12 Thread GitBox
mayya-sharipova commented on code in PR #992: URL: https://github.com/apache/lucene/pull/992#discussion_r919288022 ## lucene/core/src/java/org/apache/lucene/codecs/KnnVectorsWriter.java: ## @@ -24,28 +24,40 @@ import org.apache.lucene.index.DocIDMerger; import

[GitHub] [lucene] luyuncheng commented on a diff in pull request #987: LUCENE-10627: Using CompositeByteBuf to Reduce Memory Copy

2022-07-12 Thread GitBox
luyuncheng commented on code in PR #987: URL: https://github.com/apache/lucene/pull/987#discussion_r918848057 ## lucene/core/src/java/org/apache/lucene/codecs/compressing/CompressionMode.java: ## @@ -257,9 +270,13 @@ private static class DeflateCompressor extends Compressor {

[jira] [Commented] (LUCENE-10471) Increase the number of dims for KNN vectors to 2048

2022-07-12 Thread Michael Sokolov (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565468#comment-17565468 ] Michael Sokolov commented on LUCENE-10471: -- We should not be imposing an arbitrary limit that

[GitHub] [lucene] luyuncheng commented on pull request #987: LUCENE-10627: Using CompositeByteBuf to Reduce Memory Copy

2022-07-12 Thread GitBox
luyuncheng commented on PR #987: URL: https://github.com/apache/lucene/pull/987#issuecomment-1181632413 > Would it be possible to remove all `CompressionMode#compress` variants that take a `byte[]` now that you introduced a new method that takes a `ByteBuffersDataInput`? > > Also

[jira] [Commented] (LUCENE-10649) Failure in TestDemoParallelLeafReader.testRandomMultipleSchemaGensSameField

2022-07-12 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565885#comment-17565885 ] Adrien Grand commented on LUCENE-10649: --- Good catch [~vigyas], it looks related indeed. The bug

[GitHub] [lucene-jira-archive] mocobeta opened a new pull request, #39: Fix stack overflow when parsing lists

2022-07-12 Thread GitBox
mocobeta opened a new pull request, #39: URL: https://github.com/apache/lucene-jira-archive/pull/39 Close #38 This ad-hoc patch fixes `'maximum recursion depth exceeded'` error, and also makes the script a bit faster. (8h -> 5h)

[jira] [Commented] (LUCENE-10628) Enable MatchingFacetSetCounts to use space partitioning data structures

2022-07-12 Thread Ignacio Vera (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565882#comment-17565882 ] Ignacio Vera commented on LUCENE-10628: --- I have mainly worked with two types of trees in Lucene.

[GitHub] [lucene-jira-archive] mocobeta commented on pull request #33: Polish wording of Legacy Jira details header, and each comment footer

2022-07-12 Thread GitBox
mocobeta commented on PR #33: URL: https://github.com/apache/lucene-jira-archive/pull/33#issuecomment-1181624324 > Thanks -- I was beginning to wonder if it was normal how long it was taking ;) Of course it's not normal; I remember it took two or three hours to convert the whole

[jira] [Commented] (LUCENE-10603) Improve iteration of ords for SortedSetDocValues

2022-07-12 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565402#comment-17565402 ] Adrien Grand commented on LUCENE-10603: --- +1 > Improve iteration of ords for SortedSetDocValues >

[GitHub] [lucene-jira-archive] mikemccand closed pull request #33: Polish wording of Legacy Jira details header, and each comment footer

2022-07-12 Thread GitBox
mikemccand closed pull request #33: Polish wording of Legacy Jira details header, and each comment footer URL: https://github.com/apache/lucene-jira-archive/pull/33

[GitHub] [lucene-jira-archive] mikemccand opened a new pull request, #40: #27: polish the legacy Jira text added to the issue a bit

2022-07-12 Thread GitBox
mikemccand opened a new pull request, #40: URL: https://github.com/apache/lucene-jira-archive/pull/40 I "rebooted" my PR by downloading the diff off the messed up #33 PR, futzing it locally, applying, resolving conflicts. Messy messy. I'll try to more carefully manage the git merging

[GitHub] [lucene-jira-archive] mikemccand commented on pull request #33: Polish wording of Legacy Jira details header, and each comment footer

2022-07-12 Thread GitBox
mikemccand commented on PR #33: URL: https://github.com/apache/lucene-jira-archive/pull/33#issuecomment-1181821682 I'm closing this messed up PR -- I rebooted it into #40.

[GitHub] [lucene] jpountz commented on a diff in pull request #1003: LUCENE-10616: optimizing decompress when only retrieving some fields

2022-07-12 Thread GitBox
jpountz commented on code in PR #1003: URL: https://github.com/apache/lucene/pull/1003#discussion_r918758391 ## lucene/core/src/java/org/apache/lucene/codecs/compressing/Decompressor.java: ## @@ -42,6 +44,13 @@ protected Decompressor() {} public abstract void decompress(

[GitHub] [lucene] jpountz commented on a diff in pull request #966: LUCENE-10619: Optimize the writeBytes in TermsHashPerField

2022-07-12 Thread GitBox
jpountz commented on code in PR #966: URL: https://github.com/apache/lucene/pull/966#discussion_r918804129 ## lucene/core/src/java/org/apache/lucene/index/TermsHashPerField.java: ## @@ -230,9 +230,29 @@ final void writeByte(int stream, byte b) { } final void

[GitHub] [lucene-jira-archive] mocobeta commented on issue #38: StackOverflowException on certain issue descriptions and comment text

2022-07-12 Thread GitBox
mocobeta commented on issue #38: URL: https://github.com/apache/lucene-jira-archive/issues/38#issuecomment-1181605666 Thank you for opening this. While the stack overflow is rare, this recursion in parsing also causes a significant slowdown in conversion. I'm sure the root cause

[jira] [Updated] (LUCENE-10600) SortedSetDocValues#docValueCount should be an int, not long

2022-07-12 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand updated LUCENE-10600: -- Fix Version/s: 9.3 > SortedSetDocValues#docValueCount should be an int, not long >

[jira] [Commented] (LUCENE-10480) Specialize 2-clauses disjunctions

2022-07-12 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17565375#comment-17565375 ] Adrien Grand commented on LUCENE-10480: --- +1 to explore this in a separate issue. bq. Do you

[GitHub] [lucene-jira-archive] mocobeta commented on issue #36: Can we parallelize the converter script?

2022-07-12 Thread GitBox
mocobeta commented on issue #36: URL: https://github.com/apache/lucene-jira-archive/issues/36#issuecomment-1181497062 I found https://pypi.org/project/multiprocessing-logging/, but this works only on Linux.

[GitHub] [lucene-jira-archive] mocobeta commented on pull request #33: Polish wording of Legacy Jira details header, and each comment footer

2022-07-12 Thread GitBox
mocobeta commented on PR #33: URL: https://github.com/apache/lucene-jira-archive/pull/33#issuecomment-1181456586 I'm also converting the whole Jira issue myself; it looks like it takes several hours... (recent changes to fix conversion errors could affect the conversion speed I think).

[GitHub] [lucene-jira-archive] mocobeta opened a new issue, #36: Can we parallelize the converter script?

2022-07-12 Thread GitBox
mocobeta opened a new issue, #36: URL: https://github.com/apache/lucene-jira-archive/issues/36 `jira2markdown_imprt.py` is single-threaded and it takes several hours to convert all Jira issues. I think it'd be easy to parallelize this with

[GitHub] [lucene] stefanvodita commented on a diff in pull request #1015: [LUCENE-10629]: Add fast match query support to FacetSets

2022-07-12 Thread GitBox
stefanvodita commented on code in PR #1015: URL: https://github.com/apache/lucene/pull/1015#discussion_r918597529 ## lucene/facet/src/java/org/apache/lucene/facet/facetset/MatchingFacetSetsCounts.java: ## @@ -52,8 +52,10 @@ public MatchingFacetSetsCounts( String field,