[GitHub] [lucene] zacharymorn commented on pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-23 Thread GitBox
zacharymorn commented on PR #972: URL: https://github.com/apache/lucene/pull/972#issuecomment-1165218655 Thanks @jpountz for the suggestion and also providing the bulk scorer implementation! The result looks pretty impressive as well! I just tried `taskRepeatCount=200` with my

[GitHub] [lucene] zacharymorn commented on pull request #968: [LUCENE-10624] Binary Search for Sparse IndexedDISI advanceWithinBloc…

2022-06-23 Thread GitBox
zacharymorn commented on PR #968: URL: https://github.com/apache/lucene/pull/968#issuecomment-1165214328 Hmm I see. I'm actually also wondering if it will be possible to have one of them simply delegate to the other one (potentially indirectly via some helper method), and then check the

[jira] [Updated] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-23 Thread Weiming Wu (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Weiming Wu updated LUCENE-10624: Attachment: candiate-exponential-searchsparse-sorted.0.log > Binary Search for Sparse

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-23 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558318#comment-17558318 ] Tomoko Uchida commented on LUCENE-10557: Seems converting Jira "table" markup to Markdown is

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-23 Thread Michael McCandless (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558263#comment-17558263 ] Michael McCandless commented on LUCENE-10557: - {quote}I'm still not fully sure if we

[jira] [Comment Edited] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-23 Thread Weiming Wu (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558247#comment-17558247 ] Weiming Wu edited comment on LUCENE-10624 at 6/23/22 10:46 PM: --- Hi

[jira] [Commented] (LUCENE-10624) Binary Search for Sparse IndexedDISI advanceWithinBlock & advanceExactWithinBlock

2022-06-23 Thread Weiming Wu (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558247#comment-17558247 ] Weiming Wu commented on LUCENE-10624: - Hi Adrien. Thanks for your comments!   For the reason of

[GitHub] [lucene] shahrs87 commented on a diff in pull request #907: LUCENE-10357 Ghost fields and postings/points

2022-06-23 Thread GitBox
shahrs87 commented on code in PR #907: URL: https://github.com/apache/lucene/pull/907#discussion_r905561688 ## lucene/core/src/java/org/apache/lucene/index/FrozenBufferedUpdates.java: ## @@ -595,7 +595,7 @@ private void setField(String field) throws IOException {

[GitHub] [lucene] shahrs87 commented on a diff in pull request #907: LUCENE-10357 Ghost fields and postings/points

2022-06-23 Thread GitBox
shahrs87 commented on code in PR #907: URL: https://github.com/apache/lucene/pull/907#discussion_r905458119 ## lucene/core/src/java/org/apache/lucene/index/CheckIndex.java: ## @@ -1378,7 +1378,7 @@ private static Status.TermIndexStatus checkFields( computedFieldCount++;

[GitHub] [lucene] kaivalnp commented on pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-23 Thread GitBox
kaivalnp commented on PR #951: URL: https://github.com/apache/lucene/pull/951#issuecomment-1164860104 Yes, I saw similar improvement for `BitSet` backed queries as the numbers [here](https://github.com/apache/lucene/pull/932) -- This is an automated message from the Apache Git Service.

[GitHub] [lucene] donnerpeter commented on pull request #975: LUCENE-10626 Hunspell: add tools to aid dictionary editing

2022-06-23 Thread GitBox
donnerpeter commented on PR #975: URL: https://github.com/apache/lucene/pull/975#issuecomment-1164800989 Reviewing commits separately might be easier

[GitHub] [lucene] donnerpeter opened a new pull request, #975: LUCENE-10626 Hunspell: add tools to aid dictionary editing

2022-06-23 Thread GitBox
donnerpeter opened a new pull request, #975: URL: https://github.com/apache/lucene/pull/975 https://issues.apache.org/jira/browse/LUCENE-10626

[jira] [Created] (LUCENE-10626) Hunspell: add tools to aid dictionary editing: analysis introspection, stem expansion and stem/flag suggestion

2022-06-23 Thread Peter Gromov (Jira)
Peter Gromov created LUCENE-10626: - Summary: Hunspell: add tools to aid dictionary editing: analysis introspection, stem expansion and stem/flag suggestion Key: LUCENE-10626 URL:

[GitHub] [lucene] mdmarshmallow commented on pull request #841: LUCENE-10274: Add hyperrectangle faceting capabilities

2022-06-23 Thread GitBox
mdmarshmallow commented on PR #841: URL: https://github.com/apache/lucene/pull/841#issuecomment-1164640793 Yeah, I think this change should be completely compatible with 9.30. Most of our changes are isolated to the new `facetset` package and all other changes are just adding some

[GitHub] [lucene] jtibshirani commented on pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-23 Thread GitBox
jtibshirani commented on PR #951: URL: https://github.com/apache/lucene/pull/951#issuecomment-1164628757 The latest approach looks good to me. Are you still seeing a significant latency improvement in some cases?

[GitHub] [lucene] jtibshirani commented on a diff in pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-23 Thread GitBox
jtibshirani commented on code in PR #951: URL: https://github.com/apache/lucene/pull/951#discussion_r905236130 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -92,20 +91,40 @@ public KnnVectorQuery(String field, float[] target, int k, Query filter)

[jira] [Resolved] (LUCENE-10620) Can we pass the Weight to Collector?

2022-06-23 Thread Adrien Grand (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adrien Grand resolved LUCENE-10620. --- Fix Version/s: 9.3 Resolution: Fixed > Can we pass the Weight to Collector? >

[jira] [Commented] (LUCENE-10620) Can we pass the Weight to Collector?

2022-06-23 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558151#comment-17558151 ] ASF subversion and git services commented on LUCENE-10620: -- Commit

[GitHub] [lucene] jpountz merged pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-23 Thread GitBox
jpountz merged PR #964: URL: https://github.com/apache/lucene/pull/964

[GitHub] [lucene] jpountz commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-23 Thread GitBox
jpountz commented on PR #964: URL: https://github.com/apache/lucene/pull/964#issuecomment-1164586245 Thanks for taking the time to think about it @gsmiller, appreciated!

[jira] [Commented] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-23 Thread Alessandro Benedetti (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558141#comment-17558141 ] Alessandro Benedetti commented on LUCENE-10593: --- Recent performance tests in the Pull

[jira] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-23 Thread Alessandro Benedetti (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10593 ] Alessandro Benedetti deleted comment on LUCENE-10593: --- was (Author: alessandro.benedetti): Hi @msokolov @mayya-sharipova and @jtibshirani , I have finally finished my performance

[GitHub] [lucene] alessandrobenedetti commented on pull request #926: VectorSimilarityFunction reverse removal

2022-06-23 Thread GitBox
alessandrobenedetti commented on PR #926: URL: https://github.com/apache/lucene/pull/926#issuecomment-1164571753 @msokolov your input has been invaluable! I ran the tests on the same machine, with the preprocessed files, and now the results are different. The main and this branch

[jira] [Commented] (LUCENE-9580) Tessellator failure for a certain polygon

2022-06-23 Thread Hugo Mercier (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-9580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558127#comment-17558127 ] Hugo Mercier commented on LUCENE-9580: -- I've encountered the same issue on Elasticsearch 7.17

[jira] [Comment Edited] (LUCENE-10396) Automatically create sparse indexes for sort fields

2022-06-23 Thread Ignacio Vera (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558103#comment-17558103 ] Ignacio Vera edited comment on LUCENE-10396 at 6/23/22 3:01 PM: I have

[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #926: VectorSimilarityFunction reverse removal

2022-06-23 Thread GitBox
alessandrobenedetti commented on code in PR #926: URL: https://github.com/apache/lucene/pull/926#discussion_r905082182 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java: ## @@ -246,7 +246,7 @@ private boolean diversityCheck( for (int i = 0; i <

[GitHub] [lucene] kaivalnp commented on pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-23 Thread GitBox
kaivalnp commented on PR #951: URL: https://github.com/apache/lucene/pull/951#issuecomment-1164455602 Thank you! I have added this approach to the latest commit, and a suggestion to incorporate deletes above

[GitHub] [lucene] kaivalnp commented on a diff in pull request #951: LUCENE-10606: Optimize Prefilter Hit Collection

2022-06-23 Thread GitBox
kaivalnp commented on code in PR #951: URL: https://github.com/apache/lucene/pull/951#discussion_r905065952 ## lucene/core/src/java/org/apache/lucene/search/KnnVectorQuery.java: ## @@ -92,20 +91,40 @@ public KnnVectorQuery(String field, float[] target, int k, Query filter) {

[jira] [Commented] (LUCENE-10396) Automatically create sparse indexes for sort fields

2022-06-23 Thread Ignacio Vera (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10396?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558103#comment-17558103 ] Ignacio Vera commented on LUCENE-10396: --- I have been thinking about the ability of visiting a single

[GitHub] [lucene] jpountz commented on pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-23 Thread GitBox
jpountz commented on PR #972: URL: https://github.com/apache/lucene/pull/972#issuecomment-1164436120 @zacharymorn FYI I played with a slightly different approach that implements BMM as a bulk scorer instead of a scorer, which I was hoping would help with making bookkeeping more

[GitHub] [lucene] alessandrobenedetti commented on a diff in pull request #926: VectorSimilarityFunction reverse removal

2022-06-23 Thread GitBox
alessandrobenedetti commented on code in PR #926: URL: https://github.com/apache/lucene/pull/926#discussion_r905042674 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java: ## @@ -246,7 +246,7 @@ private boolean diversityCheck( for (int i = 0; i <

[GitHub] [lucene] msokolov commented on a diff in pull request #926: VectorSimilarityFunction reverse removal

2022-06-23 Thread GitBox
msokolov commented on code in PR #926: URL: https://github.com/apache/lucene/pull/926#discussion_r905035144 ## lucene/core/src/java/org/apache/lucene/util/hnsw/HnswGraphBuilder.java: ## @@ -246,7 +246,7 @@ private boolean diversityCheck( for (int i = 0; i <

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-23 Thread Tomoko Uchida (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558097#comment-17558097 ] Tomoko Uchida commented on LUCENE-10557: I'm still not fully sure if we can/should Jira

[GitHub] [lucene] msokolov commented on pull request #926: VectorSimilarityFunction reverse removal

2022-06-23 Thread GitBox
msokolov commented on PR #926: URL: https://github.com/apache/lucene/pull/926#issuecomment-1164418508 Hi Alessandro, thank you for running the tests. I'm suspicious of the results though -- they just look too good to be true! I know from profiling that we spend most of the time in

[jira] [Comment Edited] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-23 Thread Alessandro Benedetti (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558093#comment-17558093 ] Alessandro Benedetti edited comment on LUCENE-10593 at 6/23/22 1:34 PM:

[jira] [Commented] (LUCENE-10593) VectorSimilarityFunction reverse removal

2022-06-23 Thread Alessandro Benedetti (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558093#comment-17558093 ] Alessandro Benedetti commented on LUCENE-10593: --- Hi @msokolov @mayya-sharipova and

[GitHub] [lucene] gsmiller commented on pull request #964: LUCENE-10620: Pass the Weight to Collectors.

2022-06-23 Thread GitBox
gsmiller commented on PR #964: URL: https://github.com/apache/lucene/pull/964#issuecomment-1164336774 @jpountz ah right. No, I don’t think it makes sense for users to have to deal with creating weights on their own (and having to consider query rewriting as well before doing so). Your

[jira] [Commented] (LUCENE-10557) Migrate to GitHub issue from Jira

2022-06-23 Thread Michael McCandless (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17558029#comment-17558029 ] Michael McCandless commented on LUCENE-10557: - Finally catching up over here! *Thank you*

[GitHub] [lucene] jpountz commented on pull request #972: LUCENE-10480: Use BMM scorer for 2 clauses disjunction

2022-06-23 Thread GitBox
jpountz commented on PR #972: URL: https://github.com/apache/lucene/pull/972#issuecomment-1164220705 The fact that queries perform slower in general in your first benchmark run makes me wonder if this could be due to insufficient warmup time. The default task repeat count of 20 might be