[GitHub] [lucene] mocobeta commented on pull request #811: Add some basic tasks to help/workflow

2022-04-15 Thread GitBox
mocobeta commented on PR #811: URL: https://github.com/apache/lucene/pull/811#issuecomment-1100574008 > I think we can still do some more to help new contributors but I have no specific action items in my head. I'll create separate JIRAs/PRs if I come up with something. Sounds great,

[GitHub] [lucene] zhaih commented on pull request #813: Backport LUCENE-10482 Allow users to create their own DirectoryTaxonomyReaders…

2022-04-15 Thread GitBox
zhaih commented on PR #813: URL: https://github.com/apache/lucene/pull/813#issuecomment-1100483369 Thank you @gautamworah96 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

[jira] [Commented] (LUCENE-10482) Allow users to create their own DirectoryTaxonomyReaders with empty taxoArrays instead of letting the taxoEpoch decide

2022-04-15 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522988#comment-17522988 ] ASF subversion and git services commented on LUCENE-10482: -- Co

[GitHub] [lucene] zhaih merged pull request #813: Backport LUCENE-10482 Allow users to create their own DirectoryTaxonomyReaders…

2022-04-15 Thread GitBox
zhaih merged PR #813: URL: https://github.com/apache/lucene/pull/813

[jira] [Closed] (LUCENE-7863) Don't repeat postings (and perhaps positions) on ReverseWF, EdgeNGram, etc

2022-04-15 Thread Mikhail Khludnev (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev closed LUCENE-7863. Lucene Fields: (was: New) > Don't repeat postings (and perhaps positions) on ReverseWF,

[jira] [Resolved] (LUCENE-7863) Don't repeat postings (and perhaps positions) on ReverseWF, EdgeNGram, etc

2022-04-15 Thread Mikhail Khludnev (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-7863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Khludnev resolved LUCENE-7863. -- Resolution: Won't Fix > Don't repeat postings (and perhaps positions) on ReverseWF, Ed

[GitHub] [lucene] gautamworah96 opened a new pull request, #813: Backport LUCENE-10482 Allow users to create their own DirectoryTaxonomyReaders…

2022-04-15 Thread GitBox
gautamworah96 opened a new pull request, #813: URL: https://github.com/apache/lucene/pull/813 Backport of PR: https://github.com/apache/lucene/pull/762, commit: 10ebc099c846c7d96f4ff5f9b7853df850fa8442 for branch_9x. Changes entry is under 9.2 in both the earlier PR and this PR

[GitHub] [lucene] gautamworah96 commented on pull request #811: Add some basic tasks to help/workflow

2022-04-15 Thread GitBox
gautamworah96 commented on PR #811: URL: https://github.com/apache/lucene/pull/811#issuecomment-1100391643 Hmm. I had not looked at the `help/tests.txt` file. +1 on not adding all the comprehensive options to this workflow file. LGTM overall. I think we can still do some more to h

[GitHub] [lucene] mayya-sharipova commented on pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-04-15 Thread GitBox
mayya-sharipova commented on PR #792: URL: https://github.com/apache/lucene/pull/792#issuecomment-1100389761 @LuXugang Thanks a lot for your work. I was thinking maybe a better way to present these changes is to leave all format changes to a later PR, and for this PR just to make changes

[GitHub] [lucene] mayya-sharipova commented on a diff in pull request #792: LUCENE-10502: Use IndexedDISI to store docIds and DirectMonotonicWriter/Reader to handle ordToDoc

2022-04-15 Thread GitBox
mayya-sharipova commented on code in PR #792: URL: https://github.com/apache/lucene/pull/792#discussion_r851509289 ## lucene/core/src/java/org/apache/lucene/codecs/lucene92/Lucene92HnswVectorsWriter.java: ## @@ -0,0 +1,328 @@ +/* + * Licensed to the Apache Software Foundation (A

[jira] [Commented] (LUCENE-10482) Allow users to create their own DirectoryTaxonomyReaders with empty taxoArrays instead of letting the taxoEpoch decide

2022-04-15 Thread ASF subversion and git services (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522911#comment-17522911 ] ASF subversion and git services commented on LUCENE-10482: -- Co

[GitHub] [lucene] zhaih commented on pull request #762: LUCENE-10482 Allow users to create their own DirectoryTaxonomyReaders with empty taxoArrays instead of letting the taxoEpoch decide

2022-04-15 Thread GitBox
zhaih commented on PR #762: URL: https://github.com/apache/lucene/pull/762#issuecomment-1100258275 Pushed, could you also open a backport PR? @gautamworah96

[GitHub] [lucene] zhaih merged pull request #762: LUCENE-10482 Allow users to create their own DirectoryTaxonomyReaders with empty taxoArrays instead of letting the taxoEpoch decide

2022-04-15 Thread GitBox
zhaih merged PR #762: URL: https://github.com/apache/lucene/pull/762

[GitHub] [lucene] mocobeta commented on pull request #805: LUCENE-10493: factor out Viterbi algorithm and share it between kuromoji and nori

2022-04-15 Thread GitBox
mocobeta commented on PR #805: URL: https://github.com/apache/lucene/pull/805#issuecomment-1100187804 Hi Robert and Mike, thank you for your response. I think this can be kept open for a sufficient time period - it is unlikely that large conflicts will arise between these changes and the main bra

[GitHub] [lucene] ChrisHegarty commented on a diff in pull request #812: LUCENE-10517: Improve performance of SortedSetDV faceting by iterating on class types

2022-04-15 Thread GitBox
ChrisHegarty commented on code in PR #812: URL: https://github.com/apache/lucene/pull/812#discussion_r851301618 ## lucene/facet/src/java/org/apache/lucene/facet/taxonomy/FastTaxonomyFacetCounts.java: ## @@ -126,23 +126,41 @@ private void countAll(IndexReader reader) throws IOEx

[jira] [Updated] (LUCENE-10517) Improve performance of SortedSetDV faceting by iterating on class types

2022-04-15 Thread Chris Hegarty (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Hegarty updated LUCENE-10517: --- Description: While analysing various profiles, [@grcevski|https://github.com/grcevski] and

[jira] [Updated] (LUCENE-10517) Improve performance of SortedSetDV faceting by iterating on class types

2022-04-15 Thread Chris Hegarty (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Hegarty updated LUCENE-10517: --- Issue Type: Improvement (was: Bug) > Improve performance of SortedSetDV faceting by iterat

[jira] [Comment Edited] (LUCENE-10517) Improve performance of SortedSetDV faceting by iterating on class types

2022-04-15 Thread Chris Hegarty (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522838#comment-17522838 ] Chris Hegarty edited comment on LUCENE-10517 at 4/15/22 2:21 PM:

[GitHub] [lucene] mikemccand commented on pull request #805: LUCENE-10493: factor out Viterbi algorithm and share it between kuromoji and nori

2022-04-15 Thread GitBox
mikemccand commented on PR #805: URL: https://github.com/apache/lucene/pull/805#issuecomment-1100135300 Whoa, this sounds awesome! I will try to review soon. Thanks @mocobeta.

[GitHub] [lucene] ChrisHegarty commented on pull request #812: LUCENE-10517: Improve performance of SortedSetDV faceting by iterating on class types

2022-04-15 Thread GitBox
ChrisHegarty commented on PR #812: URL: https://github.com/apache/lucene/pull/812#issuecomment-1100132616 The perf improvements come from changing the target type of the `nextDoc` invocations - which results in an invokevirtual rather than an invokeinterface. The changes in this PR proposed

[GitHub] [lucene] ChrisHegarty commented on pull request #812: LUCENE-10517: Improve performance of SortedSetDV faceting by iterating on class types

2022-04-15 Thread GitBox
ChrisHegarty commented on PR #812: URL: https://github.com/apache/lucene/pull/812#issuecomment-1100130595 I added some luceneutil benchmark output in the JIRA issue, but while the results are positive, someone more familiar with running these benchmarks should verify them in their own environment.

[GitHub] [lucene] ChrisHegarty opened a new pull request, #812: LUCENE-10517: Improve performance of SortedSetDV faceting by iterating on class types

2022-04-15 Thread GitBox
ChrisHegarty opened a new pull request, #812: URL: https://github.com/apache/lucene/pull/812 # Description SortedSetDV faceting (and friends) can improve performance within tight loops by using invokevirtual (rather than invokeinterface). The C2 JIT compiler can produce slightly mor

[jira] [Commented] (LUCENE-10517) Improve performance of SortedSetDV faceting by iterating on class types

2022-04-15 Thread Chris Hegarty (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522842#comment-17522842 ] Chris Hegarty commented on LUCENE-10517: While the two sets of results above sh

[jira] [Commented] (LUCENE-10517) Improve performance of SortedSetDV faceting by iterating on class types

2022-04-15 Thread Chris Hegarty (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522839#comment-17522839 ] Chris Hegarty commented on LUCENE-10517: [~grcevski] observes the following on

[jira] [Commented] (LUCENE-10517) Improve performance of SortedSetDV faceting by iterating on class types

2022-04-15 Thread Chris Hegarty (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522838#comment-17522838 ] Chris Hegarty commented on LUCENE-10517: On my M1 I get the following luceneutil

[jira] [Created] (LUCENE-10517) Improve performance of SortedSetDV faceting by iterating on class types

2022-04-15 Thread Chris Hegarty (Jira)
Chris Hegarty created LUCENE-10517: -- Summary: Improve performance of SortedSetDV faceting by iterating on class types Key: LUCENE-10517 URL: https://issues.apache.org/jira/browse/LUCENE-10517 Project

[GitHub] [lucene] rmuir commented on pull request #805: LUCENE-10493: factor out Viterbi algorithm and share it between kuromoji and nori

2022-04-15 Thread GitBox
rmuir commented on PR #805: URL: https://github.com/apache/lucene/pull/805#issuecomment-1100064836 I think @mikemccand actually created most of this code and is most familiar with it. Mike, if you have time can you look too? For the special n-best class, is the issue that `nori` simpl

[GitHub] [lucene] rmuir commented on pull request #805: LUCENE-10493: factor out Viterbi algorithm and share it between kuromoji and nori

2022-04-15 Thread GitBox
rmuir commented on PR #805: URL: https://github.com/apache/lucene/pull/805#issuecomment-1100062662 Sure, sorry for the slow response.

[GitHub] [lucene] mocobeta commented on pull request #811: Add some basic tasks to help/workflow

2022-04-15 Thread GitBox
mocobeta commented on PR #811: URL: https://github.com/apache/lucene/pull/811#issuecomment-1100018613 @gautamworah96 thanks for your comments. > I sometimes use the -Ptests.iters= param for beasting out multiple runs of a single test to catch random edge cases that I might have misse

[GitHub] [lucene] mocobeta commented on a diff in pull request #811: Add some basic tasks to help/workflow

2022-04-15 Thread GitBox
mocobeta commented on code in PR #811: URL: https://github.com/apache/lucene/pull/811#discussion_r851184763 ## help/workflow.txt: ## @@ -25,11 +25,22 @@ Assemble a single module's JAR (here for lucene-core): gradlew -p lucene/core assemble ls lucene/core/build/libs +Assemble

[GitHub] [lucene] mocobeta commented on a diff in pull request #811: Add some basic tasks to help/workflow

2022-04-15 Thread GitBox
mocobeta commented on code in PR #811: URL: https://github.com/apache/lucene/pull/811#discussion_r851181043 ## help/workflow.txt: ## @@ -25,11 +25,22 @@ Assemble a single module's JAR (here for lucene-core): gradlew -p lucene/core assemble ls lucene/core/build/libs +Assemble

[jira] [Resolved] (LUCENE-10448) MergeRateLimiter doesn't always limit instant rate.

2022-04-15 Thread kkewwei (Jira)
[ https://issues.apache.org/jira/browse/LUCENE-10448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kkewwei resolved LUCENE-10448. -- Resolution: Not A Problem > MergeRateLimiter doesn't always limit instant rate. > ---