[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-16 Thread GitBox
jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335431950
 
 

 ##
 File path: lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java
 ##
 @@ -253,6 +258,61 @@ public void testDocFiltering() throws Exception {
 iw.close();
   }
 
+  /**
+   * Test that the correct amount of documents are collected if using a collector that also rejects documents.
+   */
+  public void testCollectorThatRejects() throws Exception {
+// use synonym analyzer to have multiple paths to same suggested document. This mock adds "dog" as synonym for "dogs"
+Analyzer analyzer = new MockSynonymAnalyzer();
+RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field"));
+List expectedResults = new ArrayList();
+
+for (int docCount = 10; docCount > 0; docCount--) {
+  Document document = new Document();
+  String value = "ab" + docCount + " dogs";
+  document.add(new SuggestField("suggest_field", value, docCount));
+  expectedResults.add(new Entry(value, docCount));
+  iw.addDocument(document);
+}
+
+if (rarely()) {
+  iw.commit();
+}
+
+DirectoryReader reader = iw.getReader();
+SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader);
+
+PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new Term("suggest_field", "ab"));
+int topN = 5;
+
+// use a TopSuggestDocsCollector that rejects results with duplicate docIds
+TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, false) {
+
+  private Set seenDocIds = new HashSet<>();
+
+  @Override
+  public boolean collect(int docID, CharSequence key, CharSequence context, float score) throws IOException {
+  int globalDocId = docID + docBase;
+  boolean collected = false;
+  if (seenDocIds.contains(globalDocId) == false) {
+  super.collect(docID, key, context, score);
+  seenDocIds.add(globalDocId);
+  collected = true;
+  }
+  return collected;
+  }
+};
+
+indexSearcher.suggest(query, collector);
+assertSuggestions(collector.get(), expectedResults.subList(0, topN).toArray(new Entry[0]));
+
+// TODO expecting true here, why false?
 
 Review comment:
   I'll open an issue. I also wonder whether we should rely on the fact that the
top suggest collector also early terminates, so that whenever we expect rejections
(because of deleted docs or because we deduplicate on suggestions/doc) we could
set the queue size to its maximum value (5000). Currently we have different
heuristics that try to pick a sensible value automatically, but there is no
guarantee of admissibility. For instance, if we want to deduplicate by document
id we should ensure that the queue size is greater than
`topN*maxAnalyzedValuesPerDoc`, and we'd need to compute this value at index
time.
   I may be completely off, but it would be interesting to see the effects of
setting the queue size to its maximum value on all searches. This way the
admissibility is easier to reason about and we don't need to correlate it with
the choice made by the heuristic.
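
   To make the bound above concrete, here is a rough sketch. `maxAnalyzedValuesPerDoc` is hypothetical (nothing computes it at index time today), and the class and method names are made up for illustration; only the `topN*maxAnalyzedValuesPerDoc` product and the 5000 cap come from the comment above.

```java
public class QueueSizeSketch {
  // Maximum queue size quoted in the comment above.
  static final int MAX_QUEUE_SIZE = 5000;

  // Queue size needed to deduplicate by doc id, assuming every document
  // contributes at most maxAnalyzedValuesPerDoc entries (hypothetical statistic).
  static long requiredQueueSize(int topN, int maxAnalyzedValuesPerDoc) {
    return (long) topN * maxAnalyzedValuesPerDoc;
  }

  // True if the required size still fits under the cap, i.e. always using
  // the maximum queue size would be enough for this request.
  static boolean fitsInMaxQueue(int topN, int maxAnalyzedValuesPerDoc) {
    return requiredQueueSize(topN, maxAnalyzedValuesPerDoc) <= MAX_QUEUE_SIZE;
  }

  public static void main(String[] args) {
    System.out.println(requiredQueueSize(5, 2));
    System.out.println(fitsInMaxQueue(5, 2));
    System.out.println(fitsInMaxQueue(1000, 6));
  }
}
```

   With `topN = 5` and two analyzed values per document the bound is 10, far below the cap. With `topN = 1000` and six values per document it exceeds 5000, so even the maximum queue size would give no guarantee.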


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-16 Thread GitBox
jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r335426995
 
 

 ##
 File path: lucene/suggest/src/java/org/apache/lucene/search/suggest/document/TopSuggestDocs.java
 ##
 @@ -116,19 +133,29 @@ public TopSuggestDocs(TotalHits totalHits, SuggestScoreDoc[] scoreDocs) {
*/
   public static TopSuggestDocs merge(int topN, TopSuggestDocs[] shardHits) {
 SuggestScoreDocPriorityQueue priorityQueue = new SuggestScoreDocPriorityQueue(topN);
+boolean allComplete = true;
 for (TopSuggestDocs shardHit : shardHits) {
   for (SuggestScoreDoc scoreDoc : shardHit.scoreLookupDocs()) {
 if (scoreDoc == priorityQueue.insertWithOverflow(scoreDoc)) {
   break;
 }
   }
+  allComplete &= shardHit.isComplete;
 }
 SuggestScoreDoc[] topNResults = priorityQueue.getResults();
 if (topNResults.length > 0) {
-  return new TopSuggestDocs(new TotalHits(topNResults.length, TotalHits.Relation.EQUAL_TO), topNResults);
+  return new TopSuggestDocs(new TotalHits(topNResults.length, TotalHits.Relation.EQUAL_TO), topNResults,
+  allComplete);
 } else {
   return TopSuggestDocs.EMPTY;
 }
   }
 
+  /**
+   * Indicates if the list of results is complete or not. Might be false if the {@link TopNSearcher} rejected
+   * too many of the queued results.
 
 Review comment:
   The admissibility of the search is computed from the reject count, so a value
of `false` means that we exhausted all the paths but had to reject all of
them, so the topN is truncated. It's hard to follow the full logic, but it should
be ok as long as it is ok to return fewer than topN results when there are more
rejections than the queue size can handle?
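
   A minimal sketch of the kind of check this describes. It assumes the rule is `rejectCount + topN <= maxQueueDepth`; that is an assumption about `TopNSearcher`, not something shown in this diff, and the class and method names are illustrative.

```java
public class AdmissibilitySketch {
  // Assumed rule: the result list is complete only if the queue was deep
  // enough to absorb every rejection and still hold topN entries.
  static boolean isComplete(int rejectCount, int topN, int maxQueueDepth) {
    return (long) rejectCount + topN <= maxQueueDepth;
  }

  public static void main(String[] args) {
    System.out.println(isComplete(0, 5, 5));
    System.out.println(isComplete(1, 5, 5));
    System.out.println(isComplete(10, 5, 20));
  }
}
```

   Under this rule a single rejection is enough to mark the search as incomplete when the queue depth equals topN, which is why the queue has to be oversized whenever rejections are expected.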





[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-09 Thread GitBox
jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r330405030
 
 

 ##
 File path: lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java
 ##
 @@ -253,6 +258,61 @@ public void testDocFiltering() throws Exception {
 iw.close();
   }
 
+  /**
+   * Test that the correct amount of documents are collected if using a collector that also rejects documents.
+   */
+  public void testCollectorThatRejects() throws Exception {
+// use synonym analyzer to have multiple paths to same suggested document. This mock adds "dog" as synonym for "dogs"
+Analyzer analyzer = new MockSynonymAnalyzer();
+RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field"));
+List expectedResults = new ArrayList();
+
+for (int docCount = 10; docCount > 0; docCount--) {
+  Document document = new Document();
+  String value = "ab" + docCount + " dogs";
+  document.add(new SuggestField("suggest_field", value, docCount));
+  expectedResults.add(new Entry(value, docCount));
+  iw.addDocument(document);
+}
+
+if (rarely()) {
+  iw.commit();
+}
+
+DirectoryReader reader = iw.getReader();
+SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader);
+
+PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new Term("suggest_field", "ab"));
+int topN = 5;
+
+// use a TopSuggestDocsCollector that rejects results with duplicate docIds
+TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, false) {
+
+  private Set seenDocIds = new HashSet<>();
+
+  @Override
+  public boolean collect(int docID, CharSequence key, CharSequence context, float score) throws IOException {
+  int globalDocId = docID + docBase;
+  boolean collected = false;
+  if (seenDocIds.contains(globalDocId) == false) {
+  super.collect(docID, key, context, score);
+  seenDocIds.add(globalDocId);
+  collected = true;
+  }
+  return collected;
+  }
+};
+
+indexSearcher.suggest(query, collector);
+assertSuggestions(collector.get(), expectedResults.subList(0, topN).toArray(new Entry[0]));
+
+// TODO expecting true here, why false?
 
 Review comment:
   It seems that NRTSuggesterBuilder#maxAnalyzedPathsPerOutput is not computed
correctly. From what I understand, it records the number of suggestions with the
same analyzed form, but the comment says that it should be the highest number of
analyzed paths we saw for any input surface form. So imo this is a bug; it's
not exactly related to this change, so we should probably open a new issue for
this.





[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-09 Thread GitBox
jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r332875335
 
 

 ##
 File path: lucene/suggest/src/java/org/apache/lucene/search/suggest/document/TopSuggestDocs.java
 ##
 @@ -116,19 +123,28 @@ public TopSuggestDocs(TotalHits totalHits, SuggestScoreDoc[] scoreDocs) {
*/
   public static TopSuggestDocs merge(int topN, TopSuggestDocs[] shardHits) {
 SuggestScoreDocPriorityQueue priorityQueue = new SuggestScoreDocPriorityQueue(topN);
+boolean allComplete = true;
 for (TopSuggestDocs shardHit : shardHits) {
   for (SuggestScoreDoc scoreDoc : shardHit.scoreLookupDocs()) {
 if (scoreDoc == priorityQueue.insertWithOverflow(scoreDoc)) {
   break;
 }
   }
+  allComplete = allComplete && shardHit.isComplete;
 }
 SuggestScoreDoc[] topNResults = priorityQueue.getResults();
 if (topNResults.length > 0) {
-  return new TopSuggestDocs(new TotalHits(topNResults.length, TotalHits.Relation.EQUAL_TO), topNResults);
+  return new TopSuggestDocs(new TotalHits(topNResults.length, TotalHits.Relation.EQUAL_TO), topNResults,
+  allComplete);
 } else {
   return TopSuggestDocs.EMPTY;
 }
   }
 
+  /**
+   * returns true if we exhausted all possibilities to collect results
 
 Review comment:
   nit: the flag indicates that the list of results is complete, but we don't
need to exhaust all possibilities to achieve this. Maybe something like:
`indicate if the list of results is complete or not, this might be
false if the {@link TopNSearcher} rejected too many results.`?





[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-09 Thread GitBox
jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r332873041
 
 

 ##
 File path: lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggester.java
 ##
 @@ -283,6 +299,7 @@ public int compare(Pair o1, Pair o2) {
* 
* If a filter is applied, the queue size is increased by
* half the number of live documents.
+   *
 
 Review comment:
   nit: restore the formatting





[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-09 Thread GitBox
jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r332872725
 
 

 ##
 File path: lucene/suggest/src/java/org/apache/lucene/search/suggest/document/TopSuggestDocsCollector.java
 ##
 @@ -201,4 +205,27 @@ public void collect(int doc) throws IOException {
   public ScoreMode scoreMode() {
 return ScoreMode.COMPLETE;
   }
+
+  /**
+   * returns true if the collector clearly exhausted all possibilities to collect results
+   */
+  boolean isComplete() {
 
 Review comment:
   Is this still needed now that we provide the information in `TopSuggestDocs`?





[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-09 Thread GitBox
jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r332873923
 
 

 ##
 File path: lucene/suggest/src/java/org/apache/lucene/search/suggest/document/TopSuggestDocs.java
 ##
 @@ -116,19 +123,28 @@ public TopSuggestDocs(TotalHits totalHits, SuggestScoreDoc[] scoreDocs) {
*/
   public static TopSuggestDocs merge(int topN, TopSuggestDocs[] shardHits) {
 SuggestScoreDocPriorityQueue priorityQueue = new SuggestScoreDocPriorityQueue(topN);
+boolean allComplete = true;
 for (TopSuggestDocs shardHit : shardHits) {
   for (SuggestScoreDoc scoreDoc : shardHit.scoreLookupDocs()) {
 if (scoreDoc == priorityQueue.insertWithOverflow(scoreDoc)) {
   break;
 }
   }
+  allComplete = allComplete && shardHit.isComplete;
 
 Review comment:
   nit: you can do `allComplete &= shardHit.isComplete` ?
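
   A small standalone illustration of the two forms (the names are illustrative, not the PR's code). For booleans, `a &= b` gives the same result as `a = a && b`; the only difference is that `&=` always evaluates the right-hand side, which is harmless for a plain field read.

```java
public class AllCompleteSketch {
  // Folds per-shard completeness flags using the compound assignment.
  static boolean allCompleteCompound(boolean[] shardComplete) {
    boolean allComplete = true;
    for (boolean complete : shardComplete) {
      allComplete &= complete;
    }
    return allComplete;
  }

  // Same fold written with the short-circuit form used in the diff.
  static boolean allCompleteShortCircuit(boolean[] shardComplete) {
    boolean allComplete = true;
    for (boolean complete : shardComplete) {
      allComplete = allComplete && complete;
    }
    return allComplete;
  }

  public static void main(String[] args) {
    boolean[] shards = {true, false, true};
    System.out.println(allCompleteCompound(shards));
    System.out.println(allCompleteShortCircuit(shards));
  }
}
```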





[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-02 Thread GitBox
jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r330667847
 
 

 ##
 File path: lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java
 ##
 @@ -253,6 +263,126 @@ public void testDocFiltering() throws Exception {
 iw.close();
   }
 
+  /**
+   * Test that the correct amount of documents are collected if using a collector that also rejects documents.
+   */
+  public void testCollectorThatRejects() throws Exception {
+// use synonym analyzer to have multiple paths to same suggested document. This mock adds "dog" as synonym for "dogs"
+Analyzer analyzer = new MockSynonymAnalyzer();
+RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field"));
+List expectedResults = new ArrayList();
+
+for (int docCount = 10; docCount > 0; docCount--) {
+  Document document = new Document();
+  String value = "ab" + docCount + " dogs";
+  document.add(new SuggestField("suggest_field", value, docCount));
+  expectedResults.add(new Entry(value, docCount));
+  iw.addDocument(document);
+}
+
+if (rarely()) {
+  iw.commit();
+}
+
+DirectoryReader reader = iw.getReader();
+SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader);
+
+PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new Term("suggest_field", "ab"));
+int topN = 5;
+
+// use a TopSuggestDocsCollector that rejects results with duplicate docIds
+TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, false) {
+
+  private Set seenDocIds = new HashSet<>();
+
+  @Override
+  public boolean collect(int docID, CharSequence key, CharSequence context, float score) throws IOException {
+  int globalDocId = docID + docBase;
+  boolean collected = false;
+  if (seenDocIds.contains(globalDocId) == false) {
+  super.collect(docID, key, context, score);
+  seenDocIds.add(globalDocId);
+  collected = true;
+  }
+  return collected;
+  }
+
+  @Override
+  protected boolean canReject() {
+return true;
+  }
+};
+
+indexSearcher.suggest(query, collector);
+TopSuggestDocs suggestions = collector.get();
+assertSuggestions(suggestions, expectedResults.subList(0, topN).toArray(new Entry[0]));
+assertTrue(suggestions.isComplete());
+
+reader.close();
+iw.close();
+  }
+
+  /**
+   * A large scale tests where the collector rejects based on docIds
+   */
+  public void testCollectorWithManyRejects() throws Exception {
+Analyzer analyzer = new MockAnalyzer(random());
+RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field"));
+Set acceptedDocs = new HashSet<>();
+List expectedResults = new ArrayList();
+
+for (int docCount = 0; docCount < 1; docCount++) {
+  Document document = new Document();
+  String value = "ab" + RandomStrings.randomAsciiAlphanumOfLength(random(), 10) +"_" + docCount;
+  document.add(new SuggestField("suggest_field", value, docCount));
+  if (random().nextDouble() > 0.75) {
 
 Review comment:
   The maximum queue size is `5000`, so we should make sure that we don't reject
more than this number of results if we want the search to be complete. If you
change the live docs to contain at least `5000` docs, this test should work
fine.





[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-02 Thread GitBox
jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r330659818
 
 

 ##
 File path: lucene/suggest/src/java/org/apache/lucene/search/suggest/document/NRTSuggester.java
 ##
 @@ -283,17 +299,25 @@ public int compare(Pair o1, Pair o2) {
* 
* If a filter is applied, the queue size is increased by
* half the number of live documents.
+   *
+   * If the collector can reject documents upon collecting, the queue size is
+   * increased by half the number of live documents again.
+   *
* 
* The maximum queue size is {@link #MAX_TOP_N_QUEUE_SIZE}
*/
-  private int getMaxTopNSearcherQueueSize(int topN, int numDocs, double liveDocsRatio, boolean filterEnabled) {
+  private int getMaxTopNSearcherQueueSize(int topN, int numDocs, double liveDocsRatio, boolean filterEnabled,
 
 Review comment:
   I am not sure we need to differentiate between the case where there is a filter
and the case where the collector can reject. It's the same thing: we don't know the
number of rejections beforehand, so just adding `(numDocs/2)` once should be enough.
Maybe we can just merge the two booleans and apply the heuristic once?
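
   Roughly what the merged version could look like, as a sketch only. The `numDocs/2` increment and the 5000 cap come from the javadoc quoted above; the initial `topN * maxAnalyzedPathsPerOutput` term, the division by `liveDocsRatio`, the extra parameters and the class name are assumptions about the surrounding method and not taken from this diff.

```java
public class MergedHeuristicSketch {
  // Cap referred to as MAX_TOP_N_QUEUE_SIZE in the javadoc above.
  static final int MAX_TOP_N_QUEUE_SIZE = 5000;

  // Applies the "half the number of live docs" increment once, whether the
  // rejections come from a filter or from the collector itself.
  // liveDocsRatio is assumed to be in (0, 1].
  static int maxQueueSize(int topN, int maxAnalyzedPathsPerOutput, int numDocs,
                          double liveDocsRatio, boolean filterEnabled, boolean collectorCanReject) {
    long size = (long) topN * maxAnalyzedPathsPerOutput;
    size = (long) (size / liveDocsRatio);
    if (filterEnabled || collectorCanReject) {
      size += numDocs / 2;
    }
    return (int) Math.min(MAX_TOP_N_QUEUE_SIZE, size);
  }

  public static void main(String[] args) {
    System.out.println(maxQueueSize(5, 1, 100, 1.0, true, true));
    System.out.println(maxQueueSize(5, 1, 100, 1.0, false, false));
    System.out.println(maxQueueSize(5, 1, 100000, 1.0, false, true));
  }
}
```

   With both flags set the increment is still applied only once, so 100 docs and `topN = 5` give 55 rather than 105.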





[GitHub] [lucene-solr] jimczi commented on a change in pull request #913: LUCENE-8995: TopSuggestDocsCollector#collect should be able to signal rejection

2019-10-02 Thread GitBox
jimczi commented on a change in pull request #913: LUCENE-8995: 
TopSuggestDocsCollector#collect should be able to signal rejection
URL: https://github.com/apache/lucene-solr/pull/913#discussion_r330405030
 
 

 ##
 File path: lucene/suggest/src/test/org/apache/lucene/search/suggest/document/TestPrefixCompletionQuery.java
 ##
 @@ -253,6 +258,61 @@ public void testDocFiltering() throws Exception {
 iw.close();
   }
 
+  /**
+   * Test that the correct amount of documents are collected if using a collector that also rejects documents.
+   */
+  public void testCollectorThatRejects() throws Exception {
+// use synonym analyzer to have multiple paths to same suggested document. This mock adds "dog" as synonym for "dogs"
+Analyzer analyzer = new MockSynonymAnalyzer();
+RandomIndexWriter iw = new RandomIndexWriter(random(), dir, iwcWithSuggestField(analyzer, "suggest_field"));
+List expectedResults = new ArrayList();
+
+for (int docCount = 10; docCount > 0; docCount--) {
+  Document document = new Document();
+  String value = "ab" + docCount + " dogs";
+  document.add(new SuggestField("suggest_field", value, docCount));
+  expectedResults.add(new Entry(value, docCount));
+  iw.addDocument(document);
+}
+
+if (rarely()) {
+  iw.commit();
+}
+
+DirectoryReader reader = iw.getReader();
+SuggestIndexSearcher indexSearcher = new SuggestIndexSearcher(reader);
+
+PrefixCompletionQuery query = new PrefixCompletionQuery(analyzer, new Term("suggest_field", "ab"));
+int topN = 5;
+
+// use a TopSuggestDocsCollector that rejects results with duplicate docIds
+TopSuggestDocsCollector collector = new TopSuggestDocsCollector(topN, false) {
+
+  private Set seenDocIds = new HashSet<>();
+
+  @Override
+  public boolean collect(int docID, CharSequence key, CharSequence context, float score) throws IOException {
+  int globalDocId = docID + docBase;
+  boolean collected = false;
+  if (seenDocIds.contains(globalDocId) == false) {
+  super.collect(docID, key, context, score);
+  seenDocIds.add(globalDocId);
+  collected = true;
+  }
+  return collected;
+  }
+};
+
+indexSearcher.suggest(query, collector);
+assertSuggestions(collector.get(), expectedResults.subList(0, topN).toArray(new Entry[0]));
+
+// TODO expecting true here, why false?
 
 Review comment:
   It seems that NRTSuggesterBuilder#maxAnalyzedPathsPerOutput is not computed
correctly. From what I understand, it records the number of suggestions with the
same analyzed form, but the comment says that it should be the highest number of
analyzed paths we saw for any input surface form. So imo this is a bug; it's
not exactly related to this change, so we should probably open a new issue for this.

