zhaih commented on a change in pull request #157:
URL: https://github.com/apache/lucene/pull/157#discussion_r642116290



##########
File path: lucene/analysis/common/src/java/org/apache/lucene/analysis/core/FlattenGraphFilter.java
##########
@@ -362,6 +378,40 @@ public boolean incrementToken() throws IOException {
     }
   }
 
+  private OutputNode recoverFromHole(InputNode src, int startOffset) {
+    // This means the "from" node of this token was never seen as a "to" node,
+    // which should only happen if we just crossed a hole.  This is a challenging
+    // case for us because we normally rely on the full dependencies expressed
+    // by the arcs to assign outgoing node IDs.  It would be better if tokens
+    // were never dropped but instead just marked deleted with a new
+    // TermDeletedAttribute (boolean valued) ... but until that future, we have
+    // a hack here to forcefully jump the output node ID:
+    assert src.outputNode == -1;
+    src.node = inputFrom;
+
+    int maxOutIndex = outputNodes.getMaxPos();
+    OutputNode outSrc = outputNodes.get(maxOutIndex);
+    // There are two types of holes, neighbor holes and consumed holes. A neighbor hole is between

Review comment:
   Could you add an example explaining these two types of holes?
   It seems to me that a consumed hole looks like
   ```
   abc: [0,3]
   b: [1,2] => c: [2,3]
   ``` 
   right? But I don't really understand what a neighbor hole looks like...
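
   For reference, here is a minimal sketch of the condition the code comment describes: a token whose "from" node was never seen as a "to" node. It is plain Java with no Lucene dependency; the `Token` class and the `findHoleTokens` helper are hypothetical names made up for illustration, not part of `FlattenGraphFilter`. It assumes the usual convention that the first token with a position increment of 1 starts at node 0, and it does not try to distinguish neighbor holes from consumed holes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HoleSketch {
  /** Hypothetical stand-in for a token: term, position increment, position length. */
  static final class Token {
    final String term;
    final int posInc;
    final int posLen;

    Token(String term, int posInc, int posLen) {
      this.term = term;
      this.posInc = posInc;
      this.posLen = posLen;
    }
  }

  /** Returns the terms of tokens whose "from" node was never seen as a "to" node. */
  static List<String> findHoleTokens(List<Token> tokens) {
    Set<Integer> seenTo = new HashSet<>();
    seenTo.add(0); // treat the start node of the graph as already reachable
    List<String> afterHole = new ArrayList<>();
    int from = -1; // so that a first token with posInc=1 starts at node 0
    for (Token t : tokens) {
      from += t.posInc;
      int to = from + t.posLen;
      if (!seenTo.contains(from)) {
        // No earlier arc ends at this node, so a hole was crossed to get here.
        afterHole.add(t.term);
      }
      seenTo.add(to);
    }
    return afterHole;
  }

  public static void main(String[] args) {
    // "quick the fox" after a stop filter drops "the": "fox" arrives with posInc=2.
    List<Token> tokens = List.of(new Token("quick", 1, 1), new Token("fox", 2, 1));
    System.out.println(findHoleTokens(tokens)); // prints [fox]
  }
}
```

   In this input, "quick" spans nodes 0 to 1 and "fox" spans nodes 2 to 3, so node 2 is never the "to" node of any earlier arc.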



