uschindler commented on PR #15900:
URL: https://github.com/apache/lucene/pull/15900#issuecomment-4175906875

   Here is the test for the above code:
   
   ```java
     public void testTruncating() throws Exception {
       // Note: also truncates one more char when the final char would otherwise be half of a
       // surrogate pair
       TokenStream stream =
           whitespaceMockTokenizer(
               "abcdefg 1234567 ABCDEFG abcde abc 12345 123 1234😃5 😃12345 😃😃 "
                   + "😃😃😃 😃😃😃😃 😃😃😃😃😃 😃😃😃😃😃😃");
       stream = new TruncateTokenFilter(stream, 5);
       assertTokenStreamContents(
           stream,
           new String[] {
             "abcde",
             "12345",
             "ABCDE",
             "abcde",
             "abc",
             "12345",
             "123",
             "1234😃",
             "😃1234",
             "😃😃",
             "😃😃😃",
             "😃😃😃😃",
             "😃😃😃😃😃",
             "😃😃😃😃😃"
           });
     }
   ```
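
   The expected tokens in this test are consistent with a limit of 5 counted in Unicode code points rather than UTF-16 chars (for example `1234😃5` becomes `1234😃`). As a rough illustration only, and assuming that is the intended semantics, the same cut can be sketched with plain JDK `String` methods. The class `CodePointTruncateSketch` and its `truncate` method are hypothetical names for this example; this is not the `TruncateTokenFilter` implementation.

   ```java
   public final class CodePointTruncateSketch {

     // Hypothetical helper: keeps at most maxCodePoints code points of term,
     // so a surrogate pair is never split in half.
     public static String truncate(String term, int maxCodePoints) {
       if (term.codePointCount(0, term.length()) <= maxCodePoints) {
         return term; // already short enough, nothing to cut
       }
       // Index of the UTF-16 char that follows the last code point we keep.
       int end = term.offsetByCodePoints(0, maxCodePoints);
       return term.substring(0, end);
     }

     public static void main(String[] args) {
       // "\uD83D\uDE03" is the surrogate pair for U+1F603 (the emoji used in the test).
       System.out.println(truncate("1234\uD83D\uDE035", 5));
     }
   }
   ```

   Counting code points up front keeps the common case (short terms) free of any copying; a filter working on a char buffer would more likely walk the buffer once instead of creating strings.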


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

