Hi all,

I am trying to use Apache Lucene to split text into n-grams.

    Reader reader = new StringReader("This is a test string");
    NGramTokenizer gramTokenizer = new NGramTokenizer(reader, 1, 3);
    CharTermAttribute charTermAttribute = gramTokenizer.addAttribute(CharTermAttribute.class);
    gramTokenizer.reset();
    while (gramTokenizer.incrementToken()) {
        String token = charTermAttribute.toString();
        System.out.println(token);
    }
    gramTokenizer.end();
    gramTokenizer.close();

This is the code I used, but it returns n-grams character by character. I want it to return word-level terms such as "this", "test", "string", "this test", etc.

===================

I also tried ShingleFilter, but it throws a NullPointerException:

    Reader reader = new StringReader("This is a test string");
    TokenStream tokenizer = new StandardTokenizer(Version.LUCENE_41, reader);
    tokenizer = new ShingleFilter(tokenizer, 2, 3);
    CharTermAttribute charTermAttribute = tokenizer.addAttribute(CharTermAttribute.class);
    while (tokenizer.incrementToken()) {
        String token = charTermAttribute.toString();
        System.out.println(token);
    }

Please guide.

Thanks
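
To make the desired output concrete, here is a minimal plain-Java sketch of word-level n-grams (what Lucene calls shingles). It does not use Lucene at all; the class name WordNGrams and the method wordNGrams are hypothetical names chosen for illustration, and splitting on whitespace is an assumption that ignores punctuation and lowercasing.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative helper, not part of Lucene.
class WordNGrams {

    // Returns every run of minSize..maxSize consecutive words,
    // ordered by the position of the first word in the run.
    static List<String> wordNGrams(String text, int minSize, int maxSize) {
        List<String> result = new ArrayList<>();
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return result;
        }
        // Assumption: words are separated by whitespace only.
        String[] words = trimmed.split("\\s+");
        for (int start = 0; start < words.length; start++) {
            StringBuilder gram = new StringBuilder();
            for (int size = 1; size <= maxSize && start + size <= words.length; size++) {
                if (size > 1) {
                    gram.append(' ');
                }
                gram.append(words[start + size - 1]);
                if (size >= minSize) {
                    result.add(gram.toString());
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints single words and two-word combinations, one per line.
        for (String gram : wordNGrams("This is a test string", 1, 2)) {
            System.out.println(gram);
        }
    }
}
```

With sizes 1 to 2 this yields terms such as "This", "This is", "test" and "test string", which is the kind of output being asked for, as opposed to the character-level grams produced by NGramTokenizer.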