I realise that 3.0.2 is an old version of Lucene, but suppose I have Java
code as follows:
int nGramLength = 3;
// Set is an interface, so instantiate a concrete HashSet
Set<String> stopWords = new HashSet<String>();
stopWords.add("the");
stopWords.add("and");
...
SnowballAnalyzer snowballAnalyzer = new SnowballAnalyzer(Version.LUCENE_30,
"English", stopWords);
ShingleAnalyzerWrapper shingleAnalyzer = new
ShingleAnalyzerWrapper(snowballAnalyzer, nGramLength);
This code will generate the frequency of n-grams from a particular string of
text with stop words removed. How can I disable the LowerCaseFilter which
forms part of the SnowballAnalyzer? I want to preserve the case of the
n-grams generated so that I can perform various counts according to the
presence / absence of upper case characters in the n-grams.
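To make the goal concrete, here is a rough sketch in plain Java (no Lucene involved) of the kind of count I have in mind once case is preserved. The class and method names are hypothetical, made up only for illustration; it builds fixed-length word n-grams only and ignores stop words and stemming:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: none of these names come from Lucene.
class NGramCaseCounts {

    // Build word n-grams of exactly n tokens, leaving the case untouched.
    static List<String> wordNGrams(List<String> tokens, int n) {
        List<String> ngrams = new ArrayList<String>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            StringBuilder sb = new StringBuilder();
            for (int j = i; j < i + n; j++) {
                if (j > i) {
                    sb.append(' ');
                }
                sb.append(tokens.get(j));
            }
            ngrams.add(sb.toString());
        }
        return ngrams;
    }

    // True if the n-gram contains at least one upper case character.
    static boolean hasUpperCase(String ngram) {
        for (int i = 0; i < ngram.length(); i++) {
            if (Character.isUpperCase(ngram.charAt(i))) {
                return true;
            }
        }
        return false;
    }

    // Count the n-grams that contain at least one upper case character.
    static int countWithUpperCase(List<String> ngrams) {
        int count = 0;
        for (String ngram : ngrams) {
            if (hasUpperCase(ngram)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> tokens =
            Arrays.asList("Apache", "Lucene", "is", "a", "search", "library");
        List<String> ngrams = wordNGrams(tokens, 3);
        System.out.println(ngrams.size() + " n-grams, "
            + countWithUpperCase(ngrams) + " contain upper case");
    }
}
```

A count like this is only meaningful if the analyzer chain has not already lower-cased the tokens, which is why I need the LowerCaseFilter out of the way.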
I am something of a Lucene newbie, and I should add that upgrading the
version of Lucene is not an option here.