Hi Dawid,

Maybe you could use KeywordMarkerFilter, either directly or as a recipe for a StopwordMarkerFilter?
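
To make the "mark, don't drop" idea concrete, here is a minimal, library-free Java sketch. It deliberately does not use the Lucene API: MarkedToken and StopwordMarker are hypothetical names invented for illustration, and the boolean flag stands in for whatever attribute a real filter would set. In Lucene the same check would presumably live in a TokenFilter's incrementToken(), setting an attribute on each token rather than building a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** A token plus a stop-word flag (hypothetical stand-in for a Lucene attribute). */
final class MarkedToken {
    private final String term;
    private final boolean stopWord;

    MarkedToken(String term, boolean stopWord) {
        this.term = term;
        this.stopWord = stopWord;
    }

    String term() { return term; }

    boolean isStopWord() { return stopWord; }

    @Override
    public String toString() {
        return stopWord ? term + "[stop]" : term;
    }
}

/** Marks stop words instead of removing them (hypothetical helper, not a Lucene class). */
final class StopwordMarker {
    private StopwordMarker() {}

    static List<MarkedToken> mark(List<String> tokens, Set<String> stopWords) {
        List<MarkedToken> out = new ArrayList<>(tokens.size());
        for (String token : tokens) {
            // Assumption: the stop set holds lowercase entries, so lowercase before lookup.
            boolean isStop = stopWords.contains(token.toLowerCase(Locale.ROOT));
            // Every token is emitted; only the flag differs.
            out.add(new MarkedToken(token, isStop));
        }
        return out;
    }

    public static void main(String[] args) {
        Set<String> stopWords = new HashSet<>(Arrays.asList("the", "a", "of"));
        List<String> tokens = Arrays.asList("The", "art", "of", "indexing");
        for (MarkedToken t : mark(tokens, stopWords)) {
            System.out.println(t);
        }
    }
}
```

Because nothing is removed, downstream consumers still see every token in its original order and can decide for themselves what to do with the flagged ones.
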
Note that KeywordAttribute is used by most (all?) Lucene stemmers, so I wouldn't use KeywordMarkerFilter if your analysis chain already includes a stemmer.

Steve

-----Original Message-----
From: Dawid Weiss [mailto:[email protected]]
Sent: Tuesday, August 21, 2012 4:34 PM
To: [email protected]
Subject: Looking for a code pattern to pass stop words as an attribute

Seeking advice. I have an application where I need to know which tokens are stop words. Most analyzers construct the token stream in a way that those tokens are filtered out -- this isn't what I need, I want them in, but marked somehow.

The question is how to do it nicely and in a simple way, possibly reusing existing token filters? I had a few ideas but they all seem awkward -- let me know if I'm missing something obvious.

Dawid
