the whitespace
tokenizer, I will use the StandardTokenizer.
Thanks Hoss
--
View this message in context:
http://www.nabble.com/facets-and-stopwords-tp23952823p24390157.html
Sent from the Solr - User mailing list archive at Nabble.com.
: http://projecte01.development.barcelonamedia.org/fonetic/
: you will see a Top Words list (in Spanish and stemmed) in the list there
: is the word si which is in 20649 documents.
: If you click at this word, the system will perform the query
: (x) content:si, with no answers at all
:
. org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
The field is indexed, tokenized, stored and termvectors are stored.
So why are the stopwords in the index?
--
View this message in context:
http://www.nabble.com/facets-and-stopwords-tp23952823p24286283.html
: Date: Tue, 9 Jun 2009 16:04:03 -0700 (PDT)
: From: JCodina
: Subject: facets and stopwords
: I have a text field from where I remove stop words, as a first approximation
: I use facets to see the most common words in the text, but.. stopwords are
: there, and if I search documents having
you can check what's going on in the content field.
I use the DataImportHandler to import the data, and
the Solr analyzer shows me that the stopwords are removed from both the query
and the indexed text, so why do the facets show me these words?
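
To make the two requests concrete, here is a minimal sketch that only builds the URLs for the facet request and for the follow-up term query. The host, port and select path are assumptions (the thread does not give them), and the helper names are made up for illustration; `content` is the field mentioned above, and `facet`, `facet.field` and `facet.limit` are the standard Solr faceting parameters.

```python
from urllib.parse import urlencode

# Assumption: a local Solr instance; the real host and core are not given here.
SOLR_SELECT = "http://localhost:8983/solr/select"


def facet_url(field, limit=20):
    """Build a URL asking Solr for the most frequent indexed terms of `field`."""
    params = [
        ("q", "*:*"),            # match every document
        ("rows", 0),             # only the facet counts are wanted, not the hits
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", limit),
    ]
    return SOLR_SELECT + "?" + urlencode(params)


def term_query_url(field, term, rows=10):
    """Build a URL for the query issued when a facet term is clicked."""
    params = [("q", f"{field}:{term}"), ("rows", rows)]
    return SOLR_SELECT + "?" + urlencode(params)


if __name__ == "__main__":
    print(facet_url("content"))
    print(term_query_url("content", "si"))
```

Comparing the terms returned by the first URL with the hit count of the second is one way to see the mismatch described here: the facet lists a term, while the query for that same term returns nothing.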
--
View this message in context:
http://www.nabble.com/facets