In general, it's hard to give a straight answer, since there are many factors to consider, not the least of which is what you want the analysis to do. In this case, I suspect the issue is WordDelimiterFilterFactory, which by default splits words on all non-alphanumeric characters.
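
As a rough illustration of that suspicion, something along these lines might serve as a test field type that keeps the '@' inside the token. This is only a sketch: the name "text_ws_lower" is made up for the example and is not part of the schema quoted below, while the tokenizer and filter classes are the ones that schema already uses.

```xml
<!-- Hypothetical test field type; the name "text_ws_lower" is an assumption -->
<fieldType name="text_ws_lower" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split on whitespace only, so an e-mail address stays a single token -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- WordDelimiterFilterFactory is deliberately left out, so '@' and '.' are not treated as delimiters -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Documents indexed before an analyzer change keep their old tokens, so a test field using this type would need its data re-indexed before the comparison means anything.
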
It would probably be a good idea to work with the various combinations of tokenizers and filters to get a feel for what they do. The admin analysis page lets you enter arbitrary text and see the results of analysis. So if you define a bunch of different fields in your schema (just for testing) and then put text in the analysis page, you'll see what transformations occur. This is invaluable for understanding the differences.

Until you get a good idea of what the various tokenizers and filters do, both in isolation and in combination, you'll get lots of surprises. Even after you're familiar with them, you'll *still* get surprises, but at least you'll have a chance to figure it out...

Best
Erick

On Thu, May 13, 2010 at 5:23 PM, Anderson vasconcelos <anderson.v...@gmail.com> wrote:

> I'm using the textgen fieldtype on my field as follows:
>
> <fieldType name="textgen" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>             words="stopwords.txt" enablePositionIncrements="true" />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="1"
>             catenateNumbers="1" catenateAll="0" splitOnCaseChange="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
>     <filter class="solr.StopFilterFactory"
>             ignoreCase="true"
>             words="stopwords.txt"
>             enablePositionIncrements="true"
>             />
>     <filter class="solr.WordDelimiterFilterFactory"
>             generateWordParts="1" generateNumberParts="1" catenateWords="0"
>             catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
> .....
> <dynamicField name="field_value_*" type="textgen" indexed="true"
>               stored="true"/>
>
> .....
>
> They do not remove the @ symbol. To index the @ symbol, must I use
> HTMLStripStandardTokenizerFactory?
>
> Thanks
>
> 2010/5/13 Erick Erickson <erickerick...@gmail.com>
>
> > Probably your analyzer is removing the @ symbol; it's hard to say if you
> > don't include the relevant parts of your schema.
> >
> > This page might help:
> > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
> >
> > Best
> > Erick
> >
> > On Thu, May 13, 2010 at 3:59 PM, Anderson vasconcelos <
> > anderson.v...@gmail.com> wrote:
> >
> > > Why does Solr/Lucene not index the character '@'?
> > >
> > > I send email fields such as x...@gmail.com to be indexed, then try a search
> > > on to_email:*...@*, and nothing is found.
> > >
> > > Do I need to do some configuration?
> > >
> > > Thanks