Re: Question about Solr Fieldtypes, Chaining of Tokenizers

2010-12-06 Thread Matthew Hall
Yes, that's my conclusion as well, Grant.

As for the example output:

    The symposium of TgThe(RX3fg+and) gene studies

should end up tokenizing to:

    symposium tg the rx3fg and gene studi

assuming I guessed right on the stemming.

Anyhow, thanks for the confirmation, guys.

Matt

On 12/4/2010 8:18
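To make the expected output above concrete, here is a minimal Python sketch that reproduces it. It is not the Solr analysis chain from the thread. The order of the steps is an assumption inferred from the example (the standalone "The" and "of" are dropped, while the embedded "the" and "and" survive, so stopwords appear to be removed before the split on punctuation and lower-to-upper case changes). The stopword list, the `toy_stem` helper, and the `analyze` function are hypothetical names made up for illustration, and `toy_stem` is only a stand-in for a real stemmer.

```python
import re

# Assumed stopword list, for illustration only; the thread does not give one.
STOPWORDS = {"the", "of", "a", "an"}


def toy_stem(token):
    """Stand-in for a real stemmer: only rewrites a trailing "ies" to "i"."""
    if len(token) > 4 and token.endswith("ies"):
        return token[:-3] + "i"
    return token


def analyze(text):
    """Approximate the tokenization described in the message above."""
    tokens = []
    # Step 1: whitespace tokenization.
    for raw in text.split():
        # Step 2 (assumed order): drop whole tokens that are stopwords.
        if raw.lower() in STOPWORDS:
            continue
        # Step 3: split on runs of non-alphanumeric characters.
        for part in re.split(r"[^A-Za-z0-9]+", raw):
            if not part:
                continue
            # Step 4: split where a lowercase letter is followed by an
            # uppercase one ("TgThe" becomes "Tg The"); digits do not split.
            spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", part)
            for piece in spaced.split():
                # Step 5: lowercase, then apply the toy stemmer.
                tokens.append(toy_stem(piece.lower()))
    return tokens


if __name__ == "__main__":
    print(" ".join(analyze("The symposium of TgThe(RX3fg+and) gene studies")))
```

Whether a real stemmer agrees with "studi" and leaves the other tokens unchanged depends on the stemmer configured in the actual field type.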

Re: Question about Solr Fieldtypes, Chaining of Tokenizers

2010-12-04 Thread Grant Ingersoll
Could you expand on your example and show the output you want? FWIW, you could simply write a token filter that does the same thing as the WhitespaceTokenizer.

-Grant

On Dec 3, 2010, at 1:14 PM, Matthew Hall wrote:

Hey folks, I'm working with a fairly specific set of requirements for our
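The idea of a token filter that does the same job as a whitespace tokenizer can be sketched language-neutrally: a filter consumes a stream of tokens and emits a stream of tokens, so it can re-split any token that still contains whitespace. The Python generator below only illustrates that idea; it is not the Lucene or Solr TokenFilter API, and `whitespace_split_filter` is a hypothetical name.

```python
def whitespace_split_filter(tokens):
    """Re-split each incoming token on whitespace and yield the pieces.

    Illustrative only: a real Lucene filter would also have to maintain
    offsets and position increments, which this sketch ignores.
    """
    for token in tokens:
        # str.split() with no argument splits on any whitespace run
        # and drops empty strings.
        for piece in token.split():
            yield piece


if __name__ == "__main__":
    upstream = ["TgThe RX3fg", "gene"]
    print(list(whitespace_split_filter(upstream)))
```

Because the splitting happens in a filter rather than in the tokenizer, it can sit anywhere in a chain, which is the point of the suggestion.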

Re: Question about Solr Fieldtypes, Chaining of Tokenizers

2010-12-04 Thread Robert Muir
On Fri, Dec 3, 2010 at 1:14 PM, Matthew Hall mh...@informatics.jax.org wrote:

Oh, and let me add that the WordDelimiterFilter comes really close to what I want, but since we are unwilling to promote our Solr version to the trunk (we are on the 1.4x version atm), the inability to turn off the

Question about Solr Fieldtypes, Chaining of Tokenizers

2010-12-03 Thread Matthew Hall
Hey folks, I'm working with a fairly specific set of requirements for our corpus that needs a somewhat tricky text type for both indexing and searching. The chain currently looks like this:

<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.PatternReplaceFilterFactory"