WordDelimiterFilterFactory will _almost_ do what you want
if you set things like catenateWords=0 and catenateNumbers=1,
_except_ that the punctuation will be removed. So
12.34 -> 1234
ab,cd -> ab cd
Is that close enough?
Otherwise, writing a simple Filter is probably the way to go.
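As a rough illustration of the token rule such a filter would need, here is a minimal sketch in plain Python. It is not Lucene or Solr code, and the names TOKEN_RE and tokenize are hypothetical. It assumes that a '.' or ',' only belongs to a number when it has digits on both sides, which is one possible reading of the requirement, and that everything else splits on non-alphanumeric characters.

```python
import re

# Assumption: a numeric token is a run of digits that may contain
# interior '.' or ',' characters (digits required on both sides).
# Alphabetic runs become separate tokens; all other characters,
# including stray punctuation, act as delimiters and are dropped.
TOKEN_RE = re.compile(r"\d+(?:[.,]\d+)*|[A-Za-z]+")


def tokenize(text):
    """Return the tokens in text, keeping numbers like 12.34 whole."""
    return TOKEN_RE.findall(text)


if __name__ == "__main__":
    for sample in ("12.34", "ab,cd", "1,234.56 units"):
        print(sample, "->", tokenize(sample))
```

A real filter would apply the same rule inside the analysis chain, so the pattern is the part worth checking against your documents before writing any Java.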
Best,
Erick
On Thursday, April 12, 2012 8:01 AM
Subject: Re: Question about solr.WordDelimiterFilterFactory
Hello,
I am new to Solr/Lucene. I have been tasked with indexing a large number of
documents. Some of these documents contain numbers with decimal points. I am
looking for a way to index these documents so that adjacent numeric characters
(such as [0-9.,]) are treated as a single token. For example,
12.34 = 12.34