Hi there,

It looks a lot like Solr's standard "WordDelimiterFilter" (see the sample 
schema.xml) does what you need.
It splits on alphabetic-to-numeric boundaries and on the various kinds of 
intra-word delimiters such as "-", "_" or ".". You can decide whether the 
parts are also joined back together in addition to the split-up tokens. 
Control this with the parameters "catenateWords", "catenateNumbers" and 
"catenateAll".
Good documentation on this topic can be found on the wiki:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-1c9b83870ca7890cd73b193cefed83c283339089
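
To make the splitting and the catenate options a bit more concrete, here is a 
rough Python sketch of the behaviour described above. It is only an 
approximation for illustration, not Solr's actual implementation: the function 
name word_delimiter and its keyword arguments are made up for this example, 
and details such as splitting on case changes, token positions and output 
order are ignored.

```python
import re
from itertools import groupby


def word_delimiter(token, catenate_words=False, catenate_numbers=False,
                   catenate_all=False):
    """Approximate the splitting described above (hypothetical helper)."""
    # Runs of letters or runs of digits become parts; any other character
    # ("-", "_", ".", ...) only acts as a delimiter and is dropped.
    parts = re.findall(r"[A-Za-z]+|[0-9]+", token)
    tokens = list(parts)

    # catenate_words / catenate_numbers: join adjacent parts of the same
    # kind, e.g. "wi-fi" additionally yields "wifi".
    for is_word, run in groupby(parts, key=str.isalpha):
        run = list(run)
        wanted = catenate_words if is_word else catenate_numbers
        if wanted and len(run) > 1:
            tokens.append("".join(run))

    # catenate_all: additionally emit all parts joined into one token.
    if catenate_all and len(parts) > 1:
        joined = "".join(parts)
        if joined not in tokens:
            tokens.append(joined)
    return tokens
```

With this sketch, word_delimiter("T13420-SCH") gives the parts "T", "13420" 
and "SCH", and passing catenate_all=True adds "T13420SCH" as a further token.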

-- Christian


-----Original Message-----
From: Frans Flippo [mailto:[EMAIL PROTECTED] 
Sent: Thursday, 31 May 2007 11:27
To: solr-user@lucene.apache.org
Subject: Re: SOLR Indexing/Querying


I think if you add a field that has an analyzer that creates tokens on 
alpha/digit/punctuation boundaries, that should go a long way. Use that both at 
index and search time.

For example:
* 3555LHP  becomes "3555" "LHP"
  Searching for D3555 becomes "D" OR "3555", so it matches on token "3555" from 
3555LHP.

* t14240 becomes "t" "14240"
  Searching for t14240-ss  becomes "t" OR "14240" OR "ss", matching "14240" 
from "t14240".

Similarly for your other examples.
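
The idea can be tried out quickly outside of Solr. The following is a minimal 
Python sketch, assuming tokens are simply runs of letters or runs of digits 
and that a query matches when it shares at least one token with the indexed 
id (the OR behaviour described above). The helper names tokenize and matches 
are made up for this illustration and are not Solr or Lucene APIs.

```python
import re


def tokenize(text):
    # Split on alpha/digit/punctuation boundaries: keep runs of letters
    # and runs of digits, drop everything else.
    return re.findall(r"[A-Za-z]+|[0-9]+", text)


def matches(query, product_id):
    # OR semantics: one shared token is enough for a hit.
    return bool(set(tokenize(query)) & set(tokenize(product_id)))


# (keyword typed in, product id it should find), taken from the list
# in the original question.
PAIRS = [
    ("D3555", "3555LHP"),
    ("D460160-BN", "D460160"),
    ("D460160BN", "D460160"),
    ("Dd454557", "D454557"),
    ("84200ORB", "84200"),
    ("84200-ORB", "84200"),
    ("T13420-SCH", "T13420"),
    ("t14240-ss", "t14240"),
]

for query, product_id in PAIRS:
    print(query, "->", product_id, matches(query, product_id))
```

All eight pairs match under these assumptions. Note that single-letter tokens 
make the match quite loose: matches("D3555", "D460160") is also true, because 
both share the token "D".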

If this proves to be too broad, you may need to define some stricter rules, but 
you could use this for starters.

I think you will have to write your own analyzer, as it doesn't look like any 
of the analyzers available in Solr/Lucene do exactly what you need. But that's 
relatively straightforward. Just start with the code from one of the existing 
Analyzers (e.g. KeywordAnalyzer).

Good luck,
Frans

On 5/31/07, realw5 <[EMAIL PROTECTED]> wrote:
>
>
> Hey Guys,
> I need some guidance in regards to a problem we are having with our 
> solr index. Below is a list of terms our customers search for, which 
> are failing or not returning the complete set. The second side of the 
> list is the product id/keyword we want it to match.
>
> Can you give me some direction on how this can be done (or let me know 
> if it can't be done) with index/query analyzers? Any help is much 
> appreciated!
>
> Dan
>
> ---------------------------
>
> Keyword Typed In / We want it to find
>
> D3555 / 3555LHP
> D460160-BN / D460160
> D460160BN / D460160
> Dd454557 / D454557
> 84200ORB / 84200
> 84200-ORB / 84200
> T13420-SCH / T13420
> t14240-ss / t14240
> --
> View this message in context: 
> http://www.nabble.com/SOLR-Indexing-Querying-tf3843221.html#a10883456
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
