Is the vocabulary known? That is, do you know the abbreviations that
will be used? If so, you could consider synonyms: index the titles in a
tokenized field, map each abbreviation to its full form, and use phrase
queries to get your matches.
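
To make the synonym idea concrete, here is a rough Python sketch of what
the analysis chain would be doing. It is not Solr code: the SYNONYMS and
STOPWORDS tables and the analyze/phrase_match helpers are made up for
illustration, and a real setup would keep the mappings in the field
type's synonym file instead.

```python
import re

# Hypothetical abbreviation vocabulary (assumption, not from a real index).
SYNONYMS = {"med": "medical", "phys": "physics"}
# Assumed stop word list; the real one depends on the field's analyzer.
STOPWORDS = {"of", "the", "and"}


def analyze(text):
    """Lowercase, split on non-alphanumerics, drop stop words, expand abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [SYNONYMS.get(t, t) for t in tokens if t not in STOPWORDS]


def phrase_match(query, title):
    """True if the analyzed query tokens appear contiguously in the analyzed title.

    Position gaps left by removed stop words are ignored in this sketch.
    """
    q, t = analyze(query), analyze(title)
    if not q:
        return False
    return any(t[i:i + len(q)] == q for i in range(len(t) - len(q) + 1))
```

With these tables, phrase_match('phys. fluids', 'physics of fluids')
comes out true, because 'of' is dropped and 'phys' is expanded before the
comparison. In a real index, stop word removal may leave a position gap,
so the phrase query might need a little slop.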

Regexes often don't scale well, although the FST-based implementations
in 4.x are much faster than the older ones.

It seems to me that regularizing the titles is a better idea than
trying to fake it with regexes, but you know your problem space better
than me...
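
If the vocabulary is not known up front, one way to regularize is to
normalize both titles the same way and then compare them token by token.
A minimal Python sketch, assuming abbreviations are always prefixes of
the full words (which may not hold for your data); STOPWORDS, regularize,
abbreviates and find_matches are hypothetical names, not anything from
Solr:

```python
import re

# Assumed stop word list; adjust to whatever the stored titles contain.
STOPWORDS = {"of", "the", "and", "in", "for"}


def regularize(title):
    """Lowercase, strip punctuation, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    return [t for t in tokens if t not in STOPWORDS]


def abbreviates(short, full):
    """True if every token of `short` is a prefix of the matching token of `full`."""
    s, f = regularize(short), regularize(full)
    if not s or len(s) != len(f):
        return False
    return all(word.startswith(abbr) for abbr, word in zip(s, f))


def find_matches(short, stored_titles):
    """Return the stored titles that `short` could be an abbreviation of."""
    return [t for t in stored_titles if abbreviates(short, t)]
```

Because the stop word is removed on both sides before comparing, 'phys.
fluids' lines up with 'physics of fluids' without any wildcard having to
skip over 'of'.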

Best
Erick

On Fri, Jul 12, 2013 at 1:32 PM, Parul Gupta(Knimbus)
<parulgp...@gmail.com> wrote:
> Hi,
> OK, I will not use bold text in my queries.
>
> I guess my question was not clear.
>
> Here is what I am doing: I have a live source, call it 'A', and a stored
> database, call it 'B'. Both A and B have title fields. Consider A a
> non-persistent Solr and B a persistent Solr.
>
> I have to match the title coming from A to the database B.
>
> Some titles from the live source A arrive in short form, e.g. 'med. phys.'
> and 'phys. fluids', but the corresponding titles in my database B are
> 'medical physics' and 'physics of fluids'.
> Because of these differences, A cannot find the corresponding titles in B
> by searching the tokenized field 'title' with wildcards, so I used the
> Terms component first, which gives me the matching full title from B.
> Once I have the full title, e.g. 'medical physics', I fetch it from the
> HTML and search for it again in a tokenized field, 'titlenew' (a copy
> field of 'title'), which returns the result 'medical physics'. But I fail
> to match 'phys. fluids' with 'physics of fluids' using [a-z0-9]*, because
> the full title has a stop word in it.
>
> I hope you now understand my issue and can help.
> Thanks.
>
>
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/Problem-using-Term-Component-in-solr-tp4077200p4077628.html
> Sent from the Solr - User mailing list archive at Nabble.com.
