Chris,

Yes, Solr can help you with that. We did something similar with company
names.

If you are just getting started, this training course will give you a
better understanding of how Solr works:
http://www.pluralsight.com/courses/table-of-contents/enterprise-search-using-apache-solr
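
To give a rough idea of the "fuzzy" part: Solr's standard query parser
accepts word~N to match terms within N edits of a word. The snippet below
is only a minimal sketch of turning one title into such a query string. It
is not what we ran for company names, and the function name and the
"content" field are made up for illustration; your schema will differ.

```python
import re

def fuzzy_title_query(title, field="content", max_edits=1):
    """Return a Solr query string that fuzzy-matches each word of a title.

    `field` is a hypothetical field name holding the extracted document text.
    """
    # Keep only word characters, so punctuation differences are ignored and
    # no query-syntax characters need escaping. Lowercasing also keeps words
    # such as "and" or "not" from being read as boolean operators.
    words = re.findall(r"\w+", title.lower())
    # word~N asks for terms within N edits of the word (N is 0, 1 or 2).
    terms = " ".join(f"{w}~{max_edits}" for w in words)
    return f"{field}:({terms})"

print(fuzzy_title_query("The Art of War: A Study"))
```

You would send one such query per title and use the score Solr returns
with each hit to rank the matches.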


On Fri, Nov 7, 2014 at 10:36 AM, Chris Manu <[email protected]> wrote:

>
>
> Hello,
>
> I apologize for taking your time. I am not trained in this area, but
> someone suggested that this software could do what I need, and I would
> like to enquire as to whether it can.
>
>
>
> I need to match a series of titles (currently over 40k) contained in
> individual cells in a worksheet against the contents of rich documents
> (e.g. Word, PDF). The searching process would need to be automated,
> since there will be several thousand titles and numerous documents. The
> matching would be "fuzzy", since there may be some variation in
> punctuation, or a misuse of a preposition.
>
>
>
> The software would record the relevance of any match (i.e. a percentage
> score), as well as the names of the documents and the page numbers where
> the matches were found. This information would be saved in a format that
> could be opened by Excel. Since there are likely to be multiple matches
> in the same document or across documents, each match for each title
> would have its own row.
>
> I would appreciate your assistance, and I look forward to your reply.
>
> Cheers!
>
>




-- 

*Xavier Morera*

Entrepreneur | Author & Trainer | Consultant | Developer

*www.xaviermorera.com <http://www.xaviermorera.com/>*

office:  (305) 600-4919

cel:     +506 8849-8866

skype: xmorera
Twitter <https://twitter.com/xmorera> | LinkedIn
<https://www.linkedin.com/in/xmorera> | Pluralsight Author
<http://www.pluralsight.com/author/xavier-morera>
