On Jan 8, 2007, at 4:58 AM, Alan Burlison wrote:
I'm in the process of evaluating what we are going to do with the search functionality for http://opensolaris.org, and at the moment Solr is my first choice to replace what we already have - *if* it can be made to handle disparate data sources.

There really is no question of "if" Solr can be made to handle it. :) POSTing a binary document encoded inside XML will work, and having Solr decode it and parse it will certainly work too.
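
As a rough illustration of the encoding half, here is a minimal Python sketch that base64-encodes a binary file into a Solr-style `<add><doc>` update message and decodes it again. The field name `content_b64` and both function names are made up for the example, and the decode step only stands in for what custom code on the Solr side would have to do; stock Solr would not decode the field by itself.

```python
import base64
import xml.etree.ElementTree as ET


def build_add_message(doc_id, raw_bytes, content_field="content_b64"):
    """Wrap a binary document in an <add><doc> XML update message.

    content_field is a hypothetical field name, not one Solr defines.
    """
    add = ET.Element("add")
    doc = ET.SubElement(add, "doc")
    id_field = ET.SubElement(doc, "field", name="id")
    id_field.text = doc_id
    body = ET.SubElement(doc, "field", name=content_field)
    # base64 keeps arbitrary bytes safe inside XML character data
    body.text = base64.b64encode(raw_bytes).decode("ascii")
    return ET.tostring(add, encoding="unicode")


def extract_binary(message, content_field="content_b64"):
    """Recover the original bytes, as a server-side handler would need to."""
    root = ET.fromstring(message)
    for field in root.iter("field"):
        if field.get("name") == content_field:
            return base64.b64decode(field.text or "")
    raise KeyError(content_field)
```

The resulting string is what would be POSTed to the update URL; the bytes returned by `extract_binary` are what a format-specific parser would then consume.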

The Lucene in Action codebase has a DocumentHandler interface that could be used for this, which has implementations for Word, PDF, HTML, RTF, and some others. It's simplistic, so it might not be of much value for your specific case.
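
The general idea of one handler per format can be sketched like this. This is a Python analog written for illustration only, not the Lucene in Action Java API; the names `HANDLERS`, `extract_text`, and `UnsupportedFormat` are hypothetical, and only plain text and HTML are covered, using the standard library.

```python
import os
from html.parser import HTMLParser


class UnsupportedFormat(Exception):
    """Raised when no handler is registered for a file extension."""


class _TextCollector(HTMLParser):
    """Collects the character data of an HTML document, dropping tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def text_handler(raw):
    return raw.decode("utf-8", errors="replace")


def html_handler(raw):
    collector = _TextCollector()
    collector.feed(raw.decode("utf-8", errors="replace"))
    collector.close()
    # Collapse runs of whitespace left behind by removed markup
    return " ".join(" ".join(collector.chunks).split())


# Hypothetical registry: file extension -> function(bytes) -> str
HANDLERS = {
    ".txt": text_handler,
    ".html": html_handler,
    ".htm": html_handler,
}


def extract_text(filename, raw):
    """Pick a handler by file extension and return indexable text."""
    ext = os.path.splitext(filename)[1].lower()
    try:
        handler = HANDLERS[ext]
    except KeyError:
        raise UnsupportedFormat(ext) from None
    return handler(raw)
```

Handlers for Word, PDF, or RTF would be registered the same way, each backed by whatever parsing library is chosen for that format.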

        Erik
