Chris Hostetter wrote:

For your purposes, if you've got a system that works and does the Document
conversion for you, then you are probably right: Solr may not be a usefull
addition to your architecture.  Solr doesn't really attempt to solve the
problem of parsing differnet kinds of data streams into a unified Document
module -- it just tries to expose all of the Lucene goodness through an
easy to use, easy to configre, HTTP interface.  Besides the
configuration, Solr's other means of being a value add is in it's
IndexReader management, it's caching, and it's plugin support for mixing
and matching request handlers, output writters, and field types as easily
as you can mix and match Analyzers.

There has been some discussion about adding plugin support for the
"update" side of things as well -- at a very simple level this could allow
for messages to be sent via JSON, or CSV instead of just XML -- but
there's no reason a more comple upate plugin couldn't read in a binary PDF
file and parse it into it's appropriate fields ... but we aren't
quite there yet.  Feel free to bring this up on solr-dev if you'd be
interested in working on it.

I'm interested in discussing this further. I've moved the discussion onto solr-dev, as suggested.

--
Alan Burlison
--

Reply via email to