I put binary data in an ordinary Solr stored field; it doesn't need any special schema.

I have run into trouble making sure the data is not corrupted on the way in during indexing. Whether it survives depends on exactly what form of communication is being used to index (SolrJ, SolrJ with EmbeddedSolr, DIH, etc.), as well as on settings in the container (e.g. Jetty or Tomcat) used to house Solr. But I think it's possible to get it working no matter what the path; if you run into trouble, someone may be able to help you.
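
One way to guard against that kind of corruption, sketched here under assumptions and not necessarily what I do in production: Base64-encode the bytes so that only ASCII text travels through SolrJ/DIH/the container, and store a checksum next to it so the client can verify the round trip. The field names `payload_b64` and `payload_sha256` are made up for illustration, and no actual Solr call is shown.

```python
import base64
import hashlib

def encode_for_solr(raw: bytes) -> dict:
    """Build (hypothetical) field values for one Solr document."""
    return {
        # ASCII-only text is not affected by charset conversions on the way in.
        "payload_b64": base64.b64encode(raw).decode("ascii"),
        # Checksum of the original bytes, used to verify the round trip.
        "payload_sha256": hashlib.sha256(raw).hexdigest(),
    }

def decode_from_solr(fields: dict) -> bytes:
    """Recover the bytes from stored field values and verify them."""
    raw = base64.b64decode(fields["payload_b64"], validate=True)
    if hashlib.sha256(raw).hexdigest() != fields["payload_sha256"]:
        raise ValueError("binary payload was corrupted in transit")
    return raw

# Round trip over every possible byte value.
original = bytes(range(256))
fields = encode_for_solr(original)
restored = decode_from_solr(fields)
```

The cost is that Base64 makes the stored value about a third larger than the raw bytes.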

My binary data is not very large, though (generally under 1 MB).

However, in general, _indexing_ large data should be fine, although it will create a larger index, which can require more RAM, be slower, etc. But that's generally just a function of the total size of the index, or really the total number of unique terms; it doesn't matter whether the docs they come from are big or small.

_Storing_ large fields can sometimes be a problem: Lucene/Solr are really optimized as an index, not a key/value store. Some people choose to _store_ their large objects in some external store (RDBMS, NoSQL key/value, whatever), and have the client application look up the objects themselves by primary key/unique id after the pk/uids are retrieved from Solr. Use Solr for what it's good at, indexing, and use something else that's good at storing for the large objects. But other people sometimes store large objects directly in Solr without problems; it can depend on the exact nature of your index and use.
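
A rough sketch of that external-store pattern, with assumptions stated up front: an in-memory SQLite table stands in for whatever store you actually pick, the `blobs` table and the `save_blob`/`load_blobs` helpers are invented for illustration, and the list of ids is hard-coded where a real client would take them from the Solr response.

```python
import sqlite3

# In-memory SQLite stands in for the external store (assumption).
store = sqlite3.connect(":memory:")
store.execute("CREATE TABLE blobs (uid TEXT PRIMARY KEY, data BLOB NOT NULL)")

def save_blob(uid: str, data: bytes) -> None:
    """Store one large object under the same unique id that Solr indexes."""
    store.execute(
        "INSERT OR REPLACE INTO blobs (uid, data) VALUES (?, ?)", (uid, data)
    )
    store.commit()

def load_blobs(uids):
    """Fetch the objects for the given ids, keeping the order Solr ranked them in."""
    result = []
    for uid in uids:
        row = store.execute(
            "SELECT data FROM blobs WHERE uid = ?", (uid,)
        ).fetchone()
        if row is not None:  # skip ids that have no stored object
            result.append((uid, bytes(row[0])))
    return result

save_blob("img-1", b"\x89PNG fake image bytes")
save_blob("doc-2", b"%PDF fake document bytes")

# Hypothetical: in a real client these ids come back from a Solr query.
ids_from_solr = ["doc-2", "img-1"]
objects = load_blobs(ids_from_solr)
```

Solr then only has to store the unique id (plus whatever small fields you want to display), and the index stays small regardless of how big the objects are.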

On 4/6/2011 2:09 PM, Ezequiel Calderara wrote:
Another question that may be easier to answer: how can I store binary
data? Any example schema?

2011/4/6 Ezequiel Calderara<ezech...@gmail.com>

Hello everyone, I need to know if someone has used Solr for indexing and
storing images (up to 16MB) or binary docs.

How does Solr behave with this type of docs? How does it affect performance?

Thanks Everyone

--
______
Ezequiel.

Http://www.ironicnet.com


