Hi Vijay,
here are the formats supported by Tika, which SolrCell uses internally, listed per Tika version:

 * https://tika.apache.org/1.4/formats.html
 * https://tika.apache.org/1.5/formats.html
 * https://tika.apache.org/1.6/formats.html
 * https://tika.apache.org/1.7/formats.html

Best,
Andrea



On 04/15/2015 04:20 PM, Vijaya Narayana Reddy Bhoomi Reddy wrote:
Hi,

I am trying to index various binary file types into Solr. However, some
file types seem to be ignored and are not getting indexed, though the metadata
is being extracted successfully for all the types.

Specifically, zip files and jpg files are not getting indexed, whereas
pdf and MS Office documents are. Hence I am wondering whether there
is a defined list of indexable file types.

Moreover, I am just wondering why Solr could not index the jpg and zip
documents when it was able to extract the metadata from those files?

The code snippet is as below:

contentStreamUpdateReq.addFile(file, fileType);
contentStreamUpdateReq.setParam("literal.id", literalId);
contentStreamUpdateReq.setParam("uprefix", "attr_");
contentStreamUpdateReq.setParam("fmap.content", "content");
contentStreamUpdateReq.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
solrServer.request(contentStreamUpdateReq);
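
To check what Tika actually returns for a jpg or zip file, the same parameters can be written out as a raw extract request. The following is only a minimal sketch: it assumes the handler is mapped at /update/extract (the path is not shown in the snippet above), the class and method names are hypothetical, and extractOnly=true is the ExtractingRequestHandler parameter that returns the extracted content without indexing it.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of SolrJ: builds the query string that
// corresponds to the setParam(...) calls in the snippet above.
public class ExtractParams {

    // Same parameters as the snippet; extractOnly replaces commit when
    // the goal is only to inspect what Tika extracts.
    static Map<String, String> extractParams(String literalId, boolean extractOnly) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("literal.id", literalId);
        p.put("uprefix", "attr_");
        p.put("fmap.content", "content");
        if (extractOnly) {
            p.put("extractOnly", "true");
        } else {
            p.put("commit", "true");
        }
        return p;
    }

    // Joins the parameters as key=value pairs separated by '&', URL-encoded.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(encode(e.getKey())).append('=').append(encode(e.getValue()));
        }
        return sb.toString();
    }

    static String encode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException ex) {
            // UTF-8 is required to be available on every JVM.
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        // Assumed handler path; adjust to the one configured in solrconfig.xml.
        System.out.println("/update/extract?" + toQueryString(extractParams("doc1", true)));
    }
}
```

Sending a jpg or zip file to the printed URL and comparing the response with the one for a pdf should show whether any body text comes back for those types, or only metadata.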

Thanks & Regards
Vijay

