Re: ContentTypes supported by Solr to index

Andrea Gazzarini Wed, 15 Apr 2015 07:28:06 -0700

Hi Vijay,

here you can find all the formats supported by Tika, which is used internally by SolrCell:


 * https://tika.apache.org/1.4/formats.html
 * https://tika.apache.org/1.5/formats.html
 * https://tika.apache.org/1.6/formats.html
 * https://tika.apache.org/1.7/formats.html

Best,
Andrea



On 04/15/2015 04:20 PM, Vijaya Narayana Reddy Bhoomi Reddy wrote:

Hi,

I am trying to index various binary file types into Solr. However, some
file types seem to be ignored and are not getting indexed, though the
metadata is being extracted successfully for all the types.

Specifically, zip files and jpg files are not getting indexed, whereas
PDF and MS Office documents are getting indexed. Hence I am wondering
whether there is a defined list of indexable file types.

Moreover, I am just wondering why Solr could not index the jpg and zip
documents when it was able to extract the metadata from those files?

The code snippet is as below:

contentStreamUpdateReq.addFile(file, fileType);
contentStreamUpdateReq.setParam("literal.id", literalId);
contentStreamUpdateReq.setParam("uprefix", "attr_");
contentStreamUpdateReq.setParam("fmap.content", "content");
contentStreamUpdateReq.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
solrServer.request(contentStreamUpdateReq);

Thanks & Regards
Vijay
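
For anyone who wants to reproduce the request outside SolrJ (for example with curl, to check how a single jpg or zip file is handled), the parameters set in the snippet above map onto a plain query string. The following is only a minimal sketch using the Java standard library: the host, port, handler path /update/extract, the id value doc1 and the class and method names are assumptions that do not appear in the original message, and commit=true only approximates the setAction(COMMIT, true, true) call.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of SolrJ: it only assembles a URL string.
class ExtractUrlSketch {

    // Joins URL-encoded key=value pairs onto the handler URL, in insertion order.
    static String buildExtractUrl(String handlerUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            query.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return handlerUrl + "?" + query;
    }

    public static void main(String[] args) {
        // Same parameter names as in the snippet; the values are placeholders.
        Map<String, String> params = new LinkedHashMap<>();
        params.put("literal.id", "doc1");        // assumed id value
        params.put("uprefix", "attr_");
        params.put("fmap.content", "content");
        params.put("commit", "true");            // stands in for ACTION.COMMIT

        // Assumed local Solr instance and handler path.
        System.out.println(
                buildExtractUrl("http://localhost:8983/solr/update/extract", params));
    }
}
```

The printed URL can then be used as the target of a multipart file upload; the file itself is sent in the request body, not in the query string.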
