Thanks for your response.

So, for text/plain being guessed as application/octet-stream, I suppose the
problem comes from CouchDB itself, since:
"t...@blackberry:~/couchdb-lucene$ file test
test: UTF-8 Unicode text".

On the other hand, for "text/x-patch" and "text/whatever",
Metadata.CONTENT_TYPE could be filled in with "text/plain" in the Tika calls
via a matching table.
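
To make the matching-table idea concrete, here is a minimal sketch in plain
Java. The class name ContentTypeTable, the method normalize, and the two table
entries are hypothetical and exist in neither couchdb-lucene nor Tika. The
sketch only shows the lookup; where exactly it would be called in
couchdb-lucene is an assumption.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of couchdb-lucene or Apache Tika.
final class ContentTypeTable {

    // Matching table: content type stored in CouchDB -> type handed to the parser.
    // The entries are illustrative only.
    private static final Map<String, String> TABLE = new HashMap<>();
    static {
        TABLE.put("text/x-patch", "text/plain");
        TABLE.put("text/whatever", "text/plain");
    }

    private ContentTypeTable() {}

    // Returns the mapped type if the table has an entry, otherwise the
    // input unchanged. Parameters such as "; charset=utf-8" are ignored
    // for the lookup and dropped when a mapping applies.
    static String normalize(String contentType) {
        if (contentType == null) {
            return null;
        }
        int semi = contentType.indexOf(';');
        String base = (semi >= 0 ? contentType.substring(0, semi) : contentType)
                .trim()
                .toLowerCase(Locale.ROOT);
        String mapped = TABLE.get(base);
        return mapped != null ? mapped : contentType;
    }
}
```

Assuming the attachment's content type is available as a string before
parsing, the result of ContentTypeTable.normalize(...) could be stored with
metadata.set(Metadata.CONTENT_TYPE, ...) ahead of the Tika parse call.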

Just an idea... :)

Robert Newson wrote:
couchdb-lucene uses the content-type stored in couchdb when parsing
attachments. couchdb-lucene then uses Apache Tika to parse the
attachments, and it is there that support for new MIME types should be
requested.

A list of currently supported MIME types is available at:

http://github.com/rnewson/couchdb-lucene

B.
