couchdb-lucene uses the content-type stored in couchdb when parsing attachments. couchdb-lucene then uses Apache Tika to parse the attachments, and it is there that support for new MIME types should be requested.
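To see which content type CouchDB has stored for each attachment, the `_attachments` stubs of a fetched document can be inspected. The following is a minimal sketch, not part of couchdb-lucene: the helper names and the example document are made up for illustration, and it assumes only that each stub carries a `content_type` field, as CouchDB documents normally do.

```javascript
// Sketch: report the content type CouchDB has stored for each attachment.
// `doc` is a document as returned by GET /db/docid; attachment stubs are
// listed under `_attachments`, each with a `content_type` field.
function attachmentContentTypes(doc) {
  var result = {};
  var attachments = doc._attachments || {};
  for (var name in attachments) {
    if (Object.prototype.hasOwnProperty.call(attachments, name)) {
      result[name] = attachments[name].content_type;
    }
  }
  return result;
}

// Hypothetical helper: names of attachments stored with the generic
// binary type, which are the likely candidates for being skipped.
function genericBinaryAttachments(doc) {
  var types = attachmentContentTypes(doc);
  return Object.keys(types).filter(function (name) {
    return types[name] === "application/octet-stream";
  });
}

// Example document (invented for illustration only).
var exampleDoc = {
  _id: "example",
  _attachments: {
    "fix.diff": { content_type: "text/x-patch", length: 120, stub: true },
    "notes.txt": { content_type: "application/octet-stream", length: 42, stub: true },
    "manual.pdf": { content_type: "application/pdf", length: 9000, stub: true }
  }
};

console.log(attachmentContentTypes(exampleDoc));
console.log(genericBinaryAttachments(exampleDoc));
```

Comparing the reported types against the supported list shows which attachments are likely to be left out of the index.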
A list of currently supported MIME types is available at:

http://github.com/rnewson/couchdb-lucene

B.

On Sat, Sep 5, 2009 at 11:51 AM, Thomas Harding<[email protected]> wrote:
>
> You got it!
> Tried to upload a pdf file, and then it works...
>
> However, does someone have a way to handle ASCII or UTF-8 files which are
> guessed as "application/octet-stream" (sic!)?
>
> More generally, how can I force handling by lucene for a particular
> mime-type?
> My first tries were for documents whose "couchdb mime-type" was
> "text/x-patch",
> whose usefulness you can obviously guess :p
>
> Robert Newson wrote:
>>
>> Hi,
>>
>> The index function looks correct so I would suggest you check what
>> content type couchdb thinks your attachment is. If it's not in the
>> supported list of content types, then that explains the lack of matches.
>>
>> B.
>>
>> On Sat, Sep 5, 2009 at 3:03 AM, Paul Joseph
>> Davis<[email protected]> wrote:
>>
>>>
>>> This is reaching a bit, but have you tried using 'attachment:diff' in the
>>> query? I seem to remember something about a minimum length for wildcard
>>> searching.
>>>
>>> On Sep 4, 2009, at 9:45 PM, Thomas Harding <[email protected]>
>>> wrote:
>>>
>>>>
>>>> Hello,
>>>> I'm trying to index, then retrieve attachments with couchdb-lucene.
>>>> I guess the problem comes from the query, but you can also find
>>>> the indexing code below.
>>>>
>>>> Trying a query to retrieve a "diff" attachment content which contains
>>>> "diff"
>>>>
>>>> #####################
>>>> the query (among other tries)
>>>> #####################
>>>> $ curl 'http://127.0.0.1:5984/ajatus_devel_db_content/\
>>>> _fti/lucene/by_attachments?q=attachment:d*'
>>>>
>>>> #####################
>>>> the response
>>>> #####################
>>>> {"q":"attachment:d*","etag":"12387ad7f7b",
>>>> "view_sig":"7ceed7519f0b61c517bd9ffee373414b",
>>>> "skip":0,"limit":25,"total_rows":0,"search_duration":0,"fetch_duration":0,"rows":[]}
>>>>
>>>> #################
>>>> the "_design/lucene" code:
>>>> #################
>>>> {
>>>>   "_id": "_design/lucene",
>>>>   "fulltext": {
>>>>     ............
>>>>     "by_attachments": {
>>>>       "defaults": {
>>>>         "store": "no"
>>>>       },
>>>>       "index": "function(doc) { var ret=new Document(); if (doc._attachments)
>>>> {
>>>> for (var i in doc._attachments) { ret.attachment('attachment', i); }};
>>>> return ret }"
>>>>     },
>>>>   },
>>>> }
>>>>
>
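On the question in the thread about text files stored as "application/octet-stream": since couchdb-lucene relies on the content type stored in CouchDB, one possible approach is to upload the attachment again with an explicit Content-Type header through CouchDB's standalone attachment API (PUT to /db/docid/attachmentname with the current revision). The sketch below only builds a description of that request. The function name, database name, document id and revision are all hypothetical, and whether "text/plain" is in the supported list should be checked against the page linked above.

```javascript
// Sketch: describe the HTTP request that re-uploads an attachment with an
// explicit Content-Type. Nothing is sent; the caller passes the result to
// an HTTP client of their choice. All example values are hypothetical.
function buildAttachmentPut(baseUrl, db, docId, rev, name, contentType, body) {
  var url = baseUrl +
    "/" + encodeURIComponent(db) +
    "/" + encodeURIComponent(docId) +
    "/" + encodeURIComponent(name) +
    "?rev=" + encodeURIComponent(rev);
  return {
    url: url,
    method: "PUT",
    // CouchDB stores this header value as the attachment's content_type.
    headers: { "Content-Type": contentType },
    body: body
  };
}

// Example with made-up values (ordinary document id, not a design document).
var request = buildAttachmentPut(
  "http://127.0.0.1:5984", "mydb", "mydoc", "1-abc",
  "notes.txt", "text/plain", "some plain text\n"
);
console.log(request.method, request.url);
console.log(request.headers);
```

The rev parameter must be the document's current revision, otherwise CouchDB rejects the update with a conflict.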
