Got it - I understand how that works now, and my search is returning the correct results. Thanks again!
--
Rory

On Monday, 5 September 2011 at 10:38, Robert Newson wrote:

> The analyzer setting is a top-level item as documented in the README here;
>
> https://github.com/rnewson/couchdb-lucene
>
> B.
>
> On 5 September 2011 10:14, Rory Franklin <[email protected]> wrote:
> > I've modified my original index in CouchDB to be the following, but not
> > having any joy with things being broken up in to tokens:
> >
> > {
> >   "_id": "_design/foo",
> >   "_rev": "19-da99913ce4cdd421903d0d48f9a40cc3",
> >   "fulltext": {
> >     "by_metadata": {
> >       "index": "function(doc) {
> >         var ret=new Document();
> >         if (doc['type'] == 'CSAsset' && doc['deleted'] != true) {
> >           for (var i in doc.metadata) {
> >             if(doc.metadata[i]['key'] == 'Title') {
> >               ret.add(doc.metadata[i]['value'].toLowerCase(), {'field':'sort_title',
> >                 'store':'yes', 'index' : 'not_analyzed'});
> >             }
> >             ret.add(doc.metadata[i]['value'],{ 'field' :
> >               doc.metadata[i]['key'].toLowerCase(), 'analyzer' : 'simple' });
> >             ret.add(doc.metadata[i]['value'], { 'analyzer' : 'simple' });
> >           }
> >           for (var i in doc.partitions) {
> >             ret.add(doc.partitions[i].partition_id,{'field':'partition'});
> >             ret.add(doc.partitions[i].partition_id);
> >           }
> >           ret.add(doc['created_at'], {'field':'sort_created_at', 'store':'yes',
> >             'index' : 'not_analyzed'});
> >           return ret;
> >         } else {
> >           return null;
> >         }
> >       }"
> >     }
> >   }
> > }
> >
> > I've opened the index up in Luke and going to the Documents tab and doing
> > reconstruct & edit on a particular document shows that the fields aren't
> > being split up in to separate tokens.
> >
> > --
> > Rory
> >
> > On Saturday, 3 September 2011 at 17:12, Robert Newson wrote:
> >
> > > "For instance, searching for the term "wonderland" should return back
> > > a document where there is a field with the value
> > > "some_wonderland_example" but it doesn't."
> > >
> > > It shouldn't and doesn't. :)
> > >
> > > 'some_wonderland_example' is a single token when tokenized by the
> > > default StandardAnalyzer. If instead you specify "analyzer":"simple",
> > > you will find that it is 3 tokens, and your search should work.
> > >
> > > B.
> > >
> > > On 3 September 2011 16:06, Rory Franklin <[email protected]> wrote:
> > > > I'm using couchdb-lucene to index a list of fields (user defined) in a
> > > > document using the following design document:
> > > >
> > > > {
> > > >   "_id": "_design/foo",
> > > >   "_rev": "16-dcd0d39369c35b3d74ceef13a388826f",
> > > >   "fulltext": {
> > > >     "by_metadata": {
> > > >       "index": "function(doc) {
> > > >         var ret=new Document();
> > > >         if (doc['type'] == 'CSAsset' && doc['deleted'] != true) {
> > > >           for (var i in doc.metadata) {
> > > >             if(doc.metadata[i]['key'] == 'Title') {
> > > >               ret.add(doc.metadata[i]['value'].toLowerCase(), {'field':'sort_title',
> > > >                 'store':'yes', 'index' : 'not_analyzed'});
> > > >             }
> > > >             ret.add(doc.metadata[i]['value'],{'field':doc.metadata[i]['key'].toLowerCase() });
> > > >             ret.add(doc.metadata[i]['value']);
> > > >           }
> > > >           for (var i in doc.partitions) {
> > > >             ret.add(doc.partitions[i].partition_id,{'field':'partition'});
> > > >             ret.add(doc.partitions[i].partition_id);
> > > >           }
> > > >           ret.add(doc['created_at'], {'field':'sort_created_at', 'store':'yes',
> > > >             'index' : 'not_analyzed'});
> > > >           return ret;
> > > >         } else {
> > > >           return null;
> > > >         }
> > > >       }"
> > > >     }
> > > >   }
> > > > }
> > > >
> > > > (I've formatted the definition so that it's not all on one line for
> > > > readability here)
> > > >
> > > > However, when using the by_metadata view it doesn't appear to be
> > > > breaking the values up when there are underscores. For instance,
> > > > searching for the term "wonderland" should return back a document where
> > > > there is a field with the value "some_wonderland_example" but it
> > > > doesn't. It returns the document if I search for the full term.
> > > >
> > > > I'm just wondering whether I'm defining the index incorrectly? (of
> > > > course, feel free to point out if I'm doing anything else glaringly
> > > > obviously wrong too!)
> > > >
> > > > Rory
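
A rough sketch of where the thread ends up, for reference. It assumes, going by Robert's pointer to the README, that "analyzer" belongs next to "index" inside the named index (by_metadata) rather than in the options passed to ret.add(). The two tokenizer functions are hypothetical JavaScript stand-ins, not Lucene's StandardAnalyzer or SimpleAnalyzer; they only imitate the behaviour described above, where an underscore keeps a token together under the default analyzer and splits it under "simple".

```javascript
// Hypothetical stand-in for the "simple" analyzer: split on anything that
// is not a letter, then lowercase each piece.
function simpleTokens(text) {
  return (text.match(/\p{L}+/gu) || []).map(function (t) {
    return t.toLowerCase();
  });
}

// Hypothetical stand-in for the default analyzer as described in the thread:
// letters, digits and underscores all stay inside one token.
function standardLikeTokens(text) {
  return (text.match(/[\p{L}\p{N}_]+/gu) || []).map(function (t) {
    return t.toLowerCase();
  });
}

// Design document with the analyzer set at the top level of the index
// (assumed placement). The index function body is the one from the thread,
// minus the per-field 'analyzer' options; it is elided here.
var designDoc = {
  _id: "_design/foo",
  fulltext: {
    by_metadata: {
      analyzer: "simple",
      index: "function(doc) { /* index function from the thread */ }"
    }
  }
};

console.log(standardLikeTokens("some_wonderland_example"));
// [ 'some_wonderland_example' ]
console.log(simpleTokens("some_wonderland_example"));
// [ 'some', 'wonderland', 'example' ]
```

With the first tokenization a query for "wonderland" has nothing to match except the full term; with the second it matches the middle token, which is the behaviour the original question was after.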
