On 01 May 2006, at 02:53, Andi Vajda wrote:

Secondly, it doesn't seem to be possible (in PyLucene 1.9.1) to search an untokenized field using a term that contains spaces. For a document that has a creator "Doe J", the query
creator:"Doe J"
doesn't return any results, and
creator:Doe J
doesn't match what it needs to.

Again, please send in code that reproduces the problem. If you can make sure that what you're trying to do works in Java Lucene, that's a plus.
Ideally, your sample code would be organized as unit tests.

Good idea to do the tests: I realised that StandardAnalyzer was converting the search terms to lowercase when used in QueryParser, but not when adding untokenized fields to the document using IndexWriter, so the two weren't matching. Fixed now, thanks (and it's presumably not a PyLucene problem).

alf.

--------

#!/usr/bin/env python

from PyLucene import *

filestore = FSDirectory.getDirectory("test", True)
analyzer = StandardAnalyzer()
filewriter = IndexWriter(filestore, analyzer, True)

doc = Document()

doc.add(Field('author-space', "Doe J", Field.Store.YES, Field.Index.UN_TOKENIZED))
doc.add(Field('author-space-tok', "Doe J", Field.Store.YES, Field.Index.TOKENIZED))
doc.add(Field('author-underscore', "Doe_J", Field.Store.YES, Field.Index.UN_TOKENIZED))
doc.add(Field('author-underscore-tok', "Doe_J", Field.Store.YES, Field.Index.TOKENIZED))

filewriter.addDocument(doc)
filewriter.close()

searcher = IndexSearcher("test")

for q in ("Doe J", "Doe_J"):
    for f in ("author-space", "author-space-tok", "author-underscore", "author-underscore-tok"):
        #query = QueryParser.parse(q, f, analyzer) # only works for tokenized fields
        query = TermQuery(Term(f, q)) # only works for untokenized fields
        hits = searcher.search(query)
        print "\nQ: %s\nQuery: %s\n" % (q, query)
        for i, doc in hits:
            print "Result: %s\n" % doc[f]
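
To illustrate why the TermQuery line only matches the untokenized fields while the QueryParser line only matches the tokenized ones, here is a minimal pure-Python sketch that does not use PyLucene at all. Everything in it is an assumption made for illustration: analyze() is a crude stand-in that splits on non-alphanumerics and lowercases (it is not StandardAnalyzer's real rule set), and build_index(), term_query() and parsed_query() are hypothetical helpers, not Lucene APIs.

```python
import re


def analyze(text):
    """Crude stand-in for an analyzer: split on non-alphanumerics, lowercase."""
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", text) if t]


def build_index(fields):
    """Build a toy index as a set of (field, term) pairs.

    fields is a list of (name, value, tokenized) tuples.
    """
    index = set()
    for name, value, tokenized in fields:
        if tokenized:
            # Tokenized field: each analyzed (lowercased) token is a term.
            for token in analyze(value):
                index.add((name, token))
        else:
            # Untokenized field: the whole value is one term, case preserved.
            index.add((name, value))
    return index


def term_query(index, field, text):
    """Exact term lookup with no analysis, like building the term by hand."""
    return (field, text) in index


def parsed_query(index, field, text):
    """Analyze the query text first, then require every token to be indexed."""
    tokens = analyze(text)
    return bool(tokens) and all((field, t) in index for t in tokens)


INDEX = build_index([
    ("author-space", "Doe J", False),
    ("author-space-tok", "Doe J", True),
])

for field in ("author-space", "author-space-tok"):
    print("%s: term=%s parsed=%s" % (
        field,
        term_query(INDEX, field, "Doe J"),
        parsed_query(INDEX, field, "Doe J"),
    ))
```

In this toy model the untokenized field holds the single term "Doe J", so only the unanalyzed lookup finds it. The tokenized field holds "doe" and "j", so only the analyzed lookup finds it. The point is that index-time and query-time processing have to agree for a given field.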
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev
