On Sun, 30 Apr 2006, Alf Eaton wrote:

I have a couple of questions regarding indexing and searching a document that has repeated values for the same field (specifically, the authors of a document, in this case):

Firstly, I'm adding the repeated field with this code:

for creator in creators:
      doc.add(Field('creator', creator, Field.Store.YES, Field.Index.UN_TOKENIZED))

but can't find a way to read those fields back out from the index. If I use

for author in hits[i]["creator"]:
      print author

I'm not sure I understand what you're trying to do in the code above.
In PyLucene 1.9.1, the way to iterate hits is:

  for i, doc in hits:
      print doc['creator']

If there is more than one field called 'creator', you might want to try:

  for i, doc in hits:
      for creator in doc.getFields('creator'):
          print creator

In PyLucene 2.0rc1, you can also say:

  for hit in hits:
      for creator in hit.getDocument().getFields('creator'):
          print creator

If this doesn't work, please send in code that illustrates the problem (that would help in understanding and fixing the potential bug(s)).
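
To illustrate why looping over hits[i]["creator"] is unlikely to yield one author per iteration, here is a minimal sketch in plain Python that runs without PyLucene. It assumes that indexing a document by field name returns only the first stored value as a single string, as Document.get() does in Java Lucene. FakeDocument and get_values are hypothetical stand-ins, not PyLucene classes or methods, and the sketch is written for a current Python interpreter rather than the one used in the snippets above.

```python
# FakeDocument is a hypothetical stand-in for a Lucene document.
class FakeDocument:
    def __init__(self):
        # (name, value) pairs; the same name may appear more than once.
        self._fields = []

    def add(self, name, value):
        self._fields.append((name, value))

    def __getitem__(self, name):
        # Assumed behaviour: only the first value stored under the name.
        for field_name, value in self._fields:
            if field_name == name:
                return value
        return None

    def get_values(self, name):
        # Every value stored under the name, in insertion order.
        return [value for field_name, value in self._fields
                if field_name == name]


doc = FakeDocument()
for creator in ["Doe J", "Smith A"]:
    doc.add("creator", creator)

# Iterating the result of doc["creator"] walks a string character by character.
chars = [c for c in doc["creator"]]

# Iterating all values of the repeated field walks the authors.
authors = doc.get_values("creator")
```

Under that assumption, the loop in the question would print single characters, which is why asking the document for all fields named 'creator' is the safer route.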

Secondly, it doesn't seem to be possible (in PyLucene 1.9.1) to search an untokenized field using a term that contains spaces. For a document that has a creator "Doe J", the query
creator:"Doe J"
doesn't return any results, and
creator:Doe J
doesn't match what it needs to.

Again, please send in code that reproduces the problem. If you can make sure that what you're trying to do works in Java Lucene, that's a plus.

Ideally, your sample code would be organized as unit tests.
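
One possible explanation for the untokenized-field behaviour, offered as a guess rather than a diagnosis, is that the query parser runs the query text through an analyzer while the field itself was indexed as a single unanalyzed term. The toy model below, in plain Python with no PyLucene calls, sketches that mismatch. The inverted index, toy_analyze, and both search functions are hypothetical, and the analyzed search ignores phrase positions for simplicity.

```python
def index_untokenized(index, doc_id, field, value):
    # An untokenized field is stored as one term, spaces and case included.
    index.setdefault((field, value), set()).add(doc_id)


def toy_analyze(text):
    # Hypothetical analyzer: lowercase the text and split on whitespace.
    return text.lower().split()


def search_analyzed(index, field, text):
    # Each analyzed token must match an indexed term in the same document.
    tokens = toy_analyze(text)
    if not tokens:
        return set()
    postings = [index.get((field, token), set()) for token in tokens]
    return set.intersection(*postings)


def search_exact_term(index, field, value):
    # Look the value up verbatim, bypassing the analyzer.
    return set(index.get((field, value), set()))


index = {}
index_untokenized(index, 1, "creator", "Doe J")

analyzed_hits = search_analyzed(index, "creator", "Doe J")
exact_hits = search_exact_term(index, "creator", "Doe J")
```

In this model the analyzed query looks for the terms "doe" and "j", neither of which was indexed, while the exact lookup finds the document. In Java Lucene the corresponding way to bypass the analyzer is a TermQuery built from a Term for the field and the full value; whether that resolves the case reported here still needs a reproducing test.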

Thanks !

Andi..
_______________________________________________
pylucene-dev mailing list
[email protected]
http://lists.osafoundation.org/mailman/listinfo/pylucene-dev