checking one last thing, just in case...

as I mentioned, I have previously indexed the same document in another
index (for another purpose), as I am going to use the same analyzer,
would it be possible to avoid analyzing the doc again?

I see IndexWriter.addDocument() returns void, so it does not seem to
be an easy way to do that no?

thanks

On 11/21/06, Wolfgang Hoschek <[EMAIL PROTECTED]> wrote:

On Nov 21, 2006, at 12:38 PM, jm wrote:

> Ok, thanks, I'll give MemoryIndex a go, and if that is not good enoguh
> I will explore the other options then.

To get started you can use something like this:

for each document D:
     MemoryIndex index = createMemoryIndex(D, ...)
     for each query Q:
         float score = index.search(Q)
        if (score > 0.0) System.out.println("it's a match");




   private MemoryIndex createMemoryIndex(Document doc, Analyzer
analyzer) {
     MemoryIndex index = new MemoryIndex();
     Enumeration iter = doc.fields();
     while (iter.hasMoreElements()) {
       Field field = (Field) iter.nextElement();
       index.addField(field.name(), field.stringValue(), analyzer);
     }
     return index;
   }



>
>
> On 11/21/06, Wolfgang Hoschek <[EMAIL PROTECTED]> wrote:
>> On Nov 21, 2006, at 7:43 AM, jm wrote:
>>
>> > Hi,
>> >
>> > I have to decide between  using a RAMDirectory and MemoryIndex, but
>> > not sure what approach will work better...
>> >
>> > I have to run many items (tens of thousands) against some
>> queries (100
>> > at most), but I have to do it one item at a time. And I already
>> have
>> > the lucene Document associated with each item, from a previous
>> > operation I perform.
>> >
>> > From what I read MemoryIndex should be faster, but apparently I
>> cannot
>> > reuse the document I already have, and I have to create a new
>> > MemoryIndex per item.
>>
>> A MemoryIndex object holds one document.
>>
>> > Using the RAMDirectory I can use only one of
>> > them, also one IndexWriter, and create a IndexSearcher and
>> IndexReader
>> > per item, for searching and removing the item each time.
>> >
>> > Any thoughts?
>>
>> The MemoryIndex impl is optimized to work efficiently without reusing
>> the MemoryIndex object for a subsequent document. See the source
>> code. Reusing the object would not further improve performance.
>>
>> Wolfgang.
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [EMAIL PROTECTED]
>> For additional commands, e-mail: [EMAIL PROTECTED]
>>
>>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to