Re: RamDirectory vs MemoryIndex vs MMapDirectory for In-Memory-Index

2018-09-25 Thread Matthias Müller
Thanks Dawid, glad I asked! On Tuesday, 25 Sep 2018, at 10:46 +0200, Dawid Weiss wrote: > Use MMapDirectory on a temporary location, Matthias. If you really > need in-memory indexes, a new Directory implementation is coming > (RAMDirectory will be deprecated, then removed), but the difference >

Re: RamDirectory vs MemoryIndex vs MMapDirectory for In-Memory-Index

2018-09-25 Thread Dawid Weiss
Use MMapDirectory on a temporary location, Matthias. If you really need in-memory indexes, a new Directory implementation is coming (RAMDirectory will be deprecated, then removed), but the difference compared to MMapDirectory is typically not worth the hassle. See this issue for more discussion.
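The advice above relies on memory-mapped files in a temporary directory. As a rough illustration of that underlying mechanism only, here is a minimal plain-Java sketch using java.nio; it does not use Lucene, and the class name, method name, and file name are hypothetical choices for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sketch: not Lucene code, only the java.nio mechanism
// (temporary directory + memory-mapped read) that the advice builds on.
class TempMmapSketch {

    /** Writes text to a file in a fresh temp directory and reads it back via a memory mapping. */
    static String roundTrip(String text) {
        try {
            Path dir = Files.createTempDirectory("index-sketch");
            Path file = dir.resolve("segment.bin"); // illustrative file name
            // Registered for deletion at JVM exit; deletion runs in reverse order,
            // so the file goes first and the then-empty directory second.
            dir.toFile().deleteOnExit();
            file.toFile().deleteOnExit();

            Files.write(file, text.getBytes(StandardCharsets.UTF_8));

            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                // Map the whole file read-only; pages are served from the OS page cache.
                MappedByteBuffer buffer =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
                byte[] out = new byte[buffer.remaining()];
                buffer.get(out);
                return new String(out, StandardCharsets.UTF_8);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello index"));
    }
}
```

Once the file's pages are in the OS page cache, reads through the mapping do not touch the disk, which is one plausible reading of why the difference to a purely in-memory structure is described as small.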

RamDirectory vs MemoryIndex vs MMapDirectory for In-Memory-Index

2018-09-25 Thread Matthias Müller
Hi, Lucene provides different storage options for in-memory indexes. I found three structures that would qualify for the task: * RAMDirectory (which I currently use for prototyping, but wonder if it is the ideal choice for my task) * MemoryIndex, which claims to have better performance and

Re: RAMDirectory vs MemoryIndex

2006-11-27 Thread Wolfgang Hoschek
On Nov 26, 2006, at 8:57 AM, jm wrote: I tested this. I use a single static analyzer for all my documents, and the caching analyzer was not working properly. I had to add a method to clear the cache each time a new document was to be indexed, and then it worked as expected. I have never looked

Re: RAMDirectory vs MemoryIndex

2006-11-27 Thread jm
On 11/27/06, Wolfgang Hoschek [EMAIL PROTECTED] wrote: On Nov 26, 2006, at 8:57 AM, jm wrote: I tested this. I use a single static analyzer for all my documents, and the caching analyzer was not working properly. I had to add a method to clear the cache each time a new document was to be

Re: RAMDirectory vs MemoryIndex

2006-11-27 Thread Wolfgang Hoschek
On Nov 27, 2006, at 9:57 AM, jm wrote: On 11/27/06, Wolfgang Hoschek [EMAIL PROTECTED] wrote: On Nov 26, 2006, at 8:57 AM, jm wrote: I tested this. I use a single static analyzer for all my documents, and the caching analyzer was not working properly. I had to add a method to clear the

Re: RAMDirectory vs MemoryIndex

2006-11-27 Thread jm
Yes, that would be OK for me, as long as I can reuse my child analyzer. On 11/27/06, Wolfgang Hoschek [EMAIL PROTECTED] wrote: On Nov 27, 2006, at 9:57 AM, jm wrote: On 11/27/06, Wolfgang Hoschek [EMAIL PROTECTED] wrote: On Nov 26, 2006, at 8:57 AM, jm wrote: I tested this. I use a

Re: RAMDirectory vs MemoryIndex

2006-11-27 Thread Wolfgang Hoschek
Ok. I reverted back to the version without a public clear() method. Wolfgang. On Nov 27, 2006, at 12:17 PM, jm wrote: Yes, that would be OK for me, as long as I can reuse my child analyzer. On 11/27/06, Wolfgang Hoschek [EMAIL PROTECTED] wrote: On Nov 27, 2006, at 9:57 AM, jm wrote: On

Re: RAMDirectory vs MemoryIndex

2006-11-26 Thread jm
I tested this. I use a single static analyzer for all my documents, and the caching analyzer was not working properly. I had to add a method to clear the cache each time a new document was to be indexed, and then it worked as expected. I have never looked into Lucene's inner workings so I am not

Re: RAMDirectory vs MemoryIndex

2006-11-23 Thread jm
Thanks. I'll try to get this working and see whether there is a perf difference during the weekend. On 11/23/06, Wolfgang Hoschek [EMAIL PROTECTED] wrote: Out of interest, I've checked an implementation of something like this into AnalyzerUtil SVN trunk: /** * Returns an analyzer wrapper

Re: RAMDirectory vs MemoryIndex

2006-11-22 Thread jm
checking one last thing, just in case... as I mentioned, I have previously indexed the same document in another index (for another purpose), as I am going to use the same analyzer, would it be possible to avoid analyzing the doc again? I see IndexWriter.addDocument() returns void, so it does

Re: RAMDirectory vs MemoryIndex

2006-11-22 Thread Wolfgang Hoschek
I've never tried it, but I guess you could write an Analyzer and TokenFilter that not only feeds into IndexWriter on IndexWriter.addDocument(), but as a sneaky side effect also simultaneously saves its tokens into a list so that you could later turn that list into another TokenStream to be

Re: RAMDirectory vs MemoryIndex

2006-11-22 Thread Wolfgang Hoschek
Out of interest, I've checked an implementation of something like this into AnalyzerUtil SVN trunk: /** * Returns an analyzer wrapper that caches all tokens generated by the underlying child analyzer's * token stream, and delivers those cached tokens on subsequent calls to *
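The caching idea described in this snippet can be sketched in plain Java. This is not the AnalyzerUtil code from SVN and does not use Lucene's Analyzer or TokenStream classes; SimpleAnalyzer, CountingAnalyzer, and CachingAnalyzer are hypothetical stand-ins, and the per-field cache key is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of a token-caching wrapper; none of these types are Lucene classes.
class CachingTokenizerSketch {

    /** Stand-in for an analyzer: turns the text of a field into a list of tokens. */
    interface SimpleAnalyzer {
        List<String> tokens(String fieldName, String text);
    }

    /** Child analyzer that lowercases and splits on whitespace, counting how often it runs. */
    static final class CountingAnalyzer implements SimpleAnalyzer {
        int calls = 0;

        @Override
        public List<String> tokens(String fieldName, String text) {
            calls++;
            return Arrays.asList(text.trim().toLowerCase(Locale.ROOT).split("\\s+"));
        }
    }

    /** Wrapper that analyzes each field once and replays the cached tokens afterwards. */
    static final class CachingAnalyzer implements SimpleAnalyzer {
        private final SimpleAnalyzer child;
        private final Map<String, List<String>> cache = new HashMap<>();

        CachingAnalyzer(SimpleAnalyzer child) {
            this.child = child;
        }

        @Override
        public List<String> tokens(String fieldName, String text) {
            // Assumption: the cache is keyed by field name only, so later calls
            // for the same field ignore their text argument and reuse the first result.
            return cache.computeIfAbsent(fieldName, f -> new ArrayList<>(child.tokens(f, text)));
        }
    }

    public static void main(String[] args) {
        CountingAnalyzer child = new CountingAnalyzer();
        // One wrapper per document, so tokens from one document never leak into the next.
        SimpleAnalyzer perDocument = new CachingAnalyzer(child);
        System.out.println(perDocument.tokens("body", "Hello World"));
        System.out.println(perDocument.tokens("body", "Hello World"));
        System.out.println("child analyzer calls: " + child.calls);
    }
}
```

Because the cache in this sketch is keyed by field name alone, reusing one wrapper across documents would replay stale tokens. Creating a fresh wrapper per document around a shared child analyzer avoids that without needing a clear() method.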

Re: RAMDirectory vs MemoryIndex

2006-11-21 Thread Wolfgang Hoschek
On Nov 21, 2006, at 12:38 PM, jm wrote: Ok, thanks, I'll give MemoryIndex a go, and if that is not good enough I will explore the other options then. To get started you can use something like this: for each document D: MemoryIndex index = createMemoryIndex(D, ...) for each query Q:

Re: RAMDirectory vs MemoryIndex

2006-11-21 Thread karl wettin
On 21 Nov 2006, at 16:43, jm wrote: Any thoughts? You can also try InstantiatedIndex, similar in speed and design to a MemoryIndex, but can handle multiple documents, IndexReader, IndexWriter, IndexModifier etc. just like any Directory implementation. It requires a minor patch to the

Re: RAMDirectory vs MemoryIndex

2006-11-21 Thread Wolfgang Hoschek
On Nov 21, 2006, at 7:43 AM, jm wrote: Hi, I have to decide between using a RAMDirectory and MemoryIndex, but not sure what approach will work better... I have to run many items (tens of thousands) against some queries (100 at most), but I have to do it one item at a time. And I already have

Re: RAMDirectory vs MemoryIndex

2006-11-21 Thread jm
Ok, thanks, I'll give MemoryIndex a go, and if that is not good enough I will explore the other options then. On 11/21/06, Wolfgang Hoschek [EMAIL PROTECTED] wrote: On Nov 21, 2006, at 7:43 AM, jm wrote: Hi, I have to decide between using a RAMDirectory and MemoryIndex, but not sure what