Re: Submission, btree BooleanScorer

2005-05-22 Thread Paul Elschot
On Sunday 22 May 2005 03:09, Karl Wright wrote: > I've been looking at the BooleanScorer code in 1.4.3 and realized that it has several problems. These are: > > 1) It does things in chunks of 1024 document ids. This means it executes in a time that depends on the number of indexed documents.
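
The cost concern in point 1 can be illustrated with a minimal sketch. This is not the Lucene 1.4.3 BooleanScorer source; the class `WindowedScan` and its methods are hypothetical names chosen for illustration. It assumes a scorer that steps through the doc id space in fixed windows of 1024 ids, so the number of loop iterations grows with the index size even when only a handful of documents match.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a hypothetical window-by-window scan, not Lucene code.
final class WindowedScan {
    // Chunk size mentioned in the message.
    static final int WINDOW = 1024;

    /** Number of windows visited when the scan covers doc ids 0..maxDoc-1. */
    static int windowsVisited(int maxDoc) {
        return (maxDoc + WINDOW - 1) / WINDOW;
    }

    /** Collects hits one window at a time; sortedHits must be in ascending order. */
    static List<Integer> collect(int[] sortedHits, int maxDoc) {
        List<Integer> out = new ArrayList<>();
        int next = 0;
        // The outer loop runs once per window, regardless of how many hits exist.
        for (int start = 0; start < maxDoc; start += WINDOW) {
            int end = Math.min(start + WINDOW, maxDoc);
            while (next < sortedHits.length && sortedHits[next] < end) {
                out.add(sortedHits[next++]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Two hits in a (hypothetical) index of one million documents.
        System.out.println(collect(new int[] {5, 999_999}, 1_000_000));
        System.out.println(windowsVisited(1_000_000) + " windows visited");
    }
}
```

With two matching documents the outer loop still runs once per window, which is the dependence on the number of indexed documents that the message describes.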

Re: Submission, btree BooleanScorer

2005-05-22 Thread Karl Wright
I tried a variant of the submitted code with another test case that I am using for a somewhat different class of problem, which had 15,000 terms, and it passed that with much better performance than the current BooleanQuery code. I have not tried it with RangeQuery or PrefixQuery, however. The

Re: Submission, btree BooleanScorer

2005-05-22 Thread Karl Wright
The BooleanScorer2 code in the svn trunk looks like it has been entirely rewritten over the code that this submission is based upon (which was the 1.4.3 stuff). The techniques used may still be applicable, but would apply within one or more of the specialized scorers instead: DisjunctionScorer,

DO NOT REPLY [Bug 34193] - [PATCH] Performance improvement to DisjunctionSumScorer

2005-05-22 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT . ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bu

Re: Submission, btree BooleanScorer

2005-05-22 Thread Paul Elschot
On Sunday 22 May 2005 13:12, Karl Wright wrote: > The BooleanScorer2 code in the svn trunk looks like it has been entirely rewritten over the code that this submission is based upon (which was the 1.4.3 stuff). The techniques used may still be applicable, but would apply within one or more of t

DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

2005-05-22 Thread bugzilla

DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

2005-05-22 Thread bugzilla

Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

2005-05-22 Thread Otis Gospodnetic
I didn't follow this closely, but are you saying that LuceneIndexAccessor then replaces IOError caused by locking with blocking calls? It sounds like the client of LuceneIndexAccessor still needs to keep track of open IndexReaders, IndexWriters, etc., or else one can end up with a hard-to-track bl

Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

2005-05-22 Thread Maik Schreiber
> I didn't follow this closely, but are you saying that > LuceneIndexAccessor then replaces IOError caused by locking with > blocking calls? It sounds like the client of LuceneIndexAccessor still > needs to keep track of open IndexReaders, IndexWriters, etc., or else > one can end up with a hard-t

Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

2005-05-22 Thread Otis Gospodnetic
Hi Maik,
So what happens in this case:
  IndexAccessProvider accessProvider = new IndexAccessProvider(directory, analyzer);
  LuceneIndexAccessor accessor = new LuceneIndexAccessor(accessProvider);
  accessor.open();
  IndexWriter writer = accessor.getWriter();
  // reference to the same

Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

2005-05-22 Thread Maik Schreiber
> IndexWriter writer = accessor.getWriter();
> // reference to the same instance?
> IndexWriter writer2 = accessor.getWriter();
> writer.addDocument();
> writer2.addDocument();
Yes, regardless of which thread invokes getWriter(). This means multiple threads are concurrently able to add new

Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

2005-05-22 Thread Daniel Naber
On Sunday 22 May 2005 21:01, Maik Schreiber wrote: > Yes, regardless of which thread invokes getWriter(). This means multiple > threads are concurrently able to add new documents. Isn't that already possible without any accessor class (you need to use the same IndexWriter for all your threads)

Re: DO NOT REPLY [Bug 34995] - Contribution: LuceneIndexAccessor

2005-05-22 Thread Maik Schreiber
> Isn't that already possible without any accessor class (you need to use > the same IndexWriter for all your threads)? Yes, but you also need to keep track of who's using the writer before you can close it. Additionally, closing a writer yourself doesn't make sure that cached readers and searc
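
The bookkeeping described here (knowing who is still using a shared writer before closing it) can be sketched with simple reference counting. This is a minimal sketch and not the LuceneIndexAccessor code from the bug report; `SharedHandle` and all of its methods are hypothetical names, and it works with any `java.io.Closeable` rather than assuming a particular Lucene API.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical helper: not part of Lucene or of the submitted contribution.
final class SharedHandle<T extends Closeable> {
    private final T resource;
    private int users = 0;                  // callers currently holding the resource
    private boolean closeRequested = false;
    private boolean closed = false;

    SharedHandle(T resource) {
        this.resource = resource;
    }

    /** Hands out the single shared instance and records one more user. */
    synchronized T acquire() {
        if (closeRequested) {
            throw new IllegalStateException("handle is closing");
        }
        users++;
        return resource;
    }

    /** Records that one user is done; closes if a close is pending and nobody is left. */
    synchronized void release() {
        if (users == 0) {
            throw new IllegalStateException("release without acquire");
        }
        users--;
        closeIfIdle();
    }

    /** Asks for a close; the actual close is deferred until the last user releases. */
    synchronized void requestClose() {
        closeRequested = true;
        closeIfIdle();
    }

    synchronized boolean isClosed() {
        return closed;
    }

    // Only called from synchronized methods.
    private void closeIfIdle() {
        if (closeRequested && users == 0 && !closed) {
            closed = true;
            try {
                resource.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

Every `acquire()` returns the same instance, matching the "same instance regardless of thread" behaviour discussed in the thread, and the underlying resource is closed only once the last holder has called `release()`.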

One Byte is Seven bits too many? - A Design suggestion

2005-05-22 Thread Arvind Srinivasan
One Byte is Seven bits too many? - A Design suggestion Hi, The norm takes up 1 byte of storage per document per field. While this may seem very small, a simple calculation shows that the IndexSearcher can consume lots of memory when it caches the norms. Further, the current implementation loads
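
The "simple calculation" can be made concrete as follows. The only fact taken from the message is one byte of norm storage per document per field; the index size and field count below are hypothetical figures chosen for illustration, and `NormsMemory` is not a Lucene class.

```java
// Illustrative arithmetic only; the document and field counts are assumptions.
final class NormsMemory {
    /** Bytes needed to cache norms at one byte per document per field. */
    static long normBytes(long numDocs, int fieldsWithNorms) {
        return numDocs * fieldsWithNorms;
    }

    public static void main(String[] args) {
        // Hypothetical index: 10 million documents, 10 indexed fields.
        long bytes = normBytes(10_000_000L, 10);
        System.out.println(bytes + " bytes"); // prints "100000000 bytes"
    }
}
```

Under these assumed figures the cached norms alone occupy about 100 MB of heap, which is the scale of overhead the message is concerned about.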

RE: One Byte is Seven bits too many? - A Design suggestion

2005-05-22 Thread Robert Engels
I have always thought that the norms should be an interface, rather than fixed, as there are many uses of Lucene where norms are not necessary, and the memory overhead is substantial. -Original Message- From: Arvind Srinivasan [mailto:[EMAIL PROTECTED] Sent: Sunday, May 22, 2005 7:05 PM To

[ANN] LuceneKit Preview 1

2005-05-22 Thread Yen-Ju Chen
Hi, A minimal set of LuceneKit has been ported based on the Apache Lucene svn version. I would like to use this chance to announce LuceneKit Preview 1. While many features are still missing, all the implemented classes pass their unit tests, though it doesn't guarantee Lucene

DO NOT REPLY [Bug 34882] - Contrib: Main memory based SynonymMap and SynonymTokenFilter

2005-05-22 Thread bugzilla
DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT . ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE. http://issues.apache.org/bugzilla/show_bu

DO NOT REPLY [Bug 34882] - Contrib: Main memory based SynonymMap and SynonymTokenFilter

2005-05-22 Thread bugzilla

DO NOT REPLY [Bug 34882] - Contrib: Main memory based SynonymMap and SynonymTokenFilter

2005-05-22 Thread bugzilla

Re: One Byte is Seven bits too many? - A Design suggestion

2005-05-22 Thread Paul Elschot
On Monday 23 May 2005 02:04, Arvind Srinivasan wrote: > One Byte is Seven bits too many? - A Design suggestion > > Hi, > > The norm takes up 1 byte of storage per document per field. While this may seem > very small, a simple calculation shows that the IndexSearcher can consume lots of > memor