On 10/10/06, Grant Ingersoll [EMAIL PROTECTED] wrote:
I would be interested in another survey, this time about how many
people use a fixed set of Fields in their applications. The large
majority of mine do. I know SOLR supports dynamic fields, but I
wonder how much they are used. If there
On 10/11/06, Yonik Seeley [EMAIL PROTECTED] wrote:
On 10/10/06, David Balmain [EMAIL PROTECTED] wrote:
Given these factors and the fact that benchmarks can be a very touchy
subject, particularly in the Java community,
OK, I'll bite! (but I'm always too aggravated at many of the Java
design
On 10/11/06, Doug Cutting [EMAIL PROTECTED] wrote:
David Balmain wrote:
The start of my benchmarks is here:
http://ferret.davebalmain.com/trac/wiki/FerretVsLucene
Ferret looks fast! Nice work.
A big knee in indexing performance occurs when indexes get much larger
than memory, when merging
On 10/11/06, Yonik Seeley [EMAIL PROTECTED] wrote:
On 10/10/06, David Balmain [EMAIL PROTECTED] wrote:
The start of my benchmarks is here:
http://ferret.davebalmain.com/trac/wiki/FerretVsLucene
I did set maxBufferedDocs to 1000 and optimized both indices at the
end
Ah, I had missed
On 10/11/06, Ning Li [EMAIL PROTECTED] wrote:
On 10/10/06, Yonik Seeley [EMAIL PROTECTED] wrote:
On 10/10/06, Otis Gospodnetic [EMAIL PROTECTED] wrote:
Hi,
Maybe I missed it, but I was surprised that nobody here wondered about the
algorithm and data structure changes that Dave Balmain
Hi Greg,
I don't know which documentation of the Lucene file formats you are
looking at, but you can see UInt32 (Int), UInt64 (Long), and VInt defined
here:
http://lucene.apache.org/java/docs/fileformats.html
Are you at liberty to tell us what you are working on? You may also
like to take a look
On 7/10/06, Doug Cutting [EMAIL PROTECTED] wrote:
Chuck Williams wrote:
Lucene today allows many field properties to vary at the Field level.
E.g., the same field name might be tokenized in one Field on a Document
while it is untokenized in another Field on the same or different
Document.
On 7/11/06, Chuck Williams [EMAIL PROTECTED] wrote:
David Balmain wrote on 07/10/2006 01:04 AM:
The only problem I could find with this solution is that
fields are no longer in alphabetical order in the term dictionary, but
I couldn't think of a use case where this is necessary, although I'm
On 7/11/06, Yonik Seeley [EMAIL PROTECTED] wrote:
On 7/10/06, David Balmain [EMAIL PROTECTED] wrote:
I don't think declaring all fields up front is necessary for
substantial optimizations. I've found that the key to some really good
optimizations is having constant field numbers
On 7/10/06, Chuck Williams [EMAIL PROTECTED] wrote:
David Balmain wrote on 07/09/2006 06:44 PM:
On 7/10/06, Chuck Williams [EMAIL PROTECTED] wrote:
Marvin Humphrey wrote on 07/08/2006 11:13 PM:
On Jul 8, 2006, at 9:46 AM, Chuck Williams wrote:
Many things would be cleaner in Lucene
Hi Marvin,
Where are you with this? I also have a vested interest in seeing
Lucene move to using byte counts. I was wondering if I could help out.
Is the patch you pasted here the latest you have?
Cheers,
Dave
On 4/12/06, Marvin Humphrey [EMAIL PROTECTED] wrote:
Greets,
I'm back working on
Hi Erik,
The only way I can see this exception being thrown is when you have
two SpanCells with the same start in a particular document. In this
case matchIsOrdered will return false even though the SpanCells may
still be ordered in the priority queue. The current code for
matchIsOrdered is:
Hi Robert,
I'm very interested in this. I've ported the indexing part of Lucene
to C myself. Currently it's not portable (it runs only on *nix), but it does
implement file locking. I'm mostly curious to see how you solved some
of the problems I came across and how your performance compares to
the Java
This sounds like it should be possible, except for docId clashes - if
index A had a document with Id 100 and index B also has a document with
Id 100, after my index file copying, index C will end up having 2
documents with Id 100, and that won't work. So, documents in C would
have to be
14 matches