Don Gilbert writes:
>
> I ran into this problem using current Lucene implementation
> of rangeQuery applied to genome data (search a chromosome
> range from 1..20MB). We wanted to use lucene queries like
>
> +organism:fruitfly +chromosome:X +location:[100 500]
>
> to find all the ge
I ran into this problem using current Lucene implementation
of rangeQuery applied to genome data (search a chromosome
range from 1..20MB). We wanted to use lucene queries like
+organism:fruitfly +chromosome:X +location:[100 500]
to find all the genome features (1000s to 100,000s) th
I do this kind of thing as part of a layer between Lucene and my application, but
often have thought it would be nice to have a metadata layer available that wasn't
part of the Lucene core, but was packaged w/ Lucene. It could provide the information
necessary and have tools for updating with
Otis Gospodnetic wrote:
> Can anyone comment on performance differences?
I'd expect multi-threaded performance to be a bit worse with the
compound format, but single-threaded performance should be nearly identical.
Doug
David Spencer wrote:
Does it ever make sense to set the Similarity object in either (only one
of..) IndexWriter or IndexSearcher? i.e. If I set it in IndexWriter can
I avoid setting it in IndexSearcher? Also, can I avoid setting it in
IndexWriter and only set it in IndexSearcher? I noticed Nutch s
org.apache.lucene.demo.FileDocument.Document(File) is invoked from IndexFiles and does:
Reader reader = new BufferedReader(new InputStreamReader(is));
Notice that the InputStreamReader does not specify an encoding so your default
encoding is being used.
You should probably write your own gl
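As a rough illustration of the encoding point above, here is a minimal sketch that passes the charset explicitly instead of relying on the platform default. The class and method names (ExplicitEncodingReader, readAll) are hypothetical and are not part of the Lucene demo; it also assumes a Java version that has java.nio.charset.StandardCharsets.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not part of the Lucene demo code.
class ExplicitEncodingReader {

    // Decodes the whole stream with the given charset and returns it as a String.
    // The charset is passed to InputStreamReader, so the platform default is never used.
    static String readAll(InputStream is, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new BufferedReader(new InputStreamReader(is, charset))) {
            char[] buf = new char[1024];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "G\u00f6del" contains one non-ASCII character, encoded as two bytes in UTF-8.
        byte[] utf8 = "G\u00f6del".getBytes(StandardCharsets.UTF_8);

        // Decoding with the charset the bytes were written in round-trips the text.
        String right = readAll(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8);

        // Decoding the same bytes with a different charset yields a different string.
        String wrong = readAll(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1);

        System.out.println(right.length() + " chars vs " + wrong.length() + " chars");
    }
}
```

The same idea would apply when building the Reader that is handed to the indexer: pick the charset the files were actually written in, rather than whatever the JVM default happens to be.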
Uddam answers inline:
> - in DefaultSimilarity.queryNorm(float sumOfSquareWeights) : how does it
> compute the query weight?
To understand Lucene scoring it is easiest if you follow a query with only
one term on a searchable.
Here is the general flow of control for such a query:
-IndexSearcher
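On the queryNorm question quoted above: as far as I recall, DefaultSimilarity.queryNorm simply returns one over the square root of its argument, and the argument is the sum of the squared term weights of the query. The sketch below reproduces that arithmetic in plain Java. It does not call Lucene; the class name and the sumOfSquaredWeights helper are hypothetical, and the formula should be checked against the source of the Lucene version in use.

```java
// Stand-alone sketch of the queryNorm arithmetic; does not depend on Lucene.
// Assumption: DefaultSimilarity.queryNorm(x) == 1 / sqrt(x).
class QueryNormSketch {

    // Sums the squares of the per-term query weights (hypothetical helper).
    static float sumOfSquaredWeights(float[] termWeights) {
        float sum = 0.0f;
        for (float w : termWeights) {
            sum += w * w;
        }
        return sum;
    }

    // Normalization factor applied to every term weight of the query.
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    public static void main(String[] args) {
        // A query with a single term: the norm is just 1 / weight.
        float single = queryNorm(sumOfSquaredWeights(new float[] {2.0f}));
        // A query with two terms of weight 3 and 4: the norm is 1 / 5.
        float two = queryNorm(sumOfSquaredWeights(new float[] {3.0f, 4.0f}));
        System.out.println(single + " " + two);
    }
}
```

Under this assumption the factor only rescales the weights of one query, so it does not change the ranking within that query.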
I have been trying to follow Lucene scoring across multiple searchables, and
I do not see where the IDF gets normalized between searchables. (That is, summing
DF across searchables in the first half of query execution and using it in the
second half to calculate the right IDF across searchables.)
Let's say you have one
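To make the two-phase idea in the parenthesis concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class and method names are hypothetical, and the idf formula, ln(numDocs / (docFreq + 1)) + 1, is what I recall DefaultSimilarity using, so treat it as an assumption.

```java
// Stand-alone sketch: per-searchable IDF versus one IDF over the summed statistics.
// Assumption: idf = ln(numDocs / (docFreq + 1)) + 1, as recalled from DefaultSimilarity.
class GlobalIdfSketch {

    // IDF of a term given its document frequency and the number of documents.
    static double idf(long docFreq, long numDocs) {
        return Math.log((double) numDocs / (double) (docFreq + 1)) + 1.0;
    }

    // Phase one: sum docFreq and document counts over all searchables.
    // Phase two: compute a single IDF from the sums, to be used by every searchable.
    static double globalIdf(long[] docFreqs, long[] numDocs) {
        if (docFreqs.length != numDocs.length) {
            throw new IllegalArgumentException("one docFreq and one numDocs per searchable");
        }
        long totalDocFreq = 0;
        long totalDocs = 0;
        for (int i = 0; i < docFreqs.length; i++) {
            totalDocFreq += docFreqs[i];
            totalDocs += numDocs[i];
        }
        return idf(totalDocFreq, totalDocs);
    }

    public static void main(String[] args) {
        // Two searchables: the term is common in the first and absent from the second.
        long[] docFreqs = {9, 0};
        long[] numDocs = {10, 90};
        System.out.println("local idf, searchable 0: " + idf(docFreqs[0], numDocs[0]));
        System.out.println("local idf, searchable 1: " + idf(docFreqs[1], numDocs[1]));
        System.out.println("global idf: " + globalIdf(docFreqs, numDocs));
    }
}
```

With local statistics the two searchables would weight the same term very differently, so scores from them would not be comparable when merged; with the summed statistics both use the same value.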
I did the test earlier on 1.3
http://issues.apache.org/eyebrowse/[EMAIL PROTECTED]
he.org&msgId=1408808
Regards,
Hui
-Original Message-
From: Otis Gospodnetic [mailto:[EMAIL PROTECTED]
Sent: Tuesday, June 08, 2004 5:23 AM
To: Lucene Users List
Subject: Performance: compound vs. multi-f
Hello,
I don't know if the author of CLucene is on this list. You may get
better help on the CLucene mailing list or forum on sf.net.
Otis
--- Yue Sun <[EMAIL PROTECTED]> wrote:
> Hi,
>
> First, I am not sure if I should post my question here, since I am
> using CLucene (C++ port of Lucene) to
Hi,
First, I am not sure if I should post my question here, since I am using
CLucene (C++ port of Lucene) to build indexes. Hope someone here could
help me.
I am indexing on a Solaris machine with 1 GB of memory. I use a ram writer and
an fs writer, and write into the fs index once in a while. Now I am testing
> Can anyone comment on performance differences?
I just ran a comparison, indexing about 250'000 small documents. Both
the time for indexing (239s) and the final disk space used (16.6MB) were
identical. Haven't compared search performance, though I suspect I can
save myself the effort...
Hi all,
The way Lucene computes the score is confusing. I tried to see what happened but
am stuck on what some of the parameters mean.
- in DefaultSimilarity.queryNorm(float sumOfSquareWeights) : how does it compute the
query weight?
- How does it compute the weight of each field in the ind
I am not 100% certain now, but I _think_ there were some changes that
required that you re-index your data when upgrading to 1.4rc2. I would
check the CHANGES file (link on the site, just look at the complete
file).
Otis
--- juan lu <[EMAIL PROTECTED]> wrote:
> I had been using 1.3 final for 1 m
Hello,
I was wondering if anyone can comment on the performance difference of
compound versus multi-file indices. I am interested in both indexing
and searching performance, and have tried testing indexing performance
of both formats.
My tests so far show no indexing performance differences betw
Hi Michael,
I wonder if you would be interested in cooperating on the
extracting/index management bit. We use Lucene and our own extractor
plugins for a Swing-application:
http://tockit.sf.net/docco
Code can be found here:
http://cvs.sourceforge.net/viewcvs.py/toscanaj/docco/
It is BSD-Style l
--- Doug Cutting <[EMAIL PROTECTED]> wrote:
> Jayant Kumar wrote:
> > Thanks for the patch. It helped in increasing the
> > search speed to a good extent.
>
> Good. I'll commit it. Thanks for testing it.
>
> > But when we tried to
> > give about 100 queries in 10 seconds, then again we
> > f