Based on the method sent earlier, it looks like Lucene first checks to
see if optimization is even necessary.
> If an index has no deletions, it does not need to be optimized. You can
> find out if it has deletions with IndexReader.hasDeletions.
Is that true? An index that has just been created (with no deletions)
can still have multiple segments that could be optimized. I'm not
sure your statement is c
You appear to be searching for the word "Engineer" in the "name"
field. Shouldn't this query be directed at the "designation" field?
The only terms in the name field would be "Ebrahim", "Faisal", "John",
and "Smith", wouldn't they?
On Thu, 30 Dec 2004 22:06:46 +0530, Mohamed Ebrahim Faisal
<[EM
> But on the other issue of 'store lucene' vs 'store db': can anyone
> provide me with some field experience on size?
> The system I'm developing will provide searching through some 2000
> pdf's, say some 200 pages each. I feed the plain text into Lucene on a
> Field.UnStored basis. I also st
Thanks for correcting me. I use the reader version -- hence my confusion.
-Mike
On Wed, 22 Dec 2004 11:53:31 -0500, Erik Hatcher
<[EMAIL PROTECTED]> wrote:
>
> On Dec 22, 2004, at 11:36 AM, Mike Snare wrote:
> > Whether or not the text is stored in the index is a different co
I've never used the German analyzer, so I don't know what stop words
it defines/uses. Someone else will have to answer that. Sorry.
On Wed, 22 Dec 2004 17:45:17 +0100, DES <[EMAIL PROTECTED]> wrote:
> I actually use Field.Text(String,String) to add documents to my index. Maybe
> I do not understa
Whether or not the text is stored in the index is a different concern
than how it is analyzed. If you want the text to be indexed, and not
stored, then use the Field.Text(String, String) method or the
appropriate constructor when adding a field to the Document. You'll
need to also store a referen
I'm still new to Lucene, but wouldn't that be the coord()? My
understanding is that the coord() is the fraction of the boolean query
that matched a given document.
Again, I'm new, so somebody else will have to confirm or deny...
-Mike
On Mon, 20 Dec 2004 00:33:21 -0800 (PST), Gururaja H
<[EMAI
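To make the coord() description above concrete, here is a minimal sketch in plain Java. It is not Lucene code: the class and method names are hypothetical, and it only models the idea stated in the message, that coord is the fraction of the query's terms found in a given document.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative only, not Lucene source. All names here are hypothetical.
final class CoordSketch {

    // Fraction of query terms matched: overlap / maxOverlap.
    // Returns 0 for an empty query to avoid dividing by zero (an assumption of this sketch).
    static float coord(int overlap, int maxOverlap) {
        return maxOverlap == 0 ? 0.0f : overlap / (float) maxOverlap;
    }

    // Counts the query terms present in the document, then applies coord().
    static float coordFor(Set<String> queryTerms, Set<String> docTerms) {
        int overlap = 0;
        for (String term : queryTerms) {
            if (docTerms.contains(term)) {
                overlap++;
            }
        }
        return coord(overlap, queryTerms.size());
    }

    public static void main(String[] args) {
        Set<String> query = new HashSet<>(Arrays.asList("lucene", "index", "optimize"));
        Set<String> doc = new HashSet<>(Arrays.asList("lucene", "index", "segment"));
        // Two of the three query terms occur in the document.
        System.out.println(coordFor(query, doc));
    }
}
```

In this sketch a document matching every query term gets a factor of 1, and one matching fewer terms gets a proportionally smaller factor.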
Absolutely, but -- correct me if I'm wrong -- it would give no higher
ranking to half-baked and would take a good deal longer on large
indices.
On Thu, 16 Dec 2004 20:03:27 +0100, Daniel Naber
<[EMAIL PROTECTED]> wrote:
> On Thursday 16 December 2004 13:46, Mike Snare wrote:
>
> Not if these words are spelling variations of the same concept, which
> doesn't seem unlikely.
>
> > In addition, why do we assume that a-1 is a "typical product name" but
> > a-b isn't?
>
> Maybe for "a-b", but what about English words like "half-baked"?
Perhaps that's the difference in think
> a-1 is considered a typical product name that needs to be unchanged
> (there's a comment in the source that mentions this). Indexing
> "hyphen-word" as two tokens has the advantage that it can then be found
> with the following queries:
> hyphen-word (will be turned into a phrase query internally)
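The behaviour described in the quoted text (a hyphenated word indexed as two tokens, and the same string in a query becoming a phrase) can be sketched in plain Java. This is not the StandardTokenizer: the class, the splitting rule, and the method names are assumptions made for illustration, and the special handling of names like a-1 is not modelled.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Illustrative only, not Lucene's StandardTokenizer. All names are hypothetical.
final class HyphenPhraseSketch {

    // Lowercases and splits on any run of non-letter, non-digit characters,
    // so "hyphen-word" becomes the two tokens [hyphen, word].
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // A phrase matches when its tokens occur consecutively and in order.
    static boolean matchesPhrase(List<String> docTokens, List<String> phraseTokens) {
        if (phraseTokens.isEmpty()) {
            return false;
        }
        return Collections.indexOfSubList(docTokens, phraseTokens) >= 0;
    }

    public static void main(String[] args) {
        List<String> doc = tokenize("A half-baked idea about hyphen-word handling");
        // The query string goes through the same tokenizer, giving a two-token phrase.
        System.out.println(matchesPhrase(doc, tokenize("hyphen-word")));
        // Either token on its own is also findable as a single term.
        System.out.println(doc.contains("hyphen") && doc.contains("word"));
    }
}
```

Because the query passes through the same splitting rule as the indexed text, the hyphenated form, the two separate words, and the quoted two-word phrase all reach the same pair of tokens.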
I am writing a tool that uses lucene, and I immediately ran into a
problem searching for words that contain internal hyphens (dashes).
After looking at the StandardTokenizer, I saw that it was because
there is no rule that will matchor
. Based on what I can tell from the source, every other