Hello,
I have a question about idf computation for different fields:
As we know, idf = Math.log(numDocs/(docFreq+1)) + 1.0
docFreq is field-specific; numDocs, however, is a single number shared by all
fields.
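To make the formula concrete, here is a minimal sketch in plain Java. It is not Lucene's own code: the class name IdfSketch and the docFreq values are hypothetical, chosen only for illustration. It casts to double before dividing so the ratio is not truncated by integer division.

```java
public class IdfSketch {
    // Same shape as the formula quoted above: log(numDocs / (docFreq + 1)) + 1.
    // The cast to double avoids integer division on the ratio.
    public static double idf(long docFreq, long numDocs) {
        return Math.log(numDocs / (double) (docFreq + 1)) + 1.0;
    }

    public static void main(String[] args) {
        long numDocs = 1_000_000L;       // one value, shared by every field
        long docFreqCommon = 1_000_000L; // hypothetical: term present in every doc
        long docFreqSparse = 10L;        // hypothetical: term present in only 10 docs

        System.out.println("idf (common term): " + idf(docFreqCommon, numDocs));
        System.out.println("idf (sparse term): " + idf(docFreqSparse, numDocs));
    }
}
```

Since numDocs is the same in both calls, only docFreq changes the result: the smaller the docFreq, the larger the idf.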
For example:
Assume there are 1M docs, meaning numDocs = 10^6;
all of the docs have field_1, but only
Hi,
we are currently migrating from Lucene 3.5.0 to Lucene 4. So far so good, but
in one project we need to access multiple indices, some of which may be remote.
In the past, we solved this by using the Searcher interface, and implemented a
subcl
Thanks, that did work.
On Tue, Jun 17, 2014 at 8:49 PM, Jack Krupansky
wrote:
> Yeah, this is kind of tricky and confusing! Here's what happens:
>
> 1. The query parser "parses" the input string into individual source
> terms, each delimited by white space. The escape is removed in this
> proc
Your first case is supposed to work; if it doesn't it's a bad bug :)
Can you reduce it to a small example?
Mike McCandless
http://blog.mikemccandless.com
On Wed, Jun 18, 2014 at 10:08 AM, Clemens Wyss DEV wrote:
> I would like to perform a batch update on an index. In order to omit
> duplica
I would like to perform a batch update on an index. In order to omit duplicate
entries I am making use of IndexWriter#updateDocument(Term, Document)
open an IndexWriter;
foreach( element in elementsToBeUpdatedWhichHaveDuplicates )
{
doc = element.toDoc();
indexWriter.updateDocument( uniqueTermFor
Hi!
We have switched from Lucene 3.6 to Lucene >= 4.7 (Java 7) and we are also
experiencing a distinct slowdown using the same dataset. We are running the
software under Windows 2008R2.
In our case, we have identified that there are a lot more IO calls (= number of
times the buffer is refilled in Ind