Hi,
While indexing, a norm value is calculated for each field and
stored in the index. This norm value acts as a field-level boost and is
multiplied with other factors such as tf-idf and the query-level
boost that you specify with setBoost. So you see, setting boosting is one
of the
Hi,
If you are comparing two song titles, which are usually very short, you are
better off using a custom set of several features rather than only one of
cosine, Levenshtein, or Jaccard. You may use a combination of the
following:
1. Cosine similarity score
2. Jaccard overlap coefficient
3. how many words in
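A minimal sketch of the second feature in the list above, the Jaccard overlap coefficient, computed on the word sets of two titles. The class name TitleJaccard, the tokenization rule (lower-casing and splitting on non-alphanumeric characters), and the sample titles are assumptions made for illustration; they are not from the original message, whose third feature is cut off and is not reconstructed here.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
final class TitleJaccard {

    // Assumption: titles are ASCII; lower-case and split on anything that is not a letter or digit.
    static Set<String> words(String title) {
        Set<String> out = new HashSet<>();
        for (String w : title.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!w.isEmpty()) {
                out.add(w);
            }
        }
        return out;
    }

    // Jaccard coefficient: |A intersect B| / |A union B| over the two word sets.
    static double jaccard(String a, String b) {
        Set<String> wa = words(a);
        Set<String> wb = words(b);
        if (wa.isEmpty() && wb.isEmpty()) {
            return 1.0; // two empty titles are treated as identical
        }
        Set<String> intersection = new HashSet<>(wa);
        intersection.retainAll(wb);
        Set<String> union = new HashSet<>(wa);
        union.addAll(wb);
        return (double) intersection.size() / union.size();
    }

    public static void main(String[] args) {
        System.out.println(jaccard("Hey Jude", "Hey Jude (Remastered)"));
    }
}
```

The score could then be used as one value in a feature vector next to the other measures, rather than as the only similarity signal.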
Hi,
While indexing the documents, store the term vectors for the content
field. For each document you will then have an array of terms and their
corresponding frequencies in the document. Using the IndexReader you can
retrieve these term vectors. Similarity between two documents can be
computed
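The message is cut off before it names a similarity measure. One common choice, shown here only as a sketch, is cosine similarity over the term-frequency vectors. The code assumes the terms and frequencies have already been read from the index into plain maps; the class name TermVectorCosine and the sample terms are hypothetical, and no Lucene calls are used.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper working on term -> frequency maps, not on Lucene objects.
final class TermVectorCosine {

    // Euclidean length of a term-frequency vector.
    static double norm(Map<String, Integer> vector) {
        double sum = 0.0;
        for (int freq : vector.values()) {
            sum += (double) freq * freq;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity: dot product divided by the product of the vector lengths.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    public static void main(String[] args) {
        Map<String, Integer> doc1 = new HashMap<>();
        doc1.put("lucene", 2);
        doc1.put("index", 1);
        Map<String, Integer> doc2 = new HashMap<>();
        doc2.put("lucene", 1);
        doc2.put("search", 1);
        System.out.println(cosine(doc1, doc2));
    }
}
```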
Hi,
I am using Lucene 4.8. I already have an index. I want to use the
Free text suggester feature when a user queries the index. I am not sure
how to start with this. A sample code snippet or a pointer to one would be
really helpful.
Thanks,
Parnab
Have a look at this article if you have not already gone through it.
http://blog.mikemccandless.com/2011/06/lucenes-near-real-time-search-is-fast.html
On Thu, Aug 14, 2014 at 11:16 PM, Michael Jennings
mike.c.jenni...@gmail.com wrote:
Hi everyone,
I'm a bit of a Lucene newb, but a fairly
TF is straightforward: you can simply count the number of occurrences in the
doc by simple string matching. For IDF you need to know the total number of docs in
the collection and the number of docs containing the bigram. reader.maxDoc() will
give you the total number of docs in the collection. To calculate the number
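As a rough sketch of the arithmetic described above: TF is counted by plain string matching, and IDF uses the textbook form log(N / df). That formula is an assumption and is not necessarily what Lucene's own scoring uses. The class name BigramTfIdf is hypothetical, totalDocs stands for the value returned by reader.maxDoc(), and docsWithBigram must be supplied by the caller, since the message is cut off before it explains how to obtain it.

```java
// Hypothetical helper; plain Java, no Lucene calls.
final class BigramTfIdf {

    // TF: count non-overlapping occurrences of the bigram by simple string matching.
    static int termFrequency(String text, String bigram) {
        if (bigram.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(bigram, from)) != -1) {
            count++;
            from += bigram.length();
        }
        return count;
    }

    // IDF, assumed textbook variant: log(total docs / docs containing the bigram).
    static double idf(int totalDocs, int docsWithBigram) {
        if (docsWithBigram <= 0) {
            return 0.0; // bigram never seen; avoid division by zero
        }
        return Math.log((double) totalDocs / docsWithBigram);
    }

    static double tfIdf(String text, String bigram, int totalDocs, int docsWithBigram) {
        return termFrequency(text, bigram) * idf(totalDocs, docsWithBigram);
    }

    public static void main(String[] args) {
        String doc = "new york is not old york; new york";
        System.out.println(tfIdf(doc, "new york", 1000, 10));
    }
}
```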
Download the Lucene source code and check the demo source files that are
shipped with it. You should find a sample indexing file.
On Thu, Jun 26, 2014 at 9:27 PM, Venkata krishna venkat1...@gmail.com
wrote:
Hi,
I have to index millions of files, that's why I am thinking batch wise
Just add the Lucene JAR files to the build path of the project.
On Sat, Sep 28, 2013 at 5:04 PM, sajad naderi sajad_nader...@yahoo.com wrote:
Hi,
I want to run the code samples from the Lucene in Action book in Eclipse.
Please tell me how to configure Eclipse to run that code.
Hi Rajashekhar,
Yes, it is possible. You can form a BooleanQuery which will match the
documents as per your required conditions. Then you can delete the
respective documents by their document ids by instantiating an IndexReader.
You can refer to the book Lucene in Action, 2nd Edition, for more details.
Thanks,
Hi Deepak,
Lucene already has multi-language support. For any language you just need
to write a custom Analyzer for that language. While indexing you can
configure the indexer to use the custom analyzer as and when needed.
During searching the same applies. You just need to provide
Erick
On Sun, Sep 30, 2012 at 8:02 AM, parnab kumar parnab.2...@gmail.com
wrote:
Hi Erick,
Can you please share your thoughts on the following :
Since Lucene by default does vector space scoring, the
weight component for a term from the document is nothing
, otherwise the words are
indistinguishable.
Best
Erick
On Sat, Sep 29, 2012 at 12:23 PM, parnab kumar parnab.2...@gmail.com
wrote:
Hi All,
I have an algorithm by which I measure the importance of a
term in a document. While indexing I want to store weight with respect
Hi,
Use IndexReader instead. You can loop through the index and
read one document at a time.
Thanks,
Parnab
On Mon, Oct 1, 2012 at 10:33 AM, Selvakumar vvekselva...@gmail.com wrote:
Hi,
I'm new to Lucene and I am reading the docs on Lucene.
I read through the Lucene Index