Hi,
While indexing, a norm value is calculated for each field and
stored in the index. This norm value acts as a field-level boost,
which is multiplied with other factors such as tf-idf and the query-level
boost that you specify with setBoost. So, as you see, setting a boost is one
of the s
Hi,
If you are comparing two song titles, which are usually very short, you are
better off using a custom set of several features rather than just one of
cosine, Levenshtein, or Jaccard. You may use a combination of the
following:
1. Cosine similarity score
2. Jaccard overlap coefficient
3. how many words in th
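
The first two features in this list can be sketched in plain Java as
follows. This is only a minimal illustration: the class name
TitleSimilarity, the whitespace tokenization and the lowercasing are
assumptions made for the example, not something the original message
specifies.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: word-level similarity features for two short titles.
final class TitleSimilarity {

    // Assumed tokenization: lowercase, then split on whitespace.
    static String[] tokenize(String title) {
        String t = title.toLowerCase(Locale.ROOT).trim();
        return t.isEmpty() ? new String[0] : t.split("\\s+");
    }

    // Jaccard coefficient: |A intersect B| / |A union B| over the word sets.
    static double jaccard(String a, String b) {
        Set<String> sa = new HashSet<>(Arrays.asList(tokenize(a)));
        Set<String> sb = new HashSet<>(Arrays.asList(tokenize(b)));
        if (sa.isEmpty() && sb.isEmpty()) {
            return 0.0;
        }
        Set<String> intersection = new HashSet<>(sa);
        intersection.retainAll(sb);
        Set<String> union = new HashSet<>(sa);
        union.addAll(sb);
        return (double) intersection.size() / union.size();
    }

    // Word counts for one title.
    static Map<String, Integer> counts(String title) {
        Map<String, Integer> result = new HashMap<>();
        for (String word : tokenize(title)) {
            result.merge(word, 1, Integer::sum);
        }
        return result;
    }

    // Cosine similarity between the word-count vectors of the two titles.
    static double cosine(String a, String b) {
        Map<String, Integer> ca = counts(a);
        Map<String, Integer> cb = counts(b);
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (Map.Entry<String, Integer> e : ca.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = cb.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : cb.values()) {
            normB += v * v;
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

The two scores could then be used as separate features, for example
TitleSimilarity.jaccard(t1, t2) and TitleSimilarity.cosine(t1, t2), and
combined with whatever weighting suits the data.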
Hi,
While indexing the documents, store the term vectors for the content
field. For each document you will then have an array of terms and their
corresponding frequencies in the document. Using the IndexReader you can
retrieve these term vectors. Similarity between two documents can be
computed as
Hi,
I am using Lucene 4.8 and already have an index. I want to use the
free text suggester feature when a user queries the index. I am not sure
how to start with this. A sample code snippet or a pointer to one would be
really helpful.
Thanks,
Parnab
Have a look at this article if you have not already gone through it.
http://blog.mikemccandless.com/2011/06/lucenes-near-real-time-search-is-fast.html
On Thu, Aug 14, 2014 at 11:16 PM, Michael Jennings <
mike.c.jenni...@gmail.com> wrote:
> Hi everyone,
>
> I'm a bit of a Lucene newb, but a fairl
TF is straightforward: you can simply count the number of occurrences in the
doc by simple string matching. For IDF you need to know the total number of
docs in the collection and the number of docs containing the bigram.
reader.maxDoc() will give you the total number of docs in the collection.
To calculate the number o
Download the Lucene source code and check the demo source files that are
shipped with it. You should find a sample indexing file there.
On Thu, Jun 26, 2014 at 9:27 PM, Venkata krishna
wrote:
> Hi,
>
> I have to index millions of files, that's why i am thinking batch wise
> indexing is good.
>
>
Just add the Lucene JAR files to the build path of the project.
On Sat, Sep 28, 2013 at 5:04 PM, sajad naderi wrote:
> Hi,
> I want to run the code samples from the book "Lucene in Action" in Eclipse.
> Please tell me how to configure Eclipse to run that code.
>
Hi Rajashekhar,
Yes, it is possible. You can form a BooleanQuery that matches the
documents as per your required conditions. Then you can delete by the
respective document ids by instantiating an IndexReader.
You can refer to the book Lucene in Action, 2nd Edition, for more details.
Thanks,
Parn
Hi Deepak,
Lucene already has multi-language support. For any language you just need
to write a custom Analyzer for that language. While indexing you can
configure the indexer to use the custom analyzer as and when needed.
The same applies during searching. You just need to provide the
t
> Erick
>
> On Sun, Sep 30, 2012 at 8:02 AM, parnab kumar
> wrote:
> > Hi Erick,
> > Can you please share your thoughts on the following :
> > Since lucene by default does vector space scoring , the
> > weight component for a term from
Hi,
Use IndexReader instead. You can loop through the index and
read one document at a time.
Thanks,
Parnab
On Mon, Oct 1, 2012 at 10:33 AM, Selvakumar wrote:
> Hi,
>
> I'm new to Lucene and I am reading the docs on Lucene.
>
>
> I read through the Lucene Index File Format, so to e
are
> indistinguishable.
>
> Best
> Erick
>
> On Sat, Sep 29, 2012 at 12:23 PM, parnab kumar
> wrote:
> > Hi All,
> >
> >I have an algorithm by which i measure the importance of a
> term
> > in a document . While indexing i want to store weig
13 matches