Re: Jensen–Shannon divergence

2015-12-13 Thread Shay Hummel
Hi, I am sorry, but I didn't understand your answer. Can you please elaborate? Shay On Sun, Dec 13, 2015 at 3:41 PM will martin wrote: > expand your due diligence beyond wikipedia: > i.e. > > http://ciir.cs.umass.edu/pubfiles/ir-464.pdf > > > > > On Dec 13, 2015

Jensen–Shannon divergence

2015-12-13 Thread Shay Hummel
proach - specifically the LMDirichlet. The similarity will be calculated using the JS-Div between the document model and the query model. Is it possible? If so, how? Thank you, Shay Hummel -- Regards, Shay Hummel
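For reference, the JS-Div mentioned in this message can be sketched in plain Java as below. This is a minimal illustration, not Lucene API: it assumes the document model and the query model have already been estimated as probability arrays over the same vocabulary, and the class and method names (`JsDivergenceSketch`, `kl`, `jsDivergence`) are hypothetical. It uses natural logarithms, so the result lies between 0 and ln 2.

```java
final class JsDivergenceSketch {

    /** KL(p || m) in nats; terms with p[i] == 0 contribute 0 by convention. */
    static double kl(double[] p, double[] m) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            if (p[i] > 0.0) {
                sum += p[i] * Math.log(p[i] / m[i]);
            }
        }
        return sum;
    }

    /** JSD(p, q) = 0.5 * KL(p || m) + 0.5 * KL(q || m), with m = (p + q) / 2. */
    static double jsDivergence(double[] p, double[] q) {
        if (p.length != q.length) {
            throw new IllegalArgumentException("distributions must share one vocabulary");
        }
        double[] m = new double[p.length];
        for (int i = 0; i < p.length; i++) {
            // m[i] is positive whenever p[i] or q[i] is, so kl() never divides by zero
            m[i] = 0.5 * (p[i] + q[i]);
        }
        return 0.5 * kl(p, m) + 0.5 * kl(q, m);
    }

    public static void main(String[] args) {
        // Hypothetical term distributions over a three-word vocabulary
        double[] documentModel = {0.5, 0.3, 0.2};
        double[] queryModel = {0.1, 0.1, 0.8};
        System.out.println(jsDivergence(documentModel, queryModel));
    }
}
```

A lower value means the two models are closer, so a ranking based on it would sort documents by ascending divergence (or by its negation as a score).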

Re: Tf and Df in lucene

2015-06-15 Thread Shay Hummel
freq, float docLen){ > return freq; > } > > When you use this similarity, search for three term query, scores will > summed tf values. Also you can extract additional info from explain feature. > > Ahmet > > > > > On Monday, June 15, 2015 5:50 PM, Shay Hummel >

Re: Tf and Df in lucene

2015-06-15 Thread Shay Hummel
, > field, term.bytes()); > > if (postingsEnum == null) return; > > int max = 0; > while (postingsEnum.nextDoc() != PostingsEnum.NO_MORE_DOCS) { > final int freq = postingsEnum.freq(); > int docID = postingsEnum.docID();} > > > Ahmet > > > > > On Monday, J

Tf and Df in lucene

2015-06-14 Thread Shay Hummel
Hi, I was wondering: what is the easiest way to get the term frequency of a term t in document d, namely tf(t,d)? In the same spirit, what is the easiest way to get the document frequency of a term in the collection, i.e. how many documents contain the term t, namely df(t)? Regards, Shay
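To pin down the two quantities asked about here, the following plain-Java sketch computes tf(t,d) and df(t) over a small in-memory collection. It illustrates the definitions only and does not use the Lucene API; the class name `TfDfSketch`, the method names, and the sample documents are hypothetical, and documents are assumed to be already tokenized.

```java
import java.util.Arrays;
import java.util.List;

final class TfDfSketch {

    /** tf(t, d): how many times the term occurs in one tokenized document. */
    static int tf(String term, List<String> docTokens) {
        int count = 0;
        for (String token : docTokens) {
            if (token.equals(term)) {
                count++;
            }
        }
        return count;
    }

    /** df(t): how many documents in the collection contain the term at least once. */
    static int df(String term, List<List<String>> collection) {
        int count = 0;
        for (List<String> doc : collection) {
            if (doc.contains(term)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical, already-tokenized documents
        List<String> d1 = Arrays.asList("book", "a", "flight", "book");
        List<String> d2 = Arrays.asList("read", "the", "book");
        List<String> d3 = Arrays.asList("flight", "to", "london");
        List<List<String>> collection = Arrays.asList(d1, d2, d3);

        System.out.println("tf(book, d1) = " + tf("book", d1));
        System.out.println("df(book) = " + df("book", collection));
    }
}
```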

Re: Text dependent analyzer

2015-04-20 Thread Shay Hummel
ence detection outside of the Lucene. >> > >> > By the way, I remember there was a way to consume all upstream token >> stream. >> > >> > I think it was consuming all input and injecting one concatenated huge >> term/token. >> > >> > Ke

Re: Text dependent analyzer

2015-04-15 Thread Shay Hummel
of the solr, using opennlp for > instance, and then feed them to solr. > > https://opennlp.apache.org/documentation/1.5.2-incubating/manual/opennlp.html#tools.sentdetect > > Ahmet > > > > > On Tuesday, April 14, 2015 8:12 PM, Shay Hummel > wrote: > Hi > I

Span near query with payloads

2015-04-14 Thread Shay Hummel
Doc2: I read the book the Hobbit during the flight to London. (book is a noun, flight is a noun) When I searched "book flight" (with verb and noun as payloads, respectively) it worked properly for the correct window (which depends on whether I removed stopwords or not). So what did you mean by this note? Thank you, Shay Hummel

Text dependent analyzer

2015-04-14 Thread Shay Hummel
English analyzer (createComponent). However, this does not depend on the text it receives - which is the first part of what I am trying to do. So ... how can it be achieved? Thank you, Shay Hummel