: (d) Now we might get stupid (or erroneous)
: few-word docs as top results;
: (e) To solve this, pivoted doc-length-norm punishes too
: long docs (longer than the average) but only slightly
: rewards docs that are shorter than the average.
I get that your calculation is much more gr
> However ... i still think that if you realy want
> a length norm that takes into account the average
> length of the docs, you want one that rewards docs
> for being near the average ...
... like SweetSpotSimilarity (SSS)
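A minimal sketch of the SweetSpotSimilarity idea referenced above, assuming
the 2007-era contrib class org.apache.lucene.misc.SweetSpotSimilarity and a
made-up average length of ~400 terms (the plateau bounds and steepness below
are illustrative, not from this thread):

    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.misc.SweetSpotSimilarity;

    public class SweetSpotSetup {
      // SSS rewards docs for being *near* the average: every length inside
      // the [min, max] plateau gets the same (best) lengthNorm, and the
      // norm decays as the length moves away from that band.
      public static void useSweetSpot(IndexWriter writer) {
        SweetSpotSimilarity sss = new SweetSpotSimilarity();
        // plateau around an assumed average of ~400 terms; the steepness
        // value (0.5) controls how fast docs outside the band are penalized
        sss.setLengthNormFactors(300, 500, 0.5f);
        writer.setSimilarity(sss); // norms are baked in at indexing time
      }
    }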
> it doesn't seem to make a lot of sense to me to say
> that a doc whose
: The Similarity portion of the payload functionality could be used for
: scoring binary fields.
that can be used as a hook to decide how to evaluate an arbitrary byte[]
payload as a float for the purposes of scoring -- but it doesn't address
the problem of how we write/read a payload which is
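For the write side, here is a sketch of one possible (made-up) encoding:
packing a float into a 4-byte payload with a TokenFilter, using the Lucene
2.2-era Payload API. How the corresponding Similarity should decode it is
exactly the open question above:

    import java.io.IOException;
    import org.apache.lucene.analysis.Token;
    import org.apache.lucene.analysis.TokenFilter;
    import org.apache.lucene.analysis.TokenStream;
    import org.apache.lucene.index.Payload;

    public class FloatPayloadFilter extends TokenFilter {
      private final float boost;

      public FloatPayloadFilter(TokenStream in, float boost) {
        super(in);
        this.boost = boost;
      }

      public Token next() throws IOException {
        Token t = input.next();
        if (t != null) {
          // big-endian encoding of the float; the reader must agree on this
          int bits = Float.floatToIntBits(boost);
          byte[] b = new byte[] {
            (byte) (bits >>> 24), (byte) (bits >>> 16),
            (byte) (bits >>> 8),  (byte) bits };
          t.setPayload(new Payload(b));
        }
        return t;
      }
    }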
: Yes, actually: 1 / sqrt((1 - Slope) * Pivot + (Slope) * Doclen)
interesting ... it doesn't really seem like there is any direct
relationship between your average length (Pivot) and your Doclen --
on the surface when i first read your example it seemed like it had more
to do with the shifting o
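A minimal sketch (not the actual patch under discussion) of that formula as
a Lucene 2.x Similarity; the pivot - the average field length - is not
something Lucene tracks for you here, so it has to be measured externally
and passed in:

    import org.apache.lucene.search.DefaultSimilarity;

    public class PivotedLengthNormSimilarity extends DefaultSimilarity {
      private final float slope; // in (0,1]; smaller = flatter curve
      private final float pivot; // average field length, measured externally

      public PivotedLengthNormSimilarity(float slope, float pivot) {
        this.slope = slope;
        this.pivot = pivot;
      }

      // norm = 1 / sqrt((1 - slope) * pivot + slope * numTerms):
      // longer-than-average fields are punished, shorter-than-average
      // fields are only slightly rewarded, per (d)/(e) above
      public float lengthNorm(String fieldName, int numTerms) {
        return (float) (1.0 / Math.sqrt((1 - slope) * pivot + slope * numTerms));
      }
    }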
Chris Hostetter wrote:
> isn't that just a flat line with a slope relative to the
> specified "Slope"
> ? your pivot just seems to affect the y-intercept (which would be the
> lengthNorm for a field containing 0 terms) but doesn't that cancel out of
> any scoring equation since the fieldNorm is multiplied
On Jul 16, 2007, at 9:24 PM, Chris Hostetter wrote:
Hmmm... perhaps what we need is a generalization of the payload API to
allow storing/reading payloads on a per document, per field, or per index
basis ... along with some sort of "PayloadMerger" that could be used by
IndexWriter when merging
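Nothing like that exists yet - but purely as a strawman, the hook being
floated might look something like this (hypothetical names throughout, not
a real Lucene API):

    // Hypothetical only: a sketch of the "PayloadMerger" idea from the
    // mail above. Nothing like this exists in Lucene; names are invented.
    public interface PayloadMerger {
      /**
       * Combine the per-document/field/index payload values from the
       * segments being merged into the value for the merged segment.
       */
      byte[] merge(byte[][] segmentPayloads);
    }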
> : I think both are not good enough for large dynamic collections.
> : Both are good enough for experiments. But it should be more
> : efficient in a working dynamic large system.
>
> Hmmm... perhaps what we need is a generalization of the payload API to
> allow storing/reading payloads on a per document, per field, or per index
> basis
: Basically it is
: (1 - Slope) * Pivot + (Slope) * Doclen
: Where Pivot reflects the average doc length, and
: Smaller Slope reduces the amount by which short docs
: are preferred over long ones. In collection with very
isn't that just a flat line with a slope relative to the specified "Slope"
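To make the two formulas concrete, a small worked comparison (my numbers,
not from the thread), assuming Pivot = 100 (the average length) and
Slope = 0.25:

    default norm:  1/sqrt(len)
    pivoted norm:  1/sqrt((1 - 0.25)*100 + 0.25*len) = 1/sqrt(75 + 0.25*len)

    len =   10:  default = 0.316   pivoted = 0.114
    len =  100:  default = 0.100   pivoted = 0.100   (equal at the pivot)
    len = 1000:  default = 0.032   pivoted = 0.055

So relative to the default, short docs gain far less and long docs lose far
less; the two curves cross exactly at the average length.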
Chris Hostetter wrote:
> i guess i'm not following how exactly your pivoted norm calculation works
> ... it sounds like you are still rewarding 1-term-long fields more than
True.
> any other length ... is the distinction between your approach and the
> default implementation just that the default
: Thanks for your comments Chris, and sorry for the delayed
my turn for a delayed response ... i figured there was no rush since you
were offline for 10 days :)
: I didn't try this - passing the computed avg doc length to
: SweetSpotSimilarity (SSS) - it would be interesting to try. I wonder
: h
Thanks for your comments Chris, and sorry for the delayed
response - you raised some tough questions for me, and I
felt I had to clear my thoughts on this before replying.
(Well, as you'll see below they are not too clear now either,
but I am going to be off-line for the next ~10 days, so
decided
Is this the paper that you are referring to?
A. Chowdhury, D. Grossman, O. Frieder, C. McCabe, "Document
Normalization Revisited" , ACM-SIGIR, August 2002.
http://ir.iit.edu/~abdur/publications/p381-chowdhury.pdf
-Sean
Doron Cohen wrote on 6/30/2007, 4:56 AM:
> In particular for TREC
> data,
Doug Cutting wrote:
> We should be careful not to tune things too much for any one application
> and/or dataset. Tools to perform evaluation would clearly be valuable.
> But changes that improve Lucene's results on TREC data may or may not
> be of general utility. The best way to tune an application
Nadav Har'El wrote:
> Another approach is to use Term Relevance Sets, described in [1].
> This new approach not only requires less manual labor than
> TREC's approach,
> but also works better when the corpus is evolving.
>
> [1] "Scaling IR-System Evaluation using Term Relevance Sets",
> Einat Amitay
Sent: Monday, June 25, 2007 8:48:03 PM
Subject: Re: search quality - assessment & improvements
On Jun 25, 2007, at 2:19 PM, Doron Cohen wrote:
>> IANAL and I didn't read the link, but I think people publish their
>> MAP scores, etc. all the time on TREC data. I think it implies
On Mon, Jun 25, 2007, Grant Ingersoll wrote about "Re: search quality -
assessment & improvements":
> 1. Create our own judgements on Wikipedia or the Reuters collection.
> This is no doubt hard and would require a fair number of volunteers
> and could/would compete
: For the first change, logic is that Lucene's default length normalization
: punishes long documents too much. I found contrib's sweet-spot-similarity
: helpful here, but not enough. I found that a better doc-length
: normalization method is one that considers collection statistics - e.g.
: average
Marvin Humphrey wrote:
Wikipedia is a moving target. I think the collection would have to be
static.
In theory, one can evaluate against other search engines' results for
Wikipedia. However this may violate their EULAs...
Doug
Yes, you are correct - we could use the specific version that we use
for benchmarking. I was assuming that one, just didn't say it! :-)
-Grant
On Jun 25, 2007, at 3:00 PM, Marvin Humphrey wrote:
On Jun 25, 2007, at 11:56 AM, Grant Ingersoll wrote:
To do this, we could use Reuters or Wikipedia.
On Jun 25, 2007, at 11:56 AM, Grant Ingersoll wrote:
To do this, we could use Reuters or Wikipedia. The hard part is
generating the queries and having people make relevance judgments
for a sufficient sample size.
Wikipedia is a moving target. I think the collection would have to
be static.
On Jun 25, 2007, at 2:04 PM, Doug Cutting wrote:
Doron Cohen wrote:
It is very important that we be able to assess the search
quality in a repeatable manner - so that anyone can repeat the quality
tests, and maybe find ways to improve them. (This would also allow
verifying the "improvements claims" above...)
On Jun 25, 2007, at 2:19 PM, Doron Cohen wrote:
IANAL and I didn't read the link, but I think people publish their
MAP scores, etc. all the time on TREC data. I think it implies that
you obtained the data through legal means.
So you're saying that if person "X" got the TREC data legally, we
Hey Grant, thanks for your comments!
Grant Ingersoll wrote:
> As I am sure you are aware: https://issues.apache.org/jira/browse/LUCENE-836
I remembered you mentioning setting up our own doc/query judgment system but
forgot it was in LUCENE-836, thanks for the reminder.
> On Jun 25, 2007, at 3:15 AM, Doron Cohen wrote:
Doron Cohen wrote:
It is very important that we be able to assess the search quality in
a repeatable manner - so that anyone can repeat the quality tests, and
maybe find ways to improve them. (This would also allow verifying the
"improvements claims" above...). This capability seems like a
Just to throw in a few things:
First off, this is great!
As I am sure you are aware: https://issues.apache.org/jira/browse/LUCENE-836
On Jun 25, 2007, at 3:15 AM, Doron Cohen wrote:
hi, this could probably split into two threads but for context
let's start
it in a single discussion;
Recently I was looking at the search quality of Lucene - Recall and
Precision, focused at P@1,5,10,20 and, mainly, MAP.
hi, this could probably split into two threads but for context let's start
it in a single discussion;
Recently I was looking at the search quality of Lucene - Recall and
Precision, focused at P@1,5,10,20 and, mainly, MAP.
-- Part 1 --
I found out that quality can be enhanced by mo
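For anyone wanting to reproduce measurements like these, a self-contained
sketch of the two measures (not Lucene code; doc ids and judgments here are
just plain Java collections):

    import java.util.List;
    import java.util.Set;

    public class SearchQualityMeasures {
      /** P@n: fraction of the top n ranked docs that are judged relevant. */
      public static double precisionAt(List<String> ranked,
                                       Set<String> relevant, int n) {
        int hits = 0;
        for (int i = 0; i < n && i < ranked.size(); i++) {
          if (relevant.contains(ranked.get(i))) hits++;
        }
        return (double) hits / n;
      }

      /** Average precision for one query; MAP is its mean over all queries. */
      public static double averagePrecision(List<String> ranked,
                                            Set<String> relevant) {
        int hits = 0;
        double sum = 0.0;
        for (int i = 0; i < ranked.size(); i++) {
          if (relevant.contains(ranked.get(i))) {
            hits++;
            sum += (double) hits / (i + 1); // precision at this hit's rank
          }
        }
        return relevant.isEmpty() ? 0.0 : sum / relevant.size();
      }
    }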