Re[2]: lucene scoring

2008-08-08 Thread Александр Аристов
Query-independent means that the threshold should represent the same relevance for all queries, so that found docs scoring below it are discarded. The current scoring implementation gives no guarantee that, say, two documents found by two different queries, both with a score of 0.5, are of the same quality. I don't

Re[4]: lucene scoring

2008-08-08 Thread Александр Аристов
Relevance ranking is an option, but we still won't be able to compare results. Let's say we have distributed searching: in this case the top 10 from one server are not comparable with those from another. Even worse, in the resulting set the document with the top score may be worse than
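
To illustrate the distributed-search concern above, here is a minimal sketch. It assumes each server computes inverse document frequency from its own local statistics, using the classic Lucene-style formula 1 + ln(numDocs / (docFreq + 1)). The shard sizes and document frequencies are invented for illustration, and the names are hypothetical.

```python
import math

def idf(num_docs: int, doc_freq: int) -> float:
    """Classic Lucene-style inverse document frequency: 1 + ln(N / (df + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical shards of equal size: the same term is rare on A, common on B.
shard_a = {"num_docs": 1000, "doc_freq": 9}
shard_b = {"num_docs": 1000, "doc_freq": 499}

idf_a = idf(**shard_a)  # term weight as seen by server A
idf_b = idf(**shard_b)  # term weight as seen by server B

# Weight the term would get if both servers shared global statistics.
global_idf = idf(
    shard_a["num_docs"] + shard_b["num_docs"],
    shard_a["doc_freq"] + shard_b["doc_freq"],
)

print(f"shard A idf={idf_a:.3f}, shard B idf={idf_b:.3f}, global idf={global_idf:.3f}")
```

With local statistics, an identical document would score higher on server A than on server B for this term, so merging the two top-10 lists by raw score can misorder results.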

[jira] Updated: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-08-08 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eks Dev updated LUCENE-1219: Attachment: LUCENE-1219.extended.patch Mike, This new patch includes take3 and adds the following:

[jira] Commented: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-08-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620960#action_12620960 ] Michael McCandless commented on LUCENE-1219: Eks, could we instead add this to

[jira] Commented: (LUCENE-1350) Filters which are consumers should not reset the payload or flags and should better reuse the token

2008-08-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620970#action_12620970 ] Michael McCandless commented on LUCENE-1350: It seems like there are three

[jira] Updated: (LUCENE-1329) Remove synchronization in SegmentReader.isDeleted

2008-08-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-1329: Attachment: LUCENE-1329.patch I took a first cut at creating an explicit read only

[jira] Commented: (LUCENE-1329) Remove synchronization in SegmentReader.isDeleted

2008-08-08 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621024#action_12621024 ] Yonik Seeley commented on LUCENE-1329: bq. I didn't do this one yet ... it makes me

Re: Re[2]: lucene scoring

2008-08-08 Thread Doron Cohen
The following suggestion is weaker than the requested functionality, but maybe you'll find the concept useful for ignoring so-called garbage results. Assume that the query is a simple OR query made of a few words. By examining the frequencies of these words in the index (their DFs) devise a synthetic

[jira] Commented: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-08-08 Thread Eks Dev (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621036#action_12621036 ] Eks Dev commented on LUCENE-1219: bq. could we instead add this to Field: byte[]

[jira] Commented: (LUCENE-1350) Filters which are consumers should not reset the payload or flags and should better reuse the token

2008-08-08 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621041#action_12621041 ] Doron Cohen commented on LUCENE-1350: Mike, thanks for clearing things... You're

[jira] Issue Comment Edited: (LUCENE-1350) Filters which are consumers should not reset the payload or flags and should better reuse the token

2008-08-08 Thread Doron Cohen (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621041#action_12621041 ] doronc edited comment on LUCENE-1350 at 8/8/08 1:18 PM: Mike,

[jira] Commented: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-08-08 Thread Yonik Seeley (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621043#action_12621043 ] Yonik Seeley commented on LUCENE-1219: bq. Also ... it'd be nice to have a way to do

[jira] Updated: (LUCENE-1350) Filters which are consumers should not reset the payload or flags and should better reuse the token

2008-08-08 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DM Smith updated LUCENE-1350: Attachment: LUCENE-1350.patch {quote} Should we just absorb this issue into LUCENE-1333? DM, of your

[jira] Updated: (LUCENE-1333) Token implementation needs improvements

2008-08-08 Thread DM Smith (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DM Smith updated LUCENE-1333: Attachment: LUCENE-1333.patch This patch includes all the previous ones. Note: It includes the

[jira] Commented: (LUCENE-753) Use NIO positional read to avoid synchronization in FSIndexInput

2008-08-08 Thread Matthew Mastracci (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621091#action_12621091 ] Matthew Mastracci commented on LUCENE-753: bq. Is the index itself corrupt, ie,

[jira] Commented: (LUCENE-1219) support array/offset/length setters for Field with binary data

2008-08-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1219?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621099#action_12621099 ] Michael McCandless commented on LUCENE-1219: bq. realized I am missing actual

Re: Re[4]: lucene scoring

2008-08-08 Thread J. Delgado
The only scores I can think of that can measure quality across different queries are invariant scores such as PageRank. That is, score the document on its general information value and then use that as a filter regardless of the query. This is very different from the problem of trying to
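
The filtering idea above can be sketched as follows. This is only an illustration under stated assumptions: every document carries a precomputed query-independent quality value (called static_score here, a hypothetical name), and the threshold of 0.5 is arbitrary. How such a value would be computed is not covered by the message.

```python
from typing import List, NamedTuple

class Hit(NamedTuple):
    doc_id: str
    query_score: float   # query-dependent relevance score
    static_score: float  # hypothetical query-independent quality score

def filter_by_static_score(hits: List[Hit], min_static: float) -> List[Hit]:
    """Keep hits whose query-independent score meets the threshold.

    The original (query-dependent) ranking order is preserved.
    """
    return [h for h in hits if h.static_score >= min_static]

# Made-up results, already ranked by query_score.
hits = [
    Hit("a", 0.9, 0.1),
    Hit("b", 0.5, 0.8),
    Hit("c", 0.4, 0.6),
]

kept = filter_by_static_score(hits, min_static=0.5)
print([h.doc_id for h in kept])
```

Because the threshold applies to a value that does not depend on the query, the same cutoff means the same thing for every query, which is the property the thread is asking for.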

[jira] Commented: (LUCENE-1333) Token implementation needs improvements

2008-08-08 Thread Michael McCandless (JIRA)
[ https://issues.apache.org/jira/browse/LUCENE-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12621101#action_12621101 ] Michael McCandless commented on LUCENE-1333: OK since you pulled it all