Re: score from two cores

2011-02-16 Thread linkedLetter

A common problem in metasearch engines. It's not intractable. You just have to
surface the right statistics into a 'fusion' scorer.
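
As a rough illustration of what surfacing the right statistics could mean,
the sketch below pools the document counts and document frequencies for one
term across cores and derives a single IDF from them, so that both cores
would weight the term the same way. It assumes the classic Lucene TF-IDF
formula, idf = 1 + ln(numDocs / (docFreq + 1)). The CoreStats type, the
function names and all the numbers are hypothetical and are not part of any
Solr API.

```python
import math
from dataclasses import dataclass


@dataclass
class CoreStats:
    """Per-core statistics for one term (hypothetical container)."""
    num_docs: int   # total documents in the core
    doc_freq: int   # documents in the core that contain the term


def classic_idf(num_docs: int, doc_freq: int) -> float:
    # Classic Lucene TF-IDF inverse document frequency.
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def pooled_idf(stats: list[CoreStats]) -> float:
    # Sum the raw counts across cores, then compute one shared IDF.
    total_docs = sum(s.num_docs for s in stats)
    total_df = sum(s.doc_freq for s in stats)
    return classic_idf(total_docs, total_df)


# Made-up numbers: a large metadata core and a small PDF core.
metadata = CoreStats(num_docs=100_000, doc_freq=50)
pdfs = CoreStats(num_docs=2_000, doc_freq=400)

shared = pooled_idf([metadata, pdfs])
print(f"shared idf: {shared:.3f}")
```

A fusion scorer built this way would still need the other scoring inputs
(term frequency, length norms, query normalization) handled consistently;
only the IDF part is shown here.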

-
NOT always nice. When are we getting better releases?
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/score-from-two-cores-tp2012444p2515617.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: score from two cores

2010-12-03 Thread Erick Erickson
Uhhm, what are you trying to do? What do you want to do with the scores from
two cores?

Best
Erick

On Fri, Dec 3, 2010 at 11:21 AM, Ma, Xiaohui (NIH/NLM/LHC) [C]
<xiao...@mail.nlm.nih.gov> wrote:

> I have multiple cores. How can I deal with score?
>
> Thanks so much for help!
> Xiaohui



RE: score from two cores

2010-12-03 Thread Ma, Xiaohui (NIH/NLM/LHC) [C]
Please correct me if I am doing something wrong. I really appreciate your help!

I have a core for metadata (xml files) and a core for pdf documents. Sometimes
I need to search them separately, and sometimes I need to search both of them
together. Each item has the same key in both cores, which relates the two.

For example, the xml files look like the following:
<?xml version="1.0" encoding="ISO-8859-1"?>
<List>
  <Item>
    <Key>rmaaac.pdf</Key>
    <TI>something</TI>
    <UI>rmaaac</UI>
  </Item>
  <Item>
   .
</List>

I index the rmaaac.pdf file with the same Key and UI fields in another core.
Here is an example of the response after I index rmaaac.pdf.
  <?xml version="1.0" encoding="UTF-8" ?>
  <response>
    <lst name="responseHeader">
      <int name="status">0</int>
      <int name="QTime">3</int>
      <lst name="params">
        <str name="indent">on</str>
        <str name="start">0</str>
        <str name="q">collectionid: RM</str>
        <str name="rows">10</str>
        <str name="version">2.2</str>
      </lst>
    </lst>
    <result name="response" numFound="1" start="0">
      <doc>
        <str name="UI">rm</str>
        <str name="Key">rm.pdf</str>
        <str name="metadata_content">something</str>
      </doc>
    </result>

The information displayed to the user comes from the metadata, not from the
pdf files. If I search for a term in the documents, then to display the
results I have to get the Keys from the matching documents and search the
metadata again. The score is then different.

Please give me some suggestions!

Thanks so much,
Xiaohui 




Re: score from two cores

2010-12-03 Thread Erick Erickson
The scores will not be comparable. Scores are only relevant within one search
on one core, so comparing them across two queries (even if it's the same query
but against two different cores) is meaningless.
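
One way to see this is that term weights depend on each index's own
statistics. The following is a minimal sketch, assuming the classic Lucene
TF-IDF formula idf = 1 + ln(numDocs / (docFreq + 1)); the core sizes and
document frequencies are invented for illustration.

```python
import math


def classic_idf(num_docs: int, doc_freq: int) -> float:
    # Classic Lucene TF-IDF: rarer terms get a larger weight.
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Hypothetical statistics for the same query term in each core.
idf_metadata = classic_idf(num_docs=50_000, doc_freq=20)
idf_pdf = classic_idf(num_docs=1_500, doc_freq=600)

print(f"metadata core idf: {idf_metadata:.2f}")
print(f"pdf core idf:      {idf_pdf:.2f}")
```

The same term is rare in one core and common in the other, so it contributes
very differently to the score in each.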

So, given your setup I would just use the results from one of the cores and
fill in data from the other...
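
A minimal sketch of that approach, assuming the pdf core has already been
queried and the matching metadata records have been fetched by Key: keep the
order and scores from the pdf core and attach the display fields from the
metadata core. The dictionaries stand in for Solr responses, and
fill_in_metadata is a hypothetical helper, not a Solr function. The field
names Key, UI and TI come from the examples in this thread; the second key
and the scores are made up.

```python
def fill_in_metadata(pdf_hits, metadata_by_key):
    """Keep ranking and scores from one core; copy display fields from the other."""
    merged = []
    for hit in pdf_hits:  # already sorted by score within the pdf core
        record = metadata_by_key.get(hit["Key"], {})
        merged.append({
            "Key": hit["Key"],
            "score": hit["score"],      # score from the pdf core only
            "TI": record.get("TI"),     # title from the metadata core, if any
            "UI": record.get("UI"),
        })
    return merged


# Stand-in data; "rmaaad.pdf" and the scores are invented.
pdf_hits = [
    {"Key": "rmaaac.pdf", "score": 1.8},
    {"Key": "rmaaad.pdf", "score": 0.7},
]
metadata_by_key = {
    "rmaaac.pdf": {"TI": "something", "UI": "rmaaac"},
}

results = fill_in_metadata(pdf_hits, metadata_by_key)
for row in results:
    print(row)
```

The lookup on the metadata core could be a single query that ORs the Keys
together; its scores are simply ignored.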

But why do you have two cores in the first place? Is it really necessary or
is it just making things more complex?

Best
Erick

 



Re: score from two cores

2010-12-03 Thread Paul
On Fri, Dec 3, 2010 at 4:47 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> But why do you have two cores in the first place? Is it really necessary or
> is it just making things more complex?

I don't know why the OP wants two cores, but I ran into this same
problem and had to abandon using a second core. My use case is: I have
lots of slowly-changing documents, and a few often-changing
documents. Those classes of documents are updated by different people
using different processes. I wanted to split them into separate cores
so that:

1) The large core wouldn't change except deliberately so there would
be less chance of a bug creeping in. Also, that core is the same on
different servers, so they could be replicated.

2) The small core would update and optimize quickly and the data in it
is different on different servers.

The problem is that the search results should return relevancy as if
there were only one core.