Hi,

I am currently putting together a search for a DB that holds resolutions 
along with their metadata, as well as chapters with their text and metadata. 
Most of the searching will actually be done on the metadata. The plan at the 
moment is to support two search modes: (a) one where the results are 
resolutions and (b) another where the results are chapters.

(a) Here I will search both the resolution (document) and chapter data, but 
the actual result entities I want are resolutions. In terms of ranking, I 
obviously want resolutions with more relevant chapters to score higher, so I 
sort of need to group the hits on the chapters when computing the score. For 
good measure I might also want to show the number of chapters that had a 
match, potentially even with links to these chapters, so I would also need 
the IDs of the chapters that matched.
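
To make the grouping idea concrete, here is a rough sketch in Python of what I 
have in mind for (a). It is not tied to any search API: the tuple layout, the 
function name rank_resolutions and the choice to simply sum the chapter scores 
are all assumptions of mine for illustration, and a different combination 
(max, capped sum, ...) might well rank better.

```python
from collections import defaultdict


def rank_resolutions(resolution_hits, chapter_hits):
    """Combine direct resolution hits with hits on their chapters.

    resolution_hits: iterable of (resolution_id, score)
    chapter_hits:    iterable of (chapter_id, resolution_id, score)
    Returns a list of dicts, best combined score first.
    """
    scores = defaultdict(float)
    chapters = defaultdict(list)

    for res_id, score in resolution_hits:
        scores[res_id] += score

    for chap_id, res_id, score in chapter_hits:
        # Assumption: a resolution's score is the plain sum of its own
        # score and the scores of all of its matching chapters.
        scores[res_id] += score
        chapters[res_id].append(chap_id)

    ranked = sorted(scores, key=lambda res_id: scores[res_id], reverse=True)
    return [
        {
            "resolution_id": res_id,
            "score": scores[res_id],
            "matched_chapters": chapters.get(res_id, []),
            "chapter_count": len(chapters.get(res_id, [])),
        }
        for res_id in ranked
    ]
```

The matched_chapters list is what I would use to render the chapter count and 
the links to the matching chapters.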

(b) Here I will just search across the chapters and rank each one on its 
own. Seems straightforward.

Now how should I best structure my index for this?

number of cores:
I guess I will have two cores, one for resolutions and one for chapters? Then 
again, there is some minor overlap in fields between the two and, as far as I 
know, no real overhead in having unused fields, so I could just as well use 
one core.
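
For the one-core variant, I picture something like the following sketch. It 
assumes every indexed entry carries a discriminator field, which I call type 
here (a made-up name, with the values "resolution" and "chapter"), and it only 
uses the standard q, fq, start and rows request parameters. The helper 
build_params is hypothetical as well.

```python
from urllib.parse import urlencode


def build_params(user_query, doc_type=None, start=0, rows=10):
    """Build a query string for a single core holding both entity types.

    doc_type is the value of the (hypothetical) "type" field to filter on;
    None means search resolutions and chapters together.
    """
    params = [("q", user_query), ("start", start), ("rows", rows)]
    if doc_type is not None:
        # Restrict the search to one entity type via a filter query.
        params.append(("fq", f"type:{doc_type}"))
    return urlencode(params)


# Mode (b): only chapters are searched and returned.
chapter_query = build_params("budget", doc_type="chapter")

# Mode (a): both resolutions and chapters are searched.
combined_query = build_params("budget")
```

With two cores the filter would go away, but mode (a) would then need one 
request per core.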

grouping:
How do I best group the scores for the (a) type search? Should I just do two 
searches and combine the results? Then again, this will make paging tricky.
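
To illustrate why paging worries me with the two-search approach, here is a 
minimal sketch, again in plain Python with made-up names. It assumes scores 
are combined by summing, in which case a resolution with many mid-ranked 
chapters can end up on page one even though none of its hits is near the top 
of either result list. So a page can only be cut after all hits (or a capped 
prefix, which makes the result approximate) have been merged.

```python
from collections import defaultdict


def page_of_resolutions(resolution_hits, chapter_hits, start=0, rows=10):
    """Merge two result lists and return one page plus the total count.

    Both arguments are lists of (resolution_id, score); chapter hits carry
    the id of the resolution they belong to. The lists are assumed to hold
    every hit, not just the first page of each search.
    """
    totals = defaultdict(float)
    for res_id, score in resolution_hits:
        totals[res_id] += score
    for res_id, score in chapter_hits:
        totals[res_id] += score

    # Highest combined score first; ties broken by id for stable paging.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[start:start + rows], len(ranked)
```

The cost is that every page request has to fetch and merge the full hit lists 
again, unless the merged ranking is cached somewhere.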

regards,
Lukas Kahwe Smith
m...@pooteeweet.org


