*Edit: each indexed text document contains a related field for identification purposes, so I would be able to identify the scores for both indexes through this field*
theDude_2 wrote: > > I appreciate your response, and read the wiki article concerning the > Federated search > and > > I'm not sure that my project falls into the "Federated Search" bucket... > > What I've done is created 2 indexes created with the same documents. > One index, contains the full documents - great for pure relevancy search > The second index: contains all of the same documents, but a small subset > of each documents contents - only allowing words to be indexed that we > deem as "good words" - > > (for example) if this was a football article database > Index 1: would index 100% of the article about the Redskins and the New > York Giants > Index 2: would index the same article by only the "good words" in the > document like Redskins, Giants, Quarterback, Linebacker, etc. > > What I'm trying to do, if it's even possible! is run the search on both > indexes containing references to the same article, and multiple the scores > together to get a final score that would represent something like a > "relative AND good word" score.... > > Figuring that if a user searches on "Who is the Quarterback for the > Giants" this will get the user an article that is both related to the > query, and deemed "important" to the query... > > I will look further into federated search and related items, but I think > that lucene probably wont be able to help me with this, am I right? > > > > > > > > > ------------ > > pjaol wrote: >> >> I'd start by doing some research on the question rather than asking for a >> solution.. >> What your asking for can be considered 'Federated Search' >> http://en.wikipedia.org/wiki/Federated_search >> >> And it can be conceived in as many ways as you have document types. Any >> answer will probably end up >> customized and weighted by your document silo value, usually companies >> weight those by business rules >> rather than head down the path of federated search, as it's just quicker >> and >> cheaper, and you can accomplish more. >> e.g >> Medication = score *2 (as higher advertising incentives) >> Diseases = score >> Books = score * 0.75 ( thousands of books, which nobody buys etc..) >> >> You might also want to try consolidating your data into 1 schema, and >> consider layering or collapsing results >> based on type. >> >> P >> >> On Fri, Apr 17, 2009 at 10:39 AM, theDude_2 <[email protected]> wrote: >> >>> >>> (bump) - any thoughts? >>> ---- >>> >>> >>> >>> theDude_2 wrote: >>> > >>> > hi! >>> > >>> > I am trying to do something a little unique... >>> > >>> > I have a 90k text documents that I am trying to search >>> > Search A: indexes and searches the documents using regular relevancy >>> > search >>> > Search B: indexes and searches the documents using a smaller subset of >>> > "key" words that I have chosen >>> > >>> > This gives me 2 seperate scores: Score A, and Score B... >>> > >>> > I am trying to show the top 10 results of the scores combined so.... >>> > >>> > FinalScoretextDoc = (scoreA_of_td1 * 0.5) * (scoreB_of_td1 * 0.5) >>> > >>> > While it seems straightforward, I do not want to calculate the scores >>> of >>> > all the documents outside of lucene. How can I integrate this better >>> into >>> > the lucene search engine? Is this possible to do by any simple means? >>> > >>> > Thanks guys + gals! >>> > >>> > >>> >>> -- >>> View this message in context: >>> http://www.nabble.com/A-Challenge%21%3A-Combining-2-searches-into-a-single-resultset--tp23085506p23098961.html >>> Sent from the Lucene - Java Users mailing list archive at Nabble.com. >>> >>> >>> --------------------------------------------------------------------- >>> To unsubscribe, e-mail: [email protected] >>> For additional commands, e-mail: [email protected] >>> >>> >> >> > > -- View this message in context: http://www.nabble.com/A-Challenge%21%3A-Combining-2-searches-into-a-single-resultset--tp23085506p23099924.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
