I am curious whether document scoring can be used to extract additional data from an index. Specifically, I would like the score to be a count of how many of a given set of terms matched a particular field. For example, I am indexing movie stars (each document is a movie star). A movie star has a number of fields, such as name, movies they have been in, etc. I want to produce an 'index' of stars by name and show how many movies matching a filter each star has appeared in.
In natural language my query might be: "List all stars who have appeared in a 'horror' movie, where last name starts with A, and tell me how many horror movies they were in."

My search will look something like this:

    +lastName:A* +movie:(1 7 21 58 92)    //where movie is a previously computed list of 'horror' movie ids

If my index contained the following documents:

    doc1 = lastName:Anna movie:{3 10}
    doc2 = lastName:Aba  movie:{1 10 12}
    doc3 = lastName:Addd movie:{3 21 55 92}
    doc4 = lastName:Baaa movie:{7 56}

I would like to get back:

    doc2, score of 1    //score of 1 because only movie 1 matched
    doc3, score of 2    //score of 2 because movies 21 and 92 matched

Currently, we perform an initial query against our Star index to retrieve a list of stars. Then we perform N queries against a separate movie index to count the number of movies that match our sub filter 'horror'. This is obviously very inefficient, and as I've shown above, the information (count) is available during the primary query.

Thoughts?
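
To make the desired result concrete, here is a minimal sketch in plain Python of the scoring I am after. It is only an in-memory model of the example above, not any search library's API; the names DOCS, HORROR_IDS, and match_counts are made up for illustration.

```python
# Hypothetical in-memory model of the example index (not a real search API).
DOCS = {
    "doc1": {"lastName": "Anna", "movie": {3, 10}},
    "doc2": {"lastName": "Aba", "movie": {1, 10, 12}},
    "doc3": {"lastName": "Addd", "movie": {3, 21, 55, 92}},
    "doc4": {"lastName": "Baaa", "movie": {7, 56}},
}

# Previously computed list of 'horror' movie ids from the example query.
HORROR_IDS = {1, 7, 21, 58, 92}


def match_counts(docs, last_name_prefix, movie_ids):
    """Return {doc_id: score} for documents that satisfy both required
    clauses, where score is the number of movie ids that matched."""
    results = {}
    for doc_id, fields in docs.items():
        # Models the required clause +lastName:A*
        if not fields["lastName"].startswith(last_name_prefix):
            continue
        # Models the required clause +movie:(...): at least one id must match.
        matched = fields["movie"] & movie_ids
        if matched:
            results[doc_id] = len(matched)
    return results


print(match_counts(DOCS, "A", HORROR_IDS))
```

doc1 drops out because none of its movies are in the horror list, and doc4 drops out because its last name does not start with A. The question is whether the scorer can be made to return the same per-document count directly, so the N follow-up queries go away.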