Hello: I'm working on a requirement where, when a document is returned, we want to pithily tell the end user why. That is, of, say, five documents returned, each may have matched for similar or different reasons. These "reasons" are the field(s) in which matches occurred. Some are more important than others, and I'll have to return just the most relevant one or two reasons so as not to overwhelm the user.
This is a separate goal from Solr's scoring of the returned documents. That is, index/query time boosting can indicate which fields are more significant in computing the overall document score, but I still need to know which fields were matched, and with what terms. I do have an application that stands between Solr and the end user (RESTful API), so I figured I can rank the "reasons" there and return more domain-specific names rather than the Solr field names.

So, I've turned to highlighting, and in the results I can see, for each document ID, the fields matched, the text in each field, etc. Great. But to get that to work, I have to query the individual fields specifically. That is, the approach of <copyField>'ing a bunch of fields to a common text field for efficiency purposes is no longer an option. And, using the dismax request handler, I'm querying a lot of fields:

    <str name="qf">
      n_nameExact^4.0 n_macromolecule_nameExact^3.0 n_macromolecule_name^2.0
      n_macromolecule_id^1.8 n_pathway_nameExact^1.5
      n_top_regulates n_top_regulated_by n_top_binds n_top_role_in_cell
      n_top_disease n_molecular_function n_protein_family n_subcell_location
      n_pathway_name n_cell_component n_bio_process
      n_synonym^0.5 n_macromolecule_summary^0.6
      p_nameExact^4.0 p_name^2.0 p_description^0.6
    </str>

Is that crazy? Is telling Solr to look at so many individual fields going to be a performance problem? I'm only prototyping at this stage and it works great. :) I've not yet run anything at scale, handling lots of requests.

There are two document types in that shared index, distinguished by a field named type. So, when configuring the SolrJ SolrQuery, I use addFilterQuery() to select one type or the other.

Anyway, using dismax with all of those query fields along with highlighting, I get the information I need to render meaningful results for the end user. But it has a sort of smell to it. :) Shall I look for another way, or am I worrying about nothing? I am currently using Solr 3.1 trunk.

Thanks!
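To make the "rank the reasons in the application layer" step concrete, here is a minimal sketch, not a definitive implementation. It assumes the per-document highlighting data has the shape Map<field name, list of snippets> (the inner map of what SolrJ's QueryResponse.getHighlighting() returns). The class name ReasonRanker, the priority order, and the user-facing labels are all hypothetical and only illustrate the idea; the field names are taken from the qf list above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns the highlighted fields of one document
// into at most `max` user-facing "reasons", most important first.
class ReasonRanker {

    // Assumed priority order (insertion order) and assumed display labels.
    // Solr field name -> domain-specific label shown to the end user.
    static final Map<String, String> LABELS = new LinkedHashMap<>();
    static {
        LABELS.put("n_nameExact", "exact name");
        LABELS.put("n_macromolecule_nameExact", "exact macromolecule name");
        LABELS.put("n_macromolecule_name", "macromolecule name");
        LABELS.put("n_synonym", "synonym");
        LABELS.put("n_macromolecule_summary", "summary");
    }

    // fieldsForDoc: highlighted field name -> snippets, for a single document.
    static List<String> topReasons(Map<String, List<String>> fieldsForDoc, int max) {
        List<String> reasons = new ArrayList<>();
        if (fieldsForDoc == null) {
            return reasons; // no highlighting entry for this document
        }
        // Walk the fields in priority order and keep the first `max` that matched.
        for (Map.Entry<String, String> e : LABELS.entrySet()) {
            if (reasons.size() >= max) {
                break;
            }
            List<String> snippets = fieldsForDoc.get(e.getKey());
            if (snippets != null && !snippets.isEmpty()) {
                reasons.add(e.getValue());
            }
        }
        return reasons;
    }

    public static void main(String[] args) {
        // Made-up highlighting data for one document.
        Map<String, List<String>> doc = new HashMap<>();
        doc.put("n_synonym", Arrays.asList("<em>p53</em>"));
        doc.put("n_macromolecule_name", Arrays.asList("tumor protein <em>p53</em>"));
        doc.put("n_macromolecule_summary", Arrays.asList("... <em>p53</em> regulates ..."));

        System.out.println(topReasons(doc, 2)); // prints [macromolecule name, synonym]
    }
}
```

Keeping the priority list in the application rather than reusing the qf boosts lets the "reason" ranking change without affecting Solr's scoring, which matches the point that these are two separate goals.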
Jeff

--
Jeff Schmidt
535 Consulting
j...@535consulting.com
http://www.535consulting.com