[ https://issues.apache.org/jira/browse/SOLR-2592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294837#comment-13294837 ]
Andy Laird commented on SOLR-2592: ---------------------------------- I have tried out Michael's patch and would like to provide some feedback to the community. We are using a very-recent build from the 4x branch but I grabbed this patch from trunk and tried it out anyway... Our needs were driven by the fact that, currently, the counts returned when using field collapse are only accurate when the documents getting collapsed together are all on the same shard (see comments for https://issues.apache.org/jira/browse/SOLR-2066). For our case we collapse on a field, xyz, so we need to ensure that all documents with the same value for xyz are on the same shard (overall distribution is not a problem here) if we want counting to work. I grabbed the latest patch (dbq_fix.patch) in hopes of finding a solution to our problem. The great news is that Michael's patch worked like a charm for what we needed -- thank you kindly, Michael, for this effort! The not-so-good news is that for our particular issue we needed a way to get at data other than the uniqueKey (the only data available with ShardKeyParser) -- in our case we need access to the xyz field data. Since this implementation provides nothing but uniqueKey we had to encode the xyz data in our uniqueKey (e.g. newUniqueKey = what-used-to-be-our-uniqueKey + xyz), which is certainly less-than-ideal and adds unsavory coupling. Nonetheless, as a fix to a last-minute gotcha (our counts with field collapse need to be accurate in a multi-shard environment) I was happily surprised at how easy it was to find a solution to our particular problem with this patch. I would definitely like to see a second iteration that incorporates the ability to get at other document data, then you could do whatever you want by looking at dates and other fields, etc. though I understand that that probably goes quite a bit deeper in the codebase, especially with distributed search. > Pluggable shard lookup mechanism for SolrCloud > ---------------------------------------------- > > Key: SOLR-2592 > URL: https://issues.apache.org/jira/browse/SOLR-2592 > Project: Solr > Issue Type: New Feature > Components: SolrCloud > Affects Versions: 4.0 > Reporter: Noble Paul > Attachments: dbq_fix.patch, pluggable_sharding.patch, > pluggable_sharding_V2.patch > > > If the data in a cloud can be partitioned on some criteria (say range, hash, > attribute value etc) It will be easy to narrow down the search to a smaller > subset of shards and in effect can achieve more efficient search. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org