[ https://issues.apache.org/jira/browse/SOLR-1308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786240#action_12786240 ]
Jason Rutherglen commented on SOLR-1308:
----------------------------------------

{quote} Yeah... that's a pain. We could easily do per-segment faceting for non-string types though (int, long, etc) since they don't need to be merged. {quote}

I opened SOLR-1617 for this. I think doc sets can be handled with a multi doc set (hopefully). Facets, however, are harder: FacetComponent is really hairy, though I think merging boils down to simply adding up the counts for the same field values? Then there seem to be edge cases which I'm wary of. At least it's easy to test whether we're preserving today's functionality by randomly unit testing per-segment and multi-segment side by side (i.e. if the results of one differ from the results of the other, we know there's something to fix). Perhaps we can initially just add up the field-value counts, test that (which is enough for my project), and move on from there.

I'd still like to genericize all of the distributed processes to work over multiple segments (the way Lucene distributed search uses a MultiSearcher, which also works locally), so that local and distributed search share the same API. However, I've had trouble figuring out the existing distributed code (SOLR-1477 ran into a wall). Maybe as part of SolrCloud http://wiki.apache.org/solr/SolrCloud we can rework the distributed APIs to be more user friendly (i.e. *MultiSearcher is really easy to understand). If Solr's going to work well in the cloud, distributed search probably needs to be easy to multi-tier for scaling (e.g. with 1 proxy server and 100 nodes, we could have 1 top proxy and 1 proxy per 10 nodes, etc).
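A minimal sketch of the count-merging idea discussed above, under the assumption that string-field facet merging reduces to summing the counts for matching field values across segments. Class and method names here are hypothetical; this is not the actual FacetComponent code.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: merge per-segment facet counts by summing
// the counts for the same field value across all segments.
public class FacetMergeSketch {
    public static Map<String, Integer> merge(List<Map<String, Integer>> perSegmentCounts) {
        Map<String, Integer> merged = new HashMap<>();
        for (Map<String, Integer> segment : perSegmentCounts) {
            for (Map.Entry<String, Integer> e : segment.entrySet()) {
                // Sum counts for field values seen in more than one segment.
                merged.merge(e.getKey(), e.getValue(), Integer::sum);
            }
        }
        return merged;
    }
}
```

This is also where the side-by-side test idea fits: feed random documents through both the per-segment path and the multi-segment path and assert the merged counts agree.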
> Cache docsets at the SegmentReader level
> ----------------------------------------
>
>                 Key: SOLR-1308
>                 URL: https://issues.apache.org/jira/browse/SOLR-1308
>             Project: Solr
>          Issue Type: Improvement
>    Affects Versions: 1.4
>            Reporter: Jason Rutherglen
>            Priority: Minor
>             Fix For: 1.5
>
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> Solr caches docsets at the top level Multi*Reader level. After a
> commit, the filter/docset caches are flushed. Reloading the
> cache in near realtime (i.e. commits every 1s - 2min)
> unnecessarily consumes IO resources when reloading the filters,
> especially for largish indexes.
> We'll cache docsets at the SegmentReader level. The cache key
> will include the reader.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
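The proposal in the issue description can be illustrated with a hypothetical sketch: key the docset cache by (segment reader, filter), so that after a near-realtime commit only the new or changed segments miss the cache while unchanged segments reuse their entries. The class and key shape below are illustrative assumptions, not Solr's actual cache implementation.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Hypothetical per-segment docset cache sketch. The cache key includes
// the segment reader's identity, so entries for unchanged segments
// survive a commit; only new segments need their docsets recomputed.
public class SegmentDocSetCache {
    // Key = (segment reader identity, filter/query string)
    private record Key(Object segmentReader, String filter) {}

    private final Map<Key, BitSet> cache = new ConcurrentHashMap<>();

    // Return the cached docset for this segment+filter, computing it
    // only on a cache miss.
    public BitSet getDocSet(Object segmentReader, String filter,
                            Supplier<BitSet> compute) {
        return cache.computeIfAbsent(new Key(segmentReader, filter),
                                     k -> compute.get());
    }

    public int size() { return cache.size(); }
}
```

On this design, a top-level query over N segments does N cheap lookups, and the IO cost of rebuilding filters after a commit is proportional to the new segments only.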