[ https://issues.apache.org/jira/browse/SOLR-5473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14039800#comment-14039800 ]
Noble Paul edited comment on SOLR-5473 at 6/22/14 6:21 PM:
-----------------------------------------------------------

Patch updated to trunk, incorporating most of the comments:
# All external references are eliminated from the APIs
# The node is given the suffix "/state.json" instead of "/state"
# Removed the redundant attribute external/stateVersion from the state object. The version is automatically derived from the znode from which the object is read
# Thread-safety issues addressed
# Added javadocs (and many other subtle cleanups)

The comments which are not addressed are:
# The selective watching of collection nodes by Solr nodes. There are only 3 choices when it comes to watching states:
#* Watch all nodes: this would be equivalent to or worse than the current clusterstate.json solution. All nodes will be notified of each state change (multiple times, once per collection of which the node is a member)
#* Watch none: either fetch the state data just in time (which will kill ZK), or cache it, which means the node will not have an updated state to make the right decision at the right time
#* Watch selectively: this is the approach we have taken here
# Maintaining the ZkStateReader reference in ClusterState. Agreed that it is not elegant. The ideal solution would be to get rid of ClusterState.java completely, because that node is going to go away, and we will have only ZkStateReader and DocCollection with nothing in between. The problem is that we have clusterstate.json now and it will exist for at least a couple of releases. So I am torn between the choices, and I decided to go with the not-so-elegant choice of ClusterState keeping a reference to ZkStateReader, so that all APIs work fine. My suggestion is to eliminate ClusterState.java when we deprecate the old format
# The ephemeralCollectionData data in ZkStateReader. This is again not so elegant.
This one is simple and performant and has minimal impact on the code. I'm happy to hear any other simpler ideas to make it better.

We have done extensive testing on this patch internally with very large clusters (120+ nodes) and a very large number of collections (1000+ collections). The solr-5473 branch already has this code committed. If there are no objections, I plan to commit this fairly soon.

> Make one state.json per collection
> ----------------------------------
>
>                 Key: SOLR-5473
>                 URL: https://issues.apache.org/jira/browse/SOLR-5473
>             Project: Solr
>          Issue Type: Sub-task
>          Components: SolrCloud
>            Reporter: Noble Paul
>            Assignee: Noble Paul
>             Fix For: 5.0
>
>         Attachments: SOLR-5473-74 .patch, SOLR-5473-74.patch,
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
> SOLR-5473-74.patch, SOLR-5473-74.patch, SOLR-5473-74.patch,
> SOLR-5473-configname-fix.patch, SOLR-5473.patch, SOLR-5473.patch,
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch,
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch,
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch,
> SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch, SOLR-5473.patch,
> SOLR-5473_undo.patch, ec2-23-20-119-52_solr.log, ec2-50-16-38-73_solr.log
>
>
> As defined in the parent issue, store the states of each collection under
> /collections/collectionname/state.json node
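
To make the watching trade-off from the comment a little more concrete, here is a minimal Java sketch. It does not use the ZooKeeper or Solr APIs: the class StateWatchSketch, its methods, and the hosting map are all hypothetical names introduced for illustration. It assumes that under selective watching a node watches only the collections it hosts replicas of, which the comment does not spell out, and it simplifies "watch all" to one notification per node per state change.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class StateWatchSketch {

    /** Builds the per-collection state path, following the layout in the issue description. */
    public static String statePath(String collection) {
        return "/collections/" + collection + "/state.json";
    }

    /**
     * "Watch all": every node watches every collection, so a change to any
     * collection notifies every node (simplified to one notification per node).
     * hosting maps a node name to the collections it hosts (hypothetical input).
     */
    public static int notificationsWatchAll(Map<String, Set<String>> hosting, String changed) {
        return hosting.size();
    }

    /**
     * "Watch selectively" (assumption: a node watches only the collections it
     * hosts): only nodes that host the changed collection are notified.
     */
    public static int notificationsWatchSelective(Map<String, Set<String>> hosting, String changed) {
        int notified = 0;
        for (Set<String> hosted : hosting.values()) {
            if (hosted.contains(changed)) {
                notified++;
            }
        }
        return notified;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> hosting = new LinkedHashMap<>();
        hosting.put("node1", new HashSet<>(Arrays.asList("c1", "c2")));
        hosting.put("node2", new HashSet<>(Arrays.asList("c2")));
        hosting.put("node3", new HashSet<>(Arrays.asList("c3")));

        System.out.println(statePath("c2"));                            // /collections/c2/state.json
        System.out.println(notificationsWatchAll(hosting, "c2"));       // 3
        System.out.println(notificationsWatchSelective(hosting, "c2")); // 2
    }
}
```

With many collections spread over many nodes, the selective count stays proportional to the number of nodes hosting the changed collection, while the watch-all count grows with the size of the cluster.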