[ https://issues.apache.org/jira/browse/HBASE-5399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
nkeywal updated HBASE-5399:
---------------------------

    Status: Patch Available  (was: Open)

> Cut the link between the client and the zookeeper ensemble
> ----------------------------------------------------------
>
>                 Key: HBASE-5399
>                 URL: https://issues.apache.org/jira/browse/HBASE-5399
>             Project: HBase
>          Issue Type: Improvement
>          Components: client
>    Affects Versions: 0.94.0
>         Environment: all
>            Reporter: nkeywal
>            Assignee: nkeywal
>            Priority: Minor
>         Attachments: 5399.v27.patch, 5399_inprogress.patch,
> 5399_inprogress.v14.patch, 5399_inprogress.v16.patch,
> 5399_inprogress.v18.patch, 5399_inprogress.v20.patch,
> 5399_inprogress.v21.patch, 5399_inprogress.v23.patch,
> 5399_inprogress.v3.patch, 5399_inprogress.v9.patch
>
>
> The link is often considered an issue, for various reasons. One of them is
> that there is a limit on the number of connections that ZK can manage.
> Stack also suggested removing the link to the master from HConnection.
> There are choices to be made considering the existing API (which we don't
> want to break).
> The first patches I will submit on hadoop-qa should not be committed: they
> are here to show the progress in the direction taken.
>
> ZooKeeper is used for:
>  - public getter, to let the client do whatever it wants, and close
> ZooKeeper when closing the connection => we have to deprecate this but keep
> it.
>  - read the master address to create a master => now done with a temporary
> zookeeper connection
>  - read root location => now done with a temporary zookeeper connection, but
> questionable. Used in public function "locateRegion". To be reworked.
>  - read cluster id => now done once with a temporary zookeeper connection.
>  - check if the base node is available => now done once with a zookeeper
> connection given as a parameter
>  - isTableDisabled/isTableAvailable => public functions, now done with a
> temporary zookeeper connection.
>      - Called internally from HBaseAdmin and HTable
>  - getCurrentNrHRS(): public function to get the number of region servers
> and create a pool of threads => now done with a temporary zookeeper
> connection
>
> Master is used for:
>  - getMaster public getter, as for ZooKeeper => we have to deprecate this
> but keep it.
>  - isMasterRunning(): public function, used internally by HMerge & HBaseAdmin
>  - getHTableDescriptor*: public functions offering access to the master. =>
> we could make them use a temporary master connection as well.
>
> Main points are:
>  - The HBase class for ZooKeeper, ZooKeeperWatcher, is really designed for a
> strongly coupled architecture ;-). This can be changed, but requires a lot
> of modifications in these classes (likely adding a class in the middle of
> the hierarchy, something like that). In any case, a non-connected client
> will always be noticeably slower, because each operation needs a new TCP
> connection, and establishing a TCP connection is slow.
>  - having a link between ZK and all the clients seems to make sense for some
> use cases. However, it won't scale if a TCP connection is required for every
> client.
>  - if we move the table descriptor part away from the client, we need to
> find a new place for it.
>  - we will have the same issue with HBaseAdmin (for both ZK & Master); maybe
> we can put a timeout on the connection. That would make the whole system
> less deterministic, however.
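
The "temporary zookeeper connection" approach described in the issue can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the attached patches: CoordinationSession, FakeEnsemble, DisconnectedClient and withTemporarySession are hypothetical names, the ensemble is an in-memory fake so the sketch runs without a cluster, and the znode paths are illustrative values. Each read opens a session, performs one operation, and closes the session; the cluster id is read once and then cached.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;
import java.util.function.Supplier;

/** Sketch of the temporary-connection pattern; every name here is hypothetical. */
public class TemporaryConnectionSketch {

    /** Stand-in for a coordination-service session (for example a ZooKeeper session). */
    interface CoordinationSession extends AutoCloseable {
        String read(String path);

        @Override
        void close(); // narrowed so that callers need not handle a checked exception
    }

    /** In-memory fake that counts sessions, so the sketch runs without a cluster. */
    static final class FakeEnsemble {
        private final Map<String, String> nodes = new ConcurrentHashMap<>();
        final AtomicInteger openSessions = new AtomicInteger();
        final AtomicInteger sessionsCreated = new AtomicInteger();

        void put(String path, String value) {
            nodes.put(path, value);
        }

        CoordinationSession connect() {
            openSessions.incrementAndGet();
            sessionsCreated.incrementAndGet();
            return new CoordinationSession() {
                private boolean closed;

                @Override
                public String read(String path) {
                    if (closed) {
                        throw new IllegalStateException("session closed");
                    }
                    return nodes.get(path);
                }

                @Override
                public void close() {
                    if (!closed) {
                        closed = true;
                        openSessions.decrementAndGet();
                    }
                }
            };
        }
    }

    /** Client that holds no session between calls. */
    static final class DisconnectedClient {
        private final Supplier<CoordinationSession> connector;
        private volatile String cachedClusterId; // read once, then served from memory

        DisconnectedClient(Supplier<CoordinationSession> connector) {
            this.connector = connector;
        }

        /** Opens a session, runs one action, and always closes the session. */
        <T> T withTemporarySession(Function<CoordinationSession, T> action) {
            try (CoordinationSession session = connector.get()) {
                return action.apply(session);
            }
        }

        String masterAddress() {
            // Illustrative path; pays the connection cost on every call.
            return withTemporarySession(s -> s.read("/hbase/master"));
        }

        String clusterId() {
            String id = cachedClusterId;
            if (id == null) {
                id = withTemporarySession(s -> s.read("/hbase/hbaseid"));
                cachedClusterId = id;
            }
            return id;
        }
    }

    public static void main(String[] args) {
        FakeEnsemble ensemble = new FakeEnsemble();
        ensemble.put("/hbase/master", "master.example.org:60000");
        ensemble.put("/hbase/hbaseid", "cluster-1");

        DisconnectedClient client = new DisconnectedClient(ensemble::connect);
        System.out.println(client.masterAddress());
        System.out.println(client.clusterId());
        System.out.println(client.clusterId()); // served from the cache
        System.out.println("open sessions: " + ensemble.openSessions.get());
        System.out.println("sessions created: " + ensemble.sessionsCreated.get());
    }
}
```

The trade-off raised in the main points is visible here: no session stays open between calls, so the ensemble is not holding one connection per idle client, but every uncached read pays the cost of establishing a new connection.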