Yes, it does answer what I was looking for. Thanks, Alex, for responding.

Sameer

Date: Thu, 29 May 2014 10:58:39 -0700
From: ml-node+s679495n4059886...@n3.nabble.com
To: s...@hotmail.com
Subject: Re: Full stack rolling restart

Yes, a full-stack rolling upgrade is possible.

We added and tested full-stack rolling restart of the CDH platform in Cloudera Manager, starting with CM4 running CDH4 and onward. For HBase rolling upgrades, the only Cloudera-supported path is through Cloudera Manager (though we've tested it without CM as well). For HDFS/MR/YARN/ZK, rolling upgrades are also supported using only CDH, though you can use CM for those too.

For the full-stack rolling upgrade including HBase, here is how Cloudera Manager orchestrates the process at a high level:

1. Restart all master nodes by restarting services in reverse dependency order.
   -- Master services = HMaster, NN, ZK, JT, etc.
   -- Reverse-dependency order: for example, HBase, then HDFS, then ZK (since HDFS depends on ZK and HBase depends on HDFS). This gets a bit more complicated if High Availability is enabled. Also, as a general rule, backup master services (e.g. the backup HMaster) are upgraded before the active master services.

2. Restart all worker nodes (nodes that run worker services) in batches (the default batch size is 1, but it is configurable).
   -- Worker services = DN, RS, TT, etc.
   -- Reverse-dependency order: turn off the balancer. Decommission the RS (by closing and moving off all the regions one by one), gracefully shut down the DN, start the DN back up, start the RS back up, and load that RS with regions. Repeat for each worker node.

Once the master and worker services have been restarted on all nodes of the cluster, the HBase balancer is turned back on and the cluster is considered upgraded.

Caveat: rolling upgrades are only supported between minor versions of CDH, so 4.x to 4.y or 5.x to 5.y (but not 4.x to 5.y).

Did that answer your question?

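The ordering described above can be sketched in a few lines of Python. This is only an illustration of the sequence as written in this message, not how Cloudera Manager implements it. The dependency map, the step wording, and every function name are assumptions made for the example, and the code only builds a plan as a list of strings rather than touching a cluster. It also assumes the balancer is switched off once before the first worker and switched on once at the end.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map, taken from the example in the message:
# HBase depends on HDFS, and HDFS depends on ZooKeeper.
DEPENDS_ON = {
    "HBase": {"HDFS"},
    "HDFS": {"ZooKeeper"},
    "ZooKeeper": set(),
}

# Per-worker steps, paraphrased from step 2 of the message.
WORKER_STEPS = (
    "move regions off the RegionServer and stop it",
    "gracefully shut down the DataNode",
    "start the DataNode",
    "start the RegionServer",
    "load regions back onto the RegionServer",
)


def reverse_dependency_order(depends_on):
    """Order services so that each one comes before the services it depends on."""
    # static_order() yields dependencies first, so reverse it.
    return list(reversed(list(TopologicalSorter(depends_on).static_order())))


def worker_batches(hosts, batch_size=1):
    """Split the worker hosts into consecutive batches (default size 1)."""
    return [hosts[i:i + batch_size] for i in range(0, len(hosts), batch_size)]


def rolling_restart_plan(depends_on, worker_hosts, batch_size=1):
    """Build the full plan: masters first, then workers batch by batch."""
    plan = [f"restart masters: {svc}" for svc in reverse_dependency_order(depends_on)]
    plan.append("turn off HBase balancer")
    for batch in worker_batches(worker_hosts, batch_size):
        for host in batch:
            plan.extend(f"{host}: {step}" for step in WORKER_STEPS)
    plan.append("turn on HBase balancer")
    return plan


def rolling_upgrade_supported(src_version, dst_version):
    """Apply the caveat: only upgrades within the same major version qualify."""
    return src_version.split(".")[0] == dst_version.split(".")[0]


if __name__ == "__main__":
    for line in rolling_restart_plan(DEPENDS_ON, ["worker1", "worker2"]):
        print(line)
```

Running the module prints the master restarts first, then the five worker steps for each host in turn, bracketed by the balancer being turned off and on. High Availability and the backup-before-active rule are deliberately left out to keep the sketch short.
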
On Thu, May 29, 2014 at 9:11 AM, Jeremy Carroll <[hidden email]> wrote:

> We have taken the approach of gracefully stopping the RegionServer during
> maintenance, then restarting the DataNode. Once it has registered and is
> back online, we start the RegionServer and move its regions back. We do not
> compact before or after the operation since it takes a short period of
> time, and minor compactions will regain the small amount of locality lost
> during the maintenance operation.
>
> I believe it's doubtful that the Hadoop project itself will release an
> official cluster management / operations framework. So we built a lot of
> this ourselves.
>
> Sent from my iPhone
>
> > On May 28, 2014, at 11:37 PM, sameerv <[hidden email]> wrote:
> >
> > I am curious to know what the industry folks think of rolling restart on
> > the full stack. Envisioning something like: on each node, whichever
> > services it runs, stop all services, apply the new configs, and start all
> > services. Is it feasible to do this? Can folks who have tried share
> > their experiences, please?
> >
> > Thanks,
> > Sameer
> >
> > --
> > View this message in context:
> > http://apache-hbase.679495.n3.nabble.com/Full-stack-rolling-restart-tp4059877.html
> > Sent from the HBase User mailing list archive at Nabble.com.

--
Best Regards,

Aleks Shulman
847.814.5804
Cloudera

--
View this message in context: http://apache-hbase.679495.n3.nabble.com/Full-stack-rolling-restart-tp4059877p4059909.html