We don't have a process to uninstall old bits, and this area is largely undocumented and untested. You can try erasing the 2_2_6_* packages (via yum, zypper, or apt).
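
As a rough illustration of the package cleanup above, here is a minimal sketch for a yum/rpm host. It only lists candidates and erases nothing. The helper name old_hdp_packages and the sample package names in the comments are hypothetical, and the underscore-separated version tag follows the 2_2_6_* pattern mentioned above; names on your hosts may differ.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ambari or HDP): read package names on
# stdin and keep those whose name embeds the given version tag, e.g. 2_2_6.
old_hdp_packages() {
  tag="$1"
  # Match the tag only when it is followed by "_<digit>", as in the
  # illustrative name hadoop_2_2_6_0_2800-hdfs, so 2_2_9 builds are skipped.
  grep -E "_${tag}_[0-9]" | sort -u
}

# Dry run on a yum/rpm host (assumption): print candidates, erase nothing.
# rpm -qa --qf '%{NAME}\n' | old_hdp_packages 2_2_6
#
# After reviewing the list by hand, erase the packages you no longer need:
# yum erase <package names from the list>
```

Reviewing the list before erasing matters here because this path is untested; confirm that nothing still running depends on the old version first.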

Thanks,
Alejandro

On 11/23/15, 10:19 PM, "Andrew Robertson" <[email protected]> wrote:

>I've created AMBARI-14031 & AMBARI-14032 for these issues.
>
>And thanks for the pointer for the #experimental /
>opsDuringRollingUpgrade workaround.
>
>One more question - is there a process for uninstalling old / unused
>versions of HDP? For example, now that I've upgraded from 2.2.8 ->
>2.2.9, is there a way to remove 2.2.6?
>
>On Mon, Nov 23, 2015 at 11:35 AM, Alejandro Fernandez
><[email protected]> wrote:
>> Hi,
>>
>> I wish your experience with Rolling Upgrades would have been better.
>> I'll do my best to explain the solution to each one of those items. As a
>> developer, I like to hear this feedback so we can make the product
>> better.
>>
>> * Cluster is locked down while in the middle of upgrade:
>> Operations like changing configs, adding hosts, adding services, etc.
>> are disallowed by default.
>> This is meant to prevent the user from drastically changing the stack
>> configs and ending up in a worse state.
>> Cluster operators can still change configs by navigating to
>> http://server:8080/#/experimental and enabling
>> "opsDuringRollingUpgrade".
>> I completely agree that we need to be more flexible in this area since
>> configs are likely to break, and the savvy users should still be
>> allowed to change them.
>>
>> * Configs are only changed in major stack versions:
>> In HDP 2.2.*->2.2.*, we don't expect any config changes, so the Upgrade
>> Pack doesn't orchestrate any, whereas a 2.2.*->2.3.* has many config
>> changes.
>> At times, this will break, and we typically find out about it during
>> testing and reports from users with custom configs.
>> Tools like SmartSense can also help to point out incorrect configs. In
>> the future, we may relax this so that even minor versions are allowed
>> to change configs.
>>
>> * Unable to finalize since hosts are not on the new version:
>> We've talked about a way to "force finalize" the versions. Today,
>> Ambari is very strict about requiring all hosts to be updated.
>> As a workaround, we have a python script called "RU Magician" that will
>> allow you to fix things, and force any version to CURRENT; checkout
>> https://github.com/apache/ambari/tree/branch-2.1.2/contrib/ru_magician
>> You ran the correct SQL statements, so kudos to you for that.
>>
>> * Components that don't advertise a version:
>> Some components like ZKFC, AMS, MySQL, Kerberos Client, don't need to
>> advertise a version.
>> In the case of ZKFC, it is because it uses the same binary as that of
>> NameNode. So perhaps an earlier version of Ambari caused it to stay
>> stuck on 2.2.6 in the DB.
>> If you feel more comfortable, you can change ZKFC's version to
>> 'UNKNOWN'.
>>
>> My suggestion is to create Jiras on Apache for the following:
>>
>> Allow force finalizing a version during Stack Upgrade
>> Allow changing configs during the middle of a Stack Upgrade, will need
>> to prompt user with a disclaimer/warning
>>
>> Thanks,
>> Alejandro
>>
>> On 11/22/15, 11:32 PM, "Andrew Robertson" <[email protected]>
>> wrote:
>>
>> I performed a rolling upgrade of HDP from 2.2.8 to 2.2.9 today using
>> Ambari 2.1.2.1 & ran into several issues.
>>
>> My YARN resource manager failed to start due to a "Service
>> ResourceManager failed in state INITED; cause:
>> java.lang.IllegalArgumentException: Illegal capacity of -1.0 for
>> node-label=default in queue=root, valid capacity should in range of
>> [0, 100].". (It was working fine with 2.2.8; this may be something new
>> in 2.2.9).
>>
>> As Ambari usage feedback - this was impossible to fix in Ambari while
>> the upgrade was going on, and it added a ton of (down)time to the
>> upgrade. This error caused a number of service checks to time out
>> after a long wait (many checks took 5-15 min to fail). I didn't see
>> any way to fix the error (the only options I had during the upgrade
>> were "Downgrade" - which I didn't want to do (It was a test cluster
>> after all, I wanted to get through it so I could fix it); and "Ignore"
>> which allowed it to continue, but caused each step to take 300+
>> seconds. Ambari seemed to lock the configs so I couldn't make changes
>> to fix the issue while the upgrade was going on. Likewise, I couldn't
>> manually restart the service myself or abort the service checks. Even
>> at the "Verify operation" and the "finalize" checkpoints, where I
>> could "pause" the upgrade - the configs were still locked and I had no
>> ability to start/stop services.
>>
>> At the end, Ambari started giving other errors about being unable to
>> finalize the upgrade. I ended up rebooting the cluster & ambari - this
>> got it back to a state where I could edit the configs again to fix the
>> YARN RM config. The fix to the RM not starting ended up being the
>> same as AMBARI-11358, which appears to only have been fixed in the
>> HDP2.3 upgrade.
>>
>> Separately, Ambari had the 2.2.9 version waiting to be finalized but I
>> couldn't find any way to do this in the UI after the restart. So I
>> went into the database and ran the following:
>> UPDATE host_version SET state = 'INSTALLED' WHERE state = 'CURRENT';
>> UPDATE host_version SET state = 'CURRENT' WHERE repo_version_id = <id for 2.2.9.0 version> and state = 'UPGRADED';
>> UPDATE cluster_version SET state = 'INSTALLED' WHERE state = 'CURRENT';
>> UPDATE cluster_version SET state = 'CURRENT' WHERE repo_version_id = <id for 2.2.9.0 version> and state = 'UPGRADED';
>> UPDATE hostcomponentstate set upgrade_state = 'NONE';
>> This seems to have fixed that.
>>
>> Possibly unrelated - I did find there are 2 services that show up with
>> an even older old version when checking the ambari database:
>>
>> ambari=> SELECT h.host_name, hcs.service_name, hcs.component_name,
>> hcs.version FROM hostcomponentstate hcs JOIN hosts h ON hcs.host_id =
>> h.host_id where hcs.version NOT IN ('2.2.9.0-3393', 'UNKNOWN');
>>  host_name | service_name | component_name |   version
>> -----------+--------------+----------------+--------------
>>  node2     | HDFS         | ZKFC           | 2.2.6.0-2800
>>  node1     | HDFS         | ZKFC           | 2.2.6.0-2800
>>
>> (But I had upgraded from 2.2.8; 2.2.6 was the version before that).
>>
>> Any suggestions on how to fix this? I think Ambari may just be
>> confused, but I'm not sure how to verify this and/or fix Ambari (other
>> than overwrite this field in the database?). I've verified the yum
>> versions are right for the package and the right processes are
>> actually running on the machine.
>>
>> Thank you!
