Hi all,

As you have seen today: we did something incredibly stupid, and I sincerely apologize to all of you.

If you are still affected by this, here is the quick summary of how to fix it:

============================================================

1.) As "root" from SSH run:

    yum clean all
    yum update

That makes sure you're fully updated and have the fix.

2.) Removal of unwanted PKGs:

    cd ~
    wget http://devel.blueonyx.it/pub/BlueOnyx/.scripts/pkgRemoval.pl.txt
    mv pkgRemoval.pl.txt pkgRemoval.pl
    chmod 755 pkgRemoval.pl

Then edit pkgRemoval.pl and, in the list of unwanted PKGs, delete all lines of PKGs that you want to keep. Once satisfied with this list, run this command:

    ./pkgRemoval.pl

============================================================

What caused the problem?

A stupid beginner's mistake in coding. In a for/next loop a variable was conditionally re-used without resetting it between runs.

If you then happened to have the "Active Monitor" component "Software Updates" enabled, the daily cronjob would poll the list of PKGs available to you on NewLinQ to keep you informed of PKGs that are installed and for which updates are available.

However: if the faulty code loop tripped over itself while updating CODB with the new data from NewLinQ, it would start to mark all PKGs for install. Depending on when it tripped, it might mark all PKGs; in other cases it just marked everything past a certain point. The next run of "Active Monitor" then installed all PKGs that had been marked for install.

During testing of the code changes on Friday prior to release, everything seemed to be fine and the problem didn't occur. On Saturday I spotted it on a client box under maintenance contract. However, in that case just two "unwanted" PKGs had been installed, and that client also had the "All Packages Bundle", of which he used only half a dozen elements. This allowed me to immediately identify the problem and to publish a fix.

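For anyone curious what this class of bug looks like, here is a minimal sketch in Python. It is not the actual BlueOnyx code, which is not shown in this post. The function names, the `packages` data and the condition used to decide on an install are all hypothetical and only serve to illustrate a flag that is set conditionally inside a loop but never reset between iterations.

```python
def mark_for_install_buggy(packages):
    """Buggy variant: `install` is initialised once and never reset."""
    marked = []
    install = False  # hypothetical flag, set up only once, outside the loop
    for pkg in packages:
        # Assumed condition, for illustration only.
        if pkg["installed"] and pkg["update_available"]:
            install = True  # stays True for every later package
        if install:
            marked.append(pkg["name"])
    return marked


def mark_for_install_fixed(packages):
    """Fixed variant: the flag is reset at the start of every iteration."""
    marked = []
    for pkg in packages:
        install = False  # reset between runs of the loop
        if pkg["installed"] and pkg["update_available"]:
            install = True
        if install:
            marked.append(pkg["name"])
    return marked


# Hypothetical package list, not real NewLinQ data.
packages = [
    {"name": "pkg-a", "installed": False, "update_available": False},
    {"name": "pkg-b", "installed": True, "update_available": True},
    {"name": "pkg-c", "installed": False, "update_available": False},
    {"name": "pkg-d", "installed": False, "update_available": True},
]

print(mark_for_install_buggy(packages))  # ['pkg-b', 'pkg-c', 'pkg-d']
print(mark_for_install_fixed(packages))  # ['pkg-b']
```

In the buggy variant everything after the first match gets marked, which fits the "marked everything past a certain point" behaviour described above. If the first entry in the list happens to match, every package ends up marked.
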
But this is actually where I screwed up again: I assumed that the problem was minor and that it would probably not affect anyone else in more drastic ways. But it did, because of the delayed way in which the "Active Monitor" component "Software Updates" works.

Among the first casualties of the problem were the support ticket relay system, the list servers, the YUM repository MySQL server backend and the Solarspeed email server. They all had gotten "Dfix" installed, which collided with the already present "Dfix2". Two MariaDB servers also didn't survive the unmonitored updates to MariaDB-10.1, as some of them were also still running with passwords in the old format.

So I only became aware of the problem when the phone started to ring itself off the nightstand. I spent the last 10 hours fixing our own broken stuff while simultaneously fixing the boxes of everyone that contacted me in any way.

If you need help with the fix or still have pending issues on your servers, then please let me know and I'll get to it ASAP.

--
With best regards

Michael Stauber
_______________________________________________
Blueonyx mailing list
Blueonyx@mail.blueonyx.it
http://mail.blueonyx.it/mailman/listinfo/blueonyx