commit 147e512867176f95658dca8f20b078ae6692710f
Merge: 20d94d3 94d7ecd
Author: Klaus Aehlig <[email protected]>
Date:   Mon Apr 20 15:10:48 2015 +0200

    Merge branch 'stable-2.11' into stable-2.12

    * stable-2.11
      Update configure file to version 2.11.7
      Update NEWS file for 2.11.7 release
      Add logging to RenewCrypto
      Fix format string for gnt-network info
      Replace textwrapper.wrap by a custom version for networks
      Add SSL improvements to NEWS file

    * stable-2.10
      Update tag limitations
      Fix typos in doc/design-storagetypes.rst
      Make getFQDN prefer cluster protocol family
      Add version of getFQDN accepting preferences
      Make getFQDN honor vcluster

    Conflicts:
        NEWS: take all release entries
        configure.ac: ignore revision bump
        lib/cmdlib/cluster.py: manually apply 2.11 changes to 2.12
        src/Ganeti/Daemon.hs: trivial

    Signed-off-by: Klaus Aehlig <[email protected]>

diff --cc NEWS
index 74a6e46,c3f54e0..2b50914
--- a/NEWS
+++ b/NEWS
@@@ -2,250 -2,17 +2,261 @@@ New
  ====


 +Version 2.12.2
 +--------------
 +
 +*(Released Wed, 25 Mar 2015)*
 +
 +- Support for the lens Haskell library up to version 4.7 (issue #1028)
 +- SSH keys are now distributed only to master and master candidates
 +  (issue #377)
 +- Improved performance for operations that frequently read the
 +  cluster configuration
 +- Improved robustness of spawning job processes that occasionally caused
 +  newly-started jobs to timeout
 +- Fixed race condition during cluster verify which occasionally caused
 +  it to fail
 +
 +Inherited from the 2.11 branch:
 +
 +- Fix failing automatic glusterfs mounts (issue #984)
 +- Fix watcher failing to read its status file after an upgrade
 +  (issue #1022)
 +- Improve Xen instance state handling, in particular of somewhat exotic
 +  transitional states
 +
 +Inherited from the 2.10 branch:
 +
 +- Fix failing to change a diskless drbd instance to plain
 +  (issue #1036)
 +- Fixed issues with auto-upgrades from pre-2.6
 +  (hv_state_static and disk_state_static)
 +- Fix memory leak in the monitoring daemon
 +
 +Inherited from the 2.9 branch:
 +
 +- Fix file descriptor leak in Confd client
 +
 +Known issues
 +~~~~~~~~~~~~
 +
 +- GHC 7.8 introduced some incompatible changes, so currently Ganeti
 +  2.12. doesn't compile on GHC 7.8
 +- Under certain conditions instance doesn't get unpaused after live
 +  migration (issue #1050)
 +- GlusterFS support breaks at upgrade to 2.12 - switches back to
 +  shared-file (issue #1030)
 +
 +
 +Version 2.12.1
 +--------------
 +
 +*(Released Wed, 14 Jan 2015)*
 +
 +- Fix users under which the wconfd and metad daemons run (issue #976)
 +- Clean up stale livelock files (issue #865)
 +- Fix setting up the metadata daemon's network interface for Xen
 +- Make watcher identify itself on disk activation
 +- Add "ignore-ipolicy" option to gnt-instance grow-disk
 +- Check disk size ipolicy during "gnt-instance grow-disk" (issue #995)
 +
 +Inherited from the 2.11 branch:
 +
 +- Fix counting votes when doing master failover (issue #962)
 +- Fix broken haskell dependencies (issues #758 and #912)
 +- Check if IPv6 is used directly when running SSH (issue #892)
 +
 +Inherited from the 2.10 branch:
 +
 +- Fix typo in gnt_cluster output (issue #1015)
 +- Use the Python path detected at configure time in the top-level Python
 +  scripts.
 +- Fix check for sphinx-build from python2-sphinx
 +- Properly check if an instance exists in 'gnt-instance console'
 +
 +
 +Version 2.12.0
 +--------------
 +
 +*(Released Fri, 10 Oct 2014)*
 +
 +Incompatible/important changes
 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 +
 +- Ganeti is now distributed under the 2-clause BSD license.
 +  See the COPYING file.
 +- Do not use debug mode in production. Certain daemons will issue warnings
 +  when launched in debug mode. Some debug logging violates some of the new
 +  invariants in the system (see "New features"). The logging has been kept as
 +  it aids diagnostics and development.
 +
 +New features
 +~~~~~~~~~~~~
 +
 +- OS install script parameters now come in public, private and secret
 +  varieties:
 +
 +  - Public parameters are like all other parameters in Ganeti.
 +  - Ganeti will not log private and secret parameters, *unless* it is running
 +    in debug mode.
 +  - Ganeti will not save secret parameters to configuration. Secret parameters
 +    must be supplied every time you install, or reinstall, an instance.
 +  - Attempting to override public parameters with private or secret parameters
 +    results in an error. Similarly, you may not use secret parameters to
 +    override private parameters.
 +
 +- The move-instance tool can now attempt to allocate an instance by using
 +  opportunistic locking when an iallocator is used.
 +- The build system creates sample systemd unit files, available under
 +  doc/examples/systemd. These unit files allow systemd to natively
 +  manage and supervise all Ganeti processes.
 +- Different types of compression can be applied during instance moves, including
 +  user-specified ones.
 +- Ganeti jobs now run as separate processes. The jobs are coordinated by
 +  a new daemon "WConfd" that manages cluster's configuration and locks
 +  for individual jobs. A consequence is that more jobs can run in parallel;
 +  the number is run-time configurable, see "New features" entry
 +  of 2.11.0. To avoid luxid being overloaded with tracking running jobs, it
 +  backs of and only occasionally, in a sequential way, checks if jobs have
 +  finished and schedules new ones. In this way, luxid keeps responsive under
 +  high cluster load. The limit as when to start backing of is also run-time
 +  configurable.
 +- The metadata daemon is now optionally available, as part of the
 +  partial implementation of the OS-installs design. It allows pass
 +  information to OS install scripts or to instances.
 +  It is also possible to run Ganeti without the daemon, if desired.
 +- Detection of user shutdown of instances has been implemented for Xen
 +  as well.
 +
 +New dependencies
 +~~~~~~~~~~~~~~~~
 +
 +- The KVM CPU pinning no longer uses the affinity python package, but psutil
 +  instead. The package is still optional and needed only if the feature is to
 +  be used.
 +
 +Incomplete features
 +~~~~~~~~~~~~~~~~~~~
 +
 +The following issues are related to features which are not completely
 +implemented in 2.12:
 +
 +- Issue 885: Network hotplugging on KVM sometimes makes an instance
 +  unresponsive
 +- Issues 708 and 602: The secret parameters are currently still written
 +  to disk in the job queue.
 +- Setting up the metadata network interface under Xen isn't fully
 +  implemented yet.
 +
 +Known issues
 +~~~~~~~~~~~~
 +
 +- *Wrong UDP checksums in DHCP network packets:*
 +  If an instance communicates with the metadata daemon and uses DHCP to
 +  obtain its IP address on the provided virtual network interface,
 +  it can happen that UDP packets have a wrong checksum, due to
 +  a bug in virtio. See for example https://bugs.launchpad.net/bugs/930962
 +
 +  Ganeti works around this bug by disabling the UDP checksums on the way
 +  from a host to instances (only on the special metadata communication
 +  network interface) using the ethtool command. Therefore if using
 +  the metadata daemon the host nodes should have this tool available.
 +- The metadata daemon is run as root in the split-user mode, to be able
 +  to bind to port 80.
 +  This should be improved in future versions, see issue #949.
 +
 +Since 2.12.0 rc2
 +~~~~~~~~~~~~~~~~
 +
 +The following issues have been fixed:
 +
 +- Fixed passing additional parameters to RecreateInstanceDisks over
 +  RAPI.
 +- Fixed the permissions of WConfd when running in the split-user mode.
 +  As WConfd takes over the previous master daemon to manage the
 +  configuration, it currently runs under the masterd user.
 +- Fixed the permissions of the metadata daemon wn running in the
 +  split-user mode (see Known issues).
 +- Watcher now properly adds a reason trail entry when initiating disk
 +  checks.
 +- Fixed removing KVM parameters introduced in 2.12 when downgrading a
 +  cluster to 2.11: "migration_caps", "disk_aio" and "virtio_net_queues".
 +- Improved retrying of RPC calls that fail due to network errors.
 +
 +
 +Version 2.12.0 rc2
 +------------------
 +
 +*(Released Mon, 22 Sep 2014)*
 +
 +This was the second release candidate of the 2.12 series.
 +All important changes are listed in the latest 2.12 entry.
 +
 +Since 2.12.0 rc1
 +~~~~~~~~~~~~~~~~
 +
 +The following issues have been fixed:
 +
 +- Watcher now checks if WConfd is running and functional.
 +- Watcher now properly adds reason trail entries.
 +- Fixed NIC options in Xen's config files.
 +
 +Inherited from the 2.10 branch:
 +
 +- Fixed handling of the --online option
 +- Add warning against hvparam changes with live migrations, which might
 +  lead to dangerous situations for instances.
 +- Only the LVs in the configured VG are checked during cluster verify.
 +
 +
 +Version 2.12.0 rc1
 +------------------
 +
 +*(Released Wed, 20 Aug 2014)*
 +
 +This was the first release candidate of the 2.12 series.
 +All important changes are listed in the latest 2.12 entry.
 +
 +Since 2.12.0 beta1
 +~~~~~~~~~~~~~~~~~~
 +
 +The following issues have been fixed:
 +
 +- Issue 881: Handle communication errors in mcpu
 +- Issue 883: WConfd leaks memory for some long operations
 +- Issue 884: Under heavy load the IAllocator fails with a "missing
 +  instance" error
 +
 +Inherited from the 2.10 branch:
 +
 +- Improve the recognition of Xen domU states
 +- Automatic upgrades:
 +  - Create the config backup archive in a safe way
 +  - On upgrades, check for upgrades to resume first
 +  - Pause watcher during upgrade
 +- Allow instance disks to be added with --no-wait-for-sync
 +
 +
 +Version 2.12.0 beta1
 +--------------------
 +
 +*(Released Mon, 21 Jul 2014)*
 +
 +This was the first beta release of the 2.12 series. All important changes
 +are listed in the latest 2.12 entry.
 +
 +
+ Version 2.11.7
+ --------------
+
+ *(Released Fri, 17 Apr 2015)*
+
+ - The operation 'gnt-cluster renew-crypto --new-node-certificates' is
+   now more robust against intermitten reachability errors. Nodes that
+   are temporarily not reachable, are contacted with several retries.
+   Nodes which are marked as offline are omitted right away.
+
+
  Version 2.11.6
  --------------

diff --cc lib/cmdlib/cluster.py
index 7d75239,5cd96b1..f22c810
--- a/lib/cmdlib/cluster.py
+++ b/lib/cmdlib/cluster.py
@@@ -114,43 -115,69 +114,64 @@@ class LUClusterRenewCrypto(NoHooksLU)
    def Exec(self, feedback_fn):
      master_uuid = self.cfg.GetMasterNode()
++    cluster = self.cfg.GetClusterInfo()
++
+     logging.debug("Renewing the master's SSL node certificate."
+                   " Master's UUID: %s.", master_uuid)
 -    cluster = self.cfg.GetClusterInfo()

      server_digest = utils.GetCertificateDigest(
        cert_filename=pathutils.NODED_CERT_FILE)
+     logging.debug("SSL digest of the node certificate: %s.", server_digest)
 -    utils.AddNodeToCandidateCerts("%s-SERVER" % master_uuid,
 -                                  server_digest,
 -                                  cluster.candidate_certs)
 +    self.cfg.AddNodeToCandidateCerts("%s-SERVER" % master_uuid,
 +                                     server_digest)
+     logging.debug("Added master's digest as *-SERVER entry to configuration."
+                   " Current list of candidate certificates: %s.",
+                   str(cluster.candidate_certs))
 -
      try:
        old_master_digest = utils.GetCertificateDigest(
          cert_filename=pathutils.NODED_CLIENT_CERT_FILE)
+       logging.debug("SSL digest of old master's SSL node certificate: %s.",
+                     old_master_digest)
 -      utils.AddNodeToCandidateCerts("%s-OLDMASTER" % master_uuid,
 -                                    old_master_digest,
 -                                    cluster.candidate_certs)
 +      self.cfg.AddNodeToCandidateCerts("%s-OLDMASTER" % master_uuid,
 +                                       old_master_digest)
+       logging.debug("Added old master's node certificate digest to config"
+                     " as *-OLDMASTER. Current list of candidate certificates:"
+                     " %s.", str(cluster.candidate_certs))
 -
      except IOError:
-       logging.info("No old certificate available.")
+       logging.info("No old master certificate available.")

      last_exception = None
-     for _ in range(self._MAX_NUM_RETRIES):
+     for i in range(self._MAX_NUM_RETRIES):
        try:
          # Technically it should not be necessary to set the cert
          # paths. However, due to a bug in the mock library, we
          # have to do this to be able to test the function properly.
          _UpdateMasterClientCert(
 -            self, master_uuid, cluster, feedback_fn,
 +            self, self.cfg, master_uuid,
              client_cert=pathutils.NODED_CLIENT_CERT_FILE,
              client_cert_tmp=pathutils.NODED_CLIENT_CERT_FILE_TMP)
+         logging.debug("Successfully renewed the master's node certificate.")
          break
        except errors.OpExecError as e:
+         logging.error("Renewing the master's SSL node certificate failed"
+                       " at attempt no. %s with error '%s'", str(i), e)
          last_exception = e
      else:
        if last_exception:
          feedback_fn("Could not renew the master's client SSL certificate."
-                      " Cleaning up. Error: %s." % last_exception)
+                     " Cleaning up. Error: %s." % last_exception)
          # Cleaning up temporary certificates
 -        utils.RemoveNodeFromCandidateCerts("%s-SERVER" % master_uuid,
 -                                           cluster.candidate_certs)
 -        utils.RemoveNodeFromCandidateCerts("%s-OLDMASTER" % master_uuid,
 -                                           cluster.candidate_certs)
 +        self.cfg.RemoveNodeFromCandidateCerts("%s-SERVER" % master_uuid)
 +        self.cfg.RemoveNodeFromCandidateCerts("%s-OLDMASTER" % master_uuid)
+         logging.debug("Cleaned up *-SERVER and *-OLDMASTER certificate from"
+                       " master candidate cert list. Current state of the"
+                       " list: %s.", str(cluster.candidate_certs))
          try:
            utils.RemoveFile(pathutils.NODED_CLIENT_CERT_FILE_TMP)
-         except IOError:
-           pass
+         except IOError as e:
+           logging.debug("Could not clean up temporary node certificate of the"
+                         " master node. (Possibly because it was already removed"
+                         " properly.) Error: %s.", e)
          return

      node_errors = {}
@@@ -165,8 -196,12 +190,11 @@@
          try:
            new_digest = CreateNewClientCert(self, node_uuid)
            if node_info.master_candidate:
 -            utils.AddNodeToCandidateCerts(node_uuid,
 -                                          new_digest,
 -                                          cluster.candidate_certs)
 +            self.cfg.AddNodeToCandidateCerts(node_uuid,
 +                                             new_digest)
+             logging.debug("Added the node's certificate to candidate"
+                           " certificate list. Current list: %s.",
+                           str(cluster.candidate_certs))
            break
          except errors.OpExecError as e:
            last_exception = e
@@@ -182,8 -221,17 +214,15 @@@
          msg += "Node %s: %s\n" % (uuid, e)
        feedback_fn(msg)

 -    utils.RemoveNodeFromCandidateCerts("%s-SERVER" % master_uuid,
 -                                       cluster.candidate_certs)
 -    utils.RemoveNodeFromCandidateCerts("%s-OLDMASTER" % master_uuid,
 -                                       cluster.candidate_certs)
 +    self.cfg.RemoveNodeFromCandidateCerts("%s-SERVER" % master_uuid)
 +    self.cfg.RemoveNodeFromCandidateCerts("%s-OLDMASTER" % master_uuid)
+     logging.debug("Cleaned up *-SERVER and *-OLDMASTER certificate from"
+                   " master candidate cert list. Current state of the"
+                   " list: %s.", cluster.candidate_certs)
+
+     # Trigger another update of the config now with the new master cert
+     logging.debug("Trigger an update of the configuration on all nodes.")
+     self.cfg.Update(cluster, feedback_fn)


  class LUClusterActivateMasterIp(NoHooksLU):

--
Klaus Aehlig
Google Germany GmbH, Dienerstr. 12, 80331 Muenchen
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschaeftsfuehrer: Graham Law, Christine Elizabeth Flores
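
The retry loop in LUClusterRenewCrypto.Exec relies on Python's for/else idiom: the else branch runs only when the loop finishes without hitting break, that is, when every attempt failed. A minimal standalone sketch of that pattern follows. It is not Ganeti code: renew_with_retries, RenewError, the two callbacks and the default of 3 retries are hypothetical stand-ins (the patch itself only shows self._MAX_NUM_RETRIES and errors.OpExecError).

```python
import logging

MAX_NUM_RETRIES = 3  # hypothetical value; the patch only references self._MAX_NUM_RETRIES


class RenewError(Exception):
    """Stand-in for errors.OpExecError in this sketch."""


def renew_with_retries(renew_fn, cleanup_fn, max_retries=MAX_NUM_RETRIES):
    """Call renew_fn until it succeeds or max_retries attempts are used up.

    Returns True on success. If all attempts fail, calls
    cleanup_fn(last_exception) and returns False.
    """
    last_exception = None
    for attempt in range(max_retries):
        try:
            renew_fn()
            break  # success: the else branch below is skipped
        except RenewError as err:
            logging.error("Renewal failed at attempt no. %s with error '%s'",
                          attempt, err)
            last_exception = err
    else:
        # Reached only if the loop never hit `break`, i.e. no attempt succeeded.
        cleanup_fn(last_exception)
        return False
    return True
```

A caller would pass the renewal step and a cleanup step, for example renew_with_retries(do_renew, do_cleanup), and continue with the per-node work only when the result is True.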
Merge branch 'stable-2.11' into stable-2.12
'Klaus Aehlig' via ganeti-devel Mon, 20 Apr 2015 06:21:22 -0700
