LGTM

Thanks,
Guido

On Thu, Dec 19, 2013 at 11:51 AM, Klaus Aehlig <[email protected]> wrote:
>
>
> commit 4038f067bf4004ff6a4c2ef9b1dc339b3332341c
> Merge: 9ba3870 a5c5097
> Author: Klaus Aehlig <[email protected]>
> Date:   Thu Dec 19 11:06:22 2013 +0100
>
>     Merge branch 'stable-2.10' into master
>
>     * stable-2.10
>       Version bump for 2.10.0~rc1
>       Update NEWS for 2.10.0 rc1 release
>       Fix pylint 0.26.0/Python 2.7 warning
>       Update INSTALL and devnotes for 2.10 release
>     * stable-2.9
>       Bump revision for 2.9.2
>       Update NEWS for 2.9.2 release
>       Pass hvparams to GetInstanceInfo
>       Adapt parameters that moved to instance variables
>       Avoid lines longer than 80 chars
>       SingleNotifyPipeCondition: don't share pollers
>       KVM: use custom KVM path if set for version checking
>     * stable-2.8
>       Version bump for 2.8.3
>       Update NEWS for 2.8.3 release
>       Support reseting arbitrary params of ext disks
>       Allow modification of arbitrary params for ext
>       Do not clear disk.params in UpgradeConfig()
>       SetDiskID() before accepting an instance
>       Lock group(s) when creating instances
>       Fix job error message after unclean master shutdown
>       Add default file_driver if missing
>       Update tests
>       Xen handle domain shutdown
>       Fix evacuation out of drained node
>       Refactor reading live data in htools
>       master-up-setup: Ping multiple times with a shorter interval
>       Add a packet number limit to "fping" in master-ip-setup
>       Fix a bug in InstanceSetParams concerning names
>       build_chroot: hard-code the version of blaze-builder
>       Fix error printing
>       Allow link local IPv6 gateways
>       Fix NODE/NODE_RES locking in LUInstanceCreate
>       eta-reduce isIpV6
>       Ganeti.Rpc: use brackets for ipv6 addresses
>       Update NEWS file with socket permission fix info
>       Fix socket permissions after master-failover
>
>     Conflicts:
>       NEWS
>       devel/build_chroot
>       lib/cmdlib/instance.py
>       lib/hypervisor/hv_xen.py
>       lib/jqueue.py
>       src/Ganeti/Luxi.hs
>       tools/cfgupgrade
>     Resolution:
>     - tools/cfgupgrade: ignore downgrade changes from 2.10
>     - NEWS: take both
>       changes
>     - devel/build_chroot: both changes differed only in indentation;
>       use indentation from master
>     - lib/hypervisor/hv_xen.py: manually apply fd201010 and 70d8491f to
>       the conflicting hunks from stable-2.10
>     - lib/jqueue.py: manually apply 9cbcb1be to the conflicting hunk from
>       master
>     Semantic conflicts:
>     - configure.ac: undo revision bump to ~rc1
>     - lib/query.py: manually merge the two independently added
>       functions _GetInstAllNicVlans
>
> diff --cc NEWS
> index 0caba80,26c2616..b0e8159
> --- a/NEWS
> +++ b/NEWS
> @@@ -2,62 -2,10 +2,62 @@@ New
>   ====
>
>
> +Version 2.11.0 alpha1
> +---------------------
> +
> +*(unreleased)*
> +
> +Incompatible/important changes
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +- ``gnt-node list`` no longer shows disk space information for shared file
> +  disk templates because it is not a node attribute. (For example, if you have
> +  both the file and shared file disk templates enabled, ``gnt-node list`` now
> +  only shows information about the file disk template.)
> +- The shared file disk template is now in the new 'sharedfile' storage type.
> +  As a result, ``gnt-node list-storage -t file`` now only shows information
> +  about the file disk template and you may use ``gnt-node list-storage -t
> +  sharedfile`` to query storage information for the shared file disk template.
> +- Over luxi, syntactically incorrect queries are now rejected as a whole;
> +  before, a 'SubmitManyJobs' request was partially executed, if the outer
> +  structure of the request was syntactically correct. As the luxi protocol
> +  is internal (external applications are expected to use RAPI), the impact
> +  of this incompatible change should be limited.
> +- Queries for nodes, instances, groups, backups and networks are now
> +  exclusively done via the luxi daemon. Legacy python code was removed,
> +  as well as the --enable-split-queries configuration option.
> +- Orphan volumes errors are demoted to warnings and no longer affect the exit
> +  code of ``gnt-cluster verify``.
> +
> +New features
> +~~~~~~~~~~~~
> +
> +- Instance moves, backups and imports can now use compression to transfer the
> +  instance data.
> +- Node groups can be configured to use an SSH port different than the
> +  default 22.
> +- Added experimental support for Gluster distributed file storage as the
> +  ``gluster`` disk template under the new ``sharedfile`` storage type through
> +  automatic management of per-node FUSE mount points. You can configure the
> +  mount point location at ``gnt-cluster init`` time by using the new
> +  ``--gluster-storage-dir`` switch.
> +
> +New dependencies
> +~~~~~~~~~~~~~~~~
> +The following new dependencies have been added:
> +
> +For Haskell:
> +
> +- ``zlib`` library (http://hackage.haskell.org/package/zlib)
> +
> +- ``base64-bytestring`` library
> +  (http://hackage.haskell.org/package/base64-bytestring), at least
> +  version 1.0.0.0
> +
> +
> - Version 2.10.0 alpha1
> - ---------------------
> + Version 2.10.0 rc1
> + ------------------
>
> - *(unreleased)*
> + *(Released Tue, 17 Dec 2013)*
>
>   Incompatible/important changes
>   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> diff --cc configure.ac
> index 80bb790,6e06d89..24ea047
> --- a/configure.ac
> +++ b/configure.ac
> @@@ -1,8 -1,8 +1,8 @@@
>   # Configure script for Ganeti
>   m4_define([gnt_version_major], [2])
>  -m4_define([gnt_version_minor], [10])
>  +m4_define([gnt_version_minor], [11])
>   m4_define([gnt_version_revision], [0])
> - m4_define([gnt_version_suffix], [~alpha1])
> + m4_define([gnt_version_suffix], [~rc1])
>   m4_define([gnt_version_full],
>       m4_format([%d.%d.%d%s],
>                 gnt_version_major, gnt_version_minor,
> diff --cc devel/build_chroot
> index 1dfa3b4,f34ef19..39bec38
> --- a/devel/build_chroot
> +++ b/devel/build_chroot
> @@@ -235,13 -229,19 +235,24 @@@ case $DIST_RELEASE i
>      python-bitarray python-ipaddr python-yaml qemu-utils python-coverage pep8 \
>      shelltestrunner python-dev pylint openssh-client vim git git-email
>
> +  # We need version 0.9.4 of pyinotify because the packaged version, 0.9.3, is
> +  # incompatible with the packaged version of python-epydoc 3.0.1.
> +  # Reason: a logger class in pyinotify calculates its superclasses at
> +  # runtime, which clashes with python-epydoc's static analysis phase.
> +  #
> +  # Problem introduced in:
> +  #   https://github.com/seb-m/pyinotify/commit/2c7e8f8959d2f8528e0d90847df360
> +  # and "fixed" in:
> +  #   https://github.com/seb-m/pyinotify/commit/98c5f41a6e2e90827a63ff1b878596
> +
> +  in_chroot -- \
>      easy_install pyinotify==0.9.4
>
> +  in_chroot -- \
> +    cabal update
> +
> +  in_chroot -- \
> +    cabal install --global base64-bytestring
>    ;;
>
>    *)
> diff --cc lib/cli.py
> index 2f8f715,8ed7773..01c5ed0
> --- a/lib/cli.py
> +++ b/lib/cli.py
> @@@ -95,9 -94,9 +95,10 @@@ __all__ =
>    "GATEWAY6_OPT",
>    "GLOBAL_FILEDIR_OPT",
>    "HID_OS_OPT",
> +  "GLOBAL_GLUSTER_FILEDIR_OPT",
>    "GLOBAL_SHARED_FILEDIR_OPT",
>    "HOTPLUG_OPT",
> +  "HOTPLUG_IF_POSSIBLE_OPT",
>    "HVLIST_OPT",
>    "HVOPTS_OPT",
>    "HYPERVISOR_OPT",
> diff --cc lib/hypervisor/hv_xen.py
> index e499e07,047e563..0301a52
> --- a/lib/hypervisor/hv_xen.py
> +++ b/lib/hypervisor/hv_xen.py
> @@@ -665,46 -622,44 +661,61 @@@ class XenHypervisor(hv_base.BaseHypervi
>
>      return self._StopInstance(name, force, instance.hvparams)
>
> -  def _ShutdownInstance(self, name, hvparams, instance_info):
> -    # The '-w' flag waits for shutdown to complete
> -    #
> -    # In the case of shutdown, we want to wait until the shutdown
> -    # process is complete because then we want to also destroy the
> -    # domain, and we do not want to destroy the domain while it is
> -    # shutting down.
> -    if hv_base.HvInstanceState.IsShutdown(instance_info):
> -      logging.info("Instance '%s' is already shutdown, skipping shutdown"
> -                   " command", name)
> -    else:
> -      result = self._RunXen(["shutdown", "-w", name], hvparams)
> -      if result.failed:
> -        raise errors.HypervisorError("Failed to shutdown instance %s: %s, %s" %
> -                                     (name, result.fail_reason, result.output))
> +  def _ShutdownInstance(self, name, hvparams):
> +    """Shutdown an instance if the instance is running.
> +
> +    @type name: string
> +    @param name: name of the instance to stop
> +    @type hvparams: dict of string
> +    @param hvparams: hypervisor parameters of the instance
> +
> +    The '-w' flag waits for shutdown to complete which avoids the need
> +    to poll in the case where we want to destroy the domain
> +    immediately after shutdown.
> +
> +    """
> +    instance_info = self.GetInstanceInfo(name, hvparams=hvparams)
> +
> +    if instance_info is None or _IsInstanceShutdown(instance_info[4]):
> +      logging.info("Failed to shutdown instance %s, not running", name)
> +      return None
> +
> +    return self._RunXen(["shutdown", "-w", name], hvparams)
>
>    def _DestroyInstance(self, name, hvparams):
> -    result = self._RunXen(["destroy", name], hvparams)
> +    """Destroy an instance if the instance exists.
>
> -    if result.failed:
> -      raise errors.HypervisorError("Failed to destroy instance %s: %s, %s" %
> -                                   (name, result.fail_reason, result.output))
> +
> +    @type name: string
> +    @param name: name of the instance to destroy
> +    @type hvparams: dict of string
> +    @param hvparams: hypervisor parameters of the instance
> +
> +    """
> +    instance_info = self.GetInstanceInfo(name, hvparams=hvparams)
> +
> +    if instance_info is None:
> +      logging.info("Failed to destroy instance %s, does not exist", name)
> +      return None
> +
> +    return self._RunXen(["destroy", name], hvparams)
>
> +  # Destroy a domain only if necessary
> +  #
> +  # This method checks if the domain has already been destroyed before
> +  # issuing the 'destroy' command. This step is necessary to handle
> +  # domains created by other versions of Ganeti. For example, an
> +  # instance created with 2.10 will be destroyed by '_ShutdownInstance',
> +  # thus not requiring an additional destroy, which would cause an
> +  # error if issued. See issue 619.
> +  def _DestroyInstanceIfAlive(self, name, hvparams):
> +    instance_info = self.GetInstanceInfo(name, hvparams=hvparams)
> +
> +    if instance_info is None:
> +      raise errors.HypervisorError("Failed to destroy instance %s, already"
> +                                   " destroyed" % name)
> +    else:
> +      self._DestroyInstance(name, hvparams)
> +
>    def _StopInstance(self, name, force, hvparams):
>      """Stop an instance.
>
> @@@ -716,17 -673,16 +729,22 @@@
>      @param hvparams: hypervisor parameters of the instance
>
>      """
> +    instance_info = self.GetInstanceInfo(name, hvparams=hvparams)
> +
> +    if instance_info is None:
> +      raise errors.HypervisorError("Failed to shutdown instance %s,"
> +                                   " not running" % name)
> +
>      if force:
> -      self._DestroyInstance(name, hvparams)
>  -      result = self._DestroyInstance(name, hvparams)
> ++      result = self._DestroyInstanceIfAlive(name, hvparams)
>      else:
> -      self._ShutdownInstance(name, hvparams, instance_info[4])
> -      self._DestroyInstanceIfAlive(name, hvparams)
> +      self._ShutdownInstance(name, hvparams)
>  -      result = self._DestroyInstance(name, hvparams)
> ++      result = self._DestroyInstanceIfAlive(name, hvparams)
> +
> +    if result is not None and result.failed and \
> +      self.GetInstanceInfo(name, hvparams=hvparams) is not None:
> +      raise errors.HypervisorError("Failed to stop instance %s: %s, %s" %
> +                                   (name, result.fail_reason, result.output))
>
>      # Remove configuration file if stopping/starting instance was successful
>      self._RemoveConfigFile(name)
> diff --cc lib/jqueue.py
> index 2457c32,2011cf2..1cd7499
> --- a/lib/jqueue.py
> +++ b/lib/jqueue.py
> @@@ -1707,44 -1706,69 +1707,44 @@@ class JobQueue(object)
>
>      # Setup worker pool
>      self._wpool = _JobQueueWorkerPool(self)
> -    try:
> -      self._InspectQueue()
> -    except:
> -      self._wpool.TerminateWorkers()
> -      raise
>
> -  @locking.ssynchronized(_LOCK)
> -  @_RequireOpenQueue
> -  def _InspectQueue(self):
> -    """Loads the whole job queue and resumes unfinished jobs.
> +  def _PickupJobUnlocked(self, job_id):
> +    """Load a job from the job queue
>
> -    This function needs the lock here because WorkerPool.AddTask() may start a
> -    job while we're still doing our work.
> +    Pick up a job that already is in the job queue and start/resume it.
> > """ > - logging.info("Inspecting job queue") > - > - restartjobs = [] > - > - all_job_ids = self._GetJobIDsUnlocked() > - jobs_count = len(all_job_ids) > - lastinfo = time.time() > - for idx, job_id in enumerate(all_job_ids): > - # Give an update every 1000 jobs or 10 seconds > - if (idx % 1000 == 0 or time.time() >= (lastinfo + 10.0) or > - idx == (jobs_count - 1)): > - logging.info("Job queue inspection: %d/%d (%0.1f %%)", > - idx, jobs_count - 1, 100.0 * (idx + 1) / jobs_count) > - lastinfo = time.time() > - > - job = self._LoadJobUnlocked(job_id) > - > - # a failure in loading the job can cause 'None' to be returned > - if job is None: > - continue > + job = self._LoadJobUnlocked(job_id) > > - status = job.CalcStatus() > - > - if status == constants.JOB_STATUS_QUEUED: > - restartjobs.append(job) > - > - elif status in (constants.JOB_STATUS_RUNNING, > - constants.JOB_STATUS_WAITING, > - constants.JOB_STATUS_CANCELING): > - logging.warning("Unfinished job %s found: %s", job.id, job) > - > - if status == constants.JOB_STATUS_WAITING: > - # Restart job > - job.MarkUnfinishedOps(constants.OP_STATUS_QUEUED, None) > - restartjobs.append(job) > - else: > - to_encode = errors.OpExecError("Unclean master daemon shutdown") > - job.MarkUnfinishedOps(constants.OP_STATUS_ERROR, > - _EncodeOpError(to_encode)) > - job.Finalize() > + if job is None: > + logging.warning("Job %s could not be read", job_id) > + return > > - self.UpdateJobUnlocked(job) > + status = job.CalcStatus() > - > + if status == constants.JOB_STATUS_QUEUED: > + self._EnqueueJobsUnlocked([job]) > + logging.info("Restarting job %s", job.id) > + > + elif status in (constants.JOB_STATUS_RUNNING, > + constants.JOB_STATUS_WAITING, > + constants.JOB_STATUS_CANCELING): > + logging.warning("Unfinished job %s found: %s", job.id, job) > + > + if status == constants.JOB_STATUS_WAITING: > + job.MarkUnfinishedOps(constants.OP_STATUS_QUEUED, None) > + self._EnqueueJobsUnlocked([job]) > + logging.info("Restarting job 
%s", job.id) > + else: > ++ to_encode = errors.OpExecError("Unclean master daemon shutdown") > + job.MarkUnfinishedOps(constants.OP_STATUS_ERROR, > - "Unclean master daemon shutdown") > ++ _EncodeOpError(to_encode)) > + job.Finalize() > > - if restartjobs: > - logging.info("Restarting %s jobs", len(restartjobs)) > - self._EnqueueJobsUnlocked(restartjobs) > + self.UpdateJobUnlocked(job) > > - logging.info("Job queue inspection finished") > + @locking.ssynchronized(_LOCK) > + def PickupJob(self, job_id): > + self._PickupJobUnlocked(job_id) > > def _GetRpc(self, address_list): > """Gets RPC runner with context. > diff --cc src/Ganeti/Luxi.hs > index 1fb2602,033fd69..52b585f > --- a/src/Ganeti/Luxi.hs > +++ b/src/Ganeti/Luxi.hs > @@@ -169,22 -194,115 +169,21 @@@ $(genAllConstr (drop 3) ''LuxiReq "allL > -- | The serialisation of LuxiOps into strings in messages. > $(genStrOfOp ''LuxiOp "strOfOp") > > --- | Type holding the initial (unparsed) Luxi call. > -data LuxiCall = LuxiCall LuxiReq JSValue > - > --- | The end-of-message separator. > -eOM :: Word8 > -eOM = 3 > - > --- | The end-of-message encoded as a ByteString. > -bEOM :: B.ByteString > -bEOM = B.singleton eOM > - > --- | Valid keys in the requests and responses. > -data MsgKeys = Method > - | Args > - | Success > - | Result > - > --- | The serialisation of MsgKeys into strings in messages. > -$(genStrOfKey ''MsgKeys "strOfKey") > > --- | Luxi client encapsulation. > -data Client = Client { socket :: Handle -- ^ The socket of the > client > - , rbuf :: IORef B.ByteString -- ^ Already received > buffer > - } > +luxiConnectConfig :: ConnectConfig > +luxiConnectConfig = ConnectConfig { connDaemon = GanetiLuxid > + , recvTmo = luxiDefRwto > + , sendTmo = luxiDefRwto > + } > > -- | Connects to the master daemon and returns a luxi Client. 
> -getClient :: String -> IO Client
> -getClient path = do
> -  s <- S.socket S.AF_UNIX S.Stream S.defaultProtocol
> -  withTimeout luxiDefCtmo "creating luxi connection" $
> -    S.connect s (S.SockAddrUnix path)
> -  rf <- newIORef B.empty
> -  h <- S.socketToHandle s ReadWriteMode
> -  return Client { socket=h, rbuf=rf }
> +getLuxiClient :: String -> IO Client
> +getLuxiClient = connectClient luxiConnectConfig luxiDefCtmo
>
>  -- | Creates and returns a server endpoint.
> -getServer :: Bool -> FilePath -> IO S.Socket
> -getServer setOwner path = do
> -  s <- S.socket S.AF_UNIX S.Stream S.defaultProtocol
> -  S.bindSocket s (S.SockAddrUnix path)
> -  when setOwner $ do
> -    setOwnerAndGroupFromNames path GanetiLuxid $ ExtraGroup DaemonsGroup
> -    setFileMode path $ fromIntegral luxiSocketPerms
> -  S.listen s 5 -- 5 is the max backlog
> -  return s
> -
> --- | Closes a server endpoint.
> --- FIXME: this should be encapsulated into a nicer type.
> -closeServer :: FilePath -> S.Socket -> IO ()
> -closeServer path sock = do
> -  S.sClose sock
> -  removeFile path
> -
> --- | Accepts a client
> -acceptClient :: S.Socket -> IO Client
> -acceptClient s = do
> -  -- second return is the address of the client, which we ignore here
> -  (client_socket, _) <- S.accept s
> -  new_buffer <- newIORef B.empty
> -  handle <- S.socketToHandle client_socket ReadWriteMode
> -  return Client { socket=handle, rbuf=new_buffer }
> -
> --- | Closes the client socket.
> -closeClient :: Client -> IO ()
> -closeClient = hClose . socket
> -
> --- | Sends a message over a luxi transport.
> -sendMsg :: Client -> String -> IO ()
> -sendMsg s buf = withTimeout luxiDefRwto "sending luxi message" $ do
> -  let encoded = UTF8L.fromString buf
> -      handle = socket s
> -  BL.hPut handle encoded
> -  B.hPut handle bEOM
> -  hFlush handle
> -
> --- | Given a current buffer and the handle, it will read from the
> --- network until we get a full message, and it will return that
> --- message and the leftover buffer contents.
> -recvUpdate :: Handle -> B.ByteString -> IO (B.ByteString, B.ByteString)
> -recvUpdate handle obuf = do
> -  nbuf <- withTimeout luxiDefRwto "reading luxi response" $ do
> -            _ <- hWaitForInput handle (-1)
> -            B.hGetNonBlocking handle 4096
> -  let (msg, remaining) = B.break (eOM ==) nbuf
> -      newbuf = B.append obuf msg
> -  if B.null remaining
> -    then recvUpdate handle newbuf
> -    else return (newbuf, B.tail remaining)
> -
> --- | Waits for a message over a luxi transport.
> -recvMsg :: Client -> IO String
> -recvMsg s = do
> -  cbuf <- readIORef $ rbuf s
> -  let (imsg, ibuf) = B.break (eOM ==) cbuf
> -  (msg, nbuf) <-
> -    if B.null ibuf      -- if old buffer didn't contain a full message
> -      then recvUpdate (socket s) cbuf  -- then we read from network
> -      else return (imsg, B.tail ibuf)  -- else we return data from our buffer
> -  writeIORef (rbuf s) nbuf
> -  return $ UTF8.toString msg
> -
> --- | Extended wrapper over recvMsg.
> -recvMsgExt :: Client -> IO RecvResult
> -recvMsgExt s =
> -  Control.Exception.catch (liftM RecvOk (recvMsg s)) $ \e ->
> -    return $ if isEOFError e
> -               then RecvConnClosed
> -               else RecvError (show e)
> +getLuxiServer :: Bool -> FilePath -> IO Server
> +getLuxiServer = connectServer luxiConnectConfig
> -
>  -- | Serialize a request to String.
>  buildCall :: LuxiOp  -- ^ The method
>            -> String  -- ^ The serialized form
>
> --
> Klaus Aehlig
> Google Germany GmbH, Dienerstr. 12, 80331 Muenchen
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg
> Geschaeftsfuehrer: Graham Law, Christine Elizabeth Flores

--
Guido Trotter
Ganeti Engineering
Google Germany GmbH
Dienerstr. 12, 80331, München
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Steuernummer: 48/725/00206
Umsatzsteueridentifikationsnummer: DE813741370
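For readers following the Luxi.hs hunk above: the removed `sendMsg`/`recvMsg` code (now encapsulated by `connectClient`/`connectServer`) implements a simple wire framing in which each message is a UTF-8 payload terminated by the end-of-message byte 3 (`eOM`), and any bytes received after the terminator stay in the client's read buffer for the next message. A minimal Python sketch of that framing, with illustrative names that are not part of Ganeti:

```python
# End-of-message separator, matching `eOM = 3` in the removed Haskell code.
EOM = b"\x03"

def frame(payload: str) -> bytes:
    """Encode one message for the wire: UTF-8 payload plus the EOM byte."""
    return payload.encode("utf-8") + EOM

def unframe(buf: bytes):
    """Split a receive buffer into complete messages and the leftover tail.

    The tail plays the role of the `rbuf` leftover in the removed
    `recvMsg`: an incomplete message that must wait for more network data.
    """
    parts = buf.split(EOM)
    # Every element but the last is a complete, terminated message;
    # the last element is the (possibly empty) unterminated remainder.
    return [p.decode("utf-8") for p in parts[:-1]], parts[-1]

msgs, rest = unframe(frame("hello") + frame("world") + b"partial")
```

Here `msgs` holds the two complete messages and `rest` keeps the unterminated `b"partial"` for the next read, mirroring how `recvUpdate` appends network data until it sees the separator.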
