LGTM

Thanks,
Guido

On Thu, Dec 19, 2013 at 11:51 AM, Klaus Aehlig <[email protected]> wrote:
>
>
> commit 4038f067bf4004ff6a4c2ef9b1dc339b3332341c
> Merge: 9ba3870 a5c5097
> Author: Klaus Aehlig <[email protected]>
> Date:   Thu Dec 19 11:06:22 2013 +0100
>
>     Merge branch 'stable-2.10' into master
>
>     * stable-2.10
>       Version bump for 2.10.0~rc1
>       Update NEWS for 2.10.0 rc1 release
>       Fix pylint 0.26.0/Python 2.7 warning
>       Update INSTALL and devnotes for 2.10 release
>     * stable-2.9
>       Bump revision for 2.9.2
>       Update NEWS for 2.9.2 release
>       Pass hvparams to GetInstanceInfo
>       Adapt parameters that moved to instance variables
>       Avoid lines longer than 80 chars
>       SingleNotifyPipeCondition: don't share pollers
>       KVM: use custom KVM path if set for version checking
>     * stable-2.8
>       Version bump for 2.8.3
>       Update NEWS for 2.8.3 release
>       Support reseting arbitrary params of ext disks
>       Allow modification of arbitrary params for ext
>       Do not clear disk.params in UpgradeConfig()
>       SetDiskID() before accepting an instance
>       Lock group(s) when creating instances
>       Fix job error message after unclean master shutdown
>       Add default file_driver if missing
>       Update tests
>       Xen handle domain shutdown
>       Fix evacuation out of drained node
>       Refactor reading live data in htools
>       master-up-setup: Ping multiple times with a shorter interval
>       Add a packet number limit to "fping" in master-ip-setup
>       Fix a bug in InstanceSetParams concerning names
>       build_chroot: hard-code the version of blaze-builder
>       Fix error printing
>       Allow link local IPv6 gateways
>       Fix NODE/NODE_RES locking in LUInstanceCreate
>       eta-reduce isIpV6
>       Ganeti.Rpc: use brackets for ipv6 addresses
>       Update NEWS file with socket permission fix info
>       Fix socket permissions after master-failover
>
>     Conflicts:
>       NEWS
>       devel/build_chroot
>       lib/cmdlib/instance.py
>       lib/hypervisor/hv_xen.py
>       lib/jqueue.py
>       src/Ganeti/Luxi.hs
>       tools/cfgupgrade
>     Resolution:
>     - tools/cfgupgrade: ignore downgrade changes from 2.10
>     - NEWS: take both
>       changes
>     - devel/build_chroot: both changes differed only in indentation;
>       use indentation from master
>     - lib/hypervisor/hv_xen.py: manually apply fd201010 and 70d8491f to
>       the conflicting hunks from stable-2.10
>     - lib/jqueue.py: manually apply 9cbcb1be to the conflicting hunk from
>       master
>     Semantic conflicts:
>     - configure.ac: undo revision bump to ~rc1
>     - lib/query.py: manually merge the two independently added
>       functions _GetInstAllNicVlans
>
> diff --cc NEWS
> index 0caba80,26c2616..b0e8159
> --- a/NEWS
> +++ b/NEWS
> @@@ -2,62 -2,10 +2,62 @@@ New
>   ====
>
>
> +Version 2.11.0 alpha1
> +---------------------
> +
> +*(unreleased)*
> +
> +Incompatible/important changes
> +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> +
> +- ``gnt-node list`` no longer shows disk space information for shared file
> +  disk templates because it is not a node attribute. (For example, if you have
> +  both the file and shared file disk templates enabled, ``gnt-node list`` now
> +  only shows information about the file disk template.)
> +- The shared file disk template is now in the new 'sharedfile' storage type.
> +  As a result, ``gnt-node list-storage -t file`` now only shows information
> +  about the file disk template and you may use ``gnt-node list-storage -t
> +  sharedfile`` to query storage information for the shared file disk template.
> +- Over luxi, syntactically incorrect queries are now rejected as a whole;
> +  before, a 'SubmitManyJobs' request was partially executed, if the outer
> +  structure of the request was syntactically correct. As the luxi protocol
> +  is internal (external applications are expected to use RAPI), the impact
> +  of this incompatible change should be limited.
> +- Queries for nodes, instances, groups, backups and networks are now
> +  exclusively done via the luxi daemon. Legacy python code was removed,
> +  as well as the --enable-split-queries configuration option.
> +- Orphan volumes errors are demoted to warnings and no longer affect the exit
> +  code of ``gnt-cluster verify``.
> +
> +New features
> +~~~~~~~~~~~~
> +
> +- Instance moves, backups and imports can now use compression to transfer the
> +  instance data.
> +- Node groups can be configured to use an SSH port different than the
> +  default 22.
> +- Added experimental support for Gluster distributed file storage as the
> +  ``gluster`` disk template under the new ``sharedfile`` storage type through
> +  automatic management of per-node FUSE mount points. You can configure the
> +  mount point location at ``gnt-cluster init`` time by using the new
> +  ``--gluster-storage-dir`` switch.
> +
> +New dependencies
> +~~~~~~~~~~~~~~~~
> +The following new dependencies have been added:
> +
> +For Haskell:
> +
> +- ``zlib`` library (http://hackage.haskell.org/package/zlib)
> +
> +- ``base64-bytestring`` library
> +  (http://hackage.haskell.org/package/base64-bytestring), at least
> +  version 1.0.0.0
> +
> +
> - Version 2.10.0 alpha1
> - ---------------------
> + Version 2.10.0 rc1
> + ------------------
>
> - *(unreleased)*
> + *(Released Tue, 17 Dec 2013)*
>
>   Incompatible/important changes
>   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> diff --cc configure.ac
> index 80bb790,6e06d89..24ea047
> --- a/configure.ac
> +++ b/configure.ac
> @@@ -1,8 -1,8 +1,8 @@@
>   # Configure script for Ganeti
>   m4_define([gnt_version_major], [2])
>  -m4_define([gnt_version_minor], [10])
>  +m4_define([gnt_version_minor], [11])
>   m4_define([gnt_version_revision], [0])
> - m4_define([gnt_version_suffix], [~alpha1])
> + m4_define([gnt_version_suffix], [~rc1])
>   m4_define([gnt_version_full],
>       m4_format([%d.%d.%d%s],
>                 gnt_version_major, gnt_version_minor,
> diff --cc devel/build_chroot
> index 1dfa3b4,f34ef19..39bec38
> --- a/devel/build_chroot
> +++ b/devel/build_chroot
> @@@ -235,13 -229,19 +235,24 @@@ case $DIST_RELEASE i
>      python-bitarray python-ipaddr python-yaml qemu-utils python-coverage pep8 \
>      shelltestrunner python-dev pylint openssh-client vim git git-email
>
> +  # We need version 0.9.4 of pyinotify because the packaged version, 0.9.3, is
> +  # incompatible with the packaged version of python-epydoc 3.0.1.
> +  # Reason: a logger class in pyinotify calculates its superclasses at
> +  # runtime, which clashes with python-epydoc's static analysis phase.
> +  #
> +  # Problem introduced in:
> +  #   https://github.com/seb-m/pyinotify/commit/2c7e8f8959d2f8528e0d90847df360
> +  # and "fixed" in:
> +  #   https://github.com/seb-m/pyinotify/commit/98c5f41a6e2e90827a63ff1b878596
> +
> +  in_chroot -- \
>      easy_install pyinotify==0.9.4
>
> +  in_chroot -- \
> +    cabal update
> +
> +  in_chroot -- \
> +    cabal install --global base64-bytestring
>    ;;
>
>    *)
> diff --cc lib/cli.py
> index 2f8f715,8ed7773..01c5ed0
> --- a/lib/cli.py
> +++ b/lib/cli.py
> @@@ -95,9 -94,9 +95,10 @@@ __all__ =
>    "GATEWAY6_OPT",
>    "GLOBAL_FILEDIR_OPT",
>    "HID_OS_OPT",
> +  "GLOBAL_GLUSTER_FILEDIR_OPT",
>    "GLOBAL_SHARED_FILEDIR_OPT",
>    "HOTPLUG_OPT",
> +  "HOTPLUG_IF_POSSIBLE_OPT",
>    "HVLIST_OPT",
>    "HVOPTS_OPT",
>    "HYPERVISOR_OPT",
> diff --cc lib/hypervisor/hv_xen.py
> index e499e07,047e563..0301a52
> --- a/lib/hypervisor/hv_xen.py
> +++ b/lib/hypervisor/hv_xen.py
> @@@ -665,46 -622,44 +661,61 @@@ class XenHypervisor(hv_base.BaseHypervi
>
>      return self._StopInstance(name, force, instance.hvparams)
>
> -  def _ShutdownInstance(self, name, hvparams, instance_info):
> -    # The '-w' flag waits for shutdown to complete
> -    #
> -    # In the case of shutdown, we want to wait until the shutdown
> -    # process is complete because then we want to also destroy the
> -    # domain, and we do not want to destroy the domain while it is
> -    # shutting down.
> -    if hv_base.HvInstanceState.IsShutdown(instance_info):
> -      logging.info("Instance '%s' is already shutdown, skipping shutdown"
> -                   " command", name)
> -    else:
> -      result = self._RunXen(["shutdown", "-w", name], hvparams)
> -      if result.failed:
> -        raise errors.HypervisorError("Failed to shutdown instance %s: %s, %s" %
> -                                     (name, result.fail_reason, result.output))
> +  def _ShutdownInstance(self, name, hvparams):
> +    """Shutdown an instance if the instance is running.
> +
> +    @type name: string
> +    @param name: name of the instance to stop
> +    @type hvparams: dict of string
> +    @param hvparams: hypervisor parameters of the instance
> +
> +    The '-w' flag waits for shutdown to complete which avoids the need
> +    to poll in the case where we want to destroy the domain
> +    immediately after shutdown.
> +
> +    """
> +    instance_info = self.GetInstanceInfo(name, hvparams=hvparams)
> +
> +    if instance_info is None or _IsInstanceShutdown(instance_info[4]):
> +      logging.info("Failed to shutdown instance %s, not running", name)
> +      return None
> +
> +    return self._RunXen(["shutdown", "-w", name], hvparams)
>
>    def _DestroyInstance(self, name, hvparams):
> -    result = self._RunXen(["destroy", name], hvparams)
> +    """Destroy an instance if the instance exists.
>
> -    if result.failed:
> -      raise errors.HypervisorError("Failed to destroy instance %s: %s, %s" %
> -                                   (name, result.fail_reason, result.output))
> +
> +    @type name: string
> +    @param name: name of the instance to destroy
> +    @type hvparams: dict of string
> +    @param hvparams: hypervisor parameters of the instance
> +
> +    """
> +    instance_info = self.GetInstanceInfo(name, hvparams=hvparams)
> +
> +    if instance_info is None:
> +      logging.info("Failed to destroy instance %s, does not exist", name)
> +      return None
> +
> +    return self._RunXen(["destroy", name], hvparams)
>
> +  # Destroy a domain only if necessary
> +  #
> +  # This method checks if the domain has already been destroyed before
> +  # issuing the 'destroy' command. This step is necessary to handle
> +  # domains created by other versions of Ganeti. For example, an
> +  # instance created with 2.10 will be destroyed by '_ShutdownInstance',
> +  # thus not requiring an additional destroy, which would cause an
> +  # error if issued. See issue 619.
> +  def _DestroyInstanceIfAlive(self, name, hvparams):
> +    instance_info = self.GetInstanceInfo(name, hvparams=hvparams)
> +
> +    if instance_info is None:
> +      raise errors.HypervisorError("Failed to destroy instance %s, already"
> +                                   " destroyed" % name)
> +    else:
> +      self._DestroyInstance(name, hvparams)
> +
>    def _StopInstance(self, name, force, hvparams):
>      """Stop an instance.
>
> @@@ -716,17 -673,16 +729,22 @@@
>      @param hvparams: hypervisor parameters of the instance
>
>      """
> +    instance_info = self.GetInstanceInfo(name, hvparams=hvparams)
> +
> +    if instance_info is None:
> +      raise errors.HypervisorError("Failed to shutdown instance %s,"
> +                                   " not running" % name)
> +
>      if force:
> -      self._DestroyInstance(name, hvparams)
>  -      result = self._DestroyInstance(name, hvparams)
> ++      result = self._DestroyInstanceIfAlive(name, hvparams)
>      else:
> -      self._ShutdownInstance(name, hvparams, instance_info[4])
> -      self._DestroyInstanceIfAlive(name, hvparams)
> +      self._ShutdownInstance(name, hvparams)
>  -      result = self._DestroyInstance(name, hvparams)
> ++      result = self._DestroyInstanceIfAlive(name, hvparams)
> +
> +    if result is not None and result.failed and \
> +      self.GetInstanceInfo(name, hvparams=hvparams) is not None:
> +      raise errors.HypervisorError("Failed to stop instance %s: %s, %s" %
> +                                   (name, result.fail_reason, result.output))
>
>      # Remove configuration file if stopping/starting instance was successful
>      self._RemoveConfigFile(name)
> diff --cc lib/jqueue.py
> index 2457c32,2011cf2..1cd7499
> --- a/lib/jqueue.py
> +++ b/lib/jqueue.py
> @@@ -1707,44 -1706,69 +1707,44 @@@ class JobQueue(object)
>
>      # Setup worker pool
>      self._wpool = _JobQueueWorkerPool(self)
> -    try:
> -      self._InspectQueue()
> -    except:
> -      self._wpool.TerminateWorkers()
> -      raise
>
> -  @locking.ssynchronized(_LOCK)
> -  @_RequireOpenQueue
> -  def _InspectQueue(self):
> -    """Loads the whole job queue and resumes unfinished jobs.
> +  def _PickupJobUnlocked(self, job_id):
> +    """Load a job from the job queue
>
> -    This function needs the lock here because WorkerPool.AddTask() may start a
> -    job while we're still doing our work.
> +    Pick up a job that already is in the job queue and start/resume it.
> > """ > - logging.info("Inspecting job queue") > - > - restartjobs = [] > - > - all_job_ids = self._GetJobIDsUnlocked() > - jobs_count = len(all_job_ids) > - lastinfo = time.time() > - for idx, job_id in enumerate(all_job_ids): > - # Give an update every 1000 jobs or 10 seconds > - if (idx % 1000 == 0 or time.time() >= (lastinfo + 10.0) or > - idx == (jobs_count - 1)): > - logging.info("Job queue inspection: %d/%d (%0.1f %%)", > - idx, jobs_count - 1, 100.0 * (idx + 1) / jobs_count) > - lastinfo = time.time() > - > - job = self._LoadJobUnlocked(job_id) > - > - # a failure in loading the job can cause 'None' to be returned > - if job is None: > - continue > + job = self._LoadJobUnlocked(job_id) > > - status = job.CalcStatus() > - > - if status == constants.JOB_STATUS_QUEUED: > - restartjobs.append(job) > - > - elif status in (constants.JOB_STATUS_RUNNING, > - constants.JOB_STATUS_WAITING, > - constants.JOB_STATUS_CANCELING): > - logging.warning("Unfinished job %s found: %s", job.id, job) > - > - if status == constants.JOB_STATUS_WAITING: > - # Restart job > - job.MarkUnfinishedOps(constants.OP_STATUS_QUEUED, None) > - restartjobs.append(job) > - else: > - to_encode = errors.OpExecError("Unclean master daemon shutdown") > - job.MarkUnfinishedOps(constants.OP_STATUS_ERROR, > - _EncodeOpError(to_encode)) > - job.Finalize() > + if job is None: > + logging.warning("Job %s could not be read", job_id) > + return > > - self.UpdateJobUnlocked(job) > + status = job.CalcStatus() > - > + if status == constants.JOB_STATUS_QUEUED: > + self._EnqueueJobsUnlocked([job]) > + logging.info("Restarting job %s", job.id) > + > + elif status in (constants.JOB_STATUS_RUNNING, > + constants.JOB_STATUS_WAITING, > + constants.JOB_STATUS_CANCELING): > + logging.warning("Unfinished job %s found: %s", job.id, job) > + > + if status == constants.JOB_STATUS_WAITING: > + job.MarkUnfinishedOps(constants.OP_STATUS_QUEUED, None) > + self._EnqueueJobsUnlocked([job]) > + logging.info("Restarting job 
%s", job.id) > + else: > ++ to_encode = errors.OpExecError("Unclean master daemon shutdown") > + job.MarkUnfinishedOps(constants.OP_STATUS_ERROR, > - "Unclean master daemon shutdown") > ++ _EncodeOpError(to_encode)) > + job.Finalize() > > - if restartjobs: > - logging.info("Restarting %s jobs", len(restartjobs)) > - self._EnqueueJobsUnlocked(restartjobs) > + self.UpdateJobUnlocked(job) > > - logging.info("Job queue inspection finished") > + @locking.ssynchronized(_LOCK) > + def PickupJob(self, job_id): > + self._PickupJobUnlocked(job_id) > > def _GetRpc(self, address_list): > """Gets RPC runner with context. > diff --cc src/Ganeti/Luxi.hs > index 1fb2602,033fd69..52b585f > --- a/src/Ganeti/Luxi.hs > +++ b/src/Ganeti/Luxi.hs > @@@ -169,22 -194,115 +169,21 @@@ $(genAllConstr (drop 3) ''LuxiReq "allL > -- | The serialisation of LuxiOps into strings in messages. > $(genStrOfOp ''LuxiOp "strOfOp") > > --- | Type holding the initial (unparsed) Luxi call. > -data LuxiCall = LuxiCall LuxiReq JSValue > - > --- | The end-of-message separator. > -eOM :: Word8 > -eOM = 3 > - > --- | The end-of-message encoded as a ByteString. > -bEOM :: B.ByteString > -bEOM = B.singleton eOM > - > --- | Valid keys in the requests and responses. > -data MsgKeys = Method > - | Args > - | Success > - | Result > - > --- | The serialisation of MsgKeys into strings in messages. > -$(genStrOfKey ''MsgKeys "strOfKey") > > --- | Luxi client encapsulation. > -data Client = Client { socket :: Handle -- ^ The socket of the > client > - , rbuf :: IORef B.ByteString -- ^ Already received > buffer > - } > +luxiConnectConfig :: ConnectConfig > +luxiConnectConfig = ConnectConfig { connDaemon = GanetiLuxid > + , recvTmo = luxiDefRwto > + , sendTmo = luxiDefRwto > + } > > -- | Connects to the master daemon and returns a luxi Client. 
> -getClient :: String -> IO Client
> -getClient path = do
> -  s <- S.socket S.AF_UNIX S.Stream S.defaultProtocol
> -  withTimeout luxiDefCtmo "creating luxi connection" $
> -    S.connect s (S.SockAddrUnix path)
> -  rf <- newIORef B.empty
> -  h <- S.socketToHandle s ReadWriteMode
> -  return Client { socket=h, rbuf=rf }
> +getLuxiClient :: String -> IO Client
> +getLuxiClient = connectClient luxiConnectConfig luxiDefCtmo
>
>  -- | Creates and returns a server endpoint.
> -getServer :: Bool -> FilePath -> IO S.Socket
> -getServer setOwner path = do
> -  s <- S.socket S.AF_UNIX S.Stream S.defaultProtocol
> -  S.bindSocket s (S.SockAddrUnix path)
> -  when setOwner $ do
> -    setOwnerAndGroupFromNames path GanetiLuxid $ ExtraGroup DaemonsGroup
> -    setFileMode path $ fromIntegral luxiSocketPerms
> -  S.listen s 5 -- 5 is the max backlog
> -  return s
> -
> --- | Closes a server endpoint.
> --- FIXME: this should be encapsulated into a nicer type.
> -closeServer :: FilePath -> S.Socket -> IO ()
> -closeServer path sock = do
> -  S.sClose sock
> -  removeFile path
> -
> --- | Accepts a client
> -acceptClient :: S.Socket -> IO Client
> -acceptClient s = do
> -  -- second return is the address of the client, which we ignore here
> -  (client_socket, _) <- S.accept s
> -  new_buffer <- newIORef B.empty
> -  handle <- S.socketToHandle client_socket ReadWriteMode
> -  return Client { socket=handle, rbuf=new_buffer }
> -
> --- | Closes the client socket.
> -closeClient :: Client -> IO ()
> -closeClient = hClose . socket
> -
> --- | Sends a message over a luxi transport.
> -sendMsg :: Client -> String -> IO ()
> -sendMsg s buf = withTimeout luxiDefRwto "sending luxi message" $ do
> -  let encoded = UTF8L.fromString buf
> -      handle = socket s
> -  BL.hPut handle encoded
> -  B.hPut handle bEOM
> -  hFlush handle
> -
> --- | Given a current buffer and the handle, it will read from the
> --- network until we get a full message, and it will return that
> --- message and the leftover buffer contents.
> -recvUpdate :: Handle -> B.ByteString -> IO (B.ByteString, B.ByteString)
> -recvUpdate handle obuf = do
> -  nbuf <- withTimeout luxiDefRwto "reading luxi response" $ do
> -            _ <- hWaitForInput handle (-1)
> -            B.hGetNonBlocking handle 4096
> -  let (msg, remaining) = B.break (eOM ==) nbuf
> -      newbuf = B.append obuf msg
> -  if B.null remaining
> -    then recvUpdate handle newbuf
> -    else return (newbuf, B.tail remaining)
> -
> --- | Waits for a message over a luxi transport.
> -recvMsg :: Client -> IO String
> -recvMsg s = do
> -  cbuf <- readIORef $ rbuf s
> -  let (imsg, ibuf) = B.break (eOM ==) cbuf
> -  (msg, nbuf) <-
> -    if B.null ibuf      -- if old buffer didn't contain a full message
> -      then recvUpdate (socket s) cbuf  -- then we read from network
> -      else return (imsg, B.tail ibuf)  -- else we return data from our buffer
> -  writeIORef (rbuf s) nbuf
> -  return $ UTF8.toString msg
> -
> --- | Extended wrapper over recvMsg.
> -recvMsgExt :: Client -> IO RecvResult
> -recvMsgExt s =
> -  Control.Exception.catch (liftM RecvOk (recvMsg s)) $ \e ->
> -    return $ if isEOFError e
> -               then RecvConnClosed
> -               else RecvError (show e)
> +getLuxiServer :: Bool -> FilePath -> IO Server
> +getLuxiServer = connectServer luxiConnectConfig
> -
>  -- | Serialize a request to String.
>  buildCall :: LuxiOp  -- ^ The method
>            -> String  -- ^ The serialized form
>
> --
> Klaus Aehlig
> Google Germany GmbH, Dienerstr. 12, 80331 Muenchen
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg
> Geschaeftsfuehrer: Graham Law, Christine Elizabeth Flores

--
Guido Trotter
Ganeti Engineering
Google Germany GmbH
Dienerstr. 12, 80331, München
Registergericht und -nummer: Hamburg, HRB 86891
Sitz der Gesellschaft: Hamburg
Geschäftsführer: Graham Law, Christine Elizabeth Flores
Steuernummer: 48/725/00206
Umsatzsteueridentifikationsnummer: DE813741370
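For readers following the Luxi.hs hunk above: the removed `sendMsg`/`recvMsg` code (now encapsulated by `connectClient`/`connectServer`) implements a simple wire framing in which each message is a UTF-8 payload terminated by the end-of-message byte 3 (`eOM`), and any bytes received after the terminator stay in the client's read buffer for the next message. A minimal Python sketch of that framing, with illustrative names that are not part of Ganeti:

```python
# End-of-message separator, matching `eOM = 3` in the removed Haskell code.
EOM = b"\x03"

def frame(payload: str) -> bytes:
    """Encode one message for the wire: UTF-8 payload plus the EOM byte."""
    return payload.encode("utf-8") + EOM

def unframe(buf: bytes):
    """Split a receive buffer into complete messages and the leftover tail.

    The tail plays the role of the `rbuf` leftover in the removed
    `recvMsg`: an incomplete message that must wait for more network data.
    """
    parts = buf.split(EOM)
    # Every element but the last is a complete, terminated message;
    # the last element is the (possibly empty) unterminated remainder.
    return [p.decode("utf-8") for p in parts[:-1]], parts[-1]

msgs, rest = unframe(frame("hello") + frame("world") + b"partial")
```

Here `msgs` holds the two complete messages and `rest` keeps the unterminated `b"partial"` for the next read, mirroring how `recvUpdate` appends network data until it sees the separator.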
