I had to fix some style errors and warnings; please run the automatic checks before sending a patch. A simple `make` didn't run through with this patch applied.
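For example (assuming the standard tree targets; the exact lint target may differ), something like

  $ ./autogen.sh && ./configure
  $ make && make lint

should complete cleanly before the patch goes out.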
LGTM (I pushed the patch to master), thanks,
Thomas

On Thu, Jul 25, 2013 at 11:07 PM, Pulkit Singhal <[email protected]> wrote:

> Hi Thomas,
>
> I've sent an updated design document. Forgot to fix the "node (" issue
> though :P
>
> The details regarding 'common operations' are added under 'Ceph
> configuration on Ganeti nodes'. Please mention if any other details are
> required.
>
>
> On Fri, Jul 26, 2013 at 2:21 AM, Pulkit Singhal <[email protected]> wrote:
>
>> Signed-off-by: Pulkit Singhal <[email protected]>
>> ---
>>  doc/design-ceph-ganeti-support.rst | 184 ++++++++++++++++++++++++++++++++++++
>>  doc/design-draft.rst               |   1 +
>>  2 files changed, 185 insertions(+)
>>  create mode 100644 doc/design-ceph-ganeti-support.rst
>>
>> diff --git a/doc/design-ceph-ganeti-support.rst b/doc/design-ceph-ganeti-support.rst
>> new file mode 100644
>> index 0000000..8d0229c
>> --- /dev/null
>> +++ b/doc/design-ceph-ganeti-support.rst
>> @@ -0,0 +1,184 @@
>> +============================
>> +RADOS/Ceph support in Ganeti
>> +============================
>> +
>> +.. contents:: :depth: 4
>> +
>> +Objective
>> +=========
>> +
>> +The project aims to improve Ceph RBD support in Ganeti. It can be
>> +primarily divided into the following tasks:
>> +
>> +- Use the Qemu/KVM RBD driver to provide instances with direct RBD
>> +  support.
>> +- Allow configuration of Ceph RBDs through Ganeti.
>> +- Write a data collector to monitor Ceph nodes.
>> +
>> +Background
>> +==========
>> +
>> +Ceph RBD
>> +--------
>> +
>> +Ceph is a distributed storage system which provides data access as
>> +files, objects and blocks. As part of this project, we're interested
>> +in integrating Ceph's block device (RBD) directly with Qemu/KVM.
>> +
>> +The primary components/daemons of Ceph are:
>> +
>> +- Monitor - serves as the authentication point for clients.
>> +- Metadata - stores all the filesystem metadata (not configured here,
>> +  as it is not required for RBD).
>> +- OSD - object storage devices; one daemon for each drive/location.
>> +
>> +RBD support in Ganeti
>> +---------------------
>> +
>> +Currently, Ganeti supports RBD volumes on a pre-configured Ceph
>> +cluster. This is enabled through RBD disk templates. These templates
>> +allow access to RBD volumes through the RBD Linux driver: the volumes
>> +are mapped to the host as local block devices, which are then attached
>> +to the instances. This method incurs additional overhead. We plan to
>> +eliminate it by using Qemu's RBD driver to give KVM instances direct
>> +access to RBD volumes.
>> +
>> +Also, Ganeti currently requires a pre-configured Ceph cluster.
>> +Allowing Ceph nodes to be configured through Ganeti would be a good
>> +addition to its prime features.
>> +
>> +
>> +Qemu/KVM Direct RBD Integration
>> +===============================
>> +
>> +A new disk parameter ``access`` is introduced. It is added at
>> +cluster/node-group level to simplify the prototype implementation.
>> +It specifies the access method, either ``userspace`` or
>> +``kernelspace``, and is accessible to StartInstance() in hv_kvm.py.
>> +The device path, ``rbd:<pool>/<vol_name>``, is generated by
>> +RADOSBlockDevice and is added to the params dictionary as
>> +``kvm_dev_path``.
>> +
>> +This approach ensures that no disk-template-specific changes are
>> +required in hv_kvm.py, allowing easy integration of other distributed
>> +storage systems (like Gluster).
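>> +
>> +As a minimal illustrative sketch (the helper name and disk attributes
>> +below are assumptions for this document, not a final API), the path
>> +selection could look like::
>> +
>> +  # Sketch only: names below are illustrative, not the final
>> +  # hv_kvm.py API.
>> +  def _GetDiskDevicePath(disk, access):
>> +    """Return the path QEMU should use for an RBD disk."""
>> +    if access == "userspace":
>> +      # Hand the volume directly to QEMU's built-in RBD driver; no
>> +      # local kernel mapping is needed for instance I/O.
>> +      return "rbd:%s/%s" % (disk.rbd_pool, disk.rbd_name)
>> +    # "kernelspace": use the local block device created by
>> +    # "rbd map", exactly as the current RBD disk template does.
>> +    return disk.dev_path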
>> +
>> +Note that the RBD volume is mapped as a local block device as before.
>> +The local mapping won't be used during instance operation in the
>> +``userspace`` access mode, but it can still be used by administrators
>> +and OS scripts.
>> +
>> +Updated commands
>> +----------------
>> +
>> +::
>> +
>> +  $ gnt-instance info
>> +
>> +``access:userspace/kernelspace`` will be added to the Disks category.
>> +This output applies to KVM-based instances only.
>> +
>> +Ceph configuration on Ganeti nodes
>> +==================================
>> +
>> +This document proposes the configuration of a distributed storage
>> +pool (Ceph or Gluster) through Ganeti. Currently, this design
>> +document focuses on configuring a Ceph cluster. A prerequisite of
>> +this setup is the installation of the Ceph packages on all the nodes
>> +concerned.
>> +
>> +At Ganeti cluster init, the user will set distributed-storage-specific
>> +options which will be stored at cluster level. The storage cluster
>> +will be initialized using ``gnt-storage``. For the prototype, only a
>> +single storage pool/node-group is configured.
>> +
>> +The following steps take place when a node-group is initialized as a
>> +storage cluster:
>> +
>> +- Check for an existing Ceph cluster through the /etc/ceph/ceph.conf
>> +  file on each node.
>> +- Fetch the cluster configuration parameters and create a distributed
>> +  storage object accordingly.
>> +- Issue an 'init distributed storage' RPC to the group nodes (if any).
>> +- On each node, the ``ceph`` CLI tool will start the appropriate
>> +  services.
>> +- Mark the nodes as well as the node-group as
>> +  distributed-storage-enabled.
>> +
>> +The storage cluster will operate at node-group level. The Ceph
>> +cluster will be initialized using gnt-storage, to which a new
>> +sub-command ``init-distributed-storage`` will be added.
>> +
>> +The configuration of the nodes will be handled through an init
>> +function called by the node daemons running on the respective nodes.
>> +A new RPC is introduced to handle the calls.
>> +
>> +A new object will be created to send the storage parameters to the
>> +node: storage_type, devices, node_role (mon/osd), etc., as sketched
>> +below.
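>> +
>> +A minimal sketch of such a payload (key names and the exact layout
>> +are assumptions for illustration, not a final interface)::
>> +
>> +  # Sketch only: keys and values are illustrative.
>> +  storage_params = {
>> +    "storage_type": "ceph",
>> +    "node_role": ["mon", "osd"],     # daemons to run on this node
>> +    "devices": ["/dev/sdb"],         # drives handed to the OSDs
>> +    "options": {"option": "value"},  # extra cluster-level settings
>> +  }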
>> +
>> +A new node can be directly assigned to the storage-enabled
>> +node-group. During the 'gnt-node add' process, the required Ceph
>> +daemons will be started and the node will be added to the Ceph
>> +cluster.
>> +
>> +Only an offline node can be assigned to the storage-enabled
>> +node-group. ``gnt-node add --readd`` needs to be performed to issue
>> +the RPCs for spawning the appropriate services on the newly assigned
>> +node.
>> +
>> +Updated Commands
>> +----------------
>> +
>> +The following commands are affected::
>> +
>> +  $ gnt-cluster init -S ceph:disk=/dev/sdb,option=value...
>> +
>> +During cluster initialization, Ceph-specific options are provided
>> +which apply at cluster level::
>> +
>> +  $ gnt-cluster modify -S ceph:option=value2...
>> +
>> +For now, cluster modification will be allowed only when there is no
>> +initialized storage cluster::
>> +
>> +  $ gnt-storage init-distributed-storage -s{--storage-type} ceph \
>> +      <node-group>
>> +
>> +Ensure that no other node-group is configured as a distributed
>> +storage cluster and configure Ceph on the specified node-group. If
>> +there is no node in the node-group, it will only be marked as
>> +distributed-storage-enabled and no action will be taken::
>> +
>> +  $ gnt-group assign-nodes <group> <node>
>> +
>> +This ensures that the node is offline if the specified node-group is
>> +distributed-storage-capable. Ceph configuration on the newly assigned
>> +node is not performed at this step::
>> +
>> +  $ gnt-node --offline
>> +
>> +If the node is part of the storage node-group, an offline call will
>> +stop/remove the Ceph daemons::
>> +
>> +  $ gnt-node add --readd
>> +
>> +If the node is now part of the storage node-group, the init
>> +distributed storage RPC is issued to the respective node. This step
>> +is required after assigning a node to the storage-enabled
>> +node-group::
>> +
>> +  $ gnt-node remove
>> +
>> +A warning will be issued stating that the node is part of distributed
>> +storage and must be marked offline before removal.
>> +
>> +Data collector for Ceph
>> +-----------------------
>> +
>> +TBD
>> +
>> +Future Work
>> +-----------
>> +
>> +Due to the loopback bug in Ceph, one may run into daemon hangs when
>> +performing writes to an RBD volume through the block device mapping.
>> +This bug applies only when the RBD volume is stored on an OSD running
>> +on the local node. In order to mitigate this issue, we can create
>> +storage pools on different node-groups and access RBD volumes on
>> +different pools.
>> +http://tracker.ceph.com/issues/3076
>> +
>> +.. vim: set textwidth=72 :
>> +.. Local Variables:
>> +.. mode: rst
>> +.. fill-column: 72
>> +.. End:
>> diff --git a/doc/design-draft.rst b/doc/design-draft.rst
>> index f164c7c..f49885f 100644
>> --- a/doc/design-draft.rst
>> +++ b/doc/design-draft.rst
>> @@ -24,6 +24,7 @@ Design document drafts
>>     design-cmdlib-unittests.rst
>>     design-hotplug.rst
>>     design-optables.rst
>> +   design-ceph-ganeti-support.rst
>>
>>  .. vim: set textwidth=72 :
>>  .. Local Variables:
>> --
>> 1.7.9.5
>>
>>
>

--
Thomas Thrainer | Software Engineer | [email protected] | Google Germany GmbH
