Hello everyone,

I'd like to improve the Swift deployments done by TripleO. There are a few problems today when deploying with the current defaults:

1. Adding new nodes (or replacing existing nodes) is not possible, because the rings are built locally on each host and a new node doesn't know about the "history" of the rings. The rings can therefore diverge between nodes, which eventually leaves the cluster in an unusable state.

2. The rings use only a single device, and it seems that this is just a directory and not a mountpoint with a real device. Data is therefore stored on the root device, even if you have 100TB of disk space in the background. If this is not fixed manually, your root device will eventually run out of space.

3. Even if a real disk is mounted in /srv/node, replacing a faulty disk is much more troublesome than it should be. Normally you would simply unmount the disk and replace it sometime later. But because mount_check is set to False in the storage servers, data will be written to the root device in the meantime, and when you finally mount the disk again, you can't simply clean up.

4. In general, it's not possible to change the cluster layout (using different zones/regions/partition power/device weight, adding new devices gradually so that 25% of the data isn't moved immediately when adding new nodes to a small cluster, ...). You could manage your rings manually, but they will eventually be overwritten when you update your overcloud.

5. Erasure coding support (or storage policies in general) is missing.

This sounds bad, but most of the current issues can be fixed using customized templates and some tooling to create the rings in advance on the undercloud node. The information about all the devices can be collected from the introspection data, and by using node placement the node names in the rings are known in advance, even if the nodes are not yet powered on. This ensures a consistent ring state, and an operator can modify the rings if needed to customize the cluster layout.
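
To make the idea of pre-building rings from introspection data a bit more concrete, here is a minimal sketch. It is not the code from the tool mentioned in this mail. The layout of the INTROSPECTION dict, the node names, the port and the "weight = size in GB" rule are all assumptions made for illustration; the sketch only prints swift-ring-builder commands that could be run on the undercloud.

```python
# Hypothetical, simplified introspection data: node name -> root disk + disks.
# Real introspection data has many more fields; this layout is assumed.
INTROSPECTION = {
    f"overcloud-objectstorage-{i}": {
        "root_disk": "/dev/sda",
        "disks": [
            {"name": "/dev/sda", "size": 50 * 10**9},
            {"name": "/dev/sdb", "size": 100 * 10**9},
            {"name": "/dev/sdc", "size": 100 * 10**9},
        ],
    }
    for i in range(3)
}


def ring_commands(nodes, builder="object.builder", part_power=10,
                  replicas=3, min_part_hours=1, port=6000):
    """Return swift-ring-builder commands for all non-root disks.

    Each node is placed in its own zone (an assumed layout), and the
    device weight is the disk size in GB (also an assumption).
    """
    cmds = [
        f"swift-ring-builder {builder} create "
        f"{part_power} {replicas} {min_part_hours}"
    ]
    for zone, (host, data) in enumerate(sorted(nodes.items()), start=1):
        for disk in data["disks"]:
            if disk["name"] == data["root_disk"]:
                continue  # never put the root device into the ring
            device = disk["name"].rsplit("/", 1)[-1]  # "/dev/sdb" -> "sdb"
            weight = disk["size"] // 10**9
            cmds.append(
                f"swift-ring-builder {builder} add --region 1 --zone {zone} "
                f"--ip {host} --port {port} --device {device} "
                f"--weight {weight}"
            )
    cmds.append(f"swift-ring-builder {builder} rebalance")
    return cmds


if __name__ == "__main__":
    for cmd in ring_commands(INTROSPECTION):
        print(cmd)
```

Because the commands are generated from data that is already available on the undercloud, the same builder file can be kept there and extended later when nodes or disks are added, instead of being recreated from scratch on each host.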

Using some customized templates we can already do the following:

- disable ring building on the nodes
- create filesystems on the extra block devices
- copy ring files from the undercloud, using pre-built rings
- enable mount_check by default
- (define storage policies if needed)

I started working on a POC using tripleo-quickstart, some custom templates and a small Python tool to build rings based on the introspection data:

https://github.com/cschwede/tripleo-swift-ring-tool

I'd like to get some feedback on the tool and templates.

- Does this make sense to you?
- How (and where) could we integrate this upstream?
- Could the templates be included in tripleo-heat-templates?

IMO the most important change would be to avoid overwriting rings on the overcloud. There is a good chance of messing up your cluster if the template to disable ring building isn't used and you already have working rings in place. The same applies to the mount_check option.

I'm curious about your thoughts!

Thanks,
Christian

-- 
Christian Schwede
Red Hat GmbH, Technopark II, Haus C, Werner-von-Siemens-Ring 11-15, 85630 Grasbrunn
Commercial register: Amtsgericht Muenchen, HRB 153243
Managing directors: Mark Hegarty, Charlie Peters, Michael Cunningham, Charles Cachera