## Summary

Swift storage policies using ISA-L Vandermonde (`isa_l_rs_vand`) and having 5 or more parity bits will no longer be allowed. Swift's services will refuse to start unless these policies are deprecated. All existing data in these policies should be migrated to a different storage policy as soon as possible.

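As a rough illustration of the rule above, the following sketch scans the text of a Swift config file for policies that combine `isa_l_rs_vand` with 5 or more parity fragments (`ec_num_parity_fragments`). It is not part of Swift: the function name `find_unsafe_policies` and the set of strings treated as "true" for the `deprecated` option are assumptions made for this example.

```python
import configparser

UNSAFE_EC_TYPE = "isa_l_rs_vand"
MIN_UNSAFE_PARITY = 5
# Assumption: strings treated as "true" for the `deprecated` option.
TRUE_VALUES = {"true", "1", "yes", "on", "t", "y"}


def find_unsafe_policies(conf_text):
    """Return (section, name, deprecated) for each affected policy.

    Hypothetical helper: a policy is reported when its ec_type is
    isa_l_rs_vand and it has 5 or more parity fragments.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    found = []
    for section in parser.sections():
        if not section.startswith("storage-policy:"):
            continue
        opts = parser[section]
        if opts.get("ec_type") != UNSAFE_EC_TYPE:
            continue
        parity = opts.getint("ec_num_parity_fragments", fallback=0)
        if parity < MIN_UNSAFE_PARITY:
            continue
        deprecated = opts.get("deprecated", "no").strip().lower() in TRUE_VALUES
        found.append((section, opts.get("name", ""), deprecated))
    return found
```

To try it, pass the contents of your config file, for example `find_unsafe_policies(open("/etc/swift/swift.conf").read())`. Any entry reported with `deprecated` set to `False` is a policy that would prevent services from starting once the change lands.
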
Using ISA-L's Cauchy mode (`isa_l_rs_cauchy`) with 5 or more parity bits is safe (as is Cauchy mode with fewer than 5 parity bits). Using ISA-L's Vandermonde mode (`isa_l_rs_vand`) with fewer than 5 parity bits is safe.

This change is expected in the next Swift release (2.15.0) and will be included in Pike.

## Background

Late last year, we discovered that a particular config setting for erasure codes in Swift would expose a bug in one of the supported erasure code libraries (Intel's ISA-L) and could result in data becoming corrupted. **THIS DATA CORRUPTION BUG HAS BEEN FIXED**, and the fix was included in liberasurecode 1.3.1. We also bumped the dependency version for liberasurecode in Swift to remove the immediate danger to Swift clusters.

When we updated the liberasurecode version dependency, we also added a warning in Swift if we detected a storage policy using `isa_l_rs_vand` with 5 or more parity bits. These changes were released in Swift 2.13.0 (and in the OpenStack Ocata release).

An example of a bad erasure code policy config in `/etc/swift/swift.conf` is:

```ini
[storage-policy:2]
name = deepfreeze7-6
aliases = df76
policy_type = erasure_coding
ec_type = isa_l_rs_vand
ec_num_data_fragments = 7
ec_num_parity_fragments = 6
ec_object_segment_size = 1048576
```

For deeper context and background, Swift's upstream erasure code docs are at <https://docs.openstack.org/developer/swift/overview_erasure_code.html>

## What's about to happen?

After <https://review.openstack.org/#/c/468105/> lands, Swift services won't start if you have an EC storage policy with `isa_l_rs_vand` and 5 or more parity bits unless that policy is deprecated. It will not be possible to create new containers with this policy. Existing objects will be readable and you can still write to containers previously created with this storage policy.

This proposed patch is expected to be included in Swift's next release, 2.15.0. The OpenStack Pike release will include either Swift 2.15.x or Swift 2.16.x.

## Why now?

Although Swift and liberasurecode will no longer actively corrupt data, it's still possible that some failures will result in an inability to reconstruct missing erasure code fragments to restore full durability. Operators should immediately cease using `isa_l_rs_vand` with 5 or more parity bits, and migrate all data stored in such a policy to a different storage policy. Since data movement takes time, this process should be started as soon as possible.

## What do ops need to do right now?

1. Ensure that you are using liberasurecode 1.4.0 or later.
2. Identify any storage policies using `isa_l_rs_vand` with 5 or more parity bits.
3. For each policy found, deprecate the storage policy.
4. Rename the bad policy to reflect its deprecated state. After renaming this policy, you can create an alias on the new policy that matches the name of the old policy. This will provide continuity for client apps. See the Examples section below.
5. If you need to keep an erasure code policy with the same data/parity balance, create a new one using `isa_l_rs_cauchy` (this requires liberasurecode 1.4.0 or later). Note that a new storage policy must have a unique name.
6. Begin migrating existing data from the deprecated policy to a different storage policy. Depending on the amount of data stored, this may take a long time. At this time, there are no upstream tools to facilitate this, but the process is a matter of GET'ing the data from the existing container (with the deprecated policy) and PUT'ing the data to the new container (using the new policy).

One way to migrate the data to a new policy is to use Swift's container sync feature. Using container sync will preserve object metadata and timestamps.

Another option is to write a tool that iterates over existing data and sends COPY requests to copy it to a new container. The advantage of this second option is that it's cheap to get started and doesn't require changing anything on the server side.

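A minimal sketch of the request-building half of such a COPY-based tool follows. It only constructs the requests and sends nothing. The function names, the example storage URL, and the token handling are assumptions for illustration; the sketch also assumes the destination container already exists and was created with the new policy, and that the list of object names comes from a container listing you have fetched separately.

```python
from urllib.parse import quote
from urllib.request import Request


def build_copy_request(storage_url, token, src_container, dst_container, obj_name):
    """Build (but do not send) a COPY request for one object.

    Hypothetical helper: storage_url is the account URL, for example
    https://swift.example.com/v1/AUTH_test (an invented endpoint).
    """
    source_url = "%s/%s/%s" % (
        storage_url.rstrip("/"), quote(src_container), quote(obj_name))
    # The Destination header names the target as "container/object",
    # URL-encoded.
    destination = "%s/%s" % (quote(dst_container), quote(obj_name))
    headers = {"X-Auth-Token": token, "Destination": destination}
    return Request(source_url, headers=headers, method="COPY")


def plan_migration(storage_url, token, src_container, dst_container, object_names):
    """Return one COPY request per object name, in listing order."""
    return [
        build_copy_request(storage_url, token, src_container, dst_container, name)
        for name in object_names
    ]
```

Each returned request could then be sent with `urllib.request.urlopen(req)`. A real tool would also need to page through large container listings, retry failures, and verify the copies before deleting anything from the deprecated policy.
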
Note that when objects move from one container to another, their URLs will change.

**Caution:** A deprecated policy cannot also be the default policy. Therefore, if your default policy uses `isa_l_rs_vand` and 5 or more parity bits, you will need to configure a different default policy before deprecating the policy with the bad config. That may mean adding another storage policy to act as the default, or making another existing policy the default.

## Examples

Good config after doing the above steps:

```ini
# this policy is deprecated and replaced by storage policy 3
[storage-policy:2]
name = deepfreeze7-6-deprecated
policy_type = erasure_coding
ec_type = isa_l_rs_vand
ec_num_data_fragments = 7
ec_num_parity_fragments = 6
ec_object_segment_size = 1048576
deprecated = true

# this policy is the good replacement for the one above
[storage-policy:3]
name = deepfreeze7-6
aliases = df76
policy_type = erasure_coding
ec_type = isa_l_rs_cauchy
ec_num_data_fragments = 7
ec_num_parity_fragments = 6
ec_object_segment_size = 1048576
```

## Need more help?

Please feel free to ask any questions either here on the mailing list or in #openstack-swift on freenode IRC.

Thank you for placing your trust in us to store your data in Swift. We are all deeply saddened by this bug and the extra work it may cause operators.

John Dickinson, Swift PTL
The OpenStack Swift community