On 09/08/2015 12:56 AM, Dr. David Alan Gilbert wrote: > * Eric Blake (ebl...@redhat.com) wrote: >> On 09/02/2015 02:51 AM, Wen Congyang wrote: >>> If the child is not ready, read/write/getlength/flush will >>> return -errno. It is not critical error, and can be ignored: >>> 1. read/write: >>> Just not report the error event. >> >> What happens if all the children report an error? Or is the threshold >> at play here? > > I think it's interesting because in the COLO case the intention isn't > really about a threshold (in the way you might use for RAID or mirroring), > it's that one of the stores is local (and not expected to error) and one > is somewhere over a network, so if it fails you don't want to stop > the local VM working. > > However, if it fails we do need to know about it; if any write to > the secondary stops then the fault-tolerance has failed (at least for > that drive); so we should do *something* - I'm not sure what though.
The filter driver replication catches this error, and the COLO framework will get this error when doing checkpoint. Thanks Wen Congyang > > Dave > > >> For example, if you have a threshold of 3/5, then I'm assuming that if >> up to two children return an errno, then it is okay to ignore; but if >> three or more return an errno, you haven't met threshold, so the I/O >> must fail. >> >> Are you ignoring ALL errors (including things like EACCES), or just EIO >> errors? >> >> >>> 2. getlength: >>> just ignore it. If all children's getlength return -errno, >>> and be ignored, return -EIO. >>> 3. flush: >>> Just ignore it. If all children's getlength return -errno, >> >> s/getlength/flush/ >> >>> and be ignored, return 0. >> >> Yuck - claiming success when all of the children fail feels dangerous. >> >>> >>> Usage: children.x.ignore-errors=true >>> >>> Signed-off-by: Wen Congyang <we...@cn.fujitsu.com> >>> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com> >>> Signed-off-by: Gonglei <arei.gong...@huawei.com> >>> Cc: Alberto Garcia <be...@igalia.com> >>> --- >>> block/quorum.c | 94 >>> ++++++++++++++++++++++++++++++++++++++++++++++++---- >> >> Interface review only: >> >>> +++ b/qapi/block-core.json >>> @@ -1411,6 +1411,8 @@ >>> # @allow-write-backing-file: #optional whether the backing file is opened >>> in >>> # read-write mode. It is only for backing file >>> # (Since 2.5 default: false) >>> +# @ignore-errors: #options whether the child's I/O error should be ignored. >> >> s/options/optional/ >> s/error/errors/ >> >>> +# it is only for quorum's child.(Since 2.5 default: false) >> >> Space after '.' in English sentences. >> >> The fact that you are documenting that this option can only be specified >> for quorum children makes me wonder if it belongs better as an option in >> BlockdevOptionsQuorum rather than BlockdevOptionsBase. >> >> Semantically, it sounds like you are trying to allow for a per-child >> decision of whether this particular child's errors matter to the overall >> quorum. So, if we have a 3/5 quorum, we can decide that for children A, >> B, C, and D, errors cannot be ignored, but for child E, errors are not a >> problem. >> >> As written, you are tying the semantics to each child BDS, and requiring >> special code to double-check that the property is only ever set if the >> BDS is used as the child of a quorum. Furthermore, if the property is >> set, you are then changing what the child does in response to various >> operations. >> >> What if you instead create a list property in the quorum parent? Maybe >> along the lines of: >> >> # @child-errors-okay: #optional an array of named-node children where >> errors will be ignored (Since 2.5, default empty) >> >> { 'struct': 'BlockdevOptionsQuorum', >> 'data': { '*blkverify': 'bool', >> 'children': [ 'BlockdevRef' ], >> 'vote-threshold': 'int', >> '*rewrite-corrupted': 'bool', >> '*read-pattern': 'QuorumReadPattern', >> '*child-errors-okay': ['str'] } } >> >> The above example of a 3/5 quorum, where only child E can ignore errors, >> would then be represented as: >> >> { "children": [ "A", "B", "C", "D", "E" ], 'vote-threshold':3, >> 'child-errors-okay': [ "E" ] } >> >> The code to ignore the errors is then done in the quorum itself (the BDS >> for E does not have to care about a special ignore-errors property, but >> just always returns the error as usual; and then the quorum is deciding >> how to handle the error), and you are not polluting the BDS state for >> something that is quorum-specific, because it is now the quorum itself >> that tracks the special casing. >> >> Finally, why can't hot-plug/unplug of quorum members work? If you are >> going to always ignore errors from a particular child, then why is that >> child even part of the quorum? Isn't a better design to just not add >> the child to the quorum until it is ready and won't be reporting errors? >> >> -- >> Eric Blake eblake redhat com +1-919-301-3266 >> Libvirt virtualization library http://libvirt.org >> > > > -- > Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK > . >