On Mon, 2019-09-09 at 17:26 +0200, Johannes Berg wrote:
> 
> Maybe instead we should just add a "VHOST_USER_REPLY_ERROR" bit (e.g.
> bit 4 after NEED_REPLY). Qemu in vhost_user_read_header() validates that
> it received REPLY_MASK | VERSION, so it would reject the message at that
> point.
> 
> Another possibility would be to define the highest bit of the 'request'
> field to indicate an error, so for GET_FEATURES we'd return the value
> 0x80000000 | GET_FEATURES.
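
To make the two options concrete, here is a rough sketch in C. The
VERSION, REPLY_MASK and NEED_REPLY values are the ones from the
vhost-user header definition; the value of VHOST_USER_REPLY_ERROR, the
name VHOST_USER_REQUEST_ERROR, the struct name and both helper functions
are made up for illustration. The helpers only model the checks on the
master side, they are not the qemu code, and the zero payload size for an
error reply is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Existing values from the vhost-user header definition. */
#define VHOST_USER_VERSION          0x1u
#define VHOST_USER_REPLY_MASK       (0x1u << 2)
#define VHOST_USER_NEED_REPLY_MASK  (0x1u << 3)
#define VHOST_USER_GET_FEATURES     1u

/* Hypothetical, not in the spec: the two proposals from above. */
#define VHOST_USER_REPLY_ERROR      (0x1u << 4)  /* option 1: flag bit 4 */
#define VHOST_USER_REQUEST_ERROR    0x80000000u  /* option 2: top bit of 'request' */

/* Message header layout: request, flags, payload size. */
struct vhost_user_hdr {
    uint32_t request;
    uint32_t flags;
    uint32_t size;
};

/* Models the flags check described for vhost_user_read_header(). */
static bool reply_flags_ok(const struct vhost_user_hdr *hdr)
{
    return hdr->flags == (VHOST_USER_REPLY_MASK | VHOST_USER_VERSION);
}

/* Assumed check: the reply must carry the request that was sent. */
static bool reply_request_ok(const struct vhost_user_hdr *hdr, uint32_t sent)
{
    return hdr->request == sent;
}

void error_reply_demo(void)
{
    /* Option 1: error signalled through an extra bit in 'flags'. */
    struct vhost_user_hdr opt1 = {
        .request = VHOST_USER_GET_FEATURES,
        .flags   = VHOST_USER_VERSION | VHOST_USER_REPLY_MASK |
                   VHOST_USER_REPLY_ERROR,
        .size    = 0, /* assumption: an error reply carries no payload */
    };
    /* Option 2: error signalled through the top bit of 'request'. */
    struct vhost_user_hdr opt2 = {
        .request = VHOST_USER_REQUEST_ERROR | VHOST_USER_GET_FEATURES,
        .flags   = VHOST_USER_VERSION | VHOST_USER_REPLY_MASK,
        .size    = 0,
    };

    assert(!reply_flags_ok(&opt1));  /* rejected by the flags check */
    assert(reply_flags_ok(&opt2));   /* flags look like a normal reply */
    assert(!reply_request_ok(&opt2, VHOST_USER_GET_FEATURES));
}
```

With option 1 the reply already fails the existing flags comparison.
With option 2 the flags look like a normal reply, so the error would only
be noticed wherever the 'request' field is compared against what was
sent.
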

However, one way or another, that basically leaves us with three
different ways of indicating an error:

 1) already defined errors in existing messages - we can't change them
    since those are handled at runtime now, e.g. VHOST_USER_POSTCOPY_END
    returns a u64 value with an error status, and current code cannot
    deal with an error flag in the 'request' or 'flags' field
 2) F_REPLY_ACK errors to messages that do not specify a response at all
 3) this new way of indicating an error back from messages that specify
    a response, but the response has no inherent way of returning an
    error

To me that really feels a bit too complex from the spec POV. But I don't
see a way to generalize this without another extension, and again the
device cannot choose which extensions it supports since the master
chooses them and just sets them.

Perhaps I really should just stick a "g_assert()" into the code at that
point, and have it crash, since it's likely that F_KICK_CALL_MSGS isn't
even going to be implemented in qemu (unless it grows simulation support
and then it'd all be conditional on some simulation command-line option).


And actually ... you got the order wrong:

> > Next command is GET_FEATURES. Return an error response from that
> > and device init will fail.

That's not the case. We *start* with GET_FEATURES, if that includes
protocol features then we do GET_PROTOCOL_FEATURES next, and then we get
the # of queues next ...

Though the whole discussion pretty much applies equivalently to
GET_QUEUE_NUM instead of GET_FEATURES.

johannes