On Fri, Nov 18, 2016 at 10:34:54AM +0000, Ketan Nilangekar wrote:
> 
> On 11/18/16, 12:56 PM, "Jeff Cody" <jc...@redhat.com> wrote:
> 
> >On Wed, Nov 16, 2016 at 08:12:41AM +0000, Stefan Hajnoczi wrote:
> >> On Tue, Nov 15, 2016 at 10:38 PM, ashish mittal <ashmit...@gmail.com>
> >> wrote:
> >> > On Wed, Sep 28, 2016 at 2:45 PM, Stefan Hajnoczi <stefa...@gmail.com>
> >> > wrote:
> >> >> On Tue, Sep 27, 2016 at 09:09:49PM -0700, Ashish Mittal wrote:
> >> >> 5.
> >> >> I don't see any endianness handling or portable alignment of struct
> >> >> fields in the network protocol code. Binary network protocols need to
> >> >> take care of these issues for portability. This means libqnio compiled
> >> >> for different architectures will not work. Do you plan to support any
> >> >> other architectures besides x86?
> >> >>
> >> >
> >> > No, we support only x86 and do not plan to support any other arch.
> >> > Please let me know if this necessitates any changes to the configure
> >> > script.
> >>
> >> I think no change to ./configure is necessary. The library will only
> >> ship on x86 so other platforms will never attempt to compile the code.
> >>
> >> >> 6.
> >> >> The networking code doesn't look robust: kvset uses assert() on input
> >> >> from the network so the other side of the connection could cause SIGABRT
> >> >> (coredump), the client uses the msg pointer as the cookie for the
> >> >> response packet so the server can easily crash the client by sending a
> >> >> bogus cookie value, etc. Even on the client side these things are
> >> >> troublesome but on a server they are guaranteed security issues. I
> >> >> didn't look into it deeply. Please audit the code.
> >> >>
> >> >
> >> > By design, our solution on the OpenStack platform uses a closed set of
> >> > nodes communicating on dedicated networks. VxHS servers on all the
> >> > nodes are on a dedicated network. Clients (qemu) connect to these
> >> > only after reading the server IP from the XML (read by libvirt). The
> >> > XML cannot be modified without proper access. Therefore, IMO this
> >> > problem would be relevant only if someone were to use qnio as a
> >> > generic mode of communication/data transfer, but for our use-case, we
> >> > will not run into this problem. Is this explanation acceptable?
> >>
> >> No. The trust model is that the guest is untrusted and in the worst
> >> case may gain code execution in QEMU due to security bugs.
> >>
> >> You are assuming block/vxhs.c and libqnio are trusted but that
> >> assumption violates the trust model.
> >>
> >> In other words:
> >> 1. Guest exploits a security hole inside QEMU and gains code execution
> >>    on the host.
> >> 2. Guest uses VxHS client file descriptor on host to send a malicious
> >>    packet to VxHS server.
> >> 3. VxHS server is compromised by guest.
> >> 4. Compromised VxHS server sends malicious packets to all other
> >>    connected clients.
> >> 5. All clients have been compromised.
> >>
> >> This means both the VxHS client and server must be robust. They have
> >> to validate inputs to avoid buffer overflows, assertion failures,
> >> infinite loops, etc.
> >>
> >> Stefan
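
To make the wire-protocol points above concrete, this is roughly the kind of
defensive parsing being asked for. The header layout and names below are
invented for illustration -- this is not libqnio's actual wire format -- but
the pattern is the point: fixed-width fields, explicit byte-order conversion,
and validation that returns an error instead of assert()ing on data received
from the peer:

    /* Illustration only -- invented header layout, not libqnio's wire format. */
    #include <stdint.h>
    #include <stddef.h>
    #include <string.h>
    #include <arpa/inet.h>      /* ntohl(), ntohs() */
    #include <endian.h>         /* be64toh() */

    #define HDR_MAGIC    0x514e494fu        /* made-up magic value */
    #define MAX_PAYLOAD  (16u * 1024 * 1024)

    struct wire_hdr {                       /* fixed-width, packed wire layout */
        uint32_t magic;
        uint16_t opcode;
        uint16_t flags;
        uint64_t cookie;        /* opaque id, looked up in a request table,
                                 * never dereferenced as a pointer */
        uint32_t payload_len;
    } __attribute__((packed));

    /* Returns 0 on success, -1 on a malformed header. */
    static int parse_hdr(const uint8_t *buf, size_t buf_len,
                         struct wire_hdr *out)
    {
        struct wire_hdr h;

        if (buf_len < sizeof(h)) {
            return -1;                      /* short packet: reject, don't abort */
        }
        memcpy(&h, buf, sizeof(h));         /* no unaligned loads from buf */

        out->magic       = ntohl(h.magic);
        out->opcode      = ntohs(h.opcode);
        out->flags       = ntohs(h.flags);
        out->cookie      = be64toh(h.cookie);
        out->payload_len = ntohl(h.payload_len);

        if (out->magic != HDR_MAGIC || out->payload_len > MAX_PAYLOAD) {
            return -1;                      /* untrusted peer sent garbage */
        }
        return 0;
    }

The cookie is the other half of point 6 above: treating it as an opaque id
that is looked up in a table of outstanding requests, rather than using a raw
pointer value, keeps a misbehaving server from being able to crash the
client.
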
> >The libqnio code is important with respect to the VxHS driver. It is a bit
> >different than other existing external protocol drivers, in that the current
> >user and developer base is small, and the code itself is pretty new. So I
> >think for the VxHS driver here upstream, we really do need to get some of
> >the libqnio issues squared away. I don't know if we've ever explicitly
> >addressed the extent to which libqnio issues affect the driver merging, so I
> >figure it is probably worth discussing here.
> >
> >To try and consolidate libqnio discussion, here is what I think I've read /
> >seen from others as the major issues that should be addressed in libqnio:
> >
> >* Code auditing, static analysis, and general code cleanup. Things like
> >  memory leaks shouldn't be happening, and some prior libqnio compiler
> >  warnings imply that there is more code analysis that should be done with
> >  libqnio.
> >
> >  (With regards to memory leaks: Valgrind may be useful to track these down:
> >
> >  # valgrind ./qemu-io -c 'write -pP 0xae 66000 128k' \
> >        vxhs://localhost/test.raw
> >
> >  ==30369== LEAK SUMMARY:
> >  ==30369==    definitely lost: 4,168 bytes in 2 blocks
> >  ==30369==    indirectly lost: 1,207,720 bytes in 58,085 blocks)
> 
> We have done and are doing exhaustive memory leak tests using valgrind.
> Memory leaks within qnio have been addressed to some extent. We will post
> detailed valgrind results to this thread.
> 

That is good to hear. I ran the above on the latest HEAD from the qnio
github repo, so I look forward to checking out the latest code once it is
available.
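
While leak hunting is in progress anyway: a common source of the "definitely
lost" blocks in reports like the one above is an error path that returns
early and skips the frees. Funnelling every failure through a single cleanup
label makes that harder to get wrong. The snippet below is a generic C
illustration of the idiom, not actual libqnio code:

    /* Generic error-path cleanup pattern (illustration only). */
    #include <stdlib.h>
    #include <string.h>

    struct channel {
        char *host;
        unsigned char *rxbuf;
    };

    struct channel *channel_open(const char *host, size_t bufsize)
    {
        struct channel *ch = calloc(1, sizeof(*ch));   /* zeroed, so the frees
                                                        * below are NULL-safe */
        if (!ch) {
            return NULL;
        }

        ch->host = strdup(host);
        if (!ch->host) {
            goto fail;
        }

        ch->rxbuf = malloc(bufsize);
        if (!ch->rxbuf) {
            goto fail;
        }
        return ch;

    fail:
        /* Single cleanup path: everything allocated so far is released. */
        free(ch->rxbuf);
        free(ch->host);
        free(ch);
        return NULL;
    }

Re-running the same qemu-io command under valgrind is then a quick way to
confirm a particular "definitely lost" record is actually gone.
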
> 
> >* Potential security issues such as buffer overruns, input validation, etc.,
> >  need to be audited.
> 
> We know of a few such issues from previous comments and have addressed
> some of those. If there are any important outstanding ones, please let us
> know and we will fix them as a priority.
> 

One concern is that the issues noted are not from an exhaustive review on
Stefan's part, AFAIK. When Stefan called for auditing the code, that is
really a call to look for other potential security flaws as well, using the
perspective he outlined on the trust model.

> 
> >* Async operations need to be truly asynchronous, without blocking calls.
> 
> There is only one blocking call in libqnio, in the reconnect path, which
> has been pointed out. We will fix this soon.
> 

Great, thanks!

> 
> >* Daniel pointed out that there is no authentication method for talking to a
> >  remote server. This seems a bit scary. Maybe all that is needed here is
> >  some clarification of the security scheme for authentication? My
> >  impression from above is that you are relying on the networks being
> >  private to provide some sort of implicit authentication, though, and this
> >  seems fragile (and doesn't protect against a compromised guest or other
> >  process on the server, for one).
> 
> Our auth scheme is based on network isolation at the L2/L3 level. If there
> is a simplified authentication mechanism which we can implement without
> imposing significant penalties on IO performance, please let us know and we
> will implement that if feasible.
> 
> >(if I've missed anything, please add it here!)
> >
> >-Jeff
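
One more note on the reconnect item, since it sounds like the next fix in
line: the usual way to keep a reconnect from stalling the caller is a
non-blocking connect() that is completed from the event loop. Below is a
rough POSIX sketch of the shape of it -- generic code, not based on libqnio's
actual internals:

    /* Generic non-blocking connect sketch (illustration only).  The caller
     * registers fd with its event loop and calls reconnect_complete() once
     * the fd is reported writable. */
    #include <errno.h>
    #include <fcntl.h>
    #include <sys/socket.h>
    #include <sys/types.h>

    /* Start the connect without blocking the calling thread. */
    static int reconnect_start(int fd, const struct sockaddr *addr,
                               socklen_t len)
    {
        int flags = fcntl(fd, F_GETFL, 0);
        if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
            return -1;
        }
        if (connect(fd, addr, len) == 0) {
            return 0;                   /* connected immediately */
        }
        if (errno == EINPROGRESS) {
            return 1;                   /* in progress: wait for writability */
        }
        return -1;                      /* hard failure */
    }

    /* Called from the event loop once fd becomes writable. */
    static int reconnect_complete(int fd)
    {
        int err = 0;
        socklen_t errlen = sizeof(err);

        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &errlen) < 0 || err) {
            return -1;                  /* connect failed; schedule a retry */
        }
        return 0;                       /* connected */
    }

Splitting the operation into "start it" and "finish it from a writable
callback" is the same pattern the rest of the I/O path needs in order to
stay fully asynchronous.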