Re: [Qemu-devel] Question on aio_poll
Stefan,

--On 24 July 2013 09:54:39 +0200 Stefan Hajnoczi wrote:

>> I left this how it was in the end (I think), and got round it by
>> creating a bogus pipe for the test to listen to.
>
> Doing that requires the changes in my patch series, otherwise you break
> aio_poll() loops that are waiting for pending I/O requests.  They don't
> want to wait for timers.

Sorry, I meant leaving the main code as is, and creating the bogus pipe
solely in tests/test-aio.c in my new test that tests a timer attached to
an AioContext - no changes elsewhere.

> I hope we can eventually unify event loops and then the select function
> should behave as you described.  For now though, we need to keep the
> current behavior until my .io_flush() removal series or something
> equivalent is merged, at least.

OK.  That's pretty much the way I went with the PATCHv2 series.

>> I note you now have:
>>     if (ctx->pollfds->len == 1) {
>>         return progress;
>>     }
>>
>> Is the '1' there the event notifier?  How do we know there is only one
>> of them?  There may be many EventNotifier instances.
>
> That's not what matters.  Rather, it's about the aio_notify()
> EventNotifier.  Each AioContext has its own EventNotifier which can be
> signalled with aio_notify().  The purpose of this function is to kick an
> event loop that is blocking in select()/poll().  This is necessary when
> another thread modifies something that the AioContext needs to act upon,
> such as adding/removing an fd.

Thanks

-- 
Alex Bligh
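For illustration, a minimal sketch of the kind of "bogus pipe" test setup
described above; all names here are hypothetical and this is not the
actual tests/test-aio.c code:

    /* Sketch only: register one dummy fd so that a blocking aio_poll()
     * has something to wait on while the timer test runs. */
    static int bogus_io_flush(void *opaque)
    {
        return 1;   /* claim I/O is pending so aio_poll() will block */
    }

    static void bogus_read(void *opaque)
    {
        /* never called: nothing ever writes to the pipe */
    }

    static void test_timer_on_aio_context(void)
    {
        AioContext *ctx = aio_context_new();
        int fds[2];

        g_assert(pipe(fds) == 0);
        aio_set_fd_handler(ctx, fds[0], bogus_read, NULL, bogus_io_flush,
                           NULL);

        /* ... arm a timer on the AioContext's clock here ... */

        aio_poll(ctx, true);   /* blocks until the pipe or the timer fires */
    }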
Re: [Qemu-devel] Question on aio_poll
On Tue, Jul 23, 2013 at 03:46:23PM +0100, Alex Bligh wrote:
> --On 23 July 2013 14:18:25 +0200 Stefan Hajnoczi wrote:
> > Unfortunately there is an issue with the series which I haven't had
> > time to look into yet.  I don't remember the details but I think make
> > check is failing.
> >
> > The current qemu.git/master code is doing the "correct" thing though.
> > Callers of aio_poll() are using it to complete any pending I/O
> > requests and process BHs.  If there is no work left, we do not want
> > to block indefinitely.  Instead we want to return.
>
> If we have no work to do (no FDs) and have a timer, then this should
> wait for the timer to expire (i.e. wait until progress has been made).
> Hence without a timer, it would be peculiar if it returned earlier.
>
> I think it should behave like select really, i.e. if you give it an
> infinite timeout (blocking) and no descriptors to work on, it hangs for
> ever.  At the very least it should warn, as this is in my opinion an
> error by the caller.
>
> I left this how it was in the end (I think), and got round it by
> creating a bogus pipe for the test to listen to.

Doing that requires the changes in my patch series, otherwise you break
aio_poll() loops that are waiting for pending I/O requests.  They don't
want to wait for timers.

> > > Thirdly, I don't quite understand how/why busy is being set.  It
> > > seems to be set if the flush callback returns non-zero.  That would
> > > imply (I think) the fd handler has something to write.  But what if
> > > it is just interested in any data to read that is available (and
> > > never writes)?  If this is the only fd aio_poll has, it would
> > > appear it never polls.
> >
> > The point of .io_flush() is to select file descriptors that are
> > awaiting I/O (either direction).  For example, consider an iSCSI TCP
> > socket with no I/O requests pending.  In that case .io_flush()
> > returns 0 and we will not block in aio_poll().  But if there is an
> > iSCSI request pending, then .io_flush() will return 1 and we'll wait
> > for the iSCSI response to be received.
> >
> > The effect of .io_flush() is that aio_poll() will return false if
> > there is no I/O pending.
>
> Right, but take that example.  If the TCP socket is idle because it's
> an iSCSI server and it is waiting for an iSCSI request, then io_flush
> returns 0.  That will mean busy will not be set, and if it's the only
> FD, g_poll won't be called AT ALL - forget the fact it won't block -
> because it will exit aio_poll a couple of lines before the g_poll.
> That means you'll never actually poll for the incoming iSCSI command.
> Surely that can't be right!
>
> Or are you saying that this type of FD never appears in the aio poll
> set, so it is just returning for the main loop to handle them?

That happens because QEMU has two types of fd monitoring.  It has
AioContext's aio_poll(), which is designed for asynchronous I/O requests
initiated by QEMU.  It can wait for them to complete.

QEMU also has main-loop's qemu_set_fd_handler() (iohandler), which is
used for server connections like the one you described.  The NBD server
uses it, for example.

I hope we can eventually unify event loops and then the select function
should behave as you described.  For now though, we need to keep the
current behavior until my .io_flush() removal series or something
equivalent is merged, at least.

> > It turned out that this behavior could be implemented at the block
> > layer instead of using the .io_flush() interface at the AioContext
> > layer.  The patch series I linked to above modifies the code so
> > AioContext can eliminate the .io_flush() concept.
>
> I've just had a quick read of that.
>
> I think the key one is:
> http://lists.nongnu.org/archive/html/qemu-devel/2013-07/msg00099.html
>
> I note you've eliminated 'busy' - hurrah.
>
> I note you now have:
>     if (ctx->pollfds->len == 1) {
>         return progress;
>     }
>
> Is the '1' there the event notifier?  How do we know there is only one
> of them?  There may be many EventNotifier instances.

That's not what matters.  Rather, it's about the aio_notify()
EventNotifier.  Each AioContext has its own EventNotifier which can be
signalled with aio_notify().  The purpose of this function is to kick an
event loop that is blocking in select()/poll().  This is necessary when
another thread modifies something that the AioContext needs to act upon,
such as adding/removing an fd.
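The reason a length of 1 can stand in for "no user fds" is that every
AioContext keeps its own EventNotifier registered with itself, and
aio_notify() is, roughly, just a write to that notifier.  A simplified
sketch of the mechanism (not the verbatim qemu.git code):

    /* Simplified sketch: the AioContext's built-in notifier is always
     * present in ctx->pollfds, so a length of exactly 1 means no user
     * fds are registered. */
    void aio_notify(AioContext *ctx)
    {
        /* Signalling the notifier's fd wakes a thread blocked in
         * g_poll()/select() inside aio_poll(). */
        event_notifier_set(&ctx->notifier);
    }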
Re: [Qemu-devel] Question on aio_poll
Stefan,

--On 23 July 2013 14:18:25 +0200 Stefan Hajnoczi wrote:

>> Secondly, the tests I am writing for the timer fail because g_poll is
>> not called, because busy is false.  This is because I haven't got an
>> fd set up.  But in the instance where aio_poll is called with
>> blocking=true and there are no fds to wait on, surely aio_poll should
>> either hang indefinitely or (perhaps better) assert, rather than
>> return immediately, which is a recipe for an unexpected busywait?  If
>> I have timers, it should be running those timers rather than
>> returning.
>
> FWIW this goes away in my series that gets rid of .io_flush():
> http://lists.nongnu.org/archive/html/qemu-devel/2013-07/msg00092.html

Oooh, another overlapping patch series into the mix :-)

> Unfortunately there is an issue with the series which I haven't had
> time to look into yet.  I don't remember the details but I think make
> check is failing.
>
> The current qemu.git/master code is doing the "correct" thing though.
> Callers of aio_poll() are using it to complete any pending I/O requests
> and process BHs.  If there is no work left, we do not want to block
> indefinitely.  Instead we want to return.

If we have no work to do (no FDs) and have a timer, then this should
wait for the timer to expire (i.e. wait until progress has been made).
Hence without a timer, it would be peculiar if it returned earlier.

I think it should behave like select really, i.e. if you give it an
infinite timeout (blocking) and no descriptors to work on, it hangs for
ever.  At the very least it should warn, as this is in my opinion an
error by the caller.

I left this how it was in the end (I think), and got round it by
creating a bogus pipe for the test to listen to.

>> Thirdly, I don't quite understand how/why busy is being set.  It seems
>> to be set if the flush callback returns non-zero.  That would imply (I
>> think) the fd handler has something to write.  But what if it is just
>> interested in any data to read that is available (and never writes)?
>> If this is the only fd aio_poll has, it would appear it never polls.
>
> The point of .io_flush() is to select file descriptors that are
> awaiting I/O (either direction).  For example, consider an iSCSI TCP
> socket with no I/O requests pending.  In that case .io_flush() returns
> 0 and we will not block in aio_poll().  But if there is an iSCSI
> request pending, then .io_flush() will return 1 and we'll wait for the
> iSCSI response to be received.
>
> The effect of .io_flush() is that aio_poll() will return false if there
> is no I/O pending.

Right, but take that example.  If the TCP socket is idle because it's an
iSCSI server and it is waiting for an iSCSI request, then io_flush
returns 0.  That will mean busy will not be set, and if it's the only
FD, g_poll won't be called AT ALL - forget the fact it won't block -
because it will exit aio_poll a couple of lines before the g_poll.  That
means you'll never actually poll for the incoming iSCSI command.  Surely
that can't be right!

Or are you saying that this type of FD never appears in the aio poll
set, so it is just returning for the main loop to handle them?

> It turned out that this behavior could be implemented at the block
> layer instead of using the .io_flush() interface at the AioContext
> layer.  The patch series I linked to above modifies the code so
> AioContext can eliminate the .io_flush() concept.

I've just had a quick read of that.

I think the key one is:
http://lists.nongnu.org/archive/html/qemu-devel/2013-07/msg00099.html

I note you've eliminated 'busy' - hurrah.

I note you now have:
    if (ctx->pollfds->len == 1) {
        return progress;
    }

Is the '1' there the event notifier?  How do we know there is only one
of them?

>> When adapting this for timers, I think it should be returning true
>> only if an AIO dispatch did something, or a BH was executed, or a
>> timer ran.  Specifically, if the poll simply times out, it should not
>> be returning true unless a timer ran.  Correct?
>
> Yes, the return value is about making progress.  If there is still work
> pending then it must return true.  If there is no work pending it must
> return false.

Yup, you made the same fix as me at the end of aio_poll in my PATCHv2
RFC series.

On a related point, the g_source appears very fragile in respect of
false wakeups.  I would not be confident that it would not busy-loop.
See the comments in the last of the patches in my series.

-- 
Alex Bligh
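A rough sketch of the adaptation being discussed, i.e. clamping the poll
timeout to the nearest timer deadline and counting expired timers as
progress.  The helper names (timerlist_deadline_ms, timerlist_run_timers)
and the ctx->timers field are assumptions for illustration, not the exact
API from the series:

    /* Sketch only: compute the g_poll() timeout from the nearest timer
     * deadline instead of blocking forever, then run expired timers. */
    int timeout;

    if (!blocking) {
        timeout = 0;                              /* poll once, don't wait */
    } else {
        int64_t deadline_ms = timerlist_deadline_ms(ctx->timers);
        timeout = deadline_ms < 0 ? -1            /* no timer: block */
                                  : (int)deadline_ms;
    }

    ret = g_poll((GPollFD *)ctx->pollfds->data, ctx->pollfds->len, timeout);

    /* Only count the iteration as progress if a handler, BH or timer
     * actually ran - a plain timeout is not progress. */
    progress |= timerlist_run_timers(ctx->timers);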
Re: [Qemu-devel] Question on aio_poll
On Sat, Jul 20, 2013 at 02:14:43PM +0100, Alex Bligh wrote:
> As part of adding timer operations to aio_poll and a clock to
> AioContext, I am trying to figure out a couple of points on how
> aio_poll works.  This primarily revolves around the flag busy.
>
> Firstly, at the end of aio_poll, it has
>     assert(progress || busy);
>
> This assert in fact checks nothing, because before the poll (20 lines
> or so higher), it has
>     if (!busy) {
>         return progress;
>     }
>
> and as 'busy' is not altered, busy must be true at the assert.
>
> Is this in fact meant to be checking that we have made progress if
> aio_poll is called blocking?  I.e. assert(progress || !blocking)?
>
> Secondly, the tests I am writing for the timer fail because g_poll is
> not called, because busy is false.  This is because I haven't got an fd
> set up.  But in the instance where aio_poll is called with
> blocking=true and there are no fds to wait on, surely aio_poll should
> either hang indefinitely or (perhaps better) assert, rather than return
> immediately, which is a recipe for an unexpected busywait?  If I have
> timers, it should be running those timers rather than returning.

FWIW this goes away in my series that gets rid of .io_flush():
http://lists.nongnu.org/archive/html/qemu-devel/2013-07/msg00092.html

Unfortunately there is an issue with the series which I haven't had time
to look into yet.  I don't remember the details but I think make check
is failing.

The current qemu.git/master code is doing the "correct" thing though.
Callers of aio_poll() are using it to complete any pending I/O requests
and process BHs.  If there is no work left, we do not want to block
indefinitely.  Instead we want to return.

> Thirdly, I don't quite understand how/why busy is being set.  It seems
> to be set if the flush callback returns non-zero.  That would imply (I
> think) the fd handler has something to write.  But what if it is just
> interested in any data to read that is available (and never writes)?
> If this is the only fd aio_poll has, it would appear it never polls.

The point of .io_flush() is to select file descriptors that are awaiting
I/O (either direction).  For example, consider an iSCSI TCP socket with
no I/O requests pending.  In that case .io_flush() returns 0 and we will
not block in aio_poll().  But if there is an iSCSI request pending, then
.io_flush() will return 1 and we'll wait for the iSCSI response to be
received.

The effect of .io_flush() is that aio_poll() will return false if there
is no I/O pending.

It turned out that this behavior could be implemented at the block layer
instead of using the .io_flush() interface at the AioContext layer.  The
patch series I linked to above modifies the code so AioContext can
eliminate the .io_flush() concept.

> When adapting this for timers, I think it should be returning true only
> if an AIO dispatch did something, or a BH was executed, or a timer ran.
> Specifically, if the poll simply times out, it should not be returning
> true unless a timer ran.  Correct?

Yes, the return value is about making progress.  If there is still work
pending then it must return true.  If there is no work pending it must
return false.

Stefan
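To make the .io_flush() contract concrete, a schematic example along the
lines of Stefan's iSCSI case.  The names here are hypothetical, and the
aio_set_fd_handler() signature shown is the pre-removal-series one, from
memory:

    /* Hypothetical .io_flush() callback: non-zero only while requests
     * are in flight, so aio_poll() knows whether blocking on this fd can
     * make progress (this is what drives the 'busy' flag). */
    static int example_iscsi_io_flush(void *opaque)
    {
        ExampleIscsiState *s = opaque;

        return s->pending_requests > 0;
    }

    /* Registration: the io_flush argument is consulted on every
     * aio_poll() iteration. */
    aio_set_fd_handler(ctx, s->fd,
                       example_iscsi_read_cb,   /* io_read  */
                       example_iscsi_write_cb,  /* io_write */
                       example_iscsi_io_flush,  /* io_flush */
                       s);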
[Qemu-devel] Question on aio_poll
As part of adding timer operations to aio_poll and a clock to AioContext,
I am trying to figure out a couple of points on how aio_poll works.  This
primarily revolves around the flag busy.

Firstly, at the end of aio_poll, it has
    assert(progress || busy);

This assert in fact checks nothing, because before the poll (20 lines or
so higher), it has
    if (!busy) {
        return progress;
    }

and as 'busy' is not altered, busy must be true at the assert.

Is this in fact meant to be checking that we have made progress if
aio_poll is called blocking?  I.e. assert(progress || !blocking)?

Secondly, the tests I am writing for the timer fail because g_poll is not
called, because busy is false.  This is because I haven't got an fd set
up.  But in the instance where aio_poll is called with blocking=true and
there are no fds to wait on, surely aio_poll should either hang
indefinitely or (perhaps better) assert, rather than return immediately,
which is a recipe for an unexpected busywait?  If I have timers, it
should be running those timers rather than returning.

Thirdly, I don't quite understand how/why busy is being set.  It seems to
be set if the flush callback returns non-zero.  That would imply (I
think) the fd handler has something to write.  But what if it is just
interested in any data to read that is available (and never writes)?  If
this is the only fd aio_poll has, it would appear it never polls.

When adapting this for timers, I think it should be returning true only
if an AIO dispatch did something, or a BH was executed, or a timer ran.
Specifically, if the poll simply times out, it should not be returning
true unless a timer ran.  Correct?

-- 
Alex Bligh
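For readers without the source to hand, this is roughly the control flow
being queried; a simplified paraphrase of aio_poll() as it stood in
qemu.git at the time, with details elided, not the verbatim code:

    /* Simplified paraphrase of the aio_poll() flow under discussion. */
    bool aio_poll(AioContext *ctx, bool blocking)
    {
        bool progress = false;
        bool busy = false;
        AioHandler *node;

        /* Run pending bottom halves; these count as progress. */
        if (aio_bh_poll(ctx)) {
            progress = true;
        }

        /* Build the pollfd array; 'busy' is set only if some handler's
         * .io_flush() callback reports pending I/O. */
        QLIST_FOREACH(node, &ctx->aio_handlers, node) {
            if (node->io_flush && node->io_flush(node->opaque)) {
                busy = true;
            }
            /* ... append node's GPollFD to ctx->pollfds ... */
        }

        /* Early return when no handler reports pending I/O - the return
         * roughly 20 lines above the assert. */
        if (!busy) {
            return progress;
        }

        /* Wait (indefinitely if blocking) and dispatch ready handlers,
         * setting 'progress' when a callback runs. */
        if (g_poll((GPollFD *)ctx->pollfds->data, ctx->pollfds->len,
                   blocking ? -1 : 0) > 0) {
            /* ... dispatch read/write handlers ... */
        }

        /* 'busy' is necessarily true here because of the early return,
         * which is why the assert cannot fire. */
        assert(progress || busy);

        return true;
    }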