Re: [Qemu-block] [PATCH 6/7] qemu-iotests: 141: reduce likelihood of race condition on systems with fast IO
Dear Kevin,

Kevin Wolf writes:
> Am 08.04.2016 um 14:01 hat Sascha Silbe geschrieben:
[...]
>> The best approach probably would be to fix up the rate limit code to
>> delay for multiple time slices if necessary. We should get rid of the
>> artificial BDRV_SECTOR_SIZE granularity at the same time, always using
>> bytes as the quantum unit.
>
> In the 2.7 time frame we might actually be able to reuse the normal I/O
> throttling code for block jobs as the jobs will be using their own
> BlockBackend and can therefore set their own throttling limits.

Sounds good. Then I suggest we just drop this patch for now. I've added
a reminder to check the situation again at the end of June (i.e. before
soft feature freeze of 2.7).

Will give documenting the minimum rate a try, but that shouldn't hold up
the fixes.

Sascha
--
Softwareentwicklung Sascha Silbe, Niederhofenstraße 5/1, 71229 Leonberg
https://se-silbe.de/
USt-IdNr. DE281696641
Re: [Qemu-block] [PATCH 6/7] qemu-iotests: 141: reduce likelihood of race condition on systems with fast IO
Am 08.04.2016 um 14:01 hat Sascha Silbe geschrieben:
> Dear Max,
>
> Sascha Silbe writes:
>
> > @Max: From a cursory glance at the code, maybe your 1 *byte* per second
> > rate limit is being rounded down to 0 *blocks* per second, with 0
> > meaning no limit? See e.g. mirror_set_speed(). Though I must admit I
> > don't understand how speed=0 translates to unlimited (like
> > qapi/block-core.json:block-job-set-speed says). My understanding of
> > ratelimit_calculate_delay() is that speed=0 means "1 quantum per time
> > slice", with the time slice usually being 100ms; not sure about the
> > quantum.
>
> I think I've understood the issue now.
>
> The backup, commit, mirror and stream actions operate on full chunks,
> with the chunk size depending on the action and backing device. For
> e.g. commit that means it always bursts at least 0.5 MiB; that's where
> the value in the reference output comes from.
>
> ratelimit_calculate_delay() lets through at least one burst per time
> slice. This means the minimum rate is one chunk per time slice (always
> 100ms). So for commit and stream one will always get at least 5 MiB/s.
> A surprisingly large value for something specified in bytes per second,
> BTW. (I.e. it should probably be documented in qmp-commands.hx if it
> stays this way.)
>
> On a busy or slow host, it may take the shell longer than the time slice
> of 100ms to send the cancel command to qemu. When that happens,
> additional chunks will get written before the job gets cancelled. That's
> why I sometimes see 1 or even 1.5 MiB as offset, especially when running
> CPU-intensive workloads in parallel.
>
> The best approach probably would be to fix up the rate limit code to
> delay for multiple time slices if necessary. We should get rid of the
> artificial BDRV_SECTOR_SIZE granularity at the same time, always using
> bytes as the quantum unit.

In the 2.7 time frame we might actually be able to reuse the normal I/O
throttling code for block jobs as the jobs will be using their own
BlockBackend and can therefore set their own throttling limits.

Kevin
Re: [Qemu-block] [PATCH 6/7] qemu-iotests: 141: reduce likelihood of race condition on systems with fast IO
Dear Max,

Sascha Silbe writes:
> @Max: From a cursory glance at the code, maybe your 1 *byte* per second
> rate limit is being rounded down to 0 *blocks* per second, with 0
> meaning no limit? See e.g. mirror_set_speed(). Though I must admit I
> don't understand how speed=0 translates to unlimited (like
> qapi/block-core.json:block-job-set-speed says). My understanding of
> ratelimit_calculate_delay() is that speed=0 means "1 quantum per time
> slice", with the time slice usually being 100ms; not sure about the
> quantum.

I think I've understood the issue now.

The backup, commit, mirror and stream actions operate on full chunks,
with the chunk size depending on the action and backing device. For
e.g. commit that means it always bursts at least 0.5 MiB; that's where
the value in the reference output comes from.

ratelimit_calculate_delay() lets through at least one burst per time
slice. This means the minimum rate is one chunk per time slice (always
100ms). So for commit and stream one will always get at least 5 MiB/s.
A surprisingly large value for something specified in bytes per second,
BTW. (I.e. it should probably be documented in qmp-commands.hx if it
stays this way.)

On a busy or slow host, it may take the shell longer than the time slice
of 100ms to send the cancel command to qemu. When that happens,
additional chunks will get written before the job gets cancelled. That's
why I sometimes see 1 or even 1.5 MiB as offset, especially when running
CPU-intensive workloads in parallel.

The best approach probably would be to fix up the rate limit code to
delay for multiple time slices if necessary. We should get rid of the
artificial BDRV_SECTOR_SIZE granularity at the same time, always using
bytes as the quantum unit.

Sascha
--
Softwareentwicklung Sascha Silbe, Niederhofenstraße 5/1, 71229 Leonberg
https://se-silbe.de/
USt-IdNr. DE281696641
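[Editor's note: the minimum-rate arithmetic described in the message above can be sketched as follows. This is an illustrative Python model, not QEMU code; the function name and constants are chosen for the example, with the 100ms slice and 0.5 MiB commit chunk taken from the discussion.]

```python
# Model of the rate-limit floor: a limiter that admits at least one
# full chunk per time slice can never throttle below
# chunk_size / slice_time, whatever speed was requested.

SLICE_NS = 100_000_000  # 100 ms time slice, per the discussion above

def min_effective_rate(chunk_size_bytes: int, slice_ns: int = SLICE_NS) -> float:
    """Lowest achievable rate in bytes/s when every slice bursts one chunk."""
    return chunk_size_bytes / (slice_ns / 1_000_000_000)

# commit/stream burst at least 0.5 MiB per slice, giving a 5 MiB/s floor:
rate = min_effective_rate(512 * 1024)
print(rate / (1024 * 1024))  # prints 5.0 (MiB/s)
```

This makes the surprise concrete: no matter how small a byte rate the test requests via block-job-set-speed, the job still moves at least 5 MiB/s, so a slow cancel can let several chunks through.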
Re: [Qemu-block] [PATCH 6/7] qemu-iotests: 141: reduce likelihood of race condition on systems with fast IO
Am 07.04.2016 um 22:27 hat Sascha Silbe geschrieben:
> Dear Max,
>
> Max Reitz writes:
>
> > On 05.04.2016 11:21, Sascha Silbe wrote:
> >> On systems with fast IO, qemu-io may write more than 1 MiB before
> >> receiving the block-job-cancel command, causing the test case to fail.
> >>
> >> 141 is inherently racy, but we can at least reduce the likelihood of the
> >> job completing before the cancel command arrives by bumping the size of
> >> the data getting written; we'll try 32 MiB for a start.
> >
> > Hm, interesting. I tried to prevent this by setting the block jobs'
> > speed to 1, which should make it stop after the block job has processed
> > the first block of data.
>
> Hmm, seems like I didn't quite understand the way the test works
> then.
>
> @Kevin: Please do NOT queue this patch.

No problem. I expect this series to go through Max's tree anyway.

Kevin
Re: [Qemu-block] [PATCH 6/7] qemu-iotests: 141: reduce likelihood of race condition on systems with fast IO
Dear Max,

Max Reitz writes:
> On 05.04.2016 11:21, Sascha Silbe wrote:
>> On systems with fast IO, qemu-io may write more than 1 MiB before
>> receiving the block-job-cancel command, causing the test case to fail.
>>
>> 141 is inherently racy, but we can at least reduce the likelihood of the
>> job completing before the cancel command arrives by bumping the size of
>> the data getting written; we'll try 32 MiB for a start.
>
> Hm, interesting. I tried to prevent this by setting the block jobs'
> speed to 1, which should make it stop after the block job has processed
> the first block of data.

Hmm, seems like I didn't quite understand the way the test works
then.

@Kevin: Please do NOT queue this patch.

@Max: From a cursory glance at the code, maybe your 1 *byte* per second
rate limit is being rounded down to 0 *blocks* per second, with 0
meaning no limit? See e.g. mirror_set_speed(). Though I must admit I
don't understand how speed=0 translates to unlimited (like
qapi/block-core.json:block-job-set-speed says). My understanding of
ratelimit_calculate_delay() is that speed=0 means "1 quantum per time
slice", with the time slice usually being 100ms; not sure about the
quantum.

Sascha
--
Softwareentwicklung Sascha Silbe, Niederhofenstraße 5/1, 71229 Leonberg
https://se-silbe.de/
USt-IdNr. DE281696641
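[Editor's note: the suspected rounding in the message above can be sketched as follows. This is a hedged Python model of the hypothesis, not the actual mirror_set_speed() code; whether QEMU really performs this conversion is exactly the open question in the thread.]

```python
# Hypothesis: if a byte rate is converted to 512-byte sectors with
# integer division, a requested speed of 1 byte/s becomes 0 sectors/s,
# and 0 is interpreted as "no limit" per block-job-set-speed.

BDRV_SECTOR_SIZE = 512

def effective_sector_rate(speed_bytes_per_sec: int) -> int:
    """Requested byte rate converted to whole sectors per second."""
    return speed_bytes_per_sec // BDRV_SECTOR_SIZE

for speed in (1, 511, 512):
    sectors = effective_sector_rate(speed)
    throttled = sectors != 0  # 0 sectors/s would mean "unlimited"
    print(speed, sectors, throttled)
```

Under this model, any requested rate below 512 bytes/s silently degrades to "unlimited", which would explain why speed=1 failed to slow the job down.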
Re: [Qemu-block] [PATCH 6/7] qemu-iotests: 141: reduce likelihood of race condition on systems with fast IO
Am 06.04.2016 um 18:15 hat Max Reitz geschrieben:
> On 05.04.2016 11:21, Sascha Silbe wrote:
> > On systems with fast IO, qemu-io may write more than 1 MiB before
> > receiving the block-job-cancel command, causing the test case to fail.
> >
> > 141 is inherently racy, but we can at least reduce the likelihood of the
> > job completing before the cancel command arrives by bumping the size of
> > the data getting written; we'll try 32 MiB for a start.
>
> Hm, interesting. I tried to prevent this by setting the block jobs'
> speed to 1, which should make it stop after the block job has processed
> the first block of data.
>
> I won't oppose this patch, because if it fixes things for you, that's
> good. But I don't think it should be necessary.

We don't generally change test cases when they fail. Making a test case
pass doesn't "fix things" per se. It only helps when the failure is a
false positive.

In this case, it looks like there might be a problem with block job
throttling, so maybe we should look into that before changing the test?

Kevin

> > Once we actually move enough data around for the block job not to
> > complete prematurely, the test will still fail because the offset value
> > in the BLOCK_JOB_CANCELLED event will vary. Since this is more or less
> > inherent to the nature of this event, we just replace it with a fixed
> > value globally (in _filter_qmp), the same way we handle timestamps.
> >
> > Signed-off-by: Sascha Silbe>
> > Reviewed-by: Bo Tu
> > ---
> >  tests/qemu-iotests/141           | 11 ++-
> >  tests/qemu-iotests/141.out       | 24
> >  tests/qemu-iotests/common.filter |  1 +
> >  3 files changed, 19 insertions(+), 17 deletions(-)
Re: [Qemu-block] [PATCH 6/7] qemu-iotests: 141: reduce likelihood of race condition on systems with fast IO
On 05.04.2016 11:21, Sascha Silbe wrote:
> On systems with fast IO, qemu-io may write more than 1 MiB before
> receiving the block-job-cancel command, causing the test case to fail.
>
> 141 is inherently racy, but we can at least reduce the likelihood of the
> job completing before the cancel command arrives by bumping the size of
> the data getting written; we'll try 32 MiB for a start.

Hm, interesting. I tried to prevent this by setting the block jobs'
speed to 1, which should make it stop after the block job has processed
the first block of data.

I won't oppose this patch, because if it fixes things for you, that's
good. But I don't think it should be necessary.

Max

> Once we actually move enough data around for the block job not to
> complete prematurely, the test will still fail because the offset value
> in the BLOCK_JOB_CANCELLED event will vary. Since this is more or less
> inherent to the nature of this event, we just replace it with a fixed
> value globally (in _filter_qmp), the same way we handle timestamps.
>
> Signed-off-by: Sascha Silbe>
> Reviewed-by: Bo Tu
> ---
>  tests/qemu-iotests/141           | 11 ++-
>  tests/qemu-iotests/141.out       | 24
>  tests/qemu-iotests/common.filter |  1 +
>  3 files changed, 19 insertions(+), 17 deletions(-)
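[Editor's note: the filtering idea in the patch description above, replacing the run-dependent offset in QMP events with a fixed placeholder, can be sketched as follows. The real _filter_qmp is a shell filter in tests/qemu-iotests/common.filter; this is an illustrative Python model of the same idea, with the placeholder string chosen for the example.]

```python
# Model of the offset filtering: normalise the varying "offset" field
# in QMP event output to a stable placeholder, the same way the
# qemu-iotests filters already normalise timestamps.
import re

def filter_offset(qmp_line: str) -> str:
    """Replace a run-dependent "offset": <n> field with a placeholder."""
    return re.sub(r'"offset":\s*\d+', '"offset": OFFSET', qmp_line)

event = ('{"event": "BLOCK_JOB_CANCELLED", '
         '"data": {"offset": 1572864, "len": 33554432}}')
print(filter_offset(event))
```

With the offset normalised, the reference output no longer depends on how many chunks happened to complete before the cancel arrived, which is the racy part of the test.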