Re: [PATCH 2/3] block: add logging facility for long standing IO requests

2020-09-17 Thread Denis V. Lunev
On 9/17/20 4:56 PM, Max Reitz wrote:
> On 10.08.20 12:14, Denis V. Lunev wrote:
>> There are severe delays with IO requests processing if QEMU is running in
>> virtual machine or over software defined storage. Such delays potentially
>> results in unpredictable guest behavior. For example, guests over IDE or
>> SATA drive could remount filesystem read-only if write is performed
>> longer than 10 seconds.
>>
>> Such reports are very complex to process. Some good starting point for this
>> seems quite reasonable. This patch provides one. It adds logging of such
>> potentially dangerous long IO operations.
>>
>> Signed-off-by: Denis V. Lunev 
>> Reviewed-by: Vladimir Sementsov-Ogievskiy 
>> CC: Stefan Hajnoczi 
>> CC: Kevin Wolf 
>> CC: Max Reitz 
>> ---
>>  block/accounting.c | 59 +-
>>  blockdev.c |  7 -
>>  include/block/accounting.h |  5 +++-
>>  3 files changed, 68 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/accounting.c b/block/accounting.c
>> index 8d41c8a83a..24f48c84b8 100644
>> --- a/block/accounting.c
>> +++ b/block/accounting.c
> [...]
>
>> @@ -182,6 +185,58 @@ void block_latency_histograms_clear(BlockAcctStats 
>> *stats)
>>  }
>>  }
>>  
>> +static const char *block_account_type(enum BlockAcctType type)
>> +{
>> +switch (type) {
>> +case BLOCK_ACCT_READ:
>> +return "READ";
>> +case BLOCK_ACCT_WRITE:
>> +return "WRITE";
>> +case BLOCK_ACCT_FLUSH:
>> +return "DISCARD";
>> +case BLOCK_ACCT_UNMAP:
>> +return "TRUNCATE";
> I don’t understand how FLUSH translates to DISCARD, and UNMAP to TRUNCATE.
>

Ha, seems I have made copy/pasted and missed this.
Nice catch.

>> +case BLOCK_ACCT_NONE:
>> +return "NONE";
>> +case BLOCK_MAX_IOTYPE:
>> +break;
>> +}
>> +return "UNKNOWN";
>> +}
>> +
>> +static void block_acct_report_long(BlockAcctStats *stats,
>> +   BlockAcctCookie *cookie, int64_t 
>> latency_ns)
>> +{
>> +unsigned latency_ms = latency_ns / SCALE_MS;
>> +int64_t start_time_host_s;
>> +char buf[64];
>> +struct tm t;
>> +BlockDriverState *bs;
>> +
>> +if (cookie->type == BLOCK_ACCT_NONE) {
>> +return;
>> +}
>> +if (stats->latency_log_threshold_ms == 0) {
>> +return;
>> +}
>> +if (latency_ms < stats->latency_log_threshold_ms) {
>> +return;
>> +}
>> +
>> +start_time_host_s = cookie->start_time_ns / NANOSECONDS_PER_SECOND;
>> +strftime(buf, sizeof(buf), "%m-%d %H:%M:%S",
>> + localtime_r(_time_host_s, ));
>> +
>> +bs = blk_bs(blk_stats2blk(stats));
>> +qemu_log("long %s[%ld] IO request: %d.03%d since %s.%03d bs: %s(%s, 
>> %s)\n",
>> + block_account_type(cookie->type), cookie->bytes,
>> + (int)(latency_ms / 1000), (int)(latency_ms % 1000), buf,
> Why not just latency_ms * .001f and %.3f?
I personally dislike floating point, this leads to
a bit strange output.


>
>> + (int)((cookie->start_time_ns / 100) % 1000),
> s/100/SCALE_MS/? (But I’m not sure whether that’s what the SCALE_?S
> are for.)
yes, you right.

>> + bs == NULL ? "unknown" : bdrv_get_node_name(bs),
>> + bs == NULL ? "unknown" : bdrv_get_format_name(bs),
>> + bs == NULL ? "unknown" : bs->filename);
> Now that I’m writing this response already, I wonder whether a QMP event
> wouldn’t be nice.  (But considering that accounting apparently just
> doesn’t with -blockdev, I suppose that’s not that big of a worry.)
may be. Why not?

Though there is some note. Usually after the stall there
is a bunch of requests completed at the same time
once the storage is unfrozen, thus some aggregation
would be required.

I think that this should be made separately.
>> +}
>> +
>>  static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie 
>> *cookie,
>>   bool failed)
>>  {
> [...]
>
>> diff --git a/blockdev.c b/blockdev.c
>> index 3848a9c8ab..66158d1292 100644
>> --- a/blockdev.c
>> +++ b/blockdev.c
>> @@ -622,7 +622,8 @@ static BlockBackend *blockdev_init(const char *file, 
>> QDict *bs_opts,
>>  
>>  bs->detect_zeroes = detect_zeroes;
>>  
>> -block_acct_setup(blk_get_stats(blk), account_invalid, 
>> account_failed);
>> +block_acct_setup(blk_get_stats(blk), account_invalid, 
>> account_failed,
>> +qemu_opt_get_number(opts, "latency-log-threshold", 0));
> latency_log_threshold_ms is an unsigned int and so this will silently
> overflow for values >= 2^32.
>
> (Or for user-specified values < 0, which are wrapped around.)
>
>>  
>>  if (!parse_stats_intervals(blk_get_stats(blk), interval_list, 
>> errp)) {
>>  blk_unref(blk);
>> @@ -3727,6 +3728,10 @@ QemuOptsList qemu_common_drive_opts = {
>>  .type = QEMU_OPT_BOOL,
>>  .help = "whether to account for failed I/O 

Re: [PATCH 2/3] block: add logging facility for long standing IO requests

2020-09-17 Thread Max Reitz
On 10.08.20 12:14, Denis V. Lunev wrote:
> There are severe delays with IO requests processing if QEMU is running in
> virtual machine or over software defined storage. Such delays potentially
> results in unpredictable guest behavior. For example, guests over IDE or
> SATA drive could remount filesystem read-only if write is performed
> longer than 10 seconds.
> 
> Such reports are very complex to process. Some good starting point for this
> seems quite reasonable. This patch provides one. It adds logging of such
> potentially dangerous long IO operations.
> 
> Signed-off-by: Denis V. Lunev 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> CC: Max Reitz 
> ---
>  block/accounting.c | 59 +-
>  blockdev.c |  7 -
>  include/block/accounting.h |  5 +++-
>  3 files changed, 68 insertions(+), 3 deletions(-)
> 
> diff --git a/block/accounting.c b/block/accounting.c
> index 8d41c8a83a..24f48c84b8 100644
> --- a/block/accounting.c
> +++ b/block/accounting.c

[...]

> @@ -182,6 +185,58 @@ void block_latency_histograms_clear(BlockAcctStats 
> *stats)
>  }
>  }
>  
> +static const char *block_account_type(enum BlockAcctType type)
> +{
> +switch (type) {
> +case BLOCK_ACCT_READ:
> +return "READ";
> +case BLOCK_ACCT_WRITE:
> +return "WRITE";
> +case BLOCK_ACCT_FLUSH:
> +return "DISCARD";
> +case BLOCK_ACCT_UNMAP:
> +return "TRUNCATE";

I don’t understand how FLUSH translates to DISCARD, and UNMAP to TRUNCATE.

> +case BLOCK_ACCT_NONE:
> +return "NONE";
> +case BLOCK_MAX_IOTYPE:
> +break;
> +}
> +return "UNKNOWN";
> +}
> +
> +static void block_acct_report_long(BlockAcctStats *stats,
> +   BlockAcctCookie *cookie, int64_t 
> latency_ns)
> +{
> +unsigned latency_ms = latency_ns / SCALE_MS;
> +int64_t start_time_host_s;
> +char buf[64];
> +struct tm t;
> +BlockDriverState *bs;
> +
> +if (cookie->type == BLOCK_ACCT_NONE) {
> +return;
> +}
> +if (stats->latency_log_threshold_ms == 0) {
> +return;
> +}
> +if (latency_ms < stats->latency_log_threshold_ms) {
> +return;
> +}
> +
> +start_time_host_s = cookie->start_time_ns / NANOSECONDS_PER_SECOND;
> +strftime(buf, sizeof(buf), "%m-%d %H:%M:%S",
> + localtime_r(_time_host_s, ));
> +
> +bs = blk_bs(blk_stats2blk(stats));
> +qemu_log("long %s[%ld] IO request: %d.03%d since %s.%03d bs: %s(%s, 
> %s)\n",
> + block_account_type(cookie->type), cookie->bytes,
> + (int)(latency_ms / 1000), (int)(latency_ms % 1000), buf,

Why not just latency_ms * .001f and %.3f?

> + (int)((cookie->start_time_ns / 100) % 1000),

s/100/SCALE_MS/? (But I’m not sure whether that’s what the SCALE_?S
are for.)

> + bs == NULL ? "unknown" : bdrv_get_node_name(bs),
> + bs == NULL ? "unknown" : bdrv_get_format_name(bs),
> + bs == NULL ? "unknown" : bs->filename);

Now that I’m writing this response already, I wonder whether a QMP event
wouldn’t be nice.  (But considering that accounting apparently just
doesn’t with -blockdev, I suppose that’s not that big of a worry.)

> +}
> +
>  static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie 
> *cookie,
>   bool failed)
>  {

[...]

> diff --git a/blockdev.c b/blockdev.c
> index 3848a9c8ab..66158d1292 100644
> --- a/blockdev.c
> +++ b/blockdev.c
> @@ -622,7 +622,8 @@ static BlockBackend *blockdev_init(const char *file, 
> QDict *bs_opts,
>  
>  bs->detect_zeroes = detect_zeroes;
>  
> -block_acct_setup(blk_get_stats(blk), account_invalid, 
> account_failed);
> +block_acct_setup(blk_get_stats(blk), account_invalid, account_failed,
> +qemu_opt_get_number(opts, "latency-log-threshold", 0));

latency_log_threshold_ms is an unsigned int and so this will silently
overflow for values >= 2^32.

(Or for user-specified values < 0, which are wrapped around.)

>  
>  if (!parse_stats_intervals(blk_get_stats(blk), interval_list, errp)) 
> {
>  blk_unref(blk);
> @@ -3727,6 +3728,10 @@ QemuOptsList qemu_common_drive_opts = {
>  .type = QEMU_OPT_BOOL,
>  .help = "whether to account for failed I/O operations "
>  "in the statistics",
> +},{
> +.name = "latency-log-threshold",
> +.type = QEMU_OPT_NUMBER,
> +.help = "threshold for long I/O report (disabled if <=0), in ms",

Because of that overflow, negative values will not necessarily disable
reporting, because truncating them to 32 bit may make them small
absolute values again.

In any case, I’d just say “disabled if 0”, and not mention anything
about <0.

Max



signature.asc
Description: OpenPGP digital signature


Re: [PATCH 2/3] block: add logging facility for long standing IO requests

2020-08-24 Thread Denis V. Lunev
On 8/20/20 11:03 AM, David Edmondson wrote:
> On Monday, 2020-08-10 at 13:14:46 +03, Denis V. Lunev wrote:
>
>> +strftime(buf, sizeof(buf), "%m-%d %H:%M:%S",
> "%F %T" would include the year, which can be useful.
ok

>
>> + localtime_r(_time_host_s, ));
>> +
>> +bs = blk_bs(blk_stats2blk(stats));
>> +qemu_log("long %s[%ld] IO request: %d.03%d since %s.%03d bs: %s(%s, 
>> %s)\n",
>> + block_account_type(cookie->type), cookie->bytes,
>> + (int)(latency_ms / 1000), (int)(latency_ms % 1000), buf,
>> + (int)((cookie->start_time_ns / 100) % 1000),
> Is there a reason not to use %u rather than casting?
no, we could use unsigned.

Will resend shortly.



Re: [PATCH 2/3] block: add logging facility for long standing IO requests

2020-08-20 Thread David Edmondson
On Monday, 2020-08-10 at 13:14:46 +03, Denis V. Lunev wrote:

> +strftime(buf, sizeof(buf), "%m-%d %H:%M:%S",

"%F %T" would include the year, which can be useful.

> + localtime_r(_time_host_s, ));
> +
> +bs = blk_bs(blk_stats2blk(stats));
> +qemu_log("long %s[%ld] IO request: %d.03%d since %s.%03d bs: %s(%s, 
> %s)\n",
> + block_account_type(cookie->type), cookie->bytes,
> + (int)(latency_ms / 1000), (int)(latency_ms % 1000), buf,
> + (int)((cookie->start_time_ns / 100) % 1000),

Is there a reason not to use %u rather than casting?

dme.
-- 
We wanna wait, but here we go again.



Re: [PATCH 2/3] block: add logging facility for long standing IO requests

2020-08-05 Thread Denis V. Lunev
On 8/5/20 4:51 PM, Philippe Mathieu-Daudé wrote:
> On 8/5/20 12:08 PM, Denis V. Lunev wrote:
>> There are severe delays with IO requests processing if QEMU is running in
>> virtual machine or over software defined storage. Such delays potentially
>> results in unpredictable guest behavior. For example, guests over IDE or
>> SATA drive could remount filesystem read-only if write is performed
>> longer than 10 seconds.
>>
>> Such reports are very complex to process. Some good starting point for this
>> seems quite reasonable. This patch provides one. It adds logging of such
>> potentially dangerous long IO operations.
>>
>> Signed-off-by: Denis V. Lunev 
>> Reviewed-by: Vladimir Sementsov-Ogievskiy 
>> CC: Stefan Hajnoczi 
>> CC: Kevin Wolf 
>> CC: Max Reitz 
>> ---
>>  block/accounting.c | 59 +-
>>  blockdev.c |  7 -
>>  include/block/accounting.h |  5 +++-
>>  slirp  |  2 +-
>>  4 files changed, 69 insertions(+), 4 deletions(-)
>>
> ...
>
>>  typedef struct BlockAcctCookie {
>> @@ -101,7 +104,7 @@ typedef struct BlockAcctCookie {
>>  
>>  void block_acct_init(BlockAcctStats *stats);
>>  void block_acct_setup(BlockAcctStats *stats, bool account_invalid,
>> - bool account_failed);
>> +  bool account_failed, unsigned 
>> latency_log_threshold_ms);
>>  void block_acct_cleanup(BlockAcctStats *stats);
>>  void block_acct_add_interval(BlockAcctStats *stats, unsigned 
>> interval_length);
>>  BlockAcctTimedStats *block_acct_interval_next(BlockAcctStats *stats,
>> diff --git a/slirp b/slirp
>> index ce94eba204..2faae0f778 16
>> --- a/slirp
>> +++ b/slirp
>> @@ -1 +1 @@
>> -Subproject commit ce94eba2042d52a0ba3d9e252ebce86715e94275
>> +Subproject commit 2faae0f778f818fadc873308f983289df697eb93
> SLiRP change unrelated I presume...
>
yes :(

subprojects comes into play.. I have not event changed this.
Just a result from pull :



Re: [PATCH 2/3] block: add logging facility for long standing IO requests

2020-08-05 Thread Philippe Mathieu-Daudé
On 8/5/20 12:08 PM, Denis V. Lunev wrote:
> There are severe delays with IO requests processing if QEMU is running in
> virtual machine or over software defined storage. Such delays potentially
> results in unpredictable guest behavior. For example, guests over IDE or
> SATA drive could remount filesystem read-only if write is performed
> longer than 10 seconds.
> 
> Such reports are very complex to process. Some good starting point for this
> seems quite reasonable. This patch provides one. It adds logging of such
> potentially dangerous long IO operations.
> 
> Signed-off-by: Denis V. Lunev 
> Reviewed-by: Vladimir Sementsov-Ogievskiy 
> CC: Stefan Hajnoczi 
> CC: Kevin Wolf 
> CC: Max Reitz 
> ---
>  block/accounting.c | 59 +-
>  blockdev.c |  7 -
>  include/block/accounting.h |  5 +++-
>  slirp  |  2 +-
>  4 files changed, 69 insertions(+), 4 deletions(-)
> 
...

>  typedef struct BlockAcctCookie {
> @@ -101,7 +104,7 @@ typedef struct BlockAcctCookie {
>  
>  void block_acct_init(BlockAcctStats *stats);
>  void block_acct_setup(BlockAcctStats *stats, bool account_invalid,
> - bool account_failed);
> +  bool account_failed, unsigned 
> latency_log_threshold_ms);
>  void block_acct_cleanup(BlockAcctStats *stats);
>  void block_acct_add_interval(BlockAcctStats *stats, unsigned 
> interval_length);
>  BlockAcctTimedStats *block_acct_interval_next(BlockAcctStats *stats,
> diff --git a/slirp b/slirp
> index ce94eba204..2faae0f778 16
> --- a/slirp
> +++ b/slirp
> @@ -1 +1 @@
> -Subproject commit ce94eba2042d52a0ba3d9e252ebce86715e94275
> +Subproject commit 2faae0f778f818fadc873308f983289df697eb93

SLiRP change unrelated I presume...