29.11.2018, 07:07, "Y Song" <ys114...@gmail.com>:
> On Mon, Nov 26, 2018 at 4:29 AM Aleksei Zakharov <zakharov....@yandex.ru> 
> wrote:
>>  Hi, everyone!
>>  I've written a small script using the bcc python lib for tracing qemu block
>> I/O latency.
>>  I attach two uprobes to block_acct_start and block_acct_done.
>>
>>  On some servers I can see unexpectedly high latencies:
>>  ~# ./qemuioslower 40000
>>  Tracing for qemu I/O... Ctrl-C to end
>>  TIME LATENCY SIZE(KB) IO TYPE
>>  0.0 18446744073709551 8192 write
>>  9.7294584 18446744073709551 0 flush
>>  24.815983 18446744073709551 4096 write
>>  25.422378 18446744073709551 0 flush
>>
>>  To me it looks like bpf_ktime_get_ns() in the stop() function returned a
>> time that is less than the one bpf_ktime_get_ns() returned in start(), but
>> why does this happen?
>>  Could someone help me understand this behavior?
>
> You may want to check bcc issue
> https://github.com/iovisor/bcc/issues/728.
> Basically, there is a kernel bug related to bpf_ktime_get_ns(), and it
> was fixed in 4.9. Not sure about your kernel version.
Thanks for your answer!
I've tried to reproduce this with the perf example from the issue:
perf record -F 9 --cpu 0 --clockid CLOCK_MONOTONIC -- sleep 1
But the output looks normal.
Also, I'm using a 4.15 kernel, where this bug should already be fixed.
I ran various tools from bcc-tools as well and didn't see any strange output.

Is there any special (non-buggy) case in which such latencies could show up?
We use servers with 2-4 NUMA nodes; might that be the cause?

Also, I tried tracing the latency between scsi_req_enqueue and scsi_req_complete,
and I didn't see such high latencies in that case.

>
>>  BPF code:
>>  #include <uapi/linux/ptrace.h>
>>  BPF_HASH(req, u64);
>>  enum BlockAcctType {
>>      BLOCK_ACCT_READ,
>>      BLOCK_ACCT_WRITE,
>>      BLOCK_ACCT_FLUSH,
>>      BLOCK_MAX_IOTYPE,
>>  };
>>  struct data_t {
>>      u64 ts;
>>      u64 lat;
>>      u64 bytes;
>>      enum BlockAcctType type;
>>  };
>>  typedef struct BlockAcctCookie {
>>      int64_t bytes;
>>      int64_t start_time_ns;
>>      enum BlockAcctType type;
>>  } BlockAcctCookie;
>>  BPF_PERF_OUTPUT(events);
>>
>>  void start(struct pt_regs *ctx) {
>>      u32 pid = bpf_get_current_pid_tgid();
>>      if (FILTER_PID)
>>          return;
>>      u64 key = 0;
>>      bpf_probe_read(&key, sizeof(key), (void *)PT_REGS_PARM1(ctx));
>>      u64 ts = bpf_ktime_get_ns();
>>      req.update(&key, &ts);
>>  }
>>
>>  void stop(struct pt_regs *ctx, void *first, struct BlockAcctCookie *cookie) {
>>      struct data_t data = {};
>>      u64 key = 0;
>>      bpf_probe_read(&key, sizeof(key), (void *)PT_REGS_PARM1(ctx));
>>      data.ts = bpf_ktime_get_ns();
>>      data.type = cookie->type;
>>      data.bytes = cookie->bytes;
>>      u64 *s_time = req.lookup(&key);
>>      if (s_time != 0) {
>>          data.lat = (data.ts - *s_time);
>>          data.lat /= 1000;
>>          if (data.lat > MIN_US) {
>>                  events.perf_submit(ctx, &data, sizeof(data));
>>          }
>>          req.delete(&key);
>>      }
>>  }
>>
>>  Kernel: Linux hostname 4.15.0-36-generic #39~16.04.1-Ubuntu SMP Tue Sep 25 
>> 08:59:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>>  bcc packages versions:
>>  bcc-tools 0.7.0-1
>>  libbcc 0.7.0-1
>>  libcc1-0:amd64 5.4.0-6ubuntu1~16.04.10
>>  python-bcc 0.7.0-1
>>
>>  --
>>  Regards,
>>  Aleksei Zakharov
>>

-- 
Regards,
Aleksei Zakharov
