29.11.2018, 07:07, "Y Song" <ys114...@gmail.com>:
> On Mon, Nov 26, 2018 at 4:29 AM Aleksei Zakharov <zakharov....@yandex.ru>
> wrote:
>> Hi, everyone!
>> I've written a small script using the bcc Python library to trace QEMU
>> block I/O latency.
>> I attach two uprobes to block_acct_start and block_acct_done.
>>
>> On some servers I can see unexpectedly high latencies:
>> ~# ./qemuioslower 40000
>> Tracing for qemu I/O... Ctrl-C to end
>> TIME LATENCY SIZE(KB) IO TYPE
>> 0.0 18446744073709551 8192 write
>> 9.7294584 18446744073709551 0 flush
>> 24.815983 18446744073709551 4096 write
>> 25.422378 18446744073709551 0 flush
>>
>> To me it looks like bpf_ktime_get_ns() in the stop() function returned a
>> time that is less than what bpf_ktime_get_ns() returned in start(), but why
>> does this happen? Could someone help me understand this behavior?
>
> Maybe you want to check with bcc issue
> https://github.com/iovisor/bcc/issues/728.
> Basically there is a kernel bug related to bpf_ktime_get_ns() and it
> is fixed in 4.9. Not sure about your kernel version.
Thanks for your answer!
I've tried to reproduce this with the perf example from the issue:
perf record -F 9 --cpu 0 --clockid CLOCK_MONOTONIC -- sleep 1
But it looks normal.
Also, I'm using a 4.15 kernel, where this bug should already be fixed.
I tried running different tools from bcc-tools, and I didn't find any strange
output.
Is there any special (non-buggy) case in which such latencies could show up?
We use servers with 2-4 NUMA nodes; might this be the cause?
Also, I tried tracing the latency between scsi_req_enqueue and scsi_req_complete,
and I didn't see such high latencies in that case.
>
>> BPF code:
>> #include <uapi/linux/ptrace.h>
>>
>> BPF_HASH(req, u64);
>>
>> enum BlockAcctType {
>>     BLOCK_ACCT_READ,
>>     BLOCK_ACCT_WRITE,
>>     BLOCK_ACCT_FLUSH,
>>     BLOCK_MAX_IOTYPE,
>> };
>>
>> struct data_t {
>>     u64 ts;
>>     u64 lat;
>>     u64 bytes;
>>     enum BlockAcctType type;
>> };
>>
>> typedef struct BlockAcctCookie {
>>     int64_t bytes;
>>     int64_t start_time_ns;
>>     enum BlockAcctType type;
>> } BlockAcctCookie;
>>
>> BPF_PERF_OUTPUT(events);
>>
>> void start(struct pt_regs *ctx) {
>>     u32 pid = bpf_get_current_pid_tgid();
>>     if (FILTER_PID)
>>         return;
>>     u64 key = 0;
>>     bpf_probe_read(&key, sizeof(key), (void *)PT_REGS_PARM1(ctx));
>>     u64 ts = bpf_ktime_get_ns();
>>     req.update(&key, &ts);
>> }
>>
>> void stop(struct pt_regs *ctx, void *first, struct BlockAcctCookie *cookie) {
>>     struct data_t data = {};
>>     u64 key = 0;
>>     bpf_probe_read(&key, sizeof(key), (void *)PT_REGS_PARM1(ctx));
>>     data.ts = bpf_ktime_get_ns();
>>     data.type = cookie->type;
>>     data.bytes = cookie->bytes;
>>     u64 *s_time = req.lookup(&key);
>>     if (s_time != 0) {
>>         data.lat = (data.ts - *s_time);
>>         data.lat /= 1000;
>>         if (data.lat > MIN_US) {
>>             events.perf_submit(ctx, &data, sizeof(data));
>>         }
>>         req.delete(&key);
>>     }
>> }
>>
>> Kernel: Linux hostname 4.15.0-36-generic #39~16.04.1-Ubuntu SMP Tue Sep 25
>> 08:59:23 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
>> bcc packages versions:
>> bcc-tools 0.7.0-1
>> libbcc 0.7.0-1
>> libcc1-0:amd64 5.4.0-6ubuntu1~16.04.10
>> python-bcc 0.7.0-1
>>
>> --
>> Regards,
>> Aleksei Zakharov
>>
>>
--
Regards,
Aleksei Zakharov
View/Reply Online (#1542): https://lists.iovisor.org/g/iovisor-dev/message/1542