The following pull request was submitted through GitHub.
It can be accessed and reviewed at: https://github.com/lxc/lxcfs/pull/309

This e-mail was sent by the LXC bot, direct replies will not reach the author
unless they happen to be subscribed to this list.

=== Description (from pull-request) ===
This fix is similar to the one in #125.

This type of problem is caused by a bug in the read_file function. The cause is analyzed below.
In some cases, running ps aux in the container fails with an error saying that btime cannot be found.
```
58eb38977abb(@:):/# ps aux
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
missing btime in /proc/stat
root           1  0.0  0.0 205172  7412 ?        Ss58eb38977abb(@:):/#
58eb38977abb(@:):/#
```
We then used strace to trace the system calls.
```
58eb38977abb(@:):/# strace -ff -s 10240 -e read,open,openat ps aux > /dev/null
...  ...
open("/proc/stat", O_RDONLY)            = 6
read(6, "cpu  13030938 7115021 44303748 6188805307 13413081 0 182664 0 0 0\ncpu0 333372 172944 1060094 154237893 267468 0 13827 0 0 0\ncpu1 414513 177557 1145335 154878583 27030 0 13029 0 0 0\ncpu2 388653 177135 1152247 154818449 48530 0 12159 0 0 0\ncpu3 403940 ... ... 0 0 0 0 0 0 0 0", 4096) = 4096
read(6, "", 4096)                       = 0
read(6, "", 4096)                       = 0
missing btime in /proc/stat
+++ exited with 1 +++
58eb38977abb(@:):/#
```
However, reading /proc/stat directly (for example with cat or wc) works normally.
```
58eb38977abb(@:):/# wc /proc/stat
48 2035 5899 /proc/stat
```
wc reports a length of 5899 bytes for /proc/stat. In the strace output, however, the first read returned 4096 bytes and the second read returned 0. The second read is clearly wrong.
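
The symptom can also be checked without strace. The following is a minimal sketch, not part of the patch: `count_bytes_chunked` is a hypothetical helper that reads a file with fixed-size read(2) calls until it sees end-of-file, roughly as ps does above, and returns the number of bytes received.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from lxcfs or procps): read `path` using
 * `chunk`-byte read(2) calls until read returns 0, and return the total
 * number of bytes received, or -1 on error.
 */
long count_bytes_chunked(const char *path, size_t chunk)
{
    char buf[8192];
    long total = 0;
    ssize_t n;
    int fd;

    /* The chunk must fit into the local buffer. */
    assert(chunk > 0 && chunk <= sizeof(buf));

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* A return value of 0 is treated as end-of-file by the reader. */
    while ((n = read(fd, buf, chunk)) > 0)
        total += n;

    close(fd);
    return n < 0 ? -1 : total;
}
```

On an affected container, `count_bytes_chunked("/proc/stat", 4096)` would presumably return 4096 instead of the 5899 bytes reported by wc, because the second read ends the loop early.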

So I started debugging stat_read in lxcfs.
```
    char *cache = d->buf + CPUALL_MAX_SIZE;
    size_t cache_size = d->buflen - CPUALL_MAX_SIZE;
    FILE *f = NULL;
    struct cpuacct_usage *cg_cpu_usage = NULL;
    int cg_cpu_usage_size = 0;

    if (offset){
        if (offset > d->size)
            return -EINVAL;
        if (!d->cached)
            return 0;
        int left = d->size - offset;
        total_len = left > size ? size: left;
        memcpy(buf, d->buf + offset, total_len);
        return total_len;
    }

    pid_t initpid = lookup_initpid_in_store(fc->pid);
    lxcfs_v("initpid: %d\n", initpid);
    if (initpid <= 0)
        initpid = fc->pid;
```
The offset of the second read is 4096, as expected, but the check `if (!d->cached)` evaluates to true, so the function returns 0 immediately. This is clearly not the intended behavior. The cause is that `d->cached` is not set after the first read, so this PR sets it in the read_file function.
With the fix applied, the `ps aux` command works.
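
To illustrate the failure mode in isolation, here is a simplified model of the `if (offset)` branch quoted above. It is only a sketch under assumptions: `struct model_file_info` and `model_read_at_offset` are hypothetical names that mirror the three fields used in the excerpt, not the real lxcfs definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the file_info fields used in the excerpt. */
struct model_file_info {
    const char *buf; /* rendered file contents */
    size_t size;     /* number of valid bytes in buf */
    int cached;      /* non-zero once buf may serve later reads */
};

/* Models only the offset != 0 path of the read handler. */
long model_read_at_offset(const struct model_file_info *d, char *out,
                          size_t size, size_t offset)
{
    size_t left;

    assert(d != NULL && out != NULL);

    if (offset > d->size)
        return -EINVAL;
    if (!d->cached)
        return 0; /* premature end-of-file, as seen in the strace output */

    /* Copy at most `size` bytes of what remains after `offset`. */
    left = d->size - offset;
    if (left > size)
        left = size;
    memcpy(out, d->buf + offset, left);
    return (long)left;
}
```

With `cached` left at 0 after the first read, every follow-up read in this model returns 0 regardless of how much data remains. Once `cached` is non-zero, the same call returns the remaining bytes.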

Please review. Thanks.

Signed-off-by: Hongbo Yin <yinhon...@bytedance.com>
From 030d022c5f1adfa7acdbb6deb5aad82d9580181d Mon Sep 17 00:00:00 2001
From: Hongbo Yin <yinhon...@bytedance.com>
Date: Tue, 15 Oct 2019 15:39:58 +0800
Subject: [PATCH] fix read_file second reading empty bug

Signed-off-by: Hongbo Yin <yinhon...@bytedance.com>
---
 bindings.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/bindings.c b/bindings.c
index 1811955..234574e 100644
--- a/bindings.c
+++ b/bindings.c
@@ -3382,6 +3382,8 @@ int read_file(const char *path, char *buf, size_t size, struct file_info *d)
   err:
        fclose(f);
        free(line);
+       if (d->size > rv)
+               d->cached = d->size - rv;
        return rv;
 }
 
_______________________________________________
lxc-devel mailing list
lxc-devel@lists.linuxcontainers.org
http://lists.linuxcontainers.org/listinfo/lxc-devel
