Hi all,

I have updated all my work in my gist[0]. My patch is based on git commit "57fac75 xtensa: sort values in struct_user_offsets". I see that strace 4.9 has just been released and is about 20 commits ahead of my work, so I will continue my work and rebase my patch onto strace 4.9. You can find a single big diff file in my gist[1]; it contains all my changes to strace.

1) The overall structure of the JSON output

I tested the JSON output using the programs in strace's test directory. It now produces valid JSON for most syscalls; the exceptions are ioctl() in some special cases and signals.

The JSON output is a series of lines, each of which is a single valid JSON object representing a syscall, a signal, or another event. Users can therefore read just the lines they need instead of parsing the entire file at once.

Each object/line has at least 4 key/value pairs: the type, name, args and ret of a syscall. The type and name are single strings, args is an array of the syscall's arguments, and ret is the return value (in most cases a single number, but it may be of another type in some situations). Some objects carry extra key/value pairs in special cases, such as error and strerror.

2) Some big changes to the old output

One of the most important changes from the original output concerns numbers. Because JSON itself does not support octal or hex numbers, they are printed as unsigned decimal (%u) in the JSON output. All pointers (output by %p) are wrapped in a double-quoted string.

Another big change concerns some special output styles in strace. First, strace produces the abbreviation ... when the output is too long in some situations. Second, strace produces a "?" in some situations for an unknown value. Third, strace produces text that is not valid JSON, such as {0x1e9b7e9c, 0x7f7a} and MCE_GET_LOG_LEN or MEMERASE or MTRRIOC_DEL_ENTRY, in the arguments of a syscall.

The solution for the first and second cases is simple: currently I just output the strings "..." and "?" in place of the abbreviation and the question mark. The third problem is the biggest challenge in this work; currently I need to modify these functions one by one to produce valid JSON output. You can see my detailed modifications to these functions in [2].

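To illustrate the one-object-per-line layout described in 1), here is a minimal sketch of how a consumer might read the output. It is not part of the patch. The function names (parse_trace_lines, failed_syscalls) are hypothetical, and the sample lines are copied from the examples in this mail, with the trailing comma after the read() example dropped so that the line parses on its own.

```python
import json

# Keys that every syscall object is described as having.
REQUIRED_SYSCALL_KEYS = {"type", "name", "args", "ret"}


def parse_trace_lines(lines):
    """Yield one dict per non-empty line; each line is parsed independently."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        yield json.loads(line)


def failed_syscalls(records):
    """Yield (name, error) for syscall records that carry the extra error key."""
    for rec in records:
        if rec.get("type") == "syscall" and "error" in rec:
            yield rec["name"], rec["error"]


# Sample lines taken from the example output in this mail.
sample = [
    '{"type": "syscall", "name": "read", "args": [[3, "/tmp/strace_GSOC_json_test.txt"], '
    '"helloworld, this_is_a_test_strin", 1000], "ret": 73}',
    '{"type": "syscall", "name": "splice", "args": [[3, "/tmp/strace_GSOC_json_test.txt"], null, '
    '[4, "/tmp/strace_tmp_file2"], null, 100, ["SPLICE_F_MOVE","SPLICE_F_NONBLOCK"]], '
    '"ret": -1, "error": "EINVAL", "strerror": "Invalid argument"}',
    '{"type": "+++", "name": "exited", "info": 0 }',
]

records = list(parse_trace_lines(sample))
failed = list(failed_syscalls(records))

for name, error in failed:
    print(name, error)
```

Because every line stands alone, a reader like this can filter or skip lines without loading the whole trace, and a single malformed line only affects that one record.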
3) Example output

You can compare the two kinds of output, produced using the options "-v -y" and "-j -v -y" (there are more examples in my gist[0]):

1: open("/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3</lib/x86_64-linux-gnu/libc-2.19.so>
2: {"type": "syscall", "name": "open", "args": ["/etc/ld.so.cache", ["O_RDONLY","O_CLOEXEC"]], "ret": [3, "/etc/ld.so.cache"]}

3: mmap(0x7f341644b000, 24576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3</lib/x86_64-linux-gnu/libc-2.19.so>, 0x1bb000) = 0x7f341644b000
4: {"type": "syscall", "name": "mmap", "args": [139870053089280, 24576, ["PROT_READ","PROT_WRITE"], "MAP_PRIVATE", "MAP_FIXED", "MAP_DENYWRITE", [3, "/lib/x86_64-linux-gnu/libc-2.19.so"], 1814528], "ret": 139870053089280}

5: read(3</tmp/strace_GSOC_json_test.txt>, "helloworld, this_is_a_test_strin"..., 1000) = 73
6: {"type": "syscall", "name": "read", "args": [[3, "/tmp/strace_GSOC_json_test.txt"], "helloworld, this_is_a_test_strin", 1000], "ret": 73},

7: splice(3</tmp/strace_GSOC_json_test.txt>, NULL, 4</tmp/strace_tmp_file2>, NULL, 100, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EINVAL (Invalid argument)
8: {"type": "syscall", "name": "splice", "args": [[3, "/tmp/strace_GSOC_json_test.txt"], null, [4, "/tmp/strace_tmp_file2"], null, 100, ["SPLICE_F_MOVE","SPLICE_F_NONBLOCK"]], "ret": -1, "error": "EINVAL", "strerror": "Invalid argument"}

9: fstat(3</lib/x86_64-linux-gnu/libc-2.19.so>, {st_dev=makedev(8, 1), st_ino=1048744, st_mode=S_IFREG|0755, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=3608, st_size=1845024, st_atime=2014/08/17-20:39:31, st_mtime=2014/04/12-18:38:28, st_ctime=2014/06/12-15:09:58}) = 0
10: {"type": "syscall", "name": "fstat", "args": [[3, "/lib/x86_64-linux-gnu/libc-2.19.so"], {"st_dev": [8, 1], "st_ino": 1048744, "st_mode": ["S_IFREG",493], "st_nlink": 1, "st_uid": 0, "st_gid": 0, "st_blksize": 4096, "st_blocks": 3608, "st_size": 1845024, "st_atime": "[2014, 8,17,20,39,31]", "st_mtime": "[2014, 4,12,18,38,28]", "st_ctime": "[2014, 6,12,15, 9,58]"}], "ret": 0}

11: +++ exited with 0 +++
12: {"type": "+++", "name": "exited", "info": 0 }

4) Implementation

I have changed the design a lot since the first patch, and it now differs significantly from my original proposal. The core idea is to keep the strace code clean and change it as little as possible. You can find a snippet of my changes to strace in [2].

tprintf() is modified so that it replaces the %o and %x specifiers with %u, and the %p and %s specifiers with "%p" and "%s", so we do not need to modify the tprintf() calls in the syscall functions. I also introduced some helper functions to keep the changes clean and easy; you can find their usage in [2] as well.

5) Future plan

First, the current modification to strace is simple for most syscalls, but I think it is still very ugly when dealing with some extremely complex output functions such as sys_futex() and sys_clone(). I want to try other approaches to make the code look better after these changes.

Second, the format needs more testing with different situations and arguments. I think I need to write one test for each syscall right after I change it; the programs in the test directory are not enough to make sure the output is valid.

Dmitry, could you please give me some suggestions for my next steps?

Thank you!

[0] https://gist.github.com/zym0017d
[1] https://gist.github.com/zym0017d/9ba84382f0d1596d1fab
[2] https://gist.github.com/zym0017d/55ab97db366de5cd709f

---
YangMin