Re: [RFC] bpf: Suggestion on bpf syscall interface
On 3/29/15 8:13 PM, He Kuang wrote:
> By using current bpf syscalls, we should keep the program which
> attaches bpf programs running background, use it or some other
> processes communicate with it to adjust maps parameters, like sample
> rate for sys_write.

You can do all of the above by passing fds between processes.
I still don't see a need for sysfs.

> In current implementation, we have to use a large and relatively heavy
> daemon to deal with loading, configuration, adjusting and unloading
> works together.

This daemon is actually small and simple.
Just take a look how Daniel did for tc:
http://patchwork.ozlabs.org/patch/456387/
In that example 3 programs are sharing maps and single bpf_agent
monitors maps. Note that tc loaded programs and exited while agent
keeps running. Very straightforward.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [RFC] bpf: Suggestion on bpf syscall interface
On 2015/3/29 1:21, Alexei Starovoitov wrote:
> On 3/28/15 4:36 AM, He Kuang wrote:
>> Hi, Alexei
>>
>> In our end-end IO module project, we use bpf maps to record
>> configurations. According to current bpf syscall interface, we
>> should specify map_fd to lookup/update bpf maps, so we are
>> restricted to do config in the same user program.
>
> you can pass map_fd and prog_fd from one process to another via normal
> scm_rights mechanism.

In our current use case, we add a bpf probe point in sys_write() as the
entry of IO procedure, this bpf point will return true on some
conditions, and then trigger bpf chain on IO path, for example:

SEC("kprobe/sys_write")
int NODE_sys_write(struct pt_regs *ctx)
{
    ...
    struct parameters *param = bpf_map_lookup_elem(&parameters_map, &index);
    if (param->num_samples % param->sample_rate != 0)
        return 0;
    ...
    /* extract characters from this sampled point, fill it to another map */
    bpf_map_update_elem(&TRIGGER_mpage_submit_page_HASH,
                        (void **)__b__buf, &value, BPF_ANY);
    return 1;
    ...

SEC("kprobe/mpage_submit_page")
int NODE_mpage_submit_page(struct pt_regs *ctx)
{
    ...
    /* lookup filter table */
    value = (struct table_value *)bpf_map_lookup_elem(
                &TRIGGER_mpage_submit_page_HASH, (void **)__b__buf);
    if (!value)
        return 0;
    ...

By using current bpf syscalls, we should keep the program which attaches
bpf programs running background, use it or some other processes
communicate with it to adjust maps parameters, like sample rate for
sys_write.

What we hope is to use bpf maps/progs like kernel-modules or kprobes,
one process inserts them to kernel, then they detach from that process,
and allow us to configure them with sysfs. For example:

$ perf probe --add='sys_write'
$ perf record -e probe:sys_open -aR sleep 1

In current implementation, we have to use a large and relatively heavy
daemon to deal with loading, configuration, adjusting and unloading
works together.

Thanks.
>> My suggestion is to export this kind of operations to sysfs, so we
>> can load&attach bpf progs and config it separately. We implement
>> this feature in our demo project. What's your opinion on this?
>
> Eventually we may use single sysfs file for lsmod-like listings, but I
> definitely don't want to create parallel interface to maps via sysfs.
> It's way too expensive and not really suitable for binary key/values.
Re: [RFC] bpf: Suggestion on bpf syscall interface
On 03/28/2015 06:21 PM, Alexei Starovoitov wrote:
> On 3/28/15 4:36 AM, He Kuang wrote:
>> Hi, Alexei
>>
>> In our end-end IO module project, we use bpf maps to record
>> configurations. According to current bpf syscall interface, we
>> should specify map_fd to lookup/update bpf maps, so we are
>> restricted to do config in the same user program.
>
> you can pass map_fd and prog_fd from one process to another via normal
> scm_rights mechanism.

+1, I've just tried that out in the context of a different work and
works like a charm.

>> My suggestion is to export this kind of operations to sysfs, so we
>> can load&attach bpf progs and config it separately. We implement
>> this feature in our demo project. What's your opinion on this?
>
> Eventually we may use single sysfs file for lsmod-like listings, but I
> definitely don't want to create parallel interface to maps via sysfs.

Yes, that would be a bad design decision.

Btw, even more lightweight for kernel-side would be to just implement
.show_fdinfo() for the anon inodes on the map/prog store and have some
meta information exported from there. You can then grab that via
/proc//fdinfo/, I would consider such a thing a slow-path operation
anyway, and you would also get the app info using it for free.

> It's way too expensive and not really suitable for binary key/values.

+1
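For illustration, reading the per-fd information mentioned above from
userspace could look roughly like the sketch below. It only relies on
the generic procfs layout /proc/<pid>/fdinfo/<fd>; the helper name
read_fdinfo() is hypothetical, and which bpf-specific fields a
.show_fdinfo() implementation would export is not specified in this
thread, so the code just returns whatever text the kernel provides.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy the contents of /proc/<pid>/fdinfo/<fd>
 * into 'out' as a NUL-terminated string.
 * Returns the number of bytes read, or -1 on error.
 */
static long read_fdinfo(pid_t pid, int fd, char *out, size_t len)
{
    char path[64];
    FILE *f;
    size_t n;

    if (len == 0)
        return -1;

    snprintf(path, sizeof(path), "/proc/%ld/fdinfo/%d", (long)pid, fd);

    f = fopen(path, "r");
    if (!f)
        return -1;

    /* fdinfo files are small; one read is enough for this sketch. */
    n = fread(out, 1, len - 1, f);
    out[n] = '\0';
    fclose(f);

    return (long)n;
}
```

Called with the pid of the process holding the map or prog fd, this
returns the generic fields every fd has (such as pos and flags) plus
anything the fd's owner chooses to export.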
Re: [RFC] bpf: Suggestion on bpf syscall interface
On 3/28/15 4:36 AM, He Kuang wrote:
> Hi, Alexei
>
> In our end-end IO module project, we use bpf maps to record
> configurations. According to current bpf syscall interface, we should
> specify map_fd to lookup/update bpf maps, so we are restricted to do
> config in the same user program.

you can pass map_fd and prog_fd from one process to another via normal
scm_rights mechanism.

> My suggestion is to export this kind of operations to sysfs, so we can
> load&attach bpf progs and config it separately. We implement this
> feature in our demo project. What's your opinion on this?

Eventually we may use single sysfs file for lsmod-like listings, but I
definitely don't want to create parallel interface to maps via sysfs.
It's way too expensive and not really suitable for binary key/values.
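As a rough sketch of the scm_rights mechanism referred to in this
thread, the code below passes one file descriptor across an AF_UNIX
socket using sendmsg()/recvmsg() with an SCM_RIGHTS control message.
The helper names send_fd(), recv_fd() and roundtrip_demo() are
hypothetical, and a pipe fd stands in for a bpf map_fd or prog_fd so
that the example needs no privileges; none of this code is taken from
the patches discussed here.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one fd. */
union fd_cmsg {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Send 'fd' over the connected AF_UNIX socket 'sock'. 0 on success. */
static int send_fd(int sock, int fd)
{
    char dummy = 'F';   /* at least one byte of ordinary data is sent */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd from 'sock'. Returns the new fd, or -1 on error. */
static int recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/*
 * Pass the write end of a pipe through a socketpair, write through the
 * received copy and read the data back. Returns 0 if it round-trips.
 */
int roundtrip_demo(void)
{
    int sv[2], pfd[2], got, ret = -1;
    char buf[2];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (pipe(pfd) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (send_fd(sv[0], pfd[1]) == 0 && (got = recv_fd(sv[1])) >= 0) {
        if (write(got, "ok", 2) == 2 && read(pfd[0], buf, 2) == 2 &&
            memcmp(buf, "ok", 2) == 0)
            ret = 0;
        close(got);
    }

    close(pfd[0]);
    close(pfd[1]);
    close(sv[0]);
    close(sv[1]);
    return ret;
}
```

In a real setup the two socket ends would live in different processes,
for example a loader that creates the maps and a separate tool that
receives map_fd and then issues lookup/update calls on it.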