> On Jan 9, 2019, at 2:18 AM, Peter Zijlstra <pet...@infradead.org> wrote:
> 
> On Tue, Jan 08, 2019 at 11:54:04PM +0000, Song Liu wrote:
> 
>> I think Intel PT case is at instruction granularity (instead of ksymbol
>> granularity)? 
> 
> Yes.
> 
>> If this is true, modules, BPF, and PT could still share
>> the ksymbol record for basic profiling. And advanced use cases like 
>> annotation will depend on user space to record BPF_EVENT (and equivalent
>> for other cases) timely. But at least, the ksymbol is already there. 
>> 
>> Does this make sense?  
> 
> I'm not sure I follow; the idea was that on ksym events we copy out the
> instructions using kcore. The ksym event already has addr+len.

I was thinking about the scenario where text is modified in place. In that
case, we could use something like

struct perf_record_text_modify {
    u64 addr;
    u_big_enough old_instr;
    u_big_enough new_instr;
    u64 timestamp;
};

It is a fixed-size record, and we don't need to process it immediately
in user space. At the end of the perf run, a series of these events will
let us reconstruct the exact text at any point in time.
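
To make the reconstruction step concrete, here is a rough user-space sketch,
not a proposed ABI. It assumes we keep an initial copy of the text region and
replay the records in timestamp order. INSTR_MAX (standing in for
"u_big_enough"), the extra "len" field, and the helper name text_at() are all
made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define INSTR_MAX 16	/* assumption: stand-in for "u_big_enough" */

/* Hypothetical user-space mirror of the proposed record. */
struct text_modify {
	uint64_t addr;
	uint8_t  len;			/* assumption: bytes actually patched */
	uint8_t  old_instr[INSTR_MAX];
	uint8_t  new_instr[INSTR_MAX];
	uint64_t timestamp;
};

/*
 * Patch `text` (an initial copy of [base, base + size)) so that it matches
 * the kernel text at time `when`. `recs` must be sorted by timestamp.
 * Returns the number of records applied.
 */
static size_t text_at(uint8_t *text, uint64_t base, size_t size,
		      const struct text_modify *recs, size_t n, uint64_t when)
{
	size_t applied = 0;

	for (size_t i = 0; i < n && recs[i].timestamp <= when; i++) {
		const struct text_modify *r = &recs[i];

		assert(r->len <= INSTR_MAX);
		if (r->addr < base || r->addr - base + r->len > size)
			continue;	/* outside the region we copied */
		memcpy(text + (r->addr - base), r->new_instr, r->len);
		applied++;
	}
	return applied;
}
```

Since old_instr is recorded too, the same records could be replayed backwards
from a final copy instead of forwards from an initial one.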

> 
> All we need is some means of ensuring the symbol is still there by the
> time we see the event and do the copy.
> 
> I think we can do this with a new ioctl() on /proc/kcore itself:
> 
> - when we have kcore open, we queue all text-free operations on list-1.
> 
> - when we close kcore, we drain all (text-free) list-* and perform the
>   pending frees immediately.
> 
> - on ioctl(KCORE_QC) we perform the pending free of list-3 and advance
>   list-2 to list-3 and list-1 to list-2.
> 
> Perf would then open kcore at the start of the record, make a complete
> copy and keep the FD open. At the end of every buffer process, we issue
> KCORE_QC IFF we observed a ksym unreg in that buffer.
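
To check that I read the list rotation correctly, here is a small user-space
model of it. It is only a sketch of the scheme as described above: the struct
and function names are hypothetical, text regions are reduced to integer ids,
and "freeing" just sets a flag.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_PENDING 8	/* assumption: small fixed bounds for the model */
#define MAX_IDS     32

struct free_list {
	int ids[MAX_PENDING];
	int n;
};

struct kcore_model {
	struct free_list list[3];	/* list[0] is list-1, list[2] is list-3 */
	bool freed[MAX_IDS];
};

/* Perform the pending frees on one list and empty it. */
static void do_free(struct kcore_model *m, struct free_list *l)
{
	for (int i = 0; i < l->n; i++)
		m->freed[l->ids[i]] = true;
	l->n = 0;
}

/* A text-free request arrives while kcore is open: park it on list-1. */
static void model_queue_free(struct kcore_model *m, int id)
{
	struct free_list *l = &m->list[0];

	assert(id >= 0 && id < MAX_IDS && l->n < MAX_PENDING);
	l->ids[l->n++] = id;
}

/* Models ioctl(KCORE_QC): free list-3, then list-2 -> list-3, list-1 -> list-2. */
static void model_qc(struct kcore_model *m)
{
	do_free(m, &m->list[2]);
	m->list[2] = m->list[1];
	m->list[1] = m->list[0];
	m->list[0].n = 0;
}

/* Models close(): drain every list and free immediately. */
static void model_close(struct kcore_model *m)
{
	for (int i = 0; i < 3; i++)
		do_free(m, &m->list[i]);
}
```

In this model a queued free is only performed on the third KCORE_QC after it
was queued, or at close, whichever comes first.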

Does this mean we need to scan every buffer before writing it to perf.data 
during perf-record? 

Also, if we need ksym unreg here, I guess this is NOT really modifying text
in place, but creating a new version and swapping it in? If so, could we
include something like this in perf.data:

struct perf_record_text_modify {
    u64 old_addr;
    u64 new_addr;
    u32 old_len; /* up to MAX_SIZE */
    u32 new_len; /* up to MAX_SIZE */
    u8 old_text[MAX_SIZE];
    u8 new_text[MAX_SIZE];
    u64 timestamp;
};

This way, the record is embedded in perf.data and doesn't require extra
processing during perf-record (only at the end of perf-record). This
would work for the text-modifying case, since modifying text is simply
old-text to new-text.
 
A similar solution would not work for the BPF case, as bpf_prog_info is
getting a lot more members in the near future.

Does this make sense...?

Thanks,
Song
