On Thu, Jun 14, 2018 at 09:47:20PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 12, 2018 at 10:51:15AM +0300, Alexander Shishkin wrote:
> > @@ -6112,6 +6219,32 @@ void perf_prepare_sample(struct perf_event_header *header,
> >  
> >     if (sample_type & PERF_SAMPLE_PHYS_ADDR)
> >             data->phys_addr = perf_virt_to_phys(data->addr);
> > +
> > +   if (sample_type & PERF_SAMPLE_AUX) {
> > +           u64 size;
> > +
> > +           header->size += sizeof(u64); /* size */
> > +
> > +           /*
> > +            * Given the 16bit nature of header::size, an AUX sample can
> > +            * easily overflow it, what with all the preceding sample bits.
> > +            * Make sure this doesn't happen by using up to U16_MAX bytes
> > +            * per sample in total (rounded down to 8 byte boundary).
> > +            */
> > +           size = min_t(size_t, U16_MAX - header->size,
> > +                        event->attr.aux_sample_size);
> > +           size = rounddown(size, 8);
> > +           size = perf_aux_sample_size(event, data, size);
> > +
> > +           WARN_ON_ONCE(size + header->size > U16_MAX);
> > +           header->size += size;
> > +   }
> > +   /*
> > +    * If you're adding more sample types here, you likely need to do
> > +    * something about the overflowing header::size, like repurpose the
> > +    * lowest 3 bits of size, which should be always zero at the moment.
> > +    */
> 
> Bugger yes.. I fairly quickly (but still too late) realized we should've
> used that u16 in u64 increments to allow up to 512K instead of 64K
> events.
>
> Still, even 64K samples are pretty terrifyingly huge. They'll be
> _sloowww_.
> 
> In any case, I suppose we can grab one of the attribute bits to rev. the
> output format -- a la sample_id_all. Do we really want to do that? 512K
> samples.... *shudder*.

Well, as far as PT goes, even a 4K sample carries thousands of branches, so
I can't imagine ever needing more than 64K, but I didn't really want to make
the call. The same goes for the width of aux_sample_size, which should match
what we can theoretically output.
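
To make the limit concrete, here is a rough userspace sketch of the clamp
arithmetic from the hunk above. It is not the kernel code: aux_sample_budget()
and round_down_8() are made-up names, U16_MAX is redefined locally, and the
perf_aux_sample_size() step that may shrink the size further is left out.

```c
#include <assert.h>
#include <stdint.h>

#define U16_MAX 0xffffu

/* Round x down to a multiple of 8, mirroring rounddown(x, 8) in the patch. */
static uint64_t round_down_8(uint64_t x)
{
	return x - (x % 8);
}

/*
 * Hypothetical helper, for illustration only.
 *
 * header_size: bytes already accounted for in the sample, including the
 *              u64 that stores the AUX data size.
 * requested:   stand-in for event->attr.aux_sample_size.
 *
 * Returns the number of AUX bytes that still fit without overflowing a
 * 16-bit header::size, rounded down to an 8 byte boundary.
 */
static uint64_t aux_sample_budget(uint64_t header_size, uint64_t requested)
{
	uint64_t room, size;

	/* header::size is a u16, so the caller cannot be past this already. */
	assert(header_size <= U16_MAX);

	room = U16_MAX - header_size;
	size = requested < room ? requested : room;

	return round_down_8(size);
}
```

With 64 bytes of preceding sample data, a 4096 byte request passes through
unchanged, while a 1M request is cut down to 65464 bytes, so the total stays
at 65528, just under U16_MAX.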

Also, as a thought, we could use perf_adjust_period() to reduce the sample
size dynamically if it takes too long to copy.
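
Purely as an illustration of that idea, and not something the patch does, the
adjustment could look roughly like the sketch below. The helper name, the
time budget parameter and the halve/grow-by-8 policy are all assumptions; how
this would actually hook into perf_adjust_period() is left open.

```c
#include <stdint.h>

/*
 * Hypothetical policy, for illustration only: shrink the AUX sample size
 * when the last copy took longer than the budget, otherwise grow it back
 * slowly. The result is kept 8 byte aligned and within [8, max_size].
 * Assumes max_size >= 8.
 */
static uint64_t adjust_aux_sample_size(uint64_t cur_size, uint64_t max_size,
				       uint64_t copy_ns, uint64_t budget_ns)
{
	uint64_t next;

	if (copy_ns > budget_ns)
		next = cur_size / 2;	/* copy was too slow, back off hard */
	else
		next = cur_size + 8;	/* within budget, creep back up */

	if (next > max_size)
		next = max_size;

	next -= next % 8;		/* keep the 8 byte alignment */

	if (next < 8)
		next = 8;		/* never drop to an empty sample */

	return next;
}
```

Halving on overrun and growing by one 8 byte step otherwise is just one
possible choice; it backs off quickly when copies are slow and recovers
gradually once they fit the budget again.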

Regards,
--
Alex
