On Thu, Mar 26, 2015 at 03:37:29PM -0600, David Ahern wrote:
> On 3/26/15 3:11 PM, Don Zickus wrote:
> >Sorry for drawing this out.  Originally the performance still seemed off.
> >But as we split the patch up to see where the perf impact was, the problem
> >seemed to have disappeared.  So we are testing the original patch again.
> >
> >The only difference now is we were playing with the -BN option in perf based
> >on your changelog, just because we never used it before. :-)
> 
> I was beyond surprised that you were measuring a 50% hit with the
> first patch. As mentioned in a previous response it only adds the
> processing of 3 additional lines to the already opened and read
> /proc/pid/status file. So, when I wrote this second version I wanted
> to make sure we are only measuring the impact of this change. The
> /proc/pid/status files are read on startup of the record -- before
> any samples are taken.
> 
> The intent of '-e cpu-clock -F 1000 -- usleep 1' is to avoid any
> samples since we don't care about them. Really the -a should be
> dropped as well -- no need to open per-cpu events.
> 
> -B impacts processing done at the end of the run:
> 
> builin-record.c, __cmd_record():
> 
>                 if (!rec->no_buildid)
>                         process_buildids(rec);
> 
> and -N says don't copying anything to ~/.debug. All together it
> tries to focus the measurement to /proc walking.
> 
> >
> >One last test without the -BN option and if that looks fine, then we have no
> >objections.  Again sorry for dragging this out.  I will let you know
> >tomorrow EST.
> 
> no problem; appreciate the heads up.

I talked with Joe on my way out the door yesterday and he confirmed, just
removing -BN from our test showed a performance hit with your patch.  With
the -BN option, there is no performance hit and we are perfectly fine with
your patch.

So, I guess I am confused how the -BN and your patch could change behaviour.

Just to re-iterate what we did, Joe kicked off a specJBB run and he did 20
captures of two runs (one with the unpatched binary and one with a pached
binary).

for i in {1..20}
do
  time perf.unpatched mem record -a -e cpu/mem-loads,ldlat=50/pp -e 
cpu/mem-stores/pp sleep 10
  time perf.patched   mem record -a -e cpu/mem-loads,ldlat=50/pp -e 
cpu/mem-stores/pp sleep 10
done

then we repeat the above test but with -BN in both runs.  We compare the
log sizes to make sure they are similar for the random snapshots and compare
the times.  With the -BN option, the times are generally within +/- 0.5
seconds of each.  Without the -BN option the patched perf binary is
generally +20-40 seconds slower.



However, based on your description above about what the -BN option does, I
am scratching my head about our results.  Thoughts?

Cheers,
Don

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to