On Thu, Apr 27, 2017 at 05:23:16PM +0100, Chris Wilson wrote:
> On Thu, Apr 27, 2017 at 06:30:42PM +0300, David Weinehall wrote:
> > On Thu, Apr 27, 2017 at 04:55:20PM +0200, Arkadiusz Hiler wrote:
> > > On Wed, Apr 26, 2017 at 06:00:41PM +0300, David Weinehall wrote:
> > > > Add a bunch of MOCS entries for gen 9 that were missing from intel_mocs.
> > > > Some of these are used by media-sdk; if these entries are missing
> > > > the default will instead be to do everything uncached.
> > > >
> > > > This patch improves media-sdk performance with up to 60%
> > > > with the (admittedly synthetic) benchmarks we use in our nightly
> > > > testing, without regressing any other benchmarks.
> > >
> > > Hey David,
> > >
> > > I am testing some of the extended MOCS with Mesa and the differences I
> > > see fit in the margins of statistical error.
> > >
> > > Odd, I thought, so to make sure I haven't messed up anything in the
> > > process of compiling, setting LD_LIBRARY_PATH and benchmarking I turned
> > > everything to UNCACHED - and I saw severe performance drop.
> > >
> > > So here is the question it induced:
> > >
> > > Have you used the "closest neighbour" from entries available or did you
> > > defaulted to the UNCACHED ones? That could be the culprit.
> > >
> > > Note: I have tested MOCS for VB and Render Target only, and only in a
> > > few synthetic cases - it will require much more fine-tuning and
> > > benchmarking before any final conclusions.
> >
> > As I mentioned in the commit message, the improvements only manifest
> > themselves for media-sdk workloads (and presumably other workloads
> > that uses the same hardware); if you see any performance regressions
> > with these additional entries I'd be interested to know.
>
> But what is being counter suggested is that their is no reason for these
> mocs entries. If the sdk is just using mocs registers without first
> programming them outside of the kernel abi, then it will be hitting
> uncached memory - and then the only benefit is from simply enabling
> cached access. The kernel ABI is minimalist for a reason, and we want to
> know why we should be adding tables that we need to maintain forever
> (bonus points for making that a consistent interface for hardware for
> years to come).
> -Chris

Thanks for rephrasing - that's exactly what I am concerned with.

Did you just use the MediaSDK as it is, meaning that MOCS entries beyond
the set of 3 we have defined were used naively?

If that's the case, it is probably the cause of the performance
difference: everything beyond "the 3" means UNCACHED.

Can you try changing MediaSDK to only use entries that are already in?
How does the performance differ in that case?

-- 
Cheers,
Arek