On Wed, Jun 14, 2017 at 10:13 PM, Jose Fonseca <jfons...@vmware.com> wrote:
> On 14/06/17 21:07, Marek Olšák wrote:
>>
>> On Wed, Jun 14, 2017 at 9:45 PM, Jose Fonseca <jfons...@vmware.com> wrote:
>>>
>>> On 14/06/17 17:12, Marek Olšák wrote:
>>>>
>>>> On Tue, Jun 13, 2017 at 3:43 PM, Marek Olšák <mar...@gmail.com> wrote:
>>>>>
>>>>> On Tue, Jun 13, 2017 at 1:40 PM, Jose Fonseca <jfons...@vmware.com> wrote:
>>>>>>
>>>>>> On 12/06/17 22:56, Marek Olšák wrote:
>>>>>>>
>>>>>>> On Mon, Jun 12, 2017 at 10:43 PM, Jose Fonseca <jfons...@vmware.com> wrote:
>>>>>>>>
>>>>>>>> On 12/06/17 21:25, Marek Olšák wrote:
>>>>>>>>>
>>>>>>>>> On Mon, Jun 12, 2017 at 9:51 PM, Jose Fonseca <jfons...@vmware.com> wrote:
>>>>>>>>>>
>>>>>>>>>> How does this help exactly?
>>>>>>>>>>
>>>>>>>>>> Are applications actually rendering to the same FBO w/ and w/o
>>>>>>>>>> SRGB decoding?
>>>>>>>>>>
>>>>>>>>>> Or is the problem here GL_SRGB_WRITE state getting spuriously
>>>>>>>>>> dirtied by the application?
>>>>>>>>>>
>>>>>>>>>> And even if they do, why is toggling surface views in framebuffer
>>>>>>>>>> state so expensive?
>>>>>>>>>>
>>>>>>>>>> I don't object per se, but it looks like an unusual thing to
>>>>>>>>>> optimize for.
>>>>>>>>>
>>>>>>>>> set_framebuffer_state is basically a memory barrier. We have
>>>>>>>>> different caches between FB and textures and we have to flush them
>>>>>>>>> when a texture is unbound from the framebuffer and set as a sampler
>>>>>>>>> view. To keep things simple, set_framebuffer_state is the barrier.
>>>>>>>>> When we change the blend state, the barrier is avoided. Note that
>>>>>>>>> the barrier makes set_framebuffer_state a function that is always
>>>>>>>>> GPU-bound.
>>>>>>>>
>>>>>>>> I see.
>>>>>>>>
>>>>>>>> And you're sure that the incoming set_framebuffer_state calls are
>>>>>>>> not spurious?
>>>>>>>>
>>>>>>>> I know cso_context always eliminates redundant
>>>>>>>> pipe_context::set_framebuffer_state calls, but is it perhaps
>>>>>>>> possible that the Mesa state tracker is resetting the framebuffer
>>>>>>>> state with different surface views that are in practice exactly
>>>>>>>> the same as the previous ones?
>>>>>>>>
>>>>>>>> Like I said, it seems odd that apps are doing this: it doesn't make
>>>>>>>> much sense to me to change the colorspace of the fragments between
>>>>>>>> draws. (Unless some of the assets are already in sRGB and the app
>>>>>>>> is trying to be too smart for its own good to avoid the
>>>>>>>> sRGB->RGB->sRGB.) It seems much more likely that these framebuffer
>>>>>>>> state changes are self-inflicted somewhere in our stack than
>>>>>>>> something truly demanded by the app.
>>>>>>>>
>>>>>>>> And if that's the case and we can fix it, then it would be a better
>>>>>>>> solution all around.
>>>>>>>
>>>>>>> Yeah, the funny part and the reason is that we have a microbenchmark
>>>>>>> in piglit (drawoverhead) changing this state between draw calls. :)
>>>>>>>
>>>>>>> Marek
>>>>>>
>>>>>> I couldn't find that piglit microbenchmark. mesademos has
>>>>>> src/perf/drawoverhead.c but it doesn't set GL_SRGB_WRITE. So if the
>>>>>> fbo is changing internally, then it's a perf bug in the Mesa state
>>>>>> tracker.
>>>>>>
>>>>>> Unless it's mimicking something that real apps do, it's probably
>>>>>> better to fix the microbenchmark to use a more realistic test.
>>>>>
>>>>> If you build piglit, it's in bin/drawoverhead.
>>>>>
>>>>> You're right that this subtest (switching GL_FRAMEBUFFER_SRGB) is
>>>>> rather artificial and fairly unlikely to occur with real apps.
>>>>
>>>> FYI, I'm dropping this series and I don't have it in my repo anymore.
>>>> piglit/drawoverhead will be updated not to test this state change.
>>>>
>>>> Marek
>>>
>>> Great.
>>>
>>> BTW, I'm not sure what's a good state to change in such a
>>> microbenchmark.
>>>
>>> There is, of course, a myriad of states to pick from, but they are not
>>> all the same: performance can vary wildly depending on the choice. I'm
>>> not sure what's a good representative state change in such
>>> circumstances. Perhaps toggling between two texture objects? Or some
>>> sampler state?
>>
>> If you've ever run the microbenchmark, you know there are plenty of
>> state changes tested. I think there are like 15 state changes tested
>> in about 60 subtests at the moment. I'm adding more tests into it.
>> Currently I have 100 subtests in there locally. At the moment the
>> missing subtests are mostly just shader resources: immutable textures
>> (mutable textures, i.e. not TexStorage-based, are already tested), TBOs,
>> images, image buffers, SSBOs (maybe), atomic counters (maybe). The
>> methodology is 1 state change followed by 1 draw call in a loop,
>> measuring the number of draw calls per second for that case, and
>> comparing with the baseline draw rate (which is without the state
>> change).
>>
>> Marek
>
> I just ran it. Pretty neat! I didn't know we were adding benchmarks to
> piglit.

That's because piglit has a very convenient window system integration
framework that I refuse to re-invent elsewhere.

Marek
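
The methodology quoted in the thread (one state change followed by one draw call in a loop, with the resulting draw rate compared against a baseline that has no state change) can be sketched roughly as below. This is not the piglit drawoverhead code. It is a minimal Python illustration in which `measure_rate`, `relative_rate`, `fake_draw` and `fake_state_change` are hypothetical names, and the "draw" and "state change" are plain function calls standing in for GL calls.

```python
import time

# Hypothetical stand-ins for a GL draw call and a GL state change.
# They only count invocations so the loop has something to call.
counters = {"draws": 0, "changes": 0}


def fake_draw():
    counters["draws"] += 1


def fake_state_change():
    counters["changes"] += 1


def measure_rate(draw, state_change=None, duration=0.1):
    """Return draw calls per second for a loop of [state change +] draw."""
    calls = 0
    start = time.perf_counter()
    deadline = start + duration
    while time.perf_counter() < deadline:
        # Issue draws in batches of 100 so the clock is read less often.
        for _ in range(100):
            if state_change is not None:
                state_change()
            draw()
        calls += 100
    elapsed = time.perf_counter() - start
    return calls / elapsed


def relative_rate(draw, state_change, duration=0.1):
    """Return the draw rate with the state change as a fraction of baseline."""
    baseline = measure_rate(draw, None, duration)
    with_change = measure_rate(draw, state_change, duration)
    return with_change / baseline


if __name__ == "__main__":
    ratio = relative_rate(fake_draw, fake_state_change)
    print(f"draw rate with state change: {ratio:.0%} of baseline")
```

With real GL calls, a ratio well below 1 for a given state change would indicate that the change is expensive relative to a bare draw, which is the kind of comparison the thread describes for each subtest.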