Hi Samuel,

On Tue, Mar 31, 2015 at 5:56 PM, Samuel Pitoiset <samuel.pitoi...@gmail.com> wrote:
> Hello Robert,
>
> Sorry for the delay, I just saw your message few days ago, and I probably
> removed the mail by mistake too...

And then I was on holiday; so more delay :-)

> I have never heard about your work on this area, happy to know right now. :)
>
> Well, regarding the backend stuff, I would prefer to keep the same for both
> GL_AMD_performance_monitor and INTEL_performance_query.

My experience with the Intel backend, where I initially aimed to update both
extensions behind one backend, is that it was quite a hindrance, and there
wasn't a clear benefit to it when there isn't really any substantial code to
speak of in the core infrastructure to share between the extensions.

We should be careful not to talk at cross purposes here though. In my mind,
having orthogonal frontends and even different backend interfaces wouldn't
preclude a driver implementing both extensions with a unified backend if
desirable; there would just be two separate sets of entry points for the
frontend to interact with that unified backend.

Some of the issues I came across were:

The current design expects a common description of counters and their types,
but the current implementation doesn't fully support INTEL_performance_query
semantic types. Fixing this is awkward because neither extension has
data/semantic types that are a strict subset of the other, so to support all
the types I imagine we'd also need to introduce some mechanism for
black/white listing counters for each extension if we want to keep a common
description. Then, if we wanted to utilize the full range of types for both
extensions, I have a feeling a lot of the counters would end up being
exclusively declared for one extension or the other, which would negate some
of the benefit of having a common structure.

The current infrastructure seems somewhat biased towards implementing
AMD_performance_monitor, with the concept of groups and counter selection
which doesn't exist in the INTEL_performance_query extension. That seems
unfortunate when the selection mechanism looks to make the allocation and
tracking of query objects more costly/complex if we don't need it for the
INTEL_performance_query extension.

There's no substantial utility code associated with the core infrastructure
that the backends benefit from to help justify sharing a backend for multiple
extensions. The core support just does simple frontend validation of user
arguments to normalize things and handle GL errors consistently before
interacting with the backend, so in practice the INTEL_performance_query and
AMD_performance_monitor code is rather orthogonal.

I think the only things that connect the two extensions currently are the
shared declaration of counters and a tiny amount of utility code for
allocating/freeing monitor objects. Given the issue with the counter types, I
found things became simpler if the counter descriptions were instead moved
into the backend. Given that INTEL_performance_query doesn't need any active
group/counter state per object, the common object allocator also isn't ideal.
So making both of these changes (which seem to make sense even without the
goal of separating the extensions) is enough to make the frontends completely
orthogonal.

I also really like that, with the counter declarations in the backend, it's
free to use whatever data structures are appropriate for the various
counters. Rather than using statically declared arrays describing our
counters, I needed to update our backend to programmatically build up the
lists of available counters, and the counter descriptions also necessarily
became more detailed, so it was nice that this work could be self-contained
in the backend and we can describe our Observation Architecture counters
differently from our pipeline statistics counters.

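To make the idea of backend-owned counter descriptions a bit more concrete,
here is a minimal sketch in C. It is not Mesa code: every struct, function and
counter name below is hypothetical, and the sizes are made up. It only shows
a backend building its counter list at init time, with two independent
frontend-style entry points reading from that one list.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* All names below are hypothetical; none of them are Mesa APIs. */

enum counter_source {
    SOURCE_PIPELINE_STATS,
    SOURCE_OA
};

/* Backend-private description; the backend is free to choose its layout. */
struct backend_counter {
    const char *name;            /* made-up example name */
    enum counter_source source;
    size_t offset;               /* byte offset of the value in a result buffer */
    size_t size;                 /* size of the value in bytes */
};

#define MAX_COUNTERS 16

struct backend {
    struct backend_counter counters[MAX_COUNTERS];
    int n_counters;
    size_t data_size;            /* total bytes needed for one result */
};

/* Append one counter and assign it the next free offset. */
static void
add_counter(struct backend *b, const char *name,
            enum counter_source source, size_t size)
{
    struct backend_counter *c;

    assert(b->n_counters < MAX_COUNTERS);
    c = &b->counters[b->n_counters++];
    c->name = name;
    c->source = source;
    c->offset = b->data_size;
    c->size = size;
    b->data_size += size;
}

/* Build the list programmatically instead of from a static array. */
static void
backend_init(struct backend *b, int has_oa)
{
    memset(b, 0, sizeof(*b));
    add_counter(b, "example_vs_invocations", SOURCE_PIPELINE_STATS, 8);
    add_counter(b, "example_ps_invocations", SOURCE_PIPELINE_STATS, 8);
    if (has_oa)
        add_counter(b, "example_gpu_busy", SOURCE_OA, 4);
}

/* Entry point a query-style frontend might use: a flat counter list. */
static int
perf_query_num_counters(const struct backend *b)
{
    return b->n_counters;
}

/* Entry point a group-based frontend might use: counters per group. */
static int
perfmon_num_counters_in_group(const struct backend *b,
                              enum counter_source group)
{
    int i, n = 0;

    for (i = 0; i < b->n_counters; i++)
        if (b->counters[i].source == group)
            n++;
    return n;
}
```

Neither entry point needs to know about the other; each only reads the
backend's own list, which is the sense in which the frontends stay orthogonal
even if a driver chose to serve both from one backend.
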
My thinking at the moment is that if the current AMD_perfmon backend
architecture seems to be ok for your needs, then it could be for the best
that the extensions can be easily made orthogonal so we can develop support
for both extensions without stepping on each other's toes. Later, if it's
desirable to support both extensions in any driver, we can always evaluate
what opportunities there are to have a common backend interface if that could
simplify things.

> Currently, I'm trying to implement GL_AMD_perfmon as a state tracker which
> is based on the query interface of Gallium and this looks quite good. Only
> minor changes in the current interface are required to do that.
>
> At this time, most of hardware performance counters are *only* exposed
> through the Gallium HUD and I think it's not very helpful for a large number
> of applications.
> I'm pretty sure that GL_AMD_perfmon will be very useful for exposing GPU
> counters and this is also a requirement for a GSoC project this year.
>
> So, with respect to your work, my question is : why do you want to get rid
> of AMD_perfmon in favour of INTEL_perf_query ?

From my pov, the priority is to at least have one extension that works fully
and can expose our Observation Architecture counters. Currently neither of
our backends is usable in practice, so we aren't exactly getting rid of
AMD_perfmon in favour of INTEL_perf_query because neither extension really
works for us yet.

A difficulty for us has been that we've only relatively recently learned how
to configure our Gen graphics Observation Architecture performance counters,
and the way our supporting kernel interface works makes quite a big
difference to how our backend needs to work, which wasn't possible to
consider for the first implementation.

So to start with it's a question of picking one extension to focus on, and
the INTEL_performance_query extension is a slightly better match for the
performance counters we can get from Gen graphics. It's also slightly simpler
and can express a bit more with its data/semantic types.

I didn't start out with the plan of dropping our AMD_perfmon backend, but as
I hit issues and looked to evolve the INTEL_perf_query support I started to
see more and more that the current design was quite an impediment, but also
saw there was very little really connecting the two extensions. So from a
practical point of view it was just simpler to draw a line between the two
extensions and only have one extension to worry about.

> Don't you think that the AMD extension is also useful as the INTEL one?

I suppose here, usefulness is mainly dependent on what tooling we can enable
with these extensions.

In terms of the data exposed for tools, both extensions would expose more or
less the same data, so exposing both would only be useful for tools that
support just one extension or the other. The INTEL extension has some more
data/semantic types, so maybe it has the edge in terms of what tools will
want, but there's not really much in it.

I've been experimenting with a tool called gputop
(https://github.com/rib/gputop) based on INTEL_performance_query as a way to
test my work, and Mark Janes has also been experimenting with a UI for fips
based on INTEL_performance_query, so we at least have some toys to start with
based on INTEL_performance_query.

Based on developing gputop, I see that neither extension is perfect really,
as we have more metadata about our counters than can be expressed by either
extension. For example:

I'd like to be able to report a stable uuid or unique name for counters that
tools can trust won't ever change, so tools can be made to understand the
semantics of specific counters to help implement things like automatic
bottleneck analysis.

Currently we can only report a short + long name for counters, which we want
to be human readable, but we might want to change them to improve
readability. I'm hoping to compromise here and guarantee that our short names
will be a stable part of the api for tools, but it's not guaranteed by either
extension.

We don't have a well-specified way to report maximum throughputs, e.g. for
bandwidth values, because the INTEL spec technically only expects drivers to
report maximum values for 'raw' counters.

For some of our counters (e.g. sampler bottleneck) we have information about
what threshold should really be highlighted as 'bad' to users, which tools
would benefit from, but neither extension gives us a way to report this.

Ok, I hope this helps explain some of what I've found while working on this.

Depending on any further feedback here, I'm currently thinking I'll rebase my
series soon, dropping the patch that removed all AMD_performance_monitor
support, and instead I'll just have a patch removing the Intel backend.
Hopefully I can send out an RFC series relatively soon, cleaned up a bit
more, updated against my latest drm perf interface and with support for some
of our more interesting counters.

Regards,
- Robert

>
> Best regards,
> Samuel Pitoiset.
>
>
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev