GPU lockup dumping
On 23.05.2012 19:02, Jerome Glisse wrote:
> Well, I think we have several kinds of lockup. The most basic one is
> userspace sending a broken shader, vertex, or something along those
> lines. The more complex one is timing related, like a BO move or some
> cache invalidation that didn't happen properly, so the GPU ends up
> reading either wrong data or old cached data. I don't see how to
> capture useful information for this second case, besides taking a
> snapshot of memory.
>
> For multi-ring I agree that dumping the rings might prove useful to
> spot inter-ring semaphore deadlocks, or possibly an inter-ring absence
> of synchronization (but that would be a bad kernel bug).

I don't think that we need the actual data from the rings either (at
least as long as we keep the radeon_ring_* debugfs files). But it would
still be nice to know whether or not there was a sync between the rings.

See the patches I just sent you (sorry, I actually sent more patches
than I wanted to): storing the new sync_seq array within the debug
output should enable us to actually figure out the dependencies and
order between different IBs.

Cheers,
Christian.
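As a rough illustration of how a sync_seq array in the dump could answer the "was there a sync between the rings" question, here is a minimal C sketch. It is not taken from the patches: the names (ring_sync_state, ring_synced_to, NUM_RINGS) are hypothetical, and it assumes that sync_seq[r][o] holds the highest fence sequence number of ring o that ring r has waited for.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_RINGS 3 /* hypothetical ring count, chosen for the example */

/*
 * Assumed semantics (not confirmed by the thread): sync_seq[r][o] is the
 * highest fence sequence number of ring o that ring r has waited for.
 */
struct ring_sync_state {
    uint64_t sync_seq[NUM_RINGS][NUM_RINGS];
};

/*
 * Returns true if ring `waiter` is known to have synchronized with ring
 * `signaler` at least up to fence `seq`.
 */
static bool ring_synced_to(const struct ring_sync_state *s,
                           unsigned waiter, unsigned signaler, uint64_t seq)
{
    if (waiter >= NUM_RINGS || signaler >= NUM_RINGS)
        return false; /* unknown ring: nothing can be concluded */
    if (waiter == signaler)
        return true;  /* sketch assumes one ring executes in order */
    return s->sync_seq[waiter][signaler] >= seq;
}
```

A dump analysis tool could call ring_synced_to() for every IB that reads a buffer written on another ring; a false result would point at the kind of missing sync described above.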
GPU lockup dumping
On Wed, May 23, 2012 at 5:26 PM, Jerome Glisse wrote:
> The rings will contain very little, like IB scheduling and fences; I
> don't see how useful this can be.

In case we have a bug in our IB scheduling or fencing :-0

Dave.
GPU lockup dumping
On Wed, May 23, 2012 at 3:48 PM, Jerome Glisse wrote:
> Note that my format is also done with that in mind; it can capture IBs
> from all rings. The only thing I don't think is worth capturing are
> the rings themselves, because there would be no way to replay them
> without adding some new special API.

I'd like to dump the rings as well. As I said, I'd rather we didn't
limit this to replay, but make it useful for getting as much info as
possible out.

Dave.
GPU lockup dumping
On 23.05.2012 11:27, Dave Airlie wrote:
> On Thu, May 17, 2012 at 7:28 PM, wrote:
>> So here is the improved patchset, where I split the groundwork
>> necessary for the dumping into its own patches. The debugfs
>> improvement could probably be useful to Intel, instead of having
>> i915 keep its own debugfs file code.
>>
>> The lockup dumping public API has been moved into radeon_drm.h.
>>
>> Stressing the fact again that dumps are self-contained, i.e. they
>> have all the data needed to be replayed (vertex, indices, shader,
>> texture, ...).
>>
>> Would really like to get this into 3.5; the new API is pretty much
>> straightforward and userspace tools can easily be made to convert
>> it to other formats. The change to the driver is self-contained.
>
> I really don't like introducing this at this stage into 3.5.
>
> I'd really like a good review of the API and what information we
> provide, along with how extensible it is.
>
> I'm still not convinced replay is what we want in the field. I know
> it's what *you* want, but I think the apitrace stuff in userspace
> pretty much covers the replaying situation. So I'd have to look at
> this and see how easy it makes dissecting command streams etc.
>
> Dave.

I agree that it might not be a good idea to push that into 3.5, since at
least I (and I also think Alex) didn't have time to look into it yet. On
the other hand the patches look quite reasonable.

But I still wanted to throw in a requirement from my day-to-day work,
maybe that helps finding a more general solution: when we start to work
with more parts of the chip it might be necessary to dump everything
that is currently "in flight". For example, I had a whole bunch of
problems where copying data around with a 3D blit and then missing a
sync between this job and a job on another ring causes a "hiccup" in the
hardware.

I know that this isn't your focus and that is absolutely OK with me,
because the format you are introducing is just used in debugfs and so
not part of any stable API (at least not in my understanding), but you
should still keep in mind that we might need to extend it in that
direction in the future.

Christian.
GPU lockup dumping
On 05/23/2012 08:51 AM, Jerome Glisse wrote:
> It stores the PCI ID and allows dumping all IBs per ring, along with
> all associated BO objects. It also has a bunch of flags to help the
> userspace tools (like whether userspace needs to clear offsets (VM
> vs. no VM)) ... What more do you want to know?

Another useful thing might be current register states. We've been doing
dumping (we call it snapshot) in the Qualcomm driver for a little bit
now, and between registers, rings, command buffers and various buffers
we've been able to get a reasonably good picture of the state, suitable
for playback on emulators and other silly userspace tricks. We have
structs for registers and index/data register pairs because we also dump
lots of debug registers and queues and other various HW sources.

The implementation is way different for obvious reasons, but I would
love to consolidate on a single format. It's easy for us to do since we
share similar architectures, but if at least two GPUs support the same
format it can be a catalyst for others to join.

https://www.codeaurora.org/gitweb/quic/la/?p=kernel/msm.git;a=blob;f=drivers/gpu/msm/kgsl_snapshot.h;hb=refs/heads/msm-3.0

Jordan
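The difference between plain registers and index/data register pairs can be sketched in C as follows. This is only an illustration of the general idea, not the layout from kgsl_snapshot.h: all names (snap_reg, snap_indexed_regs, snap_read_indexed, fake_hw) are hypothetical, and the fake hardware exists purely so the example runs without a GPU.

```c
#include <stddef.h>
#include <stdint.h>

/* A directly readable register: one offset, one value. */
struct snap_reg {
    uint32_t offset;
    uint32_t value;
};

/* A bank of values reached through an index/data register pair. */
struct snap_indexed_regs {
    uint32_t index_reg; /* register that selects the entry */
    uint32_t data_reg;  /* register the selected entry is read from */
    uint32_t start;     /* first index to dump */
    uint32_t count;     /* number of entries to dump */
};

/* Register accessors are passed in so the sketch can run on fake hardware. */
typedef void (*reg_write_fn)(void *hw, uint32_t reg, uint32_t val);
typedef uint32_t (*reg_read_fn)(void *hw, uint32_t reg);

/* Dump up to out_len entries: write the index, then read the data register. */
static size_t snap_read_indexed(void *hw, reg_write_fn wr, reg_read_fn rd,
                                const struct snap_indexed_regs *d,
                                uint32_t *out, size_t out_len)
{
    size_t n = d->count < out_len ? d->count : out_len;

    for (size_t i = 0; i < n; i++) {
        wr(hw, d->index_reg, d->start + (uint32_t)i);
        out[i] = rd(hw, d->data_reg);
    }
    return n;
}

/* Fake hardware for demonstration: an 8-entry table behind one index. */
struct fake_hw {
    uint32_t index;
    uint32_t table[8];
};

static void fake_write(void *hw, uint32_t reg, uint32_t val)
{
    (void)reg; /* the fake device has a single index register */
    ((struct fake_hw *)hw)->index = val;
}

static uint32_t fake_read(void *hw, uint32_t reg)
{
    struct fake_hw *f = hw;

    (void)reg; /* the fake device has a single data register */
    return f->table[f->index % 8];
}
```

Passing the accessors as function pointers keeps the dumping logic independent of any particular driver, which is the kind of separation a shared format would need.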
GPU lockup dumping
On Wed, May 23, 2012 at 12:41 PM, Dave Airlie wrote:
> On Wed, May 23, 2012 at 5:26 PM, Jerome Glisse wrote:
>> The rings will contain very little, like IB scheduling and fences; I
>> don't see how useful this can be.
>
> In case we have a bug in our IB scheduling or fencing :-0
>
> Dave.

Well, I think we have several kinds of lockup. The most basic one is
userspace sending a broken shader, vertex, or something along those
lines. The more complex one is timing related, like a BO move or some
cache invalidation that didn't happen properly, so the GPU ends up
reading either wrong data or old cached data. I don't see how to capture
useful information for this second case, besides taking a snapshot of
memory.

For multi-ring I agree that dumping the rings might prove useful to spot
inter-ring semaphore deadlocks, or possibly an inter-ring absence of
synchronization (but that would be a bad kernel bug).

Cheers,
Jerome
GPU lockup dumping
On Wed, May 23, 2012 at 12:08 PM, Dave Airlie wrote:
> On Wed, May 23, 2012 at 3:48 PM, Jerome Glisse wrote:
>> Note that my format is also done with that in mind; it can capture
>> IBs from all rings. The only thing I don't think is worth capturing
>> are the rings themselves, because there would be no way to replay
>> them without adding some new special API.
>
> I'd like to dump the rings as well. As I said, I'd rather we didn't
> limit this to replay, but make it useful for getting as much info as
> possible out.
>
> Dave.

The rings will contain very little, like IB scheduling and fences; I
don't see how useful this can be.

Cheers,
Jerome
GPU lockup dumping
On Wed, May 23, 2012 at 12:04 PM, Alex Deucher wrote:
> I'm ok with it as long as we have a path to implement support for the
> internal dump format so I can have the hw guys play them back on the
> simulators and such.
>
> Alex
GPU lockup dumping
On Wed, May 23, 2012 at 8:34 AM, Christian König wrote:
> I agree that it might not be a good idea to push that into 3.5, since
> at least I (and I also think Alex) didn't have time to look into it
> yet. On the other hand the patches look quite reasonable.
>
> But I still wanted to throw in a requirement from my day-to-day work,
> maybe that helps finding a more general solution: when we start to
> work with more parts of the chip it might be necessary to dump
> everything that is currently "in flight". For example, I had a whole
> bunch of problems where copying data around with a 3D blit and then
> missing a sync between this job and a job on another ring causes a
> "hiccup" in the hardware.
>
> I know that this isn't your focus and that is absolutely OK with me,
> because the format you are introducing is just used in debugfs and so
> not part of any stable API (at least not in my understanding), but
> you should still keep in mind that we might need to extend it in that
> direction in the future.

I'm ok with it as long as we have a path to implement support for the
internal dump format so I can have the hw guys play them back on the
simulators and such.

Alex
GPU lockup dumping
On Wed, May 23, 2012 at 5:27 AM, Dave Airlie wrote:
> On Thu, May 17, 2012 at 7:28 PM, wrote:
>> So here is the improved patchset, where I split the groundwork
>> necessary for the dumping into its own patches. The debugfs
>> improvement could probably be useful to Intel, instead of having
>> i915 keep its own debugfs file code.
>>
>> The lockup dumping public API has been moved into radeon_drm.h.
>>
>> Stressing the fact again that dumps are self-contained, i.e. they
>> have all the data needed to be replayed (vertex, indices, shader,
>> texture, ...).
>>
>> Would really like to get this into 3.5; the new API is pretty much
>> straightforward and userspace tools can easily be made to convert
>> it to other formats. The change to the driver is self-contained.
>
> I really don't like introducing this at this stage into 3.5.
>
> I'd really like a good review of the API and what information we
> provide, along with how extensible it is.
>
> I'm still not convinced replay is what we want in the field. I know
> it's what *you* want, but I think the apitrace stuff in userspace
> pretty much covers the replaying situation. So I'd have to look at
> this and see how easy it makes dissecting command streams etc.
>
> Dave.

It stores the PCI ID and allows dumping all IBs per ring, along with all
associated BO objects. It also has a bunch of flags to help the
userspace tools (like whether userspace needs to clear offsets (VM vs.
no VM)) ... What more do you want to know?

Cheers,
Jerome
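To make the description above more concrete, here is a minimal C sketch of a dump layout along those lines: a header carrying the PCI ID and flags, followed by chunk descriptors for IBs (tagged with their ring) and BOs. It is not the structure from the patchset or from radeon_drm.h. Every name (dump_header, dump_chunk, DUMP_FLAG_CLEAR_OFFSET, dump_count_ibs) is hypothetical, and the flag is only a placeholder for the "clear offset" hint mentioned in the mail.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag: userspace tools should clear offsets before replay. */
#define DUMP_FLAG_CLEAR_OFFSET (1u << 0)

/* Hypothetical chunk types: an indirect buffer or a buffer object. */
enum dump_chunk_type {
    DUMP_CHUNK_IB = 1,
    DUMP_CHUNK_BO = 2,
};

/* Hypothetical dump header: identifies the device and carries tool hints. */
struct dump_header {
    uint32_t pci_id;  /* PCI device ID of the GPU that locked up */
    uint32_t flags;   /* DUMP_FLAG_* hints for userspace tools */
    uint32_t nchunks; /* number of chunk descriptors that follow */
};

/* Hypothetical descriptor for one captured IB or BO. */
struct dump_chunk {
    uint32_t type;       /* enum dump_chunk_type */
    uint32_t ring;       /* ring the IB was scheduled on; unused for BOs */
    uint64_t gpu_offset; /* GPU address of the captured data */
    uint32_t size_dw;    /* payload size in dwords */
};

/* Count the IBs in a dump that were scheduled on the given ring. */
static size_t dump_count_ibs(const struct dump_chunk *chunks, size_t n,
                             uint32_t ring)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (chunks[i].type == DUMP_CHUNK_IB && chunks[i].ring == ring)
            count++;
    return count;
}

/* Report whether the header asks userspace to clear offsets. */
static int dump_must_clear_offsets(const struct dump_header *h)
{
    return (h->flags & DUMP_FLAG_CLEAR_OFFSET) != 0;
}
```

Tagging each IB with its ring is what would let a tool walk "all IBs per ring", and keeping the hints in a flags word leaves room to extend the format later, which is the extensibility concern raised in the review.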
GPU lockup dumping
On Wed, May 23, 2012 at 8:34 AM, Christian König wrote:
> On 23.05.2012 11:27, Dave Airlie wrote:
>> On Thu, May 17, 2012 at 7:28 PM, j.gli...@gmail.com wrote:
>>> So here is the improved patchset, where I split the groundwork necessary for the dumping into its own patches. The debugfs improvement could probably be useful to Intel, instead of i915 having its own debugfs file stuff.
>>>
>>> The lockup dumping public API has been moved into radeon_drm.h.
>>>
>>> Stressing the fact again that dumps are self-contained, i.e. they have all the data needed to be replayed (vertex, indices, shader, texture, ...).
>>>
>>> Would really like to get this into 3.5; the new API is pretty much straightforward and userspace tools can easily be made to convert it to other formats. The change to the driver is self-contained.
>>
>> I really don't like introducing this at this stage into 3.5.
>>
>> I'd really like a good review of the API and what information we provide, along with how extensible it is.
>>
>> I'm still not convinced replay is what we want in the field. I know it's what *you* want, but I think the apitrace stuff in userspace pretty much covers the replaying situation. So I'd have to look at this and see how easy it makes dissecting command streams etc.
>>
>> Dave.
>
> I agree that it might not be a good idea to push that into 3.5, since at least I (and I think Alex as well) haven't had time to look into it yet. On the other hand, the patches look quite reasonable.
>
> But I still wanted to throw in a requirement from my day-to-day work; maybe that helps in finding a more general solution: when we start to work with more parts of the chip, it might be necessary to dump everything that is currently "in flight". For example, I had a whole bunch of problems where copying data around with a 3D blit and then missing a sync between this job and a job on another ring caused a "hiccup" in the hardware.
>
> I know that this isn't your focus, and that is absolutely OK with me, because the format you are introducing is only used in debugfs and so is not part of any stable API (at least not in my understanding), but you should still keep in mind that we might need to extend it in that direction in the future.
>
> Christian.

Note that my format is also designed with that in mind: it can capture IBs from all rings. The only thing I don't think is worth capturing is the rings themselves, because there would be no way to replay them without adding some new special API.

Cheers,
Jerome
GPU lockup dumping
On Thu, May 17, 2012 at 7:28 PM, j.gli...@gmail.com wrote:
> So here is the improved patchset, where I split the groundwork necessary for the dumping into its own patches. The debugfs improvement could probably be useful to Intel, instead of i915 having its own debugfs file stuff.
>
> The lockup dumping public API has been moved into radeon_drm.h.
>
> Stressing the fact again that dumps are self-contained, i.e. they have all the data needed to be replayed (vertex, indices, shader, texture, ...).
>
> Would really like to get this into 3.5; the new API is pretty much straightforward and userspace tools can easily be made to convert it to other formats. The change to the driver is self-contained.

I really don't like introducing this at this stage into 3.5.

I'd really like a good review of the API and what information we provide, along with how extensible it is.

I'm still not convinced replay is what we want in the field. I know it's what *you* want, but I think the apitrace stuff in userspace pretty much covers the replaying situation. So I'd have to look at this and see how easy it makes dissecting command streams etc.

Dave.
Re: GPU lockup dumping
On 23.05.2012 11:27, Dave Airlie wrote:
> On Thu, May 17, 2012 at 7:28 PM, j.gli...@gmail.com wrote:
>> So here is the improved patchset, where I split the groundwork necessary for the dumping into its own patches. The debugfs improvement could probably be useful to Intel, instead of i915 having its own debugfs file stuff.
>>
>> The lockup dumping public API has been moved into radeon_drm.h.
>>
>> Stressing the fact again that dumps are self-contained, i.e. they have all the data needed to be replayed (vertex, indices, shader, texture, ...).
>>
>> Would really like to get this into 3.5; the new API is pretty much straightforward and userspace tools can easily be made to convert it to other formats. The change to the driver is self-contained.
>
> I really don't like introducing this at this stage into 3.5.
>
> I'd really like a good review of the API and what information we provide, along with how extensible it is.
>
> I'm still not convinced replay is what we want in the field. I know it's what *you* want, but I think the apitrace stuff in userspace pretty much covers the replaying situation. So I'd have to look at this and see how easy it makes dissecting command streams etc.
>
> Dave.

I agree that it might not be a good idea to push that into 3.5, since at least I (and I think Alex as well) haven't had time to look into it yet. On the other hand, the patches look quite reasonable.

But I still wanted to throw in a requirement from my day-to-day work; maybe that helps in finding a more general solution: when we start to work with more parts of the chip, it might be necessary to dump everything that is currently "in flight". For example, I had a whole bunch of problems where copying data around with a 3D blit and then missing a sync between this job and a job on another ring caused a "hiccup" in the hardware.

I know that this isn't your focus, and that is absolutely OK with me, because the format you are introducing is only used in debugfs and so is not part of any stable API (at least not in my understanding), but you should still keep in mind that we might need to extend it in that direction in the future.

Christian.
Re: GPU lockup dumping
On Wed, May 23, 2012 at 8:34 AM, Christian König deathsim...@vodafone.de wrote:
> On 23.05.2012 11:27, Dave Airlie wrote:
>> On Thu, May 17, 2012 at 7:28 PM, j.gli...@gmail.com wrote:
>>> So here is the improved patchset, where I split the groundwork necessary for the dumping into its own patches. The debugfs improvement could probably be useful to Intel, instead of i915 having its own debugfs file stuff.
>>>
>>> The lockup dumping public API has been moved into radeon_drm.h.
>>>
>>> Stressing the fact again that dumps are self-contained, i.e. they have all the data needed to be replayed (vertex, indices, shader, texture, ...).
>>>
>>> Would really like to get this into 3.5; the new API is pretty much straightforward and userspace tools can easily be made to convert it to other formats. The change to the driver is self-contained.
>>
>> I really don't like introducing this at this stage into 3.5.
>>
>> I'd really like a good review of the API and what information we provide, along with how extensible it is.
>>
>> I'm still not convinced replay is what we want in the field. I know it's what *you* want, but I think the apitrace stuff in userspace pretty much covers the replaying situation. So I'd have to look at this and see how easy it makes dissecting command streams etc.
>>
>> Dave.
>
> I agree that it might not be a good idea to push that into 3.5, since at least I (and I think Alex as well) haven't had time to look into it yet. On the other hand, the patches look quite reasonable.
>
> But I still wanted to throw in a requirement from my day-to-day work; maybe that helps in finding a more general solution: when we start to work with more parts of the chip, it might be necessary to dump everything that is currently "in flight". For example, I had a whole bunch of problems where copying data around with a 3D blit and then missing a sync between this job and a job on another ring caused a "hiccup" in the hardware.
>
> I know that this isn't your focus, and that is absolutely OK with me, because the format you are introducing is only used in debugfs and so is not part of any stable API (at least not in my understanding), but you should still keep in mind that we might need to extend it in that direction in the future.

I'm OK with it as long as we have a path to implement support for the internal dump format, so I can have the hw guys play them back on the simulators and such.

Alex

> Christian.
Re: GPU lockup dumping
On Wed, May 23, 2012 at 3:48 PM, Jerome Glisse j.gli...@gmail.com wrote:
> On Wed, May 23, 2012 at 8:34 AM, Christian König deathsim...@vodafone.de wrote:
>> On 23.05.2012 11:27, Dave Airlie wrote:
>>> On Thu, May 17, 2012 at 7:28 PM, j.gli...@gmail.com wrote:
>>>> So here is the improved patchset, where I split the groundwork necessary for the dumping into its own patches. The debugfs improvement could probably be useful to Intel, instead of i915 having its own debugfs file stuff.
>>>>
>>>> The lockup dumping public API has been moved into radeon_drm.h.
>>>>
>>>> Stressing the fact again that dumps are self-contained, i.e. they have all the data needed to be replayed (vertex, indices, shader, texture, ...).
>>>>
>>>> Would really like to get this into 3.5; the new API is pretty much straightforward and userspace tools can easily be made to convert it to other formats. The change to the driver is self-contained.
>>>
>>> I really don't like introducing this at this stage into 3.5.
>>>
>>> I'd really like a good review of the API and what information we provide, along with how extensible it is.
>>>
>>> I'm still not convinced replay is what we want in the field. I know it's what *you* want, but I think the apitrace stuff in userspace pretty much covers the replaying situation. So I'd have to look at this and see how easy it makes dissecting command streams etc.
>>>
>>> Dave.
>>
>> I agree that it might not be a good idea to push that into 3.5, since at least I (and I think Alex as well) haven't had time to look into it yet. On the other hand, the patches look quite reasonable.
>>
>> But I still wanted to throw in a requirement from my day-to-day work; maybe that helps in finding a more general solution: when we start to work with more parts of the chip, it might be necessary to dump everything that is currently "in flight". For example, I had a whole bunch of problems where copying data around with a 3D blit and then missing a sync between this job and a job on another ring caused a "hiccup" in the hardware.
>>
>> I know that this isn't your focus, and that is absolutely OK with me, because the format you are introducing is only used in debugfs and so is not part of any stable API (at least not in my understanding), but you should still keep in mind that we might need to extend it in that direction in the future.
>>
>> Christian.
>
> Note that my format is also designed with that in mind: it can capture IBs from all rings. The only thing I don't think is worth capturing is the rings themselves, because there would be no way to replay them without adding some new special API.

I'd like to dump the rings as well. As I said, I'd rather we didn't limit this to replay, but made it useful for getting as much info as possible out.

Dave.
Re: GPU lockup dumping
On Wed, May 23, 2012 at 12:04 PM, Alex Deucher alexdeuc...@gmail.com wrote:
> On Wed, May 23, 2012 at 8:34 AM, Christian König deathsim...@vodafone.de wrote:
>> On 23.05.2012 11:27, Dave Airlie wrote:
>>> On Thu, May 17, 2012 at 7:28 PM, j.gli...@gmail.com wrote:
>>>> So here is the improved patchset, where I split the groundwork necessary for the dumping into its own patches. The debugfs improvement could probably be useful to Intel, instead of i915 having its own debugfs file stuff.
>>>>
>>>> The lockup dumping public API has been moved into radeon_drm.h.
>>>>
>>>> Stressing the fact again that dumps are self-contained, i.e. they have all the data needed to be replayed (vertex, indices, shader, texture, ...).
>>>>
>>>> Would really like to get this into 3.5; the new API is pretty much straightforward and userspace tools can easily be made to convert it to other formats. The change to the driver is self-contained.
>>>
>>> I really don't like introducing this at this stage into 3.5.
>>>
>>> I'd really like a good review of the API and what information we provide, along with how extensible it is.
>>>
>>> I'm still not convinced replay is what we want in the field. I know it's what *you* want, but I think the apitrace stuff in userspace pretty much covers the replaying situation. So I'd have to look at this and see how easy it makes dissecting command streams etc.
>>>
>>> Dave.
>>
>> I agree that it might not be a good idea to push that into 3.5, since at least I (and I think Alex as well) haven't had time to look into it yet. On the other hand, the patches look quite reasonable.
>>
>> But I still wanted to throw in a requirement from my day-to-day work; maybe that helps in finding a more general solution: when we start to work with more parts of the chip, it might be necessary to dump everything that is currently "in flight". For example, I had a whole bunch of problems where copying data around with a 3D blit and then missing a sync between this job and a job on another ring caused a "hiccup" in the hardware.
>>
>> I know that this isn't your focus, and that is absolutely OK with me, because the format you are introducing is only used in debugfs and so is not part of any stable API (at least not in my understanding), but you should still keep in mind that we might need to extend it in that direction in the future.
>
> I'm OK with it as long as we have a path to implement support for the internal dump format, so I can have the hw guys play them back on the simulators and such.
>
> Alex

From what I see of the internal dump format, it's way more complex and I don't think the kernel is the right place for it. For instance, you need to set the primary surface thing, and I don't want to parse the command stream in the kernel to figure that out. I would rather have a userspace tool that postprocesses things and converts them to the proper AMD dump format.

Cheers,
Jerome
Re: GPU lockup dumping
On Wed, May 23, 2012 at 12:08 PM, Dave Airlie airl...@gmail.com wrote:
> On Wed, May 23, 2012 at 3:48 PM, Jerome Glisse j.gli...@gmail.com wrote:
>> On Wed, May 23, 2012 at 8:34 AM, Christian König deathsim...@vodafone.de wrote:
>>> On 23.05.2012 11:27, Dave Airlie wrote:
>>>> On Thu, May 17, 2012 at 7:28 PM, j.gli...@gmail.com wrote:
>>>>> So here is the improved patchset, where I split the groundwork necessary for the dumping into its own patches. The debugfs improvement could probably be useful to Intel, instead of i915 having its own debugfs file stuff.
>>>>>
>>>>> The lockup dumping public API has been moved into radeon_drm.h.
>>>>>
>>>>> Stressing the fact again that dumps are self-contained, i.e. they have all the data needed to be replayed (vertex, indices, shader, texture, ...).
>>>>>
>>>>> Would really like to get this into 3.5; the new API is pretty much straightforward and userspace tools can easily be made to convert it to other formats. The change to the driver is self-contained.
>>>>
>>>> I really don't like introducing this at this stage into 3.5.
>>>>
>>>> I'd really like a good review of the API and what information we provide, along with how extensible it is.
>>>>
>>>> I'm still not convinced replay is what we want in the field. I know it's what *you* want, but I think the apitrace stuff in userspace pretty much covers the replaying situation. So I'd have to look at this and see how easy it makes dissecting command streams etc.
>>>>
>>>> Dave.
>>>
>>> I agree that it might not be a good idea to push that into 3.5, since at least I (and I think Alex as well) haven't had time to look into it yet. On the other hand, the patches look quite reasonable.
>>>
>>> But I still wanted to throw in a requirement from my day-to-day work; maybe that helps in finding a more general solution: when we start to work with more parts of the chip, it might be necessary to dump everything that is currently "in flight". For example, I had a whole bunch of problems where copying data around with a 3D blit and then missing a sync between this job and a job on another ring caused a "hiccup" in the hardware.
>>>
>>> I know that this isn't your focus, and that is absolutely OK with me, because the format you are introducing is only used in debugfs and so is not part of any stable API (at least not in my understanding), but you should still keep in mind that we might need to extend it in that direction in the future.
>>>
>>> Christian.
>>
>> Note that my format is also designed with that in mind: it can capture IBs from all rings. The only thing I don't think is worth capturing is the rings themselves, because there would be no way to replay them without adding some new special API.
>
> I'd like to dump the rings as well. As I said, I'd rather we didn't limit this to replay, but made it useful for getting as much info as possible out.
>
> Dave.

The rings will contain very little, like IB scheduling and fences; I don't see how useful this can be.

Cheers,
Jerome
Re: GPU lockup dumping
On Wed, May 23, 2012 at 5:26 PM, Jerome Glisse j.gli...@gmail.com wrote:
> On Wed, May 23, 2012 at 12:08 PM, Dave Airlie airl...@gmail.com wrote:
>> On Wed, May 23, 2012 at 3:48 PM, Jerome Glisse j.gli...@gmail.com wrote:
>>> On Wed, May 23, 2012 at 8:34 AM, Christian König deathsim...@vodafone.de wrote:
>>>> On 23.05.2012 11:27, Dave Airlie wrote:
>>>>> On Thu, May 17, 2012 at 7:28 PM, j.gli...@gmail.com wrote:
>>>>>> So here is the improved patchset, where I split the groundwork necessary for the dumping into its own patches. The debugfs improvement could probably be useful to Intel, instead of i915 having its own debugfs file stuff.
>>>>>>
>>>>>> The lockup dumping public API has been moved into radeon_drm.h.
>>>>>>
>>>>>> Stressing the fact again that dumps are self-contained, i.e. they have all the data needed to be replayed (vertex, indices, shader, texture, ...).
>>>>>>
>>>>>> Would really like to get this into 3.5; the new API is pretty much straightforward and userspace tools can easily be made to convert it to other formats. The change to the driver is self-contained.
>>>>>
>>>>> I really don't like introducing this at this stage into 3.5.
>>>>>
>>>>> I'd really like a good review of the API and what information we provide, along with how extensible it is.
>>>>>
>>>>> I'm still not convinced replay is what we want in the field. I know it's what *you* want, but I think the apitrace stuff in userspace pretty much covers the replaying situation. So I'd have to look at this and see how easy it makes dissecting command streams etc.
>>>>>
>>>>> Dave.
>>>>
>>>> I agree that it might not be a good idea to push that into 3.5, since at least I (and I think Alex as well) haven't had time to look into it yet. On the other hand, the patches look quite reasonable.
>>>>
>>>> But I still wanted to throw in a requirement from my day-to-day work; maybe that helps in finding a more general solution: when we start to work with more parts of the chip, it might be necessary to dump everything that is currently "in flight". For example, I had a whole bunch of problems where copying data around with a 3D blit and then missing a sync between this job and a job on another ring caused a "hiccup" in the hardware.
>>>>
>>>> I know that this isn't your focus, and that is absolutely OK with me, because the format you are introducing is only used in debugfs and so is not part of any stable API (at least not in my understanding), but you should still keep in mind that we might need to extend it in that direction in the future.
>>>>
>>>> Christian.
>>>
>>> Note that my format is also designed with that in mind: it can capture IBs from all rings. The only thing I don't think is worth capturing is the rings themselves, because there would be no way to replay them without adding some new special API.
>>
>> I'd like to dump the rings as well. As I said, I'd rather we didn't limit this to replay, but made it useful for getting as much info as possible out.
>>
>> Dave.
>
> The rings will contain very little, like IB scheduling and fences; I don't see how useful this can be.

In case we have a bug in our IB scheduling or fencing :-0

Dave.
Re: GPU lockup dumping
On Wed, May 23, 2012 at 12:41 PM, Dave Airlie airl...@gmail.com wrote:
> On Wed, May 23, 2012 at 5:26 PM, Jerome Glisse j.gli...@gmail.com wrote:
>> On Wed, May 23, 2012 at 12:08 PM, Dave Airlie airl...@gmail.com wrote:
>>> On Wed, May 23, 2012 at 3:48 PM, Jerome Glisse j.gli...@gmail.com wrote:
>>>> On Wed, May 23, 2012 at 8:34 AM, Christian König deathsim...@vodafone.de wrote:
>>>>> On 23.05.2012 11:27, Dave Airlie wrote:
>>>>>> On Thu, May 17, 2012 at 7:28 PM, j.gli...@gmail.com wrote:
>>>>>>> So here is the improved patchset, where I split the groundwork necessary for the dumping into its own patches. The debugfs improvement could probably be useful to Intel, instead of i915 having its own debugfs file stuff.
>>>>>>>
>>>>>>> The lockup dumping public API has been moved into radeon_drm.h.
>>>>>>>
>>>>>>> Stressing the fact again that dumps are self-contained, i.e. they have all the data needed to be replayed (vertex, indices, shader, texture, ...).
>>>>>>>
>>>>>>> Would really like to get this into 3.5; the new API is pretty much straightforward and userspace tools can easily be made to convert it to other formats. The change to the driver is self-contained.
>>>>>>
>>>>>> I really don't like introducing this at this stage into 3.5.
>>>>>>
>>>>>> I'd really like a good review of the API and what information we provide, along with how extensible it is.
>>>>>>
>>>>>> I'm still not convinced replay is what we want in the field. I know it's what *you* want, but I think the apitrace stuff in userspace pretty much covers the replaying situation. So I'd have to look at this and see how easy it makes dissecting command streams etc.
>>>>>>
>>>>>> Dave.
>>>>>
>>>>> I agree that it might not be a good idea to push that into 3.5, since at least I (and I think Alex as well) haven't had time to look into it yet. On the other hand, the patches look quite reasonable.
>>>>>
>>>>> But I still wanted to throw in a requirement from my day-to-day work; maybe that helps in finding a more general solution: when we start to work with more parts of the chip, it might be necessary to dump everything that is currently "in flight". For example, I had a whole bunch of problems where copying data around with a 3D blit and then missing a sync between this job and a job on another ring caused a "hiccup" in the hardware.
>>>>>
>>>>> I know that this isn't your focus, and that is absolutely OK with me, because the format you are introducing is only used in debugfs and so is not part of any stable API (at least not in my understanding), but you should still keep in mind that we might need to extend it in that direction in the future.
>>>>>
>>>>> Christian.
>>>>
>>>> Note that my format is also designed with that in mind: it can capture IBs from all rings. The only thing I don't think is worth capturing is the rings themselves, because there would be no way to replay them without adding some new special API.
>>>
>>> I'd like to dump the rings as well. As I said, I'd rather we didn't limit this to replay, but made it useful for getting as much info as possible out.
>>>
>>> Dave.
>>
>> The rings will contain very little, like IB scheduling and fences; I don't see how useful this can be.
>
> In case we have a bug in our IB scheduling or fencing :-0
>
> Dave.

Well, I think we have several kinds of lockup. The most basic one is userspace sending a broken shader, vertex data, or something along those lines. The more complex one is timing-related, like a BO move or some cache invalidation that didn't happen properly, so the GPU ends up reading either wrong data or old cached data. I don't see how to capture useful information for this second case, besides taking a snapshot of memory.

For multi-ring, I agree that dumping the rings might prove useful to spot an inter-ring semaphore deadlock, or possibly an inter-ring absence of synchronization (but that would be a bad kernel bug).

Cheers,
Jerome
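To make the inter-ring semaphore deadlock case concrete, here is a minimal sketch of the kind of check a userspace tool could run if ring contents were included in a dump. It assumes the tool has already reduced each ring to "which other ring's semaphore am I blocked on"; that reduction, the `waits_on` mapping and the ring names are all hypothetical and not something the patches provide.

```python
from typing import Dict, List, Optional


def find_wait_cycle(waits_on: Dict[str, Optional[str]]) -> Optional[List[str]]:
    """Return the rings forming a wait cycle, or None if there is none.

    waits_on maps a ring name to the ring whose semaphore signal it is
    blocked on, or to None if the ring is not waiting on anything.
    """
    for start in waits_on:
        seen: List[str] = []
        ring: Optional[str] = start
        # Follow the chain of waits until it ends or revisits a ring.
        while ring is not None and ring not in seen:
            seen.append(ring)
            ring = waits_on.get(ring)
        if ring is not None:
            # We came back to a ring already on the path: that is the cycle.
            return seen[seen.index(ring):]
    return None
```

A cycle such as "gfx waits on dma, dma waits on gfx" would be reported as a deadlock candidate, while a ring that merely waits on an idle ring would not.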
Re: GPU lockup dumping
On 05/23/2012 08:51 AM, Jerome Glisse wrote:
> On Wed, May 23, 2012 at 5:27 AM, Dave Airlie airl...@gmail.com wrote:
>> On Thu, May 17, 2012 at 7:28 PM, j.gli...@gmail.com wrote:
>>> So here is the improved patchset, where I split the groundwork necessary for the dumping into its own patches. The debugfs improvement could probably be useful to Intel, instead of i915 having its own debugfs file stuff.
>>>
>>> The lockup dumping public API has been moved into radeon_drm.h.
>>>
>>> Stressing the fact again that dumps are self-contained, i.e. they have all the data needed to be replayed (vertex, indices, shader, texture, ...).
>>>
>>> Would really like to get this into 3.5; the new API is pretty much straightforward and userspace tools can easily be made to convert it to other formats. The change to the driver is self-contained.
>>
>> I really don't like introducing this at this stage into 3.5.
>>
>> I'd really like a good review of the API and what information we provide, along with how extensible it is.
>>
>> I'm still not convinced replay is what we want in the field. I know it's what *you* want, but I think the apitrace stuff in userspace pretty much covers the replaying situation. So I'd have to look at this and see how easy it makes dissecting command streams etc.
>>
>> Dave.
>
> It stores the PCI ID and allows dumping all IBs per ring, along with all associated BOs. It also has a bunch of flags to help the userspace tools (like whether userspace needs to clear offsets (VM vs. no VM), ...). What more do you want to know?

Another useful thing might be the current register state. We've been doing dumping (we call it snapshot) in the Qualcomm driver for a little while now, and between registers, rings, command buffers and various other buffers we've been able to get a reasonably good picture of the state, suitable for playback on emulators and other silly userspace tricks. We have structs for registers and index/data register pairs because we also dump lots of debug registers and queues and various other HW sources.

The implementation is way different for obvious reasons, but I would love to consolidate on a single format. It's easy for us to do since we share similar architectures, but if at least two GPUs support the same format it can be a catalyst for others to join.

https://www.codeaurora.org/gitweb/quic/la/?p=kernel/msm.git;a=blob;f=drivers/gpu/msm/kgsl_snapshot.h;hb=refs/heads/msm-3.0

Jordan
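The distinction Jordan draws between plain registers and index/data register pairs can be sketched as follows. This is only an illustration of the idea: the record types, the `FakeGpu` register file and its offsets are invented for this example and do not reproduce the layout in kgsl_snapshot.h.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass
class RegSnapshot:
    """Directly addressable registers: offset -> value at dump time."""
    values: Dict[int, int]


@dataclass
class IndexedRegSnapshot:
    """Registers only reachable through an index/data register pair."""
    index_reg: int
    data_reg: int
    start: int
    values: List[int]


def snapshot_regs(read: Callable[[int], int], offsets: Iterable[int]) -> RegSnapshot:
    return RegSnapshot({off: read(off) for off in offsets})


def snapshot_indexed(read, write, index_reg, data_reg, start, count):
    values = []
    for i in range(start, start + count):
        write(index_reg, i)             # select the internal register
        values.append(read(data_reg))   # read it back through the data port
    return IndexedRegSnapshot(index_reg, data_reg, start, values)


class FakeGpu:
    """Toy register file standing in for real MMIO access."""
    INDEX, DATA = 0x100, 0x104

    def __init__(self):
        self.regs = {0x0: 0xCAFE, 0x4: 0xBEEF, self.INDEX: 0}
        self.internal = [10, 20, 30, 40]  # debug registers behind the pair

    def read(self, off):
        if off == self.DATA:
            return self.internal[self.regs[self.INDEX]]
        return self.regs[off]

    def write(self, off, value):
        self.regs[off] = value
```

Recording the index and data register offsets next to the values is what would let a generic tool label the dumped debug registers without knowing the hardware, which is the property a shared format would need.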
GPU lockup dumping
On Thu, May 17, 2012 at 6:07 PM, j.gli...@gmail.com wrote:
> OK, this time it is the final version; I added a bunch of flags to the cmd buffer to make the userspace tools' life easier.
>
> Cheers,
> Jerome

So, the updated libdrm patch:
http://people.freedesktop.org/~glisse/lockup/0001-radeon-add-rati-dumping-helper.patch

I also updated git://people.freedesktop.org/~glisse/joujou, and now you can replay a GPU lockup file captured with those patches on r6xx hw. Here is an example captured from the kernel (I manually edited the file to fix what was causing the lockup, so you can replay it safely):
http://people.freedesktop.org/~glisse/lockup/lockup-dump.rati

I also improved the text conversion, see:
http://people.freedesktop.org/~glisse/lockup/lockup-dump.tati

Will do r7xx and then evergreen/cayman. But I do think that the kernel patches are good as is.

Cheers,
Jerome
GPU lockup dumping
OK, this time it is the final version. I added a bunch of flags to the cmd buffer to make the userspace tools' life easier.

Cheers,
Jerome
GPU lockup dumping
Make the format more future-proof by adding a total chunk size field that allows old userspace to skip over potentially new chunks. Not sure this is really needed, but hey.

Jerome
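To illustrate why a total chunk size field helps, here is a minimal sketch of how an old userspace tool could walk a dump and step over chunk types it does not know. The header layout (a little-endian u32 type followed by a u32 total size that includes the header), the type numbers, and the helper names are assumptions made up for this example; they are not the layout defined in radeon_drm.h.

```python
import struct

# Hypothetical chunk header: u32 type, u32 total size in bytes (header included).
# This layout is an assumption for illustration, not the real radeon dump format.
HEADER = struct.Struct("<II")

# Chunk types this (old) tool understands; the numbers are invented.
KNOWN_TYPES = {1: "ib", 2: "bo"}


def make_chunk(ctype, payload):
    """Build one chunk: header carrying the total size, then the payload."""
    return HEADER.pack(ctype, HEADER.size + len(payload)) + payload


def iter_known_chunks(data):
    """Yield (name, payload) for known chunks and skip unknown ones."""
    offset = 0
    while offset + HEADER.size <= len(data):
        ctype, total = HEADER.unpack_from(data, offset)
        # A size smaller than the header or running past the end means corruption.
        if total < HEADER.size or offset + total > len(data):
            raise ValueError(f"corrupt chunk at offset {offset}")
        if ctype in KNOWN_TYPES:
            yield KNOWN_TYPES[ctype], data[offset + HEADER.size:offset + total]
        # The total size lets us advance even when the type is unknown.
        offset += total
```

A dump built as `make_chunk(1, ...) + make_chunk(99, ...) + make_chunk(2, ...)` would then yield only the two known chunks; the type 99 chunk is skipped without the tool needing to understand its contents.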
GPU lockup dumping
So here is the improved patchset, where I split the groundwork necessary for the dumping into its own patches. The debugfs improvement could probably be useful to Intel, instead of having i915 keep its own debugfs file code. The lockup dumping public API has been moved into radeon_drm.h.

Stressing the fact again that dumps are self-contained, i.e. they have all the data needed to be replayed (vertex, indices, shader, texture, ...).

I would really like to get this into 3.5; the new API is pretty much straightforward and userspace tools can easily be made to convert it to other formats. The change to the driver is self-contained.

Cheers,
Jerome