This is revision 4 implementing a GPU crash state for drm/msm (https://patchwork.freedesktop.org/series/36097/). This version fixes some identified issues and actually compiles on a modern kernel.
The goal is to store and provide enough information to debug software
and hardware issues on Adreno hardware in a semi-human-readable format
that scripts can also parse. The full set of changes captures basic
information about the GPU, the status and contents of the ringbuffers,
a snapshot of the current register state, and the active buffers from
the hanging submit.

The data is exposed through devcoredump. After a hang you can read it
from /sys/class/devcoredump/devcdX/data, where X is a unique number.

An example of the output for a simple invalid opcode error on the
db820c is here:

https://hastebin.com/ewamikoreh.cs

v5: Fix symbol error in i915_gpu_error.c thanks to 01 dot org bot.
Added open/release functions for the show debugfs file to get the state
per Chris Wilson. Slightly modified the register output format to be
more YAML friendly, also per Chris.

v4: Add buffer dump for the active submit. Fix refcount issue with
devcoredump. Change header for a5xx registers to registers-hlsq because
I'm told YAML requires unique tags.

v3: Make recommended changes to ascii85 per Chris Wilson. Use
devcoredump to dump crash states as suggested by Bjorn Andersson and add
a new drm_print facility to facilitate that. Remove the now obsolete
'crash' debugfs node. Add documentation for the crash dump output.

v2: Convert output to YAML, use ascii85 to dump ringbuffer contents.
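For anyone who wants to collect the dumps from a script, here is a minimal userspace sketch of the retrieval step described above. It is not part of this series. The helper names find_crash_dumps and read_crash_dump are hypothetical, and the only thing it assumes is the /sys/class/devcoredump/devcdX/data layout mentioned in this letter. It makes no assumption about the contents of the dump and just returns the text.

```python
import glob
import os


def find_crash_dumps(base="/sys/class/devcoredump"):
    """Return the 'data' file of every pending devcoredump, sorted by path.

    'base' is a parameter only so the function can be pointed at a
    test directory; on a real system the default should be used.
    """
    pattern = os.path.join(base, "devcd*", "data")
    return sorted(glob.glob(pattern))


def read_crash_dump(path):
    """Read one dump as text.

    The file is read as bytes and decoded leniently so that a stray
    non-UTF-8 byte does not abort the collection script.
    """
    with open(path, "rb") as f:
        return f.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Print every pending dump, with a separator naming its source file.
    for dump_path in find_crash_dumps():
        print("==== %s ====" % dump_path)
        print(read_crash_dump(dump_path))
```

Reading the data file normally requires root. Dumps do not persist indefinitely, so a collection script should copy the text somewhere permanent soon after the hang.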
Jordan Crouse (10):
  include: Move ascii85 functions from i915 to linux/ascii85.h
  drm: drm_printer: Add printer for devcoredump
  drm/msm/gpu: Capture the state of the GPU
  drm/msm/gpu: Convert the GPU show function to use the GPU state
  drm/msm/gpu: Rearrange the code that collects the task during a hang
  drm/msm/gpu: Capture the GPU state on a GPU hang
  drm/msm/adreno: Convert the show/crash file format
  drm/msm/adreno: Add ringbuffer data to the GPU state
  drm/msm/adreno: Add a5xx specific registers for the GPU state
  drm/msm/gpu: Add the buffer objects from the submit to the crash dump

 Documentation/gpu/drm-msm-crash-dump.txt |  46 +++++
 drivers/gpu/drm/drm_print.c              |  54 +++++
 drivers/gpu/drm/i915/i915_gpu_error.c    |  35 +---
 drivers/gpu/drm/msm/Kconfig              |   1 +
 drivers/gpu/drm/msm/adreno/a3xx_gpu.c    |  30 +--
 drivers/gpu/drm/msm/adreno/a4xx_gpu.c    |  22 ++-
 drivers/gpu/drm/msm/adreno/a5xx_gpu.c    | 242 +++++++++++++++++++++--
 drivers/gpu/drm/msm/adreno/adreno_gpu.c  | 180 +++++++++++++++--
 drivers/gpu/drm/msm/adreno/adreno_gpu.h  |   7 +-
 drivers/gpu/drm/msm/msm_debugfs.c        |  93 ++++++++-
 drivers/gpu/drm/msm/msm_gpu.c            | 143 +++++++++++++-
 drivers/gpu/drm/msm/msm_gpu.h            |  67 ++++++-
 include/drm/drm_print.h                  |  27 +++
 include/linux/ascii85.h                  |  39 ++++
 14 files changed, 886 insertions(+), 100 deletions(-)
 create mode 100644 Documentation/gpu/drm-msm-crash-dump.txt
 create mode 100644 include/linux/ascii85.h

--
2.17.0