Bug#977286: crash on H.264 encoding

2021-01-10 Thread Steinar H. Gunderson
reopen 977286
tags 977286 - patch
thanks

On Sun, Dec 13, 2020 at 04:47:56PM +0100, Steinar H. Gunderson wrote:
> Whenever I start Nageru on my Kaby Lake laptop, it segfaults in the VA driver.
> This was fine in 20.3.0+ds1-1, broke in 20.4.1+ds1-1, and is still the case
> in 20.4.2+ds1-1. However, compiling upstream 20.4.3 appears to fix it.
> This is the relevant patch according to bisect:

Unfortunately, with 20.4.5, it's back, so I guess 20.4.3 fixing it was just
luck. The new backtrace is very similar:

Core was generated by `nageru'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x7f57cdfb767f in CodecHalSetRcsSurfaceState (hwInterface=<optimized out>, cmdBuffer=cmdBuffer@entry=0x7f57927fbfb0, 
surfaceCodecParams=surfaceCodecParams@entry=0x7f57927fbce0, 
kernelState=kernelState@entry=0x560628a1efd0)
at ./media_driver/agnostic/common/codec/hal/codechal_utilities.cpp:463
463 ./media_driver/agnostic/common/codec/hal/codechal_utilities.cpp: No 
such file or directory.
[Current thread is 1 (Thread 0x7f57927fe700 (LWP 3121820))]
(gdb) bt
#0  0x7f57cdfb767f in CodecHalSetRcsSurfaceState(CodechalHwInterface*, 
_MOS_COMMAND_BUFFER*, _CODECHAL_SURFACE_CODEC_PARAMS*, MHW_KERNEL_STATE*)
(hwInterface=<optimized out>, cmdBuffer=cmdBuffer@entry=0x7f57927fbfb0, 
surfaceCodecParams=surfaceCodecParams@entry=0x7f57927fbce0, 
kernelState=kernelState@entry=0x560628a1efd0) at 
./media_driver/agnostic/common/codec/hal/codechal_utilities.cpp:463
#1  0x7f57ce0db641 in 
CodechalEncodeAvcEncG9::SendAvcMbEncSurfaces(_MOS_COMMAND_BUFFER*, 
_CODECHAL_ENCODE_AVC_MBENC_SURFACE_PARAMS*) (this=0x5606289e8970, 
cmdBuffer=0x7f57927fbfb0, params=0x7f57927fc150)
at ./media_driver/agnostic/gen9/codec/hal/codechal_encode_avc_g9.cpp:1955
#2  0x7f57ce013230 in CodechalEncodeAvcEnc::MbEncKernel(bool) 
(this=0x5606289e8970, mbEncIFrameDistInUse=<optimized out>)
at ./media_driver/agnostic/common/codec/hal/codechal_encode_avc.cpp:3896
#3  0x7f57ce0174f9 in CodechalEncodeAvcEnc::ExecuteKernelFunctions() 
(this=0x5606289e8970)
at ./media_driver/agnostic/common/codec/hal/codechal_encode_avc.cpp:6460
#4  0x7f57cdffe870 in CodechalEncoderState::ExecuteEnc(EncoderParams*) 
(this=0x5606289e8970, encodeParams=0x5606289cad60)
at ./media_driver/agnostic/common/codec/hal/codechal_encoder_base.cpp:4755
#5  0x7f57ce300043 in DdiEncodeAvc::EncodeInCodecHal(unsigned int) 
(this=0x5606288e9a90, numSlices=1)
at ./media_driver/linux/common/codec/ddi/media_ddi_encode_avc.cpp:1141
#6  0x7f57ce2eb2d7 in DdiEncodeBase::EndPicture(VADriverContext*, unsigned 
int)
(this=0x5606288e9a90, ctx=<optimized out>, context=<optimized out>)
at ./media_driver/linux/common/codec/ddi/media_ddi_encode_base.cpp:77
#7  0x7f57ce2f05db in DdiEncode_EndPicture(VADriverContext*, unsigned int)
(ctx=ctx@entry=0x5606282f8830, context=context@entry=536870912)
at ./media_driver/linux/common/codec/ddi/media_libva_encoder.cpp:629
#8  0x7f57ce31d50b in DdiMedia_EndPicture(VADriverContextP, VAContextID) 
(ctx=0x5606282f8830, context=536870912)
at ./media_driver/linux/common/ddi/media_libva.cpp:3831
#9  0x7f57dad24adf in vaEndPicture () at /lib/x86_64-linux-gnu/libva.so.2
#10 0x56062716603c in ?? ()
#11 0x560627168409 in ?? ()
#12 0x560627168dec in ?? ()
#13 0x7f57d893bed0 in ?? () at /lib/x86_64-linux-gnu/libstdc++.so.6
#14 0x7f57d86f5ea7 in start_thread (arg=<optimized out>) at 
pthread_create.c:477
#15 0x7f57d8623def in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95
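
To check mechanically that two backtraces match apart from addresses, the function name and source location of each frame can be compared. The following is a minimal sketch in Python, not part of the driver or of Nageru; the helper name frame_signatures and the variables NEW_BT and OLD_BT are made up for illustration, and the sample frames are abbreviated (argument values dropped) from frames #0 to #2 of the two backtraces in this report.

```python
import re

def frame_signatures(backtrace):
    """Return a (function, source location) pair for each gdb frame.

    Addresses and argument values are ignored, since they differ
    between runs and builds even when the crash site is the same.
    """
    # Undo mail line wrapping so that every frame is one stretch of text.
    text = " ".join(backtrace.split())
    signatures = []
    for frame in re.split(r"(?:^|\s)#\d+\s+", text):
        if not frame:
            continue
        # Optional "0xADDR in " prefix, then the (possibly qualified) name.
        name = re.match(r"(?:0x[0-9a-f]+\s+in\s+)?([A-Za-z_][\w:]*)", frame)
        # "at file:line" is only present when debug info is available.
        location = re.search(r"\bat\s+(\S+:\d+)", frame)
        signatures.append((
            name.group(1) if name else "??",
            location.group(1) if location else None,
        ))
    return signatures

# Abbreviated frames from the 20.4.5 backtrace.
NEW_BT = """\
#0  0x7f57cdfb767f in CodecHalSetRcsSurfaceState(CodechalHwInterface*, 
_MOS_COMMAND_BUFFER*, _CODECHAL_SURFACE_CODEC_PARAMS*, MHW_KERNEL_STATE*)
at ./media_driver/agnostic/common/codec/hal/codechal_utilities.cpp:463
#1  0x7f57ce0db641 in 
CodechalEncodeAvcEncG9::SendAvcMbEncSurfaces(_MOS_COMMAND_BUFFER*, 
_CODECHAL_ENCODE_AVC_MBENC_SURFACE_PARAMS*)
at ./media_driver/agnostic/gen9/codec/hal/codechal_encode_avc_g9.cpp:1955
#2  0x7f57ce013230 in CodechalEncodeAvcEnc::MbEncKernel(bool) 
at ./media_driver/agnostic/common/codec/hal/codechal_encode_avc.cpp:3896
"""

# Abbreviated frames from the 20.4.2+ds1-1 backtrace.
OLD_BT = """\
#0  0x7fffe930d8bf in CodecHalSetRcsSurfaceState(CodechalHwInterface*, 
_MOS_COMMAND_BUFFER*, _CODECHAL_SURFACE_CODEC_PARAMS*, MHW_KERNEL_STATE*) at 
./media_driver/agnostic/common/codec/hal/codechal_utilities.cpp:463
#1  0x7fffe9431121 in 
CodechalEncodeAvcEncG9::SendAvcMbEncSurfaces(_MOS_COMMAND_BUFFER*, 
_CODECHAL_ENCODE_AVC_MBENC_SURFACE_PARAMS*)
at ./media_driver/agnostic/gen9/codec/hal/codechal_encode_avc_g9.cpp:1955
#2  0x7fffe9369020 in CodechalEncodeAvcEnc::MbEncKernel(bool) 
at ./media_driver/agnostic/common/codec/hal/codechal_encode_avc.cpp:3896
"""

if __name__ == "__main__":
    same = frame_signatures(NEW_BT) == frame_signatures(OLD_BT)
    print("same crash site" if same else "backtraces differ")
```

Comparing only names and file:line pairs is deliberate: the load addresses change with every build of the driver, so a plain diff of the two backtraces would flag every frame.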

/* Steinar */
-- 
Homepage: https://www.sesse.net/



Bug#977286: crash on H.264 encoding

2020-12-13 Thread Steinar H. Gunderson
Package: intel-media-va-driver-non-free
Version: 20.4.2+ds1-1
Severity: important
Tags: patch upstream

Hi,

Whenever I start Nageru on my Kaby Lake laptop, it segfaults in the VA driver.
This was fine in 20.3.0+ds1-1, broke in 20.4.1+ds1-1, and is still the case
in 20.4.2+ds1-1. However, compiling upstream 20.4.3 appears to fix it.
This is the relevant patch according to bisect:

  commit fe06066f8c643b75d6cdac21df8e16106d74e89c
  Author: JasonChen 
  Date:   Thu Nov 26 10:35:24 2020 +0800
  
  [Media Common] Integrate new gmm API for external surface
  
  Integrate new gmm API to create gmmResInfo for external surface.
  Need to update gmm to intel-gmmlib-20.3.3.

This is the backtrace:

Thread 15 "QS_Encode" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffb166d700 (LWP 1311008)]
0x7fffe930d8bf in CodecHalSetRcsSurfaceState (hwInterface=<optimized out>, 
cmdBuffer=cmdBuffer@entry=0x7fffb166af90, 
surfaceCodecParams=surfaceCodecParams@entry=0x7fffb166acc0, 
kernelState=kernelState@entry=0x56748fd0)
at ./media_driver/agnostic/common/codec/hal/codechal_utilities.cpp:463
463 ./media_driver/agnostic/common/codec/hal/codechal_utilities.cpp: No 
such file or directory.
(gdb) bt
#0  0x7fffe930d8bf in CodecHalSetRcsSurfaceState(CodechalHwInterface*, 
_MOS_COMMAND_BUFFER*, _CODECHAL_SURFACE_CODEC_PARAMS*, MHW_KERNEL_STATE*)
(hwInterface=<optimized out>, cmdBuffer=cmdBuffer@entry=0x7fffb166af90, 
surfaceCodecParams=surfaceCodecParams@entry=0x7fffb166acc0, 
kernelState=kernelState@entry=0x56748fd0) at 
./media_driver/agnostic/common/codec/hal/codechal_utilities.cpp:463
#1  0x7fffe9431121 in 
CodechalEncodeAvcEncG9::SendAvcMbEncSurfaces(_MOS_COMMAND_BUFFER*, 
_CODECHAL_ENCODE_AVC_MBENC_SURFACE_PARAMS*)
(this=0x567124a0, cmdBuffer=0x7fffb166af90, params=0x7fffb166b140)
at ./media_driver/agnostic/gen9/codec/hal/codechal_encode_avc_g9.cpp:1955
#2  0x7fffe9369020 in CodechalEncodeAvcEnc::MbEncKernel(bool) 
(this=0x567124a0, mbEncIFrameDistInUse=<optimized out>)
at ./media_driver/agnostic/common/codec/hal/codechal_encode_avc.cpp:3896
#3  0x7fffe936d2e9 in CodechalEncodeAvcEnc::ExecuteKernelFunctions() 
(this=0x567124a0)
at ./media_driver/agnostic/common/codec/hal/codechal_encode_avc.cpp:6460
#4  0x7fffe9354660 in CodechalEncoderState::ExecuteEnc(EncoderParams*) 
(this=0x567124a0, encodeParams=0x566f41d8)
at ./media_driver/agnostic/common/codec/hal/codechal_encoder_base.cpp:4755
#5  0x7fffe96557f3 in DdiEncodeAvc::EncodeInCodecHal(unsigned int) 
(this=0x56614d40, numSlices=1)
at ./media_driver/linux/common/codec/ddi/media_ddi_encode_avc.cpp:1141
#6  0x7fffe9640a77 in DdiEncodeBase::EndPicture(VADriverContext*, unsigned 
int)
(this=0x56614d40, ctx=<optimized out>, context=<optimized out>) at 
./media_driver/linux/common/codec/ddi/media_ddi_encode_base.cpp:77
#7  0x7fffe9645d8b in DdiEncode_EndPicture(VADriverContext*, unsigned int) 
(ctx=ctx@entry=0x561c9590, context=context@entry=536870912)
at ./media_driver/linux/common/codec/ddi/media_libva_encoder.cpp:629
#8  0x7fffe9672b9b in DdiMedia_EndPicture(VADriverContextP, VAContextID) 
(ctx=0x561c9590, context=536870912)
at ./media_driver/linux/common/ddi/media_libva.cpp:3831
#9  0x76718adf in vaEndPicture () at /lib/x86_64-linux-gnu/libva.so.2
#10 0x555dc03e in 
QuickSyncEncoderImpl::encode_frame(QuickSyncEncoderImpl::PendingFrame, int, 
int, int, int, long, long, long, movit::YCbCrLumaCoefficients)
(this=this@entry=0x56633210, frame=..., 
encoding_frame_num=encoding_frame_num@entry=0, 
display_frame_num=display_frame_num@entry=0, 
gop_start_display_frame_num=gop_start_display_frame_num@entry=0, 
frame_type=frame_type@entry=7, pts=12000, dts=1, duration=24000, 
ycbcr_coefficients=movit::YCBCR_REC_601) at 
/usr/include/c++/10/bits/unique_ptr.h:173
#11 0x555de2e3 in QuickSyncEncoderImpl::encode_thread_func() 
(this=0x56633210) at ../nageru/quicksync_encoder.cpp:1832
#12 0x555decbc in operator() (__closure=0x5661e248) at 
../nageru/quicksync_encoder.cpp:1491
#13 std::__invoke_impl > (__f=...) at 
/usr/include/c++/10/bits/invoke.h:60
#14 std::__invoke > (__fn=...) at 
/usr/include/c++/10/bits/invoke.h:95
#15 
std::thread::_Invoker > 
>::_M_invoke<0> (this=0x5661e248) at /usr/include/c++/10/thread:264
#16 
std::thread::_Invoker > >::operator() 
(this=0x5661e248) at /usr/include/c++/10/thread:271
#17 
std::thread::_State_impl > > 
>::_M_run(void) (this=0x5661e240)
at /usr/include/c++/10/thread:215
#18 0x74077ed0 in ?? () at /lib/x86_64-linux-gnu/libstdc++.so.6
#19 0x73e2fea7 in start_thread (arg=<optimized out>) at 
pthread_create.c:477
#20 0x73d5fd8f in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Please upgrade to 20.4.3, and the crash should go away.

-- System Information:
Debian Release: 10.7
  APT prefers stable-debug
  APT policy: (500, 'stable-debug'), (500, 'proposed-updates'),