On 14/11/17 22:10, Mironov, Mikhail wrote:
>> On 14/11/17 17:14, Mironov, Mikhail wrote:
>>>>>>>>> +    res = ctx->factory->pVtbl->CreateContext(ctx->factory, &ctx->context);
>>>>>>>>> +    AMF_RETURN_IF_FALSE(ctx, res == AMF_OK, AVERROR_UNKNOWN, "CreateContext() failed with error %d\n", res);
>>>>>>>>> +    // try to reuse existing DX device
>>>>>>>>> +    if (avctx->hw_frames_ctx) {
>>>>>>>>> +        AVHWFramesContext *device_ctx = (AVHWFramesContext*)avctx->hw_frames_ctx->data;
>>>>>>>>> +        if (device_ctx->device_ctx->type == AV_HWDEVICE_TYPE_D3D11VA) {
>>>>>>>>> +            if (amf_av_to_amf_format(device_ctx->sw_format) == AMF_SURFACE_UNKNOWN) {
>>>>>>>>
>>>>>>>> This test is inverted.
>>>>>>>>
>>>>>>>> Have you actually tested this path?  Even with that test fixed, I'm
>>>>>>>> unable to pass the following initialisation test with an AMD D3D11
>>>>>>>> device.
>>>>>>>
>>>>>>> Yes, the condition should be inverted.  To test I had to add
>>>>>>> "-hwaccel d3d11va -hwaccel_output_format d3d11" to the command line.
>>>>>>
>>>>>> Yeah.  I get:
>>>>>>
>>>>>> $ ./ffmpeg_g -y -hwaccel d3d11va -hwaccel_device 0 -hwaccel_output_format d3d11 -i ~/bbb_1080_264.mp4 -an -c:v h264_amf out.mp4
>>>>>> ...
>>>>>> [AVHWDeviceContext @ 000000000270e120] Created on device 1002:665f (AMD Radeon (TM) R7 360 Series).
>>>>>> ...
>>>>>> [h264_amf @ 00000000004dcd80] amf_shared: avctx->hw_frames_ctx has non-AMD device, switching to default
>>>>>>
>>>>>> It's then comedically slow in this state (about 2fps), but works fine
>>>>>> when the decode is in software.
>>>>>
>>>>> Is it possible that you also have the iGPU not disabled and it is used
>>>>> for decoding as adapter 0?
>>>>
>>>> There is an integrated GPU, but it's currently completely disabled.
>>>> (I made
>>>> <https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2017-November/219795.html>
>>>> to check that the device was definitely right.)
>>>>
>>>>> Can you provide a log from dxdiag.exe?
>>>>
>>>> <http://ixia.jkqxz.net/~mrt/DxDiag.txt>
>>>>
>>>>> If AMF created its own DX device then the submission logic and speed
>>>>> are the same as from a SW decoder.
>>>>> It would be interesting to see a short GPUVIEW log.
>>>>
>>>> My Windows knowledge is insufficient to get that immediately, but if
>>>> you think it's useful I can look into it?
>>>
>>> I think I know what is going on.  You are on Win7.  In Win7 the D3D11VA
>>> API is not available from MSFT.
>>> AMF will fall back to DX9-based encoding submission, and this is why
>>> the message was produced.
>>> The AMF performance should be the same on DX9, but I don't know how
>>> decoding is done without D3D11VA support.
>>> GPUVIEW is not really needed if my assumptions are correct.
>>
>> Ah, that would make sense.  Maybe detect it and fail earlier with a
>> helpful message - the current "not an AMD device" is wrong in this case.
>>
>> Decode via D3D11 does work for me on Windows 7 with both AMD and Intel;
>> I don't know anything about how, though.  (I don't really care about
>> Windows 7 - this was just a set of parts mashed together into a working
>> machine for testing; the Windows 7 install is inherited from elsewhere.)
>
> I ran this on Win7.  What I see is that decoding does go via D3D11VA; the
> support comes with the Platform Update.  But the AMF encoder works on
> Win7 via D3D9 only.  That explains the performance hit: in D3D11, to copy
> the video output, the HW accelerator copies the frame via a staging
> texture.
> If I use DXVA2 for decoding it is faster because the staging texture is
> not needed.
> I am thinking of connecting dxva2 acceleration with the AMF encoder, but
> probably in the next phase.
> I've added more precise logging.
>
>>
>>>>>>>>> +    { "filler_data",    "Filler Data Enable",            OFFSET(filler_data), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, VE },
>>>>>>>>> +    { "vbaq",           "Enable VBAQ",                   OFFSET(enable_vbaq), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, VE },
>>>>>>>>> +    { "frame_skipping", "Rate Control Based Frame Skip", OFFSET(skip_frame),  AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, VE },
>>>>>>>>> +
>>>>>>>>> +    /// QP Values
>>>>>>>>> +    { "qp_i", "Quantization Parameter for I-Frame", OFFSET(qp_i), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, 51, VE },
>>>>>>>>> +    { "qp_p", "Quantization Parameter for P-Frame", OFFSET(qp_p), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, 51, VE },
>>>>>>>>> +    { "qp_b", "Quantization Parameter for B-Frame", OFFSET(qp_b), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, 51, VE },
>>>>>>>>> +
>>>>>>>>> +    /// Pre-Pass, Pre-Analysis, Two-Pass
>>>>>>>>> +    { "preanalysis", "Pre-Analysis Mode", OFFSET(preanalysis), AV_OPT_TYPE_BOOL, { .i64 = 0 }, 0, 1, VE, NULL },
>>>>>>>>> +
>>>>>>>>> +    /// Maximum Access Unit Size
>>>>>>>>> +    { "max_au_size", "Maximum Access Unit Size for rate control (in bits)", OFFSET(max_au_size), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, INT_MAX, VE },
>>>>>>>>
>>>>>>>> Can you explain more about what this option does?  I don't seem to
>>>>>>>> be able to get it to do anything - e.g. setting -max_au_size 80000
>>>>>>>> with 30fps CBR 1M (which should be easily achievable) still makes
>>>>>>>> packets of more than 80000 bits.
>>>>>>>
>>>>>>> It means maximum frame size in bits, and it should be used together
>>>>>>> with enforce_hrd enabled.  I tested; it works after the related fix
>>>>>>> for enforce_hrd.
>>>>>>> I added dependency handling.
>>>>>>
>>>>>> $ ./ffmpeg_g -y -nostats -i ~/bbb_1080_264.mp4 -an -c:v h264_amf -bsf:v trace_headers -frames:v 1000 -enforce_hrd 1 -b:v 1M -maxrate 1M -max_au_size 80000 out.mp4 2>&1 | grep 'Packet: [0-9]\{5\}'
>>>>>> [AVBSFContext @ 00000000029d7f40] Packet: 11426 bytes, key frame, pts 128000, dts 128000.
>>>>>> [AVBSFContext @ 00000000029d7f40] Packet: 17623 bytes, key frame, pts 192000, dts 192000.
>>>>>> [AVBSFContext @ 00000000029d7f40] Packet: 23358 bytes, pts 249856, dts 249856.
>>>>>>
>>>>>> (That is, packets bigger than the supposed 80000-bit maximum.)
>>>>>> Expected?
>>>>>
>>>>> No, this is not expected.  I tried the exact command line and did not
>>>>> get packets of more than 80000 bits.  Sorry to ask, but did you apply
>>>>> the change in amfenc.h?
>>>>
>>>> I used the most recent patch on the list,
>>>> <https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2017-November/219757.html>.
>>>> (Required a bit of fixup to apply, as Michael already noted.)
>>>
>>> Yes, I will submit the update today, but I cannot repro large packets.
>>> Can you just check that you have the change:
>>>
>>> -    typedef amf_uint16 amf_bool;
>>> +    typedef amf_uint8  amf_bool;
>>
>> Yes, I have that change.
>>
>> Could it be a difference in support for the particular card I am using
>> (Bonaire / GCN 2, so several generations old now), or will that be the
>> same across all of them?
>>
>
> I got a different clip and reproduced the issue.  We discussed this with
> our main "rate control" guy.
> Basically, this parameter cannot guarantee the frame size in a complex
> scene when it is combined with a relatively low bit rate and a relatively
> low max AU size value.
> To confirm this it would be great if you could share your output stream
> so we can verify that this is the case (or the input stream).
Input:  <http://distribution.bbb3d.renderfarming.net/video/mp4/bbb_sunflower_1080p_60fps_normal.mp4>
Output: <http://ixia.jkqxz.net/~mrt/amf_max_au_size.mp4>

Looking at the transition on frame 976, the output quality is pretty bad,
but not really bad enough to merit the failure - the macroblock QPs are
only 37/38, and go higher on following frames.

- Mark

_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel