from:"Wu, Jianhua"

[FFmpeg-devel] Re: [PATCH v2 3/3] tests/checkasm/vvc_alf: change alf step size to 8

2024-06-06 Thread Wu Jianhua

Bross, Benjamin:
> From: Bross, Benjamin
> Sent: 2024-06-03 3:42
> To: FFmpeg development discussions and patches
> Cc: Wu Jianhua
> Subject: Re: [FFmpeg-devel] [PATCH v2 3/3] tests/checkasm/vvc_alf: change alf step 
> size to 8
> 
>> From Benjamin Bross:
>>> for ALF where functions are in increments of 4 while 8 should be sufficient 
>>> according to the spec.
> 
> Actually, it is not only the increment but the size has to be a multiple of 
> 8, hence in addition loops should start at 8 instead of at 4.
> 
> ALF filter and classification is applied on CTU-level.
> According to VVC spec, CTU sizes can be: 32x32, 64x64, 128x128 luma samples.
> However, if width or height are not a multiple of CTU size, the larger CTU is 
> "forced to split" until width or height fits.
> E.g. width=840 luma samples fits 26.25 CTUs, i.e. 26 32x32 and one 8x32.
> Width and height are restricted to be at least a multiple of 8 (see H.266 
> 09/2023, 7.4.3.4 Sequence parameter set RBSP semantics)

> sps_pic_width_max_in_luma_samples shall not be equal to 0 and shall be an 
> integer multiple of
> Max( 8, MinCbSizeY )
> …
> sps_pic_height_max_in_luma_samples shall not be equal to 0 and shall be an 
> integer multiple of
> Max( 8, MinCbSizeY ).
> 
> Please note that this applies only for classification and 7x7 luma filtering. 
> For chroma 5x5 filtering and 420 subsampling, CTU sizes can be as small as 
> multiples of 4 chroma samples.
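
The width arithmetic in the example above (840 = 26 * 32 + 8) can be sketched in C as follows. This is only an illustration: the helper names `ctu_cols` and `last_ctu_col_width` are hypothetical and are not FFmpeg functions.

```c
#include <assert.h>

/* Hypothetical helper, not an FFmpeg function: number of CTU columns,
 * counting a partial last column as one. */
static int ctu_cols(int pic_width, int ctu_size)
{
    assert(pic_width > 0 && ctu_size > 0);
    return (pic_width + ctu_size - 1) / ctu_size;
}

/* Hypothetical helper: width in luma samples of the last (possibly partial)
 * CTU column for a given picture width. */
static int last_ctu_col_width(int pic_width, int ctu_size)
{
    int rem;

    assert(pic_width > 0 && ctu_size > 0);
    rem = pic_width % ctu_size;
    return rem ? rem : ctu_size;
}
```

With a width of 840 and a CTU size of 32 this gives 27 columns, the last of which is 8 luma samples wide. Because the picture width is a multiple of at least 8, the partial column is never narrower than 8.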


Hi Benjamin,

I will update some patches according to your comments.

Thanks for your suggestion!
Jianhua
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

[FFmpeg-devel] Re: [PATCH v2 1/3] avcodec/x86/vvc/vvc_alf: fix integer overflow

2024-05-30 Thread Wu Jianhua

Andreas Rheinhardt:
> From: ffmpeg-devel on behalf of Andreas Rheinhardt
> Sent: 2024-05-30 11:33
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH v2 1/3] avcodec/x86/vvc/vvc_alf: fix integer 
> overflow
> 
> toq...@outlook.com:
> > From: Wu Jianhua 
> >
> > Some tests fail with certain seeds
> >
> > tests/checkasm/checkasm 2325607578 --test=vvc_alf
> > checkasm: using random seed 2325607578
> 

> And can I get an answer to the question of whether the issue is present
> when used by the actual decoder and not only the checkasm test?
> 
> - Andreas
> 

Sure. This issue has not occurred in actual decoding in our tests, only in 
the checkasm test, because the filter there is generated randomly.

Thanks,
Jianhua

[FFmpeg-devel] Re: [PATCH 1/3] avcodec/x86/vvc/vvc_alf: fix integer overflow

2024-05-30 Thread Wu Jianhua

Ronald S. Bultje:
> From: Ronald S. Bultje
> Sent: 2024-05-29 13:56
> To: Wu Jianhua
> Cc: FFmpeg development discussions and patches; Nuo Mi; James Almer
> Subject: Re: [FFmpeg-devel] [PATCH 1/3] avcodec/x86/vvc/vvc_alf: fix integer 
> overflow
> 
> Hi,
> 
> On Wed, May 29, 2024 at 3:44 PM Wu Jianhua <toq...@outlook.com> wrote:
> Ronald S. Bultje:
>> On Wed, May 29, 2024 at 11:38 AM <toq...@outlook.com> wrote:
>> +%else
>> +vpunpcklqdq  m11, m2, m2
>> +vpunpckhqdq  m12, m2, m2
>> +vpunpcklwd   m11, m11, m14
>> +vpunpcklwd   m12, m12, m14
>> +paddd m0, m11
>> +paddd m1, m12
>> +packssdw  m0, m0, m1
>> +%endif
>
> [..]
> > Also, the whole thing just emulates a saturated add. Can't you use paddsw 
> > instead of paddw and be done with it? To add to Andreas' question: is 
> > saturating here normatively required?
> 
> > We don't have any sample that fails because of this issue; only checkasm 
> > fails, with specific seeds. I think we can leave it unchanged until a 
> > real sample goes wrong.
> 
> Cc'ing @Nuomi for more details.
> 
> I think "just" replacing paddw with paddsw is correct, since the input pixels 
> are 12bit (so they could be either unsigned or signed), the filtered output 
> is the result of packssdw (so signed words), and the desired output is 12bit 
> pixels anyway, anything greater than that is clipped to 12bit range. So to 
> me, it seems paddsw is a cheaper way to accomplish the same thing.
> 
> Ronald
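
As a scalar illustration of the difference discussed above: paddw wraps around on 16-bit overflow, whereas paddsw clamps the sum to the signed 16-bit range. The C sketch below models a single 16-bit lane of each instruction. The function names are made up for this example and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of paddw: 16-bit add with wraparound. The final conversion to
 * int16_t assumes the usual two's-complement behaviour. */
static int16_t add_wrap16(int16_t a, int16_t b)
{
    return (int16_t)(uint16_t)((uint16_t)a + (uint16_t)b);
}

/* One lane of paddsw: 16-bit add with signed saturation. */
static int16_t add_sat16(int16_t a, int16_t b)
{
    int sum = a + b; /* computed in int, so the sum itself cannot overflow */

    if (sum > INT16_MAX)
        return INT16_MAX;
    if (sum < INT16_MIN)
        return INT16_MIN;
    return (int16_t)sum;
}
```

For example, 30000 + 10000 wraps to a negative value with the paddw model but saturates to 32767 with the paddsw model. A later clip to the 12-bit pixel range then yields the maximum pixel value instead of a wrong, small one.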

Hi Ronald,

Yes, it does. I've tested paddsw and everything works well. It is indeed a 
cheaper way to do this, with minimal performance loss.

And v2 sent.

Thanks for this.
Jianhua

[FFmpeg-devel] Re: [PATCH 1/3] avcodec/x86/vvc/vvc_alf: fix integer overflow

2024-05-29 Thread Wu Jianhua

Ronald S. Bultje:
> From: Ronald S. Bultje
> Sent: 2024-05-29 10:51
> To: FFmpeg development discussions and patches
> Cc: James Almer; Wu Jianhua
> Subject: Re: [FFmpeg-devel] [PATCH 1/3] avcodec/x86/vvc/vvc_alf: fix integer 
> overflow
> 
> Hi,
> 
> On Wed, May 29, 2024 at 11:38 AM <toq...@outlook.com> wrote:
> +%else
> +vpunpcklqdq  m11, m2, m2
> +vpunpckhqdq  m12, m2, m2
> +vpunpcklwd   m11, m11, m14
> +vpunpcklwd   m12, m12, m14
> +paddd m0, m11
> +paddd m1, m12
> +packssdw  m0, m0, m1
> +%endif
> 
> punpcklqdq a, src, src
> punpckhqdq b, src, src
> punpcklwd a, a, zero
> punpcklwd b, b, zero
> 
> is the same as
> 
> punpcklwd a, src, zero
> punpckhwd b, src, zero
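
The equivalence above can be checked with a small scalar model. The sketch below is plain C, not SIMD code: it models one 128-bit lane as eight 16-bit words (the AVX2 forms of these instructions operate on each 128-bit lane independently). The type and function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint16_t w[8]; } Vec128; /* one 128-bit lane as 8 words */

/* punpcklqdq: low qword of a followed by low qword of b */
static Vec128 unpack_lo_qword(Vec128 a, Vec128 b)
{
    Vec128 r;
    memcpy(&r.w[0], &a.w[0], 8);
    memcpy(&r.w[4], &b.w[0], 8);
    return r;
}

/* punpckhqdq: high qword of a followed by high qword of b */
static Vec128 unpack_hi_qword(Vec128 a, Vec128 b)
{
    Vec128 r;
    memcpy(&r.w[0], &a.w[4], 8);
    memcpy(&r.w[4], &b.w[4], 8);
    return r;
}

/* punpcklwd (base = 0) or punpckhwd (base = 4): interleave four words */
static Vec128 unpack_word(Vec128 a, Vec128 b, int base)
{
    Vec128 r;
    for (int i = 0; i < 4; i++) {
        r.w[2 * i]     = a.w[base + i];
        r.w[2 * i + 1] = b.w[base + i];
    }
    return r;
}

/* Returns 1 if the four-instruction and two-instruction sequences agree. */
static int unpack_sequences_agree(Vec128 src)
{
    const Vec128 zero = { { 0 } };
    Vec128 a4 = unpack_word(unpack_lo_qword(src, src), zero, 0);
    Vec128 b4 = unpack_word(unpack_hi_qword(src, src), zero, 0);
    Vec128 a2 = unpack_word(src, zero, 0);
    Vec128 b2 = unpack_word(src, zero, 4);

    return !memcmp(&a4, &a2, sizeof(a4)) && !memcmp(&b4, &b2, sizeof(b4));
}
```

Both sequences zero-extend words 0..3 of src into a and words 4..7 into b, so the shorter form saves two instructions per iteration.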

Thank you for pointing this out. This modification really helps me improve 
the patch!


Andreas:
> Can this happen with real inputs (like when called from the decoder)? If
> not, then the test needs to be made more realistic.
> Anyway, what is the performance impact of this?

I don't have a unit test for this, but the average FPS appears unchanged.

Ronald:
> Also, the whole thing just emulates a saturated add. Can't you use paddsw 
> instead of paddw and be done with it? To add to Andreas' question: is 
> saturating here normatively required?

We don't have any sample that fails because of this issue; only checkasm 
fails, with specific seeds. I think we can leave it unchanged until a real 
sample goes wrong.

Cc'ing @Nuomi for more details.


[FFmpeg-devel] Re: [PATCH] x86/vvc_alf: use the x86inc instruction macros

2024-05-21 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of James Almer
> Sent: 2024-05-21 6:52
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH] x86/vvc_alf: use the x86inc instruction macros
> 
> Let its magic figure out the correct mnemonic based on target instruction set.
> 
> Signed-off-by: James Almer 
> ---
>  libavcodec/x86/vvc/vvc_alf.asm | 202 -
>  1 file changed, 101 insertions(+), 101 deletions(-)

I tested this patch and it LGTM. Thanks for updating these.

Also, would it be better to add avcodec to the path prefix in the commit message?

Thanks,
Jianhua

[FFmpeg-devel] Re: [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 DMVR SAD functions for VVC

2024-05-18 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of Stone Chen
> Sent: 2024-05-14 13:40
> To: ffmpeg-devel@ffmpeg.org
> Cc: Stone Chen
> Subject: [FFmpeg-devel] [PATCH v3 1/2][GSoC 2024] libavcodec/x86/vvc: Add AVX2 
> DMVR SAD functions for VVC
> 
> Implements AVX2 DMVR (decoder-side motion vector refinement) SAD functions. 
> DMVR SAD is only calculated if w >= 8, h >= 8, and w * h > 128. To reduce 
> complexity, SAD is only calculated on even rows. This is calculated for all 
> video bitdepths, but the values passed to the function are always 16bit 
> (even if the original video bitdepth is 8). The AVX2 implementation uses 
> min/max/sub.
> 
> Benchmarks ( AMD 7940HS )
> Before:
> BQTerrace_1920x1080_60_10_420_22_RA.vvc | 80.7 |
> Chimera_8bit_1080P_1000_frames.vvc | 158.0 |
> NovosobornayaSquare_1920x1080.bin | 159.7 |
> RitualDance_1920x1080_60_10_420_37_RA.266 | 146.3 |
> 
> After:
> BQTerrace_1920x1080_60_10_420_22_RA.vvc | 82.7 |
> Chimera_8bit_1080P_1000_frames.vvc | 167.0 |
> NovosobornayaSquare_1920x1080.bin | 166.3 |
> RitualDance_1920x1080_60_10_420_37_RA.266 | 154.0 |
> ---
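
For readers unfamiliar with the operation described in the commit message, here is a minimal scalar sketch of a SAD restricted to even rows over 16-bit samples. It is not the FFmpeg C reference or the AVX2 code: the function name, signature, and the single shared stride are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical scalar reference, not the FFmpeg function: sum of absolute
 * differences over even rows only. Samples are 16-bit regardless of the
 * video bit depth; `stride` is in samples and shared by both blocks. */
static int sad_even_rows(const int16_t *a, const int16_t *b,
                         int stride, int w, int h)
{
    int sad = 0;

    assert(stride >= w && w > 0 && h > 0);
    for (int y = 0; y < h; y += 2) {        /* odd rows are skipped */
        for (int x = 0; x < w; x++)
            sad += abs(a[y * stride + x] - b[y * stride + x]);
    }
    return sad;
}
```

The absolute difference |a - b| equals max(a, b) - min(a, b), which is presumably why the AVX2 version is described as using min/max/sub.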

LGTM. Thanks for your efforts.

[FFmpeg-devel] Re: [PATCH 1/4] avcodec/x86/vvc: add alf filter luma and chroma avx2 optimizations

2024-05-01 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of Nuo Mi
> Sent: 2024-04-30 11:03
> To: FFmpeg development discussions and patches
> Subject: Re: [FFmpeg-devel] [PATCH 1/4] avcodec/x86/vvc: add alf filter luma and 
> chroma avx2 optimizations
> 
> On Tue, Apr 30, 2024 at 12:34 AM Andreas Rheinhardt <
> andreas.rheinha...@outlook.com> wrote:
> 
> > toq...@outlook.com:
> > > vvc_alf_filter_chroma_16x12_10_c: 7235.5
> > > vvc_alf_filter_chroma_16x12_10_avx2: 9751.0
> >
> > Are these numbers correct? If so, the avx2 version should not be committed.
> >
> It could be a system turbulence. The data around it appears to be correct.
> 
> Hi Jianhua,
> Maybe you can test it again
> Thank you for the patch, Now we can smoothly play a 4k@60 on a modern i7.
> 

Sure. I will rerun the performance test with no other CPU-heavy processes 
running and send a v2.

[FFmpeg-devel] Re: [PATCH v2 1/3] avcodec/x86/vvc/vvcdsp_init: add put prototypes

2024-04-17 Thread Wu Jianhua

> From: Nuo Mi
> Sent: 2024-04-17 6:14
> To: FFmpeg development discussions and patches
> Cc: Wu Jianhua
> Subject: Re: [FFmpeg-devel] [PATCH v2 1/3] avcodec/x86/vvc/vvcdsp_init: add put 
> prototypes
> 
> Hi Jianhua,
> thank you for the patches.
> could you add a log for each commit to explain why we need this commit?

Sure. v3 sent.

> 
> On Tue, Apr 16, 2024 at 1:36 AM <toq...@outlook.com> wrote:
> From: Wu Jianhua <toq...@outlook.com>
> 
> Signed-off-by: Wu Jianhua <toq...@outlook.com>
> ---
>  libavcodec/x86/vvc/vvcdsp_init.c | 35 +++-
> 1 file changed, 34 insertions(+), 1 deletion(-)


[FFmpeg-devel] Re: [PATCH] avcodec/x86/vvc/vvcdsp_init: fix linking error when configuring with --disable-ssse3 --disable-optimizations options

2024-04-15 Thread Wu Jianhua

> From: Nuo Mi
> Sent: 2024-03-03 6:49
> To: FFmpeg development discussions and patches
> Cc: Wu Jianhua
> Subject: Re: [FFmpeg-devel] [PATCH] avcodec/x86/vvc/vvcdsp_init: fix linking error 
> when configuring with --disable-ssse3 --disable-optimizations options
> 
> Thank you, Jianhua.
> This patch mixes many things.
> Could you help split it into smaller, more atomic patches?
> For example, one for moving code blocks and another for fixing 
> --disable-ssse3.
> 

Sure. I have sent v2.

>  #define AVG_INIT(bd, opt) do {  \
> -c->inter.avg= bf(avg, bd, opt); \
> -c->inter.w_avg  = bf(w_avg, bd, opt);   \
> +c->inter.avg= bf(ff_vvc_avg, bd, opt);  \
> +c->inter.w_avg  = bf(ff_vvc_w_avg, bd, opt);\
> Why change the function scope to fix a compilation issue?

I add prototypes for the functions in the same way as the HEVC DSP does. Hence, 
the functions cannot be declared static and need the ff_vvc_ prefix to avoid 
naming conflicts.




[FFmpeg-devel] Re: Re: [PATCH 2/2] avformat/mov: improve HEIF parsing

2024-04-13 Thread Wu Jianhua

> From: James Almer
> Sent: 2024-04-13 6:12
> To: Wu Jianhua; ffmpeg-devel@ffmpeg.org
> Subject: Re: Re: [FFmpeg-devel] [PATCH 2/2] avformat/mov: improve HEIF parsing
> 
> On 4/13/2024 8:04 AM, Wu Jianhua wrote:
>>> From: ffmpeg-devel on behalf of James Almer
>>> Sent: 2024-01-09 11:55
>>> To: ffmpeg-devel@ffmpeg.org
>>> Subject: [FFmpeg-devel] [PATCH 2/2] avformat/mov: improve HEIF parsing
>>>
>>> Parse iinf boxes and its child infe boxes to get the actual codec used
>>> (AV1 for avif, HEVC for heic), and properly export extradata in a generic
>>> way.
>>>
>>> The avif tests reference files are updated as the extradata is now exported.
>>>
>>> Signed-off-by: James Almer 
>>> ---
>>> libavformat/isom.h|   3 +-
>>> libavformat/mov.c | 157 ++
>>> .../fate/mov-avif-demux-still-image-1-item|   2 +-
>>> .../mov-avif-demux-still-image-multiple-items |   2 +-
>>> 4 files changed, 95 insertions(+), 69 deletions(-)
>>>
>>
>>> +if (version != 2) {
>>> +av_log(c->fc, AV_LOG_ERROR, "infe: version != 2 not supported.\n");
>>> +return AVERROR_PATCHWELCOME;
>>> +}
>>> +
>>
>> Hi James,
>> 
>> With the change, some errors occurred and the current FFmpeg failed to 
>> decode a lot of videos that can be decoded by older FFmpeg.
>> [mov,mp4,m4a,3gp,3g2,mj2 @ 01e6e79aac80] st: 0 edit list: 1 Missing key 
>> frame while searching for timestamp: 1000
>> [mov,mp4,m4a,3gp,3g2,mj2 @ 01e6e79aac80] st: 0 edit list 1 Cannot find 
>> an index entry before timestamp: 1000.
>> [mov,mp4,m4a,3gp,3g2,mj2 @ 01e6e79aac80] infe: version < 2 not supported
>> [mov,mp4,m4a,3gp,3g2,mj2 @ 01e6e79aac80] error reading header
>> 
>> I'm not familiar with the mov. Is it possible to treat this error as a 
>> warning and skip it when the version < 2?
>> 
>> Thanks,
>> Jianhua
>
> Can you test
> https://ffmpeg.org//pipermail/ffmpeg-devel/2024-April/325644.html ?

It works. Thanks for your quick fix!

[FFmpeg-devel] Re: [PATCH 2/2] avformat/mov: improve HEIF parsing

2024-04-13 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of James Almer
> Sent: 2024-01-09 11:55
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH 2/2] avformat/mov: improve HEIF parsing
>
> Parse iinf boxes and its child infe boxes to get the actual codec used
> (AV1 for avif, HEVC for heic), and properly export extradata in a generic
> way.
>
> The avif tests reference files are updated as the extradata is now exported.
>
> Signed-off-by: James Almer 
> ---
> libavformat/isom.h|   3 +-
> libavformat/mov.c | 157 ++
> .../fate/mov-avif-demux-still-image-1-item|   2 +-
> .../mov-avif-demux-still-image-multiple-items |   2 +-
> 4 files changed, 95 insertions(+), 69 deletions(-)
>

> +if (version != 2) {
> +av_log(c->fc, AV_LOG_ERROR, "infe: version != 2 not supported.\n");
> +return AVERROR_PATCHWELCOME;
> +}
> +

Hi James,

With the change, some errors occurred and the current FFmpeg failed to decode a 
lot of videos that can be decoded by older FFmpeg.
[mov,mp4,m4a,3gp,3g2,mj2 @ 01e6e79aac80] st: 0 edit list: 1 Missing key 
frame while searching for timestamp: 1000
[mov,mp4,m4a,3gp,3g2,mj2 @ 01e6e79aac80] st: 0 edit list 1 Cannot find an 
index entry before timestamp: 1000.
[mov,mp4,m4a,3gp,3g2,mj2 @ 01e6e79aac80] infe: version < 2 not supported
[mov,mp4,m4a,3gp,3g2,mj2 @ 01e6e79aac80] error reading header

I'm not familiar with the mov. Is it possible to treat this error as a warning 
and skip it when the version < 2?

Thanks,
Jianhua
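
The behaviour asked for above (warn and skip instead of failing) could look roughly like the sketch below. This is not the fix that was later applied to libavformat/mov.c: the function name, the return codes, and the use of stderr in place of av_log are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>

enum { INFE_SKIP = 0, INFE_PARSE = 1 };

/* Hypothetical helper, not part of libavformat: decide whether an infe box
 * should be parsed. Older versions are reported and skipped rather than
 * aborting header parsing for the whole file. */
static int infe_check_version(int version)
{
    assert(version >= 0);
    if (version < 2) {
        fprintf(stderr, "infe: version < 2 not supported, skipping box\n");
        return INFE_SKIP;
    }
    return INFE_PARSE;
}
```

A caller would skip the box payload when INFE_SKIP is returned and continue reading the remaining boxes, so files that were readable before the change stay readable.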


[FFmpeg-devel] Re: [PATCH v3 8/9] avcodec: add D3D12VA hardware HEVC encoder

2024-02-05 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of tong1.wu-at-intel@ffmpeg.org
> Sent: 2024-02-02 2:16
> To: ffmpeg-devel@ffmpeg.org
> Cc: Tong Wu
> Subject: [FFmpeg-devel] [PATCH v3 8/9] avcodec: add D3D12VA hardware HEVC encoder
> 
> From: Tong Wu 
> 
> This implementation is based on D3D12 Video Encoding Spec:
> https://microsoft.github.io/DirectX-Specs/d3d/D3D12VideoEncoding.html
> 
> Sample command line for transcoding:
> ffmpeg.exe -hwaccel d3d12va -hwaccel_output_format d3d12 -i input.mp4
> -c:v hevc_d3d12va output.mp4
> 
> Signed-off-by: Tong Wu 
> ---
> configure|6 +
> libavcodec/Makefile  |4 +-
> libavcodec/allcodecs.c   |1 +
> libavcodec/d3d12va_encode.c  | 1441 ++
> libavcodec/d3d12va_encode.h  |  275 ++
> libavcodec/d3d12va_encode_hevc.c | 1011 +
> libavcodec/hw_base_encode.h  |2 +-
> 7 files changed, 2738 insertions(+), 2 deletions(-)
> create mode 100644 libavcodec/d3d12va_encode.c
> create mode 100644 libavcodec/d3d12va_encode.h
> create mode 100644 libavcodec/d3d12va_encode_hevc.c
> 
>
> +min_cu_size = d3d12va_encode_hevc_map_cusize(ctx->codec_conf.pHEVCConfig->MinLumaCodingUnitSize);
> +max_cu_size = d3d12va_encode_hevc_map_cusize(ctx->codec_conf.pHEVCConfig->MaxLumaCodingUnitSize);
> +min_tu_size = d3d12va_encode_hevc_map_tusize(ctx->codec_conf.pHEVCConfig->MinLumaTransformUnitSize);
> +max_tu_size = d3d12va_encode_hevc_map_tusize(ctx->codec_conf.pHEVCConfig->MaxLumaTransformUnitSize);
> +
> +// VPS
> +
> +vps->nal_unit_header = (H265RawNALUnitHeader) {

Should this blank line be removed, since the comment applies to the code below it?

> +vps->vps_timing_info_present_flag = 0;
> +
> +// SPS
> +
> +sps->nal_unit_header = (H265RawNALUnitHeader) {
> +.nal_unit_type = HEVC_NAL_SPS,
> +.nuh_layer_id  = 0,
> +.nuh_temporal_id_plus1 = 1,
> +};
The same as above.

> +static uint8_t d3d12va_encode_hevc_map_cusize(D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION_HEVC_CUSIZE cusize)
> +{
> +switch (cusize) {
> +case D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION_HEVC_CUSIZE_8x8:   return 8;
> +case D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION_HEVC_CUSIZE_16x16: return 16;
> +case D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION_HEVC_CUSIZE_32x32: return 32;
> +case D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION_HEVC_CUSIZE_64x64: return 64;
> +}
> +return 0;
> +}
> +
> +static uint8_t d3d12va_encode_hevc_map_tusize(D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION_HEVC_TUSIZE tusize)
> +{
> +switch (tusize) {
> +case D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION_HEVC_TUSIZE_4x4:   return 4;
> +case D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION_HEVC_TUSIZE_8x8:   return 8;
> +case D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION_HEVC_TUSIZE_16x16: return 16;
> +case D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION_HEVC_TUSIZE_32x32: return 32;
> +}
> +return 0;
> +}

A default branch is needed; alternatively, we could simplify this by using 
8 << cusize and 4 << tusize.
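
A sketch of the shift-based variant suggested above follows. The enums here are local stand-ins, not the real D3D12 types; they are assumed to be numbered 0..3 in increasing size order, which is exactly what the 8 << cusize and 4 << tusize shortcut relies on. The function names are also hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the D3D12 CU/TU size enums (assumed to be numbered 0..3). */
typedef enum { CUSIZE_8x8, CUSIZE_16x16, CUSIZE_32x32, CUSIZE_64x64 } CuSize;
typedef enum { TUSIZE_4x4, TUSIZE_8x8, TUSIZE_16x16, TUSIZE_32x32 } TuSize;

/* Map a CU size enum to its size in luma samples; 0 for unknown values. */
static uint8_t map_cusize(CuSize cusize)
{
    if ((unsigned)cusize > (unsigned)CUSIZE_64x64)
        return 0; /* plays the role of the default branch */
    return (uint8_t)(8 << cusize);
}

/* Map a TU size enum to its size in luma samples; 0 for unknown values. */
static uint8_t map_tusize(TuSize tusize)
{
    if ((unsigned)tusize > (unsigned)TUSIZE_32x32)
        return 0; /* plays the role of the default branch */
    return (uint8_t)(4 << tusize);
}
```

The range check keeps the behaviour of the original switch for unexpected values, which return 0, while removing the per-case mapping.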

> +hr = ID3D12Device3_QueryInterface(ctx->device3, &IID_ID3D12VideoDevice3, (void **)&ctx->video_device3);
> +if (FAILED(hr)) {
> +err = AVERROR_UNKNOWN;
> +goto fail;
> +}
> +
> +if (FAILED(ID3D12VideoDevice3_CheckFeatureSupport(ctx->video_device3, D3D12_FEATURE_VIDEO_FEATURE_AREA_SUPPORT,
> +  &support, sizeof(support))) && !support.VideoEncodeSupport) {
> +av_log(avctx, AV_LOG_ERROR, "D3D12 video device has no video encoder support");
> +err = AVERROR(EINVAL);
> +goto fail;
> +}

We need to log an error when the ID3D12Device3_QueryInterface call fails; 
otherwise the user will not know that the failure is because the OS or the 
driver does not support the ID3D12VideoDevice3 interface.

[FFmpeg-devel] Re: [PATCH] avcodec/x86/vvc/vvcdsp_init: fix unresolved external symbol on ARCH_X86_32

2024-02-05 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of Andreas Rheinhardt
> Sent: 2024-02-05 4:06
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] [PATCH] avcodec/x86/vvc/vvcdsp_init: fix unresolved 
> external symbol on ARCH_X86_32
>
> toq...@outlook.com:
>> From: Wu Jianhua 
>>
>> Signed-off-by: Wu Jianhua 
>> ---
>>  libavcodec/x86/vvc/vvcdsp_init.c | 78 
>>  1 file changed, 40 insertions(+), 38 deletions(-)
>>
>> diff --git a/libavcodec/x86/vvc/vvcdsp_init.c 
>> b/libavcodec/x86/vvc/vvcdsp_init.c
>> index 909ef9f56b..8ee4074350 100644
>> --- a/libavcodec/x86/vvc/vvcdsp_init.c
>> +++ b/libavcodec/x86/vvc/vvcdsp_init.c
>> @@ -31,6 +31,7 @@
>>  #include "libavcodec/vvc/vvcdsp.h"
>>  #include "libavcodec/x86/h26x/h2656dsp.h"
>>
>
> Are really all of these functions unavailable for 32bit?
>
> - Andreas

Yes. Both libavcodec/x86/vvc/vvc_mc.asm and libavcodec/x86/h26x/h2656_inter.asm 
are wrapped in ARCH_X86_64.

[FFmpeg-devel] Re: [PATCH] lavc/d3d12va: Improve behaviour on missing decoder support

2024-02-04 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of Mark Thompson
> Sent: 2024-02-04 5:24
> To: FFmpeg development discussions and patches
> Subject: [FFmpeg-devel] [PATCH] lavc/d3d12va: Improve behaviour on missing decoder 
> support
> 
> Distinguish between a decoder being entirely missing and a decoder which
> requires features which are not present in the incomplete implementation
> in libavcodec and therefore can't be used.
> ---
>   libavcodec/d3d12va_decode.c | 12 
>   1 file changed, 8 insertions(+), 4 deletions(-)

LGTM. Thanks.

[FFmpeg-devel] Re: [PATCH v3 7/8] avcodec/x86/vvc: add avg and avg_w AVX2 optimizations

2024-01-23 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of Michael Niedermayer
> Sent: 2024-01-22 14:46
> To: FFmpeg development discussions and patches
> Subject: Re: [FFmpeg-devel] [PATCH v3 7/8] avcodec/x86/vvc: add avg and avg_w AVX2 
> optimizations
> 
> On Tue, Jan 23, 2024 at 01:46:27AM +0800, toq...@outlook.com wrote:
>> From: Wu Jianhua 
>>
>>  The avg/avg_w is based on dav1d.
>>  See 
>> https://code.videolan.org/videolan/dav1d/-/blob/master/src/x86/mc_avx2.asm
>>
>>
>> Signed-off-by: Wu Jianhua 
>>  ---
>>  libavcodec/x86/vvc/Makefile  |   3 +-
>>  libavcodec/x86/vvc/vvc_mc.asm| 301 +++
>>  libavcodec/x86/vvc/vvcdsp_init.c |  52 ++
>>  3 files changed, 355 insertions(+), 1 deletion(-)
>>  create mode 100644 libavcodec/x86/vvc/vvc_mc.asm
>
> this seems to break x86-32
>
> src/libavcodec/x86/vvc/vvc_mc.asm:51: error: symbol `ff_vvc_avg_8bpc_avx2.w2' 
> undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:51: error: symbol `ff_vvc_avg_8bpc_avx2.w4' 
> undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:51: error: symbol `ff_vvc_avg_8bpc_avx2.w8' 
> undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:51: error: symbol 
> `ff_vvc_avg_8bpc_avx2.w16' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:51: error: symbol 
> `ff_vvc_avg_8bpc_avx2.w32' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:51: error: symbol 
> `ff_vvc_avg_8bpc_avx2.w64' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:51: error: symbol 
> `ff_vvc_avg_8bpc_avx2.w128' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:52: error: symbol 
> `ff_vvc_avg_16bpc_avx2.w2' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:52: error: symbol 
> `ff_vvc_avg_16bpc_avx2.w4' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:52: error: symbol 
> `ff_vvc_avg_16bpc_avx2.w8' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:52: error: symbol 
> `ff_vvc_avg_16bpc_avx2.w16' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:52: error: symbol 
> `ff_vvc_avg_16bpc_avx2.w32' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:52: error: symbol 
> `ff_vvc_avg_16bpc_avx2.w64' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:52: error: symbol 
> `ff_vvc_avg_16bpc_avx2.w128' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:53: error: symbol 
> `ff_vvc_w_avg_8bpc_avx2.w2' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:53: error: symbol 
> `ff_vvc_w_avg_8bpc_avx2.w4' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:53: error: symbol 
> `ff_vvc_w_avg_8bpc_avx2.w8' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:53: error: symbol 
> `ff_vvc_w_avg_8bpc_avx2.w16' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:53: error: symbol 
> `ff_vvc_w_avg_8bpc_avx2.w32' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:53: error: symbol 
> `ff_vvc_w_avg_8bpc_avx2.w64' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcodec/x86/vvc/vvc_mc.asm:53: error: symbol 
> `ff_vvc_w_avg_8bpc_avx2.w128' undefined
> src/libavcodec/x86/vvc/vvc_mc.asm:48: ... from macro `AVG_JMP_TABLE' defined 
> here
> src/libavcod

[FFmpeg-devel] Re: Re: Re: [PATCH 2/2] fate: add raw IAMF tests

2024-01-22 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of James Almer
> Sent: 2024-01-22 8:49
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] Re: Re: [PATCH 2/2] fate: add raw IAMF tests
>
> On 1/22/2024 1:23 PM, Wu Jianhua wrote:
>>> From: ffmpeg-devel on behalf of James Almer
>>> Sent: 2024-01-22 8:10
>>> To: ffmpeg-devel@ffmpeg.org
>>> Subject: Re: [FFmpeg-devel] Re: [PATCH 2/2] fate: add raw IAMF tests
>>>
>>> On 1/22/2024 1:02 PM, Wu Jianhua wrote:
>>>>> From: ffmpeg-devel on behalf of James Almer
>>>>> Sent: 2024-01-20 4:22
>>>>> To: ffmpeg-devel@ffmpeg.org
>>>>> Subject: [FFmpeg-devel] [PATCH 2/2] fate: add raw IAMF tests
>>>>>
>>>>> Covers muxing from raw pcm audio input into FLAC, using several scalable 
>>>>> layouts,
>>>>> and demuxing the result.
>>>>>
>>>>
>>>> Hi there.
>>>>
>>>> Test iamf-7_1_4 failed. Look at tests/data/fate/iamf-7_1_4.err for details.
>>>> make: *** [tests/Makefile:317: fate-iamf-7_1_4] Error 234
>>>> Test iamf-5_1_4 failed. Look at tests/data/fate/iamf-5_1_4.err for details.
>>>> make: *** [tests/Makefile:317: fate-iamf-5_1_4] Error 234
>>>>
>>>> These tests failed on my machine, and it looks like some machines on 
>>>> http://fate.ffmpeg.org/ failed as well.
>>>>
>>>> Can you help check what the problem is?
>>>
>>> What is your system (The shell, mainly)? And what is the error you get?
>>>
>>
>> My OS is Ubuntu 22.04.3 LTS and the compiler is gcc version 11.4.0 (Ubuntu 
>> 11.4.0-1ubuntu1~22.04).
> 
> Can you check again with current git head?

The issue is fixed. Thanks for the quick fix.

[FFmpeg-devel] Re: Re: [PATCH 2/2] fate: add raw IAMF tests

2024-01-22 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of James Almer
> Sent: 2024-01-22 8:10
> To: ffmpeg-devel@ffmpeg.org
> Subject: Re: [FFmpeg-devel] Re: [PATCH 2/2] fate: add raw IAMF tests
>
> On 1/22/2024 1:02 PM, Wu Jianhua wrote:
>>> From: ffmpeg-devel on behalf of James Almer
>>> Sent: 2024-01-20 4:22
>>> To: ffmpeg-devel@ffmpeg.org
>>> Subject: [FFmpeg-devel] [PATCH 2/2] fate: add raw IAMF tests
>>>
>>> Covers muxing from raw pcm audio input into FLAC, using several scalable 
>>> layouts,
>>> and demuxing the result.
>>>
>>
>> Hi there.
>>
>> Test iamf-7_1_4 failed. Look at tests/data/fate/iamf-7_1_4.err for details.
>> make: *** [tests/Makefile:317: fate-iamf-7_1_4] Error 234
>> Test iamf-5_1_4 failed. Look at tests/data/fate/iamf-5_1_4.err for details.
>> make: *** [tests/Makefile:317: fate-iamf-5_1_4] Error 234
>>
>> These tests failed on my machine, and it looks like some machines on 
>> http://fate.ffmpeg.org/ failed as well.
>>
>> Can you help check what the problem is?
>
> What is your system (The shell, mainly)? And what is the error you get?
>

My OS is Ubuntu 22.04.3 LTS and the compiler is gcc version 11.4.0 (Ubuntu 
11.4.0-1ubuntu1~22.04).

Here is the full error message:
TEST    seek-acodec-adpcm-ima_wav-trellis
--- ./tests/ref/fate/iamf-stereo 2024-01-23 00:17:30.109846236 +0800
+++ tests/data/fate/iamf-stereo 2024-01-23 00:18:16.857845219 +0800
@@ -1,18 +0,0 @@
-ace731a4fbc302e24498d6b64daa16e7 *tests/data/fate/iamf-stereo.iamf
-14426 tests/data/fate/iamf-stereo.iamf
-#extradata 0:   34, 0x40a802c6
-#tb 0: 1/44100
-#media_type 0: audio
-#codec_id 0: flac
-#sample_rate 0: 44100
-#channel_layout_name 0: stereo
-0,  0,  0, 4608, 1399, 0x6e89566e
-0,   4608,   4608, 4608, 1442, 0x6c3c5b13
-0,   9216,   9216, 4608, 1380, 0xc497571b
-0,  13824,  13824, 4608, 1383, 0x48e9510f
-0,  18432,  18432, 4608, 1572, 0x9a514719
-0,  23040,  23040, 4608, 1391, 0x74ac5014
-0,  27648,  27648, 4608, 1422, 0x2f9d47c5
-0,  32256,  32256, 4608, 1768, 0x2a044b99
-0,  36864,  36864, 4608, 1534, 0xb0b35a3f
-0,  41472,  41472, 4608,  926, 0xc26a5eae
Test iamf-stereo failed. Look at tests/data/fate/iamf-stereo.err for details.
make: *** [tests/Makefile:317: fate-iamf-stereo] Error 234
make: *** Waiting for unfinished jobs
TEST    seek-acodec-adpcm-ms
--- ./tests/ref/fate/iamf-5_1_4 2024-01-23 00:17:30.109846236 +0800
+++ tests/data/fate/iamf-5_1_4  2024-01-23 00:18:16.845845219 +0800
@@ -1,98 +0,0 @@
-c447cbbc8943cfb751fdf1145a094250 *tests/data/fate/iamf-5_1_4.iamf
-85603 tests/data/fate/iamf-5_1_4.iamf
-#extradata 0:   34, 0x40a802c6
-#extradata 1:   34, 0x40a802c6
-#extradata 2:   34, 0x407c02c4
-#extradata 3:   34, 0x407c02c4
-#extradata 4:   34, 0x40a802c6
-#extradata 5:   34, 0x40a802c6
-#tb 0: 1/44100
-#media_type 0: audio
-#codec_id 0: flac
-#sample_rate 0: 44100
-#channel_layout_name 0: stereo
-#tb 1: 1/44100
-#media_type 1: audio
-#codec_id 1: flac
-#sample_rate 1: 44100
-#channel_layout_name 1: stereo
-#tb 2: 1/44100
-#media_type 2: audio
-#codec_id 2: flac
-#sample_rate 2: 44100
-#channel_layout_name 2: mono
-#tb 3: 1/44100
-#media_type 3: audio
-#codec_id 3: flac
-#sample_rate 3: 44100
-#channel_layout_name 3: mono
-#tb 4: 1/44100
-#media_type 4: audio
-#codec_id 4: flac
-#sample_rate 4: 44100
-#channel_layout_name 4: stereo
-#tb 5: 1/44100
-#media_type 5: audio
-#codec_id 5: flac
-#sample_rate 5: 44100
-#channel_layout_name 5: stereo
-0,  0,  0, 4608, 1399, 0x6e89566e
-1,  0,  0, 4608, 1399, 0x6e89566e
-2,  0,  0, 4608, 1396, 0x0dcb5677
-3,  0,  0, 4608, 1396, 0x0dcb5677
-4,  0,  0, 4608, 1399, 0x6e89566e
-5,  0,  0, 4608, 1399, 0x6e89566e
-0,   4608,   4608, 4608, 1442, 0x6c3c5b13
-1,   4608,   4608, 4608, 1442, 0x6c3c5b13
-2,   4608,   4608, 4608, 1439, 0xc46b5ac5
-3,   4608,   4608, 4608, 1439, 0xc46b5ac5
-4,   4608,   4608, 4608, 1442, 0x6c3c5b13
-5,   4608,   4608, 4608, 1442, 0x6c3c5b13
-0,   9216,   9216, 4608, 1380, 0xc497571b
-1,   9216,   9216, 4608, 1380, 0xc497571b
-2,   9216,   9216, 4608, 1377, 0x5b2a55fe
-3,   9216,   9216, 4608, 1377, 0x5b2a55fe
-4,   9216,   9216, 4608, 1380, 0xc497571b
-5,   9216,   9216, 4608, 1380, 0xc497571b
-0,  13824,  13824, 4608, 1383, 0x48e9510f
-1,  13824,  13824, 4608, 1383, 0x48e9510f
-2,  13824,  13824, 4608, 1380

[FFmpeg-devel] Re: [PATCH 2/2] fate: add raw IAMF tests

2024-01-22 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of James Almer
> Sent: January 20, 2024 4:22
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH 2/2] fate: add raw IAMF tests
>
> Covers muxing from raw pcm audio input into FLAC, using several scalable
> layouts, and demuxing the result.
>

Hi there.

Test iamf-7_1_4 failed. Look at tests/data/fate/iamf-7_1_4.err for details.
make: *** [tests/Makefile:317: fate-iamf-7_1_4] Error 234
Test iamf-5_1_4 failed. Look at tests/data/fate/iamf-5_1_4.err for details.
make: *** [tests/Makefile:317: fate-iamf-5_1_4] Error 234

These tests failed on my machine, and it looks like some machines on
http://fate.ffmpeg.org/ failed as well.

Can you help check what the problem is?

Thanks,
Jianhua

[FFmpeg-devel] Re: [PATCH 8/9] avcodec: add D3D12VA hardware HEVC encoder

2024-01-22 Thread Wu Jianhua

> From: ffmpeg-devel on behalf of tong1.wu-at-intel@ffmpeg.org
> Sent: January 21, 2024 21:57
> To: ffmpeg-devel@ffmpeg.org
> Cc: Tong Wu
> Subject: [FFmpeg-devel] [PATCH 8/9] avcodec: add D3D12VA hardware HEVC encoder
> 
> From: Tong Wu 
> 
> This implementation is based on D3D12 Video Encoding Spec:
> https://microsoft.github.io/DirectX-Specs/d3d/D3D12VideoEncoding.html
> 
> Sample command line for transcoding:
> ffmpeg.exe -hwaccel d3d12va -hwaccel_output_format d3d12 -i input.mp4
> -c:v hevc_d3d12va output.mp4
> 
> Signed-off-by: Tong Wu 
> ---
> configure                        |    6 +
> libavcodec/Makefile              |    4 +-
> libavcodec/allcodecs.c           |    1 +
> libavcodec/d3d12va_encode.c      | 1441 ++
> libavcodec/d3d12va_encode.h      |  200 +
> libavcodec/d3d12va_encode_hevc.c | 1016 +
> libavcodec/hw_base_encode.h      |    2 +-
> 7 files changed, 2668 insertions(+), 2 deletions(-)
> create mode 100644 libavcodec/d3d12va_encode.c
> create mode 100644 libavcodec/d3d12va_encode.h
> create mode 100644 libavcodec/d3d12va_encode_hevc.c

> +    D3D12_OBJECT_RELEASE(ctx->sync_ctx.fence);
> +    if (ctx->sync_ctx.event)
> +        CloseHandle(ctx->sync_ctx.event);
> +
> +    D3D12_OBJECT_RELEASE(ctx->video_device3);
> +    D3D12_OBJECT_RELEASE(ctx->device);
> +    D3D12_OBJECT_RELEASE(ctx->encoder_heap);
> +    D3D12_OBJECT_RELEASE(ctx->encoder);

We need to release all of the objects, including the encoder and encoder_heap, 
created by the device before releasing the device.
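The ordering asked for above can be sketched as follows. This is only an illustration under stated assumptions: `MockObject`, `MockEncodeContext`, and the `mock_*` helpers are hypothetical stand-ins, not the real D3D12 COM interfaces or the patch's code. The point is simply that everything created from the device is released first and the device goes last.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a reference-counted COM object. */
typedef struct MockObject {
    const char *name;
    int         released;
} MockObject;

/* Hypothetical context; the field names mirror the quoted patch. */
typedef struct MockEncodeContext {
    MockObject  encoder_heap;
    MockObject  encoder;
    MockObject  video_device3;
    MockObject  device;
    const char *release_log[4]; /* names in the order they were released */
    size_t      nb_released;
} MockEncodeContext;

/* Release an object once and record the order; repeated calls are no-ops. */
static void mock_release(MockEncodeContext *ctx, MockObject *obj)
{
    if (obj->released)
        return;
    assert(ctx->nb_released < 4);
    obj->released = 1;
    ctx->release_log[ctx->nb_released++] = obj->name;
}

static void mock_encode_init(MockEncodeContext *ctx)
{
    memset(ctx, 0, sizeof(*ctx));
    ctx->encoder_heap.name  = "encoder_heap";
    ctx->encoder.name       = "encoder";
    ctx->video_device3.name = "video_device3";
    ctx->device.name        = "device";
}

/* Objects created by the device are released first, the device last. */
static void mock_encode_close(MockEncodeContext *ctx)
{
    mock_release(ctx, &ctx->encoder_heap);
    mock_release(ctx, &ctx->encoder);
    mock_release(ctx, &ctx->video_device3);
    mock_release(ctx, &ctx->device);
}
```

Calling `mock_encode_close()` on an initialized context leaves "device" as the last entry in `release_log`, which is the property the review comment asks for.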

> +
> +typedef struct D3D12VAEncodeProfile {
> +    //lavc profile value (AV_PROFILE_*).
> +    int   av_profile;
> +    //Supported bit depth.
> +    int   depth;
> +    //Number of components.
> +    int   nb_components;
> +    //Chroma subsampling in width dimension.
> +    int   log2_chroma_w;
> +    //Chroma subsampling in height dimension.
> +    int   log2_chroma_h;
> +    //D3D12 profile value.
> +    D3D12_VIDEO_ENCODER_PROFILE_DESC d3d12_profile;
> +} D3D12VAEncodeProfile;
> +
> +typedef struct D3D12VAEncodeRCMode {
> +    // Base.
> +    HWBaseEncodeRCMode base;
> +    // Supported by D3D12 HW.
> +    int supported;
> +    // D3D12 mode value.
> +    D3D12_VIDEO_ENCODER_RATE_CONTROL_MODE d3d12_mode;
> +} D3D12VAEncodeRCMode;
> +
> +typedef struct D3D12VAEncodeContext {
> +    HWBaseEncodeContext base;
> +
> +    //Codec-specific hooks.
> +    const struct D3D12VAEncodeType *codec;
> +
> +    //Chosen encoding profile details.
> +    const D3D12VAEncodeProfile *profile;
> +
> +    //Chosen rate control mode details.
> +    const D3D12VAEncodeRCMode *rc_mode;
> +
> +    AVD3D12VADeviceContext *hwctx;
> +
> +    //Device3 interface.
> +    ID3D12Device3 *device3;
> +
> +    ID3D12VideoDevice3 *video_device3;
> +
> +    //Pool of (reusable) bitstream output buffers.
> +    AVBufferPool   *output_buffer_pool;
> +
> +    //D3D12 video encoder.
> +    AVBufferRef *encoder_ref;
> +
> +    ID3D12VideoEncoder *encoder;
> +
> +    //D3D12 video encoder heap.
> +    ID3D12VideoEncoderHeap *encoder_heap;
> +
> +    //A cached queue for reusing the D3D12 command allocators.
> +    //@see https://learn.microsoft.com/en-us/windows/win32/direct3d12/recording-command-lists-and-bundles#id3d12commandallocator
> +    AVFifo *allocator_queue;
> +
> +    //D3D12 command queue.
> +    ID3D12CommandQueue *command_queue;
> +
> +    //D3D12 video encode command list.
> +    ID3D12VideoEncodeCommandList2 *command_list;
> +
> +    //The sync context used to sync command queue.
> +    AVD3D12VASyncContext sync_ctx;
> +
> +    //bi_not_empty feature.
> +    int bi_not_empty;
> +
> +    //D3D12 hardware structures.
> +    D3D12_VIDEO_ENCODER_PICTURE_RESOLUTION_DESC resolution;
> +
> +    D3D12_VIDEO_ENCODER_CODEC_CONFIGURATION codec_conf;
> +
> +    D3D12_VIDEO_ENCODER_RATE_CONTROL rc;
> +
> +    D3D12_FEATURE_DATA_VIDEO_ENCODER_RESOURCE_REQUIREMENTS req;
> +
> +    D3D12_VIDEO_ENCODER_SEQUENCE_GOP_STRUCTURE GOP;
> +
> +    D3D12_FEATURE_DATA_VIDEO_ENCODER_RESOLUTION_SUPPORT_LIMITS res_limits;
> +
> +    D3D12_VIDEO_ENCODER_LEVEL_SETTING level;
> +} D3D12VAEncodeContext;
> +
Can we use the same comment style as D3D12VADecodeContext?



[FFmpeg-devel] Re: [PATCH 7/8] avcodec/x86/vvc: add avg and avg_w AVX2 optimizations

2024-01-19 Thread Wu Jianhua

>From: ffmpeg-devel on behalf of Michael Niedermayer
>Sent: January 18, 2024 13:48
>To: FFmpeg development discussions and patches
>Subject: Re: [FFmpeg-devel] [PATCH 7/8] avcodec/x86/vvc: add avg and avg_w AVX2
>optimizations
>
>On Thu, Jan 18, 2024 at 10:24:03PM +0800, toq...@outlook.com wrote:
>> From: Wu Jianhua 
>>
>> The avg/avg_w is based on dav1d.
>> See 
>> https://code.videolan.org/videolan/dav1d/-/blob/master/src/x86/mc_avx2.asm
>>
>>
>> Signed-off-by: Wu Jianhua 
>> ---
>>  libavcodec/x86/vvc/Makefile  |   3 +-
>>  libavcodec/x86/vvc/vvc_mc.asm| 301 +++
>>  libavcodec/x86/vvc/vvcdsp_init.c |  54 ++
>>  3 files changed, 357 insertions(+), 1 deletion(-)
>>  create mode 100644 libavcodec/x86/vvc/vvc_mc.asm
> 
> error: cannot convert from y to UTF-8
> fatal: could not parse patch
> 
> [...]

I used the wrong encoding. It's fixed in v2.

Thanks for verifying this.

[FFmpeg-devel] Re: [PATCH 1/2] avcodec/d3d12va_mpeg2: remove unused variables

2024-01-01 Thread Wu Jianhua

James Almer:
> [FFmpeg-devel] [PATCH 1/2] avcodec/d3d12va_mpeg2: remove unused variables
>
> Signed-off-by: James Almer 
> ---
> is_field worries me. Was it a copy-paste left over, or is it meant to
> be checked?
>
> libavcodec/d3d12va_mpeg2.c | 8 
> 1 file changed, 8 deletions(-)

> diff --git a/libavcodec/d3d12va_mpeg2.c b/libavcodec/d3d12va_mpeg2.c
> index 91bf3f8b75..a2ae8bf948 100644
> --- a/libavcodec/d3d12va_mpeg2.c
> +++ b/libavcodec/d3d12va_mpeg2.c
> @@ -49,7 +49,6 @@ static int d3d12va_mpeg2_start_frame(AVCodecContext *avctx, av_unused const uint
>      const MpegEncContext      *s       = avctx->priv_data;
>      D3D12VADecodeContext      *ctx     = D3D12VA_DECODE_CONTEXT(avctx);
>      D3D12DecodePictureContext *ctx_pic = s->current_picture_ptr->hwaccel_picture_private;
> -    DXVA_QmatrixData          *qm      = &ctx_pic->qm;
>
>      if (!ctx)
>          return -1;
> @@ -76,8 +75,6 @@ static int d3d12va_mpeg2_decode_slice(AVCodecContext *avctx, const uint8_t *buff
>      const MpegEncContext      *s       = avctx->priv_data;
>      D3D12DecodePictureContext *ctx_pic = s->current_picture_ptr->hwaccel_picture_private;
>
> -    int is_field = s->picture_structure != PICT_FRAME;
> -
>

This patch set LGTM. These variables were added for debugging and never removed.
Thanks for your fix.

[FFmpeg-devel] [PATCH v2] avcodec/d3d12va_decode: don't change the resource state if the referenced frame is the same as the current frame

2023-12-27 Thread Wu Jianhua

This commit removes the following warning and error:

D3D12 WARNING: ID3D12CommandList::ResourceBarrier: Called on the same subresource(s) of
Resource(0x02236E0E00D0:'Unnamed ID3D12Resource Object') in separate Barrier Descs
which is inefficient and likely unintentional. Desc[0] and Desc[1] on (subresource :
4294967295). [RESOURCE_MANIPULATION WARNING #1008:
RESOURCE_BARRIER_DUPLICATE_SUBRESOURCE_TRANSITIONS]

D3D12 ERROR: ID3D12CommandList::ResourceBarrier: Before state (0x0:
D3D12_RESOURCE_STATE_[COMMON|PRESENT])
of resource (0x02236E0E00D0:'Unnamed ID3D12Resource Object') (subresource: 0) specified
by transition barrier does not match with the state (0x2:
D3D12_RESOURCE_STATE_VIDEO_DECODE_WRITE)
specified in the previous call to ResourceBarrier [RESOURCE_MANIPULATION ERROR #527:
RESOURCE_BARRIER_BEFORE_AFTER_MISMATCH]

Patch attached
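For readers without the attachment, the idea in the subject line can be sketched roughly as below. This is not the attached patch: `MockResource`, `MockBarrier`, and `collect_ref_barriers` are hypothetical names, and the real code works on ID3D12Resource pointers and D3D12_RESOURCE_BARRIER entries. The sketch only shows the rule that a reference frame which is also the current output frame gets no extra transition barrier.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for ID3D12Resource and D3D12_RESOURCE_BARRIER. */
typedef struct MockResource { int id; } MockResource;
typedef struct MockBarrier  { const MockResource *resource; } MockBarrier;

/*
 * Fill `barriers` with one transition per non-NULL reference resource,
 * skipping any reference that is the current output resource, since that
 * resource already has its own barrier. Returns the number written.
 */
static size_t collect_ref_barriers(const MockResource *current,
                                   const MockResource *const *refs,
                                   size_t nb_refs,
                                   MockBarrier *barriers,
                                   size_t max_barriers)
{
    size_t n = 0;

    for (size_t i = 0; i < nb_refs; i++) {
        if (!refs[i] || refs[i] == current)
            continue;
        assert(n < max_barriers);
        barriers[n++].resource = refs[i];
    }
    return n;
}
```

With the current frame filtered out, the same subresource no longer appears in two barrier descriptors of one ResourceBarrier call, which is what the warning and error above complain about.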


0001-avcodec-d3d12va_decode-don-t-change-the-resource-sta.patch
Description: 0001-avcodec-d3d12va_decode-don-t-change-the-resource-sta.patch

[FFmpeg-devel] Re: [PATCH] avcodec/d3d12va_decode: don't change the resource state if the referenced frame is the same as the current frame

2023-12-27 Thread Wu Jianhua

> From: ffmpeg-devel On Behalf Of Wu, Tong1
> To: FFmpeg development discussions and patches
> Subject: Re: [FFmpeg-devel] [PATCH] avcodec/d3d12va_decode: don't change the
> resource state if the referenced frame is the same as the current frame
>
>>From: ffmpeg-devel On Behalf Of Wu Jianhua
>>Sent: Tuesday, December 26, 2023 9:21 PM
>>To: FFmpeg development discussions and patches
>>Subject: [FFmpeg-devel] [PATCH] avcodec/d3d12va_decode: don't change the
>>resource state if the referenced frame is the same as the current frame
>>
>>avcodec/d3d12va_decode: don't change the resource state if the referenced
>>frame is the same as the current frame
>>
>> This commit removes the following warning and error:
>>
>>D3D12 WARNING: ID3D12CommandList::ResourceBarrier: Called on the
>>same subresource(s) of
>>Resource(0x02236E0E00D0:'Unnamed ID3D12Resource Object') in
>>separate Barrier Descs
>>which is inefficient and likely unintentional. Desc[0] and Desc[1] on
>>(subresource :
>>4294967295). [RESOURCE_MANIPULATION WARNING #1008:
>>RESOURCE_BARRIER_DUPLICATE_SUBRESOURCE_TRANSITIONS]
>>
>>D3D12 ERROR: ID3D12CommandList::ResourceBarrier: Before state (0x0:
>>D3D12_RESOURCE_STATE_[COMMON|PRESENT])
>>of resource (0x02236E0E00D0:'Unnamed ID3D12Resource Object')
>>(subresource: 0) specified
>>by transition barrier does not match with the state (0x2:
>>D3D12_RESOURCE_STATE_VIDEO_DECODE_WRITE)
>>specified in the previous call to ResourceBarrier [RESOURCE_MANIPULATION
>>ERROR #527:
>>RESOURCE_BARRIER_BEFORE_AFTER_MISMATCH]
>>
>>Patch attached
>
> Could you please split the function declaration(header) into 2 lines since 
> it's a little bit long?
>
> Thx,
> Tong

Sure. Will do in the v2.

Thanks,
Jianhua

[FFmpeg-devel] [PATCH] avcodec/d3d12va_decode: don't change the resource state if the referenced frame is the same as the current frame

2023-12-26 Thread Wu Jianhua

avcodec/d3d12va_decode: don't change the resource state if the referenced frame
is the same as the current frame

This commit removes the following warning and error:

D3D12 WARNING: ID3D12CommandList::ResourceBarrier: Called on the same subresource(s) of
Resource(0x02236E0E00D0:'Unnamed ID3D12Resource Object') in separate Barrier Descs
which is inefficient and likely unintentional. Desc[0] and Desc[1] on (subresource :
4294967295). [RESOURCE_MANIPULATION WARNING #1008:
RESOURCE_BARRIER_DUPLICATE_SUBRESOURCE_TRANSITIONS]

D3D12 ERROR: ID3D12CommandList::ResourceBarrier: Before state (0x0:
D3D12_RESOURCE_STATE_[COMMON|PRESENT])
of resource (0x02236E0E00D0:'Unnamed ID3D12Resource Object') (subresource: 0) specified
by transition barrier does not match with the state (0x2:
D3D12_RESOURCE_STATE_VIDEO_DECODE_WRITE)
specified in the previous call to ResourceBarrier [RESOURCE_MANIPULATION ERROR #527:
RESOURCE_BARRIER_BEFORE_AFTER_MISMATCH]

Patch attached


0001-avcodec-d3d12va_decode-don-t-change-the-resource-sta.patch
Description: 0001-avcodec-d3d12va_decode-don-t-change-the-resource-sta.patch

[FFmpeg-devel] Re: [PATCH 1/5] avcodec/d3d12va_vp9: fix vp9 max_num_refs value

2023-12-25 Thread Wu Jianhua

Tong Wu:
> Subject: [FFmpeg-devel] [PATCH 1/5] avcodec/d3d12va_vp9: fix vp9 max_num_refs
> value
>
> Previous max_num_refs was based on pp.frame_refs plus 1 and it could possibly
> reach the size limit. Actually it should be the size of pp.ref_frame_map
> plus 1.
>
> Signed-off-by: Tong Wu 
> ---
>  libavcodec/d3d12va_vp9.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavcodec/d3d12va_vp9.c b/libavcodec/d3d12va_vp9.c
> index bb94e18781..d6dfc905d9 100644
> --- a/libavcodec/d3d12va_vp9.c
> +++ b/libavcodec/d3d12va_vp9.c
> @@ -148,7 +148,7 @@ static int d3d12va_vp9_decode_init(AVCodecContext *avctx)
>          break;
>      };
>
> -    ctx->max_num_ref = FF_ARRAY_ELEMS(pp.frame_refs) + 1;
> +    ctx->max_num_ref = FF_ARRAY_ELEMS(pp.ref_frame_map) + 1;
>
>      return ff_d3d12va_decode_init(avctx);
>  }
> --
> 2.41.0.windows.1

LGTM. I tested this fix with both the command line and the C API, and it fixes
the VP9 decoding issue where the decoded reference frames of some samples were
corrupted.
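To make the size difference concrete, here is a small sketch. It assumes the picture parameters hold 8 reference map slots and 3 per-frame references, as in VP9. `MockPicParamsVP9`, `MockPicEntry`, and `ARRAY_ELEMS` are hypothetical stand-ins for the DXVA structure and for FF_ARRAY_ELEMS.

```c
#include <assert.h>
#include <stddef.h>

/* Local equivalent of an array-length macro; assumed, not the FFmpeg header. */
#define ARRAY_ELEMS(a) (sizeof(a) / sizeof((a)[0]))

/* Hypothetical stand-ins for the DXVA picture entry and VP9 parameters. */
typedef struct MockPicEntry { unsigned char index; } MockPicEntry;

typedef struct MockPicParamsVP9 {
    MockPicEntry ref_frame_map[8]; /* every reference slot the stream can hold */
    MockPicEntry frame_refs[3];    /* references used by the current frame */
} MockPicParamsVP9;

/* Old sizing: only the references of one frame, plus the current picture. */
static size_t max_num_ref_old(void)
{
    MockPicParamsVP9 pp;
    return ARRAY_ELEMS(pp.frame_refs) + 1;
}

/* New sizing: all reference slots, plus the current picture. */
static size_t max_num_ref_new(void)
{
    MockPicParamsVP9 pp;
    return ARRAY_ELEMS(pp.ref_frame_map) + 1;
}
```

Under these assumptions the old value is 4 and the new one is 9, so a stream that keeps more than three distinct reference slots alive no longer runs past the limit.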

Thanks,
Jianhua

Re: [FFmpeg-devel] [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and AV1 decoding

2022-11-11 Thread Wu Jianhua

> From: Lynne
> Sent: October 15, 2022 13:16
> To: FFmpeg development discussions and patches
> Subject: Re: [FFmpeg-devel] [PATCH] avcodec: add D3D12VA hardware accelerated 
> H264, HEVC, VP9, and AV1 decoding
>
>Oct 14, 2022, 14:32 by toq...@outlook.com:
>
>> Lynne wrote:
>>
>>> Sent: October 14, 2022 6:28
>>> To: FFmpeg development discussions and patches
>>> Subject: Re: [FFmpeg-devel] [PATCH] avcodec: add D3D12VA hardware
>>> accelerated H264, HEVC, VP9, and AV1 decoding
>>>
>>> Oct 13, 2022, 17:48 by toq...@outlook.com:
>>>
>>>>> Lynne wrote:
>>>>>
>>>>> Oct 12, 2022, 13:09 by toq...@outlook.com:
>>>>>
>>>>>> [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and
>>>>>> AV1 decoding
>>>>>>
>>>>>> Patch attached.
>>>>>>
>>>>> The Sync locking functions and the queue locking functions should
>>>>> be a function pointer in the device/frame context. Vulkan has
>>>>> the same issue, and that's how I did it there. This allows for
>>>>> API users to plug their own locking primitives in, which they need
>>>>> to in case they initialize their own contexts.
>>>>>
>>>>
>>>> I don’t need to follow your design.
>>>>
>>>
>>> Yes, you do, because it's not my design, it's the design of the entire
>>> hwcontext system. If API users cannot initialize a hwcontext with
>>> their own, it's breakage. It's not optional.
>>> Locking primitives are a part of this.
>>> Either fix this or this is simply not getting merged.
>>>
>>
>> As I learn from the docs of Direct3D12, it is a multithreading-friendly API,
>> which means that the only mandatory field, device, set by the user can be
>> accessed and used without any lock. Why do the API users need locking
>> primitives to init the hwcontext_d3d12va? Have you got the initialization
>> failed? If so, could you share the runtime error details here?
>>
>> And I checked the other hwcontexts, they didn't have a locking primitive 
>> also.
>> The hwcontext_d3d11va has a locking used to protect accesses to 
>> device_context
>> and video_context calls, which d3d12 doesn't need. So why the API users 
>> cannot
>> initialize the hwcontext with their own? Is there any real failure case? 
>> Please pardon
>> I don't quite understand what situation you are talking about, could you 
>> elaborate
>> further if I get it wrong?
>>
>>>>> You should also document which fields API users have to set
>>>>> themselves if they plan to use their own context.
>>>>>
>>>>
>>>> Where should I document them? Aren’t the comments enough?
>>>>
>>> In the comments. Look at how it's done elsewhere.
>>>
>> There is already a comment like what d3d11va wrote:
>>  /**
>> * Device used for objects creation and access. This can also be
>>  * used to set the libavcodec decoding device.
>>  *
>>  * Can be set by the user. This is the only mandatory field - the other
>>  * device context fields are set from this and are available for convenience.
>>  *
>>  * Deallocating the AVHWDeviceContext will always release this interface,
>>  * and it does not matter whether it was user-allocated.
>>  */
>>  ID3D12Device *device;
>>
>> Is there anything else missing?
>>
>
> What about the frames context? Frame contexts are also user-settable.

Yeah, those comments are missing. I will add them. Thanks.

>>>>> Also, struct names in the public context lack an AV prefix.
>>>>>
>>>> Will fix. And which struct? Could you add the reference?
>>>>
>>>
>>> In the main public context.
>>>
>> I check the codes and found that AVD3D12VASyncContext, 
>> AVD3D12VADeviceContext,
>> AVD3D12FrameDescriptor and AVD3D12VAFramesContext are the structs in
>> the hwcontext_d3d12va. Is there any other struct without AV prefix? Could you
>> paste the name here?
>>
>>>>> D3D12VA_MAX_SURFACES is a terrible hack. Vendors should
>>>>> fix their own drivers rather than users running out of memory.
>>>>>
>>>>
>>>> Not my responsibility as a personal developer. I know nothing
>>>> about the drivers. You can ask those vendors to fix them. I don’t
>>>> think it’s a `terrible hack`. On my test, The MAX_SURFACES is
>>>> enough for the decoder. If there are any docs or the drivers fixed
>>>> it, just simply remove it. Why user will run out of memory?
>>>>
>>>
>>> The whole way the hwcontext is designed is sketchy. Why are
>>> you keeping all texture information in an array (texture_infos)?
>>> Frames are already pooled (it's called AVFramePool), so you're
>>> not doing anything by adding a second layer on top of this other
>>> than complexity.
>>> The initial_pool_size parameter was a hack added to support
>>> hwcontexts which cannot allocate surfaces dynamically, you don't
>>> need to support this at all, you can just let users allocate
>>> frames as they need to rather than preinitializing.
>>>
>>
>> It’s the same implementation as d3d11va. The static initial_pool_size is
>> needed by the decoder heap to initialize and the input

Re: [FFmpeg-devel] [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and AV1 decoding

2022-11-11 Thread Wu Jianhua

From: ffmpeg-devel on behalf of Lynne
Sent: Saturday, October 15, 2022 1:16:18 PM
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH] avcodec: add D3D12VA hardware accelerated
H264, HEVC, VP9, and AV1 decoding

Oct 14, 2022, 14:32 by toq...@outlook.com:

> Lynne wrote:
>
>> Sent: October 14, 2022 6:28
>> To: FFmpeg development discussions and patches
>> Subject: Re: [FFmpeg-devel] [PATCH] avcodec: add D3D12VA hardware
>> accelerated H264, HEVC, VP9, and AV1 decoding
>>
>> Oct 13, 2022, 17:48 by toq...@outlook.com:
>>
>>>> Lynne wrote:
>>>>
>>>> Oct 12, 2022, 13:09 by toq...@outlook.com:
>>>>
>>>>> [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and
>>>>> AV1 decoding
>>>>>
>>>>> Patch attached.
>>>>>
>>>> The Sync locking functions and the queue locking functions should
>>>> be a function pointer in the device/frame context. Vulkan has
>>>> the same issue, and that's how I did it there. This allows for
>>>> API users to plug their own locking primitives in, which they need
>>>> to in case they initialize their own contexts.
>>>>
>>>
>>> I don’t need to follow your design.
>>>
>>
>> Yes, you do, because it's not my design, it's the design of the entire
>> hwcontext system. If API users cannot initialize a hwcontext with
>> their own, it's breakage. It's not optional.
>> Locking primitives are a part of this.
>> Either fix this or this is simply not getting merged.
>>
>
> As I learn from the docs of Direct3D12, it is a multithreading-friendly API,
> which means that the only mandatory field, device, set by the user can be
> accessed and used without any lock. Why do the API users need locking
> primitives to init the hwcontext_d3d12va? Have you got the initialization
> failed? If so, could you share the runtime error details here?
>
> And I checked the other hwcontexts, they didn't have a locking primitive also.
> The hwcontext_d3d11va has a locking used to protect accesses to device_context
> and video_context calls, which d3d12 doesn't need. So why the API users cannot
> initialize the hwcontext with their own? Is there any real failure case? 
> Please pardon
> I don't quite understand what situation you are talking about, could you 
> elaborate
> further if I get it wrong?
>
>>>> You should also document which fields API users have to set
>>>> themselves if they plan to use their own context.
>>>>
>>>
>>> Where should I document them? Doesn’t the comments enough?
>>>
>> In the comments. Look at how it's done elsewhere.
>>
> There is already a comment like what d3d11va wrote:
>  /**
>  * Device used for objects creation and access. This can also be
>  * used to set the libavcodec decoding device.
>  *
>  * Can be set by the user. This is the only mandatory field - the other
>  * device context fields are set from this and are available for convenience.
>  *
>  * Deallocating the AVHWDeviceContext will always release this interface,
>  * and it does not matter whether it was user-allocated.
>  */
>  ID3D12Device *device;
>
> Is there anything else missing?
>

What about the frames context? Frame contexts are also user-settable.


>>>> Also, struct names in the public context lack an AV prefix.
>>>>
>>> Will fix. And which struct? Could you add the reference?
>>>
>>
>> In the main public context.
>>
> I check the codes and found that AVD3D12VASyncContext, AVD3D12VADeviceContext,
> AVD3D12FrameDescriptor and AVD3D12VAFramesContext are the structs in
> the hwcontext_d3d12va. Is there any other struct without AV prefix? Could you
> paste the name here?
>
>>>> D3D12VA_MAX_SURFACES is a terrible hack. Vendors should
>>>> fix their own drivers rather than users running out of memory.
>>>>
>>>
>>> Not my responsibility as a personal developer. I know nothing
>>> about the drivers. You can ask those vendors to fix them. I don’t
>>> think it’s a `terrible hack`. On my test, The MAX_SURFACES is
>>> enough for the decoder. If there are any docs or the drivers fixed
>>> it, just simply remove it. Why user will run out of memory?
>>>
>>
>> The whole way the hwcontext is designed is sketchy. Why are
>> you keeping all texture information in an array (texture_infos)?
>> Frames are already pooled (it's called AVFramePool), so you're
>> not doing anything by adding a second layer on top of this other
>> than complexity.
>> The initial_pool_size parameter was a hack added to support
>> hwcontexts which cannot allocate surfaces dynamically, you don't
>> need to support this at all, you can just let users allocate
>> frames as they need to rather than preinitializing.
>>
>
> It’s the same implementation as d3d11va. The static initial_pool_size is
> needed by the decoder heap to initialize and the input stream needs it
> as well. The feature that allows the user to allocate frames is not the basic
> functionalities of decoding. I recommend the user who needs the

Re: [FFmpeg-devel] [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and AV1 decoding

2022-10-14 Thread Wu Jianhua

Lynne wrote:
> Sent: October 14, 2022 6:28
> To: FFmpeg development discussions and patches
> Subject: Re: [FFmpeg-devel] [PATCH] avcodec: add D3D12VA hardware accelerated 
> H264, HEVC, VP9, and AV1 decoding
>
> Oct 13, 2022, 17:48 by toq...@outlook.com:
>
>>> Lynne wrote:
>>>
>>> Oct 12, 2022, 13:09 by toq...@outlook.com:
>>>
>>>> [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and AV1
>>>> decoding
>>>>
>>>> Patch attached.
>>>>
>>> The Sync locking functions and the queue locking functions should
>>> be a function pointer in the device/frame context. Vulkan has
>>> the same issue, and that's how I did it there. This allows for
>>> API users to plug their own locking primitives in, which they need
>>> to in case they initialize their own contexts.
>>>
>>
>> I don’t need to follow your design.
>>
>
> Yes, you do, because it's not my design, it's the design of the entire
> hwcontext system. If API users cannot initialize a hwcontext with
> their own, it's breakage. It's not optional.
> Locking primitives are a part of this.
> Either fix this or this is simply not getting merged.
>

As I understand from the Direct3D 12 docs, it is a multithreading-friendly API,
which means that the only mandatory field, device, set by the user can be
accessed and used without any lock. Why do API users need locking
primitives to initialize hwcontext_d3d12va? Did the initialization fail for
you? If so, could you share the runtime error details here?

I also checked the other hwcontexts, and they don't have locking primitives
either. hwcontext_d3d11va has a lock that protects accesses to device_context
and video_context calls, which d3d12 doesn't need. So why can't API users
initialize the hwcontext with their own context? Is there a real failure case?
Pardon me, I don't quite understand which situation you are talking about.
Could you elaborate if I got it wrong?
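For context, the kind of user-pluggable locking being requested might look roughly like the sketch below. It is an illustration only: `MockDeviceContext`, `CountingLock`, and the `mock_*` functions are hypothetical and are not part of hwcontext_d3d12va or of the patch. The user may install their own lock/unlock callbacks, and no-op defaults are filled in otherwise.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical device context with user-settable locking hooks. */
typedef struct MockDeviceContext {
    void (*lock)(void *lock_ctx);
    void (*unlock)(void *lock_ctx);
    void *lock_ctx;
} MockDeviceContext;

/* Example user-side primitive that only counts, to keep the sketch portable. */
typedef struct CountingLock {
    int depth;
    int acquisitions;
} CountingLock;

static void counting_lock(void *opaque)
{
    CountingLock *l = opaque;
    l->depth++;
    l->acquisitions++;
}

static void counting_unlock(void *opaque)
{
    CountingLock *l = opaque;
    assert(l->depth > 0);
    l->depth--;
}

static void noop_lock(void *opaque)
{
    (void)opaque;
}

/* Install defaults only where the user left a hook unset. */
static void mock_device_init(MockDeviceContext *ctx)
{
    if (!ctx->lock)
        ctx->lock = noop_lock;
    if (!ctx->unlock)
        ctx->unlock = noop_lock;
}

/* Any access to shared state goes through the hooks. */
static int mock_submit(MockDeviceContext *ctx, int work)
{
    int result;

    ctx->lock(ctx->lock_ctx);
    result = work * 2; /* placeholder for work on shared state */
    ctx->unlock(ctx->lock_ctx);
    return result;
}
```

A caller that shares the device with its own renderer would set `lock`, `unlock`, and `lock_ctx` before initialization; a caller that leaves them NULL gets the no-op defaults.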

>
>>> You should also document which fields API users have to set
>>> themselves if they plan to use their own context.
>>>
>>
>> Where should I document them? Doesn’t the comments enough?
>>

> In the comments. Look at how it's done elsewhere.

There is already a comment like the one in d3d11va:
/**
 * Device used for objects creation and access. This can also be
 * used to set the libavcodec decoding device.
 *
 * Can be set by the user. This is the only mandatory field - the other
 * device context fields are set from this and are available for convenience.
 *
 * Deallocating the AVHWDeviceContext will always release this interface,
 * and it does not matter whether it was user-allocated.
 */
ID3D12Device *device;

Is there anything else missing?

>>> Also, struct names in the public context lack an AV prefix.
>>>
>> Will fix. And which struct? Could you add the reference?
>>
>
> In the main public context.
>
I checked the code and found that AVD3D12VASyncContext, AVD3D12VADeviceContext,
AVD3D12FrameDescriptor and AVD3D12VAFramesContext are the structs in
hwcontext_d3d12va. Is there any other struct without an AV prefix? Could you
paste the name here?

>>> D3D12VA_MAX_SURFACES is a terrible hack. Vendors should
>>> fix their own drivers rather than users running out of memory.
>>>
>>
>> Not my responsibility as a personal developer. I know nothing
>> about the drivers. You can ask those vendors to fix them. I don’t
>> think it’s a `terrible hack`. On my test, The MAX_SURFACES is
>> enough for the decoder. If there are any docs or the drivers fixed
>> it, just simply remove it. Why user will run out of memory?
>>
>
> The whole way the hwcontext is designed is sketchy. Why are
> you keeping all texture information in an array (texture_infos)?
> Frames are already pooled (it's called AVFramePool), so you're
> not doing anything by adding a second layer on top of this other
> than complexity.
> The initial_pool_size parameter was a hack added to support
> hwcontexts which cannot allocate surfaces dynamically, you don't
> need to support this at all, you can just let users allocate
> frames as they need to rather than preinitializing.
>

It’s the same implementation as d3d11va. The static initial_pool_size is
needed by the decoder heap to initialize, and the input stream needs it
as well. Letting the user allocate frames is not a basic decoding
functionality. I recommend that users who need the feature implement it
and contribute the code.

>>> Also, you have code style issues, don't wrap one-line if statements
>>> or loops in brackets.
>>>
>> Will fix. And which loop? Could you add the reference?
>>
>>> ff_d3d12dec_get_suitable_max_bitstream_size is an awful function.
>>> It does float math for sizes and has a magic mult factor of 1.5.
>>> You have to calculate this properly.
>>>
>> It simply calculate the size of NV12 and P010. Will add comment.
>>
>
>Do it _properly_. We have utilities for

Re: [FFmpeg-devel] [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and AV1 decoding

2022-10-13 Thread Wu Jianhua

James Almer wrote:
> On 10/13/2022 12:48 PM, Wu Jianhua wrote:
>>> Lynne wrote:
>>
>>> Oct 12, 2022, 13:09 by toq...@outlook.com:
>>
>>>> [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and AV1 
>>>> decoding
>>>>
>>>> Patch attached.
>>>>
>>
>>> The Sync locking functions and the queue locking functions should
>>> be a function pointer in the device/frame context. Vulkan has
>>> the same issue, and that's how I did it there. This allows for
>>> API users to plug their own locking primitives in, which they need
>>> to in case they initialize their own contexts.
>>
>> I don’t need to follow your design.
>>
>>> You should also document which fields API users have to set
>>> themselves if they plan to use their own context.
>>
>> Where should I document them? Doesn’t the comments enough?
>>
>>> Also, struct names in the public context lack an AV prefix.
>> Will fix. And which struct? Could you add the reference?
>>
>>> D3D12VA_MAX_SURFACES is a terrible hack. Vendors should
>>> fix their own drivers rather than users running out of memory.
>>
>> Not my responsibility as a personal developer. I know nothing
>> about the drivers. You can ask those vendors to fix them. I don’t
>> think it’s a `terrible hack`. On my test, The MAX_SURFACES is
>> enough for the decoder. If there are any docs or the drivers fixed
>> it, just simply remove it. Why user will run out of memory?
>>
>>> Also, you have code style issues, don't wrap one-line if statements
>>> or loops in brackets.
>> Will fix. And which loop? Could you add the reference?
>>
>>> ff_d3d12dec_get_suitable_max_bitstream_size is an awful function.
>>> It does float math for sizes and has a magic mult factor of 1.5.
>>> You have to calculate this properly.
>> It simply calculates the size of NV12 and P010. Will add a comment.
>
> Then you should probably use imgutils.h functions for that, and/or
> AVPixFmtDescriptor from pixdesc.h.

Great! Thanks a lot for the details. I will take a look at how to do
that better.
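
For readers following the thread: the 1.5 factor comes from 4:2:0 subsampling, and the same size can be computed with integers only. What follows is a minimal sketch, not the code from the patch. `PlanarDesc`, `ceil_rshift` and `semiplanar_frame_size` are hypothetical names; the struct stands in for the chroma shifts and sample size that an `AVPixFmtDescriptor` carries. Inside FFmpeg itself, `av_image_get_buffer_size()` from libavutil/imgutils.h is the existing utility to prefer.

```c
#include <stddef.h>

/* Hypothetical stand-in for the fields a pixel format descriptor provides. */
typedef struct PlanarDesc {
    int log2_chroma_w;    /* horizontal chroma shift: 1 for 4:2:0 */
    int log2_chroma_h;    /* vertical chroma shift:   1 for 4:2:0 */
    int bytes_per_sample; /* 1 for NV12 (8-bit), 2 for P010 (10 bits stored in 16) */
} PlanarDesc;

/* Round-up right shift, so odd dimensions still get a full chroma sample. */
static size_t ceil_rshift(size_t v, int shift)
{
    return (v + ((size_t)1 << shift) - 1) >> shift;
}

/* Size in bytes of one luma plane plus one interleaved CbCr plane. */
static size_t semiplanar_frame_size(const PlanarDesc *d, size_t w, size_t h)
{
    size_t luma   = w * h;
    size_t chroma = 2 * ceil_rshift(w, d->log2_chroma_w)
                      * ceil_rshift(h, d->log2_chroma_h);
    return (luma + chroma) * (size_t)d->bytes_per_sample;
}
```

With both shifts set to 1, the chroma plane is half the luma plane for even dimensions, which is where the 3/2 ratio comes from; no floating point is needed.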


___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and AV1 decoding

2022-10-13 Thread Wu Jianhua

> Lynne wrote:

> Oct 12, 2022, 13:09 by toq...@outlook.com:

>> [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and AV1 
>> decoding
>>
>> Patch attached.
>>

> The Sync locking functions and the queue locking functions should
> be a function pointer in the device/frame context. Vulkan has
> the same issue, and that's how I did it there. This allows for
> API users to plug their own locking primitives in, which they need
> to in case they initialize their own contexts.

I don’t need to follow your design.

> You should also document which fields API users have to set
> themselves if they plan to use their own context.

Where should I document them? Aren't the comments enough?

> Also, struct names in the public context lack an AV prefix.
Will fix. And which struct? Could you add the reference?

> D3D12VA_MAX_SURFACES is a terrible hack. Vendors should
> fix their own drivers rather than users running out of memory.

That is not my responsibility as an individual developer. I know nothing
about the drivers; you can ask the vendors to fix them. I don't
think it's a `terrible hack`. In my tests, MAX_SURFACES is
enough for the decoder. If documentation appears or the drivers get
fixed, it can simply be removed. Why would users run out of memory?

> Also, you have code style issues, don't wrap one-line if statements
> or loops in brackets.
Will fix. And which loop? Could you add the reference?

> ff_d3d12dec_get_suitable_max_bitstream_size is an awful function.
> It does float math for sizes and has a magic mult factor of 1.5.
> You have to calculate this properly.
It simply calculates the size of NV12 and P010. Will add a comment.

> On a first look, this is what stands out. Really must be split apart
> in patches.

I have already said that I will split it.



Re: [FFmpeg-devel] [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and AV1 decoding

2022-10-12 Thread Wu Jianhua

James Almer<mailto:jamr...@gmail.com> wrote:
> On 10/12/2022 12:12 PM, Jean-Baptiste Kempf wrote:
>> Hello,
>>
>> You really need to split this patch into logical patches.
>>
>> jb
>
> To expand on this, you should split into the following: One patch adding
> the hwcontext and pixel format to libavutil plus an entry in the
> APIChanges file mentioning the new public symbols and defines, then one
> patch per new hwaccel in libavcodec, the first of which should contain
> the shared code, and the last one the Changelog entry and the version
> bump (The changes in configure should of course be split across said
> patches).
>
> Compilation must succeed for every patch, hence the above suggested order.
>

I’ll try to do that. Thanks for the advice.

>>
>> On Wed, 12 Oct 2022, at 13:09, Wu Jianhua wrote:
>>> [PATCH] avcodec: add D3D12VA hardware accelerated H264, HEVC, VP9, and
>>> AV1 decoding
>>>
>>> Patch attached.
>>>
>>>



Re: [FFmpeg-devel] [PATCH] Fix AVX-512-VNNI_hevc_qpel_filters_avx512icl

2022-04-28 Thread Wu Jianhua

> Felix LeClair:
> Sent: April 29, 2022 1:17
> To: ffmpeg-devel@ffmpeg.org
> Subject: [FFmpeg-devel] [PATCH] Fix AVX-512-VNNI_hevc_qpel_filters_avx512icl
>
> This commit fixes the above by swapping lines 1796 and 1795, moving the
> define out of the conditional
>
> subsrcq, tmpq
>  sub myq, 1
>  shl myq, 5
> -%ifdef PIC
>  %define %%table hevc_qpel_filters_avx512icl_v_%1
>+%ifdef PIC
>  lea tmpq, [%%table]
> %define FILTER tmpq
> %else
>--

LGTM. Thanks for the fix!

Best regards,
Jianhua
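
The same pitfall can be shown with the C preprocessor. This is a minimal sketch by analogy only, not the assembly from the patch; `qpel_filters_v`, `TABLE` and `filter_base` are hypothetical names and the table values are placeholders. The point is that an alias used by both branches of a conditional has to be defined before the conditional, not inside one branch.

```c
#include <stdint.h>

/* Placeholder table; the values are illustrative only. */
static const int8_t qpel_filters_v[4] = { 1, 2, 3, 4 };

/* Define the alias before the conditional so both branches can use it.
 * If this line sat inside the #ifdef, the #else branch would not compile. */
#define TABLE qpel_filters_v

#ifdef PIC
/* Position-independent build: go through a pointer loaded at run time. */
static const int8_t *filter_base(void)
{
    const int8_t *tmp = TABLE;
    return tmp;
}
#else
/* Non-PIC build: reference the table directly. */
static const int8_t *filter_base(void)
{
    return TABLE;
}
#endif
```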




Re: [FFmpeg-devel] [PATCH v2 5/5] avcodec/x86/hevc_mc: add qpel_h64_8_avx512icl

2022-04-14 Thread Wu Jianhua

Ping!
Wu Jianhua:
>Henrik Gramner:
>> Sent: Friday, March 11, 2022 10:51 PM
>> To: FFmpeg development discussions and patches > devel at ffmpeg.org>
>> Subject: Re: [FFmpeg-devel] [PATCH v2 5/5] avcodec/x86/hevc_mc: add
>> qpel_h64_8_avx512icl
>> 
>> All 5/5 LGTM.
>> 
>
>Hi there,
>
>Are there any more comments or objections here?  
>If not, could someone help push this patchset?
>
>Many thanks!
>Jianhua
>

Re: [FFmpeg-devel] [PATCH v2 5/5] avcodec/x86/hevc_mc: add qpel_h64_8_avx512icl

2022-03-17 Thread Wu, Jianhua

Henrik Gramner:
> Sent: Friday, March 11, 2022 10:51 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v2 5/5] avcodec/x86/hevc_mc: add
> qpel_h64_8_avx512icl
> 
> All 5/5 LGTM.
> 

Hi there,

Are there any more comments or objections here?  
If not, could someone help push this patchset?

Many thanks!
Jianhua


Re: [FFmpeg-devel] [PATCH 2/6] avcodec/x86/hevc_mc: add qpel_h8_8_avx512icl and qpel_hv8_8_avx512icl

2022-03-10 Thread Wu, Jianhua

Henrik Gramner:
> From: ffmpeg-devel  On Behalf Of
> Henrik Gramner
> Sent: Thursday, March 10, 2022 11:22 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 2/6] avcodec/x86/hevc_mc: add
> qpel_h8_8_avx512icl and qpel_hv8_8_avx512icl
> 
> On Wed, Feb 23, 2022 at 9:58 AM 
> wrote:
> > +%macro HEVC_PUT_HEVC_QPEL_AVX512ICL 2
> > [...]
> > +vpmovdw xm6, m6
> > +movu [dstq], xm6
> 
> vpmovdw can take a memory operand as dst directly:
> vpmovdw  [dstq], m6
> 
> (the same applies to the hv function)
> 
> > +%macro HEVC_PUT_HEVC_QPEL_HV_AVX512ICL 2
> > +cglobal hevc_put_hevc_qpel_hv%1_%2, 6, 7, 8, dst, src, srcstride, height, mx, my, tmp
> 
> This function uses 27(?) vector registers but only specifies 8, so it will
> break on Windows unless corrected.
> 
> > +if (EXTERNAL_AVX512ICL(cpu_flags)) {
> > +c->put_hevc_qpel[3][0][1] = ff_hevc_put_hevc_qpel_h8_8_avx512icl;
> > +c->put_hevc_qpel[3][1][1] = ff_hevc_put_hevc_qpel_hv8_8_avx512icl;
> > +}
> 
> Needs an ARCH_X86_64 guard as the code is 64-bit only.
> 

Thanks for the careful review. I updated a version 2 here:
http://ffmpeg.org/pipermail/ffmpeg-devel/2022-March/293872.html
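
For readers unfamiliar with the instruction in the first review comment: vpmovdw narrows each 32-bit lane to 16 bits by truncation, and with a memory destination the narrowed lanes are stored directly, so the separate movu is not needed. A scalar sketch of that behaviour follows; `pmovdw_store` is a hypothetical helper name, not an FFmpeg function.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of vpmovdw with a memory destination: each 32-bit lane is
 * truncated to its low 16 bits and written straight to dst. */
static void pmovdw_store(uint16_t *dst, const uint32_t *src, size_t lanes)
{
    for (size_t i = 0; i < lanes; i++)
        dst[i] = (uint16_t)(src[i] & 0xFFFF);
}
```

Note that this is plain truncation; the saturating variants are separate instructions.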


Re: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition blend mode

2022-03-10 Thread Wu Jianhua

Paul B Mahol<mailto:one...@gmail.com>:
>Sent: March 10, 2022 20:01
>To: FFmpeg development discussions and patches<mailto:ffmpeg-devel@ffmpeg.org>
>Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition 
>blend mode
>
>On Thu, Mar 10, 2022 at 12:58 PM Wu Jianhua  wrote:
>
>> Lynne<mailto:d...@lynne.ee>:
>> >Sent: March 10, 2022 18:42
>> >To: FFmpeg development discussions and patches<mailto:ffmpeg-devel@ffmpeg.org>
>> >Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add
>> addition blend mode
>> >
>> >9 Mar 2022, 19:19 by toq...@outlook.com:
>> >
>> >> Lynne<mailto:d...@lynne.ee>:
>> >> >Sent: March 10, 2022 1:43
>> >> >To: FFmpeg development discussions and patches<mailto:ffmpeg-devel@ffmpeg.org>
>> >> >Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add
>> addition blend mode
>> >>
>> >>>
>> >>>
>> >>>> Ping.
>> >>>>
>> >> >>>From: Wu, Jianhua<mailto:jianhua.wu-at-intel@ffmpeg.org>
>> >>
>> >>>> >Sent: February 25, 2022 21:11
>> >>>>
>> >>>>>To: ffmpeg-devel@ffmpeg.org<mailto:ffmpeg-devel@ffmpeg.org>
>> >>>>>Subject: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add
>> addition blend mode
>> >>
>> >>>>
>> >>>>
>> >>> >
>> >>>
>> >> >None of them apply. Could you rebase them onto current git master
>> >> I didn't see any new commits to vf_blend_vulkan.c. Is there any
>> >> conflict?
>> >>
>> >
>> >Patches lack SHA1 ancestor, so they can't be applied. You must've made
>> >them from a WIP repo.
>> >
>> >
>> >>> and squash them into 1 commit?
>> >>>
>> >> Nope. This doesn't break the contribution rules. My intention is to keep
>> >> each commit message as clear as I can. That helps you test each mode
>> >> quickly, since the modes are completely different things, and it helps
>> >> other people who want to use one specific mode.
>> >>
>> >
>> >It's just a few very obvious lines per patch, anyone can plainly
>> >see how new modes are added. I can squash them when I push,
>> >but if you did it, it would save me from having to download 10
>> >patches manually.
>> >
>>
>> Hi Lynne,
>>
>> Could you help with that? I sent the patches as attachments, so they
>> should be easy to download. The procedure for resending patches is too
>> complicated and verbose for me.
>>
>
>There is nothing to fix patches that does not apply.
>
>Just send new patches for current master.
>

Sorry, I neglected the error message offered above. Thanks for this reminder!

Re: [FFmpeg-devel] [PATCH v2] avfilter: add shader_vulkan filter

2022-03-10 Thread Wu Jianhua

Ping.
>From: Wu, Jianhua<mailto:jianhua.wu-at-intel@ffmpeg.org>
>Sent: March 4, 2022 23:09
>To: ffmpeg-devel@ffmpeg.org<mailto:ffmpeg-devel@ffmpeg.org>
>Subject: [FFmpeg-devel] [PATCH v2] avfilter: add shader_vulkan filter
>
> [PATCH 1/2] avfilter: add shader_vulkan filter
>
> [PATCH 2/2] avutil/vulkan: print correct text for shader_vulkan filter
>
>Patches attached.
>


Re: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition blend mode

2022-03-10 Thread Wu Jianhua

Lynne<mailto:d...@lynne.ee>:
>Sent: March 10, 2022 18:42
>To: FFmpeg development discussions and patches<mailto:ffmpeg-devel@ffmpeg.org>
>Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition 
>blend mode
>
>9 Mar 2022, 19:19 by toq...@outlook.com:
>
>> Lynne<mailto:d...@lynne.ee>:
>> >Sent: March 10, 2022 1:43
>> >To: FFmpeg development discussions and 
>> >patches<mailto:ffmpeg-devel@ffmpeg.org>
>> >Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition 
>> >blend mode
>>
>>>
>>>
>>>> Ping.
>>>>
>> >>>From: Wu, Jianhua<mailto:jianhua.wu-at-intel@ffmpeg.org>
>>
>>>> >Sent: February 25, 2022 21:11
>>>>
>>>>>To: ffmpeg-devel@ffmpeg.org<mailto:ffmpeg-devel@ffmpeg.org>
>>>>>Subject: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition 
>>>>>blend mode
>>
>>>>
>>>>
>>> >
>>>
>> >None of them apply. Could you rebase them onto current git master
>> I didn't see any new commits to vf_blend_vulkan.c. Is there any conflict?
>>
>
>Patches lack SHA1 ancestor, so they can't be applied. You must've made
>them from a WIP repo.
>
>
>>> and squash them into 1 commit?
>>>
>> Nope. This doesn't break the contribution rules. My intention is to keep
>> each commit message as clear as I can. That helps you test each mode
>> quickly, since the modes are completely different things, and it helps
>> other people who want to use one specific mode.
>>
>
>It's just a few very obvious lines per patch, anyone can plainly
>see how new modes are added. I can squash them when I push,
>but if you did it, it would save me from having to download 10
>patches manually.
>

Hi Lynne,

Could you help with that? I sent the patches as attachments, so they
should be easy to download. The procedure for resending patches is too
complicated and verbose for me.

Thanks,
Jianhua


Re: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition blend mode

2022-03-09 Thread Wu Jianhua

Lynne<mailto:d...@lynne.ee>:
>Sent: March 10, 2022 1:43
>To: FFmpeg development discussions and patches<mailto:ffmpeg-devel@ffmpeg.org>
>Subject: Re: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition 
>blend mode
>
>
>> Ping.
>>
>>>From: Wu, Jianhua<mailto:jianhua.wu-at-intel@ffmpeg.org>
>> >Sent: February 25, 2022 21:11
>>>To: ffmpeg-devel@ffmpeg.org<mailto:ffmpeg-devel@ffmpeg.org>
>>>Subject: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition blend 
>>>mode
>
>>
>>
> >
>
>None of them apply. Could you rebase them onto current git master
I didn't see any new commits to vf_blend_vulkan.c. Is there any conflict?

> and squash them into 1 commit?
Nope. This doesn't break the contribution rules. My intention is to keep
each commit message as clear as I can. That helps you test each mode
quickly, since the modes are completely different things, and it helps
other people who want to use one specific mode.

Best regards,
Jianhua


Re: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition blend mode

2022-03-09 Thread Wu Jianhua

Ping.

>From: Wu, Jianhua<mailto:jianhua.wu-at-intel@ffmpeg.org>
>Sent: February 25, 2022 21:11
>To: ffmpeg-devel@ffmpeg.org<mailto:ffmpeg-devel@ffmpeg.org>
>Subject: [FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition blend 
>mode
>
>[PATCH 01/10] avfilter/vf_blend_vulkan: add addition blend mode
>[PATCH 02/10] avfilter/vf_blend_vulkan: add average blend mode
>[PATCH 03/10] avfilter/vf_blend_vulkan: add subtract blend mode
>[PATCH 04/10] avfilter/vf_blend_vulkan: add negation blend mode
>[PATCH 05/10] avfilter/vf_blend_vulkan: add extremity blend mode
>[PATCH 06/10] avfilter/vf_blend_vulkan: add difference blend mode
>[PATCH 07/10] avfilter/vf_blend_vulkan: add darken blend mode
>[PATCH 08/10] avfilter/vf_blend_vulkan: add lighten blend mode
>[PATCH 09/10] avfilter/vf_blend_vulkan: add exclusion blend mode
>[PATCH 10/10] avfilter/vf_blend_vulkan: add phoenix blend mode
>
> Patches attached.


Re: [FFmpeg-devel] [PATCH 1/6] avutil/cpu: add AVX512 Icelake flag

2022-03-08 Thread Wu, Jianhua

Ping.
> From: Wu, Jianhua
> Sent: Wednesday, March 2, 2022 1:34 PM
> To: ffmpeg-devel@ffmpeg.org
> Subject: RE: [PATCH 1/6] avutil/cpu: add AVX512 Icelake flag
> 
> Ping.
> > From: Wu, Jianhua 
> > Sent: Wednesday, February 23, 2022 4:58 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Wu, Jianhua 
> > Subject: [PATCH 1/6] avutil/cpu: add AVX512 Icelake flag
> >
> > From: Wu Jianhua 
> >
> > Signed-off-by: Wu Jianhua 
> > ---
> >  configure | 13 +++---
> >  libavutil/cpu.c   |  1 +
> >  libavutil/cpu.h   |  1 +
> >  libavutil/x86/cpu.c   |  8 --
> >  libavutil/x86/cpu.h   |  1 +
> >  libavutil/x86/x86inc.asm  | 53
> > ---
> >  tests/checkasm/checkasm.c | 35 +-
> >  7 files changed, 63 insertions(+), 49 deletions(-)
> >
> > diff --git a/configure b/configure
> > index 1535dc3c5b..d88c2ae979 100755
> > --- a/configure
> > +++ b/configure
> > @@ -444,6 +444,7 @@ Optimization options (experts only):
> >--disable-fma4   disable FMA4 optimizations
> >--disable-avx2   disable AVX2 optimizations
> >--disable-avx512 disable AVX-512 optimizations
> > +  --disable-avx512icl  disable AVX-512ICL optimizations
> >--disable-aesni  disable AESNI optimizations
> >--disable-armv5tedisable armv5te optimizations
> >--disable-armv6  disable armv6 optimizations
> > @@ -2098,6 +2099,7 @@ ARCH_EXT_LIST_X86_SIMD="
> >  avx
> >  avx2
> >  avx512
> > +avx512icl
> >  fma3
> >  fma4
> >  mmx
> > @@ -2666,6 +2668,7 @@ fma3_deps="avx"
> >  fma4_deps="avx"
> >  avx2_deps="avx"
> >  avx512_deps="avx2"
> > +avx512icl_deps="avx512"
> >
> >  mmx_external_deps="x86asm"
> >  mmx_inline_deps="inline_asm x86"
> > @@ -6128,10 +6131,11 @@ EOF
> >  elf*) enabled debug && append X86ASMFLAGS $x86asm_debug ;;
> >  esac
> >
> > -enabled avx512 && check_x86asm avx512_external "vmovdqa32
> > [eax]{k1}{z}, zmm0"
> > -enabled avx2   && check_x86asm avx2_external   "vextracti128 xmm0,
> > ymm0, 0"
> > -enabled xop&& check_x86asm xop_external"vpmacsdd xmm0,
> > xmm1, xmm2, xmm3"
> > -enabled fma4   && check_x86asm fma4_external   "vfmaddps ymm0,
> > ymm1, ymm2, ymm3"
> > +enabled avx512&& check_x86asm avx512_external"vmovdqa32
> > [eax]{k1}{z}, zmm0"
> > +enabled avx512icl && check_x86asm avx512icl_external
> > + "vpdpwssds
> > zmm31{k1}{z}, zmm29, zmm28"
> > +enabled avx2  && check_x86asm avx2_external  "vextracti128
> > xmm0, ymm0, 0"
> > +enabled xop   && check_x86asm xop_external   "vpmacsdd 
> > xmm0,
> > xmm1, xmm2, xmm3"
> > +enabled fma4  && check_x86asm fma4_external  "vfmaddps
> ymm0,
> > ymm1, ymm2, ymm3"
> >  check_x86asm cpunop  "CPU amdnop"
> >  fi
> >
> > @@ -7471,6 +7475,7 @@ if enabled x86; then
> >  echo "AVX enabled   ${avx-no}"
> >  echo "AVX2 enabled  ${avx2-no}"
> >  echo "AVX-512 enabled   ${avx512-no}"
> > +echo "AVX-512ICL enabled${avx512icl-no}"
> >  echo "XOP enabled   ${xop-no}"
> >  echo "FMA3 enabled  ${fma3-no}"
> >  echo "FMA4 enabled  ${fma4-no}"
> > diff --git a/libavutil/cpu.c b/libavutil/cpu.c index
> > 1368502245..833c220192
> > 100644
> > --- a/libavutil/cpu.c
> > +++ b/libavutil/cpu.c
> > @@ -137,6 +137,7 @@ int av_parse_cpu_caps(unsigned *flags, const char
> *s)
> >  { "cmov", NULL, 0, AV_OPT_TYPE_CONST, { .i64 =
> > AV_CPU_FLAG_CMOV },.unit = "flags" },
> >  { "aesni",NULL, 0, AV_OPT_TYPE_CONST, { .i64 =
> > AV_CPU_FLAG_AESNI},.unit = "flags" },
> >  { "avx512"  , NULL, 0, AV_OPT_TYPE_CONST, { .i64 =
> > AV_CPU_FLAG_AVX512   },.unit = "flags" },
> > +{ "avx512icl",  NULL, 0, AV_OPT_TYPE_CONST, { .i64 =
> > AV_CPU_FLAG_AVX51

[FFmpeg-devel] [PATCH v2] avfilter: add shader_vulkan filter

2022-03-04 Thread Wu, Jianhua

[PATCH 1/2] avfilter: add shader_vulkan filter

[PATCH 2/2] avutil/vulkan: print correct text for shader_vulkan filter

Patches attached.



0001-avfilter-add-shader_vulkan-filter.patch
Description: 0001-avfilter-add-shader_vulkan-filter.patch


0002-avutil-vulkan-print-correct-text-for-shader_vulkan-f.patch
Description: 0002-avutil-vulkan-print-correct-text-for-shader_vulkan-f.patch

Re: [FFmpeg-devel] [PATCH 1/6] avutil/cpu: add AVX512 Icelake flag

2022-03-01 Thread Wu, Jianhua

Ping.
> -Original Message-
> From: Wu, Jianhua 
> Sent: Wednesday, February 23, 2022 4:58 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Wu, Jianhua 
> Subject: [PATCH 1/6] avutil/cpu: add AVX512 Icelake flag
> 
> From: Wu Jianhua 
> 
> Signed-off-by: Wu Jianhua 
> ---
>  configure | 13 +++---
>  libavutil/cpu.c   |  1 +
>  libavutil/cpu.h   |  1 +
>  libavutil/x86/cpu.c   |  8 --
>  libavutil/x86/cpu.h   |  1 +
>  libavutil/x86/x86inc.asm  | 53 ---
>  tests/checkasm/checkasm.c | 35 +-
>  7 files changed, 63 insertions(+), 49 deletions(-)
> 
> diff --git a/configure b/configure
> index 1535dc3c5b..d88c2ae979 100755
> --- a/configure
> +++ b/configure
> @@ -444,6 +444,7 @@ Optimization options (experts only):
>--disable-fma4   disable FMA4 optimizations
>--disable-avx2   disable AVX2 optimizations
>--disable-avx512 disable AVX-512 optimizations
> +  --disable-avx512icl  disable AVX-512ICL optimizations
>--disable-aesni  disable AESNI optimizations
>--disable-armv5tedisable armv5te optimizations
>--disable-armv6  disable armv6 optimizations
> @@ -2098,6 +2099,7 @@ ARCH_EXT_LIST_X86_SIMD="
>  avx
>  avx2
>  avx512
> +avx512icl
>  fma3
>  fma4
>  mmx
> @@ -2666,6 +2668,7 @@ fma3_deps="avx"
>  fma4_deps="avx"
>  avx2_deps="avx"
>  avx512_deps="avx2"
> +avx512icl_deps="avx512"
> 
>  mmx_external_deps="x86asm"
>  mmx_inline_deps="inline_asm x86"
> @@ -6128,10 +6131,11 @@ EOF
>  elf*) enabled debug && append X86ASMFLAGS $x86asm_debug ;;
>  esac
> 
> -enabled avx512 && check_x86asm avx512_external "vmovdqa32
> [eax]{k1}{z}, zmm0"
> -enabled avx2   && check_x86asm avx2_external   "vextracti128 xmm0,
> ymm0, 0"
> -enabled xop&& check_x86asm xop_external"vpmacsdd xmm0,
> xmm1, xmm2, xmm3"
> -enabled fma4   && check_x86asm fma4_external   "vfmaddps ymm0,
> ymm1, ymm2, ymm3"
> +enabled avx512&& check_x86asm avx512_external"vmovdqa32
> [eax]{k1}{z}, zmm0"
> +enabled avx512icl && check_x86asm avx512icl_external "vpdpwssds
> zmm31{k1}{z}, zmm29, zmm28"
> +enabled avx2  && check_x86asm avx2_external  "vextracti128
> xmm0, ymm0, 0"
> +enabled xop   && check_x86asm xop_external   "vpmacsdd xmm0,
> xmm1, xmm2, xmm3"
> +enabled fma4  && check_x86asm fma4_external  "vfmaddps ymm0,
> ymm1, ymm2, ymm3"
>  check_x86asm cpunop  "CPU amdnop"
>  fi
> 
> @@ -7471,6 +7475,7 @@ if enabled x86; then
>  echo "AVX enabled   ${avx-no}"
>  echo "AVX2 enabled  ${avx2-no}"
>  echo "AVX-512 enabled   ${avx512-no}"
> +echo "AVX-512ICL enabled${avx512icl-no}"
>  echo "XOP enabled   ${xop-no}"
>  echo "FMA3 enabled  ${fma3-no}"
>  echo "FMA4 enabled  ${fma4-no}"
> diff --git a/libavutil/cpu.c b/libavutil/cpu.c index 1368502245..833c220192
> 100644
> --- a/libavutil/cpu.c
> +++ b/libavutil/cpu.c
> @@ -137,6 +137,7 @@ int av_parse_cpu_caps(unsigned *flags, const char *s)
>  { "cmov", NULL, 0, AV_OPT_TYPE_CONST, { .i64 =
> AV_CPU_FLAG_CMOV },.unit = "flags" },
>  { "aesni",NULL, 0, AV_OPT_TYPE_CONST, { .i64 =
> AV_CPU_FLAG_AESNI},.unit = "flags" },
>  { "avx512"  , NULL, 0, AV_OPT_TYPE_CONST, { .i64 =
> AV_CPU_FLAG_AVX512   },.unit = "flags" },
> +{ "avx512icl",  NULL, 0, AV_OPT_TYPE_CONST, { .i64 =
> AV_CPU_FLAG_AVX512ICL   }, .unit = "flags" },
>  { "slowgather", NULL, 0, AV_OPT_TYPE_CONST, { .i64 =
> AV_CPU_FLAG_SLOW_GATHER }, .unit = "flags" },
> 
>  #define CPU_FLAG_P2 AV_CPU_FLAG_CMOV | AV_CPU_FLAG_MMX diff --
> git a/libavutil/cpu.h b/libavutil/cpu.h index ce9bf14bf7..9711e574c5 100644
> --- a/libavutil/cpu.h
> +++ b/libavutil/cpu.h
> @@ -54,6 +54,7 @@
>  #define AV_CPU_FLAG_BMI10x2 ///< Bit Manipulation Instruction
> Set 1
>  #define AV_CPU_FLAG_BMI20x4 ///< Bit Manipulation Instruction
> Set 2
>  #define AV_CPU_FLAG_AVX512 0x10 ///< AVX-512 fu

[FFmpeg-devel] [PATCH] avfilter/vf_blend_vulkan: add addition blend mode

2022-02-25 Thread Wu, Jianhua

[PATCH 01/10] avfilter/vf_blend_vulkan: add addition blend mode
[PATCH 02/10] avfilter/vf_blend_vulkan: add average blend mode
[PATCH 03/10] avfilter/vf_blend_vulkan: add subtract blend mode
[PATCH 04/10] avfilter/vf_blend_vulkan: add negation blend mode
[PATCH 05/10] avfilter/vf_blend_vulkan: add extremity blend mode
[PATCH 06/10] avfilter/vf_blend_vulkan: add difference blend mode
[PATCH 07/10] avfilter/vf_blend_vulkan: add darken blend mode
[PATCH 08/10] avfilter/vf_blend_vulkan: add lighten blend mode
[PATCH 09/10] avfilter/vf_blend_vulkan: add exclusion blend mode
[PATCH 10/10] avfilter/vf_blend_vulkan: add phoenix blend mode

Patches attached.




0003-avfilter-vf_blend_vulkan-add-subtract-blend-mode.patch
Description: 0003-avfilter-vf_blend_vulkan-add-subtract-blend-mode.patch


0004-avfilter-vf_blend_vulkan-add-negation-blend-mode.patch
Description: 0004-avfilter-vf_blend_vulkan-add-negation-blend-mode.patch


0005-avfilter-vf_blend_vulkan-add-extremity-blend-mode.patch
Description: 0005-avfilter-vf_blend_vulkan-add-extremity-blend-mode.patch


0006-avfilter-vf_blend_vulkan-add-difference-blend-mode.patch
Description: 0006-avfilter-vf_blend_vulkan-add-difference-blend-mode.patch


0007-avfilter-vf_blend_vulkan-add-darken-blend-mode.patch
Description: 0007-avfilter-vf_blend_vulkan-add-darken-blend-mode.patch


0008-avfilter-vf_blend_vulkan-add-lighten-blend-mode.patch
Description: 0008-avfilter-vf_blend_vulkan-add-lighten-blend-mode.patch


0009-avfilter-vf_blend_vulkan-add-exclusion-blend-mode.patch
Description: 0009-avfilter-vf_blend_vulkan-add-exclusion-blend-mode.patch


0010-avfilter-vf_blend_vulkan-add-phoenix-blend-mode.patch
Description: 0010-avfilter-vf_blend_vulkan-add-phoenix-blend-mode.patch


0001-avfilter-vf_blend_vulkan-add-addition-blend-mode.patch
Description: 0001-avfilter-vf_blend_vulkan-add-addition-blend-mode.patch


0002-avfilter-vf_blend_vulkan-add-average-blend-mode.patch
Description: 0002-avfilter-vf_blend_vulkan-add-average-blend-mode.patch

Re: [FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add sizeV option

2022-02-22 Thread Wu, Jianhua

Lynne:
> Sent: Tuesday, February 22, 2022 5:36 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add sizeV
> option
> 
> 22 Feb 2022, 07:27 by jianhua.wu-at-intel@ffmpeg.org:
> 
> > Lynne:
> >
> >> Sent: Tuesday, February 22, 2022 1:38 PM
> >> To: FFmpeg development discussions and patches  >> de...@ffmpeg.org>
> >> Subject: Re: [FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add
> >> sizeV option
> >>
> >> 18 Feb 2022, 16:24 by toq...@outlook.com:
> >>
> >> >> 29 Jan 2022, 13:34 by toqsxw at outlook.com:
> >> >>
> >> >>> Ping.
> >> >>>
> >> >>>> From: Wu, Jianhua<mailto:jianhua.wu-at-intel.com at ffmpeg.org>
> >> >>>> Sent: January 21, 2022 19:42
> >> >>>> To: ffmpeg-devel at ffmpeg.org<mailto:ffmpeg-devel at
> >> >>>> ffmpeg.org>
> >> >>>> Cc: Wu, Jianhua<mailto:jianhua.wu at intel.com>
> >> >>>> Subject: [FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add
> >> >>>> sizeV option
> >> >>>>
> >> >>>> [PATCH 1/5] avfilter/vf_gblur_vulkan: add sizeV option
> >> >>>> [PATCH 2/5] avfilter:add shader_vulkan filter
> >> >>>> [PATCH 3/5] avfilter/vf_blend_vulkan: add multiply blend mode
> >> >>>> [PATCH 4/5] avutil/vulkan: don't use strlen as loop condition
> >> >>>> [PATCH 5/5] avfilter/scale_vulkan: use RET for checking return value
> >> >>>>
> >> >>>> Patches attached.
> >> >>>>
> >> >>>
> >> >>> Hi there,
> >> >>>
> >> >>> Any update?
> >> >>>
> >> >>
> >> >> Sorry, haven't forgotten, but been busy with FFTs lately.
> >> >> Will try to review and test the patches soon.
> >> >>
> >> >
> >> > Hi there,
> >> >
> >> > I'm sorry for bothering you. If there is any update on this thread,
> >> > please do let me know.
> >> >
> >>
> >> Pushed all except the strlen() in a loop condition and the shader filter.
> >> I pushed a different, smaller version for the strlen patch.
> >>
> > Maybe you don't need to use strlen() at all. That patch could be
> > applied separately if you preferred to apply the shader_vulkan filter
> > in the future.
> >
> >> As for a shader filter, I'd like something that's a lot less minimal.
> >> You should expose the frame number, framerate (with an avoption to
> >> set it), pixel format to the shader. Keep in mind the API will be
> >> fixed, so we need to get this right the first time hopefully.
> >>
> >
> > Frame number and framerate are okay to set if I can get them from the
> > FFmpeg API, but the pixel format may not be ideal to expose, because
> > there are a lot of pixel formats in FFmpeg. Exposing a pixel format
> > means we need to expose all values related to pixel formats. Instead,
> > we could expose two functions like
> >
> 
> Wrapper functions won't help, shaders would still need to know what
> colorspace to work in.
> I think if you just expose the entire pixdesc flags as-is, so shaders know if
> it's RGB or YUV, and the raw colorspace/transfer/matrix integers, it'll be
> good enough.
> 
> 
> >> You should expose alpha planes as well.
> >>
> > Could you elaborate on that?
> >
> 
> Yes, currently for YUVA images, the alpha channel remains untouched (the
> size of all image arrays is [3]) and unexposed to the shader.
> 
> 
> >> Finally, could you implement N-inputs and M-outputs, configurable via
> >> avoptions? That way, someone could make a custom blend filter without
> >> a separate avfilter which takes multiple inputs. Or a separator filter.
> >> Or a simple source filter that just produces an image pattern.
> >>
> > Sounds great! However, at present I have no idea how this could be done.
> > I think the current filter is already useful, since I could make some
> > great video effects with it, just like Shadertoy. More extensions need
> > further contributions.
> >
> 
> Just set AVFilter.inputs and AVFilter.outputs to NULL. Then in the init
> function, copy what e.g. af_amix does to set inputs and what e.g.
> af_channelsplit does for outputs. There are probably better examples out
> there if you look.
> I think that this should be up to the shader to initialize, rather than the
> filter user, so perhaps, in the init function, you compile the shader, run an
> init function in the shader which tells you what it requires and does
> checking, then return, and set it up for operation.
> 

I suppose you mean that the shader source file contains an init function, and
that the filter executes it to read back some configuration data, right? If
so, it looks like a great feature. Still, shouldn't these features be added
only once someone actually has such requirements?


Re: [FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add sizeV option

2022-02-21 Thread Wu, Jianhua

Lynne:
> Sent: Tuesday, February 22, 2022 1:38 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add sizeV
> option
> 
> 18 Feb 2022, 16:24 by toq...@outlook.com:
> 
> >> 29 Jan 2022, 13:34 by toqsxw at outlook.com:
> >>
> >>> Ping.
> >>>
> >>>> From: Wu, Jianhua<mailto:jianhua.wu-at-intel.com at ffmpeg.org>
> >>>> Sent: January 21, 2022 19:42
> >>>> To: ffmpeg-devel at ffmpeg.org<mailto:ffmpeg-devel at ffmpeg.org>
> >>>> Cc: Wu, Jianhua<mailto:jianhua.wu at intel.com>
> >>>> Subject: [FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add
> >>>> sizeV option
> >>>>
> >>>> [PATCH 1/5] avfilter/vf_gblur_vulkan: add sizeV option
> >>>> [PATCH 2/5] avfilter:add shader_vulkan filter
> >>>> [PATCH 3/5] avfilter/vf_blend_vulkan: add multiply blend mode
> >>>> [PATCH 4/5] avutil/vulkan: don't use strlen as loop condition
> >>>> [PATCH 5/5] avfilter/scale_vulkan: use RET for checking return value
> >>>>
> >>>> Patches attached.
> >>>>
> >>>
> >>> Hi there,
> >>>
> >>> Any update?
> >>>
> >>
> >> Sorry, haven't forgotten, but been busy with FFTs lately.
> >> Will try to review and test the patches soon.
> >>
> >
> > Hi there,
> >
> > I'm sorry for bothering you. If there is any update on this thread,
> > please do let me know.
> >
> 
> Pushed all except the strlen() in a loop condition and the shader filter.
> I pushed a different, smaller version for the strlen patch.
> 
Maybe you don't need to use strlen() at all. That patch could be applied
separately if you preferred to apply the shader_vulkan filter in the future.

> As for a shader filter, I'd like something that's a lot less minimal.
> You should expose the frame number, framerate (with an avoption to set it),
> pixel format to the shader. Keep in mind the API will be fixed, so we need to
> get this right the first time hopefully.

Frame number and framerate are fine to expose if I can get them from the FFmpeg
API, but the pixel format may not be ideal to expose, because FFmpeg has a lot
of pixel formats. Exposing a pixel format means we need to expose all values
related to pixel formats. Instead, we could expose two functions like

vec4 pixel = av_read_pixel(input_images, av_position)
av_write_pixel(output_images, pixel, positions)

so that the user shader can concentrate on the vec4 pixel variable and doesn't
need to care about what the pixel format is. Or we could simply expose the
subsampling scheme (444, 420, or 422) and the color space (YUV or RGB).

> Also, correct the name style. We don't use camelcase for variables, and we
> use "av_" instead of "ff_" for public API, which a shader sort of is IMO.

Got it.

> You should expose alpha planes as well.

Could you elaborate on this?

> Finally, could you implement N-inputs and M-outputs, configurable via
> avoptions? That way, someone could make a custom blend filter without a
> separate avfilter which takes multiple inputs. Or a separator filter.
> Or a simple source filter that just produces an image pattern.
> 
Sounds great! However, at present I have no idea how this could be done.
I think the current filter is already useful, since I can make some great video
effects with it, just like Shadertoy. More extensions will need further
contributions.

Thanks,
Jianhua


Re: [FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add sizeV option

2022-02-18 Thread Wu Jianhua

> 29 Jan 2022, 13:34 by toqsxw at outlook.com:

>> Ping.
>>
>>> From: Wu, Jianhua<mailto:jianhua.wu-at-intel.com at ffmpeg.org>
>>> Sent: 2022年1月21日 19:42
>>> To: ffmpeg-devel at ffmpeg.org<mailto:ffmpeg-devel at ffmpeg.org>
>>> Cc: Wu, Jianhua<mailto:jianhua.wu at intel.com>
>>> Subject: [FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add sizeV 
>>> option
>>>
>>> [PATCH 1/5] avfilter/vf_gblur_vulkan: add sizeV option
>>> [PATCH 2/5] avfilter:add shader_vulkan filter
>>> [PATCH 3/5] avfilter/vf_blend_vulkan: add multiply blend mode
>>> [PATCH 4/5] avutil/vulkan: don't use strlen as loop condition
>>> [PATCH 5/5] avfilter/scale_vulkan: use RET for checking return value
>>>
>>> Patches attached.
>>>
>>
>> Hi there,
>>
>> Any update?
>>
>
> Sorry, haven't forgotten, but been busy with FFTs lately.
> Will try to review and test the patches soon.

Hi there,

I'm sorry for bothering you. If there is any update on this
thread, please do let me know. 

Thanks,
Jianhua


Re: [FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add sizeV option

2022-01-29 Thread Wu Jianhua

Ping.

> From: Wu, Jianhua<mailto:jianhua.wu-at-intel@ffmpeg.org>
> Sent: 2022年1月21日 19:42
> To: ffmpeg-devel@ffmpeg.org<mailto:ffmpeg-devel@ffmpeg.org>
> Cc: Wu, Jianhua<mailto:jianhua...@intel.com>
> Subject: [FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add sizeV option
>
> [PATCH 1/5] avfilter/vf_gblur_vulkan: add sizeV option
> [PATCH 2/5] avfilter:add shader_vulkan filter
> [PATCH 3/5] avfilter/vf_blend_vulkan: add multiply blend mode
> [PATCH 4/5] avutil/vulkan: don't use strlen as loop condition
> [PATCH 5/5] avfilter/scale_vulkan: use RET for checking return value
>
> Patches attached.

Hi there,

Any update?


Re: [FFmpeg-devel] [PATCH] hwcontext_vulkan: workaround MoltenVK's bug which leads to segmentation fault

2022-01-27 Thread Wu, Jianhua

Zhao Zhili wrote:
> Sent: Thursday, January 27, 2022 4:11 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Zhao Zhili 
> Subject: [FFmpeg-devel] [PATCH] hwcontext_vulkan: workaround
> MoltenVK's bug which leads to segmentation fault
> 
> MoltenVK doesn't reset instance pointer when CreateInstance() failed, then
> DestroyInstance() leads to segmentation fault. MoltenVK's bug has been
> fixed by [1], which doesn't available on homebrew yet.
> Regardless MoltenVK's bug, we shouldn't call DestroyInstance() in the case
> of CreateInstance() failed, so reset instance making sense.
> 
> [1] https://github.com/KhronosGroup/MoltenVK/commit/86a1fbdb8
> ---
>  libavutil/hwcontext_vulkan.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
> index 2e219511c9..ac8e3a 100644
> --- a/libavutil/hwcontext_vulkan.c
> +++ b/libavutil/hwcontext_vulkan.c
> @@ -719,6 +719,8 @@ static int create_instance(AVHWDeviceContext *ctx,
> AVDictionary *opts)
>  if (ret != VK_SUCCESS) {
>  av_log(ctx, AV_LOG_ERROR, "Instance creation failure: %s\n",
> vk_ret2str(ret));
> +/* Workaround MoltenVK's bug which doesn't reset instance pointer.
> */
> +hwctx->inst = (VkInstance) { 0 };

Hi,

There is no need for the explicit cast; use hwctx->inst = VK_NULL_HANDLE
instead, which is the null handle defined by the Vulkan spec.
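
To make the pattern under discussion concrete, here is a minimal,
self-contained C sketch. It does not use the Vulkan headers: FakeHandle,
FAKE_NULL_HANDLE, fake_create_instance, device_init and device_uninit are
hypothetical stand-ins for VkInstance, VK_NULL_HANDLE, vkCreateInstance() and
the hwcontext init/uninit paths. The point is only that resetting the handle on
failure lets cleanup skip the destroy call.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for illustration only; the real code uses VkInstance,
 * VK_NULL_HANDLE, vkCreateInstance() and vkDestroyInstance(). */
struct FakeInstance { int unused; };
typedef struct FakeInstance *FakeHandle;
#define FAKE_NULL_HANDLE NULL

typedef struct FakeDevice {
    FakeHandle inst;
    int        destroy_calls;
} FakeDevice;

/* Simulates a driver that leaves a stale pointer behind when creation fails. */
static int fake_create_instance(FakeHandle *out, int should_fail)
{
    static struct FakeInstance live, stale;
    if (should_fail) {
        *out = &stale; /* bug being worked around: handle not reset */
        return -1;
    }
    *out = &live;
    return 0;
}

static int device_init(FakeDevice *dev, int should_fail)
{
    assert(dev);
    int ret = fake_create_instance(&dev->inst, should_fail);
    if (ret < 0) {
        /* Reset the handle so cleanup never destroys a bogus instance. */
        dev->inst = FAKE_NULL_HANDLE;
        return ret;
    }
    return 0;
}

static void device_uninit(FakeDevice *dev)
{
    assert(dev);
    if (dev->inst != FAKE_NULL_HANDLE) {
        dev->destroy_calls++; /* the real code would call vkDestroyInstance() here */
        dev->inst = FAKE_NULL_HANDLE;
    }
}
```

With the reset in place, device_uninit() is safe to call after a failed
device_init(), which is the behaviour the patch is after.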

Thanks,
Jianhua


[FFmpeg-devel] [PATCH v1] avfilter/vf_gblur_vulkan: add sizeV option

2022-01-21 Thread Wu, Jianhua

[PATCH 1/5] avfilter/vf_gblur_vulkan: add sizeV option
[PATCH 2/5] avfilter:add shader_vulkan filter
[PATCH 3/5] avfilter/vf_blend_vulkan: add multiply blend mode
[PATCH 4/5] avutil/vulkan: don't use strlen as loop condition
[PATCH 5/5] avfilter/scale_vulkan: use RET for checking return value

Patches attached.


0003-avfilter-vf_blend_vulkan-add-multiply-blend-mode.patch
Description: 0003-avfilter-vf_blend_vulkan-add-multiply-blend-mode.patch


0004-avutil-vulkan-don-t-use-strlen-as-loop-condition.patch
Description: 0004-avutil-vulkan-don-t-use-strlen-as-loop-condition.patch


0005-avfilter-scale_vulkan-use-RET-for-checking-return-va.patch
Description: 0005-avfilter-scale_vulkan-use-RET-for-checking-return-va.patch


0001-avfilter-vf_gblur_vulkan-add-sizeV-option.patch
Description: 0001-avfilter-vf_gblur_vulkan-add-sizeV-option.patch


0002-avfilter-add-shader_vulkan-filter.patch
Description: 0002-avfilter-add-shader_vulkan-filter.patch

Re: [FFmpeg-devel] [PATCH 2/5] transpose_vulkan: add passthrough option

2022-01-10 Thread Wu Jianhua

> It sure would be nice if all the different hw flavors of the same filter
> could share code.

Yeah, definitely. The config_output functions of the same filter
implemented for different hardware are similar. Maybe they could be
factored into one shared function.

Best Regards,
Jianhua


Re: [FFmpeg-devel] [PATCH v2] avfilter: add a blend_vulkan filter

2022-01-05 Thread Wu Jianhua

Lynne wrote:

> 5 Jan 2022, 10:11 by jianhua.wu-at-intel@ffmpeg.org:
>
>> [PATCH v2 1/3] avfilter: add a blend_vulkan filter
>> [PATCH v2 2/3] avfilter/vf_blend: fix un-checked potential memory allocation failure
>> [PATCH v2 3/3] avutil/hwcontext_vulkan: fixed incorrect memory offset
>>
>> Patches attached.
>>
>
> Tested, pushed all.
> Didn't push blend_vulkan to the release/5.0 branch because
> it's not that useful, but do say if you'd like for it to be in 5.0.

It doesn't matter, so simply keep it where it is.

Thanks,
Jianhua


[FFmpeg-devel] [PATCH v2] avfilter: add a blend_vulkan filter

2022-01-05 Thread Wu, Jianhua

[PATCH v2 1/3] avfilter: add a blend_vulkan filter
[PATCH v2 2/3] avfilter/vf_blend: fix un-checked potential memory allocation failure
[PATCH v2 3/3] avutil/hwcontext_vulkan: fixed incorrect memory offset

Patches attached.



0002-avfilter-vf_blend-fix-un-checked-potential-memory-al.patch
Description: 0002-avfilter-vf_blend-fix-un-checked-potential-memory-al.patch


0003-avutil-hwcontext_vulkan-fixed-incorrect-memory-offse.patch
Description: 0003-avutil-hwcontext_vulkan-fixed-incorrect-memory-offse.patch


0001-avfilter-add-a-blend_vulkan-filter.patch
Description: 0001-avfilter-add-a-blend_vulkan-filter.patch

Re: [FFmpeg-devel] [PATCH 5/5] avfilter/vf_blend: fix un-checked potential memory allocation failure

2022-01-03 Thread Wu Jianhua

Timo Rothenpieler<mailto:t...@rothenpieler.org> wrote:

> On 03.01.2022 09:39, Wu, Jianhua wrote:
>> And there is one more question, may I know why there is a suffix
>> "@ffmpeg.org" behind my commit Author email?
>
> Your E-Mail server is enforcing strict policy via DKIM/DMARC, so it's
> impossible for any other mail-servers, like mailing lists, to send
> E-Mails from @intel.com.
> Hence the only option the list server has is to mangle the sender
> address like that.

Got it. Thanks for your answer. Maybe it's better to send patches as 
attachments in case the commit message gets broken.

Best Regards,
Jianhua


Re: [FFmpeg-devel] [PATCH 5/5] avfilter/vf_blend: fix un-checked potential memory allocation failure

2022-01-03 Thread Wu, Jianhua

 Lynne:
> Sent: Monday, January 3, 2022 10:23 AM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 5/5] avfilter/vf_blend: fix un-checked
> potential memory allocation failure
> 
> 2 Jan 2022, 15:51 by jianhua.wu-at-intel@ffmpeg.org:
> 
> > Signed-off-by: Wu Jianhua 
> > ---
> >  libavfilter/vf_blend.c | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/libavfilter/vf_blend.c b/libavfilter/vf_blend.c
> > index b6f3c4fed3..2d433e439f 100644
> > --- a/libavfilter/vf_blend.c
> > +++ b/libavfilter/vf_blend.c
> > @@ -279,7 +279,11 @@ static AVFrame *blend_frame(AVFilterContext *ctx, AVFrame *top_buf,
> >      dst_buf = ff_get_video_buffer(outlink, outlink->w, outlink->h);
> >      if (!dst_buf)
> >          return top_buf;
> > -    av_frame_copy_props(dst_buf, top_buf);
> > +
> > +    if (av_frame_copy_props(dst_buf, top_buf) < 0) {
> > +        av_frame_free(&dst_buf);
> > +        return top_buf;
> > +    }
> >
> >      for (plane = 0; plane < s->nb_planes; plane++) {
> >          int hsub = plane == 1 || plane == 2 ? s->hsub : 0;
> >
> 
> Pushed patches 2 and 3. The blend filter doesn't work for me:
> https://0x0.st/osRM.jpg
> This is not what it's meant to look like at all, for blank, default options.
> 

I'm afraid this is not a problem with the blend_vulkan filter. Could you try
the other Vulkan filters and see if they still work?

> Patch 1 is a driver bug. The driver should not advertise the HDR extension as
> supported if there's no swapchain. The HDR extension explicitly requires a
> swapchain, and the Vulkan specs say that devices are meant to only
> advertise supported extensions, which the HDR extension wouldn't be if the
> swapchain extension has not been loaded.
> I pushed an alternative version that just removes the HDR extension, but you
> need to notify your Windows driver developers that it's not doing what it
> should.
> 

Removing it is okay if it is not used at all. And I'm sorry, we may have made a
mistake here.
Below is my development environment for this patch:
Operating System: Windows 10
Physical Device: Nvidia RTX3070
Driver version: GeForce Game Ready Driver 497.29
I'll add details like these to the commit message if I fix similar problems.

One more question: may I know why there is a suffix "@ffmpeg.org"
appended to my commit Author email?

Thanks,
Jianhua

[FFmpeg-devel] [PATCH 5/5] avfilter/vf_blend: fix un-checked potential memory allocation failure

2022-01-02 Thread Wu Jianhua

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_blend.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/libavfilter/vf_blend.c b/libavfilter/vf_blend.c
index b6f3c4fed3..2d433e439f 100644
--- a/libavfilter/vf_blend.c
+++ b/libavfilter/vf_blend.c
@@ -279,7 +279,11 @@ static AVFrame *blend_frame(AVFilterContext *ctx, AVFrame *top_buf,
     dst_buf = ff_get_video_buffer(outlink, outlink->w, outlink->h);
     if (!dst_buf)
         return top_buf;
-    av_frame_copy_props(dst_buf, top_buf);
+
+    if (av_frame_copy_props(dst_buf, top_buf) < 0) {
+        av_frame_free(&dst_buf);
+        return top_buf;
+    }
 
     for (plane = 0; plane < s->nb_planes; plane++) {
         int hsub = plane == 1 || plane == 2 ? s->hsub : 0;
-- 
2.25.1


[FFmpeg-devel] [PATCH 4/5] avfilter: add a blend_vulkan filter

2022-01-02 Thread Wu Jianhua

This commit adds a blend_vulkan filter with a normal blend mode, and
leaves room for introducing more blend modes in the future.
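
For readers unfamiliar with the normal mode, the following C sketch shows the
usual opacity-weighted mix, assuming the Vulkan filter follows the same linear
interpolation as the software blend filter. blend_normal and blend_normal_vec4
are hypothetical helper names, not functions from the patch.

```c
#include <assert.h>

/* Opacity-weighted mix of one component: opacity 1 shows only the top
 * layer, opacity 0 shows only the bottom layer. */
static float blend_normal(float top, float bottom, float opacity)
{
    assert(opacity >= 0.0f && opacity <= 1.0f);
    return top * opacity + bottom * (1.0f - opacity);
}

/* Apply the same mix to all four components of a pixel. */
static void blend_normal_vec4(const float top[4], const float bottom[4],
                              float opacity, float dst[4])
{
    for (int i = 0; i < 4; i++)
        dst[i] = blend_normal(top[i], bottom[i], opacity);
}
```

A shader version would do the same arithmetic on vec4 values in one
expression.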

Use the commands below to test: (href: https://trac.ffmpeg.org/wiki/Blend)
I. make an image for test
ffmpeg -f lavfi -i color=s=256x256,geq=r='H-1-Y':g='H-1-Y':b='H-1-Y' -frames 1 \
-y -pix_fmt yuv420p test.jpg

II. blend in sw
ffmpeg -i test.jpg -vf "split[a][b];[b]transpose[b];[a][b]blend=all_mode=normal,\
pseudocolor=preset=turbo" -y normal_sw.jpg

III. blend in vulkan
ffmpeg -init_hw_device vulkan -i test.jpg -vf "split[a][b];[b]transpose[b];\
[a]hwupload[a];[b]hwupload[b];[a][b]blend_vulkan=all_mode=normal,hwdownload,\
format=yuv420p,pseudocolor=preset=turbo" -y normal_vulkan.jpg

Signed-off-by: Wu Jianhua 
---
 configure |   1 +
 libavfilter/Makefile  |   1 +
 libavfilter/allfilters.c  |   1 +
 libavfilter/vf_blend_vulkan.c | 501 ++
 4 files changed, 504 insertions(+)
 create mode 100644 libavfilter/vf_blend_vulkan.c

diff --git a/configure b/configure
index 6ad70b9f7b..f6c9e38051 100755
--- a/configure
+++ b/configure
@@ -3609,6 +3609,7 @@ avgblur_opencl_filter_deps="opencl"
 avgblur_vulkan_filter_deps="vulkan spirv_compiler"
 azmq_filter_deps="libzmq"
 blackframe_filter_deps="gpl"
+blend_vulkan_filter_deps="vulkan spirv_compiler"
 bm3d_filter_deps="avcodec"
 bm3d_filter_select="dct"
 boxblur_filter_deps="gpl"
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index 090944a99c..ed727e3fd9 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -192,6 +192,7 @@ OBJS-$(CONFIG_BITPLANENOISE_FILTER)  += vf_bitplanenoise.o
 OBJS-$(CONFIG_BLACKDETECT_FILTER)+= vf_blackdetect.o
 OBJS-$(CONFIG_BLACKFRAME_FILTER) += vf_blackframe.o
 OBJS-$(CONFIG_BLEND_FILTER)  += vf_blend.o framesync.o
+OBJS-$(CONFIG_BLEND_VULKAN_FILTER)   += vf_blend_vulkan.o framesync.o vulkan.o vulkan_filter.o
 OBJS-$(CONFIG_BM3D_FILTER)   += vf_bm3d.o framesync.o
 OBJS-$(CONFIG_BOXBLUR_FILTER)+= vf_boxblur.o boxblur.o
 OBJS-$(CONFIG_BOXBLUR_OPENCL_FILTER) += vf_avgblur_opencl.o opencl.o \
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index caa755320e..84ba9fdf54 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -183,6 +183,7 @@ extern const AVFilter ff_vf_bitplanenoise;
 extern const AVFilter ff_vf_blackdetect;
 extern const AVFilter ff_vf_blackframe;
 extern const AVFilter ff_vf_blend;
+extern const AVFilter ff_vf_blend_vulkan;
 extern const AVFilter ff_vf_bm3d;
 extern const AVFilter ff_vf_boxblur;
 extern const AVFilter ff_vf_boxblur_opencl;
diff --git a/libavfilter/vf_blend_vulkan.c b/libavfilter/vf_blend_vulkan.c
new file mode 100644
index 00..fac1be532d
--- /dev/null
+++ b/libavfilter/vf_blend_vulkan.c
@@ -0,0 +1,501 @@
+/*
+ * copyright (c) 2021 Wu Jianhua 
+ * The blend modes are based on the blend.c.
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/random_seed.h"
+#include "libavutil/opt.h"
+#include "vulkan_filter.h"
+#include "internal.h"
+#include "framesync.h"
+#include "blend.h"
+
+#define CGS 32
+
+typedef struct FilterParamsVulkan {
+    const char *blend;
+    const char *blend_func;
+    double opacity;
+    enum BlendMode mode;
+} FilterParamsVulkan;
+
+typedef struct BlendVulkanContext {
+    FFVulkanContext vkctx;
+    FFVkQueueFamilyCtx qf;
+    FFVkExecContext *exec;
+    FFVulkanPipeline *pl;
+    FFFrameSync fs;
+
+    VkDescriptorImageInfo top_images[3];
+    VkDescriptorImageInfo bottom_images[3];
+    VkDescriptorImageInfo output_images[3];
+
+    FilterParamsVulkan params[4];
+    double all_opacity;
+    enum BlendMode all_mode;
+
+    int initialized;
+} BlendVulkanContext;
+
+#define DEFINE_BLEND_MODE(MODE, EXPR) \
+static const char blend_##MODE[] = "blend_"#MODE; \
+static const char blend_##MODE##_func[] = { \
+C(0, vec4 blend_##MODE(vec4 top, vec4 bottom, float opa

[FFmpeg-devel] [PATCH 3/5] avfilter/vf_scale_vulkan: align struct ScaleVulkanContext

2022-01-02 Thread Wu Jianhua

On 64-bit operating systems, sizeof(ScaleVulkanContext) is reduced:
from 2400 to 2392 on Linux
from 2416 to 2408 on Windows
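
The saving comes from member ordering. As a small self-contained C sketch of
the effect (Interleaved, Grouped and padding_saved are hypothetical names, not
the real ScaleVulkanContext), placing 4-byte members between 8-byte pointers
forces padding on typical 64-bit ABIs, while grouping them at the end does not:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of a filter context, for illustration only. */
typedef struct Interleaved {
    void *a;    /* pointer-sized, pointer-aligned */
    int   flag; /* usually followed by padding on 64-bit ABIs */
    void *b;
    int   mode; /* usually followed by tail padding on 64-bit ABIs */
} Interleaved;

/* Same members, with the small ones grouped after the pointers. */
typedef struct Grouped {
    void *a;
    void *b;
    int   flag;
    int   mode;
} Grouped;

/* Number of bytes saved by the reordering on the current ABI (may be 0). */
static size_t padding_saved(void)
{
    assert(sizeof(Grouped) <= sizeof(Interleaved));
    return sizeof(Interleaved) - sizeof(Grouped);
}
```

The exact numbers depend on the platform ABI, which is why the commit message
reports different totals for Linux and Windows.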

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_scale_vulkan.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/libavfilter/vf_scale_vulkan.c b/libavfilter/vf_scale_vulkan.c
index cfce5ab1f8..c87a8d7e2e 100644
--- a/libavfilter/vf_scale_vulkan.c
+++ b/libavfilter/vf_scale_vulkan.c
@@ -35,7 +35,6 @@ enum ScalerFunc {
 typedef struct ScaleVulkanContext {
     FFVulkanContext vkctx;
 
-    int initialized;
     FFVkQueueFamilyCtx qf;
     FFVkExecContext *exec;
     FFVulkanPipeline *pl;
@@ -46,11 +45,14 @@ typedef struct ScaleVulkanContext {
     VkDescriptorImageInfo output_images[3];
     VkDescriptorBufferInfo params_desc;
 
-    enum ScalerFunc scaler;
     char *out_format_string;
-    enum AVColorRange out_range;
     char *w_expr;
     char *h_expr;
+
+    enum ScalerFunc scaler;
+    enum AVColorRange out_range;
+
+    int initialized;
 } ScaleVulkanContext;
 
 static const char scale_bilinear[] = {
-- 
2.25.1


[FFmpeg-devel] [PATCH 2/5] transpose_vulkan: add passthrough option

2022-01-02 Thread Wu Jianhua

The following command shows how to apply the passthrough option:

ffmpeg -init_hw_device vulkan -i input.264 -vf \
hwupload=extra_hw_frames=16,transpose_vulkan=passthrough=landscape,hwdownload,format=yuv420p \
output.264

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_transpose_vulkan.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/libavfilter/vf_transpose_vulkan.c b/libavfilter/vf_transpose_vulkan.c
index ce83cf0fd7..30d052e08c 100644
--- a/libavfilter/vf_transpose_vulkan.c
+++ b/libavfilter/vf_transpose_vulkan.c
@@ -35,6 +35,7 @@ typedef struct TransposeVulkanContext {
     VkDescriptorImageInfo output_images[3];
 
     int dir;
+    int passthrough;
     int initialized;
 } TransposeVulkanContext;
 
@@ -222,6 +223,9 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
     TransposeVulkanContext *s = ctx->priv;
     AVFilterLink *outlink = ctx->outputs[0];
 
+    if (s->passthrough)
+        return ff_filter_frame(outlink, in);
+
     out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
     if (!out) {
         err = AVERROR(ENOMEM);
@@ -267,6 +271,17 @@ static int config_props_output(AVFilterLink *outlink)
     FFVulkanContext *vkctx = &s->vkctx;
     AVFilterLink *inlink = avctx->inputs[0];
 
+    if ((inlink->w >= inlink->h && s->passthrough == TRANSPOSE_PT_TYPE_LANDSCAPE) ||
+        (inlink->w <= inlink->h && s->passthrough == TRANSPOSE_PT_TYPE_PORTRAIT)) {
+        av_log(avctx, AV_LOG_VERBOSE,
+               "w:%d h:%d -> w:%d h:%d (passthrough mode)\n",
+               inlink->w, inlink->h, inlink->w, inlink->h);
+        outlink->hw_frames_ctx = av_buffer_ref(inlink->hw_frames_ctx);
+        return outlink->hw_frames_ctx ? 0 : AVERROR(ENOMEM);
+    } else {
+        s->passthrough = TRANSPOSE_PT_TYPE_NONE;
+    }
+
     vkctx->output_width  = inlink->h;
     vkctx->output_height = inlink->w;
 
@@ -288,6 +303,13 @@ static const AVOption transpose_vulkan_options[] = {
     { "clock",   "rotate clockwise",0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CLOCK   }, .flags=FLAGS, .unit = "dir" },
     { "cclock",  "rotate counter-clockwise",0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CCLOCK  }, .flags=FLAGS, .unit = "dir" },
     { "clock_flip",  "rotate clockwise with vertical flip", 0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CLOCK_FLIP  }, .flags=FLAGS, .unit = "dir" },
+
+    { "passthrough", "do not apply transposition if the input matches the specified geometry",
+      OFFSET(passthrough), AV_OPT_TYPE_INT, {.i64=TRANSPOSE_PT_TYPE_NONE},  0, INT_MAX, FLAGS, "passthrough" },
+    { "none",  "always apply transposition",   0, AV_OPT_TYPE_CONST, {.i64=TRANSPOSE_PT_TYPE_NONE},  INT_MIN, INT_MAX, FLAGS, "passthrough" },
+    { "portrait",  "preserve portrait geometry",   0, AV_OPT_TYPE_CONST, {.i64=TRANSPOSE_PT_TYPE_PORTRAIT},  INT_MIN, INT_MAX, FLAGS, "passthrough" },
+    { "landscape", "preserve landscape geometry",  0, AV_OPT_TYPE_CONST, {.i64=TRANSPOSE_PT_TYPE_LANDSCAPE}, INT_MIN, INT_MAX, FLAGS, "passthrough" },
+
     { NULL }
 };
 
-- 
2.25.1
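
The passthrough decision in config_props_output of the patch above can be
restated as a small standalone predicate. This is only a sketch: the enum
PassthroughType and should_passthrough are local stand-ins that mirror the
TRANSPOSE_PT_TYPE_* names, and the numeric values are assumptions.

```c
#include <assert.h>

/* Local stand-ins mirroring the TRANSPOSE_PT_TYPE_* names; values assumed. */
enum PassthroughType { PT_NONE, PT_PORTRAIT, PT_LANDSCAPE };

/* Returns 1 when the frame already has the geometry the user asked to
 * preserve, so the transposition should be skipped. A square frame counts
 * as both landscape and portrait, matching the >= and <= comparisons. */
static int should_passthrough(int w, int h, enum PassthroughType pt)
{
    assert(w > 0 && h > 0);
    return (w >= h && pt == PT_LANDSCAPE) ||
           (w <= h && pt == PT_PORTRAIT);
}
```

For example, a 1920x1080 input with passthrough=landscape is forwarded
untouched, while a 1080x1920 input with the same option is still transposed.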


[FFmpeg-devel] [PATCH 1/5] avutil/hwcontext_vulkan: fixed validation error VUID 01387

2022-01-02 Thread Wu Jianhua

This commit fixed the validation error that occurred on the Windows platform.

Validation Error: [ VUID-vkCreateDevice-ppEnabledExtensionNames-01387 ] Object 0: \
handle = 0x2ab1cfa0db0, type = VK_OBJECT_TYPE_INSTANCE; | MessageID = 0x12537a2c | \
Missing extension required by the device extension VK_EXT_hdr_metadata: VK_KHR_swapchain. \
The Vulkan spec states: All required device extensions for each extension in the \
VkDeviceCreateInfo::ppEnabledExtensionNames list must also be present in that list \
(https://vulkan.lunarg.com/doc/view/1.2.198.1/windows/1.2-extensions/vkspec.html#\
VUID-vkCreateDevice-ppEnabledExtensionNames-01387)

Signed-off-by: Wu Jianhua 
---
 libavutil/hwcontext_vulkan.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 83a7527198..a2a175a063 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -335,7 +335,11 @@ typedef struct VulkanOptExtension {
 } VulkanOptExtension;
 
 static const VulkanOptExtension optional_instance_exts[] = {
-    /* For future use */
+    /* Misc or required by other extensions */
+#ifdef _WIN32
+    { VK_KHR_WIN32_SURFACE_EXTENSION_NAME, FF_VK_EXT_NO_FLAG},
+    { VK_KHR_SURFACE_EXTENSION_NAME, FF_VK_EXT_NO_FLAG},
+#endif
 };
 
 static const VulkanOptExtension optional_device_exts[] = {
@@ -344,6 +348,9 @@ static const VulkanOptExtension optional_device_exts[] = {
     { VK_EXT_HDR_METADATA_EXTENSION_NAME, FF_VK_EXT_NO_FLAG},
     { VK_KHR_SAMPLER_YCBCR_CONVERSION_EXTENSION_NAME, FF_VK_EXT_NO_FLAG},
     { VK_KHR_SYNCHRONIZATION_2_EXTENSION_NAME, FF_VK_EXT_NO_FLAG},
+#ifdef _WIN32
+    { VK_KHR_SWAPCHAIN_EXTENSION_NAME, FF_VK_EXT_NO_FLAG},
+#endif
 
     /* Imports/exports */
     { VK_KHR_EXTERNAL_MEMORY_FD_EXTENSION_NAME, FF_VK_EXT_EXTERNAL_FD_MEMORY },
-- 
2.25.1


[FFmpeg-devel] [PATCH 4/4] avfilter/vf_transpose_vulkan: simplify config_props_output function

2021-12-10 Thread Wu Jianhua

There is no need to assign outlink->w and outlink->h here;
ff_vk_filter_config_output already does that.

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_transpose_vulkan.c | 11 +--
 1 file changed, 1 insertion(+), 10 deletions(-)

diff --git a/libavfilter/vf_transpose_vulkan.c b/libavfilter/vf_transpose_vulkan.c
index eceb9b9011..ce83cf0fd7 100644
--- a/libavfilter/vf_transpose_vulkan.c
+++ b/libavfilter/vf_transpose_vulkan.c
@@ -262,7 +262,6 @@ static av_cold void transpose_vulkan_uninit(AVFilterContext *avctx)
 
 static int config_props_output(AVFilterLink *outlink)
 {
-    int err = 0;
     AVFilterContext *avctx = outlink->src;
     TransposeVulkanContext *s = avctx->priv;
     FFVulkanContext *vkctx = &s->vkctx;
@@ -271,21 +270,13 @@ static int config_props_output(AVFilterLink *outlink)
     vkctx->output_width  = inlink->h;
     vkctx->output_height = inlink->w;
 
-    RET(ff_vk_filter_config_output(outlink));
-
-    outlink->w = inlink->h;
-    outlink->h = inlink->w;
-
     if (inlink->sample_aspect_ratio.num)
         outlink->sample_aspect_ratio = av_div_q((AVRational) { 1, 1 },
                                                 inlink->sample_aspect_ratio);
     else
         outlink->sample_aspect_ratio = inlink->sample_aspect_ratio;
 
-    err = 0;
-
-fail:
-    return err;
+    return ff_vk_filter_config_output(outlink);
 }
 
 #define OFFSET(x) offsetof(TransposeVulkanContext, x)
-- 
2.25.1


[FFmpeg-devel] [PATCH 3/4] avfilter/vf_transpose_vulkan: add clock option

2021-12-10 Thread Wu Jianhua

The following command shows how to apply the clock option:

ffmpeg -init_hw_device vulkan -i input.264 -vf \
hwupload=extra_hw_frames=16,transpose_vulkan=dir=clock,hwdownload,format=yuv420p \
output.264

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_transpose_vulkan.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/libavfilter/vf_transpose_vulkan.c b/libavfilter/vf_transpose_vulkan.c
index 4c20becb5c..eceb9b9011 100644
--- a/libavfilter/vf_transpose_vulkan.c
+++ b/libavfilter/vf_transpose_vulkan.c
@@ -88,16 +88,18 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in)
 GLSLC(0, void main()   );
 GLSLC(0, { );
 GLSLC(1, ivec2 size;   );
-GLSLC(1, const ivec2 pos = ivec2(gl_GlobalInvocationID.xy););
+GLSLC(1, ivec2 pos = ivec2(gl_GlobalInvocationID.xy);  );
 for (int i = 0; i < planes; i++) {
 GLSLC(0,   );
 GLSLF(1, size = imageSize(output_images[%i]);,i);
 GLSLC(1, if (IS_WITHIN(pos, size)) {   );
 if (s->dir == TRANSPOSE_CCLOCK)
 GLSLF(2, vec4 res = texture(input_images[%i], ivec2(size.y - pos.y, pos.x)); ,i);
-else if (s->dir == TRANSPOSE_CLOCK_FLIP)
+else if (s->dir == TRANSPOSE_CLOCK_FLIP || s->dir == TRANSPOSE_CLOCK) {
 GLSLF(2, vec4 res = texture(input_images[%i], ivec2(size.yx - pos.yx));  ,i);
-else
+if (s->dir == TRANSPOSE_CLOCK)
+GLSLC(2, pos = ivec2(pos.x, size.y - pos.y);   );
+} else
 GLSLF(2, vec4 res = texture(input_images[%i], pos.yx);   ,i);
 GLSLF(2, imageStore(output_images[%i], pos, res);,i);
 GLSLC(1, } );
@@ -292,6 +294,7 @@ fail:
 static const AVOption transpose_vulkan_options[] = {
 { "dir", "set transpose direction", OFFSET(dir), AV_OPT_TYPE_INT, { .i64 = TRANSPOSE_CCLOCK_FLIP }, 0, 7, FLAGS, "dir" },
 { "cclock_flip", "rotate counter-clockwise with vertical flip", 0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CCLOCK_FLIP }, .flags=FLAGS, .unit = "dir" },
+{ "clock",   "rotate clockwise",0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CLOCK   }, .flags=FLAGS, .unit = "dir" },
 { "cclock",  "rotate counter-clockwise",0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CCLOCK  }, .flags=FLAGS, .unit = "dir" },
 { "clock_flip",  "rotate clockwise with vertical flip", 0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CLOCK_FLIP  }, .flags=FLAGS, .unit = "dir" },
 { NULL }
-- 
2.25.1
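
To make the four dir values easier to picture, here is a CPU-side C sketch of
the coordinate mapping each direction implies. It is not the shader from the
patch: it uses exact integer indexing on a single 8-bit plane, and the enum Dir
and transpose_plane are hypothetical names whose ordering is assumed to follow
cclock_flip, clock, cclock, clock_flip.

```c
#include <assert.h>

/* Local stand-ins for the transpose directions; ordering is an assumption. */
enum Dir { DIR_CCLOCK_FLIP, DIR_CLOCK, DIR_CCLOCK, DIR_CLOCK_FLIP };

/* in is a w x h row-major plane; out receives the h x w transposed plane.
 * For every output position (x, y) we compute the source position (sx, sy). */
static void transpose_plane(const unsigned char *in, int w, int h,
                            unsigned char *out, enum Dir dir)
{
    assert(in && out && w > 0 && h > 0);
    int out_w = h, out_h = w;
    for (int y = 0; y < out_h; y++) {
        for (int x = 0; x < out_w; x++) {
            int sx, sy;
            switch (dir) {
            case DIR_CLOCK:      sx = y;         sy = h - 1 - x; break;
            case DIR_CCLOCK:     sx = w - 1 - y; sy = x;         break;
            case DIR_CLOCK_FLIP: sx = w - 1 - y; sy = h - 1 - x; break;
            default:             sx = y;         sy = x;         break; /* plain transpose */
            }
            out[y * out_w + x] = in[sy * w + sx];
        }
    }
}
```

With a 3x2 input holding 1 2 3 / 4 5 6, DIR_CLOCK yields the rows 4 1, 5 2,
6 3, which is the picture rotated by 90 degrees clockwise.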


[FFmpeg-devel] [PATCH 2/4] avfilter/vf_transpose_vulkan: add clock_flip option

2021-12-10 Thread Wu Jianhua

The following command shows how to apply the clock_flip option:

ffmpeg -init_hw_device vulkan -i input.264 -vf \
hwupload=extra_hw_frames=16,transpose_vulkan=dir=clock_flip,hwdownload,format=yuv420p \
output.264

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_transpose_vulkan.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/libavfilter/vf_transpose_vulkan.c b/libavfilter/vf_transpose_vulkan.c
index 59a548a12f..4c20becb5c 100644
--- a/libavfilter/vf_transpose_vulkan.c
+++ b/libavfilter/vf_transpose_vulkan.c
@@ -95,6 +95,8 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in)
 GLSLC(1, if (IS_WITHIN(pos, size)) {   );
 if (s->dir == TRANSPOSE_CCLOCK)
 GLSLF(2, vec4 res = texture(input_images[%i], ivec2(size.y - pos.y, pos.x)); ,i);
+else if (s->dir == TRANSPOSE_CLOCK_FLIP)
+GLSLF(2, vec4 res = texture(input_images[%i], ivec2(size.yx - pos.yx));  ,i);
 else
 GLSLF(2, vec4 res = texture(input_images[%i], pos.yx);   ,i);
 GLSLF(2, imageStore(output_images[%i], pos, res);,i);
@@ -291,6 +293,7 @@ static const AVOption transpose_vulkan_options[] = {
 { "dir", "set transpose direction", OFFSET(dir), AV_OPT_TYPE_INT, { .i64 = TRANSPOSE_CCLOCK_FLIP }, 0, 7, FLAGS, "dir" },
 { "cclock_flip", "rotate counter-clockwise with vertical flip", 0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CCLOCK_FLIP }, .flags=FLAGS, .unit = "dir" },
 { "cclock",  "rotate counter-clockwise",0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CCLOCK  }, .flags=FLAGS, .unit = "dir" },
+{ "clock_flip",  "rotate clockwise with vertical flip", 0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CLOCK_FLIP  }, .flags=FLAGS, .unit = "dir" },
 { NULL }
 };
 
-- 
2.25.1
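
For readers following the thread, the coordinate mapping that the three shader branches above express can be sketched on the CPU. This is only a minimal sketch under stated assumptions, not the filter's code: the names sketch_transpose and SketchDir are hypothetical, the enum values are local to the sketch rather than taken from transpose.h, the plane is tightly packed, and a "- 1" is added to the mirrored coordinates because the sketch indexes an integer array directly instead of sampling through texture().

```c
#include <stddef.h>

/* Direction names mirror the option names in the patch; the numeric values
 * are local to this sketch and are NOT taken from transpose.h. */
enum SketchDir {
    SKETCH_CCLOCK_FLIP, /* plain transpose: out(x, y) = in(y, x)       */
    SKETCH_CCLOCK,      /* 90 degrees counter-clockwise                */
    SKETCH_CLOCK_FLIP,  /* 90 degrees clockwise, then vertical flip    */
};

/* Transpose one tightly packed in_w x in_h byte plane into an
 * in_h x in_w plane, choosing the source pixel per output pixel. */
static void sketch_transpose(const unsigned char *src, unsigned char *dst,
                             size_t in_w, size_t in_h, enum SketchDir dir)
{
    const size_t out_w = in_h; /* output width  = input height */
    const size_t out_h = in_w; /* output height = input width  */

    for (size_t y = 0; y < out_h; y++) {
        for (size_t x = 0; x < out_w; x++) {
            size_t sx, sy; /* source coordinates in the input plane */

            switch (dir) {
            case SKETCH_CCLOCK:     /* like ivec2(size.y - pos.y, pos.x) */
                sx = out_h - 1 - y;
                sy = x;
                break;
            case SKETCH_CLOCK_FLIP: /* like ivec2(size.yx - pos.yx) */
                sx = out_h - 1 - y;
                sy = out_w - 1 - x;
                break;
            default:                /* like pos.yx */
                sx = y;
                sy = x;
                break;
            }
            dst[y * out_w + x] = src[sy * in_w + sx];
        }
    }
}
```

Calling sketch_transpose on a 3x2 plane holding 1..6 with SKETCH_CCLOCK yields the rows (3 6), (2 5), (1 4), which is the input rotated a quarter turn counter-clockwise.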


[FFmpeg-devel] [PATCH 1/4] avfilter/vf_transpose_vulkan: add cclock option

2021-12-10 Thread Wu Jianhua

The following command shows how to apply the cclock option:

ffmpeg -init_hw_device vulkan -i input.264 -vf \
hwupload=extra_hw_frames=16,transpose_vulkan=dir=cclock,hwdownload,format=yuv420p \
output.264

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_transpose_vulkan.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/libavfilter/vf_transpose_vulkan.c b/libavfilter/vf_transpose_vulkan.c
index c9bae413c3..59a548a12f 100644
--- a/libavfilter/vf_transpose_vulkan.c
+++ b/libavfilter/vf_transpose_vulkan.c
@@ -21,6 +21,7 @@
 #include "libavutil/opt.h"
 #include "vulkan_filter.h"
 #include "internal.h"
+#include "transpose.h"
 
 #define CGS 32
 
@@ -33,6 +34,7 @@ typedef struct TransposeVulkanContext {
 VkDescriptorImageInfo input_images[3];
 VkDescriptorImageInfo output_images[3];
 
+int dir;
 int initialized;
 } TransposeVulkanContext;
 
@@ -89,10 +91,13 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in)
 GLSLC(1, const ivec2 pos = ivec2(gl_GlobalInvocationID.xy););
 for (int i = 0; i < planes; i++) {
 GLSLC(0,   );
-GLSLF(1, size = imageSize(output_images[%i]); ,i);
+GLSLF(1, size = imageSize(output_images[%i]);,i);
 GLSLC(1, if (IS_WITHIN(pos, size)) {   );
-GLSLF(2, vec4 res = texture(input_images[%i], pos.yx);,i);
-GLSLF(2, imageStore(output_images[%i], pos, res); ,i);
+if (s->dir == TRANSPOSE_CCLOCK)
+GLSLF(2, vec4 res = texture(input_images[%i], ivec2(size.y - pos.y, pos.x)); ,i);
+else
+GLSLF(2, vec4 res = texture(input_images[%i], pos.yx);   ,i);
+GLSLF(2, imageStore(output_images[%i], pos, res);,i);
 GLSLC(1, } );
 }
 GLSLC(0, } );
@@ -279,7 +284,13 @@ fail:
 return err;
 }
 
+#define OFFSET(x) offsetof(TransposeVulkanContext, x)
+#define FLAGS (AV_OPT_FLAG_FILTERING_PARAM | AV_OPT_FLAG_VIDEO_PARAM)
+
 static const AVOption transpose_vulkan_options[] = {
+{ "dir", "set transpose direction", OFFSET(dir), AV_OPT_TYPE_INT, { .i64 = TRANSPOSE_CCLOCK_FLIP }, 0, 7, FLAGS, "dir" },
+{ "cclock_flip", "rotate counter-clockwise with vertical flip", 0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CCLOCK_FLIP }, .flags=FLAGS, .unit = "dir" },
+{ "cclock",  "rotate counter-clockwise",0, AV_OPT_TYPE_CONST, { .i64 = TRANSPOSE_CCLOCK  }, .flags=FLAGS, .unit = "dir" },
 { NULL }
 };
 
-- 
2.25.1


Re: [FFmpeg-devel] [PATCH v2 3/4] avfilter/gblur_vulkan: fix incorrect semantics

2021-12-09 Thread Wu Jianhua

Lynne<mailto:d...@lynne.ee>:
Sent: 9 December 2021, 19:26
To: FFmpeg development discussions and patches<mailto:ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v2 3/4] avfilter/gblur_vulkan: fix incorrect semantics

9 Dec 2021, 10:36 by jianhua...@intel.com:

>> The input and output are arrays of images, so it's better to use the plural
>> to align them to the variable name of VkDescriptorImageInfo.
>>
>> Signed-off-by: Wu Jianhua 
>> ---
>>  libavfilter/vf_gblur_vulkan.c | 60 +--
>>  1 file changed, 30 insertions(+), 30 deletions(-)
>>
>> diff --git a/libavfilter/vf_gblur_vulkan.c b/libavfilter/vf_gblur_vulkan.c
>> index a2e33d1c90..2dbbbd0965 100644
>> --- a/libavfilter/vf_gblur_vulkan.c
>> +++ b/libavfilter/vf_gblur_vulkan.c
>> @@ -50,31 +50,31 @@ typedef struct GBlurVulkanContext {
>>  } GBlurVulkanContext;
>>
>
> Not going to apply either patch, image[] looks
> perfectly fine to me, and you didn't need to reindent the entire
> shader kernel either.
>

Alright. I really like using the plural for arrays, and the reindentation, because
they make the code clearer and more concise, but I'm fine with dropping both
patches if you really don't like them.

Re: [FFmpeg-devel] [PATCH v2 1/4] avfilter: add a transpose_vulkan filter

2021-12-09 Thread Wu Jianhua

Lynne<mailto:d...@lynne.ee>:
Sent: 9 December 2021, 19:17
To: FFmpeg development discussions and patches<mailto:ffmpeg-devel@ffmpeg.org>
Subject: Re: [FFmpeg-devel] [PATCH v2 1/4] avfilter: add a transpose_vulkan filter

9 Dec 2021, 10:36 by jianhua...@intel.com:

>> The following command shows how to apply the transpose_vulkan filter:
>> ffmpeg -init_hw_device vulkan -i input.264 -vf \
>> hwupload=extra_hw_frames=16,transpose_vulkan,hwdownload,format=yuv420p \
>> output.264
>>
>> Signed-off-by: Wu Jianhua 
>> ---
>>  configure |   1 +
>>  libavfilter/Makefile  |   1 +
>>  libavfilter/allfilters.c  |   1 +
>>  libavfilter/vf_transpose_vulkan.c | 316 ++
>>  4 files changed, 319 insertions(+)
>>  create mode 100644 libavfilter/vf_transpose_vulkan.c
>>
>
> Could you make it match what the software transpose filter does
> by default, including the options?
>
Sure. This commit already has the same effect as the software filter's default.
The options will be introduced once the other effects are implemented, probably
in the next few days or weeks, though I'm not sure of the exact timing.

[FFmpeg-devel] [PATCH v2 4/4] avfilter/flip_vulkan: fix incorrect semantics

2021-12-09 Thread Wu Jianhua

The input and output are arrays of images, so it's better to use the plural
to align them to the variable name of VkDescriptorImageInfo.

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_flip_vulkan.c | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/libavfilter/vf_flip_vulkan.c b/libavfilter/vf_flip_vulkan.c
index 0223786ef1..6a6709e79b 100644
--- a/libavfilter/vf_flip_vulkan.c
+++ b/libavfilter/vf_flip_vulkan.c
@@ -52,7 +52,7 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame 
*in, enum FlipType
 
 FFVulkanDescriptorSetBinding image_descs[] = {
 {
-.name   = "input_image",
+.name   = "input_images",
 .type   = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,
 .dimensions = 2,
 .elems  = planes,
@@ -60,7 +60,7 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame 
*in, enum FlipType
 .updater= s->input_images,
 },
 {
-.name   = "output_image",
+.name   = "output_images",
 .type   = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
 .mem_layout = ff_vk_shader_rep_fmt(s->vkctx.output_format),
 .mem_quali  = "writeonly",
@@ -89,33 +89,33 @@ static av_cold int init_filter(AVFilterContext *ctx, 
AVFrame *in, enum FlipType
 ff_vk_set_compute_shader_sizes(shd, (int [3]){ CGS, 1, 1 });
 RET(ff_vk_add_descriptor_set(vkctx, s->pl, shd, image_descs, 
FF_ARRAY_ELEMS(image_descs), 0));
 
-GLSLC(0, void main()   
 );
-GLSLC(0, { 
 );
-GLSLC(1, ivec2 size;   
 );
-GLSLC(1, const ivec2 pos = ivec2(gl_GlobalInvocationID.xy);
 );
+GLSLC(0, void main()   
  );
+GLSLC(0, { 
  );
+GLSLC(1, ivec2 size;   
  );
+GLSLC(1, const ivec2 pos = ivec2(gl_GlobalInvocationID.xy);
  );
 for (int i = 0; i < planes; i++) {
-GLSLC(0,   
 );
-GLSLF(1, size = imageSize(output_image[%i]);   
   ,i);
-GLSLC(1, if (IS_WITHIN(pos, size)) {   
 );
+GLSLC(0,   
  );
+GLSLF(1, size = imageSize(output_images[%i]);  
,i);
+GLSLC(1, if (IS_WITHIN(pos, size)) {   
  );
 switch (type)
 {
 case FLIP_HORIZONTAL:
-GLSLF(2, vec4 res = texture(input_image[%i], ivec2(size.x - 
pos.x, pos.y));   ,i);
+GLSLF(2, vec4 res = texture(input_images[%i], ivec2(size.x - 
pos.x, pos.y));   ,i);
 break;
 case FLIP_VERTICAL:
-GLSLF(2, vec4 res = texture(input_image[%i], ivec2(pos.x, 
size.y - pos.y));   ,i);
+GLSLF(2, vec4 res = texture(input_images[%i], ivec2(pos.x, 
size.y - pos.y));   ,i);
 break;
 case FLIP_BOTH:
-GLSLF(2, vec4 res = texture(input_image[%i], ivec2(size.xy - 
pos.xy));, i);
+GLSLF(2, vec4 res = texture(input_images[%i], ivec2(size.xy - 
pos.xy));,i);
 break;
 default:
-GLSLF(2, vec4 res = texture(input_image[%i], pos); 
   ,i);
+GLSLF(2, vec4 res = texture(input_images[%i], pos);
,i);
 break;
 }
-GLSLF(2, imageStore(output_image[%i], pos, res);   
   ,i);
-GLSLC(1, } 
 );
+GLSLF(2, imageStore(output_images[%i], pos, res);  
,i);
+GLSLC(1, } 
  );
 }
-GLSLC(0, } 
 );
+GLSLC(0, } 
  );
 
 RET(ff_vk_compile_shader(vkctx, shd, "main"));
 RET(ff_vk_init_pipeline_layout(vkctx, s->pl));
-- 
2.25.1


[FFmpeg-devel] [PATCH v2 3/4] avfilter/gblur_vulkan: fix incorrect semantics

2021-12-09 Thread Wu Jianhua

The input and output are arrays of images, so it's better to use the plural
to align them to the variable name of VkDescriptorImageInfo.

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_gblur_vulkan.c | 60 +--
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/libavfilter/vf_gblur_vulkan.c b/libavfilter/vf_gblur_vulkan.c
index a2e33d1c90..2dbbbd0965 100644
--- a/libavfilter/vf_gblur_vulkan.c
+++ b/libavfilter/vf_gblur_vulkan.c
@@ -50,31 +50,31 @@ typedef struct GBlurVulkanContext {
 } GBlurVulkanContext;
 
 static const char gblur_horizontal[] = {
-C(0, void gblur(const ivec2 pos, const int index)  
)
-C(0, { 
)
-C(1, vec4 sum = texture(input_image[index], pos) * kernel[0];  
)
-C(0,   
)
-C(1, for(int i = 1; i < kernel.length(); i++) {
)
-C(2, sum += texture(input_image[index], pos + vec2(i, 0.0)) * 
kernel[i];   )
-C(2, sum += texture(input_image[index], pos - vec2(i, 0.0)) * 
kernel[i];   )
-C(1, } 
)
-C(0,   
)
-C(1, imageStore(output_image[index], pos, sum);
)
-C(0, } 
)
+C(0, void gblur(const ivec2 pos, const int index)  
 )
+C(0, { 
 )
+C(1, vec4 sum = texture(input_images[index], pos) * kernel[0]; 
 )
+C(0,   
 )
+C(1, for (int i = 1; i < kernel.length(); i++) {   
 )
+C(2, sum += texture(input_images[index], pos + vec2(i, 0.0)) * 
kernel[i];   )
+C(2, sum += texture(input_images[index], pos - vec2(i, 0.0)) * 
kernel[i];   )
+C(1, } 
 )
+C(0,   
 )
+C(1, imageStore(output_images[index], pos, sum);   
 )
+C(0, } 
 )
 };
 
 static const char gblur_vertical[] = {
-C(0, void gblur(const ivec2 pos, const int index)  
)
-C(0, { 
)
-C(1, vec4 sum = texture(input_image[index], pos) * kernel[0];  
)
-C(0,   
)
-C(1, for(int i = 1; i < kernel.length(); i++) {
)
-C(2, sum += texture(input_image[index], pos + vec2(0.0, i)) * 
kernel[i];   )
-C(2, sum += texture(input_image[index], pos - vec2(0.0, i)) * 
kernel[i];   )
-C(1, } 
)
-C(0,   
)
-C(1, imageStore(output_image[index], pos, sum);
)
-C(0, } 
)
+C(0, void gblur(const ivec2 pos, const int index)  
 )
+C(0, { 
 )
+C(1, vec4 sum = texture(input_images[index], pos) * kernel[0]; 
 )
+C(0,   
 )
+C(1, for (int i = 1; i < kernel.length(); i++) {   
 )
+C(2, sum += texture(input_images[index], pos + vec2(0.0, i)) * 
kernel[i];   )
+C(2, sum += texture(input_images[index], pos - vec2(0.0, i)) * 
kernel[i];   )
+C(1, } 
 )
+C(0,   
 )
+C(1, imageStore(output_images[index], pos, sum);   
 )
+C(0, } 
 )
 };
 
 static inline float gaussian(float sigma, float x)
@@ -133,14 +133,14 @@ static av_cold int init_filter(AVFilterContext *ctx, 
AVFrame *in)
 
 FFVulkanDescriptorSetBinding image_descs[] = {
 {
-.name   = "input_image",
+.name   = "input_
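
The archived message is cut off at this point. As a side note on what the gblur kernels in the diff compute: each pass starts with the center sample times kernel[0], then adds the samples at +i and -i, both weighted by kernel[i]. The following is a minimal CPU sketch of one horizontal pass under stated assumptions. The name sketch_blur_row is hypothetical, and clamping at the row edges is an assumption of this sketch (the shader relies on whatever the sampler does at the borders).

```c
/* One horizontal pass of a symmetric 1-D blur.
 * kernel[0] is the center tap; kernel[i] weights both x + i and x - i.
 * Out-of-range taps are clamped to the nearest edge sample (assumption). */
static void sketch_blur_row(const float *src, float *dst, int width,
                            const float *kernel, int taps)
{
    for (int x = 0; x < width; x++) {
        float sum = src[x] * kernel[0];

        for (int i = 1; i < taps; i++) {
            int left  = x - i < 0      ? 0         : x - i;
            int right = x + i >= width ? width - 1 : x + i;

            sum += src[right] * kernel[i];
            sum += src[left]  * kernel[i];
        }
        dst[x] = sum;
    }
}
```

A vertical pass is the same loop applied along columns, which is why the patch carries two nearly identical shader strings.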

[FFmpeg-devel] [PATCH v2 1/4] avfilter: add a transpose_vulkan filter

2021-12-09 Thread Wu Jianhua

The following command shows how to apply the transpose_vulkan filter:
ffmpeg -init_hw_device vulkan -i input.264 -vf \
hwupload=extra_hw_frames=16,transpose_vulkan,hwdownload,format=yuv420p \
output.264

Signed-off-by: Wu Jianhua 
---
 configure |   1 +
 libavfilter/Makefile  |   1 +
 libavfilter/allfilters.c  |   1 +
 libavfilter/vf_transpose_vulkan.c | 316 ++
 4 files changed, 319 insertions(+)
 create mode 100644 libavfilter/vf_transpose_vulkan.c

diff --git a/configure b/configure
index a98a18abaa..12cb49e877 100755
--- a/configure
+++ b/configure
@@ -3718,6 +3718,7 @@ tonemap_vaapi_filter_deps="vaapi 
VAProcFilterParameterBufferHDRToneMapping"
 tonemap_opencl_filter_deps="opencl const_nan"
 transpose_opencl_filter_deps="opencl"
 transpose_vaapi_filter_deps="vaapi VAProcPipelineCaps_rotation_flags"
+transpose_vulkan_filter_deps="vulkan spirv_compiler"
 unsharp_opencl_filter_deps="opencl"
 uspp_filter_deps="gpl avcodec"
 vaguedenoiser_filter_deps="gpl"
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index c8082c4a2f..8744cc3c63 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -483,6 +483,7 @@ OBJS-$(CONFIG_TRANSPOSE_FILTER)  += 
vf_transpose.o
 OBJS-$(CONFIG_TRANSPOSE_NPP_FILTER)  += vf_transpose_npp.o
 OBJS-$(CONFIG_TRANSPOSE_OPENCL_FILTER)   += vf_transpose_opencl.o opencl.o 
opencl/transpose.o
 OBJS-$(CONFIG_TRANSPOSE_VAAPI_FILTER)+= vf_transpose_vaapi.o 
vaapi_vpp.o
+OBJS-$(CONFIG_TRANSPOSE_VULKAN_FILTER)   += vf_transpose_vulkan.o vulkan.o 
vulkan_filter.o
 OBJS-$(CONFIG_TRIM_FILTER)   += trim.o
 OBJS-$(CONFIG_UNPREMULTIPLY_FILTER)  += vf_premultiply.o framesync.o
 OBJS-$(CONFIG_UNSHARP_FILTER)+= vf_unsharp.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index b1af2cbcc8..9e16b4e71e 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -462,6 +462,7 @@ extern const AVFilter ff_vf_transpose;
 extern const AVFilter ff_vf_transpose_npp;
 extern const AVFilter ff_vf_transpose_opencl;
 extern const AVFilter ff_vf_transpose_vaapi;
+extern const AVFilter ff_vf_transpose_vulkan;
 extern const AVFilter ff_vf_trim;
 extern const AVFilter ff_vf_unpremultiply;
 extern const AVFilter ff_vf_unsharp;
diff --git a/libavfilter/vf_transpose_vulkan.c 
b/libavfilter/vf_transpose_vulkan.c
new file mode 100644
index 0000000000..c9bae413c3
--- /dev/null
+++ b/libavfilter/vf_transpose_vulkan.c
@@ -0,0 +1,316 @@
+/*
+ * copyright (c) 2021 Wu Jianhua 
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/random_seed.h"
+#include "libavutil/opt.h"
+#include "vulkan_filter.h"
+#include "internal.h"
+
+#define CGS 32
+
+typedef struct TransposeVulkanContext {
+FFVulkanContext vkctx;
+FFVkQueueFamilyCtx qf;
+FFVkExecContext *exec;
+FFVulkanPipeline *pl;
+
+VkDescriptorImageInfo input_images[3];
+VkDescriptorImageInfo output_images[3];
+
+int initialized;
+} TransposeVulkanContext;
+
+static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in)
+{
+int err = 0;
+FFVkSPIRVShader *shd;
+TransposeVulkanContext *s = ctx->priv;
+FFVulkanContext *vkctx = &s->vkctx;
+const int planes = av_pix_fmt_count_planes(s->vkctx.output_format);
+
+FFVulkanDescriptorSetBinding image_descs[] = {
+{
+.name   = "input_images",
+.type   = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,
+.dimensions = 2,
+.elems  = planes,
+.stages = VK_SHADER_STAGE_COMPUTE_BIT,
+.updater= s->input_images,
+},
+{
+.name   = "output_images",
+.type   = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
+.mem_layout = ff_vk_shader_rep_fmt(s->vkctx.output_format),
+.mem_quali  = "writeonly",
+.dimensions = 2,
+.elems  = planes,
+.stages = VK_SHADER_STAGE_COMPUTE_BIT,
+.updater= s->output_images,
+

[FFmpeg-devel] [PATCH v2 2/4] avfilter/vf_transpose: fix un-checked potential memory allocation failure

2021-12-09 Thread Wu Jianhua

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_transpose.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/libavfilter/vf_transpose.c b/libavfilter/vf_transpose.c
index f9f0d70cd5..b964daeee3 100644
--- a/libavfilter/vf_transpose.c
+++ b/libavfilter/vf_transpose.c
@@ -328,6 +328,7 @@ static int filter_slice(AVFilterContext *ctx, void *arg, 
int jobnr,
 
 static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 {
+int err = 0;
 AVFilterContext *ctx = inlink->dst;
 TransContext *s = ctx->priv;
 AVFilterLink *outlink = ctx->outputs[0];
@@ -339,10 +340,13 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 
 out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
 if (!out) {
-av_frame_free(&in);
-return AVERROR(ENOMEM);
+err = AVERROR(ENOMEM);
+goto fail;
 }
-av_frame_copy_props(out, in);
+
+err = av_frame_copy_props(out, in);
+if (err < 0)
+goto fail;
 
 if (in->sample_aspect_ratio.num == 0) {
 out->sample_aspect_ratio = in->sample_aspect_ratio;
@@ -356,6 +360,11 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
   FFMIN(outlink->h, ff_filter_get_nb_threads(ctx)));
 av_frame_free(&in);
 return ff_filter_frame(outlink, out);
+
+fail:
+av_frame_free(&out);
+av_frame_free(&in);
+return err;
 }
 
 #define OFFSET(x) offsetof(TransContext, x)
-- 
2.25.1
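
The patch above routes every failure through a single "fail" label that releases both frames. A minimal, library-free sketch of that cleanup pattern follows. All names here (SketchBuf, sketch_buf_alloc, sketch_buf_free, sketch_process) are hypothetical stand-ins and not FFmpeg API; the only thing the sketch borrows from the patch is the shape of the control flow, with a free function that accepts a pointer-to-pointer and tolerates NULL.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a frame; not an FFmpeg type. */
typedef struct SketchBuf {
    unsigned char *data;
    size_t size;
} SketchBuf;

/* Frees *b and resets it to NULL; safe to call when *b is already NULL. */
static void sketch_buf_free(SketchBuf **b)
{
    if (b && *b) {
        free((*b)->data);
        free(*b);
        *b = NULL;
    }
}

static SketchBuf *sketch_buf_alloc(size_t size)
{
    SketchBuf *b = calloc(1, sizeof(*b));

    if (!b)
        return NULL;
    b->data = malloc(size ? size : 1);
    if (!b->data) {
        free(b);
        return NULL;
    }
    b->size = size;
    return b;
}

/* Consumes *in in every case. On success stores a copy in *out_ret and
 * returns 0; on failure returns a negative errno value. */
static int sketch_process(SketchBuf **in, SketchBuf **out_ret)
{
    int err = 0;
    SketchBuf *out = NULL;

    if (!*in) {
        err = -EINVAL;
        goto fail;
    }

    out = sketch_buf_alloc((*in)->size);
    if (!out) {
        err = -ENOMEM;
        goto fail;
    }
    memcpy(out->data, (*in)->data, (*in)->size);

    sketch_buf_free(in);
    *out_ret = out;
    return 0;

fail:
    /* Single exit path: both buffers are released no matter which step failed. */
    sketch_buf_free(&out);
    sketch_buf_free(in);
    return err;
}
```

Because the free helper ignores NULL, the fail path does not need to know how far the function got before the error.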


Re: [FFmpeg-devel] [PATCH v3 1/3] avfilter/x86/vf_exposure: add x86 SIMD optimization

2021-12-08 Thread Wu, Jianhua

Ping
Wu, Jianhua:
> Ping.
> > From: Wu, Jianhua 
> > Sent: Monday, November 22, 2021 4:09 PM
> > To: ffmpeg-devel@ffmpeg.org
> > Cc: Wu, Jianhua 
> > Subject: [PATCH v3 1/3] avfilter/x86/vf_exposure: add x86 SIMD
> > optimization
> >
> > Performance data(Less is better):
> > exposure_c:    857394
> > exposure_sse:  327589
> >
> > Signed-off-by: Wu Jianhua 
> > ---
> >  libavfilter/exposure.h | 36 +++
> >  libavfilter/vf_exposure.c  | 36 +--
> >  libavfilter/x86/Makefile   |  2 ++
> >  libavfilter/x86/vf_exposure.asm| 55
> > ++
> >  libavfilter/x86/vf_exposure_init.c | 36 +++
> >  5 files changed, 147 insertions(+), 18 deletions(-)  create mode
> > 100644 libavfilter/exposure.h  create mode 100644
> > libavfilter/x86/vf_exposure.asm create mode 100644
> > libavfilter/x86/vf_exposure_init.c
> >
> > diff --git a/libavfilter/exposure.h b/libavfilter/exposure.h new file
> > mode
> > 100644 index 00..e76a517826
> > --- /dev/null
> > +++ b/libavfilter/exposure.h
> > @@ -0,0 +1,36 @@
> > +/*
> > + * This file is part of FFmpeg.
> > + *
> > + * FFmpeg is free software; you can redistribute it and/or
> > + * modify it under the terms of the GNU Lesser General Public
> > + * License as published by the Free Software Foundation; either
> > + * version 2.1 of the License, or (at your option) any later version.
> > + *
> > + * FFmpeg is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > GNU
> > + * Lesser General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU Lesser General Public
> > + * License along with FFmpeg; if not, write to the Free Software
> > + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
> > +02110-1301 USA  */
> > +
> > +#ifndef AVFILTER_EXPOSURE_H
> > +#define AVFILTER_EXPOSURE_H
> > +#include "avfilter.h"
> > +
> > +typedef struct ExposureContext {
> > +const AVClass *class;
> > +
> > +float exposure;
> > +float black;
> > +float scale;
> > +
> > +void (*exposure_func)(float *ptr, int length, float black, float
> > +scale); } ExposureContext;
> > +
> > +void ff_exposure_init(ExposureContext *s); void
> > +ff_exposure_init_x86(ExposureContext *s);
> > +
> > +#endif
> > diff --git a/libavfilter/vf_exposure.c b/libavfilter/vf_exposure.c
> > index
> > 108fba7930..045ae710d3 100644
> > --- a/libavfilter/vf_exposure.c
> > +++ b/libavfilter/vf_exposure.c
> > @@ -26,23 +26,20 @@
> >  #include "formats.h"
> >  #include "internal.h"
> >  #include "video.h"
> > +#include "exposure.h"
> >
> > -typedef struct ExposureContext {
> > -const AVClass *class;
> > -
> > -float exposure;
> > -float black;
> > +static void exposure_c(float *ptr, int length, float black, float
> > +scale) {
> > +int i;
> >
> > -float scale;
> > -int (*do_slice)(AVFilterContext *s, void *arg,
> > -int jobnr, int nb_jobs);
> > -} ExposureContext;
> > +for (i = 0; i < length; i++)
> > +ptr[i] = (ptr[i] - black) * scale; }
> >
> >  static int exposure_slice(AVFilterContext *ctx, void *arg, int jobnr,
> > int
> > nb_jobs)  {
> >  ExposureContext *s = ctx->priv;
> >  AVFrame *frame = arg;
> > -const int width = frame->width;
> >  const int height = frame->height;
> >  const int slice_start = (height * jobnr) / nb_jobs;
> >  const int slice_end = (height * (jobnr + 1)) / nb_jobs; @@ -52,24
> > +49,27 @@ static int exposure_slice(AVFilterContext *ctx, void *arg,
> > int jobnr, int nb_job
> >  for (int p = 0; p < 3; p++) {
> >  const int linesize = frame->linesize[p] / 4;
> >  float *ptr = (float *)frame->data[p] + slice_start * linesize;
> > -for (int y = slice_start; y < slice_end; y++) {
> > -for (int x = 0; x < width; x++)
> > -ptr[x] = (ptr[x] - black) * scale;
> > -
> > -ptr += linesize;
> > -}
> > +s->exposure_func(ptr, linesize * (slice_end - slice_start),
> > + black, sc
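
The archived quote is cut off here. The idea visible in the quoted diff is that the per-pixel loop moves into a function reached through a pointer, so an init routine can later swap in a faster implementation. A minimal sketch of that dispatch follows, assuming only what the quote shows. The names SketchExposure, sketch_exposure_c and sketch_exposure_init are hypothetical, and no SIMD variant is included.

```c
/* Signature copied from the exposure_func pointer in the quoted header. */
typedef void (*sketch_exposure_fn)(float *ptr, int length,
                                   float black, float scale);

/* Scalar reference: subtract the black level, then scale. */
static void sketch_exposure_c(float *ptr, int length, float black, float scale)
{
    for (int i = 0; i < length; i++)
        ptr[i] = (ptr[i] - black) * scale;
}

/* Hypothetical reduced context holding only what the sketch needs. */
typedef struct SketchExposure {
    float black;
    float scale;
    sketch_exposure_fn exposure_func;
} SketchExposure;

static void sketch_exposure_init(SketchExposure *s, float black, float scale)
{
    s->black = black;
    s->scale = scale;
    /* A real build could replace this pointer with an optimized routine
     * after checking CPU features; the sketch always uses the C version. */
    s->exposure_func = sketch_exposure_c;
}
```

Callers only ever go through s->exposure_func, so the slice code stays the same whichever implementation was selected.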

[FFmpeg-devel] [PATCH 4/4] avfilter/flip_vulkan: fix incorrect sematic

2021-12-07 Thread Wu Jianhua

The input and output are arrays of images, so it's better to use the plural
to align them to the variable name of VkDescriptorImageInfo.

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_flip_vulkan.c | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/libavfilter/vf_flip_vulkan.c b/libavfilter/vf_flip_vulkan.c
index 0223786ef1..6a6709e79b 100644
--- a/libavfilter/vf_flip_vulkan.c
+++ b/libavfilter/vf_flip_vulkan.c
@@ -52,7 +52,7 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame 
*in, enum FlipType
 
 FFVulkanDescriptorSetBinding image_descs[] = {
 {
-.name   = "input_image",
+.name   = "input_images",
 .type   = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,
 .dimensions = 2,
 .elems  = planes,
@@ -60,7 +60,7 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame 
*in, enum FlipType
 .updater= s->input_images,
 },
 {
-.name   = "output_image",
+.name   = "output_images",
 .type   = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
 .mem_layout = ff_vk_shader_rep_fmt(s->vkctx.output_format),
 .mem_quali  = "writeonly",
@@ -89,33 +89,33 @@ static av_cold int init_filter(AVFilterContext *ctx, 
AVFrame *in, enum FlipType
 ff_vk_set_compute_shader_sizes(shd, (int [3]){ CGS, 1, 1 });
 RET(ff_vk_add_descriptor_set(vkctx, s->pl, shd, image_descs, 
FF_ARRAY_ELEMS(image_descs), 0));
 
-GLSLC(0, void main()   
 );
-GLSLC(0, { 
 );
-GLSLC(1, ivec2 size;   
 );
-GLSLC(1, const ivec2 pos = ivec2(gl_GlobalInvocationID.xy);
 );
+GLSLC(0, void main()   
  );
+GLSLC(0, { 
  );
+GLSLC(1, ivec2 size;   
  );
+GLSLC(1, const ivec2 pos = ivec2(gl_GlobalInvocationID.xy);
  );
 for (int i = 0; i < planes; i++) {
-GLSLC(0,   
 );
-GLSLF(1, size = imageSize(output_image[%i]);   
   ,i);
-GLSLC(1, if (IS_WITHIN(pos, size)) {   
 );
+GLSLC(0,   
  );
+GLSLF(1, size = imageSize(output_images[%i]);  
,i);
+GLSLC(1, if (IS_WITHIN(pos, size)) {   
  );
 switch (type)
 {
 case FLIP_HORIZONTAL:
-GLSLF(2, vec4 res = texture(input_image[%i], ivec2(size.x - 
pos.x, pos.y));   ,i);
+GLSLF(2, vec4 res = texture(input_images[%i], ivec2(size.x - 
pos.x, pos.y));   ,i);
 break;
 case FLIP_VERTICAL:
-GLSLF(2, vec4 res = texture(input_image[%i], ivec2(pos.x, 
size.y - pos.y));   ,i);
+GLSLF(2, vec4 res = texture(input_images[%i], ivec2(pos.x, 
size.y - pos.y));   ,i);
 break;
 case FLIP_BOTH:
-GLSLF(2, vec4 res = texture(input_image[%i], ivec2(size.xy - 
pos.xy));, i);
+GLSLF(2, vec4 res = texture(input_images[%i], ivec2(size.xy - 
pos.xy));,i);
 break;
 default:
-GLSLF(2, vec4 res = texture(input_image[%i], pos); 
   ,i);
+GLSLF(2, vec4 res = texture(input_images[%i], pos);
,i);
 break;
 }
-GLSLF(2, imageStore(output_image[%i], pos, res);   
   ,i);
-GLSLC(1, } 
 );
+GLSLF(2, imageStore(output_images[%i], pos, res);  
,i);
+GLSLC(1, } 
  );
 }
-GLSLC(0, } 
 );
+GLSLC(0, } 
  );
 
 RET(ff_vk_compile_shader(vkctx, shd, "main"));
 RET(ff_vk_init_pipeline_layout(vkctx, s->pl));
-- 
2.25.1


[FFmpeg-devel] [PATCH 3/4] avfilter/gblur_vulkan: fix incorrect sematic

2021-12-07 Thread Wu Jianhua

The input and output are arrays of images, so it's better to use the plural
to align them to the variable name of VkDescriptorImageInfo.

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_gblur_vulkan.c | 60 +--
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/libavfilter/vf_gblur_vulkan.c b/libavfilter/vf_gblur_vulkan.c
index a2e33d1c90..2dbbbd0965 100644
--- a/libavfilter/vf_gblur_vulkan.c
+++ b/libavfilter/vf_gblur_vulkan.c
@@ -50,31 +50,31 @@ typedef struct GBlurVulkanContext {
 } GBlurVulkanContext;
 
 static const char gblur_horizontal[] = {
-C(0, void gblur(const ivec2 pos, const int index)  
)
-C(0, { 
)
-C(1, vec4 sum = texture(input_image[index], pos) * kernel[0];  
)
-C(0,   
)
-C(1, for(int i = 1; i < kernel.length(); i++) {
)
-C(2, sum += texture(input_image[index], pos + vec2(i, 0.0)) * 
kernel[i];   )
-C(2, sum += texture(input_image[index], pos - vec2(i, 0.0)) * 
kernel[i];   )
-C(1, } 
)
-C(0,   
)
-C(1, imageStore(output_image[index], pos, sum);
)
-C(0, } 
)
+C(0, void gblur(const ivec2 pos, const int index)  
 )
+C(0, { 
 )
+C(1, vec4 sum = texture(input_images[index], pos) * kernel[0]; 
 )
+C(0,   
 )
+C(1, for (int i = 1; i < kernel.length(); i++) {   
 )
+C(2, sum += texture(input_images[index], pos + vec2(i, 0.0)) * 
kernel[i];   )
+C(2, sum += texture(input_images[index], pos - vec2(i, 0.0)) * 
kernel[i];   )
+C(1, } 
 )
+C(0,   
 )
+C(1, imageStore(output_images[index], pos, sum);   
 )
+C(0, } 
 )
 };
 
 static const char gblur_vertical[] = {
-C(0, void gblur(const ivec2 pos, const int index)  
)
-C(0, { 
)
-C(1, vec4 sum = texture(input_image[index], pos) * kernel[0];  
)
-C(0,   
)
-C(1, for(int i = 1; i < kernel.length(); i++) {
)
-C(2, sum += texture(input_image[index], pos + vec2(0.0, i)) * 
kernel[i];   )
-C(2, sum += texture(input_image[index], pos - vec2(0.0, i)) * 
kernel[i];   )
-C(1, } 
)
-C(0,   
)
-C(1, imageStore(output_image[index], pos, sum);
)
-C(0, } 
)
+C(0, void gblur(const ivec2 pos, const int index)  
 )
+C(0, { 
 )
+C(1, vec4 sum = texture(input_images[index], pos) * kernel[0]; 
 )
+C(0,   
 )
+C(1, for (int i = 1; i < kernel.length(); i++) {   
 )
+C(2, sum += texture(input_images[index], pos + vec2(0.0, i)) * 
kernel[i];   )
+C(2, sum += texture(input_images[index], pos - vec2(0.0, i)) * 
kernel[i];   )
+C(1, } 
 )
+C(0,   
 )
+C(1, imageStore(output_images[index], pos, sum);   
 )
+C(0, } 
 )
 };
 
 static inline float gaussian(float sigma, float x)
@@ -133,14 +133,14 @@ static av_cold int init_filter(AVFilterContext *ctx, 
AVFrame *in)
 
 FFVulkanDescriptorSetBinding image_descs[] = {
 {
-.name   = "input_image",
+.name   = "input_

[FFmpeg-devel] [PATCH 2/4] avfilter/vf_transpose: fix un-checked potential memory allocation failure

2021-12-07 Thread Wu Jianhua

Signed-off-by: Wu Jianhua 
---
 libavfilter/vf_transpose.c | 15 ---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/libavfilter/vf_transpose.c b/libavfilter/vf_transpose.c
index f9f0d70cd5..b964daeee3 100644
--- a/libavfilter/vf_transpose.c
+++ b/libavfilter/vf_transpose.c
@@ -328,6 +328,7 @@ static int filter_slice(AVFilterContext *ctx, void *arg, 
int jobnr,
 
 static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 {
+int err = 0;
 AVFilterContext *ctx = inlink->dst;
 TransContext *s = ctx->priv;
 AVFilterLink *outlink = ctx->outputs[0];
@@ -339,10 +340,13 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
 
 out = ff_get_video_buffer(outlink, outlink->w, outlink->h);
 if (!out) {
-av_frame_free(&in);
-return AVERROR(ENOMEM);
+err = AVERROR(ENOMEM);
+goto fail;
 }
-av_frame_copy_props(out, in);
+
+err = av_frame_copy_props(out, in);
+if (err < 0)
+goto fail;
 
 if (in->sample_aspect_ratio.num == 0) {
 out->sample_aspect_ratio = in->sample_aspect_ratio;
@@ -356,6 +360,11 @@ static int filter_frame(AVFilterLink *inlink, AVFrame *in)
   FFMIN(outlink->h, ff_filter_get_nb_threads(ctx)));
 av_frame_free(&in);
 return ff_filter_frame(outlink, out);
+
+fail:
+av_frame_free(&out);
+av_frame_free(&in);
+return err;
 }
 
 #define OFFSET(x) offsetof(TransContext, x)
-- 
2.25.1


[FFmpeg-devel] [PATCH 1/4] avfilter: add a transpose_vulkan filter

2021-12-07 Thread Wu Jianhua

The following command shows how to apply the transpose_vulkan filter:
ffmpeg -init_hw_device vulkan -i input.264 -vf \
hwupload=extra_hw_frames=16,transpose_vulkan,hwdownload,format=yuv420p \
output.264

Signed-off-by: Wu Jianhua 
---
 configure |   1 +
 libavfilter/Makefile  |   1 +
 libavfilter/allfilters.c  |   1 +
 libavfilter/vf_transpose_vulkan.c | 316 ++
 4 files changed, 319 insertions(+)
 create mode 100644 libavfilter/vf_transpose_vulkan.c

diff --git a/configure b/configure
index a98a18abaa..12cb49e877 100755
--- a/configure
+++ b/configure
@@ -3718,6 +3718,7 @@ tonemap_vaapi_filter_deps="vaapi 
VAProcFilterParameterBufferHDRToneMapping"
 tonemap_opencl_filter_deps="opencl const_nan"
 transpose_opencl_filter_deps="opencl"
 transpose_vaapi_filter_deps="vaapi VAProcPipelineCaps_rotation_flags"
+transpose_vulkan_filter_deps="vulkan spirv_compiler"
 unsharp_opencl_filter_deps="opencl"
 uspp_filter_deps="gpl avcodec"
 vaguedenoiser_filter_deps="gpl"
diff --git a/libavfilter/Makefile b/libavfilter/Makefile
index c8082c4a2f..8744cc3c63 100644
--- a/libavfilter/Makefile
+++ b/libavfilter/Makefile
@@ -483,6 +483,7 @@ OBJS-$(CONFIG_TRANSPOSE_FILTER)  += 
vf_transpose.o
 OBJS-$(CONFIG_TRANSPOSE_NPP_FILTER)  += vf_transpose_npp.o
 OBJS-$(CONFIG_TRANSPOSE_OPENCL_FILTER)   += vf_transpose_opencl.o opencl.o 
opencl/transpose.o
 OBJS-$(CONFIG_TRANSPOSE_VAAPI_FILTER)+= vf_transpose_vaapi.o 
vaapi_vpp.o
+OBJS-$(CONFIG_TRANSPOSE_VULKAN_FILTER)   += vf_transpose_vulkan.o vulkan.o 
vulkan_filter.o
 OBJS-$(CONFIG_TRIM_FILTER)   += trim.o
 OBJS-$(CONFIG_UNPREMULTIPLY_FILTER)  += vf_premultiply.o framesync.o
 OBJS-$(CONFIG_UNSHARP_FILTER)+= vf_unsharp.o
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index b1af2cbcc8..9e16b4e71e 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -462,6 +462,7 @@ extern const AVFilter ff_vf_transpose;
 extern const AVFilter ff_vf_transpose_npp;
 extern const AVFilter ff_vf_transpose_opencl;
 extern const AVFilter ff_vf_transpose_vaapi;
+extern const AVFilter ff_vf_transpose_vulkan;
 extern const AVFilter ff_vf_trim;
 extern const AVFilter ff_vf_unpremultiply;
 extern const AVFilter ff_vf_unsharp;
diff --git a/libavfilter/vf_transpose_vulkan.c 
b/libavfilter/vf_transpose_vulkan.c
new file mode 100644
index 0000000000..c9bae413c3
--- /dev/null
+++ b/libavfilter/vf_transpose_vulkan.c
@@ -0,0 +1,316 @@
+/*
+ * copyright (c) 2021 Wu Jianhua 
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "libavutil/random_seed.h"
+#include "libavutil/opt.h"
+#include "vulkan_filter.h"
+#include "internal.h"
+
+#define CGS 32
+
+typedef struct TransposeVulkanContext {
+FFVulkanContext vkctx;
+FFVkQueueFamilyCtx qf;
+FFVkExecContext *exec;
+FFVulkanPipeline *pl;
+
+VkDescriptorImageInfo input_images[3];
+VkDescriptorImageInfo output_images[3];
+
+int initialized;
+} TransposeVulkanContext;
+
+static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in)
+{
+int err = 0;
+FFVkSPIRVShader *shd;
+TransposeVulkanContext *s = ctx->priv;
+FFVulkanContext *vkctx = &s->vkctx;
+const int planes = av_pix_fmt_count_planes(s->vkctx.output_format);
+
+FFVulkanDescriptorSetBinding image_descs[] = {
+{
+.name   = "input_images",
+.type   = VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER,
+.dimensions = 2,
+.elems  = planes,
+.stages = VK_SHADER_STAGE_COMPUTE_BIT,
+.updater= s->input_images,
+},
+{
+.name   = "output_images",
+.type   = VK_DESCRIPTOR_TYPE_STORAGE_IMAGE,
+.mem_layout = ff_vk_shader_rep_fmt(s->vkctx.output_format),
+.mem_quali  = "writeonly",
+.dimensions = 2,
+.elems  = planes,
+.stages = VK_SHADER_STAGE_COMPUTE_BIT,
+.updater= s->output_images,
+

Re: [FFmpeg-devel] [PATCH v3 1/3] avfilter/x86/vf_exposure: add x86 SIMD optimization

2021-12-01 Thread Wu, Jianhua

Ping.
> From: Wu, Jianhua 
> Sent: Monday, November 22, 2021 4:09 PM
> To: ffmpeg-devel@ffmpeg.org
> Cc: Wu, Jianhua 
> Subject: [PATCH v3 1/3] avfilter/x86/vf_exposure: add x86 SIMD optimization
> 
> Performance data(Less is better):
> exposure_c:857394
> exposure_sse:  327589
> 
> Signed-off-by: Wu Jianhua 
> ---
>  libavfilter/exposure.h | 36 +++
>  libavfilter/vf_exposure.c  | 36 +--
>  libavfilter/x86/Makefile   |  2 ++
>  libavfilter/x86/vf_exposure.asm| 55 ++
>  libavfilter/x86/vf_exposure_init.c | 36 +++
>  5 files changed, 147 insertions(+), 18 deletions(-)
>  create mode 100644 libavfilter/exposure.h
>  create mode 100644 libavfilter/x86/vf_exposure.asm
>  create mode 100644 libavfilter/x86/vf_exposure_init.c
> 
> diff --git a/libavfilter/exposure.h b/libavfilter/exposure.h
> new file mode 100644
> index 00..e76a517826
> --- /dev/null
> +++ b/libavfilter/exposure.h
> @@ -0,0 +1,36 @@
> +/*
> + * This file is part of FFmpeg.
> + *
> + * FFmpeg is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU Lesser General Public
> + * License as published by the Free Software Foundation; either
> + * version 2.1 of the License, or (at your option) any later version.
> + *
> + * FFmpeg is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * Lesser General Public License for more details.
> + *
> + * You should have received a copy of the GNU Lesser General Public
> + * License along with FFmpeg; if not, write to the Free Software
> + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
> + */
> +
> +#ifndef AVFILTER_EXPOSURE_H
> +#define AVFILTER_EXPOSURE_H
> +#include "avfilter.h"
> +
> +typedef struct ExposureContext {
> +const AVClass *class;
> +
> +float exposure;
> +float black;
> +float scale;
> +
> +void (*exposure_func)(float *ptr, int length, float black, float scale);
> +} ExposureContext;
> +
> +void ff_exposure_init(ExposureContext *s);
> +void ff_exposure_init_x86(ExposureContext *s);
> +
> +#endif
> diff --git a/libavfilter/vf_exposure.c b/libavfilter/vf_exposure.c
> index 108fba7930..045ae710d3 100644
> --- a/libavfilter/vf_exposure.c
> +++ b/libavfilter/vf_exposure.c
> @@ -26,23 +26,20 @@
>  #include "formats.h"
>  #include "internal.h"
>  #include "video.h"
> +#include "exposure.h"
> 
> -typedef struct ExposureContext {
> -const AVClass *class;
> -
> -float exposure;
> -float black;
> +static void exposure_c(float *ptr, int length, float black, float scale)
> +{
> +int i;
> 
> -float scale;
> -int (*do_slice)(AVFilterContext *s, void *arg,
> -int jobnr, int nb_jobs);
> -} ExposureContext;
> +for (i = 0; i < length; i++)
> +ptr[i] = (ptr[i] - black) * scale;
> +}
> 
>  static int exposure_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
>  {
>  ExposureContext *s = ctx->priv;
>  AVFrame *frame = arg;
> -const int width = frame->width;
>  const int height = frame->height;
>  const int slice_start = (height * jobnr) / nb_jobs;
>  const int slice_end = (height * (jobnr + 1)) / nb_jobs;
> @@ -52,24 +49,27 @@ static int exposure_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_job
>  for (int p = 0; p < 3; p++) {
>  const int linesize = frame->linesize[p] / 4;
>  float *ptr = (float *)frame->data[p] + slice_start * linesize;
> -for (int y = slice_start; y < slice_end; y++) {
> -for (int x = 0; x < width; x++)
> -ptr[x] = (ptr[x] - black) * scale;
> -
> -ptr += linesize;
> -}
> +s->exposure_func(ptr, linesize * (slice_end - slice_start),
> + black, scale);
>  }
> 
>  return 0;
>  }
> 
> +void ff_exposure_init(ExposureContext *s) {
> +s->exposure_func = exposure_c;
> +
> +if (ARCH_X86)
> +ff_exposure_init_x86(s);
> +}
> +
>  static int filter_frame(AVFilterLink *inlink, AVFrame *frame)
>  {
>  AVFilterContext *ctx = inlink->dst;
>  ExposureContext *s = ctx->priv;
> 
>  s->scale = 1.f / (exp2f(-s->exposure) - s->black);
> -ff_filter_execute(ctx, s->do_slice, frame, NULL,
> +ff_filter_execute(ctx, exposure_slice, frame, NULL,
>
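
Note on the quoted patch above: the refactor replaces the per-pixel loop in
exposure_slice() with a function pointer that an init routine can override
per architecture. A minimal sketch of that dispatch pattern in plain C is
given below. The names ExposureSketch, exposure_scalar and
exposure_sketch_init are illustrative assumptions and are not FFmpeg API;
only the arithmetic (ptr[i] - black) * scale is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for ExposureContext: parameters plus a kernel pointer. */
typedef struct ExposureSketch {
    float black;
    float scale;
    void (*exposure_func)(float *ptr, int length, float black, float scale);
} ExposureSketch;

/* Scalar reference kernel: same arithmetic as exposure_c in the patch. */
static void exposure_scalar(float *ptr, int length, float black, float scale)
{
    assert(ptr != NULL && length >= 0);
    for (int i = 0; i < length; i++)
        ptr[i] = (ptr[i] - black) * scale;
}

/* Install the portable kernel. An architecture-specific init routine
 * (assumed, not shown) would overwrite the pointer after this call. */
static void exposure_sketch_init(ExposureSketch *s)
{
    assert(s != NULL);
    s->exposure_func = exposure_scalar;
}
```

Callers then go through s->exposure_func(...) and never name the kernel
directly. In the quoted patch the length passed is linesize times the number
of rows in the slice, so any row padding is processed along with the visible
pixels.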

Re: [FFmpeg-devel] Re: [PATCH v4 2/2] avfilter: add a flip_vulkan filter

2021-12-01 Thread Wu, Jianhua

Ping.
> Wu Jianhua:
> Lynne:
> >>ffm...@gyani.pro：
> >>> On 2021-11-26 04:33 pm, Lynne wrote:
> >>>
> >>>> 26 Nov 2021, 11:37 by ffm...@gyani.pro:
> >>>>
> >>>>>
> >>>>> On 2021-11-26 03:08 pm, Lynne wrote:
> >>>>>
> >>>>>> 26 Nov 2021, 10:10 by jianhua...@intel.com:
> >>>>>>
> >>>>>> This filter flips the input video both horizontally and
> >>>>>> vertically in one compute pipeline, so there is no longer any
> >>>>>> need to chain two pipelines (hflip_vulkan,vflip_vulkan).
> >>>>>>
> >>>>>> Signed-off-by: Wu Jianhua 
> >>>>>> ---
> >>>>>>  configure|  1 +
> >>>>>>  libavfilter/allfilters.c |  1 +
> >>>>>>  libavfilter/vf_flip_vulkan.c | 61
> >>>>>> +---
> >>>>>>  3 files changed, 51 insertions(+), 12 deletions(-)
> >>>>>>
> >>>>> I'll push this tonight if there are no further objections to the name.
> >>>>>
> >>>> Will other flip modes be added to this filter?
> >>>>
> 
> Hi Gyan:
> 
> No further flip modes are planned. This filter was inspired by OpenCV,
> which offers three flip modes: horizontal, vertical, and both. When
> someone specifies hflip, they want a horizontal flip. If flip is
> specified without 'v' or 'h', no particular direction is requested.
> So I think it is fine to use flip for both directions, and then we
> don't need to come up with a new name.
> 
> Thanks,
> Jianhua
> 
> >>> No. Other flip modes are already separate filters, and transposition
> >>> would be another filter. Our transpose filter doesn't support
> >>> flipping the image, and to keep options compatible between software
> >>> and hardware filters, we can't add it. So the most appropriate
> >>> filter for this is a standalone one.
> >>>
> >>
> >> If modes can be added, you can add them here, and deprecate the
> >> single mode variants.
> >> Since these filters are very new, there are no legacy users to
> >> accommodate, and now would be the best time, before they become
> >> present in a release branch.
> >>
> >
> >There are no more modes to be added, flipping != transpose != rotate,
> >according to ffmpeg nomenclature, since we have separate filters for each.
> >
> 

Hi  there,

 Any update?

[FFmpeg-devel] Re: [PATCH v4 2/2] avfilter: add a flip_vulkan filter

2021-11-26 Thread Wu Jianhua

Lynne:
>>ffm...@gyani.pro：
>>> On 2021-11-26 04:33 pm, Lynne wrote:
>>>
>>>> 26 Nov 2021, 11:37 by ffm...@gyani.pro:
>>>>
>>>>>
>>>>> On 2021-11-26 03:08 pm, Lynne wrote:
>>>>>
>>>>>> 26 Nov 2021, 10:10 by jianhua...@intel.com:
>>>>>>
>>>>>> This filter flips the input video both horizontally and vertically
>>>>>> in one compute pipeline, so there is no longer any need to chain
>>>>>> two pipelines (hflip_vulkan,vflip_vulkan).
>>>>>>
>>>>>> Signed-off-by: Wu Jianhua 
>>>>>> ---
>>>>>>  configure|  1 +
>>>>>>  libavfilter/allfilters.c |  1 +
>>>>>>  libavfilter/vf_flip_vulkan.c | 61 +---
>>>>>>  3 files changed, 51 insertions(+), 12 deletions(-)
>>>>>>
>>>>> I'll push this tonight if there are no further objections to the name.
>>>>>
>>>> Will other flip modes be added to this filter?
>>>>

Hi Gyan:

No further flip modes are planned. This filter was inspired by OpenCV, which
offers three flip modes: horizontal, vertical, and both. When someone
specifies hflip, they want a horizontal flip. If flip is specified without
'v' or 'h', no particular direction is requested. So I think it is fine to
use flip for both directions, and then we don't need to come up with a new
name.

Thanks,
Jianhua

>>> No. Other flip modes are already separate filters, and transposition
>>> would be another filter. Our transpose filter doesn't support flipping
>>> the image, and to keep options compatible between software and
>>> hardware filters, we can't add it. So the most appropriate filter for
>>> this is a standalone one.
>>>
>>
>> If modes can be added, you can add them here, and deprecate the single mode
>> variants.
>> Since these filters are very new, there are no legacy users to accommodate,
>> and now would be the best time, before they become present in a release branch.
>>
>
>There are no more modes to be added, flipping != transpose != rotate,
>according to ffmpeg nomenclature, since we have separate filters for each.
>



From: ffmpeg-devel on behalf of Lynne
Sent: November 26, 2021 3:45
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH v4 2/2] avfilter: add a flip_vulkan filter

26 Nov 2021, 12:20 by ffm...@gyani.pro:

>
>
> On 2021-11-26 04:33 pm, Lynne wrote:
>
>> 26 Nov 2021, 11:37 by ffm...@gyani.pro:
>>
>>>
>>> On 2021-11-26 03:08 pm, Lynne wrote:
>>>
>>>> 26 Nov 2021, 10:10 by jianhua...@intel.com:
>>>>
>>>>> This filter flips the input video both horizontally and vertically
>>>>> in one compute pipeline, so there is no longer any need to chain
>>>>> two pipelines (hflip_vulkan,vflip_vulkan).
>>>>>
>>>>> Signed-off-by: Wu Jianhua 
>>>>> ---
>>>>>  configure|  1 +
>>>>>  libavfilter/allfilters.c |  1 +
>>>>>  libavfilter/vf_flip_vulkan.c | 61 +---
>>>>>  3 files changed, 51 insertions(+), 12 deletions(-)
>>>>>
>>>> I'll push this tonight if there are no further objections to the name.
>>>>
>>> Will other flip modes be added to this filter?
>>>
>> No. Other flip modes are already separate filters, and transposition
>> would be another filter. Our transpose filter doesn't support flipping
>> the image, and to keep options compatible between software and
>> hardware filters, we can't add it. So the most appropriate filter for
>> this is a standalone one.
>>
>
> If modes can be added, you can add them here, and deprecate the single mode
> variants.
> Since these filters are very new, there are no legacy users to accommodate,
> and now would be the best time, before they become present in a release branch.
>

There are no more modes to be added, flipping != transpose != rotate,
according to ffmpeg nomenclature, since we have separate filters for each.

[FFmpeg-devel] [PATCH v4 2/2] avfilter: add a flip_vulkan filter

2021-11-26 Thread Wu Jianhua

This filter flips the input video both horizontally and vertically
in one compute pipeline, so there is no longer any need to chain
two pipelines (hflip_vulkan,vflip_vulkan).

Signed-off-by: Wu Jianhua 
---
 configure|  1 +
 libavfilter/allfilters.c |  1 +
 libavfilter/vf_flip_vulkan.c | 61 +---
 3 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/configure b/configure
index d068b11073..7112d830c9 100755
--- a/configure
+++ b/configure
@@ -3608,6 +3608,7 @@ fftdnoiz_filter_select="fft"
 find_rect_filter_deps="avcodec avformat gpl"
 firequalizer_filter_deps="avcodec"
 firequalizer_filter_select="rdft"
+flip_vulkan_filter_deps="vulkan spirv_compiler"
 flite_filter_deps="libflite"
 framerate_filter_select="scene_sad"
 freezedetect_filter_select="scene_sad"
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 4bf17ef292..e014833bea 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -263,6 +263,7 @@ extern const AVFilter ff_vf_fieldmatch;
 extern const AVFilter ff_vf_fieldorder;
 extern const AVFilter ff_vf_fillborders;
 extern const AVFilter ff_vf_find_rect;
+extern const AVFilter ff_vf_flip_vulkan;
 extern const AVFilter ff_vf_floodfill;
 extern const AVFilter ff_vf_format;
 extern const AVFilter ff_vf_fps;
diff --git a/libavfilter/vf_flip_vulkan.c b/libavfilter/vf_flip_vulkan.c
index e9e04db91b..0223786ef1 100644
--- a/libavfilter/vf_flip_vulkan.c
+++ b/libavfilter/vf_flip_vulkan.c
@@ -26,7 +26,8 @@
 
 enum FlipType {
 FLIP_VERTICAL,
-FLIP_HORIZONTAL
+FLIP_HORIZONTAL,
+FLIP_BOTH
 };
 
 typedef struct FlipVulkanContext {
@@ -104,6 +105,9 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in, enum FlipType
 case FLIP_VERTICAL:
 GLSLF(2, vec4 res = texture(input_image[%i], ivec2(pos.x, size.y - pos.y));   ,i);
 break;
+case FLIP_BOTH:
+GLSLF(2, vec4 res = texture(input_image[%i], ivec2(size.xy - pos.xy));, i);
+break;
 default:
 GLSLF(2, vec4 res = texture(input_image[%i], pos);    ,i);
 break;
@@ -226,7 +230,7 @@ fail:
 return err;
 }
 
-static int flip_vulkan_filter_frame(AVFilterLink *link, AVFrame *in, enum FlipType type)
+static int filter_frame(AVFilterLink *link, AVFrame *in, enum FlipType type)
 {
 int err;
 AVFrame *out = NULL;
@@ -259,14 +263,27 @@ fail:
 
 static int hflip_vulkan_filter_frame(AVFilterLink *link, AVFrame *in)
 {
-return flip_vulkan_filter_frame(link, in, FLIP_HORIZONTAL);
+return filter_frame(link, in, FLIP_HORIZONTAL);
 }
 
 static int vflip_vulkan_filter_frame(AVFilterLink *link, AVFrame *in)
 {
-return flip_vulkan_filter_frame(link, in, FLIP_VERTICAL);
+return filter_frame(link, in, FLIP_VERTICAL);
 }
 
+static int flip_vulkan_filter_frame(AVFilterLink *link, AVFrame *in)
+{
+return filter_frame(link, in, FLIP_BOTH);
+}
+
+static const AVFilterPad flip_vulkan_outputs[] = {
+{
+.name = "default",
+.type = AVMEDIA_TYPE_VIDEO,
+.config_props = &ff_vk_filter_config_output,
+}
+};
+
 static const AVOption hflip_vulkan_options[] = {
 { NULL },
 };
@@ -282,14 +299,6 @@ static const AVFilterPad hflip_vulkan_inputs[] = {
 }
 };
 
-static const AVFilterPad flip_vulkan_outputs[] = {
-{
-.name = "default",
-.type = AVMEDIA_TYPE_VIDEO,
-.config_props = &ff_vk_filter_config_output,
-}
-};
-
 const AVFilter ff_vf_hflip_vulkan = {
 .name   = "hflip_vulkan",
 .description= NULL_IF_CONFIG_SMALL("Horizontally flip the input video in Vulkan"),
@@ -330,3 +339,31 @@ const AVFilter ff_vf_vflip_vulkan = {
 .priv_class = &vflip_vulkan_class,
 .flags_internal = FF_FILTER_FLAG_HWFRAME_AWARE,
 };
+
+static const AVOption flip_vulkan_options[] = {
+{ NULL },
+};
+
+AVFILTER_DEFINE_CLASS(flip_vulkan);
+
+static const AVFilterPad flip_vulkan_inputs[] = {
+{
+.name = "default",
+.type = AVMEDIA_TYPE_VIDEO,
+.filter_frame = &flip_vulkan_filter_frame,
+.config_props = &ff_vk_filter_config_input,
+}
+};
+
+const AVFilter ff_vf_flip_vulkan = {
+.name   = "flip_vulkan",
+.description= NULL_IF_CONFIG_SMALL("Flip both horizontally and vertically"),
+.priv_size  = sizeof(FlipVulkanContext),
+.init   = &ff_vk_filter_init,
+.uninit = &flip_vulkan_uninit,
+FILTER_INPUTS(flip_vulkan_inputs),
+FILTER_OUTPUTS(flip_vulkan_outputs),
+FILTER_SINGLE_PIXFMT(AV_PIX_FMT_VULKAN),
+.priv_class = &flip_vulkan_class,
+.flags_internal = FF_FILTER_FLAG_HWFRAME_AWARE,
+};
-- 
2.25.1
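
Note on the FLIP_BOTH case in the patch above: each output pixel is read from
the position mirrored on both axes. The CPU-side sketch below shows the same
index mapping on a byte image. flip_both_u8 is a hypothetical helper and not
part of FFmpeg; it uses w - 1 - x and h - 1 - y, the exact form for integer
array indices, whereas the shader in the patch computes size.xy - pos.xy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write into dst the w x h image src mirrored on both
 * axes. Strides are in bytes per row; src and dst must not overlap. */
static void flip_both_u8(uint8_t *dst, size_t dst_stride,
                         const uint8_t *src, size_t src_stride,
                         int w, int h)
{
    assert(dst && src && w > 0 && h > 0);
    for (int y = 0; y < h; y++) {
        /* Output row y comes from the vertically mirrored input row. */
        const uint8_t *in = src + (size_t)(h - 1 - y) * src_stride;
        uint8_t *out = dst + (size_t)y * dst_stride;
        for (int x = 0; x < w; x++)
            out[x] = in[w - 1 - x];   /* horizontally mirrored column */
    }
}
```

Doing both mirrors in a single pass is the point of the commit message: one
read and one write per pixel instead of two of each when hflip and vflip are
chained.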

[FFmpeg-devel] [PATCH v4 1/2] avutil/hwcontext_vulkan: fully support customizable validation layers

2021-11-26 Thread Wu Jianhua

Validation layers are an indispensable part of developing with Vulkan.

The following command shows how to enable validation layers:

ffmpeg -init_hw_device vulkan=0,debug=1,validation_layers=VK_LAYER_LUNARG_monitor+VK_LAYER_LUNARG_api_dump

Signed-off-by: Wu Jianhua 
---
 libavutil/hwcontext_vulkan.c | 163 ---
 libavutil/vulkan_functions.h |   1 +
 2 files changed, 135 insertions(+), 29 deletions(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 644ed947f8..870a6fc71b 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -146,6 +146,13 @@ typedef struct AVVkFrameInternal {
 } \
 } while(0)
 
+#define RELEASE_PROPS(props, count) \
+if (props) { \
+for (int i = 0; i < count; i++) \
+av_free((void *)((props)[i])); \
+av_free((void *)props); \
+}
+
 static const struct {
 enum AVPixelFormat pixfmt;
 const VkFormat vkfmts[4];
@@ -511,25 +518,128 @@ static int check_extensions(AVHWDeviceContext *ctx, int dev, AVDictionary *opts,
 return 0;
 
 fail:
-if (extension_names)
-for (int i = 0; i < extensions_found; i++)
-av_free((void *)extension_names[i]);
-av_free(extension_names);
+RELEASE_PROPS(extension_names, extensions_found);
 av_free(user_exts_str);
 av_free(sup_ext);
 return err;
 }
 
+static int check_validation_layers(AVHWDeviceContext *ctx, AVDictionary *opts,
+   const char * const **dst, uint32_t *num, int *debug_mode)
+{
+static const char default_layer[] = { "VK_LAYER_KHRONOS_validation" };
+
+int found = 0, err = 0;
+VulkanDevicePriv *priv = ctx->internal->priv;
+FFVulkanFunctions *vk = &priv->vkfn;
+
+uint32_t sup_layer_count;
+VkLayerProperties *sup_layers;
+
+AVDictionaryEntry *user_layers;
+char *user_layers_str = NULL;
+char *save, *token;
+
+const char **enabled_layers = NULL;
+uint32_t enabled_layers_count = 0;
+
+AVDictionaryEntry *debug_opt = av_dict_get(opts, "debug", NULL, 0);
+int debug = debug_opt && strtol(debug_opt->value, NULL, 10);
+
+/* If `debug=0`, enable no layers at all. */
+if (debug_opt && !debug)
+return 0;
+
+vk->EnumerateInstanceLayerProperties(&sup_layer_count, NULL);
+sup_layers = av_malloc_array(sup_layer_count, sizeof(VkLayerProperties));
+if (!sup_layers)
+return AVERROR(ENOMEM);
+vk->EnumerateInstanceLayerProperties(&sup_layer_count, sup_layers);
+
+av_log(ctx, AV_LOG_VERBOSE, "Supported validation layers:\n");
+for (int i = 0; i < sup_layer_count; i++)
+av_log(ctx, AV_LOG_VERBOSE, "\t%s\n", sup_layers[i].layerName);
+
+/* If `debug=1` is specified, enable the standard validation layer extension */
+if (debug) {
+*debug_mode = debug;
+for (int i = 0; i < sup_layer_count; i++) {
+if (!strcmp(default_layer, sup_layers[i].layerName)) {
+found = 1;
+av_log(ctx, AV_LOG_VERBOSE,
+"Default validation layer %s is enabled\n", default_layer);
+ADD_VAL_TO_LIST(enabled_layers, enabled_layers_count, default_layer);
+break;
+}
+}
+}
+
+user_layers = av_dict_get(opts, "validation_layers", NULL, 0);
+if (!user_layers)
+goto end;
+
+user_layers_str = av_strdup(user_layers->value);
+if (!user_layers_str) {
+err = AVERROR(EINVAL);
+goto fail;
+}
+
+token = av_strtok(user_layers_str, "+", &save);
+while (token) {
+found = 0;
+if (!strcmp(default_layer, token)) {
+if (debug) {
+/* if the `debug=1`, default_layer is enabled, skip here */
+token = av_strtok(NULL, "+", &save);
+continue;
+} else {
+/* if the `debug=0`, enable debug mode to load its callback properly */
+*debug_mode = debug;
+}
+}
+for (int j = 0; j < sup_layer_count; j++) {
+if (!strcmp(token, sup_layers[j].layerName)) {
+found = 1;
+break;
+}
+}
+if (found) {
+av_log(ctx, AV_LOG_VERBOSE, "Requested Validation Layer: %s\n", token);
+ADD_VAL_TO_LIST(enabled_layers, enabled_layers_count, token);
+} else {
+av_log(ctx, AV_LOG_ERROR,
+   "Validation Layer \"%s\" not support.\n", token);
+
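
Note on the patch above: the validation_layers option is a '+'-separated
list, and each token is compared against the layer names reported by the
Vulkan loader. The sketch below reproduces only that matching step in
standard C, without any Vulkan calls. match_layers is a hypothetical helper;
it uses strcspn() where the patch uses av_strtok(), and returning -1 on the
first unsupported name is a choice made for this sketch (the quoted patch is
cut off before its own error path).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count how many '+'-separated names in `list` appear
 * in `supported`. Returns -1 as soon as an unsupported name is found.
 * Empty tokens (for example "a++b") are skipped. */
static int match_layers(const char *list,
                        const char *const *supported, size_t nb_supported)
{
    int matched = 0;

    assert(list && (supported || nb_supported == 0));
    while (*list) {
        size_t len = strcspn(list, "+");   /* length of the current token */
        int found = 0;

        if (len > 0) {
            for (size_t i = 0; i < nb_supported && !found; i++)
                found = strlen(supported[i]) == len &&
                        !strncmp(supported[i], list, len);
            if (!found)
                return -1;
            matched++;
        }
        list += len;
        if (*list == '+')
            list++;                        /* step over the separator */
    }
    return matched;
}
```

Because the input string is never modified, no duplicate of the option value
is needed here; the patch has to av_strdup() it because av_strtok() writes
terminators into the buffer.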

Re: [FFmpeg-devel] [PATCH v3 2/2] avfilter: add a bflip_vulkan filter

2021-11-25 Thread Wu, Jianhua

Lynne:
> From: ffmpeg-devel  On Behalf Of
> Lynne
> Sent: Thursday, November 25, 2021 5:33 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH v3 2/2] avfilter: add a bflip_vulkan filter
> 
> 25 Nov 2021, 10:22 by ffm...@gyani.pro:
> 
> >
> >
> > On 2021-11-25 02:38 pm, Wu Jianhua wrote:
> >
> >> This filter flips the input video both horizontally and vertically in
> >> one compute pipeline, so there is no longer any need to chain two
> >> pipelines (hflip_vulkan,vflip_vulkan).
> >>
> >
> > bflip is not an intuitive name.
> >
> > Either hvflip, or since h+v flip  == 180 deg rotation, maybe rotate180
> > or rot180
> >
> > Regards,
> > Gyan
> >
> 
> I think I'd prefer if it was called 'transpose_vulkan', with the same
> options as the regular transpose filter, but with only a single direction
> currently supported.
> That way, we'd have a template to which we could implement more modes
> later on.

Doesn't transpose just mean swapping the row and column indices? Hmm, I'm not
sure. Maybe rotate180 would be better.

Thanks,
Jianhua
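
Note on the naming question in this thread: the small sketch below checks the
two claims being discussed, namely that a horizontal flip followed by a
vertical flip equals a 180-degree rotation, and that a transpose is a
different operation that swaps row and column indices. All four helpers are
hypothetical, operate on a row-major int image, and are not FFmpeg API.

```c
#include <assert.h>
#include <stddef.h>

/* Mirror each row left to right. */
static void hflip(int *dst, const int *src, int w, int h)
{
    assert(dst && src && w > 0 && h > 0);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * w + x] = src[y * w + (w - 1 - x)];
}

/* Mirror the rows top to bottom. */
static void vflip(int *dst, const int *src, int w, int h)
{
    assert(dst && src && w > 0 && h > 0);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[y * w + x] = src[(h - 1 - y) * w + x];
}

/* Rotate by 180 degrees: linear index i maps to w*h - 1 - i. */
static void rotate180(int *dst, const int *src, int w, int h)
{
    assert(dst && src && w > 0 && h > 0);
    for (int i = 0; i < w * h; i++)
        dst[i] = src[w * h - 1 - i];
}

/* Transpose: swap row and column indices; the result is h wide and w high. */
static void transpose(int *dst, const int *src, int w, int h)
{
    assert(dst && src && w > 0 && h > 0);
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            dst[x * h + y] = src[y * w + x];
}
```

For a 3x2 image holding 1..6, hflip followed by vflip and rotate180 both give
6 5 4 / 3 2 1, while transpose gives a 2x3 image 1 4 / 2 5 / 3 6.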


[FFmpeg-devel] [PATCH v3 2/2] avfilter: add a bflip_vulkan filter

2021-11-25 Thread Wu Jianhua

This filter flips the input video both horizontally and vertically
in one compute pipeline, so there is no longer any need to chain
two pipelines (hflip_vulkan,vflip_vulkan).

Signed-off-by: Wu Jianhua 
---
 configure|  1 +
 libavfilter/allfilters.c |  1 +
 libavfilter/vf_flip_vulkan.c | 39 +++-
 3 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index d068b11073..a7562b53c3 100755
--- a/configure
+++ b/configure
@@ -3569,6 +3569,7 @@ atempo_filter_select="rdft"
 avgblur_opencl_filter_deps="opencl"
 avgblur_vulkan_filter_deps="vulkan spirv_compiler"
 azmq_filter_deps="libzmq"
+bflip_vulkan_filter_deps="vulkan spirv_compiler"
 blackframe_filter_deps="gpl"
 bm3d_filter_deps="avcodec"
 bm3d_filter_select="dct"
diff --git a/libavfilter/allfilters.c b/libavfilter/allfilters.c
index 4bf17ef292..041292853a 100644
--- a/libavfilter/allfilters.c
+++ b/libavfilter/allfilters.c
@@ -175,6 +175,7 @@ extern const AVFilter ff_vf_avgblur_opencl;
 extern const AVFilter ff_vf_avgblur_vulkan;
 extern const AVFilter ff_vf_bbox;
 extern const AVFilter ff_vf_bench;
+extern const AVFilter ff_vf_bflip_vulkan;
 extern const AVFilter ff_vf_bilateral;
 extern const AVFilter ff_vf_bitplanenoise;
 extern const AVFilter ff_vf_blackdetect;
diff --git a/libavfilter/vf_flip_vulkan.c b/libavfilter/vf_flip_vulkan.c
index e9e04db91b..e20766e9ed 100644
--- a/libavfilter/vf_flip_vulkan.c
+++ b/libavfilter/vf_flip_vulkan.c
@@ -26,7 +26,8 @@
 
 enum FlipType {
 FLIP_VERTICAL,
-FLIP_HORIZONTAL
+FLIP_HORIZONTAL,
+FLIP_BOTH
 };
 
 typedef struct FlipVulkanContext {
@@ -104,6 +105,9 @@ static av_cold int init_filter(AVFilterContext *ctx, AVFrame *in, enum FlipType
 case FLIP_VERTICAL:
 GLSLF(2, vec4 res = texture(input_image[%i], ivec2(pos.x, size.y - pos.y));   ,i);
 break;
+case FLIP_BOTH:
+GLSLF(2, vec4 res = texture(input_image[%i], ivec2(size.xy - pos.xy));, i);
+break;
 default:
 GLSLF(2, vec4 res = texture(input_image[%i], pos);    ,i);
 break;
@@ -267,6 +271,11 @@ static int vflip_vulkan_filter_frame(AVFilterLink *link, AVFrame *in)
 return flip_vulkan_filter_frame(link, in, FLIP_VERTICAL);
 }
 
+static int bflip_vulkan_filter_frame(AVFilterLink *link, AVFrame *in)
+{
+return flip_vulkan_filter_frame(link, in, FLIP_BOTH);
+}
+
 static const AVOption hflip_vulkan_options[] = {
 { NULL },
 };
@@ -330,3 +339,31 @@ const AVFilter ff_vf_vflip_vulkan = {
 .priv_class = &vflip_vulkan_class,
 .flags_internal = FF_FILTER_FLAG_HWFRAME_AWARE,
 };
+
+static const AVOption bflip_vulkan_options[] = {
+{ NULL },
+};
+
+AVFILTER_DEFINE_CLASS(bflip_vulkan);
+
+static const AVFilterPad bflip_vulkan_inputs[] = {
+{
+.name = "default",
+.type = AVMEDIA_TYPE_VIDEO,
+.filter_frame = &bflip_vulkan_filter_frame,
+.config_props = &ff_vk_filter_config_input,
+}
+};
+
+const AVFilter ff_vf_bflip_vulkan = {
+.name   = "bflip_vulkan",
+.description= NULL_IF_CONFIG_SMALL("Flip both horizontally and vertically"),
+.priv_size  = sizeof(FlipVulkanContext),
+.init   = &ff_vk_filter_init,
+.uninit = &flip_vulkan_uninit,
+FILTER_INPUTS(bflip_vulkan_inputs),
+FILTER_OUTPUTS(flip_vulkan_outputs),
+FILTER_SINGLE_PIXFMT(AV_PIX_FMT_VULKAN),
+.priv_class = &bflip_vulkan_class,
+.flags_internal = FF_FILTER_FLAG_HWFRAME_AWARE,
+};
-- 
2.25.1

[FFmpeg-devel] [PATCH v3 1/2] avutil/hwcontext_vulkan: fully support customizable validation layers

2021-11-25 Thread Wu Jianhua

Validation layers are an indispensable part of developing with Vulkan.

The following command shows how to enable validation layers:

ffmpeg -init_hw_device vulkan=0,debug=1,validation_layers=VK_LAYER_LUNARG_monitor+VK_LAYER_LUNARG_api_dump

Signed-off-by: Wu Jianhua 
---
 libavutil/hwcontext_vulkan.c | 164 ---
 libavutil/vulkan_functions.h |   1 +
 2 files changed, 136 insertions(+), 29 deletions(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 644ed947f8..515e27aad8 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -146,6 +146,13 @@ typedef struct AVVkFrameInternal {
 } \
 } while(0)
 
+#define RELEASE_PROPS(props, count) \
+if (props) { \
+for (int i = 0; i < count; i++) \
+av_free((void *)((props)[i])); \
+av_free((void *)props); \
+}
+
 static const struct {
 enum AVPixelFormat pixfmt;
 const VkFormat vkfmts[4];
@@ -511,25 +518,129 @@ static int check_extensions(AVHWDeviceContext *ctx, int dev, AVDictionary *opts,
 return 0;
 
 fail:
-if (extension_names)
-for (int i = 0; i < extensions_found; i++)
-av_free((void *)extension_names[i]);
-av_free(extension_names);
+RELEASE_PROPS(extension_names, extensions_found);
 av_free(user_exts_str);
 av_free(sup_ext);
 return err;
 }
 
+static int check_validation_layers(AVHWDeviceContext *ctx, AVDictionary *opts,
+   const char * const **dst, uint32_t *num, int *debug_mode)
+{
+static const char default_layer[] = { "VK_LAYER_KHRONOS_validation" };
+
+int found = 0, err = 0;
+VulkanDevicePriv *priv = ctx->internal->priv;
+FFVulkanFunctions *vk = &priv->vkfn;
+
+uint32_t sup_layer_count;
+VkLayerProperties *sup_layers;
+
+AVDictionaryEntry *user_layers;
+char *user_layers_str = NULL;
+char *save, *token;
+
+const char **enabled_layers = NULL;
+uint32_t enabled_layers_count = 0;
+
+AVDictionaryEntry *debug_opt = av_dict_get(opts, "debug", NULL, 0);
+int debug = debug_opt && strtol(debug_opt->value, NULL, 10);
+
+/* If `debug=0`, enable no layers at all. */
+if (debug_opt && !debug)
+return 0;
+
+vk->EnumerateInstanceLayerProperties(&sup_layer_count, NULL);
+sup_layers = av_malloc_array(sup_layer_count, sizeof(VkLayerProperties));
+if (!sup_layers)
+return AVERROR(ENOMEM);
+vk->EnumerateInstanceLayerProperties(&sup_layer_count, sup_layers);
+
+av_log(ctx, AV_LOG_VERBOSE, "Supported validation layers:\n");
+for (int i = 0; i < sup_layer_count; i++)
+av_log(ctx, AV_LOG_VERBOSE, "\t%s\n", sup_layers[i].layerName);
+
+/* If `debug=1` is specified, enable the standard validation layer extension */
+if (debug) {
+*debug_mode = debug;
+for (int i = 0; i < sup_layer_count; i++) {
+if (!strcmp(default_layer, sup_layers[i].layerName)) {
+found = 1;
+av_log(ctx, AV_LOG_VERBOSE,
+"Default validation layer %s is enabled\n", default_layer);
+ADD_VAL_TO_LIST(enabled_layers, enabled_layers_count, default_layer);
+break;
+}
+}
+}
+
+user_layers = av_dict_get(opts, "validation_layers", NULL, 0);
+if (!user_layers)
+goto end;
+
+user_layers_str = av_strdup(user_layers->value);
+if (!user_layers_str) {
+err = AVERROR(EINVAL);
+goto fail;
+}
+
+token = av_strtok(user_layers_str, "+", &save);
+while (token) {
+found = 0;
+if (!strcmp(default_layer, token)) {
+if (debug) {
+/* if the `debug=1`, default_layer is enabled, skip here */
+token = av_strtok(NULL, "+", &save);
+continue;
+}
+else {
+/* if the `debug=0, enable debug mode to load its callback properly */
+*debug_mode = debug;
+}
+}
+for (int j = 0; j < sup_layer_count; j++) {
+if (!strcmp(token, sup_layers[j].layerName)) {
+found = 1;
+break;
+}
+}
+if (found) {
+av_log(ctx, AV_LOG_VERBOSE, "Requested Validation Layer: %s\n", token);
+ADD_VAL_TO_LIST(enabled_layers, enabled_layers_count, token);
+} else {
+av_log(ctx, AV_LOG_ERROR,
+   "Validation Layer \"%s\" not support.\n", toke

Re: [FFmpeg-devel] [PATCH v2 4/4] avutil/hwcontext_vulkan: fully support customizable validation layers

2021-11-24 Thread Wu Jianhua

Lynne:
Sent: November 24, 2021 18:36
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH v2 4/4] avutil/hwcontext_vulkan: fully support customizable validation layers

24 Nov 2021, 05:11 by jianhua...@intel.com:

>>  /* Creates a VkInstance */
>>  static int create_instance(AVHWDeviceContext *ctx, AVDictionary *opts)
>>  {
>> @@ -558,13 +651,16 @@ static int create_instance(AVHWDeviceContext *ctx, AVDictionary *opts)
>>  /* Check for present/missing extensions */
>>  err = check_extensions(ctx, 0, opts, &inst_props.ppEnabledExtensionNames,
>>  &inst_props.enabledExtensionCount, debug_mode);
>> +hwctx->enabled_inst_extensions = inst_props.ppEnabledExtensionNames;
>> +hwctx->nb_enabled_inst_extensions = inst_props.enabledExtensionCount;
>>
>
> Why did you move that assignment?
>

The assignment is placed here so that, if instance creation fails or some
other error occurs, the lists can still be released in vulkan_device_free(),
much like a destructor would do, and no extra cleanup code is needed in this
function. If context creation fails, vulkan_device_free() is called
immediately, so the lists are not kept around for long.

>
> I've pushed patches 2 and 3, just squash patch 1 and 4 (this one) and
> resubmit with the changes I mentioned.
>

Okay. No problem. I’ll resubmit it.

Thanks,
Jianhua

[FFmpeg-devel] [PATCH v2 4/4] avutil/hwcontext_vulkan: fully support customizable validation layers

2021-11-23 Thread Wu Jianhua

Validation layers are an indispensable part of developing with Vulkan.

The following command shows how to enable validation layers:

ffmpeg -init_hw_device vulkan=0,debug=1,validation_layers=VK_LAYER_LUNARG_monitor+VK_LAYER_LUNARG_api_dump

Signed-off-by: Wu Jianhua 
---
 libavutil/hwcontext_vulkan.c | 136 +--
 1 file changed, 113 insertions(+), 23 deletions(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 644ed947f8..75f9f90d70 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -146,6 +146,13 @@ typedef struct AVVkFrameInternal {
 } \
 } while(0)
 
+#define RELEASE_PROPS(props, count) \
+if (props) { \
+for (int i = 0; i < count; i++) \
+av_free((void *)((props)[i])); \
+av_free((void *)props); \
+}
+
 static const struct {
 enum AVPixelFormat pixfmt;
 const VkFormat vkfmts[4];
@@ -511,15 +518,101 @@ static int check_extensions(AVHWDeviceContext *ctx, int dev, AVDictionary *opts,
     return 0;
 
 fail:
-    if (extension_names)
-        for (int i = 0; i < extensions_found; i++)
-            av_free((void *)extension_names[i]);
-    av_free(extension_names);
+    RELEASE_PROPS(extension_names, extensions_found);
     av_free(user_exts_str);
     av_free(sup_ext);
     return err;
 }
 
+static int check_validation_layers(AVHWDeviceContext *ctx, AVDictionary *opts,
+                                   const char * const **dst, uint32_t *num)
+{
+    static const char default_layer[] = { "VK_LAYER_KHRONOS_validation" };
+
+    int found = 0, err = 0;
+    VulkanDevicePriv *priv = ctx->internal->priv;
+    FFVulkanFunctions *vk = &priv->vkfn;
+
+    uint32_t sup_layer_count;
+    VkLayerProperties *sup_layers;
+
+    AVDictionaryEntry *user_layers;
+    char *user_layers_str, *save, *token;
+
+    const char **enabled_layers = NULL;
+    uint32_t enabled_layers_count = 0;
+
+    user_layers = av_dict_get(opts, "validation_layers", NULL, 0);
+    if (!user_layers)
+        return 0;
+
+    user_layers_str = av_strdup(user_layers->value);
+    if (!user_layers_str) {
+        err = AVERROR(EINVAL);
+        goto fail;
+    }
+
+    vk->EnumerateInstanceLayerProperties(&sup_layer_count, NULL);
+    sup_layers = av_malloc_array(sup_layer_count, sizeof(VkLayerProperties));
+    if (!sup_layers)
+        return AVERROR(ENOMEM);
+    vk->EnumerateInstanceLayerProperties(&sup_layer_count, sup_layers);
+
+    av_log(ctx, AV_LOG_VERBOSE, "Supported validation layers:\n");
+    for (int i = 0; i < sup_layer_count; i++) {
+        av_log(ctx, AV_LOG_VERBOSE, "\t%s\n", sup_layers[i].layerName);
+        if (!strcmp(default_layer, sup_layers[i].layerName))
+            found = 1;
+    }
+
+    if (!found) {
+        av_log(ctx, AV_LOG_ERROR, "Default layer\"%s\" isn't supported. Please "
+               "check if vulkan-validation-layers installed\n", default_layer);
+    } else {
+        av_log(ctx, AV_LOG_VERBOSE,
+               "Default validation layer %s is enabled\n", default_layer);
+        ADD_VAL_TO_LIST(enabled_layers, enabled_layers_count, default_layer);
+    }
+
+    token = av_strtok(user_layers_str, "+", &save);
+    while (token) {
+        found = 0;
+        if (!strcmp(default_layer, token)) {
+            token = av_strtok(NULL, "+", &save);
+            continue;
+        }
+        for (int j = 0; j < sup_layer_count; j++) {
+            if (!strcmp(token, sup_layers[j].layerName)) {
+                found = 1;
+                break;
+            }
+        }
+        if (found) {
+            av_log(ctx, AV_LOG_VERBOSE, "Requested Validation Layer: %s\n", token);
+            ADD_VAL_TO_LIST(enabled_layers, enabled_layers_count, token);
+        } else {
+            av_log(ctx, AV_LOG_ERROR,
+                   "Validation Layer \"%s\" not support.\n", token);
+            err = AVERROR(EINVAL);
+            goto fail;
+        }
+        token = av_strtok(NULL, "+", &save);
+    }
+
+    *dst = enabled_layers;
+    *num = enabled_layers_count;
+
+    av_free(sup_layers);
+    av_free(user_layers_str);
+    return 0;
+
+fail:
+    RELEASE_PROPS(enabled_layers, enabled_layers_count);
+    av_free(sup_layers);
+    av_free(user_layers_str);
+    return err;
+}
+
 /* Creates a VkInstance */
 static int create_instance(AVHWDeviceContext *ctx, AVDictionary *opts)
 {
@@ -558,13 +651,16 @@ static int create_instance(AVHWDeviceContext *ctx, AVDictionary *opts)
     /* Check for present/missing

[FFmpeg-devel] [PATCH v2 3/4] avutil/hwcontext_vulkan: check if created before destroying the instance

2021-11-23 Thread Wu Jianhua

Signed-off-by: Wu Jianhua 
---
 libavutil/hwcontext_vulkan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 4ac1058181..644ed947f8 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -1157,7 +1157,8 @@ static void vulkan_device_free(AVHWDeviceContext *ctx)
         vk->DestroyDebugUtilsMessengerEXT(hwctx->inst, p->debug_ctx,
                                           hwctx->alloc);
 
-    vk->DestroyInstance(hwctx->inst, hwctx->alloc);
+    if (hwctx->inst)
+        vk->DestroyInstance(hwctx->inst, hwctx->alloc);
 
     if (p->libvulkan)
         dlclose(p->libvulkan);
-- 
2.25.1


[FFmpeg-devel] [PATCH v2 2/4] avutil/hwcontext_vulkan: check if created before destroying the device

2021-11-23 Thread Wu Jianhua

Signed-off-by: Wu Jianhua 
---
 libavutil/hwcontext_vulkan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index f1e750cd3e..4ac1058181 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -1150,7 +1150,8 @@ static void vulkan_device_free(AVHWDeviceContext *ctx)
     FFVulkanFunctions *vk = &p->vkfn;
     AVVulkanDeviceContext *hwctx = ctx->hwctx;
 
-    vk->DestroyDevice(hwctx->act_dev, hwctx->alloc);
+    if (hwctx->act_dev)
+        vk->DestroyDevice(hwctx->act_dev, hwctx->alloc);
 
     if (p->debug_ctx)
         vk->DestroyDebugUtilsMessengerEXT(hwctx->inst, p->debug_ctx,
-- 
2.25.1


[FFmpeg-devel] [PATCH v2 1/4] avutil/vulkan_functions: add EnumerateInstanceLayerProperties

2021-11-23 Thread Wu Jianhua

Signed-off-by: Wu Jianhua 
---
 libavutil/vulkan_functions.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavutil/vulkan_functions.h b/libavutil/vulkan_functions.h
index 85a9f943c8..96922d7286 100644
--- a/libavutil/vulkan_functions.h
+++ b/libavutil/vulkan_functions.h
@@ -45,6 +45,7 @@ typedef enum FFVulkanExtensions {
 #define FN_LIST(MACRO)                                                                   \
     /* Instance */                                                                       \
     MACRO(0, 0, FF_VK_EXT_NO_FLAG,              EnumerateInstanceExtensionProperties)    \
+    MACRO(0, 0, FF_VK_EXT_NO_FLAG,              EnumerateInstanceLayerProperties)        \
     MACRO(0, 0, FF_VK_EXT_NO_FLAG,              CreateInstance)                          \
     MACRO(1, 0, FF_VK_EXT_NO_FLAG,              DestroyInstance)                         \
                                                                                          \
-- 
2.25.1


Re: [FFmpeg-devel] [PATCH 3/4] avutil/hwcontext_vulkan: check if created before destroying the instance

2021-11-23 Thread Wu Jianhua

Dennis Mungai<mailto:dmng...@gmail.com>:
> Sent: 23 November 2021 22:58
> To: FFmpeg development discussions and patches<mailto:ffmpeg-devel@ffmpeg.org>
> Cc: Wu Jianhua<mailto:jianhua...@intel.com>
> Subject: Re: [FFmpeg-devel] [PATCH 3/4] avutil/hwcontext_vulkan: check if 
> created before destroying the instance
>
> On Tue, 23 Nov 2021, 12:06 Wu Jianhua,  wrote:
>
>> Signed-off-by: Wu Jianhua 
>> ---
>>  libavutil/hwcontext_vulkan.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
>> index 4ac1058181..644ed947f8 100644
>> --- a/libavutil/hwcontext_vulkan.c
>> +++ b/libavutil/hwcontext_vulkan.c
>> @@ -1157,7 +1157,8 @@ static void vulkan_device_free(AVHWDeviceContext *ctx)
>>          vk->DestroyDebugUtilsMessengerEXT(hwctx->inst, p->debug_ctx,
>>                                            hwctx->alloc);
>>
>> -    vk->DestroyInstance(hwctx->inst, hwctx->alloc);
>> +    if (hwctx->inst)
>> +        vk->DestroyInstance(hwctx->inst, hwctx->alloc);
>>
>>      if (p->libvulkan)
>>          dlclose(p->libvulkan);
>> --
>> 2.25.1
>>
>
> Ping.
>
> This fixes a (somewhat obscure) bug where a "generic library error" is
> reported when running multiple concurrent ffmpeg commands with one or more
> Vulkan filter chains.
>

Hi Dennis:

Glad that this patch is helpful, but I’m unable to do more. Lynne may help apply
this patch when she sees your ping.

Thanks,
Jianhua




Re: [FFmpeg-devel] [PATCH 4/4] avutil/hwcontext_vulkan: fully support customizable validation layers

2021-11-23 Thread Wu Jianhua

Lynne:
> 23 Nov 2021, 12:03 by toq...@outlook.com:
>
>> Lynne:
>> >23 Nov 2021, 10:48 by jianhua...@intel.com:
>> >>Lynne:
>>
>>>>> From: ffmpeg-devel  On Behalf Of
>>>>> Lynne
>>>>> Sent: Tuesday, November 23, 2021 5:23 PM
>>>>> To: FFmpeg development discussions and patches >>>> de...@ffmpeg.org>
>>>>> Subject: Re: [FFmpeg-devel] [PATCH 4/4] avutil/hwcontext_vulkan: fully
>>>>> support customizable validation layers
>>>>>
>>>>> 23 Nov 2021, 10:01 by jianhua...@intel.com:
>>>>>
>>>>> > Validation layer is an indispensable part of developing on Vulkan.
>>>>> >
>>>>> > The following commands is on how to enable validation layers:
>>>>> >
>>>>> > ffmpeg -init_hw_device vulkan=0,debug=1,validation_layers=VK_LAYER_KHRONOS_validation+VK_LAYER_LUNARG_api_dump
>>>>> >
>>>>> > Signed-off-by: Wu Jianhua 
>>>>> > ---
>>>>> >  libavutil/hwcontext_vulkan.c | 110 ---
>>>>> 
>>>>> >  libavutil/hwcontext_vulkan.h |   7 +++
>>>>> >  2 files changed, 97 insertions(+), 20 deletions(-)
>>>>> >
>>>>> >
>>>>> > +/**
>>>>> > + * Enabled validation layers.
>>>>> > + * If no layers are enabled, set these fields to NULL, and 0 respectively.
>>>>> > + */
>>>>> > +const char * const *enabled_validation_layers;
>>>>> > +int nb_enabled_validation_layers;
>>>>> > +
>>>>> >
>>>>>
>>>>> Why are you exposing them? Do API users really need to know this?
>>>>>
>>>>
>>>> It's okay. For it's only integrated in a really small separate function it
>>>> could be skipped by the status debug_mode as before. And validation
>>>> layers are embed by other specific drivers, platforms(such as those
>>>> specific layers in androids) or SDK, the FFmpeg is not need to do anything
>>>> more whatever the current is compiled with optimization mode or debug mode.
>>>> The use who only want to use filter is simply not able to know how to
>>>> enable the debug_mode. For me, as a user also, it is important to me, I
>>>> don't want to changed the code then compiling to use the specific 
>>>> validation
>>>> layers again and again.  And not only the developer need to use it, those
>>>> people who help us test could report a more detailed problem. I think it's
>>>> really benefit.
>>>>
>>>
>>> Sorry, I didn't quite understand that.
>>> I'm not objecting to being able to use custom debug layers and activating
>>> them if they're requested from the user. We already do that for the
>>> standard debug layer anyway. I'm just not sure I understand why filters
>>> would need to know if a debug layer is ran, and which one it is, by exposing
>>> them via the public API.
>>>
>>
>> Oh! My bad. I suppose you mean that it may not be necessary to expose
>> the layer info in the device context, right?
>>
>
> Yup, that's right. We'd like to keep the API as small as possible, and
> it's already quite large compared to other hwcontexts. In this case,
> I don't think it's necessary. API users already know, since they can request
> layers, and filters shouldn't have to know.
>
> So just remove the public API part of the patch.
>
> As for the rest, could you always include VK_LAYER_KHRONOS_validation
> if debug mode is on? This patch removes that. And free the array of
> extensions once the instance has been initialized.
>

Sure, will do. I'll update this patch soon.
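
The two requests above (always include VK_LAYER_KHRONOS_validation in debug mode, and free the array once it is no longer needed) could be combined roughly as follows. This is a hedged sketch, not the updated patch: build_layers(), free_layers() and dup_str() are hypothetical helpers, and the real code would additionally check each name against the layers reported by the Vulkan loader.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEFAULT_LAYER "VK_LAYER_KHRONOS_validation"

/* Returns a heap copy of s, or NULL on allocation failure. */
static char *dup_str(const char *s)
{
    size_t len = strlen(s) + 1;
    char *d = malloc(len);
    if (d)
        memcpy(d, s, len);
    return d;
}

/* Frees a list produced by build_layers(); safe to call with NULL. */
static void free_layers(char **layers, int nb)
{
    if (!layers)
        return;
    for (int i = 0; i < nb; i++)
        free(layers[i]);
    free(layers);
}

/*
 * Builds the list of layers to enable: the default layer always comes
 * first, then every requested layer, skipping a repeated default entry.
 * Returns NULL on allocation failure; *nb_out receives the list length.
 */
static char **build_layers(const char *const *requested, int nb_requested,
                           int *nb_out)
{
    char **list;
    int n = 0;

    assert(nb_out && nb_requested >= 0);
    *nb_out = 0;

    list = calloc((size_t)nb_requested + 1, sizeof(*list));
    if (!list)
        return NULL;

    if (!(list[n] = dup_str(DEFAULT_LAYER)))
        goto fail;
    n++;

    for (int i = 0; i < nb_requested; i++) {
        if (!strcmp(requested[i], DEFAULT_LAYER))
            continue;                 /* already enabled by default */
        if (!(list[n] = dup_str(requested[i])))
            goto fail;
        n++;
    }

    *nb_out = n;
    return list;

fail:
    free_layers(list, n);
    return NULL;
}
```

The caller would pass the resulting array to instance creation and then call free_layers() right away, since the names are only needed while the instance is being created.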

Thanks,
Jianhua

Re: [FFmpeg-devel] [PATCH 4/4] avutil/hwcontext_vulkan: fully support customizable validation layers

2021-11-23 Thread Wu Jianhua

Lynne:
>23 Nov 2021, 10:48 by jianhua...@intel.com:
>>Lynne:
>>> From: ffmpeg-devel  On Behalf Of
>>> Lynne
>>> Sent: Tuesday, November 23, 2021 5:23 PM
>>> To: FFmpeg development discussions and patches >> de...@ffmpeg.org>
>>> Subject: Re: [FFmpeg-devel] [PATCH 4/4] avutil/hwcontext_vulkan: fully
>>> support customizable validation layers
>>>
>>> 23 Nov 2021, 10:01 by jianhua...@intel.com:
>>>
>>> > Validation layer is an indispensable part of developing on Vulkan.
>>> >
>>> > The following commands is on how to enable validation layers:
>>> >
>>> > ffmpeg -init_hw_device vulkan=0,debug=1,validation_layers=VK_LAYER_KHRONOS_validation+VK_LAYER_LUNARG_api_dump
>>> >
>>> > Signed-off-by: Wu Jianhua 
>>> > ---
>>> >  libavutil/hwcontext_vulkan.c | 110 ---
>>> 
>>> >  libavutil/hwcontext_vulkan.h |   7 +++
>>> >  2 files changed, 97 insertions(+), 20 deletions(-)
>>> >
>>> >
>>> > +/**
>>> > + * Enabled validation layers.
>>> > + * If no layers are enabled, set these fields to NULL, and 0 respectively.
>>> > + */
>>> > +const char * const *enabled_validation_layers;
>>> > +int nb_enabled_validation_layers;
>>> > +
>>> >
>>>
>>> Why are you exposing them? Do API users really need to know this?
>>>
>>
>> It's okay. For it's only integrated in a really small separate function it
>> could be skipped by the status debug_mode as before. And validation
>> layers are embed by other specific drivers, platforms(such as those
>> specific layers in androids) or SDK, the FFmpeg is not need to do anything
>> more whatever the current is compiled with optimization mode or debug mode.
>> The use who only want to use filter is simply not able to know how to
>> enable the debug_mode. For me, as a user also, it is important to me, I
>> don't want to changed the code then compiling to use the specific validation
>> layers again and again.  And not only the developer need to use it, those
>> people who help us test could report a more detailed problem. I think it's
>> really benefit.
>>
>
> Sorry, I didn't quite understand that.
> I'm not objecting to being able to use custom debug layers and activating
> them if they're requested from the user. We already do that for the
> standard debug layer anyway. I'm just not sure I understand why filters
> would need to know if a debug layer is ran, and which one it is, by exposing
> them via the public API.
>

Oh! My bad. I suppose you mean that it may not be necessary to expose
the layer info in the device context, right?


Re: [FFmpeg-devel] [PATCH 4/4] avutil/hwcontext_vulkan: fully support customizable validation layers

2021-11-23 Thread Wu, Jianhua




> -Original Message-
> From: ffmpeg-devel  On Behalf Of
> Lynne
> Sent: Tuesday, November 23, 2021 5:23 PM
> To: FFmpeg development discussions and patches  de...@ffmpeg.org>
> Subject: Re: [FFmpeg-devel] [PATCH 4/4] avutil/hwcontext_vulkan: fully
> support customizable validation layers
> 
> 23 Nov 2021, 10:01 by jianhua...@intel.com:
> 
> > Validation layer is an indispensable part of developing on Vulkan.
> >
> > The following commands is on how to enable validation layers:
> >
> > ffmpeg -init_hw_device vulkan=0,debug=1,validation_layers=VK_LAYER_KHRONOS_validation+VK_LAYER_LUNARG_api_dump
> >
> > Signed-off-by: Wu Jianhua 
> > ---
> >  libavutil/hwcontext_vulkan.c | 110 ---
> 
> >  libavutil/hwcontext_vulkan.h |   7 +++
> >  2 files changed, 97 insertions(+), 20 deletions(-)
> >
> > diff --git a/libavutil/hwcontext_vulkan.c
> > b/libavutil/hwcontext_vulkan.c index 644ed947f8..b808f8f76c 100644
> > --- a/libavutil/hwcontext_vulkan.c
> > +++ b/libavutil/hwcontext_vulkan.c
> > @@ -146,6 +146,13 @@ typedef struct AVVkFrameInternal {
> >  }  \
> >  } while(0)
> >
> > +#define RELEASE_PROPS(props, count)                                      \
> > +    if (props) {                                                         \
> > +        for (int i = 0; i < count; i++)                                  \
> > +            av_free((void *)((props)[i]));                               \
> > +        av_free((void *)props);                                          \
> > +    }
> > +
> >  static const struct {
> >  enum AVPixelFormat pixfmt;
> >  const VkFormat vkfmts[4];
> > @@ -511,15 +518,83 @@ static int check_extensions(AVHWDeviceContext
> > *ctx, int dev, AVDictionary *opts,  return 0;
> >
> >  fail:
> > -if (extension_names)
> > -for (int i = 0; i < extensions_found; i++)
> > -av_free((void *)extension_names[i]);
> > -av_free(extension_names);
> > +RELEASE_PROPS(extension_names, extensions_found);
> >  av_free(user_exts_str);
> >  av_free(sup_ext);
> >  return err;
> >  }
> >
> > +static int check_validation_layers(AVHWDeviceContext *ctx, AVDictionary
> *opts,
> > +   const char * const **dst, uint32_t
> > +*num) {
> > +int found, err = 0;
> > +VulkanDevicePriv *priv = ctx->internal->priv;
> > +FFVulkanFunctions *vk = &priv->vkfn;
> > +
> > +uint32_t sup_layer_count;
> > +VkLayerProperties *sup_layers;
> > +
> > +AVDictionaryEntry *user_layers;
> > +char *user_layers_str, *save, *token;
> > +
> > +const char **enabled_layers = NULL;
> > +uint32_t enabled_layers_count = 0;
> > +
> > +user_layers = av_dict_get(opts, "validation_layers", NULL, 0);
> > +if (!user_layers)
> > +return 0;
> > +
> > +user_layers_str = av_strdup(user_layers->value);
> > +if (!user_layers_str) {
> > +err = AVERROR(EINVAL);
> > +goto fail;
> > +}
> > +
> > +vk->EnumerateInstanceLayerProperties(&sup_layer_count, NULL);
> > +sup_layers = av_malloc_array(sup_layer_count,
> sizeof(VkLayerProperties));
> > +if (!sup_layers)
> > +return AVERROR(ENOMEM);
> > +vk->EnumerateInstanceLayerProperties(&sup_layer_count,
> > + sup_layers);
> > +
> > +av_log(ctx, AV_LOG_VERBOSE, "Supported Validation layers:\n");
> > +for (int i = 0; i < sup_layer_count; i++)
> > +av_log(ctx, AV_LOG_VERBOSE, "\t%s\n",
> > + sup_layers[i].layerName);
> > +
> > +token = av_strtok(user_layers_str, "+", &save);
> > +while (token) {
> > +found = 0;
> > +for (int j = 0; j < sup_layer_count; j++) {
> > +if (!strcmp(token, sup_layers[j].layerName)) {
> > +found = 1;
> > +break;
> > +}
> > +}
> > +if (found) {
> > +av_log(ctx, AV_LOG_VERBOSE, "Requested Validation Layer: %s\n",
> token);
> > +ADD_VAL_TO_LIST(enabled_layers, enabled_layers_count, token);
> > +} else {
> > +av_log(ctx, AV_LOG_ERROR,
> > +   "Validation Layer \"%s\" no

[FFmpeg-devel] [PATCH 4/4] avutil/hwcontext_vulkan: fully support customizable validation layers

2021-11-23 Thread Wu Jianhua

Validation layers are an indispensable part of developing on Vulkan.

The following command shows how to enable validation layers:

ffmpeg -init_hw_device vulkan=0,debug=1,validation_layers=VK_LAYER_KHRONOS_validation+VK_LAYER_LUNARG_api_dump

Signed-off-by: Wu Jianhua 
---
 libavutil/hwcontext_vulkan.c | 110 ---
 libavutil/hwcontext_vulkan.h |   7 +++
 2 files changed, 97 insertions(+), 20 deletions(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 644ed947f8..b808f8f76c 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -146,6 +146,13 @@ typedef struct AVVkFrameInternal {
         }                                                                      \
     } while(0)
 
+#define RELEASE_PROPS(props, count)                                            \
+    if (props) {                                                               \
+        for (int i = 0; i < count; i++)                                        \
+            av_free((void *)((props)[i]));                                     \
+        av_free((void *)props);                                                \
+    }
+
 static const struct {
 enum AVPixelFormat pixfmt;
 const VkFormat vkfmts[4];
@@ -511,15 +518,83 @@ static int check_extensions(AVHWDeviceContext *ctx, int dev, AVDictionary *opts,
     return 0;
 
 fail:
-    if (extension_names)
-        for (int i = 0; i < extensions_found; i++)
-            av_free((void *)extension_names[i]);
-    av_free(extension_names);
+    RELEASE_PROPS(extension_names, extensions_found);
     av_free(user_exts_str);
     av_free(sup_ext);
     return err;
 }
 
+static int check_validation_layers(AVHWDeviceContext *ctx, AVDictionary *opts,
+                                   const char * const **dst, uint32_t *num)
+{
+    int found, err = 0;
+    VulkanDevicePriv *priv = ctx->internal->priv;
+    FFVulkanFunctions *vk = &priv->vkfn;
+
+    uint32_t sup_layer_count;
+    VkLayerProperties *sup_layers;
+
+    AVDictionaryEntry *user_layers;
+    char *user_layers_str, *save, *token;
+
+    const char **enabled_layers = NULL;
+    uint32_t enabled_layers_count = 0;
+
+    user_layers = av_dict_get(opts, "validation_layers", NULL, 0);
+    if (!user_layers)
+        return 0;
+
+    user_layers_str = av_strdup(user_layers->value);
+    if (!user_layers_str) {
+        err = AVERROR(EINVAL);
+        goto fail;
+    }
+
+    vk->EnumerateInstanceLayerProperties(&sup_layer_count, NULL);
+    sup_layers = av_malloc_array(sup_layer_count, sizeof(VkLayerProperties));
+    if (!sup_layers)
+        return AVERROR(ENOMEM);
+    vk->EnumerateInstanceLayerProperties(&sup_layer_count, sup_layers);
+
+    av_log(ctx, AV_LOG_VERBOSE, "Supported Validation layers:\n");
+    for (int i = 0; i < sup_layer_count; i++)
+        av_log(ctx, AV_LOG_VERBOSE, "\t%s\n", sup_layers[i].layerName);
+
+    token = av_strtok(user_layers_str, "+", &save);
+    while (token) {
+        found = 0;
+        for (int j = 0; j < sup_layer_count; j++) {
+            if (!strcmp(token, sup_layers[j].layerName)) {
+                found = 1;
+                break;
+            }
+        }
+        if (found) {
+            av_log(ctx, AV_LOG_VERBOSE, "Requested Validation Layer: %s\n", token);
+            ADD_VAL_TO_LIST(enabled_layers, enabled_layers_count, token);
+        } else {
+            av_log(ctx, AV_LOG_ERROR,
+                   "Validation Layer \"%s\" not support.\n", token);
+            err = AVERROR(EINVAL);
+            goto fail;
+        }
+        token = av_strtok(NULL, "+", &save);
+    }
+
+    *dst = enabled_layers;
+    *num = enabled_layers_count;
+
+    av_free(sup_layers);
+    av_free(user_layers_str);
+    return 0;
+
+fail:
+    RELEASE_PROPS(enabled_layers, enabled_layers_count);
+    av_free(sup_layers);
+    av_free(user_layers_str);
+    return err;
+}
+
 /* Creates a VkInstance */
 static int create_instance(AVHWDeviceContext *ctx, AVDictionary *opts)
 {
@@ -558,13 +633,18 @@ static int create_instance(AVHWDeviceContext *ctx, AVDictionary *opts)
     /* Check for present/missing extensions */
     err = check_extensions(ctx, 0, opts, &inst_props.ppEnabledExtensionNames,
                            &inst_props.enabledExtensionCount, debug_mode);
+    hwctx->enabled_inst_extensions = inst_props.ppEnabledExtensionNames;
+    hwctx->nb_enabled_inst_extensions = inst_props.enabledExtensionCount;
     if (err < 0)
         return err;
 
     if (debug_mode) {
-        static const char *layers[] = { "VK_LAYER_KHRONOS_validation" };
-        inst_props.ppEnabledLayerNames = layers;
-        inst_props.enabledLayerCount = FF_ARRAY_ELEMS(layers);
+        err = check_validation_layers(ctx, opts, &inst_props.ppEnabledLayerNames,
+                                      &inst_props.enabledLaye

[FFmpeg-devel] [PATCH 3/4] avutil/hwcontext_vulkan: check if created before destroying the instance

2021-11-23 Thread Wu Jianhua

Signed-off-by: Wu Jianhua 
---
 libavutil/hwcontext_vulkan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index 4ac1058181..644ed947f8 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -1157,7 +1157,8 @@ static void vulkan_device_free(AVHWDeviceContext *ctx)
         vk->DestroyDebugUtilsMessengerEXT(hwctx->inst, p->debug_ctx,
                                           hwctx->alloc);
 
-    vk->DestroyInstance(hwctx->inst, hwctx->alloc);
+    if (hwctx->inst)
+        vk->DestroyInstance(hwctx->inst, hwctx->alloc);
 
     if (p->libvulkan)
         dlclose(p->libvulkan);
-- 
2.25.1


[FFmpeg-devel] [PATCH 2/4] avutil/hwcontext_vulkan: check if created before destroying the device

2021-11-23 Thread Wu Jianhua

Signed-off-by: Wu Jianhua 
---
 libavutil/hwcontext_vulkan.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavutil/hwcontext_vulkan.c b/libavutil/hwcontext_vulkan.c
index f1e750cd3e..4ac1058181 100644
--- a/libavutil/hwcontext_vulkan.c
+++ b/libavutil/hwcontext_vulkan.c
@@ -1150,7 +1150,8 @@ static void vulkan_device_free(AVHWDeviceContext *ctx)
     FFVulkanFunctions *vk = &p->vkfn;
     AVVulkanDeviceContext *hwctx = ctx->hwctx;
 
-    vk->DestroyDevice(hwctx->act_dev, hwctx->alloc);
+    if (hwctx->act_dev)
+        vk->DestroyDevice(hwctx->act_dev, hwctx->alloc);
 
     if (p->debug_ctx)
         vk->DestroyDebugUtilsMessengerEXT(hwctx->inst, p->debug_ctx,
-- 
2.25.1


[FFmpeg-devel] [PATCH 1/4] avutil/vulkan_functions: add EnumerateInstanceLayerProperties

2021-11-23 Thread Wu Jianhua

Signed-off-by: Wu Jianhua 
---
 libavutil/vulkan_functions.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavutil/vulkan_functions.h b/libavutil/vulkan_functions.h
index 85a9f943c8..96922d7286 100644
--- a/libavutil/vulkan_functions.h
+++ b/libavutil/vulkan_functions.h
@@ -45,6 +45,7 @@ typedef enum FFVulkanExtensions {
 #define FN_LIST(MACRO)                                                                   \
     /* Instance */                                                                       \
     MACRO(0, 0, FF_VK_EXT_NO_FLAG,              EnumerateInstanceExtensionProperties)    \
+    MACRO(0, 0, FF_VK_EXT_NO_FLAG,              EnumerateInstanceLayerProperties)        \
     MACRO(0, 0, FF_VK_EXT_NO_FLAG,              CreateInstance)                          \
     MACRO(1, 0, FF_VK_EXT_NO_FLAG,              DestroyInstance)                         \
                                                                                          \
-- 
2.25.1


[FFmpeg-devel] [PATCH v3 3/3] tests/checkasm: add check for vf_exposure

2021-11-22 Thread Wu Jianhua

Signed-off-by: Wu Jianhua 
---
 tests/checkasm/Makefile  |  1 +
 tests/checkasm/checkasm.c|  3 ++
 tests/checkasm/checkasm.h|  1 +
 tests/checkasm/vf_exposure.c | 59 
 tests/fate/checkasm.mak  |  1 +
 5 files changed, 65 insertions(+)
 create mode 100644 tests/checkasm/vf_exposure.c

diff --git a/tests/checkasm/Makefile b/tests/checkasm/Makefile
index 4ef5fa87da..7b86ffca6b 100644
--- a/tests/checkasm/Makefile
+++ b/tests/checkasm/Makefile
@@ -37,6 +37,7 @@ AVFILTEROBJS-$(CONFIG_AFIR_FILTER) += af_afir.o
 AVFILTEROBJS-$(CONFIG_BLEND_FILTER) += vf_blend.o
 AVFILTEROBJS-$(CONFIG_COLORSPACE_FILTER) += vf_colorspace.o
 AVFILTEROBJS-$(CONFIG_EQ_FILTER) += vf_eq.o
+AVFILTEROBJS-$(CONFIG_EXPOSURE_FILTER)   += vf_exposure.o
 AVFILTEROBJS-$(CONFIG_GBLUR_FILTER)  += vf_gblur.o
 AVFILTEROBJS-$(CONFIG_HFLIP_FILTER)  += vf_hflip.o
 AVFILTEROBJS-$(CONFIG_THRESHOLD_FILTER)  += vf_threshold.o
diff --git a/tests/checkasm/checkasm.c b/tests/checkasm/checkasm.c
index b1353f7cbe..50961d9961 100644
--- a/tests/checkasm/checkasm.c
+++ b/tests/checkasm/checkasm.c
@@ -169,6 +169,9 @@ static const struct {
     #if CONFIG_EQ_FILTER
         { "vf_eq", checkasm_check_vf_eq },
     #endif
+    #if CONFIG_EXPOSURE_FILTER
+        { "vf_exposure", checkasm_check_vf_exposure },
+    #endif
     #if CONFIG_GBLUR_FILTER
         { "vf_gblur", checkasm_check_vf_gblur },
     #endif
diff --git a/tests/checkasm/checkasm.h b/tests/checkasm/checkasm.h
index 68b0697d3e..b402894ad3 100644
--- a/tests/checkasm/checkasm.h
+++ b/tests/checkasm/checkasm.h
@@ -78,6 +78,7 @@ void checkasm_check_utvideodsp(void);
 void checkasm_check_v210dec(void);
 void checkasm_check_v210enc(void);
 void checkasm_check_vf_eq(void);
+void checkasm_check_vf_exposure(void);
 void checkasm_check_vf_gblur(void);
 void checkasm_check_vf_hflip(void);
 void checkasm_check_vf_threshold(void);
diff --git a/tests/checkasm/vf_exposure.c b/tests/checkasm/vf_exposure.c
new file mode 100644
index 00..7301a6ef33
--- /dev/null
+++ b/tests/checkasm/vf_exposure.c
@@ -0,0 +1,59 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with FFmpeg; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include 
+#include 
+#include "checkasm.h"
+#include "libavfilter/exposure.h"
+
+#define PIXELS 256
+#define BUF_SIZE (PIXELS * 4)
+
+#define randomize_buffers(buf, size)            \
+    do {                                        \
+        int j;                                  \
+        float *tmp_buf = (float *)buf;          \
+        for (j = 0; j < size; j++)              \
+            tmp_buf[j] = (float)(rnd() & 0xFF); \
+    } while (0)
+
+void checkasm_check_vf_exposure(void)
+{
+    float *dst_ref[BUF_SIZE] = { 0 };
+    float *dst_new[BUF_SIZE] = { 0 };
+    ExposureContext s;
+
+    s.exposure = 0.5f;
+    s.black = 0.1f;
+    s.scale = 1.f / (exp2f(-s.exposure) - s.black);
+
+    randomize_buffers(dst_ref, PIXELS);
+    memcpy(dst_new, dst_ref, BUF_SIZE);
+
+    ff_exposure_init(&s);
+
+    if (check_func(s.exposure_func, "exposure")) {
+        declare_func(void, float *dst, int length, float black, float scale);
+        call_ref(dst_ref, PIXELS, s.black, s.scale);
+        call_new(dst_new, PIXELS, s.black, s.scale);
+        if (!float_near_abs_eps_array(dst_ref, dst_new, 0.01f, PIXELS))
+            fail();
+        bench_new(dst_new, PIXELS, s.black, s.scale);
+    }
+    report("exposure");
+}
diff --git a/tests/fate/checkasm.mak b/tests/fate/checkasm.mak
index 6e7edbe655..4d4cd6cc88 100644
--- a/tests/fate/checkasm.mak
+++ b/tests/fate/checkasm.mak
@@ -34,6 +34,7 @@ FATE_CHECKASM = fate-checkasm-aacpsdsp                                  \
                 fate-checkasm-vf_blend                                  \
                 fate-checkasm-vf_colorspace                             \
                 fate-checkasm-vf_eq                                     \
+                fate-checkasm-vf_exposure                               \
                 fate-checkasm-vf_gblur                                  \
                 fate-checkasm-vf_hflip                                  \
                 fate-checkasm-vf_nlmeans                                \
-- 
2.17.1


[FFmpeg-devel] [PATCH v3 2/3] avfilter/x86/vf_exposure: add ff_exposure_avx2

2021-11-22 Thread Wu Jianhua

Performance data (lower is better):
exposure_sse:   500491
exposure_avx2:  449122

Signed-off-by: Wu Jianhua 
---
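
For readers who do not read x86 assembly, the rearrangement that the FMA branch in this patch appears to rely on can be sketched in scalar C: (x - black) * scale is rewritten as x * scale - black * scale, with black * scale computed once outside the loop. The function names below are hypothetical, and the sketch uses an ordinary multiply and subtract rather than a fused instruction, so the two forms may differ by rounding error.

```c
#include <assert.h>
#include <stddef.h>

/* Reference form: subtract the black level, then scale. */
static void exposure_ref(float *ptr, int length, float black, float scale)
{
    assert(ptr && length >= 0);
    for (int i = 0; i < length; i++)
        ptr[i] = (ptr[i] - black) * scale;
}

/* Rearranged form: x * scale - (black * scale), product hoisted out of
 * the loop. A fused multiply-subtract can evaluate the loop body in one
 * instruction per vector. */
static void exposure_fused(float *ptr, int length, float black, float scale)
{
    const float black_scaled = black * scale;

    assert(ptr && length >= 0);
    for (int i = 0; i < length; i++)
        ptr[i] = ptr[i] * scale - black_scaled;
}

/* Largest absolute difference between two arrays of equal length. */
static float max_abs_diff(const float *a, const float *b, int length)
{
    float max = 0.0f;

    for (int i = 0; i < length; i++) {
        float d = a[i] - b[i];
        if (d < 0.0f)
            d = -d;
        if (d > max)
            max = d;
    }
    return max;
}
```

Because the two forms round differently, a comparison between them needs a tolerance rather than exact equality.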
 libavfilter/x86/vf_exposure.asm| 15 +++
 libavfilter/x86/vf_exposure_init.c |  4 
 2 files changed, 19 insertions(+)

diff --git a/libavfilter/x86/vf_exposure.asm b/libavfilter/x86/vf_exposure.asm
index 3351c6fb3b..4ee9fbcb15 100644
--- a/libavfilter/x86/vf_exposure.asm
+++ b/libavfilter/x86/vf_exposure.asm
@@ -36,11 +36,21 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
     VBROADCASTSS m1, xmm1
 %endif
 
+%if cpuflag(fma3)
+    mulps       m0, m0, m1 ; black * scale
+%endif
+
 .loop:
+%if cpuflag(fma3)
+    mova        m2, m0
+    vfmsub231ps m2, m1, [ptrq]
+    movu    [ptrq], m2
+%else
     movu        m2, [ptrq]
     subps       m2, m2, m0
     mulps       m2, m2, m1
     movu    [ptrq], m2
+%endif
     add       ptrq, mmsize
     sub    lengthq, mmsize/4
 
@@ -52,4 +62,9 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
 %if ARCH_X86_64
 INIT_XMM sse
 EXPOSURE
+
+%if HAVE_AVX2_EXTERNAL
+INIT_YMM avx2
+EXPOSURE
+%endif
 %endif
diff --git a/libavfilter/x86/vf_exposure_init.c b/libavfilter/x86/vf_exposure_init.c
index de1b360f6c..edc1452850 100644
--- a/libavfilter/x86/vf_exposure_init.c
+++ b/libavfilter/x86/vf_exposure_init.c
@@ -24,6 +24,7 @@
 #include "libavfilter/exposure.h"
 
 void ff_exposure_sse(float *ptr, int length, float black, float scale);
+void ff_exposure_avx2(float *ptr, int length, float black, float scale);
 
 av_cold void ff_exposure_init_x86(ExposureContext *s)
 {
@@ -32,5 +33,8 @@ av_cold void ff_exposure_init_x86(ExposureContext *s)
 #if ARCH_X86_64
     if (EXTERNAL_SSE(cpu_flags))
         s->exposure_func = ff_exposure_sse;
+
+    if (EXTERNAL_AVX2_FAST(cpu_flags))
+        s->exposure_func = ff_exposure_avx2;
 #endif
 }
-- 
2.17.1


[FFmpeg-devel] [PATCH v3 1/3] avfilter/x86/vf_exposure: add x86 SIMD optimization

2021-11-22 Thread Wu Jianhua

Performance data (lower is better):
exposure_c:857394
exposure_sse:  327589

Signed-off-by: Wu Jianhua 
---
 libavfilter/exposure.h | 36 +++
 libavfilter/vf_exposure.c  | 36 +--
 libavfilter/x86/Makefile   |  2 ++
 libavfilter/x86/vf_exposure.asm| 55 ++
 libavfilter/x86/vf_exposure_init.c | 36 +++
 5 files changed, 147 insertions(+), 18 deletions(-)
 create mode 100644 libavfilter/exposure.h
 create mode 100644 libavfilter/x86/vf_exposure.asm
 create mode 100644 libavfilter/x86/vf_exposure_init.c

diff --git a/libavfilter/exposure.h b/libavfilter/exposure.h
new file mode 100644
index 00..e76a517826
--- /dev/null
+++ b/libavfilter/exposure.h
@@ -0,0 +1,36 @@
+/*
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVFILTER_EXPOSURE_H
+#define AVFILTER_EXPOSURE_H
+#include "avfilter.h"
+
+typedef struct ExposureContext {
+const AVClass *class;
+
+float exposure;
+float black;
+float scale;
+
+void (*exposure_func)(float *ptr, int length, float black, float scale);
+} ExposureContext;
+
+void ff_exposure_init(ExposureContext *s);
+void ff_exposure_init_x86(ExposureContext *s);
+
+#endif
diff --git a/libavfilter/vf_exposure.c b/libavfilter/vf_exposure.c
index 108fba7930..045ae710d3 100644
--- a/libavfilter/vf_exposure.c
+++ b/libavfilter/vf_exposure.c
@@ -26,23 +26,20 @@
 #include "formats.h"
 #include "internal.h"
 #include "video.h"
+#include "exposure.h"
 
-typedef struct ExposureContext {
-const AVClass *class;
-
-float exposure;
-float black;
+static void exposure_c(float *ptr, int length, float black, float scale)
+{
+int i;
 
-float scale;
-int (*do_slice)(AVFilterContext *s, void *arg,
-int jobnr, int nb_jobs);
-} ExposureContext;
+for (i = 0; i < length; i++)
+ptr[i] = (ptr[i] - black) * scale;
+}
 
 static int exposure_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_jobs)
 {
 ExposureContext *s = ctx->priv;
 AVFrame *frame = arg;
-const int width = frame->width;
 const int height = frame->height;
 const int slice_start = (height * jobnr) / nb_jobs;
 const int slice_end = (height * (jobnr + 1)) / nb_jobs;
@@ -52,24 +49,27 @@ static int exposure_slice(AVFilterContext *ctx, void *arg, int jobnr, int nb_job
 for (int p = 0; p < 3; p++) {
 const int linesize = frame->linesize[p] / 4;
 float *ptr = (float *)frame->data[p] + slice_start * linesize;
-for (int y = slice_start; y < slice_end; y++) {
-for (int x = 0; x < width; x++)
-ptr[x] = (ptr[x] - black) * scale;
-
-ptr += linesize;
-}
+s->exposure_func(ptr, linesize * (slice_end - slice_start), black, scale);
 }
 
 return 0;
 }
 
+void ff_exposure_init(ExposureContext *s)
+{
+s->exposure_func = exposure_c;
+
+if (ARCH_X86)
+ff_exposure_init_x86(s);
+}
+
 static int filter_frame(AVFilterLink *inlink, AVFrame *frame)
 {
 AVFilterContext *ctx = inlink->dst;
 ExposureContext *s = ctx->priv;
 
 s->scale = 1.f / (exp2f(-s->exposure) - s->black);
-ff_filter_execute(ctx, s->do_slice, frame, NULL,
+ff_filter_execute(ctx, exposure_slice, frame, NULL,
   FFMIN(frame->height, ff_filter_get_nb_threads(ctx)));
 
 return ff_filter_frame(ctx->outputs[0], frame);
@@ -80,7 +80,7 @@ static av_cold int config_input(AVFilterLink *inlink)
 AVFilterContext *ctx = inlink->dst;
 ExposureContext *s = ctx->priv;
 
-s->do_slice = exposure_slice;
+ff_exposure_init(s);
 
 return 0;
 }
diff --git a/libavfilter/x86/Makefile b/libavfilter/x86/Makefile
index e87481bd7a..830a1e94cb 100644
--- a/libavfilter/x86/Makefile
+++ b/libavfilter/x86/Makefile
@@ -8,6 +8,7 @@ OBJS-$(CONFIG_BWDIF_FILTER)  += x86/vf_bwdif_init.o
 OBJS-$(CONFIG_COLORSPACE_FILTER) += x86/colorspacedsp_init.o
 OBJS-$(CONFIG_CONVOLUTION_FILTER)+= x86/vf_convolution_init.o
 OBJS-$(CONFIG_EQ_FILTER) += x86/vf_eq_init.o
+OBJS-$(

Re: [FFmpeg-devel] [PATCH v2 2/3] avfilter/x86/vf_exposure: add ff_exposure_avx2

2021-11-20 Thread Wu Jianhua

James Almer<mailto:jamr...@gmail.com>:
>On 11/20/2021 5:42 PM, Wu Jianhua wrote:
>> James Almer<mailto:jamr...@gmail.com>:
>> On 11/4/2021 1:18 AM, Wu Jianhua wrote:
>>>> Performance data(Less is better):
>>>>   exposure_sse:   500491
>>
>>> You reported a better result in the first patch.
>>
>> They were tested on different baselines, so I think it is better to
>> compare only these two values.
>>
>>>>   exposure_avx2:  449122
>>
>>> This looks like a really low speed up for a function that processes
>>>   twice the amount of floats per loop.
>>
>>>>
>>>> Signed-off-by: Wu Jianhua 
>>>> ---
>>>>libavfilter/x86/vf_exposure.asm| 15 +++
>>>>libavfilter/x86/vf_exposure_init.c |  6 ++
>>>>2 files changed, 21 insertions(+)
>>>>
>>>> diff --git a/libavfilter/x86/vf_exposure.asm 
>>>> b/libavfilter/x86/vf_exposure.asm
>>>> index 3351c6fb3b..f271167805 100644
>>>> --- a/libavfilter/x86/vf_exposure.asm
>>>> +++ b/libavfilter/x86/vf_exposure.asm
>>>> @@ -36,11 +36,21 @@ cglobal exposure, 2, 2, 4, ptr, length, black, scale
>>>>VBROADCASTSS m1, xmm1
>>>>%endif
>>>>
>>>> +%if cpuflag(fma3) || cpuflag(fma4)
>>
>>> Remove the fma4 check if you're not using it.
>>
>> No problem. The avx2 flag is only initialized together with fma3, so the
>> fma4 check is indeed redundant.
>>
>>>> +mulps   m0, m0, m1 ; black * scale
>>>> +%endif
>>>> +
>>>>.loop:
>>>> +%if cpuflag(fma3) || cpuflag(fma4)
>>>> +mova    m2, m0
>>>> +vfmsub231ps m2, m1, [ptrq]
>>>> +movu    [ptrq], m2
>>
>>> Have you tried to not use FMA for this and just kept the sub + mul even
>>> for AVX2 and see how it performs?
>>
>> Yes, definitely. I ran sufficient tests beforehand. The first version kept
>> sub + mul for AVX2. After that, I kept looking for a way to speed it up
>> further. Using FMA here is indeed faster than sub + mul, by approximately
>> 4%-10%. That is not much better, but it is the best approach I have found
>> so far.

> I tried the checkasm test you wrote, and when I made the AVX2 version use
> sub + mul instead of vfmsub231ps I noticed that I could change the
> epsilon value to FLT_EPSILON instead of 0.01f and the test would still
> succeed, meaning the output of the version using vfmsub231ps deviates a
> bit from the normal sub + mul one.

> The speed up is pretty small, so it may be worth just using the sub +
> mul version instead.

Yes, the speedup is small, but the function is not called just once. Many a
little makes a mickle, doesn't it?
I would prefer to keep this.


