Re: [FFmpeg-devel] [PATCH 1/6] lavc/aarch64: new optimization for 8-bit hevc_pel_bi_pixels

2023-11-24 Thread Logan.Lyu
still doesn't work, should I re-send the .eml files one by one? Please tell me how to deal with it, I will be grateful. Thanks On 2023/11/22 20:36, Martin Storsjö via ffmpeg-devel wrote: On Wed, 22 Nov 2023, Logan.Lyu wrote: I can't reproduce the error you mentioned... I can apply patches

Re: [FFmpeg-devel] [PATCH 1/6] lavc/aarch64: new optimization for 8-bit hevc_pel_bi_pixels

2023-11-22 Thread Logan.Lyu
right? Can you try these patches again? If the error still occurs, please tell me how it occurred and I will fix it. On 2023/11/20 4:42, Michael Niedermayer wrote: On Sat, Nov 18, 2023 at 10:06:37AM +0800, Logan.Lyu wrote: put_hevc_pel_bi_pixels4_8_c: 54.7 put_hevc_pel_bi_

[FFmpeg-devel] [PATCH 6/6] lavc/aarch64: new optimization for 8-bit hevc_qpel_bi_hv

2023-11-17 Thread Logan.Lyu
put_hevc_qpel_bi_hv4_8_c: 433.7 put_hevc_qpel_bi_hv4_8_i8mm: 117.9 put_hevc_qpel_bi_hv6_8_c: 803.9 put_hevc_qpel_bi_hv6_8_i8mm: 252.7 put_hevc_qpel_bi_hv8_8_c: 1296.4 put_hevc_qpel_bi_hv8_8_i8mm: 316.2 put_hevc_qpel_bi_hv12_8_c: 2867.4 put_hevc_qpel_bi_hv12_8_i8mm: 669.2

[FFmpeg-devel] [PATCH 5/6] lavc/aarch64: new optimization for 8-bit hevc_qpel_bi_v

2023-11-17 Thread Logan.Lyu
put_hevc_qpel_bi_v4_8_c: 166.1 put_hevc_qpel_bi_v4_8_neon: 61.9 put_hevc_qpel_bi_v6_8_c: 309.4 put_hevc_qpel_bi_v6_8_neon: 75.6 put_hevc_qpel_bi_v8_8_c: 531.1 put_hevc_qpel_bi_v8_8_neon: 78.1 put_hevc_qpel_bi_v12_8_c: 1139.9 put_hevc_qpel_bi_v12_8_neon: 238.1 put_hevc_qpel_bi_v16_8_c: 2063.6

[FFmpeg-devel] [PATCH 4/6] lavc/aarch64: new optimization for 8-bit hevc_epel_bi_hv

2023-11-17 Thread Logan.Lyu
put_hevc_epel_bi_hv4_8_c: 242.9 put_hevc_epel_bi_hv4_8_i8mm: 68.6 put_hevc_epel_bi_hv6_8_c: 402.4 put_hevc_epel_bi_hv6_8_i8mm: 135.9 put_hevc_epel_bi_hv8_8_c: 636.4 put_hevc_epel_bi_hv8_8_i8mm: 145.6 put_hevc_epel_bi_hv12_8_c: 1363.1 put_hevc_epel_bi_hv12_8_i8mm: 324.1 put_hevc_epel_bi_hv16_8_c:

[FFmpeg-devel] [PATCH 3/6] lavc/aarch64: new optimization for 8-bit hevc_epel_bi_v

2023-11-17 Thread Logan.Lyu
put_hevc_epel_bi_v4_8_c: 138.4 put_hevc_epel_bi_v4_8_neon: 33.7 put_hevc_epel_bi_v6_8_c: 302.9 put_hevc_epel_bi_v6_8_neon: 46.7 put_hevc_epel_bi_v8_8_c: 408.7 put_hevc_epel_bi_v8_8_neon: 48.7 put_hevc_epel_bi_v12_8_c: 779.4 put_hevc_epel_bi_v12_8_neon: 139.7 put_hevc_epel_bi_v16_8_c: 1344.9

[FFmpeg-devel] [PATCH 2/6] lavc/aarch64: new optimization for 8-bit hevc_epel_bi_h

2023-11-17 Thread Logan.Lyu
put_hevc_epel_bi_h4_8_c: 96.0 put_hevc_epel_bi_h4_8_neon: 36.3 put_hevc_epel_bi_h6_8_c: 288.3 put_hevc_epel_bi_h6_8_neon: 59.3 put_hevc_epel_bi_h8_8_c: 358.5 put_hevc_epel_bi_h8_8_neon: 61.5 put_hevc_epel_bi_h12_8_c: 759.8 put_hevc_epel_bi_h12_8_neon: 159.5 put_hevc_epel_bi_h16_8_c: 1307.0

[FFmpeg-devel] [PATCH 1/6] lavc/aarch64: new optimization for 8-bit hevc_pel_bi_pixels

2023-11-17 Thread Logan.Lyu
put_hevc_pel_bi_pixels4_8_c: 54.7 put_hevc_pel_bi_pixels4_8_neon: 43.0 put_hevc_pel_bi_pixels6_8_c: 94.7 put_hevc_pel_bi_pixels6_8_neon: 37.0 put_hevc_pel_bi_pixels8_8_c: 171.0 put_hevc_pel_bi_pixels8_8_neon: 24.0 put_hevc_pel_bi_pixels12_8_c: 354.0 put_hevc_pel_bi_pixels12_8_neon: 68.7
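
The figures in these snippets come in pairs: a C reference timing and a NEON or i8mm timing for the same function. As a rough illustration of how to read them, the sketch below parses the snippet of PATCH 1/6 above and computes the ratio of the C timing to the optimized timing. It assumes that lower numbers are faster; the helper names `parse_bench` and `speedups` are hypothetical and are not part of checkasm or FFmpeg.

```python
import re

# checkasm figures copied from the PATCH 1/6 snippet above.
BENCH = (
    "put_hevc_pel_bi_pixels4_8_c: 54.7 put_hevc_pel_bi_pixels4_8_neon: 43.0 "
    "put_hevc_pel_bi_pixels6_8_c: 94.7 put_hevc_pel_bi_pixels6_8_neon: 37.0 "
    "put_hevc_pel_bi_pixels8_8_c: 171.0 put_hevc_pel_bi_pixels8_8_neon: 24.0 "
    "put_hevc_pel_bi_pixels12_8_c: 354.0 put_hevc_pel_bi_pixels12_8_neon: 68.7"
)


def parse_bench(text):
    """Return {function: {variant: timing}} from 'name_variant: value' pairs.

    Hypothetical helper: only the suffixes seen in this listing
    (c, neon, i8mm) are recognized.
    """
    results = {}
    for name, variant, value in re.findall(r"(\w+)_(c|neon|i8mm): ([\d.]+)", text):
        results.setdefault(name, {})[variant] = float(value)
    return results


def speedups(results):
    """Return {(function, variant): c_timing / variant_timing}.

    Assumes lower timings are faster, so a ratio above 1 means the
    optimized variant beats the C reference.
    """
    out = {}
    for name, variants in results.items():
        baseline = variants.get("c")
        if baseline is None:
            continue  # no C reference in the snippet, nothing to compare against
        for variant, timing in variants.items():
            if variant != "c":
                out[(name, variant)] = baseline / timing
    return out


if __name__ == "__main__":
    for (name, variant), ratio in sorted(speedups(parse_bench(BENCH)).items()):
        print(f"{name} ({variant}): {ratio:.2f}x")
```

Entries whose snippet is cut off before the optimized figure (for example a trailing `_c` value with no partner) are skipped by the comparison, since there is no pair to form a ratio from.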

Re: [FFmpeg-devel] [PATCH 1/4] lavc/aarch64: new optimization for 8-bit hevc_epel_v

2023-10-26 Thread Logan.Lyu
that these patches can be successfully applied on the latest master branch. Please check again, thank you. On 2023/10/23 1:18, Martin Storsjö wrote: On Sun, 22 Oct 2023, Logan.Lyu wrote: Hi, Martin, Could you please review these patches and let me know if there are any changes needed. Did you see

Re: [FFmpeg-devel] [PATCH 1/4] lavc/aarch64: new optimization for 8-bit hevc_epel_v

2023-10-22 Thread Logan.Lyu
Hi, Martin, Could you please review these patches and let me know if there are any changes needed. Thanks. Logan Lyu On 2023/10/14 16:45, Logan.Lyu wrote: checkasm bench: put_hevc_epel_v4_8_c: 79.9 put_hevc_epel_v4_8_neon: 25.7 put_hevc_epel_v6_8_c: 151.4 put_hevc_epel_v6_8_neon: 46.4

[FFmpeg-devel] [PATCH 4/4] lavc/aarch64: new optimization for 8-bit hevc_qpel_hv

2023-10-14 Thread Logan.Lyu
checkasm bench: put_hevc_qpel_hv4_8_c: 422.1 put_hevc_qpel_hv4_8_i8mm: 101.6 put_hevc_qpel_hv6_8_c: 756.4 put_hevc_qpel_hv6_8_i8mm: 225.9 put_hevc_qpel_hv8_8_c: 1189.9 put_hevc_qpel_hv8_8_i8mm: 296.6 put_hevc_qpel_hv12_8_c: 2407.4 put_hevc_qpel_hv12_8_i8mm: 552.4 put_hevc_qpel_hv16_8_c: 4021.4

[FFmpeg-devel] [PATCH 3/4] lavc/aarch64: new optimization for 8-bit hevc_qpel_v

2023-10-14 Thread Logan.Lyu
checkasm bench: put_hevc_qpel_v4_8_c: 138.1 put_hevc_qpel_v4_8_neon: 41.1 put_hevc_qpel_v6_8_c: 276.6 put_hevc_qpel_v6_8_neon: 60.9 put_hevc_qpel_v8_8_c: 478.9 put_hevc_qpel_v8_8_neon: 72.9 put_hevc_qpel_v12_8_c: 1072.6 put_hevc_qpel_v12_8_neon: 203.9 put_hevc_qpel_v16_8_c: 1852.1

[FFmpeg-devel] [PATCH 2/4] lavc/aarch64: new optimization for 8-bit hevc_epel_hv

2023-10-14 Thread Logan.Lyu
checkasm bench: put_hevc_epel_hv4_8_c: 213.7 put_hevc_epel_hv4_8_i8mm: 59.4 put_hevc_epel_hv6_8_c: 350.9 put_hevc_epel_hv6_8_i8mm: 130.2 put_hevc_epel_hv8_8_c: 548.7 put_hevc_epel_hv8_8_i8mm: 136.9 put_hevc_epel_hv12_8_c: 1126.7 put_hevc_epel_hv12_8_i8mm: 302.2 put_hevc_epel_hv16_8_c: 1925.2

[FFmpeg-devel] [PATCH 1/4] lavc/aarch64: new optimization for 8-bit hevc_epel_v

2023-10-14 Thread Logan.Lyu
checkasm bench: put_hevc_epel_v4_8_c: 79.9 put_hevc_epel_v4_8_neon: 25.7 put_hevc_epel_v6_8_c: 151.4 put_hevc_epel_v6_8_neon: 46.4 put_hevc_epel_v8_8_c: 250.9 put_hevc_epel_v8_8_neon: 41.7 put_hevc_epel_v12_8_c: 542.7 put_hevc_epel_v12_8_neon: 108.7 put_hevc_epel_v16_8_c: 939.4

[FFmpeg-devel] (no subject)

2023-10-14 Thread Logan.Lyu
checkasm bench: put_hevc_qpel_hv4_8_c: 422.1 put_hevc_qpel_hv4_8_i8mm: 101.6 put_hevc_qpel_hv6_8_c: 756.4 put_hevc_qpel_hv6_8_i8mm: 225.9 put_hevc_qpel_hv8_8_c: 1189.9 put_hevc_qpel_hv8_8_i8mm: 296.6 put_hevc_qpel_hv12_8_c: 2407.4 put_hevc_qpel_hv12_8_i8mm: 552.4 put_hevc_qpel_hv16_8_c: 4021.4

[FFmpeg-devel] (no subject)

2023-10-14 Thread Logan.Lyu
checkasm bench: put_hevc_qpel_v4_8_c: 138.1 put_hevc_qpel_v4_8_neon: 41.1 put_hevc_qpel_v6_8_c: 276.6 put_hevc_qpel_v6_8_neon: 60.9 put_hevc_qpel_v8_8_c: 478.9 put_hevc_qpel_v8_8_neon: 72.9 put_hevc_qpel_v12_8_c: 1072.6 put_hevc_qpel_v12_8_neon: 203.9 put_hevc_qpel_v16_8_c: 1852.1

[FFmpeg-devel] (no subject)

2023-10-14 Thread Logan.Lyu
checkasm bench: put_hevc_epel_v4_8_c: 79.9 put_hevc_epel_v4_8_neon: 25.7 put_hevc_epel_v6_8_c: 151.4 put_hevc_epel_v6_8_neon: 46.4 put_hevc_epel_v8_8_c: 250.9 put_hevc_epel_v8_8_neon: 41.7 put_hevc_epel_v12_8_c: 542.7 put_hevc_epel_v12_8_neon: 108.7 put_hevc_epel_v16_8_c: 939.4

[FFmpeg-devel] (no subject)

2023-10-14 Thread Logan.Lyu
checkasm bench: put_hevc_epel_hv4_8_c: 213.7 put_hevc_epel_hv4_8_i8mm: 59.4 put_hevc_epel_hv6_8_c: 350.9 put_hevc_epel_hv6_8_i8mm: 130.2 put_hevc_epel_hv8_8_c: 548.7 put_hevc_epel_hv8_8_i8mm: 136.9 put_hevc_epel_hv12_8_c: 1126.7 put_hevc_epel_hv12_8_i8mm: 302.2 put_hevc_epel_hv16_8_c: 1925.2

Re: [FFmpeg-devel] [PATCH 1/4] lavc/aarch64: new optimization for 8-bit hevc_epel_uni_v

2023-09-22 Thread Logan.Lyu
2023/9/17 5:46, Martin Storsjö wrote: On Thu, 14 Sep 2023, Logan.Lyu wrote: Hi Martin, You can try the attached patchset. If that doesn't work, my code branch address is https://github.com/myais2023/FFmpeg/tree/hevc-aarch64 Thanks for the patches. Functionally, they seem to work, and the issues i

Re: [FFmpeg-devel] [PATCH 1/4] lavc/aarch64: new optimization for 8-bit hevc_epel_uni_v

2023-09-13 Thread Logan.Lyu
Hi Martin, You can try the attached patchset. If that doesn't work, my code branch address is https://github.com/myais2023/FFmpeg/tree/hevc-aarch64 Please try it again. Thanks On 2023/9/12 19:48, Martin Storsjö wrote: Hi, Sorry for not tending to your patches sooner. Unfortunately, this

[FFmpeg-devel] [PATCH 3/4] lavc/aarch64: new optimization for 8-bit hevc_qpel_uni_v

2023-08-26 Thread Logan.Lyu
checkasm bench: put_hevc_qpel_uni_v4_8_c: 146.2 put_hevc_qpel_uni_v4_8_neon: 43.2 put_hevc_qpel_uni_v6_8_c: 303.9 put_hevc_qpel_uni_v6_8_neon: 69.7 put_hevc_qpel_uni_v8_8_c: 495.2 put_hevc_qpel_uni_v8_8_neon: 74.7 put_hevc_qpel_uni_v12_8_c: 1100.9 put_hevc_qpel_uni_v12_8_neon: 222.4

[FFmpeg-devel] [PATCH 2/4] lavc/aarch64: new optimization for 8-bit hevc_epel_uni_hv

2023-08-26 Thread Logan.Lyu
checkasm bench: put_hevc_epel_uni_hv4_8_c: 204.7 put_hevc_epel_uni_hv4_8_i8mm: 70.2 put_hevc_epel_uni_hv6_8_c: 378.2 put_hevc_epel_uni_hv6_8_i8mm: 131.9 put_hevc_epel_uni_hv8_8_c: 637.7 put_hevc_epel_uni_hv8_8_i8mm: 137.9 put_hevc_epel_uni_hv12_8_c: 1301.9 put_hevc_epel_uni_hv12_8_i8mm: 314.2

[FFmpeg-devel] [PATCH 4/4] lavc/aarch64: new optimization for 8-bit hevc_qpel_uni_hv

2023-08-26 Thread Logan.Lyu
checkasm bench: put_hevc_qpel_uni_hv4_8_c: 489.2 put_hevc_qpel_uni_hv4_8_i8mm: 105.7 put_hevc_qpel_uni_hv6_8_c: 852.7 put_hevc_qpel_uni_hv6_8_i8mm: 268.7 put_hevc_qpel_uni_hv8_8_c: 1345.7 put_hevc_qpel_uni_hv8_8_i8mm: 300.4 put_hevc_qpel_uni_hv12_8_c: 2757.4 put_hevc_qpel_uni_hv12_8_i8mm: 581.4

[FFmpeg-devel] [PATCH 1/4] lavc/aarch64: new optimization for 8-bit hevc_epel_uni_v

2023-08-26 Thread Logan.Lyu
checkasm bench: put_hevc_epel_uni_hv64_8_i8mm: 6568.7 put_hevc_epel_uni_v4_8_c: 88.7 put_hevc_epel_uni_v4_8_neon: 32.7 put_hevc_epel_uni_v6_8_c: 185.4 put_hevc_epel_uni_v6_8_neon: 44.9 put_hevc_epel_uni_v8_8_c: 333.9 put_hevc_epel_uni_v8_8_neon: 44.4 put_hevc_epel_uni_v12_8_c: 728.7

Re: [FFmpeg-devel] [PATCH 5/5] lavc/aarch64: new optimization for 8-bit hevc_epel_uni_w_hv

2023-07-13 Thread Logan.Lyu
, but later I confirmed that it is the lower 64 bits, thank you for the reminder. Please take a look. If there are some small mistakes, please correct them directly. If there are still many problems, please remind me again, thank you! On 2023/7/2 5:28, Martin Storsjö wrote: On Sun, 18 Jun 2023, Logan.Lyu

Re: [FFmpeg-devel] [PATCH 1/5] lavc/aarch64: new optimization for 8-bit hevc_pel_uni_pixels

2023-06-18 Thread Logan.Lyu
Hi, Martin, I modified it according to your comments. Please review again. And here are the checkasm benchmark results of the related functions: The platform I tested is the g8y instance of Alibaba Cloud, with a chip based on armv9. put_hevc_pel_uni_pixels4_8_c: 35.9

Re: [FFmpeg-devel] [PATCH 4/5] lavc/aarch64: new optimization for 8-bit hevc_epel_h

2023-06-18 Thread Logan.Lyu
Add missing patch attachment... On 2023/6/18 16:23, Logan.Lyu wrote: Hi, Martin, I modified it according to your comments. Please review again. And here are the checkasm benchmark results of the related functions: put_hevc_epel_h4_8_c: 67.1 put_hevc_epel_h4_8_i8mm: 21.1 put_hevc_epel_h6_8_c

Re: [FFmpeg-devel] [PATCH 5/5] lavc/aarch64: new optimization for 8-bit hevc_epel_uni_w_hv

2023-06-18 Thread Logan.Lyu
Hi, Martin, I modified it according to your comments. Please review again. And here are the checkasm benchmark results of the related functions: put_hevc_epel_uni_w_hv4_8_c: 254.6 put_hevc_epel_uni_w_hv4_8_i8mm: 102.9 put_hevc_epel_uni_w_hv6_8_c: 411.6 put_hevc_epel_uni_w_hv6_8_i8mm: 221.6

Re: [FFmpeg-devel] [PATCH 4/5] lavc/aarch64: new optimization for 8-bit hevc_epel_h

2023-06-18 Thread Logan.Lyu
Hi, Martin, I modified it according to your comments. Please review again. And here are the checkasm benchmark results of the related functions: put_hevc_epel_h4_8_c: 67.1 put_hevc_epel_h4_8_i8mm: 21.1 put_hevc_epel_h6_8_c: 147.1 put_hevc_epel_h6_8_i8mm: 45.1 put_hevc_epel_h8_8_c: 237.4

Re: [FFmpeg-devel] [PATCH 3/5] lavc/aarch64: new optimization for 8-bit hevc_epel_uni_w_v

2023-06-18 Thread Logan.Lyu
Hi, Martin, I modified it according to your comments. Please review again. And here are the checkasm benchmark results of the related functions: put_hevc_epel_uni_w_v4_8_c: 116.1 put_hevc_epel_uni_w_v4_8_neon: 48.6 put_hevc_epel_uni_w_v6_8_c: 248.9 put_hevc_epel_uni_w_v6_8_neon: 80.6

Re: [FFmpeg-devel] [PATCH 2/5] lavc/aarch64: new optimization for 8-bit hevc_epel_uni_w_h

2023-06-18 Thread Logan.Lyu
Hi, Martin, I modified it according to your comments. Please review again. And here are the checkasm benchmark results of the related functions: put_hevc_epel_uni_w_h4_8_c: 126.1 put_hevc_epel_uni_w_h4_8_i8mm: 41.6 put_hevc_epel_uni_w_h6_8_c: 222.9 put_hevc_epel_uni_w_h6_8_i8mm: 91.4

Re: [FFmpeg-devel] [PATCH] lavc/aarch64: new optimization for 8-bit hevc_pel_uni_w_pixels, qpel_uni_w_h, qpel_uni_w_v, qpel_uni_w_hv and qpel_h

2023-06-02 Thread Logan.Lyu
Hi, Martin, I'm sorry, I made a stupid mistake, and it's fixed now. If these patches are acceptable to you, I will submit some similar patches soon. Thanks. On 2023/6/1 19:23, Martin Storsjö wrote: On Sun, 28 May 2023, Logan.Lyu wrote: On 2023/5/28 12:36, Jean-Baptiste Kempf wrote: Hello

Re: [FFmpeg-devel] [PATCH] lavc/aarch64: new optimization for 8-bit hevc_pel_uni_w_pixels, qpel_uni_w_h, qpel_uni_w_v, qpel_uni_w_hv and qpel_h

2023-05-28 Thread Logan.Lyu
On 2023/5/28 12:36, Jean-Baptiste Kempf wrote: Hello, The last interaction still has the wrong name in patchset. Thanks for the reminder. I have corrected the name in git. jb On Sun, 28 May 2023, at 12:23, Logan.Lyu wrote: Hi, Martin I have finished the modification, please review again

Re: [FFmpeg-devel] [PATCH] lavc/aarch64: new optimization for 8-bit hevc_pel_uni_w_pixels, qpel_uni_w_h, qpel_uni_w_v, qpel_uni_w_hv and qpel_h

2023-05-27 Thread Logan.Lyu
Hi, Martin I have finished the modification, please review again. Thanks. On 2023/5/26 16:34, Martin Storsjö wrote: Hi, Overall these patches seem mostly ok, but I've got a few minor points to make: - The usdot instruction requires the i8mm extension (part of armv8.6-a), while udot or sdot