This makes the code slightly clearer, but doesn't make any functional
difference.
---
libavcodec/arm/vp9itxfm_16bpp_neon.S | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavcodec/arm/vp9itxfm_16bpp_neon.S
b/libavcodec/arm/vp9itxfm_16bpp_neon.S
index e6e9440..a92f323 100
This reduces the number of lines and reduces the duplication.
Also simplify the eob check for the half case.
If we are in the half case, we know we will need to do at least the
first three slices; we only need to check eob for the fourth one,
so we can hardcode the value to check against instead
This work is sponsored by, and copyright, Google.
This avoids loading and calculating coefficients that we know will
be zero, and avoids filling the temp buffer with zeros in places
where we know the second pass won't read.
This gives a pretty substantial speedup for the smaller subpartitions.
T
---
libavcodec/aarch64/vp9itxfm_16bpp_neon.S | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/aarch64/vp9itxfm_16bpp_neon.S
b/libavcodec/aarch64/vp9itxfm_16bpp_neon.S
index f53e94a..f80604f 100644
--- a/libavcodec/aarch64/vp9itxfm_16bpp_neon.S
+++ b/libavcodec/aarch6
---
libavcodec/arm/vp9itxfm_16bpp_neon.S | 20 ++--
1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/libavcodec/arm/vp9itxfm_16bpp_neon.S
b/libavcodec/arm/vp9itxfm_16bpp_neon.S
index a92f323..9c02ed9 100644
--- a/libavcodec/arm/vp9itxfm_16bpp_neon.S
+++ b/libavcodec
Keep the idct32 coefficients in narrow form in q6-q7, and idct16
coefficients in lengthened 32 bit form in q0-q3. Avoid clobbering
q0-q3 in the pass1 function, and squeeze the idct16 coefficients
into q0-q1 in the pass2 function to avoid reloading them.
The idct16 coefficients are clobbered and re
In the half/quarter cases where we don't use the min_eob array, defer
loading the pointer until we know it will be needed.
This is cherrypicked from libav commit
3a0d5e206d24d41d87a25ba16a79b2ea04c39d4c.
---
libavcodec/aarch64/vp9itxfm_neon.S | 3 ++-
libavcodec/arm/vp9itxfm_neon.S | 4 ++--
This allows reusing the macro for a separate implementation of the
pass2 function.
---
libavcodec/aarch64/vp9itxfm_16bpp_neon.S | 98
1 file changed, 49 insertions(+), 49 deletions(-)
diff --git a/libavcodec/aarch64/vp9itxfm_16bpp_neon.S
b/libavcodec/aarch64/vp9i
Align the second/third operands as they usually are.
Due to the wildly varying sizes of the written out operands
in aarch64 assembly, the column alignment is usually not as clear
as in arm assembly.
This is cherrypicked from libav commit
7995ebfad12002033c73feed422a1cfc62081e8f.
---
libavcodec/a
This avoids concatenation, which can't be used if the whole macro
is wrapped within another macro.
---
libavcodec/aarch64/vp9itxfm_16bpp_neon.S | 90
1 file changed, 45 insertions(+), 45 deletions(-)
diff --git a/libavcodec/aarch64/vp9itxfm_16bpp_neon.S
b/libavco
This work is sponsored by, and copyright, Google.
This reduces the code size of libavcodec/aarch64/vp9itxfm_16bpp_neon.o from
26288 to 21512 bytes.
This gives a small slowdown of a couple of tens of cycles, but makes
it more feasible to add more optimized versions of these transforms.
Before:
vp
This makes the code a bit more readable.
---
libavcodec/aarch64/vp9itxfm_16bpp_neon.S | 24
1 file changed, 12 insertions(+), 12 deletions(-)
diff --git a/libavcodec/aarch64/vp9itxfm_16bpp_neon.S
b/libavcodec/aarch64/vp9itxfm_16bpp_neon.S
index f80604f..86ea29e 100644
--
This work is sponsored by, and copyright, Google.
This reduces the code size of libavcodec/arm/vp9itxfm_16bpp_neon.o from
17500 to 14516 bytes.
This gives a small slowdown of a couple tens of cycles, up to around
150 cycles for the full case of the largest transform, but makes
it more feasible to
This work is sponsored by, and copyright, Google.
This avoids loading and calculating coefficients that we know will
be zero, and avoids filling the temp buffer with zeros in places
where we know the second pass won't read.
This gives a pretty substantial speedup for the smaller subpartitions.
T
Hi Jorge,
On Mon, 7 Aug 2017, Jorge Ramirez wrote:
On 08/03/2017 01:53 AM, Mark Thompson wrote:
+default:
+return 0;
+}
+
+SET_V4L_EXT_CTRL(value, qmin, avctx->qmin, "minimum video quantizer
scale");
+SET_V4L_EXT_CTRL(value, qmax, avctx->qmax, "maximum video quantizer
---
libavcodec/aarch64/hevcdsp_qpel_neon.S | 695 +
1 file changed, 355 insertions(+), 340 deletions(-)
diff --git a/libavcodec/aarch64/hevcdsp_qpel_neon.S
b/libavcodec/aarch64/hevcdsp_qpel_neon.S
index 06832603d9..ad568e415b 100644
--- a/libavcodec/aarch64/hevcdsp_qpel_n
The hv32 and hv64 functions were identical - both loop and
process 16 pixels at a time.
The hv16 function was near identical, except for the outer loop
(and using sp instead of a separate register).
Given the size of these functions, the extra cost of the outer
loop is negligible, so use the same
As the plain neon qpel_h functions process two rows at a time,
we need to allocate storage for h+8 rows instead of h+7.
By allocating storage for h+8 rows, incrementing the stack
pointer won't end up at the right spot in the end. Store the
intended final stack pointer value in a register x14 which
As the plain neon qpel_h functions process two rows at a time,
we need to allocate storage for h+8 rows instead of h+7.
AWS Graviton 3:
put_hevc_qpel_uni_w_hv4_8_c: 422.2
put_hevc_qpel_uni_w_hv4_8_neon: 140.7
put_hevc_qpel_uni_w_hv4_8_i8mm: 100.7
put_hevc_qpel_uni_w_hv8_8_c: 1208.0
put_hevc_qpel_u
On Mon, 25 Mar 2024, Martin Storsjö wrote:
Since some time, we have pretty complete AArch64 NEON coverage
for the hevc decoder.
However, some of these functions require the I8MM instruction set
extension, and many of them (but not all) lack a plain NEON
version.
This patchset fills in a
On Tue, 26 Mar 2024, Jean-Baptiste Kempf wrote:
On Mon, 25 Mar 2024, at 22:56, J. Dekker wrote:
On Mon, 25 Mar 2024, Martin Storsjö wrote:
Since some time, we have pretty complete AArch64 NEON coverage
for the hevc decoder.
However, some of these functions require the I8MM instruction set
This fixes assembling files starting with bare symbol declarations,
without explicitly switching to .text first.
---
gas-preprocessor.pl | 3 +++
1 file changed, 3 insertions(+)
diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index 2880858..b66181a 100755
--- a/gas-preprocessor.pl
+++ b/ga
This line originates from 6f69f7a8bf6a0d013985578df2ef42ee6b1c7994.
---
libavformat/movenc.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index 46a5b3a62f..ccdd2dbfc9 100644
--- a/libavformat/movenc.c
+++ b/libavformat/movenc.c
@@ -1173,8 +1173,6
We have tests to make sure that certain configurations do print
warnings. However, the normal operation of the muxer within this
test always printed a warning, so those tests to check for
extra warnings didn't essentially guard anything.
The warning that always was printed, "track 1: codec frame si
This is based on a spec at https://aomediacodec.github.io/id3-emsg/,
further based on ISO/IEC 23009-1:2019.
Within libavformat, timed ID3 metadata (already supported by the
mpegts demuxer and muxer) is handled as a separate data AVStream
with codec type AV_CODEC_ID_TIMED_ID3. However, it doesn't
h
On Tue, 2 Apr 2024, Geoff Hill wrote:
Here's v3 to push the AC-3 ARMv8 NEON experiment a step further.
This version implements 5 of the AC-3 encoder DSP functions,
and adds checkasm tests where missing.
I've tested that the checkasm tests pass on aarch64 and x86.
Thanks, I've tested that che
On Tue, 2 Apr 2024, Geoff Hill wrote:
Signed-off-by: Geoff Hill
---
libavcodec/aarch64/ac3dsp_init_aarch64.c | 5 +
libavcodec/aarch64/ac3dsp_neon.S | 24 +
tests/checkasm/ac3dsp.c | 27
3 files changed, 56 insertions(+)
d
On Tue, 2 Apr 2024, Geoff Hill wrote:
Signed-off-by: Geoff Hill
---
libavcodec/aarch64/ac3dsp_init_aarch64.c | 5
libavcodec/aarch64/ac3dsp_neon.S | 35
tests/checkasm/ac3dsp.c | 26 ++
3 files changed, 66 insertions(+)
diff
On Sat, 6 Apr 2024, Geoff Hill wrote:
Thanks Martin for your review and testing.
Here's v4 with the following changes:
* Use fmal in sum_square_butterfly_float loop. Faster.
* Removed redundant loop bound zero checks in extract_exponents,
sum_square_bufferfly_int32 and sum_square_bufferf
Before:                              Cortex A53     A72     A78
ac3_sum_square_bufferfly_float_neon:     1005.7   516.5   224.5
After:
ac3_sum_square_bufferfly_float_neon:      981.7   504.5   223.2
---
libavcodec/aarch64/ac3dsp_neon.S | 16
1 file changed, 4 insertions(+), 12 deletions(-)
On Mon, 8 Apr 2024, J. Dekker wrote:
In some cases, these scripts can be called directly by packagers, and
some systems require the interpreter to be explicit.
It is unclear to me which of the changes are needed and for what reason,
please elaborate much more in the commit message.
Is it po
On Mon, 8 Apr 2024, J. Dekker wrote:
The preferred way to use LTO is --enable-lto but often times packagers
still end up with -flto in cflags for various reasons. Using grep
on binary object files is brittle and relies on specific object
representation, which in the case of LLVM bitcode, debug-i
On Tue, 9 Apr 2024, J. Dekker wrote:
Note that the config.sh file is left without a shebang, this file is
supposed to be sourced into the current environment.
This commit is purely cosmetic.
Signed-off-by: J. Dekker
---
configure | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Thanks,
On Tue, 12 Mar 2024, Martin Storsjö wrote:
---
libavutil/aarch64/cpu.c | 25 +
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
index 7a05391343..196bdaf6b0 100644
--- a/libavutil/aarch64/cpu.c
+++ b
On Thu, 4 Apr 2024, Martin Storsjö wrote:
This line originates from 6f69f7a8bf6a0d013985578df2ef42ee6b1c7994.
---
libavformat/movenc.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/libavformat/movenc.c b/libavformat/movenc.c
index 46a5b3a62f..ccdd2dbfc9 100644
--- a/libavformat/movenc.c
On Thu, 4 Apr 2024, Martin Storsjö wrote:
We have tests to make sure that certain configurations do print
warnings. However, the normal operation of the muxer within this
test always printed a warning, so those tests to check for
extra warnings didn't essentially guard anything.
The wa
On Tue, 9 Apr 2024, James Almer wrote:
On 4/4/2024 7:29 AM, Martin Storsjö wrote:
This is based on a spec at https://aomediacodec.github.io/id3-emsg/,
further based on ISO/IEC 23009-1:2019.
Within libavformat, timed ID3 metadata (already supported by the
mpegts demuxer and muxer) is handled
On Wed, 10 Apr 2024, J. Dekker wrote:
The exclude_guest option only has an effect on x86. Omitting
'exclude_guest' defaults to zero which implies that you can count guest
events should you run one. Some non-x86 kernels just ignore it, while
others (e.g. the Asahi Linux kernels) require the user
On Wed, 17 Apr 2024, Ramiro Polla wrote:
The code is imported from libjpeg-turbo-3.0.1. The neon registers used
have been changed to avoid modifying v8-v15.
---
libavcodec/aarch64/Makefile | 2 +
libavcodec/aarch64/fdct.h | 26 ++
libavcodec/aarch64/fdctdsp_init_aa
Travis is no longer relevant for attempting to run CI jobs in our
setup.
---
.travis.yml | 30 --
1 file changed, 30 deletions(-)
delete mode 100644 .travis.yml
diff --git a/.travis.yml b/.travis.yml
deleted file mode 100644
index 784b7bdf73..00
--- a/.travis.
On Wed, 17 Apr 2024, Ramiro Polla wrote:
This patch set adds fdct to checkasm and neon-optimized fdct for aarch64.
Ramiro Polla (2):
checkasm: add test for fdct
lavc/aarch64/fdct: add neon-optimized fdct for aarch64
libavcodec/aarch64/Makefile | 2 +
libavcodec/aarch64/fdct.h
On Wed, 17 Apr 2024, Marvin Scholz wrote:
This fixes the checks to properly use runtime feature detection and
check the SDK version (*_MAX_ALLOWED) instead of the targeted version
for the relevant APIs.
As these things are pretty hard to think straight about, it could be good
with a more conc
On Mon, 22 Apr 2024, Derek Buitenhuis wrote:
Added in thep previous commit.
Typo in the commit message
// Martin
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel
To unsubscribe, visit link abov
On Mon, 22 Apr 2024, Derek Buitenhuis wrote:
Added in thep previous commit.
Signed-off-by: Derek Buitenhuis
---
libavformat/http.c | 6 ++
1 file changed, 6 insertions(+)
diff --git a/libavformat/http.c b/libavformat/http.c
index ed20359552..bbace2694f 100644
--- a/libavformat/http.c
+++ b
On Mon, 22 Apr 2024, Derek Buitenhuis wrote:
429 and 503 codes can, and often do (e.g. all Google Cloud
Storage URLs can), return a Retry-After header with the error,
indicating how long to wait, in seconds, before retrying again.
If it is not respected by, for example, using our default backoff
On Mon, 22 Apr 2024, Derek Buitenhuis wrote:
Not every use case benefits from setting retries in terms of the backoff.
Signed-off-by: Derek Buitenhuis
---
libavformat/http.c| 12 +---
libavformat/version.h | 2 +-
2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/libav
On Mon, 22 Apr 2024, Derek Buitenhuis wrote:
This patch set adds support for properly handling HTTP 429 codes,
and their rate limiting, which is widely used and is standardized.
Changes since first set:
* Added AVERROR_HTTP_TOO_MANY_REQUESTS top error_entries in error.c, per
Andreas' review.
On Thu, 25 Apr 2024, Derek Buitenhuis wrote:
Changes since last set:
* Updated commit message with RFC references.
* Properly support Retry-After as both a date and integer number of seconds.
I have tested this against both an HTTP-Date and seconds, and confirmed
it to work.
Derek Buitenhuis
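
The integer form of Retry-After mentioned above can be illustrated with a minimal sketch. This is not the code from the patch: parse_retry_after_seconds is a hypothetical helper, and it assumes that only the delta-seconds form is handled, so an HTTP-date value is reported as unparsed (-1) and left to other code.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libavformat: parse the delta-seconds
 * form of a Retry-After header value. Returns the delay in seconds, or -1
 * if the value is not a plain non-negative integer (for example an
 * HTTP-date, which this sketch does not handle). */
static long parse_retry_after_seconds(const char *value)
{
    char *end;
    long secs;

    /* Skip optional leading whitespace. */
    while (*value == ' ' || *value == '\t')
        value++;
    /* Reject empty values, signs and anything that is not a digit. */
    if (!isdigit((unsigned char)*value))
        return -1;

    errno = 0;
    secs = strtol(value, &end, 10);
    if (errno == ERANGE)
        return -1;

    /* Allow trailing whitespace, but nothing else. */
    while (*end == ' ' || *end == '\t')
        end++;
    return *end == '\0' ? secs : -1;
}
```

A caller would presumably try this first and fall back to date parsing when it returns -1.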
This fixes crashes in the mspel tests on x86.
---
tests/checkasm/vc1dsp.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/tests/checkasm/vc1dsp.c b/tests/checkasm/vc1dsp.c
index 407d9e5fe8..f18f0f8251 100644
--- a/tests/checkasm/vc1dsp.c
+++ b/tests/checkasm/vc1dsp.c
@@
On Tue, 30 Apr 2024, Andreas Rheinhardt wrote:
Regression since fd172185580c1ccdcfb90bbfdb59fa806fad3117;
triggered by vp4/KTkvw8dg1J8.avi in the FATE suite, but not
when running fate as this code is not used when the bitexact
flag is set.
Bisecting done by ami_stuff, patch from user Mika Fisch
On Fri, 3 May 2024, Rémi Denis-Courmont wrote:
This adds the Linux-specific function call to detect CPU features. Unlike
the more portable auxiliary vector, this supports extensions other than
single lettered ones. At this point, FFmpeg already needs this to detect
Zba and Zbb at run-time, and p
On Mon, 6 May 2024, James Almer wrote:
It ignores and overwrites the previous values.
Fixes running the test under ubsan.
Signed-off-by: James Almer
---
tests/checkasm/blockdsp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
The change is probably correct, but what issue is ubsan co
On Tue, 7 May 2024, Andreas Rheinhardt wrote:
Martin Storsjö:
On Mon, 6 May 2024, James Almer wrote:
It ignores and overwrites the previous values.
Fixes running the test under ubsan.
Signed-off-by: James Almer
---
tests/checkasm/blockdsp.c | 3 ++-
1 file changed, 2 insertions(+), 1
On Tue, 7 May 2024, Rémi Denis-Courmont wrote:
---
libavutil/riscv/cpu.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavutil/riscv/cpu.c b/libavutil/riscv/cpu.c
index c3683b06d0..69d1afe853 100644
--- a/libavutil/riscv/cpu.c
+++ b/libavutil/riscv/cpu.c
@@ -29,14 +29,
On Tue, 7 May 2024, Rémi Denis-Courmont wrote:
---
Makefile | 2 +-
configure | 3 +++
doc/APIchanges| 3 +++
ffbuild/arch.mak | 1 +
libavutil/cpu.h | 1 +
libavutil/tests/cpu.c | 1 +
tests/checkasm/checkasm.c | 1 +
7 files changed,
On Sat, 11 May 2024, Lynne via ffmpeg-devel wrote:
Unintentionally removed as part of 03cf10164578aed33f4d0cb5b69d63669c01a538.
Untested, but it's assumed that unlike most of the old ARM code,
this one was still working.
---
libavcodec/aac/aacdec_float.c | 5 +
1 file changed, 5 insertions(+)
On Sat, 11 May 2024, Ramiro Polla wrote:
On Sun, Jan 21, 2024 at 10:57 PM Ramiro Polla wrote:
---
libavcodec/aarch64/idctdsp_init_aarch64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/aarch64/idctdsp_init_aarch64.c
b/libavcodec/aarch64/idctdsp_init_aarch64
Clang supports the macro expansion counter (used for making unique
labels within macro expansions), but not when targeting darwin.
Convert uses of the counter into normal local labels, as used
elsewhere.
Since Xcode 9.3, the bundled clang supports altmacro and doesn't
require using gas-preprocess
When targeting darwin, clang requires commas between arguments,
while the no-comma form is allowed for other targets.
Since Xcode 9.3, the bundled clang supports altmacro and doesn't
require using gas-preprocessor any longer.
---
libavcodec/arm/hevcdsp_deblock_neon.S | 8
1 file changed,
Vanilla clang supports altmacro since clang 5.0, and thus doesn't
require gas-preprocessor for building the arm assembly any longer.
However, the built-in assembler doesn't support .dn directives.
This readds checks that were removed in d7320ca3ed10f0d, when
the last usage of .dn directives withi
On Sat, 31 Mar 2018, Hendrik Leppkes wrote:
On Fri, Mar 30, 2018 at 9:14 PM, Martin Storsjö wrote:
Clang supports the macro expansion counter (used for making unique
labels within macro expansions), but not when targeting darwin.
Convert uses of the counter into normal local labels, as used
On Mon, 12 Feb 2024, rcombs wrote:
ffmpeg | branch: master | rcombs | Sun Jan 28 14:27:17 2024
-0800| [7bf1b9b35769b37684dd2f18a54f01d852a540c8] | committer: rcombs
lavf/assenc: normalize line endings to \n
Previously, we produced output with either \r\n or mixed line endings.
This was undes
On Mon, 12 Feb 2024, Hendrik Leppkes wrote:
On Mon, Feb 12, 2024 at 11:22 AM Martin Storsjö wrote:
>
> diff --git a/.gitattributes b/.gitattributes
> index 5a19b963b6..a900528e47 100644
> --- a/.gitattributes
> +++ b/.gitattributes
> @@ -1,2 +1 @@
> *.pnm -diff -text
>
On Tue, 13 Feb 2024, Ridley Combs via ffmpeg-devel wrote:
On Feb 13, 2024, at 01:28, Anton Khirnov wrote:
Quoting Martin Storsjö (2024-02-12 12:31:29)
On Mon, 12 Feb 2024, Hendrik Leppkes wrote:
On Mon, Feb 12, 2024 at 11:22 AM Martin Storsjö wrote:
diff --git a/.gitattributes b
On Tue, 13 Feb 2024, Ridley Combs wrote:
It looks like checkout has different behavior from reset, and fate uses a
hard reset.
To test, I committed the change adding tests/ref/** -text,
unix2dos'd tests/ref/fate/sub-scc, then ran git -c core.autocrlf=true reset
--quiet --hard; this dos2unix'd th
On Tue, 13 Feb 2024, Ridley Combs wrote:
It looks like checkout has different behavior from reset, and fate uses a
hard reset.
To test, I committed the change adding tests/ref/** -text,
unix2dos'd tests/ref/fate/sub-scc, then ran git -c core.autocrlf=true reset
--quiet --hard; this dos2unix'd th
Hi,
On Sun, 4 Feb 2024, Ramiro Polla wrote:
The code is imported from libjpeg-turbo-3.0.1. The neon registers used
have been changed to avoid modifying v8-v15.
---
I don't remember if we have any extra routines we need to do if importing
foreign code with a differing license. The license her
Contrary to the existing "fate-checkasm", this always prints the
tool output, and runs all tests at once instead of splitting it up
per target group. This is more useful when the user expects to
look directly at the tool output, instead of being part of a full
fate run.
(On failure with the regula
On Fri, 9 Feb 2024, Martin Storsjö wrote:
By default the option "flv_metadata" (internally using the field
name "trust_metadata") is set to 0, meaning that we don't allocate
streams based on information in the metadata, only based on
actual streams we encounter
On Mon, 19 Feb 2024, Andreas Rheinhardt wrote:
Andreas Rheinhardt:
Obsolete since 7ec2354c38978b918dc079b611393becb6c80bf7.
Signed-off-by: Andreas Rheinhardt
---
libavutil/intreadwrite.h | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/libavutil/intreadwrite.h b/libavut
On Wed, 14 Feb 2024, Martin Storsjö wrote:
Contrary to the existing "fate-checkasm", this always prints the
tool output, and runs all tests at once instead of splitting it up
per target group. This is more useful when the user expects to
look directly at the tool output, instead of
On Wed, 21 Feb 2024, J. Dekker wrote:
Benched using single-threaded full decode on an Ampere Altra.
Bpp   Before   After    Speedup
8     73,3s    65,2s    1.124x
10    114,2s   104,0s   1.098x
12    125,8s   115,7s   1.087x
Signed-off-by: J. Dekker
---
libavcodec/aarch64/hevcdsp_deblock_neon.S | 421 +++
This fixes building FFmpeg's libavcodec/aarch64/h264idct_neon.S
for a Linux target. (It hasn't been necessary to use gas-preprocessor
for such a target for a very long time, but it can still be useful to
be able to test gas-preprocessor there.)
---
gas-preprocessor.pl | 5 -
1 file changed, 4 insert
On Sat, 24 Feb 2024, J. Dekker wrote:
Nuo Mi writes:
On Wed, Feb 21, 2024 at 7:10 PM J. Dekker wrote:
Over/underflow in some cases.
Signed-off-by: J. Dekker
---
libavcodec/x86/hevcdsp_init.c | 9 +
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/libavcodec/x86/hev
On Tue, 27 Feb 2024, J. Dekker wrote:
Benched using single-threaded full decode on an Ampere Altra.
Bpp   Before   After    Speedup
8     73,3s    65,2s    1.124x
10    114,2s   104,0s   1.098x
12    125,8s   115,7s   1.087x
Signed-off-by: J. Dekker
---
Slightly improved 12bit version.
libavcodec/aarch64/hevcd
The CPU feature detection was added in
493fcde50a84cb23854335bcb0e55c6f383d55db, using HWCAP_CPUID.
The argument for using that, was that HWCAP_CPUID was added much
earlier in the kernel (in Linux v4.11), while the HWCAP flags for
individual features were added much later. And if compiling with
ol
On Wed, 28 Feb 2024, J. Dekker wrote:
Martin Storsjö writes:
On Tue, 27 Feb 2024, J. Dekker wrote:
Benched using single-threaded full decode on an Ampere Altra.
Bpp   Before   After    Speedup
8     73,3s    65,2s    1.124x
10    114,2s   104,0s   1.098x
12    125,8s   115,7s   1.087x
Signed-off-by: J
On Wed, 28 Feb 2024, J. Dekker wrote:
Martin Storsjö writes:
On Wed, 28 Feb 2024, J. Dekker wrote:
Martin Storsjö writes:
On Tue, 27 Feb 2024, J. Dekker wrote:
Benched using single-threaded full decode on an Ampere Altra.
Bpp Before After Speedup
8 73,3s 65,2s 1.124x
10
On Wed, 28 Feb 2024, Martin Storsjö wrote:
The CPU feature detection was added in
493fcde50a84cb23854335bcb0e55c6f383d55db, using HWCAP_CPUID.
The argument for using that, was that HWCAP_CPUID was added much
earlier in the kernel (in Linux v4.11), while the HWCAP flags for
individual features
On Wed, 6 Mar 2024, Ramiro Polla wrote:
ping
Did you miss my response here?
https://ffmpeg.org/pipermail/ffmpeg-devel/2024-February/321448.html
// Martin
This was missed in b800327f4c7233d09baca958121722a04c2035ff.
---
libavdevice/avfoundation.m | 11 ++-
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/libavdevice/avfoundation.m b/libavdevice/avfoundation.m
index a0ef87edff..d9b17ccdae 100644
--- a/libavdevice/avfoundation.m
+
In some builds, the following object files could be left behind
after make clean:
./libavfilter/metal/utils.o
./libavfilter/metal/vf_yadif_videotoolbox.metallib.o
./libavcodec/x86/h26x/h2656dsp.o
./libavcodec/neon/mpegvideo.o
./ffbuild/bin2c_host.o
---
ffbuild/common.mak | 2 +-
libavcod
This currently builds files in the libavcodec/x86/{vvc,h26x}
subdirectories, which is somewhat unexpected when building for
another architecture than x86.
The regular arch subdirectories are handled with
-include $(SRC_PATH)/$(1)/$(ARCH)/Makefile
in the toplevel Makefile. Switch this to a si
On Mon, 11 Mar 2024, Anton Khirnov wrote:
Quoting Tobias Rapp (2024-03-11 11:12:38)
On 10/03/2024 23:49, Anton Khirnov wrote:
Quoting James Almer (2024-03-10 23:29:27)
On 3/10/2024 7:24 PM, Anton Khirnov wrote:
Quoting Michael Niedermayer (2024-03-10 20:21:47)
On Sun, Mar 10, 2024 at 07:13
On Mon, 11 Mar 2024, Anton Khirnov wrote:
Well it IS obsolete. AFAIK it was never a particularly popular codec,
and was only really used by the anime and ripping scenes in early 2000s,
and even they dropped it very quickly once x264 appeared.
Within the scene of mobile HW, they commonly had HW
---
libavutil/aarch64/cpu.c | 25 +
1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/libavutil/aarch64/cpu.c b/libavutil/aarch64/cpu.c
index 7a05391343..196bdaf6b0 100644
--- a/libavutil/aarch64/cpu.c
+++ b/libavutil/aarch64/cpu.c
@@ -45,22 +45,23 @@ static i
The first 32 elements of each row were correct, while the
last 16 were scrambled.
This hasn't been noticed, because the checkasm test erroneously
only checked half of the output (for 8 bit functions), and
apparently none of the samples as part of "fate-hevc" seem to
trigger this specific function.
Previously it only checked half the output in 8 bit per pixel mode,
as the output actually is 16 bit elements here.
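
The kind of mistake described here can be sketched as follows. This is an illustration only, not the actual checkasm code: outputs_match is a hypothetical helper, and it assumes the comparison is done with memcmp over a buffer of 16-bit elements.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the first `count` 16-bit elements of
 * the two buffers match, 0 otherwise. */
static int outputs_match(const int16_t *ref, const int16_t *test, size_t count)
{
    /* memcmp takes a size in bytes, so the element count must be scaled
     * by the element size. Passing `count` alone would compare only the
     * first half of the 16-bit elements. */
    return memcmp(ref, test, count * sizeof(*ref)) == 0;
}
```

With a byte count equal to the element count, a mismatch in the second half of the buffer goes unnoticed, which matches the symptom described above.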
---
tests/checkasm/hevc_pel.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tests/checkasm/hevc_pel.c b/tests/checkasm/hevc_pel.c
index f9a7a7717c..065da876
---
tests/checkasm/hevc_pel.c | 134 --
1 file changed, 98 insertions(+), 36 deletions(-)
diff --git a/tests/checkasm/hevc_pel.c b/tests/checkasm/hevc_pel.c
index 065da87622..73a4619978 100644
--- a/tests/checkasm/hevc_pel.c
+++ b/tests/checkasm/hevc_pel.c
@@ -
This simplifies the code for checking the output, and can print
the failing output (including a map of matching/mismatching
elements) if checkasm is run with the -v/--verbose option.
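
As a rough idea of what such a check can look like, here is a minimal sketch. It is not the checkasm helper used by the patch: compare_with_map is a hypothetical function, and the '.'/'x' map format and the stride measured in elements are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not the checkasm API. Compares two w x h blocks of
 * 16-bit elements; `stride` is in elements. Returns the number of
 * mismatching elements and, if `verbose` is set and there are mismatches,
 * prints a map with '.' for a match and 'x' for a mismatch. */
static int compare_with_map(const int16_t *ref, const int16_t *test,
                            int w, int h, ptrdiff_t stride, int verbose)
{
    int mismatches = 0;

    /* First pass: count the mismatching elements. */
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++)
            mismatches += ref[y * stride + x] != test[y * stride + x];

    /* Second pass: print the map only when it is useful. */
    if (mismatches && verbose) {
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++)
                putchar(ref[y * stride + x] == test[y * stride + x] ? '.' : 'x');
            putchar('\n');
        }
    }
    return mismatches;
}
```

A test would treat any nonzero return value as a failure, and the map makes it easy to see whether, say, only the tail of each row is wrong.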
---
tests/checkasm/hevc_pel.c | 71 ++-
1 file changed, 41 insertions(+), 30 de
On Thu, 14 Mar 2024, J. Dekker wrote:
Martin Storsjö writes:
The first 32 elements of each row were correct, while the
last 16 were scrambled.
This hasn't been noticed, because the checkasm test erroneously
only checked half of the output (for 8 bit functions), and
apparently none o
On Sun, 10 Mar 2024, Andreas Rheinhardt wrote:
All versions of MSVC that support C11 (namely >= v19.27)
also support the restrict keyword, therefore av_restrict
is no longer necessary since 75697836b1db3e0f0a3b7061be6be28d00c675a0.
Signed-off-by: Andreas Rheinhardt
---
Untested except via godb
On Sun, 17 Mar 2024, Rémi Denis-Courmont wrote:
Obviously not. Imported libraries are only there to resolve missing
symbols.
Sure - but if resolving the missing symbols brings in those conflicting
object files, there's not much to do about it. If the static library
contains dec_init in a sta
On Thu, 21 Mar 2024, Andreas Rheinhardt wrote:
Andreas Rheinhardt:
C11 provides static assertions via _Static_assert and
provides static_assert as a convenience define for this
in assert.h. MSVC 19.27 declares support for C11, but does
not support _Static_assert, but somehow supports
static_ass
On Fri, 22 Mar 2024, Andreas Rheinhardt wrote:
Martin Storsjö:
Both patches seem to work fine with MSVC 19.27 - I vaguely prefer the v2
version, which is simpler.
But to me, we could also just revert the change to
libavcodec/ccaption_dec.c, and declare that we require MSVC 19.28
instead
fixes a subtle bug in the existing implementation;
two functions relied on the contents on the stack, below the
stack pointer, being untouched within a function. If a signal
gets delivered, those parts of the stack could be clobbered.
// Martin
Martin Storsjö (21):
aarch64: hevc: Reorder a misp
Group the epel and qpel functions together.
---
libavcodec/aarch64/hevcdsp_init_aarch64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/aarch64/hevcdsp_init_aarch64.c
b/libavcodec/aarch64/hevcdsp_init_aarch64.c
index 04692aa98e..d2f2a3681f 100644
--- a/libavcodec/
Many of the routines within hevcdsp_epel_neon and hevcdsp_qpel_neon
store temporary buffers on the stack. When consuming it,
many of these functions use the stack pointer as incremental pointer
for reading the data (instead of storing it in another register),
which is rather unusual.
Technically,