Re: [libav-devel] [GASPP PATCH] Comment out "it" instructions for armasm
On Thu, 3 Oct 2019, Martin Storsjö wrote:

> On Thu, 3 Oct 2019, Janne Grunau wrote:
>
> > On 2019-10-02 11:53:28 +0300, Martin Storsjö wrote:
> > > Armasm implicitly adds it instructions as needed. In VS 2019 16.3,
> > > there's a bug [1] in armasm making it fail to parse these it
> > > instructions (but it can still add them implicitly just fine).
> > >
> > > I'm not sure if it really is worth working around this issue, or
> > > just wait for it to hopefully be fixed by the next release again.
> > >
> > > [1] https://developercommunity.visualstudio.com/content/problem/757709/armasm-fails-to-handle-it-instructions.html
> > > ---
> > >  gas-preprocessor.pl | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > >
> > > diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
> > > index b6c2786..9d8fb5d 100755
> > > --- a/gas-preprocessor.pl
> > > +++ b/gas-preprocessor.pl
> > > @@ -1168,6 +1168,10 @@ sub handle_serialized_line {
> > >          $line =~ s/fmxr/vmsr/;
> > >          $line =~ s/fmrx/vmrs/;
> > >          $line =~ s/fadds/vadd.f32/;
> > > +        # Armasm in VS 2019 16.3 errors out on "it" instructions. But
> > > +        # armasm implicitly adds the necessary it instructions anyway, so we
> > > +        # can just filter it out.
> > > +        $line =~ s/^\s*it[te]*\s+/$comm$&/;
> > >      }
> > >      if ($as_type eq "armasm" and $arch eq "aarch64") {
> > >          # Convert "b.eq" into "beq"
> >
> > I guess ok-ish since armasm can handle implicit it instructions.
> >
> > Do you have expectation when a fixed version might be released? If it's
> > more than a couple of weeks I'd say the workaround is worth it.
>
> There's roughly one stable release per 3 months, and the first preview for
> the next one (16.4) was already posted. In some cases, bugfixes do get
> into the next release (if deemed urgent enough I guess), but otherwise
> into current+2. So estimate of fix in a stable release is anywhere between
> 2 and 5 months maybe.

https://developercommunity.visualstudio.com/content/problem/757709/armasm-fails-to-handle-it-instructions.html

They confirmed the bug and told that it should be fixed in 16.5, which is
due in about 5 months.
// Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
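For readers who do not read Perl regexes fluently, here is a minimal C sketch of what the substitution in the patch does: a line that starts with optional whitespace, then "it" followed by any number of 't'/'e' characters and at least one whitespace character, gets the comment prefix put in front of it. The function names are hypothetical, and ";" is only assumed here as the comment prefix; in the real script the prefix comes from $comm.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if `line` has the same shape as the Perl pattern
 * /^\s*it[te]*\s+/: optional leading whitespace, "it", zero or more
 * 't'/'e' characters, then at least one whitespace character. */
static int is_it_instruction(const char *line)
{
    const char *p = line;

    while (*p && isspace((unsigned char)*p))
        p++;
    if (p[0] != 'i' || p[1] != 't')
        return 0;
    p += 2;
    while (*p == 't' || *p == 'e')
        p++;
    return *p && isspace((unsigned char)*p);
}

/* Copies `line` into `out`, prefixed with the comment string `comm` when
 * the line is an IT instruction. Returns 1 if the line was commented out.
 * (Hypothetical helper; the comment string is supplied by the caller.) */
static int comment_out_it(const char *line, const char *comm,
                          char *out, size_t out_size)
{
    int hit = is_it_instruction(line);

    snprintf(out, out_size, "%s%s", hit ? comm : "", line);
    return hit;
}
```

A line such as "itt eq" ends up commented out, while a conditional instruction such as "moveq r0, #1" or a word that merely starts with "it" is passed through unchanged.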
Re: [libav-devel] [GASPP PATCH] Comment out "it" instructions for armasm
On Thu, 3 Oct 2019, Martin Storsjö wrote:

> There's roughly one stable release per 3 months, and the first preview for
> the next one (16.4) was already posted. In some cases, bugfixes do get
> into the next release (if deemed urgent enough I guess), but otherwise
> into current+2. So estimate of fix in a stable release is anywhere between
> 2 and 5 months maybe.
>
> If I would have caught this in August when the first preview actually
> containing the new broken armasm was out, it might have been possible to
> have it fixed sooner...

Pushed both of these now.

// Martin
Re: [libav-devel] [GASPP PATCH] Comment out "it" instructions for armasm
On Thu, 3 Oct 2019, Janne Grunau wrote:

> I guess ok-ish since armasm can handle implicit it instructions.
>
> Do you have expectation when a fixed version might be released? If it's
> more than a couple of weeks I'd say the workaround is worth it.

There's roughly one stable release per 3 months, and the first preview for
the next one (16.4) was already posted. In some cases, bugfixes do get
into the next release (if deemed urgent enough I guess), but otherwise
into current+2. So estimate of fix in a stable release is anywhere between
2 and 5 months maybe.

If I would have caught this in August when the first preview actually
containing the new broken armasm was out, it might have been possible to
have it fixed sooner...

// Martin
[libav-devel] [PATCH] aarch64: Add assembly support for -fsanitize=hwaddress tagged globals.
From: Peter Collingbourne

As of LLVM r368102, Clang will set a pointer tag in bits 56-63 of the
address of a global when compiling with -fsanitize=hwaddress. This requires
an adjustment to assembly code that takes the address of such globals: the
code cannot use the regular R_AARCH64_ADR_PREL_PG_HI21 relocation to refer
to the global, since the tag would take the address out of range. Instead,
the code must use the non-checking (_NC) variant of the relocation (the
link-time check is substituted by a runtime check).

This change makes the necessary adjustment in the movrel macro, where it is
needed when compiling with -fsanitize=hwaddress.

Signed-off-by: Peter Collingbourne
Signed-off-by: Martin Storsjö
---
 libavutil/aarch64/asm.S | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/libavutil/aarch64/asm.S b/libavutil/aarch64/asm.S
index bf5c1b7ee1..81d723b9b3 100644
--- a/libavutil/aarch64/asm.S
+++ b/libavutil/aarch64/asm.S
@@ -32,6 +32,10 @@
 #   define FUNC #
 #endif

+#ifndef __has_feature
+#   define __has_feature(x) 0
+#endif
+
 .macro  function name, export=0, align=2
     .macro endfunc
 ELF .size   \name, . - \name
@@ -94,7 +98,11 @@ ELF .size   \name, . - \name
         add             \rd, \rd, :lo12:\val+(\offset)
         .endif
 #elif CONFIG_PIC
+#   if __has_feature(hwaddress_sanitizer)
+        adrp            \rd, :pg_hi21_nc:\val+(\offset)
+#   else
         adrp            \rd, \val+(\offset)
+#   endif
         add             \rd, \rd, :lo12:\val+(\offset)
 #else
         ldr             \rd, =\val+\offset
--
2.20.1 (Apple Git-117)
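To illustrate why a tag in bits 56-63 takes the address "out of range", the following C sketch reproduces the arithmetic of the link-time check. It assumes the usual adrp reach of -2^32 <= delta < 2^32 bytes between the 4 KiB page of the instruction and the page of the target; the helper names and the sample addresses are made up for illustration and are not part of the patch.

```c
#include <stdint.h>

/* Mask that rounds an address down to its 4 KiB page. */
#define PAGE_MASK (~UINT64_C(0xfff))

/* Pointer tag as described in the patch: bits 56-63 of the address. */
static uint8_t pointer_tag(uint64_t addr)
{
    return (uint8_t)(addr >> 56);
}

/* Page-to-page distance that an adrp-style relocation has to encode. */
static int64_t page_delta(uint64_t target, uint64_t pc)
{
    return (int64_t)((target & PAGE_MASK) - (pc & PAGE_MASK));
}

/* Range check performed by the checking relocation variant:
 * the page delta must satisfy -2^32 <= delta < 2^32. */
static int fits_adrp(int64_t delta)
{
    return delta >= -(INT64_C(1) << 32) && delta < (INT64_C(1) << 32);
}
```

With an untagged global a few pages away from the instruction, fits_adrp() succeeds; once any non-zero tag is placed in the top byte, the same target appears to be far more than 4 GiB away and the check fails, which is the situation the _NC relocation variant avoids by not checking.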
Re: [libav-devel] [PATCH] Add a fd protocol
On Thu, 25 Apr 2019, Luca Barbato wrote:

> ---
> Sometimes you receive a seekable fd from the outside.
>
>  libavformat/file.c      | 32 ++++++++++++++++++++++++++++++++
>  libavformat/protocols.c |  1 +
>  2 files changed, 33 insertions(+)
>
> diff --git a/libavformat/file.c b/libavformat/file.c
> index 27ce4de6eb..6a74ebbf48 100644
> --- a/libavformat/file.c
> +++ b/libavformat/file.c
> @@ -204,3 +204,35 @@ const URLProtocol ff_pipe_protocol = {
>  };
>
>  #endif /* CONFIG_PIPE_PROTOCOL */
> +
> +#if CONFIG_FD_PROTOCOL
> +
> +static int fd_open(URLContext *h, const char *filename, int flags)
> +{
> +    FileContext *c = h->priv_data;
> +    int fd;
> +    char *final;
> +    av_strstart(filename, "fd:", &filename);
> +
> +    fd = strtol(filename, &final, 10);
> +    if ((filename == final) || *final ) {
> +        return AVERROR(EINVAL);
> +    }
> +#if HAVE_SETMODE
> +    setmode(fd, O_BINARY);
> +#endif
> +    c->fd = fd;
> +    return 0;
> +}
> +
> +const URLProtocol ff_pipe_protocol = {

Did you test compilation of this? It doesn't look like it would work given
this ^

Isn't this essentially exactly the same as the pipe protocol, except for
not setting the is_streamed flag? Even though the name pipe doesn't feel
quite right for that case, wouldn't it be possible to just add an option
to the pipe protocol for controlling this?

// Martin
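The descriptor parsing in the quoted fd_open() can be tried in isolation. The sketch below is a self-contained approximation, not the patch's code: it replaces av_strstart() with strncmp(), returns a plain negative errno value instead of AVERROR(), and adds a range check that the patch does not have. The function name parse_fd_url is hypothetical.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Parses "fd:<number>" (the "fd:" prefix is optional, as in the patch) and
 * stores the descriptor in *fd_out. Returns 0 on success and -EINVAL if
 * no digits were found, if characters follow the number, or if the value
 * does not fit a non-negative int (the range check is an addition that is
 * not present in the quoted patch). */
static int parse_fd_url(const char *url, int *fd_out)
{
    char *end;
    long value;

    if (strncmp(url, "fd:", 3) == 0)
        url += 3;

    errno = 0;
    value = strtol(url, &end, 10);
    if (end == url || *end != '\0')
        return -EINVAL;
    if (errno == ERANGE || value < 0 || value > INT_MAX)
        return -EINVAL;

    *fd_out = (int)value;
    return 0;
}
```

The two conditions in the first check correspond to the "(filename == final) || *final" test in the patch: the first rejects an empty number, the second rejects trailing garbage such as "fd:3x".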
[libav-devel] [PATCH] arm: vp9lpf: Fix a typo in a comment about the register layout
---
 libavcodec/aarch64/vp9lpf_neon.S | 2 +-
 libavcodec/arm/vp9lpf_neon.S     | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/aarch64/vp9lpf_neon.S b/libavcodec/aarch64/vp9lpf_neon.S
index e9c497096b..f68b54a2ee 100644
--- a/libavcodec/aarch64/vp9lpf_neon.S
+++ b/libavcodec/aarch64/vp9lpf_neon.S
@@ -415,7 +415,7 @@
 1:
         // flat8out
-        // This writes all outputs into v2-v17 (skipping v6 and v16).
+        // This writes all outputs into v2-v17 (skipping v7 and v16).
         // If this part is skipped, the output is read from v21-v26 (which is the input
         // to this section).
         ushll_sz        v0.8h, v1.8h, v16, #3, \sz // 8 * v16
diff --git a/libavcodec/arm/vp9lpf_neon.S b/libavcodec/arm/vp9lpf_neon.S
index ae782b2ed0..e30f0cd5b4 100644
--- a/libavcodec/arm/vp9lpf_neon.S
+++ b/libavcodec/arm/vp9lpf_neon.S
@@ -362,7 +362,7 @@
         beq             8f
         @ flat8out
-        @ This writes all outputs into d2-d17 (skipping d6 and d16).
+        @ This writes all outputs into d2-d17 (skipping d7 and d16).
         @ If this part is skipped, the output is read from d21-d26 (which is the input
         @ to this section).
         vshll.u8        q0, d16, #3 @ 8 * d16
--
2.20.1 (Apple Git-117)
Re: [libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well
On Tue, 16 Apr 2019, Martin Storsjö wrote: On Tue, 16 Apr 2019, Diego Biurrun wrote: On Sun, Apr 14, 2019 at 09:33:40PM +0300, Martin Storsjö wrote: On Sun, 14 Apr 2019, Diego Biurrun wrote: > On Sat, Apr 13, 2019 at 12:58:40AM +0300, Martin Storsjö wrote: > > On Fri, 12 Apr 2019, Luca Barbato wrote: > > > On 11/04/2019 15:35, Martin Storsjö wrote: > > > > On Wed, 10 Apr 2019, Luca Barbato wrote: > > > > > On 10/04/2019 10:48, Martin Storsjö wrote: > > > > > > Mingw headers have got header inline implementations of localtime_r > > > > > > and gmtime_r, but only visible if certain posix thread safe functions > > > > > > have been requested. > > > > > > > > > > this is a preparatory step for improving the detection of those > > > > > > functions. > > > > > > --- > > > > > > An alternative fix is also provided in a different patch series, > > > > > > by adjusting libavutil/time_internal.h. > > > > > > > > Seems fine to me. > > > > > > Which ones do you mean - this series of 2 patches, the other one, or both? > > > > > > This series seems fine to me. > > > > Ok. FWIW, the change in mingw-w64 that broke it was reverted (there was a > > similar issue within gcc as well), but I guess this change probably is good > > to make anyway. > > I generally don't think that adding workarounds for foreign bugs is a > sustainable strategy, Well, the idea of prefixing local system function fallbacks/replacements isn't so much of a "workaround" as a sensible idea in general IMO. This is a pattern that already is used e.g. for ff_getaddrinfo, ff_poll etc. That is, regardless of what the reason for using a fallback is (the real function does not exist, the real function is declared in headers but missing in libs, the real function exists but we want to avoid it because it's buggy, etc), the pattern of #include static inline ff_systemfunc() { ... } #define systemfunc ff_systemfunc should always be safe. So I think that should be a generally beneficial change in any case as well. 
IIRC we only do that within libavformat and use a different pattern within libavutil. Then again, my code knowledge might be getting a bit rusty.

True, e.g. libavutil/libm.h does define some static inline functions unprefixed as well.

Nevertheless, using prefixes for fallback functions is not a workaround/hack in my book, but a sane and healthy development practice.

> but I clearly prefer the configure change.

Well, the check_func_headers change obviously is for the better, yes. Adding the _POSIX_C_SOURCE define when building for mingw most probably also is sensible, but the fact that we add it manually to most OSes, while we don't add it automatically for all, makes it a little less clear cut.

Switching from trying to set some flags globally for all platforms, inevitably hitting a snag on some fringe system, then adding an exception for that system, to setting flags by platform and strictly only when necessary on that platform, is - oddly enough - one of the single biggest improvements to the whole configure machinery.

Sure, I generally agree with that. I was generally a bit wary of forcing the posix defines on other systems, but I generally think it should be good for this case, as it reduces inconsistencies between available/visible functions.

So I'm very wary of changes in that area due to having been burned so often in the past. If the change was motivated by a bug (since fixed) in mingw, then we should not add workarounds for it.

Well it's not quite as simple. The immediate issue is gone again, but the general underlying issue remains. The TL;DR version is:

- mingw-w64 contains localtime_r/gmtime_r, but only visible if posix thread safe functions have been requested by some means. We currently don't detect these in configure. In practice, the posix thread safe functions define could be enabled transitively by some other included header (which has also been somewhat mitigated within mingw-w64).
To safeguard against this inconsistency, defining it in configure would be helpful IMO.

- Even if localtime_r was visible from mingw-w64 headers, it used to not conflict with ours, because the mingw-w64 was defined as extern inline, while ours was static inline. The mingw-w64 headers were changed to define this as static inline, and later reverted again.

Anyway, I've presented my arguments. I trust you to make a good decision. Push at your discretion.

Well in that case, I'd push all four patches.

Pushed all four - thanks for the discussion!

// Martin
Re: [libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well
On Tue, 16 Apr 2019, Diego Biurrun wrote: On Sun, Apr 14, 2019 at 09:33:40PM +0300, Martin Storsjö wrote: On Sun, 14 Apr 2019, Diego Biurrun wrote: > On Sat, Apr 13, 2019 at 12:58:40AM +0300, Martin Storsjö wrote: > > On Fri, 12 Apr 2019, Luca Barbato wrote: > > > On 11/04/2019 15:35, Martin Storsjö wrote: > > > > On Wed, 10 Apr 2019, Luca Barbato wrote: > > > > > On 10/04/2019 10:48, Martin Storsjö wrote: > > > > > > Mingw headers have got header inline implementations of localtime_r > > > > > > and gmtime_r, but only visible if certain posix thread safe functions > > > > > > have been requested. > > > > > > > > > > this is a preparatory step for improving the detection of those > > > > > > functions. > > > > > > --- > > > > > > An alternative fix is also provided in a different patch series, > > > > > > by adjusting libavutil/time_internal.h. > > > > > > > > Seems fine to me. > > > > > > Which ones do you mean - this series of 2 patches, the other one, or both? > > > > > > This series seems fine to me. > > > > Ok. FWIW, the change in mingw-w64 that broke it was reverted (there was a > > similar issue within gcc as well), but I guess this change probably is good > > to make anyway. > > I generally don't think that adding workarounds for foreign bugs is a > sustainable strategy, Well, the idea of prefixing local system function fallbacks/replacements isn't so much of a "workaround" as a sensible idea in general IMO. This is a pattern that already is used e.g. for ff_getaddrinfo, ff_poll etc. That is, regardless of what the reason for using a fallback is (the real function does not exist, the real function is declared in headers but missing in libs, the real function exists but we want to avoid it because it's buggy, etc), the pattern of #include static inline ff_systemfunc() { ... } #define systemfunc ff_systemfunc should always be safe. So I think that should be a generally beneficial change in any case as well. 
IIRC we only do that within libavformat and use a different pattern within libavutil. Then again, my code knowledge might be getting a bit rusty.

True, e.g. libavutil/libm.h does define some static inline functions unprefixed as well.

Nevertheless, using prefixes for fallback functions is not a workaround/hack in my book, but a sane and healthy development practice.

> but I clearly prefer the configure change.

Well, the check_func_headers change obviously is for the better, yes. Adding the _POSIX_C_SOURCE define when building for mingw most probably also is sensible, but the fact that we add it manually to most OSes, while we don't add it automatically for all, makes it a little less clear cut.

Switching from trying to set some flags globally for all platforms, inevitably hitting a snag on some fringe system, then adding an exception for that system, to setting flags by platform and strictly only when necessary on that platform, is - oddly enough - one of the single biggest improvements to the whole configure machinery.

Sure, I generally agree with that. I was generally a bit wary of forcing the posix defines on other systems, but I generally think it should be good for this case, as it reduces inconsistencies between available/visible functions.

So I'm very wary of changes in that area due to having been burned so often in the past. If the change was motivated by a bug (since fixed) in mingw, then we should not add workarounds for it.

Well it's not quite as simple. The immediate issue is gone again, but the general underlying issue remains. The TL;DR version is:

- mingw-w64 contains localtime_r/gmtime_r, but only visible if posix thread safe functions have been requested by some means. We currently don't detect these in configure. In practice, the posix thread safe functions define could be enabled transitively by some other included header (which has also been somewhat mitigated within mingw-w64).
To safeguard against this inconsistency, defining it in configure would be helpful IMO.

- Even if localtime_r was visible from mingw-w64 headers, it used to not conflict with ours, because the mingw-w64 was defined as extern inline, while ours was static inline. The mingw-w64 headers were changed to define this as static inline, and later reverted again.

Anyway, I've presented my arguments. I trust you to make a good decision. Push at your discretion.

Well in that case, I'd push all four patches.

// Martin
Re: [libav-devel] [PATCH] rtsp: add pkt_size option
On Mon, 15 Apr 2019, Tristan Matthews wrote: On Thu, Apr 11, 2019 at 1:41 AM Martin Storsjö wrote: On Thu, 11 Apr 2019, Tristan Matthews wrote: This allows users to specify an upper limit on the size of outgoing packets when publishing via RTSP. --- libavformat/rtsp.c | 5 - libavformat/rtsp.h | 1 + 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/libavformat/rtsp.c b/libavformat/rtsp.c index 8bf9d9e3c..12c4998c6 100644 --- a/libavformat/rtsp.c +++ b/libavformat/rtsp.c @@ -74,7 +74,8 @@ #define COMMON_OPTS() \ { "reorder_queue_size", "Number of packets to buffer for handling of reordered packets", OFFSET(reordering_queue_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC }, \ -{ "buffer_size","Underlying protocol send/receive buffer size", OFFSET(buffer_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC|ENC } \ +{ "buffer_size","Underlying protocol send/receive buffer size", OFFSET(buffer_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC|ENC }, \ +{ "pkt_size", "Underlying protocol send packet size", OFFSET(pkt_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, ENC } \ const AVOption ff_rtsp_options[] = { @@ -118,6 +119,8 @@ static AVDictionary *map_to_opts(RTSPState *rt) snprintf(buf, sizeof(buf), "%d", rt->buffer_size); av_dict_set(&opts, "buffer_size", buf, 0); +snprintf(buf, sizeof(buf), "%d", rt->pkt_size); +av_dict_set(&opts, "pkt_size", buf, 0); return opts; } diff --git a/libavformat/rtsp.h b/libavformat/rtsp.h index 9dfbc5367..c38b90432 100644 --- a/libavformat/rtsp.h +++ b/libavformat/rtsp.h @@ -399,6 +399,7 @@ typedef struct RTSPState { char default_lang[4]; int buffer_size; +int pkt_size; const URLProtocol **protocols; } RTSPState; -- 2.17.1 LGTM // Martin This OK to merge? Pushed it for you now, thanks! // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well
On Sun, 14 Apr 2019, Diego Biurrun wrote: On Sat, Apr 13, 2019 at 12:58:40AM +0300, Martin Storsjö wrote: On Fri, 12 Apr 2019, Luca Barbato wrote: > On 11/04/2019 15:35, Martin Storsjö wrote: > > On Wed, 10 Apr 2019, Luca Barbato wrote: > > > On 10/04/2019 10:48, Martin Storsjö wrote: > > > > Mingw headers have got header inline implementations of localtime_r > > > > and gmtime_r, but only visible if certain posix thread safe functions > > > > have been requested. > > > > > > > > This is a preparatory step for improving the detection of those > > > > functions. > > > > --- > > > > An alternative fix is also provided in a different patch series, > > > > by adjusting libavutil/time_internal.h. > > > > > > Seems fine to me. > > > > Which ones do you mean - this series of 2 patches, the other one, or both? > > > > This series seems fine to me. Ok. FWIW, the change in mingw-w64 that broke it was reverted (there was a similar issue within gcc as well), but I guess this change probably is good to make anyway. I generally don't think that adding workarounds for foreign bugs is a sustainable strategy, Well, the idea of prefixing local system function fallbacks/replacements isn't so much of a "workaround" as a sensible idea in general IMO. This is a pattern that already is used e.g. for ff_getaddrinfo, ff_poll etc. That is, regardless of what the reason for using a fallback is (the real function does not exist, the real function is declared in headers but missing in libs, the real function exists but we want to avoid it because it's buggy, etc), the pattern of #include static inline ff_systemfunc() { ... } #define systemfunc ff_systemfunc should always be safe. So I think that should be a generally beneficial change in any case as well. but I clearly prefer the configure change. Well, the check_func_headers change obviously is for the better, yes. 
Adding the _POSIX_C_SOURCE define when building for mingw most probably also is sensible, but the fact that we add it manually to most OSes, while we don't add it automatically for all, makes it a little less clear cut. Also, s/Try adding/Add/ in the log message, you're not just trying to add those flags :-) Right, it wasn't a check_cflags but straightforward add_cflags. Yeah, I'll change that. // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well
On Fri, 12 Apr 2019, Luca Barbato wrote: On 11/04/2019 15:35, Martin Storsjö wrote: On Wed, 10 Apr 2019, Luca Barbato wrote: On 10/04/2019 10:48, Martin Storsjö wrote: Mingw headers have got header inline implementations of localtime_r and gmtime_r, but only visible if certain posix thread safe functions have been requested. This is a preparatory step for improving the detection of those functions. --- An alternative fix is also provided in a different patch series, by adjusting libavutil/time_internal.h. --- configure | 2 ++ 1 file changed, 2 insertions(+) Seems fine to me. Which ones do you mean - this series of 2 patches, the other one, or both? This series seems fine to me. Ok. FWIW, the change in mingw-w64 that broke it was reverted (there was a similar issue within gcc as well), but I guess this change probably is good to make anyway. // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well
On Wed, 10 Apr 2019, Luca Barbato wrote: On 10/04/2019 10:48, Martin Storsjö wrote: Mingw headers have got header inline implementations of localtime_r and gmtime_r, but only visible if certain posix thread safe functions have been requested. This is a preparatory step for improving the detection of those functions. --- An alternative fix is also provided in a different patch series, by adjusting libavutil/time_internal.h. --- configure | 2 ++ 1 file changed, 2 insertions(+) Seems fine to me. Which ones do you mean - this series of 2 patches, the other one, or both? // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] rtsp: add pkt_size option
On Thu, 11 Apr 2019, Tristan Matthews wrote: This allows users to specify an upper limit on the size of outgoing packets when publishing via RTSP. --- libavformat/rtsp.c | 5 - libavformat/rtsp.h | 1 + 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/libavformat/rtsp.c b/libavformat/rtsp.c index 8bf9d9e3c..12c4998c6 100644 --- a/libavformat/rtsp.c +++ b/libavformat/rtsp.c @@ -74,7 +74,8 @@ #define COMMON_OPTS() \ { "reorder_queue_size", "Number of packets to buffer for handling of reordered packets", OFFSET(reordering_queue_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC }, \ -{ "buffer_size","Underlying protocol send/receive buffer size", OFFSET(buffer_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC|ENC } \ +{ "buffer_size","Underlying protocol send/receive buffer size", OFFSET(buffer_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC|ENC }, \ +{ "pkt_size", "Underlying protocol send packet size", OFFSET(pkt_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, ENC } \ const AVOption ff_rtsp_options[] = { @@ -118,6 +119,8 @@ static AVDictionary *map_to_opts(RTSPState *rt) snprintf(buf, sizeof(buf), "%d", rt->buffer_size); av_dict_set(&opts, "buffer_size", buf, 0); +snprintf(buf, sizeof(buf), "%d", rt->pkt_size); +av_dict_set(&opts, "pkt_size", buf, 0); return opts; } diff --git a/libavformat/rtsp.h b/libavformat/rtsp.h index 9dfbc5367..c38b90432 100644 --- a/libavformat/rtsp.h +++ b/libavformat/rtsp.h @@ -399,6 +399,7 @@ typedef struct RTSPState { char default_lang[4]; int buffer_size; +int pkt_size; const URLProtocol **protocols; } RTSPState; -- 2.17.1 LGTM // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH 2/2] time_internal: Prefix fallback versions of gmtime_r/localtime_r with ff_
Use a macro to redirect calling code from the official name to the ff_
prefixed one.

Detecting these functions in configure can be tricky (on mingw, they are
conditionally available depending on posix feature defines). If configure
didn't detect them, but they still are visible at compile time (due to an
unrelated header defining the posix feature defines), providing the local
fallback versions with a prefixed name is safer.
---
This fix is another alternative to improving the configure checks. Making
configure use check_func_headers probably is safe, but always forcing posix
defines on mingw feels slightly more dubious.
---
 libavutil/time_internal.h | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/libavutil/time_internal.h b/libavutil/time_internal.h
index d0597db050..8e647fdc16 100644
--- a/libavutil/time_internal.h
+++ b/libavutil/time_internal.h
@@ -23,7 +23,7 @@
 #include "config.h"

 #if !HAVE_GMTIME_R && !defined(gmtime_r)
-static inline struct tm *gmtime_r(const time_t* clock, struct tm *result)
+static inline struct tm *ff_gmtime_r(const time_t* clock, struct tm *result)
 {
     struct tm *ptr = gmtime(clock);
     if (!ptr)
@@ -31,10 +31,11 @@ static inline struct tm *gmtime_r(const time_t* clock, struct tm *result)
     *result = *ptr;
     return result;
 }
+#define gmtime_r ff_gmtime_r
 #endif

 #if !HAVE_LOCALTIME_R && !defined(localtime_r)
-static inline struct tm *localtime_r(const time_t* clock, struct tm *result)
+static inline struct tm *ff_localtime_r(const time_t* clock, struct tm *result)
 {
     struct tm *ptr = localtime(clock);
     if (!ptr)
@@ -42,6 +43,7 @@ static inline struct tm *localtime_r(const time_t* clock, struct tm *result)
     *result = *ptr;
     return result;
 }
+#define localtime_r ff_localtime_r
 #endif

 #endif /* AVUTIL_TIME_INTERNAL_H */
--
2.17.1
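As a standalone illustration of the prefix-and-redirect pattern from this patch, the sketch below can be compiled outside the libav tree. HAVE_GMTIME_R normally comes from config.h; here it is simply assumed to be 0 (i.e. configure did not detect the function), which is an assumption of the example and not something the patch sets.

```c
#include <time.h>

/* Assumption for this sketch: configure did not detect gmtime_r.
 * In the real tree this value is provided by config.h. */
#ifndef HAVE_GMTIME_R
#define HAVE_GMTIME_R 0
#endif

#if !HAVE_GMTIME_R && !defined(gmtime_r)
/* The fallback carries a project-local prefix, so it cannot clash with a
 * gmtime_r that <time.h> may or may not declare on this system. Like the
 * fallback in the patch it wraps gmtime(), so it is not thread safe. */
static inline struct tm *ff_gmtime_r(const time_t *clock, struct tm *result)
{
    struct tm *ptr = gmtime(clock);
    if (!ptr)
        return NULL;
    *result = *ptr;
    return result;
}
/* Redirect callers from the official name to the prefixed fallback. */
#define gmtime_r ff_gmtime_r
#endif
```

Calling code keeps using the official name, e.g. gmtime_r(&t, &tm), and does not need to know whether it ends up in the system function or in the prefixed fallback.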
[libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well
Mingw headers have got header inline implementations of localtime_r and
gmtime_r, but only visible if certain posix thread safe functions have
been requested.

This is a preparatory step for improving the detection of those functions.
---
An alternative fix is also provided in a different patch series, by
adjusting libavutil/time_internal.h.
---
 configure | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/configure b/configure
index 26455054ba..3e8f2dcde1 100755
--- a/configure
+++ b/configure
@@ -4124,6 +4124,7 @@ probe_libc(){
             add_${pfx}cppflags -D__printf__=__gnu_printf__
         test_${pfx}cpp_condition windows.h "!defined(_WIN32_WINNT) || _WIN32_WINNT < 0x0600" &&
             add_${pfx}cppflags -D_WIN32_WINNT=0x0600
+        add_${pfx}cppflags -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600
     elif test_${pfx}cpp_condition _mingw.h "defined __MINGW_VERSION" ||
          test_${pfx}cpp_condition _mingw.h "defined __MINGW32_VERSION"; then
         eval ${pfx}libc_type=mingw32
@@ -4137,6 +4138,7 @@ probe_libc(){
             add_${pfx}cppflags -D_WIN32_WINNT=0x0600
         eval test \$${pfx_no_}cc_type = "gcc" &&
             add_${pfx}cppflags -D__printf__=__gnu_printf__
+        add_${pfx}cppflags -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600
     elif test_${pfx}cpp_condition crtversion.h "defined _VC_CRT_MAJOR_VERSION"; then
         eval ${pfx}libc_type=msvcrt
         if test_${pfx}cpp_condition crtversion.h "_VC_CRT_MAJOR_VERSION < 14"; then
--
2.17.1
[libav-devel] [PATCH 1/2] time_internal: Do not attempt to override *time_r() macros
From: Michael Niedermayer

This allegedly fixed build on odd mingw setups, and generally seems like a
safe thing to do (in case configure failed to detect them while they still
are available in headers).
---
 libavutil/time_internal.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavutil/time_internal.h b/libavutil/time_internal.h
index 829fefb007..d0597db050 100644
--- a/libavutil/time_internal.h
+++ b/libavutil/time_internal.h
@@ -22,7 +22,7 @@
 #include <time.h>
 #include "config.h"

-#if !HAVE_GMTIME_R
+#if !HAVE_GMTIME_R && !defined(gmtime_r)
 static inline struct tm *gmtime_r(const time_t* clock, struct tm *result)
 {
     struct tm *ptr = gmtime(clock);
@@ -33,7 +33,7 @@ static inline struct tm *gmtime_r(const time_t* clock, struct tm *result)
 }
 #endif

-#if !HAVE_LOCALTIME_R
+#if !HAVE_LOCALTIME_R && !defined(localtime_r)
 static inline struct tm *localtime_r(const time_t* clock, struct tm *result)
 {
     struct tm *ptr = localtime(clock);
--
2.17.1
[libav-devel] [PATCH 2/2] configure: Include time.h when checking for gmtime_r and localtime_r
These functions are available in time.h (conditional on posix thread safe functions) on mingw. Previously, these functions weren't detected by configure, and libavutil/time_internal.h provided replacements, even if time.h actually contained definitions of them.

Previously, these mingw inline functions were defined as "extern __inline __attribute__((__gnu_inline__))". In this case, defining a new static inline version of the function with the same name was accepted. But recently, the mingw inline functions have changed to be declared as "static inline", and in that case libavutil/time_internal.h is no longer allowed to define its own static inline versions.
---
Contrary to what is mentioned in a similar commit 1b4dd59e5fbdebb8d9f13ad2dbdaa0179d0cce57 in ffmpeg, using check_func_headers works just fine, provided that the posix defines have been added. (Without them, check_builtin, which that commit used, doesn't work either.)
---
 configure | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 3e8f2dcde1..8c46a870c4 100755
--- a/configure
+++ b/configure
@@ -4543,9 +4543,7 @@ check_func gethrtime
 check_func getopt
 check_func getrusage
 check_func gettimeofday
-check_func gmtime_r
 check_func isatty
-check_func localtime_r
 check_func mkstemp
 check_func mmap
 check_func mprotect
@@ -4561,6 +4559,8 @@ check_func usleep
 check_func_headers io.h setmode
 check_func_headers mach/mach_time.h mach_absolute_time
 check_func_headers stdlib.h getenv
+check_func_headers time.h gmtime_r
+check_func_headers time.h localtime_r
 check_func_headers windows.h GetProcessAffinityMask
 check_func_headers windows.h GetProcessTimes
-- 
2.17.1
[libav-devel] [PATCH] arm: Implement a NEON version of 422 h264_h_loop_filter_chroma
Previously, the 420 version was used even for 422. This fixes occasional checkasm failures. --- libavcodec/arm/h264dsp_init_arm.c | 8 +++- libavcodec/arm/h264dsp_neon.S | 19 +++ 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/libavcodec/arm/h264dsp_init_arm.c b/libavcodec/arm/h264dsp_init_arm.c index 7afd350..617632c 100644 --- a/libavcodec/arm/h264dsp_init_arm.c +++ b/libavcodec/arm/h264dsp_init_arm.c @@ -33,6 +33,8 @@ void ff_h264_v_loop_filter_chroma_neon(uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0); void ff_h264_h_loop_filter_chroma_neon(uint8_t *pix, int stride, int alpha, int beta, int8_t *tc0); +void ff_h264_h_loop_filter_chroma422_neon(uint8_t *pix, int stride, int alpha, + int beta, int8_t *tc0); void ff_weight_h264_pixels_16_neon(uint8_t *dst, int stride, int height, int log2_den, int weight, int offset); @@ -76,7 +78,11 @@ static av_cold void h264dsp_init_neon(H264DSPContext *c, const int bit_depth, c->h264_v_loop_filter_luma = ff_h264_v_loop_filter_luma_neon; c->h264_h_loop_filter_luma = ff_h264_h_loop_filter_luma_neon; c->h264_v_loop_filter_chroma = ff_h264_v_loop_filter_chroma_neon; -c->h264_h_loop_filter_chroma = ff_h264_h_loop_filter_chroma_neon; + +if (chroma_format_idc <= 1) +c->h264_h_loop_filter_chroma = ff_h264_h_loop_filter_chroma_neon; +else +c->h264_h_loop_filter_chroma = ff_h264_h_loop_filter_chroma422_neon; c->weight_h264_pixels_tab[0] = ff_weight_h264_pixels_16_neon; c->weight_h264_pixels_tab[1] = ff_weight_h264_pixels_8_neon; diff --git a/libavcodec/arm/h264dsp_neon.S b/libavcodec/arm/h264dsp_neon.S index 5e75565..783e0f6 100644 --- a/libavcodec/arm/h264dsp_neon.S +++ b/libavcodec/arm/h264dsp_neon.S @@ -237,6 +237,7 @@ function ff_h264_h_loop_filter_chroma_neon, export=1 h264_loop_filter_start sub r0, r0, #2 +h_loop_filter_chroma420: vld1.32 {d18[0]}, [r0], r1 vld1.32 {d16[0]}, [r0], r1 vld1.32 {d0[0]}, [r0], r1 @@ -271,6 +272,24 @@ function ff_h264_h_loop_filter_chroma_neon, export=1 bx lr endfunc +function 
ff_h264_h_loop_filter_chroma422_neon, export=1 +h264_loop_filter_start +push{r4, lr} +add r4, r0, r1 +add r1, r1, r1 +sub r0, r0, #2 + +bl h_loop_filter_chroma420 + +ldr r12, [sp, #8] +ldr r12, [r12] +vmov.32 d24[0], r12 +sub r0, r4, #2 + +bl h_loop_filter_chroma420 +pop {r4, pc} +endfunc + @ Biweighted prediction .macro biweight_16 macs, macd -- 2.7.4 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
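As far as I can read the assembly above, the 422 function reuses the 420 code path twice: once on the even rows with the stride doubled, and once more on the odd rows, reloading tc0 from the stack in between. The C model below is only a sketch of that row interleaving, not the filter itself. The demo_* names are hypothetical, and it assumes the 420 path handles 8 rows with each tc0 entry covering 2 rows.

```c
#include <stdint.h>

/* Stand-in for the 4:2:0 path: 8 rows, each tc0 entry covers 2 rows.
 * It only records which tc0 entry would be applied to each row. */
static void demo_h_filter_chroma420(uint8_t *pix, int stride, const int8_t *tc0)
{
    for (int row = 0; row < 8; row++)
        pix[row * stride] = (uint8_t)tc0[row / 2];
}

/* 4:2:2 variant: 16 rows, handled as two interleaved 8-row passes. */
static void demo_h_filter_chroma422(uint8_t *pix, int stride, const int8_t *tc0)
{
    demo_h_filter_chroma420(pix,          2 * stride, tc0); /* rows 0, 2, ..., 14 */
    demo_h_filter_chroma420(pix + stride, 2 * stride, tc0); /* rows 1, 3, ..., 15 */
}

/* Returns 1 if every one of the 16 rows was given tc0[row / 4]. */
static int demo_check_422(void)
{
    uint8_t col[16] = { 0 };
    const int8_t tc0[4] = { 1, 2, 3, 4 };

    demo_h_filter_chroma422(col, 1, tc0);
    for (int row = 0; row < 16; row++)
        if (col[row] != (uint8_t)tc0[row / 4])
            return 0;
    return 1;
}
```

With the doubled stride, each tc0 entry ends up covering 4 consecutive rows of the 16-row edge, which is what the check at the end verifies.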
Re: [libav-devel] [PATCH 2/2] checkasm/h264: test 4:2:2 chroma loop filter functions
On Wed, 27 Feb 2019, Janne Grunau wrote:

> ---
>  tests/checkasm/h264dsp.c | 44
>  1 file changed, 26 insertions(+), 18 deletions(-)

LGTM

// Martin
Re: [libav-devel] [PATCH 1/2] h264/arm64: implement missing 4:2:2 chroma loop filter neon functions
On Wed, 27 Feb 2019, Janne Grunau wrote:

> ---
>  libavcodec/aarch64/h264dsp_init_aarch64.c | 18 ++--
>  libavcodec/aarch64/h264dsp_neon.S | 36 +++
>  2 files changed, 46 insertions(+), 8 deletions(-)

LGTM

// Martin
Re: [libav-devel] [PATCHv3] avio: Do not flush the buffer if a constant packet size is requested
On Fri, 22 Feb 2019, Luca Barbato wrote:

> ---
>
> Now with a separate option to be explicit on what is the behaviour wanted.
>
>  libavformat/aviobuf.c | 9 +++--
>  libavformat/udp.c | 8
>  libavformat/url.h | 1 +
>  3 files changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/libavformat/aviobuf.c b/libavformat/aviobuf.c
> index 98e35f776c..aa9e2fc483 100644
> --- a/libavformat/aviobuf.c
> +++ b/libavformat/aviobuf.c
> @@ -244,8 +244,13 @@ void avio_write(AVIOContext *s, const unsigned char *buf, int size)
>  void avio_flush(AVIOContext *s)
>  {
> -flush_buffer(s);
> -s->must_flush = 0;
> +AVIOInternal *internal = s->opaque;
> +URLContext *h = internal->h;
> +

No, this doesn't work. You can't assume that s->opaque exists and is an AVIOInternal struct. When AVIOContext has been allocated by avio_alloc_context, s->opaque is whatever custom pointer the caller provided.

The only place you can use AVIOInternal is within the callbacks you provide in ffio_fdopen when AVIOInternal is created.

To do this properly, you need to propagate the new value all the way into AVIOContext, just like the existing max_packet_size.

// Martin
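To make the objection concrete, here is a minimal sketch of the suggested direction, under heavy assumptions. DemoIOContext is a hypothetical, simplified stand-in for AVIOContext, and keep_packet_size is a made-up name for the new setting (the actual field name from the patch is not visible in this mail). The point is only that the flush decision reads fields of the context itself and never interprets the opaque pointer.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for AVIOContext. */
typedef struct DemoIOContext {
    void *opaque;            /* owned by whoever created the context */
    int   max_packet_size;   /* existing field mentioned in the review */
    int   keep_packet_size;  /* hypothetical new setting, copied in at open time */
    int   flushes;           /* counts real flushes, for illustration only */
} DemoIOContext;

/* Flush decision based only on fields of the context itself. */
static void demo_flush(DemoIOContext *s)
{
    if (s->keep_packet_size && s->max_packet_size > 0)
        return;              /* keep buffering until a full packet is ready */
    s->flushes++;
}

/* A context created by an application with its own opaque pointer. */
static int demo_custom_context_flushes(void)
{
    int user_data = 42;      /* not an internal struct of the library */
    DemoIOContext s = { &user_data, 0, 0, 0 };

    demo_flush(&s);          /* never dereferences s.opaque */
    return s.flushes;
}

/* A context opened by the library, with the setting propagated into it. */
static int demo_fixed_size_flushes(void)
{
    DemoIOContext s = { NULL, 1316, 1, 0 };

    demo_flush(&s);
    return s.flushes;
}
```

In this shape, a context with a caller-provided opaque pointer behaves exactly as before, and only contexts where the setting was explicitly propagated skip the flush.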
Re: [libav-devel] [PATCH 19/19] aarch64: vp8: Optimize vp8_idct_add_neon for aarch64
On Fri, 1 Feb 2019, Martin Storsjö wrote:

> The previous version was a pretty exact translation of the arm version. This version does do some unnecessary arithmetic (it does more operations on vectors that are only half filled; it does 4 uaddw and 4 sqxtun instead of 2 of each), but it reduces the overhead of packing data together (which could be done for free in the arm version).
>
> This gives a decent speedup on Cortex A53, a minor speedup on A72 and a very minor slowdown on Cortex A73.
>
> Before:              Cortex A53    A72    A73
> vp8_idct_add_neon:         79.7   67.5   65.0
> After:
> vp8_idct_add_neon:         67.7   64.8   66.7
> ---
>  libavcodec/aarch64/vp8dsp_neon.S | 49
>  1 file changed, 25 insertions(+), 24 deletions(-)

22:38 feel free to push next week if I didn't manage to start by then

I'll push this patchset soon, with some changes squashed as suggested by Diego.

// Martin
[libav-devel] [PATCH 04/19] aarch64: vp8: Fix assembling with armasm64
---
 libavcodec/aarch64/vp8dsp_neon.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index f371ea7..14a9d11 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -28,7 +28,7 @@ function ff_vp8_idct_add_neon, export=1
 ld1 {v0.8b - v3.8b}, [x1]
 mov w4, #20091
-movk w4, #35468/2, lsl 16
+movk w4, #35468/2, lsl #16
 dup v4.2s, w4
 
 smull v26.4s, v1.4h, v4.h[0]
-- 
2.7.4
[libav-devel] [PATCH 19/19] aarch64: vp8: Optimize vp8_idct_add_neon for aarch64
The previous version was a pretty exact translation of the arm version. This version does do some unnecessary arithmetic (it does more operations on vectors that are only half filled; it does 4 uaddw and 4 sqxtun instead of 2 of each), but it reduces the overhead of packing data together (which could be done for free in the arm version).

This gives a decent speedup on Cortex A53, a minor speedup on A72 and a very minor slowdown on Cortex A73.

Before:              Cortex A53    A72    A73
vp8_idct_add_neon:         79.7   67.5   65.0
After:
vp8_idct_add_neon:         67.7   64.8   66.7
---
 libavcodec/aarch64/vp8dsp_neon.S | 49
 1 file changed, 25 insertions(+), 24 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index cac4558..47fdc21 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -125,36 +125,37 @@ function ff_vp8_idct_add_neon, export=1
 sub v17.4h, v0.4h, v2.4h
 
 add v18.4h, v20.4h, v23.4h
-ld1 {v24.d}[0], [x0], x2
-zip1 v16.2d, v16.2d, v17.2d
-sub v19.4h, v21.4h, v22.4h
-ld1 {v25.d}[0], [x0], x2
-zip1 v18.2d, v18.2d, v19.2d
-add v0.8h, v16.8h, v18.8h
-ld1 {v25.d}[1], [x0], x2
-sub v1.8h, v16.8h, v18.8h
-ld1 {v24.d}[1], [x0], x2
-srshr v0.8h, v0.8h, #3
-trn1 v24.4s, v24.4s, v25.4s
-srshr v1.8h, v1.8h, #3
+ld1 {v24.s}[0], [x0], x2
+sub v19.4h, v21.4h, v22.4h
+ld1 {v25.s}[0], [x0], x2
+add v0.4h, v16.4h, v18.4h
+add v1.4h, v17.4h, v19.4h
+ld1 {v26.s}[0], [x0], x2
+sub v3.4h, v16.4h, v18.4h
+sub v2.4h, v17.4h, v19.4h
+ld1 {v27.s}[0], [x0], x2
+srshr v0.4h, v0.4h, #3
+srshr v1.4h, v1.4h, #3
+srshr v2.4h, v2.4h, #3
+srshr v3.4h, v3.4h, #3
+
 sub x0, x0, x2, lsl #2
 
-ext v1.16b, v1.16b, v1.16b, #8
-trn1 v3.2d, v0.2d, v1.2d
-trn2 v0.2d, v0.2d, v1.2d
-trn1 v1.8h, v3.8h, v0.8h
-trn2 v3.8h, v3.8h, v0.8h
-uzp1 v0.4s, v1.4s, v3.4s
-uzp2 v1.4s, v3.4s, v1.4s
+transpose_4x4H v0, v1, v2, v3, v5, v6, v7, v16
 
 uaddw v0.8h, v0.8h, v24.8b
-uaddw2 v1.8h, v1.8h, v24.16b
+uaddw v1.8h, v1.8h, v25.8b
+uaddw v2.8h, v2.8h, v26.8b
+uaddw v3.8h, v3.8h, v27.8b
 sqxtun v0.8b, v0.8h
-sqxtun2 v0.16b, v1.8h
+sqxtun v1.8b, v1.8h
+sqxtun v2.8b, v2.8h
+sqxtun v3.8b, v3.8h
+
 st1 {v0.s}[0], [x0], x2
-st1 {v0.s}[1], [x0], x2
-st1 {v0.s}[3], [x0], x2
-st1 {v0.s}[2], [x0], x2
+st1 {v1.s}[0], [x0], x2
+st1 {v2.s}[0], [x0], x2
+st1 {v3.s}[0], [x0], x2
 
 ret
 endfunc
-- 
2.7.4
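For readers less used to the NEON mnemonics, the tail of ff_vp8_idct_add_neon in the patch above (srshr, uaddw, sqxtun) amounts, per output pixel, to a rounded shift, an add of the destination pixel and a clamp. The scalar C model below is a sketch of that per-lane behaviour only. The function name is hypothetical, and it assumes that >> on a negative int is an arithmetic shift, as it is on GCC and Clang.

```c
#include <stdint.h>

/* Scalar model of one output pixel: round, add to the destination, clamp. */
static uint8_t demo_idct_store(uint8_t dst, int16_t coef)
{
    /* srshr #3: rounding shift right by 3 (assumes arithmetic shift
     * for negative values, which is implementation-defined in C). */
    int rounded = (coef + 4) >> 3;

    /* uaddw: widen the 8-bit pixel and add it to the 16-bit value. */
    int sum = rounded + dst;

    /* sqxtun: saturate the signed value to the unsigned 8-bit range. */
    if (sum < 0)
        return 0;
    if (sum > 255)
        return 255;
    return (uint8_t)sum;
}
```

The patch performs these three steps on four half-filled vectors instead of two full ones, which is where the "4 uaddw and 4 sqxtun instead of 2 of each" in the commit message comes from.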
[libav-devel] [PATCH 01/19] libavcodec: vp8 neon optimizations for aarch64
From: Magnus Röös Partial port of the ARM Neon for aarch64. Benchmarks from fate: benchmarking with Linux Perf Monitoring API nop: 58.6 checkasm: using random seed 1760970128 NEON: - vp8dsp.idct [OK] - vp8dsp.mc [OK] - vp8dsp.loopfilter [OK] checkasm: all 21 tests passed vp8_idct_add_c: 201.6 vp8_idct_add_neon: 83.1 vp8_idct_dc_add_c: 107.6 vp8_idct_dc_add_neon: 33.8 vp8_idct_dc_add4y_c: 426.4 vp8_idct_dc_add4y_neon: 59.4 vp8_loop_filter8uv_h_c: 688.1 vp8_loop_filter8uv_h_neon: 216.3 vp8_loop_filter8uv_inner_h_c: 649.3 vp8_loop_filter8uv_inner_h_neon: 195.3 vp8_loop_filter8uv_inner_v_c: 544.8 vp8_loop_filter8uv_inner_v_neon: 131.3 vp8_loop_filter8uv_v_c: 706.1 vp8_loop_filter8uv_v_neon: 141.1 vp8_loop_filter16y_h_c: 668.8 vp8_loop_filter16y_h_neon: 242.8 vp8_loop_filter16y_inner_h_c: 647.3 vp8_loop_filter16y_inner_h_neon: 224.6 vp8_loop_filter16y_inner_v_c: 647.8 vp8_loop_filter16y_inner_v_neon: 128.8 vp8_loop_filter16y_v_c: 721.8 vp8_loop_filter16y_v_neon: 154.3 vp8_loop_filter_simple_h_c: 387.8 vp8_loop_filter_simple_h_neon: 187.6 vp8_loop_filter_simple_v_c: 384.1 vp8_loop_filter_simple_v_neon: 78.6 vp8_put_epel8_h4v4_c: 3971.1 vp8_put_epel8_h4v4_neon: 855.1 vp8_put_epel8_h4v6_c: 5060.1 vp8_put_epel8_h4v6_neon: 989.6 vp8_put_epel8_h6v4_c: 4320.8 vp8_put_epel8_h6v4_neon: 1007.3 vp8_put_epel8_h6v6_c: 5449.3 vp8_put_epel8_h6v6_neon: 1158.1 vp8_put_epel16_h6_c: 6683.8 vp8_put_epel16_h6_neon: 831.8 vp8_put_epel16_h6v6_c: 0.8 vp8_put_epel16_h6v6_neon: 2214.8 vp8_put_epel16_v6_c: 7024.8 vp8_put_epel16_v6_neon: 799.6 vp8_put_pixels8_c: 112.8 vp8_put_pixels8_neon: 78.1 vp8_put_pixels16_c: 131.3 vp8_put_pixels16_neon: 129.8 Signed-off-by: Magnus Röös --- libavcodec/aarch64/Makefile |2 + libavcodec/aarch64/vp8dsp.h | 70 ++ libavcodec/aarch64/vp8dsp_init_aarch64.c | 81 +++ libavcodec/aarch64/vp8dsp_neon.S | 1031 ++ libavcodec/vp8dsp.c |4 + libavcodec/vp8dsp.h |2 + 6 files changed, 1190 insertions(+) create mode 100644 libavcodec/aarch64/vp8dsp.h create mode 100644 
libavcodec/aarch64/vp8dsp_init_aarch64.c create mode 100644 libavcodec/aarch64/vp8dsp_neon.S diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile index 5c1d118..2555044 100644 --- a/libavcodec/aarch64/Makefile +++ b/libavcodec/aarch64/Makefile @@ -44,6 +44,8 @@ NEON-OBJS-$(CONFIG_MPEGAUDIODSP)+= aarch64/mpegaudiodsp_neon.o NEON-OBJS-$(CONFIG_DCA_DECODER) += aarch64/dcadsp_neon.o \ aarch64/synth_filter_neon.o NEON-OBJS-$(CONFIG_VORBIS_DECODER) += aarch64/vorbisdsp_neon.o +NEON-OBJS-$(CONFIG_VP8DSP) += aarch64/vp8dsp_init_aarch64.o \ + aarch64/vp8dsp_neon.o NEON-OBJS-$(CONFIG_VP9_DECODER) += aarch64/vp9itxfm_neon.o \ aarch64/vp9lpf_neon.o \ aarch64/vp9mc_neon.o diff --git a/libavcodec/aarch64/vp8dsp.h b/libavcodec/aarch64/vp8dsp.h new file mode 100644 index 000..8a0c8fb --- /dev/null +++ b/libavcodec/aarch64/vp8dsp.h @@ -0,0 +1,70 @@ +/* + * This file is part of Libav. + * + * Libav is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * Libav is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. 
+ * + * You should have received a copy of the GNU Lesser General Public + * License along with Libav; if not, write to the Free Software + * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA + */ + +#ifndef AVCODEC_ARM_VP8DSP_H +#define AVCODEC_ARM_VP8DSP_H + +#include "libavcodec/vp8dsp.h" + +#define VP8_LF_Y(hv, inner, opt) \ +void ff_vp8_##hv##_loop_filter16##inner##_##opt(uint8_t *dst,\ +ptrdiff_t stride,\ +int flim_E, int flim_I, \ +int hev_thresh) + +#define VP8_LF_UV(hv, inner, opt)\ +void ff_vp8_##hv##_loop_filter8uv##inner##_##opt(uint8_t *dstU, \ + uint8_t *dstV, \ + ptrdiff_t stride, \ + int flim_E, int flim_I, \ +
[libav-devel] [PATCH 03/19] aarch64: vp8: Fix assembling with clang
This also partially fixes assembling with MS armasm64 (via gas-preprocessor). --- libavcodec/aarch64/vp8dsp_neon.S | 124 +++ 1 file changed, 62 insertions(+), 62 deletions(-) diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S index 771877c..f371ea7 100644 --- a/libavcodec/aarch64/vp8dsp_neon.S +++ b/libavcodec/aarch64/vp8dsp_neon.S @@ -31,10 +31,10 @@ function ff_vp8_idct_add_neon, export=1 movkw4, #35468/2, lsl 16 dup v4.2s, w4 -smull v26.4s, v1.4h, v4.4h[0] -smull v27.4s, v3.4h, v4.4h[0] -sqdmulh v20.4h, v1.4h, v4.4h[1] -sqdmulh v23.4h, v3.4h, v4.4h[1] +smull v26.4s, v1.4h, v4.h[0] +smull v27.4s, v3.4h, v4.h[0] +sqdmulh v20.4h, v1.4h, v4.h[1] +sqdmulh v23.4h, v3.4h, v4.h[1] sqshrn v21.4h, v26.4s, #16 sqshrn v22.4h, v27.4s, #16 add v21.4h, v21.4h, v1.4h @@ -54,12 +54,12 @@ function ff_vp8_idct_add_neon, export=1 transpose_4x4H v0, v1, v2, v3, v24, v5, v6, v7 moviv29.8h, #0 -smull v26.4s, v1.4h, v4.4h[0] +smull v26.4s, v1.4h, v4.h[0] st1 {v29.8h}, [x1], #16 -smull v27.4s, v3.4h, v4.4h[0] +smull v27.4s, v3.4h, v4.h[0] st1 {v29.16b}, [x1] -sqdmulh v21.4h, v1.4h, v4.4h[1] -sqdmulh v23.4h, v3.4h, v4.4h[1] +sqdmulh v21.4h, v1.4h, v4.h[1] +sqdmulh v23.4h, v3.4h, v4.h[1] sqshrn v20.4h, v26.4s, #16 sqshrn v22.4h, v27.4s, #16 add v20.4h, v20.4h, v1.4h @@ -469,7 +469,7 @@ function ff_vp8_h_loop_filter16\name\()_neon, export=1 ld1 {v6.d}[1], [x0], x1 ld1 {v7.d}[1], [x0], x1 -transpose_8x16b v0, v1, v2, v3, v4, v5, v6, v7, v30, v31 +transpose_8x16B v0, v1, v2, v3, v4, v5, v6, v7, v30, v31 dup v22.16b, w2 // flim_E .if !\simple @@ -480,7 +480,7 @@ function ff_vp8_h_loop_filter16\name\()_neon, export=1 sub x0, x0, x1, lsl #4// backup 16 rows -transpose_8x16b v0, v1, v2, v3, v4, v5, v6, v7, v30, v31 +transpose_8x16B v0, v1, v2, v3, v4, v5, v6, v7, v30, v31 // Store pixels: st1 {v0.d}[0], [x0], x1 @@ -531,7 +531,7 @@ function ff_vp8_h_loop_filter8uv\name\()_neon, export=1 ld1 {v7.d}[0], [x0], x2 ld1 {v7.d}[1], [x1], x2 -transpose_8x16b v0, v1, v2, v3, 
v4, v5, v6, v7, v30, v31 +transpose_8x16B v0, v1, v2, v3, v4, v5, v6, v7, v30, v31 dup v22.16b, w3 // flim_E dup v23.16b, w4 // flim_I @@ -541,7 +541,7 @@ function ff_vp8_h_loop_filter8uv\name\()_neon, export=1 sub x0, x0, x2, lsl #3// backup u 8 rows sub x1, x1, x2, lsl #3// backup v 8 rows -transpose_8x16b v0, v1, v2, v3, v4, v5, v6, v7, v30, v31 +transpose_8x16B v0, v1, v2, v3, v4, v5, v6, v7, v30, v31 // Store pixels: st1 {v0.d}[0], [x0], x2 // load u @@ -613,13 +613,13 @@ endfunc uxtlv22.8h, v24.8b ext v26.8b, \s0\().8b, \s1\().8b, #5 uxtlv25.8h, v25.8b -mul v21.8h, v21.8h, v0.8h[2] +mul v21.8h, v21.8h, v0.h[2] uxtlv26.8h, v26.8b -mul v22.8h, v22.8h, v0.8h[3] -mls v21.8h, v19.8h, v0.8h[1] -mls v22.8h, v25.8h, v0.8h[4] -mla v21.8h, v18.8h, v0.8h[0] -mla v22.8h, v26.8h, v0.8h[5] +mul v22.8h, v22.8h, v0.h[3] +mls v21.8h, v19.8h, v0.h[1] +mls v22.8h, v25.8h, v0.h[4] +mla v21.8h, v18.8h, v0.h[0] +mla v22.8h, v26.8h, v0.h[5] sqadd v22.8h, v21.8h, v22.8h sqrshrun\d\().8b, v22.8h, #7 .endm @@ -640,20 +640,20 @@ endfunc uxtl2 v2.8h, v2.16b uxtlv17.8h, v16.8b uxtl2 v16.8h, v16.16b -mul v19.8h, v19.8h, v0.8h[3] -mul v18.8h, v18.8h, v0.8h[2] -mul v3.8h, v3.8h, v0.8h[2] -mul v22.8h, v22.8h, v0.8h[3] -mls v19.8h, v20.8h, v0.8h[4] +mul v19.8h, v19.8h, v0.h[3] +mul v18.8h, v18.8h, v0.h[2] +mul
[libav-devel] [PATCH 18/19] aarch64: vp8: Skip saturating in shrn in ff_vp8_idct_add_neon
The original arm version didn't do saturation here. This probably doesn't make any difference for performance, but reduces the differences between the two versions.
---
 libavcodec/aarch64/vp8dsp_neon.S | 8
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index 139b380..cac4558 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -92,8 +92,8 @@ function ff_vp8_idct_add_neon, export=1
 smull v27.4s, v3.4h, v4.h[0]
 sqdmulh v20.4h, v1.4h, v4.h[1]
 sqdmulh v23.4h, v3.4h, v4.h[1]
-sqshrn v21.4h, v26.4s, #16
-sqshrn v22.4h, v27.4s, #16
+shrn v21.4h, v26.4s, #16
+shrn v22.4h, v27.4s, #16
 add v21.4h, v21.4h, v1.4h
 add v22.4h, v22.4h, v3.4h
 
@@ -117,8 +117,8 @@ function ff_vp8_idct_add_neon, export=1
 st1 {v29.16b}, [x1]
 sqdmulh v21.4h, v1.4h, v4.h[1]
 sqdmulh v23.4h, v3.4h, v4.h[1]
-sqshrn v20.4h, v26.4s, #16
-sqshrn v22.4h, v27.4s, #16
+shrn v20.4h, v26.4s, #16
+shrn v22.4h, v27.4s, #16
 add v20.4h, v20.4h, v1.4h
 add v22.4h, v22.4h, v3.4h
 add v16.4h, v0.4h, v2.4h
-- 
2.7.4
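A quick way to convince oneself that dropping the saturation should be harmless here is to model both narrowing variants in C. The sketch below assumes the narrowed values are the smull products with the constant 20091 (the low half of w4 in this function) and that >> on a negative value is an arithmetic shift, as on GCC and Clang. The demo_* names are hypothetical.

```c
#include <stdint.h>

/* smull by 20091 followed by shrn #16: plain narrowing to 16 bits. */
static int16_t demo_mul_shrn(int16_t x)
{
    int32_t prod = (int32_t)x * 20091;
    return (int16_t)(prod >> 16);
}

/* The same product narrowed with sqshrn #16: saturate to the int16 range. */
static int16_t demo_mul_sqshrn(int16_t x)
{
    int32_t v = ((int32_t)x * 20091) >> 16;

    if (v > INT16_MAX)
        return INT16_MAX;
    if (v < INT16_MIN)
        return INT16_MIN;
    return (int16_t)v;
}

/* Counts the 16-bit inputs for which the two variants disagree. */
static int demo_count_mismatches(void)
{
    int mismatches = 0;

    for (int x = INT16_MIN; x <= INT16_MAX; x++)
        if (demo_mul_shrn((int16_t)x) != demo_mul_sqshrn((int16_t)x))
            mismatches++;
    return mismatches;
}
```

Since the multiplier is below 32768, the product shifted right by 16 stays well inside the 16-bit range for any 16-bit input, so under these assumptions the saturating and the plain form give the same result.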
[libav-devel] [PATCH 13/19] aarch64: vp8: Port missing epel8 functions from arm version
Cortex A53 A72 A73 vp8_put_epel8_h4_c: 2594.8 1159.6 1374.8 vp8_put_epel8_h4_neon: 506.4 244.2 314.0 vp8_put_epel8_h6_c: 3445.8 1677.1 1811.3 vp8_put_epel8_h6_neon: 634.4 371.7 433.0 vp8_put_epel8_v4_c: 2614.0 1174.8 1378.0 vp8_put_epel8_v4_neon: 321.0 221.7 235.8 vp8_put_epel8_v6_c: 3635.5 1703.0 2079.2 vp8_put_epel8_v6_neon: 416.9 317.0 295.5 --- libavcodec/aarch64/vp8dsp_init_aarch64.c | 4 ++ libavcodec/aarch64/vp8dsp_neon.S | 87 2 files changed, 91 insertions(+) diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c b/libavcodec/aarch64/vp8dsp_init_aarch64.c index 8f060dc..1878d8e 100644 --- a/libavcodec/aarch64/vp8dsp_init_aarch64.c +++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c @@ -47,8 +47,12 @@ av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp) dsp->put_vp8_epel_pixels_tab[0][2][2] = ff_put_vp8_epel16_h6v6_neon; dsp->put_vp8_epel_pixels_tab[1][0][0] = ff_put_vp8_pixels8_neon; +dsp->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_neon; +dsp->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_neon; +dsp->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_neon; dsp->put_vp8_epel_pixels_tab[1][1][1] = ff_put_vp8_epel8_h4v4_neon; dsp->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_neon; +dsp->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_neon; dsp->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_neon; dsp->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_neon; } diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S index 4ea62c0..c5badc4 100644 --- a/libavcodec/aarch64/vp8dsp_neon.S +++ b/libavcodec/aarch64/vp8dsp_neon.S @@ -957,6 +957,51 @@ function ff_put_vp8_epel16_h6v6_neon, export=1 ret endfunc +function ff_put_vp8_epel8_v6_neon, export=1 +sub x2, x2, x3, lsl #1 + +movrel x7, subpel_filters, -16 +add x6, x7, w6, uxtw #4 +ld1 {v0.8h}, [x6] +1: +ld1 {v2.8b}, [x2], x3 +ld1 {v3.8b}, [x2], x3 +ld1 {v4.8b}, [x2], x3 +ld1 {v5.8b}, [x2], x3 +ld1 {v6.8b}, [x2], x3 +ld1 {v7.8b}, [x2], 
x3 +ld1 {v28.8b}, [x2] + +sub x2, x2, x3, lsl #2 + +vp8_epel8_v6_y2 v2, v3, v2, v3, v4, v5, v6, v7, v28 + +st1 {v2.8b}, [x0], x1 +st1 {v3.8b}, [x0], x1 +subsw4, w4, #2 +b.ne1b + +ret +endfunc + +function ff_put_vp8_epel8_h6_neon, export=1 +sub x2, x2, #2 + +movrel x7, subpel_filters, -16 +add x5, x7, w5, uxtw #4 +ld1 {v0.8h},[x5] +1: +ld1 {v2.8b, v3.8b}, [x2], x3 + +vp8_epel8_h6v2, v2, v3 + +st1 {v2.8b}, [x0], x1 +subsw4, w4, #1 +b.ne1b + +ret +endfunc + function ff_put_vp8_epel8_h6v6_neon, export=1 sub x2, x2, x3, lsl #1 sub x2, x2, #2 @@ -1003,6 +1048,48 @@ function ff_put_vp8_epel8_h6v6_neon, export=1 ret endfunc +function ff_put_vp8_epel8_v4_neon, export=1 +sub x2, x2, x3 + +movrel x7, subpel_filters, -16 +add x6, x7, w6, uxtw #4 +ld1 {v0.8h}, [x6] +1: +ld1 {v2.8b}, [x2], x3 +ld1 {v3.8b}, [x2], x3 +ld1 {v4.8b}, [x2], x3 +ld1 {v5.8b}, [x2], x3 +ld1 {v6.8b}, [x2] +sub x2, x2, x3, lsl #1 + +vp8_epel8_v4_y2 v2, v2, v3, v4, v5, v6 + +st1 {v2.d}[0], [x0], x1 +st1 {v2.d}[1], [x0], x1 +subsw4, w4, #2 +b.ne1b + +ret +endfunc + +function ff_put_vp8_epel8_h4_neon, export=1 +sub x2, x2, #1 + +movrel x7, subpel_filters, -16 +add x5, x7, w5, uxtw #4 +ld1 {v0.8h}, [x5] +1: +ld1 {v2.8b,v3.8b}, [x2], x3 + +vp8_epel8_h4v2, v2, v3 + +st1 {v2.8b}, [x0], x1 +subsw4, w4, #1 +b.ne1b + +ret +endfunc + function ff_put_vp8_epel8_h4v6_neon, export=1 sub x2, x2, x3, lsl #1 sub x2, x2, #1 -- 2.7.4 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH 15/19] aarch64: vp8: Port bilin functions from arm version
Cortex A53 A72 A73 vp8_put_bilin4_h_c:303.8 102.2 161.8 vp8_put_bilin4_h_neon: 100.040.941.2 vp8_put_bilin4_hv_c: 322.8 201.0 305.9 vp8_put_bilin4_hv_neon:156.872.677.0 vp8_put_bilin4_v_c:304.7 101.7 166.5 vp8_put_bilin4_v_neon: 82.741.233.0 vp8_put_bilin8_h_c: 1192.7 352.5 623.8 vp8_put_bilin8_h_neon: 213.570.287.8 vp8_put_bilin8_hv_c: 1098.6 769.2 1041.9 vp8_put_bilin8_hv_neon:324.0 123.5 146.0 vp8_put_bilin8_v_c: 1193.9 350.4 617.7 vp8_put_bilin8_v_neon: 183.960.764.7 vp8_put_bilin16_h_c: 2353.1 671.2 1223.3 vp8_put_bilin16_h_neon:261.9 140.7 145.0 vp8_put_bilin16_hv_c: 2453.2 1470.9 2355.2 vp8_put_bilin16_hv_neon: 383.9 196.0 217.0 vp8_put_bilin16_v_c: 2349.3 669.8 1251.2 vp8_put_bilin16_v_neon:202.9 110.796.2 --- libavcodec/aarch64/vp8dsp.h | 5 + libavcodec/aarch64/vp8dsp_init_aarch64.c | 32 libavcodec/aarch64/vp8dsp_neon.S | 292 +++ 3 files changed, 329 insertions(+) diff --git a/libavcodec/aarch64/vp8dsp.h b/libavcodec/aarch64/vp8dsp.h index 40d0cae..616252e 100644 --- a/libavcodec/aarch64/vp8dsp.h +++ b/libavcodec/aarch64/vp8dsp.h @@ -67,4 +67,9 @@ VP8_MC(epel ## w ## _h4v6, opt);\ VP8_MC(epel ## w ## _h6v6, opt) +#define VP8_BILIN(w, opt) \ +VP8_MC(bilin ## w ## _h, opt); \ +VP8_MC(bilin ## w ## _v, opt); \ +VP8_MC(bilin ## w ## _hv, opt) + #endif /* AVCODEC_AARCH64_VP8DSP_H */ diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c b/libavcodec/aarch64/vp8dsp_init_aarch64.c index 478f849..53fbfcd 100644 --- a/libavcodec/aarch64/vp8dsp_init_aarch64.c +++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c @@ -36,6 +36,9 @@ VP8_EPEL(16, neon); VP8_EPEL(8, neon); VP8_EPEL(4, neon); +VP8_BILIN(16, neon); +VP8_BILIN(8, neon); +VP8_BILIN(4, neon); av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp) { @@ -65,6 +68,35 @@ av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp) dsp->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_neon; dsp->put_vp8_epel_pixels_tab[2][2][1] = ff_put_vp8_epel4_h4v6_neon; dsp->put_vp8_epel_pixels_tab[2][2][2] = 
ff_put_vp8_epel4_h6v6_neon; + +dsp->put_vp8_bilinear_pixels_tab[0][0][0] = ff_put_vp8_pixels16_neon; +dsp->put_vp8_bilinear_pixels_tab[0][0][1] = ff_put_vp8_bilin16_h_neon; +dsp->put_vp8_bilinear_pixels_tab[0][0][2] = ff_put_vp8_bilin16_h_neon; +dsp->put_vp8_bilinear_pixels_tab[0][1][0] = ff_put_vp8_bilin16_v_neon; +dsp->put_vp8_bilinear_pixels_tab[0][1][1] = ff_put_vp8_bilin16_hv_neon; +dsp->put_vp8_bilinear_pixels_tab[0][1][2] = ff_put_vp8_bilin16_hv_neon; +dsp->put_vp8_bilinear_pixels_tab[0][2][0] = ff_put_vp8_bilin16_v_neon; +dsp->put_vp8_bilinear_pixels_tab[0][2][1] = ff_put_vp8_bilin16_hv_neon; +dsp->put_vp8_bilinear_pixels_tab[0][2][2] = ff_put_vp8_bilin16_hv_neon; + +dsp->put_vp8_bilinear_pixels_tab[1][0][0] = ff_put_vp8_pixels8_neon; +dsp->put_vp8_bilinear_pixels_tab[1][0][1] = ff_put_vp8_bilin8_h_neon; +dsp->put_vp8_bilinear_pixels_tab[1][0][2] = ff_put_vp8_bilin8_h_neon; +dsp->put_vp8_bilinear_pixels_tab[1][1][0] = ff_put_vp8_bilin8_v_neon; +dsp->put_vp8_bilinear_pixels_tab[1][1][1] = ff_put_vp8_bilin8_hv_neon; +dsp->put_vp8_bilinear_pixels_tab[1][1][2] = ff_put_vp8_bilin8_hv_neon; +dsp->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilin8_v_neon; +dsp->put_vp8_bilinear_pixels_tab[1][2][1] = ff_put_vp8_bilin8_hv_neon; +dsp->put_vp8_bilinear_pixels_tab[1][2][2] = ff_put_vp8_bilin8_hv_neon; + +dsp->put_vp8_bilinear_pixels_tab[2][0][1] = ff_put_vp8_bilin4_h_neon; +dsp->put_vp8_bilinear_pixels_tab[2][0][2] = ff_put_vp8_bilin4_h_neon; +dsp->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilin4_v_neon; +dsp->put_vp8_bilinear_pixels_tab[2][1][1] = ff_put_vp8_bilin4_hv_neon; +dsp->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_neon; +dsp->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilin4_v_neon; +dsp->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_neon; +dsp->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_neon; } av_cold void ff_vp8dsp_init_aarch64(VP8DSPContext *dsp) diff --git 
a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S index 7fe2466..604be8a 100644 --- a/libavcodec/aarch64/vp8dsp_neon.S +++ b/libavcodec/aarch64/vp8dsp_neon.S @@ -1509,3 +1509,295 @@ function ff_put_vp8_epel4_h4v4_neon, export=1 add sp, sp, #44 ret endfunc + +/* Bilinear MC */ + +function ff_put_vp8_bilin16_h_neon, export=1 +mov w7, #8 +dup v0.8b, w5 +sub w5, w7, w5 +dup v1.8b, w5 +1: +subsw4,
[libav-devel] [PATCH 12/19] aarch64: vp8: Port vp8_luma_dc_wht and vp8_idct_dc_add4uv from arm version
Cortex A53A72A73 vp8_luma_dc_wht_c:115.7 75.7 90.7 vp8_luma_dc_wht_neon: 60.7 41.2 45.7 vp8_idct_dc_add4uv_c: 376.1 262.9 282.5 vp8_idct_dc_add4uv_neon: 52.0 29.0 37.0 --- libavcodec/aarch64/vp8dsp_init_aarch64.c | 3 + libavcodec/aarch64/vp8dsp_neon.S | 109 +++ 2 files changed, 112 insertions(+) diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c b/libavcodec/aarch64/vp8dsp_init_aarch64.c index da54efd..8f060dc 100644 --- a/libavcodec/aarch64/vp8dsp_init_aarch64.c +++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c @@ -28,6 +28,7 @@ void ff_vp8_luma_dc_wht_neon(int16_t block[4][4][16], int16_t dc[16]); void ff_vp8_idct_add_neon(uint8_t *dst, int16_t block[16], ptrdiff_t stride); void ff_vp8_idct_dc_add_neon(uint8_t *dst, int16_t block[16], ptrdiff_t stride); void ff_vp8_idct_dc_add4y_neon(uint8_t *dst, int16_t block[4][16], ptrdiff_t stride); +void ff_vp8_idct_dc_add4uv_neon(uint8_t *dst, int16_t block[4][16], ptrdiff_t stride); VP8_LF(neon); @@ -57,10 +58,12 @@ av_cold void ff_vp8dsp_init_aarch64(VP8DSPContext *dsp) if (!have_neon(av_get_cpu_flags())) { return; } +dsp->vp8_luma_dc_wht= ff_vp8_luma_dc_wht_neon; dsp->vp8_idct_add = ff_vp8_idct_add_neon; dsp->vp8_idct_dc_add= ff_vp8_idct_dc_add_neon; dsp->vp8_idct_dc_add4y = ff_vp8_idct_dc_add4y_neon; +dsp->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_neon; dsp->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_neon; dsp->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_neon; diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S index 2b5b049..4ea62c0 100644 --- a/libavcodec/aarch64/vp8dsp_neon.S +++ b/libavcodec/aarch64/vp8dsp_neon.S @@ -4,6 +4,7 @@ * Copyright (c) 2010 Rob Clark * Copyright (c) 2011 Mans Rullgard * Copyright (c) 2018 Magnus Röös + * Copyright (c) 2019 Martin Storsjo * * This file is part of Libav. 
* @@ -25,6 +26,62 @@ #include "libavutil/aarch64/asm.S" #include "neon.S" +function ff_vp8_luma_dc_wht_neon, export=1 +ld1 {v0.4h - v3.4h}, [x1] +moviv30.8h, #0 + +add v4.4h, v0.4h, v3.4h +add v6.4h, v1.4h, v2.4h +st1 {v30.8h}, [x1], #16 +sub v7.4h, v1.4h, v2.4h +sub v5.4h, v0.4h, v3.4h +st1 {v30.8h}, [x1] +add v0.4h, v4.4h, v6.4h +add v1.4h, v5.4h, v7.4h +sub v2.4h, v4.4h, v6.4h +sub v3.4h, v5.4h, v7.4h + +moviv16.4h, #3 + +transpose_4x4H v0, v1, v2, v3, v4, v5, v6, v7 + +add v0.4h, v0.4h, v16.4h + +add v4.4h, v0.4h, v3.4h +add v6.4h, v1.4h, v2.4h +sub v7.4h, v1.4h, v2.4h +sub v5.4h, v0.4h, v3.4h +add v0.4h, v4.4h, v6.4h +add v1.4h, v5.4h, v7.4h +sub v2.4h, v4.4h, v6.4h +sub v3.4h, v5.4h, v7.4h + +sshrv0.4h, v0.4h, #3 +sshrv1.4h, v1.4h, #3 +sshrv2.4h, v2.4h, #3 +sshrv3.4h, v3.4h, #3 + +mov x3, #32 +st1 {v0.h}[0], [x0], x3 +st1 {v1.h}[0], [x0], x3 +st1 {v2.h}[0], [x0], x3 +st1 {v3.h}[0], [x0], x3 +st1 {v0.h}[1], [x0], x3 +st1 {v1.h}[1], [x0], x3 +st1 {v2.h}[1], [x0], x3 +st1 {v3.h}[1], [x0], x3 +st1 {v0.h}[2], [x0], x3 +st1 {v1.h}[2], [x0], x3 +st1 {v2.h}[2], [x0], x3 +st1 {v3.h}[2], [x0], x3 +st1 {v0.h}[3], [x0], x3 +st1 {v1.h}[3], [x0], x3 +st1 {v2.h}[3], [x0], x3 +st1 {v3.h}[3], [x0], x3 + +ret +endfunc + function ff_vp8_idct_add_neon, export=1 ld1 {v0.8b - v3.8b}, [x1] mov w4, #20091 @@ -102,6 +159,58 @@ function ff_vp8_idct_add_neon, export=1 ret endfunc +function ff_vp8_idct_dc_add4uv_neon, export=1 +moviv0.4h, #0 +mov x3, #32 +ld1r{v16.4h}, [x1] +st1 {v0.h}[0], [x1], x3 +ld1r{v17.4h}, [x1] +st1 {v0.h}[0], [x1], x3 +ld1r{v18.4h}, [x1] +st1 {v0.h}[0], [x1], x3 +ld1r{v19.4h}, [x1] +st1 {v0.h}[0], [x1], x3 +ins v16.d[1], v17.d[0] +ins v18.d[1], v19.d[0] +mov x3, x0 +srshr
[libav-devel] [PATCH 11/19] aarch64: vp8: Fix a typo in a comment
---
 libavcodec/aarch64/vp8dsp_neon.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index c19ab0d..2b5b049 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -743,7 +743,7 @@ endfunc
 
 
 // note: worst case sum of all 6-tap filter values * 255 is 0x7f80 so 16 bit
-// arithmatic can be used to apply filters
+// arithmetic can be used to apply filters
 const subpel_filters, align=4
 .short 0, 6, 123, 12, 1, 0, 0, 0
 .short 2, 11, 108, 36, 8, 1, 0, 0
-- 
2.7.4
[libav-devel] [PATCH 16/19] arm: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2
This makes it similar to put_epel16_v6, and gives a 10-25% speedup of this function. Before: Cortex A7 A8 A9 A53 A72 vp8_put_epel16_h6v6_neon:3058.0 2218.5 2459.8 2183.0 1572.2 After: vp8_put_epel16_h6v6_neon:2670.8 1934.2 2244.4 1729.4 1503.9 --- libavcodec/arm/vp8dsp_neon.S | 41 + 1 file changed, 13 insertions(+), 28 deletions(-) diff --git a/libavcodec/arm/vp8dsp_neon.S b/libavcodec/arm/vp8dsp_neon.S index f43b4f7..b707d19 100644 --- a/libavcodec/arm/vp8dsp_neon.S +++ b/libavcodec/arm/vp8dsp_neon.S @@ -773,23 +773,6 @@ endfunc vqrshrun.s16\d1, q14, #7 .endm -.macro vp8_epel8_v6d0, s0, s1, s2, s3, s4, s5 -vmovl.u8q10, \s2 -vmovl.u8q11, \s3 -vmovl.u8q9, \s1 -vmovl.u8q12, \s4 -vmovl.u8q8, \s0 -vmovl.u8q13, \s5 -vmul.u16q10, q10, d0[2] -vmul.u16q11, q11, d0[3] -vmls.u16q10, q9, d0[1] -vmls.u16q11, q12, d1[0] -vmla.u16q10, q8, d0[0] -vmla.u16q11, q13, d1[1] -vqadd.s16 q11, q10, q11 -vqrshrun.s16\d0, q11, #7 -.endm - .macro vp8_epel8_v6_y2 d0, d1, s0, s1, s2, s3, s4, s5, s6 vmovl.u8q10, \s0 vmovl.u8q11, \s3 @@ -909,12 +892,12 @@ function ff_put_vp8_epel16_h6v6_neon, export=1 sub r2, r2, r3, lsl #1 sub r2, r2, #2 push{r4,lr} -vpush {d8-d9} +vpush {d8-d15} @ first pass (horizontal): -ldr r4, [sp, #28] @ mx +ldr r4, [sp, #64+8+4] @ mx movrel lr, subpel_filters-16 -ldr r12, [sp, #24] @ h +ldr r12, [sp, #64+8+0] @ h add r4, lr, r4, lsl #4 sub sp, sp, #336+16 vld1.16 {q0}, [r4,:128] @@ -931,9 +914,9 @@ function ff_put_vp8_epel16_h6v6_neon, export=1 bne 1b @ second pass (vertical): -ldr r4, [sp, #336+16+32] @ my +ldr r4, [sp, #336+16+64+8+8] @ my movrel lr, subpel_filters-16 -ldr r12, [sp, #336+16+24] @ h +ldr r12, [sp, #336+16+64+8+0] @ h add r4, lr, r4, lsl #4 add lr, sp, #15 vld1.16 {q0}, [r4,:128] @@ -941,18 +924,20 @@ function ff_put_vp8_epel16_h6v6_neon, export=1 2: vld1.8 {d2-d5}, [lr,:128]! vld1.8 {d6-d9}, [lr,:128]! -vld1.8 {d28-d31},[lr,:128] -sub lr, lr, #48 +vld1.8 {d10-d13},[lr,:128]! 
+vld1.8 {d14-d15},[lr,:128] +sub lr, lr, #64 -vp8_epel8_v6d2, d2, d4, d6, d8, d28, d30 -vp8_epel8_v6d3, d3, d5, d7, d9, d29, d31 +vp8_epel8_v6_y2 d2, d4, d2, d4, d6, d8, d10, d12, d14 +vp8_epel8_v6_y2 d3, d5, d3, d5, d7, d9, d11, d13, d15 vst1.8 {d2-d3}, [r0,:128], r1 -subsr12, r12, #1 +vst1.8 {d4-d5}, [r0,:128], r1 +subsr12, r12, #2 bne 2b add sp, sp, #336+16 -vpop{d8-d9} +vpop{d8-d15} pop {r4,pc} endfunc -- 2.7.4 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
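The changed stack offsets in this patch follow from the larger register save area. A minimal sketch of the arithmetic, assuming the usual 32-bit ARM convention that h, mx and my are passed on the stack in that order (the helper name is hypothetical and not part of the patch):

```c
#include <stddef.h>

/*
 * Hypothetical helper: byte offset from sp to the n-th stack-passed
 * argument (n = 0 for h, 1 for mx, 2 for my) once the prologue has
 * pushed `gprs` 32-bit core registers and `dregs` 64-bit NEON registers.
 */
static size_t stack_arg_offset(size_t gprs, size_t dregs, size_t n)
{
    return dregs * 8 + gprs * 4 + n * 4;
}
```

With push {r4,lr} and vpush {d8-d9} this gives 24, 28 and 32 for h, mx and my; with vpush {d8-d15} it gives 64+8+0, 64+8+4 and 64+8+8, matching the offsets in the diff.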
[libav-devel] [PATCH 17/19] aarch64: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2
This makes it similar to put_epel16_v6, and gives a large speedup on Cortex A53, a minor speedup on A72 and a very minor slowdown on A73. Before: Cortex A53 A72 A73 vp8_put_epel16_h6v6_neon: 2211.4 1586.5 1431.7 After: vp8_put_epel16_h6v6_neon: 1736.9 1522.0 1448.1 --- libavcodec/aarch64/vp8dsp_neon.S | 34 ++ 1 file changed, 10 insertions(+), 24 deletions(-) diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S index 604be8a..139b380 100644 --- a/libavcodec/aarch64/vp8dsp_neon.S +++ b/libavcodec/aarch64/vp8dsp_neon.S @@ -769,23 +769,6 @@ endfunc sqrshrun2 \d0\().16b, v22.8h, #7 .endm -.macro vp8_epel8_v6d0, s0, s1, s2, s3, s4, s5 -uxtl\s2\().8h, \s2\().8b -uxtl\s3\().8h, \s3\().8b -uxtl\s1\().8h, \s1\().8b -uxtl\s4\().8h, \s4\().8b -uxtl\s0\().8h, \s0\().8b -uxtl\s5\().8h, \s5\().8b -mul \s2\().8h, \s2\().8h, v0.h[2] -mul \s3\().8h, \s3\().8h, v0.h[3] -mls \s2\().8h, \s1\().8h, v0.h[1] -mls \s3\().8h, \s4\().8h, v0.h[4] -mla \s2\().8h, \s0\().8h, v0.h[0] -mla \s3\().8h, \s5\().8h, v0.h[5] -sqadd \s3\().8h, \s2\().8h, \s3\().8h -sqrshrun\d0\().8b, \s3\().8h, #7 -.endm - .macro vp8_epel8_v6_y2 d0, d1, s0, s1, s2, s3, s4, s5, s6 uxtl\s0\().8h, \s0\().8b uxtl\s3\().8h, \s3\().8b @@ -942,15 +925,18 @@ function ff_put_vp8_epel16_h6v6_neon, export=1 2: ld1 {v1.8b - v4.8b},[x7], #32 ld1 {v16.8b - v19.8b}, [x7], #32 -ld1 {v20.8b - v23.8b}, [x7] -sub x7, x7, #48 +ld1 {v20.8b - v23.8b}, [x7], #32 +ld1 {v24.8b - v25.8b}, [x7] +sub x7, x7, #64 -vp8_epel8_v6v5, v1, v3, v16, v18, v20, v22 -vp8_epel8_v6v2, v2, v4, v17, v19, v21, v23 -trn1v2.2d, v5.2d, v2.2d +vp8_epel8_v6_y2 v1, v3, v1, v3, v16, v18, v20, v22, v24 +vp8_epel8_v6_y2 v2, v4, v2, v4, v17, v19, v21, v23, v25 +trn1v1.2d, v1.2d, v2.2d +trn1v3.2d, v3.2d, v4.2d -st1 {v2.16b}, [x0], x1 -subsx4, x4, #1 +st1 {v1.16b}, [x0], x1 +st1 {v3.16b}, [x0], x1 +subsx4, x4, #2 b.ne2b add sp, sp, #336+16 -- 2.7.4 ___ libav-devel mailing list libav-devel@libav.org 
[libav-devel] [PATCH 14/19] aarch64: vp8: Port epel4 functions from arm version
Cortex A53A72A73 vp8_put_epel4_h4_c:631.4 291.7 367.8 vp8_put_epel4_h4_neon: 241.0 131.0 155.7 vp8_put_epel4_h4v4_c: 967.5 529.3 667.7 vp8_put_epel4_h4v4_neon: 429.3 241.8 279.7 vp8_put_epel4_h4v6_c: 1374.7 657.5 864.5 vp8_put_epel4_h4v6_neon: 515.5 295.5 334.7 vp8_put_epel4_h6_c:851.0 421.0 486.0 vp8_put_epel4_h6_neon: 321.5 195.0 217.7 vp8_put_epel4_h6v4_c: .3 621.1 781.2 vp8_put_epel4_h6v4_neon: 539.2 328.0 365.3 vp8_put_epel4_h6v6_c: 1561.3 763.3 999.7 vp8_put_epel4_h6v6_neon: 645.5 401.0 434.7 vp8_put_epel4_v4_c:663.8 298.3 357.0 vp8_put_epel4_v4_neon: 116.0 81.5 72.5 vp8_put_epel4_v6_c:870.5 437.0 507.4 vp8_put_epel4_v6_neon: 147.7 108.8 92.0 --- libavcodec/aarch64/vp8dsp_init_aarch64.c | 10 ++ libavcodec/aarch64/vp8dsp_neon.S | 284 +++ 2 files changed, 294 insertions(+) diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c b/libavcodec/aarch64/vp8dsp_init_aarch64.c index 1878d8e..478f849 100644 --- a/libavcodec/aarch64/vp8dsp_init_aarch64.c +++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c @@ -34,6 +34,7 @@ VP8_LF(neon); VP8_EPEL(16, neon); VP8_EPEL(8, neon); +VP8_EPEL(4, neon); av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp) @@ -55,6 +56,15 @@ av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp) dsp->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_neon; dsp->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_neon; dsp->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_neon; + +dsp->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_neon; +dsp->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_neon; +dsp->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_neon; +dsp->put_vp8_epel_pixels_tab[2][1][1] = ff_put_vp8_epel4_h4v4_neon; +dsp->put_vp8_epel_pixels_tab[2][1][2] = ff_put_vp8_epel4_h6v4_neon; +dsp->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_neon; +dsp->put_vp8_epel_pixels_tab[2][2][1] = ff_put_vp8_epel4_h4v6_neon; +dsp->put_vp8_epel_pixels_tab[2][2][2] = ff_put_vp8_epel4_h6v6_neon; } av_cold void 
ff_vp8dsp_init_aarch64(VP8DSPContext *dsp) diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S index c5badc4..7fe2466 100644 --- a/libavcodec/aarch64/vp8dsp_neon.S +++ b/libavcodec/aarch64/vp8dsp_neon.S @@ -1225,3 +1225,287 @@ function ff_put_vp8_epel8_h6v4_neon, export=1 add sp, sp, #168+16 ret endfunc + +function ff_put_vp8_epel4_v6_neon, export=1 +sub x2, x2, x3, lsl #1 + +movrel x7, subpel_filters, -16 +add x6, x7, w6, uxtw #4 +ld1 {v0.8h},[x6] +1: +ld1r{v2.2s},[x2], x3 +ld1r{v3.2s},[x2], x3 +ld1r{v4.2s},[x2], x3 +ld1r{v5.2s},[x2], x3 +ld1r{v6.2s},[x2], x3 +ld1r{v7.2s},[x2], x3 +ld1r{v28.2s}, [x2] +sub x2, x2, x3, lsl #2 +ld1 {v2.s}[1], [x2], x3 +ld1 {v3.s}[1], [x2], x3 +ld1 {v4.s}[1], [x2], x3 +ld1 {v5.s}[1], [x2], x3 +ld1 {v6.s}[1], [x2], x3 +ld1 {v7.s}[1], [x2], x3 +ld1 {v28.s}[1], [x2] +sub x2, x2, x3, lsl #2 + +vp8_epel8_v6_y2 v2, v3, v2, v3, v4, v5, v6, v7, v28 + +st1 {v2.s}[0], [x0], x1 +st1 {v3.s}[0], [x0], x1 +st1 {v2.s}[1], [x0], x1 +st1 {v3.s}[1], [x0], x1 +subsw4, w4, #4 +b.ne1b + +ret +endfunc + +function ff_put_vp8_epel4_h6_neon, export=1 +sub x2, x2, #2 + +movrel x7, subpel_filters, -16 +add x5, x7, w5, uxtw #4 +ld1 {v0.8h}, [x5] +1: +ld1 {v2.8b,v3.8b}, [x2], x3 +vp8_epel8_h6v2, v2, v3 +st1 {v2.s}[0], [x0], x1 +subsw4, w4, #1 +b.ne1b + +ret +endfunc + +function ff_put_vp8_epel4_h6v6_neon, export=1 +sub x2, x2, x3, lsl #1 +sub x2, x2, #2 + +movrel x7, subpel_filters, -16 +add x5, x7, w5, uxtw #4 +ld1 {v0.8h}, [x5] + +sub sp, sp, #52 +add w8, w4, #5 +mov x9, sp +1: +ld1 {v2.8b,v3.8b}, [x2], x3 +vp8_epel8_h6v2, v2, v3 +st1 {v2.s}[0], [x9], #4 +subsw8, w8, #1 +b.ne1b + +add x6,
[libav-devel] [PATCH 07/19] vp8dsp: Move the aarch64 dsp init call into alphabetical order
---
 libavcodec/vp8dsp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/vp8dsp.c b/libavcodec/vp8dsp.c
index 3c8d1c8..ac9a6af 100644
--- a/libavcodec/vp8dsp.c
+++ b/libavcodec/vp8dsp.c
@@ -679,14 +679,14 @@ av_cold void ff_vp78dsp_init(VP8DSPContext *dsp)
     VP78_BILINEAR_MC_FUNC(1, 8);
     VP78_BILINEAR_MC_FUNC(2, 4);
 
+    if (ARCH_AARCH64)
+        ff_vp78dsp_init_aarch64(dsp);
     if (ARCH_ARM)
         ff_vp78dsp_init_arm(dsp);
     if (ARCH_PPC)
         ff_vp78dsp_init_ppc(dsp);
     if (ARCH_X86)
         ff_vp78dsp_init_x86(dsp);
-    if (ARCH_AARCH64)
-        ff_vp78dsp_init_aarch64(dsp);
 }
 
 #if CONFIG_VP7_DECODER
@@ -741,11 +741,11 @@ av_cold void ff_vp8dsp_init(VP8DSPContext *dsp)
     dsp->vp8_v_loop_filter_simple = vp8_v_loop_filter_simple_c;
     dsp->vp8_h_loop_filter_simple = vp8_h_loop_filter_simple_c;
 
+    if (ARCH_AARCH64)
+        ff_vp8dsp_init_aarch64(dsp);
     if (ARCH_ARM)
         ff_vp8dsp_init_arm(dsp);
     if (ARCH_X86)
         ff_vp8dsp_init_x86(dsp);
-    if (ARCH_AARCH64)
-        ff_vp8dsp_init_aarch64(dsp);
 }
 #endif /* CONFIG_VP8_DECODER */
-- 
2.7.4
[libav-devel] [PATCH 10/19] aarch64: vp8: Reorder the function pointer inits to match the arm original
--- libavcodec/aarch64/vp8dsp_init_aarch64.c | 8 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c b/libavcodec/aarch64/vp8dsp_init_aarch64.c index 3fb254a..da54efd 100644 --- a/libavcodec/aarch64/vp8dsp_init_aarch64.c +++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c @@ -46,10 +46,10 @@ av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp) dsp->put_vp8_epel_pixels_tab[0][2][2] = ff_put_vp8_epel16_h6v6_neon; dsp->put_vp8_epel_pixels_tab[1][0][0] = ff_put_vp8_pixels8_neon; -dsp->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_neon; -dsp->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_neon; -dsp->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_neon; dsp->put_vp8_epel_pixels_tab[1][1][1] = ff_put_vp8_epel8_h4v4_neon; +dsp->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_neon; +dsp->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_neon; +dsp->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_neon; } av_cold void ff_vp8dsp_init_aarch64(VP8DSPContext *dsp) @@ -62,8 +62,8 @@ av_cold void ff_vp8dsp_init_aarch64(VP8DSPContext *dsp) dsp->vp8_idct_dc_add= ff_vp8_idct_dc_add_neon; dsp->vp8_idct_dc_add4y = ff_vp8_idct_dc_add4y_neon; -dsp->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_neon; dsp->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_neon; +dsp->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_neon; dsp->vp8_v_loop_filter8uv = ff_vp8_v_loop_filter8uv_neon; dsp->vp8_h_loop_filter8uv = ff_vp8_h_loop_filter8uv_neon; -- 2.7.4 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH 05/19] aarch64: vp8: Fix linking for iOS
The mach-o relocations don't allow a negative offset to a symbol; use the third movrel parameter to handle this issue transparently. --- libavcodec/aarch64/vp8dsp_neon.S | 14 +++--- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S index 14a9d11..eb22c42 100644 --- a/libavcodec/aarch64/vp8dsp_neon.S +++ b/libavcodec/aarch64/vp8dsp_neon.S @@ -759,7 +759,7 @@ function ff_put_vp8_epel16_v6_neon, export=1 sxtwx4, w4 sxtwx6, w6 -movrel x17, subpel_filters-16 +movrel x17, subpel_filters, -16 add x6, x17, x6, lsl #4 // y ld1 {v0.8h}, [x6] 1: @@ -788,7 +788,7 @@ function ff_put_vp8_epel16_h6_neon, export=1 sxtwx5, w5 // x // first pass (horizontal): -movrel x17, subpel_filters-16 +movrel x17, subpel_filters, -16 add x5, x17, x5, lsl #4 // x ld1 {v0.8h}, [x5] 1: @@ -807,7 +807,7 @@ function ff_put_vp8_epel16_h6v6_neon, export=1 sub x2, x2, #2 // first pass (horizontal): -movrel x17, subpel_filters-16 +movrel x17, subpel_filters, -16 sxtwx5, w5 // x add x16, x17, x5, lsl #4 // x sub sp, sp, #336+16 @@ -854,7 +854,7 @@ function ff_put_vp8_epel8_h6v6_neon, export=1 sxtwx4, w4 // first pass (horizontal): -movrel x17, subpel_filters-16 +movrel x17, subpel_filters, -16 sxtwx5, w5 add x5, x17, x5, lsl #4 // x sub sp, sp, #168+16 @@ -900,7 +900,7 @@ function ff_put_vp8_epel8_h4v6_neon, export=1 sxtwx4, w4 // first pass (horizontal): -movrel x17, subpel_filters-16 +movrel x17, subpel_filters, -16 sxtwx5, w5 add x5, x17, x5, lsl #4 // x sub sp, sp, #168+16 @@ -947,7 +947,7 @@ function ff_put_vp8_epel8_h4v4_neon, export=1 // first pass (horizontal): -movrel x17, subpel_filters-16 +movrel x17, subpel_filters, -16 sxtwx5, w5 add x5, x17, x5, lsl #4 // x sub sp, sp, #168+16 @@ -992,7 +992,7 @@ function ff_put_vp8_epel8_h6v4_neon, export=1 // first pass (horizontal): -movrel x17, subpel_filters-16 +movrel x17, subpel_filters, -16 sxtwx5, w5 add x5, x17, x5, lsl #4 // x sub sp, sp, #168+16 -- 2.7.4 ___ 
[libav-devel] [PATCH 02/19] aarch64: vp8: Fix the include guard
From: Carl Eugen Hoyos

---
 libavcodec/aarch64/vp8dsp.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp.h b/libavcodec/aarch64/vp8dsp.h
index 8a0c8fb..40d0cae 100644
--- a/libavcodec/aarch64/vp8dsp.h
+++ b/libavcodec/aarch64/vp8dsp.h
@@ -16,8 +16,8 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
-#ifndef AVCODEC_ARM_VP8DSP_H
-#define AVCODEC_ARM_VP8DSP_H
+#ifndef AVCODEC_AARCH64_VP8DSP_H
+#define AVCODEC_AARCH64_VP8DSP_H
 
 #include "libavcodec/vp8dsp.h"
 
@@ -67,4 +67,4 @@
     VP8_MC(epel ## w ## _h4v6, opt);                                    \
     VP8_MC(epel ## w ## _h6v6, opt)
 
-#endif /* AVCODEC_ARM_VP8DSP_H */
+#endif /* AVCODEC_AARCH64_VP8DSP_H */
-- 
2.7.4
[libav-devel] [PATCH 09/19] aarch64: vp8: Move the vp8dsp makefile entries to the right places
Even if NEON is disabled, the init functions should be built, since they are called whenever ARCH_AARCH64 is set.

These functions are part of a generic DSP subsystem, not tied directly to one decoder. (They should be built if the vp7 decoder is enabled, even if the vp8 decoder is disabled.)
---
 libavcodec/aarch64/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile
index 2555044..7228eae 100644
--- a/libavcodec/aarch64/Makefile
+++ b/libavcodec/aarch64/Makefile
@@ -11,6 +11,7 @@ OBJS-$(CONFIG_MDCT)                     += aarch64/mdct_init.o
 OBJS-$(CONFIG_MPEGAUDIODSP)             += aarch64/mpegaudiodsp_init.o
 OBJS-$(CONFIG_NEON_CLOBBER_TEST)        += aarch64/neontest.o
 OBJS-$(CONFIG_VIDEODSP)                 += aarch64/videodsp_init.o
+OBJS-$(CONFIG_VP8DSP)                   += aarch64/vp8dsp_init_aarch64.o
 
 # decoders/encoders
 OBJS-$(CONFIG_DCA_DECODER)              += aarch64/dcadsp_init.o
@@ -39,13 +40,12 @@ NEON-OBJS-$(CONFIG_HPELDSP)             += aarch64/hpeldsp_neon.o
 NEON-OBJS-$(CONFIG_IMDCT15)             += aarch64/imdct15_neon.o
 NEON-OBJS-$(CONFIG_MDCT)                += aarch64/mdct_neon.o
 NEON-OBJS-$(CONFIG_MPEGAUDIODSP)        += aarch64/mpegaudiodsp_neon.o
+NEON-OBJS-$(CONFIG_VP8DSP)              += aarch64/vp8dsp_neon.o
 
 # decoders/encoders
 NEON-OBJS-$(CONFIG_DCA_DECODER)         += aarch64/dcadsp_neon.o               \
                                            aarch64/synth_filter_neon.o
 NEON-OBJS-$(CONFIG_VORBIS_DECODER)      += aarch64/vorbisdsp_neon.o
-NEON-OBJS-$(CONFIG_VP8DSP)              += aarch64/vp8dsp_init_aarch64.o       \
-                                           aarch64/vp8dsp_neon.o
 NEON-OBJS-$(CONFIG_VP9_DECODER)         += aarch64/vp9itxfm_neon.o             \
                                            aarch64/vp9lpf_neon.o               \
                                            aarch64/vp9mc_neon.o
-- 
2.7.4
[libav-devel] [PATCH 06/19] aarch64: vp8: Use the proper aarch64 form for conditional branches
The previous form also does seem to assemble on current tools, but I think it might fail on some older aarch64 tools. --- libavcodec/aarch64/vp8dsp_neon.S | 28 ++-- 1 file changed, 14 insertions(+), 14 deletions(-) diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S index eb22c42..c19ab0d 100644 --- a/libavcodec/aarch64/vp8dsp_neon.S +++ b/libavcodec/aarch64/vp8dsp_neon.S @@ -581,7 +581,7 @@ function ff_put_vp8_pixels16_neon, export=1 st1 {v1.16b}, [x0], x1 st1 {v2.16b}, [x0], x1 st1 {v3.16b}, [x0], x1 -bgt 1b +b.gt1b ret endfunc @@ -596,7 +596,7 @@ function ff_put_vp8_pixels8_neon, export=1 st1 {v0.d}[1], [x0], x1 st1 {v1.8b}, [x0], x1 st1 {v1.d}[1], [x0], x1 -bgt 1b +b.gt1b ret endfunc @@ -778,7 +778,7 @@ function ff_put_vp8_epel16_v6_neon, export=1 st1 {v1.1d - v2.1d}, [x0], x1 st1 {v3.1d - v4.1d}, [x0], x1 subsx4, x4, #2 -bne 1b +b.ne1b ret endfunc @@ -797,7 +797,7 @@ function ff_put_vp8_epel16_h6_neon, export=1 st1 {v1.16b}, [x0], x1 subsw4, w4, #1 -bne 1b +b.ne1b ret endfunc @@ -821,7 +821,7 @@ function ff_put_vp8_epel16_h6v6_neon, export=1 vp8_epel16_h6 v1, v1, v2 st1 {v1.16b}, [x7], #16 subsx16, x16, #1 -bne 1b +b.ne1b // second pass (vertical): @@ -842,7 +842,7 @@ function ff_put_vp8_epel16_h6v6_neon, export=1 st1 {v2.16b}, [x0], x1 subsx4, x4, #1 -bne 2b +b.ne2b add sp, sp, #336+16 ret @@ -869,7 +869,7 @@ function ff_put_vp8_epel8_h6v6_neon, export=1 st1 {v1.8b}, [x7], #8 subsx16, x16, #1 -bne 1b +b.ne1b // second pass (vertical): sxtwx6, w6 @@ -888,7 +888,7 @@ function ff_put_vp8_epel8_h6v6_neon, export=1 st1 {v1.8b}, [x0], x1 st1 {v2.8b}, [x0], x1 subsx4, x4, #2 -bne 2b +b.ne2b add sp, sp, #168+16 ret @@ -915,7 +915,7 @@ function ff_put_vp8_epel8_h4v6_neon, export=1 st1 {v1.8b}, [x7], #8 subsx16, x16, #1 -bne 1b +b.ne1b // second pass (vertical): sxtwx6, w6 @@ -934,7 +934,7 @@ function ff_put_vp8_epel8_h4v6_neon, export=1 st1 {v1.8b}, [x0], x1 st1 {v2.8b}, [x0], x1 subsx4, x4, #2 -bne 2b +b.ne2b add sp, sp, #168+16 ret @@ 
-962,7 +962,7 @@ function ff_put_vp8_epel8_h4v4_neon, export=1 st1 {v1.8b}, [x7], #8 subsx16, x16, #1 -bne 1b +b.ne1b // second pass (vertical): sxtwx6, w6 @@ -979,7 +979,7 @@ function ff_put_vp8_epel8_h4v4_neon, export=1 st1 {v1.d}[0], [x0], x1 st1 {v1.d}[1], [x0], x1 subsx4, x4, #2 -bne 2b +b.ne2b add sp, sp, #168+16 ret @@ -1007,7 +1007,7 @@ function ff_put_vp8_epel8_h6v4_neon, export=1 st1 {v1.8b}, [x7], #8 subsx16, x16, #1 -bne 1b +b.ne1b // second pass (vertical): sxtwx6, w6 @@ -1024,7 +1024,7 @@ function ff_put_vp8_epel8_h6v4_neon, export=1 st1 {v1.d}[0], [x0], x1 st1 {v1.d}[1], [x0], x1 subsx4, x4, #2 -bne 2b +b.ne2b add sp, sp, #168+16 ret -- 2.7.4 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH 08/19] aarch64: vp8: Remove superfluous includes
--- libavcodec/aarch64/vp8dsp_init_aarch64.c | 4 1 file changed, 4 deletions(-) diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c b/libavcodec/aarch64/vp8dsp_init_aarch64.c index f93bcfa..3fb254a 100644 --- a/libavcodec/aarch64/vp8dsp_init_aarch64.c +++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c @@ -17,10 +17,6 @@ */ #include -#include -#include -#include -#include #include "libavutil/attributes.h" #include "libavutil/aarch64/cpu.h" -- 2.7.4 ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] avio: Do not flush the buffer if a constant packet size is requested
On Thu, 31 Jan 2019, Luca Barbato wrote:

> ---
>  libavformat/aviobuf.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/libavformat/aviobuf.c b/libavformat/aviobuf.c
> index 98e35f776c..3c882d6bdb 100644
> --- a/libavformat/aviobuf.c
> +++ b/libavformat/aviobuf.c
> @@ -244,7 +244,8 @@ void avio_write(AVIOContext *s, const unsigned char *buf, int size)
>
>  void avio_flush(AVIOContext *s)
>  {
> -    flush_buffer(s);
> +    if (!s->max_packet_size || s->buf_ptr - s->buffer >= s->max_packet_size)
> +        flush_buffer(s);
>      s->must_flush = 0;
>  }
>
> --
> 2.12.2

You're not providing any explanation as to why we should do this. And I'm fairly sure that this patch breaks the RTP muxer when sending over plain UDP.

// Martin
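To make the effect of the proposed condition concrete, here is a minimal sketch of it in isolation. The struct is a simplified stand-in with only the fields the condition touches, not the real AVIOContext, and all sketch_ names are hypothetical:

```c
#include <stddef.h>

/* Simplified stand-in for the relevant AVIOContext fields (assumption). */
struct sketch_io {
    unsigned char buffer[64];
    unsigned char *buf_ptr;   /* current write position inside buffer */
    int max_packet_size;      /* 0 means no fixed packet size requested */
    int must_flush;
    int flush_count;          /* counts real flushes, for illustration only */
};

/* Stand-in for flush_buffer(): records the flush and empties the buffer. */
static void sketch_flush_buffer(struct sketch_io *s)
{
    s->flush_count++;
    s->buf_ptr = s->buffer;
}

/* Same condition as in the quoted patch. */
static void sketch_avio_flush(struct sketch_io *s)
{
    if (!s->max_packet_size || s->buf_ptr - s->buffer >= s->max_packet_size)
        sketch_flush_buffer(s);
    s->must_flush = 0;
}
```

With max_packet_size set to 32 and only 10 bytes buffered, a call to sketch_avio_flush() leaves those 10 bytes in the buffer. A caller that relies on an explicit flush to send a short packet immediately would no longer get that, which may be the kind of breakage the reply has in mind for RTP over UDP.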
Re: [libav-devel] [PATCH 1/1] h264/x86: sign extend int stride in deblock functions
On Sun, 27 Jan 2019, Janne Grunau wrote:

> Fixes checkasm errors after adding the h264 deblock tests.
> ---
>  libavcodec/x86/h264_deblock.asm       | 8 ++++++++
>  libavcodec/x86/h264_deblock_10bit.asm | 9 +++++++++
>  2 files changed, 17 insertions(+)

Ok with me.

Yes, changing the prototypes to use ptrdiff_t instead of int would be good, but I think it's better to get the tests back to green than to block the fix by demanding the larger refactoring right now.

// Martin
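For context on why the sign extension matters: as far as I know, the common 64-bit calling conventions leave the upper 32 bits of a register holding an int argument unspecified, so assembly that uses the full 64-bit register for address arithmetic has to extend the value itself. A minimal sketch in C, assuming a 32-bit int and, for the first function, that the upper half happens to be zero (function names are illustrative only):

```c
#include <stdint.h>

/* What the asm would compute with if it used the full 64-bit register
 * and the upper 32 bits happened to be zero (one possible garbage value). */
static int64_t stride_zero_extended(int stride)
{
    return (int64_t)(uint32_t)stride;
}

/* What the asm computes with after an explicit sign extension. */
static int64_t stride_sign_extended(int stride)
{
    return (int64_t)stride;
}
```

For a stride of -16 the first function yields 4294967280 while the second yields -16; for positive strides both agree, which is why such bugs only show up once a test exercises negative strides or dirty upper bits. Using ptrdiff_t in the prototypes would make the caller responsible for passing a full-width value.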
Re: [libav-devel] [PATCH 2/4] checkasm/h264: add loop filter tests
On Tue, 1 Jan 2019, Janne Grunau wrote:

> ---
>  tests/checkasm/h264dsp.c | 124 +++
>  1 file changed, 124 insertions(+)

This newly added test seems to fail on macOS. I haven't debugged through it properly yet, but disabling the use of checkasm_checked_call seems to make it pass.

// Martin
Re: [libav-devel] [PATCH] libopenh264dec: Use a newer decoding entry point function
On Sat, 26 Jan 2019, Janne Grunau wrote:

> On 2019-01-25 10:39:13 +0200, Martin Storsjö wrote:
>> The "new" entry point actually has existed since OpenH264 1.4 in 2015,
>> but with B-frames, this entry point is essential for actually getting
>> the right frames returned and reordered.
>>
>> The name of this function, DecodeFrameNoDelay, is rather backwards
>> considering that it doesn't return the latest decoded frame immediately,
>> but actually does proper delaying and reordering of frames, but it's
>> the recommended decoding entry point.
>
> The commit message is hard to parse. Something along the lines of the
> text below is imho easier to understand:
>
> | The "new" entry point actually has existed since OpenH264 1.4 in
> | 2015 and is the recommended decoding entry point.
> |
> | The name of this function, DecodeFrameNoDelay, is rather backwards
> | considering that it doesn't return the latest decoded frame immediately,
> | but actually does proper delaying and reordering of frames.

Thanks! That's indeed much more understandable.

// Martin
[libav-devel] [PATCH] libopenh264dec: Use a newer decoding entry point function
The "new" entry point actually has existed since OpenH264 1.4 in 2015, but with B-frames, this entry point is essential for actually getting the right frames returned and reordered.

The name of this function, DecodeFrameNoDelay, is rather backwards considering that it doesn't return the latest decoded frame immediately, but actually does proper delaying and reordering of frames, but it's the recommended decoding entry point.
---
 libavcodec/libopenh264dec.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/libavcodec/libopenh264dec.c b/libavcodec/libopenh264dec.c
index 60e4b028ec..6adf984112 100644
--- a/libavcodec/libopenh264dec.c
+++ b/libavcodec/libopenh264dec.c
@@ -109,10 +109,18 @@ static int svc_decode_frame(AVCodecContext *avctx, void *data,
 #endif
     } else {
         info.uiInBsTimeStamp = avpkt->pts;
+#if OPENH264_VER_AT_LEAST(1, 4)
+        // Contrary to the name, DecodeFrameNoDelay actually does buffering
+        // and reordering of frames, and is the recommended decoding entry
+        // point since 1.4. This is essential for successfully decoding
+        // B-frames.
+        state = (*s->decoder)->DecodeFrameNoDelay(s->decoder, avpkt->data, avpkt->size, ptrs, &info);
+#else
         state = (*s->decoder)->DecodeFrame2(s->decoder, avpkt->data, avpkt->size, ptrs, &info);
+#endif
     }
     if (state != dsErrorFree) {
-        av_log(avctx, AV_LOG_ERROR, "DecodeFrame2 failed\n");
+        av_log(avctx, AV_LOG_ERROR, "DecodeFrame failed\n");
         return AVERROR_UNKNOWN;
     }
     if (info.iBufferStatus != 1) {
-- 
2.17.2 (Apple Git-113)
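The compile-time guard used in this patch can be sketched as follows. The macro and the version constants below are illustrative stand-ins with a SKETCH_ prefix; they are assumptions for the example, not the definitions from the OpenH264 or libavcodec headers:

```c
/* Hypothetical version constants; a real build would take these from
 * the library's own version header. */
#define SKETCH_MAJOR 1
#define SKETCH_MINOR 7

/* True if the (assumed) library version is at least maj.min. */
#define SKETCH_VER_AT_LEAST(maj, min)                     \
    ((SKETCH_MAJOR > (maj)) ||                            \
     (SKETCH_MAJOR == (maj) && SKETCH_MINOR >= (min)))

enum sketch_entry {
    SKETCH_DECODE_FRAME2,
    SKETCH_DECODE_FRAME_NO_DELAY
};

/* Picks the decoding entry point at compile time, as the patch does. */
static enum sketch_entry sketch_entry_point(void)
{
#if SKETCH_VER_AT_LEAST(1, 4)
    return SKETCH_DECODE_FRAME_NO_DELAY;
#else
    return SKETCH_DECODE_FRAME2;
#endif
}
```

Selecting the call at compile time keeps the wrapper building against library versions older than 1.4, at the cost of B-frame reordering not working there.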
[libav-devel] [PATCH] arm: Create proper .rdata sections for COFF
As .rodata isn't one of the default created sections for COFF, it was created as a read-write data section. By using the default .rdata section name for COFF, it automatically becomes a read-only data section. The existing ".section .rodata" works as intended for ELF though.

This is based on an original patch and diagnosis by Tom Tan.
---
 libavutil/aarch64/asm.S | 2 ++
 libavutil/arm/asm.S     | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/libavutil/aarch64/asm.S b/libavutil/aarch64/asm.S
index 15b55d57d2..bf5c1b7ee1 100644
--- a/libavutil/aarch64/asm.S
+++ b/libavutil/aarch64/asm.S
@@ -63,6 +63,8 @@ ELF     .size   \name, . - \name
     .else
         .section        .rodata
     .endif
+#elif defined(_WIN32)
+        .section        .rdata
 #elif !defined(__MACH__)
         .section        .rodata
 #else
diff --git a/libavutil/arm/asm.S b/libavutil/arm/asm.S
index 62ce493180..9842d03bc0 100644
--- a/libavutil/arm/asm.S
+++ b/libavutil/arm/asm.S
@@ -125,6 +125,8 @@ ELF     .size   \name, . - \name
     .else
         .section        .rodata
     .endif
+#elif defined(_WIN32)
+        .section        .rdata
 #elif !defined(__MACH__)
         .section        .rodata
 #else
-- 
2.17.2 (Apple Git-113)
[libav-devel] [PATCH] arm: Mark .rodata section as read only in COFF object file
From: Tom Tan

The .rodata directive in GAS assembly produces a read/write .rodata section by default for COFF object files (the object file format for Windows), but a read-only one for ELF. This change explicitly marks it as read-only for COFF.

Signed-off-by: Martin Storsjö
---
 libavutil/aarch64/asm.S | 2 ++
 libavutil/arm/asm.S     | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/libavutil/aarch64/asm.S b/libavutil/aarch64/asm.S
index 15b55d57d2..65341d58cd 100644
--- a/libavutil/aarch64/asm.S
+++ b/libavutil/aarch64/asm.S
@@ -63,6 +63,8 @@ ELF     .size   \name, . - \name
     .else
         .section        .rodata
     .endif
+#elif defined(_WIN32)
+        .section        .rodata, "r"
 #elif !defined(__MACH__)
         .section        .rodata
 #else
diff --git a/libavutil/arm/asm.S b/libavutil/arm/asm.S
index 62ce493180..06c3413489 100644
--- a/libavutil/arm/asm.S
+++ b/libavutil/arm/asm.S
@@ -125,6 +125,8 @@ ELF     .size   \name, . - \name
     .else
         .section        .rodata
     .endif
+#elif defined(_WIN32)
+        .section        .rodata, "r"
 #elif !defined(__MACH__)
         .section        .rodata
 #else
-- 
2.17.2 (Apple Git-113)
Re: [libav-devel] Using Co-authored-by instead of Signed-off-by
On Wed, 9 Jan 2019, Luca Barbato wrote:

> Since the start of the project we used Signed-off-by to signal that a
> patch had been edited.

I'd like to point out that you might have had this interpretation of it and used it in this way, but it hasn't been a written project-wide rule that this is the intended interpretation in this context. I've used it as a general "I approve of" mark.

> Currently git (and github/gitlab) has support for `Co-authored-by:`. It
> isn't as nice as `Signed-off-by:` since there isn't an easy shorthand
> such as `-s` that I know of, but it could possibly be nice to use.

That sounds like a much better thing to use, especially as Signed-off-by has different interpretations in different projects.

// Martin
Re: [libav-devel] [PATCH 4/4] h264/aarch64: add intra loop filter neon asm
On Tue, 1 Jan 2019, Janne Grunau wrote:

> Add my neon asm from x264 relicensed under the LGPL 2.1 or later.
> Ported (x264 uses nv12 chroma) and optimized.
>
> Cycle count for checkasm --bench on a Snapdragon 820e:
> h264_h_loop_filter_luma_intra_8bpp_c:               60.0
> h264_h_loop_filter_luma_intra_8bpp_neon:            54.2
> h264_v_loop_filter_luma_intra_8bpp_c:              148.3
> h264_v_loop_filter_luma_intra_8bpp_neon:            73.8
> h264_h_loop_filter_chroma_intra_8bpp_c:             27.8
> h264_h_loop_filter_chroma_intra_8bpp_neon:          21.4
> h264_h_loop_filter_chroma_mbaff_intra_8bpp_c:       15.8
> h264_h_loop_filter_chroma_mbaff_intra_8bpp_neon:    15.7
> h264_v_loop_filter_chroma_intra_8bpp_c:             45.8
> h264_v_loop_filter_chroma_intra_8bpp_neon:          17.3
> ---
>  libavcodec/aarch64/h264dsp_init_aarch64.c |  16 ++
>  libavcodec/aarch64/h264dsp_neon.S         | 297 ++
>  2 files changed, 313 insertions(+)

LGTM

// Martin
Re: [libav-devel] [PATCH 3/4] h264/aarch64: optimize neon loop filter
On Tue, 1 Jan 2019, Janne Grunau wrote:

> Exit as soon as possible if no filtering will be done.
>
> Improves the checkasm --bench cycle count on a Snapdragon 820e:
> h264_h_loop_filter_luma_8bpp_c:        72.4 ->  72.5
> h264_h_loop_filter_luma_8bpp_neon:     97.1 ->  56.3
> h264_v_loop_filter_luma_8bpp_c:       174.0 -> 173.5
> h264_v_loop_filter_luma_8bpp_neon:     62.9 ->  60.9
> h264_h_loop_filter_chroma_8bpp_c:      30.2 ->  30.3
> h264_h_loop_filter_chroma_8bpp_neon:   51.6 ->  25.7
> h264_v_loop_filter_chroma_8bpp_c:      57.3 ->  57.3
> h264_v_loop_filter_chroma_8bpp_neon:   28.0 ->  24.0
> ---
>  libavcodec/aarch64/h264dsp_neon.S | 33 ++-
>  1 file changed, 19 insertions(+), 14 deletions(-)

LGTM

// Martin
Re: [libav-devel] [PATCH 2/4] checkasm/h264: add loop filter tests
On Tue, 1 Jan 2019, Janne Grunau wrote:

> ---
>  tests/checkasm/h264dsp.c | 124 +++
>  1 file changed, 124 insertions(+)

Looks ok to me

// Martin
Re: [libav-devel] [PATCH 1/4] h264/aarch64: sign extend int stride in loop filter asm
On Tue, 1 Jan 2019, Janne Grunau wrote:

> ---
>  libavcodec/aarch64/h264dsp_neon.S | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/libavcodec/aarch64/h264dsp_neon.S b/libavcodec/aarch64/h264dsp_neon.S
> index 9b4610a4d4..60ffa24500 100644
> --- a/libavcodec/aarch64/h264dsp_neon.S
> +++ b/libavcodec/aarch64/h264dsp_neon.S
> @@ -130,6 +130,7 @@ endfunc
>
>  function ff_h264_h_loop_filter_luma_neon, export=1
>          h264_loop_filter_start
> +        sxtw            x1,  w1
>
>          sub             x0,  x0,  #4
>          ld1             {v6.8B},  [x0], x1
> @@ -210,6 +211,7 @@ endfunc
>
>  function ff_h264_v_loop_filter_chroma_neon, export=1
>          h264_loop_filter_start
> +        sxtw            x1,  w1
>
>          sub             x0,  x0,  x1, lsl #1
>          ld1             {v18.8B}, [x0], x1
> @@ -228,6 +230,7 @@ endfunc
>
>  function ff_h264_h_loop_filter_chroma_neon, export=1
>          h264_loop_filter_start
> +        sxtw            x1,  w1
>
>          sub             x0,  x0,  #2
>          ld1             {v18.S}[0], [x0], x1
> --
> 2.20.1

LGTM

// Martin
Re: [libav-devel] [PATCH 1/2] libavutil: Undeprecate the AVFrame reordered_opaque field
On Fri, 26 Oct 2018, Luca Barbato wrote:

> On 25/10/2018 14:45, Martin Storsjö wrote:
>> This was marked as deprecated (but only in the doxygen, not with an
>> actual deprecation attribute) in 81c623fae05 in 2011, but was
>> undeprecated in ad1ee5fa7.
>> ---
>>  libavutil/frame.h   | 1 -
>>  libavutil/version.h | 2 +-
>>  2 files changed, 1 insertion(+), 2 deletions(-)
>
> The set is probably fine.

Pushed, with a minor adjustment to patch 2/2 to overestimate the buffer size needed, in case a reconfiguration increases the delay.

// Martin
Re: [libav-devel] [PATCH 2/2] libx264: Pass the reordered_opaque field through the encoder
On Thu, 25 Oct 2018, Martin Storsjö wrote:

> libx264 does have a field for opaque data to pass along with frames
> through the encoder, but it is a pointer, while the libavcodec
> reordered_opaque field is an int64_t. Therefore, allocate an array
> within the libx264 wrapper, where reordered_opaque values in flight
> are stored, and pass a pointer to this array to libx264.
>
> Update the public libavcodec documentation for the AVCodecContext
> field to explain this usage, and add a codec capability that allows
> detecting whether an encoder handles this field.
> ---
>  libavcodec/avcodec.h | 12 +++-
>  libavcodec/libx264.c | 31 +--
>  2 files changed, 40 insertions(+), 3 deletions(-)
>
> diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
> index fb8e34e7d5..727e1c411d 100644
> --- a/libavcodec/avcodec.h
> +++ b/libavcodec/avcodec.h
> @@ -899,6 +899,13 @@ typedef struct RcOverride{
>   */
>  #define AV_CODEC_CAP_HYBRID              (1 << 18)
>
> +/**
> + * This codec takes the reordered_opaque field from input AVFrames
> + * and returns it in the corresponding field in AVCodecContext after
> + * encoding.
> + */
> +#define AV_CODEC_CAP_ENCODER_REORDERED_OPAQUE (1 << 19)

This obviously needs a minor bump, I'll add one locally.

// Martin
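The approach described in the quoted commit message (keep the 64-bit values in an array owned by the wrapper and hand the encoder a pointer into that array) can be sketched roughly as below. This is a standalone illustration under stated assumptions: the opaque_ring names are hypothetical, plain calloc/free stand in for the libavutil allocators, and no libx264 calls are made.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical ring of in-flight opaque values owned by the wrapper. */
struct opaque_ring {
    int64_t *slots;
    int nb_slots;
    int next;
};

/* One slot per frame the encoder may hold back, plus the current one. */
static int opaque_ring_init(struct opaque_ring *r, int max_delayed_frames)
{
    r->nb_slots = max_delayed_frames + 1;
    r->next     = 0;
    r->slots    = calloc((size_t)r->nb_slots, sizeof(*r->slots));
    return r->slots ? 0 : -1;
}

/* Store a value and return the pointer to pass to the encoder as opaque. */
static void *opaque_ring_put(struct opaque_ring *r, int64_t value)
{
    int64_t *slot = &r->slots[r->next];
    *slot   = value;
    r->next = (r->next + 1) % r->nb_slots;
    return slot;
}

/* On output, only dereference pointers that lie inside the array. */
static int64_t opaque_ring_get(const struct opaque_ring *r, const void *opaque)
{
    const int64_t *p = opaque;
    if (p && p >= r->slots && p < r->slots + r->nb_slots)
        return *p;
    return 0; /* unexpected pointer: fall back to 0, as the patch does */
}

static void opaque_ring_free(struct opaque_ring *r)
{
    free(r->slots);
    r->slots = NULL;
}
```

A slot is reused after nb_slots further frames have been submitted, so the ring has to be at least as large as the number of frames the encoder can delay. That is presumably why overestimating the size is the safer choice if the delay can grow after a reconfiguration.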
[libav-devel] [PATCH 2/2] libx264: Pass the reordered_opaque field through the encoder
libx264 does have a field for opaque data to pass along with frames through the encoder, but it is a pointer, while the libavcodec reordered_opaque field is an int64_t. Therefore, allocate an array within the libx264 wrapper, where reordered_opaque values in flight are stored, and pass a pointer to this array to libx264. Update the public libavcodec documentation for the AVCodecContext field to explain this usage, and add a codec capability that allows detecting whether an encoder handles this field. --- libavcodec/avcodec.h | 12 +++- libavcodec/libx264.c | 31 +-- 2 files changed, 40 insertions(+), 3 deletions(-) diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h index fb8e34e7d5..727e1c411d 100644 --- a/libavcodec/avcodec.h +++ b/libavcodec/avcodec.h @@ -899,6 +899,13 @@ typedef struct RcOverride{ */ #define AV_CODEC_CAP_HYBRID (1 << 18) +/** + * This codec takes the reordered_opaque field from input AVFrames + * and returns it in the corresponding field in AVCodecContext after + * encoding. + */ +#define AV_CODEC_CAP_ENCODER_REORDERED_OPAQUE (1 << 19) + /** * Pan Scan area. * This specifies the area which should be displayed. @@ -2297,7 +2304,10 @@ typedef struct AVCodecContext { /** * opaque 64-bit number (generally a PTS) that will be reordered and * output in AVFrame.reordered_opaque - * - encoding: unused + * - encoding: Set by libavcodec to the reordered_opaque of the input + * frame corresponding to the last returned packet. Only + * supported by encoders with the + * AV_CODEC_CAP_ENCODER_REORDERED_OPAQUE capability. * - decoding: Set by user. 
*/ int64_t reordered_opaque; diff --git a/libavcodec/libx264.c b/libavcodec/libx264.c index 3dc53aaf38..c852858db8 100644 --- a/libavcodec/libx264.c +++ b/libavcodec/libx264.c @@ -85,6 +85,9 @@ typedef struct X264Context { int noise_reduction; char *x264_params; + +int nb_reordered_opaque, next_reordered_opaque; +int64_t *reordered_opaque; } X264Context; static void X264_log(void *p, int level, const char *fmt, va_list args) @@ -240,6 +243,7 @@ static int X264_frame(AVCodecContext *ctx, AVPacket *pkt, const AVFrame *frame, x264_nal_t *nal; int nnal, i, ret; x264_picture_t pic_out; +int64_t *out_opaque; x264_picture_init( &x4->pic ); x4->pic.img.i_csp = x4->params.i_csp; @@ -259,6 +263,11 @@ static int X264_frame(AVCodecContext *ctx, AVPacket *pkt, const AVFrame *frame, x4->pic.i_pts = frame->pts; +x4->reordered_opaque[x4->next_reordered_opaque] = frame->reordered_opaque; +x4->pic.opaque = &x4->reordered_opaque[x4->next_reordered_opaque]; +x4->next_reordered_opaque++; +x4->next_reordered_opaque %= x4->nb_reordered_opaque; + switch (frame->pict_type) { case AV_PICTURE_TYPE_I: x4->pic.i_type = x4->forced_idr ? 
X264_TYPE_IDR @@ -288,6 +297,15 @@ static int X264_frame(AVCodecContext *ctx, AVPacket *pkt, const AVFrame *frame, pkt->pts = pic_out.i_pts; pkt->dts = pic_out.i_dts; +out_opaque = pic_out.opaque; +if (out_opaque >= x4->reordered_opaque && +out_opaque < &x4->reordered_opaque[x4->nb_reordered_opaque]) { +ctx->reordered_opaque = *out_opaque; +} else { +// Unexpected opaque pointer on picture output +ctx->reordered_opaque = 0; +} + #if FF_API_CODED_FRAME FF_DISABLE_DEPRECATION_WARNINGS switch (pic_out.i_type) { @@ -331,6 +349,7 @@ static av_cold int X264_close(AVCodecContext *avctx) av_freep(&avctx->extradata); av_freep(&x4->sei); +av_freep(&x4->reordered_opaque); if (x4->enc) { x264_encoder_close(x4->enc); @@ -663,6 +682,12 @@ FF_ENABLE_DEPRECATION_WARNINGS cpb_props->max_bitrate = x4->params.rc.i_vbv_max_bitrate * 1000; cpb_props->avg_bitrate = x4->params.rc.i_bitrate * 1000; +x4->nb_reordered_opaque = x264_encoder_maximum_delayed_frames(x4->enc) + 1; +x4->reordered_opaque= av_malloc_array(x4->nb_reordered_opaque, + sizeof(*x4->reordered_opaque)); +if (!x4->reordered_opaque) +return AVERROR(ENOMEM); + return 0; } @@ -850,7 +875,8 @@ AVCodec ff_libx264_encoder = { .init = X264_init, .encode2 = X264_frame, .close= X264_close, -.capabilities = AV_CODEC_CAP_DELAY | AV_CODEC_CAP_AUTO_THREADS, +.capabilities = AV_CODEC_CAP_DELAY | AV_CODEC_CAP_AUTO_THREADS | +AV_CODEC_CAP_ENCODER_REORDERED_OPAQUE, .priv_class = &class, .defaults = x264_defaults, .init_static_data = X264_init_static, @@ -877,7 +903,8 @@ AVCodec ff_libx262_encoder = { .init = X264_init, .encode2 = X264_frame, .close= X264_close, -.capabilities
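The slot-array idea used in the patch above can be sketched outside of libavcodec roughly as follows. This is a minimal standalone illustration under stated assumptions, not the wrapper's actual code: the `OpaqueRing` type and the `opaque_ring_*` functions are hypothetical names, and `max_delayed` stands in for the value the patch obtains from `x264_encoder_maximum_delayed_frames()`.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper type: a ring of int64_t slots, one per frame that may
 * be in flight inside the encoder at the same time. */
typedef struct OpaqueRing {
    int      nb;    /* number of slots */
    int      next;  /* index of the slot handed out next */
    int64_t *slots;
} OpaqueRing;

/* Allocate max_delayed + 1 slots. Returns 0 on success, -1 on failure. */
int opaque_ring_init(OpaqueRing *r, int max_delayed)
{
    if (max_delayed < 0)
        return -1;
    r->nb    = max_delayed + 1;
    r->next  = 0;
    r->slots = calloc((size_t)r->nb, sizeof(*r->slots));
    return r->slots ? 0 : -1;
}

/* Store a value and return the pointer that would travel through the
 * encoder as its opaque per-picture pointer. */
int64_t *opaque_ring_store(OpaqueRing *r, int64_t value)
{
    int64_t *slot = &r->slots[r->next];
    *slot   = value;
    r->next = (r->next + 1) % r->nb;
    return slot;
}

/* Read a value back. A pointer that does not point at one of our slots is
 * rejected: *out is set to 0 and -1 is returned. */
int opaque_ring_load(const OpaqueRing *r, const void *ptr, int64_t *out)
{
    uintptr_t p     = (uintptr_t)ptr;
    uintptr_t begin = (uintptr_t)r->slots;
    uintptr_t end   = (uintptr_t)(r->slots + r->nb);

    if (p < begin || p >= end || (p - begin) % sizeof(*r->slots) != 0) {
        *out = 0;
        return -1;
    }
    *out = *(const int64_t *)ptr;
    return 0;
}

void opaque_ring_free(OpaqueRing *r)
{
    free(r->slots);
    r->slots = NULL;
    r->nb    = r->next = 0;
}
```

With `max_delayed + 1` slots, a slot is only reused after the encoder has had the chance to return the frame that previously occupied it, which is the reason the patch sizes the array that way.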
[libav-devel] [PATCH 1/2] libavutil: Undeprecate the AVFrame reordered_opaque field
This was marked as deprecated (but only in the doxygen, not with an actual deprecation attribute) in 81c623fae05 in 2011, but was undeprecated in ad1ee5fa7.
---
 libavutil/frame.h   | 1 -
 libavutil/version.h | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/libavutil/frame.h b/libavutil/frame.h
index ff3fe46dd6..c7240ebe9b 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -295,7 +295,6 @@ typedef struct AVFrame {
      * that time,
      * the decoder reorders values as needed and sets AVFrame.reordered_opaque
      * to exactly one of the values provided by the user through AVCodecContext.reordered_opaque
-     * @deprecated in favor of pkt_pts
      */
     int64_t reordered_opaque;
 
diff --git a/libavutil/version.h b/libavutil/version.h
index 4a9fffef43..e5fbd4ca81 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -55,7 +55,7 @@
 
 #define LIBAVUTIL_VERSION_MAJOR 56
 #define LIBAVUTIL_VERSION_MINOR 7
-#define LIBAVUTIL_VERSION_MICRO 0
+#define LIBAVUTIL_VERSION_MICRO 1
 
 #define LIBAVUTIL_VERSION_INT AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
                                              LIBAVUTIL_VERSION_MINOR, \
-- 
2.17.1 (Apple Git-112)
[libav-devel] [PATCH] arm: Emit .thumb_func directives
Prior to Xcode 9.3, the clang built-in assembler didn't support altmacro, and gas-preprocessor was used for assembling for arm/darwin. For thumb functions, gas-preprocessor took care of adding the .thumb_func directives, but now that we can assemble without gas-preprocessor, we need to add these directives ourselves.
---
 libavutil/arm/asm.S | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/libavutil/arm/asm.S b/libavutil/arm/asm.S
index e7eea0271f..5207a1a2b8 100644
--- a/libavutil/arm/asm.S
+++ b/libavutil/arm/asm.S
@@ -75,6 +75,12 @@ T .thumb
 ELF .eabi_attribute 25, 1 @ Tag_ABI_align_preserved
 ELF .section .note.GNU-stack,"",%progbits @ Mark stack as non-executable
 
+.macro func_mode name
+#if CONFIG_THUMB && defined(__APPLE__)
+        .thumb_func \name
+#endif
+.endm
+
 .macro function name, export=0, align=2
     .set .Lpic_idx, 0
     .set .Lpic_gp, 0
@@ -98,10 +104,12 @@ FUNC .endfunc
         .global EXTERN_ASM\name
 ELF     .type EXTERN_ASM\name, %function
 FUNC    .func EXTERN_ASM\name
+        func_mode EXTERN_ASM\name
 EXTERN_ASM\name:
     .else
 ELF     .type \name, %function
 FUNC    .func \name
+        func_mode \name
 \name:
     .endif
 .endm
-- 
2.17.1 (Apple Git-112)
Re: [libav-devel] [PATCH] libfdk-aac: Don't use defined() in a #define
On Wed, 12 Sep 2018, Martin Storsjö wrote:

> MSVC expands the preprocessor directives differently, making the version
> check fail in the previous form.
> ---
> I'm pretty sure I've seen a better description of this issue somewhere,
> I don't remember off-hand right now where that was. But I think the gist
> of it was that the previous form was undefined according to the C
> standard, even if GCC and clang handle it in the same way.

This is similar to 5e3f6dc70198426fe0741e3017826b8bf3ee5ad8, which points out that if building with -Wexpansion-to-defined, the compiler (at least clang) would warn about it, clarifying that macro expansion of 'defined' has undefined behaviour.

// Martin
[libav-devel] [PATCH] libfdk-aac: Don't use defined() in a #define
MSVC expands the preprocessor directives differently, making the version check fail in the previous form.
---
I'm pretty sure I've seen a better description of this issue somewhere, I don't remember off-hand right now where that was. But I think the gist of it was that the previous form was undefined according to the C standard, even if GCC and clang handle it in the same way.
---
 libavcodec/libfdk-aacdec.c | 9 ++---
 libavcodec/libfdk-aacenc.c | 9 ++---
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/libavcodec/libfdk-aacdec.c b/libavcodec/libfdk-aacdec.c
index ca70a49ad4..63856232d9 100644
--- a/libavcodec/libfdk-aacdec.c
+++ b/libavcodec/libfdk-aacdec.c
@@ -25,10 +25,13 @@
 #include "avcodec.h"
 #include "internal.h"
 
+#ifdef AACDECODER_LIB_VL0
 #define FDKDEC_VER_AT_LEAST(vl0, vl1) \
-    (defined(AACDECODER_LIB_VL0) && \
-        ((AACDECODER_LIB_VL0 > vl0) || \
-         (AACDECODER_LIB_VL0 == vl0 && AACDECODER_LIB_VL1 >= vl1)))
+    ((AACDECODER_LIB_VL0 > vl0) || \
+     (AACDECODER_LIB_VL0 == vl0 && AACDECODER_LIB_VL1 >= vl1))
+#else
+#define FDKDEC_VER_AT_LEAST(vl0, vl1) 0
+#endif
 
 #if !FDKDEC_VER_AT_LEAST(2, 5) // < 2.5.10
 #define AAC_PCM_MAX_OUTPUT_CHANNELS AAC_PCM_OUTPUT_CHANNELS
diff --git a/libavcodec/libfdk-aacenc.c b/libavcodec/libfdk-aacenc.c
index f71a276403..3b492ef8f4 100644
--- a/libavcodec/libfdk-aacenc.c
+++ b/libavcodec/libfdk-aacenc.c
@@ -26,10 +26,13 @@
 #include "audio_frame_queue.h"
 #include "internal.h"
 
+#ifdef AACENCODER_LIB_VL0
 #define FDKENC_VER_AT_LEAST(vl0, vl1) \
-    (defined(AACENCODER_LIB_VL0) && \
-        ((AACENCODER_LIB_VL0 > vl0) || \
-         (AACENCODER_LIB_VL0 == vl0 && AACENCODER_LIB_VL1 >= vl1)))
+    ((AACENCODER_LIB_VL0 > vl0) || \
+     (AACENCODER_LIB_VL0 == vl0 && AACENCODER_LIB_VL1 >= vl1))
+#else
+#define FDKENC_VER_AT_LEAST(vl0, vl1) 0
+#endif
 
 typedef struct AACContext {
     const AVClass *class;
-- 
2.15.2 (Apple Git-101.1)
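For illustration, the pattern the patch switches to can be sketched in isolation like this. The `DEMOLIB_VL0`/`DEMOLIB_VL1` macros and the `DEMO_*` names are hypothetical stand-ins for the version defines a library header would provide; the point is only that `defined` is evaluated directly by `#ifdef`, never produced by expanding another macro inside an `#if`.

```c
#include <stdio.h>

/* Hypothetical version macros, standing in for what a library header
 * might define. Remove them to exercise the #else branch below. */
#define DEMOLIB_VL0 3
#define DEMOLIB_VL1 0

/* Test for the presence of the version macro with #ifdef, and define the
 * comparison macro in two variants, so that no "defined" token ever
 * results from macro expansion. */
#ifdef DEMOLIB_VL0
#define DEMO_VER_AT_LEAST(vl0, vl1)                  \
    ((DEMOLIB_VL0 > (vl0)) ||                        \
     (DEMOLIB_VL0 == (vl0) && DEMOLIB_VL1 >= (vl1)))
#else
#define DEMO_VER_AT_LEAST(vl0, vl1) 0
#endif

/* The macro can now be used portably in #if conditions. */
#if DEMO_VER_AT_LEAST(2, 5)
#define DEMO_HAS_LEVEL_LIMIT 1
#else
#define DEMO_HAS_LEVEL_LIMIT 0
#endif

/* Report at run time which branch the preprocessor selected. */
int demo_has_level_limit(void)
{
    return DEMO_HAS_LEVEL_LIMIT;
}

void demo_print_features(void)
{
    printf("level_limit option available: %d\n", demo_has_level_limit());
}
```

Since the macro expands to a plain integer expression in both variants, it behaves the same in `#if` directives and in ordinary C expressions.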
[libav-devel] [PATCH 2/3] libfdk-aacdec: Allow setting the new dynamic range control effect setting
This is a new setting in FDK v2.
---
 libavcodec/libfdk-aacdec.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/libavcodec/libfdk-aacdec.c b/libavcodec/libfdk-aacdec.c
index c3d3b70fc9..ca70a49ad4 100644
--- a/libavcodec/libfdk-aacdec.c
+++ b/libavcodec/libfdk-aacdec.c
@@ -51,6 +51,7 @@ typedef struct FDKAACDecContext {
     int drc_level;
     int drc_boost;
     int drc_heavy;
+    int drc_effect;
     int drc_cut;
     int level_limit;
 } FDKAACDecContext;
@@ -77,6 +78,10 @@ static const AVOption fdk_aac_dec_options[] = {
                      OFFSET(drc_heavy), AV_OPT_TYPE_INT, { .i64 = -1}, -1, 1, AD, NULL},
 #if FDKDEC_VER_AT_LEAST(2, 5) // 2.5.10
     { "level_limit", "Signal level limiting", OFFSET(level_limit), AV_OPT_TYPE_INT, { .i64 = 0 }, -1, 1, AD },
+#endif
+#if FDKDEC_VER_AT_LEAST(3, 0) // 3.0.0
+    { "drc_effect","Dynamic Range Control: effect type, where e.g. [0] is none and [6] is general",
+                     OFFSET(drc_effect), AV_OPT_TYPE_INT, { .i64 = -1}, -1, 8, AD, NULL},
 #endif
     { NULL }
 };
@@ -306,6 +311,15 @@ static av_cold int fdk_aac_decode_init(AVCodecContext *avctx)
     }
 #endif
 
+#if FDKDEC_VER_AT_LEAST(3, 0) // 3.0.0
+    if (s->drc_effect != -1) {
+        if (aacDecoder_SetParam(s->handle, AAC_UNIDRC_SET_EFFECT, s->drc_effect) != AAC_DEC_OK) {
+            av_log(avctx, AV_LOG_ERROR, "Unable to set DRC effect type in the decoder\n");
+            return AVERROR_UNKNOWN;
+        }
+    }
+#endif
+
     avctx->sample_fmt = AV_SAMPLE_FMT_S16;
 
     s->decoder_buffer_size = DECODER_BUFFSIZE * DECODER_MAX_CHANNELS;
-- 
2.15.2 (Apple Git-101.1)
[libav-devel] [PATCH 1/3] libfdk-aac: Consistently use a proper version check macro for detecting features
The previous version checks checked explicitly for the version where the version define was added to the installed headers, making an "#ifdef AACDECODER_LIB_VL0" enough. Now that we have a need for more diverse version checks than this, convert all checks to such checks. --- libavcodec/libfdk-aacdec.c | 13 - libavcodec/libfdk-aacenc.c | 6 +++--- 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/libavcodec/libfdk-aacdec.c b/libavcodec/libfdk-aacdec.c index 3be65155b5..c3d3b70fc9 100644 --- a/libavcodec/libfdk-aacdec.c +++ b/libavcodec/libfdk-aacdec.c @@ -25,9 +25,12 @@ #include "avcodec.h" #include "internal.h" -/* The version macro is introduced the same time as the setting enum was - * changed, so this check should suffice. */ -#ifndef AACDECODER_LIB_VL0 +#define FDKDEC_VER_AT_LEAST(vl0, vl1) \ +(defined(AACDECODER_LIB_VL0) && \ +((AACDECODER_LIB_VL0 > vl0) || \ + (AACDECODER_LIB_VL0 == vl0 && AACDECODER_LIB_VL1 >= vl1))) + +#if !FDKDEC_VER_AT_LEAST(2, 5) // < 2.5.10 #define AAC_PCM_MAX_OUTPUT_CHANNELS AAC_PCM_OUTPUT_CHANNELS #endif @@ -72,7 +75,7 @@ static const AVOption fdk_aac_dec_options[] = { OFFSET(drc_level), AV_OPT_TYPE_INT, { .i64 = -1}, -1, 127, AD, NULL}, { "drc_heavy", "Dynamic Range Control: heavy compression, where [1] is on (RF mode) and [0] is off", OFFSET(drc_heavy), AV_OPT_TYPE_INT, { .i64 = -1}, -1, 1, AD, NULL}, -#ifdef AACDECODER_LIB_VL0 +#if FDKDEC_VER_AT_LEAST(2, 5) // 2.5.10 { "level_limit", "Signal level limiting", OFFSET(level_limit), AV_OPT_TYPE_INT, { .i64 = 0 }, -1, 1, AD }, #endif { NULL } @@ -296,7 +299,7 @@ static av_cold int fdk_aac_decode_init(AVCodecContext *avctx) } } -#ifdef AACDECODER_LIB_VL0 +#if FDKDEC_VER_AT_LEAST(2, 5) // 2.5.10 if (aacDecoder_SetParam(s->handle, AAC_PCM_LIMITER_ENABLE, s->level_limit) != AAC_DEC_OK) { av_log(avctx, AV_LOG_ERROR, "Unable to set in signal level limiting in the decoder\n"); return AVERROR_UNKNOWN; diff --git a/libavcodec/libfdk-aacenc.c b/libavcodec/libfdk-aacenc.c index 
2ad768ed44..92ad1762ae 100644 --- a/libavcodec/libfdk-aacenc.c +++ b/libavcodec/libfdk-aacenc.c @@ -159,7 +159,7 @@ static av_cold int aac_encode_init(AVCodecContext *avctx) case 6: mode = MODE_1_2_2_1; sce = 2; cpe = 2; break; /* The version macro is introduced the same time as the 7.1 support, so this should suffice. */ -#ifdef AACENCODER_LIB_VL0 +#if FDKENC_VER_AT_LEAST(3, 4) // 3.4.12 case 8: sce = 2; cpe = 3; @@ -295,7 +295,7 @@ static av_cold int aac_encode_init(AVCodecContext *avctx) } avctx->frame_size = info.frameLength; -#if FDKENC_VER_AT_LEAST(4, 0) +#if FDKENC_VER_AT_LEAST(4, 0) // 4.0.0 avctx->initial_padding = info.nDelay; #else avctx->initial_padding = info.encoderDelay; @@ -418,7 +418,7 @@ static const uint64_t aac_channel_layout[] = { AV_CH_LAYOUT_4POINT0, AV_CH_LAYOUT_5POINT0_BACK, AV_CH_LAYOUT_5POINT1_BACK, -#ifdef AACENCODER_LIB_VL0 +#if FDKENC_VER_AT_LEAST(3, 4) // 3.4.12 AV_CH_LAYOUT_7POINT1_WIDE_BACK, AV_CH_LAYOUT_7POINT1, #endif -- 2.15.2 (Apple Git-101.1) ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH 3/3] libfdk-aacenc: Allow enabling the ELDv2 profile
This is a new feature in FDK v2. --- libavcodec/libfdk-aacenc.c | 27 ++- 1 file changed, 26 insertions(+), 1 deletion(-) diff --git a/libavcodec/libfdk-aacenc.c b/libavcodec/libfdk-aacenc.c index 92ad1762ae..f71a276403 100644 --- a/libavcodec/libfdk-aacenc.c +++ b/libavcodec/libfdk-aacenc.c @@ -36,6 +36,7 @@ typedef struct AACContext { HANDLE_AACENCODER handle; int afterburner; int eld_sbr; +int eld_v2; int signaling; int latm; int header_period; @@ -47,6 +48,9 @@ typedef struct AACContext { static const AVOption aac_enc_options[] = { { "afterburner", "Afterburner (improved quality)", offsetof(AACContext, afterburner), AV_OPT_TYPE_INT, { .i64 = 1 }, 0, 1, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM }, { "eld_sbr", "Enable SBR for ELD (for SBR in other configurations, use the -profile parameter)", offsetof(AACContext, eld_sbr), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM }, +#if FDKENC_VER_AT_LEAST(4, 0) // 4.0.0 +{ "eld_v2", "Enable ELDv2 (LD-MPS extension for ELD stereo signals)", offsetof(AACContext, eld_v2), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM }, +#endif { "signaling", "SBR/PS signaling style", offsetof(AACContext, signaling), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, 2, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM, "signaling" }, { "default", "Choose signaling implicitly (explicit hierarchical by default, implicit if global header is disabled)", 0, AV_OPT_TYPE_CONST, { .i64 = -1 }, 0, 0, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM, "signaling" }, { "implicit", "Implicit backwards compatible signaling", 0, AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM, "signaling" }, @@ -152,7 +156,28 @@ static av_cold int aac_encode_init(AVCodecContext *avctx) switch (avctx->channels) { case 1: mode = MODE_1; sce = 1; cpe = 0; break; -case 2: mode = MODE_2; sce = 0; cpe = 1; break; +case 2: +#if 
FDKENC_VER_AT_LEAST(4, 0) // 4.0.0 + // (profile + 1) to map from profile range to AOT range + if (aot == FF_PROFILE_AAC_ELD + 1 && s->eld_v2) { + if ((err = aacEncoder_SetParam(s->handle, AACENC_CHANNELMODE, + 128)) != AACENC_OK) { + av_log(avctx, AV_LOG_ERROR, "Unable to enable ELDv2: %s\n", + aac_get_error(err)); + goto error; + } else { +mode = MODE_212; +sce = 1; +cpe = 0; + } + } else +#endif + { +mode = MODE_2; +sce = 0; +cpe = 1; + } + break; case 3: mode = MODE_1_2; sce = 1; cpe = 1; break; case 4: mode = MODE_1_2_1; sce = 2; cpe = 1; break; case 5: mode = MODE_1_2_2; sce = 1; cpe = 2; break; -- 2.15.2 (Apple Git-101.1) ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] libfdk-aacenc: Fix building with libfdk-aac v2
When flushing the encoder, we now need to provide non-null buffer parameters for everything, even if they are unused.

The encoderDelay parameter has been replaced by two, nDelay and nDelayCore.
---
libfdk-aac v2 also has a bunch of other new, yet untested features, like support for xHE-AAC.
---
 libavcodec/libfdk-aacenc.c | 34 +-
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/libavcodec/libfdk-aacenc.c b/libavcodec/libfdk-aacenc.c
index c340a1e3e0..2ad768ed44 100644
--- a/libavcodec/libfdk-aacenc.c
+++ b/libavcodec/libfdk-aacenc.c
@@ -26,6 +26,11 @@
 #include "audio_frame_queue.h"
 #include "internal.h"
 
+#define FDKENC_VER_AT_LEAST(vl0, vl1) \
+    (defined(AACENCODER_LIB_VL0) && \
+        ((AACENCODER_LIB_VL0 > vl0) || \
+         (AACENCODER_LIB_VL0 == vl0 && AACENCODER_LIB_VL1 >= vl1)))
+
 typedef struct AACContext {
     const AVClass *class;
     HANDLE_AACENCODER handle;
@@ -290,7 +295,11 @@ static av_cold int aac_encode_init(AVCodecContext *avctx)
     }
 
     avctx->frame_size = info.frameLength;
+#if FDKENC_VER_AT_LEAST(4, 0)
+    avctx->initial_padding = info.nDelay;
+#else
     avctx->initial_padding = info.encoderDelay;
+#endif
     ff_af_queue_init(avctx, &s->afq);
 
     if (avctx->flags & AV_CODEC_FLAG_GLOBAL_HEADER) {
@@ -323,28 +332,35 @@ static int aac_encode_frame(AVCodecContext *avctx, AVPacket *avpkt,
     int out_buffer_size, out_buffer_element_size;
     void *in_ptr, *out_ptr;
     int ret;
+    uint8_t dummy_buf[1];
     AACENC_ERROR err;
 
     /* handle end-of-stream small frame and flushing */
     if (!frame) {
+        /* Must be a non-null pointer, even if it's a dummy. We could use
+         * the address of anything else on the stack as well. */
+        in_ptr = dummy_buf;
+        in_buffer_size = 0;
+
         in_args.numInSamples = -1;
     } else {
-        in_ptr = frame->data[0];
-        in_buffer_size = 2 * avctx->channels * frame->nb_samples;
-        in_buffer_element_size = 2;
+        in_ptr = frame->data[0];
+        in_buffer_size = 2 * avctx->channels * frame->nb_samples;
 
-        in_args.numInSamples = avctx->channels * frame->nb_samples;
-        in_buf.numBufs = 1;
-        in_buf.bufs = &in_ptr;
-        in_buf.bufferIdentifiers = &in_buffer_identifier;
-        in_buf.bufSizes = &in_buffer_size;
-        in_buf.bufElSizes= &in_buffer_element_size;
+        in_args.numInSamples = avctx->channels * frame->nb_samples;
 
         /* add current frame to the queue */
         if ((ret = ff_af_queue_add(&s->afq, frame)) < 0)
             return ret;
     }
 
+    in_buffer_element_size = 2;
+    in_buf.numBufs = 1;
+    in_buf.bufs = &in_ptr;
+    in_buf.bufferIdentifiers = &in_buffer_identifier;
+    in_buf.bufSizes = &in_buffer_size;
+    in_buf.bufElSizes= &in_buffer_element_size;
+
     /* The maximum packet size is 6144 bits aka 768 bytes per channel. */
     if ((ret = ff_alloc_packet(avpkt, FFMAX(8192, 768 * avctx->channels)))) {
         av_log(avctx, AV_LOG_ERROR, "Error getting output packet\n");
-- 
2.15.2 (Apple Git-101.1)
Re: [libav-devel] [PATCH] libopenh264dec: Export the decoded profile and level in AVCodecContext
On Fri, 31 Aug 2018, Vittorio Giovara wrote:

> On Fri, Aug 31, 2018 at 11:25 AM, Martin Storsjö wrote:
>
>> ---
>>  libavcodec/libopenh264dec.c | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/libavcodec/libopenh264dec.c b/libavcodec/libopenh264dec.c
>> index 5990a72ff9..7e9e66743a 100644
>> --- a/libavcodec/libopenh264dec.c
>> +++ b/libavcodec/libopenh264dec.c
>> @@ -95,6 +95,7 @@ static int svc_decode_frame(AVCodecContext *avctx, void *data,
>>      int linesize[3];
>>      AVFrame *avframe = data;
>>      DECODING_STATE state;
>> +    int opt;
>>
>>      if (!avpkt->data) {
>>  #if OPENH264_VER_AT_LEAST(1, 9)
>> @@ -136,6 +137,10 @@ FF_DISABLE_DEPRECATION_WARNINGS
>>      avframe->pkt_pts = avpkt->pts;
>>  FF_ENABLE_DEPRECATION_WARNINGS
>>  #endif
>> +    (*s->decoder)->GetOption(s->decoder, DECODER_OPTION_PROFILE, &opt);
>> +    avctx->profile = opt;
>> +    (*s->decoder)->GetOption(s->decoder, DECODER_OPTION_LEVEL, &opt);
>> +    avctx->level = opt;
>>
>>      *got_frame = 1;
>>      return avpkt->size;
>> --
>
> lgtm

Thanks - pushed with appropriate openh264 version ifdefs added.

// Martin
Re: [libav-devel] [PATCH 1/2] network: Add RFC 8305 style "Happy Eyeballs"/"Fast Fallback" helper function
On Wed, 22 Aug 2018, Luca Barbato wrote:

> On 21/08/2018 09:29, Martin Storsjö wrote:
>
>> For cases with dual stack (IPv4 + IPv6) connectivity, but where one
>> stack potentially is less reliable, strive to connect over both
>> protocols in parallel, using whichever address connected first.
>>
>> In cases with a hostname resolving to multiple IPv4 and IPv6
>> addresses, the current connection mechanism would try all addresses
>> in the order returned by getaddrinfo (with all IPv6 addresses ordered
>> before the IPv4 addresses normally). If connection attempts to the
>> IPv6 addresses return quickly with an error, this was no problem, but
>> if they were unsuccessful leading up to timeouts, the connection
>> process would have to wait for timeouts on all IPv6 target addresses
>> before attempting any IPv4 address.
>>
>> Similar to what RFC 8305 suggests, reorder the list of addresses to
>> try connecting to, interleaving address families. After starting one
>> connection attempt, start another one in parallel after a small delay
>> (200 ms as suggested by the RFC).
>>
>> For cases with unreliable IPv6 but reliable IPv4, this should make
>> connection attempts work as reliably as with plain IPv4, with only an
>> extra 200 ms of connection delay.
>
> The set looks fine to me.

Pushed.

// Martin
[libav-devel] [PATCH] libopenh264dec: Export the decoded profile and level in AVCodecContext
---
 libavcodec/libopenh264dec.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/libavcodec/libopenh264dec.c b/libavcodec/libopenh264dec.c
index 5990a72ff9..7e9e66743a 100644
--- a/libavcodec/libopenh264dec.c
+++ b/libavcodec/libopenh264dec.c
@@ -95,6 +95,7 @@ static int svc_decode_frame(AVCodecContext *avctx, void *data,
     int linesize[3];
     AVFrame *avframe = data;
     DECODING_STATE state;
+    int opt;
 
     if (!avpkt->data) {
 #if OPENH264_VER_AT_LEAST(1, 9)
@@ -136,6 +137,10 @@ FF_DISABLE_DEPRECATION_WARNINGS
     avframe->pkt_pts = avpkt->pts;
 FF_ENABLE_DEPRECATION_WARNINGS
 #endif
+    (*s->decoder)->GetOption(s->decoder, DECODER_OPTION_PROFILE, &opt);
+    avctx->profile = opt;
+    (*s->decoder)->GetOption(s->decoder, DECODER_OPTION_LEVEL, &opt);
+    avctx->level = opt;
 
     *got_frame = 1;
     return avpkt->size;
-- 
2.15.2 (Apple Git-101.1)
[libav-devel] [PATCH 1/2] network: Add RFC 8305 style "Happy Eyeballs"/"Fast Fallback" helper function
For cases with dual stack (IPv4 + IPv6) connectivity, but where one stack potentially is less reliable, strive to connect over both protocols in parallel, using whichever address connected first.

In cases with a hostname resolving to multiple IPv4 and IPv6 addresses, the current connection mechanism would try all addresses in the order returned by getaddrinfo (with all IPv6 addresses ordered before the IPv4 addresses normally). If connection attempts to the IPv6 addresses return quickly with an error, this was no problem, but if they were unsuccessful leading up to timeouts, the connection process would have to wait for timeouts on all IPv6 target addresses before attempting any IPv4 address.

Similar to what RFC 8305 suggests, reorder the list of addresses to try connecting to, interleaving address families. After starting one connection attempt, start another one in parallel after a small delay (200 ms as suggested by the RFC).

For cases with unreliable IPv6 but reliable IPv4, this should make connection attempts work as reliably as with plain IPv4, with only an extra 200 ms of connection delay.
---
 libavformat/network.c | 226 ++
 libavformat/network.h | 28 +++
 2 files changed, 254 insertions(+)

diff --git a/libavformat/network.c b/libavformat/network.c
index 24fcf20539..2d281539c6 100644
--- a/libavformat/network.c
+++ b/libavformat/network.c
@@ -23,7 +23,9 @@
 #include "tls.h"
 #include "url.h"
 #include "libavcodec/internal.h"
+#include "libavutil/avassert.h"
 #include "libavutil/mem.h"
+#include "libavutil/time.h"
 
 void ff_tls_init(void)
 {
@@ -240,6 +242,230 @@ int ff_listen_connect(int fd, const struct sockaddr *addr,
     return ret;
 }
 
+static void interleave_addrinfo(struct addrinfo *base)
+{
+    struct addrinfo **next = &base->ai_next;
+    while (*next) {
+        struct addrinfo *cur = *next;
+        // Iterate forward until we find an entry of a different family.
+        if (cur->ai_family == base->ai_family) {
+            next = &cur->ai_next;
+            continue;
+        }
+        if (cur == base->ai_next) {
+            // If the first one following base is of a different family, just
+            // move base forward one step and continue.
+            base = cur;
+            next = &base->ai_next;
+            continue;
+        }
+        // Unchain cur from the rest of the list from its current spot.
+        *next = cur->ai_next;
+        // Hook in cur directly after base.
+        cur->ai_next = base->ai_next;
+        base->ai_next = cur;
+        // Restart with a new base. We know that before moving the cur element,
+        // everything between the previous base and cur had the same family,
+        // different from cur->ai_family. Therefore, we can keep next pointing
+        // where it was, and continue from there with base at the one after
+        // cur.
+        base = cur->ai_next;
+    }
+}
+
+static void print_address_list(void *ctx, const struct addrinfo *addr,
+                               const char *title)
+{
+    char hostbuf[100], portbuf[20];
+    av_log(ctx, AV_LOG_DEBUG, "%s:\n", title);
+    while (addr) {
+        getnameinfo(addr->ai_addr, addr->ai_addrlen,
+                    hostbuf, sizeof(hostbuf), portbuf, sizeof(portbuf),
+                    NI_NUMERICHOST | NI_NUMERICSERV);
+        av_log(ctx, AV_LOG_DEBUG, "Address %s port %s\n", hostbuf, portbuf);
+        addr = addr->ai_next;
+    }
+}
+
+struct ConnectionAttempt {
+    int fd;
+    int64_t deadline_us;
+    struct addrinfo *addr;
+};
+
+// Returns < 0 on error, 0 on successfully started connection attempt,
+// > 0 for a connection that succeeded already.
+static int start_connect_attempt(struct ConnectionAttempt *attempt,
+                                 struct addrinfo **ptr, int timeout_ms,
+                                 URLContext *h,
+                                 void (*customize_fd)(void *, int), void *customize_ctx)
+{
+    struct addrinfo *ai = *ptr;
+    int ret;
+
+    *ptr = ai->ai_next;
+
+    attempt->fd = ff_socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
+    if (attempt->fd < 0)
+        return ff_neterrno();
+    attempt->deadline_us = av_gettime_relative() + timeout_ms * 1000;
+    attempt->addr = ai;
+
+    ff_socket_nonblock(attempt->fd, 1);
+
+    if (customize_fd)
+        customize_fd(customize_ctx, attempt->fd);
+
+    while ((ret = connect(attempt->fd, ai->ai_addr, ai->ai_addrlen))) {
+        ret = ff_neterrno();
+        switch (ret) {
+        case AVERROR(EINTR):
+            if (ff_check_interrupt(&h->interrupt_callback)) {
+                closesocket(attempt->fd);
+                attempt->fd = -1;
+                return AVERROR_EXIT;
+            }
+            continue;
+        case AVERROR(EINPROGRESS):
+        case AVERROR(EAGAIN):
+            return 0;
+        default:
+            closesocket(attempt->fd);
+            attempt->fd = -1;
+
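The reordering step described in the commit message can be tried out on a plain linked list without any networking. The sketch below mirrors the pointer manipulation of `interleave_addrinfo()` from the patch, but on a hypothetical `struct Node` that only carries a family tag; `interleave_nodes()` and the `interleave_array()` convenience wrapper are illustrative names, not part of libavformat.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct addrinfo: only the family and the
 * next pointer matter for the reordering. */
struct Node {
    int          family;
    struct Node *next;
};

/* Same list manipulation as in the patch: whenever a run of entries of
 * one family is followed by an entry of another family, move that entry
 * up so that it directly follows the current base. */
void interleave_nodes(struct Node *base)
{
    struct Node **next = &base->next;
    while (*next) {
        struct Node *cur = *next;
        /* Skip forward over entries of the same family as base. */
        if (cur->family == base->family) {
            next = &cur->next;
            continue;
        }
        /* Already directly after base: just advance base. */
        if (cur == base->next) {
            base = cur;
            next = &base->next;
            continue;
        }
        /* Unchain cur and hook it in directly after base. */
        *next      = cur->next;
        cur->next  = base->next;
        base->next = cur;
        /* Continue with the entry after the moved one as the new base. */
        base = cur->next;
    }
}

/* Convenience wrapper for experiments: interleave an array of family
 * tags in place. Handles at most 16 entries; other sizes are ignored. */
void interleave_array(int *families, int n)
{
    struct Node nodes[16];
    struct Node *p;
    int i;

    if (n <= 0 || n > 16)
        return;
    for (i = 0; i < n; i++) {
        nodes[i].family = families[i];
        nodes[i].next   = i + 1 < n ? &nodes[i + 1] : NULL;
    }
    interleave_nodes(&nodes[0]);
    /* The head of the list never moves, so walk from nodes[0]. */
    for (i = 0, p = &nodes[0]; p; p = p->next)
        families[i++] = p->family;
}
```

For an input of three entries of one family followed by two of another, the result alternates between the families, so a fallback address of the second family is tried second rather than fourth.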
[libav-devel] [PATCH 2/2] tcp: Use ff_connect_parallel for RFC 8305 style connecting
---
 libavformat/tcp.c | 41 +++--
 1 file changed, 15 insertions(+), 26 deletions(-)

diff --git a/libavformat/tcp.c b/libavformat/tcp.c
index 1498c26fbe..7044d44f06 100644
--- a/libavformat/tcp.c
+++ b/libavformat/tcp.c
@@ -108,30 +108,28 @@ static int tcp_open(URLContext *h, const char *uri, int flags)
 
     cur_ai = ai;
 
- restart:
-    fd = ff_socket(cur_ai->ai_family,
-                   cur_ai->ai_socktype,
-                   cur_ai->ai_protocol);
-    if (fd < 0) {
-        ret = ff_neterrno();
-        goto fail;
-    }
-
     if (s->listen) {
+        while (cur_ai && fd < 0) {
+            fd = ff_socket(cur_ai->ai_family,
+                           cur_ai->ai_socktype,
+                           cur_ai->ai_protocol);
+            if (fd < 0) {
+                ret = ff_neterrno();
+                cur_ai = cur_ai->ai_next;
+            }
+        }
+        if (fd < 0)
+            goto fail1;
+
         if ((ret = ff_listen_bind(fd, cur_ai->ai_addr, cur_ai->ai_addrlen,
                                   s->listen_timeout, h)) < 0) {
             goto fail1;
         }
         fd = ret;
     } else {
-        if ((ret = ff_listen_connect(fd, cur_ai->ai_addr, cur_ai->ai_addrlen,
-                                     s->timeout, h, !!cur_ai->ai_next)) < 0) {
-
-            if (ret == AVERROR_EXIT)
-                goto fail1;
-            else
-                goto fail;
-        }
+        ret = ff_connect_parallel(ai, s->timeout, 3, h, &fd, NULL, NULL);
+        if (ret < 0)
+            goto fail1;
     }
 
     h->is_streamed = 1;
@@ -139,15 +137,6 @@ static int tcp_open(URLContext *h, const char *uri, int flags)
     freeaddrinfo(ai);
     return 0;
 
- fail:
-    if (cur_ai->ai_next) {
-        /* Retry with the next sockaddr */
-        cur_ai = cur_ai->ai_next;
-        if (fd >= 0)
-            closesocket(fd);
-        ret = 0;
-        goto restart;
-    }
  fail1:
     if (fd >= 0)
         closesocket(fd);
-- 
2.15.2 (Apple Git-101.1)
[libav-devel] [PATCH] tls_openssl: Fix checks for SSL_ERROR_WANT_WRITE in nonblocking operation
This was a typo in 0671eb2346c, spotted by Chris Carroux.
---
 libavformat/tls_openssl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/tls_openssl.c b/libavformat/tls_openssl.c
index f0b325ae98..4a2fcfd771 100644
--- a/libavformat/tls_openssl.c
+++ b/libavformat/tls_openssl.c
@@ -112,7 +112,7 @@ static int print_tls_error(URLContext *h, int ret)
     TLSContext *c = h->priv_data;
     if (h->flags & AVIO_FLAG_NONBLOCK) {
         int err = SSL_get_error(c->ssl, ret);
-        if (err == SSL_ERROR_WANT_READ || err == SSL_ERROR_WANT_READ)
+        if (err == SSL_ERROR_WANT_READ || err == SSL_ERROR_WANT_WRITE)
             return AVERROR(EAGAIN);
     }
     av_log(h, AV_LOG_ERROR, "%s\n", ERR_error_string(ERR_get_error(), NULL));
-- 
2.15.2 (Apple Git-101.1)
[libav-devel] [PATCH 2/3] network: Use ff_neterrno instead of AVERROR(errno) for poll errors
From: Simon Thelen

This makes sure to pick up the actual error codes on Windows.
---
 libavformat/network.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/network.c b/libavformat/network.c
index 86d79553f7..1e02668ecf 100644
--- a/libavformat/network.c
+++ b/libavformat/network.c
@@ -138,7 +138,7 @@ static int ff_poll_interrupt(struct pollfd *p, nfds_t nfds, int timeout,
     if (!ret)
         return AVERROR(ETIMEDOUT);
     if (ret < 0)
-        return AVERROR(errno);
+        return ff_neterrno();
 
     return ret;
 }
-- 
2.15.2 (Apple Git-101.1)
[libav-devel] [PATCH 1/3] http: pass return code from http_open_cnx_internal() on its failure
From: Andrey Utkin

Previously, AVERROR(EIO) was returned on failure of http_open_cnx_internal(). Now the value is passed to the upper level, so it is possible to distinguish ECONNREFUSED, ETIMEDOUT, ENETUNREACH etc.
---
 libavformat/http.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libavformat/http.c b/libavformat/http.c
index 80c87f786a..dfb95642c0 100644
--- a/libavformat/http.c
+++ b/libavformat/http.c
@@ -248,6 +248,8 @@ fail:
     if (s->hd)
         ffurl_close(s->hd);
     s->hd = NULL;
+    if (location_changed < 0)
+        return location_changed;
     return AVERROR(EIO);
 }
 
-- 
2.15.2 (Apple Git-101.1)
[libav-devel] [PATCH 3/3] network: Check for EINTR in ff_poll_interrupt
---
 libavformat/network.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/libavformat/network.c b/libavformat/network.c
index 1e02668ecf..24fcf20539 100644
--- a/libavformat/network.c
+++ b/libavformat/network.c
@@ -131,14 +131,17 @@ static int ff_poll_interrupt(struct pollfd *p, nfds_t nfds, int timeout,
         if (ff_check_interrupt(cb))
             return AVERROR_EXIT;
         ret = poll(p, nfds, POLLING_TIME);
-        if (ret != 0)
+        if (ret != 0) {
+            if (ret < 0)
+                ret = ff_neterrno();
+            if (ret == AVERROR(EINTR))
+                continue;
             break;
+        }
     } while (timeout < 0 || runs-- > 0);
 
     if (!ret)
         return AVERROR(ETIMEDOUT);
-    if (ret < 0)
-        return ff_neterrno();
     return ret;
 }
 
-- 
2.15.2 (Apple Git-101.1)
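As a rough standalone sketch of the same idea, the loop below polls in short slices and treats EINTR like an expired slice instead of a fatal error. It assumes a POSIX system; `poll_retry()` and its parameters are hypothetical names, it returns negated errno values rather than AVERROR codes, and it leaves out the interrupt callback that ff_poll_interrupt() checks between slices.

```c
#include <errno.h>
#include <poll.h>

/* Poll in slices of slice_ms milliseconds, at most max_slices times.
 * Returns the number of ready descriptors (> 0), -ETIMEDOUT if nothing
 * became ready, or another negated errno value on a real poll() error. */
int poll_retry(struct pollfd *fds, nfds_t nfds, int slice_ms, int max_slices)
{
    int i;

    for (i = 0; i < max_slices; i++) {
        int ret = poll(fds, nfds, slice_ms);
        if (ret > 0)
            return ret;          /* something is ready */
        if (ret < 0 && errno != EINTR)
            return -errno;       /* genuine error, report it */
        /* ret == 0 (slice timed out) or EINTR (interrupted by a signal):
         * in both cases simply try the next slice. */
    }
    return -ETIMEDOUT;
}
```

Counting an interrupted call as one used-up slice keeps the total waiting time bounded even if signals arrive frequently, at the cost of possibly returning slightly early.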
[libav-devel] [PATCH] libopenh264: Add support for decoding of b-frames
The current git master version of libopenh264 supports decoding of b-frames.
---
 libavcodec/libopenh264dec.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/libavcodec/libopenh264dec.c b/libavcodec/libopenh264dec.c
index cdb8d527cf..5990a72ff9 100644
--- a/libavcodec/libopenh264dec.c
+++ b/libavcodec/libopenh264dec.c
@@ -96,7 +96,18 @@ static int svc_decode_frame(AVCodecContext *avctx, void *data,
     AVFrame *avframe = data;
     DECODING_STATE state;
 
-    state = (*s->decoder)->DecodeFrame2(s->decoder, avpkt->data, avpkt->size, ptrs, &info);
+    if (!avpkt->data) {
+#if OPENH264_VER_AT_LEAST(1, 9)
+        int end_of_stream = 1;
+        (*s->decoder)->SetOption(s->decoder, DECODER_OPTION_END_OF_STREAM, &end_of_stream);
+        state = (*s->decoder)->FlushFrame(s->decoder, ptrs, &info);
+#else
+        return 0;
+#endif
+    } else {
+        info.uiInBsTimeStamp = avpkt->pts;
+        state = (*s->decoder)->DecodeFrame2(s->decoder, avpkt->data, avpkt->size, ptrs, &info);
+    }
     if (state != dsErrorFree) {
         av_log(avctx, AV_LOG_ERROR, "DecodeFrame2 failed\n");
         return AVERROR_UNKNOWN;
@@ -118,8 +129,8 @@ static int svc_decode_frame(AVCodecContext *avctx, void *data,
     linesize[1] = linesize[2] = info.UsrData.sSystemBuffer.iStride[1];
     av_image_copy(avframe->data, avframe->linesize, (const uint8_t **) ptrs, linesize, avctx->pix_fmt, avctx->width, avctx->height);
 
-    avframe->pts = avpkt->pts;
-    avframe->pkt_dts = avpkt->dts;
+    avframe->pts = info.uiOutYuvTimeStamp;
+    avframe->pkt_dts = AV_NOPTS_VALUE;
 #if FF_API_PKT_PTS
 FF_DISABLE_DEPRECATION_WARNINGS
     avframe->pkt_pts = avpkt->pts;
@@ -139,8 +150,6 @@ AVCodec ff_libopenh264_decoder = {
     .init = svc_decode_init,
     .decode = svc_decode_frame,
     .close = svc_decode_close,
-    // The decoder doesn't currently support B-frames, and the decoder's API
-    // doesn't support reordering/delay, but the BSF could incur delay.
     .capabilities = AV_CODEC_CAP_DELAY | AV_CODEC_CAP_DR1,
     .caps_internal = FF_CODEC_CAP_SETS_PKT_DTS | FF_CODEC_CAP_INIT_THREADSAFE |
                      FF_CODEC_CAP_INIT_CLEANUP,
-- 
2.15.2 (Apple Git-101.1)
[libav-devel] [PATCH] avconv: make sure packets put into the muxing FIFO are refcounted
From: wm4

Some callers (like do_subtitle_out(), or do_streamcopy()) call this
with an AVPacket that is not refcounted. This can cause undefined
behavior.

Calling av_packet_move_ref() does not make a packet refcounted if it
isn't yet. (And it can't be made to, because it always succeeds, and
can't return ENOMEM.)

Call av_packet_ref() instead to make sure it's refcounted.
---
 avtools/avconv.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/avtools/avconv.c b/avtools/avconv.c
index ac15464a8d..3abb7f872f 100644
--- a/avtools/avconv.c
+++ b/avtools/avconv.c
@@ -281,7 +281,7 @@ static void write_packet(OutputFile *of, AVPacket *pkt, OutputStream *ost)
     int ret;
 
     if (!of->header_written) {
-        AVPacket tmp_pkt;
+        AVPacket tmp_pkt = {0};
         /* the muxer is not initialized yet, buffer the packet */
         if (!av_fifo_space(ost->muxing_queue)) {
             int new_size = FFMIN(2 * av_fifo_size(ost->muxing_queue),
@@ -296,8 +296,11 @@ static void write_packet(OutputFile *of, AVPacket *pkt, OutputStream *ost)
             if (ret < 0)
                 exit_program(1);
         }
-        av_packet_move_ref(&tmp_pkt, pkt);
+        ret = av_packet_ref(&tmp_pkt, pkt);
+        if (ret < 0)
+            exit_program(1);
         av_fifo_generic_write(ost->muxing_queue, &tmp_pkt, sizeof(tmp_pkt), NULL);
+        av_packet_unref(pkt);
         return;
     }
 
--
2.15.2 (Apple Git-101.1)
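To illustrate the difference the commit message describes, here is a toy model of "move" versus "ref". None of these types or functions are the libavcodec ones; ToyBuf, ToyPacket, toy_move_ref, toy_ref and toy_unref are made up for this sketch. The only assumption carried over is that a packet without a buffer reference borrows memory it does not own.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-ins, NOT the libavcodec structures. */
typedef struct ToyBuf {
    unsigned char *data;
    int refcount;
} ToyBuf;

typedef struct ToyPacket {
    ToyBuf *buf;            /* NULL means "not refcounted" (borrowed data) */
    unsigned char *data;
    size_t size;
} ToyPacket;

/* "Move": copies the fields and resets src. It cannot fail, so it also
 * cannot allocate; a borrowed packet stays borrowed. */
static void toy_move_ref(ToyPacket *dst, ToyPacket *src)
{
    *dst = *src;
    memset(src, 0, sizeof(*src));
}

/* "Ref": dst always ends up holding a counted buffer. If src is not
 * refcounted the data is copied, which is why this can fail.
 * Returns 0 on success, -1 on allocation failure. */
static int toy_ref(ToyPacket *dst, const ToyPacket *src)
{
    ToyBuf *b;

    if (src->buf) {
        *dst = *src;
        dst->buf->refcount++;
        return 0;
    }
    b = malloc(sizeof(*b));
    if (!b)
        return -1;
    b->data = malloc(src->size ? src->size : 1);
    if (!b->data) {
        free(b);
        return -1;
    }
    if (src->size)
        memcpy(b->data, src->data, src->size);
    b->refcount = 1;
    dst->buf  = b;
    dst->data = b->data;
    dst->size = src->size;
    return 0;
}

/* Drops one reference and frees the buffer when the count hits zero. */
static void toy_unref(ToyPacket *pkt)
{
    if (pkt->buf && --pkt->buf->refcount == 0) {
        free(pkt->buf->data);
        free(pkt->buf);
    }
    memset(pkt, 0, sizeof(*pkt));
}
```

In this model a queued packet made with toy_ref() survives the caller reusing its own storage, while one made with toy_move_ref() would still point into it.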
[libav-devel] [PATCH] libfdk-aac: Use enum names instead of literal numbers for the output format
---
 libavcodec/libfdk-aacenc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavcodec/libfdk-aacenc.c b/libavcodec/libfdk-aacenc.c
index 26dfb6dc0b..c340a1e3e0 100644
--- a/libavcodec/libfdk-aacenc.c
+++ b/libavcodec/libfdk-aacenc.c
@@ -227,7 +227,8 @@ static av_cold int aac_encode_init(AVCodecContext *avctx)
     /* Choose bitstream format - if global header is requested, use
      * raw access units, otherwise use ADTS. */
     if ((err = aacEncoder_SetParam(s->handle, AACENC_TRANSMUX,
-                                   avctx->flags & AV_CODEC_FLAG_GLOBAL_HEADER ? 0 : s->latm ? 10 : 2)) != AACENC_OK) {
+                                   avctx->flags & AV_CODEC_FLAG_GLOBAL_HEADER ? TT_MP4_RAW :
+                                   s->latm ? TT_MP4_LOAS : TT_MP4_ADTS)) != AACENC_OK) {
         av_log(avctx, AV_LOG_ERROR, "Unable to set the transmux format: %s\n",
                aac_get_error(err));
         goto error;
--
2.15.2 (Apple Git-101.1)
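The selection logic in the hunk above can be written as a small standalone sketch. The TOY_ enum below is a local mirror of the three values the patch replaces (0, 2 and 10); the real names come from the FDK-AAC headers, and choose_transmux is a hypothetical helper, not encoder API.

```c
/* Local mirror of the values the patch names; not the FDK-AAC enum. */
enum ToyTransportType {
    TOY_TT_MP4_RAW  = 0,    /* raw access units */
    TOY_TT_MP4_ADTS = 2,
    TOY_TT_MP4_LOAS = 10
};

/* Hypothetical helper: global header wins, then LATM, otherwise ADTS. */
static int choose_transmux(int global_header, int latm)
{
    return global_header ? TOY_TT_MP4_RAW
         : latm          ? TOY_TT_MP4_LOAS
                         : TOY_TT_MP4_ADTS;
}
```

The numbers passed to the encoder stay the same; only the spelling at the call site changes.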
Re: [libav-devel] [PATCH] configure: fix inline asm checks
On Fri, 8 Jun 2018, Diego Biurrun wrote: On Thu, Jun 07, 2018 at 11:05:26PM -0300, James Almer wrote: On 6/7/2018 6:01 PM, Diego Biurrun wrote: > On Thu, Jun 07, 2018 at 03:03:21PM +0300, Martin Storsjö wrote: >> Commit 8c893aa3cd5 removed quotes that were required to detect >> inline asm: >> >> check_insn armv5te qadd r0, r0, r0 >> .../test.c:1:34: error: expected string literal in 'asm' >> void foo(void){ __asm__ volatile(qadd r0, r0, r0); } >> >> The correct code is: >> >> void foo(void){ __asm__ volatile("qadd r0, r0, r0"); } >> --- a/configure >> +++ b/configure >> @@ -866,7 +866,7 @@ EOF >> check_insn(){ >> log check_insn "$@" >> -check_inline_asm ${1}_inline "$2" >> +check_inline_asm ${1}_inline "\"$2\"" >> check_as ${1}_external "$2" >> } > > This does not look like the correct fix to me. The required quotes > should be part of the convenience function instead. Notice how calls > to check_insn and check_inline_asm differ in the way they quote their > arguments. There should be no need for this inconsistency. > > I'll look into it. Changing all the calls from check_insn name 'insn' to check_insn name '"insn"' would probably fix the check_inline_asm tests, but may break the check_as tests. Complicating the function calls is not the right way to go. The helper function should take care of the required quoting and not rely on the callers to pass arguments in nested quotes. Ping; whoever is waiting for the other, please pick the thread up again. // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
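As an illustration of the "helper adds the quotes" position argued above, here is a small model in C rather than in configure's shell. The function make_inline_asm_test is hypothetical; it only builds the text of the one-line test program and does not correspond to any existing configure function.

```c
#include <stdio.h>

/* Hypothetical model of the helper: callers pass the bare instruction,
 * and the helper wraps it in the string literal that __asm__ requires.
 * Returns the snprintf() result (length of the generated program). */
static int make_inline_asm_test(char *out, size_t outsz, const char *insn)
{
    return snprintf(out, outsz,
                    "void foo(void){ __asm__ volatile(\"%s\"); }", insn);
}
```

With this shape, the inline-asm check and the external-assembler check can both receive the same unquoted argument.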
[libav-devel] [PATCH] configure: fix inline asm checks
From: John Cox

Commit 8c893aa3cd5 removed quotes that were required to detect
inline asm:

check_insn armv5te qadd r0, r0, r0
.../test.c:1:34: error: expected string literal in 'asm'
void foo(void){ __asm__ volatile(qadd r0, r0, r0); }

The correct code is:

void foo(void){ __asm__ volatile("qadd r0, r0, r0"); }

Commit message written by Frank Liberato
---
 configure | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index 5e79c0cec1..48e8536b07 100755
--- a/configure
+++ b/configure
@@ -866,7 +866,7 @@ EOF
 
 check_insn(){
     log check_insn "$@"
-    check_inline_asm ${1}_inline "$2"
+    check_inline_asm ${1}_inline "\"$2\""
     check_as ${1}_external "$2"
 }
 
--
2.15.1 (Apple Git-101)
Re: [libav-devel] [PATCH] random_seed: use bcrypt instead of the old wincrypt API
On Tue, 17 Apr 2018, Diego Biurrun wrote: On Mon, Apr 16, 2018 at 05:50:04PM +0300, Martin Storsjö wrote: From: Steve Lhomme Remove the wincrypt API calls since we don't support XP anymore and bcrypt is available since Vista, even on Windows Store builds. --- Now with avutil_extralibs sorted alphabetically, and James' extended configure check included. --- a/configure +++ b/configure @@ -4579,9 +4579,10 @@ check_header windows.h +check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt && +check_cpp_condition bcrypt bcrypt.h "defined BCRYPT_RNG_ALGORITHM" This is a workaround for an old, already-obsolete version of mingw64. Before this shows up in a release it will be even more obsolete. IMO such workarounds are not worth the trouble; let the breakage occur where the actual bugs are and do the fixes at the root. I consider that the saner longterm strategy. Your call; push whichever version you prefer. If such versions regardless are common (which James fix would indicate), or even "aren't uncommon", I'd prefer to include the extended configure check. I wouldn't want to rule out building with a less-than-newest version of mingw-w64. Thus pushed with the extra check. // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] random_seed: use bcrypt instead of the old wincrypt API
From: Steve Lhomme Remove the wincrypt API calls since we don't support XP anymore and bcrypt is available since Vista, even on Windows Store builds. --- Now with avutil_extralibs sorted alphabetically, and James' extended configure check included. --- configure | 7 --- libavutil/random_seed.c | 19 ++- 2 files changed, 14 insertions(+), 12 deletions(-) diff --git a/configure b/configure index 3c7b6a0981..465fdcfb6d 100755 --- a/configure +++ b/configure @@ -1703,12 +1703,12 @@ SYSTEM_FUNCS=" " SYSTEM_LIBRARIES=" +bcrypt sdl vaapi_1 vaapi_drm vaapi_x11 vdpau_x11 -wincrypt " TOOLCHAIN_FEATURES=" @@ -2610,7 +2610,7 @@ avdevice_extralibs="libm_extralibs" avformat_extralibs="libm_extralibs" avfilter_extralibs="pthreads_extralibs libm_extralibs" avresample_extralibs="libm_extralibs" -avutil_extralibs="clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs" +avutil_extralibs="bcrypt_extralibs clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs" swscale_extralibs="libm_extralibs" # programs @@ -4579,9 +4579,10 @@ check_header windows.h # so we also check that atomics actually work here check_builtin stdatomic stdatomic.h "atomic_int foo; atomic_store(&foo, 0)" +check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt && +check_cpp_condition bcrypt bcrypt.h "defined BCRYPT_RNG_ALGORITHM" check_lib ole32"windows.h"CoTaskMemFree-lole32 check_lib shell32 "windows.h shellapi.h" CommandLineToArgvW -lshell32 -check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom -ladvapi32 check_lib psapi"windows.h psapi.h"GetProcessMemoryInfo -lpsapi check_struct "sys/time.h sys/resource.h" "struct rusage" 
ru_maxrss diff --git a/libavutil/random_seed.c b/libavutil/random_seed.c index 089d883916..388cb401ba 100644 --- a/libavutil/random_seed.c +++ b/libavutil/random_seed.c @@ -23,9 +23,9 @@ #if HAVE_UNISTD_H #include #endif -#if HAVE_WINCRYPT +#if HAVE_BCRYPT #include -#include +#include #endif #include #include @@ -96,13 +96,14 @@ uint32_t av_get_random_seed(void) { uint32_t seed; -#if HAVE_WINCRYPT -HCRYPTPROV provider; -if (CryptAcquireContext(&provider, NULL, NULL, PROV_RSA_FULL, -CRYPT_VERIFYCONTEXT | CRYPT_SILENT)) { -BOOL ret = CryptGenRandom(provider, sizeof(seed), (PBYTE) &seed); -CryptReleaseContext(provider, 0); -if (ret) +#if HAVE_BCRYPT +BCRYPT_ALG_HANDLE algo_handle; +NTSTATUS ret = BCryptOpenAlgorithmProvider(&algo_handle, BCRYPT_RNG_ALGORITHM, + MS_PRIMITIVE_PROVIDER, 0); +if (BCRYPT_SUCCESS(ret)) { +NTSTATUS ret = BCryptGenRandom(algo_handle, (UCHAR*)&seed, sizeof(seed), 0); +BCryptCloseAlgorithmProvider(algo_handle, 0); +if (BCRYPT_SUCCESS(ret)) return seed; } #endif -- 2.15.1 (Apple Git-101) ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
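For reference, the bcrypt call sequence from the patch can be sketched on its own as below. The function toy_os_random_seed is hypothetical and is not the libavutil function. The non-Windows branch exists only so the sketch builds elsewhere, and it assumes /dev/urandom is present. The sketch reuses one status variable instead of declaring a second ret inside the block. On Windows it needs to be linked with -lbcrypt.

```c
#include <stdint.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#include <bcrypt.h>
#endif

/* Hypothetical helper: fills *seed from the OS random source.
 * Returns 0 on success, -1 on failure. */
static int toy_os_random_seed(uint32_t *seed)
{
#ifdef _WIN32
    BCRYPT_ALG_HANDLE algo;
    NTSTATUS status = BCryptOpenAlgorithmProvider(&algo, BCRYPT_RNG_ALGORITHM,
                                                  MS_PRIMITIVE_PROVIDER, 0);
    if (!BCRYPT_SUCCESS(status))
        return -1;
    /* Same call order as the patch: generate, then close the provider. */
    status = BCryptGenRandom(algo, (PUCHAR)seed, sizeof(*seed), 0);
    BCryptCloseAlgorithmProvider(algo, 0);
    return BCRYPT_SUCCESS(status) ? 0 : -1;
#else
    /* Fallback for this sketch only; assumes /dev/urandom exists. */
    FILE *f = fopen("/dev/urandom", "rb");
    size_t n;

    if (!f)
        return -1;
    n = fread(seed, sizeof(*seed), 1, f);
    fclose(f);
    return n == 1 ? 0 : -1;
#endif
}
```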
[libav-devel] [PATCH] random_seed: use bcrypt instead of the old wincrypt API
From: Steve Lhomme Remove the wincrypt API calls since we don't support XP anymore and bcrypt is available since Vista, even on Windows Store builds. --- configure | 6 +++--- libavutil/random_seed.c | 19 ++- 2 files changed, 13 insertions(+), 12 deletions(-) diff --git a/configure b/configure index 3c7b6a0981..0eba9b24f3 100755 --- a/configure +++ b/configure @@ -1703,12 +1703,12 @@ SYSTEM_FUNCS=" " SYSTEM_LIBRARIES=" +bcrypt sdl vaapi_1 vaapi_drm vaapi_x11 vdpau_x11 -wincrypt " TOOLCHAIN_FEATURES=" @@ -2610,7 +2610,7 @@ avdevice_extralibs="libm_extralibs" avformat_extralibs="libm_extralibs" avfilter_extralibs="pthreads_extralibs libm_extralibs" avresample_extralibs="libm_extralibs" -avutil_extralibs="clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs" +avutil_extralibs="clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs bcrypt_extralibs" swscale_extralibs="libm_extralibs" # programs @@ -4579,9 +4579,9 @@ check_header windows.h # so we also check that atomics actually work here check_builtin stdatomic stdatomic.h "atomic_int foo; atomic_store(&foo, 0)" +check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt check_lib ole32"windows.h"CoTaskMemFree-lole32 check_lib shell32 "windows.h shellapi.h" CommandLineToArgvW -lshell32 -check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom -ladvapi32 check_lib psapi"windows.h psapi.h"GetProcessMemoryInfo -lpsapi check_struct "sys/time.h sys/resource.h" "struct rusage" ru_maxrss diff --git a/libavutil/random_seed.c b/libavutil/random_seed.c index 089d883916..388cb401ba 100644 --- a/libavutil/random_seed.c +++ b/libavutil/random_seed.c 
@@ -23,9 +23,9 @@ #if HAVE_UNISTD_H #include #endif -#if HAVE_WINCRYPT +#if HAVE_BCRYPT #include -#include +#include #endif #include #include @@ -96,13 +96,14 @@ uint32_t av_get_random_seed(void) { uint32_t seed; -#if HAVE_WINCRYPT -HCRYPTPROV provider; -if (CryptAcquireContext(&provider, NULL, NULL, PROV_RSA_FULL, -CRYPT_VERIFYCONTEXT | CRYPT_SILENT)) { -BOOL ret = CryptGenRandom(provider, sizeof(seed), (PBYTE) &seed); -CryptReleaseContext(provider, 0); -if (ret) +#if HAVE_BCRYPT +BCRYPT_ALG_HANDLE algo_handle; +NTSTATUS ret = BCryptOpenAlgorithmProvider(&algo_handle, BCRYPT_RNG_ALGORITHM, + MS_PRIMITIVE_PROVIDER, 0); +if (BCRYPT_SUCCESS(ret)) { +NTSTATUS ret = BCryptGenRandom(algo_handle, (UCHAR*)&seed, sizeof(seed), 0); +BCryptCloseAlgorithmProvider(algo_handle, 0); +if (BCRYPT_SUCCESS(ret)) return seed; } #endif -- 2.15.1 (Apple Git-101) ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] [v2] use bcrypt instead of the old wincrypt API
On Sun, 15 Apr 2018, Martin Storsjö wrote: On Tue, 3 Apr 2018, Steve Lhomme wrote: When targeting Windows Vista and above The wincrypt API is deprecated and not allowed for Windows Store apps. Wincrypt can be removed after XP support is dropped. --- configure | 4 +++- libavutil/random_seed.c | 17 +++-- 2 files changed, 18 insertions(+), 3 deletions(-) diff --git a/configure b/configure index 77754d0f51..0ab975bb1c 100755 --- a/configure +++ b/configure @@ -1703,6 +1703,7 @@ SYSTEM_FUNCS=" " SYSTEM_LIBRARIES=" +bcrypt sdl vaapi_1 vaapi_drm @@ -2611,7 +2612,7 @@ avdevice_extralibs="libm_extralibs" avformat_extralibs="libm_extralibs" avfilter_extralibs="pthreads_extralibs libm_extralibs" avresample_extralibs="libm_extralibs" -avutil_extralibs="clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs" +avutil_extralibs="bcrypt_extralibs clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs" swscale_extralibs="libm_extralibs" # programs @@ -4581,6 +4582,7 @@ check_lib ole32"windows.h" CoTaskMemFree-lole32 check_lib shell32 "windows.h shellapi.h" CommandLineToArgvW -lshell32 check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom -ladvapi32 check_lib psapi"windows.h psapi.h"GetProcessMemoryInfo -lpsapi +test_cpp_condition windows.h "_WIN32_WINNT >= 0x0600" && check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt check_struct "sys/time.h sys/resource.h" "struct rusage" ru_maxrss diff --git a/libavutil/random_seed.c b/libavutil/random_seed.c index 089d883916..d11bff2ef6 100644 --- a/libavutil/random_seed.c +++ b/libavutil/random_seed.c @@ -23,7 +23,10 @@ #if 
HAVE_UNISTD_H #include #endif -#if HAVE_WINCRYPT +#if HAVE_BCRYPT +#include +#include +#elif HAVE_WINCRYPT #include #include #endif @@ -96,7 +99,17 @@ uint32_t av_get_random_seed(void) { uint32_t seed; -#if HAVE_WINCRYPT +#if HAVE_BCRYPT +BCRYPT_ALG_HANDLE algo_handle; +NTSTATUS ret = BCryptOpenAlgorithmProvider(&algo_handle, BCRYPT_RNG_ALGORITHM, + MS_PRIMITIVE_PROVIDER, 0); +if (BCRYPT_SUCCESS(ret)) { +NTSTATUS ret = BCryptGenRandom(algo_handle, (UCHAR*)&seed, sizeof(seed), 0); +BCryptCloseAlgorithmProvider(algo_handle, 0); +if (BCRYPT_SUCCESS(ret)) +return seed; +} +#elif HAVE_WINCRYPT HCRYPTPROV provider; if (CryptAcquireContext(&provider, NULL, NULL, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT | CRYPT_SILENT)) { -- 2.16.2 This is ok with me and I can push it (perhaps with removing the check for _WIN32_WINNT >= 0x600). I guess removing wincrypt can be left as a separate later patch? As the form pushed in ffmpeg was with removing wincrypt at the same time, I'd prefer using that form here as well. I'll send a version of the patch in that form, and push a day later unless there's anything further to change. // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] [v2] use bcrypt instead of the old wincrypt API
On Tue, 3 Apr 2018, Steve Lhomme wrote: When targeting Windows Vista and above The wincrypt API is deprecated and not allowed for Windows Store apps. Wincrypt can be removed after XP support is dropped. --- configure | 4 +++- libavutil/random_seed.c | 17 +++-- 2 files changed, 18 insertions(+), 3 deletions(-) diff --git a/configure b/configure index 77754d0f51..0ab975bb1c 100755 --- a/configure +++ b/configure @@ -1703,6 +1703,7 @@ SYSTEM_FUNCS=" " SYSTEM_LIBRARIES=" +bcrypt sdl vaapi_1 vaapi_drm @@ -2611,7 +2612,7 @@ avdevice_extralibs="libm_extralibs" avformat_extralibs="libm_extralibs" avfilter_extralibs="pthreads_extralibs libm_extralibs" avresample_extralibs="libm_extralibs" -avutil_extralibs="clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs" +avutil_extralibs="bcrypt_extralibs clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs" swscale_extralibs="libm_extralibs" # programs @@ -4581,6 +4582,7 @@ check_lib ole32"windows.h"CoTaskMemFree -lole32 check_lib shell32 "windows.h shellapi.h" CommandLineToArgvW -lshell32 check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom -ladvapi32 check_lib psapi"windows.h psapi.h"GetProcessMemoryInfo -lpsapi +test_cpp_condition windows.h "_WIN32_WINNT >= 0x0600" && check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt check_struct "sys/time.h sys/resource.h" "struct rusage" ru_maxrss diff --git a/libavutil/random_seed.c b/libavutil/random_seed.c index 089d883916..d11bff2ef6 100644 --- a/libavutil/random_seed.c +++ b/libavutil/random_seed.c @@ -23,7 +23,10 @@ #if HAVE_UNISTD_H #include #endif -#if HAVE_WINCRYPT +#if 
HAVE_BCRYPT +#include +#include +#elif HAVE_WINCRYPT #include #include #endif @@ -96,7 +99,17 @@ uint32_t av_get_random_seed(void) { uint32_t seed; -#if HAVE_WINCRYPT +#if HAVE_BCRYPT +BCRYPT_ALG_HANDLE algo_handle; +NTSTATUS ret = BCryptOpenAlgorithmProvider(&algo_handle, BCRYPT_RNG_ALGORITHM, + MS_PRIMITIVE_PROVIDER, 0); +if (BCRYPT_SUCCESS(ret)) { +NTSTATUS ret = BCryptGenRandom(algo_handle, (UCHAR*)&seed, sizeof(seed), 0); +BCryptCloseAlgorithmProvider(algo_handle, 0); +if (BCRYPT_SUCCESS(ret)) +return seed; +} +#elif HAVE_WINCRYPT HCRYPTPROV provider; if (CryptAcquireContext(&provider, NULL, NULL, PROV_RSA_FULL, CRYPT_VERIFYCONTEXT | CRYPT_SILENT)) { -- 2.16.2 This is ok with me and I can push it (perhaps with removing the check for _WIN32_WINNT >= 0x600). I guess removing wincrypt can be left as a separate later patch? // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] x86: Don't declare a non-static function as inline
This fixes building with clang in msvc mode, which does support
gcc style inline assembly.
---
 libavcodec/x86/xvididct_sse2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/x86/xvididct_sse2.c b/libavcodec/x86/xvididct_sse2.c
index f318e95..0de59a5 100644
--- a/libavcodec/x86/xvididct_sse2.c
+++ b/libavcodec/x86/xvididct_sse2.c
@@ -342,7 +342,7 @@ DECLARE_ASM_CONST(16, int32_t, walkenIdctRounders)[] = {
     "movdqa %%xmm6, 4*16("dct") \n\t" \
     "movdqa "SREG2", 7*16("dct")\n\t"
 
-inline void ff_xvid_idct_sse2(short *block)
+void ff_xvid_idct_sse2(short *block)
 {
     __asm__ volatile (
         "movq "MANGLE (m127) ", %%mm0 \n\t"
--
2.7.4
Re: [libav-devel] [PATCH] x86: Don't declare a non-static function as inline
On Sat, 14 Apr 2018, Diego Biurrun wrote: On Sat, Apr 14, 2018 at 01:38:30PM +0300, Martin Storsjö wrote: Make the actual implementation static inline, but add a non-static non-inline frontend for it. This fixes building with clang in msvc mode, which does support gcc style inline assembly. --- a/libavcodec/x86/xvididct_sse2.c +++ b/libavcodec/x86/xvididct_sse2.c @@ -342,7 +342,7 @@ DECLARE_ASM_CONST(16, int32_t, walkenIdctRounders)[] = { -inline void ff_xvid_idct_sse2(short *block) +static inline void xvid_idct_sse2(short *block) { __asm__ volatile ( "movq "MANGLE (m127) ", %%mm0 \n\t" @@ -390,15 +390,20 @@ inline void ff_xvid_idct_sse2(short *block) +void ff_xvid_idct_sse2(short *block) +{ +xvid_idct_sse2(block); +} Why not simply drop the inline and be done with it? I notice that the MMX version of this does not have the inline keyword. That's probably just as good. // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
[libav-devel] [PATCH] x86: Don't declare a non-static function as inline
Make the actual implementation static inline, but add a non-static
non-inline frontend for it.

This fixes building with clang in msvc mode, which does support
gcc style inline assembly.
---
 libavcodec/x86/xvididct_sse2.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/libavcodec/x86/xvididct_sse2.c b/libavcodec/x86/xvididct_sse2.c
index f318e95..00ed803 100644
--- a/libavcodec/x86/xvididct_sse2.c
+++ b/libavcodec/x86/xvididct_sse2.c
@@ -342,7 +342,7 @@ DECLARE_ASM_CONST(16, int32_t, walkenIdctRounders)[] = {
     "movdqa %%xmm6, 4*16("dct") \n\t" \
     "movdqa "SREG2", 7*16("dct")\n\t"
 
-inline void ff_xvid_idct_sse2(short *block)
+static inline void xvid_idct_sse2(short *block)
 {
     __asm__ volatile (
         "movq "MANGLE (m127) ", %%mm0 \n\t"
@@ -390,15 +390,20 @@ inline void ff_xvid_idct_sse2(short *block)
         "%eax", "%ecx", "%edx", "%esi", "memory");
 }
 
+void ff_xvid_idct_sse2(short *block)
+{
+    xvid_idct_sse2(block);
+}
+
 void ff_xvid_idct_sse2_put(uint8_t *dest, ptrdiff_t line_size, short *block)
 {
-    ff_xvid_idct_sse2(block);
+    xvid_idct_sse2(block);
     ff_put_pixels_clamped_mmx(block, dest, line_size);
 }
 
 void ff_xvid_idct_sse2_add(uint8_t *dest, ptrdiff_t line_size, short *block)
 {
-    ff_xvid_idct_sse2(block);
+    xvid_idct_sse2(block);
     ff_add_pixels_clamped_mmx(block, dest, line_size);
 }
 
--
2.7.4
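The pattern in this patch, reduced to a portable sketch without any assembly. All names here (toy_idct, ff_toy_idct, ff_toy_idct_put) are hypothetical, and the loop body is placeholder work, not an IDCT. As far as I understand C99, a function defined with plain inline (neither static nor extern) does not by itself provide an external definition, which is one reason to keep the exported symbol non-inline; the thread itself only says that it fixes the clang/msvc build.

```c
/* Implementation: static inline, so callers in this file can inline it
 * and no external symbol is expected from it. */
static inline void toy_idct(short *block)
{
    int i;

    for (i = 0; i < 64; i++)
        block[i] = (short)(block[i] * 2);   /* placeholder work only */
}

/* Exported, non-inline frontend that other files can link against. */
void ff_toy_idct(short *block)
{
    toy_idct(block);
}

/* Callers inside the file use the static inline version directly,
 * then clamp the result to 0..255 as a stand-in for the put step. */
void ff_toy_idct_put(unsigned char *dest, short *block)
{
    int i;

    toy_idct(block);
    for (i = 0; i < 64; i++)
        dest[i] = block[i] < 0 ? 0 : block[i] > 255 ? 255
                                   : (unsigned char)block[i];
}
```

Simply dropping the inline keyword from the exported function, as suggested in the replies, gives the same external interface with less code.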
Re: [libav-devel] [PATCH 2/2] Drop Windows XP support remnants
On Thu, 5 Apr 2018, Diego Biurrun wrote: --- libavcodec/dxva2_h264.c | 6 +- libavcodec/dxva2_hevc.c | 6 +- libavcodec/dxva2_mpeg2.c | 7 ++- libavcodec/dxva2_vc1.c| 6 +- libavutil/hwcontext_d3d11va.c | 9 + libavutil/hwcontext_dxva2.c | 4 6 files changed, 6 insertions(+), 32 deletions(-) diff --git a/libavcodec/dxva2_h264.c b/libavcodec/dxva2_h264.c index 50e7863bf2..790e4a214b 100644 --- a/libavcodec/dxva2_h264.c +++ b/libavcodec/dxva2_h264.c @@ -20,16 +20,12 @@ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA */ +#include "dxva2_internal.h" #include "h264dec.h" #include "h264data.h" #include "h264_ps.h" #include "mpegutils.h" -// The headers above may include w32threads.h, which uses the original -// _WIN32_WINNT define, while dxva2_internal.h redefines it to target a -// potentially newer version. -#include "dxva2_internal.h" Well technically, this hasn't changed - dxva2_internal.h includes dxva2.h which still redefines _WIN32_WINNT. It just sets it to 0x0602, while the lowest it'll be here from before is 0x0600 and the difference shouldn't matter for e.g. w32threads.h. The patch probably is fine though, but reading the patch made me grep the source to see what the actual case was. // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH 1/2] w32pthreads: always use Vista+ API, drop XP support
On Thu, 5 Apr 2018, Diego Biurrun wrote: From: wm4 This removes the XP compatibility code, and switches entirely to SRW locks, which are available starting at Windows Vista. This removes CRITICAL_SECTION use, which allows us to add PTHREAD_MUTEX_INITIALIZER, which will be useful later. Windows XP is hereby not a supported build target anymore. Signed-off-by: Diego Biurrun --- Changes to original patch: - proper w32threads dependencies - added missing Cygwin flags Changelog | 2 + compat/w32pthreads.h | 269 ++--- configure | 19 ++-- libavcodec/pthread_frame.c | 4 - libavcodec/pthread_slice.c | 4 - libavfilter/pthread.c | 4 - 6 files changed, 23 insertions(+), 279 deletions(-) Looks ok, haven't tested it myself yet. // Martin ___ libav-devel mailing list libav-devel@libav.org https://lists.libav.org/mailman/listinfo/libav-devel
Re: [libav-devel] [PATCH] use bcrypt instead of the old wincrypt API
On Fri, 30 Mar 2018, James Almer wrote: On 3/30/2018 3:13 PM, Martin Storsjö wrote: On Fri, 30 Mar 2018, James Almer wrote: On 3/30/2018 10:57 AM, Martin Storsjö wrote: On Fri, 30 Mar 2018, Diego Biurrun wrote: On Fri, Mar 30, 2018 at 10:43:27AM -0300, James Almer wrote: On 3/30/2018 10:38 AM, Diego Biurrun wrote: > On Fri, Mar 30, 2018 at 12:38:05PM +0200, Steve Lhomme wrote: >> Le 30/03/2018 à 10:46, Diego Biurrun a écrit : >>> On Fri, Mar 30, 2018 at 09:36:05AM +0200, Steve Lhomme wrote: >>>> --- a/configure >>>> +++ b/configure >>>> @@ -4581,6 +4582,7 @@ check_lib ole32 "windows.h" CoTaskMemFree -lole32 >>>> check_lib shell32 "windows.h shellapi.h" CommandLineToArgvW -lshell32 >>>> check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom -ladvapi32 >>>> check_lib psapi "windows.h psapi.h" GetProcessMemoryInfo -lpsapi >>>> +check_cpp_condition Vista+ windows.h "_WIN32_WINNT >= 0x0600" && check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt If you don't need to set any variable then just use test_cpp_condition() Yes, good point. >>> Do you really need to check the Vista condition? What about using bcrypt >>> unconditionally if available? >> >> Yes, you need to use it only on builds that won't run on XP. Otherwise it >> will fail to load the bcrypt.dll and the whole libavutil DLL (or whatever >> its form) will fail to load. It would be possible to do it dynamically but >> IMO it's overkill. It's not really a critical component. > > Is bcrypt available on XP? If no then the CPP condition check would seem > unnecessary. You could just check for bcrypt and bcrypt being available > would imply Vista. I think I'm missing something. check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt Seems to succeed even if targeting XP, at least on mingw-w64. Isn't that wrong then? I guess it just means that mingw-w64 doesn't have _WIN32_WINNT ifdefs guarding the availability of this function in the headers. 
(The official windows SDK might, although that SDK also has dropped XP support long ago iirc.)

bcrypt.h on mingw-w64 is completely wrapped in checks like
#if WINAPI_FAMILY_PARTITION (WINAPI_PARTITION_DESKTOP) || _WIN32_WINNT >= 0x0A00
The former is the reason it succeeds in XP, seeing the latter is checking for Windows 10 or newer.

Hmm, ok. I guess the correct form would be something like "(WINAPI_FAMILY_PARTITION (WINAPI_PARTITION_DESKTOP) && _WIN32_WINNT >= 0x0600) || _WIN32_WINNT >= 0x0A00" then. // Martin

The WINAPI_PARTITION_DESKTOP check is already done in configure to enable or disable the uwp variable.

Not sure I see how that relates... that part of the header guard makes it visible and makes the check succeed when targeting XP, even though it really isn't available there according to Steve.

In any case, does this mean that on uwp neither BCryptGenRandom nor CryptGenRandom are available/allowed?

The way I read that, for UWP on Win10, the bcrypt.h stuff should be fine, no? (Based on the mingw-w64 header guards, it might not be for win8/8.1 RT/store/UWP/whatever apps, although MSDN doesn't seem to say anything about it.) // Martin
Re: [libav-devel] [PATCH] use bcrypt instead of the old wincrypt API
On Fri, 30 Mar 2018, Diego Biurrun wrote: On Fri, Mar 30, 2018 at 04:58:29PM +0300, Martin Storsjö wrote: On Fri, 30 Mar 2018, Diego Biurrun wrote: > On Fri, Mar 30, 2018 at 12:38:05PM +0200, Steve Lhomme wrote: > > Le 30/03/2018 à 10:46, Diego Biurrun a écrit : > > > On Fri, Mar 30, 2018 at 09:36:05AM +0200, Steve Lhomme wrote: > > > > --- a/configure > > > > +++ b/configure > > > > @@ -4581,6 +4582,7 @@ check_lib ole32"windows.h" CoTaskMemFree-lole32 > > > > check_lib shell32 "windows.h shellapi.h" CommandLineToArgvW -lshell32 > > > > check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom -ladvapi32 > > > > check_lib psapi"windows.h psapi.h"GetProcessMemoryInfo -lpsapi > > > > +check_cpp_condition Vista+ windows.h "_WIN32_WINNT >= 0x0600" && check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt > > > Do you really need to check the Vista condition? What about using bcrypt > > > unconditionally if available? > > > > Yes, you need to use it only on builds that won't run on XP. Otherwise it > > will fail to load the bcrypt.dll and the whole libavutil DLL (or whatever > > its form) will fail to load. It would be possible to do it dynamically but > > IMO it's overkill. It's not really a critical component. > > Is bcrypt available on XP? If no then the CPP condition check would seem > unnecessary. You could just check for bcrypt and bcrypt being available > would imply Vista. I think I'm missing something. > > > But with time if XP support is dropped this check can go and wincrypt > > dropped entirely. > > Is it maybe time to consider dropping XP support? I wouldn't mind. Let's go ahead then. See e.g. 9b121dfc32810250938021952aab4172a988cb56 in ffmpeg; dropping XP support simplifies the w32pthreads wrapper and allows using better synchronization primitives, that allow e.g. static initialization of mutexes. Do we need to do more changes apart from importing that commit? Don't think so, except for whatever configure differences there are. 
// Martin
Re: [libav-devel] [PATCH] use bcrypt instead of the old wincrypt API
On Fri, 30 Mar 2018, James Almer wrote:
On 3/30/2018 10:57 AM, Martin Storsjö wrote:
On Fri, 30 Mar 2018, Diego Biurrun wrote:
On Fri, Mar 30, 2018 at 10:43:27AM -0300, James Almer wrote:
On 3/30/2018 10:38 AM, Diego Biurrun wrote:
> On Fri, Mar 30, 2018 at 12:38:05PM +0200, Steve Lhomme wrote:
>> On 30/03/2018 at 10:46, Diego Biurrun wrote:
>>> On Fri, Mar 30, 2018 at 09:36:05AM +0200, Steve Lhomme wrote:
>>>> --- a/configure
>>>> +++ b/configure
>>>> @@ -4581,6 +4582,7 @@ check_lib ole32 "windows.h" CoTaskMemFree -lole32
>>>> check_lib shell32 "windows.h shellapi.h" CommandLineToArgvW -lshell32
>>>> check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom -ladvapi32
>>>> check_lib psapi "windows.h psapi.h" GetProcessMemoryInfo -lpsapi
>>>> +check_cpp_condition Vista+ windows.h "_WIN32_WINNT >= 0x0600" && check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt

If you don't need to set any variable then just use test_cpp_condition()

Yes, good point.

>>> Do you really need to check the Vista condition? What about using bcrypt
>>> unconditionally if available?
>>
>> Yes, you need to use it only on builds that won't run on XP. Otherwise it
>> will fail to load the bcrypt.dll and the whole libavutil DLL (or whatever
>> its form) will fail to load. It would be possible to do it dynamically but
>> IMO it's overkill. It's not really a critical component.
>
> Is bcrypt available on XP? If no then the CPP condition check would seem
> unnecessary. You could just check for bcrypt and bcrypt being available
> would imply Vista. I think I'm missing something.

check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt

Seems to succeed even if targeting XP, at least on mingw-w64.

Isn't that wrong then?

I guess it just means that mingw-w64 doesn't have _WIN32_WINNT ifdefs guarding the availability of this function in the headers. (The official windows SDK might, although that SDK also has dropped XP support long ago iirc.)

bcrypt.h on mingw-w64 is completely wrapped in checks like

#if WINAPI_FAMILY_PARTITION (WINAPI_PARTITION_DESKTOP) || _WIN32_WINNT >= 0x0A00

The former is the reason it succeeds in XP, seeing the latter is checking for Windows 10 or newer.

Hmm, ok. I guess the correct form would be something like "(WINAPI_FAMILY_PARTITION (WINAPI_PARTITION_DESKTOP) && _WIN32_WINNT >= 0x0600) || _WIN32_WINNT >= 0x0A00" then.

// Martin
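To see which builds the guard proposed in this message would accept, here is a minimal shell sketch that evaluates the same boolean expression for a few configurations. The function name bcrypt_guard and its two arguments are made up for illustration; in a real build the C preprocessor evaluates the condition, not a shell script.

```shell
#!/bin/sh
# Hypothetical helper: evaluate
#   (DESKTOP && _WIN32_WINNT >= 0x0600) || _WIN32_WINNT >= 0x0A00
#   $1: 1 if WINAPI_FAMILY_PARTITION(WINAPI_PARTITION_DESKTOP) holds, else 0
#   $2: the _WIN32_WINNT value as a hex literal, e.g. 0x0501 for XP
# Returns 0 (true) when the guard would let bcrypt be used.
bcrypt_guard() {
    desktop=$1
    winnt=$2
    [ $(( ($desktop && $winnt >= 0x0600) || $winnt >= 0x0A00 )) -eq 1 ]
}

# Walk through a few example targets and report what the guard decides.
for cfg in "1 0x0501" "1 0x0600" "0 0x0602" "0 0x0A00"; do
    set -- $cfg
    if bcrypt_guard "$1" "$2"; then
        echo "desktop=$1 _WIN32_WINNT=$2: guard true"
    else
        echo "desktop=$1 _WIN32_WINNT=$2: guard false"
    fi
done
```

With this form, a desktop build targeting XP (0x0501) is rejected, while the original mingw-w64 condition would have accepted it through the desktop-partition term alone.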
Re: [libav-devel] [PATCH] use bcrypt instead of the old wincrypt API
On Fri, 30 Mar 2018, Diego Biurrun wrote:
On Fri, Mar 30, 2018 at 12:38:05PM +0200, Steve Lhomme wrote:
On 30/03/2018 at 10:46, Diego Biurrun wrote:
> On Fri, Mar 30, 2018 at 09:36:05AM +0200, Steve Lhomme wrote:
> > --- a/configure
> > +++ b/configure
> > @@ -4581,6 +4582,7 @@ check_lib ole32 "windows.h" CoTaskMemFree -lole32
> > check_lib shell32 "windows.h shellapi.h" CommandLineToArgvW -lshell32
> > check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom -ladvapi32
> > check_lib psapi "windows.h psapi.h" GetProcessMemoryInfo -lpsapi
> > +check_cpp_condition Vista+ windows.h "_WIN32_WINNT >= 0x0600" && check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt
> Do you really need to check the Vista condition? What about using bcrypt
> unconditionally if available?

Yes, you need to use it only on builds that won't run on XP. Otherwise it will fail to load the bcrypt.dll and the whole libavutil DLL (or whatever its form) will fail to load. It would be possible to do it dynamically but IMO it's overkill. It's not really a critical component.

Is bcrypt available on XP? If no then the CPP condition check would seem unnecessary. You could just check for bcrypt and bcrypt being available would imply Vista. I think I'm missing something.

But with time if XP support is dropped this check can go and wincrypt dropped entirely.

Is it maybe time to consider dropping XP support?

I wouldn't mind. See e.g. 9b121dfc32810250938021952aab4172a988cb56 in ffmpeg; dropping XP support simplifies the w32pthreads wrapper and allows using better synchronization primitives, that allow e.g. static initialization of mutexes.

// Martin
Re: [libav-devel] [PATCH] use bcrypt instead of the old wincrypt API
On Fri, 30 Mar 2018, Diego Biurrun wrote:
On Fri, Mar 30, 2018 at 10:43:27AM -0300, James Almer wrote:
On 3/30/2018 10:38 AM, Diego Biurrun wrote:
> On Fri, Mar 30, 2018 at 12:38:05PM +0200, Steve Lhomme wrote:
>> On 30/03/2018 at 10:46, Diego Biurrun wrote:
>>> On Fri, Mar 30, 2018 at 09:36:05AM +0200, Steve Lhomme wrote:
>>>> --- a/configure
>>>> +++ b/configure
>>>> @@ -4581,6 +4582,7 @@ check_lib ole32 "windows.h" CoTaskMemFree -lole32
>>>> check_lib shell32 "windows.h shellapi.h" CommandLineToArgvW -lshell32
>>>> check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom -ladvapi32
>>>> check_lib psapi "windows.h psapi.h" GetProcessMemoryInfo -lpsapi
>>>> +check_cpp_condition Vista+ windows.h "_WIN32_WINNT >= 0x0600" && check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt

If you don't need to set any variable then just use test_cpp_condition()

Yes, good point.

>>> Do you really need to check the Vista condition? What about using bcrypt
>>> unconditionally if available?
>>
>> Yes, you need to use it only on builds that won't run on XP. Otherwise it
>> will fail to load the bcrypt.dll and the whole libavutil DLL (or whatever
>> its form) will fail to load. It would be possible to do it dynamically but
>> IMO it's overkill. It's not really a critical component.
>
> Is bcrypt available on XP? If no then the CPP condition check would seem
> unnecessary. You could just check for bcrypt and bcrypt being available
> would imply Vista. I think I'm missing something.

check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom -lbcrypt

Seems to succeed even if targeting XP, at least on mingw-w64.

Isn't that wrong then?

I guess it just means that mingw-w64 doesn't have _WIN32_WINNT ifdefs guarding the availability of this function in the headers. (The official windows SDK might, although that SDK also has dropped XP support long ago iirc.)

// Martin
[libav-devel] [PATCH 2/2] arm: Produce .const_data instead of .section .rodata for Mach-O
This is the same combination of .section directives as used in aarch64/asm.S.

Since Xcode 9.3, the bundled clang supports altmacro and doesn't require using gas-preprocessor any longer.
---
 libavutil/arm/asm.S | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/libavutil/arm/asm.S b/libavutil/arm/asm.S
index 08574852b5..e7eea0271f 100644
--- a/libavutil/arm/asm.S
+++ b/libavutil/arm/asm.S
@@ -111,11 +111,17 @@ FUNC .func \name
 ELF .size \name, . - \name
         .purgem endconst
     .endm
-.if HAVE_SECTION_DATA_REL_RO && \relocate
+#if HAVE_SECTION_DATA_REL_RO
+.if \relocate
         .section .data.rel.ro
 .else
         .section .rodata
 .endif
+#elif !defined(__MACH__)
+        .section .rodata
+#else
+        .const_data
+#endif
         .align \align
 \name:
 .endm
--
2.15.1 (Apple Git-101)
[libav-devel] [PATCH 1/2] arm: vc1dsp: Add commas between macro arguments
When targeting darwin, clang requires commas between arguments, while the no-comma form is allowed for other targets.

Since Xcode 9.3, the bundled clang supports altmacro and doesn't require using gas-preprocessor any longer.
---
 libavcodec/arm/vc1dsp_neon.S | 94 ++--
 1 file changed, 47 insertions(+), 47 deletions(-)

diff --git a/libavcodec/arm/vc1dsp_neon.S b/libavcodec/arm/vc1dsp_neon.S
index ff88fe23c7..71cc3f4413 100644
--- a/libavcodec/arm/vc1dsp_neon.S
+++ b/libavcodec/arm/vc1dsp_neon.S
@@ -410,13 +410,13 @@ function ff_vc1_inv_trans_8x8_neon, export=1
         @ src[48] q14
         @ src[56] q15
-        vc1_inv_trans_8x8_helper add=4 add1beforeshift=0 rshift=3
+        vc1_inv_trans_8x8_helper add=4, add1beforeshift=0, rshift=3
         @ Transpose result matrix of 8x8
         swap4 d17, d19, d21, d23, d24, d26, d28, d30
         transpose16_4x4 q8, q9, q10, q11, q12, q13, q14, q15
-        vc1_inv_trans_8x8_helper add=64 add1beforeshift=1 rshift=7
+        vc1_inv_trans_8x8_helper add=64, add1beforeshift=1, rshift=7
         vst1.64 {q8-q9}, [r0,:128]!
         vst1.64 {q10-q11}, [r0,:128]!
@@ -431,7 +431,7 @@ function ff_vc1_inv_trans_8x4_neon, export=1
         vld1.64 {q0-q1}, [r2,:128]! @ load 8 * 4 * 2 = 64 bytes / 16 bytes per quad = 4 quad registers
         vld1.64 {q2-q3}, [r2,:128]
-        transpose16 q0 q1 q2 q3 @ transpose rows to columns
+        transpose16 q0, q1, q2, q3 @ transpose rows to columns
         @ At this point:
         @ src[0] d0
@@ -443,7 +443,7 @@ function ff_vc1_inv_trans_8x4_neon, export=1
         @ src[6] d5
         @ src[7] d7
-        vc1_inv_trans_8x4_helper add=4 add1beforeshift=0 rshift=3
+        vc1_inv_trans_8x4_helper add=4, add1beforeshift=0, rshift=3
         @ Move output to more standardized registers
         vmov d0, d16
@@ -465,7 +465,7 @@ function ff_vc1_inv_trans_8x4_neon, export=1
         @ dst[6] d5
         @ dst[7] d7
-        transpose16 q0 q1 q2 q3 @ turn columns into rows
+        transpose16 q0, q1, q2, q3 @ turn columns into rows
         @ At this point:
         @ row[0] q0
@@ -473,7 +473,7 @@ function ff_vc1_inv_trans_8x4_neon, export=1
         @ row[2] q2
         @ row[3] q3
-        vc1_inv_trans_4x8_helper add=64 rshift=7
+        vc1_inv_trans_4x8_helper add=64, rshift=7
         @ At this point:
         @ line[0].l d0
@@ -523,7 +523,7 @@ function ff_vc1_inv_trans_4x8_neon, export=1
         vld4.16 {d1[2], d3[2], d5[2], d7[2]}, [r2,:64], r12
         vld4.16 {d1[3], d3[3], d5[3], d7[3]}, [r2,:64]
-        vc1_inv_trans_4x8_helper add=4 rshift=3
+        vc1_inv_trans_4x8_helper add=4, rshift=3
         @ At this point:
         @ dst[0] = q0
@@ -531,9 +531,9 @@ function ff_vc1_inv_trans_4x8_neon, export=1
         @ dst[2] = q2
         @ dst[3] = q3
-        transpose16 q0 q1 q2 q3 @ Transpose rows (registers) into columns
+        transpose16 q0, q1, q2, q3 @ Transpose rows (registers) into columns
-        vc1_inv_trans_8x4_helper add=64 add1beforeshift=1 rshift=7
+        vc1_inv_trans_8x4_helper add=64, add1beforeshift=1, rshift=7
         vld1.32 {d28[]}, [r0,:32], r1 @ read dest
         vld1.32 {d28[1]}, [r0,:32], r1
@@ -611,7 +611,7 @@ function ff_vc1_inv_trans_4x4_neon, export=1
         @ src[2] = d1
         @ src[3] = d3
-        vc1_inv_trans_4x4_helper add=4 rshift=3 @ compute t1, t2, t3, t4 and combine them into dst[0-3]
+        vc1_inv_trans_4x4_helper add=4, rshift=3 @ compute t1, t2, t3, t4 and combine them into dst[0-3]
         @ At this point:
         @ dst[0] = d0
@@ -619,7 +619,7 @@ function ff_vc1_inv_trans_4x4_neon, export=1
         @ dst[2] = d1
         @ dst[3] = d2
-        transpose16 d0 d3 d1 d2 @ Transpose rows (registers) into columns
+        transpose16 d0, d3, d1, d2 @ Transpose rows (registers) into columns
         @ At this point:
         @ src[0] = d0
@@ -635,7 +635,7 @@ function ff_vc1_inv_trans_4x4_neon, export=1
         @ src[16] = d1
         @ src[24] = d3
-        vc1_inv_trans_4x4_helper add=64 rshift=7 @ compute t1, t2, t3, t4 and combine them into dst[0-3]
+        vc1_inv_trans_4x4_helper add=64, rshift=7 @ compute t1, t2, t3, t4 and combine them into dst[0-3]
         @ At this point:
         @ line[0] = d0
@@ -665,26 +665,26 @@ endfunc
 @ The absolute value of multiplication constants from vc1_mspel_filter and vc1_mspel_{ver,hor}_filter_16bits.
 @ The sign is embedded in the code below that carries out the multiplication (mspel_filter{,.16}).
-#define MSPEL_MODE_1_MUL_CONSTANTS 4 53 18 3
-#define MSPEL_MODE_2_MUL_CONSTANTS 1 9 9 1
-#define MSPEL_MODE_3_MUL_CONSTANTS 3 18 53 4
+#define M
[libav-devel] [PATCH] configure: Don't assume a 16 byte aligned stack on BSDs on i386
With GCC, request it to maintain 16 byte alignment, and the existing entry points already align it via attribute_align_arg.

With clang, do the same as for mingw; disable the aligned stack and let the assembly functions that require it do the alignment instead.
---
 configure | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 95e6006440..78a2065208 100755
--- a/configure
+++ b/configure
@@ -4957,16 +4957,34 @@ elif enabled gcc; then
     check_cflags -Werror=format-security
     check_cflags -fdiagnostics-color=auto
     enabled extra_warnings || check_disable_warning -Wno-maybe-uninitialized
+    if enabled x86_32; then
+        case $target_os in
+        *bsd*)
+            # BSDs don't guarantee a 16 byte aligned stack, but we can
+            # request GCC to try to maintain 16 byte alignment throughout
+            # function calls. Library entry points that might call assembly
+            # functions align the stack. (The parameter means 2^4 bytes.)
+            check_cflags -mpreferred-stack-boundary=4
+            ;;
+        esac
+    fi
 elif enabled llvm_gcc; then
     check_cflags -mllvm -stack-alignment=16
 elif enabled clang; then
-    if [ "$target_os" = "mingw32" -o "$target_os" = "win32" ] && enabled x86_32; then
+    if enabled x86_32; then
         # Clang doesn't support maintaining alignment without assuming the
         # same alignment in every function. If 16 byte alignment would be
         # enabled, one would also have to either add attribute_align_arg on
         # every single entry point into the libraries or enable -mstackrealign
         # (doing stack realignment in every single function).
-        disable aligned_stack
+        case $target_os in
+        mingw32|win32|*bsd*)
+            disable aligned_stack
+            ;;
+        *)
+            check_cflags -mllvm -stack-alignment=16
+            ;;
+        esac
     else
         check_cflags -mllvm -stack-alignment=16
     fi
--
2.14.3 (Apple Git-98)
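The decision table this patch introduces can be sketched as a standalone shell function. This is only an illustration under stated assumptions: stack_alignment_action is a hypothetical name, it prints the action instead of performing it, and the real configure script calls check_cflags and disable directly (neither is defined here).

```shell
#!/bin/sh
# Hypothetical helper mirroring the patch's branches; it only prints which
# configure action would be taken, it does not probe any compiler.
#   $1: compiler family (gcc, llvm_gcc, clang)
#   $2: architecture (x86_32, x86_64, ...)
#   $3: target OS as configure names it (mingw32, win32, freebsd, linux, ...)
stack_alignment_action() {
    compiler=$1
    arch=$2
    target_os=$3
    case $compiler in
        gcc)
            if [ "$arch" = x86_32 ]; then
                case $target_os in
                    *bsd*)
                        # 2^4 = 16 byte alignment maintained across calls
                        echo "check_cflags -mpreferred-stack-boundary=4"
                        return
                        ;;
                esac
            fi
            # The patch adds no extra flag for GCC in any other case.
            echo "none"
            ;;
        llvm_gcc)
            echo "check_cflags -mllvm -stack-alignment=16"
            ;;
        clang)
            if [ "$arch" = x86_32 ]; then
                case $target_os in
                    mingw32|win32|*bsd*)
                        echo "disable aligned_stack"
                        return
                        ;;
                esac
            fi
            echo "check_cflags -mllvm -stack-alignment=16"
            ;;
    esac
}

stack_alignment_action gcc x86_32 freebsd
stack_alignment_action clang x86_32 openbsd
stack_alignment_action clang x86_32 linux
```

Note that the `*bsd*` glob matches freebsd, openbsd and netbsd alike, which is why a single pattern covers all of them in the patch.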
[libav-devel] [PATCH] configure: Don't assume an aligned stack on clang on windows
If we'd enable a 16 byte aligned stack, clang/llvm would also assume that alignment everywhere and produce code that strictly requires it. That would require adding realignment (via attribute_align_arg) on every single public library function or enabling -mstackrealign (which does the same on every single function).

Relatedly, the parameter currently tested (-mllvm -stack-alignment=16) hasn't actually been supported for quite some time; current clang versions use -mstack-alignment=16 for the same. Actually testing for that parameter would be a different change though, since it has a real risk of changing behaviour on any other platform where clang is used.
---
 configure | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index b91be32..7042635 100755
--- a/configure
+++ b/configure
@@ -4955,7 +4955,16 @@ elif enabled gcc; then
 elif enabled llvm_gcc; then
     check_cflags -mllvm -stack-alignment=16
 elif enabled clang; then
-    check_cflags -mllvm -stack-alignment=16
+    if [ "$target_os" = "mingw32" -o "$target_os" = "win32" ] && enabled x86_32; then
+        # Clang doesn't support maintaining alignment without assuming the
+        # same alignment in every function. If 16 byte alignment would be
+        # enabled, one would also have to either add attribute_align_arg on
+        # every single entry point into the libraries or enable -mstackrealign
+        # (doing stack realignment in every single function).
+        disable aligned_stack
+    else
+        check_cflags -mllvm -stack-alignment=16
+    fi
     check_cflags -Qunused-arguments
     check_cflags -Werror=implicit-function-declaration
     check_cflags -Werror=missing-prototypes
--
2.7.4
Re: [libav-devel] [PATCH] configure: Restore original endianness test
On Thu, 8 Mar 2018, Diego Biurrun wrote:
Previously the bit pattern for the endianness test was declared as a global, instead of a local, variable. This ensures that the pattern appears unchanged in the object file and is not optimized out.
---
 configure | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index d59fc6fd1a..188b2d880b 100755
--- a/configure
+++ b/configure
@@ -4211,7 +4211,9 @@ done
 check_cc pragma_deprecated "" '_Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")'
-require_cc "endian test" "" "unsigned int endian = 'B' << 24 | 'I' << 16 | 'G' << 8 | 'E'"
+test_cc <

Ok

// Martin
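Why the pattern has to survive unchanged in the object file becomes clearer with a small sketch of how such a marker can be read back. This assumes the check works by scanning the compiled object for the byte sequence of the 'B','I','G','E' constant; the actual check is not shown in the quoted patch, and object_endianness is a hypothetical name. The test data below is written with printf rather than produced by a compiler.

```shell
#!/bin/sh
# Hypothetical helper: report the byte order implied by the marker constant
#   'B' << 24 | 'I' << 16 | 'G' << 8 | 'E'
# as it was stored in the given file. A big-endian target stores the bytes
# as "BIGE" (42 49 47 45), a little-endian one as "EGIB" (45 47 49 42).
object_endianness() {
    # Dump every byte as two hex digits, all on one space-separated line.
    bytes=$(od -An -v -t x1 "$1" | tr '\n' ' ' | tr -s ' ')
    case " $bytes " in
        *" 42 49 47 45 "*) echo big ;;
        *" 45 47 49 42 "*) echo little ;;
        *) echo unknown ;;
    esac
}

# Stand-in for an object file: a few padding bytes around the marker.
sample=$(mktemp)
printf 'xxEGIBxx' > "$sample"
object_endianness "$sample"
rm -f "$sample"
```

If the compiler folds the constant away or splits it, neither sequence is found and the sketch reports unknown, which is the failure mode the commit message is guarding against by making the variable global.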
[libav-devel] [PATCH 1/2] Revert "configure: Stop using dlltool to create an import library"
This reverts commit 67c72f08a4707c18a67a4734660e3a23cc9488b6.

While the linker produced import libraries might work with MSVC in simple test cases, they don't if e.g. linking to multiple GNU ld produced import libraries at the same time. The ones produced by dlltool work fine though.

This issue was pointed out by Hendrik Leppkes.
---
 configure | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index ed930e6cd4..06fb839a18 100755
--- a/configure
+++ b/configure
@@ -3891,6 +3891,10 @@ case $target_os in
         ;;
     mingw32*|mingw64*)
         target_os=mingw32
+        LIBTARGET=i386
+        if enabled x86_64; then
+            LIBTARGET="i386:x86-64"
+        fi
         if enabled shared; then
             # Cannot build both shared and static libs when using dllimport.
             disable static
@@ -3902,7 +3906,7 @@ case $target_os in
         SLIBSUF=".dll"
         SLIBNAME_WITH_VERSION='$(SLIBPREF)$(NAME)-$(LIBVERSION)$(SLIBSUF)'
         SLIBNAME_WITH_MAJOR='$(SLIBPREF)$(NAME)-$(LIBMAJOR)$(SLIBSUF)'
-        SLIB_EXTRA_CMD='cp $(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a) $(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib)'
+        SLIB_EXTRA_CMD=-'$(DLLTOOL) -m $(LIBTARGET) -d $$(@:$(SLIBSUF)=.def) -l $(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib) -D $(SLIBNAME_WITH_MAJOR)'
         SLIB_INSTALL_NAME='$(SLIBNAME_WITH_MAJOR)'
         SLIB_INSTALL_LINKS=
         SLIB_INSTALL_EXTRA_SHLIB='$(SLIBNAME:$(SLIBSUF)=.lib)'
@@ -3910,6 +3914,7 @@ case $target_os in
         SLIB_CREATE_DEF_CMD='EXTERN_PREFIX="$(EXTERN_PREFIX)" AR="$(AR_CMD)" NM="$(NM_CMD)" $(SRC_PATH)/compat/windows/makedef $(SUBDIR)lib$(NAME).ver $(OBJS) > $$(@:$(SLIBSUF)=.def)'
         SHFLAGS='-shared -Wl,--out-implib,$(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a) -Wl,--enable-auto-image-base $$(@:$(SLIBSUF)=.def)'
         enabled x86_64 && objformat="win64" || objformat="win32"
+        dlltool="${cross_prefix}dlltool"
         ranlib=:
         enable dos_paths
         ;;
@@ -5248,6 +5253,7 @@ X86ASM_O=$X86ASM_O
 LD_O=$LD_O
 LD_LIB=$LD_LIB
 LD_PATH=$LD_PATH
+DLLTOOL=$dlltool
 LDFLAGS=$LDFLAGS
 LDEXEFLAGS=$LDEXEFLAGS
 LDSOFLAGS=$LDSOFLAGS
@@ -5294,6 +5300,7 @@ LIB_INSTALL_EXTRA_CMD=$LIB_INSTALL_EXTRA_CMD
 EXTRALIBS=$extralibs
 COMPAT_OBJS=$compat_objs
 INSTALL=install
+LIBTARGET=${LIBTARGET}
 SLIBNAME=${SLIBNAME}
 SLIBNAME_WITH_VERSION=${SLIBNAME_WITH_VERSION}
 SLIBNAME_WITH_MAJOR=${SLIBNAME_WITH_MAJOR}
--
2.14.3 (Apple Git-98)
[libav-devel] [PATCH 2/2] configure: Pass the right machine types to dlltool for arm and arm64 mingw
These are supported by llvm-dlltool.
---
 configure | 4
 1 file changed, 4 insertions(+)

diff --git a/configure b/configure
index 06fb839a18..1c35f9dc64 100755
--- a/configure
+++ b/configure
@@ -3894,6 +3894,10 @@ case $target_os in
         LIBTARGET=i386
         if enabled x86_64; then
             LIBTARGET="i386:x86-64"
+        elif enabled arm; then
+            LIBTARGET="arm"
+        elif enabled aarch64; then
+            LIBTARGET="arm64"
         fi
         if enabled shared; then
             # Cannot build both shared and static libs when using dllimport.
--
2.14.3 (Apple Git-98)
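As a rough illustration of the mapping this patch completes, the sketch below picks the dlltool machine type for an architecture and prints the resulting command line. The function names dlltool_machine and import_lib_command and the library name in the usage line are hypothetical; only the -m, -d, -l and -D flags and the machine strings come from the patches in this series. Nothing is executed, the command is only echoed.

```shell
#!/bin/sh
# Hypothetical helper: map a configure architecture name to the value
# passed to dlltool -m, defaulting to i386 as the patch does.
dlltool_machine() {
    case $1 in
        x86_64)  echo "i386:x86-64" ;;
        arm)     echo "arm" ;;
        aarch64) echo "arm64" ;;
        *)       echo "i386" ;;
    esac
}

# Hypothetical helper: print the dlltool invocation that would turn a .def
# file into an MSVC-style import library.
#   $1: architecture, $2: library name, $3: major version
import_lib_command() {
    arch=$1
    name=$2
    major=$3
    echo "${cross_prefix:-}dlltool -m $(dlltool_machine "$arch") -d $name.def -l $name.lib -D $name-$major.dll"
}

import_lib_command aarch64 avutil 56
```

Keeping the mapping in one place makes it easy to see that any architecture without an explicit entry silently falls back to i386, which is the behaviour the arm and arm64 entries are meant to avoid.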