Re: [libav-devel] [GASPP PATCH] Comment out "it" instructions for armasm

2019-10-10 Thread Martin Storsjö

On Thu, 3 Oct 2019, Martin Storsjö wrote:


On Thu, 3 Oct 2019, Janne Grunau wrote:


On 2019-10-02 11:53:28 +0300, Martin Storsjö wrote:

Armasm implicitly adds it instructions as needed. In VS 2019 16.3,
there's a bug [1] in armasm making it fail to parse these it instructions
(but it can still add them implicitly just fine).

I'm not sure if it really is worth working around this issue, or just
wait for it to hopefully be fixed by the next release again.

[1] 
https://developercommunity.visualstudio.com/content/problem/757709/armasm-fails-to-handle-it-instructions.html

---
 gas-preprocessor.pl | 4 
 1 file changed, 4 insertions(+)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index b6c2786..9d8fb5d 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1168,6 +1168,10 @@ sub handle_serialized_line {
 $line =~ s/fmxr/vmsr/;
 $line =~ s/fmrx/vmrs/;
 $line =~ s/fadds/vadd.f32/;
+# Armasm in VS 2019 16.3 errors out on "it" instructions. But
+# armasm implicitly adds the necessary it instructions anyway, so we
+# can just filter it out.
+$line =~ s/^\s*it[te]*\s+/$comm$&/;
 }
 if ($as_type eq "armasm" and $arch eq "aarch64") {
 # Convert "b.eq" into "beq"


I guess ok-ish since armasm can handle implicit it instructions. Do you 
have expectation when a fixed version might be released? If it's more 
than a couple of weeks I'd say the workaround is worth it.


There's roughly one stable release per 3 months, and the first preview for 
the next one (16.4) was already posted. In some cases, bugfixes do get 
into the next release (if deemed urgent enough I guess), but otherwise 
into current+2. So estimate of fix in a stable release is anywhere between 
2 and 5 months maybe.


https://developercommunity.visualstudio.com/content/problem/757709/armasm-fails-to-handle-it-instructions.html

They confirmed the bug and said that it should be fixed in 16.5, which is 
due in about 5 months.


// Martin
___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel

Re: [libav-devel] [GASPP PATCH] Comment out "it" instructions for armasm

2019-10-04 Thread Martin Storsjö

On Thu, 3 Oct 2019, Martin Storsjö wrote:


On Thu, 3 Oct 2019, Janne Grunau wrote:


On 2019-10-02 11:53:28 +0300, Martin Storsjö wrote:

Armasm implicitly adds it instructions as needed. In VS 2019 16.3,
there's a bug [1] in armasm making it fail to parse these it instructions
(but it can still add them implicitly just fine).

I'm not sure if it really is worth working around this issue, or just
wait for it to hopefully be fixed by the next release again.

[1] 
https://developercommunity.visualstudio.com/content/problem/757709/armasm-fails-to-handle-it-instructions.html

---
 gas-preprocessor.pl | 4 
 1 file changed, 4 insertions(+)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index b6c2786..9d8fb5d 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1168,6 +1168,10 @@ sub handle_serialized_line {
 $line =~ s/fmxr/vmsr/;
 $line =~ s/fmrx/vmrs/;
 $line =~ s/fadds/vadd.f32/;
+# Armasm in VS 2019 16.3 errors out on "it" instructions. But
+# armasm implicitly adds the necessary it instructions anyway, so we
+# can just filter it out.
+$line =~ s/^\s*it[te]*\s+/$comm$&/;
 }
 if ($as_type eq "armasm" and $arch eq "aarch64") {
 # Convert "b.eq" into "beq"


I guess ok-ish since armasm can handle implicit it instructions. Do you 
have expectation when a fixed version might be released? If it's more 
than a couple of weeks I'd say the workaround is worth it.


There's roughly one stable release per 3 months, and the first preview for 
the next one (16.4) was already posted. In some cases, bugfixes do get 
into the next release (if deemed urgent enough I guess), but otherwise 
into current+2. So estimate of fix in a stable release is anywhere between 
2 and 5 months maybe.


If I had caught this in August when the first preview actually 
containing the new broken armasm was out, it might have been possible to 
have it fixed sooner...


Pushed both of these now.

// Martin

Re: [libav-devel] [GASPP PATCH] Comment out "it" instructions for armasm

2019-10-03 Thread Martin Storsjö

On Thu, 3 Oct 2019, Janne Grunau wrote:


On 2019-10-02 11:53:28 +0300, Martin Storsjö wrote:

Armasm implicitly adds it instructions as needed. In VS 2019 16.3,
there's a bug [1] in armasm making it fail to parse these it instructions
(but it can still add them implicitly just fine).

I'm not sure if it really is worth working around this issue, or just
wait for it to hopefully be fixed by the next release again.

[1] 
https://developercommunity.visualstudio.com/content/problem/757709/armasm-fails-to-handle-it-instructions.html
---
 gas-preprocessor.pl | 4 
 1 file changed, 4 insertions(+)

diff --git a/gas-preprocessor.pl b/gas-preprocessor.pl
index b6c2786..9d8fb5d 100755
--- a/gas-preprocessor.pl
+++ b/gas-preprocessor.pl
@@ -1168,6 +1168,10 @@ sub handle_serialized_line {
 $line =~ s/fmxr/vmsr/;
 $line =~ s/fmrx/vmrs/;
 $line =~ s/fadds/vadd.f32/;
+# Armasm in VS 2019 16.3 errors out on "it" instructions. But
+# armasm implicitly adds the necessary it instructions anyway, so we
+# can just filter it out.
+$line =~ s/^\s*it[te]*\s+/$comm$&/;
 }
 if ($as_type eq "armasm" and $arch eq "aarch64") {
 # Convert "b.eq" into "beq"


I guess ok-ish since armasm can handle implicit it instructions. Do you 
have expectation when a fixed version might be released? If it's more 
than a couple of weeks I'd say the workaround is worth it.
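
For reference, the substitution in the patch matches optional leading whitespace, the mnemonic "it" followed by any number of "t"/"e" slots, and at least one whitespace character, and then puts the comment marker in front of whatever matched. Below is a minimal C sketch of the same matching rule. The helper name is_it_instruction is hypothetical and is not part of gas-preprocessor.pl, which does this with the Perl regex shown in the diff.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Returns 1 if `line` begins (after optional whitespace) with an IT-block
 * mnemonic such as "it", "ite" or "itte" followed by whitespace, which is
 * what the regex ^\s*it[te]*\s+ accepts. Returns 0 otherwise.
 * Hypothetical helper, for illustration only. */
int is_it_instruction(const char *line)
{
    const unsigned char *p = (const unsigned char *)line;

    assert(line != NULL);

    while (isspace(*p))              /* ^\s*   */
        p++;
    if (p[0] != 'i' || p[1] != 't')  /* it     */
        return 0;
    p += 2;
    while (*p == 't' || *p == 'e')   /* [te]*  */
        p++;
    return isspace(*p) != 0;         /* \s+ needs at least one whitespace */
}
```

A condition code glued directly to the mnemonic (as in "iteq") is deliberately not matched, since the condition of an IT instruction is a separate operand.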


There's roughly one stable release per 3 months, and the first preview for 
the next one (16.4) was already posted. In some cases, bugfixes do get 
into the next release (if deemed urgent enough I guess), but otherwise 
into current+2. So estimate of fix in a stable release is anywhere between 
2 and 5 months maybe.


If I had caught this in August when the first preview actually 
containing the new broken armasm was out, it might have been possible to 
have it fixed sooner...


// Martin

[libav-devel] [PATCH] aarch64: Add assembly support for -fsanitize=hwaddress tagged globals.

2019-08-21 Thread Martin Storsjö
From: Peter Collingbourne 

As of LLVM r368102, Clang will set a pointer tag in bits 56-63 of the
address of a global when compiling with -fsanitize=hwaddress. This requires
an adjustment to assembly code that takes the address of such globals: the
code cannot use the regular R_AARCH64_ADR_PREL_PG_HI21 relocation to refer
to the global, since the tag would take the address out of range. Instead,
the code must use the non-checking (_NC) variant of the relocation (the
link-time check is substituted by a runtime check).

This change makes the necessary adjustment in the movrel macro, where it is
needed when compiling with -fsanitize=hwaddress.
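
To illustrate why the tag takes the address out of range, here is a rough C sketch. It assumes the usual ADRP encoding (a signed 21-bit count of 4 KiB pages, so roughly +/-4 GiB of reach). The function names and the addresses are made up for the example and do not come from the patch; only the __has_feature fallback mirrors what the patch adds to asm.S.

```c
#include <assert.h>
#include <stdint.h>

/* Same fallback as in the patch: compilers without __has_feature see 0. */
#ifndef __has_feature
#   define __has_feature(x) 0
#endif

/* 1 if this translation unit is built with -fsanitize=hwaddress, else 0. */
int built_with_hwasan(void)
{
#if __has_feature(hwaddress_sanitizer)
    return 1;
#else
    return 0;
#endif
}

/* Tag stored in bits 56-63 of an address. */
uint8_t pointer_tag(uint64_t addr)
{
    return (uint8_t)(addr >> 56);
}

/* 1 if `target` is reachable from `pc` with a range-checked ADRP, i.e. the
 * 4 KiB page delta fits in a signed 21-bit immediate. */
int adrp_in_range(uint64_t pc, uint64_t target)
{
    int64_t delta = (int64_t)(target >> 12) - (int64_t)(pc >> 12);
    return delta >= -(INT64_C(1) << 20) && delta < (INT64_C(1) << 20);
}

/* Illustrative self-check with made-up addresses. */
void hwasan_example_check(void)
{
    uint64_t pc     = UINT64_C(0x400000);
    uint64_t global = UINT64_C(0x500000);
    uint64_t tagged = global | (UINT64_C(0x2a) << 56);

    assert(adrp_in_range(pc, global));
    assert(pointer_tag(tagged) == 0x2a);
    assert(!adrp_in_range(pc, tagged));  /* the tag pushes it out of range */
}
```

With the non-checking relocation the linker skips this range test, which is why the macro switches to :pg_hi21_nc: only when the sanitizer is enabled.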

Signed-off-by: Peter Collingbourne 
Signed-off-by: Martin Storsjö 
---
 libavutil/aarch64/asm.S | 8 
 1 file changed, 8 insertions(+)

diff --git a/libavutil/aarch64/asm.S b/libavutil/aarch64/asm.S
index bf5c1b7ee1..81d723b9b3 100644
--- a/libavutil/aarch64/asm.S
+++ b/libavutil/aarch64/asm.S
@@ -32,6 +32,10 @@
 #   define FUNC #
 #endif
 
+#ifndef __has_feature
+#   define __has_feature(x) 0
+#endif
+
 .macro  function name, export=0, align=2
 .macro endfunc
 ELF .size   \name, . - \name
@@ -94,7 +98,11 @@ ELF .size   \name, . - \name
 add \rd, \rd, :lo12:\val+(\offset)
 .endif
 #elif CONFIG_PIC
+#   if __has_feature(hwaddress_sanitizer)
+adrp \rd, :pg_hi21_nc:\val+(\offset)
+#   else
 adrp \rd, \val+(\offset)
+#   endif
 add \rd, \rd, :lo12:\val+(\offset)
 #else
 ldr \rd, =\val+\offset
-- 
2.20.1 (Apple Git-117)


Re: [libav-devel] [PATCH] Add a fd protocol

2019-04-25 Thread Martin Storsjö

On Thu, 25 Apr 2019, Luca Barbato wrote:


---

Sometimes you receive a seekable fd from the outside.

libavformat/file.c  | 32 
libavformat/protocols.c |  1 +
2 files changed, 33 insertions(+)

diff --git a/libavformat/file.c b/libavformat/file.c
index 27ce4de6eb..6a74ebbf48 100644
--- a/libavformat/file.c
+++ b/libavformat/file.c
@@ -204,3 +204,35 @@ const URLProtocol ff_pipe_protocol = {
};

#endif /* CONFIG_PIPE_PROTOCOL */
+
+#if CONFIG_FD_PROTOCOL
+
+static int fd_open(URLContext *h, const char *filename, int flags)
+{
+FileContext *c = h->priv_data;
+int fd;
+char *final;
+av_strstart(filename, "fd:", &filename);
+
+fd = strtol(filename, &final, 10);
+if ((filename == final) || *final ) {
+return AVERROR(EINVAL);
+}
+#if HAVE_SETMODE
+setmode(fd, O_BINARY);
+#endif
+c->fd = fd;
+return 0;
+}
+
+const URLProtocol ff_pipe_protocol = {


Did you test compilation of this? It doesn't look like it would work given 
this ^


Isn't this essentially exactly the same as the pipe protocol, except for 
not setting the is_streamed flag? Even though the name pipe doesn't feel 
quite right for that case, wouldn't it be possible to just add an option 
to the pipe protocol for controlling this?
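
As a side note on the parsing step in the quoted fd_open(): the strtol() check rejects both an empty number and trailing garbage. A standalone sketch of that validation using only libc is below. The name parse_fd_url and the -1/0 return convention are assumptions for the example (the patch uses av_strstart() and AVERROR(EINVAL)), and the explicit range check is an addition that the quoted patch does not have.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Parse "fd:N" (or a bare "N") into *fd_out.
 * Returns 0 on success and -1 if the string is not a valid descriptor.
 * Hypothetical helper, for illustration only. */
int parse_fd_url(const char *url, int *fd_out)
{
    char *end;
    long value;

    assert(url != NULL && fd_out != NULL);

    /* Skip the optional protocol prefix. */
    if (strncmp(url, "fd:", 3) == 0)
        url += 3;

    errno = 0;
    value = strtol(url, &end, 10);

    /* No digits consumed, or trailing characters after the number. */
    if (end == url || *end != '\0')
        return -1;
    /* Not in the quoted patch: reject overflow and negative descriptors. */
    if (errno == ERANGE || value < 0 || value > INT_MAX)
        return -1;

    *fd_out = (int)value;
    return 0;
}
```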


// Martin


[libav-devel] [PATCH] arm: vp9lpf: Fix a typo in a comment about the register layout

2019-04-16 Thread Martin Storsjö
---
 libavcodec/aarch64/vp9lpf_neon.S | 2 +-
 libavcodec/arm/vp9lpf_neon.S | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/aarch64/vp9lpf_neon.S b/libavcodec/aarch64/vp9lpf_neon.S
index e9c497096b..f68b54a2ee 100644
--- a/libavcodec/aarch64/vp9lpf_neon.S
+++ b/libavcodec/aarch64/vp9lpf_neon.S
@@ -415,7 +415,7 @@
 
 1:
 // flat8out
-// This writes all outputs into v2-v17 (skipping v6 and v16).
+// This writes all outputs into v2-v17 (skipping v7 and v16).
 // If this part is skipped, the output is read from v21-v26 (which is the input
 // to this section).
 ushll_sz v0.8h,  v1.8h,  v16,  #3,  \sz   // 8 * v16
diff --git a/libavcodec/arm/vp9lpf_neon.S b/libavcodec/arm/vp9lpf_neon.S
index ae782b2ed0..e30f0cd5b4 100644
--- a/libavcodec/arm/vp9lpf_neon.S
+++ b/libavcodec/arm/vp9lpf_neon.S
@@ -362,7 +362,7 @@
 beq 8f
 
 @ flat8out
-@ This writes all outputs into d2-d17 (skipping d6 and d16).
+@ This writes all outputs into d2-d17 (skipping d7 and d16).
 @ If this part is skipped, the output is read from d21-d26 (which is the input
 @ to this section).
 vshll.u8 q0,  d16, #3  @ 8 * d16
-- 
2.20.1 (Apple Git-117)


Re: [libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well

2019-04-16 Thread Martin Storsjö

On Tue, 16 Apr 2019, Martin Storsjö wrote:


On Tue, 16 Apr 2019, Diego Biurrun wrote:


On Sun, Apr 14, 2019 at 09:33:40PM +0300, Martin Storsjö wrote:

On Sun, 14 Apr 2019, Diego Biurrun wrote:
> On Sat, Apr 13, 2019 at 12:58:40AM +0300, Martin Storsjö wrote:
> > On Fri, 12 Apr 2019, Luca Barbato wrote:
> > > On 11/04/2019 15:35, Martin Storsjö wrote:
> > > > On Wed, 10 Apr 2019, Luca Barbato wrote:
> > > > > On 10/04/2019 10:48, Martin Storsjö wrote:
> > > > > > Mingw headers have got header inline implementations of localtime_r
> > > > > > and gmtime_r, but only visible if certain posix thread safe functions
> > > > > > have been requested.
> > > > > >
> > > > > > This is a preparatory step for improving the detection of those
> > > > > > functions.
> > > > > > ---
> > > > > > An alternative fix is also provided in a different patch series,
> > > > > > by adjusting libavutil/time_internal.h.
> > > > >
> > > > > Seems fine to me.
> > > >
> > > > Which ones do you mean - this series of 2 patches, the other one, or both?
> > >
> > > This series seems fine to me.
> >
> > Ok. FWIW, the change in mingw-w64 that broke it was reverted (there was a
> > similar issue within gcc as well), but I guess this change probably is good
> > to make anyway.
>
> I generally don't think that adding workarounds for foreign bugs is a
> sustainable strategy,

Well, the idea of prefixing local system function fallbacks/replacements
isn't so much of a "workaround" as a sensible idea in general IMO. This is a
pattern that already is used e.g. for ff_getaddrinfo, ff_poll etc.

That is, regardless of what the reason for using a fallback is (the real
function does not exist, the real function is declared in headers but
missing in libs, the real function exists but we want to avoid it because
it's buggy, etc), the pattern of

#include 
static inline ff_systemfunc() {
...
}
#define systemfunc ff_systemfunc

should always be safe. So I think that should be a generally beneficial
change in any case as well.


IIRC we only do that within libavformat and use a different pattern within
libavutil. Then again, my code knowledge might be getting a bit rusty.


True, e.g. libavutil/libm.h does define some static inline functions 
unprefixed as well.


Nevertheless, using prefixes for fallback functions is not a 
workaround/hack in my book, but a sane and healthy development practice.



> but I clearly prefer the configure change.

Well, the check_func_headers change obviously is for the better, yes. Adding
the _POSIX_C_SOURCE define when building for mingw most probably also is
sensible, but the fact that we add it manually to most OSes, while we don't
add it automatically for all, makes it a little less clear cut.


Switching from trying to set some flags globally for all platforms, inevitably
hitting a snag on some fringe system, then adding an exception for that system,
to setting flags by platform and strictly only when necessary on that platform,
is - oddly enough - one of the single biggest improvements to the whole
configure machinery.


Sure, I generally agree with that. I was generally a bit wary of forcing 
the posix defines on other systems, but I generally think it should be 
good for this case, as it reduces inconsistencies between 
available/visible functions.


So I'm very wary of changes in that area due to having been burned so often
in the past. If the change was motivated by a bug (since fixed) in mingw,
then we should not add workarounds for it.


Well it's not quite as simple. The immediate issue is gone again, but the 
general underlying issue remains.


The TL;DR version is:

- mingw-w64 contains localtime_r/gmtime_r, but only visible if posix 
thread safe functions have been requested by some means. We currently 
don't detect these in configure. In practice, the posix thread safe 
functions define could be enabled transitively by some other included 
header (which has also been somewhat mitigated within mingw-w64). To 
safeguard against this inconsistency, defining it in configure would be 
helpful IMO.


- Even if localtime_r was visible from mingw-w64 headers, it used to not 
conflict with ours, because the mingw-w64 was defined as extern inline, 
while ours was static inline. The mingw-w64 headers were changed to define 
this as static inline, and later reverted again.



Anyway, I've presented my arguments. I trust you to make a good decision.
Push at your discretion.


Well in that case, I'd push all four patches.


Pushed all four - thanks for the discussion!

// Martin

Re: [libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well

2019-04-16 Thread Martin Storsjö

On Tue, 16 Apr 2019, Diego Biurrun wrote:


On Sun, Apr 14, 2019 at 09:33:40PM +0300, Martin Storsjö wrote:

On Sun, 14 Apr 2019, Diego Biurrun wrote:
> On Sat, Apr 13, 2019 at 12:58:40AM +0300, Martin Storsjö wrote:
> > On Fri, 12 Apr 2019, Luca Barbato wrote:
> > > On 11/04/2019 15:35, Martin Storsjö wrote:
> > > > On Wed, 10 Apr 2019, Luca Barbato wrote:
> > > > > On 10/04/2019 10:48, Martin Storsjö wrote:
> > > > > > Mingw headers have got header inline implementations of localtime_r
> > > > > > and gmtime_r, but only visible if certain posix thread safe functions
> > > > > > have been requested.
> > > > > >
> > > > > > This is a preparatory step for improving the detection of those
> > > > > > functions.
> > > > > > ---
> > > > > > An alternative fix is also provided in a different patch series,
> > > > > > by adjusting libavutil/time_internal.h.
> > > > >
> > > > > Seems fine to me.
> > > >
> > > > Which ones do you mean - this series of 2 patches, the other one, or both?
> > >
> > > This series seems fine to me.
> >
> > Ok. FWIW, the change in mingw-w64 that broke it was reverted (there was a
> > similar issue within gcc as well), but I guess this change probably is good
> > to make anyway.
>
> I generally don't think that adding workarounds for foreign bugs is a
> sustainable strategy,

Well, the idea of prefixing local system function fallbacks/replacements
isn't so much of a "workaround" as a sensible idea in general IMO. This is a
pattern that already is used e.g. for ff_getaddrinfo, ff_poll etc.

That is, regardless of what the reason for using a fallback is (the real
function does not exist, the real function is declared in headers but
missing in libs, the real function exists but we want to avoid it because
it's buggy, etc), the pattern of

#include 
static inline ff_systemfunc() {
...
}
#define systemfunc ff_systemfunc

should always be safe. So I think that should be a generally beneficial
change in any case as well.


IIRC we only do that within libavformat and use a different pattern within
libavutil. Then again, my code knowledge might be getting a bit rusty.


True, e.g. libavutil/libm.h does define some static inline functions 
unprefixed as well.


Nevertheless, using prefixes for fallback functions is not a 
workaround/hack in my book, but a sane and healthy development practice.



> but I clearly prefer the configure change.

Well, the check_func_headers change obviously is for the better, yes. Adding
the _POSIX_C_SOURCE define when building for mingw most probably also is
sensible, but the fact that we add it manually to most OSes, while we don't
add it automatically for all, makes it a little less clear cut.


Switching from trying to set some flags globally for all platforms, inevitably
hitting a snag on some fringe system, then adding an exception for that system,
to setting flags by platform and strictly only when necessary on that platform,
is - oddly enough - one of the single biggest improvements to the whole
configure machinery.


Sure, I generally agree with that. I was generally a bit wary of forcing 
the posix defines on other systems, but I generally think it should be 
good for this case, as it reduces inconsistencies between 
available/visible functions.



So I'm very wary of changes in that area due to having been burned so often
in the past. If the change was motivated by a bug (since fixed) in mingw,
then we should not add workarounds for it.


Well it's not quite as simple. The immediate issue is gone again, but the 
general underlying issue remains.


The TL;DR version is:

- mingw-w64 contains localtime_r/gmtime_r, but only visible if posix 
thread safe functions have been requested by some means. We currently 
don't detect these in configure. In practice, the posix thread safe 
functions define could be enabled transitively by some other included 
header (which has also been somewhat mitigated within mingw-w64). To 
safeguard against this inconsistency, defining it in configure would be 
helpful IMO.


- Even if localtime_r was visible from mingw-w64 headers, it used to not 
conflict with ours, because the mingw-w64 was defined as extern inline, 
while ours was static inline. The mingw-w64 headers were changed to define 
this as static inline, and later reverted again.
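
To make the first point concrete, here is a small sketch of what requesting the POSIX functions looks like from the C side. Defining the feature macros before the first system header has the same effect as passing -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 on the command line, which is what the configure patch does. The function name utc_year is made up for the example, and this has only been reasoned through for a POSIX-style libc, not verified against every mingw-w64 header version.

```c
/* Must come before the first system header is included. */
#ifndef _POSIX_C_SOURCE
#   define _POSIX_C_SOURCE 200112L
#endif
#ifndef _XOPEN_SOURCE
#   define _XOPEN_SOURCE 600
#endif

#include <assert.h>
#include <time.h>

/* Returns the UTC calendar year for `t`, or -1 if the conversion fails.
 * Uses gmtime_r, which is only declared when POSIX functions are requested. */
int utc_year(time_t t)
{
    struct tm tm;
    struct tm *ret = gmtime_r(&t, &tm);

    assert(ret == NULL || ret == &tm);
    return ret ? tm.tm_year + 1900 : -1;
}
```

If some other header had already pulled in time.h before the defines, they would come too late, which is the kind of inconsistency that setting them globally in configure avoids.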



Anyway, I've presented my arguments. I trust you to make a good decision.
Push at your discretion.


Well in that case, I'd push all four patches.

// Martin

Re: [libav-devel] [PATCH] rtsp: add pkt_size option

2019-04-15 Thread Martin Storsjö

On Mon, 15 Apr 2019, Tristan Matthews wrote:


On Thu, Apr 11, 2019 at 1:41 AM Martin Storsjö  wrote:


On Thu, 11 Apr 2019, Tristan Matthews wrote:


This allows users to specify an upper limit on the size of outgoing packets
when publishing via RTSP.

---
libavformat/rtsp.c | 5 -
libavformat/rtsp.h | 1 +
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/libavformat/rtsp.c b/libavformat/rtsp.c
index 8bf9d9e3c..12c4998c6 100644
--- a/libavformat/rtsp.c
+++ b/libavformat/rtsp.c
@@ -74,7 +74,8 @@

#define COMMON_OPTS() \
{ "reorder_queue_size", "Number of packets to buffer for handling of reordered packets", OFFSET(reordering_queue_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC }, \
-{ "buffer_size", "Underlying protocol send/receive buffer size", OFFSET(buffer_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC|ENC } \
+{ "buffer_size", "Underlying protocol send/receive buffer size", OFFSET(buffer_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC|ENC }, \
+{ "pkt_size", "Underlying protocol send packet size", OFFSET(pkt_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, ENC } \


const AVOption ff_rtsp_options[] = {
@@ -118,6 +119,8 @@ static AVDictionary *map_to_opts(RTSPState *rt)

snprintf(buf, sizeof(buf), "%d", rt->buffer_size);
av_dict_set(&opts, "buffer_size", buf, 0);
+snprintf(buf, sizeof(buf), "%d", rt->pkt_size);
+av_dict_set(&opts, "pkt_size", buf, 0);

return opts;
}
diff --git a/libavformat/rtsp.h b/libavformat/rtsp.h
index 9dfbc5367..c38b90432 100644
--- a/libavformat/rtsp.h
+++ b/libavformat/rtsp.h
@@ -399,6 +399,7 @@ typedef struct RTSPState {

char default_lang[4];
int buffer_size;
+int pkt_size;

const URLProtocol **protocols;
} RTSPState;
--
2.17.1


LGTM

// Martin


This OK to merge?


Pushed it for you now, thanks!

// Martin

Re: [libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well

2019-04-14 Thread Martin Storsjö

On Sun, 14 Apr 2019, Diego Biurrun wrote:


On Sat, Apr 13, 2019 at 12:58:40AM +0300, Martin Storsjö wrote:

On Fri, 12 Apr 2019, Luca Barbato wrote:
> On 11/04/2019 15:35, Martin Storsjö wrote:
> > On Wed, 10 Apr 2019, Luca Barbato wrote:
> > > On 10/04/2019 10:48, Martin Storsjö wrote:
> > > > Mingw headers have got header inline implementations of localtime_r
> > > > and gmtime_r, but only visible if certain posix thread safe functions
> > > > have been requested.
> > > > 
> > > > This is a preparatory step for improving the detection of those

> > > > functions.
> > > > ---
> > > > An alternative fix is also provided in a different patch series,
> > > > by adjusting libavutil/time_internal.h.
> > > 
> > > Seems fine to me.
> > 
> > Which ones do you mean - this series of 2 patches, the other one, or both?
> > 
> 
> This series seems fine to me.


Ok. FWIW, the change in mingw-w64 that broke it was reverted (there was a
similar issue within gcc as well), but I guess this change probably is good
to make anyway.


I generally don't think that adding workarounds for foreign bugs is a
sustainable strategy,


Well, the idea of prefixing local system function fallbacks/replacements 
isn't so much of a "workaround" as a sensible idea in general IMO. This is 
a pattern that already is used e.g. for ff_getaddrinfo, ff_poll etc.


That is, regardless of what the reason for using a fallback is (the real 
function does not exist, the real function is declared in headers but 
missing in libs, the real function exists but we want to avoid it because 
it's buggy, etc), the pattern of


#include 
static inline ff_systemfunc() {
...
}
#define systemfunc ff_systemfunc

should always be safe. So I think that should be a generally beneficial 
change in any case as well.
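
A concrete instance of that pattern, sketched for gmtime_r under the assumption that the system headers may or may not already declare the function. The body mirrors the fallback in libavutil/time_internal.h; epoch_year is a made-up caller added only to show that the redirect works.

```c
#include <assert.h>
#include <time.h>

/* Fallback under a private prefix: it cannot clash with any declaration of
 * gmtime_r that the system headers may provide, whatever its linkage. */
static inline struct tm *ff_gmtime_r(const time_t *clock, struct tm *result)
{
    struct tm *ptr = gmtime(clock);
    if (!ptr)
        return NULL;
    *result = *ptr;
    return result;
}

/* Redirect callers from the official name to the prefixed fallback. */
#define gmtime_r ff_gmtime_r

/* Made-up caller: uses the official name, gets the fallback. */
int epoch_year(void)
{
    time_t t = 0;
    struct tm tm;
    struct tm *ret = gmtime_r(&t, &tm);  /* expands to ff_gmtime_r */

    assert(ret == &tm);
    return tm.tm_year + 1900;
}
```

Since the definition never uses the official name, it does not matter whether the system version is extern inline, static inline, or absent.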



but I clearly prefer the configure change.


Well, the check_func_headers change obviously is for the better, yes. 
Adding the _POSIX_C_SOURCE define when building for mingw most probably 
also is sensible, but the fact that we add it manually to most OSes, while 
we don't add it automatically for all, makes it a little less clear cut.



Also, s/Try adding/Add/ in the log message, you're not just trying to add
those flags :-)


Right, it wasn't a check_cflags but straightforward add_cflags. Yeah, I'll 
change that.


// Martin

Re: [libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well

2019-04-12 Thread Martin Storsjö

On Fri, 12 Apr 2019, Luca Barbato wrote:


On 11/04/2019 15:35, Martin Storsjö wrote:

On Wed, 10 Apr 2019, Luca Barbato wrote:


On 10/04/2019 10:48, Martin Storsjö wrote:

Mingw headers have got header inline implementations of localtime_r
and gmtime_r, but only visible if certain posix thread safe functions
have been requested.

This is a preparatory step for improving the detection of those
functions.
---
An alternative fix is also provided in a different patch series,
by adjusting libavutil/time_internal.h.
---
  configure | 2 ++
  1 file changed, 2 insertions(+)



Seems fine to me.


Which ones do you mean - this series of 2 patches, the other one, or both?



This series seems fine to me.


Ok. FWIW, the change in mingw-w64 that broke it was reverted (there was a 
similar issue within gcc as well), but I guess this change probably is 
good to make anyway.


// Martin

Re: [libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well

2019-04-11 Thread Martin Storsjö

On Wed, 10 Apr 2019, Luca Barbato wrote:


On 10/04/2019 10:48, Martin Storsjö wrote:

Mingw headers have got header inline implementations of localtime_r
and gmtime_r, but only visible if certain posix thread safe functions
have been requested.

This is a preparatory step for improving the detection of those
functions.
---
An alternative fix is also provided in a different patch series,
by adjusting libavutil/time_internal.h.
---
  configure | 2 ++
  1 file changed, 2 insertions(+)



Seems fine to me.


Which ones do you mean - this series of 2 patches, the other one, or both?

// Martin

Re: [libav-devel] [PATCH] rtsp: add pkt_size option

2019-04-10 Thread Martin Storsjö

On Thu, 11 Apr 2019, Tristan Matthews wrote:


This allows users to specify an upper limit on the size of outgoing packets
when publishing via RTSP.

---
libavformat/rtsp.c | 5 -
libavformat/rtsp.h | 1 +
2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/libavformat/rtsp.c b/libavformat/rtsp.c
index 8bf9d9e3c..12c4998c6 100644
--- a/libavformat/rtsp.c
+++ b/libavformat/rtsp.c
@@ -74,7 +74,8 @@

#define COMMON_OPTS() \
{ "reorder_queue_size", "Number of packets to buffer for handling of reordered packets", OFFSET(reordering_queue_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC }, \
-{ "buffer_size", "Underlying protocol send/receive buffer size", OFFSET(buffer_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC|ENC } \
+{ "buffer_size", "Underlying protocol send/receive buffer size", OFFSET(buffer_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, DEC|ENC }, \
+{ "pkt_size", "Underlying protocol send packet size", OFFSET(pkt_size), AV_OPT_TYPE_INT, { .i64 = -1 }, -1, INT_MAX, ENC } \


const AVOption ff_rtsp_options[] = {
@@ -118,6 +119,8 @@ static AVDictionary *map_to_opts(RTSPState *rt)

snprintf(buf, sizeof(buf), "%d", rt->buffer_size);
av_dict_set(&opts, "buffer_size", buf, 0);
+snprintf(buf, sizeof(buf), "%d", rt->pkt_size);
+av_dict_set(&opts, "pkt_size", buf, 0);

return opts;
}
diff --git a/libavformat/rtsp.h b/libavformat/rtsp.h
index 9dfbc5367..c38b90432 100644
--- a/libavformat/rtsp.h
+++ b/libavformat/rtsp.h
@@ -399,6 +399,7 @@ typedef struct RTSPState {

char default_lang[4];
int buffer_size;
+int pkt_size;

const URLProtocol **protocols;
} RTSPState;
--
2.17.1


LGTM

// Martin

[libav-devel] [PATCH 2/2] time_internal: Prefix fallback versions of gmtime_r/localtime_r with ff_

2019-04-10 Thread Martin Storsjö
Use a macro to redirect calling code from the official name to the
ff_ prefixed one.

Detecting these functions in configure can be tricky (on mingw, they
are conditionally available depending on posix feature defines).
If configure didn't detect them, but they still are visible at
compile time (due to an unrelated header defining the posix feature
defines), providing the local fallback versions with a prefixed
name is safer.
---
This fix is another alternative to improving the configure checks.
Making configure use check_func_header probably is safe, but
always forcing posix defines on mingw feels slightly more dubious.
---
 libavutil/time_internal.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/libavutil/time_internal.h b/libavutil/time_internal.h
index d0597db050..8e647fdc16 100644
--- a/libavutil/time_internal.h
+++ b/libavutil/time_internal.h
@@ -23,7 +23,7 @@
 #include "config.h"
 
 #if !HAVE_GMTIME_R && !defined(gmtime_r)
-static inline struct tm *gmtime_r(const time_t* clock, struct tm *result)
+static inline struct tm *ff_gmtime_r(const time_t* clock, struct tm *result)
 {
 struct tm *ptr = gmtime(clock);
 if (!ptr)
@@ -31,10 +31,11 @@ static inline struct tm *gmtime_r(const time_t* clock, struct tm *result)
 *result = *ptr;
 return result;
 }
+#define gmtime_r ff_gmtime_r
 #endif
 
 #if !HAVE_LOCALTIME_R && !defined(localtime_r)
-static inline struct tm *localtime_r(const time_t* clock, struct tm *result)
+static inline struct tm *ff_localtime_r(const time_t* clock, struct tm *result)
 {
 struct tm *ptr = localtime(clock);
 if (!ptr)
@@ -42,6 +43,7 @@ static inline struct tm *localtime_r(const time_t* clock, struct tm *result)
 *result = *ptr;
 return result;
 }
+#define localtime_r ff_localtime_r
 #endif
 
 #endif /* AVUTIL_TIME_INTERNAL_H */
-- 
2.17.1


[libav-devel] [PATCH 1/2] configure: Try adding -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600 for mingw as well

2019-04-10 Thread Martin Storsjö
Mingw headers have got header inline implementations of localtime_r
and gmtime_r, but only visible if certain posix thread safe functions
have been requested.

This is a preparatory step for improving the detection of those
functions.
---
An alternative fix is also provided in a different patch series,
by adjusting libavutil/time_internal.h.
---
 configure | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/configure b/configure
index 26455054ba..3e8f2dcde1 100755
--- a/configure
+++ b/configure
@@ -4124,6 +4124,7 @@ probe_libc(){
 add_${pfx}cppflags -D__printf__=__gnu_printf__
 test_${pfx}cpp_condition windows.h "!defined(_WIN32_WINNT) || _WIN32_WINNT < 0x0600" &&
 add_${pfx}cppflags -D_WIN32_WINNT=0x0600
+add_${pfx}cppflags -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600
 elif test_${pfx}cpp_condition _mingw.h "defined __MINGW_VERSION"  ||
  test_${pfx}cpp_condition _mingw.h "defined __MINGW32_VERSION"; then
 eval ${pfx}libc_type=mingw32
@@ -4137,6 +4138,7 @@ probe_libc(){
 add_${pfx}cppflags -D_WIN32_WINNT=0x0600
 eval test \$${pfx_no_}cc_type = "gcc" &&
 add_${pfx}cppflags -D__printf__=__gnu_printf__
+add_${pfx}cppflags -D_POSIX_C_SOURCE=200112 -D_XOPEN_SOURCE=600
 elif test_${pfx}cpp_condition crtversion.h "defined 
_VC_CRT_MAJOR_VERSION"; then
 eval ${pfx}libc_type=msvcrt
 if test_${pfx}cpp_condition crtversion.h "_VC_CRT_MAJOR_VERSION < 14"; 
then
-- 
2.17.1


[libav-devel] [PATCH 1/2] time_internal: Do not attempt to override *time_r() macros

2019-04-10 Thread Martin Storsjö
From: Michael Niedermayer 

This allegedly fixed build on odd mingw setups, and generally
seems like a safe thing to do (in case configure failed to detect
them while they still are available in headers).
---
 libavutil/time_internal.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavutil/time_internal.h b/libavutil/time_internal.h
index 829fefb007..d0597db050 100644
--- a/libavutil/time_internal.h
+++ b/libavutil/time_internal.h
@@ -22,7 +22,7 @@
 #include <time.h>
 #include "config.h"
 
-#if !HAVE_GMTIME_R
+#if !HAVE_GMTIME_R && !defined(gmtime_r)
 static inline struct tm *gmtime_r(const time_t* clock, struct tm *result)
 {
 struct tm *ptr = gmtime(clock);
@@ -33,7 +33,7 @@ static inline struct tm *gmtime_r(const time_t* clock, struct 
tm *result)
 }
 #endif
 
-#if !HAVE_LOCALTIME_R
+#if !HAVE_LOCALTIME_R && !defined(localtime_r)
 static inline struct tm *localtime_r(const time_t* clock, struct tm *result)
 {
 struct tm *ptr = localtime(clock);
-- 
2.17.1


[libav-devel] [PATCH 2/2] configure: Include time.h when checking for gmtime_r and localtime_r

2019-04-10 Thread Martin Storsjö
These functions are available in time.h (conditional on posix thread
safe functions) on mingw.

Previously, these functions weren't detected by configure, and
libavutil/time_internal.h provided replacements, even if time.h
actually contained definitions of them.

Previously, these mingw inline functions were defined as
 "extern __inline __attribute__((__gnu_inline__))". In this case,
redefining a new static inline version of the same function with the
same name was accepted.

But recently, the mingw inline functions have changed to be declared
as "static inline", so libavutil/time_internal.h is no longer allowed
to define its own static inline versions with the same names.
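
The clash can be sidestepped by giving the fallback its own name, which
is the idea behind the alternative time_internal.h patch and its ff_
prefix. A minimal sketch in C, assuming a hypothetical name
example_gmtime_r that is not part of Libav or mingw:

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical fallback; the name is illustrative only. Because it does
 * not reuse the identifier gmtime_r, it cannot collide with a
 * "static inline" gmtime_r that a libc header may already define. */
static struct tm *example_gmtime_r(const time_t *clock, struct tm *result)
{
    struct tm *ptr;

    assert(clock && result);
    ptr = gmtime(clock);    /* not thread-safe: returns a shared buffer */
    if (!ptr)
        return NULL;
    *result = *ptr;         /* copy out of the shared buffer */
    return result;
}
```

Callers that want the usual spelling could map it with a macro, but only
when the headers have not already provided the function.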
---
Contrary to what is mentioned in a similar commit
1b4dd59e5fbdebb8d9f13ad2dbdaa0179d0cce57 in ffmpeg, using
check_func_headers works just fine, provided that the posix defines
have been added. (Without them, check_builtin, which that commit
used, doesn't work either.)
---
 configure | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 3e8f2dcde1..8c46a870c4 100755
--- a/configure
+++ b/configure
@@ -4543,9 +4543,7 @@ check_func  gethrtime
 check_func  getopt
 check_func  getrusage
 check_func  gettimeofday
-check_func  gmtime_r
 check_func  isatty
-check_func  localtime_r
 check_func  mkstemp
 check_func  mmap
 check_func  mprotect
@@ -4561,6 +4559,8 @@ check_func  usleep
 check_func_headers io.h setmode
 check_func_headers mach/mach_time.h mach_absolute_time
 check_func_headers stdlib.h getenv
+check_func_headers time.h gmtime_r
+check_func_headers time.h localtime_r
 
 check_func_headers windows.h GetProcessAffinityMask
 check_func_headers windows.h GetProcessTimes
-- 
2.17.1


[libav-devel] [PATCH] arm: Implement a NEON version of 422 h264_h_loop_filter_chroma

2019-03-12 Thread Martin Storsjö
Previously, the 420 version was used even for 422.

This fixes occasional checkasm failures.
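
Reading the assembly in this patch, the 422 function appears to run the
8-row 420 code twice with a doubled stride, once starting at row 0 and
once at row 1, reloading tc0 in between. The sketch below only checks
the row bookkeeping of that scheme; all names are hypothetical and it
assumes each tc0 entry covers four rows in 4:2:2 and two rows in 4:2:0.

```c
#include <assert.h>

enum { EXAMPLE_ROWS_422 = 16 };

/* tc0 index for a row when a 16-row 4:2:2 edge is filtered directly:
 * each tc0 entry is assumed to cover four consecutive rows. */
static int example_tc_index_direct(int row)
{
    assert(row >= 0 && row < EXAMPLE_ROWS_422);
    return row / 4;
}

/* Picture row touched by iteration i (0..7) of pass 0 or pass 1 when
 * the stride is doubled and pass 1 starts one row further down. */
static int example_row_two_pass(int pass, int i)
{
    assert((pass == 0 || pass == 1) && i >= 0 && i < 8);
    return pass + 2 * i;
}

/* tc0 index used by the 8-row 4:2:0 routine for its iteration i:
 * each tc0 entry covers two rows of a pass. */
static int example_tc_index_two_pass(int i)
{
    return i / 2;
}

/* Returns 1 if both schemes pick the same tc0 entry for every row. */
static int example_mappings_agree(void)
{
    for (int pass = 0; pass < 2; pass++)
        for (int i = 0; i < 8; i++) {
            int row = example_row_two_pass(pass, i);
            if (example_tc_index_direct(row) != example_tc_index_two_pass(i))
                return 0;
        }
    return 1;
}
```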
---
 libavcodec/arm/h264dsp_init_arm.c |  8 +++-
 libavcodec/arm/h264dsp_neon.S | 19 +++
 2 files changed, 26 insertions(+), 1 deletion(-)

diff --git a/libavcodec/arm/h264dsp_init_arm.c 
b/libavcodec/arm/h264dsp_init_arm.c
index 7afd350..617632c 100644
--- a/libavcodec/arm/h264dsp_init_arm.c
+++ b/libavcodec/arm/h264dsp_init_arm.c
@@ -33,6 +33,8 @@ void ff_h264_v_loop_filter_chroma_neon(uint8_t *pix, int 
stride, int alpha,
int beta, int8_t *tc0);
 void ff_h264_h_loop_filter_chroma_neon(uint8_t *pix, int stride, int alpha,
int beta, int8_t *tc0);
+void ff_h264_h_loop_filter_chroma422_neon(uint8_t *pix, int stride, int alpha,
+  int beta, int8_t *tc0);
 
 void ff_weight_h264_pixels_16_neon(uint8_t *dst, int stride, int height,
int log2_den, int weight, int offset);
@@ -76,7 +78,11 @@ static av_cold void h264dsp_init_neon(H264DSPContext *c, 
const int bit_depth,
 c->h264_v_loop_filter_luma   = ff_h264_v_loop_filter_luma_neon;
 c->h264_h_loop_filter_luma   = ff_h264_h_loop_filter_luma_neon;
 c->h264_v_loop_filter_chroma = ff_h264_v_loop_filter_chroma_neon;
-c->h264_h_loop_filter_chroma = ff_h264_h_loop_filter_chroma_neon;
+
+if (chroma_format_idc <= 1)
+c->h264_h_loop_filter_chroma = ff_h264_h_loop_filter_chroma_neon;
+else
+c->h264_h_loop_filter_chroma = 
ff_h264_h_loop_filter_chroma422_neon;
 
 c->weight_h264_pixels_tab[0] = ff_weight_h264_pixels_16_neon;
 c->weight_h264_pixels_tab[1] = ff_weight_h264_pixels_8_neon;
diff --git a/libavcodec/arm/h264dsp_neon.S b/libavcodec/arm/h264dsp_neon.S
index 5e75565..783e0f6 100644
--- a/libavcodec/arm/h264dsp_neon.S
+++ b/libavcodec/arm/h264dsp_neon.S
@@ -237,6 +237,7 @@ function ff_h264_h_loop_filter_chroma_neon, export=1
 h264_loop_filter_start
 
 sub r0,  r0,  #2
+h_loop_filter_chroma420:
 vld1.32 {d18[0]}, [r0], r1
 vld1.32 {d16[0]}, [r0], r1
 vld1.32 {d0[0]},  [r0], r1
@@ -271,6 +272,24 @@ function ff_h264_h_loop_filter_chroma_neon, export=1
 bx  lr
 endfunc
 
+function ff_h264_h_loop_filter_chroma422_neon, export=1
+h264_loop_filter_start
+push {r4, lr}
+add r4,  r0,  r1
+add r1,  r1,  r1
+sub r0,  r0,  #2
+
+bl  h_loop_filter_chroma420
+
+ldr r12, [sp, #8]
+ldr r12, [r12]
+vmov.32 d24[0], r12
+sub r0,  r4,  #2
+
+bl  h_loop_filter_chroma420
+pop {r4, pc}
+endfunc
+
 @ Biweighted prediction
 
 .macro  biweight_16 macs, macd
-- 
2.7.4


Re: [libav-devel] [PATCH 2/2] checkasm/h264: test 4:2:2 chroma loop filter functions

2019-02-27 Thread Martin Storsjö

On Wed, 27 Feb 2019, Janne Grunau wrote:


---
tests/checkasm/h264dsp.c | 44 
1 file changed, 26 insertions(+), 18 deletions(-)


LGTM

// Martin


Re: [libav-devel] [PATCH 1/2] h264/arm64: implement missing 4:2:2 chroma loop filter neon functions

2019-02-27 Thread Martin Storsjö

On Wed, 27 Feb 2019, Janne Grunau wrote:


---
libavcodec/aarch64/h264dsp_init_aarch64.c | 18 ++--
libavcodec/aarch64/h264dsp_neon.S | 36 +++
2 files changed, 46 insertions(+), 8 deletions(-)


LGTM

// Martin

Re: [libav-devel] [PATCHv3] avio: Do not flush the buffer if a constant packet size is requested

2019-02-22 Thread Martin Storsjö

On Fri, 22 Feb 2019, Luca Barbato wrote:


---

Now with a separate option to be explicit on what is the behaviour
wanted.

libavformat/aviobuf.c | 9 +++--
libavformat/udp.c | 8 
libavformat/url.h | 1 +
3 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/libavformat/aviobuf.c b/libavformat/aviobuf.c
index 98e35f776c..aa9e2fc483 100644
--- a/libavformat/aviobuf.c
+++ b/libavformat/aviobuf.c
@@ -244,8 +244,13 @@ void avio_write(AVIOContext *s, const unsigned char *buf, 
int size)

void avio_flush(AVIOContext *s)
{
-flush_buffer(s);
-s->must_flush = 0;
+AVIOInternal *internal = s->opaque;
+URLContext *h = internal->h;
+


No, this doesn't work. You can't assume that s->opaque exists and is an 
AVIOInternal struct. When AVIOContext has been allocated by 
avio_alloc_context, s->opaque is whatever custom pointer the caller 
provided.


The only place you can use AVIOInternal is within the callbacks you 
provide in ffio_fdopen when AVIOInternal is created.


To do this properly, you need to propagate the new value all the way into 
AVIOContext, just like the existing max_packet_size.
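
A rough sketch of what "propagating the value into the context" could
look like, using a hypothetical stand-in struct rather than the real
AVIOContext. The field name min_packet_size and the flushing policy are
assumptions for illustration, not what the patch should necessarily do:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an I/O context; not the real AVIOContext. */
typedef struct ExampleIOContext {
    void *opaque;           /* belongs to whoever created the context */
    int   max_packet_size;
    int   min_packet_size;  /* hypothetical new setting kept in the context */
    int   buffered;         /* bytes currently waiting in the buffer */
} ExampleIOContext;

/* Decide whether a flush request should write data, using only fields
 * of the context itself. opaque is never dereferenced, so this works
 * no matter what pointer the caller stored there. */
static int example_should_flush(const ExampleIOContext *s)
{
    assert(s != NULL);
    if (s->min_packet_size > 0 && s->buffered < s->min_packet_size)
        return 0;           /* keep accumulating until a packet is full */
    return s->buffered > 0;
}
```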


// Martin


Re: [libav-devel] [PATCH 19/19] aarch64: vp8: Optimize vp8_idct_add_neon for aarch64

2019-02-19 Thread Martin Storsjö

On Fri, 1 Feb 2019, Martin Storsjö wrote:


The previous version was a pretty exact translation of the arm
version. This version does do some unnecessary arithmetic (it does
more operations on vectors that are only half filled; it does 4
uaddw and 4 sqxtun instead of 2 of each), but it reduces the overhead
of packing data together (which could be done for free in the arm
version).

This gives a decent speedup on Cortex A53, a minor speedup on
A72 and a very minor slowdown on Cortex A73.

Before:            Cortex A53    A72    A73
vp8_idct_add_neon:       79.7   67.5   65.0
After:
vp8_idct_add_neon:       67.7   64.8   66.7
---
libavcodec/aarch64/vp8dsp_neon.S | 49 
1 file changed, 25 insertions(+), 24 deletions(-)


22:38  feel free to push next week if I didn't manage to start by
   then

I'll push this patchset soon, with some changes squashed as suggested by 
Diego.


// Martin

[libav-devel] [PATCH 04/19] aarch64: vp8: Fix assembling with armasm64

2019-02-01 Thread Martin Storsjö
---
 libavcodec/aarch64/vp8dsp_neon.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index f371ea7..14a9d11 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -28,7 +28,7 @@
 function ff_vp8_idct_add_neon, export=1
 ld1 {v0.8b - v3.8b},  [x1]
 mov w4,  #20091
-movk w4,  #35468/2, lsl 16
+movk w4,  #35468/2, lsl #16
 dup v4.2s, w4
 
 smull   v26.4s, v1.4h,  v4.h[0]
-- 
2.7.4


[libav-devel] [PATCH 19/19] aarch64: vp8: Optimize vp8_idct_add_neon for aarch64

2019-02-01 Thread Martin Storsjö
The previous version was a pretty exact translation of the arm
version. This version does do some unnecessary arithmetic (it does
more operations on vectors that are only half filled; it does 4
uaddw and 4 sqxtun instead of 2 of each), but it reduces the overhead
of packing data together (which could be done for free in the arm
version).
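
For readers less familiar with the instructions named above, here is a
scalar C sketch of what the tail of the function computes per pixel: a
rounding shift by 3 (srshr #3), a widening add of the destination pixel
(uaddw) and a saturating narrow to 8 bits (sqxtun). The function names
and the simple row-major layout are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounding shift right by 3, written without shifting negative values
 * so the result does not depend on implementation-defined behaviour. */
static int example_round_shift3(int v)
{
    int t = v + 4;
    return t >= 0 ? t / 8 : -((7 - t) / 8);
}

/* Saturate a signed value to the 0..255 range. */
static uint8_t example_clip_uint8(int v)
{
    if (v < 0)
        return 0;
    if (v > 255)
        return 255;
    return (uint8_t)v;
}

/* Add a 4x4 block of unscaled IDCT output to four rows of pixels. */
static void example_add_residual_4x4(uint8_t *dst, ptrdiff_t stride,
                                     const int16_t res[16])
{
    assert(dst && res);
    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            dst[y * stride + x] =
                example_clip_uint8(dst[y * stride + x] +
                                   example_round_shift3(res[y * 4 + x]));
}
```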

This gives a decent speedup on Cortex A53, a minor speedup on
A72 and a very minor slowdown on Cortex A73.

Before:            Cortex A53    A72    A73
vp8_idct_add_neon:       79.7   67.5   65.0
After:
vp8_idct_add_neon:       67.7   64.8   66.7
---
 libavcodec/aarch64/vp8dsp_neon.S | 49 
 1 file changed, 25 insertions(+), 24 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index cac4558..47fdc21 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -125,36 +125,37 @@ function ff_vp8_idct_add_neon, export=1
 sub v17.4h, v0.4h,  v2.4h
 
 add v18.4h, v20.4h, v23.4h
-ld1 {v24.d}[0], [x0],   x2
-zip1 v16.2d, v16.2d, v17.2d
-sub v19.4h, v21.4h, v22.4h
-ld1 {v25.d}[0], [x0],   x2
-zip1 v18.2d, v18.2d, v19.2d
-add v0.8h,  v16.8h, v18.8h
-ld1 {v25.d}[1], [x0],   x2
-sub v1.8h,  v16.8h, v18.8h
-ld1 {v24.d}[1], [x0],   x2
-srshr   v0.8h,  v0.8h,  #3
-trn1 v24.4s, v24.4s, v25.4s
-srshr   v1.8h,  v1.8h,  #3
+ld1 {v24.s}[0], [x0],   x2
+sub v19.4h, v21.4h, v22.4h
+ld1 {v25.s}[0], [x0],   x2
+add v0.4h,  v16.4h, v18.4h
+add v1.4h,  v17.4h, v19.4h
+ld1 {v26.s}[0], [x0],   x2
+sub v3.4h,  v16.4h, v18.4h
+sub v2.4h,  v17.4h, v19.4h
+ld1 {v27.s}[0], [x0],   x2
+srshr   v0.4h,  v0.4h,  #3
+srshr   v1.4h,  v1.4h,  #3
+srshr   v2.4h,  v2.4h,  #3
+srshr   v3.4h,  v3.4h,  #3
+
 sub x0,  x0,  x2,  lsl #2
 
-ext v1.16b, v1.16b, v1.16b, #8
-trn1 v3.2d,  v0.2d,  v1.2d
-trn2 v0.2d,  v0.2d,  v1.2d
-trn1 v1.8h,  v3.8h,  v0.8h
-trn2 v3.8h,  v3.8h,  v0.8h
-uzp1 v0.4s,  v1.4s,  v3.4s
-uzp2 v1.4s,  v3.4s,  v1.4s
+transpose_4x4H  v0, v1, v2, v3, v5, v6, v7, v16
 
 uaddw   v0.8h,  v0.8h, v24.8b
-uaddw2  v1.8h,  v1.8h, v24.16b
+uaddw   v1.8h,  v1.8h, v25.8b
+uaddw   v2.8h,  v2.8h, v26.8b
+uaddw   v3.8h,  v3.8h, v27.8b
 sqxtun  v0.8b,  v0.8h
-sqxtun2 v0.16b, v1.8h
+sqxtun  v1.8b,  v1.8h
+sqxtun  v2.8b,  v2.8h
+sqxtun  v3.8b,  v3.8h
+
 st1 {v0.s}[0],  [x0], x2
-st1 {v0.s}[1],  [x0], x2
-st1 {v0.s}[3],  [x0], x2
-st1 {v0.s}[2],  [x0], x2
+st1 {v1.s}[0],  [x0], x2
+st1 {v2.s}[0],  [x0], x2
+st1 {v3.s}[0],  [x0], x2
 
 ret
 endfunc
-- 
2.7.4


[libav-devel] [PATCH 01/19] libavcodec: vp8 neon optimizations for aarch64

2019-02-01 Thread Martin Storsjö
From: Magnus Röös 

Partial port of the ARM Neon for aarch64.

Benchmarks from fate:

benchmarking with Linux Perf Monitoring API
nop: 58.6
checkasm: using random seed 1760970128
NEON:
 - vp8dsp.idct   [OK]
 - vp8dsp.mc [OK]
 - vp8dsp.loopfilter [OK]
checkasm: all 21 tests passed
vp8_idct_add_c: 201.6
vp8_idct_add_neon: 83.1
vp8_idct_dc_add_c: 107.6
vp8_idct_dc_add_neon: 33.8
vp8_idct_dc_add4y_c: 426.4
vp8_idct_dc_add4y_neon: 59.4
vp8_loop_filter8uv_h_c: 688.1
vp8_loop_filter8uv_h_neon: 216.3
vp8_loop_filter8uv_inner_h_c: 649.3
vp8_loop_filter8uv_inner_h_neon: 195.3
vp8_loop_filter8uv_inner_v_c: 544.8
vp8_loop_filter8uv_inner_v_neon: 131.3
vp8_loop_filter8uv_v_c: 706.1
vp8_loop_filter8uv_v_neon: 141.1
vp8_loop_filter16y_h_c: 668.8
vp8_loop_filter16y_h_neon: 242.8
vp8_loop_filter16y_inner_h_c: 647.3
vp8_loop_filter16y_inner_h_neon: 224.6
vp8_loop_filter16y_inner_v_c: 647.8
vp8_loop_filter16y_inner_v_neon: 128.8
vp8_loop_filter16y_v_c: 721.8
vp8_loop_filter16y_v_neon: 154.3
vp8_loop_filter_simple_h_c: 387.8
vp8_loop_filter_simple_h_neon: 187.6
vp8_loop_filter_simple_v_c: 384.1
vp8_loop_filter_simple_v_neon: 78.6
vp8_put_epel8_h4v4_c: 3971.1
vp8_put_epel8_h4v4_neon: 855.1
vp8_put_epel8_h4v6_c: 5060.1
vp8_put_epel8_h4v6_neon: 989.6
vp8_put_epel8_h6v4_c: 4320.8
vp8_put_epel8_h6v4_neon: 1007.3
vp8_put_epel8_h6v6_c: 5449.3
vp8_put_epel8_h6v6_neon: 1158.1
vp8_put_epel16_h6_c: 6683.8
vp8_put_epel16_h6_neon: 831.8
vp8_put_epel16_h6v6_c: 0.8
vp8_put_epel16_h6v6_neon: 2214.8
vp8_put_epel16_v6_c: 7024.8
vp8_put_epel16_v6_neon: 799.6
vp8_put_pixels8_c: 112.8
vp8_put_pixels8_neon: 78.1
vp8_put_pixels16_c: 131.3
vp8_put_pixels16_neon: 129.8

Signed-off-by: Magnus Röös 
---
 libavcodec/aarch64/Makefile  |2 +
 libavcodec/aarch64/vp8dsp.h  |   70 ++
 libavcodec/aarch64/vp8dsp_init_aarch64.c |   81 +++
 libavcodec/aarch64/vp8dsp_neon.S | 1031 ++
 libavcodec/vp8dsp.c  |4 +
 libavcodec/vp8dsp.h  |2 +
 6 files changed, 1190 insertions(+)
 create mode 100644 libavcodec/aarch64/vp8dsp.h
 create mode 100644 libavcodec/aarch64/vp8dsp_init_aarch64.c
 create mode 100644 libavcodec/aarch64/vp8dsp_neon.S

diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile
index 5c1d118..2555044 100644
--- a/libavcodec/aarch64/Makefile
+++ b/libavcodec/aarch64/Makefile
@@ -44,6 +44,8 @@ NEON-OBJS-$(CONFIG_MPEGAUDIODSP)+= 
aarch64/mpegaudiodsp_neon.o
 NEON-OBJS-$(CONFIG_DCA_DECODER) += aarch64/dcadsp_neon.o   
\
aarch64/synth_filter_neon.o
 NEON-OBJS-$(CONFIG_VORBIS_DECODER)  += aarch64/vorbisdsp_neon.o
+NEON-OBJS-$(CONFIG_VP8DSP)  += aarch64/vp8dsp_init_aarch64.o   
\
+   aarch64/vp8dsp_neon.o
 NEON-OBJS-$(CONFIG_VP9_DECODER) += aarch64/vp9itxfm_neon.o 
\
aarch64/vp9lpf_neon.o   
\
aarch64/vp9mc_neon.o
diff --git a/libavcodec/aarch64/vp8dsp.h b/libavcodec/aarch64/vp8dsp.h
new file mode 100644
index 000..8a0c8fb
--- /dev/null
+++ b/libavcodec/aarch64/vp8dsp.h
@@ -0,0 +1,70 @@
+/*
+ * This file is part of Libav.
+ *
+ * Libav is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * Libav is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with Libav; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#ifndef AVCODEC_ARM_VP8DSP_H
+#define AVCODEC_ARM_VP8DSP_H
+
+#include "libavcodec/vp8dsp.h"
+
+#define VP8_LF_Y(hv, inner, opt) \
+void ff_vp8_##hv##_loop_filter16##inner##_##opt(uint8_t *dst,\
+ptrdiff_t stride,\
+int flim_E, int flim_I,  \
+int hev_thresh)
+
+#define VP8_LF_UV(hv, inner, opt)\
+void ff_vp8_##hv##_loop_filter8uv##inner##_##opt(uint8_t *dstU,  \
+ uint8_t *dstV,  \
+ ptrdiff_t stride,   \
+ int flim_E, int flim_I, \
+  

[libav-devel] [PATCH 03/19] aarch64: vp8: Fix assembling with clang

2019-02-01 Thread Martin Storsjö
This also partially fixes assembling with MS armasm64 (via
gas-preprocessor).
---
 libavcodec/aarch64/vp8dsp_neon.S | 124 +++
 1 file changed, 62 insertions(+), 62 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index 771877c..f371ea7 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -31,10 +31,10 @@ function ff_vp8_idct_add_neon, export=1
 movk w4,  #35468/2, lsl 16
 dup v4.2s, w4
 
-smull   v26.4s, v1.4h,  v4.4h[0]
-smull   v27.4s, v3.4h,  v4.4h[0]
-sqdmulh v20.4h, v1.4h,  v4.4h[1]
-sqdmulh v23.4h, v3.4h,  v4.4h[1]
+smull   v26.4s, v1.4h,  v4.h[0]
+smull   v27.4s, v3.4h,  v4.h[0]
+sqdmulh v20.4h, v1.4h,  v4.h[1]
+sqdmulh v23.4h, v3.4h,  v4.h[1]
 sqshrn  v21.4h, v26.4s, #16
 sqshrn  v22.4h, v27.4s, #16
 add v21.4h, v21.4h, v1.4h
@@ -54,12 +54,12 @@ function ff_vp8_idct_add_neon, export=1
 transpose_4x4H  v0, v1, v2, v3, v24, v5, v6, v7
 
 movi v29.8h, #0
-smull   v26.4s, v1.4h,  v4.4h[0]
+smull   v26.4s, v1.4h,  v4.h[0]
 st1 {v29.8h},   [x1],   #16
-smull   v27.4s, v3.4h,  v4.4h[0]
+smull   v27.4s, v3.4h,  v4.h[0]
 st1 {v29.16b},  [x1]
-sqdmulh v21.4h, v1.4h,  v4.4h[1]
-sqdmulh v23.4h, v3.4h,  v4.4h[1]
+sqdmulh v21.4h, v1.4h,  v4.h[1]
+sqdmulh v23.4h, v3.4h,  v4.h[1]
 sqshrn  v20.4h, v26.4s, #16
 sqshrn  v22.4h, v27.4s, #16
 add v20.4h, v20.4h, v1.4h
@@ -469,7 +469,7 @@ function ff_vp8_h_loop_filter16\name\()_neon, export=1
 ld1 {v6.d}[1], [x0], x1
 ld1 {v7.d}[1], [x0], x1
 
-transpose_8x16b   v0,  v1,  v2,  v3,  v4,  v5,  v6,  v7, v30, v31
+transpose_8x16B   v0,  v1,  v2,  v3,  v4,  v5,  v6,  v7, v30, v31
 
 dup v22.16b, w2 // flim_E
 .if !\simple
@@ -480,7 +480,7 @@ function ff_vp8_h_loop_filter16\name\()_neon, export=1
 
 sub x0,  x0,  x1, lsl #4// backup 16 rows
 
-transpose_8x16b   v0,  v1,  v2,  v3,  v4,  v5,  v6,  v7, v30, v31
+transpose_8x16B   v0,  v1,  v2,  v3,  v4,  v5,  v6,  v7, v30, v31
 
 // Store pixels:
 st1 {v0.d}[0], [x0], x1
@@ -531,7 +531,7 @@ function ff_vp8_h_loop_filter8uv\name\()_neon, export=1
 ld1  {v7.d}[0], [x0], x2
 ld1  {v7.d}[1], [x1], x2
 
-transpose_8x16b   v0,  v1,  v2,  v3,  v4,  v5,  v6,  v7, v30, v31
+transpose_8x16B   v0,  v1,  v2,  v3,  v4,  v5,  v6,  v7, v30, v31
 
 dup v22.16b, w3 // flim_E
 dup v23.16b, w4 // flim_I
@@ -541,7 +541,7 @@ function ff_vp8_h_loop_filter8uv\name\()_neon, export=1
 sub x0,  x0,  x2, lsl #3// backup u 8 rows
 sub x1,  x1,  x2, lsl #3// backup v 8 rows
 
-transpose_8x16b   v0,  v1,  v2,  v3,  v4,  v5,  v6,  v7, v30, v31
+transpose_8x16B   v0,  v1,  v2,  v3,  v4,  v5,  v6,  v7, v30, v31
 
 // Store pixels:
 st1  {v0.d}[0], [x0], x2 // load u
@@ -613,13 +613,13 @@ endfunc
 uxtl v22.8h, v24.8b
 ext v26.8b, \s0\().8b,  \s1\().8b,  #5
 uxtl v25.8h, v25.8b
-mul v21.8h, v21.8h, v0.8h[2]
+mul v21.8h, v21.8h, v0.h[2]
 uxtl v26.8h, v26.8b
-mul v22.8h, v22.8h, v0.8h[3]
-mls v21.8h, v19.8h, v0.8h[1]
-mls v22.8h, v25.8h, v0.8h[4]
-mla v21.8h, v18.8h, v0.8h[0]
-mla v22.8h, v26.8h, v0.8h[5]
+mul v22.8h, v22.8h, v0.h[3]
+mls v21.8h, v19.8h, v0.h[1]
+mls v22.8h, v25.8h, v0.h[4]
+mla v21.8h, v18.8h, v0.h[0]
+mla v22.8h, v26.8h, v0.h[5]
 sqadd   v22.8h, v21.8h, v22.8h
 sqrshrun \d\().8b, v22.8h, #7
 .endm
@@ -640,20 +640,20 @@ endfunc
 uxtl2   v2.8h,   v2.16b
 uxtl v17.8h,  v16.8b
 uxtl2   v16.8h,  v16.16b
-mul v19.8h,  v19.8h, v0.8h[3]
-mul v18.8h,  v18.8h, v0.8h[2]
-mul v3.8h,   v3.8h,  v0.8h[2]
-mul v22.8h,  v22.8h, v0.8h[3]
-mls v19.8h,  v20.8h, v0.8h[4]
+mul v19.8h,  v19.8h, v0.h[3]
+mul v18.8h,  v18.8h, v0.h[2]
+mul 

[libav-devel] [PATCH 18/19] aarch64: vp8: Skip saturating in shrn in ff_vp8_idct_add_neon

2019-02-01 Thread Martin Storsjö
The original arm version didn't do saturation here. This probably
doesn't make any difference for performance, but reduces the
differences.
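
As a side note on why the results do not change: a 32-bit value shifted
right by 16 always fits in 16 bits, so the saturation in sqshrn #16 can
never trigger. A small scalar sketch in C, with hypothetical helper
names modelling one lane of each instruction:

```c
#include <assert.h>
#include <stdint.h>

/* Floor division by 65536, i.e. an arithmetic shift right by 16,
 * written without shifting negative values. */
static int32_t example_asr16(int32_t v)
{
    if (v >= 0)
        return v / 65536;
    return (int32_t)-((65535 - (int64_t)v) / 65536);
}

/* Model of one lane of "shrn #16": shift, then keep the low 16 bits. */
static int16_t example_shrn16(int32_t v)
{
    int32_t s = example_asr16(v);
    /* Always true for a 16-bit shift of a 32-bit input. */
    assert(s >= INT16_MIN && s <= INT16_MAX);
    return (int16_t)s;
}

/* Model of one lane of "sqshrn #16": shift, then saturate to int16_t. */
static int16_t example_sqshrn16(int32_t v)
{
    int32_t s = example_asr16(v);
    if (s > INT16_MAX)
        return INT16_MAX;
    if (s < INT16_MIN)
        return INT16_MIN;
    return (int16_t)s;
}
```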
---
 libavcodec/aarch64/vp8dsp_neon.S | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index 139b380..cac4558 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -92,8 +92,8 @@ function ff_vp8_idct_add_neon, export=1
 smull   v27.4s, v3.4h,  v4.h[0]
 sqdmulh v20.4h, v1.4h,  v4.h[1]
 sqdmulh v23.4h, v3.4h,  v4.h[1]
-sqshrn  v21.4h, v26.4s, #16
-sqshrn  v22.4h, v27.4s, #16
+shrn v21.4h, v26.4s, #16
+shrn v22.4h, v27.4s, #16
 add v21.4h, v21.4h, v1.4h
 add v22.4h, v22.4h, v3.4h
 
@@ -117,8 +117,8 @@ function ff_vp8_idct_add_neon, export=1
 st1 {v29.16b},  [x1]
 sqdmulh v21.4h, v1.4h,  v4.h[1]
 sqdmulh v23.4h, v3.4h,  v4.h[1]
-sqshrn  v20.4h, v26.4s, #16
-sqshrn  v22.4h, v27.4s, #16
+shrn v20.4h, v26.4s, #16
+shrn v22.4h, v27.4s, #16
 add v20.4h, v20.4h, v1.4h
 add v22.4h, v22.4h, v3.4h
 add v16.4h, v0.4h,  v2.4h
-- 
2.7.4


[libav-devel] [PATCH 13/19] aarch64: vp8: Port missing epel8 functions from arm version

2019-02-01 Thread Martin Storsjö
                        Cortex A53     A72     A73
vp8_put_epel8_h4_c:         2594.8  1159.6  1374.8
vp8_put_epel8_h4_neon:       506.4   244.2   314.0
vp8_put_epel8_h6_c:         3445.8  1677.1  1811.3
vp8_put_epel8_h6_neon:       634.4   371.7   433.0
vp8_put_epel8_v4_c:         2614.0  1174.8  1378.0
vp8_put_epel8_v4_neon:       321.0   221.7   235.8
vp8_put_epel8_v6_c:         3635.5  1703.0  2079.2
vp8_put_epel8_v6_neon:       416.9   317.0   295.5
---
 libavcodec/aarch64/vp8dsp_init_aarch64.c |  4 ++
 libavcodec/aarch64/vp8dsp_neon.S | 87 
 2 files changed, 91 insertions(+)

diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c 
b/libavcodec/aarch64/vp8dsp_init_aarch64.c
index 8f060dc..1878d8e 100644
--- a/libavcodec/aarch64/vp8dsp_init_aarch64.c
+++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c
@@ -47,8 +47,12 @@ av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp)
 dsp->put_vp8_epel_pixels_tab[0][2][2] = ff_put_vp8_epel16_h6v6_neon;
 
 dsp->put_vp8_epel_pixels_tab[1][0][0] = ff_put_vp8_pixels8_neon;
+dsp->put_vp8_epel_pixels_tab[1][0][1] = ff_put_vp8_epel8_h4_neon;
+dsp->put_vp8_epel_pixels_tab[1][0][2] = ff_put_vp8_epel8_h6_neon;
+dsp->put_vp8_epel_pixels_tab[1][1][0] = ff_put_vp8_epel8_v4_neon;
 dsp->put_vp8_epel_pixels_tab[1][1][1] = ff_put_vp8_epel8_h4v4_neon;
 dsp->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_neon;
+dsp->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_neon;
 dsp->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_neon;
 dsp->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_neon;
 }
diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index 4ea62c0..c5badc4 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -957,6 +957,51 @@ function ff_put_vp8_epel16_h6v6_neon, export=1
 ret
 endfunc
 
+function ff_put_vp8_epel8_v6_neon, export=1
+sub x2,  x2,  x3,  lsl #1
+
+movrel  x7,  subpel_filters, -16
+add x6,  x7,  w6, uxtw #4
+ld1 {v0.8h},  [x6]
+1:
+ld1 {v2.8b},  [x2], x3
+ld1 {v3.8b},  [x2], x3
+ld1 {v4.8b},  [x2], x3
+ld1 {v5.8b},  [x2], x3
+ld1 {v6.8b},  [x2], x3
+ld1 {v7.8b},  [x2], x3
+ld1 {v28.8b}, [x2]
+
+sub x2,  x2,  x3,  lsl #2
+
+vp8_epel8_v6_y2 v2, v3, v2, v3, v4, v5, v6, v7, v28
+
+st1 {v2.8b}, [x0], x1
+st1 {v3.8b}, [x0], x1
+subs w4,  w4,  #2
+b.ne 1b
+
+ret
+endfunc
+
+function ff_put_vp8_epel8_h6_neon, export=1
+sub x2,  x2,  #2
+
+movrel  x7,  subpel_filters, -16
+add x5,  x7,  w5, uxtw #4
+ld1 {v0.8h},[x5]
+1:
+ld1 {v2.8b, v3.8b}, [x2], x3
+
+vp8_epel8_h6 v2,  v2,  v3
+
+st1 {v2.8b}, [x0], x1
+subs w4,  w4,  #1
+b.ne 1b
+
+ret
+endfunc
+
 function ff_put_vp8_epel8_h6v6_neon, export=1
 sub x2,  x2,  x3,  lsl #1
 sub x2,  x2,  #2
@@ -1003,6 +1048,48 @@ function ff_put_vp8_epel8_h6v6_neon, export=1
 ret
 endfunc
 
+function ff_put_vp8_epel8_v4_neon, export=1
+sub x2,  x2,  x3
+
+movrel  x7,  subpel_filters, -16
+add x6,  x7,  w6, uxtw #4
+ld1 {v0.8h}, [x6]
+1:
+ld1 {v2.8b}, [x2], x3
+ld1 {v3.8b}, [x2], x3
+ld1 {v4.8b}, [x2], x3
+ld1 {v5.8b}, [x2], x3
+ld1 {v6.8b}, [x2]
+sub x2,  x2,  x3,  lsl #1
+
+vp8_epel8_v4_y2 v2, v2, v3, v4, v5, v6
+
+st1 {v2.d}[0], [x0], x1
+st1 {v2.d}[1], [x0], x1
+subs w4,  w4,  #2
+b.ne 1b
+
+ret
+endfunc
+
+function ff_put_vp8_epel8_h4_neon, export=1
+sub x2,  x2,  #1
+
+movrel  x7,  subpel_filters, -16
+add x5,  x7,  w5, uxtw #4
+ld1 {v0.8h},   [x5]
+1:
+ld1 {v2.8b,v3.8b}, [x2], x3
+
+vp8_epel8_h4 v2,  v2,  v3
+
+st1 {v2.8b}, [x0], x1
+subs w4,  w4,  #1
+b.ne 1b
+
+ret
+endfunc
+
 function ff_put_vp8_epel8_h4v6_neon, export=1
 sub x2,  x2,  x3,  lsl #1
 sub x2,  x2,  #1
-- 
2.7.4


[libav-devel] [PATCH 15/19] aarch64: vp8: Port bilin functions from arm version

2019-02-01 Thread Martin Storsjö
                          Cortex A53     A72     A73
vp8_put_bilin4_h_c:            303.8   102.2   161.8
vp8_put_bilin4_h_neon:         100.0    40.9    41.2
vp8_put_bilin4_hv_c:           322.8   201.0   305.9
vp8_put_bilin4_hv_neon:        156.8    72.6    77.0
vp8_put_bilin4_v_c:            304.7   101.7   166.5
vp8_put_bilin4_v_neon:          82.7    41.2    33.0
vp8_put_bilin8_h_c:           1192.7   352.5   623.8
vp8_put_bilin8_h_neon:         213.5    70.2    87.8
vp8_put_bilin8_hv_c:          1098.6   769.2  1041.9
vp8_put_bilin8_hv_neon:        324.0   123.5   146.0
vp8_put_bilin8_v_c:           1193.9   350.4   617.7
vp8_put_bilin8_v_neon:         183.9    60.7    64.7
vp8_put_bilin16_h_c:          2353.1   671.2  1223.3
vp8_put_bilin16_h_neon:        261.9   140.7   145.0
vp8_put_bilin16_hv_c:         2453.2  1470.9  2355.2
vp8_put_bilin16_hv_neon:       383.9   196.0   217.0
vp8_put_bilin16_v_c:          2349.3   669.8  1251.2
vp8_put_bilin16_v_neon:        202.9   110.7    96.2
---
 libavcodec/aarch64/vp8dsp.h  |   5 +
 libavcodec/aarch64/vp8dsp_init_aarch64.c |  32 
 libavcodec/aarch64/vp8dsp_neon.S | 292 +++
 3 files changed, 329 insertions(+)

diff --git a/libavcodec/aarch64/vp8dsp.h b/libavcodec/aarch64/vp8dsp.h
index 40d0cae..616252e 100644
--- a/libavcodec/aarch64/vp8dsp.h
+++ b/libavcodec/aarch64/vp8dsp.h
@@ -67,4 +67,9 @@
 VP8_MC(epel ## w ## _h4v6, opt);\
 VP8_MC(epel ## w ## _h6v6, opt)
 
+#define VP8_BILIN(w, opt)   \
+VP8_MC(bilin ## w ## _h, opt);  \
+VP8_MC(bilin ## w ## _v, opt);  \
+VP8_MC(bilin ## w ## _hv, opt)
+
 #endif /* AVCODEC_AARCH64_VP8DSP_H */
diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c 
b/libavcodec/aarch64/vp8dsp_init_aarch64.c
index 478f849..53fbfcd 100644
--- a/libavcodec/aarch64/vp8dsp_init_aarch64.c
+++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c
@@ -36,6 +36,9 @@ VP8_EPEL(16, neon);
 VP8_EPEL(8,  neon);
 VP8_EPEL(4,  neon);
 
+VP8_BILIN(16, neon);
+VP8_BILIN(8,  neon);
+VP8_BILIN(4,  neon);
 
 av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp)
 {
@@ -65,6 +68,35 @@ av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp)
 dsp->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_neon;
 dsp->put_vp8_epel_pixels_tab[2][2][1] = ff_put_vp8_epel4_h4v6_neon;
 dsp->put_vp8_epel_pixels_tab[2][2][2] = ff_put_vp8_epel4_h6v6_neon;
+
+dsp->put_vp8_bilinear_pixels_tab[0][0][0] = ff_put_vp8_pixels16_neon;
+dsp->put_vp8_bilinear_pixels_tab[0][0][1] = ff_put_vp8_bilin16_h_neon;
+dsp->put_vp8_bilinear_pixels_tab[0][0][2] = ff_put_vp8_bilin16_h_neon;
+dsp->put_vp8_bilinear_pixels_tab[0][1][0] = ff_put_vp8_bilin16_v_neon;
+dsp->put_vp8_bilinear_pixels_tab[0][1][1] = ff_put_vp8_bilin16_hv_neon;
+dsp->put_vp8_bilinear_pixels_tab[0][1][2] = ff_put_vp8_bilin16_hv_neon;
+dsp->put_vp8_bilinear_pixels_tab[0][2][0] = ff_put_vp8_bilin16_v_neon;
+dsp->put_vp8_bilinear_pixels_tab[0][2][1] = ff_put_vp8_bilin16_hv_neon;
+dsp->put_vp8_bilinear_pixels_tab[0][2][2] = ff_put_vp8_bilin16_hv_neon;
+
+dsp->put_vp8_bilinear_pixels_tab[1][0][0] = ff_put_vp8_pixels8_neon;
+dsp->put_vp8_bilinear_pixels_tab[1][0][1] = ff_put_vp8_bilin8_h_neon;
+dsp->put_vp8_bilinear_pixels_tab[1][0][2] = ff_put_vp8_bilin8_h_neon;
+dsp->put_vp8_bilinear_pixels_tab[1][1][0] = ff_put_vp8_bilin8_v_neon;
+dsp->put_vp8_bilinear_pixels_tab[1][1][1] = ff_put_vp8_bilin8_hv_neon;
+dsp->put_vp8_bilinear_pixels_tab[1][1][2] = ff_put_vp8_bilin8_hv_neon;
+dsp->put_vp8_bilinear_pixels_tab[1][2][0] = ff_put_vp8_bilin8_v_neon;
+dsp->put_vp8_bilinear_pixels_tab[1][2][1] = ff_put_vp8_bilin8_hv_neon;
+dsp->put_vp8_bilinear_pixels_tab[1][2][2] = ff_put_vp8_bilin8_hv_neon;
+
+dsp->put_vp8_bilinear_pixels_tab[2][0][1] = ff_put_vp8_bilin4_h_neon;
+dsp->put_vp8_bilinear_pixels_tab[2][0][2] = ff_put_vp8_bilin4_h_neon;
+dsp->put_vp8_bilinear_pixels_tab[2][1][0] = ff_put_vp8_bilin4_v_neon;
+dsp->put_vp8_bilinear_pixels_tab[2][1][1] = ff_put_vp8_bilin4_hv_neon;
+dsp->put_vp8_bilinear_pixels_tab[2][1][2] = ff_put_vp8_bilin4_hv_neon;
+dsp->put_vp8_bilinear_pixels_tab[2][2][0] = ff_put_vp8_bilin4_v_neon;
+dsp->put_vp8_bilinear_pixels_tab[2][2][1] = ff_put_vp8_bilin4_hv_neon;
+dsp->put_vp8_bilinear_pixels_tab[2][2][2] = ff_put_vp8_bilin4_hv_neon;
 }
 
 av_cold void ff_vp8dsp_init_aarch64(VP8DSPContext *dsp)
diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index 7fe2466..604be8a 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -1509,3 +1509,295 @@ function ff_put_vp8_epel4_h4v4_neon, export=1
 add sp,  sp,  #44
 ret
 endfunc
+
+/* Bilinear MC */
+
+function ff_put_vp8_bilin16_h_neon, export=1
+mov w7, #8
+dup v0.8b,  w5
+sub w5, w7, w5
+dup v1.8b,  w5
+1:
+subsw4, 

[libav-devel] [PATCH 12/19] aarch64: vp8: Port vp8_luma_dc_wht and vp8_idct_dc_add4uv from arm version

2019-02-01 Thread Martin Storsjö
                          Cortex A53     A72     A73
vp8_luma_dc_wht_c:             115.7    75.7    90.7
vp8_luma_dc_wht_neon:           60.7    41.2    45.7
vp8_idct_dc_add4uv_c:          376.1   262.9   282.5
vp8_idct_dc_add4uv_neon:        52.0    29.0    37.0
---
 libavcodec/aarch64/vp8dsp_init_aarch64.c |   3 +
 libavcodec/aarch64/vp8dsp_neon.S | 109 +++
 2 files changed, 112 insertions(+)

diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c 
b/libavcodec/aarch64/vp8dsp_init_aarch64.c
index da54efd..8f060dc 100644
--- a/libavcodec/aarch64/vp8dsp_init_aarch64.c
+++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c
@@ -28,6 +28,7 @@ void ff_vp8_luma_dc_wht_neon(int16_t block[4][4][16], int16_t 
dc[16]);
 void ff_vp8_idct_add_neon(uint8_t *dst, int16_t block[16], ptrdiff_t stride);
 void ff_vp8_idct_dc_add_neon(uint8_t *dst, int16_t block[16], ptrdiff_t 
stride);
 void ff_vp8_idct_dc_add4y_neon(uint8_t *dst, int16_t block[4][16], ptrdiff_t 
stride);
+void ff_vp8_idct_dc_add4uv_neon(uint8_t *dst, int16_t block[4][16], ptrdiff_t 
stride);
 
 VP8_LF(neon);
 
@@ -57,10 +58,12 @@ av_cold void ff_vp8dsp_init_aarch64(VP8DSPContext *dsp)
 if (!have_neon(av_get_cpu_flags())) {
 return;
 }
+dsp->vp8_luma_dc_wht= ff_vp8_luma_dc_wht_neon;
 
 dsp->vp8_idct_add   = ff_vp8_idct_add_neon;
 dsp->vp8_idct_dc_add= ff_vp8_idct_dc_add_neon;
 dsp->vp8_idct_dc_add4y  = ff_vp8_idct_dc_add4y_neon;
+dsp->vp8_idct_dc_add4uv = ff_vp8_idct_dc_add4uv_neon;
 
 dsp->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_neon;
 dsp->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_neon;
diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index 2b5b049..4ea62c0 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -4,6 +4,7 @@
  * Copyright (c) 2010 Rob Clark 
  * Copyright (c) 2011 Mans Rullgard 
  * Copyright (c) 2018 Magnus Röös 
+ * Copyright (c) 2019 Martin Storsjo 
  *
  * This file is part of Libav.
  *
@@ -25,6 +26,62 @@
 #include "libavutil/aarch64/asm.S"
 #include "neon.S"
 
+function ff_vp8_luma_dc_wht_neon, export=1
+ld1 {v0.4h - v3.4h}, [x1]
+movi v30.8h, #0
+
+add v4.4h,  v0.4h,  v3.4h
+add v6.4h,  v1.4h,  v2.4h
+st1 {v30.8h}, [x1], #16
+sub v7.4h,  v1.4h,  v2.4h
+sub v5.4h,  v0.4h,  v3.4h
+st1 {v30.8h}, [x1]
+add v0.4h,  v4.4h,  v6.4h
+add v1.4h,  v5.4h,  v7.4h
+sub v2.4h,  v4.4h,  v6.4h
+sub v3.4h,  v5.4h,  v7.4h
+
+movi v16.4h, #3
+
+transpose_4x4H  v0, v1, v2, v3, v4, v5, v6, v7
+
+add v0.4h,  v0.4h,  v16.4h
+
+add v4.4h,  v0.4h,  v3.4h
+add v6.4h,  v1.4h,  v2.4h
+sub v7.4h,  v1.4h,  v2.4h
+sub v5.4h,  v0.4h,  v3.4h
+add v0.4h,  v4.4h,  v6.4h
+add v1.4h,  v5.4h,  v7.4h
+sub v2.4h,  v4.4h,  v6.4h
+sub v3.4h,  v5.4h,  v7.4h
+
+sshr v0.4h,  v0.4h,  #3
+sshr v1.4h,  v1.4h,  #3
+sshr v2.4h,  v2.4h,  #3
+sshr v3.4h,  v3.4h,  #3
+
+mov x3,  #32
+st1 {v0.h}[0],  [x0], x3
+st1 {v1.h}[0],  [x0], x3
+st1 {v2.h}[0],  [x0], x3
+st1 {v3.h}[0],  [x0], x3
+st1 {v0.h}[1],  [x0], x3
+st1 {v1.h}[1],  [x0], x3
+st1 {v2.h}[1],  [x0], x3
+st1 {v3.h}[1],  [x0], x3
+st1 {v0.h}[2],  [x0], x3
+st1 {v1.h}[2],  [x0], x3
+st1 {v2.h}[2],  [x0], x3
+st1 {v3.h}[2],  [x0], x3
+st1 {v0.h}[3],  [x0], x3
+st1 {v1.h}[3],  [x0], x3
+st1 {v2.h}[3],  [x0], x3
+st1 {v3.h}[3],  [x0], x3
+
+ret
+endfunc
+
 function ff_vp8_idct_add_neon, export=1
 ld1 {v0.8b - v3.8b},  [x1]
 mov w4,  #20091
@@ -102,6 +159,58 @@ function ff_vp8_idct_add_neon, export=1
 ret
 endfunc
 
+function ff_vp8_idct_dc_add4uv_neon, export=1
+movi v0.4h,  #0
+mov x3, #32
+ld1r {v16.4h},  [x1]
+st1 {v0.h}[0], [x1], x3
+ld1r {v17.4h},  [x1]
+st1 {v0.h}[0], [x1], x3
+ld1r {v18.4h},  [x1]
+st1 {v0.h}[0], [x1], x3
+ld1r {v19.4h},  [x1]
+st1 {v0.h}[0], [x1], x3
+ins v16.d[1],  v17.d[0]
+ins v18.d[1],  v19.d[0]
+mov x3,  x0
+srshr 
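
For readers who do not follow NEON closely, the butterflies in ff_vp8_luma_dc_wht_neon above can be mirrored in scalar C. This is only a sketch derived from the instructions visible in the hunk; the name luma_dc_wht_sketch is made up here, and it is not claimed to be the project's C reference.

```c
#include <stdint.h>
#include <string.h>

/* Scalar sketch of the inverse Walsh-Hadamard transform of the 16 luma DC
 * coefficients. Each result lands in coefficient 0 of one 4x4 block,
 * matching the 32-byte store stride used by the asm. */
void luma_dc_wht_sketch(int16_t block[4][4][16], int16_t dc[16])
{
    int i, t00, t10, t20, t30;

    /* First pass: butterflies between the four rows of dc. */
    for (i = 0; i < 4; i++) {
        t00 = dc[0 * 4 + i] + dc[3 * 4 + i];
        t10 = dc[1 * 4 + i] + dc[2 * 4 + i];
        t20 = dc[1 * 4 + i] - dc[2 * 4 + i];
        t30 = dc[0 * 4 + i] - dc[3 * 4 + i];
        dc[0 * 4 + i] = t00 + t10;
        dc[1 * 4 + i] = t30 + t20;
        dc[2 * 4 + i] = t00 - t10;
        dc[3 * 4 + i] = t30 - t20;
    }

    /* Second pass: the same butterflies within each row, with a rounding
     * term of 3 and a shift right by 3 (assumed arithmetic, like sshr). */
    for (i = 0; i < 4; i++) {
        t00 = dc[i * 4 + 0] + dc[i * 4 + 3] + 3;
        t10 = dc[i * 4 + 1] + dc[i * 4 + 2];
        t20 = dc[i * 4 + 1] - dc[i * 4 + 2];
        t30 = dc[i * 4 + 0] - dc[i * 4 + 3] + 3;
        block[i][0][0] = (t00 + t10) >> 3;
        block[i][1][0] = (t30 + t20) >> 3;
        block[i][2][0] = (t00 - t10) >> 3;
        block[i][3][0] = (t30 - t20) >> 3;
    }

    /* The asm also clears the DC array (the two st1 stores of v30). */
    memset(dc, 0, 16 * sizeof(*dc));
}
```

With a single nonzero input, dc[0] = 8, every block ends up with a DC value of 1, which is a quick sanity check of the rounding.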

[libav-devel] [PATCH 11/19] aarch64: vp8: Fix a typo in a comment

2019-02-01 Thread Martin Storsjö
---
 libavcodec/aarch64/vp8dsp_neon.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index c19ab0d..2b5b049 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -743,7 +743,7 @@ endfunc
 
 
 // note: worst case sum of all 6-tap filter values * 255 is 0x7f80 so 16 bit
-// arithmatic can be used to apply filters
+// arithmetic can be used to apply filters
 const   subpel_filters, align=4
 .short 0,   6, 123,  12,   1,   0,   0,   0
 .short 2,  11, 108,  36,   8,   1,   0,   0
-- 
2.7.4
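
The bound in that comment can be checked with a little arithmetic. A minimal sketch, using only the two table rows visible in the hunk and assuming the sign pattern (+, -, +, +, -, +) for the six taps; the table stores magnitudes only, so the signs are an assumption here, and signed_tap_sum and worst_case are hypothetical helpers.

```c
#include <stdint.h>

/* The two filter rows visible in the hunk above (first six entries). */
static const int16_t rows[2][6] = {
    { 0,  6, 123, 12, 1, 0 },
    { 2, 11, 108, 36, 8, 1 },
};

/* Assumed tap signs: taps 1 and 4 are subtracted, the rest are added. */
static const int sign[6] = { 1, -1, 1, 1, -1, 1 };

/* Sum of the signed taps of one row. */
int signed_tap_sum(int row)
{
    int sum = 0;
    for (int i = 0; i < 6; i++)
        sum += sign[i] * rows[row][i];
    return sum;
}

/* Filter sum when every input pixel has the maximum 8-bit value. */
int worst_case(int row)
{
    return signed_tap_sum(row) * 255;
}
```

Both rows sum to 128, and 128 * 255 = 32640 = 0x7f80, which is just below INT16_MAX (32767).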

___
libav-devel mailing list
libav-devel@libav.org
https://lists.libav.org/mailman/listinfo/libav-devel

[libav-devel] [PATCH 16/19] arm: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2

2019-02-01 Thread Martin Storsjö
This makes it similar to put_epel16_v6, and gives a 10-25%
speedup of this function.

Before:                   Cortex A7       A8       A9      A53     A72
vp8_put_epel16_h6v6_neon:    3058.0   2218.5   2459.8   2183.0  1572.2
After:
vp8_put_epel16_h6v6_neon:    2670.8   1934.2   2244.4   1729.4  1503.9
---
 libavcodec/arm/vp8dsp_neon.S | 41 +
 1 file changed, 13 insertions(+), 28 deletions(-)

diff --git a/libavcodec/arm/vp8dsp_neon.S b/libavcodec/arm/vp8dsp_neon.S
index f43b4f7..b707d19 100644
--- a/libavcodec/arm/vp8dsp_neon.S
+++ b/libavcodec/arm/vp8dsp_neon.S
@@ -773,23 +773,6 @@ endfunc
 vqrshrun.s16 \d1, q14, #7
 .endm
 
-.macro  vp8_epel8_v6 d0,  s0,  s1,  s2,  s3,  s4,  s5
-vmovl.u8 q10, \s2
-vmovl.u8 q11, \s3
-vmovl.u8 q9,  \s1
-vmovl.u8 q12, \s4
-vmovl.u8 q8,  \s0
-vmovl.u8 q13, \s5
-vmul.u16 q10, q10, d0[2]
-vmul.u16 q11, q11, d0[3]
-vmls.u16 q10, q9,  d0[1]
-vmls.u16 q11, q12, d1[0]
-vmla.u16 q10, q8,  d0[0]
-vmla.u16 q11, q13, d1[1]
-vqadd.s16   q11, q10, q11
-vqrshrun.s16 \d0, q11, #7
-.endm
-
 .macro  vp8_epel8_v6_y2 d0, d1, s0, s1, s2, s3, s4, s5, s6
 vmovl.u8 q10, \s0
 vmovl.u8 q11, \s3
@@ -909,12 +892,12 @@ function ff_put_vp8_epel16_h6v6_neon, export=1
 sub r2,  r2,  r3,  lsl #1
 sub r2,  r2,  #2
 push {r4,lr}
-vpush   {d8-d9}
+vpush   {d8-d15}
 
 @ first pass (horizontal):
-ldr r4,  [sp, #28]  @ mx
+ldr r4,  [sp, #64+8+4]  @ mx
 movrel  lr,  subpel_filters-16
-ldr r12, [sp, #24]  @ h
+ldr r12, [sp, #64+8+0]  @ h
 add r4,  lr,  r4, lsl #4
 sub sp,  sp,  #336+16
 vld1.16 {q0}, [r4,:128]
@@ -931,9 +914,9 @@ function ff_put_vp8_epel16_h6v6_neon, export=1
 bne 1b
 
 @ second pass (vertical):
-ldr r4,  [sp, #336+16+32]   @ my
+ldr r4,  [sp, #336+16+64+8+8]   @ my
 movrel  lr,  subpel_filters-16
-ldr r12, [sp, #336+16+24]   @ h
+ldr r12, [sp, #336+16+64+8+0]   @ h
 add r4,  lr,  r4, lsl #4
 add lr,  sp,  #15
 vld1.16 {q0}, [r4,:128]
@@ -941,18 +924,20 @@ function ff_put_vp8_epel16_h6v6_neon, export=1
 2:
 vld1.8  {d2-d5},  [lr,:128]!
 vld1.8  {d6-d9},  [lr,:128]!
-vld1.8  {d28-d31},[lr,:128]
-sub lr,  lr,  #48
+vld1.8  {d10-d13},[lr,:128]!
+vld1.8  {d14-d15},[lr,:128]
+sub lr,  lr,  #64
 
-vp8_epel8_v6 d2, d2, d4, d6, d8, d28, d30
-vp8_epel8_v6 d3, d3, d5, d7, d9, d29, d31
+vp8_epel8_v6_y2 d2,  d4,  d2,  d4,  d6,  d8,  d10, d12, d14
+vp8_epel8_v6_y2 d3,  d5,  d3,  d5,  d7,  d9,  d11, d13, d15
 
 vst1.8  {d2-d3}, [r0,:128], r1
-subs r12, r12, #1
+vst1.8  {d4-d5}, [r0,:128], r1
+subs r12, r12, #2
 bne 2b
 
 add sp,  sp,  #336+16
-vpop {d8-d9}
+vpop {d8-d15}
 pop {r4,pc}
 endfunc
 
-- 
2.7.4
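
The new stack offsets in this patch follow from how much the prologue pushes. A small sketch of that arithmetic, assuming 4-byte stack arguments in the order h, mx, my (as the "@ h", "@ mx" and "@ my" comments in the hunk suggest); stack_arg_offset is a hypothetical helper.

```c
/* Bytes pushed by the prologue before the stack arguments are read. */
enum {
    GPR_PUSH_BYTES = 2 * 4,   /* push {r4,lr}: two 32-bit registers */
    D_REG_BYTES    = 8        /* one d register is 64 bits wide */
};

/* Offset from sp to stack argument number arg (0 = h, 1 = mx, 2 = my)
 * after pushing num_d_regs d registers and {r4,lr}. */
int stack_arg_offset(int num_d_regs, int arg)
{
    return num_d_regs * D_REG_BYTES + GPR_PUSH_BYTES + arg * 4;
}
```

With vpush {d8-d9} (two registers) this gives 24, 28 and 32, the old constants; with vpush {d8-d15} (eight registers) it gives 64+8+0, 64+8+4 and 64+8+8, the new ones.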


[libav-devel] [PATCH 17/19] aarch64: vp8: Optimize put_epel16_h6v6 with vp8_epel8_v6_y2

2019-02-01 Thread Martin Storsjö
This makes it similar to put_epel16_v6, and gives a large speedup
on Cortex A53, a minor speedup on A72 and a very minor slowdown on
A73.

Before: Cortex A53 A72 A73
vp8_put_epel16_h6v6_neon:   2211.4  1586.5  1431.7
After:
vp8_put_epel16_h6v6_neon:   1736.9  1522.0  1448.1
---
 libavcodec/aarch64/vp8dsp_neon.S | 34 ++
 1 file changed, 10 insertions(+), 24 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index 604be8a..139b380 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -769,23 +769,6 @@ endfunc
 sqrshrun2   \d0\().16b, v22.8h, #7
 .endm
 
-.macro  vp8_epel8_v6 d0,  s0,  s1,  s2, s3, s4, s5
-uxtl \s2\().8h, \s2\().8b
-uxtl \s3\().8h, \s3\().8b
-uxtl \s1\().8h, \s1\().8b
-uxtl \s4\().8h, \s4\().8b
-uxtl \s0\().8h, \s0\().8b
-uxtl \s5\().8h, \s5\().8b
-mul \s2\().8h, \s2\().8h, v0.h[2]
-mul \s3\().8h, \s3\().8h, v0.h[3]
-mls \s2\().8h, \s1\().8h, v0.h[1]
-mls \s3\().8h, \s4\().8h, v0.h[4]
-mla \s2\().8h, \s0\().8h, v0.h[0]
-mla \s3\().8h, \s5\().8h, v0.h[5]
-sqadd   \s3\().8h, \s2\().8h, \s3\().8h
-sqrshrun \d0\().8b, \s3\().8h, #7
-.endm
-
 .macro  vp8_epel8_v6_y2 d0, d1, s0, s1, s2, s3, s4, s5, s6
 uxtl \s0\().8h, \s0\().8b
 uxtl \s3\().8h, \s3\().8b
@@ -942,15 +925,18 @@ function ff_put_vp8_epel16_h6v6_neon, export=1
 2:
 ld1 {v1.8b - v4.8b},[x7], #32
 ld1 {v16.8b - v19.8b},  [x7], #32
-ld1 {v20.8b - v23.8b},  [x7]
-sub x7,  x7,  #48
+ld1 {v20.8b - v23.8b},  [x7], #32
+ld1 {v24.8b - v25.8b},  [x7]
+sub x7,  x7,  #64
 
-vp8_epel8_v6 v5, v1, v3, v16, v18, v20, v22
-vp8_epel8_v6 v2, v2, v4, v17, v19, v21, v23
-trn1 v2.2d, v5.2d, v2.2d
+vp8_epel8_v6_y2 v1, v3, v1, v3, v16, v18, v20, v22, v24
+vp8_epel8_v6_y2 v2, v4, v2, v4, v17, v19, v21, v23, v25
+trn1 v1.2d, v1.2d, v2.2d
+trn1 v3.2d, v3.2d, v4.2d
 
-st1 {v2.16b}, [x0], x1
-subs x4, x4, #1
+st1 {v1.16b}, [x0], x1
+st1 {v3.16b}, [x0], x1
+subs x4, x4, #2
 b.ne 2b
 
 add sp,  sp,  #336+16
-- 
2.7.4
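
To put numbers on "large", "minor" and "very minor", the relative change in cycle count can be computed from the table above. A tiny sketch; cycle_reduction_percent is a hypothetical helper, not something from the tree.

```c
/* Relative cycle reduction in percent; negative means a slowdown. */
double cycle_reduction_percent(double before, double after)
{
    return (before - after) / before * 100.0;
}
```

Fed with the figures from the commit message, this gives roughly 21.5% fewer cycles on the A53, about 4.1% fewer on the A72, and about 1.1% more on the A73.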


[libav-devel] [PATCH 14/19] aarch64: vp8: Port epel4 functions from arm version

2019-02-01 Thread Martin Storsjö
                       Cortex A53     A72     A73
vp8_put_epel4_h4_c:         631.4   291.7   367.8
vp8_put_epel4_h4_neon:      241.0   131.0   155.7
vp8_put_epel4_h4v4_c:       967.5   529.3   667.7
vp8_put_epel4_h4v4_neon:    429.3   241.8   279.7
vp8_put_epel4_h4v6_c:      1374.7   657.5   864.5
vp8_put_epel4_h4v6_neon:    515.5   295.5   334.7
vp8_put_epel4_h6_c:         851.0   421.0   486.0
vp8_put_epel4_h6_neon:      321.5   195.0   217.7
vp8_put_epel4_h6v4_c:          .3   621.1   781.2
vp8_put_epel4_h6v4_neon:    539.2   328.0   365.3
vp8_put_epel4_h6v6_c:      1561.3   763.3   999.7
vp8_put_epel4_h6v6_neon:    645.5   401.0   434.7
vp8_put_epel4_v4_c:         663.8   298.3   357.0
vp8_put_epel4_v4_neon:      116.0    81.5    72.5
vp8_put_epel4_v6_c:         870.5   437.0   507.4
vp8_put_epel4_v6_neon:      147.7   108.8    92.0
---
 libavcodec/aarch64/vp8dsp_init_aarch64.c |  10 ++
 libavcodec/aarch64/vp8dsp_neon.S | 284 +++
 2 files changed, 294 insertions(+)

diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c b/libavcodec/aarch64/vp8dsp_init_aarch64.c
index 1878d8e..478f849 100644
--- a/libavcodec/aarch64/vp8dsp_init_aarch64.c
+++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c
@@ -34,6 +34,7 @@ VP8_LF(neon);
 
 VP8_EPEL(16, neon);
 VP8_EPEL(8,  neon);
+VP8_EPEL(4,  neon);
 
 
 av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp)
@@ -55,6 +56,15 @@ av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp)
 dsp->put_vp8_epel_pixels_tab[1][2][0] = ff_put_vp8_epel8_v6_neon;
 dsp->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_neon;
 dsp->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_neon;
+
+dsp->put_vp8_epel_pixels_tab[2][0][1] = ff_put_vp8_epel4_h4_neon;
+dsp->put_vp8_epel_pixels_tab[2][0][2] = ff_put_vp8_epel4_h6_neon;
+dsp->put_vp8_epel_pixels_tab[2][1][0] = ff_put_vp8_epel4_v4_neon;
+dsp->put_vp8_epel_pixels_tab[2][1][1] = ff_put_vp8_epel4_h4v4_neon;
+dsp->put_vp8_epel_pixels_tab[2][1][2] = ff_put_vp8_epel4_h6v4_neon;
+dsp->put_vp8_epel_pixels_tab[2][2][0] = ff_put_vp8_epel4_v6_neon;
+dsp->put_vp8_epel_pixels_tab[2][2][1] = ff_put_vp8_epel4_h4v6_neon;
+dsp->put_vp8_epel_pixels_tab[2][2][2] = ff_put_vp8_epel4_h6v6_neon;
 }
 
 av_cold void ff_vp8dsp_init_aarch64(VP8DSPContext *dsp)
diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index c5badc4..7fe2466 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -1225,3 +1225,287 @@ function ff_put_vp8_epel8_h6v4_neon, export=1
 add sp,  sp,  #168+16
 ret
 endfunc
+
+function ff_put_vp8_epel4_v6_neon, export=1
+sub x2,  x2,  x3,  lsl #1
+
+movrel  x7,  subpel_filters, -16
+add x6,  x7,  w6, uxtw #4
+ld1 {v0.8h},[x6]
+1:
+ld1r {v2.2s}, [x2], x3
+ld1r {v3.2s}, [x2], x3
+ld1r {v4.2s}, [x2], x3
+ld1r {v5.2s}, [x2], x3
+ld1r {v6.2s}, [x2], x3
+ld1r {v7.2s}, [x2], x3
+ld1r {v28.2s},   [x2]
+sub x2,  x2,  x3,  lsl #2
+ld1 {v2.s}[1],  [x2], x3
+ld1 {v3.s}[1],  [x2], x3
+ld1 {v4.s}[1],  [x2], x3
+ld1 {v5.s}[1],  [x2], x3
+ld1 {v6.s}[1],  [x2], x3
+ld1 {v7.s}[1],  [x2], x3
+ld1 {v28.s}[1], [x2]
+sub x2,  x2,  x3,  lsl #2
+
+vp8_epel8_v6_y2 v2, v3, v2, v3, v4, v5, v6, v7, v28
+
+st1 {v2.s}[0],  [x0], x1
+st1 {v3.s}[0],  [x0], x1
+st1 {v2.s}[1],  [x0], x1
+st1 {v3.s}[1],  [x0], x1
+subs w4,  w4,  #4
+b.ne 1b
+
+ret
+endfunc
+
+function ff_put_vp8_epel4_h6_neon, export=1
+sub x2,  x2,  #2
+
+movrel  x7,  subpel_filters, -16
+add x5,  x7,  w5, uxtw #4
+ld1 {v0.8h},   [x5]
+1:
+ld1 {v2.8b,v3.8b}, [x2], x3
+vp8_epel8_h6 v2,  v2,  v3
+st1 {v2.s}[0], [x0], x1
+subs w4,  w4,  #1
+b.ne 1b
+
+ret
+endfunc
+
+function ff_put_vp8_epel4_h6v6_neon, export=1
+sub x2,  x2,  x3,  lsl #1
+sub x2,  x2,  #2
+
+movrel  x7,  subpel_filters, -16
+add x5,  x7,  w5, uxtw #4
+ld1 {v0.8h},   [x5]
+
+sub sp,  sp,  #52
+add w8,  w4,  #5
+mov x9,  sp
+1:
+ld1 {v2.8b,v3.8b}, [x2], x3
+vp8_epel8_h6 v2,  v2,  v3
+st1 {v2.s}[0], [x9], #4
+subs w8,  w8,  #1
+b.ne 1b
+
+add x6, 

[libav-devel] [PATCH 07/19] vp8dsp: Move the aarch64 dsp init call into alphabetical order

2019-02-01 Thread Martin Storsjö
---
 libavcodec/vp8dsp.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/vp8dsp.c b/libavcodec/vp8dsp.c
index 3c8d1c8..ac9a6af 100644
--- a/libavcodec/vp8dsp.c
+++ b/libavcodec/vp8dsp.c
@@ -679,14 +679,14 @@ av_cold void ff_vp78dsp_init(VP8DSPContext *dsp)
 VP78_BILINEAR_MC_FUNC(1, 8);
 VP78_BILINEAR_MC_FUNC(2, 4);
 
+if (ARCH_AARCH64)
+ff_vp78dsp_init_aarch64(dsp);
 if (ARCH_ARM)
 ff_vp78dsp_init_arm(dsp);
 if (ARCH_PPC)
 ff_vp78dsp_init_ppc(dsp);
 if (ARCH_X86)
 ff_vp78dsp_init_x86(dsp);
-if (ARCH_AARCH64)
-ff_vp78dsp_init_aarch64(dsp);
 }
 
 #if CONFIG_VP7_DECODER
@@ -741,11 +741,11 @@ av_cold void ff_vp8dsp_init(VP8DSPContext *dsp)
 dsp->vp8_v_loop_filter_simple = vp8_v_loop_filter_simple_c;
 dsp->vp8_h_loop_filter_simple = vp8_h_loop_filter_simple_c;
 
+if (ARCH_AARCH64)
+ff_vp8dsp_init_aarch64(dsp);
 if (ARCH_ARM)
 ff_vp8dsp_init_arm(dsp);
 if (ARCH_X86)
 ff_vp8dsp_init_x86(dsp);
-if (ARCH_AARCH64)
-ff_vp8dsp_init_aarch64(dsp);
 }
 #endif /* CONFIG_VP8_DECODER */
-- 
2.7.4


[libav-devel] [PATCH 10/19] aarch64: vp8: Reorder the function pointer inits to match the arm original

2019-02-01 Thread Martin Storsjö
---
 libavcodec/aarch64/vp8dsp_init_aarch64.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c b/libavcodec/aarch64/vp8dsp_init_aarch64.c
index 3fb254a..da54efd 100644
--- a/libavcodec/aarch64/vp8dsp_init_aarch64.c
+++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c
@@ -46,10 +46,10 @@ av_cold void ff_vp78dsp_init_aarch64(VP8DSPContext *dsp)
 dsp->put_vp8_epel_pixels_tab[0][2][2] = ff_put_vp8_epel16_h6v6_neon;
 
 dsp->put_vp8_epel_pixels_tab[1][0][0] = ff_put_vp8_pixels8_neon;
-dsp->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_neon;
-dsp->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_neon;
-dsp->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_neon;
 dsp->put_vp8_epel_pixels_tab[1][1][1] = ff_put_vp8_epel8_h4v4_neon;
+dsp->put_vp8_epel_pixels_tab[1][1][2] = ff_put_vp8_epel8_h6v4_neon;
+dsp->put_vp8_epel_pixels_tab[1][2][1] = ff_put_vp8_epel8_h4v6_neon;
+dsp->put_vp8_epel_pixels_tab[1][2][2] = ff_put_vp8_epel8_h6v6_neon;
 }
 
 av_cold void ff_vp8dsp_init_aarch64(VP8DSPContext *dsp)
@@ -62,8 +62,8 @@ av_cold void ff_vp8dsp_init_aarch64(VP8DSPContext *dsp)
 dsp->vp8_idct_dc_add= ff_vp8_idct_dc_add_neon;
 dsp->vp8_idct_dc_add4y  = ff_vp8_idct_dc_add4y_neon;
 
-dsp->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_neon;
 dsp->vp8_v_loop_filter16y = ff_vp8_v_loop_filter16_neon;
+dsp->vp8_h_loop_filter16y = ff_vp8_h_loop_filter16_neon;
 dsp->vp8_v_loop_filter8uv = ff_vp8_v_loop_filter8uv_neon;
 dsp->vp8_h_loop_filter8uv = ff_vp8_h_loop_filter8uv_neon;
 
-- 
2.7.4


[libav-devel] [PATCH 05/19] aarch64: vp8: Fix linking for iOS

2019-02-01 Thread Martin Storsjö
The mach-o relocations don't allow a negative offset to a symbol;
use the third movrel parameter to handle this issue transparently.
---
 libavcodec/aarch64/vp8dsp_neon.S | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index 14a9d11..eb22c42 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -759,7 +759,7 @@ function ff_put_vp8_epel16_v6_neon, export=1
 
 sxtw x4,  w4
 sxtw x6,  w6
-movrel  x17,  subpel_filters-16
+movrel  x17,  subpel_filters, -16
 add x6,  x17,  x6, lsl #4  // y
 ld1 {v0.8h}, [x6]
 1:
@@ -788,7 +788,7 @@ function ff_put_vp8_epel16_h6_neon, export=1
 sxtw x5,  w5 // x
 
 // first pass (horizontal):
-movrel  x17,  subpel_filters-16
+movrel  x17,  subpel_filters, -16
 add x5,  x17,  x5, lsl #4 // x
 ld1 {v0.8h},  [x5]
 1:
@@ -807,7 +807,7 @@ function ff_put_vp8_epel16_h6v6_neon, export=1
 sub x2,  x2,  #2
 
 // first pass (horizontal):
-movrel  x17,  subpel_filters-16
+movrel  x17,  subpel_filters, -16
 sxtw x5,  w5 // x
 add x16,  x17,  x5, lsl #4 // x
 sub sp,  sp,  #336+16
@@ -854,7 +854,7 @@ function ff_put_vp8_epel8_h6v6_neon, export=1
 sxtw x4,  w4
 
 // first pass (horizontal):
-movrel  x17,  subpel_filters-16
+movrel  x17,  subpel_filters, -16
 sxtw x5,  w5
 add x5,  x17,  x5, lsl #4 // x
 sub sp,  sp,  #168+16
@@ -900,7 +900,7 @@ function ff_put_vp8_epel8_h4v6_neon, export=1
 sxtw x4,  w4
 
 // first pass (horizontal):
-movrel  x17,  subpel_filters-16
+movrel  x17,  subpel_filters, -16
 sxtw x5,  w5
 add x5,  x17,  x5, lsl #4 // x
 sub sp,  sp,  #168+16
@@ -947,7 +947,7 @@ function ff_put_vp8_epel8_h4v4_neon, export=1
 
 
 // first pass (horizontal):
-movrel  x17,  subpel_filters-16
+movrel  x17,  subpel_filters, -16
 sxtw x5,  w5
 add x5,  x17,  x5, lsl #4 // x
 sub sp,  sp,  #168+16
@@ -992,7 +992,7 @@ function ff_put_vp8_epel8_h6v4_neon, export=1
 
 
 // first pass (horizontal):
-movrel  x17,  subpel_filters-16
+movrel  x17,  subpel_filters, -16
 sxtw x5,  w5
 add x5,  x17,  x5, lsl #4 // x
 sub sp,  sp,  #168+16
-- 
2.7.4


[libav-devel] [PATCH 02/19] aarch64: vp8: Fix the include guard

2019-02-01 Thread Martin Storsjö
From: Carl Eugen Hoyos 

---
 libavcodec/aarch64/vp8dsp.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp.h b/libavcodec/aarch64/vp8dsp.h
index 8a0c8fb..40d0cae 100644
--- a/libavcodec/aarch64/vp8dsp.h
+++ b/libavcodec/aarch64/vp8dsp.h
@@ -16,8 +16,8 @@
  * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
  */
 
-#ifndef AVCODEC_ARM_VP8DSP_H
-#define AVCODEC_ARM_VP8DSP_H
+#ifndef AVCODEC_AARCH64_VP8DSP_H
+#define AVCODEC_AARCH64_VP8DSP_H
 
 #include "libavcodec/vp8dsp.h"
 
@@ -67,4 +67,4 @@
 VP8_MC(epel ## w ## _h4v6, opt);\
 VP8_MC(epel ## w ## _h6v6, opt)
 
-#endif /* AVCODEC_ARM_VP8DSP_H */
+#endif /* AVCODEC_AARCH64_VP8DSP_H */
-- 
2.7.4


[libav-devel] [PATCH 09/19] aarch64: vp8: Move the vp8dsp makefile entries to the right places

2019-02-01 Thread Martin Storsjö
Even if NEON would be disabled, the init functions should be built
as they are called as long as ARCH_AARCH64 is set.

These functions are part of a generic DSP subsystem, not tied directly
to one decoder. (They should be built if the vp7 decoder is enabled,
even if the vp8 decoder is disabled.)
---
 libavcodec/aarch64/Makefile | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/libavcodec/aarch64/Makefile b/libavcodec/aarch64/Makefile
index 2555044..7228eae 100644
--- a/libavcodec/aarch64/Makefile
+++ b/libavcodec/aarch64/Makefile
@@ -11,6 +11,7 @@ OBJS-$(CONFIG_MDCT) += aarch64/mdct_init.o
 OBJS-$(CONFIG_MPEGAUDIODSP) += aarch64/mpegaudiodsp_init.o
 OBJS-$(CONFIG_NEON_CLOBBER_TEST)+= aarch64/neontest.o
 OBJS-$(CONFIG_VIDEODSP) += aarch64/videodsp_init.o
+OBJS-$(CONFIG_VP8DSP)   += aarch64/vp8dsp_init_aarch64.o
 
 # decoders/encoders
 OBJS-$(CONFIG_DCA_DECODER)  += aarch64/dcadsp_init.o
@@ -39,13 +40,12 @@ NEON-OBJS-$(CONFIG_HPELDSP) += aarch64/hpeldsp_neon.o
 NEON-OBJS-$(CONFIG_IMDCT15) += aarch64/imdct15_neon.o
 NEON-OBJS-$(CONFIG_MDCT)+= aarch64/mdct_neon.o
 NEON-OBJS-$(CONFIG_MPEGAUDIODSP)+= aarch64/mpegaudiodsp_neon.o
+NEON-OBJS-$(CONFIG_VP8DSP)  += aarch64/vp8dsp_neon.o
 
 # decoders/encoders
 NEON-OBJS-$(CONFIG_DCA_DECODER)     += aarch64/dcadsp_neon.o            \
                                        aarch64/synth_filter_neon.o
 NEON-OBJS-$(CONFIG_VORBIS_DECODER)  += aarch64/vorbisdsp_neon.o
-NEON-OBJS-$(CONFIG_VP8DSP)          += aarch64/vp8dsp_init_aarch64.o    \
-                                       aarch64/vp8dsp_neon.o
 NEON-OBJS-$(CONFIG_VP9_DECODER)     += aarch64/vp9itxfm_neon.o          \
                                        aarch64/vp9lpf_neon.o            \
                                        aarch64/vp9mc_neon.o
-- 
2.7.4


[libav-devel] [PATCH 06/19] aarch64: vp8: Use the proper aarch64 form for conditional branches

2019-02-01 Thread Martin Storsjö
The previous form does seem to assemble with current tools,
but I think it might fail with some older aarch64 tools.
---
 libavcodec/aarch64/vp8dsp_neon.S | 28 ++--
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp_neon.S b/libavcodec/aarch64/vp8dsp_neon.S
index eb22c42..c19ab0d 100644
--- a/libavcodec/aarch64/vp8dsp_neon.S
+++ b/libavcodec/aarch64/vp8dsp_neon.S
@@ -581,7 +581,7 @@ function ff_put_vp8_pixels16_neon, export=1
 st1 {v1.16b}, [x0], x1
 st1 {v2.16b}, [x0], x1
 st1 {v3.16b}, [x0], x1
-bgt 1b
+b.gt 1b
 ret
 endfunc
 
@@ -596,7 +596,7 @@ function ff_put_vp8_pixels8_neon, export=1
 st1 {v0.d}[1], [x0], x1
 st1 {v1.8b},   [x0], x1
 st1 {v1.d}[1], [x0], x1
-bgt 1b
+b.gt 1b
 ret
 endfunc
 
@@ -778,7 +778,7 @@ function ff_put_vp8_epel16_v6_neon, export=1
 st1 {v1.1d - v2.1d}, [x0], x1
 st1 {v3.1d - v4.1d}, [x0], x1
 subs x4, x4, #2
-bne 1b
+b.ne 1b
 
 ret
 endfunc
@@ -797,7 +797,7 @@ function ff_put_vp8_epel16_h6_neon, export=1
 st1 {v1.16b}, [x0], x1
 
 subs w4, w4, #1
-bne 1b
+b.ne 1b
 ret
 endfunc
 
@@ -821,7 +821,7 @@ function ff_put_vp8_epel16_h6v6_neon, export=1
 vp8_epel16_h6   v1, v1, v2
 st1 {v1.16b}, [x7], #16
 subs x16, x16, #1
-bne 1b
+b.ne 1b
 
 
 // second pass (vertical):
@@ -842,7 +842,7 @@ function ff_put_vp8_epel16_h6v6_neon, export=1
 
 st1 {v2.16b}, [x0], x1
 subs x4, x4, #1
-bne 2b
+b.ne 2b
 
 add sp,  sp,  #336+16
 ret
@@ -869,7 +869,7 @@ function ff_put_vp8_epel8_h6v6_neon, export=1
 
 st1 {v1.8b}, [x7], #8
 subs x16, x16, #1
-bne 1b
+b.ne 1b
 
 // second pass (vertical):
 sxtw x6,  w6
@@ -888,7 +888,7 @@ function ff_put_vp8_epel8_h6v6_neon, export=1
 st1 {v1.8b}, [x0], x1
 st1 {v2.8b}, [x0], x1
 subs x4, x4, #2
-bne 2b
+b.ne 2b
 
 add sp,  sp,  #168+16
 ret
@@ -915,7 +915,7 @@ function ff_put_vp8_epel8_h4v6_neon, export=1
 
 st1 {v1.8b}, [x7], #8
 subs x16, x16, #1
-bne 1b
+b.ne 1b
 
 // second pass (vertical):
 sxtw x6,  w6
@@ -934,7 +934,7 @@ function ff_put_vp8_epel8_h4v6_neon, export=1
 st1 {v1.8b}, [x0], x1
 st1 {v2.8b}, [x0], x1
 subs x4, x4, #2
-bne 2b
+b.ne 2b
 
 add sp,  sp,  #168+16
 ret
@@ -962,7 +962,7 @@ function ff_put_vp8_epel8_h4v4_neon, export=1
 
 st1 {v1.8b}, [x7], #8
 subs x16, x16, #1
-bne 1b
+b.ne 1b
 
 // second pass (vertical):
 sxtw x6,  w6
@@ -979,7 +979,7 @@ function ff_put_vp8_epel8_h4v4_neon, export=1
 st1 {v1.d}[0], [x0], x1
 st1 {v1.d}[1], [x0], x1
 subs x4, x4, #2
-bne 2b
+b.ne 2b
 
 add sp,  sp,  #168+16
 ret
@@ -1007,7 +1007,7 @@ function ff_put_vp8_epel8_h6v4_neon, export=1
 
 st1 {v1.8b}, [x7], #8
 subs x16, x16, #1
-bne 1b
+b.ne 1b
 
 // second pass (vertical):
 sxtw x6,  w6
@@ -1024,7 +1024,7 @@ function ff_put_vp8_epel8_h6v4_neon, export=1
 st1 {v1.d}[0], [x0], x1
 st1 {v1.d}[1], [x0], x1
 subs x4, x4, #2
-bne 2b
+b.ne 2b
 
 add sp,  sp,  #168+16
 ret
-- 
2.7.4


[libav-devel] [PATCH 08/19] aarch64: vp8: Remove superfluous includes

2019-02-01 Thread Martin Storsjö
---
 libavcodec/aarch64/vp8dsp_init_aarch64.c | 4 
 1 file changed, 4 deletions(-)

diff --git a/libavcodec/aarch64/vp8dsp_init_aarch64.c b/libavcodec/aarch64/vp8dsp_init_aarch64.c
index f93bcfa..3fb254a 100644
--- a/libavcodec/aarch64/vp8dsp_init_aarch64.c
+++ b/libavcodec/aarch64/vp8dsp_init_aarch64.c
@@ -17,10 +17,6 @@
  */
 
 #include 
-#include 
-#include 
-#include 
-#include 
 
 #include "libavutil/attributes.h"
 #include "libavutil/aarch64/cpu.h"
-- 
2.7.4


Re: [libav-devel] [PATCH] avio: Do not flush the buffer if a constant packet size is requested

2019-01-31 Thread Martin Storsjö

On Thu, 31 Jan 2019, Luca Barbato wrote:


---
libavformat/aviobuf.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavformat/aviobuf.c b/libavformat/aviobuf.c
index 98e35f776c..3c882d6bdb 100644
--- a/libavformat/aviobuf.c
+++ b/libavformat/aviobuf.c
@@ -244,7 +244,8 @@ void avio_write(AVIOContext *s, const unsigned char *buf, int size)

void avio_flush(AVIOContext *s)
{
-flush_buffer(s);
+if (!s->max_packet_size || s->buf_ptr - s->buffer >= s->max_packet_size)
+flush_buffer(s);
s->must_flush = 0;
}

--
2.12.2


You're not providing any explanation of why we should do this. And I'm 
fairly sure that this patch breaks the RTP muxer when sending over plain 
UDP.


// Martin
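
For reference, the condition the quoted patch adds can be looked at in isolation. The struct below is a minimal stand-in with only the three fields the condition reads; it is not the real AVIOContext, and would_flush is a hypothetical helper.

```c
#include <stddef.h>

/* Minimal stand-in for the fields the patch touches. */
struct io_buf {
    unsigned char *buffer;   /* start of the buffer */
    unsigned char *buf_ptr;  /* current write position */
    int max_packet_size;     /* 0 means no packet size constraint */
};

/* Returns 1 if the patched avio_flush() would actually flush. */
int would_flush(const struct io_buf *s)
{
    return !s->max_packet_size ||
           s->buf_ptr - s->buffer >= s->max_packet_size;
}
```

With a packet size set, a partially filled buffer is no longer written out by an explicit flush. That is presumably the behavior change the reply is worried about for muxers that rely on a flush to end a packet.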


Re: [libav-devel] [PATCH 1/1] h264/x86: sign extend int stride in deblock functions

2019-01-27 Thread Martin Storsjö

On Sun, 27 Jan 2019, Janne Grunau wrote:


Fixes checkasm errors after adding the h264 deblock tests.
---
libavcodec/x86/h264_deblock.asm   | 8 
libavcodec/x86/h264_deblock_10bit.asm | 9 +
2 files changed, 17 insertions(+)


Ok with me.

Yes, changing the prototypes to use ptrdiff_t instead of int would be 
good, but I think it's better to get tests back to green instead of 
blocking the fix by demanding the larger refactoring right now.


// Martin
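
As background on why an int stride needs sign extension at all on a 64-bit target: a negative 32-bit value used directly as a 64-bit offset turns into a huge positive one. The sketch models the case where the upper 32 bits happen to be zero; in general they are simply unspecified. The function names are made up for illustration.

```c
#include <stdint.h>

/* A 32-bit stride used as a 64-bit offset without sign extension,
 * modelled with the upper 32 bits being zero. */
uint64_t zero_extended(int stride)
{
    return (uint64_t)(uint32_t)stride;
}

/* Sign extension, which is what a ptrdiff_t parameter (or an explicit
 * sign-extending move in the asm) provides. */
int64_t sign_extended(int stride)
{
    return (int64_t)stride;
}
```

A stride of -16 becomes 0xFFFFFFF0 in the first case and stays -16 in the second, so only the second is usable for address arithmetic.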


Re: [libav-devel] [PATCH 2/4] checkasm/h264: add loop filter tests

2019-01-26 Thread Martin Storsjö

On Tue, 1 Jan 2019, Janne Grunau wrote:


---
tests/checkasm/h264dsp.c | 124 +++
1 file changed, 124 insertions(+)


This newly added test seems to fail on macOS. I haven't debugged through 
it properly yet, but disabling the use of checkasm_checked_call seems to 
make it pass.


// Martin


Re: [libav-devel] [PATCH] libopenh264dec: Use a newer decoding entry point function

2019-01-26 Thread Martin Storsjö

On Sat, 26 Jan 2019, Janne Grunau wrote:


On 2019-01-25 10:39:13 +0200, Martin Storsjö wrote:

The "new" entry point actually has existed since OpenH264 1.4 in
2015, but with B-frames, this entry point is essential for actually
getting the right frames returned and reordered.

The name of this function, DecodeFrameNoDelay, is rather backwards
considering that it doesn't return the latest decoded frame immediately,
but actually does proper delaying and reordering of frames, but
it's the recommended decoding entry point.


The commit message is hard to parse. Something along below is imho
easier to understand:

| The "new" entry point actually has existed since OpenH264 1.4 in
| 2015 and is the recommended decoding entry point.
|
| The name of this function, DecodeFrameNoDelay, is rather backwards
| considering that it doesn't return the latest decoded frame immediately,
| but actually does proper delaying and reordering of frames.


Thanks! That's indeed much more understandable.

// Martin

[libav-devel] [PATCH] libopenh264dec: Use a newer decoding entry point function

2019-01-25 Thread Martin Storsjö
The "new" entry point actually has existed since OpenH264 1.4 in
2015, but with B-frames, this entry point is essential for actually
getting the right frames returned and reordered.

The name of this function, DecodeFrameNoDelay, is rather backwards
considering that it doesn't return the latest decoded frame immediately,
but actually does proper delaying and reordering of frames, but
it's the recommended decoding entry point.
---
 libavcodec/libopenh264dec.c | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/libavcodec/libopenh264dec.c b/libavcodec/libopenh264dec.c
index 60e4b028ec..6adf984112 100644
--- a/libavcodec/libopenh264dec.c
+++ b/libavcodec/libopenh264dec.c
@@ -109,10 +109,18 @@ static int svc_decode_frame(AVCodecContext *avctx, void *data,
 #endif
 } else {
 info.uiInBsTimeStamp = avpkt->pts;
+#if OPENH264_VER_AT_LEAST(1, 4)
+// Contrary to the name, DecodeFrameNoDelay actually does buffering
+// and reordering of frames, and is the recommended decoding entry
+// point since 1.4. This is essential for successfully decoding
+// B-frames.
+state = (*s->decoder)->DecodeFrameNoDelay(s->decoder, avpkt->data, avpkt->size, ptrs, &info);
+#else
 state = (*s->decoder)->DecodeFrame2(s->decoder, avpkt->data, avpkt->size, ptrs, &info);
+#endif
 }
 if (state != dsErrorFree) {
-av_log(avctx, AV_LOG_ERROR, "DecodeFrame2 failed\n");
+av_log(avctx, AV_LOG_ERROR, "DecodeFrame failed\n");
 return AVERROR_UNKNOWN;
 }
 if (info.iBufferStatus != 1) {
-- 
2.17.2 (Apple Git-113)


[libav-devel] [PATCH] arm: Create proper .rdata sections for COFF

2019-01-11 Thread Martin Storsjö
As .rodata isn't one of the default created sections for COFF, it was
created as a read-write data section. By using the default .rdata
section name for COFF, it automatically becomes a read-only data section.
The existing ".section .rodata" works as intended for ELF though.

This is based on an original patch and diagnosis by Tom Tan.
---
 libavutil/aarch64/asm.S | 2 ++
 libavutil/arm/asm.S | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/libavutil/aarch64/asm.S b/libavutil/aarch64/asm.S
index 15b55d57d2..bf5c1b7ee1 100644
--- a/libavutil/aarch64/asm.S
+++ b/libavutil/aarch64/asm.S
@@ -63,6 +63,8 @@ ELF .size   \name, . - \name
 .else
 .section.rodata
 .endif
+#elif defined(_WIN32)
+.section.rdata
 #elif !defined(__MACH__)
 .section.rodata
 #else
diff --git a/libavutil/arm/asm.S b/libavutil/arm/asm.S
index 62ce493180..9842d03bc0 100644
--- a/libavutil/arm/asm.S
+++ b/libavutil/arm/asm.S
@@ -125,6 +125,8 @@ ELF .size   \name, . - \name
 .else
 .section.rodata
 .endif
+#elif defined(_WIN32)
+.section.rdata
 #elif !defined(__MACH__)
 .section.rodata
 #else
-- 
2.17.2 (Apple Git-113)


[libav-devel] [PATCH] arm: Mark .rodata section as read only in COFF object file

2019-01-10 Thread Martin Storsjö
From: Tom Tan 

.rodata directive from GAS assembly produces .rodata as read/write for COFF
object file by default (object file format for Windows), but read only for
ELF. This change marks it as read only explicitly for COFF.

Signed-off-by: Martin Storsjö 
---
 libavutil/aarch64/asm.S | 2 ++
 libavutil/arm/asm.S | 2 ++
 2 files changed, 4 insertions(+)

diff --git a/libavutil/aarch64/asm.S b/libavutil/aarch64/asm.S
index 15b55d57d2..65341d58cd 100644
--- a/libavutil/aarch64/asm.S
+++ b/libavutil/aarch64/asm.S
@@ -63,6 +63,8 @@ ELF .size   \name, . - \name
 .else
 .section.rodata
 .endif
+#elif defined(_WIN32)
+.section.rodata, "r"
 #elif !defined(__MACH__)
 .section.rodata
 #else
diff --git a/libavutil/arm/asm.S b/libavutil/arm/asm.S
index 62ce493180..06c3413489 100644
--- a/libavutil/arm/asm.S
+++ b/libavutil/arm/asm.S
@@ -125,6 +125,8 @@ ELF .size   \name, . - \name
 .else
 .section.rodata
 .endif
+#elif defined(_WIN32)
+.section.rodata, "r"
 #elif !defined(__MACH__)
 .section.rodata
 #else
-- 
2.17.2 (Apple Git-113)


Re: [libav-devel] Using Co-authored-by instead of Signed-off-by

2019-01-09 Thread Martin Storsjö

On Wed, 9 Jan 2019, Luca Barbato wrote:

Since the start of the project we used Signed-off-by to signal that a 
patch had been edited.


I'd like to point out that you might have had this interpretation of it 
and used it in this way, but it hasn't been a written project wide rule 
that this is the intended interpretation in this context. I've used it as 
a general "I approve of"-mark.



Currently git (and github/gitlab) has support for `Co-authored-by:`.

It isn't as nice as `Signed-off-by:` since there isn't an easy shorthand 
such as `-s` that I know of, but it could be nice to use.


That sounds like a much better thing to use, especially as Signed-off-by 
has different interpretations in different projects.


// Martin

Re: [libav-devel] [PATCH 4/4] h264/aarch64: add intra loop filter neon asm

2019-01-02 Thread Martin Storsjö

On Tue, 1 Jan 2019, Janne Grunau wrote:


Add my neon asm from x264 relicensed under the LGPL 2.1 or later. Ported
(x264 uses nv12 chroma) and optimized.

Cycle count for checkasm --bench on a Snapdragon 820e:
h264_h_loop_filter_luma_intra_8bpp_c: 60.0
h264_h_loop_filter_luma_intra_8bpp_neon: 54.2
h264_v_loop_filter_luma_intra_8bpp_c: 148.3
h264_v_loop_filter_luma_intra_8bpp_neon: 73.8
h264_h_loop_filter_chroma_intra_8bpp_c: 27.8
h264_h_loop_filter_chroma_intra_8bpp_neon: 21.4
h264_h_loop_filter_chroma_mbaff_intra_8bpp_c: 15.8
h264_h_loop_filter_chroma_mbaff_intra_8bpp_neon: 15.7
h264_v_loop_filter_chroma_intra_8bpp_c: 45.8
h264_v_loop_filter_chroma_intra_8bpp_neon: 17.3
---
libavcodec/aarch64/h264dsp_init_aarch64.c |  16 ++
libavcodec/aarch64/h264dsp_neon.S | 297 ++
2 files changed, 313 insertions(+)


LGTM

// Martin

Re: [libav-devel] [PATCH 3/4] h264/aarch64: optimize neon loop filter

2019-01-02 Thread Martin Storsjö

On Tue, 1 Jan 2019, Janne Grunau wrote:


Exit as soon as possible if no filtering will be done.

Improves the checkasm --bench cycle count on a Snapdragon 820e:
h264_h_loop_filter_luma_8bpp_c:  72.4 ->  72.5
h264_h_loop_filter_luma_8bpp_neon:   97.1 ->  56.3
h264_v_loop_filter_luma_8bpp_c: 174.0 -> 173.5
h264_v_loop_filter_luma_8bpp_neon:   62.9 ->  60.9
h264_h_loop_filter_chroma_8bpp_c:30.2 ->  30.3
h264_h_loop_filter_chroma_8bpp_neon: 51.6 ->  25.7
h264_v_loop_filter_chroma_8bpp_c:57.3 ->  57.3
h264_v_loop_filter_chroma_8bpp_neon: 28.0 ->  24.0
---
libavcodec/aarch64/h264dsp_neon.S | 33 ++-
1 file changed, 19 insertions(+), 14 deletions(-)


LGTM

// Martin

Re: [libav-devel] [PATCH 2/4] checkasm/h264: add loop filter tests

2019-01-02 Thread Martin Storsjö

On Tue, 1 Jan 2019, Janne Grunau wrote:


---
tests/checkasm/h264dsp.c | 124 +++
1 file changed, 124 insertions(+)


Looks ok to me

// Martin

Re: [libav-devel] [PATCH 1/4] h264/aarch64: sign extend int stride in loop filter asm

2019-01-02 Thread Martin Storsjö

On Tue, 1 Jan 2019, Janne Grunau wrote:


---
libavcodec/aarch64/h264dsp_neon.S | 3 +++
1 file changed, 3 insertions(+)

diff --git a/libavcodec/aarch64/h264dsp_neon.S 
b/libavcodec/aarch64/h264dsp_neon.S
index 9b4610a4d4..60ffa24500 100644
--- a/libavcodec/aarch64/h264dsp_neon.S
+++ b/libavcodec/aarch64/h264dsp_neon.S
@@ -130,6 +130,7 @@ endfunc

function ff_h264_h_loop_filter_luma_neon, export=1
h264_loop_filter_start
+sxtw x1,  w1

sub x0,  x0,  #4
ld1 {v6.8B},  [x0], x1
@@ -210,6 +211,7 @@ endfunc

function ff_h264_v_loop_filter_chroma_neon, export=1
h264_loop_filter_start
+sxtw x1,  w1

sub x0,  x0,  x1, lsl #1
ld1 {v18.8B}, [x0], x1
@@ -228,6 +230,7 @@ endfunc

function ff_h264_h_loop_filter_chroma_neon, export=1
h264_loop_filter_start
+sxtw x1,  w1

sub x0,  x0,  #2
ld1 {v18.S}[0], [x0], x1
--
2.20.1


LGTM

// Martin

Re: [libav-devel] [PATCH 1/2] libavutil: Undeprecate the AVFrame reordered_opaque field

2018-11-05 Thread Martin Storsjö

On Fri, 26 Oct 2018, Luca Barbato wrote:


On 25/10/2018 14:45, Martin Storsjö wrote:

This was marked as deprecated (but only in the doxygen, not with an
actual deprecation attribute) in 81c623fae05 in 2011, but was
undeprecated in ad1ee5fa7.
---
 libavutil/frame.h   | 1 -
 libavutil/version.h | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)



The set is probably fine.


Pushed, with a minor adjustment to patch 2/2, to overestimate the buffer 
size needed, in case a reconfiguration increases the delay.


// Martin

Re: [libav-devel] [PATCH 2/2] libx264: Pass the reordered_opaque field through the encoder

2018-10-25 Thread Martin Storsjö

On Thu, 25 Oct 2018, Martin Storsjö wrote:


libx264 does have a field for opaque data to pass along with frames
through the encoder, but it is a pointer, while the libavcodec
reordered_opaque field is an int64_t. Therefore, allocate an array
within the libx264 wrapper, where reordered_opaque values in flight
are stored, and pass a pointer to this array to libx264.

Update the public libavcodec documentation for the AVCodecContext
field to explain this usage, and add a codec capability that allows
detecting whether an encoder handles this field.
---
libavcodec/avcodec.h | 12 +++-
libavcodec/libx264.c | 31 +--
2 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index fb8e34e7d5..727e1c411d 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -899,6 +899,13 @@ typedef struct RcOverride{
 */
#define AV_CODEC_CAP_HYBRID  (1 << 18)

+/**
+ * This codec takes the reordered_opaque field from input AVFrames
+ * and returns it in the corresponding field in AVCodecContext after
+ * encoding.
+ */
+#define AV_CODEC_CAP_ENCODER_REORDERED_OPAQUE (1 << 19)


This obviously needs a minor bump, I'll add one locally.

// Martin

[libav-devel] [PATCH 2/2] libx264: Pass the reordered_opaque field through the encoder

2018-10-25 Thread Martin Storsjö
libx264 does have a field for opaque data to pass along with frames
through the encoder, but it is a pointer, while the libavcodec
reordered_opaque field is an int64_t. Therefore, allocate an array
within the libx264 wrapper, where reordered_opaque values in flight
are stored, and pass a pointer to this array to libx264.

Update the public libavcodec documentation for the AVCodecContext
field to explain this usage, and add a codec capability that allows
detecting whether an encoder handles this field.
---
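As an illustration of the approach described above (not the wrapper code
itself), here is a minimal sketch of a ring of int64_t slots whose addresses
are handed out as opaque pointers and validated when they come back. The
names OpaqueRing, ring_init, ring_push, ring_read and ring_free are
hypothetical, and plain calloc/free stand in for the libavutil allocators.
The range check goes through uintptr_t so that comparing against an
unrelated pointer stays implementation-defined rather than undefined.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-encoder state; not the libx264 wrapper. */
typedef struct OpaqueRing {
    int64_t *slots; /* values belonging to frames still inside the encoder */
    int      nb;    /* number of slots */
    int      next;  /* slot to use for the next input frame */
} OpaqueRing;

/* One slot per frame that can be in flight: max_delay delayed frames
 * plus the frame currently being submitted. Returns 0 on success. */
static int ring_init(OpaqueRing *r, int max_delay)
{
    r->nb    = max_delay + 1;
    r->next  = 0;
    r->slots = calloc((size_t)r->nb, sizeof(*r->slots));
    return r->slots ? 0 : -1;
}

/* Store the value and return the pointer to pass along with the frame. */
static void *ring_push(OpaqueRing *r, int64_t value)
{
    int64_t *slot = &r->slots[r->next];
    *slot   = value;
    r->next = (r->next + 1) % r->nb;
    return slot;
}

/* Read the value back for an output picture. Pointers that do not point
 * into the slot array yield 0 instead of being dereferenced. */
static int64_t ring_read(const OpaqueRing *r, const void *opaque)
{
    uintptr_t p  = (uintptr_t)opaque;
    uintptr_t lo = (uintptr_t)r->slots;
    uintptr_t hi = (uintptr_t)(r->slots + r->nb);

    if (p >= lo && p < hi)
        return *(const int64_t *)opaque;
    return 0;
}

static void ring_free(OpaqueRing *r)
{
    free(r->slots);
    r->slots = NULL;
    r->nb    = r->next = 0;
}
```

With max_delay + 1 slots, a slot is only reused once the frame that
originally owned it can no longer be inside the encoder, which is the
assumption this sketch relies on.
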
 libavcodec/avcodec.h | 12 +++-
 libavcodec/libx264.c | 31 +--
 2 files changed, 40 insertions(+), 3 deletions(-)

diff --git a/libavcodec/avcodec.h b/libavcodec/avcodec.h
index fb8e34e7d5..727e1c411d 100644
--- a/libavcodec/avcodec.h
+++ b/libavcodec/avcodec.h
@@ -899,6 +899,13 @@ typedef struct RcOverride{
  */
 #define AV_CODEC_CAP_HYBRID  (1 << 18)
 
+/**
+ * This codec takes the reordered_opaque field from input AVFrames
+ * and returns it in the corresponding field in AVCodecContext after
+ * encoding.
+ */
+#define AV_CODEC_CAP_ENCODER_REORDERED_OPAQUE (1 << 19)
+
 /**
  * Pan Scan area.
  * This specifies the area which should be displayed.
@@ -2297,7 +2304,10 @@ typedef struct AVCodecContext {
 /**
  * opaque 64-bit number (generally a PTS) that will be reordered and
  * output in AVFrame.reordered_opaque
- * - encoding: unused
+ * - encoding: Set by libavcodec to the reordered_opaque of the input
+ * frame corresponding to the last returned packet. Only
+ * supported by encoders with the
+ * AV_CODEC_CAP_ENCODER_REORDERED_OPAQUE capability.
  * - decoding: Set by user.
  */
 int64_t reordered_opaque;
diff --git a/libavcodec/libx264.c b/libavcodec/libx264.c
index 3dc53aaf38..c852858db8 100644
--- a/libavcodec/libx264.c
+++ b/libavcodec/libx264.c
@@ -85,6 +85,9 @@ typedef struct X264Context {
 int noise_reduction;
 
 char *x264_params;
+
+int nb_reordered_opaque, next_reordered_opaque;
+int64_t *reordered_opaque;
 } X264Context;
 
 static void X264_log(void *p, int level, const char *fmt, va_list args)
@@ -240,6 +243,7 @@ static int X264_frame(AVCodecContext *ctx, AVPacket *pkt, 
const AVFrame *frame,
 x264_nal_t *nal;
 int nnal, i, ret;
 x264_picture_t pic_out;
+int64_t *out_opaque;
 
 x264_picture_init( &x4->pic );
 x4->pic.img.i_csp   = x4->params.i_csp;
@@ -259,6 +263,11 @@ static int X264_frame(AVCodecContext *ctx, AVPacket *pkt, 
const AVFrame *frame,
 
 x4->pic.i_pts  = frame->pts;
 
+x4->reordered_opaque[x4->next_reordered_opaque] = 
frame->reordered_opaque;
+x4->pic.opaque = &x4->reordered_opaque[x4->next_reordered_opaque];
+x4->next_reordered_opaque++;
+x4->next_reordered_opaque %= x4->nb_reordered_opaque;
+
 switch (frame->pict_type) {
 case AV_PICTURE_TYPE_I:
 x4->pic.i_type = x4->forced_idr ? X264_TYPE_IDR
@@ -288,6 +297,15 @@ static int X264_frame(AVCodecContext *ctx, AVPacket *pkt, 
const AVFrame *frame,
 pkt->pts = pic_out.i_pts;
 pkt->dts = pic_out.i_dts;
 
+out_opaque = pic_out.opaque;
+if (out_opaque >= x4->reordered_opaque &&
+out_opaque < &x4->reordered_opaque[x4->nb_reordered_opaque]) {
+ctx->reordered_opaque = *out_opaque;
+} else {
+// Unexpected opaque pointer on picture output
+ctx->reordered_opaque = 0;
+}
+
 #if FF_API_CODED_FRAME
 FF_DISABLE_DEPRECATION_WARNINGS
 switch (pic_out.i_type) {
@@ -331,6 +349,7 @@ static av_cold int X264_close(AVCodecContext *avctx)
 
 av_freep(&avctx->extradata);
 av_freep(&x4->sei);
+av_freep(&x4->reordered_opaque);
 
 if (x4->enc) {
 x264_encoder_close(x4->enc);
@@ -663,6 +682,12 @@ FF_ENABLE_DEPRECATION_WARNINGS
 cpb_props->max_bitrate = x4->params.rc.i_vbv_max_bitrate * 1000;
 cpb_props->avg_bitrate = x4->params.rc.i_bitrate * 1000;
 
+x4->nb_reordered_opaque = x264_encoder_maximum_delayed_frames(x4->enc) + 1;
+x4->reordered_opaque= av_malloc_array(x4->nb_reordered_opaque,
+  sizeof(*x4->reordered_opaque));
+if (!x4->reordered_opaque)
+return AVERROR(ENOMEM);
+
 return 0;
 }
 
@@ -850,7 +875,8 @@ AVCodec ff_libx264_encoder = {
 .init = X264_init,
 .encode2  = X264_frame,
 .close= X264_close,
-.capabilities = AV_CODEC_CAP_DELAY | AV_CODEC_CAP_AUTO_THREADS,
+.capabilities = AV_CODEC_CAP_DELAY | AV_CODEC_CAP_AUTO_THREADS |
+AV_CODEC_CAP_ENCODER_REORDERED_OPAQUE,
 .priv_class   = &class,
 .defaults = x264_defaults,
 .init_static_data = X264_init_static,
@@ -877,7 +903,8 @@ AVCodec ff_libx262_encoder = {
 .init = X264_init,
 .encode2  = X264_frame,
 .close= X264_close,
-.capabilities 

[libav-devel] [PATCH 1/2] libavutil: Undeprecate the AVFrame reordered_opaque field

2018-10-25 Thread Martin Storsjö
This was marked as deprecated (but only in the doxygen, not with an
actual deprecation attribute) in 81c623fae05 in 2011, but was
undeprecated in ad1ee5fa7.
---
 libavutil/frame.h   | 1 -
 libavutil/version.h | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/libavutil/frame.h b/libavutil/frame.h
index ff3fe46dd6..c7240ebe9b 100644
--- a/libavutil/frame.h
+++ b/libavutil/frame.h
@@ -295,7 +295,6 @@ typedef struct AVFrame {
  * that time,
  * the decoder reorders values as needed and sets AVFrame.reordered_opaque
  * to exactly one of the values provided by the user through 
AVCodecContext.reordered_opaque
- * @deprecated in favor of pkt_pts
  */
 int64_t reordered_opaque;
 
diff --git a/libavutil/version.h b/libavutil/version.h
index 4a9fffef43..e5fbd4ca81 100644
--- a/libavutil/version.h
+++ b/libavutil/version.h
@@ -55,7 +55,7 @@
 
 #define LIBAVUTIL_VERSION_MAJOR 56
 #define LIBAVUTIL_VERSION_MINOR  7
-#define LIBAVUTIL_VERSION_MICRO  0
+#define LIBAVUTIL_VERSION_MICRO  1
 
 #define LIBAVUTIL_VERSION_INT   AV_VERSION_INT(LIBAVUTIL_VERSION_MAJOR, \
LIBAVUTIL_VERSION_MINOR, \
-- 
2.17.1 (Apple Git-112)

[libav-devel] [PATCH] arm: Emit .thumb_func directives

2018-10-12 Thread Martin Storsjö
Prior to Xcode 9.3, the clang built-in assembler didn't support
altmacro, and gas-preprocessor was used for assembling for arm/darwin.

For thumb functions, gas-preprocessor took care of adding the .thumb_func
directives, but now that we can assemble without gas-preprocessor,
we need to add these directives ourselves.
---
 libavutil/arm/asm.S | 8 
 1 file changed, 8 insertions(+)

diff --git a/libavutil/arm/asm.S b/libavutil/arm/asm.S
index e7eea0271f..5207a1a2b8 100644
--- a/libavutil/arm/asm.S
+++ b/libavutil/arm/asm.S
@@ -75,6 +75,12 @@ T   .thumb
 ELF .eabi_attribute 25, 1   @ Tag_ABI_align_preserved
 ELF .section .note.GNU-stack,"",%progbits @ Mark stack as non-executable
 
+.macro func_mode name
+#if CONFIG_THUMB && defined(__APPLE__)
+.thumb_func \name
+#endif
+.endm
+
 .macro  function name, export=0, align=2
 .set .Lpic_idx, 0
 .set .Lpic_gp, 0
@@ -98,10 +104,12 @@ FUNC .endfunc
 .global EXTERN_ASM\name
 ELF .type   EXTERN_ASM\name, %function
 FUNC .func   EXTERN_ASM\name
+func_mode EXTERN_ASM\name
 EXTERN_ASM\name:
 .else
 ELF .type   \name, %function
 FUNC .func   \name
+func_mode \name
 \name:
 .endif
 .endm
-- 
2.17.1 (Apple Git-112)

Re: [libav-devel] [PATCH] libfdk-aac: Don't use defined() in a #define

2018-09-12 Thread Martin Storsjö

On Wed, 12 Sep 2018, Martin Storsjö wrote:


MSVC expands the preprocessor directives differently, making the
version check fail in the previous form.
---
I'm pretty sure I've seen a better description of this issue somewhere,
I don't remember off-hand right now where that was. But I think the
gist of it was that the previous form was undefined according to the
C standard, even if GCC and clang handle it in the same way.


This is similar to 5e3f6dc70198426fe0741e3017826b8bf3ee5ad8, which points 
out that if building with -Wexpansion-to-defined, the compiler (at least 
clang) would warn about it, clarifying that macro expansion of 'defined' 
has undefined behaviour.
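
For reference, a minimal sketch of the portable pattern, using hypothetical
DEMOLIB_VL0/DEMOLIB_VL1 macros in place of the real library defines: the
"is it defined at all" question is answered once with #ifdef, so the token
'defined' never appears as the result of macro expansion inside an #if.

```c
/* Hypothetical version macros, standing in for what a library header
 * might provide. Remove them to exercise the #else branch below. */
#define DEMOLIB_VL0 3
#define DEMOLIB_VL1 4

/* Decide with #ifdef whether the version macros exist, instead of
 * putting defined(...) into the body of the function-like macro. */
#ifdef DEMOLIB_VL0
#define DEMOLIB_VER_AT_LEAST(vl0, vl1) \
    ((DEMOLIB_VL0 > (vl0)) || \
     (DEMOLIB_VL0 == (vl0) && DEMOLIB_VL1 >= (vl1)))
#else
#define DEMOLIB_VER_AT_LEAST(vl0, vl1) 0
#endif

/* The macro now expands to plain integer arithmetic, so it behaves the
 * same in #if with any conforming preprocessor. */
#if DEMOLIB_VER_AT_LEAST(3, 0)
static int have_feature_3_0(void) { return 1; }
#else
static int have_feature_3_0(void) { return 0; }
#endif

#if DEMOLIB_VER_AT_LEAST(4, 0)
static int have_feature_4_0(void) { return 1; }
#else
static int have_feature_4_0(void) { return 0; }
#endif
```

With the hypothetical version 3.4 above, the 3.0 check is true and the 4.0
check is false; if the version macros are missing entirely, every check
evaluates to 0.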


// Martin

[libav-devel] [PATCH] libfdk-aac: Don't use defined() in a #define

2018-09-12 Thread Martin Storsjö
MSVC expands the preprocessor directives differently, making the
version check fail in the previous form.
---
I'm pretty sure I've seen a better description of this issue somewhere,
I don't remember off-hand right now where that was. But I think the
gist of it was that the previous form was undefined according to the
C standard, even if GCC and clang handle it in the same way.
---
 libavcodec/libfdk-aacdec.c | 9 ++---
 libavcodec/libfdk-aacenc.c | 9 ++---
 2 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/libavcodec/libfdk-aacdec.c b/libavcodec/libfdk-aacdec.c
index ca70a49ad4..63856232d9 100644
--- a/libavcodec/libfdk-aacdec.c
+++ b/libavcodec/libfdk-aacdec.c
@@ -25,10 +25,13 @@
 #include "avcodec.h"
 #include "internal.h"
 
+#ifdef AACDECODER_LIB_VL0
 #define FDKDEC_VER_AT_LEAST(vl0, vl1) \
-(defined(AACDECODER_LIB_VL0) && \
-((AACDECODER_LIB_VL0 > vl0) || \
- (AACDECODER_LIB_VL0 == vl0 && AACDECODER_LIB_VL1 >= vl1)))
+((AACDECODER_LIB_VL0 > vl0) || \
+ (AACDECODER_LIB_VL0 == vl0 && AACDECODER_LIB_VL1 >= vl1))
+#else
+#define FDKDEC_VER_AT_LEAST(vl0, vl1) 0
+#endif
 
 #if !FDKDEC_VER_AT_LEAST(2, 5) // < 2.5.10
 #define AAC_PCM_MAX_OUTPUT_CHANNELS AAC_PCM_OUTPUT_CHANNELS
diff --git a/libavcodec/libfdk-aacenc.c b/libavcodec/libfdk-aacenc.c
index f71a276403..3b492ef8f4 100644
--- a/libavcodec/libfdk-aacenc.c
+++ b/libavcodec/libfdk-aacenc.c
@@ -26,10 +26,13 @@
 #include "audio_frame_queue.h"
 #include "internal.h"
 
+#ifdef AACENCODER_LIB_VL0
 #define FDKENC_VER_AT_LEAST(vl0, vl1) \
-(defined(AACENCODER_LIB_VL0) && \
-((AACENCODER_LIB_VL0 > vl0) || \
- (AACENCODER_LIB_VL0 == vl0 && AACENCODER_LIB_VL1 >= vl1)))
+((AACENCODER_LIB_VL0 > vl0) || \
+ (AACENCODER_LIB_VL0 == vl0 && AACENCODER_LIB_VL1 >= vl1))
+#else
+#define FDKENC_VER_AT_LEAST(vl0, vl1) 0
+#endif
 
 typedef struct AACContext {
 const AVClass *class;
-- 
2.15.2 (Apple Git-101.1)

[libav-devel] [PATCH 2/3] libfdk-aacdec: Allow setting the new dynamic range control effect setting

2018-09-04 Thread Martin Storsjö
This is a new setting in FDK v2.
---
 libavcodec/libfdk-aacdec.c | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/libavcodec/libfdk-aacdec.c b/libavcodec/libfdk-aacdec.c
index c3d3b70fc9..ca70a49ad4 100644
--- a/libavcodec/libfdk-aacdec.c
+++ b/libavcodec/libfdk-aacdec.c
@@ -51,6 +51,7 @@ typedef struct FDKAACDecContext {
 int drc_level;
 int drc_boost;
 int drc_heavy;
+int drc_effect;
 int drc_cut;
 int level_limit;
 } FDKAACDecContext;
@@ -77,6 +78,10 @@ static const AVOption fdk_aac_dec_options[] = {
  OFFSET(drc_heavy),  AV_OPT_TYPE_INT,   { .i64 = -1},  
-1, 1,   AD, NULL},
 #if FDKDEC_VER_AT_LEAST(2, 5) // 2.5.10
 { "level_limit", "Signal level limiting", OFFSET(level_limit), 
AV_OPT_TYPE_INT, { .i64 = 0 }, -1, 1, AD },
+#endif
+#if FDKDEC_VER_AT_LEAST(3, 0) // 3.0.0
+{ "drc_effect","Dynamic Range Control: effect type, where e.g. [0] is none 
and [6] is general",
+ OFFSET(drc_effect), AV_OPT_TYPE_INT,   { .i64 = -1},  
-1, 8,   AD, NULL},
 #endif
 { NULL }
 };
@@ -306,6 +311,15 @@ static av_cold int fdk_aac_decode_init(AVCodecContext 
*avctx)
 }
 #endif
 
+#if FDKDEC_VER_AT_LEAST(3, 0) // 3.0.0
+if (s->drc_effect != -1) {
+if (aacDecoder_SetParam(s->handle, AAC_UNIDRC_SET_EFFECT, 
s->drc_effect) != AAC_DEC_OK) {
+av_log(avctx, AV_LOG_ERROR, "Unable to set DRC effect type in the 
decoder\n");
+return AVERROR_UNKNOWN;
+}
+}
+#endif
+
 avctx->sample_fmt = AV_SAMPLE_FMT_S16;
 
 s->decoder_buffer_size = DECODER_BUFFSIZE * DECODER_MAX_CHANNELS;
-- 
2.15.2 (Apple Git-101.1)

[libav-devel] [PATCH 1/3] libfdk-aac: Consistently use a proper version check macro for detecting features

2018-09-04 Thread Martin Storsjö
The previous version checks only needed to detect the version
where the version define was added to the installed headers,
so an "#ifdef AACDECODER_LIB_VL0" was enough. Now that we
need more fine-grained version checks than this, convert all checks
to use a version comparison macro.
---
 libavcodec/libfdk-aacdec.c | 13 -
 libavcodec/libfdk-aacenc.c |  6 +++---
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/libavcodec/libfdk-aacdec.c b/libavcodec/libfdk-aacdec.c
index 3be65155b5..c3d3b70fc9 100644
--- a/libavcodec/libfdk-aacdec.c
+++ b/libavcodec/libfdk-aacdec.c
@@ -25,9 +25,12 @@
 #include "avcodec.h"
 #include "internal.h"
 
-/* The version macro is introduced the same time as the setting enum was
- * changed, so this check should suffice. */
-#ifndef AACDECODER_LIB_VL0
+#define FDKDEC_VER_AT_LEAST(vl0, vl1) \
+(defined(AACDECODER_LIB_VL0) && \
+((AACDECODER_LIB_VL0 > vl0) || \
+ (AACDECODER_LIB_VL0 == vl0 && AACDECODER_LIB_VL1 >= vl1)))
+
+#if !FDKDEC_VER_AT_LEAST(2, 5) // < 2.5.10
 #define AAC_PCM_MAX_OUTPUT_CHANNELS AAC_PCM_OUTPUT_CHANNELS
 #endif
 
@@ -72,7 +75,7 @@ static const AVOption fdk_aac_dec_options[] = {
  OFFSET(drc_level),  AV_OPT_TYPE_INT,   { .i64 = -1},  
-1, 127, AD, NULL},
 { "drc_heavy", "Dynamic Range Control: heavy compression, where [1] is on 
(RF mode) and [0] is off",
  OFFSET(drc_heavy),  AV_OPT_TYPE_INT,   { .i64 = -1},  
-1, 1,   AD, NULL},
-#ifdef AACDECODER_LIB_VL0
+#if FDKDEC_VER_AT_LEAST(2, 5) // 2.5.10
 { "level_limit", "Signal level limiting", OFFSET(level_limit), 
AV_OPT_TYPE_INT, { .i64 = 0 }, -1, 1, AD },
 #endif
 { NULL }
@@ -296,7 +299,7 @@ static av_cold int fdk_aac_decode_init(AVCodecContext 
*avctx)
 }
 }
 
-#ifdef AACDECODER_LIB_VL0
+#if FDKDEC_VER_AT_LEAST(2, 5) // 2.5.10
 if (aacDecoder_SetParam(s->handle, AAC_PCM_LIMITER_ENABLE, s->level_limit) 
!= AAC_DEC_OK) {
 av_log(avctx, AV_LOG_ERROR, "Unable to set in signal level limiting in 
the decoder\n");
 return AVERROR_UNKNOWN;
diff --git a/libavcodec/libfdk-aacenc.c b/libavcodec/libfdk-aacenc.c
index 2ad768ed44..92ad1762ae 100644
--- a/libavcodec/libfdk-aacenc.c
+++ b/libavcodec/libfdk-aacenc.c
@@ -159,7 +159,7 @@ static av_cold int aac_encode_init(AVCodecContext *avctx)
 case 6: mode = MODE_1_2_2_1; sce = 2; cpe = 2; break;
 /* The version macro is introduced the same time as the 7.1 support, so this
should suffice. */
-#ifdef AACENCODER_LIB_VL0
+#if FDKENC_VER_AT_LEAST(3, 4) // 3.4.12
 case 8:
 sce = 2;
 cpe = 3;
@@ -295,7 +295,7 @@ static av_cold int aac_encode_init(AVCodecContext *avctx)
 }
 
 avctx->frame_size = info.frameLength;
-#if FDKENC_VER_AT_LEAST(4, 0)
+#if FDKENC_VER_AT_LEAST(4, 0) // 4.0.0
 avctx->initial_padding = info.nDelay;
 #else
 avctx->initial_padding = info.encoderDelay;
@@ -418,7 +418,7 @@ static const uint64_t aac_channel_layout[] = {
 AV_CH_LAYOUT_4POINT0,
 AV_CH_LAYOUT_5POINT0_BACK,
 AV_CH_LAYOUT_5POINT1_BACK,
-#ifdef AACENCODER_LIB_VL0
+#if FDKENC_VER_AT_LEAST(3, 4) // 3.4.12
 AV_CH_LAYOUT_7POINT1_WIDE_BACK,
 AV_CH_LAYOUT_7POINT1,
 #endif
-- 
2.15.2 (Apple Git-101.1)

[libav-devel] [PATCH 3/3] libfdk-aacenc: Allow enabling the ELDv2 profile

2018-09-04 Thread Martin Storsjö
This is a new feature in FDK v2.
---
 libavcodec/libfdk-aacenc.c | 27 ++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/libavcodec/libfdk-aacenc.c b/libavcodec/libfdk-aacenc.c
index 92ad1762ae..f71a276403 100644
--- a/libavcodec/libfdk-aacenc.c
+++ b/libavcodec/libfdk-aacenc.c
@@ -36,6 +36,7 @@ typedef struct AACContext {
 HANDLE_AACENCODER handle;
 int afterburner;
 int eld_sbr;
+int eld_v2;
 int signaling;
 int latm;
 int header_period;
@@ -47,6 +48,9 @@ typedef struct AACContext {
 static const AVOption aac_enc_options[] = {
 { "afterburner", "Afterburner (improved quality)", offsetof(AACContext, 
afterburner), AV_OPT_TYPE_INT, { .i64 = 1 }, 0, 1, AV_OPT_FLAG_AUDIO_PARAM | 
AV_OPT_FLAG_ENCODING_PARAM },
 { "eld_sbr", "Enable SBR for ELD (for SBR in other configurations, use the 
-profile parameter)", offsetof(AACContext, eld_sbr), AV_OPT_TYPE_INT, { .i64 = 
0 }, 0, 1, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM },
+#if FDKENC_VER_AT_LEAST(4, 0) // 4.0.0
+{ "eld_v2", "Enable ELDv2 (LD-MPS extension for ELD stereo signals)", 
offsetof(AACContext, eld_v2), AV_OPT_TYPE_INT, { .i64 = 0 }, 0, 1, 
AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM },
+#endif
 { "signaling", "SBR/PS signaling style", offsetof(AACContext, signaling), 
AV_OPT_TYPE_INT, { .i64 = -1 }, -1, 2, AV_OPT_FLAG_AUDIO_PARAM | 
AV_OPT_FLAG_ENCODING_PARAM, "signaling" },
 { "default", "Choose signaling implicitly (explicit hierarchical by 
default, implicit if global header is disabled)", 0, AV_OPT_TYPE_CONST, { .i64 
= -1 }, 0, 0, AV_OPT_FLAG_AUDIO_PARAM | AV_OPT_FLAG_ENCODING_PARAM, "signaling" 
},
 { "implicit", "Implicit backwards compatible signaling", 0, 
AV_OPT_TYPE_CONST, { .i64 = 0 }, 0, 0, AV_OPT_FLAG_AUDIO_PARAM | 
AV_OPT_FLAG_ENCODING_PARAM, "signaling" },
@@ -152,7 +156,28 @@ static av_cold int aac_encode_init(AVCodecContext *avctx)
 
 switch (avctx->channels) {
 case 1: mode = MODE_1;   sce = 1; cpe = 0; break;
-case 2: mode = MODE_2;   sce = 0; cpe = 1; break;
+case 2:
+#if FDKENC_VER_AT_LEAST(4, 0) // 4.0.0
+  // (profile + 1) to map from profile range to AOT range
+  if (aot == FF_PROFILE_AAC_ELD + 1 && s->eld_v2) {
+  if ((err = aacEncoder_SetParam(s->handle, AACENC_CHANNELMODE,
+ 128)) != AACENC_OK) {
+  av_log(avctx, AV_LOG_ERROR, "Unable to enable ELDv2: %s\n",
+ aac_get_error(err));
+  goto error;
+  } else {
+mode = MODE_212;
+sce = 1;
+cpe = 0;
+  }
+  } else
+#endif
+  {
+mode = MODE_2;
+sce = 0;
+cpe = 1;
+  }
+  break;
 case 3: mode = MODE_1_2; sce = 1; cpe = 1; break;
 case 4: mode = MODE_1_2_1;   sce = 2; cpe = 1; break;
 case 5: mode = MODE_1_2_2;   sce = 1; cpe = 2; break;
-- 
2.15.2 (Apple Git-101.1)

[libav-devel] [PATCH] libfdk-aacenc: Fix building with libfdk-aac v2

2018-09-02 Thread Martin Storsjö
When flushing the encoder, we now need to provide non-null buffer
parameters for everything, even if they are unused.

The encoderDelay parameter has been replaced by two, nDelay and
nDelayCore.
---
libfdk-aac v2 also has a bunch of other new, yet untested features,
like support for xHE-AAC.
---
 libavcodec/libfdk-aacenc.c | 34 +-
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/libavcodec/libfdk-aacenc.c b/libavcodec/libfdk-aacenc.c
index c340a1e3e0..2ad768ed44 100644
--- a/libavcodec/libfdk-aacenc.c
+++ b/libavcodec/libfdk-aacenc.c
@@ -26,6 +26,11 @@
 #include "audio_frame_queue.h"
 #include "internal.h"
 
+#define FDKENC_VER_AT_LEAST(vl0, vl1) \
+(defined(AACENCODER_LIB_VL0) && \
+((AACENCODER_LIB_VL0 > vl0) || \
+ (AACENCODER_LIB_VL0 == vl0 && AACENCODER_LIB_VL1 >= vl1)))
+
 typedef struct AACContext {
 const AVClass *class;
 HANDLE_AACENCODER handle;
@@ -290,7 +295,11 @@ static av_cold int aac_encode_init(AVCodecContext *avctx)
 }
 
 avctx->frame_size = info.frameLength;
+#if FDKENC_VER_AT_LEAST(4, 0)
+avctx->initial_padding = info.nDelay;
+#else
 avctx->initial_padding = info.encoderDelay;
+#endif
 ff_af_queue_init(avctx, &s->afq);
 
 if (avctx->flags & AV_CODEC_FLAG_GLOBAL_HEADER) {
@@ -323,28 +332,35 @@ static int aac_encode_frame(AVCodecContext *avctx, 
AVPacket *avpkt,
 int out_buffer_size, out_buffer_element_size;
 void *in_ptr, *out_ptr;
 int ret;
+uint8_t dummy_buf[1];
 AACENC_ERROR err;
 
 /* handle end-of-stream small frame and flushing */
 if (!frame) {
+/* Must be a non-null pointer, even if it's a dummy. We could use
+ * the address of anything else on the stack as well. */
+in_ptr   = dummy_buf;
+in_buffer_size   = 0;
+
 in_args.numInSamples = -1;
 } else {
-in_ptr   = frame->data[0];
-in_buffer_size   = 2 * avctx->channels * frame->nb_samples;
-in_buffer_element_size   = 2;
+in_ptr   = frame->data[0];
+in_buffer_size   = 2 * avctx->channels * frame->nb_samples;
 
-in_args.numInSamples = avctx->channels * frame->nb_samples;
-in_buf.numBufs   = 1;
-in_buf.bufs  = &in_ptr;
-in_buf.bufferIdentifiers = &in_buffer_identifier;
-in_buf.bufSizes  = &in_buffer_size;
-in_buf.bufElSizes= &in_buffer_element_size;
+in_args.numInSamples = avctx->channels * frame->nb_samples;
 
 /* add current frame to the queue */
 if ((ret = ff_af_queue_add(&s->afq, frame)) < 0)
 return ret;
 }
 
+in_buffer_element_size   = 2;
+in_buf.numBufs   = 1;
+in_buf.bufs  = &in_ptr;
+in_buf.bufferIdentifiers = &in_buffer_identifier;
+in_buf.bufSizes  = &in_buffer_size;
+in_buf.bufElSizes= &in_buffer_element_size;
+
 /* The maximum packet size is 6144 bits aka 768 bytes per channel. */
 if ((ret = ff_alloc_packet(avpkt, FFMAX(8192, 768 * avctx->channels)))) {
 av_log(avctx, AV_LOG_ERROR, "Error getting output packet\n");
-- 
2.15.2 (Apple Git-101.1)

Re: [libav-devel] [PATCH] libopenh264dec: Export the decoded profile and level in AVCodecContext

2018-08-31 Thread Martin Storsjö

On Fri, 31 Aug 2018, Vittorio Giovara wrote:


On Fri, Aug 31, 2018 at 11:25 AM, Martin Storsjö  wrote:


---
 libavcodec/libopenh264dec.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libavcodec/libopenh264dec.c b/libavcodec/libopenh264dec.c
index 5990a72ff9..7e9e66743a 100644
--- a/libavcodec/libopenh264dec.c
+++ b/libavcodec/libopenh264dec.c
@@ -95,6 +95,7 @@ static int svc_decode_frame(AVCodecContext *avctx, void
*data,
 int linesize[3];
 AVFrame *avframe = data;
 DECODING_STATE state;
+int opt;

 if (!avpkt->data) {
 #if OPENH264_VER_AT_LEAST(1, 9)
@@ -136,6 +137,10 @@ FF_DISABLE_DEPRECATION_WARNINGS
 avframe->pkt_pts = avpkt->pts;
 FF_ENABLE_DEPRECATION_WARNINGS
 #endif
+(*s->decoder)->GetOption(s->decoder, DECODER_OPTION_PROFILE, &opt);
+avctx->profile = opt;
+(*s->decoder)->GetOption(s->decoder, DECODER_OPTION_LEVEL, &opt);
+avctx->level = opt;

 *got_frame = 1;
 return avpkt->size;
--



lgtm


Thanks - pushed with appropriate openh264 version ifdefs added.

// Martin

Re: [libav-devel] [PATCH 1/2] network: Add RFC 8305 style "Happy Eyeballs"/"Fast Fallback" helper function

2018-08-31 Thread Martin Storsjö

On Wed, 22 Aug 2018, Luca Barbato wrote:


On 21/08/2018 09:29, Martin Storsjö wrote:

For cases with dual stack (IPv4 + IPv6) connectivity, but where one
stack potentially is less reliable, strive to connect over
both protocols in parallel, using whichever address connects first.

In cases with a hostname resolving to multiple IPv4 and IPv6
addresses, the current connection mechanism would try all addresses
in the order returned by getaddrinfo (with all IPv6 addresses ordered
before the IPv4 addresses normally). If connection attempts to the
IPv6 addresses return quickly with an error, this was no problem, but
if they were unsuccessful leading up to timeouts, the connection process
would have to wait for timeouts on all IPv6 target addresses before
attempting any IPv4 address.

Similar to what RFC 8305 suggests, reorder the list of addresses to
try connecting to, interleaving address families. After starting one
connection attempt, start another one in parallel after a small delay
(200 ms as suggested by the RFC).

For cases with unreliable IPv6 but reliable IPv4, this should make
connection attempts work as reliably as with plain IPv4, with only an
extra 200 ms of connection delay.


The set looks fine to me.


Pushed.

// Martin

[libav-devel] [PATCH] libopenh264dec: Export the decoded profile and level in AVCodecContext

2018-08-31 Thread Martin Storsjö
---
 libavcodec/libopenh264dec.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/libavcodec/libopenh264dec.c b/libavcodec/libopenh264dec.c
index 5990a72ff9..7e9e66743a 100644
--- a/libavcodec/libopenh264dec.c
+++ b/libavcodec/libopenh264dec.c
@@ -95,6 +95,7 @@ static int svc_decode_frame(AVCodecContext *avctx, void *data,
 int linesize[3];
 AVFrame *avframe = data;
 DECODING_STATE state;
+int opt;
 
 if (!avpkt->data) {
 #if OPENH264_VER_AT_LEAST(1, 9)
@@ -136,6 +137,10 @@ FF_DISABLE_DEPRECATION_WARNINGS
 avframe->pkt_pts = avpkt->pts;
 FF_ENABLE_DEPRECATION_WARNINGS
 #endif
+(*s->decoder)->GetOption(s->decoder, DECODER_OPTION_PROFILE, &opt);
+avctx->profile = opt;
+(*s->decoder)->GetOption(s->decoder, DECODER_OPTION_LEVEL, &opt);
+avctx->level = opt;
 
 *got_frame = 1;
 return avpkt->size;
-- 
2.15.2 (Apple Git-101.1)

[libav-devel] [PATCH 1/2] network: Add RFC 8305 style "Happy Eyeballs"/"Fast Fallback" helper function

2018-08-21 Thread Martin Storsjö
For cases with dual stack (IPv4 + IPv6) connectivity, but where one
stack potentially is less reliable, strive to connect over
both protocols in parallel, using whichever address connects first.

In cases with a hostname resolving to multiple IPv4 and IPv6
addresses, the current connection mechanism would try all addresses
in the order returned by getaddrinfo (with all IPv6 addresses ordered
before the IPv4 addresses normally). If connection attempts to the
IPv6 addresses return quickly with an error, this was no problem, but
if they were unsuccessful leading up to timeouts, the connection process
would have to wait for timeouts on all IPv6 target addresses before
attempting any IPv4 address.

Similar to what RFC 8305 suggests, reorder the list of addresses to
try connecting to, interleaving address families. After starting one
connection attempt, start another one in parallel after a small delay
(200 ms as suggested by the RFC).

For cases with unreliable IPv6 but reliable IPv4, this should make
connection attempts work as reliably as with plain IPv4, with only an
extra 200 ms of connection delay.
---
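To illustrate the reordering step described above, here is a small
self-contained sketch of the same in-place interleaving on a toy linked
list. Node, link_nodes and interleave are hypothetical names; the family
field plays the role of ai_family and next the role of ai_next. It only
covers the reordering, not the staggered parallel connection attempts.

```c
#include <stddef.h>

/* Toy stand-in for struct addrinfo: only the family and the link. */
typedef struct Node {
    int          family; /* e.g. 6 for IPv6, 4 for IPv4 in this toy model */
    struct Node *next;
} Node;

/* Chain the elements of an array into a singly linked list. */
static void link_nodes(Node *nodes, int n)
{
    for (int i = 0; i < n; i++)
        nodes[i].next = (i + 1 < n) ? &nodes[i + 1] : NULL;
}

/* Reorder the list in place so that families alternate for as long as
 * entries of more than one family remain. The head stays first. */
static void interleave(Node *base)
{
    Node **next = &base->next;
    while (*next) {
        Node *cur = *next;
        /* Skip forward over entries of the same family as base. */
        if (cur->family == base->family) {
            next = &cur->next;
            continue;
        }
        /* Already alternating here: advance base by one step. */
        if (cur == base->next) {
            base = cur;
            next = &base->next;
            continue;
        }
        /* Unchain cur and hook it in directly after base. */
        *next      = cur->next;
        cur->next  = base->next;
        base->next = cur;
        /* Everything between the old base and cur had base's family, so
         * the scan position in next is still valid for the new base. */
        base = cur->next;
    }
}
```

For example, a list with families 6, 6, 6, 4, 4 is reordered to
6, 4, 6, 4, 6, so an IPv4 address is tried second instead of fourth.
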
 libavformat/network.c | 226 ++
 libavformat/network.h |  28 +++
 2 files changed, 254 insertions(+)

diff --git a/libavformat/network.c b/libavformat/network.c
index 24fcf20539..2d281539c6 100644
--- a/libavformat/network.c
+++ b/libavformat/network.c
@@ -23,7 +23,9 @@
 #include "tls.h"
 #include "url.h"
 #include "libavcodec/internal.h"
+#include "libavutil/avassert.h"
 #include "libavutil/mem.h"
+#include "libavutil/time.h"
 
 void ff_tls_init(void)
 {
@@ -240,6 +242,230 @@ int ff_listen_connect(int fd, const struct sockaddr *addr,
 return ret;
 }
 
+static void interleave_addrinfo(struct addrinfo *base)
+{
+struct addrinfo **next = &base->ai_next;
+while (*next) {
+struct addrinfo *cur = *next;
+// Iterate forward until we find an entry of a different family.
+if (cur->ai_family == base->ai_family) {
+next = &cur->ai_next;
+continue;
+}
+if (cur == base->ai_next) {
+// If the first one following base is of a different family, just
+// move base forward one step and continue.
+base = cur;
+next = &base->ai_next;
+continue;
+}
+// Unchain cur from the rest of the list from its current spot.
+*next = cur->ai_next;
+// Hook in cur directly after base.
+cur->ai_next = base->ai_next;
+base->ai_next = cur;
+// Restart with a new base. We know that before moving the cur element,
+// everything between the previous base and cur had the same family,
+// different from cur->ai_family. Therefore, we can keep next pointing
+// where it was, and continue from there with base at the one after
+// cur.
+base = cur->ai_next;
+}
+}
+
+static void print_address_list(void *ctx, const struct addrinfo *addr,
+   const char *title)
+{
+char hostbuf[100], portbuf[20];
+av_log(ctx, AV_LOG_DEBUG, "%s:\n", title);
+while (addr) {
+getnameinfo(addr->ai_addr, addr->ai_addrlen,
+hostbuf, sizeof(hostbuf), portbuf, sizeof(portbuf),
+NI_NUMERICHOST | NI_NUMERICSERV);
+av_log(ctx, AV_LOG_DEBUG, "Address %s port %s\n", hostbuf, portbuf);
+addr = addr->ai_next;
+}
+}
+
+struct ConnectionAttempt {
+int fd;
+int64_t deadline_us;
+struct addrinfo *addr;
+};
+
+// Returns < 0 on error, 0 on successfully started connection attempt,
+// > 0 for a connection that succeeded already.
+static int start_connect_attempt(struct ConnectionAttempt *attempt,
+ struct addrinfo **ptr, int timeout_ms,
+ URLContext *h,
+ void (*customize_fd)(void *, int), void *customize_ctx)
+{
+struct addrinfo *ai = *ptr;
+int ret;
+
+*ptr = ai->ai_next;
+
+attempt->fd = ff_socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol);
+if (attempt->fd < 0)
+return ff_neterrno();
+attempt->deadline_us = av_gettime_relative() + timeout_ms * 1000;
+attempt->addr = ai;
+
+ff_socket_nonblock(attempt->fd, 1);
+
+if (customize_fd)
+customize_fd(customize_ctx, attempt->fd);
+
+while ((ret = connect(attempt->fd, ai->ai_addr, ai->ai_addrlen))) {
+ret = ff_neterrno();
+switch (ret) {
+case AVERROR(EINTR):
+if (ff_check_interrupt(&h->interrupt_callback)) {
+closesocket(attempt->fd);
+attempt->fd = -1;
+return AVERROR_EXIT;
+}
+continue;
+case AVERROR(EINPROGRESS):
+case AVERROR(EAGAIN):
+return 0;
+default:
+closesocket(attempt->fd);
+attempt->fd = -1;
+   
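The reordering step described in the commit message can be tried out in isolation. The following is a minimal sketch, assuming a hypothetical struct addr_node in place of struct addrinfo (only the family and the next pointer matter here); interleave_nodes follows the same steps as interleave_addrinfo in the patch, while link_nodes and families_to_string are illustration-only helpers that are not part of the patch.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct addrinfo: only the fields that the
 * reordering needs. */
struct addr_node {
    int family;                 /* 4 or 6 in this sketch */
    struct addr_node *next;
};

/* Reorder the list in place so that address families alternate for as
 * long as both families still have entries left. */
static void interleave_nodes(struct addr_node *base)
{
    struct addr_node **next = &base->next;
    while (*next) {
        struct addr_node *cur = *next;
        /* Skip forward over entries of the same family as base. */
        if (cur->family == base->family) {
            next = &cur->next;
            continue;
        }
        /* Already alternating at this point: advance base by one. */
        if (cur == base->next) {
            base = cur;
            next = &base->next;
            continue;
        }
        /* Unlink cur and hook it in directly after base. */
        *next = cur->next;
        cur->next = base->next;
        base->next = cur;
        /* Everything between the old base and cur had the same family,
         * so scanning can resume from the same spot. */
        base = cur->next;
    }
}

/* Illustration-only helper: chain nodes[0..count-1] in array order. */
static void link_nodes(struct addr_node *nodes, const int *families,
                       size_t count)
{
    for (size_t i = 0; i < count; i++) {
        nodes[i].family = families[i];
        nodes[i].next   = i + 1 < count ? &nodes[i + 1] : NULL;
    }
}

/* Illustration-only helper: write one digit per node into out. */
static void families_to_string(const struct addr_node *head, char *out,
                               size_t size)
{
    size_t i = 0;
    for (; head && i + 1 < size; head = head->next)
        out[i++] = (char)('0' + head->family);
    out[i] = '\0';
}
```

With the families 6, 6, 6, 4, 4 as input, the list comes out as 6, 4, 6, 4, 6, so an IPv4 address is tried second instead of fourth.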

[libav-devel] [PATCH 2/2] tcp: Use ff_connect_parallel for RFC 8305 style connecting

2018-08-21 Thread Martin Storsjö
---
 libavformat/tcp.c | 41 +++--
 1 file changed, 15 insertions(+), 26 deletions(-)

diff --git a/libavformat/tcp.c b/libavformat/tcp.c
index 1498c26fbe..7044d44f06 100644
--- a/libavformat/tcp.c
+++ b/libavformat/tcp.c
@@ -108,30 +108,28 @@ static int tcp_open(URLContext *h, const char *uri, int flags)
 
 cur_ai = ai;
 
- restart:
-fd = ff_socket(cur_ai->ai_family,
-   cur_ai->ai_socktype,
-   cur_ai->ai_protocol);
-if (fd < 0) {
-ret = ff_neterrno();
-goto fail;
-}
-
 if (s->listen) {
+while (cur_ai && fd < 0) {
+fd = ff_socket(cur_ai->ai_family,
+   cur_ai->ai_socktype,
+   cur_ai->ai_protocol);
+if (fd < 0) {
+ret = ff_neterrno();
+cur_ai = cur_ai->ai_next;
+}
+}
+if (fd < 0)
+goto fail1;
+
 if ((ret = ff_listen_bind(fd, cur_ai->ai_addr, cur_ai->ai_addrlen,
   s->listen_timeout, h)) < 0) {
 goto fail1;
 }
 fd = ret;
 } else {
-if ((ret = ff_listen_connect(fd, cur_ai->ai_addr, cur_ai->ai_addrlen,
- s->timeout, h, !!cur_ai->ai_next)) < 0) {
-
-if (ret == AVERROR_EXIT)
-goto fail1;
-else
-goto fail;
-}
+ret = ff_connect_parallel(ai, s->timeout, 3, h, &fd, NULL, NULL);
+if (ret < 0)
+goto fail1;
 }
 
 h->is_streamed = 1;
@@ -139,15 +137,6 @@ static int tcp_open(URLContext *h, const char *uri, int flags)
 freeaddrinfo(ai);
 return 0;
 
- fail:
-if (cur_ai->ai_next) {
-/* Retry with the next sockaddr */
-cur_ai = cur_ai->ai_next;
-if (fd >= 0)
-closesocket(fd);
-ret = 0;
-goto restart;
-}
  fail1:
 if (fd >= 0)
 closesocket(fd);
-- 
2.15.2 (Apple Git-101.1)


[libav-devel] [PATCH] tls_openssl: Fix checks for SSL_ERROR_WANT_WRITE in nonblocking operation

2018-08-16 Thread Martin Storsjö
This was a typo in 0671eb2346c, spotted by Chris Carroux.
---
 libavformat/tls_openssl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/tls_openssl.c b/libavformat/tls_openssl.c
index f0b325ae98..4a2fcfd771 100644
--- a/libavformat/tls_openssl.c
+++ b/libavformat/tls_openssl.c
@@ -112,7 +112,7 @@ static int print_tls_error(URLContext *h, int ret)
 TLSContext *c = h->priv_data;
 if (h->flags & AVIO_FLAG_NONBLOCK) {
 int err = SSL_get_error(c->ssl, ret);
-if (err == SSL_ERROR_WANT_READ || err == SSL_ERROR_WANT_READ)
+if (err == SSL_ERROR_WANT_READ || err == SSL_ERROR_WANT_WRITE)
 return AVERROR(EAGAIN);
 }
 av_log(h, AV_LOG_ERROR, "%s\n", ERR_error_string(ERR_get_error(), NULL));
-- 
2.15.2 (Apple Git-101.1)
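As a standalone illustration of the condition being fixed: in nonblocking operation both "want read" and "want write" mean "try again later", so both need to map to EAGAIN. This is a minimal sketch with hypothetical DEMO_* constants standing in for the OpenSSL SSL_ERROR_* codes, and plain negative errno values standing in for AVERROR(); it is not the code from tls_openssl.c.

```c
#include <errno.h>

/* Hypothetical stand-ins for the OpenSSL SSL_ERROR_* codes; real code
 * would take them from <openssl/ssl.h>. */
enum demo_ssl_error {
    DEMO_SSL_ERROR_NONE,
    DEMO_SSL_ERROR_SSL,
    DEMO_SSL_ERROR_WANT_READ,
    DEMO_SSL_ERROR_WANT_WRITE
};

/* Map a TLS-level error to a negative errno-style code for a
 * nonblocking caller: both WANT_* codes are retryable. */
static int demo_map_nonblocking_error(int err)
{
    if (err == DEMO_SSL_ERROR_WANT_READ || err == DEMO_SSL_ERROR_WANT_WRITE)
        return -EAGAIN;
    return -EIO;
}
```

With the typo (WANT_READ tested twice), the WANT_WRITE case would fall through to the generic error path instead of asking the caller to retry.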


[libav-devel] [PATCH 2/3] network: Use ff_neterrno instead of AVERROR(errno) for poll errors

2018-08-13 Thread Martin Storsjö
From: Simon Thelen 

This makes sure to pick up the actual error codes on Windows.
---
 libavformat/network.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavformat/network.c b/libavformat/network.c
index 86d79553f7..1e02668ecf 100644
--- a/libavformat/network.c
+++ b/libavformat/network.c
@@ -138,7 +138,7 @@ static int ff_poll_interrupt(struct pollfd *p, nfds_t nfds, int timeout,
 if (!ret)
 return AVERROR(ETIMEDOUT);
 if (ret < 0)
-return AVERROR(errno);
+return ff_neterrno();
 return ret;
 }
 
-- 
2.15.2 (Apple Git-101.1)
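The background, as a rough sketch: on Windows, Winsock reports socket errors through WSAGetLastError() rather than errno, so reading errno after a failed socket call can return a stale or unrelated value. The helper below is hypothetical (last_socket_error is not a libav function) and returns the raw positive platform code; the ff_neterrno() used in the patch returns an AVERROR-style value instead, which this sketch does not attempt to reproduce.

```c
#include <errno.h>
#ifdef _WIN32
#include <winsock2.h>
#endif

/* Hypothetical helper: fetch the last socket error as a positive
 * platform error code. */
static int last_socket_error(void)
{
#ifdef _WIN32
    /* Winsock keeps its own per-thread error state. */
    return WSAGetLastError();
#else
    /* POSIX socket calls report failures through errno. */
    return errno;
#endif
}
```

On a POSIX system this is just errno; the two branches only differ on Windows, which is the case the commit message refers to.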


[libav-devel] [PATCH 1/3] http: pass return code from http_open_cnx_internal() on its failure

2018-08-13 Thread Martin Storsjö
From: Andrey Utkin 

Previously, AVERROR(EIO) was always returned on failure of
http_open_cnx_internal(). Now its return value is passed on to the
caller, which makes it possible to distinguish ECONNREFUSED, ETIMEDOUT,
ENETUNREACH, etc.
---
 libavformat/http.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libavformat/http.c b/libavformat/http.c
index 80c87f786a..dfb95642c0 100644
--- a/libavformat/http.c
+++ b/libavformat/http.c
@@ -248,6 +248,8 @@ fail:
 if (s->hd)
 ffurl_close(s->hd);
 s->hd = NULL;
+if (location_changed < 0)
+return location_changed;
 return AVERROR(EIO);
 }
 
-- 
2.15.2 (Apple Git-101.1)
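The pattern in the hunk above, reduced to a standalone sketch: on the failure path, prefer the specific error from the inner call and only fall back to a generic I/O error when there is none. demo_open_failure_code is a hypothetical name, and plain negative errno values stand in for AVERROR().

```c
#include <errno.h>

/* Hypothetical helper: pick the error code to return from a failure
 * path, given the result of the inner open call. */
static int demo_open_failure_code(int inner_ret)
{
    /* A negative inner result already names the precise cause. */
    if (inner_ret < 0)
        return inner_ret;
    /* Otherwise only a generic failure can be reported. */
    return -EIO;
}
```

A caller can then tell a refused connection from a timeout, for example to decide whether a retry makes sense.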


[libav-devel] [PATCH 3/3] network: Check for EINTR in ff_poll_interrupt

2018-08-13 Thread Martin Storsjö
---
 libavformat/network.c | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/libavformat/network.c b/libavformat/network.c
index 1e02668ecf..24fcf20539 100644
--- a/libavformat/network.c
+++ b/libavformat/network.c
@@ -131,14 +131,17 @@ static int ff_poll_interrupt(struct pollfd *p, nfds_t nfds, int timeout,
 if (ff_check_interrupt(cb))
 return AVERROR_EXIT;
 ret = poll(p, nfds, POLLING_TIME);
-if (ret != 0)
+if (ret != 0) {
+if (ret < 0)
+ret = ff_neterrno();
+if (ret == AVERROR(EINTR))
+continue;
 break;
+}
 } while (timeout < 0 || runs-- > 0);
 
 if (!ret)
 return AVERROR(ETIMEDOUT);
-if (ret < 0)
-return ff_neterrno();
 return ret;
 }
 
-- 
2.15.2 (Apple Git-101.1)
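For readers who want to experiment with the retry behaviour outside of libav, here is a minimal POSIX sketch of a sliced poll loop that treats EINTR as "try again" rather than as a failure. The names poll_with_retry and DEMO_SLICE_MS are hypothetical, the interrupt callback of the real function is left out, and errors are returned as negative errno values instead of AVERROR codes.

```c
#include <errno.h>
#include <poll.h>

#define DEMO_SLICE_MS 100   /* hypothetical polling slice */

/* Poll in short slices. Returns > 0 when descriptors are ready,
 * -ETIMEDOUT when timeout_ms elapsed, or another negative errno value.
 * A negative timeout_ms means wait forever. */
static int poll_with_retry(struct pollfd *fds, nfds_t nfds, int timeout_ms)
{
    int remaining = timeout_ms;

    for (;;) {
        int slice = (timeout_ms < 0 || remaining > DEMO_SLICE_MS)
                        ? DEMO_SLICE_MS : remaining;
        int ret = poll(fds, nfds, slice);

        if (ret > 0)
            return ret;
        /* A signal interrupted the call: not a real error, retry. */
        if (ret < 0 && errno != EINTR)
            return -errno;
        /* The slice elapsed or was interrupted; count it as used. */
        if (timeout_ms >= 0) {
            remaining -= slice;
            if (remaining <= 0)
                return -ETIMEDOUT;
        }
    }
}
```

Counting an interrupted slice as used keeps the total wait bounded even if signals arrive continuously, at the cost of timing out slightly early in that case.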


[libav-devel] [PATCH] libopenh264: Add support for decoding of b-frames

2018-07-31 Thread Martin Storsjö
The current git master version of libopenh264 supports decoding of
b-frames.
---
 libavcodec/libopenh264dec.c | 19 ++-
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/libavcodec/libopenh264dec.c b/libavcodec/libopenh264dec.c
index cdb8d527cf..5990a72ff9 100644
--- a/libavcodec/libopenh264dec.c
+++ b/libavcodec/libopenh264dec.c
@@ -96,7 +96,18 @@ static int svc_decode_frame(AVCodecContext *avctx, void *data,
 AVFrame *avframe = data;
 DECODING_STATE state;
 
-state = (*s->decoder)->DecodeFrame2(s->decoder, avpkt->data, avpkt->size, ptrs, &info);
+if (!avpkt->data) {
+#if OPENH264_VER_AT_LEAST(1, 9)
+int end_of_stream = 1;
+(*s->decoder)->SetOption(s->decoder, DECODER_OPTION_END_OF_STREAM, &end_of_stream);
+state = (*s->decoder)->FlushFrame(s->decoder, ptrs, &info);
+#else
+return 0;
+#endif
+} else {
+info.uiInBsTimeStamp = avpkt->pts;
+state = (*s->decoder)->DecodeFrame2(s->decoder, avpkt->data, avpkt->size, ptrs, &info);
+}
+}
 if (state != dsErrorFree) {
 av_log(avctx, AV_LOG_ERROR, "DecodeFrame2 failed\n");
 return AVERROR_UNKNOWN;
@@ -118,8 +129,8 @@ static int svc_decode_frame(AVCodecContext *avctx, void *data,
 linesize[1] = linesize[2] = info.UsrData.sSystemBuffer.iStride[1];
 av_image_copy(avframe->data, avframe->linesize, (const uint8_t **) ptrs, linesize, avctx->pix_fmt, avctx->width, avctx->height);
 
-avframe->pts = avpkt->pts;
-avframe->pkt_dts = avpkt->dts;
+avframe->pts = info.uiOutYuvTimeStamp;
+avframe->pkt_dts = AV_NOPTS_VALUE;
 #if FF_API_PKT_PTS
 FF_DISABLE_DEPRECATION_WARNINGS
 avframe->pkt_pts = avpkt->pts;
@@ -139,8 +150,6 @@ AVCodec ff_libopenh264_decoder = {
 .init   = svc_decode_init,
 .decode = svc_decode_frame,
 .close  = svc_decode_close,
-// The decoder doesn't currently support B-frames, and the decoder's API
-// doesn't support reordering/delay, but the BSF could incur delay.
 .capabilities   = AV_CODEC_CAP_DELAY | AV_CODEC_CAP_DR1,
 .caps_internal  = FF_CODEC_CAP_SETS_PKT_DTS | FF_CODEC_CAP_INIT_THREADSAFE |
   FF_CODEC_CAP_INIT_CLEANUP,
-- 
2.15.2 (Apple Git-101.1)


[libav-devel] [PATCH] avconv: make sure packets put into the muxing FIFO are refcounted

2018-07-31 Thread Martin Storsjö
From: wm4 

Some callers (like do_subtitle_out(), or do_streamcopy()) call this
with an AVPacket that is not refcounted. This can cause undefined
behavior.

Calling av_packet_move_ref() does not make a packet refcounted if it
isn't yet. (And it can't be made to, because it always succeeds,
and can't return ENOMEM.)

Call av_packet_ref() instead to make sure it's refcounted.
---
 avtools/avconv.c | 7 +--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/avtools/avconv.c b/avtools/avconv.c
index ac15464a8d..3abb7f872f 100644
--- a/avtools/avconv.c
+++ b/avtools/avconv.c
@@ -281,7 +281,7 @@ static void write_packet(OutputFile *of, AVPacket *pkt, OutputStream *ost)
 int ret;
 
 if (!of->header_written) {
-AVPacket tmp_pkt;
+AVPacket tmp_pkt = {0};
 /* the muxer is not initialized yet, buffer the packet */
 if (!av_fifo_space(ost->muxing_queue)) {
 int new_size = FFMIN(2 * av_fifo_size(ost->muxing_queue),
@@ -296,8 +296,11 @@ static void write_packet(OutputFile *of, AVPacket *pkt, OutputStream *ost)
 if (ret < 0)
 exit_program(1);
 }
-av_packet_move_ref(&tmp_pkt, pkt);
+ret = av_packet_ref(&tmp_pkt, pkt);
+if (ret < 0)
+exit_program(1);
 av_fifo_generic_write(ost->muxing_queue, &tmp_pkt, sizeof(tmp_pkt), NULL);
+av_packet_unref(pkt);
 return;
 }
 
-- 
2.15.2 (Apple Git-101.1)


[libav-devel] [PATCH] libfdk-aac: Use enum names instead of literal numbers for the output format

2018-07-05 Thread Martin Storsjö
---
 libavcodec/libfdk-aacenc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/libavcodec/libfdk-aacenc.c b/libavcodec/libfdk-aacenc.c
index 26dfb6dc0b..c340a1e3e0 100644
--- a/libavcodec/libfdk-aacenc.c
+++ b/libavcodec/libfdk-aacenc.c
@@ -227,7 +227,8 @@ static av_cold int aac_encode_init(AVCodecContext *avctx)
 /* Choose bitstream format - if global header is requested, use
  * raw access units, otherwise use ADTS. */
 if ((err = aacEncoder_SetParam(s->handle, AACENC_TRANSMUX,
-   avctx->flags & AV_CODEC_FLAG_GLOBAL_HEADER ? 0 : s->latm ? 10 : 2)) != AACENC_OK) {
+   avctx->flags & AV_CODEC_FLAG_GLOBAL_HEADER ? TT_MP4_RAW :
+   s->latm ? TT_MP4_LOAS : TT_MP4_ADTS)) != AACENC_OK) {
 av_log(avctx, AV_LOG_ERROR, "Unable to set the transmux format: %s\n",
aac_get_error(err));
 goto error;
-- 
2.15.2 (Apple Git-101.1)
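The selection logic reads more easily when spelled out as a small function. This is a sketch only: the DEMO_* enum is a hypothetical mirror of the three transport types, with the numeric values taken from the literals that the patch replaces (0, 2 and 10); real code would use the names from the fdk-aac headers directly.

```c
/* Hypothetical mirror of the transport types used in the patch. */
enum demo_transport {
    DEMO_TT_MP4_RAW  = 0,   /* raw access units */
    DEMO_TT_MP4_ADTS = 2,
    DEMO_TT_MP4_LOAS = 10
};

/* Raw access units when a global header is requested, otherwise LOAS
 * if LATM output was asked for, otherwise ADTS. */
static enum demo_transport demo_pick_transport(int global_header, int latm)
{
    if (global_header)
        return DEMO_TT_MP4_RAW;
    return latm ? DEMO_TT_MP4_LOAS : DEMO_TT_MP4_ADTS;
}
```

The behaviour is the same as with the literal numbers; only the intent is easier to read.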


Re: [libav-devel] [PATCH] configure: fix inline asm checks

2018-06-25 Thread Martin Storsjö

On Fri, 8 Jun 2018, Diego Biurrun wrote:


On Thu, Jun 07, 2018 at 11:05:26PM -0300, James Almer wrote:

On 6/7/2018 6:01 PM, Diego Biurrun wrote:
> On Thu, Jun 07, 2018 at 03:03:21PM +0300, Martin Storsjö wrote:
>> Commit 8c893aa3cd5 removed quotes that were required to detect
>> inline asm:
>>
>> check_insn armv5te qadd r0, r0, r0
>> .../test.c:1:34: error: expected string literal in 'asm'
>> void foo(void){ __asm__ volatile(qadd r0, r0, r0); }
>>
>> The correct code is:
>>
>> void foo(void){ __asm__ volatile("qadd r0, r0, r0"); }
>> --- a/configure
>> +++ b/configure
>> @@ -866,7 +866,7 @@ EOF
>>  check_insn(){
>>  log check_insn "$@"
>> -check_inline_asm ${1}_inline "$2"
>> +check_inline_asm ${1}_inline "\"$2\""
>>  check_as ${1}_external "$2"
>>  }
> 
> This does not look like the correct fix to me. The required quotes
> should be part of the convenience function instead. Notice how calls
> to check_insn and check_inline_asm differ in the way they quote their
> arguments. There should be no need for this inconsistency.
> 
> I'll look into it.


Changing all the calls from check_insn name 'insn' to check_insn name
'"insn"' would probably fix the check_inline_asm tests, but may break
the check_as tests.


Complicating the function calls is not the right way to go. The helper
function should take care of the required quoting and not rely on the
callers to pass arguments in nested quotes.


Ping; whoever is waiting for the other, please pick the thread up again.

// Martin

[libav-devel] [PATCH] configure: fix inline asm checks

2018-06-07 Thread Martin Storsjö
From: John Cox 

Commit 8c893aa3cd5 removed quotes that were required to detect
inline asm:

check_insn armv5te qadd r0, r0, r0
.../test.c:1:34: error: expected string literal in 'asm'
void foo(void){ __asm__ volatile(qadd r0, r0, r0); }

The correct code is:

void foo(void){ __asm__ volatile("qadd r0, r0, r0"); }

Commit message written by Frank Liberato 
---
 configure | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/configure b/configure
index 5e79c0cec1..48e8536b07 100755
--- a/configure
+++ b/configure
@@ -866,7 +866,7 @@ EOF
 
 check_insn(){
 log check_insn "$@"
-check_inline_asm ${1}_inline "$2"
+check_inline_asm ${1}_inline "\"$2\""
 check_as ${1}_external "$2"
 }
 
-- 
2.15.1 (Apple Git-101)
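For reference, this is the shape of C that such a check has to produce: the template passed to __asm__ must be a string literal, which is why the instruction text needs to arrive quoted. The sketch below assumes GCC or Clang extended asm syntax (it will not build with MSVC) and uses an empty template so that it compiles on any target; demo_inline_asm is a hypothetical name and not part of configure's test program.

```c
/* Assumes GCC/Clang extended inline asm. */
static int demo_inline_asm(void)
{
    int value = 41;

    /* The first argument must be a string literal. An empty template
     * with a "+r" operand emits no instruction; it only forces value
     * through a register. */
    __asm__ volatile("" : "+r"(value));

    return value + 1;
}
```

Dropping the quotes around the template gives the same kind of "expected string literal in 'asm'" error as quoted in the commit message.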


Re: [libav-devel] [PATCH] random_seed: use bcrypt instead of the old wincrypt API

2018-04-19 Thread Martin Storsjö

On Tue, 17 Apr 2018, Diego Biurrun wrote:


On Mon, Apr 16, 2018 at 05:50:04PM +0300, Martin Storsjö wrote:

From: Steve Lhomme 

Remove the wincrypt API calls since we don't support XP anymore and
bcrypt is available since Vista, even on Windows Store builds.
---
Now with avutil_extralibs sorted alphabetically, and James' extended
configure check included.

--- a/configure
+++ b/configure
@@ -4579,9 +4579,10 @@ check_header windows.h

+check_lib bcrypt   "windows.h bcrypt.h"   BCryptGenRandom  -lbcrypt &&
+check_cpp_condition bcrypt bcrypt.h "defined BCRYPT_RNG_ALGORITHM"


This is a workaround for an old, already-obsolete version of mingw64.
Before this shows up in a release it will be even more obsolete.  IMO
such workarounds are not worth the trouble; let the breakage occur where
the actual bugs are and do the fixes at the root. I consider that the
saner longterm strategy. Your call; push whichever version you prefer.


If such versions are nevertheless common (which James' fix would
indicate), or even just not uncommon, I'd prefer to include the extended
configure check. I wouldn't want to rule out building with a
less-than-newest version of mingw-w64.


Thus pushed with the extra check.

// Martin

[libav-devel] [PATCH] random_seed: use bcrypt instead of the old wincrypt API

2018-04-16 Thread Martin Storsjö
From: Steve Lhomme 

Remove the wincrypt API calls since we don't support XP anymore and
bcrypt is available since Vista, even on Windows Store builds.
---
Now with avutil_extralibs sorted alphabetically, and James' extended
configure check included.
---
 configure   |  7 ---
 libavutil/random_seed.c | 19 ++-
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/configure b/configure
index 3c7b6a0981..465fdcfb6d 100755
--- a/configure
+++ b/configure
@@ -1703,12 +1703,12 @@ SYSTEM_FUNCS="
 "
 
 SYSTEM_LIBRARIES="
+bcrypt
 sdl
 vaapi_1
 vaapi_drm
 vaapi_x11
 vdpau_x11
-wincrypt
 "
 
 TOOLCHAIN_FEATURES="
@@ -2610,7 +2610,7 @@ avdevice_extralibs="libm_extralibs"
 avformat_extralibs="libm_extralibs"
 avfilter_extralibs="pthreads_extralibs libm_extralibs"
 avresample_extralibs="libm_extralibs"
-avutil_extralibs="clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs"
+avutil_extralibs="bcrypt_extralibs clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs"
 swscale_extralibs="libm_extralibs"
 
 # programs
@@ -4579,9 +4579,10 @@ check_header windows.h
 # so we also check that atomics actually work here
 check_builtin stdatomic stdatomic.h "atomic_int foo; atomic_store(&foo, 0)"
 
+check_lib bcrypt   "windows.h bcrypt.h"   BCryptGenRandom  -lbcrypt &&
+check_cpp_condition bcrypt bcrypt.h "defined BCRYPT_RNG_ALGORITHM"
 check_lib ole32    "windows.h"            CoTaskMemFree        -lole32
 check_lib shell32  "windows.h shellapi.h" CommandLineToArgvW   -lshell32
-check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom       -ladvapi32
 check_lib psapi    "windows.h psapi.h"    GetProcessMemoryInfo -lpsapi
 
 check_struct "sys/time.h sys/resource.h" "struct rusage" ru_maxrss
diff --git a/libavutil/random_seed.c b/libavutil/random_seed.c
index 089d883916..388cb401ba 100644
--- a/libavutil/random_seed.c
+++ b/libavutil/random_seed.c
@@ -23,9 +23,9 @@
 #if HAVE_UNISTD_H
 #include <unistd.h>
 #endif
-#if HAVE_WINCRYPT
+#if HAVE_BCRYPT
 #include <windows.h>
-#include <wincrypt.h>
+#include <bcrypt.h>
 #endif
 #include 
 #include 
@@ -96,13 +96,14 @@ uint32_t av_get_random_seed(void)
 {
 uint32_t seed;
 
-#if HAVE_WINCRYPT
-HCRYPTPROV provider;
-if (CryptAcquireContext(&provider, NULL, NULL, PROV_RSA_FULL,
-CRYPT_VERIFYCONTEXT | CRYPT_SILENT)) {
-BOOL ret = CryptGenRandom(provider, sizeof(seed), (PBYTE) &seed);
-CryptReleaseContext(provider, 0);
-if (ret)
+#if HAVE_BCRYPT
+BCRYPT_ALG_HANDLE algo_handle;
+NTSTATUS ret = BCryptOpenAlgorithmProvider(&algo_handle, BCRYPT_RNG_ALGORITHM,
+   MS_PRIMITIVE_PROVIDER, 0);
+if (BCRYPT_SUCCESS(ret)) {
+NTSTATUS ret = BCryptGenRandom(algo_handle, (UCHAR*)&seed, sizeof(seed), 0);
+BCryptCloseAlgorithmProvider(algo_handle, 0);
+if (BCRYPT_SUCCESS(ret))
 return seed;
 }
 #endif
-- 
2.15.1 (Apple Git-101)


[libav-devel] [PATCH] random_seed: use bcrypt instead of the old wincrypt API

2018-04-15 Thread Martin Storsjö
From: Steve Lhomme 

Remove the wincrypt API calls since we don't support XP anymore and
bcrypt is available since Vista, even on Windows Store builds.
---
 configure   |  6 +++---
 libavutil/random_seed.c | 19 ++-
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/configure b/configure
index 3c7b6a0981..0eba9b24f3 100755
--- a/configure
+++ b/configure
@@ -1703,12 +1703,12 @@ SYSTEM_FUNCS="
 "
 
 SYSTEM_LIBRARIES="
+bcrypt
 sdl
 vaapi_1
 vaapi_drm
 vaapi_x11
 vdpau_x11
-wincrypt
 "
 
 TOOLCHAIN_FEATURES="
@@ -2610,7 +2610,7 @@ avdevice_extralibs="libm_extralibs"
 avformat_extralibs="libm_extralibs"
 avfilter_extralibs="pthreads_extralibs libm_extralibs"
 avresample_extralibs="libm_extralibs"
-avutil_extralibs="clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs"
+avutil_extralibs="clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs bcrypt_extralibs"
 swscale_extralibs="libm_extralibs"
 
 # programs
@@ -4579,9 +4579,9 @@ check_header windows.h
 # so we also check that atomics actually work here
 check_builtin stdatomic stdatomic.h "atomic_int foo; atomic_store(&foo, 0)"
 
+check_lib bcrypt   "windows.h bcrypt.h"   BCryptGenRandom  -lbcrypt
 check_lib ole32    "windows.h"            CoTaskMemFree        -lole32
 check_lib shell32  "windows.h shellapi.h" CommandLineToArgvW   -lshell32
-check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom       -ladvapi32
 check_lib psapi    "windows.h psapi.h"    GetProcessMemoryInfo -lpsapi
 
 check_struct "sys/time.h sys/resource.h" "struct rusage" ru_maxrss
diff --git a/libavutil/random_seed.c b/libavutil/random_seed.c
index 089d883916..388cb401ba 100644
--- a/libavutil/random_seed.c
+++ b/libavutil/random_seed.c
@@ -23,9 +23,9 @@
 #if HAVE_UNISTD_H
 #include <unistd.h>
 #endif
-#if HAVE_WINCRYPT
+#if HAVE_BCRYPT
 #include <windows.h>
-#include <wincrypt.h>
+#include <bcrypt.h>
 #endif
 #include 
 #include 
@@ -96,13 +96,14 @@ uint32_t av_get_random_seed(void)
 {
 uint32_t seed;
 
-#if HAVE_WINCRYPT
-HCRYPTPROV provider;
-if (CryptAcquireContext(&provider, NULL, NULL, PROV_RSA_FULL,
-CRYPT_VERIFYCONTEXT | CRYPT_SILENT)) {
-BOOL ret = CryptGenRandom(provider, sizeof(seed), (PBYTE) &seed);
-CryptReleaseContext(provider, 0);
-if (ret)
+#if HAVE_BCRYPT
+BCRYPT_ALG_HANDLE algo_handle;
+NTSTATUS ret = BCryptOpenAlgorithmProvider(&algo_handle, BCRYPT_RNG_ALGORITHM,
+   MS_PRIMITIVE_PROVIDER, 0);
+if (BCRYPT_SUCCESS(ret)) {
+NTSTATUS ret = BCryptGenRandom(algo_handle, (UCHAR*)&seed, sizeof(seed), 0);
+BCryptCloseAlgorithmProvider(algo_handle, 0);
+if (BCRYPT_SUCCESS(ret))
 return seed;
 }
 #endif
-- 
2.15.1 (Apple Git-101)


Re: [libav-devel] [PATCH] [v2] use bcrypt instead of the old wincrypt API

2018-04-15 Thread Martin Storsjö

On Sun, 15 Apr 2018, Martin Storsjö wrote:


On Tue, 3 Apr 2018, Steve Lhomme wrote:


When targeting Windows Vista and above, the wincrypt API is deprecated
and not allowed for Windows Store apps.

Wincrypt can be removed after XP support is dropped.
---
configure   |  4 +++-
libavutil/random_seed.c | 17 +++--
2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index 77754d0f51..0ab975bb1c 100755
--- a/configure
+++ b/configure
@@ -1703,6 +1703,7 @@ SYSTEM_FUNCS="
"

SYSTEM_LIBRARIES="
+bcrypt
sdl
vaapi_1
vaapi_drm
@@ -2611,7 +2612,7 @@ avdevice_extralibs="libm_extralibs"
avformat_extralibs="libm_extralibs"
avfilter_extralibs="pthreads_extralibs libm_extralibs"
avresample_extralibs="libm_extralibs"
-avutil_extralibs="clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs"
+avutil_extralibs="bcrypt_extralibs clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs"
swscale_extralibs="libm_extralibs"

# programs
@@ -4581,6 +4582,7 @@ check_lib ole32    "windows.h"            CoTaskMemFree        -lole32
check_lib shell32  "windows.h shellapi.h" CommandLineToArgvW   -lshell32
check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom       -ladvapi32
check_lib psapi    "windows.h psapi.h"    GetProcessMemoryInfo -lpsapi
+test_cpp_condition windows.h "_WIN32_WINNT >= 0x0600" && check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom  -lbcrypt


check_struct "sys/time.h sys/resource.h" "struct rusage" ru_maxrss

diff --git a/libavutil/random_seed.c b/libavutil/random_seed.c
index 089d883916..d11bff2ef6 100644
--- a/libavutil/random_seed.c
+++ b/libavutil/random_seed.c
@@ -23,7 +23,10 @@
#if HAVE_UNISTD_H
#include <unistd.h>
#endif
-#if HAVE_WINCRYPT
+#if HAVE_BCRYPT
+#include <windows.h>
+#include <bcrypt.h>
+#elif HAVE_WINCRYPT
#include <windows.h>
#include <wincrypt.h>
#endif
@@ -96,7 +99,17 @@ uint32_t av_get_random_seed(void)
{
uint32_t seed;

-#if HAVE_WINCRYPT
+#if HAVE_BCRYPT
+BCRYPT_ALG_HANDLE algo_handle;
+NTSTATUS ret = BCryptOpenAlgorithmProvider(&algo_handle, BCRYPT_RNG_ALGORITHM,
+   MS_PRIMITIVE_PROVIDER, 0);
+if (BCRYPT_SUCCESS(ret)) {
+NTSTATUS ret = BCryptGenRandom(algo_handle, (UCHAR*)&seed, sizeof(seed), 0);
+BCryptCloseAlgorithmProvider(algo_handle, 0);
+if (BCRYPT_SUCCESS(ret))
+return seed;
+}
+#elif HAVE_WINCRYPT
HCRYPTPROV provider;
if (CryptAcquireContext(&provider, NULL, NULL, PROV_RSA_FULL,
CRYPT_VERIFYCONTEXT | CRYPT_SILENT)) {
--
2.16.2


This is ok with me and I can push it (perhaps with removing the check for 
_WIN32_WINNT >= 0x600). I guess removing wincrypt can be left as a 
separate later patch?


Since the version pushed to ffmpeg removed wincrypt at the same time,
I'd prefer to do the same here. I'll send a version of the patch in
that form, and push it a day later unless there's anything further to
change.


// Martin

Re: [libav-devel] [PATCH] [v2] use bcrypt instead of the old wincrypt API

2018-04-14 Thread Martin Storsjö

On Tue, 3 Apr 2018, Steve Lhomme wrote:


When targeting Windows Vista and above, the wincrypt API is deprecated
and not allowed for Windows Store apps.

Wincrypt can be removed after XP support is dropped.
---
configure   |  4 +++-
libavutil/random_seed.c | 17 +++--
2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/configure b/configure
index 77754d0f51..0ab975bb1c 100755
--- a/configure
+++ b/configure
@@ -1703,6 +1703,7 @@ SYSTEM_FUNCS="
"

SYSTEM_LIBRARIES="
+bcrypt
sdl
vaapi_1
vaapi_drm
@@ -2611,7 +2612,7 @@ avdevice_extralibs="libm_extralibs"
avformat_extralibs="libm_extralibs"
avfilter_extralibs="pthreads_extralibs libm_extralibs"
avresample_extralibs="libm_extralibs"
-avutil_extralibs="clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs"
+avutil_extralibs="bcrypt_extralibs clock_gettime_extralibs cuda_extralibs cuvid_extralibs d3d11va_extralibs libm_extralibs libmfx_extralibs nanosleep_extralibs pthreads_extralibs user32_extralibs vaapi_extralibs vaapi_drm_extralibs vaapi_x11_extralibs vdpau_x11_extralibs wincrypt_extralibs"
swscale_extralibs="libm_extralibs"

# programs
@@ -4581,6 +4582,7 @@ check_lib ole32    "windows.h"            CoTaskMemFree        -lole32
check_lib shell32  "windows.h shellapi.h" CommandLineToArgvW   -lshell32
check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom       -ladvapi32
check_lib psapi    "windows.h psapi.h"    GetProcessMemoryInfo -lpsapi
+test_cpp_condition windows.h "_WIN32_WINNT >= 0x0600" && check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom  -lbcrypt

check_struct "sys/time.h sys/resource.h" "struct rusage" ru_maxrss

diff --git a/libavutil/random_seed.c b/libavutil/random_seed.c
index 089d883916..d11bff2ef6 100644
--- a/libavutil/random_seed.c
+++ b/libavutil/random_seed.c
@@ -23,7 +23,10 @@
#if HAVE_UNISTD_H
#include <unistd.h>
#endif
-#if HAVE_WINCRYPT
+#if HAVE_BCRYPT
+#include <windows.h>
+#include <bcrypt.h>
+#elif HAVE_WINCRYPT
#include <windows.h>
#include <wincrypt.h>
#endif
@@ -96,7 +99,17 @@ uint32_t av_get_random_seed(void)
{
uint32_t seed;

-#if HAVE_WINCRYPT
+#if HAVE_BCRYPT
+BCRYPT_ALG_HANDLE algo_handle;
+NTSTATUS ret = BCryptOpenAlgorithmProvider(&algo_handle, BCRYPT_RNG_ALGORITHM,
+   MS_PRIMITIVE_PROVIDER, 0);
+if (BCRYPT_SUCCESS(ret)) {
+NTSTATUS ret = BCryptGenRandom(algo_handle, (UCHAR*)&seed, sizeof(seed), 0);
+BCryptCloseAlgorithmProvider(algo_handle, 0);
+if (BCRYPT_SUCCESS(ret))
+return seed;
+}
+#elif HAVE_WINCRYPT
HCRYPTPROV provider;
if (CryptAcquireContext(&provider, NULL, NULL, PROV_RSA_FULL,
CRYPT_VERIFYCONTEXT | CRYPT_SILENT)) {
--
2.16.2


This is ok with me and I can push it (perhaps with removing the check for 
_WIN32_WINNT >= 0x600). I guess removing wincrypt can be left as a 
separate later patch?


// Martin

[libav-devel] [PATCH] x86: Don't declare a non-static function as inline

2018-04-14 Thread Martin Storsjö
This fixes building with clang in msvc mode, which does support
gcc style inline assembly.
---
 libavcodec/x86/xvididct_sse2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/x86/xvididct_sse2.c b/libavcodec/x86/xvididct_sse2.c
index f318e95..0de59a5 100644
--- a/libavcodec/x86/xvididct_sse2.c
+++ b/libavcodec/x86/xvididct_sse2.c
@@ -342,7 +342,7 @@ DECLARE_ASM_CONST(16, int32_t, walkenIdctRounders)[] = {
 "movdqa   %%xmm6, 4*16("dct") \n\t" \
 "movdqa   "SREG2", 7*16("dct")\n\t"
 
-inline void ff_xvid_idct_sse2(short *block)
+void ff_xvid_idct_sse2(short *block)
 {
 __asm__ volatile (
 "movq "MANGLE (m127) ", %%mm0  \n\t"
-- 
2.7.4


Re: [libav-devel] [PATCH] x86: Don't declare a non-static function as inline

2018-04-14 Thread Martin Storsjö

On Sat, 14 Apr 2018, Diego Biurrun wrote:


On Sat, Apr 14, 2018 at 01:38:30PM +0300, Martin Storsjö wrote:

Make the actual implementation static inline, but add a non-static
non-inline frontend for it.

This fixes building with clang in msvc mode, which does support
gcc style inline assembly.
--- a/libavcodec/x86/xvididct_sse2.c
+++ b/libavcodec/x86/xvididct_sse2.c
@@ -342,7 +342,7 @@ DECLARE_ASM_CONST(16, int32_t, walkenIdctRounders)[] = {

-inline void ff_xvid_idct_sse2(short *block)
+static inline void xvid_idct_sse2(short *block)
 {
 __asm__ volatile (
 "movq "MANGLE (m127) ", %%mm0  \n\t"
@@ -390,15 +390,20 @@ inline void ff_xvid_idct_sse2(short *block)

+void ff_xvid_idct_sse2(short *block)
+{
+xvid_idct_sse2(block);
+}


Why not simply drop the inline and be done with it? I notice that the MMX
version of this does not have the inline keyword.


That's probably just as good.
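
For illustration, the first version of the patch has roughly the shape
sketched below. The names ex_idct_impl, ff_ex_idct and ff_ex_idct_put are
hypothetical, and the loop is only a placeholder for the real inline-assembly
body; the point is which function carries "inline" and which one has plain
external linkage.

```c
/* Placeholder for the real IDCT body (inline assembly in the actual file).
 * static inline: internal linkage, so the compiler may inline it freely. */
static inline void ex_idct_impl(short *block)
{
    for (int i = 0; i < 64; i++)
        block[i] = (short)(block[i] * 2);
}

/* Exported frontend: an ordinary external definition, no inline keyword. */
void ff_ex_idct(short *block)
{
    ex_idct_impl(block);
}

/* Other functions in the same file call the static helper directly. */
void ff_ex_idct_put(short *dest, short *block)
{
    ex_idct_impl(block);
    for (int i = 0; i < 64; i++)
        dest[i] = block[i];
}
```

The simpler alternative suggested above amounts to deleting the inline
keyword from the exported function and leaving everything else untouched.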

// Martin

[libav-devel] [PATCH] x86: Don't declare a non-static function as inline

2018-04-14 Thread Martin Storsjö
Make the actual implementation static inline, but add a non-static
non-inline frontend for it.

This fixes building with clang in msvc mode, which does support
gcc style inline assembly.
---
 libavcodec/x86/xvididct_sse2.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/libavcodec/x86/xvididct_sse2.c b/libavcodec/x86/xvididct_sse2.c
index f318e95..00ed803 100644
--- a/libavcodec/x86/xvididct_sse2.c
+++ b/libavcodec/x86/xvididct_sse2.c
@@ -342,7 +342,7 @@ DECLARE_ASM_CONST(16, int32_t, walkenIdctRounders)[] = {
 "movdqa   %%xmm6, 4*16("dct") \n\t" \
 "movdqa   "SREG2", 7*16("dct")\n\t"
 
-inline void ff_xvid_idct_sse2(short *block)
+static inline void xvid_idct_sse2(short *block)
 {
 __asm__ volatile (
 "movq "MANGLE (m127) ", %%mm0  \n\t"
@@ -390,15 +390,20 @@ inline void ff_xvid_idct_sse2(short *block)
   "%eax", "%ecx", "%edx", "%esi", "memory");
 }
 
+void ff_xvid_idct_sse2(short *block)
+{
+xvid_idct_sse2(block);
+}
+
 void ff_xvid_idct_sse2_put(uint8_t *dest, ptrdiff_t line_size, short *block)
 {
-ff_xvid_idct_sse2(block);
+xvid_idct_sse2(block);
 ff_put_pixels_clamped_mmx(block, dest, line_size);
 }
 
 void ff_xvid_idct_sse2_add(uint8_t *dest, ptrdiff_t line_size, short *block)
 {
-ff_xvid_idct_sse2(block);
+xvid_idct_sse2(block);
 ff_add_pixels_clamped_mmx(block, dest, line_size);
 }
 
-- 
2.7.4


Re: [libav-devel] [PATCH 2/2] Drop Windows XP support remnants

2018-04-05 Thread Martin Storsjö

On Thu, 5 Apr 2018, Diego Biurrun wrote:


---
libavcodec/dxva2_h264.c   | 6 +-
libavcodec/dxva2_hevc.c   | 6 +-
libavcodec/dxva2_mpeg2.c  | 7 ++-
libavcodec/dxva2_vc1.c| 6 +-
libavutil/hwcontext_d3d11va.c | 9 +
libavutil/hwcontext_dxva2.c   | 4 
6 files changed, 6 insertions(+), 32 deletions(-)

diff --git a/libavcodec/dxva2_h264.c b/libavcodec/dxva2_h264.c
index 50e7863bf2..790e4a214b 100644
--- a/libavcodec/dxva2_h264.c
+++ b/libavcodec/dxva2_h264.c
@@ -20,16 +20,12 @@
 * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
 */

+#include "dxva2_internal.h"
#include "h264dec.h"
#include "h264data.h"
#include "h264_ps.h"
#include "mpegutils.h"

-// The headers above may include w32threads.h, which uses the original
-// _WIN32_WINNT define, while dxva2_internal.h redefines it to target a
-// potentially newer version.
-#include "dxva2_internal.h"


Well technically, this hasn't changed - dxva2_internal.h includes dxva2.h 
which still redefines _WIN32_WINNT.


It just sets it to 0x0602, while the lowest it'll be here from before is 
0x0600 and the difference shouldn't matter for e.g. w32threads.h.


The patch probably is fine though, but reading the patch made me grep the 
source to see what the actual case was.


// Martin

Re: [libav-devel] [PATCH 1/2] w32pthreads: always use Vista+ API, drop XP support

2018-04-05 Thread Martin Storsjö

On Thu, 5 Apr 2018, Diego Biurrun wrote:


From: wm4 

This removes the XP compatibility code, and switches entirely to SRW
locks, which are available starting at Windows Vista.

This removes CRITICAL_SECTION use, which allows us to add
PTHREAD_MUTEX_INITIALIZER, which will be useful later.

Windows XP is hereby not a supported build target anymore.

Signed-off-by: Diego Biurrun 
---

Changes to original patch:
- proper w32threads dependencies
- added missing Cygwin flags

Changelog  |   2 +
compat/w32pthreads.h   | 269 ++---
configure  |  19 ++--
libavcodec/pthread_frame.c |   4 -
libavcodec/pthread_slice.c |   4 -
libavfilter/pthread.c  |   4 -
6 files changed, 23 insertions(+), 279 deletions(-)


Looks ok, haven't tested it myself yet.

// Martin

Re: [libav-devel] [PATCH] use bcrypt instead of the old wincrypt API

2018-03-30 Thread Martin Storsjö

On Fri, 30 Mar 2018, James Almer wrote:


On 3/30/2018 3:13 PM, Martin Storsjö wrote:

On Fri, 30 Mar 2018, James Almer wrote:


On 3/30/2018 10:57 AM, Martin Storsjö wrote:

On Fri, 30 Mar 2018, Diego Biurrun wrote:


On Fri, Mar 30, 2018 at 10:43:27AM -0300, James Almer wrote:

On 3/30/2018 10:38 AM, Diego Biurrun wrote:
> On Fri, Mar 30, 2018 at 12:38:05PM +0200, Steve Lhomme wrote:
>> Le 30/03/2018 à 10:46, Diego Biurrun a écrit :
>>> On Fri, Mar 30, 2018 at 09:36:05AM +0200, Steve Lhomme wrote:
>>>> --- a/configure
>>>> +++ b/configure
>>>> @@ -4581,6 +4582,7 @@ check_lib ole32    "windows.h"   
CoTaskMemFree    -lole32
>>>>   check_lib shell32  "windows.h shellapi.h" CommandLineToArgvW  
-lshell32
>>>>   check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom  
-ladvapi32
>>>>   check_lib psapi    "windows.h psapi.h"    GetProcessMemoryInfo
-lpsapi
>>>> +check_cpp_condition Vista+ windows.h "_WIN32_WINNT >= 0x0600"
&& check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom  -lbcrypt

If you don't need to set any variable then just use
test_cpp_condition()


Yes, good point.


>>> Do you really need to check the Vista condition? What about using
bcrypt
>>> unconditionally if available?
>>
>> Yes, you need to use it only on builds that won't run on XP.
Otherwise it
>> will fail to load the bcrypt.dll and the whole libavutil DLL (or
whatever
>> its form) will fail to load. It would be possible to do it
dynamically but
>> IMO it's overkill. It's not really a critical component.
> > Is bcrypt available on XP? If no then the CPP condition check
would seem
> unnecessary. You could just check for bcrypt and bcrypt being
available
> would imply Vista. I think I'm missing something.

check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom  -lbcrypt

Seems to succeed even if targeting XP, at least on mingw-w64.


Isn't that wrong then?


I guess it just means that mingw-w64 doesn't have _WIN32_WINNT ifdefs
guarding the availability of this function in the headers. (The official
Windows SDK might, although that SDK also dropped XP support long
ago, iirc.)


bcrypt.h on mingw-w64 is completely wrapped in checks like

#if WINAPI_FAMILY_PARTITION (WINAPI_PARTITION_DESKTOP) || _WIN32_WINNT >= 0x0A00


The former is the reason it succeeds in XP, seeing the latter is
checking for Windows 10 or newer.


Hmm, ok. I guess the correct form would be something like
"(WINAPI_FAMILY_PARTITION (WINAPI_PARTITION_DESKTOP) && _WIN32_WINNT >=
0x0600) || _WIN32_WINNT >= 0x0A00" then.

// Martin


The WINAPI_PARTITION_DESKTOP check is already done in configure to
enable or disable the uwp variable.


Not sure I see how that relates... that part of the header guard makes the
declaration visible, and thus makes the check succeed, when targeting XP,
even though it really isn't available there according to Steve.



In any case, does this mean that on uwp neither BCryptGenRandom nor
CryptGenRandom is available/allowed?


The way I read that, for UWP on Win10, the bcrypt.h stuff should be fine, 
no? (Based on the mingw-w64 header guards, it might not be for win8/8.1 
RT/store/UWP/whatever apps, although MSDN doesn't seem to say anything 
about it.)


// Martin

Re: [libav-devel] [PATCH] use bcrypt instead of the old wincrypt API

2018-03-30 Thread Martin Storsjö

On Fri, 30 Mar 2018, Diego Biurrun wrote:


On Fri, Mar 30, 2018 at 04:58:29PM +0300, Martin Storsjö wrote:

On Fri, 30 Mar 2018, Diego Biurrun wrote:
> On Fri, Mar 30, 2018 at 12:38:05PM +0200, Steve Lhomme wrote:
> > Le 30/03/2018 à 10:46, Diego Biurrun a écrit :
> > > On Fri, Mar 30, 2018 at 09:36:05AM +0200, Steve Lhomme wrote:
> > > > --- a/configure
> > > > +++ b/configure
> > > > @@ -4581,6 +4582,7 @@ check_lib ole32"windows.h"
CoTaskMemFree-lole32
> > > >   check_lib shell32  "windows.h shellapi.h" CommandLineToArgvW   
-lshell32
> > > >   check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom   
-ladvapi32
> > > >   check_lib psapi"windows.h psapi.h"GetProcessMemoryInfo -lpsapi
> > > > +check_cpp_condition Vista+ windows.h "_WIN32_WINNT >= 0x0600" && check_lib 
bcrypt "windows.h bcrypt.h" BCryptGenRandom  -lbcrypt
> > > Do you really need to check the Vista condition? What about using bcrypt
> > > unconditionally if available?
> > 
> > Yes, you need to use it only on builds that won't run on XP. Otherwise it

> > will fail to load the bcrypt.dll and the whole libavutil DLL (or whatever
> > its form) will fail to load. It would be possible to do it dynamically but
> > IMO it's overkill. It's not really a critical component.
> 
> Is bcrypt available on XP? If no then the CPP condition check would seem

> unnecessary. You could just check for bcrypt and bcrypt being available
> would imply Vista. I think I'm missing something.
> 
> > But with time if XP support is dropped this check can go and wincrypt

> > dropped entirely.
> 
> Is it maybe time to consider dropping XP support?


I wouldn't mind.


Let's go ahead then.


See e.g. 9b121dfc32810250938021952aab4172a988cb56 in ffmpeg; dropping XP
support simplifies the w32pthreads wrapper and allows using better
synchronization primitives, that allow e.g. static initialization of
mutexes.


Do we need to do more changes apart from importing that commit?


Don't think so, except for whatever configure differences there are.

// Martin

Re: [libav-devel] [PATCH] use bcrypt instead of the old wincrypt API

2018-03-30 Thread Martin Storsjö

On Fri, 30 Mar 2018, James Almer wrote:


On 3/30/2018 10:57 AM, Martin Storsjö wrote:

On Fri, 30 Mar 2018, Diego Biurrun wrote:


On Fri, Mar 30, 2018 at 10:43:27AM -0300, James Almer wrote:

On 3/30/2018 10:38 AM, Diego Biurrun wrote:
> On Fri, Mar 30, 2018 at 12:38:05PM +0200, Steve Lhomme wrote:
>> Le 30/03/2018 à 10:46, Diego Biurrun a écrit :
>>> On Fri, Mar 30, 2018 at 09:36:05AM +0200, Steve Lhomme wrote:
>>>> --- a/configure
>>>> +++ b/configure
>>>> @@ -4581,6 +4582,7 @@ check_lib ole32    "windows.h"   
CoTaskMemFree    -lole32
>>>>   check_lib shell32  "windows.h shellapi.h" CommandLineToArgvW  
-lshell32
>>>>   check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom  
-ladvapi32
>>>>   check_lib psapi    "windows.h psapi.h"    GetProcessMemoryInfo
-lpsapi
>>>> +check_cpp_condition Vista+ windows.h "_WIN32_WINNT >= 0x0600"
&& check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom  -lbcrypt

If you don't need to set any variable then just use test_cpp_condition()


Yes, good point.


>>> Do you really need to check the Vista condition? What about using
bcrypt
>>> unconditionally if available?
>>
>> Yes, you need to use it only on builds that won't run on XP.
Otherwise it
>> will fail to load the bcrypt.dll and the whole libavutil DLL (or
whatever
>> its form) will fail to load. It would be possible to do it
dynamically but
>> IMO it's overkill. It's not really a critical component.
> > Is bcrypt available on XP? If no then the CPP condition check
would seem
> unnecessary. You could just check for bcrypt and bcrypt being
available
> would imply Vista. I think I'm missing something.

check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom  -lbcrypt

Seems to succeed even if targeting XP, at least on mingw-w64.


Isn't that wrong then?


I guess it just means that mingw-w64 doesn't have _WIN32_WINNT ifdefs
guarding the availability of this function in the headers. (The official
Windows SDK might, although that SDK also dropped XP support long
ago, iirc.)


bcrypt.h on mingw-w64 is completely wrapped in checks like

#if WINAPI_FAMILY_PARTITION (WINAPI_PARTITION_DESKTOP) || _WIN32_WINNT >= 0x0A00


The former is the reason it succeeds in XP, seeing the latter is
checking for Windows 10 or newer.


Hmm, ok. I guess the correct form would be something like 
"(WINAPI_FAMILY_PARTITION (WINAPI_PARTITION_DESKTOP) && _WIN32_WINNT >= 
0x0600) || _WIN32_WINNT >= 0x0A00" then.
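
To make the difference concrete, here is a small sketch that evaluates both
forms of the guard as ordinary C expressions. The functions ex_current_guard
and ex_proposed_guard are hypothetical stand-ins: "desktop" plays the role of
WINAPI_FAMILY_PARTITION (WINAPI_PARTITION_DESKTOP) and "winnt" the role of
_WIN32_WINNT. It does not reproduce the real header, only the logic.

```c
/* Guard as quoted from the mingw-w64 header: the desktop partition alone
 * is enough, regardless of the targeted Windows version. */
static int ex_current_guard(int desktop, unsigned winnt)
{
    return desktop || winnt >= 0x0A00;
}

/* Guard as suggested above: desktop builds additionally need Vista
 * (0x0600) or newer; Windows 10 (0x0A00) or newer is accepted either way. */
static int ex_proposed_guard(int desktop, unsigned winnt)
{
    return (desktop && winnt >= 0x0600) || winnt >= 0x0A00;
}
```

With desktop = 1 and winnt = 0x0501 (XP), the first form is true while the
second is false, which matches the behaviour discussed in this thread.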


// Martin

Re: [libav-devel] [PATCH] use bcrypt instead of the old wincrypt API

2018-03-30 Thread Martin Storsjö

On Fri, 30 Mar 2018, Diego Biurrun wrote:


On Fri, Mar 30, 2018 at 12:38:05PM +0200, Steve Lhomme wrote:

Le 30/03/2018 à 10:46, Diego Biurrun a écrit :
> On Fri, Mar 30, 2018 at 09:36:05AM +0200, Steve Lhomme wrote:
> > --- a/configure
> > +++ b/configure
> > @@ -4581,6 +4582,7 @@ check_lib ole32"windows.h"
CoTaskMemFree-lole32
> >   check_lib shell32  "windows.h shellapi.h" CommandLineToArgvW   -lshell32
> >   check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom   -ladvapi32
> >   check_lib psapi"windows.h psapi.h"GetProcessMemoryInfo -lpsapi
> > +check_cpp_condition Vista+ windows.h "_WIN32_WINNT >= 0x0600" && check_lib bcrypt 
"windows.h bcrypt.h" BCryptGenRandom  -lbcrypt
> Do you really need to check the Vista condition? What about using bcrypt
> unconditionally if available?

Yes, you need to use it only on builds that won't run on XP. Otherwise it
will fail to load the bcrypt.dll and the whole libavutil DLL (or whatever
its form) will fail to load. It would be possible to do it dynamically but
IMO it's overkill. It's not really a critical component.


Is bcrypt available on XP? If no then the CPP condition check would seem
unnecessary. You could just check for bcrypt and bcrypt being available
would imply Vista. I think I'm missing something.


But with time if XP support is dropped this check can go and wincrypt
dropped entirely.


Is it maybe time to consider dropping XP support?


I wouldn't mind.

See e.g. 9b121dfc32810250938021952aab4172a988cb56 in ffmpeg; dropping XP 
support simplifies the w32pthreads wrapper and allows using better 
synchronization primitives, that allow e.g. static initialization of 
mutexes.
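
As a rough sketch of what static initialization of mutexes buys: with
PTHREAD_MUTEX_INITIALIZER, a file-scope mutex is usable without any explicit
init call. The example below uses plain POSIX pthreads; ex_counter_lock and
ex_next_id are hypothetical names. The assumption, going by the commit
description, is that the w32pthreads wrapper can offer the same macro once it
is built on SRW locks.

```c
#include <pthread.h>

/* File-scope mutex, ready to use without calling pthread_mutex_init(). */
static pthread_mutex_t ex_counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int ex_counter;

/* Hand out increasing ids; the lock serializes concurrent callers. */
int ex_next_id(void)
{
    int id;

    pthread_mutex_lock(&ex_counter_lock);
    id = ++ex_counter;
    pthread_mutex_unlock(&ex_counter_lock);
    return id;
}
```

Without a static initializer, the same code needs a one-time init step that
has to run before the first caller, which is what makes this form attractive.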


// Martin

Re: [libav-devel] [PATCH] use bcrypt instead of the old wincrypt API

2018-03-30 Thread Martin Storsjö

On Fri, 30 Mar 2018, Diego Biurrun wrote:


On Fri, Mar 30, 2018 at 10:43:27AM -0300, James Almer wrote:

On 3/30/2018 10:38 AM, Diego Biurrun wrote:
> On Fri, Mar 30, 2018 at 12:38:05PM +0200, Steve Lhomme wrote:
>> Le 30/03/2018 à 10:46, Diego Biurrun a écrit :
>>> On Fri, Mar 30, 2018 at 09:36:05AM +0200, Steve Lhomme wrote:
>>>> --- a/configure
>>>> +++ b/configure
>>>> @@ -4581,6 +4582,7 @@ check_lib ole32"windows.h"
CoTaskMemFree-lole32
>>>>   check_lib shell32  "windows.h shellapi.h" CommandLineToArgvW   -lshell32
>>>>   check_lib wincrypt "windows.h wincrypt.h" CryptGenRandom   -ladvapi32
>>>>   check_lib psapi"windows.h psapi.h"GetProcessMemoryInfo -lpsapi
>>>> +check_cpp_condition Vista+ windows.h "_WIN32_WINNT >= 0x0600" && check_lib bcrypt 
"windows.h bcrypt.h" BCryptGenRandom  -lbcrypt

If you don't need to set any variable then just use test_cpp_condition()


Yes, good point.


>>> Do you really need to check the Vista condition? What about using bcrypt
>>> unconditionally if available?
>>
>> Yes, you need to use it only on builds that won't run on XP. Otherwise it
>> will fail to load the bcrypt.dll and the whole libavutil DLL (or whatever
>> its form) will fail to load. It would be possible to do it dynamically but
>> IMO it's overkill. It's not really a critical component.
> 
> Is bcrypt available on XP? If no then the CPP condition check would seem

> unnecessary. You could just check for bcrypt and bcrypt being available
> would imply Vista. I think I'm missing something.

check_lib bcrypt "windows.h bcrypt.h" BCryptGenRandom  -lbcrypt

Seems to succeed even if targeting XP, at least on mingw-w64.


Isn't that wrong then?


I guess it just means that mingw-w64 doesn't have _WIN32_WINNT ifdefs 
guarding the availability of this function in the headers. (The official 
Windows SDK might, although that SDK also dropped XP support long ago, 
iirc.)


// Martin

[libav-devel] [PATCH 2/2] arm: Produce .const_data instead of .section .rodata for Mach-O

2018-03-30 Thread Martin Storsjö
This is the same combination of .section directives as used in
aarch64/asm.S.

Since Xcode 9.3, the bundled clang supports altmacro and doesn't
require using gas-preprocessor any longer.
---
 libavutil/arm/asm.S | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/libavutil/arm/asm.S b/libavutil/arm/asm.S
index 08574852b5..e7eea0271f 100644
--- a/libavutil/arm/asm.S
+++ b/libavutil/arm/asm.S
@@ -111,11 +111,17 @@ FUNC.func   \name
 ELF .size   \name, . - \name
 .purgem endconst
 .endm
-.if HAVE_SECTION_DATA_REL_RO && \relocate
+#if HAVE_SECTION_DATA_REL_RO
+.if \relocate
 .section.data.rel.ro
 .else
 .section.rodata
 .endif
+#elif !defined(__MACH__)
+.section.rodata
+#else
+.const_data
+#endif
 .align  \align
 \name:
 .endm
-- 
2.15.1 (Apple Git-101)


[libav-devel] [PATCH 1/2] arm: vc1dsp: Add commas between macro arguments

2018-03-30 Thread Martin Storsjö
When targeting darwin, clang requires commas between arguments,
while the no-comma form is allowed for other targets.

Since Xcode 9.3, the bundled clang supports altmacro and doesn't
require using gas-preprocessor any longer.
---
 libavcodec/arm/vc1dsp_neon.S | 94 ++--
 1 file changed, 47 insertions(+), 47 deletions(-)

diff --git a/libavcodec/arm/vc1dsp_neon.S b/libavcodec/arm/vc1dsp_neon.S
index ff88fe23c7..71cc3f4413 100644
--- a/libavcodec/arm/vc1dsp_neon.S
+++ b/libavcodec/arm/vc1dsp_neon.S
@@ -410,13 +410,13 @@ function ff_vc1_inv_trans_8x8_neon, export=1
 @   src[48] q14
 @   src[56] q15
 
-vc1_inv_trans_8x8_helper add=4 add1beforeshift=0 rshift=3
+vc1_inv_trans_8x8_helper add=4, add1beforeshift=0, rshift=3
 
 @ Transpose result matrix of 8x8
 swap4   d17, d19, d21, d23, d24, d26, d28, d30
 transpose16_4x4 q8,  q9,  q10, q11, q12, q13, q14, q15
 
-vc1_inv_trans_8x8_helper add=64 add1beforeshift=1 rshift=7
+vc1_inv_trans_8x8_helper add=64, add1beforeshift=1, rshift=7
 
 vst1.64 {q8-q9},   [r0,:128]!
 vst1.64 {q10-q11}, [r0,:128]!
@@ -431,7 +431,7 @@ function ff_vc1_inv_trans_8x4_neon, export=1
 vld1.64 {q0-q1}, [r2,:128]! @ load 8 * 4 * 2 = 64 bytes / 16 bytes per quad = 4 quad registers
 vld1.64 {q2-q3}, [r2,:128]
 
-transpose16 q0 q1 q2 q3 @ transpose rows to columns
+transpose16 q0, q1, q2, q3  @ transpose rows to columns
 
 @ At this point:
 @   src[0]   d0
@@ -443,7 +443,7 @@ function ff_vc1_inv_trans_8x4_neon, export=1
 @   src[6]   d5
 @   src[7]   d7
 
-vc1_inv_trans_8x4_helperadd=4 add1beforeshift=0 rshift=3
+vc1_inv_trans_8x4_helperadd=4, add1beforeshift=0, rshift=3
 
 @ Move output to more standardized registers
 vmovd0, d16
@@ -465,7 +465,7 @@ function ff_vc1_inv_trans_8x4_neon, export=1
 @   dst[6]   d5
 @   dst[7]   d7
 
-transpose16 q0 q1 q2 q3   @ turn columns into rows
+transpose16 q0, q1, q2, q3   @ turn columns into rows
 
 @ At this point:
 @   row[0] q0
@@ -473,7 +473,7 @@ function ff_vc1_inv_trans_8x4_neon, export=1
 @   row[2] q2
 @   row[3] q3
 
-vc1_inv_trans_4x8_helperadd=64 rshift=7
+vc1_inv_trans_4x8_helperadd=64, rshift=7
 
 @ At this point:
 @   line[0].l   d0
@@ -523,7 +523,7 @@ function ff_vc1_inv_trans_4x8_neon, export=1
 vld4.16 {d1[2], d3[2], d5[2], d7[2]}, [r2,:64], r12
 vld4.16 {d1[3], d3[3], d5[3], d7[3]}, [r2,:64]
 
-vc1_inv_trans_4x8_helperadd=4 rshift=3
+vc1_inv_trans_4x8_helperadd=4, rshift=3
 
 @ At this point:
 @   dst[0] = q0
@@ -531,9 +531,9 @@ function ff_vc1_inv_trans_4x8_neon, export=1
 @   dst[2] = q2
 @   dst[3] = q3
 
-transpose16 q0 q1 q2 q3 @ Transpose rows (registers) into columns
+transpose16 q0, q1, q2, q3  @ Transpose rows (registers) into columns
 
-vc1_inv_trans_8x4_helperadd=64 add1beforeshift=1 rshift=7
+vc1_inv_trans_8x4_helperadd=64, add1beforeshift=1, rshift=7
 
 vld1.32 {d28[]},  [r0,:32], r1  @ read dest
 vld1.32 {d28[1]}, [r0,:32], r1
@@ -611,7 +611,7 @@ function ff_vc1_inv_trans_4x4_neon, export=1
 @   src[2] = d1
 @   src[3] = d3
 
-vc1_inv_trans_4x4_helper add=4 rshift=3  @ compute t1, t2, t3, t4 and combine them into dst[0-3]
+vc1_inv_trans_4x4_helper add=4, rshift=3 @ compute t1, t2, t3, t4 and combine them into dst[0-3]
 
 @ At this point:
 @   dst[0] = d0
@@ -619,7 +619,7 @@ function ff_vc1_inv_trans_4x4_neon, export=1
 @   dst[2] = d1
 @   dst[3] = d2
 
-transpose16 d0 d3 d1 d2 @ Transpose rows (registers) into columns
+transpose16 d0, d3, d1, d2  @ Transpose rows (registers) into columns
 
 @ At this point:
 @   src[0]  = d0
@@ -635,7 +635,7 @@ function ff_vc1_inv_trans_4x4_neon, export=1
 @   src[16] = d1
 @   src[24] = d3
 
-vc1_inv_trans_4x4_helper add=64 rshift=7  @ compute t1, t2, t3, t4 and combine them into dst[0-3]
+vc1_inv_trans_4x4_helper add=64, rshift=7 @ compute t1, t2, t3, t4 and combine them into dst[0-3]
 
 @ At this point:
 @   line[0] = d0
@@ -665,26 +665,26 @@ endfunc
 
 @ The absolute value of multiplication constants from vc1_mspel_filter and vc1_mspel_{ver,hor}_filter_16bits.
 @ The sign is embedded in the code below that carries out the multiplication (mspel_filter{,.16}).
-#define MSPEL_MODE_1_MUL_CONSTANTS  4 53 18 3
-#define MSPEL_MODE_2_MUL_CONSTANTS  1 9  9  1
-#define MSPEL_MODE_3_MUL_CONSTANTS  3 18 53 4
+#define M

[libav-devel] [PATCH] configure: Don't assume a 16 byte aligned stack on BSDs on i386

2018-03-16 Thread Martin Storsjö
With GCC, request it to maintain 16 byte alignment, and the existing
entry points already align it via attribute_align_arg.

With clang, do the same as for mingw; disable the aligned stack
and let the assembly functions that require it do the alignment
instead.
---
 configure | 22 --
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 95e6006440..78a2065208 100755
--- a/configure
+++ b/configure
@@ -4957,16 +4957,34 @@ elif enabled gcc; then
 check_cflags -Werror=format-security
 check_cflags -fdiagnostics-color=auto
 enabled extra_warnings || check_disable_warning -Wno-maybe-uninitialized
+if enabled x86_32; then
+case $target_os in
+*bsd*)
+# BSDs don't guarantee a 16 byte aligned stack, but we can
+# request GCC to try to maintain 16 byte alignment throughout
+# function calls. Library entry points that might call assembly
+# functions align the stack. (The parameter means 2^4 bytes.)
+check_cflags -mpreferred-stack-boundary=4
+;;
+esac
+fi
 elif enabled llvm_gcc; then
 check_cflags -mllvm -stack-alignment=16
 elif enabled clang; then
-if [ "$target_os" = "mingw32" -o "$target_os" = "win32" ] && enabled x86_32; then
+if enabled x86_32; then
 # Clang doesn't support maintaining alignment without assuming the
 # same alignment in every function. If 16 byte alignment would be
 # enabled, one would also have to either add attribute_align_arg on
 # every single entry point into the libraries or enable -mstackrealign
 # (doing stack realignment in every single function).
-disable aligned_stack
+case $target_os in
+mingw32|win32|*bsd*)
+disable aligned_stack
+;;
+*)
+check_cflags -mllvm -stack-alignment=16
+;;
+esac
 else
 check_cflags -mllvm -stack-alignment=16
 fi
-- 
2.14.3 (Apple Git-98)


[libav-devel] [PATCH] configure: Don't assume an aligned stack on clang on windows

2018-03-12 Thread Martin Storsjö
If we enabled a 16 byte aligned stack, clang/llvm would also assume
that alignment everywhere and produce code that strictly requires it.
That would require either adding realignment (via attribute_align_arg) on
every single public library function or enabling -mstackrealign (which does
the same on every single function).

Relatedly, the parameter currently tested (-mllvm
-stack-alignment=16) hasn't actually been supported for quite some
time; current clang versions use -mstack-alignment=16 for the same purpose.
Actually testing for that parameter would be a different change
though, since it has a real risk of changing behaviour on any other
platform where clang is used.
---
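
For readers unfamiliar with the realignment mentioned above, here is a
minimal sketch of the idea, assuming a GCC-compatible compiler on x86.
EX_ALIGN_ARG is a hypothetical stand-in for attribute_align_arg, and
ex_public_sum / ex_internal_sum are made-up names; on other compilers or
architectures the macro expands to nothing in this sketch.

```c
/* Hypothetical stand-in for attribute_align_arg: on x86 with a
 * GCC-compatible compiler, ask for the stack to be realigned on entry. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#define EX_ALIGN_ARG __attribute__((force_align_arg_pointer))
#else
#define EX_ALIGN_ARG
#endif

/* Internal helper of the kind that might end up calling code which
 * expects an aligned stack; here it just sums four values. */
static int ex_internal_sum(const int *src)
{
    int sum = 0;

    for (int i = 0; i < 4; i++)
        sum += src[i];
    return sum;
}

/* Public entry point: the only place where the caller's stack alignment
 * is unknown, so this is where the attribute goes. */
EX_ALIGN_ARG int ex_public_sum(const int *src)
{
    return ex_internal_sum(src);
}
```

The cost described in the commit message comes from having to annotate every
such entry point once the compiler assumes the alignment everywhere.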
 configure | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index b91be32..7042635 100755
--- a/configure
+++ b/configure
@@ -4955,7 +4955,16 @@ elif enabled gcc; then
 elif enabled llvm_gcc; then
 check_cflags -mllvm -stack-alignment=16
 elif enabled clang; then
-check_cflags -mllvm -stack-alignment=16
+if [ "$target_os" = "mingw32" -o "$target_os" = "win32" ] && enabled x86_32; then
+# Clang doesn't support maintaining alignment without assuming the
+# same alignment in every function. If 16 byte alignment would be
+# enabled, one would also have to either add attribute_align_arg on
+# every single entry point into the libraries or enable -mstackrealign
+# (doing stack realignment in every single function).
+disable aligned_stack
+else
+check_cflags -mllvm -stack-alignment=16
+fi
 check_cflags -Qunused-arguments
 check_cflags -Werror=implicit-function-declaration
 check_cflags -Werror=missing-prototypes
-- 
2.7.4


Re: [libav-devel] [PATCH] configure: Restore original endianness test

2018-03-08 Thread Martin Storsjö

On Thu, 8 Mar 2018, Diego Biurrun wrote:


Previously the bit pattern for the endianness test was declared as a
global, instead of a local, variable. This ensures that the pattern
appears unchanged in the object file and is not optimized out.
---
configure | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index d59fc6fd1a..188b2d880b 100755
--- a/configure
+++ b/configure
@@ -4211,7 +4211,9 @@ done

check_cc pragma_deprecated "" '_Pragma("GCC diagnostic ignored \"-Wdeprecated-declarations\"")'

-require_cc "endian test" "" "unsigned int endian = 'B' << 24 | 'I' << 16 | 'G' << 8 | 'E'"
+test_cc <

Ok
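
As a small illustration of the idea in the commit message: a global
initialized with the 'B','I','G','E' pattern has to be emitted into the
object file's data, and the order of its bytes in memory reveals the byte
order. The sketch below checks this at run time instead of inspecting an
object file; ex_endian and the two helper functions are hypothetical names,
and a 4-byte unsigned int is assumed.

```c
#include <string.h>

/* Global, so the pattern has to be stored in the object file as data. */
unsigned int ex_endian = 'B' << 24 | 'I' << 16 | 'G' << 8 | 'E';

/* Most significant byte first in memory: the bytes read "BIGE". */
int ex_is_big_endian(void)
{
    unsigned char bytes[sizeof ex_endian];

    memcpy(bytes, &ex_endian, sizeof ex_endian);
    return memcmp(bytes, "BIGE", 4) == 0;
}

/* Least significant byte first in memory: the bytes read "EGIB". */
int ex_is_little_endian(void)
{
    unsigned char bytes[sizeof ex_endian];

    memcpy(bytes, &ex_endian, sizeof ex_endian);
    return memcmp(bytes, "EGIB", 4) == 0;
}
```

A build script that cannot run target code could presumably look for the same
byte sequence in the compiled object instead, which only works if the
variable is not optimized away.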

// Martin

[libav-devel] [PATCH 1/2] Revert "configure: Stop using dlltool to create an import library"

2018-02-16 Thread Martin Storsjö
This reverts commit 67c72f08a4707c18a67a4734660e3a23cc9488b6.

While the linker produced import libraries might work with MSVC in
simple test cases, they don't if e.g. linking to multiple GNU ld
produced import libraries at the same time. The ones produced by
dlltool work fine though.

This issue was pointed out by Hendrik Leppkes.
---
 configure | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index ed930e6cd4..06fb839a18 100755
--- a/configure
+++ b/configure
@@ -3891,6 +3891,10 @@ case $target_os in
 ;;
 mingw32*|mingw64*)
 target_os=mingw32
+LIBTARGET=i386
+if enabled x86_64; then
+LIBTARGET="i386:x86-64"
+fi
 if enabled shared; then
 # Cannot build both shared and static libs when using dllimport.
 disable static
@@ -3902,7 +3906,7 @@ case $target_os in
 SLIBSUF=".dll"
 SLIBNAME_WITH_VERSION='$(SLIBPREF)$(NAME)-$(LIBVERSION)$(SLIBSUF)'
 SLIBNAME_WITH_MAJOR='$(SLIBPREF)$(NAME)-$(LIBMAJOR)$(SLIBSUF)'
-SLIB_EXTRA_CMD='cp $(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a) $(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib)'
+SLIB_EXTRA_CMD=-'$(DLLTOOL) -m $(LIBTARGET) -d $$(@:$(SLIBSUF)=.def) -l $(SUBDIR)$(SLIBNAME:$(SLIBSUF)=.lib) -D $(SLIBNAME_WITH_MAJOR)'
 SLIB_INSTALL_NAME='$(SLIBNAME_WITH_MAJOR)'
 SLIB_INSTALL_LINKS=
 SLIB_INSTALL_EXTRA_SHLIB='$(SLIBNAME:$(SLIBSUF)=.lib)'
@@ -3910,6 +3914,7 @@ case $target_os in
 SLIB_CREATE_DEF_CMD='EXTERN_PREFIX="$(EXTERN_PREFIX)" AR="$(AR_CMD)" NM="$(NM_CMD)" $(SRC_PATH)/compat/windows/makedef $(SUBDIR)lib$(NAME).ver $(OBJS) > $$(@:$(SLIBSUF)=.def)'
 SHFLAGS='-shared -Wl,--out-implib,$(SUBDIR)lib$(SLIBNAME:$(SLIBSUF)=.dll.a) -Wl,--enable-auto-image-base $$(@:$(SLIBSUF)=.def)'
 enabled x86_64 && objformat="win64" || objformat="win32"
+dlltool="${cross_prefix}dlltool"
 ranlib=:
 enable dos_paths
 ;;
@@ -5248,6 +5253,7 @@ X86ASM_O=$X86ASM_O
 LD_O=$LD_O
 LD_LIB=$LD_LIB
 LD_PATH=$LD_PATH
+DLLTOOL=$dlltool
 LDFLAGS=$LDFLAGS
 LDEXEFLAGS=$LDEXEFLAGS
 LDSOFLAGS=$LDSOFLAGS
@@ -5294,6 +5300,7 @@ LIB_INSTALL_EXTRA_CMD=$LIB_INSTALL_EXTRA_CMD
 EXTRALIBS=$extralibs
 COMPAT_OBJS=$compat_objs
 INSTALL=install
+LIBTARGET=${LIBTARGET}
 SLIBNAME=${SLIBNAME}
 SLIBNAME_WITH_VERSION=${SLIBNAME_WITH_VERSION}
 SLIBNAME_WITH_MAJOR=${SLIBNAME_WITH_MAJOR}
-- 
2.14.3 (Apple Git-98)


[libav-devel] [PATCH 2/2] configure: Pass the right machine types to dlltool for arm and arm64 mingw

2018-02-16 Thread Martin Storsjö
These are supported by llvm-dlltool.
---
 configure | 4 
 1 file changed, 4 insertions(+)

diff --git a/configure b/configure
index 06fb839a18..1c35f9dc64 100755
--- a/configure
+++ b/configure
@@ -3894,6 +3894,10 @@ case $target_os in
 LIBTARGET=i386
 if enabled x86_64; then
 LIBTARGET="i386:x86-64"
+elif enabled arm; then
+LIBTARGET="arm"
+elif enabled aarch64; then
+LIBTARGET="arm64"
 fi
 if enabled shared; then
 # Cannot build both shared and static libs when using dllimport.
-- 
2.14.3 (Apple Git-98)

