Now that the xmm register and gpr counts have decreased, it is
possible to port to x86_32. To save on code, x86_32 with or
without PIC is handled as if PIC.
---
libavcodec/x86/hevc_mc.asm| 39 +++
libavcodec/x86/hevcdsp.h | 4 +++-
libavcodec/x86/hevcdsp_init.c
The 3*stride value stored in r3src can be loaded much later,
so use r3src instead of a dedicated gpr when possible.
---
libavcodec/x86/hevc_mc.asm | 65 ++
1 file changed, 31 insertions(+), 34 deletions(-)
diff --git a/libavcodec/x86/hevc_mc.asm
The second parameter to the macro is always an immediate address,
so no lea is needed.
---
libavcodec/x86/hevc_mc.asm | 19 +++
1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/libavcodec/x86/hevc_mc.asm b/libavcodec/x86/hevc_mc.asm
index f6dff4c..aab69dd 100644
---
it, as that's 1-4% runtime
improvement for something far from realtime decoding.
Christophe Gisquet (5):
x86: hevc_mc: fewer gpr autoloads for _v filters
x86: hevc_mc: remove lea in EPEL_LOAD
x86: hevc_mc: save 1 gpr in epel filter loading
x86: hevc_mc: fewer xmm regs used in epel h/v
x86: hevc_mc
In that case, it's just to load my, but mx/r3src is not used.
---
libavcodec/x86/hevc_mc.asm | 18 --
1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/libavcodec/x86/hevc_mc.asm b/libavcodec/x86/hevc_mc.asm
index 3f56782..f6dff4c 100644
---
---
libavcodec/x86/hevc_mc.asm | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/libavcodec/x86/hevc_mc.asm b/libavcodec/x86/hevc_mc.asm
index 74e08d4..a127a4d 100644
--- a/libavcodec/x86/hevc_mc.asm
+++ b/libavcodec/x86/hevc_mc.asm
@@ -734,7 +734,7 @@ cglobal
2015-02-07 19:49 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
Christophe Gisquet (5):
x86: hevc_mc: fewer gpr autoloads for _v filters
x86: hevc_mc: remove lea in EPEL_LOAD
x86: hevc_mc: save 1 gpr in epel filter loading
x86: hevc_mc: fewer xmm regs used in epel h/v
Hi,
2015-02-07 16:47 GMT+01:00 Carl Eugen Hoyos ceho...@ag.or.at:
Attached patch intends to fix reading the RLE-attribute from dpx files.
I don't think this is valid. You are skipping a byte, as if there was
an additional element to skip. But there is none.
The bitstream is really:
bits per
for the patch.
This patch is mostly meant to check that it does fix generated
assembly for other users.
--
Christophe
From 018f5122d53fd0514644b86169d593c28a6924db Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Tue, 3 Feb 2015 13:03:48 +0100
Subject: [PATCH] x86: lavu
2015-02-03 12:57 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
Actually, 940300945 does need to be reverted for the patch to work, as
Mickael stated. It miscompiles hevc_mc.asm, more particularly the
[eq]pel_hv functions. No idea why.
The patch in [PATCH] x86: lavu/x264asm: fix
Hi,
2015-02-02 18:23 GMT+01:00 James Almer jamr...@gmail.com:
https://github.com/OpenHEVC/FFmpeg/commit/940300945995c20f7583394ebe6907e72829b4a
[...]
Tested it. Doesn't pass with avx2.
Actually, 940300945 does need to be reverted for the patch to work, as
Mickael stated. It miscompiles
Hi,
2015-02-04 4:55 GMT+01:00 James Almer jamr...@gmail.com:
Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
Refactoring and optimizations by James Almer.
Add your own copyright to this file then.
Width 32
158583 decicycles in edge, sao_edge_filter_8 runs, 0
Hi,
2015-02-04 4:55 GMT+01:00 James Almer jamr...@gmail.com:
-DECLARE_ALIGNED(16, const xmm_reg, ff_pw_1)= { 0x0001000100010001ULL,
0x0001000100010001ULL };
-DECLARE_ALIGNED(16, const xmm_reg, ff_pw_2)= { 0x0002000200020002ULL,
0x0002000200020002ULL };
+DECLARE_ALIGNED(32, const
Hi,
Le 3 févr. 2015 18:47, James Almer jamr...@gmail.com a écrit :
On 02/02/15 2:11 PM, Christophe Gisquet wrote:
@@ -87,11 +95,22 @@ QPEL_TABLE 12, 4, w, sse4
%elif %1 = 8
movdqa %3, [%2] ; load data from source2
%elif %1
Hi,
2015-02-05 11:13 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
2015-02-05 10:13 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
The patch breaks make fate-hevc THREADS=3, so needs more thought.
Compilation issue, running make clean first passes fate-hevc
THREADS=3
Hi,
2015-02-05 14:22 GMT+01:00 Mickaël Raulet mrau...@gmail.com:
Michael,
Please find some commits that can be cherry picked from
https://github.com/OpenHEVC/FFmpeg/commits/ffmpeg_patch
Optimized deblocking filter (8bits only)
1b9ee47d2f43b0a029a9468233626102eb1473b8
Optimized transform
Hi,
2015-02-05 5:18 GMT+01:00 James Almer jamr...@gmail.com:
Original x86 intrinsics code and initial yasm port by Pierre-Edouard Lepere.
Refactoring and optimizations by James Almer.
No further comment from me.
--
Christophe
___
ffmpeg-devel
2015-02-05 5:18 GMT+01:00 James Almer jamr...@gmail.com:
Original x86 intrinsics code by Pierre-Edouard Lepere.
Yasm port, refactoring and optimizations by James Almer.
OK.
--
Christophe
1c4004c78c0b33ad2d30d6ef8baf5eb099c5b1eb Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Thu, 5 Feb 2015 16:00:11 +0100
Subject: [PATCH] lavc/pthread_slice: release entries
When calling ff_alloc_entries, a number of entries are created.
They are never freed
From: plepere pierre-edouard.lep...@insa-rennes.fr
before
33304 decicycles in luma_bi_1, 523066 runs, 1222 skips
38138 decicycles in luma_bi_2, 523427 runs, 861 skips
13490 decicycles in luma_uni, 516138 runs, 8150 skips
after
20185 decicycles in luma_bi_1, 519970 runs, 4318 skips
24620
From: Mickaël Raulet mrau...@insa-rennes.fr
Conflicts:
libavcodec/x86/hevc_mc.asm
---
libavcodec/x86/hevc_mc.asm | 12
1 file changed, 4 insertions(+), 8 deletions(-)
diff --git a/libavcodec/x86/hevc_mc.asm b/libavcodec/x86/hevc_mc.asm
index efb4d1f..e8a5032 100644
---
---
libavcodec/x86/constants.c | 22 --
libavcodec/x86/constants.h | 16 +---
libavcodec/x86/h264_deblock_10bit.asm | 6 ++
libavcodec/x86/h264_idct_10bit.asm | 4 +++-
libavcodec/x86/h264_intrapred_10bit.asm | 3 ++-
From: Mickaël Raulet mrau...@insa-rennes.fr
---
libavcodec/hevc.h | 2 +-
libavcodec/x86/hevc_mc.asm | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/libavcodec/hevc.h b/libavcodec/hevc.h
index ae9a32a..e0af6f1 100644
--- a/libavcodec/hevc.h
+++
not affect the DSP API.
Christophe Gisquet (4):
x86: hevc_mc: use epel_hv 16-wide function
x86: lavc: share more constants
x86: lavc: share more constant through defines
x86: hevc: remove a parameter to WP internals
Mickaël Raulet (2):
x86/hevc: use CLIPW macro when possible
x86/hevc_mc: use
The second stride is always the internal buffer one, MAX_PB_SIZE (times 2 to
get the value in bytes).
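
A minimal sketch of what a fixed second stride implies, assuming MAX_PB_SIZE is 64 and that the internal buffer holds int16_t samples. Both values and the helper names are assumptions made for illustration; this is not the hevcdsp code itself:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: size limit of an HEVC prediction block. */
#define MAX_PB_SIZE 64

/* Hypothetical helper: byte stride of the internal int16_t buffer.
 * It is a compile-time constant, so callers need not pass it around. */
static inline ptrdiff_t internal_stride_bytes(void)
{
    return MAX_PB_SIZE * (ptrdiff_t)sizeof(int16_t);
}

/* Hypothetical helper: address of row y inside the internal buffer. */
static inline const int16_t *internal_row(const int16_t *buf, int y)
{
    assert(y >= 0);
    return (const int16_t *)((const uint8_t *)buf + y * internal_stride_bytes());
}
```

With the stride fixed like this, a function prototype can drop the corresponding parameter, which is what frees a register on the asm side.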
---
libavcodec/x86/hevc_mc.asm| 30 +++---
libavcodec/x86/hevcdsp.h | 4 ++--
libavcodec/x86/hevcdsp_init.c | 16
3 files changed, 25
---
libavcodec/x86/ac3dsp.asm | 2 +-
libavcodec/x86/constants.c | 9 -
libavcodec/x86/constants.h | 4
libavcodec/x86/h264_qpel_10bit.asm | 2 +-
libavcodec/x86/hevc_mc.asm | 14 +++---
libavcodec/x86/hevc_sao.asm| 2 +-
2015-02-05 18:28 GMT+01:00 James Almer jamr...@gmail.com:
On 05/02/15 10:22 AM, Mickaël Raulet wrote:
More coming soon for epel and SAO!
The SAO prototypes got some slight changes with my patches. The author of this
ARM code (or someone else) will have to revise it before it can be merged.
2015-02-05 5:18 GMT+01:00 James Almer jamr...@gmail.com:
The stride_src parameter is always 2*MAX_PB_SIZE +
FF_INPUT_BUFFER_PADDING_SIZE.
OK. That's the change the SAO implementations should be aware of.
Good to commit.
--
Christophe
___
On the other hand, the stride is known at compilation time, so the asm
could use that to reduce the number of gprs, which would help in getting
an x86_32 version.
--
Christophe
From 55047bbb991c95f126d597bbe05e424406af4ec4 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq
2015-02-06 16:18 GMT+01:00 James Almer jamr...@gmail.com:
mov r5d, denomm is loading it on Win64. It's also a superfluous instruction
on UNIX64, where it translates to mov r5d, r5d.
At least if I'm reading this right.
movifnidn then?
--
Christophe
Hi,
2015-02-06 17:54 GMT+01:00 James Almer jamr...@gmail.com:
pb_eo must be handled as a rip relative address for MSVC64, so an
intermediate register is needed. Should fix link failures.
Seems ok in principle, passes fate on mingw64. I'm always wary of
those ABIs, so if anyone could verify for
having the
patch, but I'm not completely sure it will apply fine.
--
Christophe
From f0997ceac461add7dddbb1c0a75797bf462bf16e Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Fri, 6 Feb 2015 13:43:45 +0100
Subject: [PATCH] x86: hevc_sao: fix loading of RIP
Suggestions for more efficient make rules, clearer documentation and a
better macro than the generic 'DBG' welcome.
--
Christophe
From 8e2adea82a4ec440744b701716a93e6ecaf211d6 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Sun, 8 Feb 2015 12:18:27 +0100
17 00:00:00 2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Sun, 8 Feb 2015 12:18:27 +0100
Subject: [PATCH] x86/doc/Makefile: DBG=1 to preprocess external asm
The macro hell sometimes makes it difficult to trace the source of
an error, so it is easier to analyze the preprocessed
by the first warn commit,
because of differing insn set (haven't checked though).
This one is specifically for might be insn set a or b, but reg size
makes it clearer.
--
Christophe
From cc1f681defcde77bf1a371633c0eba155b93f235 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq
Hi,
2015-02-07 23:06 GMT+01:00 James Almer jamr...@gmail.com:
Signed-off-by: James Almer jamr...@gmail.com
---
libavcodec/x86/hevc_sao.asm | 40
libavcodec/x86/hevcdsp_init.c | 24
2 files changed, 48 insertions(+), 16
;a=commitdiff;h=fc7e02f0ff345d5331b7c78f2400668d2c79a8b0;hp=4ccd7cb45b9aa46d94c29dbd1c065b652bda2319
It is currently in x264 'sandbox', ie not in a released version.
--
Christophe
From 1ade9345fbb392c0612ff0053523857abd5da563 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq
Hi,
2015-02-08 18:48 GMT+01:00 James Almer jamr...@gmail.com:
+%assign MMSIZE mmsize
Why do that? Not a big deal: it's only for my education, if there's
something I'm missing.
For width 48, the COMPUTE macro is last run after an INIT_XMM cpuname, so
mmsize becomes 16 and in the avx2
Hi,
another warning under MinGW fixed, because the format specifiers are
not supported, cf. for instance:
https://msdn.microsoft.com/en-us/library/fe06s4ak.aspx
--
Christophe
From 6d3e4eb5df0eacf3df011dbe3e05a82f814f63b1 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq
2015-01-21 20:06 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
Hi,
the attached patch fixes a warning under MinGW (no idea about msys2).
With aforementioned patch.
--
Christophe
From 54b6d1b588d4a15f51345f8ca1fc5b8ab0660b7e Mon Sep 17 00:00:00 2001
From: Christophe Gisquet
So...
2015-01-21 21:08 GMT+01:00 arwa arif arwaarif1...@gmail.com:
Updated the patch.
There are trailing spaces, and the patch does not apply here (error on
libavfilter/x86/Makefile)
Furthermore, I think that hunk is incorrect:
+set_gamma(eq);
+set_contrast(eq);
+
2015-02-16 10:43 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
Obviously I shouldn't unconditionally use r3srcq or equivalent, as
!PIC just directly accesses the %%table
I probably need to define an intermediate, say TABLE, which is either
r3srcq or %%table, and use it for loading
with the attached patch.
But I wonder if a dependency rule is really needed, as I can see it
causing issues... (does it depend on .asm or .dbg.asm etc)
So I don't think we are there yet.
--
Christophe
From f3365ec79b096dad0ccd7246b78ea9b7074f3b49 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet
Hi,
2015-02-12 11:28 GMT+01:00 Michael Niedermayer michae...@gmx.at:
On Tue, Feb 10, 2015 at 01:44:26AM -0300, James Almer wrote:
Signed-off-by: James Almer jamr...@gmail.com
---
libavcodec/x86/hevc_sao.asm | 106
++
libavcodec/x86/hevcdsp_init.c
Hi,
2015-02-12 7:29 GMT+01:00 James Almer jamr...@gmail.com:
Before
40766 decicycles in ff_hevc_put_hevc_qpel_h64_8_avx2, 8192 runs, 0 skips
After
37975 decicycles in ff_hevc_put_hevc_qpel_h64_8_avx2, 8192 runs, 0 skips
Looks straightforward. But now I understand why we declare using 11
xmm
Hi,
2015-02-08 14:53 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
This one is specifically for might be insn set a or b, but reg size
makes it clearer.
Ping?
--
Christophe
Hi,
2015-02-08 14:54 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
2015-02-08 14:07 GMT+01:00 Carl Eugen Hoyos ceho...@ag.or.at:
Doesn't this also need an update for make clean?
Right, here's an updated one, also improving dependency generation and
allowing more genering
Hi,
2015-02-07 23:11 GMT+01:00 Michael Niedermayer michae...@gmx.at:
this breaks building shared libs:
/usr/bin/ld: libavcodec/x86/hevc_mc.o: relocation R_X86_64_32 against `.text'
can not be used when making a shared object; recompile with -fPIC
libavcodec/x86/hevc_mc.o: could not read
here's a patch with the values swapped out.
From 2955eea46501d096a47fbf2fb1824daa622f6031 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Mon, 16 Feb 2015 20:12:04 +0100
Subject: [PATCH 2/3] x86: hevc_mc: fewer xmm regs used in epel h/v
11 xmm regs seem only required
Hi,
2015-02-17 13:29 GMT+01:00 Michael Niedermayer michae...@gmx.at:
also, i wonder, should this be extended to also work with C files ?
Again, with asm, macro hell sometimes gives you that. So it makes sense.
I don't know well enough icc/llvm/whatever we handle, so the main
issue might be to
Hi,
2015-02-17 13:21 GMT+01:00 Michael Niedermayer michae...@gmx.at:
btw, this works too: (without DBG=1)
make V=2 libavcodec/x86/vp8dsp_loopfilter.dbg.o
Yeah, once I'm working on a specific file, that's what I do.
But iirc, you can just do make DBG=1 and wait for any error to mention
the
Hi,
2015-01-30 19:50 GMT+01:00 James Almer jamr...@gmail.com:
Pass instead the two variables from the struct needed in the function.
This simplifies writing asm optimized versions of the function
ok, no impact by itself but...
void (*sao_band_filter)(uint8_t *_dst, uint8_t *_src,
the start/tail parts.
--
Christophe
From 49e41dd86eb65a774f3561420dd5e9de83f328f2 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Sun, 25 Jan 2015 13:52:16 +0100
Subject: [PATCH 2/2] Use different mem moves.
---
libavcodec/x86/sbrdsp.asm | 28
Hi,
2015-01-23 10:54 GMT+01:00 Timo Rothenpieler t...@rothenpieler.org:
This would also forward the ffmpeg default, 1, to nvenc, instead of the
nvenc default, 0, which lets the driver decide what is best. I'm not sure if
this is desirable and how much it affects the quality.
In that case,
adaptation from Fabrice's version is to match our
alignment requirements, and abuse the edge emu buffers instead of
adding a new buffer.
Decicycles: 26772-26220 (BO32), 83803-80942 (BO64)
Signed-off-by: Christophe Gisquet christophe.gisq...@gmail.com
---
libavcodec/hevc.c| 44
Hi,
2015-01-30 19:50 GMT+01:00 James Almer jamr...@gmail.com:
+%macro HEVC_SAO_BAND_FILTER_COMPUTE 3
+psraw %2, %3, %1-5
+pcmpeqw m10, %2, m0
+pcmpeqw m11, %2, m1
+pcmpeqw m12, %2, m2
+pcmpeqw %2, m3
+pand
Hi,
2015-01-30 14:59 GMT+01:00 Michael Niedermayer michae...@gmx.at:
On Fri, Jan 30, 2015 at 01:09:03PM +, Christophe Gisquet wrote:
The solution requires accessing an lavu internal, which may not be
a good example for ffmpeg as a library user.
[...]
ffmpeg.c as a user application should
Hi,
2015-01-31 11:33 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
2015-01-30 19:50 GMT+01:00 James Almer jamr...@gmail.com:
+%macro HEVC_SAO_BAND_FILTER_COMPUTE 3
+psraw %2, %3, %1-5
+pcmpeqw m10, %2, m0
+pcmpeqw m11, %2, m1
2015-02-14 17:14 GMT+01:00 Michael Niedermayer michae...@gmx.at:
On Sat, Feb 14, 2015 at 11:03:13PM +0800, zhaoxiu.zeng wrote:
From 7e4038fe1291b857261584e69323486fc955cfb2 Mon Sep 17 00:00:00 2001
From: Zeng Zhaoxiu zhaoxiu.z...@gmail.com
Date: Sat, 14 Feb 2015 20:08:48 +0800
Subject: [PATCH
Hi,
2015-02-13 17:49 GMT+01:00 zhaoxiu.zeng zhaoxiu.z...@gmail.com:
int8_t acfilter_order;
int8_t acfilter_scaling;
-int64_t acfilter_coeffs[16];
+int16_t acfilter_coeffs[16];
int acfilter_prevvalues[WMALL_MAX_CHANNELS][16];
int8_t mclms_order;
@@ -818,7
2015-02-12 11:47 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
Looks straightforward. But now I understand why we declare using 11
xmm regs in some places, which impacts a patch that has been reviewed
and needs updating.
A patch of mine for x86_32. Just ignore me, I'm speaking
Hi,
2015-02-16 4:51 GMT+01:00 Michael Niedermayer michae...@gmx.at:
On Sat, Feb 07, 2015 at 06:49:38PM +, Christophe Gisquet wrote:
The 3*stride value stored in r3src can be loaded much later,
so use r3src instead of a dedicated gpr when possible.
---
libavcodec/x86/hevc_mc.asm | 65
: Christophe Gisquet christophe.gisq...@gmail.com
Date: Thu, 5 Feb 2015 19:51:22 +0100
Subject: [PATCH] hevc: free sao buffers when receiving a new SPS
The buffer pointers would be otherwise overwritten, causing a
leak on e.g. PERSIST_RPARAM_A_RExt_Sony_1.
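
The free-before-reallocate pattern described above can be sketched as follows. The struct and function names are hypothetical and heavily simplified; they do not reproduce the real decoder context:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified holder for a per-sequence buffer. */
typedef struct SaoBuffers {
    uint8_t *line_buf;
    size_t   size;
} SaoBuffers;

/* Release the buffer and reset the pointer so a later call is harmless. */
static void sao_buffers_free(SaoBuffers *b)
{
    free(b->line_buf);
    b->line_buf = NULL;
    b->size     = 0;
}

/* Called whenever new sequence parameters arrive: free first, then allocate.
 * Returns 0 on success, -1 on allocation failure. */
static int sao_buffers_reinit(SaoBuffers *b, size_t new_size)
{
    assert(new_size > 0);
    sao_buffers_free(b);            /* without this, the old buffer leaks */
    b->line_buf = malloc(new_size);
    if (!b->line_buf)
        return -1;
    b->size = new_size;
    return 0;
}
```

Calling sao_buffers_reinit() twice in a row is the case that matters: the second call must not simply overwrite the pointer obtained by the first.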
---
libavcodec/hevc.c | 9 -
1 file
Hi,
2015-02-05 9:48 GMT+01:00 Mickaël Raulet mrau...@insa-rennes.fr:
WPP try it out with thread_type=slice.
Does it mean these buffers should rather be per thread? If having 2 ctb
lines of buffer fixes this, does this mean having 2 instances of a
single line/column, one per ctb line number
Hi,
2015-03-14 18:38 GMT+01:00 Michael Niedermayer michae...@gmx.at:
static void ff_prores_idct_wrap(int16_t *dst){
-DECLARE_ALIGNED(16, static int16_t, qmat)[64];
+LOCAL_ALIGNED(16, static int16_t, qmat, [64]);
int i;
this seems to break build
ffmpeg/libavcodec/dct-test.c:
, static int16_t, qmat, [64]); \
+LOCAL_ALIGNED(16, static int16_t, tmp, [64]); \
int i; \
LOCAL_ALIGNED + static looks unintended
Same fix then.
Best regards,
Christophe
From 93a8d803f6b87e2c5bd062724630e1d67804da29 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq
2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Sat, 14 Mar 2015 14:26:16 +0100
Subject: [PATCH 1/3] lavc: use LOCAL_ALIGNED instead of DECLARE_ALIGNED
The latter may yield incorrect code for on-stack variables.
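
For context, a rough sketch of one way to get an aligned array on the stack without relying on the compiler honouring an alignment attribute there: over-allocate a byte buffer and round the pointer up. The macro and helper names below are illustrative assumptions, not the libavutil definitions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: round an address up to a power-of-two alignment. */
static inline void *align_up(void *p, uintptr_t a)
{
    assert(a && !(a & (a - 1)));    /* alignment must be a power of two */
    return (void *)(((uintptr_t)p + a - 1) & ~(a - 1));
}

/* Illustrative macro: reserve n elements of type t plus slack on the
 * stack, then expose an aligned pointer v into that storage. */
#define STACK_ALIGNED(a, t, v, n)                   \
    uint8_t v##_storage[sizeof(t) * (n) + (a)];     \
    t *v = align_up(v##_storage, (a))

/* Example use: fill and sum 64 coefficients in a 16-byte aligned array. */
static int sum_block(void)
{
    STACK_ALIGNED(16, int16_t, qmat, 64);
    int i, sum = 0;

    assert(((uintptr_t)qmat & 15) == 0);
    for (i = 0; i < 64; i++)
        qmat[i] = 1;
    for (i = 0; i < 64; i++)
        sum += qmat[i];
    return sum;
}
```

Note that such a macro expands to two declarations, so combining it with a storage-class keyword such as static does not behave like a plain array declaration.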
---
libavcodec/dct-test.c | 2 +-
libavcodec
Force an additional parameter then.
---
libavutil/internal.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libavutil/internal.h b/libavutil/internal.h
index 9ba2ea0..8081fdb 100644
--- a/libavutil/internal.h
+++ b/libavutil/internal.h
@@ -107,9 +107,9 @@
t (*v) o =
The latter may yield incorrect code for on-stack variables.
---
libavcodec/ppc/h264dsp.c| 8 +++
libavcodec/ppc/h264qpel.c | 50 -
libavcodec/ppc/vp8dsp_altivec.c | 2 +-
3 files changed, 30 insertions(+), 30 deletions(-)
diff --git
They were duplicating LOCAL_ALIGNED() without benefit.
---
configure | 8 +++-
libavcodec/aacps.c | 6 +++---
libavcodec/aacsbr.c| 6 +++---
libavcodec/ac3enc.c| 2 +-
libavcodec/ac3enc_template.c
The second patch is the most interesting, because it prevents incorrect
uses of LOCAL_ALIGNED that may have only caused warnings.
This patch is of course smaller because of the code duplication removal
of the first patch.
Christophe Gisquet (3):
lavc/lavu: remove LOCAL_ALIGNED_*
lavu
2015-03-15 8:48 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
The second patch is the most interesting, because it prevents incorrect
uses of LOCAL_ALIGNED that may have only caused warnings.
This patch is of course smaller because of the code duplication removal
of the first
2015-03-11 2:26 GMT+01:00 Michael Niedermayer michae...@gmx.at:
this breaks
make libavcodec/dct-test
...
LD libavcodec/dct-test
libavcodec/dct-test.o:(.rodata+0xe8): undefined reference to
`ff_xvid_idct_mmx'
libavcodec/dct-test.o:(.rodata+0x108): undefined reference to
Hi,
2015-03-12 14:37 GMT+01:00 Michael Niedermayer michae...@gmx.at:
const int mx = h->mv_cache[list][scan8[n]][0] + src_x_offset * 8;
int my = h->mv_cache[list][scan8[n]][1] + src_y_offset * 8;
const int luma_xy = (mx & 3) + ((my & 3) << 2);
-ptrdiff_t offset
Hi,
2015-03-13 14:34 GMT+01:00 Michael Niedermayer michae...@gmx.at:
They will not be used in any real world scenario.
well it could be useful in debugging in some cases
especially if all mmx code is disabled on x86_64, for any single
function it's not too likely
I believe fate and proper
Hi,
2015-03-13 22:28 GMT+01:00 Andreas Cadhalpun andreas.cadhal...@googlemail.com:
-int ff_eac3_parse_header(AC3DecodeContext *s);
+static int ff_eac3_parse_header(AC3DecodeContext *s);
It's somewhat cosmetic, but if these functions become static, they
would better drop the ff_ prefix.
-
---
libavcodec/x86/dct-test.c | 39 ++-
1 file changed, 26 insertions(+), 13 deletions(-)
diff --git a/libavcodec/x86/dct-test.c b/libavcodec/x86/dct-test.c
index 63a9aeb..d1a5067 100644
--- a/libavcodec/x86/dct-test.c
+++ b/libavcodec/x86/dct-test.c
@@ -26,21
Some issues found while touching dct-test. Depends on the patch series
(rather its first 2 patches, of which one was already applied) for
XviD iDCTs.
Christophe Gisquet (2):
x86: dct-test: fix compilation for prores
x86: dct-test: evaluate prores idct avx version
libavcodec/x86/dct-test.c
When the decoder is deactivated, the x86-optimized versions are
not compiled, resulting in a link error.
The C version is unaffected, as it is part of the idctdsp
subsystem.
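
A minimal sketch of guarding optional entries with a build-time switch, so that symbols from a disabled component are never referenced. The CONFIG_PRORES_DECODER flag is assumed here to be a 0/1 define supplied by the build; the table and function names are hypothetical:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption: configure-style 0/1 switch; defaults to disabled here. */
#ifndef CONFIG_PRORES_DECODER
#define CONFIG_PRORES_DECODER 0
#endif

typedef struct IdctEntry {
    const char *name;
} IdctEntry;

/* Hypothetical test table: optimized entries are listed only when the
 * component providing their symbols is actually built. */
static const IdctEntry idct_table[] = {
    { "simple_c" },         /* C version, always available */
#if CONFIG_PRORES_DECODER
    { "prores_sse2" },      /* only present with the decoder enabled */
#endif
    { NULL }                /* terminator */
};

/* Number of usable entries in the table. */
static size_t idct_count(void)
{
    size_t n = 0;
    while (idct_table[n].name)
        n++;
    return n;
}

/* Look an entry up by name; returns NULL when it was compiled out. */
static const IdctEntry *find_idct(const char *name)
{
    size_t i;
    assert(name);
    for (i = 0; idct_table[i].name; i++)
        if (!strcmp(idct_table[i].name, name))
            return &idct_table[i];
    return NULL;
}
```

The same guard would wrap references to real function pointers, which is where an unguarded entry turns into an undefined-reference link error.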
---
libavcodec/x86/dct-test.c | 8
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git
This is the remaining error, the output on the SPX samples,
respectively csi_miami_stereo_128_spx.eac3 and
csi_miami_5.1_256_spx.eac3, goes from:
stddev:8.71 PSNR: 77.52 MAXDIFF: 235
stddev:24270.51 PSNR: 22.17 MAXDIFF:47166
to:
stddev:0.12 PSNR:114.12 MAXDIFF:1
stddev:0.12
It was set to 1 instead of sqrt(3)
---
libavcodec/ac3dec.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/ac3dec.c b/libavcodec/ac3dec.c
index ce45186..ae4129f 100644
--- a/libavcodec/ac3dec.c
+++ b/libavcodec/ac3dec.c
@@ -939,7 +939,7 @@ static int
Should also improve decoding, but actually doesn't...
---
libavcodec/ac3dec.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/libavcodec/ac3dec.c b/libavcodec/ac3dec.c
index 2f78d73..ce45186 100644
--- a/libavcodec/ac3dec.c
+++ b/libavcodec/ac3dec.c
@@ -872,7 +872,7 @@ static
---
libavcodec/ac3dec.c | 8 +++-
1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/libavcodec/ac3dec.c b/libavcodec/ac3dec.c
index ae4129f..ac53bdc 100644
--- a/libavcodec/ac3dec.c
+++ b/libavcodec/ac3dec.c
@@ -924,14 +924,13 @@ static int decode_audio_block(AC3DecodeContext *s,
The SPX blend factors were incorrectly computed. For now, still use
floating point operations, with proper scaling, but the interested
parties may want to write the proper FP.23 code.
Christophe Gisquet (4):
ac3_fixed: fix out-of-bound read
ac3_fixed: fix computation of spx_noise_blend
The latter may yield incorrect code for on-stack variables.
---
libswscale/ppc/swscale_altivec.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libswscale/ppc/swscale_altivec.c b/libswscale/ppc/swscale_altivec.c
index a1548a7..3034c72 100644
---
The latter may yield incorrect code for on-stack variables.
---
libavcodec/x86/ac3dsp_init.c | 2 +-
libavcodec/x86/cavsdsp.c | 4 ++--
libavcodec/x86/dct-test.c | 4 ++--
libavcodec/x86/h264_qpel.c| 22 +++---
libavcodec/x86/rv40dsp_init.c | 2 +-
They were duplicating LOCAL_ALIGNED() without benefit.
---
configure | 8 +++-
libavcodec/aacps.c | 6 +++---
libavcodec/aacsbr.c| 6 +++---
libavcodec/ac3enc.c| 2 +-
libavcodec/ac3enc_template.c
The latter may yield incorrect code for on-stack variables.
---
libavcodec/dct-test.c | 2 +-
libavcodec/h264_loopfilter.c| 10 +-
libavcodec/proresenc_anatoliy.c | 3 ++-
libavcodec/vp8.c| 2 +-
4 files changed, 9 insertions(+), 8 deletions(-)
diff --git
The latter may yield incorrect code for on-stack variables.
---
libavcodec/ppc/h264dsp.c| 10 -
libavcodec/ppc/h264qpel.c | 50 -
libavcodec/ppc/vp8dsp_altivec.c | 2 +-
3 files changed, 31 insertions(+), 31 deletions(-)
diff --git
* macros.
All changes (core+x86) tested with:
fate-vfilter fate-vcodec fate-h264 fate-ac3 fate-vp8 fate-cavs fate-vc1
for Win32 and Win64.
Christophe Gisquet (5):
lavc: use LOCAL_ALIGNED instead of DECLARE_ALIGNED
x86: lavc: use LOCAL_ALIGNED instead of DECLARE_ALIGNED
ppc: libswscale: use
of the dct-test stuff.
Incidentally, I found other issues with dct-test, but they are outside
the scope of this patch series.
--
Christophe
From 866b481ecab3369712ff854ce6c0857b875b50e6 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Tue, 10 Mar 2015 23:11:52
2015-03-11 0:11 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
---
libavcodec/x86/xvididct.asm| 202
-
libavcodec/x86/xvididct_init.c | 8 +-
2 files changed, 140 insertions(+), 70 deletions(-)
Not sure it needed a refresh, but here
-xvid-idct and dct-test builds and runs on both Win32 and Win64.
--
Christophe
From 86da5a1f111f9f36318daa906c3245d6b883feb3 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Tue, 10 Mar 2015 23:11:51 +
Subject: [PATCH 1/4] x86: xvid_idct: port SSE2 iDCT
2015-03-11 0:11 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
---
libavcodec/x86/xvididct.asm| 92
--
libavcodec/x86/xvididct_init.c | 9 +
2 files changed, 91 insertions(+), 10 deletions(-)
Another refresh.
From
Hi,
2015-03-12 20:14 GMT+01:00 Christophe Gisquet christophe.gisq...@gmail.com:
Not sure it needed a refresh, but here is one.
btw, the incorrect %warning code is actually dead, placeholder code
for the following patch 4/4 of the series.
--
Christophe
This recently caused me some issues, as in, being ignored.
--
Christophe
From 5c1b07147502135a9f6a04a1edcf060a1575efd3 Mon Sep 17 00:00:00 2001
From: Christophe Gisquet christophe.gisq...@gmail.com
Date: Sun, 8 Mar 2015 17:54:25 +0100
Subject: [PATCH] x86: Makefile: fix DBG parameter evaluation
---
libavcodec/x86/xvididct.asm| 92 --
libavcodec/x86/xvididct_init.c | 9 +
2 files changed, 91 insertions(+), 10 deletions(-)
diff --git a/libavcodec/x86/xvididct.asm b/libavcodec/x86/xvididct.asm
index 58ffb11..0220885 100644
---
100644
--- a/libavcodec/x86/xvididct.asm
+++ b/libavcodec/x86/xvididct.asm
@@ -1,5 +1,9 @@
; XVID MPEG-4 VIDEO CODEC
-; - SSE2 inverse discrete cosine transform -
+;
+; Conversion from gcc syntax to x264asm syntax with modifications
+; by Christophe Gisquet christophe.gisq...@gmail.com
The main difference consists in properly renaming labels, and
letting yasm select the gprs for skipping 1D transforms.
---
libavcodec/x86/Makefile| 2 +-
libavcodec/x86/xvididct.asm| 379 ++
libavcodec/x86/xvididct_init.c | 18 +-
---
libavcodec/x86/xvididct.asm| 202 -
libavcodec/x86/xvididct_init.c | 8 +-
2 files changed, 140 insertions(+), 70 deletions(-)
diff --git a/libavcodec/x86/xvididct.asm b/libavcodec/x86/xvididct.asm
index 4c52bf1..58ffb11 100644
---