Re: [FFmpeg-devel] [PATCH] Disable MSA for big-endian mips cpu
The currently upstreamed MSA code has been written and tested only for little-endian systems. We plan to add big-endian support in the near future; until then, all of it needs to be disabled on big-endian builds to avoid its use and the resulting failures.

-----Original Message-----
From: Michael Niedermayer [mailto:mich...@niedermayer.cc]
Sent: 26 May 2017 19:13
To: FFmpeg development discussions and patches
Cc: Shivraj Patil
Subject: Re: [FFmpeg-devel] [PATCH] Disable MSA for big-endian mips cpu

On Fri, May 26, 2017 at 03:40:20PM +0200, Michael Niedermayer wrote:
> On Fri, May 26, 2017 at 04:08:55PM +0530, shivraj.pa...@imgtec.com wrote:
> > From: Shivraj Patil <shivraj.pa...@imgtec.com>
> >
> > Signed-off-by: Shivraj Patil <shivraj.pa...@imgtec.com>
> > ---
> >  libavcodec/mips/Makefile                    |    2 ++
> >  libavcodec/mips/blockdsp_init_mips.c        |    8
> >  libavcodec/mips/h263dsp_init_mips.c         |    8
> >  libavcodec/mips/h264chroma_init_mips.c      |    8
> >  libavcodec/mips/h264dsp_init_mips.c         |    8
> >  libavcodec/mips/h264pred_init_mips.c        |    8
> >  libavcodec/mips/h264qpel_init_mips.c        |    8
> >  libavcodec/mips/hevcdsp_init_mips.c         |    8
> >  libavcodec/mips/hevcpred_init_mips.c        |    8
> >  libavcodec/mips/hpeldsp_init_mips.c         |    8
> >  libavcodec/mips/idctdsp_init_mips.c         |    8
> >  libavcodec/mips/me_cmp_init_mips.c          |    8
> >  libavcodec/mips/mpegvideo_init_mips.c       |    8
> >  libavcodec/mips/mpegvideoencdsp_init_mips.c |    8
> >  libavcodec/mips/pixblockdsp_init_mips.c     |    8
> >  libavcodec/mips/qpeldsp_init_mips.c         |    8
> >  libavcodec/mips/vp8dsp_init_mips.c          |    8
> >  libavcodec/mips/vp9dsp_init_mips.c          |    8
> >  18 files changed, 70 insertions(+), 68 deletions(-)
>
> Why does none of this code work on big endian mips ?
>
> Is it difficult to make it work ?
>
> Is it certain that the "disabled" code does not work on big endian
> mips ?
>
> Is it known that the reason for it not working is the endianness or
> could it be an unrelated issue that makes it work on neither endianness?

and i forgot the CC, so replying with CC (sorry)

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

The misfortune of the wise is better than the prosperity of the fool.
-- Epicurus
___
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel
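The thread does not say which parts of the MSA code are endian-sensitive. Purely as a generic illustration of the kind of construct that behaves differently on big-endian and little-endian CPUs, here is a minimal C sketch; all names in it are made up for this example and none of it is taken from the FFmpeg tree.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret four bytes as a native 32-bit word. The numeric value of the
 * result depends on the byte order of the CPU running the code. */
static uint32_t load_native32(const uint8_t *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Assemble the word explicitly as little-endian. This gives the same
 * numeric value on any CPU, at the cost of a few shifts. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0]         | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 1 on a little-endian CPU and 0 otherwise. */
static int cpu_is_little_endian(void)
{
    const uint8_t probe[4] = { 1, 0, 0, 0 };
    return load_native32(probe) == 1;
}
```

Code that was only ever run on little-endian hardware can silently rely on the first form, which is one plausible way for an optimization to need separate big-endian work.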
Re: [FFmpeg-devel] [PATCH] Disable MSA for big-endian mips cpu
> Is this on top of the configure patch?

Shivraj: No, this is a completely new patch.

> I'm a little confused. It seems the configure patch would be much simpler, no?

Shivraj: This patch follows Michael's suggestion.

From: Ronald S. Bultje [mailto:rsbul...@gmail.com]
Sent: 26 May 2017 17:26
To: FFmpeg development discussions and patches
Cc: Shivraj Patil
Subject: Re: [FFmpeg-devel] [PATCH] Disable MSA for big-endian mips cpu

Hi,

On Fri, May 26, 2017 at 6:38 AM, <shivraj.pa...@imgtec.com> wrote:
From: Shivraj Patil <shivraj.pa...@imgtec.com>

Signed-off-by: Shivraj Patil <shivraj.pa...@imgtec.com>
---
 libavcodec/mips/Makefile                    |    2 ++
 libavcodec/mips/blockdsp_init_mips.c        |    8
 libavcodec/mips/h263dsp_init_mips.c         |    8
 libavcodec/mips/h264chroma_init_mips.c      |    8
 libavcodec/mips/h264dsp_init_mips.c         |    8
 libavcodec/mips/h264pred_init_mips.c        |    8
 libavcodec/mips/h264qpel_init_mips.c        |    8
 libavcodec/mips/hevcdsp_init_mips.c         |    8
 libavcodec/mips/hevcpred_init_mips.c        |    8
 libavcodec/mips/hpeldsp_init_mips.c         |    8
 libavcodec/mips/idctdsp_init_mips.c         |    8
 libavcodec/mips/me_cmp_init_mips.c          |    8
 libavcodec/mips/mpegvideo_init_mips.c       |    8
 libavcodec/mips/mpegvideoencdsp_init_mips.c |    8
 libavcodec/mips/pixblockdsp_init_mips.c     |    8
 libavcodec/mips/qpeldsp_init_mips.c         |    8
 libavcodec/mips/vp8dsp_init_mips.c          |    8
 libavcodec/mips/vp9dsp_init_mips.c          |    8
 18 files changed, 70 insertions(+), 68 deletions(-)

Ronald
Re: [FFmpeg-devel] [PATCH] Disable MSA optimization for big endian arch
Shivraj: Yes, the -mmsa flag will be added, and that should not be an issue for big-endian MIPS builds.

> +    if enabled bigendian && enabled msa; then
> +        disable msa
> +    fi

Since the MSA optimizations do not currently support big endian, the above code disables MSA so that the build falls back to the default C functions.

-----Original Message-----
From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of Michael Niedermayer
Sent: 16 May 2017 20:22
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH] Disable MSA optimization for big endian arch

On Mon, Apr 24, 2017 at 05:33:22PM +0530, shivraj.pa...@imgtec.com wrote:
> From: Shivraj Patil <shivraj.pa...@imgtec.com>
>
> Signed-off-by: Shivraj Patil <shivraj.pa...@imgtec.com>
> ---
>  configure |    4
>  1 file changed, 4 insertions(+)
>
> diff --git a/configure b/configure
> index 1e3463c..c63a48a 100755
> --- a/configure
> +++ b/configure
> @@ -5357,6 +5357,10 @@ elif enabled mips; then
>      enabled mipsdsp && check_inline_asm_flags mipsdsp '"addu.qb $t0, $t1, $t2"' '-mdsp'
>      enabled mipsdspr2 && check_inline_asm_flags mipsdspr2 '"absq_s.qb $t0, $t1"' '-mdspr2'
>
> +    if enabled bigendian && enabled msa; then
> +        disable msa
> +    fi

the order of this looks a bit odd
for example there is above:

    enabled mipsfpu && enabled msa && check_inline_asm_flags msa '"addvi.b $w0, $w1, 1"' '-mmsa' && check_header msa.h || disable msa

I think this would add -mmsa to the flags or disable msa already
with the code you add msa is disabled but -mmsa is left in the flags

Please correct me if i am wrong.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No snowflake in an avalanche ever feels responsible. -- Voltaire
Re: [FFmpeg-devel] [PATCH] Disable MSA optimization for big endian arch
Hi,

Can anyone review the patch please?

-----Original Message-----
From: Shivraj Patil
Sent: 24 April 2017 17:33
To: ffmpeg-devel@ffmpeg.org
Cc: Shivraj Patil
Subject: [PATCH] Disable MSA optimization for big endian arch

From: Shivraj Patil <shivraj.pa...@imgtec.com>

Signed-off-by: Shivraj Patil <shivraj.pa...@imgtec.com>
---
 configure |    4
 1 file changed, 4 insertions(+)

diff --git a/configure b/configure
index 1e3463c..c63a48a 100755
--- a/configure
+++ b/configure
@@ -5357,6 +5357,10 @@ elif enabled mips; then
     enabled mipsdsp && check_inline_asm_flags mipsdsp '"addu.qb $t0, $t1, $t2"' '-mdsp'
     enabled mipsdspr2 && check_inline_asm_flags mipsdspr2 '"absq_s.qb $t0, $t1"' '-mdspr2'

+    if enabled bigendian && enabled msa; then
+        disable msa
+    fi
+
 elif enabled parisc; then

     if enabled gcc; then
-- 
1.7.9.5
Re: [FFmpeg-devel] [PATCH] Build fix for MIPS
Hi,

Updated the patch as per the comments. Since hevcpred_init_mips.c needs the definitions of HAVE_MSA and av_cold, I have now included config.h and libavutil/attributes.h in this file.

Thanks,
Shivraj

From: Ronald S. Bultje [mailto:rsbul...@gmail.com]
Sent: 04 April 2017 17:47
To: Shivraj Patil
Cc: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH] Build fix for MIPS

On Tue, Apr 4, 2017 at 8:05 AM, Shivraj Patil <shivraj.pa...@imgtec.com> wrote:

Hi,

> diff --git a/libavcodec/hevcpred.h b/libavcodec/hevcpred.h
> [..]
> @@ -26,6 +26,8 @@
>  #include
>  #include
>
> +#include "get_bits.h"
> +

hevcpred does not depend on get_bits.h and does not use any symbols (get_bits* or GetBit*) from it. What compiler error does this fix?

> diff --git a/libavcodec/mips/hevcpred_msa.c b/libavcodec/mips/hevcpred_msa.c
> [..]
> -#include "libavcodec/hevc.h"
> +#include "libavcodec/hevcdec.h"

This looks correct.

Ronald

Attachment: 0001-build-fix-for-mips.patch
Re: [FFmpeg-devel] [PATCH] Build fix for MIPS
Hi,

The above patch fixes the MIPS build bug.

Shivraj.

-----Original Message-----
From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of Michael Niedermayer
Sent: 31 March 2017 15:22
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH] Build fix for MIPS

On Fri, Mar 31, 2017 at 05:32:55AM +, Shivraj Patil wrote:
> Hi,
>
> On Thu, Mar 30, 2017 at 2:21 AM, <shivraj.pa...@imgtec.com> wrote:
> From: Shivraj Patil <shivraj.pa...@imgtec.com>
>
> Signed-off-by: Shivraj Patil <shivraj.pa...@imgtec.com>
> ---
>  libavcodec/mips/hevcpred_mips.h |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/libavcodec/mips/hevcpred_mips.h b/libavcodec/mips/hevcpred_mips.h
> index 12f57a2..fcd687b 100644
> --- a/libavcodec/mips/hevcpred_mips.h
> +++ b/libavcodec/mips/hevcpred_mips.h
> @@ -21,7 +21,7 @@
>  #ifndef AVCODEC_MIPS_HEVCPRED_MIPS_H
>  #define AVCODEC_MIPS_HEVCPRED_MIPS_H
>
> -#include "libavcodec/hevcdsp.h"
> +#include "libavcodec/hevcdec.h"
>
> Shouldn't this be hevcpred.h?
>
> Shivraj:- There is no definition of "HEVCContext" in hevcpred.h, hence the following error was observed:
> ./libavcodec/mips/hevcpred_mips.h:70:32: error: unknown type name HEVCContext
> void ff_intra_pred_8_16x16_msa(HEVCContext *s, int x0, int y0, int c_idx);

i have a fix for this locally, will be in my next push

thx

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

If a bugfix only changes things apparently unrelated to the bug with no
further explanation, that is a good sign that the bugfix is wrong.

Attachment: 0001-build-fix-for-mips.patch
Re: [FFmpeg-devel] [PATCH] Build fix for MIPS
Hi,

On Thu, Mar 30, 2017 at 2:21 AM, <shivraj.pa...@imgtec.com> wrote:
From: Shivraj Patil <shivraj.pa...@imgtec.com>

Signed-off-by: Shivraj Patil <shivraj.pa...@imgtec.com>
---
 libavcodec/mips/hevcpred_mips.h |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/libavcodec/mips/hevcpred_mips.h b/libavcodec/mips/hevcpred_mips.h
index 12f57a2..fcd687b 100644
--- a/libavcodec/mips/hevcpred_mips.h
+++ b/libavcodec/mips/hevcpred_mips.h
@@ -21,7 +21,7 @@
 #ifndef AVCODEC_MIPS_HEVCPRED_MIPS_H
 #define AVCODEC_MIPS_HEVCPRED_MIPS_H

-#include "libavcodec/hevcdsp.h"
+#include "libavcodec/hevcdec.h"

Shouldn't this be hevcpred.h?

Shivraj:- There is no definition of "HEVCContext" in hevcpred.h, hence the following error was observed:

./libavcodec/mips/hevcpred_mips.h:70:32: error: unknown type name HEVCContext
void ff_intra_pred_8_16x16_msa(HEVCContext *s, int x0, int y0, int c_idx);

Ronald
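The "unknown type name" error above appears when a prototype uses a typedef name that no included header has declared. The patch resolves it by including a header that defines the type. As a general C illustration only, and not a description of what the FFmpeg headers do, a header that merely passes pointers around can also get by with a forward declaration of the struct. The type and function names below are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for a context type that is fully defined in some
 * other, heavier header. A forward declaration is enough as long as this
 * file only passes pointers to it around. */
struct DemoContext;

/* This prototype compiles without the full definition being visible. */
int demo_intra_pred(struct DemoContext *s, int x0, int y0, int c_idx);

/* In a real tree this definition would live in the "heavy" header and only
 * the implementation file would include it. */
struct DemoContext {
    int calls;
};

int demo_intra_pred(struct DemoContext *s, int x0, int y0, int c_idx)
{
    if (s == NULL)
        return -1;
    s->calls++;               /* member access needs the full definition */
    return x0 + y0 + c_idx;
}
```

The trade-off is that a forward declaration keeps include dependencies small, whereas including the defining header is simpler when the typedef name itself is used in the prototypes.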
[FFmpeg-devel] Tools for MIPS MSA (MIPS SIMD-Arch)
Hi all,

Please find below the toolchain and QEMU installables for building and testing the MIPS p5600 and i6400 CPU targets.

MIPS Toolchain:
* Download the toolchain for MIPS 32-bit (p5600) from the following link (for a 64-bit Linux host):
  http://codescape-mips-sdk.imgtec.com/components/toolchain/2015.01-5/Codescape.GNU.Tools.2015.01-5.for.MIPS.MTI.Linux.CentOS-5.x86_64.tar.gz
* Download the toolchain for MIPS 64-bit (i6400) from the following link (for a 64-bit Linux host):
  http://codescape-mips-sdk.imgtec.com/components/toolchain/2015.01-5/Codescape.GNU.Tools.2015.01-5.for.MIPS.IMG.Linux.CentOS-5.x86_64.tar.gz

QEMU:
* Download QEMU from the following link:
  https://github.com/prplfoundation/qemu/archive/rel/2.4.0.1.0.tar.gz
* Untar 2.4.0.1.0.tar.gz
* Configure: ./configure --target-list="mipsel-linux-user mips64el-linux-user"
* make

FFmpeg Configure (cpu p5600):
* ./configure --enable-cross-compile --cross-prefix=/install-mips-mti-linux-gnu/bin/mips-mti-linux-gnu- --arch=mips --cpu=p5600 --target-os=linux --extra-cflags="-EL -static" --extra-ldflags="-EL -static" --target-exec="/mipsel-linux-user/qemu-mipsel -cpu p5600" --disable-iconv
* make SAMPLES=./fate-suite/ fate

FFmpeg Configure (cpu i6400):
* ./configure --enable-cross-compile --cross-prefix=/install-mips-img-linux-gnu/bin/mips-img-linux-gnu- --arch=mips64 --cpu=i6400 --target-os=linux --extra-cflags="-EL -static" --extra-ldflags="-EL -static" --target-exec="/mips64el-linux-user/qemu-mips64el -cpu i6400" --disable-iconv
* make SAMPLES=./fate-suite/ fate

Shivraj
Re: [FFmpeg-devel] [PATCH 1/2] avcodec/mips: build fix for MSA
-----Original Message-----
From: Michael Niedermayer [mailto:mich...@niedermayer.cc]
Sent: 08 October 2015 00:16
To: FFmpeg development discussions and patches
Cc: Shivraj Patil; Parag Salasakar; Manojkumar Bhosale
Subject: Re: [FFmpeg-devel] [PATCH 1/2] avcodec/mips: build fix for MSA

On Wed, Oct 07, 2015 at 06:20:53PM +0530, shivraj.pa...@imgtec.com wrote:
> From: Shivraj Patil <shivraj.pa...@imgtec.com>
>
> Modified sps and pps access from old HEVCContext(s) structure to newly
> introduced HEVCParamSets(ps)
>
> Signed-off-by: Shivraj Patil <shivraj.pa...@imgtec.com>
> ---
>  libavcodec/mips/hevcpred_msa.c |  282
>  1 file changed, 141 insertions(+), 141 deletions(-)

does this or any other patch need to be backported to release/2.8 ?

Shivraj:- Yes, both patches need to be backported to release/2.8.

if so please backport them (assuming its not just a cherry pick without conflicts)

Shivraj:- I am not sure about the backport steps, so I have submitted both patches prepared against a release/2.8 clone.

thanks

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

No snowflake in an avalanche ever feels responsible. -- Voltaire
Re: [FFmpeg-devel] [PATCH 2/2] avcodec/mips: build fix for MSA 64bit
-----Original Message-----
From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of Michael Niedermayer
Sent: 08 October 2015 18:00
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH 2/2] avcodec/mips: build fix for MSA 64bit

On Thu, Oct 08, 2015 at 02:47:52PM +0530, shivraj.pa...@imgtec.com wrote:
> From: Shivraj Patil <shivraj.pa...@imgtec.com>
>
> Modified datatype of function argument (pitch from int32_t to ptrdiff_t).
>
> Signed-off-by: Shivraj Patil <shivraj.pa...@imgtec.com>
> ---
>  libavcodec/mips/vp9_lpf_msa.c |   42
>  1 file changed, 21 insertions(+), 21 deletions(-)

applied

in the future please add a note to the commit message to point to the hash of the commit in master

Shivraj:- Sure, Michael. Also, could you please let us know what build/test setup is currently in place for MIPS MSA (p5600 and i6400 CPU configs)? If none is set up, we will provide the toolchain and QEMU paths.

Thanks,

thanks

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In fact, the RIAA has been known to suggest that students drop out of college or go to community college in order to be able to afford settlements. -- The RIAA
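The commit message above only states that the pitch argument changed from int32_t to ptrdiff_t. One plausible reason such a change matters on 64-bit targets is that a function assigned to a function pointer must match the pointer's parameter types exactly; on 32-bit MIPS both types are commonly a plain int, while on 64-bit targets ptrdiff_t is usually wider. The sketch below illustrates this with hypothetical names and is not the code from vp9_lpf_msa.c.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function-pointer type that takes the row pitch as ptrdiff_t,
 * the way a DSP context might declare it. */
typedef int (*demo_row_fn)(const uint8_t *src, ptrdiff_t pitch, int rows);

/* Sums the first byte of each row. The pitch may be negative, which is one
 * more reason to use a signed, pointer-sized type for it. */
static int demo_sum_first_column(const uint8_t *src, ptrdiff_t pitch, int rows)
{
    int sum = 0;
    for (int i = 0; i < rows; i++)
        sum += src[i * pitch];
    return sum;
}

/* This assignment is well-typed because the parameter lists match. Had the
 * function taken "int32_t pitch", compilers would report an incompatible
 * pointer type wherever ptrdiff_t is not the same type as int32_t. */
static const demo_row_fn demo_fn = demo_sum_first_column;
```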
Re: [FFmpeg-devel] [PATCH] avutil: runtime cpu detection for mips
On Wed, Sep 30, 2015 at 02:21:18PM +, Shivraj Patil wrote:
> On Wed, Sep 30, 2015 at 07:03:46PM +0530, shivraj.pa...@imgtec.com wrote:
> > From: Shivraj Patil <shivraj.pa...@imgtec.com>
> [...]
> > +static int get_cpuinfo(uint32_t *hwcap) {
> > +    FILE *f = fopen("/proc/cpuinfo", "r");
>
> under qemu i get this:
>
> cpu_flags(raw) = 0x
> cpu_flags_str(raw) =
> cpu_flags(effective) = 0x
> cpu_flags_str(effective) =
> threads = 1 (cpu_count = 12)
>
> IIUC this disables all cpu extensions
> is that intended ?
> (we currently only have loongson hardware to test so this would mean
> an end to testing imgtec specific mips optimizations)
>
> Shivraj:- cpu detection will not work for qemu, however we have tested it on
> mips hardware at our end.

i am not speaking about just this patch but we have just qemu to test on imgtec - mips
i can apply this patch but it means that in the future FFmpeg will then only be tested on loongson because you effectively disable our only way to test code on imgtec mips
its very strange that you want this

Shivraj:- I understand the concern with this patch and therefore request that you discard it for now. It makes more sense to resubmit a modified version in the near future, once MSA-enabled devices are set up for testing. Until then, please keep the MIPS testing under QEMU as it is.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

You can kill me, but you cannot change the truth.
Re: [FFmpeg-devel] [PATCH] avutil: runtime cpu detection for mips
On Wed, Sep 30, 2015 at 07:03:46PM +0530, shivraj.pa...@imgtec.com wrote:
> From: Shivraj Patil <shivraj.pa...@imgtec.com>
[...]
> +static int get_cpuinfo(uint32_t *hwcap) {
> +    FILE *f = fopen("/proc/cpuinfo", "r");

under qemu i get this:

cpu_flags(raw) = 0x
cpu_flags_str(raw) =
cpu_flags(effective) = 0x
cpu_flags_str(effective) =
threads = 1 (cpu_count = 12)

IIUC this disables all cpu extensions
is that intended ?
(we currently only have loongson hardware to test so this would mean an end to testing imgtec specific mips optimizations)

Shivraj:- cpu detection will not work for qemu, however we have tested it on mips hardware at our end.

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

Avoid a single point of failure, be that a person or equipment.
Re: [FFmpeg-devel] [PATCH] avutil: runtime cpu detection for mips
imgtec.com> writes:

> +    FILE *f = fopen("/proc/cpuinfo", "r");

Is this what every other software for mips does?
How does the kernel (or whatever sets cpuinfo) know?

Shivraj:- We have used the generic /proc/cpuinfo because it can be read from user space without restriction, whereas, for now, the hwcaps are set by the kernel (via low-level MIPS control registers) and are accessible only in kernel space.

Thanks,

Carl Eugen
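For readers who have not seen the approach under discussion, the following is a minimal sketch of detecting CPU extensions by scanning a cpuinfo-style text file. It is not the code from the patch. The flag names are hypothetical, and the sketch assumes that the MIPS Linux kernel lists extensions on a line beginning with "ASEs implemented", which should be checked on the target system.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical capability bits, named for this sketch only. */
#define DEMO_CPU_FLAG_MSA (1u << 0)
#define DEMO_CPU_FLAG_DSP (1u << 1)

/* Returns 1 if `word` occurs as a whole whitespace-separated token in `list`. */
static int demo_has_token(const char *list, const char *word)
{
    size_t n = strlen(word);
    const char *p = list;

    while ((p = strstr(p, word)) != NULL) {
        int starts = (p == list) || p[-1] == ' ' || p[-1] == '\t';
        int ends   = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += n;
    }
    return 0;
}

/* Parses one line. Only a line starting with the assumed key contributes. */
static unsigned demo_parse_cpuinfo_line(const char *line)
{
    static const char key[] = "ASEs implemented";
    const char *colon;
    unsigned flags = 0;

    if (strncmp(line, key, sizeof(key) - 1) != 0)
        return 0;
    colon = strchr(line, ':');
    if (colon == NULL)
        return 0;
    if (demo_has_token(colon + 1, "msa"))
        flags |= DEMO_CPU_FLAG_MSA;
    if (demo_has_token(colon + 1, "dsp"))
        flags |= DEMO_CPU_FLAG_DSP;
    return flags;
}

/* Scans a cpuinfo-style file. Returns no flags if the file cannot be opened
 * or contains no matching line. */
static unsigned demo_cpu_flags_from_file(const char *path)
{
    char line[512];
    unsigned flags = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return 0;
    while (fgets(line, sizeof(line), f) != NULL)
        flags |= demo_parse_cpuinfo_line(line);
    fclose(f);
    return flags;
}
```

Calling demo_cpu_flags_from_file("/proc/cpuinfo") yields no flags whenever the file lacks the expected line. That is consistent with the empty cpu_flags reported under QEMU in this thread, although the thread itself does not state the exact cause.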
Re: [FFmpeg-devel] [PATCH] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP8 functions
Hi,

> +++ b/libavcodec/mips/vp8dsp_init_mips.c
>
> Is there a reason the init code lives in a different file than the implementations? It seems to me all symbols could be static if the init code lived in the same file as the implementation. This isn't a big deal, just wondering.
>
> Shivraj:- Yes, the files can be merged, we just followed the tradition as done by other platforms.

Well, so, I have to explain this for it to make sense. On most platforms, like x86 and arm, we use raw assembly (in .asm files), where the init code (for function pointer assignment) is C, so they can't logically live in the same files. On platforms where we use C-ish languages for platform-specific optimizations (e.g. intrinsics for altivec, or gcc-style inline assembly for x86), I believe the preference would be to use static functions and merge init/code in the same file.

Shivraj:- The *_init_mips.c file will contain the pointer initialization for all MIPS extensions. Right now only one extension is present, MIPS SIMD Arch (MSA), and the *_msa.c file contains the optimized code for it. In future, for other MIPS extensions (such as DSP ASE R2), pointer initialization will still happen in *_init_mips.c, but the sources will be in a separate file, *_dspaser2.c.

Thanks,
Shivraj
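The layout described above, with one init function that assigns pointers and one file of implementations per extension, can be pictured with the following self-contained sketch. Everything in it is hypothetical: the context, the function names and the have_ext switch stand in for the real DSP contexts and the HAVE_MSA style checks, and the "extension" version is ordinary C so that the example can run anywhere.

```c
#include <stdint.h>

/* Hypothetical DSP context holding one function pointer. */
typedef struct DemoDSPContext {
    void (*add_bias)(uint8_t *dst, int n, int bias);
} DemoDSPContext;

static uint8_t demo_clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* Generic C implementation, always available. */
static void demo_add_bias_c(uint8_t *dst, int n, int bias)
{
    for (int i = 0; i < n; i++)
        dst[i] = demo_clip_u8(dst[i] + bias);
}

/* Stand-in for an extension-specific version. In the layout described above
 * it would live in its own file and be declared in a shared header, so it
 * cannot be static. Here it simply handles four elements per iteration. */
void demo_add_bias_ext(uint8_t *dst, int n, int bias);

void demo_add_bias_ext(uint8_t *dst, int n, int bias)
{
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        dst[i + 0] = demo_clip_u8(dst[i + 0] + bias);
        dst[i + 1] = demo_clip_u8(dst[i + 1] + bias);
        dst[i + 2] = demo_clip_u8(dst[i + 2] + bias);
        dst[i + 3] = demo_clip_u8(dst[i + 3] + bias);
    }
    for (; i < n; i++)
        dst[i] = demo_clip_u8(dst[i] + bias);
}

/* Stand-in for the init file: start from the C version, then let each
 * available extension override the pointers it accelerates. */
void demo_dsp_init(DemoDSPContext *c, int have_ext);

void demo_dsp_init(DemoDSPContext *c, int have_ext)
{
    c->add_bias = demo_add_bias_c;
    if (have_ext)
        c->add_bias = demo_add_bias_ext;
}
```

With several extensions, the init function simply gains one more conditional block per extension, which is the arrangement Shivraj describes. Merging init and implementation into one file, as Ronald suggests, would allow demo_add_bias_ext to become static.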
Re: [FFmpeg-devel] [PATCH 2/2] avcodec/mips: h264qpel init add missing mc00 msa optimization
Hi,

-----Original Message-----
From: ffmpeg-devel [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of ???
Sent: 04 August 2015 17:35
To: ffmpeg-devel
Subject: [FFmpeg-devel] [PATCH 2/2] avcodec/mips: h264qpel init add missing mc00 msa optimization

From 734eabc92df1b6ca26a943f9723e47a838d859f7 Mon Sep 17 00:00:00 2001
From: ZhouXiaoyong <zhouxiaoy...@loongson.cn>
Date: Tue, 4 Aug 2015 19:39:51 +0800
Subject: [PATCH 2/2] avcodec/mips: h264qpel init add missing mc00 msa optimization

Signed-off-by: ZhouXiaoyong <zhouxiaoy...@loongson.cn>
---
 libavcodec/mips/h264qpel_init_mips.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/libavcodec/mips/h264qpel_init_mips.c b/libavcodec/mips/h264qpel_init_mips.c
index cfa5854..72797f1 100644
--- a/libavcodec/mips/h264qpel_init_mips.c
+++ b/libavcodec/mips/h264qpel_init_mips.c
@@ -59,6 +59,7 @@ static av_cold void h264qpel_init_msa(H264QpelContext *c, int bit_depth)
     c->put_h264_qpel_pixels_tab[1][14] = ff_put_h264_qpel8_mc23_msa;
     c->put_h264_qpel_pixels_tab[1][15] = ff_put_h264_qpel8_mc33_msa;

+    c->put_h264_qpel_pixels_tab[2][0] = ff_put_h264_qpel4_mc00_msa;
     c->put_h264_qpel_pixels_tab[2][1] = ff_put_h264_qpel4_mc10_msa;
     c->put_h264_qpel_pixels_tab[2][2] = ff_put_h264_qpel4_mc20_msa;
     c->put_h264_qpel_pixels_tab[2][3] = ff_put_h264_qpel4_mc30_msa;
-- 
2.1.0

Shivraj:- The c->put_h264_qpel_pixels_tab[2][0] = ff_put_h264_qpel4_mc00_msa initialization is not missing. ff_put_h264_qpel4_mc00_msa is deliberately not implemented, because it gives no speedup over its C counterpart.
Re: [FFmpeg-devel] [PATCH] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP8 functions
Hi,

On Mon, Jul 27, 2015 at 6:01 AM, shivraj.pa...@imgtec.com wrote:

+++ b/libavcodec/mips/vp8dsp_init_mips.c

Is there a reason the init code lives in a different file than the implementations? It seems to me all symbols could be static if the init code lived in the same file as the implementation. This isn't a big deal, just wondering.

Shivraj:- Yes, the files can be merged, we just followed the tradition as done by other platforms.

+++ b/libavcodec/mips/vp8dsp_msa.c

Please split this file in 3, one for loopfilter, one for MC and one for idct-related stuff. I agree the x86 code is a bad example in this respect, but it's a good idea to improve this going forward.

Shivraj:- Incorporated changes accordingly.

+void ff_put_vp8_epel4_h4_msa(uint8_t *dst, ptrdiff_t dst_stride,
+                             uint8_t *src, ptrdiff_t src_stride,
+                             int height, int mx, int my)
+{
+    const int8_t *filter = subpel_filters_msa[mx - 1];
+
+    if (2 == height) {
+        common_hz_4t_4x2_msa(src, src_stride, dst, dst_stride, filter);

Is this ever true? I don't think blocksize goes below 4 in either dimension, ever.

+void ff_put_vp8_epel4_v4_msa(uint8_t *dst, ptrdiff_t dst_stride,
+                             uint8_t *src, ptrdiff_t src_stride,
+                             int height, int mx, int my)
+{
+    const int8_t *filter = subpel_filters_msa[my - 1];
+
+    if (2 == height) {
+        common_vt_4t_4x2_msa(src, src_stride, dst, dst_stride, filter);

Same.

+void ff_put_vp8_pixels8_msa(uint8_t *dst, ptrdiff_t dst_stride,
+                            uint8_t *src, ptrdiff_t src_stride,
+                            int height, int mx, int my)
+{
+    int32_t cnt;
+    uint64_t out0, out1, out2, out3, out4, out5, out6, out7;
+    v16u8 src0, src1, src2, src3, src4, src5, src6, src7;
+
+    if (0 == height % 12) {

Is this ever true? My impression was blocksize could ever only be a power of two (4, 8, 16) in vp8.

+    } else if (0 == height % 2) {

Same as above, blocksize should never be 2. I don't think this code ever executes.

+void ff_put_vp8_pixels16_msa(uint8_t *dst, ptrdiff_t dst_stride,
+                             uint8_t *src, ptrdiff_t src_stride,
+                             int height, int mx, int my)
+{
+    int32_t cnt;
+    v16u8 src0, src1, src2, src3, src4, src5, src6, src7;
+
+    if (0 == height % 12) {

Same as above.

Shivraj:- All cases of height 2, 6, 12 are removed.

The remainder of the patch looks OK to me.

shivraj
Re: [FFmpeg-devel] [PATCH] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP9 bilinear functions
Hi,

On Mon, Jul 27, 2015 at 7:59 AM, shivraj.pa...@imgtec.com wrote:
From: Shivraj Patil <shivraj.pa...@imgtec.com>

Signed-off-by: Shivraj Patil <shivraj.pa...@imgtec.com>
---
 libavcodec/mips/vp9_mc_msa.c       | 2123
 libavcodec/mips/vp9dsp_init_mips.c |    2 +
 libavcodec/mips/vp9dsp_mips.h      |   32 +
 3 files changed, 2157 insertions(+)

[..]

+void ff_avg_bilin_4h_msa(uint8_t *dst, ptrdiff_t dst_stride,
+                         const uint8_t *src, ptrdiff_t src_stride,
+                         int height, int mx, int my)
+{
+    const int8_t *filter = vp9_bilinear_filters_msa[mx - 1];
+
+    if (4 == height) {
+        common_hz_2t_and_aver_dst_4x4_msa(src, src_stride, dst, dst_stride,
+                                          filter);
+    } else if (8 == height) {
+        common_hz_2t_and_aver_dst_4x8_msa(src, src_stride, dst, dst_stride,
+                                          filter);
+    }
+}

You're using this construct in various places, how much does it help?

(Otherwise no comments, basically lgtm % the above.)

Shivraj:- For the height-8 case, the dedicated function reduces stalls (perf gain of about 20%) compared to calling the height-4 function twice.
Re: [FFmpeg-devel] [PATCH 4/4] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP9 intra functions
From: Ronald S. Bultje [mailto:rsbul...@gmail.com]
Sent: 16 July 2015 20:49
To: FFmpeg development discussions and patches
Cc: Rob Isherwood; Shivraj Patil
Subject: Re: [FFmpeg-devel] [PATCH 4/4] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP9 intra functions

Hi,

On Thu, Jul 9, 2015 at 9:15 AM, shivraj.pa...@imgtec.com wrote:

+static void intra_predict_vert_4x4_msa(const uint8_t *src, uint8_t *dst,
+                                       int32_t dst_stride)
+{
+    uint32_t src_data;
+
+    src_data = LW(src);
+
+    SW4(src_data, src_data, src_data, src_data, dst, dst_stride);
+}

Is this faster than the C function? I know this is a fair bit of work, but ideally you'd profile each individual simd function to see how much faster it is than the C function. These won't be faster, so they just increase the binary size. Same is likely true for e.g. the vert_8x8 one.

Shivraj:- Yes, there is no significant gain, so we will keep the original C functions for the 4x4 and 8x8 horiz and vert cases.

+static void intra_predict_horiz_4x4_msa(const uint8_t *src, int32_t src_stride,
+                                        uint8_t *dst, int32_t dst_stride)
+{
+    uint32_t out0, out1, out2, out3;
+
+    out0 = src[0 * src_stride] * 0x01010101;
+    out1 = src[1 * src_stride] * 0x01010101;
+    out2 = src[2 * src_stride] * 0x01010101;
+    out3 = src[3 * src_stride] * 0x01010101;
+
+    SW4(out0, out1, out2, out3, dst, dst_stride);
+}

Same question here - I suspect this isn't faster than the C version. Same for horiz_8x8.
+static void intra_predict_dc_4x4_msa(const uint8_t *src_top,
+                                     const uint8_t *src_left,
+                                     int32_t src_stride_left,
+                                     uint8_t *dst, int32_t dst_stride,
+                                     uint8_t is_above, uint8_t is_left)
+{
+    uint32_t row;
+    uint32_t out, addition = 0;
+    v16u8 src_above, store;
+    v8u16 sum_above;
+    v4u32 sum;
+
+    if (is_left && is_above) {
+        src_above = LD_UB(src_top);
+
+        sum_above = __msa_hadd_u_h(src_above, src_above);
+        sum = __msa_hadd_u_w(sum_above, sum_above);
+        addition = __msa_copy_u_w((v4i32) sum, 0);
+
+        for (row = 0; row < 4; row++) {
+            addition += src_left[row * src_stride_left];
+        }
+
+        addition = (addition + 4) >> 3;
+        store = (v16u8) __msa_fill_b(addition);
+    } else if (is_left) {
+        for (row = 0; row < 4; row++) {
+            addition += src_left[row * src_stride_left];
+        }
+
+        addition = (addition + 2) >> 2;
+        store = (v16u8) __msa_fill_b(addition);
+    } else if (is_above) {
+        src_above = LD_UB(src_top);
+
+        sum_above = __msa_hadd_u_h(src_above, src_above);
+        sum = __msa_hadd_u_w(sum_above, sum_above);
+        sum = (v4u32) __msa_srari_w((v4i32) sum, 2);
+        store = (v16u8) __msa_splati_b((v16i8) sum, 0);
+    } else {
+        store = (v16u8) __msa_ldi_b(128);
+    }
+
+    out = __msa_copy_u_w((v4i32) store, 0);
+
+    for (row = 4; row--;) {
+        SW(out, dst);
+        dst += dst_stride;
+    }
+}
+
+static void intra_predict_dc_8x8_msa(const uint8_t *src_top,
+                                     const uint8_t *src_left,
+                                     int32_t src_stride_left,
+                                     uint8_t *dst, int32_t dst_stride,
+                                     uint8_t is_above, uint8_t is_left)
+{
+    uint32_t row;
+    uint32_t out, addition = 0;
+    v16u8 src_above, store;
+    v8u16 sum_above;
+    v4u32 sum_top;
+    v2u64 sum;
+
+    if (is_left && is_above) {
+        src_above = LD_UB(src_top);
+
+        sum_above = __msa_hadd_u_h(src_above, src_above);
+        sum_top = __msa_hadd_u_w(sum_above, sum_above);
+        sum = __msa_hadd_u_d(sum_top, sum_top);
+        addition = __msa_copy_u_w((v4i32) sum, 0);
+
+        for (row = 0; row < 8; row++) {
+            addition += src_left[row * src_stride_left];
+        }
+
+        addition = (addition + 8) >> 4;
+        store = (v16u8) __msa_fill_b(addition);
+    } else if (is_left) {
+        for (row = 0; row < 8; row++) {
+            addition += src_left[row * src_stride_left];
+        }
+
+        addition = (addition + 4) >> 3;
+        store = (v16u8) __msa_fill_b(addition);
+    } else if (is_above) {
+        src_above = LD_UB(src_top);
+
+        sum_above = __msa_hadd_u_h(src_above, src_above);
+        sum_top = __msa_hadd_u_w(sum_above, sum_above);
+        sum = __msa_hadd_u_d(sum_top, sum_top);
+        sum = (v2u64) __msa_srari_d((v2i64) sum, 3);
+        store = (v16u8) __msa_splati_b((v16i8) sum, 0);
+    } else {
+        store = (v16u8) __msa_ldi_b(128);
+    }
+
+    out = __msa_copy_u_w((v4i32) store, 0);
+
+    for (row = 8; row--;) {
+        SW(out, dst);
+        SW(out, (dst + 4));
+        dst += dst_stride;
+    }
+}
+
+static void
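As a reading aid for the quoted 4x4 DC predictor, here is a plain scalar sketch of the case where both the top and left edges are available. It mirrors the arithmetic visible in the quoted branch (sum the eight neighbours, add 4, shift right by 3, fill the block). The function name and signature are made up for this example, and the sketch is not FFmpeg's C reference.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar 4x4 DC prediction with both edges available: every output pixel
 * becomes the rounded average of the 4 top and 4 left neighbours. */
static void demo_dc_pred_4x4(uint8_t *dst, ptrdiff_t dst_stride,
                             const uint8_t *top,
                             const uint8_t *left, ptrdiff_t left_stride)
{
    unsigned sum = 0;

    for (int i = 0; i < 4; i++)
        sum += top[i] + left[i * left_stride];

    /* 8 samples in total: add half the divisor, then divide by 8. */
    uint8_t dc = (uint8_t)((sum + 4) >> 3);

    for (int y = 0; y < 4; y++)
        for (int x = 0; x < 4; x++)
            dst[y * dst_stride + x] = dc;
}
```

A scalar version like this is also the natural baseline for the kind of per-function profiling Ronald asks for earlier in the thread.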
Re: [FFmpeg-devel] [PATCH 3/4] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for VP9 idct functions
Hi,

On Thu, Jul 9, 2015 at 9:15 AM, shivraj.pa...@imgtec.com wrote:

+void ff_idct_idct_16x16_add_msa(uint8_t *dst, ptrdiff_t stride,
+                                int16_t *block, int eob)
+{
+    vp9_idct16x16_colcol_addblk_msa(block, dst, stride);
+    memset(block, 0, 16 * 16 * sizeof(*block));
+}

(This comment applies to all code in this file.) You're not using the eob parameter anywhere. Admittedly, for the iadst variants, the eob value is generally quite high so this won't bring any benefit, but for idct_idct, eob is typically low (possibly even 1), and you can make use of that to do sub-idcts. Look at the C code for an example of dc-only idct_idct, and look at the x86 simd for examples of sub-idcts. They give great speedups on top of the regular speedup expected from simd vectorization, especially for the bigger ones (16x16, 32x32).

Agreed, will incorporate the same.

Shivraj
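To make the eob suggestion concrete, here is a hedged sketch of a DC-only shortcut for a 16x16 inverse DCT. When eob is 1, only block[0] is non-zero, so the whole transform collapses to adding a single value to every pixel. The constant 11585 (roughly 2^14 / sqrt(2)) and the final 6-bit rounding shift are written from memory of the VP9 C reference and should be verified against the tree. All names are hypothetical, and the "full" transform is a placeholder that only clears the coefficients.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint8_t demo_clip_u8(int v)
{
    return v < 0 ? 0 : v > 255 ? 255 : (uint8_t)v;
}

/* DC-only path: scale the DC coefficient twice (once per 1-D pass), round
 * to pixel precision, and add the result to all 256 pixels. Relies on
 * arithmetic right shift for negative values, as most compilers provide. */
static void demo_idct16x16_dc_add(uint8_t *dst, ptrdiff_t stride,
                                  int16_t *block)
{
    int t   = (((block[0] * 11585 + (1 << 13)) >> 14) * 11585 + (1 << 13)) >> 14;
    int add = (t + 32) >> 6;

    block[0] = 0;
    for (int y = 0; y < 16; y++)
        for (int x = 0; x < 16; x++)
            dst[y * stride + x] = demo_clip_u8(dst[y * stride + x] + add);
}

/* Placeholder for the full transform. A real implementation would run the
 * complete 16x16 inverse DCT here; this stub only clears the coefficients
 * so that the dispatch logic below can be exercised. */
static void demo_idct16x16_full_add(uint8_t *dst, ptrdiff_t stride,
                                    int16_t *block)
{
    (void)dst;
    (void)stride;
    memset(block, 0, 16 * 16 * sizeof(*block));
}

/* Dispatch on eob. Returns 1 if the DC-only shortcut was taken, else 0. */
static int demo_idct16x16_add(uint8_t *dst, ptrdiff_t stride,
                              int16_t *block, int eob)
{
    if (eob == 1) {
        demo_idct16x16_dc_add(dst, stride, block);
        return 1;
    }
    demo_idct16x16_full_add(dst, stride, block);
    return 0;
}
```

Sub-idcts generalize the same idea: for a small eob only the top-left corner of the coefficient block can be non-zero, so parts of the row and column passes can be skipped.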
Re: [FFmpeg-devel] [PATCH 5/5] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for pixblock functions
Hi,

Could one of the maintainers please review this patch?

-----Original Message-----
From: Shivraj Patil
Sent: 14 June 2015 23:26
To: ffmpeg-devel@ffmpeg.org
Cc: Rob Isherwood; Shivraj Patil
Subject: [PATCH 5/5] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for pixblock functions

From: Shivraj Patil <shivraj.pa...@imgtec.com>

This patch adds MSA (MIPS-SIMD-Arch) optimizations for pixblock functions in the new file pixblockdsp_msa.c.
It also adds new generic macros (needed for this patch) to libavutil/mips/generic_macros_msa.h.

Signed-off-by: Shivraj Patil <shivraj.pa...@imgtec.com>
---
 libavcodec/mips/Makefile                |   2 +
 libavcodec/mips/pixblockdsp_init_mips.c |  53
 libavcodec/mips/pixblockdsp_mips.h      |  33
 libavcodec/mips/pixblockdsp_msa.c       | 143
 libavcodec/pixblockdsp.c                |   2 +
 libavcodec/pixblockdsp.h                |   2 +
 libavutil/mips/generic_macros_msa.h     |   8 ++
 7 files changed, 243 insertions(+)
 create mode 100644 libavcodec/mips/pixblockdsp_init_mips.c
 create mode 100644 libavcodec/mips/pixblockdsp_mips.h
 create mode 100644 libavcodec/mips/pixblockdsp_msa.c

diff --git a/libavcodec/mips/Makefile b/libavcodec/mips/Makefile
index 823a2c5..c0ecb15 100644
--- a/libavcodec/mips/Makefile
+++ b/libavcodec/mips/Makefile
@@ -28,6 +28,7 @@ OBJS-$(CONFIG_H263DSP)+= mips/h263dsp_init_mips.o
 OBJS-$(CONFIG_QPELDSP)                    += mips/qpeldsp_init_mips.o
 OBJS-$(CONFIG_HPELDSP)                    += mips/hpeldsp_init_mips.o
 OBJS-$(CONFIG_BLOCKDSP)                   += mips/blockdsp_init_mips.o
+OBJS-$(CONFIG_PIXBLOCKDSP)                += mips/pixblockdsp_init_mips.o
 MSA-OBJS-$(CONFIG_HEVC_DECODER)           += mips/hevcdsp_msa.o            \
                                              mips/hevc_mc_uni_msa.o        \
                                              mips/hevc_mc_uniw_msa.o       \
@@ -45,5 +46,6 @@ MSA-OBJS-$(CONFIG_H263DSP)+= mips/h263dsp_msa.o
 MSA-OBJS-$(CONFIG_QPELDSP)                += mips/qpeldsp_msa.o
 MSA-OBJS-$(CONFIG_HPELDSP)                += mips/hpeldsp_msa.o
 MSA-OBJS-$(CONFIG_BLOCKDSP)               += mips/blockdsp_msa.o
+MSA-OBJS-$(CONFIG_PIXBLOCKDSP)            += mips/pixblockdsp_msa.o
 LOONGSON3-OBJS-$(CONFIG_H264DSP)          += mips/h264dsp_mmi.o
 LOONGSON3-OBJS-$(CONFIG_H264CHROMA)       += mips/h264chroma_mmi.o
diff --git a/libavcodec/mips/pixblockdsp_init_mips.c b/libavcodec/mips/pixblockdsp_init_mips.c
new file mode 100644
index 000..0f2fb15
--- /dev/null
+++ b/libavcodec/mips/pixblockdsp_init_mips.c
@@ -0,0 +1,53 @@
+/*
+ * Copyright (c) 2015 Shivraj Patil (shivraj.pa...@imgtec.com)
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "pixblockdsp_mips.h"
+
+#if HAVE_MSA
+static av_cold void pixblockdsp_init_msa(PixblockDSPContext *c,
+                                         AVCodecContext *avctx,
+                                         unsigned high_bit_depth)
+{
+    c->diff_pixels = ff_diff_pixels_msa;
+
+    switch (avctx->bits_per_raw_sample) {
+    case 9:
+    case 10:
+    case 12:
+    case 14:
+        c->get_pixels = ff_get_pixels_16_msa;
+        break;
+    default:
+        if (avctx->bits_per_raw_sample <= 8 || avctx->codec_type !=
+            AVMEDIA_TYPE_VIDEO) {
+            c->get_pixels = ff_get_pixels_8_msa;
+        }
+        break;
+    }
+}
+#endif  // #if HAVE_MSA
+
+void ff_pixblockdsp_init_mips(PixblockDSPContext *c, AVCodecContext *avctx,
+                              unsigned high_bit_depth)
+{
+#if HAVE_MSA
+    pixblockdsp_init_msa(c, avctx, high_bit_depth);
+#endif  // #if HAVE_MSA
+}
diff --git a/libavcodec/mips/pixblockdsp_mips.h b/libavcodec/mips/pixblockdsp_mips.h
new file mode 100644
index 000..3eee6e0
--- /dev/null
+++ b/libavcodec/mips/pixblockdsp_mips.h
@@ -0,0 +1,33 @@
+/*
+ * Copyright (c) 2015 Shivraj Patil (shivraj.pa...@imgtec.com)
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version
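
For readers unfamiliar with pixblockdsp, here is a minimal scalar sketch of what the get_pixels and diff_pixels hooks named in the patch compute on an 8x8 block. The function names get_pixels_8_ref and diff_pixels_ref and their exact signatures are assumptions made for illustration; this is not the MSA code from the patch, only the plain-C behaviour that a SIMD version would be expected to reproduce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: widen an 8x8 block of 8-bit pixels
 * into 64 contiguous 16-bit coefficients. */
static void get_pixels_8_ref(int16_t *block, const uint8_t *pixels,
                             ptrdiff_t stride)
{
    assert(block && pixels);
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            block[y * 8 + x] = pixels[x];
        pixels += stride;   /* advance to the next source row */
    }
}

/* Hypothetical scalar reference: per-pixel difference s1 - s2 of two
 * 8x8 blocks, stored as signed 16-bit values. */
static void diff_pixels_ref(int16_t *block, const uint8_t *s1,
                            const uint8_t *s2, ptrdiff_t stride)
{
    assert(block && s1 && s2);
    for (int y = 0; y < 8; y++) {
        for (int x = 0; x < 8; x++)
            block[y * 8 + x] = (int16_t)(s1[x] - s2[x]);
        s1 += stride;
        s2 += stride;
    }
}
```

A reference of this kind is also handy when testing: the output of an optimized routine can be compared element by element against it.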
Re: [FFmpeg-devel] [PATCH] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for H264 lpf and weight/biweight functions
-----Original Message-----
From: ffmpeg-devel-boun...@ffmpeg.org [mailto:ffmpeg-devel-boun...@ffmpeg.org] On Behalf Of Timothy Gu
Sent: 21 April 2015 00:28
To: FFmpeg development discussions and patches
Subject: Re: [FFmpeg-devel] [PATCH] avcodec/mips: MSA (MIPS-SIMD-Arch) optimizations for H264 lpf and weight/biweight functions

On Mon, Apr 20, 2015 at 2:54 AM shivraj.pa...@imgtec.com wrote:

From: Shivraj Patil <shivraj.pa...@imgtec.com>

Signed-off-by: Shivraj Patil <shivraj.pa...@imgtec.com>
---
 libavcodec/h264dsp.c                |    1 +
 libavcodec/h264dsp.h                |    2 +
 libavcodec/mips/Makefile            |    2 +
 libavcodec/mips/h264dsp_init_mips.c |   74 +
 libavcodec/mips/h264dsp_mips.h      |   71 +
 libavcodec/mips/h264dsp_msa.c       | 3037 +++
 libavutil/mips/generic_macros_msa.h |  518 ++
 7 files changed, 3705 insertions(+)
 create mode 100644 libavcodec/mips/h264dsp_init_mips.c
 create mode 100644 libavcodec/mips/h264dsp_mips.h
 create mode 100644 libavcodec/mips/h264dsp_msa.c

[...]

diff --git a/libavcodec/mips/h264dsp_init_mips.c b/libavcodec/mips/h264dsp_init_mips.c
new file mode 100644
index 000..8d3d760
--- /dev/null
+++ b/libavcodec/mips/h264dsp_init_mips.c
@@ -0,0 +1,74 @@
+/*
+ * Copyright (c) 2015 Parag Salasakar (parag.salasa...@imgtec.com)
+ *
+ * This file is part of FFmpeg.
+ *
+ * FFmpeg is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * FFmpeg is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with FFmpeg; if not, write to the Free Software
+ * Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ */
+
+#include "h264dsp_mips.h"
+
+#if HAVE_MSA
+static av_cold void h264dsp_init_msa(H264DSPContext *c,
+                                     const int bit_depth,
+                                     const int chroma_format_idc)
+{
+    if (8 == bit_depth) {
+        c->h264_v_loop_filter_luma = ff_h264_v_lpf_luma_inter_msa;
+        c->h264_h_loop_filter_luma = ff_h264_h_lpf_luma_inter_msa;
+        c->h264_h_loop_filter_luma_mbaff =
+            ff_h264_h_loop_filter_luma_mbaff_msa;
+        c->h264_v_loop_filter_luma_intra = ff_h264_v_lpf_luma_intra_msa;
+        c->h264_h_loop_filter_luma_intra = ff_h264_h_lpf_luma_intra_msa;
+        c->h264_h_loop_filter_luma_mbaff_intra =
+            ff_h264_h_loop_filter_luma_mbaff_intra_msa;
+        c->h264_v_loop_filter_chroma = ff_h264_v_lpf_chroma_inter_msa;
+
+        if (chroma_format_idc <= 1)
+            c->h264_h_loop_filter_chroma = ff_h264_h_lpf_chroma_inter_msa;
+        else
+            c->h264_h_loop_filter_chroma =
+                ff_h264_h_loop_filter_chroma422_msa;
+
+        if (chroma_format_idc > 1)
+            c->h264_h_loop_filter_chroma_mbaff =
+                ff_h264_h_loop_filter_chroma422_mbaff_msa;
+
+        c->h264_v_loop_filter_chroma_intra =
+            ff_h264_v_lpf_chroma_intra_msa;
+
+        if (chroma_format_idc <= 1)
+            c->h264_h_loop_filter_chroma_intra =
+                ff_h264_h_lpf_chroma_intra_msa;
+
+        /* Weighted MC */
+        c->weight_h264_pixels_tab[0] = ff_weight_h264_pixels16_8_msa;
+        c->weight_h264_pixels_tab[1] = ff_weight_h264_pixels8_8_msa;
+        c->weight_h264_pixels_tab[2] = ff_weight_h264_pixels4_8_msa;
+
+        c->biweight_h264_pixels_tab[0] = ff_biweight_h264_pixels16_8_msa;
+        c->biweight_h264_pixels_tab[1] = ff_biweight_h264_pixels8_8_msa;
+        c->biweight_h264_pixels_tab[2] = ff_biweight_h264_pixels4_8_msa;
+    }  // if (8 == bit_depth)
+}
+#endif  // #if HAVE_MSA
+
+av_cold void ff_h264dsp_init_mips(H264DSPContext *c, const int bit_depth,
+                                  const int chroma_format_idc)
+{
+#if HAVE_MSA
+    h264dsp_init_msa(c, bit_depth, chroma_format_idc);
+#endif  // #if HAVE_MSA
+}

You should fold the _init_msa() function into this function.

ff_h264dsp_init_mips() is the generic initialization function for all MIPS variants, while h264dsp_init_msa() is specific to MIPS variants that have MSA. In the future there could be further init functions for other variants, hence I am not merging these two functions.

You also need to add the flags into libavutil/mips/cpu.h as the user might not want to use MSA at runtime. See libavutil/cpu.h and libavutil/*/cpu.h.

Timothy
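
To make the two points in this exchange concrete, here is a minimal sketch of a generic per-architecture init function that delegates to a variant-specific one, gated by a runtime CPU flag so that the user can switch the optimized path off. Everything prefixed demo_ or DEMO_ is hypothetical and exists only for illustration; it is not the API in libavutil/cpu.h, and the stub filter bodies are deliberately empty.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical runtime capability bit; not FFmpeg's real flag. */
#define DEMO_CPU_FLAG_MSA (1 << 0)

typedef void (*demo_filter_fn)(unsigned char *pix, ptrdiff_t stride);

typedef struct DemoDSPContext {
    demo_filter_fn v_loop_filter_luma;
} DemoDSPContext;

/* Stub implementations: only their addresses matter in this sketch. */
static void demo_v_lpf_luma_c(unsigned char *pix, ptrdiff_t stride)
{
    (void)pix; (void)stride;
}

static void demo_v_lpf_luma_msa(unsigned char *pix, ptrdiff_t stride)
{
    (void)pix; (void)stride;
}

/* Variant-specific init: only overrides pointers it has code for. */
static void demo_init_msa(DemoDSPContext *c, int bit_depth)
{
    if (bit_depth == 8)
        c->v_loop_filter_luma = demo_v_lpf_luma_msa;
}

/* Generic MIPS init: one call per variant, each behind a runtime
 * check, so further variants can be added next to the MSA branch. */
static void demo_init_mips(DemoDSPContext *c, int bit_depth, int cpu_flags)
{
    assert(c);
    if (cpu_flags & DEMO_CPU_FLAG_MSA)
        demo_init_msa(c, bit_depth);
}

/* Top-level init: install the C fallback, then let the architecture
 * layer override it where an optimized version is usable. */
static void demo_init(DemoDSPContext *c, int bit_depth, int cpu_flags)
{
    assert(c);
    c->v_loop_filter_luma = demo_v_lpf_luma_c;
    demo_init_mips(c, bit_depth, cpu_flags);
}
```

Keeping the variant-specific function separate leaves the generic one as a short list of guarded calls, while the runtime flag means a build with MSA compiled in can still fall back to the C path when the flag is cleared.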