[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
This bug was fixed in the package ffmpeg - 4:0.5+svn20090706-2ubuntu2 --- ffmpeg (4:0.5+svn20090706-2ubuntu2) karmic; urgency=low [ Reinhard Tartler ] * Make arguments of av_set_pts_info() unsigned. * update debian/changelog * use patch for issue1245 from git.ffmpeg.org * Support constant-quant encoding for libtheora, LP: #356322 * increase swscale compile time width (VOF/VOFW), LP: #443264 [ Loïc Minier ] * Update config for karmic's armel toolchain. * Enable neon flavour; LP: #383240. * Update NEON confflags to assume v7 and VFP. * Add backported NEON patches from ffmpeg trunk; see debian/patches/neon/. * Pass proper --cpu and --extra-flags on armel. * Pass -fPIC -DPIC to neon pass. -- Loic MinierTue, 13 Oct 2009 23:56:04 +0200 ** Changed in: ffmpeg (Ubuntu Karmic) Status: In Progress => Fix Released -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Extra note: I do get a good speedup when playing video using ffplay, when using the NEON libraries versus the standard (VFP) libraries. I don't have quantative benchmarks right now. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
I did a brief test of your build, Loïc— seems to work OK on a NEON-enabled platform. I didn't test exhaustively, but at least MPEG2 and H.264 (mp4) appear to work well. I couldn't test audio because there's some other problem with my test platform. I haven't successfully played any naturally-occurring Theora video: I always get the following error: [ogg @ ]Too old or unsupported Theora (0) could not open codecs Is it just that I'm trying to play an obsolete format, or is there some other problem? (So far as I know, there's nothing ARM-specific going on here.) If anyone can chuck me a link to some playable files for codecs you're interested in, I can try and play them. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
I did a brief test of your build, Loïc— seems to work OK on a NEON-enabled platform. I didn't test exhaustively, but at least MPEG2 and H.264 (mp4) appear to work well. I couldn't test audio because there's some other problem with my test platform. I haven't successfully played any naturally-occurring Theora video: I always get the following error: [ogg @ ]Too old or unsupported Theora (0) could not open codecs Is it just that I'm trying to play an obsolete format, or is there some other problem? (So far as I know, there's nothing ARM-specific going on here.) If anyone can chuck me a link to some playable files for codecs you're interested in, I can try and play them. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
dsputil.h: not sure it's a good enough test but I played on amd64 an ogv video fine with both the new ffmpeg and with mplayer installed allong the new ffmpeg libs. I need to test the NEON version and test with more codecs when I get home; any recommendations of the most relevant tests here? Thanks! -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
So folks who want to test the PPA packages sadlly have to manually select the proper version for the packages built by the ffmpeg source; 4:0.5+svn20090706-2ubuntu1~arm4 should be used; you can check version on the installed packages with: grep-status -F Status 'install ok installed' -a -F Source ffmpeg -s Package -s Version (dctrl-tools package for grep-status) -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
@Mans, thanks for fixing this upstream; I just passed -fPIC -DPIC in extra flags as that was the case for other optimised passes already. armel and amd64 packages are now available in my PPA, ppa:lool/ppa. https://launchpad.net/~lool/+archive/ppa Git repo/branch: git.debian.org:/git/users/lool/pkg- multimedia/ffmpeg.git karmic-armel-opts The cherry-picked / ported patches are in the karmic-armel-opts-neon branch. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
The R_ARM_MOVW_ABS_NC errors are caused by the linkers (static and dynamic) not handling that type of relocation. Only gcc 4.4 and codesourcery 2009q1 generate the movw/movt pairs that result in this relocation entry, so it used to be possible to build non-pic shared libraries. If you pull in r19672, you can use --enable-pic to force PIC. If you also include r20203, PIC will be enabled automatically for ARM shared libs. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
Re: [Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Mans Rullgard writes: > dsputil.h is not part of the public API, so changes there should not > affect anything outside libavcodec. Unfortunately mplayer does #inlcude dsputil.h, and we carefully need to make sure that changes to that file do not destabilize mplayer. I've also checked gstreamer0.10-ffmpeg, but that package does not seem to use dsputil.h. If in doubt, we should probably reupload mplayer with the changed dsputil.h together with the new mplayer. -- Gruesse/greetings, Reinhard Tartler, KeyID 945348A4 -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
So the regular aes.o for neon has this in the output of arm-none-linux-gnueabi-readelf -a libavutil/aes.o.neon: ... Relocation section '.rel.text' at offset 0x3a64 contains 30 entries: Offset InfoTypeSym.Value Sym. Name 000c 092b R_ARM_MOVW_ABS_NC .LANCHOR0 0010 092c R_ARM_MOVT_ABS .LANCHOR0 00a0 0a2b R_ARM_MOVW_ABS_NC 1ff8 .LANCHOR1 00a8 0a2c R_ARM_MOVT_ABS1ff8 .LANCHOR1 0100 0b2b R_ARM_MOVW_ABS_NC .LANCHOR2 0104 0b2c R_ARM_MOVT_ABS .LANCHOR2 019c 092b R_ARM_MOVW_ABS_NC .LANCHOR0 ... I prepended -fPIC to the gcc flags and now the reltext table is much smaller and says: Relocation section '.rel.text' at offset 0x3cbc contains 13 entries: Offset InfoTypeSym.Value Sym. Name 02c8 271b R_ARM_PLT32 memcpy 0320 271b R_ARM_PLT32 memcpy 072c 271b R_ARM_PLT32 memcpy 0940 271b R_ARM_PLT32 memcpy 0968 2819 R_ARM_BASE_PREL _GLOBAL_OFFSET_TABLE_ 096c 1618 R_ARM_GOTOFF32 .LANCHOR0 0970 1718 R_ARM_GOTOFF321ff8 .LANCHOR1 0974 1118 R_ARM_GOTOFF32 .LANCHOR2 0d70 271b R_ARM_PLT32 memcpy 1050 271b R_ARM_PLT32 memcpy 10b0 2819 R_ARM_BASE_PREL _GLOBAL_OFFSET_TABLE_ 10b4 1718 R_ARM_GOTOFF321ff8 .LANCHOR1 10b8 1618 R_ARM_GOTOFF32 .LANCHOR0 (that's the full table) Given that I don't see any _ABS thingies, I suspect it is good. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
libavutil/aes.o.neon: file format elf32-littlearm Disassembly of section .text: : 0: e92d4ff0push{r4, r5, r6, r7, r8, r9, sl, fp, lr} 4: e24ddb01sub sp, sp, #1024 ; 0x400 8: e24dd00csub sp, sp, #12 ; 0xc c: e3005000movwr5, #0 ; 0x0 c: R_ARM_MOVW_ABS_NC.LANCHOR0 10: e3405000movtr5, #0 ; 0x0 10: R_ARM_MOVT_ABS .LANCHOR0 14: e58d1068str r1, [sp, #104] 18: e5951ffcldr r1, [r5, #4092] 1c: e58d5014str r5, [sp, #20] 20: e351cmp r1, #0 ; 0x0 24: e58d006cstr r0, [sp, #108] 28: e58d2064str r2, [sp, #100] 2c: e58d3060str r3, [sp, #96] 30: 1a7ebne 230 34: e28dc098add ip, sp, #152; 0x98 38: e3a03001mov r3, #1 ; 0x1 ... Disassembly of section .bss: ... 1ff8 <.LANCHOR1>: ... ... -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
I can reproduce with a cross-compiler: ./configure --arch=arm --cpu=cortex-a8 --extra-cflags='-mfpu=vfp -mfloat-abi=softfp' --cross-prefix=arm-none-linux-gnueabi- arm-none-linux-gnueabi-gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -std=c99 -fomit-frame-pointer -march=armv6 -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavutil/aes.o libavutil/aes.c versus: ./configure --arch=arm --cpu=cortex-a8 --extra-cflags='-mfpu=neon -mfloat-abi=softfp' --cross-prefix=arm-none-linux-gnueabi- arm-none-linux-gnueabi-gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -march=armv7-a -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavutil/aes.o libavutil/aes.c And arm-none-linux-gnueabi-readelf shows the problematic sections only in the neon case. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
I can reproduce manually with -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Back to aes.o, this is the shared pass: gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -std=c99 -fomit-frame-pointer -march=armv6 -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavutil/aes.o /build/buildd/ffmpeg-0.5+svn20090706/libavutil/aes.c This is the neon pass: gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -march=armv7-a -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavutil/aes.o /build/buildd/ffmpeg-0.5+svn20090706/libavutil/aes.c So it would seem -mfpu=neon does not generate PIC code in some cases; perhaps -fPIC is missing? -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
@Mans: yes, the shared build should have VFP enabled; our toolchain defaults to -mfpu=vfp -mfloat-abi=softfp. The shared build passed configure with: ARCH arm (armv6) version string suffix 4:0.5+svn20090706-2ubuntu1~arm3 big-endianno ARMv5TE enabled yes ARMv6 enabled yes ARMv6T2 enabled no ARM VFP enabled yes IWMMXT enabledno NEON enabled no gprof enabled no debug symbols yes strip symbols no optimizations yes staticno sharedyes But I'll include r19474 as you suggest and pass -mfloat-abi+-mfpu in the shared build too for clarity. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -march=armv7-a -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavutil/utils.o /build/buildd/ffmpeg-0.5+svn20090706/libavutil/utils.c gcc -shared -Wl,-soname,libavutil.so.49 -Wl,-Bsymbolic-functions -Wl,--warn-common -Wl,--as-needed -Wl,-rpath-link,"/build/buildd/ffmpeg-0.5+svn20090706/debian-neon"/libpostproc -Wl,-rpath-link,"/build/buildd/ffmpeg-0.5+svn20090706/debian-neon"/libswscale -Wl,-rpath-link,"/build/buildd/ffmpeg-0.5+svn20090706/debian-neon"/libavfilter -Wl,-rpath-link,"/build/buildd/ffmpeg-0.5+svn20090706/debian-neon"/libavdevice -Wl,-rpath-link,"/build/buildd/ffmpeg-0.5+svn20090706/debian-neon"/libavformat -Wl,-rpath-link,"/build/buildd/ffmpeg-0.5+svn20090706/debian-neon"/libavcodec -Wl,-rpath-link,"/build/buildd/ffmpeg-0.5+svn20090706/debian-neon"/libavutil -Wl,-Bsymbolic -o libavutil/libavutil.so.49 libavutil/adler32.o libavutil/aes.o libavutil/avstring.o libavutil/base64.o libavutil/crc.o libavutil/des.o libavutil/fifo.o libavutil/intfloat_readwrite.o libavutil/lfg.o libavutil/lls.o libavutil/log.o libavutil/lzo.o libavutil/mathematics.o libavutil/md5.o libavutil/mem.o libavutil/random.o libavutil/rational.o libavutil/rc4.o libavutil/sha1.o libavutil/tree.o libavutil/utils.o -lz -lbz2 -pthread -lm -lgsm -lschroedinger-1.0 -lpthread -loil-0.3 -lm -lrt -lspeex -ltheora -logg -lvorbisenc -lvorbis -logg -ldc1394 -lraw1394-lasound -ldl -lasound -lX11 -lXext -lasound /usr/bin/ld: libavutil/aes.o: relocation R_ARM_MOVW_ABS_NC against `__stack_chk_guard' can not be used when making a shared object; recompile with -fPIC libavutil/aes.o: could not read symbols: Bad value It looks like aes.o is misbuilt, or can't build with the -fPIE stuff we're using by default; but I don't understand why that's NEON specific. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
You're probably missing r19474. Is the armv6 build intended to include VFP code? If so, you might need to add --extra-cflags="-mfpu=vfp -mfloat-abi=softfp" to the configure arguments. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
I simply used --cpu=armv6 for the shared pass and --cpu=armv7-a --extra- cflags="-mfpu=neon -mfloat-abi=softfp" for the neon one (float-abi is probably not required but I find it safer); the static one is probably broken, I mentioned in to Reinhard over email. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Ok, I see your point with --cpu deciding how the other options will turn on/off. I think the Debian packaging is a bit paranoid about getting this or that feature right, but I guess it's ok for armel. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Ah I forgot the --cpu=armv7-a I thought I had added. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
So latest tree I have fails with: ... gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavcodec/arm/dsputil_neon.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon.c gcc -DHAVE_AV_CONFIG_H -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -I. -I"/build/buildd/ffmpeg-0.5+svn20090706" -D_ISOC99_SOURCE -D_POSIX_C_SOURCE=200112 -I/build/buildd/ffmpeg-0.5+svn20090706/debian/include -mfpu=neon -mfloat-abi=softfp -std=c99 -fomit-frame-pointer -pthread -I/usr/include/schroedinger-1.0 -I/usr/include/liboil-0.3 -g -Wdeclaration-after-statement -Wall -Wno-switch -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wno-pointer-sign -Wcast-qual -Wwrite-strings -Wtype-limits -Wundef -O3 -fno-math-errno -fno-signed-zeros -c -o libavcodec/arm/dsputil_neon_s.o /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon_s.S /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon_s.S: Assembler messages: /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon_s.S:748: Error: bad instruction `vfp vdup.32 q8,d0[0]' /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon_s.S:749: Error: bad instruction `novfp vld1.32{d16[],d17[]},[sp,:32]' /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon_s.S:751: Error: bad instruction `vfp ldr lr,[sp,#12]' /build/buildd/ffmpeg-0.5+svn20090706/libavcodec/arm/dsputil_neon_s.S:752: Error: bad instruction `novfp ldr lr,[sp,#16]' make[1]: *** [libavcodec/arm/dsputil_neon_s.o] Error 1 That's in the neon pass which is currently configured with: --disable-encoder=h263 --disable-encoder=h263p --disable-encoder=mpeg2video --disable-encoder=mpeg4 --disable-encoder=msmpeg4v1 --disable-encoder=msmpeg4v2 --disable-encoder=msmpeg4v3 --extra-version='4:0.5+svn20090706-2ubuntu1~arm2' --prefix=/usr --enable-avfilter --enable-avfilter-lavf --enable-vdpau --enable-bzlib --enable-libgsm --enable-libschroedinger --enable-libspeex --enable-libtheora --enable-libvorbis --enable-pthreads --enable-zlib --disable-stripping --disable-vhook --enable-gpl --enable-postproc --enable-swscale --enable-x11grab --enable-libdc1394 --extra-cflags="-I/build/buildd/ffmpeg-0.5+svn20090706/debian/include" --shlibdir=/usr/lib/neon/vfp --enable-neon --enable-armvfp --enable-shared --extra-cflags="-mfpu=neon -mfloat-abi=softfp" --disable-static --disable-ffmpeg --disable-ffplay Perhaps I shouldn't have enabled armvfp? -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
dsputil.h is not part of the public API, so changes there should not affect anything outside libavcodec. There is no point in explicitly enabling ARM features (e.g. --enable- armv6) as these are automatically selected depending on the --cpu option. A NEON build should have everything enabled, since anything with NEON is also ARMv6[T2] compatible. As for which changes to include, backporting the more extensive changes probably doesn't make much sense, at least not those depending on other changes to e.g. aac.c. Hopefully we'll make a new release (0.6?) at some point, which will include all these changes. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Shall we build the neon pass with --enable-armv6 and --enable-armv6t2 as well? -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
I'm considering dropping r19818 and r19819 as they change the size of an ABI struct (AFAIK) in dsputil.h; r19806 only adds specific implementations to dsputil.h so that's ok. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Replying to self: > NB: I used -mcpu=armv7-a instead of -mcpu=cortex-a8 because cortex implies > the fast_unaligned ffmpeg flag. Could you confirm that implementing NEON > implies fast_unaligned? I pulled r19308 (ARM: enable fast_unaligned when --cpu=armv[67] is specified): --- a/configure +++ b/configure @@ -1792,6 +1792,10 @@ if test $cpu != "generic"; then add_cflags -mcpu=$cpu enable fast_unaligned ;; +armv[67]*) +add_cflags -march=$cpu +enable fast_unaligned +;; armv*) add_cflags -march=$cpu ;; so it's effectively identical to --cpu=cortex*. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
** Attachment added: "Update set of git revs I used locally" http://launchpadlibrarian.net/33410955/git-revs -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
I started pulling r19955 and r19956, but r19956 has a bunch of changes to the AAC decoder which don't apply probably because I need more aac.c changes. aac.c saw a bunch more commits in the recent history, it's a very active file. I don't think it's reasonable to try to backport its changes. Then there's the problem that the changes touch dsputil.h which API is used all over the place. I'm not too happy to change it as it might impact other decoders or even other applications, but almost all interesting changes touch it. So I dropped revs after 19957 except r20151 and am trying another build. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
** Attachment added: "git revs I included / skipped in my local tree right now" http://launchpadlibrarian.net/33409505/git-revs -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
So I'm taking a fresh look on this bug from the start. Sorry I didn't make more time for NEON earlier in the cycle. This is kind of a brain dump of where I stand. First, back to your initial comment: "ffmpeg also contains some VFP, armv5te and armv6 optimisations. Based on the v6+VFP baseline for Karmic, I suggest that these can be turned on statically, when the karmic toolchain has stabilised sufficiently." So what you're saying is that we should configure ffmpeg for armv6 + VFP by default in Ubuntu. I agree and changed: nooptflags += --disable-armv5te --disable-armv6 --disable-armv6t2 nooptflags += --disable-armvfp --disable-neon to: nooptflags += --enable-armv6 --disable-armv6t2 nooptflags += --enable-armvfp --disable-neon And disabled the now useless VFP flavour. I didn't add conditions for Debian vs Ubuntu; need to check with Reinhard how he'd like these to look like. Since NEON implies v7+ and since you said ffmpeg's NEON implementation requires VFP, I've updated the neon flavour (which can be used on Debian and Ubuntu) to use: --shlibdir=/usr/lib/neon/vfp --enable-armvfp -mcpu=armv7-a \ --extra-cflags="-mfpu=neon -mfloat-abi=softfp" NB: I used -mcpu=armv7-a instead of -mcpu=cortex-a8 because cortex implies the fast_unaligned ffmpeg flag. Could you confirm that implementing NEON implies fast_unaligned? I looked at all upstream commits with NEON in the commit message (git log --grep=NEON -i or gitk --grep=NEON -i). Then the first missing NEON commit I found was indeed r18332. I found these interesting revs: bd97c6665a522e7f64ee1456ed9c39f7cde7234f18332 ARM: NEON optimised add_pixels_clamped 2a9b6b26f8aefcbe16ad3dd62c09e7224f55afb418333 ARM: NEON optimized put_signed_pixels_clamped e496c588b4dda7ef2372dbd562b3e288e3c7c81c18535 Add guaranteed alignment for loading dest pixels in avg_pixels16_neon d1779570309a76649c83f4969583bf8733b942cb18712 ARM: NEON put_pixels_clamped 1ac68a162bd6dcc80da9e950368b1fe13d4dadf718713 ARM: Use fewer register in NEON put_pixels _y2 and _xy2 3104a36996462c14e3b84c852e76280676088add18916 ARM: NEON VP3 Loop Filter 2a82c2a332b439a3c929e936909c83dfcc14734318944 NEON-OBJS should also be cleared for each subdir. 14727897af399dc8480ca09f39f8f93acb8d302918972 ARM: add some PLD in NEON IDCT 8a81301ef5051b8ee571cb3bd0bf4cfbefaf68a319216 ARM: slightly faster NEON H264 horizontal loop filter 03586fda4fb70f488e61294e5200f4365011753519345 ARM: NEON VP3 IDCT eb75e8538b294d7f097a1f53a546c353011d947119438 Require aligned memory for everything that needs it 1dc725bbd6a1cc6a572aff0702753dc3559d657b19745 ARM: handle VFP register arguments in ff_vector_fmul_window_neon() 822b4ce43bcb76899277a893e01eda322264402d19494 Only compile in NEON optimizations for H.264 when the H.264 decoder is enabled. 71b3800785cd7aaf6f0aed00be58b7ed007f31bb19637 ARM: NEON optimised vorbis_inverse_coupling 6768555cd5b51c60b061f3a41fa4ea5536e9b2e219806 ARM: NEON optimised FFT and MDCT a4f631b0f93557846e258fe3c571cdb25b401cf119817 ARM: faster NEON IMDCT fe312b587ef084d7d1d97813a6ffed1ff6f70ad819819 ARM: NEON optimised MDCT a99f03a385a8b026a9b032b0d67c437875ee302f19940 ARM: interleave cos/sin tables for improved NEON MDCT 36648139160e5cb3388bbb390656394e23b4eee419941 ARM: merge two loops in ff_mdct_calc_neon 94cd4381dc51d12312f83aa6bb519f17dfe4764419957 ARM: NEON optimisations for some dsputil functions f2f5d248aacafa68a9a1f772844a1eb4f6e720ed19971 ARM: NEON optimised scalarproduct_float 84acbb94ee14bbaee5f8a1eccaee1750d00f77a22 ARM: NEON optimised int32_to_float_fmul_scalar c14a6a4c6f7b62061bae87d45d8bb667d2e5064c20029 ARM: NEON optimised vector_fmul_reverse ef75b6ccb7503ea8818f15ce403ea332d64963a520031 ARM: NEON optimised vector_clipf 927725a4a687d89881a4a7486f892f7566f7a7cb20063 ARM: NEON optimised vector_fmul_add a5dc149d7439ca72f37364bb7a7829f85c1e4eb820151 ARM: remove unnecessary .fpu neon directives e7f5e7b5b8de7b0540c7db7f100b91b0a05760a720163 ARM: clean up dsputil initialisation And added the revs you mentionned in comment #20 to the list (superset of the list in comment #9): 9e80964529f669a752c25832a6d7c64adb32c81918600 Reorganise intreadwrite.h b2ff39618d600db05f527c641e2379fe85237fee18601 ARM asm for AV_RN*() 19760072819c8e8130d1d6f78c7de10d4d1fec4c18917 ARM: actually add VP3 loop filter 14cbfa75169225e2dc04a69c4c902ccc6802249319308 ARM: enable fast_unaligned when --cpu=armv[67] is specified Then Mans suggested an additional ones in comment #23: 10f371694ae5525b084c2ccddc24c2a3912b
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
** Changed in: ffmpeg (Ubuntu Karmic) Importance: Undecided => Medium -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
The MDCT improvement is in. However, it required some minor changes scattered all over the place, so considering the speedup is only a few %, it may not be worth the trouble to patch this. Next up: massive speedup of AAC decoder. I hope to finish it during the week. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
You'll want r19846 too, as it fixes a silly mistake I made. I don't think anything was affected by it in practice, but I don't like bugs. I have another small speedup of [I]MDCT pending too. Hopefully I'll get it in this week. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Thanks for having a go! Did you apply the extra changes suggested by Mans too? I've not tried applying those myself yet, but I'd expect them to apply cleanly. FFT/[I]MDCT optimisations are likely to benefit a lot of codecs. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
I've started work on this, but the ffmpeg source package has been building all day on my beagle board so I expect it will take me a couple of days to complete; sorry for the delay. ** Changed in: ffmpeg (Ubuntu Karmic) Status: Triaged => In Progress -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
More good stuff committed: r19806 ARM: NEON optimised FFT and MDCT r19817 ARM: faster NEON IMDCT r19818 Prepare for optimised forward MDCT implementations r19819 ARM: NEON optimised MDCT The first two triple the speed of several audio decoders. The last two speed up some audio encoders by an unknown amount. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Is any progress expected on merging these changes for Karmic? -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
I just checked in a Vorbis optimisation giving 12% speedup. Something to consider including, perhaps. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
** Changed in: ffmpeg (Ubuntu Karmic) Status: New => Triaged -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Hi again, I think I now have a stable build, by pulling the following revisions from svn://svn.ffmpeg.org/ffmpeg/trunk: r18332 r18333 r18535 r18600 r18601 r18712 r18713 r18916 r18917 r18972 r19216 r19308 r19345 The attached patch enables the NEON build flavour and applicable build options in debian/confflags. The patch is against the karmic source, but the upstream Debian source is not much different. To within that difference, I've succesfully built the upstream source package and the karmic package, and they seem to work OK, with a decent improvement in video decode speed for H.264 (I haven't tested exhaustively). ** Attachment added: "Patch against karmic ffmpeg-debian source to enable neon build flavour and relevant compiler options" http://launchpadlibrarian.net/29899937/Enable-NEON-build-flavour.patch -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Will look into this next week or so when I've setup a beagleboard I expect any time soon :-P ** Changed in: ffmpeg (Ubuntu Karmic) Assignee: Ubuntu armel porters (ubuntu-armel) => Loïc Minier (lool) -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
BTW, you'll need to pull the checkin (r19308) I did a few minutes ago too. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
You should run configure with --cpu=cortex-a8 or --cpu=armv7-a. Do not use --extra-cflags for the -mcpu or -march flags, as this will not give optimal settings. Specifically, it will not allow FFmpeg to use unaligned memory accesses even though armv6 and later supports them. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Will do— I think I now have a good patch set, but I want to check it works. I'm currently doing a fresh build. One other question Mans: Because the baseline architecture used by the toolchain is -march=armv5t, many of the NEON sources fail to build even with -mfpu=neon, because of the presence of PLD instructions (not supported in v5). My current fix for this is to add -march=armv7-a in extra-cflags for the whole NEON-enabled build pass: my rationale for this is that the presence of NEON implies v7, which means that v7 features may as well be enabled for the whole pass. (This seems a general problem with -mfpu= and -march/cpu= which allow non-existent architectures, such as v5+NEON or NEON-without-v7 to be selected when used together.) Is this the right thing to do, or do you have a different suggestion? -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Ah, thanks a lot for that explanation, Mans. Given that, now I agree that Dave's approach outlined in comment #9 seems correct to me here. Dave, please attach your patches to this bug, I can then take care to integrate them into the ffmpeg package. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Ugh, sorry about the line breaks. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Let me explain how the ARM feature selection in FFmpeg works. First, there is the question of runtime feature detection. This is not done on ARM for two reasons: 1) building the entire code for the correct target makes a big difference in performance, and 2) there is currently no good way to detect processor features from userspace. The CPU selection flags passed to the compiler are determined by the --cpu configure option. Specific optimisations are activated by compiling a test file using a representative instruction. The assembler will reject this if the chosen target does not support that instruction. For this approach to work, the compiler must be either statically configured for the correct target, or the appropriate flags must be passed. The FFmpeg configure script has no means of determining correct values for -mfpu and similar flags. If any such flags are required, these must be set using --extra-cflags. The correct way to configure for Cortex-A8 with NEON looks like this: configure --arch=arm --cpu=cortex-a8 --extra-cflags='-mfpu=neon -mfloat-abi=softfp' --cross-prefix=arm-none-linux-gnueabi- Add other flags such as --sysroot, --enable/disable-stuff as necessary. There is rarely any need to explicitly --disable anything, since optional features are only enabled if the required support libraries etc are available. Hope this clarifies the situation. As for the ultrasparc flags, this is weird and probably broken. I'd fix it, but I don't have the hardware. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
The upstream configure script doesn't currently seem to work this way for any of the ARM extensions. Instead, you seem need to build with the right --extra-cflags as well as enabling the extension(s) you want, such as --enable-armvfp or --enable-neon. The configure script has the facilities to add cflags in this way though; it's used for the ultraspac vis extension (whatever that is); but I don't know what the upstream community prefers. In addition to the above, it looks like -march=armv7-a should also be added to extra-cflags when building the NEON flavour; otherwise the ffmpeg configuration assumes that the PLD (cache preload) instruction can't be used because the baseline build architecture is armv5. But it doesn't really make sense to assume that any ARMv7 features may be absent if NEON is present. This would give --extra-cflags='-march=armv7-a -mfpu=neon -mfloat- abi=softfp' for the neon flavour. I'll point Måns at this thread and ask for his comments. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
Re: [Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Dave Martin writes: > Having played a bit, --extra-cflags="-mfpu=neon -mfloat-abi=softfp" > seems to be _required_ for the neon flavour in order for the NEON > optimisations to get built. Is Mans aware of that? If yes, then the upstream configure file should be patched to automatically add these settings to CFLAGS. -- Gruesse/greetings, Reinhard Tartler, KeyID 945348A4 -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Having played a bit, --extra-cflags="-mfpu=neon -mfloat-abi=softfp" seems to be _required_ for the neon flavour in order for the NEON optimisations to get built. -ftree-vectorize probably isn't needed explicitly, because the build seems to default to -O3. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
I discussed this with Måns, and on his advice I pulled across the following patches from the ffmpeg trunk (revision numbers shown are relative to svn://svn.ffmpeg.org/ffmpeg/trunk). r18332 (ARM: NEON optimised add_pixels_clamped) r18333 (ARM: NEON optimized put_signed_pixels_clamped) r18535 (Add guaranteed alignment for loading dest pixels in avg_pixels16_neon) r18601 (ARM asm for AV_RN*()) r18712 (ARM: NEON put_pixels_clamped) r18713 (ARM: Use fewer register in NEON put_pixels _y2 and _xy2) r18916 (ARM: NEON VP3 Loop Filter) r18972 (ARM: add some PLD in NEON IDCT) r19216 (ARM: slightly faster NEON H264 horizontal loop filter) They mostly apply cleanly against ffmpeg 3:0.svn 20090303-1ubuntu6 with just a couple of minor manual merges needed. I uncommented FLAVORS += neon in debian/confflags and got a successful build, but I still need to examine and test the result. NOTE: Because the NEON code makes use of floating-point functionality, I think that we need the following in debian/confflags: neon_build_confflags += --shlibdir=/usr/lib/neon/vfp(or /usr/lib/vfp/neon) (instead of --shlibdir=/usr/lib/neon) We may also want to add --extra-cflags="-mfpu=neon -mfloat-abi=softfp" for this configuration. According to tests we've done here, -ftree- vectorize is also beneficial by parallelising some computations (this is not enabled by default for -O2). I haven't tested these yet myself though. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
** Changed in: ffmpeg (Ubuntu Karmic) Assignee: Canonical Mobile Team (canonical-mobile) => Ubuntu armel porters (ubuntu-armel) -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
Re: [Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
excellent, thanks for clarification Loïc! -- Gruesse/greetings, Reinhard Tartler, KeyID 945348A4 -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
@Reinhard: right, and the jaunty kernel is indeed built with CONFIG_NEON off which implies it wont expose the neon hwcap. However people with the affected hardware should run jaunty and karmic can assume sane hardware and neon in the kernel in a near future. Dave poked Måns Rullgård over email who explained among other things that ffmpeg wouldn't do feature runtime detection on ARM until we have a better kernel interface; so there's nothing desirable in trunk for runtime detection. There might be more optimizations in trunk. Måns Rullgård recommended the ld.so hwcaps for now which should be simple to uncomment in the rules:-) -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
To clarify from my POV: * For Karmic, we expect there to be additional ARM hardware supported by the Ubuntu release, fully supporting NEON (though this is yet to be confirmed). It will be nice if we can backport the new packages for Jaunty, but that's a secondary concern. * I don't expect that ffmpeg will have full runtime detection for ARM features in the Karmic timeframe, so we may still need multiple flavours and the glibc hwcaps mechanism for now. This could be worth investigating again; but the last time I looked, the detection was done via #ifdef. The basic problem with (ii) is that the information provided by hwcaps is rather coarse. Mans doesn't favour hwcaps and was looking at defining a better mechanism, but it would require kernel changes and so could be time-consuming to stabilise/merge. -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
Re: [Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Loïc Minier writes: > I think there are various different things here: > - ffmpeg has a mechanism to do multiple passes (multiple builds) and produce > multiple shared libs which can be loaded by glibc's hwcaps-based selection > mechanism automatically > - ffmpeg has grown some runtime detection of the feature (instead of build > time selection / autodetection on the buildd) > - NEON was broken in hardware on imx51 TO1 which is all hardware which was > around for jaunty perhaps the kernel shouldn't advertise NEON hwcaps in this case then? > What we want: > i) karmic should use the latest ffmpeg > ii) the latest ffmpeg should use NEON automatically at runtime > iii) we want to backport key stuff to some PPA to show the NEON stuff based > on jaunty (as it's impractical to play with karmic as it's still quite > unstable) > iv) keep the Debian and Ubuntu packaging as identical as possible > > I guess i) will happen as ffmpeg updates over the karmic cycle. Debian and ubuntu track the FFmpeg 0.5 release. From what I get of this thread of discussion, I believe the arm related changes have been implemented in the trunk branch of ffmpeg after 0.5 was branched of. I'd be against switching back to trunk because of instabilities it would cause in gstreamer0.10-ffmpeg, mplayer and vlc. You could literally feel the rate of bugreports dropping down after these package finally agreed on a common ffmpeg version to use. I'd rather suggest to backport the changes to the 0.5 release branch instead of tracking again ffmpeg trunk. -- Gruesse/greetings, Reinhard Tartler, KeyID 945348A4 -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
I think there are various different things here: - ffmpeg has a mechanism to do multiple passes (multiple builds) and produce multiple shared libs which can be loaded by glibc's hwcaps-based selection mechanism automatically - ffmpeg has grown some runtime detection of the feature (instead of build time selection / autodetection on the buildd) - NEON was broken in hardware on imx51 TO1 which is all hardware which was around for jaunty What we want: i) karmic should use the latest ffmpeg ii) the latest ffmpeg should use NEON automatically at runtime iii) we want to backport key stuff to some PPA to show the NEON stuff based on jaunty (as it's impractical to play with karmic as it's still quite unstable) iv) keep the Debian and Ubuntu packaging as identical as possible I guess i) will happen as ffmpeg updates over the karmic cycle. ii) and iv) are a bit tricky: are all SIMD features properly runtime detected in the latest ffmpeg? or is there still value in glibc-hwcaps-selected libs? which ARM configure flags can we safely turn on in the main armel build so that it's compatible with the Debian armel architecture expectations? I think we want to enable everything which is fully runtime detected in the main armel build, then if anything else NEON remains we can consider building the special NEON pass to have an even more optimized library. iii) should probably be discussed somewhere else I guess, but will happen after the karmic version works anyway :-) -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
Re: [Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
Dave Martin writes: > It should be safe to uncomment the FLAVORS += neon line now, because we > deliberately disable NEON in the kernel for Jaunty to avoid applications > and libraries accidentally believing it works. (This also means neon is > not present in the reported hwcaps). So you're suggesting we should deliberatly and knowingly waste buildd time for a flavor nobody is technically able to use? Well, we could of course do that, but I'd rather prefer we wouldn't. -- Gruesse/greetings, Reinhard Tartler, KeyID 945348A4 -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
OK, so far so good. The confflags definitions for NEON look OK, but it's still disabled. It should be safe to uncomment the FLAVORS += neon line now, because we deliberately disable NEON in the kernel for Jaunty to avoid applications and libraries accidentally believing it works. (This also means neon is not present in the reported hwcaps). -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
** Tags added: armel ** Tags removed: arm -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
** Tags added: arm -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 383240] Re: Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel
** Also affects: ffmpeg (Ubuntu Karmic) Importance: Undecided Status: New ** Changed in: ffmpeg (Ubuntu Karmic) Assignee: (unassigned) => Canonical Mobile Team (canonical-mobile) -- Integrate and enable ARMv5TE/v6/VFP and NEON optimisations from ffmpeg trunk for armel https://bugs.launchpad.net/bugs/383240 You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu. -- ubuntu-bugs mailing list ubuntu-bugs@lists.ubuntu.com https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs