On 25/06/14 3:44 PM, Luca Barbato wrote:
> On 25/06/14 20:33, James Almer wrote:
>> On 24/06/14 11:26 AM, Luca Barbato wrote:
>>> From: Pierre Edouard Lepere <pierre-edouard.lep...@insa-rennes.fr>
>>>
>>> The functions only support x86_64.
>>>
>>> Fixes from Hendrik Leppkes and James Almer
>>>
>>> Signed-off-by: Luca Barbato <lu_z...@gentoo.org>
>>> ---
>>>  libavcodec/hevcdsp.c          |    6 +-
>>>  libavcodec/hevcdsp.h          |    3 +
>>>  libavcodec/x86/Makefile       |    2 +
>>>  libavcodec/x86/hevc_mc.asm    | 1256 +++++++++++++++++++++++++++++++++++++++++
>>>  libavcodec/x86/hevcdsp.h      |  164 ++++++
>>>  libavcodec/x86/hevcdsp_init.c |  373 ++++++++++++
>>>  6 files changed, 1803 insertions(+), 1 deletion(-)
>>>  create mode 100644 libavcodec/x86/hevc_mc.asm
>>>  create mode 100644 libavcodec/x86/hevcdsp.h
>>>  create mode 100644 libavcodec/x86/hevcdsp_init.c
>>>
>>
>> Many of these functions are SSSE3 and a couple even SSE2 at most.
>
> Can you guide me in this regard?

The SSE4 functions are those using pextrw (with memory operand) and packusdw:

hevc_put_hevc_bi_w2_{8,10}
hevc_put_hevc_bi_w4_{8,10}
hevc_put_hevc_bi_w6_{8,10}
hevc_put_hevc_bi_w8_{8,10}
hevc_put_hevc_uni_w2_{8,10}
hevc_put_hevc_uni_w4_{8,10}
hevc_put_hevc_uni_w6_{8,10}
hevc_put_hevc_uni_w8_{8,10}
hevc_put_hevc_uni_qpel_v{4,8}_10
hevc_put_hevc_uni_qpel_hv2_{8,10}
hevc_put_hevc_uni_qpel_hv4_{8,10}
hevc_put_hevc_uni_qpel_hv6_{8,10}
hevc_put_hevc_uni_qpel_hv8_{8,10}
hevc_put_hevc_uni_pel_pixels{2,6}_8
hevc_put_hevc_bi_pel_pixels{2,6}_8
hevc_put_hevc_{uni,bi}_epel_h2_8
hevc_put_hevc_{uni,bi}_epel_v2_8
hevc_put_hevc_{uni,bi}_epel_h6_8
hevc_put_hevc_{uni,bi}_epel_v6_8
hevc_put_hevc_{uni,bi}_epel_hv{2,6}_8

I don't think I'm missing any.
Both instructions can be emulated using SSE2, so the relevant functions
could be duplicated to create an SSE2/SSSE3 variant, but that's for another
time/patch.

The rest are mostly SSSE3 because of pmaddubsw and pmulhrsw, and a few are
only SSE2.

The qpel and epel tables also need to be renamed to remove the sse4 suffix
(which is unneeded).

>
>> It will require some init macros rewriting to change, but leaving things as is
>> will make atom, conroe and bobcat cpus miss a considerable performance boost.
>
> Probably I can do myself but your help would be welcome =)

I don't have the time, nor do I really want, to deal with the init macros,
but I can help you with the necessary changes to the asm file if needed.

> lu
>
> _______________________________________________
> libav-devel mailing list
> libav-devel@libav.org
> https://lists.libav.org/mailman/listinfo/libav-devel
>
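
For reference, the remark above that packusdw can be emulated using SSE2 can be sketched roughly as follows. This is only an illustration in C intrinsics, not the x86inc asm the patch uses, and it assumes an x86 build with SSE2 available. The names packusdw_sse2, pack8_sse2, sat_u16 and check_packusdw_sse2 are made up for this example and do not appear in the patch. The idea is to bias the dwords so that the unsigned 16-bit range maps onto the signed range that packssdw saturates to, then undo the bias.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Scalar reference: what packusdw does to one 32-bit lane. */
static uint16_t sat_u16(int32_t v)
{
    return v < 0 ? 0 : v > 65535 ? 65535 : (uint16_t)v;
}

/* Hypothetical helper: SSE2-only stand-in for packusdw. */
static __m128i packusdw_sse2(__m128i a, __m128i b)
{
    const __m128i bias32 = _mm_set1_epi32(32768);
    const __m128i bias16 = _mm_set1_epi16(-32768); /* 0x8000 in each word */

    /* Zero negative lanes first so the bias subtraction cannot wrap. */
    a = _mm_andnot_si128(_mm_srai_epi32(a, 31), a);
    b = _mm_andnot_si128(_mm_srai_epi32(b, 31), b);

    /* Shift [0, 65535] onto [-32768, 32767], the range packssdw keeps. */
    a = _mm_sub_epi32(a, bias32);
    b = _mm_sub_epi32(b, bias32);

    /* Signed saturating pack, then undo the bias with a wrapping add. */
    return _mm_add_epi16(_mm_packs_epi32(a, b), bias16);
}

/* Pack 8 dwords into 8 words through the SSE2 path. */
static void pack8_sse2(const int32_t in[8], uint16_t out[8])
{
    __m128i lo = _mm_loadu_si128((const __m128i *)in);
    __m128i hi = _mm_loadu_si128((const __m128i *)(in + 4));
    _mm_storeu_si128((__m128i *)out, packusdw_sse2(lo, hi));
}

/* Compare against the scalar reference on a few edge cases. */
static void check_packusdw_sse2(void)
{
    const int32_t in[8] = { INT32_MIN, -1, 0, 1,
                            32768, 65535, 65536, INT32_MAX };
    uint16_t out[8];

    pack8_sse2(in, out);
    for (int i = 0; i < 8; i++)
        assert(out[i] == sat_u16(in[i]));
}
```

Zeroing the negative lanes costs two extra instructions per register. If the inputs are known never to get close to INT32_MIN, that step could probably be dropped. The other SSE4 instruction, pextrw with a memory operand, can be replaced by the SSE2 register form followed by a 16-bit store from the general-purpose register.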