On Sun, Sep 18, 2011 at 4:57 PM, Ronald S. Bultje <[email protected]> wrote:
> Hi,
>
> On Sun, Sep 18, 2011 at 8:31 AM, Kieran Kunhya <[email protected]> wrote:
> > On Sun, Sep 18, 2011 at 4:18 PM, Ronald S. Bultje <[email protected]> wrote:
> >> On Sun, Sep 18, 2011 at 4:10 AM, Kieran Kunhya <[email protected]> wrote:
> >> > This adds SSE4 ASM for the (easy) case of lumFilterSize=1 and for
> >> > 10-bit. The naming scheme and function pointers doesn't yet match swscale's
> >> > scheme.
> >> > Assistance to get this up to scratch appreciated
> >> [..]
> >> > + void (*lum_10_filter1)(uint16_t *dst, int16_t *lum_src, int16_t filter, int width);
> >> [..]
> >> > + if (output_bits == 10 && lumFilterSize == 1 && big_endian == 0)
> >> > + c->lum_10_filter1(yDest, lumSrc[0], lumFilter[0], dstW);
> >> > + else
> >>
> >> All these tricks can be done in the init function, so that you can
> >> assign c->yuv2yuv1 directly.
> >
> > What about if my codepath involves lumFilterSize = 1 and chrFilterSize=8 ?
> > As far as I can tell yuv2yuv1 is for when both are equal to one.
>
> As discussed on IRC, I believe yuv2yuvX should be slimmed to do 1
> plane at a time, that solves this problem for free and also prevents
> duplication of code in each yuv2yuv[1X] variant.
>
> Ronald

Agreed, these can then just become plain DSP functions with function pointers in asm.
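
To make the idea concrete, here is a rough C sketch of per-plane output functions selected once at init time, so that a mixed case such as lumFilterSize = 1 with chrFilterSize = 8 needs no special handling. Everything in it is hypothetical and for illustration only: the ScaleSketch struct, the plane_fn signature, the function names and the 12-bit coefficient scale (SKETCH_SHIFT) are assumptions, not swscale's actual API or fixed-point format.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: coefficients are fixed point with 4096 == 1.0. */
#define SKETCH_SHIFT 12

/* Hypothetical per-plane output function: handles exactly one plane. */
typedef void (*plane_fn)(const int16_t *filter, int filter_size,
                         const int16_t **src, uint16_t *dst, int width);

/* Hypothetical context; stands in for the real scaler context. */
typedef struct ScaleSketch {
    plane_fn lum_out; /* chosen from the luma filter size   */
    plane_fn chr_out; /* chosen from the chroma filter size */
} ScaleSketch;

/* Clip to the 10-bit output range. */
static uint16_t clip10(int v)
{
    return (uint16_t)(v < 0 ? 0 : v > 1023 ? 1023 : v);
}

/* filter_size == 1: a single multiply per sample, no accumulation loop. */
static void plane_1_c(const int16_t *filter, int filter_size,
                      const int16_t **src, uint16_t *dst, int width)
{
    (void)filter_size;
    for (int i = 0; i < width; i++)
        dst[i] = clip10((src[0][i] * filter[0]) >> SKETCH_SHIFT);
}

/* General case: weighted sum over filter_size source rows. */
static void plane_X_c(const int16_t *filter, int filter_size,
                      const int16_t **src, uint16_t *dst, int width)
{
    for (int i = 0; i < width; i++) {
        int acc = 0;
        for (int j = 0; j < filter_size; j++)
            acc += src[j][i] * filter[j];
        dst[i] = clip10(acc >> SKETCH_SHIFT);
    }
}

/* All selection happens here, once; the per-line code just calls the
 * pointers. An asm version would be assigned here in the same way. */
static void sketch_init(ScaleSketch *c, int lum_filter_size,
                        int chr_filter_size)
{
    assert(lum_filter_size >= 1 && chr_filter_size >= 1);
    c->lum_out = lum_filter_size == 1 ? plane_1_c : plane_X_c;
    c->chr_out = chr_filter_size == 1 ? plane_1_c : plane_X_c;
}
```

With this shape, luma and chroma are dispatched independently, and the caller would invoke c->lum_out once for Y and c->chr_out once each for U and V without checking filter sizes in the inner loop.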
_______________________________________________
libav-devel mailing list
[email protected]
https://lists.libav.org/mailman/listinfo/libav-devel
