On Sat, 07. Nov 16:31, Andriy Gelman wrote:
> On Sat, 31. Oct 10:17, Andriy Gelman wrote:
> > On Fri, 16. Oct 00:02, Andriy Gelman wrote:
> > > On Fri, 09. Oct 20:17, Andriy Gelman wrote:
> > > > From: Chip Kerchner <chip.kerch...@ibm.com>
> > > > 
> > > > ---
> > > >  libswscale/ppc/yuv2rgb_altivec.c | 10 ++++++++++
> > > >  1 file changed, 10 insertions(+)
> > > > 
> > > > diff --git a/libswscale/ppc/yuv2rgb_altivec.c 
> > > > b/libswscale/ppc/yuv2rgb_altivec.c
> > > > index 536545293d..930ef6b98f 100644
> > > > --- a/libswscale/ppc/yuv2rgb_altivec.c
> > > > +++ b/libswscale/ppc/yuv2rgb_altivec.c
> > > > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, 
> > > > vector signed short Y,
> > > >   * 
> > > > ------------------------------------------------------------------------------
> > > >   */
> > > >  
> > > > +#if !HAVE_VSX
> > > > +static inline vector unsigned char vec_xl(signed long long offset, 
> > > > const ubyte *addr)
> > > > +{
> > > > +    const vector unsigned char *v_addr = (const vector unsigned char 
> > > > *) (addr + offset);
> > > > +    vector unsigned char align_perm = vec_lvsl(offset, addr);
> > > > +
> > > > +    return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], 
> > > > align_perm);
> > > > +}
> > > > +#endif /* !HAVE_VSX */
> > > > +
> > > >  #define DEFCSP420_CVT(name, out_pixels)                                
> > > >        \
> > > >  static int altivec_ ## name(SwsContext *c, const unsigned char **in,   
> > > >        \
> > > >                              int *instrides, int srcSliceY, int 
> > > > srcSliceH,     \
> > > 
> > > Ping.
> > > This patch fixes PPC64 build on patchwork.
> > > 
> > 
> > ping
> > 
> 
> ping 
> 

I asked Reimar to have a look at the patch. He didn't have a link to original
post, so I'm forwarding his feedback: 

- a few comments sure would be good
- the function likely should be always_inline
- the offset is always 0 in the code, so that could be optimized
- I am not sure if the addresses even can be unaligned (the whole magic code is 
about fixing up unaligned loads since altivec simply ignores the low bits of 
the address, but it looks like the x86 asm also assumes unaligned)
- the extra load needed to fix the alignment can read outside the bounds of the 
buffer I think? Especially for chroma and if the load is aligned. Though chroma 
seems to have an issue already today...

-- 
Andriy
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Reply via email to