Hi Reimar,

On Sat, 20. Feb 12:34, Reimar Döffinger wrote:
> Hi!
> 
> > On 24 Jan 2021, at 17:35, Andriy Gelman <andriy.gel...@gmail.com> wrote:
> > 
> > On Sat, 07. Nov 16:31, Andriy Gelman wrote:
> >> On Sat, 31. Oct 10:17, Andriy Gelman wrote:
> >>> On Fri, 16. Oct 00:02, Andriy Gelman wrote:
> >>>> On Fri, 09. Oct 20:17, Andriy Gelman wrote:
> >>>>> From: Chip Kerchner <chip.kerch...@ibm.com>
> >>>>> 
> >>>>> ---
> >>>>> libswscale/ppc/yuv2rgb_altivec.c | 10 ++++++++++
> >>>>> 1 file changed, 10 insertions(+)
> >>>>> 
> >>>>> diff --git a/libswscale/ppc/yuv2rgb_altivec.c b/libswscale/ppc/yuv2rgb_altivec.c
> >>>>> index 536545293d..930ef6b98f 100644
> >>>>> --- a/libswscale/ppc/yuv2rgb_altivec.c
> >>>>> +++ b/libswscale/ppc/yuv2rgb_altivec.c
> >>>>> @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector signed short Y,
> >>>>>  * ------------------------------------------------------------------------------
> >>>>>  */
> >>>>> 
> >>>>> +#if !HAVE_VSX
> >>>>> +static inline vector unsigned char vec_xl(signed long long offset, const ubyte *addr)
> >>>>> +{
> >>>>> +    const vector unsigned char *v_addr = (const vector unsigned char *) (addr + offset);
> >>>>> +    vector unsigned char align_perm = vec_lvsl(offset, addr);
> >>>>> +
> >>>>> +    return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], align_perm);
> >>>>> +}
> >>>>> +#endif /* !HAVE_VSX */
> >>>>> +
> >>>>> #define DEFCSP420_CVT(name, out_pixels)                                       \
> >>>>> static int altivec_ ## name(SwsContext *c, const unsigned char **in,          \
> >>>>>                             int *instrides, int srcSliceY, int srcSliceH,     \
> >>>> 
> >>>> Ping.
> >>>> This patch fixes the PPC64 build on patchwork.
> >>>> 
> >>> 
> >>> ping
> >>> 
> >> 
> >> ping 
> >> 
> > 
> > I asked Reimar to have a look at the patch. He didn't have a link to the
> > original post, so I'm forwarding his feedback:
> > 
> > - a few comments sure would be good
> > - the function likely should be always_inline
> > - the offset is always 0 in the code, so that could be optimized
> > - I am not sure the addresses can even be unaligned (the whole magic code
> >   is about fixing up unaligned loads, since AltiVec simply ignores the low
> >   bits of the address, but it looks like the x86 asm also assumes
> >   unaligned addresses)
> > - the extra load needed to fix the alignment can read outside the bounds
> >   of the buffer, I think? Especially for chroma, and if the load is
> >   aligned. Though chroma seems to have an issue already today...
> > 

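To make that last point concrete, here is a small standalone illustration of
the byte range the lvsl/perm scheme touches (my own sketch, not code from the
patch):

#include <stdint.h>
#include <stdio.h>

/* An AltiVec lvsl/vec_perm "unaligned load" of 16 bytes starting at addr
 * always reads the two aligned 16-byte blocks around it, i.e. v_addr[0]
 * and v_addr[1] in the patch above. */
static void touched_range(uintptr_t addr)
{
    uintptr_t first = addr & ~(uintptr_t)15;   /* vec_ld aligns down */

    printf("load at %#lx reads [%#lx, %#lx]\n",
           (unsigned long)addr,
           (unsigned long)first,
           (unsigned long)(first + 31));       /* second block ends here */
}

int main(void)
{
    touched_range(0x1000); /* aligned: reads 0x1000..0x101f, a full 16 bytes past */
    touched_range(0x1004); /* unaligned: reads 0x1000..0x101f, 12 bytes past 0x1013 */
    return 0;
}
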
> 
> I had a look at the code from before vec_xl was introduced, and apart from
> the unnecessary extra offset it was pretty much like this. So in the worst
> case this patch restores the behaviour from before the vec_xl patch, which
> I guess is a reasonable argument for merging it whether it's perfect or not.
> Personally I would have added a vecload (dropping the offset argument) or
> similar function that either does vec_xl or this, instead of introducing a
> “fake” vec_xl function, but that is just nitpicking I guess…
> 

Thanks for looking into this. 
Then I'll apply the patch in a few days if no one objects. 
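
For reference, a vecload-style helper along the lines you describe might look
something like this (an untested sketch; "vecload" is a made-up name, and it
assumes the file's existing ubyte typedef and FFmpeg's av_always_inline):

/* Sketch: one helper without the unused offset argument, picking the load
 * strategy at compile time instead of redefining vec_xl. */
static av_always_inline vector unsigned char vecload(const ubyte *addr)
{
#if HAVE_VSX
    return vec_xl(0, addr);                    /* VSX loads handle misalignment */
#else
    /* Plain AltiVec: vec_ld drops the low address bits, so read the two
     * surrounding aligned vectors and permute the wanted bytes together. */
    const vector unsigned char *v_addr = (const vector unsigned char *) addr;
    vector unsigned char align_perm = vec_lvsl(0, addr);

    return vec_perm(v_addr[0], v_addr[1], align_perm);
#endif
}

Call sites would then use something like vecload(ptr) instead of
vec_xl(0, ptr), but that can be a follow-up cleanup.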

I'll slightly reword the title/commit message:

lsws/ppc/yuv2rgb_altivec: Fix build in non-VSX environments

Fixes #8750
The vec_xl intrinsic is only available on POWER7 and higher. Add an inline
function to fix the build when VSX is not supported.

-- 
Andriy