Hi Reimar,

On Sat, 20. Feb 12:34, Reimar Döffinger wrote:
> Hi!
>
> > On 24 Jan 2021, at 17:35, Andriy Gelman <andriy.gel...@gmail.com> wrote:
> >
> > On Sat, 07. Nov 16:31, Andriy Gelman wrote:
> >> On Sat, 31. Oct 10:17, Andriy Gelman wrote:
> >>> On Fri, 16. Oct 00:02, Andriy Gelman wrote:
> >>>> On Fri, 09. Oct 20:17, Andriy Gelman wrote:
> >>>>> From: Chip Kerchner <chip.kerch...@ibm.com>
> >>>>>
> >>>>> ---
> >>>>>  libswscale/ppc/yuv2rgb_altivec.c | 10 ++++++++++
> >>>>>  1 file changed, 10 insertions(+)
> >>>>>
> >>>>> diff --git a/libswscale/ppc/yuv2rgb_altivec.c b/libswscale/ppc/yuv2rgb_altivec.c
> >>>>> index 536545293d..930ef6b98f 100644
> >>>>> --- a/libswscale/ppc/yuv2rgb_altivec.c
> >>>>> +++ b/libswscale/ppc/yuv2rgb_altivec.c
> >>>>> @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector signed short Y,
> >>>>>   *
> >>>>>   * ------------------------------------------------------------------------------
> >>>>>   */
> >>>>>
> >>>>> +#if !HAVE_VSX
> >>>>> +static inline vector unsigned char vec_xl(signed long long offset, const ubyte *addr)
> >>>>> +{
> >>>>> +    const vector unsigned char *v_addr = (const vector unsigned char *) (addr + offset);
> >>>>> +    vector unsigned char align_perm = vec_lvsl(offset, addr);
> >>>>> +
> >>>>> +    return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], align_perm);
> >>>>> +}
> >>>>> +#endif /* !HAVE_VSX */
> >>>>> +
> >>>>>  #define DEFCSP420_CVT(name, out_pixels)                                \
> >>>>>  static int altivec_ ## name(SwsContext *c, const unsigned char **in,   \
> >>>>>                              int *instrides, int srcSliceY, int srcSliceH, \
> >>>>
> >>>> Ping.
> >>>> This patch fixes PPC64 build on patchwork.
> >>>>
> >>>
> >>> ping
> >>>
> >>
> >> ping
> >>
> >
> > I asked Reimar to have a look at the patch.
> > He didn't have a link to the original post, so I'm forwarding his
> > feedback:
> >
> > - a few comments sure would be good
> > - the function likely should be always_inline
> > - the offset is always 0 in the code, so that could be optimized
> > - I am not sure if the addresses even can be unaligned (the whole magic
> >   code is about fixing up unaligned loads, since AltiVec simply ignores
> >   the low bits of the address, but it looks like the x86 asm also assumes
> >   unaligned)
> > - the extra load needed to fix the alignment can read outside the bounds
> >   of the buffer I think? Especially for chroma, and if the load is
> >   aligned. Though chroma seems to have an issue already today...
> >
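For anyone following along, the "magic code" Reimar mentions works like this: AltiVec's aligned load drops the low 4 address bits, so an unaligned 16-byte load is synthesized from two aligned loads plus a byte permute, where vec_lvsl encodes the misalignment as a permute pattern. A scalar C sketch of that technique (illustrative only; `load_unaligned_16` is a made-up name, not a real intrinsic):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of the AltiVec unaligned-load fixup: round the address
 * down to a 16-byte boundary (what lvx does implicitly), load two
 * aligned 16-byte blocks, and select 16 consecutive bytes starting at
 * the misalignment offset (what vec_lvsl + vec_perm do together). */
static void load_unaligned_16(const uint8_t *addr, uint8_t out[16])
{
    const uint8_t *aligned =
        (const uint8_t *)((uintptr_t)addr & ~(uintptr_t)15);
    unsigned shift = (uintptr_t)addr & 15;  /* what vec_lvsl encodes */

    /* Two aligned loads. Note the second one can read past the end of
     * the caller's buffer -- exactly the out-of-bounds concern raised
     * in the review above. */
    uint8_t lo[16], hi[16];
    memcpy(lo, aligned, 16);
    memcpy(hi, aligned + 16, 16);

    /* The permute step: pick bytes shift..shift+15 from lo ++ hi. */
    for (int i = 0; i < 16; i++)
        out[i] = (shift + i < 16) ? lo[shift + i] : hi[shift + i - 16];
}
```

Note that when the address is already aligned (shift == 0), the second load is never actually needed, which is why the extra read only becomes a correctness question at buffer boundaries.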
> >
>
> I had a look at the code before vec_xl was introduced, and besides the
> unnecessary extra offset it seems it was pretty much like this.
> So worst case this patch would restore the behaviour to before the vec_xl
> patch, which I guess is a reasonable argument to merge it whether it's
> perfect or not.
> Personally I would have added a vecload (dropping the offset argument) or
> similar function that either does vec_xl or this, instead of introducing a
> "fake" vec_xl function, but that is just nitpicking I guess...

Thanks for looking into this. Then I'll apply the patch in a few days if no
one objects. I'll slightly reword the title/commit message:

    lsws/ppc/yuv2rgb_altivec: Fix build in non-VSX environments

    Fixes #8750

    The vec_xl intrinsic is only available on POWER7 or higher. Add an
    inline fallback function to fix builds where VSX is not supported.

--
Andriy
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".