Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
Hi Reimar, On Sat, 20. Feb 12:34, Reimar Döffinger wrote: > Hi! > > > On 24 Jan 2021, at 17:35, Andriy Gelman wrote: > > > > On Sat, 07. Nov 16:31, Andriy Gelman wrote: > >> On Sat, 31. Oct 10:17, Andriy Gelman wrote: > >>> On Fri, 16. Oct 00:02, Andriy Gelman wrote: > On Fri, 09. Oct 20:17, Andriy Gelman wrote: > > From: Chip Kerchner > > > > --- > > libswscale/ppc/yuv2rgb_altivec.c | 10 ++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/libswscale/ppc/yuv2rgb_altivec.c > > b/libswscale/ppc/yuv2rgb_altivec.c > > index 536545293d..930ef6b98f 100644 > > --- a/libswscale/ppc/yuv2rgb_altivec.c > > +++ b/libswscale/ppc/yuv2rgb_altivec.c > > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, > > vector signed short Y, > > * > > -- > > */ > > > > +#if !HAVE_VSX > > +static inline vector unsigned char vec_xl(signed long long offset, > > const ubyte *addr) > > +{ > > +const vector unsigned char *v_addr = (const vector unsigned char > > *) (addr + offset); > > +vector unsigned char align_perm = vec_lvsl(offset, addr); > > + > > +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], > > align_perm); > > +} > > +#endif /* !HAVE_VSX */ > > + > > #define DEFCSP420_CVT(name, out_pixels) > > \ > > static int altivec_ ## name(SwsContext *c, const unsigned char **in, > > \ > > int *instrides, int srcSliceY, int > > srcSliceH, \ > > Ping. > This patch fixes PPC64 build on patchwork. > > >>> > >>> ping > >>> > >> > >> ping > >> > > > > I asked Reimar to have a look at the patch. He didn't have a link to > > original > > post, so I'm forwarding his feedback: > > > > - a few comments sure would be good > > - the function likely should be always_inline > > - the offset is always 0 in the code, so that could be optimized > > - I am not sure if the addresses even can be unaligned (the whole magic > > code is about fixing up unaligned loads since altivec simply ignores the > > low bits of the address, but it looks like the x86 asm also assumes > > unaligned) > > - the extra load needed to fix the alignment can read outside the bounds of > > the buffer I think? Especially for chroma and if the load is aligned. > > Though chroma seems to have an issue already today... > > > > I had a look at the code before the vec_xl was introduced, and besides the > unnecessary extra offset it seems it was pretty much like this. > So worst case this patch would restore the behaviour to before the vec_xl > patch, which I > guess is a reasonable argument to merge it whether it’s perfect or not. > Personally I would have added a vecload (dropping the offset argument) or > similar function > that either does vec_xl or this, instead of introducing a “fake” vec_xl > function, > but that is just nitpicking I guess… > Thanks for looking into this. Then I'll apply the patch in a few days if no one objects. I'll slightly reword the title/commit message: lsws/ppc/yuv2rgb_altivec: Fix build in non-VSX environments Fixes #8750 vec_xl intrinsic is only available on POWER 7 or higher. Add inline function to fix builds if VSX is not supported. -- Andriy ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
Hi! > On 24 Jan 2021, at 17:35, Andriy Gelman wrote: > > On Sat, 07. Nov 16:31, Andriy Gelman wrote: >> On Sat, 31. Oct 10:17, Andriy Gelman wrote: >>> On Fri, 16. Oct 00:02, Andriy Gelman wrote: On Fri, 09. Oct 20:17, Andriy Gelman wrote: > From: Chip Kerchner > > --- > libswscale/ppc/yuv2rgb_altivec.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/libswscale/ppc/yuv2rgb_altivec.c > b/libswscale/ppc/yuv2rgb_altivec.c > index 536545293d..930ef6b98f 100644 > --- a/libswscale/ppc/yuv2rgb_altivec.c > +++ b/libswscale/ppc/yuv2rgb_altivec.c > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector > signed short Y, > * > -- > */ > > +#if !HAVE_VSX > +static inline vector unsigned char vec_xl(signed long long offset, const > ubyte *addr) > +{ > +const vector unsigned char *v_addr = (const vector unsigned char *) > (addr + offset); > +vector unsigned char align_perm = vec_lvsl(offset, addr); > + > +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], > align_perm); > +} > +#endif /* !HAVE_VSX */ > + > #define DEFCSP420_CVT(name, out_pixels) > \ > static int altivec_ ## name(SwsContext *c, const unsigned char **in, > \ > int *instrides, int srcSliceY, int srcSliceH, > \ Ping. This patch fixes PPC64 build on patchwork. >>> >>> ping >>> >> >> ping >> > > I asked Reimar to have a look at the patch. He didn't have a link to original > post, so I'm forwarding his feedback: > > - a few comments sure would be good > - the function likely should be always_inline > - the offset is always 0 in the code, so that could be optimized > - I am not sure if the addresses even can be unaligned (the whole magic code > is about fixing up unaligned loads since altivec simply ignores the low bits > of the address, but it looks like the x86 asm also assumes unaligned) > - the extra load needed to fix the alignment can read outside the bounds of > the buffer I think? Especially for chroma and if the load is aligned. Though > chroma seems to have an issue already today... > I had a look at the code before the vec_xl was introduced, and besides the unnecessary extra offset it seems it was pretty much like this. So worst case this patch would restore the behaviour to before the vec_xl patch, which I guess is a reasonable argument to merge it whether it’s perfect or not. Personally I would have added a vecload (dropping the offset argument) or similar function that either does vec_xl or this, instead of introducing a “fake” vec_xl function, but that is just nitpicking I guess… Best regards, Reimar ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
Am Sa., 10. Okt. 2020 um 02:44 Uhr schrieb Andriy Gelman : > > From: Chip Kerchner > > --- > libswscale/ppc/yuv2rgb_altivec.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/libswscale/ppc/yuv2rgb_altivec.c > b/libswscale/ppc/yuv2rgb_altivec.c > index 536545293d..930ef6b98f 100644 > --- a/libswscale/ppc/yuv2rgb_altivec.c > +++ b/libswscale/ppc/yuv2rgb_altivec.c > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector > signed short Y, > * > -- > */ > > +#if !HAVE_VSX > +static inline vector unsigned char vec_xl(signed long long offset, const > ubyte *addr) > +{ > +const vector unsigned char *v_addr = (const vector unsigned char *) > (addr + offset); > +vector unsigned char align_perm = vec_lvsl(offset, addr); > + > +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], align_perm); > +} > +#endif /* !HAVE_VSX */ Is there a speed impact if this function is used? Carl Eugen ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
On Sat, 07. Nov 16:31, Andriy Gelman wrote: > On Sat, 31. Oct 10:17, Andriy Gelman wrote: > > On Fri, 16. Oct 00:02, Andriy Gelman wrote: > > > On Fri, 09. Oct 20:17, Andriy Gelman wrote: > > > > From: Chip Kerchner > > > > > > > > --- > > > > libswscale/ppc/yuv2rgb_altivec.c | 10 ++ > > > > 1 file changed, 10 insertions(+) > > > > > > > > diff --git a/libswscale/ppc/yuv2rgb_altivec.c > > > > b/libswscale/ppc/yuv2rgb_altivec.c > > > > index 536545293d..930ef6b98f 100644 > > > > --- a/libswscale/ppc/yuv2rgb_altivec.c > > > > +++ b/libswscale/ppc/yuv2rgb_altivec.c > > > > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, > > > > vector signed short Y, > > > > * > > > > -- > > > > */ > > > > > > > > +#if !HAVE_VSX > > > > +static inline vector unsigned char vec_xl(signed long long offset, > > > > const ubyte *addr) > > > > +{ > > > > +const vector unsigned char *v_addr = (const vector unsigned char > > > > *) (addr + offset); > > > > +vector unsigned char align_perm = vec_lvsl(offset, addr); > > > > + > > > > +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], > > > > align_perm); > > > > +} > > > > +#endif /* !HAVE_VSX */ > > > > + > > > > #define DEFCSP420_CVT(name, out_pixels) > > > >\ > > > > static int altivec_ ## name(SwsContext *c, const unsigned char **in, > > > >\ > > > > int *instrides, int srcSliceY, int > > > > srcSliceH, \ > > > > > > Ping. > > > This patch fixes PPC64 build on patchwork. > > > > > > > ping > > > > ping > I asked Reimar to have a look at the patch. He didn't have a link to original post, so I'm forwarding his feedback: - a few comments sure would be good - the function likely should be always_inline - the offset is always 0 in the code, so that could be optimized - I am not sure if the addresses even can be unaligned (the whole magic code is about fixing up unaligned loads since altivec simply ignores the low bits of the address, but it looks like the x86 asm also assumes unaligned) - the extra load needed to fix the alignment can read outside the bounds of the buffer I think? Especially for chroma and if the load is aligned. Though chroma seems to have an issue already today... -- Andriy ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
On Sat, 31. Oct 10:17, Andriy Gelman wrote: > On Fri, 16. Oct 00:02, Andriy Gelman wrote: > > On Fri, 09. Oct 20:17, Andriy Gelman wrote: > > > From: Chip Kerchner > > > > > > --- > > > libswscale/ppc/yuv2rgb_altivec.c | 10 ++ > > > 1 file changed, 10 insertions(+) > > > > > > diff --git a/libswscale/ppc/yuv2rgb_altivec.c > > > b/libswscale/ppc/yuv2rgb_altivec.c > > > index 536545293d..930ef6b98f 100644 > > > --- a/libswscale/ppc/yuv2rgb_altivec.c > > > +++ b/libswscale/ppc/yuv2rgb_altivec.c > > > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector > > > signed short Y, > > > * > > > -- > > > */ > > > > > > +#if !HAVE_VSX > > > +static inline vector unsigned char vec_xl(signed long long offset, const > > > ubyte *addr) > > > +{ > > > +const vector unsigned char *v_addr = (const vector unsigned char *) > > > (addr + offset); > > > +vector unsigned char align_perm = vec_lvsl(offset, addr); > > > + > > > +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], > > > align_perm); > > > +} > > > +#endif /* !HAVE_VSX */ > > > + > > > #define DEFCSP420_CVT(name, out_pixels) > > > \ > > > static int altivec_ ## name(SwsContext *c, const unsigned char **in, > > > \ > > > int *instrides, int srcSliceY, int > > > srcSliceH, \ > > > > Ping. > > This patch fixes PPC64 build on patchwork. > > > > ping > ping > -- > Andriy ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
On Fri, 16. Oct 00:02, Andriy Gelman wrote: > On Fri, 09. Oct 20:17, Andriy Gelman wrote: > > From: Chip Kerchner > > > > --- > > libswscale/ppc/yuv2rgb_altivec.c | 10 ++ > > 1 file changed, 10 insertions(+) > > > > diff --git a/libswscale/ppc/yuv2rgb_altivec.c > > b/libswscale/ppc/yuv2rgb_altivec.c > > index 536545293d..930ef6b98f 100644 > > --- a/libswscale/ppc/yuv2rgb_altivec.c > > +++ b/libswscale/ppc/yuv2rgb_altivec.c > > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector > > signed short Y, > > * > > -- > > */ > > > > +#if !HAVE_VSX > > +static inline vector unsigned char vec_xl(signed long long offset, const > > ubyte *addr) > > +{ > > +const vector unsigned char *v_addr = (const vector unsigned char *) > > (addr + offset); > > +vector unsigned char align_perm = vec_lvsl(offset, addr); > > + > > +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], > > align_perm); > > +} > > +#endif /* !HAVE_VSX */ > > + > > #define DEFCSP420_CVT(name, out_pixels) > >\ > > static int altivec_ ## name(SwsContext *c, const unsigned char **in, > >\ > > int *instrides, int srcSliceY, int srcSliceH, > >\ > > Ping. > This patch fixes PPC64 build on patchwork. > ping -- Andriy ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
On Fri, 09. Oct 20:17, Andriy Gelman wrote: > From: Chip Kerchner > > --- > libswscale/ppc/yuv2rgb_altivec.c | 10 ++ > 1 file changed, 10 insertions(+) > > diff --git a/libswscale/ppc/yuv2rgb_altivec.c > b/libswscale/ppc/yuv2rgb_altivec.c > index 536545293d..930ef6b98f 100644 > --- a/libswscale/ppc/yuv2rgb_altivec.c > +++ b/libswscale/ppc/yuv2rgb_altivec.c > @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector > signed short Y, > * > -- > */ > > +#if !HAVE_VSX > +static inline vector unsigned char vec_xl(signed long long offset, const > ubyte *addr) > +{ > +const vector unsigned char *v_addr = (const vector unsigned char *) > (addr + offset); > +vector unsigned char align_perm = vec_lvsl(offset, addr); > + > +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], align_perm); > +} > +#endif /* !HAVE_VSX */ > + > #define DEFCSP420_CVT(name, out_pixels) > \ > static int altivec_ ## name(SwsContext *c, const unsigned char **in, > \ > int *instrides, int srcSliceY, int srcSliceH, > \ Ping. This patch fixes PPC64 build on patchwork. -- Andriy ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
From: Chip Kerchner --- libswscale/ppc/yuv2rgb_altivec.c | 10 ++ 1 file changed, 10 insertions(+) diff --git a/libswscale/ppc/yuv2rgb_altivec.c b/libswscale/ppc/yuv2rgb_altivec.c index 536545293d..930ef6b98f 100644 --- a/libswscale/ppc/yuv2rgb_altivec.c +++ b/libswscale/ppc/yuv2rgb_altivec.c @@ -283,6 +283,16 @@ static inline void cvtyuvtoRGB(SwsContext *c, vector signed short Y, * -- */ +#if !HAVE_VSX +static inline vector unsigned char vec_xl(signed long long offset, const ubyte *addr) +{ +const vector unsigned char *v_addr = (const vector unsigned char *) (addr + offset); +vector unsigned char align_perm = vec_lvsl(offset, addr); + +return (vector unsigned char) vec_perm(v_addr[0], v_addr[1], align_perm); +} +#endif /* !HAVE_VSX */ + #define DEFCSP420_CVT(name, out_pixels) \ static int altivec_ ## name(SwsContext *c, const unsigned char **in, \ int *instrides, int srcSliceY, int srcSliceH, \ -- 2.28.0 ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
ffmpeg_altivec_yuv2rgb_novsx.patch Description: Binary data ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
ffmpeg_altivec_yuv2rgb_novsx.patch Description: Binary data ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
ffmpeg_altivec_yuv2rgb_novsx.patch Description: Binary data ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Re: [FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
On Tue, Jun 30, 2020 at 12:06:10PM +, Chip Kerchner wrote: > yuv2rgb_altivec.c | 10 ++ > 1 file changed, 10 insertions(+) > 42288c448d2dadce913a969945ee36afc510d200 ffmpeg_altivec_yuv2rgb_novsx.patch > diff --git a/libswscale/ppc/yuv2rgb_altivec.c > b/libswscale/ppc/yuv2rgb_altivec.c > index 536545293d..0cd4a1db13 100644 > --- a/libswscale/ppc/yuv2rgb_altivec.c > +++ b/libswscale/ppc/yuv2rgb_altivec.c it seems git here doesnt like this attachment Applying: Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments error: corrupt patch at line 8 error: could not build fake ancestor But the __VSX__ check looks wrong, nothing else checks for vsx that way configure checks for vsx and there is HAVE_VSX, pleaase check if that works too (iam not ppc maintainer and havnt tested. this suggestion is just from looking and guessing) Thanks [...] -- Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB Opposition brings concord. Out of discord comes the fairest harmony. -- Heraclitus signature.asc Description: PGP signature ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
[FFmpeg-devel] [PATCH] Ticket #8750 Add inline function for the vec_xl intrinsic in non-VSX environments
ffmpeg_altivec_yuv2rgb_novsx.patch Description: Binary data ___ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".