Andreas: On Tue, Mar 12, 2024 at 8:42 PM Andreas Rheinhardt <andreas.rheinha...@outlook.com> wrote: > > HAVE_FAST_UNALIGNED being true does not imply that > one can simply read from any pointer via *(long*). > It is undefined behaviour in case the pointer is not > sufficiently aligned; and even if it is, it is (likely) > a violation of the effective-type rules. Fix both > of these by using the appropriate AV_[RW]N macros. > > Also, the current code used sizeof(long) as if this > were the CPU's native arithmetic size, but this is > not true on 64bit Windows. This has been fixed, too. > > This affected huffyuv FATE-tests. > > Signed-off-by: Andreas Rheinhardt <andreas.rheinha...@outlook.com> > --- > libavcodec/lossless_videoencdsp.c | 27 +++++++++++++++++++-------- > 1 file changed, 19 insertions(+), 8 deletions(-) > > diff --git a/libavcodec/lossless_videoencdsp.c > b/libavcodec/lossless_videoencdsp.c > index e2dc99e201..8d03a5b5c6 100644 > --- a/libavcodec/lossless_videoencdsp.c > +++ b/libavcodec/lossless_videoencdsp.c > @@ -18,19 +18,31 @@ > > #include "config.h" > #include "libavutil/attributes.h" > +#include "libavutil/intreadwrite.h" > #include "lossless_videoencdsp.h" > #include "mathops.h" > > +#if HAVE_FAST_64BIT > +typedef uint64_t uint_native; > +#define READ AV_RN64 > +#define READA AV_RN64A > +#define WRITEA AV_WN64A > +#else > +typedef uint32_t uint_native; > +#define READ AV_RN32 > +#define READA AV_RN32A > +#define WRITEA AV_WN32A > +#endif > // 0x7f7f7f7f or 0x7f7f7f7f7f7f7f7f or whatever, depending on the cpu's > native arithmetic size > -#define pb_7f (~0UL / 255 * 0x7f) > -#define pb_80 (~0UL / 255 * 0x80) > +#define pb_7f (~(uint_native)0 / 255 * 0x7f) > +#define pb_80 (~(uint_native)0 / 255 * 0x80) > > static void diff_bytes_c(uint8_t *dst, const uint8_t *src1, const uint8_t > *src2, intptr_t w) > { > long i; > > #if !HAVE_FAST_UNALIGNED > - if (((long)src1 | (long)src2) & (sizeof(long) - 1)) { > + if (((uintptr_t)src1 | (uintptr_t)src2) & (sizeof(uint_native) - 1)) { > for (i = 0; i + 7 < w; i += 8) { > dst[i + 0] = src1[i + 0] - src2[i + 0]; > dst[i + 1] = src1[i + 1] - src2[i + 1]; > @@ -43,11 +55,10 @@ static void diff_bytes_c(uint8_t *dst, const uint8_t > *src1, const uint8_t *src2, > } > } else > #endif > - for (i = 0; i <= w - (int) sizeof(long); i += sizeof(long)) { > - long a = *(long *) (src1 + i); > - long b = *(long *) (src2 + i); > - *(long *) (dst + i) = ((a | pb_80) - (b & pb_7f)) ^ > - ((a ^ b ^ pb_80) & pb_80); > + for (i = 0; i <= w - (int) sizeof(uint_native); i += > sizeof(uint_native)) { > + uint_native a = READA(src1 + i); > + uint_native b = READ(src2 + i); > + WRITEA(dst + i, ((a | pb_80) - (b & pb_7f)) ^ ((a ^ b ^ pb_80) & > pb_80)); > } > for (; i < w; i++) > dst[i + 0] = src1[i + 0] - src2[i + 0]; > -- > 2.40.1 > > _______________________________________________ > ffmpeg-devel mailing list > ffmpeg-devel@ffmpeg.org > https://ffmpeg.org/mailman/listinfo/ffmpeg-devel > > To unsubscribe, visit link above, or email > ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".
Confirming that this fixes 'fate-vsynth[123]-ffvhuff' and probably others on my PowerPC QEMU setup (provided the avidec fix is applied as well), POWER7 (ppc64) and POWER9 (ppc64le). Thanks, Sean McGovern _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".