On Friday, 9 February 2024, 17:34:40 EET, flow gg wrote:
> The issue here is that any load greater than e8 will fail the test (Bus
> error), so it cannot use vlse64 or similar methods...

AFAICT, data is aligned on 16 bytes here, so using larger element sizes should
not be a problem.

That being the case, you can load pretty much any power-of-two byte quantity
per row, up to 512 bits as 8 segments of 64-bit elements. That is more than
enough to deal with 16-byte rows.

Of course, that results in a tiled data layout, so it only works if individual
elements are all treated equally with no cross-row calculations. This might
require trickery, or not work at all, for those functions that subtract
adjacent values. But your patchset seems to leave those out anyway.

-- 
Rémi Denis-Courmont
http://www.remlab.net/
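
To illustrate the tiled layout mentioned above, here is a minimal scalar C model of what a strided two-field segment load of 64-bit elements (presumably vlsseg2e64.v on RVV) would do with 16-byte rows. This is only a sketch: the function names, the ROW_BYTES constant and the byte-sum operation are hypothetical and are not taken from the patchset. Row i ends up split across element i of two separate register groups, which is harmless for operations that treat every byte the same way.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ROW_BYTES 16 /* assumed row width, as in the discussion above */

/* Scalar model of a strided segment load with 2 fields of 64 bits:
 * the low 8 bytes of row i go to f0[i], the high 8 bytes to f1[i]. */
static void load_rows_2x64(const uint8_t *src, ptrdiff_t stride, size_t rows,
                           uint64_t *f0, uint64_t *f1)
{
    for (size_t i = 0; i < rows; i++) {
        const uint8_t *row = src + (ptrdiff_t)i * stride;

        memcpy(&f0[i], row, 8);
        memcpy(&f1[i], row + 8, 8);
    }
}

/* Sum every byte held in the tiled layout. Each byte is treated equally,
 * so the order in which the bytes are visited does not matter. */
static unsigned sum_bytes_tiled(const uint64_t *f0, const uint64_t *f1,
                                size_t rows)
{
    unsigned sum = 0;

    for (size_t i = 0; i < rows; i++) {
        for (int b = 0; b < 8; b++) {
            sum += (unsigned)((f0[i] >> (8 * b)) & 0xff);
            sum += (unsigned)((f1[i] >> (8 * b)) & 0xff);
        }
    }
    return sum;
}

/* Reference: the same sum computed row by row, one byte at a time. */
static unsigned sum_bytes_rows(const uint8_t *src, ptrdiff_t stride,
                               size_t rows)
{
    unsigned sum = 0;

    for (size_t i = 0; i < rows; i++)
        for (size_t x = 0; x < ROW_BYTES; x++)
            sum += src[(ptrdiff_t)i * stride + (ptrdiff_t)x];
    return sum;
}
```

Both sums agree for any stride of at least 16 bytes. A function that subtracts neighbouring values would not map onto this layout so directly, since bytes 7 and 8 of a row land in different register groups.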