On Sun, 2020-04-19 at 16:19 -0700, Peter Geoghegan wrote:
> Is it possible that the issue has something to do with what the
> compiler knows about the alignment of the tapes back when they were a
> flexible array vs. now, where it's a separate allocation? Perhaps I'm
> overreaching, but it occurs to me that MemSetAligned() is itself
> concerned about the alignment of data returned from palloc(). Could be
> a similar issue here, too.

The memcpy() is for the buffer, not the array of LogicalTapes, so I
don't really see how that would happen.

> Some guy on the internet says that microarchitectural issues can make
> __memcpy_avx_unaligned() a lot faster than the "rep movsq" instruction
> (which you mentioned was a factor on the other thread) in some cases
> [1]. This explanation sounds kind of plausible.
> 
> [1] https://news.ycombinator.com/item?id=12050579

That raises another consideration: perhaps this is not uniformly a
regression, but actually faster in some situations? If so, what
situations?

Regards,
        Jeff Davis



