Aligning to the cache line is a great idea. I guess the compiler can then emit a faster memcpy version.
Thank you!

On Wed, Jan 06, 2021 at 01:16:35PM +0100, David Sterba wrote:
> On Sat, Dec 26, 2020 at 02:46:06PM -0700, shng...@gmail.com wrote:
> > From: Sheng Mao <shng...@gmail.com>
> >
> > To use optimized CRC implemention, the input buffer must be
> > unsigned long aligned. btrfs receive calculates checksum based on
> > read_buf, including btrfs_cmd_header (with zero-ed CRC field)
> > and command content. GCC attribute is added to both struct
> > btrfs_send_stream and read_buf to make sure read_buf is allocated
> > with proper alignment.
> >
> > Issue: #324
>
> The issue has a lot of interesting debugging and performance information
> so I'd put something to the changelog as well.
>
> The alignment fixup sounds correct, though I'd push it a bit further and
> move read_buf to the beginning of the structure and align the whole
> structure to 64 bytes, as this is the common cache line size. This could
> potentially help too with memcpy.
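
For illustration only, here is a minimal sketch of the layout suggested above: the buffer comes first and the whole structure is aligned to 64 bytes. This is not the actual struct btrfs_send_stream. The struct name, the other fields, the buffer size and the helper are hypothetical, and 64 is only an assumed cache line size.

```c
#include <stddef.h>
#include <stdint.h>

#define EXAMPLE_CACHE_LINE 64           /* assumed cache line size */
#define EXAMPLE_BUF_SIZE   (64 * 1024)  /* hypothetical buffer size */

/*
 * Hypothetical stand-in for the real stream structure. read_buf is the
 * first member, so it starts at offset 0 and inherits the alignment of
 * the structure itself.
 */
struct example_stream {
	char read_buf[EXAMPLE_BUF_SIZE];
	int fd;
	uint32_t version;
} __attribute__((aligned(EXAMPLE_CACHE_LINE)));

/* Compile-time checks: the build fails if the layout ever changes. */
_Static_assert(offsetof(struct example_stream, read_buf) == 0,
	       "read_buf must be the first member");
_Static_assert(_Alignof(struct example_stream) == EXAMPLE_CACHE_LINE,
	       "structure must be cache line aligned");
_Static_assert(EXAMPLE_CACHE_LINE % sizeof(unsigned long) == 0,
	       "cache line alignment must imply unsigned long alignment");

/* Runtime helper: returns 1 if p is a multiple of align, 0 otherwise. */
static inline int example_is_aligned(const void *p, size_t align)
{
	return ((uintptr_t)p % align) == 0;
}
```

The attribute only covers objects the compiler places itself (static or on the stack). If the structure is allocated with malloc(), the 64-byte alignment is not guaranteed, so something like aligned_alloc() or posix_memalign() would be needed there.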