urnathan wrote:

thanks for your comments,

> When I refer to CSE/DSE, I'm mostly talking about keeping values in registers 
> for longer. They don't know anything about individual fields in bitfields. If 
> we split bitfields too narrowly, we end up with extra memory accesses for 
> accessing multiple adjacent fields. And if you have accesses which overlap 
> (access some, but not all, of the same memory), CSE/DSE get much worse.

To be clear by 'accesses which overlap' you mean accesses to bitfields that 
share parts of a byte? I agree -- that's an unchanged part of the algorithm, 
bitfields that share parts of a single byte must be in the same access unit.

Or do you mean an access to a non-bitfield adjacent to a bitfield? How does 
that square with the memory model you pointed to earlier?  Or is it that if you 
see both such accesses, it must be safe to merge them, or there's some UB 
already happening? But anyway, this algorithm doesn't affect that -- in both 
before and after cases something outside of bitfieldAccumulation must be 
merging them.

> Having a bitfield unit that can't be loaded in a single memory access 
> probably doesn't have much benefit, if there's a natural boundary we can use 
> to split.

I think we're agreeing here.  The current algorithm generates access units that 
can require multiple accesses, even on a relaxed alignment machine (I refer 
back to the x86 example).  This replacement tries much harder to not do that -- 
it only merges access units if that demonstrably lowers the number of accesses 
needed for the merged access unit from accessing the original set of units.  
The motivating case we had was a sequence of 4 byte loads, to fetch an 
unaligned 32-bit access unit, 1 byte of that being fiddled with, and then that 
one byte being written back.

https://github.com/llvm/llvm-project/pull/65742
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to