omjavaid added a comment.

In D156687#4628743 <https://reviews.llvm.org/D156687#4628743>, @DavidSpickett 
wrote:

> First thing to note is that WriteRegister also behaves this way, but there it 
> is more appropriate because it updates only part of the buffer before writing 
> it out in its entirety. Useful to know where the pattern came from though.
>
> You would need roughly the following per `WriteXYZ`:
>
>   -      error = WriteTLS();
>   +      error = WriteTLS(src);
>   
>   -Status NativeRegisterContextLinux_arm64::WriteTLS() {
>   +Status NativeRegisterContextLinux_arm64::WriteTLS(const void* 
> src/*=nullptr*/) {
>   
>   -  ioVec.iov_base = GetTLSBuffer();
>   +  ioVec.iov_base = src ? const_cast<void*>(src) : GetTLSBuffer();
>   
>   -  Status WriteTLS();
>   +  Status WriteTLS(const void* src=nullptr);
>
> We can assume that the buffer is the same as the data to be written back if 
> it's something static like TLS. For SVE/SME, we would have resized our buffer 
> first, so it holds there too.
>
> The added complexity isn't that much but I think it adds more thinking time 
> for a developer than it potentially saves in copying time. I already feel 
> like the separate `xyz_is_valid` is enough to think about and having a 
> potential second data source just adds to that load.
>
> From the header docs it seems I was right that this is used primarily for 
> expression evaluation:
>
>   // These two functions are used to implement "push" and "pop" of register
>   // states.  They are used primarily for expression evaluation, where we need
>   // to push a new state (storing the old one in data_sp) and then restoring
>   // the original state by passing the data_sp we got from ReadAllRegisters to
>   // WriteAllRegisterValues.
>
> Which you would be doing a lot of in a formatter for example, but you'd get 
> better savings implementing a more efficient packet format to do all that at 
> once, I guess.
>
> QSaveRegisterState / QRestoreRegisterState packets call it as part of 
> expression evaluation, though in theory it's not always that. That's an lldb 
> extension anyway so we're in control of it at least. In theory this could be 
> used to restore state that is not just the previous state but I don't know 
> how you'd trigger that from lldb.
>
> The other use is `NativeProcessLinux::Syscall` which is sufficiently rare we 
> can ignore that.
>
> I did do a very rough benchmark where I printed the same expression 2000 
> times, so each one is doing a save/restore. Once with the code in this review 
> right now, and again with this potential optimisation added to GPR/FPR/TLS 
> (I'm on a Mountain Jade machine without SVE). Caveat shared machine, made up 
> benchmark, etc. but all runs of both hovered between 16 and 17 seconds. 
> Neither seemed to be consistently lower or higher than the other. Doesn't 
> mean this isn't a speedup in isolation but if it is, it's dwarfed by the 
> syscalls and packets sent back and forth.

I mostly agree with what you have above. I was only thinking about ever 
increasing size of this buffer and thought if we can find a way around 
duplication. Most probably we ll never have SVE/SME enabled on a slow/small 
machine to bother us. The only area of concern could be a LLDB running on a 
resource constrained container but we can ignore that for now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D156687/new/

https://reviews.llvm.org/D156687

_______________________________________________
lldb-commits mailing list
lldb-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/lldb-commits

Reply via email to