We currently do this on boot:

	write_rdtscp_aux((node << 12) | cpu);

This *sucks*. It means that, to very quickly obtain the CPU number using RDPID, an ALU op is needed. It also doesn't bloody work on systems with more than 4096 CPUs.

IMO it should be ((u64)node << 32) | cpu. Then getting the CPU number is just:

	RDPID %rax
	MOVL %eax, %eax

I'm thinking about this because rseq users could avoid ever *loading* the rseq cacheline if they used RDPID to get the CPU number, and it would be nice to give them a sane way to do it.

This won't break any existing RDPID users if we do it quickly because there aren't any (the CPUs aren't available). I would be a bit surprised if anyone uses RDTSCP for this because it's absurdly slow.

We can change this without affecting the LSL hack, and I think there are user programs that do the LSL hack.

--Andy
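
For illustration only, here is a rough userspace sketch of the two packings discussed above. The helper names (aux_pack_current, aux_cpu_proposed, and so on) are made up for this example and are not kernel functions. It only models the bit arithmetic on plain integers. It does not execute RDPID, and it says nothing about how wide a value the hardware register can actually hold.

```c
#include <stdint.h>

/* Current layout, as quoted above: node above bit 12, CPU in the low 12 bits. */
static inline uint32_t aux_pack_current(uint32_t node, uint32_t cpu)
{
	return (node << 12) | cpu;
}

/* Recovering the CPU needs a mask: this is the extra ALU op. */
static inline uint32_t aux_cpu_current(uint32_t aux)
{
	return aux & 0xfff;
}

/* Proposed layout: node in the high 32 bits, CPU in the low 32 bits. */
static inline uint64_t aux_pack_proposed(uint32_t node, uint32_t cpu)
{
	return ((uint64_t)node << 32) | cpu;
}

/* Recovering the CPU is a plain truncation to 32 bits, which is what
 * the MOVL %eax, %eax in the sequence above amounts to. */
static inline uint32_t aux_cpu_proposed(uint64_t aux)
{
	return (uint32_t)aux;
}

/* Hypothetical helper: the node is the upper half. */
static inline uint32_t aux_node_proposed(uint64_t aux)
{
	return (uint32_t)(aux >> 32);
}
```

With the current packing, a CPU number of 4096 spills into the node bits, so aux_cpu_current(aux_pack_current(0, 4096)) comes back as 0. With the proposed packing, the CPU number survives the round trip for any 32-bit value.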