Hi,
I ran into what could be a bug in the implementation of the XCHG macro-op in 
X86.
I was trying to run a binary built using m5threads through X86_SE, and observed 
incorrect mutex-locking behavior.
Closer inspection showed this (execution trace, I have copied only the relevant 
micro-ops):

177896500: system.cpu05 T0 : @pthread_mutex_unlock+19.1  :   MOV_M_I : st   
t1b, DS:[rbx + 0x4] : MemWrite :  D=0x0000000000000000 A=0x695dc4
...
177900000: system.cpu06 T0 : @pthread_mutex_lock+33.0  :   XCHG_M_R : ldst   
t1b, DS:[rdx] : MemRead :  D=0x0000000000401700 A=0x695dc4
177900000: system.cpu07 T0 : @pthread_mutex_lock+33.0  :   XCHG_M_R : ldst   
t1b, DS:[rdx] : MemRead :  D=0x0000000000401700 A=0x695dc4
177900500: system.cpu07 T0 : @pthread_mutex_lock+33.1  :   XCHG_M_R : st   al, 
DS:[rdx] : MemWrite :  D=0x0000000000000001 A=0x695dc4
177900500: system.cpu06 T0 : @pthread_mutex_lock+33.1  :   XCHG_M_R : st   al, 
DS:[rdx] : MemWrite :  D=0x0000000000000001 A=0x695dc4
...

cpu05 does an unlock, and then *BOTH* cpu06 and cpu07 get the lock!
The reason is that both try and do a XCHG_M_R (atomic exchange between memory 
and register) and succeed!

The XCHG implementation 
(src/arch/x86/isa/insts/general_purpose/data_transfer/xchg.py)
has a comment saying
# All the memory versions need to use LOCK, regardless of if it was set

However, the descpription of XCHG_M_R is
def macroop XCHG_M_R
{
    ldst t1, seg, sib, disp
    st reg, seg, sib, disp
    mov reg, reg, t1
};

while the description of XCHG_LOCKED_M_R is
def macro-op XCHG_LOCKED_M_R
{
    ldstl t1, seg, sib, disp
    stul reg, seg, sib, disp
    mov reg, reg, t1
};


Changing XCHG_M_R to use the locked micro-ops (ldstl and stul) fixed the 
incorrect mutex-locking issue.
I am not sure if this is a bug, or is there some reason for not using the 
locked micro-ops?

There are similar, unlocked micro-ops used for XCHG_R_M, XCHG_R_P and XCHG_P_R 
... I am not sure if they all need to be changed to locked?
[What does the P in R_P  mean? Is this also referencing memory?].
If they all need to be changed to use ldstl and stul, I can create a patch for 
the same and check it in.

Thanks,
Tushar

_______________________________________________
m5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/m5-dev

Reply via email to