> do_signal is called before do_single_step. The order of checks of the > TIF_ bits is 1) machine checks, 2) need resched, 3) signal pending, 4) > notify resume, 5) restarting system call, 6) single step.
I see. Then the potential issue I raised would exist. > But why is that important ? If the TIF_SINGLE_STEP bit is set the order > of do_signal vs. do_single_step does not seem to be important to me. > There will be a SIGTRAP if TIF_SINGLE_STEP is set, no ? Right. It only becomes relevant if something else clears TIF_SINGLE_STEP, which does not happen now. I was discussing the scenario of having user_disable_single_step clear it. That might happen inside a stop, i.e. inside do_signal (get_signal_to_deliver). So given that order of checks, it becomes possible for user_disable_single_step to affect the pending-step-should-SIGTRAP situation. > But I agree, it is probably better to make all arches look the same in > regard to that pending step report. Right. That means we should leave the status quo of not clearing TIF_SINGLE_STEP in user_disable_single_step. > > Martin, I suggest having copy_thread clear TIF_SINGLE_STEP. > > That bit is always task-private state that should not be copied. > > Then let us do this. Yes, good. > The LCTLG of multiple control registers is rather expensive. Does it > happen often that user_*_single_step is called without need? For gdb is > doesn't matter, the cost to switch between tracer and tracee is already > large, the cycles added to FixPerRegisters won't matter much. For > utrace things might be different. In old (current) ptrace, user_*_single_step is never called on current. In utrace, it's always called on current, so utrace-based ptrace alone adds this second reload overhead beyond the same context-switch overhead old ptrace has. Indeed that addition may be neglible. In other circumstances with utrace, it is very possible to wind up with user_disable_single_step being called superfluously when there was no stop (and so not necessarily any context switch or other high overhead). On other machines, user_disable_single_step is pretty cheap even where user_enable_single_step is quite costly. Given how simple and cheap it is to short-circuit the excess work on s390, I think it is worthwhile. It looks like the context-switch code already checks some magic bits in per_info to decide whether to do the costly reload. So it seems vaguely consistent with that to conditionalize FixPerRegisters similarly. The single_step cases are a special case with an easy one-bit check to short-circuit, so skipping all of FixPerRegisters seems worthwhile IMHO. To be really optimization-happy about it, you'd also hack FixPerRegisters itself to do the reload on current only if PSW_MASK_PER is or was set (if I'm following the code correctly). Or perhaps it should check PER_EM_MASK instead to match what __switch_to does. (I don't understand the distinction between those two tests, if there is one.) Extra frobbity would be to leave the old state too when clearing PSW_MASK_PER, and then just have the trap handler lazily ignore a hit when current doesn't have it set. Then in a case where there is no hit before context switch, you haven't done anything. But that is probably both excessive and not even a win if the PER use is single-step and so there will really very likely be a hit before context switch. Thanks, Roland