We have found that one of our programs can cause system-wide
corruption of the x86 FPU under 2.2.16 and 2.2.17. That is, after we
run this program, the FPU gives bad results to all subsequent
processes.
We see this problem on dual 550MHz Xeons with 1GB RAM. We have 64 of
these things, and we s
Victor Zandy <[EMAIL PROTECTED]> writes:
> We have found that one of our programs can cause system-wide
> corruption of the x86 FPU under 2.2.16 and 2.2.17. That is, after we
> run this program, the FPU gives bad results to all subsequent
> processes.
We have now teste
No dice. Your program does not fix the problem.
If it were a hardware problem, I would expect the problem to occur
under 2.4.2 as well as 2.2.*, and I would be surprised that we can
consistently produce the behavior across our 64 node cluster. But we
are keeping the possibility in mind.
Thank
It looks to me like the kernel sets a trap for FP operations when a
process is switched in. Then when the process executes an FP op, the
kernel clears the trap and either loads the FP context or initializes
it, depending on whether it is the process' first FP operation. So no
help is needed from
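
To make the mechanism described above concrete, here is a toy user-space model of lazy FPU switching. It is only a sketch and not kernel code: the struct and function names (task, cpu, switch_to, fp_trap, fp_add) are invented for illustration, a single double stands in for the whole FP context, and the save-on-switch-out step is an assumption about how the per-task "used FPU" flag is consumed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical, not a kernel symbol. */
struct task {
    bool used_math;   /* has this task ever executed an FP op? */
    bool used_fpu;    /* did it touch the FPU during the current time slice? */
    double saved_fpu; /* stand-in for the saved FP context */
};

struct cpu {
    bool ts;              /* stand-in for the "trap on next FP op" bit */
    double fpu;           /* stand-in for the live FPU registers */
    struct task *current; /* task currently running on this CPU */
};

/* Switch in `next`: save the outgoing FP state only if it was used
 * (assumed behavior), then arm the trap for the incoming task. */
void switch_to(struct cpu *c, struct task *next)
{
    struct task *prev = c->current;

    if (prev != NULL && prev->used_fpu) {
        prev->saved_fpu = c->fpu;
        prev->used_fpu = false;
    }
    c->ts = true;
    c->current = next;
}

/* Trap handler: clear the trap, then load the saved context or
 * initialize a fresh one if this is the task's first FP operation. */
static void fp_trap(struct cpu *c)
{
    struct task *t = c->current;

    c->ts = false;
    if (t->used_math) {
        c->fpu = t->saved_fpu;
    } else {
        c->fpu = 0.0;
        t->used_math = true;
    }
    t->used_fpu = true;
}

/* A stand-in FP operation: add x to the accumulator, trapping first
 * if the trap bit is armed. */
double fp_add(struct cpu *c, double x)
{
    if (c->ts)
        fp_trap(c);
    c->fpu += x;
    return c->fpu;
}
```

In this model, if used_fpu were cleared behind the task's back, switch_to would skip the save and the next task to trap would find stale state, which is at least consistent with the symptoms discussed in this thread.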
Someone else here traced the process flags of a FP-intensive program
on a machine before and after it is put in the faulty FPU state. He
periodically sampled /proc/pid/stat while the program was running.
He found that PF_USEDFPU was always set before the machine was broken.
After he found that
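
For anyone who wants to repeat this kind of sampling, the following is a minimal sketch of pulling the flags field out of one line of /proc/pid/stat. The helper name parse_stat_flags is made up, and the PF_USEDFPU value is an assumption recalled from the 2.2 headers; check it against include/linux/sched.h in your own tree before trusting the result.

```c
#include <stdio.h>
#include <string.h>

/* ASSUMPTION: value of PF_USEDFPU as recalled from 2.2's sched.h;
 * verify against your kernel source. */
#define ASSUMED_PF_USEDFPU 0x00100000UL

/* Hypothetical helper: extract the flags field (field 9) from one line
 * of /proc/<pid>/stat. Returns 0 on success, -1 on a malformed line. */
int parse_stat_flags(const char *line, unsigned long *flags)
{
    char state;
    int ppid, pgrp, session, tty_nr, tpgid;
    /* The command name is parenthesized and may itself contain spaces
     * or ')', so start parsing after the last ')'. */
    const char *p = strrchr(line, ')');

    if (p == NULL)
        return -1;
    if (sscanf(p + 1, " %c %d %d %d %d %d %lu",
               &state, &ppid, &pgrp, &session, &tty_nr, &tpgid, flags) != 7)
        return -1;
    return 0;
}
```

A sampler would read the stat file in a loop, call parse_stat_flags on each line, and test the result against ASSUMED_PF_USEDFPU.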
"Christian Ehrhardt" <[EMAIL PROTECTED]> writes:
> Victor: Could you try to reproduce the system wide corruption if you
> add an explicit call to stts(); at the very end of __switch_to?
> This should prevent the FPU corruption from spreading.
After adding this call, I cannot reproduce the global
Linus Torvalds writes:
> Ahh.. This actually _does_ look like a race on "current->flags":
> PTRACE_ATTACH will do a
>
> child->flags |= PF_PTRACED;
>
> without waiting for the child to have stopped.
I can see how this could cause PF_USEDFPU to be cleared inadvertently,
but I do not
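
As an illustration of the kind of lost update being discussed, here is one possible interleaving of two non-atomic read-modify-write sequences on the same flags word, written out sequentially so it is deterministic. This is a sketch under assumptions, not a trace of the real kernel: the function names are invented, and the SKETCH_ constants are placeholder bit values, not taken from any header.

```c
/* Placeholder bit values for illustration only (assumed, not from a header). */
#define SKETCH_PF_PTRACED 0x00000010UL
#define SKETCH_PF_USEDFPU 0x00100000UL

/* One hypothetical interleaving: CPU A performs "flags |= PF_PTRACED" as
 * load / modify / store while CPU B sets PF_USEDFPU in between. */
unsigned long racy_interleaving(unsigned long flags)
{
    unsigned long tmp = flags;   /* CPU A: load for the |= */
    flags |= SKETCH_PF_USEDFPU;  /* CPU B: child takes its FP trap */
    tmp |= SKETCH_PF_PTRACED;    /* CPU A: modify its stale copy */
    flags = tmp;                 /* CPU A: store, overwriting B's update */
    return flags;
}

/* The same two updates with no overlap: both bits survive. */
unsigned long serialized(unsigned long flags)
{
    flags |= SKETCH_PF_USEDFPU;  /* CPU B's update completes first */
    flags |= SKETCH_PF_PTRACED;  /* then CPU A's update */
    return flags;
}
```

Starting from flags == 0, racy_interleaving returns a word with only the ptrace bit set, while serialized returns both bits.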
Alan Cox <[EMAIL PROTECTED]> writes:
> The preferable one for performance is certainly to backport the 2.4 changes
Is it any more substantial than changing all uses of the ptrace flags
to the new variable?
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a
If a process executes an int3 (breakpoint) instruction while
another process is attaching to it, the SIGTRAP can be lost. This bug
is present in 2.4.0-test8 and 2.2.14.
Below is a program that demonstrates this behavior. It forks a
child that repeatedly executes an int3 and handles the
Victor Zandy <[EMAIL PROTECTED]> writes:
> If a process executes an int3 (breakpoint) instruction while
> another process is attaching to it, the SIGTRAP can be lost. This bug
> is present in 2.4.0-test8 and 2.2.14.
Uh, this turns out to be my stupid programming error, n
We have not tested any other platform.
Please direct any questions or problems with the patch to
Victor Zandy <[EMAIL PROTECTED]>.