ebied...@xmission.com (Eric W. Biederman) writes: > Sean Christopherson <sean.j.christopher...@intel.com> writes: > >> On Wed, Oct 10, 2018 at 05:06:52PM -0500, Eric W. Biederman wrote: >>> ebied...@xmission.com (Eric W. Biederman) writes: >>> >>> > So I am flummoxed. I am reading through the code and I don't see >>> > anything that could trigger this, and when I ran the supplied reproducer >>> > it did not reproduce for me. >>> >>> Even more so. With my tool chain the line that reports the failing >>> address is impossible. >>> >>> [ 73.034423] RIP: 0010:copy_siginfo_from_user+0x4d/0xd0 >>> >>> With the supplied configureation my tool chain only has 0x30 bytes for >>> all of copy_siginfo_from_user. So I can't even begin to guess where >>> in that function things are failing. >>> >>> Any additional information that you can provide would be a real help >>> in tracking down this strange failure. >> >> I don't have the exact toolchain, but I was able to get somewhat close >> and may have found a smoking gun. 0x4d in my build is in the general >> vicinity of "sig_sicodes[sig].limit" in known_siginfo_layout(). This >> lines up with the register state from the log, e.g. RDI=0500104d8, >> which is the mask generated by sig_specific_sicodes. From what I can >> tell, @sig is never bounds checked. If the compiler generated an AND >> instruction to compare against sig_specific_sicodes then that could >> resolve true with any arbitrary value that happened to collide with >> sig_specific_sicodes and result in an out-of-bounds access to >> @sig_sicodes. siginfo_layout() for example explicitly checks @sig >> before indexing @sig_sicode, e.g. "sig < ARRAY_SIZE(sig_sicodes)". >> >> Maybe this? > > But sig is bounds checked. Even better sig is checked to see if it > is one of the values in the array. > >> From include/linux/signal.h > > #define SIG_SPECIFIC_SICODES_MASK (\ > rt_sigmask(SIGILL) | rt_sigmask(SIGFPE) | \ > rt_sigmask(SIGSEGV) | rt_sigmask(SIGBUS) | \ > rt_sigmask(SIGTRAP) | rt_sigmask(SIGCHLD) | \ > rt_sigmask(SIGPOLL) | rt_sigmask(SIGSYS) | \ > SIGEMT_MASK ) > > #define siginmask(sig, mask) \ > ((sig) < SIGRTMIN && (rt_sigmask(sig) & (mask))) > > #define sig_specific_sicodes(sig) siginmask(sig, > SIG_SPECIFIC_SICODES_MASK) > > > > Hmm. I wonder if something is passing in a negative signal number. > There is not a bounds check for that. A sufficiently large signal > number might be the problem here. Yes. I can get an oops with > a sufficiently large negative signal number. > > The code will later call valid_signal in check_permissions and > that will cause the system call to fail, so the issue is just that > the signal number is not being validated early enough. > > On the output path (copy_siginfo_to_user and copy_siginfo_to_user32) the > signal number should be validated before it ever reaches userspace > which is why I expect trinity never triggered anything. > > There is copy_siginfo_from_user32 and that does call siginfo_layout with > a possibly negative signal number. Which has the same potential issues. > > So I am going to go with the fix below. That fixes things in my testing > and by being unsigned should fix keep negative numbers from being a > problem.
Sean thank you very much for putting me on the right path to track this failing test down. Eric