[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-08 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

Ivo Raisr  changed:

   What|Removed |Added

 Resolution|--- |FIXED
 Status|CONFIRMED   |RESOLVED

--- Comment #21 from Ivo Raisr  ---
Committed in SVN revision 15823.

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-06 Thread Steven Smith via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

--- Comment #20 from Steven Smith  ---
Looks good to me, thank you.

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-05 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

Ivo Raisr  changed:

   What|Removed |Added

  Attachment #97643|0   |1
is obsolete||
  Attachment #97688|0   |1
is obsolete||

--- Comment #19 from Ivo Raisr  ---
Created attachment 97701
  --> https://bugs.kde.org/attachment.cgi?id=97701=edit
combined patch for ppoll and pselect on Linux and Solaris

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-05 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

--- Comment #18 from Ivo Raisr  ---
Thank you for a quick turnaround. I think this should work.
As regards the scalar test, if you are not able to get it working with high
confidence then we should probably leave it alone.
I will combine patches for Linux and Solaris together and commit them.

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-05 Thread Steven Smith via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

Steven Smith  changed:

   What|Removed |Added

  Attachment #97682|0   |1
is obsolete||

--- Comment #17 from Steven Smith  ---
Created attachment 97688
  --> https://bugs.kde.org/attachment.cgi?id=97688=edit
Forth attempt at a Linux fix. I think this one is actually correct.

My main concern was that, because we drop the global lock while making the
syscall, we might be racing with some other guest thread calling mmap/munmap,
so the system call might happen even if ML_(safe_to_deref) says no, and if
we've skipped the sanitisation the bug I was trying to fix will come back.
Similarly, if POST calls ML_(safe_to_deref) to figure out whether to VG_(free)
things then it might either free something it shouldn't or allow the memory to
leak. Force-failing guest calls where we can't do the sanitisation seemed like
the easiest way of avoiding the issues, except that I still need some way for
the PRE call to tell the POST one what/whether it needs to VG_(free), which
brought me to the assertion failures in my last attempt.

But I suppose it's possible to make syscalls fail without using
SET_STATUS_Failure, so that's what this variant does. I think it covers all the
relevant cases, or at least all of the ones I've thought of, even if it is a
bit more dependent on kernel implementation details than I would have liked.

Perhaps I was just overthinking things and the right answer would have been to
just ignore the races? The guest would have to be doing something pretty crazy
for them to matter, after all. On the other hand, I think this one'll work, and
it'd be nice to get the edge cases right, even if it is perhaps a little
pedantic.

This patch also includes some more bits in scalar.c, but I'm not convinced they
represent a sensible way of testing this:

1) I can reproduce the issue with --tool=none, so it clearly isn't a problem
with memcheck, and it seems a bit odd to put the test case there.
2) The scalar.c regtest fails on my machine even without any of my changes,
seemingly because it only works in 32 bit more and I have a 64 bit system, and
it doesn't seem to like the debug symbols on the libc in my cross-compile
setup. That makes it tricky for me to do any sensible testing of the test case.

I defer to your judgement on whether this ends up being a net positive for the
codebase.

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-04 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

--- Comment #16 from Ivo Raisr  ---
Please could you also move the test for failed pselect into
memcheck/tests/x86-linux/scalar.c
where it truly belongs. See its header.
Thanks a lot!

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-04 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

--- Comment #15 from Ivo Raisr  ---
Thank you for quick fix and analysis. Yes, indeed, changing args for failed
syscall wrapper is not allowed. I wonder why you need to explicitly set failure
status if kernel is going to do anyway.
Could you just check if all conditions are met and only then allocate memory
and assign it to ARG6/ARG4?

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-04 Thread Steven Smith via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

Steven Smith  changed:

   What|Removed |Added

  Attachment #97670|0   |1
is obsolete||

--- Comment #14 from Steven Smith  ---
Created attachment 97682
  --> https://bugs.kde.org/attachment.cgi?id=97682=edit
Third attempt at fixing it on Linux. Still doesn't quite work.

Oh, crud, that was embarrassing. Here's another one which avoids that
particular error, and even extends the test case to include a failing
pselect...

... which just brings us to the next problem:

valgrind: m_syswrap/syswrap-main.c:1950 (vgPlain_client_syscall): Assertion
'eq_SyscallArgs(>args, >orig_args)' failed.

host stacktrace:
==12189==at 0x38084978: show_sched_status_wrk (m_libcassert.c:343)
==12189==by 0x38084A94: report_and_quit (m_libcassert.c:415)
==12189==by 0x38084C21: vgPlain_assert_fail (m_libcassert.c:481)
==12189==by 0x380D7A44: vgPlain_client_syscall (syswrap-main.c:1950)
==12189==by 0x380D414A: handle_syscall (scheduler.c:1118)

>From here:

   if (sci->status.what == SsComplete && sr_isError(sci->status.sres)) {
  /* The pre-handler decided to fail syscall itself. */
  PRINT(" --> [pre-fail] %s", VG_(sr_as_string)(sci->status.sres));
  /* In this case, the pre-handler is also allowed to ask for the
 post-handler to be run anyway.  Changing the args is not
 allowed. */
  vg_assert(0 == (sci->flags & ~(SfMayBlock | SfPostOnFail |
SfPollAfter)));
  vg_assert(eq_SyscallArgs(>args, >orig_args));
   }

So I can't use the syscall arguments to communicate from the PRE to POST
handlers if the PRE call fails the syscall. Do you know if there any other ways
of coordinating between them? Or at least some way for the POST handler to tell
whether an error came from the PRE handler or from the actual kernel syscall?

I'm sorry to have needed quite this many iterations on this one; it turns out
to be a little more involved than I'd expected.

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-04 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

--- Comment #13 from Ivo Raisr  ---
Thank you for the fixes. Everything seems to be reasonable, except for the
error paths.
If an error path is taken (setting status failure), allocated memory is not
assigned to ARG6/ARG4.
This means the memory is leaked and VG_(free) will attempt to free arbitrary
client memory.

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-03 Thread Steven Smith via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

Steven Smith  changed:

   What|Removed |Added

  Attachment #97645|0   |1
is obsolete||

--- Comment #12 from Steven Smith  ---
Created attachment 97670
  --> https://bugs.kde.org/attachment.cgi?id=97670=edit
Second attempt at a linux fix

Okay, here's another variant which sets more sensible cost centers for
VG_(malloc), uses -q in the test case, and avoids playing with ARG7.

The ARG7 business was there to let the PRE hook communicate to the POST one
whether it had actually done the substitution, so that it knew whether it had
to call VG_(free). I didn't want to call ML_(safe_to_deref) again because I was
worried about racing with the guess calling mmap(), and I didn't want to do it
unconditionally because I wasn't sure how to build the substitution if some of
the calls to ML_(safe_to_deref) said no. Looking at getSyscallArgsFromState(),
it seemed like ARG7 was always initialised to zero on Linux, so it'd be a safe
place for an extra flag (and I figured that if I was wrong an assertion failure
would be easier to track down than a bad VG_(free)()), but I see now that mips
leaves it uninitialised, so that isn't going to work.

The new patch always allocates and releases the substitution whenever ARG6 is
non-NULL, and just VKI_EFAULTs any calls where it can't get the guest's desired
mask. I also converted the ppoll wrapper to the same model, just for symmetry.

Thank you for the review.

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-03 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

--- Comment #11 from Ivo Raisr  ---
Thank you for the patch covering Linux.

I have only three minor comments:
- Why assert ARG7 == 0 in pselect6 wrapper? On some architectures arg7 would be
on stack,
  containing some random values. On others arg7 could refer to a register
containing garbage
  as well.
- The first argument (cost center) to VG_(malloc)() should reflect the
function/wrapper name. On Solaris it was pollsys, on Linux it is pselect6 or
ppoll.
- Please use 'vgopts: -q' in pselect_alarm.vgtest so the stack trace is not a
part of stderr.exp. It differs between operating systems. If you insist on
having there some stack information then for example a filter can be used to
filter anything before main().

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-02 Thread Steven Smith via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

--- Comment #10 from Steven Smith  ---
Created attachment 97645
  --> https://bugs.kde.org/attachment.cgi?id=97645=edit
Port the ppoll patch to Linux, and extend to cover pselect6

Sure, here's a patch which does something similar for Linux ppoll and pselect,
and also adds a new test case for pselect().

I'm not sure what the usual workflow is for Valgrind development, so I've just
attached a single flattened patch. There's also a slightly more decomposed
variant on my github page at https://github.com/sos22/vgrind-359871, if you'd
prefer that.

Caveat: the only machine I have to test with at the moment is amd64 Linux, so
I've not checked it on any other platforms

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-02 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

--- Comment #9 from Ivo Raisr  ---
Steven, would you be able to prepare a patch for Linux using similar technique?
I cannot currently devote to developing it for a while.
Beware that Linux has ppoll and pselect6 syscalls. And pselect6 encodes the
last argument
as a pointer to a structure of the form [1]:

struct {
const sigset_t *ss; /* Pointer to signal set */
size_t  ss_len;/* Size (in bytes) of object pointed to by 'ss'
*/
};

Linux syscall wrapper is completely missing any checking of 'ss' contents. You
could also add that.

[1] http://linux.die.net/man/2/pselect6

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-02 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

Ivo Raisr  changed:

   What|Removed |Added

  Attachment #97632|0   |1
is obsolete||

--- Comment #8 from Ivo Raisr  ---
Created attachment 97643
  --> https://bugs.kde.org/attachment.cgi?id=97643=edit
patch for ppoll on Solaris #2

This patch fixes shortcomings listed previously, that is client memory is not
modified (as it could be read-only).

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-01 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

--- Comment #7 from Ivo Raisr  ---
The attached patch has two problems:
- it modifies a client memory which may be read-only (see for example bugs for
modifying env)
- it modifies a client memory and does not restore the changes after syscall

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-01 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

--- Comment #6 from Ivo Raisr  ---
Steven, thank you for the test case provided and a suggested fix.
Indeed, simple things are very often also effective.
I will prepare patch for Linux as well and have that reviewed.

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-01 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

Ivo Raisr  changed:

   What|Removed |Added

  Attachment #97584|0   |1
is obsolete||
  Attachment #97622|0   |1
is obsolete||

--- Comment #5 from Ivo Raisr  ---
Created attachment 97632
  --> https://bugs.kde.org/attachment.cgi?id=97632=edit
patch for ppoll on Solaris

This fixes the problematic functionality for ppoll on Solaris.
On Linux needs some investigation if pselect and ppoll use the same syscall
wrapper or different ones.

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-01 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

Ivo Raisr  changed:

   What|Removed |Added

 Ever confirmed|0   |1
 Status|UNCONFIRMED |CONFIRMED
   Assignee|jsew...@acm.org |iv...@ivosh.net

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-03-01 Thread Steven Smith via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

Steven Smith  changed:

   What|Removed |Added

  Attachment #97583|0   |1
is obsolete||

--- Comment #3 from Steven Smith  ---
Created attachment 97622
  --> https://bugs.kde.org/attachment.cgi?id=97622=edit
Another variant of the test program

I didn't use VKI_NSIG or VKI_SIGVGKILL in the test program because they're not
exposed in any of the headers in the Valgrind packages I have locally, and it
was easier to make it not need any VG-internal bits. Just blocking every signal
you can get at seems to reproduce it as well, so how about this one?

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-02-29 Thread Ivo Raisr via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

Ivo Raisr  changed:

   What|Removed |Added

 CC||iv...@ivosh.net

--- Comment #2 from Ivo Raisr  ---
Thank you for the test case provided.
It should not hardcode "64" but instead rely on at least _VKI_NSIG. Eventually
block a range of signals.

I thought we were already doing some work on signals in
sanitize_client_sigmask(). Seems like I was mistaken.

-- 
You are receiving this mail because:
You are watching all bug changes.


[valgrind] [Bug 359871] Incorrect mask handling in ppoll

2016-02-27 Thread Steven Smith via KDE Bugzilla
https://bugs.kde.org/show_bug.cgi?id=359871

--- Comment #1 from Steven Smith  ---
Created attachment 97584
  --> https://bugs.kde.org/attachment.cgi?id=97584=edit
Kind of a fix, if you squint

-- 
You are receiving this mail because:
You are watching all bug changes.