Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-21 Thread Paul Moore
On Thu, Mar 21, 2019 at 11:48 AM Todd Kjos  wrote:
> On Thu, Mar 21, 2019 at 2:50 AM Ondrej Mosnacek  wrote:
> >
> > On Thu, Mar 21, 2019 at 12:26 AM Todd Kjos  wrote:
> > > I can send you a patch tomorrow (I won't be able to test it though).
> >
> > So, I was a bit quicker than you and I think I managed to fix the test 
> > myself :)
> >
> > See:
> > https://github.com/SELinuxProject/selinux-testsuite/pull/50/commits/b559c3f54eae6130cb9e79c295b0f94db26e09e4
>
> Looks good. Thanks!

I'm getting clean runs on my test system now too - thanks everyone!

-- 
paul moore
www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-21 Thread Ondrej Mosnacek
On Thu, Mar 21, 2019 at 12:26 AM Todd Kjos  wrote:
> I can send you a patch tomorrow (I won't be able to test it though).

So, I was a bit quicker than you and I think I managed to fix the test myself :)

See:
https://github.com/SELinuxProject/selinux-testsuite/pull/50/commits/b559c3f54eae6130cb9e79c295b0f94db26e09e4

It seems the problem was indeed in the offsets array handling as you
discovered. With the above commit included the PR#50 version of the
binder test no longer fails for me.

Thanks,

>
> On Wed, Mar 20, 2019 at 4:23 PM Paul Moore  wrote:
> >
> > On Wed, Mar 20, 2019 at 3:50 PM Todd Kjos  wrote:
> > >
> > > Paul,
> > >
> > > Looking at main() in test_binder.c...
> > >
> > > int main(int argc, char **argv)
> > > {
> > >
> > > [...]
> > >
> > >   // Line 493
> > >   struct binder_write_read bwr;
> > >   struct flat_binder_object obj;
> > >   struct {
> > > uint32_t cmd;
> > > struct binder_transaction_data txn;
> > >   } __attribute__((packed)) writebuf;
> > >   unsigned int readbuf[32];
> > >
> > > [...]
> > >   // Line 630
> > >   writebuf.txn.data.ptr.buffer = (uintptr_t)&obj;
> > >   writebuf.txn.data.ptr.offsets = (uintptr_t)&obj +   // [A]
> > >  sizeof(struct
> > > flat_binder_object);
> > >
> > >   bwr.write_buffer = (uintptr_t)&writebuf;
> > >   bwr.write_size = sizeof(writebuf);
> > >
> > > It looks like bwr.txn.data.ptr.offsets points off the end of obj (see
> > > [A] above), which means the binder driver will read compiler-dependent
> > > stack data as the offset for the object. If it happens to be 0, then
> > > the test will work (read the object from offset 0). If it's not 0,
> > > then most likely offset > data_size (which is what found that BUG_ON
> > > case). With my patch applied, this will just cause an error to be
> > > returned (what you are seeing now).
> > >
> > > Same thing when you test with v5.0 -- if the offset happens to be 0,
> > > then the test will succeed. If not, then the test will fail because
> > > the transaction fails in an unexpected way.
> >
> > That might explain why the test used to work, but now fails - a
> > different compiler (I rebuild the test before each test run).
> >
> > Keeping in mind I'm really quite ignorant when it comes to binder, how
> > would you suggest fixing the test?
> >
> > --
> > paul moore
> > www.paul-moore.com



-- 
Ondrej Mosnacek 
Software Engineer, Security Technologies
Red Hat, Inc.
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-20 Thread Paul Moore
On Wed, Mar 20, 2019 at 7:26 PM Todd Kjos  wrote:
> I can send you a patch tomorrow (I won't be able to test it though).

I may not know much about binder, but I do know how to run the test suite :)

Thanks Todd.

> On Wed, Mar 20, 2019 at 4:23 PM Paul Moore  wrote:
> >
> > On Wed, Mar 20, 2019 at 3:50 PM Todd Kjos  wrote:
> > >
> > > Paul,
> > >
> > > Looking at main() in test_binder.c...
> > >
> > > int main(int argc, char **argv)
> > > {
> > >
> > > [...]
> > >
> > >   // Line 493
> > >   struct binder_write_read bwr;
> > >   struct flat_binder_object obj;
> > >   struct {
> > > uint32_t cmd;
> > > struct binder_transaction_data txn;
> > >   } __attribute__((packed)) writebuf;
> > >   unsigned int readbuf[32];
> > >
> > > [...]
> > >   // Line 630
> > >   writebuf.txn.data.ptr.buffer = (uintptr_t)&obj;
> > >   writebuf.txn.data.ptr.offsets = (uintptr_t)&obj +   // [A]
> > >  sizeof(struct
> > > flat_binder_object);
> > >
> > >   bwr.write_buffer = (uintptr_t)&writebuf;
> > >   bwr.write_size = sizeof(writebuf);
> > >
> > > It looks like bwr.txn.data.ptr.offsets points off the end of obj (see
> > > [A] above), which means the binder driver will read compiler-dependent
> > > stack data as the offset for the object. If it happens to be 0, then
> > > the test will work (read the object from offset 0). If it's not 0,
> > > then most likely offset > data_size (which is what found that BUG_ON
> > > case). With my patch applied, this will just cause an error to be
> > > returned (what you are seeing now).
> > >
> > > Same thing when you test with v5.0 -- if the offset happens to be 0,
> > > then the test will succeed. If not, then the test will fail because
> > > the transaction fails in an unexpected way.
> >
> > That might explain why the test used to work, but now fails - a
> > different compiler (I rebuild the test before each test run).
> >
> > Keeping in mind I'm really quite ignorant when it comes to binder, how
> > would you suggest fixing the test?
> >
> > --
> > paul moore
> > www.paul-moore.com



-- 
paul moore
www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-20 Thread Todd Kjos
I can send you a patch tomorrow (I won't be able to test it though).

On Wed, Mar 20, 2019 at 4:23 PM Paul Moore  wrote:
>
> On Wed, Mar 20, 2019 at 3:50 PM Todd Kjos  wrote:
> >
> > Paul,
> >
> > Looking at main() in test_binder.c...
> >
> > int main(int argc, char **argv)
> > {
> >
> > [...]
> >
> >   // Line 493
> >   struct binder_write_read bwr;
> >   struct flat_binder_object obj;
> >   struct {
> > uint32_t cmd;
> > struct binder_transaction_data txn;
> >   } __attribute__((packed)) writebuf;
> >   unsigned int readbuf[32];
> >
> > [...]
> >   // Line 630
> >   writebuf.txn.data.ptr.buffer = (uintptr_t)&obj;
> >   writebuf.txn.data.ptr.offsets = (uintptr_t)&obj +   // [A]
> >  sizeof(struct
> > flat_binder_object);
> >
> >   bwr.write_buffer = (uintptr_t)&writebuf;
> >   bwr.write_size = sizeof(writebuf);
> >
> > It looks like bwr.txn.data.ptr.offsets points off the end of obj (see
> > [A] above), which means the binder driver will read compiler-dependent
> > stack data as the offset for the object. If it happens to be 0, then
> > the test will work (read the object from offset 0). If it's not 0,
> > then most likely offset > data_size (which is what found that BUG_ON
> > case). With my patch applied, this will just cause an error to be
> > returned (what you are seeing now).
> >
> > Same thing when you test with v5.0 -- if the offset happens to be 0,
> > then the test will succeed. If not, then the test will fail because
> > the transaction fails in an unexpected way.
>
> That might explain why the test used to work, but now fails - a
> different compiler (I rebuild the test before each test run).
>
> Keeping in mind I'm really quite ignorant when it comes to binder, how
> would you suggest fixing the test?
>
> --
> paul moore
> www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-20 Thread Paul Moore
On Wed, Mar 20, 2019 at 3:50 PM Todd Kjos  wrote:
>
> Paul,
>
> Looking at main() in test_binder.c...
>
> int main(int argc, char **argv)
> {
>
> [...]
>
>   // Line 493
>   struct binder_write_read bwr;
>   struct flat_binder_object obj;
>   struct {
> uint32_t cmd;
> struct binder_transaction_data txn;
>   } __attribute__((packed)) writebuf;
>   unsigned int readbuf[32];
>
> [...]
>   // Line 630
>   writebuf.txn.data.ptr.buffer = (uintptr_t)&obj;
>   writebuf.txn.data.ptr.offsets = (uintptr_t)&obj +   // [A]
>  sizeof(struct
> flat_binder_object);
>
>   bwr.write_buffer = (uintptr_t)&writebuf;
>   bwr.write_size = sizeof(writebuf);
>
> It looks like bwr.txn.data.ptr.offsets points off the end of obj (see
> [A] above), which means the binder driver will read compiler-dependent
> stack data as the offset for the object. If it happens to be 0, then
> the test will work (read the object from offset 0). If it's not 0,
> then most likely offset > data_size (which is what found that BUG_ON
> case). With my patch applied, this will just cause an error to be
> returned (what you are seeing now).
>
> Same thing when you test with v5.0 -- if the offset happens to be 0,
> then the test will succeed. If not, then the test will fail because
> the transaction fails in an unexpected way.

That might explain why the test used to work, but now fails - a
different compiler (I rebuild the test before each test run).

Keeping in mind I'm really quite ignorant when it comes to binder, how
would you suggest fixing the test?

-- 
paul moore
www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-20 Thread Todd Kjos
On Wed, Mar 20, 2019 at 3:25 PM Paul Moore  wrote:
>
> On Wed, Mar 20, 2019 at 11:54 AM Todd Kjos  wrote:
> > So, then it sounds like the test is not running properly ...
>
> Yes, the test is almost surely broken to some extent, although the
> kernel hitting the BUG_ON() was clearly a bug too :)

Absolutely -- I'm glad the test had this bug that caused the kernel
bug to be found :).

> > Can I add a "Tested-by: Paul Moore " on my patch
> > submission to fix the BUG_ON (the exact patch you tested) ?
>
> Yep.  Thanks for your help on fixing that.

Thanks.

>
> --
> paul moore
> www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-20 Thread Paul Moore
On Wed, Mar 20, 2019 at 11:54 AM Todd Kjos  wrote:
> So, then it sounds like the test is not running properly ...

Yes, the test is almost surely broken to some extent, although the
kernel hitting the BUG_ON() was clearly a bug too :)

> Can I add a "Tested-by: Paul Moore " on my patch
> submission to fix the BUG_ON (the exact patch you tested) ?

Yep.  Thanks for your help on fixing that.

-- 
paul moore
www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-20 Thread Todd Kjos
Paul,

Looking at main() in test_binder.c...

int main(int argc, char **argv)
{

[...]

  // Line 493
  struct binder_write_read bwr;
  struct flat_binder_object obj;
  struct {
uint32_t cmd;
struct binder_transaction_data txn;
  } __attribute__((packed)) writebuf;
  unsigned int readbuf[32];

[...]
  // Line 630
  writebuf.txn.data.ptr.buffer = (uintptr_t)&obj;
  writebuf.txn.data.ptr.offsets = (uintptr_t)&obj +   // [A]
 sizeof(struct
flat_binder_object);

  bwr.write_buffer = (uintptr_t)&writebuf;
  bwr.write_size = sizeof(writebuf);

It looks like bwr.txn.data.ptr.offsets points off the end of obj (see
[A] above), which means the binder driver will read compiler-dependent
stack data as the offset for the object. If it happens to be 0, then
the test will work (read the object from offset 0). If it's not 0,
then most likely offset > data_size (which is what found that BUG_ON
case). With my patch applied, this will just cause an error to be
returned (what you are seeing now).

Same thing when you test with v5.0 -- if the offset happens to be 0,
then the test will succeed. If not, then the test will fail because
the transaction fails in an unexpected way.

-Todd


On Wed, Mar 20, 2019 at 8:54 AM Todd Kjos  wrote:
>
> On Tue, Mar 19, 2019 at 8:04 PM Paul Moore  wrote:
> >
> > On Tue, Mar 19, 2019 at 9:08 PM Todd Kjos  wrote:
> > > Paul,
> > >
> > > Looking at a snippet of the test output:
> > >
> > > Service Provider read_consumed: 8
> > > Service Provider command: BR_NOOP
> > > Service Provider command: BR_FAILED_REPLY  <-- the txn
> > > failed as expected.
> > > Manager read_consumed: 8
> > > Manager command: BR_NOOP
> > > Manager command: BR_TRANSACTION_COMPLETE
> > > not ok 3
> > > <- but for some reason didn't exit(103)
> > > #   Failed test at ./test line 56.
> > >
> > > It looks like the test script expects test_binder to fail with
> > > exit(103) after processing the Server Provider commands. It's not
> > > clear why it didn't, since the return of a BR_FAILED_REPLY for that
> > > transaction should have executed this code (line 392 of
> > > test_binder.c):
> > >
> > > if (cmd == BR_FAILED_REPLY ||
> > > cmd == BR_DEAD_REPLY ||
> > > cmd == BR_DEAD_BINDER) {
> > > fprintf(stderr,
> > >   "Failed response from Service Provider using Managers 
> > > FD\n");
> > > exit(103);
> > > }
> > >
> > > Could this be an issue with the test? At least it doesn't look like a
> > > transaction is succeeding when it shouldn't.
> >
> > Hi Todd,
> >
> > Thanks for looking into this further.  Look a bit more at the test, it
> > appears that the code above (line 392) only comes into play if the
> > service provider is handling a BR_REPLY, but looking at the test
> > output it doesn't appear that this is the case.
> >
> > # runcon -t test_binder_provider_no_im_t ./test_binder -v -r 2 provider
> > Service Provider PID: 2095 Process context:
> >unconfined_u:unconfined_r:test_binder_provider_no_im_t:s0-s0:c0.c1023
> > Service Provider sending transaction to Manager - ADD_TEST_SERVICE
> > Service Provider read_consumed: 48
> > Service Provider command: BR_NOOP
> > Service Provider command: BR_INCREFS
> > Service Provider command: BR_ACQUIRE
> > Service Provider command: BR_TRANSACTION_COMPLETE
> >
> > Service Provider read_consumed: 8
> > Service Provider command: BR_NOOP
> > Service Provider command: BR_FAILED_REPLY
>
> So, then it sounds like the test is not running properly and not
> really testing what it intends to test (which I guess is consistent
> with the fact that it triggered that BUG_ON -- the transaction is
> malformed and failing early). It doesn't look like there is a
> successful transaction that should have failed due to selinux policy
> -- it looks like there is an invalid transaction that probably fails
> earlier and doesn't return 103 (it probably returns 8 -- it would be
> useful if the test script displayed the exit value that was detected
> as a failure).
>
> I don't think there is much I can do on this now given the apparent
> flakiness, but I'm happy to help when there is a stable issue. I'll
> look a little more at the test code to see if I can spot what is wrong
> with the transaction.
>
> Can I add a "Tested-by: Paul Moore " on my patch
> submission to fix the BUG_ON (the exact patch you tested) ?
>
> -Todd
>
>
> >
> > However, things get weird.  In the course of writing this email I
> > booted into my 5.0.0-1.1.secnext kernel (which passed the binder test
> > earlier) and now that kernel is failing in the same way (the test
> > hasn't changed).  This test system is a Fedora Rawhide system which is
> > updated before I make a test run and I'm wondering if there is some
> > other userspace component which may be affecting this ... ?  I've now
> > tried this on two different, current Rawhide VMs, hosted on two
> > different

Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-20 Thread Todd Kjos
On Tue, Mar 19, 2019 at 8:04 PM Paul Moore  wrote:
>
> On Tue, Mar 19, 2019 at 9:08 PM Todd Kjos  wrote:
> > Paul,
> >
> > Looking at a snippet of the test output:
> >
> > Service Provider read_consumed: 8
> > Service Provider command: BR_NOOP
> > Service Provider command: BR_FAILED_REPLY  <-- the txn
> > failed as expected.
> > Manager read_consumed: 8
> > Manager command: BR_NOOP
> > Manager command: BR_TRANSACTION_COMPLETE
> > not ok 3
> > <- but for some reason didn't exit(103)
> > #   Failed test at ./test line 56.
> >
> > It looks like the test script expects test_binder to fail with
> > exit(103) after processing the Server Provider commands. It's not
> > clear why it didn't, since the return of a BR_FAILED_REPLY for that
> > transaction should have executed this code (line 392 of
> > test_binder.c):
> >
> > if (cmd == BR_FAILED_REPLY ||
> > cmd == BR_DEAD_REPLY ||
> > cmd == BR_DEAD_BINDER) {
> > fprintf(stderr,
> >   "Failed response from Service Provider using Managers 
> > FD\n");
> > exit(103);
> > }
> >
> > Could this be an issue with the test? At least it doesn't look like a
> > transaction is succeeding when it shouldn't.
>
> Hi Todd,
>
> Thanks for looking into this further.  Look a bit more at the test, it
> appears that the code above (line 392) only comes into play if the
> service provider is handling a BR_REPLY, but looking at the test
> output it doesn't appear that this is the case.
>
> # runcon -t test_binder_provider_no_im_t ./test_binder -v -r 2 provider
> Service Provider PID: 2095 Process context:
>unconfined_u:unconfined_r:test_binder_provider_no_im_t:s0-s0:c0.c1023
> Service Provider sending transaction to Manager - ADD_TEST_SERVICE
> Service Provider read_consumed: 48
> Service Provider command: BR_NOOP
> Service Provider command: BR_INCREFS
> Service Provider command: BR_ACQUIRE
> Service Provider command: BR_TRANSACTION_COMPLETE
>
> Service Provider read_consumed: 8
> Service Provider command: BR_NOOP
> Service Provider command: BR_FAILED_REPLY

So, then it sounds like the test is not running properly and not
really testing what it intends to test (which I guess is consistent
with the fact that it triggered that BUG_ON -- the transaction is
malformed and failing early). It doesn't look like there is a
successful transaction that should have failed due to selinux policy
-- it looks like there is an invalid transaction that probably fails
earlier and doesn't return 103 (it probably returns 8 -- it would be
useful if the test script displayed the exit value that was detected
as a failure).

I don't think there is much I can do on this now given the apparent
flakiness, but I'm happy to help when there is a stable issue. I'll
look a little more at the test code to see if I can spot what is wrong
with the transaction.

Can I add a "Tested-by: Paul Moore " on my patch
submission to fix the BUG_ON (the exact patch you tested) ?

-Todd


>
> However, things get weird.  In the course of writing this email I
> booted into my 5.0.0-1.1.secnext kernel (which passed the binder test
> earlier) and now that kernel is failing in the same way (the test
> hasn't changed).  This test system is a Fedora Rawhide system which is
> updated before I make a test run and I'm wondering if there is some
> other userspace component which may be affecting this ... ?  I've now
> tried this on two different, current Rawhide VMs, hosted on two
> different systems and I'm seeing the same thing, so I don't think it's
> a *bad* system/VM?
>
> > On Tue, Mar 19, 2019 at 5:15 PM Todd Kjos  wrote:
> > >
> > > [...]
> > >
> > > > > Is there a public dashboard where I can take a look at those binder 
> > > > > failures?
> > > >
> > > > Not really.  I send test results to a not-yet-publicized mailing list,
> > > > but there is more detail in the GitHub issue below (my last comment
> > > > has the verbose test output):
> > > >
> > > > * https://github.com/SELinuxProject/selinux-kernel/issues/46
> > > >
> > >
> > > Ok, so it looks like something was introduced that causes binder to be
> > > too permissive (test 3 transaction succeeded when failure is
> > > expected). I don't know of any recent binder changes that could have
> > > caused that.
> > >
> > > It will take me a while to set up this test environment. Is this easy
> > > for you to run? Any chance of bisecting or at least trying a few
> > > versions to narrow it down? Here's a list of the recent patchset -- it
> > > would be useful to know which caused it (or if none of them did):
> > >
> > > 9e98c678c2d6a Linux 5.1-rc1
> > > ...
> > > 26528be6720bb binder: fix handling of misaligned binder object
> > > bde4a19fc04f5 binder: use userspace pointer as base of buffer space
> > > c41358a5f5217 binder: remove user_buffer_offset
> > > db6b0b810bf94 binder: avoid kernel vm_area for buffer fixups
> > > 7a67a39320dfb binder: add function to copy bi

Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-19 Thread Paul Moore
On Tue, Mar 19, 2019 at 9:08 PM Todd Kjos  wrote:
> Paul,
>
> Looking at a snippet of the test output:
>
> Service Provider read_consumed: 8
> Service Provider command: BR_NOOP
> Service Provider command: BR_FAILED_REPLY  <-- the txn
> failed as expected.
> Manager read_consumed: 8
> Manager command: BR_NOOP
> Manager command: BR_TRANSACTION_COMPLETE
> not ok 3
> <- but for some reason didn't exit(103)
> #   Failed test at ./test line 56.
>
> It looks like the test script expects test_binder to fail with
> exit(103) after processing the Server Provider commands. It's not
> clear why it didn't, since the return of a BR_FAILED_REPLY for that
> transaction should have executed this code (line 392 of
> test_binder.c):
>
> if (cmd == BR_FAILED_REPLY ||
> cmd == BR_DEAD_REPLY ||
> cmd == BR_DEAD_BINDER) {
> fprintf(stderr,
>   "Failed response from Service Provider using Managers 
> FD\n");
> exit(103);
> }
>
> Could this be an issue with the test? At least it doesn't look like a
> transaction is succeeding when it shouldn't.

Hi Todd,

Thanks for looking into this further.  Look a bit more at the test, it
appears that the code above (line 392) only comes into play if the
service provider is handling a BR_REPLY, but looking at the test
output it doesn't appear that this is the case.

# runcon -t test_binder_provider_no_im_t ./test_binder -v -r 2 provider
Service Provider PID: 2095 Process context:
   unconfined_u:unconfined_r:test_binder_provider_no_im_t:s0-s0:c0.c1023
Service Provider sending transaction to Manager - ADD_TEST_SERVICE
Service Provider read_consumed: 48
Service Provider command: BR_NOOP
Service Provider command: BR_INCREFS
Service Provider command: BR_ACQUIRE
Service Provider command: BR_TRANSACTION_COMPLETE

Service Provider read_consumed: 8
Service Provider command: BR_NOOP
Service Provider command: BR_FAILED_REPLY

However, things get weird.  In the course of writing this email I
booted into my 5.0.0-1.1.secnext kernel (which passed the binder test
earlier) and now that kernel is failing in the same way (the test
hasn't changed).  This test system is a Fedora Rawhide system which is
updated before I make a test run and I'm wondering if there is some
other userspace component which may be affecting this ... ?  I've now
tried this on two different, current Rawhide VMs, hosted on two
different systems and I'm seeing the same thing, so I don't think it's
a *bad* system/VM?

> On Tue, Mar 19, 2019 at 5:15 PM Todd Kjos  wrote:
> >
> > [...]
> >
> > > > Is there a public dashboard where I can take a look at those binder 
> > > > failures?
> > >
> > > Not really.  I send test results to a not-yet-publicized mailing list,
> > > but there is more detail in the GitHub issue below (my last comment
> > > has the verbose test output):
> > >
> > > * https://github.com/SELinuxProject/selinux-kernel/issues/46
> > >
> >
> > Ok, so it looks like something was introduced that causes binder to be
> > too permissive (test 3 transaction succeeded when failure is
> > expected). I don't know of any recent binder changes that could have
> > caused that.
> >
> > It will take me a while to set up this test environment. Is this easy
> > for you to run? Any chance of bisecting or at least trying a few
> > versions to narrow it down? Here's a list of the recent patchset -- it
> > would be useful to know which caused it (or if none of them did):
> >
> > 9e98c678c2d6a Linux 5.1-rc1
> > ...
> > 26528be6720bb binder: fix handling of misaligned binder object
> > bde4a19fc04f5 binder: use userspace pointer as base of buffer space
> > c41358a5f5217 binder: remove user_buffer_offset
> > db6b0b810bf94 binder: avoid kernel vm_area for buffer fixups
> > 7a67a39320dfb binder: add function to copy binder object from buffer
> > 8ced0c6231ead binder: add functions to copy to/from binder buffers
> > 1a7c3d9bb7a92 binder: create userspace-to-binder-buffer copy function
> > ...
> > 1c163f4c7b3f6 (tag: v5.0) Linux 5.0
> >
> > Thanks,
> >
> > -Todd



--
paul moore
www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-19 Thread Paul Moore
On Tue, Mar 19, 2019 at 6:16 PM Todd Kjos  wrote:
> On Tue, Mar 19, 2019 at 3:08 PM Paul Moore  wrote:
> >
> > On Tue, Mar 19, 2019 at 3:33 PM Paul Moore  wrote:
> > > On Tue, Mar 19, 2019 at 12:51 PM Todd Kjos  wrote:
> > > > Paul,
> > > >
> > > > I think this patch will fix it... can you run the selinux-testsuite
> > > > with the patch to verify? (the conditional assumed that size_t can go
> > > > negative)
> > >
> > > Building a test kernel now, I'll report back as soon as it is finished.
> >
> > Good news, the BUG_ON() panic is now gone,
>
> Great. Thanks for testing it.

Thanks for the fix :)

> > but I'm seeing a test
> > failure, although to be fair I saw some binder test failures during
> > the merge window (I generally don't worry about failures until -rc1 is
> > released).  The selinux-testsuite binder tests have been working since
> > spring 2018, but it's possible there might be a bug in the tests that
> > is just now showing up.  Have you ever looked at the selinux-testsuite
> > tests for binder?
>
> No, I didn't know they existed until yesterday. Glad to have more test
> coverage. Were they running clean on 5.0.0?

Yep.  They were added to the test suite last May and have been running
clean since then.

> Is there a public dashboard where I can take a look at those binder failures?

Not really.  I send test results to a not-yet-publicized mailing list,
but there is more detail in the GitHub issue below (my last comment
has the verbose test output):

* https://github.com/SELinuxProject/selinux-kernel/issues/46

> > * 
> > https://github.com/SELinuxProject/selinux-testsuite/tree/master/tests/binder
> >
> > > > diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> > > > index 9a7c431469b3..bb9a661ffecc 100644
> > > > --- a/drivers/android/binder.c
> > > > +++ b/drivers/android/binder.c
> > > > @@ -2240,7 +2240,8 @@ static size_t binder_get_object(struct 
> > > > binder_proc *proc,
> > > > size_t object_size = 0;
> > > >
> > > > read_size = min_t(size_t, sizeof(*object), buffer->data_size - 
> > > > offset);
> > > > -   if (read_size < sizeof(*hdr) || !IS_ALIGNED(offset, 
> > > > sizeof(u32)))
> > > > +   if (offset > buffer->data_size || read_size < sizeof(*hdr) ||
> > > > +   !IS_ALIGNED(offset, sizeof(u32)))
> > > > return 0;
> > > > binder_alloc_copy_from_buffer(&proc->alloc, object, buffer,
> > > >   offset, read_size);
> > > >
> > > > On Mon, Mar 18, 2019 at 4:02 PM Paul Moore  wrote:
> > > > >
> > > > > On Mon, Mar 18, 2019 at 6:51 PM Todd Kjos  wrote:
> > > > > > On Mon, Mar 18, 2019 at 2:31 PM Paul Moore  
> > > > > > wrote:
> > > > > > > Hello all.
> > > > > > >
> > > > > > > When running the selinux-testsuite (link below) against v5.1-rc1 
> > > > > > > I hit
> > > > > > > the BUG_ON() at the top of binder_alloc_do_buffer_copy() (trace
> > > > > > > below).  I'm hoping this is a known issue with a fix already in 
> > > > > > > the
> > > > > > > works?
> > > > > >
> > > > > >
> > > > > > Sadly, this is the first report of this, so no fix in flight. I'll 
> > > > > > try
> > > > > > to get a fix up in the next few days.
> > > > >
> > > > > No problem, thanks for letting me know.  If you need some testing
> > > > > help, let me know.
> >
> > --
> > paul moore
> > www.paul-moore.com



-- 
paul moore
www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-19 Thread Paul Moore
On Tue, Mar 19, 2019 at 3:33 PM Paul Moore  wrote:
> On Tue, Mar 19, 2019 at 12:51 PM Todd Kjos  wrote:
> > Paul,
> >
> > I think this patch will fix it... can you run the selinux-testsuite
> > with the patch to verify? (the conditional assumed that size_t can go
> > negative)
>
> Building a test kernel now, I'll report back as soon as it is finished.

Good news, the BUG_ON() panic is now gone, but I'm seeing a test
failure, although to be fair I saw some binder test failures during
the merge window (I generally don't worry about failures until -rc1 is
released).  The selinux-testsuite binder tests have been working since
spring 2018, but it's possible there might be a bug in the tests that
is just now showing up.  Have you ever looked at the selinux-testsuite
tests for binder?

* https://github.com/SELinuxProject/selinux-testsuite/tree/master/tests/binder

> > diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> > index 9a7c431469b3..bb9a661ffecc 100644
> > --- a/drivers/android/binder.c
> > +++ b/drivers/android/binder.c
> > @@ -2240,7 +2240,8 @@ static size_t binder_get_object(struct binder_proc 
> > *proc,
> > size_t object_size = 0;
> >
> > read_size = min_t(size_t, sizeof(*object), buffer->data_size - 
> > offset);
> > -   if (read_size < sizeof(*hdr) || !IS_ALIGNED(offset, sizeof(u32)))
> > +   if (offset > buffer->data_size || read_size < sizeof(*hdr) ||
> > +   !IS_ALIGNED(offset, sizeof(u32)))
> > return 0;
> > binder_alloc_copy_from_buffer(&proc->alloc, object, buffer,
> >   offset, read_size);
> >
> > On Mon, Mar 18, 2019 at 4:02 PM Paul Moore  wrote:
> > >
> > > On Mon, Mar 18, 2019 at 6:51 PM Todd Kjos  wrote:
> > > > On Mon, Mar 18, 2019 at 2:31 PM Paul Moore  wrote:
> > > > > Hello all.
> > > > >
> > > > > When running the selinux-testsuite (link below) against v5.1-rc1 I hit
> > > > > the BUG_ON() at the top of binder_alloc_do_buffer_copy() (trace
> > > > > below).  I'm hoping this is a known issue with a fix already in the
> > > > > works?
> > > >
> > > >
> > > > Sadly, this is the first report of this, so no fix in flight. I'll try
> > > > to get a fix up in the next few days.
> > >
> > > No problem, thanks for letting me know.  If you need some testing
> > > help, let me know.

-- 
paul moore
www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-19 Thread Paul Moore
On Tue, Mar 19, 2019 at 12:51 PM Todd Kjos  wrote:
> Paul,
>
> I think this patch will fix it... can you run the selinux-testsuite
> with the patch to verify? (the conditional assumed that size_t can go
> negative)

Building a test kernel now, I'll report back as soon as it is finished.

Thanks.

> diff --git a/drivers/android/binder.c b/drivers/android/binder.c
> index 9a7c431469b3..bb9a661ffecc 100644
> --- a/drivers/android/binder.c
> +++ b/drivers/android/binder.c
> @@ -2240,7 +2240,8 @@ static size_t binder_get_object(struct binder_proc 
> *proc,
> size_t object_size = 0;
>
> read_size = min_t(size_t, sizeof(*object), buffer->data_size - 
> offset);
> -   if (read_size < sizeof(*hdr) || !IS_ALIGNED(offset, sizeof(u32)))
> +   if (offset > buffer->data_size || read_size < sizeof(*hdr) ||
> +   !IS_ALIGNED(offset, sizeof(u32)))
> return 0;
> binder_alloc_copy_from_buffer(&proc->alloc, object, buffer,
>   offset, read_size);
>
> On Mon, Mar 18, 2019 at 4:02 PM Paul Moore  wrote:
> >
> > On Mon, Mar 18, 2019 at 6:51 PM Todd Kjos  wrote:
> > > On Mon, Mar 18, 2019 at 2:31 PM Paul Moore  wrote:
> > > > Hello all.
> > > >
> > > > When running the selinux-testsuite (link below) against v5.1-rc1 I hit
> > > > the BUG_ON() at the top of binder_alloc_do_buffer_copy() (trace
> > > > below).  I'm hoping this is a known issue with a fix already in the
> > > > works?
> > >
> > >
> > > Sadly, this is the first report of this, so no fix in flight. I'll try
> > > to get a fix up in the next few days.
> >
> > No problem, thanks for letting me know.  If you need some testing
> > help, let me know.
> >
> > --
> > paul moore
> > www.paul-moore.com



-- 
paul moore
www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-19 Thread Todd Kjos
Paul,

I think this patch will fix it... can you run the selinux-testsuite
with the patch to verify? (the conditional assumed that size_t can go
negative)

Thanks,

-Todd


diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index 9a7c431469b3..bb9a661ffecc 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -2240,7 +2240,8 @@ static size_t binder_get_object(struct binder_proc *proc,
size_t object_size = 0;

read_size = min_t(size_t, sizeof(*object), buffer->data_size - offset);
-   if (read_size < sizeof(*hdr) || !IS_ALIGNED(offset, sizeof(u32)))
+   if (offset > buffer->data_size || read_size < sizeof(*hdr) ||
+   !IS_ALIGNED(offset, sizeof(u32)))
return 0;
binder_alloc_copy_from_buffer(&proc->alloc, object, buffer,
  offset, read_size);

On Mon, Mar 18, 2019 at 4:02 PM Paul Moore  wrote:
>
> On Mon, Mar 18, 2019 at 6:51 PM Todd Kjos  wrote:
> > On Mon, Mar 18, 2019 at 2:31 PM Paul Moore  wrote:
> > > Hello all.
> > >
> > > When running the selinux-testsuite (link below) against v5.1-rc1 I hit
> > > the BUG_ON() at the top of binder_alloc_do_buffer_copy() (trace
> > > below).  I'm hoping this is a known issue with a fix already in the
> > > works?
> >
> >
> > Sadly, this is the first report of this, so no fix in flight. I'll try
> > to get a fix up in the next few days.
>
> No problem, thanks for letting me know.  If you need some testing
> help, let me know.
>
> --
> paul moore
> www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


Re: v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-18 Thread Paul Moore
On Mon, Mar 18, 2019 at 6:51 PM Todd Kjos  wrote:
> On Mon, Mar 18, 2019 at 2:31 PM Paul Moore  wrote:
> > Hello all.
> >
> > When running the selinux-testsuite (link below) against v5.1-rc1 I hit
> > the BUG_ON() at the top of binder_alloc_do_buffer_copy() (trace
> > below).  I'm hoping this is a known issue with a fix already in the
> > works?
>
>
> Sadly, this is the first report of this, so no fix in flight. I'll try
> to get a fix up in the next few days.

No problem, thanks for letting me know.  If you need some testing
help, let me know.

-- 
paul moore
www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel


v5.1-rc1 binder_alloc_do_buffer_copy() BUG_ON triggered by selinux-testsuite

2019-03-18 Thread Paul Moore
Hello all.

When running the selinux-testsuite (link below) against v5.1-rc1 I hit
the BUG_ON() at the top of binder_alloc_do_buffer_copy() (trace
below).  I'm hoping this is a known issue with a fix already in the
works?

* https://github.com/SELinuxProject/selinux-testsuite

[  823.232432] [ cut here ]
[  823.234746] kernel BUG at drivers/android/binder_alloc.c:1141!
[  823.237447] invalid opcode:  [#1] SMP PTI
[  823.239421] CPU: 1 PID: 3644 Comm: test_binder Not tainted
5.1.0-0.rc1.git0.1.2.secnext.fc31.x86_64 #1
[  823.243538] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[  823.246079] RIP: 0010:binder_alloc_do_buffer_copy+0x34/0x210
[  823.248613] Code: 0a 41 55 49 89 fb 41 54 41 89 f4 48 8d 77 38 48
8b 42 58 55 53 48 39 f1 0f 84 17 01 00 00 48 8b 49 58 48 29 c1 49 39
c9 76 02 <0f> 0b 4c 29 c9 49 39 ca 77 f6 41 f6 c2 03 75 f0 0f b6 4a 28
f6 c1
[  823.256404] RSP: 0018:b04e41093b68 EFLAGS: 00010202
[  823.258513] RAX: 7fb600c52000 RBX: a0d48e24a0213e28 RCX: 0020
[  823.261375] RDX: 9c09b058a9c0 RSI: 9c09189165b0 RDI: 9c0918916578
[  823.264225] RBP: 9c09b058a9c0 R08: b04e41093c80 R09: 0028
[  823.267044] R10: a0d48e24a0213e28 R11: 9c0918916578 R12: 
[  823.269758] R13: 9c09b67c9660 R14: 9c09b116fb40 R15: 8acd4d08
[  823.272482] FS:  7fbeb3438800() GS:9c09b7a8()
knlGS:
[  823.275595] CS:  0010 DS:  ES:  CR0: 80050033
[  823.277676] CR2: 55b102d31cc9 CR3: 000234648000 CR4: 001406e0
[  823.280347] Call Trace:
[  823.281287]  binder_get_object+0x60/0xf0
[  823.282728]  binder_transaction+0xc2e/0x2370
[  823.284268]  ? __check_object_size+0x41/0x15d
[  823.285849]  ? binder_thread_read+0x9e2/0x1460
[  823.287342]  ? binder_update_ref_for_handle+0x83/0x1a0
[  823.289066]  binder_thread_write+0x2ae/0xfc0
[  823.290513]  ? finish_wait+0x80/0x80
[  823.291729]  binder_ioctl+0x659/0x836
[  823.292980]  do_vfs_ioctl+0x40a/0x670
[  823.294234]  ksys_ioctl+0x5e/0x90
[  823.295364]  __x64_sys_ioctl+0x16/0x20
[  823.296609]  do_syscall_64+0x5b/0x150
[  823.297796]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  823.299423] RIP: 0033:0x7fbeb35e782b
[  823.300580] Code: 0f 1e fa 48 8b 05 5d 96 0c 00 64 c7 00 26 00 00
00 48 c7 c0 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa b8 10 00 00
00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 2d 96 0c 00 f7 d8 64 89
01 48
[  823.306473] RSP: 002b:7ffdfae2f198 EFLAGS: 0287 ORIG_RAX:
0010
[  823.308868] RAX: ffda RBX:  RCX: 7fbeb35e782b
[  823.311029] RDX: 7ffdfae2f1b0 RSI: c0306201 RDI: 0003
[  823.313206] RBP: 7ffdfae30210 R08: 010fa330 R09: 
[  823.315379] R10: 00400644 R11: 0287 R12: 00401190
[  823.317459] R13: 7ffdfae304c0 R14:  R15: 
[  823.319510] Modules linked in: crypto_user nfnetlink xt_multiport
bluetooth ecdh_generic rfkill sctp overlay ip6table_security
xt_CONNSECMARK xt_SECMARK xt_state xt_conntrack nf_conntrack
nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c iptable_security ah6
xfrm6_mode_transport ah4 xfrm4_mode_transport ip6table_mangle
ip6table_filter ip6_tables iptable_mangle xt_mark xt_AUDIT ib_isert
iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp
rpcrdma rdma_ucm ib_iser ib_umad ib_ipoib rdma_cm iw_cm libiscsi
scsi_transport_iscsi ib_cm mlx5_ib ib_uverbs ib_core sunrpc
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel joydev
virtio_balloon i2c_piix4 drm_kms_helper virtio_net net_failover
failover ttm drm mlx5_core crc32c_intel virtio_blk ata_generic
virtio_console mlxfw serio_raw pata_acpi qemu_fw_cfg [last unloaded:
arp_tables]
[  823.339786] ---[ end trace 6f761f654b297775 ]---

-- 
paul moore
www.paul-moore.com
___
devel mailing list
de...@linuxdriverproject.org
http://driverdev.linuxdriverproject.org/mailman/listinfo/driverdev-devel