Re: sg_io HARDENED_USERCOPY_PAGESPAN trace

2017-01-08 Thread Christoph Hellwig
On Tue, Jan 03, 2017 at 01:48:03PM -0800, Kees Cook wrote:
> There are a lot of cases of "missing" __GFP_COMP, which is why
> HARDENED_USERCOPY_PAGESPAN defaults to "n".
> 
> > If this is on a devie using blk-mq the block core will use high
> > order allocations (as high as possible) to allocate the requests
> > for each queue, so struct request could very well span multiple
> > pages.  But I don't see what __GFP_COMP would have to do with
> > user copy annoations.  As all requests for a queue are freed
> > togeth again there is no point in setting __GFP_COMP for the
> > request allocations.
> 
> Does it hurt anything to mark these pages as allocated "together" via
> __GFP_COMP?

It don't think it would hurt the block code - it only allocates the
pages once, and frees them once.  But I think hijacking your feature
on top of a totally unrelated flag is a horrible idea.  __GFP_COMP
is about refcounting the allocation, not about anything else.  The prime
use case of high order allocations is to use them as a single memory
object, which might include user copies.

So as-is I think HARDENED_USERCOPY_PAGESPAN is a misfeature, it needs to
be opt-in for allocations where we might not copy over the span of
pages, not opt-out.  And I suspect there aren't going to be all that
many opt-out candidates.

> 
> -Kees
> 
> -- 
> Kees Cook
> Nexus Security
---end quoted text---


Re: sg_io HARDENED_USERCOPY_PAGESPAN trace

2017-01-03 Thread Kees Cook
On Fri, Dec 30, 2016 at 7:10 AM, Christoph Hellwig  wrote:
> On Fri, Dec 30, 2016 at 10:01:39AM -0500, Dave Jones wrote:
>> I threw this debug printk into the pagespan code to see what exactly
>> it was complaining about..
>>
>> ptr:88042614cff8 end:88042614d003 n:c
>>
>> so it was copying 12 bytes that spanned two pages.
>> >From my reading of the config option help text, this thing is
>> complaining that wasn't allocated with __GFP_COMP maybe ?

There are a lot of cases of "missing" __GFP_COMP, which is why
HARDENED_USERCOPY_PAGESPAN defaults to "n".

> If this is on a devie using blk-mq the block core will use high
> order allocations (as high as possible) to allocate the requests
> for each queue, so struct request could very well span multiple
> pages.  But I don't see what __GFP_COMP would have to do with
> user copy annoations.  As all requests for a queue are freed
> togeth again there is no point in setting __GFP_COMP for the
> request allocations.

Does it hurt anything to mark these pages as allocated "together" via
__GFP_COMP?

-Kees

-- 
Kees Cook
Nexus Security


Re: sg_io HARDENED_USERCOPY_PAGESPAN trace

2016-12-30 Thread Christoph Hellwig
On Fri, Dec 30, 2016 at 10:01:39AM -0500, Dave Jones wrote:
> I threw this debug printk into the pagespan code to see what exactly
> it was complaining about..
> 
> ptr:88042614cff8 end:88042614d003 n:c
> 
> so it was copying 12 bytes that spanned two pages.
> >From my reading of the config option help text, this thing is
> complaining that wasn't allocated with __GFP_COMP maybe ?

If this is on a devie using blk-mq the block core will use high
order allocations (as high as possible) to allocate the requests
for each queue, so struct request could very well span multiple
pages.  But I don't see what __GFP_COMP would have to do with
user copy annoations.  As all requests for a queue are freed
togeth again there is no point in setting __GFP_COMP for the
request allocations.


Re: sg_io HARDENED_USERCOPY_PAGESPAN trace

2016-12-30 Thread Dave Jones
On Fri, Dec 30, 2016 at 05:37:12AM -0800, Christoph Hellwig wrote:
 > On Thu, Dec 29, 2016 at 10:43:51AM -0500, Dave Jones wrote:
 > > On Wed, Dec 28, 2016 at 11:56:42PM -0800, Christoph Hellwig wrote:
 > >  > On Wed, Dec 28, 2016 at 04:40:16PM -0500, Dave Jones wrote:
 > >  > >  sg_io+0x113/0x470
 > >  > 
 > >  > Can you resolve that to a source line using a gdb?
 > > 
 > > It's the copy_from_user in an inlined copy of blk_fill_sghdr_rq.
 > 
 > That must be this line right at the beginning of blk_fill_sghdr_rq
 > 
 >  if (copy_from_user(rq->cmd, hdr->cmdp, hdr->cmd_len))
 >  return -EFAULT;
 > 
 > We're copying the SCSI CDB from the userspace pointer inside the hdr
 > we copied earlier into the request.
 > 
 > req->cmd is set to req->__cmd which is a u8 array with 16 members in
 > struct request by default, but if hdr->cmd_len is bigger than BLK_MAX_CDB
 > (16) we do a separate allocation for it in the caller:
 > 
 >  if (hdr->cmd_len > BLK_MAX_CDB) {
 >  rq->cmd = kzalloc(hdr->cmd_len, GFP_KERNEL);
 >  if (!rq->cmd)
 >  goto out_put_request;
 >  }
 > 
 > so I'm not really sure what the problem here could be.

I threw this debug printk into the pagespan code to see what exactly
it was complaining about..

ptr:88042614cff8 end:88042614d003 n:c

so it was copying 12 bytes that spanned two pages.
>From my reading of the config option help text, this thing is
complaining that wasn't allocated with __GFP_COMP maybe ?

Dave


Re: sg_io HARDENED_USERCOPY_PAGESPAN trace

2016-12-30 Thread Christoph Hellwig
On Thu, Dec 29, 2016 at 10:43:51AM -0500, Dave Jones wrote:
> On Wed, Dec 28, 2016 at 11:56:42PM -0800, Christoph Hellwig wrote:
>  > On Wed, Dec 28, 2016 at 04:40:16PM -0500, Dave Jones wrote:
>  > >  sg_io+0x113/0x470
>  > 
>  > Can you resolve that to a source line using a gdb?
> 
> It's the copy_from_user in an inlined copy of blk_fill_sghdr_rq.

That must be this line right at the beginning of blk_fill_sghdr_rq

if (copy_from_user(rq->cmd, hdr->cmdp, hdr->cmd_len))
return -EFAULT;

We're copying the SCSI CDB from the userspace pointer inside the hdr
we copied earlier into the request.

req->cmd is set to req->__cmd which is a u8 array with 16 members in
struct request by default, but if hdr->cmd_len is bigger than BLK_MAX_CDB
(16) we do a separate allocation for it in the caller:

if (hdr->cmd_len > BLK_MAX_CDB) {
rq->cmd = kzalloc(hdr->cmd_len, GFP_KERNEL);
if (!rq->cmd)
goto out_put_request;
}

so I'm not really sure what the problem here could be.


Re: sg_io HARDENED_USERCOPY_PAGESPAN trace

2016-12-29 Thread Dave Jones
On Wed, Dec 28, 2016 at 11:56:42PM -0800, Christoph Hellwig wrote:
 > On Wed, Dec 28, 2016 at 04:40:16PM -0500, Dave Jones wrote:
 > >  sg_io+0x113/0x470
 > 
 > Can you resolve that to a source line using a gdb?

It's the copy_from_user in an inlined copy of blk_fill_sghdr_rq.

Dave


Re: sg_io HARDENED_USERCOPY_PAGESPAN trace

2016-12-28 Thread Christoph Hellwig
On Wed, Dec 28, 2016 at 04:40:16PM -0500, Dave Jones wrote:
>  sg_io+0x113/0x470

Can you resolve that to a source line using a gdb?



sg_io HARDENED_USERCOPY_PAGESPAN trace

2016-12-28 Thread Dave Jones
One of my machines won't boot 4.10rc1, because it hit this:

usercopy: kernel memory overwrite attempt detected to 88042601cff8 () (12 bytes)
[ cut here ]
kernel BUG at mm/usercopy.c:75!
invalid opcode:  [#1] SMP DEBUG_PAGEALLOC
CPU: 5 PID: 483 Comm: ata_id Not tainted 4.10.0-rc1-backup-debug+
task: 880427bd2ec0 task.stack: c900011d
RIP: 0010:__check_object_size+0xf4/0x316
RSP: 0018:c900011d3b28 EFLAGS: 00010282
RAX: 006a RBX: 88042601cff8 RCX: 
RDX:  RSI: 88043dd4cc28 RDI: 88043dd4cc28
RBP: c900011d3b60 R08: 0001 R09: 
R10: 0001 R11: 0001 R12: 000c
R13: ea0010980700 R14:  R15: 88042601d004
FS:  7f05c3011b40() GS:88043dd4() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 7f05c270ca30 CR3: 000427b38000 CR4: 001406e0
Call Trace:
 sg_io+0x113/0x470
 ? __might_fault+0x43/0xa0
 ? __check_object_size+0x11b/0x316
 scsi_cmd_ioctl+0x335/0x4d0
 ? __might_sleep+0x4b/0x90
 scsi_cmd_blk_ioctl+0x42/0x50
 sd_ioctl+0x85/0x110
 blkdev_ioctl+0x53b/0xa20
 block_ioctl+0x3d/0x50
 do_vfs_ioctl+0xa3/0x730
 ? __context_tracking_exit.part.6+0x4a/0x190
 ? __seccomp_filter+0x67/0x220
 SyS_ioctl+0x41/0x70
 do_syscall_64+0x7b/0x700
 entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x7f05c271bcc7
RSP: 002b:7ffe1559d8e8 EFLAGS: 0246 ORIG_RAX: 0010
RAX: ffda RBX: 0001 RCX: 7f05c271bcc7
RDX: 7ffe1559d930 RSI: 2285 RDI: 0003
RBP: 56259fdfd010 R08:  R09: 
R10: 7f05c3011b40 R11: 0246 R12: 7ffe1559ef1c
R13: 7ffe1559db90 R14: 0003 R15: 
Code: f9 01 00 00 49 c7 c0 3d 86 ef 8e 48 c7 c2 50 63 f0 8e 48 c7 c6 ed f4 ee 
8e 4d 89 e1 48 89 d9 48 c7 c7 b0 58 ef 8e e8 fc f3 f9 ff <0f> 0b 4c 89 ea 4c 89 
e6 48 89 df e8 3c b4 ff ff 48 85 c0 49 89 
RIP: __check_object_size+0xf4/0x316 RSP: c900011d3b28
usercopy: kernel memory overwrite attempt detected to 88042624cff8 () (16 bytes)

For now I've disabled HARDENED_USERCOPY_PAGESPAN.  I suspect this only
showed up on the one machine because it's got a RAID6 array, and I don't
use md anywhere else.

Dave