Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-08 Thread Keith Busch
On Thu, Nov 08, 2018 at 07:10:58PM +0800, Ming Lei wrote:
> I guess the issue may depend on specific QEMU version, just tried the test 
> over
> virtio-scsi/sata/usb-storage emulated via qemu-2.10.2-1.fc27, not observed
> this problem.

I actually didn't use virtio-scsi, but it really doesn't matter.

FWIW, this is what I ran:

  # qemu-system-x86_64 --version
  QEMU emulator version 2.10.2(qemu-2.10.2-1.fc27)

  # qemu-system-x86_64 -m 192 -smp 2 -enable-kvm -display none -snapshot \
-hda /mnt/images/fedora-27.img  -nographic \
-append "console=tty0 console=ttyS0 root=/dav/sda rw" \
-kernel /boot/vmlinuz-4.18.10-100.fc27.x86_64 \
-initrd /boot/initramfs-4.18.10-100.fc27.x86_64.img

The file "fedora-27.img" is just a filesystem image of a minimal mock
setup from /var/lib/mock/fedora-27-x86_64/root/.

A small memory size makes it easier to observe, otherwise your probability
of allocating unclean pages lowers. That's really the only reason I used
qemu since all my real machines have too much memory that I never come
close to using, making observations more random.


Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-08 Thread Ming Lei
On Thu, Nov 8, 2018 at 6:07 PM Johannes Thumshirn  wrote:
>
> On 08/11/2018 02:22, Keith Busch wrote:
> > $ ./sg-test /dev/sda | grep -v 0
> > 40733f4019db8001244019db4065244019db0094244019db
> > c025244019dbc0e43a4019db40973a4019dbc0623a4019db
> > 800c244019dbc0d61d4019dbc05f244019db80091e4019db
> > 40913a4019db806f3f4019db40a83f4019dbc083244019db
> > 80eb1e4019db00a93f4019dbc09a3a4019db40503f4019db
> > 007f1b4019dbc0d91e4019db40551e4019db804a1b4019db
> > 
> > --
> >
>
> Hi Keith,
> Can you please add the above to blktests?
>
> This would be very useful for downstream distributions.

I guess the issue may depend on specific QEMU version, just tried the test over
virtio-scsi/sata/usb-storage emulated via qemu-2.10.2-1.fc27, not observed
this problem.

Thanks,
Ming Lei


Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-08 Thread Johannes Thumshirn
On 08/11/2018 02:22, Keith Busch wrote:
> $ ./sg-test /dev/sda | grep -v 0
> 40733f4019db8001244019db4065244019db0094244019db
> c025244019dbc0e43a4019db40973a4019dbc0623a4019db
> 800c244019dbc0d61d4019dbc05f244019db80091e4019db
> 40913a4019db806f3f4019db40a83f4019dbc083244019db
> 80eb1e4019db00a93f4019dbc09a3a4019db40503f4019db
> 007f1b4019dbc0d91e4019db40551e4019db804a1b4019db
> 
> --
> 

Hi Keith,
Can you please add the above to blktests?

This would be very useful for downstream distributions.

Thanks a lot,
Johannes

-- 
Johannes ThumshirnSUSE Labs
jthumsh...@suse.de+49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-07 Thread Jens Axboe
On 11/7/18 6:12 PM, Ming Lei wrote:
> On Thu, Nov 8, 2018 at 12:12 AM Keith Busch  wrote:
>>
>> On Thu, Nov 08, 2018 at 12:03:41AM +0800, Ming Lei wrote:
>>> On Wed, Nov 7, 2018 at 11:47 PM Keith Busch  wrote:

 On Wed, Nov 07, 2018 at 11:44:59PM +0800, Ming Lei wrote:
> blk_update_request() may tell us how much progress made, :-)

 Except when it doesn't, which is 100% of the time for many block
 drivers, including nvme.
>>>
>>> Please look at blk_mq_end_request()(<- nvme_complete_rq()), which
>>> calls blk_update_request().
>>
>> Huh? That just says 100% of the request size was transerred no matter
>> what was actually transferred.
>>
>> The protocol doesn't have a way to express what transfer size occured
>> with a command's completion, and even it did, there's no way I'd trust
>> every firmware not to screw it.
> 
> That sounds something like below:
> 
> 1) userspace requires to read 8 sectors starting from sector 0
> 2) nvme tells hardware to do that, and hardware transfers only 4
> sectors(0~3) data
> to memory, but still tells driver we have done 8 sectors(0~7).
> 
> What if there are real data in sectors 4~7?
> 
> Is it NVMe specific issue or common problem in other storage hardware?  SCSI
> does call blk_update_request() and handles partial completion.

I would never believe any of that, it's much better to error on the side
of caution for things like this. blk-mq doesn't even support partial
completions, exactly because most hw doesn't even support it. But even
for the ones that do, I would not trust it a lot.

-- 
Jens Axboe



Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-07 Thread Keith Busch
On Thu, Nov 08, 2018 at 09:12:59AM +0800, Ming Lei wrote:
> Is it NVMe specific issue or common problem in other storage hardware?  SCSI
> does call blk_update_request() and handles partial completion.

Not specific to NVMe.

An example using SG_IO dumping 2MB of unsanitized kernel memory:

sg-test.c:
---
#include 
#include 
#include 
#include 
#include 
#include 
#include 

#define SIZE (2 * 1024 * 1024 + 8)
int main(int argc, char **argv)
{
struct sg_io_hdr io_hdr;
unsigned char *buffer, cmd[6] = { TEST_UNIT_READY };
int sg, i;

if (argc < 2)
fprintf(stderr, "usage: %s \n", argv[0]), exit(0);

sg = open(argv[1], O_RDONLY);
if (sg < 0)
perror("open"), exit(0);

buffer = malloc(SIZE);
if (!buffer)
fprintf(stderr, "no memory\n"), exit(0);

memset(_hdr, 0, sizeof(struct sg_io_hdr));
io_hdr.interface_id = 'S';
io_hdr.cmd_len = 6;
io_hdr.cmdp = cmd;
io_hdr.dxfer_direction = SG_DXFER_FROM_DEV;
io_hdr.dxfer_len = SIZE;
io_hdr.dxferp = buffer;

memset(buffer, 0, SIZE);
ioctl(sg, SG_IO, _hdr);
for (i = 0; i < SIZE; i++) {
printf("%02x", buffer[i]);
if (i+1 % 32 == 0)
printf("\n");
}
}
--

Test on qemu:
---
$ ./sg-test /dev/sda | grep -v 0
40733f4019db8001244019db4065244019db0094244019db
c025244019dbc0e43a4019db40973a4019dbc0623a4019db
800c244019dbc0d61d4019dbc05f244019db80091e4019db
40913a4019db806f3f4019db40a83f4019dbc083244019db
80eb1e4019db00a93f4019dbc09a3a4019db40503f4019db
007f1b4019dbc0d91e4019db40551e4019db804a1b4019db

--


Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-07 Thread Jens Axboe
On 11/7/18 7:37 AM, Keith Busch wrote:
> If the kernel allocates a bounce buffer for user read data, this memory
> needs to be cleared before copying it to the user, otherwise it may leak
> kernel memory to user space.

Applied, thanks.

-- 
Jens Axboe



Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-07 Thread Keith Busch
On Thu, Nov 08, 2018 at 12:03:41AM +0800, Ming Lei wrote:
> On Wed, Nov 7, 2018 at 11:47 PM Keith Busch  wrote:
> >
> > On Wed, Nov 07, 2018 at 11:44:59PM +0800, Ming Lei wrote:
> > > blk_update_request() may tell us how much progress made, :-)
> >
> > Except when it doesn't, which is 100% of the time for many block
> > drivers, including nvme.
> 
> Please look at blk_mq_end_request()(<- nvme_complete_rq()), which
> calls blk_update_request().

Huh? That just says 100% of the request size was transerred no matter
what was actually transferred.

The protocol doesn't have a way to express what transfer size occured
with a command's completion, and even it did, there's no way I'd trust
every firmware not to screw it.


Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-07 Thread Ming Lei
On Wed, Nov 7, 2018 at 11:47 PM Keith Busch  wrote:
>
> On Wed, Nov 07, 2018 at 11:44:59PM +0800, Ming Lei wrote:
> > blk_update_request() may tell us how much progress made, :-)
>
> Except when it doesn't, which is 100% of the time for many block
> drivers, including nvme.

Please look at blk_mq_end_request()(<- nvme_complete_rq()), which
calls blk_update_request().

Thanks,
Ming Lei


Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-07 Thread Keith Busch
On Wed, Nov 07, 2018 at 11:44:59PM +0800, Ming Lei wrote:
> blk_update_request() may tell us how much progress made, :-)

Except when it doesn't, which is 100% of the time for many block
drivers, including nvme.


Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-07 Thread Ming Lei
On Wed, Nov 7, 2018 at 11:19 PM Keith Busch  wrote:
>
> On Wed, Nov 07, 2018 at 11:09:27PM +0800, Ming Lei wrote:
> > On Wed, Nov 7, 2018 at 10:42 PM Keith Busch  wrote:
> > >
> > > If the kernel allocates a bounce buffer for user read data, this memory
> > > needs to be cleared before copying it to the user, otherwise it may leak
> > > kernel memory to user space.
> > >
> > > Signed-off-by: Keith Busch 
> > > ---
> > >  block/bio.c | 1 +
> > >  1 file changed, 1 insertion(+)
> > >
> > > diff --git a/block/bio.c b/block/bio.c
> > > index d5368a445561..a50d59236b19 100644
> > > --- a/block/bio.c
> > > +++ b/block/bio.c
> > > @@ -1260,6 +1260,7 @@ struct bio *bio_copy_user_iov(struct request_queue 
> > > *q,
> > > if (ret)
> > > goto cleanup;
> > > } else {
> > > +   zero_fill_bio(bio);
> > > iov_iter_advance(iter, bio->bi_iter.bi_size);
> > > }
> >
> > This way looks inefficient because zero fill should only be required
> > for short READ.
>
> Sure, but how do you know that happened before copying the bounce buffer
> to user space?

blk_update_request() may tell us how much progress made, :-)

So it should work by calling the following code after the req is completed
and before copying data to userspace:

__rq_for_each_bio(bio, rq)
zero_fill_bio(bio);
>
> We could zero the pages on allocation if that's better (and doesn't zero
> twice if __GFP_ZERO was already provided):
>
> ---
> diff --git a/block/bio.c b/block/bio.c
> index d5368a445561..a1b6383294f4 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1212,6 +1212,9 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
> nr_pages = 1 << map_data->page_order;
> i = map_data->offset / PAGE_SIZE;
> }
> +
> +   if (iov_iter_rw(iter) == READ)
> +   gfp_mask |= __GFP_ZERO;

No big difference compared with before, because most of times the zero fill
shouldn't be needed. However, this patch changes to do it every time.

Thanks,
Ming Lei


Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-07 Thread Keith Busch
On Wed, Nov 07, 2018 at 11:09:27PM +0800, Ming Lei wrote:
> On Wed, Nov 7, 2018 at 10:42 PM Keith Busch  wrote:
> >
> > If the kernel allocates a bounce buffer for user read data, this memory
> > needs to be cleared before copying it to the user, otherwise it may leak
> > kernel memory to user space.
> >
> > Signed-off-by: Keith Busch 
> > ---
> >  block/bio.c | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/block/bio.c b/block/bio.c
> > index d5368a445561..a50d59236b19 100644
> > --- a/block/bio.c
> > +++ b/block/bio.c
> > @@ -1260,6 +1260,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
> > if (ret)
> > goto cleanup;
> > } else {
> > +   zero_fill_bio(bio);
> > iov_iter_advance(iter, bio->bi_iter.bi_size);
> > }
> 
> This way looks inefficient because zero fill should only be required
> for short READ.

Sure, but how do you know that happened before copying the bounce buffer
to user space?

We could zero the pages on allocation if that's better (and doesn't zero
twice if __GFP_ZERO was already provided):

---
diff --git a/block/bio.c b/block/bio.c
index d5368a445561..a1b6383294f4 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1212,6 +1212,9 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
nr_pages = 1 << map_data->page_order;
i = map_data->offset / PAGE_SIZE;
}
+
+   if (iov_iter_rw(iter) == READ)
+   gfp_mask |= __GFP_ZERO;
while (len) {
unsigned int bytes = PAGE_SIZE;
 
--


Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-07 Thread Ming Lei
On Wed, Nov 7, 2018 at 10:42 PM Keith Busch  wrote:
>
> If the kernel allocates a bounce buffer for user read data, this memory
> needs to be cleared before copying it to the user, otherwise it may leak
> kernel memory to user space.
>
> Signed-off-by: Keith Busch 
> ---
>  block/bio.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/block/bio.c b/block/bio.c
> index d5368a445561..a50d59236b19 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1260,6 +1260,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
> if (ret)
> goto cleanup;
> } else {
> +   zero_fill_bio(bio);
> iov_iter_advance(iter, bio->bi_iter.bi_size);
> }

This way looks inefficient because zero fill should only be required
for short READ.

Thanks,
Ming Lei


Re: [PATCH] block: Clear kernel memory before copying to user

2018-11-07 Thread Laurence Oberman
On Wed, 2018-11-07 at 07:37 -0700, Keith Busch wrote:
> If the kernel allocates a bounce buffer for user read data, this
> memory
> needs to be cleared before copying it to the user, otherwise it may
> leak
> kernel memory to user space.
> 
> Signed-off-by: Keith Busch 
> ---
>  block/bio.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/block/bio.c b/block/bio.c
> index d5368a445561..a50d59236b19 100644
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -1260,6 +1260,7 @@ struct bio *bio_copy_user_iov(struct
> request_queue *q,
>   if (ret)
>   goto cleanup;
>   } else {
> + zero_fill_bio(bio);
>   iov_iter_advance(iter, bio->bi_iter.bi_size);
>   }
>  

Straightforward, looks good from my view.

Reviewed-by:
Laurence Oberman 


[PATCH] block: Clear kernel memory before copying to user

2018-11-07 Thread Keith Busch
If the kernel allocates a bounce buffer for user read data, this memory
needs to be cleared before copying it to the user, otherwise it may leak
kernel memory to user space.

Signed-off-by: Keith Busch 
---
 block/bio.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/block/bio.c b/block/bio.c
index d5368a445561..a50d59236b19 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1260,6 +1260,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q,
if (ret)
goto cleanup;
} else {
+   zero_fill_bio(bio);
iov_iter_advance(iter, bio->bi_iter.bi_size);
}
 
-- 
2.14.4