Re: [PATCH] block: Clear kernel memory before copying to user
On Thu, Nov 08, 2018 at 07:10:58PM +0800, Ming Lei wrote: > I guess the issue may depend on specific QEMU version, just tried the test > over > virtio-scsi/sata/usb-storage emulated via qemu-2.10.2-1.fc27, not observed > this problem. I actually didn't use virtio-scsi, but it really doesn't matter. FWIW, this is what I ran: # qemu-system-x86_64 --version QEMU emulator version 2.10.2(qemu-2.10.2-1.fc27) # qemu-system-x86_64 -m 192 -smp 2 -enable-kvm -display none -snapshot \ -hda /mnt/images/fedora-27.img -nographic \ -append "console=tty0 console=ttyS0 root=/dav/sda rw" \ -kernel /boot/vmlinuz-4.18.10-100.fc27.x86_64 \ -initrd /boot/initramfs-4.18.10-100.fc27.x86_64.img The file "fedora-27.img" is just a filesystem image of a minimal mock setup from /var/lib/mock/fedora-27-x86_64/root/. A small memory size makes it easier to observe, otherwise your probability of allocating unclean pages lowers. That's really the only reason I used qemu since all my real machines have too much memory that I never come close to using, making observations more random.
Re: [PATCH] block: Clear kernel memory before copying to user
On Thu, Nov 8, 2018 at 6:07 PM Johannes Thumshirn wrote: > > On 08/11/2018 02:22, Keith Busch wrote: > > $ ./sg-test /dev/sda | grep -v 0 > > 40733f4019db8001244019db4065244019db0094244019db > > c025244019dbc0e43a4019db40973a4019dbc0623a4019db > > 800c244019dbc0d61d4019dbc05f244019db80091e4019db > > 40913a4019db806f3f4019db40a83f4019dbc083244019db > > 80eb1e4019db00a93f4019dbc09a3a4019db40503f4019db > > 007f1b4019dbc0d91e4019db40551e4019db804a1b4019db > > > > -- > > > > Hi Keith, > Can you please add the above to blktests? > > This would be very useful for downstream distributions. I guess the issue may depend on specific QEMU version, just tried the test over virtio-scsi/sata/usb-storage emulated via qemu-2.10.2-1.fc27, not observed this problem. Thanks, Ming Lei
Re: [PATCH] block: Clear kernel memory before copying to user
On 08/11/2018 02:22, Keith Busch wrote: > $ ./sg-test /dev/sda | grep -v 0 > 40733f4019db8001244019db4065244019db0094244019db > c025244019dbc0e43a4019db40973a4019dbc0623a4019db > 800c244019dbc0d61d4019dbc05f244019db80091e4019db > 40913a4019db806f3f4019db40a83f4019dbc083244019db > 80eb1e4019db00a93f4019dbc09a3a4019db40503f4019db > 007f1b4019dbc0d91e4019db40551e4019db804a1b4019db > > -- > Hi Keith, Can you please add the above to blktests? This would be very useful for downstream distributions. Thanks a lot, Johannes -- Johannes ThumshirnSUSE Labs jthumsh...@suse.de+49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Jane Smithard, Graham Norton HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850
Re: [PATCH] block: Clear kernel memory before copying to user
On 11/7/18 6:12 PM, Ming Lei wrote: > On Thu, Nov 8, 2018 at 12:12 AM Keith Busch wrote: >> >> On Thu, Nov 08, 2018 at 12:03:41AM +0800, Ming Lei wrote: >>> On Wed, Nov 7, 2018 at 11:47 PM Keith Busch wrote: On Wed, Nov 07, 2018 at 11:44:59PM +0800, Ming Lei wrote: > blk_update_request() may tell us how much progress made, :-) Except when it doesn't, which is 100% of the time for many block drivers, including nvme. >>> >>> Please look at blk_mq_end_request()(<- nvme_complete_rq()), which >>> calls blk_update_request(). >> >> Huh? That just says 100% of the request size was transerred no matter >> what was actually transferred. >> >> The protocol doesn't have a way to express what transfer size occured >> with a command's completion, and even it did, there's no way I'd trust >> every firmware not to screw it. > > That sounds something like below: > > 1) userspace requires to read 8 sectors starting from sector 0 > 2) nvme tells hardware to do that, and hardware transfers only 4 > sectors(0~3) data > to memory, but still tells driver we have done 8 sectors(0~7). > > What if there are real data in sectors 4~7? > > Is it NVMe specific issue or common problem in other storage hardware? SCSI > does call blk_update_request() and handles partial completion. I would never believe any of that, it's much better to error on the side of caution for things like this. blk-mq doesn't even support partial completions, exactly because most hw doesn't even support it. But even for the ones that do, I would not trust it a lot. -- Jens Axboe
Re: [PATCH] block: Clear kernel memory before copying to user
On Thu, Nov 08, 2018 at 09:12:59AM +0800, Ming Lei wrote: > Is it NVMe specific issue or common problem in other storage hardware? SCSI > does call blk_update_request() and handles partial completion. Not specific to NVMe. An example using SG_IO dumping 2MB of unsanitized kernel memory: sg-test.c: --- #include #include #include #include #include #include #include #define SIZE (2 * 1024 * 1024 + 8) int main(int argc, char **argv) { struct sg_io_hdr io_hdr; unsigned char *buffer, cmd[6] = { TEST_UNIT_READY }; int sg, i; if (argc < 2) fprintf(stderr, "usage: %s \n", argv[0]), exit(0); sg = open(argv[1], O_RDONLY); if (sg < 0) perror("open"), exit(0); buffer = malloc(SIZE); if (!buffer) fprintf(stderr, "no memory\n"), exit(0); memset(_hdr, 0, sizeof(struct sg_io_hdr)); io_hdr.interface_id = 'S'; io_hdr.cmd_len = 6; io_hdr.cmdp = cmd; io_hdr.dxfer_direction = SG_DXFER_FROM_DEV; io_hdr.dxfer_len = SIZE; io_hdr.dxferp = buffer; memset(buffer, 0, SIZE); ioctl(sg, SG_IO, _hdr); for (i = 0; i < SIZE; i++) { printf("%02x", buffer[i]); if (i+1 % 32 == 0) printf("\n"); } } -- Test on qemu: --- $ ./sg-test /dev/sda | grep -v 0 40733f4019db8001244019db4065244019db0094244019db c025244019dbc0e43a4019db40973a4019dbc0623a4019db 800c244019dbc0d61d4019dbc05f244019db80091e4019db 40913a4019db806f3f4019db40a83f4019dbc083244019db 80eb1e4019db00a93f4019dbc09a3a4019db40503f4019db 007f1b4019dbc0d91e4019db40551e4019db804a1b4019db --
Re: [PATCH] block: Clear kernel memory before copying to user
On 11/7/18 7:37 AM, Keith Busch wrote: > If the kernel allocates a bounce buffer for user read data, this memory > needs to be cleared before copying it to the user, otherwise it may leak > kernel memory to user space. Applied, thanks. -- Jens Axboe
Re: [PATCH] block: Clear kernel memory before copying to user
On Thu, Nov 08, 2018 at 12:03:41AM +0800, Ming Lei wrote: > On Wed, Nov 7, 2018 at 11:47 PM Keith Busch wrote: > > > > On Wed, Nov 07, 2018 at 11:44:59PM +0800, Ming Lei wrote: > > > blk_update_request() may tell us how much progress made, :-) > > > > Except when it doesn't, which is 100% of the time for many block > > drivers, including nvme. > > Please look at blk_mq_end_request()(<- nvme_complete_rq()), which > calls blk_update_request(). Huh? That just says 100% of the request size was transerred no matter what was actually transferred. The protocol doesn't have a way to express what transfer size occured with a command's completion, and even it did, there's no way I'd trust every firmware not to screw it.
Re: [PATCH] block: Clear kernel memory before copying to user
On Wed, Nov 7, 2018 at 11:47 PM Keith Busch wrote: > > On Wed, Nov 07, 2018 at 11:44:59PM +0800, Ming Lei wrote: > > blk_update_request() may tell us how much progress made, :-) > > Except when it doesn't, which is 100% of the time for many block > drivers, including nvme. Please look at blk_mq_end_request()(<- nvme_complete_rq()), which calls blk_update_request(). Thanks, Ming Lei
Re: [PATCH] block: Clear kernel memory before copying to user
On Wed, Nov 07, 2018 at 11:44:59PM +0800, Ming Lei wrote: > blk_update_request() may tell us how much progress made, :-) Except when it doesn't, which is 100% of the time for many block drivers, including nvme.
Re: [PATCH] block: Clear kernel memory before copying to user
On Wed, Nov 7, 2018 at 11:19 PM Keith Busch wrote: > > On Wed, Nov 07, 2018 at 11:09:27PM +0800, Ming Lei wrote: > > On Wed, Nov 7, 2018 at 10:42 PM Keith Busch wrote: > > > > > > If the kernel allocates a bounce buffer for user read data, this memory > > > needs to be cleared before copying it to the user, otherwise it may leak > > > kernel memory to user space. > > > > > > Signed-off-by: Keith Busch > > > --- > > > block/bio.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/block/bio.c b/block/bio.c > > > index d5368a445561..a50d59236b19 100644 > > > --- a/block/bio.c > > > +++ b/block/bio.c > > > @@ -1260,6 +1260,7 @@ struct bio *bio_copy_user_iov(struct request_queue > > > *q, > > > if (ret) > > > goto cleanup; > > > } else { > > > + zero_fill_bio(bio); > > > iov_iter_advance(iter, bio->bi_iter.bi_size); > > > } > > > > This way looks inefficient because zero fill should only be required > > for short READ. > > Sure, but how do you know that happened before copying the bounce buffer > to user space? blk_update_request() may tell us how much progress made, :-) So it should work by calling the following code after the req is completed and before copying data to userspace: __rq_for_each_bio(bio, rq) zero_fill_bio(bio); > > We could zero the pages on allocation if that's better (and doesn't zero > twice if __GFP_ZERO was already provided): > > --- > diff --git a/block/bio.c b/block/bio.c > index d5368a445561..a1b6383294f4 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -1212,6 +1212,9 @@ struct bio *bio_copy_user_iov(struct request_queue *q, > nr_pages = 1 << map_data->page_order; > i = map_data->offset / PAGE_SIZE; > } > + > + if (iov_iter_rw(iter) == READ) > + gfp_mask |= __GFP_ZERO; No big difference compared with before, because most of times the zero fill shouldn't be needed. However, this patch changes to do it every time. Thanks, Ming Lei
Re: [PATCH] block: Clear kernel memory before copying to user
On Wed, Nov 07, 2018 at 11:09:27PM +0800, Ming Lei wrote: > On Wed, Nov 7, 2018 at 10:42 PM Keith Busch wrote: > > > > If the kernel allocates a bounce buffer for user read data, this memory > > needs to be cleared before copying it to the user, otherwise it may leak > > kernel memory to user space. > > > > Signed-off-by: Keith Busch > > --- > > block/bio.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/block/bio.c b/block/bio.c > > index d5368a445561..a50d59236b19 100644 > > --- a/block/bio.c > > +++ b/block/bio.c > > @@ -1260,6 +1260,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q, > > if (ret) > > goto cleanup; > > } else { > > + zero_fill_bio(bio); > > iov_iter_advance(iter, bio->bi_iter.bi_size); > > } > > This way looks inefficient because zero fill should only be required > for short READ. Sure, but how do you know that happened before copying the bounce buffer to user space? We could zero the pages on allocation if that's better (and doesn't zero twice if __GFP_ZERO was already provided): --- diff --git a/block/bio.c b/block/bio.c index d5368a445561..a1b6383294f4 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1212,6 +1212,9 @@ struct bio *bio_copy_user_iov(struct request_queue *q, nr_pages = 1 << map_data->page_order; i = map_data->offset / PAGE_SIZE; } + + if (iov_iter_rw(iter) == READ) + gfp_mask |= __GFP_ZERO; while (len) { unsigned int bytes = PAGE_SIZE; --
Re: [PATCH] block: Clear kernel memory before copying to user
On Wed, Nov 7, 2018 at 10:42 PM Keith Busch wrote: > > If the kernel allocates a bounce buffer for user read data, this memory > needs to be cleared before copying it to the user, otherwise it may leak > kernel memory to user space. > > Signed-off-by: Keith Busch > --- > block/bio.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/block/bio.c b/block/bio.c > index d5368a445561..a50d59236b19 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -1260,6 +1260,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q, > if (ret) > goto cleanup; > } else { > + zero_fill_bio(bio); > iov_iter_advance(iter, bio->bi_iter.bi_size); > } This way looks inefficient because zero fill should only be required for short READ. Thanks, Ming Lei
Re: [PATCH] block: Clear kernel memory before copying to user
On Wed, 2018-11-07 at 07:37 -0700, Keith Busch wrote: > If the kernel allocates a bounce buffer for user read data, this > memory > needs to be cleared before copying it to the user, otherwise it may > leak > kernel memory to user space. > > Signed-off-by: Keith Busch > --- > block/bio.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/block/bio.c b/block/bio.c > index d5368a445561..a50d59236b19 100644 > --- a/block/bio.c > +++ b/block/bio.c > @@ -1260,6 +1260,7 @@ struct bio *bio_copy_user_iov(struct > request_queue *q, > if (ret) > goto cleanup; > } else { > + zero_fill_bio(bio); > iov_iter_advance(iter, bio->bi_iter.bi_size); > } > Straightforward, looks good from my view. Reviewed-by: Laurence Oberman
[PATCH] block: Clear kernel memory before copying to user
If the kernel allocates a bounce buffer for user read data, this memory needs to be cleared before copying it to the user, otherwise it may leak kernel memory to user space. Signed-off-by: Keith Busch --- block/bio.c | 1 + 1 file changed, 1 insertion(+) diff --git a/block/bio.c b/block/bio.c index d5368a445561..a50d59236b19 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1260,6 +1260,7 @@ struct bio *bio_copy_user_iov(struct request_queue *q, if (ret) goto cleanup; } else { + zero_fill_bio(bio); iov_iter_advance(iter, bio->bi_iter.bi_size); } -- 2.14.4