[PATCH] scsi: ses: Guard against page 10 descriptors changes while processing them
SAS page 10 descriptors might change once we re-read them in ses_enclosure_data_process(), causing out-of-bound reads such as this one found by KASAN: [ 321.349000] sd 1:2:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [ 321.357723] sd 1:2:0:0: [sdb] tag#0 CDB: Read(16) 88 00 00 00 00 04 8b 92 0f 80 00 00 00 08 00 00 [ 321.366604] blk_update_request: I/O error, dev sdb, sector 19521474432 [ 321.373346] sd 1:2:0:0: [sdb] tag#0 FAILED Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK [ 321.382074] sd 1:2:0:0: [sdb] tag#0 CDB: Read(16) 88 00 00 00 00 04 8b 92 0f 80 00 00 00 08 00 00 [ 321.390963] blk_update_request: I/O error, dev sdb, sector 19521474432 [ 321.397505] Buffer I/O error on dev sdb, logical block 2440184304, async page read [ 321.736854] == [ 321.744090] BUG: KASAN: slab-out-of-bounds in ses_enclosure_data_process+0xaa1/0xea0 [ses] [ 321.752352] Read of size 1 at addr 8807413710b9 by task systemd-udevd/708 [ 321.759485] [ 321.760990] CPU: 0 PID: 708 Comm: systemd-udevd Tainted: G I 3.10.0-916.el7.test.x86_64 #1 [ 321.771590] Hardware name: IBM -[7147I10]-/Node 1, System Card, BIOS -[MLE179BUS-1.79]- 07/28/2013 [ 321.780630] Call Trace: [ 321.783093] [] dump_stack+0x19/0x1b [ 321.788236] [] print_address_description+0xfc/0x290 [ 321.802162] [] kasan_report.part.3+0x242/0x330 [ 321.808255] [] __asan_report_load1_noabort+0x34/0x40 [ 321.814869] [] ses_enclosure_data_process+0xaa1/0xea0 [ses] [ 321.822094] [] ses_intf_add+0x85d/0xdde [ses] [ 321.828108] [] class_interface_register+0x219/0x330 [ 321.851979] [] scsi_register_interface+0x38/0x50 [ 321.858247] [] ses_init+0x11/0x1000 [ses] [ 321.863911] [] do_one_initcall+0x12a/0x370 [ 321.869665] [] load_module+0x5e3d/0x7550 [ 321.928619] [] SyS_init_module+0x253/0x350 [ 321.986411] [] system_call_fastpath+0x1c/0x21 [ 321.999029] [ 322.000527] Allocated by task 708: [ 322.003931] [] save_stack+0x43/0xe0 [ 322.009100] [] kasan_kmalloc+0xaa/0xe0 [ 322.014524] [] __kmalloc+0xee/0x270 [ 322.019683] [] ses_intf_add+0xa90/0xdde [ses] [ 322.025710] [] class_interface_register+0x219/0x330 [ 322.032259] [] scsi_register_interface+0x38/0x50 [ 322.038552] [] ses_init+0x11/0x1000 [ses] [ 322.044234] [] do_one_initcall+0x12a/0x370 [ 322.050002] [] load_module+0x5e3d/0x7550 [ 322.055599] [] SyS_init_module+0x253/0x350 [ 322.061371] [] system_call_fastpath+0x1c/0x21 [ 322.067399] [ 322.068897] Freed by task 0: [ 322.071782] (stack is not available) [ 322.075358] [ 322.076857] The buggy address belongs to the object at 880741370f00 [ 322.076857] which belongs to the cache kmalloc-512 of size 512 [ 322.089365] The buggy address is located 441 bytes inside of [ 322.089365] 512-byte region [880741370f00, 880741371100) [ 322.101089] The buggy address belongs to the page: [ 322.105885] page:ea001d04dc00 count:1 mapcount:1638426 mapping: (null) index:0x0 [ 322.114404] page flags: 0x2f4080(slab|head) [ 322.119347] page dumped because: kasan: bad access detected [ 322.124917] [ 322.126416] Memory state around the buggy address: [ 322.131212] 880741370f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 322.138431] 880741371000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 322.145653] >880741371080: 00 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc [ 322.152871] ^ [ 322.157925] 880741371100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 322.165144] 880741371180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 322.172360] == [ 322.179577] Disabling lock debugging due to kernel taint [ 322.184954] == [ 322.192180] BUG: KASAN: slab-out-of-bounds in ses_enclosure_data_process+0xaa1/0xea0 [ses] [ 322.200443] Read of size 1 at addr 8807413710bb by task systemd-udevd/708 Stop processing additional descriptors if we are already at the end of page 10 allocated buffer. Signed-off-by: Stefano Brivio --- drivers/scsi/ses.c | 12 +--- 1 file changed, 9 insertions(+), 3 deletions(-) diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c index 62f04c0511cf..d7fcda08a802 100644 --- a/drivers/scsi/ses.c +++ b/drivers/scsi/ses.c @@ -605,9 +605,15 @@ static void ses_enclosure_data_process(struct enclosure_device *edev, /* these elements are optional */ type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_TARGET_PORT || type_ptr[0] == ENCLOSURE_COMPONENT_SCSI_INITIATOR_PORT || -type_ptr[0] == ENCLOSURE_COMPONENT_CONTROLLER_ELECTRONICS)) - addl_desc_ptr += addl_desc_ptr[1]
Re: [PATCH RESEND] lpfc: Don't return internal MBXERR_ERROR code from probe function
On Wed, 6 Sep 2017 11:30:34 +0200 Johannes Thumshirn wrote: > On Wed, Sep 06, 2017 at 11:02:56AM +0200, Stefano Brivio wrote: > > Internal error codes happen to be positive, thus the PCI driver > > core won't treat them as failure, but we do. This would cause a > > crash later on as lpfc_pci_remove_one() is called (e.g. as > > shutdown function). > > > > Fixes: 6d368e532168 ("[SCSI] lpfc 8.3.24: Add resource extent support") > > Signed-off-by: Stefano Brivio > > --- > > This seems to have been ignored. Re-sending as suggested by Johannes. > > > > drivers/scsi/lpfc/lpfc_init.c | 1 + > > 1 file changed, 1 insertion(+) > > > > diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c > > index 491aa95eb0f6..38cc2b5bb5a2 100644 > > --- a/drivers/scsi/lpfc/lpfc_init.c > > +++ b/drivers/scsi/lpfc/lpfc_init.c > > @@ -6118,6 +6118,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) > > "Extents and RPI headers enabled.\n"); > > } > > mempool_free(mboxq, phba->mbox_mem_pool); > > + rc = -EIO; > > goto out_free_bsmbx; > > } > > > > -- > > 2.9.4 > > The patch looks good, but there are lots of > if (rc) { > mempool_free(mboxq, phba->mbox_mem_pool); > rc = -EIO; > goto out_free_bsmbx; > } > > in lpfc_sli4_driver_resource_setup(). Shouldn't out_free_bsmbx take care of it > all so we only have: > if (rc) > goto out_free_bsmbx; Thanks for your feedback! I considered doing something similar, but there are different error coded which are set when we reach the label out_free_mbsx. I checked all of them (and I hope I didn't miss any), but they all looked correct, and in quite a few cases different than -EIO (e.g. -ENODEV). So I think always returning -EIO in those cases is not what we want. > Because as this patch shows there's always a chance to miss an 'rc = -EIO'. > > Out of curiosity, do you know what's the value of rc in the failure case? Yes, MBXERR_ERROR (mentioned in patch subject -- sorry, I could have repeated it in the message perhaps). -- Stefano
[PATCH RESEND] lpfc: Don't return internal MBXERR_ERROR code from probe function
Internal error codes happen to be positive, thus the PCI driver core won't treat them as failure, but we do. This would cause a crash later on as lpfc_pci_remove_one() is called (e.g. as shutdown function). Fixes: 6d368e532168 ("[SCSI] lpfc 8.3.24: Add resource extent support") Signed-off-by: Stefano Brivio --- This seems to have been ignored. Re-sending as suggested by Johannes. drivers/scsi/lpfc/lpfc_init.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 491aa95eb0f6..38cc2b5bb5a2 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -6118,6 +6118,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) "Extents and RPI headers enabled.\n"); } mempool_free(mboxq, phba->mbox_mem_pool); + rc = -EIO; goto out_free_bsmbx; } -- 2.9.4
Re: [PATCH] lpfc: Don't return internal MBXERR_ERROR code from probe function
On Wed, 6 Sep 2017 10:42:35 +0200 Johannes Thumshirn wrote: > On Wed, Sep 06, 2017 at 10:32:22AM +0200, Stefano Brivio wrote: > > > > I didn't get feedback about this patch. Is there any issue with the > > submission? > > > > I think it actually fixes a quite critical issue, if initialization > > fails we have crashes on reboot like the one reported below [1], and > > should perhaps also be queued for -stable. > > It seems to have slipped trough the cracks, can you please re-submit? The original submission is archived at https://marc.info/?l=linux-scsi&m=150392554622786&w=2. Before I cause any confusion... do you want me to re-submit this with the same subject? As v2 with a comment? -- Stefano
Re: [PATCH] lpfc: Don't return internal MBXERR_ERROR code from probe function
Hi, On Mon, 28 Aug 2017 15:05:23 +0200 Stefano Brivio wrote: > Internal error codes happen to be positive, thus the PCI driver > core won't treat them as failure, but we do. This would cause a > crash later on as lpfc_pci_remove_one() is called (e.g. as > shutdown function). > > Fixes: 6d368e532168 ("[SCSI] lpfc 8.3.24: Add resource extent support") > Signed-off-by: Stefano Brivio > --- > drivers/scsi/lpfc/lpfc_init.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c > index 491aa95eb0f6..38cc2b5bb5a2 100644 > --- a/drivers/scsi/lpfc/lpfc_init.c > +++ b/drivers/scsi/lpfc/lpfc_init.c > @@ -6118,6 +6118,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) > "Extents and RPI headers enabled.\n"); > } > mempool_free(mboxq, phba->mbox_mem_pool); > + rc = -EIO; > goto out_free_bsmbx; > } > I didn't get feedback about this patch. Is there any issue with the submission? I think it actually fixes a quite critical issue, if initialization fails we have crashes on reboot like the one reported below [1], and should perhaps also be queued for -stable. [1] [ 568.638555] BUG: unable to handle kernel NULL pointer dereference at 07c0 [ 568.679154] IP: lpfc_pci_remove_one+0x20/0x890 [lpfc] [ 568.704062] PGD 0 [ 568.704063] [ 568.721714] Oops: [#1] SMP [ 568.736895] Modules linked in: fuse vfat msdos fat binfmt_misc xfs fcoe libfcoe libfc rpcrdma ib_isert iscsi_target_mod ib_iser ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_core intel_rapl x86_pkg_temp_thermal intel_powerclamp intel_cstate intel_uncore intel_rapl_perf sg ipmi_si hpilo ipmi_devintf pcspkr ipmi_msghandler lpc_ich ioatdma shpchp dca pcc_cpufreq acpi_cpufreq ext4 jbd2 mbcache loop nfsv3 nfs_acl nfs lockd grace fscache sd_mod 8021q garp stp llc mrp mgag200 ata_generic i2c_algo_bit pata_acpi drm_kms_helper syscopyarea sfc sysfillrect sysimgblt fb_sys_fops ttm tg3 mtd crct10dif_pclmul lpfc crc32_pclmul ata_piix drm ptp hpsa serio_raw scsi_transport_fc be2net ghash_clmulni_intel libata i2c_core scsi_transport_sas pps_core [ 569.082589] mdio wmi sunrpc xts lrw gf128mul mcryptd dm_crypt dm_round_robin dm_multipath dm_snapshot dm_bufio dm_mirror dm_region_hash dm_log dm_zero dm_mod linear raid10 raid456 async_raid6_recov async_memcpy libcrc32c crc32c_intel async_pq async_xor xor async_tx raid6_pq raid1 raid0 iscsi_ibft iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi squashfs cramfs edd ctr(E) [ 569.254181] CPU: 9 PID: 46361 Comm: reboot Tainted: GE 4.11.0-22.el7a.x86_64 #1 [ 569.301048] Hardware name: HP ProLiant DL388p Gen8, BIOS P70 12/14/2012 [ 569.332878] task: 880801161680 task.stack: c9000514 [ 569.362693] RIP: 0010:lpfc_pci_remove_one+0x20/0x890 [lpfc] [ 569.390275] RSP: 0018:c90005143d18 EFLAGS: 00010296 [ 569.416040] RAX: RBX: 8804bd9cf000 RCX: [ 569.449874] RDX: RSI: 0246 RDI: 8804bd9cf000 [ 569.484402] RBP: c90005143d78 R08: 8804bd9cf0b8 R09: 0006 [ 569.519146] R10: 0020 R11: ea0020975b80 R12: 8804bd9cf000 [ 569.553408] R13: a057a2a0 R14: 8804bd9cf100 R15: fee1dead [ 569.588474] FS: 7f99fe258880() GS:88083f4c() knlGS: [ 569.628288] CS: 0010 DS: ES: CR0: 80050033 [ 569.656916] CR2: 07c0 CR3: 000825d91000 CR4: 000406e0 [ 569.693185] Call Trace: [ 569.705044] pci_device_shutdown+0x36/0x70 [ 569.725702] device_shutdown+0xdf/0x190 [ 569.744657] kernel_restart_prepare+0x36/0x40 [ 569.767099] kernel_restart+0x12/0x60 [ 569.784916] SYSC_reboot+0x1f3/0x220 [ 569.802649] ? __alloc_fd+0x46/0x170 [ 569.819539] ? vfs_writev+0x3c/0x50 [ 569.836206] ? do_writev+0x61/0xf0 [ 569.852730] SyS_reboot+0xe/0x10 [ 569.868375] entry_SYSCALL_64_fastpath+0x1a/0xa9 [ 569.890588] RIP: 0033:0x7f99fd065a56 [ 569.907894] RSP: 002b:7ffc8c5e2b58 EFLAGS: 0202 ORIG_RAX: 00a9 [ 569.944700] RAX: ffda RBX: ff91 RCX: 7f99fd065a56 [ 569.978659] RDX: 01234567 RSI: 28121969 RDI: fee1dead [ 570.013272] RBP: R08: 5583f9de32a0 R09: 7ffc8c5e2220 [ 570.048039] R10: 0024 R11: 0202 R12: 5583f9d6ef13 [ 570.082565] R13: 7ffc8c5e2e20 R14: R15: [ 570.116914] Code: 90 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 55 48 89 e5 41 57 41 56 41 55 41 54 53 48 89 fb 48 83 ec 38 48 8b 87 38 01 00 00 <4
[PATCH] lpfc: Don't return internal MBXERR_ERROR code from probe function
Internal error codes happen to be positive, thus the PCI driver core won't treat them as failure, but we do. This would cause a crash later on as lpfc_pci_remove_one() is called (e.g. as shutdown function). Fixes: 6d368e532168 ("[SCSI] lpfc 8.3.24: Add resource extent support") Signed-off-by: Stefano Brivio --- drivers/scsi/lpfc/lpfc_init.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 491aa95eb0f6..38cc2b5bb5a2 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -6118,6 +6118,7 @@ lpfc_sli4_driver_resource_setup(struct lpfc_hba *phba) "Extents and RPI headers enabled.\n"); } mempool_free(mboxq, phba->mbox_mem_pool); + rc = -EIO; goto out_free_bsmbx; } -- 2.9.4