Re: [4.10, panic, regression] iscsi: null pointer deref at iscsi_tcp_segment_done+0x20d/0x2e0
On Wed, 21 Dec 2016, Linus Torvalds wrote:
> On Wed, Dec 21, 2016 at 9:13 PM, Dave Chinner wrote:
> >
> > There may be deeper issues. I just started running scalability tests
> > (e.g. 16-way fsmark create tests) and about a minute in I got a
> > directory corruption reported - something I hadn't seen in the dev
> > cycle at all.
>
> By "in the dev cycle", do you mean your XFS changes, or have you been
> tracking the merge cycle at least for some testing?
>
> > I unmounted the fs, mkfs'd it again, ran the
> > workload again and about a minute in this fired:
> >
> > [628867.607417] [ cut here ]
> > [628867.608603] WARNING: CPU: 2 PID: 16925 at mm/workingset.c:461
> > shadow_lru_isolate+0x171/0x220
>
> Well, part of the changes during the merge window were the shadow
> entry tracking changes that came in through Andrew's tree. Adding
> Johannes Weiner to the participants.
>
> > Now, this workload does not touch the page cache at all - it's
> > entirely an XFS metadata workload, so it should not really be
> > affecting the working set code.
>
> Well, I suspect that anything that creates memory pressure will end up
> triggering the working set code, so ..
>
> That said, obviously memory corruption could be involved and result in
> random issues too, but I wouldn't really expect that in this code.
>
> It would probably be really useful to get more data points - is the
> problem reliably in this area, or is it going to be random and all
> over the place.

Data point: kswapd got WARNING on mm/workingset.c:457 in
shadow_lru_isolate, soon followed by NULL pointer deref in
list_lru_isolate, one time when I tried out Sunday's git tree.
Not seen since, I haven't had time to investigate, just set it
aside as something to worry about if it happens again. But it
looks like shadow_lru_isolate() has issues beyond Dave's case
(I've no XFS and no iscsi), suspect unrelated to his other problems.

Hugh

>
> That said:
>
> > And worse, on that last error, the /host/ is now going into meltdown
> > (running 4.7.5) with 32 CPUs all burning down in ACPI code:
>
> The obvious question here is how much you trust the environment if the
> host ends up also showing problems. Maybe you do end up having hw
> issues pop up too.
>
> The primary suspect would presumably be the development kernel you're
> testing triggering something, but it has to be asked..
>
> Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH rc8-mm1] hotfix libata-scsi corruption
On Tue, 22 Jan 2008, James Bottomley wrote:
> > --- 2.6.24-rc8-mm1/drivers/ata/libata-scsi.c	2008-01-17 16:49:47.0 +
> > +++ linux/drivers/ata/libata-scsi.c	2008-01-22 15:45:40.0 +
> > @@ -826,7 +826,7 @@ static void ata_scsi_sdev_config(struct
> >  	sdev->max_device_blocked = 1;
> >
> >  	/* set the min alignment */
> > -	blk_queue_update_dma_alignment(sdev->request_queue, ATA_DMA_PAD_SZ - 1);
> > +	blk_queue_update_dma_alignment(sdev->request_queue, ATA_SECT_SIZE - 1);
> >  }
> >
> >  static void ata_scsi_dev_config(struct scsi_device *sdev,
>
> Unfortunately, that's likely not the entire hot fix ... the implication
> is that we have some mapping error in the way we do direct SG_IO.

Quite possibly, I'm not sure.

> What the fix you propose does is make it far more likely that block will
> copy, perform I/O then uncopy (almost certain, since most smartd data
> transfers are well under ATA_SECT_SIZE, which is 512). However,
> implicating a generic path like this implies that we would get the same
> problem for SCSI commands as well, so the correct hot fix is below.

I've not noticed any problems from the normal activity of the system,
only from smartd's sg_ioctl. My impression was that it's a libata
issue, because it's going through ata_pio_sector, which does
	ap->ops->data_xfer(qc->dev, buf + offset, qc->sect_size, do_write);
referring to sect_size, without considering the possibility of any
smaller I/O size. (Me, I don't even know why it's going PIO rather
than DMA: I'm assuming smartd does things that way, but there's no
limit to my ignorance here.)

> However, I'd like to see if we can track the problem through the SG_IO
> direct path ... how many adjacent page bytes are corrupt? Just a few or
> a large number (I'm wondering if it's an off by one or off by alignment
> type bug)?

I've assumed it's just the one next page: because ata_pio_sector is
doing a data_xfer of sect_size ATA_SECT_SIZE 512 to an offset above
0xe00 in the smartd stack page.

The time I actually saw corruption rather than an oops at startup,
it was in a tmpfs swap vector page running 64-bit kernel, and I
didn't examine any further pages (just checked the page before and
matched it up to smartd's stack, already suspecting that). I don't
believe it's an off-by-one at your SCSI end.

Hugh
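The arithmetic behind "an offset above 0xe00" can be checked with a small user-space sketch. It is not kernel code: the 4096-byte page size and the names SKETCH_PAGE_SIZE, SKETCH_SECT_SIZE and bytes_past_page_end() are assumptions made for illustration. It only shows that a fixed 512-byte transfer starting above offset 0xe00 in a page runs past the end of that page, whatever length the caller actually asked for.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative constants: a 4096-byte page is assumed here. */
#define SKETCH_PAGE_SIZE 4096u
#define SKETCH_SECT_SIZE 512u	/* ATA_SECT_SIZE is 512 per the thread */

/*
 * Return how many bytes of a transfer of `len` bytes, starting at
 * `offset` within a page, would land beyond the end of that page.
 * Zero means the transfer stays inside the page.
 */
static size_t bytes_past_page_end(size_t offset, size_t len)
{
	assert(offset < SKETCH_PAGE_SIZE);

	if (offset + len <= SKETCH_PAGE_SIZE)
		return 0;
	return offset + len - SKETCH_PAGE_SIZE;
}
```

For example, bytes_past_page_end(0xe10, SKETCH_SECT_SIZE) reports 16 bytes written into whatever follows the page, while the same call with offset 0xe00 reports none.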
Re: [PATCH rc8-mm1] hotfix libata-scsi corruption
On Tue, 22 Jan 2008, James Bottomley wrote:
>
> Actually, I don't think it's a smaller I/O issue. The SMART protocol
> specifically mandates that the transfers for SMART READ DATA and SMART
> READ LOG shall be 512 bytes. However, the pio transfer routine does
> seem to be assuming sector alignment as well, which will be where your
> problems are coming from. I think we need to specify sector minimum
> alignment for ata (but not atapi, which has its own non sector size pio
> routine).
>
> How about the attached? We have to do this for all ATA devices, because
> they'll likely all support SMART, and SMART is defined to be a PIO
> command.

Thanks, you've answered several of my uncertainties (why the PIO,
why the sector size). I've just tried it, and can confirm that
your replacement patch below fixes the issue for me in practice.

What I can't say, you and Jeff and others will judge, is whether
that's actually the right placement for the blk_queue_update_dma_alignment
call (as an outsider, I'm not convinced that the SMART requirement
should be imposing this limitation on the rest).

Hugh

> diff --git a/drivers/ata/libata-scsi.c b/drivers/ata/libata-scsi.c
> index 4bb268b..bc5cf6b 100644
> --- a/drivers/ata/libata-scsi.c
> +++ b/drivers/ata/libata-scsi.c
> @@ -824,9 +824,6 @@ static void ata_scsi_sdev_config(struct scsi_device *sdev)
>  	 * requests.
>  	 */
>  	sdev->max_device_blocked = 1;
> -
> -	/* set the min alignment */
> -	blk_queue_update_dma_alignment(sdev->request_queue, ATA_DMA_PAD_SZ - 1);
>  }
>
>  static void ata_scsi_dev_config(struct scsi_device *sdev,
> @@ -842,7 +839,14 @@ static void ata_scsi_dev_config(struct scsi_device *sdev,
>  	if (dev->class == ATA_DEV_ATAPI) {
>  		struct request_queue *q = sdev->request_queue;
>  		blk_queue_max_hw_segments(q, q->max_hw_segments - 1);
> -	}
> +
> +		/* set the min alignment */
> +		blk_queue_update_dma_alignment(sdev->request_queue,
> +					       ATA_DMA_PAD_SZ - 1);
> +	} else
> +		/* ATA devices must be sector aligned */
> +		blk_queue_update_dma_alignment(sdev->request_queue,
> +					       ATA_SECT_SIZE - 1);
>
>  	if (dev->flags & ATA_DFLAG_AN)
>  		set_bit(SDEV_EVT_MEDIA_CHANGE, sdev->supported_events);
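The patch passes "size - 1" because the queue's DMA alignment is stored as a bit mask. A rough user-space sketch of how such a mask separates a buffer that can be mapped directly from one that has to be copied through a bounce buffer might look as follows. This is an illustration only, not the block layer's actual code: can_map_directly() is a hypothetical helper, the assumption that both address and length are tested against the mask is mine, and the value 4 for the pad size is assumed rather than taken from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ATA_DMA_PAD_SZ 4u	/* assumed value, for illustration */
#define SKETCH_ATA_SECT_SIZE 512u	/* 512 per the thread */

/*
 * Hypothetical helper: true when a user buffer satisfies the alignment
 * mask and so could be handed to the device without a copy. `mask` is
 * the alignment minus one, e.g. SKETCH_ATA_SECT_SIZE - 1.
 */
static bool can_map_directly(uintptr_t addr, size_t len, unsigned int mask)
{
	/* The mask must be a power of two minus one. */
	assert(((mask + 1u) & mask) == 0);

	return (addr & mask) == 0 && (len & mask) == 0;
}
```

With the pad-size mask, a 512-byte buffer at an address ending in 0xe10 passes the test and is mapped in place; with the sector-size mask the same buffer fails it, which under the assumption above would push the request onto the copy path and keep the PIO transfer inside a kernel-owned buffer.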
Re: [PATCH rc8-mm1] hotfix libata-scsi corruption
On Tue, 22 Jan 2008, Alan Cox wrote:
> > However, I'd like to see if we can track the problem through the SG_IO
> > direct path ... how many adjacent page bytes are corrupt? Just a few or
> > a large number (I'm wondering if it's an off by one or off by alignment
> > type bug)?

We moved away from that concern

> Which ATA controller is involved - in theory ATA DMA is byte aligned safe
> (or dword anyway) in practice I don't know if we've ever tested the non
> 512 byte aligned case historically for ATA just ATAPI ?

but if it's still relevant, this was with ata_piix.

Hugh
Re: [PATCH rc8-mm1] hotfix libata-scsi corruption
On Tue, 22 Jan 2008, James Bottomley wrote:
>
> libsas looks to be OK because it specifically kmallocs a 512 byte buffer
> which should (for off slab data) be 512 byte aligned.

I don't remember the various SLAB and SLOB and SLUB rules offhand:
I'm not sure it's safe to rely on such alignment on all of them

> libata actually has an issue because the usual location for
> IDENTIFY_DEVICE data is inside a struct ata_device, which is highly
> unlikely to be correctly aligned. Fortunately, I think we can only get
> the bug if we actually cross a page boundary for non contiguous pages in
> the identify data, which a kernel allocation will never do, so libata
> should be safe as well.

but this would trump it: yes, we don't need 512-byte alignment for this,
and it is okay to cross a page boundary, just so long as the start of
the next page really belongs to our buffer not somebody else's.

There doesn't seem much likelihood of anyone vmalloc'ing the buffer in
which that IDENTIFY_DEVICE gets done. Though this discussion does make
me wonder whether ata_pio_sector ought to have a BUG_ON (and yes, a
BUG_ON rather than a WARN_ON) against the possibility.

Hugh
Re: [patch 14/28] scsi: fix CONFIG_SCSI_WAIT_SCAN=m
Regression still outstanding: ping?

On Fri, 18 May 2007, Hugh Dickins wrote:
> On Fri, 11 May 2007, Hugh Dickins wrote:
> > On Thu, 10 May 2007, [EMAIL PROTECTED] wrote:
> > > From: Hugh Dickins [EMAIL PROTECTED]
> > >
> > > CONFIG_MODULES=y
> > > CONFIG_SCSI=y
> > > CONFIG_SCSI_SCAN_ASYNC=y
> > > CONFIG_SCSI_WAIT_SCAN=m
> > > 2.6.21-rc5-mm2 VFS panics unable to find my root on /dev/sda2,
> > > but boots okay if I change drivers/scsi/Kconfig to default y
> > > instead of default m for SCSI_WAIT_SCAN.
> > >
> > > Make sure there's a late_initcall to scsi_complete_async_scans when
> > > it's built in, so a monolithic SCSI_SCAN_ASYNC kernel can rely on
> > > the scans being completed before trying to mount root, even if
> > > they're slow.
> >
> > Thanks for sending this through to James again, Andrew. I'd been
> > gearing up to report it to the regression police: it's certainly
> > still a problem in 2.6.21-git.
>
> And still a problem in 2.6.22-rc1-git7. Any further news on
> this regression since 2.6.21? I was hoping that a fix for
> 2.6.22-rc2 would emerge in the course of the Asynchronous scsi
> scanning thread, but that doesn't appear to be heading in any
> useful direction.
>
> It wouldn't be a problem for me to undo my CONFIG_SCSI_SCAN_ASYNC=y,
> but it would still be a little regressive.
>
> I thought a late_initcall was guaranteed to be called after all the
> potential-root scsi scans had been started; but admit I'm ignorant
> of both initcall sequencing and scsi probing. Would a
> late_initcall_sync or a rootfs_initcall be more to the point?
>
> But ignore my gropings, I'd rather be testing what you believe to be
> the right fix.
>
> Thanks,
> Hugh
>
> > > [EMAIL PROTECTED]: build fixes]
> > > Signed-off-by: Hugh Dickins [EMAIL PROTECTED]
> > > Cc: James Bottomley [EMAIL PROTECTED]
> > > ---
> > >
> > > James sayeth This isn't the right fix ... this has to be invoked
> > > last in the call sequence ... I can't see another way of doing this
> > > except yet another file added as a final component to the link.
> >
> > I didn't reply to James' mail at the time, hoping the right fix was
> > about to follow. I don't see what's wrong with the late_initcall
> > myself.
> >
> > Hugh
> >
> > > Signed-off-by: Andrew Morton [EMAIL PROTECTED]
> > > ---
> > >
> > >  drivers/scsi/scsi_scan.c |    9 +++++++++
> > >  1 file changed, 9 insertions(+)
> > >
> > > diff -puN drivers/scsi/scsi_scan.c~scsi-fix-config_scsi_wait_scan=m drivers/scsi/scsi_scan.c
> > > --- a/drivers/scsi/scsi_scan.c~scsi-fix-config_scsi_wait_scan=m
> > > +++ a/drivers/scsi/scsi_scan.c
> > > @@ -184,6 +184,15 @@ int scsi_complete_async_scans(void)
> > >  /* Only exported for the benefit of scsi_wait_scan */
> > >  EXPORT_SYMBOL_GPL(scsi_complete_async_scans);
> > >
> > > +#ifndef MODULE
> > > +/*
> > > + * For async scanning we need to wait for all the scans to complete before
> > > + * trying to mount the root fs. Otherwise non-modular drivers may not be ready
> > > + * yet.
> > > + */
> > > +late_initcall(scsi_complete_async_scans);
> > > +#endif
> > > +
> > >  /**
> > >   * scsi_unlock_floptical - unlock device via a special MODE SENSE command
> > >   * @sdev:	scsi device to send command to
Re: [patch 14/28] scsi: fix CONFIG_SCSI_WAIT_SCAN=m
On Fri, 11 May 2007, Hugh Dickins wrote:
> On Thu, 10 May 2007, [EMAIL PROTECTED] wrote:
> > From: Hugh Dickins [EMAIL PROTECTED]
> >
> > CONFIG_MODULES=y
> > CONFIG_SCSI=y
> > CONFIG_SCSI_SCAN_ASYNC=y
> > CONFIG_SCSI_WAIT_SCAN=m
> > 2.6.21-rc5-mm2 VFS panics unable to find my root on /dev/sda2,
> > but boots okay if I change drivers/scsi/Kconfig to default y
> > instead of default m for SCSI_WAIT_SCAN.
> >
> > Make sure there's a late_initcall to scsi_complete_async_scans when
> > it's built in, so a monolithic SCSI_SCAN_ASYNC kernel can rely on
> > the scans being completed before trying to mount root, even if
> > they're slow.
>
> Thanks for sending this through to James again, Andrew. I'd been
> gearing up to report it to the regression police: it's certainly
> still a problem in 2.6.21-git.

And still a problem in 2.6.22-rc1-git7. Any further news on
this regression since 2.6.21? I was hoping that a fix for
2.6.22-rc2 would emerge in the course of the Asynchronous scsi
scanning thread, but that doesn't appear to be heading in any
useful direction.

It wouldn't be a problem for me to undo my CONFIG_SCSI_SCAN_ASYNC=y,
but it would still be a little regressive.

I thought a late_initcall was guaranteed to be called after all the
potential-root scsi scans had been started; but admit I'm ignorant
of both initcall sequencing and scsi probing. Would a
late_initcall_sync or a rootfs_initcall be more to the point?

But ignore my gropings, I'd rather be testing what you believe to be
the right fix.

Thanks,
Hugh

> > [EMAIL PROTECTED]: build fixes]
> > Signed-off-by: Hugh Dickins [EMAIL PROTECTED]
> > Cc: James Bottomley [EMAIL PROTECTED]
> > ---
> >
> > James sayeth This isn't the right fix ... this has to be invoked
> > last in the call sequence ... I can't see another way of doing this
> > except yet another file added as a final component to the link.
>
> I didn't reply to James' mail at the time, hoping the right fix was
> about to follow. I don't see what's wrong with the late_initcall
> myself.
>
> Hugh
>
> > Signed-off-by: Andrew Morton [EMAIL PROTECTED]
> > ---
> >
> >  drivers/scsi/scsi_scan.c |    9 +++++++++
> >  1 file changed, 9 insertions(+)
> >
> > diff -puN drivers/scsi/scsi_scan.c~scsi-fix-config_scsi_wait_scan=m drivers/scsi/scsi_scan.c
> > --- a/drivers/scsi/scsi_scan.c~scsi-fix-config_scsi_wait_scan=m
> > +++ a/drivers/scsi/scsi_scan.c
> > @@ -184,6 +184,15 @@ int scsi_complete_async_scans(void)
> >  /* Only exported for the benefit of scsi_wait_scan */
> >  EXPORT_SYMBOL_GPL(scsi_complete_async_scans);
> >
> > +#ifndef MODULE
> > +/*
> > + * For async scanning we need to wait for all the scans to complete before
> > + * trying to mount the root fs. Otherwise non-modular drivers may not be ready
> > + * yet.
> > + */
> > +late_initcall(scsi_complete_async_scans);
> > +#endif
> > +
> >  /**
> >   * scsi_unlock_floptical - unlock device via a special MODE SENSE command
> >   * @sdev:	scsi device to send command to
> > _
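The question about late_initcall, late_initcall_sync and rootfs_initcall comes down to the order in which built-in initcalls run. The user-space sketch below models that order under two assumptions that should be checked against include/linux/init.h for the kernel in question: levels run in the sequence fs, rootfs, device, late, late_sync, and calls within one level run in link order. The sketch_* names and the example entries other than scsi_complete_async_scans are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed level sequence, earliest first; a model, not kernel code. */
enum sketch_level { SK_FS, SK_ROOTFS, SK_DEVICE, SK_LATE, SK_LATE_SYNC };

struct sketch_initcall {
	enum sketch_level level;
	const char *name;
};

/*
 * Fill `order` with the call names in the sequence they would run:
 * ascending level, and array ("link") order within a level.
 * Returns the number of names written.
 */
static size_t sketch_run_order(const struct sketch_initcall *calls, size_t n,
			       const char **order)
{
	size_t out = 0;

	for (int lvl = SK_FS; lvl <= SK_LATE_SYNC; lvl++)
		for (size_t i = 0; i < n; i++)
			if ((int)calls[i].level == lvl)
				order[out++] = calls[i].name;
	return out;
}

/* Return 1 if `a` appears before `b` in `order`, 0 otherwise. */
static int sketch_runs_before(const char **order, size_t n,
			      const char *a, const char *b)
{
	size_t ia = n, ib = n;

	for (size_t i = 0; i < n; i++) {
		if (ia == n && strcmp(order[i], a) == 0)
			ia = i;
		if (ib == n && strcmp(order[i], b) == 0)
			ib = i;
	}
	return ia < n && ib < n && ia < ib;
}
```

In this model a device-level driver init always precedes a late-level scsi_complete_async_scans, but another late-level call linked after it still runs afterwards, which may be the concern behind "this has to be invoked last in the call sequence". A rootfs-level call would run before the device level, so under these assumptions it would be earlier, not later.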
Re: [patch 14/28] scsi: fix CONFIG_SCSI_WAIT_SCAN=m
On Thu, 10 May 2007, [EMAIL PROTECTED] wrote:
> From: Hugh Dickins [EMAIL PROTECTED]
>
> CONFIG_MODULES=y
> CONFIG_SCSI=y
> CONFIG_SCSI_SCAN_ASYNC=y
> CONFIG_SCSI_WAIT_SCAN=m
> 2.6.21-rc5-mm2 VFS panics unable to find my root on /dev/sda2,
> but boots okay if I change drivers/scsi/Kconfig to default y
> instead of default m for SCSI_WAIT_SCAN.
>
> Make sure there's a late_initcall to scsi_complete_async_scans when
> it's built in, so a monolithic SCSI_SCAN_ASYNC kernel can rely on
> the scans being completed before trying to mount root, even if
> they're slow.

Thanks for sending this through to James again, Andrew. I'd been
gearing up to report it to the regression police: it's certainly
still a problem in 2.6.21-git.

> [EMAIL PROTECTED]: build fixes]
> Signed-off-by: Hugh Dickins [EMAIL PROTECTED]
> Cc: James Bottomley [EMAIL PROTECTED]
> ---
>
> James sayeth This isn't the right fix ... this has to be invoked
> last in the call sequence ... I can't see another way of doing this
> except yet another file added as a final component to the link.

I didn't reply to James' mail at the time, hoping the right fix was
about to follow. I don't see what's wrong with the late_initcall
myself.

Hugh

> Signed-off-by: Andrew Morton [EMAIL PROTECTED]
> ---
>
>  drivers/scsi/scsi_scan.c |    9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff -puN drivers/scsi/scsi_scan.c~scsi-fix-config_scsi_wait_scan=m drivers/scsi/scsi_scan.c
> --- a/drivers/scsi/scsi_scan.c~scsi-fix-config_scsi_wait_scan=m
> +++ a/drivers/scsi/scsi_scan.c
> @@ -184,6 +184,15 @@ int scsi_complete_async_scans(void)
>  /* Only exported for the benefit of scsi_wait_scan */
>  EXPORT_SYMBOL_GPL(scsi_complete_async_scans);
>
> +#ifndef MODULE
> +/*
> + * For async scanning we need to wait for all the scans to complete before
> + * trying to mount the root fs. Otherwise non-modular drivers may not be ready
> + * yet.
> + */
> +late_initcall(scsi_complete_async_scans);
> +#endif
> +
>  /**
>   * scsi_unlock_floptical - unlock device via a special MODE SENSE command
>   * @sdev:	scsi device to send command to
> _