Re: BLKZEROOUT not zeroing md dev on VMDK
On Wed, Jun 15, 2016 at 06:17:37PM +, Arvind Kumar wrote:
> It is possibly some race. We saw a WRITE SAME related issue in past
> for which Petr sent out a patch but looks like the patch didn't make
> it. :(
>
> https://groups.google.com/forum/#!topic/linux.kernel/1WGDSlyY0y0

Indeed - the investigation you folks did is linked to within the upstream Bugzilla bug (see https://bugzilla.kernel.org/show_bug.cgi?id=118581#c2 ). Hopefully this issue will be resolved but there's still some debate over on http://thread.gmane.org/gmane.linux.kernel/2236800 . The problem is that it is causing real problems in stable kernels (data not being correctly zero'd) today...

--
Sitsofe | http://sucs.org/~sits/
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BLKZEROOUT not zeroing md dev on VMDK
It is possibly some race. We saw a WRITE SAME related issue in past for which Petr sent out a patch but looks like the patch didn't make it. :(

https://groups.google.com/forum/#!topic/linux.kernel/1WGDSlyY0y0

Thanks!
Arvind

From: Sitsofe Wheeler
Sent: Tuesday, May 31, 2016 10:04 PM
To: Tom Yan
Cc: Darrick J. Wong; Shaohua Li; Jens Axboe; Arvind Kumar; VMware PV-Drivers; linux-r...@vger.kernel.org; linux-scsi@vger.kernel.org; linux-bl...@vger.kernel.org; linux-ker...@vger.kernel.org
Subject: Re: BLKZEROOUT not zeroing md dev on VMDK

On 27 May 2016 at 10:30, Tom Yan wrote:
> There seems to be some sort of race condition between
> blkdev_issue_zeroout() and the scsi disk driver (disabling write same
> after an illegal request). On my UAS drive, sometimes `blkdiscard -z
> /dev/sdX` will return right away, even though if I then check
> `write_same_max_bytes` it has turned 0. Sometimes it will just write
> zero with SCSI WRITE even if `write_same_max_bytes` is 33553920 before
> I issue `blkdiscard -z` (`write_same_max_bytes` also turned 0, as
> expected).
>
> Not sure if it is directly related to the case here though.

I'm not aware of hitting that particular problem myself directly on the underlying "SCSI" device but the patch on https://patchwork.kernel.org/patch/9137311/ should be able to resolve that issue. Could you test it and follow up on http://permalink.gmane.org/gmane.linux.kernel/2229377 ?

I'm hoping more testing reports will lead to the patch being reviewed and accepted sooner rather than later as it's currently stalled...

--
Sitsofe | http://sucs.org/~sits/
Re: BLKZEROOUT not zeroing md dev on VMDK
On 27 May 2016 at 10:30, Tom Yan wrote:
> There seems to be some sort of race condition between
> blkdev_issue_zeroout() and the scsi disk driver (disabling write same
> after an illegal request). On my UAS drive, sometimes `blkdiscard -z
> /dev/sdX` will return right away, even though if I then check
> `write_same_max_bytes` it has turned 0. Sometimes it will just write
> zero with SCSI WRITE even if `write_same_max_bytes` is 33553920 before
> I issue `blkdiscard -z` (`write_same_max_bytes` also turned 0, as
> expected).
>
> Not sure if it is directly related to the case here though.

I'm not aware of hitting that particular problem myself directly on the underlying "SCSI" device but the patch on https://patchwork.kernel.org/patch/9137311/ should be able to resolve that issue. Could you test it and follow up on http://permalink.gmane.org/gmane.linux.kernel/2229377 ?

I'm hoping more testing reports will lead to the patch being reviewed and accepted sooner rather than later as it's currently stalled...

--
Sitsofe | http://sucs.org/~sits/
Re: BLKZEROOUT not zeroing md dev on VMDK
There seems to be some sort of race condition between blkdev_issue_zeroout() and the scsi disk driver (disabling write same after an illegal request). On my UAS drive, sometimes `blkdiscard -z /dev/sdX` will return right away, even though if I then check `write_same_max_bytes` it has turned 0. Sometimes it will just write zero with SCSI WRITE even if `write_same_max_bytes` is 33553920 before I issue `blkdiscard -z` (`write_same_max_bytes` also turned 0, as expected).

Not sure if it is directly related to the case here though.

On 27 May 2016 at 12:45, Sitsofe Wheeler wrote:
> On 27 May 2016 at 05:18, Darrick J. Wong wrote:
>>
>> It's possible that the pvscsi device advertised WRITE SAME, but if the device
>> sends back ILLEGAL REQUEST then the SCSI disk driver will set
>> write_same_max_bytes=0. Subsequent BLKZEROOUT attempts will then issue writes
>> of zeroes to the drive.
>
> Thanks for following up on this but that's not what happens on the md
> device - you can go on to issue as many BLKZEROOUT requests as you
> like but the md disk is never zeroed nor is an error returned.
>
> I filed a bug at https://bugzilla.kernel.org/show_bug.cgi?id=118581
> (see https://bugzilla.kernel.org/show_bug.cgi?id=118581#c6 for
> alternative reproduction steps that use scsi_debug and can be reworked
> to impact device mapper) and Shaohua Li noted that
> blkdev_issue_write_same could return 0 even when the disk didn't
> support write same (see
> https://bugzilla.kernel.org/show_bug.cgi?id=118581#c8 ).
>
> Shaohua went on to create a patch for this ("block: correctly fallback
> for zeroout" - https://patchwork.kernel.org/patch/9137311/ ) which has
> yet to be reviewed.
>
> --
> Sitsofe | http://sucs.org/~sits/
Re: BLKZEROOUT not zeroing md dev on VMDK
On 27 May 2016 at 05:18, Darrick J. Wong wrote:
>
> It's possible that the pvscsi device advertised WRITE SAME, but if the device
> sends back ILLEGAL REQUEST then the SCSI disk driver will set
> write_same_max_bytes=0. Subsequent BLKZEROOUT attempts will then issue writes
> of zeroes to the drive.

Thanks for following up on this but that's not what happens on the md device - you can go on to issue as many BLKZEROOUT requests as you like but the md disk is never zeroed nor is an error returned.

I filed a bug at https://bugzilla.kernel.org/show_bug.cgi?id=118581 (see https://bugzilla.kernel.org/show_bug.cgi?id=118581#c6 for alternative reproduction steps that use scsi_debug and can be reworked to impact device mapper) and Shaohua Li noted that blkdev_issue_write_same could return 0 even when the disk didn't support write same (see https://bugzilla.kernel.org/show_bug.cgi?id=118581#c8 ).

Shaohua went on to create a patch for this ("block: correctly fallback for zeroout" - https://patchwork.kernel.org/patch/9137311/ ) which has yet to be reviewed.

--
Sitsofe | http://sucs.org/~sits/
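[Editor's sketch] The failure mode described in this message can be illustrated with a toy model: if the "try WRITE SAME" step reports success even though the disk rejected the command, the fallback to plain zero writes never runs and the caller still sees a return code of 0. This is not kernel code. ToyDisk, zeroout() and the swallow_eopnotsupp flag are made-up names, and the use of EOPNOTSUPP as the specific error is an assumption for illustration.

```python
import errno


class ToyDisk:
    """Hypothetical in-memory disk; it does not model any real driver."""

    def __init__(self, size, supports_write_same):
        self.data = bytearray(b"\xff" * size)  # start with non-zero contents
        self.supports_write_same = supports_write_same

    def write_same_zero(self, start, length):
        # A disk without WRITE SAME rejects the request and changes nothing.
        if not self.supports_write_same:
            return -errno.EOPNOTSUPP
        self.data[start:start + length] = bytes(length)
        return 0

    def write_zeroes(self, start, length):
        # Plain writes of zero-filled buffers always work in this model.
        self.data[start:start + length] = bytes(length)
        return 0


def zeroout(disk, start, length, swallow_eopnotsupp):
    """Try WRITE SAME first; fall back to plain zero writes if it fails."""
    ret = disk.write_same_zero(start, length)
    if swallow_eopnotsupp and ret == -errno.EOPNOTSUPP:
        ret = 0  # failure reported as success, so the fallback is skipped
    if ret == 0:
        return 0
    return disk.write_zeroes(start, length)
```

With swallow_eopnotsupp=True on a disk that lacks WRITE SAME, zeroout() returns 0 and the data is untouched, which matches the symptom reported in this thread. With the flag off, the fallback runs and the range is zeroed.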
Re: BLKZEROOUT not zeroing md dev on VMDK
On Wed, May 18, 2016 at 11:39:30PM +0100, Sitsofe Wheeler wrote:
> Hi,
>
> With Ubuntu's 4.4.0-22-generic kernel and a Fedora 23
> 4.6.0-1.vanilla.knurd.1.fc23.x86_64 kernel I've found that the
> BLKZEROOUT syscall can malfunction and not zero data.
>
> When BLKZEROOUT is issued to an MD device atop a PVSCSI controller
> supplied VMDK from ESXi 6.0 the call returns immediately and with a zero
> return code. Unfortunately, inspecting the data on the MD device shows
> that it has not been zeroed and is in fact untouched. The easiest way to
> see this behaviour is to boot the VM, create an mdadm device atop
> /dev/sd?, scribble some non-zero value on the disk and then use
> blkdiscard --zeroout /dev/md??? . If you then inspect the MD disk (e.g.
> with hexdump) you will still see the old data and using POSIX_FADV_DONTNEED
> on the MD device doesn't change the outcome.
>
> The only clue I've seen is that
> /sys/block/sd?/queue/write_same_max_bytes starts out being 33553920 but
> after a WRITE SAME is issued it becomes 0. If the MD device is created
> after write_same_max_bytes has become 0 on the backing disk then
> BLKZEROOUT seems to work correctly.

It's possible that the pvscsi device advertised WRITE SAME, but if the device sends back ILLEGAL REQUEST then the SCSI disk driver will set write_same_max_bytes=0. Subsequent BLKZEROOUT attempts will then issue writes of zeroes to the drive.

--D

>
> --
> Sitsofe | http://sucs.org/~sits/
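[Editor's sketch] The write_same_max_bytes value discussed above can be sampled from sysfs before and after a zeroout attempt. The snippet below is a minimal sketch, assuming a kernel that exposes /sys/block/<dev>/queue/write_same_max_bytes as the thread describes. The device names "sdb" and "md0" are hypothetical, and comparing the md device against its backing disk is a suggestion here, not something the thread prescribes.

```python
import os


def write_same_max_bytes(dev, sysfs_root="/sys/block"):
    """Return queue/write_same_max_bytes for a block device name such as 'sdb'."""
    path = os.path.join(sysfs_root, dev, "queue", "write_same_max_bytes")
    with open(path) as f:
        return int(f.read().strip())


def snapshot(devs, sysfs_root="/sys/block"):
    """Read the limit for several devices at once, e.g. a disk and the md on top."""
    return {dev: write_same_max_bytes(dev, sysfs_root) for dev in devs}
```

Calling snapshot(["sdb", "md0"]) before and after running blkdiscard --zeroout would show whether the backing disk's limit dropped to 0 and whether the md device's limit followed it.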
BLKZEROOUT not zeroing md dev on VMDK
Hi,

With Ubuntu's 4.4.0-22-generic kernel and a Fedora 23 4.6.0-1.vanilla.knurd.1.fc23.x86_64 kernel I've found that the BLKZEROOUT syscall can malfunction and not zero data.

When BLKZEROOUT is issued to an MD device atop a PVSCSI controller supplied VMDK from ESXi 6.0 the call returns immediately and with a zero return code. Unfortunately, inspecting the data on the MD device shows that it has not been zeroed and is in fact untouched. The easiest way to see this behaviour is to boot the VM, create an mdadm device atop /dev/sd?, scribble some non-zero value on the disk and then use blkdiscard --zeroout /dev/md??? . If you then inspect the MD disk (e.g. with hexdump) you will still see the old data and using POSIX_FADV_DONTNEED on the MD device doesn't change the outcome.

The only clue I've seen is that /sys/block/sd?/queue/write_same_max_bytes starts out being 33553920 but after a WRITE SAME is issued it becomes 0. If the MD device is created after write_same_max_bytes has become 0 on the backing disk then BLKZEROOUT seems to work correctly.

--
Sitsofe | http://sucs.org/~sits/
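[Editor's sketch] The reproduction above uses blkdiscard and hexdump. The same check can be scripted by issuing the BLKZEROOUT ioctl directly and reading the range back. This is a minimal sketch, not the reporter's procedure: the helper names are made up, the ioctl number is taken from <linux/fs.h> (_IO(0x12, 127)), and the cache drop via POSIX_FADV_DONTNEED is best effort only.

```python
import fcntl
import os
import struct

# _IO(0x12, 127) from <linux/fs.h>; the argument is uint64_t[2] = {start, length}
# in bytes, both expected to be aligned to the device's logical block size.
BLKZEROOUT = 0x127F


def zeroout(fd, start, length):
    """Issue BLKZEROOUT on an already open block device descriptor."""
    fcntl.ioctl(fd, BLKZEROOUT, struct.pack("QQ", start, length))


def first_nonzero_offset(fd, start, length, chunk=1 << 20):
    """Return the offset of the first non-zero byte in the range, or None."""
    try:
        # Best effort: drop cached pages so reads are served by the device.
        os.posix_fadvise(fd, start, length, os.POSIX_FADV_DONTNEED)
    except OSError:
        pass
    pos, end = start, start + length
    while pos < end:
        data = os.pread(fd, min(chunk, end - pos), pos)
        if not data:
            break  # reached the end of the device or file early
        stripped = data.lstrip(b"\0")
        if stripped:
            return pos + len(data) - len(stripped)
        pos += len(data)
    return None


def check_zeroout(path, start, length):
    """Zero a range on a block device and report whether it reads back as zeroes."""
    fd = os.open(path, os.O_RDWR)
    try:
        zeroout(fd, start, length)
        return first_nonzero_offset(fd, start, length) is None
    finally:
        os.close(fd)
```

For example, check_zeroout("/dev/md0", 0, 1 << 20) would return False on an affected setup, where the ioctl succeeds but the old data is still there. The path is hypothetical, the call needs root, and it destroys data in the given range.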