Re: BTRFS with 8TB SMR drives
I decided to give this ST8000AS0002 a try for backups / storing old snapshots, although standardization for more optimal/native control of SMR drives is still ongoing. I saw that people got it working with a 3.18 kernel, which gave me confidence. I wanted to see if I could get it running with a 4.3.0-rc6 kernel (and 4.2.3 tools) on an H87M-Pro eSATA (non-Intel) port. The filesystem is btrfs with all single profiles on top of dm-crypt, mounted with compress-force=zlib,nossd (I use the drive via bcache, but it is currently not attached to a cache device).

The initial snapshot send | receive crashed after 1.2TB transferred, with all the typical/known problems in dmesg. Then the same trial, with a newly created fs, on one of the Intel SATA ports: the same timeouts in dmesg, but the fs was already corrupted after a few GB of data transfer. It seemed that the drive was not able to handle and store the filesystem data stream that was being pushed onto it. So I took a step back, just created an ext4 on it and did an rsync copy. Unfortunately, the same timeouts, port resets etc. showed up there too.

As the drive made the main system unstable, I hooked it up to an AMD E-350 based board, also to try other kernels. On this board too there was no success with 4.x kernels, and at first also not with 3.18.22. But I figured out that a power cycle did the trick, and not just a hard or soft reset. So I again created the fs from scratch and mounted it as indicated. Now it is 55% filled (3.9TiB) with 10 snapshots (done as increments from the source fs from late 2013, with uncompressed allocation of about 5.6 TiB). The whole data transfer took about 4 days, which is roughly 10x slower than what would be achieved if the drive were non-SMR and in a fast (e.g. Core i7) system. Although the task below took more than 8 minutes:

[322087.174089] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
...

the fs and system run OK.

My take is that this relatively low average data rate (one reason I forced zlib compression) helped get the task done successfully for this device-managed SMR drive, but it is unsatisfying that there are kernel version and computer system dependencies. I had limited time for preparing and setting up the data transfer, so other configurations with new kernels might also work, but I had the most confidence upfront in the one that has turned out to work. Maybe, now that all data is on the drive, I will shrink the fs and create a test fs in a second partition.

On Sat, Oct 24, 2015 at 5:27 AM, Ken Long wrote:
> Hello,
>
> I have a a single version of this drive formatted with btrfs. Its my
> only btrfs drive on this machine.
> I'm getting similar errors. Is there any info I can provide to help
> troubleshoot this?
>
> Is a full dmesg still wanted?
>
> here's what I'm running-
>
> $ uname -a
> Linux machine 4.2.0-16-lowlatency #19-Ubuntu SMP PREEMPT Thu Oct 8
> 16:19:23 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
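The average data rate implied by the figures in this message can be worked out directly. This is a rough sketch, assuming the 4 days were continuous wall-clock time and taking the 5.6 TiB (uncompressed source) and 3.9 TiB (on disk) figures at face value; the function name is illustrative only.

```python
TIB = 2**40  # bytes per TiB
MIB = 2**20  # bytes per MiB

def average_rate_mib_s(tib_transferred, days):
    """Average rate in MiB/s for a transfer of the given size and duration."""
    return tib_transferred * TIB / MIB / (days * 86400)

# 5.6 TiB of uncompressed source data, 3.9 TiB actually stored, about 4 days
source_rate = average_rate_mib_s(5.6, 4)
disk_rate = average_rate_mib_s(3.9, 4)
print(f"{source_rate:.1f} MiB/s of source data, {disk_rate:.1f} MiB/s written to the drive")
```

Under these assumptions the drive absorbed on the order of 12 MiB/s. Whether that really is "10x slower" depends on the baseline assumed for a conventional drive, which this thread does not measure.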
Re: BTRFS with 8TB SMR drives
Hello,

I have a single one of these drives, formatted with btrfs. It's my only btrfs drive on this machine.
I'm getting similar errors. Is there any info I can provide to help troubleshoot this?

Is a full dmesg still wanted?

Here's what I'm running:

$ uname -a
Linux machine 4.2.0-16-lowlatency #19-Ubuntu SMP PREEMPT Thu Oct 8 16:19:23 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Re: BTRFS with 8TB SMR drives
On Mon, Oct 12, 2015 at 06:25:52PM +0200, Henk Slager wrote:
> and looking at this spec:
> http://www.seagate.com/files/www-content/product-content/hdd-fam/seagate-archive-hdd/en-us/docs/archive-hdd-dS1834-3-1411us.pdf
>
> it seems that it is a drive-managed SMR disk. I am not sure why David
> assumes it is host-managed, maybe drive firmware/functionality can be
> bypassed.

Because the drive-managed ones are not interesting from the filesystem POV.
Re: BTRFS with 8TB SMR drives
yes indeed - referenced it in my update here https://mail-archive.com/linux-btrfs@vger.kernel.org/msg47380.html On 13 October 2015 at 13:04, Justin Maggard wrote: > Sounds to me like this: https://bugzilla.kernel.org/show_bug.cgi?id=93581 > > On Mon, Oct 12, 2015 at 11:37 AM, Chris Murphy > wrote: >> I get a lot of these from both sdb and sdc >> >> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] >> UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 >> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] Sense >> Key : 0x3 [current] >> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] >> ASC=0x11 ASCQ=0x0 >> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] CDB: >> opcode=0x88 88 00 00 00 00 00 11 b3 e1 98 00 00 00 08 00 00 >> Oct 11 23:00:03 cloud.warrenhughes.net kernel: blk_update_request: >> critical medium error, dev sdb, sector 297001368 >> >> >> >> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] >> UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 >> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] Sense >> Key : 0x3 [current] [descriptor] >> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] >> ASC=0x11 ASCQ=0x0 >> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] CDB: >> opcode=0x88 88 00 00 00 00 01 3e 0a 7d 80 00 00 01 00 00 00 >> Oct 11 23:47:32 cloud.warrenhughes.net kernel: blk_update_request: >> critical medium error, dev sdc, sector 5335842176 >> >> There are a lot of these kinds of errors and they aren't all for the >> same LBA +/- 8 so they're are different physical sectors affected on >> both drives, but I don't know what the error is. 
>> >> >> Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] >> 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB) >> Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] >> 4096-byte physical blocks >> >> sd 0:0:1:0 starts out as sdb, but then goes a bit crazy somehow and >> eventually gets offlined >> Oct 11 23:55:24 cloud.warrenhughes.net kernel: sd 0:0:1:0: rejecting >> I/O to offline device >> >> And then reappears as sdo >> >> Oct 11 23:57:56 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdo] >> 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB) >> >> But no further scsi messages for this drive while Btrfs now complains >> about sdo instead of sdb. Seems to me that this device is confused >> even about its own error reporting. Anyway both sdb and sdc were >> having problems at the same time. >> >> >> Chris Murphy >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majord...@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html -- Warren Hughes +64 21 633324 IM: gtalk + msn: this email address, skype: akawsh -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS with 8TB SMR drives
Sounds to me like this: https://bugzilla.kernel.org/show_bug.cgi?id=93581 On Mon, Oct 12, 2015 at 11:37 AM, Chris Murphy wrote: > I get a lot of these from both sdb and sdc > > Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] > UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 > Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] Sense > Key : 0x3 [current] > Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] > ASC=0x11 ASCQ=0x0 > Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] CDB: > opcode=0x88 88 00 00 00 00 00 11 b3 e1 98 00 00 00 08 00 00 > Oct 11 23:00:03 cloud.warrenhughes.net kernel: blk_update_request: > critical medium error, dev sdb, sector 297001368 > > > > Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] > UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08 > Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] Sense > Key : 0x3 [current] [descriptor] > Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] > ASC=0x11 ASCQ=0x0 > Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] CDB: > opcode=0x88 88 00 00 00 00 01 3e 0a 7d 80 00 00 01 00 00 00 > Oct 11 23:47:32 cloud.warrenhughes.net kernel: blk_update_request: > critical medium error, dev sdc, sector 5335842176 > > There are a lot of these kinds of errors and they aren't all for the > same LBA +/- 8 so they're are different physical sectors affected on > both drives, but I don't know what the error is. 
> > > Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] > 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB) > Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] > 4096-byte physical blocks > > sd 0:0:1:0 starts out as sdb, but then goes a bit crazy somehow and > eventually gets offlined > Oct 11 23:55:24 cloud.warrenhughes.net kernel: sd 0:0:1:0: rejecting > I/O to offline device > > And then reappears as sdo > > Oct 11 23:57:56 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdo] > 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB) > > But no further scsi messages for this drive while Btrfs now complains > about sdo instead of sdb. Seems to me that this device is confused > even about its own error reporting. Anyway both sdb and sdc were > having problems at the same time. > > > Chris Murphy > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS with 8TB SMR drives
I get a lot of these from both sdb and sdc

Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] Sense Key : 0x3 [current]
Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] ASC=0x11 ASCQ=0x0
Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] CDB: opcode=0x88 88 00 00 00 00 00 11 b3 e1 98 00 00 00 08 00 00
Oct 11 23:00:03 cloud.warrenhughes.net kernel: blk_update_request: critical medium error, dev sdb, sector 297001368

Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] Sense Key : 0x3 [current] [descriptor]
Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] ASC=0x11 ASCQ=0x0
Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] CDB: opcode=0x88 88 00 00 00 00 01 3e 0a 7d 80 00 00 01 00 00 00
Oct 11 23:47:32 cloud.warrenhughes.net kernel: blk_update_request: critical medium error, dev sdc, sector 5335842176

There are a lot of these kinds of errors and they aren't all for the same LBA +/- 8, so different physical sectors are affected on both drives, but I don't know what the error is.

Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] 4096-byte physical blocks

sd 0:0:1:0 starts out as sdb, but then goes a bit crazy somehow and eventually gets offlined

Oct 11 23:55:24 cloud.warrenhughes.net kernel: sd 0:0:1:0: rejecting I/O to offline device

And then reappears as sdo

Oct 11 23:57:56 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdo] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)

But there are no further scsi messages for this drive, while Btrfs now complains about sdo instead of sdb. Seems to me that this device is confused even about its own error reporting. Anyway, both sdb and sdc were having problems at the same time.

Chris Murphy
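The CDB bytes in the log lines of this message can be decoded by hand to confirm which blocks the failing reads addressed. Opcode 0x88 is READ(16) in the SCSI command set, where bytes 2..9 carry the big-endian LBA and bytes 10..13 the transfer length; sense key 0x3 with ASC 0x11 / ASCQ 0x00 is a medium error (unrecovered read error), which matches the kernel's "critical medium error" text. A minimal sketch follows; the helper name is made up for illustration.

```python
def decode_read16(cdb_hex):
    """Return (lba, block_count) from a 16-byte READ(16) CDB given as hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x88:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")     # bytes 2..9: logical block address
    count = int.from_bytes(cdb[10:14], "big")  # bytes 10..13: transfer length in blocks
    return lba, count

# The two CDBs quoted in the log excerpt (sdb and sdc respectively)
sdb_read = decode_read16("88 00 00 00 00 00 11 b3 e1 98 00 00 00 08 00 00")
sdc_read = decode_read16("88 00 00 00 00 01 3e 0a 7d 80 00 00 01 00 00 00")
print(sdb_read, sdc_read)
```

The decoded LBAs equal the sector numbers printed by blk_update_request (297001368 and 5335842176), so the kernel is reporting the first block of each failed request: an 8-block read on sdb and a 256-block read on sdc.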
Re: BTRFS with 8TB SMR drives
Yes, correct, it's drive-managed SMR.

I have been following this bug for a while:
https://bugzilla.kernel.org/show_bug.cgi?id=93581

As a test I compiled/installed 4.3.0-rc4, as it looks like they reverted some kernel patches that (negatively) affect SMR. I ran a complete balance overnight and got not a single error on the 8TB SMR drive.

I do have a number of corrected and medium errors on one of my 3TB WD Red drives, which appear to be genuine errors. Thankfully my BTRFS is RAID1. I'll remove and replace that 3TB drive and run a complete scrub, but for now it looks like I was a victim of the above bug.

On 13 October 2015 at 05:25, Henk Slager wrote:
> Hi Warren,
>
> from your dmesg I see:
> Oct 10 07:42:36 cloud.warrenhughes.net kernel: scsi 0:0:1:0: Direct-Access ATA ST8000AS0002-1NA AR13 PQ: 0 ANSI: 5
> Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
>
> Oct 11 23:57:56 cloud.warrenhughes.net kernel: scsi 0:0:1:0: Direct-Access ATA ST8000AS0002-1NA AR13 PQ: 0 ANSI: 5
> Oct 11 23:57:56 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdo] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
>
> and looking at this spec:
> http://www.seagate.com/files/www-content/product-content/hdd-fam/seagate-archive-hdd/en-us/docs/archive-hdd-dS1834-3-1411us.pdf
>
> it seems that it is a drive-managed SMR disk. I am not sure why David
> assumes it is host-managed, maybe drive firmware/functionality can be
> bypassed.
>
> As far as I can see, the drive should not have a problem with btrfs as
> such, but I read quite worrying stories w.r.t. raid. I think the write
> characteristics of the balance operation, in combination with the
> connection via the LSI controller, are not really compatible with
> 'archive' use case of the drive. 'Simple', 'relaxed' write operation
> should be OK, but beyond that, it might fail. See also:
> http://www.storagereview.com/seagate_archive_hdd_review_8tb
>
> How much data is already on the drive? Is it an option to mount with
> skip_balance and try to remove the device and then do some tests on it
> in single independent mode?
>
> /Henk
>
>
> On Mon, Oct 12, 2015 at 3:21 PM, David Sterba wrote:
>> On Mon, Oct 12, 2015 at 07:43:50AM +1300, Warren Hughes wrote:
>>> Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume
>>> and I'm getting a tonne of errors when balancing or scrubbing.
>>>
>>> A short smartctl test reports fine, running a long one now. Will also
>>> run seatools from a bootable DOS USB while at work today.
>>>
>>> Running latest firmware on my 9240-8i which explicitly supports this drive.
>>>
>>> I'm finding it very hard to tell if SMR drives are OK with BTRFS
>>> currently - anyone chime in?
>>
>> I assume you have the host-managed SMR drives. This type needs tweaks to
>> the operating system so the write patterns play well with the SMR
>> constraints. Btrfs does not support that out of the box, but my
>> colleague Hannes Reinecke managed to get it working with some minor
>> changes to the allocator and disabled writing of superblock copies.
>>
>> For full support of SMR we'd have to change more than that, currently
>> nothing prevents to write "backwards" in a given chunk that is allowed
>> to be written only in the append way. So you can get mixed results when
>> trying to use the SMR devices but I'd say it will mostly not work.
>>
>> But, btrfs has all the fundamental features in place, we'd have to make
>> adjustments to follow the SMR constraints:
>>
>> * we can map the blockgroups to the SMR chunks (in some multiples)
>> * remember the write pointers and do only append writes (easy with COW)
>> * if the chunk is getting full, mark it read-only, rebalance the live
>>   data somewhere else and reset the chunk and the pointer
>>
>> I have some notes at
>> https://github.com/kdave/drafts/blob/master/btrfs/smr-mode.txt

--
Warren Hughes
+64 21 633324
IM: gtalk + msn: this email address, skype: akawsh
Re: BTRFS with 8TB SMR drives
Hi Warren,

from your dmesg I see:
Oct 10 07:42:36 cloud.warrenhughes.net kernel: scsi 0:0:1:0: Direct-Access ATA ST8000AS0002-1NA AR13 PQ: 0 ANSI: 5
Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)

Oct 11 23:57:56 cloud.warrenhughes.net kernel: scsi 0:0:1:0: Direct-Access ATA ST8000AS0002-1NA AR13 PQ: 0 ANSI: 5
Oct 11 23:57:56 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdo] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)

and looking at this spec:
http://www.seagate.com/files/www-content/product-content/hdd-fam/seagate-archive-hdd/en-us/docs/archive-hdd-dS1834-3-1411us.pdf

it seems that it is a drive-managed SMR disk. I am not sure why David assumes it is host-managed, maybe drive firmware/functionality can be bypassed.

As far as I can see, the drive should not have a problem with btrfs as such, but I read quite worrying stories w.r.t. raid. I think the write characteristics of the balance operation, in combination with the connection via the LSI controller, are not really compatible with 'archive' use case of the drive. 'Simple', 'relaxed' write operation should be OK, but beyond that, it might fail. See also:
http://www.storagereview.com/seagate_archive_hdd_review_8tb

How much data is already on the drive? Is it an option to mount with skip_balance and try to remove the device and then do some tests on it in single independent mode?

/Henk


On Mon, Oct 12, 2015 at 3:21 PM, David Sterba wrote:
> On Mon, Oct 12, 2015 at 07:43:50AM +1300, Warren Hughes wrote:
>> Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume
>> and I'm getting a tonne of errors when balancing or scrubbing.
>>
>> A short smartctl test reports fine, running a long one now. Will also
>> run seatools from a bootable DOS USB while at work today.
>>
>> Running latest firmware on my 9240-8i which explicitly supports this drive.
>>
>> I'm finding it very hard to tell if SMR drives are OK with BTRFS
>> currently - anyone chime in?
>
> I assume you have the host-managed SMR drives. This type needs tweaks to
> the operating system so the write patterns play well with the SMR
> constraints. Btrfs does not support that out of the box, but my
> colleague Hannes Reinecke managed to get it working with some minor
> changes to the allocator and disabled writing of superblock copies.
>
> For full support of SMR we'd have to change more than that, currently
> nothing prevents to write "backwards" in a given chunk that is allowed
> to be written only in the append way. So you can get mixed results when
> trying to use the SMR devices but I'd say it will mostly not work.
>
> But, btrfs has all the fundamental features in place, we'd have to make
> adjustments to follow the SMR constraints:
>
> * we can map the blockgroups to the SMR chunks (in some multiples)
> * remember the write pointers and do only append writes (easy with COW)
> * if the chunk is getting full, mark it read-only, rebalance the live
>   data somewhere else and reset the chunk and the pointer
>
> I have some notes at
> https://github.com/kdave/drafts/blob/master/btrfs/smr-mode.txt
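The capacity the kernel reports for the ST8000AS0002 follows from the block count in the dmesg lines quoted in this message. A small worked check, assuming nothing beyond the two numbers in the log (15628053168 logical blocks of 512 bytes); the variable names are illustrative.

```python
LOGICAL_BLOCKS = 15628053168   # from the "[sdb] ... 512-byte logical blocks" line
LOGICAL_BLOCK_SIZE = 512       # bytes

capacity_bytes = LOGICAL_BLOCKS * LOGICAL_BLOCK_SIZE
capacity_tb = capacity_bytes / 10**12    # decimal terabytes, as marketed
capacity_tib = capacity_bytes / 2**40    # binary tebibytes, as btrfs reports

print(f"{capacity_bytes} bytes = {capacity_tb:.4f} TB = {capacity_tib:.4f} TiB")
```

The result is a little over 8.00 TB and between 7.27 and 7.28 TiB, so the figure in the kernel log line is consistent with the block count to within rounding.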
Re: BTRFS with 8TB SMR drives
On Mon, Oct 12, 2015 at 07:43:50AM +1300, Warren Hughes wrote:
> Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume
> and I'm getting a tonne of errors when balancing or scrubbing.
>
> A short smartctl test reports fine, running a long one now. Will also
> run seatools from a bootable DOS USB while at work today.
>
> Running latest firmware on my 9240-8i which explicitly supports this drive.
>
> I'm finding it very hard to tell if SMR drives are OK with BTRFS
> currently - anyone chime in?

I assume you have the host-managed SMR drives. This type needs tweaks to the operating system so the write patterns play well with the SMR constraints. Btrfs does not support that out of the box, but my colleague Hannes Reinecke managed to get it working with some minor changes to the allocator and disabled writing of superblock copies.

For full support of SMR we'd have to change more than that, currently nothing prevents to write "backwards" in a given chunk that is allowed to be written only in the append way. So you can get mixed results when trying to use the SMR devices but I'd say it will mostly not work.

But, btrfs has all the fundamental features in place, we'd have to make adjustments to follow the SMR constraints:

* we can map the blockgroups to the SMR chunks (in some multiples)
* remember the write pointers and do only append writes (easy with COW)
* if the chunk is getting full, mark it read-only, rebalance the live
  data somewhere else and reset the chunk and the pointer

I have some notes at
https://github.com/kdave/drafts/blob/master/btrfs/smr-mode.txt
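The three adjustments listed in this message (append-only writes at a remembered write pointer, marking a full chunk read-only, and resetting it after live data has been moved away) can be pictured with a toy model. This is only an illustrative sketch: the class and method names are hypothetical, it is not btrfs code, and it is not the allocator change attributed to Hannes Reinecke.

```python
class Zone:
    """Hypothetical model of one append-only SMR chunk with a write pointer."""

    def __init__(self, start, length):
        self.start = start
        self.length = length
        self.write_pointer = start  # next block that may legally be written
        self.read_only = False

    def append(self, nblocks):
        """Write nblocks at the write pointer and return the starting block."""
        if self.read_only:
            raise OSError("zone is read-only")
        if self.write_pointer + nblocks > self.start + self.length:
            raise OSError("write does not fit in zone")
        at = self.write_pointer
        self.write_pointer += nblocks
        if self.write_pointer == self.start + self.length:
            self.read_only = True  # chunk is full: freeze it until it is reset
        return at

    def write_at(self, block, nblocks):
        """Reject any write that is not exactly at the write pointer."""
        if block != self.write_pointer:
            raise OSError("non-sequential ('backwards') write rejected")
        return self.append(nblocks)

    def reset(self):
        """Rewind the zone; only valid once live data has been relocated."""
        self.write_pointer = self.start
        self.read_only = False


zone = Zone(start=0, length=8)
zone.append(4)  # fills blocks 0..3
zone.append(4)  # fills blocks 4..7; the zone is now full and read-only
zone.reset()    # after rebalancing live data elsewhere, the zone is reusable
```

A copy-on-write filesystem never overwrites in place, which is why the message calls the append-only part "easy with COW"; the relocation step before `reset` is what a balance would have to provide.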
Re: BTRFS with 8TB SMR drives
more info for anyone interested:

[wsh@cloud ~]$ sudo btrfs fi df /mnt/media
Data, RAID1: total=13.64TiB, used=13.61TiB
System, RAID1: total=32.00MiB, used=2.22MiB
Metadata, RAID1: total=16.00GiB, used=15.10GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

[wsh@cloud ~]$ sudo btrfs fi sh /mnt/media
Label: none  uuid: 643c3145-8371-4011-8c34-20240e1bbaff
        Total devices 11 FS bytes used 13.63TiB
        devid    8 size 2.73TiB used 2.54TiB path /dev/sdh
        devid    9 size 2.73TiB used 2.54TiB path /dev/sdc
        devid   10 size 2.73TiB used 2.54TiB path /dev/sdf
        devid   11 size 1.82TiB used 1.63TiB path /dev/sdn
        devid   12 size 2.73TiB used 2.54TiB path /dev/sdg
        devid   14 size 2.73TiB used 2.54TiB path /dev/sda
        devid   15 size 2.73TiB used 2.54TiB path /dev/sdd
        devid   16 size 2.73TiB used 2.54TiB path /dev/sdk
        devid   17 size 2.73TiB used 2.54TiB path /dev/sdl
        devid   18 size 3.64TiB used 3.45TiB path /dev/sdm
        devid   19 size 7.28TiB used 1.93TiB path /dev/sdo

btrfs-progs v4.2.1

On 12 October 2015 at 14:43, Chris Murphy wrote:
> Is it possible to get a complete dmesg included in the thread, or if
> it's too big attach it to a bug report? I'm curious if there are any
> libata messages, as well as the specific Btrfs messages.
>
>
> ---
> Chris Murphy

--
Warren Hughes
+64 21 633324
IM: gtalk + msn: this email address, skype: akawsh
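The per-device figures from `btrfs fi show` can be cross-checked against the `btrfs fi df` totals. The sketch below copies the numbers from the output in this message into a list by hand (so no output parsing is assumed) and relies only on the fact that RAID1 stores two copies of every chunk; the variable names are illustrative.

```python
# (devid, size_TiB, used_TiB) copied from the `btrfs fi show` output
devices = [
    (8, 2.73, 2.54), (9, 2.73, 2.54), (10, 2.73, 2.54), (11, 1.82, 1.63),
    (12, 2.73, 2.54), (14, 2.73, 2.54), (15, 2.73, 2.54), (16, 2.73, 2.54),
    (17, 2.73, 2.54), (18, 3.64, 3.45), (19, 7.28, 1.93),
]

raw_size = sum(size for _, size, _ in devices)
raw_allocated = sum(used for _, _, used in devices)

# RAID1 keeps two copies of each chunk, so logical allocation is half the raw sum
logical_allocated = raw_allocated / 2

# Space not yet allocated to any chunk, per device
unallocated = {devid: round(size - used, 2) for devid, size, used in devices}

print(f"raw size {raw_size:.2f} TiB, raw allocated {raw_allocated:.2f} TiB")
print(f"logical allocated {logical_allocated:.3f} TiB")
print(unallocated)
```

The logical figure comes out at about 13.67 TiB, in line with the Data, Metadata and System totals reported by `btrfs fi df` (about 13.66 TiB) given two-decimal rounding. It also shows that every older device has only about 0.19 TiB unallocated, while devid 19 (the 8TB drive) has over 5 TiB free.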
Re: BTRFS with 8TB SMR drives
Hopefully this is of use. It's a beast: 34MB when uncompressed.

https://drive.google.com/file/d/0B74Kimpwe3nYYUZ2YTMtQXB4V1U/view?usp=sharing

On 12 October 2015 at 14:43, Chris Murphy wrote:
> Is it possible to get a complete dmesg included in the thread, or if
> it's too big attach it to a bug report? I'm curious if there are any
> libata messages, as well as the specific Btrfs messages.
>
>
> ---
> Chris Murphy

--
Warren Hughes
+64 21 633324
IM: gtalk + msn: this email address, skype: akawsh
Re: BTRFS with 8TB SMR drives
Is it possible to get a complete dmesg included in the thread, or if it's too big attach it to a bug report? I'm curious if there are any libata messages, as well as the specific Btrfs messages.

---
Chris Murphy
Re: BTRFS with 8TB SMR drives
Thanks Kristan, a scrub would be great; mine appeared to be working fine until the scrub (although I hadn't yet run a balance on it so who knows). I might move my 8TB onto the motherboard controller and see if the situation improves. Will update here tonight. Cheers, W. On 12 October 2015 at 11:53, Kristan wrote: > Warren Hughes warrenhughes.net> writes: > >> >> Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume >> and I'm getting a tonne of errors when balancing or scrubbing. >> >> A short smartctl test reports fine, running a long one now. Will also >> run seatools from a bootable DOS USB while at work today. >> >> Running latest firmware on my 9240-8i which explicitly supports this > drive. >> >> I'm finding it very hard to tell if SMR drives are OK with BTRFS >> currently - anyone chime in? >> >> Thanks, Warren >> >> [wsh cloud storcli]$ uname -a >> Linux cloud.warrenhughes.net 4.1.10-2-lts #1 SMP Wed Oct 7 21:57:44 >> CEST 2015 x86_64 GNU/Linux >> >> [wsh cloud storcli]$ sudo btrfs version >> btrfs-progs v4.2.1 >> >> [wsh cloud ~]$ sudo btrfs scrub status /mnt/media >> scrub status for 643c3145-8371-4011-8c34-20240e1bbaff >> scrub started at Sun Oct 11 20:37:38 2015 and was aborted > after 10:35:47 >> total bytes scrubbed: 8.15TiB with 104218141 errors >> error details: read=98736175 csum=5481966 >> corrected errors: 5484382, uncorrectable errors: 98733759, >> unverified errors: 0 >> >> [/dev/sdo].write_io_errs 100154203 >> [/dev/sdo].read_io_errs98735251 >> [/dev/sdo].flush_io_errs 634 >> [/dev/sdo].corruption_errs 5481966 >> [/dev/sdo].generation_errs 0 >> > > hi Warren, > > I recently (last week) built a 3 disk RAID 5 array using the same 8TB > drives which worked fine holding ~12TB then added a 4th disk using a > JMicron PCI SATA controller. I then ran a balance which failed after > just over 1TB written to the 4th disk. 
This caused the entire array to > fail but the main difference to your scenario was that the 4th disk also > wasn't reporting to SMART properly. > I then moved all 4 disks onto the motherboard based SATA controller, > built the array fresh and have copied ~18TB onto it and it seems to be > working fine. Perhaps I should try a scrub and see :) > > I'm using Centos 7.1 but kernel 4.2.1-ml and btrfs-progs 4.2.2 > Kristan > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- Warren Hughes +64 21 633324 IM: gtalk + msn: this email address, skype: akawsh -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: BTRFS with 8TB SMR drives
Warren Hughes writes:

>
> Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume
> and I'm getting a tonne of errors when balancing or scrubbing.
>
> A short smartctl test reports fine, running a long one now. Will also
> run seatools from a bootable DOS USB while at work today.
>
> Running latest firmware on my 9240-8i which explicitly supports this drive.
>
> I'm finding it very hard to tell if SMR drives are OK with BTRFS
> currently - anyone chime in?
>
> Thanks, Warren
>
> [wsh@cloud storcli]$ uname -a
> Linux cloud.warrenhughes.net 4.1.10-2-lts #1 SMP Wed Oct 7 21:57:44 CEST 2015 x86_64 GNU/Linux
>
> [wsh@cloud storcli]$ sudo btrfs version
> btrfs-progs v4.2.1
>
> [wsh@cloud ~]$ sudo btrfs scrub status /mnt/media
> scrub status for 643c3145-8371-4011-8c34-20240e1bbaff
>         scrub started at Sun Oct 11 20:37:38 2015 and was aborted after 10:35:47
>         total bytes scrubbed: 8.15TiB with 104218141 errors
>         error details: read=98736175 csum=5481966
>         corrected errors: 5484382, uncorrectable errors: 98733759, unverified errors: 0
>
> [/dev/sdo].write_io_errs   100154203
> [/dev/sdo].read_io_errs    98735251
> [/dev/sdo].flush_io_errs   634
> [/dev/sdo].corruption_errs 5481966
> [/dev/sdo].generation_errs 0
>

Hi Warren,

I recently (last week) built a 3-disk RAID 5 array using the same 8TB drives, which worked fine holding ~12TB. I then added a 4th disk using a JMicron PCI SATA controller and ran a balance, which failed after just over 1TB had been written to the 4th disk. This caused the entire array to fail, but the main difference from your scenario was that the 4th disk also wasn't reporting to SMART properly.

I then moved all 4 disks onto the motherboard-based SATA controller, built the array fresh and have copied ~18TB onto it, and it seems to be working fine. Perhaps I should try a scrub and see :)

I'm using CentOS 7.1 but with kernel 4.2.1-ml and btrfs-progs 4.2.2.

Kristan
BTRFS with 8TB SMR drives
Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume and I'm getting a tonne of errors when balancing or scrubbing.

A short smartctl test reports fine, running a long one now. Will also run seatools from a bootable DOS USB while at work today.

Running latest firmware on my 9240-8i which explicitly supports this drive.

I'm finding it very hard to tell if SMR drives are OK with BTRFS currently - anyone chime in?

Thanks, Warren

[wsh@cloud storcli]$ uname -a
Linux cloud.warrenhughes.net 4.1.10-2-lts #1 SMP Wed Oct 7 21:57:44 CEST 2015 x86_64 GNU/Linux

[wsh@cloud storcli]$ sudo btrfs version
btrfs-progs v4.2.1

[wsh@cloud ~]$ sudo btrfs scrub status /mnt/media
scrub status for 643c3145-8371-4011-8c34-20240e1bbaff
scrub started at Sun Oct 11 20:37:38 2015 and was aborted after 10:35:47
total bytes scrubbed: 8.15TiB with 104218141 errors
error details: read=98736175 csum=5481966
corrected errors: 5484382, uncorrectable errors: 98733759, unverified errors: 0

[/dev/sdo].write_io_errs 100154203
[/dev/sdo].read_io_errs 98735251
[/dev/sdo].flush_io_errs 634
[/dev/sdo].corruption_errs 5481966
[/dev/sdo].generation_errs 0

[wsh@cloud ~]$ sudo smartctl -H -T permissive /dev/sdo
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.1.10-2-lts] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

Short INQUIRY response, skip product id
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

[wsh@cloud storcli]$ sudo ./storcli64 /c0 show
Product Name = LSI MegaRAID SAS 9240-8i
Serial Number = P51010
SAS Address = 500605b004e9d030
PCI Address = 00:03:00:00
System Time = 10/12/2015 07:38:22
Mfg. Date = 03/17/10
Controller Time = 10/12/2015 07:38:20
FW Package Build = 20.13.1-0240
BIOS Version = 4.38.02.2_4.16.08.00_0x06060A05
FW Version = 2.130.404-4659
Driver Name = megaraid_sas
Driver Version = 06.806.08.00-rc1
Vendor Id = 0x1000
Device Id = 0x73
SubVendor Id = 0x1000
SubDevice Id = 0x9240
Host Interface = PCI-E
Device Interface = SAS-6G
Bus Number = 3
Device Number = 0
Function Number = 0
Physical Drives = 7

PD LIST :
=======

--------------------------------------------------------------------------
EID:Slt DID State DG     Size Intf Med SED PI SeSz Model                Sp
--------------------------------------------------------------------------
64:1      4 JBOD  -  1.363 TB SATA HDD N   N  512B WDC WD15EADS-00P8B0  U
64:2      0 JBOD  -  2.728 TB SATA HDD N   N  512B WDC WD30EFRX-68AX9N0 U
64:3      7 JBOD  -  2.728 TB SATA HDD N   N  512B ST3000DM001-1CH166   U
64:4      6 JBOD  -  2.728 TB SATA HDD N   N  512B WDC WD30EFRX-68AX9N0 U
64:5      5 JBOD  -  2.728 TB SATA HDD N   N  512B WDC WD30EFRX-68EUZN0 U
64:6      3 JBOD  -  2.728 TB SATA HDD N   N  512B WDC WD30EFRX-68AX9N0 U
64:7      2 JBOD  -  2.728 TB SATA HDD N   N  512B WDC WD30EFRX-68AX9N0 U
--------------------------------------------------------------------------
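For anyone comparing counters across drives in this thread, here is a minimal sketch (not part of the original post) that reads per-device error counters in the "[/dev/sdX].name value" form pasted above, which is the layout printed by btrfs device stats, and keeps only the ones above zero. The function names parse_device_stats and nonzero_counters are made up for illustration, and the script assumes the text is pasted in or piped from a file rather than calling btrfs itself.

```python
import re

# One counter per line, e.g. "[/dev/sdo].write_io_errs 100154203".
# \s* rather than \s+ so a counter whose padding was lost in copy/paste still parses.
STAT_RE = re.compile(r"^\[(?P<dev>[^\]]+)\]\.(?P<name>[a-z_]+)\s*(?P<value>\d+)$")


def parse_device_stats(text):
    """Return {device: {counter_name: value}} from pasted counter lines."""
    stats = {}
    for line in text.splitlines():
        m = STAT_RE.match(line.strip())
        if m:  # silently skip prompts, blank lines and anything else
            stats.setdefault(m["dev"], {})[m["name"]] = int(m["value"])
    return stats


def nonzero_counters(stats):
    """Keep only the counters that are above zero, per device."""
    return {
        dev: {name: n for name, n in counters.items() if n > 0}
        for dev, counters in stats.items()
    }


if __name__ == "__main__":
    # Counter values copied from the scrub report in this thread.
    sample = """\
[/dev/sdo].write_io_errs 100154203
[/dev/sdo].read_io_errs 98735251
[/dev/sdo].flush_io_errs 634
[/dev/sdo].corruption_errs 5481966
[/dev/sdo].generation_errs 0
"""
    for dev, counters in nonzero_counters(parse_device_stats(sample)).items():
        for name, n in counters.items():
            print(f"{dev}: {name} = {n}")
```

On the numbers posted here only generation_errs is zero, so the other four counters for /dev/sdo would be reported.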