Re: BTRFS with 8TB SMR drives

2015-10-26 Thread Henk Slager
I decided to give this ST8000AS0002 a try for backups / storing old
snapshots, although standardization for more optimal/native control of
SMR drives is still ongoing. I saw that people got it working with the
3.18 kernel, so that gave me confidence.

I wanted to see if I could get it running with the 4.3.0-rc6 kernel (and
4.2.3 tools) on an H87M-Pro eSATA (non-Intel) port. The filesystem is
btrfs, all single profiles, on top of dm-crypt and mounted with
compress-force=zlib,nossd (I use the drive via bcache, but currently
it is not attached to a cache device). The initial snapshot send |
receive action crashed after 1.2TB transferred, with all the
typical/known problems in dmesg.

Then the same trial, with a newly created fs, on one of the Intel SATA
ports. The same timeouts showed up in dmesg, but the fs was already
corrupted after a few GB of data transfer. It seemed that the drive was
not able to handle and store the filesystem data stream that was being
pushed onto it.

So I took a step back and just created an ext4 on it and did an
rsync copy. Unfortunately, the same timeouts, port resets etc. appeared.

As the drive made the main system unstable, I hooked it up to an AMD
E-350 based board, also to try other kernels. On this board too, no
success with 4.x kernels, and at first not with 3.18.22 either.
But I figured out that a power cycle did the trick, and not just a hard
or soft reset. So I again created the fs from scratch and mounted it as
indicated.


Now it is 55% filled (3.9TiB) with 10 snapshots (done as increments
from the source fs from late 2013, with uncompressed allocation of
about 5.6 TiB). The whole data transfer took about 4 days, which is
roughly 10x slower than what would be achieved if the drive were
non-SMR and in a fast (e.g. Core i7) system.
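
As a rough sanity check of the figures above, the average rates can be worked out as below. This is only a sketch: it assumes the full 4 days were spent transferring, that 3.9 TiB is what actually reached the drive, and that 5.6 TiB is the matching uncompressed amount, so the "ratio" is an estimate and not a measured compression ratio.

```python
# Back-of-the-envelope throughput for the transfer described above.
TIB = 1024 ** 4
MIB = 1024 ** 2

def avg_mib_per_s(n_bytes, seconds):
    """Average rate in MiB/s for n_bytes moved in the given time."""
    return n_bytes / seconds / MIB

elapsed_s = 4 * 24 * 3600   # "about 4 days" (assumed to be all transfer time)
on_disk = 3.9 * TIB         # compressed data on the SMR drive
logical = 5.6 * TIB         # uncompressed allocation on the source fs

disk_rate = avg_mib_per_s(on_disk, elapsed_s)
logical_rate = avg_mib_per_s(logical, elapsed_s)
ratio = on_disk / logical

print(f"on-disk rate : {disk_rate:.1f} MiB/s")
print(f"logical rate : {logical_rate:.1f} MiB/s")
print(f"size ratio   : {ratio:.2f}")
```

Under these assumptions the drive absorbed roughly 12 MiB/s on average, which fits the "relatively low average" rate mentioned in the message.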

Although the task below took more than 8 minutes:
[322087.174089] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
... the fs and system run OK.

My take is that this relatively low average data transfer rate (one
reason I forced zlib compression) helped get the task done successfully
for this device-managed SMR drive, but it is unsatisfying that there are
kernel version and computer system dependencies. I had limited time for
preparing and setting up the data transfer, so other configurations
with new kernels might also work, but I had most confidence upfront in
the one that has turned out to work. Maybe now that all data is on the
drive, I will shrink the fs and create a test fs in a second partition.

On Sat, Oct 24, 2015 at 5:27 AM, Ken Long  wrote:
> Hello,
>
> I have a single one of these drives formatted with btrfs. It's my
> only btrfs drive on this machine.
> I'm getting similar errors. Is there any info I can provide to help
> troubleshoot this?
>
> Is a full dmesg still wanted?
>
> here's what I'm running-
>
> $ uname -a
> Linux machine 4.2.0-16-lowlatency #19-Ubuntu SMP PREEMPT Thu Oct 8
> 16:19:23 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: BTRFS with 8TB SMR drives

2015-10-23 Thread Ken Long
Hello,

I have a single one of these drives formatted with btrfs. It's my
only btrfs drive on this machine.
I'm getting similar errors. Is there any info I can provide to help
troubleshoot this?

Is a full dmesg still wanted?

here's what I'm running-

$ uname -a
Linux machine 4.2.0-16-lowlatency #19-Ubuntu SMP PREEMPT Thu Oct 8
16:19:23 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux


Re: BTRFS with 8TB SMR drives

2015-10-13 Thread David Sterba
On Mon, Oct 12, 2015 at 06:25:52PM +0200, Henk Slager wrote:
> and looking at this spec:
> http://www.seagate.com/files/www-content/product-content/hdd-fam/seagate-archive-hdd/en-us/docs/archive-hdd-dS1834-3-1411us.pdf
> 
> it seems that it is a drive-managed SMR disk. I am not sure why David
> assumes it is host-managed, maybe drive firmware/functionality can be
> bypassed.

Because the drive-managed ones are not interesting from the filesystem POV.


Re: BTRFS with 8TB SMR drives

2015-10-12 Thread Henk Slager
Hi Warren,

from your dmesg I see:
Oct 10 07:42:36 cloud.warrenhughes.net kernel: scsi 0:0:1:0: Direct-Access ATA  ST8000AS0002-1NA AR13 PQ: 0 ANSI: 5
Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)

Oct 11 23:57:56 cloud.warrenhughes.net kernel: scsi 0:0:1:0: Direct-Access ATA  ST8000AS0002-1NA AR13 PQ: 0 ANSI: 5
Oct 11 23:57:56 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdo] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)

and looking at this spec:
http://www.seagate.com/files/www-content/product-content/hdd-fam/seagate-archive-hdd/en-us/docs/archive-hdd-dS1834-3-1411us.pdf

it seems that it is a drive-managed SMR disk. I am not sure why David
assumes it is host-managed, maybe drive firmware/functionality can be
bypassed.

As far as I can see, the drive should not have a problem with btrfs as
such, but I have read quite worrying stories w.r.t. RAID. I think the
write characteristics of the balance operation, in combination with the
connection via the LSI controller, are not really compatible with the
'archive' use case of the drive. 'Simple', 'relaxed' write operations
should be OK, but beyond that, it might fail. See also:
http://www.storagereview.com/seagate_archive_hdd_review_8tb

How much data is already on the drive? Is it an option to mount with
skip_balance and try to remove the device and then do some tests on it
in single independent mode?

/Henk


On Mon, Oct 12, 2015 at 3:21 PM, David Sterba  wrote:
> On Mon, Oct 12, 2015 at 07:43:50AM +1300, Warren Hughes wrote:
>> Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume
>> and I'm getting a tonne of errors when balancing or scrubbing.
>>
>> A short smartctl test reports fine, running a long one now. Will also
>> run seatools from a bootable DOS USB while at work today.
>>
>> Running latest firmware on my 9240-8i which explicitly supports this drive.
>>
>> I'm finding it very hard to tell if SMR drives are OK with BTRFS
>> currently - anyone chime in?
>
> I assume you have the host-managed SMR drives. This type needs tweaks to
> the operating system so the write patterns play well with the SMR
> constraints. Btrfs does not support that out of the box, but my
> colleague Hannes Reinecke managed to get it working with some minor
> changes to the allocator and disabled writing of superblock copies.
>
> For full support of SMR we'd have to change more than that, currently
> nothing prevents to write "backwards" in a given chunk that is allowed
> to be written only in the append way. So you can get mixed results when
> trying to use the SMR devices but I'd say it will mostly not work.
>
> But, btrfs has all the fundamental features in place, we'd have to make
> adjustments to follow the SMR constraints:
>
> * we can map the blockgroups to the SMR chunks (in some multiples)
> * remember the write pointers and do only append writes (easy with COW)
> * if the chunk is getting full, mark it read-only, rebalance the live
>   data somewhere else and reset the chunk and the pointer
>
> I have some notes at
> https://github.com/kdave/drafts/blob/master/btrfs/smr-mode.txt


Re: BTRFS with 8TB SMR drives

2015-10-12 Thread David Sterba
On Mon, Oct 12, 2015 at 07:43:50AM +1300, Warren Hughes wrote:
> Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume
> and I'm getting a tonne of errors when balancing or scrubbing.
> 
> A short smartctl test reports fine, running a long one now. Will also
> run seatools from a bootable DOS USB while at work today.
> 
> Running latest firmware on my 9240-8i which explicitly supports this drive.
> 
> I'm finding it very hard to tell if SMR drives are OK with BTRFS
> currently - anyone chime in?

I assume you have the host-managed SMR drives. This type needs tweaks to
the operating system so the write patterns play well with the SMR
constraints. Btrfs does not support that out of the box, but my
colleague Hannes Reinecke managed to get it working with some minor
changes to the allocator and disabled writing of superblock copies.

For full support of SMR we'd have to change more than that; currently
nothing prevents writing "backwards" in a given chunk that is allowed
to be written only in an append-only way. So you can get mixed results
when trying to use the SMR devices, but I'd say it will mostly not work.

But, btrfs has all the fundamental features in place, we'd have to make
adjustments to follow the SMR constraints:

* we can map the blockgroups to the SMR chunks (in some multiples)
* remember the write pointers and do only append writes (easy with COW)
* if the chunk is getting full, mark it read-only, rebalance the live
  data somewhere else and reset the chunk and the pointer
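
The three points above can be sketched as a toy model. This is purely illustrative: `Zone`, `write_at`, `invalidate` and `reclaim` are hypothetical names, not btrfs code and not the changes Hannes Reinecke made; the sketch only shows the append-only rule, the write pointer, and the "mark read-only, relocate live data, reset" cycle.

```python
from dataclasses import dataclass, field

@dataclass
class Zone:
    """Toy model of one sequential-write SMR zone (hypothetical, not btrfs code)."""
    size: int
    write_pointer: int = 0
    read_only: bool = False
    live: dict = field(default_factory=dict)  # offset -> data still referenced

    def append(self, data):
        """Write at the current write pointer and advance it."""
        if self.read_only or self.write_pointer + len(data) > self.size:
            raise RuntimeError("zone is full or read-only")
        offset = self.write_pointer
        self.live[offset] = data
        self.write_pointer += len(data)
        return offset

    def write_at(self, offset, data):
        """Reject any write that is not exactly at the write pointer."""
        if offset != self.write_pointer:
            raise ValueError("non-sequential ('backwards') write rejected")
        return self.append(data)

    def invalidate(self, offset):
        """With COW, superseded data is only forgotten, never overwritten."""
        self.live.pop(offset, None)

    def reset(self):
        """Rewind the write pointer; the whole zone becomes writable again."""
        self.live.clear()
        self.write_pointer = 0
        self.read_only = False

def reclaim(full, target):
    """Mark a zone read-only, move its live data elsewhere, then reset it."""
    full.read_only = True
    for data in full.live.values():
        target.append(data)
    full.reset()

a, b = Zone(size=8), Zone(size=8)
a.append(b"abcd")
a.append(b"efgh")      # zone a is now full
a.invalidate(0)        # first extent was rewritten elsewhere by COW
reclaim(a, b)
print(a.write_pointer, b.write_pointer, b.live)
```

A real implementation would also have to map block groups onto the device's zones and persist the write pointers, which this sketch leaves out.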

I have some notes at
https://github.com/kdave/drafts/blob/master/btrfs/smr-mode.txt


Re: BTRFS with 8TB SMR drives

2015-10-12 Thread Warren Hughes
Yes, correct, it's drive-managed SMR.

I have been following this bug:
https://bugzilla.kernel.org/show_bug.cgi?id=93581 for a while

As a test I compiled/installed 4.3.0-rc4 as it looks like they
reverted some kernel patches that (negatively) affect SMR.

I ran a complete balance overnight and not a single error on the 8TB
SMR drive. I have a number of corrected and medium errors on one of my
3TB WD Red drives which appear to be genuine errors. Thankfully my
BTRFS is RAID1.

I'll remove and replace that 3TB drive and run a complete scrub - but
for now it looks like I was a victim of the above bug entry.

On 13 October 2015 at 05:25, Henk Slager  wrote:
> Hi Warren,
>
> from your dmesg I see:
> Oct 10 07:42:36 cloud.warrenhughes.net kernel: scsi 0:0:1:0:
> Direct-Access ATA  ST8000AS0002-1NA AR13 PQ: 0 ANSI: 5
> Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb]
> 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
>
> Oct 11 23:57:56 cloud.warrenhughes.net kernel: scsi 0:0:1:0:
> Direct-Access ATA  ST8000AS0002-1NA AR13 PQ: 0 ANSI: 5
> Oct 11 23:57:56 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdo]
> 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
>
> and looking at this spec:
> http://www.seagate.com/files/www-content/product-content/hdd-fam/seagate-archive-hdd/en-us/docs/archive-hdd-dS1834-3-1411us.pdf
>
> it seems that it is a drive-managed SMR disk. I am not sure why David
> assumes it is host-managed, maybe drive firmware/functionality can be
> bypassed.
>
> As far as I can see, the drive should not have a problem with btrfs as
> such, but I read quite worrying stories w.r.t. raid. I think the write
> characteristics of the balance operation, in combination with the
> connection via the LSI controller, are not really compatible with
> 'archive' use case of the drive. 'Simple', 'relaxed' write operation
> should be OK, but beyond that, it might fail. See also:
> http://www.storagereview.com/seagate_archive_hdd_review_8tb
>
> How much data is already on the drive? Is it an option to mount with
> skip_balance and try to remove the device and then do some tests on it
> in single independent mode?
>
> /Henk
>
>
> On Mon, Oct 12, 2015 at 3:21 PM, David Sterba  wrote:
>> On Mon, Oct 12, 2015 at 07:43:50AM +1300, Warren Hughes wrote:
>>> Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume
>>> and I'm getting a tonne of errors when balancing or scrubbing.
>>>
>>> A short smartctl test reports fine, running a long one now. Will also
>>> run seatools from a bootable DOS USB while at work today.
>>>
>>> Running latest firmware on my 9240-8i which explicitly supports this drive.
>>>
>>> I'm finding it very hard to tell if SMR drives are OK with BTRFS
>>> currently - anyone chime in?
>>
>> I assume you have the host-managed SMR drives. This type needs tweaks to
>> the operating system so the write patterns play well with the SMR
>> constraints. Btrfs does not support that out of the box, but my
>> colleague Hannes Reinecke managed to get it working with some minor
>> changes to the allocator and disabled writing of superblock copies.
>>
>> For full support of SMR we'd have to change more than that, currently
>> nothing prevents to write "backwards" in a given chunk that is allowed
>> to be written only in the append way. So you can get mixed results when
>> trying to use the SMR devices but I'd say it will mostly not work.
>>
>> But, btrfs has all the fundamental features in place, we'd have to make
>> adjustments to follow the SMR constraints:
>>
>> * we can map the blockgroups to the SMR chunks (in some multiples)
>> * remember the write pointers and do only append writes (easy with COW)
>> * if the chunk is getting full, mark it read-only, rebalance the live
>>   data somewhere else and reset the chunk and the pointer
>>
>> I have some notes at
>> https://github.com/kdave/drafts/blob/master/btrfs/smr-mode.txt



-- 
Warren Hughes
+64 21 633324
IM: gtalk + msn: this email address, skype: akawsh


Re: BTRFS with 8TB SMR drives

2015-10-12 Thread Chris Murphy
I get a lot of these from both sdb and sdc

Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] Sense Key : 0x3 [current]
Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] ASC=0x11 ASCQ=0x0
Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] CDB: opcode=0x88 88 00 00 00 00 00 11 b3 e1 98 00 00 00 08 00 00
Oct 11 23:00:03 cloud.warrenhughes.net kernel: blk_update_request: critical medium error, dev sdb, sector 297001368

Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] Sense Key : 0x3 [current] [descriptor]
Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] ASC=0x11 ASCQ=0x0
Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] CDB: opcode=0x88 88 00 00 00 00 01 3e 0a 7d 80 00 00 01 00 00 00
Oct 11 23:47:32 cloud.warrenhughes.net kernel: blk_update_request: critical medium error, dev sdc, sector 5335842176

There are a lot of these kinds of errors and they aren't all for the
same LBA +/- 8, so there are different physical sectors affected on
both drives, but I don't know what the error is.
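
For what it's worth, the raw fields in those log lines can be decoded by hand. In the standard SCSI tables, opcode 0x88 is READ(16), sense key 0x3 is MEDIUM ERROR, and ASC/ASCQ 0x11/0x00 is "unrecovered read error". The sketch below (helper name `decode_read16` is made up for this example) pulls the LBA and transfer length out of the two CDBs; whether the errors are genuine media faults or a side effect of something else cannot be told from this alone.

```python
# Lookup tables limited to the codes that appear in the log excerpt above.
SENSE_KEYS = {0x3: "MEDIUM ERROR"}
ASC_ASCQ = {(0x11, 0x00): "UNRECOVERED READ ERROR"}
READ_16 = 0x88

def decode_read16(cdb_hex):
    """Return (lba, transfer_length_in_blocks) from a READ(16) CDB hex dump."""
    cdb = bytes.fromhex(cdb_hex)                 # spaces between bytes are ignored
    if len(cdb) != 16 or cdb[0] != READ_16:
        raise ValueError("not a 16-byte READ(16) CDB")
    lba = int.from_bytes(cdb[2:10], "big")       # bytes 2..9: logical block address
    blocks = int.from_bytes(cdb[10:14], "big")   # bytes 10..13: transfer length
    return lba, blocks

for dev, cdb in (
    ("sdb", "88 00 00 00 00 00 11 b3 e1 98 00 00 00 08 00 00"),
    ("sdc", "88 00 00 00 00 01 3e 0a 7d 80 00 00 01 00 00 00"),
):
    lba, blocks = decode_read16(cdb)
    print(f"{dev}: READ(16) lba={lba} blocks={blocks} "
          f"-> {SENSE_KEYS[0x3]} / {ASC_ASCQ[(0x11, 0x00)]}")
```

The decoded LBAs match the sector numbers in the `blk_update_request` lines, so each failing request was a read of 8 and 256 logical blocks respectively, starting at the reported sector.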


Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] 4096-byte physical blocks
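
The capacity figures in these two lines can be reproduced from the block count. This is a small sketch; the variable names are made up for the example.

```python
LOGICAL_BLOCK = 512       # bytes, from "512-byte logical blocks"
PHYSICAL_BLOCK = 4096     # bytes, from "4096-byte physical blocks"
logical_blocks = 15628053168

size_bytes = logical_blocks * LOGICAL_BLOCK
size_tb = size_bytes / 10 ** 12    # decimal terabytes
size_tib = size_bytes / 2 ** 40    # binary tebibytes
per_physical = PHYSICAL_BLOCK // LOGICAL_BLOCK

print(f"{size_bytes} bytes = {size_tb:.2f} TB = {size_tib:.3f} TiB")
print(f"{per_physical} logical blocks per physical sector")
```

Eight logical blocks share one physical sector, which is presumably why errors within "+/- 8" LBAs of each other would point at the same physical sector.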

sd 0:0:1:0 starts out as sdb, but then goes a bit crazy somehow and
eventually gets offlined
Oct 11 23:55:24 cloud.warrenhughes.net kernel: sd 0:0:1:0: rejecting I/O to offline device

And then reappears as sdo

Oct 11 23:57:56 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdo] 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)

But no further scsi messages for this drive while Btrfs now complains
about sdo instead of sdb. Seems to me that this device is confused
even about its own error reporting. Anyway both sdb and sdc were
having problems at the same time.


Chris Murphy


Re: BTRFS with 8TB SMR drives

2015-10-12 Thread Justin Maggard
Sounds to me like this: https://bugzilla.kernel.org/show_bug.cgi?id=93581

On Mon, Oct 12, 2015 at 11:37 AM, Chris Murphy  wrote:
> I get a lot of these from both sdb and sdc
>
> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb]
> UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] Sense
> Key : 0x3 [current]
> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb]
> ASC=0x11 ASCQ=0x0
> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] CDB:
> opcode=0x88 88 00 00 00 00 00 11 b3 e1 98 00 00 00 08 00 00
> Oct 11 23:00:03 cloud.warrenhughes.net kernel: blk_update_request:
> critical medium error, dev sdb, sector 297001368
>
>
>
> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc]
> UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] Sense
> Key : 0x3 [current] [descriptor]
> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc]
> ASC=0x11 ASCQ=0x0
> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] CDB:
> opcode=0x88 88 00 00 00 00 01 3e 0a 7d 80 00 00 01 00 00 00
> Oct 11 23:47:32 cloud.warrenhughes.net kernel: blk_update_request:
> critical medium error, dev sdc, sector 5335842176
>
> There are a lot of these kinds of errors and they aren't all for the
> same LBA +/- 8 so they're are different physical sectors affected on
> both drives, but I don't know what the error is.
>
>
> Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb]
> 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
> Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb]
> 4096-byte physical blocks
>
> sd 0:0:1:0 starts out as sdb, but then goes a bit crazy somehow and
> eventually gets offlined
> Oct 11 23:55:24 cloud.warrenhughes.net kernel: sd 0:0:1:0: rejecting
> I/O to offline device
>
> And then reappears as sdo
>
> Oct 11 23:57:56 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdo]
> 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
>
> But no further scsi messages for this drive while Btrfs now complains
> about sdo instead of sdb. Seems to me that this device is confused
> even about its own error reporting. Anyway both sdb and sdc were
> having problems at the same time.
>
>
> Chris Murphy


Re: BTRFS with 8TB SMR drives

2015-10-12 Thread Warren Hughes
yes indeed - referenced it in my update here
https://mail-archive.com/linux-btrfs@vger.kernel.org/msg47380.html

On 13 October 2015 at 13:04, Justin Maggard  wrote:
> Sounds to me like this: https://bugzilla.kernel.org/show_bug.cgi?id=93581
>
> On Mon, Oct 12, 2015 at 11:37 AM, Chris Murphy wrote:
>> I get a lot of these from both sdb and sdc
>>
>> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb]
>> UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
>> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] Sense
>> Key : 0x3 [current]
>> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb]
>> ASC=0x11 ASCQ=0x0
>> Oct 11 23:00:03 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb] CDB:
>> opcode=0x88 88 00 00 00 00 00 11 b3 e1 98 00 00 00 08 00 00
>> Oct 11 23:00:03 cloud.warrenhughes.net kernel: blk_update_request:
>> critical medium error, dev sdb, sector 297001368
>>
>>
>>
>> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc]
>> UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
>> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] Sense
>> Key : 0x3 [current] [descriptor]
>> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc]
>> ASC=0x11 ASCQ=0x0
>> Oct 11 23:47:32 cloud.warrenhughes.net kernel: sd 0:0:2:0: [sdc] CDB:
>> opcode=0x88 88 00 00 00 00 01 3e 0a 7d 80 00 00 01 00 00 00
>> Oct 11 23:47:32 cloud.warrenhughes.net kernel: blk_update_request:
>> critical medium error, dev sdc, sector 5335842176
>>
>> There are a lot of these kinds of errors and they aren't all for the
>> same LBA +/- 8 so they're are different physical sectors affected on
>> both drives, but I don't know what the error is.
>>
>>
>> Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb]
>> 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
>> Oct 10 07:42:36 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdb]
>> 4096-byte physical blocks
>>
>> sd 0:0:1:0 starts out as sdb, but then goes a bit crazy somehow and
>> eventually gets offlined
>> Oct 11 23:55:24 cloud.warrenhughes.net kernel: sd 0:0:1:0: rejecting
>> I/O to offline device
>>
>> And then reappears as sdo
>>
>> Oct 11 23:57:56 cloud.warrenhughes.net kernel: sd 0:0:1:0: [sdo]
>> 15628053168 512-byte logical blocks: (8.00 TB/7.27 TiB)
>>
>> But no further scsi messages for this drive while Btrfs now complains
>> about sdo instead of sdb. Seems to me that this device is confused
>> even about its own error reporting. Anyway both sdb and sdc were
>> having problems at the same time.
>>
>>
>> Chris Murphy



-- 
Warren Hughes
+64 21 633324
IM: gtalk + msn: this email address, skype: akawsh


BTRFS with 8TB SMR drives

2015-10-11 Thread Warren Hughes
Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume
and I'm getting a tonne of errors when balancing or scrubbing.

A short smartctl test reports fine, running a long one now. Will also
run seatools from a bootable DOS USB while at work today.

Running latest firmware on my 9240-8i which explicitly supports this drive.

I'm finding it very hard to tell if SMR drives are OK with BTRFS
currently - anyone chime in?

Thanks, Warren

[wsh@cloud storcli]$ uname -a
Linux cloud.warrenhughes.net 4.1.10-2-lts #1 SMP Wed Oct 7 21:57:44 CEST 2015 x86_64 GNU/Linux

[wsh@cloud storcli]$ sudo btrfs version
btrfs-progs v4.2.1


[wsh@cloud ~]$ sudo btrfs scrub status /mnt/media
scrub status for 643c3145-8371-4011-8c34-20240e1bbaff
scrub started at Sun Oct 11 20:37:38 2015 and was aborted after 10:35:47
total bytes scrubbed: 8.15TiB with 104218141 errors
error details: read=98736175 csum=5481966
corrected errors: 5484382, uncorrectable errors: 98733759, unverified errors: 0

[/dev/sdo].write_io_errs   100154203
[/dev/sdo].read_io_errs    98735251
[/dev/sdo].flush_io_errs   634
[/dev/sdo].corruption_errs 5481966
[/dev/sdo].generation_errs 0
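
The counters above are easier to compare once parsed. The sketch below reads the per-device stat lines and cross-checks the scrub summary; the helper name and the regular expression are assumptions made for this example, and the consistency checks are observations about these particular numbers, not a documented guarantee of the tools.

```python
import re

# Device stats as printed in the message above (sample data from the thread).
DEVICE_STATS = """\
[/dev/sdo].write_io_errs   100154203
[/dev/sdo].read_io_errs    98735251
[/dev/sdo].flush_io_errs   634
[/dev/sdo].corruption_errs 5481966
[/dev/sdo].generation_errs 0
"""

# "[<device>].<counter> <value>"; whitespace before the value is optional so
# that lines whose spacing was lost in a mail archive still parse.
STAT_LINE = re.compile(
    r"^\[(?P<dev>[^\]]+)\]\.(?P<name>[a-z_]+?_errs)\s*(?P<value>\d+)$"
)

def parse_device_stats(text):
    """Return {device: {counter_name: value}} for every matching line."""
    stats = {}
    for line in text.splitlines():
        m = STAT_LINE.match(line.strip())
        if m:
            stats.setdefault(m["dev"], {})[m["name"]] = int(m["value"])
    return stats

stats = parse_device_stats(DEVICE_STATS)["/dev/sdo"]

# Figures from the scrub summary in the same message.
total, read, csum = 104218141, 98736175, 5481966
corrected, uncorrectable = 5484382, 98733759

print(stats)
print("read + csum == total:", read + csum == total)
print("corrected + uncorrectable == total:", corrected + uncorrectable == total)
print("csum errors == corruption_errs:", csum == stats["corruption_errs"])
```

All three checks hold for the posted numbers, and read errors make up almost all of the scrub errors, which points at I/O failures on the device rather than at scattered checksum mismatches.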

[wsh@cloud ~]$ sudo smartctl -H -T permissive /dev/sdo
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.1.10-2-lts] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

Short INQUIRY response, skip product id
=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

[wsh@cloud storcli]$ sudo ./storcli64 /c0 show
Product Name = LSI MegaRAID SAS 9240-8i
Serial Number = P51010
SAS Address =  500605b004e9d030
PCI Address = 00:03:00:00
System Time = 10/12/2015 07:38:22
Mfg. Date = 03/17/10
Controller Time = 10/12/2015 07:38:20
FW Package Build = 20.13.1-0240
BIOS Version = 4.38.02.2_4.16.08.00_0x06060A05
FW Version = 2.130.404-4659
Driver Name = megaraid_sas
Driver Version = 06.806.08.00-rc1
Vendor Id = 0x1000
Device Id = 0x73
SubVendor Id = 0x1000
SubDevice Id = 0x9240
Host Interface = PCI-E
Device Interface = SAS-6G
Bus Number = 3
Device Number = 0
Function Number = 0
Physical Drives = 7

PD LIST :
=========

--------------------------------------------------------------------------
EID:Slt DID State DG     Size Intf Med SED PI SeSz Model                Sp
--------------------------------------------------------------------------
64:1      4 JBOD  -  1.363 TB SATA HDD N   N  512B WDC WD15EADS-00P8B0  U
64:2      0 JBOD  -  2.728 TB SATA HDD N   N  512B WDC WD30EFRX-68AX9N0 U
64:3      7 JBOD  -  2.728 TB SATA HDD N   N  512B ST3000DM001-1CH166   U
64:4      6 JBOD  -  2.728 TB SATA HDD N   N  512B WDC WD30EFRX-68AX9N0 U
64:5      5 JBOD  -  2.728 TB SATA HDD N   N  512B WDC WD30EFRX-68EUZN0 U
64:6      3 JBOD  -  2.728 TB SATA HDD N   N  512B WDC WD30EFRX-68AX9N0 U
64:7      2 JBOD  -  2.728 TB SATA HDD N   N  512B WDC WD30EFRX-68AX9N0 U
--------------------------------------------------------------------------


Re: BTRFS with 8TB SMR drives

2015-10-11 Thread Kristan
Warren Hughes writes:

> 
> Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume
> and I'm getting a tonne of errors when balancing or scrubbing.
> 
> A short smartctl test reports fine, running a long one now. Will also
> run seatools from a bootable DOS USB while at work today.
> 
> Running latest firmware on my 9240-8i which explicitly supports this drive.
> 
> I'm finding it very hard to tell if SMR drives are OK with BTRFS
> currently - anyone chime in?
> 
> Thanks, Warren
> 
> [wsh@cloud storcli]$ uname -a
> Linux cloud.warrenhughes.net 4.1.10-2-lts #1 SMP Wed Oct 7 21:57:44
> CEST 2015 x86_64 GNU/Linux
> 
> [wsh@cloud storcli]$ sudo btrfs version
> btrfs-progs v4.2.1
> 
> [wsh@cloud ~]$ sudo btrfs scrub status /mnt/media
> scrub status for 643c3145-8371-4011-8c34-20240e1bbaff
> scrub started at Sun Oct 11 20:37:38 2015 and was aborted after 10:35:47
> total bytes scrubbed: 8.15TiB with 104218141 errors
> error details: read=98736175 csum=5481966
> corrected errors: 5484382, uncorrectable errors: 98733759,
> unverified errors: 0
> 
> [/dev/sdo].write_io_errs   100154203
> [/dev/sdo].read_io_errs    98735251
> [/dev/sdo].flush_io_errs   634
> [/dev/sdo].corruption_errs 5481966
> [/dev/sdo].generation_errs 0
> 

hi Warren,

I recently (last week) built a 3 disk RAID 5 array using the same 8TB 
drives which worked fine holding ~12TB then added a 4th disk using a 
JMicron PCI SATA controller. I then ran a balance which failed after 
just over 1TB written to the 4th disk. This caused the entire array to 
fail but the main difference to your scenario was that the 4th disk also 
wasn't reporting to SMART properly. 
I then moved all 4 disks onto the motherboard based SATA controller, 
built the array fresh and have copied ~18TB onto it and it seems to be 
working fine. Perhaps I should try a scrub and see :)
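
For context on the fill levels mentioned above, nominal RAID 5 capacity can be estimated as below. This is a simplification: it assumes equal-size disks, one disk's worth of parity, decimal TB, and ignores metadata overhead; the function name is made up for the example.

```python
def raid5_usable_tb(n_disks, disk_tb):
    """Nominal usable capacity of n equal disks with one disk's worth of parity."""
    if n_disks < 2:
        raise ValueError("parity needs at least two disks")
    return (n_disks - 1) * disk_tb

# (number of 8 TB disks, approximate TB stored according to the message)
for n, stored_tb in ((3, 12), (4, 18)):
    usable = raid5_usable_tb(n, 8)
    print(f"{n} x 8 TB: {usable} TB usable, ~{stored_tb} TB stored "
          f"({stored_tb / usable:.0%} full)")
```

By this estimate both the 3-disk and the 4-disk array were about three-quarters full.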

I'm using Centos 7.1 but kernel 4.2.1-ml and btrfs-progs 4.2.2
Kristan



Re: BTRFS with 8TB SMR drives

2015-10-11 Thread Warren Hughes
Thanks Kristan, a scrub would be great; mine appeared to be working
fine until the scrub (although I hadn't yet run a balance on it so who
knows).

I might move my 8TB drive onto the motherboard controller and see if
the situation improves. Will update here tonight.

Cheers, W.

On 12 October 2015 at 11:53, Kristan wrote:
> Warren Hughes  warrenhughes.net> writes:
>
>>
>> Hi guys, just added a new Seagate Archive 8TB drive to my BTRFS volume
>> and I'm getting a tonne of errors when balancing or scrubbing.
>>
>> A short smartctl test reports fine, running a long one now. Will also
>> run seatools from a bootable DOS USB while at work today.
>>
>> Running latest firmware on my 9240-8i which explicitly supports this
>> drive.
>>
>> I'm finding it very hard to tell if SMR drives are OK with BTRFS
>> currently - anyone chime in?
>>
>> Thanks, Warren
>>
>> [wsh@cloud storcli]$ uname -a
>> Linux cloud.warrenhughes.net 4.1.10-2-lts #1 SMP Wed Oct 7 21:57:44
>> CEST 2015 x86_64 GNU/Linux
>>
>> [wsh@cloud storcli]$ sudo btrfs version
>> btrfs-progs v4.2.1
>>
>> [wsh@cloud ~]$ sudo btrfs scrub status /mnt/media
>> scrub status for 643c3145-8371-4011-8c34-20240e1bbaff
>> scrub started at Sun Oct 11 20:37:38 2015 and was aborted after 10:35:47
>> total bytes scrubbed: 8.15TiB with 104218141 errors
>> error details: read=98736175 csum=5481966
>> corrected errors: 5484382, uncorrectable errors: 98733759,
>> unverified errors: 0
>>
>> [/dev/sdo].write_io_errs   100154203
>> [/dev/sdo].read_io_errs    98735251
>> [/dev/sdo].flush_io_errs   634
>> [/dev/sdo].corruption_errs 5481966
>> [/dev/sdo].generation_errs 0
>>
>
> Hi Warren,
>
> Last week I built a 3-disk RAID 5 array from the same 8TB drives. It
> worked fine holding ~12TB, then I added a 4th disk on a JMicron PCI
> SATA controller. I then ran a balance, which failed after just over
> 1TB had been written to the 4th disk. This caused the entire array to
> fail, but the main difference from your scenario was that the 4th disk
> also wasn't reporting SMART data properly.
> I then moved all 4 disks onto the motherboard-based SATA controller,
> built the array fresh and have copied ~18TB onto it, and it seems to be
> working fine. Perhaps I should try a scrub and see :)
>
> I'm using CentOS 7.1, but with kernel 4.2.1-ml and btrfs-progs 4.2.2.
> Kristan
>



-- 
Warren Hughes
+64 21 633324
IM: gtalk + msn: this email address, skype: akawsh


Re: BTRFS with 8TB SMR drives

2015-10-11 Thread Warren Hughes
Hopefully this is of use. It's a beast: 34MB when uncompressed.

https://drive.google.com/file/d/0B74Kimpwe3nYYUZ2YTMtQXB4V1U/view?usp=sharing

On 12 October 2015 at 14:43, Chris Murphy wrote:
> Is it possible to get a complete dmesg included in the thread, or if
> it's too big attach it to a bug report? I'm curious if there are any
> libata messages, as well as the specific Btrfs messages.
>
>
> ---
> Chris Murphy



-- 
Warren Hughes
+64 21 633324
IM: gtalk + msn: this email address, skype: akawsh


Re: BTRFS with 8TB SMR drives

2015-10-11 Thread Warren Hughes
More info for anyone interested:

[wsh@cloud ~]$ sudo btrfs fi df /mnt/media
Data, RAID1: total=13.64TiB, used=13.61TiB
System, RAID1: total=32.00MiB, used=2.22MiB
Metadata, RAID1: total=16.00GiB, used=15.10GiB
GlobalReserve, single: total=512.00MiB, used=0.00B


[wsh@cloud ~]$ sudo btrfs fi sh /mnt/media
Label: none  uuid: 643c3145-8371-4011-8c34-20240e1bbaff
Total devices 11 FS bytes used 13.63TiB
devid    8 size 2.73TiB used 2.54TiB path /dev/sdh
devid    9 size 2.73TiB used 2.54TiB path /dev/sdc
devid   10 size 2.73TiB used 2.54TiB path /dev/sdf
devid   11 size 1.82TiB used 1.63TiB path /dev/sdn
devid   12 size 2.73TiB used 2.54TiB path /dev/sdg
devid   14 size 2.73TiB used 2.54TiB path /dev/sda
devid   15 size 2.73TiB used 2.54TiB path /dev/sdd
devid   16 size 2.73TiB used 2.54TiB path /dev/sdk
devid   17 size 2.73TiB used 2.54TiB path /dev/sdl
devid   18 size 3.64TiB used 3.45TiB path /dev/sdm
devid   19 size 7.28TiB used 1.93TiB path /dev/sdo

btrfs-progs v4.2.1
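
To make the allocation picture easier to read, here is a rough Python sketch (not from the original post) that works out how full each device is from the `btrfs fi sh` lines above. The helper name parse_devices and the regular expression are assumptions for illustration; the pattern tolerates a missing space after "devid" and only handles sizes reported in TiB, as they are here.

```python
import re

# Device lines copied from the `btrfs fi sh` output above.
FI_SHOW = """\
devid    8 size 2.73TiB used 2.54TiB path /dev/sdh
devid    9 size 2.73TiB used 2.54TiB path /dev/sdc
devid   10 size 2.73TiB used 2.54TiB path /dev/sdf
devid   11 size 1.82TiB used 1.63TiB path /dev/sdn
devid   12 size 2.73TiB used 2.54TiB path /dev/sdg
devid   14 size 2.73TiB used 2.54TiB path /dev/sda
devid   15 size 2.73TiB used 2.54TiB path /dev/sdd
devid   16 size 2.73TiB used 2.54TiB path /dev/sdk
devid   17 size 2.73TiB used 2.54TiB path /dev/sdl
devid   18 size 3.64TiB used 3.45TiB path /dev/sdm
devid   19 size 7.28TiB used 1.93TiB path /dev/sdo
"""

# Assumed line layout; sizes in other units would need extra handling.
DEVICE_LINE = re.compile(
    r"devid\s*(\d+)\s+size\s+([\d.]+)TiB\s+used\s+([\d.]+)TiB\s+path\s+(\S+)"
)


def parse_devices(text):
    """Return one dict per device line (hypothetical helper)."""
    devices = []
    for match in DEVICE_LINE.finditer(text):
        devid, size, used, path = match.groups()
        devices.append(
            {
                "devid": int(devid),
                "size_tib": float(size),
                "used_tib": float(used),
                "path": path,
            }
        )
    return devices


devices = parse_devices(FI_SHOW)
for dev in devices:
    fill = dev["used_tib"] / dev["size_tib"]
    print(f"{dev['path']}: {fill:.0%} allocated")

total_used = sum(dev["used_tib"] for dev in devices)
print(f"total allocated across devices: {total_used:.2f} TiB")
```

On these numbers the new 8TB device (/dev/sdo) is only about a quarter allocated, while the older devices sit at roughly 90-95%. The per-device total is also roughly twice the Data total reported by `btrfs fi df`, which is what two copies under RAID1 would give.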

On 12 October 2015 at 14:43, Chris Murphy wrote:
> Is it possible to get a complete dmesg included in the thread, or if
> it's too big attach it to a bug report? I'm curious if there are any
> libata messages, as well as the specific Btrfs messages.
>
>
> ---
> Chris Murphy



-- 
Warren Hughes
+64 21 633324
IM: gtalk + msn: this email address, skype: akawsh


Re: BTRFS with 8TB SMR drives

2015-10-11 Thread Chris Murphy
Is it possible to get a complete dmesg included in the thread, or if
it's too big attach it to a bug report? I'm curious if there are any
libata messages, as well as the specific Btrfs messages.
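
For a log too big to read by eye, one way to narrow it down to the messages asked about here is a small filter. This is only a sketch under stated assumptions: it guesses that libata lines carry an "ataN" or "ataN.NN" port tag followed by a colon and that Btrfs lines contain the string "btrfs" in some capitalisation. The function name filter_dmesg is hypothetical.

```python
import re

# Assumed patterns: an "ata5:" / "ata5.00:" style port tag for libata,
# and any capitalisation of "btrfs" for filesystem messages.
INTERESTING = re.compile(r"\bata\d+(?:\.\d+)?:|btrfs", re.IGNORECASE)


def filter_dmesg(lines):
    """Keep only lines that look like libata or Btrfs messages."""
    return [line.rstrip("\n") for line in lines if INTERESTING.search(line)]
```

It can be fed any iterable of lines, for example an open file holding saved dmesg output. Messages from other drivers in the I/O path (a SAS HBA driver, for instance) would not match and would need their own pattern.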


---
Chris Murphy