Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-09 Thread Duncan
constantine posted on Tue, 10 Feb 2015 00:54:56 +0000 as excerpted: Could you please answer two questions: 1. I am testing various files and all seem readable. Is there a way to list every file that resides on a particular device (like /dev/sdc1) so as to check them? I don't know of

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-09 Thread Chris Murphy
On Mon, Feb 9, 2015 at 5:54 PM, constantine costas.magn...@gmail.com wrote: 1. I am testing various files and all seem readable. Is there a way to list every file that resides on a particular device (like /dev/sdc1) so as to check them? There are a handful of files that seem corrupted,
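
There is no btrfs subcommand that lists files by the device backing them, but a workable check is to force a read of every file and watch the kernel log; on RAID-1 a bad copy is caught by checksum and repaired from the good mirror as a side effect. A minimal sketch, assuming the filesystem is mounted at /mnt/mountpoint:

# find /mnt/mountpoint -type f -exec cat {} + > /dev/null
# dmesg | grep -i btrfs

The first command reads every file back; the second surfaces any csum or read errors the kernel logged while doing so. A scrub goes further and prints the resolved file path alongside each checksum error it finds.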

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-09 Thread constantine
Thank you everybody for your support, care, cheerful comments and understandable criticism. I am in the process of backing up every file. Could you please answer two questions: 1. I am testing various files and all seem readable. Is there a way to list every file that resides on a particular

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-09 Thread Kai Krakow
Brendan Hide bren...@swiftspirit.co.za wrote: I have the following two lines in /etc/udev/rules.d/61-persistent-storage.rules for two old 250GB spindles. They set the timeout to 120 seconds because these two disks don't support SCT ERC. This may very well apply without modification to other
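
The rules themselves are cut off in the archive. Purely for illustration, a rule of that shape (the serial numbers below are made up; match on whatever udev property uniquely identifies the disks) could look like:

ACTION=="add", KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="WD-WCANK0000001", RUN+="/bin/sh -c 'echo 120 > /sys/block/%k/device/timeout'"
ACTION=="add", KERNEL=="sd?", ENV{ID_SERIAL_SHORT}=="WD-WCANK0000002", RUN+="/bin/sh -c 'echo 120 > /sys/block/%k/device/timeout'"

Each rule sets the kernel's SCSI command timer to 120 seconds for one specific disk, giving a drive without SCT ERC time to finish its own error recovery before the kernel resets the link.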

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-09 Thread Brendan Hide
On 2015/02/09 10:30 PM, Kai Krakow wrote: Brendan Hide bren...@swiftspirit.co.za wrote: I have the following two lines in /etc/udev/rules.d/61-persistent-storage.rules for two old 250GB [snip] Wouldn't it be easier and more efficient to use

Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread constantine
Hello everybody, I have a RAID-1 btrfs filesystem with 5 devices and I was running btrfs scrub once a week. Unfortunately, one disk (4TB) failed. I added two new disks (6TB each) to the array and now I get:
# btrfs filesystem df /mnt/mountpoint
Data, RAID1: total=6.58TiB, used=6.57TiB
System,
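
For a straight swap of a failed disk, the usual tool is btrfs replace; a sketch, assuming the volume is mounted degraded and the devid of the missing disk is read off btrfs filesystem show (the device names here are placeholders):

# mount -o degraded /dev/sdi1 /mnt/mountpoint
# btrfs filesystem show /mnt/mountpoint
# btrfs replace start <devid-of-missing-disk> /dev/new-disk /mnt/mountpoint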

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread Chris Murphy
What kernel and btrfs-progs version? Also include the entire dmesg. Chris Murphy

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread constantine
I have Arch Linux:
# uname -a
Linux hostname 3.19.0-1-mainline #1 SMP PREEMPT Wed Dec 24 00:27:17 WET 2014 x86_64 GNU/Linux
btrfs-progs 3.17.3-1
dmesg:
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread Chris Murphy
On Sun, Feb 8, 2015 at 2:06 PM, constantine costas.magn...@gmail.com wrote:
[ 78.039253] BTRFS info (device sdc1): disk space caching is enabled
[ 78.056020] BTRFS: failed to read chunk tree on sdc1
[ 78.091062] BTRFS: open_ctree failed
[ 84.729944] BTRFS info (device sdc1): allowing

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread constantine
Thank you very much for your help. I do not have any recovery backup and I need these data :( Before my problems began I was running btrfs scrub on a weekly basis and I only got 17 uncorrectable errors for this array, concerning files that I do not care about, so I ignored them. I clearly should

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread constantine
By the way, /dev/sdc just completed the extended offline test without any error... I feel so confused, constantine On Sun, Feb 8, 2015 at 11:04 PM, constantine costas.magn...@gmail.com wrote: Thank you very much for your help. I do not have any recovery backup and I need these data :(
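
For reference, that test is typically driven and read back with smartctl (substitute the real device):

# smartctl -t long /dev/sdc
# smartctl -l selftest /dev/sdc
# smartctl -l scterc /dev/sdc

The first line starts the extended offline self-test, the second shows its result once it completes, and the third reports whether the drive supports SCT ERC at all, which matters for the timeout discussion elsewhere in this thread.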

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread Brendan Hide
On 2015/02/09 01:58, constantine wrote: Second, SMART is only saying its internal test is good. The errors are related to data transfer, so that implicates the enclosure (bridge chipset or electronics), the cable, or the controller interface. Actually it could also be a flaky controller or RAM

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread Duncan
Chris Murphy posted on Sun, 08 Feb 2015 15:43:35 -0700 as excerpted: What's confusing is that sdd1, sdi1 and sdg1 have gen 0 and also have corruptions reported, just not anywhere near as many as sdc1. So I don't know what problems you have with your hardware, but they're not restricted to just one or

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread Chris Murphy
On Sun, Feb 8, 2015 at 4:58 PM, constantine costas.magn...@gmail.com wrote: Which test should I do from now on (on a weekly basis?) so as to prevent similar things from happening? You should check dmesg after each scrub and at each mount. You can also use btrfs scrub -Bd mountpoint. This
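
Spelled out, that weekly check might look like this (a sketch; the full subcommand is scrub start, -B keeps it in the foreground so errors are reported when it exits, and -d prints per-device statistics):

# btrfs scrub start -Bd /mnt/mountpoint
# dmesg | grep -iE 'btrfs.*(error|corrupt|csum)'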

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread constantine
I understood my mistake in using consumer drives, and this is why I bought the RED versions some days ago. I would have done this earlier if I had the money. So, to sum up: I have upgraded my btrfs-progs and I have mounted the filesystem with
# mount -o degraded /dev/sdi1 /mnt/mountpoint
I want
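
With the volume mounted degraded, the dead disk's data can be re-replicated onto the remaining devices by deleting it; a sketch (missing is a literal keyword here, not a placeholder):

# btrfs device delete missing /mnt/mountpoint

The delete relocates the missing device's chunks as part of the removal, so a separate balance is not normally needed afterwards.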

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread Chris Murphy
On Sun, Feb 8, 2015 at 4:53 PM, constantine costas.magn...@gmail.com wrote: I understood my mistake in using consumer drives, and this is why I bought the RED versions some days ago. I would have done this earlier if I had the money. You need to raise the SCSI command timer value for the drives
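
That timer lives in sysfs and defaults to 30 seconds; checking and raising it looks like this (substitute the real device names, and note the setting does not survive a reboot, hence the udev rules discussed elsewhere in the thread):

# cat /sys/block/sdc/device/timeout
30
# echo 120 > /sys/block/sdc/device/timeout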

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread Chris Murphy
So I just created a small 5-device btrfs raid1 volume, and added some data to it so all the devices have been written to. Then I unmounted the volume and destroyed one of the devices. So now I have a 4-device raid1 mounted degraded. And I can still device delete another device. So one device
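
The same experiment can be reproduced without spare hardware using loop devices; a rough sketch, assuming /dev/loop0 through /dev/loop4 come up free:

# truncate -s 3G /tmp/d1.img /tmp/d2.img /tmp/d3.img /tmp/d4.img /tmp/d5.img
# for f in /tmp/d?.img; do losetup -f "$f"; done
# mkfs.btrfs -d raid1 -m raid1 /dev/loop0 /dev/loop1 /dev/loop2 /dev/loop3 /dev/loop4
# mount /dev/loop0 /mnt/test
(write some data, then unmount)
# umount /mnt/test
# wipefs -a /dev/loop4
# mount -o degraded /dev/loop0 /mnt/test
# btrfs device delete /dev/loop3 /mnt/test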

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread constantine
Second, SMART is only saying its internal test is good. The errors are related to data transfer, so that implicates the enclosure (bridge chipset or electronics), the cable, or the controller interface. Actually it could also be a flaky controller or RAM on the drive itself too, which I don't

Re: Replacing a (or two?) failed drive(s) in RAID-1 btrfs filesystem

2015-02-08 Thread Chris Murphy
On Sun, Feb 8, 2015 at 4:09 PM, constantine costas.magn...@gmail.com wrote: By the way, /dev/sdc just completed the extended offline test without any error... I feel so confused. First, we know from a number of studies, including the famous (and now kinda old) Google study, that a huge