Re: Failed Disk RAID10 Problems

2014-05-31 Thread Justin Brown
Chris,

Thanks for the continued help. I had to put the recovery on hiatus
while I waited for new hard drives to be delivered. I never was able
to figure out how to replace the failed drive, but I did learn a lot
about how Btrfs works. The approach of performing practically all
operations with the file system mounted was quite a surprise.

In the end, I created a Btrfs RAID5 file system with the newly
delivered drives on another system and used rsync to copy from the
degraded array. There was a little file system damage that showed up
as csum failed errors in the logs from the IO that was in progress
when the original failure occurred. Fortunately, it was all data that
could be recovered from other systems, and there wasn't any need to
troubleshoot the errors.
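(For anyone hitting the same situation, the migration boiled down to roughly
the following; the device names, mount points, and the metadata profile below
are placeholders for this sketch, not my exact commands:)

mkfs.btrfs -L media2 -d raid5 -m raid5 /dev/sdX1 /dev/sdY1 /dev/sdZ1
mount /dev/sdX1 /mnt/new
mount -o degraded,ro /dev/mapper/SAMSUNG_HD103SI_499431FS734755p1 /var/media
rsync -aHAX /var/media/ /mnt/new/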

Thanks,
Justin


On Wed, May 28, 2014 at 3:40 PM, Chris Murphy li...@colorremedies.com wrote:

 On May 28, 2014, at 12:39 PM, Justin Brown justin.br...@fandingo.org wrote:

 Chris,

 Thanks for the tip. I was able to mount the drive as degraded and
 recovery. Then, I deleted the faulty drive, leaving me with the
 following array:


 Label: media  uuid: 7b7afc82-f77c-44c0-b315-669ebd82f0c5

 Total devices 6 FS bytes used 2.40TiB

 devid1 size 931.51GiB used 919.88GiB path
 /dev/mapper/SAMSUNG_HD103SI_499431FS734755p1

 devid2 size 931.51GiB used 919.38GiB path /dev/dm-8

 devid3 size 1.82TiB used 1.19TiB path /dev/dm-6

 devid4 size 931.51GiB used 919.88GiB path /dev/dm-5

 devid5 size 0.00 used 918.38GiB path /dev/dm-11

 devid6 size 1.82TiB used 3.88GiB path /dev/dm-9


 /dev/dm-11 is the failed drive. I take it that size 0 is a good sign.
 I'm not really sure where to go from here. I tried rebooting the
 system with the failed drive attached, and Btrfs re-adds it to the
 array. Should I physically remove the drive now? Is a balance
 recommended?

 I'm going to guess at what I think has happened. You had a 5 device raid10. 
 devid 5 is the failed device, but at the time you added new device devid 6, 
 it was not considered failed by btrfs. Your first btrfs fi show does not show 
 size 0 for devid 5. So I think btrfs made you a 6 device raid10 volume.

 But now devid 5 has failed and shows up as size 0. The reason you still have 
 to mount degraded is that you have a 6 device raid10 now, and 1 device has 
 failed. And you can't remove the failed device because you've mounted 
 degraded. So it was actually a mistake to add a new device first, but it's an 
 easy mistake to make because right now btrfs tolerates a lot of error 
 conditions where it probably should give up and outright fail the device.

 So I think you might have to get a 7th device to fix this with btrfs replace 
 start. You can later delete devices once you're not mounted degraded. Or you 
 can just do a backup now while you can mount degraded, and then blow away the 
 btrfs volume and start over.

 If you have current backups and are willing to lose data on this volume, 
 you can try the following:

 1. Poweroff, remove the failed drive, boot, and do a normal mount. That 
 probably won't work, but it's worth a shot. If it doesn't work, try mount -o 
 degraded. [That might not work either, in which case stop here; I think 
 you'll need to go with a 7th device and use 'btrfs replace start 5 
 /dev/newdevice7 /mp'. That will explicitly replace failed device 5 with the 
 new device.]

 2. Assuming mount -o degraded works, take a btrfs fi show. There should be a 
 missing device listed. Now try btrfs device delete missing /mp and see what 
 happens. If it at least doesn't complain, it means it's working and might 
 take hours to replicate data that was on the missing device onto the new one. 
 So I'd leave it alone until iotop or something like that tells you it's not 
 busy anymore.

 3. Unmount the file system. Try to mount normally (not degraded).



 Chris Murphy


Failed Disk RAID10 Problems

2014-05-28 Thread Justin Brown
Hi,

I have a Btrfs RAID 10 (data and metadata) file system that I believe
suffered a disk failure. In my attempt to replace the disk, I think
that I've made the problem worse and need some help recovering it.

I happened to notice a lot of errors in the journal:

end_request: I/O error, dev dm-11, sector 1549378344
BTRFS: bdev /dev/mapper/Hitachi_HDS721010KLA330_GTA040PBG71HXF1 errs:
wr 759675, rd 539730, flush 23, corrupt 0, gen 0

The file system continued to work for some time, but eventually an NFS
client encountered IO errors. I figured that the device was failing (it
was very old). I attached a new drive to the hot-swappable SATA slot
on my computer, partitioned it with GPT, and ran partprobe to detect
it. Next, I attempted to add the new device, which was successful.
However, something peculiar happened:

~: btrfs fi df /var/media/
Data, RAID10: total=2.33TiB, used=2.33TiB
Data, RAID6: total=72.00GiB, used=71.96GiB
System, RAID10: total=96.00MiB, used=272.00KiB
Metadata, RAID10: total=4.12GiB, used=2.60GiB

I don't know where that RAID6 allocation came from, but it did not
exist over the weekend when I last checked. I attempted to run a
balance operation, but this is when the IO errors became severe, and I
cancelled it. Next, I tried to remove the failed device, thinking that
Btrfs could rebalance after that. Removing the failed device failed:

~: btrfs device delete /dev/dm-11 /var/media
ERROR: error removing the device '/dev/dm-11' - Device or resource busy

I shut down the system and detached the failed disk. Upon reboot, I
cannot mount the filesystem:

~: mount /dev/mapper/SAMSUNG_HD103SI_499431FS734755p1 /var/media
mount: wrong fs type, bad option, bad superblock on
/dev/mapper/SAMSUNG_HD103SI_499431FS734755p1,
   missing codepage or helper program, or other error

   In some cases useful info is found in syslog - try
   dmesg | tail or so.

BTRFS: device label media devid 2 transid 44804
/dev/mapper/WDC_WD10EACS-00D6B0_WD-WCAU40229179p1
BTRFS info (device dm-10): disk space caching is enabled
BTRFS: failed to read the system array on dm-10
BTRFS: open_ctree failed

I reattached the failed disk, and I'm still getting the same mount
error as above.

Here's where the array currently stands:

Label: 'media'  uuid: 7b7afc82-f77c-44c0-b315-669ebd82f0c5
Total devices 5 FS bytes used 2.39TiB
devid1 size 931.51GiB used 919.41GiB path
/dev/mapper/SAMSUNG_HD103SI_499431FS734755p1
devid2 size 931.51GiB used 919.41GiB path
/dev/mapper/WDC_WD10EACS-00D6B0_WD-WCAU40229179p1
devid3 size 1.82TiB used 1.19TiB path
/dev/mapper/WDC_WD20EFRX-68AX9N0_WD-WMC1T1268493p1
devid4 size 931.51GiB used 920.41GiB path
/dev/mapper/WDC_WD10EARS-00Y5B1_WD-WMAV50654875p1
devid5 size 931.51GiB used 918.50GiB path
/dev/mapper/Hitachi_HDS721010KLA330_GTA040PBG71HXF1
devid6 size 1.82TiB used 3.41GiB path
/dev/mapper/WDC_WD20EFRX-68AX9N0_WD-WMC300239240p1

Btrfs v3.12

Devid 6 is the drive that I added earlier.

What can I do to recover this file system? I have another spare drive
that I can use if it's any help.

Thanks,
Justin


Re: Failed Disk RAID10 Problems

2014-05-28 Thread Chris Murphy

On May 28, 2014, at 12:19 AM, Justin Brown justin.br...@fandingo.org wrote:

 Hi,
 
 I have a Btrfs RAID 10 (data and metadata) file system that I believe
 suffered a disk failure. In my attempt to replace the disk, I think
 that I've made the problem worse and need some help recovering it.
 
 I happened to notice a lot of errors in the journal:
 
 end_request: I/O error, dev dm-11, sector 1549378344
 BTRFS: bdev /dev/mapper/Hitachi_HDS721010KLA330_GTA040PBG71HXF1 errs:
 wr 759675, rd 539730, flush 23, corrupt 0, gen 0
 
 The file system continued to work for some time, but eventually a NFS
 client encountered IO errors. I figured that device was failing (It
 was very old.). I attached a new drive to the hot-swappable SATA slot
 on my computer, partitioned it with GPT, and ran partprobe to detect
 it. Next I attempted to add a new device, which was successful.

For future reference, it should work to add a device and then use btrfs device 
delete missing. But I've found btrfs replace start to be more reliable. It does 
the add, delete and balance in one step.
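Roughly, with placeholder device paths (substitute your real new disk and
mount point):

# add-then-delete
btrfs device add /dev/sdNEW1 /var/media
btrfs device delete missing /var/media

# or the one-step replacement; 5 is whatever devid 'btrfs fi show' reports
# for the failed disk
btrfs replace start 5 /dev/sdNEW1 /var/media
btrfs replace status /var/media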


 ~: mount /dev/mapper/SAMSUNG_HD103SI_499431FS734755p1 /var/media
 mount: wrong fs type, bad option, bad superblock on
 /dev/mapper/SAMSUNG_HD103SI_499431FS734755p1,
   missing codepage or helper program, or other error
 
   In some cases useful info is found in syslog - try
   dmesg | tail or so.
 
 BTRFS: device label media devid 2 transid 44804
 /dev/mapper/WDC_WD10EACS-00D6B0_WD-WCAU40229179p1
 BTRFS info (device dm-10): disk space caching is enabled
 BTRFS: failed to read the system array on dm-10
 BTRFS: open_ctree failed

I'd try in order:

mount -o degraded,ro
mount -o recovery,ro
mount -o degraded,recovery,ro

If any of those works, then update your backup before trying anything else. 
Whatever command above worked, try it without ro.
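For example (the device path is the one from your failed mount attempt; any
present member of the volume should do, and /mnt/backup is a placeholder for
wherever your backup lives):

mount -o degraded,ro /dev/mapper/SAMSUNG_HD103SI_499431FS734755p1 /var/media
rsync -aHAX /var/media/ /mnt/backup/
umount /var/media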

If a degraded option is needed, that makes me think a btrfs device delete 
missing won't work, but then I'm also not seeing a missing device in your btrfs 
fi show either. You definitely need to make sure the device producing the 
errors is the one that's missing and the one you're removing.
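A couple of generic ways to double-check which physical disk is behind a
dm-N name before removing anything (nothing btrfs-specific here):

btrfs fi show
ls -l /dev/mapper/    # the symlinks map the friendly names to the dm-N nodes
dmsetup ls            # device-mapper names with their major:minor numbers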

Chris Murphy



Re: Failed Disk RAID10 Problems

2014-05-28 Thread Chris Murphy

On May 28, 2014, at 1:03 AM, Chris Murphy li...@colorremedies.com wrote:
 
 For future reference, it should work to add a device and then use btrfs device 
 delete missing.

it should work (if not it's probably a bug).

Chris Murphy



Fwd: Failed Disk RAID10 Problems

2014-05-28 Thread Justin Brown
Chris,

Thanks for the tip. I was able to mount the file system with the degraded and
recovery options. Then, I deleted the faulty drive, leaving me with the
following array:


Label: media  uuid: 7b7afc82-f77c-44c0-b315-669ebd82f0c5

Total devices 6 FS bytes used 2.40TiB

devid1 size 931.51GiB used 919.88GiB path
/dev/mapper/SAMSUNG_HD103SI_499431FS734755p1

devid2 size 931.51GiB used 919.38GiB path /dev/dm-8

devid3 size 1.82TiB used 1.19TiB path /dev/dm-6

devid4 size 931.51GiB used 919.88GiB path /dev/dm-5

devid5 size 0.00 used 918.38GiB path /dev/dm-11

devid6 size 1.82TiB used 3.88GiB path /dev/dm-9


/dev/dm-11 is the failed drive. I take it that size 0 is a good sign.
I'm not really sure where to go from here. I tried rebooting the
system with the failed drive attached, and Btrfs re-adds it to the
array. Should I physically remove the drive now? Is a balance
recommended?


Thanks,

Justin


Re: Failed Disk RAID10 Problems

2014-05-28 Thread Chris Murphy

On May 28, 2014, at 12:39 PM, Justin Brown justin.br...@fandingo.org wrote:

 Chris,
 
 Thanks for the tip. I was able to mount the drive as degraded and
 recovery. Then, I deleted the faulty drive, leaving me with the
 following array:
 
 
 Label: media  uuid: 7b7afc82-f77c-44c0-b315-669ebd82f0c5
 
 Total devices 6 FS bytes used 2.40TiB
 
 devid1 size 931.51GiB used 919.88GiB path
 /dev/mapper/SAMSUNG_HD103SI_499431FS734755p1
 
 devid2 size 931.51GiB used 919.38GiB path /dev/dm-8
 
 devid3 size 1.82TiB used 1.19TiB path /dev/dm-6
 
 devid4 size 931.51GiB used 919.88GiB path /dev/dm-5
 
 devid5 size 0.00 used 918.38GiB path /dev/dm-11
 
 devid6 size 1.82TiB used 3.88GiB path /dev/dm-9
 
 
 /dev/dm-11 is the failed drive.

You say you deleted a faulty drive, and that dm-11 is a failed drive. Is there 
a difference between the faulty drive and the failed drive, or are they the 
same drive? And which drive is the one you said you successfully added?

I don't see how you have a 6-device raid10 with one failed and one added 
device. You need an even number of good drives to fix this.


 I take it that size 0 is a good sign.

Seems neither good nor bad to me; it's 0 presumably because it's a dead drive 
and therefore Btrfs isn't getting device information from it.

 I'm not really sure where to go from here. I tried rebooting the
 system with the failed drive attached, and Btrfs re-adds it to the
 array. Should I physically remove the drive now? Is a balance
 recommended?

No, don't do anything else until someone actually understands which drives are 
faulty vs. failed vs. added.


Chris Murphy


Re: Failed Disk RAID10 Problems

2014-05-28 Thread Chris Murphy

On May 28, 2014, at 12:39 PM, Justin Brown justin.br...@fandingo.org wrote:

 Chris,
 
 Thanks for the tip. I was able to mount the drive as degraded and
 recovery. Then, I deleted the faulty drive, leaving me with the
 following array:
 
 
 Label: media  uuid: 7b7afc82-f77c-44c0-b315-669ebd82f0c5
 
 Total devices 6 FS bytes used 2.40TiB
 
 devid1 size 931.51GiB used 919.88GiB path
 /dev/mapper/SAMSUNG_HD103SI_499431FS734755p1
 
 devid2 size 931.51GiB used 919.38GiB path /dev/dm-8
 
 devid3 size 1.82TiB used 1.19TiB path /dev/dm-6
 
 devid4 size 931.51GiB used 919.88GiB path /dev/dm-5
 
 devid5 size 0.00 used 918.38GiB path /dev/dm-11
 
 devid6 size 1.82TiB used 3.88GiB path /dev/dm-9
 
 
 /dev/dm-11 is the failed drive. I take it that size 0 is a good sign.
 I'm not really sure where to go from here. I tried rebooting the
 system with the failed drive attached, and Btrfs re-adds it to the
 array. Should I physically remove the drive now? Is a balance
 recommended?

I'm going to guess at what I think has happened. You had a 5 device raid10. 
devid 5 is the failed device, but at the time you added new device devid 6, it 
was not considered failed by btrfs. Your first btrfs fi show does not show size 
0 for devid 5. So I think btrfs made you a 6 device raid10 volume.

But now devid 5 has failed and shows up as size 0. The reason you still have 
to mount degraded is that you have a 6 device raid10 now, and 1 device has 
failed. And you can't remove the failed device because you've mounted degraded. 
So it was actually a mistake to add a new device first, but it's an easy 
mistake to make because right now btrfs tolerates a lot of error conditions 
where it probably should give up and outright fail the device.

So I think you might have to get a 7th device to fix this with btrfs replace 
start. You can later delete devices once you're not mounted degraded. Or you 
can just do a backup now while you can mount degraded, and then blow away the 
btrfs volume and start over.

If you have current backups and are willing to lose data on this volume, you 
can try the following:

1. Poweroff, remove the failed drive, boot, and do a normal mount. That 
probably won't work, but it's worth a shot. If it doesn't work, try mount -o 
degraded. [That might not work either, in which case stop here; I think you'll 
need to go with a 7th device and use 'btrfs replace start 5 /dev/newdevice7 
/mp'. That will explicitly replace failed device 5 with the new device.]

2. Assuming mount -o degraded works, take a btrfs fi show. There should be a 
missing device listed. Now try btrfs device delete missing /mp and see what 
happens. If it at least doesn't complain, it means it's working and might take 
hours to replicate data that was on the missing device onto the new one. So I'd 
leave it alone until iotop or something like that tells you it's not busy 
anymore.

3. Unmount the file system. Try to mount normally (not degraded).
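Put together, and assuming /mnt as the mount point, that sequence would look
something like:

mount -o degraded /dev/mapper/SAMSUNG_HD103SI_499431FS734755p1 /mnt
btrfs fi show                      # should now list a missing device
btrfs device delete missing /mnt   # may run for hours; watch iotop
umount /mnt
mount /dev/mapper/SAMSUNG_HD103SI_499431FS734755p1 /mnt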



Chris Murphy


Re: Scrub causes oom after removal of failed disk (linux 3.10)

2013-07-08 Thread David Sterba
On Wed, Jul 03, 2013 at 08:35:48PM +0200, Torbjørn wrote:
 Hi btrfs devs,
 
 I have a btrfs raid10 array consisting of 2TB drives.
 
 I added a new drive to the array, then balanced.
 The balance failed after ~50GB was moved to the new drive.
 The balance fixed lots of errors according to dmesg.
 
 Server rebooted
 
 The newly added drive was no longer detected as a btrfs disk.
 The array was then mounted -o recovery
 I ran btrfs dev del missing, and everything seemed to be fine.
 
 After this I ran a scrub on the array.
 The scrub was soon stopped by the oom-killer.
 
 After another reboot I started a new scrub.
 About 3TB into the scrub over 10 GB of memory was being consumed.
 The scrub had then fixed roughly 3,000,000 errors.
 
 Canceling the scrub and resuming it frees the 10 GB of memory.

Thanks for the report.

This looks like the same problem that was fixed by
https://patchwork.kernel.org/patch/2697501/
Btrfs: free csums when we're done scrubbing an extent

but I don't see it included in the current for-linus branch. We want
this in the 3.10.x stable series and according to stable tree policy it
has to be merged into Linus' tree first.

david


Re: Scrub causes oom after removal of failed disk (linux 3.10)

2013-07-08 Thread Torbjørn Skagestad

On 07/08/2013 11:36 PM, David Sterba wrote:

On Wed, Jul 03, 2013 at 08:35:48PM +0200, Torbjørn wrote:

Hi btrfs devs,

I have a btrfs raid10 array consisting of 2TB drives.

I added a new drive to the array, then balanced.
The balance failed after ~50GB was moved to the new drive.
The balance fixed lots of errors according to dmesg.

Server rebooted

The newly added drive was no longer detected as a btrfs disk.
The array was then mounted -o recovery
I ran btrfs dev del missing, and everything seemed to be fine.

After this I ran a scrub on the array.
The scrub was soon stopped by the oom-killer.

After another reboot I started a new scrub.
About 3TB into the scrub over 10 GB of memory was being consumed.
The scrub had then fixed roughly 3,000,000 errors.

Canceling the scrub and resuming it frees the 10 GB of memory.

Thanks for the report.

This looks like the same problem that was fixed by
https://patchwork.kernel.org/patch/2697501/
Btrfs: free csums when we're done scrubbing an extent

but I don't see it included in the current for-linus branch. We want
this in the 3.10.x stable series and according to stable tree policy it
has to be merged into Linus' tree first.

david


Ok, thanks

--
Torbjørn


Scrub causes oom after removal of failed disk (linux 3.10)

2013-07-03 Thread Torbjørn

Hi btrfs devs,

I have a btrfs raid10 array consisting of 2TB drives.

I added a new drive to the array, then balanced.
The balance failed after ~50GB was moved to the new drive.
The balance fixed lots of errors according to dmesg.

Server rebooted

The newly added drive was no longer detected as a btrfs disk.
The array was then mounted -o recovery
I ran btrfs dev del missing, and everything seemed to be fine.

After this I ran a scrub on the array.
The scrub was soon stopped by the oom-killer.

After another reboot I started a new scrub.
About 3TB into the scrub over 10 GB of memory was being consumed.
The scrub had then fixed roughly 3,000,000 errors.

Canceling the scrub and resuming it frees the 10 GB of memory.

I'm assuming this is not expected behavior.
If I can help in any way please let me know.
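For reference, the sequence above amounts to roughly this (device names and
the /mnt/array mount point are placeholders):

btrfs device add /dev/sdNEW1 /mnt/array
btrfs balance start /mnt/array          # failed after ~50GB
# reboot; the new drive was no longer recognized as btrfs
mount -o recovery /dev/sdOLD1 /mnt/array
btrfs device delete missing /mnt/array
btrfs scrub start /mnt/array            # killed by the oom-killer
btrfs scrub cancel /mnt/array           # cancel/resume frees the ~10 GB
btrfs scrub resume /mnt/array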

dmesg from the failed balance:

[68190.748909] btrfs csum failed ino 1512 extent 1540228509696 csum 
2089345036 wanted 864794082 mirror 1
[68190.809090] BUG: unable to handle kernel paging request at 
87fe167a32c0
[68190.814638] IP: [a0272287] repair_io_failure+0x117/0x230 
[btrfs]

[68190.820709] PGD 0
[68190.826781] Oops:  [#1] SMP
[68190.833090] Modules linked in: xfs ip6table_filter ip6_tables 
ebtable_nat ebtables ipt_MASQUERADE ipt_REJECT xt_CHECKSUM sch_prio 
bridge stp llc dm_crypt xt_state iptable_filter xt_CLASSIFY xt_tcpudp 
xt_DSCP iptable_mangle iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 
nf_nat_ipv4 nf_nat nf_conntrack ip_tables x_tables intel_powerclamp 
kvm_intel kvm psmouse serio_raw microcode lpc_ich ppdev parport_pc 
w83627ehf hwmon_vid coretemp nfsd nfs_acl auth_rpcgss nfs lp fscache 
lockd parport sunrpc btrfs zlib_deflate libcrc32c raid10 raid1 raid0 
multipath linear raid456 async_pq async_xor xor async_memcpy 
async_raid6_recov raid6_pq async_tx hid_generic usbhid hid ast ttm 
drm_kms_helper crc32_pclmul ghash_clmulni_intel drm aesni_intel 
ablk_helper cryptd lrw i2c_algo_bit gf128mul sysimgblt glue_helper 
sysfillrect aes_x86_64 syscopyarea e1000e mpt2sas ahci ptp libahci 
scsi_transport_sas pps_core raid_class video

[68190.926164] CPU: 3 PID: 16472 Comm: btrfs-endio-8 Not tainted 3.10.0+ #11
[68190.941478] Hardware name: To be filled by O.E.M. To be filled by 
O.E.M./P8B-X series, BIOS 2107 05/04/2012
[68190.957876] task: 880125fe1740 ti: 8802cb6ee000 task.ti: 
8802cb6ee000
[68190.974836] RIP: 0010:[a0272287] [a0272287] 
repair_io_failure+0x117/0x230 [btrfs]

[68190.992754] RSP: 0018:8802cb6efca8  EFLAGS: 00010287
[68191.010933] RAX: fffa43a60fe8 RBX: 1000 RCX: 
019e1ce5e000
[68191.029830] RDX: 8803d2d422a0 RSI: 8802cb6efcc0 RDI: 
8803dd444be0
[68191.049014] RBP: 8802cb6efd18 R08:  R09: 

[68191.068584] R10: c2d195ff R11: 3fb5 R12: 
880416adc000
[68191.088446] R13: 0929834ae000 R14: c2d19600 R15: 
8803db4c5910
[68191.108648] FS:  () GS:88042fcc() 
knlGS:

[68191.129491] CS:  0010 DS:  ES:  CR0: 80050033
[68191.150587] CR2: 87fe167a32c0 CR3: 01c0c000 CR4: 
001427e0
[68191.172318] DR0:  DR1:  DR2: 

[68191.194339] DR3:  DR6: 0ff0 DR7: 
0400

[68191.216403] Stack:
[68191.238415]  0006c000 ea0005becf40 2000 
8803d2d422a0
[68191.261527]  8802  8802cb6efcd8 
8802cb6efcd8
[68191.285026]  8802cb6efd18 0006c000 8802137488a0 
ea0005becf40

[68191.308855] Call Trace:
[68191.332739]  [a0272bdf] end_bio_extent_readpage+0x78f/0x7f0 
[btrfs]

[68191.357675]  [811a38ad] bio_endio+0x1d/0x30
[68191.382816]  [a024cf41] end_workqueue_fn+0x41/0x50 [btrfs]
[68191.408455]  [a02822d8] worker_loop+0x148/0x520 [btrfs]
[68191.434422]  [816902c7] ? __schedule+0x3d7/0x800
[68191.460669]  [a0282190] ? btrfs_queue_worker+0x320/0x320 
[btrfs]

[68191.487415]  [81064410] kthread+0xc0/0xd0
[68191.514246]  [81064350] ? kthread_create_on_node+0x130/0x130
[68191.541603]  [81699f1c] ret_from_fork+0x7c/0xb0
[68191.569187]  [81064350] ? kthread_create_on_node+0x130/0x130
[68191.597279] Code: a0 e8 4e c1 00 00 85 c0 0f 85 b6 00 00 00 48 8b 55 
a8 44 3b 72 2c 0f 85 e8 00 00 00 45 8d 56 ff 4d 63 d2 4b 8d 04 52 48 c1 
e0 03 4c 8b 6c 02 38 49 c1 ed 09 4d 89 2f 48 8b 7d a8 4c 8b 64 07 30
[68191.657235] RIP  [a0272287] repair_io_failure+0x117/0x230 
[btrfs]

[68191.687689]  RSP 8802cb6efca8
[68191.718273] CR2: 87fe167a32c0
[68191.870900] ---[ end trace ad5eb9d56280bbe5 ]---
[68191.870902] BUG: unable to handle kernel paging request at 
87f6dfa7da60
[68191.870910] IP: [a0272287] repair_io_failure+0x117/0x230 
[btrfs]

[68191.870911] PGD 0
[68191.870912] Oops:  [#2] SMP
[68191.870992] Modules linked in: xfs 

failed disk (was: kernel 3.3.4 damages filesystem (?))

2012-05-09 Thread Helmut Hullen
Hello, Hugo,

You wrote on 07.05.12:

mkfs.btrfs -m raid1 -d single should give you that.

 What's the difference to

  mkfs.btrfs -m raid1 -d raid0

  - RAID-0 stripes each piece of data across all the disks.
  - single puts data on one disk at a time.

[...]


In fact, this is probably a good argument for having the option to
 put back the old allocator algorithm, which would have ensured that
 the first disk would fill up completely first before it touched the
 next one...

The current version seems to alternate from disk to disk:

Copying about 160 GiB shows

Label: none  uuid: fd0596c6-d819-42cd-bb4a-420c38d2a60b
Total devices 2 FS bytes used 155.64GB
devid2 size 136.73GB used 114.00GB path /dev/sdl1
devid1 size 68.37GB used 45.04GB path /dev/sdk1

Btrfs Btrfs v0.19



Watching the amounts showed that both disks were being filled nearly
simultaneously.

That would be more difficult to restore ...

Best regards!
Helmut


failed disk (was: kernel 3.3.4 damages filesystem (?))

2012-05-09 Thread Helmut Hullen
Hello, Hugo,

You wrote on 07.05.12:

[...]

 With a file system like ext2/3/4 I can work with several directories
 which are mounted together, but (as said before) one broken disk
 doesn't disturb the others.

mkfs.btrfs -m raid1 -d single should give you that.

Just a small bug, perhaps:

created a system with

mkfs.btrfs -m raid1 -d single /dev/sdl1
mount /dev/sdl1 /mnt/Scsi
btrfs device add /dev/sdk1 /mnt/Scsi
btrfs device add /dev/sdm1 /mnt/Scsi
(filling with data)

and

btrfs fi df /mnt/Scsi

now tells

Data, RAID0: total=183.18GB, used=76.60GB
Data: total=80.01GB, used=79.83GB
System, DUP: total=8.00MB, used=32.00KB
System: total=4.00MB, used=0.00
Metadata, DUP: total=1.00GB, used=192.74MB
Metadata: total=8.00MB, used=0.00

--

Data, RAID0 confuses me (not very much ...), and the RAID level for the
metadata (RAID1) is not shown.


Best regards!
Helmut


Re: failed disk (was: kernel 3.3.4 damages filesystem (?))

2012-05-09 Thread Hugo Mills
On Wed, May 09, 2012 at 04:25:00PM +0200, Helmut Hullen wrote:
 You wrote on 07.05.12:
 
 [...]
 
  With a file system like ext2/3/4 I can work with several directories
  which are mounted together, but (as said before) one broken disk
  doesn't disturb the others.
 
 mkfs.btrfs -m raid1 -d single should give you that.
 
 Just a small bug, perhaps:
 
 created a system with
 
 mkfs.btrfs -m raid1 -d single /dev/sdl1
 mount /dev/sdl1 /mnt/Scsi
 btrfs device add /dev/sdk1 /mnt/Scsi
 btrfs device add /dev/sdm1 /mnt/Scsi
 (filling with data)
 
 and
 
 btrfs fi df /mnt/Scsi
 
 now tells
 
 Data, RAID0: total=183.18GB, used=76.60GB
 Data: total=80.01GB, used=79.83GB
 System, DUP: total=8.00MB, used=32.00KB
 System: total=4.00MB, used=0.00
 Metadata, DUP: total=1.00GB, used=192.74MB
 Metadata: total=8.00MB, used=0.00
 
 --
 
 Data, RAID0 confuses me (not very much ...), and the system for  
 metadata (RAID1) is not told.

   DUP is two copies of each block, but it allows the two copies to
live on the same device. It's done this way because you started with a
single device, and you can't do RAID-1 on one device. The first bit of
metadata you write to it should automatically upgrade the DUP chunk to
RAID-1.

   As to the spurious upgrade of single to RAID-0, I thought Ilya
had stopped it doing that. What kernel version are you running?

   Out of interest, why did you do the device adds separately, instead
of just this?

# mkfs.btrfs -m raid1 -d single /dev/sdl1 /dev/sdk1 /dev/sdm1

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Comic Sans goes into a bar,  and the barman says, We don't ---   
 serve your type here.  




Re: failed disk

2012-05-09 Thread Helmut Hullen
Hello, Hugo,

You wrote on 09.05.12:

mkfs.btrfs -m raid1 -d single should give you that.

 Just a small bug, perhaps:

 created a system with

 mkfs.btrfs -m raid1 -d single /dev/sdl1
 mount /dev/sdl1 /mnt/Scsi
 btrfs device add /dev/sdk1 /mnt/Scsi
 btrfs device add /dev/sdm1 /mnt/Scsi
 (filling with data)

 and

 btrfs fi df /mnt/Scsi

 now tells

 Data, RAID0: total=183.18GB, used=76.60GB
 Data: total=80.01GB, used=79.83GB
 System, DUP: total=8.00MB, used=32.00KB
 System: total=4.00MB, used=0.00
 Metadata, DUP: total=1.00GB, used=192.74MB
 Metadata: total=8.00MB, used=0.00

 --

 Data, RAID0 confuses me (not very much ...), and the system for
 metadata (RAID1) is not told.

DUP is two copies of each block, but it allows the two copies to
 live on the same device. It's done this because you started with a
 single device, and you can't do RAID-1 on one device. The first bit
 of metadata you write to it should automatically upgrade the DUP
 chunk to RAID-1.

Ok.

Sounds familiar - have you explained that to me many months ago?

As to the spurious upgrade of single to RAID-0, I thought Ilya
 had stopped it doing that. What kernel version are you running?

3.2.9, self-built.
I could test the message with 3.3.4, but not today (if it's only a
different interpretation of the same data).

Out of interest, why did you do the device adds separately,
 instead of just this?

a) making the first 2 devices: I have tested both versions (one line
with 2 devices, or 2 lines with 1 device each); no big difference.

But I had also tested the -L (labelling) option, and that causes trouble
for the one-liner: both devices get the same label, and then findfs
finds neither of them.

The really safe way would be to drop this option from the mkfs.btrfs
command and only use

btrfs fi label device [newlabel]

b) third device: that's my usual test:
make a cluster of 2 devices
fill them with data
add a third device
delete the smallest device

Best regards!
Helmut


Re: failed disk

2012-05-09 Thread Hugo Mills
On Wed, May 09, 2012 at 05:14:00PM +0200, Helmut Hullen wrote:
 Hello, Hugo,
 
 You wrote on 09.05.12:
 
 DUP is two copies of each block, but it allows the two copies to
  live on the same device. It's done this because you started with a
  single device, and you can't do RAID-1 on one device. The first bit
  of metadata you write to it should automatically upgrade the DUP
  chunk to RAID-1.
 
 Ok.
 
 Sounds familiar - have you explained that to me many months ago?

   Probably. I tend to explain this kind of thing a lot to people.

 As to the spurious upgrade of single to RAID-0, I thought Ilya
  had stopped it doing that. What kernel version are you running?
 
 3.2.9, self made.

   OK, I'm pretty sure that's too old -- it will upgrade single to
RAID-0. You can probably turn it back to single using balance
filters:

# btrfs fi balance -dconvert=single /mountpoint

(You may want to write at least a little data to the FS first --
balance has some slightly odd behaviour on empty filesystems).
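Once that balance finishes, one way to check whether the conversion took:

# btrfs fi df /mountpoint

(the Data, RAID0 chunks should be gone once the balance completes)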

 I could test the message with 3.3.4, but not today (if it's only an  
 interpretation of always the same data).
 
 Out of interest, why did you do the device adds separately,
  instead of just this?
 
 a) making the first 2 devices: I have tested both versions (one line  
 with 2 devices or 2 lines with 1 device); no big difference.
 
 But I had tested the option -L (labelling) too, and that makes shit  
 for the oneliner: both devices get the same label, and then findfs  
 finds none of them.

   Umm... Yes, of course both devices will get the same label --
you're labelling the filesystem, not the devices. (Didn't we have this
argument some time ago?).

   I don't know what findfs is doing that it can't find the
filesystem by label; you may need to run sync after mkfs, possibly.

 The really safe way would be: deleting this option for the mkfs.btrfs  
 command and only using
 
 btrfs fi label device [newlabel]

   ... except that it'd have to take a filesystem as parameter, not a
device (see above).

 b) third device: that's my usual test:
 make a cluster of 2 devices
 fill them with data
 add a third device
 delete the smallest device

   What are you testing? And by delete do you mean btrfs dev
delete or pull the cable out?

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Quidquid latine dictum sit,  altum videtur. ---   




Re: failed disk (was: kernel 3.3.4 damages filesystem (?))

2012-05-09 Thread Ilya Dryomov
On Wed, May 09, 2012 at 03:37:35PM +0100, Hugo Mills wrote:
 On Wed, May 09, 2012 at 04:25:00PM +0200, Helmut Hullen wrote:
  You wrote on 07.05.12:
  
  [...]
  
   With a file system like ext2/3/4 I can work with several directories
   which are mounted together, but (as said before) one broken disk
   doesn't disturb the others.
  
  mkfs.btrfs -m raid1 -d single should give you that.
  
  Just a small bug, perhaps:
  
  created a system with
  
  mkfs.btrfs -m raid1 -d single /dev/sdl1
  mount /dev/sdl1 /mnt/Scsi
  btrfs device add /dev/sdk1 /mnt/Scsi
  btrfs device add /dev/sdm1 /mnt/Scsi
  (filling with data)
  
  and
  
  btrfs fi df /mnt/Scsi
  
  now tells
  
  Data, RAID0: total=183.18GB, used=76.60GB
  Data: total=80.01GB, used=79.83GB
  System, DUP: total=8.00MB, used=32.00KB
  System: total=4.00MB, used=0.00
  Metadata, DUP: total=1.00GB, used=192.74MB
  Metadata: total=8.00MB, used=0.00
  
  --
  
  Data, RAID0 confuses me (not very much ...), and the system for  
  metadata (RAID1) is not told.
 
DUP is two copies of each block, but it allows the two copies to
 live on the same device. It's done this because you started with a
 single device, and you can't do RAID-1 on one device. The first bit of

What Hugo said.  Newer mkfs.btrfs will error out if you try to do this.

 metadata you write to it should automatically upgrade the DUP chunk to
 RAID-1.

We don't upgrade chunks in place, only during balance.

 
As to the spurious upgrade of single to RAID-0, I thought Ilya
 had stopped it doing that. What kernel version are you running?

I did, but again, we were doing it only as part of balance, not as part
of normal operation.

Helmut, do you have any additional data points - the output of btrfs fi
df right after you created the FS, or somewhere in the middle of filling it?

Also, could you please paste the output of btrfs fi show and tell us what
kernel version you are running?
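That is, something like (using the /mnt/Scsi mountpoint from your earlier
commands):

btrfs fi df /mnt/Scsi
btrfs fi show
uname -r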

Thanks,

Ilya


Re: failed disk

2012-05-09 Thread Helmut Hullen
Hello, Hugo,

You wrote on 09.05.12:

As to the spurious upgrade of single to RAID-0, I thought Ilya
 had stopped it doing that. What kernel version are you running?

 3.2.9, self made.

OK, I'm pretty sure that's too old -- it will upgrade single to
 RAID-0. You can probably turn it back to single using balance
 filters:

 # btrfs fi balance -dconvert=single /mountpoint

 (You may want to write at least a little data to the FS first --
 balance has some slightly odd behaviour on empty filesystems).

Tomorrow ... the system is just now running the balance after the device
delete, and that may still need 4 ... 5 hours.

Out of interest, why did you do the device adds separately,
 instead of just this?

 a) making the first 2 devices: I have tested both versions (one line
 with 2 devices or 2 lines with 1 device); no big difference.

 But I had tested the option -L (labelling) too, and that makes
 shit for the oneliner: both devices get the same label, and then
 findfs finds none of them.

Umm... Yes, of course both devices will get the same label --
 you're labelling the filesystem, not the devices. (Didn't we have
 this argument some time ago?).

Not with that special case (and that led me to misinterpret the error
...).

I don't know what findfs is doing, that it can't find the
 filesystem by label: you may need to run sync after mkfs, possibly.

No - findfs works quite simply: if it finds exactly 1 matching label, it
reports the partition.
If it finds more or fewer labels, it reports nothing.

 b) third device: that's my usual test:
 make a cluster of 2 devices
 fill them with data
 add a third device
 delete the smallest device

What are you testing? And by delete do you mean btrfs dev
 delete or pull the cable out?

First a pure software delete. Tomorrow I'll reboot the system and look at
the results with

btrfs fi show

It should show only 2 devices (that's the part which seems to work as
described, at least since kernel 3.2).

By the way: it seems to be necessary to run

btrfs fi balance ...

after btrfs device add ... and after btrfs device delete ...
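Here that amounts to roughly (mountpoint as in the earlier commands; no
filters in this sketch):

btrfs fi balance /mnt/Scsi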

Best regards!
Helmut


Re: failed disk

2012-05-09 Thread Helmut Hullen
Hello, Hugo,

You wrote on 09.05.12:

 btrfs fi df /mnt/Scsi

 now tells

 Data, RAID0: total=183.18GB, used=76.60GB
 Data: total=80.01GB, used=79.83GB
 System, DUP: total=8.00MB, used=32.00KB
 System: total=4.00MB, used=0.00
 Metadata, DUP: total=1.00GB, used=192.74MB
 Metadata: total=8.00MB, used=0.00

 --

 Data, RAID0 confuses me (not very much ...), and the system for
 metadata (RAID1) is not told.

DUP is two copies of each block, but it allows the two copies to
 live on the same device. It's done this because you started with a
 single device, and you can't do RAID-1 on one device. The first bit
 of metadata you write to it should automatically upgrade the DUP
 chunk to RAID-1.

It has done so - OK. Adding and removing disks/partitions works as
expected.

Best regards!
Helmut