Re: Unmountable Array After Drive Failure During Device Deletion

2013-12-26 Thread Chris Samuel
On Sun, 22 Dec 2013 11:35:53 AM Duncan wrote:

 While the btrfs kernel config option no longer (as of 3.12 IIRC) directly 
 calls the filesystem unstable

It'll be in 3.13, the commit was:

$ git describe 4204617d142c0887e45fda2562cb5c58097b918e
v3.12-116-g4204617
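If you want to double-check in a local kernel tree, asking git for the
nearest tag that contains it should confirm the release (what it prints
depends on which tags you have fetched):

$ git describe --contains 4204617d142c0887e45fda2562cb5c58097b918e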

The help text for btrfs now says:

  Btrfs is a general purpose copy-on-write filesystem with extents,
  writable snapshotting, support for multiple devices and many more
  features focused on fault tolerance, repair and easy administration.

  The filesystem disk format is no longer unstable, and it's not
  expected to change unless there are strong reasons to do so. If there
  is a format change, file systems with a unchanged format will
  continue to be mountable and usable by newer kernels.

  For more information, please see the web pages at
  http://btrfs.wiki.kernel.org.

cheers,
Chris
-- 
 Chris Samuel  :  http://www.csamuel.org/  :  Melbourne, VIC

This email may come with a PGP signature as a file. Do not panic.
For more info see: http://en.wikipedia.org/wiki/OpenPGP




Re: Unmountable Array After Drive Failure During Device Deletion

2013-12-22 Thread Duncan
ronnie sahlberg posted on Sat, 21 Dec 2013 17:15:33 -0800 as excerpted:

 Similar things happened to me. (See my unanswered posts ~1Sep, this fs
 is not really ready for production I think)

No argument about that one.  Known fact.  Btrfs is not yet fully stable 
and is still under heavy development, and there are public warnings about 
having backups and being prepared to use them before testing it, too.

[snip]

 Then, depending on how important your data is, you either start making backups
 regularly, or switch to a less fragile, more repairable fs.

Of course, if the data's /that/ important, backups that are tested and 
ready to use have been done already, even if it's on a stable filesystem; 
certainly even more so if it's on a development filesystem.  Otherwise, 
the data simply isn't that important, no matter what people might claim, 
as their actions belie their words!

While the btrfs kernel config option no longer (as of 3.12 IIRC) directly 
calls the filesystem unstable, it /does/ say the disk format is no 
longer expected to change, which doesn't exactly sound entirely stable, 
either (you don't see that sort of qualifier on ext4, for instance, let 
alone ext3, or reiserfs, the generation of filesystem that some of the 
folks who /really/ value their data are either still on or have just 
recently upgraded from).  Further, it points to 
http://btrfs.wiki.kernel.org , which on the main page under stability 
status has this to say:

 The Btrfs code base is under heavy development.

... And under getting started, it says:

 Note also that btrfs is still considered experimental. While many
 people use it reliably, there are still problems being found.

... and (offset in red even)...

 You should keep and test backups of your data, and be prepared to
 use them.

So there's no doubt or secret about it; people should have TESTED backups 
they are PREPARED TO USE before they stick anything on btrfs in the first 
place.

Failure to do that simply demonstrates in action if not in word that the 
data isn't considered valuable enough to bother reading and following the 
warnings about keeping tested backups they're prepared to use, while 
placing it on a filesystem known to be under heavy development and not 
fully stable yet, that being the reason for such warnings in the first 
place!

And the "but I didn't know" excuse isn't valid either, for the same 
reason.  If they care about their data, they have backups even on stable 
filesystems, and if they care /enough/, they've made it their business to 
know the stability of the entire system, including the filesystem, that 
data is on, too.

Anything else... simply demonstrates by their actions that they didn't 
care about their data as much as they might have said they did after 
all.  And when people play the odds on the safety of their data, instead 
of having redundant backups to a level of safety matching the level of 
value they claim to place on that data,  sometimes they lose.  It's as 
simple as that.

(Not that I'm always perfect about keeping current backups either, but if 
I lost the working copy and didn't have a current backup, I'd know 
exactly who to point the finger at... myself!  Meanwhile, while it's not 
always current, for the data I value I do tend to have multiple layers of 
backup, to the degree that if I lose them all, it'll mean something 
drastic enough has happened that I'll have more important things to worry 
about than recovering my data for awhile... like surviving whatever 
disaster took all those levels of backup away at once!  Other than that, 
if I lose the working copy and perhaps the first level of backup, yeah 
I'll be mad at myself for not having kept up with the backups, but oh, 
well, I'll pick up with the backups I have and life will go on.  I've 
done it before and if it comes to it I'll do it again!)

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: Unmountable Array After Drive Failure During Device Deletion

2013-12-21 Thread Chris Kastorff
 - Array is good. All drives are accounted for, btrfs scrub runs cleanly.
 btrfs fi show shows no missing drives and reasonable allocations.
 - I start btrfs dev del to remove devid 9. It chugs along with no
 errors, until:
 - Another drive in the array (NOT THE ONE I RAN DEV DEL ON) fails, and
 all reads and writes to it fail, causing the SCSI errors above.
 - I attempt clean shutdown. It takes too long; my drive
 controller card is buzzing loudly and the neighbors are sensitive to
 noise, so:
 - I power down the machine uncleanly.
 - I remove the failed drive, NOT the one I ran dev del on.
 - I reboot, attempt to mount with various options, all of which cause
 the kernel to yell at me and the mount command returns failure.
 
 devid 9 is device delete in-progress, and while that's occurring devid 15 
 fails completely. Is that correct?

Either devid 14 or devid 10 (from memory) dropped out, devid 15 is still 
working.

 Because previously you reported, in part this:
devid   15 size 1.82TB used 1.47TB path /dev/sdd
*** Some devices missing
 
 And this:
 
 sd 0:2:3:0: [sdd] Unhandled error code

Yeah, those two are from different boots. sdd is the one that dropped out, and 
after a reboot another (working) drive was renumbered to sdd. Sorry for the 
confusion.

(Also note that if devid 15 was missing, it would not be reported in btrfs fi 
show.)

 That's why I was confused. It looks like the dead/missing device is one devid, and 
 then devid 15 /dev/sdd is also having hardware problems - because all of this 
 was posted at the same time. But I take it they're different boots and the 
 /dev/sdd's are actually two different devids.
 
 So devid 9 was deleted and then devid 14 failed. Right? Lovely when 
 /dev/sdX changes between boots.

It never finished the deletion (was probably about halfway through,
based on previous dev dels), but otherwise yes.

 From what I understand, at all points there should be at least two
 copies of every extent during a dev del when all chunks are allocated
 RAID10 (and they are, according to btrfs fi df ran before on the mounted
 fs).

 Because of this, I expect to be able to use the chunks from the (not
 successfully removed) devid=9, as I have done many many times before due
 to other btrfs bugs that needed unclean shutdowns during dev del.
 
 I haven't looked at the code or read anything this specific on the state of 
 the file system during a device delete. But my expectation is that there are 
 1-2 chunks available for writes. And 2-3 chunks available for reads. Some 
 writes must be only one copy because a chunk hasn't yet been replicated 
 elsewhere, and presumably the device being deleted is not subject to writes 
 as the transid also implies. Whereas devid 9 is one set of chunks for 
 reading, those chunks have pre-existing copies elsewhere in the file system 
 so that's two copies. And there's a replication in progress of the soon to be 
 removed chunks. So that's up to three copies.
 
 Problem is that for sure you've lost some chunks due to the failed/missing 
 device. Normal raid10, it's unambiguous whether we've lost two mirrored sets. 
 With Btrfs that's not clear as chunks are distributed. So it's possible that 
 there are some chunks that don't exist at all for writes, and only 1 for 
 reads. It may be no chunks are in common between devid 9 and the dead one. It 
 may be only a couple of data or metadata chunks are in common.
 
 
 

 Under the assumption devid=9 is good, if slightly out of date on
 transid (which ALL data says is true), I should be able to completely
 recover all data, because data that was not modified during the deletion
 resides on devid=9, and data that was modified should be redundantly
 (RAID10) stored on the remaining drives, and thus should work given this
 case of a single drive failure.

 Is this not the case? Does btrfs not maintain redundancy during device
 removal?
 
 Good questions. I'm not certain. But the speculation seems reasonable, not 
 accounting for the missing device. That's what makes this different.
 
 
 
 btrfs read error corrected: ino 1 off 87601116364800 (dev /dev/sdf
 sector 62986400)

 btrfs read error corrected: ino 1 off 87601116798976 (dev /dev/sdg
 sector 113318256)

 I'm not sure what constitutes a btrfs read error, maybe the device it
 originally requested data from didn't have it where it was expected
 but was able to find it on these devices. If the drive itself has a
 problem reading a sector and ECC can't correct it, it reports the
 read error to libata. So kernel messages report this with a line that
 starts with the word exception and then a line with cmd that
 shows what command and LBAs were issued to the drive, and then a
 res line that should contain an error mask with the actual error -
 bus error, media error. Very often you don't see these and instead
 see link reset messages, which means the drive is hanging doing
 something (probably attempting ECC) but then the linux 

Re: Unmountable Array After Drive Failure During Device Deletion

2013-12-21 Thread Chris Murphy

On Dec 21, 2013, at 4:16 PM, Chris Kastorff encryp...@gmail.com wrote:
 
 1. btrfs-image -c 9 -t #cores (see man page)
 This is optional but one of the devs might want to see this because it 
 should be a more rare case that either normal mount fix ups or additional 
 recovery fix ups can't deal with this problem.
 
 This fails:
 
 deep# ./btrfs-image -c 9 -t 4 /dev/sda btrfsimg
 warning, device 14 is missing
 warning devid 14 not found already
 parent transid verify failed on 87601117097984 wanted 4893969 found 4892460
 parent transid verify failed on 87601117097984 wanted 4893969 found 4892460
 Ignoring transid failure
 parent transid verify failed on 87601117097984 wanted 4893969 found 4892460
 Ignoring transid failure
 Error going to next leaf -5
 create failed (Bad file descriptor)

Well, that's unfortunate. Someone else is going to have to comment on the 
confusion of the tools trying to fix the file system while a device is missing; 
the device can't be removed because the file system can't be mounted, and it 
can't be mounted because it needs to be fixed first. Circular problem.

 
 3. Try to mount again with -o degraded,recovery and report back.
 
 Since btrfs-zero-log (probably) didn't modify anything, the error
 message is the same:
 
 btrfs: allowing degraded mounts
 btrfs: enabling auto recovery
 btrfs: disk space caching is enabled
 btrfs: bdev (null) errs: wr 344288, rd 230234, flush 0, corrupt 0, gen 0
 btrfs: bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
 btrfs: bdev /dev/sdg errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
 parent transid verify failed on 87601117097984 wanted 4893969 found 4892460
 Failed to read block groups: -5
 btrfs: open_ctree failed

How about:

-o skip_balance,degraded,recovery

If that fails, try:


-o skip_balance,degraded,recovery,ro

The ro file system probably doesn't let you delete missing, but it's worth a 
shot, because the missing device seems to be what's limiting repairs.

If you still have failure, it's worth repeating with the absolute latest kernel 
you can find or even build.

After that it gets really aggressive, even dangerous, and I'm not sure what to 
recommend next except to leave btrfs check --repair as a dead-last resort. I'd 
sooner go for the ro mount and use btrfs send/receive to get the current data 
you want off the file system, then create a new one from scratch. 
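Roughly, that extraction route would look like the following (device names
and mount points are placeholders, not your actual layout; note that btrfs
send needs a read-only snapshot, so if only a strictly ro mount succeeds
you'd have to fall back to plain cp/rsync):

mount -o skip_balance,degraded,recovery /dev/sdX /mnt/old   # the damaged fs
mount /dev/sdY /mnt/new                                     # a fresh btrfs fs
btrfs subvolume snapshot -r /mnt/old /mnt/old/rescue        # ro snapshot for send
btrfs send /mnt/old/rescue | btrfs receive /mnt/new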

 
 btrfs-zero-log's "Unable to find block group for 0", combined with the
 earlier kernel message on mount attempts ("btrfs: failed to read the
 system array on sdc") and btrfsck's "Couldn't map the block %ld", tells me
 the (first) underlying problem is that the block group tree(?) in the
 system allocated data is screwed up.
 
 I have no idea where to go from here, aside from grabbing a compiler and
 having at the disk structures myself.

There are some other options but they get progressively and quickly into 
possibly making things a lot worse. At a certain point it's an extraction 
operation rather than repair and continue.

Chris Murphy



Re: Unmountable Array After Drive Failure During Device Deletion

2013-12-21 Thread ronnie sahlberg
Similar things happened to me. (See my unanswered posts ~1Sep, this fs
is not really ready for production I think)

When you get wrong transid errors and reports that you have checksums
being repaired,
that is all bad news and no one can help you.

Unfortunately there are, I think, no real tools to fix basic fs errors.


I never managed to get my fs into a state where it could be mounted at all
but did manage to recover most of my data using
btrfs restore from
https://github.com/FauxFaux/btrfs-progs

This is the argument to that command that I used to recover data;
I got most data back with it, but YMMV.

  commit 2a2a1fb21d375a46f9073e44a7b9d9bb7bfaa1e2
  Author: Peter Stuge pe...@stuge.se
  Date:   Fri Nov 25 01:03:58 2011 +0100

restore: Add regex matching of paths and files to be restored

The option -m is used to specify the regex string. -c is used to
specify case insensitive matching. -i was already taken.

In order to restore only a single folder somewhere in the btrfs
tree, it is unfortunately neccessary to construct a slightly
nontrivial regex, e.g.:

restore -m '^/(|home(|/username(|/Desktop(|/.*$' /dev/sdb2 /output

This is needed in order to match each directory along the way to the
Desktop directory, as well as all contents below the Desktop directory.

Signed-off-by: Peter Stuge pe...@stuge.se
Signed-off-by: Josef Bacik jo...@redhat.com


I won't give advice for your data.
For my data, I copied as much data as I could recover from the damaged
filesystem over to a different filesystem using the tools in the repo above.
After that I destroyed the damaged filesystem and rebuilt from scratch.
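For reference, the restore tool built from that repo is invoked along the
lines of the commit message above (device and destination directory here are
placeholders; without -m it tries to pull out everything it can reach):

restore /dev/sdX /mnt/recovery
restore -m '<regex of paths to keep>' -c /dev/sdX /mnt/recovery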


Then, depending on how important your data is, you either start making
backups regularly, or switch to a less fragile, more repairable fs.


On Thu, Dec 19, 2013 at 1:26 AM, Chris Kastorff encryp...@gmail.com wrote:
 I'm using btrfs in data and metadata RAID10 on drives (not on md or any
 other fanciness.)

 I was removing a drive (btrfs dev del) and during that operation, a
 different drive in the array failed. Having not had this happen before,
 I shut down the machine immediately due to the extremely loud piezo
 buzzer on the drive controller card. I attempted to do so cleanly, but
 the buzzer cut through my patience and after 4 minutes I cut the power.

 Afterwards, I located and removed the failed drive from the system, and
 then got back to linux. The array no longer mounts (failed to read the
 system array on sdc), with nearly identical messages when attempted
 with -o recovery and -o recovery,ro.

 btrfsck asserts and coredumps, as usual.

 The drive that was being removed is devid 9 in the array, and is
 /dev/sdm1 in the btrfs fi show seen below.

 Kernel 3.12.4-1-ARCH, btrfs-progs v0.20-rc1-358-g194aa4a-dirty
 (archlinux build.)

 Can I recover the array?

 == dmesg during failure ==

 ...
 sd 0:2:3:0: [sdd] Unhandled error code
 sd 0:2:3:0: [sdd]
 Result: hostbyte=0x04 driverbyte=0x00
 sd 0:2:3:0: [sdd] CDB:
 cdb[0]=0x2a: 2a 00 26 89 5b 00 00 00 80 00
 end_request: I/O error, dev sdd, sector 646535936
 btrfs_dev_stat_print_on_error: 7791 callbacks suppressed
 btrfs: bdev /dev/sdd errs: wr 315858, rd 230194, flush 0, corrupt 0, gen 0
 sd 0:2:3:0: [sdd] Unhandled error code
 sd 0:2:3:0: [sdd]
 Result: hostbyte=0x04 driverbyte=0x00
 sd 0:2:3:0: [sdd] CDB:
 cdb[0]=0x2a: 2a 00 26 89 5b 80 00 00 80 00
 end_request: I/O error, dev sdd, sector 646536064
 ...

 == dmesg after new boot, mounting attempt ==

 btrfs: device label lake devid 11 transid 4893967 /dev/sda
 btrfs: disk space caching is enabled
 btrfs: failed to read the system array on sdc
 btrfs: open_ctree failed

 == dmesg after new boot, mounting attempt with -o recovery,ro ==

 btrfs: device label lake devid 11 transid 4893967 /dev/sda
 btrfs: enabling auto recovery
 btrfs: disk space caching is enabled
 btrfs: failed to read the system array on sdc
 btrfs: open_ctree failed

 == btrfsck ==

 deep# btrfsck /dev/sda
 warning, device 14 is missing
 warning devid 14 not found already
 parent transid verify failed on 87601116364800 wanted 4893969 found 4893913
 parent transid verify failed on 87601116364800 wanted 4893969 found 4893913
 parent transid verify failed on 87601116381184 wanted 4893969 found 4893913
 parent transid verify failed on 87601116381184 wanted 4893969 found 4893913
 parent transid verify failed on 87601115320320 wanted 4893969 found 4893913
 parent transid verify failed on 87601115320320 wanted 4893969 found 4893913
 parent transid verify failed on 87601117097984 wanted 4893969 found 4892460
 parent transid verify failed on 87601117097984 wanted 4893969 found 4892460
 Ignoring transid failure
 Checking filesystem on /dev/sda
 UUID: d5e17c49-d980-4bde-bd96-3c8bc95ea077
 checking extents
 parent transid verify failed on 87601117159424 wanted 4893969 found 4893913
 parent transid verify failed on 87601117159424 wanted 4893969 found 4893913
 parent transid verify failed on 87601116368896 wanted 

Unmountable Array After Drive Failure During Device Deletion

2013-12-19 Thread Chris Kastorff
I'm using btrfs in data and metadata RAID10 on drives (not on md or any
other fanciness.)

I was removing a drive (btrfs dev del) and during that operation, a
different drive in the array failed. Having not had this happen before,
I shut down the machine immediately due to the extremely loud piezo
buzzer on the drive controller card. I attempted to do so cleanly, but
the buzzer cut through my patience and after 4 minutes I cut the power.

Afterwards, I located and removed the failed drive from the system, and
then got back to linux. The array no longer mounts (failed to read the
system array on sdc), with nearly identical messages when attempted
with -o recovery and -o recovery,ro.

btrfsck asserts and coredumps, as usual.

The drive that was being removed is devid 9 in the array, and is
/dev/sdm1 in the btrfs fi show seen below.

Kernel 3.12.4-1-ARCH, btrfs-progs v0.20-rc1-358-g194aa4a-dirty
(archlinux build.)

Can I recover the array?

== dmesg during failure ==

...
sd 0:2:3:0: [sdd] Unhandled error code
sd 0:2:3:0: [sdd]
Result: hostbyte=0x04 driverbyte=0x00
sd 0:2:3:0: [sdd] CDB:
cdb[0]=0x2a: 2a 00 26 89 5b 00 00 00 80 00
end_request: I/O error, dev sdd, sector 646535936
btrfs_dev_stat_print_on_error: 7791 callbacks suppressed
btrfs: bdev /dev/sdd errs: wr 315858, rd 230194, flush 0, corrupt 0, gen 0
sd 0:2:3:0: [sdd] Unhandled error code
sd 0:2:3:0: [sdd]
Result: hostbyte=0x04 driverbyte=0x00
sd 0:2:3:0: [sdd] CDB:
cdb[0]=0x2a: 2a 00 26 89 5b 80 00 00 80 00
end_request: I/O error, dev sdd, sector 646536064
...

== dmesg after new boot, mounting attempt ==

btrfs: device label lake devid 11 transid 4893967 /dev/sda
btrfs: disk space caching is enabled
btrfs: failed to read the system array on sdc
btrfs: open_ctree failed

== dmesg after new boot, mounting attempt with -o recovery,ro ==

btrfs: device label lake devid 11 transid 4893967 /dev/sda
btrfs: enabling auto recovery
btrfs: disk space caching is enabled
btrfs: failed to read the system array on sdc
btrfs: open_ctree failed

== btrfsck ==

deep# btrfsck /dev/sda
warning, device 14 is missing
warning devid 14 not found already
parent transid verify failed on 87601116364800 wanted 4893969 found 4893913
parent transid verify failed on 87601116364800 wanted 4893969 found 4893913
parent transid verify failed on 87601116381184 wanted 4893969 found 4893913
parent transid verify failed on 87601116381184 wanted 4893969 found 4893913
parent transid verify failed on 87601115320320 wanted 4893969 found 4893913
parent transid verify failed on 87601115320320 wanted 4893969 found 4893913
parent transid verify failed on 87601117097984 wanted 4893969 found 4892460
parent transid verify failed on 87601117097984 wanted 4893969 found 4892460
Ignoring transid failure
Checking filesystem on /dev/sda
UUID: d5e17c49-d980-4bde-bd96-3c8bc95ea077
checking extents
parent transid verify failed on 87601117159424 wanted 4893969 found 4893913
parent transid verify failed on 87601117159424 wanted 4893969 found 4893913
parent transid verify failed on 87601116368896 wanted 4893969 found 4893913
parent transid verify failed on 87601116368896 wanted 4893969 found 4893913
parent transid verify failed on 87601117163520 wanted 4893969 found 4893913
parent transid verify failed on 87601117163520 wanted 4893969 found 4893913
parent transid verify failed on 87601117638656 wanted 4893969 found 4893913
parent transid verify failed on 87601117638656 wanted 4893969 found 4893913
Ignoring transid failure
parent transid verify failed on 87601117171712 wanted 4893969 found 4893913
parent transid verify failed on 87601117171712 wanted 4893969 found 4893913
parent transid verify failed on 87601117175808 wanted 4893969 found 4893913
parent transid verify failed on 87601117175808 wanted 4893969 found 4893913
parent transid verify failed on 87601117188096 wanted 4893969 found 4893913
parent transid verify failed on 87601117188096 wanted 4893969 found 4893913
parent transid verify failed on 87601116807168 wanted 4893969 found 4893913
parent transid verify failed on 87601116807168 wanted 4893969 found 4893913
Ignoring transid failure
parent transid verify failed on 87601117642752 wanted 4893969 found 4893913
parent transid verify failed on 87601117642752 wanted 4893969 found 4893913
Ignoring transid failure
parent transid verify failed on 87601117650944 wanted 4893969 found 4893913
parent transid verify failed on 87601117650944 wanted 4893969 found 4893913
Ignoring transid failure
Couldn't map the block 5764607523034234880
btrfsck: volumes.c:1019: btrfs_num_copies: Assertion `!(!ce)' failed.
zsh: abort (core dumped)  btrfsck /dev/sda

== btrfs fi show ==

Label: 'lake'  uuid: d5e17c49-d980-4bde-bd96-3c8bc95ea077
Total devices 10 FS bytes used 7.43TB
devid9 size 1.82TB used 1.61TB path /dev/sdm1
devid   12 size 1.82TB used 1.47TB path /dev/sdb
devid   16 size 1.82TB used 1.47TB path /dev/sde
devid   13 size 1.82TB used 1.47TB path /dev/sdc
devid   11 

Re: Unmountable Array After Drive Failure During Device Deletion

2013-12-19 Thread Duncan
Chris Kastorff posted on Thu, 19 Dec 2013 01:26:57 -0800 as excerpted:

 I'm using btrfs in data and metadata RAID10 on drives (not on md or any
 other fanciness.)
 
 I was removing a drive (btrfs dev del) and during that operation, a
 different drive in the array failed. Having not had this happen before,
 I shut down the machine immediately due to the extremely loud piezo
 buzzer on the drive controller card. I attempted to do so cleanly, but
 the buzzer cut through my patience and after 4 minutes I cut the power.
 
 Afterwards, I located and removed the failed drive from the system, and
 then got back to linux. The array no longer mounts (failed to read the
 system array on sdc), with nearly identical messages when attempted
 with -o recovery and -o recovery,ro.

This may be a stupid question, but you're missing a drive so the 
filesystem will be degraded, but you didn't mention that in your mount 
options, so...

Did you try mounting with -o degraded (possibly with recovery, etc, also, 
but just try -o degraded plus any normal options first)?
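Something along these lines, degraded alone first, then degraded plus
recovery (device and mount point being whatever applies on your system):

mount -o degraded /dev/sdX /mnt
mount -o degraded,recovery /dev/sdX /mnt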

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



Re: Unmountable Array After Drive Failure During Device Deletion

2013-12-19 Thread Chris Kastorff
 I'm using btrfs in data and metadata RAID10 on drives (not on md or any
 other fanciness.)

 I was removing a drive (btrfs dev del) and during that operation, a
 different drive in the array failed. Having not had this happen before,
 I shut down the machine immediately due to the extremely loud piezo
 buzzer on the drive controller card. I attempted to do so cleanly, but
 the buzzer cut through my patience and after 4 minutes I cut the power.

 Afterwards, I located and removed the failed drive from the system, and
 then got back to linux. The array no longer mounts (failed to read the
 system array on sdc), with nearly identical messages when attempted
 with -o recovery and -o recovery,ro.
 
 This may be a stupid question, but you're missing a drive so the 
 filesystem will be degraded, but you didn't mention that in your mount 
 options, so...
 
 Did you try mounting with -o degraded (possibly with recovery, etc, also, 
 but just try -o degraded plus any normal options first)?
 

I did not try degraded because I didn't remember that there were two
different options for handling broken btrfs volumes.

mount -o degraded,ro yields:

btrfs: device label lake devid 11 transid 4893967 /dev/sda
btrfs: allowing degraded mounts
btrfs: disk space caching is enabled
parent transid verify failed on 87601116364800 wanted 4893969 found 4893913
btrfs read error corrected: ino 1 off 87601116364800 (dev /dev/sdf
sector 62986400)
parent transid verify failed on 87601116381184 wanted 4893969 found 4893913
btrfs read error corrected: ino 1 off 87601116381184 (dev /dev/sdf
sector 62986432)
parent transid verify failed on 87601115320320 wanted 4893969 found 4893913
btrfs read error corrected: ino 1 off 87601115320320 (dev /dev/sdf
sector 62985896)
parent transid verify failed on 87601116368896 wanted 4893969 found 4893913
btrfs read error corrected: ino 1 off 87601116368896 (dev /dev/sdf
sector 62986408)
parent transid verify failed on 87601116377088 wanted 4893969 found 4893913
btrfs read error corrected: ino 1 off 87601116377088 (dev /dev/sdf
sector 62986424)
btrfs: bdev (null) errs: wr 344288, rd 230234, flush 0, corrupt 0, gen 0
btrfs: bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
btrfs: bdev /dev/sdg errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
parent transid verify failed on 87601117097984 wanted 4893969 found 4892460
Failed to read block groups: -5
btrfs: open_ctree failed

mount -o degraded,recovery,ro yields:

btrfs: device label lake devid 11 transid 4893967 /dev/sda
btrfs: allowing degraded mounts
btrfs: enabling auto recovery
btrfs: disk space caching is enabled
parent transid verify failed on 87601116798976 wanted 4893969 found 4893913
btrfs read error corrected: ino 1 off 87601116798976 (dev /dev/sdg
sector 113318256)
parent transid verify failed on 87601119379456 wanted 4893969 found 4893913
btrfs read error corrected: ino 1 off 87601119379456 (dev /dev/sdg
sector 113319456)
parent transid verify failed on 87601116774400 wanted 4893969 found 4893913
btrfs read error corrected: ino 1 off 87601116774400 (dev /dev/sdg
sector 113318208)
parent transid verify failed on 87601119391744 wanted 4893969 found 4893913
btrfs read error corrected: ino 1 off 87601119391744 (dev /dev/sdg
sector 113319480)
parent transid verify failed on 87601116778496 wanted 4893969 found 4893913
btrfs read error corrected: ino 1 off 87601116778496 (dev /dev/sdg
sector 113318216)
parent transid verify failed on 87601116786688 wanted 4893969 found 4893849
btrfs read error corrected: ino 1 off 87601116786688 (dev /dev/sdg
sector 113318232)
btrfs: bdev (null) errs: wr 344288, rd 230234, flush 0, corrupt 0, gen 0
btrfs: bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
btrfs: bdev /dev/sdg errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
parent transid verify failed on 8760515136 wanted 4893968 found 4893913
btrfs read error corrected: ino 1 off 8760515136 (dev /dev/sdg
sector 113315616)
parent transid verify failed on 8760523328 wanted 4893968 found 4893913
btrfs read error corrected: ino 1 off 8760523328 (dev /dev/sdg
sector 113315632)
parent transid verify failed on 8760535616 wanted 4893968 found 4893913
btrfs read error corrected: ino 1 off 8760535616 (dev /dev/sdg
sector 113315656)
parent transid verify failed on 8760556096 wanted 4893968 found 4893913
btrfs read error corrected: ino 1 off 8760556096 (dev /dev/sdg
sector 113315696)
Failed to read block groups: -5
btrfs: open_ctree failed



Re: Unmountable Array After Drive Failure During Device Deletion

2013-12-19 Thread Chris Kastorff
 I'm using btrfs in data and metadata RAID10 on drives (not on md or any
 other fanciness.)

 I was removing a drive (btrfs dev del) and during that operation, a
 different drive in the array failed. Having not had this happen before,
 I shut down the machine immediately due to the extremely loud piezo
 buzzer on the drive controller card. I attempted to do so cleanly, but
 the buzzer cut through my patience and after 4 minutes I cut the power.

 Afterwards, I located and removed the failed drive from the system, and
 then got back to linux. The array no longer mounts (failed to read the
 system array on sdc), with nearly identical messages when attempted
 with -o recovery and -o recovery,ro.

 This may be a stupid question, but you're missing a drive so the 
 filesystem will be degraded, but you didn't mention that in your mount 
 options, so...

 Did you try mounting with -o degraded (possibly with recovery, etc, also, 
 but just try -o degraded plus any normal options first)?

 
 I did not try degraded because I didn't remember that there were two
 different options for handling broken btrfs volumes.
 
 mount -o degraded,ro yields:
 
 btrfs: device label lake devid 11 transid 4893967 /dev/sda
 btrfs: allowing degraded mounts
 btrfs: disk space caching is enabled
 parent transid verify failed on 87601116364800 wanted 4893969 found 4893913
 btrfs read error corrected: ino 1 off 87601116364800 (dev /dev/sdf
 sector 62986400)
 parent transid verify failed on 87601116381184 wanted 4893969 found 4893913
 btrfs read error corrected: ino 1 off 87601116381184 (dev /dev/sdf
 sector 62986432)
 parent transid verify failed on 87601115320320 wanted 4893969 found 4893913
 btrfs read error corrected: ino 1 off 87601115320320 (dev /dev/sdf
 sector 62985896)
 parent transid verify failed on 87601116368896 wanted 4893969 found 4893913
 btrfs read error corrected: ino 1 off 87601116368896 (dev /dev/sdf
 sector 62986408)
 parent transid verify failed on 87601116377088 wanted 4893969 found 4893913
 btrfs read error corrected: ino 1 off 87601116377088 (dev /dev/sdf
 sector 62986424)
 btrfs: bdev (null) errs: wr 344288, rd 230234, flush 0, corrupt 0, gen 0
 btrfs: bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
 btrfs: bdev /dev/sdg errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
 parent transid verify failed on 87601117097984 wanted 4893969 found 4892460
 Failed to read block groups: -5
 btrfs: open_ctree failed
 
 mount -o degraded,recovery,ro yields:
 
 btrfs: device label lake devid 11 transid 4893967 /dev/sda
 btrfs: allowing degraded mounts
 btrfs: enabling auto recovery
 btrfs: disk space caching is enabled
 parent transid verify failed on 87601116798976 wanted 4893969 found 4893913
 btrfs read error corrected: ino 1 off 87601116798976 (dev /dev/sdg
 sector 113318256)
 parent transid verify failed on 87601119379456 wanted 4893969 found 4893913
 btrfs read error corrected: ino 1 off 87601119379456 (dev /dev/sdg
 sector 113319456)
 parent transid verify failed on 87601116774400 wanted 4893969 found 4893913
 btrfs read error corrected: ino 1 off 87601116774400 (dev /dev/sdg
 sector 113318208)
 parent transid verify failed on 87601119391744 wanted 4893969 found 4893913
 btrfs read error corrected: ino 1 off 87601119391744 (dev /dev/sdg
 sector 113319480)
 parent transid verify failed on 87601116778496 wanted 4893969 found 4893913
 btrfs read error corrected: ino 1 off 87601116778496 (dev /dev/sdg
 sector 113318216)
 parent transid verify failed on 87601116786688 wanted 4893969 found 4893849
 btrfs read error corrected: ino 1 off 87601116786688 (dev /dev/sdg
 sector 113318232)
 btrfs: bdev (null) errs: wr 344288, rd 230234, flush 0, corrupt 0, gen 0
 btrfs: bdev /dev/sdm1 errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
 btrfs: bdev /dev/sdg errs: wr 0, rd 0, flush 0, corrupt 4, gen 0
 parent transid verify failed on 8760515136 wanted 4893968 found 4893913
 btrfs read error corrected: ino 1 off 8760515136 (dev /dev/sdg
 sector 113315616)
 parent transid verify failed on 8760523328 wanted 4893968 found 4893913
 btrfs read error corrected: ino 1 off 8760523328 (dev /dev/sdg
 sector 113315632)
 parent transid verify failed on 8760535616 wanted 4893968 found 4893913
 btrfs read error corrected: ino 1 off 8760535616 (dev /dev/sdg
 sector 113315656)
 parent transid verify failed on 8760556096 wanted 4893968 found 4893913
 btrfs read error corrected: ino 1 off 8760556096 (dev /dev/sdg
 sector 113315696)
 Failed to read block groups: -5
 btrfs: open_ctree failed
 

I should also mention that the corrupt 4 error counters on /dev/sdm1 and
/dev/sdg are left over from an earlier btrfs extent corruption bug; that
corruption no longer exists on the filesystem (a scrub run hours before the
device deletion completed with 0 errors.)
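For anyone hitting this later: those per-device counters are what btrfs
device stats reports, and they can be zeroed once the fs mounts again (mount
point is a placeholder, and this assumes a progs version that has the
command):

btrfs device stats /mnt
btrfs device stats -z /mnt    # zero the counters after checking them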


Re: Unmountable Array After Drive Failure During Device Deletion

2013-12-19 Thread Chris Murphy

On Dec 19, 2013, at 2:26 AM, Chris Kastorff encryp...@gmail.com wrote:

 btrfs-progs v0.20-rc1-358-g194aa4a-dirty

Most of what you're using is in the kernel so this is not urgent but if it gets 
to needing btrfs check/repair, I'd upgrade to v3.12 progs:
https://www.archlinux.org/packages/testing/x86_64/btrfs-progs/


 sd 0:2:3:0: [sdd] Unhandled error code
 sd 0:2:3:0: [sdd]
 Result: hostbyte=0x04 driverbyte=0x00
 sd 0:2:3:0: [sdd] CDB:
 cdb[0]=0x2a: 2a 00 26 89 5b 00 00 00 80 00
 end_request: I/O error, dev sdd, sector 646535936
 btrfs_dev_stat_print_on_error: 7791 callbacks suppressed
 btrfs: bdev /dev/sdd errs: wr 315858, rd 230194, flush 0, corrupt 0, gen 0
 sd 0:2:3:0: [sdd] Unhandled error code
 sd 0:2:3:0: [sdd]
 Result: hostbyte=0x04 driverbyte=0x00
 sd 0:2:3:0: [sdd] CDB:
 cdb[0]=0x2a: 2a 00 26 89 5b 80 00 00 80 00
 end_request: I/O error, dev sdd, sector 646536064

These are hardware errors. And you have missing devices, or at least a message 
of missing devices. So if a device went bad, and a new one added without 
deleting the missing one, then the new device only has new data. Data hasn't 
been recovered and replicated to the replacement. So it's possible with a 
missing device that's not removed, and a 2nd device failure, to lose some data.

 btrfs read error corrected: ino 1 off 87601116364800 (dev /dev/sdf
 sector 62986400)
 
 btrfs read error corrected: ino 1 off 87601116798976 (dev /dev/sdg
 sector 113318256)

I'm not sure what constitutes a btrfs read error, maybe the device it 
originally requested data from didn't have it where it was expected but was 
able to find it on these devices. If the drive itself has a problem reading a 
sector and ECC can't correct it, it reports the read error to libata. So kernel 
messages report this with a line that starts with the word exception and then 
a line with cmd that shows what command and LBAs were issued to the drive, 
and then a res line that should contain an error mask with the actual error - 
bus error, media error. Very often you don't see these and instead see link 
reset messages, which means the drive is hanging doing something (probably 
attempting ECC) but then the linux SCSI layer hits its 30 second time out on 
the (hanged) queued command and resets the drive instead of waiting any longer. 
And that's a problem also because it prevents bad sectors from being fixed by 
Btrfs. So they just get worse to the point where then it can't do anything
about the situation.
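For what it's worth, a common mitigation on drives with long internal error
recovery is to raise the SCSI command timer so the kernel waits the drive out
instead of resetting the link (sdX is whichever drive; the value is in
seconds and doesn't persist across reboots):

echo 120 > /sys/block/sdX/device/timeout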

So I think you need to post a full dmesg somewhere rather than snippets. And 
I'd also like to see the result from smartctl -x for the above three drives, 
sdd, sdf, and sdg. And we need to know what this missing drive message is 
about, if you've done a drive replacement and exactly what commands you used to 
do that and how long ago.
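Something like this is what I mean (drive letters as of the current boot,
since they move around between boots):

dmesg > dmesg.txt
smartctl -x /dev/sdd > smart-sdd.txt
smartctl -x /dev/sdf > smart-sdf.txt
smartctl -x /dev/sdg > smart-sdg.txt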


Chris Murphy


Re: Unmountable Array After Drive Failure During Device Deletion

2013-12-19 Thread Chris Kastorff
On 12/19/2013 02:21 PM, Chris Murphy wrote:

 On Dec 19, 2013, at 2:26 AM, Chris Kastorff encryp...@gmail.com wrote:

 btrfs-progs v0.20-rc1-358-g194aa4a-dirty

 Most of what you're using is in the kernel so this is not urgent but
if it gets to needing btrfs check/repair, I'd upgrade to v3.12 progs:
 https://www.archlinux.org/packages/testing/x86_64/btrfs-progs/

Adding the testing repository is a bad idea for this machine; turning
off the testing repository is extremely error prone.

Instead, I am now using the btrfs tools from
git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git's
master (specifically 8cae184), which reports itself as:

deep# ./btrfs version
Btrfs v3.12
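For anyone following along, the build there is basically clone-and-make (a
rough sketch, not necessarily the exact steps; it needs the usual btrfs-progs
build dependencies such as libuuid installed):

git clone git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git
cd btrfs-progs
make
./btrfs version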


 sd 0:2:3:0: [sdd] Unhandled error code
 sd 0:2:3:0: [sdd]
 Result: hostbyte=0x04 driverbyte=0x00
 sd 0:2:3:0: [sdd] CDB:
 cdb[0]=0x2a: 2a 00 26 89 5b 00 00 00 80 00
 end_request: I/O error, dev sdd, sector 646535936
 btrfs_dev_stat_print_on_error: 7791 callbacks suppressed
 btrfs: bdev /dev/sdd errs: wr 315858, rd 230194, flush 0, corrupt 0,
gen 0
 sd 0:2:3:0: [sdd] Unhandled error code
 sd 0:2:3:0: [sdd]
 Result: hostbyte=0x04 driverbyte=0x00
 sd 0:2:3:0: [sdd] CDB:
 cdb[0]=0x2a: 2a 00 26 89 5b 80 00 00 80 00
 end_request: I/O error, dev sdd, sector 646536064

 These are hardware errors. And you have missing devices, or at least
 a message of missing devices. So if a device went bad, and a new one
 added without deleting the missing one, then the new device only has
 new data. Data hasn't been recovered and replicated to the
 replacement. So it's possible with a missing device that's not
 removed, and a 2nd device failure, to lose some data.


This is not what happened, as I explained earlier; I shall explain
again, with more verbosity:

- Array is good. All drives are accounted for, btrfs scrub runs cleanly.
btrfs fi show shows no missing drives and reasonable allocations.
- I start btrfs dev del to remove devid 9. It chugs along with no
errors, until:
- Another drive in the array (NOT THE ONE I RAN DEV DEL ON) fails, and
all reads and writes to it fail, causing the SCSI errors above.
- I attempt clean shutdown. It takes too long; my drive
controller card is buzzing loudly and the neighbors are sensitive to
noise, so:
- I power down the machine uncleanly.
- I remove the failed drive, NOT the one I ran dev del on.
- I reboot, attempt to mount with various options, all of which cause
the kernel to yell at me and the mount command returns failure.

From what I understand, at all points there should be at least two
copies of every extent during a dev del when all chunks are allocated
RAID10 (and they are, according to btrfs fi df ran before on the mounted
fs).

Because of this, I expect to be able to use the chunks from the (not
successfully removed) devid=9, as I have done many many times before due
to other btrfs bugs that needed unclean shutdowns during dev del.

Under the assumption devid=9 is good, if slightly out of date on
transid (which ALL data says is true), I should be able to completely
recover all data, because data that was not modified during the deletion
resides on devid=9, and data that was modified should be redundantly
(RAID10) stored on the remaining drives, and thus should work given this
case of a single drive failure.

Is this not the case? Does btrfs not maintain redundancy during device
removal?

 btrfs read error corrected: ino 1 off 87601116364800 (dev /dev/sdf
 sector 62986400)

 btrfs read error corrected: ino 1 off 87601116798976 (dev /dev/sdg
 sector 113318256)

 I'm not sure what constitutes a btrfs read error, maybe the device it
 originally requested data from didn't have it where it was expected
 but was able to find it on these devices. If the drive itself has a
 problem reading a sector and ECC can't correct it, it reports the
 read error to libata. So kernel messages report this with a line that
 starts with the word exception and then a line with cmd that
 shows what command and LBAs were issued to the drive, and then a
 res line that should contain an error mask with the actual error -
 bus error, media error. Very often you don't see these and instead
 see link reset messages, which means the drive is hanging doing
 something (probably attempting ECC) but then the linux SCSI layer
 hits its 30 second time out on the (hanged) queued command and resets
 the drive instead of waiting any longer. And that's a problem also
 because it prevents bad sectors from being fixed by Btrfs. So they
 just get worse to the point where then it can't do anything about the
 situation.

There was a single drive immediately failing all its writes and reads
because that's how the controller card was configured. No ECC failures,
no timeouts. I have hit those issues on other arrays, but the drive
controller I'm using here correctly and immediately returned errors on
requests when the drive failed. I am no stranger to SCSI error messages
on both shitty drive interfaces (which behave as you 

Re: Unmountable Array After Drive Failure During Device Deletion

2013-12-19 Thread Chris Murphy

On Dec 19, 2013, at 5:06 PM, Chris Kastorff encryp...@gmail.com wrote:

 On 12/19/2013 02:21 PM, Chris Murphy wrote:
 
 On Dec 19, 2013, at 2:26 AM, Chris Kastorff encryp...@gmail.com wrote:
 
 btrfs-progs v0.20-rc1-358-g194aa4a-dirty
 
 Most of what you're using is in the kernel so this is not urgent but
 if it gets to needing btrfs check/repair, I'd upgrade to v3.12 progs:
 https://www.archlinux.org/packages/testing/x86_64/btrfs-progs/
 
 Adding the testing repository is a bad idea for this machine; turning
 off the testing repository is extremely error prone.
 
 Instead, I am now using the btrfs tools from
 git://git.kernel.org/pub/scm/linux/kernel/git/mason/btrfs-progs.git's
 master (specifically 8cae184), which reports itself as:
 
 deep# ./btrfs version
 Btrfs v3.12

Good. As I thought about it again, you're using user space tools to add, 
remove, replace devices also, and that code has changed too, so better to use 
current.

 - Array is good. All drives are accounted for, btrfs scrub runs cleanly.
 btrfs fi show shows no missing drives and reasonable allocations.
 - I start btrfs dev del to remove devid 9. It chugs along with no
 errors, until:
 - Another drive in the array (NOT THE ONE I RAN DEV DEL ON) fails, and
 all reads and writes to it fail, causing the SCSI errors above.
 - I attempt clean shutdown. It takes too long; my drive
 controller card is buzzing loudly and the neighbors are sensitive to
 noise, so:
 - I power down the machine uncleanly.
 - I remove the failed drive, NOT the one I ran dev del on.
 - I reboot, attempt to mount with various options, all of which cause
 the kernel to yell at me and the mount command returns failure.

devid 9 is device delete in-progress, and while that's occurring devid 15 
fails completely. Is that correct?

Because previously you reported, in part this:
   devid   15 size 1.82TB used 1.47TB path /dev/sdd
   *** Some devices missing

And this:

sd 0:2:3:0: [sdd] Unhandled error code

That's why I was confused. It looks like the dead/missing device is one devid, and 
then devid 15 /dev/sdd is also having hardware problems - because all of this 
was posted at the same time. But I take it they're different boots and the 
/dev/sdd's are actually two different devids.

So devid 9 was deleted and then devid 14 failed. Right? Lovely when /dev/sdX 
changes between boots.


 From what I understand, at all points there should be at least two
 copies of every extent during a dev del when all chunks are allocated
 RAID10 (and they are, according to btrfs fi df ran before on the mounted
 fs).
 
 Because of this, I expect to be able to use the chunks from the (not
 successfully removed) devid=9, as I have done many many times before due
 to other btrfs bugs that needed unclean shutdowns during dev del.

I haven't looked at the code or read anything this specific on the state of the 
file system during a device delete. But my expectation is that there are 1-2 
chunks available for writes. And 2-3 chunks available for reads. Some writes 
must be only one copy because a chunk hasn't yet been replicated elsewhere, and 
presumably the device being deleted is not subject to writes as the transid 
also implies. Whereas devid 9 is one set of chunks for reading, those chunks 
have pre-existing copies elsewhere in the file system so that's two copies. And 
there's a replication in progress of the soon to be removed chunks. So that's 
up to three copies.

Problem is that for sure you've lost some chunks due to the failed/missing 
device. Normal raid10, it's unambiguous whether we've lost two mirrored sets. 
With Btrfs that's not clear as chunks are distributed. So it's possible that 
there are some chunks that don't exist at all for writes, and only 1 for reads. 
It may be no chunks are in common between devid 9 and the dead one. It may be 
only a couple of data or metadata chunks are in common.



 
 Under the assumption devid=9 is good, if slightly out of date on
 transid (which ALL data says is true), I should be able to completely
 recover all data, because data that was not modified during the deletion
 resides on devid=9, and data that was modified should be redundantly
 (RAID10) stored on the remaining drives, and thus should work given this
 case of a single drive failure.
 
 Is this not the case? Does btrfs not maintain redundancy during device
 removal?

Good questions. I'm not certain. But the speculation seems reasonable, not 
accounting for the missing device. That's what makes this different.



 btrfs read error corrected: ino 1 off 87601116364800 (dev /dev/sdf
 sector 62986400)
 
 btrfs read error corrected: ino 1 off 87601116798976 (dev /dev/sdg
 sector 113318256)
 
 I'm not sure what constitutes a btrfs read error, maybe the device it
 originally requested data from didn't have it where it was expected
 but was able to find it on these devices. If the drive itself has a
 problem reading a sector and ECC can't correct it, it reports the
 read