systemd.setenv and a mount.unit

2014-11-20 Thread Jakob Schürz

Hi there!

Another challenge...
I'm using btrfs, so I make snapshots of my system. In a script, I 
create symlinks (for example @system.CURRENT and @system.LAST) to the 
current and the last snapshot.

So I want to add two entries to GRUB 2 from which I can boot into the 
current and the last snapshot.

I tried to pass an environment variable with 
systemd.setenv=BOOTSNAP=@system.CURRENT, and I have a mount unit 
containing the option


Options=defaults,nofail,subvol=archive-local/@system.$BOOTSNAP

but it doesn't work. If I change $BOOTSNAP to CURRENT, the mount works.
So I made a test.service containing only
ExecStart=/bin/echo $BOOTSNAP
and I get the value @system.CURRENT in the logs...

How can I do these mounts?

jakob
--
http://xundeenergie.at
http://verkehrsloesungen.wordpress.com/
http://cogitationum.wordpress.com/

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: systemd.setenv and a mount.unit

2014-11-20 Thread Goffredo Baroncelli
On 2014-11-19 23:48, Jakob Schürz wrote:
 Hi there!
 
 Another challenge... I'm using btrfs. So i make snapshots from my
 system. And in a script, I make a symlink (for example:
 @system.CURRENT and @system.LAST) for the current and the last
 snapshot.

Interesting, I was unaware that I could mount a subvolume passing
a soft link.

 
 So i want to add 2 entries in grub2 from which i can boot into the
 current and the last snapshot.
 
 I tried to pass an environmental variable with
 systemd.setenv=BOOTSNAP=@system.CURRENT, and i have a mount-unit
 containing the option
 
 Options=defaults,nofail,subvol=archive-local/@system.$BOOTSNAP
 
 but it doesn't work. If i change $BOOTSNAP to CURRENT, the mount
 works. So i made a test.service, containing only StartExec=/bin/echo
 $BOOTSNAP I get the value @system.CURRENT in the logs...

This is more of a systemd question. However, it seems that ExecStart 
supports "...basic environment variable substitution...", but the mount unit
doesn't. This explains your difficulties. Anyway, I suggest you contact the 
systemd developers for further support; maybe this could be added as a
TODO item.
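
Editor's sketch, not part of the original message: since mount units do not expand environment variables, one possible workaround is to read the value from the kernel command line and write out a unit with the value already substituted, for instance from a systemd generator or an early-boot script. The function names, the device and the mount point below are hypothetical; only BOOTSNAP, systemd.setenv= and the subvol= option come from the thread.

```sh
#!/bin/sh
# Sketch only: function names, device and mount point are assumptions.

# Print the value of systemd.setenv=BOOTSNAP=... found in a kernel command
# line string (normally the contents of /proc/cmdline).
bootsnap_from_cmdline() {
    for word in $1; do
        case "$word" in
            systemd.setenv=BOOTSNAP=*)
                printf '%s\n' "${word#systemd.setenv=BOOTSNAP=}"
                return 0
                ;;
        esac
    done
    return 1
}

# Print a mount unit with the snapshot name already substituted.
# $1 = device (What=), $2 = mount point (Where=), $3 = snapshot name
render_mount_unit() {
    cat <<EOF
[Unit]
Description=Snapshot mount for $3

[Mount]
What=$1
Where=$2
Type=btrfs
Options=defaults,nofail,subvol=archive-local/$3
EOF
}

# Possible use (hypothetical paths):
#   snap=$(bootsnap_from_cmdline "$(cat /proc/cmdline)") &&
#       render_mount_unit /dev/sda2 /mnt/snap "$snap" > "$unit_dir/mnt-snap.mount"
```

The unit file name has to match the mount point (mnt-snap.mount for /mnt/snap). Where $unit_dir points depends on how the script is run; a generator receives its output directories as arguments.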

 How can I do this mounts?

For booting, I use the rootflags= kernel command line option. It is 
usually interpreted by the initrd/initramfs as options to pass
to the mount command. In my case I have:

rootflags=subvol=debian

so the subvol=debian option is passed to mount. When grub-mkconfig
generates the GRUB menu entries, it adds this option as well.
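
Editor's sketch, not part of the original message: a hand-written menu entry that boots a chosen subvolume could be produced along these lines. The function name, the UUID and the kernel and initrd paths are placeholders; the right paths depend on whether /boot is a separate partition or lives inside the subvolume.

```sh
#!/bin/sh
# Sketch only: all values passed in are illustrative assumptions.

# Print one GRUB menu entry that boots a given btrfs subvolume as root.
# $1 = title, $2 = filesystem UUID, $3 = subvolume, $4 = kernel, $5 = initrd
grub_snapshot_entry() {
    cat <<EOF
menuentry '$1' {
    linux $4 root=UUID=$2 ro rootflags=subvol=$3
    initrd $5
}
EOF
}

# Possible use (hypothetical values):
#   grub_snapshot_entry 'Debian (last snapshot)' "$uuid" \
#       archive-local/@system.LAST /vmlinuz /initrd.img
```

On Debian-style systems such custom entries usually go into /etc/grub.d/40_custom, followed by a run of update-grub.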


 jakob
BR
G.Baroncelli

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: systemd.setenv and a mount.unit

2014-11-20 Thread Jakob Schürz

On 2014-11-20 at 11:17, Goffredo Baroncelli wrote:
> On 2014-11-19 23:48, Jakob Schürz wrote:
>> Hi there!
>>
>> Another challenge... I'm using btrfs. So i make snapshots from my
>> system. And in a script, I make a symlink (for example:
>> @system.CURRENT and @system.LAST) for the current and the last
>> snapshot.
>
> Interesting, I was unaware that I could mount a subvolume passing
> a soft link.


Fortunately, yes, that works. :)
That is how I came up with the idea of creating two (or more) stable 
links to the current, the last, and further snapshots. The problem is 
the fstab...






>> So i want to add 2 entries in grub2 from which i can boot into the
>> current and the last snapshot.
>>
>> I tried to pass an environmental variable with
>> systemd.setenv=BOOTSNAP=@system.CURRENT, and i have a mount-unit
>> containing the option
>>
>> Options=defaults,nofail,subvol=archive-local/@system.$BOOTSNAP
>>
>> but it doesn't work. If i change $BOOTSNAP to CURRENT, the mount
>> works. So i made a test.service, containing only StartExec=/bin/echo
>> $BOOTSNAP I get the value @system.CURRENT in the logs...
>
> This is more a systemd related question. However it seems that ExecStart
> supports ...basic environment variable substitution...; but the mount unit
> doesn't. This explain your difficulties. Anyway I suggest you to contact the
> systemd developers to get further support; maybe that this could be a add as
> TODO item.


I hope so... and I have found out that this is not a btrfs problem.
On the grub-devel list I saw a discussion about dynamically creating 
entries for all snapshots in a certain directory, which relates to this.

But that doesn't solve the fstab problem either.

So I think I have to raise it on the systemd-devel list...




>> How can I do this mounts?
>
> For the boot, I used the rootflags= command line options. This
> usually is interpreted by the initrd/initramfs as option to pass
> to the mount command. In my case I have:
>
> rootflags=subvol=debian
>
> so, the subvol=debian option is passed to mount. When grub-mkconfig
> generates the grub menu entries, does so.


I also have this in my GRUB config, and it works. But it doesn't solve 
the actual problem: mounting further subvolumes depending on which 
root subvolume was booted... :-)


regards
Jakob



Re: [PATCH] Btrfs: make sure we wait on logged extents when fsycning two subvols

2014-11-20 Thread Miao Xie
On Thu, 6 Nov 2014 10:19:54 -0500, Josef Bacik wrote:
 If we have two fsync()'s race on different subvols one will do all of its work
 to get into the log_tree, wait on it's outstanding IO, and then allow the
 log_tree to finish it's commit.  The problem is we were just free'ing that
 subvols logged extents instead of waiting on them, so whoever lost the race
 wouldn't really have their data on disk.  Fix this by waiting properly instead
 of freeing the logged extents.  Thanks,
 
 cc: sta...@vger.kernel.org
 Signed-off-by: Josef Bacik jba...@fb.com
 ---
  fs/btrfs/tree-log.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)
 
 diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
 index 2d0fa43..70f99b1 100644
 --- a/fs/btrfs/tree-log.c
 +++ b/fs/btrfs/tree-log.c
 @@ -2600,9 +2600,9 @@ int btrfs_sync_log(struct btrfs_trans_handle *trans,
   if (atomic_read(&log_root_tree->log_commit[index2])) {
   blk_finish_plug(&plug);
   btrfs_wait_marked_extents(log, &log->dirty_log_pages, mark);
 + btrfs_wait_logged_extents(log, log_transid);

Why not add this log root to a list in the log root tree, and then have the
committer wait for all ordered extents in each log root on that list? That
way, the committer can do some work while the data of the ordered extents
is being transferred to the disk.

Thanks
Miao

   wait_log_commit(trans, log_root_tree,
   root_log_ctx.log_transid);
 - btrfs_free_logged_extents(log, log_transid);
   mutex_unlock(&log_root_tree->log_mutex);
   ret = root_log_ctx.log_ret;
   goto out;
 



Re: systemd.setenv and a mount.unit

2014-11-20 Thread Goffredo Baroncelli
On 2014-11-20 11:35, Jakob Schürz wrote:
 On 2014-11-20 at 11:17, Goffredo Baroncelli wrote:
[...]
 
 rootflags=subvol=debian
 
 so, the subvol=debian option is passed to mount. When
 grub-mkconfig generates the grub menu entries, does so.
 
 This I also have in my grub-config, and it works. But it doesn't
 solve the challenge. Mounting more subvolumes according to the
 root-subvolume... :-)
 

Ah! You never said that this was your request.

I can only suggest that, after snapshotting /,
you also snapshot its subvolumes, placing them under
the root snapshot itself.

Suppose you have the following four subvolumes:

/root/
/root/etc
/root/usr
/root/var

When you need to snapshot, you would run:

# btrfs subvolume snapshot /root /backup-root-20141120
# btrfs subvolume snapshot /root/etc /backup-root-20141120/etc
# btrfs subvolume snapshot /root/usr /backup-root-20141120/usr
# btrfs subvolume snapshot /root/var /backup-root-20141120/var

So in order to mount an old filesystem again, you need only
one mount.
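
Editor's sketch, not part of the original message: the four commands above can be wrapped in a small loop. The function name and the DRY_RUN switch are made up for illustration. One assumption to verify on your own system: a snapshot is expected to show nested subvolumes as empty placeholder directories, so the sketch removes such a placeholder before creating the nested snapshot in its place.

```sh
#!/bin/sh
# Sketch only: snapshot a root subvolume plus the subvolumes nested in it.
# $1 = source root, $2 = destination snapshot, remaining args = nested names
# Set DRY_RUN=1 to print the commands instead of running them.
snapshot_tree() {
    src=$1
    dst=$2
    shift 2
    run=${DRY_RUN:+echo}
    $run btrfs subvolume snapshot "$src" "$dst" || return 1
    for sub in "$@"; do
        # Assumption: an existing directory here is the empty placeholder
        # left by the parent snapshot; rmdir refuses if it is not empty.
        if [ -d "$dst/$sub" ]; then
            $run rmdir "$dst/$sub" || return 1
        fi
        $run btrfs subvolume snapshot "$src/$sub" "$dst/$sub" || return 1
    done
}

# Possible use:
#   DRY_RUN=1 snapshot_tree /root /backup-root-20141120 etc usr var
```

Running it with DRY_RUN=1 first shows exactly which commands would be executed.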










 regards Jakob
 
 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli kreijackATinwind.it
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: btrfs send and an existing backup

2014-11-20 Thread Marc Joliet
On Wed, 19 Nov 2014 16:58:16 +0100,
Jakob Schürz wertsto...@nurfuerspam.de wrote:

 Hi there!
 
 I'm new on btrfs, and I like it :)

Me too :) . (I've been using it since May.)

 But i have a question. I have a existing backup on an external HDD. This 
 was ext4 before i converted it to btrfs.
 And i installed my debian new on btrfs with some subvolumes. (f.e. home, 
 var, multimedia/Video multimedia/Audio...)
 
 On my backup there are no subvolumes.
 
 Now i wrote a script to take local snapshots on my laptops HDD an mirror 
 this snapshots with btrfs send/receive to the external HDD.

Yeah, I also recently made the switch to btrfs send/receive, and I just love
being able to do incremental full system backups in less than two minutes (it's
also efficient enough that I backup my (borrowed) laptop over WLAN).

So from me a big thanks to the btrfs devs :) !

But to get to the questions:

 An i don't know, how to do, to make the inital snapshot on the external 
 HDD. I want to use the existing data there, so I don't have to transmit 
 the whole bunch of data to the external drive, which exists there 
 already...

Yeah, I had that problem, too, with my old rsync based backups; see below.

 What happens, if i make the same structure on the external drive with 
 creating subvolumes and »cp --reflink«, give this subvolumes the correct 
 names, and fire a »btrfs send«?

Do you mean cp --reflink from the original backup to the new structure? That
won't help.  Again, see below.

 Or is the best (ONLY???) way, to make an initial snapshot on the 
 external drive and delete the old backup there?

I couldn't think of any other way than doing an initial snapshot + send that
transferred the entire subvolumes, then doing incremental sends from there.

Here's my understanding as a complete non-expert:

The problem is that you need a parent snapshot, which needs to be on *both* the
source *and* target volumes, with which to be able to generate and then receive
the incremental send.  Currently, your source and target volumes are
independent, so btrfs can't infer anything about any differences between them;
that is, while the data may be related, the file systems themselves have
independent histories, making it impossible to compare them via their data
structures.

This is why you need to make an initial send: to give both volumes a common
frame of reference, so to speak.
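
Editor's sketch, not part of the original message: the full-then-incremental sequence described above might look as follows. The function name, the argument order and the DRY_RUN switch are illustrative assumptions; the btrfs commands used are `btrfs subvolume snapshot -r`, `btrfs send` (with `-p` for the parent) and `btrfs receive`.

```sh
#!/bin/sh
# Sketch only: take a read-only snapshot and send it to a backup filesystem.
# $1 = subvolume to back up
# $2 = directory holding the local read-only snapshots
# $3 = mounted backup filesystem (receive target)
# $4 = name of the new snapshot
# $5 = name of a previous snapshot that exists on BOTH sides (optional)
# Set DRY_RUN=1 to print the commands instead of running them.
send_snapshot() {
    src=$1
    snapdir=$2
    target=$3
    new=$4
    parent=${5:-}
    if [ -n "${DRY_RUN:-}" ]; then
        echo "btrfs subvolume snapshot -r $src $snapdir/$new"
        if [ -n "$parent" ]; then
            echo "btrfs send -p $snapdir/$parent $snapdir/$new | btrfs receive $target"
        else
            echo "btrfs send $snapdir/$new | btrfs receive $target"
        fi
        return 0
    fi
    btrfs subvolume snapshot -r "$src" "$snapdir/$new" || return 1
    if [ -n "$parent" ]; then
        # Incremental: only the differences against the common parent are sent.
        btrfs send -p "$snapdir/$parent" "$snapdir/$new" | btrfs receive "$target"
    else
        # Initial: the whole snapshot is sent once to create the common parent.
        btrfs send "$snapdir/$new" | btrfs receive "$target"
    fi
}

# Possible use (hypothetical paths):
#   send_snapshot /home /snapshots /mnt/backup home-1          # initial
#   send_snapshot /home /snapshots /mnt/backup home-2 home-1   # incremental
```

The first call transfers everything; every later call needs the previous snapshot to still exist on both the source and the target.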

So I bit the bullet and went through with it, and am keeping the original
backups until enough snapshots have accumulated in the new backup location
(both of my backups are on the same file system in different subvolumes).

 greetings
 jakob

HTH
-- 
Marc Joliet
--
People who think they know everything really annoy those of us who know we
don't - Bjarne Stroustrup




Re: btrfs send and an existing backup

2014-11-20 Thread Bardur Arantsson
On 2014-11-19 16:58, Jakob Schürz wrote:
 Hi there!
 
 I'm new on btrfs, and I like it :)
 
 But i have a question. I have a existing backup on an external HDD. This
 was ext4 before i converted it to btrfs.
 And i installed my debian new on btrfs with some subvolumes. (f.e. home,
 var, multimedia/Video multimedia/Audio...)
 

If you have no other backups, I would really recommend that you *don't*
use btrfs for your backup, or at least have a *third* backup which isn't
on btrfs -- there are *still* problems with btrfs that can potentially
wreck your backup filesystem. (Although it's obviously less likely if
the external HDD will only be connected occasionally.)

Don't get me wrong, btrfs is becoming more and more stable, but I
wouldn't trust it with my *only* backup, especially if also running
btrfs on the backed-up filesystem.

Regards,



Re: btrfs send and an existing backup

2014-11-20 Thread Duncan
Bardur Arantsson posted on Thu, 20 Nov 2014 14:17:52 +0100 as excerpted:

 If you have no other backups, I would really recommend that you *don't*
 use btrfs for your backup, or at least have a *third* backup which isn't
 on btrfs -- there are *still* problems with btrfs that can potentially
 wreck your backup filesystem. (Although it's obviously less likely if
 the external HDD will only be connected occasionally.)
 
 Don't get me wrong, btrfs is becoming more and more stable, but I
 wouldn't trust it with my *only* backup, especially if also running
 btrfs on the backed-up filesystem.

This.

My working versions and first backups are btrfs.  My secondary backups 
are reiserfs (my old filesystem of choice, which has been very reliable 
for me), just in case both the btrfs versions bite the dust due to a bug 
in btrfs itself.

-- 
Duncan - List replies preferred.   No HTML msgs.
Every nonfree program has a lord, a master --
and if you use the program, he is your master.  Richard Stallman



btrfs send erroring...

2014-11-20 Thread Ken D'Ambrosio

Hi!  Trying to do a btrfs send, and failing with:

root@khamul:~# btrfs send /biggie/BACKUP/ | btrfs receive /tmp/sdd1/
At subvol /biggie/BACKUP/
At subvol BACKUP
ERROR: rename o2046806-17126-0 -> volumes/ccdn-ch2-01 failed. No such file or directory


Judging by disk capacity, it hits this about 40% of the way through.  As 
my disk has subvolumes on it, which are underneath /biggie/BACKUP/, is 
there a different way I should go about sending an entire disk?


Thanks!

-Ken


Re: btrfs send erroring...

2014-11-20 Thread Hugo Mills
On Thu, Nov 20, 2014 at 11:57:50AM -0500, Ken D'Ambrosio wrote:
 Hi!  Trying to do a btrfs send, and failing with:
 
 root@khamul:~# btrfs send /biggie/BACKUP/ | btrfs receive /tmp/sdd1/
 At subvol /biggie/BACKUP/
 At subvol BACKUP
 ERROR: rename o2046806-17126-0 - volumes/ccdn-ch2-01 failed. No
 such file or directory

   This looks like one of several bugs that have been fixed
recently. What kernel version and userspace tools version are you
using?

   Hugo.

 Judging by disk capacity, it hits this about 40% of the way through.
 As my disk has subvolumes on it, which are underneath
 /biggie/BACKUP/, is there a different way I should go about
 sending an entire disk?
 
 Thanks!
 
 -Ken

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- ©1973 Unclear Research Ltd ---   




Re: btrfs send erroring...

2014-11-20 Thread Ken D'Ambrosio

On 2014-11-20 12:11, Hugo Mills wrote:

> On Thu, Nov 20, 2014 at 11:57:50AM -0500, Ken D'Ambrosio wrote:
>> Hi!  Trying to do a btrfs send, and failing with:
>>
>> root@khamul:~# btrfs send /biggie/BACKUP/ | btrfs receive /tmp/sdd1/
>> At subvol /biggie/BACKUP/
>> At subvol BACKUP
>> ERROR: rename o2046806-17126-0 -> volumes/ccdn-ch2-01 failed. No
>> such file or directory
>
> This looks like one of several bugs that have been fixed
> recently. What kernel version and userspace tools version are you
> using?


3.11...  older than I'd realized!  Guess it's time for an upgrade, eh?  
I'll give that a go -- thanks for the pointer!


-Ken


Re: systemd.setenv and a mount.unit

2014-11-20 Thread Chris Murphy
On Thu, Nov 20, 2014 at 4:14 AM, Goffredo Baroncelli kreij...@inwind.it wrote:

 Supposing to have the following four subvolumes

 /root/
 /root/etc
 /root/usr
 /root/var

 When you need to snapshot, you should:

 # btrfs subvolume snapshot /root /backup-root-20141120
 # btrfs subvolume snapshot /root/etc /backup-root-20141120/etc
 # btrfs subvolume snapshot /root/usr /backup-root-20141120/usr
 # btrfs subvolume snapshot /root/var /backup-root-20141120/var

 So in order to remount an old filesystem, you need to make only
 1 mount.

I like this layout better than either the openSUSE or Fedora layouts.
It's easier to mount an old filesystem, where on Fedora each
subvolume must be explicitly mounted. And it ensures old binaries
aren't in the current mount path – kinda like running in a chroot –
where on openSUSE the snapshots containing old binaries are in the
current mount path.


-- 
Chris Murphy


Re: scrub implies failing drive - smartctl blissfully unaware

2014-11-20 Thread Phillip Susi

On 11/19/2014 5:25 PM, Robert White wrote:
 The controller, the thing that sets the ready bit and sends the 
 interrupt is distinct from the driver, the thing that polls the
 ready bit when the interrupt is sent. At the bus level there are
 fixed delays and retries. Try putting two drives on a pin-select
 IDE bus and strapping them both as _slave_ (or indeed master)
 sometime and watch the shower of fixed delay retries.

No, it does not.  In classical IDE, the controller is really just a
bus bridge.  When you read from the status register in the controller,
the read bus cycle is propagated down the IDE ribbon, and into the
drive, and you are in fact, reading the register directly from the
drive.  That is where the name Integrated Device Electronics came
from: because the controller was really integrated into the drive.

The only fixed delays at the bus level are the bus cycle speed.  There
are no retries.  There are only 3 mentions of the word retry in the
ATA8-APT and they all refer to the host driver.

 That's odd... my bios reads from storage to boot the device and it
 does so using the ACPI storage methods.

No, it doesn't.  It does so by accessing the IDE or AHCI registers
just as the PC BIOS always has.  I suppose I also need to remind you that
we are talking about the context of Linux here, and Linux does not
make use of the BIOS for disk access.

 ACPI 4.0 Specification Section 9.8 even disagrees with you at some
 length.
 
 Let's just do the titles shall we:
 
 9.8 ATA Controller Devices 9.8.1 Objects for both ATA and SATA
 Controllers. 9.8.2 IDE Controller Device 9.8.3 Serial ATA (SATA)
 controller Device
 
 Oh, and _lookie_ _here_ in Linux Kernel Menuconfig at Device
 Drivers - * Serial ATA and Parallel ATA drivers (libata) - *
 ACPI firmware driver for PATA
 
 CONFIG_PATA_ACPI:
 
 This option enables an ACPI method driver which drives motherboard
 PATA controller interfaces through the ACPI firmware in the BIOS.
 This driver can sometimes handle otherwise unsupported hardware.
 
 You are a storage _genius_ for knowing that all that stuff doesn't 
 exist... the rest of us must simply muddle along in our
 delusion...

Yes, ACPI 4.0 added this mess.  I have yet to see a single system that
actually implements it.  I can't believe they even bothered adding
this driver to the kernel.  Is there anyone in the world who has ever
used it?  If no motherboard vendor has bothered implementing the ACPI
FAN specs, I very much doubt anyone will ever bother with this.

 Do tell us more... I didn't say the driver would cause long delays,
 I said that the time it takes to error out other improperly
 supported drivers and fall back to this one could induce long
 delays and resets.

There is no error out and fall back.  If the device is in AHCI
mode then it identifies itself as such and the AHCI driver is loaded.
If it is in IDE mode, then it identifies itself as such, and the IDE
driver is loaded.




Re: scrub implies failing drive - smartctl blissfully unaware

2014-11-20 Thread Phillip Susi

On 11/19/2014 5:33 PM, Robert White wrote:
 That would be fake raid, not hardware raid.
 
 The LSI MegaRaid controller people would _love_ to hear more about
 your insight into how their battery-backed multi-drive RAID
 controller is fake. You should go work for them. Try the contact
 us link at the bottom of this page. I'm sure they are waiting for
 your insight with baited breath!

Forgive me, I should have trimmed the quote a bit more.  I was
responding specifically to the "many mother boards have hardware RAID
support available through the bios" part, not the LSI part.

 Odd, my MegaRaid controller takes about fifteen seconds
 by-the-clock to initialize and to the integrity check on my single
 initialized drive.

It is almost certainly spending those 15 seconds on something else,
like bootstrapping its firmware code from a slow serial eeprom or
waiting for you to press the magic key to enter the bios utility.  I
would be very surprised to see that time double if you add a second
disk.  If it does, then they are doing something *very* wrong, and
certainly quite different from any other real or fake raid controller
I've ever used.

 It's amazing that with a fail and retry it would be _faster_...

I have no idea what you are talking about here.  I said that they
aren't going to retry a read that *succeeded* but came back without
their magic signature.  It isn't like reading it again is going to
magically give different results.




Re: [PATCH v7] Move BTRFS RCU string to common library

2014-11-20 Thread Omar Sandoval
On Thu, Nov 13, 2014 at 02:18:21AM -0800, Omar Sandoval wrote:
 The RCU-friendly string API used internally by BTRFS is generic enough for
 common use. This doesn't add any new functionality, but instead just moves the
 code and documents the existing API.
 
 Reviewed-by: Josh Triplett j...@joshtriplett.org
 Acked-by: Paul E. McKenney paul...@linux.vnet.ibm.com
 Signed-off-by: Omar Sandoval osan...@osandov.com
 ---
 Alright, here's one more go at it.
 
 v7: Add arguments to kernel doc for printk wrappers, use ##__VA_ARGS
 v6: Add header dependencies to rcustring.h
 v5: Rebase against v3.18-rc3
 v4: Don't return anything from the printk wrappers on the assumption that
 printk will return void someday
 v3: Add __rcu annotation to relevant functions, add Paul's ack and Josh's
 review
 
  fs/btrfs/check-integrity.c |  6 +--
  fs/btrfs/dev-replace.c | 19 -
  fs/btrfs/disk-io.c |  6 +--
  fs/btrfs/extent_io.c   |  4 +-
  fs/btrfs/ioctl.c   |  4 +-
  fs/btrfs/raid56.c  |  2 +-
  fs/btrfs/rcu-string.h  | 56 --
  fs/btrfs/scrub.c   | 15 +++
  fs/btrfs/super.c   |  2 +-
  fs/btrfs/volumes.c | 14 +++
  include/linux/rcustring.h  | 99 
 ++
  11 files changed, 136 insertions(+), 91 deletions(-)
  delete mode 100644 fs/btrfs/rcu-string.h
  create mode 100644 include/linux/rcustring.h
 
Ping -- does everything look good here?

-- 
Omar


Re: scrub implies failing drive - smartctl blissfully unaware

2014-11-20 Thread Robert White

On 11/20/2014 12:26 PM, Phillip Susi wrote:

> Yes, ACPI 4.0 added this mess.  I have yet to see a single system that
> actually implements it.  I can't believe they even bothered adding
> this driver to the kernel.  Is there anyone in the world who has ever
> used it?  If no motherboard vendor has bothered implementing the ACPI
> FAN specs, I very much doubt anyone will ever bother with this.


Nice attempt at saving face, but wrong as _always_.

The CONFIG_PATA_ACPI option has been in the kernel since 2008 and lots 
of people have used it.


If you search for ACPI ide you'll find people complaining in 2008-2010 
about windows error messages indicating the device is present in their 
system but no OS driver is available.


That you have yet to see a single system that implements it is about 
the worst piece of internet research I've ever seen. Do you not _get_ 
that your opinion about what exists and how it works is not authoritative?


You can also find articles about both windows and linux systems actively 
using ACPI fan control going back to 2009


These are not hard searches to pull off. These are not obscure 
references. Go to the google box and start typing ACPI fan... and 
check the autocomplete.


I'll skip over all the parts where you don't know how a chipset works 
and blah, blah, blah...


You really should have just stopped at "I don't know" and "I've never" 
because you keep demonstrating that you _don't_ know, and that you 
really _should_ _never_.


Tell us more about the lizard aliens controlling your computer, I find 
your versions of reality fascinating...



Re: scrub implies failing drive - smartctl blissfully unaware

2014-11-20 Thread Robert White

On 11/20/2014 12:34 PM, Phillip Susi wrote:

> On 11/19/2014 5:33 PM, Robert White wrote:
>>> That would be fake raid, not hardware raid.
>>
>> The LSI MegaRaid controller people would _love_ to hear more about
>> your insight into how their battery-backed multi-drive RAID
>> controller is fake. You should go work for them. Try the contact
>> us link at the bottom of this page. I'm sure they are waiting for
>> your insight with baited breath!
>
> Forgive me, I should have trimmed the quote a bit more.  I was
> responding specifically to the "many mother boards have hardware RAID
> support available through the bios" part, not the LSI part.


Well you should have _actually_ trimmed your response down to not 
pressing send.


_Many_ motherboards have complete RAID support at levels 0, 1, 10, and 
5. A few have RAID6.


Some of them even use the LSI chip-set.

Seriously... are you trolling this list with disinformation or just 
repeating tribal knowledge from fifteen year old copies of PC Magazine?


Yeah, some of the IDE motherboards that only had RAID1 and RAID0 (and 
indeed some of the add-on controllers) back in the IDE-only days were 
really lame just-forked-write devices with no integrity checks (hence 
"fake raid"), but that's from like the 1990s; it's paleolithic-age 
wisdom at this point.


Phillip say sky god angry, all go hide in cave! /D'oh...


Changing label few times killed filesystem?

2014-11-20 Thread Boris Chernov


I changed the file system label a few times. When I tried 
to mount it afterwards, it was no longer mountable:


# mount /dev/sdb1 /mnt
mount: Not a directory

In dmesg I see the following after the above command:

[ 5198.413202] BTRFS info (device sdb1): disk space caching is enabled
[ 5198.629958] BTRFS: checking UUID tree

I have lots of manually sorted downloaded files on this partition 
(in other words nothing very important but downloading and sorting all 
files again would require a lot of time), so I would appreciate any 
help.  This is what I have tried so far to restore it:


# btrfs check /dev/sdb1
Checking filesystem on /dev/sdb1
UUID: 787e3bc1-7583-4bd8-a52e-e57fd7fc9243
checking extents
btrfs: cmds-check.c:2266: check_owner_ref: Assertion `!(rec->is_root)' failed.

zsh: abort  btrfs check /dev/sdb1

Since it failed after checking extents I decided to try 
--init-extent-tree:


# btrfs check --init-extent-tree /dev/sdb1
Checking filesystem on /dev/sdb1
UUID: 787e3bc1-7583-4bd8-a52e-e57fd7fc9243
Creating a new extent tree
Failed to find [29376512, 168, 16384]
btrfs unable to find ref byte nr 29376512 parent 0 root 1  owner 1 offset 0
Failed to find [30818304, 168, 16384]
btrfs unable to find ref byte nr 30818304 parent 0 root 1  owner 0 offset 1
Failed to find [47546368, 168, 16384]
btrfs unable to find ref byte nr 47546368 parent 0 root 1  owner 0 offset 1
parent transid verify failed on 29442048 wanted 4 found 2758
Ignoring transid failure
checking extents
btrfs: cmds-check.c:2266: check_owner_ref: Assertion `!(rec->is_root)' failed.

zsh: abort  btrfs check --init-extent-tree /dev/sdb1

# btrfs restore /dev/sdb1 /media/backup/sdb1  # this command exits after a second with return code 0

# echo $?
0

I also tried btrfs restore with --path-regex and got the same result.

# btrfs-find-root /dev/sdb1
Super think's the tree root is at 29360128, chunk root 20971520
Well block 4194304 seems great, but generation doesn't match, have=2, want=2759 level 0
Well block 4243456 seems great, but generation doesn't match, have=3, want=2759 level 0

Found tree root at 29360128 gen 2759 level 1

https://btrfs.wiki.kernel.org/index.php/Restore talks about picking the 
root with the largest transid, but I do not see a transid in my output, 
so I am not sure what to do.
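
Editor's sketch, not part of the original message: as far as I understand, the "generation" (have=, gen) printed by btrfs-find-root is the same number the wiki calls the transid. Under that assumption, the candidate roots can be listed from the output, highest generation first. The function name is made up, and the patterns only cover the two line formats shown above.

```sh
#!/bin/sh
# Sketch only: read btrfs-find-root output on stdin and print
# "generation bytenr" pairs, highest generation first.
list_root_candidates() {
    sed -n \
        -e 's/^Well block \([0-9][0-9]*\) .*have=\([0-9][0-9]*\),.*/\2 \1/p' \
        -e 's/^Found tree root at \([0-9][0-9]*\) gen \([0-9][0-9]*\).*/\2 \1/p' |
        sort -rn
}

# Possible use:
#   btrfs-find-root /dev/sdb1 2>&1 | list_root_candidates
# A chosen bytenr can then be passed to restore, for example:
#   btrfs restore -t <bytenr> /dev/sdb1 /media/backup/sdb1
```

In the output above, the highest generation (2759) belongs to block 29360128, which is the tree root the superblock already points to; the other two candidates have generations 2 and 3.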


I also tried btrfsck:

# btrfsck /dev/sdb1
*** Error in `btrfs check': double free or corruption (fasttop): 
0x01074020 ***

zsh: abort  btrfsck /dev/sdb1

# btrfsck -b /dev/sdb1
*** Error in `btrfs check': double free or corruption (fasttop): 
0x024e8020 ***

zsh: abort  btrfsck -b /dev/sdb1

# btrfsck --repair /dev/sdb1
enabling repair mode
*** Error in `btrfs check': double free or corruption (fasttop): 
0x00e26020 ***

zsh: abort  btrfsck --repair /dev/sdb1

# uname -a
Linux debian 3.15.0-pf2 #1 SMP Sat Jun 28 15:09:48 EEST 2014 x86_64 
GNU/Linux

# btrfs --version
Btrfs v3.14.1
# btrfs fi show
Label: 'label'  uuid: 787e3bc1-7583-4bd8-a52e-e57fd7fc9243
	Total devices 1 FS bytes used 411.76GiB
	devid    1 size 465.76GiB used 465.76GiB path /dev/sdb1

Btrfs v3.14.1



Re: Changing label few times killed filesystem?

2014-11-20 Thread Chris Murphy
On Thu, Nov 20, 2014 at 6:27 PM, Boris Chernov aqs1...@hotmail.com wrote:

 Since it failed after checking extents I decided to try
 --init-extent-tree:

There might be hope yet if you didn't use --repair, which the wiki and
many posts on this list describe as kind of a last resort. But at the
very least, before going with the hammer approach, you should upgrade
your btrfs-progs, which is kind of old. Current is 3.17.2. I suggest
upgrading and just posting the results from 'btrfs check <device>'
without any options, to see what you get. The check and --repair code
is mostly in btrfs-progs, whereas the mount-time fixing code is in the
kernel. So upgrading btrfs-progs may be sufficient for your case, but
ultimately it might be necessary to go to a newer kernel also.

 Btrfs v3.14.1


-- 
Chris Murphy


Re: BTRFS messes up snapshot LV with origin

2014-11-20 Thread Zygo Blaxell
On Mon, Nov 17, 2014 at 08:04:05PM +0100, Goffredo Baroncelli wrote:
 On 2014-11-17 07:59, Brendan Hide wrote:
  
  That leaves two aspects of this issue which I view as two separate bugs:
  a) Btrfs cannot gracefully handle separate filesystems that have the same 
  UUID. At all.
  b) Grub appears to pick the wrong filesystem when presented with two 
  filesystems with the same UUID.
  
  I feel a) is a btrfs bug.
  I feel b) is a bug that is more about ecosystem design than grub being 
  silly.
 
 Regarding a)
 IIRC, btrfs collects the filesystem information by UUID; if two 
 filesystems have the same UUID (as in the LVM-snapshot case), the
 last filesystem discovered overwrites the first one.
 
 Filesystem discovery is done in user space, so it should be simple
 to skip a filesystem on an LVM snapshot.
 
 Regarding b)
 I am a bit confused: if I understood correctly, the root filesystem was
 picked from an LVM snapshot, so grub-probe *correctly* reported that
 the root device is the snapshot.
 The problem was in filesystem discovery during boot: the *real* device
 was scanned first, then the LVM snapshot; the latter
 overwrote the former, so the system booted from the LVM snapshot.

IMHO if the device UUID search finds multiple devices with the same device
UUID, it should ignore _all_ of them as the identification problem
is unsolvable without further user input.  This is what the 'device='
mount option is for.
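
The policy proposed above (treat a UUID seen on more than one device as unresolvable, rather than letting the last scan win) can be sketched as follows. This is an illustration only, not btrfs code; the function name resolvable_devices, the device paths, and the UUID strings are hypothetical.

```python
from collections import defaultdict


def resolvable_devices(scanned):
    """Split scan results into unambiguous and ambiguous UUIDs.

    `scanned` is an iterable of (device_path, uuid) pairs in scan order.
    A UUID seen on exactly one device maps to that device; a UUID seen on
    several devices is reported separately and never auto-selected.
    """
    by_uuid = defaultdict(list)
    for path, uuid in scanned:
        by_uuid[uuid].append(path)
    unique = {u: paths[0] for u, paths in by_uuid.items() if len(paths) == 1}
    ambiguous = {u: paths for u, paths in by_uuid.items() if len(paths) > 1}
    return unique, ambiguous


if __name__ == "__main__":
    # Hypothetical scan: an origin LV and its snapshot share one UUID.
    scan = [
        ("/dev/vg/root", "uuid-A"),
        ("/dev/vg/root-snap", "uuid-A"),
        ("/dev/sdc1", "uuid-B"),
    ]
    unique, ambiguous = resolvable_devices(scan)
    print("usable without user input:", unique)
    print("needs an explicit device choice:", ambiguous)
```

With this policy neither the origin nor the snapshot is picked automatically, which leaves the choice to the admin instead of to scan order.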

 My conclusion is that we should improve the btrfs scan so that:
 - in udev rules, a partition that is an LVM snapshot is by default 
 not scanned by btrfs dev scan
 - btrfs dev scan skips LVM snapshots during partition discovery.

That would mean I can't do this:

1.  lvm snapshot of ext4 filesystem

2.  btrfs-convert the snapshot

3.  mount the snapshot, make sure it's OK

4.  merge LVM snapshot to overwrite original ext4 filesystem

which would be a shame since that's the only way I ever convert ext3/4
filesystems to btrfs (btrfs-convert is a little buggy still).

 BR
 G.Baroncelli
 
 
 
 -- 
 gpg @keyserver.linux.it: Goffredo Baroncelli kreijackATinwind.it
 Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5




Re: BTRFS messes up snapshot LV with origin

2014-11-20 Thread Zygo Blaxell
On Wed, Nov 19, 2014 at 10:20:17AM -0500, Phillip Susi wrote:
 On 11/18/2014 9:54 PM, Chris Murphy wrote:
  Why is it silly? Btrfs on a thin volume has practical use case
  aside from just being thinly provisioned, its snapshots are block
  device based, not merely that of an fs tree.
 
 Umm... because one of the big selling points of btrfs is that it is in
 a much better position to make snapshots being aware of the fs tree
 rather than doing it in the block layer.

One of the big selling points of LVM is that it is in a much better
position to make snapshots so you can run btrfsck on the shattered
remains of your broken btrfs filesystem.

The UUID-driven behavior of btrfs is _really extremely annoying_.
No other filesystem forces me to jump through the hoops btrfs does
to get routine admin tasks done.

e.g. if an ext4 filesystem explodes, I can:

1.  make a LVM snapshot of the broken filesystem

2.  run e2fsck on the snapshot

3.  mount and repair the snapshot, e.g. rsync any missing files
from backups, salvage anything that survived

4.  LVM merge the snapshot to its origin volume

5.  umount the origin volume and mount the merged volume
(or just reboot)

...and I can do all of this on a running system, in-place, with only a
few minutes of downtime in the must-reboot case.

None of the above works with btrfs at all.  Multi-device btrfs fails
at 2, and mounting the filesystem fails at 3.  The closest I've gotten
to this workflow is to set up a kvm instance that can see only the LVM
snapshots and run the btrfsck or rsync there, and hope that the
system doesn't crash and reboot during that time, or the filesystem will
be more or less destroyed by the random combination of origin and
snapshot LVs.

I've also learned the hard way to always make an LVM snapshot before
running btrfsck, just in case you discover a new btrfsck bug with your
filesystem.  That at least works for single-device btrfs filesystems.

 So it is kind of silly in the first place to be using lvm snapshots
 under btrfs, but it is doubly silly to use lvm for snapshots, and
 btrfs for the mirroring rather than lvm.  Pick one layer and use it
 for both functions.  Even if that is lvm, then it should also be
 handling the mirroring.
 




Re: Changing label few times killed filesystem?

2014-11-20 Thread Roman Mamedov
On Fri, 21 Nov 2014 01:27:17 +
Boris Chernov aqs1...@hotmail.com wrote:

 
  I have changed the file system label a few times in total. When I tried 
 to mount it after that, it was no longer mountable:
 
 # mount /dev/sdb1 /mnt
 mount: Not a directory

I'd say that implies something is wrong with your /mnt, rather than /dev/sdb1.
Before mounting, try things like ls -la /mnt/, umount /mnt, etc.
Or simply try mounting somewhere other than /mnt/.

-- 
With respect,
Roman


Re: [PATCH] fstests: mark replace tests in btrfs/group

2014-11-20 Thread Eryu Guan
On Wed, Nov 19, 2014 at 09:33:31AM -0600, Eric Sandeen wrote:
 A couple tests exercise replace but were not marked as such
 in the group file.

Hi Eric,

I have a patch sitting in my git tree that adds most of the btrfs tests
to one or more groups; I'll send it out for review soon.

Thanks,
Eryu
 
 Signed-off-by: Eric Sandeen sand...@redhat.com
 ---
 
 diff --git a/tests/btrfs/group b/tests/btrfs/group
 index 9adf862..1f23979 100644
 --- a/tests/btrfs/group
 +++ b/tests/btrfs/group
 @@ -13,7 +13,7 @@
  008 auto quick
  009 auto quick
  010 auto quick
 -011 auto
 +011 auto replace
  012 auto
  013 auto quick
  014 auto
 @@ -22,7 +22,7 @@
  017 auto quick
  018 auto quick
  019 auto quick
 -020 auto quick
 +020 auto quick replace
  021 auto quick
  022 auto
  023 auto
 


[PATCH] btrfs: add groups for btrfs tests

2014-11-20 Thread Eryu Guan
Some new btrfs groups have been added in the btrfs stress patchset; add
other tests to the proper groups too.

Signed-off-by: Eryu Guan eg...@redhat.com
---
 tests/btrfs/group | 110 +++---
 1 file changed, 55 insertions(+), 55 deletions(-)

diff --git a/tests/btrfs/group b/tests/btrfs/group
index b63f174..9cee026 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -3,64 +3,64 @@
 # - do not start group names with a digit
 # - comment line before each group is new description
 #
-001 auto quick
-002 auto
-003 auto
+001 auto quick subvol snapshot
+002 auto snapshot
+003 auto replace
 004 auto rw metadata
-005 auto
+005 auto defrag
 006 auto quick
-007 auto rw metadata
-008 auto quick
-009 auto quick
-010 auto quick
-011 auto
-012 auto
-013 auto quick
-014 auto
-015 auto quick
-016 auto quick
-017 auto quick
-018 auto quick
-019 auto quick
-020 auto quick
-021 auto quick
-022 auto
+007 auto rw metadata send
+008 auto quick send
+009 auto quick subvol
+010 auto quick defrag
+011 auto replace
+012 auto convert
+013 auto quick balance
+014 auto balance
+015 auto quick snapshot
+016 auto quick send
+017 auto quick snapshot
+018 auto quick subvol
+019 auto quick send
+020 auto quick replace
+021 auto quick balance defrag
+022 auto qgroup
 023 auto
-024 auto quick
-025 auto quick
-026 auto quick
-027 auto quick
-028 auto quick
-029 auto quick
-030 auto quick
-031 auto quick
-032 auto quick
-033 auto quick
-034 auto quick
-035 auto quick
-036 auto quick
-037 auto quick
-038 auto quick
-039 auto quick
-040 auto quick
-041 auto quick
-042 auto quick
-043 auto quick
-044 auto quick
-045 auto quick
-046 auto quick
-047 auto quick
+024 auto quick compress
+025 auto quick send clone
+026 auto quick clone
+027 auto quick clone
+028 auto quick clone
+029 auto quick clone
+030 auto quick send
+031 auto quick subvol clone
+032 auto quick remount
+033 auto quick send snapshot
+034 auto quick send
+035 auto quick clone
+036 auto quick send snapshot
+037 auto quick compress
+038 auto quick compress send
+039 auto quick send
+040 auto quick send
+041 auto quick compress
+042 auto quick qgroup
+043 auto quick send
+044 auto quick send
+045 auto quick send
+046 auto quick send
+047 auto quick send
 048 auto quick
 049 auto quick
-050 auto
-051 auto quick
-052 auto quick
-053 auto quick
-054 auto quick
-055 auto quick
-056 auto quick
+050 auto send
+051 auto quick send
+052 auto quick clone
+053 auto quick send
+054 auto quick send
+055 auto quick clone
+056 auto quick clone
 057 auto quick
-058 auto quick
+058 auto quick send snapshot
 059 auto quick
 060 auto balance subvol
 061 auto balance scrub
@@ -78,7 +78,7 @@
 073 auto scrub remount compress
 074 auto defrag remount compress
 075 auto quick subvol
-076 auto quick
-077 auto quick
-078 auto
+076 auto quick compress
+077 auto quick send snapshot
+078 auto snapshot
 079 auto
-- 
1.9.3
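
For readers unfamiliar with the file this patch edits: judging from the hunks above, each non-comment line of tests/btrfs/group is a test number followed by the groups that test belongs to. A minimal sketch of reading that layout follows; parse_group_file is a hypothetical helper written for illustration and is not part of fstests.

```python
def parse_group_file(text):
    """Map each group name to the list of test numbers that carry it.

    Assumes the layout seen in the patch: '#' starts a comment, and every
    other non-empty line is '<test> <group> [<group> ...]'.
    """
    groups = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        test, *names = line.split()
        for name in names:
            groups.setdefault(name, []).append(test)
    return groups


# A few lines taken from the '+' side of the patch above.
SAMPLE = """\
# - do not start group names with a digit
001 auto quick subvol snapshot
002 auto snapshot
003 auto replace
011 auto replace
020 auto quick replace
"""

if __name__ == "__main__":
    groups = parse_group_file(SAMPLE)
    print("replace tests:", " ".join(groups["replace"]))
    print("snapshot tests:", " ".join(groups["snapshot"]))
```

Grouping tests this way is what lets a run be restricted to, say, only the replace tests.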



Re: scrub implies failing drive - smartctl blissfully unaware

2014-11-20 Thread Zygo Blaxell
On Tue, Nov 18, 2014 at 09:29:54AM +0200, Brendan Hide wrote:
 Hey, guys
 
 See further below extracted output from a daily scrub showing csum
 errors on sdb, part of a raid1 btrfs. Looking back, it has been
 getting errors like this for a few days now.
 
 The disk is patently unreliable but smartctl's output implies there
 are no issues. Is this somehow standard fare for S.M.A.R.T. output?
 
 Here are (I think) the important bits of the smartctl output for
 $(smartctl -a /dev/sdb) (the full results are attached):
 ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
   1 Raw_Read_Error_Rate     0x000f   100   253   006    Pre-fail  Always   -           0
   5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always   -           1
   7 Seek_Error_Rate         0x000f   086   060   030    Pre-fail  Always   -           440801014
 197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           0
 198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           0
 199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0
 200 Multi_Zone_Error_Rate   0x       100   253   000    Old_age   Offline  -           0
 202 Data_Address_Mark_Errs  0x0032   100   253   000    Old_age   Always   -           0

You have one reallocated sector, so the drive has lost some data at some
time in the last 49000(!) hours.  Normally reallocations happen during
writes so the data that was lost was data you were in the process of
overwriting anyway; however, the reallocated sector count could also be
a sign of deteriorating drive integrity.

In /var/lib/smartmontools there might be a csv file with logged error
attribute data that you could use to figure out whether that reallocation
was recent.

I also notice you are not running regular SMART self-tests (e.g.
by smartctl -t long) and the last (and first, and only!) self-test the
drive ran was ~12000 hours ago.  That means most of your SMART data is
about 18 months old.  The drive won't know about sectors that went bad
in the last year and a half unless the host happens to stumble across
them during a read.

The drive is over five years old in operating hours alone.  It is probably
so fragile now that it will break if you try to move it.
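
The hour figures above can be turned into calendar time with simple arithmetic. The sketch below assumes the drive has been powered on continuously, which the SMART data does not state, so it gives a lower bound on calendar age; the helper names are made up for illustration.

```python
HOURS_PER_YEAR = 24 * 365.25          # 8766 hours, averaging in leap years
HOURS_PER_MONTH = HOURS_PER_YEAR / 12  # 730.5 hours


def hours_to_years(hours):
    """Convert power-on hours to years of continuous operation."""
    return hours / HOURS_PER_YEAR


def hours_to_months(hours):
    """Convert power-on hours to months of continuous operation."""
    return hours / HOURS_PER_MONTH


if __name__ == "__main__":
    # Figures quoted in this message: ~49000 power-on hours in total,
    # and ~12000 hours since the last self-test.
    print(f"49000 h = {hours_to_years(49000):.1f} years")
    print(f"12000 h = {hours_to_months(12000):.1f} months")
```

Under that assumption 49000 hours is a bit over five and a half years, and 12000 hours is roughly 16 months, the same ballpark as the "about 18 months" estimate above.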


 
 
  Original Message 
 Subject:  Cron root@watricky /usr/local/sbin/btrfs-scrub-all
 Date: Tue, 18 Nov 2014 04:19:12 +0200
 From: (Cron Daemon) root@watricky
 To:   brendan@watricky
 
 
 
 WARNING: errors detected during scrubbing, corrected.
 [snip]
 scrub device /dev/sdb2 (id 2) done
   scrub started at Tue Nov 18 03:22:58 2014 and finished after 2682 
 seconds
   total bytes scrubbed: 189.49GiB with 5420 errors
   error details: read=5 csum=5415
   corrected errors: 5420, uncorrectable errors: 0, unverified errors: 164

That seems a little off.  If there were 5 read errors, I'd expect the drive to
have errors in the SMART error log.

Checksum errors could just as easily be a btrfs bug or a RAM/CPU problem.
There have been a number of fixes to csums in btrfs pulled into the kernel
recently, and I've retired two five-year-old computers this summer due
to RAM/CPU failures.
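
To keep the two error classes apart when reading scrub reports like the one quoted above, the "error details" line can be split into per-type counters and checked against the total. This is a minimal sketch that assumes the report text looks like the quoted cron mail; parse_scrub_errors is a hypothetical helper, not a btrfs-progs interface.

```python
import re


def parse_scrub_errors(report):
    """Return (per-type error counts, reported total) from a scrub report.

    Looks for 'error details: key=N ...' and '... with N errors'.
    The total is None if the summary line is missing.
    """
    counts = {}
    details = re.search(r"error details:\s*(.*)", report)
    if details:
        counts = {k: int(v) for k, v in re.findall(r"(\w+)=(\d+)", details.group(1))}
    total_match = re.search(r"with (\d+) errors", report)
    total = int(total_match.group(1)) if total_match else None
    return counts, total


SAMPLE = """\
scrub device /dev/sdb2 (id 2) done
  total bytes scrubbed: 189.49GiB with 5420 errors
  error details: read=5 csum=5415
  corrected errors: 5420, uncorrectable errors: 0, unverified errors: 164
"""

if __name__ == "__main__":
    counts, total = parse_scrub_errors(SAMPLE)
    print(counts, "total:", total)
    print("breakdown matches total:", sum(counts.values()) == total)
```

In the quoted report the five read errors are the ones a drive would normally be expected to log itself, while the 5415 csum errors are the ones that could also come from elsewhere in the stack.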

 [snip]
 

 smartctl 6.3 2014-07-26 r3976 [x86_64-linux-3.17.2-1-ARCH] (local build)
 Copyright (C) 2002-14, Bruce Allen, Christian Franke, www.smartmontools.org
 
 === START OF INFORMATION SECTION ===
 Model Family:     Seagate Barracuda 7200.10
 Device Model:     ST3250410AS
 Serial Number:    6RYF5NP7
 Firmware Version: 4.AAA
 User Capacity:    250,059,350,016 bytes [250 GB]
 Sector Size:      512 bytes logical/physical
 Device is:        In smartctl database [for details use: -P show]
 ATA Version is:   ATA/ATAPI-7 (minor revision not indicated)
 Local Time is:    Tue Nov 18 09:16:03 2014 SAST
 SMART support is: Available - device has SMART capability.
 SMART support is: Enabled
 
 === START OF READ SMART DATA SECTION ===
 SMART overall-health self-assessment test result: PASSED
 See vendor-specific Attribute list for marginal Attributes.
 
 General SMART Values:
 Offline data collection status:  (0x82) Offline data collection activity
                                         was completed without error.
                                         Auto Offline Data Collection: Enabled.
 Self-test execution status:      (   0) The previous self-test routine completed
                                         without error or no self-test has ever
                                         been run.
 Total time to complete Offline
 data collection:                 ( 430) seconds.
 Offline data collection
 capabilities:                    (0x5b) SMART execute Offline immediate.
                                         Auto Offline data collection on/off support.
                                         Suspend Offline collection upon new
                                         command.
  

Re: [PATCH] fstests: mark replace tests in btrfs/group

2014-11-20 Thread Eric Sandeen
On Nov 20, 2014, at 10:44 PM, Eryu Guan eg...@redhat.com wrote:
 
 On Wed, Nov 19, 2014 at 09:33:31AM -0600, Eric Sandeen wrote:
 A couple tests exercise replace but were not marked as such
 in the group file.
 
 Hi Eric,
 
 I have a patch sitting in my git tree that adds most of the btrfs tests
 to one or more groups; I'll send it out for review soon.
 
Thanks, much more complete than mine.

-Eric

 Thanks,
 Eryu
 
 Signed-off-by: Eric Sandeen sand...@redhat.com
 ---
 
 diff --git a/tests/btrfs/group b/tests/btrfs/group
 index 9adf862..1f23979 100644
 --- a/tests/btrfs/group
 +++ b/tests/btrfs/group
 @@ -13,7 +13,7 @@
 008 auto quick
 009 auto quick
 010 auto quick
 -011 auto
 +011 auto replace
 012 auto
 013 auto quick
 014 auto
 @@ -22,7 +22,7 @@
 017 auto quick
 018 auto quick
 019 auto quick
 -020 auto quick
 +020 auto quick replace
 021 auto quick
 022 auto
 023 auto
 


Re: scrub implies failing drive - smartctl blissfully unaware

2014-11-20 Thread Brendan Hide

On 2014/11/21 06:58, Zygo Blaxell wrote:

You have one reallocated sector, so the drive has lost some data at some
time in the last 49000(!) hours.  Normally reallocations happen during
writes so the data that was lost was data you were in the process of
overwriting anyway; however, the reallocated sector count could also be
a sign of deteriorating drive integrity.

In /var/lib/smartmontools there might be a csv file with logged error
attribute data that you could use to figure out whether that reallocation
was recent.

I also notice you are not running regular SMART self-tests (e.g.
by smartctl -t long) and the last (and first, and only!) self-test the
drive ran was ~12000 hours ago.  That means most of your SMART data is
about 18 months old.  The drive won't know about sectors that went bad
in the last year and a half unless the host happens to stumble across
them during a read.

The drive is over five years old in operating hours alone.  It is probably
so fragile now that it will break if you try to move it.

All interesting points. Do you schedule SMART self-tests on your own 
systems? I have smartd running. In theory it tracks changes and sends 
alerts if it figures a drive is going to fail. But, based on what you've 
indicated, that isn't good enough.



WARNING: errors detected during scrubbing, corrected.
[snip]
scrub device /dev/sdb2 (id 2) done
scrub started at Tue Nov 18 03:22:58 2014 and finished after 2682 
seconds
total bytes scrubbed: 189.49GiB with 5420 errors
error details: read=5 csum=5415
corrected errors: 5420, uncorrectable errors: 0, unverified errors: 164
That seems a little off.  If there were 5 read errors, I'd expect the drive to
have errors in the SMART error log.

Checksum errors could just as easily be a btrfs bug or a RAM/CPU problem.
There have been a number of fixes to csums in btrfs pulled into the kernel
recently, and I've retired two five-year-old computers this summer due
to RAM/CPU failures.

The difference here is that the issue affects only the one drive. That 
leaves the probable causes as:

- the drive itself
- the cable/ports

with the motherboard chipset as a remote possibility.


--
__
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97
