Re: raid1 has failing disks, but smart is clear

2016-07-07 Thread Corey Coughlin

Hi Duncan,
Thanks for the info!  I've seen that done in the fstab, but it 
didn't work for me the last time I tried it on the command line. Worth a 
shot!


-- Corey

On 07/07/2016 06:24 PM, Duncan wrote:

Corey Coughlin posted on Wed, 06 Jul 2016 23:40:30 -0700 as excerpted:


Well yeah, if I was mounting all the disks to different mount points, I
would definitely use UUIDs to get them mounted.  But I haven't seen any
way to set up a "mkfs.btrfs" command to use UUID or anything else for
individual drives.  Am I missing something?  I've been doing a lot of
googling.

FWIW, you can use the /dev/disk/by-*/* symlinks (as normally setup by
udev) to reference various devices.

Of course because the identifiers behind by-uuid and by-label are per-
filesystem, those will normally only identify the one device of a multi-
device filesystem, but the by-id links ID on device serials and partition
number, and if you are using GPT partitioning, you have by-partuuid and
(if you set them when setting up the partitions) by-partlabel as well.
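For example, something along these lines should work (just a sketch; the by-id
names below are placeholders for real model/serial strings):

  ls -l /dev/disk/by-id/
  mkfs.btrfs -d raid1 -m raid1 \
      /dev/disk/by-id/ata-WDC_WD20EARS-00MVWB0_WD-EXAMPLE1 \
      /dev/disk/by-id/ata-WDC_WD20EARS-00MVWB0_WD-EXAMPLE2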





Re: raid1 has failing disks, but smart is clear

2016-07-07 Thread Corey Coughlin

Hi Austin,
Thanks for the reply!  I'll go inline for more:

On 07/07/2016 04:58 AM, Austin S. Hemmelgarn wrote:

On 2016-07-06 18:59, Tomasz Kusmierz wrote:


On 6 Jul 2016, at 23:14, Corey Coughlin 
 wrote:


Hi all,
   Hoping you all can help, have a strange problem, think I know 
what's going on, but could use some verification.  I set up a raid1 
type btrfs filesystem on an Ubuntu 16.04 system, here's what it 
looks like:


btrfs fi show
Label: none  uuid: 597ee185-36ac-4b68-8961-d4adc13f95d4
   Total devices 10 FS bytes used 3.42TiB
   devid    1 size 1.82TiB used 1.18TiB path /dev/sdd
   devid    2 size 698.64GiB used 47.00GiB path /dev/sdk
   devid    3 size 931.51GiB used 280.03GiB path /dev/sdm
   devid    4 size 931.51GiB used 280.00GiB path /dev/sdl
   devid    5 size 1.82TiB used 1.17TiB path /dev/sdi
   devid    6 size 1.82TiB used 823.03GiB path /dev/sdj
   devid    7 size 698.64GiB used 47.00GiB path /dev/sdg
   devid    8 size 1.82TiB used 1.18TiB path /dev/sda
   devid    9 size 1.82TiB used 1.18TiB path /dev/sdb
   devid   10 size 1.36TiB used 745.03GiB path /dev/sdh

I added a couple disks, and then ran a balance operation, and that 
took about 3 days to finish.  When it did finish, tried a scrub and 
got this message:


scrub status for 597ee185-36ac-4b68-8961-d4adc13f95d4
   scrub started at Sun Jun 26 18:19:28 2016 and was aborted after 
01:16:35

   total bytes scrubbed: 926.45GiB with 18849935 errors
   error details: read=18849935
   corrected errors: 5860, uncorrectable errors: 18844075, 
unverified errors: 0


So that seems bad.  Took a look at the devices and a few of them 
have errors:

...
[/dev/sdi].generation_errs 0
[/dev/sdj].write_io_errs   289436740
[/dev/sdj].read_io_errs    289492820
[/dev/sdj].flush_io_errs   12411
[/dev/sdj].corruption_errs 0
[/dev/sdj].generation_errs 0
[/dev/sdg].write_io_errs   0
...
[/dev/sda].generation_errs 0
[/dev/sdb].write_io_errs   3490143
[/dev/sdb].read_io_errs    111
[/dev/sdb].flush_io_errs   268
[/dev/sdb].corruption_errs 0
[/dev/sdb].generation_errs 0
[/dev/sdh].write_io_errs   5839
[/dev/sdh].read_io_errs    2188
[/dev/sdh].flush_io_errs   11
[/dev/sdh].corruption_errs 1
[/dev/sdh].generation_errs 16373

So I checked the smart data for those disks, they seem perfect, no 
reallocated sectors, no problems.  But one thing I did notice is 
that they are all WD Green drives.  So I'm guessing that if they 
power down and get reassigned to a new /dev/sd* letter, that could 
lead to data corruption.  I used idle3ctl to turn off the shut down 
mode on all the green drives in the system, but I'm having trouble 
getting the filesystem working without the errors.  I tried a 'check 
--repair' command on it, and it seems to find a lot of verification 
errors, but it doesn't look like things are getting fixed.
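For reference, the idle3 timer can be checked and turned off roughly like this
(assuming idle3-tools and smartmontools are installed; the exact flags and
device names are examples, check your versions):

  idle3ctl -g /dev/sdj      # show the current idle3 (head-parking) timer
  idle3ctl -d /dev/sdj      # disable it; takes effect after a power cycle
  smartctl -A /dev/sdj | grep -i load_cycle   # how often the heads have parked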
 But I have all the data on it backed up on another system, so I can 
recreate this if I need to.  But here's what I want to know:


1.  Am I correct about the issues with the WD Green drives? If they 
change device names during disk operations, will that corrupt data?
I just wanted to chip in on the WD Green drives. I have a RAID10 
running on 6x2TB of those, and have had for ~3 years. If a disk spins 
down and you try to access something, the kernel & FS & whole system 
will wait for the drive to spin back up and everything works OK. 
I've never had a drive reassigned to a different /dev/sdX due to 
spin down / up.
2 years ago I had corruption due to not using ECC RAM on my 
system; one of the RAM modules started producing errors that were 
never caught by the CPU / MoBo. Long story short, a guy here managed to 
point me in the right direction and I started shifting my data to a 
hopefully new and not corrupted FS … but I was sceptical of the kind of 
issue that you have described, AND I did raid1, and while mounted I did 
shift a disk from one SATA port to another; the FS managed to pick up 
the disk in its new location and did not even blink (as far as I 
remember there was a syslog entry saying that the disk vanished and then 
that the disk was added).


Last word: you've got plenty of errors in your SMART for transfer 
related stuff; please be advised that this may mean:

- faulty cable
- faulty mobo controller
- faulty drive controller
- bad RAM - yes, the motherboard CAN use your RAM for storing data and 
transfer related stuff … especially cheaper ones.
It's worth pointing out that the most likely point here for data 
corruption assuming the cable and controllers are OK is during the DMA 
transfer from system RAM to the drive controller.  Even when dealing 
with really good HBA's that have an on-board NVRAM cache, you still 
have to copy the data out of system RAM at some point, and that's 
usually when the corruption occurs if the problem is with the RAM, CPU 
or MB.
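As a quick cross-check (assuming smartmontools is installed; attribute names
vary by vendor), the link-level CRC counters and the per-device btrfs counters
can be compared, e.g.:

  smartctl -A /dev/sdj | grep -i -e crc -e error
  btrfs device stats /mnt/array     # mount point is a placeholder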


Well, I was able to run memtest on the system last night; that passed 
with flying colors, so I'm now leaning toward the problem being in the 
SAS card.  But 

kdave/for-next commit 26112f7f472

2016-07-07 Thread Jeff Mahoney
Hi Dave -

This commit introduces a bug.  I ran across it when running xfstests
against my own integrated branch.

The problem is that btrfs_calc_reclaim_metadata_size wasn't previously
called from recovery, so it was safe to use fs_info->fs_root.  With
commit 7c83c6a09 (Btrfs: don't bother kicking async if there's nothing
to reclaim) we do call it from recovery context, and fs_info->fs_root is
NULL.

The fix is to just switch btrfs_calc_reclaim_metadata_size to take
an fs_info.  All the other call sites were using fs_info->fs_root
anyway, so it's not like we're pinning a root somewhere just for this call.

-Jeff

-- 
Jeff Mahoney
SUSE Labs





Re: [PATCH 00/31] btrfs: simplify use of struct btrfs_root pointers

2016-07-07 Thread Jeff Mahoney
On 7/7/16 9:48 PM, Jeff Mahoney wrote:
> On 6/24/16 6:14 PM, je...@suse.com wrote:
>> From: Jeff Mahoney 
>>
>> One of the common complaints I've heard from new and experienced
>> developers alike about the btrfs code is the ubiquity of
>> struct btrfs_root.  There is one for every tree on disk and it's not
>> always obvious which root is needed in a particular call path.  It can
>> be frustrating to spend time figuring out which root is required only
>> to discover that it's not actually used for anything other than
>> getting the fs-global struct btrfs_fs_info.
>>
>> The patchset contains several sections.
>>
>> 1) The fsid trace event patchset I posted earlier; I can rebase without this
>>but I'd prefer not to.
>>
>> 2) Converting btrfs_test_opt and friends to use an fs_info.
>>
>> 3) Converting tests to use an fs_info pointer whenever a root is used.
>>
>> 4) Moving sectorsize and nodesize to fs_info and cleaning up the
>>macros used to access them.
> 
> This change was a little overzealous in free-space-cache.c, which hit
> block_group->sectorsize as well as root->sectorsize by accident.  While
> the change is fine for general btrfs usage, it breaks the sanity tests
> since dummy block groups now depend on a dummy fs_info as well.

There's also another error in btrfs_alloc_dummy_fs_info that doesn't
initialize sectorsize.

Clearly my test config got CONFIG_BTRFS_FS_RUN_SANITY_TESTS disabled at
some point. :(

-Jeff

-- 
Jeff Mahoney
SUSE Labs





Re: [PATCH 00/31] btrfs: simplify use of struct btrfs_root pointers

2016-07-07 Thread Jeff Mahoney
On 6/24/16 6:14 PM, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> One of the common complaints I've heard from new and experienced
> developers alike about the btrfs code is the ubiquity of
> struct btrfs_root.  There is one for every tree on disk and it's not
> always obvious which root is needed in a particular call path.  It can
> be frustrating to spend time figuring out which root is required only
> to discover that it's not actually used for anything other than
> getting the fs-global struct btrfs_fs_info.
> 
> The patchset contains several sections.
> 
> 1) The fsid trace event patchset I posted earlier; I can rebase without this
>but I'd prefer not to.
> 
> 2) Converting btrfs_test_opt and friends to use an fs_info.
> 
> 3) Converting tests to use an fs_info pointer whenever a root is used.
> 
> 4) Moving sectorsize and nodesize to fs_info and cleaning up the
>macros used to access them.

This change was a little overzealous in free-space-cache.c, which hit
block_group->sectorsize as well as root->sectorsize by accident.  While
the change is fine for general btrfs usage, it breaks the sanity tests
since dummy block groups now depend on a dummy fs_info as well.

-Jeff

-- 
Jeff Mahoney
SUSE Labs





Re: [PATCH 05/31] btrfs: tests, require fs_info for root

2016-07-07 Thread Jeff Mahoney
On 6/24/16 6:14 PM, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> This allows the upcoming patchset to push nodesize and sectorsize into
> fs_info.
> 
> Signed-off-by: Jeff Mahoney 
> ---
>  fs/btrfs/ctree.h   |  1 +
>  fs/btrfs/disk-io.c | 15 +++
>  fs/btrfs/disk-io.h |  3 ++-
>  fs/btrfs/tests/btrfs-tests.c   | 20 ---
>  fs/btrfs/tests/btrfs-tests.h   |  1 +
>  fs/btrfs/tests/extent-buffer-tests.c   | 23 +++--
>  fs/btrfs/tests/free-space-tests.c  | 14 +++
>  fs/btrfs/tests/free-space-tree-tests.c | 18 +++--
>  fs/btrfs/tests/inode-tests.c   | 46 
> ++
>  fs/btrfs/tests/qgroup-tests.c  | 23 +
>  10 files changed, 103 insertions(+), 61 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
> index 100d2ea..4781057 100644
> --- a/fs/btrfs/ctree.h
> +++ b/fs/btrfs/ctree.h
> @@ -117,6 +117,7 @@ static inline unsigned long btrfs_chunk_item_size(int num_stripes)
>  #define BTRFS_FS_STATE_REMOUNTING 1
>  #define BTRFS_FS_STATE_TRANS_ABORTED 2
>  #define BTRFS_FS_STATE_DEV_REPLACING 3
> +#define BTRFS_FS_STATE_DUMMY_FS_INFO 4
>  
>  #define BTRFS_BACKREF_REV_MAX 256
>  #define BTRFS_BACKREF_REV_SHIFT  56
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index 8f27127..418163d 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -1233,6 +1233,7 @@ static void __setup_root(u32 nodesize, u32 sectorsize, u32 stripesize,
>    struct btrfs_root *root, struct btrfs_fs_info *fs_info,
>    u64 objectid)
>  {
> + bool dummy = test_bit(BTRFS_FS_STATE_DUMMY_FS_INFO, &fs_info->fs_state);
>   root->node = NULL;
>   root->commit_root = NULL;
>   root->sectorsize = sectorsize;
> @@ -1287,14 +1288,14 @@ static void __setup_root(u32 nodesize, u32 sectorsize, u32 stripesize,
>   root->log_transid = 0;
>   root->log_transid_committed = -1;
>   root->last_log_commit = 0;
> - if (fs_info)
> + if (dummy)

This should be:
if (!dummy)

>   extent_io_tree_init(&root->dirty_log_pages,
>        fs_info->btree_inode->i_mapping);
>  
>   memset(&root->root_key, 0, sizeof(root->root_key));
>   memset(&root->root_item, 0, sizeof(root->root_item));
>   memset(&root->defrag_progress, 0, sizeof(root->defrag_progress));
> - if (fs_info)
> + if (dummy)

So should this.

-Jeff

-- 
Jeff Mahoney
SUSE Labs





Re: raid1 has failing disks, but smart is clear

2016-07-07 Thread Duncan
Corey Coughlin posted on Wed, 06 Jul 2016 23:40:30 -0700 as excerpted:

> Well yeah, if I was mounting all the disks to different mount points, I
> would definitely use UUIDs to get them mounted.  But I haven't seen any
> way to set up a "mkfs.btrfs" command to use UUID or anything else for
> individual drives.  Am I missing something?  I've been doing a lot of
> googling.

FWIW, you can use the /dev/disk/by-*/* symlinks (as normally setup by 
udev) to reference various devices.

Of course because the identifiers behind by-uuid and by-label are per-
filesystem, those will normally only identify the one device of a multi-
device filesystem, but the by-id links ID on device serials and partition 
number, and if you are using GPT partitioning, you have by-partuuid and 
(if you set them when setting up the partitions) by-partlabel as well.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: Frequent btrfs corruption on a USB flash drive

2016-07-07 Thread Chris Murphy
On Thu, Jul 7, 2016 at 4:38 PM, Andrew E. Mileski  wrote:
> On 2016-07-07 17:13, Francesco Turco wrote:
>>
>>
>> On 2016-07-07 23:11, Andrew E. Mileski wrote:
>>>
>>> How large is this USB flash device?
>>
>>
>> 64 GB.
>>
>
> I don't know if there is an official recommended minimum size for btrfs, but
> I would expect 64 GB to be okay.

In my similar case, it was a 16GiB stick, but the Btrfs on LUKS
partition was maybe 4GiB. Again I used -M and ran into zero problems
in ~6 months of almost daily usage. But not rsync. I was using it for
encrypted /home.




-- 
Chris Murphy


Re: Frequent btrfs corruption on a USB flash drive

2016-07-07 Thread Andrew E. Mileski

On 2016-07-07 17:13, Francesco Turco wrote:


On 2016-07-07 23:11, Andrew E. Mileski wrote:

How large is this USB flash device?


64 GB.



I don't know if there is an official recommended minimum size for btrfs, but I 
would expect 64 GB to be okay.


I've personally set my minimum recommendation for btrfs at 120 GB based on my 
experience with failures in various flash devices from 4 to 30 GB.


If you want to experiment, I have a theory that formatting single volumes with 
"-m single" can avoid a potential controller race in one specific situation, 
plus it helps to reduce the meta overhead on smaller devices.
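For example, such a format would look like this (just a sketch; the device name
is a placeholder):

  mkfs.btrfs -m single -d single /dev/sdX1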


Lastly, the last two USB issues I investigated were both fixed by replacing the 
cables.  Something to try if it is a cabled device.


~~
Andrew E. Mileski


Re: btrfs module does not load on sparc64

2016-07-07 Thread alexmcwhirter

On 2016-07-07 10:29, Anatoly Pugachev wrote:

Hi!

Compiled linux kernel (git version 4.7.0-rc6+) using my own kernel
config file, enabling :

CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y
CONFIG_BTRFS_DEBUG=y
CONFIG_BTRFS_ASSERT=y

and now I can't load btrfs module:

# modprobe btrfs
modprobe: ERROR: could not insert 'btrfs': Invalid argument


and in logs (and on console):

[1897399.942697] Btrfs loaded, crc32c=crc32c-generic, debug=on, 
assert=on

[1897400.024645] BTRFS: selftest: sectorsize: 8192  nodesize: 8192
[1897400.098089] BTRFS: selftest: Running btrfs free space cache tests
[1897400.175863] BTRFS: selftest: Running extent only tests
[1897400.241871] BTRFS: selftest: Running bitmap only tests
[1897400.307877] BTRFS: selftest: Running bitmap and extent tests
[1897400.380329] BTRFS: selftest: Running space stealing from bitmap to 
extent

[1897400.470517] BTRFS: selftest: Free space cache tests finished
[1897400.542875] BTRFS: selftest: Running extent buffer operation tests
[1897400.621710] BTRFS: selftest: Running btrfs_split_item tests
[1897400.692929] BTRFS: selftest: Running extent I/O tests
[1897400.757459] BTRFS: selftest: Running find delalloc tests
[1897401.082670] BTRFS: selftest: Running extent buffer bitmap tests
[1897401.161223] BTRFS: selftest: Setting straddling pages failed
[1897401.233661] BTRFS: selftest: Extent I/O tests finished


this is sparc64 sid/unstable debian:

# uname -a
Linux nvg5120 4.7.0-rc6+ #38 SMP Thu Jul 7 14:51:23 MSK 2016 sparc64 
GNU/Linux


# getconf PAGESIZE
8192

PS: using btrfs-progs from kdave repo,
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git ,
i'm able to create fs, but unable to mount:

root@nvg5120:/home/mator/btrfs-progs# ./mkfs.btrfs -f /dev/vg1/vol1
btrfs-progs v4.6.1
See http://btrfs.wiki.kernel.org for more information.

WARNING: failed to open /dev/btrfs-control, skipping device
registration: No such device
Label:  (null)
UUID:   ddd8a268-62e5-444c-9baf-6ba1b2d4448b
Node size:  16384
Sector size:        8192
Filesystem size:    15.00GiB
Block group profiles:
  Data:             single            8.00MiB
  Metadata:         DUP               1.01GiB
  System:           DUP              12.00MiB
SSD detected:       no
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1    15.00GiB  /dev/vg1/vol1


Can someone help please? Thanks.


On kernel 4.6.2 I can load the module on quite a few machines (e6k, 
v210, v215, m4000, t5120). However, I can only create / mount raid 0, 1, 
and 10 filesystems; raid 5 and 6 fail on both creation and mounting. 
Also, after creation / mounting of a raid 0/1/10 filesystem, btrfs will 
eventually corrupt itself after a large amount of data has been written.


Not the same issue you are experiencing, but it's worthwhile to note.


Re: Frequent btrfs corruption on a USB flash drive

2016-07-07 Thread Andrew E. Mileski

On 2016-07-07 09:49, Francesco Turco wrote:

I have a USB flash drive with an encrypted Btrfs filesystem where I
store daily backups. My problem is that this btrfs filesystem gets
corrupted very often, after a few days of usage. Usually I just reformat
it and move along, but this time I'd like to understand the root cause
of the problem and fix it.


How large is this USB flash device?

I've had issues with btrfs and small devices, where a 1 GB data chunk is 
relatively large.
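On devices that small, mixed block groups are one option, since data and
metadata then share the same chunks instead of separate 1 GiB allocations;
note it has to be chosen at mkfs time (device name is a placeholder):

  mkfs.btrfs --mixed /dev/sdX1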


~~
Andrew E. Mileski


Re: Frequent btrfs corruption on a USB flash drive

2016-07-07 Thread Francesco Turco

On 2016-07-07 23:11, Andrew E. Mileski wrote:
> How large is this USB flash device?

64 GB.

-- 
Website: http://www.fturco.net/
GPG key: 6712 2364 B2FE 30E1 4791 EB82 7BB1 1F53 29DE CD34


Re: 64-btrfs.rules and degraded boot

2016-07-07 Thread Chris Murphy
On Thu, Jul 7, 2016 at 1:59 PM, Austin S. Hemmelgarn
 wrote:

> D-Bus support needs to be optional, period.  Not everybody uses D-Bus (I
> have dozens of systems that get by just fine without it, and know hundreds
> of other people who do as well), and even people who do don't always use
> every tool needed (on the one system I manage that does have it, the only
> things I need it for are Avahi, ConsoleKit, udev, and NetworkManager, and
> I'm getting pretty close to the point of getting rid of NM and CK and
> re-implementing or forking Avahi).  You have to consider the fact that there
> are and always will be people who do not install a GUI on their system and
> want the absolute minimum of software installed.

That's fine, they can monitor kernel messages directly as their
notification system. I'm concerned with people who don't ever look at
kernel messages, you know, mortal users who have better things to do
with a computer than that. It's important for most anyone to not have
to wait for problems to manifest traumatically.


> Personally, I don't care what Fedora is doing, or even what GNOME (or any
> other DE for that matter, the only reason I use Xfce is because some things
> need a GUI (many of them unnecessarily), and that happens to be the DE I
> have the fewest complaints about) is doing.  The only reason that things
> like GNOME Disks and such exist is because they're trying to imitate Windows
> and OS X, which is all well and good for a desktop, but is absolute crap for
> many server and embedded environments (Microsoft finally realized this, and
> Windows Server 2012 added the ability to install without a full desktop,
> which actually means that they have _more_ options than a number of Linux
> distributions (yes you can rip out the desktop on many distros if you want,
> but that takes an insane amount of effort most of the time, not to mention
> storage space)).

I'm willing to bet dollars to donuts Xfce fans would love to know if
one of their rootfs mirrors is spewing read errors, while smartd
defers to the drive which says "hey no problems here". GNOME at least
does report certain critical smart errors, but that still leaves
something like 40% of drive failures happening without prior notice.


> Storaged also qualifies as something that _needs_ to be optional, especially
> because it appears to require systemd (and it falls into the same category
> as D-Bus of 'unnecessary bloat on many systems').  Adding a mandatory
> dependency on systemd _will_ split the community and severely piss off quite
> a few people (you will likely get some rather nasty looks from a number of
> senior kernel developers if you meet them in person).

I just want things to work for users, defined as people who would like
to stop depending on Windows and macOS for both server and desktop
usage. I don't really care about ideological issues outside of that
goal.

-- 
Chris Murphy


Re: 64-btrfs.rules and degraded boot

2016-07-07 Thread Goffredo Baroncelli
On 2016-07-07 20:58, Chris Murphy wrote:
> I get all kinds of damn strange behaviors in GNOME
> with Btrfs multiple device volumes: volume names appearing twice in
> the UI, unmounting one causes umount errors with the other.
> https://fedoraproject.org/wiki/Changes/Replace_UDisks2_by_Storaged
> http://storaged.org/

Unfortunately BTRFS is a mess from this point of view. Some btrfs subcommands 
query the system by inspecting the data stored on the disks directly; others use 
the ioctl(2) syscall, which reports what the kernel thinks. Unfortunately, due 
to caching, these two sources of information can be out of sync. 

Often, when a command's output doesn't convince me, I run "sync"; repeating 
the command then gives better output ("btrfs fi show" is one of these 
commands).
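In other words, something like:

  sync
  btrfs fi show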

BR
G.Baroncelli


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: 64-btrfs.rules and degraded boot

2016-07-07 Thread Austin S. Hemmelgarn

On 2016-07-07 14:58, Chris Murphy wrote:

On Thu, Jul 7, 2016 at 12:23 PM, Austin S. Hemmelgarn
 wrote:



Here's how I would picture the ideal situation:
* A device is processed by udev.  It detects that it's part of a BTRFS
array, updates blkid and whatever else in userspace with this info, and then
stops without telling the kernel.
* The kernel tracks devices until the filesystem they are part of is
unmounted, or a mount of that FS fails.
* When the user goes to mount a BTRFS filesystem, they use a mount
helper.
  1. This helper queries udev/blkid/whatever to see which devices are part
of an array.
  2. Once the helper determines which devices are potentially in the
requested FS, it checks the following things to ensure array integrity:
- Does each device report the same number of component devices for the
array?
- Does the reported number match the number of devices found?
- If a mount by UUID is requested, do all the labels match on each
device?
- If a mount by LABEL is requested, do all the UUID's match on each
device?
- If a mount by path is requested, do all the component devices reported
by that device have matching LABEL _and_ UUID?
- Is any of the devices found already in-use by another mount?
  4. If any of the above checks fails, and the user has not specified an
option to request a mount anyway, report the error and exit with non-zero
status _before_ even talking to the kernel.
  5. If only the second check fails (the check verifying the number of
devices found), and it fails because the number found is less than required
for a non-degraded mount, ignore that check if and only if the user
specified -o degraded.
  6. If any of the other checks fail, ignore them if and only if the user
asks to ignore that specific check.
  7. Otherwise, notify the kernel about the devices and call mount(2).
* The mount helper parses its own set of special options similar to the
bg/fg/retry options used by mount.nfs to allow for timeouts when mounting,
as well as asynchronous mounts in the background.
* btrfs device scan becomes a no-op
* btrfs device ready uses the above logic minus step 7 to determine if a
filesystem is probably ready.

Such a situation would probably eliminate or at least reduce most of our
current issues with device discovery, and provide much better error
reporting and general flexibility.


It might be useful to see where ZFS and LVM work and fail in this
regard. And also plan for D-Bus support to get state notifications up
to something like storaged or other such user space management tools.
Early on in Fedora there were many difficulties between systemd and
LVM, so avoiding whatever that was about would be nice.
D-Bus support needs to be optional, period.  Not everybody uses D-Bus (I 
have dozens of systems that get by just fine without it, and know 
hundreds of other people who do as well), and even people who do don't 
always use every tool needed (on the one system I manage that does have 
it, the only things I need it for are Avahi, ConsoleKit, udev, and 
NetworkManager, and I'm getting pretty close to the point of getting rid 
of NM and CK and re-implementing or forking Avahi).  You have to 
consider the fact that there are and always will be people who do not 
install a GUI on their system and want the absolute minimum of software 
installed.


Also, tangentially related, Fedora is replacing udisks2 with storaged.
Storaged already has a Btrfs plug-in so there should be better
awareness there. I get all kinds of damn strange behaviors in GNOME
with Btrfs multiple device volumes: volume names appearing twice in
the UI, unmounting one causes umount errors with the other.
https://fedoraproject.org/wiki/Changes/Replace_UDisks2_by_Storaged
http://storaged.org/
Personally, I don't care what Fedora is doing, or even what GNOME (or 
any other DE for that matter, the only reason I use Xfce is because some 
things need a GUI (many of them unnecessarily), and that happens to be 
the DE I have the fewest complaints about) is doing.  The only reason 
that things like GNOME Disks and such exist is because they're trying to 
imitate Windows and OS X, which is all well and good for a desktop, but 
is absolute crap for many server and embedded environments (Microsoft 
finally realized this, and Windows Server 2012 added the ability to 
install without a full desktop, which actually means that they have 
_more_ options than a number of Linux distributions (yes you can rip out 
the desktop on many distros if you want, but that takes an insane amount 
of effort most of the time, not to mention storage space)).


Storaged also qualifies as something that _needs_ to be optional, 
especially because it appears to require systemd (and it falls into the 
same category as D-Bus of 'unnecessary bloat on many systems').  Adding 
a mandatory dependency on systemd _will_ split the community and 
severely piss off quite a few people (you will likely get some 

Re: 64-btrfs.rules and degraded boot

2016-07-07 Thread Goffredo Baroncelli
On 2016-07-07 20:23, Austin S. Hemmelgarn wrote:
[...]
> FWIW, I've pretty much always been of the opinion that the device discovery 
> belongs in a mount helper.  The auto-discovery from udev (and more 
> importantly, how the kernel handles being told about a device) is much of the 
> reason that it's so inherently dangerous to do block level copies.  There's 
> obviously no way that can be changed now without breaking something, but 
> that's on the really short list of things that I personally feel are worth 
> breaking to fix a particularly dangerous pitfall.  The recent discovery that 
> device ready state is write-once when set just reinforces this in my opinion.
> 
> Here's how I would picture the ideal situation:
> * A device is processed by udev.  It detects that it's part of a BTRFS array, 
> updates blkid and whatever else in userspace with this info, and then stops 
> without telling the kernel.
> * The kernel tracks devices until the filesystem they are part of is 
> unmounted, or a mount of that FS fails.
> * When the user goes to mount a BTRFS filesystem, they use a mount helper.
>   1. This helper queries udev/blkid/whatever to see which devices are part of 
> an array.
>   2. Once the helper determines which devices are potentially in the 
> requested FS, it checks the following things to ensure array integrity:
> - Does each device report the same number of component devices for the 
> array?
> - Does the reported number match the number of devices found?
> - If a mount by UUID is requested, do all the labels match on each device?
> - If a mount by LABEL is requested, do all the UUID's match on each 
> device?
> - If a mount by path is requested, do all the component devices reported 
> by that device have matching LABEL _and_ UUID?
> - Is any of the devices found already in-use by another mount?
^ It is possible to mount the same device twice.

I'll add my favorite:
- is there a conflict of device UUIDs (i.e. two different disks with the 
same UUID)?

Anyway, point 2 has to run in a loop until a timeout: i.e. if systemd asks to 
mount a filesystem when the first device appears, wait for all the devices to appear.
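A rough sketch of that wait-with-timeout behaviour in shell (purely
illustrative; the device, mount point and timeout are placeholders, and
'btrfs device ready' stands in for the real readiness check):

  #!/bin/sh
  dev=/dev/sdb1; mnt=/mnt/data; timeout=30
  i=0
  while [ "$i" -lt "$timeout" ]; do
      # exit status 0 means the kernel has seen every device of this filesystem
      if btrfs device ready "$dev"; then
          mount "$dev" "$mnt"
          exit $?
      fi
      sleep 1
      i=$((i + 1))
  done
  # timed out: not all devices appeared, fall back to a degraded mount
  mount -o degraded "$dev" "$mnt"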

>   4. If any of the above checks fails, and the user has not specified an 
> option to request a mount anyway, report the error and exit with non-zero 
> status _before_ even talking to the kernel.
>   5. If only the second check fails (the check verifying the number of 
> devices found), and it fails because the number found is less than required 
> for a non-degraded mount, ignore that check if and only if the user specified 
> -o degraded.
>   6. If any of the other checks fail, ignore them if and only if the user 
> asks to ignore that specific check.
>   7. Otherwise, notify the kernel about the devices and call mount(2).
> * The mount helper parses its own set of special options similar to the 
> bg/fg/retry options used by mount.nfs to allow for timeouts when mounting, as 
> well as asynchronous mounts in the background.
> * btrfs device scan becomes a no-op
> * btrfs device ready uses the above logic minus step 7 to determine if a 
> filesystem is probably ready.
> 
> Such a situation would probably eliminate or at least reduce most of our 
> current issues with device discovery, and provide much better error reporting 
> and general flexibility.
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5


Re: A lot warnings in dmesg while running thunderbird

2016-07-07 Thread Chris Mason



On 07/07/2016 06:24 AM, Gabriel C wrote:

Hi,

while running thunderbird on linux 4.6.3 and 4.7.0-rc6 (didn't test
other versions)
I trigger the following :


I definitely thought we had this fixed in v4.7-rc.  Can you easily fsck 
this filesystem?  Something strange is going on.


-chris


Re: 64-btrfs.rules and degraded boot

2016-07-07 Thread Chris Murphy
More Btrfs udev issues, they involve making btrfs multiple device
volumes via 'btrfs dev add' which then causes problems at boot time.
https://bugzilla.opensuse.org/show_bug.cgi?id=912170
https://bugzilla.suse.com/show_bug.cgi?id=984516

The last part is amusing in that the proposed fix is going to end up
in btrfs-progs. And so that's why:

[chris@f24m ~]$ dnf provides /usr/lib/udev/rules.d/64-btrfs-dm.rules
Last metadata expiration check: 1:18:18 ago on Thu Jul  7 11:54:20 2016.
btrfs-progs-4.6-1.fc25.x86_64 : Userspace programs for btrfs
Repo: @System

[chris@f24m ~]$ dnf provides /usr/lib/udev/rules.d/64-btrfs.rules
Last metadata expiration check: 1:18:30 ago on Thu Jul  7 11:54:20 2016.
systemd-udev-229-8.fc24.x86_64 : Rule-based device node and kernel event manager
Repo: @System

Ha. So the btrfs rule is provided by udev upstream. The dm specific
Btrfs rule is provided by Btrfs upstream. That's not confusing at all.


Chris Murphy


Re: 64-btrfs.rules and degraded boot

2016-07-07 Thread Chris Murphy
On Thu, Jul 7, 2016 at 12:23 PM, Austin S. Hemmelgarn
 wrote:

>
> Here's how I would picture the ideal situation:
> * A device is processed by udev.  It detects that it's part of a BTRFS
> array, updates blkid and whatever else in userspace with this info, and then
> stops without telling the kernel.
> * The kernel tracks devices until the filesystem they are part of is
> unmounted, or a mount of that FS fails.
> * When the user goes to mount a BTRFS filesystem, they use a mount
> helper.
>   1. This helper queries udev/blkid/whatever to see which devices are part
> of an array.
>   2. Once the helper determines which devices are potentially in the
> requested FS, it checks the following things to ensure array integrity:
> - Does each device report the same number of component devices for the
> array?
> - Does the reported number match the number of devices found?
> - If a mount by UUID is requested, do all the labels match on each
> device?
> - If a mount by LABEL is requested, do all the UUID's match on each
> device?
> - If a mount by path is requested, do all the component devices reported
> by that device have matching LABEL _and_ UUID?
> - Is any of the devices found already in-use by another mount?
>   4. If any of the above checks fails, and the user has not specified an
> option to request a mount anyway, report the error and exit with non-zero
> status _before_ even talking to the kernel.
>   5. If only the second check fails (the check verifying the number of
> devices found), and it fails because the number found is less than required
> for a non-degraded mount, ignore that check if and only if the user
> specified -o degraded.
>   6. If any of the other checks fail, ignore them if and only if the user
> asks to ignore that specific check.
>   7. Otherwise, notify the kernel about the devices and call mount(2).
> * The mount helper parses its own set of special options similar to the
> bg/fg/retry options used by mount.nfs to allow for timeouts when mounting,
> as well as asynchronous mounts in the background.
> * btrfs device scan becomes a no-op
> * btrfs device ready uses the above logic minus step 7 to determine if a
> filesystem is probably ready.
>
> Such a situation would probably eliminate or at least reduce most of our
> current issues with device discovery, and provide much better error
> reporting and general flexibility.

It might be useful to see where ZFS and LVM work and fail in this
regard. And also plan for D-Bus support to get state notifications up
to something like storaged or other such user space management tools.
Early on in Fedora there were many difficulties between systemd and
LVM, so avoiding whatever that was about would be nice.

Also, tangentially related, Fedora is replacing udisks2 with storaged.
Storaged already has a Btrfs plug-in so there should be better
awareness there. I get all kinds of damn strange behaviors in GNOME
with Btrfs multiple device volumes: volume names appearing twice in
the UI, unmounting one causes umount errors with the other.
https://fedoraproject.org/wiki/Changes/Replace_UDisks2_by_Storaged
http://storaged.org/

-- 
Chris Murphy


Re: Frequent btrfs corruption on a USB flash drive

2016-07-07 Thread Francesco Turco
On 2016-07-07 20:25, Chris Murphy wrote:
> On Thu, Jul 7, 2016 at 8:55 AM, Francesco Turco  wrote:
>> Perhaps I
>> should try to rule out a hardware problem by filling my USB flash drive
>> with a large random file and then checking if its SHA-1 checksum
>> corresponds to the original copy on the hard disk. But first I probably
>> should backup the current Btrfs filesystem with the dd command. Can I
>> proceed?
> 
> https://btrfs.wiki.kernel.org/index.php/Gotchas

Thank you for the link, I didn't know that using LVM snapshots or
mounting dd copies can create problems! That could explain the reason
for some of the problems I had in the past.

-- 
Website: http://www.fturco.net/
GPG key: 6712 2364 B2FE 30E1 4791 EB82 7BB1 1F53 29DE CD34


Re: rollback to a snapshot and delete old top volume - missing of "@"

2016-07-07 Thread Henk Slager
On Thu, Jul 7, 2016 at 7:40 PM, Chris Murphy  wrote:
> On Thu, Jul 7, 2016 at 10:01 AM, Henk Slager  wrote:
>
>> What the latest debian likes as naming convention I don't know, but in
>> openSuSE @ is a directory in the toplevel volume (ID=5 or ID=0 as
>> alias) and that directory contains subvolumes.

Sorry, I mixed up the latest openSUSE and my own adaptations to older installations.

> No, opensuse doesn't use @ at all. They use a subvolume called
> .snapshots to contain snapper snapshots.

On current fresh install "openSUSE Tumbleweed (20160703) (x86_64)" you get this:

# btrfs sub list /
ID 257 gen 24369 top level 5 path @
ID 258 gen 24369 top level 257 path @/.snapshots
ID 259 gen 24369 top level 258 path @/.snapshots/1/snapshot
ID 265 gen 25404 top level 257 path @/tmp
ID 267 gen 24369 top level 257 path @/var/cache
ID 268 gen 20608 top level 257 path @/var/crash
ID 269 gen 20608 top level 257 path @/var/lib/libvirt/images
ID 270 gen 3903 top level 257 path @/var/lib/mailman
ID 271 gen 2996 top level 257 path @/var/lib/mariadb
ID 272 gen 3904 top level 257 path @/var/lib/mysql
ID 273 gen 3903 top level 257 path @/var/lib/named
ID 274 gen 8 top level 257 path @/var/lib/pgsql
ID 275 gen 25404 top level 257 path @/var/log
ID 276 gen 20611 top level 257 path @/var/opt
ID 277 gen 25404 top level 257 path @/var/spool
ID 278 gen 24369 top level 257 path @/var/tmp
ID 300 gen 10382 top level 258 path @/.snapshots/15/snapshot
[..]

@ is the only thing in the toplevel

I have changed it a bit for this particular PC, so that more is in one subvol.
Just after a default install, the subvol with ID 259 is made the default and rw.

I had also updated my older linux installs a bit like this, but with @
as a dir, not a subvol, so that at least I can easily swap the
'latestroofs' subvol with something else. My interpretation of the
OP's report was that he basically wants something like that too.

> On a system using snapper, its snapshots should probably be deleted
> via snapper so it's aware of the state change.

You can do that, but also with btrfs sub del in re-organisation
actions like the one described here. If you delete the .xml files in the
.snapshots subvol, it starts counting from 1 again. Changing the
latest .xml file can make it start counting from some higher number, if
keeping many months of history matters for example.
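For example (the snapshot number and snapper config name are just placeholders):

  snapper -c root delete 15                        # keeps snapper's bookkeeping consistent
  btrfs subvolume delete /.snapshots/15/snapshot   # or directly, bypassing snapper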

> And very clearly from the OP's output from 'btrfs sub list' there are
> no subvolumes with @ in the path, so there is no subvolume @, nor are
> there any subvolumes contained in a directory @.
>
> Assuming the posted output from btrfs sub list is the complete output,
> .snapshots is a directory and there are three subvolumes in it. I
> suspect the OP is unfamiliar with snapper conventions and is trying to
> delete a snapshot outside of snapper, and is used to some other
> (Debian or Ubuntu) convention where snapshots somehow relate to @,
> which is a mimicking of how ZFS does things.
>
> Anyway the reason why the command fails is stated in the error
> message. The system appears to be installed in the top level of the
> file system (subvolid=5), and that can't be deleted. First it's the
> immutable first subvolume of a Btrfs file system, and second it's
> populated with other subvolumes which would inhibit its removal even
> if it weren't the top level subvolume.
>
> What can be done is delete the directories in the top level, retaining
> the subvolumes that are there.

Indeed, yes, as a last cleanup step.


Re: Frequent btrfs corruption on a USB flash drive

2016-07-07 Thread Chris Murphy
On Thu, Jul 7, 2016 at 8:55 AM, Francesco Turco  wrote:

> I'm not sure. Commands don't fail explicitely when I use ext4, but I
> agree with you that I may get corruption silently nonetheless.

Use XFS v5 format which is the default in xfsprogs 3.2.3 and later. It
at least checksums metadata.
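For instance (device name is a placeholder; crc=1 is already the default with
current xfsprogs):

  mkfs.xfs -m crc=1 /dev/sdX1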

> Perhaps I
> should try to rule out a hardware problem by filling my USB flash drive
> with a large random file and then checking if its SHA-1 checksum
> corresponds to the original copy on the hard disk. But first I probably
> should backup the current Btrfs filesystem with the dd command. Can I
> proceed?

https://btrfs.wiki.kernel.org/index.php/Gotchas


>> Just to clarify, you're using BTRFS on top of disk encryption (LUKS? Or
>> is it just raw encryption, or even something completely different?), on
>> a USB flash drive (not a USB to SATA adapter with an SSD or HDD in it),
>> correct?
>
> I'm using a btrfs filesystem on a GUID partition encrypted with LUKS.
> It's a Kingston USB flash drive connected directly to my desktop machine
> via USB. It's definitively not a SSD or a HDD, and I'm not using any
> adapter.

First definitely check to make sure it's not fake. It's a well known
brand and there's a lot of incentive to make fake Kingston devices. I
have a Kingston DTR500 and have used it in the same use case you have,
Btrfs on LUKS, for maybe 6 months with no corruptions. In my case I
formatted with -M (mixed bg), and it was with kernels older than 4.x,
but otherwise sounds the same. Granted, individual units of the same
model can have big differences let alone between models. But if it's a
Btrfs bug, it might be a regression.

I wonder if this might be a use case for one of the integrity check
mount options? It slows things down a lot but the extra checking might
help pinpoint at least the moment something bad is happening.
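Something like this, assuming the kernel was built with
CONFIG_BTRFS_FS_CHECK_INTEGRITY=y (device and mount point are placeholders;
it is slow and meant for debugging only):

  mount -o check_int_data /dev/mapper/usbcrypt /mnt/usb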


-- 
Chris Murphy


Re: 64-btrfs.rules and degraded boot

2016-07-07 Thread Austin S. Hemmelgarn

On 2016-07-07 12:52, Goffredo Baroncelli wrote:

On 2016-07-06 14:48, Austin S. Hemmelgarn wrote:

On 2016-07-06 08:39, Andrei Borzenkov wrote:

[]


To be entirely honest, if it were me, I'd want systemd to
fsck off.  If the kernel mount(2) call succeeds, then the
filesystem was ready enough to mount, and if it doesn't, then
it wasn't, end of story.


How should user space know when to try the mount? What is user space
supposed to do during boot if the mount fails? Do you suggest

while true; do mount /dev/foo && exit 0; done

as part of startup sequence? And note that nowhere is systemd
involved so far.

Nowhere there, except if you have a filesystem in fstab (or a
mount unit, which I hate for other reasons that I will not go
into right now), and you mount it and systemd thinks the device
isn't ready, it unmounts it _immediately_.  In the case of boot,
it's because of systemd thinking the device isn't ready that you
can't mount degraded with a missing device.  In the case of the
root filesystem at least, the initramfs is expected to handle
this, and most of them do poll in some way, or have other methods
of determining this.  I occasionally have issues with it with
dracut without systemd, but that's due to a separate bug there
involving the device mapper.



How does this systemd bashing answer my question - how does user space know
when it can call mount at startup?

You mentioned that systemd wasn't involved, which is patently false
if it's being used as your init system, and I was admittedly mostly
responding to that.

Now, to answer the primary question which I forgot to answer:
Userspace doesn't.  Systemd doesn't either but assumes it does and
checks in a flawed way.  Dracut's polling loop assumes it does but
sometimes fails in a different way.  There is no way other than
calling mount right now to know for sure if the mount will succeed,
and that actually applies to a certain degree to any filesystem
(because any number of things that are outside of even the kernel's
control might happen while trying to mount the device).


I think that there is no simple answer, and the answer may depend on the context.
In the past, I made a prototype of a mount helper for btrfs [1]; the aim was to:

1) get rid of the current btrfs volume discovery (udev triggering btrfs dev 
scan), which has a lot of strange conditions (what happens when a device 
disappears?)
2) create a place where we develop and define strategies to handle all (or 
most) of the cases of [partial] failure of a [multi-device] btrfs filesystem

By default, my mount.btrfs waited for the devices needed by a filesystem, and 
mounted in degraded mode if not all devices appeared (depending on a switch); 
if the timeout is reached, an error is returned.

It doesn't need any special udev rule, because it performs discovery of the 
devices using libuuid. I think that mounting a filesystem and handling all the 
possible cases by relying on udev and the syntax of the udev rules is more a 
problem than a solution. Given that udev and the udev rules are developed in a 
different project, the difficulties only increase.

I think that BTRFS, for its complexity and its peculiarities, needs a dedicated 
tool like a mount helper.

My mount.btrfs is not able to solve all the problems, but it might be a start for 
handling these issues.
FWIW, I've pretty much always been of the opinion that the device 
discovery belongs in a mount helper.  The auto-discovery from udev (and 
more importantly, how the kernel handles being told about a device) is 
much of the reason that it's so inherently dangerous to do block level 
copies.  There's obviously no way that can be changed now without 
breaking something, but that's on the really short list of things that I 
personally feel are worth breaking to fix a particularly dangerous 
pitfall.  The recent discovery that device ready state is write-once 
when set just reinforces this in my opinion.


Here's how I would picture the ideal situation:
* A device is processed by udev.  It detects that it's part of a BTRFS 
array, updates blkid and whatever else in userspace with this info, and 
then stops without telling the kernel.
* The kernel tracks devices until the filesystem they are part of is 
unmounted, or a mount of that FS fails.
* When the user goes to mount a BTRFS filesystem, they use a mount 
helper.
  1. This helper queries udev/blkid/whatever to see which devices are 
part of an array.
  2. Once the helper determines which devices are potentially in the 
requested FS, it checks the following things to ensure array integrity:
- Does each device report the same number of component devices for 
the array?

- Does the reported number match the number of devices found?
- If a mount by UUID is requested, do all the labels match on each 
device?
- If a mount by LABEL is requested, do all the UUID's match on each 
device?
    - If a mount by path is requested, do all the component devices 
reported by that device have matching LABEL _and_ UUID?

errors with linux-next-20160701

2016-07-07 Thread Laszlo Fiat
I have a simple btrfs filesystem on a single device. It worked well so far.

Recently I compiled a new kernel, linux-next-20160701; with this new
kernel I get warnings and errors in the logs. But btrfs scrub
completes with 0 errors, and if I boot back to the older
linux-next-20160527 kernel, there are no error messages in the logs
when using it or running scrub. The filesystem is mountable
read-write, but I am worried about the warnings and errors. The
checksum warnings always come up with new numbers, never the same.

$ uname -a
Linux debian 4.7.0-rc5-next-20160701+ #46 SMP Sun Jul 3 15:29:10 CEST
2016 x86_64 GNU/Linux

$ btrfs --version
btrfs-progs v4.5.2

# btrfs fi show
Label: none  uuid: d6cab9ca-5e89-4d8c-b55b-c700a6096d37
Total devices 1 FS bytes used 55.31GiB
devid1 size 119.23GiB used 62.07GiB path /dev/mapper/home

# btrfs fi df /home
Data, single: total=59.01GiB, used=54.86GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=1.50GiB, used=462.56MiB
GlobalReserve, single: total=88.42MiB, used=0.00B

# dmesg | grep Btrfs
[4.530960] Btrfs loaded, crc32c=crc32c-intel

# dmesg | grep BTRFS
[4.530968] BTRFS: selftest: sectorsize: 4096  nodesize: 4096
[4.530973] BTRFS: selftest: sectorsize: 4096  nodesize: 8192
[4.530978] BTRFS: selftest: sectorsize: 4096  nodesize: 16384
[4.530982] BTRFS: selftest: sectorsize: 4096  nodesize: 32768
[4.530986] BTRFS: selftest: sectorsize: 4096  nodesize: 65536
[   52.114886] BTRFS: device fsid d6cab9ca-5e89-4d8c-b55b-c700a6096d37
devid 1 transid 29825 /dev/dm-0
[   60.598254] BTRFS info (device dm-0): use lzo compression
[   60.598266] BTRFS info (device dm-0): disk space caching is enabled
[   60.598273] BTRFS info (device dm-0): has skinny extents
[   60.797475] BTRFS warning (device dm-0): dm-0 checksum verify
failed on 343146496 wanted D962670F found 32292342 level 0
[   60.850008] BTRFS info (device dm-0): detected SSD devices, enabling SSD mode
[  165.491150] BTRFS error (device dm-0): bad tree block start
8242807833012638730 658522112

# grep "BTRFS error" /var/log/syslog.1
Jul  6 19:35:23 debian kernel: [ 2712.823929] BTRFS error (device
dm-0): bad tree block start 10283429131165574676 662503424
Jul  6 19:35:23 debian kernel: [ 2712.850468] BTRFS error (device
dm-0): bad tree block start 14801127411347629381 663502848
Jul  6 19:35:23 debian kernel: [ 2712.888038] BTRFS error (device
dm-0): bad tree block start 9855282569545798023 664141824
Jul  6 19:35:23 debian kernel: [ 2712.888491] BTRFS error (device
dm-0): bad tree block start 12220505751590977444 664207360
Jul  6 19:35:59 debian kernel: [ 2748.728044] BTRFS error (device
dm-0): bad tree block start 16324583772582058537 665927680
Jul  6 19:37:31 debian kernel: [ 2840.465648] BTRFS error (device
dm-0): bad tree block start 13618790082605902229 309936128
Jul  6 19:37:31 debian kernel: [ 2840.465672] BTRFS error (device
dm-0): bad tree block start 9260130888975445835 309870592
Jul  6 19:37:31 debian kernel: [ 2840.466526] BTRFS error (device
dm-0): bad tree block start 17351834078110360434 309968896
Jul  6 19:37:31 debian kernel: [ 2840.466579] BTRFS error (device
dm-0): bad tree block start 3476538019772833052 309985280
Jul  6 19:37:31 debian kernel: [ 2840.509021] BTRFS error (device
dm-0): bad tree block start 1881224518785478735 327696384
Jul  6 19:37:31 debian kernel: [ 2840.509085] BTRFS error (device
dm-0): bad tree block start 14212257183956925500 327712768
Jul  6 19:37:31 debian kernel: [ 2840.533393] BTRFS error (device
dm-0): bad tree block start 13574459615317154064 331268096
Jul  6 19:37:53 debian kernel: [ 2862.852848] BTRFS error (device
dm-0): bad tree block start 13618790082605902229 309936128
Jul  7 19:52:05 debian kernel: [  165.491150] BTRFS error (device
dm-0): bad tree block start 8242807833012638730 658522112

# grep "BTRFS warning" /var/log/syslog.1
Jul  6 19:35:24 debian kernel: [ 2712.926597] BTRFS warning (device
dm-0): dm-0 checksum verify failed on 665927680 wanted 2E7409B found
44E60A96 level 0
Jul  6 19:35:24 debian kernel: [ 2713.001532] BTRFS warning (device
dm-0): dm-0 checksum verify failed on 671416320 wanted D7E63A7D found
17064A1 level 0
Jul  6 19:37:31 debian kernel: [ 2840.596783] BTRFS warning (device
dm-0): dm-0 checksum verify failed on 373882880 wanted 55479736 found
3C3D6BD6 level 0
Jul  6 19:37:31 debian kernel: [ 2840.785562] BTRFS warning (device
dm-0): dm-0 checksum verify failed on 668319744 wanted A6C233A1 found
625A2B9D level 0
Jul  7 19:50:21 debian kernel: [   60.797475] BTRFS warning (device
dm-0): dm-0 checksum verify failed on 343146496 wanted D962670F found
32292342 level 0

# zgrep "BTRFS warning" /var/log/syslog.*.gz
/var/log/syslog.2.gz:Jul  5 19:40:23 debian kernel: [ 3560.469370]
BTRFS warning (device dm-0): dm-0 checksum verify failed on 373145600
wanted FDF307ED found F56F3DA6 level 0
/var/log/syslog.2.gz:Jul  5 19:42:10 debian kernel: [ 3668.114486]
BTRFS warning (device dm-0): dm-0 checksum 

Re: Frequent btrfs corruption on a USB flash drive

2016-07-07 Thread Chris Murphy
On Thu, Jul 7, 2016 at 7:49 AM, Francesco Turco  wrote:

> $ btrfs filesystem show
> /run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3
> $

Try it with sudo. I think it's a bug that 'btrfs fi show' returns
silently for non-root. It should produce an error that root privileges
are needed, or it should work for unprivileged users.




> Btrfs-check reports many errors. I attached the output to this e-mail
> message.
>
> Output from dmesg:
>
> $ dmesg | tail
> [18756.159963] BTRFS error (device dm-4): bad tree block start
> 6592115285688248773 35323904

The problem happened before this, so I think we need the entire dmesg.



> I checked this USB flash drive with badblocks in non-destructive
> read-write mode. No errors.

Use F3 to test flash:
http://oss.digirati.com.br/f3/

Some distros have it in their repo; Fedora does. It's a bit
unintuitive: what you need to do is use the write binary to write the
test files to the stick (this is destructive) and then use the read
binary to read back the written files.
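A typical run looks like this (the mount point is a placeholder):

  f3write /run/media/user/stick     # fills free space with test files (the destructive step)
  f3read /run/media/user/stick      # reads them back and reports any corrupted sectors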

Read more, and also includes a much faster alternative for GNOME:
https://blogs.gnome.org/hughsie/2015/01/28/detecting-fake-flash/




-- 
Chris Murphy


Re: rollback to a snapshot and delete old top volume - missing of "@"

2016-07-07 Thread Chris Murphy
On Thu, Jul 7, 2016 at 10:01 AM, Henk Slager  wrote:

> What the latest debian likes as naming convention I dont know, but in
> openSuSE @ is a directory in the toplevel volume (ID=5 or ID=0 as
> alias) and that directory contains subvolumes.

No, opensuse doesn't use @ at all. They use a subvolume called
.snapshots to contain snapper snapshots.

On a system using snapper, its snapshots should probably be deleted
via snapper so it's aware of the state change.
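For example, roughly (the config name and snapshot number here are hypothetical):

snapper -c root list          # see which snapshots snapper knows about
snapper -c root delete 42     # remove one through snapper so its metadata stays consistent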

And very clearly from the OP's output from 'btrfs sub list' there are
no subvolumes with @ in the path, so there is no subvolume @, nor are
there any subvolumes contained in a directory @.

Assuming the posted output from btrfs sub list is the complete output,
.snapshots is a directory and there are three subvolumes in it. I
suspect the OP is unfamiliar with snapper conventions and is trying to
delete a snapshot outside of snapper, and is used to some other
(Debian or Ubuntu) convention where snapshots somehow relate to @,
which mimics how ZFS does things.

Anyway the reason why the command fails is stated in the error
message. The system appears to be installed in the top level of the
file system (subvolid=5), and that can't be deleted. First it's the
immutable first subvolume of a Btrfs file system, and second it's
populated with other subvolumes which would inhibit its removal even
if it weren't the top level subvolume.

What can be done is delete the directories in the top level, retaining
the subvolumes that are there.




-- 
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 64-btrfs.rules and degraded boot

2016-07-07 Thread Goffredo Baroncelli
On 2016-07-06 20:57, Chris Murphy wrote:
[...]
> 
> Seems like we need more granularity by btrfs ioctl for device ready,
> e.g. some way to indicate:
> 
> 0 all devices ready
> 1 devices not ready (don't even try to mount)
> 2 minimum devices ready (degraded mount possible)
> 
> 
> Btrfs multiple device single and raid0 only return code 0 or 1. Where
> raid 1, 5, 6 could return code 2. The systemd default policy for code
> 2 could be to wait some amount of time to see if state goes to 0. At
> the timeout, try to mount anyway. If rootflags=degraded, it mounts. If
> not, mount fails, and we get a dracut prompt.
> 

Note that to return 2, you have to scan all the VGs (chunks) to check whether all
the involved devices are available: e.g. a filesystem composed of 5 disks may
have a RAID5 VG with only 3 disks used for data, and a RAID1 VG for metadata
on the other two disks.

Doing this check for each disk as it appears would, I fear, be too expensive.


> That's better behavior than now.
> 
>> Equivalent of this rule is required under systemd and desired in
>> general to avoid polling. On systemd list I outlined possible
>> alternative implementation as systemd service instead of really
>> hackish udev rule.
> 
> I'll go read it there. Thanks.
> 
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 64-btrfs.rules and degraded boot

2016-07-07 Thread Goffredo Baroncelli
On 2016-07-06 22:00, Chris Murphy wrote:
> On Wed, Jul 6, 2016 at 1:17 PM, Austin S. Hemmelgarn
>  wrote:
> 
>> In bash or most other POSIX compliant shells, you can run this:
>> echo $?
>> to get the return code of the previous command.
>>
>> In your case though, it may be reporting the FS ready because it had already
>> seen all the devices, IIUC, the flag that checks is only set once, and never
>> unset, which is not a good design in this case.
> 
> Oh dear.
> 
> [root@f24s ~]# lvs
>   LV VG Attr   LSize  Pool   Origin Data%  Meta%  Move
> Log Cpy%Sync Convert
>   1  VG Vwi---tz-- 50.00g thintastic
>   2  VG Vwi---tz-- 50.00g thintastic
>   3  VG Vwi-a-tz-- 50.00g thintastic2.54
>   thintastic VG twi-aotz-- 90.00g   5.05   2.92
> [root@f24s ~]# btrfs dev scan
> Scanning for Btrfs filesystems
> [root@f24s ~]# echo $?
> 0
> [root@f24s ~]# btrfs device ready /dev/mapper/VG-3
> [root@f24s ~]# echo $?
> 0
> [root@f24s ~]# btrfs fi show
> warning, device 2 is missing
> Label: none  uuid: 96240fd9-ea76-47e7-8cf4-05d3570ccfd7
> Total devices 3 FS bytes used 2.26GiB
> devid3 size 50.00GiB used 3.01GiB path /dev/mapper/VG-3
> *** Some devices missing
> 
> 
> Cute, device 1 is also missing but that's not mentioned. In any case,
> the device is still ready even after a dev scan. I guess this isn't
> exactly testable all that easily unless I reboot.

IIRC, a device "registered" by "btrfs dev scan" is never removed from the
available devices. This means that if you remove a valid device after it has
already been scanned, "btrfs dev ready" still returns OK until a reboot happens.

From your email, it is not clear if you rebooted (or rmmod-ded btrfs) after
you removed the devices.
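One way to actually exercise that on a test box, using the device from your example (nothing btrfs may stay mounted for this):

umount -a -t btrfs                                # unmount every btrfs filesystem first
rmmod btrfs && modprobe btrfs                     # reloading the module forgets all scanned devices
btrfs device ready /dev/mapper/VG-3 ; echo $?     # now only reflects devices scanned since the reload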

Only my 2¢...

BR
G.Baroncelli
> 
> 
> 


-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 64-btrfs.rules and degraded boot

2016-07-07 Thread Goffredo Baroncelli
On 2016-07-06 14:48, Austin S. Hemmelgarn wrote:
> On 2016-07-06 08:39, Andrei Borzenkov wrote:
[]
> 
> To be entirely honest, if it were me, I'd want systemd to
> fsck off.  If the kernel mount(2) call succeeds, then the
> filesystem was ready enough to mount, and if it doesn't, then
> it wasn't, end of story.
 
 How should user space know when to try mount? What user space
 is supposed to do during boot if mount fails? Do you suggest
 
 while true; do mount /dev/foo && exit 0 done
 
 as part of startup sequence? And note that nowhere is systemd
 involved so far.
>>> Nowhere there, except if you have a filesystem in fstab (or a
>>> mount unit, which I hate for other reasons that I will not go
>>> into right now), and you mount it and systemd thinks the device
>>> isn't ready, it unmounts it _immediately_.  In the case of boot,
>>> it's because of systemd thinking the device isn't ready that you
>>> can't mount degraded with a missing device.  In the case of the
>>> root filesystem at least, the initramfs is expected to handle
>>> this, and most of them do poll in some way, or have other methods
>>> of determining this.  I occasionally have issues with it with
>>> dracut without systemd, but that's due to a separate bug there
>>> involving the device mapper.
>>> 
>> 
>> How this systemd bashing answers my question - how user space knows
>> when it can call mount at startup?
> You mentioned that systemd wasn't involved, which is patently false
> if it's being used as your init system, and I was admittedly mostly
> responding to that.
> 
> Now, to answer the primary question which I forgot to answer: 
> Userspace doesn't.  Systemd doesn't either but assumes it does and
> checks in a flawed way.  Dracut's polling loop assumes it does but
> sometimes fails in a different way.  There is no way other than
> calling mount right now to know for sure if the mount will succeed,
> and that actually applies to a certain degree to any filesystem
> (because any number of things that are outside of even the kernel's
> control might happen while trying to mount the device.

I think there is no simple answer, and the answer may depend on context.
In the past, I made a prototype of a mount helper for btrfs [1]; the aims were to:

1) get rid of the current btrfs volume discovery (udev triggering btrfs dev
scan), which has a lot of strange corner cases (what happens when a device
disappears?)
2) create a place where we develop and define strategies to handle all (or
most) of the cases of [partial] failure of a [multi-device] btrfs filesystem

By default, my mount.btrfs waited for the devices needed by a filesystem, and
mounted in degraded mode if not all devices had appeared (depending on a switch);
if a timeout was reached, an error was returned.
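A very rough shell sketch of that strategy (the device, mount point and timeout are made up here; the real helper is C code using libuuid):

dev=/dev/sdb1
for i in $(seq 30); do
    btrfs device ready "$dev" && exec mount "$dev" /mnt   # all devices present: normal mount
    sleep 1
done
mount -o degraded "$dev" /mnt || exit 1                   # timeout reached: try degraded, else fail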

It doesn't need any special udev rule, because it performs device discovery
itself using libuuid. I think that mounting a filesystem and handling all the
possible failure cases by relying on udev and the syntax of its rules is more a
problem than a solution. Add to that the fact that udev and the udev rules are
developed in a different project, and the difficulties increase.

I think that BTRFS, given its complexity and peculiarities, needs a dedicated
tool like a mount helper.

My mount.btrfs is not able to solve all the problems, but it might be a start
for handling these issues.

BR
G.Baroncelli


[1] http://www.spinics.net/lists/linux-btrfs/msg28764.html



-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 64-btrfs.rules and degraded boot

2016-07-07 Thread Goffredo Baroncelli
On 2016-07-05 20:53, Chris Murphy wrote:
> I am kinda confused about this "btrfs ready $devnode" portion. Isn't
> it "btrfs device ready $devnode" if this is based on user space tools?

systemd implemented this as an internal command

-- 
gpg @keyserver.linux.it: Goffredo Baroncelli 
Key fingerprint BBF5 1610 0B64 DAC6 5F7D  17B2 0EDA 9B37 8B82 E0B5
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Out of space error even though there's 100 GB unused?

2016-07-07 Thread Henk Slager
On Thu, Jul 7, 2016 at 5:17 PM, Stanislaw Kaminski
 wrote:
> Hi Chris, Alex, Hugo,
>
> Running now: Linux archb3 4.6.2-1-ARCH #1 PREEMPT Mon Jun 13 02:11:34
> MDT 2016 armv5tel GNU/Linux
>
> Seems to be working fine. I started a defrag, and it seems I'm getting
> my space back:
> $ sudo btrfs fi usage /home
> Overall:
> Device size:   1.81TiB
> Device allocated:  1.73TiB
> Device unallocated:   80.89GiB
> Device missing:  0.00B
> Used:  1.65TiB
> Free (estimated):159.63GiB  (min: 119.19GiB)
> Data ratio:   1.00
> Metadata ratio:   2.00
> Global reserve:  512.00MiB  (used: 240.00KiB)
>
> Data,single: Size:1.72TiB, Used:1.65TiB
>/dev/sda4   1.72TiB
>
> Metadata,DUP: Size:3.50GiB, Used:2.16GiB
>/dev/sda4   7.00GiB
>
> System,DUP: Size:32.00MiB, Used:224.00KiB
>/dev/sda4  64.00MiB
>
> Unallocated:
>/dev/sda4  80.89GiB
>
> I deleted some unfinished torrent, ~10 GB in size, but as you can see,
> "Free space" has grown by 60 GB (re-checked now and it's 1 GB more now
> - so definitely caused by defrag).
>
> What has changed between 4.6.2 and 4.6.3?

https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/diff/?id=v4.6.3=v4.6.2=2

I see no change for Btrfs
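If you have the stable tree cloned, a quick local cross-check (assuming the v4.6.x tags are present):

git log --oneline v4.6.2..v4.6.3 -- fs/btrfs    # empty output = no btrfs commits between the two releases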

> Cheers,
> Stan
>
> 2016-07-07 12:28 GMT+02:00 Stanislaw Kaminski :
>> Too early report, the issue is back. Back to testing
>>
>> 2016-07-07 12:18 GMT+02:00 Stanislaw Kaminski :
>>> Hi all,
>>> I downgraded to 4.4.1-1 - all fine, 4.5.5.-1 - also fine, then got
>>> back to 4.6.3-2 - and it's still fine. Apparently running under
>>> different kernel somehow fixed the glitch (as far as I can test...).
>>>
>>> That leaves me with the other question: before issues, I 1.6 TiB was
>>> used, now all the tools report 1.7 TiB issued (except for btrfs fs du
>>> /home, this reports 1.6 TiB). How is that possible?
>>>
>>> Cheers,
>>> Stan
>>>
>>> 2016-07-06 19:42 GMT+02:00 Chris Murphy :
 On Wed, Jul 6, 2016 at 3:55 AM, Stanislaw Kaminski
  wrote:

> Device unallocated:   97.89GiB

 There should be no problem creating any type of block group from this
 much space. It's a bug.

 I would try regression testing. Kernel 4.5.7 has some changes that may
 or may not relate to this (they should only relate when there is no
 unallocated space left) so you could try 4.5.6 and 4.5.7. And also
 4.4.14.

 But also the kernel messages are important. There is this obscure
 enospc with error -28, so either with or without enospc_debug mount
 option is useful to try in 4.6.3 (I think it's less useful in older
 kernels).

 But do try nospace_cache first. If that works, you could then mount
 with clear_cache one time and see if that provides an enduring fix. It
 can take some time to rebuild the cache after clear_cache is used.



 --
 Chris Murphy
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: fstrim problem/bug

2016-07-07 Thread Henk Slager
On Thu, Jul 7, 2016 at 11:46 AM, M G Berberich  wrote:
> Hello,
>
> On a filesystem with 40 G free space and 54 G used, ‘fstrim -v’ gave
> this result:
>
> # fstrim -v /
> /: 0 B (0 bytes) trimmed
>
> After running balance it gave a more sensible
>
> # fstrim -v /
> /: 37.3 GiB (40007368704 bytes) trimmed
>
> As far as I understand, fstrim should report any unused block to the
> disk, so its controller can reuse these blocks. I expected ’fstrim -v’
> to report about 40 G trimmed. The fact, that after balance fstrim
> reports a sensible amount of trimmed bytes leads to the conclusion,
> that fstrim on btrfs does not report unused blocks to the disk (as it
> should), but only the blocks of unused chunks. As the fstrim-command
> only does an ‘ioctl(fd, FITRIM, &range)’ this seems to be a bug in the
> fstrim kernel-code.
> In the field this means, that without regularly running balance,
> fstrim does not work on btrfs.

hmm, yes indeed I see this as well:

# btrfs fi us /
Overall:
   Device size:  55.00GiB
   Device allocated: 46.55GiB
   Device unallocated:8.45GiB
   Device missing:  0.00B
   Used: 39.64GiB
   Free (estimated): 13.96GiB  (min: 13.96GiB)
   Data ratio:   1.00
   Metadata ratio:   1.00
   Global reserve:  480.00MiB  (used: 0.00B)

Data,single: Size:43.77GiB, Used:38.25GiB
  /dev/sda1  43.77GiB

Metadata,single: Size:2.75GiB, Used:1.39GiB
  /dev/sda1   2.75GiB

System,single: Size:32.00MiB, Used:16.00KiB
  /dev/sda1  32.00MiB

Unallocated:
  /dev/sda1   8.45GiB
# fstrim -v /
/: 9,3 GiB (10014126080 bytes) trimmed
# fallocate -l 5G testfile
# fstrim -v /
/: 4,3 GiB (4644130816 bytes) trimmed

Where the difference between 8.45GiB and 9,3 GiB comes from, I
currently don't understand.
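For what it's worth, the workaround implied by the report above seems to be to compact the chunks first and trim afterwards; a rough sketch (the usage filter value is only an example, a plain full balance as in the report works too, just slower):

btrfs balance start -dusage=85 /    # repack mostly-empty data chunks back into unallocated space
fstrim -v /                         # the space returned to unallocated should now get trimmed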
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: rollback to a snapshot and delete old top volume - missing of "@"

2016-07-07 Thread Henk Slager
On Thu, Jul 7, 2016 at 2:17 PM, Kai Herlemann  wrote:
> Hi,
>
> I want to rollback a snapshot and have done this by execute "btrfs sub
> set-default / 618".
maybe just a typo here, command syntax is:
# sudo btrfs sub set-default
btrfs subvolume set-default: too few arguments
usage: btrfs subvolume set-default <subvolid> <path>

   Set the default subvolume of a filesystem

> Now I want to delete the old top volume to save space, but google and
> manuals didn't helped.
>
> I mounted for the following the root volume at /mnt/gparted with subvolid=0,
> subvol=/ has the same effect.
> Usually, the top volume is saved in /@, so I would be able to delete it by
> execute "btrfs sub delete /@" (or move at first @ to @_badroot and the
> snapshot to @). But that isn't possible, the output of that command is
> "ERROR: cannot access subvolume /@: No such file or directory".
> I've posted the output of "btrfs sub list /mnt/gparted" at
> http://pastebin.com/r7WNbJq8. As you can see, there's no subvolume named @.

I think one or the other command you typed didn't have its expected
effect; just to make sure I get the right state, can you do:

mkdir -p /fsroot
mount -o subvolid=0 UUID=<UUID> /fsroot
btrfs sub list /fsroot
btrfs subvolume get-default /

What the latest Debian likes as naming convention I don't know, but in
openSuSE @ is a directory in the toplevel volume (ID=5, or ID=0 as
alias) and that directory contains subvolumes. You can do whatever you
like best, but at least make sure you have fstab mount entries for
subvolumes like var/cache/apt and usr/src, otherwise this magnificent
rootfs-tree snapshotting gets you into trouble.

I think your current default subvolume is still 5, so you would need:

fstab:
UUID=<UUID>   /                btrfs   defaults                          0 0
#UUID=<UUID>  /home            btrfs   defaults,subvol=@/home            0 0
UUID=<UUID>   /usr/src         btrfs   defaults,subvol=@/usr/src         0 0
UUID=<UUID>   /var/cache/apt   btrfs   defaults,subvol=@/var/cache/apt   0 0
UUID=<UUID>   /.snapshots      btrfs   defaults,subvol=@/.snapshots      0 0
UUID=<UUID>   /fsroot          btrfs   noauto,subvolid=0                 0 0


mkdir -p /fsroot
mount -o subvolid=0 UUID=<UUID> /fsroot

mkdir -p /usr/src
mkdir -p /var/cache/apt
mkdir -p /.snapshots

mkdir -p /fsroot/@/usr
mkdir -p /fsroot/@/var/cache

btrfs sub create /fsroot/@/usr/src
btrfs sub create /fsroot/@/var/cache/apt
btrfs sub create /fsroot/@/.snapshots

# .snapshots might need a different setup; the proposed one works at least for snapper

btrfs sub snap / /fsroot/@/latestrootfs
btrfs sub set-default <subvolid of latestrootfs> /
btrfs fi sync /

# for the home fs it is similar to the root fs

reboot

Then, when you want to roll back, you can set a snapshot to rw (or rename
latestrootfs and snapshot a snapshot to that name), make it the default
subvol and reboot (or maybe also do some temporary chroot tricks, I have
not tried that)
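Roughly, that rollback step could look like this (the snapshot path and subvolume id are hypothetical; take the real id from 'btrfs sub list /fsroot'):

btrfs property set -ts /fsroot/@/.snapshots/42/snapshot ro false   # make the chosen snapshot writable
btrfs sub set-default 260 /                                        # 260 = id of that snapshot
btrfs fi sync /
reboot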

> I have the same problem with my /home/ partition.
>
> Output of "uname -a" (self-compiled kernel):
> Linux debian-linux 4.1.26 #1 SMP Wed Jun 8 18:40:04 CEST 2016 x86_64
> GNU/Linux
>
> Output of "btrfs --version":
> btrfs-progs v4.5.2
>
> Output of "btrfs fi show":
> Label: none  uuid: f778877c-d50b-48c8-8951-6635c6e23c61
>   Total devices 1 FS bytes used 43.70GiB
>   devid1 size 55.62GiB used 47.03GiB path /dev/sda1
>
> Output of "btrfs fi df /":
> Data, single: total=44.00GiB, used=42.48GiB
> System, single: total=32.00MiB, used=16.00KiB
> Metadata, single: total=3.00GiB, used=1.22GiB
> GlobalReserve, single: total=416.00MiB, used=0.00B
>
> Output of dmesg attached.
>
> Thank you,
> Kai
>
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Frequent btrfs corruption on a USB flash drive

2016-07-07 Thread Austin S. Hemmelgarn

On 2016-07-07 10:55, Francesco Turco wrote:

On 2016-07-07 16:27, Austin S. Hemmelgarn wrote:

This seems odd, are you trying to access anything over NFS or some other
network filesystem protocol here?  If not, then I believe you've found a
bug, because I'm pretty certain we shouldn't be returning -ESTALE for
anything.


No, I don't use NFS or any other network filesystem.
OK, I'm going to try and check the kernel code to figure out if there's 
any other case we might return that in.  I'm pretty certain that there's 
nowhere BTRFS should return that though, which means you've either hit a 
bug or have some other hardware issue (given past experience, I think 
it's more likely that you've hit a bug).



The question here is: Do you get any data corruption when using ext4?
Quite often when there's a hardware issue, you won't see _any_
indication of it other than corrupted files when using something like
ext4 or XFS, but it will show up almost immediately with BTRFS because
we validate checksums on almost everything.  There have been at least a
couple of times I've found disk issues while converting from ext4 to
BTRFS that I didn't know existed before, and then going back was able to
reliably reproduce using other tools.

Also, FWIW, badblocks is not necessarily a reliable test method for
flash drives, they often handle serialized reads like badblocks does
very well even when failing.


I'm not sure. Commands don't fail explicitly when I use ext4, but I
agree with you that I may get corruption silently nonetheless. Perhaps I
should try to rule out a hardware problem by filling my USB flash drive
with a large random file and then checking if its SHA-1 checksum
corresponds to the original copy on the hard disk. But first I probably
should backup the current Btrfs filesystem with the dd command. Can I
proceed?
Yeah, I would suggest backing up the filesystem. Be careful, though, that you 
don't have both copies of the filesystem visible to the system at the 
same time once you've finished creating the backup copy, as there 
are potential issues if both are visible while trying to mount the FS.


As far as checking the drive, I'd do essentially what you had said, with 
two extra parts:
1. Calculate the checksum of the data on the drive multiple times and 
make sure that it matches each time as well as matching the original 
file (if it doesn't match the original file, but each calculation from 
the drive matches, then the issue is something in the write path only).
2. Do so multiple times so you can be sure to cover _every_ block.  Most 
flash drives have a pool of spare blocks that are used for wear 
leveling, and if the issue is in one of those, this is the only way to 
find it.
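A sketch of that procedure in shell (device name, size and pass count are placeholders, and the raw write is destructive):

dd if=/dev/urandom of=testfile.bin bs=1M count=1024
sha1sum testfile.bin                                  # reference checksum of the original data
dd if=testfile.bin of=/dev/sdX bs=1M oflag=direct     # write it to the raw stick (destroys its contents)
for pass in 1 2 3; do
    dd if=/dev/sdX bs=1M count=1024 iflag=direct | sha1sum
done
# read-back sums that match each other but not the reference point at the write path;
# sums that differ between passes point at the medium or the read path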


You might also try doing some testing with FIO or iozone, those tend to 
exercise a wider variety of things than stuff like badblocks or dd. 
Also, since you'll have a backup copy of the FS, you might consider 
running a destructive test with badblocks (it works a bit more reliably 
on flash devices this way, just make sure to run it multiple times too), 
both with and without the -B option (-B affects how things are buffered, 
if you see errors with it enabled but none without it, then you probably 
have some bad RAM).
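Concretely, that would be something like (again destructive, /dev/sdX is a placeholder):

badblocks -wsv /dev/sdX       # destructive write/read test, direct I/O where available
badblocks -wsv -B /dev/sdX    # same test forced through the page cache (buffered I/O)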



Just to clarify, you're using BTRFS on top of disk encryption (LUKS? Or
is it just raw encryption, or even something completely different?), on
a USB flash drive (not a USB to SATA adapter with an SSD or HDD in it),
correct?


I'm using a btrfs filesystem on a GUID partition encrypted with LUKS.
It's a Kingston USB flash drive connected directly to my desktop machine
via USB. It's definitively not a SSD or a HDD, and I'm not using any
adapter.
OK, that both simplifies things, and makes them a bit more complicated. 
If it had been an SSD or HDD connected through an adapter, the preferred 
method of checking would be to pull it out and put it directly in the 
system to verify the drive.  However, since it's a regular flash drive, 
if it is the drive, it will probably be significantly less expensive to 
replace.


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Out of space error even though there's 100 GB unused?

2016-07-07 Thread Stanislaw Kaminski
Hi Chris, Alex, Hugo,

Running now: Linux archb3 4.6.2-1-ARCH #1 PREEMPT Mon Jun 13 02:11:34
MDT 2016 armv5tel GNU/Linux

Seems to be working fine. I started a defrag, and it seems I'm getting
my space back:
$ sudo btrfs fi usage /home
Overall:
Device size:   1.81TiB
Device allocated:  1.73TiB
Device unallocated:   80.89GiB
Device missing:  0.00B
Used:  1.65TiB
Free (estimated):159.63GiB  (min: 119.19GiB)
Data ratio:   1.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 240.00KiB)

Data,single: Size:1.72TiB, Used:1.65TiB
   /dev/sda4   1.72TiB

Metadata,DUP: Size:3.50GiB, Used:2.16GiB
   /dev/sda4   7.00GiB

System,DUP: Size:32.00MiB, Used:224.00KiB
   /dev/sda4  64.00MiB

Unallocated:
   /dev/sda4  80.89GiB

I deleted some unfinished torrent, ~10 GB in size, but as you can see,
"Free space" has grown by 60 GB (re-checked now and it's 1 GB more now
- so definitely caused by defrag).

What has changed between 4.6.2 and 4.6.3?

Cheers,
Stan

2016-07-07 12:28 GMT+02:00 Stanislaw Kaminski :
> Too early report, the issue is back. Back to testing
>
> 2016-07-07 12:18 GMT+02:00 Stanislaw Kaminski :
>> Hi all,
>> I downgraded to 4.4.1-1 - all fine, 4.5.5.-1 - also fine, then got
>> back to 4.6.3-2 - and it's still fine. Apparently running under
>> different kernel somehow fixed the glitch (as far as I can test...).
>>
>> That leaves me with the other question: before issues, I 1.6 TiB was
>> used, now all the tools report 1.7 TiB issued (except for btrfs fs du
>> /home, this reports 1.6 TiB). How is that possible?
>>
>> Cheers,
>> Stan
>>
>> 2016-07-06 19:42 GMT+02:00 Chris Murphy :
>>> On Wed, Jul 6, 2016 at 3:55 AM, Stanislaw Kaminski
>>>  wrote:
>>>
 Device unallocated:   97.89GiB
>>>
>>> There should be no problem creating any type of block group from this
>>> much space. It's a bug.
>>>
>>> I would try regression testing. Kernel 4.5.7 has some changes that may
>>> or may not relate to this (they should only relate when there is no
>>> unallocated space left) so you could try 4.5.6 and 4.5.7. And also
>>> 4.4.14.
>>>
>>> But also the kernel messages are important. There is this obscure
>>> enospc with error -28, so either with or without enospc_debug mount
>>> option is useful to try in 4.6.3 (I think it's less useful in older
>>> kernels).
>>>
>>> But do try nospace_cache first. If that works, you could then mount
>>> with clear_cache one time and see if that provides an enduring fix. It
>>> can take some time to rebuild the cache after clear_cache is used.
>>>
>>>
>>>
>>> --
>>> Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Frequent btrfs corruption on a USB flash drive

2016-07-07 Thread Francesco Turco
On 2016-07-07 16:27, Austin S. Hemmelgarn wrote:
> This seems odd, are you trying to access anything over NFS or some other
> network filesystem protocol here?  If not, then I believe you've found a
> bug, because I'm pretty certain we shouldn't be returning -ESTALE for
> anything.

No, I don't use NFS or any other network filesystem.

> The question here is: Do you get any data corruption when using ext4?
> Quite often when there's a hardware issue, you won't see _any_
> indication of it other than corrupted files when using something like
> ext4 or XFS, but it will show up almost immediately with BTRFS because
> we validate checksums on almost everything.  There have been at least a
> couple of times I've found disk issues while converting from ext4 to
> BTRFS that I didn't know existed before, and then going back was able to
> reliably reproduce using other tools.
> 
> Also, FWIW, badblocks is not necessarily a reliable test method for
> flash drives, they often handle serialized reads like badblocks does
> very well even when failing.

I'm not sure. Commands don't fail explicitly when I use ext4, but I
agree with you that I may get corruption silently nonetheless. Perhaps I
should try to rule out a hardware problem by filling my USB flash drive
with a large random file and then checking if its SHA-1 checksum
corresponds to the original copy on the hard disk. But first I probably
should backup the current Btrfs filesystem with the dd command. Can I
proceed?

> Just to clarify, you're using BTRFS on top of disk encryption (LUKS? Or
> is it just raw encryption, or even something completely different?), on
> a USB flash drive (not a USB to SATA adapter with an SSD or HDD in it),
> correct?

I'm using a btrfs filesystem on a GUID partition encrypted with LUKS.
It's a Kingston USB flash drive connected directly to my desktop machine
via USB. It's definitively not a SSD or a HDD, and I'm not using any
adapter.

-- 
Website: http://www.fturco.net/
GPG key: 6712 2364 B2FE 30E1 4791 EB82 7BB1 1F53 29DE CD34
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


btrfs module does not load on sparc64

2016-07-07 Thread Anatoly Pugachev
Hi!

Compiled linux kernel (git version 4.7.0-rc6+) using my own kernel
config file, enabling :

CONFIG_BTRFS_FS_RUN_SANITY_TESTS=y
CONFIG_BTRFS_DEBUG=y
CONFIG_BTRFS_ASSERT=y

and now I can't load btrfs module:

# modprobe btrfs
modprobe: ERROR: could not insert 'btrfs': Invalid argument


and in logs (and on console):

[1897399.942697] Btrfs loaded, crc32c=crc32c-generic, debug=on, assert=on
[1897400.024645] BTRFS: selftest: sectorsize: 8192  nodesize: 8192
[1897400.098089] BTRFS: selftest: Running btrfs free space cache tests
[1897400.175863] BTRFS: selftest: Running extent only tests
[1897400.241871] BTRFS: selftest: Running bitmap only tests
[1897400.307877] BTRFS: selftest: Running bitmap and extent tests
[1897400.380329] BTRFS: selftest: Running space stealing from bitmap to extent
[1897400.470517] BTRFS: selftest: Free space cache tests finished
[1897400.542875] BTRFS: selftest: Running extent buffer operation tests
[1897400.621710] BTRFS: selftest: Running btrfs_split_item tests
[1897400.692929] BTRFS: selftest: Running extent I/O tests
[1897400.757459] BTRFS: selftest: Running find delalloc tests
[1897401.082670] BTRFS: selftest: Running extent buffer bitmap tests
[1897401.161223] BTRFS: selftest: Setting straddling pages failed
[1897401.233661] BTRFS: selftest: Extent I/O tests finished


this is sparc64 sid/unstable debian:

# uname -a
Linux nvg5120 4.7.0-rc6+ #38 SMP Thu Jul 7 14:51:23 MSK 2016 sparc64 GNU/Linux

# getconf PAGESIZE
8192

PS: using btrfs-progs from kdave repo,
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git ,
I'm able to create the fs, but unable to mount it:

root@nvg5120:/home/mator/btrfs-progs# ./mkfs.btrfs -f /dev/vg1/vol1
btrfs-progs v4.6.1
See http://btrfs.wiki.kernel.org for more information.

WARNING: failed to open /dev/btrfs-control, skipping device
registration: No such device
Label:  (null)
UUID:   ddd8a268-62e5-444c-9baf-6ba1b2d4448b
Node size:  16384
Sector size:8192
Filesystem size:15.00GiB
Block group profiles:
  Data: single8.00MiB
  Metadata: DUP   1.01GiB
  System:   DUP  12.00MiB
SSD detected:   no
Incompat features:  extref, skinny-metadata
Number of devices:  1
Devices:
   ID        SIZE  PATH
    1    15.00GiB  /dev/vg1/vol1


Can someone help please? Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Frequent btrfs corruption on a USB flash drive

2016-07-07 Thread Austin S. Hemmelgarn

On 2016-07-07 09:49, Francesco Turco wrote:

I have a USB flash drive with an encrypted Btrfs filesystem where I
store daily backups. My problem is that this btrfs filesystem gets
corrupted very often, after a few days of usage. Usually I just reformat
it and move along, but this time I'd like to understand the root cause
of the problem and fix it.

I can mount the partition without problems, but then when using commands
such as rsync or even humble ls I get the following error message:

$ rsync /home/fturco/Buffer/E-book/
/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Buffer/E-book/
--recursive
rsync:
readlink_stat("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Riviste")
failed: Stale file handle (116)
rsync:
readlink_stat("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Backup")
failed: Stale file handle (116)
rsync:
readdir("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Calibre
(TMSU)"): Input/output error (5)
This seems odd, are you trying to access anything over NFS or some other 
network filesystem protocol here?  If not, then I believe you've found a 
bug, because I'm pretty certain we shouldn't be returning -ESTALE for 
anything.


The previous command gets stuck and I had to manually stop it.

The following command doesn't return any output, but its exit code is 1
(failure):

$ btrfs filesystem show
/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3
$
Something is definitely wrong here.  Unless Parabola has seriously 
modified btrfs-progs, this should be spitting out info about the devices 
and filesystem usage.  This may be a result of the errors seen by check, 
but I doubt that.


Btrfs-check reports many errors. I attached the output to this e-mail
message.
Looking at this, I see a couple of things I know it should fix correctly 
(the 'errors 2001' stuff is fixable, and I'm pretty certain that the 
'errors 200' thing is too, and I think it will fix the bytenr mismatch 
stuff mostly safely), but there's enough I'm not sure about that I can't 
in good conscience recommend that you run check with --repair, as it may 
make things worse.  Hopefully someone who actually understands what the 
other things actually mean can provide more help on that.


Output from dmesg:

$ dmesg | tail
[18756.159963] BTRFS error (device dm-4): bad tree block start
6592115285688248773 35323904
[18756.160828] BTRFS error (device dm-4): bad tree block start
8533404122473270145 35323904
[18756.161821] BTRFS error (device dm-4): bad tree block start
6592115285688248773 35323904
[18756.163047] BTRFS error (device dm-4): bad tree block start
8533404122473270145 35323904
[18756.163921] BTRFS error (device dm-4): bad tree block start
6592115285688248773 35323904
[18756.164806] BTRFS error (device dm-4): bad tree block start
8533404122473270145 35323904
[18756.165673] BTRFS error (device dm-4): bad tree block start
6592115285688248773 35323904
[18756.166548] BTRFS error (device dm-4): bad tree block start
8533404122473270145 35323904
[18757.950603] BTRFS error (device dm-4): bad tree block start
6592115285688248773 35323904
[18757.951492] BTRFS error (device dm-4): bad tree block start
8533404122473270145 35323904

I checked this USB flash drive with badblocks in non-destructive
read-write mode. No errors.

If I format this partition as Ext4 instead of Btrfs I can use it without
problems, but my goal is to use Btrfs on all devices.
The question here is: Do you get any data corruption when using ext4? 
Quite often when there's a hardware issue, you won't see _any_ 
indication of it other than corrupted files when using something like 
ext4 or XFS, but it will show up almost immediately with BTRFS because 
we validate checksums on almost everything.  There have been at least a 
couple of times I've found disk issues while converting from ext4 to 
BTRFS that I didn't know existed before, and then going back was able to 
reliably reproduce using other tools.


Also, FWIW, badblocks is not necessarily a reliable test method for 
flash drives, they often handle serialized reads like badblocks does 
very well even when failing.


Just to clarify, you're using BTRFS on top of disk encryption (LUKS? Or 
is it just raw encryption, or even something completely different?), on 
a USB flash drive (not a USB to SATA adapter with an SSD or HDD in it), 
correct?


My GNU/Linux distribution is Parabola GNU/Linux-libre.
Kernel version is: 4.6.3.
Btrfs-progs version is: 4.6

Please tell me if you need other details. Thanks.



--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 00/31] btrfs: simplify use of struct btrfs_root pointers

2016-07-07 Thread David Sterba
On Fri, Jun 24, 2016 at 06:14:53PM -0400, je...@suse.com wrote:
> From: Jeff Mahoney 
> 
> One of the common complaints I've heard from new and experienced
> developers alike about the btrfs code is the ubiquity of
> struct btrfs_root.  There is one for every tree on disk and it's not
> always obvious which root is needed in a particular call path.  It can
> be frustrating to spend time figuring out which root is required only
> to discover that it's not actually used for anything other than
> getting the fs-global struct btrfs_fs_info.
> 
> The patchset contains several sections.
[...]

The whole series looks very good. The patches are well split and
reviewable. I'll take the squashed series, but due to the merge
conflicts I'll  merge it last on top of other patches that have been
sent before.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Frequent btrfs corruption on a USB flash drive

2016-07-07 Thread Francesco Turco
I have a USB flash drive with an encrypted Btrfs filesystem where I
store daily backups. My problem is that this btrfs filesystem gets
corrupted very often, after a few days of usage. Usually I just reformat
it and move along, but this time I'd like to understand the root cause
of the problem and fix it.

I can mount the partition without problems, but then when using commands
such as rsync or even humble ls I get the following error message:

$ rsync /home/fturco/Buffer/E-book/
/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Buffer/E-book/
--recursive
rsync:
readlink_stat("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Riviste")
failed: Stale file handle (116)
rsync:
readlink_stat("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Backup")
failed: Stale file handle (116)
rsync:
readdir("/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3/Calibre
(TMSU)"): Input/output error (5)

The previous command gets stuck and I had to manually stop it.

The following command doesn't return any output, but its exit code is 1
(failure):

$ btrfs filesystem show
/run/media/fturco/5283147c-b7b4-448f-97b0-b235344a56a3
$

Btrfs-check reports many errors. I attached the output to this e-mail
message.

Output from dmesg:

$ dmesg | tail
[18756.159963] BTRFS error (device dm-4): bad tree block start
6592115285688248773 35323904
[18756.160828] BTRFS error (device dm-4): bad tree block start
8533404122473270145 35323904
[18756.161821] BTRFS error (device dm-4): bad tree block start
6592115285688248773 35323904
[18756.163047] BTRFS error (device dm-4): bad tree block start
8533404122473270145 35323904
[18756.163921] BTRFS error (device dm-4): bad tree block start
6592115285688248773 35323904
[18756.164806] BTRFS error (device dm-4): bad tree block start
8533404122473270145 35323904
[18756.165673] BTRFS error (device dm-4): bad tree block start
6592115285688248773 35323904
[18756.166548] BTRFS error (device dm-4): bad tree block start
8533404122473270145 35323904
[18757.950603] BTRFS error (device dm-4): bad tree block start
6592115285688248773 35323904
[18757.951492] BTRFS error (device dm-4): bad tree block start
8533404122473270145 35323904

I checked this USB flash drive with badblocks in non-destructive
read-write mode. No errors.

If I format this partition as Ext4 instead of Btrfs I can use it without
problems, but my goal is to use Btrfs on all devices.

My GNU/Linux distribution is Parabola GNU/Linux-libre.
Kernel version is: 4.6.3.
Btrfs-progs version is: 4.6

Please tell me if you need other details. Thanks.

-- 
Website: http://www.fturco.net/
GPG key: 6712 2364 B2FE 30E1 4791 EB82 7BB1 1F53 29DE CD34
# btrfs check --readonly /dev/mapper/luks-08e23ed4-a2a1-41f0-a5f6-794ff0647ada
Checking filesystem on /dev/mapper/luks-08e23ed4-a2a1-41f0-a5f6-794ff0647ada
UUID: 5283147c-b7b4-448f-97b0-b235344a56a3
checking extents
checksum verify failed on 35274752 found E042416D wanted 4CD1CFA0
checksum verify failed on 35274752 found E042416D wanted 4CD1CFA0
checksum verify failed on 35274752 found E8B38F1B wanted B3F4F728
checksum verify failed on 35274752 found E042416D wanted 4CD1CFA0
bytenr mismatch, want=35274752, have=6970279768983377651
checksum verify failed on 35291136 found 6B9667D1 wanted CDED2E29
checksum verify failed on 35291136 found 6B9667D1 wanted CDED2E29
checksum verify failed on 35291136 found 607F5103 wanted F21126A3
checksum verify failed on 35291136 found 6B9667D1 wanted CDED2E29
bytenr mismatch, want=35291136, have=16962852950865328208
checksum verify failed on 35307520 found 088ACE59 wanted 22164173
checksum verify failed on 35307520 found 088ACE59 wanted 22164173
checksum verify failed on 35307520 found F59BACEE wanted E647A1CD
checksum verify failed on 35307520 found 088ACE59 wanted 22164173
bytenr mismatch, want=35307520, have=16013504349018505369
checksum verify failed on 35323904 found CA154283 wanted 10E9FA6B
checksum verify failed on 35323904 found CA154283 wanted 10E9FA6B
checksum verify failed on 35323904 found 4DA7B234 wanted 794014C7
checksum verify failed on 35323904 found 4DA7B234 wanted 794014C7
bytenr mismatch, want=35323904, have=8533404122473270145
parent transid verify failed on 35340288 wanted 44 found 37
parent transid verify failed on 35340288 wanted 44 found 37
parent transid verify failed on 35340288 wanted 44 found 37
parent transid verify failed on 35340288 wanted 44 found 37
Ignoring transid failure
leaf parent key incorrect 35340288
bad block 35340288
Errors found in extent allocation tree or chunk allocation
checking free space cache
checking fs roots
parent transid verify failed on 35340288 wanted 44 found 37
Ignoring transid failure
parent transid verify failed on 35340288 wanted 44 found 37
Ignoring transid failure
parent transid verify failed on 35340288 wanted 44 found 37
Ignoring transid failure
parent transid verify failed on 35340288 wanted 44 found 37
Ignoring transid failure
checksum verify failed on 35274752 found E042416D wanted 4CD1CFA0
checksum 

Re: [PATCH 1/2] btrfs: fix fsfreeze hang caused by delayed iputs deal

2016-07-07 Thread David Sterba
On Wed, Jul 06, 2016 at 06:20:40PM +0800, Wang Xiaoguang wrote:
> > There's opencoding an existing wrapper sb_start_write, please use it
> > instead.
> OK, I can submit a new version using this wrapper.
> Also could you please have a look at my reply to Filipe Manana in
> last mail? I suggest another solution, thanks.

I missed Filipe's response when writing mine, but I agree with him. We
should avoid relying on sb freezing API calls (or peeking into the
internal structures) in general. It could be quite hard to prove that the
change will not break something else.
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


rollback to a snapshot and delete old top volume - missing of "@"

2016-07-07 Thread Kai Herlemann

Hi,

I want to roll back to a snapshot and have done this by executing "btrfs sub 
set-default / 618".
Now I want to delete the old top volume to save space, but Google and 
manuals didn't help.


I mounted for the following the root volume at /mnt/gparted with 
subvolid=0, subvol=/ has the same effect.
Usually, the top volume is saved in /@, so I would be able to delete it 
by executing "btrfs sub delete /@" (or by first moving @ to @_badroot and 
the snapshot to @). But that isn't possible; the output of that command 
is "ERROR: cannot access subvolume /@: No such file or directory".
I've posted the output of "btrfs sub list /mnt/gparted" at 
http://pastebin.com/r7WNbJq8. As you can see, there's no subvolume named @.


I have the same problem with my /home/ partition.

Output of "uname -a" (self-compiled kernel):
Linux debian-linux 4.1.26 #1 SMP Wed Jun 8 18:40:04 CEST 2016 x86_64 
GNU/Linux


Output of "btrfs --version":
btrfs-progs v4.5.2

Output of "btrfs fi show":
Label: none  uuid: f778877c-d50b-48c8-8951-6635c6e23c61
  Total devices 1 FS bytes used 43.70GiB
  devid1 size 55.62GiB used 47.03GiB path /dev/sda1

Output of "btrfs fi df /":
Data, single: total=44.00GiB, used=42.48GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=3.00GiB, used=1.22GiB
GlobalReserve, single: total=416.00MiB, used=0.00B

Output of dmesg attached.

Thank you,
Kai

[0.00] microcode: CPU0 microcode updated early to revision 0xe, date = 2013-06-26
[0.00] Initializing cgroup subsys cpuset
[0.00] Initializing cgroup subsys cpu
[0.00] Initializing cgroup subsys cpuacct
[0.00] Linux version 4.1.26 (root@debian-linux) (gcc version 5.3.1 20160528 (Debian 5.3.1-21) ) #1 SMP Wed Jun 8 18:40:04 CEST 2016
[0.00] Command line: BOOT_IMAGE=/boot/vmlinuz-4.1.26 root=UUID=f778877c-d50b-48c8-8951-6635c6e23c61 ro resume=UUID=9be0bf62-859d-42cb-b075-4bd31f41c53d init=/lib/systemd/systemd
[0.00] e820: BIOS-provided physical RAM map:
[0.00] BIOS-e820: [mem 0x-0x0009cfff] usable
[0.00] BIOS-e820: [mem 0x0009d000-0x0009] reserved
[0.00] BIOS-e820: [mem 0x000e-0x000f] reserved
[0.00] BIOS-e820: [mem 0x0010-0x9f680fff] usable
[0.00] BIOS-e820: [mem 0x9f681000-0x9f6befff] reserved
[0.00] BIOS-e820: [mem 0x9f6bf000-0x9f735fff] usable
[0.00] BIOS-e820: [mem 0x9f736000-0x9f7befff] ACPI NVS
[0.00] BIOS-e820: [mem 0x9f7bf000-0x9f7defff] usable
[0.00] BIOS-e820: [mem 0x9f7df000-0x9f7fefff] ACPI data
[0.00] BIOS-e820: [mem 0x9f7ff000-0x9f7f] usable
[0.00] BIOS-e820: [mem 0x9f80-0x9fff] reserved
[0.00] BIOS-e820: [mem 0xe000-0xefff] reserved
[0.00] BIOS-e820: [mem 0xfeb0-0xfeb03fff] reserved
[0.00] BIOS-e820: [mem 0xfec0-0xfec00fff] reserved
[0.00] BIOS-e820: [mem 0xfed1-0xfed13fff] reserved
[0.00] BIOS-e820: [mem 0xfed18000-0xfed19fff] reserved
[0.00] BIOS-e820: [mem 0xfed1b000-0xfed1] reserved
[0.00] BIOS-e820: [mem 0xfee0-0xfee00fff] reserved
[0.00] BIOS-e820: [mem 0xffe8-0x] reserved
[0.00] BIOS-e820: [mem 0x0001-0x000157ff] usable
[0.00] NX (Execute Disable) protection: active
[0.00] SMBIOS 2.6 present.
[0.00] DMI: eMachineseME730G /eME730G , BIOS V1.23 04/25/2011
[0.00] e820: update [mem 0x-0x0fff] usable ==> reserved
[0.00] e820: remove [mem 0x000a-0x000f] usable
[0.00] AGP: No AGP bridge found
[0.00] e820: last_pfn = 0x158000 max_arch_pfn = 0x4
[0.00] MTRR default type: uncachable
[0.00] MTRR fixed ranges enabled:
[0.00]   0-9 write-back
[0.00]   A-B uncachable
[0.00]   C-F write-protect
[0.00] MTRR variable ranges enabled:
[0.00]   0 base 0 mask F8000 write-back
[0.00]   1 base 0FFE0 mask FFFE0 write-protect
[0.00]   2 base 08000 mask FE000 write-back
[0.00]   3 base 09F80 mask FFF80 uncachable
[0.00]   4 base 1 mask FC000 write-back
[0.00]   5 base 14000 mask FF000 write-back
[0.00]   6 base 15000 mask FF800 write-back
[0.00]   7 disabled
[0.00] PAT configuration [0-7]: WB  WC  UC- UC  WB  WC  UC- UC  
[0.00] e820: last_pfn = 0x9f800 max_arch_pfn = 0x4
[0.00] Base memory trampoline at [88097000] 97000 size 24576
[0.00] init_memory_mapping: [mem 

Re: Unable to mount degraded RAID5

2016-07-07 Thread Gonzalo Gomez-Arrue Azpiazu
Thanks a lot, your willingness to help out someone you do not know (and who
is obviously way out of his depth) is inspiring.

This is what it says:

btrfs rescue super-recover -v /dev/sdc1
All Devices:
Device: id = 3, name = /dev/sdd1
Device: id = 1, name = /dev/sdc1

Before Recovering:
[All good supers]:
device name = /dev/sdd1
superblock bytenr = 65536

device name = /dev/sdd1
superblock bytenr = 67108864

device name = /dev/sdd1
superblock bytenr = 274877906944

device name = /dev/sdc1
superblock bytenr = 65536

device name = /dev/sdc1
superblock bytenr = 67108864

device name = /dev/sdc1
superblock bytenr = 274877906944

[All bad supers]:

All supers are valid, no need to recover

Any suggestion on what to do next?

(again, really appreciated - I hope to be able to give back the
support I am receiving at some point!)

On Wed, Jul 6, 2016 at 9:19 PM, Chris Murphy  wrote:
> On Wed, Jul 6, 2016 at 11:12 AM, Gonzalo Gomez-Arrue Azpiazu
>  wrote:
>> Hello,
>>
>> I had a RAID5 with 3 disks and one failed; now the filesystem cannot be 
>> mounted.
>>
>> None of the recommendations that I found seem to work. The situation
>> seems to be similar to this one:
>> http://www.spinics.net/lists/linux-btrfs/msg56825.html
>>
>> Any suggestion on what to try next?
>
> Basically if you are degraded *and* it runs into additional errors,
> then it's broken because raid5 only protects against one device error.
> The main problem is if it can't read the chunk root it's hard for any
> tool to recover data because the chunk tree mapping is vital to
> finding data.
>
> What do you get for:
> btrfs rescue super-recover -v /dev/sdc1
>
> It's a problem with the chunk tree because all of your super blocks
> point to the same chunk tree root so there isn't another one to try.
>
>>sudo btrfs-find-root /dev/sdc1
>>warning, device 2 is missing
>>Couldn't read chunk root
>>Open ctree failed
>
> It's bad news. I'm not even sure 'btrfs restore' can help this case.
>
>
> --
> Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs: error out if generic_bin_search get invalid arguments

2016-07-07 Thread David Sterba
On Thu, Jun 23, 2016 at 04:32:45PM -0700, Liu Bo wrote:
> With btrfs-corrupt-block, one can set btree node/leaf's field, if
> we assign a negative value to node/leaf, we can get various hangs,
> eg. if extent_root's nritems is -2ULL, then we get stuck in
>  btrfs_read_block_groups() because it has a while loop and
> btrfs_search_slot() on extent_root will always return the first
>  child.
> 
> This lets us know what's happening and returns a EINVAL to callers
> instead of returning the first item.
> 
> Signed-off-by: Liu Bo 

Reviewed-by: David Sterba 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] Btrfs: fix read_node_slot to return errors

2016-07-07 Thread David Sterba
On Tue, Jul 05, 2016 at 12:10:14PM -0700, Liu Bo wrote:
> We use read_node_slot() to read btree node and it has two cases,
> a) slot is out of range, which means 'no such entry'
> b) we fail to read the block, due to checksum fails or corrupted
>content or not with uptodate flag.
> But we're returning NULL in both cases, this makes it return -ENOENT
> in case a) and return -EIO in case b), and this fixes its callers
> as well as btrfs_search_forward() 's caller to catch the new errors.
> 
> The problem is reported by Peter Becker, and I can manage to
> hit the same BUG_ON by mounting my fuzz image.
> 
> Reported-by: Peter Becker 
> Signed-off-by: Liu Bo 

Reviewed-by: David Sterba 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Btrfs: cleanup BUG_ON in merge_bio

2016-07-07 Thread David Sterba
On Wed, Jun 22, 2016 at 06:31:49PM -0700, Liu Bo wrote:
> One can use btrfs-corrupt-block to hit BUG_ON() in merge_bio(),
> thus this aims to stop anyone to panic the whole system by using
>  their btrfs.
> 
> Since the error in merge_bio can only come from __btrfs_map_block()
> when chunk tree mapping has something insane and __btrfs_map_block()
> has already had printed the reason, we can just return errors in
> merge_bio.
> 
> Signed-off-by: Liu Bo 

Reviewed-by: David Sterba 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: raid1 has failing disks, but smart is clear

2016-07-07 Thread Austin S. Hemmelgarn

On 2016-07-06 18:59, Tomasz Kusmierz wrote:



On 6 Jul 2016, at 23:14, Corey Coughlin  wrote:

Hi all,
   Hoping you all can help, have a strange problem, think I know what's going 
on, but could use some verification.  I set up a raid1 type btrfs filesystem on 
an Ubuntu 16.04 system, here's what it looks like:

btrfs fi show
Label: none  uuid: 597ee185-36ac-4b68-8961-d4adc13f95d4
   Total devices 10 FS bytes used 3.42TiB
   devid1 size 1.82TiB used 1.18TiB path /dev/sdd
   devid2 size 698.64GiB used 47.00GiB path /dev/sdk
   devid3 size 931.51GiB used 280.03GiB path /dev/sdm
   devid4 size 931.51GiB used 280.00GiB path /dev/sdl
   devid5 size 1.82TiB used 1.17TiB path /dev/sdi
   devid6 size 1.82TiB used 823.03GiB path /dev/sdj
   devid7 size 698.64GiB used 47.00GiB path /dev/sdg
   devid8 size 1.82TiB used 1.18TiB path /dev/sda
   devid9 size 1.82TiB used 1.18TiB path /dev/sdb
   devid   10 size 1.36TiB used 745.03GiB path /dev/sdh

I added a couple disks, and then ran a balance operation, and that took about 3 
days to finish.  When it did finish, tried a scrub and got this message:

scrub status for 597ee185-36ac-4b68-8961-d4adc13f95d4
   scrub started at Sun Jun 26 18:19:28 2016 and was aborted after 01:16:35
   total bytes scrubbed: 926.45GiB with 18849935 errors
   error details: read=18849935
   corrected errors: 5860, uncorrectable errors: 18844075, unverified errors: 0

So that seems bad.  Took a look at the devices and a few of them have errors:
...
[/dev/sdi].generation_errs 0
[/dev/sdj].write_io_errs   289436740
[/dev/sdj].read_io_errs289492820
[/dev/sdj].flush_io_errs   12411
[/dev/sdj].corruption_errs 0
[/dev/sdj].generation_errs 0
[/dev/sdg].write_io_errs   0
...
[/dev/sda].generation_errs 0
[/dev/sdb].write_io_errs   3490143
[/dev/sdb].read_io_errs111
[/dev/sdb].flush_io_errs   268
[/dev/sdb].corruption_errs 0
[/dev/sdb].generation_errs 0
[/dev/sdh].write_io_errs   5839
[/dev/sdh].read_io_errs2188
[/dev/sdh].flush_io_errs   11
[/dev/sdh].corruption_errs 1
[/dev/sdh].generation_errs 16373

So I checked the smart data for those disks, they seem perfect, no reallocated 
sectors, no problems.  But one thing I did notice is that they are all WD Green 
drives.  So I'm guessing that if they power down and get reassigned to a new 
/dev/sd* letter, that could lead to data corruption.  I used idle3ctl to turn 
off the shut down mode on all the green drives in the system, but I'm having 
trouble getting the filesystem working without the errors.  I tried a 'check 
--repair' command on it, and it seems to find a lot of verification errors, but 
it doesn't look like things are getting fixed.
 But I have all the data on it backed up on another system, so I can recreate 
this if I need to.  But here's what I want to know:

1.  Am I correct about the issues with the WD Green drives, if they change 
mounts during disk operations, will that corrupt data?

I just wanted to chip in about WD Green drives. I have a RAID10 running on 6x2TB of 
those, and have had it for ~3 years. If a disk goes down for spin-down and you try to 
access something, the kernel & FS & whole system will wait for the drive to spin back up 
and everything works OK. I've never had a drive reassigned to a different /dev/sdX 
due to spin down / up.
2 years ago I was having corruption due to not using ECC RAM on my system; one 
of the RAM modules started producing errors that were never caught by the CPU / 
MoBo. Long story short, a guy here managed to point me in the right direction and 
I started shifting my data to a hopefully new and uncorrupted FS … but I was 
sceptical of the same kind of issue you have described, AND I had raid1, and while 
mounted I shifted a disk from one SATA port to another and the FS managed to pick 
up the disk in its new location and did not even blink (as far as I remember 
there was a syslog entry saying that the disk vanished and then that the disk was added).

Last word: you've got plenty of errors in your SMART for transfer related stuff, 
so please be advised that this may mean:
- faulty cable
- faulty mobo controller
- faulty drive controller
- bad RAM - yes, the motherboard CAN use your RAM for storing data and transfer 
related stuff … especially cheaper ones.
It's worth pointing out that the most likely point here for data 
corruption, assuming the cable and controllers are OK, is during the DMA 
transfer from system RAM to the drive controller.  Even when dealing 
with really good HBA's that have an on-board NVRAM cache, you still have 
to copy the data out of system RAM at some point, and that's usually 
when the corruption occurs if the problem is with the RAM, CPU or MB.



2.  If that is the case:
   a.) Is there any way I can stop the /dev/sd* mount points from changing?  Or 
can I set up the filesystem using UUIDs or something more solid?  I googled 
about it, but found conflicting info

Don’t get it the wrong way but I’m personally surprised that anybody still 

[PATCH v2] btrfs-progs: du: fix to skip not btrfs dir/file

2016-07-07 Thread Wang Shilong
'btrfs file du' is a very useful tool for watching my system's
file usage information with snapshot awareness.

when trying to run following commands:
[root@localhost btrfs-progs]# btrfs file du /
 Total   Exclusive  Set shared  Filename
ERROR: Failed to lookup root id - Inappropriate ioctl for device
ERROR: cannot check space of '/': Unknown error -1

and my filesystem looks like this:
[root@localhost btrfs-progs]# df -Th
Filesystem Type  Size  Used Avail Use% Mounted on
devtmpfs   devtmpfs   16G 0   16G   0% /dev
tmpfs  tmpfs  16G  368K   16G   1% /dev/shm
tmpfs  tmpfs  16G  1.4M   16G   1% /run
tmpfs  tmpfs  16G 0   16G   0% /sys/fs/cgroup
/dev/sda3  btrfs  60G   19G   40G  33% /
tmpfs  tmpfs  16G  332K   16G   1% /tmp
/dev/sdc   btrfs 2.8T  166G  1.7T   9% /data
/dev/sda2  xfs   2.0G  452M  1.6G  23% /boot
/dev/sda1  vfat  1.9G   11M  1.9G   1% /boot/efi
tmpfs  tmpfs 3.2G   24K  3.2G   1% /run/user/1000

So I installed btrfs on my root partition, but the boot partition
can be another fs.

We can let the btrfs tool recognize that a path is not a btrfs file or
directory and skip those files, so that someone like me
can just run 'btrfs file du /' to scan all btrfs filesystems.

After patch, it will look like:
   Total   Exclusive  Set shared  Filename
 0.00B   0.00B   -  //root/.bash_logout
 0.00B   0.00B   -  //root/.bash_profile
 0.00B   0.00B   -  //root/.bashrc
 0.00B   0.00B   -  //root/.cshrc
 0.00B   0.00B   -  //root/.tcshrc

This works for me for analyzing system usage and
performance.
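As a side illustration only (not what this patch does, which is to skip entries 
whose ioctl fails with -ENOTTY), a path could also be classified up front with the 
statfs(2) magic check; a hedged user-space sketch, assuming BTRFS_SUPER_MAGIC from 
linux/magic.h:

/* Sketch: report whether a path sits on btrfs by checking the statfs f_type
 * magic.  Alternative illustration; the patch below skips non-btrfs entries
 * by propagating -ENOTTY from the BTRFS ioctl instead. */
#include <stdio.h>
#include <sys/vfs.h>
#include <linux/magic.h>	/* BTRFS_SUPER_MAGIC */

static int is_btrfs(const char *path)
{
	struct statfs sfs;

	if (statfs(path, &sfs) < 0)
		return -1;	/* let the caller decide how to handle errors */
	return sfs.f_type == BTRFS_SUPER_MAGIC;
}

int main(int argc, char **argv)
{
	const char *path = argc > 1 ? argv[1] : "/";
	int ret = is_btrfs(path);

	if (ret < 0)
		perror("statfs");
	else
		printf("%s is %son btrfs\n", path, ret ? "" : "not ");
	return ret < 0;
}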

Signed-off-by: Wang Shilong 
---
v1->v2: remove extra unnecessary messages output
---
 cmds-fi-du.c   | 8 +++++++-
 cmds-inspect.c | 2 +-
 utils.c        | 8 ++++----
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/cmds-fi-du.c b/cmds-fi-du.c
index 12855a5..6d5bf35 100644
--- a/cmds-fi-du.c
+++ b/cmds-fi-du.c
@@ -389,8 +389,14 @@ static int du_walk_dir(struct du_dir_ctxt *ctxt, struct rb_root *shared_extents)
                          dirfd(dirstream),
                          shared_extents, &tot, &shr,
                          0);
-       if (ret)
+       if (ret == -ENOTTY) {
+               continue;
+       } else if (ret) {
+               fprintf(stderr,
+                       "failed to walk dir/file: %s :%s\n",
+                       entry->d_name, strerror(-ret));
                break;
+       }
 
        ctxt->bytes_total += tot;
        ctxt->bytes_shared += shr;
diff --git a/cmds-inspect.c b/cmds-inspect.c
index dd7b9dd..2ae44be 100644
--- a/cmds-inspect.c
+++ b/cmds-inspect.c
@@ -323,7 +323,7 @@ static int cmd_inspect_rootid(int argc, char **argv)
 
ret = lookup_ino_rootid(fd, &rootid);
if (ret) {
-   error("rootid failed with ret=%d", ret);
+   error("failed to lookup root id: %s", strerror(-ret));
goto out;
}
 
diff --git a/utils.c b/utils.c
index 578fdb0..f73b048 100644
--- a/utils.c
+++ b/utils.c
@@ -2815,6 +2815,8 @@ path:
if (fd < 0)
goto err;
ret = lookup_ino_rootid(fd, &rootid);
+   if (ret)
+   error("failed to lookup root id: %s", strerror(-ret));
close(fd);
if (ret < 0)
goto err;
@@ -3497,10 +3499,8 @@ int lookup_ino_rootid(int fd, u64 *rootid)
args.objectid = BTRFS_FIRST_FREE_OBJECTID;
 
ret = ioctl(fd, BTRFS_IOC_INO_LOOKUP, &args);
-   if (ret < 0) {
-   error("failed to lookup root id: %s", strerror(errno));
-   return ret;
-   }
+   if (ret < 0)
+   return -errno;
 
*rootid = args.treeid;
 
-- 
2.7.4

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Out of space error even though there's 100 GB unused?

2016-07-07 Thread Stanislaw Kaminski
That report was too early, the issue is back. Back to testing.

2016-07-07 12:18 GMT+02:00 Stanislaw Kaminski :
> Hi all,
> I downgraded to 4.4.1-1 - all fine, 4.5.5.-1 - also fine, then got
> back to 4.6.3-2 - and it's still fine. Apparently running under
> different kernel somehow fixed the glitch (as far as I can test...).
>
> That leaves me with the other question: before issues, I 1.6 TiB was
> used, now all the tools report 1.7 TiB issued (except for btrfs fs du
> /home, this reports 1.6 TiB). How is that possible?
>
> Cheers,
> Stan
>
> 2016-07-06 19:42 GMT+02:00 Chris Murphy :
>> On Wed, Jul 6, 2016 at 3:55 AM, Stanislaw Kaminski
>>  wrote:
>>
>>> Device unallocated:   97.89GiB
>>
>> There should be no problem creating any type of block group from this
>> much space. It's a bug.
>>
>> I would try regression testing. Kernel 4.5.7 has some changes that may
>> or may not relate to this (they should only relate when there is no
>> unallocated space left) so you could try 4.5.6 and 4.5.7. And also
>> 4.4.14.
>>
>> But also the kernel messages are important. There is this obscure
>> enospc with error -28, so either with or without enospc_debug mount
>> option is useful to try in 4.6.3 (I think it's less useful in older
>> kernels).
>>
>> But do try nospace_cache first. If that works, you could then mount
>> with clear_cache one time and see if that provides an enduring fix. It
>> can take some time to rebuild the cache after clear_cache is used.
>>
>>
>>
>> --
>> Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


A lot warnings in dmesg while running thunderbird

2016-07-07 Thread Gabriel C
Hi,

while running thunderbird on Linux 4.6.3 and 4.7.0-rc6 (didn't test
other versions)
I trigger the following:


[ 6393.305675] WARNING: CPU: 6 PID: 5870 at fs/btrfs/inode.c:9306
btrfs_destroy_inode+0x22e/0x2a0 [btrfs]
[ 6393.305677] Modules linked in: fuse ufs qnx4 hfsplus hfs minix ntfs
vfat msdos fat jfs ext4 crc16 jbd2 ext2 mbcache binfmt_misc ctr ccm
af_packet xfs libcrc32c crc32c_generic amdkfd amd_iommu_v2 coretemp
radeon intel_powerclamp ttm arc4 kvm_intel drm_kms_helper ath9k
ath9k_common ath9k_hw ath mac80211 joydev drm agpgart i2c_algo_bit
fb_sys_fops syscopyarea sysfillrect kvm sysimgblt evdev ipmi_ssif
mac_hid snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core
snd_hwdep iTCO_wdt cfg80211 rfkill iTCO_vendor_support ipmi_si
ipmi_msghandler irqbypass pcspkr e1000e ptp pps_core fjes tpm_infineon
button i2c_i801 ac acpi_power_meter i5500_temp hwmon i7core_edac
ioatdma acpi_cpufreq tpm_tis dca edac_core tpm shpchp lpc_ich i2c_core
ppdev sch_fq_codel snd_seq_dummy snd_pcm_oss snd_mixer_oss snd_pcm
snd_seq_oss
[ 6393.305742]  snd_seq_midi_event snd_seq snd_seq_device snd_timer
snd soundcore lp parport_pc parport hid_logitech_hidpp hid_logitech_dj
usbhid hid btrfs xor raid6_pq uas usb_storage sd_mod uhci_hcd ahci
libahci libata megaraid_sas crc32c_intel ehci_pci ehci_hcd usbcore
scsi_mod usb_common dm_mirror dm_region_hash dm_log dm_mod unix ipv6
autofs4 jitterentropy_rng sha256_ssse3 sha256_generic hmac drbg
ansi_cprng
[ 6393.305772] CPU: 6 PID: 5870 Comm: mozStorage #1 Tainted: G          I  4.7.0-rc6 #1
[ 6393.305774] Hardware name: FUJITSU PRIMERGY TX200 S5 /D2709, BIOS 6.00 Rev. 1.14.2709 02/04/2013
[ 6393.305775]   81249333 

[ 6393.305778]  81058044 8807e1de2480 8807e152c898
88083b273800
[ 6393.305780]  8807e152c898 88043f52edd8 88083aa3b910
a051eade
[ 6393.305783] Call Trace:
[ 6393.305791]  [] ? dump_stack+0x5c/0x79
[ 6393.305795]  [] ? __warn+0xb4/0xd0
[ 6393.305809]  [] ? btrfs_destroy_inode+0x22e/0x2a0 [btrfs]
[ 6393.305814]  [] ? __dentry_kill+0x191/0x210
[ 6393.305816]  [] ? dput+0x162/0x270
[ 6393.305819]  [] ? __fput+0x11d/0x1c0
[ 6393.305824]  [] ? task_work_run+0x70/0x90
[ 6393.305827]  [] ? exit_to_usermode_loop+0x9b/0xa0
[ 6393.305829]  [] ? syscall_return_slowpath+0x45/0x50
[ 6393.305836]  [] ? entry_SYSCALL_64_fastpath+0xa2/0xa4
[ 6393.305855] ---[ end trace cd086f0cb12a2d9c ]---

and after that lots:

[ 6509.253357] WARNING: CPU: 12 PID: 7271 at
fs/btrfs/extent-tree.c:4303
btrfs_free_reserved_data_space_noquota+0x68/0x80 [btrfs]
[ 6509.253359] Modules linked in: fuse ufs qnx4 hfsplus hfs minix ntfs
vfat msdos fat jfs ext4 crc16 jbd2 ext2 mbcache binfmt_misc ctr ccm
af_packet xfs libcrc32c crc32c_generic amdkfd amd_iommu_v2 coretemp
radeon intel_powerclamp ttm arc4 kvm_intel drm_kms_helper ath9k
ath9k_common ath9k_hw ath mac80211 joydev drm agpgart i2c_algo_bit
fb_sys_fops syscopyarea sysfillrect kvm sysimgblt evdev ipmi_ssif
mac_hid snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core
snd_hwdep iTCO_wdt cfg80211 rfkill iTCO_vendor_support ipmi_si
ipmi_msghandler irqbypass pcspkr e1000e ptp pps_core fjes tpm_infineon
button i2c_i801 ac acpi_power_meter i5500_temp hwmon i7core_edac
ioatdma acpi_cpufreq tpm_tis dca edac_core tpm shpchp lpc_ich i2c_core
ppdev sch_fq_codel snd_seq_dummy snd_pcm_oss snd_mixer_oss snd_pcm
snd_seq_oss
[ 6509.253420]  snd_seq_midi_event snd_seq snd_seq_device snd_timer
snd soundcore lp parport_pc parport hid_logitech_hidpp hid_logitech_dj
usbhid hid btrfs xor raid6_pq uas usb_storage sd_mod uhci_hcd ahci
libahci libata megaraid_sas crc32c_intel ehci_pci ehci_hcd usbcore
scsi_mod usb_common dm_mirror dm_region_hash dm_log dm_mod unix ipv6
autofs4 jitterentropy_rng sha256_ssse3 sha256_generic hmac drbg
ansi_cprng
[ 6509.253450] CPU: 12 PID: 7271 Comm: systemd-journal Tainted: G        W I  4.7.0-rc6 #1
[ 6509.253451] Hardware name: FUJITSU PRIMERGY TX200 S5 /D2709, BIOS 6.00 Rev. 1.14.2709 02/04/2013
[ 6509.253453]   81249333 

[ 6509.253456]  81058044 1000 88043b720200
0fff
[ 6509.253459]  88083d57990c 8807de10b924 1000
a04f3b08
[ 6509.253462] Call Trace:
[ 6509.253467]  [] ? dump_stack+0x5c/0x79
[ 6509.253470]  [] ? __warn+0xb4/0xd0
[ 6509.253483]  [] ?
btrfs_free_reserved_data_space_noquota+0x68/0x80 [btrfs]
[ 6509.253499]  [] ? btrfs_clear_bit_hook+0x282/0x360 [btrfs]
[ 6509.253515]  [] ? clear_state_bit+0x50/0x1c0 [btrfs]
[ 6509.253532]  [] ? __clear_extent_bit+0x142/0x3c0 [btrfs]
[ 6509.253549]  [] ?
extent_clear_unlock_delalloc+0x5b/0x220 [btrfs]
[ 6509.253560]  [] ? btrfs_release_path+0x27/0x80 [btrfs]
[ 6509.253563]  [] ? kmem_cache_alloc+0x13e/0x150
[ 6509.253567]  [] ? igrab+0x2c/0x50
[ 6509.253583]  [] ?
__btrfs_add_ordered_extent+0x20c/0x2d0 [btrfs]
[ 6509.253595]  [] ?

Re: [PATCH] btrfs: Fix slab accounting flags

2016-07-07 Thread David Sterba
On Thu, Jun 23, 2016 at 09:17:08PM +0300, Nikolay Borisov wrote:
> BTRFS is using a variety of slab caches to satisfy internal needs.
> Those slab caches are always allocated with the SLAB_RECLAIM_ACCOUNT,
> meaning allocations from the caches are going to be accounted as
> SReclaimable. At the same time btrfs is not registering any shrinkers
> whatsoever, thus preventing memory from the slabs to be shrunk. This
> means those caches are not in fact reclaimable.
> 
> To fix this remove the SLAB_RECLAIM_ACCOUNT on all caches apart from the
> inode cache, since this one is being freed by the generic VFS super_block
> shrinker. Also set the transaction related caches as SLAB_TEMPORARY,
> to better document the lifetime of the objects (it just translates
> to SLAB_RECLAIM_ACCOUNT).
> 
> Signed-off-by: Nikolay Borisov 

Reviewed-by: David Sterba 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Out of space error even though there's 100 GB unused?

2016-07-07 Thread Stanislaw Kaminski
Hi all,
I downgraded to 4.4.1-1 - all fine, 4.5.5-1 - also fine, then got
back to 4.6.3-2 - and it's still fine. Apparently running under a
different kernel somehow fixed the glitch (as far as I can test...).

That leaves me with the other question: before the issues, 1.6 TiB was
used; now all the tools report 1.7 TiB used (except for btrfs fs du
/home, this reports 1.6 TiB). How is that possible?

Cheers,
Stan

2016-07-06 19:42 GMT+02:00 Chris Murphy :
> On Wed, Jul 6, 2016 at 3:55 AM, Stanislaw Kaminski
>  wrote:
>
>> Device unallocated:   97.89GiB
>
> There should be no problem creating any type of block group from this
> much space. It's a bug.
>
> I would try regression testing. Kernel 4.5.7 has some changes that may
> or may not relate to this (they should only relate when there is no
> unallocated space left) so you could try 4.5.6 and 4.5.7. And also
> 4.4.14.
>
> But also the kernel messages are important. There is this obscure
> enospc with error -28, so either with or without enospc_debug mount
> option is useful to try in 4.6.3 (I think it's less useful in older
> kernels).
>
> But do try nospace_cache first. If that works, you could then mount
> with clear_cache one time and see if that provides an enduring fix. It
> can take some time to rebuild the cache after clear_cache is used.
>
>
>
> --
> Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


fstrim problem/bug

2016-07-07 Thread M G Berberich
Hello,

On a filesystem with 40 G free space and 54 G used, ‘fstrim -v’ gave
this result:

# fstrim -v /
/: 0 B (0 bytes) trimmed

After running balance it gave a more sensible

# fstrim -v /
/: 37.3 GiB (40007368704 bytes) trimmed

As far as I understand, fstrim should report any unused block to the
disk, so its controller can reuse these blocks. I expected ‘fstrim -v’
to report about 40 G trimmed. The fact that after a balance fstrim
reports a sensible amount of trimmed bytes leads to the conclusion
that fstrim on btrfs does not report unused blocks to the disk (as it
should), but only the blocks of unused chunks. As the fstrim command
only does an ‘ioctl(fd, FITRIM, &range)’, this seems to be a bug in the
kernel-side FITRIM code for btrfs.
In practice this means that without regularly running balance,
fstrim does not work on btrfs.
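
For reference, a minimal sketch of that call, roughly what the fstrim utility 
itself does; the whole-device range and zero minimum length below are just the 
defaults, not anything specific to this report:

/* Hedged sketch of an fstrim-style FITRIM call: offer the whole mounted
 * range for trimming and print how much the filesystem claims to have
 * trimmed.  Error handling is intentionally minimal. */
#include <stdio.h>
#include <stdint.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/fs.h>	/* FITRIM, struct fstrim_range */

int main(int argc, char **argv)
{
	const char *mnt = argc > 1 ? argv[1] : "/";
	struct fstrim_range range = {
		.start = 0,
		.len = UINT64_MAX,	/* "everything"; the fs clamps this */
		.minlen = 0,
	};
	int fd = open(mnt, O_RDONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (ioctl(fd, FITRIM, &range) < 0) {
		perror("FITRIM");
		close(fd);
		return 1;
	}
	/* On return the kernel has rewritten range.len with the bytes trimmed. */
	printf("%s: %llu bytes trimmed\n", mnt, (unsigned long long)range.len);
	close(fd);
	return 0;
}

Whether btrfs walks only unallocated chunk space or also the free space inside 
allocated chunks is exactly the question raised above; the sketch only shows 
where the request enters the kernel.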

MfG
bmg

-- 
“It makes no difference at all what gets  | M G Berberich
 decided today: I'm against it anyway!”   | m...@m-berberich.de
(SPD city councillor Kurt Schindler; Regensburg)  | 
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [Bug-tar] stat() on btrfs reports the st_blocks with delay (data loss in archivers)

2016-07-07 Thread Pavel Raiskup
On Monday, July 4, 2016 1:35:25 PM CEST Andreas Dilger wrote:
> I think in addition to fixing btrfs (because it needs to work with existing
> tar/rsync/etc. tools) it makes sense to *also* fix the heuristics of tar to
> handle this situation more robustly.

What I was rather thinking about is removing the [2] heuristic.  Now that
SEEK_HOLE is implemented, that "completely sparse file" check might be
considered less useful.  With [1], I'm not sure -- is it that bad to
face some false positives there?  (It is documented that tar shouldn't be run
concurrently with other processes writing to the archived files, and waiting for
a flush here is probably a very similar race condition.)

> One option is if st_blocks == 0 then tar should also check if st_mtime is
> less than 60s in the past, and if yes then it should call fsync() on the
> file to flush any unwritten data to disk , or assume the file is not sparse
> and read the whole file, so that it doesn't incorrectly assume that the file
> is sparse and skip archiving the file data.

The reported fact that 'st_blocks != 0' doesn't mean the fsync() call is not
needed, so I'm not 100% sure we should special-case the 'st_blocks == 0' files.
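
To make the two checks being discussed concrete, a hedged illustration (not tar's 
actual code): the st_blocks shortcut next to asking the filesystem directly via 
SEEK_DATA, which fails with ENXIO when the file contains no data at all:

/* Illustration only, not tar's code: compare the st_blocks heuristic with a
 * SEEK_DATA probe for deciding whether a file is entirely sparse. */
#define _GNU_SOURCE	/* SEEK_DATA / SEEK_HOLE (glibc) */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
	struct stat st;
	off_t first_data;
	int fd = open(argc > 1 ? argv[1] : "testfile", O_RDONLY);

	if (fd < 0 || fstat(fd, &st) < 0) {
		perror("open/fstat");
		return 1;
	}

	/* The heuristic from the thread: "no blocks allocated" read as "all
	 * hole", which delayed allocation on btrfs can briefly make wrong. */
	printf("st_blocks heuristic: %s\n",
	       st.st_blocks == 0 ? "looks completely sparse" : "has data");

	/* SEEK_DATA asks the filesystem where the first data extent starts;
	 * -1 with errno == ENXIO means there is none before EOF. */
	first_data = lseek(fd, 0, SEEK_DATA);
	if (first_data == (off_t)-1)
		printf("SEEK_DATA probe: no data extents (or not supported)\n");
	else
		printf("SEEK_DATA probe: first data at offset %lld\n",
		       (long long)first_data);
	close(fd);
	return 0;
}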

--

As this effectively breaks tar's testsuite on btrfs, could we also explicitly
sync in 'genfile'?

Pavel

> 
> Cheers, Andreas
> 
> > [1] 
> > http://git.savannah.gnu.org/cgit/paxutils.git/tree/lib/system.h?id=ec72abd9dd63bbff4534ec77e97b1a6cadfc3cf8#n392
> > [2] 
> > http://git.savannah.gnu.org/cgit/tar.git/tree/src/sparse.c?id=ac065c57fdc1788a2769fb119ed0c8146e1b9dd6#n273

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: raid1 has failing disks, but smart is clear

2016-07-07 Thread Corey Coughlin

Hi Tomasz,
Thanks for the response!  I should clear some things up, though.

On 07/06/2016 03:59 PM, Tomasz Kusmierz wrote:

On 6 Jul 2016, at 23:14, Corey Coughlin  wrote:

Hi all,
Hoping you all can help, have a strange problem, think I know what's going 
on, but could use some verification.  I set up a raid1 type btrfs filesystem on 
an Ubuntu 16.04 system, here's what it looks like:

btrfs fi show
Label: none  uuid: 597ee185-36ac-4b68-8961-d4adc13f95d4
Total devices 10 FS bytes used 3.42TiB
devid1 size 1.82TiB used 1.18TiB path /dev/sdd
devid2 size 698.64GiB used 47.00GiB path /dev/sdk
devid3 size 931.51GiB used 280.03GiB path /dev/sdm
devid4 size 931.51GiB used 280.00GiB path /dev/sdl
devid5 size 1.82TiB used 1.17TiB path /dev/sdi
devid6 size 1.82TiB used 823.03GiB path /dev/sdj
devid7 size 698.64GiB used 47.00GiB path /dev/sdg
devid8 size 1.82TiB used 1.18TiB path /dev/sda
devid9 size 1.82TiB used 1.18TiB path /dev/sdb
devid   10 size 1.36TiB used 745.03GiB path /dev/sdh
Now when I say that the drives' device names change, I'm not saying they 
change when I reboot.  They change while the system is running.  For 
instance, here's the fi show output after I ran a "check --repair" this 
afternoon:


btrfs fi show
Label: none  uuid: 597ee185-36ac-4b68-8961-d4adc13f95d4
Total devices 10 FS bytes used 3.42TiB
devid1 size 1.82TiB used 1.18TiB path /dev/sdd
devid2 size 698.64GiB used 47.00GiB path /dev/sdk
devid3 size 931.51GiB used 280.03GiB path /dev/sdm
devid4 size 931.51GiB used 280.00GiB path /dev/sdl
devid5 size 1.82TiB used 1.17TiB path /dev/sdi
devid6 size 1.82TiB used 823.03GiB path /dev/sds
devid7 size 698.64GiB used 47.00GiB path /dev/sdg
devid8 size 1.82TiB used 1.18TiB path /dev/sda
devid9 size 1.82TiB used 1.18TiB path /dev/sdb
devid   10 size 1.36TiB used 745.03GiB path /dev/sdh

Notice that /dev/sdj in the previous run changed to /dev/sds.  There was 
no reboot, the device name just changed.  I don't know why that is happening, 
but it seems like the majority of the errors are on that drive.  And 
given that I've fixed the start/stop issue on that disk, it probably 
isn't a WD Green issue.




I added a couple disks, and then ran a balance operation, and that took about 3 
days to finish.  When it did finish, tried a scrub and got this message:

scrub status for 597ee185-36ac-4b68-8961-d4adc13f95d4
scrub started at Sun Jun 26 18:19:28 2016 and was aborted after 01:16:35
total bytes scrubbed: 926.45GiB with 18849935 errors
error details: read=18849935
corrected errors: 5860, uncorrectable errors: 18844075, unverified errors: 0

So that seems bad.  Took a look at the devices and a few of them have errors:
...
[/dev/sdi].generation_errs 0
[/dev/sdj].write_io_errs   289436740
[/dev/sdj].read_io_errs    289492820
[/dev/sdj].flush_io_errs   12411
[/dev/sdj].corruption_errs 0
[/dev/sdj].generation_errs 0
[/dev/sdg].write_io_errs   0
...
[/dev/sda].generation_errs 0
[/dev/sdb].write_io_errs   3490143
[/dev/sdb].read_io_errs    111
[/dev/sdb].flush_io_errs   268
[/dev/sdb].corruption_errs 0
[/dev/sdb].generation_errs 0
[/dev/sdh].write_io_errs   5839
[/dev/sdh].read_io_errs    2188
[/dev/sdh].flush_io_errs   11
[/dev/sdh].corruption_errs 1
[/dev/sdh].generation_errs 16373

So I checked the smart data for those disks, they seem perfect, no reallocated 
sectors, no problems.  But one thing I did notice is that they are all WD Green 
drives.  So I'm guessing that if they power down and get reassigned to a new 
/dev/sd* letter, that could lead to data corruption.  I used idle3ctl to turn 
off the shut down mode on all the green drives in the system, but I'm having 
trouble getting the filesystem working without the errors.  I tried a 'check 
--repair' command on it, and it seems to find a lot of verification errors, but 
it doesn't look like things are getting fixed.
  But I have all the data on it backed up on another system, so I can recreate 
this if I need to.  But here's what I want to know:

1.  Am I correct about the issues with the WD Green drives, if they change 
mounts during disk operations, will that corrupt data?

I just wanted to chip in with WD Green drives. I have a RAID10 running on 6x2TB of 
those, actually had for ~3 years. If disk goes down for spin down, and you try to 
access something - kernel & FS & whole system will wait for drive to re-spin 
and everything works OK. I’ve never had a drive being reassigned to different /dev/sdX 
due to spin down / up.
2 years ago I was having a corruption due to not using ECC ram on my system and 
one of RAM modules started producing errors that were never caught up by CPU / 
MoBo. Long story short, guy here managed to point me to the right direction and 
I started shifting my data to hopefully new and not corrupted FS … but I was 
sceptical of