from:"Marc Haber"

btrfs won't mount any more

2017-04-11 Thread Marc Haber

Hi,

I have wrecked another btrfs file system, probably for good this time.

It's an 80 GB filesystem from 2015, in my secondary notebook, on an
encrypted SSD. The btrfs holds the root filesystem and the rest of the
system as well.

I have a cronjob that makes snapshots of the system directories daily,
and of /home every ten minutes. A second cronjob cleans up old snapshots
so that the number of snapshots present stays between about 400 and 600.
This is the key feature that made me decide for btrfs in the first
place.

Last week (I was on kernel 4.10.8 with Debian unstable), I was forced to
promote the secondary laptop to the primary one, which resulted in
serious work being done on it for the first time. Over time, the filesystem
filled up without me noticing and was finally 100% full.

I then cleaned up about four gigs by deleting a couple of redundant ISO
images and some snapshots that were not due for regular deletion yet. I
then started a btrfs balance / -d50, unfortunately without stopping the
snapshot-making cronjob. This resulted in the notebook becoming
unusable for extended periods of time, without even being able to log
in. After running for some 30 hours, the notebook ran out of battery
(don't ask, stupid me).

After rebooting, the btrfs balance resumed immediately after mounting
the root fs. System unusable again. After a day, I finally had a root
shell and was able to issue a btrfs balance cancel /. Unfortunately, the
system didn't care about that command and happily continued to balance.
After another 30 hours or so, I lost patience and reset the system.

To be able to keep control of the system and to monitor operations from
remote, I installed a fresh copy of Debian unstable with the same 4.10.8
kernel on a USB stick and booted the notebook from the stick. I brought
up the system and tried to mount the btrfs. The mount process quickly
went up to 100 % CPU usage and stayed that way until I went to bed last
night. This morning, the machine had dropped off the network (I couldn't
ping the default gateway any more although the network looked fine) and
was spewing kernel oopses of about 80 lines each (too long to even scroll
back) every few seconds.

I will try to tweak kernel.printk tonight so that I get my console back
and see whether the oopses are also in the journal, dmesg or syslog so that
I can copy and paste them. I also have a reasonably current backup of the
filesystem, so nuking it from orbit is an option; I would, however, hate
losing my snapshots.

Is it worthwhile to save information about the borked filesystem, or
does the btrfs community just not care about a heavily snapshotted
two-year-old filesystem?

I would like to hear comments and opinions about what has happened here
and how to avoid things like that in the future. Do more recently
created btrfs filesystems have safeguards against damage that may occur
when a filesystem fills up?

Greetings
Marc


-- 
-----
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: btrfs won't mount any more

2017-04-13 Thread Marc Haber

On Tue, Apr 11, 2017 at 06:15:02PM +0200, Adam Borowski wrote:
> On Tue, Apr 11, 2017 at 09:15:31AM +0200, Marc Haber wrote:
> > I have wrecked another btrfs file system, probably for good this time.
> > 
> > It's an 80 GB filesystem from 2015, in my secondary notebook, on an
> > encrypted SSD. The btrfs holds the root filesystem and the rest of the
> > system as well.
> > 
> > I have a cronjob that makes snapshots of the system directories daily,
> > and of /home every ten minutes. A second cronjob cleans up old snapshots
> > so that the number of snapshots present stays between about 400 and 600.
> > This is the key feature that made me decide for btrfs in the first
> > place.
> > 
> > Last week (I was on kernel 4.10.8 with Debian unstable), I was forced to
> > promote the secondary laptop to the primary one, which resulted in
> > serious work being done on it for the first time. Over time, the filesystem
> > filled up without me noticing and was finally 100% full.
> 
> CoW and log-structured filesystems in general tend to take 100% full
> conditions far worse than traditional filesystems, but it still should
> result only in performance degradation and/or metadata-vs-data issues rather
> than a fatal error.  So if this is the cause, you obviously hit a bug.

Given that btrfs has a reputation of not gracefully handling out of
space situations, the trouble was expected.

> > I then cleaned up about four gigs by deleting a couple of redundant ISO
> > images and some snapshots that were not due for regular deletion yet. I
> > then started a btrfs balance / -d50, unfortunately without stopping the
> > snapshot-making cronjob. This resulted in the notebook becoming
> > unusable for extended periods of time, without even being able to log
> > in. After running for some 30 hours, the notebook ran out of battery
> > (don't ask, stupid me).
> 
> Ouch, this is generally harmless unless your disk lies about barriers. 
> Btrfs absolutely depends on them, and tends to suffer catastrophic
> corruption if writes were reordered when they shouldn't.

So if the disk actually lied, I would have run into trouble much
earlier. It's an SSD from 2013 or 2014, I think from Kingston. The box
is offline and remote at the moment, so I cannot give the exact type.

Between the btrfs and the actual disk there was a dm-crypt/LUKS layer
and LVM, but I can reproduce the crash from an image on a different host
now.

> Even in such a case, using an older root would help, although that
> possibility is almost certainly gone now.

How would I try that? A pointer to the docs is fine.

> > After rebooting, the btrfs balance resumed immediately after mounting
> > the root fs. System unusable again. After a day, I finally had a root
> > shell and was able to issue a btrfs balance cancel /. Unfortunately, the
> > system didn't care about that command and happily continued to balance.
> > After another 30 hours or so, I lost patience and reset the system.
> 
> Mounting with -o skip_balance may help.

No, same issue.

> Two years old is not much; neither the format nor its use has changed
> noticeably since then.  You run the very latest upstream stable kernel, in
> almost its freshest version (4.10.9 was tagged Saturday).  400-600 snapshots
> is nothing remarkable, it's the usual range.  The only thing differing from
> the most typical usage is your snapshot frequency, and even that is nothing
> frightening.

Doesn't SuSE's snapper do snapshots every ten minutes?

I can reproduce the crash on 4.10.10 now.

> Thus, a failure like yours in mainstream use is certainly interesting.
> 
> However, I have a piece of advice for now: could you make a copy of the
> filesystem?  80GB is _nothing_: it's way below the accuracy of du -h on a
> modern HDD, and not a burden for a typical SSD.  Being able to investigate
> it from a bigger machine would be convenient, and having a copy would let
> you use dangerous rescue methods without any risk.  And debugging oopses on
> a laptop with no working serial or netconsole sucks; if you have no other
> machine at hand then running the victim kernel in qemu-kvm might offer a
> poor-man's console.

qemu-kvm is also a good idea; so far I have logs from a physical box
mounting the btrfs image via loopback.

> For advice for your specific case, we can't do much without seeing the
> actual error messages.

I do have a dd copy of the device now.

$ sudo losetup --find --show ./dropbtr0.btrfs
$ sudo mount -o skip_balance -t btrfs /dev/loop0 /mnt/tempdisk

immediately results in:

Apr 12 22:37:48 fan kernel: [  124.742104] loop: module loaded
Apr 12 22:37:48 fan kernel: [  124.784727] BTRFS: device label dropbtr0 dev

Re: btrfs won't mount any more

2017-04-20 Thread Marc Haber

On Thu, Apr 13, 2017 at 10:45:09AM +0200, Marc Haber wrote:
> On Tue, Apr 11, 2017 at 06:15:02PM +0200, Adam Borowski wrote:
> > Ouch, this is generally harmless unless your disk lies about barriers. 
> > Btrfs absolutely depends on them, and tends to suffer catastrophic
> > corruption if writes were reordered when they shouldn't.
> 
> So if the disk actually lied, I would have run into trouble much
> earlier. It's an SSD from 2013 or 2014, I think from Kingston. The box
> is offline and remote at the moment, so I cannot give the exact type.

I have re-built the btrfs and restored from backup, so I can access the
disk again. It's a Crucial/Micron RealSSD m4/C400/P400 M4-CT256M4SSD1
with 256 GB capacity.

I do have an image of the bad btrfs that makes the kernel oops on mount,
reproducibly. Does this help in debugging?

Greetings
Marc


Re: btrfs won't mount any more

2017-05-02 Thread Marc Haber

On Thu, Apr 13, 2017 at 10:45:09AM +0200, Marc Haber wrote:
> I do have a dd copy of the device now.
> 
> $ sudo losetup --find --show ./dropbtr0.btrfs
> $ sudo mount -o skip_balance -t btrfs /dev/loop0 /mnt/tempdisk
> 
> immediately results in:
> 
> Apr 12 22:37:48 fan kernel: [  124.742104] loop: module loaded
> Apr 12 22:37:48 fan kernel: [  124.784727] BTRFS: device label dropbtr0 devid 1 transid 1530529 /dev/loop0
> Apr 12 22:38:07 fan kernel: [  143.120268] BTRFS info (device loop0): disk space caching is enabled
> Apr 12 22:38:07 fan kernel: [  143.207872] BUG: unable to handle kernel NULL pointer dereference at 01f0

This is now https://bugzilla.kernel.org/show_bug.cgi?id=195631

Greetings
Marc

Again, no space left on device while rebalancing and recipe doesnt work

2016-02-27 Thread Marc Haber

Hi,

I again have the issue of no space left on device while rebalancing
(with btrfs-tools 4.4.1 on kernel 4.4.2 on Debian unstable):

mh@fan:~$ sudo btrfs balance start /mnt/fanbtr
ERROR: error during balancing '/mnt/fanbtr': No space left on device
mh@fan:~$ sudo btrfs fi show /mnt/fanbtr
mh@fan:~$ sudo btrfs fi show -m
Label: 'fanbtr'  uuid: 4198d1bc-e3ce-40df-a7ee-44a2d120bff3
    Total devices 1 FS bytes used 116.49GiB
    devid    1 size 417.19GiB used 177.06GiB path /dev/mapper/fanbtr
mh@fan:~$ sudo btrfs fi df /mnt/fanbtr
Data, single: total=113.00GiB, used=112.77GiB
System, DUP: total=32.00MiB, used=48.00KiB
Metadata, DUP: total=32.00GiB, used=3.72GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
mh@fan:~$

The filesystem was recently resized from 300 GB to 420 GB.

Why does btrfs fi show /mnt/fanbtr not give any output? Why does btrfs
fi df /mnt/fanbtr say that my data space is only 113 GiB large?

btrfs balance start -dusage=5 works up to -dusage=100:

mh@fan:~$ sudo btrfs balance start -dusage=100 /mnt/fanbtr
Done, had to relocate 111 out of 179 chunks
mh@fan:~$ sudo btrfs balance start -dusage=100 /mnt/fanbtr
Done, had to relocate 111 out of 179 chunks
mh@fan:~$ sudo btrfs balance start -dusage=100 /mnt/fanbtr
Done, had to relocate 110 out of 179 chunks
mh@fan:~$ sudo btrfs balance start -dusage=100 /mnt/fanbtr
Done, had to relocate 109 out of 179 chunks
mh@fan:~$ sudo btrfs balance start /mnt/fanbtr
ERROR: error during balancing '/mnt/fanbtr': No space left on device
mh@fan:~$

What is going on here? How do I get out of this situation?

Greetings
Marc



Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-02-27 Thread Marc Haber

On Sun, Feb 28, 2016 at 12:15:21AM +0100, Martin Steigerwald wrote:
> On Samstag, 27. Februar 2016 22:14:50 CET Marc Haber wrote:
> > I again have the issue of no space left on device while rebalancing
> > (with btrfs-tools 4.4.1 on kernel 4.4.2 on Debian unstable):
> > 
> > mh@fan:~$ sudo btrfs balance start /mnt/fanbtr
> > ERROR: error during balancing '/mnt/fanbtr': No space left on device
> > mh@fan:~$ sudo btrfs fi show /mnt/fanbtr
> > mh@fan:~$ sudo btrfs fi show -m
> > Label: 'fanbtr'  uuid: 4198d1bc-e3ce-40df-a7ee-44a2d120bff3
> >     Total devices 1 FS bytes used 116.49GiB
> >     devid    1 size 417.19GiB used 177.06GiB path /dev/mapper/fanbtr
> 
> Hmmm, that's still a ton of space to allocate chunks from.
> 
> > mh@fan:~$ sudo btrfs fi df /mnt/fanbtr
> > Data, single: total=113.00GiB, used=112.77GiB
> > System, DUP: total=32.00MiB, used=48.00KiB
> > Metadata, DUP: total=32.00GiB, used=3.72GiB
> > GlobalReserve, single: total=512.00MiB, used=0.00B
> > mh@fan:~$
> > 
> > The filesystem was recently resized from 300 GB to 420 GB.
> > 
> > Why does btrfs fi show /mnt/fanbtr not give any output? Why does btrfs
> > fi df /mnt/fanbtr say that my data space is only 113 GiB large?
> 
> Cause it is.
> 
> The "used" in the "devid 1" line of btrfs fi sh is data + 2x metadata +
> 2x system = 113 GiB + 2 * 32 GiB + 2 * 32 MiB, i.e. how much of the
> device's size is allocated for chunks.
> 
> The value one line above is what is allocated inside the chunks.
> 
> I.e. the "devid 1" line is the "total" values of btrfs fi df summed up
> (DUP counted twice), and the line above is the "used" values of btrfs fi df
> summed up. And… with more devices you have more fun.
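
A minimal sketch of that arithmetic, assuming the `btrfs fi df` lines quoted
above and assuming that GlobalReserve is carved out of metadata and therefore
not counted separately. The helper names (`to_bytes`, `summarize`) are made up
for illustration and are not part of btrfs-progs:

```python
import re

# Output of `btrfs fi df /mnt/fanbtr` as posted in this thread.
FI_DF = """\
Data, single: total=113.00GiB, used=112.77GiB
System, DUP: total=32.00MiB, used=48.00KiB
Metadata, DUP: total=32.00GiB, used=3.72GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
"""

UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}
COPIES = {"single": 1, "DUP": 2}  # only the profiles that appear in this thread
LINE = re.compile(r"(\w+), (\w+): total=([\d.]+)(\w+), used=([\d.]+)(\w+)")


def to_bytes(number, unit):
    """Convert a value such as ("32.00", "MiB") to bytes."""
    return float(number) * UNITS[unit]


def summarize(fi_df_text):
    """Return (allocated_on_device, used_inside_chunks) in bytes."""
    allocated = used = 0.0
    for kind, profile, total, t_unit, in_use, u_unit in LINE.findall(fi_df_text):
        if kind == "GlobalReserve":
            continue  # assumption: part of metadata, not a chunk type of its own
        allocated += COPIES[profile] * to_bytes(total, t_unit)  # DUP counts twice
        used += to_bytes(in_use, u_unit)
    return allocated, used


allocated, used = summarize(FI_DF)
print(f"allocated: {allocated / 2**30:.2f} GiB")  # compare with "devid 1 ... used"
print(f"used:      {used / 2**30:.2f} GiB")       # compare with "FS bytes used"
```

The two printed figures can be compared directly with the `btrfs fi show -m`
output earlier in this message.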

Why wouldn't btrfs allocate more data chunks from the ample free space?

> I suggest:
> 
> merkaba:~> btrfs fi usage -T /daten


[2/498]mh@fan:~$ sudo btrfs fi usage /mnt/fanbtr
Overall:
    Device size:             417.19GiB
    Device allocated:        177.06GiB
    Device unallocated:      240.12GiB
    Device missing:              0.00B
    Used:                    120.23GiB
    Free (estimated):        240.33GiB  (min: 120.27GiB)
    Data ratio:                   1.00
    Metadata ratio:               2.00
    Global reserve:          512.00MiB  (used: 0.00B)

Data,single: Size:113.00GiB, Used:112.79GiB
   /dev/mapper/fanbtr 113.00GiB

Metadata,DUP: Size:32.00GiB, Used:3.72GiB
   /dev/mapper/fanbtr  64.00GiB

System,DUP: Size:32.00MiB, Used:48.00KiB
   /dev/mapper/fanbtr  64.00MiB

[3/498]mh@fan:~$ sudo btrfs fi usage -T /mnt/fanbtr
Overall:
    Device size:             417.19GiB
    Device allocated:        177.06GiB
    Device unallocated:      240.12GiB
    Device missing:              0.00B
    Used:                    120.23GiB
    Free (estimated):        240.33GiB  (min: 120.27GiB)
    Data ratio:                   1.00
    Metadata ratio:               2.00
    Global reserve:          512.00MiB  (used: 0.00B)

                      Data      Metadata System
Id Path               single    DUP      DUP      Unallocated
-- ------------------ --------- -------- -------- -----------
 1 /dev/mapper/fanbtr 113.00GiB 64.00GiB 64.00MiB   240.12GiB
-- ------------------ --------- -------- -------- -----------
   Total              113.00GiB 32.00GiB 32.00MiB   240.12GiB
   Used               112.79GiB  3.72GiB 48.00KiB
[4/499]mh@fan:~$

> (this is actually the situation asking for hung task trouble with kworker
> threads searching for free space inside chunks, as no new chunks can be
> allocated; let's hope kernel 4.4 finally really has fixes for this)

I am running a 4.4.2 kernel on the system in question.

> Adding a new device temporarily, doing the balance and then removing it.

I currently refuse to do this on a 400 GiB device that has more than
half of its capacity free. I do expect a modern filesystem to get out
of that situation without a manual intervention this invasive.

> Before that I'd try to balance the metadata chunks, because
> 
> > Metadata, DUP: total=32.00GiB, used=3.72GiB
> 
> 32 GiB chunks allocated, only 3.72 GiB used.

Why would I rebalance metadata if there is less than 20 % used?

[21/504]mh@fan:~$ sudo btrfs balance start -musage=5 /mnt/fanbtr
ERROR: error during balancing '/mnt/fanbtr': No space left on device
There may be more info in syslog - try dmesg | tail
[22/505]mh@fan:~$ sudo btrfs balance start -musage=1 /mnt/fanbtr
Done, had to relocate 56 out of 179 chunks
[23/506]mh@fan:~$ sudo btrfs balance start -musage=1 /mnt/fanbtr
Done, had to relocate 56 out of 179 chunks
[24/506]mh@fan:~$ sudo btrfs balance start -musage=1 /mnt/fanbt

Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-02-28 Thread Marc Haber

On Sun, Feb 28, 2016 at 12:22:45AM +0000, Hugo Mills wrote:
> On Sun, Feb 28, 2016 at 01:08:29AM +0100, Marc Haber wrote:
> > Why wouldn't btrfs allocate more data chunks from the ample free space?
> 
> It's a bug. It's been around for years (literally), but nobody's
> tracked it down and fixed it yet.

Is there a fix/workaround?

Greetings
Marc


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-02-29 Thread Marc Haber

Hi,

On Mon, Feb 29, 2016 at 09:56:58AM +0800, Qu Wenruo wrote:
> Marc Haber wrote on 2016/02/27 22:14 +0100:
> >I have again the issue of no space left on device while rebalancing
> >(with btrfs-tools 4.4.1 on kernel 4.4.2 on Debian unstable):
> >
> >mh@fan:~$ sudo btrfs balance start /mnt/fanbtr
> >ERROR: error during balancing '/mnt/fanbtr': No space left on device
> 
> It seems that the ENOSPC error only happens when balancing all chunks.
> 
> And did you run any other heavy IO in the background?

Not when running those last commands for the mailing list post.

> BTW, is there any kernel log when the ENOSPC happens?

> Would you please try the following commands to see which one caused the
> problem?
> And would you please provide the dmesg of them?
> 
> # btrfs balance start -dprofiles=single /mnt/fanbtr
> # btrfs balance start -mprofile=dup /mnt/fanbtr
> # btrfs balance start -sprofile=dup /mnt/fanbtr

I have attached the logs. I used logger(1) to have in syslog which
command I executed, and I have piped the userspace's output to logger
so that the syslog entries match the userspace output.

-mprofile gave an error message, so I tried -mprofiles instead, and
-sprofiles wanted me to use --force, so I did that as well.

All three balance commands above finished all right without
running into ENOSPC, while running a plain balance (which is also part
of the log) errors out every time.

Also, the -dprofiles=single run caused a number of INFO messages about
btrfs-cleaner and btrfs-balance processes being stuck for more than
120 seconds.

I now have a kworker and a btrfs-transact kernel thread taking most of
one CPU core each, even after the userspace programs have terminated.
Is there a way to find out what these threads are actually doing?

Greetings
Marc


balance hangs and starts again on reboot

2016-03-04 Thread Marc Haber

Hi,

I have another btrfs on the same host that has no the no space left on
device balance issue, but on another disk. On this btrfs, it seems
like a balance process is stuck, with a lot of hanging kernel
threads. After a reboot, when I mount the filesystem, the balance
immediately starts again. btrfs balance cancel just hangs around with
no visible reaction for hours.

Log appended. Is there rescue?

Greetings
Marc



Mar  4 17:27:36 fan mh: mount /mnt/snapshots/fanbtr_r
Mar  4 17:27:41 fan kernel: [  453.124792] BTRFS info (device dm-17): disk space caching is enabled
Mar  4 17:27:41 fan kernel: [  453.124797] BTRFS: has skinny extents
Mar  4 17:27:46 fan kernel: [  458.308485] BTRFS: checking UUID tree
Mar  4 17:27:46 fan kernel: [  458.308493] BTRFS info (device dm-17): continuing balance
Mar  4 17:27:50 fan kernel: [  462.297618] BTRFS info (device dm-17): relocating block group 150434162 flags 36
Mar  4 17:32:08 fan kernel: [  720.473141] INFO: task btrfs-balance:3753 blocked for more than 120 seconds.
Mar  4 17:32:08 fan kernel: [  720.473154]   Not tainted 4.4.4-zgws1 #2
Mar  4 17:32:08 fan kernel: [  720.473159] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar  4 17:32:08 fan kernel: [  720.473165] btrfs-balance   D 88062fc556c0     0  3753  2 0x
Mar  4 17:32:08 fan kernel: [  720.473176]  88060c3f0c00 0001 880036da4000 880036da3bd0
Mar  4 17:32:08 fan kernel: [  720.473184]  8806113b6c60 0002 8140d24c 88060c3f0c00
Mar  4 17:32:08 fan kernel: [  720.473192]  8140b759 7fff 8140d28a 88062fc156c0
Mar  4 17:32:08 fan kernel: [  720.473199] Call Trace:
Mar  4 17:32:08 fan kernel: [  720.473214]  [] ? usleep_range+0x35/0x35
Mar  4 17:32:08 fan kernel: [  720.473225]  [] ? schedule+0x6f/0x7c
Mar  4 17:32:08 fan kernel: [  720.473231]  [] ? schedule_timeout+0x3e/0x128
Mar  4 17:32:08 fan kernel: [  720.473241]  [] ? check_preempt_curr+0x25/0x63
Mar  4 17:32:08 fan kernel: [  720.473248]  [] ? ttwu_do_wakeup+0xf/0xd0
Mar  4 17:32:08 fan kernel: [  720.473255]  [] ? _raw_spin_unlock_irqrestore+0xd/0xe
Mar  4 17:32:08 fan kernel: [  720.473263]  [] ? try_to_wake_up+0x1cb/0x1dc
Mar  4 17:32:08 fan kernel: [  720.473271]  [] ? __wait_for_common+0x121/0x16d
Mar  4 17:32:08 fan kernel: [  720.473278]  [] ? __wait_for_common+0x121/0x16d
Mar  4 17:32:08 fan kernel: [  720.473286]  [] ? wake_up_q+0x3b/0x3b
Mar  4 17:32:08 fan kernel: [  720.473339]  [] ? btrfs_async_run_delayed_refs+0xbf/0xd5 [btrfs]
Mar  4 17:32:08 fan kernel: [  720.473390]  [] ? __btrfs_end_transaction+0x291/0x2d5 [btrfs]
Mar  4 17:32:08 fan kernel: [  720.473438]  [] ? relocate_block_group+0x2b8/0x4ab [btrfs]
Mar  4 17:32:08 fan kernel: [  720.473488]  [] ? btrfs_wait_ordered_roots+0x175/0x191 [btrfs]
Mar  4 17:32:08 fan kernel: [  720.473536]  [] ? btrfs_relocate_block_group+0x132/0x25a [btrfs]
Mar  4 17:32:08 fan kernel: [  720.473585]  [] ? btrfs_relocate_chunk.isra.35+0x3c/0xad [btrfs]
Mar  4 17:32:08 fan kernel: [  720.473633]  [] ? btrfs_balance+0xd23/0xd8f [btrfs]
Mar  4 17:32:08 fan kernel: [  720.473684]  [] ? balance_kthread+0x4f/0x6d [btrfs]
Mar  4 17:32:08 fan kernel: [  720.473732]  [] ? btrfs_balance+0xd8f/0xd8f [btrfs]
Mar  4 17:32:08 fan kernel: [  720.473740]  [] ? kthread+0x95/0x9d
Mar  4 17:32:08 fan kernel: [  720.473747]  [] ? kthread_parkme+0x16/0x16
Mar  4 17:32:08 fan kernel: [  720.473754]  [] ? ret_from_fork+0x3f/0x70
Mar  4 17:32:08 fan kernel: [  720.473761]  [] ? kthread_parkme+0x16/0x16
Mar  4 17:34:08 fan kernel: [  840.465597] INFO: task btrfs-balance:3753 blocked for more than 120 seconds.
Mar  4 17:34:08 fan kernel: [  840.465610]   Not tainted 4.4.4-zgws1 #2
Mar  4 17:34:08 fan kernel: [  840.465615] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar  4 17:34:08 fan kernel: [  840.465621] btrfs-balance   D 88062fc556c0     0  3753  2 0x
Mar  4 17:34:08 fan kernel: [  840.465632]  88060c3f0c00 0001 880036da4000 880036da3bd0
Mar  4 17:34:08 fan kernel: [  840.465641]  8806113b6c60 0002 8140d24c 88060c3f0c00
Mar  4 17:34:08 fan kernel: [  840.465648]  8140b759 7fff 8140d28a 88062fc156c0
Mar  4 17:34:08 fan kernel: [  840.465655] Call Trace:
Mar  4 17:34:08 fan kernel: [  840.465669]  [] ? usleep_range+0x35/0x35
Mar  4 17:34:08 fan kernel: [  840.465680]  [] ? schedule+0x6f/0x7c
Mar  4 17:34:08 fan kernel: [  840.465687]  [] ? schedule_timeout+0x3e/0x128

Re: balance hangs and starts again on reboot

2016-03-05 Thread Marc Haber

Hi Chris,

I apologize for not being able to deliver logs in the way you might
find them more helpful.

On Fri, Mar 04, 2016 at 12:08:10PM -0700, Chris Murphy wrote:
> On Fri, Mar 4, 2016 at 10:31 AM, Marc Haber wrote:
> > I have another btrfs on the same host that has no the no space left on
> > device balance issue, but on another disk. On this btrfs, it seems
> > like a balance process is stuck, with a lot of hanging kernel
> > threads. After a reboot, when I mount the filesystem, the balance
> > immediately starts again. btrfs balance cancel just hangs around with
> > no visible reaction for hours.
> >
> > Log appended. Is there rescue?
> 
> The log is made much more useful if you can sysrq+w while the blocked
> task is happening; and then dmesg or journalctl -k to get the results
> into a file for attachment to avoid the annoying MUA wrapping.

This list has repeatedly eaten log attachments without giving any
indication why. I had assumed that attachments are disallowed here,
and I am taking care that inserted logs are not wrapped on
my side. The list archives
(http://www.spinics.net/lists/linux-btrfs/msg52663.html) show that my
efforts not to cause wrapping on my side were actually successful.

What is the most helpful way to include logs? Pastebinning them would
probably reduce the list archives' usefulness due to pastes
expiring, attaching doesn't work (see above), and including them inline
causes "annoying MUA wrapping".  I only have 24 years of e-mail
experience, so I'm a clueless newbie; maybe someone can advise me how to
do that properly.

I'm going to try the sysrq+w thing next time things happen.

Greetings
Marc


Re: balance hangs and starts again on reboot

2016-03-05 Thread Marc Haber

On Fri, Mar 04, 2016 at 07:09:39PM +0100, Holger Hoffstätte wrote:
> On 03/04/16 18:31, Marc Haber wrote:
> > I have another btrfs on the same host that has no the no space left on
> > device balance issue, but on another disk. On this btrfs, it seems
> > like a balance process is stuck, with a lot of hanging kernel
> > threads. After a reboot, when I mount the filesystem, the balance
> > immediately starts again. btrfs balance cancel just hangs around with
> > no visible reaction for hours.
> > 
> > Log appended. Is there rescue?
> 
> Can't offer much help other than to recommend to *always* mount with
> -o skip_balance, which IMHO should have been the default behaviour
> from the beginning.

That's an important hint. The btrfs balance cancel did go through
overnight, though.

> Then try to balance in small increments.

-dusage=5 and incrementing? Or what do you mean by "in small
increments"?
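
One possible reading of "in small increments", sketched under the assumption
that it means repeated data balances with a rising -dusage=N filter (the
filter used earlier in this thread). The threshold list and both function
names are illustrative only, and the sketch defaults to a dry run that just
prints the commands:

```python
import subprocess


def balance_commands(mountpoint, thresholds=(5, 10, 20, 40, 60, 80)):
    """Build one `btrfs balance start -dusage=N` command per threshold."""
    return [
        ["btrfs", "balance", "start", f"-dusage={pct}", mountpoint]
        for pct in thresholds
    ]


def run_incremental_balance(mountpoint, thresholds=(5, 10, 20, 40, 60, 80),
                            dry_run=True):
    """Run the passes in order; stop at the first one that fails (e.g. ENOSPC)."""
    for cmd in balance_commands(mountpoint, thresholds):
        print(" ".join(cmd))
        if dry_run:
            continue  # only show what would be run
        if subprocess.run(cmd).returncode != 0:
            return False
    return True


if __name__ == "__main__":
    # Dry run: prints the commands without touching any filesystem.
    run_incremental_balance("/mnt/fanbtr")
```

Each pass only relocates chunks that are at most N percent full, so the early
passes are cheap and free up space for the later ones.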

Greetings
Marc


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-05 Thread Marc Haber

Hi,

I have not seen this message coming back to the mailing list. Was it
again too long?

I have pastebinned the log at http://paste.debian.net/412118/

On Tue, Mar 01, 2016 at 08:51:32PM +0000, Duncan wrote:
> There has been something bothering me about this thread that I wasn't 
> quite pinning down, but here it is.
> 
> If you look at the btrfs fi df/usage numbers, data chunk total vs. used 
> are very close to one another (113 GiB total, 112.77 GiB used, single 
> profile, assuming GiB data chunks, that's only a fraction of a single 
> data chunk unused), so balance would seem to be getting thru them just 
> fine.

Where would you see those numbers? I have these, pre-balance:

Mar  2 20:28:01 fan root: Data, single: total=77.00GiB, used=76.35GiB
Mar  2 20:28:01 fan root: System, DUP: total=32.00MiB, used=48.00KiB
Mar  2 20:28:01 fan root: Metadata, DUP: total=86.50GiB, used=2.11GiB
Mar  2 20:28:01 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B

> But there's a /huge/ spread between total vs. used metadata (32 GiB 
> total, under 4 GiB used, clearly _many_ empty or nearly empty chunks), 
> implying that has not been successfully balanced in quite some time, if 
> ever.

This is possible, yes.

>   So I'd surmise the problem is in metadata, not in data.
> 
> Which would explain why balancing data works fine, but a whole-filesystem 
> balance doesn't, because it's getting stuck on the metadata, not the data.
> 
> Now the balance metadata filters include system as well, by default, and 
> the -mprofiles=dup and -sprofiles=dup balances finished, apparently 
> without error, which throws a wrench into my theory.

It also finishes without changing things, post-balance:
Mar  2 21:55:37 fan root: Data, single: total=77.00GiB, used=76.36GiB
Mar  2 21:55:37 fan root: System, DUP: total=32.00MiB, used=80.00KiB
Mar  2 21:55:37 fan root: Metadata, DUP: total=99.00GiB, used=2.11GiB
Mar  2 21:55:37 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B

Wait, Metadata total actually _grew_???

> But while we have the btrfs fi df from before the attempt with the 
> profiles filters, we don't have the same output from after.

We now have everything. New log attached.

> > I'd like to remove unused snapshots and keep the number of them to 4
> > digits, as a workaround.
> 
> I'll strongly second that recommendation.  Btrfs is known to have 
> snapshot scaling issues at 10K snapshots and above.  My strong 
> recommendation is to limit snapshots per filesystem to 3000 or less, with 
> a target of 2000 per filesystem or less if possible, and an ideal of 1000 
> per filesystem or less if it's practical to keep it to that, which it 
> should be with thinning, if you're only snapshotting 1-2 subvolumes, but 
> may not be if you're snapshotting more.

I'm snapshotting /home every 10 minutes, the filesystem that I have
been posting logs from has about 400 snapshots, and snapshot cleanup
works fine. The slow snapshot removal is a different filesystem on the
same host which is on a rotating rust HDD, and is much bigger.

> By 3000 snapshots per filesystem, you'll be beginning to notice slowdowns 
> in some btrfs maintenance commands if you're sensitive to it, tho it's 
> still at least practical to work with, and by 10K, it's generally 
> noticeable by all, at least once they thin down to 2K or so, as it's 
> suddenly faster again!  Above 100K, some btrfs maintenance commands slow 
> to a crawl and doing that sort of maintenance really becomes impractical 
> enough that it's generally easier to backup what you need to and blow 
> away the filesystem to start again with a new one, than it is to try to 
> recover the existing filesystem to a workable state, given that 
> maintenance can at that point take days to weeks.

Ouch. This should not be the case, or btrfs subvolume snapshot should
at least emit a warning. It is not good that it is so easy to get a
filesystem into a state this bad.

> So 5-digits of snapshots on a filesystem is definitely well outside of 
> the recommended range, to the point that in some cases, particularly 
> approaching 6-digits of snapshots, it'll be more practical to simply 
> ditch the filesystem and start over, than to try to work with it any 
> longer.  Just don't do it; setup your thinning schedule so your peak is 
> 3000 snapshots per filesystem or under, and you won't have that problem 
> to worry about. =:^)

That needs to be documented prominently. The ZFS fanbois will love that.
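
The limits quoted above lend themselves to a small check that could run
from the same cron job that thins the snapshots. What follows is only a
sketch: the function name snapshot_advice is made up for illustration,
and the thresholds are simply the 1000/2000/3000 figures from Duncan's
mail, not values that btrfs itself enforces.

```shell
#!/bin/sh
# Hypothetical helper: classify a snapshot count against the
# per-filesystem limits recommended in this thread
# (ideal <= 1000, target <= 2000, peak <= 3000).
snapshot_advice() {
    count=$1
    if [ "$count" -le 1000 ]; then
        echo "ideal"
    elif [ "$count" -le 2000 ]; then
        echo "on target"
    elif [ "$count" -le 3000 ]; then
        echo "acceptable, thin soon"
    else
        echo "too many, thin now"
    fi
}

# On a live system the count could come from the snapshot-only listing:
#   snapshot_advice "$(btrfs subvolume list -s / | wc -l)"
```

With the roughly 400 snapshots mentioned above, this would print "ideal".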

> Oh, and btrfs quota management exacerbates the scaling issues 
> dramatically.  If you're using btrfs quotas

Am not, thankfully.

Greetings
Marc


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-05 Thread Marc Haber

On Thu, Mar 03, 2016 at 02:28:36AM +0200, Dāvis Mosāns wrote:
> I've same issue, 4.4.3 kernel on Arch Linux
> 
> $ sudo btrfs fi show /mnt/fs/
> Label: 'fs'  uuid: a3c66d25-2c25-40e5-a827-5f7e5208e235
>     Total devices 1 FS bytes used 396.94GiB
>     devid    1 size 435.76GiB used 435.76GiB path /dev/sdi2
> 
> $ sudo btrfs fi df /mnt/fs/
> Data, single: total=416.70GiB, used=390.62GiB
> System, DUP: total=32.00MiB, used=96.00KiB
> Metadata, DUP: total=9.50GiB, used=6.32GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> $ sudo btrfs fi usage /mnt/fs/
> Overall:
>     Device size:               435.76GiB
>     Device allocated:          435.76GiB
>     Device unallocated:          1.00MiB
>     Device missing:                0.00B
>     Used:                      403.26GiB
>     Free (estimated):           26.07GiB  (min: 26.07GiB)
>     Data ratio:                     1.00
>     Metadata ratio:                 2.00
>     Global reserve:            512.00MiB  (used: 0.00B)
> 
> Data,single: Size:416.70GiB, Used:390.62GiB
>    /dev/sdi2     416.70GiB
> 
> Metadata,DUP: Size:9.50GiB, Used:6.32GiB
>    /dev/sdi2      19.00GiB
> 
> System,DUP: Size:32.00MiB, Used:96.00KiB
>    /dev/sdi2      64.00MiB
> 
> Unallocated:
>    /dev/sdi2       1.00MiB

http://paste.ubuntu.com/15292589/ has another log of mine with btrfs
fi usage calls as well, just in case this helps.
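
The figure worth watching in the quoted output is "Device unallocated":
with only 1.00MiB left, a balance has little or no room to allocate a
fresh chunk to relocate into. A minimal sketch for pulling that value
out of btrfs fi usage output; the function name unallocated_space is an
assumption made for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the "Device unallocated" value from
# `btrfs fi usage` output read on stdin. It tolerates quoted ("> ")
# or whitespace-collapsed copies of the output.
unallocated_space() {
    sed -n 's/.*Device unallocated:[[:space:]]*\([^[:space:]]*\).*/\1/p'
}

# Possible use on a live system:
#   btrfs fi usage /mnt/fs/ | unallocated_space
```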

Greetings
Marc

-- 
-
Marc Haber | "I don't trust Computers. They | Mailadresse im Header
Leimen, Germany|  lose things."Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: balance hangs and starts again on reboot

2016-03-05 Thread Marc Haber

On Sat, Mar 05, 2016 at 04:38:57PM +0100, Holger Hoffstätte wrote:
> On 03/05/16 15:17, Marc Haber wrote:
> >> Then try to balance in small increments.
> > 
> > -dusage=5 and incrementing? Or what do you mean with "in small
> > increments"?
> 
> Exactly, yes. Sorry for not being more clear.

So you would recommend something along

for nr in $(seq 5 5 100); do
  btrfs balance start -dusage=$nr $FS
done

right?

Won't this take ages longer than a straight unfiltered balance?
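
A slightly more defensive variant of the loop above might stop at the
first failing step, so that the usage value at which the balance breaks
is known. This is a sketch only: the function name incremental_balance
and the BTRFS override variable (handy for dry runs with a stub) are
inventions for this example.

```shell
#!/bin/sh
# Hypothetical helper: balance data chunks in growing -dusage steps and
# stop at the first failure. BTRFS defaults to the real tool but can be
# pointed at a stub for testing.
incremental_balance() {
    fs=$1
    step=${2:-5}
    for nr in $(seq "$step" "$step" 100); do
        echo "balancing data chunks with usage below ${nr}%"
        "${BTRFS:-btrfs}" balance start -dusage="$nr" "$fs" || {
            echo "balance failed at -dusage=$nr" >&2
            return 1
        }
    done
}

# Possible use:
#   incremental_balance /mnt/fanbtr 5
```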

> FWIW I've been balancing a lot recently (both for stress testing and
> cleaning up a few filesystems) and have never run into this particular
> stall, but only ever do filtered balances. Also I wouldn't be surprised
> at all if this is yet another problem where md does something in a way
> that btrfs doesn't expect, and things go wrong.

md as in the Linux Software RAID? That's not in the game here, it's a
single SATA hard disk.

Greetings
Marc


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-05 Thread Marc Haber

On Sat, Mar 05, 2016 at 12:34:09PM -0700, Chris Murphy wrote:
> I can't tell what this btrfs-balance script is doing because not every
> btrfs balance command is in the log.

It is. I wrote it to produce reproducible logs.

[1/499]mh@fan:~$ cat btrfs-balance
#!/bin/bash

FS="/mnt/fanbtr"

showdf() {
logger -- btrfs fi df $FS
btrfs fi df $FS 2>&1 | logger
logger -- btrfs fi show /
btrfs fi show / | logger
logger -- btrfs fi usage /
btrfs fi usage / | logger
}

logger -- BEGIN btrfs-balance script
showdf

btrfs balance start  $FS 2>&1 | logger
showdf

logger -- BEGIN btrfs balance start -dprofiles=single $FS
btrfs balance start -dprofiles=single $FS 2>&1 | logger
showdf

logger -- BEGIN btrfs balance start -mprofiles=dup $FS
btrfs balance start -mprofiles=dup $FS 2>&1 | logger
showdf

logger -- BEGIN btrfs balance start --force -sprofiles=dup $FS
btrfs balance start --force -sprofiles=dup $FS 2>&1 | logger
showdf

logger -- BEGIN btrfs balance start $FS
btrfs balance start  $FS 2>&1 | logger
showdf

logger -- END btrfs-balance script
[2/500]mh@fan:~$ 

I see. The logger -- BEGIN is missing for the very first command. My
bad.
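
One way to avoid forgetting a BEGIN line would be to wrap each command
in a helper that always logs it first. A sketch under assumptions: the
names log_line and run_logged and the LOG_CMD override are hypothetical
and not part of the script shown above.

```shell
#!/bin/sh
# Hypothetical wrapper: write a BEGIN marker, then run the command with
# stdout and stderr sent to the logger. LOG_CMD defaults to logger(1).
log_line() {
    "${LOG_CMD:-logger}" -- "$*"
}

run_logged() {
    log_line "BEGIN $*"
    "$@" 2>&1 | "${LOG_CMD:-logger}"
}

# Possible use inside the balance script:
#   run_logged btrfs balance start -mprofiles=dup "$FS"
```

Note that the exit status of run_logged is that of the logger, not of
the wrapped command, just as with the plain pipelines in the script.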

> Something is happening with the usage of this file system that's out
> of the ordinary. This is the first time I've seen such a large amount
> of unused metadata allocation. And then for it not only fail to
> balance, but for the allocation amount to increase is a first.

It is just a root filesystem of a workstation running Debian Linux, in
daily use, with daily snapshots of the system, and
ten-minute-increment snapshots of /home, with no cleanup happening for
a few months.

>  So understanding the usage is important to figuring out what's
>  happening. I'd file a bug and include as much information on how the
>  fs got into this state as possible. And also if possible make a
>  btrfs-image using the proper flags to blot out the filenames for
>  privacy.

That would be btrfs-image -s?

> And what btrfs-progs tools were used to create this file system. Etc.

The file system is at least two years old, and I do not remember which
version of btrfs-tools was in Debian unstable back then. Is this
information somewhere in the filesystem label? How do I obtain it?

> The alternative if this can't be fixed, is to recreate the filesystem
> because there's no practical way yet to migrate so many snapshots to a
> new file system.

I am now back to a mid three-digit number of snapshots.

Greetings
Marc


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-07 Thread Marc Haber

On Sun, Mar 06, 2016 at 06:43:46AM +, Duncan wrote:
> Marc Haber posted on Sat, 05 Mar 2016 21:09:09 +0100 as excerpted:
> > On Sat, Mar 05, 2016 at 12:34:09PM -0700, Chris Murphy wrote:
> >> Something is happening with the usage of this file system that's out of
> >> the ordinary. This is the first time I've seen such a large amount of
> >> unused metadata allocation. And then for it not only fail to balance,
> >> but for the allocation amount to increase is a first.
> > 
> > It is just a root filesystem of a workstation running Debian Linux, in
> > daily use, with daily snapshots of the system, and ten-minute-increment
> > snapshots of /home, with no cleanup happening for a few months.
> > 
> >>  So understanding the usage is important to figuring out what's
> >>  happening. I'd file a bug and include as much information on how the
> >>  fs got into this state as possible. And also if possible make a
> >>  btrfs-image using the proper flags to blot out the filenames for
> >>  privacy.
> 
> Now you're homing in on what I picked up on.  There's something very 
> funky about that metadata, 100+ GiB of metadata total, only just over 2 
> GiB metadata used, and attempts to balance it don't help with the spread 
> between the two at all, only increasing the total metadata, if anything, 
> but still seem to complete without error.  There's gotta be some sort of 
> bug going on there, and I'd /bet/ it's the same one that's keeping full 
> balances from working, as well.

I don't understand a single word of this, but you seem to understand
it. Good.

> 
> OK, this question's out of left field, but it's the only thing (well, 
> /almost/ only, see below) I've seen do anything /remotely/ like that:
> 
> Was the filesystem originally created as a convert from ext*, using btrfs-
> convert?  If so, was the ext2_saved or whatever subvolume removed, and a 
> successful defrag and balance completed at that time?

I have dug around in my auth.logs, and thanks to my not working in a
root shell but using sudo for every single command I can say that the
filesystem was created on September 1, 2015, so it is not _this_ old,
and snapshot.debian.net tells me that Debian unstable had btrfs-tools
4.1.2 uploaded on August 31, so I guess that the filesystem was either
created by the 4.0 version we had since May 2015 or by the brand new
4.1.2.

And it was a mkfs.btrfs with no special options. I suspected this
since I would probably not have made an ext4 filesystem of 300 GB in
size. Back in the ext4 days, I usually made /, /usr, /var, /home and
/boot their own filesystems.

> Tho AFAIK there was in addition a very narrow timeframe in which a bug in 
> mkfs.btrfs would create invalid btrfs'.  That was with btrfs-progs 4.1.1, 
> released in July 2015, with an urgent bugfix release 4.1.2 in the same 
> month to fix the problem, so the timeframe was days or weeks.

Debian is chastised for their allegedly quirky release schedules even
in this thread, I usually ignore that, but this time a smile comes to
my face when I say that btrfs-progs 4.1.1 was never packaged in
Debian, hence we're clear of this bug here. We went from 4.0 straight
to 4.1.2.

Greetings
Marc


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-07 Thread Marc Haber

On Sun, Mar 06, 2016 at 01:27:10PM -0700, Chris Murphy wrote:
> Marc said it was created maybe 2 years ago and doesn't remember what
> version of the tools were used. Between it being two years ago and
> also being Debian, for all we know it could've been 0.19. *shrug*

You are mixing up Debian unstable and Debian stable *snort*. You're
lucky that I'm not on RHEL 6[1].

> On the one hand, the practical advice is to just blow it away and use
> everything current, go back to the same workload including thousands
> of snapshots, and see if this balance problem is reproducible. That's
> pretty clearly a bug.

To have the same thing happen in half a year again? That's not why I
converted to a snapshottable file system.

> On the other hand, we're approaching the state with Btrfs where the
> problems we're seeing are at least as much about aging file systems,
> because the stability is permitting file systems to get older.

And this is really something to be proud of? I mean, this is a file
system that is part of the vanilla Linux kernel, not marked as
experimental or something, and you're still concerned about file
systems that were made a year ago? This is a new experience for me.

>  As they get older though, the issues get more non-deterministic. So
>  it's an interesting bug from that perspective, the current kernel
>  code ought to be able to contend with this (as in, the user is right
>  to expect the code to deal with this scenario, and if it doesn't it's
>  a bug; not that I expect today's code to actually do this).

Kernel 4.4.4 as of the day before yesterday, thanks for considering.

> So if it were me, I'd gather all possible data, including complete,
> not trimmed, logs.

So you seriously want all messages like
Mar  7 09:25:23 fan systemd[1]: Started http per-connection Server, forwarding 
to 3142 ([2a01:238:4071:328d:5054:ff:fea9:6807]:41060).
Mar  7 09:25:23 fan named[3000]: client 
2a01:238:4071:328d:5054:ff:fea9:6807#59920 (debian.debian.zugschlus.de): query: 
debian.debian.zugschlus.de IN  + (fec0:0:0:::1)
Mar  7 09:21:34 fan dhcpd[2468]: DHCPREQUEST for 192.168.182.29 from 
54:04:a6:82:21:00 via eth0: unknown lease 192.168.182.29.
Mar  7 09:17:01 fan CRON[19474]: (root) CMD (   cd / && run-parts --report 
/etc/cron.hourly)
Mar  7 09:18:06 fan systemd[1]: Started Session c101 of user mh.
Mar  7 08:21:40 fan smartd[1956]: Device: /dev/sdb [SAT], SMART Usage 
Attribute: 194 Temperature_Celsius changed from 31 to 30

I _can_ swamp the bug report literally with gigabytes of logs, but is
that really what you want? If it is not, please state what you mean by
"not trimmed" as I only removed those clutter messages from the logs I
sent.

Greetings
Marc

[1] Does RHEL 6 have btrfs in the first place?


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-07 Thread Marc Haber

On Sun, Mar 06, 2016 at 01:37:31PM -0700, Chris Murphy wrote:
> On Sun, Mar 6, 2016 at 1:27 PM, Chris Murphy  wrote:
> > So if it were me, I'd gather all possible data, including complete,
> > not trimmed, logs.
> 
> Also include in the bug, the balance script being used. It might be a
> contributing factor.

The balance script was only written after Duncan asked me to do
filtered balances instead of a full balance. The issue showed itself
while the filesystem was still managed using the procedures from "the
book" ;-)

> I wonder if the ENOSPC is happening just prior to the point where
> balance would free up the unused portion of allocated metadata chunks
> and that's why this just keeps getting worse? The balance function is
> COW, so I wonder if there are a bunch of failed chunk migrations that
> are just accumulating due to the ENOSPC stopping the balance?

How do we find out?

> Anyway, after collecting all data and btrfs-image, I would blow away
> this fs using current kernel and tools. And then go back to the
> original workload. I would not pare down the number or frequency of
> snapshots. If anything increase it. The idea is to reproduce the bug.

... losing another pile of snapshots in the process? This is a
production machine[1].

Greetings
Marc

[1] yes, with off-line backups being made


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-07 Thread Marc Haber

On Sat, Mar 05, 2016 at 09:09:09PM +0100, Marc Haber wrote:
> On Sat, Mar 05, 2016 at 12:34:09PM -0700, Chris Murphy wrote:
> >  So understanding the usage is important to figuring out what's
> >  happening. I'd file a bug and include as much information on how the
> >  fs got into this state as possible. And also if possible make a
> >  btrfs-image using the proper flags to blot out the filenames for
> >  privacy.
> 
> That would be btrfs-image -s?

btrfs-image -s -t 8 -s /dev/mapper/fanbtr

complains about a mounted filesystem. Will an image made from the
running system with the filesystem mounted help, or do I need to take
down the machine while the image is being made?

Also, threading does not seem to work: despite the -t 8, CPU usage
never exceeds 100 % in atop.

Greetings
Marc


btrfs-image run time

2016-03-07 Thread Marc Haber

Hi,

how long is btrfs-image taking to run on a 400 GiB filesystem?

I have /bin/btrfs-image -s -t 8 -s /dev/mapper/mydevice - | pixz -9 >
file.on.other.fs running for four hours now, and it's constantly
taking a single core, but is neither reading from the disk nor writing
to its output.

Is that expected behavior?

Greetings
Marc


Re: btrfs-image run time

2016-03-07 Thread Marc Haber

On Mon, Mar 07, 2016 at 06:27:17PM +0100, Marc Haber wrote:
> how long is btrfs-image taking to run on a 400 GiB filesystem?
> 
> I have /bin/btrfs-image -s -t 8 -s /dev/mapper/mydevice - | pixz -9 >
> file.on.other.fs running for four hours now

Strike my question please, I didn't see that I had the -s doubled.
With one -s I now see actual progress.

Greetings
Marc


Re: btrfs-image run time

2016-03-07 Thread Marc Haber

On Mon, Mar 07, 2016 at 07:15:24PM +0100, Garmine 42 wrote:
> According to the manpage duplicate -s is valid and the high CPU usage is
> intended. Although a warning could be valid in case of -ss.

Or use a different letter. Anyway, that was my stupidity and no
developer time should be wasted on that. btrfs-image behaves as
documented, everything else was a problem existing between chair and
keyboard. Sorry for the noise.
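
For anyone wrapping btrfs-image in a script, a guard against an
accidentally doubled -s could look roughly like the sketch below. It
only inspects an argument list and never runs btrfs-image; the function
names sanitize_level and check_image_args are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count how many times sanitizing was requested in
# an argument list (-s counts once, -ss counts twice).
sanitize_level() {
    n=0
    for arg in "$@"; do
        case $arg in
            -s)  n=$((n + 1)) ;;
            -ss) n=$((n + 2)) ;;
        esac
    done
    echo "$n"
}

# Warn when the arguments would select the CPU-intensive sanitize mode.
check_image_args() {
    if [ "$(sanitize_level "$@")" -gt 1 ]; then
        echo "warning: -s given more than once, expect very high CPU usage" >&2
        return 1
    fi
}

# Possible use:
#   check_image_args -s -t 8 /dev/mapper/mydevice - && \
#       btrfs-image -s -t 8 /dev/mapper/mydevice - | pixz -9 > file.on.other.fs
```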

Greetings
Marc


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-07 Thread Marc Haber

On Mon, Mar 07, 2016 at 01:56:54PM -0500, Austin S. Hemmelgarn wrote:
> Yeah, in general, if you want to get good upstream support for BTRFS (such
> as from the mailing lists), you still want to steer clear of 'Enterprise'
> branded distros (RHEL (and by extension CentOS) is particularly bad about
> kernel versioning

Just to get back to this thread's subject, I am using Debian unstable,
with a vanilla kernel, 4.4.3 at the beginning of this thread, and
4.4.4 today.

Greetings
Marc


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-12 Thread Marc Haber

On Sat, Mar 05, 2016 at 12:34:09PM -0700, Chris Murphy wrote:
> Something is happening with the usage of this file system that's out
> of the ordinary. This is the first time I've seen such a large amount
> of unused metadata allocation. And then for it not only fail to
> balance, but for the allocation amount to increase is a first. So
> understanding the usage is important to figuring out what's happening.
> I'd file a bug and include as much information on how the fs got into
> this state as possible. And also if possible make a btrfs-image using
> the proper flags to blot out the filenames for privacy. And what
> btrfs-progs tools were used to create this file system. Etc.

https://bugzilla.kernel.org/show_bug.cgi?id=114451

Please advise if there is something missing.

Greetings
Marc


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-12 Thread Marc Haber

=0.00B
[31/530]mh@fan:~$ time sudo btrfs balance start -musage=74 /media/tempdisk
Done, had to relocate 50 out of 134 chunks

real    0m4.546s
user    0m0.000s
sys     0m0.620s
[32/531]mh@fan:~$ sudo btrfs fi df /media/tempdisk/
Data, single: total=79.00GiB, used=78.32GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=27.00GiB, used=2.30GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

So one does not see a decrease in total Metadata size until -musage
has gone up to 70, then it decreases by half a gig. -musage=75 is the
first musage value that leads to the ENOSPC condition, with total
Metadata size going up to 27 GiB again, and -musage=74 is the
biggest musage value that finishes without ENOSPC, but with no visible
decrease of total Metadata size.
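
The search for the largest working -musage value can be automated along
these lines. This is a rough sketch, not the procedure actually used
above: largest_ok_musage, metadata_total and the BTRFS override are
names invented here, and any balance failure is treated as the stopping
point without checking that it really was ENOSPC.

```shell
#!/bin/sh
# Hypothetical helper: print the Metadata "total" value from
# `btrfs fi df` output read on stdin.
metadata_total() {
    sed -n 's/^Metadata, [^:]*: total=\([^,]*\),.*/\1/p'
}

# Hypothetical helper: raise -musage one step at a time and print the
# last value for which the balance still succeeded (0 if none did).
# Arguments: filesystem, first value (default 1), last value (default 100).
largest_ok_musage() {
    fs=$1
    last=0
    for nr in $(seq "${2:-1}" "${3:-100}"); do
        "${BTRFS:-btrfs}" balance start -musage="$nr" "$fs" >/dev/null 2>&1 || break
        last=$nr
    done
    echo "$last"
}

# Possible use:
#   largest_ok_musage /media/tempdisk 70 80
#   btrfs fi df /media/tempdisk | metadata_total
```

Running a balance per step can take a long time on a real filesystem,
so a narrow range around the suspected value is the more practical use.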

Greetings
Marc



Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-12 Thread Marc Haber

On Sun, Mar 06, 2016 at 01:27:10PM -0700, Chris Murphy wrote:
> So if it were me, I'd gather all possible data, including complete,
> not trimmed, logs. And as for the btrfs-image, it could be huge.

[5/504]mh@q:~/.www/public_html/stuff$ unxz --list 20160307-fanbtr-image.xz
Strms  Blocks   Compressed Uncompressed  Ratio  Check   Filename
    1      19    248.0 MiB   2385.2 MiB  0.104  CRC32   20160307-fanbtr-image.xz

>  It might not be
>  a bad idea to capture a complete btrfs-debug-tree also, and compress
>  that, add as attachment.

How do I do that?

Greetings
Marc


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-12 Thread Marc Haber

On Mon, Mar 07, 2016 at 11:39:11AM -0700, Chris Murphy wrote:
> Since there's no hardware issue suspect, you could filter for just btrfs.
> 
> journalctl -o short-iso | grep -i btrfs

Which is exactly what I did. Why did you suspect that my logs were
"trimmed"? That's what got me kind of furious. I took great care to
not trim relevant information.

> When there's hardware stuff suspect it's better to include all the
> SCSI and  libata (and USB if it's a USB drive) messages also.

None there.
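
A filter that keeps the btrfs lines and also the SCSI, libata and USB
messages mentioned above might look like the sketch below. The function
name filter_storage_logs and the exact patterns are assumptions and
would need adjusting to the messages a given kernel actually emits.

```shell
#!/bin/sh
# Hypothetical filter: keep log lines that mention btrfs, SCSI, a
# libata port (ata1, ata2.00, ...) or USB; drop everything else.
filter_storage_logs() {
    grep -iE 'btrfs|scsi|(^|[^[:alnum:]])ata[0-9]+|usb'
}

# Possible use:
#   journalctl -o short-iso | filter_storage_logs
```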

> If you have any logs that include the filesystem mounted with
> enospc_debug, that might be useful for a developer?

The later logs I posted were actually taken with enospc_debug, the
4.4.3 ones even with Duncan's patch. I think I didn't apply it before
building 4.4.4.

Greetings
Marc


New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-13 Thread Marc Haber

17 fan root: Device allocated:#011#011  93.03GiB
Mar 13 11:36:17 fan root: Device unallocated:#011#011 106.97GiB
Mar 13 11:36:17 fan root: Device missing:#011#011 0.00B
Mar 13 11:36:17 fan root: Used:#011#011#011  80.09GiB
Mar 13 11:36:17 fan root: Free (estimated):#011#011 107.26GiB#011(min: 
107.26GiB)
Mar 13 11:36:17 fan root: Data ratio:#011#011#011  1.00
Mar 13 11:36:17 fan root: Metadata ratio:#011#011  1.00
Mar 13 11:36:17 fan root: Global reserve:#011#011 512.00MiB#011(used: 0.00B)
Mar 13 11:36:17 fan root:
Mar 13 11:36:17 fan root: Data,single: Size:78.00GiB, Used:77.71GiB
Mar 13 11:36:17 fan root:/dev/mapper/fanbtr#011  78.00GiB
Mar 13 11:36:17 fan root:
Mar 13 11:36:17 fan root: Metadata,single: Size:15.00GiB, Used:2.38GiB
Mar 13 11:36:17 fan root:/dev/mapper/fanbtr#011  15.00GiB
Mar 13 11:36:17 fan root:
Mar 13 11:36:17 fan root: System,single: Size:32.00MiB, Used:16.00KiB
Mar 13 11:36:17 fan root:/dev/mapper/fanbtr#011  32.00MiB
Mar 13 11:36:17 fan root:
Mar 13 11:36:17 fan root: Unallocated:
Mar 13 11:36:17 fan root:/dev/mapper/fanbtr#011 106.97GiB
Mar 13 11:36:17 fan root: BEGIN btrfs balance start /
Mar 13 11:51:23 fan root: ERROR: error during balancing '/': No space left on 
device
Mar 13 11:51:23 fan root: There may be more info in syslog - try dmesg | tail
Mar 13 11:51:23 fan root: btrfs fi df /
Mar 13 11:51:23 fan root: Data, single: total=78.00GiB, used=77.70GiB
Mar 13 11:51:23 fan root: System, single: total=32.00MiB, used=16.00KiB
Mar 13 11:51:23 fan root: Metadata, single: total=23.00GiB, used=2.38GiB
Mar 13 11:51:23 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B
Mar 13 11:51:23 fan root: btrfs fi show /
Mar 13 11:51:23 fan root: Label: 'fanbtr'  uuid: 
90f8d728-6bae-4fca-8cda-b368ba2c008e
Mar 13 11:51:23 fan root: #011Total devices 1 FS bytes used 80.08GiB
Mar 13 11:51:23 fan root: #011devid1 size 200.00GiB used 101.03GiB path 
/dev/mapper/fanbtr
Mar 13 11:51:23 fan root:
Mar 13 11:51:23 fan root: btrfs fi usage /
Mar 13 11:51:23 fan root: Overall:
Mar 13 11:51:23 fan root: Device size:#011#011 200.00GiB
Mar 13 11:51:23 fan root: Device allocated:#011#011 101.03GiB
Mar 13 11:51:23 fan root: Device unallocated:#011#011  98.97GiB
Mar 13 11:51:23 fan root: Device missing:#011#011 0.00B
Mar 13 11:51:23 fan root: Used:#011#011#011  80.08GiB
Mar 13 11:51:23 fan root: Free (estimated):#011#011  99.26GiB#011(min: 
99.26GiB)
Mar 13 11:51:23 fan root: Data ratio:#011#011#011  1.00
Mar 13 11:51:23 fan root: Metadata ratio:#011#011  1.00
Mar 13 11:51:23 fan root: Global reserve:#011#011 512.00MiB#011(used: 0.00B)
Mar 13 11:51:23 fan root:
Mar 13 11:51:23 fan root: Data,single: Size:78.00GiB, Used:77.70GiB
Mar 13 11:51:23 fan root:/dev/mapper/fanbtr#011  78.00GiB
Mar 13 11:51:23 fan root:
Mar 13 11:51:23 fan root: Metadata,single: Size:23.00GiB, Used:2.38GiB
Mar 13 11:51:23 fan root:/dev/mapper/fanbtr#011  23.00GiB
Mar 13 11:51:23 fan root:
Mar 13 11:51:23 fan root: System,single: Size:32.00MiB, Used:16.00KiB
Mar 13 11:51:23 fan root:/dev/mapper/fanbtr#011  32.00MiB
Mar 13 11:51:23 fan root:
Mar 13 11:51:23 fan root: Unallocated:
Mar 13 11:51:23 fan root:/dev/mapper/fanbtr#011  98.97GiB
Mar 13 11:51:23 fan root: END btrfs-balance script
[10/509]mh@fan:~$

I see the same metadata spread as with the old filesystem in btrfs fi
df, total at 23 GiB and used at 2.38 GiB. What I find strange is that
this filesystem has Data, System and Metadata in "single" profile. Is
this the new default for a 200 GiB file system?
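
The spread between allocated and used metadata can be read straight
from the logged btrfs fi df lines. A small sketch, assuming both values
are reported in GiB as in the logs above; the function name
metadata_slack_gib is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: for every "Metadata, ..." line on stdin, print
# total minus used in GiB. Lines whose values are not both in GiB are
# skipped. A syslog prefix in front of the line is tolerated.
metadata_slack_gib() {
    awk '/Metadata, / {
        t = ""; u = ""
        if (match($0, /total=[0-9.]+GiB/)) t = substr($0, RSTART + 6, RLENGTH - 9)
        if (match($0, /used=[0-9.]+GiB/))  u = substr($0, RSTART + 5, RLENGTH - 8)
        if (t != "" && u != "") printf "%.2f\n", t - u
    }'
}

# Possible use:
#   btrfs fi df / | metadata_slack_gib
```

For the line logged at 11:51:23 (total=23.00GiB, used=2.38GiB) this
prints 20.62.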

Full log is at http://q.bofh.de/~mh/stuff/20160313-fanbtr-btrfs-syslog

The log was taken with enospc_debug active on the file system and all
file system, block device and storage relevant log lines were left in.

Is there anything missing? Is this the same issue? Would the log help
as addition in https://bugzilla.kernel.org/show_bug.cgi?id=114451?

Greetings
Marc


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-13 Thread Marc Haber

On Mon, Mar 14, 2016 at 12:17:24AM +1100, Andrew Vaughan wrote:
> On 13 March 2016 at 22:58, Marc Haber  wrote:
> > Hi,
> >
> > On Sat, Mar 05, 2016 at 12:34:09PM -0700, Chris Murphy wrote:
> >> The alternative if this can't be fixed, is to recreate the filesystem
> >> because there's no practical way yet to migrate so many snapshots to a
> >> new file system.
> >
> > I recreated the file system on March 7, with 200 GiB in size, using
> > btrfs-tools 4.4. The snapshot-taking process has been running since
> > then, but I also regularly cleaned up. The number of snapshots on the
> > new filesystem has never exceeded 1000, with the current count being
> > at 148.
> >
> 
> 
> 
> I'm not a dev, so I'll just throw out a random, and possibly naive idea.
> 
> How much i/o load is this filesystem under?
> What type of access pattern(s), how frequent and large are the changes?

Nearly none. It's a workstation which I have avoided using in the last
few days due to the filesystem trouble and to avoid any impact of local
work on the filesystem behavior. I even log out after working on the box
for a few minutes.

There is a Debian apt-cacher running on the box and writing its cache
to this btrfs, but /var is on its own subvolume that is only
snapshotted once a day. I'll move /var/cache to its own subvolume and
set this subvolume on a "no snapshots" schedule.

The box itself is running a couple of KVM VMs, but the virtual disks
of the VMs are on dedicated LVs.

> Are you still making snapshots every 10m?

I am snapshotting the subvolume /home/mh, with the obvious contents,
every ten minutes, yes. Most of the other subvolumes are snapshotted
once daily, with some of them not getting snapshotted at all.

> How often do you delete old snapshots?  Also every 10m, or do you
> delete them in batches every hour or so?

I delete them in batches about every other day.

> How long does "btrfs subvolume delete -c " take?
> What does "time btrfs subvolume delete -C  ;

[4/504]mh@fan:~$ time sudo btrfs subvolume delete -c /mnt/snapshots/fanbtr/user/subdaily/2016/03/13/07/5001/-home-mh
Delete subvolume (commit): '/mnt/snapshots/fanbtr/user/subdaily/2016/03/13/07/5001/-home-mh'

real    0m0.100s
user    0m0.000s
sys     0m0.016s
[5/505]mh@fan:~$ time sudo btrfs subvolume delete -C /mnt/snapshots/fanbtr/user/subdaily/2016/03/13/07/4001/-home-mh
Delete subvolume (commit): '/mnt/snapshots/fanbtr/user/subdaily/2016/03/13/07/4001/-home-mh'

real    0m0.079s
user    0m0.012s
sys     0m0.000s
[6/506]mh@fan:~$

The difference between -c and -C only shows when more than one
snapshot is deleted.

>  time btrfs subvolume sync " print ?

[8/508]mh@fan:~$ time sudo btrfs subvolume sync /

real    0m0.030s
user    0m0.004s
sys     0m0.008s
[9/509]mh@fan:~$

> The reason for asking is that even on a lightly loaded filesystem I
> have seen btrfs subvolume delete take more than 30 seconds.  On a more
> heavily load filesystem  I have seen 5+ minutes before btrfs subvolume
> delete had finished.

In my experience, deleting snapshots in huge batches slows things down
quite a bit, but this btrfs does not suffer from this disease.

> If you have a high enough i/o load, plus large enough changes per
> snapshot, it might be possible to get btrfs into a situation were it
> never actually finishes cleaning up deleted snapshots.  (I'm also not
> sure what happens if you shutdown or unmount whilst btrfs is still
> cleaning up, but I expect the devs thought of that).

It is a COW filesystem, I'd expect it to be consistent no matter what.
But that's the theory.

Greetings
Marc


Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-03-13 Thread Marc Haber

On Sun, Mar 13, 2016 at 01:43:50PM -0600, Chris Murphy wrote:
> On Sat, Mar 12, 2016 at 12:57 PM, Marc Haber
>  wrote:
> > On Sat, Mar 05, 2016 at 12:34:09PM -0700, Chris Murphy wrote:
> >> Something is happening with the usage of this file system that's out
> >> of the ordinary. This is the first time I've seen such a large amount
> >> of unused metadata allocation. And then for it not only fail to
> >> balance, but for the allocation amount to increase is a first. So
> >> understanding the usage is important to figuring out what's happening.
> >> I'd file a bug and include as much information on how the fs got into
> >> this state as possible. And also if possible make a btrfs-image using
> >> the proper flags to blot out the filenames for privacy. And what
> >> btrfs-progs tools were used to create this file system. Etc.
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=114451
> >
> > Please advise if there is something missing.
> 
> No enospc_debug mount option used for kernel messages.

I apologize for not having mentioned this, but why do you think that
it wasn't active?

|[28/527]mh@fan:~$ grep enospc /proc/mounts
|/dev/mapper/fanbtr / btrfs rw,noatime,nodiratime,ssd,space_cache,enospc_debug,subvolid=257,subvol=/fan-root 0 0
|/dev/mapper/fanbtr /mnt/snapshots/fanbtr btrfs rw,noatime,nodiratime,ssd,space_cache,enospc_debug,subvolid=266,subvol=/snapshots 0 0
|[29/528]mh@fan:~$

>  And no indication you applied Qu's patch mentioned on March 1 to get
>  more info with enospc_debug mount:
> 
> >Oh, I'm sorry that the output is not necessary, it's better to use the newer 
> >patch:
> >https://patchwork.kernel.org/patch/8462881/
> >With the newer patch, you will need to use enospc_debug mount option to get 
> >the debug information.

That one hasn't made it into 4.4.5 yet?

Greetings
Marc


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-13 Thread Marc Haber

On Sun, Mar 13, 2016 at 08:14:45PM +0100, Henk Slager wrote:
> On Sun, Mar 13, 2016 at 12:58 PM, Marc Haber
>  wrote:
> > Hi,
> >
> > On Sat, Mar 05, 2016 at 12:34:09PM -0700, Chris Murphy wrote:
> >> The alternative if this can't be fixed, is to recreate the filesystem
> >> because there's no practical way yet to migrate so many snapshots to a
> >> new file system.
> >
> > I recreated the file system on March 7, with 200 GiB in size, using
> > btrfs-tools 4.4. The snapshot-taking process has been running since
> > then, but I also regularly cleaned up. The number of snapshots on the
> > new filesystem has never exceeded 1000, with the current count being
> > at 148.
> 
> Is the snapshotting still read-write?

Yes, I want to keep the possibility to remove huge files from
snapshots that shouldn't have been on a snapshotted volume in the first
place without having to ditch the entire snapshot.

> Also, If some part of the OS or tools scans through the snapshot dirs
> every now and then with atime creation on, metadata grows without a
> real need.

I mount with noatime and nodiratime anyway, and the directory the
snapshots are mounted to (/mnt/snapshots) is excluded in
updatedb.conf. Any other idea which tool might scan filesystems and
might go unnoticed while it is running across a five-digit number
of snapshots?

Greetings
Marc


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-13 Thread Marc Haber

On Sun, Mar 13, 2016 at 05:12:35PM +, Duncan wrote:
> Marc Haber posted on Sun, 13 Mar 2016 12:58:10 +0100 as excerpted:
> > I see the same metadata spread as with the old filesystem in btrfs fi
> > df,
> > total at 23 and used at 2.38 GiB. What I find strange is that this
> > filesystem has Data, System and Metadata in "single" profile, is this
> > the new default for a 200 GiB file system?
> 
> Single is default for data.  Metadata (and system) will normally default 
> to dup on a single device, raid1 on multi-device, EXCEPT on detected 
> SSDs, where it defaults to single as well, because the firmware on some 
> ssds will dedup it in any case.  If you know your ssd isn't one of the 
> deduping ones (as I do, here), you can of course overrule that by 
> specifying modes at mkfs.btrfs time.

It was both times the same Samsung 840 EVO. Has this SSD detection
been added recently, or did older versions of mkfs.btrfs not detect an
SSD through a crypto layer, maybe?

Greetings
Marc


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-14 Thread Marc Haber

On Mon, Mar 14, 2016 at 01:05:39AM +, Duncan wrote:
> But according to the mkfs.btrfs manpage, the detection is based on 
> /sys/block/DEV/queue/rotational (with DEV substituted appropriately), and 
> various layers got support for correctly passing that thru at various 
> times, some before btrfs, some after.  So that's very likely why btrfs 
> didn't detect it originally, if it was on top of crypto and/or some other 
> layer that might not have been passing that thru.

That explains it, thanks.
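
The detection Duncan describes can be checked by hand. What follows is
only a minimal sketch in Python, not what mkfs.btrfs itself does
internally: it reads the same sysfs attribute named in the quoted
manpage text. The function name and the sysfs_root parameter are
assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional


def is_rotational(dev: str, sysfs_root: str = "/sys/block") -> Optional[bool]:
    """Return True for a rotating disk, False for a non-rotating one,
    and None when the attribute is missing or holds an unexpected value.

    `dev` is the kernel block device name (for example "sda" or "dm-15"),
    not a /dev/mapper alias.
    """
    flag = Path(sysfs_root) / dev / "queue" / "rotational"
    try:
        value = flag.read_text().strip()
    except OSError:
        # Device or attribute not present on this system.
        return None
    if value == "1":
        return True
    if value == "0":
        return False
    return None
```

Calling is_rotational() on both the dm device and the underlying disk
shows whether an intermediate layer passes the flag through. Which dm-N
name belongs to a given /dev/mapper entry is system-specific.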

Greetings
Marc


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-14 Thread Marc Haber

On Sun, Mar 13, 2016 at 12:58:09PM +0100, Marc Haber wrote:
> On Sat, Mar 05, 2016 at 12:34:09PM -0700, Chris Murphy wrote:
> > The alternative if this can't be fixed, is to recreate the filesystem
> > because there's no practical way yet to migrate so many snapshots to a
> > new file system.
> 
> I recreated the file system on March 7, with 200 GiB in size, using
> btrfs-tools 4.4. The snapshot-taking process has been running since
> then, but I also regularly cleaned up. The number of snapshots on the
> new filesystem has never exceeded 1000, with the current count being
> at 148.
> 
> And btrfs balance runs into the same ENOSPC issues as the old one:

... with Qu's patch, I now get a reproducible kernel trace:

Mar 14 10:23:49 fan mh: BEGIN btrfs-balance script
Mar 14 10:23:49 fan mh: btrfs fi df /
Mar 14 10:23:49 fan root: Data, single: total=79.00GiB, used=78.42GiB
Mar 14 10:23:49 fan root: System, single: total=32.00MiB, used=16.00KiB
Mar 14 10:23:49 fan root: Metadata, single: total=10.00GiB, used=2.46GiB
Mar 14 10:23:49 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B
Mar 14 10:23:49 fan mh: btrfs fi show /
Mar 14 10:23:49 fan root: Label: 'fanbtr'  uuid: 90f8d728-6bae-4fca-8cda-b368ba2c008e
Mar 14 10:23:49 fan root: #011Total devices 1 FS bytes used 80.89GiB
Mar 14 10:23:49 fan root: #011devid1 size 200.00GiB used 89.03GiB path /dev/mapper/fanbtr
Mar 14 10:23:49 fan root: 
Mar 14 10:23:49 fan mh: btrfs fi usage /
Mar 14 10:23:49 fan root: Overall:
Mar 14 10:23:49 fan root: Device size:#011#011 200.00GiB
Mar 14 10:23:49 fan root: Device allocated:#011#011  89.03GiB
Mar 14 10:23:49 fan root: Device unallocated:#011#011 110.97GiB
Mar 14 10:23:49 fan root: Device missing:#011#011 0.00B
Mar 14 10:23:49 fan root: Used:#011#011#011  80.89GiB
Mar 14 10:23:49 fan root: Free (estimated):#011#011 111.54GiB#011(min: 111.54GiB)
Mar 14 10:23:49 fan root: Data ratio:#011#011#011  1.00
Mar 14 10:23:49 fan root: Metadata ratio:#011#011  1.00
Mar 14 10:23:49 fan root: Global reserve:#011#011 512.00MiB#011(used: 0.00B)
Mar 14 10:23:49 fan root: 
Mar 14 10:23:49 fan root: Data,single: Size:79.00GiB, Used:78.42GiB
Mar 14 10:23:49 fan root:/dev/mapper/fanbtr#011  79.00GiB
Mar 14 10:23:49 fan root: 
Mar 14 10:23:49 fan root: Metadata,single: Size:10.00GiB, Used:2.46GiB
Mar 14 10:23:49 fan root:/dev/mapper/fanbtr#011  10.00GiB
Mar 14 10:23:49 fan root: 
Mar 14 10:23:49 fan root: System,single: Size:32.00MiB, Used:16.00KiB
Mar 14 10:23:49 fan root:/dev/mapper/fanbtr#011  32.00MiB
Mar 14 10:23:49 fan root: 
Mar 14 10:23:49 fan root: Unallocated:
Mar 14 10:23:49 fan root:/dev/mapper/fanbtr#011 110.97GiB
Mar 14 10:23:49 fan mh: BEGIN btrfs balance start /
Mar 14 10:36:46 fan kernel: [  890.995815] BTRFS info (device dm-15): 6 enospc errors during balance
Mar 14 10:36:46 fan root: ERROR: error during balancing '/': No space left on device
Mar 14 10:36:46 fan root: There may be more info in syslog - try dmesg | tail
Mar 14 10:36:46 fan root: btrfs fi df /
Mar 14 10:36:46 fan root: Data, single: total=79.00GiB, used=78.42GiB
Mar 14 10:36:46 fan root: System, single: total=32.00MiB, used=16.00KiB
Mar 14 10:36:46 fan root: Metadata, single: total=12.00GiB, used=2.46GiB
Mar 14 10:36:46 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B
Mar 14 10:36:46 fan root: btrfs fi show /
Mar 14 10:36:46 fan root: Label: 'fanbtr'  uuid: 90f8d728-6bae-4fca-8cda-b368ba2c008e
Mar 14 10:36:46 fan root: #011Total devices 1 FS bytes used 80.89GiB
Mar 14 10:36:46 fan root: #011devid1 size 200.00GiB used 91.03GiB path /dev/mapper/fanbtr
Mar 14 10:36:46 fan root: 
Mar 14 10:36:46 fan root: btrfs fi usage /
Mar 14 10:36:46 fan root: Overall:
Mar 14 10:36:46 fan root: Device size:#011#011 200.00GiB
Mar 14 10:36:46 fan root: Device allocated:#011#011  91.03GiB
Mar 14 10:36:46 fan root: Device unallocated:#011#011 108.97GiB
Mar 14 10:36:46 fan root: Device missing:#011#011 0.00B
Mar 14 10:36:46 fan root: Used:#011#011#011  80.89GiB
Mar 14 10:36:46 fan root: Free (estimated):#011#011 109.54GiB#011(min: 109.54GiB)
Mar 14 10:36:46 fan root: Data ratio:#011#011#011  1.00
Mar 14 10:36:46 fan root: Metadata ratio:#011#011  1.00
Mar 14 10:36:46 fan root: Global reserve:#011#011 512.00MiB#011(used: 0.00B)
Mar 14 10:36:46 fan root: 
Mar 14 10:36:46 fan root: Data,single: Size:79.00GiB, Used:78.42GiB
Mar 14 10:36:46 fan root:/dev/mapper/fanbtr#011  79.00GiB
Mar 14 10:36:46 fan root: 
Mar 14 10:36:46 fan root: Metadata,single: Size:12.00GiB, Used:2.46GiB
Mar 14 10:36:46 fan root:/dev/mapper/fanbtr#011  12.00GiB
Mar 14 10:36:46 fan root: 
Mar 14 10:36:46 fan root: System,single: Size:32.00MiB, Used:16.00KiB
Mar 14 10:36:46 fan root:/dev/mapper/fanbtr#011  32.00MiB
Mar 14 10:36:46 fan root: 
Mar 14 10:36:46 fan root:
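
The two "btrfs fi df" snapshots in the log above differ only in the
Metadata total (10.00GiB before the failed balance, 12.00GiB after it,
with used unchanged at 2.46GiB). A small Python sketch, given here only
as an illustration, makes that allocated-but-unused metadata explicit.
The helper names are hypothetical; the input format is taken from the
log lines themselves.

```python
import re

# Binary unit multipliers as printed by "btrfs fi df".
_UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Matches e.g. "Metadata, single: total=10.00GiB, used=2.46GiB",
# with or without a syslog prefix in front of it.
_DF_LINE = re.compile(
    r"(?P<kind>\w+), \w+: total=(?P<total>[\d.]+)(?P<tunit>(?:[KMGT]i)?B), "
    r"used=(?P<used>[\d.]+)(?P<uunit>(?:[KMGT]i)?B)"
)


def slack_gib(df_line: str) -> float:
    """Return allocated-but-unused space (total minus used) in GiB."""
    m = _DF_LINE.search(df_line)
    if m is None:
        raise ValueError(f"not a 'btrfs fi df' line: {df_line!r}")
    total = float(m["total"]) * _UNITS[m["tunit"]]
    used = float(m["used"]) * _UNITS[m["uunit"]]
    return (total - used) / _UNITS["GiB"]


before = slack_gib("Mar 14 10:23:49 fan root: Metadata, single: total=10.00GiB, used=2.46GiB")
after = slack_gib("Mar 14 10:36:46 fan root: Metadata, single: total=12.00GiB, used=2.46GiB")
print(f"unused metadata allocation: {before:.2f} GiB -> {after:.2f} GiB")
```

With the values from this log, the unused metadata allocation grows by
about 2 GiB during the balance that ended in ENOSPC.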

Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-14 Thread Marc Haber

Hi Henk,

On Mon, Mar 14, 2016 at 02:46:54PM +0100, Henk Slager wrote:
> On Mon, Mar 14, 2016 at 1:07 PM, Marc Haber  
> wrote:
> > Mar 14 10:23:49 fan mh: BEGIN btrfs-balance script
> > Mar 14 10:23:49 fan mh: btrfs fi df /
> > Mar 14 10:23:49 fan root: Data, single: total=79.00GiB, used=78.42GiB
> > Mar 14 10:23:49 fan root: System, single: total=32.00MiB, used=16.00KiB
> > Mar 14 10:23:49 fan root: Metadata, single: total=10.00GiB, used=2.46GiB
> > Mar 14 10:23:49 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B
> > Mar 14 10:23:49 fan mh: btrfs fi show /
> > Mar 14 10:23:49 fan root: Label: 'fanbtr'  uuid: 
> > 90f8d728-6bae-4fca-8cda-b368ba2c008e
> > Mar 14 10:23:49 fan root: #011Total devices 1 FS bytes used 80.89GiB
> > Mar 14 10:23:49 fan root: #011devid1 size 200.00GiB used 89.03GiB path 
> > /dev/mapper/fanbtr
> > Mar 14 10:23:49 fan root:
> > Mar 14 10:23:49 fan mh: btrfs fi usage /
> > Mar 14 10:23:49 fan root: Overall:
> > Mar 14 10:23:49 fan root: Device size:#011#011 200.00GiB
> > Mar 14 10:23:49 fan root: Device allocated:#011#011  89.03GiB
> > Mar 14 10:23:49 fan root: Device unallocated:#011#011 110.97GiB
> > Mar 14 10:23:49 fan root: Device missing:#011#011 0.00B
> > Mar 14 10:23:49 fan root: Used:#011#011#011  80.89GiB
> > Mar 14 10:23:49 fan root: Free (estimated):#011#011 111.54GiB#011(min: 
> > 111.54GiB)
> > Mar 14 10:23:49 fan root: Data ratio:#011#011#011  1.00
> > Mar 14 10:23:49 fan root: Metadata ratio:#011#011  1.00
> > Mar 14 10:23:49 fan root: Global reserve:#011#011 512.00MiB#011(used: 
> > 0.00B)
> It looks a bit strange to me that this is already 512MiB for an fs
> of 200GiB. Just after creation (4.4 tools) it should be something like
> 16MiB. And grows when fs is used, but 512MiB... An fs created with
> older tools had 512MiB from start AFAIK

Confirmed, a new btrfs of 200 GB made on a rotating disk has 16 MiB of
global reserve. Unfortunately, I do not have history about how this
grew over time. The first btrfs fi usage I have on file was about half
a day into this fs' existence on Mar 7, after copying data on to it,
and Global reserve was already at 512 MiB.

> > Mar 14 10:51:06 fan root: BEGIN btrfs balance start -mprofiles=dup /
> 
> This probably should have been  -mprofiles=single
> So that its gets more clear where and when the enospc errors occur

Good catch. So I'd need to parse btrfs fi df's output to call the
right balance option. I blindly copied that over from the script I
wrote for the older btrfs which still has DUP metadata and system.
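
A rough sketch of that parsing step, in Python rather than in the
shell the balance script is written in. The function names are
hypothetical; the only btrfs specifics assumed are the "btrfs fi df"
line format shown in the logs and the -mprofiles= balance filter
already used in the script. It expects the raw command output, not the
syslog-prefixed copies.

```python
import re
from typing import Dict, List

# Matches the start of lines such as
# "Metadata, single: total=10.00GiB, used=2.46GiB" or "Metadata, DUP: ...".
_DF_LINE = re.compile(r"^(?P<kind>\w+),\s*(?P<profile>\w+):")


def profiles_from_fi_df(output: str) -> Dict[str, str]:
    """Map each block group type (Data, System, Metadata, ...) to its profile."""
    profiles = {}
    for line in output.splitlines():
        m = _DF_LINE.match(line.strip())
        if m:
            # Lowercased to match the spelling used in "-mprofiles=dup".
            profiles[m.group("kind")] = m.group("profile").lower()
    return profiles


def metadata_balance_command(output: str, mountpoint: str = "/") -> List[str]:
    """Build a metadata balance command filtered on the profile actually in use."""
    profile = profiles_from_fi_df(output)["Metadata"]
    return ["btrfs", "balance", "start", f"-mprofiles={profile}", mountpoint]
```

The returned list can be handed to subprocess.run() as is. The sketch
only builds the command and does not start a balance itself.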

> BTW, I restored and mounted your 20160307-fanbtr-image:
> 
> [266169.207952] BTRFS: device label fanbtr devid 1 transid 22215732 /dev/loop0
> [266203.734804] BTRFS info (device loop0): disk space caching is enabled
> [266203.734806] BTRFS: has skinny extents
> [266204.022175] BTRFS: checking UUID tree
> [266239.407249] attempt to access beyond end of device
> [266239.407252] loop0: rw=1073, want=715202688, limit=70576
> [266239.407254] BTRFS error (device loop0): bdev /dev/loop0 errs: wr
> 1, rd 0, flush 0, corrupt 0, gen 0
> [266239.407272] attempt to access beyond end of device
> .. and 16 more
> 
> As a quick fix/workaround, I truncated the image to 1T

The original fs was 417 GiB in size. What size does the image claim?

> After re-loop and mount and while doing a balance of the metadata I got this:
> [27.431704] BTRFS error (device loop0): bad tree block start 0 5827368812544
>
> So something is/was wrong with the fs. Did you do a btrfs check before 
> imaging?

No, I didn't. And there is indeed something wrong:

[10/509]mh@fan:~$ sudo btrfs check /media/tempdisk/
Superblock bytenr is larger than device size
Couldn't open file system
[11/509]mh@fan:~$

Can this be fixed?

Greetings
Marc


Re: New file system with same issue

2016-03-14 Thread Marc Haber

On Mon, Mar 14, 2016 at 01:48:18PM +0100, Holger Hoffstätte wrote:
> On 03/14/16 13:07, Marc Haber wrote:
> >> And btrfs balance runs into the same ENOSPC issues as the old one:
> > 
> > ... with Qu's patch, I now get a reproducible kernel trace:
> 
> 
> 
> That is interesting and useful. Sorry if this was asked before, but
> did you ever try to clear the free-space cache via -o clear_cache
> on mount?

This was not asked, and I didn't try. Since this is an encrypted root
filesystem, is it a workable way to add clear_cache to /etc/fstab,
rebuild initramfs and reboot? Or do you recommend using a rescue system?

> Give it a try, let it run for a while and then try balancing
> again.

Do I need to wait for clear_cache to finish, like until I see disk
usage dropping?

> Uncle Occam's razor also suggests that the involvement of dm
> doesn't help. Why not just use the device/partition directly?

I need the dm layer in between since I don't want to repartition the
expensive SSD and the entire system is encrypted.

> _Someone_ is lying to btrfs in terms of device size and/or allocated
> chunks, otherwise you wouldn't get the ENOSPC.

Which properties does a block device report other than size?

Greetings
Marc


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber

On Mon, Mar 14, 2016 at 09:39:51PM +0100, Henk Slager wrote:
> >> BTW, I restored and mounted your 20160307-fanbtr-image:
> >>
> >> [266169.207952] BTRFS: device label fanbtr devid 1 transid 22215732 
> >> /dev/loop0
> >> [266203.734804] BTRFS info (device loop0): disk space caching is enabled
> >> [266203.734806] BTRFS: has skinny extents
> >> [266204.022175] BTRFS: checking UUID tree
> >> [266239.407249] attempt to access beyond end of device
> >> [266239.407252] loop0: rw=1073, want=715202688, limit=70576
> >> [266239.407254] BTRFS error (device loop0): bdev /dev/loop0 errs: wr
> >> 1, rd 0, flush 0, corrupt 0, gen 0
> >> [266239.407272] attempt to access beyond end of device
> >> .. and 16 more
> >>
> >> As a quick fix/workaround, I truncated the image to 1T
> >
> > The original fs was 417 GiB in size. What size does the image claim?
> 
> ls -alFh  of the restored image showed 337G I remember.
> btrfs fi us showed also a number over 400G, I don't have the
> files/loopdev anymore.

Sounds legit.

> It could be some side effect of btrfs-image, I only have used it for
> multi-device, where dev id's are ignored, but total image size did not
> lead to problems.

The original "ofanbtr" seems to have a problem, since btrfs check
/media/tempdisk says:

> > [10/509]mh@fan:~$ sudo btrfs check /media/tempdisk/
> > Superblock bytenr is larger than device size
> > Couldn't open file system
> > [11/509]mh@fan:~$
> >
> > Can this be fixed?
> 
> What I would do in order to fix it, is resize the fs to let's say
> 190GiB. That should write correct values to the superblocks I /hope/.
> And then resize back to max.

It doesn't:
[20/518]mh@fan:~$ sudo btrfs filesystem resize 300G /media/tempdisk/
Resize '/media/tempdisk/' of '300G'
[22/520]mh@fan:~$ sudo btrfs check /media/tempdisk/
Superblock bytenr is larger than device size
Couldn't open file system
[23/521]mh@fan:~$ df -h

> Maybe btrfs check --repair can also fix it, but before doing --repair
> or other actions, I would see what else besides btrfs could be wrong,
> see also suggestion of Holger.

Like putting the filesystem on an unencrypted medium? Sorry, no,
private data, paranoia.

Greetings
Marc


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber

On Tue, Mar 15, 2016 at 12:22:00AM +0100, Henk Slager wrote:
> The other question is: What is mounted on /media/tempdisk/  ?

The "old" btrfs filesystem "ofanbtr", formerly 417 GB in size, now
resized to 300 GB. Does it need to be umounted to be checked?

> At least I think a check of the current 200GiB fs is needed. As it is
> a rootfs and encrypted, some work is needed to make that happen.

You suggested a btrfs check after looking at the image of "ofanbtr".
Do you want me to check the new "fanbtr" also?

Too bad that we went back to looking at "ofanbtr" after I changed the
subject to avoid mixing up both instances.

Greetings
Marc


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber

On Mon, Mar 14, 2016 at 01:00:13AM +0100, Henk Slager wrote:
> On Sun, Mar 13, 2016 at 9:56 PM, Marc Haber  
> wrote:
> > Yes, I want to keep the possibility to remove huge files from
> > snapshots that shouldnt have been on a snapshotted volume in the first
> > place without having to ditch the entire snapshot.
> 
> You could do ro snapshotting and in case you want to modify something
> inside a snapshot/subvolume:
> # btrfs property set  ro false
> # rm /
> # btrfs property set  ro true

I was not aware that it is possible to fiddle with the ro property of
an already existing snapshot. I am not yet sure whether I love or hate
this.
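
For the record, the sequence Henk describes could be scripted roughly
like this. It is only a sketch: the path arguments of the quoted
commands did not survive in the mail, so the snapshot path and file
name used here are hypothetical parameters, and the function merely
builds the three commands without running them.

```python
from typing import List


def edit_readonly_snapshot_commands(snapshot: str, unwanted: str) -> List[List[str]]:
    """Return the commands to drop one file from a read-only snapshot.

    `snapshot` is the snapshot's path, `unwanted` the file's path relative
    to it. Both are placeholders supplied by the caller.
    """
    target = f"{snapshot.rstrip('/')}/{unwanted}"
    return [
        # Make the snapshot writable.
        ["btrfs", "property", "set", snapshot, "ro", "false"],
        # Remove the file that should not have been snapshotted.
        ["rm", "--", target],
        # Flip the snapshot back to read-only.
        ["btrfs", "property", "set", snapshot, "ro", "true"],
    ]
```

Each inner list can be passed to subprocess.run(..., check=True) in
order, so that a failed step stops the sequence before the snapshot is
set back to read-only.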

> >> Also, If some part of the OS or tools scans through the snapshot dirs
> >> every now and then with atime creation on, metadata grows without a
> >> real need.
> >
> > I mount with noatime and nodiratime anyway, and the directory the
> > snapshots are mounted to (/mnt/snapshots) are excluded in
> > updatedb.conf. Any other idea which tool might scan filesystems and
> > that might not be noticed when it's running about a five digit number
> > of snapshots?
> 
> Maybe baloo or so if you use KDE.

I usually do those tests via ssh without even being logged in to a
local desktop.

Greetings
Marc


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber

On Tue, Mar 15, 2016 at 01:15:33PM +0100, Henk Slager wrote:
> On Tue, Mar 15, 2016 at 8:16 AM, Marc Haber  
> wrote:
> > On Tue, Mar 15, 2016 at 12:22:00AM +0100, Henk Slager wrote:
> >> The other question is: What is mounted on /media/tempdisk/  ?
> >
> > The "old" btrfs filesystem "ofanbtr", formerly 417 GB in size, now
> > resized to 300 GB. Does it need to be umounted to be checked?
> 
> Yes, that's the whole point
> 
> >> At least I think a check of the current 200GiB fs is needed. As it is
> >> a rootfs and encrypted, some work is needed to make that happen.
> >
> > You suggested a btrfs check after looking at the image of "ofanbtr".
> > Do you want me to check the new "fanbtr" also?
> 
> I was not sure if 'ofanbtr' is an image created by btrfs-image or a
> extra dd created image you might have locally. Both 'ofanbtr' and
> 'fanbtr' have the same balance issue, but 'fanbtr' is created with
> newer and known kernel+tools version I assume, so that's why the
> suggestion.

ofanbtr is the old btrfs, on /dev/mapper/ofanbtr:
Label: 'ofanbtr'  uuid: 4198d1bc-e3ce-40df-a7ee-44a2d120bff3
Total devices 1 FS bytes used 80.63GiB
devid1 size 300.00GiB used 122.06GiB path /dev/mapper/ofanbtr
it was created as 'fanbtr' in September, 300 GiB in size, then (in
February, I think) resized to 417 GiB to make room for more data and
for balancing, used until March 7, and then renamed to ofanbtr with
lvrename and btrfs fi label. It was then imaged, and then resized back
to 300 GiB in the hope that this would fix the size issue.

fanbtr is the new btrfs, on /dev/mapper/fanbtr:
Label: 'fanbtr'  uuid: 90f8d728-6bae-4fca-8cda-b368ba2c008e
Total devices 1 FS bytes used 82.45GiB
devid1 size 200.00GiB used 113.03GiB path /dev/mapper/fanbtr
it was created on March 7, had the data from ofanbtr cp'ed over, and
has been used as the active filesystem since then. It is smaller
because I don't have much more room on the SSD.

Both do have the same balance issue, yes.

Greetings
Marc


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber

On Mon, Mar 14, 2016 at 09:05:46PM +0100, Marc Haber wrote:
> [10/509]mh@fan:~$ sudo btrfs check /media/tempdisk/
> Superblock bytenr is larger than device size
> Couldn't open file system
> [11/509]mh@fan:~$

After umounting the filesystem and running btrfs check on the block
device, things seem to be fine now:

[34/532]mh@fan:~$ sudo btrfs check /dev/mapper/ofanbtr
Checking filesystem on /dev/mapper/ofanbtr
UUID: 4198d1bc-e3ce-40df-a7ee-44a2d120bff3
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 86554574954 bytes used err is 0
total csum bytes: 81815012
total tree bytes: 2476670976
total fs tree bytes: 2246311936
total extent tree bytes: 133201920
btree space waste bytes: 452859567
file data blocks allocated: 292994375680
 referenced 132664688640
[35/533]mh@fan:~$ sudo btrfs check /dev/mapper/ofanbtr
Checking filesystem on /dev/mapper/ofanbtr
UUID: 4198d1bc-e3ce-40df-a7ee-44a2d120bff3
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 86554574954 bytes used err is 0
total csum bytes: 81815012
total tree bytes: 2476670976
total fs tree bytes: 2246311936
total extent tree bytes: 133201920
btree space waste bytes: 452859567
file data blocks allocated: 292994375680
 referenced 132664688640
[36/533]mh@fan:~$

This does not indicate an error, does it?

Greetings
Marc, who would like the tools to be a bit more explicit and consistent
in whether they want the fs mounted, umounted, the mountpoint or the
device on their command line


Re: New file system with same issue (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-15 Thread Marc Haber

On Tue, Mar 15, 2016 at 02:29:32PM +0100, Marc Haber wrote:
> After umounting and btrfs check the block device, things seem to be
> fine now

But, umounting the btrfs seemed to trigger the following kernel traces:

Mar 15 14:21:30 fan kernel: [92308.377104] ------------[ cut here ]------------
Mar 15 14:21:30 fan kernel: [92308.377135] WARNING: CPU: 5 PID: 28243 at fs/btrfs/extent-tree.c:5380 btrfs_free_block_groups+0x1bc/0x36f [btrfs]()
Mar 15 14:21:30 fan kernel: [92308.377137] Modules linked in: vhost_net vhost 
macvtap macvlan tun iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
nf_nat nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp dummy ebtable_filter 
ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables 
cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative bridge 
stp llc snd_cmipci snd_hda_codec_realtek snd_hda_codec_generic 
snd_hda_codec_hdmi kvm_amd snd_mpu401_uart snd_opl3_lib snd_rawmidi kvm 
snd_hda_intel snd_seq_device snd_hda_codec snd_hda_core snd_hwdep 
amd64_edac_mod snd_pcm_oss edac_mce_amd irqbypass input_leds snd_mixer_oss 
pcspkr k10temp edac_core snd_pcm snd_timer snd i2c_piix4 asus_atk0110 soundcore 
acpi_cpufreq tpm_tis tpm sg processor evdev shpchp hwmon_vid autofs4 
crc32c_generic btrfs xor raid6_pq ext4 crc16 mbcache jbd2 hmac sha256_ssse3 
sha256_generic drbg ansi_cprng xts gf128mul algif_skcipher af_alg dm_crypt 
dm_mod hid_generic usbhid hid usb_storage sr_mod sd_mod cdrom ohci_pci r8169 
mii amdkfd radeon i2c_algo_bit ahci ttm sym53c8xx libahci xhci_pci 
scsi_transport_spi drm_kms_helper ohci_hcd ehci_pci xhci_hcd libata ehci_hcd 
drm usbcore scsi_mod usb_common i2c_core button
Mar 15 14:21:30 fan kernel: [92308.377203] CPU: 5 PID: 28243 Comm: umount Not 
tainted 4.4.5-zgws1 #2
Mar 15 14:21:30 fan kernel: [92308.377205] Hardware name: System manufacturer 
System Product Name/M5A88-V EVO, BIOS 1603 10/12/2012
Mar 15 14:21:30 fan kernel: [92308.377207]  005b 811dd418 
 0009
Mar 15 14:21:30 fan kernel: [92308.377210]  81051e21 a047a147 
880600a28000 
Mar 15 14:21:30 fan kernel: [92308.377212]  880600a28080 8805af7eea00 
a047a147 880600a28000
Mar 15 14:21:30 fan kernel: [92308.377215] Call Trace:
Mar 15 14:21:30 fan kernel: [92308.377221]  [] ? 
dump_stack+0x5a/0x6f
Mar 15 14:21:30 fan kernel: [92308.377224]  [] ? 
warn_slowpath_common+0x8e/0xa3
Mar 15 14:21:30 fan kernel: [92308.377239]  [] ? 
btrfs_free_block_groups+0x1bc/0x36f[btrfs]
Mar 15 14:21:30 fan kernel: [92308.377252]  [] ? 
btrfs_free_block_groups+0x1bc/0x36f[btrfs]
Mar 15 14:21:30 fan kernel: [92308.377267]  [] ? 
close_ctree+0x1e6/0x2f2 [btrfs]
Mar 15 14:21:30 fan kernel: [92308.377271]  [] ? 
generic_shutdown_super+0x64/0xdf
Mar 15 14:21:30 fan kernel: [92308.377273]  [] ? 
kill_anon_super+0x9/0xe
Mar 15 14:21:30 fan kernel: [92308.377285]  [] ? 
btrfs_kill_super+0xd/0x16 [btrfs]
Mar 15 14:21:30 fan kernel: [92308.377288]  [] ? 
deactivate_locked_super+0x2f/0x56
Mar 15 14:21:30 fan kernel: [92308.377291]  [] ? 
cleanup_mnt+0x4f/0x6b
Mar 15 14:21:30 fan kernel: [92308.377293]  [] ? 
task_work_run+0x5d/0x71
Mar 15 14:21:30 fan kernel: [92308.377296]  [] ? 
prepare_exit_to_usermode+0x70/0x99
Mar 15 14:21:30 fan kernel: [92308.377300]  [] ? 
int_ret_from_sys_call+0x25/0x8f
Mar 15 14:21:30 fan kernel: [92308.377302] ---[ end trace 18c6bb90b0c6c689 ]---

Mar 15 14:21:30 fan kernel: [92308.377303] [ cut here ]
Mar 15 14:21:30 fan kernel: [92308.377318] WARNING: CPU: 5 PID: 28243 at 
fs/btrfs/extent-tree.c:5381 btrfs_free_block_groups+0x1d7/0x36f [btrfs]()
Mar 15 14:21:30 fan kernel: [92308.377319] Modules linked in: vhost_net vhost 
macvtap macvlan tun iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 
nf_nat nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp dummy ebtable_filter 
ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables 
cpufreq_userspace cpufreq_stats cpufreq_powersave cpufreq_conservative bridge 
stp llc snd_cmipci snd_hda_codec_realtek snd_hda_codec_generic 
snd_hda_codec_hdmi kvm_amd snd_mpu401_uart snd_opl3_lib snd_rawmidi kvm 
snd_hda_intel snd_seq_device snd_hda_codec snd_hda_core snd_hwdep 
amd64_edac_mod snd_pcm_oss edac_mce_amd irqbypass input_leds snd_mixer_oss 
pcspkr k10temp edac_core snd_pcm snd_timer snd i2c_piix4 asus_atk0110 soundcore 
acpi_cpufreq tpm_tis tpm sg processor evdev shpchp hwmon_vid autofs4 
crc32c_generic btrfs xor raid6_pq ext4 crc16 mbcache jbd2 hmac sha256_ssse3 
sha256_generic drbg ansi_cprng xts gf128mul algif_skcipher af_alg dm_crypt 
dm_mod hid_generic usbhid hid usb_storage sr_mod sd_mod cdrom ohci_pci r8169 
mii amdkfd radeon i2c_algo_bit ahci ttm sym53c8xx libahci xhci_pci 
scsi_transport_spi drm_kms_helper ohci_hcd ehci_pci xhci_hcd libata ehci_hcd 
drm usbcore scsi_mod usb_common i2c_core button
Mar 15 14:21:30 fan kernel: [92308.377362] CPU: 5 PID: 28243 Comm:

Re: New file system with same issue

2016-03-15 Thread Marc Haber

On Tue, Mar 15, 2016 at 11:52:30AM +0100, Holger Hoffstätte wrote:
> On 03/14/16 21:13, Marc Haber wrote:
> > Do I need to wait for clear_cache to finish, like until I see disk
> > usage dropping?
> 
> The cache isn't that big, so you won't see a huge drop. Just use the
> disk normally for a few minutes, after some time the cache will be
> written out again.

Is it necessary to actually cause activity on the file system or is it
ok to just let it sit there for an hour or so?

Greetings
Marc


Re: New file system with same issue

2016-03-15 Thread Marc Haber

On Tue, Mar 15, 2016 at 09:54:06AM -0400, Austin S. Hemmelgarn wrote:
> On 2016-03-15 09:46, Marc Haber wrote:
> >On Tue, Mar 15, 2016 at 11:52:30AM +0100, Holger Hoffstätte wrote:
> >>On 03/14/16 21:13, Marc Haber wrote:
> >>>Do I need to wait for clear_cache to finish, like until I see disk
> >>>usage dropping?
> >>
> >>The cache isn't that big, so you won't see a huge drop. Just use the
> >>disk normally for a few minutes, after some time the cache will be
> >>written out again.
> >
> >Is it necessary to actually cause activity on the file system or is it
> >ok to just let it sit there for an hour or so?
> It should be OK to just let it sit there for ten or fifteen minutes. I'm
> pretty certain that the free space cache gets rebuilt relatively quickly,
> and I'm almost 100% certain that the old one gets dropped within seconds of
> the FS being mounted with -o clear_cache.  I've rebuilt the cache on the 64G
> root filesystem on my laptop a couple of times before, and it consistently
> appears to take about 2-3 minutes to do so at most (based on disk usage from
> the kernel itself).

In my case, atop has not seen any notable disk activity after mounting
with -o clear_cache.

Greetings
Marc


Current state of old filesystem (was: Again, no space left on device while rebalancing and recipe doesnt work)

2016-03-27 Thread Marc Haber

btrfs fi df /mnt/ofanbtr
Mar 26 11:03:36 fan root: Data, single: total=79.00GiB, used=78.32GiB
Mar 26 11:03:36 fan root: System, DUP: total=32.00MiB, used=16.00KiB
Mar 26 11:03:36 fan root: Metadata, DUP: total=14.50GiB, used=2.30GiB
Mar 26 11:03:36 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B
Mar 26 11:03:36 fan mh: btrfs fi show /mnt/ofanbtr
Mar 26 11:03:36 fan root: Label: 'ofanbtr'  uuid: 
4198d1bc-e3ce-40df-a7ee-44a2d120bff3
Mar 26 11:03:36 fan root: #011Total devices 1 FS bytes used 80.63GiB
Mar 26 11:03:36 fan root: #011devid1 size 300.00GiB used 108.06GiB path 
/dev/mapper/ofanbtr
Mar 26 11:03:36 fan root:
Mar 26 11:03:36 fan mh: btrfs fi usage /mnt/ofanbtr
Mar 26 11:03:36 fan root: Overall:
Mar 26 11:03:36 fan root: Device size:#011#011 300.00GiB
Mar 26 11:03:36 fan root: Device allocated:#011#011 108.06GiB
Mar 26 11:03:36 fan root: Device unallocated:#011#011 191.94GiB
Mar 26 11:03:36 fan root: Device missing:#011#011 0.00B
Mar 26 11:03:36 fan root: Used:#011#011#011  82.93GiB
Mar 26 11:03:36 fan root: Free (estimated):#011#011 192.61GiB#011(min: 
96.64GiB)
Mar 26 11:03:36 fan root: Data ratio:#011#011#011  1.00
Mar 26 11:03:36 fan root: Metadata ratio:#011#011  2.00
Mar 26 11:03:36 fan root: Global reserve:#011#011 512.00MiB#011(used: 0.00B)
Mar 26 11:03:36 fan root:
Mar 26 11:03:36 fan root: Data,single: Size:79.00GiB, Used:78.32GiB
Mar 26 11:03:36 fan root:/dev/mapper/ofanbtr#011  79.00GiB
Mar 26 11:03:36 fan root:
Mar 26 11:03:36 fan root: Metadata,DUP: Size:14.50GiB, Used:2.30GiB
Mar 26 11:03:36 fan root:/dev/mapper/ofanbtr#011  29.00GiB
Mar 26 11:03:36 fan root:
Mar 26 11:03:36 fan root: System,DUP: Size:32.00MiB, Used:16.00KiB
Mar 26 11:03:36 fan root:/dev/mapper/ofanbtr#011  64.00MiB
Mar 26 11:03:36 fan root:
Mar 26 11:03:36 fan root: Unallocated:
Mar 26 11:03:36 fan root:/dev/mapper/ofanbtr#011 191.94GiB
Mar 26 11:03:36 fan mh: BEGIN btrfs balance start /mnt/ofanbtr
Mar 26 11:38:40 fan kernel: [ 6796.907286] BTRFS info (device dm-31): 9 enospc 
errors during balance
Mar 26 11:38:40 fan root: ERROR: error during balancing '/mnt/ofanbtr': No 
space left on device
Mar 26 11:38:40 fan root: There may be more info in syslog - try dmesg | tail
Mar 26 11:38:40 fan mh: btrfs fi df /mnt/ofanbtr
Mar 26 11:38:40 fan root: Data, single: total=79.00GiB, used=78.33GiB
Mar 26 11:38:40 fan root: System, DUP: total=32.00MiB, used=16.00KiB
Mar 26 11:38:40 fan root: Metadata, DUP: total=19.00GiB, used=2.31GiB
Mar 26 11:38:40 fan root: GlobalReserve, single: total=512.00MiB, used=0.00B
Mar 26 11:38:40 fan mh: btrfs fi show /mnt/ofanbtr
Mar 26 11:38:40 fan root: Label: 'ofanbtr'  uuid: 
4198d1bc-e3ce-40df-a7ee-44a2d120bff3
Mar 26 11:38:40 fan root: #011Total devices 1 FS bytes used 80.63GiB
Mar 26 11:38:40 fan root: #011devid1 size 300.00GiB used 117.06GiB path 
/dev/mapper/ofanbtr
Mar 26 11:38:40 fan root:
Mar 26 11:38:40 fan mh: btrfs fi usage /mnt/ofanbtr
Mar 26 11:38:40 fan root: Overall:
Mar 26 11:38:40 fan root: Device size:#011#011 300.00GiB
Mar 26 11:38:40 fan root: Device allocated:#011#011 117.06GiB
Mar 26 11:38:40 fan root: Device unallocated:#011#011 182.94GiB
Mar 26 11:38:40 fan root: Device missing:#011#011 0.00B
Mar 26 11:38:40 fan root: Used:#011#011#011  82.94GiB
Mar 26 11:38:40 fan root: Free (estimated):#011#011 183.61GiB#011(min: 
92.14GiB)
Mar 26 11:38:40 fan root: Data ratio:#011#011#011  1.00
Mar 26 11:38:40 fan root: Metadata ratio:#011#011  2.00
Mar 26 11:38:40 fan root: Global reserve:#011#011 512.00MiB#011(used: 0.00B)
Mar 26 11:38:40 fan root:
Mar 26 11:38:40 fan root: Data,single: Size:79.00GiB, Used:78.33GiB
Mar 26 11:38:40 fan root:/dev/mapper/ofanbtr#011  79.00GiB
Mar 26 11:38:40 fan root:
Mar 26 11:38:40 fan root: Metadata,DUP: Size:19.00GiB, Used:2.31GiB
Mar 26 11:38:40 fan root:/dev/mapper/ofanbtr#011  38.00GiB
Mar 26 11:38:40 fan root:
Mar 26 11:38:40 fan root: System,DUP: Size:32.00MiB, Used:16.00KiB
Mar 26 11:38:40 fan root:/dev/mapper/ofanbtr#011  64.00MiB
Mar 26 11:38:40 fan root:
Mar 26 11:38:40 fan root: Unallocated:
Mar 26 11:38:40 fan root:/dev/mapper/ofanbtr#011 182.94GiB
Mar 26 11:38:40 fan mh: END btrfs-balance script
[16/515]mh@fan:~$
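
The log above shows metadata allocation growing from 14.50GiB to 19.00GiB
while metadata actually in use stays at about 2.3GiB. A minimal sketch for
reading that ratio off `btrfs fi df` output follows; the function name
fi_df_usage is made up for illustration, and the parser deliberately only
handles lines where both figures are given in GiB.

```shell
#!/bin/sh
# Hypothetical helper: reads `btrfs fi df` output on stdin and prints, for
# every line whose total and used figures are both in GiB, how much of the
# allocated space is actually in use. Lines in other units are skipped.
fi_df_usage() {
    awk '
    match($0, /total=[0-9.]+GiB, used=[0-9.]+GiB/) {
        # Everything before ": total=" is the block group type and profile.
        name = substr($0, 1, RSTART - 3)
        # Splitting "total=X GiB, used=Y GiB" on "=" and "G" puts X in f[2]
        # and Y in f[4].
        split(substr($0, RSTART, RLENGTH), f, /[=G]/)
        printf "%s: %.0f%% of allocated space in use\n", name, 100 * f[4] / f[2]
    }'
}

# Figures copied from the first `btrfs fi df` run in the log above.
fi_df_usage <<'EOF'
Data, single: total=79.00GiB, used=78.32GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=14.50GiB, used=2.30GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
EOF
```

On a live system the same function could be fed with
`btrfs fi df /mnt/ofanbtr | fi_df_usage`; the syslog prefix seen in the log
above would have to be stripped first.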



"bad metadata" not fixed by btrfs repair

2016-03-28 Thread Marc Haber

Hi,

I have a btrfs which btrfs check --repair doesn't fix:

# btrfs check --repair /dev/mapper/fanbtr
bad metadata [4425377054720, 4425377071104) crossing stripe boundary
bad metadata [4425380134912, 4425380151296) crossing stripe boundary
bad metadata [4427532795904, 4427532812288) crossing stripe boundary
bad metadata [4568321753088, 4568321769472) crossing stripe boundary
bad metadata [4568489656320, 4568489672704) crossing stripe boundary
bad metadata [4571474493440, 4571474509824) crossing stripe boundary
bad metadata [4571946811392, 4571946827776) crossing stripe boundary
bad metadata [4572782919680, 4572782936064) crossing stripe boundary
bad metadata [4573086351360, 4573086367744) crossing stripe boundary
bad metadata [4574221041664, 4574221058048) crossing stripe boundary
bad metadata [4574373412864, 4574373429248) crossing stripe boundary
bad metadata [4574958649344, 4574958665728) crossing stripe boundary
bad metadata [4575996018688, 4575996035072) crossing stripe boundary
bad metadata [4580376772608, 4580376788992) crossing stripe boundary
repaired damaged extent references
Fixed 0 roots.
checking free space cache
checking fs roots
checking csums
checking root refs
enabling repair mode
Checking filesystem on /dev/mapper/fanbtr
UUID: 90f8d728-6bae-4fca-8cda-b368ba2c008e
cache and super generation don't match, space cache will be invalidated
found 97171628230 bytes used err is 0
total csum bytes: 91734220
total tree bytes: 3021848576
total fs tree bytes: 2762784768
total extent tree bytes: 148570112
btree space waste bytes: 545440822
file data blocks allocated: 308328280064
 referenced 177314340864
# btrfs check --repair /dev/mapper/fanbtr
checking extents
bad metadata [4425377054720, 4425377071104) crossing stripe boundary
bad metadata [4425380134912, 4425380151296) crossing stripe boundary
bad metadata [4427532795904, 4427532812288) crossing stripe boundary
bad metadata [4568321753088, 4568321769472) crossing stripe boundary
bad metadata [4568489656320, 4568489672704) crossing stripe boundary
bad metadata [4571474493440, 4571474509824) crossing stripe boundary
bad metadata [4571946811392, 4571946827776) crossing stripe boundary
bad metadata [4572782919680, 4572782936064) crossing stripe boundary
bad metadata [4573086351360, 4573086367744) crossing stripe boundary
bad metadata [4574221041664, 4574221058048) crossing stripe boundary
bad metadata [4574373412864, 4574373429248) crossing stripe boundary
bad metadata [4574958649344, 4574958665728) crossing stripe boundary
bad metadata [4575996018688, 4575996035072) crossing stripe boundary
bad metadata [4580376772608, 4580376788992) crossing stripe boundary
repaired damaged extent references
Fixed 0 roots.
checking free space cache
checking fs roots
checking csums
checking root refs
enabling repair mode
Checking filesystem on /dev/mapper/fanbtr
UUID: 90f8d728-6bae-4fca-8cda-b368ba2c008e
cache and super generation don't match, space cache will be invalidated
found 97171628230 bytes used err is 0
total csum bytes: 91734220
total tree bytes: 3021848576
total fs tree bytes: 2762784768
total extent tree bytes: 148570112
btree space waste bytes: 545440822
file data blocks allocated: 308328280064
 referenced 177314340864

How do I fix this?

Does the kernel play a role in btrfs check --repair, or is this all a
userspace matter?

Greetings
Marc


Re: "bad metadata" not fixed by btrfs repair

2016-03-28 Thread Marc Haber

On Mon, Mar 28, 2016 at 04:37:14PM +0200, Marc Haber wrote:
> I have a btrfs which btrfs check --repair doesn't fix:
> 
> # btrfs check --repair /dev/mapper/fanbtr
> bad metadata [4425377054720, 4425377071104) crossing stripe boundary
> bad metadata [4425380134912, 4425380151296) crossing stripe boundary
> bad metadata [4427532795904, 4427532812288) crossing stripe boundary
> bad metadata [4568321753088, 4568321769472) crossing stripe boundary
> bad metadata [4568489656320, 4568489672704) crossing stripe boundary
> bad metadata [4571474493440, 4571474509824) crossing stripe boundary
> bad metadata [4571946811392, 4571946827776) crossing stripe boundary
> bad metadata [4572782919680, 4572782936064) crossing stripe boundary
> bad metadata [4573086351360, 4573086367744) crossing stripe boundary
> bad metadata [4574221041664, 4574221058048) crossing stripe boundary
> bad metadata [4574373412864, 4574373429248) crossing stripe boundary
> bad metadata [4574958649344, 4574958665728) crossing stripe boundary
> bad metadata [4575996018688, 4575996035072) crossing stripe boundary
> bad metadata [4580376772608, 4580376788992) crossing stripe boundary
> repaired damaged extent references
> Fixed 0 roots.
> checking free space cache
> checking fs roots
> checking csums
> checking root refs
> enabling repair mode
> Checking filesystem on /dev/mapper/fanbtr
> UUID: 90f8d728-6bae-4fca-8cda-b368ba2c008e
> cache and super generation don't match, space cache will be invalidated
> found 97171628230 bytes used err is 0
> total csum bytes: 91734220
> total tree bytes: 3021848576
> total fs tree bytes: 2762784768
> total extent tree bytes: 148570112
> btree space waste bytes: 545440822
> file data blocks allocated: 308328280064
>  referenced 177314340864

Mounting this filesystem gives:
Mar 28 20:25:18 fan kernel: [   20.979673] BTRFS error (device dm-16): could 
not find root 8
Mar 28 20:25:18 fan kernel: [   20.979739] BTRFS error (device dm-16): could 
not find root 8
Mar 28 20:25:18 fan kernel: [   20.980900] BTRFS error (device dm-16): could 
not find root 8
Mar 28 20:25:18 fan kernel: [   20.980948] BTRFS error (device dm-16): could 
not find root 8
Mar 28 20:25:18 fan kernel: [   20.981428] BTRFS error (device dm-16): could 
not find root 8
Mar 28 20:25:18 fan kernel: [   20.981472] BTRFS error (device dm-16): could 
not find root 8

which is not detected by btrfs check.

What is going on here?

Greetings
Marc


Re: "bad metadata" not fixed by btrfs repair

2016-03-28 Thread Marc Haber

On Mon, Mar 28, 2016 at 06:51:02PM +, Hugo Mills wrote:
>"Could not find root 8" is harmless (and will be going away as a
> message soon). It just means that systemd is probing the FS for
> quotas, and you don't have quotas enabled.

*phew* That message was not what I wanted to read on this filesystem.

Greetings
Marc


Re: "bad metadata" not fixed by btrfs repair

2016-03-28 Thread Marc Haber

On Mon, Mar 28, 2016 at 03:35:32PM -0400, Austin S. Hemmelgarn wrote:
> Did you convert this filesystem from ext4 (or ext3)?

No.

> You hadn't mentioned what version of btrfs-progs you're using, and that is
> somewhat important for recovery.  I'm not sure if current versions of btrfs
> check can fix this issue, but I know for a fact that older versions (prior
> to at least 4.1) can not fix it.

4.1 for creation and btrfs check.

> As far as what the kernel is involved with, the easy way to check is if it's
> operating on a mounted filesystem or not.  If it only operates on mounted
> filesystems, it almost certainly goes through the kernel, if it only
> operates on unmounted filesystems, it's almost certainly done in userspace
> (except dev scan and technically fi show).

Then btrfs check is a userspace-only matter, as it wants the fs
unmounted, and it is irrelevant that I did btrfs check from a rescue
system with an older kernel, 3.16 if I recall correctly.

> 2. Regarding general support:  If you're using an enterprise distribution
> (RHEL, SLES, CentOS, OEL, or something similar), you are almost certainly
> going to get better support from your vendor than from the mailing list or
> IRC.

My "productive" desktops (fan is one of them) run Debian unstable with
a current vanilla kernel. At the moment, I can't use 4.5 because it
acts up with KVM.  When I need a rescue system, I use grml, which
unfortunately hasn't released since November 2014 and is still with
kernel 3.16

Greetings
Marc


Re: "bad metadata" not fixed by btrfs repair

2016-03-29 Thread Marc Haber

Marc


Re: "bad metadata" not fixed by btrfs repair

2016-03-29 Thread Marc Haber

On Tue, Mar 29, 2016 at 08:43:51AM +0200, Marc Haber wrote:
> On Mon, Mar 28, 2016 at 03:35:32PM -0400, Austin S. Hemmelgarn wrote:
> > As far as what the kernel is involved with, the easy way to check is if it's
> > operating on a mounted filesystem or not.  If it only operates on mounted
> > filesystems, it almost certainly goes through the kernel, if it only
> > operates on unmounted filesystems, it's almost certainly done in userspace
> > (except dev scan and technically fi show).
> 
> Then btrfs check is a userspace-only matter, as it wants the fs
> unmounted, and it is irrelevant that I did btrfs check from a rescue
> system with an older kernel, 3.16 if I recall correctly.

And it also means that I should not try btrfs balance from grml
because btrfs balance goes through the kernel code.

Greetings
Marc


Re: "bad metadata" not fixed by btrfs repair

2016-03-29 Thread Marc Haber

On Mon, Mar 28, 2016 at 02:46:54PM -0600, Chris Murphy wrote:
> http://git.kernel.org/cgit/linux/kernel/git/kdave/btrfs-progs.git/tree/cmds-check.c
> line 7722 discusses this error message and it looks like there's no
> repair function for it yet; uncertain what problems can result from
> this.

This basically means that I am ripe for a new mkfs.btrfs, right?

Greetings
Marc


Re: "bad metadata" not fixed by btrfs repair

2016-03-30 Thread Marc Haber

On Wed, Mar 30, 2016 at 03:00:19PM +0800, Qu Wenruo wrote:
> Marc Haber wrote on 2016/03/29 08:43 +0200:
> >On Mon, Mar 28, 2016 at 03:35:32PM -0400, Austin S. Hemmelgarn wrote:
> >>Did you convert this filesystem from ext4 (or ext3)?
> >
> >No.
> >
> >>You hadn't mentioned what version of btrfs-progs you're using, and that is
> >>somewhat important for recovery.  I'm not sure if current versions of btrfs
> >>check can fix this issue, but I know for a fact that older versions (prior
> >>to at least 4.1) can not fix it.
> >
> >4.1 for creation and btrfs check.
> 
> I assume that you have run older kernel on it, like v4.1 or v4.2.

No, the productive system was always on a reasonably recent kernel. I
guess that this instance of btrfs has never been mounted on anything
older than 4.4.4. The rescue system I used to run btrfs check (tools
4.4-1 from Debian unstable; I updated btrfs-tools on the rescue system
before running btrfs check) had kernel 3.16, but I have never actually
mounted the btrfs there.

> >Then btrfs check is a userspace-only matter, as it wants the fs
> >unmounted, and it is irrelevant that I did btrfs check from a rescue
> >system with an older kernel, 3.16 if I recall correctly.
> 
> Not recommended to use older kernel to RW mount or use older fsck to do
> repair.

Oldest kernel that has mounted this btrfs is 4.4.4, fsck that touched
the fs is 4.4. I'm trying to get hold of btrfs-tools 4.5.

> >My "productive" desktops (fan is one of them) run Debian unstable with
> >a current vanilla kernel. At the moment, I can't use 4.5 because it
> >acts up with KVM.  When I need a rescue system, I use grml, which
> >unfortunately hasn't released since November 2014 and is still with
> >kernel 3.16
> 
> To fix your problem(make these error message just disappear, even they are
> harmless on recent kernels), the most easy one, is to balance your metadata.

This does not work on kernel 4.4.6 with tools 4.4. Truckloads of
kernel traces, "WARNING: CPU: 5 PID: 31021 at
fs/btrfs/extent-tree.c:7897 btrfs_alloc_tree_block+0xeb/0x3d6
[btrfs]()", "BTRFS: block rsv returned -28", full trace is in this
thread.

Greetings
Marc


Re: "bad metadata" not fixed by btrfs repair

2016-03-30 Thread Marc Haber

On Wed, Mar 30, 2016 at 04:03:17PM +0800, Qu Wenruo wrote:
> Did your btrfs have enough *unallocated* space?

87 Gig out of a total 200 Gig Device size. I guess that should be
enough for a rebalance of 2.8 Gig Metadata.

Greetings
Ma "please excuse my cynicism" rc


How to cancel btrfs balance on unmounted filesystem

2016-03-30 Thread Marc Haber

Hi,

one of my problem btrfs instances went into a hung process state
while balancing metadata. This process is recorded in the file system
somehow and the balance restarts immediately after mounting the
filesystem with no chance to issue a btrfs balance cancel command
before the system hangs again.

Is there any possibility to cancel the pending balance without mounting
the fs first?

I have also filed https://bugzilla.kernel.org/show_bug.cgi?id=115581
to address this in a more elegant way.

Greetings
Marc


Re: How to cancel btrfs balance on unmounted filesystem

2016-03-31 Thread Marc Haber

On Thu, Mar 31, 2016 at 01:01:37PM +0500, Roman Mamedov wrote:
> On Thu, 31 Mar 2016 08:21:12 +0200
> Marc Haber  wrote:
> > the balance restarts immediately after mounting
> 
> You can use the skip_balance mount option to prevent that.

Thanks. I now have this in all fstabs. On the system in question, I
was able to sneak in a btrfs balance cancel before the system hung
itself.

Mar 31 08:17:42 fan kernel: [  240.595465] INFO: task kworker/u16:0:6 blocked 
for more than 120 seconds.
Mar 31 08:17:42 fan kernel: [  240.595604]   Tainted: GW   
4.4.6-zgws1 #2
Mar 31 08:17:42 fan kernel: [  240.595705] "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 31 08:17:42 fan kernel: [  240.595845] kworker/u16:0   D 88062fc956c0   
  0 6  2 0x
Mar 31 08:17:42 fan kernel: [  240.595913] Workqueue: btrfs-endio-write 
btrfs_endio_write_helper [btrfs]
Mar 31 08:17:42 fan kernel: [  240.595919]  88017ca680c0 0002 
88017ca78000 88017ca77ca0
Mar 31 08:17:42 fan kernel: [  240.595927]  8800c9388960 0002 
81409e1c 88017ca680c0
Mar 31 08:17:42 fan kernel: [  240.595934]  81408329 7fff 
81409e5a 00c0a044e7d3
Mar 31 08:17:42 fan kernel: [  240.595941] Call Trace:
Mar 31 08:17:42 fan kernel: [  240.595955]  [] ? 
usleep_range+0x35/0x35
Mar 31 08:17:42 fan kernel: [  240.595964]  [] ? 
schedule+0x6f/0x7c
Mar 31 08:17:42 fan kernel: [  240.595973]  [] ? 
schedule_timeout+0x3e/0x128
Mar 31 08:17:42 fan kernel: [  240.595981]  [] ? 
cache_alloc+0x1bd/0x277
Mar 31 08:17:42 fan kernel: [  240.595990]  [] ? 
__wait_for_common+0x121/0x16d
Mar 31 08:17:42 fan kernel: [  240.595997]  [] ? 
__wait_for_common+0x121/0x16d
Mar 31 08:17:42 fan kernel: [  240.596006]  [] ? 
wake_up_q+0x3b/0x3b
Mar 31 08:17:42 fan kernel: [  240.596047]  [] ? 
btrfs_async_run_delayed_refs+0xbf/0xd5 [btrfs]
Mar 31 08:17:42 fan kernel: [  240.596093]  [] ? 
__btrfs_end_transaction+0x291/0x2d5 [btrfs]
Mar 31 08:17:42 fan kernel: [  240.596140]  [] ? 
btrfs_finish_ordered_io+0x418/0x4d7 [btrfs]
Mar 31 08:17:42 fan kernel: [  240.596187]  [] ? 
btrfs_scrubparity_helper+0xf4/0x233 [btrfs]
Mar 31 08:17:42 fan kernel: [  240.596198]  [] ? 
process_one_work+0x178/0x27b
Mar 31 08:17:42 fan kernel: [  240.596206]  [] ? 
worker_thread+0x1da/0x280
Mar 31 08:17:42 fan kernel: [  240.596213]  [] ? 
rescuer_thread+0x284/0x284
Mar 31 08:17:42 fan kernel: [  240.596220]  [] ? 
kthread+0x95/0x9d
Mar 31 08:17:42 fan kernel: [  240.596227]  [] ? 
kthread_parkme+0x16/0x16
Mar 31 08:17:42 fan kernel: [  240.596234]  [] ? 
ret_from_fork+0x3f/0x70
Mar 31 08:17:42 fan kernel: [  240.596240]  [] ? 
kthread_parkme+0x16/0x16
Mar 31 08:17:42 fan kernel: [  240.596272] INFO: task kworker/u16:2:134 blocked 
for more than 120 seconds.
Mar 31 08:17:42 fan kernel: [  240.596399]   Tainted: GW   
4.4.6-zgws1 #2
Mar 31 08:17:42 fan kernel: [  240.596499] "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 31 08:17:42 fan kernel: [  240.596637] kworker/u16:2   D 88062fcd56c0   
  0   134  2 0x
Mar 31 08:17:42 fan kernel: [  240.596688] Workqueue: btrfs-endio-write 
btrfs_endio_write_helper [btrfs]
Mar 31 08:17:42 fan kernel: [  240.596692]  8806130e4780 0003 
880613108000 880613107ca0
Mar 31 08:17:42 fan kernel: [  240.596699]  8805caa1d960 0002 
81409e1c 8806130e4780
Mar 31 08:17:42 fan kernel: [  240.596706]  81408329 7fff 
81409e5a 88062fd556c0
Mar 31 08:17:42 fan kernel: [  240.596712] Call Trace:
Mar 31 08:17:42 fan kernel: [  240.596721]  [] ? 
usleep_range+0x35/0x35
Mar 31 08:17:42 fan kernel: [  240.596728]  [] ? 
schedule+0x6f/0x7c
Mar 31 08:17:42 fan kernel: [  240.596735]  [] ? 
schedule_timeout+0x3e/0x128
Mar 31 08:17:42 fan kernel: [  240.596742]  [] ? 
check_preempt_curr+0x41/0x63
Mar 31 08:17:42 fan kernel: [  240.596750]  [] ? 
ttwu_do_wakeup+0xf/0xd0
Mar 31 08:17:42 fan kernel: [  240.596757]  [] ? 
__wait_for_common+0x121/0x16d
Mar 31 08:17:42 fan kernel: [  240.596764]  [] ? 
__wait_for_common+0x121/0x16d
Mar 31 08:17:42 fan kernel: [  240.596771]  [] ? 
wake_up_q+0x3b/0x3b
Mar 31 08:17:42 fan kernel: [  240.596812]  [] ? 
btrfs_async_run_delayed_refs+0xbf/0xd5 [btrfs]
Mar 31 08:17:42 fan kernel: [  240.596858]  [] ? 
__btrfs_end_transaction+0x291/0x2d5 [btrfs]
Mar 31 08:17:42 fan kernel: [  240.596904]  [] ? 
btrfs_finish_ordered_io+0x418/0x4d7 [btrfs]
Mar 31 08:17:42 fan kernel: [  240.596952]  [] ? 
btrfs_scrubparity_helper+0xf4/0x233 [btrfs]
Mar 31 08:17:42 fan kernel: [  240.596960]  [] ? 
process_one_work+0x178/0x27b
Mar 31 08:17:42 fan kernel: [  240.596968]  [] ? 
worker_thread+0x1da/0x280
Mar 31 08:17:42 fan kernel: [  240.596976]  [] ? 
rescuer_thread+0x284/0x284
Mar 31 08:17:42 fan kernel: [  240.596982]  [] ? 
kthread+0x95/0x

Re: bad metadata crossing stripe boundary

2016-03-31 Thread Marc Haber

On Thu, Mar 31, 2016 at 10:31:49AM +0800, Qu Wenruo wrote:
> Would you please try the following patch based on v4.5 btrfs-progs?
> https://patchwork.kernel.org/patch/8706891/

This also fixes the "bad metadata crossing stripe boundary" on my pet
patient.

I find it somewhere between funny and disturbing that the first call
of btrfs check made my kernel log the following:
Mar 31 22:45:36 fan kernel: [ 6253.178264] EXT4-fs (dm-31): mounted filesystem 
with ordered data mode. Opts: (null)
Mar 31 22:45:38 fan kernel: [ 6255.361328] BTRFS: device label fanbtr devid 1 
transid 67526 /dev/dm-31

No, the filesystem was not converted, it was directly created as
btrfs, and no, I didn't try mounting it.

Greetings
Marc

-- 
-----
Marc Haber         | "I don't trust Computers. They | Mail address in header
Leimen, Germany    |  lose things."    Winona Ryder | Fon: *49 6224 1600402
Nordisch by Nature |  How to make an American Quilt | Fax: *49 6224 1600421
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: bad metadata crossing stripe boundary

2016-03-31 Thread Marc Haber

On Thu, Mar 31, 2016 at 11:16:30PM +0200, Kai Krakow wrote:
> On Thu, 31 Mar 2016 23:00:04 +0200,
> Marc Haber wrote:
> > I find it somewhere between funny and disturbing that the first call
> > of btrfs check made my kernel log the following:
> > Mar 31 22:45:36 fan kernel: [ 6253.178264] EXT4-fs (dm-31): mounted
> > filesystem with ordered data mode. Opts: (null) Mar 31 22:45:38 fan
> > kernel: [ 6255.361328] BTRFS: device label fanbtr devid 1 transid
> > 67526 /dev/dm-31
> > 
> > No, the filesystem was not converted, it was directly created as
> > btrfs, and no, I didn't try mounting it.
> 
> I suggest that your partition contained ext4 before, and you didn't run
> wipefs before running mkfs.btrfs.

I cryptsetup luksFormat'ted the partition before I mkfs.btrfs'ed it.
That should do a much better job than wipefsing it, shouldn't it?

Greetings
Marc


Re: "bad metadata" not fixed by btrfs repair

2016-04-01 Thread Marc Haber

On Thu, Mar 31, 2016 at 08:42:46PM +0200, Henk Slager wrote:
> So also false alerts.

btrfs-tools 4.5.1 with Qu's patch from patchwork doesn't show those
warnings any more.

Greetings
Marc


Another ENOSPC situation

2016-04-01 Thread Marc Haber

49.509491]  [] ? 
smpboot_thread_fn+0xf7/0x13a
Apr  1 11:16:39 swivel kernel: [249949.509495]  [] ? 
sort_range+0x17/0x17
Apr  1 11:16:39 swivel kernel: [249949.509500]  [] ? 
kthread+0x95/0x9d
Apr  1 11:16:39 swivel kernel: [249949.509505]  [] ? 
kthread_parkme+0x16/0x16
Apr  1 11:16:39 swivel kernel: [249949.509510]  [] ? 
ret_from_fork+0x3f/0x70
Apr  1 11:16:39 swivel kernel: [249949.509515]  [] ? 
kthread_parkme+0x16/0x16
Apr  1 11:16:39 swivel kernel: [249949.509519] Mem-Info:
Apr  1 11:16:39 swivel kernel: [249949.509529] active_anon:1107088 
inactive_anon:326101 isolated_anon:0
Apr  1 11:16:39 swivel kernel: [249949.509529]  active_file:1104846 
inactive_file:1367650 isolated_file:0
Apr  1 11:16:39 swivel kernel: [249949.509529]  unevictable:2526 dirty:14757 
writeback:0 unstable:0
Apr  1 11:16:39 swivel kernel: [249949.509529]  slab_reclaimable:56106 
slab_unreclaimable:33051
Apr  1 11:16:39 swivel kernel: [249949.509529]  mapped:67336 shmem:87440 
pagetables:12012 bounce:0
Apr  1 11:16:39 swivel kernel: [249949.509529]  free:30592 free_pcp:170 
free_cma:0
Apr  1 11:16:39 swivel kernel: [249949.509538] Node 0 DMA free:15360kB min:12kB 
low:12kB high:16kB active_anon:0kB inactive_anon:0kB active_file:0kB 
inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB 
present:15984kB managed:15360kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB 
shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB 
pagetables:0kB unstable:0kB bounce:0kB free_pcp:0kB local_pcp:0kB free_cma:0kB 
writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Apr  1 11:16:39 swivel kernel: [249949.509553] lowmem_reserve[]: 0 3403 15919 
15919
Apr  1 11:16:39 swivel kernel: [249949.509559] Node 0 DMA32 free:64968kB 
min:3436kB low:4292kB high:5152kB active_anon:475148kB inactive_anon:357880kB 
active_file:1173604kB inactive_file:1314960kB unevictable:3416kB 
isolated(anon):0kB isolated(file):0kB present:3561088kB managed:3487816kB 
mlocked:3416kB dirty:13592kB writeback:0kB mapped:55924kB shmem:70004kB 
slab_reclaimable:47096kB slab_unreclaimable:17888kB kernel_stack:2000kB 
pagetables:8308kB unstable:0kB bounce:0kB free_pcp:4kB local_pcp:0kB 
free_cma:0kB writeback_tmp:0kB pages_scanned:128 all_unreclaimable? no
Apr  1 11:16:39 swivel kernel: [249949.509575] lowmem_reserve[]: 0 0 12516 12516
Apr  1 11:16:39 swivel kernel: [249949.509580] Node 0 Normal free:42040kB 
min:12648kB low:15808kB high:18972kB active_anon:3953204kB 
inactive_anon:946524kB active_file:3245780kB inactive_file:4155640kB 
unevictable:6688kB isolated(anon):0kB isolated(file):0kB present:13080576kB 
managed:12816596kB mlocked:6688kB dirty:45436kB writeback:0kB mapped:213420kB 
shmem:279756kB slab_reclaimable:177328kB slab_unreclaimable:114316kB 
kernel_stack:8688kB pagetables:39740kB unstable:0kB bounce:0kB free_pcp:764kB 
local_pcp:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? 
no
Apr  1 11:16:39 swivel kernel: [249949.509596] lowmem_reserve[]: 0 0 0 0
Apr  1 11:16:39 swivel kernel: [249949.509601] Node 0 DMA: 0*4kB 0*8kB 0*16kB 
0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 
15360kB
Apr  1 11:16:39 swivel kernel: [249949.509619] Node 0 DMA32: 11548*4kB (UME) 
2282*8kB (UME) 55*16kB (UM) 2*32kB (UM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 
0*2048kB 0*4096kB = 65392kB
Apr  1 11:16:39 swivel kernel: [249949.509638] Node 0 Normal: 3736*4kB (UME) 
3206*8kB (UE) 131*16kB (U) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 
0*2048kB 0*4096kB = 42688kB
Apr  1 11:16:39 swivel kernel: [249949.509657] Node 0 hugepages_total=0 
hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Apr  1 11:16:39 swivel kernel: [249949.509661] 2561271 total pagecache pages
Apr  1 11:16:39 swivel kernel: [249949.509664] 616 pages in swap cache
Apr  1 11:16:39 swivel kernel: [249949.509667] Swap cache stats: add 28221, 
delete 27605, find 294750/295285
Apr  1 11:16:39 swivel kernel: [249949.509670] Free swap  = 8277324kB
Apr  1 11:16:39 swivel kernel: [249949.509672] Total swap = 8386556kB
Apr  1 11:16:39 swivel kernel: [249949.509674] 4164412 pages RAM
Apr  1 11:16:39 swivel kernel: [249949.509676] 0 pages HighMem/MovableOnly
Apr  1 11:16:39 swivel kernel: [249949.509678] 84469 pages reserved
Apr  1 11:16:39 swivel kernel: [249949.509681] 0 pages hwpoisoned
Apr  1 11:16:39 swivel kernel: [249949.509717] NMI watchdog: enabled on all 
CPUs, permanently consumes one hw-PMU counter.
Apr  1 11:16:39 swivel kernel: [249949.537265] EXT4-fs (dm-16): re-mounted. 
Opts: data=ordered,commit=0
Apr  1 11:16:39 swivel systemd[1]: Reloading Laptop Mode Tools.
Apr  1 11:16:39 swivel kernel: [249949.664133] thinkpad_acpi: EC reports that 
Thermal Table has changed
Apr  1 11:16:39 swivel kernel: [249949.723795] IPv6: ADDRCONF(NETDEV_UP): eth0: 
link is not ready
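
As a side note on the per-zone free lists in the log above (the lines of
the form "Node 0 DMA32: 11548*4kB ... = 65392kB"): the following is a
minimal Python sketch, not part of the original report, that re-adds the
count*size terms and reports the largest block size that still has free
entries. The function names are made up for this illustration, and since
the excerpt is truncated it does not show which allocation failed, so
this only checks the arithmetic.

```python
import re

# One "count*sizekB" term of a buddy-allocator free-list line.
TERM = re.compile(r"(\d+)\*(\d+)kB")

def buddy_terms(line):
    """Return (count, block_size_kB) pairs from one free-list line."""
    free_lists = line.split(" = ")[0]  # drop the trailing "= <total>kB"
    return [(int(count), int(size)) for count, size in TERM.findall(free_lists)]

def total_free_kb(line):
    """Sum of count * block size over all terms, in kB."""
    return sum(count * size for count, size in buddy_terms(line))

def largest_free_block_kb(line):
    """Largest block size (kB) that has at least one free entry, or 0."""
    sizes = [size for count, size in buddy_terms(line) if count > 0]
    return max(sizes) if sizes else 0

# The two lines below are copied from the log, with the mail wrapping undone.
DMA32 = ("Node 0 DMA32: 11548*4kB (UME) 2282*8kB (UME) 55*16kB (UM) 2*32kB (UM) "
         "0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 65392kB")
NORMAL = ("Node 0 Normal: 3736*4kB (UME) 3206*8kB (UE) 131*16kB (U) 0*32kB "
          "0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 42688kB")

for name, line in (("DMA32", DMA32), ("Normal", NORMAL)):
    print(name, total_free_kb(line), "kB free, largest free block",
          largest_free_block_kb(line), "kB")
```

The sums agree with the totals printed by the kernel, and in both zones
nothing larger than a few pages is free in one piece.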



This btrfs is ripe for the backup-format-restore procedure, right?

Greetings
Marc



Re: Again, no space left on device while rebalancing and recipe doesnt work

2016-04-01 Thread Marc Haber

On Sat, Feb 27, 2016 at 10:14:50PM +0100, Marc Haber wrote:
> I have again the issue of no space left on device while rebalancing
> (with btrfs-tools 4.4.1 on kernel 4.4.2 on Debian unstable):

Just for the record: the host started acting up in more and more
interesting ways, and after a call of rm during a kernel build resulted
in a SIGSEGV, I did the backup-format-restore routine for this system
back to ext4, just to find out whether I have bad hardware or a bad
filesystem.

And, since going back to ext4, the system is just fine again. So it's
not bad hardware.

This system's root drive is going to stay on ext4 for a loong
time. If the btrfs phenomena I experience on other hosts get
solved at some point in the future, I might migrate /home back to
btrfs, but that's not going to happen in the next six months.

This has been a really bad experience, which has made me lose a lot of
faith in the new filesystem. I really feel sad about that.

Greetings
Marc


Re: Another ENOSPC situation

2016-04-01 Thread Marc Haber

On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote:
> On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber wrote:
> > btrfs balance -mprofiles seems to do something. One kworker and one
> > btrfs-transaction process hog one CPU core each for hours, while
> > blocking the filesystem for minutes apiece, which leads to the host
> > being nearly unusable, up to the point of "clock and mouse pointer
> > frozen for nearly ten minutes".
> 
> I assume you still have your every 10 minutes snapshotting running
> while balancing?

No, I disabled the cronjob before trying the balance. I might be
crazy, but not stup^wunexperienced.

Greetings
Marc


Re: Another ENOSPC situation

2016-04-01 Thread Marc Haber

On Fri, Apr 01, 2016 at 06:30:20PM +0200, Marc Haber wrote:
> On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote:
> > On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber wrote:
> > > btrfs balance -mprofiles seems to do something. One kworker and one
> > > btrfs-transaction process hog one CPU core each for hours, while
> > > blocking the filesystem for minutes apiece, which leads to the host
> > > being nearly unusable, up to the point of "clock and mouse pointer
> > > frozen for nearly ten minutes".
> > 
> > I assume you still have your every 10 minutes snapshotting running
> > while balancing?
> 
> No, I disabled the cronjob before trying the balance. I might be
> crazy, but not stup^wunexperienced.

That being said, I would still expect the code not to allow _this_
kind of effect on the entire system when two allegedly incompatible
operations run simultaneously. I mean, Linux is a multi-user,
multi-tasking operating system where one simply cannot expect all
processes to cooperate with each other. We have operating
systems to prevent this kind of issue, not to cause it.

Greetings
Marc


Re: Another ENOSPC situation

2016-04-01 Thread Marc Haber

On Fri, Apr 01, 2016 at 09:20:52PM +0200, Henk Slager wrote:
> On Fri, Apr 1, 2016 at 6:50 PM, Marc Haber wrote:
> > On Fri, Apr 01, 2016 at 06:30:20PM +0200, Marc Haber wrote:
> >> On Fri, Apr 01, 2016 at 05:44:30PM +0200, Henk Slager wrote:
> >> > On Fri, Apr 1, 2016 at 3:40 PM, Marc Haber wrote:
> >> > > btrfs balance -mprofiles seems to do something. One kworker and one
> >> > > btrfs-transaction process hog one CPU core each for hours, while
> >> > > blocking the filesystem for minutes apiece, which leads to the host
> >> > > being nearly unusable, up to the point of "clock and mouse pointer
> >> > > frozen for nearly ten minutes".
> >> >
> >> > I assume you still have your every 10 minutes snapshotting running
> >> > while balancing?
> >>
> >> No, I disabled the cronjob before trying the balance. I might be
> >> crazy, but not stup^wunexperienced.
> >
> > That being said, I would still expect the code not to allow _this_
> > kind of effect on the entire system when two allegedly incompatible
> > operations run simultaneously. I mean, Linux is a multi-user,
> > multi-tasking operating system where one simply cannot expect all
> > processes to cooperate with each other. We have operating
> > systems to prevent this kind of issue, not to cause it.
> 
> Maybe look at it differently: Does user mh have trouble using this
> laptop w.r.t. storing files?

No. I would have cried murder otherwise.

> In openSUSE Tumbleweed (the snapshot from end of march), root access
> is needed to change the default snapshotting config, otherwise you
> will have a 10 year history. After that change has been done according
> to needs of the user, there is no need to run manual balance.

So you are saying that balancing a filesystem should never be
necessary? Or what are you trying to say?

Greetings
Marc


Re: bad metadata crossing stripe boundary

2016-04-02 Thread Marc Haber

On Sat, Apr 02, 2016 at 11:03:53AM +0200, Kai Krakow wrote:
> On Fri, 1 Apr 2016 07:57:25 +0200,
> Marc Haber wrote:
> > On Thu, Mar 31, 2016 at 11:16:30PM +0200, Kai Krakow wrote:
> > > On Thu, 31 Mar 2016 23:00:04 +0200,
> > > Marc Haber wrote:
> > > > I find it somewhere between funny and disturbing that the first
> > > > call of btrfs check made my kernel log the following:
> > > > Mar 31 22:45:36 fan kernel: [ 6253.178264] EXT4-fs (dm-31):
> > > > mounted filesystem with ordered data mode. Opts: (null) Mar 31
> > > > 22:45:38 fan kernel: [ 6255.361328] BTRFS: device label fanbtr
> > > > devid 1 transid 67526 /dev/dm-31
> > > > 
> > > > No, the filesystem was not converted, it was directly created as
> > > > btrfs, and no, I didn't try mounting it.  
> > > 
> > > I suggest that your partition contained ext4 before, and you didn't
> > > run wipefs before running mkfs.btrfs.  
> > 
> > I cryptsetup luksFormat'ted the partition before I mkfs.btrfs'ed it.
> > That should do a much better job than wipefsing it, shouldn't it?
> 
> Not sure how luksFormat works. If it encrypts what is already on the
> device, it would also encrypt orphan superblocks.

It overwrites the LUKS metadata including the symmetric key that was
used to encrypt the existing data. Short of Shor's Algorithm and
Quantum Computers, after that operation it is no longer possible to
even guess what was on the disk before.

Greetings
Marc


Re: bad metadata crossing stripe boundary

2016-04-03 Thread Marc Haber

On Sat, Apr 02, 2016 at 08:31:17PM +0200, Kai Krakow wrote:
> On Sat, 2 Apr 2016 11:44:32 +0200,
> Marc Haber wrote:
> 
> > On Sat, Apr 02, 2016 at 11:03:53AM +0200, Kai Krakow wrote:
> > > On Fri, 1 Apr 2016 07:57:25 +0200,
> > > Marc Haber wrote:
> > > > On Thu, Mar 31, 2016 at 11:16:30PM +0200, Kai Krakow wrote:  
> >  [...]  
> > > > 
> > > > I cryptsetup luksFormat'ted the partition before I mkfs.btrfs'ed
> > > > it. That should do a much better job than wipefsing it, shouldn't
> > > > it?
> > > 
> > > Not sure how luksFormat works. If it encrypts what is already on the
> > > device, it would also encrypt orphan superblocks.  
> > 
> > It overwrites the LUKS metadata including the symmetric key that was
> > used to encrypt the existing data. Short of Shor's Algorithm and
> > Quantum Computers, after that operation it is no longer possible to
> > even guess what was on the disk before.
> 
> If it was encrypted before... ;-)

First, it was.

Second, cleartext found on the block device is quite unlikely to be
readable from the unlocked crypto device. I would be very worried if
that were the case.

I must be missing something here.

Greetings
Marc


Re: bad metadata crossing stripe boundary

2016-04-03 Thread Marc Haber

On Sat, Apr 02, 2016 at 01:41:53PM -0600, Chris Murphy wrote:
> On Thu, Mar 31, 2016 at 11:57 PM, Marc Haber wrote:
> > On Thu, Mar 31, 2016 at 11:16:30PM +0200, Kai Krakow wrote:
> >> On Thu, 31 Mar 2016 23:00:04 +0200,
> >> Marc Haber wrote:
> >> > I find it somewhere between funny and disturbing that the first call
> >> > of btrfs check made my kernel log the following:
> >> > Mar 31 22:45:36 fan kernel: [ 6253.178264] EXT4-fs (dm-31): mounted
> >> > filesystem with ordered data mode. Opts: (null) Mar 31 22:45:38 fan
> >> > kernel: [ 6255.361328] BTRFS: device label fanbtr devid 1 transid
> >> > 67526 /dev/dm-31
> >> >
> >> > No, the filesystem was not converted, it was directly created as
> >> > btrfs, and no, I didn't try mounting it.
> >>
> >> I suggest that your partition contained ext4 before, and you didn't run
> >> wipefs before running mkfs.btrfs.
> >
> > I cryptsetup luksFormat'ted the partition before I mkfs.btrfs'ed it.
> > That should do a much better job than wipefsing it, shouldn't it?
> 
> Not really. The first btrfs super is at 64K. The second at 64M. The
> third at 256G. While wipefs will remove the magic only on the first,
> mkfs.btrfs will take care of all three. And luksFormat only overwrites
> the first 132K of a block device. There's a scant chance of bugs
> related to previous filesystems not being erased, I think this is more
> likely when mixing and matching filesystems just because the
> superblocks for each filesystem aren't in the same location.
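
To put numbers on the offsets quoted above, here is a small Python sketch
(an illustration only, not code from btrfs-progs or the kernel). The
three btrfs offsets are the ones named in the quote; the 4 KiB superblock
size, the ext4 primary superblock at byte 1024, and the rule that a copy
only exists if it fits on the device are my assumptions, and all names
are made up for the example.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

SUPERBLOCK_SIZE = 4 * KIB                  # assumed size of one btrfs superblock copy
BTRFS_SUPERBLOCK_OFFSETS = [64 * KIB, 64 * MIB, 256 * GIB]
EXT4_PRIMARY_SUPERBLOCK_OFFSET = 1 * KIB   # assumed: ext4 primary superblock at byte 1024
EXT4_SUPERBLOCK_SIZE = 1 * KIB

def btrfs_superblocks_on(device_bytes):
    """Offsets of the btrfs superblock copies that fit on a device of this size."""
    return [off for off in BTRFS_SUPERBLOCK_OFFSETS
            if off + SUPERBLOCK_SIZE <= device_bytes]

def overlaps_btrfs_superblock(offset, length):
    """True if the byte range [offset, offset + length) touches any copy location."""
    return any(offset < sb + SUPERBLOCK_SIZE and sb < offset + length
               for sb in BTRFS_SUPERBLOCK_OFFSETS)

# A 6 GiB device (the size of banana-root elsewhere in this archive) has two copies.
print(btrfs_superblocks_on(6 * GIB))
# The ext4 primary superblock sits in a different place than any btrfs copy.
print(overlaps_btrfs_superblock(EXT4_PRIMARY_SUPERBLOCK_OFFSET, EXT4_SUPERBLOCK_SIZE))
```

This only shows that the locations differ; it says nothing about what a
given mkfs or luksFormat run actually overwrites.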

If I do:

umount /dev/mapper/foo
cryptsetup close /dev/mapper/foo
cryptsetup luksFormat /dev/mapper/pv-c_foo
cryptsetup open /dev/mapper/pv-c_foo foo

and the contents of /dev/mapper/foo randomly resembled its
previous contents afterwards, I would be _very_ disturbed. During the
luksFormat process, a new random symmetric key is created and
overwrites the old random symmetric key in the LUKS header. Therefore,
the following crypto operations are _very_ unlikely to produce
something that resembles an ext4 filesystem.
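
The argument can be illustrated with a toy model in Python. This is
emphatically not dm-crypt or LUKS: the "cipher" is a made-up XOR
keystream built from SHA-256, the keys are fixed strings instead of
random ones, and the 64-byte "superblock" is a fake that only carries the
ext4 magic 0xEF53 at offset 0x38. It merely shows that the same on-disk
bytes read back as something unrelated once the key has changed.

```python
import hashlib

def keystream_xor(key, data):
    """Toy cipher: XOR data with a SHA-256 based keystream (same call encrypts and decrypts)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)

EXT4_MAGIC = b"\x53\xef"                        # 0xEF53, little-endian
plaintext = bytes(0x38) + EXT4_MAGIC + bytes(6)  # fake superblock, magic at 0x38

old_key = b"hypothetical key before luksFormat"
new_key = b"hypothetical key after luksFormat"

on_disk = keystream_xor(old_key, plaintext)   # what the block device holds
with_old = keystream_xor(old_key, on_disk)    # view through the old mapping
with_new = keystream_xor(new_key, on_disk)    # view through the new mapping

print("old key shows ext4 magic:", with_old[0x38:0x3A] == EXT4_MAGIC)
print("new key shows ext4 magic:", with_new[0x38:0x3A] == EXT4_MAGIC)
```

With the old key the magic is visible again; with a different key the
chance of the two magic bytes surviving is about 1 in 65536.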

Even if I did:

umount /dev/mapper/foo
cryptsetup close /dev/mapper/foo
mkfs.btrfs /dev/mapper/pv-c_foo

(assuming I previously did cryptsetup open /dev/mapper/pv-c_foo foo)

I would be _very_ surprised if the kernel found something
resembling an ext4 file system on /dev/mapper/pv-c_foo.

> If you're concerned about traces of previous file systems, then use
> the dmcrypt device itself, rather than merely using the original block
> device where merely 132K at the beginning has been overwritten.
> Everytime you format a device, the resulting dmcrypt logical device is
> in effect full of completely random data. A new random key is
> generated each time you use luksFormat, even if you're using the same
> passphrase.

That's what I am saying.

I must be missing something.

Greetings
Marc


"disk full" on a 5 GB btrfs filesystem, FAQ outdated?

2015-11-29 Thread Marc Haber

Hi,

I have a Banana Pi with a btrfs filesystem of 5 GB in size, which
frequently runs out of space (lots of snapshots). This is currently
the case again:

[27/524]mh@banana:~$ sudo btrfs balance start /
ERROR: error during balancing '/' - No space left on device
There may be more info in syslog - try dmesg | tail
[28/525]mh@banana:~$ sudo btrfs balance start / -dlimit=3
[sudo] password for mh on banana:
ERROR: error during balancing '/' - No space left on device
There may be more info in syslog - try dmesg | tail
[29/526]mh@banana:~$ sudo btrfs balance start / -dlimit=3
ERROR: error during balancing '/' - No space left on device
There may be more info in syslog - try dmesg | tail
[30/526]mh@banana:~$ sudo btrfs balance start / -dusage=0
Done, had to relocate 0 out of 8 chunks
[31/527]mh@banana:~$ sudo btrfs balance start / -dlimit=3
ERROR: error during balancing '/' - No space left on device
There may be more info in syslog - try dmesg | tail
[32/528]mh@banana:~$ sudo btrfs fi show /
Label: none  uuid: ada6b7f5-98d6-4fee-a3a3-b73bd152ff6c
Total devices 1 FS bytes used 3.37GiB
devid    1 size 6.89GiB used 4.22GiB path /dev/mapper/banana-root

btrfs-progs v4.3
[33/529]mh@banana:~$ sudo btrfs fi df /
Data, single: total=3.41GiB, used=3.25GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=384.00MiB, used=121.75MiB
GlobalReserve, single: total=48.00MiB, used=0.00B
[34/530]mh@banana:~$ uname -a
Linux banana 4.3.0-zgbpi-armmp-lpae+ #2 SMP Sat Nov 7 13:07:34 UTC 2015 armv7l 
GNU/Linux
[36/532]mh@banana:~$ df -h /
Filesystem   Size  Used Avail Use% Mounted on
/dev/mapper/banana-root  6.9G  3.6G  2.9G  56% /
[37/533]mh@banana:~$
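
For reference, the numbers from the "btrfs fi show" and "btrfs fi df"
output above can be cross-checked with a short Python sketch. The parser
is an illustration written for exactly this output format, the function
names are made up, and counting DUP chunks twice is my assumption about
how the "used 4.22GiB" figure in "fi show" comes about.

```python
import re

UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40}

def to_bytes(text):
    """Convert a size such as '3.41GiB' or '0.00B' to a number of bytes."""
    match = re.fullmatch(r"([\d.]+)([KMGT]iB|B)", text)
    return float(match.group(1)) * UNITS[match.group(2)]

def parse_fi_df(output):
    """Map each 'Type, profile' label to its (total, used) sizes in bytes."""
    usage = {}
    for line in output.strip().splitlines():
        label, rest = line.split(": ", 1)
        fields = dict(item.split("=") for item in rest.split(", "))
        usage[label] = (to_bytes(fields["total"]), to_bytes(fields["used"]))
    return usage

FI_DF = """\
Data, single: total=3.41GiB, used=3.25GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=384.00MiB, used=121.75MiB
GlobalReserve, single: total=48.00MiB, used=0.00B
"""
DEVICE_SIZE = to_bytes("6.89GiB")       # "size" from btrfs fi show
DEVICE_ALLOCATED = to_bytes("4.22GiB")  # "used" from btrfs fi show

usage = parse_fi_df(FI_DF)
# Assumption: DUP chunks take twice their size on disk, and GlobalReserve
# is not a separate allocation.
allocated_estimate = sum(total * (2 if "DUP" in label else 1)
                         for label, (total, _) in usage.items()
                         if not label.startswith("GlobalReserve"))
unallocated = DEVICE_SIZE - DEVICE_ALLOCATED

print(f"allocated (estimate from fi df): {allocated_estimate / 2**30:.2f} GiB")
print(f"unallocated on the device:       {unallocated / 2**30:.2f} GiB")
```

Under these assumptions the chunk totals add up to the 4.22GiB reported
by "fi show", which leaves roughly 2.67 GiB of the device unallocated, so
these figures alone do not explain the "No space left on device" error.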

The first kernel that was ever booted on the device was 4.1, so I am
reasonably sure that the filesystem was also created with a
recent kernel. Is there any way to find out which kernel
version a filesystem was created with?

However, the FAQ
https://btrfs.wiki.kernel.org/index.php/FAQ#Help.21_Btrfs_claims_I.27m_out_of_space.2C_but_it_looks_like_I_should_have_lots_left.21
suggests that for small filesystems (<16 GB), the best solution would
be to upgrade to at least 2.6.37 and recreate the filesystem. 2.6.37
is ancient, from 2011, so I am pretty sure that the filesystem _was_
created at least with a kernel more recent than that.

My normal way to recover from this situation is to btrfs add a new
device, btrfs balance, btrfs --convert=single --force balance, btrfs
device remove, btrfs balance start -mconvert=dup --force and finally
balance start again.
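
Spelled out as a dry run, the sequence above might look like the Python
sketch below. It only prints the commands and never executes them. The
exact command lines are my reading of the shorthand: "btrfs device add"
for the first step and "-mconvert=single" for the third (by analogy with
the "-mconvert=dup" step) are assumptions, and /dev/sdX is a placeholder
for the temporary device.

```python
import shlex

def recovery_plan(mountpoint, spare_device):
    """Return the recovery sequence as argv lists; nothing is executed here."""
    return [
        # 1. temporarily add a second device to gain unallocated space
        ["btrfs", "device", "add", spare_device, mountpoint],
        # 2. plain balance
        ["btrfs", "balance", "start", mountpoint],
        # 3. convert metadata to single (assumed meaning of "--convert=single")
        ["btrfs", "balance", "start", "-mconvert=single", "--force", mountpoint],
        # 4. remove the temporary device again
        ["btrfs", "device", "remove", spare_device, mountpoint],
        # 5. convert metadata back to DUP
        ["btrfs", "balance", "start", "-mconvert=dup", "--force", mountpoint],
        # 6. final balance
        ["btrfs", "balance", "start", mountpoint],
    ]

for step, argv in enumerate(recovery_plan("/", "/dev/sdX"), start=1):
    print(step, " ".join(shlex.quote(arg) for arg in argv))
```

Keeping the plan as data makes it easy to review the six steps before
running any of them by hand.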

Is there a more elegant way to solve this?

Greetings
Marc


Re: "disk full" on a 5 GB btrfs filesystem, FAQ outdated?

2015-11-29 Thread Marc Haber

Hi Hugo,

On Sun, Nov 29, 2015 at 02:18:06PM +0000, Hugo Mills wrote:
> On Sun, Nov 29, 2015 at 02:07:54PM +0100, Marc Haber wrote:
> > However, the FAQ
> > https://btrfs.wiki.kernel.org/index.php/FAQ#Help.21_Btrfs_claims_I.27m_out_of_space.2C_but_it_looks_like_I_should_have_lots_left.21
> > suggests that for small filesystems (<16 GB), the best solution would
> > be to upgrade to at least 2.6.37 and recreate the filesystem. 2.6.37
> > is ancient, from 2011, so I am pretty sure that the filesystem _was_
> > created at least with a kernel more recent than that.
> 
>You missed the most important thing from that paragraph: Use mixed
> block groups. That's "mkfs.btrfs --mixed ..." (which I realise is
> missing from the text, and I'll be adding it after I send this email).

Yes, that was the important bit of missing information. My filesystem
now reads:

[26/512]mh@fan:/mnt/tempdisk$ df -h .
Filesystem   Size  Used Avail Use% Mounted on
/dev/mapper/banana-root  6,0G  836M  5,2G  14% /mnt/tempdisk
[27/513]mh@fan:/mnt/tempdisk$ sudo btrfs fi show .
Label: none  uuid: b2906231-70a9-46d9-9830-38a13cb73171
Total devices 1 FS bytes used 861.29MiB
devid    1 size 6.00GiB used 6.00GiB path /dev/mapper/banana-root

btrfs-progs v4.3
[28/514]mh@fan:/mnt/tempdisk$ sudo btrfs fi df .
System, single: total=4.00MiB, used=4.00KiB
Data+Metadata, single: total=6.00GiB, used=861.29MiB
GlobalReserve, single: total=20.00MiB, used=0.00B
[29/515]mh@fan:/mnt/tempdisk$

Can I somehow get duplicate metadata back? Or is that unnecessary?

> > My normal way to recover from this situation is to btrfs add a new
> > device, btrfs balance, btrfs --convert=single --force balance, btrfs
> > device remove, btrfs balance start -mconvert=dup --force and finally
> > balance start again.
> > 
> > Is there any solution to solve this more elegantly?
> 
>Recreate the FS with --mixed, and that should deal with it.

Done. Thanks!

Greetings
Marc


Re: "disk full" on a 5 GB btrfs filesystem, FAQ outdated?

2015-11-30 Thread Marc Haber

On Mon, Nov 30, 2015 at 05:44:23AM +0000, Duncan wrote:
> Yes, you can get dup metadata back, but because data and metadata
> are now combined in the same blockgroups (aka chunks), they must
> both be the same replication type.

Thanks for this explanation, it's perfectly clear to me now.

Greetings
Marc


Transaction aborted (error -17) during balance

2015-12-11 Thread Marc Haber

Hi,

During a balance on my main notebook, I received the following
call trace:

[ 1545.229672] ------------[ cut here ]------------
[ 1545.229688] WARNING: CPU: 4 PID: 5545 at 
/build/linux-eGTGmU/linux-4.3/fs/btrfs/extent-tree.c:2093 
__btrfs_inc_extent_ref.isra.52+0x20e/0x280 [btrfs]()
[ 1545.229689] BTRFS: Transaction aborted (error -17)
[ 1545.229690] Modules linked in: ctr ccm tun rfcomm cpufreq_userspace 
binfmt_misc cpufreq_stats cpufreq_powersave cpufreq_conservative 
nf_conntrack_netlink nfnetlink bnep ip6table_filter ip6_tables xt_TCPMSS 
xt_tcpudp iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat 
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_filter 
ip_tables x_tables bridge stp llc joydev arc4 iTCO_wdt iwldvm 
iTCO_vendor_support mac80211 snd_hda_codec_conexant intel_rapl 
snd_hda_codec_generic iosf_mbi x86_pkg_temp_thermal btusb intel_powerclamp 
btrtl snd_hda_intel iwlwifi btbcm kvm_intel snd_hda_codec btintel kvm 
snd_hda_core psmouse bluetooth snd_hwdep snd_pcm_oss pcspkr serio_raw i2c_i801 
sg cfg80211 snd_mixer_oss lpc_ich snd_pcm mfd_core snd_timer mei_me shpchp mei 
thinkpad_acpi nvram
[ 1545.229718]  tpm_tis snd tpm soundcore rfkill evdev battery ac processor 
coretemp loop drbd lru_cache libcrc32c parport_pc ppdev lp parport autofs4 
btrfs xor raid6_pq ext4 crc16 mbcache jbd2 algif_skcipher af_alg dm_crypt 
dm_mod md_mod hid_generic hid_logitech_hidpp hid_logitech_dj usbhid hid sd_mod 
uas usb_storage crct10dif_pclmul crc32_pclmul crc32c_intel jitterentropy_rng 
sha256_ssse3 sha256_generic hmac drbg ansi_cprng aesni_intel aes_x86_64 lrw 
gf128mul glue_helper i915 ahci ablk_helper cryptd libahci sdhci_pci 
i2c_algo_bit libata ehci_pci drm_kms_helper sdhci ehci_hcd scsi_mod mmc_core 
e1000e usbcore ptp usb_common drm pps_core thermal wmi video button
[ 1545.229747] CPU: 4 PID: 5545 Comm: kworker/u16:1 Not tainted 
4.3.0-trunk-amd64 #1 Debian 4.3-1~exp2
[ 1545.229747] Hardware name: LENOVO 4240CTO/4240CTO, BIOS 8AET63WW (1.43 ) 
05/08/2013
[ 1545.229758] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper [btrfs]
[ 1545.229760]  a0627250 812c5319 88020dc23ba0 
8106ebcd
[ 1545.229761]  880406146000 88020dc23bf0 8803c90b9410 

[ 1545.229762]  0106 8106ec4c a0627420 
0020
[ 1545.229764] Call Trace:
[ 1545.229768]  [] ? dump_stack+0x40/0x57
[ 1545.229771]  [] ? warn_slowpath_common+0x7d/0xb0
[ 1545.229772]  [] ? warn_slowpath_fmt+0x4c/0x50
[ 1545.229778]  [] ? insert_tree_block_ref+0x49/0x60 [btrfs]
[ 1545.229783]  [] ? 
__btrfs_inc_extent_ref.isra.52+0x20e/0x280 [btrfs]
[ 1545.229789]  [] ? __btrfs_run_delayed_refs+0xc47/0x1050 
[btrfs]
[ 1545.229792]  [] ? sched_clock+0x5/0x10
[ 1545.229795]  [] ? check_preempt_curr+0x50/0x90
[ 1545.229797]  [] ? ttwu_do_wakeup+0x14/0xc0
[ 1545.229803]  [] ? btrfs_run_delayed_refs+0x78/0x2a0 [btrfs]
[ 1545.229808]  [] ? delayed_ref_async_start+0x32/0x80 [btrfs]
[ 1545.229816]  [] ? btrfs_scrubparity_helper+0xc8/0x260 
[btrfs]
[ 1545.229818]  [] ? process_one_work+0x19f/0x3d0
[ 1545.229819]  [] ? worker_thread+0x4d/0x450
[ 1545.229821]  [] ? process_one_work+0x3d0/0x3d0
[ 1545.229822]  [] ? kthread+0xbd/0xe0
[ 1545.229824]  [] ? kthread_create_on_node+0x170/0x170
[ 1545.229827]  [] ? ret_from_fork+0x3f/0x70
[ 1545.229829]  [] ? kthread_create_on_node+0x170/0x170
[ 1545.229830] ---[ end trace 6671e30ac2882b40 ]---
[ 1545.229832] BTRFS: error (device dm-11) in __btrfs_inc_extent_ref:2093: errno=-17 Object already exists
[ 1545.229834] BTRFS info (device dm-11): forced readonly
[ 1545.229836] BTRFS: error (device dm-11) in btrfs_run_delayed_refs:2851: errno=-17 Object already exists

I spent the better part of the afternoon trying to balance this
filesystem, and the notebook froze numerous times. I was able to
finish the balance only by not doing anything else on the notebook
while it was running. I then started a second balance of the same
filesystem "just to be sure", which left the btrfs read-only but at
least allowed me to obtain this trace.

This is a distribution kernel. I installed debug symbols only after
this log extract was obtained. Is there a tool that can help make
this trace usable?
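
As a possible first step, here is a minimal Python sketch that pulls the
symbol, offset, and module out of call-trace lines in the format shown
above, so that they can be handed to a symbolizer together with the debug
symbols. The names `TRACE_RE` and `parse_trace_line` are hypothetical and
the regular expression is an assumption based only on this log. The kernel
source tree ships scripts/decode_stacktrace.sh for this kind of job;
whether it works with the Debian debug symbol packages is not verified
here.

```python
import re

# Assumed line shape, taken from the log above, e.g.
# "[ 1545.229778]  [] ? insert_tree_block_ref+0x49/0x60 [btrfs]"
TRACE_RE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+"       # dmesg timestamp
    r"(?:\[[^\]]*\]\s+)?"                 # address field, possibly empty
    r"(?P<unreliable>\?\s+)?"             # "?" marks an unreliable frame
    r"(?P<func>[A-Za-z_][\w.]*)"          # symbol name, may contain dots
    r"\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<module>\w+)\])?\s*$"    # optional "[module]" suffix
)


def parse_trace_line(line):
    """Return symbol, offset, size, and module for one frame, else None."""
    m = TRACE_RE.match(line.rstrip())
    if m is None:
        return None
    return {
        "function": m.group("func"),
        "offset": int(m.group("offset"), 16),
        "size": int(m.group("size"), 16),
        "module": m.group("module"),
    }


if __name__ == "__main__":
    sample = "[ 1545.229778]  [] ? insert_tree_block_ref+0x49/0x60 [btrfs]"
    print(parse_trace_line(sample))
```

Lines that are not stack frames, such as "Call Trace:", return None. A
frame that was split across two lines by mail wrapping will not match
until the two halves are rejoined.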

Greetings
Marc


"not a btrfs filesystem"

2017-01-21 Thread Marc Haber

I have a file system that I can show and check, but not rebalance:

1 [4/3420]mh@fan:~ $ sudo btrfs fi show /dev/mapper/banana-root
Label: none  uuid: b2906231-70a9-46d9-9830-38a13cb73171
Total devices 1 FS bytes used 1.82GiB
devid    1 size 6.00GiB used 3.69GiB path /dev/mapper/banana-root

1 [10/3426]mh@fan:~ $ sudo btrfs check /dev/mapper/banana-root
Checking filesystem on /dev/mapper/banana-root
UUID: b2906231-70a9-46d9-9830-38a13cb73171
checking extents
checking free space cache
checking fs roots
checking csums
checking root refs
found 1954635785 bytes used err is 0
total csum bytes: 1833964
total tree bytes: 75907072
total fs tree bytes: 64241664
total extent tree bytes: 7843840
btree space waste bytes: 21224578
file data blocks allocated: 2168119296
 referenced 2088648704
[11/3427]mh@fan:~ $ sudo btrfs balance /dev/mapper/banana-root
ERROR: not a btrfs filesystem: /dev/mapper/banana-root
1 [12/3428]mh@fan:~ $

The filesystem is on a logical volume, which is on a PV, which lives on
a loop device that is backed by a disk file:

[12/3428]mh@fan:~ $ sudo lvs
  LV   VG     Attr       LSize
  root banana -wi-a---p- 6,00g
[13/3429]mh@fan:~ $ sudo vgs
  VG #PV #LV #SN Attr   VSize   VFree  
  banana   1   2   0 wz-pn-   7,16g 164,00m
[14/3430]mh@fan:~ $ sudo pvs
  PV        VG     Fmt  Attr PSize PFree
  [unknown] banana lvm2 a-m  7,16g 164,00m
(no idea why this says "unknown" here, it is
[15/3431]mh@fan:~ $ ls -al /dev/mapper/loop0p2
lrwxrwxrwx 1 root root 8 Jan 21 14:40 /dev/mapper/loop0p2 -> ../dm-39
[16/3432]mh@fan:~ $ ls -al /dev/dm-39
brw-rw 1 root disk 254, 39 Jan 21 14:40 /dev/dm-39
[24/3440]mh@fan:~ $ sudo kpartx -l banana.sdcard
loop0p1 : 0 497664 /dev/loop0 2048
loop0p2 : 0 15024128 /dev/loop0 499712
[25/3441]mh@fan:~ $ sudo losetup --list
NAME   SIZELIMIT OFFSET AUTOCLEAR RO BACK-FILE  DIO
/dev/loop0 0  0 0  0 /home/mh/banana.sdcard   0
[26/3442]mh@fan:~ $
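
A quick cross-check of the layers above, as a minimal Python sketch: it
converts the sector counts printed by kpartx -l into GiB so they can be
compared with the PSize that pvs reports. The function name
`parse_kpartx_line` is hypothetical, and the 512-byte sector size is an
assumption (the unit device-mapper tables normally use). LVM rounds its
figures and reserves some space for metadata, so only an approximate
match is expected.

```python
SECTOR_BYTES = 512  # assumption: kpartx counts 512-byte sectors


def parse_kpartx_line(line):
    """Parse a line such as 'loop0p2 : 0 15024128 /dev/loop0 499712'."""
    name, rest = line.split(":", 1)
    # Fields after the colon: table start, length, backing device, offset.
    _start, length, device, offset = rest.split()
    return {
        "name": name.strip(),
        "device": device,
        "offset_sectors": int(offset),
        "length_sectors": int(length),
        "size_gib": int(length) * SECTOR_BYTES / 2**30,
    }


if __name__ == "__main__":
    part = parse_kpartx_line("loop0p2 : 0 15024128 /dev/loop0 499712")
    print(f"{part['name']}: {part['size_gib']:.2f} GiB")
```

For loop0p2 this gives about 7.16 GiB, which is consistent with the
7,16g PSize shown by pvs.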

Can a btrfs be so broken that btrfs balance doesn't recognize it any
more? What is going on with this file system?

Greetings
Marc


Re: "not a btrfs filesystem"

2017-01-21 Thread Marc Haber

On Sat, Jan 21, 2017 at 03:52:19PM +0100, Hans van Kranenburg wrote:
> You have to point balance to the mount point of the banana, not to the
> block device. (balance does its work while the file system is mounted)

Idiot me. I always forget that.
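
To avoid that mistake in scripts, the mount point can be looked up from
the device name first. The following is a minimal Python sketch under
assumptions: the helper names `find_mount_point` and `balance_command`
are hypothetical, the mounts text is expected in the format of
/proc/mounts, the device must appear there under exactly the name that is
passed in, and /mnt/banana is only an example path.

```python
def find_mount_point(device, mounts_text):
    """Return the first mount point listed for `device`, or None."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts format: device, mount point, fs type, options, ...
        if len(fields) >= 2 and fields[0] == device:
            return fields[1]
    return None


def balance_command(device, mounts_text):
    """Build a balance command that targets the mount point, not the device."""
    mount_point = find_mount_point(device, mounts_text)
    if mount_point is None:
        raise ValueError(f"{device} is not mounted; mount it before balancing")
    return ["btrfs", "balance", "start", mount_point]


if __name__ == "__main__":
    # Hypothetical mounts entry; on a real system read /proc/mounts instead.
    mounts = "/dev/mapper/banana-root /mnt/banana btrfs rw,relatime 0 0\n"
    print(balance_command("/dev/mapper/banana-root", mounts))
```

The sketch only builds the argument list and does not run anything. It
also does not decode escaped characters (for example spaces) in mount
point names.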

Greetings
Marc


Re: BTRFS constantly reports "No space left on device" even with a huge unallocated space

2016-08-25 Thread Marc Haber

Hi,

On Thu, Aug 25, 2016 at 05:56:18PM -0600, Chris Murphy wrote:
> Anyway it's a known problem, I don't think it's fixed still. There's a
> lot of enospc work in 4.8 so eventually it'll make sense to give it a
> shot with that kernel.

Assuming that I'm willing to try that, will a successful rebalance
with 4.8 fix a filesystem, or is the recommended way still "backup,
format, restore, lose all snapshots"?

Greetings
Marc


Re: Is stability a joke? (wiki updated)

2016-09-13 Thread Marc Haber

On Mon, Sep 12, 2016 at 02:44:35PM -0600, Chris Murphy wrote:
> Just to cut yourself some slack, you could skip 3.14 because it's EOL
> now, and just go from 4.4.

Don't the btrfs-tools used to create the filesystem also play a huge
role in this game?

Greetings
Marc


Re: "No space left on device" and balance doesn't work

2016-06-02 Thread Marc Haber

On Fri, Jun 03, 2016 at 12:45:51AM +0200, Henk Slager wrote:
> The setup looks all pretty normal and btrfs should be able to handle
> it, but unfortunately your fs is a typical example that one currently
> needs to monitor/tune a btrfs fs for its 'health' in order to keep it
> running longterm.

What kind of work is being done to address this major usability issue?
What is the timeframe for a fix?

Greetings
Marc


Re: Recommended why to use btrfs for production?

2016-06-03 Thread Marc Haber

On Fri, Jun 03, 2016 at 11:49:09AM +0200, Martin wrote:
> We would like to use urBackup to make laptop backups, and they mention
> btrfs as an option.
> 
> https://www.urbackup.org/administration_manual.html#x1-8400010.6
> 
> So if we go with btrfs and we need 100TB usable space in raid6, and to
> have it replicated each night to another btrfs server for "backup" of
> the backup, how should we then install btrfs?

Do you plan to use snapshots? How many of them?

Greetings
Marc


Re: btrfs filesystem keeps allocating new chunks for no apparent reason

2016-06-09 Thread Marc Haber

On Thu, Jun 09, 2016 at 01:10:46AM +0200, Hans van Kranenburg wrote:
> So, instead of being the cause, apt-get update causing a new chunk to be
> allocated might as well be the result of existing ones already filled up
> with too many fragments.
> 
> The next question is what files these extents belong to. To find out, I need
> to open up the extent items I get back and follow a backreference to an
> inode object. Might do that tomorrow, fun.

Does your apt use pdiffs to update the package lists? If yes, I'd try
turning them off, just for the fun of it, to see whether this changes
btrfs' allocation behavior. I have never looked at apt's pdiff stuff
in detail, but I guess that it creates many tiny temporary files.

Greetings
Marc

