Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Tomasz Pala
r. Disclaimer: all the above statements in relation to conception and understanding of quotas, not to be confused with qgroups. -- Tomasz Pala

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Tomasz Pala
other problem) I am (more than before) aware what btrfs quotas are not. So, my only expectation (except for worldwide peace and other unrealistic ones) would be to stop using "quotas", "subvolume quotas" and "qgroups" interchangeably in btrfs context, as IMvHO these are not plain, well-known "quotas". -- Tomasz Pala

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Tomasz Pala
ommand btrfs qgroup(8)" - they are the same... just completely different from traditional "quotas". My suggestion would be to completely remove the standalone "quota" word from btrfs documentation - there is no "quota", just "subvolume quota" or "qgroup" supported. -- Tomasz Pala

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-10 Thread Tomasz Pala
one day without any known reason), misnamed ...and not reflecting anything valuable, unless the problems with extent fragmentation are already resolved somehow? So IMHO current quotas are: - not discoverable for user (shared->exclusive transition of my data by someone else's action), - not reliable for sysadm (offensive write pattern by any user can allocate virtually any space despite quotas). -- Tomasz Pala

Re: Report correct filesystem usage / limits on BTRFS subvolumes with quota

2018-08-09 Thread Tomasz Pala
fs should account half of the data, and twice the data in an opposite scenario (like "dup" profile on single-drive filesystem). In short: values representing quotas are user-oriented ("the numbers one bought"), not storage-oriented ("the numbers they actually occupy").
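To make the user-oriented versus storage-oriented distinction concrete, here is a minimal Python sketch. The copy counts for the `single`, `dup` and `raid1` profiles are the usual btrfs ones; the function names and the 10 GiB figure are hypothetical and only illustrate the idea, they do not describe how qgroups account space.

```python
# Illustrative only: compares the number a user "bought" with the raw space
# a block-group profile consumes. All names here are hypothetical.
COPIES = {"single": 1, "dup": 2, "raid1": 2}

GIB = 1024 ** 3


def raw_bytes(logical_bytes: int, profile: str) -> int:
    """Storage-oriented number: bytes actually occupied on the devices."""
    return logical_bytes * COPIES[profile]


def quota_bytes(logical_bytes: int, profile: str) -> int:
    """User-oriented number: independent of the profile in use."""
    return logical_bytes


for profile in COPIES:
    user = quota_bytes(10 * GIB, profile) // GIB
    raw = raw_bytes(10 * GIB, profile) // GIB
    print(f"{profile}: user sees {user} GiB, devices hold {raw} GiB")
```

Under this model a quota expressed in user-oriented numbers stays the same when the profile is converted, while the raw consumption changes.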

Re: Any chance to get snapshot-aware defragmentation?

2018-05-18 Thread Tomasz Pala
en with current approach it should be possible to interlace defragmentation with some kind of naive-deduplication; "naive" in the approach of comparing blocks only within the same in-subvolume paths. -- Tomasz Pala -- To unsubscribe from this list: send the line "unsubscribe linux

Re: your mail

2018-02-18 Thread Tomasz Pala
On Sun, Feb 18, 2018 at 10:28:02 +0100, Tomasz Pala wrote: > I've already noticed this problem on February 10th: > [btrfs-progs] coreutils-like -i parameter, splitting permissions for various > tasks > > In short: not possible. Regular user can only create subvolumes. Not

Re: your mail

2018-02-18 Thread Tomasz Pala
ld fail miserably. > After few years not using btrfs (because previously was quite > unstable) It is really good to see that now I'm not able to crash it. It's not crashing with LTS 4.4 and 4.9 kernels, many reports of various crashes in 4.12, 4.14 and 4.15 were posted here. It is real

Re: btrfs-cleaner / snapshot performance analysis

2018-02-10 Thread Tomasz Pala
sibly hostile write patterns (like /home) as nocow. Actually, if you do not use compression and don't need checksums of data blocks, you may want to mount all the btrfs with nocow by default. This way the quotas would be more accurate (no fragmentation _between_ snapshots) and you

[btrfs-progs] coreutils-like -i parameter, splitting permissions for various tasks

2018-02-10 Thread Tomasz Pala
ackup-admin with access to all the subvolumes or maintenance-admin that could issue scrub or rebalance volumes. For backward compatibility, these tools could be issued by 'btrfs' wrapper binary. -- Tomasz Pala -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

Re: degraded permanent mount option

2018-01-30 Thread Tomasz Pala
is planned: http://0pointer.net/blog/projects/stateless.html -- Tomasz Pala

Re: degraded permanent mount option

2018-01-30 Thread Tomasz Pala
look like. Hard to agree with someone who refuses to do _anything_. You can choose to follow whatever, MD, LVM, ZFS, invent something totally different, write custom daemon or put timeout logic inside the kernel itself. It doesn't matter. You know the ecosystem - it is the udev that must be

Re: degraded permanent mount option

2018-01-30 Thread Tomasz Pala
> profiles degraded using OpenRC without needing anything more than adding > rootflags=degraded to the kernel parameters must be a fluke then... We are talking about automatic fallback after timeout, not manually casting any magic spells! Since OpenRC doesn't read rootflags at all: grep -iE 'rootflags|degraded|btrfs' openrc/**/* it won't support this without some extra code. > The thing is, it primarily breaks if there are hardware issues, > regardless of the init system being used, but at least the other init > systems _give you an error message_ (even if it's really the kernel > spitting it out) instead of just hanging there forever with no > indication of what's going on like systemd does. If your systemd waits forever and you have no error messages, report a bug to your distro maintainer, as he is probably the one to blame for fixing what ain't broken. -- Tomasz Pala

Re: degraded permanent mount option

2018-01-30 Thread Tomasz Pala
On Tue, Jan 30, 2018 at 16:09:50 +0100, Tomasz Pala wrote: >> BCP for over a >> decade has been to put multipathing at the bottom, then crypto, then >> software RAID, then LVM, and then whatever filesystem you're using. > > Really? Let's enumerate some ca

Re: degraded permanent mount option

2018-01-30 Thread Tomasz Pala
systemd stepped in for some of there is that nobody else could introduce and force Linux-wide consensus. And if anyone would succeed, there would be some Austins blaming them for 'overtaking good old trashyard into coherent de facto standard.' > In this particular case, you don't need

Re: degraded permanent mount option

2018-01-30 Thread Tomasz Pala
won't be accepted in systemd upstream, especially because it requires the current udev rule to be slightly changed. -- Tomasz Pala

Re: degraded permanent mount option

2018-01-30 Thread Tomasz Pala
things do try to mount > filesystems without calling a mount helper, most notably the kernel when > it mounts the root filesystem on boot if you're not using an initramfs). > All in all, this type of thing gets out of hand _very_ fast. You need to think about the two separately: 1.

Re: degraded permanent mount option

2018-01-30 Thread Tomasz Pala
Just change the BTRFS_IOC_DEVICES_READY handler to always return READY. >> > Or maybe we should just remove it completely, because checking it _IS > WRONG_, That's right. But before committing upstream, check for consequences. I've already described a few today, pointed to the source and gave some possible alternate solutions. > which is why no other init system does it, and in fact no Other init systems either fail at mounting degraded btrfs just like systemd does, or have buggy workarounds in their code reimplemented in each other just to handle a thing that should be centrally organized. -- Tomasz Pala

Re: degraded permanent mount option

2018-01-30 Thread Tomasz Pala
operator action, so the umount SHOULD happen, or we are facing some MALFUNCTION, which is fatal itself, not by being a "race condition". -- Tomasz Pala

Re: degraded permanent mount option

2018-01-30 Thread Tomasz Pala
esort. > It's not rocket science to edit an init script if knobs it exposes are not > configurable enough for your needs. How many init scripts were you involved in? > If systemd decides to hide this > functionality, it needs to provide the admin with some way to override. There is - udev rules and systemd units I've mentioned. Just use them. > We're talking about issuing a mount call, it's not _that_ complicated. So just do it! https://github.com/systemd/systemd Please, go ahead with some PoC implementation, as it is REALLY hard to discuss init systems/scripts corner cases with someone that has apparently never written a single line of such code. -- Tomasz Pala

Re: degraded permanent mount option

2018-01-29 Thread Tomasz Pala
ator. But somewhere, sometime, someone would have a NEED for totally different set of rules for handling degraded volumes, just like MD or LVM does. This would be totally irresponsible to hardcode any mount-degraded rule inside systemd itself. That is exactly why this must go through the udev - u

Re: degraded permanent mount option

2018-01-28 Thread Tomasz Pala
y to push. If the IOCTL would be extended to return TRYING_DEGRADED (when instructed to do so after expired timeout), systemd could handle additional per-filesystem fstab options, like x-systemd.allow-degraded. Then it would be possible to have best-effort policy for rootfs (to make machine boot)

Re: degraded permanent mount option

2018-01-28 Thread Tomasz Pala
othing more than that), but overall _availability_. I do not care if there are 2, 5 or 100 devices. I do care if there are ENOUGH devices to run regular (including N-way mirroring and hot spares) and if not - if there are ENOUGH devices to run degraded. Having ALL the devices is just the edge case

Re: degraded permanent mount option

2018-01-28 Thread Tomasz Pala
On Sun, Jan 28, 2018 at 01:00:16 +0100, Tomasz Pala wrote: > It can't mount degraded, because the "missing" device might go online a > few seconds ago. s/ago/after/ >> The central problem is the lack of a timer and time out. > > You got mdadm-last-resort@.time

Re: degraded permanent mount option

2018-01-28 Thread Tomasz Pala
ked as 'not available', don't expect it to be kept used. Just fix the code to match reality. -- Tomasz Pala

Re: degraded permanent mount option

2018-01-27 Thread Tomasz Pala
"the kernel has already mounted it" and ignore kernel screaming "the device is (not yet there/gone)"? Just update the internal state after successful mount and this particular problem is gone. Unless there is some race condition and the state should be changed before the mount i

Re: degraded permanent mount option

2018-01-27 Thread Tomasz Pala
/blah" -> BTRFS_IOC_DEVICES_READY returns "READY" (or new value "DEGRADED") -> udev catches event and changes SYSTEMD_READY -> systemd mounts the volume. This is really simple. All you need to do is to pass "degraded" to the btrfs.ko, so the BTRFS_IOC_DEVIC
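The flow described in this message can be modelled with a small Python sketch. It is a toy simulation, not kernel, udev or systemd code: the `DEGRADED` value is the proposed addition rather than an existing ioctl result, and every function and parameter name below is hypothetical.

```python
# Toy model of the proposed signalling chain; nothing here calls the real
# BTRFS_IOC_DEVICES_READY ioctl. "DEGRADED" is the proposed, hypothetical value.
from enum import Enum


class Ready(Enum):
    NOT_READY = 0
    READY = 1
    DEGRADED = 2  # proposed new state, does not exist today


def devices_ready(present: int, total: int, minimum: int,
                  allow_degraded: bool) -> Ready:
    """Stand-in for what the ioctl could report for one filesystem."""
    if present >= total:
        return Ready.READY
    if allow_degraded and present >= minimum:
        return Ready.DEGRADED
    return Ready.NOT_READY


def udev_systemd_ready(state: Ready) -> bool:
    """Stand-in for the udev rule that sets SYSTEMD_READY."""
    return state is not Ready.NOT_READY


# Two-device raid1 with one device missing (minimum of one device assumed).
before_timeout = devices_ready(1, 2, 1, allow_degraded=False)
after_timeout = devices_ready(1, 2, 1, allow_degraded=True)
print(udev_systemd_ready(before_timeout), udev_systemd_ready(after_timeout))
```

In this model systemd itself never decides anything about device counts; it only reacts to the readiness flag, which is the separation of concerns the message argues for.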

Re: degraded permanent mount option

2018-01-27 Thread Tomasz Pala
at is not true. It's not how mdadm works anyway. Yes it does. You can't mount mdadm until /dev/mdX appears, which happens when array gets fully assembled *OR* times out and kernel gets instructed to run array as degraded, which results in /dev/mdX appearing. There is NO a

Re: degraded permanent mount option

2018-01-27 Thread Tomasz Pala
sdc would answer the same, BTW). It can > even ask for UUIDs -- all devices are present. So, mount will succeed, > right? Systemd doesn't count anything, it asks BTRFS_IOC_DEVICES_READY as implemented in btrfs/super.c. > Ie, the thing systemd can safely do, is to stop trying to rule

Re: degraded permanent mount option

2018-01-27 Thread Tomasz Pala
eady to be mounted, but not fully populated" (i.e. "degraded mount possible"). Then systemd could _fallback_ after timing out to degraded mount automatically according to some systemd-level option. Unless there is *some* signalling from btrfs, there is really not much systemd can *sa

Re: Unexpected raid1 behaviour

2017-12-22 Thread Tomasz Pala
r basis? By 'required' I mean by design/implementation issues/quirks, _not_ related to possible hardware malfunctions. -- Tomasz Pala

Re: Unexpected raid1 behaviour

2017-12-22 Thread Tomasz Pala
y kernel itself, while btrfs cannot (so initrd is required for rootfs). -- Tomasz Pala

Re: Unexpected raid1 behaviour

2017-12-22 Thread Tomasz Pala
when _you_ need to stop ignoring the fact, that you simply cannot just try mounting devices in a loop as this would render any NAS/FC/iSCSI-backed or more complicated systems unusable or hide problems in case of temporary problems with connection. systemd waits for the _underlying_ device - unless btr

Re: Unexpected raid1 behaviour

2017-12-20 Thread Tomasz Pala
Errata: On Wed, Dec 20, 2017 at 09:34:48 +0100, Tomasz Pala wrote: > /dev/sda -> 'not ready' > /dev/sdb -> 'not ready' > /dev/sdc -> 'ready', triggers /dev/sda -> 'not ready' and /dev/sdb - still > 'not ready'

Re: Unexpected raid1 behaviour

2017-12-20 Thread Tomasz Pala
"no more devices, give me all the remaining btrfs volumes in degraded mode if possible". By "give me btrfs volumes" I mean "mark them as 'ready'" so the udev could fire its rules. And if there would be anything for udev to distinguish 'ready' fr

Re: Unexpected raid1 behaviour

2017-12-19 Thread Tomasz Pala
me knob, module >> parameter or anything else to make the *R*aid work. > There's a mount option for it per-filesystem. Just add that to all your > mount calls, and you get exactly the same effect. If only they were passed... -- Tomasz Pala

Re: Unexpected raid1 behaviour

2017-12-19 Thread Tomasz Pala
that aren't Arch, Gentoo, or > Slackware derived do so too to a lesser degree), and it would require > constant curation to keep up to date. Only for long-term known issues OK, you've convinced me that kernel-vs-feature list is overhead. So maybe other approach: just like sy

Re: Unexpected raid1 behaviour

2017-12-19 Thread Tomasz Pala
enable it. I thought the work was already done if current kernel handles degraded RAID1 without switching to r/o, doesn't it? Or something else is missing? -- Tomasz Pala

Re: Unexpected raid1 behaviour

2017-12-19 Thread Tomasz Pala
have to be default, might be kernel compile-time knob, module parameter or anything else to make the *R*aid work. -- Tomasz Pala

Re: Unexpected raid1 behaviour

2017-12-19 Thread Tomasz Pala
from raid.  And Wouldn't want to worry you, but properly managed RAIDs make I/J-of-K trivial-failures transparent. Just like ECC protects N/M bits transparently. Investigating the reasons is sysadmin's job, just like other maintenance, including restoring protection level. -- Tomas

Re: Unexpected raid1 behaviour

2017-12-19 Thread Tomasz Pala
d be posted without creating the impression, that it's all about creating complain-list. Not to mention I'm absolutely not familiar with current patches, WIP and many many other corner cases or usage scenarios. In fact, not only the internals, but motivation and design principles must be wel

Re: Unexpected raid1 behaviour

2017-12-19 Thread Tomasz Pala
to fix the volume, accidentally the machine has rebooted. Which should do no harm if I had a RAID1. 4. As already said before, using r/w degraded RAID1 is FULLY ACCEPTABLE, as long as you accept "no more redundancy"... 4a. ...or had an N-way mirror and there is still some redundancy

Re: Unexpected raid1 behaviour

2017-12-18 Thread Tomasz Pala
I got one "RAID1" stuck in r/o after degraded mount, not nice... Not _expected_ to happen after single disk failure (without any reappearing). -- Tomasz Pala

Re: exclusive subvolume space missing

2017-12-15 Thread Tomasz Pala
like /home or /tmp (if held on btrfs). I'd say, that from security point of view the nocow should be default, unless specified for mount or specific file... Currently, if I mount with nocow, there is no way to whitelist trusted users or secure location, and until btrfs-specific options cou

Re: exclusive subvolume space missing

2017-12-11 Thread Tomasz Pala
oesn't share any physical locations with the old one. But still grows, so what does this situation have with snapshots anyway? Oh, and BTW - 900+ extents for ~5 GB taken means there is about 5.5 MB occupied per extent. How is that possible? -- Tomasz Pala File log.14 has 933
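The per-extent figure quoted above can be checked with a few lines of Python. This is only back-of-the-envelope arithmetic: "~5 GB" is taken here as exactly 5 GiB and "900+" as exactly 900 extents, both of which are assumptions made for the sake of the calculation.

```python
# Rough check of the average space per extent mentioned in the message.
GIB = 1024 ** 3
MIB = 1024 ** 2

occupied = 5 * GIB   # "~5 GB" in the message, assumed to be 5 GiB
extents = 900        # "900+" in the message, assumed to be exactly 900

avg_mib = occupied / extents / MIB
print(f"{avg_mib:.1f} MiB per extent on average")
```

With more than 900 extents the average drops slightly, which is consistent with the "about 5.5 MB" estimate.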

Re: exclusive subvolume space missing

2017-12-10 Thread Tomasz Pala
On Sun, Dec 10, 2017 at 12:27:38 +0100, Tomasz Pala wrote: > I have found a directory - pam_abl databases, which occupy 10 MB (yes, > TEN MEGAbytes) and released ...8.7 GB (almost NINE GIGAbytes) after # df Filesystem Size Used Avail Use% Mounted on /dev/sda264G

Re: ERROR: failed to repair root items: Input/output error

2017-12-10 Thread Tomasz Pala
estore complete files due to the nature of data loss (beginning of blocks). -- Tomasz Pala

Re: exclusive subvolume space missing

2017-12-10 Thread Tomasz Pala
are worth defragging if space released from extents is greater than space lost on inter-snapshot duplication. I can't just defrag entire filesystem since it breaks links with snapshots. This change was a real deal-breaker here... Any way to feed the deduplication code with snapshots maybe? Th

Re: exclusive subvolume space missing

2017-12-10 Thread Tomasz Pala
cted to be fixed internally, as the needs are conflicting, but their impact might be nullified by some housekeeping. -- Tomasz Pala

Re: exclusive subvolume space missing

2017-12-02 Thread Tomasz Pala
On Sat, Dec 02, 2017 at 17:28:12 +0100, Tomasz Pala wrote: >> Suppose you start with a 100 MiB file (I'm adjusting the sizes down from > [...] >> Now make various small changes to the file, say under 16 KiB each. These >> will each be COWed elsewhere as one might ex

Re: exclusive subvolume space missing

2017-12-02 Thread Tomasz Pala
eral times the size of the original file! > > Luckily few people have this sort of usage pattern, but if you do... > > It would certainly explain the space eating... Did anyone investigate how that is related to RRD rewrites? I don't use rrdcached, never thought that 1
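The quoted scenario (a 100 MiB file receiving small rewrites of under 16 KiB each) can be approximated with a toy Python model. It assumes the file starts as a single extent, that each rewrite is exactly 16 KiB and hits a distinct block, and that the original extent stays allocated until none of its blocks is referenced any more; it is not btrfs code and ignores snapshots, metadata and repeated rewrites of the same block.

```python
# Toy model of copy-on-write space pinning; all names are hypothetical.
BLOCK = 16 * 1024             # size of each small rewrite
FILE_SIZE = 100 * 1024 ** 2   # 100 MiB file, assumed to be one extent
TOTAL_BLOCKS = FILE_SIZE // BLOCK


def allocated_after(rewritten_blocks: int) -> int:
    """Bytes allocated after rewriting that many distinct blocks."""
    if not 0 <= rewritten_blocks <= TOTAL_BLOCKS:
        raise ValueError("rewritten_blocks out of range")
    new_extents = rewritten_blocks * BLOCK
    # The original extent is released only once every block was rewritten.
    original = 0 if rewritten_blocks == TOTAL_BLOCKS else FILE_SIZE
    return original + new_extents


for n in (0, TOTAL_BLOCKS // 2, TOTAL_BLOCKS - 1, TOTAL_BLOCKS):
    print(n, allocated_after(n) / 1024 ** 2, "MiB")
```

Even this simplified model nearly doubles the space just before the last block is rewritten; rewriting the newly created extents in the same partial way, or pinning them in snapshots, would push the total higher.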

Re: exclusive subvolume space missing

2017-12-02 Thread Tomasz Pala
ect here - reclaiming space before it is being locked inside snapshot. Rationale behind this is obvious: since the snapshot-aware defrag was removed, allow to defrag snapshot exclusive data only. This would of course result in partial file defragmentation, but that should be enough for pathological cases like mine. -- Tomasz Pala

Re: exclusive subvolume space missing

2017-12-01 Thread Tomasz Pala
w / 0.00s user 0.00s system 0% cpu 30.798 total > And further more, please ensure that all deleted files are really deleted. > Btrfs delay file and subvolume deletion, so you may need to sync several > times or use "btrfs subv sync" to ensure deleted files are deleted. Ye

Re: exclusive subvolume space missing

2017-12-01 Thread Tomasz Pala
GB. At least one recent snapshot, that was taken after some minor (<100 MB) changes from the subvolume, that has undergone some minor changes since then, occupied 8 GB during one night when the entire system was idling. This was crosschecked on files metadata (mtimes compared) and 'du'

Re: exclusive subvolume space missing

2017-12-01 Thread Tomasz Pala
altering text config files mostly (plus etckeeper's git metadata), so the volume of difference is extremely low. Actually most of the diffs between subvolumes come from updating distro packages. There were not much reflink copies made on this partition, only one kernel source compiled (.ccache

Re: exclusive subvolume space missing

2017-12-01 Thread Tomasz Pala
64.00GiB Device slack: 0.00B Data,single: 1.07GiB Data,RAID1: 55.97GiB Metadata,RAID1: 2.00GiB System,RAID1: 32.00MiB Unallocated: 4.93GiB /dev/sdb2, ID: 2 Device size:64.00GiB Device slack:

exclusive subvolume space missing

2017-12-01 Thread Tomasz Pala
. And the same happens with other snapshots, much more exclusive data shown in qgroup than actually found in files. So if not files, where is that space wasted? Metadata? btrfs-progs-4.12 running on Linux 4.9.46. best regards, -- Tomasz Pala