Re: btrfs-cleaner / snapshot performance analysis

2018-02-09 Thread Peter Grandi
> I am trying to better understand how the cleaner kthread > (btrfs-cleaner) impacts foreground performance, specifically > during snapshot deletion. My experience so far has been that > it can be dramatically disruptive to foreground I/O. That's such a warmly innocent and optimistic question!

Re: Btrfs reserve metadata problem

2018-01-02 Thread Peter Grandi
> When testing Btrfs with fio 4k random write, That's an exceptionally narrowly defined workload. Also it is narrower than that, because it must be without 'fsync' after each write, or else there would be no accumulation of dirty blocks in memory at all. > I found that volume with smaller free

Re: Unexpected raid1 behaviour

2017-12-19 Thread Peter Grandi
[ ... ] > The advantage of writing single chunks when degraded, is in > the case where a missing device returns (is readded, > intact). Catching up that device with the first drive, is a > manual but simple invocation of 'btrfs balance start > -dconvert=raid1,soft -mconvert=raid1,soft' The

Re: Unexpected raid1 behaviour

2017-12-18 Thread Peter Grandi
>> The fact is, the only cases where this is really an issue is >> if you've either got intermittently bad hardware, or are >> dealing with external > Well, the RAID1+ is all about the failing hardware. >> storage devices. For the majority of people who are using >> multi-device setups, the

Re: Unexpected raid1 behaviour

2017-12-18 Thread Peter Grandi
>> I haven't seen that, but I doubt that it is the radical >> redesign of the multi-device layer of Btrfs that is needed to >> give it operational semantics similar to those of MD RAID, >> and that I have vaguely described previously. > I agree that btrfs volume manager is incomplete in view of >

Re: Unexpected raid1 behaviour

2017-12-17 Thread Peter Grandi
"Duncan"'s reply is slightly optimistic in parts, so some further information... [ ... ] > Basically, at this point btrfs doesn't have "dynamic" device > handling. That is, if a device disappears, it doesn't know > it. That's just the consequence of what is a completely broken conceptual

Re: [PATCH 0/7] retry write on error

2017-12-03 Thread Peter Grandi
> [ ... ] btrfs incorporates disk management which is actually a > version of md layer, [ ... ] As far as I know Btrfs has no disk management, and was wisely designed without any, just like MD: Btrfs volumes and MD sets can be composed from "block devices", not disks, and block devices are quite

Re: [PATCH 0/7] retry write on error

2017-11-28 Thread Peter Grandi
>>> If the underlying protocol doesn't support retry and there >>> are some transient errors happening somewhere in our IO >>> stack, we'd like to give an extra chance for IO. >> A limited number of retries may make sense, though I saw some >> long stalls after retries on bad disks. Indeed! One

Re: Fixed subject: updatedb does not index separately mounted btrfs subvolumes

2017-11-05 Thread Peter Grandi
>> The issue is that updatedb by default will not index bind >> mounts, but by default on Fedora and probably other distros, >> put /home on a subvolume and then mount that subvolume which >> is in effect a bind mount. > > So the issue isn't /home being btrfs (as you said in the > subject), but

Re: defragmenting best practice?

2017-11-01 Thread Peter Grandi
> Another one is to find the most fragmented files first or all
> files of at least 1M with at least say 100 fragments as in:
> find "$HOME" -xdev -type f -size +1M -print0 | xargs -0 filefrag \
> | perl -n -e 'print "$1\0" if (m/(.*): ([0-9]+) extents/ && $2 > 100)' \
> | xargs -0 btrfs fi
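
The selection step in the quoted pipeline (keep only files whose 'filefrag' extent count exceeds a threshold, most fragmented first) can be sketched in Python. This is a minimal sketch, assuming 'filefrag' prints one summary line per file of the form '<path>: <N> extents found'; the function name and the default threshold are hypothetical, chosen only for illustration.

```python
import re

# Matches filefrag summary lines such as "/home/u/file: 123 extents found".
# The greedy (.*) keeps any colons that are part of the path itself.
SUMMARY = re.compile(r"^(.*): ([0-9]+) extents? found$")


def fragmented_files(filefrag_lines, min_extents=100):
    """Return (path, extents) pairs with more than min_extents extents,
    most fragmented first. Lines that do not match are ignored."""
    found = []
    for line in filefrag_lines:
        match = SUMMARY.match(line.rstrip("\n"))
        if match is None:
            continue
        path, extents = match.group(1), int(match.group(2))
        if extents > min_extents:
            found.append((path, extents))
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical sample lines, not real measurements.
    sample = ["/a: 150 extents found", "/b: 1 extent found"]
    for path, extents in fragmented_files(sample):
        print(extents, path)
```

Note that the comparison is made on the extent count (the second capture group), not on the path.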

Re: Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-11-01 Thread Peter Grandi
[ ... ] > The poor performance has existed from the beginning of using > BTRFS + KDE + Firefox (almost 2 years ago), at a point when > very few snapshots had yet been created. A comparison system > running similar hardware as well as KDE + Firefox (and LVM + > EXT4) did not have the performance

Re: defragmenting best practice?

2017-11-01 Thread Peter Grandi
> When defragmenting individual files on a BTRFS filesystem with > COW, I assume reflinks between that file and all snapshots are > broken. So if there are 30 snapshots on that volume, that one > file will suddenly take up 30 times more space... [ ... ] Defragmentation works by effectively making

Re: defragmenting best practice?

2017-10-31 Thread Peter Grandi
> I'm following up on all the suggestions regarding Firefox performance > on BTRFS. [ ... ] I haven't read that yet, so maybe I am missing something, but I use Firefox with Btrfs all the time and I haven't got issues. [ ... ] > 1. BTRFS snapshots have proven to be too useful (and too important

RE: SLES 11 SP4: can't mount btrfs

2017-10-26 Thread Peter Grandi
>> But it could simply be that you have forgotten to refresh the >> 'initramfs' with 'mkinitrd' after modifying the '/etc/fstab'. > I finally managed it. I'm pretty sure having changed > /boot/grub/menu.lst, but somehow changes got lost/weren't > saved ? So the next thing to check would indeed

RE: SLES 11 SP4: can't mount btrfs

2017-10-26 Thread Peter Grandi
> I formatted the / partition with Btrfs again and could restore > the files from a backup. Everything seems to be there, I can > mount the Btrfs manually. [ ... ] But SLES finds from where I > don't know a UUID (see screenshot). This UUID is commented out > in fstab and replaced by

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-19 Thread Peter Grandi
[ ... ] >> are USB drives really that unreliable [ ... ] [ ... ] > There are similar SATA chips too (occasionally JMicron and > Marvell for example are somewhat less awesome than they could > be), and practically all Firewire bridge chips of old "lied" a > lot [ ... ] > That plus Btrfs is

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-19 Thread Peter Grandi
[ ... ] >>> Oh please, please a bit less silliness would be welcome here. >>> In a previous comment on this tedious thread I had written: > If the block device abstraction layer and lower layers work > correctly, Btrfs does not have problems of that sort when > adding new devices;

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-19 Thread Peter Grandi
> [ ... ] when writes to a USB device fail due to a temporary > disconnection, the kernel can actually recognize that a write > error happened. [ ... ] Usually, but who knows? Maybe half the transfer gets written; maybe the data gets written to the wrong address; maybe stuff gets written but failure

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-19 Thread Peter Grandi
> [ ... ] However, the disappearance of the device doesn't get > propagated up to the filesystem correctly, Indeed, sometimes it does, sometimes it does not, in part because of chipset bugs, in part because the USB protocol signaling side does not handle errors well even if the chipset were bug

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-19 Thread Peter Grandi
[ ... ] >> Oh please, please a bit less silliness would be welcome here. >> In a previous comment on this tedious thread I had written: >> > If the block device abstraction layer and lower layers work >> > correctly, Btrfs does not have problems of that sort when >> > adding new devices;

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-18 Thread Peter Grandi
> [ ... ] After all, btrfs would just have to discard one copy > of each chunk. [ ... ] One more thing that is not clear to me > is the replication profile of a volume. I see that balance can > convert chunks between profiles, for example from single to > raid1, but I don't see how the default

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-18 Thread Peter Grandi
>> I forget sometimes that people insist on storing large >> volumes of data on unreliable storage... Here obviously "unreliable" is used in the sense of storage that can work incorrectly, not in the sense of storage that can fail. > In my opinion the unreliability of the storage is the exact >

Re: Is it safe to use btrfs on top of different types of devices?

2017-10-14 Thread Peter Grandi
> A few years ago I tried to use a RAID1 mdadm array of a SATA > and a USB disk, which led to strange error messages and data > corruption. That's common, quite a few reports of similar issues in previous entries in this mailing list and for many other filesystems. > I did some searching back

Re: btrfs errors over NFS

2017-10-13 Thread Peter Grandi
>> TL;DR: ran into some btrfs errors and weird behaviour, but >> things generally seem to work. Just posting some details in >> case it helps devs or other users. [ ... ] I've run into a >> btrfs error trying to do a -j8 build of android on a btrfs >> filesystem exported over NFSv3. [ ... ] I

Re: What means "top level" in "btrfs subvolume list" ?

2017-09-30 Thread Peter Grandi
> I am trying to figure out what "top level" means in the > output of "btrfs sub list" The terminology (and sometimes the detailed behaviour) of Btrfs is not extremely consistent, I guess because of permissive editorship of the design, in a "let 1000 flowers bloom" sort of fashion so that does

Re: Btrfs performance with small blocksize on SSD

2017-09-26 Thread Peter Grandi
> i run a few performance tests comparing mdadm, hardware raid > and the btrfs raid. Fantastic beginning already! :-) > I noticed that the performance I have seen over the years a lot of messages like this where there is a wanton display of amusing misuses of terminology, of which the misuse of

Re: A user cannot remove his readonly snapshots?!

2017-09-16 Thread Peter Grandi
[ ... ] > I can delete normal subvolumes but not the readonly snapshots:
It is because of ordinary permissions for both subvolumes and snapshots:
tree$ btrfs sub create /fs/sda7/sub
Create subvolume '/fs/sda7/sub'
tree$ chmod a-w /fs/sda7/sub
tree$ btrfs sub del /fs/sda7/sub

Re: A user cannot remove his readonly snapshots?!

2017-09-15 Thread Peter Grandi
> [ ... ] mounted with option user_subvol_rm_allowed [ ... ]
> root can delete this snapshot, but not the user. Why? [ ... ]
Ordinary permissions still apply both to 'create' and 'delete':
tree$ sudo mkdir /fs/sda7/dir
tree$ btrfs sub create /fs/sda7/dir/sub
ERROR: cannot access

Re: defragmenting best practice?

2017-09-15 Thread Peter Grandi
[ ... ]
Case #1
2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu qcow2 storage -> guest BTRFS filesystem
SQL table row insertions per second: 1-2
Case #2
2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu raw storage -> guest EXT4 filesystem

Re: defragmenting best practice?

2017-09-15 Thread Peter Grandi
> Case #1
> 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs -> qemu qcow2 storage
> -> guest BTRFS filesystem
> SQL table row insertions per second: 1-2
"Doctor, if I stab my hand with a fork it hurts a lot: can you cure that?"
> Case #2
> 2x 7200 rpm HDD -> md raid 1 -> host BTRFS rootfs ->

Re: generic name for volume and subvolume root?

2017-09-10 Thread Peter Grandi
> As I am writing some documentation about creating snapshots: > Is there a generic name for both volume and subvolume root? Yes, it is from the UNIX side 'root directory' and from the Btrfs side 'subvolume'. Like some other things Btrfs, its terminology is often inconsistent, but "volume"

Re: test if a subvolume is a snapshot?

2017-09-08 Thread Peter Grandi
> How can I test if a subvolume is a snapshot? [ ... ] This question is based on the assumption that "snapshot" is a distinct type of subvolume and not just an operation that creates a subvolume with reflinked contents. Unfortunately Btrfs does indeed make snapshots a distinct type of

Re: 4.13: No space left with plenty of free space (/home/kernel/COD/linux/fs/btrfs/extent-tree.c:6989 __btrfs_free_extent.isra.62+0xc2c/0xdb0)

2017-09-08 Thread Peter Grandi
[ ... ]
> [233787.921018] Call Trace:
> [233787.921031] ? btrfs_merge_delayed_refs+0x62/0x550 [btrfs]
> [233787.921039] __btrfs_run_delayed_refs+0x6f0/0x1380 [btrfs]
> [233787.921047] btrfs_run_delayed_refs+0x6b/0x250 [btrfs]
> [233787.921054] btrfs_write_dirty_block_groups+0x158/0x390 [btrfs]

Re: speed up big btrfs volumes with ssds

2017-09-04 Thread Peter Grandi
>>> [ ... ] Currently without any ssds i get the best speed with: >>> - 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" devices >>> and using btrfs as raid 0 for data and metadata on top of >>> those 4 raid 5. [ ... ] the write speed is not as good as i >>> would like - especially for random

Re: speed up big btrfs volumes with ssds

2017-09-04 Thread Peter Grandi
>> [ ... ] Currently the write speed is not as good as i would >> like - especially for random 8k-16k I/O. [ ... ] > [ ... ] So this 60TB is then 20 4TB disks or so and the 4x 1GB > cache is simply not very helpful I think. The working set > doesn't fit in it I guess. If there is mostly single or

Re: read-only for no good reason on 4.9.30

2017-09-04 Thread Peter Grandi
> [ ... ] I ran "btrfs balance" and then it started working > correctly again. It seems that a btrfs filesystem if left > alone will eventually get fragmented enough that it rejects > writes [ ... ] Free space will get fragmented, because Btrfs has a 2-level allocator scheme (chunks within

Re: speed up big btrfs volumes with ssds

2017-09-03 Thread Peter Grandi
> [ ... ] - needed volume size is 60TB I wonder how long that takes to 'scrub', 'balance', 'check', 'subvolume delete', 'find', etc. > [ ... ] 4x HW Raid 5 with 1GB controller memory of 4TB 3,5" > devices and using btrfs as raid 0 for data and metadata on top > of those 4 raid 5. [ ... ] the

Re: number of subvolumes

2017-08-24 Thread Peter Grandi
>> Using hundreds or thousands of snapshots is probably fine >> mostly. As I mentioned previously, with a link to the relevant email describing the details, the real issue is reflinks/backrefs. Usually subvolume and snapshots involve them. > We find that typically apt is very slow on a machine

Re: number of subvolumes

2017-08-23 Thread Peter Grandi
> This is a vanilla SLES12 installation: [ ... ] Why does SUSE > ignore this "not too many subvolumes" warning? As in many cases with Btrfs "it's complicated" because of the interaction of advanced features among themselves and the chosen implementation and properties of storage; anisotropy

Re: user snapshots

2017-08-23 Thread Peter Grandi
> So, still: What is the problem with user_subvol_rm_allowed? As usual, it is complicated: mostly that while subvol creation is very cheap, subvol deletion can be very expensive. But then so can be creating many snapshots, as in this: https://www.spinics.net/lists/linux-btrfs/msg62760.html

Re: netapp-alike snapshots?

2017-08-22 Thread Peter Grandi
[ ... ] It is beneficial to not have snapshots in-place. With a local directory of snapshots, [ ... ] Indeed and there is a fair description of some options for subvolume nesting policies here which may be interesting to the original poster:

Re: finding root filesystem of a subvolume?

2017-08-22 Thread Peter Grandi
[ ... ] >> There is no fixed relationship between the root directory >> inode of a subvolume and the root directory inode of any >> other subvolume or the main volume. > Actually, there is, because it's inherently rooted in the > hierarchy of the volume itself. That root inode for the >

Re: finding root filesystem of a subvolume?

2017-08-22 Thread Peter Grandi
> How do I find the root filesystem of a subvolume?
> Example:
> root@fex:~# df -T
> Filesystem Type 1K-blocks  Used      Available Use% Mounted on
> -          -    1073740800 104244552 967773976 10%  /local/.backup/home
[ ... ]
> I know, the root filesystem is /local,
That question is

Re: slow btrfs with a single kworker process using 100% CPU

2017-08-16 Thread Peter Grandi
>>> I've one system where a single kworker process is using 100% >>> CPU sometimes a second process comes up with 100% CPU >>> [btrfs-transacti]. [ ... ] >> [ ... ]1413 Snapshots. I'm deleting 50 of them every night. But >> btrfs-cleaner process isn't running / consuming CPU currently. Reminder

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-16 Thread Peter Grandi
[ ... ] >>> Snapshots work fine with nodatacow, each block gets CoW'ed >>> once when it's first written to, and then goes back to being >>> NOCOW. >>> The only caveat is that you probably want to defrag either >>> once everything has been rewritten, or right after the >>> snapshot. >> I thought

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-16 Thread Peter Grandi
[ ... ] > But I've talked to some friend at the local super computing > centre and they have rather general issues with CoW at their > virtualisation cluster. Amazing news! :-) > Like SUSE's snapper making many snapshots leading the storage > images of VMs apparently to explode (in terms of

Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?

2017-08-16 Thread Peter Grandi
> We use the crcs to catch storage gone wrong, [ ... ] And that's an opportunistically feasible idea given that current CPUs can do that in real-time. > [ ... ] It's possible to protect against all three without COW, > but all solutions have their own tradeoffs and this is the setup > we chose.

Re: Btrfs + compression = slow performance and high cpu usage

2017-08-01 Thread Peter Grandi
[ ... ] > This is the "storage for beginners" version, what happens in > practice however depends a lot on specific workload profile > (typical read/write size and latencies and rates), caching and > queueing algorithms in both Linux and the HA firmware. To add a bit of slightly more advanced

Re: Btrfs + compression = slow performance and high cpu usage

2017-08-01 Thread Peter Grandi
>> [ ... ] a "RAID5 with 128KiB writes and a 768KiB stripe >> size". [ ... ] several back-to-back 128KiB writes [ ... ] get >> merged by the 3ware firmware only if it has a persistent >> cache, and maybe your 3ware does not have one, > KOS: No I don't have persistent cache. Only the 512 Mb cache

Re: Btrfs + compression = slow performance and high cpu usage

2017-08-01 Thread Peter Grandi
> Peter, I don't think the filefrag is showing the correct > fragmentation status of the file when the compression is used.
As reported in a previous message, the output of 'filefrag -v' can be used to see what is going on:
filefrag /mnt/sde3/testfile
/mnt/sde3/testfile: 49287

Re: Btrfs + compression = slow performance and high cpu usage

2017-07-31 Thread Peter Grandi
> [ ... ] It is hard for me to see a speed issue here with > Btrfs: for comparison I have done a simple test with both a > 3+1 MD RAID5 set with a 256KiB chunk size and a single block > device on "contemporary" 1TB/2TB drives, capable of sequential > transfer rates of 150-190MB/s: [ ... ] The

Re: Btrfs + compression = slow performance and high cpu usage

2017-07-31 Thread Peter Grandi
[ ... ] > Also added:
Feeling very generous :-) today, adding these too:
soft# mkfs.btrfs -mraid10 -draid10 -L test5 /dev/sd{b,c,d,e}3
[ ... ]
soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdb3 /mnt/test5
soft# rm -f /mnt/test5/testfile
soft# /usr/bin/time dd

Re: Btrfs + compression = slow performance and high cpu usage

2017-07-31 Thread Peter Grandi
[ ... ] > grep 'model name' /proc/cpuinfo | sort -u > model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz Good, contemporary CPU with all accelerations. > The sda device is a hardware RAID5 consisting of 4x8TB drives. [ ... ] > Strip Size : 256 KB So the full RMW data
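
The stripe arithmetic behind the figures quoted in this thread (a 4-drive RAID5 with a 256 KB strip, and the 128KiB writes against a 768KiB stripe mentioned in a later message) can be sketched as follows. It is only an illustration, assuming one strip of parity per stripe and that the controller's "256 KB" means 256 KiB; the function names are hypothetical.

```python
def full_stripe_kib(drives, strip_kib):
    """Data held by one RAID5 stripe: one strip per drive, minus one for parity."""
    if drives < 3:
        raise ValueError("RAID5 needs at least 3 drives")
    return (drives - 1) * strip_kib


def needs_rmw(write_kib, drives, strip_kib):
    """True when a stripe-aligned write is not a whole number of stripes,
    so parity cannot be computed from the new data alone."""
    return write_kib % full_stripe_kib(drives, strip_kib) != 0


if __name__ == "__main__":
    print(full_stripe_kib(drives=4, strip_kib=256))   # data KiB per stripe
    print(needs_rmw(128, drives=4, strip_kib=256))    # partial-stripe write
    print(needs_rmw(768, drives=4, strip_kib=256))    # full-stripe write
```

A write that covers only part of a stripe forces the array to read old data or parity before writing, which is the read-modify-write (RMW) cost the reply refers to.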

Re: Btrfs + compression = slow performance and high cpu usage

2017-07-28 Thread Peter Grandi
In addition to my previous "it does not happen here" comment, if someone is reading this thread, there are some other interesting details: > When the compression is turned off, I am able to get the > maximum 500-600 mb/s write speed on this disk (raid array) > with minimal cpu usage. No details

Re: Btrfs + compression = slow performance and high cpu usage

2017-07-28 Thread Peter Grandi
> I am stuck with a problem of btrfs slow performance when using > compression. [ ... ] That to me looks like an issue with speed, not performance, and in particular with PEBCAK issues. As to high CPU usage, when you find a way to do both compression and checksumming without using much CPU time,

Re: kernel btrfs file system wedged -- is it toast?

2017-07-21 Thread Peter Grandi
> [ ... ] announce loudly and clearly to any potential users, in > multiple places (perhaps a key announcement in a few places > and links to that announcement from many places, https://btrfs.wiki.kernel.org/index.php/Gotchas#Having_many_subvolumes_can_be_very_slow > ... DO expect to first have

Re: Exactly what is wrong with RAID5/6

2017-06-21 Thread Peter Grandi
> [ ... ] This will make some filesystems mostly RAID1, negating > all space savings of RAID5, won't it? [ ... ] RAID5/RAID6/... don't merely save space, more precisely they trade lower resilience and a more anisotropic and smaller performance envelope to gain lower redundancy (= save space).

Re: does using different uid/gid/forceuid/... mount options for different subvolumes work / does fuse.bindfs play nice with btrfs?

2017-06-20 Thread Peter Grandi
> I intend to provide different "views" of the data stored on > btrfs subvolumes. e.g. mount a subvolume in location A rw; > and ro in location B while also overwriting uids, gids, and > permissions. [ ... ] That's not how UNIX/Linux permissions and ACLs are supposed to work, perhaps you should

Re: Struggling with file system slowness

2017-05-04 Thread Peter Grandi
> Trying to peg down why I have one server that has > btrfs-transacti pegged at 100% CPU for most of the time. Too little information. Is IO happening at the same time? Is compression on? Deduplicated? Lots of subvolumes? SSD? What kind of workload and file size/distribution profile? Typical

Re: Ded

2017-05-03 Thread Peter Grandi
> I have a btrfs filesystem mounted at /btrfs_vol/ Every N > minutes, I run bedup for deduplication of data in /btrfs_vol > Inside /btrfs_vol, I have several subvolumes (consider this as > home directories of several users) I have set individual > qgroup limits for each of these subvolumes. [ ...

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-29 Thread Peter Grandi
>> [ ... ] these extents are all over the place, they're not >> contiguous at all. 4K here, 4K there, 4K over there, back to >> 4K here next to this one, 4K over there...12K over there, 500K >> unwritten, 4K over there. This seems not so consequential on >> SSD, [ ... ] > Indeed there were recent

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-29 Thread Peter Grandi
> [ ... ] Instead, you can use raw files (preferably sparse unless > there's both nocow and no snapshots). Btrfs does natively everything > you'd gain from qcow2, and does it better: you can delete the master > of a cloned image, deduplicate them, deduplicate two unrelated images; > you can turn

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-28 Thread Peter Grandi
> [ ... ] these extents are all over the place, they're not > contiguous at all. 4K here, 4K there, 4K over there, back to > 4K here next to this one, 4K over there...12K over there, 500K > unwritten, 4K over there. This seems not so consequential on > SSD, [ ... ] Indeed there were recent

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-28 Thread Peter Grandi
>> The gotcha though is there's a pile of data in the journal >> that would never make it to rsyslogd. If you use journalctl >> -o verbose you can see some of this. > You can send *all the info* to rsyslogd via imjournal > http://www.rsyslog.com/doc/v8-stable/configuration/modules/imjournal.html

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-28 Thread Peter Grandi
> [ ... ] And that makes me wonder whether metadata > fragmentation is happening as a result. But in any case, > there's a lot of metadata being written for each journal > update compared to what's being added to the journal file. [ > ... ] That's the "wandering trees" problem in COW filesystems,

Re: btrfs, journald logs, fragmentation, and fallocate

2017-04-28 Thread Peter Grandi
> Old news is that systemd-journald journals end up pretty > heavily fragmented on Btrfs due to COW. This has been discussed before in detail indeed here, but also here: http://www.sabi.co.uk/blog/15-one.html?150203#150203 > While journald uses chattr +C on journal files now, COW still >

Re: About free space fragmentation, metadata write amplification and (no)ssd

2017-04-08 Thread Peter Grandi
> [ ... ] This post is way too long [ ... ] Many thanks for your report, it is really useful, especially the details. > [ ... ] using rsync with --link-dest to btrfs while still > using rsync, but with btrfs subvolumes and snapshots [1]. [ > ... ] Currently there's ~35TiB of data present on the

Re: btrfs filesystem keeps allocating new chunks for no apparent reason

2017-04-07 Thread Peter Grandi
[ ... ] >>> I've got a mostly inactive btrfs filesystem inside a virtual >>> machine somewhere that shows interesting behaviour: while no >>> interesting disk activity is going on, btrfs keeps >>> allocating new chunks, a GiB at a time. [ ... ] > Because the allocator keeps walking forward every

Re: Do different btrfs volumes compete for CPU?

2017-04-04 Thread Peter Grandi
> [ ... ] I tried to use eSATA and ext4 first, but observed > silent data corruption and irrecoverable kernel hangs -- > apparently, SATA is not really designed for external use. SATA works for external use, eSATA works well, but what really matters is the chipset of the adapter card. In my

Re: Shrinking a device - performance?

2017-04-01 Thread Peter Grandi
[ ... ]
>>> $ D='btrfs f2fs gfs2 hfsplus jfs nilfs2 reiserfs udf xfs'
>>> $ find $D -name '*.ko' | xargs size | sed 's/^ *//;s/ .*\t//g'
>>> text    filename
>>> 832719  btrfs/btrfs.ko
>>> 237952  f2fs/f2fs.ko
>>> 251805  gfs2/gfs2.ko
>>> 72731   hfsplus/hfsplus.ko
>>> 171623

Re: Do different btrfs volumes compete for CPU?

2017-04-01 Thread Peter Grandi
>> Approximately 16 hours ago I've run a script that deleted >> >~100 snapshots and started quota rescan on a large >> USB-connected btrfs volume (5.4 of 22 TB occupied now). That "USB-connected" is a rather bad idea. On the IRC channel #Btrfs whenever someone reports odd things happening I ask

Re: Shrinking a device - performance?

2017-03-31 Thread Peter Grandi
> [ ... ] what the significance of the xargs size limits of > btrfs might be. [ ... ] So what does it mean that btrfs has a > higher xargs size limit than other file systems? [ ... ] Or > does the lower capacity for argument length for hfsplus > demonstrate it is the superior file system for

Re: Shrinking a device - performance?

2017-03-31 Thread Peter Grandi
>>> My guess is that very complex risky slow operations like >>> that are provided by "clever" filesystem developers for >>> "marketing" purposes, to win box-ticking competitions. >>> That applies to those system developers who do know better; >>> I suspect that even some filesystem developers

Re: Shrinking a device - performance?

2017-03-31 Thread Peter Grandi
>> [ ... ] CentOS, Redhat, and Oracle seem to take the position >> that very large data subvolumes using btrfs should work >> fine. But I would be curious what the rest of the list thinks >> about 20 TiB in one volume/subvolume. > To be sure I'm a biased voice here, as I have multiple >

Re: Shrinking a device - performance?

2017-03-31 Thread Peter Grandi
> Can you try to first dedup the btrfs volume? This is probably > out of date, but you could try one of these: [ ... ] Yep, > that's probably a lot of work. [ ... ] My recollection is that > btrfs handles deduplication differently than zfs, but both of > them can be very, very slow But the big

Re: Shrinking a device - performance?

2017-03-31 Thread Peter Grandi
>>> The way btrfs is designed I'd actually expect shrinking to >>> be fast in most cases. [ ... ] >> The proposed "move whole chunks" implementation helps only if >> there are enough unallocated chunks "below the line". If regular >> 'balance' is done on the filesystem there will be some, but

Re: Shrinking a device - performance?

2017-03-30 Thread Peter Grandi
>> My guess is that very complex risky slow operations like that are >> provided by "clever" filesystem developers for "marketing" purposes, >> to win box-ticking competitions. That applies to those system >> developers who do know better; I suspect that even some filesystem >> developers are

Re: Shrinking a device - performance?

2017-03-30 Thread Peter Grandi
> I’ve glazed over on “Not only that …” … can you make youtube > video of that :)) [ ... ] It’s because I’m special :* Well played again, that's a fairly credible impersonation of a node.js/mongodb developer :-). > On a real note thank’s [ ... ] to much of open source stuff is > based on short

Re: Shrinking a device - performance?

2017-03-30 Thread Peter Grandi
>> As a general consideration, shrinking a large filetree online >> in-place is an amazingly risky, difficult, slow operation and >> should be a last desperate resort (as apparently in this case), >> regardless of the filesystem type, and expecting otherwise is >> "optimistic". > The way btrfs is

Re: Shrinking a device - performance?

2017-03-28 Thread Peter Grandi
> I glazed over at “This is going to be long” … :) >> [ ... ] Not only that, you also top-posted while quoting it pointlessly in its entirety, to the whole mailing list. Well played :-).

Re: Shrinking a device - performance?

2017-03-28 Thread Peter Grandi
> [ ... ] slaps together a large storage system in the cheapest > and quickest way knowing that while it is mostly empty it will > seem very fast regardless and therefore to have awesome > performance, and then the "clever" sysadm disappears surrounded > by a halo of glory before the storage

Re: Shrinking a device - performance?

2017-03-28 Thread Peter Grandi
> [ ... ] reminded of all the cases where someone left me to > decatastrophize a storage system built on "optimistic" > assumptions. In particular when some "clever" sysadm with a "clever" (or dumb) manager slaps together a large storage system in the cheapest and quickest way knowing that while

Re: Shrinking a device - performance?

2017-03-28 Thread Peter Grandi
This is going to be long because I am writing something detailed hoping pointlessly that someone in the future will find it by searching the list archives while doing research before setting up a new storage system, and they will be the kind of person that tolerates reading messages longer than

Re: send snapshot from snapshot incremental

2017-03-26 Thread Peter Grandi
[ ... ] > BUT if i take a snapshot from the system, and want to transfer > it to the external HD, i can not set a parent subvolume, > because there isn't any. Questions like this are based on incomplete understanding of 'send' and 'receive', and on IRC user "darkling" explained it fairly well: >

Re: backing up a file server with many subvolumes

2017-03-26 Thread Peter Grandi
> [ ... ] In each filesystem subdirectory are incremental > snapshot subvolumes for that filesystem. [ ... ] The scheme > is something like this: > /backup/// BTW hopefully this does not amount to too many subvolumes in the '.../backup/' volume, because that can create complications, where

Re: BTRFS Metadata Corruption Prevents Scrub and btrfs check

2017-03-17 Thread Peter Grandi
> How can I attempt to rebuild the metadata, with a treescan or > otherwise? I don't know unfortunately for backrefs. >> In general metadata in Btrfs is fairly intricate and metadata >> block loss is pretty fatal, that's why metadata should most >> times be redundant as in 'dup' or 'raid1' or

Re: BTRFS Metadata Corruption Prevents Scrub and btrfs check

2017-03-17 Thread Peter Grandi
> Read error at byte 0, while reading 3975 bytes: Input/output error Bad news. That means that probably the disk is damaged and further issues may happen. > corrected errors: 0, uncorrectable errors: 2, unverified errors: 0 Even worse news. > Incorrect local backref count on

Re: raid1 degraded mount still produce single chunks, writeable mount not allowed

2017-03-09 Thread Peter Grandi
>> Consider the common case of a 3-member volume with a 'raid1' >> target profile: if the sysadm thinks that a drive should be >> replaced, the goal is to take it out *without* converting every >> chunk to 'single', because with 2-out-of-3 devices half of the >> chunks will still be fully

Re: raid1 degraded mount still produce single chunks, writeable mount not allowed

2017-03-05 Thread Peter Grandi
[ ... on the difference between number of devices and length of a chunk-stripe ... ] > Note: possibilities get even more interesting with a 4-device > volume with 'raid1' profile chunks, and similar case involving > other profiles than 'raid1'. Consider for example a 4-device volume with 2

Re: raid1 degraded mount still produce single chunks, writeable mount not allowed

2017-03-05 Thread Peter Grandi
>> What makes me think that "unmirrored" 'raid1' profile chunks >> are "not a thing" is that it is impossible to remove >> explicitly a member device from a 'raid1' profile volume: >> first one has to 'convert' to 'single', and then the 'remove' >> copies back to the remaining devices the 'single'

Re: raid1 degraded mount still produce single chunks, writeable mount not allowed

2017-03-02 Thread Peter Grandi
> [ ... ] Meanwhile, the problem as I understand it is that at > the first raid1 degraded writable mount, no single-mode chunks > exist, but without the second device, they are created. [ > ... ] That does not make any sense, unless there is a fundamental mistake in the design of the 'raid1'

Re: Low IOOP Performance

2017-02-27 Thread Peter Grandi
[ ... ] > I have a 6-device test setup at home and I tried various setups > and I think I got rather better than that. * 'raid1' profile: soft# btrfs fi df /mnt/sdb5 Data, RAID1:

Re: Low IOOP Performance

2017-02-27 Thread Peter Grandi
>>> On Mon, 27 Feb 2017 22:11:29 +, p...@btrfs.list.sabi.co.uk (Peter >>> Grandi) said: > [ ... ] >> I have a 6-device test setup at home and I tried various setups >> and I think I got rather better than that. [ ... ] > That's a range of 700-1300 4KiB

Re: Low IOOP Performance

2017-02-27 Thread Peter Grandi
[ ... ] > a ten disk raid1 using 7.2k 3 TB SAS drives Those are really low IOPS-per-TB devices, but good choice for SAS, as they will have SCT/ERC. > and used aio to test IOOP rates. I was surprised to measure > 215 read and 72 write IOOPs on the clean new filesystem. For that you really want

Re: understanding disk space usage

2017-02-08 Thread Peter Grandi
[ ... ] > The issue isn't total size, it's the difference between total > size and the amount of data you want to store on it. and how > well you manage chunk usage. If you're balancing regularly to > compact chunks that are less than 50% full, [ ... ] BTRFS on > 16GB disk images before with

Re: understanding disk space usage

2017-02-08 Thread Peter Grandi
>> My system is or seems to be running out of disk space but I >> can't find out how or why. [ ... ] >> Filesystem Size Used Avail Use% Mounted on >> /dev/sda3 28G 26G 2.1G 93% / [ ... ] > So from chunk level, your fs is already full. And balance > won't success since

Re: BTRFS for OLTP Databases

2017-02-07 Thread Peter Grandi
> I have tried BTRFS from Ubuntu 16.04 LTS for write intensive > OLTP MySQL Workload. This has a lot of interesting and mostly agreeable information: https://blog.pgaddict.com/posts/friends-dont-let-friends-use-btrfs-for-oltp The main target of Btrfs is where one wants checksums and occasional

Re: [Jfs-discussion] benchmark results

2009-12-24 Thread Peter Grandi
> I've had the chance to use a testsystem here and couldn't resist Unfortunately there seems to be an overproduction of rather meaningless file system benchmarks... > running a few benchmark programs on them: bonnie++, tiobench, dbench and a few generic ones (cp/rm/tar/etc...) on ext{234},