some project ideas: NFS4 ACLs, resilience on the same device, allowing to specify which devices are distinct in a RAID

2014-06-02 Thread Christoph Anton Mitterer
Hi. Christian Kujau suggested in the wiki[] to post project ideas to the list to give them a possibly wider discussion. So far I've had these ideas: 1) NFS 4 ACLs[1] (Not sure whether it has been proposed and/or rejected before),... but it would be nice if it was a goal for btrfs to support

Re: Using BTRFS on SSD now ?

2014-06-05 Thread Christoph Anton Mitterer
On Thu, 2014-06-05 at 07:56 -0700, Marc MERLIN wrote: Dmcrypt is ok, however add discard to cryptsetup options too Be aware that discard used with dm-crypt may have security implications. Cheers, Chris.

Re: some project ideas: NFS4 ACLs, resilience on the same device, allowing to specify which devices are distinct in a RAID

2014-06-06 Thread Christoph Anton Mitterer
On Tue, 2014-06-03 at 19:03 +0200, Goffredo Baroncelli wrote: There is (was ?) a project to address that: richacl http://www.bestbits.at/richacl/. This is not a btrfs project, but a linux kernel project because from a filesystem POV the implementation requires to store some information in

Re: some project ideas: NFS4 ACLs, resilience on the same device, allowing to specify which devices are distinct in a RAID

2014-06-13 Thread Christoph Anton Mitterer
On Sat, 2014-06-07 at 10:25 +0200, Goffredo Baroncelli wrote: You can add new ideas to the wiki pages, supported by links and other info where available. This is the real nature of the wiki pages. I've added some stuff now:

Re: btrfs data dup on single device?

2014-06-25 Thread Christoph Anton Mitterer
On Wed, 2014-06-25 at 08:47 +0100, Hugo Mills wrote: This has variously been possible and not over the last few years. I think it's finally come down on the side of not, I think that would really be a loss... :( The question is, why? Well imagine you have some computer which can only

general thoughts and questions + general and RAID5/6 stability?

2014-08-30 Thread Christoph Anton Mitterer
Hey. For some time now I've been considering using btrfs at a larger scale, basically in two scenarios: a) As the backend for data pools handled by dcache (dcache.org), where we run a Tier-2 in the higher PiB range for the LHC Computing Grid... For now that would be a rather boring use of btrfs (i.e. not

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-11-30 Thread Christoph Anton Mitterer
Agree with others about -C 256... -C sha256 is only three letters more ;) Ideally, sha2-256 would be used, since there will be (are) other versions of SHA with a 256-bit size. Cheers, Chris.

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-11-30 Thread Christoph Anton Mitterer
On Sun, 2014-11-30 at 23:05 +, Dimitri John Ledkov wrote: Nope, we should use standard names. Well I wouldn't know that there is really a standardised name in the sense that it tells it's mandatory. People use SHA2-xxx, SHA-xxx, SHAxxx and probably even more combinations. And just because

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread Christoph Anton Mitterer
On Sat, 2014-11-29 at 13:00 -0800, John Williams wrote: On Sat, Nov 29, 2014 at 12:38 PM, Alex Elsayed eternal...@gmail.com wrote: Why not just use the kernel crypto API? Then the user can just specify any hash the kernel supports. One reason is that cryptographic hashes are an order of

Re: [RFC PATCH] Btrfs: add sha256 checksum option

2014-12-01 Thread Christoph Anton Mitterer
On Mon, 2014-12-01 at 16:43 -0800, Alex Elsayed wrote: including that MAC-then-encrypt is fragile against a number of attacks, mainly in the padding-oracle category (See: TLS BEAST attack). Well but here we talk about disk encryption... how would the MtE oracle problems apply to that? Either

is cryptographically secure integrity checking possible with btrfs?

2015-03-11 Thread Christoph Anton Mitterer
Hey. For encryption we have dm-crypt and in principle I'm happy with having that at the block device level below the filesystem - perhaps except for any possible performance issues, especially when used with software RAID (regardless of whether MD or btrfs')[0]. But obviously integrity

Re: btrfs dedup - available or experimental? Or yet to be?

2015-03-29 Thread Christoph Anton Mitterer
On Sun, 2015-03-29 at 13:43 +0200, Kai Krakow wrote: Concluding that: duperemove should probably not try to become smart about filesystem boundaries. It should either cross them or not as it is now - the option is left to the user (as is the task to supply proper cmdline arguments with

Re: btrfs dedup - available or experimental? Or yet to be?

2015-03-29 Thread Christoph Anton Mitterer
On Sun, 2015-03-29 at 16:44 +0200, Kai Krakow wrote: Yes, the chosen default is probably not the best for this kind of utility. But I suppose it follows the principle of least surprise. At least every utility I'm daily using (like find) follows this default route. But the default with all

Re: how to clone a btrfs filesystem

2015-04-20 Thread Christoph Anton Mitterer
On Mon, 2015-04-20 at 05:23 +, Duncan wrote: Which, given the common developer wisdom about premature optimization, can be explained. But accepting that explanation, one is still stymied by the fact that all the previous warnings about btrfs being in heavy development, keep good

Re: how to clone a btrfs filesystem

2015-04-19 Thread Christoph Anton Mitterer
On Sun, 2015-04-19 at 01:02 +1000, Russell Coker wrote: An rsync on block devices wouldn't lose BTRFS checksums, you could run a scrub on the target at any time to verify them. For a dd or anything based on that the target needs to be at least as big as the source. But typical use of

incremental full file backups to smaller mediums possible?

2015-04-09 Thread Christoph Anton Mitterer
Hey. I wondered whether this is possible in btrfs (or could be implemented),... it's in a way similar to send/receive, but AFAIU not fully solvable with that. What I want to do is make incremental backups of a (btrfs) filesystem to smaller mediums (that is for example: from a big RAID

system frozen during send/receive

2015-04-18 Thread Christoph Anton Mitterer
Hi. As mentioned before on the list, I'm just playing with send/receive. The first huge disappointment (after copying already hundreds of gigabytes for hours) was that when I Ctrl-Z the sending/receiving pipe (to give the disks a little bit of rest to cool down) and resume it (fg) it

Re: how to clone a btrfs filesystem

2015-04-18 Thread Christoph Anton Mitterer
On Sat, 2015-04-18 at 10:20 -0600, Chris Murphy wrote: Make the source a seed device, add new device, delete seed. Once that completes, unmount, unset btrfs seed, and now the two devices are separate fs volumes each with unique UUID. There may still be bugs with seed device, it's been maybe 6

how to clone a btrfs filesystem

2015-04-17 Thread Christoph Anton Mitterer
Hey. I've seen that this has been asked some times before, and there are stackoverflow/etc. questions on that, but none with a really good answer. How can I best copy one btrfs filesystem (with snapshots and subvolumes) into another, especially with keeping the CoW/reflink status of all files?

Re: incremental full file backups to smaller mediums possible?

2015-04-17 Thread Christoph Anton Mitterer
On Thu, 2015-04-09 at 16:33 +, Hugo Mills wrote: btrfs sub find-new might be more helpful to you here. That will give you the list of changed files; then just feed that list to your existing bin-packing algorithm for working out what goes on which disks, and you're done. hmm that

Re: how to clone a btrfs filesystem

2015-04-17 Thread Christoph Anton Mitterer
On Sat, 2015-04-18 at 04:24 +, Russell Coker wrote: dd works. ;) There are patches to rsync that make it work on block devices. Of course that will copy space occupied by deleted files too. I think both are not quite the solutions I was looking for. Guess for dd this is obvious,

possible raid6 corruption

2015-06-01 Thread Christoph Anton Mitterer
Hi. The following is a possible corruption of a btrfs with RAID6,... it may however also be just an issue with the megasas driver or the PERC controller behind it. Anyway since RAID56 is quite new in btrfs, an expert may want to have a look at it whether it's something that needs to be focused

strange corruptions found during btrfs check

2015-07-02 Thread Christoph Anton Mitterer
Hi. This is on a btrfs created and used with a 4.0 kernel. Not much was done on it, apart from send/receive snapshots from another btrfs (with -p). Some of the older snapshots (that were used as parents before) have been removed in the meantime. Now a btrfs check gives this: # btrfs check

Re: strange corruptions found during btrfs check

2015-07-06 Thread Christoph Anton Mitterer
After removing some of the snapshots that were received, the errors at btrfs check went away. Is there some list of features in btrfs which are considered stable? Cause I thought send/receive and the subvolumes would be, but apparently this doesn't seem to be the case :-/ Cheers, Chris.

Re: strange "No space left on device"

2015-11-08 Thread Christoph Anton Mitterer
On Sun, 2015-11-08 at 20:39 +, Duncan wrote: > Wow, yes!  Good catch, Henk! =:^)  Hugo obviously didn't catch it, > and I > wouldn't have either, as the bad size detection behavior is so > unexpected, it just wouldn't occur to me to look! Hmm... all that *may* be more likely an error of

Re: bad extent [5993525264384, 5993525280768), type mismatch with chunk

2015-11-12 Thread Christoph Anton Mitterer
I've uploaded the full output of btrfs check on that device to: http://christoph.anton.mitterer.name/tmp/public/cbec6446-898b-11e5-90a4-502690aa641f.xz there are nearly 600k of these error lines... WTF?! Also, the filesystem still mounts (without any errors to dmesg) Any help would be

bad extent [5993525264384, 5993525280768), type mismatch with chunk

2015-11-12 Thread Christoph Anton Mitterer
Hey. I get these errors on fsck'ing a btrfs: bad extent [5993525264384, 5993525280768), type mismatch with chunk bad extent [5993525280768, 5993525297152), type mismatch with chunk bad extent [5993525297152, 5993525313536), type mismatch with chunk bad extent [5993529442304, 5993529458688), type

Re: bad extent [5993525264384, 5993525280768), type mismatch with chunk

2015-11-12 Thread Christoph Anton Mitterer
On Fri, 2015-11-13 at 11:23 +0800, Qu Wenruo wrote: > No, "-t 2" means only dump extent tree, no privacy issues at all. > Since only numeric inode/snapshot number and offset inside file. > Or I'll give you a warning on privacy. > > No file name at all, just try it yourself. I'm preparing it...

Re: bad extent [5993525264384, 5993525280768), type mismatch with chunk

2015-11-13 Thread Christoph Anton Mitterer
I just got the backup disk back, also btrfs, which was made via send/receive... It has the same errors during fsck. The main disk still hasn't found any file (apart from a few, others for which none of my hash sums were stored at all) that doesn't verify. So I guess there's definitely some bug

Re: bad extent [5993525264384, 5993525280768), type mismatch with chunk

2015-11-13 Thread Christoph Anton Mitterer
On Fri, 2015-11-13 at 07:05 +, Duncan wrote: > 8 TiB disks -- are those the disk-managed SMR "archive" disks I've > read > about on a number of threads? Yes... but... > If so, that hardware is almost certainly the cause, as they're known > to > be problematic on current kernels.  While most

Re: bad extent [5993525264384, 5993525280768), type mismatch with chunk

2015-11-12 Thread Christoph Anton Mitterer
Hey. On Fri, 2015-11-13 at 10:13 +0800, Qu Wenruo wrote: > Like this one, if any extent type doesn't match with its chunk, like > metadata extent in a data chunk, btrfsck will report like that. So these errors... are they anything serious? I.e. like data loss/corruption? Or is it more a

Re: bad extent [5993525264384, 5993525280768), type mismatch with chunk

2015-11-12 Thread Christoph Anton Mitterer
On Fri, 2015-11-13 at 11:23 +0800, Qu Wenruo wrote: > No, "-t 2" means only dump extent tree, no privacy issues at all. > Since only numeric inode/snapshot number and offset inside file. > Or I'll give you a warning on privacy. Done...

Re: bad extent [5993525264384, 5993525280768), type mismatch with chunk

2015-11-12 Thread Christoph Anton Mitterer
On Fri, 2015-11-13 at 10:52 +0800, Qu Wenruo wrote: > You can provide the output of "btrfs-debug-tree -t 2 " to help > further debug. > It would be quite big, so it's better to zip it. That would contain all the filenames, right? Hmm that could be problematic because of privacy issues... >

Re: bad extent [5993525264384, 5993525280768), type mismatch with chunk

2015-11-12 Thread Christoph Anton Mitterer
And I should perhaps mention one more thing: As I've said I have these two 8TiB disks... one which is basically the master with loads of precious data, the other being a backup from the master, regularly created with incremental btrfs send/receive. Every time I did this (which is every two months

Re: bad extent [5993525264384, 5993525280768), type mismatch with chunk

2015-11-13 Thread Christoph Anton Mitterer
On Sat, 2015-11-14 at 09:22 +0800, Qu Wenruo wrote: > Manually checked they all. thanks a lot :-) > Strangely, they are all OK... although it's a good news for you. Oh man... you're s mean ;-D > They are all tree blocks and are all in metadata block group. and I guess that's...

Re: bad extent [5993525264384, 5993525280768), type mismatch with chunk

2015-11-14 Thread Christoph Anton Mitterer
On Sun, 2015-11-15 at 09:29 +0800, Qu Wenruo wrote: > > > If type is wrong, all the extents inside the chunk should be > > > reported > > > as > > > mismatch type with chunk. > > Isn't that the case? At least there are so many reported extents... > > If you posted all the output Sure, I posted

Re: [PATCH 00/15] btrfs: Hot spare and Auto replace

2015-11-15 Thread Christoph Anton Mitterer
Hey. You guys may want to update: https://btrfs.wiki.kernel.org/index.php/Project_ideas#Hot_spare_support Cheers, Chris.

mkfs.btrfs doesn't detect SSD

2015-11-07 Thread Christoph Anton Mitterer
Hey. I'm creating a filesystem on Samsung Evo 850 Pro on top of a dm-crypt/LUKS container (with TRIM not being passed on, for the usual security reasons): # mkfs.btrfs --label system /dev/mapper/system btrfs-progs v4.2.2 See http://btrfs.wiki.kernel.org for more information. Label: 

Re: mkfs.btrfs doesn't detect SSD

2015-11-07 Thread Christoph Anton Mitterer
Hmm, in fact it seems to be the kernel that wrongly detects the type: /sys/block/sdb/queue/rotational = 1 or more like the USB/SATA bridge simply reports it wrong. Anyway, is there a way to override? Or will setting /sys/block/sdb/queue/rotational = 0 give the expected behaviour? Thanks, Chris.

strange "No space left on device"

2015-11-07 Thread Christoph Anton Mitterer
Hey. I just repeatedly did the following twice on a ~8GB USB stick, under Debian sid (ergo kernel 4.2.0-1-amd64, btrfsprogs 4.2.2-1). First, created some GPT on the stick: Number  Start (sector)  End (sector)  Size  Code  Name  1  2048  1048575  511.0 MiB  EF02  BIOS

Re: strange "No space left on device"

2015-11-07 Thread Christoph Anton Mitterer
On Sat, 2015-11-07 at 23:30 +, Hugo Mills wrote: >    These are all really small. Well enough for booting =) >    I would suggest running mkfs with --mixed for all of these > filesystems and trying again. I thought btrfs would do that automatically:

Re: How to detect / notify when a raid drive fails?

2015-11-27 Thread Christoph Anton Mitterer
On Fri, 2015-11-27 at 17:16 +0800, Anand Jain wrote: >   I understand as a user, a full md/lvm set of features are important >   to begin operations using btrfs and we don't have it yet. I have to >   blame it on the priority list. What would be especially nice from the admin side, would be

slowness when cp respectively send/receiving on top of dm-crypt

2015-11-27 Thread Christoph Anton Mitterer
Hey. Not sure if that's valuable input for the devs, but here's some vague real-world report about performance: I'm just copying (via send/receive) a large filesystem (~7TB) from one HDD over to another. The devices are both connected via USB3, and each of the btrfs is on top of dm-crypt. It's

Re: [PATCH] btrfs: Introduce new mount option to disable tree log replay

2015-12-08 Thread Christoph Anton Mitterer
On Tue, 2015-12-08 at 07:15 -0500, Austin S Hemmelgarn wrote: > Despite this, it really isn't a widely known or well documented > behavior > outside of developers, forensic specialists, and people who have had > to > deal with the implications it has on data recovery.  There really > isn't >

Re: [RFC] Btrfs device and pool management (wip)

2015-12-08 Thread Christoph Anton Mitterer
On Mon, 2015-11-30 at 13:17 -0700, Chris Murphy wrote: > On Mon, Nov 30, 2015 at 7:51 AM, Austin S Hemmelgarn > wrote: > > > General thoughts on this: > > 1. If there's a write error, we fail unconditionally right now.  It > > would be > > nice to have a configurable number

Re: [auto-]defrag, nodatacow - general suggestions?(was: btrfs: poor performance on deleting many large files?)

2015-12-08 Thread Christoph Anton Mitterer
Hey Hugo, On Thu, 2015-11-26 at 00:33 +, Hugo Mills wrote: >    Answering the second part first, no, it can't. Thanks so far :) >    The issue is that nodatacow bypasses the transactional nature of > the FS, making changes to live data immediately. This then means that > if you modify a

Re: [auto-]defrag, nodatacow - general suggestions?(was: btrfs: poor performance on deleting many large files?)

2015-12-08 Thread Christoph Anton Mitterer
On 2015-11-27 00:08, Duncan wrote: > Christoph Anton Mitterer posted on Thu, 26 Nov 2015 01:23:59 +0100 as > excerpted: >> 1) AFAIU, the fragmentation problem exists especially for those files >> that see many random writes, especially, but not limited to, big files. >> No

Re: subvols and parents - how?

2015-12-08 Thread Christoph Anton Mitterer
On Fri, 2015-11-27 at 02:02 +, Duncan wrote: > Uhm, I don't get the big security advantage here... whether nested > > or > > manually mounted to a subdir,... if the permissions are insecure > > I'll > > have a problem... if they're secure, then not. > Consider a setuid-root binary with a

Re: attacking btrfs filesystems via UUID collisions?

2015-12-08 Thread Christoph Anton Mitterer
On Sun, 2015-12-06 at 22:34 +0800, Qu Wenruo wrote: > Not sure about LVM/MD, but they should suffer the same UUID conflict > problem. Well I had that actually quite often in LVM (i.e. same UUIDs visible on the same system), basically because we made clones from one template VM image and when that

Re: kernel call trace during send/receive

2015-12-08 Thread Christoph Anton Mitterer
Hey. Hmm I guess no one has any clue about that error? Well it seems at least that an fsck over the receiving fs passes through without any error. Cheers, Chris. On Fri, 2015-11-27 at 02:49 +0100, Christoph Anton Mitterer wrote: > Hey. > > Just got the following during send/receiv

Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?)

2015-12-08 Thread Christoph Anton Mitterer
On Sun, 2015-12-06 at 04:06 +, Duncan wrote: > There's actually a number of USB-based hardware and software vulns > out > there, from the under $10 common-component-capacitor-based charge- > and-zap > (charges off the 5V USB line, zaps the port with several hundred > volts > reverse-polarity,

Re: subvols and parents - how?

2015-12-08 Thread Christoph Anton Mitterer
On Fri, 2015-11-27 at 01:02 +, Duncan wrote: [snip snap] > #1 could be a pain to setup if you weren't actually mounting it > previously, just relying on the nested tree, AND... > > #2 The point I was trying to make, now, to mount it you'll mount not > a > native nested subvol, and not a

Re: Subvolume UUID, data corruption?

2015-12-04 Thread Christoph Anton Mitterer
On Fri, 2015-12-04 at 13:07 +, Hugo Mills wrote: > I don't think it'll cause problems. Is there any guaranteed behaviour when btrfs encounters two filesystems (i.e. not talking about the subvols now) with the same UUID? Given that it's long standing behaviour that people could clone

Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?)

2015-12-04 Thread Christoph Anton Mitterer
Thinking a bit more about that, I came to the conclusion that it's actually security relevant that btrfs deals gracefully with filesystems having the same UUID: Getting to know someone else's filesystem's UUID may be more easily possible than one may think. It's usually not considered secret and

Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?)

2015-12-05 Thread Christoph Anton Mitterer
On Sat, 2015-12-05 at 13:19 +, Duncan wrote: > The problem with btrfs is that because (unlike traditional > filesystems) > it's multi-device, it needs some way to identify what devices belong > to a > particular filesystem. Sure, but that applies to lvm, or MD as well... and I wouldn't know

Re: attacking btrfs filesystems via UUID collisions? (was: Subvolume UUID, data corruption?)

2015-12-05 Thread Christoph Anton Mitterer
On Sat, 2015-12-05 at 12:01 +, Hugo Mills wrote: > On Sat, Dec 05, 2015 at 04:28:24AM +0100, Christoph Anton Mitterer > wrote: > > On Fri, 2015-12-04 at 13:07 +, Hugo Mills wrote: > > > I don't think it'll cause problems. > > Is there any guaranteed behaviou

Re: btrfs crashing the kernel with Seagate 8TB SMR drives.

2015-12-03 Thread Christoph Anton Mitterer
Any chance that this is: https://bugzilla.kernel.org/show_bug.cgi?id=93581 Cheers, Chris.

Re: [PATCH] btrfs: Introduce new mount option to disable tree log replay

2015-12-07 Thread Christoph Anton Mitterer
On Mon, 2015-12-07 at 11:29 -0600, Eric Sandeen wrote: > FWIW, new mount options and their descriptions should be added to > BTRFS-MOUNT(5) > as well. Also, from the end-user perspective, there should be: 1) another option like (hard-ro) which is defined to imply any other options that are

Re: [PATCH] btrfs: Introduce new mount option to disable tree log replay

2015-12-07 Thread Christoph Anton Mitterer
On Mon, 2015-12-07 at 17:06 -0600, Eric Sandeen wrote: > Yeah, I don't know that this is true.  It hasn't been true for over a > decade (2?), with the most widely-used filesystem in linux history, > i.e. > ext3. Based on what? I know many sysadmins who don't expect that e.g. the journal is

Re: subvols, ro- and bind mounts - how?

2015-12-10 Thread Christoph Anton Mitterer
Hey. I'd have an additional question about subvols O:-) Given the following setup: 5 | +--root (subvol, /)    +-- mnt (dir) with the following done: - init 1 - remount,ro / (i.e. the subvol root) - mount /dev/btrfs-device /mnt (i.e. mount the top subvol at /mnt) The following happened: - / was

Re: attacking btrfs filesystems via UUID collisions?

2015-12-11 Thread Christoph Anton Mitterer
Sorry, I'm just about to change my mail system, and used a bogus test From: address in the previous mail (please replace fo@fo with cales...@scientia.net). Apologies for any inconvenience and for the noise here. Cheers, Chris.

Re: attacking btrfs filesystems via UUID collisions?

2015-12-11 Thread Christoph Anton Mitterer
On Thu, 2015-12-10 at 12:42 -0700, Chris Murphy wrote: > That isn't what I'm suggesting. In the multiple device volume case > where there are two exact (same UUID, same devid, same generation) > instances of one of the block devices, Btrfs could randomly choose > either one if it's an RO mount.

Re: attacking btrfs filesystems via UUID collisions?

2015-12-11 Thread Christoph Anton Mitterer
On Wed, 2015-12-09 at 22:48 +0100, S.J. wrote: > > 3. Some way to fail gracefully, when there's ambiguity that cannot > > be > > resolved. Once there are duplicate devs (dd or lvm snapshots, etc) > > then there's simply no way to resolve the ambiguity automatically, > > and > > the volume should

Re: [auto-]defrag, nodatacow - general suggestions?(was: btrfs: poor performance on deleting many large files?)

2015-12-16 Thread Christoph Anton Mitterer
On Wed, 2015-12-09 at 16:36 +, Duncan wrote: > But... as I've pointed out in other replies, in many cases including > this > specific one (bittorrent), applications have already had to develop > their > own integrity management features Well let's move discussion upon that into the "dear

Re: btrfs: poor performance on deleting many large files

2015-12-16 Thread Christoph Anton Mitterer
On Sun, 2015-12-13 at 07:10 +, Duncan wrote: > > So you basically mean that ro snapshots won't have their atime > > updated > > even without noatime? > > Well I guess that was anyway the recent behaviour of Linux > > filesystems, > > and only very old UNIX systems updated the atime even when

Re: [auto-]defrag, nodatacow - general suggestions?(was: btrfs: poor performance on deleting many large files?)

2015-12-16 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 10:51 +, Duncan wrote: > > AFAIU, the one that gets fragmented then is the snapshot, right, > > and the > > "original" will stay in place where it was? (Which is of course > > good, > > because one probably marked it nodatacow, to avoid that > > fragmentation > > problem

Re: [PATCH v3] btrfs: Introduce new mount option to disable tree log replay

2015-12-16 Thread Christoph Anton Mitterer
On Thu, 2015-12-17 at 01:09 +, Duncan wrote: > Well, "don't load the journal on mounting" is exactly what the option > would do.  The journal (aka log) of course has a slightly different > meaning, it's only the fsync log, but loading it is exactly what the > option would prevent, here.

Re: dear developers, can we have notdatacow + checksumming, plz?

2015-12-16 Thread Christoph Anton Mitterer
On Tue, 2015-12-15 at 11:00 -0500, Austin S. Hemmelgarn wrote: > > Well sure, I think we've done most of this and have dedicated > > controllers, at least of a quality that funding allows us ;-) > > But regardless how much one tunes, and how good the hardware is. If > > you'd then lose always a

Re: attacking btrfs filesystems via UUID collisions?

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 14:26 -0700, Chris Murphy wrote: > The automobile is invented and due to the ensuing chaos, common > practice of doing whatever the F you wanted came to an end in favor > of > rules of the road and traffic lights. I'm sure some people went > ballistic, but for the most part

Re: dear developers, can we have notdatacow + checksumming, plz?

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 17:42 +1100, Russell Coker wrote: > My understanding of BTRFS is that the metadata referencing data > blocks has the > checksums for those blocks, then the blocks which link to that > metadata (EG > directory entries referencing file metadata) has checksums of those. You

Re: [PATCH v3] btrfs: Introduce new mount option to disable tree log replay

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 15:20 -0500, Austin S. Hemmelgarn wrote: > On 2015-12-14 14:44, Christoph Anton Mitterer wrote: > > On Mon, 2015-12-14 at 14:33 -0500, Austin S. Hemmelgarn wrote: > > > The traditional reasoning was that read-only meant that users > > > cou

Re: attacking btrfs filesystems via UUID collisions?

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 13:55 -0700, Chris Murphy wrote: > I'm aware of this proof of concept. I'd put it, and this one, in the > realm of a targeted attack, so it's not nearly as likely as other > problems needing fixing. That doesn't mean don't understand it better > so it can be fixed. It means

Re: attacking btrfs filesystems via UUID collisions?

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 08:23 -0500, Austin S. Hemmelgarn wrote: > The reason that this isn't quite as high of a concern is because > performing this attack requires either root access, or direct > physical > access to the hardware, and in either case, your system is already > compromised. No

Re: dear developers, can we have notdatacow + checksumming, plz?

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 09:16 -0500, Austin S. Hemmelgarn wrote: > > When one starts to get a bit deeper into btrfs (from the admin/end- > > user > > side) one sooner or later stumbles across the recommendation/need > > to > > use nodatacow for certain types of data (DBs, VM images, etc.) and > >

Re: [auto-]defrag, nodatacow - general suggestions?(was: btrfs: poor performance on deleting many large files?)

2015-12-13 Thread Christoph Anton Mitterer
On Wed, 2015-12-09 at 13:36 +, Duncan wrote: > Answering the BTW first, not to my knowledge, and I'd be > skeptical.  In > general, btrfs is cowed, and that's the focus.  To the extent that > nocow > is necessary for fragmentation/performance reasons, etc, the idea is > to > try to make cow

Re: attacking btrfs filesystems via UUID collisions?

2015-12-13 Thread Christoph Anton Mitterer
On Fri, 2015-12-11 at 16:06 -0700, Chris Murphy wrote: > For anything but a new and empty Btrfs volume What's the influence of the fs being new/empty? > this hypothetical > attack would be a ton easier to do on LVM and mdadm raid because they > have a tiny amount of metadata to spoof compared to

Re: attacking btrfs filesystems via UUID collisions?

2015-12-13 Thread Christoph Anton Mitterer
On Sat, 2015-12-12 at 02:34 +0100, S.J. wrote: > A bit more about the dd-is-bad-topic: > > IMHO it doesn't matter at all. Yes, fully agree. > a) For this specific problem here, fixing a security problem > automatically > fixes the risk of data corruption because careless cloning+mounting >

Re: [auto-]defrag, nodatacow - general suggestions?(was: btrfs: poor performance on deleting many large files?)

2015-12-13 Thread Christoph Anton Mitterer
Two more on these: On Thu, 2015-11-26 at 00:33 +, Hugo Mills wrote: > 3) When I would actually disable datacow for e.g. a subvolume that > > holds VMs or DBs... what are all the implications? > > Obviously no checksumming, but what happens if I snapshot such a > > subvolume or if I

Re: subvols and parents - how?

2015-12-12 Thread Christoph Anton Mitterer
On Wed, 2015-12-09 at 10:53 +, Duncan wrote: > If you use the recipe (subvol create, cp with reflink) it suggests > there, > you'll end up with the reflinked copy in a subvol. > > You can then mount that subvol over top of the existing dir, and > *new* > file opens will access the new

Re: subvols, ro- and bind mounts - how?

2015-12-12 Thread Christoph Anton Mitterer
On Thu, 2015-12-10 at 19:32 -0700, Chris Murphy wrote: > That seems due for a revision because I do rw, ro, rw, rw, ro mounts > in sequence and they stick fine. In fact they stick with the same > subvolume. > > [root@f23m ]# mount /dev/sda7 /mnt/1 -o subvol=home > [root@f23m ]# mount /dev/sda7

Re: Will "btrfs check --repair" fix the mounting problem?

2015-12-12 Thread Christoph Anton Mitterer
On Sat, 2015-12-12 at 13:16 -0700, Chris Murphy wrote: > > What is the better way to get data? send/receive works only with RO > > snapshots. Is there another way to preserve subvolumes and CoW > > structure (a lot of files was copied between subvols using "cp > > --reflink=always")? Or just

dear developers, can we have notdatacow + checksumming, plz?

2015-12-13 Thread Christoph Anton Mitterer
(consider that question being asked with that face on: http://goo.gl/LQaOuA) Hey. I've had some discussions on the list these days about not having checksumming with nodatacow (mostly with Hugo and Duncan). They both basically told me it wouldn't be straight possible with CoW, and Duncan thinks

Re: [PATCH v3] btrfs: Introduce new mount option to disable tree log replay

2015-12-16 Thread Christoph Anton Mitterer
On Wed, 2015-12-16 at 11:10 +0000, Duncan wrote: > And noload doesn't have the namespace collision problem norecovery > does > on btrfs, so I'd strongly suggest using it, at least as an alias for > whatever other btrfs-specific name we might choose. but noload is, AFAIU, not what's desired

Re: attacking btrfs filesystems via UUID collisions?

2015-12-16 Thread Christoph Anton Mitterer
On Tue, 2015-12-15 at 08:54 -0500, Austin S. Hemmelgarn wrote: > Except for one thing:  Automobiles actually provide a measurable > significant benefit to society.  What specific benefit does embedding > the filesystem UUID in the metadata actually provide? I guess that's quite obvious. You want

Re: attacking btrfs filesystems via UUID collisions?

2015-12-16 Thread Christoph Anton Mitterer
On Tue, 2015-12-15 at 14:18 +0000, Hugo Mills wrote: >    That one's easy to answer. It deals with a major issue that > reiserfs had: if you have a filesystem with another filesystem image > stored on it, reiserfsck could end up deciding that both the metadata > blocks of the main filesystem *and*

Re: attacking btrfs filesystems via UUID collisions?

2015-12-16 Thread Christoph Anton Mitterer
On Tue, 2015-12-15 at 11:03 -0500, Austin S. Hemmelgarn wrote: > May I propose the alternative option of adding a flag to tell mount > to > _only_ use the devices specified in the options? That's one part of exactly what I propose since a few days :-P (no one seems to read my mails ;-) ) Plus

Re: attacking btrfs filesystems via UUID collisions?

2015-12-16 Thread Christoph Anton Mitterer
On Wed, 2015-12-16 at 09:41 -0500, Chris Mason wrote: > Hugo is right here.  reiserfs had tools that would scan an entire > block > device for metadata blocks and try to reconstruct the filesystem > based > on what it found. Creepy... at least when talking about a "normal" fsck... good that btrfs

Re: attacking btrfs filesystems via UUID collisions?

2015-12-16 Thread Christoph Anton Mitterer
On Tue, 2015-12-15 at 14:42 +0000, Hugo Mills wrote: >    I would suggest trying to migrate to a state where detecting more > than one device with the same UUID and devid is cause to prevent the > FS from mounting, unless there's also a "mount_duplicates_yes_i_ >

Re: [PATCH v3] btrfs: Introduce new mount option to disable tree log replay

2015-12-16 Thread Christoph Anton Mitterer
On Wed, 2015-12-16 at 07:12 -0500, Austin S. Hemmelgarn wrote: > I kind of agree with Christoph here.  I don't think that noload > should > be the what we actually use, although I do think having it as an > alias > for whatever name we end up using would be a good thing. No, because people would

Re: attacking btrfs filesystems via UUID collisions?

2015-12-16 Thread Christoph Anton Mitterer
On Tue, 2015-12-15 at 09:19 -0500, Austin S. Hemmelgarn wrote: > Um, no you don't have direct physical access to the hardware with an > ATM, at least, not unless you are going to take apart the cover and > anything else in your way (and probably set off internal alarms). Well access to the

Re: [PATCH v3] btrfs: Introduce new mount option to disable tree log replay

2015-12-16 Thread Christoph Anton Mitterer
On Wed, 2015-12-16 at 07:57 -0500, Austin S. Hemmelgarn wrote: > No, because we should ease the transition from other filesystems to > the > greatest extent reasonably possible.  It should be clearly documented > as > an alias for compatibility with ext{3,4}, and that it might go away > in >

Re: subvols, ro- and bind mounts - how?

2015-12-10 Thread Christoph Anton Mitterer
On Thu, 2015-12-10 at 23:36 +0100, S.J. wrote: > Quote: > > " Most mount options apply to the whole filesystem, and only the > options > for the first subvolume > to be mounted will take effect. This is due to lack of implementation > and may change in the future. " > > from

Re: btrfs: poor performance on deleting many large files

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 09:24 -0500, Austin S. Hemmelgarn wrote: > Unless things have changed very recently, even many modern systems > update atime on read-only filesystems, unless the media itself is > read-only. Seriously? Oh... *sigh*... You mean as in Linux, ext*, xfs? > If you have software

Re: [PATCH v3] btrfs: Introduce new mount option to disable tree log replay

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 18:32 +0100, David Sterba wrote: > I've read the discussions around the change and from the user's POV > I'd > suggest to add another mount option that would be just an alias for > any > mount options that would implement the 'hard-ro' semantics. Nice to hear...  > Say it's

Re: [PATCH v3] btrfs: Introduce new mount option to disable tree log replay

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 12:50 -0500, Austin S. Hemmelgarn wrote: > It should also imply noatime.  I'm not sure how BTRFS handles atime > when > mounted RO, but I know a lot of old UNIX systems updated atime even > on > filesystems mounted RO, and I know that at least at one point Linux > did too.

Re: [PATCH v3] btrfs: Introduce new mount option to disable tree log replay

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 14:33 -0500, Austin S. Hemmelgarn wrote: > The traditional reasoning was that read-only meant that users > couldn't > change anything Where I'd however count the atime changes to. The atimes wouldn't change magically, but only because the user started some program, configured

Re: btrfs: poor performance on deleting many large files

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 15:27 -0500, Austin S. Hemmelgarn wrote: > On 2015-12-14 14:39, Christoph Anton Mitterer wrote: > > On Mon, 2015-12-14 at 09:24 -0500, Austin S. Hemmelgarn wrote: > > > Unless things have changed very recently, even many modern > > > systems >

project idea: per-object default mount-options / more btrfs-properties / chattr attributes (was: btrfs: poor performance on deleting many large files)

2015-12-14 Thread Christoph Anton Mitterer
Just FYI: On Mon, 2015-12-14 at 15:27 -0500, Austin S. Hemmelgarn wrote: > > My idea would be basically, that having a noatime btrfs-property, > > which > > is perhaps even set automatically, would be an elegant way of doing > > that. > > I just haven't had time to properly write that up and add

Re: btrfs: poor performance on deleting many large files

2015-12-14 Thread Christoph Anton Mitterer
On Mon, 2015-12-14 at 22:30 +0100, Lionel Bouton wrote: > Mutt is often used as an example but tmpwatch uses atime by default > too > and it's quite useful. Hmm one could probably argue that these few cases justify the use of separate filesystems (or btrfs subvols ;) ), so that the majority could
