Re: BTRFS messes up snapshot LV with origin
On 11/28/2014 11:35 PM, Goffredo Baroncelli wrote:
> I agree with you; but I have to find a default so during the boot a system can start even if snapshots are present.

No, you really _don't_ need to find such a default. Better a system that doesn't boot than one that boots based on a guess.

I've been spending a lot of time thinking about booting while writing underdog (http://underdog.sourceforge.net), and while booting is fragile, an even partially incorrect boot is a system and _security_ nightmare.

If you start making preferential guesses then an intruder could trick the system into booting from a thumb-drive or other alternate media by coercing a UUID collision in a way that makes the system pick the new media.

Conflicts should _never_ be guessed at during boot. Ever.

--
To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
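To make the "never guess" rule above concrete, here is a minimal sketch of a check that refuses to proceed when two block devices report the same UUID. The function name find_duplicate_uuids and the two-column "DEVICE UUID" input format are assumptions made for illustration; this is not part of any existing boot tool.

```shell
#!/bin/sh
# Hypothetical helper: reads "DEVICE UUID" pairs on stdin, prints each UUID
# that is claimed by more than one device, and returns non-zero if any
# such conflict exists. Lines with no UUID column are ignored.
find_duplicate_uuids() {
    awk '
        NF >= 2 { count[$2]++; devs[$2] = devs[$2] " " $1 }
        END {
            status = 0
            for (u in count)
                if (count[u] > 1) { print u ":" devs[u]; status = 1 }
            exit status
        }'
}

# Possible use on a running system (lsblk prints the NAME and UUID columns):
#   lsblk -rno NAME,UUID | find_duplicate_uuids || echo "conflict: refusing to guess" >&2
```

Note that the members of a multi-device btrfs filesystem legitimately share one filesystem UUID, so a real check would have to tell that case apart from a snapshot or a cloned disk. The sketch only shows the "stop on conflict" behaviour.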
Re: BTRFS messes up snapshot LV with origin
On 11/28/2014 11:29 PM, Duncan wrote:
> Since I can't/won't run pretty much anything proprietary, there's little chance of it being taken as anything but Linux, here. (Tho I actually use (c)gdisk for partitioning here and it appears to use a different GUID. (0700 in its short form which AFAIK is gdisk specific, for MS basic data, while it uses 8300 for general Linux filesystems. I could look up the long form GUIDs, but meh...)

Partition type codes (e.g. 0700, 8300, EF00, etc) have _nothing_ to do with UUIDs. They are type codes. They aren't short form of anything else at all. In fact 0700 is the _long_ _form_ of the original code of 7, but in big-endian order now that it went from one byte to two.

Microsoft started using pre-assigned UUIDs as classes, e.g. type codes they could cram into their various registry files. If you actually read the registry you'll find a lot of places where a rational word is defined as {some_uuid_here} and then elsewhere {some_uuid_here} has a bunch of data items attached to it.

So gparted didn't reuse microsoft UUIDs.

In some/many of the older formats there was a code for "operating system data" (which I think is what 7 was originally). Others came by and said "since we're going to put in a type code for linux swap (82) then let's put in a code for linux data as well (83)", and all this before the whole byte expansion to turn these things from bytes into two-byte words.

Once everybody else picked their own type codes for their data partitions, everybody just started calling 7 "microsoft data". And linux doesn't care at all since it's noise, since every partition just ends up as /dev/[sh]d? anyway.

All this stuff has historical reasons. GNU/Linux attempts to be an egalitarian actor so it adapts to whatever you do.
Re: BTRFS messes up snapshot LV with origin
Robert White posted on Sat, 29 Nov 2014 00:20:11 -0800 as excerpted:

> On 11/28/2014 11:29 PM, Duncan wrote:
>> (Tho I actually use (c)gdisk for partitioning here and it appears to use a different GUID. (0700 in its short form which AFAIK is gdisk specific, for MS basic data, while it uses 8300 for general Linux filesystems. I could look up the long form GUIDs, but meh...)
>
> Partition type codes (e.g. 0700, 8300, EF00, etc) have _nothing_ to do with UUIDs. They are type codes. They aren't short form of anything else at all. In fact 0700 is the _long_ _form_ of the original code of 7, but in big-endian order now that it went from one byte to two.

You obviously know where the short forms originated (MBR type codes), but you haven't the foggiest what you're talking about in relation to gdisk, where they're used as 4-hex-char entry shortcuts for the similar GPT/EFI GUIDs.

Now that's what I expected with the mention of a different partition editor, thus my mention that they were shortcuts for GUIDs, apparently gdisk specific, but in gdisk they certainly ARE shortcuts to the various GUIDs and you certainly do *NOT* know what you're talking about saying they are not even related.

From the gdisk (8) manpage entry for the l/list action:

    l   Display a summary of partition types. GPT uses a GUID to identify partition types for particular OSes and purposes. For ease of data entry, gdisk compresses these into two-byte (four-digit hexadecimal) values that are related to their equivalent MBR codes. Specifically, the MBR code is multiplied by hexadecimal 0x0100. For instance, the code for Linux swap space in MBR is 0x82, and it's 0x8200 in gdisk. A one-to-one correspondence is impossible, though. Most notably, the codes for all varieties of FAT and NTFS partition correspond to a single GPT code (entered as 0x0700 in sgdisk). Some OSes use a single MBR code but employ many more codes in GPT. For these, gdisk adds code numbers sequentially, such as 0xa500 for a FreeBSD disklabel, 0xa501 for FreeBSD boot, 0xa502 for FreeBSD swap, and so on. Note that these two-byte codes are unique to gdisk.

See also the gdisk home page:
http://www.rodsbooks.com/gdisk/

In particular, see the gdisk walkthru here:
http://www.rodsbooks.com/gdisk/walkthrough.html

... and the gdisk manpage I quoted above here:
http://www.rodsbooks.com/gdisk/gdisk.html

So as I said, gdisk uses a 4-hexit short code based on the legacy MBR type-code as an easy entry and display form referencing the longer and much less human readable GUIDs, just like I said, and such usage is gdisk specific, just like I said I thought it was.

And you might have known the legacy MBR type-codes from which they were derived, but obviously you had no idea what I was talking about here, and despite my saying it was gdisk specific you decided to simply claim I didn't know what I was talking about without actually checking the situation, despite my telling you exactly what app I was referring to and that I thought those references were app-specific, giving you plenty of chance to actually look it up yourself if you decided to, or simply not argue that point if you weren't interested in checking out the app-specific stuff. =:^(

> Microsoft started using pre-assigned UUIDs as classes, e.g. type codes they could cram into their various registry files. If you actually read the registry you'll find a lot of places where a rational word is defined as {some_uuid_here} and then elsewhere {some_uuid_here} has a bunch of data items attached to it.

FWIW I know about the MS registry stuff from actually doing MS-registry and API related programming (hobbyist/VB level but using the regular API, not just the VB exposed stuff) back before the turn of the century. I've not touched it in nearing a decade and a half now and my knowledge is consequently dated 9x vintage, but it obviously had the registry and I used to be /quite/ familiar with it, including of course the UUIDs.

> So gparted didn't reuse microsoft UUIDs.
>
> In some/many of the older formats there was a code for "operating system data" (which I think is what 7 was originally). Others came by and said "since we're going to put in a type code for linux swap (82) then let's put in a code for linux data as well (83)", and all this before the whole byte expansion to turn these things from bytes into two-byte words.
>
> Once everybody else picked their own type codes for their data partitions, everybody just started calling 7 "microsoft data". And linux doesn't care at all since it's noise, since every partition just ends up as /dev/[sh]d? anyway.
>
> All this stuff has historical reasons. GNU/Linux attempts to be an egalitarian actor so it adapts to whatever you do.

This part I have no disagreement with...

--
Duncan - List replies preferred. No HTML msgs.
Every nonfree program
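The manpage rule quoted above (MBR code multiplied by hexadecimal 0x0100) is simple enough to check by hand. The sketch below only illustrates that arithmetic; the function name mbr_to_gdisk_code is made up for this example and is not a gdisk command. As the manpage says, the mapping is not one-to-one, so codes such as 0xa501 cannot be derived this way.

```shell
#!/bin/sh
# Hypothetical helper: turn a two-hex-digit MBR type code (e.g. 82) into the
# four-hex-digit short code that gdisk derives from it, by multiplying by 0x100.
mbr_to_gdisk_code() {
    printf '%04X\n' $(( 0x$1 * 0x100 ))
}

# Example calls:
#   mbr_to_gdisk_code 82    # Linux swap
#   mbr_to_gdisk_code 07    # FAT/NTFS family
```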
Moving contents from one subvol to another
Hello. I am now taking the first steps to making my backup external HDD in BtrFS. From http://askubuntu.com/questions/119014/btrfs-subvolumes-vs-folders I understand that the only difference between subvolumes and ordinary folders is that the former can be snapshotted and independently mounted.

But I have a question. I have two subvols test1, test2.

    $ cd test1
    $ dd if=/dev/urandom of=file bs=1M count=500
    500+0 records in
    500+0 records out
    524288000 bytes (524 MB) copied, 36.2291 s, 14.5 MB/s
    $ time mv file ../test2/

    real    0m2.061s
    user    0m0.013s
    sys     0m0.459s
    $ time { cp --reflink ../test2/file . ; rm ../test2/file ; }

    real    0m0.677s
    user    0m0.022s
    sys     0m0.086s
    $ mkdir foo
    $ time mv file foo/

    real    0m0.096s
    user    0m0.008s
    sys     0m0.013s

It seems that mv is not CoW aware and hence is not able to create reflinks, so it is actually processing the entire file because it thinks test2 is a different device/filesystem/partition or such. Is this understanding correct?

So doing cp --reflink with rm is much faster. But it is still slower than doing mv within the same subvol. Is it because of the housekeeping with updating the metadata of the two subvols?

Methinks a --reflink option should be added to mv for the above use case. Do people think this is useful? Why or why not?

My concern is that if somebody wants to consolidate two subvols into one, though really only the metadata needs to be processed, ordinary mv isn't aware of this, and using cp --reflink with rm is unnecessarily complicated, especially if it will involve multiple files.

And it's not clear to me what it would entail to cp --reflink + rm an entire directory tree because IIUC I'd have to handle each file separately. Perhaps something (unnecessarily convoluted) like:

    # target must lie outside the tree being walked
    find . | while IFS= read -r f
    do
        # recreate each directory and carry over its timestamp
        [ -d "$f" ] && { mkdir -p "target/$f" ; touch "target/$f" -r "$f" ; }
        # reflink-copy each regular file, then drop the original
        [ -f "$f" ] && cp -a --reflink "$f" "target/$f" && rm "$f"
    done

Again, what would happen to files which are not regular directories or files?

And why isn't --reflink given a single letter alias for cp?

--
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
Re: Moving contents from one subvol to another
On Sat, Nov 29, 2014 at 07:51:07PM +0530, Shriramana Sharma wrote:
> Hello. I am now taking the first steps to making my backup external HDD in BtrFS. From http://askubuntu.com/questions/119014/btrfs-subvolumes-vs-folders I understand that the only difference between subvolumes and ordinary folders is that the former can be snapshotted and independently mounted.
>
> But I have a question. I have two subvols test1, test2.
>
>     $ cd test1
>     $ dd if=/dev/urandom of=file bs=1M count=500
>     500+0 records in
>     500+0 records out
>     524288000 bytes (524 MB) copied, 36.2291 s, 14.5 MB/s
>     $ time mv file ../test2/
>
>     real    0m2.061s
>     user    0m0.013s
>     sys     0m0.459s
>     $ time { cp --reflink ../test2/file . ; rm ../test2/file ; }
>
>     real    0m0.677s
>     user    0m0.022s
>     sys     0m0.086s
>     $ mkdir foo
>     $ time mv file foo/
>
>     real    0m0.096s
>     user    0m0.008s
>     sys     0m0.013s
>
> It seems that mv is not CoW aware and hence is not able to create reflinks, so it is actually processing the entire file because it thinks test2 is a different device/filesystem/partition or such. Is this understanding correct?

The latest version of mv should be able to use CoW copies to make it more efficient. It has a --reflink option, the same as cp. Note that you can't make reflinks crossing a mount boundary, but you can do so crossing a subvolume boundary (as you're doing here).

> So doing cp --reflink with rm is much faster. But it is still slower than doing mv within the same subvol. Is it because of the housekeeping with updating the metadata of the two subvols?

I should think so, yes.

> Methinks a --reflink option should be added to mv for the above use case. Do people think this is useful? Why or why not?

See above: it already has been. :)

> My concern is that if somebody wants to consolidate two subvols into one, though really only the metadata needs to be processed, ordinary mv isn't aware of this, and using cp --reflink with rm is unnecessarily complicated, especially if it will involve multiple files.
>
> And it's not clear to me what it would entail to cp --reflink + rm an entire directory tree because IIUC I'd have to handle each file separately. Perhaps something (unnecessarily convoluted) like:
>
>     find . | while read f
>     do
>         [ -d $f ] && { mkdir target/$f ; touch target/$f -r $f ; }
>         [ -f $f ] && { cp -a --reflink $f target/ ; rm $f ; }
>     done
>
> Again, what would happen to files which are not regular directories or files?

Probably just the same thing that would happen without the --reflink=always.

> And why isn't --reflink given a single letter alias for cp?

I don't know about that; you'll have to ask the coreutils developers. They're probably expecting it to be largely set to a single value by default (e.g. through a shell alias).

Hugo.

--
Hugo Mills             | I will not be pushed, filed, stamped, indexed,
hugo@... carfax.org.uk | briefed, debriefed or numbered.
http://carfax.org.uk/  | My life is my own.
PGP: 65E74AC0          | Number 6, "The Prisoner"
Re: Moving contents from one subvol to another
On Sat, Nov 29, 2014 at 7:58 PM, Hugo Mills h...@carfax.org.uk wrote:
> The latest version of mv should be able to use CoW copies to make it more efficient. It has a --reflink option, the same as cp. Note that you can't make reflinks crossing a mount boundary, but you can do so crossing a subvolume boundary (as you're doing here).

Hi, thanks for this. I suppose you are referring to the commit:

http://git.savannah.gnu.org/cgit/coreutils.git/commit/?id=b47231be6813e6cb5305266e391b4bb745f27f13

From http://git.savannah.gnu.org/cgit/coreutils.git/log/?qt=grep&q=mv%3A, http://git.savannah.gnu.org/cgit/coreutils.git/plain/NEWS?id=b47231be6813e6cb5305266e391b4bb745f27f13 and finally http://git.savannah.gnu.org/cgit/coreutils.git/tree/src/mv.c?id=b47231be6813e6cb5305266e391b4bb745f27f13 it doesn't seem as if there was any earlier commit actually adding a --reflink option to mv, so it seems the improvement is built in rather than exposed as an option. That's nice to know! Any idea when the next coreutils point release with this will be out?

--
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
Re: BTRFS messes up snapshot LV with origin
On 11/29/2014 01:41 AM, Duncan wrote:
> Robert White posted on Sat, 29 Nov 2014 00:20:11 -0800 as excerpted:
>
>     l   Display a summary of partition types. GPT uses a GUID to identify partition types for particular OSes and purposes. For ease of data entry, gdisk compresses these into two-byte (four-digit hexadecimal) values that are related to their equivalent MBR codes. Specifically, the MBR code is multiplied by hexadecimal 0x0100.

That EFI uses GUIDs is one thing. That the standard allows these to be selected based on type codes originally derived from ms-dos partition type codes ("compressed" is the wrong word) is something else.

If they were compressed then it would be a relationship that could represent any GUID at all. It's marginally hashed, in that there is a table lookup, but it's not properly a hash, as the hash function is undefined for virtually all possible input values.

The other partition GUID is actually more interesting.

> So as I said, gdisk uses a 4-hexit short code based on the legacy MBR type-code as an easy entry and display form referencing the longer and much less human readable GUIDs, just like I said, and such usage is gdisk specific, just like I said I thought it was.

Which is not what you said. None of the above was mentioned in the email to which I responded. What you actually said ::

[QUOTE]
Since I can't/won't run pretty much anything proprietary, there's little chance of it being taken as anything but Linux, here. (Tho I actually use (c)gdisk for partitioning here and it appears to use a different GUID. (0700 in its short form which AFAIK is gdisk specific, for MS basic data, while it uses 8300 for general Linux filesystems. I could look up the long form GUIDs, but meh...)
[/QUOTE]

None of which is gdisk specific, and all of which is based on EFI and the GUID partition table.

What I mistakenly attributed to you and was key to my initial response was your extension of Chris Murphy:

> Chris Murphy posted on Fri, 28 Nov 2014 00:10:40 -0700 as excerpted:
>
> A very good example of WTF reusage of a UUID that irks me to no end is GNU parted devs decided to recycle the Microsoft Windows Basic Data partition type GUID for Linux partitions.
>
> It's like watching someone get run over by a zamboni with 50 feet of advance notice...

[So my bad there on the quoting...]

The irking there being dumb because the universally used type GUID has nothing to do with the second GUID that universally identifies the partition regardless of type.

But here is the thing... for all the screed about open and closed source... (and I am an open source guy myself)

The actual EFI standard dictates these partition numbers and whatnot, so if you used the microsoft tools you'd get the same results.

http://en.wikipedia.org/wiki/GUID_Partition_Table#Partition_type_GUIDs

AND microsoft was one of several principal players in the EFI and its GUID partition subparts.

So his being "irked to no end" and your agreement and "that's why I used gdisk" response are both completely misplaced, and potentially misleading to others.

I just went a little off the rails while trying to explain. /D'oh.
Re: BTRFS messes up snapshot LV with origin
To those reading along who don't already know. My explanation below is factually inadequate or wrong in various places...

The "type codes" as presented in the various EFI/GUID disk partitioning tools as 0700, 8200, 8300, EF02, and so on are never written to disk as such. They are short-hand values (chosen to be deliberately similar to the MS-DOS partitioning type codes of 07, 82, 83, etc) to select standardized GUIDs for the partition type field.

So there is the two-digit code from the ms-dos partitioning scheme, then there are the four-digit codes that let you select which type GUID will be written in an EFI partition scheme.

The question of "reuse" is still improper as the type codes were assigned by the EFI standard for specific use as type codes. The EFI tool used (gdisk, or windows disk partitioning tool, etc) is immaterial as the result codes are selected by standard.

I could have, and should have, been _way_ more clear, and/or less wrong. 8-)

http://en.wikipedia.org/wiki/GUID_Partition_Table#Partition_type_GUIDs

On 11/29/2014 12:20 AM, Robert White wrote:
> On 11/28/2014 11:29 PM, Duncan wrote:
>> Since I can't/won't run pretty much anything proprietary, there's little chance of it being taken as anything but Linux, here. (Tho I actually use (c)gdisk for partitioning here and it appears to use a different GUID. (0700 in its short form which AFAIK is gdisk specific, for MS basic data, while it uses 8300 for general Linux filesystems. I could look up the long form GUIDs, but meh...)
>
> Partition type codes (e.g. 0700, 8300, EF00, etc) have _nothing_ to do with UUIDs. They are type codes. They aren't short form of anything else at all. In fact 0700 is the _long_ _form_ of the original code of 7, but in big-endian order now that it went from one byte to two.
>
> Microsoft started using pre-assigned UUIDs as classes, e.g. type codes they could cram into their various registry files. If you actually read the registry you'll find a lot of places where a rational word is defined as {some_uuid_here} and then elsewhere {some_uuid_here} has a bunch of data items attached to it.
>
> So gparted didn't reuse microsoft UUIDs.
>
> In some/many of the older formats there was a code for "operating system data" (which I think is what 7 was originally). Others came by and said "since we're going to put in a type code for linux swap (82) then let's put in a code for linux data as well (83)", and all this before the whole byte expansion to turn these things from bytes into two-byte words.
>
> Once everybody else picked their own type codes for their data partitions, everybody just started calling 7 "microsoft data". And linux doesn't care at all since it's noise, since every partition just ends up as /dev/[sh]d? anyway.
>
> All this stuff has historical reasons. GNU/Linux attempts to be an egalitarian actor so it adapts to whatever you do.
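The corrected picture above (four-digit short codes that merely select a standardized type GUID) can be written down as a small lookup table. This is only an illustrative sketch: the function name gpt_type_guid is invented here, and the table covers just the few codes mentioned in this thread, paired with their commonly documented partition type GUIDs.

```shell
#!/bin/sh
# Hypothetical lookup: four-digit gdisk-style short code -> partition type GUID.
# Only a handful of well-known entries are listed; unknown codes return non-zero.
gpt_type_guid() {
    case $1 in
        0700)      echo EBD0A0A2-B9E5-4433-87C0-68B6B72699C7 ;;  # Microsoft basic data
        8200)      echo 0657FD6D-A4AB-43C4-84E5-0933C84B4F4F ;;  # Linux swap
        8300)      echo 0FC63DAF-8483-4772-8E79-3D69D8477DE4 ;;  # Linux filesystem data
        EF00|ef00) echo C12A7328-F81F-11D2-BA4B-00A0C93EC93B ;;  # EFI System partition
        EF02|ef02) echo 21686148-6449-6E6F-744E-656564454649 ;;  # BIOS boot partition
        *)         return 1 ;;
    esac
}

# Example calls:
#   gpt_type_guid 8300
#   gpt_type_guid 0700
```

Only the GUID on the right-hand side ever reaches the disk; the short code exists purely in the partitioning tool's user interface.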
Re: Moving contents from one subvol to another
On 11/29/2014 07:15 AM, Shriramana Sharma wrote:
> On Sat, Nov 29, 2014 at 7:58 PM, Hugo Mills h...@carfax.org.uk wrote:
>> The latest version of mv should be able to use CoW copies to make it more efficient. It has a --reflink option, the same as cp. Note that you can't make reflinks crossing a mount boundary, but you can do so crossing a subvolume boundary (as you're doing here).

One thing to keep in mind is that mv, when crossing any of these boundaries, degenerates to a copy-and-remove operation, and _none_ of the source files will be removed until _all_ of the files have been copied. If any of the copy operations fail, the removes will not take place at all.

It would only take a couple of large NOCOW files to put you over a limit somewhere.

So if you get to an out-of-space condition mid-move you are going to have to disentangle your stuff by hand anyway.

If you are consolidating sub-volumes (as per the original question) on a nearly full drive you may want to do it all long-hand with a script moving various chunks or something, instead of just trying a move/copy of "cp --reflink /vol1/* /vol2/" (same for mv when you get that reflink-aware revision).

ASIDE: Also be aware that such a moment would be the perfect time to consider compression and so on. A regular copy (non-reflink) will apply the currently selected compress= regime (and reconsider sparsity etc) in a way that the move will not. E.g. once you decide to do intrusive maintenance you might be well served by taking the extra time to restructure your storage. 8-)
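The "long-hand script moving various chunks" suggested above could look roughly like the following. It is a sketch under stated assumptions, not a tested migration tool: the function name move_in_chunks and the REFLINK_MODE variable are invented for this example, and it relies only on GNU cp's -a and --reflink=WHEN options. Each top-level entry is copied and then removed before the next one is touched, so a failure leaves at most one entry duplicated.

```shell
#!/bin/sh
# Hypothetical helper: move the contents of directory $1 into directory $2
# one top-level entry at a time, using reflink copies where possible.
# REFLINK_MODE defaults to "always" (fail rather than fall back to a full
# copy); set it to "auto" to allow ordinary copies on non-CoW filesystems.
move_in_chunks() {
    src=$1
    dst=$2
    mode=${REFLINK_MODE:-always}
    # The three patterns cover ordinary names, dotfiles, and names like "..foo".
    for entry in "$src"/* "$src"/.[!.]* "$src"/..?*; do
        # Skip patterns that matched nothing.
        [ -e "$entry" ] || [ -L "$entry" ] || continue
        # Stop at the first failed copy so nothing is removed prematurely.
        cp -a --reflink="$mode" -- "$entry" "$dst"/ || return 1
        rm -rf -- "$entry" || return 1
    done
}

# Example call (paths are placeholders):
#   move_in_chunks /vol1 /vol2
```

As noted earlier in the thread, reflinks can cross a subvolume boundary but not a mount boundary, so with the default mode this would fail outright if source and destination are separate mounts.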
Re: [RFC PATCH] Btrfs: add sha256 checksum option
David Sterba wrote:
> On Mon, Nov 24, 2014 at 01:23:05PM +0800, Liu Bo wrote:
>> This brings a strong-but-slow checksum algorithm, sha256.
>>
>> Actually btrfs used sha256 at the early time, but then moved to crc32c for performance purposes.
>>
>> As crc32c is sort of weak due to its hash collision issue, we need a stronger algorithm as an alternative.
>>
>> Users can choose sha256 from mkfs.btrfs via
>>
>>     $ mkfs.btrfs -C 256 /device
>
> There's already some good feedback so I'll try to cover what hasn't been mentioned yet.
>
> I think it's better to separate the preparatory works from adding the algorithm itself. The former can be merged (and tested) independently.
>
> There are several checksum algorithms that trade off speed and strength so we may want to support more than just sha256. Easy to add but I'd rather see them added in all at once than one by one.

Why not just use the kernel crypto API? Then the user can just specify any hash the kernel supports.

> Another question is if we'd like to use different checksum for data and metadata. This would not cost any format change if we use the 2 bytes in super block csum_type.

Mmm, that might be a good reason - although maybe store an entry in some tree of the full crypto api spec, and have a special value of one byte meaning "crypto API" and the other byte counts how many bytes the csum is.

> Optional/crazy/format change stuff:
>
> * per-file checksum algorithm - unlike compression, the whole file would have to use the same csum algo
>   reflink would work iff the algos match
>   snapshotting is unaffected
>
> * per-subvolume checksum algorithm - specify the csum type at creation time, or afterwards unless it's modified
Skript for backup btrfs on external HD
Hi there!

I made a script to do backups with btrfs onto an external HD. You can see its function, how it works, and how it is to be used on my site:

http://linux.xundeenergie.at/doku.php?id=mkbtrbackup

The site is in German. An English one will follow later.

Do you want some explanations?

greetings
jakob

--
http://xundeenergie.at
http://verkehrsloesungen.wordpress.com/
http://cogitationum.wordpress.com/
Re: [RFC PATCH] Btrfs: add sha256 checksum option
On Sat, Nov 29, 2014 at 12:38 PM, Alex Elsayed eternal...@gmail.com wrote:
> Why not just use the kernel crypto API? Then the user can just specify any hash the kernel supports.

One reason is that cryptographic hashes are an order of magnitude slower than the fastest non-cryptographic hashes. And for filesystem checksums, I do not see a need for cryptographic hashes.
Re: [RFC PATCH] Btrfs: add sha256 checksum option
John Williams wrote:
> On Sat, Nov 29, 2014 at 12:38 PM, Alex Elsayed eternal...@gmail.com wrote:
>> Why not just use the kernel crypto API? Then the user can just specify any hash the kernel supports.
>
> One reason is that cryptographic hashes are an order of magnitude slower than the fastest non-cryptographic hashes. And for filesystem checksums, I do not see a need for cryptographic hashes.

I'd suggest looking more closely at the crypto api section of menuconfig - it already has crc32c, among others. Just because it's called the crypto api doesn't mean it only has cryptographically-strong algorithms.

As a side benefit, if someone implements (say) SipHash for it, then not only could btrfs benefit, but also all other users of the API, including (now) userspace.

The crypto api also has compression, for zlib/lzo/lz4/lz4hc, but I'm given to understand that btrfs' usage of compression doesn't match well to that API.
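Besides menuconfig, a running kernel reports the algorithms currently registered with the crypto API in /proc/crypto, which gives a quick way to see that non-cryptographic checksums such as crc32c sit next to the cryptographic hashes. The filter below is a rough sketch; the function name list_shash_algorithms is made up, and it assumes the usual "key : value" layout of /proc/crypto with "name", "type" and "digestsize" fields.

```shell
#!/bin/sh
# Hypothetical filter: reads /proc/crypto-style text on stdin and prints
# "<name> <digest size in bytes>" for every entry whose type is "shash"
# (the synchronous hash interface).
list_shash_algorithms() {
    awk -F ' *: *' '
        $1 == "name"       { name = $2 }
        $1 == "type"       { kind = $2 }
        $1 == "digestsize" { if (kind == "shash") print name, $2 }
    '
}

# Possible use on a Linux system:
#   list_shash_algorithms < /proc/crypto | sort -u
```

The same algorithm name can appear several times when more than one driver implements it, which is why the usage line pipes the result through sort -u.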
Re: Skript for backup btrfs on external HD
On Sat, Nov 29, 2014 at 09:34:01PM +0100, Jakob Schürz wrote:
> Hi there!
>
> I made a script to do backup with btrfs on a external HD. You can see the function, how it works, and how it's to be used on my site
>
> http://linux.xundeenergie.at/doku.php?id=mkbtrbackup
>
> The site is in german. An english one will follow later.
>
> Do you want some explanations?

Sure, how is it different from those 3?
https://btrfs.wiki.kernel.org/index.php/Incremental_Backup#Available_Backup_Tools

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: BTRFS messes up snapshot LV with origin
On Sat, Nov 29, 2014 at 1:20 AM, Robert White rwh...@pobox.com wrote:
> On 11/28/2014 11:29 PM, Duncan wrote:
>> Since I can't/won't run pretty much anything proprietary, there's little chance of it being taken as anything but Linux, here. (Tho I actually use (c)gdisk for partitioning here and it appears to use a different GUID. (0700 in its short form which AFAIK is gdisk specific, for MS basic data, while it uses 8300 for general Linux filesystems. I could look up the long form GUIDs, but meh...)
>
> Partition type codes (e.g. 0700, 8300, EF00, etc) have _nothing_ to do with UUIDs. They are type codes. They aren't short form of anything else at all. In fact 0700 is the _long_ _form_ of the original code of 7, but in big-endian order now that it went from one byte to two.

No, that's not correct. These four-digit type codes are a user-facing, friendly type code; the actual on-disk partition type GUID is a UUID in that, at the time of creation, that UUID followed RFC 4122 so it was unique: no one else was using the UUID. That UUID in the context of a partition type GUID is intended to describe the purpose of that partition: what OS, what file system, where it should mount or be used for, etc. This is elaborately detailed in the GPT (GUID partition table) portion of the UEFI specification.

A 128-bit type code is rather difficult for humans to remember and interact with, hence gdisk and recently fdisk now use a four-digit type code as a front end for the partition type GUID. The selection of four digits was to account for the fact there are many many many more type codes now possible, essentially unlimited. This is a case where UUIDs are reused effectively.

> Microsoft started using pre-assigned UUIDs as classes, e.g. type codes they could cram into their various registry files. If you actually read the registry you'll find a lot of places where a rational word is defined as {some_uuid_here} and then elsewhere {some_uuid_here} has a bunch of data items attached to it.
>
> So gparted didn't reuse microsoft UUIDs.

GNU parted absolutely re-used partition type GUID EBD0A0A2-B9E5-4433-87C0-68B6B72699C7 for Linux, by default. This you know as gdisk (and friends) type code 0700. It's the same thing as using type code 07 on an MBR partitioned disk instead of 83. It's ridiculous that this happened, considering we had distinction on MBR with limited type code availability, and on GPT with unlimited type codes the decision was to use an already existing type code, EBD0A0A2-B9E5-4433-87C0-68B6B72699C7.

http://www.rodsbooks.com/linux-fs-code/

The Linux partition type GUID is now 0FC63DAF-8483-4772-8E79-3D69D8477DE4. And actually some others have been created also for encryption, RAID, LVM, swap, and a pile of GUIDs from the 'discoverable partitions spec' hosted at freedesktop.org for autodiscovery by systemd. Only very recent versions of parted support code 0FC63DAF-8483-4772-8E79-3D69D8477DE4.

> All this stuff has historical reasons. GNU/Linux attempts to be an egalitarian actor so it adapts to whatever you do.

With respect to this particular reuse of a Windows type code, it did a total face plant on adaptation. The very decision to reuse that GUID was a huge, weird mistake that we'll live with for years to come. Data loss will result from it. And then it was made worse, upon recognition that the conflict was probably not a good idea, to undermine patching GNU parted in a timely manner. The patch to fix the problem, from the gdisk author, sat around for two years before parted upstream merged it. There really isn't good diplomatic language to use for this. Some people flat out dropped the ball, and just didn't give a crap.

--
Chris Murphy
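To see whether a given disk is affected by the reuse described above, one could look for partitions that carry the Microsoft basic data type GUID but hold a Linux filesystem. The filter below is a sketch only: the function name flag_shared_type_guid is invented, the three-column "NAME PARTTYPE FSTYPE" input format is an assumption, and the list of filesystems it treats as "Linux" (ext2/3/4, btrfs, xfs) is deliberately short.

```shell
#!/bin/sh
# Hypothetical filter: reads "NAME PARTTYPE FSTYPE" lines on stdin and prints
# the names of partitions whose type GUID is the Microsoft basic data GUID
# even though the filesystem on them is a common Linux one.
flag_shared_type_guid() {
    awk '
        NF == 3 &&
        tolower($2) == "ebd0a0a2-b9e5-4433-87c0-68b6b72699c7" &&
        $3 ~ /^(ext[234]|btrfs|xfs)$/ { print $1 }
    '
}

# Possible use on a Linux system (lsblk can print these three columns):
#   lsblk -rno NAME,PARTTYPE,FSTYPE | flag_shared_type_guid
```

Lines with a missing column (whole disks, or partitions without a recognized filesystem) are skipped by the NF == 3 check. Retagging a flagged partition is a separate, deliberate step; sgdisk's --typecode option is one way to do that, but check its manpage before touching a live disk.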
Re: [RFC PATCH] Btrfs: add sha256 checksum option
On Sat, Nov 29, 2014 at 1:07 PM, Alex Elsayed eternal...@gmail.com wrote:
> I'd suggest looking more closely at the crypto api section of menuconfig - it already has crc32c, among others. Just because it's called the crypto api doesn't mean it only has cryptographically-strong algorithms.

I have looked. What 128- or 256-bit hash functions in crypto api are you referring to that are as fast as Spooky2 or CityHash?
Re: [RFC PATCH] Btrfs: add sha256 checksum option
John Williams wrote:
> On Sat, Nov 29, 2014 at 1:07 PM, Alex Elsayed eternal...@gmail.com wrote:
>> I'd suggest looking more closely at the crypto api section of menuconfig - it already has crc32c, among others. Just because it's called the crypto api doesn't mean it only has cryptographically-strong algorithms.
>
> I have looked. What 128- or 256-bit hash functions in crypto api are you referring to that are as fast as Spooky2 or CityHash?

I'm saying that neither of those is in the kernel _anywhere_ now, so if someone's adding them the sensible thing seems to be to add them to the crypto api, access them through it, and then if we ever add more we get them for free on the btrfs side instead of needing to reinvent the wheel every time.

In short, there's a place for hashes - why not use it?
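The design argument, look algorithms up by name in one shared place so callers never change when a new one is added, can be sketched in user space. This is only an analogy in Python, not kernel code: the registry and the `checksum` function are hypothetical, and `zlib.crc32` (plain CRC-32) stands in for crc32c, which the Python standard library does not provide.

```python
import hashlib
import zlib


def _crc32(data: bytes) -> bytes:
    # Plain CRC-32 as a stand-in for crc32c; 4-byte result.
    return zlib.crc32(data).to_bytes(4, "big")


def _sha256(data: bytes) -> bytes:
    # Cryptographic hash; 32-byte result.
    return hashlib.sha256(data).digest()


# Hypothetical shared registry: weak and strong algorithms live side by side
# and are selected purely by name.
CHECKSUMS = {
    "crc32": _crc32,
    "sha256": _sha256,
}


def checksum(name: str, data: bytes) -> bytes:
    """Compute a checksum by algorithm name; raises KeyError if unknown."""
    return CHECKSUMS[name](data)


if __name__ == "__main__":
    block = b"example block contents"
    for name in CHECKSUMS:
        print(name, len(checksum(name, block)), "bytes")
```

Registering another algorithm means adding one entry to the table; code that calls `checksum(name, data)` stays untouched, which is the "get them for free" point made above.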
Moving an entire subvol?
So the Ubuntu Wiki BtrFS entry advises against using subvol set-default because it boots its kernel using root=subvol=@ and home as subvol=@home, and these two subvols are only present under the subvol with ID 5. But isn't it just possible to move i.e. reparent a subvol so I can move these two under another subvol and have that as default?

Possibly this is a hypothetical question as I'm not sure whether it would be actually practically required but looking at the specific Ubuntu advice on this I thought I should ask.

I'm also not sure what openSUSE (or other distros) do about this... Do they mount root using subvolid, or subvol name or such?

-- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
root subvol id is 0 or 5?
I am confused about this: should I call it the root subvol, the top-level subvol or the default subvol, or doesn't it matter? Are all subvols equal, or are some more equal than others [hark to Orwell's Animal Farm ;-)]?

And more importantly, is the ID of the root subvol 0 or 5? The Oracle guide (https://docs.oracle.com/cd/E37670_01/E37355/html/ol_use_case3_btrfs.html) seems to say it's 0:

"By default, the operating system mounts the parent btrfs volume, which has an ID of 0"

but the BtrFS wiki (and btrfs subvol manpage) reads 5:

"every btrfs filesystem has a default subvolume as its initially top-level subvolume, whose subvolume id is 5(FS_TREE)."

as also the Ubuntu Wiki:

"The default subvolume to mount is always the top of the btrfs tree (subvolid=5)."

Now this Oracle page http://www.oracle.com/technetwork/articles/servers-storage-admin/advanced-btrfs-1734952.html says:

"The only clean way to destroy the default subvolume is to rerun the mkfs.btrfs command, which would destroy existing data."

So from what I've (confusedly) understood so far, 0 refers to the superstructure (or whatchamacallit) of the entire BtrFS-based contents of the device(s) and hence cannot be deleted but only reset by a mkfs.btrfs, but 5 is only the default subvol (mounted when the FS as a whole is mounted without subvol spec) provided by mkfs.btrfs, and subvol set-default can have another subvol mounted as default instead, after which 5 can actually be deleted? [confused]...

-- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
Possible to undo subvol delete?
IIUC with BtrFS, while it is possible to easily undelete a file or ordinary directory if a snapshot of the containing subvol exists, it seems that it's not elementary to undelete a subvol itself, because all subvols are under the root-level subvol (id 0 or 5, see my other q) but even snapshotting the root subvol will not snapshot any subvols under it.

So is there any way to undo a subvol delete? [If not, then ordinary users should probably prefer regular directories to subvols.]

-- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
Experimental tag in FAQ?
https://btrfs.wiki.kernel.org/index.php/FAQ#Is_btrfs_stable.3F still reads experimental whereas the warning has been removed in the tools recently IIUC. The FAQ item to be updated, no?

-- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
ToS page does not exist?
I am asked to read the ToS before signing up on the wiki:

"Make sure that you first read the Terms of Service before requesting an account."

... but the link is red and the page does not exist. For signing up I'm going to say I've agreed to the ToS anyhow, but still either there should be a ToS page or this requirement should not be there...

-- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
Re: Moving contents from one subvol to another
On Sat, Nov 29, 2014 at 10:37 PM, Robert White rwh...@pobox.com wrote:
> One thing to keep in mind is that mv, when crossing any of these boundaries degenerates to a copy-and-remove operation and _none_ of the source files will be removed until _all_ of the files have been copied. If any of the copy operations fail the removes will not take place at all. It would only take a couple large NOCOW files to put you over a limit somewhere.

Hmm... So you're saying that the copy routine that mv calls will see the nocow attribute (it doesn't know it's being called as part of a move operation) and so do a full copy rather than a reflink? Correct me if I'm wrong, but it seems that mv should actually ignore the nocow attribute as far as moving the file to a new location is concerned, no? After all, I'm moving, not copying. Of course it should retain the attribute of the original files *after* the move is done.

Why should noCoW affect cp --reflink anyhow? I just created a 500 MiB file from /dev/urandom under a chattr +C-ed dir, and copied it to another subvol using cp --reflink, and fi df still shows 500 MiB, not 1 GiB.

> If you are consolidating sub-volumes (as per the original question) on a nearly full drive you may want to do it all long-hand with a script moving various chunks or something instead of just trying a move/copy of cp --reflinks /vol1/* /vol2/ (same for mv when you get that --reflinks revision).

As I said, there doesn't actually seem to be a --reflink command line option.

-- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
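For readers who want to see what a reflink copy amounts to underneath, here is a minimal sketch in Python, assuming Linux. It issues the clone ioctl (FICLONE in linux/fs.h, the same request number btrfs introduced as BTRFS_IOC_CLONE) and falls back to an ordinary byte copy when cloning is refused, roughly the behaviour of cp --reflink=auto. The function name `clone_or_copy` is made up for this example.

```python
import fcntl
import shutil

# _IOW(0x94, 9, int): FICLONE, originally BTRFS_IOC_CLONE.
FICLONE = 0x40049409


def clone_or_copy(src_path: str, dst_path: str) -> bool:
    """Try to reflink src to dst; fall back to copying the bytes.

    Returns True if the destination shares extents with the source,
    False if the data had to be copied.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        try:
            # The ioctl is issued on the destination, with the source
            # file descriptor passed as the integer argument.
            fcntl.ioctl(dst.fileno(), FICLONE, src.fileno())
            return True
        except OSError:
            # Typical refusals: different filesystem, or a filesystem
            # without reflink support. Copy the data instead.
            shutil.copyfileobj(src, dst)
            return False


if __name__ == "__main__":
    import sys
    shared = clone_or_copy(sys.argv[1], sys.argv[2])
    print("reflinked" if shared else "copied")
```

A move across subvolumes built on this would clone each file and then unlink the source, so on a nearly full filesystem the difference between the True and False branches is the difference between needing almost no new data space and needing a full second copy.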
Re: Change total in btrfs filesystem df output to alloc
On Sun, Aug 31, 2014 at 7:25 AM, Shriramana Sharma samj...@gmail.com wrote: Hello. There seem to be lots of questions in various forums re the output of btrfs fi df -- especially w.r.t. the usage of the word total. For example see https://community.oracle.com/thread/2459838 I feel it would make the intent clearer if total were changed to alloc or allocated (if the short form is felt unclear). It would also help people understand the output of regular df on a btrfs system since one can understand easier that pre-allocated space would count as used space as it is not free! Where should I report a bug to get this fixed? Thanks.

-- Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
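As an illustration of the proposed wording, here is a small Python sketch that rewrites lines in the `Data, single: total=..., used=...` format so that "total" reads as "allocated", and adds how full the allocated chunks are. It is a reader-side filter only, not a patch to btrfs-progs; the function name and the exact output wording are assumptions.

```python
import re

UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Matches lines such as "Metadata, DUP: total=25.00GiB, used=23.64GiB".
LINE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+)(?P<tu>[KMGT]iB|B), "
    r"used=(?P<used>[\d.]+)(?P<uu>[KMGT]iB|B)$"
)


def relabel(line: str) -> str:
    """Rewrite one fi df line with 'allocated' in place of 'total'.

    Lines that do not match the expected format are returned unchanged.
    """
    m = LINE.match(line.strip())
    if m is None:
        return line
    allocated = float(m["total"]) * UNITS[m["tu"]]
    used = float(m["used"]) * UNITS[m["uu"]]
    pct = 100.0 * used / allocated if allocated else 0.0
    return (
        f"{m['kind']}, {m['profile']}: allocated={m['total']}{m['tu']}, "
        f"used={m['used']}{m['uu']} ({pct:.0f}% of allocated)"
    )


if __name__ == "__main__":
    sample = [
        "Data, single: total=2.12TiB, used=2.12TiB",
        "Metadata, DUP: total=25.00GiB, used=23.64GiB",
    ]
    for line in sample:
        print(relabel(line))
```

Read this way, a Data line at 100% of allocated says the existing data chunks are full, not that the device is; whether more chunks can be allocated depends on the unallocated space reported by btrfs fi show.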
Re: Moving an entire subvol?
On Sun, Nov 30, 2014 at 09:01:42AM +0530, Shriramana Sharma wrote:
> So the Ubuntu Wiki BtrFS entry advises against using subvol set-default because it boots its kernel using root=subvol=@ and home as subvol=@home, and these two subvols are only present under the subvol with ID 5. But isn't it just possible to move i.e. reparent a subvol so I can move these two under another subvol and have that as default?

Make a new subvolume called /root and just mount subvol=root.

Note that you can't mount subvols recursively in one mount AFAIK. This is what my system looks like:

LABEL=btrfs_pool1 /                btrfs subvol=root,defaults,compress=lzo,discard,skip_balance,noatime 0 0
LABEL=btrfs_pool1 /usr             btrfs subvol=usr,defaults,compress=lzo,discard,skip_balance,noatime 0 0
LABEL=btrfs_pool1 /var             btrfs subvol=var,defaults,compress=lzo,discard,skip_balance,noatime 0 0
LABEL=btrfs_pool1 /home            btrfs subvol=home,defaults,compress=lzo,discard,skip_balance,noatime 0 0
LABEL=btrfs_pool1 /tmp             btrfs subvol=tmp,defaults,compress=lzo,discard,skip_balance,noatime,noexec 0 0
LABEL=btrfs_pool1 /mnt/btrfs_pool1 btrfs defaults,compress=lzo,discard,skip_balance,noatime,subvolid=0 0 0

Hope this helps.

Marc
-- A mouse is a device used to point at the xterm you want to type in - A.S.R. Microsoft is to operating systems what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
Re: Possible to undo subvol delete?
On Sun, Nov 30, 2014 at 09:03:14AM +0530, Shriramana Sharma wrote:
> IIUC with BtrFS while it is possible to easily undelete a file or ordinary directory if a snapshot of the containing subvol exists, it seems that it's not elementary to undelete a subvol itself, because all subvols are under the root-level subvol (id 0 or 5, see my other q) but even snapshotting the root subvol will not snapshot any subvols under it. So is there any way to undo a subvol delete?

If you didn't snapshot that volume before deleting it, you're SOL. If you snapshotted it, rename that snapshot to the other name, and you're done.

Btrfs doesn't offer undelete, it only lets you keep multiple copies of your data at very little cost, so you can retrieve a snapshot copy if you deleted your current volume's data.

Marc
Re: Experimental tag in FAQ?
On Sun, Nov 30, 2014 at 09:16:38AM +0530, Shriramana Sharma wrote:
> https://btrfs.wiki.kernel.org/index.php/FAQ#Is_btrfs_stable.3F still reads experimental whereas the warning has been removed in the tools recently IIUC. The FAQ item to be updated, no?

I would say no. Btrfs is still unstable, still gets changed heavily, and stability/no data loss doesn't seem to be the prime directive. Given that, it is experimental still.

And I'm not just saying that, it does break. 3.15 and 3.16.0 had serious stability issues, 3.17.0 also broke other stuff, and so on (they get fixed later, but if you say btrfs is stable, users will scream).

Marc
Re: BTRFS messes up snapshot LV with origin
Robert White posted on Sat, 29 Nov 2014 08:50:57 -0800 as excerpted:
> To those reading along who don't already know. My explanation below is factually inadequate or wrong in various places...
>
> The type codes as presented in the various EFI/GUID disk partitioning tools as 0700, 8200, 8300, EF02, and so on are never written to disk as such. They are short-hand values (chosen to be deliberately similar to the MS-DOS partitioning type codes of 07, 82, 83, etc) to select standardized GUIDs for the partition type field.
>
> I could have, and should have, been _way_ more clear, and/or less wrong. 8-)
>
> http://en.wikipedia.org/wiki/GUID_Partition_Table#Partition_type_GUIDs

Thanks. While I guess we all end up eating humble pie occasionally, you handled it with rather more grace than I often do, and by taking such a hard line myself I didn't make it as easy as I might have.

-- Duncan - List replies preferred. No HTML msgs. Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman
Running out of disk space during BTRFS_IOC_CLONE - rebalance doesn't help
I'm having an issue with a filesystem where I'm regularly running out of disk space during deduplication with bedup. Rebalancing does not help and the same issue occurs even after a full rebalance.

Main use-case for this filesystem is a 3 TB backup disk where I'm creating backups by copying a newer version of the data into a new directory and then afterwards running bedup to deduplicate the data (using the older already existing data).

What happens is that bedup will deduplicate some files successfully, but at some point fails with an errno 28 (no space left on device) during deduplication. I had some very limited success with running a balance, but afterwards the same issue happens again after a few more files are deduplicated (applies to balances with and without filters).

According to fsck the filesystem appears to be OK.

Is there anything else that I can try out in order to fix this issue? Or should I try to create a new filesystem and copy the existing data?

Here's the log output:

dmesg:

[235491.227888] [ cut here ]
[235491.227912] WARNING: CPU: 0 PID: 14837 at fs/btrfs/super.c:259 __btrfs_abort_transaction+0x50/0x110 [btrfs]()
[235491.227914] BTRFS: Transaction aborted (error -28)
[235491.227916] Modules linked in: fuse btrfs xor raid6_pq uas usb_storage ctr ccm toshiba_acpi sparse_keymap toshiba_haps joydev hp_accel lis3lv02d input_polldev hdaps(O) btusb bluetooth uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common videodev qcserial media usb_wwan usbserial arc4 iwldvm snd_hda_codec_hdmi mousedev snd_hda_codec_conexant snd_hda_codec_generic mac80211 iTCO_wdt iTCO_vendor_support coretemp intel_powerclamp snd_hda_intel snd_hda_controller snd_hda_codec kvm_intel snd_hwdep iwlwifi thinkpad_acpi mei_me mei cfg80211 snd_pcm nvram lpc_ich kvm evdev snd_timer i915 snd mac_hid ac serio_raw e1000e psmouse led_class wmi rfkill shpchp drm_kms_helper intel_ips i2c_i801 soundcore drm battery hwmon ptp thermal pps_core i2c_algo_bit i2c_core video intel_agp intel_gtt button
[235491.227968] acpi_cpufreq processor sch_fq_codel tp_smapi(O) thinkpad_ec(O) nfs lockd sunrpc fscache ext4 crc16 mbcache jbd2 algif_skcipher af_alg dm_crypt dm_mod atkbd libps2 crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd ehci_pci ehci_hcd usbcore usb_common i8042 serio ata_piix sd_mod crct10dif_generic crct10dif_pclmul crc_t10dif crct10dif_common ahci libahci ata_generic libata scsi_mod
[235491.228001] CPU: 0 PID: 14837 Comm: bedup Tainted: GW O 3.17.4-1-ARCH #1
[235491.228003] Hardware name: LENOVO 3680U4M/3680U4M, BIOS 6QET68WW (1.38 ) 12/01/2011
[235491.228004] 5deed0d1 880144a57a90 81537b0e
[235491.228006] 880144a57ad8 880144a57ac8 8107078d ffe4
[235491.228008] 8801719dcaa0 88009e273800 a09f7630 0c46
[235491.228010] Call Trace:
[235491.228017] [81537b0e] dump_stack+0x4d/0x6f
[235491.228021] [8107078d] warn_slowpath_common+0x7d/0xa0
[235491.228024] [8107080c] warn_slowpath_fmt+0x5c/0x80
[235491.228029] [a0949d10] __btrfs_abort_transaction+0x50/0x110 [btrfs]
[235491.228040] [a09aa9ba] clone_finish_inode_update+0xda/0xf0 [btrfs]
[235491.228046] [a09ad0de] btrfs_clone+0x6ae/0xcc0 [btrfs]
[235491.228053] [a09ade69] btrfs_ioctl_clone+0x779/0x7b0 [btrfs]
[235491.228059] [a09b18b7] btrfs_ioctl+0x10d7/0x2810 [btrfs]
[235491.228063] [81193b19] ? free_pages_and_swap_cache+0xb9/0xe0
[235491.228066] [8117d14c] ? tlb_flush_mmu_free+0x2c/0x50
[235491.228068] [8117dd2d] ? tlb_finish_mmu+0x4d/0x50
[235491.228070] [81185cd2] ? unmap_region+0xe2/0x130
[235491.228073] [811ac539] ? kmem_cache_free+0x199/0x1d0
[235491.228075] [811da5f0] do_vfs_ioctl+0x2d0/0x4b0
[235491.228076] [81187fd0] ? do_munmap+0x260/0x400
[235491.228078] [811da851] SyS_ioctl+0x81/0xa0
[235491.228081] [8153db29] system_call_fastpath+0x16/0x1b
[235491.228082] ---[ end trace 636d52c4c1dff6bc ]---

btrfs fi show:

Label: none uuid: 36c795fe-acb8-458e-87f4-721fedd81b8e
Total devices 1 FS bytes used 2.14TiB
devid 1 size 2.73TiB used 2.17TiB path /dev/mapper/crypt

btrfs fi df:

Data, single: total=2.12TiB, used=2.12TiB
System, DUP: total=32.00MiB, used=248.00KiB
Metadata, DUP: total=25.00GiB, used=23.64GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

I reported the same issue a year ago in 20131202081543.ga1...@gst.name and didn't receive a reply back then. The report in this email still applies to the same filesystem. I just didn't use that filesystem a lot since then and also I just recently retried to deduplicate the data on it.

- Guenther