Re: Help with space
On 05/02/2014 03:21 PM, Chris Murphy wrote: On May 2, 2014, at 2:23 AM, Duncan 1i5t5.dun...@cox.net wrote: Something tells me btrfs replace (not device replace, simply replace) should be moved to btrfs device replace… The syntax for btrfs device is different though; replace is like balance: btrfs balance start and btrfs replace start. And you can also get a status on it. We don't (yet) have options to stop, start, resume, which could maybe come in handy for long rebuilds and a reboot is required (?) although maybe that just gets handled automatically: set it to pause, then unmount, then reboot, then mount and resume. Well, I'd say two copies if it's only two devices in the raid1... would be true raid1. But if it's say four devices in the raid1, as is certainly possible with btrfs raid1, that if it's not mirrored 4-way across all devices, it's not true raid1, but rather some sort of hybrid raid, raid10 (or raid01) if the devices are so arranged, raid1+linear if arranged that way, or some form that doesn't nicely fall into a well defined raid level categorization. Well, md raid1 is always n-way. So if you use -n 3 and specify three devices, you'll get 3-way mirroring (3 mirrors). But I don't know any hardware raid that works this way. They all seem to be raid 1 is strictly two devices. At 4 devices it's raid10, and only in pairs. Btrfs raid1 with 3+ devices is unique as far as I can tell. It is something like raid1 (2 copies) + linear/concat. But that allocation is round robin. I don't read code but based on how a 3 disk raid1 volume grows VDI files as it's filled it looks like 1GB chunks are copied like this Actually, MD RAID10 can be configured to work almost the same with an odd number of disks, except it uses (much) smaller chunks, and it does more intelligent striping of reads. Disk1 Disk2 Disk3 134 124 235 679 578 689 So 1 through 9 each represent a 1GB chunk. Disk 1 and 2 each have a chunk 1; disk 2 and 3 each have a chunk 2, and so on. 
Total of 9GB of data taking up 18GB of space, 6GB on each drive. You can't do this with any other raid1 as far as I know. You do definitely run out of space on one disk first though because of uneven metadata to data chunk allocation. Anyway I think we're off the rails with raid1 nomenclature as soon as we have 3 devices. It's probably better to call it replication, with an assumed default of 2 replicates unless otherwise specified. There's definitely a benefit to a 3 device volume with 2 replicates, efficiency wise. As soon as we go to four disks 2 replicates it makes more sense to do raid10, although I haven't tested odd device raid10 setups so I'm not sure what happens. Chris Murphy -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe linux-btrfs in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
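As a sanity check on the layout described above, a short script (illustrative only, not btrfs code) can confirm that every chunk lands on exactly two disks and that each disk carries 6 of the 9 GB:

```python
# Verify the 3-disk, 2-copy chunk layout from the table above
# (chunks 1-9, each 1 GiB, as listed per disk).
from collections import Counter

layout = {
    "disk1": [1, 3, 4, 6, 7, 9],
    "disk2": [1, 2, 4, 5, 7, 8],
    "disk3": [2, 3, 5, 6, 8, 9],
}

copies = Counter(c for chunks in layout.values() for c in chunks)
assert all(copies[c] == 2 for c in range(1, 10))  # every chunk mirrored once
assert all(len(chunks) == 6 for chunks in layout.values())  # 6 GiB per disk
print(sum(copies.values()))  # 18 chunk-copies stored for 9 GiB of data
```

This is exactly the "9GB of data taking up 18GB of space" arithmetic: 2 copies of everything, spread evenly over an odd number of disks.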
Re: Help with space
On May 3, 2014, at 10:31 AM, Austin S Hemmelgarn ahferro...@gmail.com wrote: On 05/02/2014 03:21 PM, Chris Murphy wrote: On May 2, 2014, at 2:23 AM, Duncan 1i5t5.dun...@cox.net wrote: Something tells me btrfs replace (not device replace, simply replace) should be moved to btrfs device replace… The syntax for btrfs device is different though; replace is like balance: btrfs balance start and btrfs replace start. And you can also get a status on it. We don't (yet) have options to stop, start, resume, which could maybe come in handy for long rebuilds and a reboot is required (?) although maybe that just gets handled automatically: set it to pause, then unmount, then reboot, then mount and resume. Well, I'd say two copies if it's only two devices in the raid1... would be true raid1. But if it's say four devices in the raid1, as is certainly possible with btrfs raid1, that if it's not mirrored 4-way across all devices, it's not true raid1, but rather some sort of hybrid raid, raid10 (or raid01) if the devices are so arranged, raid1+linear if arranged that way, or some form that doesn't nicely fall into a well defined raid level categorization. Well, md raid1 is always n-way. So if you use -n 3 and specify three devices, you'll get 3-way mirroring (3 mirrors). But I don't know any hardware raid that works this way. They all seem to be raid 1 is strictly two devices. At 4 devices it's raid10, and only in pairs. Btrfs raid1 with 3+ devices is unique as far as I can tell. It is something like raid1 (2 copies) + linear/concat. But that allocation is round robin. I don't read code but based on how a 3 disk raid1 volume grows VDI files as it's filled it looks like 1GB chunks are copied like this Actually, MD RAID10 can be configured to work almost the same with an odd number of disks, except it uses (much) smaller chunks, and it does more intelligent striping of reads. The efficiency of storage depends on the file system placed on top. 
Btrfs will allocate space exclusively for metadata, and it's possible much of that space either won't or can't be used. So ext4 or XFS on md probably is more efficient in that regard; but then Btrfs also has compression options, so this clouds the efficiency analysis.

For striping of reads, there is a note in man 4 md about the layout with respect to raid10: "The 'far' arrangement can give sequential read performance equal to that of a RAID0 array, but at the cost of reduced write performance." The default layout for raid10 is near 2. I think read performance is a wash with the defaults, while with the far layout md reads are better but writes are worse. I'm not sure how Btrfs performs reads with multiple devices.

Chris Murphy
Re: Help with space
On May 3, 2014, at 1:09 PM, Chris Murphy li...@colorremedies.com wrote:

On May 3, 2014, at 10:31 AM, Austin S Hemmelgarn ahferro...@gmail.com wrote:

On 05/02/2014 03:21 PM, Chris Murphy wrote: Btrfs raid1 with 3+ devices is unique as far as I can tell. It is something like raid1 (2 copies) + linear/concat. But that allocation is round robin. I don't read code but based on how a 3 disk raid1 volume grows VDI files as it's filled it looks like 1GB chunks are copied like this

Actually, MD RAID10 can be configured to work almost the same with an odd number of disks, except it uses (much) smaller chunks, and it does more intelligent striping of reads.

The efficiency of storage depends on the file system placed on top. Btrfs will allocate space exclusively for metadata, and it's possible much of that space either won't or can't be used. So ext4 or XFS on md probably is more efficient in that regard; but then Btrfs also has compression options so this clouds the efficiency analysis. For striping of reads, there is a note in man 4 md about the layout with respect to raid10: The 'far' arrangement can give sequential read performance equal to that of a RAID0 array, but at the cost of reduced write performance. The default layout for raid10 is near 2. I think read performance is a wash with the defaults, while md reads are better and writes are worse with the far layout. I'm not sure how Btrfs performs reads with multiple devices.

Also, for unequal sized devices, for example 12G,6G,6G, Btrfs raid1 is OK with this and efficiently uses the space, whereas md raid10 does not. First, md complains when creating, asking if I want to continue anyway. Second, it ends up with *less* usable space than if it had 3x 6GB drives.

12G,6G,6G md raid10:

# mdadm -C /dev/md0 -n 3 -l raid10 --assume-clean /dev/sd[bcd]
mdadm: largest drive (/dev/sdb) exceeds size (6283264K) by more than 1%.
# mdadm -D /dev/md0 (partial)
     Array Size : 9424896 (8.99 GiB 9.65 GB)
  Used Dev Size : 6283264 (5.99 GiB 6.43 GB)

# df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/md0    9.0G   33M   9.0G    1%  /mnt

12G,6G,6G btrfs raid1:

# mkfs.btrfs -d raid1 -m raid1 /dev/sd[bcd]
# df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/sdb     24G  1.3M    12G    1%  /mnt

For performance workloads, this is probably a pathological configuration since it depends on disproportionate reading almost no matter what. But for those who happen to have uneven devices available, and favor space usage efficiency over performance, it's a nice capability.

Chris Murphy
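The df numbers above follow from simple capacity rules. A hedged sketch (the formulas are my own simplification of the observed behavior, not taken from the btrfs or mdadm sources, and they ignore metadata overhead):

```python
def btrfs_raid1_usable(sizes):
    # Greedy 2-copy allocation: usable space is capped either by
    # half the total, or by what the other devices can mirror of
    # the largest device.
    total = sum(sizes)
    return min(total // 2, total - max(sizes))

def md_raid10_usable(sizes, copies=2):
    # md raid10 clips every member to the smallest device size.
    return len(sizes) * min(sizes) // copies

print(btrfs_raid1_usable([12, 6, 6]))  # 12 -- all 24G participates, df shows 12G avail
print(md_raid10_usable([12, 6, 6]))    # 9  -- the 12G device is clipped to 6G
```

So with 12G,6G,6G, btrfs raid1 mirrors the 12G device against the two 6G devices and offers 12G usable, while md raid10 gives ~9G (matching the 8.99 GiB Array Size above), which is indeed less than it would report with 3x 6G drives plus a spare 6G.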
Re: Help with space
Russell Coker posted on Fri, 02 May 2014 11:48:07 +1000 as excerpted: On Thu, 1 May 2014, Duncan 1i5t5.dun...@cox.net wrote: Am I missing something or is it impossible to do a disk replace on BTRFS right now? I can delete a device, I can add a device, but I'd like to replace a device. You're missing something... but it's easy to do as I almost missed it too even tho I was sure it was there. Something tells me btrfs replace (not device replace, simply replace) should be moved to btrfs device replace... http://www.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-391.pdf Whether a true RAID-1 means just 2 copies or N copies is a matter of opinion. Papers such as the above seem to clearly imply that RAID-1 is strictly 2 copies of data. Thanks for that link. =:^) My position would be that reflects the original, but not the modern, definition. The paper seems to describe as raid1 what would later come to be called raid1+0, which quickly morphed into raid10, leaving the raid1 description only covering pure mirror-raid. And even then, the paper says mirrors in spots without specifically defining it as (only) two mirrors, but in others it seems to /assume/, without further explanation, just two mirrors. So I'd argue that even then the definition of raid1 allowed more than two mirrors, but that it just so happened that the examples and formulae given dealt with only two mirrors. Tho certainly I can see the room for differing opinions on the matter as well. I don't have a strong opinion on how many copies of data can be involved in a RAID-1, but I think that there's no good case to claim that only 2 copies means that something isn't true RAID-1. Well, I'd say two copies if it's only two devices in the raid1... would be true raid1. 
But if it's say four devices in the raid1, as is certainly possible with btrfs raid1, that if it's not mirrored 4-way across all devices, it's not true raid1, but rather some sort of hybrid raid, raid10 (or raid01) if the devices are so arranged, raid1+linear if arranged that way, or some form that doesn't nicely fall into a well defined raid level categorization. But still, opinions can differ. Point well made... and taken. =:^) Surprisingly, after shutting everything down, getting a new AC, and letting the system cool for a few hours, it pretty much all came back to life, including the CPU(s) (that was pre-multi-core, but I don't remember whether it was my dual socket original Opteron, or pre-dual-socket for me as well) which I had feared would be dead. CPUs have had thermal shutdown for a long time. When a CPU lacks such controls (as some buggy Opteron chips did a few years ago) it makes the IT news. That was certainly some years ago, and I remember for awhile, AMD Athlons didn't have thermal shutdown yet, while Intel CPUs of the time did. And that was an AMD CPU as I've run mostly AMD (with only specific exceptions) for literally decades, now. But what IDR for sure is whether it was my original AMD Athlon (500 MHz), or the Athlon C @ 1.2 GHz, or the dual Opteron 242s I ran for several years. If it was the original Athlon, it wouldn't have had thermal shutdown. If it was the Opterons I think they did, but I think the Athlon Cs were in the period when Intel had introduced thermal shutdown but AMD hadn't, and Tom's Hardware among others had dramatic videos of just exactly what happened if one actually tried to run the things without cooling, compared to running an Intel of the period. But I remember being rather surprised that the CPU(s) was/were unharmed, which means it very well could have been the Athlon C era, and I had seen the dramatic videos and knew my CPU wasn't protected. I'd like to be able to run a combination of dup and RAID-1 for metadata. 
ZFS has a copies option, it would be good if we could do that. Well, if N-way-mirroring were possible, one could do more or less just that easily enough with suitable partitioning and setting the data vs metadata number of mirrors as appropriate... but of course with only two- way-mirroring and dup as choices... the only way to do it would be layering btrfs atop something else, say md/raid. And without real-time checksumming verification at the md/raid level... I use BTRFS for all my backups too. I think that the chance of data patterns triggering filesystem bugs that break backups as well as primary storage is vanishingly small. The chance of such bugs being latent for long enough that I can't easily recreate the data isn't worth worrying about. The fact that my primary filesystems and their first backups are btrfs raid1 on dual SSDs, while secondary backups are on spinning rust, does factor into my calculations here. I ran reiserfs for many years, since I first switched to Linux full time in the early kernel 2.4 era in fact, and while it had its problems early on, since the introduction of ordered data mode in IIRC 2.6.16 or some such,
Re: Help with space
On 02/05/14 10:23, Duncan wrote:

Russell Coker posted on Fri, 02 May 2014 11:48:07 +1000 as excerpted:

On Thu, 1 May 2014, Duncan 1i5t5.dun...@cox.net wrote: [snip]

http://www.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-391.pdf Whether a true RAID-1 means just 2 copies or N copies is a matter of opinion. Papers such as the above seem to clearly imply that RAID-1 is strictly 2 copies of data.

Thanks for that link. =:^) My position would be that reflects the original, but not the modern, definition. The paper seems to describe as raid1 what would later come to be called raid1+0, which quickly morphed into raid10, leaving the raid1 description only covering pure mirror-raid.

Personally I'm flexible on using the terminology in day-to-day operations and discussion due to the fact that the end-result is close enough. But ... the definition of RAID 1 is still only a mirror of two devices. As far as I'm aware, Linux's mdraid is the only raid system in the world that allows N-way mirroring while still referring to it as RAID1. Due to the way it handles data in chunks, and also due to its rampant layering violations, *technically* btrfs's RAID-like features are not RAID. To differentiate from RAID, we're already using lowercase raid and, in the long term, some of us are also looking to do away with raid{x} terms altogether with what Hugo and I last termed csp notation. Changing the terminology is important - but it is particularly non-urgent.

--
__ Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97
Re: Help with space
On May 2, 2014, at 2:23 AM, Duncan 1i5t5.dun...@cox.net wrote: Something tells me btrfs replace (not device replace, simply replace) should be moved to btrfs device replace…

The syntax for btrfs device is different though; replace is like balance: btrfs balance start and btrfs replace start. And you can also get a status on it. We don't (yet) have options to stop, start, resume, which could maybe come in handy for long rebuilds and a reboot is required (?) although maybe that just gets handled automatically: set it to pause, then unmount, then reboot, then mount and resume.

Well, I'd say two copies if it's only two devices in the raid1... would be true raid1. But if it's say four devices in the raid1, as is certainly possible with btrfs raid1, that if it's not mirrored 4-way across all devices, it's not true raid1, but rather some sort of hybrid raid, raid10 (or raid01) if the devices are so arranged, raid1+linear if arranged that way, or some form that doesn't nicely fall into a well defined raid level categorization.

Well, md raid1 is always n-way. So if you use -n 3 and specify three devices, you'll get 3-way mirroring (3 mirrors). But I don't know any hardware raid that works this way. They all seem to treat raid 1 as strictly two devices. At 4 devices it's raid10, and only in pairs. Btrfs raid1 with 3+ devices is unique as far as I can tell. It is something like raid1 (2 copies) + linear/concat. But that allocation is round robin. I don't read code but based on how a 3 disk raid1 volume grows VDI files as it's filled it looks like 1GB chunks are copied like this:

Disk1  Disk2  Disk3
1 3 4  1 2 4  2 3 5
6 7 9  5 7 8  6 8 9

So 1 through 9 each represent a 1GB chunk. Disk 1 and 2 each have a chunk 1; disk 2 and 3 each have a chunk 2, and so on. Total of 9GB of data taking up 18GB of space, 6GB on each drive. You can't do this with any other raid1 as far as I know. You do definitely run out of space on one disk first though because of uneven metadata to data chunk allocation.
Anyway I think we're off the rails with raid1 nomenclature as soon as we have 3 devices. It's probably better to call it replication, with an assumed default of 2 replicates unless otherwise specified. There's definitely a benefit to a 3 device volume with 2 replicates, efficiency wise. As soon as we go to four disks 2 replicates it makes more sense to do raid10, although I haven't tested odd device raid10 setups so I'm not sure what happens.

Chris Murphy
Re: Help with space
On Fri, May 02, 2014 at 01:21:50PM -0600, Chris Murphy wrote:

On May 2, 2014, at 2:23 AM, Duncan 1i5t5.dun...@cox.net wrote: Something tells me btrfs replace (not device replace, simply replace) should be moved to btrfs device replace…

The syntax for btrfs device is different though; replace is like balance: btrfs balance start and btrfs replace start. And you can also get a status on it. We don't (yet) have options to stop, start, resume, which could maybe come in handy for long rebuilds and a reboot is required (?) although maybe that just gets handled automatically: set it to pause, then unmount, then reboot, then mount and resume.

Well, I'd say two copies if it's only two devices in the raid1... would be true raid1. But if it's say four devices in the raid1, as is certainly possible with btrfs raid1, that if it's not mirrored 4-way across all devices, it's not true raid1, but rather some sort of hybrid raid, raid10 (or raid01) if the devices are so arranged, raid1+linear if arranged that way, or some form that doesn't nicely fall into a well defined raid level categorization.

Well, md raid1 is always n-way. So if you use -n 3 and specify three devices, you'll get 3-way mirroring (3 mirrors). But I don't know any hardware raid that works this way. They all seem to treat raid 1 as strictly two devices. At 4 devices it's raid10, and only in pairs. Btrfs raid1 with 3+ devices is unique as far as I can tell. It is something like raid1 (2 copies) + linear/concat. But that allocation is round robin. I don't read code but based on how a 3 disk raid1 volume grows VDI files as it's filled it looks like 1GB chunks are copied like this:

Disk1  Disk2  Disk3
1 3 4  1 2 4  2 3 5
6 7 9  5 7 8  6 8 9

So 1 through 9 each represent a 1GB chunk. Disk 1 and 2 each have a chunk 1; disk 2 and 3 each have a chunk 2, and so on. Total of 9GB of data taking up 18GB of space, 6GB on each drive. You can't do this with any other raid1 as far as I know.
You do definitely run out of space on one disk first though because of uneven metadata to data chunk allocation.

The algorithm is that when the chunk allocator is asked for a block group (in pairs of chunks for RAID-1), it picks the number of chunks it needs, from different devices, in order of the device with the most free space. So, with disks of size 8, 4, 4, you get:

Disk 1: 12345678
Disk 2: 1357
Disk 3: 2468

and with 8, 8, 4, you get:

Disk 1: 1234568A
Disk 2: 1234579A
Disk 3: 6789

Hugo.

Anyway I think we're off the rails with raid1 nomenclature as soon as we have 3 devices. It's probably better to call it replication, with an assumed default of 2 replicates unless otherwise specified. There's definitely a benefit to a 3 device volume with 2 replicates, efficiency wise. As soon as we go to four disks 2 replicates it makes more sense to do raid10, although I haven't tested odd device raid10 setups so I'm not sure what happens.

Chris Murphy

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
--- Prisoner unknown: Return to Zenda. ---
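The greedy policy Hugo describes can be simulated directly. This is a sketch of the described rule only, not the kernel's allocator; the tie-breaking by device index is my assumption, so exact chunk numbering may differ from Hugo's layouts, but the per-device totals come out the same:

```python
def allocate(sizes, copies=2):
    """Greedy chunk-allocator sketch: each block group takes one chunk
    from each of the `copies` devices with the most free space."""
    free = list(sizes)
    placed = [[] for _ in sizes]  # block-group ids placed on each device
    bg = 0
    while sum(1 for f in free if f > 0) >= copies:
        bg += 1
        # devices ordered by free space, ties broken by device index
        order = sorted(range(len(free)), key=lambda i: (-free[i], i))[:copies]
        for i in order:
            free[i] -= 1
            placed[i].append(bg)
    return placed

# 8,4,4 fills completely: the big disk takes part in every block group.
assert [len(p) for p in allocate([8, 4, 4])] == [8, 4, 4]
# 8,8,4 also fills completely: 10 block groups in total.
assert [len(p) for p in allocate([8, 8, 4])] == [8, 8, 4]
```

Note that in both of Hugo's examples the greedy rule wastes no space at all, which is the flexibility being praised in this thread.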
Re: Help with space
On May 2, 2014, at 3:08 PM, Hugo Mills h...@carfax.org.uk wrote:

On Fri, May 02, 2014 at 01:21:50PM -0600, Chris Murphy wrote:

On May 2, 2014, at 2:23 AM, Duncan 1i5t5.dun...@cox.net wrote: Something tells me btrfs replace (not device replace, simply replace) should be moved to btrfs device replace…

The syntax for btrfs device is different though; replace is like balance: btrfs balance start and btrfs replace start. And you can also get a status on it. We don't (yet) have options to stop, start, resume, which could maybe come in handy for long rebuilds and a reboot is required (?) although maybe that just gets handled automatically: set it to pause, then unmount, then reboot, then mount and resume.

Well, I'd say two copies if it's only two devices in the raid1... would be true raid1. But if it's say four devices in the raid1, as is certainly possible with btrfs raid1, that if it's not mirrored 4-way across all devices, it's not true raid1, but rather some sort of hybrid raid, raid10 (or raid01) if the devices are so arranged, raid1+linear if arranged that way, or some form that doesn't nicely fall into a well defined raid level categorization.

Well, md raid1 is always n-way. So if you use -n 3 and specify three devices, you'll get 3-way mirroring (3 mirrors). But I don't know any hardware raid that works this way. They all seem to treat raid 1 as strictly two devices. At 4 devices it's raid10, and only in pairs. Btrfs raid1 with 3+ devices is unique as far as I can tell. It is something like raid1 (2 copies) + linear/concat. But that allocation is round robin. I don't read code but based on how a 3 disk raid1 volume grows VDI files as it's filled it looks like 1GB chunks are copied like this:

Disk1  Disk2  Disk3
1 3 4  1 2 4  2 3 5
6 7 9  5 7 8  6 8 9

So 1 through 9 each represent a 1GB chunk. Disk 1 and 2 each have a chunk 1; disk 2 and 3 each have a chunk 2, and so on. Total of 9GB of data taking up 18GB of space, 6GB on each drive. You can't do this with any other raid1 as far as I know. You do definitely run out of space on one disk first though because of uneven metadata to data chunk allocation.

The algorithm is that when the chunk allocator is asked for a block group (in pairs of chunks for RAID-1), it picks the number of chunks it needs, from different devices, in order of the device with the most free space. So, with disks of size 8, 4, 4, you get:

Disk 1: 12345678
Disk 2: 1357
Disk 3: 2468

and with 8, 8, 4, you get:

Disk 1: 1234568A
Disk 2: 1234579A
Disk 3: 6789

Sure, in my example I was assuming equal size disks. But it's a good example to have uneven disks also, because it exemplifies all the more the flexibility btrfs replication has, over alternatives, with odd numbered *and* uneven size disks.

Chris Murphy
Re: Help with space
On Fri, 28 Feb 2014 10:34:36 Roman Mamedov wrote:

I've a 18 tera hardware raid 5 (areca ARC-1170 w/ 8 3 gig drives) in

Do you sleep well at night knowing that if one disk fails, you end up with basically a RAID0 of 7x3TB disks? And that if 2nd one encounters unreadable sector during rebuild, you lost your data? RAID5 actually stopped working 5 years ago, apparently you didn't get the memo. :)
http://hardware.slashdot.org/story/08/10/21/2126252/why-raid-5-stops-working-in-2009

I've just been doing some experiments with a failing disk used for backups (so I'm not losing any real data here). The dup option for metadata means that the entire filesystem structure is intact in spite of having lots of errors (in another thread I wrote about getting 50+ correctable errors on metadata while doing a backup).

My experience is that in the vast majority of disk failures that don't involve dropping a disk the majority of disk data will still be readable. For example one time I had a workstation running RAID-1 get too hot in summer and both disks developed significant numbers of errors, enough that it couldn't maintain a Linux Software RAID-1 (disks got kicked out all the time). I wrote a program to read all the data from disk 0 and read from disk 1 any blocks that couldn't be read from disk 0; the result was that after running e2fsck on the result I didn't lose any data.

So if you have BTRFS configured to dup metadata on a RAID-5 array (either hardware RAID or Linux Software RAID) then the probability of losing metadata would be a lot lower than for a filesystem which doesn't do checksums and doesn't duplicate metadata. To lose metadata you would need to have two errors that line up with both copies of the same metadata block.

One problem with many RAID arrays is that it seems to only be possible to remove a disk and generate a replacement from parity. I'd like to be able to read all the data from the old disk which is readable and write it to the new disk.
Then use the parity from other disks to recover the blocks which weren't readable. That way if you have errors on two disks it won't matter unless they both happen to be on the same stripe. Given that BTRFS RAID-5 isn't usable yet, it seems that the only way to get this result is to use RAID-Z on ZFS.

--
My Main Blog      http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/
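The "RAID5 stopped working" jab can be put in numbers. A sketch for the array discussed here, assuming the commonly quoted consumer-drive spec of one unrecoverable read error (URE) per 1e14 bits read (an assumption about the drives, not a measurement):

```python
import math

# Chance that a degraded 8-disk RAID5 rebuild (reading the 7 surviving
# 3TB members in full) completes without hitting a single URE.
disks, size_tb, ure_per_bit = 7, 3, 1e-14
bits_read = disks * size_tb * 1e12 * 8
p_clean = math.exp(bits_read * math.log1p(-ure_per_bit))
print(f"{1 - p_clean:.0%} chance the rebuild hits a URE")
```

Under that spec the rebuild reads ~1.7e14 bits, so hitting at least one URE is the likely outcome, which is exactly Roman's point. Enterprise drives specced at 1e15 fare an order of magnitude better, and real-world URE rates vary widely, so treat this as illustrative.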
Re: Help with space
Russell Coker posted on Thu, 01 May 2014 11:52:33 +1000 as excerpted:

I've just been doing some experiments with a failing disk used for backups (so I'm not losing any real data here). =:^)

The dup option for metadata means that the entire filesystem structure is intact in spite of having lots of errors (in another thread I wrote about getting 50+ correctable errors on metadata while doing a backup).

TL;DR: Discussion of btrfs raid1 and n-way-mirroring. Bonus discussion on spinning rust heat-death and death in general modes.

That's why I'm running raid1 for both data and metadata here. I love btrfs' data/metadata checksumming and integrity mechanisms, and having that second copy to scrub from in the event of an error on one of them is just as important to me as the device-redundancy-and-failure-recovery bit. I could get the latter on md/raid and did run it for some years, but the fact that there's no way to have it do routine read-time parity cross-check and scrub (or N-way checking and vote, rewriting to a bad copy on failure, in the case of raid1), even tho it has all the cross-checksums already there and available to do it, but only actually makes /use/ of that for recovery if a device fails...

My biggest frustration with btrfs ATM is the lack of true raid1, aka N-way-mirroring. Btrfs presently only does pair-mirroring, no matter the number of devices in the raid1. Checksummed-3-way-redundancy really is the sweet spot I'd like to hit, and yes it's on the road map, but this thing seems to be taking about as long as Christmas does to a five or six year old... which is a pretty apt metaphor of my anticipation and the eagerness with which I'll be unwrapping and playing with that present once it comes! =:^)

My experience is that in the vast majority of disk failures that don't involve dropping a disk the majority of disk data will still be readable.
For example one time I had a workstation running RAID-1 get too hot in summer and both disks developed significant numbers of errors, enough that it couldn't maintain a Linux Software RAID-1 (disks got kicked out all the time). I wrote a program to read all the data from disk 0 and read from disk 1 any blocks that couldn't be read from disk 0; the result was that after running e2fsck on the result I didn't lose any data.

That's rather similar to an experience of mine. I'm in Phoenix, AZ, and outdoor in-the-shade temps can reach near 50C. Air-conditioning failure with a system left running while I was elsewhere. I came home to the hot car effect, far hotter inside than out, so likely 55-60C ambient air temp, very likely 70+ device temps. The system was still on but frozen (broiled?) due to disk head crash and possibly CPU thermal shutdown.

Surprisingly, after shutting everything down, getting a new AC, and letting the system cool for a few hours, it pretty much all came back to life, including the CPU(s) (that was pre-multi-core, but I don't remember whether it was my dual socket original Opteron, or pre-dual-socket for me as well) which I had feared would be dead. The disk as well came back, minus the sections that were being accessed at the time of the head crash, which I expect were physically grooved. I only had the one main disk running at the time, but fortunately I had partitioned it up and had working and backup partitions for everything vital, and of course the backup partitions weren't mounted at the time, and they came thru just fine (tho without checksumming so I'll never know if there were bit-flips, but I could boot from the backup / and mount the other backups, and a working partition or two that weren't hurt, just fine).
But I *DID* have quite a time recovering anyway, primarily because my rootfs, /usr/ and /var (which had the system's installed package database), were three different partitions that ended up being from three different backup dates... on gentoo, with its rolling updates! IIRC I had a current /var including the package database, but the package files actually on the rootfs and on /usr were from different package versions from what the db in /var was tracking, and were different from each other as well. I was still finding stale package remnants nearly two years later! But I continued running that disk for several months until I had some money to replace it, then copied the system, by then current again except for the occasional stale file, to the new setup. I always wondered how much longer I could have run the heat-tested one, but didn't want to trust my luck any further, so retired it. Which was when I got into md/raid, first mostly raid6, then later redone to raid1, once I figured out the fancy dual checksums weren't doing anything but slowing me down in normal operations anyway. And on my new setup, I used a partitioning policy I continue to this day, namely, everything that the package manager touches[1] including its installed-pkg
Re: Help with space
On Feb 27, 2014, at 11:19 AM, Justin Brown otakujunct...@gmail.com wrote:

I've a 18 tera hardware raid 5 (areca ARC-1170 w/ 8 3 gig drives) in need of help. Disk usage (du) shows 13 tera allocated yet strangely enough df shows approx. 780 gigs are free. It seems, somehow, btrfs has eaten roughly 4 tera internally. I've run a scrub and a balance usage=5 with no success, in fact I lost about 20 gigs after the balance attempt. Some numbers:

terra:/var/lib/nobody/fs/ubfterra # uname -a
Linux terra 3.12.4-2.44-desktop #1 SMP PREEMPT Mon Dec 9 03:14:51 CST 2013 i686 i686 i386 GNU/Linux

This is on i686? The kernel page cache is limited to 16TB on i686, so effectively your block device is limited to 16TB. While the file system successfully creates, I think it's a bug that the mount -t btrfs command is probably a btrfs bug. The way this works for XFS and ext4 is mount fails.

EXT4-fs (sdc): filesystem too large to mount safely on this system
XFS (sdc): file system too large to be mounted on this system.

If you're on a 32-bit OS, the file system might be toast, I'm not really sure. But I'd immediately stop using it and only use 64-bit OS for file systems of this size.

Chris Murphy
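The 16TB figure quoted above falls out of simple arithmetic: on 32-bit kernels the page cache indexes a file or block device with a 32-bit page offset, and x86 uses 4KiB pages, so the addressable size is 2^32 pages of 4KiB each:

```python
# 32-bit page index * 4 KiB page size = page-cache addressing limit
PAGE_SIZE = 4096          # bytes, the x86 page size
limit = 2**32 * PAGE_SIZE # 2^32 addressable pages
print(limit // 2**40, "TiB")  # 16 TiB
```

An 18TB array therefore exceeds what the i686 page cache can address, regardless of what the filesystem itself supports.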
Re: Help with space
On Feb 27, 2014, at 12:27 PM, Chris Murphy li...@colorremedies.com wrote: This is on i686? The kernel page cache is limited to 16TB on i686, so effectively your block device is limited to 16TB. While the file system successfully creates, I think it's a bug that the mount -t btrfs command is probably a btrfs bug. Yes Chris, circular logic day. It's probably a btrfs bug that the mount command succeeds. So let us know if this is i686 or x86_64, because if it's the former it's a bug that should get fixed. Chris Murphy
Re: Help with space
Yes it's an ancient 32 bit machine. There must be a complex bug involved as the system, when originally mounted, claimed the correct free space and only as used over time did the discrepancy between used and free grow. I'm afraid I chose btrfs because it appeared capable of breaking the 16 tera limit on a 32 bit system. If this isn't the case then it's incredible that I've been using this file system for about a year without difficulty until now. -Justin Sent from my iPad On Feb 27, 2014, at 1:51 PM, Chris Murphy li...@colorremedies.com wrote: On Feb 27, 2014, at 12:27 PM, Chris Murphy li...@colorremedies.com wrote: This is on i686? The kernel page cache is limited to 16TB on i686, so effectively your block device is limited to 16TB. While the file system successfully creates, I think it's a bug that the mount -t btrfs command is probably a btrfs bug. Yes Chris, circular logic day. It's probably a btrfs bug that the mount command succeeds. So let us know if this is i686 or x86_64, because if it's the former it's a bug that should get fixed. Chris Murphy
Re: Help with space
On Feb 27, 2014, at 1:49 PM, otakujunct...@gmail.com wrote: Yes it's an ancient 32 bit machine. There must be a complex bug involved as the system, when originally mounted, claimed the correct free space and only as used over time did the discrepancy between used and free grow. I'm afraid I chose btrfs because it appeared capable of breaking the 16 tera limit on a 32 bit system. If this isn't the case then it's incredible that I've been using this file system for about a year without difficulty until now. Yep, it's not a good bug. This happened some years ago on XFS too, where people would use the file system for a long time and then at 16TB+1byte written to the volume, kablewy! And then it wasn't usable at all, until put on a 64-bit kernel. http://oss.sgi.com/pipermail/xfs/2014-February/034588.html I can't tell you if there's a workaround for this other than to go to a 64bit kernel. Maybe you could partition the raid5 into two 9TB block devices, and then format the two partitions with -d single -m raid1. That way it behaves as one volume, and alternates 1GB chunks to the two partitions. This should be decent performing for large files, but otherwise it's possible that you will sometimes have the allocator writing to two data chunks on what it thinks are two drives at the same time, but it's actually writing to the same physical device (array). Hardware raid should optimize some of this, but I don't know what the penalty will be, or if it'll work for your use case. And I definitely don't know if the kernel page cache limit applies to the block device (partition) or if it applies to the file system. It sounds like it applies to the block device, so this might be a way around it if you had to stick to a 32bit system. Chris Murphy
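The two-partition workaround above leans on how the btrfs chunk allocator behaves with the single data profile: each new ~1GB data chunk is placed on the device with the most unallocated space, which on two equal partitions alternates between them. A toy model of that behavior (illustrative only, not btrfs code; device names and sizes are made up):

```python
# Toy model of btrfs "-d single" chunk allocation across two devices:
# each new 1 GiB data chunk goes to the device with the most free space.
def allocate(chunks, dev_free):
    placement = []
    for _ in range(chunks):
        dev = max(dev_free, key=dev_free.get)  # most unallocated space wins
        dev_free[dev] -= 1                     # one 1 GiB chunk allocated
        placement.append(dev)
    return placement

# Two equal 9 TiB partitions (9216 GiB each), writing 8 chunks:
layout = allocate(8, {"sdb1": 9216, "sdb2": 9216})
print(layout)  # alternates between the two partitions
```

With equal partitions the placement strictly alternates, which is why the two halves of the array fill at the same rate; skewed sizes would instead see the larger device absorb chunks until the free space evens out.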
Re: Help with space
On Thu, Feb 27, 2014 at 02:11:19PM -0700, Chris Murphy wrote: On Feb 27, 2014, at 1:49 PM, otakujunct...@gmail.com wrote: Yes it's an ancient 32 bit machine. There must be a complex bug involved as the system, when originally mounted, claimed the correct free space and only as used over time did the discrepancy between used and free grow. I'm afraid I chose btrfs because it appeared capable of breaking the 16 tera limit on a 32 bit system. If this isn't the case then it's incredible that I've been using this file system for about a year without difficulty until now. Yep, it's not a good bug. This happened some years ago on XFS too, where people would use the file system for a long time and then at 16TB+1byte written to the volume, kablewy! And then it wasn't usable at all, until put on a 64-bit kernel. http://oss.sgi.com/pipermail/xfs/2014-February/034588.html Well, no, that's not what I said. I said that it was limited on XFS, not that the limit was a result of a user making a filesystem too large and then finding out it didn't work. Indeed, you can't do that on XFS - mkfs will refuse to run on a block device it can't access the last block on, and the kernel has the same "can I access the last block of the filesystem" sanity checks that are run at mount and growfs time. IOWs, XFS has *never* allowed >16TB on 32 bit systems on Linux. And, historically speaking, it didn't even allow it on Irix. Irix on 32 bit systems was limited to 1TB (2^31 sectors of 2^9 bytes = 1TB), and only as Linux gained sufficient capability on 32 bit systems (e.g. CONFIG_LBD) was the limit increased.
The limit we are now at is the address space index being 32 bits, so the size is limited by 2^32 * PAGE_SIZE = 2^44 bytes = 16TB. Back when XFS was still being ported to Linux from Irix in 2000, for example:

#if !XFS_BIG_FILESYSTEMS
	if (sbp->sb_dblocks > INT_MAX || sbp->sb_rblocks > INT_MAX) {
		cmn_err(CE_WARN,
	"XFS: File systems greater than 1TB not supported on this system.\n");
		return XFS_ERROR(E2BIG);
	}
#endif

(http://oss.sgi.com/cgi-bin/gitweb.cgi?p=archive/xfs-import.git;a=blob;f=fs/xfs/xfs_mount.c;hb=60a4726a60437654e2af369ccc8458376e1657b9)

So, good story, but it is not true. Cheers, Dave. -- Dave Chinner da...@fromorbit.com
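The arithmetic behind that ceiling is easy to check: a 32-bit page-cache index times the standard 4 KiB x86 page size gives 2^44 bytes. A quick sketch:

```python
# The page cache indexes a file/block device by page number. With a
# 32-bit index there are at most 2^32 addressable pages.
PAGE_SIZE = 4096          # standard x86 page size, 2^12 bytes
MAX_PAGES = 2 ** 32       # 32-bit page index

limit_bytes = MAX_PAGES * PAGE_SIZE
assert limit_bytes == 2 ** 44           # Dave's 2^32 * PAGE_SIZE = 2^44
print(limit_bytes // 2 ** 40, "TiB")    # → 16 TiB
```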
Re: Help with space
On Feb 27, 2014, at 5:12 PM, Dave Chinner da...@fromorbit.com wrote: On Thu, Feb 27, 2014 at 02:11:19PM -0700, Chris Murphy wrote: On Feb 27, 2014, at 1:49 PM, otakujunct...@gmail.com wrote: Yes it's an ancient 32 bit machine. There must be a complex bug involved as the system, when originally mounted, claimed the correct free space and only as used over time did the discrepancy between used and free grow. I'm afraid I chose btrfs because it appeared capable of breaking the 16 tera limit on a 32 bit system. If this isn't the case then it's incredible that I've been using this file system for about a year without difficulty until now. Yep, it's not a good bug. This happened some years ago on XFS too, where people would use the file system for a long time and then at 16TB+1byte written to the volume, kablewy! And then it wasn't usable at all, until put on a 64-bit kernel. http://oss.sgi.com/pipermail/xfs/2014-February/034588.html Well, no, that's not what I said. What are you thinking I said you said? I wasn't quoting or paraphrasing anything you've said above. I had done a google search on this earlier and found some rather old threads where some people had this experience of making a large file system on a 32-bit kernel, and only after filling it beyond 16TB did they run into the problem. Here is one of them: http://lists.centos.org/pipermail/centos/2011-April/109142.html I said that it was limited on XFS, not that the limit was a result of a user making a filesystem too large and then finding out it didn't work. Indeed, you can't do that on XFS - mkfs will refuse to run on a block device it can't access the last block on, and the kernel has the same "can I access the last block of the filesystem" sanity checks that are run at mount and growfs time. Nope. What I reported on the XFS list, I had used mkfs.xfs while running a 32bit kernel on a 20TB virtual disk. It did not fail to make the file system, it failed only to mount it.
It was the same booted virtual machine, I created the file system and immediately mounted it. If you want the specifics, I'll post on the XFS list with versions and reproduce steps. IOWs, XFS has *never* allowed 16TB on 32 bit systems on Linux. OK that's fine, I've only reported what other people said they experienced, and it comes as no surprise they might have been confused. Although not knowing the size of one's file system would seem to be rare. Chris Murphy
Re: Help with space
On Thu, Feb 27, 2014 at 05:27:48PM -0700, Chris Murphy wrote: On Feb 27, 2014, at 5:12 PM, Dave Chinner da...@fromorbit.com wrote: On Thu, Feb 27, 2014 at 02:11:19PM -0700, Chris Murphy wrote: On Feb 27, 2014, at 1:49 PM, otakujunct...@gmail.com wrote: Yes it's an ancient 32 bit machine. There must be a complex bug involved as the system, when originally mounted, claimed the correct free space and only as used over time did the discrepancy between used and free grow. I'm afraid I chose btrfs because it appeared capable of breaking the 16 tera limit on a 32 bit system. If this isn't the case then it's incredible that I've been using this file system for about a year without difficulty until now. Yep, it's not a good bug. This happened some years ago on XFS too, where people would use the file system for a long time and then at 16TB+1byte written to the volume, kablewy! And then it wasn't usable at all, until put on a 64-bit kernel. http://oss.sgi.com/pipermail/xfs/2014-February/034588.html Well, no, that's not what I said. What are you thinking I said you said? I wasn't quoting or paraphrasing anything you've said above. I had done a google search on this early and found some rather old threads where some people had this experience of making a large file system on a 32-bit kernel, and only after filling it beyond 16TB did they run into the problem. Here is one of them: http://lists.centos.org/pipermail/centos/2011-April/109142.html *sigh* No, he didn't fill it with 16TB of data and then have it fail. He made a new filesystem *larger* than 16TB and tried to mount it:

| On a CentOS 32-bit backup server with a 17TB LVM logical volume on
| EMC storage. Worked great, until it rolled 16TB. Then it quit
| working. Altogether. /var/log/messages told me that the
| filesystem was too large to be mounted. Had to re-image the VM as
| a 64-bit CentOS, and then re-attached the RDM's to the LUNs
| holding the PV's for the LV, and it mounted instantly, and we
| kept on trucking.
This just backs up what I told you originally - that XFS has always refused to mount >16TB filesystems on 32 bit systems. I said that it was limited on XFS, not that the limit was a result of a user making a filesystem too large and then finding out it didn't work. Indeed, you can't do that on XFS - mkfs will refuse to run on a block device it can't access the last block on, and the kernel has the same "can I access the last block of the filesystem" sanity checks that are run at mount and growfs time. Nope. What I reported on the XFS list, I had used mkfs.xfs while running 32bit kernel on a 20TB virtual disk. It did not fail to make the file system, it failed only to mount it. You said no such thing. All you said was you couldn't mount a filesystem > 16TB - you made no mention of how you made the fs, what the block device was or any other details. It was the same booted virtual machine, I created the file system and immediately mounted it. If you want the specifics, I'll post on the XFS list with versions and reproduce steps. Did you check to see whether the block device silently wrapped at 16TB? There's a real good chance it did - but you might have got lucky because mkfs.xfs uses direct IO and *maybe* that works correctly on block devices on 32 bit systems. I wouldn't bet on it, though, given it's something we don't support and therefore never test. Cheers, Dave. -- Dave Chinner da...@fromorbit.com
Re: Help with space
On Thu, 27 Feb 2014 12:19:05 -0600 Justin Brown otakujunct...@gmail.com wrote: I've a 18 tera hardware raid 5 (areca ARC-1170 w/ 8 3TB drives) in Do you sleep well at night knowing that if one disk fails, you end up with basically a RAID0 of 7x3TB disks? And that if a 2nd one encounters an unreadable sector during rebuild, you lost your data? RAID5 actually stopped working 5 years ago, apparently you didn't get the memo. :) http://hardware.slashdot.org/story/08/10/21/2126252/why-raid-5-stops-working-in-2009 need of help. Disk usage (du) shows 13 tera allocated yet strangely enough df shows approx. 780 gigs are free. It seems, somehow, btrfs has eaten roughly 4 tera internally. I've run a scrub and a balance usage=5 with no success, in fact I lost about 20 gigs after the Did you run balance with -dusage=5 or -musage=5? Or both? What is the output of the balance command?

terra:/var/lib/nobody/fs/ubfterra # btrfs fi df .
Data, single: total=17.58TiB, used=17.57TiB
System, DUP: total=8.00MiB, used=1.93MiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=392.00GiB, used=33.50GiB
^
If you'd use -musage=5, I think this metadata reserve should have been shrunk, and you'd gain a lot more free space. But then as others mentioned it may be risky to use this FS on 32-bit at all, so I'd suggest trying anything else only after you reboot into a 64-bit kernel. -- With respect, Roman
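The scale of that metadata reserve is worth spelling out: with the DUP profile each allocated metadata chunk occupies twice its size in raw disk space, so the numbers in the fi df output above imply roughly 717GiB of raw space pinned by mostly-empty metadata chunks. A sketch of the arithmetic (the assumption that balance could compact metadata down to near its used size is optimistic):

```python
# Raw space pinned by DUP metadata vs. what the used metadata needs.
meta_total_gib = 392.0   # allocated metadata chunks (btrfs fi df)
meta_used_gib = 33.5     # metadata actually in use
dup_copies = 2           # DUP keeps two copies on the same device

raw_pinned = meta_total_gib * dup_copies   # 784 GiB occupied on disk
raw_needed = meta_used_gib * dup_copies    # 67 GiB actually required
reclaimable = raw_pinned - raw_needed
print(f"~{reclaimable:.0f} GiB raw space reclaimable")  # → ~717 GiB
```

That ~717GiB is in the same ballpark as the reported ~780GiB free plus the space the OP believes has gone missing, which is why -musage=5 (rather than -dusage=5) was the interesting filter here.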
Re: Help with space
On Feb 27, 2014, at 9:21 PM, Dave Chinner da...@fromorbit.com wrote: http://lists.centos.org/pipermail/centos/2011-April/109142.html *sigh* No, he didn't fill it with 16TB of data and then have it fail. He made a new filesystem *larger* than 16TB and tried to mount it:

| On a CentOS 32-bit backup server with a 17TB LVM logical volume on
| EMC storage. Worked great, until it rolled 16TB. Then it quit
| working. Altogether. /var/log/messages told me that the
| filesystem was too large to be mounted. Had to re-image the VM as
| a 64-bit CentOS, and then re-attached the RDM's to the LUNs
| holding the PV's for the LV, and it mounted instantly, and we
| kept on trucking.

This just backs up what I told you originally - that XFS has always refused to mount >16TB filesystems on 32 bit systems. That isn't how I read that at all. It was a 17TB LV, working great (i.e. mounted) until it was filled with 16TB, then it quit working and could not subsequently be mounted until put on a 64-bit kernel. I don't see how it's working great if it's not mountable. I said that it was limited on XFS, not that the limit was a result of a user making a filesystem too large and then finding out it didn't work. Indeed, you can't do that on XFS - mkfs will refuse to run on a block device it can't access the last block on, and the kernel has the same "can I access the last block of the filesystem" sanity checks that are run at mount and growfs time. Nope. What I reported on the XFS list, I had used mkfs.xfs while running 32bit kernel on a 20TB virtual disk. It did not fail to make the file system, it failed only to mount it. You said no such thing. All you said was you couldn't mount a filesystem > 16TB - you made no mention of how you made the fs, what the block device was or any other details. All correct. It wasn't intended as a bug report, it seemed normal. What I reported = the mount failure.
VBox 25TB VDI as a single block device, as well as 5x 5TB VDIs in a 20TB linear LV, as well as a 100TB virtual size LV using LVM thinp - all can be formatted with default mkfs.xfs with no complaints. 3.13.4-200.fc20.i686+PAE xfsprogs-3.1.11-2.fc20.i686 It was the same booted virtual machine, I created the file system and immediately mounted it. If you want the specifics, I'll post on the XFS list with versions and reproduce steps. Did you check to see whether the block device silently wrapped at 16TB? There's a real good chance it did - but you might have got lucky because mkfs.xfs uses direct IO and *maybe* that works correctly on block devices on 32 bit systems. I wouldn't bet on it, though, given it's something we don't support and therefore never test…. I did not check to see if any of the block devices silently wrapped, I don't know how to do that although I have a strace of the mkfs on the 100TB virtual LV here: https://dl.dropboxusercontent.com/u/3253801/mkfsxfs32bit100TBvLV.txt Chris Murphy
Re: Help with space
On Feb 27, 2014, at 11:19 AM, Justin Brown otakujunct...@gmail.com wrote:

terra:/var/lib/nobody/fs/ubfterra # btrfs fi df .
Data, single: total=17.58TiB, used=17.57TiB
System, DUP: total=8.00MiB, used=1.93MiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=392.00GiB, used=33.50GiB
Metadata, single: total=8.00MiB, used=0.00

After glancing at this again, what I thought might be going on might not be going on. The fact it has 17+TB already used, not merely allocated, doesn't seem possible if there's a hard 16TB limit for Btrfs on 32-bit kernels. But then I don't know why du -h is reporting only 13T total used. And I'm unconvinced this is a balance issue either. Is anything obviously missing from the file system? Chris Murphy
Re: Help with space
On Feb 27, 2014, at 11:13 PM, Chris Murphy li...@colorremedies.com wrote: On Feb 27, 2014, at 11:19 AM, Justin Brown otakujunct...@gmail.com wrote:

terra:/var/lib/nobody/fs/ubfterra # btrfs fi df .
Data, single: total=17.58TiB, used=17.57TiB
System, DUP: total=8.00MiB, used=1.93MiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=392.00GiB, used=33.50GiB
Metadata, single: total=8.00MiB, used=0.00

After glancing at this again, what I thought might be going on might not be going on. The fact it has 17+TB already used, not merely allocated, doesn't seem possible if there's a hard 16TB limit for Btrfs on 32-bit kernels. But then I don't know why du -h is reporting only 13T total used. And I'm unconvinced this is a balance issue either. Is anything obviously missing from the file system? What are your mount options? Maybe compression? Clearly du is calculating things differently. I'm getting:

du -sch = 4.2G
df -h = 5.4G
btrfs df = 4.7G data and 620MB metadata (total)

I am using compress=lzo. Chris Murphy
Re: Help with space
Roman Mamedov posted on Fri, 28 Feb 2014 10:34:36 +0600 as excerpted: But then as others mentioned it may be risky to use this FS on 32-bit at all, so I'd suggest trying anything else only after you reboot into a 64-bit kernel. Based on what I've read on-list, btrfs is not arch-agnostic, with certain on-disk sizes set to native kernel page size, etc, so a filesystem created on one arch may well not work on another. Question: Does this apply to x86/amd64? Will a filesystem created/used on 32-bit x86 even mount/work on 64-bit amd64/x86_64, or does upgrading to 64-bit imply backing up (in this case) double-digit TiB of data to something other than btrfs and testing it, doing a mkfs on the original filesystem once in 64-bit mode, and restoring all that data from backup? If the existing 32-bit x86 btrfs can't be used on 64-bit amd64, transferring all that data (assuming there's something big enough available to transfer it to!) to backup and then restoring it is going to hurt! -- Duncan - List replies preferred. No HTML msgs. Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman
Re: Help with space
On Fri, 28 Feb 2014 07:27:06 + (UTC) Duncan 1i5t5.dun...@cox.net wrote: Based on what I've read on-list, btrfs is not arch-agnostic, with certain on-disk sizes set to native kernel page size, etc, so a filesystem created on one arch may well not work on another. Question: Does this apply to x86/amd64? Will a filesystem created/used on 32-bit x86 even mount/work on 64-bit amd64/x86_64, or does upgrading to 64-bit imply backing up (in this case) double-digit TiB of data to something other than btrfs and testing it, doing a mkfs on the original filesystem once in 64-bit mode, and restoring all that data from backup? Page size (4K) is the same on both i386 and amd64. It's also the same on ARM. Problem arises only on architectures like MIPS and PowerPC, some variants of which use 16K or 64K page sizes. Other than this page size issue, it has no arch-specific dependencies, e.g. no on-disk structures with CPU-native integer sized fields etc, that'd be too crazy to be true. -- With respect, Roman
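The base page size is easy to confirm on any given box; a minimal sketch using only the Python standard library (the printed value is whatever the running kernel uses, typically 4096 on i386/x86_64/ARM, but possibly 16K or 64K on MIPS or PowerPC variants):

```python
import mmap
import os

# Base page size as seen by userspace; this is the unit the page
# cache indexes by, and the on-disk size Duncan is worried about.
page = os.sysconf("SC_PAGE_SIZE")
assert page == mmap.PAGESIZE          # two stdlib views of the same value
assert page & (page - 1) == 0         # always a power of two
print(page)
```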
Re: Help with space
Apologies for the late reply, I'd assumed the issue was closed even given the unusual behavior. My mount options are: /dev/sdb1 on /var/lib/nobody/fs/ubfterra type btrfs (rw,noatime,nodatasum,nodatacow,noacl,space_cache,skip_balance) I only recently added nodatacow and skip_balance in an attempt to figure out where the missing space had gone. I don't know what impact it might have if any on things. I've got a full balance running at the moment which, after about a day or so, has managed to process 5% of the chunks it's considering (988 out of about 18396 chunks balanced (989 considered), 95% left). The amount of free space has vacillated slightly, growing by about a gig to shrink back. As far as objects in the file system missing, I've not seen any such. I've a lot of files of various data types, the majority is encoded japanese animation. Since I actually play these files via samba from a htpc, particularly the more recent additions, I'd hazard to guess that if something were breaking I'd have tripped across it by now, the unusual used to free space delta being the exception. My brother also uses this raid for data storage, he's something of a closet meteorologist and is fascinated by tornadoes. He hasn't noticed any unusual behavior either. I'm in the process of sourcing a 64 bit capable system in the hopes that will resolve the issue. Neither of us are currently writing anything to the file system for fear of things breaking, but both have been reading from it without issue other than the noticeable impact in performance balance seems to be having. Thanks for the help. -Justin On Fri, Feb 28, 2014 at 12:26 AM, Chris Murphy li...@colorremedies.com wrote: On Feb 27, 2014, at 11:13 PM, Chris Murphy li...@colorremedies.com wrote: On Feb 27, 2014, at 11:19 AM, Justin Brown otakujunct...@gmail.com wrote: terra:/var/lib/nobody/fs/ubfterra # btrfs fi df . 
Data, single: total=17.58TiB, used=17.57TiB
System, DUP: total=8.00MiB, used=1.93MiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=392.00GiB, used=33.50GiB
Metadata, single: total=8.00MiB, used=0.00

After glancing at this again, what I thought might be going on might not be going on. The fact it has 17+TB already used, not merely allocated, doesn't seem possible if there's a hard 16TB limit for Btrfs on 32-bit kernels. But then I don't know why du -h is reporting only 13T total used. And I'm unconvinced this is a balance issue either. Is anything obviously missing from the file system? What are your mount options? Maybe compression? Clearly du is calculating things differently. I'm getting:

du -sch = 4.2G
df -h = 5.4G
btrfs df = 4.7G data and 620MB metadata (total)

I am using compress=lzo. Chris Murphy
Re: Help with space
Absolutely. I'd like to know the answer to this, as 13 tera will take a considerable amount of time to back up anywhere, assuming I find a place. I'm considering rebuilding a smaller raid with newer drives (it was originally built using 16 250 gig western digital drives, it's about eleven years old now, having been in use the entire time without failure, I'm considering replacing each 250 gig with a 3 tera alternative). Unfortunately, between upgrading the host and building a new raid the expense isn't something I'm anticipating with pleasure... On Fri, Feb 28, 2014 at 1:27 AM, Duncan 1i5t5.dun...@cox.net wrote: Roman Mamedov posted on Fri, 28 Feb 2014 10:34:36 +0600 as excerpted: But then as others mentioned it may be risky to use this FS on 32-bit at all, so I'd suggest trying anything else only after you reboot into a 64-bit kernel. Based on what I've read on-list, btrfs is not arch-agnostic, with certain on-disk sizes set to native kernel page size, etc, so a filesystem created on one arch may well not work on another. Question: Does this apply to x86/amd64? Will a filesystem created/used on 32-bit x86 even mount/work on 64-bit amd64/x86_64, or does upgrading to 64-bit imply backing up (in this case) double-digit TiB of data to something other than btrfs and testing it, doing a mkfs on the original filesystem once in 64-bit mode, and restoring all that data from backup? If the existing 32-bit x86 btrfs can't be used on 64-bit amd64, transferring all that data (assuming there's something big enough available to transfer it to!) to backup and then restoring it is going to hurt! -- Duncan - List replies preferred. No HTML msgs. Every nonfree program has a lord, a master -- and if you use the program, he is your master. 
Richard Stallman