Re: Does ssd auto detection work for microSD cards?
Great, thank you. That would make sense, but I might have to specify something for the mmcblk devices. Here is the terminal output when the MicroSD is inserted into the USB 3 holder:

$ mount | grep btrfs
/dev/sdb3 on / type btrfs (rw,ssd,subvol=@)
/dev/sdb3 on /home type btrfs (rw,ssd,subvol=@home)
/dev/sdd1 on /media/gwb09/btrfs-32G-MicroSDc type btrfs (rw,nosuid,nodev,uhelper=udisks2)
$ cat /sys/block/sdd/queue/rotational
1

Now the same MicroSD in the SD slot on the computer:

$ mount | grep btrfs
/dev/sdb3 on / type btrfs (rw,ssd,subvol=@)
/dev/sdb3 on /home type btrfs (rw,ssd,subvol=@home)
/dev/mmcblk0p1 on /media/gwb09/btrfs-32G-MicroSDc type btrfs (rw,nosuid,nodev,uhelper=udisks2)
$ cat /sys/block/mmcblk0/queue/rotational
0

So Ubuntu 14 knows the mmcblk device is non-rotational. It also looks as if the block device has a partition table of some sort, given the existence of /sys/block/mmcblk0/mmcblk0p1. I will see what happens after I install Ubuntu 18.

I probably specified the mount options for /dev/sdb in /etc/fstab by using a UUID. I'll probably tweak the ssd mounts (ssd_spread, ssd, etc.) at some point. I've been using nilfs2 for this, but it occurs to me that btrfs has more support on more platforms and operating systems. There's also a mounting issue for nilfs2 in Ubuntu 14 which prevents the nilfs-clean daemon from starting. I will see if F2FS is in the kernel of the other machines here.

No complaints here, just gratitude for the money, time and effort on the part of the tech firms that support and develop btrfs. I think Oracle developed the first blueprints for btrfs, but I might be wrong. Oracle also, of course, caught vast amounts of flak from some of the open source zfs devs for changing the dev model after buying Sun. But I have no idea what parts of Sun would have survived without a buyer.
Gordon

On Mon, Sep 3, 2018 at 11:22 PM Chris Murphy wrote:
>
> On Mon, Sep 3, 2018 at 7:53 PM, GWB wrote:
> > Curious instance here, but perhaps this is the expected behaviour:
> >
> > mount | grep btrfs
> > /dev/sdb3 on / type btrfs (rw,ssd,subvol=@)
> > /dev/sdb3 on /home type btrfs (rw,ssd,subvol=@home)
> > /dev/sde1 on /media/gwb09/btrfs-32G-MicroSDc type btrfs
> > (rw,nosuid,nodev,uhelper=udisks2)
> >
> > This is on an Ubuntu 14 client.
> >
> > /dev/sdb is indeed an ssd, a Samsung 850 EVO 500Gig, where Ubuntu runs
> > on btrfs root. It appears btrfs did indeed auto detect an ssd
> > drive. However:
> >
> > /dev/sde is a micro SD card (32Gig Samsung) sitting in a USB 3 card
> > reader, inserted into a USB 3 card slot. But ssd is not detected.
> >
> > So is that the expected behavior?
>
> cat /sys/block/sde/queue/rotational
>
> That's what Btrfs uses for detection. I'm willing to bet the SD Card
> slot is not using the mmc driver, but instead USB and therefore always
> treated as a rotational device.
>
> > If not, does it make a difference?
> >
> > Would it be best to mount an sd card with ssd_spread?
>
> For the described use case, it probably doesn't make much of a
> difference. It sounds like these are fairly large contiguous files,
> ZFS send files.
>
> I think for both SDXC and eMMC, F2FS is probably more applicable
> overall than Btrfs due to its reduced wandering trees problem. But
> again for your use case it may not matter much.
>
> > Yet another side note: both btrfs and zfs are now "housed" at Oracle
> > (and most of java, correct?).
>
> Not really. The ZFS we care about now is OpenZFS, forked from Oracle's
> ZFS. And a bunch of people not related to Oracle do that work. And
> Btrfs has a wide assortment of developers: Facebook, SUSE, Fujitsu,
> Oracle, and more.
>
> --
> Chris Murphy
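[Editorial note: for what it's worth, the fstab entry mentioned above would force the ssd allocator even when autodetection reports rotational=1. A sketch, mounting by UUID; the UUID and mount point here are made up for illustration, and `blkid` prints the real UUID:]

```
# /etc/fstab fragment -- hypothetical UUID and mount point.
# "ssd" overrides the rotational autodetection for the card-in-USB-reader case.
UUID=0123abcd-0000-4000-8000-000000000000  /media/gwb09/btrfs-32G-MicroSDc  btrfs  defaults,ssd  0  0
```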
Does ssd auto detection work for microSD cards?
Curious instance here, but perhaps this is the expected behaviour:

mount | grep btrfs
/dev/sdb3 on / type btrfs (rw,ssd,subvol=@)
/dev/sdb3 on /home type btrfs (rw,ssd,subvol=@home)
/dev/sde1 on /media/gwb09/btrfs-32G-MicroSDc type btrfs (rw,nosuid,nodev,uhelper=udisks2)

This is on an Ubuntu 14 client.

/dev/sdb is indeed an ssd, a Samsung 850 EVO 500Gig, where Ubuntu runs on btrfs root. It appears btrfs did indeed auto detect an ssd drive. However:

/dev/sde is a micro SD card (32Gig Samsung) sitting in a USB 3 card reader, inserted into a USB 3 card slot. But ssd is not detected.

So is that the expected behavior? If not, does it make a difference? Would it be best to mount an sd card with ssd_spread? The wiki suggests this, but I thought I would check: https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)#MOUNT_OPTIONS

I think this is an older btrfs userland tools bundle:

btrfs --version
Btrfs v3.12

It's whatever is still in the Ubuntu 14 standard (universe?) repository.

Here's the tl;dr. This is incidental info, and probably does not affect the question above, or its answer. I use btrfs on Ubuntu root, and then zfs for home (lots of data, just shy of 2 TB on this laptop). I snapshot and send the zfs filesystems onto a btrfs formatted sd card, and then use zfs receive for the backup zpools:

sudo zfs send -D -v -i 20180830042620u zpb9/home2@20180904005531u > zfs-send-zpb9-home2-20180904005531u

(on the sd card, copy to the back up zpools, and then:)

zfs receive -v zpf3/BackUpPoolHome < zfs-send-zpb9-home2-20180904005531u

No complaints at all about btrfs for the root file system, and apt-btrfs-snapshot works great for rolling back from failed upgrades. My servers are Solaris and Ubuntu, both of which support zfs, as long as I don't upgrade beyond the "dreaded" point of no return for "open source": zpool version 28 and zfs version 5.
When one server is upgraded to Ubuntu 18, I will try again to use btrfs on the larger Toshiba hard drives (6TB and 8TB, either the x300 NAS or Desktop models). I don't want to try that with Ubuntu 14, given the older userland tools.

Many thanks. Please point me to the correct mailing list if this is not the right one.

Yet another side note: both btrfs and zfs are now "housed" at Oracle (and most of Java, correct?). Any chance of Solaris 11 getting btrfs in the kernel? I'm guessing not for Sparc, but it might help x86 Solaris users migrate to Oracle Linux. At this point, I think Ubuntu is the only distribution with a version of both btrfs and zfs in the kernel. But not Oracle.

Gordon Bynum
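[Editorial note: since the zfs send streams above sit as plain files on an SD card before they are received, it may be worth checksumming each stream at creation and verifying it before zfs receive, since a corrupted stream will abort the receive. A minimal sketch; the file name here is a hypothetical stand-in, and the checksum step is an addition, not part of the workflow described above:]

```shell
# Hypothetical stand-in for a real zfs send stream
# (in actual use it would come from `zfs send ... > file`).
stream=zfs-send-zpb9-home2-demo
printf 'pretend zfs send stream\n' > "$stream"

# Record a checksum next to the stream when it is written to the SD card...
sha256sum "$stream" > "$stream.sha256"

# ...and verify it on the receiving machine before `zfs receive < stream`.
sha256sum -c "$stream.sha256"
```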
Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?
Yep, and thank you to Suse, Fujitsu, and all the contributors.

I suppose we can all be charitable when reading this from the Red Hat whitepaper at https://www.redhat.com/whitepapers/rha/gfs/GFS_INS0032US.pdf:

<< Red Hat GFS is the world’s leading cluster file system for Linux. >>

If that is GFS2, it is a different use case than Gluster (https://www.redhat.com/en/technologies/storage). So perhaps marketing might tweak that a little bit, maybe:

<< Red Hat GFS is the world’s leading cluster file system for Linux for Oracle RAC Database Clustering. >>

But you can see how Oracle might quibble with that. So Red Hat goes as far as it can in the whitepaper:

<< Red Hat GFS simplifies the installation, configuration, and on-going maintenance of the SAN infrastructure necessary for Oracle RAC clustering. Oracle tables, log files, program files, and archive information can all be stored in GFS files, avoiding the complexity and difficulties of managing raw storage devices on a SAN while achieving excellent performance. >>

Which avoids a comparison with, say, an Oracle Sparc server (probably made by Fujitsu) hosting Oracle RAC clusters on Solaris. Given the price of Oracle's Sparc servers, Red Hat may be as good as an Oracle RAC DB server can get for a price less than the annual budget of a small country.

Well, great news, Austin and Chris, that clears it up for me, and now I know of yet another use case for btrfs as the dmu (backing file system) for Gluster. So, again, I'm not too worried about Red Hat deprecating btrfs, given the number of supporters and developers. If Oracle or Suse drops out, then I would worry.

Gordon

On Thu, Aug 17, 2017 at 2:00 PM, Chris Murphy wrote:
> On Thu, Aug 17, 2017 at 5:47 AM, Austin S. Hemmelgarn wrote:
>
>> Also, I don't think I've ever seen any patches posted from a Red Hat address
>> on the ML, so I don't think they were really all that involved in
>> development to begin with.
>
> Unfortunately the email domain doesn't tell the whole story of who's
> backing development, the company or the individual.
>
> [chris@f26s linux]$ git log --since=”2016-01-01” --pretty=format:"%an
> %ae" --no-merges -- fs/btrfs | sort -u | grep redhat
> Andreas Gruenbacher agrue...@redhat.com
> David Howells dhowe...@redhat.com
> Eric Sandeen sand...@redhat.com
> Jeff Layton jlay...@redhat.com
> Mike Christie mchri...@redhat.com
> Miklos Szeredi mszer...@redhat.com
> $
>
>> GFS and GlusterFS are different technologies, unless Red Hat's marketing
>> department is trying to be actively deceptive.
>
> https://www.redhat.com/en/technologies/storage
>
> Seems very clear. I don't even see GFS or GFS2 on here. It's Gluster and Ceph.
>
>> SUSE is also pretty actively involved in the development too, and I think
>> Fujitsu is as well.
>
>>> I'm not too worried. I'll keep using btrfs as it is now, within the
>>> limits of what it can consistently do, and do what I can to help
>>> support the effort. I'm not a file system coder, but I very much
>>> appreciate the enormous amount of work that goes into btrfs.
>>>
>>> Steady on, ButterFS people. Back now to cat videos.
>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> Big bunch of SUSE contributions (yes David Sterba is counted three
> times here), and Fujitsu.
>
> [chris@f26s linux]$ git log --since=”2016-01-01” --pretty=format:"%an
> %ae" --no-merges -- fs/btrfs | sort -u | grep suse
> Borislav Petkov b...@suse.de
> David Sterba dste...@suse.com
> David Sterba dste...@suse.com
> David Sterba dste...@suse.cz
> Edmund Nadolski enadol...@suse.com
> Filipe Manana fdman...@suse.com
> Goldwyn Rodrigues rgold...@suse.com
> Guoqing Jiang gqji...@suse.com
> Jan Kara j...@suse.cz
> Jeff Mahoney je...@suse.com
> Jiri Kosina jkos...@suse.cz
> Mark Fasheh mfas...@suse.de
> Michal Hocko mho...@suse.com
> NeilBrown ne...@suse.com
> Nikolay Borisov nbori...@suse.com
> Petr Mladek pmla...@suse.com
>
> [chris@f26s linux]$ git log --since=”2016-01-01” --pretty=format:"%an
> %ae" --no-merges -- fs/btrfs | sort -u | grep fujitsu
> Lu Fengqi lufq.f...@cn.fujitsu.com
> Qu Wenruo quwen...@cn.fujitsu.com
> Satoru Takeuchi takeuchi_sat...@jp.fujitsu.com
> Su Yue suy.f...@cn.fujitsu.com
> Tsutomu Itoh t-i...@jp.fujitsu.com
> Wang Xiaoguang wangxg.f...@cn.fujitsu.com
> Xiaoguang Wang wangxg.f...@cn.fujitsu.com
> Zhao Lei zhao...@cn.fujitsu.com
>
> Over the past 18 months, it's about 100 Btrfs contributors, 71 ext4,
> 63 XFS. So all three have many contributors. That of course does not
> tell the whole story by any means.
>
> --
> Chris Murphy
Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?
<< Or else it could be an argument that they expect Btrfs to do their job while they watch cat videos from the intertubes. :-) >>

My favourite quote from the list this week, and, well, obviously, that is the main selling point of file systems like btrfs, zfs, and various other lvm and raid set-ups. The need to free up time to watch cat videos on the intertubes (whilst at work) has driven most technological innovations, going back at least to the time of the Roman Empire.

So, sure, I'll happily admit that I like it very much when a file system or some other software or hardware component makes my job easier (which gives me more time to watch cat videos). But if hours on hours of cat videos have taught me one thing, it is that catastrophe (pun intended) awaits those who assume that btrfs (or zfs or nilfs or whatever) will magically work well in all use cases. That may be what their customers assumed about btrfs, but did Red Hat make that claim implicitly or explicitly? I don't know, but it seems unlikely, and all the things mentioned in this thread make sense to me.

It looks like Red Hat is pushing "GFS" (Red Hat Global File System) for its clustered file system: https://www.redhat.com/whitepapers/rha/gfs/GFS_INS0032US.pdf

XFS is now the standard "on disk" fs for Red Hat, but I can't tell if XFS is the DMU (backing file system, or Data Management Unit) for GFS (zfs is the dmu for lustre). Probably, but then why does GFS still have a size limit of 100TB, while XFS has a 500TB limit, according to Red Hat? https://access.redhat.com/articles/rhel-limits

And btrfs is gone from that list. So will Red Hat deprecating btrfs have a tangible effect on its development, future improvements, and adoption? It doesn't help, but maybe it's not too bad.
From reading the list, my impression is that the typical Red Hat customer with large data arrays might do fine running xfs over lvm2 over hardware raid (or at least the customers who are paying attention to the monitor stats between cat videos). That's not for me, because I prefer mirrors, not stripes, and "hot spares" that I can pull out of the enclosure, place in another machine, and get running again (which points me back to btrfs and zfs). But it must work great for a lot of data silos.

On the plus side, btrfs is one of the backing file systems in ceph; on the minus side, with Red Hat out, btrfs might lose some developers and support: http://www.h-online.com/open/features/Kernel-Log-Coming-in-2-6-37-Part-2-File-systems-1148305.html?page=2

As long as Facebook keeps using btrfs, I wouldn't worry too much about large-firm adoption. Chris (from Facebook, post above) points out that Facebook runs both xfs and btrfs as backing file systems for Gluster: https://www.linux.com/news/learn/intro-to-linux/how-facebook-uses-linux-and-btrfs-interview-chris-mason

And Gluster is... owned by Red Hat (since 2011), which now advertises its "Red Hat Global File System", which would be... Gluster? Chris, is that right? So Facebook runs Gluster (which might be Red Hat Global File System) with both xfs and btrfs as the backing fs, and Red Hat... advertises Red Hat GFS as a platform for Oracle RAC Database Clustering. But not (presumably) running with btrfs as the backing fs, but rather xfs. So could one Gluster "grid" run over two file systems, xfs for the applications and btrfs for the primary data storage?

So Oracle still supports btrfs. Facebook still uses it. And it would be very funny if Red Hat GFS does use btrfs (eventually, at some point in the future) as the backing fs, but their customers probably won't notice the difference. I'm not too worried.
I'll keep using btrfs as it is now, within the limits of what it can consistently do, and do what I can to help support the effort. I'm not a file system coder, but I very much appreciate the enormous amount of work that goes into btrfs.

Steady on, ButterFS people. Back now to cat videos.

Gordon

On Aug 16, 2017 at 11:54 AM, Peter Grandi wrote:
> [ ... ]
>
>> But I've talked to some friend at the local super computing
>> centre and they have rather general issues with CoW at their
>> virtualisation cluster.
>
> Amazing news! :-)
>
>> Like SUSE's snapper making many snapshots leading the storage
>> images of VMs apparently to explode (in terms of space usage).
>
> Well, this could be an argument that some of your friends are being
> "challenged" by running the storage systems of a "super computing
> centre" and that they could become "more prepared" about system
> administration, for example as to the principle "know which tool to
> use for which workload". Or else it could be an argument that they
> expect Btrfs to do their job while they watch cat videos from the
> intertubes. :-)
Re: Shrinking a device - performance?
Indeed, that does make sense. It's the output of the size command in the Berkeley format of "text", not decimal, octal or hex.

Out of curiosity about kernel module sizes, I dug up some old MacBooks and looked around in /System/Library/Extensions/[modulename].kext/Contents/MacOS:

udf is 637K on Mac OS 10.6
exfat is 75K on Mac OS 10.9
msdosfs is 79K on Mac OS 10.9
ntfs is 394K (that must be Paragon's ntfs for Mac)

And here are the kernel extension sizes for zfs (from OpenZFS) in /Library/Extensions/[modulename].kext/Contents/MacOS:

zfs is 1.7M (10.9)
spl is 247K (10.9)

Different kernel from linux, of course (evidently a "mish mash" of NextStep, BSD, Mach and Apple's own code), but that is one large kernel extension for zfs. If they are somehow comparable even with the differences, 833K is not bad for btrfs compared to zfs. I did not look at the format of the file; it must be binary, but compression may be optional for third party kexts.

So the kernel module sizes are large for both btrfs and zfs. Given the feature sets of both, is that surprising?

My favourite kernel extension in Mac OS X is:

/System/Library/Extensions/Dont Steal Mac OS X.kext/

Subtle, very subtle.

Gordon

On Fri, Mar 31, 2017 at 9:42 PM, Duncan <1i5t5.dun...@cox.net> wrote:
> GWB posted on Fri, 31 Mar 2017 19:02:40 -0500 as excerpted:
>
>> It is confusing, and now that I look at it, more than a little funny.
>> Your use of xargs returns the size of the kernel module for each of the
>> filesystem types. I think I get it now: you are pointing to how large
>> the kernel module for btrfs is compared to other file system kernel
>> modules, 833 megs (piping find through xargs to sed). That does not
>> mean the btrfs kernel module can accommodate an upper limit of a command
>> line length that is 833 megs. It is just a very big loadable kernel
>> module.
>
> Umm... 833 K, not M, I believe. (The unit is bytes not KiB.)
>
> Because if just one kernel module is nearing a gigabyte, then the kernel
> must be many gigabytes either monolithic or once assembled in memory, and
> it just ain't so.
>
> But FWIW megs was my first-glance impression too, until my brain said "No
> way! Doesn't work!" and I took a second look.
>
> The kernel may indeed no longer fit on a 1.44 MB floppy, but it's still
> got a ways to go before it's multiple GiB! =:^) While they're XZ-
> compressed, I'm still fitting several monolithic-build kernels including
> their appended initramfs, along with grub, its config and modules, and a
> few other misc things, in a quarter-GB dup-mode btrfs, meaning 128 MiB
> capacity, including the 16 MiB system chunk so 112 MiB for data and
> metadata. That simply wouldn't be possible if the kernel itself were
> multi-GB, even uncompressed. Even XZ isn't /that/ good!
>
> --
> Duncan - List replies preferred. No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master." Richard Stallman
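[Editorial note: the module-size comparison discussed above is easy to repeat. A sketch; the helper name is made up, and on a typical Linux box the directory argument would be something like /lib/modules/$(uname -r)/kernel/fs, which is an assumption about the distro's layout, not verified here:]

```shell
# Print each kernel-module file under a directory with its size in bytes,
# largest first (du -b reports apparent size in bytes; GNU coreutils).
module_sizes() {
    find "$1" -name '*.ko*' -exec du -b {} + | sort -rn
}
```

Run against the fs/ module directory, btrfs.ko tends to sit at the top of the list, which is the point Duncan and Gordon were circling above.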
Re: Shrinking a device - performance?
It is confusing, and now that I look at it, more than a little funny. Your use of xargs returns the size of the kernel module for each of the filesystem types. I think I get it now: you are pointing to how large the kernel module for btrfs is compared to other file system kernel modules, 833 megs (piping find through xargs to sed). That does not mean the btrfs kernel module can accommodate an upper limit of a command line length that is 833 megs. It is just a very big loadable kernel module.

So same question, but different expression: what is the significance of the large size of the btrfs kernel module? Is it that the larger the module, the more complex, the more prone to breakage, and the more difficult to debug? Is the hfsplus kernel module less complex, and more robust? What did the file system designers of hfsplus (or udf) know better (or worse?) than the file system designers of btrfs?

VAX/VMS clusters just aren't happy outside of a deeply hidden bunker running 9 machines in a cluster from one storage device connected by Myrinet over 500 miles to the next cluster. I applaud the move to x86, but like I wrote earlier, time has moved on. I suppose weird is in the eye of the beholder, but yes, when dial-up was king and disco pants roamed the earth, they were nice. I don't think x86 is a viable use case even for OpenVMS. If you really need a VAX/VMS cluster, chances are you already have one running with a continuous uptime of more than a decade, and you have already upgraded and changed out every component several times by cycling down one machine in the cluster at a time.

Gordon

On Fri, Mar 31, 2017 at 3:27 PM, Peter Grandi wrote:
>> [ ... ] what the significance of the xargs size limits of
>> btrfs might be. [ ... ] So what does it mean that btrfs has a
>> higher xargs size limit than other file systems? [ ... ] Or
>> does the lower capacity for argument length for hfsplus
>> demonstrate it is the superior file system for avoiding
>> breakage? [ ... ]
>
> That confuses, as my understanding of command argument size
> limit is that it is a system, not filesystem, property, and for
> example can be obtained with 'getconf _POSIX_ARG_MAX'.
>
>> Personally, I would go back to fossil and venti on Plan 9 for
>> an archival data server (using WORM drives),
>
> In an ideal world we would be using Plan 9. Not necessarily with
> Fossil and Venti. As to storage/backup/archival, Linux based
> options are not bad, even if the platform is far messier than
> Plan 9 (or some other alternatives). BTW I just noticed with a
> search that AWS might be offering Plan 9 hosts :-).
>
>> and VAX/VMS cluster for an HA server. [ ... ]
>
> Uhmmm, however nice it was, it was fairly weird. An IA32 or
> AMD64 port has been promised however :-).
>
> https://www.theregister.co.uk/2016/10/13/openvms_moves_slowly_towards_x86/
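[Editorial note: Peter's point above is easy to confirm from the shell. The argument-length limit is a kernel/system property, not a filesystem one, so it reads the same no matter which filesystem module is loaded; the exact ARG_MAX number varies per system:]

```shell
# The argument+environment size limit is set by the kernel/system,
# not by the filesystem the binary or its arguments live on.
getconf ARG_MAX          # this system's limit, in bytes
getconf _POSIX_ARG_MAX   # the minimum POSIX guarantees: 4096
```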
Re: Shrinking a device - performance?
Well, now I am curious. Until we hear back from Christiane on the progress of the never-ending file system shrinkage, I suppose it can't hurt to ask what the significance of the xargs size limits of btrfs might be. Or, again, if Christiane is already happily on his way to an xfs server running over lvm: skip, ignore, delete.

Here is the output of xargs --show-limits on my laptop:

<< $ xargs --show-limits
Your environment variables take up 4830 bytes
POSIX upper limit on argument length (this system): 2090274
POSIX smallest allowable upper limit on argument length (all systems): 4096
Maximum length of command we could actually use: 2085444
Size of command buffer we are actually using: 131072
Execution of xargs will continue now... >>

That is for a laptop system. So what does it mean that btrfs has a higher xargs size limit than other file systems? Could I theoretically use 40% of the total allowed argument length of the system for btrfs arguments alone? Would that make balance, shrinkage, etc., faster? Does the higher capacity for argument length mean btrfs is overly complex and therefore more prone to breakage? Or does the lower capacity for argument length for hfsplus demonstrate it is the superior file system for avoiding breakage? Or does it mean that hfsplus is very old (and reflects older xargs limits), and that btrfs is newer code? I am relatively new to btrfs, and would like to find out.

I am also attracted to the idea that it is better to leave some operations to the system itself, and not code them into the file system. For example, I think deduplication "off line" or "out of band" is an advantage for btrfs over zfs. But that's only for what I do. For other uses deduplication "in line", while writing the file, is preferred, and that is what zfs does (preferably with lots of memory, and at least one ssd for the zil, caches, etc.).
I use btrfs now because Ubuntu has it as a default in the kernel, and I assume that when (not "if") I have to use a system rescue disk (USB or CD) it will have some capacity to repair btrfs. Along the way, btrfs has been quite good as a general purpose file system on root; it makes and sends snapshots, and so far only needs an occasional scrub and balance. My earlier experience with btrfs on a 2TB drive was more complicated, but I expected that for a file system with a lot of potential but less maturity.

Personally, I would go back to fossil and venti on Plan 9 for an archival data server (using WORM drives), and a VAX/VMS cluster for an HA server. But of course that no longer makes sense except for a very few usage cases. Time has moved on, prices have dropped drastically, and hardware can do a lot more per penny than it used to.

Gordon

On Fri, Mar 31, 2017 at 12:25 PM, Peter Grandi wrote:
>>> My guess is that very complex risky slow operations like that are
>>> provided by "clever" filesystem developers for "marketing" purposes,
>>> to win box-ticking competitions.
>
>> That applies to those system developers who do know better; I suspect
>> that even some filesystem developers are "optimistic" as to what they
>> can actually achieve.
>
>> There are cases where there really is no other sane option. Not
>> everyone has the kind of budget needed for proper HA setups,
>
>>> Thanks for letting me know, that must have never occurred to
>>> me, just as it must have never occurred to me that some
>>> people expect extremely advanced features that imply
>>> big-budget high-IOPS high-reliability storage to be fast and
>>> reliable on small-budget storage too :-)
>
>> You're missing my point (or intentionally ignoring it).
>
> In "Thanks for letting me know" I am not missing your point, I
> am simply pointing out that I do know that people try to run
> high-budget workloads on low-budget storage.
>
> The argument as to whether "very complex risky slow operations"
> should be provided in the filesystem itself is a very different
> one, and I did not develop it fully. But it is quite "optimistic"
> to simply state "there really is no other sane option", even
> for people that don't have "proper HA setups".
>
> Let's start by assuming, for the time being, that "very complex
> risky slow operations" are indeed feasible on very reliable high
> speed storage layers. Then the questions become:
>
> * Is it really true that "there is no other sane option" to
> running "very complex risky slow operations" even on storage
> that is not "big-budget high-IOPS high-reliability"?
>
> * Is it really true that it is a good idea to run "very complex
> risky slow operations" even on "big-budget high-IOPS
> high-reliability storage"?
>
>> Those types of operations are implemented because there are
>> use cases that actually need them, not because some developer
>> thought it would be cool. [ ... ]
>
> And this is the really crucial bit, which I'll disregard without
> agreeing too much (but in part I do) with the rest of the
Re: Shrinking a device - performance?
Hello, Christiane,

I very much enjoyed the discussion you sparked with your original post. My ability in btrfs is very limited, much less than that of the others who have replied here, so this may not be much help.

Let us assume that you have been able to shrink the device to the size you need, and you are now merrily on your way to moving the data to XFS. If so, ignore this email, delete it, whatever, and read no further.

If that is not the case, perhaps try something like the following. Can you try to first dedup the btrfs volume? This is probably out of date, but you could try one of these: https://btrfs.wiki.kernel.org/index.php/Deduplication

If that does not work, this is a longer shot, but you might consider adding an intermediate step of creating yet another btrfs volume on the underlying lvm2 device mapper, turning on dedup, compression, and whatever else can squeeze some extra space out of the current btrfs volume. You could then try to copy over files and see if you get the results you need (or try sending the current btrfs volume as a snapshot, but I'm guessing 20TB is too much). Once the new btrfs volume on top of lvm2 is complete, you could just delete the old one, and then transfer the (hopefully compressed and deduped) data to XFS.

Yep, that's probably a lot of work. I use both btrfs (on root on Ubuntu) and zfs (for data, home), and I try to do as little as possible with live mounted file systems other than snapshots. I avoid sending and receiving snapshots from the live system (mostly zfs, but sometimes btrfs) but instead write incremental snapshots as files on the backup disks, and then import the incremental snaps into a backup pool at night.

My recollection is that btrfs handles deduplication differently than zfs, but both of them can be very, very slow (from the human perspective; call that what you will; a suboptimal relationship of the parameters of performance and speed).
The advantage you have is that with lvm you can create a number of different file systems. And lvm can also create snapshots. I think zfs and btrfs both have a more "elegant" way of dealing with snapshots, but lvm allows a file system without that feature to have it. Others on the list can tell you about the disadvantages. I would be curious how it turns out for you. If you are able to move the data to XFS running on top of lvm, what is your plan for snapshots in lvm?

Again, I'm not an expert in btrfs, but in most cases a full balance and scrub takes care of any problems on my root partition, and that is a relatively small partition. A full balance (without the options) and scrub on 20 TiB must take a very long time even with robust hardware, would it not? CentOS, Red Hat, and Oracle seem to take the position that very large data subvolumes using btrfs should work fine. But I would be curious what the rest of the list thinks about 20 TiB in one volume/subvolume.

Gordon

On Thu, Mar 30, 2017 at 5:13 PM, Piotr Pawłow wrote:
>> The proposed "move whole chunks" implementation helps only if
>> there are enough unallocated chunks "below the line". If regular
>> 'balance' is done on the filesystem there will be some, but that
>> just spreads the cost of the 'balance' across time, it does not
>> by itself make a «risky, difficult, slow operation» any less so,
>> just spreads the risk, difficulty, slowness across time.
>
> Isn't that too pessimistic? Most of my filesystems have 90+% of free
> space unallocated, even those I never run balance on. For me it wouldn't
> just spread the cost, it would reduce it considerably.
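[Editorial note: the out-of-band dedup suggested earlier in this thread boils down to finding identical content after the fact and then sharing it. A toy sketch of the detection step only, using whole-file hashing; the function name is made up, and real tools such as duperemove work on extents and then ask the kernel to share them rather than just reporting matches:]

```shell
# Print every file under a directory whose content duplicates another
# file's content. sha256 hex digests are 64 chars wide, so `uniq -w 64 -D`
# (GNU coreutils) groups lines on the hash alone and keeps only repeats.
find_dupes() {
    find "$1" -type f -exec sha256sum {} + | sort | uniq -w 64 -D
}
```

Each printed group is a set of candidate files an out-of-band dedup pass could collapse into shared storage.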
Re: btrfs recovery - solved for me
Michael,

That's great news. Well done. ext4 works just fine for most cases. If you wish to experiment, I might suggest more work on your part (just what you need, right?) by using btrfs for smaller file systems (perhaps just root, maybe /var, /bin, etc.) but installing zfs for large file systems (user, tmp, whatever might get a little larger). I do this on Ubuntu, and btrfs works great on a 60 Gig root partition, with zfs for home, vms, isos, etc. I think zfs works on Suse (or maybe mini-Suse; I don't remember), and if your machine has enough ram it can be quite snappy (for a low-ram machine, turn off zfs compression, dedup, etc.). Or just use ext4, make lots of backups, and experiment later.

I don't have a preference on file systems in general, but I do know they each (usually) excel at different things. Btrfs is very nice to have on linux, because it is already in many kernels by now, and can work with limited ram. I do have to say the ability to create, send and receive incremental snapshots saves a lot of time and work, and as far as I know only btrfs and zfs do this without using something like lvm (maybe you do this now with xfs and lvm? If so, I think you can also make snapshots with lvm and ext4). I would not give up on btrfs yet. You might consider using btrfs for detachable backup disks.

Gordon

On Tue, Jan 31, 2017 at 3:30 PM, Michael Born wrote:
> Thank you all for your help.
>
> Magically, btrfs-find-root worked today. (I attached the steps at the end.)
> I don't think I changed anything. The btrfs-progs version is still 4.1
> because I tried different tagged versions (starting from 4.9) from the
> cloned git repo.
>
> The btrfs-find-root on the working / of my computer (Intel 3750K@4.2GHz
> on a Samsung 850EVO) took several seconds.
> btrfs-find-root on the 60GB dd-image (on the same computer on a 960EVO
> ssd) loop device took 6 minutes until it started asking me "do you want
> to keep going on ? (y/N/a): We seem to be looping a lot on..." questions.
> I answered them with yes. The command then finished 1 minute later. > - just for reference, if somebody wonders how many days one has to wait. > > Strangely, the second run of btrfs-find-root finished differently. The > first run did not mention the snapshots. > This also confirms my suspicion that the tool is working far from > perfectly. > > It's absolutely clear that I came here because of my stupidity (making a > dd backup of my live btrfs root partition), but it really frightens me > how quickly (2 days reading/trying wiki/blog/mailing list solutions) I > came to the point of having to grep text files by their contents from my > image.dd. > I don't run a database computer center, so I'm not relying on the nice > btrfs features. But for my next clean installation I would rather use > ext4, just because there are tons of forensic/recovery tools available > for the worst case. > Thank you again for your quick and helpful answers. > > Cheers, > Michael > > PS: I now just took the NetworkManager settings so as not to have to enter all > Wifi details when traveling. So, it was really mostly an exercise.
> > > linux-2bf5:/home/michael # losetup -f > /dev/loop0 > linux-2bf5:/home/michael # losetup /dev/loop0 ./20170126_sda2root.dd > > > linux-2bf5:/home/michael # btrfs fi show > Label: none uuid: 779e9c04-be4b-4a45-9fc2-000acca5549d > Total devices 1 FS bytes used 19.02GiB > devid 1 size 118.16GiB used 25.03GiB path /dev/sda4 > > Label: none uuid: 91a79eeb-08e0-470e-beab-916b38e09aca > Total devices 1 FS bytes used 44.22GiB > devid 1 size 60.00GiB used 60.00GiB path /dev/loop0 > > linux-2bf5:/home/michael # > > > linux-2bf5:/home/michael/btrfs-progs # ./btrfs fi show > Label: none uuid: 779e9c04-be4b-4a45-9fc2-000acca5549d > Total devices 1 FS bytes used 19.02GiB > devid 1 size 118.16GiB used 25.03GiB path /dev/sda4 > > Label: none uuid: 91a79eeb-08e0-470e-beab-916b38e09aca > Total devices 1 FS bytes used 44.22GiB > devid 1 size 60.00GiB used 60.00GiB path /dev/loop0 > > btrfs-progs v4.1 > linux-2bf5:/home/michael/btrfs-progs # ./btrfs-find-root /dev/sda4 > Superblock thinks the generation is 14210 > Superblock thinks the level is 1 > Found tree root at 12232638464 gen 14210 level 1 > Well block 12173475840(gen: 14209 level: 1) seems good, but > generation/level doesn't match, want gen: 14210 level: 1 > Well block 12081152000(gen: 14198 level: 1) seems good, but > generation/level doesn't match, want gen: 14210 level: 1 > linux-2bf5:/home/michael/btrfs-progs # ./btrfs-find-root /dev/loop0 > Couldn't read tree root > Superblock thinks the generation is 549995 > Superblock thinks the level is 1 > Well block 32794263552(gen: 550001 level: 1) seems good, but > generation/level doesn't match, want gen: 549995 level: 1 > linux-2bf5:/home/michael/btrfs-progs # > > > > linux-2bf5:/home/michael/btrfs-progs # ./btrfs restore -t 32794263552 >
Re: btrfs recovery
Hello, Michael, Yes, you would certainly run the risk of doing more damage with dd, so if you have an alternative, use that, and avoid dd. If nothing else works and you need the files, you might try it as a last resort. My guess (and it is only a guess) is that if the image is close to the same size as the root partition, the file data is there. But that doesn't do you much good if btrfs cannot read the "container" or the specific partition and file system information, which btrfs send provides. Does someone on the list know if ext3/4 data recovery tools can also search btrfs data? That's another option. Gordon On Mon, Jan 30, 2017 at 4:37 PM, Michael Born <michael.b...@aei.mpg.de> wrote: > Hi Gordon, > > I'm quite sure this is not a good idea. > I do understand that dd-ing a running system will miss some changes > done to the file system while copying. I'm surprised that I didn't end > up with some corrupted files, but with no files at all. > Also, I'm not interested in restoring the old Suse 13.2 system. I just > want some configuration files from it. > > Cheers, > Michael > > On 30.01.2017 at 23:24, GWB wrote: >> << >> Hi btrfs experts. >> >> Hereby I apply for the stupidity of the month award. >>>> >> >> I have no doubt that I will mount a serious challenge to you for >> that title, so you haven't won yet. >> >> Why not dd the image back onto the original partition (or another >> partition identical in size) and see if that is readable? >> >> My limited experience with btrfs (I am not an expert) is that read >> only snapshots work well in this situation, but the initial hurdle is >> using dd to get the image back onto a partition. So I wonder if you >> could dd the image back onto the original media (the hd/ssd), then >> make a read only snapshot, and then send the snapshot with btrfs send >> to another storage medium. 
With any luck, the machine might boot, and >> you might find other snapshots which you may be able to turn into read >> only snaps for btrfs send. >> >> This has worked for me on Ubuntu 14 for quite some time, but luckily I >> have not had to restore the image file sent from btrfs send yet. I >> say luckily, because I realise now that the image created from btrfs >> send should be tested, but so far no catastrophic failures with my >> root partition have occurred (knock on wood). >> >> dd is (like dumpfs, ddrescue, and the bsd variations) good for what it >> tries to do, but not so great for more intricate uses on some file >> systems. But why not try: >> >> dd if=imagefile.dd of=/dev/sdaX >> >> and see if it boots? If it does not, then perhaps you have another >> shot at the one time mount for btrfs rw if that works. >> >> Or is your root partition now running fine under Suse 42.2, and you >> are just looking to recover a few files from the image? If so, you >> might try to dd from the image to a partition of the same size as the >> previous root, then adjust with gparted or fpart, and see if it is >> readable. >> >> So instead of trying to restore a btrfs file structure, why not just >> restore a partition with dd that happens to contain a btrfs file >> structure, and then adjust the partition size to match the original? >> btrfs cares about the tree structures, etc. dd does not. >> >> What you did is not unusual, and can work fine with a number of file >> structures, but the potential for disaster with dd is also great. The >> only thing I know of in btrfs that does a similar thing is: >> >> btrfs send -f btrfs-send-image-file /mount/read-only-snapshot >> >> Chances are, of course, good that without having current backups dd >> could potentially ruin the rest of your file system set up, so maybe >> transfer the image over to another machine that is expendable and test >> this out. 
I use btrfs on root and zfs for data, and make lots of >> snapshots and send them to incremental backups (mostly zfs, but btrfs >> works nicely with Ubuntu on root, with the occasional balance >> problem). >> >> If dd did it, dd might be able to fix it. Do that first before you >> try to restore btrfs file structures. >> >> Or is this a terrible idea? Someone else on the list should say so if >> they know otherwise. >> >> Gordon > > -- > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: btrfs recovery
<< Hi btrfs experts. Hereby I apply for the stupidity of the month award. >> I have no doubt that I will mount a serious challenge to you for that title, so you haven't won yet. Why not dd the image back onto the original partition (or another partition identical in size) and see if that is readable? My limited experience with btrfs (I am not an expert) is that read only snapshots work well in this situation, but the initial hurdle is using dd to get the image back onto a partition. So I wonder if you could dd the image back onto the original media (the hd/ssd), then make a read only snapshot, and then send the snapshot with btrfs send to another storage medium. With any luck, the machine might boot, and you might find other snapshots which you may be able to turn into read only snaps for btrfs send. This has worked for me on Ubuntu 14 for quite some time, but luckily I have not had to restore the image file sent from btrfs send yet. I say luckily, because I realise now that the image created from btrfs send should be tested, but so far no catastrophic failures with my root partition have occurred (knock on wood). dd is (like dumpfs, ddrescue, and the bsd variations) good for what it tries to do, but not so great for more intricate uses on some file systems. But why not try: dd if=imagefile.dd of=/dev/sdaX and see if it boots? If it does not, then perhaps you have another shot at the one time mount for btrfs rw if that works. Or is your root partition now running fine under Suse 42.2, and you are just looking to recover a few files from the image? If so, you might try to dd from the image to a partition of the same size as the previous root, then adjust with gparted or fpart, and see if it is readable. So instead of trying to restore a btrfs file structure, why not just restore a partition with dd that happens to contain a btrfs file structure, and then adjust the partition size to match the original? btrfs cares about the tree structures, etc. 
dd does not. What you did is not unusual, and can work fine with a number of file structures, but the potential for disaster with dd is also great. The only thing I know of in btrfs that does a similar thing is: btrfs send -f btrfs-send-image-file /mount/read-only-snapshot Chances are, of course, good that without having current backups dd could potentially ruin the rest of your file system set up, so maybe transfer the image over to another machine that is expendable and test this out. I use btrfs on root and zfs for data, and make lots of snapshots and send them to incremental backups (mostly zfs, but btrfs works nicely with Ubuntu on root, with the occasional balance problem). If dd did it, dd might be able to fix it. Do that first before you try to restore btrfs file structures. Or is this a terrible idea? Someone else on the list should say so if they know otherwise. Gordon On Mon, Jan 30, 2017 at 3:16 PM, Hans van Kranenburg wrote: > On 01/30/2017 10:07 PM, Michael Born wrote: >> >> >> On 30.01.2017 at 21:51, Chris Murphy wrote: >>> On Mon, Jan 30, 2017 at 1:02 PM, Michael Born >>> wrote: Hi btrfs experts. Hereby I apply for the stupidity of the month award. >>> >>> There's still another day :-D >>> >>> >>> Before switching from Suse 13.2 to 42.2, I copied my / partition with dd to an image file - while the system was online/running. Now, I can't mount the image. >>> >>> That won't ever work for any file system. It must be unmounted. >> >> I could mount and copy the data out of my /home image.dd (encrypted >> xfs). That was also online while dd-ing it. >> Could you give me some instructions on how to repair the file system or extract some files from it? >>> >>> Not possible. The file system was being modified while dd was >>> happening, so the image you've taken is inconsistent. >> >> The files I'm interested in (fstab, NetworkManager.conf, ...) didn't >> change for months. Why would they change in the moment I copy their >> blocks with dd? 
> > The metadata of btrfs is organized in a bunch of tree structures. The > top of the trees (the smallest parts, trees are upside-down here /\ ) > and the superblock get modified quite often. Every time a tree gets > modified, the new modified parts are written as a modified copy in > unused space. > > So even if the files themselves do not change... if you miss those new > writes which are being done in space that your dd already left behind... > you end up with old and new parts of trees all over the place. > > In other words, a big puzzle with parts that do not connect with each > other any more. > > And that's exactly what you see in all the errors. E.g. "parent transid > verify failed on 32869482496 wanted 550112 found 550121" <- a part of a > tree points to another part, but suddenly something else is found which > should not be there. In this case wanted 550112 found 550121
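Hans's point can be sketched with a toy model (everything here is illustrative, not real btrfs code or on-disk format): a dd-style copy records each block address exactly once, while the live filesystem keeps committing new generations behind it. A tree node copied early can end up pointing at a block that was reused for a newer-generation node by the time the copy got there, which is exactly the shape of the "parent transid verify failed" error.

```python
# Toy model (not btrfs internals) of copying a live CoW filesystem.
# Block 10 holds a tree node of generation 550112 that points to a
# child at block 50 and expects the child to carry the same generation.
disk = {
    10: {"gen": 550112, "child_addr": 50, "child_gen": 550112},
    50: {"gen": 550112},
}

image = {}
for addr in sorted(disk):            # dd copies blocks in address order
    image[addr] = dict(disk[addr])
    if addr == 10:
        # After block 10 was copied, the live filesystem committed new
        # generations and reused block 50 for a fresh tree node:
        disk[50] = {"gen": 550121}

parent, child = image[10], image[50]
if child["gen"] != parent["child_gen"]:
    print("parent transid verify failed on", parent["child_addr"],
          "wanted", parent["child_gen"], "found", child["gen"])
```

The copied image ends up with old and new tree fragments that no longer connect, just as described above: the parent wants 550112 but finds 550121.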
Re: df -i shows 0 inodes 0 used 0 free on 4.4.0-36-generic Ubuntu 14 - Bug or not?
Good to know, and thank you for the quick reply. That helps. I'm running btrfs on root and one of the vm partitions, and zfs on the user folders and other vm partitions, largely because Ubuntu (and gentoo, redhat, etc.) has btrfs in the kernel, it's very well integrated with the kernel, and it uses less memory than zfs. /vm0 is pretty much full; after scrub and balance I get this: $ sudo btrfs fi df /vm0 ... Data, single: total=354.64GiB, used=349.50GiB System, single: total=32.00MiB, used=80.00KiB Metadata, single: total=1.00GiB, used=413.69MiB unknown, single: total=144.00MiB, used=0.00 Scrub and balance seems to do the trick for / as well, after deleting snapshots. When I move to newer userland tools, I'll try the later btrfs-progs version you suggested. btrfs works great on Ubuntu 14 on root running on an mSata drive with apt-btrfs-snapshot installed. Nothing wrong with ext4, but coming from Solaris and FreeBSD I wanted a fs that I could snapshot and roll back in case an upgrade did not work. The Stallman quote is great. Oracle taught me that lesson the hard way when it "branched" zfs after version 28 into new revisions that were incompatible with the OpenSolaris (and zfs linux) revisions going forward. "zpool upgrade" on Solaris 11 makes the pool incompatible with OpenSolaris and zfs-on-linux distros. Gordon On Thu, Sep 15, 2016 at 10:26 PM, Duncan <1i5t5.dun...@cox.net> wrote: > GWB posted on Thu, 15 Sep 2016 18:58:24 -0500 as excerpted: > >> I don't expect accurate data on a btrfs file system when using df, but >> after upgrading to kernel 4.4.0 I get the following: >> >> $ df -i ... >> /dev/sdc3 0 0 0 - /home >> /dev/sdc4 0 0 0 - /vm0 ... >> >> Where /dev/sdc3 and /dev/sdc4 are btrfs filesystems. >> >> So is this a bug or not? > > Not a bug. > > Btrfs uses inodes, but unlike ext*, it creates them dynamically as-needed, so showing inodes used vs. free simply makes no sense in btrfs > context. 
> > Now btrfs /does/ track data and metadata separately, creating chunks of > each type, and it /is/ possible to have all otherwise free space already > allocated to chunks of one type or the other and then run out of space in > the one type of chunk while there's plenty of space in the other type of > chunk, but that's quite a different concept, and btrfs fi usage (tho your > v3.12 btrfs-progs will be too old for usage) or btrfs fi df coupled with > btrfs fi show (the old way to get the same info), gives the information > for that. > > And in fact, the btrfs fi show for vm0 says 374.66 GiB size and used, so > indeed, all space on that one is allocated. Unfortunately you don't post > the btrfs fi df for that one, so we can't tell where all that allocated > space is going and whether it's actually used, but it's all allocated. > You probably want to run a balance to get back some unallocated space. > > Meanwhile, your kernel is 4.4.x LTS series so not bad there, but your > userspace is extremely old, 3.12, making support a bit hard as some of > the commands have changed (btrfs fi usage, for one, and I think the > checker was still btrfsck in 3.12, while in current btrfs-progs, it's > btrfs check). I'd suggest updating that to at least something around the > 4.4 level to match the kernel, tho you can upgrade to the latest 4.7.2 > (don't try 4.6 or 4.7 previous to 4.7.2, or don't btrfs check --repair if > you do, as there's a bug with it in those versions that's fixed in 4.7.2) > if you like, as newer userspace is designed to work with older kernels as > well. 
> > Besides which, while old btrfs userspace isn't a big deal (other than > translating back and forth between old style and new style commands) when > your filesystems are running pretty much correctly, as in that case all > userspace is doing in most cases is calling the kernel to do the real > work anyway, it becomes a much bigger deal when something goes wrong, > because it's userspace code that's executing with btrfs check or btrfs > restore, and newer userspace knows about and can fix a LOT more problems > than the really ancient 3.12. > > -- > Duncan - List replies preferred. No HTML msgs. > "Every nonfree program has a lord, a master -- > and if you use the program, he is your master." Richard Stallman
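Duncan's chunk-allocation point can be made concrete with a toy sketch (sizes, names, and the 1 GiB chunk granularity are simplifications, not real btrfs internals): raw device space is handed out in chunks dedicated to either data or metadata, and once everything is allocated, an operation needing a new chunk of one type fails with ENOSPC even though chunks of the other type still have free space inside them. That is why a balance, which repacks part-full chunks and returns space to the unallocated pool, can cure the "no space left on device" error.

```python
# Toy model of btrfs chunk allocation (illustrative only).
# A device's raw space is carved into 1 GiB chunks; each chunk is
# dedicated to data or metadata the moment it is allocated.
class ToyFs:
    def __init__(self, size_gib):
        self.unallocated_gib = size_gib
        self.chunks = []                 # kinds of allocated chunks

    def allocate_chunk(self, kind):
        if self.unallocated_gib == 0:
            # No raw space for a NEW chunk, even if existing chunks of
            # the other kind still have free space inside them.
            raise OSError(28, "No space left on device")
        self.unallocated_gib -= 1
        self.chunks.append(kind)

fs = ToyFs(size_gib=8)
for _ in range(7):
    fs.allocate_chunk("data")            # large files eat 7 GiB
fs.allocate_chunk("metadata")            # one 1 GiB metadata chunk

# Data chunks may be half empty, yet a snapshot-heavy workload that
# needs a second metadata chunk still hits ENOSPC:
try:
    fs.allocate_chunk("metadata")
except OSError as e:
    print(e.strerror)                    # "No space left on device"
```

The real-world equivalent of the check is btrfs fi show (size equal to used means fully allocated) plus btrfs fi df to see how full each chunk type actually is, followed by a balance to reclaim unallocated space.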
df -i shows 0 inodes 0 used 0 free on 4.4.0-36-generic Ubuntu 14 - Bug or not?
I don't expect accurate data on a btrfs file system when using df, but after upgrading to kernel 4.4.0 I get the following: $ df -i ... /dev/sdc3 0 0 0 - /home /dev/sdc4 0 0 0 - /vm0 ... Where /dev/sdc3 and /dev/sdc4 are btrfs filesystems. So is this a bug or not? I ask because root (/dev/sdc3) began to display the error message "no space left on device", which was eventually cured by deleting old snapshots then btrfs fi sync and btrfs balance. fi show and fi df show space, even when df -i shows 0 inodes: sudo btrfs fi show / ... Label: none uuid: 9acdb642-d743-4c2a-a59f-811022c2a2b0 Total devices 1 FS bytes used 23.86GiB devid 1 size 60.00GiB used 42.03GiB path /dev/sdc3 sudo btrfs fi df / Data, single: total=37.00GiB, used=21.11GiB System, single: total=32.00MiB, used=16.00KiB Metadata, single: total=5.00GiB, used=2.75GiB unknown, single: total=512.00MiB, used=0.00 Please excuse my inexperience here; I don't know how to use btrfs commands to show inodes. btrfs inspect-internal will reference an inode as near as I can tell, but won't list the used and free inodes ("free" may not be the correct word here, given btrfs architecture). 
Many Thanks, Gordon Machine Specs: % uname -a Linux Bon-E 4.4.0-36-generic #55~14.04.1-Ubuntu SMP Fri Aug 12 11:49:30 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux % btrfs --version Btrfs v3.12 % sudo btrfs fi show Label: none uuid: 9acdb642-d743-4c2a-a59f-811022c2a2b0 Total devices 1 FS bytes used 23.86GiB devid 1 size 60.00GiB used 42.03GiB path /dev/sdc3 Label: vm0 uuid: 72539416-d30e-4a34-8b2d-b2369d1fb075 Total devices 1 FS bytes used 349.96GiB devid 1 size 374.66GiB used 374.66GiB path /dev/sdc4 dmesg does not appear to show anything useful for btrfs or device, but mount shows: % mount | grep btrfs /dev/sdc3 on / type btrfs (rw,ssd,subvol=@) /dev/sdc3 on /home type btrfs (rw,ssd,subvol=@home) /dev/sdc4 on /vm0 type btrfs (rw,ssd,space_cache,compress=lzo,subvol=@vm0) /dev/sdc3 on /mnt/btrfs-root type btrfs (rw)
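For what it's worth, df -i derives its columns from the statfs(2) fields f_files (total inodes) and f_ffree (free inodes). Btrfs reports 0 for both since inodes are created on demand, so there is no fixed total to report, and df prints "0 0 0 -". A quick way to look at the raw fields from Python (the output depends on whatever filesystem the path lives on):

```python
import os

# df -i computes Inodes/IUsed/IFree from statvfs f_files and f_ffree.
# On ext4 these are fixed at mkfs time; btrfs fills both with 0, which
# is why df -i shows "0 0 0 -" for btrfs mounts.
st = os.statvfs("/")
total = st.f_files
free = st.f_ffree
used = total - free
print(f"Inodes: {total}  IUsed: {used}  IFree: {free}")
```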