Re: [gentoo-user] Re: USB crucial file recovery
On Tue, Aug 30, 2016 at 4:42 PM, Grant Edwards wrote: > > There's nothing in Gentoo that guarantees everybody has ext2 support > in their kernels. That said, I agree that ext2 (or perhaps ext3 with > journalling disabled -- I've always been a bit fuzzy on whether that's > exactly the same thing or not) Sorry, I just wanted to chime in on one thing. While a journal probably will cause more flash wear, it also potentially adds data integrity. Now consider that the original message that started this whole thread was important files being stored on flash and being corrupted. Getting rid of the journal might not be the best move. Unless you have a LOT of writes to flash you're not going to wear it out, especially with wear-leveling algorithms. Oh, and finally, if it matters that much, have a backup... -- Rich
Re: [gentoo-user] Re: USB crucial file recovery
Am 30.08.2016 um 22:46 schrieb Rich Freeman: > On Tue, Aug 30, 2016 at 4:42 PM, Grant Edwards > wrote: >> There's nothing in Gentoo that guarantees everybody has ext2 support >> in their kernels. That said, I agree that ext2 (or perhaps ext3 with >> journalling disabled -- I've always been a bit fuzzy on whether that's >> exactly the same thing or not) > Sorry, I just wanted to chime in on one thing. While a journal > probably will cause more flash wear, it also potentially adds data > integrity. > > Now consider that the original message that started this whole thread > was important files being stored on flash and being corrupted. > Getting rid of the journal might not be the best move. > > Unless you have a LOT of writes to flash you're not going to wear it > out, especially with wear-leveling algorithms. > > Oh, and finally, if it matters that much, have a backup... > the journal does not add any data integrity benefits at all. It just makes it more likely that the fs is in a sane state if there is a crash. Likely. Not a guarantee. Your data? No one cares. If you want an fs that cares about your data: zfs.
Re: [gentoo-user] Re: USB crucial file recovery
On Tue, Aug 30, 2016 at 4:58 PM, Volker Armin Hemmann wrote: > > the journal does not add any data integrity benefits at all. It just > makes it more likely that the fs is in a sane state if there is a crash. > Likely. Not a guarantee. Your data? No one cares. > That depends on the mode of operation. In journal=data I believe everything gets written twice, which should make it fairly immune to most forms of corruption. f2fs would also have this benefit. Data is not overwritten in-place in a log-based filesystem; they're essentially journaled by their design (actually, they're basically what you get if you ditch the regular part of the filesystem and keep nothing but the journal). > If you want an fs that cares about your data: zfs. > I won't argue that the COW filesystems have better data security features. It will be nice when they're stable in the main kernel. -- Rich
Re: [gentoo-user] Re: USB crucial file recovery
Am 30.08.2016 um 23:59 schrieb Rich Freeman: > On Tue, Aug 30, 2016 at 4:58 PM, Volker Armin Hemmann > wrote: >> the journal does not add any data integrity benefits at all. It just >> makes it more likely that the fs is in a sane state if there is a crash. >> Likely. Not a guarantee. Your data? No one cares. >> > That depends on the mode of operation. In journal=data I believe > everything gets written twice, which should make it fairly immune to > most forms of corruption. nope. Crash at the wrong time, data gone. FS hopefully sane. > > f2fs would also have this benefit. Data is not overwritten in-place > in a log-based filesystem; they're essentially journaled by their > design (actually, they're basically what you get if you ditch the > regular part of the filesystem and keep nothing but the journal). > >> If you want an fs that cares about your data: zfs. >> > I won't argue that the COW filesystems have better data security > features. It will be nice when they're stable in the main kernel. > it is not so much about cow, but integrity checks all the way from the moment the cpu spends some cycles on it. Caught some silent file corruptions that way. Switched to ECC ram and never saw them again.
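[Editor's note: the end-to-end integrity checking described above is what ZFS's per-block checksums buy you. On a plain filesystem the detection half can be approximated from userspace with a checksum manifest; a minimal sketch, where the STICK and MANIFEST paths are only examples:]

```shell
STICK=/mnt/usb                     # where the stick is mounted (example)
MANIFEST=~/usb-manifest.sha256     # kept OFF the stick on purpose, so it
                                   # cannot be corrupted along with the data

# Record a checksum for every file on the stick.
find "$STICK" -type f -exec sha256sum {} + > "$MANIFEST"

# Later, after remounting: any silent bit-flip shows up as a FAILED line.
sha256sum --check --quiet "$MANIFEST"
```

Unlike ZFS this only detects damage, it cannot repair it, and it says nothing about RAM flipping bits before the first hash was taken, which is where ECC comes in.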
Re: [gentoo-user] Re: USB crucial file recovery
On Tue, 30 Aug 2016 20:42:05 +0000 (UTC), Grant Edwards wrote: > > And why use exfat if you use linux? It is just not needed at all. > > I agree. If you want to transport something between Linux systems, > use ext2/3 and use "mount" options to handle the permission issues. You can't control ownership and permissions of existing files with mount options on a Linux filesystem. See man mount. -- Neil Bothwick The people who are wrapped up in themselves are overdressed.
Re: [gentoo-user] Re: USB crucial file recovery
On 31/08/2016 01:06, Grant Edwards wrote: > On 2016-08-30, Neil Bothwick wrote: >> On Tue, 30 Aug 2016 20:42:05 +0000 (UTC), Grant Edwards wrote: >> And why use exfat if you use linux? It is just not needed at all. >>> >>> I agree. If you want to transport something between Linux systems, >>> use ext2/3 and use "mount" options to handle the permission issues. >> >> You can't control ownership and permissions of existing files with mount >> options on a Linux filesystem. See man mount. > > Oops, you're right. I guess the options I was thinking of don't work > for ext2/3. They do work for fat, cifs, hfs, hpfs, ntfs, iso9660, and > various others. > > I very rarely put a writable filesystem on a USB flash drive. I treat > them either as a CD/DVD for installation ISO images, or I use them as > "tapes" and just tar stuff to/from them. > > I do make a point of using consistent UID/GID values across multiple > installations, so on the rare occasions I do put a writable filesystem > on a flash drive, it "just works". > Something intrigues me about this thread: If the file in question is so valuable and expensive, why don't you make another copy of the original onto a new USB stick? -- Alan McKinnon alan.mckin...@gmail.com
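[Editor's note: for the filesystems where those mount-time ownership options do work, the usual form is an fstab entry like the following. The device node, mountpoint, and numeric IDs are examples, not prescriptions.]

```
/dev/sdb1  /mnt/usb  vfat  noauto,user,uid=1000,gid=100,umask=022  0 0
```

Every file on the stick then appears owned by UID 1000 no matter which machine wrote it, which is exactly the knob ext2/3 lacks.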
Re: [gentoo-user] Re: USB crucial file recovery
>> > And why use exfat if you use linux? It is just not needed at all. >> >> I agree. If you want to transport something between Linux systems, >> use ext2/3 and use "mount" options to handle the permission issues. > > You can't control ownership and permissions of existing files with mount > options on a Linux filesystem. See man mount. So in order to use a USB stick between multiple Gentoo systems with ext2, I need to make sure my users have matching UIDs/GIDs? I think this is how I ended up on NTFS in the first place. Is there a filesystem that will make that unnecessary and exhibit better reliability than NTFS? - Grant
Re: [gentoo-user] Re: USB crucial file recovery
On 31/08/2016 02:08, Grant wrote: And why use exfat if you use linux? It is just not needed at all. >>> >>> I agree. If you want to transport something between Linux systems, >>> use ext2/3 and use "mount" options to handle the permission issues. >> >> You can't control ownership and permissions of existing files with mount >> options on a Linux filesystem. See man mount. > > > So in order to use a USB stick between multiple Gentoo systems with > ext2, I need to make sure my users have matching UIDs/GIDs? Yes. The uids/gids/modes in the inodes themselves are the owners and perms, you cannot override them. So unless you have mode=666, you will need matching UIDs/GIDs (which is a royal massive pain in the butt to bring about without NIS or similar). > I think > this is how I ended up on NTFS in the first place. Didn't we have this discussion about a year ago? Sounds familiar now. > Is there a > filesystem that will make that unnecessary and exhibit better > reliability than NTFS? Yes, FAT. It works and works well. Or exFAT which is Microsoft's solution to the problem of very large files on FAT. Which NTFS system are you using? ntfs kernel module? It's quite dodgy and unsafe with writes ntfs-ng on fuse? I find that one quite solid ntfs-ng does have an annoyance that has bitten me more than once. When ntfs-ng writes to an FS, it can get marked dirty. Somehow, when used in a Windows machine the driver there has issues with the FS. Remount it in Linux again and all is good. The cynic in me says that Microsoft didn't implement their own FS spec properly whereas ntfs-ng did :-) -- Alan McKinnon alan.mckin...@gmail.com
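[Editor's note: before reaching for NIS, it is worth checking whether two boxes already agree. A rough sketch: list the numeric IDs of root plus the ordinary login accounts on each machine and diff the output (the 1000 cutoff is the conventional start of ordinary accounts; adjust to taste).]

```shell
# Print login name, UID and GID for root and regular users;
# run this on both machines and compare/diff the two outputs.
awk -F: '$3 == 0 || $3 >= 1000 {print $1, $3, $4}' /etc/passwd
```

If the lines differ for a shared user, files written on one host will show up on the other as owned by someone else, or by a raw number.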
Re: [gentoo-user] Re: USB crucial file recovery
On Tue, 30 Aug 2016 17:08:26 -0700, Grant wrote: > > You can't control ownership and permissions of existing files with > > mount options on a Linux filesystem. See man mount. > > So in order to use a USB stick between multiple Gentoo systems with > ext2, I need to make sure my users have matching UIDs/GIDs? Yes, I said that when I first mentioned ext2. > I think > this is how I ended up on NTFS in the first place. Is there a > filesystem that will make that unnecessary and exhibit better > reliability than NTFS? FAT is tried and tested as long as you can live with the file size limitations. But USB sticks are not that reliable to start with, so relying on the filesystem to preserve your important files is not enough. You have spent far more time on this than you would have spent making backups of the file! -- Neil Bothwick Use Colgate toothpaste or end up with teeth like a Ferengi.
Re: [gentoo-user] Re: USB crucial file recovery
On Wed, 31 Aug 2016 08:45:22 +0100, Neil Bothwick wrote: > USB sticks are not that reliable to start with, so > relying on the filesystem to preserve your important files is not > enough. You have spent far more time on this than you would have spent > making backups of the file! Have you considered using cloud storage for the files instead? That also gives you the option of version control with some services. -- Neil Bothwick Remember that the Titanic was built by experts, and the Ark by a newbie
Re: [gentoo-user] Re: USB crucial file recovery
On Wed Aug 31 08:47:11 2016, Neil Bothwick wrote: > Have you considered using cloud storage for the files instead? That also > gives you the option of version control with some services. Seriously, why cloud? The Cloud is basically a marketing term that defines “Internet, like before, but cooler”, so it’s just someone else’s computer. I think that almost everybody here has more than one computer, or at least more than one hard disk drive. So… Why not use it? You will know who owns your data. -- alarig
Re: [gentoo-user] Re: USB crucial file recovery
On Wed, Aug 31, 2016 at 6:30 AM, Alarig Le Lay wrote: > On Wed Aug 31 08:47:11 2016, Neil Bothwick wrote: >> Have you considered using cloud storage for the files instead? That also >> gives you the option of version control with some services. > > Seriously, why cloud? The Cloud is basically a marketing term that > defines “Internet, like before, but cooler”, so it’s just someone else’s > computer. I think that almost everybody here has more than one computer, > or at least more than one hard disk drive. So… Why not use it? You > will know who owns your data. It might have something to do with the fact that cloud services at least run backups of their servers. I'd be the first to agree that it is possible to do a better job yourself at providing the sorts of services you find on dropbox, google drive, lastpass, and so on. However, the reality is that most people don't actually do a better job with it, which is why I see the occasional post on Facebook about how some relative lost all their files when their hard drive crashed, or when some ransomware came along. Most who have "backups" just have a USB hard drive with some software that came with it, which is probably always mounted. If you know how to professionally manage a server, then sure, feel free to DIY. Though, you might be surprised at how many people who do know how to professionally manage servers still use cloud services. The plethora of clients make them convenient for some things (though I always back them up). And I store all my important backups encrypted on S3 (I don't care if they lose them as long as it isn't on the same day that I need them, and if they want to try to data mine files that have gone through gpg I wish them good luck). -- Rich
Re: [gentoo-user] Re: USB crucial file recovery
On Wed, 31 Aug 2016 12:30:42 +0200, Alarig Le Lay wrote: > On Wed Aug 31 08:47:11 2016, Neil Bothwick wrote: > > Have you considered using cloud storage for the files instead? That > > also gives you the option of version control with some services. > > Seriously, why cloud? The Cloud is basically a marketing term that > defines “Internet, like before, but cooler”, so it’s just someone else’s > computer. Not necessarily; it's a catch-all term for network storage, which could be ownCloud running on a LAN. However, professionally provided services are many orders of magnitude safer than storing important files on a no-name USB stick using a reverse engineered filesystem running through a userspace layer. Or you could simply use a shared folder synced with something like SyncThing for everyone to access the files. Then you have safer hard drive storage and some level of backup. -- Neil Bothwick DCE seeks DTE for mutual exchange of data.
Re: [gentoo-user] Re: USB crucial file recovery
On Wednesday, August 31, 2016 12:12:15 AM Volker Armin Hemmann wrote: > Am 30.08.2016 um 23:59 schrieb Rich Freeman: > > On Tue, Aug 30, 2016 at 4:58 PM, Volker Armin Hemmann > > > > wrote: > >> the journal does not add any data integrity benefits at all. It just > >> makes it more likely that the fs is in a sane state if there is a crash. > >> Likely. Not a guarantee. Your data? No one cares. > > > > That depends on the mode of operation. In journal=data I believe > > everything gets written twice, which should make it fairly immune to > > most forms of corruption. > > nope. Crash at the wrong time, data gone. FS hopefully sane. No, seriously. Mount with data=ordered. Per ext4(5): data={journal|ordered|writeback} Specifies the journaling mode for file data. Metadata is always journaled. To use modes other than ordered on the root filesystem, pass the mode to the kernel as boot parameter, e.g. rootflags=data=journal. journal All data is committed into the journal prior to being written into the main filesystem. ordered This is the default mode. All data is forced directly out to the main file system prior to its metadata being committed to the journal. writeback Data ordering is not preserved – data may be written into the main filesystem after its metadata has been committed to the journal. This is rumoured to be the highest-throughput option. It guarantees internal filesystem integrity, however it can allow old data to appear in files after a crash and journal recovery. In writeback mode, only filesystem metadata goes through the journal. This guarantees that the filesystem's structure itself will remain intact in the event of a crash. In data=journal mode, the contents of files pass through the journal as well, ensuring that, at least as far as the filesystem's responsibility is concerned, the data will be intact in the event of a crash. 
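[Editor's note: selecting the journaling mode is just a mount option (with the rootflags caveat from the man page excerpt above applying to the root filesystem). An illustrative fstab line, with device and mountpoint as placeholders:]

```
/dev/sdb1  /mnt/data  ext4  defaults,data=journal  0 2
```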
Now, I can still think of ways you can lose data in data=journal mode:

* You mounted the filesystem with barrier=0 or with nobarrier; this can result in data writes going to disk out of order, if the I/O stack supports barriers. If you say "my file is ninety bytes", "here are ninety bytes of data, all 9s", "my file is now thirty bytes", "here are thirty bytes of data, all 3s", then in the end you should have a thirty-byte file filled with 3s. If you have barriers enabled and you crash halfway through the whole process, you should find a file of ninety bytes, all 9s. But if you have barriers disabled, the data may hit disk as though you'd said "my file is ninety bytes, here are ninety bytes of data, all 9s, here are thirty bytes of data, all 3s, now my file is thirty bytes." If that happens, and you crash partway through the commit to disk, you may see a ninety-byte file consisting of thirty 3s and sixty 9s. Or things may land such that you see a thirty-byte file of 9s.

* Your application didn't flush its writes to disk when it should have.

* Your vm.dirty_bytes or vm.dirty_ratio are too high, you've been writing a lot to disk, and the kernel still has a lot of data buffered waiting to be written. (Well, that can always lead to data loss regardless of how high those settings are, which is why applications should flush their writes.)

* You've used hdparm to enable write buffers in your hard disks, and your hard disks lose power while their buffers have data waiting to be written.

* You're using a buggy disk device that does a poor job of handling power loss. Such as some SSDs which don't have large enough capacitors for their own write reordering. Or just about any flash drive.

* There's a bug in some code, somewhere.

> > > f2fs would also have this benefit.
Data is not overwritten in-place > > in a log-based filesystem; they're essentially journaled by their > > design (actually, they're basically what you get if you ditch the > > regular part of the filesystem and keep nothing but the journal). > > > >> If you want an fs that cares about your data: zfs. > > > > I won't argue that the COW filesystems have better data security > > features. It will be nice when they're stable in the main kernel. > > it is not so much about cow, but integrity checks all the way from the > moment the cpu spends some cycles on it. Caught some silent file > corruptions that way. Switched to ECC ram and never saw them again. In-memory corruption of data is a universal hazard. ECC should be the norm, not the exception, honestly. -- :wq
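[Editor's note: of the failure modes listed above, "your application didn't flush its writes" is the one a user can close directly, by having the copying tool fsync before it reports success. A sketch using GNU dd follows; the file names are examples. Plain cp has no equivalent flag, so the closest substitute is to follow the copy with sync.]

```shell
# conv=fsync makes dd call fsync() once after its final write, so the
# command only exits after the data has really been handed to the device.
dd if=important.file of=/mnt/usb/important.file bs=1M conv=fsync
```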
Re: [gentoo-user] Re: USB crucial file recovery
>> Is there a >> filesystem that will make that unnecessary and exhibit better >> reliability than NTFS? > > Yes, FAT. It works and works well. > Or exFAT which is Microsoft's solution to the problem of very large > files on FAT. FAT32 won't work for me since I need to use files larger than 4GB. I know it's beta software but should exfat be more reliable than ntfs? > Which NTFS system are you using? > > ntfs kernel module? It's quite dodgy and unsafe with writes > ntfs-ng on fuse? I find that one quite solid I'm using ntfs-ng as opposed to the kernel option(s). - Grant
Re: [gentoo-user] Re: USB crucial file recovery
On Wed, Aug 31, 2016 at 10:33 AM, Michael Mol wrote: > On Wednesday, August 31, 2016 12:12:15 AM Volker Armin Hemmann wrote: >> Am 30.08.2016 um 23:59 schrieb Rich Freeman: >> > >> > That depends on the mode of operation. In journal=data I believe >> > everything gets written twice, which should make it fairly immune to >> > most forms of corruption. >> >> nope. Crash at the wrong time, data gone. FS hopefully sane. > > In data=journal mode, the contents of files pass through the journal as well, > ensuring that, at least as far as the filesystem's responsibility is > concerned, > the data will be intact in the event of a crash. > Correct. As with any other sane filesystem, if you're using data=journal mode with ext4 then your filesystem will always reflect the state of data and metadata on a transaction boundary. If you write something to disk and pull the power, then after fsck the disk will either contain the contents of your files before you did the write, or after the write was completed, and never anything in-between. This is barring silent corruption, which ext4 does not protect against. > Now, I can still think of ways you can lose data in data=journal mode: Agree, though all of those concerns apply to any filesystem. If you unplug a device without unmounting it, or pull the power when writes are pending, or never hit save, or whatever, then your data won't end up on disk. Now, a good filesystem should ensure that the data which is on disk is completely consistent. That is, you won't get half of a write, just all or nothing. >> >> If you want an fs that cares about your data: zfs. >> > >> > I won't argue that the COW filesystems have better data security >> > features. It will be nice when they're stable in the main kernel. >> >> it is not so much about cow, but integrity checks all the way from the >> moment the cpu spends some cycles on it. What COW does get you is the security of data=journal without the additional cost of writing it twice. 
Since data is not overwritten in place, you ensure that on an fsck the system can either roll the write completely forwards or backwards. With data=ordered on ext4 there is always the risk of a half-overwritten file if you are overwriting in place. But I agree that many of the zfs/btrfs data integrity features could be implemented on a non-cow filesystem. Maybe ext5 will have some of them, though I'm not sure how much work is going into that vs just fixing btrfs, or begging Oracle to re-license zfs. >> Caught some silent file >> corruptions that way. Switched to ECC ram and never saw them again. > > In-memory corruption of data is a universal hazard. ECC should be the norm, > not the exception, honestly. > Couldn't agree more here. The hardware vendors aren't helping here though, in their quest to try to make more money from those sensitive to such things. I believe Intel disables ECC on anything less than an i7. As I understand it most of the mainline AMD offerings support it (basically anything over $80 or so), but it isn't clear to me what motherboard support is required and the vendors almost never make a mention of it on anything reasonably-priced. If your RAM gets hosed then any filesystem is going to store bad data or metadata for a multitude of reasons. The typical x86+ arch wasn't designed to handle hardware failures around anything associated with cpu/ram. The ZFS folks tend to make a really big deal out of ECC, but as far as I'm aware it isn't any more important for ZFS than anything else. I think ZFS just tends to draw people really concerned with data integrity, and once you've controlled everything that happens after the data gets sent to the hard drive you tend to start thinking about what happens to it beforehand. I had to completely reinstall a windows system not long ago due to memory failure and drive corruption. Wasn't that big a deal since I don't keep anything on a windows box that isn't disposable, or backed up to something else. -- Rich
Re: [gentoo-user] Re: USB crucial file recovery
On Wed, Aug 31, 2016 at 08:47:11AM +0100, Neil Bothwick wrote: > On Wed, 31 Aug 2016 08:45:22 +0100, Neil Bothwick wrote: > > > USB sticks are not that reliable to start with, so > > relying on the filesystem to preserve your important files is not > > enough. You have spent far more time on this than you would have spent > > making backups of the file! > > Have you considered using cloud storage for the files instead? That also > gives you the option of version control with some services. The initial backup of my hard drives would easily burn through my monthly gigabytes allotment. Until everybody gets truly unlimited bandwidth, forget about it. -- Walter Dnes I don't run "desktop environments"; I run useful applications
Re: [gentoo-user] Re: USB crucial file recovery
On Wed, 31 Aug 2016 13:09:43 -0400, waltd...@waltdnes.org wrote: > > Have you considered using cloud storage for the files instead? That > > also gives you the option of version control with some services. > > The initial backup of my hard drives would easily burn through my > monthly gigabytes allotment. Until everybody gets truly unlimited > bandwidth, forget about it. Who mentioned using it for backups? I suggested it as an alternative to a USB stick for sharing a file or two between machines. What is your monthly gigabyte allotment for your LAN? Keeping the files on a personal cloud, or NAS storage, may well be the better alternative. -- Neil Bothwick "Time is the best teacher; unfortunately it kills all the students"
Re: [gentoo-user] Re: USB crucial file recovery
On 31/08/2016 17:25, Grant wrote: Is there a filesystem that will make that unnecessary and exhibit better reliability than NTFS? Yes, FAT. It works and works well. Or exFAT which is Microsoft's solution to the problem of very large files on FAT. FAT32 won't work for me since I need to use files larger than 4GB. I know it's beta software but should exfat be more reliable than ntfs? It doesn't do all the fancy journalling that ntfs does, so based solely on complexity, it ought to be more reliable. None of us have done real tests and mentioned it here, so we really don't know how it pans out in the real world. Do a bunch of tests yourself and decide Which NTFS system are you using? ntfs kernel module? It's quite dodgy and unsafe with writes ntfs-ng on fuse? I find that one quite solid I'm using ntfs-ng as opposed to the kernel option(s). I'm offering 10 to 1 odds that your problems came from a faulty USB stick, or maybe one that you yanked too soon
Re: [gentoo-user] Re: USB crucial file recovery
On August 31, 2016 11:45:15 PM GMT+02:00, Alan McKinnon wrote: >On 31/08/2016 17:25, Grant wrote: Is there a filesystem that will make that unnecessary and exhibit better reliability than NTFS? >>> >>> Yes, FAT. It works and works well. >>> Or exFAT which is Microsoft's solution to the problem of very large >>> files on FAT. >> >> >> FAT32 won't work for me since I need to use files larger than 4GB. I >> know it's beta software but should exfat be more reliable than ntfs? > >It doesn't do all the fancy journalling that ntfs does, so based solely > >on complexity, it ought to be more reliable. > >None of us have done real tests and mentioned it here, so we really >don't know how it pans out in the real world. > >Do a bunch of tests yourself and decide When I was a student, one of my professors used FAT to explain how filesystems work. The reason for this is that the actual filesystem is quite simple to follow and fixing can actually be done by hand using a hex editor. This is no longer possible with other filesystems. Then again, a lot of embedded devices (especially digital cameras) don't even get FAT correctly. Leading to broken images. Those implementations are broken at the point where fragmentation would occur. Solution: never delete pictures on the camera. Simply move them off and do it on a computer. >>> Which NTFS system are you using? >>> >>> ntfs kernel module? It's quite dodgy and unsafe with writes >>> ntfs-ng on fuse? I find that one quite solid >> >> >> I'm using ntfs-ng as opposed to the kernel option(s). > >I'm offering 10 to 1 odds that your problems came from a faulty USB >stick, or maybe one that you yanked too soon I'm with Alan here. I have seen too many handout USB sticks from conferences that don't last. I only use them for: Quickly moving a file from A to B. Booting the latest sysresccd Scanning a document Printing a PDF (For last 2, my printer has a USB slot) Important files are stored on my NAS which is backed up regularly. 
-- Joost -- Sent from my Android device with K-9 Mail. Please excuse my brevity.
Re: [gentoo-user] Re: USB crucial file recovery
On 01/09/2016 05:42, J. Roeleveld wrote: > On August 31, 2016 11:45:15 PM GMT+02:00, Alan McKinnon > wrote: >> On 31/08/2016 17:25, Grant wrote: > Is there a > filesystem that will make that unnecessary and exhibit better > reliability than NTFS? Yes, FAT. It works and works well. Or exFAT which is Microsoft's solution to the problem of very large files on FAT. >>> >>> >>> FAT32 won't work for me since I need to use files larger than 4GB. I >>> know it's beta software but should exfat be more reliable than ntfs? >> >> It doesn't do all the fancy journalling that ntfs does, so based solely >> >> on complexity, it ought to be more reliable. >> >> None of us have done real tests and mentioned it here, so we really >> don't know how it pans out in the real world. >> >> Do a bunch of tests yourself and decide > > When I was a student, one of my professors used FAT to explain how > filesystems work. The reason for this is that the actual filesystem is quite > simple to follow and fixing can actually be done by hand using a hex editor. > > This is no longer possible with other filesystems. > > Then again, a lot of embedded devices (especially digital cameras) don't even > get FAT correctly. Leading to broken images. > Those implementations are broken at the point where fragmentation would occur. > Solution: never delete pictures on the camera. Simply move them off and do it > on a computer. > Which NTFS system are you using? ntfs kernel module? It's quite dodgy and unsafe with writes ntfs-ng on fuse? I find that one quite solid >>> >>> >>> I'm using ntfs-ng as opposed to the kernel option(s). >> >> I'm offering 10 to 1 odds that your problems came from a faulty USB >> stick, or maybe one that you yanked too soon > > I'm with Alan here. I have seen too many handout USB sticks from conferences > that don't last. I only use them for: > Quickly moving a file from A to B. 
> Booting the latest sysresccd > Scanning a document > Printing a PDF > (For last 2, my printer has a USB slot) > > Important files are stored on my NAS which is backed up regularly. Indeed. The trouble with backups is that they are difficult to get right, time consuming, easy to ignore, and very very expensive (time and money wise) -- Alan McKinnon alan.mckin...@gmail.com
Re: [gentoo-user] Re: USB crucial file recovery
On Wednesday, August 31, 2016 11:45:15 PM Alan McKinnon wrote: > On 31/08/2016 17:25, Grant wrote: > >> Which NTFS system are you using? > >> > >> ntfs kernel module? It's quite dodgy and unsafe with writes > >> ntfs-ng on fuse? I find that one quite solid > > > > I'm using ntfs-ng as opposed to the kernel option(s). > > I'm offering 10 to 1 odds that your problems came from ... one that you > yanked too soon (pardon the in-line snip, while I get on my soap box) The likelihood of this happening can be greatly reduced by setting vm.dirty_bytes to something like 2097152 and vm.dirty_background_bytes to something like 1048576. This prevents the kernel from queuing up as much data for sending to disk. The application doing the copy or write will normally report "complete" long before writes to slow media are actually...complete. Setting vm.dirty_bytes to something low prevents the kernel's backlog of data from getting so long. vm.dirty_bytes has another, closely-related setting, vm.dirty_ratio. vm.dirty_ratio is a percentage of RAM that is used for dirty bytes. If vm.dirty_ratio is set, vm.dirty_bytes will read 0. If vm.dirty_bytes is set, vm.dirty_ratio will read 0. The default is for vm.dirty_ratio to be 20, which means up to 20% of your memory can find itself used as a write buffer for data on its way to a filesystem. On a system with only 2GiB of RAM, that's 409MiB of data that the kernel may still be waiting to push through the filesystem layer! If you're writing to, say, a class 10 SDHC card, the data may not be at rest for another 40s after the application reports the copy operation is complete! If you've got a system with 8GiB of memory, multiply all that by four. The defaults for vm.dirty_bytes and vm.dirty_background_bytes are, IMO, badly broken and an insidious source of problems for both regular Linux users and system administrators. -- :wq
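[Editor's note: for reference, these knobs live under /proc/sys/vm/ and persist via /etc/sysctl.conf or a file under /etc/sysctl.d/, applied with sysctl -p. The values below are the roughly 2 MiB / 1 MiB figures suggested above, not universal truths.]

```
# Cap un-flushed write-back data at ~2 MiB, and start background
# flushing once ~1 MiB is dirty.
vm.dirty_bytes = 2097152
vm.dirty_background_bytes = 1048576
```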
Re: [gentoo-user] Re: USB crucial file recovery
On Thu, Sep 1, 2016 at 8:41 AM, Michael Mol wrote: > > The defaults for vm.dirty_bytes and vm.dirty_background_bytes are, IMO, badly > broken and an insidious source of problems for both regular Linux users and > system administrators. > It depends on whether you tend to yank out drives without unmounting them, or if you have a poorly-implemented database that doesn't know about fsync and tries to implement transactions across multiple hosts. The flip side of all of this is that you can save-save-save in your applications and not sit there and watch your application wait for the USB drive to catch up. It also allows writes to be combined more efficiently (less of an issue for flash, but you probably can still avoid multiple rounds of overwriting data in place if multiple revisions come in succession, and metadata updating can be consolidated). For a desktop-oriented workflow I'd think that having nice big write buffers would greatly improve the user experience, as long as you hit that unmount button or pay attention to that flashing green light every time you yank a drive. -- Rich
Re: [gentoo-user] Re: USB crucial file recovery
On Thursday, September 01, 2016 08:41:39 AM Michael Mol wrote: > On Wednesday, August 31, 2016 11:45:15 PM Alan McKinnon wrote: > > On 31/08/2016 17:25, Grant wrote: > > >> Which NTFS system are you using? > > >> > > >> ntfs kernel module? It's quite dodgy and unsafe with writes > > >> ntfs-ng on fuse? I find that one quite solid > > > > > > I'm using ntfs-ng as opposed to the kernel option(s). > > > > I'm offering 10 to 1 odds that your problems came from ... one that you > > yanked too soon > > (pardon the in-line snip, while I get on my soap box) > > The likelihood of this happening can be greatly reduced by setting > vm.dirty_bytes to something like 2097152 and vm.dirty_background_bytes to > something like 1048576. This prevents the kernel from queuing up as much > data for sending to disk. The application doing the copy or write will > normally report "complete" long before writes to slow media are > actually...complete. Setting vm.dirty_bytes to something low prevents the > kernel's backlog of data from getting so long. > > vm.dirty_bytes has another, closely-related setting, vm.dirty_ratio. > vm.dirty_ratio is a percentage of RAM that is used for dirty data. If > vm.dirty_ratio is set, vm.dirty_bytes will read 0. If vm.dirty_bytes > is set, vm.dirty_ratio will read 0. > > The default is for vm.dirty_ratio to be 20, which means up to 20% of > your memory can find itself used as a write buffer for data on its way to a > filesystem. On a system with only 2GiB of RAM, that's 409MiB of data that > the kernel may still be waiting to push through the filesystem layer! If > you're writing to, say, a class 10 SDHC card, the data may not be at rest > for another 40s after the application reports the copy operation is > complete! > > If you've got a system with 8GiB of memory, multiply all that by four.
> > The defaults for vm.dirty_bytes and vm.dirty_background_bytes are, IMO, > badly broken and an insidious source of problems for both regular Linux > users and system administrators. I would prefer to be able to have different settings per disk. For swappable drives like USB sticks, I would set small values. But for built-in drives, I'd prefer to keep the default values, or tune them to the actual drive. -- Joost
Re: [gentoo-user] Re: USB crucial file recovery
On Thursday, September 01, 2016 09:35:15 AM Rich Freeman wrote: > On Thu, Sep 1, 2016 at 8:41 AM, Michael Mol wrote: > > The defaults for vm.dirty_bytes and vm.dirty_background_bytes are, IMO, > > badly broken and an insidious source of problems for both regular Linux > > users and system administrators. > > It depends on whether you tend to yank out drives without unmounting > them, The sad truth is that many (most?) users don't understand the idea of unmounting. Even Microsoft largely gave up, having flash drives "optimized for data safety" as opposed to "optimized for speed". While it'd be nice if the average John Doe would follow instructions, anyone who's worked in IT understands that the average John Doe...doesn't. And above-average ones assume they know better and don't have to. As such, queuing up that much data while reporting to the user that the copy is already complete violates the principle of least surprise. > or if you have a poorly-implemented database that doesn't know > about fsync and tries to implement transactions across multiple hosts. I don't know off the top of my head what database implementation would do that, though I could think of a dozen that could be vulnerable if they didn't sync properly. The real culprits that come to mind, for me, are copy tools. Whether it's dd, mv, cp, or a copy dialog in GNOME or KDE. I would love to see CoDel-style time-based buffer sizes applied throughout the stack. The user may not care about how many milliseconds it takes for a read to turn into a completed write on the face of it, but they do like accurate time estimates and low-latency UI. > > The flip side of all of this is that you can save-save-save in your > applications and not sit there and watch your application wait for the > USB drive to catch up.
It also allows writes to be combined more > efficiently (less of an issue for flash, but you probably can still > avoid multiple rounds of overwriting data in place if multiple > revisions come in succession, and metadata updating can be > consolidated). I recently got bit by vim's easytags causing saves to take a couple dozen seconds, leading me not to save as often as I used to. And then a bunch of code I wrote Monday...wasn't there any more. I was sad. > > For a desktop-oriented workflow I'd think that having nice big write > buffers would greatly improve the user experience, as long as you hit > that unmount button or pay attention to that flashing green light > every time you yank a drive. Realistically, users aren't going to pay attention. You and I do, but that's because we understand the *why* behind the importance. I love me fat write buffers for write combining, page caches etc. But, IMO, it shouldn't take longer than 1-2s (barring spinning rust disk wake) for full buffers to flush to disk; at modern write speeds (even for a slow spinning disc), that's going to be a dozen or so megabytes of data, which is plenty big for write-combining purposes. -- :wq
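For what it's worth, the kernel does expose time-based writeback knobs alongside the size-based ones, which is the nearest existing thing to the 1-2s bound suggested above. A sketch with illustrative values:

```shell
# /etc/sysctl.conf -- time-based writeback tuning (units: centiseconds).
# Age at which dirty data becomes eligible for writeback
# (kernel default 3000 = 30s; 200 = 2s here):
vm.dirty_expire_centisecs = 200
# How often the writeback threads wake up (kernel default 500 = 5s):
vm.dirty_writeback_centisecs = 100
```

These bound how *old* dirty data may get rather than how *much* of it exists, so they complement rather than replace the dirty_bytes limits discussed earlier in the thread.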
Re: [gentoo-user] Re: USB crucial file recovery
On Thursday, September 01, 2016 04:21:18 PM J. Roeleveld wrote: > On Thursday, September 01, 2016 08:41:39 AM Michael Mol wrote: > > On Wednesday, August 31, 2016 11:45:15 PM Alan McKinnon wrote: > > > On 31/08/2016 17:25, Grant wrote: > > > >> Which NTFS system are you using? > > > >> > > > >> ntfs kernel module? It's quite dodgy and unsafe with writes > > > >> ntfs-ng on fuse? I find that one quite solid > > > > > > > > I'm using ntfs-ng as opposed to the kernel option(s). > > > > > > I'm offering 10 to 1 odds that your problems came from ... one that you > > > yanked too soon > > > > (pardon the in-line snip, while I get on my soap box) > > > > The likelihood of this happening can be greatly reduced by setting > > vm.dirty_bytes to something like 2097152 and vm.dirty_background_bytes to > > something like 1048576. This prevents the kernel from queuing up as much > > data for sending to disk. The application doing the copy or write will > > normally report "complete" long before writes to slow media are > > actually...complete. Setting vm.dirty_bytes to something low prevents the > > kernel's backlog of data from getting so long. > > > > vm.dirty_bytes has another, closely-related setting, vm.dirty_ratio. > > vm.dirty_ratio is a percentage of RAM that is used for dirty data. > > If vm.dirty_ratio is set, vm.dirty_bytes will read 0. If > > vm.dirty_bytes is set, vm.dirty_ratio will read 0. > > > > The default is for vm.dirty_ratio to be 20, which means up to 20% of > > your memory can find itself used as a write buffer for data on its way to a > > filesystem. On a system with only 2GiB of RAM, that's 409MiB of data that > > the kernel may still be waiting to push through the filesystem layer! If > > you're writing to, say, a class 10 SDHC card, the data may not be at rest > > for another 40s after the application reports the copy operation is > > complete!
> > > > If you've got a system with 8GiB of memory, multiply all that by four. > > > > The defaults for vm.dirty_bytes and vm.dirty_background_bytes are, IMO, > > badly broken and an insidious source of problems for both regular Linux > > users and system administrators. > > I would prefer to be able to have different settings per disk. > Swappable drives like USB, I would put small numbers. > But for built-in drives, I'd prefer to keep default values or tuned to the > actual drive. The problem is that's not really possible. vm.dirty_bytes and vm.dirty_background_bytes deal with the page cache, which sits at the VFS layer, not the block device layer. It could certainly make sense to apply it on a per-mount basis, though. -- :wq
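A rough per-mount approximation does exist already: the sync and dirsync mount options force synchronous writes for a single filesystem while leaving the global sysctls alone. A sketch of an /etc/fstab entry, where the device label and mount point are hypothetical:

```shell
# /etc/fstab -- synchronous writes for one removable stick only.
# noauto,user: not mounted at boot, mountable by ordinary users;
# sync,dirsync: push file data and directory updates out immediately;
# noatime: skip access-time updates to reduce flash wear.
/dev/disk/by-label/USBSTICK  /mnt/usb  vfat  noauto,user,sync,dirsync,noatime  0 0
```

One caveat: sync on FAT-family filesystems is known to rewrite the FAT region heavily, so this trades extra flash wear for safety; the size- and time-based sysctls are gentler on the medium.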
Re: [gentoo-user] Re: USB crucial file recovery
On Thu, Sep 1, 2016 at 10:55 AM, Michael Mol wrote: > > The sad truth is that many (most?) users don't understand the idea of > unmounting. Even Microsoft largely gave up, having flash drives "optimized for > data safety" as opposed to "optimized for speed". While it'd be nice if the > average John Doe would follow instructions, anyone who's worked in IT > understands that the average John Doe...doesn't. And above-average ones assume > they know better and don't have to. > If these users are the target of your OS then you should probably tune the settings accordingly. This mailing list notwithstanding (sometimes), I don't think this is really Gentoo's core audience. -- Rich
Re: [gentoo-user] Re: USB crucial file recovery
Am 31.08.2016 um 16:33 schrieb Michael Mol: > > In data=journal mode, the contents of files pass through the journal as well, > ensuring that, at least as far as the filesystem's responsibility is > concerned, > the data will be intact in the event of a crash. a common misconception. But not true at all. Google a bit. > > Now, I can still think of ways you can lose data in data=journal mode: > > * You mounted the filesystem with barrier=0 or with nobarrier; this can result > in data writes going to disk out of order, if the I/O stack supports > barriers. > If you say "my file is ninety bytes" "here are ninety bytes of data, all 9s", > "my file is now thirty bytes", "here are thirty bytes of data, all 3s", then > in > the end you should have a thirty-byte file filled with 3s. If you have > barriers > enabled and you crash halfway through the whole process, you should find a > file > of ninety bytes, all 9s. But if you have barriers disabled, the data may hit > disk as though you'd said "my file is ninety bytes, here are ninety bytes of > data, all 9s, here are thirty bytes of data, all 3s, now my file is thirty > bytes." If that happens, and you crash partway through the commit to disk, > you > may see a ninety-byte file consisting of thirty 3s and sixty 9s. Or things may > land such that you see a thirty-byte file of 9s. not needed. > > * Your application didn't flush its writes to disk when it should have. not needed either. > > * Your vm.dirty_bytes or vm.dirty_ratio are too high, you've been writing a > lot to disk, and the kernel still has a lot of data buffered waiting to be > written. (Well, that can always lead to data loss regardless of how high > those > settings are, which is why applications should flush their writes.) > > * You've used hdparm to enable write buffers in your hard disks, and your > hard > disks lose power while their buffers have data waiting to be written.
> > * You're using a buggy disk device that does a poor job of handling power > loss. Such as some SSDs which don't have large enough capacitors for their > own > write reordering. Or just about any flash drive. > > * There's a bug in some code, somewhere. nope. > In-memory corruption of data is a universal hazard. ECC should be the norm, > not the exception, honestly. >
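As a quick sanity check on the barrier point being argued here: barriers are enabled by default for ext3/ext4 on modern kernels, and you can see whether a mounted filesystem has opted out by inspecting its effective mount options:

```shell
#!/bin/sh
# Print the filesystem type and mount options for the root filesystem.
# "nobarrier" or "barrier=0" in the options would mean write barriers
# are disabled, allowing the out-of-order data writes described above.
awk '$2 == "/" { print $3 ": " $4 }' /proc/mounts
```

The same one-liner works for any mount point by changing the `"/"` pattern, e.g. to `"/mnt/usb"`.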
Re: [gentoo-user] Re: USB crucial file recovery
On Thu, Sep 1, 2016 at 2:09 PM, Volker Armin Hemmann wrote: > > a common misconception. But not true at all. Google a bit. Feel free to enlighten us. My understanding is that data=journal means that all data gets written first to the journal. Completed writes will make it to the main filesystem after a crash, and incomplete writes will of course be rolled back, which is what you want. But simply disagreeing and saying to search Google is fairly useless, since you can find all kinds of junk on Google. You can't even guarantee that the same search terms will lead to the same results for two different people. And FWIW, this is a topic that Linus and the ext3 authors have disagreed with at points (not this specific question, but rather what the most appropriate defaults are). So, it isn't like there isn't room for disagreement on best practice, or that any two people with knowledge of the issues are guaranteed to agree. >> >> Now, I can still think of ways you can lose data in data=journal mode: >> >> * You mounted the filesystem with barrier=0 or with nobarrier; this can >> result > > not needed. Well, duh. He is telling people NOT to do this, because this is how you can LOSE data. >> >> * Your application didn't flush its writes to disk when it should have. > > not needed either. That very much depends on the application. If you need to ensure that transactions are in-sync with remote hosts (such as in a database) it is absolutely critical to flush writes. Applications shouldn't just flush on every write or close, because that causes needless disk thrashing. Yes, data will be lost if users have write caching enabled, and users who would prefer a slow system over one that loses more data when the power goes out should disable caching or buy a UPS. > > nope. Care to actually offer anything constructive? His advice was reasonably well-founded, even if I personally wouldn't do everything exactly as he prefers to do so. -- Rich
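The point about flushing when it matters, rather than on every write, is visible even in ordinary shell tools; GNU dd, for example, only guarantees data has reached stable storage if asked. A small demonstration writing throwaway files under /tmp:

```shell
#!/bin/sh
set -e
# Without conv=fsync, dd exits as soon as the data is in the page
# cache -- a crash or yanked drive can still lose it.
dd if=/dev/zero of=/tmp/plain.img bs=64k count=4 2>/dev/null
# With conv=fsync, dd calls fsync() once after the last write, so a
# successful exit means the data is actually on disk.
dd if=/dev/zero of=/tmp/synced.img bs=64k count=4 conv=fsync 2>/dev/null
ls -l /tmp/plain.img /tmp/synced.img
```

The single trailing fsync is the pattern Rich describes: one flush per transaction, not one per write, so you get durability where it counts without needless disk thrashing.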
Re: [gentoo-user] Re: USB crucial file recovery
On Thu, 1 Sep 2016 23:50:17 +0200, Kai Krakow wrote: > > > ext2 will work, but you'll have to mount it or chmod -R 0777, or > > > only root will be able to access it. > > > > That's not true. Whoever owns the files and directories will be able > > to access them, even if root mounted the stick, just like a hard > > drive. If you have the same UID on all your systems, chown -R > > youruser: /mount/point will make everything available on all > > systems. > > As long as uids match... That's what I said, whoever owns the files. As far as Linux is concerned, the UID is the person, usernames are just a convenience mapping to make life simpler for the wetware. -- Neil Bothwick deja noo - reminds you of the last time you visited Scotland
Re: [gentoo-user] Re: USB crucial file recovery
On Thu, Sep 1, 2016 at 6:35 PM, Kai Krakow wrote: > Am Tue, 30 Aug 2016 17:59:02 -0400 > schrieb Rich Freeman : > >> >> That depends on the mode of operation. In journal=data I believe >> everything gets written twice, which should make it fairly immune to >> most forms of corruption. > > No, journal != data integrity. A journal only ensures that data is written > transactionally. You won't end up with messed up metadata, and from > an API perspective and with journal=data, a partially written block of data > will be rewritten after recovering from a crash - up to the last fsync. > If it happens that this last fsync was half way into a file: Well, then > there's only your work written up to half of the file. Well, sure, but all an application needs to do is make sure it calls write on whole files, and not half-files. It doesn't need to fsync as far as I'm aware. It just needs to write consistent files in one system call. Then that write either will or won't make it to disk, but you won't get half of a write. > Journals only ensure consistency on API level, not integrity. Correct, but this is way better than not journaling or ordering data, which protects the metadata but doesn't ensure your files aren't garbled even if the application is careful. > > If you need integrity, so the file system can tell you if your file is > broken or not, you need checksums. > Btrfs and zfs fail in the exact same way in this particular regard. If you call write with half of a file, btrfs/zfs will tell you that half of that file was successfully written. But, it won't hold up for the other half of the file that the kernel hasn't been told about. The checksumming in these filesystems really only protects data from modification after it is written.
Sectors that were only half-written during an outage which have inconsistent checksums probably won't even be looked at during an fsck/mount, because the filesystem is just going to replay the journal and write right over them (or to some new block, still treating the half-written data as unallocated). These filesystems don't go scrubbing the disk to figure out what happened, they just replay the log back to the last checkpoint. The checksums are just used during routine reads to ensure the data wasn't somehow corrupted after it was written, in which case a good copy is used, assuming one exists. If not, at least you'll know about the problem. > If you need a way to recover from a half written file, you need a CoW > file system where you could, by luck, go back some generations. Only if you've kept snapshots, or plan to hex-edit your disk/etc. The solution here is to correctly use the system calls. > >> f2fs would also have this benefit. Data is not overwritten in-place >> in a log-based filesystem; they're essentially journaled by their >> design (actually, they're basically what you get if you ditch the >> regular part of the filesystem and keep nothing but the journal). > > This is log-structured, not journalled. You pointed that out, yes, but > you weakened that by writing "basically the same". I think the > difference is important. Mostly because the journal is a fixed area on > the disk, while a log-structured file system has no such journal. My point was that they're equivalent from the standpoint that every write either completes or fails and you don't get half-written data. Yes, I know how f2fs actually works, and this wasn't intended to be a primer on log-based filesystems. The COW filesystems have similar benefits since they don't overwrite data in place, other than maybe their superblocks (or whatever you call them).
I don't know what the on-disk format of zfs is, but btrfs has multiple copies of the tree root with a generation number so if something dies partway it is really easy for it to figure out where it left off (if none of the roots were updated then any partial tree structures laid down are in unallocated space and just get rewritten on the next commit, and if any were written then you have a fully consistent new tree used to update the remaining roots). One of these days I'll have to read up on the on-disk format of zfs as I suspect it would make an interesting contrast with btrfs. > > This point was raised because it supports checksums, not because it > supports CoW. Sure, but both provide benefits in these contexts. And the COW filesystems are also the only ones I'm aware of (at least in popular use) that have checksums. > > Log structured file systems are, btw, interesting for write-mostly > workloads on spinning disks because head movements are minimized. > They are not automatically helping dumb/simple flash translation layers. > This incorporates a little more logic by exploiting the internal > structure of flash (writing only sequentially in page sized blocks, > garbage collection and reuse only on erase block level). F2fs and > bcache (as a caching layer) do this. Not sure about the others. Sure. It is just really easy to do bi
Re: [gentoo-user] Re: USB crucial file recovery
On 02/09/2016 00:56, Kai Krakow wrote: Am Wed, 31 Aug 2016 02:32:24 +0200 schrieb Alan McKinnon : On 31/08/2016 02:08, Grant wrote: [...] [...] You can't control ownership and permissions of existing files with mount options on a Linux filesystem. See man mount. So in order to use a USB stick between multiple Gentoo systems with ext2, I need to make sure my users have matching UIDs/GIDs? Yes The uids/gids/modes in the inodes themselves are the owners and perms, you cannot override them. So unless you have mode=666, you will need matching UIDs/GIDs (which is a royal massive pain in the butt to bring about without NIS or similar). I think this is how I ended up on NTFS in the first place. Didn't we have this discussion about a year ago? Sounds familiar now Is there a filesystem that will make that unnecessary and exhibit better reliability than NTFS? Yes, FAT. It works and works well. Or exFAT which is Microsoft's solution to the problem of very large files on FAT. Which NTFS system are you using? ntfs kernel module? It's quite dodgy and unsafe with writes ntfs-ng on fuse? I find that one quite solid ntfs-ng does have an annoyance that has bitten me more than once. When ntfs-ng writes to an FS, it can get marked dirty. Somehow, when used in a Windows machine the driver there has issues with the FS. Remount it in Linux again and all is good. Well, ntfs-ng simply sets the dirty flag which to Windows means "needs chkdsk". So Windows complains upon mount that it needs to chkdsk the drive first. That's all. Nothing bad. No, that's not it. Read again what I wrote - I have a specific fail mode which I don't care to investigate, not the general dirty state flag setting you describe
Re: [gentoo-user] Re: USB crucial file recovery
Is there a filesystem that will make that unnecessary and exhibit better reliability than NTFS? >>> >>> >>> Yes, FAT. It works and works well. >>> Or exFAT which is Microsoft's solution to the problem of very large >>> files on FAT. >> >> >> FAT32 won't work for me since I need to use files larger than 4GB. I >> know it's beta software but should exfat be more reliable than ntfs? > > > It doesn't do all the fancy journalling that ntfs does, so based solely on > complexity, it ought to be more reliable. > > None of us have done real tests and mentioned it here, so we really don't > know how it pans out in the real world. > > Do a bunch of tests yourself and decide >> >> >>> Which NTFS system are you using? >>> >>> ntfs kernel module? It's quite dodgy and unsafe with writes >>> ntfs-ng on fuse? I find that one quite solid >> >> >> I'm using ntfs-ng as opposed to the kernel option(s). > > > I'm offering 10 to 1 odds that your problems came from a faulty USB stick, > or maybe one that you yanked too soon It could be failing hardware but I didn't touch the USB stick when it freaked out. This same thing has happened several times now with two different USB sticks. It sounds like I'm stuck with NTFS if I want to share the USB stick amongst Gentoo systems without managing UIDs and I want to work with files larger than 4GB. exfat is the other option but it sounds rather unproven. - Grant