Re: [PATCH 08/16] btrfs: add sanity check when resuming balance after mount

2018-04-15 Thread Anand Jain
On 04/04/2018 02:34 AM, David Sterba wrote: Replace a WARN_ON with a proper check and message in case something goes really wrong and resumed balance cannot set up its exclusive status. The check is a user-friendly assertion; I don't expect it to ever happen under normal circumstances. Also do

Re: Symlink not persisted even after fsync

2018-04-15 Thread Theodore Y. Ts'o
On Sun, Apr 15, 2018 at 07:10:52PM -0500, Vijay Chidambaram wrote: > > I don't think this is what the paper's ext3-fast does. All the paper > says is if you have a file system where the fsync of a file persisted > only data related to that file, it would increase performance. > ext3-fast is the na

Re: Symlink not persisted even after fsync

2018-04-15 Thread Amir Goldstein
On Mon, Apr 16, 2018 at 3:10 AM, Vijay Chidambaram wrote: [...] > Consider the following workload: > > creat foo > link (foo, A/bar) > fsync(foo) > crash > > In this case, after the file system recovers, do we expect foo's link > count to be 2 or 1? I would say 2, but POSIX is silent on this,
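
The workload quoted above can be reproduced up to the fsync with a handful of system calls. What follows is a minimal Python sketch, not a crash test: it only reports the link count visible before any crash, and what survives a crash depends on the file system, which is exactly the open question in the thread. The function name `run_workload` and the scratch-directory layout are assumptions made for illustration.

```python
import os
import tempfile


def run_workload(root):
    """Run: creat foo; link(foo, A/bar); fsync(foo). Return foo's link count."""
    foo = os.path.join(root, "foo")
    subdir = os.path.join(root, "A")
    os.mkdir(subdir)

    # creat foo
    fd = os.open(foo, os.O_CREAT | os.O_WRONLY, 0o644)
    try:
        # link(foo, A/bar)
        os.link(foo, os.path.join(subdir, "bar"))
        # fsync(foo): only foo's descriptor is synced, not directory A
        os.fsync(fd)
    finally:
        os.close(fd)

    # Link count as seen by the running system (no crash simulated here)
    return os.stat(foo).st_nlink


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        print(run_workload(root))
```

Observing the post-crash state would need a separate tool that cuts power or replays block-level writes; this sketch deliberately stops short of that.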

[PATCH v2] btrfs: Do super block verification before writing it to disk

2018-04-15 Thread Qu Wenruo
There are already 2 reports about strangely corrupted super blocks, where csum type and incompat flags get some obvious garbage, but csum still matches and all other vitals are correct. This normally means some kernel memory corruption happens, although the cause is unknown, at least detect it and

Re: having issue removing a drive with a bad block

2018-04-15 Thread Chris Murphy
On Sun, Apr 15, 2018 at 10:33 PM, Chris Murphy wrote: > On Sun, Apr 15, 2018 at 7:45 PM, Alexander Zapatka > wrote: >> thanks, Chris. i have given a timeout of 300 to all the drives. they >> are all USB, all connected to an apollo lake based htpc. then i >> started the command again... the dme

Re: having issue removing a drive with a bad block

2018-04-15 Thread Chris Murphy
On Sun, Apr 15, 2018 at 7:45 PM, Alexander Zapatka wrote: > thanks, Chris. i have given a timeout of 300 to all the drives. they > are all USB, all connected to an apollo lake based htpc. then i > started the command again... the dmesg output is here from a few > minutes after i started the btr

[PATCH] btrfs: Do super block verification before writing it to disk

2018-04-15 Thread Qu Wenruo
There are already 2 reports about strangely corrupted super blocks, where csum type and incompat flags get some obvious garbage, but csum still matches and all other vitals are correct. This normally means some kernel memory corruption happens, although the cause is unknown, at least detect it and

Re: having issue removing a drive with a bad block

2018-04-15 Thread Alexander Zapatka
thanks, Chris. i have given a timeout of 300 to all the drives. they are all USB, all connected to an apollo lake based htpc. then i started the command again... the dmesg output is here from a few minutes after i started the btrfs device remove command. https://paste.ee/p/H1R0i. no hopes, high

Add udev-md-raid-safe-timeouts.rules

2018-04-15 Thread Chris Murphy
I just ran into this: https://github.com/neilbrown/mdadm/pull/32/commits/af1ddca7d5311dfc9ed60a5eb6497db1296f1bec This solution is inadequate; can it be made more generic? This isn't an md-specific problem; it affects Btrfs and LVM as well, and in fact raid0 and even non-raid setups. There is n

Re: having issue removing a drive with a bad block

2018-04-15 Thread Chris Murphy
On Sun, Apr 15, 2018 at 6:30 PM, Chris Murphy wrote: > # echo value > /sys/block/device-name/device/timeout > Also note that this is not a persistent setting. It needs to be done per boot. But before you change it, use cat to find out what the value is. Default is 30. I'm seeing this: https://g
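
The sysfs write quoted above can be wrapped so that the previous value is read before it is changed, as the message suggests. This is a hedged Python sketch: the helper name `set_scsi_timeout` and the `sysfs_root` parameter (included so the function can be tried against a scratch directory) are assumptions, not part of any existing tool. Writing the real file requires root, and the setting does not persist across reboots.

```python
import os


def set_scsi_timeout(device, seconds, sysfs_root="/sys/block"):
    """Write a new command timeout for `device`; return the previous value."""
    path = os.path.join(sysfs_root, device, "device", "timeout")

    # Read the current value first (the message says the default is 30)
    with open(path) as f:
        previous = int(f.read().strip())

    # Equivalent of: echo value > /sys/block/device-name/device/timeout
    with open(path, "w") as f:
        f.write(str(seconds))

    return previous
```

For example, `set_scsi_timeout("sdc", 300)` would apply the 300-second value mentioned elsewhere in the thread to `sdc` and return the old setting so it can be restored later.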

Re: having issue removing a drive with a bad block

2018-04-15 Thread Chris Murphy
Please keep the list in the cc: On Sun, Apr 15, 2018 at 5:55 PM, Alexander Zapatka wrote: > output: > > $ sudo smartctl -l scterc /dev/sdc > smartctl 6.6 2016-05-31 r4324 [x86_64-linux-4.13.0-38-generic] (local build) > Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

Re: Symlink not persisted even after fsync

2018-04-15 Thread Vijay Chidambaram
Hi Ted, On Sun, Apr 15, 2018 at 9:13 AM, Theodore Y. Ts'o wrote: > On Sat, Apr 14, 2018 at 08:35:45PM -0500, Vijaychidambaram Velayudhan Pillai > wrote: >> I was one of the authors on that paper, and I didn't know until today you >> didn't like that work :) The paper did *not* suggest we support

Re: having issue removing a drive with a bad block

2018-04-15 Thread Chris Murphy
On Sun, Apr 15, 2018 at 6:14 AM, Alexander Zapatka wrote: > i recently set up a drive pool in single mode on my little media > server. about a week later SMART started telling me that the drive > was having issue and there is one bad sector. since the array is far > from full i decided to remove

Re: Symlink not persisted even after fsync

2018-04-15 Thread Theodore Y. Ts'o
On Sat, Apr 14, 2018 at 08:35:45PM -0500, Vijaychidambaram Velayudhan Pillai wrote: > I was one of the authors on that paper, and I didn't know until today you > didn't like that work :) The paper did *not* suggest we support invented > guarantees without considering the performance impact. I had

Re: remounted ro during operation, unmountable since

2018-04-15 Thread Duncan
Qu Wenruo posted on Sat, 14 Apr 2018 22:41:50 +0800 as excerpted: >> sectorsize        4096 >> nodesize        4096 > > Nodesize is not the default 16K, any reason for this? > (Maybe performance?) > >>> 3) Extra hardware info about your sda >>>     Things like SMART and hardware model would also

[GIT PULL] Btrfs updates for 4.17, part 2

2018-04-15 Thread David Sterba
Hi, we have queued a few more fixes (error handling, log replay, softlockup) and the rest is SPDX update that touches almost all files so the diffstat is long. The top patch is a fixup for excessive warning and was not in linux-next but I've tested it locally. Please pull, thanks. --

having issue removing a drive with a bad block

2018-04-15 Thread Alexander Zapatka
i recently set up a drive pool in single mode on my little media server. about a week later SMART started telling me that the drive was having issue and there is one bad sector. since the array is far from full i decided to remove the drive from the pool. but running btrfs device remove /dev/sd

Re: remounted ro during operation, unmountable since

2018-04-15 Thread Qu Wenruo
On 2018-04-15 16:39, Timo Nentwig wrote: > On 04/14/2018 03:45 PM, Timo Nentwig wrote: >> On 04/14/2018 11:42 AM, Qu Wenruo wrote: >>> And the work load when the RO happens is also helpful. >>> (Well, the dmesg of RO happens would be the best though) >> I had a glance at dmesg but don't remember

Re: remounted ro during operation, unmountable since

2018-04-15 Thread Timo Nentwig
On 04/14/2018 03:45 PM, Timo Nentwig wrote: On 04/14/2018 11:42 AM, Qu Wenruo wrote: And the work load when the RO happens is also helpful. (Well, the dmesg of RO happens would be the best though) I had a glance at dmesg but don't remember anything specific (think the usual " [cut here] ---