Re: HELP unmountable partition after btrfs balance to RAID0

2018-12-07 Thread Duncan
Thomas Mohr posted on Thu, 06 Dec 2018 12:31:15 +0100 as excerpted:

> We wanted to convert a file system to a RAID0 with two partitions.
> Unfortunately we had to reboot the server during the balance operation
> before it could complete.
> 
> Now following happens:
> 
> A mount attempt of the array fails with following error code:
> 
> btrfs recover yields roughly 1.6 out of 4 TB.

[Just another btrfs user and list regular, not a dev.  A dev may reply to 
your specific case, but meanwhile, for next time...]

That shouldn't be a problem.  Because a failure of any component takes 
down the entire raid0, making it less reliable than a single device, 
raid0 (in general, not just btrfs) is considered useful only for data of 
low enough value that its loss is no big deal, either because it's truly 
of little value (internet cache being a good example), or because 
backups are kept available and updated for whenever the raid0 array 
fails.  With raid0 it's always a question of when it'll fail, not if.

So loss of a filesystem being converted to raid0 isn't a problem, because 
the data on it, by virtue of being in the process of conversion to raid0, 
is defined as of throw-away value in any case.  If it's of higher value 
than that, it's not going to be raid0 (or in the process of conversion to 
it) in the first place.
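
For reference, such a conversion and an interrupted balance normally look 
like this (mountpoint hypothetical; this is only a sketch of the usual 
procedure, not a fix for a filesystem that already fails to mount):

btrfs balance start -dconvert=raid0 -mconvert=raid0 /mnt/data  # conversion is just a balance with convert filters
btrfs balance status /mnt/data                                 # shows whether a balance is still running or paused
btrfs balance resume /mnt/data                                 # resumes a paused balance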

Of course that's simply an extension of the more general first sysadmin's 
rule of backups, that the true value of data is defined not by arbitrary 
claims, but by the number of backups of that data it's worth having.  
Because "things happen", whether it's fat-fingering, bad hardware, buggy 
software, or simply someone tripping over the power cable or running into 
the power pole outside at the wrong time.

So no backup is simply defining the data as worth less than the time/
trouble/resources necessary to make that backup.

Note that you ALWAYS save what was of most value to you, either the time/
trouble/resources to do the backup, if your actions defined that to be of 
more value than the data, or the data, if you had that backup, thereby 
defining the value of the data to be worth backing up.

Similarly, failure of the only backup isn't a problem because by virtue 
of there being only that one backup, the data is defined as not worth 
having more than one, and likewise, having an outdated backup isn't a 
problem, because that's simply the special case of defining the data in 
the delta between the backup time and the present as not (yet) worth the 
time/hassle/resources to make/refresh that backup.

(And FWIW, the second sysadmin's rule of backups is that it's not a 
backup until you've successfully tested it recoverable in the same sort 
of conditions you're likely to need to recover it in.  Because so many 
people have /thought/ they had backups, that turned out not to be, 
because they never tested that they could actually recover the data from 
them.  For instance, if the backup tools you'll need to recover the 
backup are on the backup itself, how do you get to them?  Can you create 
a filesystem for the new copy of the data and recover it from the backup 
with just the tools and documentation available from your emergency boot 
media?  Untested backup == no backup, or at best, backup still in 
process!)
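
A minimal sketch of such a test, with made-up device and file names, run 
from the same rescue media you'd use in a real recovery:

mkfs.btrfs -f /dev/sdz1                                     # scratch filesystem, NOT the original device
mount /dev/sdz1 /mnt/scratch
btrfs receive -f /backups/home-20181201.send /mnt/scratch   # restore a backup taken with btrfs send -f
ls /mnt/scratch && diff -r /mnt/scratch/home /home | head   # can you actually read the restored data?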

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



HELP unmountable partition after btrfs balance to RAID0

2018-12-06 Thread Thomas Mohr

Dear developers of BTRFS,

we have a problem. We wanted to convert a file system to a RAID0 with 
two partitions. Unfortunately we had to reboot the server during the 
balance operation before it could complete.


Now the following happens:

A mount attempt of the array fails with the following error code:

btrfs recover yields roughly 1.6 out of 4 TB.

To recover the rest, we have tried:

mount:

[18192.357444] BTRFS info (device sdb1): disk space caching is enabled
[18192.357447] BTRFS info (device sdb1): has skinny extents
[18192.370664] BTRFS error (device sdb1): parent transid verify failed 
on 30523392 wanted 7432 found 7445
[18192.370810] BTRFS error (device sdb1): parent transid verify failed 
on 30523392 wanted 7432 found 7445

[18192.394745] BTRFS error (device sdb1): open_ctree failed

Mounting with options ro, degraded, clear_cache, etc. yields the same errors.


btrfs rescue zero-log: the operation completes, however the error persists 
and the array remains unmountable.


parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
Ignoring transid failure
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
Ignoring transid failure
Clearing log on /dev/sdb1, previous log_root 0, level 0

btrfs rescue chunk-recover fails with the following error message:

btrfs check results in:

Opening filesystem to check...
parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
parent transid verify failed on 59768832 wanted 7422 found 7187
Ignoring transid failure
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
parent transid verify failed on 30408704 wanted 7430 found 7443
Ignoring transid failure
Checking filesystem on /dev/sdb1
UUID: 6c9ed4e1-d63f-46f0-b1e9-608b8fa43bb8
[1/7] checking root items
parent transid verify failed on 30523392 wanted 7432 found 7443
parent transid verify failed on 30523392 wanted 7432 found 7443
parent transid verify failed on 30523392 wanted 7432 found 7443
parent transid verify failed on 30523392 wanted 7432 found 7443
Ignoring transid failure
leaf parent key incorrect 30523392
ERROR: failed to repair root items: Operation not permitted


Any ideas what is going on, or how to recover the file system? I would 
greatly appreciate your help!


best,

Thomas


uname -a:

Linux server2 4.19.5-1-default #1 SMP PREEMPT Tue Nov 27 19:56:09 UTC 
2018 (6210279) x86_64 x86_64 x86_64 GNU/Linux


btrfs-progs version 4.19


--
ScienceConsult - DI Thomas Mohr KG
DI Thomas Mohr
Enzianweg 10a
2353 Guntramsdorf
Austria
+43 2236 56793
+43 660 461 1966
http://www.mohrkeg.co.at



Re: Need help with potential ~45TB dataloss

2018-12-04 Thread Chris Murphy
On Tue, Dec 4, 2018 at 3:09 AM Patrick Dijkgraaf wrote:
>
> Hi Chris,
>
> See the output below. Any suggestions based on it?

If they're SATA drives, they may not support SCT ERC; and if they're
SAS, depending on what controller they're behind, smartctl might need
a hint to properly ask the drive for SCT ERC status. The simplest way
to find out is to run 'smartctl -x' on one drive, assuming they're all
the same basic make/model other than size.
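
For example (device name taken from this thread; the -d type hints are 
only illustrative and depend on the controller in use):

smartctl -x /dev/sde | grep -i sct         # look for "SCT capabilities" / "SCT Error Recovery Control"
smartctl -d sat -l scterc /dev/sde         # type hint for drives behind a SAT (USB/HBA) translation layer
smartctl -d megaraid,0 -l scterc /dev/sde  # type hint for drives behind a MegaRAID controller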


-- 
Chris Murphy


Re: Need help with potential ~45TB dataloss

2018-12-04 Thread Patrick Dijkgraaf
Hi Chris,

See the output below. Any suggestions based on it?
Thanks!

-- 
Groet / Cheers,
Patrick Dijkgraaf



On Mon, 2018-12-03 at 20:16 -0700, Chris Murphy wrote:
> Also useful information for autopsy, perhaps not for fixing, is to
> know whether the SCT ERC value for every drive is less than the
> kernel's SCSI driver block device command timeout value. It's super
> important that the drive reports an explicit read failure before the
> read command is considered failed by the kernel. If the drive is
> still
> trying to do a read, and the kernel command timer times out, it'll
> just do a reset of the whole link and we lose the outcome for the
> hanging command. Upon explicit read error only, can Btrfs, or md
> RAID,
> know what device and physical sector has a problem, and therefore how
> to reconstruct the block, and fix the bad sector with a write of
> known
> good data.
> 
> smartctl -l scterc /device/

Seems to not work:

[root@cornelis ~]# for disk in /dev/sd{e..x}; do echo ${disk}; smartctl
-l scterc ${disk}; done
/dev/sde
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sdf
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sdg
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sdh
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sdi
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sdj
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sdk
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

Smartctl open device: /dev/sdk failed: No such device
/dev/sdl
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sdm
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sdn
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SCT Error Recovery Control command not supported

/dev/sdo
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sdp
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sdq
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sdr
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW register
SCT (Get) Error Recovery Control command failed

/dev/sds
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.18.16-arch1-1-ARCH]
(local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, 
www.smartmontools.org

SMART WRITE LOG does not return COUNT and LBA_LOW 

Re: Need help with potential ~45TB dataloss

2018-12-04 Thread Patrick Dijkgraaf
Hi, thanks again.
Please see answers inline.

-- 
Groet / Cheers,
Patrick Dijkgraaf



On Mon, 2018-12-03 at 08:35 +0800, Qu Wenruo wrote:
> 
> > On 2018/12/2 5:03 PM, Patrick Dijkgraaf wrote:
> > Hi Qu,
> > 
> > Thanks for helping me!
> > 
> > Please see the reponses in-line.
> > Any suggestions based on this?
> > 
> > Thanks!
> > 
> > 
> > On Sat, 2018-12-01 at 07:57 +0800, Qu Wenruo wrote:
> > > On 2018/11/30 9:53 PM, Patrick Dijkgraaf wrote:
> > > > Hi all,
> > > > 
> > > > I have been a happy BTRFS user for quite some time. But now I'm
> > > > facing
> > > > a potential ~45TB dataloss... :-(
> > > > I hope someone can help!
> > > > 
> > > > I have Server A and Server B. Both having a 20-devices BTRFS
> > > > RAID6
> > > > filesystem. Because of known RAID5/6 risks, Server B was a
> > > > backup
> > > > of
> > > > Server A.
> > > > After applying updates to server B and reboot, the FS would not
> > > > mount
> > > > anymore. Because it was "just" a backup. I decided to recreate
> > > > the
> > > > FS
> > > > and perform a new backup. Later, I discovered that the FS was
> > > > not
> > > > broken, but I faced this issue: 
> > > > https://patchwork.kernel.org/patch/10694997/
> > > > 
> > > > 
> > > 
> > > Sorry for the inconvenience.
> > > 
> > > I didn't realize the max_chunk_size limit isn't reliable at that
> > > timing.
> > 
> > No problem, I should not have jumped to the conclusion to recreate
> > the
> > backup volume.
> > 
> > > > Anyway, the FS was already recreated, so I needed to do a new
> > > > backup.
> > > > During the backup (using rsync -vah), Server A (the source)
> > > > encountered
> > > > an I/O error and my rsync failed. In an attempt to "quick fix"
> > > > the
> > > > issue, I rebooted Server A after which the FS would not mount
> > > > anymore.
> > > 
> > > Did you have any dmesg about that IO error?
> > 
> > Yes there was. But I omitted capturing it... The system is now
> > rebooted
> > and I can't retrieve it anymore. :-(
> > 
> > > And how is the reboot scheduled? Forced power off or normal
> > > reboot
> > > command?
> > 
> > The system was rebooted using a normal reboot command.
> 
> Then the problem is pretty serious.
> 
> Possibly already corrupted before.
> 
> > > > I documented what I have tried, below. I have not yet tried
> > > > anything
> > > > except what is shown, because I am afraid of causing more harm
> > > > to
> > > > the FS.
> > > 
> > > Pretty clever, no btrfs check --repair is a pretty good move.
> > > 
> > > > I hope somebody here can give me advice on how to (hopefully)
> > > > retrieve my data...
> > > > 
> > > > Thanks in advance!
> > > > 
> > > > ==
> > > > 
> > > > [root@cornelis ~]# btrfs fi show
> > > > Label: 'cornelis-btrfs'  uuid: ac643516-670e-40f3-aa4c-
> > > > f329fc3795fd
> > > > Total devices 1 FS bytes used 463.92GiB
> > > > devid1 size 800.00GiB used 493.02GiB path
> > > > /dev/mapper/cornelis-cornelis--btrfs
> > > > 
> > > > Label: 'data'  uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
> > > > Total devices 20 FS bytes used 44.85TiB
> > > > devid1 size 3.64TiB used 3.64TiB path /dev/sdn2
> > > > devid2 size 3.64TiB used 3.64TiB path /dev/sdp2
> > > > devid3 size 3.64TiB used 3.64TiB path /dev/sdu2
> > > > devid4 size 3.64TiB used 3.64TiB path /dev/sdx2
> > > > devid5 size 3.64TiB used 3.64TiB path /dev/sdh2
> > > > devid6 size 3.64TiB used 3.64TiB path /dev/sdg2
> > > > devid7 size 3.64TiB used 3.64TiB path /dev/sdm2
> > > > devid8 size 3.64TiB used 3.64TiB path /dev/sdw2
> > > > devid9 size 3.64TiB used 3.64TiB path /dev/sdj2
> > > > devid   10 size 3.64TiB used 3.64TiB path /dev/sdt2
> > > > devid   11 size 3.64TiB used 3.64TiB path /dev/sdk2
> > > > devid   12 size 3.64TiB 

Re: Need help with potential ~45TB dataloss

2018-12-03 Thread Chris Murphy
Also useful information for autopsy, perhaps not for fixing, is to
know whether the SCT ERC value for every drive is less than the
kernel's SCSI driver block device command timeout value. It's super
important that the drive reports an explicit read failure before the
read command is considered failed by the kernel. If the drive is still
trying to do a read, and the kernel command timer times out, it'll
just do a reset of the whole link and we lose the outcome for the
hanging command. Only upon an explicit read error can Btrfs, or md RAID,
know which device and physical sector has a problem, and therefore how
to reconstruct the block and fix the bad sector with a write of known
good data.

smartctl -l scterc /device/
and
cat /sys/block/sda/device/timeout

Only if SCT ERC is enabled with a value below 30, or if the kernel
command timer is changed to be well above 30 (like 180, which is
absolutely crazy but a separate conversation), can we be sure that
there haven't just been resets going on for a while, preventing bad
sectors from being fixed up all along, which can contribute to the
problem. This comes up on the linux-raid (mainly md driver) list all
the time, and it contributes to lost RAID all the time. And arguably
it leads to unnecessary data loss even in the single-device
desktop/laptop use case as well.
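
A sketch of checking, and where supported adjusting, both values across a 
set of drives (device names follow the ones used elsewhere in this thread; 
the 70 = 7.0 s ERC setting and the 180 s kernel timeout are just the 
commonly cited example values, not a recommendation specific to this array):

for d in /dev/sd{e..x}; do
    echo "== $d"
    smartctl -l scterc "$d"                    # current SCT ERC read/write timeout, if the drive supports it
    cat /sys/block/${d#/dev/}/device/timeout   # kernel SCSI command timeout in seconds (default 30)
done
smartctl -l scterc,70,70 /dev/sde              # enable ERC at 7.0 s read/write on a supporting drive
echo 180 > /sys/block/sde/device/timeout       # or raise the kernel timeout; not persistent across reboots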


Chris Murphy


Re: Need help with potential ~45TB dataloss

2018-12-02 Thread Qu Wenruo


On 2018/12/3 4:30 AM, Andrei Borzenkov wrote:
> 02.12.2018 23:14, Patrick Dijkgraaf wrote:
>> I have some additional info.
>>
>> I found the reason the FS got corrupted. It was a single failing drive,
>> which caused the entire cabinet (containing 7 drives) to reset. So the
>> FS suddenly lost 7 drives.
>>
> 
> This remains mystery for me. btrfs is marketed to be always consistent
> on disk - you either have previous full transaction or current full
> transaction. If current transaction was interrupted the promise is you
> are left with previous valid consistent transaction.
> 
> Obviously this is not what happens in practice. Which nullifies the main
> selling point of btrfs.
> 
> Unless this is expected behavior, it sounds like some barriers are
> missing and summary data is updated before (and without waiting for)
> subordinate data. And if it is expected behavior ...

There is one (unfortunately) known problem for RAID5/6 and one special
problem for RAID6.

The common problem is the write hole.
For a RAID5 stripe like:
Disk 1  |  Disk 2  |  Disk 3
----------------------------
DATA1   |  DATA2   |  PARITY

If we have written something into DATA1, but a power loss happens before
we update PARITY on Disk 3, then we can no longer tolerate the loss of
Disk 2, since DATA1 doesn't match PARITY anymore.

Without the ability to know exactly which blocks have been written, the
write hole problem exists for any parity-based solution, including BTRFS
RAID5/6.

According to others on the mailing list, other RAID5/6 implementations
keep their own on-disk record of which blocks have been updated, and
after a power loss they rebuild the involved stripes.

Since btrfs doesn't have such a record, we need to scrub the whole fs to
regain the disk-loss tolerance (and hope there will not be another power
loss while the scrub runs).
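
That scrub step would look something like this (mountpoint hypothetical):

btrfs scrub start -Bd /mnt/data    # -B: stay in the foreground, -d: print per-device statistics
btrfs scrub status /mnt/data       # or check on a scrub running in the background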


The RAID6-specific problem is missing rebuild retry logic.
(No longer a problem after the 4.16 kernel, but btrfs-progs support is
still missing.)

For a RAID6 stripe like:
Disk 1  |  Disk 2  |  Disk 3  |  Disk 4
---------------------------------------
DATA1   |  DATA2   |    P     |    Q

If reading DATA1 fails, we have 3 ways to rebuild the data:
1) Using DATA2 and P (just as RAID5)
2) Using P and Q
3) Using DATA2 and Q

However, until 4.16 we wouldn't retry all the possible ways to rebuild it.
(Thanks to Liu for solving this problem.)

Thanks,
Qu

> 
>> I have removed the failed drive, so the RAID is now degraded. I hope
>> the data is still recoverable... ☹
>>
> 





Re: Need help with potential ~45TB dataloss

2018-12-02 Thread Qu Wenruo


On 2018/12/3 8:35 AM, Qu Wenruo wrote:
> 
> 
> On 2018/12/2 5:03 PM, Patrick Dijkgraaf wrote:
>> Hi Qu,
>>
>> Thanks for helping me!
>>
>> Please see the reponses in-line.
>> Any suggestions based on this?
>>
>> Thanks!
>>
>>
>> On Sat, 2018-12-01 at 07:57 +0800, Qu Wenruo wrote:
>>> On 2018/11/30 9:53 PM, Patrick Dijkgraaf wrote:
>>>> Hi all,
>>>>
>>>> I have been a happy BTRFS user for quite some time. But now I'm
>>>> facing
>>>> a potential ~45TB dataloss... :-(
>>>> I hope someone can help!
>>>>
>>>> I have Server A and Server B. Both having a 20-devices BTRFS RAID6
>>>> filesystem.

I forgot one important thing here, especially for RAID6.

If one data device is corrupted, RAID6 will normally try to rebuild
using the RAID5 method, and if another disk is also corrupted, it may
not recover correctly.

The current way to recover is to try *all* combinations.

IIRC Liu Bo posted such a patch but it was not merged.

This means that, at best, current RAID6 can only handle two missing
devices.
But for corruption, it can only be as good as RAID5.

Thanks,
Qu

> Because of known RAID5/6 risks, Server B was a backup
>>>> of
>>>> Server A.
>>>> After applying updates to server B and reboot, the FS would not
>>>> mount
>>>> anymore. Because it was "just" a backup. I decided to recreate the
>>>> FS
>>>> and perform a new backup. Later, I discovered that the FS was not
>>>> broken, but I faced this issue: 
>>>> https://patchwork.kernel.org/patch/10694997/
>>>>
>>>
>>> Sorry for the inconvenience.
>>>
>>> I didn't realize the max_chunk_size limit isn't reliable at that
>>> timing.
>>
>> No problem, I should not have jumped to the conclusion to recreate the
>> backup volume.
>>
>>>> Anyway, the FS was already recreated, so I needed to do a new
>>>> backup.
>>>> During the backup (using rsync -vah), Server A (the source)
>>>> encountered
>>>> an I/O error and my rsync failed. In an attempt to "quick fix" the
>>>> issue, I rebooted Server A after which the FS would not mount
>>>> anymore.
>>>
>>> Did you have any dmesg about that IO error?
>>
>> Yes there was. But I omitted capturing it... The system is now rebooted
>> and I can't retrieve it anymore. :-(
>>
>>> And how is the reboot scheduled? Forced power off or normal reboot
>>> command?
>>
>> The system was rebooted using a normal reboot command.
> 
> Then the problem is pretty serious.
> 
> Possibly already corrupted before.
> 
>>
>>>> I documented what I have tried, below. I have not yet tried
>>>> anything
>>>> except what is shown, because I am afraid of causing more harm to
>>>> the FS.
>>>
>>> Pretty clever, no btrfs check --repair is a pretty good move.
>>>
>>>> I hope somebody here can give me advice on how to (hopefully)
>>>> retrieve my data...
>>>>
>>>> Thanks in advance!
>>>>
>>>> ==
>>>>
>>>> [root@cornelis ~]# btrfs fi show
>>>> Label: 'cornelis-btrfs'  uuid: ac643516-670e-40f3-aa4c-f329fc3795fd
>>>>Total devices 1 FS bytes used 463.92GiB
>>>>devid1 size 800.00GiB used 493.02GiB path
>>>> /dev/mapper/cornelis-cornelis--btrfs
>>>>
>>>> Label: 'data'  uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
>>>>Total devices 20 FS bytes used 44.85TiB
>>>>devid1 size 3.64TiB used 3.64TiB path /dev/sdn2
>>>>devid2 size 3.64TiB used 3.64TiB path /dev/sdp2
>>>>devid3 size 3.64TiB used 3.64TiB path /dev/sdu2
>>>>devid4 size 3.64TiB used 3.64TiB path /dev/sdx2
>>>>devid5 size 3.64TiB used 3.64TiB path /dev/sdh2
>>>>devid6 size 3.64TiB used 3.64TiB path /dev/sdg2
>>>>devid7 size 3.64TiB used 3.64TiB path /dev/sdm2
>>>>devid8 size 3.64TiB used 3.64TiB path /dev/sdw2
>>>>devid9 size 3.64TiB used 3.64TiB path /dev/sdj2
>>>>devid   10 size 3.64TiB used 3.64TiB path /dev/sdt2
>>>>devid   11 size 3.64TiB used 3.64TiB path /dev/sdk2
>>>>devid   12 size 3.64TiB used 3.64TiB path /dev/sdq2
>>>>devid   13 size 3.64TiB used 3.64TiB path /dev/sds2
>>>>devid   14 size 3

Re: Need help with potential ~45TB dataloss

2018-12-02 Thread Qu Wenruo


On 2018/12/2 5:03 PM, Patrick Dijkgraaf wrote:
> Hi Qu,
> 
> Thanks for helping me!
> 
> Please see the reponses in-line.
> Any suggestions based on this?
> 
> Thanks!
> 
> 
> On Sat, 2018-12-01 at 07:57 +0800, Qu Wenruo wrote:
>> On 2018/11/30 9:53 PM, Patrick Dijkgraaf wrote:
>>> Hi all,
>>>
>>> I have been a happy BTRFS user for quite some time. But now I'm
>>> facing
>>> a potential ~45TB dataloss... :-(
>>> I hope someone can help!
>>>
>>> I have Server A and Server B. Both having a 20-devices BTRFS RAID6
>>> filesystem. Because of known RAID5/6 risks, Server B was a backup
>>> of
>>> Server A.
>>> After applying updates to server B and reboot, the FS would not
>>> mount
>>> anymore. Because it was "just" a backup. I decided to recreate the
>>> FS
>>> and perform a new backup. Later, I discovered that the FS was not
>>> broken, but I faced this issue: 
>>> https://patchwork.kernel.org/patch/10694997/
>>>
>>
>> Sorry for the inconvenience.
>>
>> I didn't realize the max_chunk_size limit isn't reliable at that
>> timing.
> 
> No problem, I should not have jumped to the conclusion to recreate the
> backup volume.
> 
>>> Anyway, the FS was already recreated, so I needed to do a new
>>> backup.
>>> During the backup (using rsync -vah), Server A (the source)
>>> encountered
>>> an I/O error and my rsync failed. In an attempt to "quick fix" the
>>> issue, I rebooted Server A after which the FS would not mount
>>> anymore.
>>
>> Did you have any dmesg about that IO error?
> 
> Yes there was. But I omitted capturing it... The system is now rebooted
> and I can't retrieve it anymore. :-(
> 
>> And how is the reboot scheduled? Forced power off or normal reboot
>> command?
> 
> The system was rebooted using a normal reboot command.

Then the problem is pretty serious.

Possibly already corrupted before.

> 
>>> I documented what I have tried, below. I have not yet tried
>>> anything
>>> except what is shown, because I am afraid of causing more harm to
>>> the FS.
>>
>> Pretty clever, no btrfs check --repair is a pretty good move.
>>
>>> I hope somebody here can give me advice on how to (hopefully)
>>> retrieve my data...
>>>
>>> Thanks in advance!
>>>
>>> ==
>>>
>>> [root@cornelis ~]# btrfs fi show
>>> Label: 'cornelis-btrfs'  uuid: ac643516-670e-40f3-aa4c-f329fc3795fd
>>> Total devices 1 FS bytes used 463.92GiB
>>> devid1 size 800.00GiB used 493.02GiB path
>>> /dev/mapper/cornelis-cornelis--btrfs
>>>
>>> Label: 'data'  uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
>>> Total devices 20 FS bytes used 44.85TiB
>>> devid1 size 3.64TiB used 3.64TiB path /dev/sdn2
>>> devid2 size 3.64TiB used 3.64TiB path /dev/sdp2
>>> devid3 size 3.64TiB used 3.64TiB path /dev/sdu2
>>> devid4 size 3.64TiB used 3.64TiB path /dev/sdx2
>>> devid5 size 3.64TiB used 3.64TiB path /dev/sdh2
>>> devid6 size 3.64TiB used 3.64TiB path /dev/sdg2
>>> devid7 size 3.64TiB used 3.64TiB path /dev/sdm2
>>> devid8 size 3.64TiB used 3.64TiB path /dev/sdw2
>>> devid9 size 3.64TiB used 3.64TiB path /dev/sdj2
>>> devid   10 size 3.64TiB used 3.64TiB path /dev/sdt2
>>> devid   11 size 3.64TiB used 3.64TiB path /dev/sdk2
>>> devid   12 size 3.64TiB used 3.64TiB path /dev/sdq2
>>> devid   13 size 3.64TiB used 3.64TiB path /dev/sds2
>>> devid   14 size 3.64TiB used 3.64TiB path /dev/sdf2
>>> devid   15 size 7.28TiB used 588.80GiB path /dev/sdr2
>>> devid   16 size 7.28TiB used 588.80GiB path /dev/sdo2
>>> devid   17 size 7.28TiB used 588.80GiB path /dev/sdv2
>>> devid   18 size 7.28TiB used 588.80GiB path /dev/sdi2
>>> devid   19 size 7.28TiB used 588.80GiB path /dev/sdl2
>>> devid   20 size 7.28TiB used 588.80GiB path /dev/sde2
>>>
>>> [root@cornelis ~]# mount /dev/sdn2 /mnt/data
>>> mount: /mnt/data: wrong fs type, bad option, bad superblock on
>>> /dev/sdn2, missing codepage or helper program, or other error.
>>
>> What is the dmesg of the mount failure?
> 
> [Sun Dec  2 09:41:08 2018] BTRFS info (device sdn2): disk space caching
> is enabled
> [Sun Dec  2 09:41:08 2018] BTRFS inf

Re: Need help with potential ~45TB dataloss

2018-12-02 Thread Andrei Borzenkov
02.12.2018 23:14, Patrick Dijkgraaf wrote:
> I have some additional info.
> 
> I found the reason the FS got corrupted. It was a single failing drive,
> which caused the entire cabinet (containing 7 drives) to reset. So the
> FS suddenly lost 7 drives.
> 

This remains a mystery for me. btrfs is marketed as always consistent
on disk: you either have the previous full transaction or the current
full transaction. If the current transaction was interrupted, the
promise is that you are left with the previous valid, consistent
transaction.

Obviously this is not what happens in practice, which nullifies the main
selling point of btrfs.

Unless this is expected behavior, it sounds like some barriers are
missing and summary data is updated before (and without waiting for)
subordinate data. And if it is expected behavior ...

> I have removed the failed drive, so the RAID is now degraded. I hope
> the data is still recoverable... ☹
> 



Re: Need help with potential ~45TB dataloss

2018-12-02 Thread Patrick Dijkgraaf
I have some additional info.

I found the reason the FS got corrupted. It was a single failing drive,
which caused the entire cabinet (containing 7 drives) to reset. So the
FS suddenly lost 7 drives.

I have removed the failed drive, so the RAID is now degraded. I hope
the data is still recoverable... ☹

-- 
Groet / Cheers,
Patrick Dijkgraaf



On Sun, 2018-12-02 at 10:03 +0100, Patrick Dijkgraaf wrote:
> Hi Qu,
> 
> Thanks for helping me!
> 
> Please see the reponses in-line.
> Any suggestions based on this?
> 
> Thanks!
> 
> 
> On Sat, 2018-12-01 at 07:57 +0800, Qu Wenruo wrote:
> > On 2018/11/30 9:53 PM, Patrick Dijkgraaf wrote:
> > > Hi all,
> > > 
> > > I have been a happy BTRFS user for quite some time. But now I'm
> > > facing
> > > a potential ~45TB dataloss... :-(
> > > I hope someone can help!
> > > 
> > > I have Server A and Server B. Both having a 20-devices BTRFS
> > > RAID6
> > > filesystem. Because of known RAID5/6 risks, Server B was a backup
> > > of
> > > Server A.
> > > After applying updates to server B and reboot, the FS would not
> > > mount
> > > anymore. Because it was "just" a backup. I decided to recreate
> > > the
> > > FS
> > > and perform a new backup. Later, I discovered that the FS was not
> > > broken, but I faced this issue: 
> > > https://patchwork.kernel.org/patch/10694997/
> > > 
> > > 
> > 
> > Sorry for the inconvenience.
> > 
> > I didn't realize the max_chunk_size limit isn't reliable at that
> > timing.
> 
> No problem, I should not have jumped to the conclusion to recreate
> the
> backup volume.
> 
> > > Anyway, the FS was already recreated, so I needed to do a new
> > > backup.
> > > During the backup (using rsync -vah), Server A (the source)
> > > encountered
> > > an I/O error and my rsync failed. In an attempt to "quick fix"
> > > the
> > > issue, I rebooted Server A after which the FS would not mount
> > > anymore.
> > 
> > Did you have any dmesg about that IO error?
> 
> Yes there was. But I omitted capturing it... The system is now
> rebooted
> and I can't retrieve it anymore. :-(
> 
> > And how is the reboot scheduled? Forced power off or normal reboot
> > command?
> 
> The system was rebooted using a normal reboot command.
> 
> > > I documented what I have tried, below. I have not yet tried
> > > anything
> > > except what is shown, because I am afraid of causing more harm to
> > > the FS.
> > 
> > Pretty clever, no btrfs check --repair is a pretty good move.
> > 
> > > I hope somebody here can give me advice on how to (hopefully)
> > > retrieve my data...
> > > 
> > > Thanks in advance!
> > > 
> > > ==
> > > 
> > > [root@cornelis ~]# btrfs fi show
> > > Label: 'cornelis-btrfs'  uuid: ac643516-670e-40f3-aa4c-
> > > f329fc3795fd
> > >   Total devices 1 FS bytes used 463.92GiB
> > >   devid1 size 800.00GiB used 493.02GiB path
> > > /dev/mapper/cornelis-cornelis--btrfs
> > > 
> > > Label: 'data'  uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
> > >   Total devices 20 FS bytes used 44.85TiB
> > >   devid1 size 3.64TiB used 3.64TiB path /dev/sdn2
> > >   devid2 size 3.64TiB used 3.64TiB path /dev/sdp2
> > >   devid3 size 3.64TiB used 3.64TiB path /dev/sdu2
> > >   devid4 size 3.64TiB used 3.64TiB path /dev/sdx2
> > >   devid5 size 3.64TiB used 3.64TiB path /dev/sdh2
> > >   devid6 size 3.64TiB used 3.64TiB path /dev/sdg2
> > >   devid7 size 3.64TiB used 3.64TiB path /dev/sdm2
> > >   devid8 size 3.64TiB used 3.64TiB path /dev/sdw2
> > >   devid9 size 3.64TiB used 3.64TiB path /dev/sdj2
> > >   devid   10 size 3.64TiB used 3.64TiB path /dev/sdt2
> > >   devid   11 size 3.64TiB used 3.64TiB path /dev/sdk2
> > >   devid   12 size 3.64TiB used 3.64TiB path /dev/sdq2
> > >   devid   13 size 3.64TiB used 3.64TiB path /dev/sds2
> > >   devid   14 size 3.64TiB used 3.64TiB path /dev/sdf2
> > >   devid   15 size 7.28TiB used 588.80GiB path /dev/sdr2
> > >   devid   16 size 7.28TiB used 588.80GiB path /dev/sdo2
> > >   devid   17 size 7.28TiB used 588.80GiB path /dev/sdv2
> > >   devid   18 size 7.28TiB used 588.80GiB path /dev/sdi2
> > >   devid   19 size 7.28TiB used 588.80GiB path /dev/sdl2
> > >   devid 

Re: Need help with potential ~45TB dataloss

2018-12-02 Thread Patrick Dijkgraaf
Hi Qu,

Thanks for helping me!

Please see the responses in-line.
Any suggestions based on this?

Thanks!


On Sat, 2018-12-01 at 07:57 +0800, Qu Wenruo wrote:
> On 2018/11/30 9:53 PM, Patrick Dijkgraaf wrote:
> > Hi all,
> > 
> > I have been a happy BTRFS user for quite some time. But now I'm
> > facing
> > a potential ~45TB dataloss... :-(
> > I hope someone can help!
> > 
> > I have Server A and Server B. Both having a 20-devices BTRFS RAID6
> > filesystem. Because of known RAID5/6 risks, Server B was a backup
> > of
> > Server A.
> > After applying updates to server B and reboot, the FS would not
> > mount
> > anymore. Because it was "just" a backup. I decided to recreate the
> > FS
> > and perform a new backup. Later, I discovered that the FS was not
> > broken, but I faced this issue: 
> > https://patchwork.kernel.org/patch/10694997/
> > 
> 
> Sorry for the inconvenience.
> 
> I didn't realize the max_chunk_size limit isn't reliable at that
> timing.

No problem, I should not have jumped to the conclusion to recreate the
backup volume.

> > Anyway, the FS was already recreated, so I needed to do a new
> > backup.
> > During the backup (using rsync -vah), Server A (the source)
> > encountered
> > an I/O error and my rsync failed. In an attempt to "quick fix" the
> > issue, I rebooted Server A after which the FS would not mount
> > anymore.
> 
> Did you have any dmesg about that IO error?

Yes there was. But I omitted capturing it... The system is now rebooted
and I can't retrieve it anymore. :-(

> And how is the reboot scheduled? Forced power off or normal reboot
> command?

The system was rebooted using a normal reboot command.

> > I documented what I have tried, below. I have not yet tried
> > anything
> > except what is shown, because I am afraid of causing more harm to
> > the FS.
> 
> Pretty clever, no btrfs check --repair is a pretty good move.
> 
> > I hope somebody here can give me advice on how to (hopefully)
> > retrieve my data...
> > 
> > Thanks in advance!
> > 
> > ==
> > 
> > [root@cornelis ~]# btrfs fi show
> > Label: 'cornelis-btrfs'  uuid: ac643516-670e-40f3-aa4c-f329fc3795fd
> > Total devices 1 FS bytes used 463.92GiB
> > devid1 size 800.00GiB used 493.02GiB path
> > /dev/mapper/cornelis-cornelis--btrfs
> > 
> > Label: 'data'  uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
> > Total devices 20 FS bytes used 44.85TiB
> > devid1 size 3.64TiB used 3.64TiB path /dev/sdn2
> > devid2 size 3.64TiB used 3.64TiB path /dev/sdp2
> > devid3 size 3.64TiB used 3.64TiB path /dev/sdu2
> > devid4 size 3.64TiB used 3.64TiB path /dev/sdx2
> > devid5 size 3.64TiB used 3.64TiB path /dev/sdh2
> > devid6 size 3.64TiB used 3.64TiB path /dev/sdg2
> > devid7 size 3.64TiB used 3.64TiB path /dev/sdm2
> > devid8 size 3.64TiB used 3.64TiB path /dev/sdw2
> > devid9 size 3.64TiB used 3.64TiB path /dev/sdj2
> > devid   10 size 3.64TiB used 3.64TiB path /dev/sdt2
> > devid   11 size 3.64TiB used 3.64TiB path /dev/sdk2
> > devid   12 size 3.64TiB used 3.64TiB path /dev/sdq2
> > devid   13 size 3.64TiB used 3.64TiB path /dev/sds2
> > devid   14 size 3.64TiB used 3.64TiB path /dev/sdf2
> > devid   15 size 7.28TiB used 588.80GiB path /dev/sdr2
> > devid   16 size 7.28TiB used 588.80GiB path /dev/sdo2
> > devid   17 size 7.28TiB used 588.80GiB path /dev/sdv2
> > devid   18 size 7.28TiB used 588.80GiB path /dev/sdi2
> > devid   19 size 7.28TiB used 588.80GiB path /dev/sdl2
> > devid   20 size 7.28TiB used 588.80GiB path /dev/sde2
> > 
> > [root@cornelis ~]# mount /dev/sdn2 /mnt/data
> > mount: /mnt/data: wrong fs type, bad option, bad superblock on
> > /dev/sdn2, missing codepage or helper program, or other error.
> 
> What is the dmesg of the mount failure?

[Sun Dec  2 09:41:08 2018] BTRFS info (device sdn2): disk space caching
is enabled
[Sun Dec  2 09:41:08 2018] BTRFS info (device sdn2): has skinny extents
[Sun Dec  2 09:41:08 2018] BTRFS error (device sdn2): parent transid
verify failed on 46451963543552 wanted 114401 found 114173
[Sun Dec  2 09:41:08 2018] BTRFS critical (device sdn2): corrupt leaf:
root=2 block=46451963543552 slot=0, unexpected item end, have
1387359977 expect 16283
[Sun Dec  2 09:41:08 2018] BTRFS warning (device sdn2): failed to read
tree root
[Sun Dec  2 09:41:08 2018] BTRFS error (device sdn2): open_ctree failed

> And have you tried -o ro,degraded

Re: Need help with potential ~45TB dataloss

2018-11-30 Thread Qu Wenruo


On 2018/11/30 9:53 PM, Patrick Dijkgraaf wrote:
> Hi all,
> 
> I have been a happy BTRFS user for quite some time. But now I'm facing
> a potential ~45TB dataloss... :-(
> I hope someone can help!
> 
> I have Server A and Server B. Both having a 20-devices BTRFS RAID6
> filesystem. Because of known RAID5/6 risks, Server B was a backup of
> Server A.
> After applying updates to server B and reboot, the FS would not mount
> anymore. Because it was "just" a backup. I decided to recreate the FS
> and perform a new backup. Later, I discovered that the FS was not
> broken, but I faced this issue: 
> https://patchwork.kernel.org/patch/10694997/

Sorry for the inconvenience.

I didn't realize the max_chunk_size limit isn't reliable at that timing.

> 
> Anyway, the FS was already recreated, so I needed to do a new backup.
> During the backup (using rsync -vah), Server A (the source) encountered
> an I/O error and my rsync failed. In an attempt to "quick fix" the
> issue, I rebooted Server A after which the FS would not mount anymore.

Did you have any dmesg about that IO error?

And how is the reboot scheduled? Forced power off or normal reboot command?

> 
> I documented what I have tried, below. I have not yet tried anything
> except what is shown, because I am afraid of causing more harm to
> the FS.

Pretty clever; not running btrfs check --repair was a good move.

> I hope somebody here can give me advice on how to (hopefully)
> retrieve my data...
> 
> Thanks in advance!
> 
> ==
> 
> [root@cornelis ~]# btrfs fi show
> Label: 'cornelis-btrfs'  uuid: ac643516-670e-40f3-aa4c-f329fc3795fd
>   Total devices 1 FS bytes used 463.92GiB
>   devid1 size 800.00GiB used 493.02GiB path
> /dev/mapper/cornelis-cornelis--btrfs
> 
> Label: 'data'  uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
>   Total devices 20 FS bytes used 44.85TiB
>   devid1 size 3.64TiB used 3.64TiB path /dev/sdn2
>   devid2 size 3.64TiB used 3.64TiB path /dev/sdp2
>   devid3 size 3.64TiB used 3.64TiB path /dev/sdu2
>   devid4 size 3.64TiB used 3.64TiB path /dev/sdx2
>   devid5 size 3.64TiB used 3.64TiB path /dev/sdh2
>   devid6 size 3.64TiB used 3.64TiB path /dev/sdg2
>   devid7 size 3.64TiB used 3.64TiB path /dev/sdm2
>   devid8 size 3.64TiB used 3.64TiB path /dev/sdw2
>   devid9 size 3.64TiB used 3.64TiB path /dev/sdj2
>   devid   10 size 3.64TiB used 3.64TiB path /dev/sdt2
>   devid   11 size 3.64TiB used 3.64TiB path /dev/sdk2
>   devid   12 size 3.64TiB used 3.64TiB path /dev/sdq2
>   devid   13 size 3.64TiB used 3.64TiB path /dev/sds2
>   devid   14 size 3.64TiB used 3.64TiB path /dev/sdf2
>   devid   15 size 7.28TiB used 588.80GiB path /dev/sdr2
>   devid   16 size 7.28TiB used 588.80GiB path /dev/sdo2
>   devid   17 size 7.28TiB used 588.80GiB path /dev/sdv2
>   devid   18 size 7.28TiB used 588.80GiB path /dev/sdi2
>   devid   19 size 7.28TiB used 588.80GiB path /dev/sdl2
>   devid   20 size 7.28TiB used 588.80GiB path /dev/sde2
> 
> [root@cornelis ~]# mount /dev/sdn2 /mnt/data
> mount: /mnt/data: wrong fs type, bad option, bad superblock on
> /dev/sdn2, missing codepage or helper program, or other error.

What is the dmesg of the mount failure?

And have you tried -o ro,degraded?
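
E.g. (device and mountpoint as in your btrfs fi show output above):

mount -o ro,degraded /dev/sdn2 /mnt/data
dmesg | tail -n 40     # capture the kernel messages from that attempt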

> 
> [root@cornelis ~]# btrfs check /dev/sdn2
> Opening filesystem to check...
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> checksum verify failed on 46451963543552 found A8F2A769 wanted 4C111ADF
> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
> checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
> bad tree block 46451963543552, bytenr mismatch, want=46451963543552,
> have=75208089814272
> Couldn't read tree root

Would you please also paste the output of "btrfs ins dump-super /dev/sdn2" ?

It looks like your tree root (or at least some tree root nodes/leaves)
got corrupted.

> ERROR: cannot open file system

And since it's your tree root that is corrupted, you could also try
"btrfs-find-root " to get a good old copy of your tree root.

But I suspect the corruption happened before you noticed, so an old
tree root may not help much.

Also, the output of "btrfs ins dump-tree -t root " will help.
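
Spelled out against /dev/sdn2 from above ("btrfs ins" is short for
"inspect-internal"); the restore dry run with an alternate root bytenr is
only a sketch of how a root found by btrfs-find-root is typically used,
not a prescribed step, and the recovery path is hypothetical:

btrfs inspect-internal dump-super /dev/sdn2           # superblock, including backup root pointers
btrfs-find-root /dev/sdn2                             # search the device for older tree root candidates
btrfs inspect-internal dump-tree -t root /dev/sdn2    # dump the root tree (may fail if it is corrupted)
btrfs restore -D -t <bytenr> /dev/sdn2 /mnt/recovery  # dry run: list what an alternate tree root would recover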

Thanks,
Qu
> 
> [root@cornelis ~]# btrfs restore /dev/sdn2 /mnt/data/
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> parent transid verify failed on 46451963543552 wanted 114401 found
> 114173
> checksum verify failed on 464

Need help with potential ~45TB dataloss

2018-11-30 Thread Patrick Dijkgraaf
Hi all,

I have been a happy BTRFS user for quite some time. But now I'm facing
a potential ~45TB dataloss... :-(
I hope someone can help!

I have Server A and Server B. Both having a 20-devices BTRFS RAID6
filesystem. Because of known RAID5/6 risks, Server B was a backup of
Server A.
After applying updates to server B and reboot, the FS would not mount
anymore. Because it was "just" a backup. I decided to recreate the FS
and perform a new backup. Later, I discovered that the FS was not
broken, but I faced this issue: 
https://patchwork.kernel.org/patch/10694997/

Anyway, the FS was already recreated, so I needed to do a new backup.
During the backup (using rsync -vah), Server A (the source) encountered
an I/O error and my rsync failed. In an attempt to "quick fix" the
issue, I rebooted Server A after which the FS would not mount anymore.

I documented what I have tried, below. I have not yet tried anything
except what is shown, because I am afraid of causing more harm to
the FS. I hope somebody here can give me advice on how to (hopefully)
retrieve my data...

Thanks in advance!

==

[root@cornelis ~]# btrfs fi show
Label: 'cornelis-btrfs'  uuid: ac643516-670e-40f3-aa4c-f329fc3795fd
Total devices 1 FS bytes used 463.92GiB
devid1 size 800.00GiB used 493.02GiB path
/dev/mapper/cornelis-cornelis--btrfs

Label: 'data'  uuid: 4c66fa8b-8fc6-4bba-9d83-02a2a1d69ad5
Total devices 20 FS bytes used 44.85TiB
devid1 size 3.64TiB used 3.64TiB path /dev/sdn2
devid2 size 3.64TiB used 3.64TiB path /dev/sdp2
devid3 size 3.64TiB used 3.64TiB path /dev/sdu2
devid4 size 3.64TiB used 3.64TiB path /dev/sdx2
devid5 size 3.64TiB used 3.64TiB path /dev/sdh2
devid6 size 3.64TiB used 3.64TiB path /dev/sdg2
devid7 size 3.64TiB used 3.64TiB path /dev/sdm2
devid8 size 3.64TiB used 3.64TiB path /dev/sdw2
devid9 size 3.64TiB used 3.64TiB path /dev/sdj2
devid   10 size 3.64TiB used 3.64TiB path /dev/sdt2
devid   11 size 3.64TiB used 3.64TiB path /dev/sdk2
devid   12 size 3.64TiB used 3.64TiB path /dev/sdq2
devid   13 size 3.64TiB used 3.64TiB path /dev/sds2
devid   14 size 3.64TiB used 3.64TiB path /dev/sdf2
devid   15 size 7.28TiB used 588.80GiB path /dev/sdr2
devid   16 size 7.28TiB used 588.80GiB path /dev/sdo2
devid   17 size 7.28TiB used 588.80GiB path /dev/sdv2
devid   18 size 7.28TiB used 588.80GiB path /dev/sdi2
devid   19 size 7.28TiB used 588.80GiB path /dev/sdl2
devid   20 size 7.28TiB used 588.80GiB path /dev/sde2

[root@cornelis ~]# mount /dev/sdn2 /mnt/data
mount: /mnt/data: wrong fs type, bad option, bad superblock on
/dev/sdn2, missing codepage or helper program, or other error.

[root@cornelis ~]# btrfs check /dev/sdn2
Opening filesystem to check...
parent transid verify failed on 46451963543552 wanted 114401 found
114173
parent transid verify failed on 46451963543552 wanted 114401 found
114173
checksum verify failed on 46451963543552 found A8F2A769 wanted 4C111ADF
checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
bad tree block 46451963543552, bytenr mismatch, want=46451963543552,
have=75208089814272
Couldn't read tree root
ERROR: cannot open file system

[root@cornelis ~]# btrfs restore /dev/sdn2 /mnt/data/
parent transid verify failed on 46451963543552 wanted 114401 found
114173
parent transid verify failed on 46451963543552 wanted 114401 found
114173
checksum verify failed on 46451963543552 found A8F2A769 wanted 4C111ADF
checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
checksum verify failed on 46451963543552 found 32153BE8 wanted 8B07ABE4
bad tree block 46451963543552, bytenr mismatch, want=46451963543552,
have=75208089814272
Couldn't read tree root
Could not open root, trying backup super
warning, device 14 is missing
warning, device 13 is missing
warning, device 12 is missing
warning, device 11 is missing
warning, device 10 is missing
warning, device 9 is missing
warning, device 8 is missing
warning, device 7 is missing
warning, device 6 is missing
warning, device 5 is missing
warning, device 4 is missing
warning, device 3 is missing
warning, device 2 is missing
checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
checksum verify failed on 22085632 found 5630EA32 wanted 1AA6FFF0
bad tree block 22085632, bytenr mismatch, want=22085632,
have=1147797504
ERROR: cannot read chunk root
Could not open root, trying backup super
warning, device 14 is missing
warning, device 13 is missing
warning, device 12 is missing
warning, device 11 is missing
warning, device 10 is missing
warning, device 9 is missing
warning, device 8 is missing
warning, device 7 is missing
warning, device 6 is missing
warning, device 5 is missi

[PATCH v2 15/20] btrfs-progs: sub list: Update help message of -d option

2018-06-18 Thread Misono Tomohiro
Explicitly states that -d requires root privileges.
Also, update some option handling with regard to -d option.

Signed-off-by: Misono Tomohiro 
---
 Documentation/btrfs-subvolume.asciidoc | 3 ++-
 cmds-subvolume.c   | 8 
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/Documentation/btrfs-subvolume.asciidoc 
b/Documentation/btrfs-subvolume.asciidoc
index 0381c92c..2db1d479 100644
--- a/Documentation/btrfs-subvolume.asciidoc
+++ b/Documentation/btrfs-subvolume.asciidoc
@@ -149,7 +149,8 @@ only snapshot subvolumes in the filesystem will be listed.
 -r
 only readonly subvolumes in the filesystem will be listed.
 -d
-list deleted subvolumes that are not yet cleaned.
+list deleted subvolumes that are not yet cleaned
+(require root privileges).
 
 Other;;
 -t
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 552c6dea..ef39789a 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -1569,6 +1569,7 @@ static const char * const cmd_subvol_list_usage[] = {
"-s   list only snapshots",
"-r   list readonly subvolumes (including snapshots)",
"-d   list deleted subvolumes that are not yet cleaned",
+   " (require root privileges)",
"",
"Other:",
"-t   print the result as a table",
@@ -1744,6 +1745,13 @@ static int cmd_subvol_list(int argc, char **argv)
goto out;
}
 
+   if (filter_set->only_deleted &&
+   (is_list_all || absolute_path || follow_mount)) {
+   ret = -1;
+   error("cannot use -d with -a/f/A option");
+   goto out;
+   }
+
subvol = argv[optind];
fd = btrfs_open_dir(subvol, , 1);
if (fd < 0) {
-- 
2.14.4
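
For context, a usage sketch of the behaviour this patch documents (mount
point hypothetical):

sudo btrfs subvolume list -d /mnt     # list deleted subvolumes not yet cleaned; root is required
btrfs subvolume list -d -a /mnt       # rejected after this patch: -d cannot be combined with -a/-f/-A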




[PATCH v2 14/20] btrfs-progs: sub list: Update help message of -o option

2018-06-18 Thread Misono Tomohiro
Currently "sub list -o" lists only child subvolumes of the specified
path. So, update help message and variable name more appropriately.

Signed-off-by: Misono Tomohiro 
---
 Documentation/btrfs-subvolume.asciidoc |  2 +-
 cmds-subvolume.c   | 10 +-
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/Documentation/btrfs-subvolume.asciidoc 
b/Documentation/btrfs-subvolume.asciidoc
index 20fae1e1..0381c92c 100644
--- a/Documentation/btrfs-subvolume.asciidoc
+++ b/Documentation/btrfs-subvolume.asciidoc
@@ -116,7 +116,7 @@ or at mount time via the subvolid= option.
 +
 Path filtering;;
 -o
-print only subvolumes below specified .
+print only subvolumes which the subvolume of  contains.
 -a
 print all the subvolumes in the filesystem, including subvolumes
 which cannot be accessed from current mount point.
diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index dab266aa..552c6dea 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -1550,7 +1550,7 @@ static const char * const cmd_subvol_list_usage[] = {
"It is possible to specify non-subvolume directory as .",
"",
"Path filtering:",
-   "-o   print only subvolumes below specified path",
+   "-o   print only subvolumes which the subvolume of  
contains",
"-a   print all the subvolumes in the filesystem.",
" path to be shown is relative to the top-level",
" subvolume (require root privileges)",
@@ -1605,7 +1605,7 @@ static int cmd_subvol_list(int argc, char **argv)
int follow_mount = 0;
int sort = 0;
int no_sort = 0;
-   int is_only_in_path = 0;
+   int is_only_child = 0;
int absolute_path = 0;
DIR *dirstream = NULL;
enum btrfs_list_layout layout = BTRFS_LIST_LAYOUT_DEFAULT;
@@ -1651,7 +1651,7 @@ static int cmd_subvol_list(int argc, char **argv)
btrfs_list_setup_print_column_v2(BTRFS_LIST_GENERATION);
break;
case 'o':
-   is_only_in_path = 1;
+   is_only_child = 1;
break;
case 't':
layout = BTRFS_LIST_LAYOUT_TABLE;
@@ -1732,7 +1732,7 @@ static int cmd_subvol_list(int argc, char **argv)
goto out;
}
 
-   if (follow_mount && (is_list_all || is_only_in_path)) {
+   if (follow_mount && (is_list_all || is_only_child)) {
ret = -1;
error("cannot use -f with -a or -o option");
goto out;
@@ -1760,7 +1760,7 @@ static int cmd_subvol_list(int argc, char **argv)
if (ret)
goto out;
 
-   if (is_only_in_path)
+   if (is_only_child)
btrfs_list_setup_filter_v2(_set,
BTRFS_LIST_FILTER_TOPID_EQUAL,
top_id);
-- 
2.14.4
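
For context, a usage sketch of the option whose help text is updated here
(path hypothetical):

btrfs subvolume list -o /mnt/data     # list only child subvolumes of the subvolume that /mnt/data belongs to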




[PATCH 10/18] btrfs-progs: reorder placement of help declarations for send/receive

2018-05-16 Thread jeffm
From: Jeff Mahoney 

The usage definitions for send and receive follow the command
definitions, which use them.  This works because we declare them
in commands.h.  When we move to using cmd_struct as the entry point,
these declarations will be removed, breaking the commands.  Since
that would be an otherwise unrelated change, this patch reorders
them separately.

Signed-off-by: Jeff Mahoney 
---
 cmds-receive.c | 62 ++--
 cmds-send.c| 69 +-
 2 files changed, 66 insertions(+), 65 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index 68123a31..b3709f36 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -1248,6 +1248,37 @@ out:
return ret;
 }
 
+const char * const cmd_receive_usage[] = {
+   "btrfs receive [options] \n"
+   "btrfs receive --dump [options]",
+   "Receive subvolumes from a stream",
+   "Receives one or more subvolumes that were previously",
+   "sent with btrfs send. The received subvolumes are stored",
+   "into MOUNT.",
+   "The receive will fail in case the receiving subvolume",
+   "already exists. It will also fail in case a previously",
+   "received subvolume has been changed after it was received.",
+   "After receiving a subvolume, it is immediately set to",
+   "read-only.",
+   "",
+   "-v   increase verbosity about performed actions",
+   "-f FILE  read the stream from FILE instead of stdin",
+   "-e   terminate after receiving an  marker in the 
stream.",
+   " Without this option the receiver side terminates only 
in case",
+   " of an error on end of file.",
+   "-C|--chroot  confine the process to  using chroot",
+   "-E|--max-errors NERR",
+   " terminate as soon as NERR errors occur while",
+   " stream processing commands from the stream.",
+   " Default value is 1. A value of 0 means no limit.",
+   "-m ROOTMOUNT the root mount point of the destination filesystem.",
+   " If /proc is not accessible, use this to tell us 
where",
+   " this file system is mounted.",
+   "--dump   dump stream metadata, one line per operation,",
+   " does not require the MOUNT parameter",
+   NULL
+};
+
 int cmd_receive(int argc, char **argv)
 {
char *tomnt = NULL;
@@ -1357,34 +1388,3 @@ out:
 
return !!ret;
 }
-
-const char * const cmd_receive_usage[] = {
-   "btrfs receive [options] \n"
-   "btrfs receive --dump [options]",
-   "Receive subvolumes from a stream",
-   "Receives one or more subvolumes that were previously",
-   "sent with btrfs send. The received subvolumes are stored",
-   "into MOUNT.",
-   "The receive will fail in case the receiving subvolume",
-   "already exists. It will also fail in case a previously",
-   "received subvolume has been changed after it was received.",
-   "After receiving a subvolume, it is immediately set to",
-   "read-only.",
-   "",
-   "-v   increase verbosity about performed actions",
-   "-f FILE  read the stream from FILE instead of stdin",
-   "-e   terminate after receiving an  marker in the 
stream.",
-   " Without this option the receiver side terminates only 
in case",
-   " of an error on end of file.",
-   "-C|--chroot  confine the process to  using chroot",
-   "-E|--max-errors NERR",
-   " terminate as soon as NERR errors occur while",
-   " stream processing commands from the stream.",
-   " Default value is 1. A value of 0 means no limit.",
-   "-m ROOTMOUNT the root mount point of the destination filesystem.",
-   " If /proc is not accessible, use this to tell us 
where",
-   " this file system is mounted.",
-   "--dump   dump stream metadata, one line per operation,",
-   " does not require the MOUNT parameter",
-   NULL
-};
diff --git a/cmds-send.c b/cmds-send.c
index c5ecdaa1..8365e9c9 100644
--- a/cmds-send.c
+++ b/cmds-send.c
@@ -489,6 +489,41 @@ static void free_send_info(struct btrfs_send *sctx)
subvol_uuid_search_finit(>sus);
 }
 
+
+const char * const cmd_send_usage[] = {
+   "btrfs send [-ve] [-p ] [-c ] [-f ] 
 [...]",
+   "Send the subvolume(s) to stdout.",
+   "Sends the subvolume(s) specified by  to stdout.",
+   " should be read-only here.",
+   "By default, this will send the whole subvolume. To do an incremental",
+   "send, use '-p '. If you want to allow btrfs to clone from",
+   "any additional local snapshots, use '-c ' (multiple times",
+   

[PATCH 09/18] btrfs-progs: help: convert ints used as bools to bool

2018-05-16 Thread jeffm
From: Jeff Mahoney <je...@suse.com>

We use an int for 'full', 'all', and 'err' when we really mean a boolean.

Reviewed-by: Qu Wenruo <w...@suse.com>
Signed-off-by: Jeff Mahoney <je...@suse.com>
---
 btrfs.c | 14 +++---
 help.c  | 25 +
 help.h  |  4 ++--
 3 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index 2d39f2ce..fec1a135 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -109,7 +109,7 @@ static void handle_help_options_next_level(const struct cmd_struct *cmd,
argv++;
help_command_group(cmd->next, argc, argv);
} else {
-   usage_command(cmd, 1, 0);
+   usage_command(cmd, true, false);
}
 
exit(0);
@@ -125,7 +125,7 @@ int handle_command_group(const struct cmd_group *grp, int argc,
argc--;
argv++;
if (argc < 1) {
-   usage_command_group(grp, 0, 0);
+   usage_command_group(grp, false, false);
exit(1);
}
 
@@ -212,20 +212,20 @@ static int handle_global_options(int argc, char **argv)
 
 void handle_special_globals(int shift, int argc, char **argv)
 {
-   int has_help = 0;
-   int has_full = 0;
+   bool has_help = false;
+   bool has_full = false;
int i;
 
for (i = 0; i < shift; i++) {
if (strcmp(argv[i], "--help") == 0)
-   has_help = 1;
+   has_help = true;
else if (strcmp(argv[i], "--full") == 0)
-   has_full = 1;
+   has_full = true;
}
 
if (has_help) {
if (has_full)
-   usage_command_group(&btrfs_cmd_group, 1, 0);
+   usage_command_group(&btrfs_cmd_group, true, false);
else
cmd_help(argc, argv);
exit(0);
diff --git a/help.c b/help.c
index f1dd3946..99fd325b 100644
--- a/help.c
+++ b/help.c
@@ -196,8 +196,8 @@ static int do_usage_one_command(const char * const *usagestr,
 }
 
 static int usage_command_internal(const char * const *usagestr,
- const char *token, int full, int lst,
- int alias, FILE *outf)
+ const char *token, bool full, bool lst,
+ bool alias, FILE *outf)
 {
unsigned int flags = 0;
int ret;
@@ -223,17 +223,17 @@ static int usage_command_internal(const char * const *usagestr,
 }
 
 static void usage_command_usagestr(const char * const *usagestr,
-  const char *token, int full, int err)
+  const char *token, bool full, bool err)
 {
FILE *outf = err ? stderr : stdout;
int ret;
 
-   ret = usage_command_internal(usagestr, token, full, 0, 0, outf);
+   ret = usage_command_internal(usagestr, token, full, false, false, outf);
if (!ret)
fputc('\n', outf);
 }
 
-void usage_command(const struct cmd_struct *cmd, int full, int err)
+void usage_command(const struct cmd_struct *cmd, bool full, bool err)
 {
usage_command_usagestr(cmd->usagestr, cmd->token, full, err);
 }
@@ -241,11 +241,11 @@ void usage_command(const struct cmd_struct *cmd, int full, int err)
 __attribute__((noreturn))
 void usage(const char * const *usagestr)
 {
-   usage_command_usagestr(usagestr, NULL, 1, 1);
+   usage_command_usagestr(usagestr, NULL, true, true);
exit(1);
 }
 
-static void usage_command_group_internal(const struct cmd_group *grp, int full,
+static void usage_command_group_internal(const struct cmd_group *grp, bool full,
 FILE *outf)
 {
const struct cmd_struct *cmd = grp->commands;
@@ -265,7 +265,8 @@ static void usage_command_group_internal(const struct cmd_group *grp, int full,
}
 
usage_command_internal(cmd->usagestr, cmd->token, full,
-  1, cmd->flags & CMD_ALIAS, outf);
+  true, cmd->flags & CMD_ALIAS,
+  outf);
if (cmd->flags & CMD_ALIAS)
putchar('\n');
continue;
@@ -327,7 +328,7 @@ void usage_command_group_short(const struct cmd_group *grp)
fprintf(stderr, "All command groups have their manual page named 
'btrfs-'.\n");
 }
 
-void usage_command_group(const struct cmd_group *grp, int full, int err)
+void usage_command_group(const struct cmd_group *grp, bool full, bool err)
 {
const char * const *usagestr = grp->usagestr;
FILE *outf = err ? stderr : stdout;
@@ -350,7 +351,7 @@ __attribute__((nore



Re: [PATCH 10/20] btrfs-progs: help: convert ints used as bools to bool

2018-03-07 Thread Qu Wenruo


On 2018-03-08 10:40, je...@suse.com wrote:
> From: Jeff Mahoney <je...@suse.com>
> 
> We use an int for 'full', 'all', and 'err' when we really mean a boolean.
> 
> Signed-off-by: Jeff Mahoney <je...@suse.com>

Reviewed-by: Qu Wenruo <w...@suse.com>

Thanks,
Qu

> ---
>  btrfs.c | 14 +++---
>  help.c  | 25 +
>  help.h  |  4 ++--
>  3 files changed, 22 insertions(+), 21 deletions(-)
> 
> diff --git a/btrfs.c b/btrfs.c
> index 2d39f2ce..fec1a135 100644
> --- a/btrfs.c
> +++ b/btrfs.c
> @@ -109,7 +109,7 @@ static void handle_help_options_next_level(const struct cmd_struct *cmd,
>   argv++;
>   help_command_group(cmd->next, argc, argv);
>   } else {
> - usage_command(cmd, 1, 0);
> + usage_command(cmd, true, false);
>   }
>  
>   exit(0);
> @@ -125,7 +125,7 @@ int handle_command_group(const struct cmd_group *grp, int argc,
>   argc--;
>   argv++;
>   if (argc < 1) {
> - usage_command_group(grp, 0, 0);
> + usage_command_group(grp, false, false);
>   exit(1);
>   }
>  
> @@ -212,20 +212,20 @@ static int handle_global_options(int argc, char **argv)
>  
>  void handle_special_globals(int shift, int argc, char **argv)
>  {
> - int has_help = 0;
> - int has_full = 0;
> + bool has_help = false;
> + bool has_full = false;
>   int i;
>  
>   for (i = 0; i < shift; i++) {
>   if (strcmp(argv[i], "--help") == 0)
> - has_help = 1;
> + has_help = true;
>   else if (strcmp(argv[i], "--full") == 0)
> - has_full = 1;
> + has_full = true;
>   }
>  
>   if (has_help) {
>   if (has_full)
> - usage_command_group(&btrfs_cmd_group, 1, 0);
> + usage_command_group(&btrfs_cmd_group, true, false);
>   else
>   cmd_help(argc, argv);
>   exit(0);
> diff --git a/help.c b/help.c
> index 311a4320..ef7986b4 100644
> --- a/help.c
> +++ b/help.c
> @@ -196,8 +196,8 @@ static int do_usage_one_command(const char * const *usagestr,
>  }
>  
>  static int usage_command_internal(const char * const *usagestr,
> -   const char *token, int full, int lst,
> -   int alias, FILE *outf)
> +   const char *token, bool full, bool lst,
> +   bool alias, FILE *outf)
>  {
>   unsigned int flags = 0;
>   int ret;
> @@ -223,17 +223,17 @@ static int usage_command_internal(const char * const *usagestr,
>  }
>  
>  static void usage_command_usagestr(const char * const *usagestr,
> -const char *token, int full, int err)
> +const char *token, bool full, bool err)
>  {
>   FILE *outf = err ? stderr : stdout;
>   int ret;
>  
> - ret = usage_command_internal(usagestr, token, full, 0, 0, outf);
> + ret = usage_command_internal(usagestr, token, full, false, false, outf);
>   if (!ret)
>   fputc('\n', outf);
>  }
>  
> -void usage_command(const struct cmd_struct *cmd, int full, int err)
> +void usage_command(const struct cmd_struct *cmd, bool full, bool err)
>  {
>   usage_command_usagestr(cmd->usagestr, cmd->token, full, err);
>  }
> @@ -241,11 +241,11 @@ void usage_command(const struct cmd_struct *cmd, int full, int err)
>  __attribute__((noreturn))
>  void usage(const char * const *usagestr)
>  {
> - usage_command_usagestr(usagestr, NULL, 1, 1);
> + usage_command_usagestr(usagestr, NULL, true, true);
>   exit(1);
>  }
>  
> -static void usage_command_group_internal(const struct cmd_group *grp, int full,
> +static void usage_command_group_internal(const struct cmd_group *grp, bool full,
>FILE *outf)
>  {
>   const struct cmd_struct *cmd = grp->commands;
> @@ -265,7 +265,8 @@ static void usage_command_group_internal(const struct cmd_group *grp, int full,
>   }
>  
>   usage_command_internal(cmd->usagestr, cmd->token, full,
> -1, cmd->flags & CMD_ALIAS, outf);
> +true, cmd->flags & CMD_ALIAS,
> +outf);
>   if (cmd->flags & CMD_

[PATCH 11/20] btrfs-progs: reorder placement of help declarations for send/receive

2018-03-07 Thread jeffm
From: Jeff Mahoney <je...@suse.com>

The usage definitions for send and receive follow the command
definitions, which use them.  This works because we declare them
in commands.h.  When we move to using cmd_struct as the entry point,
these declarations will be removed, breaking the commands.  Since
that would be an otherwise unrelated change, this patch reorders
them separately.

Signed-off-by: Jeff Mahoney <je...@suse.com>
---
 cmds-receive.c | 62 ++--
 cmds-send.c| 69 +-
 2 files changed, 66 insertions(+), 65 deletions(-)

diff --git a/cmds-receive.c b/cmds-receive.c
index 68123a31..b3709f36 100644
--- a/cmds-receive.c
+++ b/cmds-receive.c
@@ -1248,6 +1248,37 @@ out:
return ret;
 }
 
+const char * const cmd_receive_usage[] = {
+   "btrfs receive [options] \n"
+   "btrfs receive --dump [options]",
+   "Receive subvolumes from a stream",
+   "Receives one or more subvolumes that were previously",
+   "sent with btrfs send. The received subvolumes are stored",
+   "into MOUNT.",
+   "The receive will fail in case the receiving subvolume",
+   "already exists. It will also fail in case a previously",
+   "received subvolume has been changed after it was received.",
+   "After receiving a subvolume, it is immediately set to",
+   "read-only.",
+   "",
+   "-v   increase verbosity about performed actions",
+   "-f FILE  read the stream from FILE instead of stdin",
+   "-e   terminate after receiving an  marker in the 
stream.",
+   " Without this option the receiver side terminates only 
in case",
+   " of an error on end of file.",
+   "-C|--chroot  confine the process to  using chroot",
+   "-E|--max-errors NERR",
+   " terminate as soon as NERR errors occur while",
+   " stream processing commands from the stream.",
+   " Default value is 1. A value of 0 means no limit.",
+   "-m ROOTMOUNT the root mount point of the destination filesystem.",
+   " If /proc is not accessible, use this to tell us 
where",
+   " this file system is mounted.",
+   "--dump   dump stream metadata, one line per operation,",
+   " does not require the MOUNT parameter",
+   NULL
+};
+
 int cmd_receive(int argc, char **argv)
 {
char *tomnt = NULL;
@@ -1357,34 +1388,3 @@ out:
 
return !!ret;
 }
-
-const char * const cmd_receive_usage[] = {
-   "btrfs receive [options] \n"
-   "btrfs receive --dump [options]",
-   "Receive subvolumes from a stream",
-   "Receives one or more subvolumes that were previously",
-   "sent with btrfs send. The received subvolumes are stored",
-   "into MOUNT.",
-   "The receive will fail in case the receiving subvolume",
-   "already exists. It will also fail in case a previously",
-   "received subvolume has been changed after it was received.",
-   "After receiving a subvolume, it is immediately set to",
-   "read-only.",
-   "",
-   "-v   increase verbosity about performed actions",
-   "-f FILE  read the stream from FILE instead of stdin",
-   "-e   terminate after receiving an  marker in the 
stream.",
-   " Without this option the receiver side terminates only 
in case",
-   " of an error on end of file.",
-   "-C|--chroot  confine the process to  using chroot",
-   "-E|--max-errors NERR",
-   " terminate as soon as NERR errors occur while",
-   " stream processing commands from the stream.",
-   " Default value is 1. A value of 0 means no limit.",
-   "-m ROOTMOUNT the root mount point of the destination filesystem.",
-   " If /proc is not accessible, use this to tell us 
where",
-   " this file system is mounted.",
-   "--dump   dump stream metadata, one line per operation,",
-   " does not require the MOUNT parameter",
-   NULL
-};
diff --git a/cmds-send.c b/cmds-send.c
index c5ecdaa1..8365e9c9 100644
--- a/cmds-send.c
+++ b/cmds-send.c
@@ -489,6 +489,41 @@ static void free_send_info(struct btrfs_send *sctx)
subvol_uuid_search_finit(&sctx->sus);
 }
 
+
+const char * const cmd_send_usage[] = {
+   "btrfs send [-ve] [-p ] [-c ] [-f ] 
 [...]",
+   "Send the subvolume(s) to stdout.",
+   "Sends the subvolume(s) specified by  to stdout.",
+   " should be read-only here.",
+   "By default, this will send the whole subvolume. To do an incremental",
+   "send, use '-p '. If you want to allow btrfs to clone from",
+   "any additional local snapshots, use '-c ' (multiple times",
+   

[PATCH 10/20] btrfs-progs: help: convert ints used as bools to bool

2018-03-07 Thread jeffm
From: Jeff Mahoney <je...@suse.com>

We use an int for 'full', 'all', and 'err' when we really mean a boolean.

Signed-off-by: Jeff Mahoney <je...@suse.com>
---
 btrfs.c | 14 +++---
 help.c  | 25 +
 help.h  |  4 ++--
 3 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/btrfs.c b/btrfs.c
index 2d39f2ce..fec1a135 100644
--- a/btrfs.c
+++ b/btrfs.c
@@ -109,7 +109,7 @@ static void handle_help_options_next_level(const struct cmd_struct *cmd,
argv++;
help_command_group(cmd->next, argc, argv);
} else {
-   usage_command(cmd, 1, 0);
+   usage_command(cmd, true, false);
}
 
exit(0);
@@ -125,7 +125,7 @@ int handle_command_group(const struct cmd_group *grp, int argc,
argc--;
argv++;
if (argc < 1) {
-   usage_command_group(grp, 0, 0);
+   usage_command_group(grp, false, false);
exit(1);
}
 
@@ -212,20 +212,20 @@ static int handle_global_options(int argc, char **argv)
 
 void handle_special_globals(int shift, int argc, char **argv)
 {
-   int has_help = 0;
-   int has_full = 0;
+   bool has_help = false;
+   bool has_full = false;
int i;
 
for (i = 0; i < shift; i++) {
if (strcmp(argv[i], "--help") == 0)
-   has_help = 1;
+   has_help = true;
else if (strcmp(argv[i], "--full") == 0)
-   has_full = 1;
+   has_full = true;
}
 
if (has_help) {
if (has_full)
-   usage_command_group(&btrfs_cmd_group, 1, 0);
+   usage_command_group(&btrfs_cmd_group, true, false);
else
cmd_help(argc, argv);
exit(0);
diff --git a/help.c b/help.c
index 311a4320..ef7986b4 100644
--- a/help.c
+++ b/help.c
@@ -196,8 +196,8 @@ static int do_usage_one_command(const char * const *usagestr,
 }
 
 static int usage_command_internal(const char * const *usagestr,
- const char *token, int full, int lst,
- int alias, FILE *outf)
+ const char *token, bool full, bool lst,
+ bool alias, FILE *outf)
 {
unsigned int flags = 0;
int ret;
@@ -223,17 +223,17 @@ static int usage_command_internal(const char * const *usagestr,
 }
 
 static void usage_command_usagestr(const char * const *usagestr,
-  const char *token, int full, int err)
+  const char *token, bool full, bool err)
 {
FILE *outf = err ? stderr : stdout;
int ret;
 
-   ret = usage_command_internal(usagestr, token, full, 0, 0, outf);
+   ret = usage_command_internal(usagestr, token, full, false, false, outf);
if (!ret)
fputc('\n', outf);
 }
 
-void usage_command(const struct cmd_struct *cmd, int full, int err)
+void usage_command(const struct cmd_struct *cmd, bool full, bool err)
 {
usage_command_usagestr(cmd->usagestr, cmd->token, full, err);
 }
@@ -241,11 +241,11 @@ void usage_command(const struct cmd_struct *cmd, int full, int err)
 __attribute__((noreturn))
 void usage(const char * const *usagestr)
 {
-   usage_command_usagestr(usagestr, NULL, 1, 1);
+   usage_command_usagestr(usagestr, NULL, true, true);
exit(1);
 }
 
-static void usage_command_group_internal(const struct cmd_group *grp, int full,
+static void usage_command_group_internal(const struct cmd_group *grp, bool full,
 FILE *outf)
 {
const struct cmd_struct *cmd = grp->commands;
@@ -265,7 +265,8 @@ static void usage_command_group_internal(const struct cmd_group *grp, int full,
}
 
usage_command_internal(cmd->usagestr, cmd->token, full,
-  1, cmd->flags & CMD_ALIAS, outf);
+  true, cmd->flags & CMD_ALIAS,
+  outf);
if (cmd->flags & CMD_ALIAS)
putchar('\n');
continue;
@@ -327,7 +328,7 @@ void usage_command_group_short(const struct cmd_group *grp)
fprintf(stderr, "All command groups have their manual page named 
'btrfs-'.\n");
 }
 
-void usage_command_group(const struct cmd_group *grp, int full, int err)
+void usage_command_group(const struct cmd_group *grp, bool full, bool err)
 {
const char * const *usagestr = grp->usagestr;
FILE *outf = err ? stderr : stdout;
@@ -350,7 +351,7 @@ __attribute__((noreturn))
 void help_unknown_token(const ch

RE: Help with leaf parent key incorrect

2018-02-26 Thread Paul Jones
> -Original Message-
> From: Anand Jain [mailto:anand.j...@oracle.com]
> Sent: Monday, 26 February 2018 7:27 PM
> To: Paul Jones <p...@pauljones.id.au>; linux-btrfs@vger.kernel.org
> Subject: Re: Help with leaf parent key incorrect
> 
> 
> 
>  > There is one io error in the log below,
> 
> Apparently, that's not a real EIO. We need to fix it.
> But can't be the root cause we are looking for here.
> 
> 
>  > Feb 24 22:41:59 home kernel: BTRFS: error (device dm-6) in
> btrfs_run_delayed_refs:3076: errno=-5 IO failure
>  > Feb 24 22:41:59 home kernel: BTRFS info (device dm-6): forced readonly
> 
> static int run_delayed_extent_op(struct btrfs_trans_handle *trans,
>   struct btrfs_fs_info *fs_info,
>   struct btrfs_delayed_ref_head *head,
>   struct btrfs_delayed_extent_op *extent_op) {
> ::
> 
>  } else {
>  err = -EIO;
>  goto out;
>  }
> 
> 
>  > but other than that I have never had io errors before, or any other
> troubles.
> 
>   Hm. btrfs dev stat shows real disk IO errors.
>   As this FS isn't mountable .. pls try
>btrfs dev stat  > file
>search for 'device stats', there will be one for each disk.
>   Or it reports in the syslog when it happens not necessarily
>   during dedupe.

vm-server ~ # btrfs dev stat /media/storage/
[/dev/mapper/b-storage--b].write_io_errs0
[/dev/mapper/b-storage--b].read_io_errs 0
[/dev/mapper/b-storage--b].flush_io_errs0
[/dev/mapper/b-storage--b].corruption_errs  0
[/dev/mapper/b-storage--b].generation_errs  0
[/dev/mapper/a-storage--a].write_io_errs0
[/dev/mapper/a-storage--a].read_io_errs 0
[/dev/mapper/a-storage--a].flush_io_errs0
[/dev/mapper/a-storage--a].corruption_errs  0
[/dev/mapper/a-storage--a].generation_errs  0
vm-server ~ # btrfs dev stat /
[/dev/sdb1].write_io_errs0
[/dev/sdb1].read_io_errs 0
[/dev/sdb1].flush_io_errs0
[/dev/sdb1].corruption_errs  0
[/dev/sdb1].generation_errs  0
[/dev/sda1].write_io_errs0
[/dev/sda1].read_io_errs 0
[/dev/sda1].flush_io_errs0
[/dev/sda1].corruption_errs  0
[/dev/sda1].generation_errs  0
vm-server ~ # btrfs dev stat /dev/mapper/a-backup--a
ERROR: '/dev/mapper/a-backup--a' is not a mounted btrfs device

I check syslog regularly and I haven't seen any errors on any drives for over a 
year.

> 
>  > One of my other filesystems shares the same two discs and it is still fine, 
> so I
> think the hardware is probably ok.
>   Right. I guess that too. A confirmation will be better.
>  > I've copied the beginning of the errors below.
> 
> 
>   At my end finding the root cause of 'parent transid verify failed'
>   during/after dedupe is kind of fading, as the disk seems to have had
>   no issues, which is what I had in mind.
> 
>   Also, there wasn't abrupt power-recycle here? I presume.

No, although now that I think about it, I just realised it happened right after 
I upgraded from 4.15.4 to 4.15.5. I didn't quit bees before rebooting; I let 
the system do it. Not sure if it's relevant or not.
I also just noticed that the kernel has spawned hundreds of kworkers - the 
highest number I can see is 516.

> 
>   It's better to save the output disk1-log and disk2-log as below
>   before further efforts to recovery. Just in case if something
>   pops out.
> 
>btrfs in dump-super -fa disk1 > disk1-log
>btrfs in dump-tree --degraded disk1 >> disk1-log [1]

I applied the patch and started dumping the tree, but I stopped it after about 
10 mins and 9GB.
Because I use zstd and the free space tree, the recovery tools wouldn't do anything 
in RW mode, so I've decided to just blow it away and restore from a backup.
I made a block level copy of both discs in case I need anything.

Thanks for your help anyway.

Regards,
Paul.


Re: Help with leaf parent key incorrect

2018-02-26 Thread Anand Jain



> There is one io error in the log below,

Apparently, that's not a real EIO. We need to fix it.
But can't be the root cause we are looking for here.


> Feb 24 22:41:59 home kernel: BTRFS: error (device dm-6) in 
btrfs_run_delayed_refs:3076: errno=-5 IO failure

> Feb 24 22:41:59 home kernel: BTRFS info (device dm-6): forced readonly

static int run_delayed_extent_op(struct btrfs_trans_handle *trans,
 struct btrfs_fs_info *fs_info,
 struct btrfs_delayed_ref_head *head,
 struct btrfs_delayed_extent_op *extent_op)
{
::

} else {
err = -EIO;
goto out;
}


> but other than that I have never had io errors before, or any other 
troubles.


 Hm. btrfs dev stat shows real disk IO errors.
 As this FS isn't mountable .. pls try
  btrfs dev stat  > file
  search for 'device stats', there will be one for each disk.
 Or it reports in the syslog when it happens not necessarily
 during dedupe.

> One of my other filesystems shares the same two discs and it is still 
fine, so I think the hardware is probably ok.

 Right. I guess that too. A confirmation will be better.
> I've copied the beginning of the errors below.


 At my end finding the root cause of 'parent transid verify failed'
 during/after dedupe is kind of fading, as the disk seems to have had
 no issues, which is what I had in mind.

 Also, there wasn't abrupt power-recycle here? I presume.

 It's better to save the output disk1-log and disk2-log as below
 before further efforts to recovery. Just in case if something
 pops out.

  btrfs in dump-super -fa disk1 > disk1-log
  btrfs in dump-tree --degraded disk1 >> disk1-log [1]

  btrfs in dump-super -fa disk2 > disk2-log
  btrfs in dump-tree --degraded disk2 >> disk2-log [1]

 [1]
  --degraded option is in the ML.
  [PATCH] btrfs-progs: dump-tree: add degraded option

Thanks, Anand


Re: Help with leaf parent key incorrect

2018-02-25 Thread Anand Jain



On 02/25/2018 06:16 PM, Paul Jones wrote:

Hi all,

I was running dedupe on my filesystem and something went wrong overnight; by 
the time I noticed, the fs was readonly.


 Thanks for the report. I have a few questions:
  Kind of raid profile used here?
  Dedupe tool that was used?
  Was the fs full before dedupe?
  Were there any IO errors?

Thanks, Anand


When trying to check it this is what I get:
vm-server ~ # btrfs check /dev/mapper/a-backup--a
parent transid verify failed on 2371034071040 wanted 62977 found 62893
parent transid verify failed on 2371034071040 wanted 62977 found 62893
parent transid verify failed on 2371034071040 wanted 62977 found 62893
parent transid verify failed on 2371034071040 wanted 62977 found 62893
Ignoring transid failure
leaf parent key incorrect 2371034071040
ERROR: cannot open file system

Is there a way to fix this? I'm using kernel 4.15.5

This is the last part of dmesg

[  +0.02] BTRFS error (device dm-6): parent transid verify failed on 
2374016368640 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 
2374016368640 wanted 63210 found 63208
[  +1.107963] BTRFS error (device dm-6): parent transid verify failed on 
2374016368640 wanted 63210 found 63208
[  +0.05] BTRFS error (device dm-6): parent transid verify failed on 
2374016368640 wanted 63210 found 63208
[  +1.473598] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +0.001927] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +0.03] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +0.60] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +0.01] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +2.676048] verify_parent_transid: 10362 callbacks suppressed
[  +0.02] BTRFS error (device dm-6): parent transid verify failed on 
2373991677952 wanted 63210 found 63208
[  +0.03] BTRFS error (device dm-6): parent transid verify failed on 
2373991677952 wanted 63210 found 63208
[  +0.078432] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.43] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.01] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.058638] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.139174] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[Feb25 20:48] BTRFS info (device dm-6): using free space tree
[  +0.02] BTRFS error (device dm-6): Remounting read-write after error is 
not allowed
[Feb25 20:49] BTRFS error (device dm-6): cleaner transaction attach returned -30
[  +0.238718] BTRFS warning (device dm-6): page private not zero on page 
1596642967552
[  +0.03] BTRFS warning (device dm-6): page private not zero on page 
1596642971648
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596642975744
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596642979840
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643672064
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643676160
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643680256
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643684352
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643704832
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643708928
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643713024
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643717120
[  +0.28] BTRFS warning (device dm-6): page private not zero on page 
2363051098112
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 
2363051102208
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 
2363051106304
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 
2363051110400
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 
2368056344576
[  +0.00] BTRFS 

Help with leaf parent key incorrect

2018-02-25 Thread Paul Jones
Hi all,

I was running dedupe on my filesystem and something went wrong overnight; by 
the time I noticed, the fs was readonly.
When trying to check it this is what I get:
vm-server ~ # btrfs check /dev/mapper/a-backup--a
parent transid verify failed on 2371034071040 wanted 62977 found 62893
parent transid verify failed on 2371034071040 wanted 62977 found 62893
parent transid verify failed on 2371034071040 wanted 62977 found 62893
parent transid verify failed on 2371034071040 wanted 62977 found 62893
Ignoring transid failure
leaf parent key incorrect 2371034071040
ERROR: cannot open file system

Is there a way to fix this? I'm using kernel 4.15.5

This is the last part of dmesg

[  +0.02] BTRFS error (device dm-6): parent transid verify failed on 
2374016368640 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 
2374016368640 wanted 63210 found 63208
[  +1.107963] BTRFS error (device dm-6): parent transid verify failed on 
2374016368640 wanted 63210 found 63208
[  +0.05] BTRFS error (device dm-6): parent transid verify failed on 
2374016368640 wanted 63210 found 63208
[  +1.473598] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +0.001927] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +0.03] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +0.60] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +0.01] BTRFS error (device dm-6): parent transid verify failed on 
2373996298240 wanted 63210 found 63208
[  +2.676048] verify_parent_transid: 10362 callbacks suppressed
[  +0.02] BTRFS error (device dm-6): parent transid verify failed on 
2373991677952 wanted 63210 found 63208
[  +0.03] BTRFS error (device dm-6): parent transid verify failed on 
2373991677952 wanted 63210 found 63208
[  +0.078432] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.43] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.01] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.058638] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.139174] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[  +0.04] BTRFS error (device dm-6): parent transid verify failed on 
2373996232704 wanted 63210 found 63208
[Feb25 20:48] BTRFS info (device dm-6): using free space tree
[  +0.02] BTRFS error (device dm-6): Remounting read-write after error is 
not allowed
[Feb25 20:49] BTRFS error (device dm-6): cleaner transaction attach returned -30
[  +0.238718] BTRFS warning (device dm-6): page private not zero on page 
1596642967552
[  +0.03] BTRFS warning (device dm-6): page private not zero on page 
1596642971648
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596642975744
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596642979840
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643672064
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643676160
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643680256
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643684352
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643704832
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643708928
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643713024
[  +0.02] BTRFS warning (device dm-6): page private not zero on page 
1596643717120
[  +0.28] BTRFS warning (device dm-6): page private not zero on page 
2363051098112
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 
2363051102208
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 
2363051106304
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 
2363051110400
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 
2368056344576
[  +0.00] BTRFS warning (device dm-6): page private not zero on page 
2368056348672
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 
2368056352768
[  +0.01] BTRFS warning (device dm-6): page private not zero on page 

Re: Need help with incremental backup strategy (snapshots, defragmentingt & performance)

2017-11-17 Thread Kai Krakow
On Fri, 17 Nov 2017 06:51:52 +0300, Andrei Borzenkov wrote:

> On 16.11.2017 19:13, Kai Krakow wrote:
> ...
> > > BTW: From user API perspective, btrfs snapshots do not guarantee  
> > perfect granular consistent backups.  
> 
> Is it documented somewhere? I was relying on crash-consistent
> write-order-preserving snapshots in NetApp for as long as I remember.
> And I was sure btrfs offers it, as it is something obvious for the
> redirect-on-write idea.

I think it has ordering guarantees, but it is not as atomic in time as
one might think. That's the point. But devs may tell better.


> > A user-level file transaction may
> > still end up only partially in the snapshot. If you are running
> > transaction sensitive applications, those usually do provide some
> > means of preparing a freeze and a thaw of transactions.
> >   
> 
> > Is snapshot creation synchronous, so you know when to thaw?

I think you could do "btrfs snap create", then "btrfs fs sync", and
everything should be fine.
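
Spelled out as commands it would be something like this (just a sketch,
the subvolume path here is invented):

  # take the read-only snapshot, then force the current transaction to disk
  btrfs subvolume snapshot -r /data /data/.pre-backup
  btrfs filesystem sync /data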


> > I think the user transactions API which could've been used for this
> > will even be removed during the next kernel cycles. I remember
> > reiserfs4 tried to deploy something similar. But there's no
> > consistent layer in the VFS for subscribing applications to
> > filesystem snapshots so they could prepare and notify the kernel
> > when they are ready. 
> 
> I do not see what VFS has to do with it. NetApp works by simply
> preserving previous consistency point instead of throwing it away.
> I.e. snapshot is always last committed image on stable storage. Would
> something like this be possible on the btrfs level by duplicating the current
> on-disk root (sorry if I use the wrong term)?

I think btrfs gives the same consistency. But the moment you issue
"btrfs snap create" may delay snapshot creation a little bit. So if
your application relies on exact point in time snapshots, you need to
ensure synchronizing your application to the filesystem. I think the
same is true for NetApp.

I just wanted to point that out because it may not be obvious, given
that btrfs snapshot creation is built right into the tool chain of
filesystem itself, unlike e.g. NetApp or LVM, or other storage layers.

Background: A good while back I was told that btrfs snapshots during
ongoing IO may result in some of the later IO being carried over to
before the snapshot. Transactional ordering of IO operations is still
guaranteed, but it may overlap with snapshot creation. So you can still
lose a transaction you didn't expect to lose at that point in time.

So I understood this as:

If you just want to ensure transactional integrity of your database,
you are all fine with btrfs snapshots.

But if you want to ensure that a just finished transaction makes it
into the snapshot completely, you have to sync the processes.

However, things may have changed since then.


-- 
Regards,
Kai

Replies to list-only preferred.




Re: Need help with incremental backup strategy (snapshots, defragmentingt & performance)

2017-11-16 Thread Andrei Borzenkov
On 16.11.2017 19:13, Kai Krakow wrote:
...
> > BTW: From user API perspective, btrfs snapshots do not guarantee
> perfect granular consistent backups.

Is it documented somewhere? I was relying on crash-consistent
write-order-preserving snapshots in NetApp for as long as I remember.
And I was sure btrfs offers it, as it is something obvious for the
redirect-on-write idea.

> A user-level file transaction may
> still end up only partially in the snapshot. If you are running
> transaction sensitive applications, those usually do provide some means
> of preparing a freeze and a thaw of transactions.
> 

Is snapshot creation synchronous, so you know when to thaw?

> I think the user transactions API which could've been used for this
> will even be removed during the next kernel cycles. I remember
> reiserfs4 tried to deploy something similar. But there's no consistent
> layer in the VFS for subscribing applications to filesystem snapshots
> so they could prepare and notify the kernel when they are ready.
> 

I do not see what VFS has to do with it. NetApp works by simply
preserving previous consistency point instead of throwing it away. I.e.
snapshot is always last committed image on stable storage. Would
something like this be possible on the btrfs level by duplicating the current
on-disk root (sorry if I use the wrong term)?

...


Re: Need help with incremental backup strategy (snapshots, defragmentingt & performance)

2017-11-16 Thread Kai Krakow
Link 2 slipped away, adding it below...

On Tue, 14 Nov 2017 15:51:57 -0500, Dave wrote:

> On Tue, Nov 14, 2017 at 3:50 AM, Roman Mamedov  wrote:
> >
> > On Mon, 13 Nov 2017 22:39:44 -0500
> > Dave  wrote:
> >  
> > > I have my live system on one block device and a backup snapshot
> > > of it on another block device. I am keeping them in sync with
> > > hourly rsync transfers.
> > >
> > > Here's how this system works in a little more detail:
> > >
> > > 1. I establish the baseline by sending a full snapshot to the
> > > backup block device using btrfs send-receive.
> > > 2. Next, on the backup device I immediately create a rw copy of
> > > that baseline snapshot.
> > > 3. I delete the source snapshot to keep the live filesystem free
> > > of all snapshots (so it can be optimally defragmented, etc.)
> > > 4. hourly, I take a snapshot of the live system, rsync all
> > > changes to the backup block device, and then delete the source
> > > snapshot. This hourly process takes less than a minute currently.
> > > (My test system has only moderate usage.)
> > > 5. hourly, following the above step, I use snapper to take a
> > > snapshot of the backup subvolume to create/preserve a history of
> > > changes. For example, I can find the version of a file 30 hours
> > > prior.  
> >
> > Sounds a bit complex, I still don't get why you need all these
> > snapshot creations and deletions, and even still using btrfs
> > send-receive.  
> 
> 
> Hopefully, my comments below will explain my reasons.
> 
> >
> > Here is my scheme:
> > 
> > /mnt/dst <- mounted backup storage volume
> > /mnt/dst/backup  <- a subvolume
> > /mnt/dst/backup/host1/ <- rsync destination for host1, regular
> > directory /mnt/dst/backup/host2/ <- rsync destination for host2,
> > regular directory /mnt/dst/backup/host3/ <- rsync destination for
> > host3, regular directory etc.
> >
> > /mnt/dst/backup/host1/bin/
> > /mnt/dst/backup/host1/etc/
> > /mnt/dst/backup/host1/home/
> > ...
> > Self explanatory. All regular directories, not subvolumes.
> >
> > Snapshots:
> > /mnt/dst/snaps/backup <- a regular directory
> > /mnt/dst/snaps/backup/2017-11-14T12:00/ <- snapshot 1
> > of /mnt/dst/backup /mnt/dst/snaps/backup/2017-11-14T13:00/ <-
> > snapshot 2
> > of /mnt/dst/backup /mnt/dst/snaps/backup/2017-11-14T14:00/ <-
> > snapshot 3 of /mnt/dst/backup
> >
> > Accessing historic data:
> > /mnt/dst/snaps/backup/2017-11-14T12:00/host1/bin/bash
> > ...
> > /bin/bash for host1 as of 2017-11-14 12:00 (time on the backup
> > system).
> > 
> >
> > No need for btrfs send-receive, only plain rsync is used, directly
> > from hostX:/ to /mnt/dst/backup/host1/;  
> 
> 
> I prefer to start with a BTRFS snapshot at the backup destination. I
> think that's the most "accurate" starting point.

No, you should finish with a snapshot. Use the rsync destination as a
"dirty" scratch area, let rsync also delete files which are no longer
in the source. After successfully running rsync, make a snapshot of
that directory and make it RO, leave the scratch in place (even when
rsync dies or becomes killed).

I once made some scripts[2] following those rules, you may want to adapt
them.


> > No need to create or delete snapshots during the actual backup
> > process;  
> 
> Then you can't guarantee consistency of the backed up information.

Take a temporary snapshot of the source, rsync it to the scratch
destination, take a RO snapshot of that destination, remove the
temporary snapshot.
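
As commands, that flow could look roughly like this (a sketch only; the
paths are invented and /mnt/dst/backup is assumed to be a subvolume):

  # temporary read-only snapshot of the source, so rsync sees a frozen view
  btrfs subvolume snapshot -r /home /home/.tmp-snap
  # sync into the scratch area, deleting what no longer exists in the source
  rsync -aHAX --inplace --delete /home/.tmp-snap/ /mnt/dst/backup/home/
  # only now take a read-only snapshot of the destination and keep it
  btrfs subvolume snapshot -r /mnt/dst/backup "/mnt/dst/snaps/backup/$(date +%FT%H:%M)"
  # drop the temporary source snapshot again
  btrfs subvolume delete /home/.tmp-snap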

BTW: From user API perspective, btrfs snapshots do not guarantee
perfect granular consistent backups. A user-level file transaction may
still end up only partially in the snapshot. If you are running
transaction sensitive applications, those usually do provide some means
of preparing a freeze and a thaw of transactions.

I think the user transactions API which could've been used for this
will even be removed during the next kernel cycles. I remember
reiserfs4 tried to deploy something similar. But there's no consistent
layer in the VFS for subscribing applications to filesystem snapshots
so they could prepare and notify the kernel when they are ready.


> > A single common timeline is kept for all hosts to be backed up,
> > snapshot count not multiplied by the number of hosts (in my case
> > the backup location is multi-purpose, so I somewhat care about
> > total number of snapshots there as well);
> >
> > Also, all of this works even with source hosts which do not use
> > Btrfs.  
> 
> That's not a concern for me because I prefer to use BTRFS everywhere.

At least I suggest looking into bees[1] to deduplicate the backup
destination. Rsync is not very efficient at working with btrfs snapshots.
It will often break reflinks and write inefficiently sized blocks, even
with the inplace option. Also, 

Re: Need help with incremental backup strategy (snapshots, defragmentingt & performance)

2017-11-16 Thread Kai Krakow
On Tue, 14 Nov 2017 15:51:57 -0500, Dave wrote:

> On Tue, Nov 14, 2017 at 3:50 AM, Roman Mamedov  wrote:
> >
> > On Mon, 13 Nov 2017 22:39:44 -0500
> > Dave  wrote:
> >  
> > > I have my live system on one block device and a backup snapshot
> > > of it on another block device. I am keeping them in sync with
> > > hourly rsync transfers.
> > >
> > > Here's how this system works in a little more detail:
> > >
> > > 1. I establish the baseline by sending a full snapshot to the
> > > backup block device using btrfs send-receive.
> > > 2. Next, on the backup device I immediately create a rw copy of
> > > that baseline snapshot.
> > > 3. I delete the source snapshot to keep the live filesystem free
> > > of all snapshots (so it can be optimally defragmented, etc.)
> > > 4. hourly, I take a snapshot of the live system, rsync all
> > > changes to the backup block device, and then delete the source
> > > snapshot. This hourly process takes less than a minute currently.
> > > (My test system has only moderate usage.)
> > > 5. hourly, following the above step, I use snapper to take a
> > > snapshot of the backup subvolume to create/preserve a history of
> > > changes. For example, I can find the version of a file 30 hours
> > > prior.  
> >
> > Sounds a bit complex, I still don't get why you need all these
> > snapshot creations and deletions, and even still using btrfs
> > send-receive.  
> 
> 
> Hopefully, my comments below will explain my reasons.
> 
> >
> > Here is my scheme:
> > 
> > /mnt/dst <- mounted backup storage volume
> > /mnt/dst/backup  <- a subvolume
> > /mnt/dst/backup/host1/ <- rsync destination for host1, regular
> > directory /mnt/dst/backup/host2/ <- rsync destination for host2,
> > regular directory /mnt/dst/backup/host3/ <- rsync destination for
> > host3, regular directory etc.
> >
> > /mnt/dst/backup/host1/bin/
> > /mnt/dst/backup/host1/etc/
> > /mnt/dst/backup/host1/home/
> > ...
> > Self explanatory. All regular directories, not subvolumes.
> >
> > Snapshots:
> > /mnt/dst/snaps/backup <- a regular directory
> > /mnt/dst/snaps/backup/2017-11-14T12:00/ <- snapshot 1
> > of /mnt/dst/backup /mnt/dst/snaps/backup/2017-11-14T13:00/ <-
> > snapshot 2
> > of /mnt/dst/backup /mnt/dst/snaps/backup/2017-11-14T14:00/ <-
> > snapshot 3 of /mnt/dst/backup
> >
> > Accessing historic data:
> > /mnt/dst/snaps/backup/2017-11-14T12:00/host1/bin/bash
> > ...
> > /bin/bash for host1 as of 2017-11-14 12:00 (time on the backup
> > system).
> > 
> >
> > No need for btrfs send-receive, only plain rsync is used, directly
> > from hostX:/ to /mnt/dst/backup/host1/;  
> 
> 
> I prefer to start with a BTRFS snapshot at the backup destination. I
> think that's the most "accurate" starting point.

No, you should finish with a snapshot. Use the rsync destination as a
"dirty" scratch area, let rsync also delete files which are no longer
in the source. After successfully running rsync, make a snapshot of
that directory and make it RO, leave the scratch in place (even when
rsync dies or becomes killed).

I once made some scripts[2] following those rules, you may want to adapt
them.


> > No need to create or delete snapshots during the actual backup
> > process;  
> 
> Then you can't guarantee consistency of the backed up information.

Take a temporary snapshot of the source, rsync it to the scratch
destination, take a RO snapshot of that destination, remove the
temporary snapshot.

BTW: From user API perspective, btrfs snapshots do not guarantee
perfect granular consistent backups. A user-level file transaction may
still end up only partially in the snapshot. If you are running
transaction sensitive applications, those usually do provide some means
of preparing a freeze and a thaw of transactions.

I think the user transactions API which could've been used for this
will even be removed during the next kernel cycles. I remember
reiserfs4 tried to deploy something similar. But there's no consistent
layer in the VFS for subscribing applications to filesystem snapshots
so they could prepare and notify the kernel when they are ready.


> > A single common timeline is kept for all hosts to be backed up,
> > snapshot count not multiplied by the number of hosts (in my case
> > the backup location is multi-purpose, so I somewhat care about
> > total number of snapshots there as well);
> >
> > Also, all of this works even with source hosts which do not use
> > Btrfs.  
> 
> That's not a concern for me because I prefer to use BTRFS everywhere.

At least I suggest looking into bees[1] to deduplicate the backup
destination. Rsync is not very efficient at working with btrfs snapshots.
It will often break reflinks and write inefficiently sized blocks, even
with the inplace option. Also, rsync won't efficiently catch files 

Re: Need help with incremental backup strategy (snapshots, defragmentingt & performance)

2017-11-14 Thread Dave
On Tue, Nov 14, 2017 at 3:50 AM, Roman Mamedov  wrote:
>
> On Mon, 13 Nov 2017 22:39:44 -0500
> Dave  wrote:
>
> > I have my live system on one block device and a backup snapshot of it
> > on another block device. I am keeping them in sync with hourly rsync
> > transfers.
> >
> > Here's how this system works in a little more detail:
> >
> > 1. I establish the baseline by sending a full snapshot to the backup
> > block device using btrfs send-receive.
> > 2. Next, on the backup device I immediately create a rw copy of that
> > baseline snapshot.
> > 3. I delete the source snapshot to keep the live filesystem free of
> > all snapshots (so it can be optimally defragmented, etc.)
> > 4. hourly, I take a snapshot of the live system, rsync all changes to
> > the backup block device, and then delete the source snapshot. This
> > hourly process takes less than a minute currently. (My test system has
> > only moderate usage.)
> > 5. hourly, following the above step, I use snapper to take a snapshot
> > of the backup subvolume to create/preserve a history of changes. For
> > example, I can find the version of a file 30 hours prior.
>
> Sounds a bit complex, I still don't get why you need all these snapshot
> creations and deletions, and even still using btrfs send-receive.


Hopefully, my comments below will explain my reasons.

>
> Here is my scheme:
> 
> /mnt/dst <- mounted backup storage volume
> /mnt/dst/backup  <- a subvolume
> /mnt/dst/backup/host1/ <- rsync destination for host1, regular directory
> /mnt/dst/backup/host2/ <- rsync destination for host2, regular directory
> /mnt/dst/backup/host3/ <- rsync destination for host3, regular directory
> etc.
>
> /mnt/dst/backup/host1/bin/
> /mnt/dst/backup/host1/etc/
> /mnt/dst/backup/host1/home/
> ...
> Self explanatory. All regular directories, not subvolumes.
>
> Snapshots:
> /mnt/dst/snaps/backup <- a regular directory
> /mnt/dst/snaps/backup/2017-11-14T12:00/ <- snapshot 1 of /mnt/dst/backup
> /mnt/dst/snaps/backup/2017-11-14T13:00/ <- snapshot 2 of /mnt/dst/backup
> /mnt/dst/snaps/backup/2017-11-14T14:00/ <- snapshot 3 of /mnt/dst/backup
>
> Accessing historic data:
> /mnt/dst/snaps/backup/2017-11-14T12:00/host1/bin/bash
> ...
> /bin/bash for host1 as of 2017-11-14 12:00 (time on the backup system).
> 
>
> No need for btrfs send-receive, only plain rsync is used, directly from
> hostX:/ to /mnt/dst/backup/host1/;


I prefer to start with a BTRFS snapshot at the backup destination. I
think that's the most "accurate" starting point.

>
> No need to create or delete snapshots during the actual backup process;


Then you can't guarantee consistency of the backed up information.

>
> A single common timeline is kept for all hosts to be backed up, snapshot count
> not multiplied by the number of hosts (in my case the backup location is
> multi-purpose, so I somewhat care about total number of snapshots there as
> well);
>
> Also, all of this works even with source hosts which do not use Btrfs.


That's not a concern for me because I prefer to use BTRFS everywhere.


Re: Need help with incremental backup strategy (snapshots, defragmentingt & performance)

2017-11-14 Thread Roman Mamedov
On Mon, 13 Nov 2017 22:39:44 -0500
Dave  wrote:

> I have my live system on one block device and a backup snapshot of it
> on another block device. I am keeping them in sync with hourly rsync
> transfers.
> 
> Here's how this system works in a little more detail:
> 
> 1. I establish the baseline by sending a full snapshot to the backup
> block device using btrfs send-receive.
> 2. Next, on the backup device I immediately create a rw copy of that
> baseline snapshot.
> 3. I delete the source snapshot to keep the live filesystem free of
> all snapshots (so it can be optimally defragmented, etc.)
> 4. hourly, I take a snapshot of the live system, rsync all changes to
> the backup block device, and then delete the source snapshot. This
> hourly process takes less than a minute currently. (My test system has
> only moderate usage.)
> 5. hourly, following the above step, I use snapper to take a snapshot
> of the backup subvolume to create/preserve a history of changes. For
> example, I can find the version of a file 30 hours prior.

Sounds a bit complex, I still don't get why you need all these snapshot
creations and deletions, and even still using btrfs send-receive.

Here is my scheme:

/mnt/dst <- mounted backup storage volume
/mnt/dst/backup  <- a subvolume 
/mnt/dst/backup/host1/ <- rsync destination for host1, regular directory
/mnt/dst/backup/host2/ <- rsync destination for host2, regular directory
/mnt/dst/backup/host3/ <- rsync destination for host3, regular directory
etc.

/mnt/dst/backup/host1/bin/
/mnt/dst/backup/host1/etc/
/mnt/dst/backup/host1/home/
...
Self explanatory. All regular directories, not subvolumes.

Snapshots:
/mnt/dst/snaps/backup <- a regular directory
/mnt/dst/snaps/backup/2017-11-14T12:00/ <- snapshot 1 of /mnt/dst/backup
/mnt/dst/snaps/backup/2017-11-14T13:00/ <- snapshot 2 of /mnt/dst/backup
/mnt/dst/snaps/backup/2017-11-14T14:00/ <- snapshot 3 of /mnt/dst/backup

Accessing historic data:
/mnt/dst/snaps/backup/2017-11-14T12:00/host1/bin/bash
...
/bin/bash for host1 as of 2017-11-14 12:00 (time on the backup system).


No need for btrfs send-receive, only plain rsync is used, directly from
hostX:/ to /mnt/dst/backup/host1/;

No need to create or delete snapshots during the actual backup process;

A single common timeline is kept for all hosts to be backed up, snapshot count
not multiplied by the number of hosts (in my case the backup location is
multi-purpose, so I somewhat care about total number of snapshots there as
well);

Also, all of this works even with source hosts which do not use Btrfs.
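
An hourly job for this scheme could look roughly like the following
(sketch only; the host names, excludes and snapshot naming are just
examples):

  #!/bin/sh
  for host in host1 host2 host3; do
      rsync -aHAX --numeric-ids --inplace --delete \
          --exclude=/proc --exclude=/sys --exclude=/dev --exclude=/run \
          "root@$host:/" "/mnt/dst/backup/$host/"
  done
  # one read-only snapshot of the whole backup subvolume per run
  btrfs subvolume snapshot -r /mnt/dst/backup "/mnt/dst/snaps/backup/$(date +%FT%H:%M)"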

-- 
With respect,
Roman


Re: Need help with incremental backup strategy (snapshots, defragmentingt & performance)

2017-11-14 Thread Roman Mamedov
On Tue, 14 Nov 2017 10:14:55 +0300
Marat Khalili  wrote:

> Don't keep snapshots under rsync target, place them under ../snapshots 
> (if snapper supports this):

> Or, specify them in --exclude and avoid using --delete-excluded.

Both are good suggestions; in my case each system does have its own snapshots
as well, but they are retained for a much shorter time. So I both use --exclude to
avoid fetching the entire /snaps tree from the source system, and store
snapshots of the destination system outside of the rsync target dirs.

>Or keep using -x if it works, why not?

-x will exclude content of all subvolumes down the tree on the source side --
not only the time-based ones. If you take care never to casually create any
subvolumes whose content you'd still want backed up, then I guess it can
work.
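
For example (sketch; the source and destination paths are made up):

  # -x: stay on the source filesystem, so any nested subvolume
  # (snapshots included) is skipped automatically
  rsync -axAHX --delete /source/ /mnt/dst/backup/host1/
  # or cross subvolumes but explicitly leave only the snapshot tree out
  rsync -aAHX --delete --exclude=/snaps/ /source/ /mnt/dst/backup/host1/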

-- 
With respect,
Roman


Re: Need help with incremental backup strategy (snapshots, defragmentingt & performance)

2017-11-13 Thread Marat Khalili

On 14/11/17 06:39, Dave wrote:

My rsync command currently looks like this:

rsync -axAHv --inplace --delete-delay --exclude-from="/some/file"
"$source_snapshop/" "$backup_location"
As I learned from Kai Krakow on this mailing list, you should also add 
--no-whole-file if both sides are local. Otherwise target space usage 
can be much worse (but fragmentation much better).


I wonder what your justification for --delete-delay is; I just use --delete.

Here's what I use: --verbose --archive --hard-links --acls --xattrs 
--numeric-ids --inplace --delete --delete-excluded --stats. Since in my 
case source is always remote, there's no --no-whole-file, but there's 
--numeric-ids.
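
Put together (the destination path is just an example):

  rsync --verbose --archive --hard-links --acls --xattrs --numeric-ids \
        --inplace --delete --delete-excluded --stats \
        remotehost:/ /mnt/backup/remotehost/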



In particular, I want to know if I should or should not be using these options:

 -H, --hard-linkspreserve hard links
 -A, --acls  preserve ACLs (implies -p)
 -X, --xattrspreserve extended attributes
 -x, --one-file-system   don't cross filesystem boundaries
I don't know of any semantic use of hard links in modern systems. There are 
ACLs on some files in /var/log/journal on systems with systemd. Synology 
actively uses ACLs, but its implementation is sadly incompatible with 
rsync. There can always be some ACLs or xattrs set by the sysadmin manually. 
End result: I always specify the first three options where possible, just in 
case (even though the man page says that --hard-links may affect performance).



I had to use the "x" option to prevent rsync from deleting files in
snapshots in the backup location (as the source location does not
retain any snapshots). Is there a better way?
Don't keep snapshots under the rsync target; place them under ../snapshots 
(if snapper supports this):



# find . -maxdepth 2
.
./snapshots
./snapshots/2017-11-08T13:18:20+00:00
./snapshots/2017-11-08T15:10:03+00:00
./snapshots/2017-11-08T23:28:44+00:00
./snapshots/2017-11-09T23:41:30+00:00
./snapshots/2017-11-10T22:44:36+00:00
./snapshots/2017-11-11T21:48:19+00:00
./snapshots/2017-11-12T21:27:41+00:00
./snapshots/2017-11-13T23:29:49+00:00
./rsync
Or, specify them in --exclude and avoid using --delete-excluded. Or keep 
using -x if it works, why not?


--

With Best Regards,
Marat Khalili


Re: Need help with incremental backup strategy (snapshots, defragmentingt & performance)

2017-11-13 Thread Dave
On Wed, Nov 1, 2017 at 1:15 AM, Roman Mamedov  wrote:
> On Wed, 1 Nov 2017 01:00:08 -0400
> Dave  wrote:
>
>> To reconcile those conflicting goals, the only idea I have come up
>> with so far is to use btrfs send-receive to perform incremental
>> backups as described here:
>> https://btrfs.wiki.kernel.org/index.php/Incremental_Backup .
>
> Another option is to just use the regular rsync to a designated destination
> subvolume on the backup host, AND snapshot that subvolume on that host from
> time to time (or on backup completions, if you can synchronize that).
>
> rsync --inplace will keep space usage low as it will not reupload entire files
> in case of changes/additions to them.
>
> Yes rsync has to traverse both directory trees to find changes, but that's
> pretty fast (couple of minutes at most, for a typical root filesystem),
> especially if you use SSD or SSD caching.

Hello. I am implementing this suggestion. So far, so good. However, I
need some further recommendations on rsync options to use for this
purpose.

My rsync command currently looks like this:

rsync -axAHv --inplace --delete-delay --exclude-from="/some/file"
"$source_snapshop/" "$backup_location"

In particular, I want to know if I should or should not be using these options:

-H, --hard-links        preserve hard links
-A, --acls              preserve ACLs (implies -p)
-X, --xattrs            preserve extended attributes
-x, --one-file-system   don't cross filesystem boundaries

I had to use the "x" option to prevent rsync from deleting files in
snapshots in the backup location (as the source location does not
retain any snapshots). Is there a better way?

I have my live system on one block device and a backup snapshot of it
on another block device. I am keeping them in sync with hourly rsync
transfers.

Here's how this system works in a little more detail:

1. I establish the baseline by sending a full snapshot to the backup
block device using btrfs send-receive.
2. Next, on the backup device I immediately create a rw copy of that
baseline snapshot.
3. I delete the source snapshot to keep the live filesystem free of
all snapshots (so it can be optimally defragmented, etc.)
4. hourly, I take a snapshot of the live system, rsync all changes to
the backup block device, and then delete the source snapshot. This
hourly process takes less than a minute currently. (My test system has
only moderate usage.)
5. hourly, following the above step, I use snapper to take a snapshot
of the backup subvolume to create/preserve a history of changes. For
example, I can find the version of a file 30 hours prior.

The backup volume contains up to 100 snapshots while the live volume
has no snapshots. Best of both worlds? I guess I'll find out over
time.
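
For reference, a condensed sketch of one such hourly cycle; the subvolume
paths and the snapper config name below are made-up placeholders, not the
exact setup described above:

    # 1. take a throwaway read-only snapshot of the live subvolume
    btrfs subvolume snapshot -r /home /home/.rsync-src

    # 2. push the changes to the plain backup subvolume on the other device
    rsync -aAXH --inplace --delete /home/.rsync-src/ /mnt/backup/home/

    # 3. drop the temporary snapshot so the live volume stays snapshot-free
    btrfs subvolume delete /home/.rsync-src

    # 4. record the new backup state with snapper on the backup volume
    snapper -c backup-home create --description "hourly rsync"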


Re: Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-11-03 Thread Kai Krakow
Am Thu, 2 Nov 2017 23:24:29 -0400
schrieb Dave :

> On Thu, Nov 2, 2017 at 4:46 PM, Kai Krakow 
> wrote:
> > Am Wed, 1 Nov 2017 02:51:58 -0400
> > schrieb Dave :
> >  
>  [...]  
>  [...]  
>  [...]  
> >>
> >> Thanks for confirming. I must have missed those reports. I had
> >> never considered this idea until now -- but I like it.
> >>
> >> Are there any blogs or wikis where people have done something
> >> similar to what we are discussing here?  
> >
> > I used rsync before, backup source and destination both were btrfs.
> > I was experiencing the same btrfs bug from time to time on both
> > devices, luckily not at the same time.
> >
> > I instead switched to using borgbackup, and xfs as the destination
> > (to not fall the same-bug-in-two-devices pitfall).  
> 
> I'm going to stick with btrfs everywhere. My reasoning is that my
> biggest pitfalls will be related to lack of knowledge. So focusing on
> learning one filesystem better (vs poorly learning two) is the better
> strategy for me, given my limited time. (I'm not an IT professional of
> any sort.)
> 
> Is there any problem with the Borgbackup repository being on btrfs?

No. I just wanted to point out that keeping backup and source on
different media (which includes different technology, too) is common
best practice and adheres to the 3-2-1 backup strategy.


> > Borgbackup achieves a
> > much higher deduplication density and compression, and as such also
> > is able to store much more backup history in the same storage
> > space. The first run is much slower than rsync (due to enabled
> > compression) but successive runs are much faster (like 20 minutes
> > per backup run instead of 4-5 hours).
> >
> > I'm currently storing 107 TB of backup history in just 2.2 TB backup
> > space, which counts a little more than one year of history now,
> > containing 56 snapshots. This is my retention policy:
> >
> >   * 5 yearly snapshots
> >   * 12 monthly snapshots
> >   * 14 weekly snapshots (worth around 3 months)
> >   * 30 daily snapshots
> >
> > Restore is fast enough, and a snapshot can even be fuse-mounted
> > (tho, in that case mounted access can be very slow navigating
> > directories).
> >
> > With latest borgbackup version, the backup time increased to around
> > 1 hour from 15-20 minutes in the previous version. That is due to
> > switching the file cache strategy from mtime to ctime. This can be
> > tuned to get back to old performance, but it may miss some files
> > during backup if you're doing awkward things to file timestamps.
> >
> > I'm also backing up some servers with it now, then use rsync to sync
> > the borg repository to an offsite location.
> >
> > Combined with same-fs local btrfs snapshots with short retention
> > times, this could be a viable solution for you.  
> 
> Yes, I appreciate the idea. I'm going to evaluate both rsync and
> Borgbackup.
> 
> The advantage of rsync, I think, is that it will likely run in just a
> couple minutes. That will allow me to run it hourly and to keep my
> live volume almost entire free of snapshots and fully defragmented.
> It's also very simple as I already have rsync. And since I'm going to
> run btrfs on the backup volume, I can perform hourly snapshots there
> and use Snapper to manage retention. It's all very simple and relies
> on tools I already have and know.
> 
> However, the advantages of Borgbackup you mentioned (much higher
> deduplication density and compression) make it worth considering.
> Maybe Borgbackup won't take long to complete successive (incremental)
> backups on my system.

Once a full backup was taken, incremental backups are extremely fast.
At least for me, it works much faster than rsync. And as with btrfs
snapshots, each incremental backup is also a full backup. It's not like
traditional backup software that needs the backup parent and
grandparent to make use of the differential and/or incremental backups.

There's one caveat, tho: Only one process can access a repository at a
time, that is you need to serialize different backup jobs if you want
them to go into the same repository. Deduplication is done only within
the same repository. Tho, you might be able to leverage btrfs
deduplication (e.g. using bees) across multiple repositories if you're
not using encrypted repositories.

But since you're currently using send/receive and/or rsync, encrypted
storage of the backup doesn't seem to be an important point to you.

Burp, with its client/server approach, may have an advantage here, though its
setup seems to be more complicated. Borg is really easy to use. I never
tried burp, tho.


> I'll have to try it to see. It's a very nice
> looking project. I'm surprised I never heard of it before.

It seems to follow similar principles as burp (which I never heard of
previously). It seems like the really good backup software has some
sort of PR problem... ;-)


-- 
Regards,
Kai

Replies to list-only preferred.

--

Re: Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-11-02 Thread Dave
On Thu, Nov 2, 2017 at 4:46 PM, Kai Krakow  wrote:
> Am Wed, 1 Nov 2017 02:51:58 -0400
> schrieb Dave :
>
>> >
>> >> To reconcile those conflicting goals, the only idea I have come up
>> >> with so far is to use btrfs send-receive to perform incremental
>> >> backups
>> >
>> > As already said by Romain Mamedov, rsync is viable alternative to
>> > send-receive with much less hassle. According to some reports it
>> > can even be faster.
>>
>> Thanks for confirming. I must have missed those reports. I had never
>> considered this idea until now -- but I like it.
>>
>> Are there any blogs or wikis where people have done something similar
>> to what we are discussing here?
>
> I used rsync before, backup source and destination both were btrfs. I
> was experiencing the same btrfs bug from time to time on both devices,
> luckily not at the same time.
>
> I instead switched to using borgbackup, and xfs as the destination (to
> not fall the same-bug-in-two-devices pitfall).

I'm going to stick with btrfs everywhere. My reasoning is that my
biggest pitfalls will be related to lack of knowledge. So focusing on
learning one filesystem better (vs poorly learning two) is the better
strategy for me, given my limited time. (I'm not an IT professional of
any sort.)

Is there any problem with the Borgbackup repository being on btrfs?

> Borgbackup achieves a
> much higher deduplication density and compression, and as such also is
> able to store much more backup history in the same storage space. The
> first run is much slower than rsync (due to enabled compression) but
> successive runs are much faster (like 20 minutes per backup run instead
> of 4-5 hours).
>
> I'm currently storing 107 TB of backup history in just 2.2 TB backup
> space, which counts a little more than one year of history now,
> containing 56 snapshots. This is my retention policy:
>
>   * 5 yearly snapshots
>   * 12 monthly snapshots
>   * 14 weekly snapshots (worth around 3 months)
>   * 30 daily snapshots
>
> Restore is fast enough, and a snapshot can even be fuse-mounted (tho,
> in that case mounted access can be very slow navigating directories).
>
> With latest borgbackup version, the backup time increased to around 1
> hour from 15-20 minutes in the previous version. That is due to
> switching the file cache strategy from mtime to ctime. This can be
> tuned to get back to old performance, but it may miss some files during
> backup if you're doing awkward things to file timestamps.
>
> I'm also backing up some servers with it now, then use rsync to sync
> the borg repository to an offsite location.
>
> Combined with same-fs local btrfs snapshots with short retention times,
> this could be a viable solution for you.

Yes, I appreciate the idea. I'm going to evaluate both rsync and Borgbackup.

The advantage of rsync, I think, is that it will likely run in just a
couple minutes. That will allow me to run it hourly and to keep my
live volume almost entire free of snapshots and fully defragmented.
It's also very simple as I already have rsync. And since I'm going to
run btrfs on the backup volume, I can perform hourly snapshots there
and use Snapper to manage retention. It's all very simple and relies
on tools I already have and know.

However, the advantages of Borgbackup you mentioned (much higher
deduplication density and compression) make it worth considering.
Maybe Borgbackup won't take long to complete successive (incremental)
backups on my system. I'll have to try it to see. It's a very nice
looking project. I'm surprised I never heard of it before.


Re: Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-11-02 Thread Kai Krakow
Am Wed, 1 Nov 2017 02:51:58 -0400
schrieb Dave :

> >  
> >> To reconcile those conflicting goals, the only idea I have come up
> >> with so far is to use btrfs send-receive to perform incremental
> >> backups  
> >
> > As already said by Romain Mamedov, rsync is viable alternative to
> > send-receive with much less hassle. According to some reports it
> > can even be faster.  
> 
> Thanks for confirming. I must have missed those reports. I had never
> considered this idea until now -- but I like it.
> 
> Are there any blogs or wikis where people have done something similar
> to what we are discussing here?

I used rsync before, backup source and destination both were btrfs. I
was experiencing the same btrfs bug from time to time on both devices,
luckily not at the same time.

I instead switched to using borgbackup, and xfs as the destination (to
not fall the same-bug-in-two-devices pitfall). Borgbackup achieves a
much higher deduplication density and compression, and as such also is
able to store much more backup history in the same storage space. The
first run is much slower than rsync (due to enabled compression) but
successive runs are much faster (like 20 minutes per backup run instead
of 4-5 hours).

I'm currently storing 107 TB of backup history in just 2.2 TB backup
space, which counts a little more than one year of history now,
containing 56 snapshots. This is my retention policy:

  * 5 yearly snapshots
  * 12 monthly snapshots
  * 14 weekly snapshots (worth around 3 months)
  * 30 daily snapshots

Restore is fast enough, and a snapshot can even be fuse-mounted (tho,
in that case mounted access can be very slow navigating directories).

With latest borgbackup version, the backup time increased to around 1
hour from 15-20 minutes in the previous version. That is due to
switching the file cache strategy from mtime to ctime. This can be
tuned to get back to old performance, but it may miss some files during
backup if you're doing awkward things to file timestamps.

I'm also backing up some servers with it now, then use rsync to sync
the borg repository to an offsite location.

Combined with same-fs local btrfs snapshots with short retention times,
this could be a viable solution for you.
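
(For anyone wanting to try the same approach, a rough borg sketch in the
spirit of that retention policy; the repository path, source directories and
compression choice are assumptions, not the actual setup described above,
and zstd compression needs a reasonably recent borg:)

    borg init --encryption=repokey /mnt/backup/borg-repo
    borg create --compression zstd \
        /mnt/backup/borg-repo::'{hostname}-{now:%Y-%m-%dT%H:%M}' /home /etc
    borg prune --keep-daily 30 --keep-weekly 14 \
        --keep-monthly 12 --keep-yearly 5 /mnt/backup/borg-repo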


-- 
Regards,
Kai

Replies to list-only preferred.



Re: Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-11-01 Thread Peter Grandi
[ ... ]

> The poor performance has existed from the beginning of using
> BTRFS + KDE + Firefox (almost 2 years ago), at a point when
> very few snapshots had yet been created. A comparison system
> running similar hardware as well as KDE + Firefox (and LVM +
> EXT4) did not have the performance problems. The difference
> has been consistent and significant.

That seems rather unlikely to depend on Btrfs, as I use Firefox
56 + KDE4 + Btrfs without issue, on a somewhat old/small desktop
and laptop, and it seems implausible on general grounds. You haven't
provided so far any indication or quantification of your "speed"
problem (which may or may not be a "performance" issue).

The things to look at are usually disk IO latency and rates, and
system CPU time while the bad speed is observable (user CPU time
is usually stuck at 100% on any JS-based site, as written earlier).
To look at IO latency and rates the #1 choice is always 'iostat
-dk -zyx 1', and to look at system CPU (and user CPU) and other
interesting details I suggest using 'htop' with the attached
configuration file, written to "$HOME/.config/htop/htoprc".

> Sometimes I have used Snapper settings like this:

> TIMELINE_MIN_AGE="1800"
> TIMELINE_LIMIT_HOURLY="36"
> TIMELINE_LIMIT_DAILY="30"
> TIMELINE_LIMIT_MONTHLY="12"
> TIMELINE_LIMIT_YEARLY="10"

> However, I also have some computers set like this:

> TIMELINE_MIN_AGE="1800"
> TIMELINE_LIMIT_HOURLY="10"
> TIMELINE_LIMIT_DAILY="10"
> TIMELINE_LIMIT_WEEKLY="0"
> TIMELINE_LIMIT_MONTHLY="0"
> TIMELINE_LIMIT_YEARLY="0"

The first seems a bit "aspirational". IIRC "someone" confessed
that the SUSE default of 'TIMELINE_LIMIT_YEARLY="10"' was imposed
by external forces in the SUSE default configuration:
https://github.com/openSUSE/snapper/blob/master/data/default-config

https://wiki.archlinux.org/index.php/Snapper#Set_snapshot_limits
https://lists.opensuse.org/yast-devel/2014-05/msg00036.html

# Beware! This file is rewritten by htop when settings are changed in the interface.
# The parser is also very primitive, and not human-friendly.
fields=0 48 38 39 40 44 62 63 2 46 13 14 1 
sort_key=47
sort_direction=1
hide_threads=1
hide_kernel_threads=1
hide_userland_threads=1
shadow_other_users=0
show_thread_names=1
highlight_base_name=1
highlight_megabytes=1
highlight_threads=1
tree_view=0
header_margin=0
detailed_cpu_time=1
cpu_count_from_zero=1
update_process_names=0
color_scheme=0
delay=15
left_meters=AllCPUs Memory Swap 
left_meter_modes=1 1 1 
right_meters=Tasks LoadAverage Uptime 
right_meter_modes=2 2 2 


Re: Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-11-01 Thread Dave
On Wed, Nov 1, 2017 at 4:34 AM, Marat Khalili  wrote:

>> We do experience severe performance problems now, especially with
>> Firefox. Part of my experiment is to reduce the number of snapshots on
>> the live volumes, hence this question.
>
> Just for statistics, how many snapshots do you have and how often do you
> take them? It's on SSD, right?

I don't think the severe performance problems stem solely from the
number of snapshots. I think it is also related to Firefox stuff
(cache fragmentation, lack of multi-process mode maybe, etc.). I
still have to investigate the Firefox issues, but I'm starting at the
foundation by trying to get a basic BTRFS setup that will support
better desktop application performance first.

The poor performance has existed from the beginning of using BTRFS +
KDE + Firefox (almost 2 years ago), at a point when very few snapshots
had yet been created. A comparison system running similar hardware as
well as KDE + Firefox (and LVM + EXT4) did not have the performance
problems. The difference has been consistent and significant. For a
while I thought the difference was due to the hardware, as one system
used the z170 chipset and the other used the X99 chipset (but were
otherwise equivalent). So I repeated the testing on identical hardware
and the stark performance difference remained. When I realized that, I
began focusing on BTRFS, as it is the only consistent difference I can
recognize.

Sometimes I have used Snapper settings like this:

TIMELINE_MIN_AGE="1800"
TIMELINE_LIMIT_HOURLY="36"
TIMELINE_LIMIT_DAILY="30"
TIMELINE_LIMIT_MONTHLY="12"
TIMELINE_LIMIT_YEARLY="10"

However, I also have some computers set like this:

TIMELINE_MIN_AGE="1800"
TIMELINE_LIMIT_HOURLY="10"
TIMELINE_LIMIT_DAILY="10"
TIMELINE_LIMIT_WEEKLY="0"
TIMELINE_LIMIT_MONTHLY="0"
TIMELINE_LIMIT_YEARLY="0"
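
(These keys normally live in /etc/snapper/configs/<config-name> and can also
be changed from the command line; a sketch, assuming a config named "root":)

    snapper -c root set-config "TIMELINE_LIMIT_HOURLY=10" "TIMELINE_LIMIT_DAILY=10"
    snapper -c root set-config "TIMELINE_LIMIT_WEEKLY=0" "TIMELINE_LIMIT_MONTHLY=0" \
        "TIMELINE_LIMIT_YEARLY=0"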

> BTW beware of deleting too many snapshots at once with any tool. Delete few
> and let filesystem stabilize before proceeding.

OK, thanks for the tip.

> For deduplication tool to be useful you ought to have some duplicate data on
> your live volume. Do you have any (e.g. many LXC containers with the same
> distribution)?

No, no containers and no duplication to that large extent.

> P.S. I still think you need some off-system backup solution too, either
> rsync+snapshot-based over ssh or e.g. Burp (shameless advertising:
> http://burp.grke.org/ ).

I agree, but that's beyond the scope of the current problem I'm trying
to solve.  However, I'll check out Burp once I have a base
configuration that is working satisfactorily.


Re: Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-11-01 Thread Marat Khalili

On 01/11/17 09:51, Dave wrote:

As already said by Romain Mamedov, rsync is viable alternative to
send-receive with much less hassle. According to some reports it can even be
faster.

Thanks for confirming. I must have missed those reports. I had never
considered this idea until now -- but I like it.

Are there any blogs or wikis where people have done something similar
to what we are discussing here?

I don't know any. Probably someone needs to write it.


We will delete most snapshots on the live volume, but retain many (or
all) snapshots on the backup block device. Is that a good strategy,
given my goals?

Depending on the way you use it, retaining even a dozen snapshots on a live
volume might hurt performance (for high-performance databases) or be
completely transparent (for user folders). You may want to experiment with
this number.

We do experience severe performance problems now, especially with
Firefox. Part of my experiment is to reduce the number of snapshots on
the live volumes, hence this question.
Just for statistics, how many snapshots do you have and how often do you 
take them? It's on SSD, right?



Thanks. I hope you do find time to publish it. (And what do you mean
by portable?) For now, Snapper has a cleanup algorithm that we can
use. At least one of the tools listed here has a thinout algorithm
too: https://btrfs.wiki.kernel.org/index.php/Incremental_Backup
It is currently a small part of yet another home-grown backup tool which 
is itself fairly big and tuned to a particular environment. I thought many 
times that it would be very nice to have the thinning tool separately and 
with no unnecessary dependencies, but...


BTW beware of deleting too many snapshots at once with any tool. Delete 
few and let filesystem stabilize before proceeding.
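
One way to follow that advice from a shell is to delete one snapshot at a
time and wait for the cleaner to catch up before the next one; the path
pattern below is only an example:

    for snap in /mnt/backup/snapshots/2017-09-*; do
        btrfs subvolume delete "$snap"
        btrfs subvolume sync /mnt/backup   # block until deleted subvolumes are cleaned up
    done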



Should I consider a dedup tool like one of these?

Certainly NOT for snapshot-based backups: it is already deduplicated almost
as much as possible, dedup tools can only make it *less* deduplicated.

The question is whether to use a dedup tool on the live volume which
has a few snapshots. Even with the new strategy (based on rsync), the
live volume may sometimes have two snapshots (pre- and post- pacman
upgrades).
For a deduplication tool to be useful you ought to have some duplicate 
data on your live volume. Do you have any (e.g. many LXC containers with 
the same distribution)?



Also still wondering about these options: no-holes, skinny metadata,
or extended inode refs?

I don't know anything about any of these, sorry.

P.S. I still think you need some off-system backup solution too, either 
rsync+snapshot-based over ssh or e.g. Burp (shameless advertising: 
http://burp.grke.org/ ).


--

With Best Regards,
Marat Khalili


Re: Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-11-01 Thread Dave
On Wed, Nov 1, 2017 at 2:19 AM, Marat Khalili  wrote:

> You seem to have two tasks: (1) same-volume snapshots (I would not call them
> backups) and (2) updating some backup volume (preferably on a different
> box). By solving them separately you can avoid some complexity...

Yes, it appears that is a very good strategy -- solve the concerns
separately. Make the live volume performant and the backup volume
historical.

>
>> To reconcile those conflicting goals, the only idea I have come up
>> with so far is to use btrfs send-receive to perform incremental
>> backups
>
> As already said by Romain Mamedov, rsync is viable alternative to
> send-receive with much less hassle. According to some reports it can even be
> faster.

Thanks for confirming. I must have missed those reports. I had never
considered this idea until now -- but I like it.

Are there any blogs or wikis where people have done something similar
to what we are discussing here?

>
>> Given the hourly snapshots, incremental backups are the only practical
>> option. They take mere moments. Full backups could take an hour or
>> more, which won't work with hourly backups.
>
> I don't see much sense in re-doing full backups to the same physical device.
> If you care about backup integrity, it is probably more important to invest
> in backups verification. (OTOH, while you didn't reveal data size, if full
> backup takes just an hour on your system then why not?)

I was saying that a full backup could take an hour or more. That means
full backups are not compatible with an hourly backup schedule. And it
is certainly not a potential solution to making the system perform
better because the system will be spending all its time running
backups -- it would be never ending. With hourly backups, they should
complete in just a few moments, which is the case with incremental
backups. (It sounds like this will be the case with rsync as well.)
>
>> We will delete most snapshots on the live volume, but retain many (or
>> all) snapshots on the backup block device. Is that a good strategy,
>> given my goals?
>
> Depending on the way you use it, retaining even a dozen snapshots on a live
> volume might hurt performance (for high-performance databases) or be
> completely transparent (for user folders). You may want to experiment with
> this number.

We do experience severe performance problems now, especially with
Firefox. Part of my experiment is to reduce the number of snapshots on
the live volumes, hence this question.

>
> In any case I'd not recommend retaining ALL snapshots on backup device, even
> if you have infinite space. Such filesystem would be as dangerous as the
> demon core, only good for adding more snapshots (not even deleting them),
> and any little mistake will blow everything up. Keep a few dozen, hundred at
> most.

The intention -- if we were to keep all snapshots on a backup device
-- would be to never ever try to delete them. However, with the
suggestion to separate the concerns and use rsync, we could also
easily run the Snapper timeline cleanup on the backup volume, thereby
limiting the retained snapshots to some reasonable number.

> Unlike other backup systems, you can fairly easily remove snapshots in the
> middle of sequence, use this opportunity. My thinout rule is: remove
> snapshot if resulting gap will be less than some fraction (e.g. 1/4) of its
> age. One day I'll publish portable solution on github.

Thanks. I hope you do find time to publish it. (And what do you mean
by portable?) For now, Snapper has a cleanup algorithm that we can
use. At least one of the tools listed here has a thinout algorithm
too: https://btrfs.wiki.kernel.org/index.php/Incremental_Backup

>> Given this minimal retention of snapshots on the live volume, should I
>> defrag it (assuming there is at least 50% free space available on the
>> device)? (BTW, is defrag OK on an NVMe drive? or an SSD?)
>>
>> In the above procedure, would I perform that defrag before or after
>> taking the snapshot? Or should I use autodefrag?
>
> I ended up using autodefrag, didn't try manual defragmentation. I don't use
> SSDs as backup volumes.

I don't use SSDs as backup volumes either. I was asking about the live volume.
>
>> Should I consider a dedup tool like one of these?
>
> Certainly NOT for snapshot-based backups: it is already deduplicated almost
> as much as possible, dedup tools can only make it *less* deduplicated.

The question is whether to use a dedup tool on the live volume which
has a few snapshots. Even with the new strategy (based on rsync), the
live volume may sometimes have two snapshots (pre- and post- pacman
upgrades).

I still wish to know, in that case, about using both a dedup tool and
defragmenting the btrfs filesystem.

Also still wondering about these options: no-holes, skinny metadata,
or extended inode refs?

This is a very helpful discussion. Thank you.

Re: Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-11-01 Thread Dave
On Wed, Nov 1, 2017 at 1:15 AM, Roman Mamedov  wrote:
> On Wed, 1 Nov 2017 01:00:08 -0400
> Dave  wrote:
>
>> To reconcile those conflicting goals, the only idea I have come up
>> with so far is to use btrfs send-receive to perform incremental
>> backups as described here:
>> https://btrfs.wiki.kernel.org/index.php/Incremental_Backup .
>
> Another option is to just use the regular rsync to a designated destination
> subvolume on the backup host, AND snapshot that subvolume on that host from
> time to time (or on backup completions, if you can synchronize that).
>
> rsync --inplace will keep space usage low as it will not reupload entire files
> in case of changes/additions to them.
>

This seems like a brilliant idea, something that has a lot of potential...

On a system where the root filesystem is on an SSD and the backup
volume on an HDD, I could rsync hourly, and then run Snapper on the
backup volume hourly, as well as using Snapper's timeline cleanup on
the backup volume. The live filesystem would have zero snapshots and
could be optimized for performance. The backup volume could retain a
large number of snapshots (even more than several hundred) because
performance would not be very important (as far as I can guess). This
seems to resolve our conflict.

How about on a system (such as a laptop) with only a single SSD? Would
this same idea work where the backup volume is on the same block
device? I know that is not technically a backup, but what it does
accomplish is separation of the live filesystem from the snapshotted
backup volume for performance reasons -- yet the hourly snapshot
history is still available. That would seem to meet our use case too.
(An external backup disk would be connected to the laptop
periodically, of course, too.)

Currently, for most btrfs volumes, I have three volumes: the main
volume, a snapshot subvolume which contains all the individual
snapshots, and a backup volume* (on a different block device but on
the same machine).

With this new idea, I would have a main volume without any snapshots
and a backup volume which contains all the snapshots. It simplifies
things on that level and it also simplifies performance tuning on the
main volume. In fact it simplifies backup snapshot management too.

My initial impression is that this simplifies everything as well as
optimizing everything. So surely it must have some disadvantages
compared to btrfs send-receive incremental backups
(https://btrfs.wiki.kernel.org/index.php/Incremental_Backup). What
would those disadvantages be?

The first one that comes to mind is that I would lose the
functionality of pre- and post- upgrade snapshots on the root
filesystem. But I think that's minor. I could either keep those two
snapshots for a few hours or days after major upgrades or maybe I
could find a pacman hook that uses rsync to make pre- and post-
upgrade copies...

* Footnote: on some workstation computers, we have 2 or 3 separate
backup block devices (e..g, external USB hard drives, etc.). Laptops,
however, generally only have a single block device and are not always
connected to an external USB hard drive for backup as often as would
be ideal. But we also don't keep any critical data on laptops.


Re: Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-11-01 Thread Marat Khalili
I'm an active user of backup using btrfs snapshots. Generally it works, with 
some caveats.


You seem to have two tasks: (1) same-volume snapshots (I would not call 
them backups) and (2) updating some backup volume (preferably on a 
different box). By solving them separately you can avoid some complexity 
like accidental removal of a snapshot that's still needed for updating the 
backup volume.



To reconcile those conflicting goals, the only idea I have come up
with so far is to use btrfs send-receive to perform incremental
backups as described here:
https://btrfs.wiki.kernel.org/index.php/Incremental_Backup .
As already said by Roman Mamedov, rsync is a viable alternative to 
send-receive with much less hassle. According to some reports it can 
even be faster.



Given the hourly snapshots, incremental backups are the only practical
option. They take mere moments. Full backups could take an hour or
more, which won't work with hourly backups.
I don't see much sense in re-doing full backups to the same physical 
device. If you care about backup integrity, it is probably more 
important to invest in backup verification. (OTOH, while you didn't 
reveal the data size, if a full backup takes just an hour on your system then 
why not?)



We will delete most snapshots on the live volume, but retain many (or
all) snapshots on the backup block device. Is that a good strategy,
given my goals?
Depending on the way you use it, retaining even a dozen snapshots on a 
live volume might hurt performance (for high-performance databases) or 
be completely transparent (for user folders). You may want to experiment 
with this number.


In any case I'd not recommend retaining ALL snapshots on the backup device, 
even if you have infinite space. Such a filesystem would be as dangerous 
as the demon core, only good for adding more snapshots (not even 
deleting them), and any little mistake will blow everything up. Keep a 
few dozen, a hundred at most.


Unlike other backup systems, you can fairly easily remove snapshots in 
the middle of the sequence, so use this opportunity. My thinout rule is: remove 
a snapshot if the resulting gap will be less than some fraction (e.g. 1/4) of 
its age. One day I'll publish a portable solution on GitHub.



Given this minimal retention of snapshots on the live volume, should I
defrag it (assuming there is at least 50% free space available on the
device)? (BTW, is defrag OK on an NVMe drive? or an SSD?)

In the above procedure, would I perform that defrag before or after
taking the snapshot? Or should I use autodefrag?
I ended up using autodefrag, didn't try manual defragmentation. I don't 
use SSDs as backup volumes.
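
(autodefrag is a mount option; a hedged /etc/fstab example, with a made-up
UUID and mount point:)

    UUID=<fs-uuid>  /home  btrfs  defaults,noatime,autodefrag  0  0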



Should I consider a dedup tool like one of these?
Certainly NOT for snapshot-based backups: it is already deduplicated 
almost as much as possible; dedup tools can only make it *less* 
deduplicated.



* Footnote: On the backup device, maybe we will never delete
snapshots. In any event, that's not a concern now. We'll retain many,
many snapshots on the backup device.
Again, DO NOT do this; btrfs in its current state does not support it. 
A good rule of thumb for the time of some operations is data size multiplied 
by the number of snapshots (raised to some power >= 1) and divided by IO/CPU 
speed. By creating snapshots it is very easy to create petabytes of data 
for the kernel to process, which it won't be able to do in many years.


--

With Best Regards,
Marat Khalili



Re: Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-10-31 Thread Roman Mamedov
On Wed, 1 Nov 2017 01:00:08 -0400
Dave  wrote:

> To reconcile those conflicting goals, the only idea I have come up
> with so far is to use btrfs send-receive to perform incremental
> backups as described here:
> https://btrfs.wiki.kernel.org/index.php/Incremental_Backup .

Another option is to just use the regular rsync to a designated destination
subvolume on the backup host, AND snapshot that subvolume on that host from
time to time (or on backup completions, if you can synchronize that).

rsync --inplace will keep space usage low as it will not reupload entire files
in case of changes/additions to them.

Yes rsync has to traverse both directory trees to find changes, but that's
pretty fast (couple of minutes at most, for a typical root filesystem),
especially if you use SSD or SSD caching.
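
A minimal sketch of that scheme as run on the backup host; the host name,
the paths, and the assumption that /mnt/backup/rsync is itself a subvolume
are all made up for illustration:

    # the rsync target must be a btrfs subvolume so it can be snapshotted
    rsync -aAXH --inplace --delete root@source:/home/ /mnt/backup/rsync/
    btrfs subvolume snapshot -r /mnt/backup/rsync \
        "/mnt/backup/snapshots/$(date +%Y-%m-%dT%H:%M)"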

-- 
With respect,
Roman


Need help with incremental backup strategy (snapshots, defragmenting & performance)

2017-10-31 Thread Dave
Our use case requires snapshots. btrfs snapshots are the best solution we
have found for our requirements, and over the last year snapshots have
proven their value to us.

(For this discussion I am considering both the "root" volume and the
"home" volume on a typical desktop workstation. Also, all btfs volumes
are mounted with noatime and nodiratime flags.)

For performance reasons, I now wish to minimize the number of
snapshots retained on the live btrfs volume.

However, for backup purposes, I wish to maximize the number of
snapshots retained over time. We'll keep yearly, monthly, weekly,
daily and hourly snapshots for as long as possible.

To reconcile those conflicting goals, the only idea I have come up
with so far is to use btrfs send-receive to perform incremental
backups as described here:
https://btrfs.wiki.kernel.org/index.php/Incremental_Backup .

Given the hourly snapshots, incremental backups are the only practical
option. They take mere moments. Full backups could take an hour or
more, which won't work with hourly backups.

We will delete most snapshots on the live volume, but retain many (or
all) snapshots on the backup block device. Is that a good strategy,
given my goals?

The steps:

I know step one is to do the "bootstrapping" where a full initial copy
of the live volume is sent to the backup volume. I also know the steps
for doing incremental backups.

However, the first problem I see is that performing incremental
backups requires both the live volume and the backup volume to have an
identical "parent" snapshot before each new incremental can be sent. I
have found it easy to accidentally delete that specific required
parent snapshot when hourly snapshots are being taken and many
snapshots exist.

Given that I want to retain the minimum number of snapshots on the
live volume, how do I ensure that a valid "parent" subvolume exists
there in order to perform the incremental backup? (Again, I have often
run into the error "no valid parent exists" when doing incremental
backups.)

I think the rule is like this:

Do not delete a snapshot from the live volume until the next snapshot
based on it has been sent to the backup volume.

In other words, always retain the *exact* snapshot that was the last
one sent to the backup volume. Deleting that one then taking another
one does not seem sufficient. BTRFS does not seem to recognize
parent-child-grandchild relationships of snapshots when doing
send-receive incremental backups.

However, maybe I'm wrong. Would it be sufficient to first take another
snapshot, then delete the prior snapshot? Will the send-receive
algorithm be able to infer a parent exists on the backup volume when
it receives an incremental based on a child snapshot? (My experience
says "no", but I'd like a more authoritative answer.)

The next step in my proposed procedure is to take a new snapshot, send
it to the backup volume, and only then delete the prior snapshot ( and
only from the live volume* ).

Using this strategy, the live volume will always have the current
snapshot (which I guess should not be called a snapshot -- it's the
live volume) plus at least one more snapshot. Briefly, during the
incremental backup, it will have an additional snapshot until the
older one gets deleted.
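
A sketch of that rule in practice, with made-up paths and a shell variable
standing in for "the snapshot that was last sent"; the old parent is only
deleted after the new snapshot has been received:

    # $prev points at the snapshot that was sent in the previous run
    prev=/home/.snapshots/2017-10-31-2300
    new=/home/.snapshots/$(date +%Y-%m-%d-%H%M)
    btrfs subvolume snapshot -r /home "$new"
    btrfs send -p "$prev" "$new" | btrfs receive /mnt/backup/snapshots/
    btrfs subvolume delete "$prev"   # only now drop the old parent; "$new" stays as the next parent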

Given this minimal retention of snapshots on the live volume, should I
defrag it (assuming there is at least 50% free space available on the
device)? (BTW, is defrag OK on an NVMe drive? or an SSD?)

In the above procedure, would I perform that defrag before or after
taking the snapshot? Or should I use autodefrag?

Should I consider a dedup tool like one of these?

g2p/bedup: Btrfs deduplication
https://github.com/g2p/bedup

markfasheh/duperemove: Tools for deduping file systems
https://github.com/markfasheh/duperemove

Zygo/bees: Best-Effort Extent-Same, a btrfs dedup agent
https://github.com/Zygo/bees

Does anyone care to elaborate on the relationship between a dedup tool
like Bees and defragmenting a btrfs filesystem with snapshots? I
understand they do opposing things, but I think it was suggested in
another thread on defragmenting that they can be combined to good
effect. Should I consider this as a possible solution for my
situation?

Should I consider any of these options: no-holes, skinny metadata, or
extended inode refs?

Finally, are there any good BTRFS performance wiki articles or blogs I
should refer to for my situation?

* Footnote: On the backup device, maybe we will never delete
snapshots. In any event, that's not a concern now. We'll retain many,
many snapshots on the backup device.


[PATCH v4 1/2] btrfs-progs: device: add description of alias to help message

2017-10-19 Thread Misono, Tomohiro
State that 'delete' is the alias of 'remove' as the man page says.

Signed-off-by: Tomohiro Misono 
Reviewed-by: Satoru Takeuchi 
---
 cmds-device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-device.c b/cmds-device.c
index 4337eb2..3b6b985 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -237,7 +237,7 @@ static int cmd_device_remove(int argc, char **argv)
 
 static const char * const cmd_device_delete_usage[] = {
"btrfs device delete | [|...] ",
-   "Remove a device from a filesystem",
+   "Remove a device from a filesystem (alias of \"btrfs device remove\")",
NULL
 };
 
-- 
2.9.5



[PATCH v3 1/2] btrfs-progs: device: add description of alias to help message

2017-10-15 Thread Misono, Tomohiro
State that 'delete' is the alias of 'remove' as the man page says.

Signed-off-by: Tomohiro Misono 
Reviewed-by: Satoru Takeuchi 
---
 cmds-device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-device.c b/cmds-device.c
index 4337eb2..3b6b985 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -237,7 +237,7 @@ static int cmd_device_remove(int argc, char **argv)
 
 static const char * const cmd_device_delete_usage[] = {
"btrfs device delete | [|...] ",
-   "Remove a device from a filesystem",
+   "Remove a device from a filesystem (alias of \"btrfs device remove\")",
NULL
 };
 
-- 
2.9.5



[PATCH v2 1/3] btrfs-progs: device: add description of alias to help message

2017-10-10 Thread Misono, Tomohiro
State that 'delete' is the alias of 'remove' as the man page says.

Signed-off-by: Tomohiro Misono 
Reviewed-by: Satoru Takeuchi 
---
 cmds-device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/cmds-device.c b/cmds-device.c
index 4337eb2..3b6b985 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -237,7 +237,7 @@ static int cmd_device_remove(int argc, char **argv)
 
 static const char * const cmd_device_delete_usage[] = {
"btrfs device delete | [|...] ",
-   "Remove a device from a filesystem",
+   "Remove a device from a filesystem (alias of \"btrfs device remove\")",
NULL
 };
 
-- 
2.9.5



Re: [PATCH] btrfs-progs: doc: update help/document of btrfs device remove

2017-10-10 Thread Misono, Tomohiro
On 2017/10/11 6:22, Satoru Takeuchi wrote:
> At Tue, 3 Oct 2017 17:12:39 +0900,
> Misono, Tomohiro wrote:
>>
>> This patch updates help/document of "btrfs device remove" in two points:
>>
>> 1. Add explanation of 'missing' for 'device remove'. This is only
>> written in wikipage currently.
>> (https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices)
>>
>> 2. Add example of device removal in the man document. This is because
>> that explanation of "remove" says "See the example section below", but
>> there is no example of removal currently.
>>
>> Signed-off-by: Tomohiro Misono <misono.tomoh...@jp.fujitsu.com>
>> ---
>>  Documentation/btrfs-device.asciidoc | 19 +++
>>  cmds-device.c   | 10 +-
>>  2 files changed, 28 insertions(+), 1 deletion(-)
>>
>> diff --git a/Documentation/btrfs-device.asciidoc 
>> b/Documentation/btrfs-device.asciidoc
>> index 88822ec..dc523a9 100644
>> --- a/Documentation/btrfs-device.asciidoc
>> +++ b/Documentation/btrfs-device.asciidoc
>> @@ -75,6 +75,10 @@ The operation can take long as it needs to move all data 
>> from the device.
>>  It is possible to delete the device that was used to mount the filesystem. 
>> The
>>  device entry in mount table will be replaced by another device name with the
>>  lowest device id.
>> ++
>> +If device is mounted as degraded mode (-o degraded), special term "missing"
>> +can be used for <device>. In that case, the first device that is described by
>> +the filesystem metadata, but not presented at the mount time will be removed.
>>  
>>  *delete* <device>|<devid> [<device>|<devid>...] <path>::
>>  Alias of remove kept for backward compatibility
>> @@ -206,6 +210,21 @@ data or the block groups occupy the whole first device.
>>  The device size of '/dev/sdb' as seen by the filesystem remains unchanged, 
>> but
>>  the logical space from 50-100GiB will be unused.
>>  
>> + REMOVE DEVICE 
> 
> It's a part of "TYPICAL USECASES" section. So it's also necessary to modify
> the following sentence
> 
> ===
> See the example section below.
> ===
> 
> to as follow.
> 
> ===
> See the *TYPICAL USECASES* section below.
> ===
> 
> Or just removing the above mentioned sentence is also OK since there is
> "See the section *TYPICAL USECASES* for some examples." in "DEVICE MANAGEMENT"
> section.
> 
>> +
>> +Device removal must satisfy the profile constraints, otherwise the command
>> +fails. For example:
>> +
>> + $ btrfs device remove /dev/sda /mnt
>> + $ ERROR: error removing device '/dev/sda': unable to go below two devices 
>> on raid1
> 
> s/^$  ERROR/ERROR/
> 
>> +
>> +
>> +In order to remove a device, you need to convert profile in this case:
>> +
>> + $ btrfs balance start -mconvert=dup /mnt
>> + $ btrfs balance start -dconvert=single /mnt
> 
> It's simpler to convert both the RAID configuration of data and metadata
> by the following one command.
> 
> $ btrfs balance -mconvert=dup -dconvert=single /mnt
> 
>> + $ btrfs device remove /dev/sda /mnt
>> +
>>  DEVICE STATS
>>  
>>  
>> diff --git a/cmds-device.c b/cmds-device.c
>> index 4337eb2..6cb53ff 100644
>> --- a/cmds-device.c
>> +++ b/cmds-device.c
>> @@ -224,9 +224,16 @@ static int _cmd_device_remove(int argc, char **argv,
>>  return !!ret;
>>  }
>>  
>> +#define COMMON_USAGE_REMOVE_DELETE \
>> +"", \
>> +"If 'missing' is specified for , the first device that is", \
>> +"described by the filesystem metadata, but not presented at the", \
>> +"mount time will be removed."
>> +
>>  static const char * const cmd_device_remove_usage[] = {
>>  "btrfs device remove | [|...] ",
>>  "Remove a device from a filesystem",
>> +COMMON_USAGE_REMOVE_DELETE,
>>  NULL
>>  };
>>  
>> @@ -237,7 +244,8 @@ static int cmd_device_remove(int argc, char **argv)
>>  
>>  static const char * const cmd_device_delete_usage[] = {
>>  "btrfs device delete | [|...] ",
>> -"Remove a device from a filesystem",
>> +"Remove a device from a filesystem (alias of \"btrfs device remove\")",
>> +COMMON_USAGE_REMOVE_DELETE,
>>  NULL
>>  };
> 
> This snippet is not related to the description of this patch.
> Dividing this patch is better.

Re: [PATCH] btrfs-progs: doc: update help/document of btrfs device remove

2017-10-10 Thread Satoru Takeuchi
At Tue, 3 Oct 2017 17:12:39 +0900,
Misono, Tomohiro wrote:
> 
> This patch updates help/document of "btrfs device remove" in two points:
> 
> 1. Add explanation of 'missing' for 'device remove'. This is only
> written in wikipage currently.
> (https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices)
> 
> 2. Add example of device removal in the man document. This is because
> that explanation of "remove" says "See the example section below", but
> there is no example of removal currently.
> 
> Signed-off-by: Tomohiro Misono <misono.tomoh...@jp.fujitsu.com>
> ---
>  Documentation/btrfs-device.asciidoc | 19 +++
>  cmds-device.c   | 10 +-
>  2 files changed, 28 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/btrfs-device.asciidoc 
> b/Documentation/btrfs-device.asciidoc
> index 88822ec..dc523a9 100644
> --- a/Documentation/btrfs-device.asciidoc
> +++ b/Documentation/btrfs-device.asciidoc
> @@ -75,6 +75,10 @@ The operation can take long as it needs to move all data 
> from the device.
>  It is possible to delete the device that was used to mount the filesystem. 
> The
>  device entry in mount table will be replaced by another device name with the
>  lowest device id.
> ++
> +If device is mounted as degraded mode (-o degraded), special term "missing"
> +can be used for <device>. In that case, the first device that is described by
> +the filesystem metadata, but not presented at the mount time will be removed.
>  
>  *delete* <device>|<devid> [<device>|<devid>...] <path>::
>  Alias of remove kept for backward compatibility
> @@ -206,6 +210,21 @@ data or the block groups occupy the whole first device.
>  The device size of '/dev/sdb' as seen by the filesystem remains unchanged, 
> but
>  the logical space from 50-100GiB will be unused.
>  
> + REMOVE DEVICE 

It's a part of "TYPICAL USECASES" section. So it's also necessary to modify
the following sentence

===
See the example section below.
===

to as follow.

===
See the *TYPICAL USECASES* section below.
===

Or just removing the above mentioned sentence is also OK since there is
"See the section *TYPICAL USECASES* for some examples." in "DEVICE MANAGEMENT"
section.

> +
> +Device removal must satisfy the profile constraints, otherwise the command
> +fails. For example:
> +
> + $ btrfs device remove /dev/sda /mnt
> + $ ERROR: error removing device '/dev/sda': unable to go below two devices 
> on raid1

s/^$  ERROR/ERROR/

> +
> +
> +In order to remove a device, you need to convert profile in this case:
> +
> + $ btrfs balance start -mconvert=dup /mnt
> + $ btrfs balance start -dconvert=single /mnt

It's simpler to convert both the RAID configuration of data and metadata
by the following one command.

$ btrfs balance -mconvert=dup -dconvert=single /mnt

> + $ btrfs device remove /dev/sda /mnt
> +
>  DEVICE STATS
>  
>  
> diff --git a/cmds-device.c b/cmds-device.c
> index 4337eb2..6cb53ff 100644
> --- a/cmds-device.c
> +++ b/cmds-device.c
> @@ -224,9 +224,16 @@ static int _cmd_device_remove(int argc, char **argv,
>   return !!ret;
>  }
>  
> +#define COMMON_USAGE_REMOVE_DELETE \
> + "", \
> + "If 'missing' is specified for , the first device that is", \
> + "described by the filesystem metadata, but not presented at the", \
> + "mount time will be removed."
> +
>  static const char * const cmd_device_remove_usage[] = {
>   "btrfs device remove | [|...] ",
>   "Remove a device from a filesystem",
> + COMMON_USAGE_REMOVE_DELETE,
>   NULL
>  };
>  
> @@ -237,7 +244,8 @@ static int cmd_device_remove(int argc, char **argv)
>  
>  static const char * const cmd_device_delete_usage[] = {
>   "btrfs device delete | [|...] ",
> - "Remove a device from a filesystem",
> + "Remove a device from a filesystem (alias of \"btrfs device remove\")",
> + COMMON_USAGE_REMOVE_DELETE,
>   NULL
>  };

This snippet is not related to the description of this patch.
Dividing this patch is better.

Thanks,
Satoru

>  
> -- 
> 2.9.5
> 


Re: Seeking Help on Corruption Issues

2017-10-04 Thread Hugo Mills
On Tue, Oct 03, 2017 at 03:49:25PM -0700, Stephen Nesbitt wrote:
> 
> On 10/3/2017 2:11 PM, Hugo Mills wrote:
> >Hi, Stephen,
> >
> >On Tue, Oct 03, 2017 at 08:52:04PM +, Stephen Nesbitt wrote:
>>Here it is. There are a couple of out-of-order entries beginning at 117. And
> >>yes I did uncover a bad stick of RAM:
> >>
> >>btrfs-progs v4.9.1
> >>leaf 2589782867968 items 134 free space 6753 generation 3351574 owner 2
> >>fs uuid 24b768c3-2141-44bf-ae93-1c3833c8c8e3
> >>chunk uuid 19ce12f0-d271-46b8-a691-e0d26c1790c6
> >[snip]
> >>item 116 key (1623012749312 EXTENT_ITEM 45056) itemoff 10908 itemsize 53
> >>extent refs 1 gen 3346444 flags DATA
> >>extent data backref root 271 objectid 2478 offset 0 count 1
> >>item 117 key (1621939052544 EXTENT_ITEM 8192) itemoff 10855 itemsize 53
> >>extent refs 1 gen 3346495 flags DATA
> >>extent data backref root 271 objectid 21751764 offset 6733824 count 1
> >>item 118 key (1623012450304 EXTENT_ITEM 8192) itemoff 10802 itemsize 53
> >>extent refs 1 gen 3351513 flags DATA
> >>extent data backref root 271 objectid 5724364 offset 680640512 count 1
> >>item 119 key (1623012802560 EXTENT_ITEM 12288) itemoff 10749 itemsize 53
> >>extent refs 1 gen 3346376 flags DATA
> >>extent data backref root 271 objectid 21751764 offset 6701056 count 1
> hex(1623012749312)
> >'0x179e3193000'
> hex(1621939052544)
> >'0x179a319e000'
> hex(1623012450304)
> >'0x179e314a000'
> hex(1623012802560)
> >'0x179e31a0000'
> >
> >That's "e" -> "a" in the fourth hex digit, which is a single-bit
> >flip, and should be fixable by btrfs check (I think). However, even
> >fixing that, it's not ordered, because 118 is then before 117, which
> >could be another bitflip ("9" -> "4" in the 7th digit), but two bad
> >bits that close to each other seems unlikely to me.
> >
> >Hugo.
> 
> Hope this is a duplicate reply - I might have fat fingered something.
> 
> The underlying file is disposable/replaceable. Any way to zero
> out/zap the bad BTRFS entry?

   Not really. Even trying to delete the related file(s), it's going
to fall over when reading the metadata in, in the first place. (The key
order check is a metadata invariant, like the csum checks and transid
checks).

   At best, you'd have to get btrfs check to fix it. It should be able
to manage a single-bit error, but you've got two single-bit errors in
close proximity, and I'm not sure it'll be able to deal with it. Might
be worth trying it. The FS _might_ blow up as a result of an attempted
fix, but you say it's replaceable, so that's kind of OK. The worst I'd
_expect_ to happen with btrfs check --repair is that it just won't be
able to deal with it and you're left where you started.
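
(A cautious way to attempt that, with the filesystem unmounted first; the
device name below is just an example:)

    umount /mnt
    btrfs check --readonly /dev/sdb    # report-only pass first
    btrfs check --repair /dev/sdb      # only if the read-only output looks sane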

   Go for it.

   Hugo.

-- 
Hugo Mills | You shouldn't anthropomorphise computers. They
hugo@... carfax.org.uk | really don't like that.
http://carfax.org.uk/  |
PGP: E2AB1DE4  |




Re: Seeking Help on Corruption Issues

2017-10-03 Thread Stephen Nesbitt


On 10/3/2017 2:11 PM, Hugo Mills wrote:

Hi, Stephen,

On Tue, Oct 03, 2017 at 08:52:04PM +, Stephen Nesbitt wrote:

Here it is. There are a couple of out-of-order entries beginning at 117. And
yes I did uncover a bad stick of RAM:

btrfs-progs v4.9.1
leaf 2589782867968 items 134 free space 6753 generation 3351574 owner 2
fs uuid 24b768c3-2141-44bf-ae93-1c3833c8c8e3
chunk uuid 19ce12f0-d271-46b8-a691-e0d26c1790c6

[snip]

item 116 key (1623012749312 EXTENT_ITEM 45056) itemoff 10908 itemsize 53
extent refs 1 gen 3346444 flags DATA
extent data backref root 271 objectid 2478 offset 0 count 1
item 117 key (1621939052544 EXTENT_ITEM 8192) itemoff 10855 itemsize 53
extent refs 1 gen 3346495 flags DATA
extent data backref root 271 objectid 21751764 offset 6733824 count 1
item 118 key (1623012450304 EXTENT_ITEM 8192) itemoff 10802 itemsize 53
extent refs 1 gen 3351513 flags DATA
extent data backref root 271 objectid 5724364 offset 680640512 count 1
item 119 key (1623012802560 EXTENT_ITEM 12288) itemoff 10749 itemsize 53
extent refs 1 gen 3346376 flags DATA
extent data backref root 271 objectid 21751764 offset 6701056 count 1

hex(1623012749312)

'0x179e3193000'

hex(1621939052544)

'0x179a319e000'

hex(1623012450304)

'0x179e314a000'

hex(1623012802560)

'0x179e31a0000'

That's "e" -> "a" in the fourth hex digit, which is a single-bit
flip, and should be fixable by btrfs check (I think). However, even
fixing that, it's not ordered, because 118 is then before 117, which
could be another bitflip ("9" -> "4" in the 7th digit), but two bad
bits that close to each other seems unlikely to me.

Hugo.


Hope this isn't a duplicate reply - I might have fat fingered something.

The underlying file is disposable/replaceable. Any way to zero out/zap 
the bad BTRFS entry?


-steve



Re: Seeking Help on Corruption Issues

2017-10-03 Thread Hugo Mills
   Hi, Stephen,

On Tue, Oct 03, 2017 at 08:52:04PM +, Stephen Nesbitt wrote:
> Here it is. There are a couple of out-of-order entries beginning at 117. And
> yes I did uncover a bad stick of RAM:
> 
> btrfs-progs v4.9.1
> leaf 2589782867968 items 134 free space 6753 generation 3351574 owner 2
> fs uuid 24b768c3-2141-44bf-ae93-1c3833c8c8e3
> chunk uuid 19ce12f0-d271-46b8-a691-e0d26c1790c6
[snip]
> item 116 key (1623012749312 EXTENT_ITEM 45056) itemoff 10908 itemsize 53
> extent refs 1 gen 3346444 flags DATA
> extent data backref root 271 objectid 2478 offset 0 count 1
> item 117 key (1621939052544 EXTENT_ITEM 8192) itemoff 10855 itemsize 53
> extent refs 1 gen 3346495 flags DATA
> extent data backref root 271 objectid 21751764 offset 6733824 count 1
> item 118 key (1623012450304 EXTENT_ITEM 8192) itemoff 10802 itemsize 53
> extent refs 1 gen 3351513 flags DATA
> extent data backref root 271 objectid 5724364 offset 680640512 count 1
> item 119 key (1623012802560 EXTENT_ITEM 12288) itemoff 10749 itemsize 53
> extent refs 1 gen 3346376 flags DATA
> extent data backref root 271 objectid 21751764 offset 6701056 count 1

>>> hex(1623012749312)
'0x179e3193000'
>>> hex(1621939052544)
'0x179a319e000'
>>> hex(1623012450304)
'0x179e314a000'
>>> hex(1623012802560)
'0x179e31a0000'

   That's "e" -> "a" in the fourth hex digit, which is a single-bit
flip, and should be fixable by btrfs check (I think). However, even
fixing that, it's not ordered, because 118 is then before 117, which
could be another bitflip ("9" -> "4" in the 7th digit), but two bad
bits that close to each other seems unlikely to me.
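   As a quick worked check of that single-bit claim (plain shell
arithmetic, added here for illustration): the differing fourth digits
are 0xe and 0xa, and

   $ printf '%x\n' $((0xe ^ 0xa))
   4

   4 is binary 100, i.e. exactly one bit differs between the two digits.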

   Hugo.

-- 
Hugo Mills | Great films about cricket: Silly Point Break
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4  |




Re: Seeking Help on Corruption Issues

2017-10-03 Thread Hugo Mills
On Tue, Oct 03, 2017 at 01:06:50PM -0700, Stephen Nesbitt wrote:
> All:
> 
> I came back to my computer yesterday to find my filesystem in read
> only mode. Running a btrfs scrub start -dB aborts as follows:
> 
> btrfs scrub start -dB /mnt
> ERROR: scrubbing /mnt failed for device id 4: ret=-1, errno=5
> (Input/output error)
> ERROR: scrubbing /mnt failed for device id 5: ret=-1, errno=5
> (Input/output error)
> scrub device /dev/sdb (id 4) canceled
>     scrub started at Mon Oct  2 21:51:46 2017 and was aborted after
> 00:09:02
>     total bytes scrubbed: 75.58GiB with 1 errors
>     error details: csum=1
>     corrected errors: 0, uncorrectable errors: 1, unverified errors: 0
> scrub device /dev/sdc (id 5) canceled
>     scrub started at Mon Oct  2 21:51:46 2017 and was aborted after
> 00:11:11
>     total bytes scrubbed: 50.75GiB with 0 errors
> 
> The resulting dmesg is:
> [  699.534066] BTRFS error (device sdc): bdev /dev/sdb errs: wr 0,
> rd 0, flush 0, corrupt 6, gen 0
> [  699.703045] BTRFS error (device sdc): unable to fixup (regular)
> error at logical 1609808347136 on dev /dev/sdb
> [  783.306525] BTRFS critical (device sdc): corrupt leaf, bad key
> order: block=2589782867968, root=1, slot=116

   This error usually means bad RAM. Can you show us the output of
"btrfs-debug-tree -b 2589782867968 /dev/sdc"?

   Hugo.

> [  789.776132] BTRFS critical (device sdc): corrupt leaf, bad key
> order: block=2589782867968, root=1, slot=116
> [  911.529842] BTRFS critical (device sdc): corrupt leaf, bad key
> order: block=2589782867968, root=1, slot=116
> [  918.365225] BTRFS critical (device sdc): corrupt leaf, bad key
> order: block=2589782867968, root=1, slot=116
> 
> Running btrfs check /dev/sdc results in:
> btrfs check /dev/sdc
> Checking filesystem on /dev/sdc
> UUID: 24b768c3-2141-44bf-ae93-1c3833c8c8e3
> checking extents
> bad key ordering 116 117
> bad block 2589782867968
> ERROR: errors found in extent allocation tree or chunk allocation
> checking free space cache
> There is no free space entry for 1623012450304-1623012663296
> There is no free space entry for 1623012450304-1623225008128
> cache appears valid but isn't 1622151266304
> found 288815742976 bytes used err is -22
> total csum bytes: 0
> total tree bytes: 350781440
> total fs tree bytes: 0
> total extent tree bytes: 350027776
> btree space waste bytes: 115829777
> file data blocks allocated: 156499968
> 
> uname -a:
> Linux sysresccd 4.9.24-std500-amd64 #2 SMP Sat Apr 22 17:14:43 UTC
> 2017 x86_64 Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz GenuineIntel
> GNU/Linux
> 
> btrfs --version: btrfs-progs v4.9.1
> 
> btrfs fi show:
> Label: none  uuid: 24b768c3-2141-44bf-ae93-1c3833c8c8e3
>     Total devices 2 FS bytes used 475.08GiB
>     devid    4 size 931.51GiB used 612.06GiB path /dev/sdb
>     devid    5 size 931.51GiB used 613.09GiB path /dev/sdc
> 
> btrfs fi df /mnt:
> Data, RAID1: total=603.00GiB, used=468.03GiB
> System, RAID1: total=64.00MiB, used=112.00KiB
> System, single: total=32.00MiB, used=0.00B
> Metadata, RAID1: total=9.00GiB, used=7.04GiB
> Metadata, single: total=1.00GiB, used=0.00B
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> What is the recommended procedure at this point? Run btrfs check
> --repair? I have backups so losing a file or two isn't critical, but
> I really don't want to go through the effort of a bare metal
> reinstall.
> 
> In the process of researching this I did uncover a bad DIMM. Am I
> correct that the problems I'm seeing are likely linked to the
> resulting memory errors.
> 
> Thx in advance,
> 
> -steve
> 

-- 
Hugo Mills | Quidquid latine dictum sit, altum videtur
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4  |




Seeking Help on Corruption Issues

2017-10-03 Thread Stephen Nesbitt

All:

I came back to my computer yesterday to find my filesystem in read only 
mode. Running a btrfs scrub start -dB aborts as follows:


btrfs scrub start -dB /mnt
ERROR: scrubbing /mnt failed for device id 4: ret=-1, errno=5 
(Input/output error)
ERROR: scrubbing /mnt failed for device id 5: ret=-1, errno=5 
(Input/output error)

scrub device /dev/sdb (id 4) canceled
    scrub started at Mon Oct  2 21:51:46 2017 and was aborted after 
00:09:02

    total bytes scrubbed: 75.58GiB with 1 errors
    error details: csum=1
    corrected errors: 0, uncorrectable errors: 1, unverified errors: 0
scrub device /dev/sdc (id 5) canceled
    scrub started at Mon Oct  2 21:51:46 2017 and was aborted after 
00:11:11

    total bytes scrubbed: 50.75GiB with 0 errors

The resulting dmesg is:
[  699.534066] BTRFS error (device sdc): bdev /dev/sdb errs: wr 0, rd 0, 
flush 0, corrupt 6, gen 0
[  699.703045] BTRFS error (device sdc): unable to fixup (regular) error 
at logical 1609808347136 on dev /dev/sdb
[  783.306525] BTRFS critical (device sdc): corrupt leaf, bad key order: 
block=2589782867968, root=1, slot=116
[  789.776132] BTRFS critical (device sdc): corrupt leaf, bad key order: 
block=2589782867968, root=1, slot=116
[  911.529842] BTRFS critical (device sdc): corrupt leaf, bad key order: 
block=2589782867968, root=1, slot=116
[  918.365225] BTRFS critical (device sdc): corrupt leaf, bad key order: 
block=2589782867968, root=1, slot=116


Running btrfs check /dev/sdc results in:
btrfs check /dev/sdc
Checking filesystem on /dev/sdc
UUID: 24b768c3-2141-44bf-ae93-1c3833c8c8e3
checking extents
bad key ordering 116 117
bad block 2589782867968
ERROR: errors found in extent allocation tree or chunk allocation
checking free space cache
There is no free space entry for 1623012450304-1623012663296
There is no free space entry for 1623012450304-1623225008128
cache appears valid but isn't 1622151266304
found 288815742976 bytes used err is -22
total csum bytes: 0
total tree bytes: 350781440
total fs tree bytes: 0
total extent tree bytes: 350027776
btree space waste bytes: 115829777
file data blocks allocated: 156499968

uname -a:
Linux sysresccd 4.9.24-std500-amd64 #2 SMP Sat Apr 22 17:14:43 UTC 2017 
x86_64 Intel(R) Core(TM) i5-6500 CPU @ 3.20GHz GenuineIntel GNU/Linux


btrfs --version: btrfs-progs v4.9.1

btrfs fi show:
Label: none  uuid: 24b768c3-2141-44bf-ae93-1c3833c8c8e3
    Total devices 2 FS bytes used 475.08GiB
    devid    4 size 931.51GiB used 612.06GiB path /dev/sdb
    devid    5 size 931.51GiB used 613.09GiB path /dev/sdc

btrfs fi df /mnt:
Data, RAID1: total=603.00GiB, used=468.03GiB
System, RAID1: total=64.00MiB, used=112.00KiB
System, single: total=32.00MiB, used=0.00B
Metadata, RAID1: total=9.00GiB, used=7.04GiB
Metadata, single: total=1.00GiB, used=0.00B
GlobalReserve, single: total=512.00MiB, used=0.00B

What is the recommended procedure at this point? Run btrfs check 
--repair? I have backups so losing a file or two isn't critical, but I 
really don't want to go through the effort of a bare metal reinstall.


In the process of researching this I did uncover a bad DIMM. Am I 
correct that the problems I'm seeing are likely linked to the resulting 
memory errors.


Thx in advance,

-steve



[PATCH] btrfs-progs: doc: update help/document of btrfs device remove

2017-10-03 Thread Misono, Tomohiro
This patch updates help/document of "btrfs device remove" in two points:

1. Add explanation of 'missing' for 'device remove'. This is only
written in wikipage currently.
(https://btrfs.wiki.kernel.org/index.php/Using_Btrfs_with_Multiple_Devices)

2. Add example of device removal in the man document. This is because
that explanation of "remove" says "See the example section below", but
there is no example of removal currently.

Signed-off-by: Tomohiro Misono <misono.tomoh...@jp.fujitsu.com>
---
 Documentation/btrfs-device.asciidoc | 19 +++
 cmds-device.c   | 10 +-
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/Documentation/btrfs-device.asciidoc 
b/Documentation/btrfs-device.asciidoc
index 88822ec..dc523a9 100644
--- a/Documentation/btrfs-device.asciidoc
+++ b/Documentation/btrfs-device.asciidoc
@@ -75,6 +75,10 @@ The operation can take long as it needs to move all data 
from the device.
 It is possible to delete the device that was used to mount the filesystem. The
 device entry in mount table will be replaced by another device name with the
 lowest device id.
++
+If device is mounted as degraded mode (-o degraded), special term "missing"
+can be used for <device>. In that case, the first device that is described by
+the filesystem metadata, but not presented at the mount time will be removed.
 
 *delete* <device>|<devid> [<device>|<devid>...] <path>::
 Alias of remove kept for backward compatibility
@@ -206,6 +210,21 @@ data or the block groups occupy the whole first device.
 The device size of '/dev/sdb' as seen by the filesystem remains unchanged, but
 the logical space from 50-100GiB will be unused.
 
+ REMOVE DEVICE 
+
+Device removal must satisfy the profile constraints, otherwise the command
+fails. For example:
+
+ $ btrfs device remove /dev/sda /mnt
+ $ ERROR: error removing device '/dev/sda': unable to go below two devices on 
raid1
+
+
+In order to remove a device, you need to convert profile in this case:
+
+ $ btrfs balance start -mconvert=dup /mnt
+ $ btrfs balance start -dconvert=single /mnt
+ $ btrfs device remove /dev/sda /mnt
+
 DEVICE STATS
 
 
diff --git a/cmds-device.c b/cmds-device.c
index 4337eb2..6cb53ff 100644
--- a/cmds-device.c
+++ b/cmds-device.c
@@ -224,9 +224,16 @@ static int _cmd_device_remove(int argc, char **argv,
return !!ret;
 }
 
+#define COMMON_USAGE_REMOVE_DELETE \
+   "", \
+   "If 'missing' is specified for , the first device that is", \
+   "described by the filesystem metadata, but not presented at the", \
+   "mount time will be removed."
+
 static const char * const cmd_device_remove_usage[] = {
"btrfs device remove | [|...] ",
"Remove a device from a filesystem",
+   COMMON_USAGE_REMOVE_DELETE,
NULL
 };
 
@@ -237,7 +244,8 @@ static int cmd_device_remove(int argc, char **argv)
 
 static const char * const cmd_device_delete_usage[] = {
"btrfs device delete | [|...] ",
-   "Remove a device from a filesystem",
+   "Remove a device from a filesystem (alias of \"btrfs device remove\")",
+   COMMON_USAGE_REMOVE_DELETE,
NULL
 };
 
-- 
2.9.5
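For context, a short illustration of the 'missing' usage documented above
(not part of the patch itself; the mount point is hypothetical):

 $ mount -o degraded /dev/sdb /mnt
 $ btrfs device remove missing /mnt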



Re: Help Recovering BTRFS array

2017-09-21 Thread grondinm

Hi Duncan,

I'm not sure if this will attach to my original message...

Thank you for your reply. For some reason i'm not getting list messages even 
tho i know i am subscribed.

I know all too well about the golden rule of data. It has bitten me a few 
times. The data on this array is mostly data that i don't really care about. I 
was able to copy off what i wanted. The main reason i sent it to the list was 
just to see if i could somehow return the FS to a working state without having 
to recreate. I'm just surprised that all 3 copies of the super block got 
corrupted. Probably my lack of understanding but i always assumed that if one 
copy got corrupted it would be replaced by a good copy therefore leaving all 
copies in a good state. Is that not the case? If it is, then what bad luck that 
all 3 got messed up at the same time. 

Some information i forgot to include in my original message

uname -a
Linux thebeach 4.12.13-gentoo-GMAN #1 SMP Sat Sep 16 15:28:26 ADT 2017 x86_64 
Intel(R) Core(TM) i5-2320 CPU @ 3.00GHz GenuineIntel GNU/Linux

btrfs --version
btrfs-progs v4.10.2

Anyways thank you again for your reply. I will leave the FS intact for a few 
days in case any more details could help the development of BTRFS and maybe 
avoid this happening or having a recovery option.

Marc




Re: Help Recovering BTRFS array

2017-09-18 Thread Duncan
grondinm posted on Mon, 18 Sep 2017 14:14:08 -0300 as excerpted:

> superblock: bytenr=65536, device=/dev/md0
> -
> ERROR: bad magic on superblock on /dev/md0 at 65536
> 
> superblock: bytenr=67108864, device=/dev/md0
> -
> ERROR: bad magic on superblock on /dev/md0 at 67108864
> 
> superblock: bytenr=274877906944, device=/dev/md0
> -
> ERROR: bad magic on superblock on /dev/md0 at 274877906944
> 
> Now i'm really panicked. Is the FS toast? Can any recovery be attempted?

First I'm a user and list regular, not a dev.  With luck they can help 
beyond the below suggestions...

However, there's no need to panic in any case, due to the sysadmin's 
first rule of backups: The true value of any data is defined by the 
number of backups of that data you consider(ed) it worth having.

As a result, there are precisely two possibilities, neither one of which 
calls for panic.

1) No need to panic because you have a backup, and recovery is as simple 
as restoring from that backup.

2) You don't have a backup, in which case the lack of that backup means 
you have defined the value of the data as only trivial, worth less than 
the time/trouble/resources you saved by not making that backup.  Because 
the data is only of trivial value anyway, and you saved the more valuable 
assets of the time/trouble/resources you would have put into that backup 
were the data of more than trivial value, you've still saved the stuff 
you considered most valuable, so again, no need to panic.

It's a binary state.  There's no third possibility available, and no 
possibility you lost what your actions, or lack of them in the case of no 
backup, defined as of most value to you.

(As for the freshness of that backup, the same rule applies, but to the 
data delta between the state as of the backup and the current state.  If 
the value of the changed data is worth it to you to have it backed up, 
you'll have freshened your backup.  If not, you defined it to be as of 
such trivial value as to not be worth the time/trouble/resources to do 
so.)


That said, at the time you're calculating the value of the data against 
the value of the time/trouble/resources required to back it up, the loss 
potential remains theoretical.  Once something actually happens to the 
data, it's no longer theoretical, and the data, while of trivial enough 
value to be worth the risk when it was theoretical, may still be valuable 
enough to you to spend at least some time/trouble on trying to recover it.

In that case, since you can still mount, I'd suggest mounting read-only 
to prevent any further damage, and then do a copy off of the data you 
can, to a different, unaffected, filesystem.

Then if there's still data you want that you couldn't simply copy off, 
you can try btrfs restore.  While I do have backups here, a couple times 
when things went bad, btrfs restore was able to get back pretty much 
everything to current, while were I to have had to restore from backups, 
I'd have lost enough changed data to hurt, even if I had defined it as of 
trivial enough value when the risk remained theoretical that I hadn't yet 
freshened the backup.  (Since then I upgraded the rest of my storage to 
ssd, thus lowering the time and hassle cost of backups, encouraging me to 
do them more frequently.  Talking about which, I need to freshen them in 
the near future.  It's now on my list for my next day off...)
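As a rough sketch of that order of operations (the device name is the 
/dev/md0 from the original report; the destination path is just a 
placeholder and must be on a different, healthy filesystem):

mount -o ro /dev/md0 /mnt
cp -a /mnt/data-you-want /other/filesystem/
umount /mnt
btrfs restore /dev/md0 /other/filesystem/restored/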

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Help Recovering BTRFS array

2017-09-18 Thread grondinm
Hello,

I will try to provide all information pertinent to the situation i find myself 
in.

Yesterday while trying to write some data to a BTRFS filesystem on top of a 
mdadm raid5 array encrypted with dmcrypt comprising of 4 1tb HDD my system 
became unresponsive and i had no choice but to hard reset. System came back up 
no problem and the array in question mounted without a complaint. Once i tried 
to write data to it again however the system became unresponsive again and 
required another hard reset. Again system came back up and everything mounted 
with no complaints.

This time i decided to run some checks. Ran a raid check by issuing 'echo check 
> /sys/block/md0/md/sync_action'. This completed without a single error. So i 
performed a proper restart just because and once the system came back up i 
initiated a scrub on the btrfs filesystem. This greeted me with my first 
indication that something is wrong:

btrfs sc stat /media/Storage2 
scrub status for e5bd5cf3-c736-48ff-b1c6-c9f678567788
scrub started at Mon Sep 18 06:05:21 2017, running for 07:40:47
total bytes scrubbed: 1.03TiB with 1 errors
error details: super=1
corrected errors: 0, uncorrectable errors: 0, unverified errors: 0

I was concerned but since it was still scrubbing i left it. Now things look 
really bleak... 

Every few minutes the scrub process goes into a D status as shown by htop; it 
eventually keeps going and as far as I can see is still scrubbing (slowly). I 
decided to check something else (based on the error above): I ran btrfs 
inspect-internal dump-super -a -f /dev/md0, which gave me this:

superblock: bytenr=65536, device=/dev/md0 
-
ERROR: bad magic on superblock on /dev/md0 at 65536

superblock: bytenr=67108864, device=/dev/md0
-
ERROR: bad magic on superblock on /dev/md0 at 67108864

superblock: bytenr=274877906944, device=/dev/md0
-
ERROR: bad magic on superblock on /dev/md0 at 274877906944

Now i'm really panicked. Is the FS toast? Can any recovery be attempted?

Here is the output of dump-super with the -F option:

superblock: bytenr=65536, device=/dev/md0
-
csum_type   43668 (INVALID)
csum_size   32
csum
0x76c647b04abf1057f04e40d1dc52522397258064b98a1b8f6aa6934c74c0dd55 [DON'T MATCH]
bytenr  6376050623103086821
flags   0x7edcc412b742c79f
( WRITTEN |
  RELOC |
  METADUMP |
  unknown flag: 0x7edcc410b742c79c )
magic   ..l~...q [DON'T MATCH]
fsid2cf827fa-7ab8-e290-b152-1735c2735a37
label   
.a.9.@.=4.#.|.D...]..dh=d,..k..n..~.5.i.8...(.._.tl.a.@..2..qidj.>Hy.U..{X5.kG0.)t..;/.2...@.T.|.u.<.`!J*9./8...&.g\.V...*.,/95.uEs..W.i..z..h...n(...VGn^F...H...5.DT..3.A..mK...~..}.1..n.
generation  1769598730239175261
root14863846352370317867
sys_array_size  1744503544
chunk_root_generation   18100024505086712407
root_level  79
chunk_root  10848092274453435018
chunk_root_level156
log_root7514172289378668244
log_root_transid6227239369566282426
log_root_level  18
total_bytes 5481087866519986730
bytes_used  13216280034370888020
sectorsize  4102056786
nodesize1038279258
leafsize276348297
stripesize  2473897044
root_dir12090183195204234845
num_devices 12836127619712721941
compat_flags0xf98ff436fc954bd4
compat_ro_flags 0x3fe8246616164da7
( FREE_SPACE_TREE |
  FREE_SPACE_TREE_VALID |
  unknown flag: 0x3fe8246616164da4 )
incompat_flags  0x3989a5037330bfd8
( COMPRESS_LZO |
  COMPRESS_LZOv2 |
  EXTENDED_IREF |
  RAID56 |
  SKINNY_METADATA |
  NO_HOLES |
  unknown flag: 0x3989a5037330bc10 )
cache_generation10789185961859482334
uuid_tree_generation14921288820846890813
dev_item.uuid   e6e382b3-de66-4c25-7cc9-3cc43cde9c24
dev_item.fsid   f8430e37-12ca-adaf-b038-f0ee10ce6327 [DON'T MATCH]
dev_item.type   7909001383421391155
dev_item.total_bytes4839925749276763097
dev_item.bytes_used 14330418354255459170
dev_item.io_align   4136652250
dev_item.io_width   1113335506
dev_item.sector_size1197062542
dev_item.devid  16559830033162408461
dev_item.dev_group  3271056113

Re: Please help with exact actions for raid1 hot-swap

2017-09-12 Thread Austin S. Hemmelgarn

On 2017-09-11 17:33, Duncan wrote:

Austin S. Hemmelgarn posted on Mon, 11 Sep 2017 11:11:01 -0400 as
excerpted:


On 2017-09-11 09:16, Marat Khalili wrote:

Patrik, Duncan, thank you for the help. The `btrfs replace start
/dev/sdb7 /dev/sdd7 /mnt/data` worked without a hitch (though I didn't
try to reboot yet, still have grub/efi/several mdadm partitions to
copy).



Does this mean:
* I should not be afraid to reboot and find /dev/sdb7 mounted again?
* I will not be able to easily mount /dev/sdb7 on a different computer
to do some tests?

This depends.  I don't remember if the replace command wipes the
super-block on the old device after the replace completes or not.


AFAIK it does.

Based on checking after I sent my reply, it does.



If it
does not, then you can't safely mount the filesystem while that device
is still in the system, but can transfer it to another system and mount
it degraded (probably, not a certainty).


It's worth noting that while this shouldn't be a problem here (because
the magic should be gone), the problem does appear in other contexts.  In
particular, any context that does device duplication is a problem.

This means dd-ing the content of a device to another device is a problem,
because once btrfs device scan is triggered (and udev can trigger it
automatically/unexpectedly), btrfs will see the second device and
consider it part of the same filesystem as the first, causing problems if
either one is mounted.

dd-ing to a file tends to be less of a problem, because it's just a file
until activated as a loopback device, and that doesn't tend to happen
automatically.

Similarly, lvm's device mirroring modes can be problematic, with udev
again sometimes unexpectedly triggering btrfs device scan on device
appearance, unless measures are taken to hide the new device.  I tried
lvm some time ago and decided I didn't find it useful for my own use-
cases, so I don't know the details here, in particular, I'm not sure of
the device hiding options, but there have certainly been threads on the
list discussing the problem and the option to hide the device to prevent
it came up in one of them.
Based on my own experience, LVM works fine as of right now provided you 
use the standard LVM udev rules (which disable almost all udev 
processing on LVM internal devices).  In fact, the only issues I've had 
in the past with BTRFS on LVM were related to dm-cache not properly 
hiding the backing device originally, and some generic stability issues 
early on with BTRFS on top of dm-thinp



if it does, then you can
safely keep the device in the system, but won't be able to move it to
another computer and get data off of it.


This should be the case.  Tho it may be as simple as restoring the btrfs
magic in the superblock to restore it to mountability, but I believe the
replace process deletes chunks as they are transferred, so actually
getting data off it may be more complicated than simply making it
mountable again.


Regardless of which is the
case, you won't see /dev/sdb7 mounted as a separate filesystem when you
reboot.



Also, although /dev/sdd7 is much larger than /dev/sdb7 was, `btrfs fi
show` still displays it as 2.71TiB, why?



`btrfs replace` is functionally equivalent to using dd to copy the
contents of the device being replaced to the new device, albeit a bit
smarter (as mentioned above).  This means in particular that it does not
resize the filesystem (although i think I saw some discussion and
possibly patches to handle that with a command-line option).


This is documented.  From the btrfs-replace manpage (from btrfs-progs
4.12, reformatted a bit here for posting):





The <targetdev> needs to be same size or larger than the <srcdev>.

Note:
The filesystem has to be resized to fully take advantage of a larger
target device, this can be achieved with

btrfs filesystem resize <devid>:max /path

<<<<<<





Re: Please help with exact actions for raid1 hot-swap

2017-09-11 Thread Duncan
Austin S. Hemmelgarn posted on Mon, 11 Sep 2017 11:11:01 -0400 as
excerpted:

> On 2017-09-11 09:16, Marat Khalili wrote:
>> Patrik, Duncan, thank you for the help. The `btrfs replace start
>> /dev/sdb7 /dev/sdd7 /mnt/data` worked without a hitch (though I didn't
>> try to reboot yet, still have grub/efi/several mdadm partitions to
>> copy).

>> Does this mean:
>> * I should not be afraid to reboot and find /dev/sdb7 mounted again?
>> * I will not be able to easily mount /dev/sdb7 on a different computer
>> to do some tests?
> This depends.  I don't remember if the replace command wipes the
> super-block on the old device after the replace completes or not.

AFAIK it does.

> If it
> does not, then you can't safely mount the filesystem while that device
> is still in the system, but can transfer it to another system and mount
> it degraded (probably, not a certainty).

It's worth noting that while this shouldn't be a problem here (because 
the magic should be gone), the problem does appear in other contexts.  In 
particular, any context that does device duplication is a problem.

This means dd-ing the content of a device to another device is a problem, 
because once btrfs device scan is triggered (and udev can trigger it 
automatically/unexpectedly), btrfs will see the second device and 
consider it part of the same filesystem as the first, causing problems if 
either one is mounted.
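If you do end up with such a dd'd clone, one possible way to keep it from 
being scanned in (a suggestion added here as illustration, not something 
from the original discussion) is to wipe the filesystem signature on the 
copy before anything mounts it:

wipefs -a /dev/sdX    # clears the btrfs magic (and any other signatures) on the clone only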

dd-ing to a file tends to be less of a problem, because it's just a file 
until activated as a loopback device, and that doesn't tend to happen 
automatically.

Similarly, lvm's device mirroring modes can be problematic, with udev 
again sometimes unexpectedly triggering btrfs device scan on device 
appearance, unless measures are taken to hide the new device.  I tried 
lvm some time ago and decided I didn't find it useful for my own use-
cases, so I don't know the details here, in particular, I'm not sure of 
the device hiding options, but there have certainly been threads on the 
list discussing the problem and the option to hide the device to prevent 
it came up in one of them.

> if it does, then you can
> safely keep the device in the system, but won't be able to move it to
> another computer and get data off of it.

This should be the case.  Tho it may be as simple as restoring the btrfs 
magic in the superblock to restore it to mountability, but I believe the 
replace process deletes chunks as they are transferred, so actually 
getting data off it may be more complicated than simply making it 
mountable again.

> Regardless of which is the
> case, you won't see /dev/sdb7 mounted as a separate filesystem when you
> reboot.

>> Also, although /dev/sdd7 is much larger than /dev/sdb7 was, `btrfs fi
>> show` still displays it as 2.71TiB, why?

> `btrfs replace` is functionally equivalent to using dd to copy the
> contents of the device being replaced to the new device, albeit a bit
> smarter (as mentioned above).  This means in particular that it does not
> resize the filesystem (although i think I saw some discussion and
> possibly patches to handle that with a command-line option).

This is documented.  From the btrfs-replace manpage (from btrfs-progs 
4.12, reformatted a bit here for posting):

>>>>>>

The <targetdev> needs to be same size or larger than the <srcdev>.

Note:
The filesystem has to be resized to fully take advantage of a larger 
target device, this can be achieved with

btrfs filesystem resize <devid>:max /path

<<<<<<

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: Please help with exact actions for raid1 hot-swap

2017-09-11 Thread Austin S. Hemmelgarn

On 2017-09-11 09:16, Marat Khalili wrote:
Patrik, Duncan, thank you for the help. The `btrfs replace start 
/dev/sdb7 /dev/sdd7 /mnt/data` worked without a hitch (though I didn't 
try to reboot yet, still have grub/efi/several mdadm partitions to copy).


It also worked much faster than mdadm would take, apparently only moving 
126GB used, not 2.71TB total.
This is why replace is preferred over add/remove.  The replace operation 
only copies exactly the data that is needed off of the old device, 
instead of copying the whole device like LVM and MD need to, or 
rewriting the whole filesystem (like add/remove does).


For what it's worth, if you can't use replace for some reason and have 
to use add and remove, it is more efficient to add the new device and 
then remove the old one, because it will require less data movement to 
get a properly balanced filesystem (removing a device is actually a 
balance operation that prevents writes to the device being removed).
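As a sketch of that fallback sequence, using the device names from this 
thread (only if replace is not an option for some reason):

btrfs device add /dev/sdd7 /mnt/data
btrfs device remove /dev/sdb7 /mnt/data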
Interestingly, according to HDD lights it 
mostly read from the remaining /dev/sda, not from replaced /dev/sdb 
(which must be completely readable now according to smartctl -- 
problematic sector got finally remapped after ~1day).
This is odd.  I was under the impression that replace preferentially 
reads from the device being replaced unless you tell it to avoid reading 
from said device.


It now looks like follows:


$ sudo blkid /dev/sda7 /dev/sdb7 /dev/sdd7
/dev/sda7: LABEL="data" UUID="37d3313a-e2ad-4b7f-98fc-a01d815952e0" 
UUID_SUB="db644855-2334-4d61-a27b-9a591255aa39" TYPE="btrfs" 
PARTUUID="c5ceab7e-e5f8-47c8-b922-c5fa0678831f"

/dev/sdb7: PARTUUID="493923cd-9ecb-4ee8-988b-5d0bfa8991b3"
/dev/sdd7: LABEL="data" UUID="37d3313a-e2ad-4b7f-98fc-a01d815952e0" 
UUID_SUB="9c2f05e9-5996-479f-89ad-f94f7ce130e6" TYPE="btrfs" 
PARTUUID="178cd274-7251-4d25-9116-ce0732d2410b"

$ sudo btrfs fi show /dev/sdb7
ERROR: no btrfs on /dev/sdb7
$ sudo btrfs fi show /dev/sdd7
Label: 'data'  uuid: 37d3313a-e2ad-4b7f-98fc-a01d815952e0
    Total devices 2 FS bytes used 108.05GiB
    devid    1 size 2.71TiB used 131.03GiB path /dev/sda7
    devid    2 size 2.71TiB used 131.03GiB path /dev/sdd7

Does this mean:
* I should not be afraid to reboot and find /dev/sdb7 mounted again?
* I will not be able to easily mount /dev/sdb7 on a different computer 
to do some tests?
This depends.  I don't remember if the replace command wipes the 
super-block on the old device after the replace completes or not.  If it 
does not, then you can't safely mount the filesystem while that device 
is still in the system, but can transfer it to another system and mount 
it degraded (probably, not a certainty).  if it does, then you can 
safely keep the device in the system, but won't be able to move it to 
another computer and get data off of it.  Regardless of which is the 
case, you won't see /dev/sdb7 mounted as a separate filesystem when you 
reboot.


Also, although /dev/sdd7 is much larger than /dev/sdb7 was, `btrfs fi 
show` still displays it as 2.71TiB, why?
`btrfs replace` is functionally equivalent to using dd to copy the 
contents of the device being replaced to the new device, albeit a bit 
smarter (as mentioned above).  This means in particular that it does not 
resize the filesystem (although i think I saw some discussion and 
possibly patches to handle that with a command-line option).
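For this particular filesystem the follow-up resize would look something 
like the following sketch (devid 2 is the new /dev/sdd7 according to the 
btrfs fi show output above):

btrfs filesystem resize 2:max /mnt/data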



Re: Please help with exact actions for raid1 hot-swap

2017-09-11 Thread Marat Khalili
Patrik, Duncan, thank you for the help. The `btrfs replace start 
/dev/sdb7 /dev/sdd7 /mnt/data` worked without a hitch (though I didn't 
try to reboot yet, still have grub/efi/several mdadm partitions to copy).


It also worked much faster than mdadm would take, apparently only moving 
126GB used, not 2.71TB total. Interestingly, according to HDD lights it 
mostly read from the remaining /dev/sda, not from replaced /dev/sdb 
(which must be completely readable now according to smartctl -- 
problematic sector got finally remapped after ~1day).


It now looks like follows:


$ sudo blkid /dev/sda7 /dev/sdb7 /dev/sdd7
/dev/sda7: LABEL="data" UUID="37d3313a-e2ad-4b7f-98fc-a01d815952e0" 
UUID_SUB="db644855-2334-4d61-a27b-9a591255aa39" TYPE="btrfs" 
PARTUUID="c5ceab7e-e5f8-47c8-b922-c5fa0678831f"

/dev/sdb7: PARTUUID="493923cd-9ecb-4ee8-988b-5d0bfa8991b3"
/dev/sdd7: LABEL="data" UUID="37d3313a-e2ad-4b7f-98fc-a01d815952e0" 
UUID_SUB="9c2f05e9-5996-479f-89ad-f94f7ce130e6" TYPE="btrfs" 
PARTUUID="178cd274-7251-4d25-9116-ce0732d2410b"

$ sudo btrfs fi show /dev/sdb7
ERROR: no btrfs on /dev/sdb7
$ sudo btrfs fi show /dev/sdd7
Label: 'data'  uuid: 37d3313a-e2ad-4b7f-98fc-a01d815952e0
Total devices 2 FS bytes used 108.05GiB
devid1 size 2.71TiB used 131.03GiB path /dev/sda7
devid2 size 2.71TiB used 131.03GiB path /dev/sdd7

Does this mean:
* I should not be afraid to reboot and find /dev/sdb7 mounted again?
* I will not be able to easily mount /dev/sdb7 on a different computer 
to do some tests?


Also, although /dev/sdd7 is much larger than /dev/sdb7 was, `btrfs fi 
show` still displays it as 2.71TiB, why?


--

With Best Regards,
Marat Khalili



Re: Please help with exact actions for raid1 hot-swap

2017-09-11 Thread Austin S. Hemmelgarn

On 2017-09-10 02:33, Marat Khalili wrote:

It doesn't need replaced disk to be readable, right? Then what prevents same 
procedure to work without a spare bay?


In theory, nothing.

In practice, there are reliability issues with mounting a filesystem 
degraded (and you should be avoiding running any array degraded, 
regardless of if it's BTRFS or actual RAID (be that LVM, MD, or 
hardware)).  It's also significantly faster to do it with a spare drive 
bay because that will just read from the device being replaced and copy 
data directly, while pulling the device to be replaced requires 
rebuilding the data (there is more involved than just copying, even with 
a raid1 profile).
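For reference, the degraded mount being discussed looks like this (a 
sketch only, with a placeholder device name):

mount -o degraded /dev/sdX /mnt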



Re: Help me understand what is going on with my RAID1 FS

2017-09-11 Thread FLJ
Thanks everyone for the helpful and detailed responses.
Now that you confirmed that everything is fine with my FS, I'm all
relaxed because I can for sure live with the output of df.



On Mon, Sep 11, 2017 at 5:29 AM, Andrei Borzenkov  wrote:
> 10.09.2017 23:17, Dmitrii Tcvetkov wrote:
 Drive1  Drive2  Drive3
 X       X
 X               X
         X       X

 Where X is a chunk of raid1 block group.
>>>
>>> But this table clearly shows that adding third drive increases free
>>> space by 50%. You need to reallocate data to actually make use of it,
>>> but it was done in this case.
>>
>> It increases it but I don't see how this space is in any way useful
>> unless data is in single profile. After full balance chunks will be
>> spread over 3 devices, how it helps in raid1 data profile case?
>>
> A1 A2  =>  A1 A2 -  =>  A1 A2 B1  =>  A1 A2 B1
> B1 B2      B1 B2 -      -  B2 -       C1 B2 C2
>
> It is raid1 profile on three disks fully utilizing them (assuming equal
> sizes of course). Where "raid1" means - each data block has two copies
> on different devices.


Re: Help me understand what is going on with my RAID1 FS

2017-09-10 Thread Andrei Borzenkov
10.09.2017 23:17, Dmitrii Tcvetkov wrote:
>>> Drive1  Drive2  Drive3
>>> X       X
>>> X               X
>>>         X       X
>>>
>>> Where X is a chunk of raid1 block group.  
>>
>> But this table clearly shows that adding third drive increases free
>> space by 50%. You need to reallocate data to actually make use of it,
>> but it was done in this case.
> 
> It increases it but I don't see how this space is in any way useful
> unless data is in single profile. After full balance chunks will be
> spread over 3 devices, how it helps in raid1 data profile case?
> 
A1 A2  =>  A1 A2 -  =>  A1 A2 B1  =>  A1 A2 B1
B1 B2      B1 B2 -      -  B2 -       C1 B2 C2

It is raid1 profile on three disks fully utilizing them (assuming equal
sizes of course). Where "raid1" means - each data block has two copies
on different devices.


Re: Help me understand what is going on with my RAID1 FS

2017-09-10 Thread Duncan
FLJ posted on Sun, 10 Sep 2017 15:45:42 +0200 as excerpted:

> I have a BTRFS RAID1 volume running for the past year. I avoided all
> pitfalls known to me that would mess up this volume. I never
> experimented with quotas, no-COW, snapshots, defrag, nothing really.
> The volume is a RAID1 from day 1 and is working reliably until now.
> 
> Until yesterday it consisted of two 3 TB drives, something along the
> lines:
> 
> Label: 'BigVault'  uuid: a37ad5f5-a21b-41c7-970b-13b6c4db33db
> Total devices 2 FS bytes used 2.47TiB
> devid1 size 2.73TiB used 2.47TiB path /dev/sdb
> devid2 size 2.73TiB used 2.47TiB path /dev/sdc

I'm going to try a different approach than I see in the two existing 
subthreads, so I started from scratch with my own subthread...

So the above looks reasonable so far...

> 
> Yesterday I've added a new drive to the FS and did a full rebalance
> (without filters) over night, which went through without any issues.
> 
> Now I have:
>  Label: 'BigVault'  uuid: a37ad5f5-a21b-41c7-970b-13b6c4db33db
> Total devices 3 FS bytes used 2.47TiB
> devid1 size 2.73TiB used 1.24TiB path /dev/sdb
> devid2 size 2.73TiB used 1.24TiB path /dev/sdc
> devid3 size 7.28TiB used 2.48TiB path /dev/sda

That's exactly as expected, after a balance.

Note the size, 2.73 TiB (twos-power) for the smaller two, not 3 (tho it's 
probably 3 TB, tens-power), 7.28 TiB, not 8, for the larger one.

The most-free-space chunk allocation, with raid1-paired chunks, means the 
first chunk of every pair will get allocated to the largest, 7.28 TiB 
device.  The other two devices are equal in size, 2.73 TiB each, and the 
second chunk can't get allocated to the largest device as only one chunk 
of the pair can go there, so the allocator will in general alternate 
allocations from the smaller two, for the second chunk of each pair.  (I 
say in general, because metadata chunks are smaller than data chunks, so 
it's possible that two chunks in a row, a metadata chunk and a data 
chunk, will be allocated from the same device, before it switches to the 
other.)

Because the larger device is larger than the other two combined, it'll 
always get one copy, while the others fill up evenly at half the usage of 
the larger device, until both smaller devices are full, at which point 
you won't be able to allocate further raid1 chunks and you'll ENOSPC.
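Putting rough numbers on that (simple arithmetic from the sizes above): 
the two 2.73 TiB devices together can hold 2.73 + 2.73 = 5.46 TiB of chunk 
copies, and every raid1 chunk needs one copy there and one on the 7.28 TiB 
device, so at most about 5.46 TiB of data fits, leaving roughly 
7.28 - 5.46 = 1.82 TiB of the large device that can never be paired.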

> # btrfs fi df /mnt/BigVault/
> Data, RAID1: total=2.47TiB, used=2.47TiB
> System, RAID1: total=32.00MiB, used=384.00KiB
> Metadata, RAID1: total=4.00GiB, used=2.74GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B

Still looks reasonable.

Note that assuming you're using a reasonably current btrfs-progs, there's 
also the btrfs fi usage and btrfs dev usage commands.  Btrfs fi df is an 
older form that has much less information than the fi and dev usage 
commands, tho between btrfs fi show and btrfs fi df, /most/ of the 
filesystem-level information in btrfs fi usage can be deduced, tho not 
necessarily the device-level detail.  Btrfs fi usage is thus preferred, 
assuming it's available to you.  (In addition to btrfs fi usage being 
newer, both it and btrfs fi df require a mounted btrfs.  If the 
filesystem refuses to mount, btrfs fi show may be all that's available.)
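For example (paths from this thread; output omitted here):

btrfs filesystem usage /mnt/BigVault
btrfs device usage /mnt/BigVault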

While I'm digressing, I'm guessing you know this already, but for others, 
global reserve is reserved from and comes out of metadata, so you can add 
global reserve total to metadata used.  Normally, btrfs won't use 
anything from the global reserve, so usage there will be zero.  If it's 
not, that's a very strong indication that your filesystem believes it is 
very short on space (even if data and metadata say they both have lots of 
unused space left, for some reason, very likely a bug in that case, the 
filesystem believes otherwise) and you need to take corrective action 
immediately, or risk the filesystem effectively going read-only when 
nothing else can be written.
 
> But still df -h is giving me:
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/sdb 6.4T  2.5T  1.5T  63% /mnt/BigVault
> 
> Although I've heard and read about the difficulty in reporting free
> space due to the flexibility of BTRFS, snapshots and subvolumes, etc.,
> but I only have a single volume, no subvolumes, no snapshots, no quotas
> and both data and metadata are RAID1.

The most practical advice I've seen regarding "normal" df (that is, the 
one from coreutils, not btrfs fi df) in the case of uneven device sizes 
in particular, is simply ignore its numbers -- they're not reliable.  The 
only thing you need to be sure of is that it says you have enough space 
for whatever you're actually doing ATM, since various applications will 
trust its numbers and may refuse to do whatever filesystem operation at 
all, if it says there's not enough space.

The algorithm reasonably new coreutils df (and the kernel calls it 
depends on) uses is much better 

Re: Help me understand what is going on with my RAID1 FS

2017-09-10 Thread Kai Krakow
On Sun, 10 Sep 2017 20:15:52 +0200,
Ferenc-Levente Juhos wrote:

> >Problem is that each raid1 block group contains two chunks on two
> >separate devices, it can't utilize fully three devices no matter
> >what. If that doesn't suit you then you need to add 4th disk. After
> >that FS will be able to use all unallocated space on all disks in
> >raid1 profile. But even then you'll be able to safely lose only one
> >disk since BTRFS still will be storing only 2 copies of data.  
> 
> I hope I didn't say that I want to utilize all three devices fully. It
> was clear to me that there will be 2 TB of wasted space.
> Also I'm not questioning the chunk allocator for RAID1 at all. It's
> clear and it always has been clear that for RAID1 the chunks need to
> be allocated on different physical devices.
> If I understood Kai's point of view, he even suggested that I might
> need to do balancing to make sure that the free space on the three
> devices is being used smartly. Hence the questions about balancing.

It will allocate chunks from the device with the most space available.
So while you fill your disks, space usage will distribute evenly.

The problem comes when you start deleting stuff, some chunks may even
be freed, and everything becomes messed up. In an aging file system you
may notice that the chunks are no longer evenly distributed. A balance
is a way to fix that because it will reallocate chunks and coalesce
data back into single chunks, making free space for new allocations. In
this process it will actually evenly distribute your data again.

You may want to use this rebalance script:
https://www.spinics.net/lists/linux-btrfs/msg52076.html
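For a quick manual pass without the script, a usage-filtered balance is 
often enough (the 50% thresholds below are only an illustration, tune them 
to taste):

btrfs balance start -dusage=50 -musage=50 /mnt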

> I mean in worst case it could happen like this:
> 
> Again I have disks of sizes 3, 3, 8:
> Fig.1
> Drive1(8) Drive2(3) Drive3(3)
>  -        X1        X1
>  -        X2        X2
>  -        X3        X3
> Here the new drive is completely unused. Even if one X1 chunk would be
> on Drive1 it would be still a sub-optimal allocation.

This won't happen while filling a fresh btrfs. Chunks are always
allocated from a device with most free space (within the raid1
constraints). Thus it will allocate space alternating between disk1+2
and disk1+3.

> This is the optimal allocation. Will btrfs allocate like this?
> Considering that Drive1 has the most free space.
> Fig. 2
> Drive1(8) Drive2(3) Drive3(3)
> X1        X1        -
> X2        -         X2
> X3        X3        -
> X4        -         X4

Yes.

> From my point of view Fig.2 shows the optimal allocation, by the time
> the disks Drive2 and Drive3 are full (3TB) Drive1 must have 6TB
> (because it is exclusively holding the mirrors for both Drive2 and 3).
> For sure now btrfs can say, since two of the drives are completely
> full he can't allocate any more chunks and the remaining 2 TB of space
> from Drive1 is wasted. This is clear it's even pointed out by the
> btrfs size calculator.

Yes.


> But again if the above statements are true, then df might as well tell
> the "truth" and report that I have 3.5 TB space free and not 1.5TB (as
> it is reported now). Again here I fully understand Kai's explanation.
> Because coming back to my first e-mail, my "problem" was that df is
> reporting 1.5 TB free, whereas the whole FS holds 2.5 TB of data.

The size calculator has undergone some revisions. I think it currently
estimates the free space from net data to raw data ratio across all
devices, taking the current raid constraints into account.

Calculating free space in btrfs is difficult because in the future
btrfs may even support different raid levels for different sub volumes.
It's probably best to calculate for the worst case scenario then.

Even today it's already difficult if you use different raid levels for
meta data and content data: The filesystem cannot predict the future of
allocations. It can only give an educated guess. And the calculation
was revised a few times to not "overshoot".


> So the question still remains, is it just that df is intentionally not
> smart enough to give a more accurate estimation,

The df utility doesn't know anything about btrfs allocations. The value
is estimated by btrfs itself. To get more detailed info for capacity
planning, you should use "btrfs fi df" and its various siblings.

> or is the assumption
> that the allocator picks the drive with most free space mistaken?
> If I continue along the lines of what Kai said, and I need to do
> re-balance, because the allocation is not like shown above (Fig.2),
> then my question is still legitimate. Are there any filters that one
> might use to speed up or to selectively balance in my case? or will I
> need to do full balance?

Your assumption is misguided. The total free space estimation is a
totally different thing than what the allocator bases its decision on.
See "btrfs dev usage". The allocator uses space from the biggest
unallocated space 

Re: Help me understand what is going on with my RAID1 FS

2017-09-10 Thread Dmitrii Tcvetkov
> > Drive1  Drive2  Drive3
> > X       X
> > X               X
> >         X       X
> > 
> > Where X is a chunk of raid1 block group.  
> 
> But this table clearly shows that adding third drive increases free
> space by 50%. You need to reallocate data to actually make use of it,
> but it was done in this case.

It increases it but I don't see how this space is in any way useful
unless data is in single profile. After full balance chunks will be
spread over 3 devices, how it helps in raid1 data profile case?


Re: Help me understand what is going on with my RAID1 FS

2017-09-10 Thread Andrei Borzenkov
10.09.2017 19:11, Dmitrii Tcvetkov wrote:
>> Actually based on http://carfax.org.uk/btrfs-usage/index.html I
>> would've expected 6 TB of usable space. Here I get 6.4 which is odd,
>> but that only 1.5 TB is available is even stranger.
>>
>> Could anyone explain what I did wrong or why my expectations are wrong?
>>
>> Thank you in advance
> 
> I'd say df and the website calculate different things. In btrfs raid1 profile 
> stores exactly 2 copies of data, each copy is on separate device. 
> So by adding third drive, no matter how big, effective free space didn't 
> expand because btrfs still needs space on any one of other two drives to 
> store second half of each raid1 chunk stored on that third drive. 
> 
> Basically:
> 
> Drive1  Drive2  Drive3
> X       X
> X               X
>         X       X
> 
> Where X is a chunk of raid1 block group.

But this table clearly shows that adding third drive increases free
space by 50%. You need to reallocate data to actually make use of it,
but it was done in this case.


Re: Help me understand what is going on with my RAID1 FS

2017-09-10 Thread Andrei Borzenkov
10.09.2017 18:47, Kai Krakow wrote:
> On Sun, 10 Sep 2017 15:45:42 +0200,
> FLJ wrote:
> 
>> Hello all,
>>
>> I have a BTRFS RAID1 volume running for the past year. I avoided all
>> pitfalls known to me that would mess up this volume. I never
>> experimented with quotas, no-COW, snapshots, defrag, nothing really.
>> The volume is a RAID1 from day 1 and is working reliably until now.
>>
>> Until yesterday it consisted of two 3 TB drives, something along the
>> lines:
>>
>> Label: 'BigVault'  uuid: a37ad5f5-a21b-41c7-970b-13b6c4db33db
>> Total devices 2 FS bytes used 2.47TiB
>> devid1 size 2.73TiB used 2.47TiB path /dev/sdb
>> devid2 size 2.73TiB used 2.47TiB path /dev/sdc
>>
>> Yesterday I've added a new drive to the FS and did a full rebalance
>> (without filters) over night, which went through without any issues.
>>
>> Now I have:
>>  Label: 'BigVault'  uuid: a37ad5f5-a21b-41c7-970b-13b6c4db33db
>> Total devices 3 FS bytes used 2.47TiB
>> devid1 size 2.73TiB used 1.24TiB path /dev/sdb
>> devid2 size 2.73TiB used 1.24TiB path /dev/sdc
>> devid3 size 7.28TiB used 2.48TiB path /dev/sda
>>
>> # btrfs fi df /mnt/BigVault/
>> Data, RAID1: total=2.47TiB, used=2.47TiB
>> System, RAID1: total=32.00MiB, used=384.00KiB
>> Metadata, RAID1: total=4.00GiB, used=2.74GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> But still df -h is giving me:
>> Filesystem   Size  Used Avail Use% Mounted on
>> /dev/sdb 6.4T  2.5T  1.5T  63% /mnt/BigVault
>>
>> Although I've heard and read about the difficulty in reporting free
>> space due to the flexibility of BTRFS, snapshots and subvolumes, etc.,
>> but I only have a single volume, no subvolumes, no snapshots, no
>> quotas and both data and metadata are RAID1.
>>
>> My expectation would've been that in case of BigVault Size == Used +
>> Avail.
>>
>> Actually based on http://carfax.org.uk/btrfs-usage/index.html I
>> would've expected 6 TB of usable space. Here I get 6.4 which is odd,

Total size is an estimation, which in this case is computed as (sum of
device sizes)/2, approximately 6.4TiB.

>> but that only 1.5 TB is available is even stranger.
>>
>> Could anyone explain what I did wrong or why my expectations are
>> wrong?
>>
>> Thank you in advance
> 
> Btrfs reports estimated free space from the free space of the smallest
> member as it can only guarantee that.

It's not exactly true. For three devices with free space of 1TiB, 2TiB
and 3TiB it would return 2TiB as available space. But it is not
sophisticated enough to notice that it actually has 3TiB available.

I wonder if this is only the free space calculation, or whether the actual
allocation algorithm behaves similarly (effectively ignoring part of the
available space).



Re: Help me understand what is going on with my RAID1 FS

2017-09-10 Thread Ferenc-Levente Juhos
>Problem is that each raid1 block group contains two chunks on two
>separate devices, it can't utilize fully three devices no matter what.
>If that doesn't suit you then you need to add 4th disk. After
>that FS will be able to use all unallocated space on all disks in raid1
>profile. But even then you'll be able to safely lose only one disk
>since BTRFS still will be storing only 2 copies of data.

I hope I didn't say that I want to utilize all three devices fully. It
was clear to me that there will be 2 TB of wasted space.
Also I'm not questioning the chunk allocator for RAID1 at all. It's
clear and it always has been clear that for RAID1 the chunks need to
be allocated on different physical devices.
If I understood Kai's point of view, he even suggested that I might
need to do balancing to make sure that the free space on the three
devices is being used smartly. Hence the questions about balancing.

I mean in worst case it could happen like this:

Again I have disks of sizes 3, 3, 8:
Fig.1
Drive1(8) Drive2(3) Drive3(3)
 -        X1        X1
 -        X2        X2
 -        X3        X3
Here the new drive is completely unused. Even if one X1 chunk would be
on Drive1 it would be still a sub-optimal allocation.

This is the optimal allocation. Will btrfs allocate like this?
Considering that Drive1 has the most free space.
Fig. 2
Drive1(8) Drive2(3) Drive3(3)
X1        X1        -
X2        -         X2
X3        X3        -
X4        -         X4

From my point of view Fig.2 shows the optimal allocation, by the time
the disks Drive2 and Drive3 are full (3TB) Drive1 must have 6TB
(because it is exclusively holding the mirrors for both Drive2 and 3).
For sure now btrfs can say, since two of the drives are completely
full he can't allocate any more chunks and the remaining 2 TB of space
from Drive1 is wasted. This is clear it's even pointed out by the
btrfs size calculator.

But again if the above statements are true, then df might as well tell
the "truth" and report that I have 3.5 TB space free and not 1.5TB (as
it is reported now). Again here I fully understand Kai's explanation.
Because coming back to my first e-mail, my "problem" was that df is
reporting 1.5 TB free, whereas the whole FS holds 2.5 TB of data.

So the question still remains: is it just that df is intentionally not
smart enough to give a more accurate estimate, or is the assumption
that the allocator picks the drive with the most free space mistaken?
If I continue along the lines of what Kai said, and I need to
re-balance because the allocation is not like the one shown above
(Fig. 2), then my question is still legitimate. Are there any filters
that one might use to speed up or to selectively balance in my case,
or will I need to do a full balance?
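
A quick way to sanity-check this is to simulate the "most unallocated
space first" heuristic quoted below. This is only a sketch under that
assumption (1GiB data chunks, metadata ignored), not the actual kernel
allocator:

def simulate_raid1(unallocated_gib):
    # place 1GiB chunk pairs on the two devices with the most unallocated
    # space until fewer than two devices have room left
    dev = list(unallocated_gib)
    placed = [0] * len(dev)
    while sum(1 for d in dev if d >= 1) >= 2:
        a, b = sorted(range(len(dev)), key=lambda i: dev[i])[-2:]
        for i in (a, b):
            dev[i] -= 1
            placed[i] += 1
    return placed, dev

placed, left = simulate_raid1([8192, 3072, 3072])   # 8, 3 and 3 TiB in GiB
print(placed)   # [6144, 3072, 3072]: the 8 TiB drive is in every block group
print(left)     # [2048, 0, 0]: 2 TiB on it stays unallocated, as in Fig. 2

Under that heuristic the big drive ends up in every block group, which
is exactly the Fig. 2 layout.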

On Sun, Sep 10, 2017 at 7:19 PM, Dmitrii Tcvetkov  wrote:
>> @Kai and Dmitrii
>> thank you for your explanations if I understand you correctly, you're
>> saying that btrfs makes no attempt to "optimally" use the physical
>> devices it has in the FS, once a new RAID1 block group needs to be
>> allocated it will semi-randomly pick two devices with enough space and
>> allocate two equal sized chunks, one on each. This new chunk may or
>> may not fall onto my newly added 8 TB drive. Am I understanding this
>> correctly?
> If I remember correctly chunk allocator allocates new chunks on device
> which has the most unallocated space.
>
>> Is there some sort of balance filter that would speed up this sort of
>> balancing? Will balance be smart enough to make the "right" decision?
>> As far as I read the chunk allocator used during balance is the same
>> that is used during normal operation. If the allocator is already
>> sub-optimal during normal operations, what's the guarantee that it
>> will make a "better" decision during balancing?
>
> I don't really see any way that being possible in raid1 profile. How
> can you fill all three devices if you can split data only twice? There
> will be moment when two of three disks are full and BTRFS can't
> allocate new raid1 block group because it has only one drive with
> unallocated space.
>
>>
>> When I say "right" and "better" I mean this:
>> Drive1(8) Drive2(3) Drive3(3)
>> X1        X1
>> X2                  X2
>> X3        X3
>> X4                  X4
>> I was convinced until now that the chunk allocator at least tries a
>> best possible allocation. I'm sure it's complicated to develop a
>> generic algorithm to fit all setups, but it should be possible.
>
>
> Problem is that each raid1 block group contains two chunks on two
> separate devices, it can't utilize fully three devices no matter what.
> If that doesn't suit you then you need to add 4th disk. After
> that FS will be able to use all unallocated space on all disks in raid1
> profile. But even then you'll be able to safely lose only one disk
> since BTRFS still will be storing only 2 copies of data.

Re: Help me understand what is going on with my RAID1 FS

2017-09-10 Thread Dmitrii Tcvetkov
> @Kai and Dmitrii
> thank you for your explanations if I understand you correctly, you're
> saying that btrfs makes no attempt to "optimally" use the physical
> devices it has in the FS, once a new RAID1 block group needs to be
> allocated it will semi-randomly pick two devices with enough space and
> allocate two equal sized chunks, one on each. This new chunk may or
> may not fall onto my newly added 8 TB drive. Am I understanding this
> correctly?
If I remember correctly, the chunk allocator allocates new chunks on the
device which has the most unallocated space.

> Is there some sort of balance filter that would speed up this sort of
> balancing? Will balance be smart enough to make the "right" decision?
> As far as I read the chunk allocator used during balance is the same
> that is used during normal operation. If the allocator is already
> sub-optimal during normal operations, what's the guarantee that it
> will make a "better" decision during balancing?

I don't really see any way for that to be possible with the raid1
profile. How can you fill all three devices if data is only ever split
across two of them? There will be a moment when two of the three disks
are full and BTRFS can't allocate a new raid1 block group because it
has only one drive with unallocated space.

> 
> When I say "right" and "better" I mean this:
> Drive1(8) Drive2(3) Drive3(3)
> X1        X1
> X2                  X2
> X3        X3
> X4                  X4
> I was convinced until now that the chunk allocator at least tries a
> best possible allocation. I'm sure it's complicated to develop a
> generic algorithm to fit all setups, but it should be possible.
 

The problem is that each raid1 block group contains two chunks on two
separate devices, so it can't fully utilize three devices no matter what.
If that doesn't suit you then you need to add a 4th disk. After that
the FS will be able to use all unallocated space on all disks in the
raid1 profile. But even then you'll only be able to safely lose one
disk, since BTRFS will still be storing only 2 copies of the data.

This behavior is not relevant for single or raid0 profiles of
multidevice BTRFS filesystems.
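
To put numbers on the 4th-disk point, here is a rough pairing bound,
min(total/2, total - largest); this is only a sketch that ignores chunk
granularity and assumes ideal placement:

def raid1_usable(sizes_tb):
    # usable data capacity with 2 copies, each copy on a distinct device
    return min(sum(sizes_tb) / 2, sum(sizes_tb) - max(sizes_tb))

print(raid1_usable([3, 3, 8]))      # 6.0: 2 TB of the big drive can never be used
print(raid1_usable([3, 3, 3, 8]))   # 8.5: with a 4th disk no raw space is wasted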


Re: Help me understand what is going on with my RAID1 FS

2017-09-10 Thread Ferenc-Levente Juhos
@Kai and Dmitrii
thank you for your explanations. If I understand you correctly, you're
saying that btrfs makes no attempt to "optimally" use the physical
devices it has in the FS: once a new RAID1 block group needs to be
allocated, it will semi-randomly pick two devices with enough space and
allocate two equal-sized chunks, one on each. This new chunk may or
may not fall onto my newly added 8 TB drive. Am I understanding this
correctly?
> You will probably need to
>run balance once in a while to evenly redistribute allocated chunks
>across all disks.

Is there some sort of balance filter that would speed up this sort of
balancing? Will balance be smart enough to make the "right" decision?
As far as I have read, the chunk allocator used during balance is the
same one that is used during normal operation.
sub-optimal during normal operations, what's the guarantee that it
will make a "better" decision during balancing?

When I say "right" and "better" I mean this:
Drive1(8) Drive2(3) Drive3(3)
X1        X1
X2                  X2
X3        X3
X4                  X4
I was convinced until now that the chunk allocator at least tries a
best possible allocation. I'm sure it's complicated to develop a
generic algorithm to fit all setups, but it should be possible.

On Sun, Sep 10, 2017 at 5:47 PM, Kai Krakow  wrote:
> On Sun, 10 Sep 2017 15:45:42 +0200, FLJ wrote:
>
>> Hello all,
>>
>> I have a BTRFS RAID1 volume running for the past year. I avoided all
>> pitfalls known to me that would mess up this volume. I never
>> experimented with quotas, no-COW, snapshots, defrag, nothing really.
>> The volume is a RAID1 from day 1 and is working reliably until now.
>>
>> Until yesterday it consisted of two 3 TB drives, something along the
>> lines:
>>
>> Label: 'BigVault'  uuid: a37ad5f5-a21b-41c7-970b-13b6c4db33db
>> Total devices 2 FS bytes used 2.47TiB
>> devid1 size 2.73TiB used 2.47TiB path /dev/sdb
>> devid2 size 2.73TiB used 2.47TiB path /dev/sdc
>>
>> Yesterday I've added a new drive to the FS and did a full rebalance
>> (without filters) over night, which went through without any issues.
>>
>> Now I have:
>>  Label: 'BigVault'  uuid: a37ad5f5-a21b-41c7-970b-13b6c4db33db
>> Total devices 3 FS bytes used 2.47TiB
>> devid1 size 2.73TiB used 1.24TiB path /dev/sdb
>> devid2 size 2.73TiB used 1.24TiB path /dev/sdc
>> devid3 size 7.28TiB used 2.48TiB path /dev/sda
>>
>> # btrfs fi df /mnt/BigVault/
>> Data, RAID1: total=2.47TiB, used=2.47TiB
>> System, RAID1: total=32.00MiB, used=384.00KiB
>> Metadata, RAID1: total=4.00GiB, used=2.74GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>
>> But still df -h is giving me:
>> Filesystem   Size  Used Avail Use% Mounted on
>> /dev/sdb 6.4T  2.5T  1.5T  63% /mnt/BigVault
>>
>> Although I've heard and read about the difficulty in reporting free
>> space due to the flexibility of BTRFS, snapshots and subvolumes, etc.,
>> but I only have a single volume, no subvolumes, no snapshots, no
>> quotas and both data and metadata are RAID1.
>>
>> My expectation would've been that in case of BigVault Size == Used +
>> Avail.
>>
>> Actually based on http://carfax.org.uk/btrfs-usage/index.html I
>> would've expected 6 TB of usable space. Here I get 6.4 which is odd,
>> but that only 1.5 TB is available is even stranger.
>>
>> Could anyone explain what I did wrong or why my expectations are
>> wrong?
>>
>> Thank you in advance
>
> Btrfs reports estimated free space from the free space of the smallest
> member as it can only guarantee that. In your case this is 2.73 minus
> 1.24 free which is roughly around 1.5T. But since this free space
> distributes across three disks with one having much more free space, it
> probably will use up that space at half the rate of actual allocation.
> But due to how btrfs allocates from free space in chunks, that may not
> be possible - thus the low unexpected value. You will probably need to
> run balance once in a while to evenly redistribute allocated chunks
> across all disks.
>
> It may give you better estimates if you combine sdb and sdc into one
> logical device, e.g. using raid0 or jbod via md or lvm.
>
>
> --
> Regards,
> Kai
>
> Replies to list-only preferred.
>


Re: Help me understand what is going on with my RAID1 FS

2017-09-10 Thread Dmitrii Tcvetkov
>Actually based on http://carfax.org.uk/btrfs-usage/index.html I
>would've expected 6 TB of usable space. Here I get 6.4 which is odd,
>but that only 1.5 TB is available is even stranger.
>
>Could anyone explain what I did wrong or why my expectations are wrong?
>
>Thank you in advance

I'd say df and the website calculate different things. In btrfs, the
raid1 profile stores exactly 2 copies of the data, each copy on a
separate device. So by adding a third drive, no matter how big, the
effective free space didn't expand, because btrfs still needs space on
one of the other two drives to store the second chunk of each raid1
block group stored on that third drive.

Basically:

Drive1  Drive2  Drive3
X       X
X               X
        X       X

Where X is a chunk of raid1 block group.


Re: Help me understand what is going on with my RAID1 FS

2017-09-10 Thread Kai Krakow
On Sun, 10 Sep 2017 15:45:42 +0200, FLJ wrote:

> Hello all,
> 
> I have a BTRFS RAID1 volume running for the past year. I avoided all
> pitfalls known to me that would mess up this volume. I never
> experimented with quotas, no-COW, snapshots, defrag, nothing really.
> The volume is a RAID1 from day 1 and is working reliably until now.
> 
> Until yesterday it consisted of two 3 TB drives, something along the
> lines:
> 
> Label: 'BigVault'  uuid: a37ad5f5-a21b-41c7-970b-13b6c4db33db
> Total devices 2 FS bytes used 2.47TiB
> devid1 size 2.73TiB used 2.47TiB path /dev/sdb
> devid2 size 2.73TiB used 2.47TiB path /dev/sdc
> 
> Yesterday I've added a new drive to the FS and did a full rebalance
> (without filters) over night, which went through without any issues.
> 
> Now I have:
>  Label: 'BigVault'  uuid: a37ad5f5-a21b-41c7-970b-13b6c4db33db
> Total devices 3 FS bytes used 2.47TiB
> devid1 size 2.73TiB used 1.24TiB path /dev/sdb
> devid2 size 2.73TiB used 1.24TiB path /dev/sdc
> devid3 size 7.28TiB used 2.48TiB path /dev/sda
> 
> # btrfs fi df /mnt/BigVault/
> Data, RAID1: total=2.47TiB, used=2.47TiB
> System, RAID1: total=32.00MiB, used=384.00KiB
> Metadata, RAID1: total=4.00GiB, used=2.74GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> But still df -h is giving me:
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/sdb 6.4T  2.5T  1.5T  63% /mnt/BigVault
> 
> Although I've heard and read about the difficulty in reporting free
> space due to the flexibility of BTRFS, snapshots and subvolumes, etc.,
> but I only have a single volume, no subvolumes, no snapshots, no
> quotas and both data and metadata are RAID1.
> 
> My expectation would've been that in case of BigVault Size == Used +
> Avail.
> 
> Actually based on http://carfax.org.uk/btrfs-usage/index.html I
> would've expected 6 TB of usable space. Here I get 6.4 which is odd,
> but that only 1.5 TB is available is even stranger.
> 
> Could anyone explain what I did wrong or why my expectations are
> wrong?
> 
> Thank you in advance

Btrfs reports the estimated free space from the free space of the
smallest member, as that is all it can guarantee. In your case this is
2.73 minus 1.24 free, which is roughly 1.5T. But since this free space
is distributed across three disks, with one having much more free space,
it will probably use up that space at half the rate of actual
allocation. But due to how btrfs allocates from free space in chunks,
that may not be possible - thus the unexpectedly low value. You will
probably need to run balance once in a while to evenly redistribute
allocated chunks across all disks.
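
For the concrete numbers in this thread, a rough back-of-the-envelope
check of the same kind, assuming ideal pairing of free space and
ignoring chunk granularity (this is not what statfs actually computes):

free = [2.73 - 1.24, 2.73 - 1.24, 7.28 - 2.48]   # per-device unallocated, TiB
total = sum(free)
pairable = min(total / 2, total - max(free))     # ideal raid1 pairing bound
print(round(total, 2), round(pairable, 2))       # ~7.78 raw, ~2.98 of data

So roughly 3 TiB of data could in principle still be paired up across
the current free space, versus the ~1.5T that df reports; the reported
figure is the conservative one.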

It may give you better estimates if you combine sdb and sdc into one
logical device, e.g. using raid0 or jbod via md or lvm.


-- 
Regards,
Kai

Replies to list-only preferred.



Help me understand what is going on with my RAID1 FS

2017-09-10 Thread FLJ
Hello all,

I have a BTRFS RAID1 volume that has been running for the past year. I
avoided all the pitfalls known to me that would mess up this volume. I
never experimented with quotas, no-COW, snapshots, defrag, nothing really.
The volume has been a RAID1 from day 1 and has been working reliably
until now.

Until yesterday it consisted of two 3 TB drives, something along these lines:

Label: 'BigVault'  uuid: a37ad5f5-a21b-41c7-970b-13b6c4db33db
Total devices 2 FS bytes used 2.47TiB
devid1 size 2.73TiB used 2.47TiB path /dev/sdb
devid2 size 2.73TiB used 2.47TiB path /dev/sdc

Yesterday I added a new drive to the FS and did a full rebalance
(without filters) overnight, which went through without any issues.

Now I have:
 Label: 'BigVault'  uuid: a37ad5f5-a21b-41c7-970b-13b6c4db33db
Total devices 3 FS bytes used 2.47TiB
devid1 size 2.73TiB used 1.24TiB path /dev/sdb
devid2 size 2.73TiB used 1.24TiB path /dev/sdc
devid3 size 7.28TiB used 2.48TiB path /dev/sda

# btrfs fi df /mnt/BigVault/
Data, RAID1: total=2.47TiB, used=2.47TiB
System, RAID1: total=32.00MiB, used=384.00KiB
Metadata, RAID1: total=4.00GiB, used=2.74GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

But still df -h is giving me:
Filesystem   Size  Used Avail Use% Mounted on
/dev/sdb 6.4T  2.5T  1.5T  63% /mnt/BigVault

I've heard and read about the difficulty of reporting free space due to
the flexibility of BTRFS, snapshots and subvolumes, etc., but I only
have a single volume, no subvolumes, no snapshots, no quotas, and both
data and metadata are RAID1.

My expectation would've been that in case of BigVault Size == Used + Avail.

Actually based on http://carfax.org.uk/btrfs-usage/index.html I
would've expected 6 TB of usable space. Here I get 6.4 which is odd,
but that only 1.5 TB is available is even stranger.

Could anyone explain what I did wrong or why my expectations are wrong?

Thank you in advance


Re: Please help with exact actions for raid1 hot-swap

2017-09-10 Thread Patrik Lundquist
On 10 September 2017 at 08:33, Marat Khalili  wrote:
> It doesn't need replaced disk to be readable, right?

Only enough to be mountable, which it already is, so your read errors
on /dev/sdb aren't a problem.

> Then what prevents same procedure to work without a spare bay?

It is basically the same procedure but with a bunch of gotchas due to
bugs and odd behaviour. Only having one shot at it, before it can only
be mounted read-only, is especially problematic (will be fixed in
Linux 4.14).


> --
>
> With Best Regards,
> Marat Khalili
>
> On September 9, 2017 1:29:08 PM GMT+03:00, Patrik Lundquist 
>  wrote:
>>On 9 September 2017 at 12:05, Marat Khalili  wrote:
>>> Forgot to add, I've got a spare empty bay if it can be useful here.
>>
>>That makes it much easier since you don't have to mount it degraded,
>>with the risks involved.
>>
>>Add and partition the disk.
>>
>># btrfs replace start /dev/sdb7 /dev/sdc(?)7 /mnt/data
>>
>>Remove the old disk when it is done.
>>
>>> --
>>>
>>> With Best Regards,
>>> Marat Khalili
>>>
>>> On September 9, 2017 10:46:10 AM GMT+03:00, Marat Khalili
>> wrote:
Dear list,

I'm going to replace one hard drive (partition actually) of a btrfs
raid1. Can you please spell exactly what I need to do in order to get
my
filesystem working as RAID1 again after replacement, exactly as it
>>was
before? I saw some bad examples of drive replacement in this list so
>>I
afraid to just follow random instructions on wiki, and putting this
system out of action even temporarily would be very inconvenient.

For this filesystem:

> $ sudo btrfs fi show /dev/sdb7
> Label: 'data'  uuid: 37d3313a-e2ad-4b7f-98fc-a01d815952e0
> Total devices 2 FS bytes used 106.23GiB
> devid1 size 2.71TiB used 126.01GiB path /dev/sda7
> devid2 size 2.71TiB used 126.01GiB path /dev/sdb7
> $ grep /mnt/data /proc/mounts
> /dev/sda7 /mnt/data btrfs
> rw,noatime,space_cache,autodefrag,subvolid=5,subvol=/ 0 0
> $ sudo btrfs fi df /mnt/data
> Data, RAID1: total=123.00GiB, used=104.57GiB
> System, RAID1: total=8.00MiB, used=48.00KiB
> Metadata, RAID1: total=3.00GiB, used=1.67GiB
> GlobalReserve, single: total=512.00MiB, used=0.00B
> $ uname -a
> Linux host 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11 21:17:51 UTC
> 2017 x86_64 x86_64 x86_64 GNU/Linux

I've got this in dmesg:

> [Sep 8 20:31] ata6.00: exception Emask 0x0 SAct 0x7ecaa5ef SErr 0x0
> action 0x0
> [  +0.51] ata6.00: irq_stat 0x4008
> [  +0.29] ata6.00: failed command: READ FPDMA QUEUED
> [  +0.38] ata6.00: cmd 60/70:18:50:6c:f3/00:00:79:00:00/40 tag
>>3
> ncq 57344 in
>res 41/40:00:68:6c:f3/00:00:79:00:00/40
>>Emask
> 0x409 (media error) 
> [  +0.94] ata6.00: status: { DRDY ERR }
> [  +0.26] ata6.00: error: { UNC }
> [  +0.001195] ata6.00: configured for UDMA/133
> [  +0.30] sd 6:0:0:0: [sdb] tag#3 FAILED Result:
>>hostbyte=DID_OK
> driverbyte=DRIVER_SENSE
> [  +0.05] sd 6:0:0:0: [sdb] tag#3 Sense Key : Medium Error
> [current] [descriptor]
> [  +0.04] sd 6:0:0:0: [sdb] tag#3 Add. Sense: Unrecovered read
> error - auto reallocate failed
> [  +0.05] sd 6:0:0:0: [sdb] tag#3 CDB: Read(16) 88 00 00 00 00
>>00

> 79 f3 6c 50 00 00 00 70 00 00
> [  +0.03] blk_update_request: I/O error, dev sdb, sector
2045996136
> [  +0.47] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0,
rd
> 1, flush 0, corrupt 0, gen 0
> [  +0.62] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0,
rd
> 2, flush 0, corrupt 0, gen 0
> [  +0.77] ata6: EH complete

There's still 1 in Current_Pending_Sector line of smartctl output as
>>of

now, so it probably won't heal by itself.

--

With Best Regards,
Marat Khalili


Re: Please help with exact actions for raid1 hot-swap

2017-09-10 Thread Marat Khalili
It doesn't need the replaced disk to be readable, right? Then what
prevents the same procedure from working without a spare bay?
-- 

With Best Regards,
Marat Khalili

On September 9, 2017 1:29:08 PM GMT+03:00, Patrik Lundquist 
 wrote:
>On 9 September 2017 at 12:05, Marat Khalili  wrote:
>> Forgot to add, I've got a spare empty bay if it can be useful here.
>
>That makes it much easier since you don't have to mount it degraded,
>with the risks involved.
>
>Add and partition the disk.
>
># btrfs replace start /dev/sdb7 /dev/sdc(?)7 /mnt/data
>
>Remove the old disk when it is done.
>
>> --
>>
>> With Best Regards,
>> Marat Khalili
>>
>> On September 9, 2017 10:46:10 AM GMT+03:00, Marat Khalili
> wrote:
>>>Dear list,
>>>
>>>I'm going to replace one hard drive (partition actually) of a btrfs
>>>raid1. Can you please spell exactly what I need to do in order to get
>>>my
>>>filesystem working as RAID1 again after replacement, exactly as it
>was
>>>before? I saw some bad examples of drive replacement in this list so
>I
>>>afraid to just follow random instructions on wiki, and putting this
>>>system out of action even temporarily would be very inconvenient.
>>>
>>>For this filesystem:
>>>
 $ sudo btrfs fi show /dev/sdb7
 Label: 'data'  uuid: 37d3313a-e2ad-4b7f-98fc-a01d815952e0
 Total devices 2 FS bytes used 106.23GiB
 devid1 size 2.71TiB used 126.01GiB path /dev/sda7
 devid2 size 2.71TiB used 126.01GiB path /dev/sdb7
 $ grep /mnt/data /proc/mounts
 /dev/sda7 /mnt/data btrfs
 rw,noatime,space_cache,autodefrag,subvolid=5,subvol=/ 0 0
 $ sudo btrfs fi df /mnt/data
 Data, RAID1: total=123.00GiB, used=104.57GiB
 System, RAID1: total=8.00MiB, used=48.00KiB
 Metadata, RAID1: total=3.00GiB, used=1.67GiB
 GlobalReserve, single: total=512.00MiB, used=0.00B
 $ uname -a
 Linux host 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11 21:17:51 UTC
 2017 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>>I've got this in dmesg:
>>>
 [Sep 8 20:31] ata6.00: exception Emask 0x0 SAct 0x7ecaa5ef SErr 0x0
 action 0x0
 [  +0.51] ata6.00: irq_stat 0x4008
 [  +0.29] ata6.00: failed command: READ FPDMA QUEUED
 [  +0.38] ata6.00: cmd 60/70:18:50:6c:f3/00:00:79:00:00/40 tag
>3
 ncq 57344 in
res 41/40:00:68:6c:f3/00:00:79:00:00/40
>Emask
 0x409 (media error) 
 [  +0.94] ata6.00: status: { DRDY ERR }
 [  +0.26] ata6.00: error: { UNC }
 [  +0.001195] ata6.00: configured for UDMA/133
 [  +0.30] sd 6:0:0:0: [sdb] tag#3 FAILED Result:
>hostbyte=DID_OK
 driverbyte=DRIVER_SENSE
 [  +0.05] sd 6:0:0:0: [sdb] tag#3 Sense Key : Medium Error
 [current] [descriptor]
 [  +0.04] sd 6:0:0:0: [sdb] tag#3 Add. Sense: Unrecovered read
 error - auto reallocate failed
 [  +0.05] sd 6:0:0:0: [sdb] tag#3 CDB: Read(16) 88 00 00 00 00
>00
>>>
 79 f3 6c 50 00 00 00 70 00 00
 [  +0.03] blk_update_request: I/O error, dev sdb, sector
>>>2045996136
 [  +0.47] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0,
>>>rd
 1, flush 0, corrupt 0, gen 0
 [  +0.62] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0,
>>>rd
 2, flush 0, corrupt 0, gen 0
 [  +0.77] ata6: EH complete
>>>
>>>There's still 1 in Current_Pending_Sector line of smartctl output as
>of
>>>
>>>now, so it probably won't heal by itself.
>>>
>>>--
>>>
>>>With Best Regards,
>>>Marat Khalili


Re: Please help with exact actions for raid1 hot-swap

2017-09-09 Thread Duncan
Patrik Lundquist posted on Sat, 09 Sep 2017 12:29:08 +0200 as excerpted:

> On 9 September 2017 at 12:05, Marat Khalili  wrote:
>> Forgot to add, I've got a spare empty bay if it can be useful here.
> 
> That makes it much easier since you don't have to mount it degraded,
> with the risks involved.
> 
> Add and partition the disk.
> 
> # btrfs replace start /dev/sdb7 /dev/sdc(?)7 /mnt/data
> 
> Remove the old disk when it is done.

I did this with my dozen-plus (but small) btrfs raid1s on ssd partitions 
several kernel cycles ago.  It went very smoothly. =:^)

(TL;DR can stop there.)

I had actually been taking advantage of btrfs raid1's checksumming and 
scrub ability to continue running a failing ssd, with more and more 
sectors going bad and being replaced from spares, for quite some time 
after I'd have otherwise replaced it.  Everything of value was backed up, 
and I was simply doing it for the experience with both btrfs raid1 
scrubbing and continuing ssd sector failure.  But eventually the scrubs 
were finding and fixing errors every boot, especially when off for 
several hours, and further experience was of diminishing value while the 
hassle factor was building fast, so I attached the spare ssd, partitioned 
it up, did a final scrub on all the btrfs, and then one btrfs at a time 
btrfs replaced the devices from the old ssd's partitions to the new one's 
partitions.  Given that I was already used to running scrubs at every 
boot, the entirely uneventful replacements were actually somewhat 
anticlimactic, but that was a good thing! =:^)

Then more recently I bought a larger/newer pair of ssds (1 TB each, the 
old ones were quarter TB each) and converted my media partitions and 
secondary backups, which had still been on reiserfs on spinning rust, to 
btrfs raid1 on ssd as well, making me all-btrfs on all-ssd now, with 
everything but /boot and its backups on the other ssds being btrfs raid1, 
and /boot and its backups being btrfs dup. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: Please help with exact actions for raid1 hot-swap

2017-09-09 Thread Patrik Lundquist
On 9 September 2017 at 12:05, Marat Khalili  wrote:
> Forgot to add, I've got a spare empty bay if it can be useful here.

That makes it much easier since you don't have to mount it degraded,
with the risks involved.

Add and partition the disk.

# btrfs replace start /dev/sdb7 /dev/sdc(?)7 /mnt/data

Remove the old disk when it is done.

> --
>
> With Best Regards,
> Marat Khalili
>
> On September 9, 2017 10:46:10 AM GMT+03:00, Marat Khalili  wrote:
>>Dear list,
>>
>>I'm going to replace one hard drive (partition actually) of a btrfs
>>raid1. Can you please spell exactly what I need to do in order to get
>>my
>>filesystem working as RAID1 again after replacement, exactly as it was
>>before? I saw some bad examples of drive replacement in this list so I
>>afraid to just follow random instructions on wiki, and putting this
>>system out of action even temporarily would be very inconvenient.
>>
>>For this filesystem:
>>
>>> $ sudo btrfs fi show /dev/sdb7
>>> Label: 'data'  uuid: 37d3313a-e2ad-4b7f-98fc-a01d815952e0
>>> Total devices 2 FS bytes used 106.23GiB
>>> devid1 size 2.71TiB used 126.01GiB path /dev/sda7
>>> devid2 size 2.71TiB used 126.01GiB path /dev/sdb7
>>> $ grep /mnt/data /proc/mounts
>>> /dev/sda7 /mnt/data btrfs
>>> rw,noatime,space_cache,autodefrag,subvolid=5,subvol=/ 0 0
>>> $ sudo btrfs fi df /mnt/data
>>> Data, RAID1: total=123.00GiB, used=104.57GiB
>>> System, RAID1: total=8.00MiB, used=48.00KiB
>>> Metadata, RAID1: total=3.00GiB, used=1.67GiB
>>> GlobalReserve, single: total=512.00MiB, used=0.00B
>>> $ uname -a
>>> Linux host 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11 21:17:51 UTC
>>> 2017 x86_64 x86_64 x86_64 GNU/Linux
>>
>>I've got this in dmesg:
>>
>>> [Sep 8 20:31] ata6.00: exception Emask 0x0 SAct 0x7ecaa5ef SErr 0x0
>>> action 0x0
>>> [  +0.51] ata6.00: irq_stat 0x4008
>>> [  +0.29] ata6.00: failed command: READ FPDMA QUEUED
>>> [  +0.38] ata6.00: cmd 60/70:18:50:6c:f3/00:00:79:00:00/40 tag 3
>>> ncq 57344 in
>>>res 41/40:00:68:6c:f3/00:00:79:00:00/40 Emask
>>> 0x409 (media error) 
>>> [  +0.94] ata6.00: status: { DRDY ERR }
>>> [  +0.26] ata6.00: error: { UNC }
>>> [  +0.001195] ata6.00: configured for UDMA/133
>>> [  +0.30] sd 6:0:0:0: [sdb] tag#3 FAILED Result: hostbyte=DID_OK
>>> driverbyte=DRIVER_SENSE
>>> [  +0.05] sd 6:0:0:0: [sdb] tag#3 Sense Key : Medium Error
>>> [current] [descriptor]
>>> [  +0.04] sd 6:0:0:0: [sdb] tag#3 Add. Sense: Unrecovered read
>>> error - auto reallocate failed
>>> [  +0.05] sd 6:0:0:0: [sdb] tag#3 CDB: Read(16) 88 00 00 00 00 00
>>
>>> 79 f3 6c 50 00 00 00 70 00 00
>>> [  +0.03] blk_update_request: I/O error, dev sdb, sector
>>2045996136
>>> [  +0.47] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0,
>>rd
>>> 1, flush 0, corrupt 0, gen 0
>>> [  +0.62] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0,
>>rd
>>> 2, flush 0, corrupt 0, gen 0
>>> [  +0.77] ata6: EH complete
>>
>>There's still 1 in Current_Pending_Sector line of smartctl output as of
>>
>>now, so it probably won't heal by itself.
>>
>>--
>>
>>With Best Regards,
>>Marat Khalili


Re: Please help with exact actions for raid1 hot-swap

2017-09-09 Thread Marat Khalili
Forgot to add, I've got a spare empty bay if it can be useful here.
--

With Best Regards,
Marat Khalili

On September 9, 2017 10:46:10 AM GMT+03:00, Marat Khalili  wrote:
>Dear list,
>
>I'm going to replace one hard drive (partition actually) of a btrfs 
>raid1. Can you please spell exactly what I need to do in order to get
>my 
>filesystem working as RAID1 again after replacement, exactly as it was 
>before? I saw some bad examples of drive replacement in this list so I 
>afraid to just follow random instructions on wiki, and putting this 
>system out of action even temporarily would be very inconvenient.
>
>For this filesystem:
>
>> $ sudo btrfs fi show /dev/sdb7
>> Label: 'data'  uuid: 37d3313a-e2ad-4b7f-98fc-a01d815952e0
>> Total devices 2 FS bytes used 106.23GiB
>> devid1 size 2.71TiB used 126.01GiB path /dev/sda7
>> devid2 size 2.71TiB used 126.01GiB path /dev/sdb7
>> $ grep /mnt/data /proc/mounts
>> /dev/sda7 /mnt/data btrfs 
>> rw,noatime,space_cache,autodefrag,subvolid=5,subvol=/ 0 0
>> $ sudo btrfs fi df /mnt/data
>> Data, RAID1: total=123.00GiB, used=104.57GiB
>> System, RAID1: total=8.00MiB, used=48.00KiB
>> Metadata, RAID1: total=3.00GiB, used=1.67GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>> $ uname -a
>> Linux host 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11 21:17:51 UTC 
>> 2017 x86_64 x86_64 x86_64 GNU/Linux
>
>I've got this in dmesg:
>
>> [Sep 8 20:31] ata6.00: exception Emask 0x0 SAct 0x7ecaa5ef SErr 0x0 
>> action 0x0
>> [  +0.51] ata6.00: irq_stat 0x4008
>> [  +0.29] ata6.00: failed command: READ FPDMA QUEUED
>> [  +0.38] ata6.00: cmd 60/70:18:50:6c:f3/00:00:79:00:00/40 tag 3 
>> ncq 57344 in
>>res 41/40:00:68:6c:f3/00:00:79:00:00/40 Emask 
>> 0x409 (media error) 
>> [  +0.94] ata6.00: status: { DRDY ERR }
>> [  +0.26] ata6.00: error: { UNC }
>> [  +0.001195] ata6.00: configured for UDMA/133
>> [  +0.30] sd 6:0:0:0: [sdb] tag#3 FAILED Result: hostbyte=DID_OK 
>> driverbyte=DRIVER_SENSE
>> [  +0.05] sd 6:0:0:0: [sdb] tag#3 Sense Key : Medium Error 
>> [current] [descriptor]
>> [  +0.04] sd 6:0:0:0: [sdb] tag#3 Add. Sense: Unrecovered read 
>> error - auto reallocate failed
>> [  +0.05] sd 6:0:0:0: [sdb] tag#3 CDB: Read(16) 88 00 00 00 00 00
>
>> 79 f3 6c 50 00 00 00 70 00 00
>> [  +0.03] blk_update_request: I/O error, dev sdb, sector
>2045996136
>> [  +0.47] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0,
>rd 
>> 1, flush 0, corrupt 0, gen 0
>> [  +0.62] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0,
>rd 
>> 2, flush 0, corrupt 0, gen 0
>> [  +0.77] ata6: EH complete
>
>There's still 1 in Current_Pending_Sector line of smartctl output as of
>
>now, so it probably won't heal by itself.
>
>--
>
>With Best Regards,
>Marat Khalili


Re: Please help with exact actions for raid1 hot-swap

2017-09-09 Thread Patrik Lundquist
On 9 September 2017 at 09:46, Marat Khalili  wrote:
>
> Dear list,
>
> I'm going to replace one hard drive (partition actually) of a btrfs raid1. 
> Can you please spell exactly what I need to do in order to get my filesystem 
> working as RAID1 again after replacement, exactly as it was before? I saw 
> some bad examples of drive replacement in this list so I afraid to just 
> follow random instructions on wiki, and putting this system out of action 
> even temporarily would be very inconvenient.


I recently replaced both disks in a two disk Btrfs raid1 to increase
capacity and took some notes.

Using systemd? systemd will automatically unmount a degraded disk and
ruin your one chance to replace the disk as long as Btrfs has the bug
where it notes single chunks and one disk missing and refuses to mount
degraded again.

Comment out your mount in fstab and run "systemctl daemon-reload". The
mount file in /var/run/systemd/generator/ will be removed. (Is there a
better way?)

Unmount the volume.

# hdparm -Y /dev/sdb
# echo 1 > /sys/block/sdb/device/delete

Replace the disk. Create partitions etc. You might have to restart
smartd, if using it.

Make Btrfs forget the old device; otherwise it will think the old disk
is still there. (Is there a better way?)
# rmmod btrfs; modprobe btrfs
# btrfs device scan

# mount -o degraded /dev/sda7 /mnt/data
# btrfs device usage /mnt/data

# btrfs replace start  /dev/sdbX /mnt/data
# btrfs replace status /mnt/data

Convert single or dup chunks to raid1
# btrfs balance start -fv -dconvert=raid1,soft -mconvert=raid1,soft
-sconvert=raid1,soft /mnt/data

Unmount, restore fstab, reload systemd again, mount.

>
> For this filesystem:
>
>> $ sudo btrfs fi show /dev/sdb7
>> Label: 'data'  uuid: 37d3313a-e2ad-4b7f-98fc-a01d815952e0
>> Total devices 2 FS bytes used 106.23GiB
>> devid1 size 2.71TiB used 126.01GiB path /dev/sda7
>> devid2 size 2.71TiB used 126.01GiB path /dev/sdb7
>> $ grep /mnt/data /proc/mounts
>> /dev/sda7 /mnt/data btrfs 
>> rw,noatime,space_cache,autodefrag,subvolid=5,subvol=/ 0 0
>> $ sudo btrfs fi df /mnt/data
>> Data, RAID1: total=123.00GiB, used=104.57GiB
>> System, RAID1: total=8.00MiB, used=48.00KiB
>> Metadata, RAID1: total=3.00GiB, used=1.67GiB
>> GlobalReserve, single: total=512.00MiB, used=0.00B
>> $ uname -a
>> Linux host 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11 21:17:51 UTC 2017 
>> x86_64 x86_64 x86_64 GNU/Linux
>
>
> I've got this in dmesg:
>
>> [Sep 8 20:31] ata6.00: exception Emask 0x0 SAct 0x7ecaa5ef SErr 0x0 action 
>> 0x0
>> [  +0.51] ata6.00: irq_stat 0x4008
>> [  +0.29] ata6.00: failed command: READ FPDMA QUEUED
>> [  +0.38] ata6.00: cmd 60/70:18:50:6c:f3/00:00:79:00:00/40 tag 3 ncq 
>> 57344 in
>>res 41/40:00:68:6c:f3/00:00:79:00:00/40 Emask 0x409 
>> (media error) 
>> [  +0.94] ata6.00: status: { DRDY ERR }
>> [  +0.26] ata6.00: error: { UNC }
>> [  +0.001195] ata6.00: configured for UDMA/133
>> [  +0.30] sd 6:0:0:0: [sdb] tag#3 FAILED Result: hostbyte=DID_OK 
>> driverbyte=DRIVER_SENSE
>> [  +0.05] sd 6:0:0:0: [sdb] tag#3 Sense Key : Medium Error [current] 
>> [descriptor]
>> [  +0.04] sd 6:0:0:0: [sdb] tag#3 Add. Sense: Unrecovered read error - 
>> auto reallocate failed
>> [  +0.05] sd 6:0:0:0: [sdb] tag#3 CDB: Read(16) 88 00 00 00 00 00 79 f3 
>> 6c 50 00 00 00 70 00 00
>> [  +0.03] blk_update_request: I/O error, dev sdb, sector 2045996136
>> [  +0.47] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0, rd 1, 
>> flush 0, corrupt 0, gen 0
>> [  +0.62] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0, rd 2, 
>> flush 0, corrupt 0, gen 0
>> [  +0.77] ata6: EH complete
>
>
> There's still 1 in Current_Pending_Sector line of smartctl output as of now, 
> so it probably won't heal by itself.
>
> --
>
> With Best Regards,
> Marat Khalili


Please help with exact actions for raid1 hot-swap

2017-09-09 Thread Marat Khalili

Dear list,

I'm going to replace one hard drive (partition actually) of a btrfs 
raid1. Can you please spell exactly what I need to do in order to get my 
filesystem working as RAID1 again after replacement, exactly as it was 
before? I saw some bad examples of drive replacement on this list, so I
am afraid to just follow random instructions on the wiki, and putting
this system out of action even temporarily would be very inconvenient.


For this filesystem:


$ sudo btrfs fi show /dev/sdb7
Label: 'data'  uuid: 37d3313a-e2ad-4b7f-98fc-a01d815952e0
Total devices 2 FS bytes used 106.23GiB
devid1 size 2.71TiB used 126.01GiB path /dev/sda7
devid2 size 2.71TiB used 126.01GiB path /dev/sdb7
$ grep /mnt/data /proc/mounts
/dev/sda7 /mnt/data btrfs 
rw,noatime,space_cache,autodefrag,subvolid=5,subvol=/ 0 0

$ sudo btrfs fi df /mnt/data
Data, RAID1: total=123.00GiB, used=104.57GiB
System, RAID1: total=8.00MiB, used=48.00KiB
Metadata, RAID1: total=3.00GiB, used=1.67GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
$ uname -a
Linux host 4.4.0-93-generic #116-Ubuntu SMP Fri Aug 11 21:17:51 UTC 
2017 x86_64 x86_64 x86_64 GNU/Linux


I've got this in dmesg:

[Sep 8 20:31] ata6.00: exception Emask 0x0 SAct 0x7ecaa5ef SErr 0x0 
action 0x0

[  +0.51] ata6.00: irq_stat 0x4008
[  +0.29] ata6.00: failed command: READ FPDMA QUEUED
[  +0.38] ata6.00: cmd 60/70:18:50:6c:f3/00:00:79:00:00/40 tag 3 
ncq 57344 in
   res 41/40:00:68:6c:f3/00:00:79:00:00/40 Emask 
0x409 (media error) 

[  +0.94] ata6.00: status: { DRDY ERR }
[  +0.26] ata6.00: error: { UNC }
[  +0.001195] ata6.00: configured for UDMA/133
[  +0.30] sd 6:0:0:0: [sdb] tag#3 FAILED Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
[  +0.05] sd 6:0:0:0: [sdb] tag#3 Sense Key : Medium Error 
[current] [descriptor]
[  +0.04] sd 6:0:0:0: [sdb] tag#3 Add. Sense: Unrecovered read 
error - auto reallocate failed
[  +0.05] sd 6:0:0:0: [sdb] tag#3 CDB: Read(16) 88 00 00 00 00 00 
79 f3 6c 50 00 00 00 70 00 00

[  +0.03] blk_update_request: I/O error, dev sdb, sector 2045996136
[  +0.47] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0, rd 
1, flush 0, corrupt 0, gen 0
[  +0.62] BTRFS error (device sda7): bdev /dev/sdb7 errs: wr 0, rd 
2, flush 0, corrupt 0, gen 0

[  +0.77] ata6: EH complete


There's still a 1 in the Current_Pending_Sector line of the smartctl
output as of now, so it probably won't heal by itself.


--

With Best Regards,
Marat Khalili


Re: Please help. Repair probably bitflip damage and suspected bug

2017-06-20 Thread Chris Murphy
> [Sun Jun 18 04:02:43 2017] BTRFS critical (device sdb2): corrupt node,
> bad key order: block=5123372711936, root=1, slot=82


From the archives, most likely it's bad RAM. I see this system also
uses an XFS v4 file system; if it were made as XFS v5, using metadata
csums, you'd probably eventually run into a similar problem that would
be caught by metadata checksum errors. It'll fail faster with Btrfs
because it's checksumming everything.
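
To illustrate why a checksumming filesystem trips over a single flipped
bit so quickly, here is a tiny sketch; it uses Python's stock CRC-32
purely as an illustration, not btrfs's actual CRC-32C code:

import zlib

block = bytearray(b"some metadata block " * 200)   # made-up 4000-byte block
good = zlib.crc32(block)

block[137] ^= 0x04                                 # flip one bit, as bad RAM might
bad = zlib.crc32(block)

print(hex(good), hex(bad), good == bad)            # checksums differ -> caught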


Chris Murphy


Re: Help on using linux-btrfs mailing list please

2017-06-19 Thread Jesse
Thanks Мяу!
I will ensure I reply all :)

On 19 June 2017 at 23:38, Adam Borowski <kilob...@angband.pl> wrote:
> On Mon, Jun 19, 2017 at 12:48:54PM +0300, Ivan Sizov wrote:
>> 2017-06-19 12:32 GMT+03:00 Jesse <btrfs_mail_l...@mymail.isbest.biz>:
>> > So I guess that means when I initiate a post, I also need to send it
>> > to myself as well as the mail list.
>> You need to do it in the reply only, not in the initial post.
>>
>> > Does it make any difference where I put respective addresses, eg: TO: CC: 
>> > BCC:
>> You need to put a person to whom you reply in "TO" field and mailing
>> list in "CC" field.
>
> Any mail client I know (but I haven't looked at many...) can do all of this
> by "Reply All" (a button by that name in Thunderbird, 'g' in mutt, ...).
>
> It's worth noting that vger lists have rules different to those in most of
> Free Software communities: on vger, you're supposed to send copies to
> everyone -- pretty much everywhere else you are expected to send to the list
> only.  This is done by "Reply List" (in Thunderbird, 'L' in mutt, ...).
> Such lists do add a set of "List-*:" headers that help the client.
>
>
> Мяу!
> --
> ⢀⣴⠾⠻⢶⣦⠀
> ⣾⠁⢠⠒⠀⣿⡁ A dumb species has no way to open a tuna can.
> ⢿⡄⠘⠷⠚⠋⠀ A smart species invents a can opener.
> ⠈⠳⣄ A master species delegates.


Re: Help on using linux-btrfs mailing list please

2017-06-19 Thread Adam Borowski
On Mon, Jun 19, 2017 at 12:48:54PM +0300, Ivan Sizov wrote:
> 2017-06-19 12:32 GMT+03:00 Jesse <btrfs_mail_l...@mymail.isbest.biz>:
> > So I guess that means when I initiate a post, I also need to send it
> > to myself as well as the mail list.
> You need to do it in the reply only, not in the initial post.
> 
> > Does it make any difference where I put respective addresses, eg: TO: CC: 
> > BCC:
> You need to put a person to whom you reply in "TO" field and mailing
> list in "CC" field.

Any mail client I know (but I haven't looked at many...) can do all of this
by "Reply All" (a button by that name in Thunderbird, 'g' in mutt, ...).

It's worth noting that vger lists have rules different to those in most of
Free Software communities: on vger, you're supposed to send copies to
everyone -- pretty much everywhere else you are expected to send to the list
only.  This is done by "Reply List" (in Thunderbird, 'L' in mutt, ...).
Such lists do add a set of "List-*:" headers that help the client.


Мяу!
-- 
⢀⣴⠾⠻⢶⣦⠀ 
⣾⠁⢠⠒⠀⣿⡁ A dumb species has no way to open a tuna can.
⢿⡄⠘⠷⠚⠋⠀ A smart species invents a can opener.
⠈⠳⣄ A master species delegates.


Re: Please help. Repair probably bitflip damage and suspected bug

2017-06-19 Thread Jesse
I just noticed a series of seemingly btrfs-related call traces that,
for the first time, did not lock up the system.

I have uploaded dmesg to https://paste.ee/p/An8Qy

Anyone able to help advise on these?

Thanks

Jesse


On 19 June 2017 at 17:19, Jesse <btrfs_mail_l...@mymail.isbest.biz> wrote:
> Further to the above message reporting problems, I have been able to
> capture a call trace under the main system rather than live media.
>
> Note this occurred in rsync from btrfs to a separate drive running xfs
> on a local filesystem (both sata drives). So I presume that btrfs is
> only reading the drive at the time of crash, unless rsync is also
> doing some sort of disc caching of the files to btrfs as it is the OS
> filesystem.
>
> The destination drive directories being copied to in this case were
> empty, so I was making a copy of the data off of the btrfs drive (due
> to the btrfs tree errors and problems reported in the post I am here
> replying to).
>
> I am suspecting that there is a direct correlation to using rsync
> while (or subsequent to) touching areas of the btrfs tree that have
> corruption which results in a complete system lockup/crash.
>
> I have also noted that when these crashes while running rsync occur,
> the prior x files (eg: 10 files) show in the rsync log as being
> synced, however, show on the destination drive with filesize of zero.
>
> The trace (/var/log/messages | grep btrfs) I have uploaded to
> https://paste.ee/p/nRcj0
>
> The important part of which is:
>
> Jun 18 23:43:24 Orion vmunix: [38084.183174] BTRFS info (device sda2):
> no csum found for inode 12497 start 0
> Jun 18 23:43:24 Orion vmunix: [38084.183195] BTRFS info (device sda2):
> no csum found for inode 12497 start 0
> Jun 18 23:43:24 Orion vmunix: [38084.183209] BTRFS info (device sda2):
> no csum found for inode 12497 start 0
> Jun 18 23:43:24 Orion vmunix: [38084.183222] BTRFS info (device sda2):
> no csum found for inode 12497 start 0
> Jun 18 23:43:24 Orion vmunix: [38084.217552] BTRFS info (device sda2):
> csum failed ino 12497 extent 1700305813504 csum 1405070872 wanted 0
> mirror 0
> Jun 18 23:43:24 Orion vmunix: [38084.217626] BTRFS info (device sda2):
> no csum found for inode 12497 start 0
> Jun 18 23:43:24 Orion vmunix: [38084.217643] BTRFS info (device sda2):
> no csum found for inode 12497 start 0
> Jun 18 23:43:24 Orion vmunix: [38084.217657] BTRFS info (device sda2):
> no csum found for inode 12497 start 0
> Jun 18 23:43:24 Orion vmunix: [38084.217669] BTRFS info (device sda2):
> no csum found for inode 12497 start 0
> Jun 18 23:43:24 Orion vmunix:  auth_rpcgss nfs_acl nfs lockd grace
> sunrpc fscache zfs(POE) zunicode(POE) zcommon(POE) znvpair(POE)
> spl(OE) zavl(POE) btrfs xor raid6_pq dm_mirror dm_region_hash dm_log
> hid_generic usbhid hid uas usb_storage radeon i2c_algo_bit ttm
> drm_kms_helper drm r8169 ahci mii libahci wmi
> Jun 18 23:43:24 Orion vmunix: [38084.220604] Workqueue: btrfs-endio
> btrfs_endio_helper [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.220812] RIP:
> 0010:[]  []
> __btrfs_map_block+0x32a/0x1180 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.222459]  [] ?
> __btrfs_lookup_bio_sums.isra.8+0x3e0/0x540 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.222632]  []
> btrfs_map_bio+0x7d/0x2b0 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.222781]  []
> btrfs_submit_compressed_read+0x484/0x4e0 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.222948]  []
> btrfs_submit_bio_hook+0x1c1/0x1d0 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.223198]  [] ?
> btrfs_create_repair_bio+0xf0/0x110 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.223360]  []
> bio_readpage_error+0x117/0x180 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.223514]  [] ?
> clean_io_failure+0x1b0/0x1b0 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.223667]  []
> end_bio_extent_readpage+0x3be/0x3f0 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.223996]  []
> end_workqueue_fn+0x48/0x60 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.224145]  []
> normal_work_helper+0x82/0x210 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.224297]  []
> btrfs_endio_helper+0x12/0x20 [btrfs]
> Jun 18 23:43:24 Orion vmunix:  auth_rpcgss nfs_acl nfs lockd grace
> sunrpc fscache zfs(POE) zunicode(POE) zcommon(POE) znvpair(POE)
> spl(OE) zavl(POE) btrfs xor raid6_pq dm_mirror dm_region_hash dm_log
> hid_generic usbhid hid uas usb_storage radeon i2c_algo_bit ttm
> drm_kms_helper drm r8169 ahci mii libahci wmi
> Jun 18 23:43:24 Orion vmunix: [38084.330053]  [] ?
> __btrfs_map_block+0x32a/0x1180 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.330106]  [] ?
> __btrfs_map_block+0x2cc/0x1180 [btrfs]
> Jun 18 23:43:24 Orion vmunix: [38084.330154]  [] ?
> __btrfs_loo

Re: Help on using linux-btrfs mailing list please

2017-06-19 Thread Ivan Sizov
2017-06-19 13:15 GMT+03:00 Jesse :
> Thanks again. So am I to understand that you go into your 'sent'
> folder, find a mail to the mail list (that is not CC to yourself),
> then you reply to this and add the mail list when you need to update
> your own post that no-one has yet replied to?
Yes, exactly.

-- 
Ivan Sizov


Re: Help on using linux-btrfs mailing list please

2017-06-19 Thread Ivan Sizov
2017-06-19 13:03 GMT+03:00 Jesse :
> Thanks Ivan.
> What about when initiating a post, do I do the same eg:
> TO: myself
> CC: mailing list
>
> or do I
> TO: mailing list
> CC: myself
If your mail client doesn't have a "sent" folder, you can, of course,
follow one of these examples. But I have never faced such a situation.

-- 
Ivan Sizov


Re: Help on using linux-btrfs mailing list please

2017-06-19 Thread Ivan Sizov
2017-06-19 13:03 GMT+03:00 Jesse :
> Thanks Ivan.
> What about when initiating a post, do I do the same eg:
> TO: myself
> CC: mailing list
>
> or do I
> TO: mailing list
> CC: myself
When initiating a post you should specify "TO: mailing list" only,
without any other addresses. At least that is how I have always
initiated posts.



-- 
Ivan Sizov


Re: Help on using linux-btrfs mailing list please

2017-06-19 Thread Jesse
Thanks Ivan.
What about when initiating a post, do I do the same eg:
TO: myself
CC: mailing list

or do I
TO: mailing list
CC: myself

TIA

On 19 June 2017 at 17:48, Ivan Sizov  wrote:
> 2017-06-19 12:32 GMT+03:00 Jesse :
>> So I guess that means when I initiate a post, I also need to send it
>> to myself as well as the mail list.
> You need to do it in the reply only, not in the initial post.
>
>> Does it make any difference where I put respective addresses, eg: TO: CC: 
>> BCC:
> You need to put a person to whom you reply in "TO" field and mailing
> list in "CC" field.
>
> --
> Ivan Sizov


Re: Help on using linux-btrfs mailing list please

2017-06-19 Thread Ivan Sizov
2017-06-19 12:32 GMT+03:00 Jesse :
> So I guess that means when I initiate a post, I also need to send it
> to myself as well as the mail list.
You need to do it in the reply only, not in the initial post.

> Does it make any difference where I put respective addresses, eg: TO: CC: BCC:
You need to put the person to whom you reply in the "TO" field and the
mailing list in the "CC" field.

-- 
Ivan Sizov


Re: Help on using linux-btrfs mailing list please

2017-06-19 Thread Jesse
Ok thanks Ivan.

So I guess that means when I initiate a post, I also need to send it
to myself as well as the mail list.
Does it make any difference where I put respective addresses, eg: TO: CC: BCC:

Regards

Jesse

On 19 June 2017 at 17:20, Ivan Sizov  wrote:
> You should reply both to linux-btrfs@vger.kernel.org and the person
> whom you talk to.
>
> 2017-06-19 11:37 GMT+03:00 Jesse :
>> I have subscribed successfully and am able to post successfully and
>> eventually view the post on spinics.net when it becomes available:
>> eg: http://www.spinics.net/lists/linux-btrfs/msg66605.html
>>
>> However I do not know how to reply to messages, especially my own to
>> add more information, such as a call trace.
>> 1. I do not receive an email of my post for which I could reply
>> 2. The emails that I do receive from the list are from the respective
>> sender, and not the vger.kernel.org, as such I do not even know how to
>> reply to someone in a way that it ends up on the mailing list and not
>> directly to that person.
>>
>> Could someone please be so kind as to direct me to a good guide for
>> using this mailing list?
>>
>> Thanks
>>
>> Jesse
>
>
>
> --
> Ivan Sizov


Re: Help on using linux-btrfs mailing list please

2017-06-19 Thread Ivan Sizov
You should reply both to linux-btrfs@vger.kernel.org and to the person
you are talking to.

2017-06-19 11:37 GMT+03:00 Jesse :
> I have subscribed successfully and am able to post successfully and
> eventually view the post on spinics.net when it becomes available:
> eg: http://www.spinics.net/lists/linux-btrfs/msg66605.html
>
> However I do not know how to reply to messages, especially my own to
> add more information, such as a call trace.
> 1. I do not receive an email of my post for which I could reply
> 2. The emails that I do receive from the list are from the respective
> sender, and not the vger.kernel.org, as such I do not even know how to
> reply to someone in a way that it ends up on the mailing list and not
> directly to that person.
>
> Could someone please be so kind as to direct me to a good guide for
> using this mailing list?
>
> Thanks
>
> Jesse



-- 
Ivan Sizov


Re: Please help. Repair probably bitflip damage and suspected bug

2017-06-19 Thread Jesse
Jun 18 23:43:24 Orion vmunix: [38084.330618]  []
normal_work_helper+0x82/0x210 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.330668]  []
btrfs_endio_helper+0x12/0x20 [btrfs]
Jun 18 23:43:24 Orion vmunix:  auth_rpcgss nfs_acl nfs lockd grace
sunrpc fscache zfs(POE) zunicode(POE) zcommon(POE) znvpair(POE)
spl(OE) zavl(POE) btrfs xor raid6_pq dm_mirror dm_region_hash dm_log
hid_generic usbhid hid uas usb_storage radeon i2c_algo_bit ttm
drm_kms_helper drm r8169 ahci mii libahci wmi
Jun 18 23:43:24 Orion vmunix: [38084.331102]  [] ?
__btrfs_map_block+0x32a/0x1180 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331152]  [] ?
__btrfs_map_block+0x2cc/0x1180 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331202]  [] ?
__btrfs_lookup_bio_sums.isra.8+0x3e0/0x540 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331255]  []
btrfs_map_bio+0x7d/0x2b0 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331310]  []
btrfs_submit_compressed_read+0x484/0x4e0 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331360]  []
btrfs_submit_bio_hook+0x1c1/0x1d0 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331423]  [] ?
btrfs_create_repair_bio+0xf0/0x110 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331477]  []
bio_readpage_error+0x117/0x180 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331530]  [] ?
clean_io_failure+0x1b0/0x1b0 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331585]  []
end_bio_extent_readpage+0x3be/0x3f0 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331649]  []
end_workqueue_fn+0x48/0x60 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331703]  []
normal_work_helper+0x82/0x210 [btrfs]
Jun 18 23:43:24 Orion vmunix: [38084.331757]  []
btrfs_endio_helper+0x12/0x20 [btrfs]
Jun 19 07:29:22 Orion vmunix: [3.107113] Btrfs loaded
Jun 19 07:29:22 Orion vmunix: [3.665536] BTRFS: device label
btrfs1 devid 2 transid 1086759 /dev/sdb2
Jun 19 07:29:22 Orion vmunix: [3.665811] BTRFS: device label
btrfs1 devid 1 transid 1086759 /dev/sda2
Jun 19 07:29:22 Orion vmunix: [8.673689] BTRFS info (device sda2):
disk space caching is enabled
Jun 19 07:29:22 Orion vmunix: [   28.190962] BTRFS info (device sda2):
enabling auto defrag
Jun 19 07:29:22 Orion vmunix: [   28.191039] BTRFS info (device sda2):
disk space caching is enabled

I notice that this page
https://btrfs.wiki.kernel.org/index.php/Gotchas mentions "Files with a
lot of random writes can become heavily fragmented (10000+ extents)
causing thrashing on HDDs and excessive multi-second spikes", and I am
wondering whether this is related to the crashing. AFAIK rsync should
be creating its temporary file on the destination drive (xfs), unless
there is some part of rsync that I am not understanding that would be
writing to the btrfs drive, which in this case is also the source HDD.
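
If it helps, a minimal sketch of how to check both assumptions (paths
are placeholders; filefrag is from e2fsprogs, --temp-dir is a standard
rsync option):

  # count the extents of a suspect file; thousands of extents would
  # match the Gotchas note above
  filefrag -v /path/to/suspect/file

  # force rsync's temporary files onto the xfs destination, so nothing
  # is written to the btrfs source
  rsync -av --temp-dir=/mnt/xfs/.rsync-tmp /mnt/btrfs/source/ /mnt/xfs/dest/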

Can someone please help with these btrfs problems?

Thank you



> My Linux Mint system starts up and is usable; however, I am unable to
> complete any scrub, as they abort before finishing. There are various
> inode errors in dmesg. Badblocks (read-only) finds no errors. Checking
> extents gives bad block 5123372711936 on both /dev/sda2 and /dev/sdb2.
> A btrfs check (read-only) results in a 306 MB text file of root xxx
> inode errors.
> There are two 3 TB drives in RAID1 (sda2/sdb2), where partition 2 is
> nearly the entire drive.
>
> I am currently using a Manjaro live boot with btrfs tools
> (btrfs-progs v4.10.1) in an attempt to recover/repair what seems to be
> a bitflip.
> (The original Linux Mint system has btrfs-progs v4.5.3.)
>
> When scrubbing '/', the /dev/sdb2 scrub always aborts at ~383 GiB with
> 0 errors, whereas the /dev/sda2 scrub (and thus the '/' scrub) aborts
> at more varied values, starting at 537.90 GiB, also with 0 errors.
>
> btrfs inspect-internal dump-tree -b 5123372711936 has one item
> evidently out of order:
> 2551224532992 -> 2551253647360 -> 2551251468288
>
> I am currently attempting to copy files off the system while in
> Manjaro using rsync, prior to attempting whatever the knowledgeable
> people here recommend. So far two files could not be read, and there
> are a lot of btrfs error messages in dmesg:
> https://ptpb.pw/L9Z9
>
> Pastebins from original machine:
> System specs as on original Linux Mint system: https://ptpb.pw/dFz3
> dmesg btrfs grep from prior to errors starting until scrub attempts:
> https://ptpb.pw/rTzs
>
> Pastebins from subsequent live boot with newer btrfs tools 4.10:
> LiveBoot Repair (Manjaro Arch) specs: https://ptpb.pw/ikMM
> Scrub failing/aborting at same place on /dev/sdb: https://ptpb.pw/-vcP
> badblock_extent_btrfscheck_5123372711936: https://ptpb.pw/T1rD
> 'btrfs inspect-internal dump-tree -b 5123372711936 /dev/sda2':
> https://ptpb.pw/zcyI
> 'btrfs inspect-internal dump-tree -b 5123372711936 /dev/sdb2':
> https://ptpb.pw/zcyI
> dmesg on Manjaro attempting to rsync recover files: https://ptpb.pw/L9Z9

Help on using linux-btrfs mailing list please

2017-06-19 Thread Jesse
I have subscribed successfully and am able to post successfully and
eventually view the post on spinics.net when it becomes available:
eg: http://www.spinics.net/lists/linux-btrfs/msg66605.html

However, I do not know how to reply to messages, especially my own, to
add more information such as a call trace.
1. I do not receive an email copy of my own post to which I could reply.
2. The emails that I do receive from the list come from the respective
sender, not from vger.kernel.org, so I do not even know how to reply to
someone in a way that it ends up on the mailing list and not directly
to that person.

Could someone please be so kind as to direct me to a good guide for
using this mailing list?

Thanks

Jesse


Please help repair probably bitflip damage

2017-06-17 Thread Jesse
My Linux Mint system starts up and is usable; however, I am unable to
complete any scrub, as they abort before finishing. There are various
inode errors in dmesg. Badblocks (read-only) finds no errors. Checking
extents gives bad block 5123372711936 on both /dev/sda2 and /dev/sdb2.
A btrfs check (read-only) results in a 306 MB text file of root xxx
inode errors.
There are two 3 TB drives in RAID1 (sda2/sdb2), where partition 2 is
nearly the entire drive.
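
For reference, a rough sketch of the read-only checks described above
(device names as on this system; btrfs check modifies nothing unless
--repair is given, and badblocks defaults to a read-only scan):

  # read-only metadata check, run from the live boot with the
  # filesystem unmounted; capture the very large output
  btrfs check /dev/sda2 2>&1 | tee sda2-check.log
  btrfs check /dev/sdb2 2>&1 | tee sdb2-check.log

  # non-destructive read-only surface scan of each drive
  badblocks -sv /dev/sda
  badblocks -sv /dev/sdb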

I am currently using a Manjaro live boot with btrfs tools
(btrfs-progs v4.10.1) in an attempt to recover/repair what seems to be
a bitflip.
(The original Linux Mint system has btrfs-progs v4.5.3.)

When scrubbing '/', the /dev/sdb2 scrub always aborts at ~383 GiB with
0 errors, whereas the /dev/sda2 scrub (and thus the '/' scrub) aborts
at more varied values, starting at 537.90 GiB, also with 0 errors.
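
A rough sketch of running the scrub with per-device status (mount
point '/' as above; -B keeps scrub in the foreground, -d prints
per-device statistics):

  # run the scrub in the foreground with per-device statistics
  btrfs scrub start -Bd /

  # check progress, or the result of the last scrub, per device
  btrfs scrub status -d /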

btrfs inspect-internal dump-tree -b 5123372711936 has one item
evidently out of order:
2551224532992 -> 2551253647360 -> 2551251468288
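
One quick way to test the bitflip theory (a rough sketch in plain
bash, using two of the key values quoted above): if a pair of keys
differs by exactly one bit, that points at a RAM bitflip rather than
on-disk corruption.

  # XOR two suspect key values; a power-of-two result means a single flipped bit
  a=2551253647360; b=2551251468288
  x=$(( a ^ b ))
  echo "xor=$x"
  # for non-zero x, (x & (x-1)) is 0 only when exactly one bit is set
  echo $(( x & (x - 1) ))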

I am currently attempting to copy files off the system while in
Manjaro using rsync, prior to attempting whatever the knowledgeable
people here recommend. So far two files could not be read, and there
are a lot of btrfs error messages in dmesg:
https://ptpb.pw/L9Z9
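
For anyone wanting specifics, a rough sketch of the kind of salvage
copy being attempted (placeholder paths; the source can be mounted
read-only so nothing is written back to the damaged filesystem):

  # mount the damaged filesystem read-only and copy what is readable
  # onto the xfs drive
  mount -o ro /dev/sda2 /mnt/src
  rsync -aHAXv /mnt/src/ /mnt/xfs-backup/
  # rsync reports unreadable files at the end (exit code 23) instead
  # of stopping, so the copy continues past the bad files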

Pastebins from original machine:
System specs as on original Linux Mint system: https://ptpb.pw/dFz3
dmesg btrfs grep from prior to errors starting until scrub attempts:
https://ptpb.pw/rTzs

Pastebins from subsequent live boot with newer btrfs tools 4.10:
LiveBoot Repair (Manjaro Arch) specs: https://ptpb.pw/ikMM
Scrub failing/aborting at same place on /dev/sdb: https://ptpb.pw/-vcP
badblock_extent_btrfscheck_5123372711936: https://ptpb.pw/T1rD
'btrfs inspect-internal dump-tree -b 5123372711936 /dev/sda2':
https://ptpb.pw/zcyI
'btrfs inspect-internal dump-tree -b 5123372711936 /dev/sdb2':
https://ptpb.pw/zcyI
dmesg on Manjaro attempting to rsync recover files: https://ptpb.pw/L9Z9

Could someone please advise on the steps to repair this?

Thank you

Jesse

