Re: WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]

2017-09-19 Thread Qu Wenruo



On 2017-09-20 13:10, Qu Wenruo wrote:

On 2017-09-20 12:59, Qu Wenruo wrote:

On 2017-09-20 12:49, Rich Rauenzahn wrote:

On 9/19/2017 5:31 PM, Qu Wenruo wrote:

On 2017-09-19 23:56, Rich Rauenzahn wrote:

[    4.747377] WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559
btrfs_update_device+0x1c5/0x1d0 [btrfs]


Is that line the following WARN_ON()?
---
static inline void btrfs_set_device_total_bytes(struct extent_buffer *eb,
                                                struct btrfs_dev_item *s,
                                                u64 val)
{
        BUILD_BUG_ON(sizeof(u64) !=
                     sizeof(((struct btrfs_dev_item *)0))->total_bytes);
        WARN_ON(!IS_ALIGNED(val, eb->fs_info->sectorsize)); <<<
        btrfs_set_64(eb, s, offsetof(struct btrfs_dev_item, total_bytes), val);
}
---

If so, that means your device's size is not aligned to 4K.

Is your block device still using the old 512-byte block size?
AFAIK most HDDs nowadays use a 4K block size, and it's recommended
to use it.

It's not a big problem, and one can easily remove the WARN_ON().
But I think we'd better fix the caller to do round_down() before
calling this function.




That's interesting!  I believe I made an effort to align them when I 
set it up years ago, but never knew how to verify.


Well, it's best to verify that this is the line causing the warning, since I
don't have the source of the Red Hat kernel.




I have three mirrored filesystems:


[snip]


Number  Start (sector)    End (sector)  Size        Code  Name
    1              40      3907029134   1.8 TiB     8300  BTRFS MEDIA
GPT fdisk (gdisk) version 0.8.6


At least this size is not aligned to 4K.



Partition table scan:

[snip]


...and one is aligned differently!

Could it be /dev/sdd that's the issue?  But it's aligned at 4096 -- 
so I'm not sure that's the issue after all.


Its start sector is aligned, but end point is not, so the size is not 
aligned either.
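
The arithmetic behind that remark can be checked by hand. Below is a minimal sketch in Python, assuming 512-byte logical sectors, the inclusive start/end sectors that gdisk reports, and a 4096-byte btrfs sectorsize; the helper name is made up for illustration and is not part of any tool.

```python
LOGICAL_SECTOR = 512      # bytes per logical sector (assumed, as gdisk reports)
BTRFS_SECTORSIZE = 4096   # assumed filesystem sectorsize (4K)


def partition_bytes(start_sector: int, end_sector: int) -> int:
    """Size in bytes of a partition whose end sector is inclusive."""
    return (end_sector - start_sector + 1) * LOGICAL_SECTOR


# /dev/sdd partition 1 from the gdisk listing: start 40, end 3907029134
size = partition_bytes(40, 3907029134)
start_aligned = (40 * LOGICAL_SECTOR) % BTRFS_SECTORSIZE == 0
size_aligned = size % BTRFS_SECTORSIZE == 0
print(size, start_aligned, size_aligned)
```

The start offset is a multiple of 4096, but the byte size is not, which is the situation the WARN_ON() complains about.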


BTW, was /dev/sdd added to btrfs using "btrfs device add"?
In my test, making btrfs on an unaligned file rounds the size down to
the sectorsize boundary.


Confirmed that "btrfs device add" won't round down the size.
Check the btrfs-debug-tree output:
--
     item 0 key (DEV_ITEMS DEV_ITEM 1) itemoff 16185 itemsize 98
     devid 1 total_bytes 10737418240 bytes_used 2172649472
     io_align 4096 io_width 4096 sector_size 4096 type 0
     generation 0 start_offset 0 dev_group 0
     seek_speed 0 bandwidth 0
     uuid 243a1117-ca31-4d87-8656-81c5630aafb2
     fsid 6452cde7-14d5-4541-aa07-b265a400bad0
     item 1 key (DEV_ITEMS DEV_ITEM 2) itemoff 16087 itemsize 98
     devid 2 total_bytes 1073742336 bytes_used 0
     io_align 4096 io_width 4096 sector_size 4096 type 0
     generation 0 start_offset 0 dev_group 0
     seek_speed 0 bandwidth 0
     uuid 6bb07260-d230-4e22-88b1-1eabb46622ed
     fsid 6452cde7-14d5-4541-aa07-b265a400bad0
--


Sorry, the output is from v4.12.x, which has neither the kernel warning nor
the patch that rounds down the value.




The first device is completely aligned, while the 2nd device, which is just
1G + 512, is definitely not aligned.
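
For readers who want to check the two total_bytes values from the btrfs-debug-tree output, here is a small sketch. The two helpers are Python stand-ins for the kernel's IS_ALIGNED() and round_down() macros (valid for power-of-two alignments only), and the 4096-byte sectorsize is taken from the sector_size field in the dump.

```python
def is_aligned(val: int, align: int) -> bool:
    """Stand-in for the kernel's IS_ALIGNED(); align must be a power of two."""
    return (val & (align - 1)) == 0


def round_down(val: int, align: int) -> int:
    """Stand-in for the kernel's round_down(); align must be a power of two."""
    return val & ~(align - 1)


SECTORSIZE = 4096  # sector_size reported in the dev items

# (devid, total_bytes) pairs copied from the dump
for devid, total_bytes in [(1, 10737418240), (2, 1073742336)]:
    print(devid, is_aligned(total_bytes, SECTORSIZE),
          round_down(total_bytes, SECTORSIZE))
```

Devid 2 would come out as exactly 1 GiB once rounded down, which is the value the setter expects.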


So if you're using a single device created purely by mkfs.btrfs, you're OK.
But if any new device has been added, you're not OK, and that causes the false alert.

Anyway, it should not be hard to fix.
Just remove the WARN_ON() and add an extra round_down when adding a device.


In the v4.13 kernel, newly added devices are in fact rounded down.
But existing devices don't get rounded down.

So if you don't want to wait for the kernel fix, it's recommended to resize
(shrink) your fs by a very small amount to fix it.
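
One way to pick a shrink target is to round the current total_bytes down to the sectorsize. The sketch below only builds the command line as a string; it assumes the `btrfs filesystem resize <devid>:<size> <path>` form, and the devid, byte count, and /mnt mount point are illustrative values, not taken from the reporter's system. Whether shrinking to exactly this value is enough on a given kernel is something to verify rather than assume.

```python
def shrink_command(devid: int, total_bytes: int, mountpoint: str,
                   sectorsize: int = 4096) -> str:
    """Build a resize command that shrinks a device to an aligned size."""
    aligned = total_bytes - (total_bytes % sectorsize)  # round down
    return f"btrfs filesystem resize {devid}:{aligned} {mountpoint}"


# Hypothetical example: devid 2 from the dump above, mounted at /mnt
print(shrink_command(2, 1073742336, "/mnt"))
```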


Thanks,
Qu


Thanks for the report,
Qu



So I'm wondering if it's caused by a newly added btrfs device.

Thanks,
Qu


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html



Re: WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]

2017-09-19 Thread Qu Wenruo



On 2017-09-20 12:59, Qu Wenruo wrote:

On 2017-09-20 12:49, Rich Rauenzahn wrote:

On 9/19/2017 5:31 PM, Qu Wenruo wrote:

On 2017-09-19 23:56, Rich Rauenzahn wrote:

[    4.747377] WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559
btrfs_update_device+0x1c5/0x1d0 [btrfs]


Is that line the following WARN_ON()?
---
static inline void btrfs_set_device_total_bytes(struct extent_buffer *eb,
                                                struct btrfs_dev_item *s,
                                                u64 val)
{
        BUILD_BUG_ON(sizeof(u64) !=
                     sizeof(((struct btrfs_dev_item *)0))->total_bytes);
        WARN_ON(!IS_ALIGNED(val, eb->fs_info->sectorsize)); <<<
        btrfs_set_64(eb, s, offsetof(struct btrfs_dev_item, total_bytes), val);
}
---

If so, that means your device's size is not aligned to 4K.

Is your block device still using the old 512-byte block size?
AFAIK most HDDs nowadays use a 4K block size, and it's recommended
to use it.

It's not a big problem, and one can easily remove the WARN_ON().
But I think we'd better fix the caller to do round_down() before
calling this function.




That's interesting!  I believe I made an effort to align them when I 
set it up years ago, but never knew how to verify.


Well, it's best to verify that this is the line causing the warning, since I
don't have the source of the Red Hat kernel.




I have three mirrored filesystems:


[snip]


Number  Start (sector)    End (sector)  Size        Code  Name
    1              40      3907029134   1.8 TiB     8300  BTRFS MEDIA
GPT fdisk (gdisk) version 0.8.6


At least this size is not aligned to 4K.



Partition table scan:

[snip]


...and one is aligned differently!

Could it be /dev/sdd that's the issue?  But it's aligned at 4096 -- so 
I'm not sure that's the issue after all.


Its start sector is aligned, but end point is not, so the size is not 
aligned either.


BTW, was /dev/sdd added to btrfs using "btrfs device add"?
In my test, making btrfs on an unaligned file rounds the size down to
the sectorsize boundary.


Confirmed that "btrfs device add" won't round down the size.
Check the btrfs-debug-tree output:
--
item 0 key (DEV_ITEMS DEV_ITEM 1) itemoff 16185 itemsize 98
devid 1 total_bytes 10737418240 bytes_used 2172649472
io_align 4096 io_width 4096 sector_size 4096 type 0
generation 0 start_offset 0 dev_group 0
seek_speed 0 bandwidth 0
uuid 243a1117-ca31-4d87-8656-81c5630aafb2
fsid 6452cde7-14d5-4541-aa07-b265a400bad0
item 1 key (DEV_ITEMS DEV_ITEM 2) itemoff 16087 itemsize 98
devid 2 total_bytes 1073742336 bytes_used 0
io_align 4096 io_width 4096 sector_size 4096 type 0
generation 0 start_offset 0 dev_group 0
seek_speed 0 bandwidth 0
uuid 6bb07260-d230-4e22-88b1-1eabb46622ed
fsid 6452cde7-14d5-4541-aa07-b265a400bad0
--

The first device is completely aligned, while the 2nd device, which is just
1G + 512, is definitely not aligned.


So if you're using a single device created purely by mkfs.btrfs, you're OK.
But if any new device has been added, you're not OK, and that causes the false alert.

Anyway, it should not be hard to fix.
Just remove the WARN_ON() and add an extra round_down when adding a device.

Thanks for the report,
Qu



So I'm wondering if it's caused by a newly added btrfs device.

Thanks,
Qu




Re: WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]

2017-09-19 Thread Qu Wenruo



On 2017-09-20 12:49, Rich Rauenzahn wrote:

On 9/19/2017 5:31 PM, Qu Wenruo wrote:

On 2017-09-19 23:56, Rich Rauenzahn wrote:

[    4.747377] WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559
btrfs_update_device+0x1c5/0x1d0 [btrfs]


Is that line the following WARN_ON()?
---
static inline void btrfs_set_device_total_bytes(struct extent_buffer *eb,
                                                struct btrfs_dev_item *s,
                                                u64 val)
{
        BUILD_BUG_ON(sizeof(u64) !=
                     sizeof(((struct btrfs_dev_item *)0))->total_bytes);
        WARN_ON(!IS_ALIGNED(val, eb->fs_info->sectorsize)); <<<
        btrfs_set_64(eb, s, offsetof(struct btrfs_dev_item, total_bytes), val);
}
---

If so, that means your device's size is not aligned to 4K.

Is your block device still using the old 512-byte block size?
AFAIK most HDDs nowadays use a 4K block size, and it's recommended
to use it.

It's not a big problem, and one can easily remove the WARN_ON().
But I think we'd better fix the caller to do round_down() before
calling this function.




That's interesting!  I believe I made an effort to align them when I set 
it up years ago, but never knew how to verify.


Well, it's best to verify that this is the line causing the warning, since I
don't have the source of the Red Hat kernel.




I have three mirrored filesystems:

$ for i in /dev/sd[abcdef]; do sudo gdisk -l $i; done
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
   MBR: protective
   BSD: not present
   APM: not present
   GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 03FFF12A-2EF5-4916-92D9-59C244EFDF5B
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size   Code Name
    1    2048  3907029134   1.8 TiB 8300 BTRFS BACKUPS
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
   MBR: protective
   BSD: not present
   APM: not present
   GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdb: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): B0CF9AC1-7DD0-46CD-AF62-2E54761686C7
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size   Code Name
    1    2048  3907029134   1.8 TiB 8300 BTRFS BACKUPS
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
   MBR: protective
   BSD: not present
   APM: not present
   GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdc: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 21CA2468-8185-4ECA-B63D-8A9A1557F302
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size   Code Name
    1    2048  3907029134   1.8 TiB 8300 BTRFS MEDIA
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
   MBR: protective
   BSD: not present
   APM: not present
   GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdd: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 5214ED9D-769A-4DF8-886F-8EEC3FDD4D0D
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 8-sector boundaries
Total free space is 6 sectors (3.0 KiB)

Number  Start (sector)    End (sector)  Size   Code Name
    1  40  3907029134   1.8 TiB 8300 BTRFS MEDIA
GPT fdisk (gdisk) version 0.8.6


At least this size is not aligned to 4K.



Partition table scan:
   MBR: protective
   BSD: not present
   APM: not present
   GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sde: 234441648 sectors, 111.8 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): D0E4B890-0002-4DA1-B011-24CE7FD435FE
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 234441614
Partitions will be aligned on 2048-sector boundaries
Total free space is 2925 sectors (1.4 MiB)

Number  Start (sector)    End (sector)  Size        Code  Name
    1            2048          411647   200.0 MiB   EF00  EFI System Partition
    2          411648         1435647   500.0 MiB   0700  Primary /boot
    3         1435648       234440703   111.1 GiB   0700  Primary /home
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
   MBR: protective
   BSD: not present
   APM: not present
   GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdf: 234441648 sectors, 111.8 GiB

Re: WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]

2017-09-19 Thread Rich Rauenzahn



On 9/19/2017 5:31 PM, Qu Wenruo wrote:

On 2017-09-19 23:56, Rich Rauenzahn wrote:

[    4.747377] WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559
btrfs_update_device+0x1c5/0x1d0 [btrfs]


Is that line the following WARN_ON()?
---
static inline void btrfs_set_device_total_bytes(struct extent_buffer *eb,
                                                struct btrfs_dev_item *s,
                                                u64 val)
{
        BUILD_BUG_ON(sizeof(u64) !=
                     sizeof(((struct btrfs_dev_item *)0))->total_bytes);
        WARN_ON(!IS_ALIGNED(val, eb->fs_info->sectorsize)); <<<
        btrfs_set_64(eb, s, offsetof(struct btrfs_dev_item, total_bytes), val);
}
---

If so, that means your device's size is not aligned to 4K.

Is your block device still using the old 512-byte block size?
AFAIK most HDDs nowadays use a 4K block size, and it's recommended
to use it.

It's not a big problem, and one can easily remove the WARN_ON().
But I think we'd better fix the caller to do round_down() before
calling this function.




That's interesting!  I believe I made an effort to align them when I set 
it up years ago, but never knew how to verify.


I have three mirrored filesystems:

$ for i in /dev/sd[abcdef]; do sudo gdisk -l $i; done
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sda: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 03FFF12A-2EF5-4916-92D9-59C244EFDF5B
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size   Code Name
   1    2048  3907029134   1.8 TiB 8300 BTRFS BACKUPS
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdb: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): B0CF9AC1-7DD0-46CD-AF62-2E54761686C7
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size   Code Name
   1    2048  3907029134   1.8 TiB 8300 BTRFS BACKUPS
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdc: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 21CA2468-8185-4ECA-B63D-8A9A1557F302
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 2048-sector boundaries
Total free space is 2014 sectors (1007.0 KiB)

Number  Start (sector)    End (sector)  Size   Code Name
   1    2048  3907029134   1.8 TiB 8300 BTRFS MEDIA
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdd: 3907029168 sectors, 1.8 TiB
Logical sector size: 512 bytes
Disk identifier (GUID): 5214ED9D-769A-4DF8-886F-8EEC3FDD4D0D
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 3907029134
Partitions will be aligned on 8-sector boundaries
Total free space is 6 sectors (3.0 KiB)

Number  Start (sector)    End (sector)  Size   Code Name
   1  40  3907029134   1.8 TiB 8300 BTRFS MEDIA
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sde: 234441648 sectors, 111.8 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): D0E4B890-0002-4DA1-B011-24CE7FD435FE
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 234441614
Partitions will be aligned on 2048-sector boundaries
Total free space is 2925 sectors (1.4 MiB)

Number  Start (sector)    End (sector)  Size        Code  Name
    1            2048          411647   200.0 MiB   EF00  EFI System Partition
    2          411648         1435647   500.0 MiB   0700  Primary /boot
    3         1435648       234440703   111.1 GiB   0700  Primary /home
GPT fdisk (gdisk) version 0.8.6

Partition table scan:
  MBR: protective
  BSD: not present
  APM: not present
  GPT: present

Found valid GPT with protective MBR; using GPT.
Disk /dev/sdf: 234441648 sectors, 111.8 GiB
Logical sector size: 512 bytes
Disk identifier (GUID): D1523F65-B975-4A94-8519-3D1679A50342
Partition table holds up to 128 entries
First usable sector is 34, last usable sector is 234441614
Partitions will be aligned on 2048-sector 

Re: difference between -c and -p for send-receive?

2017-09-19 Thread Andrei Borzenkov
19.09.2017 03:41, Dave wrote:
> new subject for new question
> 
> On Mon, Sep 18, 2017 at 1:37 PM, Andrei Borzenkov  wrote:
> 
>>>> What scenarios can lead to "ERROR: parent determination failed"?
>>>
>>> The man page for btrfs-send is reasonably clear on the requirements
>>> btrfs imposes. If you want to use incremental sends (i.e. the -c or -p
>>> options) then the specified snapshots must exist on both the source and
>>> destination. If you don't have a suitable existing snapshot then don't
>>> use -c or -p and just do a full send.
>>>
>>
>> Well, I do not immediately see why -c must imply incremental send. We
>> want to reduce amount of data that is transferred, so reuse data from
>> existing snapshots, but it is really orthogonal to whether we send full
>> subvolume or just changes since another snapshot.
>>
> 
> Starting months ago when I began using btrfs seriously, I have been
> reading, rereading and trying to understand this:
> 
> FAQ - btrfs Wiki
> https://btrfs.wiki.kernel.org/index.php/FAQ#What_is_the_difference_between_-c_and_-p_in_send.3F
> 

This wiki entry is wrong (and as far as I can trust git, it has
always been wrong).

First, "btrfs send -c" does not start with a blank subvolume; it starts
with the "best parent", which is determined automatically. Actually, if you
look at the help output in the very first version of the send command:

"By default, this will send the whole subvolume. To do",
"an incremental send, one or multiple '-i '",
"arguments have to be specified. A 'clone source' is",
"a subvolume that is known to exist on the receiving",
"side in exactly the same state as on the sending side.\n",
"Normally, a good snapshot parent is searched automatically",
"in the list of 'clone sources'. To override this, use",
"'-p ' to manually specify a snapshot parent.",

it explains far better what -c and -p do (ignore -i; this is an error that
was fixed later, it means -c).

Second, the example in the wiki simply does not work. All snapshots listed in -c
options and the snapshot that we want to transfer must have the same parent
uuid, unless -p is explicitly provided. The example shows snapshots of two
different subvolumes. I could not make it work even if A and B
themselves are cloned from a common subvolume.



Re: [PATCH] btrfs-progs: subvolume: outputs message only when operation succeeds

2017-09-19 Thread Satoru Takeuchi
At Tue, 19 Sep 2017 16:41:52 +0900,
Misono, Tomohiro wrote:
> 
> "btrfs subvolume create/delete" outputs the message of "Create/Delete
> subvolume ..." even when an operation fails.
> Since it is confusing, let's output the message only when an operation
> succeeds.
> 
> Signed-off-by: Tomohiro Misono 

The current message, shown below, is odd, as you said.

```
Create subvolume './test'
ERROR: cannot create subvolume: No such file or directory
```

It's hard for users to tell whether creating the subvolume succeeded or not.

I tested this patch by injecting errors into the ioctl() calls for subvolume
creation/deletion.

Reviewed-by: Satoru Takeuchi 
Tested-by: Satoru Takeuchi 

> ---
>  cmds-subvolume.c | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/cmds-subvolume.c b/cmds-subvolume.c
> index 666f6e0..6d4b0fe 100644
> --- a/cmds-subvolume.c
> +++ b/cmds-subvolume.c
> @@ -189,7 +189,6 @@ static int cmd_subvol_create(int argc, char **argv)
>   if (fddst < 0)
>   goto out;
>  
> - printf("Create subvolume '%s/%s'\n", dstdir, newname);
>   if (inherit) {
>   struct btrfs_ioctl_vol_args_v2  args;
>  
> @@ -213,6 +212,7 @@ static int cmd_subvol_create(int argc, char **argv)
>   error("cannot create subvolume: %s", strerror(errno));
>   goto out;
>   }
> + printf("Create subvolume '%s/%s'\n", dstdir, newname);
>  
>   retval = 0; /* success */
>  out:
> @@ -337,9 +337,6 @@ again:
>   goto out;
>   }
>  
> - printf("Delete subvolume (%s): '%s/%s'\n",
> - commit_mode == 2 || (commit_mode == 1 && cnt + 1 == argc)
> - ? "commit" : "no-commit", dname, vname);
>   memset(&args, 0, sizeof(args));
>   strncpy_null(args.name, vname);
>   res = ioctl(fd, BTRFS_IOC_SNAP_DESTROY, &args);
> @@ -349,6 +346,9 @@ again:
>   ret = 1;
>   goto out;
>   }
> + printf("Delete subvolume (%s): '%s/%s'\n",
> + commit_mode == 2 || (commit_mode == 1 && cnt + 1 == argc)
> + ? "commit" : "no-commit", dname, vname);
>  
>   if (commit_mode == 1) {
>   res = wait_for_commit(fd);
> -- 
> 2.9.5
> 


Re: [PATCH v2] btrfs-progs: allow "none" to disable compression for convenience

2017-09-19 Thread Satoru Takeuchi
At Tue, 19 Sep 2017 17:14:27 +0200,
David Sterba wrote:
> 
> On Mon, Sep 18, 2017 at 09:41:17AM +0900, Satoru Takeuchi wrote:
> > At Sun, 17 Sep 2017 14:08:40 +0100,
> > Mike Fleetwood wrote:
> > > 
> > > On 17 September 2017 at 01:36, Satoru Takeuchi
> > >  wrote:
> > > > It's messy to use "" to disable compression. Introduce the new value
> > > > "no" which can also be used for this purpose.
> > > 
> > > From an English language point of view, "none" would be better.  None
> > > says the absence of, whereas no is a more general negative.
> > 
> > Thank you for your comment. How about this?
> > 
> > ---
> > It's messy to use "" to disable compression. Introduce the new value "none"
> > which can also be used for this purpose.
> 
> I'd allow both values, 'no' and 'none', similar to the mount options,
> that also accept both (technically, the 'no' + anything is accepted for
> disabling compression).

After reading "man 5 btrfs", I now prefer "no". It's used there
to mean disabling compression. On the other hand, "none" is
not used at all.

From man 5 btrfs:
===
...
FILE ATTRIBUTES
...
   compress, compress=type, compress-force, compress-force=type
   (default: off)

   Control BTRFS file data compression. Type may be specified as zlib,
   lzo or no (for no compression, used for remounting). If no type is
   specified, zlib is used. If compress-force is specified, all files
   will be compressed, whether or not they compress well.
...
   X
   no compression, permanently turn off compression on the given file,
   other compression mount options will not affect that
...
===

So David, please apply my v1 patch if it looks good to you.

Thanks,
Satoru


Re: SSD caching an existing btrfs raid1

2017-09-19 Thread Duncan
Austin S. Hemmelgarn posted on Tue, 19 Sep 2017 11:47:24 -0400 as
excerpted:

> A better option if you can afford to remove a single device from that
> array temporarily is to use bcache.  Bcache has one specific advantage
> in this case, multiple backend devices can share the same cache device.
> This means you don't have to carve out dedicated cache space for each
> disk on the SSD and leave some unused space so that you can add new
> devices if needed.  The downside is that you can't convert each device
> in-place, but because you're using BTRFS, you can still convert the
> volume as a whole in-place.  The procedure for doing so looks like this:
> 
> 1. Format the SSD as a bcache cache.
> 2. Use `btrfs device delete` to remove a single hard drive from the
> array.
> 3. Set up the drive you just removed as a bcache backing device bound to
> the cache you created in step 1.
> 4. Add the new bcache device to the array.
> 5. Repeat from step 2 until the whole array is converted.
> 
> A similar procedure can actually be used to do almost any underlying
> storage conversion (for example, switching to whole disk encryption, or
> adding LVM underneath BTRFS) provided all your data can fit on one less
> disk than you have.

... And you're not already at the minimum operational device count for that
btrfs array type.

For instance, I have lots of btrfs raid1s, two devices each.  I'd have 
trouble with the above, because I can't simply remove a device, despite 
all the data and metadata fitting on a single device.

I'd have three options that /would/ work, however.

1) If I was willing to risk having only a single checksummed copy of 
everything for the time it would take to process, I could do a btrfs 
balance convert from raid1 to single mode, then do a btrfs device remove, 
and go from there, converting back to raid1 after I was done.  Of course 
if anything goes wrong with that single copy and something fails 
checksum, I've potentially lost whatever failed the checksum unless I 
have it on backup, so converting to single mode is risky.

2) If the data and metadata will fit on /half/ of a single device, then I 
could convert to dup mode (since btrfs now allows dup data chunks) taking 
double the space on that single device, but keeping two checksummed 
copies of everything due to the dup mode, thus avoiding the risk of #1.

3) If #2 won't work due to size and I was unwilling to risk #1, I'd have 
to add a device (making it the first one converted to bcache before the 
add) before I could remove one of the others, in order to allow me to 
keep the checksummed redundancy of raid1 for the entire procedure.  After 
the bcache conversion and addition of the new device, and the removal, 
conversion, and readdition of one of the two existing devices, I could 
simply remove the third, giving me back my spare device, but I'd have to 
have and use it for the process.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: Does btrfs use crc32 for error correction?

2017-09-19 Thread Timofey Titovets
2017-09-19 21:04 GMT+03:00 Marat Khalili :
> Would be cool, but probably not wise IMHO, since on modern hardware you 
> almost never get one-bit errors (usually it's a whole sector of garbage), and 
> therefore you'd more often see an incorrect recovery than actually fixed bit.
> --
>
> With Best Regards,
> Marat Khalili

Over the past 2 months, I've been thinking about some parity solution for
btrfs, as a trade-off between full duplication and the single
profile.

Something like a variable stripe length for meta(data), to fix
single-sector errors;
that is, calculate a one-time parity for each written extent.

But for now, I think that this cannot be done without tricks or
changing the format.
(If you do this at the FS level, i.e. use space in the B-tree, this
will create one more exception; if you do this at the block level, that
needs a "new" raid5, which is not cool.)
But these are only my thoughts.

Therefore, I recalled that a CRC in theory allows fixing a one-bit error,
but for now I do not have enough knowledge about CRC to try and
implement a proof of concept for this = \.

(I think one-bit CRC correction could be useful not only in btrfs
code.)
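
The one-bit idea can be illustrated by brute force: flip each bit in turn and see whether the stored checksum matches again. This is only a sketch of the principle, not a proposal for how btrfs would do it. It uses zlib.crc32 (plain CRC-32) as a stand-in, because the Python standard library does not provide the CRC32C that btrfs uses, and the function name is made up for the example.

```python
import zlib


def fix_single_bit(data: bytes, expected_crc: int):
    """Return data with one bit flipped so that it matches expected_crc.

    Returns data unchanged if it already matches, or None if no
    single-bit flip produces a match.
    """
    if zlib.crc32(data) == expected_crc:
        return data
    buf = bytearray(data)
    for i in range(len(buf)):
        for bit in range(8):
            buf[i] ^= 1 << bit          # try flipping this bit
            if zlib.crc32(buf) == expected_crc:
                return bytes(buf)
            buf[i] ^= 1 << bit          # undo and keep searching
    return None


good = b"Test String: some_text_data\n"
bad = b"Test String: snme_text_data\n"   # 'o' -> 'n' is a one-bit change
print(fix_single_bit(bad, zlib.crc32(good)) == good)
```

The search costs one CRC computation per bit, so it is only practical as a last resort on small blocks, and a block with more than one damaged bit would need a different approach.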

But I also remember that btrfs has a lot of unused checksum space in
the checksum tree:
32 bytes of the checksum field, 4 bytes for CRC32C => 28 bytes of freedom =)

So for now I'm thinking about calculating a parity and keeping it with the checksum data.
Proof of concept (code):
https://github.com/Nefelim4ag/CRC32C_AND_8Byte_parity

As I see it:
1. Btrfs calculates 8/16 bytes of parity and 4 bytes of CRC32C;
the 8/16 parity bytes are stored at the end of the checksum field.
2. Compatibility bits?
Reasons against:
 - For an old kernel nothing changes, and the same goes
for old btrfs progs
 - It's possible to silently assume that a parity is present and try
to use it for the fixup,
 because if it's missing or broken, we just fall back to the old behaviour
Reasons for:
 - Maybe we need something to show that btrfs has parity +
CRC32 for this data?
3. For x86_64, this works comparably fast to HW CRC32:
--- Checking speed of hash / parity functions ---
PAGE_SIZE: 4096, number of cycles: 1048576
Parity64: 0xf7182ccbfc34f088 perf: 233750 usec, th: 18374.191641 MiB / s
parity32: 0xb2cdc43          perf: 464824 μs,   th:  9239.986094 MiB / s
crc32:    0xa4aa10b2         perf: 312446 μs,   th: 13746.270703 MiB / s
xxhash64: 0x77e7064e1a16f422 perf: 367570 μs,   th: 11684.760171 MiB / s
4. If a CRC mismatch is detected, try to correct the data using the parity (for
the single profile only):
4.1. Make a tmp copy of the data
4.2. Suppose that block / stripe 0+N is damaged;
   recompute that block from the parity
4.3. Check the CRC for the page:
   - mismatch? -> N + 1 -> Go to 4.1
   - match? -> Hooray! -> Overwrite the broken block

That solution would easily fix most sorts of bit flips and -local-
corruption of up to 1-16 bytes.
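
The recovery loop in step 4 can be sketched as follows. This is an illustration under stated assumptions, not the code from the linked repository: the parity is taken to be a plain XOR of all 8-byte words of a page, zlib.crc32 stands in for CRC32C, and the function names are invented for the example.

```python
import zlib

WORD = 8  # parity width in bytes (assumed; the proposal mentions 8 or 16)


def xor_parity(page: bytes) -> int:
    """XOR all WORD-sized words of the page into one WORD-sized value."""
    acc = 0
    for off in range(0, len(page), WORD):
        acc ^= int.from_bytes(page[off:off + WORD], "little")
    return acc


def try_repair(page: bytes, parity: int, crc: int):
    """Return a page matching crc, or None if no single-word fix works."""
    if len(page) % WORD:
        raise ValueError("page length must be a multiple of WORD")
    if zlib.crc32(page) == crc:
        return page                       # nothing to repair
    current = xor_parity(page)
    for off in range(0, len(page), WORD):
        word = int.from_bytes(page[off:off + WORD], "little")
        # Assume this word is the damaged one: the XOR of all other words
        # is current ^ word, so the original word is parity ^ that value.
        rebuilt = (parity ^ current ^ word).to_bytes(WORD, "little")
        candidate = page[:off] + rebuilt + page[off + WORD:]
        if zlib.crc32(candidate) == crc:
            return candidate              # checksum confirms the guess
    return None


page = bytes(range(256)) * 16             # one 4096-byte page
parity, crc = xor_parity(page), zlib.crc32(page)
damaged = bytearray(page)
damaged[1234] ^= 0x01                     # flip a single bit
repaired = try_repair(bytes(damaged), parity, crc)
print(repaired == page)
```

The checksum is what makes the guess safe: the parity alone can rebuild any word, and only a CRC match tells which rebuilt candidate is the right one. Damage that spans more than one parity word is outside what this sketch can repair.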

Possible parity combinations:
1 byte: x1 or x2 or x4 or x8 or x16
2 byte: x1 or x2 or x4 or x8
4 byte: x1 or x2 or x4
8 byte: x1 or x2 - fastest on x86_64 (I don't have other CPUs to test on)

What do you think about that?

Thanks.

P.S.
Script for reproducing the 1-bit error case, where the FS can't be mounted:
#!/bin/bash
# Run as root: needs mount/umount and btrfs-progs.

DISK_IMAGE="$(mktemp)"
MNT_TMP_DIR="$(mktemp -d)"

# Create a small image with single (non-duplicated) metadata
truncate -s 48M "$DISK_IMAGE"
mkfs.btrfs -f -L CRC_TEST -m single "$DISK_IMAGE"

# Write a known string into the filesystem
mount "$DISK_IMAGE" "$MNT_TMP_DIR"
echo "Test String: some_text_data" | tee "$MNT_TMP_DIR/file.txt"
umount "$MNT_TMP_DIR"

# 'o' (0x6f) and 'n' (0x6e) differ in exactly one bit
echo "Add 1 bit error: o -> n"
sed -i 's/some_text_data/snme_text_data/g' "$DISK_IMAGE"
btrfs check -b -p "$DISK_IMAGE"

echo "Fix 1 bit error: n -> o"
sed -i 's/snme_text_data/some_text_data/g' "$DISK_IMAGE"
btrfs check -b -p "$DISK_IMAGE"

# With the flip undone, try to read the file back
mount "$DISK_IMAGE" "$MNT_TMP_DIR"
cat "$MNT_TMP_DIR/file.txt"
umount "$MNT_TMP_DIR"

rm -fv "$DISK_IMAGE"
rmdir "$MNT_TMP_DIR"


-- 
Have a nice day,
Timofey.


RE: SSD caching an existing btrfs raid1

2017-09-19 Thread Paul Jones
> -----Original Message-----
> From: linux-btrfs-ow...@vger.kernel.org [mailto:linux-btrfs-
> ow...@vger.kernel.org] On Behalf Of Pat Sailor
> Sent: Wednesday, 20 September 2017 1:31 AM
> To: Btrfs BTRFS 
> Subject: SSD caching an existing btrfs raid1
> 
> Hello,
> 
> I have a half-filled raid1 on top of six spinning devices. Now I have come into
> a spare SSD I'd like to use for caching, if possible without having to rebuild or,
> failing that, without having to give up btrfs and flexible reshaping.
> 
> I've been reading about the several options out there; I thought that
> EnhanceIO would be the simplest bet but unfortunately I couldn't get it to
> build with my recent kernel (last commits are from years ago).
> 
> Failing that, I read that lvmcache could be the way to go. However, I can't
> think of a way of setting it up in which I retain the ability to
> add/remove/replace drives as I can do now with pure btrfs; if I opted to drop
> btrfs to go to ext4 I still would have to offline the filesystem for downsizes.
> Not a frequent occurrence I hope, but now I'm used to being able to keep working
> while I reshape things in btrfs, and it's better if I can avoid large downtimes.
> 
> Is what I want doable at all? Thanks in advance for any
> suggestions/experiences to proceed.

When I did mine I used a spare disk to create the initial LVM filesystem, then
used btrfs dev replace to swap one of the raid1 mirrors to the new lvm device.
Then I repeated that for the other mirror.
My suggestion for lvm setup is to use two different pools, one for each btrfs
mirror. That ensures you don't accidentally end up with both btrfs mirrors
sharing one physical disk, or with lvm using the same SSD to cache both disks.

Paul.


[PATCH] Btrfs: use btrfs_op instead of bio_op in __btrfs_map_block

2017-09-19 Thread Liu Bo
This seems to be a leftover of commit cf8cddd38bab ("btrfs: don't
abuse REQ_OP_* flags for btrfs_map_block").

It should use btrfs_op() helper to provide one of 'enum btrfs_map_op'
types.

Signed-off-by: Liu Bo 
---
 fs/btrfs/volumes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index bd679bc..bd7b75d3 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -6229,7 +6229,7 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio,
 	map_length = length;
 
 	btrfs_bio_counter_inc_blocked(fs_info);
-	ret = __btrfs_map_block(fs_info, bio_op(bio), logical,
+	ret = __btrfs_map_block(fs_info, btrfs_op(bio), logical,
 				&map_length, &bbio, mirror_num, 1);
 	if (ret) {
 		btrfs_bio_counter_dec(fs_info);
-- 
2.9.4



Re: Problems with debian install

2017-09-19 Thread Qu Wenruo



On 2017-09-20 05:35, Pierre Couderc wrote:
> I am trying to install stretch on a computer with 2 btrfs disks and EFI.
>
> Is there a howto for that? Has anyone succeeded?
>
> I run into problems as soon as I reach partitioning in the Debian installer.
> What partitions do I need? I understand that grub needs them even if btrfs
> does not require them. But I am lost, I think because of EFI...


For EFI, you must have one EFI system partition (ESP), which must be formatted
as FAT.

So, at least, you can't use Btrfs for the ESP.

Apart from that, anything the kernel can read can be your
/boot (this mainly matters for grub) or /.

It will be even simpler if you use systemd-boot (previously called
gummiboot): just put the kernel (with EFI stub configured) and the initrd into
the ESP, mount the ESP as /boot, and everything is done.



For reference, my current fs layout is just like this:
nvme0n1         259:0    0 238.5G  0 disk
├─nvme0n1p1     259:1    0   512M  0 part /boot
└─nvme0n1p2     259:2    0   238G  0 part
  ├─system-root 254:0    0    32G  0 lvm  /
  ├─system-swap 254:1    0     4G  0 lvm  [SWAP]
  └─system-home 254:2    0   150G  0 lvm  /home

As you can see, I'm using systemd-boot, reusing the ESP as /boot.
As long as a filesystem or stacked block device is supported by the kernel, you
can use it as you wish.

> I see nothing on that in google.

Well, Arch Linux has quite a nice wiki page for this:
https://wiki.archlinux.org/index.php/Unified_Extensible_Firmware_Interface

Thanks,
Qu

> Thanks
>
> PC



Re: WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]

2017-09-19 Thread Qu Wenruo



On 2017-09-19 23:56, Rich Rauenzahn wrote:
> I've filed a bug on this kernel trace -- I get 100's of these a day.
> I'd like to make them go away
>
> https://bugzilla.kernel.org/show_bug.cgi?id=196949
>
> [4.747356] ------------[ cut here ]------------
> [4.747377] WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559
> btrfs_update_device+0x1c5/0x1d0 [btrfs]


Is that line the following WARN_ON()?
---
static inline void btrfs_set_device_total_bytes(struct extent_buffer *eb,
						struct btrfs_dev_item *s,
						u64 val)
{
	BUILD_BUG_ON(sizeof(u64) !=
		     sizeof(((struct btrfs_dev_item *)0))->total_bytes);
	WARN_ON(!IS_ALIGNED(val, eb->fs_info->sectorsize)); <<<
	btrfs_set_64(eb, s, offsetof(struct btrfs_dev_item, total_bytes), val);
}
---

If so, that means your device's size is not aligned to 4K.

Is your block device still using the old 512-byte block size?
AFAIK most HDDs nowadays use a 4K block size, and it's recommended to
use it.


It's not a big problem and one can easily remove the WARN_ON().
But I think we'd better fix the caller to do round_down() before calling
this function.
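
As a rough illustration of the alignment check and the proposed round_down(), here is a small sketch. It is not kernel code: the two helpers only mirror the kernel macros for power-of-two alignments, the 4096-byte sectorsize is an assumption, and the partition boundaries are illustrative values for a partition measured in 512-byte sectors.

```python
SECTORSIZE = 4096  # btrfs sectorsize assumed for this example


def is_aligned(val: int, align: int) -> bool:
    """Mirror of the kernel's IS_ALIGNED() for power-of-two alignments."""
    return (val & (align - 1)) == 0


def round_down(val: int, align: int) -> int:
    """Mirror of the kernel's round_down() for power-of-two alignments."""
    return val & ~(align - 1)


# Illustrative partition: first and last 512-byte sector, both inclusive.
first_sector, last_sector = 40, 3907029134
size = (last_sector - first_sector + 1) * 512

print("aligned to 4K:", is_aligned(size, SECTORSIZE))
print("bytes trimmed by round_down:", size - round_down(size, SECTORSIZE))
```

A partition can start on a 4K boundary and still have a size that is not a multiple of 4K, which is the case that triggers the WARN_ON() when the unrounded size is stored.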


Thanks,
Qu


> [4.747377] Modules linked in: nfs_acl lockd grace sunrpc ip_tables
> btrfs xor raid6_pq sd_mod crc32c_intel firewire_ohci igb ahci
> firewire_core crc_itu_t dca libahci i915 libata i2c_algo_bit e1000e
> drm_kms_helper ptp syscopyarea sysfillrect pps_core sysimgblt
> fb_sys_fops drm video
> [4.747385] CPU: 3 PID: 439 Comm: btrfs-cleaner Not tainted
> 4.13.2-1.el7.elrepo.x86_64 #1
> [4.747385] Hardware name: Supermicro X10SAE/X10SAE, BIOS 2.0a 05/09/2014
> [4.747386] task: 88040cdcae80 task.stack: c900021f4000
> [4.747396] RIP: 0010:btrfs_update_device+0x1c5/0x1d0 [btrfs]
> [4.747396] RSP: 0018:c900021f7d00 EFLAGS: 00010206
> [4.747397] RAX: 0fff RBX: 880407b7aa80 RCX: 001bc6c71e00
> [4.747397] RDX: 8800 RSI: 880404cd3f3c RDI: 880409417b58
> [4.747398] RBP: c900021f7d48 R08: 3f60 R09: c900021f7cb8
> [4.747398] R10: 1000 R11: 0003 R12: 88040559f800
> [4.747398] R13:  R14: 880409417b58 R15: 3f3c
> [4.747399] FS:  () GS:88041fac()
> knlGS:
> [4.747399] CS:  0010 DS:  ES:  CR0: 80050033
> [4.747400] CR2: 7f29c3000248 CR3: 0004056a4000 CR4: 001406e0
> [4.747400] Call Trace:
> [4.747410]  btrfs_remove_chunk+0x2fb/0x8b0 [btrfs]
> [4.747418]  btrfs_delete_unused_bgs+0x363/0x440 [btrfs]
> [4.747426]  cleaner_kthread+0x150/0x180 [btrfs]
> [4.747429]  kthread+0x109/0x140
> [4.747436]  ? btree_invalidatepage+0xa0/0xa0 [btrfs]
> [4.747437]  ? kthread_park+0x60/0x60
> [4.747439]  ret_from_fork+0x25/0x30
> [4.747439] Code: 10 00 00 00 4c 89 fe e8 8a 30 ff ff 4c 89 f7 e8
> 32 f6 fc ff e9 d3 fe ff ff b8 f4 ff ff ff e9 d4 fe ff ff 0f 1f 00 e8
> bb 2e 9e e0 <0f> ff eb af 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 31 d2
> be 02
> [4.747450] ---[ end trace 1ef80a625983d73b ]---


Re: btrfs-progs: suggestion of removing --commit-after option of subvol delete

2017-09-19 Thread Qu Wenruo



On 2017-09-19 22:48, David Sterba wrote:
> On Tue, Sep 19, 2017 at 04:50:04PM +0900, Misono, Tomohiro wrote:
>> I read the code of "subvolume delete" and found that --commit-after option is
>> not working well.
>>
>> Since it issues BTRFS_IOC_START/WAIT_SYNC to the last fd (of directory
>> containing the last deleted subvolume),
>> 1. sync operation affects only the last fd's filesystem.
>>    ("subvolume delete" can take multiple subvolumes on different filesystems.)
>
> You're right, though I don't think this is a common usecase.
>
>> 2. if the last delete action fails to open the path (fd == -1),
>>    SYNC is not issued at all.
>>
>> One solution is to keep every fd for deleted subvolumes, but I think it takes
>> too much cost.
>
> How do you evaluate the cost? We'd have to keep track of all the
> distinct filesystems of the subvolumes, so we issue sync on each of them
> in the end.


The costly part will be tracking the filesystems of the subvolumes.
We must do it for each subvolume and batch them so that the final
transaction commit is issued once per fs.

I didn't see any easy ioctl to get the UUID from an fd, meaning (if I didn't
miss anything) we need to walk up the path until reaching the mount
boundary and then refer to mountinfo to find the fs.

Not to mention that a possible softlink in the path may make things
more complex.
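
To give a feel for the bookkeeping involved, here is a minimal sketch of the mountinfo lookup. It is an illustration only, not btrfs-progs code: the helper names are hypothetical, the path is assumed to be absolute with symlinks already resolved (for example via os.path.realpath), octal escapes such as \040 in mount points are ignored, and grouping by the mount source field is an assumption that does not hold for every multi-device setup.

```python
def find_mount(path: str, mountinfo: str):
    """Return (mount_point, fstype, source) for the longest mount point
    that is a prefix of path, or None if nothing matches.

    mountinfo is the text of /proc/self/mountinfo.
    """
    best = None
    for line in mountinfo.splitlines():
        fields = line.split()
        if "-" not in fields:
            continue  # not a well-formed mountinfo line
        sep = fields.index("-")  # separator before "fstype source options"
        mount_point = fields[4]
        fstype, source = fields[sep + 1], fields[sep + 2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # ">=" lets a later mount on the same mount point win
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fstype, source)
    return best


def group_by_filesystem(paths, mountinfo: str):
    """Map each mount source to the list of paths that live on it."""
    groups = {}
    for p in paths:
        m = find_mount(p, mountinfo)
        if m is not None:
            groups.setdefault(m[2], []).append(p)
    return groups
```

With the paths grouped this way, one sync per group would be enough; the hard parts mentioned above (symlinks, escaping, several devices behind one filesystem) are exactly what the sketch leaves out.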


Yes, this may be fixed with tons of code, but I don't think the
complexity is worth it for a minor feature.
(Remember how --rootdir went untested and became buggy over time? And we don't
even know how to write a test case to verify --commit-after/each.)


Thanks,
Qu




>> Since we can just use "btrfs filesystem sync" after delete if
>> needed, I think it is ok to remove --commit-after option.
>
> The point of the option is to allow the sync in the same command as the
> subvolume deletion. Even with the sync, the user would still have to
> know all the filesystems where the subvolumes were, so some way of
> tracking them would need to be done.
>
> I don't want to remove the option unless it's really not working as
> expected and could mislead the users. What you found are bugs that can
> be fixed.


Re: btrfs-progs: suggestion of removing --commit-after option of subvol delete

2017-09-19 Thread Misono, Tomohiro
On 2017/09/19 23:48, David Sterba wrote:
> On Tue, Sep 19, 2017 at 04:50:04PM +0900, Misono, Tomohiro wrote:
>> I read the code of "subvolume delete" and found that --commit-after option is
>> not working well.
>>
>> Since it issues BTRFS_IOC_START/WAIT_SYNC to the last fd (of directory
>> containing the last deleted subvolume),
>> 1. sync operation affects only the last fd's filesystem.
>>    ("subvolume delete" can take multiple subvolumes on different filesystems.)
> 
> You're right, though I don't think this is a common usecase.
> 
>> 2. if the last delete action fails to open the path (fd == -1),
>>    SYNC is not issued at all.
>>
>> One solution is to keep every fd for deleted subvolumes, but I think it takes
>> too much cost.
> 
> How do you evaluate the cost? We'd have to keep track of all the
> distinct filesystems of the subvolumes, so we issue sync on each of them
> in the end.

This potentially keeps many fds open for a long time. Since the number of open
file descriptors is limited, I don't think it is good to hold fds for that long.

>> Since we can just use "btrfs filesystem sync" after delete if
>> needed, I think it is ok to remove --commit-after option.
> 
> The point of the option is to allow the sync in the same command as the
> subvolume deletion. Even with the sync, the user would still have to
> know all the filesystems where the subvolumes were, so some way of
> tracking them would need to be done.
> 
> I don't want to remove the option unless it's really not working as
> expected and could mislead the users. What you found are bugs that can
> be fixed.
> 
> 



Problems with debian install

2017-09-19 Thread Pierre Couderc

I am trying to install stretch on a computer with 2 btrfs disks and EFI.

Is there a howto for that? Has anyone succeeded?

I run into problems as soon as I reach partitioning in the Debian installer.
What partitions do I need? I understand that grub needs them even if btrfs
does not require them. But I am lost, I think because of EFI...


I see nothing on that in google.

Thanks

PC



Re: ERROR: parent determination failed (btrfs send-receive)

2017-09-19 Thread Andrei Borzenkov
18.09.2017 09:10, Dave wrote:
> I use snap-sync to create and send snapshots.
> 
> GitHub - wesbarnett/snap-sync: Use snapper snapshots to backup to external drive
> https://github.com/wesbarnett/snap-sync
> 

Are you trying to back up the top-level subvolume? I just reproduced this
behavior with this tool. The problem is that snapshots of the top-level
subvolume do not have a parent UUID (I am not even sure the UUID exists at
all, TBH). If you mount any other subvolume, it will work. On openSUSE,
root is always mounted as a subvolume (actually, the very first snapshot),
which explains why I did not see it before.

I.e.

mkfs -t btrfs /dev/sdb1
mount /dev/sdb1 /test
snapper -c test create-config /test

An attempt to run "snap-sync -c test" will fail the second time. But

btrfs sub create /test/@
umount /test
mount -o subvol=@ /dev/sdb1 /test
snapper -c test create-config /test
...

will work.

As I told you in my first reply, showing the output of "btrfs su li -qu
/path/to/src" would have explained your problem much earlier.

Actually, if snap-sync used "btrfs send -p" instead of "btrfs send -c" it
would work as well, as then no parent search would be needed (and as I
mentioned in another mail both commands are functionally equivalent).
But this is becoming really off-topic for this list. As already suggested,
open an issue for snap-sync.


Re: Storage and snapshots as historical yearly

2017-09-19 Thread Austin S. Hemmelgarn

On 2017-09-19 14:33, Andrei Borzenkov wrote:
> 19.09.2017 14:49, Senén Vidal Blanco wrote:
>> Perfect!! Just what I was looking for.
>> Sorry for the delay, because before doing so, I preferred to test to see if it
>> actually worked.
>>
>> I have a question. The system works perfectly, but I fail to understand the
>> process for deleting the writable disk and merging its data onto the
>> read-only disk.
>>
>> I have tried to remove the seed bit on disk A and delete the writable disk B
>> as you mentioned, so as to move the data to A, but it tells me that disk B
>> does not exist.
>> These are the commands I ran:
>>
>> md127-> A
>> md126-> B
>>
>> btrfstune -S 0 /dev/md127
>> mount /dev/md127 /mnt (I mount this disk since the md126 gives error)
>> btrfs device delete /dev/md126 /mnt
>> ERROR: error removing device '/dev/md126': No such file or directory
>>
>> Another thing I've tried is to remove disk B without removing the seed bit,
>> but it gives me the error:
>>
>> ERROR: error removing device '/dev/md126': unable to remove the only writeable
>> device.
>>
>> Any ideas about it?


> Yes, sorry about it. Clearing seed flag on device invalidates
> filesystem. What you can do, is to rotate devices. I.e. remove
> /dev/md126, set seed flag on md127 and add md126 back.
>
> I actually tested it and it works for me.

Same.  FWIW, this (like many other things in BTRFS) could stand to be
better documented, especially since it's not likely to be intuitive for
most people.



>> Thank you very much for the reply.
>> Greetings.
>>
>> On Tuesday, 12 September 2017 6:34:15 (CEST) Andrei Borzenkov wrote:
>>> 11.09.2017 21:17, Senén Vidal Blanco wrote:
>>>> I am trying to implement a system that stores the data in a unit (A) with
>>>> BTRFS format that is untouchable and that future files and folders created
>>>> or modified are stored in another physical unit (B) with BTRFS format.
>>>> Each year the new files will be moved to store A and start over.
>>>>
>>>> The idea is that a duplicate of disk A can be made to keep it in a safe
>>>> place and that the files stored there can not be modified until the
>>>> mixture of (A) and (B) is made.
>>>
>>> This can probably be achieved using seed device. Mark original device as
>>> seed and all changes will go to another writable device, similar to
>>> overlay; then remove seed bit from original device, "btrfs device remove
>>> writable" device and it should relocate its content back. Rinse and repeat.









Re: Storage and snapshots as historical yearly

2017-09-19 Thread Senén Vidal Blanco
Hi Austin,
Thank you very much for your answer.
I'll comment briefly on your suggestions inline.

On Monday, 11 September 2017 14:49:05 (CEST) Austin S. Hemmelgarn wrote:
> On 2017-09-11 14:17, Senén Vidal Blanco wrote:
> > I am trying to implement a system that stores the data in a unit (A) with
> > BTRFS format that is untouchable and that future files and folders created
> > or modified are stored in another physical unit (B) with BTRFS format.
> > Each year the new files will be moved to store A and start over.
> > 
> > The idea is that a duplicate of disk A can be made to keep it in a safe
> > place and that the files stored there can not be modified until the
> > mixture of (A) and (B) is made.
> 
> Before I get into anything further, I would like to comment that this is
> a very odd use case.  Yearly granularity doesn't make sense for backups
> unless you generate very little data throughout the year (otherwise
> backups will take forever) and can afford to lose multiple months of
> that data.
> 
> The timescale you're talking about combined with the requirement that
> files not be modifiable on A except during the times when you sync
> changes indicates you will probably be much better served by proper
> off-line archival storage than some online configuration as you appear
> to be trying to create.

I totally agree with you, although backup is not really the objective here; for
that I use Bareos as the backup system. It is simply that I need to store almost
2 terabytes, keep a copy of the data outside the enclosure along with 3
years of duplicate disks, and be able to access the old data almost
instantaneously.
With a backup system it would take several days to restore the entire system,
or several hours to rebuild the file system and restore the file you need,
whereas with an image from last year you would only have to restore one year
at most.

> 
> > I have looked for information on this but I do not see it very clear. Both
> > "SEND" and "RECEIVE" do the opposite case to what interests me.
> 
> If you can afford to operate at a shorter timescale (even monthly), then
> BTRFS snapshots plus send and receive probably are one of the best
> options.  Perhaps you could explain your understanding of send and
> receive and why you think it won't work, and I (and possibly other
> people on the list as well) could confirm whether or not you understand
> correctly.

That is another long-term goal that I would like to implement, since I have
seen that it can be combined with external storage and can move the data
through snapshots. It uses little bandwidth and seems to be more immediate
than Bareos, although I would keep both systems for safety, since this is
very sensitive and important data that must not be lost.

> 
> > I have also seen that you could try to get a RAID 0 but I am afraid that
> > the data in A will not remain intact if the system performs a "BALANCE"
> > at some point and mixes data between (A) and (B).
> > 
> > It is assumed that both the (A) and (B) data will be displayed in the same
> > structure transparently, for example in "/ home".
> > 
> > ---------------------------------
> > |             HOME              |
> > |                               |
> > | | DISK (A) |   | DISK (B) |   |
> > | |  BTRFS   |   |  BTRFS   |   |
> > |                               |
> > ---------------------------------
> 
> What you are describing here is called an overlay or union mount.
> Source A would be your lower directory, and B would be your upper
> directory.  Unmodified data is passed through from the lower directory
> until a file is changed.  When changes are made to the overlay mount,
> they are reflected on the upper directory.  Special files called
> 'whiteout' files are used in the upper directory to represent deleted items.
> 
> Unfortunately, I don't know of any overlay mount implementation that
> works correctly and reliably with BTRFS.  I know for a fact that
> OverlayFS (the upstream in-kernel implementation) does not work, and I
> believe that AUFS3 and UnionFS (the third-party options that are used by
> most LiveCD's) don't work either.  UnionFS-FUSE (a userspace
> implementation completely unrelated to UnionFS) might work, but I've
> never tested it and it will likely have performance issues because it's
> implemented in userspace.  As far as I know, whiteout support is the
> primary missing piece here, but I may be mistaken.
> 
> Alternatively, this could be done with a seed device.  The concept is
> pretty similar to an overlay mount, but it operates at a lower level,
> and it's a BTRFS specific feature.  Unfortunately, it's not well
> documented, and I'm not confident that I could explain how to do it
> correctly.

This part does seem very interesting to me, since one of your colleagues has
mentioned using "Seed" with BTRFS, and I am now looking at how it
works; and I can see how this would be OverlayFS, 

How to recover failing filesystem?

2017-09-19 Thread Dmitry Kudriavtsev
Hello,

My btrfs filesystem is constantly remounting itself read-only. Relevant log 
segments: https://hastebin.com/uyegipadex.log 
https://hastebin.com/wakihobivi.log

How do I fix these errors?


Thank you,
Dmitry Kudriavtsev



Re: Storage and snapshots as historical yearly

2017-09-19 Thread Andrei Borzenkov
19.09.2017 14:49, Senén Vidal Blanco wrote:
> Perfect!! Just what I was looking for.
> Sorry for the delay, because before doing so, I preferred to test to see if it
> actually worked.
> 
> I have a question. The system works perfectly, but I fail to understand the
> process for deleting the writable disk and merging its data onto the
> read-only disk.
> 
> I have tried to remove the seed bit on disk A and delete the writable disk B
> as you mentioned, so as to move the data to A, but it tells me that disk B
> does not exist.
> These are the commands I ran:
> 
> md127-> A
> md126-> B
> 
> btrfstune -S 0 /dev/md127
> mount /dev/md127 /mnt (I mount this disk since the md126 gives error)
> btrfs device delete /dev/md126 /mnt
> ERROR: error removing device '/dev/md126': No such file or directory
> 
> Another thing I've tried is to remove disk B without removing the seed bit, 
> but it gives me the error:
> 
> ERROR: error removing device '/dev/md126': unable to remove the only writeable
> device.
> 
> Any ideas about it?

Yes, sorry about it. Clearing seed flag on device invalidates
filesystem. What you can do, is to rotate devices. I.e. remove
/dev/md126, set seed flag on md127 and add md126 back.

I actually tested it and it works for me.

> Thank you very much for the reply.
> Greetings.
> 
> On Tuesday, 12 September 2017 6:34:15 (CEST) Andrei Borzenkov wrote:
>> 11.09.2017 21:17, Senén Vidal Blanco wrote:
>>> I am trying to implement a system that stores the data in a unit (A) with
>>> BTRFS format that is untouchable and that future files and folders created
>>> or modified are stored in another physical unit (B) with BTRFS format.
>>> Each year the new files will be moved to store A and start over.
>>>
>>> The idea is that a duplicate of disk A can be made to keep it in a safe
>>> place and that the files stored there can not be modified until the
>>> mixture of (A) and (B) is made.
>>
>> This can probably be achieved using seed device. Mark original device as
>> seed and all changes will go to another writable device, similar to
>> overlay; then remove seed bit from original device, "btrfs device remove
>> writable" device and it should relocate its content back. Rinse and repeat.
> 






Re: snapshots of encrypted directories?

2017-09-19 Thread Dave
On Fri, Sep 15, 2017 at 6:01 AM, Ulli Horlacher
 wrote:
>
> On Fri 2017-09-15 (06:45), Andrei Borzenkov wrote:
>
> > The actual question is - do you need to mount each individual btrfs
> > subvolume when using encfs?
>
> And even worse it goes with ecryptfs: I do not know at all how to mount a
> snapshot, so that the user has access to it.
>
> It seems snapshots are incompatible with encrypted filesystems :-(


My experience is the opposite. I use dm-crypt as well as encfs with
BTRFS and everything, including snapshots, works as I would expect it
to work.

I have been able to successfully restore snapshots that contained
encrypted data.

I think the other answers have already provided more details than I
could provide, so I just wanted to add the fact that my experience has
been positive with BTRFS snapshots and encryption.


Re: Does btrfs use crc32 for error correction?

2017-09-19 Thread Marat Khalili
Would be cool, but probably not wise IMHO, since on modern hardware you almost
never get one-bit errors (usually it's a whole sector of garbage), and
therefore you'd more often see an incorrect recovery than an actually fixed bit.
-- 

With Best Regards,
Marat Khalili


Re: kernel BUG at fs/btrfs/extent_io.c:1989

2017-09-19 Thread Liu Bo
On Tue, Sep 19, 2017 at 05:07:25PM +0200, David Sterba wrote:
> On Tue, Sep 19, 2017 at 11:32:46AM +, Paul Jones wrote:
> > > This 'mirror 0' looks fishy, (as mirror comes from btrfs_io_bio->mirror_num,
> > > which should be at least 1 if raid1 setup is in use.)
> > > 
> > > Not sure if 4.13.2-gentoo made any changes on btrfs, but can you please
> > > verify with the upstream kernel, say, v4.13?
> > 
> > It's basically a vanilla kernel with a handful of unrelated patches.
> > The filesystem fell apart overnight, there were a few thousand
> > checksum errors and eventually it went read-only. I tried to remount
> > it, but got open_ctree failed. Btrfs check segfaulted, lowmem mode
> > completed with so many errors I gave up and will restore from the
> > backup.
> > 
> > I think I know the problem now - the lvm cache was in writeback mode
> > (by accident) so during a defrag there would be gigabytes of unwritten
> > data in memory only, which was all lost when the system crashed
> > (motherboard failure). No wonder the filesystem didn't quite survive.
> 
> Yeah, the caching layer was my first suspicion, and lack of propagating
> of the barriers. Good that you were able to confirm that as the root cause.
> 
> > I must say though, I'm seriously impressed at the data integrity of
> > BTRFS - there were near 10,000 checksum errors, 4 which were
> > uncorrectable, and from what I could tell nearly all of the data was
> > still intact according to rsync checksums.
> 
> Yay!

But I still don't get why mirror_num is 0. Do you have any idea how a
writeback cache could cause that?

Thanks,

-liubo


Re: Does btrfs use crc32 for error correction?

2017-09-19 Thread Eric Sandeen
On 9/19/17 10:35 AM, Timofey Titovets wrote:
> Stupid question:
> Does btrfs use crc32 for error correction?
> If no, why?
> 
> (AFAIK if using CRC that possible to fix 1 bit flip)
> 
> P.S. I try check that (i create image, create text file, flip bit, try
> read and btrfs show IO-error)
> 
> Thanks!

I wasn't aware that crc32 could (in general) be used for single bit correction; 
I've read up on that, and it seems pretty cool.
 
However, I don't think that the generator polynomial used in crc32c /can/ be 
used for error correction.  I just skimmed some reading, but that seems to be 
the case if I understand it correctly.
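
For illustration only, single-bit correction can also be done by brute force, without relying on any special property of the generator polynomial: flip each bit in turn and see whether the checksum matches. The sketch below is an assumption-laden example, not what btrfs does: it uses Python's zlib.crc32 (plain CRC-32, not the crc32c that btrfs uses), the function name fix_single_bit is hypothetical, and it says nothing about whether such recovery would be wise on real hardware.

```python
import zlib


def fix_single_bit(data: bytes, expected_crc: int):
    """Flip each bit in turn; return the first candidate whose CRC-32
    equals expected_crc, or None if no single-bit flip matches."""
    if zlib.crc32(data) == expected_crc:
        return data  # already consistent
    buf = bytearray(data)
    for i in range(len(buf) * 8):
        byte, bit = divmod(i, 8)
        buf[byte] ^= 1 << bit      # try flipping bit i
        if zlib.crc32(buf) == expected_crc:
            return bytes(buf)
        buf[byte] ^= 1 << bit      # undo and move on
    return None


good = b"Test String: some_text_data\n"
bad = good.replace(b"some", b"snme")  # 'o' -> 'n' differs in one bit
print(fix_single_bit(bad, zlib.crc32(good)) == good)
```

The cost is one full CRC computation per candidate bit, so this is far slower than a syndrome-table approach, and for corruption wider than one bit it either fails or risks a false match.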

-Eric


WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559 btrfs_update_device+0x1c5/0x1d0 [btrfs]

2017-09-19 Thread Rich Rauenzahn
I've filed a bug on this kernel trace -- I get 100's of these a day.
I'd like to make them go away 

https://bugzilla.kernel.org/show_bug.cgi?id=196949

[4.747356] ------------[ cut here ]------------
[4.747377] WARNING: CPU: 3 PID: 439 at fs/btrfs/ctree.h:1559
btrfs_update_device+0x1c5/0x1d0 [btrfs]
[4.747377] Modules linked in: nfs_acl lockd grace sunrpc ip_tables
btrfs xor raid6_pq sd_mod crc32c_intel firewire_ohci igb ahci
 firewire_core crc_itu_t dca libahci i915 libata i2c_algo_bit e1000e
drm_kms_helper ptp syscopyarea sysfillrect pps_core sysimgblt
fb_sys_fops drm video
[4.747385] CPU: 3 PID: 439 Comm: btrfs-cleaner Not tainted
4.13.2-1.el7.elrepo.x86_64 #1
[4.747385] Hardware name: Supermicro X10SAE/X10SAE, BIOS 2.0a 05/09/2014
[4.747386] task: 88040cdcae80 task.stack: c900021f4000
[4.747396] RIP: 0010:btrfs_update_device+0x1c5/0x1d0 [btrfs]
[4.747396] RSP: 0018:c900021f7d00 EFLAGS: 00010206
[4.747397] RAX: 0fff RBX: 880407b7aa80 RCX: 001bc6c71e00
[4.747397] RDX: 8800 RSI: 880404cd3f3c RDI: 880409417b58
[4.747398] RBP: c900021f7d48 R08: 3f60 R09: c900021f7cb8
[4.747398] R10: 1000 R11: 0003 R12: 88040559f800
[4.747398] R13:  R14: 880409417b58 R15: 3f3c
[4.747399] FS:  () GS:88041fac()
knlGS:
[4.747399] CS:  0010 DS:  ES:  CR0: 80050033
[4.747400] CR2: 7f29c3000248 CR3: 0004056a4000 CR4: 001406e0
[4.747400] Call Trace:
[4.747410]  btrfs_remove_chunk+0x2fb/0x8b0 [btrfs]
[4.747418]  btrfs_delete_unused_bgs+0x363/0x440 [btrfs]
[4.747426]  cleaner_kthread+0x150/0x180 [btrfs]
[4.747429]  kthread+0x109/0x140
[4.747436]  ? btree_invalidatepage+0xa0/0xa0 [btrfs]
[4.747437]  ? kthread_park+0x60/0x60
[4.747439]  ret_from_fork+0x25/0x30
[4.747439] Code: 10 00 00 00 4c 89 fe e8 8a 30 ff ff 4c 89 f7 e8
32 f6 fc ff e9 d3 fe ff ff b8 f4 ff ff ff e9 d4 fe ff ff 0f 1f 00 e8
bb 2e 9e e0 <0f> ff eb af 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 31 d2
be 02
[4.747450] ---[ end trace 1ef80a625983d73b ]---


Re: SSD caching an existing btrfs raid1

2017-09-19 Thread Austin S. Hemmelgarn

On 2017-09-19 11:30, Pat Sailor wrote:

> Hello,
>
> I have a half-filled raid1 on top of six spinning devices. Now I have
> come into a spare SSD I'd like to use for caching, if possible without
> having to rebuild or, failing that, without having to give up btrfs
> and flexible reshaping.
>
> I've been reading about the several options out there; I thought that
> EnhanceIO would be the simplest bet but unfortunately I couldn't get it
> to build with my recent kernel (last commits are from years ago).

As a general rule, avoid things that are out of tree if you want any
chance of support here.  In the case of a third party module like
EnhanceIO, you'll likely get asked to retest with a kernel without that
module loaded if you run into a bug (because the module itself is suspect).


> Failing that, I read that lvmcache could be the way to go. However, I
> can't think of a way of setting it up in which I retain the ability to
> add/remove/replace drives as I can do now with pure btrfs; if I opted to
> drop btrfs to go to ext4 I still would have to offline the filesystem
> for downsizes. Not a frequent occurrence I hope, but by now I'm used to
> being able to keep working while I reshape things in btrfs, and I'd
> rather avoid long downtimes.
>
> Is what I want doable at all? Thanks in advance for any
> suggestions/experiences to proceed.

dm-cache (or lvmcache as the LVM2 developers want you to call it,
despite it not being an LVM specific thing) would work fine.  It won't
prevent you from adding and removing devices; it would just make it more
complicated.  Without it, you just need to issue a replace command (or
add then remove).  With it, you need to set up the new cache device,
bind that to the target, add the new device, and then delete the old
device and remove the old cache target.  dm-cache managed through LVM
also has the advantage that you can convert the existing FS trivially,
although you will have to take it off-line for the conversion.


A better option, if you can afford to remove a single device from that
array temporarily, is to use bcache.  Bcache has one specific advantage
in this case: multiple backing devices can share the same cache device.
This means you don't have to carve out dedicated cache space for each
disk on the SSD and leave some unused space so that you can add new
devices if needed.  The downside is that you can't convert each device
in-place, but because you're using BTRFS, you can still convert the
volume as a whole in-place.  The procedure for doing so looks like this:


1. Format the SSD as a bcache cache.
2. Use `btrfs device delete` to remove a single hard drive from the array.
3. Set up the drive you just removed as a bcache backing device bound to
   the cache you created in step 1.
4. Add the new bcache device to the array.
5. Repeat from step 2 until the whole array is converted.

A similar procedure can actually be used to do almost any underlying 
storage conversion (for example, switching to whole disk encryption, or 
adding LVM underneath BTRFS) provided all your data can fit on one less 
disk than you have.



Re: Does btrfs use crc32 for error correction?

2017-09-19 Thread Hugo Mills
On Tue, Sep 19, 2017 at 06:35:48PM +0300, Timofey Titovets wrote:
> Stupid question:
> Does btrfs use crc32 for error correction?

   It uses it for error _detection_. On read, it'll verify the data
(or metadata) against the checksum.

   With no redundancy (single, RAID-0), a bad csum check will return
an I/O error.

   With redundancy (RAID-1, 10, 5, 6), a bad csum check will try
reading the other copy. If that's good, it will use it and repair the
broken copy.
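
   To make the detect-then-fall-back behaviour concrete, here is a
minimal sketch in plain C. It is a toy model, not btrfs code: the
struct, the function names and the 16-byte block size are all made up
for illustration, and the repair step simply overwrites any copy that
differs from the verified one.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK_SIZE 16	/* toy block size, not a btrfs constant */
#define TOY_MAX_COPIES 2

/* Bitwise CRC-32C (Castagnoli), reflected polynomial 0x82F63B78. */
static uint32_t crc32c(const uint8_t *buf, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;

	for (size_t i = 0; i < len; i++) {
		crc ^= buf[i];
		for (int bit = 0; bit < 8; bit++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return ~crc;
}

/* Hypothetical in-memory stand-in for one block kept on 1 or 2 devices. */
struct toy_block {
	uint8_t copy[TOY_MAX_COPIES][TOY_BLOCK_SIZE];
	int num_copies;		/* 1 = no redundancy, 2 = mirrored */
	uint32_t csum;		/* checksum recorded at write time */
};

static void toy_write(struct toy_block *b, const uint8_t *data, int num_copies)
{
	b->num_copies = num_copies;
	b->csum = crc32c(data, TOY_BLOCK_SIZE);
	for (int i = 0; i < num_copies; i++)
		memcpy(b->copy[i], data, TOY_BLOCK_SIZE);
}

/*
 * Returns 0 and fills out[] with verified data, or -EIO if no copy
 * matches the stored checksum. Once a good copy is found, any other
 * copy that differs from it is overwritten (the "repair").
 */
static int toy_read(struct toy_block *b, uint8_t *out)
{
	for (int i = 0; i < b->num_copies; i++) {
		if (crc32c(b->copy[i], TOY_BLOCK_SIZE) != b->csum)
			continue;	/* mismatch detected, try the next copy */
		memcpy(out, b->copy[i], TOY_BLOCK_SIZE);
		for (int j = 0; j < b->num_copies; j++)
			if (j != i && memcmp(b->copy[j], out, TOY_BLOCK_SIZE) != 0)
				memcpy(b->copy[j], out, TOY_BLOCK_SIZE);
		return 0;
	}
	return -EIO;
}
```

   Note that the checksum is only ever compared, never used to
reconstruct data: with a single copy a flipped bit ends in -EIO, which
matches what the original poster saw.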

   Hugo.

> If not, why?
> 
> (AFAIK a CRC makes it possible to fix a 1-bit flip)
> 
> P.S. I tried to check this: I created an image, created a text file,
> flipped a bit, and tried to read it; btrfs showed an I/O error.
> 
> Thanks!

-- 
Hugo Mills | Dullest spy film ever: The Eastbourne Ultimatum
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4  |   The Thick of It


Does btrfs use crc32 for error correction?

2017-09-19 Thread Timofey Titovets
Stupid question:
Does btrfs use crc32 for error correction?
If not, why?

(AFAIK a CRC makes it possible to fix a 1-bit flip)

P.S. I tried to check this: I created an image, created a text file,
flipped a bit, and tried to read it; btrfs showed an I/O error.

Thanks!
-- 
Have a nice day,
Timofey.


SSD caching an existing btrfs raid1

2017-09-19 Thread Pat Sailor

Hello,

I have a half-filled raid1 on top of six spinning devices. Now I have
come into a spare SSD I'd like to use for caching, if possible without
having to rebuild or, failing that, without having to give up btrfs
and flexible reshaping.


I've been reading about the several options out there; I thought that 
EnhanceIO would be the simplest bet but unfortunately I couldn't get it 
to build with my recent kernel (last commits are from years ago).


Failing that, I read that lvmcache could be the way to go. However, I 
can't think of a way of setting it up in which I retain the ability to 
add/remove/replace drives as I can do now with pure btrfs; if I opted to 
drop btrfs to go to ext4 I still would have to offline the filesystem 
for downsizes. Not a frequent occurrence I hope, but by now I'm used to
being able to keep working while I reshape things in btrfs, and I'd
rather avoid long downtimes.


Is what I want doable at all? Thanks in advance for any 
suggestions/experiences to proceed.




Re: [PATCH v2] btrfs-progs: allow "none" to disable compression for convenience

2017-09-19 Thread David Sterba
On Mon, Sep 18, 2017 at 09:41:17AM +0900, Satoru Takeuchi wrote:
> At Sun, 17 Sep 2017 14:08:40 +0100,
> Mike Fleetwood wrote:
> > 
> > On 17 September 2017 at 01:36, Satoru Takeuchi
> >  wrote:
> > > It's messy to use "" to disable compression. Introduce the new value "no"
> > > which can also be used for this purpose.
> > 
> > From an English language point of view, "none" would be better.  "None"
> > denotes the absence of something, whereas "no" is a more general negative.
> 
> Thank you for your comment. How about this?
> 
> ---
> It's messy to use "" to disable compression. Introduce the new value "none"
> which can also be used for this purpose.

I'd allow both values, 'no' and 'none', similar to the mount options,
that also accept both (technically, the 'no' + anything is accepted for
disabling compression).
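
A rough sketch of what accepting both spellings could look like on the
progs side. The helper name compression_value_disables() is
hypothetical and this is not the actual btrfs-progs code; it only
accepts the exact strings "", "no" and "none", which is stricter than
the mount option parsing mentioned above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns true when the user-supplied property
 * value means "disable compression". The empty string is the existing
 * spelling; "no" and "none" are the aliases discussed in this thread.
 */
static bool compression_value_disables(const char *value)
{
	static const char *const off_values[] = { "", "no", "none" };

	for (size_t i = 0; i < sizeof(off_values) / sizeof(off_values[0]); i++)
		if (strcmp(value, off_values[i]) == 0)
			return true;
	return false;
}
```

Anything else (for example "zlib" or "lzo") would fall through to the
normal algorithm lookup.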


Re: [PATCH] btrfs: make array types static const, reduces object code size

2017-09-19 Thread David Sterba
On Tue, Sep 19, 2017 at 04:01:23PM +0100, Colin King wrote:
> From: Colin Ian King 
> 
> Don't populate the read-only array types on the stack, instead make
> it static const.  Makes the object code smaller by nearly 60 bytes:
> 
> Before:
>text  data bss dec hex filename
>   90536  6552  64   97152   17b80 fs/btrfs/ioctl.o
> 
> After:
>text  data bss dec hex filename
>   90414  6616  64   97094   17b46 fs/btrfs/ioctl.o
> 
> Signed-off-by: Colin Ian King 

Reviewed-by: David Sterba 
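
For readers skimming the archive, the numbers quoted above work out to
text shrinking by 122 bytes, data growing by 64, and the total dropping
by 58. A stand-alone sketch of the same pattern follows; the DEMO_*
names and demo_space_type() are placeholders for illustration, not
kernel identifiers.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in flag values; the kernel code uses BTRFS_BLOCK_GROUP_*. */
#define DEMO_GROUP_DATA     (1ULL << 0)
#define DEMO_GROUP_SYSTEM   (1ULL << 1)
#define DEMO_GROUP_METADATA (1ULL << 2)

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Returns the idx-th type to report, or 0 if idx is out of range. */
static uint64_t demo_space_type(size_t idx)
{
	/*
	 * static const: the table has static storage and is initialized
	 * at build time, instead of being populated on the stack on
	 * every call.
	 */
	static const uint64_t types[] = {
		DEMO_GROUP_DATA,
		DEMO_GROUP_SYSTEM,
		DEMO_GROUP_METADATA,
		DEMO_GROUP_DATA | DEMO_GROUP_METADATA,
	};

	return idx < DEMO_ARRAY_SIZE(types) ? types[idx] : 0;
}
```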


Re: kernel BUG at fs/btrfs/extent_io.c:1989

2017-09-19 Thread David Sterba
On Tue, Sep 19, 2017 at 11:32:46AM +, Paul Jones wrote:
> > This 'mirror 0' looks fishy, (as mirror comes from btrfs_io_bio->mirror_num,
> > which should be at least 1 if raid1 setup is in use.)
> > 
> > Not sure if 4.13.2-gentoo made any changes on btrfs, but can you please
> > verify with the upstream kernel, say, v4.13?
> 
> It's basically a vanilla kernel with a handful of unrelated patches.
> The filesystem fell apart overnight, there were a few thousand
> checksum errors and eventually it went read-only. I tried to remount
> it, but got open_ctree failed. Btrfs check segfaulted, lowmem mode
> completed with so many errors I gave up and will restore from the
> backup.
> 
> I think I know the problem now - the lvm cache was in writeback mode
> (by accident) so during a defrag there would be gigabytes of unwritten
> data in memory only, which was all lost when the system crashed
> (motherboard failure). No wonder the filesystem didn't quite survive.

Yeah, the caching layer was my first suspicion, along with barriers not
being propagated. Good that you were able to confirm that as the root cause.

> I must say though, I'm seriously impressed at the data integrity of
> BTRFS - there were nearly 10,000 checksum errors, 4 of which were
> uncorrectable, and from what I could tell nearly all of the data was
> still intact according to rsync checksums.

Yay!


[PATCH] btrfs: make array types static const, reduces object code size

2017-09-19 Thread Colin King
From: Colin Ian King 

Don't populate the read-only array types on the stack, instead make
it static const.  Makes the object code smaller by nearly 60 bytes:

Before:
   text    data     bss     dec     hex filename
  90536    6552      64   97152   17b80 fs/btrfs/ioctl.o

After:
   text    data     bss     dec     hex filename
  90414    6616      64   97094   17b46 fs/btrfs/ioctl.o

Signed-off-by: Colin Ian King 
---
 fs/btrfs/ioctl.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a74ed6c12d6a..feab6f61cb97 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -4114,10 +4114,12 @@ static long btrfs_ioctl_space_info(struct btrfs_fs_info *fs_info,
 	struct btrfs_ioctl_space_info *dest_orig;
 	struct btrfs_ioctl_space_info __user *user_dest;
 	struct btrfs_space_info *info;
-	u64 types[] = {BTRFS_BLOCK_GROUP_DATA,
-		       BTRFS_BLOCK_GROUP_SYSTEM,
-		       BTRFS_BLOCK_GROUP_METADATA,
-		       BTRFS_BLOCK_GROUP_DATA | BTRFS_BLOCK_GROUP_METADATA};
+	static const u64 types[] = {
+		BTRFS_BLOCK_GROUP_DATA,
+		BTRFS_BLOCK_GROUP_SYSTEM,
+		BTRFS_BLOCK_GROUP_METADATA,
+		BTRFS_BLOCK_GROUP_DATA | BTRFS_BLOCK_GROUP_METADATA
+	};
 	int num_types = 4;
 	int alloc_size;
 	int ret = 0;
-- 
2.14.1



Re: btrfs-progs: suggestion of removing --commit-after option of subvol delete

2017-09-19 Thread David Sterba
On Tue, Sep 19, 2017 at 04:50:04PM +0900, Misono, Tomohiro wrote:
> I read the code of "subvolume delete" and found that --commit-after option is
> not working well.
> 
> Since it issues BTRFS_IOC_START/WAIT_SYNC to the last fd (of directory
> containing the last deleted subvolume),
> 1. sync operation affects only the last fd's filesystem.
>("subvolume delete" can take multiple subvolumes on different filesystems.)

You're right, though I don't think this is a common usecase.

> 2. if the last delete action fails to open the path (fd == -1),
>SYNC is not issued at all.
> 
> One solution is to keep every fd for deleted subvolumes, but I think that
> costs too much.

How do you evaluate the cost? We'd have to keep track of all the
distinct filesystems of the subvolumes, so we issue sync on each of them
in the end.
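
Something along these lines, as a sketch only: remember each distinct
filesystem by its fsid while deleting, then sync every remembered
filesystem once at the end. struct fs_set, fs_set_add() and the cap of
32 are made up for illustration and are not btrfs-progs code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_FSID_SIZE 16
#define DEMO_MAX_TRACKED_FS 32	/* arbitrary cap for the sketch */

struct fs_set {
	uint8_t fsid[DEMO_MAX_TRACKED_FS][DEMO_FSID_SIZE];
	size_t count;
};

/*
 * Remember a filesystem by its fsid. Returns true if it was newly
 * added, false if it was already tracked or the set is full.
 */
static bool fs_set_add(struct fs_set *set, const uint8_t fsid[DEMO_FSID_SIZE])
{
	for (size_t i = 0; i < set->count; i++)
		if (memcmp(set->fsid[i], fsid, DEMO_FSID_SIZE) == 0)
			return false;
	if (set->count == DEMO_MAX_TRACKED_FS)
		return false;
	memcpy(set->fsid[set->count++], fsid, DEMO_FSID_SIZE);
	return true;
}
```

The memory cost is 16 bytes per distinct filesystem, not one open fd
per deleted subvolume.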

> Since we can just use "btrfs filesystem sync" after delete if
> needed, I think it is ok to remove --commit-after option.

The point of the option is to allow the sync in the same command as the
subvolume deletion. Even with the sync, the user would still have to
know all the filesystems where the subvolumes were, so some way of
tracking them would need to be done.

I don't want to remove the option unless it's really not working as
expected and could mislead the users. What you found are bugs that can
be fixed.


Re: Storage and snapshots as historical yearly

2017-09-19 Thread Senén Vidal Blanco
Perfect!! Just what I was looking for.
Sorry for the delay; I wanted to test it first to see whether it actually
worked.

I have a question. The system works perfectly, but I don't understand the
process for removing the writable disk and merging its data into the
read-only disk.

I tried to clear the seed flag on disk A and delete the writable disk B, as
you describe, to move the data to A, but it tells me that disk B does not
exist. These are the commands I ran:

md127-> A
md126-> B

btrfstune -S 0 /dev/md127
mount /dev/md127 /mnt (I mount this disk because mounting md126 gives an error)
btrfs device delete /dev/md126 /mnt
ERROR: error removing device '/dev/md126': No such file or directory

I also tried removing disk B without clearing the seed flag, but that
gives me this error:

ERROR: error removing device '/dev/md126': unable to remove the only writeable device.

Any ideas?
Thank you very much for the reply.
Greetings.

On Tuesday, 12 September 2017 6:34:15 (CEST) Andrei Borzenkov wrote:
> 11.09.2017 21:17, Senén Vidal Blanco wrote:
> > I am trying to implement a system that stores the data in a unit (A) with
> > BTRFS format that is untouchable and that future files and folders created
> > or modified are stored in another physical unit (B) with BTRFS format.
> > Each year the new files will be moved to store A and start over.
> > 
> > The idea is that a duplicate of disk A can be made to keep it in a safe
> > place and that the files stored there can not be modified until the
> > mixture of (A) and (B) is made.
> 
> This can probably be achieved using seed device. Mark original device as
> seed and all changes will go to another writable device, similar to
> overlay; then remove seed bit from original device, "btrfs device remove
> writable" device and it should relocate its content back. Rinse and repeat.

-- 
Senén Vidal Blanco - SGISoft S.L.
 
Tlf.: 986413322 - 660923711
GPG ID 466431A8AF01F99A
http://www.sgisoft.com/


RE: kernel BUG at fs/btrfs/extent_io.c:1989

2017-09-19 Thread Paul Jones
> -Original Message-
> From: Liu Bo [mailto:bo.li@oracle.com]
> Sent: Tuesday, 19 September 2017 3:10 AM
> To: Paul Jones 
> Cc: linux-btrfs@vger.kernel.org
> Subject: Re: kernel BUG at fs/btrfs/extent_io.c:1989
> 
> 
> This 'mirror 0' looks fishy, (as mirror comes from btrfs_io_bio->mirror_num,
> which should be at least 1 if raid1 setup is in use.)
> 
> Not sure if 4.13.2-gentoo made any changes on btrfs, but can you please
> verify with the upstream kernel, say, v4.13?

It's basically a vanilla kernel with a handful of unrelated patches.
The filesystem fell apart overnight, there were a few thousand checksum errors 
and eventually it went read-only. I tried to remount it, but got open_ctree 
failed. Btrfs check segfaulted, lowmem mode completed with so many errors I 
gave up and will restore from the backup.

I think I know the problem now - the lvm cache was in writeback mode (by 
accident) so during a defrag there would be gigabytes of unwritten data in 
memory only, which was all lost when the system crashed (motherboard failure). 
No wonder the filesystem didn't quite survive.

I must say though, I'm seriously impressed at the data integrity of BTRFS - 
there were nearly 10,000 checksum errors, 4 of which were uncorrectable, and 
from what I could tell nearly all of the data was still intact according to 
rsync checksums.

Cheers,
Paul.


Re: difference between -c and -p for send-receive?

2017-09-19 Thread Andrei Borzenkov
On Tue, Sep 19, 2017 at 1:24 PM, Graham Cobb  wrote:
> On 19/09/17 01:41, Dave wrote:
>> Would it be correct to say the following?
>
> Like Duncan, I am just a user, and I haven't checked the code. I
> recommend Duncan's explanation, but in case you are looking for
> something simpler, how about thinking with the following analogy...
>
> Think of -p as like doing an incremental backup: it tells send to just
> send the instructions for the changes to get from the "parent" subvolume
> to the current subvolume. Without -p it is like a full backup:
> everything in the current subvolume is sent.
>
> -c is different:

It is not really different - it is extra. You have -p and optionally
-c which modifies its behavior.

> it says "and by the way, these files also already exist
> on the destination so they might be useful to skip actually sending some
> of the file contents". Imagine that whenever a file content is about to
> be sent (whether incremental or full), btrfs-send checks to see if the
> data is in one of the -c subvolumes and, if it is, it sends "get the
> data by reflinking to this file over here" instead of sending the data
> itself. -c is really just an optimisation to save sending data if you
> know the data is already available somewhere else on the destination.
>
> Be aware that this is really just an analogy (like "hard linking" is an
> analogy for reflinking using the clone range ioctl). Duncan's email
> provides more real details.
>
> In particular, this analogy doesn't explain the original questioner's
> problem. In the analogy, -c might work without the files actually being
> present on the source (as long as they are on the destination). But, in
> reality, because the underlying mechanism is extent range cloning, the
> files have to be present on **both** the source and the destination in
> order for btrfs-send to work out what commands to send.
>

Yes. The decision whether to send full data or a reflink is made on the
source, so the data must be present on the source.

> By the way, like Duncan, I was surprised that the man page suggests that
> -c without -p causes one of the clones to be treated as a parent. I have
> not checked the code to see if that is actually how it works.
>

It is. As implemented, -c *requires* parent snapshot, either
explicitly via -p option or implicitly. What it does:

a) checks that both snapshot to transfer and all snapshots given as
arguments to -c have the same parent uuid;
b) selects "best match" by comparing how close snapshots from -c
option are to parent. As far as I can tell it chooses the oldest
snapshot (with minimal difference to the parent) as base (implicit
-p).

Which implies that "btrfs send -c foo bar" is entirely equivalent to
"btrfs send -p foo bar".
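
Steps (a) and (b) can be sketched roughly as follows. This follows my
description above, not the btrfs-progs source: struct snap_info,
pick_implicit_parent() and the use of a generation number (ctransid) as
the measure of "closeness" are assumptions for illustration, and the
sketch skips a mismatching clone source where the real tool may refuse
to continue.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_UUID_SIZE 16

/* Hypothetical, simplified view of a snapshot; not the btrfs-progs struct. */
struct snap_info {
	uint8_t parent_uuid[DEMO_UUID_SIZE];
	uint64_t ctransid;	/* generation of the last change */
};

/*
 * Returns the index of the clone source to use as implicit parent, or
 * -1 if none shares the parent UUID of the snapshot being sent.
 * origin_ctransid is the generation of the common origin subvolume.
 */
static int pick_implicit_parent(const struct snap_info *to_send,
				uint64_t origin_ctransid,
				const struct snap_info *clones, size_t n)
{
	int best = -1;
	uint64_t best_diff = UINT64_MAX;

	for (size_t i = 0; i < n; i++) {
		uint64_t diff;

		/* step (a): the clone source must share the parent UUID */
		if (memcmp(clones[i].parent_uuid, to_send->parent_uuid,
			   DEMO_UUID_SIZE) != 0)
			continue;
		diff = clones[i].ctransid > origin_ctransid ?
		       clones[i].ctransid - origin_ctransid :
		       origin_ctransid - clones[i].ctransid;
		/* step (b): the source closest to the origin wins */
		if (diff < best_diff) {
			best_diff = diff;
			best = (int)i;
		}
	}
	return best;
}
```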

That still does not explain why the script fails. As mentioned,
snapshots created by snapper should all have the same parent uuid, which
leaves only the possibility of a non-existent subvolume, but then the
script should have failed much earlier.


Re: difference between -c and -p for send-receive?

2017-09-19 Thread Graham Cobb
On 19/09/17 01:41, Dave wrote:
> Would it be correct to say the following?

Like Duncan, I am just a user, and I haven't checked the code. I
recommend Duncan's explanation, but in case you are looking for
something simpler, how about thinking with the following analogy...

Think of -p as like doing an incremental backup: it tells send to just
send the instructions for the changes to get from the "parent" subvolume
to the current subvolume. Without -p it is like a full backup:
everything in the current subvolume is sent.

-c is different: it says "and by the way, these files also already exist
on the destination so they might be useful to skip actually sending some
of the file contents". Imagine that whenever a file content is about to
be sent (whether incremental or full), btrfs-send checks to see if the
data is in one of the -c subvolumes and, if it is, it sends "get the
data by reflinking to this file over here" instead of sending the data
itself. -c is really just an optimisation to save sending data if you
know the data is already available somewhere else on the destination.

Be aware that this is really just an analogy (like "hard linking" is an
analogy for reflinking using the clone range ioctl). Duncan's email
provides more real details.

In particular, this analogy doesn't explain the original questioner's
problem. In the analogy, -c might work without the files actually being
present on the source (as long as they are on the destination). But, in
reality, because the underlying mechanism is extent range cloning, the
files have to be present on **both** the source and the destination in
order for btrfs-send to work out what commands to send.
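
To put the analogy into something closer to code, here is a toy model
of the per-extent decision. Everything in it is an assumption for
illustration (a "subvolume" is just a list of extent IDs, and the names
are made up); it is not how btrfs-send is actually structured.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum send_op { SEND_WRITE, SEND_CLONE };

/* Hypothetical model: a subvolume is the list of extent IDs it references. */
struct toy_subvol {
	const uint64_t *extents;
	size_t num_extents;
};

static bool subvol_has_extent(const struct toy_subvol *sv, uint64_t extent)
{
	for (size_t i = 0; i < sv->num_extents; i++)
		if (sv->extents[i] == extent)
			return true;
	return false;
}

/*
 * Decide, on the sending side, how to transfer one extent: if any
 * clone source (-c) already references it, emit a clone instruction;
 * otherwise send the data itself.
 */
static enum send_op choose_send_op(uint64_t extent,
				   const struct toy_subvol *clone_srcs,
				   size_t num_srcs)
{
	for (size_t i = 0; i < num_srcs; i++)
		if (subvol_has_extent(&clone_srcs[i], extent))
			return SEND_CLONE;
	return SEND_WRITE;
}
```

The lookup runs against the sender's copy of the clone sources, which
is why they have to exist on the source as well as on the destination.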

By the way, like Duncan, I was surprised that the man page suggests that
-c without -p causes one of the clones to be treated as a parent. I have
not checked the code to see if that is actually how it works.

Graham


Re: [PATCH] btrfs-progs: check: check invalid extent_inline_ref type in lowmem

2017-09-19 Thread Qu Wenruo



On 2017年09月19日 17:48, Su Yue wrote:



On 09/19/2017 04:48 PM, Qu Wenruo wrote:



On 2017年09月19日 16:32, Su Yue wrote:

Lowmem check does not skip invalid type in extent_inline_ref then
calls btrfs_extent_inline_ref_size(type) which causes crash.

Example:
$ ./btrfs-corrupt-block -e -l 20971520 /tmp/data_small
corrupting extent record: key 20971520 169 0
$ btrfs check --mode=lowmem  /tmp/data_small
Checking filesystem on /tmp/data_small
UUID: ee205d69-8724-4aa2-a4f5-bc8558a62169
checking extents
ERROR: extent[20971520 16384] backref type mismatch, missing bit: 2
ERROR: extent[20971520 16384] backref generation mismatch,
wanted: 7, have: 0
ERROR: extent[20971520 16384] is referred by other roots than 3
ctree.h:1754: btrfs_extent_inline_ref_size: BUG_ON `1` triggered,
value 1
btrfs(+0x543db)[0x55fabc2ab3db]
btrfs(+0x587f7)[0x55fabc2af7f7]
btrfs(+0x5fa44)[0x55fabc2b6a44]
btrfs(cmd_check+0x194a)[0x55fabc2bd717]
btrfs(main+0x88)[0x55fabc2682e0]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f021c3824ca]
btrfs(_start+0x2a)[0x55fabc267e7a]
[1]    5188 abort (core dumped)  btrfs check --mode=lowmem /tmp/data_small


Fix it by checking type before obtaining inline_ref size.

Signed-off-by: Su Yue 


Code itself looks good.
And test case please.


Thanks for review. I'll add test case in next version.


Reviewed-by: Qu Wenruo 

However, such whack-a-mole fixes will eventually become a nightmare to maintain.

What about integrating all such validation checkers into one place?
Then the fsck part would only need to check cross references without
worrying about this kind of corruption.


I wasn't sure how to fix the bug (a new checker or small changes
in this patch).
I fixed it this way because most callers do check the type before
calling btrfs_extent_inline_ref_size().

Since you prefer the former, I'm going to do it.


The current version looks good enough as a fix.

I'm just saying we'd better use an integrated solution later.

Thanks,
Qu


Thanks
Su


Just like the kernel patch I submitted some time ago.
https://www.spinics.net/lists/linux-btrfs/msg68498.html

Thanks,
Qu


---
  cmds-check.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index 4e0b0fe4..74c10c75 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -11565,6 +11565,10 @@ static int check_tree_block_ref(struct 
btrfs_root *root,

  offset, level + 1, owner,
  NULL);
  }
+    } else {
+    error("extent[%llu %u %llu] has unknown ref type: %d",
+  key.objectid, key.type, key.offset, type);
+    break;
  }
  if (found_ref)
@@ -11831,6 +11835,11 @@ static int check_extent_data_item(struct 
btrfs_root *root,

  found_dbackref = !check_tree_block_ref(root, NULL,
  btrfs_extent_inline_ref_offset(leaf, iref),
  0, owner, NULL);
+    } else {
+    error("extent[%llu %u %llu] has unknown ref type: %d",
+  disk_bytenr, BTRFS_EXTENT_DATA_KEY,
+  disk_num_bytes, type);
+    break;
  }
  if (found_dbackref)










Re: [PATCH] btrfs-progs: check: check invalid extent_inline_ref type in lowmem

2017-09-19 Thread Su Yue



On 09/19/2017 04:48 PM, Qu Wenruo wrote:



On 2017年09月19日 16:32, Su Yue wrote:

Lowmem check does not skip invalid type in extent_inline_ref then
calls btrfs_extent_inline_ref_size(type) which causes crash.

Example:
$ ./btrfs-corrupt-block -e -l 20971520 /tmp/data_small
corrupting extent record: key 20971520 169 0
$ btrfs check --mode=lowmem  /tmp/data_small
Checking filesystem on /tmp/data_small
UUID: ee205d69-8724-4aa2-a4f5-bc8558a62169
checking extents
ERROR: extent[20971520 16384] backref type mismatch, missing bit: 2
ERROR: extent[20971520 16384] backref generation mismatch,
wanted: 7, have: 0
ERROR: extent[20971520 16384] is referred by other roots than 3
ctree.h:1754: btrfs_extent_inline_ref_size: BUG_ON `1` triggered,
value 1
btrfs(+0x543db)[0x55fabc2ab3db]
btrfs(+0x587f7)[0x55fabc2af7f7]
btrfs(+0x5fa44)[0x55fabc2b6a44]
btrfs(cmd_check+0x194a)[0x55fabc2bd717]
btrfs(main+0x88)[0x55fabc2682e0]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f021c3824ca]
btrfs(_start+0x2a)[0x55fabc267e7a]
[1]    5188 abort (core dumped)  btrfs check --mode=lowmem /tmp/data_small


Fix it by checking type before obtaining inline_ref size.

Signed-off-by: Su Yue 


Code itself looks good.
And test case please.


Thanks for review. I'll add test case in next version.


Reviewed-by: Qu Wenruo 

However, such whack-a-mole fixes will eventually become a nightmare to maintain.

What about integrating all such validation checkers into one place?
Then the fsck part would only need to check cross references without
worrying about this kind of corruption.


I wasn't sure how to fix the bug (a new checker or small changes
in this patch).
I fixed it this way because most callers do check the type before
calling btrfs_extent_inline_ref_size().

Since you prefer the former, I'm going to do it.

Thanks
Su


Just like the kernel patch I submitted some time ago.
https://www.spinics.net/lists/linux-btrfs/msg68498.html

Thanks,
Qu


---
  cmds-check.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index 4e0b0fe4..74c10c75 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -11565,6 +11565,10 @@ static int check_tree_block_ref(struct 
btrfs_root *root,

  offset, level + 1, owner,
  NULL);
  }
+    } else {
+    error("extent[%llu %u %llu] has unknown ref type: %d",
+  key.objectid, key.type, key.offset, type);
+    break;
  }
  if (found_ref)
@@ -11831,6 +11835,11 @@ static int check_extent_data_item(struct 
btrfs_root *root,

  found_dbackref = !check_tree_block_ref(root, NULL,
  btrfs_extent_inline_ref_offset(leaf, iref),
  0, owner, NULL);
+    } else {
+    error("extent[%llu %u %llu] has unknown ref type: %d",
+  disk_bytenr, BTRFS_EXTENT_DATA_KEY,
+  disk_num_bytes, type);
+    break;
  }
  if (found_dbackref)









Re: [PATCH] btrfs-progs: check: check invalid extent_inline_ref type in lowmem

2017-09-19 Thread Qu Wenruo



On 2017年09月19日 16:32, Su Yue wrote:

Lowmem check does not skip invalid type in extent_inline_ref then
calls btrfs_extent_inline_ref_size(type) which causes crash.

Example:
$ ./btrfs-corrupt-block -e -l 20971520 /tmp/data_small
corrupting extent record: key 20971520 169 0
$ btrfs check --mode=lowmem  /tmp/data_small
Checking filesystem on /tmp/data_small
UUID: ee205d69-8724-4aa2-a4f5-bc8558a62169
checking extents
ERROR: extent[20971520 16384] backref type mismatch, missing bit: 2
ERROR: extent[20971520 16384] backref generation mismatch,
wanted: 7, have: 0
ERROR: extent[20971520 16384] is referred by other roots than 3
ctree.h:1754: btrfs_extent_inline_ref_size: BUG_ON `1` triggered,
value 1
btrfs(+0x543db)[0x55fabc2ab3db]
btrfs(+0x587f7)[0x55fabc2af7f7]
btrfs(+0x5fa44)[0x55fabc2b6a44]
btrfs(cmd_check+0x194a)[0x55fabc2bd717]
btrfs(main+0x88)[0x55fabc2682e0]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f021c3824ca]
btrfs(_start+0x2a)[0x55fabc267e7a]
[1]    5188 abort (core dumped)  btrfs check --mode=lowmem /tmp/data_small

Fix it by checking type before obtaining inline_ref size.
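
The idea of checking the type first can be shown in isolation. The
sketch below is not the cmds-check.c code: the function names are
hypothetical, and the key values are written from memory of ctree.h, so
treat them as assumptions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Backref key types as recalled from ctree.h; values are assumptions. */
#define DEMO_TREE_BLOCK_REF_KEY   176
#define DEMO_EXTENT_DATA_REF_KEY  178
#define DEMO_SHARED_BLOCK_REF_KEY 182
#define DEMO_SHARED_DATA_REF_KEY  184

static bool inline_ref_type_is_known(uint8_t type)
{
	switch (type) {
	case DEMO_TREE_BLOCK_REF_KEY:
	case DEMO_EXTENT_DATA_REF_KEY:
	case DEMO_SHARED_BLOCK_REF_KEY:
	case DEMO_SHARED_DATA_REF_KEY:
		return true;
	default:
		return false;
	}
}

/*
 * Walk a list of inline ref types the way a checker might. Returns the
 * number of refs accepted before the first unknown type and sets
 * *corrupted when the walk stopped early. The size lookup that would
 * otherwise hit BUG_ON is never reached for an unknown type.
 */
static int walk_inline_refs(const uint8_t *types, int n, bool *corrupted)
{
	int i;

	*corrupted = false;
	for (i = 0; i < n; i++) {
		if (!inline_ref_type_is_known(types[i])) {
			*corrupted = true;	/* report and stop, do not abort */
			break;
		}
		/* a real checker would look up the ref size here and advance */
	}
	return i;
}
```

With the corrupted type value 1 from the example output, the walk stops
and reports corruption where the unpatched code aborts.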

Signed-off-by: Su Yue 


Code itself looks good.
And test case please.

Reviewed-by: Qu Wenruo 

However, such whack-a-mole fixes will eventually become a nightmare to maintain.

What about integrating all such validation checkers into one place?
Then the fsck part would only need to check cross references without
worrying about this kind of corruption.


Just like the kernel patch I submitted some time ago.
https://www.spinics.net/lists/linux-btrfs/msg68498.html

Thanks,
Qu


---
  cmds-check.c | 9 +
  1 file changed, 9 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index 4e0b0fe4..74c10c75 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -11565,6 +11565,10 @@ static int check_tree_block_ref(struct btrfs_root 
*root,
offset, level + 1, owner,
NULL);
}
+   } else {
+   error("extent[%llu %u %llu] has unknown ref type: %d",
+ key.objectid, key.type, key.offset, type);
+   break;
}
  
  		if (found_ref)

@@ -11831,6 +11835,11 @@ static int check_extent_data_item(struct btrfs_root 
*root,
found_dbackref = !check_tree_block_ref(root, NULL,
btrfs_extent_inline_ref_offset(leaf, iref),
0, owner, NULL);
+   } else {
+   error("extent[%llu %u %llu] has unknown ref type: %d",
+ disk_bytenr, BTRFS_EXTENT_DATA_KEY,
+ disk_num_bytes, type);
+   break;
}
  
  		if (found_dbackref)





Re: btrfs-progs: suggestion of removing --commit-after option of subvol delete

2017-09-19 Thread Qu Wenruo



On 2017年09月19日 15:50, Misono, Tomohiro wrote:

> Hello,
>
> I read the code of "subvolume delete" and found that --commit-after option is
> not working well.
>
> Since it issues BTRFS_IOC_START/WAIT_SYNC to the last fd (of directory
> containing the last deleted subvolume),
> 1. sync operation affects only the last fd's filesystem.
>    ("subvolume delete" can take multiple subvolumes on different filesystems.)
> 2. if the last delete action fails to open the path (fd == -1),
>    SYNC is not issued at all.
>
> One solution is to keep every fd for deleted subvolumes, but I think that
> costs too much. Since we can just use "btrfs filesystem sync" after delete if
> needed, I think it is ok to remove --commit-after option.


Personally speaking, I'm OK with removing --commit-after, as implementing a 
fully working --commit-after seems too complex for a minor feature.
(It would need to find which subvolumes share a filesystem, do a commit for 
each filesystem, and fall back to another fd if an open failed.)


--commit-after is a relatively lightweight solution compared to 
--commit-each, and both can only ensure that the subvolume no longer shows 
up, while "fi sync" can do a "deeper" sync to ensure the whole subvolume 
gets removed on disk.


But instead of deleting the option right away, it would be better to keep 
it as deprecated for a while.
Showing a message informing the user that this option is deprecated and 
falling back to --commit-each seems to be a better solution.
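
The deprecation fallback suggested above could look roughly like the sketch below. This is only an illustration under stated assumptions: the enum, its values, the function name and the warning text are all hypothetical and are not taken from btrfs-progs.

```c
#include <stdio.h>

/* Hypothetical commit modes; names and values are illustrative only. */
enum commit_mode {
	COMMIT_NONE,
	COMMIT_AFTER,	/* the option proposed for deprecation */
	COMMIT_EACH,
};

/*
 * Map the requested mode to the mode that is actually used.
 * A --commit-after request is reported on 'warn' and upgraded to
 * --commit-each; every other mode is returned unchanged.
 */
static enum commit_mode normalize_commit_mode(enum commit_mode requested,
					      FILE *warn)
{
	if (requested == COMMIT_AFTER) {
		fprintf(warn,
			"WARNING: --commit-after is deprecated, using --commit-each instead\n");
		return COMMIT_EACH;
	}
	return requested;
}
```

Calling such a helper once after option parsing would keep the rest of the delete loop unaware of the deprecated mode.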


Thanks,
Qu


Regards,
Tomohiro Misono
(misono.tomoh...@jp.fujitsu.com)



[PATCH] btrfs-progs: check: check invalid extent_inline_ref type in lowmem

2017-09-19 Thread Su Yue
Lowmem check does not skip invalid type in extent_inline_ref then
calls btrfs_extent_inline_ref_size(type) which causes crash.

Example:
$ ./btrfs-corrupt-block -e -l 20971520 /tmp/data_small
corrupting extent record: key 20971520 169 0
$ btrfs check --mode=lowmem  /tmp/data_small
Checking filesystem on /tmp/data_small
UUID: ee205d69-8724-4aa2-a4f5-bc8558a62169
checking extents
ERROR: extent[20971520 16384] backref type mismatch, missing bit: 2
ERROR: extent[20971520 16384] backref generation mismatch,
wanted: 7, have: 0
ERROR: extent[20971520 16384] is referred by other roots than 3
ctree.h:1754: btrfs_extent_inline_ref_size: BUG_ON `1` triggered,
value 1
btrfs(+0x543db)[0x55fabc2ab3db]
btrfs(+0x587f7)[0x55fabc2af7f7]
btrfs(+0x5fa44)[0x55fabc2b6a44]
btrfs(cmd_check+0x194a)[0x55fabc2bd717]
btrfs(main+0x88)[0x55fabc2682e0]
/usr/lib/libc.so.6(__libc_start_main+0xea)[0x7f021c3824ca]
btrfs(_start+0x2a)[0x55fabc267e7a]
[1]5188 abort (core dumped)  btrfs check --mode=lowmem /tmp/data_small

Fix it by checking type before obtaining inline_ref size.

Signed-off-by: Su Yue 
---
 cmds-check.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/cmds-check.c b/cmds-check.c
index 4e0b0fe4..74c10c75 100644
--- a/cmds-check.c
+++ b/cmds-check.c
@@ -11565,6 +11565,10 @@ static int check_tree_block_ref(struct btrfs_root *root,
offset, level + 1, owner,
NULL);
}
+   } else {
+   error("extent[%llu %u %llu] has unknown ref type: %d",
+ key.objectid, key.type, key.offset, type);
+   break;
}
 
if (found_ref)
@@ -11831,6 +11835,11 @@ static int check_extent_data_item(struct btrfs_root *root,
found_dbackref = !check_tree_block_ref(root, NULL,
btrfs_extent_inline_ref_offset(leaf, iref),
0, owner, NULL);
+   } else {
+   error("extent[%llu %u %llu] has unknown ref type: %d",
+ disk_bytenr, BTRFS_EXTENT_DATA_KEY,
+ disk_num_bytes, type);
+   break;
}
 
if (found_dbackref)
-- 
2.14.1
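
The idea of checking the type before asking for the inline ref size can be shown in isolation. The following is a minimal sketch, not the btrfs-progs code: the helper names are hypothetical, the key values are the on-disk key types, and the byte sizes are my reading of the packed on-disk structures, so treat them as assumptions. Instead of hitting a BUG_ON, an unknown type yields -1 that the caller can report.

```c
#include <stddef.h>

/* On-disk key types that may appear as an inline ref type. */
#define BTRFS_TREE_BLOCK_REF_KEY	176
#define BTRFS_EXTENT_DATA_REF_KEY	178
#define BTRFS_SHARED_BLOCK_REF_KEY	182
#define BTRFS_SHARED_DATA_REF_KEY	184

/* Hypothetical helper: 1 if 'type' is a known inline ref type, else 0. */
static int inline_ref_type_is_valid(int type)
{
	switch (type) {
	case BTRFS_TREE_BLOCK_REF_KEY:
	case BTRFS_EXTENT_DATA_REF_KEY:
	case BTRFS_SHARED_BLOCK_REF_KEY:
	case BTRFS_SHARED_DATA_REF_KEY:
		return 1;
	default:
		return 0;
	}
}

/*
 * Hypothetical checked variant of the size lookup.
 * Returns the inline ref size in bytes, or -1 for an unknown type.
 * Sizes assume a 9-byte inline ref header (u8 type + u64 offset).
 */
static int checked_inline_ref_size(int type)
{
	if (!inline_ref_type_is_valid(type))
		return -1;

	switch (type) {
	case BTRFS_SHARED_DATA_REF_KEY:
		return 9 + 4;	/* header + u32 count */
	case BTRFS_EXTENT_DATA_REF_KEY:
		return 1 + 28;	/* type byte + root, objectid, offset, count */
	default:
		return 9;	/* tree block ref and shared block ref */
	}
}
```

A caller walking inline refs would stop and report corruption on a negative return rather than advancing the pointer by a bogus size.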





btrfs-progs: suggestion of removing --commit-after option of subvol delete

2017-09-19 Thread Misono, Tomohiro
Hello,

I read the code of "subvolume delete" and found that --commit-after option is
not working well.

Since it issues BTRFS_IOC_START/WAIT_SYNC to the last fd (of directory
containing the last deleted subvolume),
1. sync operation affects only the last fd's filesystem.
   ("subvolume delete" can take multiple subvolumes on different filesystems.)
2. if the last delete action fails to open the path (fd == -1),
   SYNC is not issued at all.

One solution is to keep every fd for deleted subvolumes, but I think that costs
too much. Since we can just use "btrfs filesystem sync" after delete if
needed, I think it is ok to remove the --commit-after option.

Regards,
Tomohiro Misono
(misono.tomoh...@jp.fujitsu.com)
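
To illustrate what a complete --commit-after would have to track, here is a rough sketch of remembering each distinct filesystem once, so that one sync per filesystem could be issued at the end. Everything in it is hypothetical: the struct, the function name and the fixed limit are made up for illustration, and how the 16-byte filesystem ID is obtained for each deleted subvolume is assumed to happen elsewhere.

```c
#include <string.h>

#define FSID_SIZE	16	/* a btrfs filesystem UUID is 16 bytes */
#define MAX_FS		64	/* arbitrary limit for this sketch */

/* Hypothetical set of filesystems touched by one "subvolume delete" run. */
struct fsid_set {
	unsigned char ids[MAX_FS][FSID_SIZE];
	int count;
};

/*
 * Remember 'fsid' if it has not been seen yet.
 * Returns 1 if it was added, 0 if already present, -1 if the set is full.
 */
static int fsid_set_add(struct fsid_set *set, const unsigned char *fsid)
{
	int i;

	for (i = 0; i < set->count; i++) {
		if (memcmp(set->ids[i], fsid, FSID_SIZE) == 0)
			return 0;
	}
	if (set->count >= MAX_FS)
		return -1;
	memcpy(set->ids[set->count], fsid, FSID_SIZE);
	set->count++;
	return 1;
}
```

Only when the helper returns 1 would the caller need to keep an fd open for that filesystem, which bounds the cost by the number of filesystems rather than the number of subvolumes.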



[PATCH] btrfs-progs: subvolume: outputs message only when operation succeeds

2017-09-19 Thread Misono, Tomohiro
"btrfs subvolume create/delete" outputs the message "Create/Delete
subvolume ..." even when an operation fails.
Since this is confusing, let's output the message only when an operation 
succeeds.

Signed-off-by: Tomohiro Misono 
---
 cmds-subvolume.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/cmds-subvolume.c b/cmds-subvolume.c
index 666f6e0..6d4b0fe 100644
--- a/cmds-subvolume.c
+++ b/cmds-subvolume.c
@@ -189,7 +189,6 @@ static int cmd_subvol_create(int argc, char **argv)
if (fddst < 0)
goto out;
 
-   printf("Create subvolume '%s/%s'\n", dstdir, newname);
if (inherit) {
struct btrfs_ioctl_vol_args_v2  args;
 
@@ -213,6 +212,7 @@ static int cmd_subvol_create(int argc, char **argv)
error("cannot create subvolume: %s", strerror(errno));
goto out;
}
+   printf("Create subvolume '%s/%s'\n", dstdir, newname);
 
retval = 0; /* success */
 out:
@@ -337,9 +337,6 @@ again:
goto out;
}
 
-   printf("Delete subvolume (%s): '%s/%s'\n",
-   commit_mode == 2 || (commit_mode == 1 && cnt + 1 == argc)
-   ? "commit" : "no-commit", dname, vname);
memset(&args, 0, sizeof(args));
strncpy_null(args.name, vname);
res = ioctl(fd, BTRFS_IOC_SNAP_DESTROY, &args);
@@ -349,6 +346,9 @@ again:
ret = 1;
goto out;
}
+   printf("Delete subvolume (%s): '%s/%s'\n",
+   commit_mode == 2 || (commit_mode == 1 && cnt + 1 == argc)
+   ? "commit" : "no-commit", dname, vname);
 
if (commit_mode == 1) {
res = wait_for_commit(fd);
-- 
2.9.5



Re: ERROR: parent determination failed (btrfs send-receive)

2017-09-19 Thread Dave
On Mon, Sep 18, 2017 at 12:23 AM, Andrei Borzenkov wrote:
>
> 18.09.2017 05:31, Dave wrote:
> > Sometimes when using btrfs send-receive, I get errors like this:
> >
> > ERROR: parent determination failed for 
> >
> > When this happens, btrfs send-receive backups fail. And all subsequent
> > backups fail too.
> >
> > The issue seems to stem from the fact that an automated cleanup
> > process removes certain earlier subvolumes. (I'm using Snapper.)
> >
> > I'd like to understand exactly what is happening so that my backups do
> > not unexpectedly fail.
> >
>
> You try to send incremental changes but you deleted subvolume to compute
> changes against. It is hard to tell more without seeing subvolume list
> with uuid/parent uuid.
>

Thanks to Duncan <1i5t5.dun...@cox.net> I have a somewhat better
understanding of -c and -p now.

My backup tool uses only the -c option. (The tool is
wesbarnett/snap-sync on GitHub: https://github.com/wesbarnett/snap-sync)

No subvolume at the destination had ever been deleted.

At the source, a number (about 30 in this case) of preceding snapshots
(subvolumes) exist, but some others have been cleaned up by
Snapper's timeline algorithm.

I think I understand that with the -c option, it is **not** strictly
necessary that the specified snapshots exist on both the source and
destination. It seems I had sufficient subvolumes available for the
incremental send to work with the -c option.

Therefore, it still isn't clear why I got the error: ERROR: parent
determination failed.

Further general comments will be helpful.

I understand that replies can only be so specific, because I
cannot provide subvolume lists with UUIDs.

As mentioned, I cannot give that info because I tried to fix the above
error by sending a subvolume from the destination back to the source.
This resulted in my source volume having a "Received UUID", which
wrecked all subsequent backups.

I now understand that (for my use cases) a source subvolume for a
send-receive should **never** have a Received UUID. (If that is indeed
a general rule, it seems btrfs tools should check it. Or possibly
something about the pitfalls of a "Received UUID" in a source
could be listed on the BTRFS incremental backup wiki page?)
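
The kind of check suggested above could be sketched as follows. This is only an illustration: it assumes an unset Received UUID is stored as 16 zero bytes, both function names are hypothetical, and reading the Received UUID of a subvolume is left out entirely.

```c
#include <string.h>

#define UUID_SIZE 16

/* Returns 1 if any byte of the UUID is non-zero, i.e. the UUID is set. */
static int uuid_is_set(const unsigned char *uuid)
{
	static const unsigned char zero[UUID_SIZE];

	return memcmp(uuid, zero, UUID_SIZE) != 0;
}

/*
 * Hypothetical policy check: a subvolume is accepted as a send source
 * only if its Received UUID is unset.
 */
static int usable_as_send_source(const unsigned char *received_uuid)
{
	return !uuid_is_set(received_uuid);
}
```

A backup tool could run such a check on the source snapshot before starting an incremental send and refuse early with a clear message.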

I was previously referred to the FAQ here:
https://github.com/digint/btrbk/blob/master/doc/FAQ.md#im-getting-an-error-aborted-received-uuid-is-set

But in the end, I decided the safest strategy was to delete all prior
backup subvolumes so I could be sure my backups were valid going
forward.

I am asking these questions now to avoid getting into a situation like
this again (hopefully). Any general comments are appreciated.