Re: [arch-general] btrfs kernel incompatibility?

2020-02-06 Thread Chris Murphy
On Thu, Feb 6, 2020 at 3:03 PM Chris Murphy wrote:
>
> There is a gotcha moving Btrfs between different archs. "Btrfs sector
> size", which is an internal Btrfs thing, not a reference to either
> logical or physical sector size of the device, must be the same as
> page size. Page size is 4KiB on x86 and I'm pretty sure it's 16KiB on
> ARM.
>
> So I wonder if you've run into an arch mixing problem (or bug).

OK I pulled my Pi Zero out, which also uses Btrfs, and 4.19.75 kernel.
The Btrfs sector size is 4096. Same as on x86_64.

So now I'm back to something stepping on the end of the drives, that
munged the backup GPT, and possibly munged Btrfs too.

Has either device been mounted degraded since the successful scrub?

-- 
Chris Murphy


Re: [arch-general] [pacman-dev] Privilege separation in the pacman downloader (Was: Pacman Database Signatures)

2020-02-06 Thread Eli Schwartz via arch-general
On 2/4/20 11:08 PM, Eli Schwartz wrote:
> Since I'm unfamiliar with apt and other tools, what exactly do they do?
> Given pacman/apt/your-choice-of-package-manager must somehow write to a
> cachedir, e.g. /var/cache/pacman/pkg, it would need a dedicated download
> user, which would then exclusively hold ownership of the cachedir.
> 
> pacman is one big binary at the moment, it doesn't fork+exec to run
> collections of binaries implementing different parts of the package
> manager (which is actually a plus when it comes to speed), so this might
> entail major re-architecturing of that part of pacman. Doing it for
> external XferCommand programs could be a start.
> 
> Is this a topic you're interested in exploring?

I've opened a feature request for this:
https://bugs.archlinux.org/task/65401

-- 
Eli Schwartz
Bug Wrangler and Trusted User



Re: [arch-general] btrfs kernel incompatibility?

2020-02-06 Thread Chris Murphy
On Thu, Feb 6, 2020 at 2:01 PM Simeon Felis wrote:
>
> [   21.417002] Btrfs loaded, crc32c=crc32c-generic
> [   21.418650] BTRFS: device label URAID devid 8 transid 1254272 /dev/sdb1
> [   21.423593] BTRFS: device label URAID devid 7 transid 1254272 /dev/sda1
> [  313.823521] BTRFS info (device sda1): disk space caching is enabled
> [  313.823537] BTRFS info (device sda1): has skinny extents
> [  313.826597] BTRFS critical (device sda1): unable to find logical 3746684731392 length 4096

For what it's worth, Btrfs uses logical addresses internally. These
are in bytes. They do not correlate to a physical location. So while
3746684731392 translates to ~3.4TiB and might suggest that this is a
location near the end of the drive, it's not necessarily true - but
would be consistent with something stepping on the end of the drive,
wiping out both the alternate GPT and the chunk tree.
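
For anyone who wants to check the arithmetic, here is a small sketch of the conversion. It only converts a byte count to TiB; it says nothing about where the data physically lives, which is the point made above. The helper name is made up for illustration.

```python
TIB = 2**40  # bytes per tebibyte

def bytes_to_tib(n):
    """Convert a byte count to TiB (hypothetical helper, not a btrfs API)."""
    return n / TIB

# Logical address from the "unable to find logical" message.
logical = 3746684731392
# Per-device size from dev_item.total_bytes in the posted superblock.
device_bytes = 4000785104896

print(f"logical address: {bytes_to_tib(logical):.2f} TiB")
print(f"device size:     {bytes_to_tib(device_bytes):.2f} TiB")
```

The logical address being numerically close to the device size is suggestive, but Btrfs logical addresses live in their own address space, so this is a hint at best.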

From the super you provided:
compat_flags        0x0
compat_ro_flags        0x0
incompat_flags        0x161
            ( MIXED_BACKREF |
              BIG_METADATA |
              EXTENDED_IREF |
              SKINNY_METADATA )

These are supported since forever, going back to 3.x kernels. Can you
confirm the architecture of the workstation you did the scrub and
balance on? I'm guessing it's x86_64.
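
As a cross-check on the flag decoding, here is a minimal sketch that splits 0x161 into named bits. The bit values in the table are written from memory of the kernel's btrfs headers and should be treated as an assumption to verify against your kernel source; the function name is hypothetical.

```python
# Assumed bit assignments for btrfs incompat flags (verify against kernel headers).
INCOMPAT_FLAGS = {
    0x001: "MIXED_BACKREF",
    0x002: "DEFAULT_SUBVOL",
    0x004: "MIXED_GROUPS",
    0x008: "COMPRESS_LZO",
    0x010: "COMPRESS_ZSTD",
    0x020: "BIG_METADATA",
    0x040: "EXTENDED_IREF",
    0x080: "RAID56",
    0x100: "SKINNY_METADATA",
    0x200: "NO_HOLES",
}

def decode_incompat(flags):
    """Return (names of set known bits in ascending order, leftover unknown bits)."""
    names = [name for bit, name in sorted(INCOMPAT_FLAGS.items()) if flags & bit]
    known_mask = 0
    for bit in INCOMPAT_FLAGS:
        known_mask |= bit
    return names, flags & ~known_mask

names, unknown = decode_incompat(0x161)
print(names, hex(unknown))
```

A non-zero leftover would mean a flag this table does not know about, which is the case worth worrying about when moving a filesystem to an older kernel.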

The problem isn't with the superblock. It's with bootstrapping the
chunk tree. It's not clear why. The chunk tree is what's responsible
for translating a logical address into a device + sector lookup; for
raid1 that's two devices and two different physical sectors. If the
chunk tree cannot be read, it's not possible to find anything,
including other metadata.
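
To make the logical-to-physical translation concrete, here is a toy model of it. This is not how the kernel implements the chunk tree; the class, the function and every offset in the usage below are invented for illustration. Only the devids 7 and 8 and the failing logical address come from the posted dmesg.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Chunk:
    logical_start: int   # first logical byte covered by this chunk
    length: int          # chunk length in bytes
    stripes: tuple       # for raid1: one (devid, physical_start) pair per mirror

def resolve(chunks, logical):
    """Map a logical address to a (devid, physical offset) pair per mirror."""
    for c in chunks:
        if c.logical_start <= logical < c.logical_start + c.length:
            offset = logical - c.logical_start
            return [(devid, phys + offset) for devid, phys in c.stripes]
    # With no covering chunk there is nowhere to read from.
    raise LookupError(f"unable to find logical {logical}")

# Hypothetical raid1 chunk: 1 GiB at logical 1 GiB, mirrored on devid 7 and 8.
chunks = [Chunk(1 << 30, 1 << 30, ((7, 1 << 20), (8, 5 << 20)))]
print(resolve(chunks, (1 << 30) + 4096))
```

If the map is empty or incomplete, every lookup fails the same way, which is consistent with the repeated "unable to find logical" lines followed by "failed to read chunk root".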

The partition this Btrfs is on should be exactly the same number of
bytes as dev_item.total_bytes.

I think there's a bug somewhere: you've got two devices experiencing
the exact same complaint and problem. The GPT thing might be a
distraction; I'm not really sure yet.

What do you get for:

btrfs insp dump-s -fa /dev/
btrfs insp dump-t -t chunk /dev

These are all read only commands.


-- 
Chris Murphy


Re: [arch-general] btrfs kernel incompatibility?

2020-02-06 Thread Chris Murphy
There's not enough information to know what's going on yet. My strong
advice is to make no further changes (no writes) to either block
device until you understand exactly what's going on. Every write
increases the chance of permanently losing the file system.


On Thu, Feb 6, 2020 at 2:01 PM Simeon Felis wrote:
>
> 4.19.75 dmesg:
>
> [   17.707873] GPT:Primary header thinks Alt. header is not at the end of the disk.
> [   17.707889] GPT:7814037167 != 253879390758629
> [   17.707895] GPT:Alternate GPT header not at the end of the disk.
> [   17.707902] GPT:7814037167 != 253879390758629
> [   17.707907] GPT: Use GNU Parted to correct GPT errors.
> [   17.707977]  sdb: sdb1
> [   17.709682] GPT:Primary header thinks Alt. header is not at the end of the disk.
> [   17.709697] GPT:7814037167 != 253879390758629
> [   17.709703] GPT:Alternate GPT header not at the end of the disk.
> [   17.709710] GPT:7814037167 != 253879390758629

This is terrible error reporting (by the kernel) in that it's not
clearly stating whether the primary GPT is reporting 7814037167 or
253879390758629. No shit, they aren't the same. Usually this error
means the backup GPT at the end of the drive has been stepped on by
something; but LBA 253879390758629 is plainly bogus, that's ~115PiB.
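
A quick sketch of the size check behind that estimate, assuming 512-byte logical sectors (an assumption; the dmesg output does not state the sector size). The helper name is made up.

```python
SECTOR = 512   # assumed logical sector size in bytes
TIB = 2**40
PIB = 2**50

def lba_to_bytes(lba, sector=SECTOR):
    """Byte offset at which the given LBA starts (hypothetical helper)."""
    return lba * sector

bogus = lba_to_bytes(253879390758629)
sane = lba_to_bytes(7814037167)

print(f"253879390758629 -> {bogus / PIB:.2f} PiB")
print(f"7814037167      -> {sane / TIB:.2f} TiB")
```

Only the smaller number lands in the range of a real 4 TB drive, so it is the plausible location for the last sector.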

What do you get for either 'fdisk -l' or 'gdisk -l' or 'parted
/dev/sda u s p' for each device?

> btrfs inspect-internal dump-super /dev/disk/by-label/URAID
> superblock: bytenr=65536, device=/dev/disk/by-label/URAID
> total_bytes        8001571065856
...
> dev_item.total_bytes        4000785104896

4000785104896 bytes is 7814033408 sectors, which approximates LBA
7814037167. And fortunately the latter number is bigger.
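
The same comparison as a few lines of Python, again assuming 512-byte sectors. The variable names are mine; the numbers are from the dump-super output and the GPT messages.

```python
SECTOR = 512  # assumed logical sector size in bytes

dev_total_bytes = 4000785104896   # dev_item.total_bytes from dump-super
alt_header_lba = 7814037167       # the plausible LBA from the GPT messages

# Size of the Btrfs device item in sectors; remainder should be zero.
sectors, remainder = divmod(dev_total_bytes, SECTOR)
# How many sectors smaller the device item is than the reported LBA.
headroom = alt_header_lba - sectors

print(sectors, remainder, headroom)
```

A positive headroom is what you want: the filesystem claims fewer sectors than the disk has. It does not prove the partition boundaries are right, since the partition start offset is not known from these numbers alone.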

There is a gotcha moving Btrfs between different archs. "Btrfs sector
size", which is an internal Btrfs thing, not a reference to either
logical or physical sector size of the device, must be the same as
page size. Page size is 4KiB on x86 and I'm pretty sure it's 16KiB on
ARM.

So I wonder if you've run into an arch mixing problem (or bug).
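
If you want to check the page size on each machine, something like the following would do. It is a sketch: the function names are invented, and the "must match" rule is simply the constraint as described above, not something this code verifies against the kernel.

```python
import os

def page_size():
    """Page size of the running system in bytes."""
    return os.sysconf("SC_PAGE_SIZE")

def sectorsize_matches_page(sectorsize, page=None):
    """True if the Btrfs sectorsize equals the page size (current system by default)."""
    if page is None:
        page = page_size()
    return sectorsize == page

# sectorsize from the posted superblock is 4096.
print(page_size(), sectorsize_matches_page(4096))
```

Run it on both the workstation and the Pi; if both report 4096, the arch-mixing theory is off the table.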


-- 
Chris Murphy


[arch-general] btrfs kernel incompatibility?

2020-02-06 Thread Simeon Felis
I have a btrfs raid1 on raspbian (kernel 4.19.75) which overheated. To fix the 
btrfs filesystem I attached the raid1 to my workstation with Arch Linux (kernel 
5.5.1). I ran scrub to identify broken files and fixed them. I also ran 
balance with --full-balance and defrag -r. All fine so far.

Now I can't mount the btrfs on raspbian any more (bad superblock). I checked 
with a blank raspbian and different boards, same result. On my Arch Linux 
workstation there are no problems.

The btrfs changelog [1] only mentions more checksums (xxhash, sha256...) as 
possible incompatible features between 4.19 and 5.5. However the raid1 is still 
crc32c (see below).

I'm afraid the balance/scrub/defrag operations on my Arch Linux workstation 
made the btrfs raid1 incompatible with older kernels. Is that plausible? How 
could I convert it back?


4.19.75 dmesg:

[   17.707873] GPT:Primary header thinks Alt. header is not at the end of the disk.
[   17.707889] GPT:7814037167 != 253879390758629
[   17.707895] GPT:Alternate GPT header not at the end of the disk.
[   17.707902] GPT:7814037167 != 253879390758629
[   17.707907] GPT: Use GNU Parted to correct GPT errors.
[   17.707977]  sdb: sdb1
[   17.709682] GPT:Primary header thinks Alt. header is not at the end of the disk.
[   17.709697] GPT:7814037167 != 253879390758629
[   17.709703] GPT:Alternate GPT header not at the end of the disk.
[   17.709710] GPT:7814037167 != 253879390758629
[   17.709715] GPT: Use GNU Parted to correct GPT errors.
[   17.709787]  sda: sda1
[   17.710776] sd 0:0:0:1: [sdb] Attached SCSI disk
[   17.721324] sd 0:0:0:0: [sda] Attached SCSI disk
[   18.301910] raid6: int32x1  gen()   203 MB/s
[   18.471765] raid6: int32x1  xor()   178 MB/s
[   18.641705] raid6: int32x2  gen()   278 MB/s
[   18.811703] raid6: int32x2  xor()   207 MB/s
[   18.981816] raid6: int32x4  gen()   307 MB/s
[   19.151686] raid6: int32x4  xor()   228 MB/s
[   19.321816] raid6: int32x8  gen()   315 MB/s
[   19.491762] raid6: int32x8  xor()   219 MB/s
[   19.661699] raid6: neonx1   gen()   711 MB/s
[   19.831664] raid6: neonx1   xor()   811 MB/s
[   20.001711] raid6: neonx2   gen()  1175 MB/s
[   20.171661] raid6: neonx2   xor()  1187 MB/s
[   20.341685] raid6: neonx4   gen()  1550 MB/s
[   20.511663] raid6: neonx4   xor()  1344 MB/s
[   20.681678] raid6: neonx8   gen()  1371 MB/s
[   20.851667] raid6: neonx8   xor()  1125 MB/s
[   20.851673] raid6: using algorithm neonx4 gen() 1550 MB/s
[   20.851676] raid6:  xor() 1344 MB/s, rmw enabled
[   20.851680] raid6: using neon recovery algorithm
[   20.881255] xor: measuring software checksum speed
[   20.971664]    arm4regs  :  2020.800 MB/sec
[   21.071659]    8regs :  1357.600 MB/sec
[   21.171657]    32regs    :  1262.800 MB/sec
[   21.271658]    neon  :  2156.800 MB/sec
[   21.271664] xor: using function: neon (2156.800 MB/sec)
[   21.417002] Btrfs loaded, crc32c=crc32c-generic
[   21.418650] BTRFS: device label URAID devid 8 transid 1254272 /dev/sdb1
[   21.423593] BTRFS: device label URAID devid 7 transid 1254272 /dev/sda1
[  313.823521] BTRFS info (device sda1): disk space caching is enabled
[  313.823537] BTRFS info (device sda1): has skinny extents
[  313.826597] BTRFS critical (device sda1): unable to find logical 3746684731392 length 4096
[  313.839009] BTRFS critical (device sda1): unable to find logical 3746684731392 length 4096
[  313.851766] BTRFS critical (device sda1): unable to find logical 3746684731392 length 4096
[  313.864676] BTRFS critical (device sda1): unable to find logical 3746684731392 length 4096
[  313.878039] BTRFS critical (device sda1): unable to find logical 3746684731392 length 4096
[  313.891649] BTRFS critical (device sda1): unable to find logical 3746684731392 length 4096
[  313.905782] BTRFS error (device sda1): failed to read chunk root
[  313.942529] BTRFS error (device sda1): open_ctree failed


btrfs inspect-internal dump-super /dev/disk/by-label/URAID
superblock: bytenr=65536, device=/dev/disk/by-label/URAID
-
csum_type        0 (crc32c)
csum_size        4
csum            0x33567328 [match]
bytenr            65536
flags            0x1
            ( WRITTEN )
magic            _BHRfS_M [match]
fsid            da28e00e-6ae7-4a28-9bf4-6826157c7e43
metadata_uuid        da28e00e-6ae7-4a28-9bf4-6826157c7e43
label            URAID
generation        1254278
root            14868497367040
sys_array_size        129
chunk_root_generation    1254275
root_level        1
chunk_root        21338870824960
chunk_root_level    1
log_root        0
log_root_transid    0
log_root_level        0
total_bytes        8001571065856
bytes_used        2903387471872
sectorsize        4096
nodesize        16384
leafsize (deprecated)    16384
stripesize        4096
root_dir        6
num_devices        2
compat_flags        0x0
compat_ro_flags        0x0
incompat_flags        0x161
            ( MIXED_BACKREF |
              BIG_METADATA |
              EXTENDED_IREF