Re: FS_IOC_FIEMAP fe_physical discrepancies on btrfs

2018-10-27 Thread Qu Wenruo


On 2018/10/27 at 11:45 PM, Lennert Buytenhek wrote:
> Hello!
> 
> FS_IOC_FIEMAP on btrfs seems to be returning fe_physical values that
> don't always correspond to the actual on-disk data locations.  For some
> files the values match, but e.g. for this file:
> 
> # filefrag -v foo
> Filesystem type is: 9123683e
> File size of foo is 4096 (1 block of 4096 bytes)
>  ext:     logical_offset:        physical_offset: length:   expected: flags:
>    0:        0..       0:    5774454..   5774454:      1:             last,eof
> foo: 1 extent found
> #
> 
> The file data is actually on disk not in block 5774454 (0x581c76), but
> in block 6038646 (0x5c2476), an offset of +0x40800.  Is this expected
> behavior?  Googling didn't turn up much, apologies if this is an FAQ. :(

Btrfs uses a chunk map to build a logical address space.

All bytenrs in btrfs live in that logical address space, not in physical
disk bytenrs.

So you need to refer to the chunk mapping to get the real on-disk bytenr.

You could think of btrfs as having an extra layer inside, like LVM, with
btrfs sitting on one super-large virtual device.

The result returned by fiemap() is just the bytenr in that virtual
device (the LV).
For the real on-disk bytenr (the PV), you need to do the mapping
calculation yourself.
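
(As an aside, the btrfs-map-logical tool shipped with btrfs-progs can do
that translation for you. A rough sketch, reusing the extent from the
filefrag output above (5774454 blocks * 4096 bytes = 23652163584) and a
placeholder device path:

# btrfs-map-logical -l 23652163584 /dev/sdXN

It should print the device and physical byte offset for each copy of that
logical address; the exact output format depends on your btrfs-progs
version.)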

Thanks,
Qu

> 
> (This is on 4.18.16-200.fc28.x86_64, the current Fedora 28 kernel.)
> 
> 
> Thanks,
> Lennert
> 





Re: Have 15GB missing in btrfs filesystem.

2018-10-27 Thread Marc MERLIN
On Sun, Oct 28, 2018 at 07:27:22AM +0800, Qu Wenruo wrote:
> > I can't drop all the snapshots since at least two are used for btrfs
> > send/receive backups.
> > However, if I delete more snapshots, and do a full balance, you think
> > it'll free up more space?
> 
> No.
> 
> You're already too worried about a non-existent problem.
> Your fs looks pretty healthy.

Thanks both for the answers. I'll go back and read them more carefully
later to see how I can adjust my monitoring, but basically I hit my 90%
space-used df alert, and I know that once I get close to full, or
completely full, very bad things happen with btrfs, sometimes making the
system so unusable that it's very hard to reclaim space and fix the
issue (not to mention that if you have btrfs send snapshots, you're
forced to break the snapshot relationship and start over, since deleting
data does not reclaim blocks that are still marked as used by the last
snapshot that was sent to the backup server).

Long story short, I try very hard to not ever hit this problem again :)

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/   | PGP 7F55D5F27AAF9D08


Re: btrfs-qgroup-rescan using 100% CPU

2018-10-27 Thread Qu Wenruo


On 2018/10/28 at 6:58 AM, Dave wrote:
> I'm using btrfs and snapper on a system with an SSD. On this system
> when I run `snapper -c root ls` (where `root` is the snapper config
> for /), the process takes a very long time and top shows the following
> process using 100% of the CPU:
> 
> kworker/u8:6+btrfs-qgroup-rescan

I'm not sure what snapper is doing, but it looks like snapper needs to
use btrfs qgroups.

Enabling btrfs qgroups triggers an initial qgroup rescan.

If you have a lot of snapshots or a lot of files, that initial rescan
will take a long time.

That's by design.
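
(If you just want to confirm whether a rescan is still running,
"btrfs quota rescan -s <mountpoint>" should report its status, assuming a
reasonably recent btrfs-progs.)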

> 
> I have multiple computers (also with SSD's) set up the same way with
> snapper and btrfs. On the other computers, `snapper -c root ls`
> completes almost instantly, even on systems with many more snapshots.

The size of each subvolume also matters.

The time consumed by a qgroup rescan depends on the number of extent
references. Snapshots and reflinks all add to that number.

Also, large files contribute less than small files, as one large file
(128M) may be covered by a single reference, while 128 small files (1M
each) contribute 128 references.

Snapshots are one of the heaviest workloads in this regard.
If one extent is shared by 4 snapshots, it will cost 4 times the CPU to
rescan.
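
As a rough illustration only (these numbers are invented for the example,
not measured from your system): ~72GiB of data stored as 1MiB files is
about 74,000 extents, and if most of them are shared by 20 snapshots the
rescan has to account for roughly 74,000 * 20 ≈ 1.5 million references,
whereas the same 72GiB in 128MiB extents would be only ~576 extents.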

> This system has 20 total snapshots on `/`.

That already sounds like a lot, especially considering the size of the fs.

> 
> System info:
> 
> 4.18.16-arch1-1-ARCH (Arch Linux)
> btrfs-progs v4.17.1
> scrub started at Sat Oct 27 18:37:21 2018 and finished after 00:04:02
> total bytes scrubbed: 75.97GiB with 0 errors
> 
> Filesystem   Size  Used Avail Use% Mounted on
> /dev/mapper/cryptdv  116G   77G   38G  67% /
> 
> Data, single: total=72.01GiB, used=71.38GiB
> System, DUP: total=32.00MiB, used=16.00KiB
> Metadata, DUP: total=3.50GiB, used=2.22GiB

From a quick glance, it will indeed need some time to do the rescan.


Also, to make sure it's not some deadlock, you could check the rescan
progress with the following command:

# btrfs ins dump-tree -t quota /dev/mapper/cryptdv

And look for the following item:

item 0 key (0 QGROUP_STATUS 0) itemoff 16251 itemsize 32
version 1 generation 7 flags ON|RESCAN scan 1024

That scan number should change as the rescan progresses.
(The number is only updated after each transaction commits, so you may
need to wait a while to see it change.)
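
For example, to watch just that item (assuming watch and grep are
available):

# watch -n 60 'btrfs ins dump-tree -t quota /dev/mapper/cryptdv | grep -A1 QGROUP_STATUS'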

Thanks,
Qu

> 
> What other info would be helpful? What troubleshooting steps should I try?
> 





Re: Have 15GB missing in btrfs filesystem.

2018-10-27 Thread Qu Wenruo


On 2018/10/28 at 1:42 AM, Marc MERLIN wrote:
> On Wed, Oct 24, 2018 at 01:07:25PM +0800, Qu Wenruo wrote:
>>> saruman:/mnt/btrfs_pool1# btrfs balance start -musage=80 -v .
>>> Dumping filters: flags 0x6, state 0x0, force is off
>>>   METADATA (flags 0x2): balancing, usage=80
>>>   SYSTEM (flags 0x2): balancing, usage=80
>>> Done, had to relocate 5 out of 202 chunks
>>> saruman:/mnt/btrfs_pool1# btrfs fi show .
>>> Label: 'btrfs_pool1'  uuid: fda628bc-1ca4-49c5-91c2-4260fe967a23
>>> Total devices 1 FS bytes used 188.24GiB
>>> devid1 size 228.67GiB used 203.54GiB path /dev/mapper/pool1
>>>
>>> and it's back to 15GB :-/
>>>
>>> How can I get 188.24 and 203.54 to converge further? Where is all that
>>> space gone?
>>
>> Your original chunks are already pretty compact.
>> Thus there's really no need for an extra balance.
>>
>> You may get some extra space by doing a full balance (no usage=
>> filter), but that's really not worth it in my opinion.
>>
>> Maybe you could try defrag to free some space wasted by CoW instead?
>> (If you're not using many snapshots.)
> 
> Thanks for the reply.
> 
> So right now, I have:
> saruman:~# btrfs fi show /mnt/btrfs_pool1/
> Label: 'btrfs_pool1'  uuid: fda628bc-1ca4-49c5-91c2-4260fe967a23
>   Total devices 1 FS bytes used 188.25GiB
>   devid1 size 228.67GiB used 203.54GiB path /dev/mapper/pool1

The fs is over 50G, so your metadata chunks will be allocated in 1GiB sizes.

> 
> saruman:~# btrfs fi df /mnt/btrfs_pool1/
> Data, single: total=192.48GiB, used=184.87GiB

Your data usage is over 96%, so your data chunks are already pretty compact.

In theory you could reach the minimal data usage of about 185G, but I
think any new data write would then immediately cause new data chunks to
be allocated.

To reclaim that 7.5G, you would need to use the -dusage filter rather
than your -musage filter.
And the usage parameter may be pretty low.
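
For example (the 50 below is only an illustrative threshold, not a
recommendation):

# btrfs balance start -dusage=50 -v /mnt/btrfs_pool1

would relocate only data chunks that are no more than about 50% used.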

> System, DUP: total=32.00MiB, used=48.00KiB
> Metadata, DUP: total=5.50GiB, used=3.38GiB

Metadata looks sparser than data.

But considering your metadata chunks are allocated in 1GiB sizes and
metadata CoW happens more frequently, it's not that easy to reclaim more
space.

And even if you succeeded in relocating all these metadata chunks, you
could only reclaim at most 2~4G.

> GlobalReserve, single: total=512.00MiB, used=0.00B
> 
> I've been using btrfs for a long time now but I've never had a
> filesystem where I had 15GB apparently unusable (7%) after a balance.

You really don't need to worry, and that "15G" is not unusable.

It will most likely be used by data, and from your "fi df" output your
data:metadata ratio is over 10, so it should be completely fine.

> 
> I can't drop all the snapshots since at least two are used for btrfs
> send/receive backups.
> However, if I delete more snapshots, and do a full balance, you think
> it'll free up more space?

No.

You're already too worried about a non-existent problem.
Your fs looks pretty healthy.

Thanks,
Qu

> I can try a defrag next, but since I have COW for snapshots, it's not
> going to help much, correct?
> Thanks,
> Marc
> 





btrfs-qgroup-rescan using 100% CPU

2018-10-27 Thread Dave
I'm using btrfs and snapper on a system with an SSD. On this system
when I run `snapper -c root ls` (where `root` is the snapper config
for /), the process takes a very long time and top shows the following
process using 100% of the CPU:

kworker/u8:6+btrfs-qgroup-rescan

I have multiple computers (also with SSD's) set up the same way with
snapper and btrfs. On the other computers, `snapper -c root ls`
completes almost instantly, even on systems with many more snapshots.
This system has 20 total snapshots on `/`.

System info:

4.18.16-arch1-1-ARCH (Arch Linux)
btrfs-progs v4.17.1
scrub started at Sat Oct 27 18:37:21 2018 and finished after 00:04:02
total bytes scrubbed: 75.97GiB with 0 errors

Filesystem   Size  Used Avail Use% Mounted on
/dev/mapper/cryptdv  116G   77G   38G  67% /

Data, single: total=72.01GiB, used=71.38GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=3.50GiB, used=2.22GiB

What other info would be helpful? What troubleshooting steps should I try?


Re: Have 15GB missing in btrfs filesystem.

2018-10-27 Thread Remi Gauvin
On 2018-10-27 04:19 PM, Marc MERLIN wrote:

> Thanks for confirming. Because I always have snapshots for btrfs
> send/receive, defrag will duplicate as you say, but once the older
> snapshots get freed up, the duplicate blocks should go away, correct?
> 
> Back to usage, thanks for pointing out that command:
> saruman:/mnt/btrfs_pool1# btrfs fi usage .
> Overall:
> Device size:   228.67GiB
> Device allocated:  203.54GiB
> Device unallocated: 25.13GiB
> Device missing:0.00B
> Used:  192.01GiB
> Free (estimated):   32.44GiB  (min: 19.88GiB)
> Data ratio: 1.00
> Metadata ratio: 2.00
> Global reserve:512.00MiB  (used: 0.00B)
> 
> Data,single: Size:192.48GiB, Used:185.16GiB
>/dev/mapper/pool1   192.48GiB
> 
> Metadata,DUP: Size:5.50GiB, Used:3.42GiB
>/dev/mapper/pool1  11.00GiB
> 
> System,DUP: Size:32.00MiB, Used:48.00KiB
>/dev/mapper/pool1  64.00MiB
> 
> Unallocated:
>/dev/mapper/pool1  25.13GiB
> 
> 
> I'm still seeing that I'm using 192GB, but 203GB allocated.
> Do I have 25GB usable:
> Device unallocated: 25.13GiB
> 
> Or 35GB usable?
> Device size:   228.67GiB
>   -
> Used:192.01GiB
>   = 36GB ?
> 


The answer is somewhere between the two.  (BTRFS's estimate of 32.44GiB
free is probably as close as you'll get to a prediction.)

So you have 7.32GB that is allocated but still free for data, plus 25GB
of completely unallocated disk space. However, as you add more data, or
create more snapshots and thereby more metadata duplication, some of
that 25GB will be allocated for metadata. Remember that metadata is
duplicated, so the 3.42GB of metadata you are using now actually
occupies 6.84GB of disk space, out of the 11GB allocated on disk.
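
(As a cross-check on the arithmetic, the "Free (estimated)" value appears
to be 7.32GB of allocated-but-unused data space + 25.13GB unallocated
≈ 32.44GB, and the 19.88GB "min" figure looks like 7.32GB + 25.13GB/2,
i.e. the pessimistic case where all remaining unallocated space ends up
duplicated.)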

You want to be careful that unallocated space doesn't run out. If the
system runs out of usable space for metadata, it can be tricky to get
yourself out of that corner. That is why a large discrepancy between
Data Size and Used would be a concern: if those 25GB of space were
allocated to data, you would get out-of-space errors even though the
25GB was still unused.

On that note, you seem to have a rather high metadata-to-data ratio
(at least, compared to my limited experience). Are you using noatime
on your filesystems? Without it, snapshots will end up causing
duplicated metadata whenever atimes are updated.
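
For example, an illustrative /etc/fstab line (reusing the UUID and mount
point from earlier in the thread; the exact options are up to you):

UUID=fda628bc-1ca4-49c5-91c2-4260fe967a23  /mnt/btrfs_pool1  btrfs  defaults,noatime  0  0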





Re: Have 15GB missing in btrfs filesystem.

2018-10-27 Thread Andrei Borzenkov
On 27.10.2018 21:12, Remi Gauvin wrote:
> On 2018-10-27 01:42 PM, Marc MERLIN wrote:
> 
>>
>> I've been using btrfs for a long time now but I've never had a
>> filesystem where I had 15GB apparently unusable (7%) after a balance.
>>
> 
> The space isn't unusable.  It's just allocated.. (It's used in the sense
> that it's reserved for data chunks.).  Start writing data to the drive,
> and the data will fill that space before more gets allocated.. (Unless

No (at least, not necessarily).

On empty filesystem:

bor@10:~> df -h /mnt
Filesystem  Size  Used Avail Use% Mounted on
/dev/sdb1  1023M   17M  656M   3% /mnt
bor@10:~> sudo dd if=/dev/zero of=/mnt/foo bs=100M count=1
1+0 records in
1+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.260088 s, 403 MB/s
bor@10:~> sync
bor@10:~> df -h /mnt
Filesystem  Size  Used Avail Use% Mounted on
/dev/sdb1  1023M  117M  556M  18% /mnt
bor@10:~> sudo filefrag -v /mnt/foo
Filesystem type is: 9123683e
File size of /mnt/foo is 104857600 (25600 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..   14419:      36272..     50691:  14420:
   1:    14420..   25599:     125312..    136491:  11180:      50692: last,eof
/mnt/foo: 2 extents found
bor@10:~> sudo dd if=/dev/zero of=/mnt/foo bs=10M count=1 conv=notrunc seek=2
1+0 records in
1+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.034 s, 350 MB/s
bor@10:~> sync
bor@10:~> df -h /mnt
Filesystem  Size  Used Avail Use% Mounted on
/dev/sdb1  1023M  127M  546M  19% /mnt
bor@10:~> sudo filefrag -v /mnt/foo
Filesystem type is: 9123683e
File size of /mnt/foo is 104857600 (25600 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..    5119:      36272..     41391:   5120:
   1:     5120..    7679:      33696..     36255:   2560:      41392:
   2:     7680..   14419:      43952..     50691:   6740:      36256:
   3:    14420..   25599:     125312..    136491:  11180:      50692: last,eof
/mnt/foo: 4 extents found
bor@10:~> sudo dd if=/dev/zero of=/mnt/foo bs=10M count=1 conv=notrunc seek=7
1+0 records in
1+0 records out
10485760 bytes (10 MB, 10 MiB) copied, 0.0314211 s, 334 MB/s
bor@10:~> sync
bor@10:~> df -h /mnt
Filesystem  Size  Used Avail Use% Mounted on
/dev/sdb1  1023M  137M  536M  21% /mnt
bor@10:~> sudo filefrag -v /mnt/foo
Filesystem type is: 9123683e
File size of /mnt/foo is 104857600 (25600 blocks of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..    5119:      36272..     41391:   5120:
   1:     5120..    7679:      33696..     36255:   2560:      41392:
   2:     7680..   14419:      43952..     50691:   6740:      36256:
   3:    14420..   17919:     125312..    128811:   3500:      50692:
   4:    17920..   20479:     136492..    139051:   2560:     128812:
   5:    20480..   25599:     131372..    136491:   5120:     139052: last,eof
/mnt/foo: 6 extents found
bor@10:~> ll -sh /mnt
total 100M
100M -rw-r--r-- 1 root root 100M Oct 27 23:30 foo
bor@10:~>

So you still have a single 100M file, but the space consumed on the
filesystem is 120M because the two initial large extents remain. Each
10M write gets a new extent allocated, but the large extents are not
split. If you look at the file details

bor@10:~/python-btrfs/examples> sudo ./show_file.py /mnt/foo
filename /mnt/foo tree 5 inum 259
inode generation 239 transid 242 size 104857600 nbytes 104857600
block_group 0 mode 100644 nlink 1 uid 0 gid 0 rdev 0 flags 0x0(none)
inode ref list size 1
inode ref index 3 name utf-8 foo
extent data at 0 generation 239 ram_bytes 46563328 compression none type regular disk_bytenr 148570112 disk_num_bytes 46563328 offset 0 num_bytes 20971520

This extent consumes about 44MB on disk, but only 20MB of it is part of the file.

extent data at 20971520 generation 241 ram_bytes 10485760 compression none type regular disk_bytenr 138018816 disk_num_bytes 10485760 offset 0 num_bytes 10485760
extent data at 31457280 generation 239 ram_bytes 46563328 compression none type regular disk_bytenr 148570112 disk_num_bytes 46563328 offset 31457280 num_bytes 15106048

And another 14MB of that same extent is referenced here, so about 10MB allocated on disk is "lost".

extent data at 46563328 generation 239 ram_bytes 12500992 compression none type regular disk_bytenr 195133440 disk_num_bytes 12500992 offset 0 num_bytes 12500992
extent data at 59064320 generation 239 ram_bytes 45793280 compression none type regular disk_bytenr 513277952 disk_num_bytes 45793280 offset 0 num_bytes 14336000
extent data at 73400320 generation 242 ram_bytes 10485760 compression none type regular disk_bytenr 559071232 disk_num_bytes 10485760 offset 0 num_bytes 10485760
extent data at 83886080 generation 239 ram_bytes 45793280 compression none type regular disk_bytenr 513277952 disk_num_bytes 45793280 offset 24821760 num_bytes 20971520

Same here: about 10MB of the extent at 513277952 is lost.
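
(As an aside: rewriting or defragmenting such a file, e.g. with
"btrfs filesystem defragment /mnt/foo", should drop the references into
the old oversized extents so they can be freed, assuming no snapshot or
reflink still points at them.)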

There is no 

Re: Have 15GB missing in btrfs filesystem.

2018-10-27 Thread Marc MERLIN
On Sat, Oct 27, 2018 at 02:12:02PM -0400, Remi Gauvin wrote:
> On 2018-10-27 01:42 PM, Marc MERLIN wrote:
> 
> > 
> > I've been using btrfs for a long time now but I've never had a
> > filesystem where I had 15GB apparently unusable (7%) after a balance.
> > 
> 
> The space isn't unusable.  It's just allocated.. (It's used in the sense
> that it's reserved for data chunks.).  Start writing data to the drive,
> and the data will fill that space before more gets allocated.. (Unless
> you are using an older kernel and the filesystem gets mounted with ssd
> option, in which case, you'll want to add nossd option to prevent that
> behaviour.)
> 
> You can use btrfs fi usage to display that more clearly.
 
Got it. I have disk space free alerts based on df, which I know doesn't
mean that much on btrfs. Maybe I'll just need to change that alert code
to make it btrfs aware.
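
Something like the following might work as a starting point (a rough
sketch only; the mount point and threshold are examples, and the awk
field assumes the current "btrfs fi usage -b" output layout):

unalloc=$(btrfs filesystem usage -b /mnt/btrfs_pool1 | awk '/Device unallocated:/ {print $3; exit}')
if [ "$unalloc" -lt $((10 * 1024 * 1024 * 1024)) ]; then
    echo "WARNING: less than 10GiB unallocated on btrfs_pool1"
fi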
 
> > I can try a defrag next, but since I have COW for snapshots, it's not
> > going to help much, correct?
> 
> The defrag will end up using more space, as the fragmented parts of
> files will get duplicated.  That being said, if you have the luxury to
> defrag *before* taking new snapshots, that would be the time to do it.

Thanks for confirming. Because I always have snapshots for btrfs
send/receive, defrag will duplicate as you say, but once the older
snapshots get freed up, the duplicate blocks should go away, correct?

Back to usage, thanks for pointing out that command:
saruman:/mnt/btrfs_pool1# btrfs fi usage .
Overall:
Device size: 228.67GiB
Device allocated:203.54GiB
Device unallocated:   25.13GiB
Device missing:  0.00B
Used:192.01GiB
Free (estimated): 32.44GiB  (min: 19.88GiB)
Data ratio:   1.00
Metadata ratio:   2.00
Global reserve:  512.00MiB  (used: 0.00B)

Data,single: Size:192.48GiB, Used:185.16GiB
   /dev/mapper/pool1 192.48GiB

Metadata,DUP: Size:5.50GiB, Used:3.42GiB
   /dev/mapper/pool1  11.00GiB

System,DUP: Size:32.00MiB, Used:48.00KiB
   /dev/mapper/pool1  64.00MiB

Unallocated:
   /dev/mapper/pool1  25.13GiB


I'm still seeing that I'm using 192GB, but 203GB allocated.
Do I have 25GB usable:
Device unallocated:   25.13GiB

Or 35GB usable?
Device size: 228.67GiB
  -
Used:192.01GiB
  = 36GB ?

Yes, I know that I shouldn't get close to filling up the device; I'm
just trying to clear up whether I should stay below 25GB or below 35GB.

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/   | PGP 7F55D5F27AAF9D08


Re: Have 15GB missing in btrfs filesystem.

2018-10-27 Thread Remi Gauvin
On 2018-10-27 01:42 PM, Marc MERLIN wrote:

> 
> I've been using btrfs for a long time now but I've never had a
> filesystem where I had 15GB apparently unusable (7%) after a balance.
> 

The space isn't unusable.  It's just allocated.. (It's used in the sense
that it's reserved for data chunks.).  Start writing data to the drive,
and the data will fill that space before more gets allocated.. (Unless
you are using an older kernel and the filesystem gets mounted with ssd
option, in which case, you'll want to add nossd option to prevent that
behaviour.)

You can use btrfs fi usage to display that more clearly.


> I can try a defrag next, but since I have COW for snapshots, it's not
> going to help much, correct?

The defrag will end up using more space, as the fragmented parts of
files will get duplicated.  That being said, if you have the luxury to
defrag *before* taking new snapshots, that would be the time to do it.


Re: Have 15GB missing in btrfs filesystem.

2018-10-27 Thread Marc MERLIN
On Wed, Oct 24, 2018 at 01:07:25PM +0800, Qu Wenruo wrote:
> > saruman:/mnt/btrfs_pool1# btrfs balance start -musage=80 -v .
> > Dumping filters: flags 0x6, state 0x0, force is off
> >   METADATA (flags 0x2): balancing, usage=80
> >   SYSTEM (flags 0x2): balancing, usage=80
> > Done, had to relocate 5 out of 202 chunks
> > saruman:/mnt/btrfs_pool1# btrfs fi show .
> > Label: 'btrfs_pool1'  uuid: fda628bc-1ca4-49c5-91c2-4260fe967a23
> > Total devices 1 FS bytes used 188.24GiB
> > devid1 size 228.67GiB used 203.54GiB path /dev/mapper/pool1
> > 
> > and it's back to 15GB :-/
> > 
> > How can I get 188.24 and 203.54 to converge further? Where is all that
> > space gone?
> 
> Your original chunks are already pretty compact.
> Thus there's really no need for an extra balance.
> 
> You may get some extra space by doing a full balance (no usage=
> filter), but that's really not worth it in my opinion.
> 
> Maybe you could try defrag to free some space wasted by CoW instead?
> (If you're not using many snapshots.)

Thanks for the reply.

So right now, I have:
saruman:~# btrfs fi show /mnt/btrfs_pool1/
Label: 'btrfs_pool1'  uuid: fda628bc-1ca4-49c5-91c2-4260fe967a23
Total devices 1 FS bytes used 188.25GiB
devid1 size 228.67GiB used 203.54GiB path /dev/mapper/pool1

saruman:~# btrfs fi df /mnt/btrfs_pool1/
Data, single: total=192.48GiB, used=184.87GiB
System, DUP: total=32.00MiB, used=48.00KiB
Metadata, DUP: total=5.50GiB, used=3.38GiB
GlobalReserve, single: total=512.00MiB, used=0.00B

I've been using btrfs for a long time now but I've never had a
filesystem where I had 15GB apparently unusable (7%) after a balance.

I can't drop all the snapshots since at least two are used for btrfs
send/receive backups.
However, if I delete more snapshots, and do a full balance, you think
it'll free up more space?
I can try a defrag next, but since I have COW for snapshots, it's not
going to help much, correct?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems 
   what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/   | PGP 7F55D5F27AAF9D08


Re: FS_IOC_FIEMAP fe_physical discrepancies on btrfs

2018-10-27 Thread Andrei Borzenkov
On 27.10.2018 18:45, Lennert Buytenhek wrote:
> Hello!
> 
> FS_IOC_FIEMAP on btrfs seems to be returning fe_physical values that
> don't always correspond to the actual on-disk data locations.  For some
> files the values match, but e.g. for this file:
> 
> # filefrag -v foo
> Filesystem type is: 9123683e
> File size of foo is 4096 (1 block of 4096 bytes)
>  ext:     logical_offset:        physical_offset: length:   expected: flags:
>    0:        0..       0:    5774454..   5774454:      1:             last,eof
> foo: 1 extent found
> #
> 
> The file data is actually on disk not in block 5774454 (0x581c76), but
> in block 6038646 (0x5c2476), an offset of +0x40800.  Is this expected
> behavior?  Googling didn't turn up much, apologies if this is an FAQ. :(
> 
> (This is on 4.18.16-200.fc28.x86_64, the current Fedora 28 kernel.)
> 

My understanding is that it returns the logical block address in the
btrfs address space. btrfs can span multiple devices, so you would need
to convert the extent address to a (device, offset) pair if necessary.
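
For instance (a sketch only; the field layout is approximate and the
values below are placeholders, not real output), the chunk tree shows how
each logical range maps onto devices:

# btrfs inspect-internal dump-tree -t chunk /dev/sdXN
...
item N key (FIRST_CHUNK_TREE CHUNK_ITEM <chunk_logical>)
length <len> type DATA ...
stripe 0 devid 1 offset <dev_physical>
...

For a logical bytenr L that falls inside such a chunk on a single-device,
SINGLE-profile filesystem, the on-disk location is roughly:

physical = dev_physical + (L - chunk_logical)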


FS_IOC_FIEMAP fe_physical discrepancies on btrfs

2018-10-27 Thread Lennert Buytenhek
Hello!

FS_IOC_FIEMAP on btrfs seems to be returning fe_physical values that
don't always correspond to the actual on-disk data locations.  For some
files the values match, but e.g. for this file:

# filefrag -v foo
Filesystem type is: 9123683e
File size of foo is 4096 (1 block of 4096 bytes)
 ext:     logical_offset:        physical_offset: length:   expected: flags:
   0:        0..       0:    5774454..   5774454:      1:             last,eof
foo: 1 extent found
#

The file data is actually on disk not in block 5774454 (0x581c76), but
in block 6038646 (0x5c2476), an offset of +0x40800.  Is this expected
behavior?  Googling didn't turn up much, apologies if this is an FAQ. :(

(This is on 4.18.16-200.fc28.x86_64, the current Fedora 28 kernel.)


Thanks,
Lennert


Re: fsck lowmem mode only: ERROR: errors found in fs roots

2018-10-27 Thread Christoph Anton Mitterer
Hey.


Without the last patches on 4.17:

checking extents
checking free space cache
checking fs roots
ERROR: errors found in fs roots
Checking filesystem on /dev/mapper/system
UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c
found 619543498752 bytes used, error(s) found
total csum bytes: 602382204
total tree bytes: 2534309888
total fs tree bytes: 1652097024
total extent tree bytes: 160432128
btree space waste bytes: 459291608
file data blocks allocated: 7334036647936
 referenced 730839187456


With the last patches, on 4.17:

checking extents
checking free space cache
checking fs roots
checking only csum items (without verifying data)
checking root refs
Checking filesystem on /dev/mapper/system
UUID: 6050ca10-e778-4d08-80e7-6d27b9c89b3c
found 619543498752 bytes used, no error found
total csum bytes: 602382204
total tree bytes: 2534309888
total fs tree bytes: 1652097024
total extent tree bytes: 160432128
btree space waste bytes: 459291608
file data blocks allocated: 7334036647936
 referenced 730839187456


Cheers,
Chris.