Re: Provide a better free space estimate on RAID1

2014-02-05 Thread Brendan Hide

On 2014/02/05 10:15 PM, Roman Mamedov wrote:

Hello,

On a freshly-created RAID1 filesystem of two 1TB disks:

# df -h /mnt/p2/
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   1.8T  1.1M  1.8T   1% /mnt/p2

I cannot write 2TB of user data to that RAID1, so this estimate is clearly
misleading. I got tired of looking at the bogus disk free space on all my
RAID1 btrfs systems, so today I decided to do something about this:

...

After:

# df -h /mnt/p2/
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   1.8T  1.1M  912G   1% /mnt/p2

Until per-subvolume RAID profiles are implemented, this estimate will be
correct, and even after, it should be closer to the truth than assuming the
user will fill their RAID1 FS only with subvolumes of single or raid0 profiles.
This is a known issue: 
https://btrfs.wiki.kernel.org/index.php/FAQ#Why_does_df_show_incorrect_free_space_for_my_RAID_volume.3F


Btrfs is still considered experimental - this is just one of those 
caveats we've learned to adjust to.


The change could work well for now and I'm sure it has been considered. 
I guess the biggest end-user issue is that you can, on a whim, change 
the profile for new blocks (raid0/5/6, single, etc.), and the value from 
5 minutes ago will then be far off from the new value without your having 
written anything or taken up any space. Not a show-stopper, really.


The biggest dev issue is that future features will break this behaviour, 
such as the "per-subvolume RAID profiles" you mentioned. It is difficult 
to justify including code (for a problem with a known workaround) when 
we know it will become obsolete.
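
For illustration only, the arithmetic behind a two-copy estimate can be
sketched as below. This is not the patch discussed above (its body is elided
in the quote); raid1_usable_bytes() is a hypothetical helper, and it assumes
every chunk is stored twice on two different devices and ignores metadata
and system chunk overhead.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Rough usable capacity of a pool that keeps two copies of every chunk
 * on two different devices (hypothetical helper, not btrfs code).
 *
 * Usable space is bounded by half of the raw total, and also by how much
 * of the largest device the remaining devices are able to mirror.
 */
static uint64_t raid1_usable_bytes(const uint64_t *dev_bytes, size_t ndevs)
{
	uint64_t total = 0, largest = 0;
	size_t i;

	if (ndevs < 2)
		return 0;	/* two copies need at least two devices */

	for (i = 0; i < ndevs; i++) {
		total += dev_bytes[i];
		if (dev_bytes[i] > largest)
			largest = dev_bytes[i];
	}

	/* min(total / 2, total - largest) */
	if (total / 2 < total - largest)
		return total / 2;
	return total - largest;
}
```

For two equal devices this gives half of the raw size, which is in line with
the "After" df output quoted above.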


--
__
Brendan Hide
http://swiftspirit.co.za/
http://www.webafrica.co.za/?AFF1E97

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: device delete missing panic

2014-02-05 Thread Thermionix
~ # dmesg | tail
[  351.083338] btrfs: failed to read chunk tree on sdg
[  351.088283] btrfs: open_ctree failed
[  392.541465] device label pool devid 7 transid 55056 /dev/sdh
[  393.558496] btrfs: allowing degraded mounts
[  393.558503] btrfs: disk space caching is enabled
[  393.605461] BTRFS critical (device sdg): unable to find logical 
1563838812160 len 4096
[  393.605479] BTRFS critical (device sdg): unable to find logical 
1563838812160 len 4096
[  393.605491] BTRFS critical (device sdg): unable to find logical 
1563838812160 len 4096
[  393.605495] btrfs: failed to read tree root on sdg
[  393.612297] btrfs: open_ctree failed

On 06/02/14 16:20, Anand Jain wrote:
> 
> 
> 
>> as I now can't mount (open_ctree failed)
> 
>  at around open_ctree failed message should help.
> 
> Thanks, Anand


Re: device delete missing panic

2014-02-05 Thread Anand Jain





as I now can't mount (open_ctree failed)


 at around open_ctree failed message should help.

Thanks, Anand


Re: device delete missing panic

2014-02-05 Thread Thermionix
those are the last useful log outputs before the server locks up 

digging in /var/log/messages - you can see it stopped logging at 12:47, and I 
hard reset at 3:07

maybe I should have specified hard-lock-up instead of panic

2014-02-06T12:47:47.590784+11:00 store03 kernel: [ 4619.769346] [ 
cut here ]
2014-02-06T12:47:47.590785+11:00 store03 kernel: [ 4619.769369] WARNING: CPU: 0 
PID: 3005 at /home/abuild/rpmbuild/BUILD/kernel-pae-3.11.6/l
inux-3.11/fs/btrfs/disk-io.c:482 btree_csum_one_bio.isra.48+0x93/0x110 [btrfs]()
2014-02-06T12:47:47.590893+11:00 store03 kernel: [ 4619.769399] Modules linked 
in: bonding hwmon_vid btrfs raid6_pq zlib_deflate xor libcrc32c joydev 
hid_generic iTCO_wdt iTCO_vendor_support coretemp pcspkr serio_raw i2c_i801 
ata_generic lpc_ich mfd_core usbhid mvsas libsas scsi_transport_sas e1000e ptp 
pps_core shpchp mperf sg dm_mod autofs4 ata_piix uhci_hcd ehci_pci ehci_hcd 
usbcore usb_common i915 fan thermal processor drm_kms_helper drm i2c_algo_bit 
button video thermal_sys scsi_dh_hp_sw scsi_dh_emc scsi_dh_rdac scsi_dh_alua 
scsi_dh
2014-02-06T12:47:47.590896+11:00 store03 kernel: [ 4619.769402] CPU: 0 PID: 
3005 Comm: btrfs-worker-1 Tainted: GW3.11.6-4-pae #1
2014-02-06T12:47:47.590898+11:00 store03 kernel: [ 4619.769403] Hardware name: 
PhoenixAward 945GM/945GM, BIOS 6.00 PG 08/13/2008
2014-02-06T12:47:47.590899+11:00 store03 kernel: [ 4619.769407]  0009 
c06e075a  c0242c5e c085dbc8  0bbd f8a06e34
2014-02-06T12:47:47.590901+11:00 store03 kernel: [ 4619.769411]  01e2 
f8985503 f8985503 000d f5abaa5c d7c20a5c f1c97070 c0242d1b
2014-02-06T12:47:47.590903+11:00 store03 kernel: [ 4619.769415]  0009 
 f8985503 e85afce0 f5abaa5c e62a6c00 16c205ed c6b5476c
2014-02-06T12:47:47.590905+11:00 store03 kernel: [ 4619.769415] Call Trace:
2014-02-06T12:47:47.590906+11:00 store03 kernel: [ 4619.769424]  [] 
try_stack_unwind+0x179/0x190
2014-02-06T12:47:47.590908+11:00 store03 kernel: [ 4619.769430]  [] 
dump_trace+0x47/0xf0
2014-02-06T12:47:47.590910+11:00 store03 kernel: [ 4619.769434]  [] 
show_trace_log_lvl+0x3f/0x50
2014-02-06T12:47:47.590911+11:00 store03 kernel: [ 4619.769437]  [] 
show_stack_log_lvl+0x50/0xd0
2014-02-06T12:47:47.590913+11:00 store03 kernel: [ 4619.769441]  [] 
show_stack+0x1f/0x40
2014-02-06T12:47:47.590915+11:00 store03 kernel: [ 4619.769445]  [] 
dump_stack+0x3e/0x4e
2014-02-06T12:47:47.590917+11:00 store03 kernel: [ 4619.769450]  [] 
warn_slowpath_common+0x7e/0xa0
2014-02-06T12:47:47.590918+11:00 store03 kernel: [ 4619.769454]  [] 
warn_slowpath_null+0x1b/0x20
2014-02-06T12:47:47.590920+11:00 store03 kernel: [ 4619.769472]  [] 
btree_csum_one_bio.isra.48+0x93/0x110 [btrfs]
2014-02-06T12:47:47.590922+11:00 store03 kernel: [ 4619.769555]  [] 
run_one_async_start+0x2f/0x40 [btrfs]
2014-02-06T12:47:47.590924+11:00 store03 kernel: [ 4619.769630]  [] 
worker_loop+0x107/0x470 [btrfs]
2014-02-06T15:07:05.120258+11:00
 store03 rsyslogd: [origin software="rsyslogd" swVersion="7.4.7" x-pid="418" 
x-info="http://www.rsyslog.com";] start
2014-02-06T15:07:05.127408+11:00 store03 kernel: [0.00] Initializing 
cgroup subsys cpuset

as I now can't mount (open_ctree failed)
Should I be mounting with -o recovery ?

On 06/02/14 15:35, Anand Jain wrote:
> 
> 
> your test case is same as in the patch below
> and the panic was due to null bdev (which matches
> in your logs).
> 
>   [RFC PATCH] btrfs: fix null pointer deference at
> btrfs_sysfs_add_one+0x105
> 
> 
> But in your logs below, there isn't a panic right ?
> wrong cut and paste ? or what did I miss?
> 
> 
> Thanks, Anand
> 
> 
> 
> On 02/06/14 11:40 AM, Thermionix wrote:
>> openSUSE 13.1 i686 8 device raid 10
>> when replacing a failed disk (new device is added)
>>
>> ~ # uname -r
>> 3.11.6-4-pae
>>
>> ~ # btrfs --version
>> Btrfs v3.12+20131125
>>
>> ~ # mount -o degraded /pool
>>
>> ~ # journalctl | tail
>>
>> Feb 06 12:22:51 store03 kernel: device label pool devid 4 transid 55050
>> /dev/sde
>> Feb 06 12:22:53 store03 kernel: btrfs: allowing degraded mounts
>> Feb 06 12:22:53 store03 kernel: btrfs: disk space caching is enabled
>> Feb 06 12:22:53 store03 kernel: btrfs: bdev (null) errs: wr 353, rd 1,
>> flush 17, corrupt 0, gen 0
>> Feb 06 12:23:16 store03 kernel: BTRFS debug (device sde): unlinked 1
>> orphans
>>
>> ~ # btrfs filesystem show /dev/disk/by-label/pool
>> Label: pool  uuid: 3e6ba20f-a4d0-40e4-88e7-a31c4930bcfe
>>  Total devices 9 FS bytes used 5.19TiB
>>  devid1 size 1.36TiB used 169.50GiB path
>>  devid2 size 1.82TiB used 1.62TiB path /dev/sdc
>>  devid3 size 931.51GiB used 931.51GiB path /dev/sdd
>>  devid4 size 931.51GiB used 931.51GiB path /dev/sde
>>  devid6 size 1.82TiB used 1.62TiB path /dev/sdg
>>  devid7 size 1.82TiB used 

Re: device delete missing panic

2014-02-05 Thread Anand Jain



your test case is same as in the patch below
and the panic was due to null bdev (which matches
in your logs).

  [RFC PATCH] btrfs: fix null pointer deference at 
btrfs_sysfs_add_one+0x105



But in your logs below, there isn't a panic right ?
wrong cut and paste ? or what did I miss?


Thanks, Anand



On 02/06/14 11:40 AM, Thermionix wrote:

openSUSE 13.1 i686 8 device raid 10
when replacing a failed disk (new device is added)

~ # uname -r
3.11.6-4-pae

~ # btrfs --version
Btrfs v3.12+20131125

~ # mount -o degraded /pool

~ # journalctl | tail

Feb 06 12:22:51 store03 kernel: device label pool devid 4 transid 55050
/dev/sde
Feb 06 12:22:53 store03 kernel: btrfs: allowing degraded mounts
Feb 06 12:22:53 store03 kernel: btrfs: disk space caching is enabled
Feb 06 12:22:53 store03 kernel: btrfs: bdev (null) errs: wr 353, rd 1,
flush 17, corrupt 0, gen 0
Feb 06 12:23:16 store03 kernel: BTRFS debug (device sde): unlinked 1 orphans

~ # btrfs filesystem show /dev/disk/by-label/pool
Label: pool  uuid: 3e6ba20f-a4d0-40e4-88e7-a31c4930bcfe
 Total devices 9 FS bytes used 5.19TiB
 devid1 size 1.36TiB used 169.50GiB path
 devid2 size 1.82TiB used 1.62TiB path /dev/sdc
 devid3 size 931.51GiB used 931.51GiB path /dev/sdd
 devid4 size 931.51GiB used 931.51GiB path /dev/sde
 devid6 size 1.82TiB used 1.62TiB path /dev/sdg
 devid7 size 1.82TiB used 1.62TiB path /dev/sdh
 devid8 size 931.51GiB used 931.51GiB path /dev/sdi
 devid9 size 1.82TiB used 1.62TiB path /dev/sdf
 devid10 size 1.82TiB used 1.01TiB path /dev/sdb

~ # btrfs device delete missing /pool

~ # journalctl -l | tail

Feb 06 12:25:43 store03 kernel: btrfs: relocating block group
10590585618432 flags 68
...
Feb 06 12:47:23 store03 kernel:  [] kthread+0x92/0xa0
Feb 06 12:47:23 store03 kernel:  []
ret_from_kernel_thread+0x1b/0x28
Feb 06 12:47:23 store03 kernel:  []
kthread_create_on_node+0xd0/0xd0
Feb 06 12:47:23 store03 kernel: DWARF2 unwinder stuck at kthread+0x0/0xa0
Feb 06 12:47:23 store03 kernel: Feb 06 12:47:23 store03 kernel: Leftover
inexact backtrace:
Feb 06 12:47:23 store03 kernel: ---[ end trace c47f82d03f79250d ]---
Feb 06 12:47:23 store03 kernel: [ cut here ]
Feb 06 12:47:23 store03 kernel: WARNING: CPU: 0 PID: 3028 at
/home/abuild/rpmbuild/BUILD/kernel-pae-3.11.6/linux-3.11/fs/btrfs/disk-io.c:482
btree_csum_one_bio.isra.48+0x93/0x110 [btrfs]()
Feb 06 12:47:23 store03 kernel: Modules linked in: bonding hwmon_vid
btrfs raid6_pq zlib_deflate xor libcrc32c joydev hid_generic iTCO_wdt
iTCO_vendor_support coretemp pcspkr serio_raw i2c_i801 ata_generic
lpc_ich mfd_core usbhid mvsas libsas scsi_transport_sas e1000e ptp
pps_core shpchp mperf sg dm_mod autofs4 ata_piix uhci_hcd ehci_pci
ehci_hcd usbcore usb_common i915 fan thermal processor drm_kms_helper
drm i2c_algo_bit button video thermal_sys scsi_dh_hp_sw scsi_dh_emc
scsi_dh_rdac scsi_dh_alua scsi_dh
Feb 06 12:47:23 store03 kernel: CPU: 0 PID: 3028 Comm: btrfs-worker-2
Tainted: GW3.11.6-4-pae #1
Feb 06 12:47:23 store03 kernel: Hardware name: PhoenixAward 945GM/945GM,
BIOS 6.00 PG 08/13/2008
Feb 06 12:47:23 store03 kernel:  0009 c06e075a  c0242c5e
c085dbc8  0bd4 f8a06e34
Feb 06 12:47:23 store03 kernel:  01e2 f8985503 f8985503 0002
f5c60304 f2e606d8 c14ca4f0 c0242d1b
Feb 06 12:47:23 store03 kernel:  0009  f8985503 ef93d4a0
f5c60304 e62a6c00 16c1f682 f46fe86c
Feb 06 12:47:23 store03 kernel: Call Trace:
Feb 06 12:47:23 store03 kernel:  [] try_stack_unwind+0x179/0x190
Feb 06 12:47:23 store03 kernel:  [] dump_trace+0x47/0xf0
Feb 06 12:47:23 store03 kernel:  [] show_trace_log_lvl+0x3f/0x50
Feb 06 12:47:23 store03 kernel:  [] show_stack_log_lvl+0x50/0xd0
Feb 06 12:47:23 store03 kernel:  [] show_stack+0x1f/0x40
Feb 06 12:47:23 store03 kernel:  [] dump_stack+0x3e/0x4e
Feb 06 12:47:23 store03 kernel:  [] warn_slowpath_common+0x7e/0xa0
Feb 06 12:47:23 store03 kernel:  [] warn_slowpath_null+0x1b/0x20
Feb 06 12:47:23 store03 kernel:  []
btree_csum_one_bio.isra.48+0x93/0x110 [btrfs]
Feb 06 12:47:23 store03 kernel:  []
run_one_async_start+0x2f/0x40 [btrfs]
Feb 06 12:47:23 store03 kernel:  [] worker_loop+0x107/0x470
[btrfs]
Feb 06 12:47:23 store03 kernel:  [] kthread+0x92/0xa0
Feb 06 12:47:23 store03 kernel:  []
ret_from_kernel_thread+0x1b/0x28
Feb 06 12:47:23 store03 kernel:  []
kthread_create_on_node+0xd0/0xd0
Feb 06 12:47:23 store03 kernel: DWARF2 unwinder stuck at kthread+0x0/0xa0
Feb 06 12:47:23 store03 kernel: Feb 06 12:47:23 store03 kernel: Leftover
inexact backtrace:
Feb 06 12:47:23 store03 kernel: ---[ end trace c47f82d03f79250e ]---
Feb 06 12:47:23 store03 kernel: [ cut here ]
...

kernel soon locks up, any advice on how to proceed?
any other info needed?

Re: device delete missing panic

2014-02-05 Thread Thermionix
Yep, not planning on using btrfsck unless told ;)

hmm now I can't even mount it;

~ # dmesg | tail
[  351.083338] btrfs: failed to read chunk tree on sdg
[  351.088283] btrfs: open_ctree failed
[  392.541465] device label pool devid 7 transid 55056 /dev/sdh
[  393.558496] btrfs: allowing degraded mounts
[  393.558503] btrfs: disk space caching is enabled
[  393.605461] BTRFS critical (device sdg): unable to find logical
1563838812160 len 4096
[  393.605479] BTRFS critical (device sdg): unable to find logical
1563838812160 len 4096
[  393.605491] BTRFS critical (device sdg): unable to find logical
1563838812160 len 4096
[  393.605495] btrfs: failed to read tree root on sdg
[  393.612297] btrfs: open_ctree failed

On 06/02/14 14:56, Christian Robert wrote:
> and also "cat /proc/mounts | fgrep /mountpoint"  # will show most mount
> options, rw/ro status and so on.
> 
> and *never* try to btrfsck this filesystem (fsck.btrfs is still broken)
> unless an expert developer tells you to.
> 
> Xtian.
> 
> On 2014-02-05 22:47, Christian Robert wrote:
>> ps:
>>
> >>always add a "btrfs file df /mountpoint" to your bug report,
> >>so they can figure out whether it is single, dup, raid0, raid1, raid10,
> >> raid5, raid6 and so on... or a mix-up
>>
>> my 2 cents.
>>
>> On 2014-02-05 22:40, Thermionix wrote:
>>> openSUSE 13.1 i686 8 device raid 10
>>> when replacing a failed disk (new device is added)
>>>
>>> ~ # uname -r
>>> 3.11.6-4-pae
>>>
>>> ~ # btrfs --version
>>> Btrfs v3.12+20131125
>>>
>>> ~ # mount -o degraded /pool
>>>
>>> ~ # journalctl | tail
>>>
>>> Feb 06 12:22:51 store03 kernel: device label pool devid 4 transid 55050
>>> /dev/sde
>>> Feb 06 12:22:53 store03 kernel: btrfs: allowing degraded mounts
>>> Feb 06 12:22:53 store03 kernel: btrfs: disk space caching is enabled
>>> Feb 06 12:22:53 store03 kernel: btrfs: bdev (null) errs: wr 353, rd 1,
>>> flush 17, corrupt 0, gen 0
>>> Feb 06 12:23:16 store03 kernel: BTRFS debug (device sde): unlinked 1
>>> orphans
>>>
>>> ~ # btrfs filesystem show /dev/disk/by-label/pool
>>> Label: pool  uuid: 3e6ba20f-a4d0-40e4-88e7-a31c4930bcfe
>>>  Total devices 9 FS bytes used 5.19TiB
>>>  devid1 size 1.36TiB used 169.50GiB path
>>>  devid2 size 1.82TiB used 1.62TiB path /dev/sdc
>>>  devid3 size 931.51GiB used 931.51GiB path /dev/sdd
>>>  devid4 size 931.51GiB used 931.51GiB path /dev/sde
>>>  devid6 size 1.82TiB used 1.62TiB path /dev/sdg
>>>  devid7 size 1.82TiB used 1.62TiB path /dev/sdh
>>>  devid8 size 931.51GiB used 931.51GiB path /dev/sdi
>>>  devid9 size 1.82TiB used 1.62TiB path /dev/sdf
>>>  devid10 size 1.82TiB used 1.01TiB path /dev/sdb
>>>
>>> ~ # btrfs device delete missing /pool
>>>
>>> ~ # journalctl -l | tail
>>>
>>> Feb 06 12:25:43 store03 kernel: btrfs: relocating block group
>>> 10590585618432 flags 68
>>> ...
>>> Feb 06 12:47:23 store03 kernel:  [] kthread+0x92/0xa0
>>> Feb 06 12:47:23 store03 kernel:  []
>>> ret_from_kernel_thread+0x1b/0x28
>>> Feb 06 12:47:23 store03 kernel:  []
>>> kthread_create_on_node+0xd0/0xd0
>>> Feb 06 12:47:23 store03 kernel: DWARF2 unwinder stuck at
>>> kthread+0x0/0xa0
>>> Feb 06 12:47:23 store03 kernel: Feb 06 12:47:23 store03 kernel: Leftover
>>> inexact backtrace:
>>> Feb 06 12:47:23 store03 kernel: ---[ end trace c47f82d03f79250d ]---
>>> Feb 06 12:47:23 store03 kernel: [ cut here ]
>>> Feb 06 12:47:23 store03 kernel: WARNING: CPU: 0 PID: 3028 at
>>> /home/abuild/rpmbuild/BUILD/kernel-pae-3.11.6/linux-3.11/fs/btrfs/disk-io.c:482
>>>
>>> btree_csum_one_bio.isra.48+0x93/0x110 [btrfs]()
>>> Feb 06 12:47:23 store03 kernel: Modules linked in: bonding hwmon_vid
>>> btrfs raid6_pq zlib_deflate xor libcrc32c joydev hid_generic iTCO_wdt
>>> iTCO_vendor_support coretemp pcspkr serio_raw i2c_i801 ata_generic
>>> lpc_ich mfd_core usbhid mvsas libsas scsi_transport_sas e1000e ptp
>>> pps_core shpchp mperf sg dm_mod autofs4 ata_piix uhci_hcd ehci_pci
>>> ehci_hcd usbcore usb_common i915 fan thermal processor drm_kms_helper
>>> drm i2c_algo_bit button video thermal_sys scsi_dh_hp_sw scsi_dh_emc
>>> scsi_dh_rdac scsi_dh_alua scsi_dh
>>> Feb 06 12:47:23 store03 kernel: CPU: 0 PID: 3028 Comm: btrfs-worker-2
>>> Tainted: GW3.11.6-4-pae #1
>>> Feb 06 12:47:23 store03 kernel: Hardware name: PhoenixAward 945GM/945GM,
>>> BIOS 6.00 PG 08/13/2008
>>> Feb 06 12:47:23 store03 kernel:  0009 c06e075a  c0242c5e
>>> c085dbc8  0bd4 f8a06e34
>>> Feb 06 12:47:23 store03 kernel:  01e2 f8985503 f8985503 0002
>>> f5c60304 f2e606d8 c14ca4f0 c0242d1b
>>> Feb 06 12:47:23 store03 kernel:  0009  f8985503 ef93d4a0
>>> f5c60304 e62a6c00 16c1f682 f46fe86c
>>> Feb 06 12:47:23 store03 kernel: Call Trace:
>>> Feb 06 12:47:23 store03 kernel:  []
>>> try_stack_unwind+0x179/0x190
>>> Feb 06 12:47:23 store03 kernel:  [] dump_trace+0x47/0xf0
>>> Feb 06 12:47:23 store03 kernel:  []
>>> s

device delete missing panic

2014-02-05 Thread Thermionix
openSUSE 13.1 i686 8 device raid 10
when replacing a failed disk (new device is added)

~ # uname -r
3.11.6-4-pae

~ # btrfs --version
Btrfs v3.12+20131125

~ # mount -o degraded /pool

~ # journalctl | tail

Feb 06 12:22:51 store03 kernel: device label pool devid 4 transid 55050
/dev/sde
Feb 06 12:22:53 store03 kernel: btrfs: allowing degraded mounts
Feb 06 12:22:53 store03 kernel: btrfs: disk space caching is enabled
Feb 06 12:22:53 store03 kernel: btrfs: bdev (null) errs: wr 353, rd 1,
flush 17, corrupt 0, gen 0
Feb 06 12:23:16 store03 kernel: BTRFS debug (device sde): unlinked 1 orphans

~ # btrfs filesystem show /dev/disk/by-label/pool
Label: pool  uuid: 3e6ba20f-a4d0-40e4-88e7-a31c4930bcfe
Total devices 9 FS bytes used 5.19TiB
devid1 size 1.36TiB used 169.50GiB path
devid2 size 1.82TiB used 1.62TiB path /dev/sdc
devid3 size 931.51GiB used 931.51GiB path /dev/sdd
devid4 size 931.51GiB used 931.51GiB path /dev/sde
devid6 size 1.82TiB used 1.62TiB path /dev/sdg
devid7 size 1.82TiB used 1.62TiB path /dev/sdh
devid8 size 931.51GiB used 931.51GiB path /dev/sdi
devid9 size 1.82TiB used 1.62TiB path /dev/sdf
devid10 size 1.82TiB used 1.01TiB path /dev/sdb

~ # btrfs device delete missing /pool

~ # journalctl -l | tail

Feb 06 12:25:43 store03 kernel: btrfs: relocating block group
10590585618432 flags 68
...
Feb 06 12:47:23 store03 kernel:  [] kthread+0x92/0xa0
Feb 06 12:47:23 store03 kernel:  []
ret_from_kernel_thread+0x1b/0x28
Feb 06 12:47:23 store03 kernel:  []
kthread_create_on_node+0xd0/0xd0
Feb 06 12:47:23 store03 kernel: DWARF2 unwinder stuck at kthread+0x0/0xa0
Feb 06 12:47:23 store03 kernel: Feb 06 12:47:23 store03 kernel: Leftover
inexact backtrace:
Feb 06 12:47:23 store03 kernel: ---[ end trace c47f82d03f79250d ]---
Feb 06 12:47:23 store03 kernel: [ cut here ]
Feb 06 12:47:23 store03 kernel: WARNING: CPU: 0 PID: 3028 at
/home/abuild/rpmbuild/BUILD/kernel-pae-3.11.6/linux-3.11/fs/btrfs/disk-io.c:482
btree_csum_one_bio.isra.48+0x93/0x110 [btrfs]()
Feb 06 12:47:23 store03 kernel: Modules linked in: bonding hwmon_vid
btrfs raid6_pq zlib_deflate xor libcrc32c joydev hid_generic iTCO_wdt
iTCO_vendor_support coretemp pcspkr serio_raw i2c_i801 ata_generic
lpc_ich mfd_core usbhid mvsas libsas scsi_transport_sas e1000e ptp
pps_core shpchp mperf sg dm_mod autofs4 ata_piix uhci_hcd ehci_pci
ehci_hcd usbcore usb_common i915 fan thermal processor drm_kms_helper
drm i2c_algo_bit button video thermal_sys scsi_dh_hp_sw scsi_dh_emc
scsi_dh_rdac scsi_dh_alua scsi_dh
Feb 06 12:47:23 store03 kernel: CPU: 0 PID: 3028 Comm: btrfs-worker-2
Tainted: GW3.11.6-4-pae #1
Feb 06 12:47:23 store03 kernel: Hardware name: PhoenixAward 945GM/945GM,
BIOS 6.00 PG 08/13/2008
Feb 06 12:47:23 store03 kernel:  0009 c06e075a  c0242c5e
c085dbc8  0bd4 f8a06e34
Feb 06 12:47:23 store03 kernel:  01e2 f8985503 f8985503 0002
f5c60304 f2e606d8 c14ca4f0 c0242d1b
Feb 06 12:47:23 store03 kernel:  0009  f8985503 ef93d4a0
f5c60304 e62a6c00 16c1f682 f46fe86c
Feb 06 12:47:23 store03 kernel: Call Trace:
Feb 06 12:47:23 store03 kernel:  [] try_stack_unwind+0x179/0x190
Feb 06 12:47:23 store03 kernel:  [] dump_trace+0x47/0xf0
Feb 06 12:47:23 store03 kernel:  [] show_trace_log_lvl+0x3f/0x50
Feb 06 12:47:23 store03 kernel:  [] show_stack_log_lvl+0x50/0xd0
Feb 06 12:47:23 store03 kernel:  [] show_stack+0x1f/0x40
Feb 06 12:47:23 store03 kernel:  [] dump_stack+0x3e/0x4e
Feb 06 12:47:23 store03 kernel:  [] warn_slowpath_common+0x7e/0xa0
Feb 06 12:47:23 store03 kernel:  [] warn_slowpath_null+0x1b/0x20
Feb 06 12:47:23 store03 kernel:  []
btree_csum_one_bio.isra.48+0x93/0x110 [btrfs]
Feb 06 12:47:23 store03 kernel:  []
run_one_async_start+0x2f/0x40 [btrfs]
Feb 06 12:47:23 store03 kernel:  [] worker_loop+0x107/0x470
[btrfs]
Feb 06 12:47:23 store03 kernel:  [] kthread+0x92/0xa0
Feb 06 12:47:23 store03 kernel:  []
ret_from_kernel_thread+0x1b/0x28
Feb 06 12:47:23 store03 kernel:  []
kthread_create_on_node+0xd0/0xd0
Feb 06 12:47:23 store03 kernel: DWARF2 unwinder stuck at kthread+0x0/0xa0
Feb 06 12:47:23 store03 kernel: Feb 06 12:47:23 store03 kernel: Leftover
inexact backtrace:
Feb 06 12:47:23 store03 kernel: ---[ end trace c47f82d03f79250e ]---
Feb 06 12:47:23 store03 kernel: [ cut here ]
...

kernel soon locks up, any advice on how to proceed?
any other info needed?


Re: Are nocow files snapshot-aware

2014-02-05 Thread Duncan
Kai Krakow posted on Wed, 05 Feb 2014 19:17:10 +0100 as excerpted:

> David Sterba wrote:
> 
>> On Tue, Feb 04, 2014 at 08:22:05PM -0500, Josef Bacik wrote:
>>> On 02/04/2014 03:52 PM, Kai Krakow wrote:
>>> >Hi!
>>> >
>>> >I'm curious... The whole snapshot thing on btrfs is based on its COW
>>> >design. But you can make individual files and directory contents
>>> >nocow by applying the C attribute on it using chattr. This is usually
>>> >recommended for database files and VM images. So far, so good...
>>> >
>>> >But what happens to such files when they are part of a snapshot? Do
>>> >they become duplicated during the snapshot? Do they become unshared
>>> >(as a whole) when written to? Or when the parent snapshot is
>>> >deleted?
>>> >Or maybe the nocow attribute is just ignored after a snapshot was
>>> >taken?
>>> >
>>> When snapshotted, nocow files fall back to normal cow behaviour.
>> 
>> This may seem unclear to people not familiar with the actual
>> implementation, and I had to think for a second about that sentence.
>> The file will keep the NOCOW status, but any modified blocks will be
>> newly allocated on the first write (in a COW manner), then the block
>> location will not change anymore (unlike ordinary COW).
> 
> Ah okay, that makes it clear. So, actually, in the snapshot the file is
> still nocow - with the exception that blocks being written to become
> unshared and relocated. This may introduce a lot of fragmentation but it
> won't become worse when rewriting the same blocks over and over again.

That also explains the report of a NOCOW VM-image still triggering the 
snapshot-aware-defrag-related pathology.  It was a _heavily_ auto-
snapshotted btrfs (thousands of snapshots, something like every 30 
seconds or more frequently, without thinning them down right away), and the 
continuing VM writes would nearly guarantee that many of those snapshots 
had unique blocks, so the effect was nearly as bad as if it wasn't NOCOW 
at all!
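
A toy model of the behaviour described above, as I understand it from this
thread: a NOCOW block that is shared with a snapshot is relocated on its
first write, then overwritten in place until the next snapshot shares it
again. All names here (toy_block, toy_snapshot, toy_nocow_write, the
allocator cursor) are made up for illustration; none of this is btrfs code.

```c
#include <stdbool.h>
#include <stdint.h>

struct toy_block {
	uint64_t location;	/* where the block currently lives */
	bool shared;		/* still referenced by a snapshot? */
};

/* Hypothetical allocator cursor: next unused location. */
static uint64_t toy_next_free = 1000;

/* Taking a snapshot makes every block of the file shared. */
static void toy_snapshot(struct toy_block *blocks, int nblocks)
{
	int i;

	for (i = 0; i < nblocks; i++)
		blocks[i].shared = true;
}

/*
 * Write to a NOCOW block.  Returns true if the write had to allocate a
 * new location (COW-like, first write after a snapshot), false if the
 * block was simply overwritten in place.
 */
static bool toy_nocow_write(struct toy_block *b)
{
	if (b->shared) {
		b->location = toy_next_free++;
		b->shared = false;
		return true;
	}
	return false;
}
```

With a snapshot taken between every pair of writes, each write in this model
allocates a new location, which is the pathological case described above.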

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: [PATCH 2/2] btrfs/035: add new clone overwrite regression test

2014-02-05 Thread Dave Chinner
On Wed, Feb 05, 2014 at 12:16:49PM +0100, David Disseldorp wrote:
> This test uses the newly added cloner binary to dispatch full file and
> range specific clone (reflink) requests.

A couple of small nits:

> +CLONER_PROG=$here/src/cloner

Need to test that the binary was built and is present.

> +
> +src_str="aa"
> +
> +echo -n "$src_str" > $SCRATCH_MNT/src || _fail "failed to create src"

No need for the "|| _fail ..." in any part of this test. Failures
will be caught in the output and hence cause golden output
mismatches.

Letting the test run even after a failure exercises the filesystem
in interesting ways, so it's worthwhile ignoring failures in the
test and letting the harness pick up the failures through error
messages.

> +$CLONER_PROG $SCRATCH_MNT/src  $SCRATCH_MNT/src.clone1
> +
> +src_str="bbcc"
> +
> +echo -n "$src_str" > $SCRATCH_MNT/src || _fail "failed to create src"
> +
> +$CLONER_PROG $SCRATCH_MNT/src $SCRATCH_MNT/src.clone2
> +
> +snap_src_sz=`ls -lah $SCRATCH_MNT/src.clone1 | awk '{print $5}'`
> +echo "attempting ioctl (src.clone1 src)"
> +$CLONER_PROG -s 0 -d 0 -l ${snap_src_sz} \
> + $SCRATCH_MNT/src.clone1 $SCRATCH_MNT/src || _fail "ioctl failed"

And to do that here, you probably need to add perror() output to
the cloner program when it detects an error. i.e. let it give you
the exact error that was detected, rather than lumping them all into
a catchall here...
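
A minimal sketch of the kind of error reporting meant here, assuming a
hypothetical helper open_or_report() (not part of the posted cloner
program). The same pattern applies after a failed ioctl(): call perror()
straight away so that errno is still the value set by the failing call.

```c
#include <fcntl.h>
#include <stdio.h>

/*
 * Open a file and, on failure, print the exact reason to stderr in the
 * form "<path>: <error string>".  Returns the fd, or -1 on error.
 */
static int open_or_report(const char *path, int flags)
{
	int fd = open(path, flags);

	if (fd < 0)
		perror(path);	/* uses the errno left behind by open() */
	return fd;
}
```

The message then lands in the test output, so a failure shows up as a
golden output mismatch that already names the cause.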

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH 1/2] btrfs: add small program for clone testing

2014-02-05 Thread Dave Chinner
On Wed, Feb 05, 2014 at 12:16:48PM +0100, David Disseldorp wrote:
> The cloner program is capable of cloning files using the BTRFS_IOC_CLONE
> and BTRFS_IOC_CLONE_RANGE ioctls.
> 
> Signed-off-by: David Disseldorp 

Hi Dave - long time since I've seen your head pop up around here ;)

A few comments below.

> +struct btrfs_ioctl_clone_range_args {
> + int64_t src_fd;
> + uint64_t src_offset;
> + uint64_t src_length;
> + uint64_t dest_offset;
> +};
> +
> +#define BTRFS_IOCTL_MAGIC 0x94
> +#define BTRFS_IOC_CLONE   _IOW(BTRFS_IOCTL_MAGIC, 9, int)
> +#define BTRFS_IOC_CLONE_RANGE _IOW(BTRFS_IOCTL_MAGIC, 13, \
> +struct btrfs_ioctl_clone_range_args)

Is there some published header file that these belong to? i.e.
somewhere in the include/uapi/linux/ kernel directory? Normally the
way to handle this sort of thing is by autoconf - if the header file
exists, then we include it, otherwise we use the manual definitions.
This just means that if the public api ever changes, we'll pick it
up automatically in future...
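
A sketch of what that could look like. HAVE_LINUX_BTRFS_H is an assumption:
it is the macro name a configure check for <linux/btrfs.h> would
conventionally define, and it is not something the posted patch contains.
The fallback definitions are the ones from the patch.

```c
#include <stdint.h>
#include <sys/ioctl.h>

#ifdef HAVE_LINUX_BTRFS_H	/* assumed to be set by a configure check */
#include <linux/btrfs.h>
#else
/* Manual fallback, copied from the definitions in the patch. */
struct btrfs_ioctl_clone_range_args {
	int64_t src_fd;
	uint64_t src_offset;
	uint64_t src_length;
	uint64_t dest_offset;
};

#define BTRFS_IOCTL_MAGIC 0x94
#define BTRFS_IOC_CLONE   _IOW(BTRFS_IOCTL_MAGIC, 9, int)
#define BTRFS_IOC_CLONE_RANGE _IOW(BTRFS_IOCTL_MAGIC, 13, \
				   struct btrfs_ioctl_clone_range_args)
#endif
```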

> +int
> +main(int argc, char **argv)
> +{
> + bool full_file = true;
> + uint64_t src_off = 0;
> + uint64_t dst_off = 0;
> + uint64_t len = 0;
> + char *src_file;
> + int src_fd;
> + char *dst_file;
> + int dst_fd;
> + int ret;
> + int opt;
> +
> + while ((opt = getopt(argc, argv, "s:d:l:")) != -1) {
> + switch (opt) {
> + case 's':
> + src_off = atoi(optarg);

atoi() only returns 32 bit numbers. You probably should use
strtoull() as the offset parameters are 64 bit.
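
Something along these lines, as a sketch; parse_u64() is a hypothetical
helper name and is not part of the posted patch:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a full 64-bit unsigned offset or length from a string.
 * Base 0 accepts decimal, octal (leading 0) and hex (leading 0x).
 * Returns 0 on success, -1 on empty input, a minus sign as the first
 * character, trailing garbage or overflow.
 */
static int parse_u64(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long val;

	if (s == NULL || *s == '\0' || *s == '-')
		return -1;

	errno = 0;
	val = strtoull(s, &end, 0);
	if (errno != 0 || end == s || *end != '\0')
		return -1;

	*out = (uint64_t)val;
	return 0;
}
```

In the getopt() loop this would replace the atoi() calls, e.g.
parse_u64(optarg, &src_off), with a usage error on a non-zero return.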

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Johannes Hirte
On Wed, 5 Feb 2014 16:46:57 -0500
Josef Bacik  wrote:

> 
> On 02/05/2014 04:42 PM, Johannes Hirte wrote:
> > On Wed, 5 Feb 2014 14:36:39 -0500
> > Josef Bacik  wrote:
> >
> >> On 02/05/2014 02:30 PM, Johannes Hirte wrote:
> >>> On Wed, 5 Feb 2014 14:00:57 -0500
> >>> Josef Bacik  wrote:
> >>>
>  On 02/05/2014 12:34 PM, Johannes Hirte wrote:
> > On Wed, 5 Feb 2014 10:49:15 -0500
> > Josef Bacik  wrote:
> >
> >> Ok none of those make sense which makes me think it may be the
> >> ktime bits, instead of un-applying the whole patch could you
> >> just comment out the parts
> >>
> >> ktime_t start = ktime_get();
> >>
> >> and
> >>
> >> if (actual_count > 0) {
> >> u64 runtime =
> >> ktime_to_ns(ktime_sub(ktime_get(), start)); u64 avg;
> >>
> >> /*
> >>  * We weigh the current average higher than
> >> our current runtime
> >>  * to avoid large swings in the average.
> >>  */
> >> spin_lock(&delayed_refs->lock);
> >> avg = fs_info->avg_delayed_ref_runtime * 3
> >> + runtime; avg = div64_u64(avg, 4);
> >> fs_info->avg_delayed_ref_runtime = avg;
> >> spin_unlock(&delayed_refs->lock);
> >> }
> >>
> >> in __btrfs_run_delayed_refs and see if that makes the problem
> >> stop? If it does will you try chris's for-linus branch to see
> >> if it still reproduces there?  Maybe some patch changed
> >> ktime_get() in -rc1 that is causing issues and we're just now
> >> exposing it. Thanks,
> > With the ktime bits disabled, I wasn't able to reproduce the
> > problem anymore. With Chris' for-linus branch it took longer but
> > still appeared.
> >
>  Ok can you send your .config, maybe there's some weird time bug
>  being exposed.  What kind of CPU do you have?  Thanks,
> 
>  Josef
> >>> It's a Core i5-540M, dualcore + hyperthreading
> >> Ok while I'm doing this can you change
> >> btrfs_should_throttle_delayed_refs to _always_ return 1, still with
> >> all the ktime stuff commented out, and see if that causes the
> >> problem to happen?  Thanks,
> > Yes it does. Same behavior as without ktime stuff commented out.
> >
> Ok perfect, can you send me a btrfs fi df of that volume, and do you 
> have any snapshots or anything?  Thanks,

btrfs fi df /
Data, single: total=220.01GiB, used=210.85GiB
System, DUP: total=8.00MiB, used=32.00KiB
System, single: total=4.00MiB, used=0.00
Metadata, DUP: total=4.00GiB, used=2.93GiB
Metadata, single: total=8.00MiB, used=0.00

No snapshots but several subvolumes. / itself is a separate subvolume
and subvol 0 only contains the other subvolumes (5 at the moment). qgroups
aren't enabled.

mount options are noatime,inode_cache, if this matters

regards,
  Johannes
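
For reference, the averaging step in the quoted snippet weighs the previous
average three times as heavily as the newest sample. A userspace sketch of
just that arithmetic, with a hypothetical function name and without the
locking or the ktime calls from the kernel code:

```c
#include <stdint.h>

/*
 * new_avg = (3 * old_avg + runtime) / 4
 *
 * A single slow run moves the average by only a quarter of the
 * difference, which damps large swings.
 */
static uint64_t weighted_avg_runtime(uint64_t old_avg, uint64_t runtime)
{
	return (old_avg * 3 + runtime) / 4;
}
```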


Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Josef Bacik


On 02/05/2014 04:42 PM, Johannes Hirte wrote:

On Wed, 5 Feb 2014 14:36:39 -0500
Josef Bacik  wrote:


On 02/05/2014 02:30 PM, Johannes Hirte wrote:

On Wed, 5 Feb 2014 14:00:57 -0500
Josef Bacik  wrote:


On 02/05/2014 12:34 PM, Johannes Hirte wrote:

On Wed, 5 Feb 2014 10:49:15 -0500
Josef Bacik  wrote:


Ok none of those make sense which makes me think it may be the
ktime bits, instead of un-applying the whole patch could you just
comment out the parts

ktime_t start = ktime_get();

and

    if (actual_count > 0) {
        u64 runtime = ktime_to_ns(ktime_sub(ktime_get(), start));
        u64 avg;

        /*
         * We weigh the current average higher than our current runtime
         * to avoid large swings in the average.
         */
        spin_lock(&delayed_refs->lock);
        avg = fs_info->avg_delayed_ref_runtime * 3 + runtime;
        avg = div64_u64(avg, 4);
        fs_info->avg_delayed_ref_runtime = avg;
        spin_unlock(&delayed_refs->lock);
    }

in __btrfs_run_delayed_refs and see if that makes the problem
stop? If it does will you try chris's for-linus branch to see if
it still reproduces there?  Maybe some patch changed ktime_get()
in -rc1 that is causing issues and we're just now exposing it.
Thanks,

With the ktime bits disabled, I wasn't able to reproduce the
problem anymore. With Chris' for-linus branch it took longer but
still appeared.


Ok can you send your .config, maybe there's some weird time bug
being exposed.  What kind of CPU do you have?  Thanks,

Josef

It's a Core i5-540M, dualcore + hyperthreading

Ok while I'm doing this can you change
btrfs_should_throttle_delayed_refs to _always_ return 1, still with
all the ktime stuff commented out, and see if that causes the problem
to happen?  Thanks,

Yes it does. Same behavior as without ktime stuff commented out.


Do you happen to have qgroups enabled?  Thanks,

Josef


Re: [PATCH] Btrfs: fix assert screwup for the pending move stuff

2014-02-05 Thread Filipe David Manana
On Wed, Feb 5, 2014 at 9:19 PM, Josef Bacik  wrote:
> Wang noticed that he was failing btrfs/030 even though me and Filipe couldn't
> reproduce.  Turns out this is because Wang didn't have CONFIG_BTRFS_ASSERT set,
> which meant that a key part of Filipe's original patch was not being built in.
> This appears to be a mess up with merging Filipe's patch as it does not exist in
> his original patch.  Fix this by changing how we make sure del_waiting_dir_move
> asserts that it did not error and take the function out of the ifdef check.
> This makes btrfs/030 pass with the assert on or off.  Thanks,
>
> Signed-off-by: Josef Bacik 

Reviewed-by: Filipe Manana 

Thanks Josef.

I actually had the ASSERT(del_waiting_dir_move(sctx, pm->ino) == 0); what I
didn't have was the function declaration inside #ifdef
CONFIG_BTRFS_ASSERT ... #endif. Obviously that's because I never built with
assertions disabled, and I totally forgot about not using expressions
with side effects inside assert macros.
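
A minimal sketch of that pitfall, assuming an assert macro that expands to
nothing when assertions are compiled out. DEMO_ASSERT, demo_remove_pending()
and the demo_pending flag are hypothetical stand-ins for illustration, not
the btrfs code:

```c
#include <assert.h>

/* Hypothetical stand-in for an assert macro that vanishes when disabled. */
#ifdef DEMO_ASSERT_ENABLED
#define DEMO_ASSERT(expr) assert(expr)
#else
#define DEMO_ASSERT(expr) ((void)0)
#endif

static int demo_pending;

/* Has a side effect: clears the pending flag. Returns 0 on success. */
static int demo_remove_pending(void)
{
	if (!demo_pending)
		return -1;
	demo_pending = 0;
	return 0;
}

/* Buggy: the call lives inside the macro, so it disappears with it. */
static int demo_buggy(void)
{
	demo_pending = 1;
	DEMO_ASSERT(demo_remove_pending() == 0);
	return demo_pending;	/* still 1 when assertions are disabled */
}

/* Fixed: always do the work, only the check is optional. */
static int demo_fixed(void)
{
	int ret;

	demo_pending = 1;
	ret = demo_remove_pending();
	DEMO_ASSERT(ret == 0);
	(void)ret;
	return demo_pending;	/* 0 with assertions on or off */
}
```

Built without DEMO_ASSERT_ENABLED, demo_buggy() leaves the flag set while
demo_fixed() clears it, which is the same shape as the change in the patch
quoted below.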

> ---
>  fs/btrfs/send.c | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
> index f71fcbc..afb145d 100644
> --- a/fs/btrfs/send.c
> +++ b/fs/btrfs/send.c
> @@ -2734,8 +2734,6 @@ static int add_waiting_dir_move(struct send_ctx *sctx, u64 ino)
>  	return 0;
>  }
>
> -#ifdef CONFIG_BTRFS_ASSERT
> -
>  static int del_waiting_dir_move(struct send_ctx *sctx, u64 ino)
>  {
>  	struct rb_node *n = sctx->waiting_dir_moves.rb_node;
> @@ -2756,8 +2754,6 @@ static int del_waiting_dir_move(struct send_ctx *sctx, u64 ino)
>  	return -ENOENT;
>  }
>
> -#endif
> -
>  static int add_pending_dir_move(struct send_ctx *sctx, u64 parent_ino)
>  {
>  	struct rb_node **p = &sctx->pending_dir_moves.rb_node;
> @@ -2862,7 +2858,9 @@ static int apply_dir_move(struct send_ctx *sctx, struct pending_dir_move *pm)
>  	}
>
>  	sctx->send_progress = sctx->cur_ino + 1;
> -	ASSERT(del_waiting_dir_move(sctx, pm->ino) == 0);
> +	ret = del_waiting_dir_move(sctx, pm->ino);
> +	ASSERT(ret == 0);
> +
>  	ret = get_cur_path(sctx, pm->ino, pm->gen, to_path);
>  	if (ret < 0)
>  		goto out;
> --
> 1.8.3.1
>



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Josef Bacik


On 02/05/2014 04:42 PM, Johannes Hirte wrote:

On Wed, 5 Feb 2014 14:36:39 -0500
Josef Bacik  wrote:


On 02/05/2014 02:30 PM, Johannes Hirte wrote:

On Wed, 5 Feb 2014 14:00:57 -0500
Josef Bacik  wrote:


On 02/05/2014 12:34 PM, Johannes Hirte wrote:

On Wed, 5 Feb 2014 10:49:15 -0500
Josef Bacik  wrote:


Ok none of those make sense which makes me think it may be the
ktime bits, instead of un-applying the whole patch could you just
comment out the parts

ktime_t start = ktime_get();

and

    if (actual_count > 0) {
        u64 runtime = ktime_to_ns(ktime_sub(ktime_get(), start));
        u64 avg;

        /*
         * We weigh the current average higher than our current runtime
         * to avoid large swings in the average.
         */
        spin_lock(&delayed_refs->lock);
        avg = fs_info->avg_delayed_ref_runtime * 3 + runtime;
        avg = div64_u64(avg, 4);
        fs_info->avg_delayed_ref_runtime = avg;
        spin_unlock(&delayed_refs->lock);
    }

in __btrfs_run_delayed_refs and see if that makes the problem
stop? If it does will you try chris's for-linus branch to see if
it still reproduces there?  Maybe some patch changed ktime_get()
in -rc1 that is causing issues and we're just now exposing it.
Thanks,

With the ktime bits disabled, I wasn't able to reproduce the
problem anymore. With Chris' for-linus branch it took longer but
still appeared.


Ok can you send your .config, maybe there's some weird time bug
being exposed.  What kind of CPU do you have?  Thanks,

Josef

It's a Core i5-540M, dualcore + hyperthreading

Ok while I'm doing this can you change
btrfs_should_throttle_delayed_refs to _always_ return 1, still with
all the ktime stuff commented out, and see if that causes the problem
to happen?  Thanks,

Yes it does. Same behavior as without ktime stuff commented out.

Ok perfect, can you send me a btrfs fi df of that volume, and do you 
have any snapshots or anything?  Thanks,


Josef


Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Johannes Hirte
On Wed, 5 Feb 2014 14:36:39 -0500
Josef Bacik  wrote:

> 
> On 02/05/2014 02:30 PM, Johannes Hirte wrote:
> > On Wed, 5 Feb 2014 14:00:57 -0500
> > Josef Bacik  wrote:
> >
> >> On 02/05/2014 12:34 PM, Johannes Hirte wrote:
> >>> On Wed, 5 Feb 2014 10:49:15 -0500
> >>> Josef Bacik  wrote:
> >>>
> >>>> Ok none of those make sense which makes me think it may be the
> >>>> ktime bits, instead of un-applying the whole patch could you just
> >>>> comment out the parts
> >>>>
> >>>>     ktime_t start = ktime_get();
> >>>>
> >>>> and
> >>>>
> >>>>     if (actual_count > 0) {
> >>>>         u64 runtime = ktime_to_ns(ktime_sub(ktime_get(), start));
> >>>>         u64 avg;
> >>>>
> >>>>         /*
> >>>>          * We weigh the current average higher than our current runtime
> >>>>          * to avoid large swings in the average.
> >>>>          */
> >>>>         spin_lock(&delayed_refs->lock);
> >>>>         avg = fs_info->avg_delayed_ref_runtime * 3 + runtime;
> >>>>         avg = div64_u64(avg, 4);
> >>>>         fs_info->avg_delayed_ref_runtime = avg;
> >>>>         spin_unlock(&delayed_refs->lock);
> >>>>     }
> >>>>
> >>>> in __btrfs_run_delayed_refs and see if that makes the problem
> >>>> stop? If it does will you try chris's for-linus branch to see if
> >>>> it still reproduces there?  Maybe some patch changed ktime_get()
> >>>> in -rc1 that is causing issues and we're just now exposing it.
> >>>> Thanks,
> >>> With the ktime bits disabled, I wasn't able to reproduce the
> >>> problem anymore. With Chris' for-linus branch it took longer but
> >>> still appeared.
> >>>
> >> Ok can you send your .config, maybe there's some weird time bug
> >> being exposed.  What kind of CPU do you have?  Thanks,
> >>
> >> Josef
> > It's a Core i5-540M, dualcore + hyperthreading
> Ok while I'm doing this can you change 
> btrfs_should_throttle_delayed_refs to _always_ return 1, still with
> all the ktime stuff commented out, and see if that causes the problem
> to happen?  Thanks,

Yes it does. Same behavior as without ktime stuff commented out.



[PATCH] Btrfs: fix assert screwup for the pending move stuff

2014-02-05 Thread Josef Bacik
Wang noticed that he was failing btrfs/030 even though me and Filipe couldn't
reproduce.  Turns out this is because Wang didn't have CONFIG_BTRFS_ASSERT set,
which meant that a key part of Filipe's original patch was not being built in.
This appears to be a mess up with merging Filipe's patch as it does not exist in
his original patch.  Fix this by changing how we make sure del_waiting_dir_move
asserts that it did not error and take the function out of the ifdef check.
This makes btrfs/030 pass with the assert on or off.  Thanks,

Signed-off-by: Josef Bacik 
---
 fs/btrfs/send.c | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index f71fcbc..afb145d 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -2734,8 +2734,6 @@ static int add_waiting_dir_move(struct send_ctx *sctx, u64 ino)
 	return 0;
 }
 
-#ifdef CONFIG_BTRFS_ASSERT
-
 static int del_waiting_dir_move(struct send_ctx *sctx, u64 ino)
 {
 	struct rb_node *n = sctx->waiting_dir_moves.rb_node;
@@ -2756,8 +2754,6 @@ static int del_waiting_dir_move(struct send_ctx *sctx, u64 ino)
 	return -ENOENT;
 }
 
-#endif
-
 static int add_pending_dir_move(struct send_ctx *sctx, u64 parent_ino)
 {
 	struct rb_node **p = &sctx->pending_dir_moves.rb_node;
@@ -2862,7 +2858,9 @@ static int apply_dir_move(struct send_ctx *sctx, struct pending_dir_move *pm)
 	}
 
 	sctx->send_progress = sctx->cur_ino + 1;
-	ASSERT(del_waiting_dir_move(sctx, pm->ino) == 0);
+	ret = del_waiting_dir_move(sctx, pm->ino);
+	ASSERT(ret == 0);
+
 	ret = get_cur_path(sctx, pm->ino, pm->gen, to_path);
 	if (ret < 0)
 		goto out;
-- 
1.8.3.1



Re: [PATCH] Btrfs: convert to add transaction protection for btrfs send

2014-02-05 Thread Josef Bacik

On 02/05/2014 12:23 PM, Wang Shilong wrote:
> Hi Josef,
>
>> On 02/05/2014 03:59 AM, Wang Shilong wrote:
>>> Hi Josef,
>>>
>>> [..SNIP..]
>>>> On 01/31/2014 11:37 AM, Wang Shilong wrote:
>>>>> Hello Josef,
>>>>>
>>>> 2) Remove the per-root rwsem for the commit root and just make one big
>>>> rwsem that covers all commit root switching. This way everybody who
>>>> wants to search with the commit root can just use this semaphore and all
>>>> be safe. It will mean that the inode cache stuff may block longer than
>>>> normal but I don't think that's too big of a deal.
>>>>
>>> I am ok with this fix,  I wanted to talk something about protecting 
>>> searching commit file root, this is really a
>>> problem especially for full send.
>>>
>>> I have some ideas about this issue:
>>>
>>> #1.don't use commit file root to search.
>>> This will become a nightmare when we are doing full send which will iterate 
>>> the whole file tree,
>>> at the same time, we snapshot send root, snapshots will be blocked until 
>>> send finished.
>>>
>>> #2. don't allow snapshot if we are sending root.
>>> This may be a little confusing, snapshots are readonly, but users can not 
>>> snapshot it.
>> I think this is the best bet. The fact is we don't want to hold this
>> commit_root_sem for the entire duration of the send, it would block
>> people trying to commit the transaction. We could check for contention
>> and drop the sem and re-search down to where we were but I think that
>> would be prone to errors. If we just check to see if the snapshot is
>> being sent and just return -EBUSY when we try to create a snapshot I
>> think that's perfectly reasonable.
>>> #3. after one iteration, we do check send_root's generation, and make sure 
>>> it doesn't
>>> change, if it changed, then we restart send again.
>>>
>>> I don't know which approach is better,and also snapshot-aware defragment 
>>> will change
>>> read-only snapshot?
>>>
>>> Did you have any better ideas about this issue? Share it with me here.^_^
>>>
>> Snapshot-aware defrag will definitely screw us here. I think we need to
>> do the same thing above as we do here, which is to simply skip the
>> snapshot aware defrag if we are currently using that root for send. This
>> sound reasonable to you? Thanks,
> Yeah, very reasonable, if you don't mind, i would give a patch for this issue.
Go for it, you'll be faster than I will be, all I do is run xfstests and
try to reproduce things that will never reproduce for me.
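
The approach agreed on above (refuse a snapshot with -EBUSY while the root
is being sent, and skip snapshot-aware defrag for the same root) could be
sketched roughly as follows. This is only an illustration under stated
assumptions: struct demo_root, its send_in_progress counter and all demo_*
helpers are hypothetical names, and the locking a real implementation would
need around the counter is left out.

```c
#include <errno.h>

/* Hypothetical, simplified root that only tracks running sends. */
struct demo_root {
	int send_in_progress;	/* number of sends currently using this root */
	int nr_snapshots;	/* snapshots created so far */
};

static void demo_send_begin(struct demo_root *root)
{
	root->send_in_progress++;
}

static void demo_send_end(struct demo_root *root)
{
	root->send_in_progress--;
}

/* Refuse to snapshot a root while it is being sent. */
static int demo_create_snapshot(struct demo_root *root)
{
	if (root->send_in_progress > 0)
		return -EBUSY;
	root->nr_snapshots++;
	return 0;
}

/* Snapshot-aware defrag is skipped rather than failed. */
static int demo_should_skip_defrag(const struct demo_root *root)
{
	return root->send_in_progress > 0;
}
```

The point of the design is that nothing has to hold commit_root_sem for the
whole duration of the send; the caller just gets a cheap, early refusal.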

Josef


Re: [PATCH 1/2] Btrfs: switch to btrfs_previous_extent_item()

2014-02-05 Thread Josef Bacik


On 02/05/2014 11:14 AM, Wang Shilong wrote:

Hi Filipe,


So i knew what was wrong here, we need found_key while 
btrfs_previous_extent_item() did set
it properly..^_^

I will send a v2 to fix this, thanks!



On Fri, Jan 31, 2014 at 4:42 PM, Wang Shilong  wrote:

From: Wang Shilong 

Since we have introduced btrfs_previous_extent_item() to search previous
extent item, just switch into it.

Signed-off-by: Wang Shilong 

Hi Shilong,

This patch is making btrfs/004 fail for me, consistently:

I was trying to reproduce this xfstests failure (though we already know what's
wrong with my previous patch).
I did not actually hit the 004 failure, but I can reproduce a btrfs/030 failure
consistently; I think you might be interested in this:

FSTYP -- btrfs
PLATFORM  -- Linux/i686 wangsl 3.13.0-4-default+
MKFS_OPTIONS  -- /dev/sdb2
MOUNT_OPTIONS -- /dev/sdb2 /mnt/scratch

btrfs/030[failed, exit status 1] - output mismatch (see /home/wangsl/tools/xfstests/results//btrfs/030.out.bad)

 --- tests/btrfs/030.out 2014-02-01 01:01:11.261999486 +0800
 +++ /home/wangsl/tools/xfstests/results//btrfs/030.out.bad  2014-02-05 23:56:31.740988010 +0800
 @@ -1 +1,3 @@
  QA output created by 030
 +failed: '/home/wangsl/tools/xfstests/src/fssum -r /tmp/tmp.30GWDU8xaU/2.fssum /mnt/scratch/mysnap2'
 +(see /home/wangsl/tools/xfstests/results//btrfs/030.full for details)
 ...
 (Run 'diff -u tests/btrfs/030.out /home/wangsl/tools/xfstests/results//btrfs/030.out.bad'  to see the entire diff)
Ran: btrfs/030
Failures: btrfs/030
Failed 1 of 1 tests

dmesg shows more information:

[  818.988731] WARNING: CPU: 0 PID: 29978 at fs/btrfs/send.c:5427 
btrfs_ioctl_send+0x34b/0xeb0 [btrfs]()
[  818.988733] Modules linked in: xt_tcpudp xt_pkttype xt_LOG xt_limit 
ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT 
iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns 
nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack 
nf_conntrack ip6table_filter ip6_tables x_tables fuse bnep snd_ens1371 coretemp 
crc32_pclmul gameport crc32c_intel snd_rawmidi aesni_intel snd_ac97_codec 
sr_mod cdrom ata_generic ac97_bus snd_pcm snd_seq ppdev ata_piix snd_timer 
snd_seq_device ablk_helper ahci btusb snd libahci cryptd bluetooth libata 
vmw_balloon lrw aes_i586 xts serio_raw gf128mul vmw_vmci parport_pc pcspkr 
soundcore mptctl snd_page_alloc parport pcnet32 i2c_piix4 shpchp joydev floppy 
mii ac button rfkill sg autofs4 btrfs raid6_pq xor linear hid_generic
[  818.988766]  usbhid hid uhci_hcd vmwgfx ehci_pci ehci_hcd processor 
thermal_sys usbcore hwmon ttm usb_common mptspi mptscsih mptbase 
scsi_transport_spi drm i2c_core scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc 
scsi_dh_alua scsi_dh dm_snapshot dm_mirror dm_region_hash dm_log dm_mod
[  818.988786] CPU: 0 PID: 29978 Comm: btrfs Tainted: GW
3.13.0-4-default+ #44
[  818.988787] Hardware name: VMware, Inc. VMware Virtual Platform/440BX 
Desktop Reference Platform, BIOS 6.00 07/02/2012
[  818.988789]    c9561cf8 c06a8276  c9561d28 c02432f9 
c080cf24
[  818.988793]   751a fa1b7b6e 1533 fa1a647b fa1a647b dade1140 
dade1138
[  818.988797]  dade1000 c9561d38 c024338d 0009  c9561df4 fa1a647b 
dade1000
[  818.988800] Call Trace:
[  818.988858]  [] dump_stack+0x41/0x52
[  818.988941]  [] warn_slowpath_common+0x79/0x90
[  818.988962]  [] ? btrfs_ioctl_send+0x34b/0xeb0 [btrfs]
[  818.988975]  [] ? btrfs_ioctl_send+0x34b/0xeb0 [btrfs]
[  818.988977]  [] warn_slowpath_null+0x1d/0x20
[  818.988990]  [] btrfs_ioctl_send+0x34b/0xeb0 [btrfs]
[  818.989004]  [] ? update_ioctl_balance_args+0x2c0/0x2c0 [btrfs]
[  818.989017]  [] btrfs_ioctl+0x2a8/0x33f0 [btrfs]
[  818.989021]  [] ? update_cfs_rq_blocked_load+0x116/0x170
[  818.989023]  [] ? __enqueue_entity+0x65/0x70
[  818.989025]  [] ? enqueue_entity+0x31c/0xe60
[  818.989028]  [] ? enqueue_task_fair+0x5d1/0x7d0
[  818.989031]  [] ? sched_clock+0x8/0x10
[  818.989043]  [] ? update_ioctl_balance_args+0x2c0/0x2c0 [btrfs]
[  818.989048]  [] do_vfs_ioctl+0x2d2/0x4b0
[  818.989051]  [] ? resched_task+0x3b/0x50
[  818.989053]  [] ? check_preempt_curr+0x5d/0x80
[  818.989056]  [] ? wake_up_new_task+0xe5/0x140
[  818.989058]  [] ? do_fork+0x100/0x2b0
[  818.989061]  [] SyS_ioctl+0x58/0x80
[  818.989063]  [] sysenter_do_call+0x12/0x28
[  818.989065] ---[ end trace 7f6e499355102e48 ]---
[  819.101601] BTRFS: device fsid 061bb332-4adc-4489-9a79-0931007b9d51 devid 1 
transid 4 /dev/sdb2
[  819.117930] BTRFS: device fsid 061bb332-4adc-4489-9a79-0931007b9d51 devid 1 
transid 4 /dev/sdb2
[  819.118653] BTRFS info (device sdb2): disk space caching is enabled
[  819.118655] BTRFS: flagging fs with big metadata feature
[  819.119958

Provide a better free space estimate on RAID1

2014-02-05 Thread Roman Mamedov
Hello,

On a freshly-created RAID1 filesystem of two 1TB disks:

# df -h /mnt/p2/
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   1.8T  1.1M  1.8T   1% /mnt/p2

I cannot write 2TB of user data to that RAID1, so this estimate is clearly
misleading. I got tired of looking at the bogus disk free space on all my
RAID1 btrfs systems, so today I decided to do something about this:

--- fs/btrfs/super.c.orig	2014-02-06 01:28:36.636164982 +0600
+++ fs/btrfs/super.c	2014-02-06 01:28:58.304164370 +0600
@@ -1481,6 +1481,11 @@
 	}
 
 	kfree(devices_info);
+
+	if (type & BTRFS_BLOCK_GROUP_RAID1) {
+		do_div(avail_space, min_stripes);
+	}
+
 	*free_bytes = avail_space;
 	return 0;
 }


After:

# df -h /mnt/p2/
Filesystem  Size  Used Avail Use% Mounted on
/dev/sda2   1.8T  1.1M  912G   1% /mnt/p2

Until per-subvolume RAID profiles are implemented, this estimate will be
correct, and even after, it should be closer to the truth than assuming the
user will fill their RAID1 FS only with subvolumes of single or raid0 profiles.

If anyone likes, feel free to reimplement my PoC patch in a better way, e.g.
integrate this into the calculation 'while' block of that function immediately
before it (logic of which I couldn't yet grasp due to it lacking comments),
and not just tacked onto the tail of it.
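
For illustration only, the arithmetic behind the adjusted estimate can be
sketched in plain C. demo_usable_bytes() and its copies parameter are
hypothetical names, and the sketch assumes every byte of user data is
stored exactly "copies" times (2 for RAID1), which is what the division by
min_stripes in the patch amounts to:

```c
#include <stdint.h>

/*
 * Hypothetical helper: turn raw free space into an estimate of how much
 * user data still fits, assuming each byte is stored "copies" times.
 */
static uint64_t demo_usable_bytes(uint64_t raw_free, unsigned int copies)
{
	if (copies == 0)
		return 0;	/* avoid dividing by zero on bad input */
	return raw_free / copies;
}
```

With two 1TB disks, 2 * 10^12 raw bytes and copies = 2 give 10^12 usable
bytes, i.e. roughly half of what the unpatched df reports.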

-- 
With respect,
Roman




Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Josef Bacik


On 02/05/2014 02:30 PM, Johannes Hirte wrote:

On Wed, 5 Feb 2014 14:00:57 -0500
Josef Bacik  wrote:


On 02/05/2014 12:34 PM, Johannes Hirte wrote:

On Wed, 5 Feb 2014 10:49:15 -0500
Josef Bacik  wrote:


Ok none of those make sense which makes me think it may be the
ktime bits, instead of un-applying the whole patch could you just
comment out the parts

   ktime_t start = ktime_get();

and

    if (actual_count > 0) {
        u64 runtime = ktime_to_ns(ktime_sub(ktime_get(), start));
        u64 avg;

        /*
         * We weigh the current average higher than our current runtime
         * to avoid large swings in the average.
         */
        spin_lock(&delayed_refs->lock);
        avg = fs_info->avg_delayed_ref_runtime * 3 + runtime;
        avg = div64_u64(avg, 4);
        fs_info->avg_delayed_ref_runtime = avg;
        spin_unlock(&delayed_refs->lock);
    }

in __btrfs_run_delayed_refs and see if that makes the problem stop?
If it does will you try chris's for-linus branch to see if it still
reproduces there?  Maybe some patch changed ktime_get() in -rc1
that is causing issues and we're just now exposing it.  Thanks,

With the ktime bits disabled, I wasn't able to reproduce the
problem anymore. With Chris' for-linus branch it took longer but
still appeared.


Ok can you send your .config, maybe there's some weird time bug being
exposed.  What kind of CPU do you have?  Thanks,

Josef

It's a Core i5-540M, dualcore + hyperthreading
Ok while I'm doing this can you change 
btrfs_should_throttle_delayed_refs to _always_ return 1, still with all 
the ktime stuff commented out, and see if that causes the problem to 
happen?  Thanks,


Josef


Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Josef Bacik


On 02/05/2014 12:34 PM, Johannes Hirte wrote:

On Wed, 5 Feb 2014 10:49:15 -0500
Josef Bacik  wrote:


Ok none of those make sense which makes me think it may be the ktime
bits, instead of un-applying the whole patch could you just comment
out the parts

  ktime_t start = ktime_get();

and

    if (actual_count > 0) {
        u64 runtime = ktime_to_ns(ktime_sub(ktime_get(), start));
        u64 avg;

        /*
         * We weigh the current average higher than our current runtime
         * to avoid large swings in the average.
         */
        spin_lock(&delayed_refs->lock);
        avg = fs_info->avg_delayed_ref_runtime * 3 + runtime;
        avg = div64_u64(avg, 4);
        fs_info->avg_delayed_ref_runtime = avg;
        spin_unlock(&delayed_refs->lock);
    }

in __btrfs_run_delayed_refs and see if that makes the problem stop?
If it does will you try chris's for-linus branch to see if it still
reproduces there?  Maybe some patch changed ktime_get() in -rc1 that
is causing issues and we're just now exposing it.  Thanks,

With the ktime bits disabled, I wasn't able to reproduce the
problem anymore. With Chris' for-linus branch it took longer but still
appeared.

Ok can you send your .config, maybe there's some weird time bug being 
exposed.  What kind of CPU do you have?  Thanks,


Josef


Re: Are nocow files snapshot-aware

2014-02-05 Thread Kai Krakow
David Sterba  wrote:

> On Tue, Feb 04, 2014 at 08:22:05PM -0500, Josef Bacik wrote:
>> On 02/04/2014 03:52 PM, Kai Krakow wrote:
>> >Hi!
>> >
>> >I'm curious... The whole snapshot thing on btrfs is based on its COW
>> >design. But you can make individual files and directory contents nocow
>> >by applying the C attribute on it using chattr. This is usually
>> >recommended for database files and VM images. So far, so good...
>> >
>> >But what happens to such files when they are part of a snapshot? Do they
>> >become duplicated during the snapshot? Do they become unshared (as a
>> >whole) when written to? Or when the parent snapshot becomes deleted?
>> >Or maybe the nocow attribute is just ignored after a snapshot was taken?
>> >
>> >After all they are nocow and thus would be handled in another way when
>> >snapshotted.
>> >
>> When snapshotted nocow files fallback to normal cow behaviour.
> 
> This may seem unclear to people not familiar with the actual
> implementation, and I had to think for a second about that sentence. The
> file will keep the NOCOW status, but any modified blocks will be newly
> allocated on the first write (in a COW manner), then the block location
> will not change anymore (unlike ordinary COW).

Ah okay, that makes it clear. So, actually, in the snapshot the file is 
still nocow, with the exception that blocks being written to become 
unshared and relocated. This may introduce a lot of fragmentation but it 
won't become worse when rewriting the same blocks over and over again.
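
To make sure I got it, here is a toy model of that behaviour in C. It is
only a sketch of the description above, not how btrfs implements it:
struct demo_block, the demo_* functions and the fake allocator counter are
all made up for illustration.

```c
/* Hypothetical model of a single block of a nocow file. */
struct demo_block {
	unsigned long location;	/* where the data currently lives */
	int shared;		/* still referenced by a snapshot? */
};

/* Fake allocator: hands out ever-increasing block locations. */
static unsigned long demo_next_free = 1000;

/* Taking a snapshot makes the block shared with the snapshot. */
static void demo_snapshot(struct demo_block *b)
{
	b->shared = 1;
}

/* Write to a nocow block: relocate once if shared, else write in place. */
static void demo_nocow_write(struct demo_block *b)
{
	if (b->shared) {
		/* First write after a snapshot: new location, COW-style. */
		b->location = demo_next_free++;
		b->shared = 0;
	}
	/* Not shared (anymore): overwrite in place, location is stable. */
}
```

So a block moves at most once per snapshot, no matter how often it is
rewritten afterwards. The snapshot keeps referencing the old location,
which this model does not track.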

> HTH

Yes, it does. ;-)

-- 
Replies to list only preferred.



Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Johannes Hirte
On Wed, 5 Feb 2014 10:49:15 -0500
Josef Bacik  wrote:

> Ok none of those make sense which makes me think it may be the ktime 
> bits, instead of un-applying the whole patch could you just comment
> out the parts
> 
>  ktime_t start = ktime_get();
> 
> and
> 
>     if (actual_count > 0) {
>         u64 runtime = ktime_to_ns(ktime_sub(ktime_get(), start));
>         u64 avg;
>
>         /*
>          * We weigh the current average higher than our current runtime
>          * to avoid large swings in the average.
>          */
>         spin_lock(&delayed_refs->lock);
>         avg = fs_info->avg_delayed_ref_runtime * 3 + runtime;
>         avg = div64_u64(avg, 4);
>         fs_info->avg_delayed_ref_runtime = avg;
>         spin_unlock(&delayed_refs->lock);
>     }
> 
> in __btrfs_run_delayed_refs and see if that makes the problem stop?
> If it does will you try chris's for-linus branch to see if it still 
> reproduces there?  Maybe some patch changed ktime_get() in -rc1 that
> is causing issues and we're just now exposing it.  Thanks,
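
For reference, the averaging in the quoted snippet is a simple weighted
moving average: the old average counts three times, the newest runtime
once. A standalone sketch of just that arithmetic, with demo_update_avg()
as a hypothetical name and plain integer division standing in for
div64_u64():

```c
#include <stdint.h>

/*
 * Weighted moving average as in the quoted snippet: 3/4 of the old
 * average plus 1/4 of the newest sample, rounded down.
 */
static uint64_t demo_update_avg(uint64_t avg, uint64_t runtime)
{
	return (avg * 3 + runtime) / 4;
}
```

A single slow run therefore moves the average by only a quarter of the
difference, e.g. an average of 1000 ns followed by a 5000 ns run becomes
2000 ns.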

With the ktime bits disabled, I wasn't able to reproduce the
problem anymore. With Chris' for-linus branch it took longer but still
appeared.

regards,
  Johannes


Re: [PATCH] Btrfs: convert to add transaction protection for btrfs send

2014-02-05 Thread Wang Shilong
Hi Josef,

> 
> On 02/05/2014 03:59 AM, Wang Shilong wrote:
>> Hi Josef,
>> 
>> [..SNIP..]
>>> On 01/31/2014 11:37 AM, Wang Shilong wrote:
>>>> Hello Josef,
>>>>
>>>>>
>>> 2) Remove the per-root rwsem for the commit root and just make one big
>>> rwsem that covers all commit root switching. This way everybody who
>>> wants to search with the commit root can just use this semaphore and all
>>> be safe. It will mean that the inode cache stuff may block longer than
>>> normal but I don't think that's too big of a deal.
>>> 
>> I am ok with this fix,  I wanted to talk something about protecting 
>> searching commit file root, this is really a
>> problem especially for full send.
>> 
>> I have some ideas about this issue:
>> 
>> #1.don't use commit file root to search.
>> This will become a nightmare when we are doing full send which will iterate 
>> the whole file tree,
>> at the same time, we snapshot send root, snapshots will be blocked until 
>> send finished.
>> 
>> #2. don't allow snapshot if we are sending root.
>> This may be a little confusing, snapshots are readonly, but users can not 
>> snapshot it.
> I think this is the best bet. The fact is we don't want to hold this
> commit_root_sem for the entire duration of the send, it would block
> people trying to commit the transaction. We could check for contention
> and drop the sem and re-search down to where we were but I think that
> would be prone to errors. If we just check to see if the snapshot is
> being sent and just return -EBUSY when we try to create a snapshot I
> think that's perfectly reasonable.
>> #3. after one iteration, we do check send_root's generation, and make sure 
>> it doesn't
>> change, if it changed, then we restart send again.
>> 
>> I don't know which approach is better,and also snapshot-aware defragment 
>> will change
>> read-only snapshot?
>> 
>> Did you have any better ideas about this issue? Share it with me here.^_^
>> 
> Snapshot-aware defrag will definitely screw us here. I think we need to
> do the same thing above as we do here, which is to simply skip the
> snapshot aware defrag if we are currently using that root for send. This
> sound reasonable to you? Thanks,

Yeah, very reasonable. If you don't mind, I will send a patch for this issue.

Thanks,
Wang
> 
> Josef



[PATCH 2/2] Btrfs: make some tree searches in send.c more efficient

2014-02-05 Thread Filipe David Borba Manana
We have this pattern where we do search for a contiguous group of
items in a tree and every time we find an item, we process it, then
we release our path, increment the offset of the search key, do
another full tree search and repeat these steps until a tree search
can't find more items we're interested in.

Instead of doing these full tree searches after processing each item,
just process the next item/slot in our leaf and don't release the path.
Since all these trees are read only and we always use the commit root
for a search and skip node/leaf locks, we're not affecting concurrency
on the trees.
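
As a rough userspace illustration of the difference (not the btrfs code),
the two patterns can be compared on a sorted array standing in for the
tree. struct demo_key, demo_search_slot() and the demo_nr_searches counter
are hypothetical, and a binary search stands in for a full tree search:

```c
#include <stddef.h>

/* Hypothetical key: items are sorted by (objectid, offset). */
struct demo_key {
	unsigned long objectid;
	unsigned long offset;
};

static unsigned long demo_nr_searches;	/* counts "full tree searches" */

/* Stand-in for a tree search: first slot whose key is >= target. */
static size_t demo_search_slot(const struct demo_key *items, size_t n,
			       const struct demo_key *target)
{
	size_t lo = 0, hi = n;

	demo_nr_searches++;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (items[mid].objectid < target->objectid ||
		    (items[mid].objectid == target->objectid &&
		     items[mid].offset < target->offset))
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Old pattern: bump the key offset and search again for every item. */
static size_t demo_count_research(const struct demo_key *items, size_t n,
				  unsigned long objectid)
{
	struct demo_key key = { objectid, 0 };
	size_t found = 0;

	for (;;) {
		size_t slot = demo_search_slot(items, n, &key);

		if (slot >= n || items[slot].objectid != objectid)
			break;
		found++;
		key.offset = items[slot].offset + 1;
	}
	return found;
}

/* New pattern: search once, then just advance to the next slot. */
static size_t demo_count_walk(const struct demo_key *items, size_t n,
			      unsigned long objectid)
{
	struct demo_key key = { objectid, 0 };
	size_t slot = demo_search_slot(items, n, &key);
	size_t found = 0;

	while (slot < n && items[slot].objectid == objectid) {
		found++;
		slot++;
	}
	return found;
}
```

Both functions visit the same items; the first needs one search per item
plus a final one, the second needs a single search. The real code also has
to step to the next leaf when a leaf is exhausted, which the flat array
does not model.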

Signed-off-by: Filipe David Borba Manana 
---
 fs/btrfs/send.c |  105 +--
 1 file changed, 64 insertions(+), 41 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 7a1b547..f46c43f 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -2460,17 +2460,26 @@ static int did_create_dir(struct send_ctx *sctx, u64 dir)
key.objectid = dir;
key.type = BTRFS_DIR_INDEX_KEY;
key.offset = 0;
+   ret = btrfs_search_slot(NULL, sctx->send_root, &key, path, 0, 0);
+   if (ret < 0)
+   goto out;
+
while (1) {
-   ret = btrfs_search_slot_for_read(sctx->send_root, &key, path,
-   1, 0);
-   if (ret < 0)
-   goto out;
-   if (!ret) {
-   eb = path->nodes[0];
-   slot = path->slots[0];
-   btrfs_item_key_to_cpu(eb, &found_key, slot);
+   eb = path->nodes[0];
+   slot = path->slots[0];
+   if (slot >= btrfs_header_nritems(eb)) {
+   ret = btrfs_next_leaf(sctx->send_root, path);
+   if (ret < 0) {
+   goto out;
+   } else if (ret > 0) {
+   ret = 0;
+   break;
+   }
+   continue;
}
-   if (ret || found_key.objectid != key.objectid ||
+
+   btrfs_item_key_to_cpu(eb, &found_key, slot);
+   if (found_key.objectid != key.objectid ||
found_key.type != key.type) {
ret = 0;
goto out;
@@ -2485,8 +2494,7 @@ static int did_create_dir(struct send_ctx *sctx, u64 dir)
goto out;
}
 
-   key.offset = found_key.offset + 1;
-   btrfs_release_path(path);
+   path->slots[0]++;
}
 
 out:
@@ -2652,19 +2660,24 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 send_progress)
key.objectid = dir;
key.type = BTRFS_DIR_INDEX_KEY;
key.offset = 0;
+   ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+   if (ret < 0)
+   goto out;
 
while (1) {
-   ret = btrfs_search_slot_for_read(root, &key, path, 1, 0);
-   if (ret < 0)
-   goto out;
-   if (!ret) {
-   btrfs_item_key_to_cpu(path->nodes[0], &found_key,
-   path->slots[0]);
+   if (path->slots[0] >= btrfs_header_nritems(path->nodes[0])) {
+   ret = btrfs_next_leaf(root, path);
+   if (ret < 0)
+   goto out;
+   else if (ret > 0)
+   break;
+   continue;
}
-   if (ret || found_key.objectid != key.objectid ||
-   found_key.type != key.type) {
+   btrfs_item_key_to_cpu(path->nodes[0], &found_key,
+ path->slots[0]);
+   if (found_key.objectid != key.objectid ||
+   found_key.type != key.type)
break;
-   }
 
di = btrfs_item_ptr(path->nodes[0], path->slots[0],
struct btrfs_dir_item);
@@ -2675,8 +2688,7 @@ static int can_rmdir(struct send_ctx *sctx, u64 dir, u64 send_progress)
goto out;
}
 
-   btrfs_release_path(path);
-   key.offset = found_key.offset + 1;
+   path->slots[0]++;
}
 
ret = 1;
@@ -3581,15 +3593,22 @@ static int process_all_refs(struct send_ctx *sctx,
key.objectid = sctx->cmp_key->objectid;
key.type = BTRFS_INODE_REF_KEY;
key.offset = 0;
-   while (1) {
-   ret = btrfs_search_slot_for_read(root, &key, path, 1, 0);
-   if (ret < 0)
-   goto out;
-   if (ret)
-   break;
+   ret = btrfs_search_slot(NULL, root, &key, path, 0, 0);
+   if (ret < 0)
+   goto out;
 
+   while (1) {
e

[PATCH 1/2] Btrfs: use right extent item position in send when finding extent clones

2014-02-05 Thread Filipe David Borba Manana
This was a leftover from the commit:

   74dd17fbe3d65829e75d84f00a9525b2ace93998
   (Btrfs: fix btrfs send for inline items and compression)

Signed-off-by: Filipe David Borba Manana 
---
 fs/btrfs/send.c |2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index f71fcbc..7a1b547 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -1257,8 +1257,6 @@ static int find_extent_clone(struct send_ctx *sctx,
extent_item_pos = logical - found_key.objectid;
else
extent_item_pos = 0;
-
-   extent_item_pos = logical - found_key.objectid;
ret = iterate_extent_inodes(sctx->send_root->fs_info,
found_key.objectid, extent_item_pos, 1,
__iterate_backrefs, backref_ctx);
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] Btrfs: switch to btrfs_previous_extent_item()

2014-02-05 Thread Wang Shilong
Hi Filipe,

> So i knew what was wrong here, we need found_key while 
> btrfs_previous_extent_item() did set
> it properly..^_^
> 
> I will send a v2 to fix this, thanks!
> 
> 
>> On Fri, Jan 31, 2014 at 4:42 PM, Wang Shilong  
>> wrote:
>>> From: Wang Shilong 
>>> 
>>> Since we have introduced btrfs_previous_extent_item() to search previous
>>> extent item, just switch into it.
>>> 
>>> Signed-off-by: Wang Shilong 
>> 
>> Hi Shilong,
>> 
>> This patch is making btrfs/004 fail for me, consistently:

I was trying to reproduce this xfstests failure (though we already know what's 
wrong with my previous patch).
I did not actually hit the 004 failure, but I can reproduce a btrfs/030 failure 
consistently; I think you might be interested in this:

FSTYP         -- btrfs
PLATFORM      -- Linux/i686 wangsl 3.13.0-4-default+
MKFS_OPTIONS  -- /dev/sdb2
MOUNT_OPTIONS -- /dev/sdb2 /mnt/scratch

btrfs/030    [failed, exit status 1] - output mismatch (see /home/wangsl/tools/xfstests/results//btrfs/030.out.bad)
    --- tests/btrfs/030.out 2014-02-01 01:01:11.261999486 +0800
    +++ /home/wangsl/tools/xfstests/results//btrfs/030.out.bad  2014-02-05 23:56:31.740988010 +0800
    @@ -1 +1,3 @@
     QA output created by 030
    +failed: '/home/wangsl/tools/xfstests/src/fssum -r /tmp/tmp.30GWDU8xaU/2.fssum /mnt/scratch/mysnap2'
    +(see /home/wangsl/tools/xfstests/results//btrfs/030.full for details)
    ...
    (Run 'diff -u tests/btrfs/030.out /home/wangsl/tools/xfstests/results//btrfs/030.out.bad'  to see the entire diff)
Ran: btrfs/030
Failures: btrfs/030
Failed 1 of 1 tests

dmesg shows more information:

[  818.988731] WARNING: CPU: 0 PID: 29978 at fs/btrfs/send.c:5427 
btrfs_ioctl_send+0x34b/0xeb0 [btrfs]()
[  818.988733] Modules linked in: xt_tcpudp xt_pkttype xt_LOG xt_limit 
ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT 
iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns 
nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack 
nf_conntrack ip6table_filter ip6_tables x_tables fuse bnep snd_ens1371 coretemp 
crc32_pclmul gameport crc32c_intel snd_rawmidi aesni_intel snd_ac97_codec 
sr_mod cdrom ata_generic ac97_bus snd_pcm snd_seq ppdev ata_piix snd_timer 
snd_seq_device ablk_helper ahci btusb snd libahci cryptd bluetooth libata 
vmw_balloon lrw aes_i586 xts serio_raw gf128mul vmw_vmci parport_pc pcspkr 
soundcore mptctl snd_page_alloc parport pcnet32 i2c_piix4 shpchp joydev floppy 
mii ac button rfkill sg autofs4 btrfs raid6_pq xor linear hid_generic
[  818.988766]  usbhid hid uhci_hcd vmwgfx ehci_pci ehci_hcd processor 
thermal_sys usbcore hwmon ttm usb_common mptspi mptscsih mptbase 
scsi_transport_spi drm i2c_core scsi_dh_rdac scsi_dh_hp_sw scsi_dh_emc 
scsi_dh_alua scsi_dh dm_snapshot dm_mirror dm_region_hash dm_log dm_mod
[  818.988786] CPU: 0 PID: 29978 Comm: btrfs Tainted: GW
3.13.0-4-default+ #44
[  818.988787] Hardware name: VMware, Inc. VMware Virtual Platform/440BX 
Desktop Reference Platform, BIOS 6.00 07/02/2012
[  818.988789]    c9561cf8 c06a8276  c9561d28 c02432f9 
c080cf24
[  818.988793]   751a fa1b7b6e 1533 fa1a647b fa1a647b dade1140 
dade1138
[  818.988797]  dade1000 c9561d38 c024338d 0009  c9561df4 fa1a647b 
dade1000
[  818.988800] Call Trace:
[  818.988858]  [] dump_stack+0x41/0x52
[  818.988941]  [] warn_slowpath_common+0x79/0x90
[  818.988962]  [] ? btrfs_ioctl_send+0x34b/0xeb0 [btrfs]
[  818.988975]  [] ? btrfs_ioctl_send+0x34b/0xeb0 [btrfs]
[  818.988977]  [] warn_slowpath_null+0x1d/0x20
[  818.988990]  [] btrfs_ioctl_send+0x34b/0xeb0 [btrfs]
[  818.989004]  [] ? update_ioctl_balance_args+0x2c0/0x2c0 [btrfs]
[  818.989017]  [] btrfs_ioctl+0x2a8/0x33f0 [btrfs]
[  818.989021]  [] ? update_cfs_rq_blocked_load+0x116/0x170
[  818.989023]  [] ? __enqueue_entity+0x65/0x70
[  818.989025

Re: [PATCH 1/2] Btrfs: switch to btrfs_previous_extent_item()

2014-02-05 Thread Filipe David Manana
On Wed, Feb 5, 2014 at 4:20 PM, Josef Bacik  wrote:
>
> On 02/05/2014 11:14 AM, Wang Shilong wrote:
>>
>> Hi Filipe,
>>
>>> So i knew what was wrong here, we need found_key while
>>> btrfs_previous_extent_item() did set
>>> it properly..^_^
>>>
>>> I will send a v2 to fix this, thanks!
>>>
>>>
>>>> On Fri, Jan 31, 2014 at 4:42 PM, Wang Shilong
>>>>  wrote:
>>>>>
>>>>> From: Wang Shilong 
>>>>>
>>>>> Since we have introduced btrfs_previous_extent_item() to search
>>>>> previous
>>>>> extent item, just switch into it.
>>>>>
>>>>> Signed-off-by: Wang Shilong 
>>>>
>>>> Hi Shilong,
>>>>
>>>> This patch is making btrfs/004 fail for me, consistently:
>>
>> I was trying to reproduce this xfstest failure(though we have known what's
>> wrong with my previous patch).
>> I did not really hit 004 failure, but i can reproduce btrfs/030 fail
>> consistently, i think you might be interested in this:
>>
> Do you guys have some CONFIG_ONLY_BREAK_FOR_ME=y set or something? I can't
> reproduce this failure either.  Will you send the updated
> btrfs_previous_extent_item patch and then see if you can bisect down why 030
> is failing for you?  Thanks,

I can't reproduce Shilong's 030 failure on latest btrfs-next, neither
with nor without the previous_extent_item patch. And quite surprised,
since the test is very deterministic (hardcoded fs structure).

>
> Josef



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


Re: [PATCH 1/2] Btrfs: switch to btrfs_previous_extent_item()

2014-02-05 Thread Josef Bacik


On 02/05/2014 11:14 AM, Wang Shilong wrote:
> Hi Filipe,
>
>> So i knew what was wrong here, we need found_key while
>> btrfs_previous_extent_item() did set
>> it properly..^_^
>>
>> I will send a v2 to fix this, thanks!
>>
>>> On Fri, Jan 31, 2014 at 4:42 PM, Wang Shilong  wrote:
>>>> From: Wang Shilong 
>>>>
>>>> Since we have introduced btrfs_previous_extent_item() to search previous
>>>> extent item, just switch into it.
>>>>
>>>> Signed-off-by: Wang Shilong 
>>>
>>> Hi Shilong,
>>>
>>> This patch is making btrfs/004 fail for me, consistently:
>
> I was trying to reproduce this xfstest failure(though we have known what's
> wrong with my previous patch).
> I did not really hit 004 failure, but i can reproduce btrfs/030 fail
> consistently, i think you might be interested in this:

Do you guys have some CONFIG_ONLY_BREAK_FOR_ME=y set or something? I 
can't reproduce this failure either.  Will you send the updated 
btrfs_previous_extent_item patch and then see if you can bisect down why 
030 is failing for you?  Thanks,


Josef


Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Josef Bacik


On 02/05/2014 03:14 AM, Johannes Hirte wrote:

On Tue, 4 Feb 2014 09:12:54 -0500
Josef Bacik  wrote:


Hrm I was hoping that was going to be more helpful.  Can you get perf
record -ag and then perf report while it's at full cpu and get the
first 3 or 4 things with their traces?

Here it comes:

# 
# captured on: Wed Feb  5 00:11:41 2014
# 
#
no symbols found in /usr/sbin/acpid, maybe install a debug package?
unexpected end of event stream
# Samples: 168K of event 'cycles'
# Event count (approx.): 126847081763
#
# Overhead  Command          Shared Object      Symbol
# ........  ...............  .................  ..............................
#
    18.48%  btrfs-freespace  [kernel.kallsyms]  [k] state_store
            |
            --- state_store

    10.25%  btrfs-freespace  [kernel.kallsyms]  [k] sys_sched_rr_get_interval
            |
            --- sys_sched_rr_get_interval

     9.02%  btrfs-freespace  [kernel.kallsyms]  [k] rt_mutex_slowunlock
            |
            --- rt_mutex_slowunlock

     8.76%  btrfs-freespace  [kernel.kallsyms]  [k] btrfs_submit_compressed_write
            |
            --- btrfs_submit_compressed_write

     6.63%  btrfs-freespace  [kernel.kallsyms]  [k] sched_show_task
            |
            --- sched_show_task

     5.19%  btrfs-freespace  [kernel.kallsyms]  [k] find_free_extent
            |
            --- find_free_extent

     5.15%  btrfs-freespace  [kernel.kallsyms]  [k] trace_print_graph_duration
            |
            --- trace_print_graph_duration


I'm going to try and
reproduce today, is there anything special about your fs?
Compression, large blocksizes, skinny metadata?  Thanks,

Filesystem was created with -l 32768 -n 32768 and skinny metadata enabled.

OK, none of those make sense, which makes me think it may be the ktime 
bits. Instead of un-applying the whole patch, could you just comment out 
the parts


    ktime_t start = ktime_get();

and

    if (actual_count > 0) {
        u64 runtime = ktime_to_ns(ktime_sub(ktime_get(), start));
        u64 avg;

        /*
         * We weigh the current average higher than our current runtime
         * to avoid large swings in the average.
         */
        spin_lock(&delayed_refs->lock);
        avg = fs_info->avg_delayed_ref_runtime * 3 + runtime;
        avg = div64_u64(avg, 4);
        fs_info->avg_delayed_ref_runtime = avg;
        spin_unlock(&delayed_refs->lock);
    }

in __btrfs_run_delayed_refs and see if that makes the problem stop? If 
it does will you try chris's for-linus branch to see if it still 
reproduces there?  Maybe some patch changed ktime_get() in -rc1 that is 
causing issues and we're just now exposing it.  Thanks,
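
The averaging in the snippet quoted above can be tried out in user space. This is only a minimal sketch: `update_avg` is a hypothetical name, plain integer division stands in for `div64_u64()`, and the locking is left out. It shows how the 3:1 weighting damps a single slow run.

```c
#include <stdint.h>

/*
 * Fold one new runtime sample into the running average.
 * The previous average counts three times, the new sample once,
 * so a single outlier moves the average by only a quarter of the gap.
 * (Hypothetical helper; the kernel code does this inline under a spinlock.)
 */
uint64_t update_avg(uint64_t avg_ns, uint64_t runtime_ns)
{
    return (avg_ns * 3 + runtime_ns) / 4;
}
```

Starting from an average of 1000 ns, one 9000 ns sample moves the average to 3000 ns rather than to 9000 ns, which is the "avoid large swings" behaviour the comment describes.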


Josef


[PATCH v2] btrfs: send: lower memory requirements in common case

2014-02-05 Thread David Sterba
The fs_path structure uses an inline buffer and falls back to a chain of
allocations, but vmalloc is not necessary because PATH_MAX fits into
PAGE_SIZE.

The size of fs_path has been reduced to 256 bytes from PAGE_SIZE,
usually 4k. Experimental measurements show that most paths on a single
filesystem do not exceed 200 bytes, and these get stored into the inline
buffer directly, which is now 230 bytes. Longer paths are kmalloced when
needed.

Signed-off-by: David Sterba 
---
v2:
- intel build test reports that krealloc should not reuse the buffer for return
  value, though it's not a problem in our case, a failed allocation leads to
  immediate return, let's use a temporary variable to keep the check happy
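
For readers who want to see the idea outside the kernel, here is a rough user-space sketch of the same pattern: a small inline buffer for the common case, a heap allocation once it no longer suffices, and a temporary variable for the reallocation result. All names (`struct path_buf`, `path_buf_ensure`, `INLINE_SIZE`) are made up for illustration, `malloc`/`realloc` stand in for `kmalloc`/`krealloc`, and the sketch copies the inline contents on the first switch to the heap, which is an assumption about the intended behaviour rather than a statement about the patch.

```c
#include <stdlib.h>
#include <string.h>

#define INLINE_SIZE 230 /* illustrative, mirrors the size quoted above */

struct path_buf {
    char *buf;                    /* inline_buf or a heap allocation */
    size_t buf_len;               /* usable bytes behind buf */
    char inline_buf[INLINE_SIZE]; /* covers most paths without allocating */
};

void path_buf_init(struct path_buf *p)
{
    p->buf = p->inline_buf;
    p->buf_len = sizeof(p->inline_buf);
    p->inline_buf[0] = '\0';
}

/* Make sure at least len bytes are available; 0 on success, -1 on failure. */
int path_buf_ensure(struct path_buf *p, size_t len)
{
    char *tmp;

    if (p->buf_len >= len)
        return 0; /* fast path: current buffer is big enough */

    if (p->buf == p->inline_buf) {
        /* First time the inline buffer does not suffice. */
        tmp = malloc(len);
        if (!tmp)
            return -1;
        memcpy(tmp, p->inline_buf, sizeof(p->inline_buf));
    } else {
        /* Keep the old pointer valid until realloc has succeeded. */
        tmp = realloc(p->buf, len);
        if (!tmp)
            return -1;
    }
    p->buf = tmp;
    p->buf_len = len;
    return 0;
}

void path_buf_free(struct path_buf *p)
{
    if (p->buf != p->inline_buf)
        free(p->buf);
    path_buf_init(p);
}
```

Assigning the `realloc` result to `tmp` first means a failed call never overwrites `p->buf`, so the caller can still free the old buffer.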

 fs/btrfs/send.c |  106 +++---
 1 files changed, 37 insertions(+), 69 deletions(-)

diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 1f09c74e1c1f..dd0b02adb1e6 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -57,7 +57,12 @@ struct fs_path {
unsigned short reversed:1;
char inline_buf[];
};
-   char pad[PAGE_SIZE];
+   /*
+* Average path length does not exceed 200 bytes, we'll have
+* better packing in the slab and higher chance to satisfy
+* a allocation later during send.
+*/
+   char pad[256];
};
 };
 #define FS_PATH_INLINE_SIZE \
@@ -262,12 +267,8 @@ static void fs_path_free(struct fs_path *p)
 {
if (!p)
return;
-   if (p->buf != p->inline_buf) {
-   if (is_vmalloc_addr(p->buf))
-   vfree(p->buf);
-   else
-   kfree(p->buf);
-   }
+   if (p->buf != p->inline_buf)
+   kfree(p->buf);
kfree(p);
 }
 
@@ -287,40 +288,31 @@ static int fs_path_ensure_buf(struct fs_path *p, int len)
if (p->buf_len >= len)
return 0;
 
-   path_len = p->end - p->start;
-   old_buf_len = p->buf_len;
-   len = PAGE_ALIGN(len);
-
+   /*
+* First time the inline_buf does not suffice
+*/
if (p->buf == p->inline_buf) {
-   tmp_buf = kmalloc(len, GFP_NOFS | __GFP_NOWARN);
-   if (!tmp_buf) {
-   tmp_buf = vmalloc(len);
-   if (!tmp_buf)
-   return -ENOMEM;
-   }
-   memcpy(tmp_buf, p->buf, p->buf_len);
-   p->buf = tmp_buf;
-   p->buf_len = len;
+   p->buf = kmalloc(len, GFP_NOFS);
+   if (!p->buf)
+   return -ENOMEM;
+   /*
+* The real size of the buffer is bigger, this will let the
+* fast path happen most of the time
+*/
+   p->buf_len = ksize(p->buf);
} else {
-   if (is_vmalloc_addr(p->buf)) {
-   tmp_buf = vmalloc(len);
-   if (!tmp_buf)
-   return -ENOMEM;
-   memcpy(tmp_buf, p->buf, p->buf_len);
-   vfree(p->buf);
-   } else {
-   tmp_buf = krealloc(p->buf, len, GFP_NOFS);
-   if (!tmp_buf) {
-   tmp_buf = vmalloc(len);
-   if (!tmp_buf)
-   return -ENOMEM;
-   memcpy(tmp_buf, p->buf, p->buf_len);
-   kfree(p->buf);
-   }
-   }
-   p->buf = tmp_buf;
-   p->buf_len = len;
+   char *tmp;
+
+   tmp = krealloc(p->buf, len, GFP_NOFS);
+   if (!tmp)
+   return -ENOMEM;
+   p->buf = tmp;
+   p->buf_len = ksize(p->buf);
}
+
+   path_len = p->end - p->start;
+   old_buf_len = p->buf_len;
+
if (p->reversed) {
tmp_buf = p->buf + old_buf_len - path_len - 1;
p->end = p->buf + p->buf_len - 1;
@@ -911,9 +903,7 @@ static int iterate_dir_item(struct btrfs_root *root, struct 
btrfs_path *path,
struct btrfs_dir_item *di;
struct btrfs_key di_key;
char *buf = NULL;
-   char *buf2 = NULL;
-   int buf_len;
-   int buf_virtual = 0;
+   const int buf_len = PATH_MAX;
u32 name_len;
u32 data_len;
u32 cur;
@@ -923,7 +913,6 @@ static int iterate_dir_item(struct btrfs_root *root, struct 
btrfs_path *path,
int num;
u8 type;
 
-   buf_len = PAGE_SIZE;
buf = kmalloc(buf_len, GFP_NOFS);
if (!buf) {
ret = -ENOMEM;
@@ -945,30 +934,12 @@ static int iterate_dir_item(struct btrfs_root *root, 
struct btrfs_path *path,
type = btrfs_dir_type(eb, di);
btrfs_d

Re: Booting with syslinux not possible

2014-02-05 Thread Alex
Duncan <1i5t5.duncan  cox.net> writes:

> 
> Alex posted on Tue, 04 Feb 2014 17:19:09 + as excerpted:
> 
> > I have quite an (overly) complicated setup.
> 
> I had to chuckle at that one.  Fits my setup to a "T", altho they're 
> different complications than yours.  I'll have to remember it the next 
> time I find a fitting context to use it! =:^)
> 


Pretty shiny things. ;-)
Kind regards.



[PATCH v4] btrfs: add simple debugfs interface

2014-02-05 Thread David Sterba
Help debugging by exporting various interesting information and
tunables without the need for extra mount options or ioctls.

Usage:
* declare your variable in sysfs.h, and include where you need it
* define the variable in sysfs.c and make it visible via
  debugfs_create_TYPE

Depends on CONFIG_DEBUG_FS.

Signed-off-by: David Sterba 
---
v4:
- the intel build test reported that the example variable is not declared and
should be static; actually it should be declared as the changelog suggests, so
fix the code to match that and declare it in sysfs.h

v3:
- fix typo in changelog

v2:
- added missing return to btrfs_init_debugfs
- updated error handling to btrfs_init_sysfs, the cleanup
  is done in btrfs_exit_sysfs
- removed #ifdef in btrfs_exit_sysfs,

 fs/btrfs/sysfs.c |   33 +++--
 fs/btrfs/sysfs.h |5 +
 2 files changed, 32 insertions(+), 6 deletions(-)

diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index 782374d8fd19..b725e4574448 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -24,6 +24,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "ctree.h"
 #include "disk-io.h"
@@ -593,6 +594,12 @@ static int add_device_membership(struct btrfs_fs_info 
*fs_info)
 /* /sys/fs/btrfs/ entry */
 static struct kset *btrfs_kset;
 
+/* /sys/kernel/debug/btrfs */
+static struct dentry *btrfs_debugfs_root_dentry;
+
+/* Debugging tunables and exported data */
+u64 btrfs_debugfs_test;
+
 int btrfs_sysfs_add_one(struct btrfs_fs_info *fs_info)
 {
int error;
@@ -636,27 +643,41 @@ failure:
return error;
 }
 
+static int btrfs_init_debugfs(void)
+{
+#ifdef CONFIG_DEBUG_FS
+   btrfs_debugfs_root_dentry = debugfs_create_dir("btrfs", NULL);
+   if (!btrfs_debugfs_root_dentry)
+   return -ENOMEM;
+
+   debugfs_create_u64("test", S_IRUGO | S_IWUGO, btrfs_debugfs_root_dentry,
+   &btrfs_debugfs_test);
+#endif
+   return 0;
+}
+
 int btrfs_init_sysfs(void)
 {
int ret;
+
btrfs_kset = kset_create_and_add("btrfs", NULL, fs_kobj);
if (!btrfs_kset)
return -ENOMEM;
 
-   init_feature_attrs();
+   ret = btrfs_init_debugfs();
+   if (ret)
+   return ret;
 
+   init_feature_attrs();
ret = sysfs_create_group(&btrfs_kset->kobj, &btrfs_feature_attr_group);
-   if (ret) {
-   kset_unregister(btrfs_kset);
-   return ret;
-   }
 
-   return 0;
+   return ret;
 }
 
 void btrfs_exit_sysfs(void)
 {
sysfs_remove_group(&btrfs_kset->kobj, &btrfs_feature_attr_group);
kset_unregister(btrfs_kset);
+   debugfs_remove_recursive(btrfs_debugfs_root_dentry);
 }
 
diff --git a/fs/btrfs/sysfs.h b/fs/btrfs/sysfs.h
index f3cea3710d44..9ab576318a84 100644
--- a/fs/btrfs/sysfs.h
+++ b/fs/btrfs/sysfs.h
@@ -1,6 +1,11 @@
 #ifndef _BTRFS_SYSFS_H_
 #define _BTRFS_SYSFS_H_
 
+/*
+ * Data exported through sysfs
+ */
+extern u64 btrfs_debugfs_test;
+
 enum btrfs_feature_set {
FEAT_COMPAT,
FEAT_COMPAT_RO,
-- 
1.7.9



Re: hitting BUG_ON on troublesome FS

2014-02-05 Thread Josef Bacik


On 02/05/2014 01:34 AM, Remco Hosman - Yerf-it.com wrote:

How can i tell?

Label: data  uuid: a8626d67-4684-4b23-99b3-8d5fa8e7fd69
Total devices 5 FS bytes used 820.00KiB
devid2 size 1.82TiB used 1.00GiB path /dev/sdb2
devid3 size 1.82TiB used 1.00GiB path /dev/sdf2
devid5 size 2.73TiB used 3.00GiB path /dev/sdd2
devid10 size 2.73TiB used 2.03GiB path /dev/sde2
devid11 size 3.64TiB used 1.03GiB path /dev/sdc1

Data, RAID10: total=2.00GiB, used=768.00KiB
Data, RAID1: total=1.00GiB, used=12.00KiB
System, RAID1: total=32.00MiB, used=4.00KiB
Metadata, RAID1: total=1.00GiB, used=36.00KiB

I made an image with `btrfs-image`; with -c 9 the file size is 
7k, so it is easy enough to mail if it would be of any use.



Yup, that would be perfect.  Thanks,

Josef


Re: [PATCH] Btrfs: convert to add transaction protection for btrfs send

2014-02-05 Thread Josef Bacik

On 02/05/2014 03:59 AM, Wang Shilong wrote:
> Hi Josef,
>
> [..SNIP..]
>> On 01/31/2014 11:37 AM, Wang Shilong wrote:
>>> Hello Josef,
>>>
  
>> 2) Remove the per-root rwsem for the commit root and just make one big
>> rwsem that covers all commit root switching. This way everybody who
>> wants to search with the commit root can just use this semaphore and all
>> be safe. It will mean that the inode cache stuff may block longer than
>> normal but I don't think that's too big of a deal.
>>
> I am ok with this fix,  I wanted to talk something about protecting searching 
> commit file root, this is really a
> problem especially for full send.
>
> I have some ideas about this issue:
>
> #1.don't use commit file root to search.
> This will become a nightmare when we are doing full send which will iterate 
> the whole file tree,
> at the same time, we snapshot send root, snapshots will be blocked until send 
> finished.
>
> #2. don't allow snapshot if we are sending root.
> This may be a little confusing, snapshots are readonly, but users can not 
> snapshot it.
I think this is the best bet. The fact is we don't want to hold this
commit_root_sem for the entire duration of the send, it would block
people trying to commit the transaction. We could check for contention
and drop the sem and re-search down to where we were but I think that
would be prone to errors. If we just check to see if the snapshot is
being sent and just return -EBUSY when we try to create a snapshot I
think that's perfectly reasonable.
> #3. after one iteration, we do check send_root's generation, and make sure it 
> doesn't
> change, if it changed, then we restart send again.
>
> I don't know which approach is better,and also snapshot-aware defragment will 
> change
> read-only snapshot?
>
> Did you have any better ideas about this issue? Share it with me here.^_^
>
Snapshot-aware defrag will definitely screw us here. I think we need to
do the same thing as above, which is to simply skip the snapshot-aware
defrag if we are currently using that root for send. Does this sound
reasonable to you? Thanks,
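
A rough sketch of the two checks discussed above, in plain C. Everything here is hypothetical (`struct root`, `send_in_progress`, the function names) and there is no locking; it only illustrates refusing a snapshot with -EBUSY and skipping defrag while a root is being sent.

```c
#include <errno.h>

/* Hypothetical, stripped-down root: only the state the sketch needs. */
struct root {
    int send_in_progress; /* number of sends currently reading this root */
};

void send_begin(struct root *r)
{
    r->send_in_progress++;
}

void send_end(struct root *r)
{
    r->send_in_progress--;
}

/* Refuse to snapshot a root that is currently being sent. */
int create_snapshot(struct root *r)
{
    if (r->send_in_progress > 0)
        return -EBUSY;
    /* ... the actual snapshot work would go here ... */
    return 0;
}

/* Returns 1 if defrag ran, 0 if it was skipped because of a running send. */
int snapshot_aware_defrag(struct root *r)
{
    if (r->send_in_progress > 0)
        return 0;
    /* ... the actual defrag work would go here ... */
    return 1;
}
```

A real implementation would need the counter to be protected against concurrent updates; the sketch leaves that out on purpose.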

Josef


Frequent error messages (block group X has wrong amount of free space)

2014-02-05 Thread Elifarley Callado Coelho Cruz
I've recently installed ArchLinux on a Lenovo Ideapad laptop.
Right on the first day of use, I received error messages at boot time
like these:

BTRFS error (device dm-0): block group 11840520192 has wrong amount of
free space
BTRFS error (device dm-0): failed to load free space cache for block
group 11840520192


Then I added the clear_cache option to see if the problem would go
away. After 3 boots, it did go away. But after some more boots, it's
back!

I believe there have been no unclean shutdowns or lock-ups so far.

What could be causing this error during normal operation?

I have:
Kernel 3.12.9
btrfs-progs 3.12

/etc/fstab:

# /dev/mapper/luks-pool
UUID=c56031b3-af27-41f6-8103-ede1f0743ac0   /       btrfs   subvol=arch,compress=zlib,nospace_cache,rw,noatime,clear_cache   0 0

# /dev/mapper/luks-pool
UUID=c56031b3-af27-41f6-8103-ede1f0743ac0   /var    btrfs   subvol=arch-var,compress=lzo,nospace_cache,commit=180,rw,noatime,clear_cache   0 0

# /dev/mapper/luks-pool
UUID=c56031b3-af27-41f6-8103-ede1f0743ac0   /home   btrfs   subvol=home,compress=zlib,nospace_cache,rw,noatime,clear_cache   0 0


Thanks!


Re: [PATCH 1/2] Btrfs: switch to btrfs_previous_extent_item()

2014-02-05 Thread Wang Shilong
So i knew what was wrong here, we need found_key while 
btrfs_previous_extent_item() did set
it properly..^_^

I will send a v2 to fix this, thanks!


> On Fri, Jan 31, 2014 at 4:42 PM, Wang Shilong  
> wrote:
>> From: Wang Shilong 
>> 
>> Since we have introduced btrfs_previous_extent_item() to search previous
>> extent item, just switch into it.
>> 
>> Signed-off-by: Wang Shilong 
> 
> Hi Shilong,
> 
> This patch is making btrfs/004 fail for me, consistently:
> 
> btrfs/004 99s ... [failed, exit status 1] - output mismatch (see
> /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad)
>--- tests/btrfs/004.out 2013-11-26 18:25:29.26714 +
>+++ /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad
> 2014-02-05 12:20:26.053570545 +
>@@ -1,3 +1,100 @@
> QA output created by 004
> *** test backref walking
>-*** done
>+unexpected output from
>+ /home/fdmanana/git/hub/btrfs-progs/btrfs inspect-internal
> logical-resolve -P 137719808 /home/fdmanana/btrfs-tests/scratch_1
>+expected inum: 278, expected address: 53248, file:
> /home/fdmanana/btrfs-tests/scratch_1/snap1/p0/d3/da/d174/d1c/d3e/d4d/d16f/f132,
> got:
>+ioctl ret=-1, error: No such file or directory
>...
>(Run 'diff -u tests/btrfs/004.out
> /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad'  to see
> the entire diff)
> Ran: btrfs/004
> Failures: btrfs/004
> Failed 1 of 1 tests
> 
> See comment inline below as well.
> 
> Thanks
> 
>> ---
>> fs/btrfs/backref.c | 34 +++---
>> 1 file changed, 3 insertions(+), 31 deletions(-)
>> 
>> diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
>> index aded3ef..4f59f07 100644
>> --- a/fs/btrfs/backref.c
>> +++ b/fs/btrfs/backref.c
>> @@ -1333,37 +1333,9 @@ int extent_from_logical(struct btrfs_fs_info 
>> *fs_info, u64 logical,
>>if (ret < 0)
>>return ret;
>> 
>> -   while (1) {
>> -   u32 nritems;
>> -   if (path->slots[0] == 0) {
>> -   btrfs_set_path_blocking(path);
>> -   ret = btrfs_prev_leaf(fs_info->extent_root, path);
>> -   if (ret != 0) {
>> -   if (ret > 0) {
>> -   pr_debug("logical %llu is not within 
>> "
>> -"any extent\n", logical);
>> -   ret = -ENOENT;
>> -   }
>> -   return ret;
>> -   }
>> -   } else {
>> -   path->slots[0]--;
>> -   }
>> -   nritems = btrfs_header_nritems(path->nodes[0]);
>> -   if (nritems == 0) {
>> -   pr_debug("logical %llu is not within any extent\n",
>> -logical);
>> -   return -ENOENT;
>> -   }
>> -   if (path->slots[0] == nritems)
>> -   path->slots[0]--;
>> -
>> -   btrfs_item_key_to_cpu(path->nodes[0], found_key,
>> - path->slots[0]);
>> -   if (found_key->type == BTRFS_EXTENT_ITEM_KEY ||
>> -   found_key->type == BTRFS_METADATA_ITEM_KEY)
>> -   break;
>> -   }
>> +   ret = btrfs_previous_extent_item(fs_info->extent_root, path, 0);
>> +   if (ret)
>> +   return ret;
> 
> This isn't equivalent to what we had before. We're now returning 1
> when we previously returned -ENOENT. However this isn't what's making
> the test fail.
> 
>> 
>>if (found_key->type == BTRFS_METADATA_ITEM_KEY)
>>size = fs_info->extent_root->leafsize;
>> --
>> 1.8.4
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> 
> -- 
> Filipe David Manana,
> 
> "Reasonable men adapt themselves to the world.
> Unreasonable men adapt the world to themselves.
> That's why all progress depends on unreasonable men."



Re: [PATCH 1/2] Btrfs: switch to btrfs_previous_extent_item()

2014-02-05 Thread Wang Shilong

Hi Filipe,

> On Fri, Jan 31, 2014 at 4:42 PM, Wang Shilong  
> wrote:
>> From: Wang Shilong 
>> 
>> Since we have introduced btrfs_previous_extent_item() to search previous
>> extent item, just switch into it.
>> 
>> Signed-off-by: Wang Shilong 
> 
> Hi Shilong,
> 
> This patch is making btrfs/004 fail for me, consistently:
> 
> btrfs/004 99s ... [failed, exit status 1] - output mismatch (see
> /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad)
>--- tests/btrfs/004.out 2013-11-26 18:25:29.26714 +
>+++ /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad
> 2014-02-05 12:20:26.053570545 +
>@@ -1,3 +1,100 @@
> QA output created by 004
> *** test backref walking
>-*** done
>+unexpected output from
>+ /home/fdmanana/git/hub/btrfs-progs/btrfs inspect-internal
> logical-resolve -P 137719808 /home/fdmanana/btrfs-tests/scratch_1
>+expected inum: 278, expected address: 53248, file:
> /home/fdmanana/btrfs-tests/scratch_1/snap1/p0/d3/da/d174/d1c/d3e/d4d/d16f/f132,
> got:
>+ioctl ret=-1, error: No such file or directory
>...
>(Run 'diff -u tests/btrfs/004.out
> /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad'  to see
> the entire diff)
> Ran: btrfs/004
> Failures: btrfs/004
> Failed 1 of 1 tests
> 
> See comment inline below as well.

I could not reproduce this problem in my virtual box. How did you
trigger it (in a loop, with particular options, etc.)?

Also, the strange thing is that this patch was not meant to change the
previous logic; the function btrfs_previous_extent_item() should behave the
same as Josef's previous code did.

> 
> Thanks
> 
>> ---
>> fs/btrfs/backref.c | 34 +++---
>> 1 file changed, 3 insertions(+), 31 deletions(-)
>> 
>> diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
>> index aded3ef..4f59f07 100644
>> --- a/fs/btrfs/backref.c
>> +++ b/fs/btrfs/backref.c
>> @@ -1333,37 +1333,9 @@ int extent_from_logical(struct btrfs_fs_info 
>> *fs_info, u64 logical,
>>if (ret < 0)
>>return ret;
>> 
>> -   while (1) {
>> -   u32 nritems;
>> -   if (path->slots[0] == 0) {
>> -   btrfs_set_path_blocking(path);
>> -   ret = btrfs_prev_leaf(fs_info->extent_root, path);
>> -   if (ret != 0) {
>> -   if (ret > 0) {
>> -   pr_debug("logical %llu is not within 
>> "
>> -"any extent\n", logical);
>> -   ret = -ENOENT;
>> -   }
>> -   return ret;
>> -   }
>> -   } else {
>> -   path->slots[0]--;
>> -   }
>> -   nritems = btrfs_header_nritems(path->nodes[0]);
>> -   if (nritems == 0) {
>> -   pr_debug("logical %llu is not within any extent\n",
>> -logical);
>> -   return -ENOENT;
>> -   }
>> -   if (path->slots[0] == nritems)
>> -   path->slots[0]--;
>> -
>> -   btrfs_item_key_to_cpu(path->nodes[0], found_key,
>> - path->slots[0]);
>> -   if (found_key->type == BTRFS_EXTENT_ITEM_KEY ||
>> -   found_key->type == BTRFS_METADATA_ITEM_KEY)
>> -   break;
>> -   }
>> +   ret = btrfs_previous_extent_item(fs_info->extent_root, path, 0);
>> +   if (ret)
>> +   return ret;
> 
> This isn't equivalent to what we had before. We're now returning 1
> when we previously returned -ENOENT. However this isn't what's making
> the test fail.

Yeah, Filipe, thanks for pointing this out.^_^
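
The return-value difference can be shown with a small stand-alone sketch. Both functions are hypothetical stand-ins: `find_prev_item` just echoes a simulated result using the convention described above (negative on error, 0 when found, positive when nothing was found), and `lookup_extent` shows one way a caller could keep reporting -ENOENT.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a tree search helper.
 * Convention: < 0 on error, 0 when an item was found, > 0 when not found.
 */
int find_prev_item(int simulated_result)
{
    return simulated_result;
}

/* Callers of this wrapper only ever see 0 or a negative errno. */
int lookup_extent(int simulated_result)
{
    int ret = find_prev_item(simulated_result);

    if (ret > 0)
        ret = -ENOENT; /* do not let "not found" leak out as a positive value */
    return ret;
}
```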

Thanks,
Wang
> 
>> 
>>if (found_key->type == BTRFS_METADATA_ITEM_KEY)
>>size = fs_info->extent_root->leafsize;
>> --
>> 1.8.4
>> 
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> 
> -- 
> Filipe David Manana,
> 
> "Reasonable men adapt themselves to the world.
> Unreasonable men adapt the world to themselves.
> That's why all progress depends on unreasonable men."



Re: [PATCH] btrfs: looping across fs_devices isn't necessary

2014-02-05 Thread Anand Jain



 This patch was incomplete. Kindly ignore.

Thanks, Anand


On 02/05/14 08:57 PM, Anand Jain wrote:

btrfs_show_devname() is trying to know dev name with
lowest devid for a given FSID, so looping across the
FSID isn't necessary

Signed-off-by: Anand Jain 
---
  fs/btrfs/super.c |   18 +++---
  1 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 378157c..6ed76d8 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1885,22 +1885,18 @@ static int btrfs_unfreeze(struct super_block *sb)
  static int btrfs_show_devname(struct seq_file *m, struct dentry *root)
  {
struct btrfs_fs_info *fs_info = btrfs_sb(root->d_sb);
-   struct btrfs_fs_devices *cur_devices;
struct btrfs_device *dev, *first_dev = NULL;
struct list_head *head;
struct rcu_string *name;

mutex_lock(&fs_info->fs_devices->device_list_mutex);
-   cur_devices = fs_info->fs_devices;
-   while (cur_devices) {
-   head = &cur_devices->devices;
-   list_for_each_entry(dev, head, dev_list) {
-   if (dev->missing)
-   continue;
-   if (!first_dev || dev->devid < first_dev->devid)
-   first_dev = dev;
-   }
-   cur_devices = cur_devices->seed;
+
+   head = &fs_info->fs_devices->devices;
+   list_for_each_entry(dev, head, dev_list) {
+   if (dev->missing)
+   continue;
+   if (!first_dev || dev->devid < first_dev->devid)
+   first_dev = dev;
}

if (first_dev) {




[PATCH] btrfs-progs: fix typo in reported error

2014-02-05 Thread Anand Jain
Signed-off-by: Anand Jain 
---
 utils.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/utils.c b/utils.c
index a045ffd..1e72b77 100644
--- a/utils.c
+++ b/utils.c
@@ -1340,7 +1340,7 @@ static int set_label_mounted(const char *mount_path, const char *label)
 
fd = open(mount_path, O_RDONLY | O_NOATIME);
if (fd < 0) {
-   fprintf(stderr, "ERROR: unable access to '%s'\n", mount_path);
+   fprintf(stderr, "ERROR: unable to access '%s'\n", mount_path);
return -1;
}
 
@@ -1397,7 +1397,7 @@ int get_label_mounted(const char *mount_path, char *labelp)
 
fd = open(mount_path, O_RDONLY | O_NOATIME);
if (fd < 0) {
-   fprintf(stderr, "ERROR: unable access to '%s'\n", mount_path);
+   fprintf(stderr, "ERROR: unable to access '%s'\n", mount_path);
return -1;
}
 
-- 
1.7.1



[PATCH] btrfs: looping across fs_devices isn't necessary

2014-02-05 Thread Anand Jain
btrfs_show_devname() looks for the name of the device with the
lowest devid for a given FSID, so looping across fs_devices
isn't necessary

Signed-off-by: Anand Jain 
---
 fs/btrfs/super.c |   18 +++---
 1 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 378157c..6ed76d8 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -1885,22 +1885,18 @@ static int btrfs_unfreeze(struct super_block *sb)
 static int btrfs_show_devname(struct seq_file *m, struct dentry *root)
 {
struct btrfs_fs_info *fs_info = btrfs_sb(root->d_sb);
-   struct btrfs_fs_devices *cur_devices;
struct btrfs_device *dev, *first_dev = NULL;
struct list_head *head;
struct rcu_string *name;
 
mutex_lock(&fs_info->fs_devices->device_list_mutex);
-   cur_devices = fs_info->fs_devices;
-   while (cur_devices) {
-   head = &cur_devices->devices;
-   list_for_each_entry(dev, head, dev_list) {
-   if (dev->missing)
-   continue;
-   if (!first_dev || dev->devid < first_dev->devid)
-   first_dev = dev;
-   }
-   cur_devices = cur_devices->seed;
+
+   head = &fs_info->fs_devices->devices;
+   list_for_each_entry(dev, head, dev_list) {
+   if (dev->missing)
+   continue;
+   if (!first_dev || dev->devid < first_dev->devid)
+   first_dev = dev;
}
 
if (first_dev) {
-- 
1.7.1
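
For readers following the diff, here is a minimal userspace sketch of the selection logic before and after this change. The `struct dev` and `struct dev_set` types and all function names are hypothetical stand-ins, not the kernel's `btrfs_device` / `btrfs_fs_devices`; only the rule itself (skip missing devices, keep the lowest devid) is taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct dev {
	uint64_t devid;
	int missing;          /* non-zero if the device is absent */
	const char *name;
	struct dev *next;     /* next device in the same list */
};

struct dev_set {
	struct dev *devices;  /* head of this set's device list */
	struct dev_set *seed; /* next set in the seed chain, or NULL */
};

/* Lowest-devid present device in one list; `best` carries the running minimum. */
static struct dev *lowest_in_list(struct dev *head, struct dev *best)
{
	for (struct dev *d = head; d != NULL; d = d->next) {
		if (d->missing)
			continue;
		if (best == NULL || d->devid < best->devid)
			best = d;
	}
	return best;
}

/* Behaviour before the patch: walk the whole seed chain. */
static struct dev *lowest_with_seeds(struct dev_set *set)
{
	struct dev *best = NULL;

	for (; set != NULL; set = set->seed)
		best = lowest_in_list(set->devices, best);
	return best;
}

/* Behaviour after the patch: only the top-level list is examined. */
static struct dev *lowest_top_only(struct dev_set *set)
{
	return lowest_in_list(set->devices, NULL);
}
```

In this sketch the two functions return different devices only when a set further down the seed chain holds a present device with a lower devid than anything in the top-level list.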



Re: [PATCH 1/2] Btrfs: switch to btrfs_previous_extent_item()

2014-02-05 Thread Filipe David Manana
On Fri, Jan 31, 2014 at 4:42 PM, Wang Shilong  wrote:
> From: Wang Shilong 
>
> Since we have introduced btrfs_previous_extent_item() to search previous
> extent item, just switch into it.
>
> Signed-off-by: Wang Shilong 

Hi Shilong,

This patch is making btrfs/004 fail for me, consistently:

btrfs/004 99s ... [failed, exit status 1] - output mismatch (see
/home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad)
--- tests/btrfs/004.out 2013-11-26 18:25:29.26714 +
+++ /home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad
2014-02-05 12:20:26.053570545 +
@@ -1,3 +1,100 @@
 QA output created by 004
 *** test backref walking
-*** done
+unexpected output from
+ /home/fdmanana/git/hub/btrfs-progs/btrfs inspect-internal
logical-resolve -P 137719808 /home/fdmanana/btrfs-tests/scratch_1
+expected inum: 278, expected address: 53248, file:
/home/fdmanana/btrfs-tests/scratch_1/snap1/p0/d3/da/d174/d1c/d3e/d4d/d16f/f132,
got:
+ioctl ret=-1, error: No such file or directory
...
(Run 'diff -u tests/btrfs/004.out
/home/fdmanana/git/hub/xfstests_2/results//btrfs/004.out.bad'  to see
the entire diff)
Ran: btrfs/004
Failures: btrfs/004
Failed 1 of 1 tests

See comment inline below as well.

Thanks

> ---
>  fs/btrfs/backref.c | 34 +++---
>  1 file changed, 3 insertions(+), 31 deletions(-)
>
> diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
> index aded3ef..4f59f07 100644
> --- a/fs/btrfs/backref.c
> +++ b/fs/btrfs/backref.c
> @@ -1333,37 +1333,9 @@ int extent_from_logical(struct btrfs_fs_info *fs_info, u64 logical,
> if (ret < 0)
> return ret;
>
> -   while (1) {
> -   u32 nritems;
> -   if (path->slots[0] == 0) {
> -   btrfs_set_path_blocking(path);
> -   ret = btrfs_prev_leaf(fs_info->extent_root, path);
> -   if (ret != 0) {
> -   if (ret > 0) {
> -   pr_debug("logical %llu is not within "
> -"any extent\n", logical);
> -   ret = -ENOENT;
> -   }
> -   return ret;
> -   }
> -   } else {
> -   path->slots[0]--;
> -   }
> -   nritems = btrfs_header_nritems(path->nodes[0]);
> -   if (nritems == 0) {
> -   pr_debug("logical %llu is not within any extent\n",
> -logical);
> -   return -ENOENT;
> -   }
> -   if (path->slots[0] == nritems)
> -   path->slots[0]--;
> -
> -   btrfs_item_key_to_cpu(path->nodes[0], found_key,
> - path->slots[0]);
> -   if (found_key->type == BTRFS_EXTENT_ITEM_KEY ||
> -   found_key->type == BTRFS_METADATA_ITEM_KEY)
> -   break;
> -   }
> +   ret = btrfs_previous_extent_item(fs_info->extent_root, path, 0);
> +   if (ret)
> +   return ret;

This isn't equivalent to what we had before. We're now returning 1
when we previously returned -ENOENT. However this isn't what's making
the test fail.
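
A minimal sketch of keeping the old contract, assuming the lookup helper follows the convention 0 = found, > 0 = not found, < 0 = error. `lookup_stub()` and `lookup_extent_or_enoent()` are hypothetical names used only for illustration; this is not the actual fix.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for a lookup helper with the convention assumed
 * here: 0 = found, > 0 = not found, < 0 = error.
 */
static int lookup_stub(int simulated)
{
	return simulated;
}

/* Map "not found" (> 0) to -ENOENT so callers see 0 or a negative errno only. */
static int lookup_extent_or_enoent(int simulated)
{
	int ret = lookup_stub(simulated);

	if (ret > 0)
		ret = -ENOENT;
	return ret;
}
```

With this shape, a caller that only tests `ret < 0` for failure keeps working when the helper reports "not found" as a positive value.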

>
> if (found_key->type == BTRFS_METADATA_ITEM_KEY)
> size = fs_info->extent_root->leafsize;
> --
> 1.8.4
>



-- 
Filipe David Manana,

"Reasonable men adapt themselves to the world.
 Unreasonable men adapt the world to themselves.
 That's why all progress depends on unreasonable men."


[PATCH 1/2] btrfs: add small program for clone testing

2014-02-05 Thread David Disseldorp
The cloner program is capable of cloning files using the BTRFS_IOC_CLONE
and BTRFS_IOC_CLONE_RANGE ioctls.

Signed-off-by: David Disseldorp 
---
 src/Makefile |   2 +-
 src/cloner.c | 168 +++
 2 files changed, 169 insertions(+), 1 deletion(-)
 create mode 100644 src/cloner.c

diff --git a/src/Makefile b/src/Makefile
index 84c8297..6509f2d 100644
--- a/src/Makefile
+++ b/src/Makefile
@@ -18,7 +18,7 @@ LINUX_TARGETS = xfsctl bstat t_mtab getdevicesize preallo_rw_pattern_reader \
locktest unwritten_mmap bulkstat_unlink_test t_stripealign \
bulkstat_unlink_test_modified t_dir_offset t_futimens t_immutable \
stale_handle pwrite_mmap_blocked t_dir_offset2 seek_sanity_test \
-   seek_copy_test t_readdir_1 t_readdir_2 fsync-tester nsexec
+   seek_copy_test t_readdir_1 t_readdir_2 fsync-tester nsexec cloner
 
 SUBDIRS =
 
diff --git a/src/cloner.c b/src/cloner.c
new file mode 100644
index 000..59defbb
--- /dev/null
+++ b/src/cloner.c
@@ -0,0 +1,168 @@
+/*
+ *  Tiny program to perform file (range) clones using raw Btrfs ioctls.
+ *  It should only be needed until btrfs-progs has an xfs_io equivalent.
+ *
+ *  Copyright (C) 2014 SUSE Linux Products GmbH. All Rights Reserved.
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; either version 2 of the License, or
+ *  (at your option) any later version.
+ *
+ *  This program is distributed in the hope that it will be useful,
+ *  but WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ *  GNU General Public License for more details.
+ *
+ *  You should have received a copy of the GNU General Public License
+ *  along with this program; if not, write to the Free Software
+ *  Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA  02111-1307, USA.
+ */
+
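+#include <sys/types.h>
+#include <sys/stat.h>
+#include <sys/ioctl.h>
+#include <stdint.h>
+#include <stdbool.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+#include <errno.h>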
+
+struct btrfs_ioctl_clone_range_args {
+   int64_t src_fd;
+   uint64_t src_offset;
+   uint64_t src_length;
+   uint64_t dest_offset;
+};
+
+#define BTRFS_IOCTL_MAGIC 0x94
+#define BTRFS_IOC_CLONE   _IOW(BTRFS_IOCTL_MAGIC, 9, int)
+#define BTRFS_IOC_CLONE_RANGE _IOW(BTRFS_IOCTL_MAGIC, 13, \
+  struct btrfs_ioctl_clone_range_args)
+
+static void
+usage(char *name, const char *msg)
+{
+   printf("Fatal: %s\n"
+  "Usage:\n"
+  "%s [options] <src_file> <dst_file>\n"
+  "\tA full file clone (reflink) is performed by default, "
+  "unless any of the following are specified:\n"
+  "\t-s <offset>:  source file offset (default = 0)\n"
+  "\t-d <offset>:  destination file offset (default = 0)\n"
+  "\t-l <length>:  length of clone (default = 0)\n",
+  msg, name);
+   _exit(1);
+}
+
+static int
+clone_file(int src_fd, int dst_fd)
+{
+   int ret = ioctl(dst_fd, BTRFS_IOC_CLONE, src_fd);
+   if (ret != 0)
+   ret = errno;
+   return ret;
+}
+
+static int
+clone_file_range(int src_fd, int dst_fd, uint64_t src_off, uint64_t dst_off,
+uint64_t len)
+{
+   struct btrfs_ioctl_clone_range_args cr_args;
+   int ret;
+
+   memset(&cr_args, 0, sizeof(cr_args));
+   cr_args.src_fd = src_fd;
+   cr_args.src_offset = src_off;
+   cr_args.src_length = len;
+   cr_args.dest_offset = dst_off;
+   ret = ioctl(dst_fd, BTRFS_IOC_CLONE_RANGE, &cr_args);
+   if (ret != 0)
+   ret = errno;
+   return ret;
+}
+
+int
+main(int argc, char **argv)
+{
+   bool full_file = true;
+   uint64_t src_off = 0;
+   uint64_t dst_off = 0;
+   uint64_t len = 0;
+   char *src_file;
+   int src_fd;
+   char *dst_file;
+   int dst_fd;
+   int ret;
+   int opt;
+
+   while ((opt = getopt(argc, argv, "s:d:l:")) != -1) {
+   switch (opt) {
+   case 's':
+   src_off = atoi(optarg);
+   full_file = false;
+   break;
+   case 'd':
+   dst_off = atoi(optarg);
+   full_file = false;
+   break;
+   case 'l':
+   len = atoi(optarg);
+   full_file = false;
+   break;
+   default:
+   usage(argv[0], "invalid argument");
+   }
+   }
+
+   /* should be exactly two args left */
+   if (optind != argc - 2)
+   usage(argv[0], "src_file and dst_file arguments are mandatory");
+
+   src_file = (char *)strdup(argv[optind++]);
+   if (src_file == NULL) {
+   ret = ENOMEM;
+   goto err_out;
+   }
+  
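
The option parsing in the patch above stores atoi() results in uint64_t variables. Since atoi() returns an int, offsets and lengths beyond INT_MAX cannot be expressed, and malformed input is not detected. Below is a minimal sketch of a stricter parser built on strtoull(); `parse_u64()` is a hypothetical helper and not part of the posted patch.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of the patch): parse a non-negative
 * decimal byte count into a uint64_t. Returns 0 on success, -1 on error.
 */
static int parse_u64(const char *s, uint64_t *out)
{
	char *end;
	unsigned long long v;

	/* strtoull() silently accepts a sign, so insist on a leading digit. */
	if (s == NULL || *s < '0' || *s > '9')
		return -1;

	errno = 0;
	v = strtoull(s, &end, 10);
	/* Reject overflow (ERANGE) and trailing garbage such as "12x". */
	if (errno != 0 || *end != '\0')
		return -1;

	*out = (uint64_t)v;
	return 0;
}
```

In the getopt loop this could replace each `atoi(optarg)` call, with a non-zero return routed to usage().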

[PATCH 2/2] btrfs/035: add new clone overwrite regression test

2014-02-05 Thread David Disseldorp
This test uses the newly added cloner binary to dispatch full file and
range specific clone (reflink) requests.

Signed-off-by: David Disseldorp 
---
 tests/btrfs/035 | 76 +
 tests/btrfs/035.out |  3 +++
 tests/btrfs/group   |  1 +
 3 files changed, 80 insertions(+)
 create mode 100755 tests/btrfs/035
 create mode 100644 tests/btrfs/035.out

diff --git a/tests/btrfs/035 b/tests/btrfs/035
new file mode 100755
index 000..03c2cd3
--- /dev/null
+++ b/tests/btrfs/035
@@ -0,0 +1,76 @@
+#!/bin/bash
+# FS QA Test No. btrfs/035
+#
+# Regression test for overwriting clones
+#
+#---
+# Copyright (C) 2014 SUSE Linux Products GmbH. All Rights Reserved.
+#
+# This program is free software; you can redistribute it and/or
+# modify it under the terms of the GNU General Public License as
+# published by the Free Software Foundation.
+#
+# This program is distributed in the hope that it would be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write the Free Software Foundation,
+# Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
+#
+#---
+#
+
+seq=`basename $0`
+seqres=$RESULT_DIR/$seq
+echo "QA output created by $seq"
+
+here=`pwd`
+tmp=/tmp/$$
+status=1   # failure is the default!
+
+_cleanup()
+{
+rm -f $tmp.*
+}
+
+trap "_cleanup ; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common/rc
+. ./common/filter
+
+# real QA test starts here
+_supported_fs btrfs
+_supported_os Linux
+_require_scratch
+
+_scratch_mkfs > /dev/null 2>&1
+_scratch_mount
+
+CLONER_PROG=$here/src/cloner
+
+src_str="aa"
+
+echo -n "$src_str" > $SCRATCH_MNT/src || _fail "failed to create src"
+
+$CLONER_PROG $SCRATCH_MNT/src  $SCRATCH_MNT/src.clone1
+
+src_str="bbcc"
+
+echo -n "$src_str" > $SCRATCH_MNT/src || _fail "failed to create src"
+
+$CLONER_PROG $SCRATCH_MNT/src $SCRATCH_MNT/src.clone2
+
+snap_src_sz=`ls -lah $SCRATCH_MNT/src.clone1 | awk '{print $5}'`
+echo "attempting ioctl (src.clone1 src)"
+$CLONER_PROG -s 0 -d 0 -l ${snap_src_sz} \
+   $SCRATCH_MNT/src.clone1 $SCRATCH_MNT/src || _fail "ioctl failed"
+
+snap_src_sz=`ls -lah $SCRATCH_MNT/src.clone2 | awk '{print $5}'`
+echo "attempting ioctl (src.clone2 src)"
+$CLONER_PROG -s 0 -d 0 -l ${snap_src_sz} \
+   $SCRATCH_MNT/src.clone2 $SCRATCH_MNT/src || _fail "ioctl failed"
+
+status=0 ; exit
diff --git a/tests/btrfs/035.out b/tests/btrfs/035.out
new file mode 100644
index 000..f86cadf
--- /dev/null
+++ b/tests/btrfs/035.out
@@ -0,0 +1,3 @@
+QA output created by 035
+attempting ioctl (src.clone1 src)
+attempting ioctl (src.clone2 src)
diff --git a/tests/btrfs/group b/tests/btrfs/group
index f9f062f..bee57cb 100644
--- a/tests/btrfs/group
+++ b/tests/btrfs/group
@@ -37,3 +37,4 @@
 032 auto quick
 033 auto quick
 034 auto quick
+035 auto quick
-- 
1.8.4.5



[PATCH 0/2] __btrfs_drop_extents() BUG_ON reproducer

2014-02-05 Thread David Disseldorp
This patch-set provides a reproducer for hitting the 3.14.0-rc1 BUG_ON()
at:
 692 int __btrfs_drop_extents(struct btrfs_trans_handle *trans,
...
 839 /*
 840  *  | ---- range to drop ----- |
 841  *      | -------- extent -------- |
 842  */
 843 if (start <= key.offset && end < extent_end) {
 844 BUG_ON(extent_type == BTRFS_FILE_EXTENT_INLINE);
 845 
 846 memcpy(&new_key, &key, sizeof(new_key));

The first patch adds a small cloner binary which is used by btrfs/035 to
dispatch BTRFS_IOC_CLONE_RANGE requests.

This workload resembles that of Samba's vfs_btrfs module, when a Windows
client restores a file from a shadow-copy (snapshot) using server-side
copy requests.

Feedback appreciated.

Cheers, David


 src/Makefile        |   2 +-
 src/cloner.c        | 168 +++
 tests/btrfs/035     |  76 +
 tests/btrfs/035.out |   3 +++
 tests/btrfs/group   |   1 +
 5 files changed, 249 insertions(+), 1 deletion(-)
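
As background on the clone requests mentioned above, the sketch below shows how the two request numbers are composed, reusing the struct and macro definitions from the cloner patch. `struct ioc_fields` and `decode_ioc()` are hypothetical helpers added for illustration; the `_IOC_TYPE`, `_IOC_NR` and `_IOC_SIZE` macros come from `<linux/ioctl.h>`.

```c
#include <stdint.h>
#include <linux/ioctl.h>

/* Argument layout and request numbers as defined in the cloner patch. */
struct btrfs_ioctl_clone_range_args {
	int64_t src_fd;
	uint64_t src_offset;
	uint64_t src_length;
	uint64_t dest_offset;
};

#define BTRFS_IOCTL_MAGIC 0x94
#define BTRFS_IOC_CLONE _IOW(BTRFS_IOCTL_MAGIC, 9, int)
#define BTRFS_IOC_CLONE_RANGE _IOW(BTRFS_IOCTL_MAGIC, 13, \
				   struct btrfs_ioctl_clone_range_args)

/* Hypothetical helper type: the fields packed into a request number. */
struct ioc_fields {
	unsigned int type; /* magic byte */
	unsigned int nr;   /* command number */
	unsigned int size; /* sizeof the argument type */
};

/* Split an ioctl request number back into its components. */
static struct ioc_fields decode_ioc(unsigned long req)
{
	struct ioc_fields f = {
		.type = _IOC_TYPE(req),
		.nr = _IOC_NR(req),
		.size = _IOC_SIZE(req),
	};
	return f;
}
```

The full-file clone passes the source fd directly as the ioctl argument, while the range variant passes a pointer to the 32-byte argument struct, which is why the encoded sizes differ.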



Re: [PATCH] Btrfs: convert to add transaction protection for btrfs send

2014-02-05 Thread Wang Shilong

Hi Josef,

[..SNIP..]
> 
> On 01/31/2014 11:37 AM, Wang Shilong wrote:
>> Hello Josef,
>> 
>>>  
> 
> 2) Remove the per-root rwsem for the commit root and just make one big
> rwsem that covers all commit root switching. This way everybody who
> wants to search with the commit root can just use this semaphore and all
> be safe. It will mean that the inode cache stuff may block longer than
> normal but I don't think that's too big of a deal.
> 

I am OK with this fix. I also want to discuss protecting searches of the
commit file root; this is a real problem, especially for a full send.

I have some ideas about this issue:

#1. Don't use the commit file root to search.
This will become a nightmare when we are doing a full send, which iterates
the whole file tree: if we snapshot the send root at the same time, the
snapshot will be blocked until the send has finished.

#2. Don't allow snapshots while we are sending a root.
This may be a little confusing: snapshots are read-only, yet users cannot
snapshot them.

#3. After each iteration, check send_root's generation and make sure it
hasn't changed; if it has changed, restart the send.

I don't know which approach is better. Also, will snapshot-aware
defragment change a read-only snapshot?

Do you have any better ideas about this issue? Please share them here. ^_^
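
To make idea #3 above concrete, here is a minimal userspace sketch of the check-and-restart loop. Everything in it is hypothetical: `struct mock_root`, `send_with_restart()` and the retry limit are illustrative only and say nothing about how the kernel send code would actually be structured.

```c
#include <stdint.h>

/* Hypothetical stand-in for a root whose generation can change under us. */
struct mock_root {
	uint64_t generation;
};

/* One pass over the tree; returns 0 on success, negative on error. */
typedef int (*send_pass_fn)(struct mock_root *root, void *ctx);

/*
 * Run `pass` until it completes without the generation moving, giving up
 * after `max_tries` attempts. Returns the number of passes run, or a
 * negative value on error or when the retry budget is exhausted.
 */
static int send_with_restart(struct mock_root *root, send_pass_fn pass,
			     void *ctx, int max_tries)
{
	for (int tries = 1; tries <= max_tries; tries++) {
		uint64_t gen_before = root->generation;
		int ret = pass(root, ctx);

		if (ret < 0)
			return ret;
		if (root->generation == gen_before)
			return tries;
	}
	return -1;
}

/* Demo pass: simulates a concurrent change during the first call only. */
static int demo_pass(struct mock_root *root, void *ctx)
{
	int *calls = ctx;

	if ((*calls)++ == 0)
		root->generation++;
	return 0;
}
```

The retry limit is there because a root that keeps changing would otherwise restart the send indefinitely.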

Thanks,
Wang
> 
> Josef



Re: [PATCH] Btrfs: throttle delayed refs better

2014-02-05 Thread Johannes Hirte
On Tue, 4 Feb 2014 09:12:54 -0500
Josef Bacik  wrote:

> Hrm I was hoping that was going to be more helpful.  Can you get perf 
> record -ag and then perf report while it's at full cpu and get the
> first 3 or 4 things with their traces?

Here it comes:

#
# captured on: Wed Feb  5 00:11:41 2014
#
no symbols found in /usr/sbin/acpid, maybe install a debug package?
unexpected end of event stream
# Samples: 168K of event 'cycles'
# Event count (approx.): 126847081763
#
# Overhead  Command          Shared Object      Symbol
# ........  ...............  .................  ..............................
#
    18.48%  btrfs-freespace  [kernel.kallsyms]  [k] state_store
            |
            --- state_store

    10.25%  btrfs-freespace  [kernel.kallsyms]  [k] sys_sched_rr_get_interval
            |
            --- sys_sched_rr_get_interval

     9.02%  btrfs-freespace  [kernel.kallsyms]  [k] rt_mutex_slowunlock
            |
            --- rt_mutex_slowunlock

     8.76%  btrfs-freespace  [kernel.kallsyms]  [k] btrfs_submit_compressed_write
            |
            --- btrfs_submit_compressed_write

     6.63%  btrfs-freespace  [kernel.kallsyms]  [k] sched_show_task
            |
            --- sched_show_task

     5.19%  btrfs-freespace  [kernel.kallsyms]  [k] find_free_extent
            |
            --- find_free_extent

     5.15%  btrfs-freespace  [kernel.kallsyms]  [k] trace_print_graph_duration
            |
            --- trace_print_graph_duration


> I'm going to try and
> reproduce today, is there anything special about your fs?
> Compression, large blocksizes, skinny metadata?  Thanks,

Filesystem was created with -l 32768 -n 32768 and skinny metadata enabled.

regards,
  Johannes