At 02/07/2017 04:02 PM, Anand Jain wrote:
Hi Qu,
I don't think I have seen this before. I don't know why I wrote it
this way, maybe to test encryption, but in any case it was all
with the default options.
Forgot to mention: thanks for the test case, or we would never have
found this.
Thanks,
Qu
But now I could reproduce it, and it looks like balance fails to
start with an I/O error even though the mount is successful.
------------------
# tail -f ./results/btrfs/125.full
intense and takes potentially very long. It is recommended to
use the balance filters to narrow down the balanced data.
Use 'btrfs balance start --full-balance' option to skip this
warning. The operation will start in 10 seconds.
Use Ctrl-C to stop it.
10 9 8 7 6 5 4 3 2 1ERROR: error during balancing '/scratch':
Input/output error
There may be more info in syslog - try dmesg | tail
Starting balance without any filters.
failed: '/root/bin/btrfs balance start /scratch'
--------------------
This must be fixed. For debugging: if I add a sync before the previous
unmount, the problem isn't reproduced. Just FYI. Strange.
-------
diff --git a/tests/btrfs/125 b/tests/btrfs/125
index 91aa8d8c3f4d..4d4316ca9f6e 100755
--- a/tests/btrfs/125
+++ b/tests/btrfs/125
@@ -133,6 +133,7 @@ echo "-----Mount normal-----" >> $seqres.full
 echo
 echo "Mount normal and balance"
+_run_btrfs_util_prog filesystem sync $SCRATCH_MNT
 _scratch_unmount
 _run_btrfs_util_prog device scan
 _scratch_mount >> $seqres.full 2>&1
------
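In case it is useful, a rough manual check of the same workaround outside
the test harness, assuming the scratch fs is mounted at /scratch and the
device path below is only a placeholder:
-------
# force delalloc/ordered extents to disk before the unmount/remount cycle
btrfs filesystem sync /scratch
umount /scratch
btrfs device scan
mount /dev/sdX /scratch          # /dev/sdX is a placeholder device
btrfs balance start --full-balance /scratch
-------
With the extra sync the balance goes through here, which matches what the
one-line change to the test does.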
HTH.
Thanks, Anand
On 02/07/17 14:09, Qu Wenruo wrote:
Hi Anand,
I found that the btrfs/125 test case can only pass if space cache is enabled.
If using the nospace_cache or space_cache=v2 mount option, it gets
blocked forever with the following call trace (the only blocked process):
[11382.046978] btrfs D11128 6705 6057 0x00000000
[11382.047356] Call Trace:
[11382.047668] __schedule+0x2d4/0xae0
[11382.047956] schedule+0x3d/0x90
[11382.048283] btrfs_start_ordered_extent+0x160/0x200 [btrfs]
[11382.048630] ? wake_atomic_t_function+0x60/0x60
[11382.048958] btrfs_wait_ordered_range+0x113/0x210 [btrfs]
[11382.049360] btrfs_relocate_block_group+0x260/0x2b0 [btrfs]
[11382.049703] btrfs_relocate_chunk+0x51/0xf0 [btrfs]
[11382.050073] btrfs_balance+0xaa9/0x1610 [btrfs]
[11382.050404] ? btrfs_ioctl_balance+0x3a0/0x3b0 [btrfs]
[11382.050739] btrfs_ioctl_balance+0x3a0/0x3b0 [btrfs]
[11382.051109] btrfs_ioctl+0xbe7/0x27f0 [btrfs]
[11382.051430] ? trace_hardirqs_on+0xd/0x10
[11382.051747] ? free_object+0x74/0xa0
[11382.052084] ? debug_object_free+0xf2/0x130
[11382.052413] do_vfs_ioctl+0x94/0x710
[11382.052750] ? enqueue_hrtimer+0x160/0x160
[11382.053090] ? do_nanosleep+0x71/0x130
[11382.053431] SyS_ioctl+0x79/0x90
[11382.053735] entry_SYSCALL_64_fastpath+0x18/0xad
[11382.054570] RIP: 0033:0x7f397d7a6787
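For reference, this is roughly how I am reproducing it and grabbing the
stack of the blocked task (MOUNT_OPTIONS is the usual xfstests variable;
adjust paths to your setup):
-------
# run the test with the space cache disabled, or with the free space tree
cd xfstests
MOUNT_OPTIONS="-o nospace_cache" ./check btrfs/125
# or: MOUNT_OPTIONS="-o space_cache=v2" ./check btrfs/125

# once balance hangs, dump all blocked (D state) tasks
echo w > /proc/sysrq-trigger
dmesg | tail -n 80

# or look at the stuck balance ioctl directly
cat /proc/$(pgrep -x btrfs)/stack
-------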
I also found that in the test case we only have 3 contiguous data extents,
whose sizes are 1M, 68.5M and 31.5M respectively.
Original data block group:

0      1M                        64M 69.5M                101M     128M
|Ext A |       Extent B (68.5M)       |   Extent C (31.5M)   |
While relocation writes them out as 4 extents:
0~1M           : same as Extent A. (1st)
1M~68.3438M    : smaller than Extent B (2nd)
68.3438M~69.5M : tail part of Extent B (3rd)
69.5M~101M     : same as Extent C. (4th)
However, only the ordered extents of (3rd) and (4th) get finished,
while the ordered extents of (1st) and (2nd) never reach
finish_ordered_io().
So relocation waits forever for these two ordered extents to finish,
and gets blocked.
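If it helps to cross check the layout, the data extents and the relocated
result can be inspected with something like the following (assuming a
reasonably recent btrfs-progs; the file path is only a placeholder for
whatever the test wrote):
-------
# data extents as recorded in the extent tree
btrfs inspect-internal dump-tree -t extent $SCRATCH_DEV | grep -A2 EXTENT_ITEM

# on-disk layout of the test file after relocation
filefrag -v /scratch/<testfile>
-------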
Did you experience the same bug when submitting the test case?
Is there any known fix for it?
Thanks,
Qu