Re: [PATCH 0/9] Implement device scrub/replace for RAID56

2014-11-25 Thread Chris Mason

On Fri, Nov 14, 2014 at 8:50 AM, Miao Xie wrote:
> This patchset implements device scrub/replace for RAID56. Most of the
> implementation for the common data is similar to that of the other
> RAID types.
>
> The difference, and the difficulty, is the parity processing. To avoid
> the problem that the data can easily be changed outside the stripe
> lock, we do most of the work in the RAID56 stripe lock context.
>
> To avoid making the code more and more complex, we copied some of the
> common data processing code for the parity; the cleanup work is on my
> TODO list.
>
> We have done some testing and the patchset worked well. Of course,
> more testing is welcome. If you are interested in using or testing it,
> you can pull the patchset from
>
>   https://github.com/miaoxie/linux-btrfs.git raid56-scrub-replace


I'm getting crashes from btrfs/060 with these in place:

[ 1649.712413] BTRFS: assertion failed: logical + PAGE_SIZE <= rbio->raid_map[0] + rbio->stripe_len * rbio->nr_data, file: fs/btrfs/raid56.c, line: 2248
[ 1649.738982] ------------[ cut here ]------------
[ 1649.748727] kernel BUG at fs/btrfs/ctree.h:4020!
[ 1649.758039] invalid opcode:  [#1] SMP DEBUG_PAGEALLOC
[ 1649.768977] Modules linked in: fuse loop btrfs raid6_pq zlib_deflate lzo_compress xor k10temp coretemp hwmon xfs exportfs libcrc32c tcp_diag inet_diag nfsv4 ip6table_filter ip6_tables xt_NFLOG nfnetlink_log nfnetlink xt_comment xt_statistic iptable_filter ip_tables x_tables nfsv3 nfs lockd grace mptctl netconsole autofs4 rpcsec_gss_krb5 auth_rpcgss oid_registry sunrpc ipv6 ext3 jbd dm_mod iTCO_wdt iTCO_vendor_support rtc_cmos ipmi_si ipmi_msghandler pcspkr i2c_i801 lpc_ich mfd_core shpchp ehci_pci ehci_hcd mlx4_en ptp pps_core mlx4_core ses enclosure sg button megaraid_sas
[ 1649.872917] CPU: 0 PID: 16687 Comm: kworker/u65:0 Not tainted 3.18.0-rc6-mason+ #3
[ 1649.888171] Hardware name: ZTSYSTEMS Echo Ridge T4  /A9DRPF-10D, BIOS 1.07 05/10/2012
[ 1649.903962] Workqueue: btrfs-btrfs-scrub btrfs_scrub_helper [btrfs]
[ 1649.916588] task: 88072557dd90 ti: 88070fdc4000 task.ti: 88070fdc4000
[ 1649.931669] RIP: 0010:[]  [] raid56_parity_add_scrub_pages+0x8f/0xa0 [btrfs]
[ 1649.952169] RSP: 0018:88070fdc7b68  EFLAGS: 00010292
[ 1649.962852] RAX: 0089 RBX: 8804cf681f30 RCX: 4b4a
[ 1649.977177] RDX: 004a RSI: 0001 RDI:
[ 1649.991496] RBP: 88070fdc7b68 R08: 0001 R09:
[ 1650.005819] R10: 0001 R11:  R12: 880689b62800
[ 1650.020140] R13: 88024d85cf80 R14: 88075d0dd800 R15: 0003
[ 1650.034459] FS:  () GS:88085fc0() knlGS:
[ 1650.050757] CS:  0010 DS:  ES:  CR0: 80050033
[ 1650.062306] CR2: 7f445b6d0e78 CR3: 01c14000 CR4: 000407f0
[ 1650.076625] Stack:
[ 1650.080716]  88070fdc7bc8 a05f2e50 8804cf681fc8 88070010
[ 1650.095761]  88070fdc7b98 0001 880290f92340 88074eda9f00
[ 1650.110792]  8807edbb1700 880639910e20 8807edbb1700 1000
[ 1650.125865] Call Trace:
[ 1650.130845]  [] scrub_parity_check_and_repair+0x140/0x1e0 [btrfs]
[ 1650.146286]  [] scrub_block_put+0x8d/0x90 [btrfs]
[ 1650.158884]  [] ? cpuacct_account_field+0xd0/0xd0
[ 1650.171493]  [] scrub_bio_end_io_worker+0xe9/0x870 [btrfs]
[ 1650.185725]  [] normal_work_helper+0x84/0x330 [btrfs]
[ 1650.199041]  [] btrfs_scrub_helper+0x12/0x20 [btrfs]
[ 1650.212165]  [] process_one_work+0x1bf/0x520
[ 1650.223892]  [] ? process_one_work+0x13d/0x520
[ 1650.235988]  [] worker_thread+0x11e/0x4b0
[ 1650.247204]  [] ? __schedule+0x389/0x880
[ 1650.258242]  [] ? process_one_work+0x520/0x520
[ 1650.270314]  [] kthread+0xde/0x100
[ 1650.280302]  [] ? __init_kthread_worker+0x70/0x70
[ 1650.292894]  [] ret_from_fork+0x7c/0xb0
[ 1650.303746]  [] ? __init_kthread_worker+0x70/0x70
[ 1650.316359] Code: c0 e8 1d 5a 04 e1 0f 0b eb fe b9 c8 08 00 00 48 c7 c2 71 6b 62 a0 48 c7 c6 b8 c4 62 a0 48 c7 c7 80 c4 62 a0 31 c0 e8 f8 59 04 e1 <0f> 0b eb fe 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5
[ 1650.356466] RIP  [] raid56_parity_add_scrub_pages+0x8f/0xa0 [btrfs]
[ 1650.372307]  RSP
[ 1650.381427] ---[ end trace 14445249faa12848 ]---
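
For anyone reading the trace, the condition printed by the failed assertion
can be modeled in user space. This is only a rough sketch, not the code in
fs/btrfs/raid56.c: struct rbio_model, MODEL_PAGE_SIZE and
scrub_page_fits_data_range() are hypothetical names invented here, and a
4096-byte page is assumed. It checks that a page starting at "logical" ends
at or before the end of the data stripes, which is what the logged
expression states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed page size; the kernel uses PAGE_SIZE for the running machine. */
#define MODEL_PAGE_SIZE 4096ULL

/*
 * Hypothetical stand-in for the three values named in the assertion.
 * This is not the kernel's struct btrfs_raid_bio.
 */
struct rbio_model {
	uint64_t first_data_logical; /* models rbio->raid_map[0] */
	uint64_t stripe_len;         /* models rbio->stripe_len */
	unsigned int nr_data;        /* models rbio->nr_data */
};

/*
 * Returns true when the page at 'logical' ends at or before the end of
 * the data stripes, i.e. when the logged condition
 *   logical + PAGE_SIZE <= raid_map[0] + stripe_len * nr_data
 * would hold.
 */
static bool scrub_page_fits_data_range(const struct rbio_model *rbio,
				       uint64_t logical)
{
	uint64_t data_end = rbio->first_data_logical +
			    rbio->stripe_len * (uint64_t)rbio->nr_data;

	return logical + MODEL_PAGE_SIZE <= data_end;
}
```

With two 64K data stripes starting at logical 1M, the last page that passes
starts at 1M + 128K - 4K; a page starting any later would trip the check.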


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/9] Implement device scrub/replace for RAID56

2014-11-14 Thread Chris Mason



On Fri, Nov 14, 2014 at 8:50 AM, Miao Xie wrote:
> This patchset implements device scrub/replace for RAID56. Most of the
> implementation for the common data is similar to that of the other
> RAID types.
>
> The difference, and the difficulty, is the parity processing. To avoid
> the problem that the data can easily be changed outside the stripe
> lock, we do most of the work in the RAID56 stripe lock context.
>
> To avoid making the code more and more complex, we copied some of the
> common data processing code for the parity; the cleanup work is on my
> TODO list.


I'm starting to review and test these, but many thanks for tackling this.


-chris



[PATCH 0/9] Implement device scrub/replace for RAID56

2014-11-14 Thread Miao Xie
This patchset implements device scrub/replace for RAID56. Most of the
implementation for the common data is similar to that of the other RAID
types.

The difference, and the difficulty, is the parity processing. To avoid
the problem that the data can easily be changed outside the stripe lock,
we do most of the work in the RAID56 stripe lock context.

To avoid making the code more and more complex, we copied some of the
common data processing code for the parity; the cleanup work is on my
TODO list.

We have done some testing and the patchset worked well. Of course, more
testing is welcome. If you are interested in using or testing it, you can
pull the patchset from

  https://github.com/miaoxie/linux-btrfs.git raid56-scrub-replace

Thanks
Miao

Miao Xie (6):
  Btrfs, raid56: don't change bbio and raid_map
  Btrfs, scrub: repair the common data on RAID5/6 if it is corrupted
  Btrfs,raid56: use a variant to record the operation type
  Btrfs,raid56: support parity scrub on raid56
  Btrfs, replace: write dirty pages into the replace target device
  Btrfs, replace: write raid56 parity into the replace target device

Zhao Lei (3):
  Btrfs: remove noused bbio_ret in __btrfs_map_block in condition
  Btrfs: remove unnecessary code of stripe_index assignment in
    __btrfs_map_block
  Btrfs, replace: enable dev-replace for raid56

 fs/btrfs/dev-replace.c |   5 -
 fs/btrfs/raid56.c      | 711 +++-
 fs/btrfs/raid56.h      |  14 +-
 fs/btrfs/scrub.c       | 793 +++--
 fs/btrfs/volumes.c     |  47 ++-
 fs/btrfs/volumes.h     |  14 +-
 6 files changed, 1471 insertions(+), 113 deletions(-)

-- 
1.9.3
