Re: Btrfs broken in massive transfer

2018-03-13 Thread MASAKI haruka
> On Tue, Mar 13, 2018 at 1:25 PM, MASAKI haruka <y...@reasonset.net> wrote:
> > Journal (kernel log), 7th try (filesystem went read-only):
> >
> > ---
> >  Mar 12 16:25:51 lily kernel: BTRFS info (device dm-6): creating UUID tree
> >  Mar 12 16:25:53 lily iscsid[1406]: Connection-1:0 to [target: iqn.1994-11.com.netgear:eggplant-01:edc9adcf:btr1group, portal: 192.168.1.166,3260] through [iface: default] is shutdown.
> >  Mar 12 16:25:53 lily iscsid[1406]: IPC qtask write failed: Broken pipe
> >  Mar 12 16:26:18 lily kernel:  connection1:0: detected conn error (1020)
> >  Mar 12 16:26:19 lily iscsid[1406]: Kernel reported iSCSI connection 1:0 error (1020 - ISCSI_ERR_TCP_CONN_CLOSE: TCP connection closed) state (3)
> >  Mar 12 16:26:21 lily kernel: sd 8:0:0:0: [sdg] tag#5 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
> >  Mar 12 16:26:21 lily kernel: sd 8:0:0:0: [sdg] tag#5 Sense Key : 0x2 [current] [descriptor]
> >  Mar 12 16:26:21 lily kernel: sd 8:0:0:0: [sdg] tag#5 ASC=0x8 ASCQ=0x0
> >  Mar 12 16:26:21 lily kernel: sd 8:0:0:0: [sdg] tag#5 CDB: opcode=0x8a 8a 00 00 00 00 00 00 42 5c 00 00 00 34 00 00 00
> >  Mar 12 16:26:21 lily kernel: print_req_error: I/O error, dev sdg, sector 4348928
> 
> 
> Looks like network problems. Is one of these Btrfs volumes on an iSCSI
> device? Because there's a bunch of iSCSI errors followed by an I/O
> error with sector LBA reported, and then you get a bunch of Btrfs
> write errors.
> 
> What's the relationship between /dev/sdg and device (dm-6)
> /dev/mapper/hymaster_1 ?
> 


/dev/mapper/hymaster_1 is a dm-crypt plain device.
Its backing device is /dev/sdg, which is an iSCSI disk
connected to the NAS over a local GbE link.

If this problem comes from the network, it looks difficult to solve,
because I tried with two different computers and with no other network device in between...
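
One way to check the ordering Chris describes (iSCSI errors first, then the SCSI/block-layer I/O error, then Btrfs write errors) is to tag each journal line with the layer that emitted it. The following is only a rough Python sketch: `classify` and `first_seen` are hypothetical helper names, the regular expressions are assumptions derived from the log lines quoted in this thread, and the `btrfs` pattern in particular is a guess at the message prefix, since no Btrfs error line is quoted here.

```python
import re

# Patterns are assumptions based only on the log lines quoted in this thread.
# The "btrfs" pattern is a guess: no Btrfs error line appears in the quotes.
LAYER_PATTERNS = [
    ("iscsi", re.compile(r"iscsid\[\d+\]|connection\d+:\d+: detected conn error")),
    ("scsi", re.compile(r"kernel: sd \d+:\d+:\d+:\d+: \[sd[a-z]+\]")),
    ("block", re.compile(r"print_req_error: I/O error")),
    ("btrfs", re.compile(r"kernel: BTRFS (error|warning|critical)")),
]


def classify(line):
    """Return the name of the first layer whose pattern matches, else None."""
    for layer, pattern in LAYER_PATTERNS:
        if pattern.search(line):
            return layer
    return None


def first_seen(lines):
    """Return the layers in the order in which they first report something."""
    order = []
    for line in lines:
        layer = classify(line)
        if layer is not None and layer not in order:
            order.append(layer)
    return order
```

Feeding it the journal lines around a failure should show whether `iscsi` really appears before `scsi`, `block` and `btrfs`. Informational lines such as "BTRFS info ... creating UUID tree" are deliberately not counted as errors.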

> 
> >
> > Note: the system is structured as follows:
> > Computer (Linux 4.14/4.15) - btrfs (original) - dm-crypt plain - 4 internal disks
> >  \_ btrfs (destination) - dm-crypt plain - iSCSI (single) - NAS - Hardware RAID5 - 8 disks
> 
> dm-6 is what btrfs is directly using and is complaining about, and I
> will guess that this is a dmcrypt device backed by /dev/sdg which is
> iSCSI to the NAS. Correct? Looks like either network problems, or
> possibly there is a real hardware problem with an error that's only
> partly passing through iSCSI. I can't parse this:
> 
>  Mar 13 00:36:47 lily kernel: sd 8:0:0:0: [sdg] tag#1 UNKNOWN(0x2003) Result: hostbyte=0x00 driverbyte=0x08
>  Mar 13 00:36:47 lily kernel: sd 8:0:0:0: [sdg] tag#1 Sense Key : 0x2 [current] [descriptor]
>  Mar 13 00:36:47 lily kernel: sd 8:0:0:0: [sdg] tag#1 ASC=0x8 ASCQ=0x0
>  Mar 13 00:36:47 lily kernel: sd 8:0:0:0: [sdg] tag#1 CDB: opcode=0x8a 8a 00 00 00 00 00 cd 6f 0b 80 00 00 2a 20 00 00
> 
> Anyway, Btrfs detects the write failures, and is going read-only in
> order to prevent corrupting the file system. So I think you've got
> some iSCSI troubleshooting to do, and fix that. Doesn't seem like it's
> a Btrfs specific problem to me.
> 

> and I will guess that this is a dmcrypt device backed by /dev/sdg which is
> iSCSI to the NAS. Correct?

Yes.

The log looks like a network (or disk?) problem to me too, but I think that is unlikely,
because I was not using iSCSI when I experienced this before (with Linux 3.9).
However, the btrfs disks were a little unstable back then, so the
target device (disks, iSCSI or network) could plausibly be the cause...

I have not seen iSCSI errors except during btrfs transfers.
If most people do not see a problem like this,
maybe the cause is something different in my setup... dm-crypt plain?

I am going to try the encryption function on the NAS (LUKS?) instead of dm-crypt plain on
the iSCSI disk.

(I don't know how to track down an iSCSI problem...)
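
As a possible first step, the SCSI lines in the kernel log quoted above can be decoded by hand. The sketch below is in Python and is not authoritative: `decode_write16` and `describe_sense` are hypothetical helper names, it assumes the CDB is a 16-byte WRITE(16) command (opcode 0x8a, as printed in the log), and the two lookup tables hold only the values seen in this thread, with names taken from my reading of the SCSI sense-key and ASC/ASCQ assignments, which should be checked against the specification.

```python
import struct

# Deliberately tiny tables: only the values that appear in this thread.
# The names are assumptions to be verified against the SCSI SPC tables.
SENSE_KEYS = {0x2: "NOT READY"}
ASC_ASCQ = {(0x08, 0x00): "LOGICAL UNIT COMMUNICATION FAILURE"}


def decode_write16(cdb_hex):
    """Return (lba, blocks) from a WRITE(16) CDB given as hex text."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 16 or cdb[0] != 0x8A:
        raise ValueError("expected a 16-byte CDB with opcode 0x8a")
    # Bytes 2-9: big-endian 64-bit LBA; bytes 10-13: big-endian 32-bit length.
    lba, blocks = struct.unpack(">QI", cdb[2:14])
    return lba, blocks


def describe_sense(key, asc, ascq):
    """Return a readable string for a sense key and ASC/ASCQ pair."""
    key_name = SENSE_KEYS.get(key, "unknown sense key")
    asc_name = ASC_ASCQ.get((asc, ascq), "unknown ASC/ASCQ")
    return f"{key_name} / {asc_name}"


if __name__ == "__main__":
    # CDB from the Mar 12 16:26:21 log line quoted earlier in this thread.
    print(decode_write16("8a 00 00 00 00 00 00 42 5c 00 00 00 34 00 00 00"))
    print(describe_sense(0x2, 0x08, 0x00))
```

For the first CDB this gives LBA 4348928 with a length of 13312 blocks, and the LBA matches the sector reported by print_req_error in the same log. If the table names are right, the sense data says the initiator lost communication with the logical unit, which fits the iSCSI connection error logged a few seconds before.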

Thank you.

> -- 
> Chris Murphy
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


-- 
MASAKI haruka <y...@reasonset.net>


Re: Btrfs broken in massive transfer

2018-03-13 Thread MASAKI haruka
---

Note: the system is structured as follows:
Computer (Linux 4.14/4.15) - btrfs (original) - dm-crypt plain - 4 internal disks
 \_ btrfs (destination) - dm-crypt plain - iSCSI (single) - NAS - Hardware RAID5 - 8 disks




> 
> 
> On 2018-03-13 12:19, MASAKI haruka wrote:
> > *Now* I have tried with Linux 4.14 and 4.15.
> > I experienced the same problem, and reported it, in 2014 with Linux 3.9 and 3.10.
> > (The kernel may actually have been newer than 3.10; either way, I experienced
> > the same problem with an old 3.x kernel.)
> 
> Then kernel message please.
> 
> Especially for the readonly case.
> 
> And "btrfs check" output please.
> 
> Thanks,
> Qu
> 
> 


-- 
MASAKI haruka <y...@reasonset.net>


Re: Btrfs broken in massive transfer

2018-03-12 Thread MASAKI haruka
*Now* I have tried with Linux 4.14 and 4.15.
I experienced the same problem, and reported it, in 2014 with Linux 3.9 and 3.10.
(The kernel may actually have been newer than 3.10; either way, I experienced the same
problem with an old 3.x kernel.)

> 
> 
> On 2018-03-13 05:57, MASAKI haruka wrote:
> > I'm trying to clone 18 TiB of data from one btrfs filesystem to another,
> > but it crashes every time.
> > 
> > The problem occurs regardless of how I clone (btrfs send/receive, rsync or cp).
> > I experienced the same problem with Linux 3.9 and Linux 3.10.
> 
> Did you really mean *3*.9 and *3*.10?
> 
> That's too old for btrfs usage IIRC.
> 
> It would be *4*.9 or *4*.10 for a relatively new kernel for btrfs.
> 
> Would you please try some latest mainline kernel again?
> 
> > 
> > What happens:
> > 
> > 1. Writes fail with an I/O error (the filesystem goes read-only).
> > 2. Writes to the btrfs succeed and fail at random.
> > 3. The btrfs cannot be unmounted (resource busy), not even with a forced
> > umount, so the machine cannot be halted.
> > 
> 
> In that case, we need kernel message to investigate.
> (And of course, please use at least 4.x kernel)
> 
> Thanks,
> Qu
> 
> > 
> 


-- 
MASAKI haruka <y...@reasonset.net>


Btrfs broken in massive transfer

2018-03-12 Thread MASAKI haruka
I'm trying to clone 18 TiB of data from one btrfs filesystem to another,
but it crashes every time.

The problem occurs regardless of how I clone (btrfs send/receive, rsync or cp).
I experienced the same problem with Linux 3.9 and Linux 3.10.

What happens:

1. Writes fail with an I/O error (the filesystem goes read-only).
2. Writes to the btrfs succeed and fail at random.
3. The btrfs cannot be unmounted (resource busy), not even with a forced
umount, so the machine cannot be halted.

Example:
---
mkfile o7784-11-0
rename o7784-11-0 -> .filesystem/HDD/.XFV_pp/,fQO40jotqhUZ0/5JSSubx1Ph5xYNOcXhIAoIK3/XDGOWpbx,5zYWEi0L5LHdWBo/kVxX8RdGhryQiEMOm4II2qMw
utimes .filesystem/HDD/.XFV_pp/,fQO40jotqhUZ0/5JSSubx1Ph5xYNOcXhIAoIK3/XDGOWpbx,5zYWEi0L5LHdWBo
truncate .filesystem/HDD/.XFV_pp/,fQO40jotqhUZ0/5JSSubx1Ph5xYNOcXhIAoIK3/XDGOWpbx,5zYWEi0L5LHdWBo/kVxX8RdGhryQiEMOm4II2qMw size=1073698824
chown .filesystem/HDD/.XFV_pp/,fQO40jotqhUZ0/5JSSubx1Ph5xYNOcXhIAoIK3/XDGOWpbx,5zYWEi0L5LHdWBo/kVxX8RdGhryQiEMOm4II2qMw - uid=1000, gid=1000
chmod .filesystem/HDD/.XFV_pp/,fQO40jotqhUZ0/5JSSubx1Ph5xYNOcXhIAoIK3/XDGOWpbx,5zYWEi0L5LHdWBo/kVxX8RdGhryQiEMOm4II2qMw - mode=0600
utimes .filesystem/HDD/.XFV_pp/,fQO40jotqhUZ0/5JSSubx1Ph5xYNOcXhIAoIK3/XDGOWpbx,5zYWEi0L5LHdWBo/kVxX8RdGhryQiEMOm4II2qMw
mkfile o7785-12-0
rename o7785-12-0 -> .filesystem/HDD/.XFV_pp/,fQO40jotqhUZ0/5JSSubx1Ph5xYNOcXhIAoIK3/XDGOWpbx,5zYWEi0L5LHdWBo/lSABmfoArm9pAtade-gHmS6X
utimes .filesystem/HDD/.XFV_pp/,fQO40jotqhUZ0/5JSSubx1Ph5xYNOcXhIAoIK3/XDGOWpbx,5zYWEi0L5LHdWBo
truncate .filesystem/HDD/.XFV_pp/,fQO40jotqhUZ0/5JSSubx1Ph5xYNOcXhIAoIK3/XDGOWpbx,5zYWEi0L5LHdWBo/lSABmfoArm9pAtade-gHmS6X size=864067592
ERROR: truncate .filesystem/HDD/.XFV_pp/,fQO40jotqhUZ0/5JSSubx1Ph5xYNOcXhIAoIK3/XDGOWpbx,5zYWEi0L5LHdWBo/lSABmfoArm9pAtade-gHmS6X failed: Input/output error
btrfs send 180310235348  0.09s user 11.98s system 16% cpu 1:14.42 total
---

Attempts:
1.
Connected host A (btrfs, 4 disks) and host B with socat (TCP).
Host B writes to the iSCSI disk (btrfs, single).
Cloned with btrfs send/receive. Linux 4.15.
-> Crashed after transferring 1.78TB

2.
Deleted the snapshot and retried.
Connected host A and host B with SSH and socat (UNIX).
Host B writes to the iSCSI disk (btrfs, single).
Cloned with btrfs send/receive. Linux 4.15.
-> Crashed after transferring 90GB

3.
Recreated the btrfs.
Host A writes to the iSCSI disk.
Cloned with btrfs send/receive. Linux 4.15.
-> Crashed after transferring 260GB

4.
Recreated the btrfs.
Attached the original disks to another computer (one with more resources).
Cloned with btrfs send/receive. Linux 4.15.
-> Crashed after transferring 120GB

5.
Recreated the btrfs.
Cloned with rsync. Linux 4.15.
-> Crashed after transferring 100GB

6.
Recreated the btrfs.
Tried with Linux 4.14, btrfs send/receive.
-> Crashed after transferring 3.98TB

7.
Recreated the btrfs.
Connected the host and the NAS (iSCSI) directly with a GbE cable.
Mounted with options relatime, space_cache, compress=lzo.
Cloned with btrfs send/receive. Linux 4.14.
-> Crashed after transferring 2.13TB
-- 
MASAKI haruka <y...@reasonset.net>