Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-23 Thread bepi
Yes.

It was through the btrfs-tools error messages that the script printed that I
realized the filesystem was corrupted.


P.S. The various messages that you see in the working examples of the script
are emitted directly by btrfs-tools.
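For reference, checking both sides of a send/receive pipeline can be sketched in bash roughly like this (my own illustration, not the script's actual code; the function name is hypothetical, and the BTRFS variable exists only so the commands can be stubbed):

```shell
#!/bin/bash
# Minimal sketch of checking the exit status of both sides of a
# send/receive pipeline.  BTRFS defaults to the real tool but can be
# pointed at a stub command for testing.
BTRFS=${BTRFS:-btrfs}

send_receive() {
    local parent=$1 snap=$2 dest=$3
    # with pipefail, the pipeline fails if either send or receive fails
    set -o pipefail
    if ! "$BTRFS" send -p "$parent" "$snap" | "$BTRFS" receive "$dest"; then
        echo "transfer failed" >&2
        return 1
    fi
    echo "transfer ok"
}
```

Without pipefail, the pipeline's status would be that of the receive side only, and a failed send could go unnoticed.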


Gdb

Xin Zhou :

> Hi,
> 
> Does the script check the transfer status, and does any transfer return an
> error code?
> Thanks,
> Xin
>  
>  
> 
> Sent: Thursday, December 22, 2016 at 11:28 PM
> From: "Giuseppe Della Bianca" 
> To: "Btrfs BTRFS" 
> Subject: Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system
> during the snapshot receive
> (synthetic resend)
> 
> Hi.
> 
> It is possible that there are transfers, deletions and other operations at
> the same time, but not in the same subvolume.
>
> My script checks that there are no transfers in progress on the same
> subvolume.
>
> It is possible that the same subvolume is mounted several times (my script
> does a temporary mount at the beginning and an unmount at the end).
> 
> 
> Thanks for all.
> 
> 
> P.S. Sorry for my bad English.
> 
> 
> Gdb
> 
> 
> On Wednesday, 21 December 2016 at 23:14:44, Xin Zhou wrote:
> > Hi,
> > A race condition can happen when running multiple transfers to the same
> > destination. Could you tell us how many transfers the scripts run at a
> > time to a specific HDD?
> >
> > Thanks,
> > Xin
> >
> >
> > Sent: Wednesday, December 21, 2016 at 1:11 PM
> > From: "Chris Murphy" 
> > To: No recipient address
> > Cc: "Giuseppe Della Bianca" , "Xin Zhou"
> ,
> > "Btrfs BTRFS"  Subject: Re: [CORRUPTION
> > FILESYSTEM] Corrupted and unrecoverable file system during the snapshot
> > receive
> > On Wed, Dec 21, 2016 at 2:09 PM, Chris Murphy 
> wrote:
> > > What about CONFIG_BTRFS_FS_CHECK_INTEGRITY? And then using check_int
> > > mount option?
> >
> > This slows things down, and in that case it might avoid the problem if
> > it's the result of a race condition.
> >
> > --
> > Chris Murphy
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> 





This mail has been sent using Alpikom webmail system
http://www.alpikom.it



Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-21 Thread bepi
Hi.

I will insert ' btrfs check ' after each ' receive ' in my script.
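A sketch of what that step could look like (my assumption of the shape; function and variable names are hypothetical, and btrfs check must run on an unmounted device, so the receiving filesystem is unmounted first):

```shell
#!/bin/bash
# Hypothetical sketch of a consistency check after each receive.
# BTRFS/UMOUNT default to the real tools but can be stubbed for testing.
BTRFS=${BTRFS:-btrfs}
UMOUNT=${UMOUNT:-umount}

check_after_receive() {
    local dev=$1 mnt=$2
    "$UMOUNT" "$mnt" || return 1      # btrfs check needs the fs unmounted
    if ! "$BTRFS" check "$dev"; then  # read-only check by default
        echo "corruption detected on $dev" >&2
        return 1
    fi
    echo "check ok: $dev"
}
```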

I will test my hardware again.
But it is not very likely that 2 computers, 3 HDDs and 3 partitions all have
the same issue.

I think that the problem is a concurrence of operations: a race condition, a
random condition.
I'll try to create a test case.


P.S. To find the problem it may be necessary to build ' btrfs ' with tools
such as ' coredumper ' and the sanitizers, to detect the ' extent '
corruption in real time, and to log the detection.


Thank you.

Gdb



Xin Zhou :

> Hi,
> 
> The system seems to be running some customized scripts that continuously
> back up data from an NVMe drive to HDDs.
> If the 3 HDD backup stores have the same btrfs config, and there is a bug
> in the btrfs code, they are all supposed to fail after the same operation
> sequence.
>
> Otherwise, one of the HDDs might have an issue, or there is a bug in a
> layer below btrfs.
>
> For the customized script, it might be helpful to check the file system
> consistency after each transfer.
> That might be useful to figure out which step generates a corruption, and
> whether errors propagate.
> 
> Regards,
> Xin
>  
>  
> 
> Sent: Monday, December 19, 2016 at 10:55 AM
> From: "Giuseppe Della Bianca" 
> To: "Xin Zhou" 
> Cc: linux-btrfs@vger.kernel.org
> Subject: Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system
> during the snapshot receive
> A concrete example:
> 
> 
> SNAPSHOT
> 
> /dev/nvme0n1p2 on /tmp/tmp.X3vU6dLLVI type btrfs
> (rw,relatime,ssd,space_cache,subvolid=5,subvol=/)
> 
> btrfsManage SNAPSHOT /
> 
> (2016-12-19 19:44:00) Start btrfsManage
> . . . Start managing SNAPSHOT ' / ' filesystem ' root ' snapshot
> 
> In ' btrfssnapshot ' latest source snapshot ' root-2016-12-18_15:10:01.40 '
> . . . date ' 2016-12-18_15:10:01 ' number ' 40 '
> 
> Creation ' root-2016-12-19_19:44:00.part ' snapshot from ' root ' subvolume
> . . . Create a readonly snapshot of '/tmp/tmp.X3vU6dLLVI/root' in
> '/tmp/tmp.X3vU6dLLVI/btrfssnapshot/root/root-2016-12-19_19:44:00.part'
> 
> Renaming ' root-2016-12-19_19:44:00.part ' into ' root-2016-12-19_19:44:00.41
> ' snapshot
> 
> Source snapshot list of ' root ' subvolume
> . . . btrfssnapshot/root/root-2016-08-28-12-35-01.1
> ]zac[
> . . . btrfssnapshot/root/root-2016-12-19_19:44:00.41
> 
> (2016-12-19 19:44:05) End btrfsManage
> . . . End managing SNAPSHOT ' / ' filesystem ' root ' snapshot
> CORRECTLY
> 
> 
> 
> SEND and RECEIVE
> 
> /dev/nvme0n1p2 on /tmp/tmp.o78czE0Bo6 type btrfs
> (rw,relatime,ssd,space_cache,subvolid=5,subvol=/)
> /dev/sda2 on /tmp/tmp.XcwqQCKq09 type btrfs
> (rw,relatime,space_cache,subvolid=5,subvol=/)
> 
> btrfsManage SEND / /dev/sda2
> 
> (2016-12-19 19:47:24) Start btrfsManage
> . . . Start managing SEND ' / ' filesystem ' root ' snapshot in ' /dev/sda2
> '
> 
> Sending ' root-2016-12-19_19:44:00.41 ' source snapshot to ' btrfsreceive '
> subvolume
> . . . btrfs send -p
> /tmp/tmp.o78czE0Bo6/btrfssnapshot/root/root-2016-12-18_15:10:01.40
> /tmp/tmp.o78czE0Bo6/btrfssnapshot/root/root-2016-12-19_19:44:00.41 | btrfs
> receive /tmp/tmp.XcwqQCKq09/btrfsreceive/root/.part/
> . . . At subvol
> /tmp/tmp.o78czE0Bo6/btrfssnapshot/root/root-2016-12-19_19:44:00.41
> . . . At snapshot root-2016-12-19_19:44:00.41
> 
> Creation ' root-2016-12-19_19:44:00.41 ' snapshot from '
> .part/root-2016-12-19_19:44:00.41 ' subvolume
> . . . Create a readonly snapshot of
> '/tmp/tmp.XcwqQCKq09/btrfsreceive/root/.part/root-2016-12-19_19:44:00.41' in
> '/tmp/tmp.XcwqQCKq09/btrfsreceive/root/root-2016-12-19_19:44:00.41'
> . . . Delete subvolume (commit):
> '/tmp/tmp.XcwqQCKq09/btrfsreceive/root/.part/root-2016-12-19_19:44:00.41'
> 
> Snapshot list in ' /dev/sda2 ' device
> . . . btrfsreceive/data_backup/data_backup-2016-12-17_12:07:00.1
> . . . btrfsreceive/data_storage/data_storage-2016-12-10_17:05:51.1
> . . . btrfsreceive/root/root-2016-08-28-12-35-01.1
> ]zac[
> . . . btrfsreceive/root/root-2016-12-19_19:44:00.41
> 
> (2016-12-19 19:48:37) End btrfsManage
> . . . End managing SEND ' / ' filesystem ' root ' snapshot in ' /dev/sda2 '
> CORRECTLY
> 
> 
> 
> > Hi Giuseppe,
> >
> > Would you like to tell some details about:
> > 1. Which subvolume was the XYZ snapshot taken from?
> > 2. Where is the base (initial) snapshot stored?
> > 3. The 3 partitions receive the same snapshot; are they in the same btrfs
> > configuration and subvol structure?
> >
> > Also, would you send the link to the post reporting the "two files
> > unreadable" error mentioned in step 2? I hope to see the message and
> > figure out whether the issue first comes from the sender or the receiver
> > side.
> >
> > Thanks,
> > Xin
> >
> >
>  
> 






Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-19 Thread bepi
(Resend)

Hi.

It is a bit complex.


Primary system

subvolume on SSD devices on PCIe slot

/root/ (Fedora 23, 50GB used)

/btrfssnapshot/
/btrfssnapshot/root/  (for /root/ snapshot)
/btrfssnapshot/root/root.1
/btrfssnapshot/root/root.2
/btrfssnapshot/root/root.XYZ


subvolume on device HDD "1" sata

/data_storage/  (data, 100GB used)
/data_backup/   (backup tar files, programs, downloads, etc., used 250GB)

/btrfssnapshot/
/btrfssnapshot/data_storage/  (for /data_storage/ snapshot)
/btrfssnapshot/data_backup/   (for /data_backup/ snapshot)
/btrfssnapshot/data_storage/data_storage.1
/btrfssnapshot/data_storage/data_storage.2
/btrfssnapshot/data_storage/data_storage.XYZ
/btrfssnapshot/data_backup/data_backup.1
/btrfssnapshot/data_backup/data_backup.2
/btrfssnapshot/data_backup/data_backup.XYZ


subvolume on HDD device "2" sata

partition 1

/btrfsreceive/root/  (for receive /btrfssnapshot/root/ snapshot)
/btrfsreceive/root/.part/

partition 2

/btrfsreceive/data_storage/  (for receive /btrfssnapshot/data_storage/ snapshot)
/btrfsreceive/data_storage/.part/
/btrfsreceive/data_backup/  (for receive /btrfssnapshot/data_backup/ snapshot)
/btrfsreceive/data_backup/.part/



Secondary system for receiving snapshot

subvolume on HDD device "3" sata

partition 1

/btrfsreceive/root/
/btrfsreceive/root/.part/

partition 2

/btrfsreceive/data_storage/  (for receive /btrfssnapshot/data_storage/ snapshot)
/btrfsreceive/data_storage/.part/
/btrfsreceive/data_backup/  (for receive /btrfssnapshot/data_backup/ snapshot)
/btrfsreceive/data_backup/.part/


My bash script creates snapshots of /root/, /data_storage/ and /data_backup/
in /btrfssnapshot/ .
Each snapshot is created with a .part extension; when the creation has
finished properly, it is renamed to .1, .2, ..., .XYZ .
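The ".part then rename" scheme can be sketched roughly like this (illustrative only, not the script's actual code; plain directory renames stand in for snapshot subvolumes, and the function names are my own):

```shell
#!/bin/bash
# Sketch of the naming scheme: find the highest progressive ".N" suffix
# already present, then rename "name.part" to "name.N+1" only after the
# snapshot has been created properly.
next_number() {
    local dir=$1 max=0 n f
    for f in "$dir"/*.*; do
        n=${f##*.}                          # text after the last dot
        [[ $n =~ ^[0-9]+$ ]] && (( n > max )) && max=$n
    done
    echo $(( max + 1 ))
}

finalize_snapshot() {
    local dir=$1 name=$2 n
    n=$(next_number "$dir")
    # the rename is atomic within one filesystem, so a reader never sees
    # a half-created snapshot under its final numbered name
    mv "$dir/$name.part" "$dir/$name.$n" && echo "$name.$n"
}
```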


My bash script sends snapshots from /btrfssnapshot/root/,
/btrfssnapshot/data_storage/ and /btrfssnapshot/data_backup/ (differential
sends, n - 1 -> n) to the subvolumes /btrfsreceive/root/,
/btrfsreceive/data_storage/ and /btrfsreceive/data_backup/.
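Selecting the differential pair (n - 1 as the -p parent of n) could look like this (again a hypothetical sketch, not the script's code; it assumes names of the form name.N with the progressive number after the only dot, and BTRFS is a stub-able variable):

```shell
#!/bin/bash
# Sketch of picking the two newest snapshots so that n-1 can be passed
# as the -p parent of n.
BTRFS=${BTRFS:-btrfs}

latest_two() {
    local dir=$1
    # sort numerically on the ".N" suffix, keep the two newest names
    ls "$dir" | sort -t. -k2,2n | tail -n 2
}

send_differential() {
    local dir=$1 dest=$2 parent snap
    { read -r parent; read -r snap; } < <(latest_two "$dir")
    set -o pipefail
    "$BTRFS" send -p "$dir/$parent" "$dir/$snap" | "$BTRFS" receive "$dest"
}
```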


My bash script also sends the same snapshots (over ssh) to the secondary
system for receiving snapshots.


P.S. If I recreate the receiving partition from scratch, the receive works
properly.


The information about the previous problems is in this thread.
Could you please read the thread?


Thank you.


Gdb

Xin Zhou writes:

> 
> Hi Giuseppe,
> 
> Would you like to tell some details about:
> 1. Which subvolume was the XYZ snapshot taken from?
> 2. Where is the base (initial) snapshot stored?
> 3. The 3 partitions receive the same snapshot; are they in the same btrfs
> configuration and subvol structure?
>
> Also, would you send the link to the post reporting the "two files
> unreadable" error mentioned in step 2?
> I hope to see the message and figure out whether the issue first comes from
> the sender or the receiver side.
> 
> Thanks,
> Xin
>  
> 





Re: [CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-12-19 Thread bepi
> 
> 
> At 11/21/2016 08:09 PM, b...@adria.it wrote:
> > Hi.
> >
> > My system: Fedora 23, kernel-4.7.10-100.fc23.x86_64
> btrfs-progs-4.4.1-1.fc23.x86_64
> >
> > Testing the remote differential receive (via ssh, on the local network)
> > of 24 sequential snapshots, while simultaneously deleting snapshots (in
> > the same file system, but in a different subvolume), there was a file
> > access error, and the file system became corrupted.
> 
> Are you using qgroup?
> 
> IIRC, Filipe fixed a problem which could cause backref corruption which 
> only happens if quota is enabled.
> 
> Thanks,
> Qu


Unfortunately not.

I read that thread (and others), but none of them seemed like my case.


Thank you.


Gdb


> 
> >
> > Scrub, the recovery and clear_cache mount options, and btrfsck have all
> > failed; the file system was left in an unusable state.
> >
> > After reformatting the filesystem, the remote receive of 24 snapshots
> > worked properly.
> >
> > The file system is used exclusively to receive the snapshots; it is
> > composed of a single device.
> > The initial snapshot is a Linux installation of 50GB.
> >
> >
> > I think that there was a race condition between the receive and the
> > deletion of snapshots (which were performed on two different subvolumes).
> >
> >
> > Best regards.
> >
> > gdb
> >
> > 
> >
> >
> 
> 
> 







btrfs check --repair question

2016-12-13 Thread bepi
Hi.


I had two cases of 'ref mismatch on extents ..', like you.

Every attempt at recovery made the problem much worse.

I suggest you save your important data, then delete and recreate the
partition.

I always keep a partition for reinstalling from scratch, so that I can
recover data from a damaged file system without being forced to try to
repair it.


Gdb




[CORRUPTION FILESYSTEM] Corrupted and unrecoverable file system during the snapshot receive

2016-11-22 Thread bepi
Hi.

My system: Fedora 23, kernel-4.7.10-100.fc23.x86_64 
btrfs-progs-4.4.1-1.fc23.x86_64

Testing the remote differential receive (via ssh, on the local network) of 24
sequential snapshots, while simultaneously deleting snapshots (in the same
file system, but in a different subvolume), there was a file access error and
the file system became corrupted.

Scrub, the recovery and clear_cache mount options, and btrfsck have all
failed; the file system was left in an unusable state.

After reformatting the filesystem, the remote receive of 24 snapshots worked
properly.

The file system is used exclusively to receive the snapshots; it is composed
of a single device.
The initial snapshot is a Linux installation of 50GB.


I think that there was a race condition between the receive and the deletion
of snapshots (which were performed on two different subvolumes).


Best regards.

gdb






Re:

2016-11-10 Thread bepi
Hi.

P.S. Sorry for the double sending and for the blank email subject.


Yes.
The various controls are designed to be used separately, and to be launched
either as cronjobs or manually.

For example, you can create a series of snapshots

  btrfsManage SNAPSHOT /

and send the new snapshots (incremental stream)

  btrfsManage SEND / /dev/sda1

from cronjobs or manually; it makes no difference.
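For instance, the two commands could be scheduled as cron entries like these (the install path and times are hypothetical, my own choice, not from the script):

```
# m  h dom mon dow  command
0  3 * * *  /usr/local/bin/btrfsManage SNAPSHOT /
30 3 * * *  /usr/local/bin/btrfsManage SEND / /dev/sda1
```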


Best regards.

Gdb

Alex Powell writes:

> Hi,
> It would be good but perhaps each task should be created via cronjobs
> instead of having a script running all the time or one script via one
> cronjob
> 
> Working in the enterprise environment for a major bank, we quickly
> learn that these sort of daily tasks should be split up
> 
> Kind Regards,
> Alex
> 
> On Thu, Nov 10, 2016 at 4:25 AM,   wrote:
> > Hi.
> >
> > I'm making a script for managing btrfs.
> >
> > It performs scrubs, and creates and sends backup snapshots (or a copy of
> > the current state of the data), even to a remote system.
> >
> > The script is designed to:
> > - Be easy to use:
> >   - The preparation is carried out automatically.
> >   - Autodetection of the mounted subvolumes.
> > - Be safe and robust:
> >   - Checks that no other btrfs management run has already started.
> >   - Subvolumes for created and received snapshots are mounted and
> >     accessible only for the time necessary to perform the requested
> >     operation.
> >   - Verifies that snapshot creation and snapshot sending have completed
> >     fully.
> >   - Progressive numbering of the snapshots, to identify the latest
> >     snapshot with certainty.
> >
> > Commands are also available to view the list of snapshots present and to
> > delete snapshots.
> >
> > For example:
> >
> > btrfsManage SCRUB /
> > btrfsManage SNAPSHOT /
> > btrfsManage SEND / /dev/sda1
> > btrfsManage SEND / r...@gdb.exnet.it/dev/sda1
> > btrfsManage SNAPLIST /dev/sda1
> > btrfsManage SNAPDEL /dev/sda1 "root-2016-11*"
> >
> > Are you interested?
> >
> > Gdb
> >
> >
> > 
> 







[OT][MANAGING BTRFS] Script for managing btrfs

2016-11-09 Thread bepi
Hi.

I'm making a script for managing btrfs.

It performs scrubs, and creates and sends backup snapshots (or a copy of the
current state of the data), even to a remote system.

The script is designed to:
- Be easy to use:
  - The preparation is carried out automatically.
  - Autodetection of the mounted subvolumes.
- Be safe and robust:
  - Checks that no other btrfs management run has already started.
  - Subvolumes for created and received snapshots are mounted and accessible
    only for the time necessary to perform the requested operation.
  - Verifies that snapshot creation and snapshot sending have completed
    fully.
  - Progressive numbering of the snapshots, to identify the latest snapshot
    with certainty.

Commands are also available to view the list of snapshots present and to
delete snapshots.
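The "no other run already started" check could be done with a lock file via flock, for example (a hypothetical sketch; the script's actual mechanism is not shown in this thread, and the lock-file path is my own choice):

```shell
#!/bin/bash
# Sketch of a single-instance guard using flock on a lock file.
# LOCKFILE is a hypothetical default; the script's real path may differ.
LOCKFILE=${LOCKFILE:-${TMPDIR:-/tmp}/btrfsManage.lock}

with_lock() {
    exec 9>"$LOCKFILE" || return 1
    if ! flock -n 9; then          # -n: fail instead of waiting
        echo "another btrfsManage run is already in progress" >&2
        return 1
    fi
    "$@"                           # run the command while holding the lock
    local rc=$?
    flock -u 9                     # release the lock explicitly
    return $rc
}
```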

For example:

btrfsManage SCRUB /
btrfsManage SNAPSHOT /
btrfsManage SEND / /dev/sda1
btrfsManage SEND / r...@gdb.exnet.it/dev/sda1
btrfsManage SNAPLIST /dev/sda1
btrfsManage SNAPDEL /dev/sda1 "root-2016-11*"

Are you interested?

Gdb





[no subject]

2016-11-09 Thread bepi
Hi.

I'm making a script for managing btrfs.

It performs scrubs, and creates and sends backup snapshots (or a copy of the
current state of the data), even to a remote system.

The script is designed to:
- Be easy to use:
  - The preparation is carried out automatically.
  - Autodetection of the mounted subvolumes.
- Be safe and robust:
  - Checks that no other btrfs management run has already started.
  - Subvolumes for created and received snapshots are mounted and accessible
    only for the time necessary to perform the requested operation.
  - Verifies that snapshot creation and snapshot sending have completed
    fully.
  - Progressive numbering of the snapshots, to identify the latest snapshot
    with certainty.

Commands are also available to view the list of snapshots present and to
delete snapshots.

For example:

btrfsManage SCRUB /
btrfsManage SNAPSHOT /
btrfsManage SEND / /dev/sda1
btrfsManage SEND / r...@gdb.exnet.it/dev/sda1
btrfsManage SNAPLIST /dev/sda1
btrfsManage SNAPDEL /dev/sda1 "root-2016-11*"

Are you interested?

Gdb


