Re: btrfs issue with mariadb incremental backup
On Sat, Aug 12, 2017 at 9:40 PM, wrote:
> [root@backuplogC7 ~]# rsync -avnc /var/lib/mariadb/mysql_201708090830/
> root@192.168.45.166://var/lib/mariadb/mysql_201708090830/
> sending incremental file list
> ./
> ib_logfile1
> ibdata1
>
> sent 3779 bytes received 25 bytes 507.20 bytes/sec
> total size is 718361496 speedup is 188843.72 (DRY RUN)

OK, so I don't think this can be a sync-related problem. That snapshot was committed to disk days ago.

There's definitely something wrong with the incremental send/receive, but it's unclear whether this is a kernel bug (send side) or a btrfs-progs bug (receive side), or whether there's any chance of file system corruption or confusion affecting either of the two subvolumes on the origin or the parent subvolume on the destination. So you're really in the weeds on what to do next.

Try deleting the mysql_201708090830/ snapshot on the destination, then resend it, but this time do a full send of that snapshot: don't use -p. I wonder whether a full send, rather than an incremental one, makes a difference. Follow it up with the rsync command to compare origin and destination.

--
Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
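The three suggested steps (delete on the destination, full send without -p, rsync comparison) could be written out roughly as below. This is only a sketch: it prints the commands rather than running them, the function name full_resend_plan is made up for illustration, and the default directory and host are simply the ones seen in this thread.

```shell
#!/bin/sh
# Sketch: print the three commands for a full (non-incremental) resend.
# Nothing is executed. The default directory and host are taken from this
# thread and are assumptions for any other setup.

# full_resend_plan SNAPSHOT [DIR] [DEST_HOST]
full_resend_plan() {
    snap=$1
    dir=${2:-/var/lib/mariadb}
    host=${3:-192.168.45.166}
    # 1. Remove the suspect snapshot on the destination.
    printf 'ssh %s btrfs subvolume delete %s/%s\n' "$host" "$dir" "$snap"
    # 2. Full send: no -p, so the stream does not depend on a parent snapshot.
    printf 'btrfs send %s/%s | ssh %s btrfs receive %s\n' "$dir" "$snap" "$host" "$dir"
    # 3. Checksum dry run; the trailing slashes compare directory contents.
    printf 'rsync -avnc %s/%s/ root@%s:%s/%s/\n' "$dir" "$snap" "$host" "$dir" "$snap"
}
```

Calling `full_resend_plan mysql_201708090830` prints the plan for review; the lines can then be run by hand one at a time, checking the result of each before moving on.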
[PATCH 3/4] btrfs: decode compress type for tracing
With this change, the trace output shows the compression type as a string.

Signed-off-by: Anand Jain
---
 include/trace/events/btrfs.h | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index c4e4b9427b81..d412c49f5a6a 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -190,6 +190,12 @@ DEFINE_EVENT(btrfs__inode, btrfs_inode_evict,
 		{ (1 << EXTENT_FLAG_FILLING),	"FILLING"	},\
 		{ (1 << EXTENT_FLAG_FS_MAPPING),"FS_MAPPING"})
 
+#define show_compress_type(type)	\
+	__print_symbolic(type,	\
+		{ BTRFS_COMPRESS_NONE,	"none" },	\
+		{ BTRFS_COMPRESS_ZLIB,	"zlib" },	\
+		{ BTRFS_COMPRESS_LZO,	"lzo" })
+
 TRACE_EVENT_CONDITION(btrfs_get_extent,
 
 	TP_PROTO(struct btrfs_root *root, struct btrfs_inode *inode,
@@ -228,7 +234,7 @@ TRACE_EVENT_CONDITION(btrfs_get_extent,
 	TP_printk_btrfs("root=%llu(%s) ino=%llu start=%llu len=%llu "
 		  "orig_start=%llu block_start=%llu(%s) "
 		  "block_len=%llu flags=%s refs=%u "
-		  "compress_type=%u",
+		  "compress_type=%s",
 		  show_root_type(__entry->root_objectid),
 		  (unsigned long long)__entry->ino,
 		  (unsigned long long)__entry->start,
@@ -236,8 +242,8 @@ TRACE_EVENT_CONDITION(btrfs_get_extent,
 		  (unsigned long long)__entry->orig_start,
 		  show_map_type(__entry->block_start),
 		  (unsigned long long)__entry->block_len,
-		  show_map_flags(__entry->flags),
-		  __entry->refs, __entry->compress_type)
+		  show_map_flags(__entry->flags), __entry->refs,
+		  show_compress_type(__entry->compress_type))
 );
 
 /* file extent item */
@@ -285,14 +291,14 @@ DECLARE_EVENT_CLASS(btrfs__file_extent_item_regular,
 		"file extent range=[%llu %llu] "
 		"(num_bytes=%llu ram_bytes=%llu disk_bytenr=%llu "
 		"disk_num_bytes=%llu extent_offset=%llu type=%s "
-		"compression=%u",
+		"compression=%s",
 		show_root_type(__entry->root_obj), __entry->ino,
 		__entry->isize,
 		__entry->disk_isize, __entry->extent_start,
 		__entry->extent_end, __entry->num_bytes, __entry->ram_bytes,
 		__entry->disk_bytenr, __entry->disk_num_bytes,
 		__entry->extent_offset, show_fi_type(__entry->extent_type),
-		__entry->compression)
+		show_compress_type(__entry->compression))
 );
 
 DECLARE_EVENT_CLASS(
@@ -329,11 +335,11 @@ DECLARE_EVENT_CLASS(
 	TP_printk_btrfs(
 		"root=%llu(%s) inode=%llu size=%llu disk_isize=%llu "
 		"file extent range=[%llu %llu] "
-		"extent_type=%s compression=%u",
+		"extent_type=%s compression=%s",
 		show_root_type(__entry->root_obj), __entry->ino, __entry->isize,
 		__entry->disk_isize, __entry->extent_start,
 		__entry->extent_end, show_fi_type(__entry->extent_type),
-		__entry->compression)
+		show_compress_type(__entry->compression))
 );
 
 DEFINE_EVENT(
@@ -424,7 +430,7 @@ DECLARE_EVENT_CLASS(btrfs__ordered_extent,
 	TP_printk_btrfs("root=%llu(%s) ino=%llu file_offset=%llu "
 		  "start=%llu len=%llu disk_len=%llu "
 		  "truncated_len=%llu "
-		  "bytes_left=%llu flags=%s compress_type=%d "
+		  "bytes_left=%llu flags=%s compress_type=%s "
 		  "refs=%d",
 		  show_root_type(__entry->root_objectid),
 		  (unsigned long long)__entry->ino,
@@ -435,7 +441,8 @@ DECLARE_EVENT_CLASS(btrfs__ordered_extent,
 		  (unsigned long long)__entry->truncated_len,
 		  (unsigned long long)__entry->bytes_left,
 		  show_ordered_flags(__entry->flags),
-		  __entry->compress_type, __entry->refs)
+		  show_compress_type(__entry->compress_type),
+		  __entry->refs)
 );
 
 DEFINE_EVENT(btrfs__ordered_extent, btrfs_ordered_extent_add,
-- 
2.13.1
[PATCH 2/4] btrfs: convert enum btrfs_compression_type to define
The list of compression types is short enough to manage with plain defines. Defines also make it easy to print the types in tracing.

Signed-off-by: Anand Jain
---
 fs/btrfs/compression.h          | 7 -------
 fs/btrfs/super.c                | 2 +-
 include/uapi/linux/btrfs_tree.h | 5 +++++
 3 files changed, 6 insertions(+), 8 deletions(-)

diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index e749df6dd39a..f28a501e7828 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -95,13 +95,6 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
 blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 				 int mirror_num, unsigned long bio_flags);
 
-enum btrfs_compression_type {
-	BTRFS_COMPRESS_NONE  = 0,
-	BTRFS_COMPRESS_ZLIB  = 1,
-	BTRFS_COMPRESS_LZO   = 2,
-	BTRFS_COMPRESS_TYPES = 2,
-};
-
 struct btrfs_compress_op {
 	struct list_head *(*alloc_workspace)(void);
 
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 12540b6104b5..b711357352f8 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -404,7 +404,7 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 	int ret = 0;
 	char *compress_type;
 	bool compress_force = false;
-	enum btrfs_compression_type saved_compress_type;
+	unsigned int saved_compress_type;
 	bool saved_compress_force;
 	int no_compress = 0;
 
diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
index 10689e1fdf11..7a1fec2d10ab 100644
--- a/include/uapi/linux/btrfs_tree.h
+++ b/include/uapi/linux/btrfs_tree.h
@@ -733,6 +733,11 @@ struct btrfs_balance_item {
 #define BTRFS_FILE_EXTENT_REG 1
 #define BTRFS_FILE_EXTENT_PREALLOC 2
 
+#define BTRFS_COMPRESS_NONE	0
+#define BTRFS_COMPRESS_TYPES	2
+#define BTRFS_COMPRESS_ZLIB	1
+#define BTRFS_COMPRESS_LZO	2
+
 struct btrfs_file_extent_item {
 	/*
 	 * transaction id that created this extent
-- 
2.13.1
[PATCH 1/4] btrfs: remove unused BTRFS_COMPRESS_LAST
BTRFS_COMPRESS_LAST is unused, so remove it.

Signed-off-by: Anand Jain
---
 fs/btrfs/compression.h | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index 87f6d3332163..e749df6dd39a 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -100,7 +100,6 @@ enum btrfs_compression_type {
 	BTRFS_COMPRESS_ZLIB  = 1,
 	BTRFS_COMPRESS_LZO   = 2,
 	BTRFS_COMPRESS_TYPES = 2,
-	BTRFS_COMPRESS_LAST  = 3,
 };
 
 struct btrfs_compress_op {
-- 
2.13.1
[PATCH 4/4 v3] btrfs: add compression trace points
From: Anand Jain

This patch adds compression and decompression trace points for debugging.

Signed-off-by: Anand Jain
Reviewed-by: Nikolay Borisov
---
v3:
 . Rename to simple names, without worrying about being compatible
   with future naming.
 . The type was not working; fixed it.
v2:
 . Use better naming.
   (If transform is not good enough I have run out of ideas, pls suggest).
 . To be applied on top of
   git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git for-next
   (tested without namelen check patch set)

 fs/btrfs/compression.c       | 11 +++++++++++
 include/trace/events/btrfs.h | 39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 50 insertions(+)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index d2ef9ac2a630..4a652f67ee87 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -895,6 +895,10 @@ int btrfs_compress_pages(int type, struct address_space *mapping,
 						      start, pages,
 						      out_pages,
 						      total_in, total_out);
+
+	trace_btrfs_compress(1, 1, mapping->host, type, *total_in,
+					*total_out, start, ret);
+
 	free_workspace(type, workspace);
 	return ret;
 }
@@ -921,6 +925,10 @@ static int btrfs_decompress_bio(struct compressed_bio *cb)
 
 	workspace = find_workspace(type);
 	ret = btrfs_compress_op[type - 1]->decompress_bio(workspace, cb);
+
+	trace_btrfs_compress(0, 0, cb->inode, type,
+				cb->compressed_len, cb->len, cb->start, ret);
+
 	free_workspace(type, workspace);
 
 	return ret;
@@ -943,6 +951,9 @@ int btrfs_decompress(int type, unsigned char *data_in, struct page *dest_page,
 						  dest_page, start_byte,
 						  srclen, destlen);
 
+	trace_btrfs_compress(0, 1, dest_page->mapping->host,
+				type, srclen, destlen, start_byte, ret);
+
 	free_workspace(type, workspace);
 	return ret;
 }
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index d412c49f5a6a..db33d6649d12 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -1629,6 +1629,45 @@ TRACE_EVENT(qgroup_meta_reserve,
 		show_root_type(__entry->refroot), __entry->diff)
 );
 
+TRACE_EVENT(btrfs_compress,
+
+	TP_PROTO(int compress, int page, struct inode *inode,
+			unsigned int type,
+			unsigned long len_before, unsigned long len_after,
+			unsigned long start, int ret),
+
+	TP_ARGS(compress, page, inode, type, len_before,
+			len_after, start, ret),
+
+	TP_STRUCT__entry_btrfs(
+		__field(int,		compress)
+		__field(int,		page)
+		__field(ino_t,		i_ino)
+		__field(unsigned int,	type)
+		__field(unsigned long,	len_before)
+		__field(unsigned long,	len_after)
+		__field(unsigned long,	start)
+		__field(int,		ret)
+	),
+
+	TP_fast_assign_btrfs(btrfs_sb(inode->i_sb),
+		__entry->compress	= compress;
+		__entry->page		= page;
+		__entry->i_ino		= inode->i_ino;
+		__entry->type		= type;
+		__entry->len_before	= len_before;
+		__entry->len_after	= len_after;
+		__entry->start		= start;
+		__entry->ret		= ret;
+	),
+
+	TP_printk_btrfs("%s %s ino=%lu type=%s len_before=%lu len_after=%lu start=%lu ret=%d",
+		__entry->compress ? "compress":"uncompress",
+		__entry->page ? "page":"bio", __entry->i_ino,
+		show_compress_type(__entry->type),
+		__entry->len_before, __entry->len_after, __entry->start,
+		__entry->ret)
+);
 #endif /* _TRACE_BTRFS_H */
 
 /* This part must be outside protection */
-- 
2.13.1
[PATCH 0/4] misc compression tracing related patches
Anand Jain (4):
  btrfs: remove unused BTRFS_COMPRESS_LAST
  btrfs: convert enum btrfs_compression_type to define
  btrfs: decode compress type for tracing
  btrfs: add compression trace points

 fs/btrfs/compression.c          | 11 +++
 fs/btrfs/compression.h          |  8 --
 fs/btrfs/super.c                |  2 +-
 include/trace/events/btrfs.h    | 64 +++--
 include/uapi/linux/btrfs_tree.h |  5
 5 files changed, 72 insertions(+), 18 deletions(-)

-- 
2.13.1
[PATCH] btrfs: use BTRFS_FSID_SIZE for fsid
We have a define for the FSID size, so use it.

Signed-off-by: Anand Jain
---
 include/trace/events/btrfs.h | 18 +++++++++---------
 1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index cd99a3658156..c4e4b9427b81 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -73,11 +73,11 @@ struct btrfs_qgroup;
 	{ BTRFS_BLOCK_GROUP_RAID5,	"RAID5"}, \
 	{ BTRFS_BLOCK_GROUP_RAID6,	"RAID6"}
 
-#define BTRFS_UUID_SIZE 16
-#define TP_STRUCT__entry_fsid __array(u8, fsid, BTRFS_UUID_SIZE)
+#define BTRFS_FSID_SIZE 16
+#define TP_STRUCT__entry_fsid __array(u8, fsid, BTRFS_FSID_SIZE)
 
 #define TP_fast_assign_fsid(fs_info) \
-	memcpy(__entry->fsid, fs_info->fsid, BTRFS_UUID_SIZE)
+	memcpy(__entry->fsid, fs_info->fsid, BTRFS_FSID_SIZE)
 
 #define TP_STRUCT__entry_btrfs(args...) \
 	TP_STRUCT__entry( \
@@ -612,7 +612,7 @@ TRACE_EVENT(btrfs_add_block_group,
 	TP_ARGS(fs_info, block_group, create),
 
 	TP_STRUCT__entry(
-		__array(u8,	fsid,	BTRFS_UUID_SIZE	)
+		__array(u8,	fsid,	BTRFS_FSID_SIZE	)
 		__field(u64,	offset	)
 		__field(u64,	size	)
 		__field(u64,	flags	)
@@ -622,7 +622,7 @@ TRACE_EVENT(btrfs_add_block_group,
 	),
 
 	TP_fast_assign(
-		memcpy(__entry->fsid, fs_info->fsid, BTRFS_UUID_SIZE);
+		memcpy(__entry->fsid, fs_info->fsid, BTRFS_FSID_SIZE);
 		__entry->offset = block_group->key.objectid;
 		__entry->size = block_group->key.offset;
 		__entry->flags = block_group->flags;
@@ -969,7 +969,7 @@ TRACE_EVENT(btrfs_trigger_flush,
 	TP_ARGS(fs_info, flags, bytes, flush, reason),
 
 	TP_STRUCT__entry(
-		__array(u8,	fsid,	BTRFS_UUID_SIZE	)
+		__array(u8,	fsid,	BTRFS_FSID_SIZE	)
 		__field(u64,	flags	)
 		__field(u64,	bytes	)
 		__field(int,	flush	)
@@ -977,7 +977,7 @@ TRACE_EVENT(btrfs_trigger_flush,
 	),
 
 	TP_fast_assign(
-		memcpy(__entry->fsid, fs_info->fsid, BTRFS_UUID_SIZE);
+		memcpy(__entry->fsid, fs_info->fsid, BTRFS_FSID_SIZE);
 		__entry->flags = flags;
 		__entry->bytes = bytes;
 		__entry->flush = flush;
@@ -1010,7 +1010,7 @@ TRACE_EVENT(btrfs_flush_space,
 	TP_ARGS(fs_info, flags, num_bytes, orig_bytes, state, ret),
 
 	TP_STRUCT__entry(
-		__array(u8,	fsid,	BTRFS_UUID_SIZE	)
+		__array(u8,	fsid,	BTRFS_FSID_SIZE	)
 		__field(u64,	flags	)
 		__field(u64,	num_bytes	)
 		__field(u64,	orig_bytes	)
@@ -1019,7 +1019,7 @@ TRACE_EVENT(btrfs_flush_space,
 	),
 
 	TP_fast_assign(
-		memcpy(__entry->fsid, fs_info->fsid, BTRFS_UUID_SIZE);
+		memcpy(__entry->fsid, fs_info->fsid, BTRFS_FSID_SIZE);
 		__entry->flags = flags;
 		__entry->num_bytes = num_bytes;
 		__entry->orig_bytes = orig_bytes;
-- 
2.13.1
Re: btrfs issue with mariadb incremental backup
Hi Chris,

I started over following your suggestion. The difference first shows up with snapshot mysql_201708090830, which I sent manually. What should I do next?

- delete all the bad/mismatching snapshots only on the destination computer.

[root@joytest ~]# date
Sun Aug 13 10:27:23 ICT 2017
[root@joytest ~]# cd /var/lib/mariadb
[root@joytest mariadb]# btrfs sub list .
ID 313 gen 220 top level 5 path mysql_201708070830
ID 316 gen 199 top level 5 path mysql_201708080830
ID 318 gen 205 top level 5 path mysql_201708090830
ID 320 gen 211 top level 5 path mysql_201708100830
ID 322 gen 219 top level 5 path mysql_201708110830
ID 323 gen 219 top level 5 path mysql_201708120830
ID 324 gen 224 top level 5 path mysql_201708130830
ID 325 gen 225 top level 5 path mysql
[root@joytest mariadb]# btrfs sub del mysql_201708130830
Delete subvolume (no-commit): '/var/lib/mariadb/mysql_201708130830'
[root@joytest mariadb]# btrfs sub del mysql_201708120830
Delete subvolume (no-commit): '/var/lib/mariadb/mysql_201708120830'
[root@joytest mariadb]# btrfs sub del mysql_201708110830
Delete subvolume (no-commit): '/var/lib/mariadb/mysql_201708110830'
[root@joytest mariadb]# btrfs sub del mysql_201708100830
Delete subvolume (no-commit): '/var/lib/mariadb/mysql_201708100830'
[root@joytest mariadb]# btrfs sub del mysql_201708090830
Delete subvolume (no-commit): '/var/lib/mariadb/mysql_201708090830'
[root@joytest mariadb]# btrfs sub sync .
[root@joytest mariadb]# systemctl status mariadb
mariadb.service - MariaDB database server
   Loaded: loaded (/usr/lib/systemd/system/mariadb.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Sun 2017-08-13 09:07:00 ICT; 1h 24min ago
  Process: 19871 ExecStartPost=/usr/libexec/mariadb-wait-ready $MAINPID (code=exited, status=1/FAILURE)
  Process: 19870 ExecStart=/usr/bin/mysqld_safe --basedir=/usr (code=exited, status=0/SUCCESS)
  Process: 19842 ExecStartPre=/usr/libexec/mariadb-prepare-db-dir %n (code=exited, status=0/SUCCESS)
 Main PID: 19870 (code=exited, status=0/SUCCESS)

Aug 13 09:06:58 joytest systemd[1]: Starting MariaDB database server...
Aug 13 09:06:58 joytest mysqld_safe[19870]: 170813 09:06:58 mysqld_safe Logging to '/var/log/mariadb/mariadb.log'.
Aug 13 09:06:58 joytest mysqld_safe[19870]: 170813 09:06:58 mysqld_safe Starting mysqld daemon with databases from /var/lib/mariadb/mysql
Aug 13 09:07:00 joytest systemd[1]: mariadb.service: control process exited, code=exited status=1
Aug 13 09:07:00 joytest systemd[1]: Failed to start MariaDB database server.
Aug 13 09:07:00 joytest systemd[1]: Unit mariadb.service entered failed state.
Aug 13 09:07:00 joytest systemd[1]: mariadb.service failed.
[root@joytest mariadb]# btrfs sub list .
ID 313 gen 220 top level 5 path mysql_201708070830
ID 316 gen 199 top level 5 path mysql_201708080830
ID 325 gen 225 top level 5 path mysql
[root@joytest mariadb]#

- The most recent good snapshot pair, which rsync shows origin and destination match, is mysql_201708080830 so you can keep that one on both sides.

[root@backuplogC7 ~]# btrfs sub list /var/lib/mariadb
ID 257 gen 538 top level 5 path mysql
ID 316 gen 498 top level 5 path mysql_201708060830
ID 317 gen 503 top level 5 path mysql_201708070830
ID 318 gen 507 top level 5 path mysql_201708080830
ID 319 gen 514 top level 5 path mysql_201708090830
ID 320 gen 524 top level 5 path mysql_201708100830
ID 321 gen 529 top level 5 path mysql_201708110830
ID 322 gen 533 top level 5 path mysql_201708120830
ID 323 gen 538 top level 5 path mysql_201708130830
[root@backuplogC7 ~]# rsync -avnc /var/lib/mariadb/mysql_201708070830/ root@192.168.45.166://var/lib/mariadb/mysql_201708070830/
sending incremental file list
./

sent 3773 bytes received 19 bytes 842.67 bytes/sec
total size is 718361496 speedup is 189441.32 (DRY RUN)
[root@backuplogC7 ~]# rsync -avnc /var/lib/mariadb/mysql_201708080830/ root@192.168.45.166://var/lib/mariadb/mysql_201708080830/
sending incremental file list
./

sent 3769 bytes received 19 bytes 841.78 bytes/sec
total size is 718361496 speedup is 189641.37 (DRY RUN)
[root@backuplogC7 ~]# date
Sun Aug 13 10:34:05 ICT 2017
[root@backuplogC7 ~]#

- manually do incremental send/receive, starting with mysql_201708090830/, to make the destination current again with the origin.

[root@backuplogC7 ~]# btrfs send -p /var/lib/mariadb/mysql_201708080830 /var/lib/mariadb/mysql_201708090830 | ssh 192.168.45.166 btrfs receive /var/lib/mariadb
At subvol /var/lib/mariadb/mysql_201708090830
At snapshot mysql_201708090830
[root@backuplogC7 ~]# rsync -avnc /var/lib/mariadb/mysql_201708090830/ root@192.168.45.166://var/lib/mariadb/mysql_201708090830/
sending incremental file list
./
ib_logfile1
ibdata1

sent 3779 bytes received 25 bytes 507.20 bytes/sec
total size is 718361496 speedup is 188843.72 (DRY RUN)
[root@backuplogC7 ~]#

Best Regards,
Siranee Jaraswachirakul.

> On Sat, Aug 12, 2017 at 8:20 PM, wrote:
>
>> [root@backuplogC7 ~]# rsync -avnc
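The mismatching files in dry-run output like the above can be picked out mechanically. A minimal sketch, assuming the plain -v output layout seen in this thread (a header line, one path per line, a blank line, two summary lines); the function name rsync_mismatches is made up for illustration:

```shell
#!/bin/sh
# Sketch: list the files that `rsync -avnc` reports as differing.
# Assumes the plain -v output layout shown in this thread; itemized
# output (-i) would need different patterns.

# rsync_mismatches: read rsync -avnc output on stdin, print differing paths.
rsync_mismatches() {
    # Delete the header, directory entries such as "./", blank lines and
    # the two summary lines. What remains are the paths rsync would copy.
    sed -e '/^sending incremental file list$/d' \
        -e '/\/$/d' \
        -e '/^$/d' \
        -e '/^sent [0-9,]* bytes  *received /d' \
        -e '/^total size is /d'
}
```

For example, `rsync -avnc SRC/ HOST:DEST/ | rsync_mismatches` prints nothing for the 0807 and 0808 snapshots above, and ib_logfile1 and ibdata1 for 0809.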
Re: btrfs issue with mariadb incremental backup
On Sat, Aug 12, 2017 at 8:20 PM, wrote:
> [root@backuplogC7 ~]# rsync -avnc /var/lib/mariadb/mysql_201708090830
> root@192.168.45.166://var/lib/mariadb/mysql_201708090830

You need a trailing / on the first directory when using the -a option.

rsync -a dir dir

is not the same command as

rsync -a dir/ dir

It's confusing, but your command is trying to create the source's mysql_201708090830 directory inside mysql_201708090830 on the destination. That is why everything mismatches. To make it mean "contents of", you need a trailing slash on at least the origin.

--
Chris Murphy
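In a script, one way to avoid this mistake is to normalize the source path before handing it to rsync. A minimal sketch; the helper name with_trailing_slash is an illustrative assumption, not an rsync feature:

```shell
#!/bin/sh
# Sketch: make sure an rsync source path means "contents of this directory".

# with_trailing_slash PATH: print PATH with exactly one trailing slash.
# Fails on an empty argument so that "" never silently becomes "/".
with_trailing_slash() {
    [ -n "$1" ] || return 1
    path=$1
    # Strip any trailing slashes, then add a single one back.
    while [ "${path%/}" != "$path" ]; do
        path=${path%/}
    done
    printf '%s/\n' "$path"
}
```

It could then be used as `rsync -avnc "$(with_trailing_slash "$src")" "root@$host:$dest/"`, where src, host and dest are whatever variables the script already has.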
Re: btrfs issue with mariadb incremental backup
Hi Chris,

I did as you suggested and the result was bad, so I decided to start over from snapshot mysql_201708070830 and send the incrementals manually. rsync always reported "a log diff", but the destination could still start mariadb until snapshot "mysql_201708100830", where mariadb could not start. The results follow.

- delete all the bad/mismatching snapshots only on the destination computer.

tpcorp@virtualtrust3:~$ lxc exec joytest -- bash
[root@joytest ~]# btrfs sub list /var/lib/mariadb
ID 298 gen 177 top level 5 path mysql_201708070830
ID 301 gen 148 top level 5 path mysql_201708080830
ID 303 gen 156 top level 5 path mysql_201708090830
ID 305 gen 162 top level 5 path mysql_201708100830
ID 309 gen 175 top level 5 path mysql_201708110830
ID 310 gen 176 top level 5 path mysql
ID 311 gen 180 top level 5 path mysql_201708120830
[root@joytest ~]# cd /var/lib/mariadb
[root@joytest mariadb]# btrfs sub list .
ID 298 gen 177 top level 5 path mysql_201708070830
ID 301 gen 148 top level 5 path mysql_201708080830
ID 303 gen 156 top level 5 path mysql_201708090830
ID 305 gen 162 top level 5 path mysql_201708100830
ID 309 gen 175 top level 5 path mysql_201708110830
ID 310 gen 176 top level 5 path mysql
ID 311 gen 180 top level 5 path mysql_201708120830
[root@joytest mariadb]# btrfs sub del mysql_201708090830
Delete subvolume (no-commit): '/var/lib/mariadb/mysql_201708090830'
[root@joytest mariadb]# btrfs sub del mysql_201708100830
Delete subvolume (no-commit): '/var/lib/mariadb/mysql_201708100830'
[root@joytest mariadb]# btrfs sub del mysql_201708110830
Delete subvolume (no-commit): '/var/lib/mariadb/mysql_201708110830'
[root@joytest mariadb]# btrfs sub del mysql_201708120830
Delete subvolume (no-commit): '/var/lib/mariadb/mysql_201708120830'
[root@joytest mariadb]# btrfs sub sync .

- The most recent good snapshot pair, which rsync shows origin and destination match, is mysql_201708080830 so you can keep that one on both sides.

[root@joytest mariadb]# btrfs sub list .
ID 298 gen 177 top level 5 path mysql_201708070830
ID 301 gen 148 top level 5 path mysql_201708080830
ID 310 gen 176 top level 5 path mysql

- manually do incremental send/receive, starting with mysql_201708090830/, to make the destination current again with the origin.

tpcorp@virtualtrust3:~$ lxc exec backuplogC7 -- bash
[root@backuplogC7 ~]# btrfs sub list /var/lib/mariadb
ID 257 gen 537 top level 5 path mysql
ID 316 gen 498 top level 5 path mysql_201708060830
ID 317 gen 503 top level 5 path mysql_201708070830
ID 318 gen 507 top level 5 path mysql_201708080830
ID 319 gen 514 top level 5 path mysql_201708090830
ID 320 gen 524 top level 5 path mysql_201708100830
ID 321 gen 529 top level 5 path mysql_201708110830
ID 322 gen 533 top level 5 path mysql_201708120830
[root@backuplogC7 ~]# more /var/log/btrfs_send.log
Start Send 201708040905
btrfs send -p /var/lib/mariadb/mysql_201708030830 /var/lib/mariadb/mysql_201708040830 | ssh 192.168.45.166 btrfs receive /var/lib/mariadb
At snapshot mysql_201708040830
Stop Send 201708040905
Start Send 201708050905
btrfs send -p /var/lib/mariadb/mysql_201708040830 /var/lib/mariadb/mysql_201708050830 | ssh 192.168.45.166 btrfs receive /var/lib/mariadb
At snapshot mysql_201708050830
Stop Send 201708050905
Start Send 201708060905
btrfs send -p /var/lib/mariadb/mysql_201708050830 /var/lib/mariadb/mysql_201708060830 | ssh 192.168.45.166 btrfs receive /var/lib/mariadb
At snapshot mysql_201708060830
Stop Send 201708060905
Start Send 201708070905
btrfs send -p /var/lib/mariadb/mysql_201708060830 /var/lib/mariadb/mysql_201708070830 | ssh 192.168.45.166 btrfs receive /var/lib/mariadb
At snapshot mysql_201708070830
Stop Send 201708070905
Start Send 201708080905
btrfs send -p /var/lib/mariadb/mysql_201708070830 /var/lib/mariadb/mysql_201708080830 | ssh 192.168.45.166 btrfs receive /var/lib/mariadb
At snapshot mysql_201708080830
Stop Send 201708080905
Start Send 201708090905
btrfs send -p /var/lib/mariadb/mysql_201708080830 /var/lib/mariadb/mysql_201708090830 | ssh 192.168.45.166 btrfs receive /var/lib/mariadb
At snapshot mysql_201708090830
Stop Send 201708090905
Start Send 201708100905
btrfs send -p /var/lib/mariadb/mysql_201708090830 /var/lib/mariadb/mysql_201708100830 | ssh 192.168.45.166 btrfs receive /var/lib/mariadb
At snapshot mysql_201708100830
Stop Send 201708100905
Start Send 201708110905
btrfs send -p /var/lib/mariadb/mysql_201708100830 /var/lib/mariadb/mysql_201708110830 | ssh 192.168.45.166 btrfs receive /var/lib/mariadb
At snapshot mysql_201708110830
Stop Send 201708110905
Start Send 201708120905
btrfs send -p /var/lib/mariadb/mysql_201708110830 /var/lib/mariadb/mysql_201708120830 | ssh 192.168.45.166 btrfs receive /var/lib/mariadb
At snapshot mysql_201708120830
Stop Send 201708120905
[root@backuplogC7 ~]# btrfs send -p /var/lib/mariadb/mysql_201708080830 /var/lib/mariadb/mysql_201708090830 | ssh 192.168.45.166 btrfs receive /var/lib/mariadb
At subvol /var/lib/mariadb/mysql_201708090830
At snapshot mysql_201708090830
Re: btrfs issue with mariadb incremental backup
On Sat, Aug 12, 2017 at 4:49 PM, Hugo Mills wrote:
> On Sat, Aug 12, 2017 at 03:34:01PM -0600, Chris Murphy wrote:
>> On Fri, Aug 11, 2017 at 11:08 PM, wrote:
>>
>> > The backup script has the btrfs sync command since Aug 3
>>
>> From your script:
>> > system btrfs sub snap -r $basepath $snappath
>> > system btrfs sub sync $basepath
>>
>> From the man page: sync [subvolid...]
>>    Wait until given subvolume(s) are completely removed from the
>>    filesystem after deletion.
>>
>> This 'subvolume sync' command, per the man page, is only about
>> subvolume deletion. I suggest replacing it with a regular sync
>> command.
>>
>> I think the problem is that the script does things so fast that the
>> snapshot is not always consistent on disk before btrfs send starts.
>> It's just a guess though. If I'm right, this means the rsync mismatches
>> mean the destination snapshots are bad. Here's what I would do:
>
>    I don't see how that can happen. Snapshots are atomic -- they're
> either there or not there. It's not a matter even of copying the
> metadata part of the subvol. It's literally just adding a pointer to
> point at an existing FS tree.

Do snapshots come with a sync and flush to disk? Or does a snapshot just set a checkpoint, so that extents not yet committed before the snapshot aren't guaranteed to be on disk until the next commit time or sync?

Nevertheless, the wiki explicitly says to do a sync after taking a snapshot in the context of send/receive.

https://btrfs.wiki.kernel.org/index.php/Incremental_Backup#Initial_Bootstrapping

And people on the list have had this "stale NFS file handle" error. I have no idea if it's still a problem with current kernels, or if it applies to the 4.4 kernel.

Anyway, the OP does appear to be having a real problem. rsync very clearly shows that a given snapshot transferred with incremental send/receive differs by *a lot* between the source and destination machines.

--
Chris Murphy
Re: btrfs issue with mariadb incremental backup
On Sat, Aug 12, 2017 at 4:41 PM, Janos Toth F. wrote:
> On Sat, Aug 12, 2017 at 11:34 PM, Chris Murphy wrote:
>> On Fri, Aug 11, 2017 at 11:08 PM, wrote:
>>
>> I think the problem is that the script does things so fast that the
>> snapshot is not always consistent on disk before btrfs send starts.
>> It's just a guess though. If I'm right, this means the rsync mismatches
>> mean the destination snapshots are bad.
>
> Hmm. I was wondering about this exact issue the other day when I
> fiddled with my own backup script.
> - Should I issue a sync between creating a snapshot and starting to
> rsync (or send/receive) the content of that fresh snapshot to an
> external backup storage?

A sync won't hurt, but I think btrfs send itself should include the sync if it is still required, as the wiki states. And if the problem described in the wiki has been fixed, it would be useful to know when, but a git log search doesn't reveal anything related to NFS and Btrfs.

--
Chris Murphy
Re: btrfs issue with mariadb incremental backup
On Sat, Aug 12, 2017 at 03:34:01PM -0600, Chris Murphy wrote:
> On Fri, Aug 11, 2017 at 11:08 PM, wrote:
>
> > The backup script has the btrfs sync command since Aug 3
>
> From your script:
> > system btrfs sub snap -r $basepath $snappath
> > system btrfs sub sync $basepath
>
> From the man page: sync [subvolid...]
>    Wait until given subvolume(s) are completely removed from the
>    filesystem after deletion.
>
> This 'subvolume sync' command, per the man page, is only about
> subvolume deletion. I suggest replacing it with a regular sync
> command.
>
> I think the problem is that the script does things so fast that the
> snapshot is not always consistent on disk before btrfs send starts.
> It's just a guess though. If I'm right, this means the rsync mismatches
> mean the destination snapshots are bad. Here's what I would do:

I don't see how that can happen. Snapshots are atomic -- they're either there or not there. It's not a matter even of copying the metadata part of the subvol. It's literally just adding a pointer to point at an existing FS tree.

Hugo.

--
Hugo Mills             | If it's December 1941 in Casablanca, what time is it
hugo@... carfax.org.uk | in New York?
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                              Rick Blaine, Casablanca
Re: btrfs issue with mariadb incremental backup
On Sat, Aug 12, 2017 at 11:34 PM, Chris Murphy wrote:
> On Fri, Aug 11, 2017 at 11:08 PM, wrote:
>
> I think the problem is that the script does things so fast that the
> snapshot is not always consistent on disk before btrfs send starts.
> It's just a guess though. If I'm right, this means the rsync mismatches
> mean the destination snapshots are bad.

Hmm. I was wondering about this exact issue the other day when I fiddled with my own backup script.
- Should I issue a sync between creating a snapshot and starting to rsync (or send/receive) the content of that fresh snapshot to an external backup storage?

I dismissed the thought because I figured rsync should see the proper state regardless of whether some data is still waiting in the system write-back cache. Now I am confused again. Does this concern only send/receive? (I don't use that feature yet, but I have thought about switching to it from rsync.)
Re: btrfs issue with mariadb incremental backup
On Fri, Aug 11, 2017 at 11:08 PM, wrote:

> The backup script has the btrfs sync command since Aug 3

From your script:
> system btrfs sub snap -r $basepath $snappath
> system btrfs sub sync $basepath

From the man page: sync [subvolid...]
   Wait until given subvolume(s) are completely removed from the
   filesystem after deletion.

This 'subvolume sync' command, per the man page, is only about subvolume deletion. I suggest replacing it with a regular sync command.

I think the problem is that the script does things so fast that the snapshot is not always consistent on disk before btrfs send starts. It's just a guess though. If I'm right, this means the rsync mismatches mean the destination snapshots are bad. Here's what I would do:

- delete all the bad/mismatching snapshots only on the destination computer.
- The most recent good snapshot pair, which rsync shows origin and destination match, is mysql_201708080830 so you can keep that one on both sides.
- manually do incremental send/receive, starting with mysql_201708090830/, to make the destination current again with the origin.
- confirm with rsync that the snapshot pairs on origin and destination are the same
- now resume using the modified script, which will do snapshot -> sync -> send.

OPTIONAL: you could add an rsync -avnc to your script to double-check that the incremental send/receive is working. This is admittedly inefficient because it checks the *entire* contents of the snapshots on both sides, not just the incremental data. But if it doesn't take too long, it will help restore trust in send/receive, and confirm that a regular sync is needed between snapshot and send.

--
Chris Murphy
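The "snapshot -> sync -> send" order from the last step could look roughly like this in a shell script. This is a sketch only, not the OP's script: the function names, the argument layout and the DRY_RUN switch are assumptions for illustration, and paths containing spaces are not handled.

```shell
#!/bin/sh
# Sketch of one backup step: snapshot -> sync -> incremental send.
# By default the commands are only printed; set DRY_RUN=0 to execute them.

DRY_RUN=${DRY_RUN:-1}

# run CMDLINE: print the command line in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        sh -c "$*"
    fi
}

# backup_step BASE PARENT_SNAP NEW_SNAP DEST_HOST DEST_DIR
backup_step() {
    base=$1 parent=$2 new=$3 host=$4 dest=$5
    # Read-only snapshot, then a regular sync (not `btrfs subvolume sync`,
    # which only waits for deleted subvolumes), then the incremental send.
    run "btrfs subvolume snapshot -r $base $new" &&
    run "sync" &&
    run "btrfs send -p $parent $new | ssh $host btrfs receive $dest"
}
```

Each step runs only if the previous one succeeded, so a failed snapshot never leads to a send. The optional rsync -avnc check would go after the send, comparing the new snapshot on both sides.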
Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?
On Sat, Aug 12, 2017 at 01:51:46PM +0200, Christoph Anton Mitterer wrote:
> On Sat, 2017-08-12 at 00:42 -0700, Christoph Hellwig wrote:
> > And how are you going to write your data and checksum atomically when
> > doing in-place updates?
>
> Maybe I misunderstand something, but what's the big deal with not doing
> it atomically (I assume you mean in terms of actually writing to the
> physical medium)? Isn't that anyway already a problem in case of a
> crash?

With normal CoW operations, the atomicity is achieved by constructing a completely new metadata tree containing both changes (references to the data, and the csum metadata), and then atomically changing the superblock to point to the new tree, so it really is atomic.

With nodatacow, that approach doesn't work, because the new data replaces the old on the physical medium, so you'd have to make the data write atomic with the superblock write -- which can't be done, because it's (at least) two distinct writes.

> And isn't that the case also with all forms of e.g. software RAID (when
> not having a journal)?
>
> And as I've said, what's the worst thing that can happen? Either the
> data would not have been completely written - with or without
> checksumming. Then what's the difference to try the checksumming (and
> do it successfully in all non crash cases)?
> My understanding was (but that may be wrong of course, I'm not a
> filesystem expert at all), that worst that can happen is that data and
> csum aren't *both* fully written (in all possible combinations), so
> we'd have four cases in total:
>
> data=good csum=good => fine
> data=bad  csum=bad  => doesn't matter whether csum or not and whether
>                        atomic or not
> data=bad  csum=good => the csum will tell us, that the data is bad
> data=good csum=bad  => the only real problem, data would be actually
>                        good, but csum is not

I don't think this is a particularly good description of the problem. I'd say it's more like this:

If you write data and metadata separately (which you have to do in the nodatacow case), and the system halts between the two writes, then you either have the new data with the old csum, or the old data with the new csum. Both data and csum are "good", but good from different states of the FS. In both cases (data first or metadata first), the csum doesn't match the data, and so you now have an I/O error reported when trying to read that data.

You can't easily fix this, because when the data and csum don't match, you need to know the _reason_ they don't match -- is it because the machine was interrupted during write (in which case you can fix it), or is it because the hard disk has had someone write data to it directly, and the data is now toast (in which case you shouldn't fix the I/O error)?

Basically, nodatacow bypasses the very mechanisms that are meant to provide consistency in the filesystem.

Hugo.

--
Hugo Mills             | vi vi vi: the Editor of the Beast.
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |
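To make the argument above concrete, here is a toy model in Python. It is only a sketch under loud assumptions: it is not how btrfs lays anything out on disk, zlib.crc32 merely stands in for the filesystem's checksum, and the function names (cow_update, inplace_update, readable) are invented for this illustration.

```python
import zlib
from typing import Tuple

State = Tuple[bytes, int]  # (data as found on disk, stored checksum)


def csum(data: bytes) -> int:
    # Stand-in checksum; the argument does not depend on which one is used.
    return zlib.crc32(data)


def cow_update(old: bytes, new: bytes, crash_before_commit: bool) -> State:
    """Toy CoW: new data and its csum are written elsewhere, then a single
    pointer (the 'superblock') flips from the old tree to the new one."""
    trees = {"old": (old, csum(old)), "new": (new, csum(new))}
    superblock = "old" if crash_before_commit else "new"
    return trees[superblock]


def inplace_update(old: bytes, new: bytes, data_first: bool,
                   crash_between: bool) -> State:
    """Toy nodatacow with checksums: data and csum are overwritten in
    place by two separate writes, and a crash may fall between them."""
    data, stored = old, csum(old)
    if data_first:
        data = new
        if not crash_between:
            stored = csum(new)
    else:
        stored = csum(new)
        if not crash_between:
            data = new
    return data, stored


def readable(state: State) -> bool:
    """A read succeeds only if the stored csum matches the data."""
    data, stored = state
    return csum(data) == stored


if __name__ == "__main__":
    old, new = b"old contents", b"new contents"
    print("CoW, crash before commit:  ", readable(cow_update(old, new, True)))
    print("CoW, committed:            ", readable(cow_update(old, new, False)))
    print("in-place, data first+crash:", readable(inplace_update(old, new, True, True)))
    print("in-place, csum first+crash:", readable(inplace_update(old, new, False, True)))
```

In this model the CoW path always leaves a matching data/csum pair, either the old one or the new one. The in-place path leaves a mismatch after a crash whichever write goes first, and the mismatch alone says nothing about whether the data is stale, new, or corrupted.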
Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?
On Sat, 2017-08-12 at 00:42 -0700, Christoph Hellwig wrote:
> And how are you going to write your data and checksum atomically when
> doing in-place updates?

Maybe I misunderstand something, but what's the big deal with not doing it atomically (I assume you mean in terms of actually writing to the physical medium)? Isn't that already a problem anyway in case of a crash?

And isn't that also the case with all forms of e.g. software RAID (when not having a journal)?

And as I've said, what's the worst thing that can happen? After a crash the data may not have been completely written, with or without checksumming. So what's the harm in attempting the checksumming (and doing it successfully in all non-crash cases)?

My understanding was (but that may be wrong of course, I'm not a filesystem expert at all) that the worst that can happen is that data and csum aren't *both* fully written (in all possible combinations), so we'd have four cases in total:

data=good csum=good => fine
data=bad  csum=bad  => doesn't matter whether csum or not and whether atomic or not
data=bad  csum=good => the csum will tell us that the data is bad
data=good csum=bad  => the only real problem: the data would actually be good, but the csum is not
Re: RedHat 7.4 Release Notes: "Btrfs has been deprecated" - wut?
On Sat, Aug 12, 2017 at 02:10:18AM +0200, Christoph Anton Mitterer wrote:
> Qu Wenruo wrote:
> >Although Btrfs can disable data CoW, nodatacow also disables data
> >checksum, which is another main feature for btrfs.
>
> Then the two should probably be decoupled and support for
> nodatacow+checksumming be implemented?!

And how are you going to write your data and checksum atomically when doing in-place updates?