On 04/14/2018 11:42 AM, Qu Wenruo wrote:
And the workload at the time the RO remount happened would also be helpful.
(Well, the dmesg from when the RO happened would be best, though.)
Surprisingly nothing special AFAIR. It's a private, mostly idle machine. Probably "just" browsing with chrome. I didn't notice the remount right away as there were no obvious failures. And even then I kept it running for a couple more hours/a day or so.

I had a glance at dmesg but don't remember anything specific (think the usual "---- [cut here] ---" + dump of registers, but I'm not even sure about that). Sorry.

Actually, the same thing happened just a few days earlier, and after a reboot (and maybe an fsck) it was back up and fine. I was optimistic it would go the same way this time as well :) In general I've had to hard-reset (+ fsck) a couple of times recently. Except for the SSD it's an all-new machine that I'm still OC/stress-testing, but I wasn't doing that when this particular event happened.
Besides the above salvage method, please also consider providing the
following data, as your case is pretty special and may help us catch
a long-hidden bug.
If only I had known, I would have saved dmesg! :)
Sure, I'd be happy to help. If you need any more information just let me know.
1) Extent tree dump
    Need above 2 patches applied first.

    # btrfs inspect dump-tree -t extent /dev/sda2 &> \
      /tmp/extent_tree_dump
    If above dump is too large, "grep -C20 166030671872" of the output is
    also good enough.

I'll send you a link to the full dump directly.

    key (162020040704 EXTENT_ITEM 4096) block 130834812928 (31942093) gen 1627950
    key (162032852992 EXTENT_ITEM 16384) block 130834829312 (31942097) gen 1627950
    key (162058788864 EXTENT_ITEM 126976) block 166896513024 (40746219) gen 1627103
    key (162067652608 EXTENT_ITEM 122880) block 181095837696 (44212851) gen 1626650
    key (162074021888 EXTENT_ITEM 126976) block 181095841792 (44212852) gen 1626650
    key (162080391168 EXTENT_ITEM 126976) block 181095845888 (44212853) gen 1626650
    key (162086768640 EXTENT_ITEM 126976) block 140649435136 (34338241) gen 1697624
    key (162541010944 EXTENT_ITEM 1052672) block 141310992384 (34499754) gen 1626990
    key (162557276160 EXTENT_ITEM 118784) block 130641969152 (31895012) gen 1627722
    key (162677518336 EXTENT_ITEM 122880) block 140914860032 (34403042) gen 1626930
    key (162678145024 EXTENT_ITEM 122880) block 140343918592 (34263652) gen 1626743
    key (162682937344 EXTENT_ITEM 122880) block 181095866368 (44212858) gen 1626650
    key (162692177920 EXTENT_ITEM 118784) block 130621235200 (31889950) gen 1627194
    key (162698711040 EXTENT_ITEM 28672) block 130621198336 (31889941) gen 1627194
    key (162699722752 EXTENT_ITEM 4096) block 180581654528 (44087318) gen 1675166
    key (162702168064 EXTENT_ITEM 4096) block 131325329408 (32061848) gen 1626697
    key (162702237696 EXTENT_ITEM 8192) block 181116600320 (44217920) gen 1626654
    key (162707652608 EXTENT_ITEM 49152) block 181095890944 (44212864) gen 1626650
    key (162710052864 EXTENT_ITEM 1859584) block 140858179584 (34389204) gen 1626926
    key (162735849472 EXTENT_ITEM 32768) block 140858195968 (34389208) gen 1626926
    key (162793705472 EXTENT_ITEM 4096) block 166030671872 (40534832) gen 1702074
    key (162858274816 EXTENT_ITEM 21327872) block 140301692928 (34253343) gen 1626728
    key (162881642496 EXTENT_ITEM 73728) block 130903351296 (31958826) gen 1626665
    key (162884112384 EXTENT_ITEM 53248) block 130903355392 (31958827) gen 1626665
    key (162884964352 EXTENT_ITEM 4096) block 130903359488 (31958828) gen 1626665
leaf 181087133696 items 51 free space 17 generation 1626650 owner EXTENT_TREE
leaf 181087133696 flags 0x1(WRITTEN) backref revision 1
fs uuid 22e778f7-2499-4379-99d2-cdd399d1cc6e
chunk uuid bee8ad15-e128-45f1-a3d7-e2fda17806ce
    item 0 key (161456054272 EXTENT_ITEM 4096) itemoff 3942 itemsize 53
        refs 1 gen 1626650 flags DATA
        extent data backref root 5 objectid 89243202 offset 0 count 1
    item 1 key (161456058368 EXTENT_ITEM 4096) itemoff 3889 itemsize 53
        refs 1 gen 1626650 flags DATA
        extent data backref root 5 objectid 89243208 offset 0 count 1
    item 2 key (161456062464 EXTENT_ITEM 4096) itemoff 3836 itemsize 53
        refs 1 gen 1626650 flags DATA
        extent data backref root 5 objectid 89243192 offset 0 count 1
    item 3 key (161456066560 EXTENT_ITEM 4096) itemoff 3783 itemsize 53
        refs 1 gen 1626650 flags DATA
        extent data backref root 5 objectid 89243198 offset 0 count 1
--
        refs 1 gen 1626665 flags DATA
        extent data backref root 5 objectid 32187101 offset 655360 count 1
    item 35 key (162793402368 EXTENT_ITEM 20480) itemoff 2087 itemsize 53
        refs 1 gen 1626665 flags DATA
        extent data backref root 5 objectid 32187104 offset 0 count 1
    item 36 key (162793422848 EXTENT_ITEM 110592) itemoff 2034 itemsize 53
        refs 1 gen 1626665 flags DATA
        extent data backref root 5 objectid 32187107 offset 0 count 1
    item 37 key (162793533440 EXTENT_ITEM 118784) itemoff 1981 itemsize 53
        refs 1 gen 1626665 flags DATA
        extent data backref root 5 objectid 32187110 offset 0 count 1
    item 38 key (162793656320 EXTENT_ITEM 4096) itemoff 1928 itemsize 53
        refs 1 gen 1626665 flags DATA
        extent data backref root 5 objectid 32187112 offset 0 count 1
    item 39 key (162793664512 EXTENT_ITEM 20480) itemoff 1875 itemsize 53
        refs 1 gen 1626665 flags DATA
        extent data backref root 5 objectid 32187115 offset 0 count 1
    item 40 key (162793684992 EXTENT_ITEM 20480) itemoff 1822 itemsize 53
        refs 1 gen 1626665 flags DATA
        extent data backref root 5 objectid 32187119 offset 0 count 1
parent transid verify failed on 166030671872 wanted 1702074 found 1705980
parent transid verify failed on 166030671872 wanted 1702074 found 1705980
Ignoring transid failure
leaf 166030671872 items 60 free space 95 generation 1705980 owner TREE_LOG
leaf 166030671872 flags 0x1(WRITTEN) backref revision 1
fs uuid 22e778f7-2499-4379-99d2-cdd399d1cc6e
chunk uuid bee8ad15-e128-45f1-a3d7-e2fda17806ce
    item 0 key (46475 DIR_ITEM 2335543231) itemoff 3955 itemsize 40
        location key (1973554 INODE_ITEM 0) type FILE
        transid 1705438 data_len 0 name_len 10
        name: 002443.sst
    item 1 key (46475 DIR_ITEM 2338985614) itemoff 3915 itemsize 40
        location key (1972743 INODE_ITEM 0) type FILE
        transid 1705436 data_len 0 name_len 10
        name: 001638.sst
    item 2 key (46475 DIR_ITEM 2339658528) itemoff 3875 itemsize 40
        location key (1973225 INODE_ITEM 0) type FILE
        transid 1705437 data_len 0 name_len 10
        name: 002115.sst
    item 3 key (46475 DIR_ITEM 2341002938) itemoff 3835 itemsize 40
        location key (311255 INODE_ITEM 0) type FILE
        transid 1640128 data_len 0 name_len 10
        name: 000500.sst
    item 4 key (46475 DIR_ITEM 2344385674) itemoff 3795 itemsize 40
        location key (1945180 INODE_ITEM 0) type FILE
--
        refs 1 gen 1697626 flags DATA
        extent data backref root 448 objectid 672641 offset 939524096 count 1
    item 10 key (165604331520 EXTENT_ITEM 10088448) itemoff 3412 itemsize 53
        refs 1 gen 1626620 flags DATA
        extent data backref root 5 objectid 48964469 offset 10264576 count 1
    item 11 key (165663883264 EXTENT_ITEM 9146368) itemoff 3359 itemsize 53
        refs 1 gen 1626614 flags DATA
        extent data backref root 5 objectid 42160317 offset 0 count 1
    item 12 key (165708148736 EXTENT_ITEM 9912320) itemoff 3306 itemsize 53
        refs 1 gen 1626620 flags DATA
        extent data backref root 5 objectid 48964469 offset 20353024 count 1
    item 13 key (165735571456 EXTENT_ITEM 25378816) itemoff 3253 itemsize 53
        refs 1 gen 1626613 flags DATA
        extent data backref root 5 objectid 1905 offset 83456000 count 1
    item 14 key (165760950272 EXTENT_ITEM 134217728) itemoff 3200 itemsize 53
        refs 1 gen 1697627 flags DATA
        extent data backref root 448 objectid 672641 offset 1073741824 count 1
    item 15 key (165978742784 EXTENT_ITEM 13717504) itemoff 3147 itemsize 53
        refs 1 gen 1626614 flags DATA
        extent data backref root 5 objectid 42160317 offset 9146368 count 1
    item 16 key (166030671872 EXTENT_ITEM 4096) itemoff 3096 itemsize 51
        refs 1 gen 1702074 flags TREE_BLOCK
        tree block key (162793705472 EXTENT_ITEM 4096) level 0
        tree block backref root 2
    item 17 key (166030671872 BLOCK_GROUP_ITEM 1073741824) itemoff 3072 itemsize 24
        block group used 96915456 chunk_objectid 256 flags METADATA
    item 18 key (166030696448 EXTENT_ITEM 4096) itemoff 3021 itemsize 51
        refs 1 gen 1705980 flags TREE_BLOCK
        tree block key (EXTENT_CSUM EXTENT_CSUM 137702375424) level 0
        tree block backref root 7
    item 19 key (166030766080 EXTENT_ITEM 4096) itemoff 2970 itemsize 51
        refs 1 gen 1705985 flags TREE_BLOCK
        tree block key (138697830400 EXTENT_ITEM 24576) level 0
        tree block backref root 2
    item 20 key (166030770176 EXTENT_ITEM 4096) itemoff 2919 itemsize 51
        refs 1 gen 1705985 flags TREE_BLOCK
        tree block key (EXTENT_CSUM EXTENT_CSUM 138697756672) level 0
        tree block backref root 7
    item 21 key (166030778368 EXTENT_ITEM 4096) itemoff 2868 itemsize 51
        refs 1 gen 1705985 flags TREE_BLOCK
        tree block key (138729590784 EXTENT_ITEM 225280) level 0
        tree block backref root 2
    item 22 key (166030802944 EXTENT_ITEM 4096) itemoff 2817 itemsize 51
        refs 1 gen 1705985 flags TREE_BLOCK
        tree block key (140492824576 EXTENT_ITEM 4096) level 0
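
The mismatch in the dump above can be pulled out mechanically: the extent tree records block 166030671872 as a TREE_BLOCK of root 2 at generation 1702074, while the block actually read from disk is a TREE_LOG leaf of generation 1705980. A minimal sketch, assuming the message format is exactly as printed above (the function name and the sample string are illustrative only, not part of btrfs-progs):

```python
import re

# Message format as it appears in the dump above.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (\d+) wanted (\d+) found (\d+)"
)


def parse_transid_failures(text):
    """Return sorted, de-duplicated (bytenr, wanted_gen, found_gen) tuples."""
    hits = {tuple(int(g) for g in m.groups()) for m in TRANSID_RE.finditer(text)}
    return sorted(hits)


# Sample copied from the dump output; the message is printed twice there.
sample = """\
parent transid verify failed on 166030671872 wanted 1702074 found 1705980
parent transid verify failed on 166030671872 wanted 1702074 found 1705980
Ignoring transid failure
"""

for bytenr, wanted, found in parse_transid_failures(sample):
    # A positive difference means the block on disk is newer than its parent expects.
    print(f"block {bytenr}: wanted gen {wanted}, found gen {found}, "
          f"difference {found - wanted}")
```

Running this over the full dump file instead of the sample would list every affected block once, which may be easier to scan than the raw grep output.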
2) super block dump
    # btrfs inspect dump-super -f /dev/sda2
superblock: bytenr=65536, device=/dev/sda2
---------------------------------------------------------
csum_type        0 (crc32c)
csum_size        4
csum            0xef0068ba [match]
bytenr            65536
flags            0x1
            ( WRITTEN )
magic            _BHRfS_M [match]
fsid            22e778f7-2499-4379-99d2-cdd399d1cc6e
label            830
generation        1706541
root            167104118784
sys_array_size        97
chunk_root_generation    1702072
root_level        1
chunk_root        186120536064
chunk_root_level    1
log_root        180056702976
log_root_transid    0
log_root_level        0
total_bytes        63879249920
bytes_used        36929691648
sectorsize        4096
nodesize        4096
leafsize (deprecated)        4096
stripesize        4096
root_dir        6
num_devices        1
compat_flags        0x0
compat_ro_flags        0x0
incompat_flags        0x59
            ( MIXED_BACKREF |
              COMPRESS_LZO |
              COMPRESS_ZSTD |
              EXTENDED_IREF )
cache_generation    1706541
uuid_tree_generation    1706541
dev_item.uuid        12f20294-7a53-4ba2-a2ca-09e91a365ff2
dev_item.fsid        22e778f7-2499-4379-99d2-cdd399d1cc6e [match]
dev_item.type        0
dev_item.total_bytes    63879249920
dev_item.bytes_used    63325601792
dev_item.io_align    4096
dev_item.io_width    4096
dev_item.sector_size    4096
dev_item.devid        1
dev_item.dev_group    0
dev_item.seek_speed    0
dev_item.bandwidth    0
dev_item.generation    0
sys_chunk_array[2048]:
    item 0 key (FIRST_CHUNK_TREE CHUNK_ITEM 186120536064)
        length 33554432 owner 2 stripe_len 65536 type SYSTEM
        io_align 65536 io_width 65536 sector_size 4096
        num_stripes 1 sub_stripes 1
            stripe 0 devid 1 offset 1048576
            dev_uuid 12f20294-7a53-4ba2-a2ca-09e91a365ff2
backup_roots[4]:
    backup 0:
        backup_tree_root:    167104118784    gen: 1706541    level: 1
        backup_chunk_root:    186120536064    gen: 1702072    level: 1
        backup_extent_root:    167103516672    gen: 1706541 level: 3
        backup_fs_root:        167103123456    gen: 1706541 level: 3
        backup_dev_root:    130644881408    gen: 1705305    level: 1
        backup_csum_root:    167103152128    gen: 1706541    level: 3
        backup_total_bytes:    63879249920
        backup_bytes_used:    36929691648
        backup_num_devices:    1

    backup 1:
        backup_tree_root:    167100444672    gen: 1706538    level: 1
        backup_chunk_root:    186120536064    gen: 1702072    level: 1
        backup_extent_root:    167099588608    gen: 1706538 level: 3
        backup_fs_root:        167099183104    gen: 1706538 level: 3
        backup_dev_root:    130644881408    gen: 1705305    level: 1
        backup_csum_root:    167099199488    gen: 1706538    level: 3
        backup_total_bytes:    63879249920
        backup_bytes_used:    36929826816
        backup_num_devices:    1

    backup 2:
        backup_tree_root:    167101612032    gen: 1706539    level: 1
        backup_chunk_root:    186120536064    gen: 1702072    level: 1
        backup_extent_root:    167100915712    gen: 1706539 level: 3
        backup_fs_root:        167101820928    gen: 1706540 level: 3
        backup_dev_root:    130644881408    gen: 1705305    level: 1
        backup_csum_root:    167101845504    gen: 1706540    level: 3
        backup_total_bytes:    63879249920
        backup_bytes_used:    36929593344
        backup_num_devices:    1

    backup 3:
        backup_tree_root:    167102959616    gen: 1706540    level: 1
        backup_chunk_root:    186120536064    gen: 1702072    level: 1
        backup_extent_root:    167102296064    gen: 1706540 level: 3
        backup_fs_root:        167101820928    gen: 1706540 level: 3
        backup_dev_root:    130644881408    gen: 1705305    level: 1
        backup_csum_root:    167101845504    gen: 1706540    level: 3
        backup_total_bytes:    63879249920
        backup_bytes_used:    36929810432
        backup_num_devices:    1
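
As a cross-check of the incompat_flags line in the dump above, the value 0x59 can be decoded by hand. This is a rough sketch; the bit assignments are taken from the BTRFS_FEATURE_INCOMPAT_* definitions in the kernel headers as I understand them, and the table only covers the bits known around kernel 4.15:

```python
# Bit values assumed from the kernel's BTRFS_FEATURE_INCOMPAT_* definitions.
INCOMPAT_FLAGS = {
    0x001: "MIXED_BACKREF",
    0x002: "DEFAULT_SUBVOL",
    0x004: "MIXED_GROUPS",
    0x008: "COMPRESS_LZO",
    0x010: "COMPRESS_ZSTD",
    0x020: "BIG_METADATA",
    0x040: "EXTENDED_IREF",
    0x080: "RAID56",
    0x100: "SKINNY_METADATA",
    0x200: "NO_HOLES",
}


def decode_incompat(flags):
    """Return the names of all set bits, lowest bit first."""
    names = [name for bit, name in sorted(INCOMPAT_FLAGS.items()) if flags & bit]
    known_mask = sum(INCOMPAT_FLAGS)  # summing a dict sums its keys
    unknown = flags & ~known_mask
    if unknown:
        # Bits this table does not know about are reported, not dropped.
        names.append(f"unknown(0x{unknown:x})")
    return names


print(decode_incompat(0x59))
```

The decoded names match the four flags that dump-super prints. BIG_METADATA and SKINNY_METADATA are not set, which fits the 4096-byte nodesize shown above.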

3) Extra hardware info about your sda
    Things like SMART and hardware model would also help here.
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.15.15-1-ARCH] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Samsung based SSDs
Device Model:     SAMSUNG SSD 830 Series
Serial Number:    S0VXNYABC10154
LU WWN Device Id: 5 002538 043584d30
Firmware Version: CXM01B1Q
User Capacity:    64,023,257,088 bytes [64.0 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4c
SATA Version is:  SATA 3.0, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Apr 14 14:22:22 2018 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x83)    Offline data collection activity
                    is in a Reserved state.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (  17)    The self-test routine was aborted by
                    the host.
Total time to complete Offline
data collection:         (  300) seconds.
Offline data collection
capabilities:              (0x5b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    No Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003)    Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01)    Error logging supported.
                    General Purpose Logging supported.
Short self-test routine
recommended polling time:      (   2) minutes.
Extended self-test routine
recommended polling time:      (   5) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   095   095   000    Old_age   Always       -       21508
 12 Power_Cycle_Count       0x0032   092   092   000    Old_age   Always       -       7390
177 Wear_Leveling_Count     0x0013   001   001   000    Pre-fail  Always       -       3779
179 Used_Rsvd_Blk_Cnt_Tot   0x0013   100   100   010    Pre-fail  Always       -       0
181 Program_Fail_Cnt_Total  0x0032   100   100   010    Old_age   Always       -       0
182 Erase_Fail_Count_Total  0x0032   100   100   010    Old_age   Always       -       0
183 Runtime_Bad_Block       0x0013   100   100   010    Pre-fail  Always       -       0
187 Uncorrectable_Error_Cnt 0x0032   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0032   070   056   000    Old_age   Always       -       30
195 ECC_Error_Rate          0x001a   200   200   000    Old_age   Always       -       0
199 CRC_Error_Count         0x003e   253   253   000    Old_age   Always       -       0
235 POR_Recovery_Count      0x0012   099   099   000    Old_age   Always       -       153
241 Total_LBAs_Written      0x0032   099   099   000    Old_age   Always       -       43256578423

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Aborted by host               10% 21291         -
# 2  Short offline       Completed without error       00% 21291         -
# 3  Short offline       Completed without error       00% 20933         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
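
To put the Total_LBAs_Written raw value from the attribute table into perspective, it can be converted to bytes. This is only a sketch and rests on an assumption: that the raw value of attribute 241 counts logical sectors of the 512-byte size reported above. The function names are illustrative.

```python
LOGICAL_SECTOR_BYTES = 512         # "Sector Size: 512 bytes logical/physical"
TOTAL_LBAS_WRITTEN = 43256578423   # raw value of attribute 241
USER_CAPACITY_BYTES = 64023257088  # "User Capacity" from the information section


def bytes_written(lbas, sector_bytes=LOGICAL_SECTOR_BYTES):
    """Host bytes written, assuming one LBA equals one logical sector."""
    return lbas * sector_bytes


def full_drive_writes(lbas, capacity_bytes=USER_CAPACITY_BYTES):
    """How many times the user capacity has been written over."""
    return bytes_written(lbas) / capacity_bytes


total = bytes_written(TOTAL_LBAS_WRITTEN)
print(f"{total / 1e12:.2f} TB written by the host, "
      f"about {full_drive_writes(TOTAL_LBAS_WRITTEN):.0f} full-drive writes")
```

Under that assumption the host has written roughly 22 TB to a 64 GB drive. The normalized Wear_Leveling_Count of 001 is worth reading alongside that figure.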

4) The mount option of /dev/sda2

/dev/sda2    /    btrfs compress=zstd,discard,autodefrag,subvol=/       0       0

And if that matters (AFAIK subvolume mount options have no effect anyway):

/dev/sda2    /var/lib/postgres    btrfs compress=zstd,discard,nodatacow,subvol=var/lib/postgres 0       0
/dev/sda2    /var/cache           btrfs compress=off,discard,subvol=var/cache                   0       0
/dev/sda2    /var/tmp             btrfs compress=zstd,discard,subvol=var/tmp                    0       0

Thanks,
Qu

Got a couple of these:
We seem to be looping a lot on /mnt/sda2/var/lib/postgres/data/.., do you want to keep going on ? (y/N/a): y

Is this something I need to be worried about? Postgres did at least start up.


Thanks a lot for your help!
Timo