Hi,

Sorry to resurrect an old issue, but I've just come across the same (or
a very similar-looking) problem. As in the original report, I'm on an
OpenStack Swift storage node doing lots of small writes to SSDs. In our
case it runs Debian Stretch with the following kernel:

Linux swift-storage-1 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux

Kernel logs said:

[4769736.560752] XFS (sdc1): Metadata corruption detected at xfs_attr3_leaf_write_verify+0xe8/0x100 [xfs], xfs_attr3_leaf block 0xe7dd89b0
[4769736.563285] XFS (sdc1): Unmount and run xfs_repair
[4769736.564554] XFS (sdc1): First 64 bytes of corrupted metadata buffer:
[4769736.565818] ffff960ab1d0d000: 00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00  ................
[4769736.567064] ffff960ab1d0d010: 10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00  ..... ..........
[4769736.568272] ffff960ab1d0d020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[4769736.569446] ffff960ab1d0d030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[4769736.570611] XFS (sdc1): xfs_do_force_shutdown(0x8) called from line 1339 of file /build/linux-YDazDa/linux-4.9.82/fs/xfs/xfs_buf.c.  Return address = 0xffffffffc06c1ada
[4769736.573226] XFS (sdc1): Corruption of in-memory data detected.  Shutting down filesystem
[4769736.574419] XFS (sdc1): Please umount the filesystem and rectify the problem(s)
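
To make the dump easier to read, here is a minimal Python sketch that
decodes the first 32 bytes of the buffer. It assumes the block is a v4
(non-CRC) attribute leaf and that the header layout is forw, back, magic,
pad, count, usedbytes, firstused, holes, pad1, followed by three
(base, size) freemap entries, all big-endian. That layout is my reading of
the XFS on-disk format, not something I have checked against the 4.9
source, and decode_attr_leaf_hdr is just an illustrative name.

```python
import struct

# First 32 bytes of the buffer, copied from the kernel log above.
DUMP = (
    "00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00 "
    "10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00"
)

# Assumed magic number for a v4 (non-CRC) attribute leaf block.
XFS_ATTR_LEAF_MAGIC = 0xFBEE


def decode_attr_leaf_hdr(raw: bytes) -> dict:
    """Decode an (assumed) 32-byte v4 attr leaf header, big-endian on disk."""
    fields = struct.unpack(">IIHHHHHBB6H", raw[:32])
    forw, back, magic, _pad, count, usedbytes, firstused, holes, _pad1 = fields[:9]
    fm = fields[9:]
    return {
        "forw": forw,
        "back": back,
        "magic": magic,
        "count": count,          # number of attribute entries in the leaf
        "usedbytes": usedbytes,  # bytes used by entry names and values
        "firstused": firstused,  # offset of the first used byte
        "holes": holes,
        "freemap": [(fm[i], fm[i + 1]) for i in range(0, 6, 2)],  # (base, size)
    }


hdr = decode_attr_leaf_hdr(bytes.fromhex(DUMP))
for name, value in hdr.items():
    print(f"{name:10} {value}")
print("magic matches v4 attr leaf:", hdr["magic"] == XFS_ATTR_LEAF_MAGIC)
```

Under those assumptions the header describes an attribute leaf with
count = 0 and a single free region covering everything after the header,
i.e. an empty leaf block. The same 32 bytes appear in the original report
quoted below. Whether an empty leaf is what the write verifier rejects
here is a guess on my part.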


As per the message, I unmounted the filesystem and ran xfs_repair on it. The 
first run of xfs_repair told me to mount the filesystem to replay the log, 
which I did. I then unmounted it and ran xfs_repair again:

~$ sudo xfs_repair /dev/sdc1
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done
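
For anyone who wants to script this, the sequence described above
(unmount, xfs_repair, mount to replay the log, unmount, xfs_repair again)
can be sketched in Python roughly as follows. This is only a sketch: the
function name and the mount point are hypothetical, and treating any
non-zero exit from the first xfs_repair run as "the log needs replaying"
is a simplifying assumption, so check the xfs_repair output before
relying on it.

```python
import subprocess


def repair_after_shutdown(device, mountpoint, run=subprocess.run):
    """Unmount, repair, and replay the log via a mount cycle if needed.

    `run` is injectable so the sequence can be exercised without root.
    """
    run(["umount", device], check=True)

    # First attempt. Not checked, because xfs_repair refuses to proceed
    # when the log is dirty and asks for the filesystem to be mounted.
    first = run(["xfs_repair", device])
    if first.returncode == 0:
        return

    # Assumption: a non-zero exit means the log must be replayed.
    # Mounting replays the log; unmount again before repairing.
    run(["mount", device, mountpoint], check=True)
    run(["umount", mountpoint], check=True)
    run(["xfs_repair", device], check=True)


# Example call (needs root; the mount point is a made-up path):
# repair_after_shutdown("/dev/sdc1", "/srv/node/sdc1")
```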


The filesystem now seems to be back up and running OK. Is there any more
information I could provide to help track down this issue?


Thanks,
Chris

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1596550

Title:
  Metadata corruption detected at xfs_attr3_leaf_write_verify+0xd7/0xf0

Status in linux package in Ubuntu:
  Confirmed

Bug description:
  We noticed an XFS metadata corruption once we ran a lot of small write
  IOs on SSDs in our OpenStack Swift environment:

  [1468860.211158] XFS (sdax): Metadata corruption detected at xfs_attr3_leaf_write_verify+0xd7/0xf0 [xfs], block 0x7c99480
  [1468860.211195] XFS (sdax): Unmount and run xfs_repair
  [1468860.211215] XFS (sdax): First 64 bytes of corrupted metadata buffer:
  [1468860.211247] ffff880630f66000: 00 00 00 00 00 00 00 00 fb ee 00 00 00 00 00 00  ................
  [1468860.211268] ffff880630f66010: 10 00 00 00 00 20 0f e0 00 00 00 00 00 00 00 00  ..... ..........
  [1468860.211289] ffff880630f66020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  [1468860.211309] ffff880630f66030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  [1468860.211328] XFS (sdax): xfs_do_force_shutdown(0x8) called from line 1254 of file /build/linux-lts-xenial-7RlTta/linux-lts-xenial-4.4.0/fs/xfs/xfs_buf.c.  Return address = 0xffffffffc068f616
  [1468860.212214] XFS (sdax): Corruption of in-memory data detected.  Shutting down filesystem
  [1468860.212232] XFS (sdax): Please umount the filesystem and rectify the problem(s)
  [1468860.212323] XFS (sdax): xfs_do_force_shutdown(0x1) called from line 315 of file /build/linux-lts-xenial-7RlTta/linux-lts-xenial-4.4.0/fs/xfs/xfs_trans_buf.c.  Return address = 0xffffffffc06bdda2
  [1468860.261436] XFS (sdax): xfs_log_force: error -5 returned.

  This error is reported with linux-generic-lts-xenial @4.4.0.22.12 on an
  XFS filesystem formatted with an inode size of 1024 and mounted with

  rw,noatime,nodiratime,attr2,nobarrier,inode64,logbufs=8,sunit=512,swidth=512,noquota

  For us this issue seems to be reproducible after several hours of stress
  testing.

  cat /proc/version_signature
  Ubuntu 4.4.0-22.40~14.04.1-generic 4.4.8

  Description:    Ubuntu 14.04.3 LTS
  Release:        14.04
  --- 
  AlsaDevices:
   total 0
   crw-rw---- 1 root audio 116,  1 Jun 10 13:19 seq
   crw-rw---- 1 root audio 116, 33 Jun 10 13:19 timer
  AplayDevices: Error: [Errno 2] No such file or directory
  ApportVersion: 2.14.1-0ubuntu3.11
  Architecture: amd64
  ArecordDevices: Error: [Errno 2] No such file or directory
  AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
  CRDA: Error: [Errno 2] No such file or directory
  DistroRelease: Ubuntu 14.04
  IwConfig: Error: [Errno 2] No such file or directory
  MachineType: HP ProLiant DL380 Gen9
  Package: linux (not installed)
  PciMultimedia:
   
  ProcEnviron:
   TERM=screen
   PATH=(custom, no user)
   XDG_RUNTIME_DIR=<set>
   LANG=en_US.UTF-8
   SHELL=/bin/bash
  ProcFB:
   
  ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.4.0-22-generic root=/dev/mapper/lxc-root00 ro biosdevname=1 net.ifnames=0 usbcore.autosuspend=-1 vga=normal nomodeset nomdmonddf nomdmonisw crashkernel=1024M-:128M
  ProcVersionSignature: Ubuntu 4.4.0-22.40~14.04.1-generic 4.4.8
  RelatedPackageVersions:
   linux-restricted-modules-4.4.0-22-generic N/A
   linux-backports-modules-4.4.0-22-generic  N/A
   linux-firmware                            1.127.15
  RfKill: Error: [Errno 2] No such file or directory
  Tags:  trusty
  Uname: Linux 4.4.0-22-generic x86_64
  UpgradeStatus: No upgrade log present (probably fresh install)
  UserGroups:
   
  _MarkForUpload: True
  dmi.bios.date: 07/20/2015
  dmi.bios.vendor: HP
  dmi.bios.version: P89
  dmi.chassis.type: 23
  dmi.chassis.vendor: HP
  dmi.modalias: dmi:bvnHP:bvrP89:bd07/20/2015:svnHP:pnProLiantDL380Gen9:pvr:cvnHP:ct23:cvr:
  dmi.product.name: ProLiant DL380 Gen9
  dmi.sys.vendor: HP

