Public bug reported:

----Overview----

Our automation scripts test SSD firmware across power transitions during
interoperability testing, using the following procedure:

1)  Create four 25% partitions (varying file systems) and mount them as a
    secondary data drive
2)  BTRFS partition mounted with discard flag via /etc/fstab
3)  Create 10G unique data pattern file on root fs
4)  Copy to each target 
5)  Verify each target
6)  Perform power transition (restart, shutdown, sleep, or hibernate)
7)  Verify each target
8)  Remove target file
9)  Copy file from the internal drive (root fs) to each target again
10) Verify targets
11) Perform power transition
..etc

BTRFS fails at step 10. By that point the machine has come back up from
the power event, verified the target files, deleted them, and copied the
file from the internal drive again; verification of the freshly copied
file then fails.

----Failure----

On failure, the fio verify threads fail with an invalid header (the data
is ALWAYS "101" where fio's ACCA header is expected; I assume this is a
quirk of fio), and dmesg shows csum failed messages:

 csum failed ino 262 off 9985851392 csum 1474905414 expected csum 210901362

The file is readable up to a certain point, beyond which dd returns an
I/O error:

root@xxxxx:$ dd if=/mnt/g/restart-3.bin of=/tmp/fio bs=512 count=1 skip=19503615
1+0 records in
1+0 records out
512 bytes copied, 0.000311177 s, 1.6 MB/s

root@xxxxx:$ dd if=/mnt/g/restart-3.bin of=/tmp/fio bs=512 count=1 skip=19503616
dd: error reading '/mnt/g/restart-3.bin': Input/output error
0+0 records in
0+0 records out
0 bytes copied, 0.000773759 s, 0.0 kB/s

Here we see that both files claim to be the right size but restart-3.bin
is unreadable after the offset above.

-rw-r--r-- 1 root root 10737418240 Oct 17 17:34 restart-1.bin
-rw-r--r-- 1 root root 10737418240 Oct 17 17:44 restart-3.bin
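
The dd probe and the dmesg checksum error can be cross-checked with a
little arithmetic. The sketch below uses only numbers quoted in this
report (dd block size and skip counts, the "off" field of the first csum
failed message, and the file size from ls); the variable names are
illustrative.

```python
# Values copied from the dd commands, dmesg line and ls output in this report.
BLOCK_SIZE = 512               # dd bs=512
LAST_GOOD_SKIP = 19503615      # last 512-byte block that reads back
FIRST_BAD_SKIP = 19503616      # first block that returns an I/O error
CSUM_FAIL_OFFSET = 9985851392  # "off" field of the first csum failed message
FILE_SIZE = 10737418240        # size of restart-3.bin as reported by ls

# Byte offset at which dd starts failing.
first_bad_byte = FIRST_BAD_SKIP * BLOCK_SIZE

# True when the first unreadable byte is exactly where btrfs reports
# the first checksum mismatch.
matches_dmesg = first_bad_byte == CSUM_FAIL_OFFSET

# Number of bytes between the first failing offset and the end of the file.
bytes_to_eof = FILE_SIZE - first_bad_byte

print(first_bad_byte, matches_dmesg, bytes_to_eof)
```

With these inputs the first unreadable byte is 9985851392, which is the
same offset as the first csum failure in dmesg.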

This fails on Ubuntu Server 16.04 with btrfs-progs 4.4 and 4.8, and now
on Ubuntu Server 16.10. The failure no longer reproduces when the discard
flag is removed from the btrfs entry in fstab, or when the power event is
removed from the test.


----Reproducibility----

Ubuntu Server 16.04 / BTRFS-PROGS 4.4 : 100% within 10 restarts, 25-30 reproductions
Ubuntu Server 16.04 / BTRFS-PROGS 4.8 : 100% within 10 restarts, 5 reproductions
Ubuntu Server 16.10 / BTRFS-PROGS 4.7 : 100% within 10 restarts, 3 reproductions

----System Information----

Distro  : ubuntu 16.10
Kernel  : Linux 4.8.0-22-generic #24-Ubuntu SMP Sat Oct 8 09:15:00 UTC 2016 x86_64 x86_64 x86_64

CPU     : Intel(R) Core(TM) i5-6600K CPU @ 3.50GHz (1261.444)
CPUCores: 4
Model   : Gigabyte Technology Co., Ltd. Z170M-D3H-CF
BIOS    : American Megatrends Inc. F2

--DUT Controller Info---
PCI Bus ID : 0000:00:17.0
Device Path: /sys/bus/pci/devices//0000:00:17.0
Module Name: ahci
Module Vers: 3.0

---DUT Controller Bus---
00:17.0 SATA controller [0106]: Intel Corporation Sunrise Point-H SATA controller [AHCI mode] [8086:a102] (rev 31) (prog-if 01 [AHCI 1.0])


---DUT Layout---
/dev/sdb4         ext4      110G   23G   82G  22% /mnt/i
/dev/sdb1         ext4      110G   39G   66G  37% /mnt/f
/dev/sdb2         btrfs     112G   33G   79G  30% /mnt/g
/dev/sdb3         xfs       112G   25G   88G  22% /mnt/h


btrfs-tools:
  Installed: 4.7.3-1
  Candidate: 4.7.3-1
  Version table:
 *** 4.7.3-1 500
        500 http://gb.archive.ubuntu.com/ubuntu yakkety/main amd64 Packages
        100 /var/lib/dpkg/status


---- Logs ----
17-10 17:43:41 | --------------------------------------------------------
17-10 17:43:41 | loopy : restart 3 - pre-power copy
17-10 17:43:41 | --------------------------------------------------------
17-10 17:43:41 | Copying from /systemtest/files/loopy_small.bin to /mnt/f/restart-3.bin
17-10 17:43:41 | Started tag cp-_mnt_f_restart-3.bin [2922]
17-10 17:43:41 | Copying from /systemtest/files/loopy_small.bin to /mnt/g/restart-3.bin
17-10 17:43:41 | Started tag cp-_mnt_g_restart-3.bin [2933]
17-10 17:43:41 | Copying from /systemtest/files/loopy_small.bin to /mnt/h/restart-3.bin
17-10 17:43:41 | Started tag cp-_mnt_h_restart-3.bin [2944]
17-10 17:43:41 | Copying from /systemtest/files/loopy_small.bin to /mnt/i/restart-3.bin
17-10 17:43:41 | Started tag cp-_mnt_i_restart-3.bin [2955]
17-10 17:43:41 | --------------------------------------
17-10 17:43:41 | Monitoring 4 pids for 999 minutes
17-10 17:44:15 |        PID 2933 - cp-_mnt_g_restart-3.bin - Finished. Exit: 0
17-10 17:45:07 |        PID 2944 - cp-_mnt_h_restart-3.bin - Finished. Exit: 0
17-10 17:45:12 |        PID 2922 - cp-_mnt_f_restart-3.bin - Finished. Exit: 0
17-10 17:45:13 |        PID 2955 - cp-_mnt_i_restart-3.bin - Finished. Exit: 0
17-10 17:45:14 |        All tags exhausted
17-10 17:45:14 | --------------------------------------
17-10 17:45:14 | 
17-10 17:45:25 | 
17-10 17:45:25 | --------------------------------------------------------
17-10 17:45:25 | loopy : restart 3 - pre-power verification
17-10 17:45:25 | --------------------------------------------------------
17-10 17:45:25 | Verifying /mnt/f/restart-3.bin
17-10 17:45:25 | Started tag restart-_mnt_f_-pre [5031]
17-10 17:45:25 | Verifying /mnt/g/restart-3.bin
17-10 17:45:25 | Started tag restart-_mnt_g_-pre [5045]
17-10 17:45:25 | Verifying /mnt/h/restart-3.bin
17-10 17:45:25 | Started tag restart-_mnt_h_-pre [5059]
17-10 17:45:25 | Verifying /mnt/i/restart-3.bin
17-10 17:45:25 | Started tag restart-_mnt_i_-pre [5073]
17-10 17:45:25 | --------------------------------------
17-10 17:45:25 | Monitoring 4 pids for 999 minutes
17-10 17:46:40 |        PID 5045 - restart-_mnt_g_-pre - FAILED. Exit: 1
17-10 17:46:40 |        FAILED: 5045 has failed.
17-10 17:46:40 | --------------------------------------
17-10 17:46:40 | ERROR: Failed during restart 3 pre-power event verification [Line:499]


BTRFS warning (device sdb2): csum failed ino 262 off 9985851392 csum 1474905414 expected csum 210901362
BTRFS warning (device sdb2): csum failed ino 262 off 9985982464 csum 1218422395 expected csum 1497608406
BTRFS warning (device sdb2): csum failed ino 262 off 9986113536 csum 3058027576 expected csum 25891403
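
The csum failed messages can be pulled apart mechanically to see how the
failing offsets are spaced. The following is a sketch, not part of the
test harness: the regular expression and function name are illustrative,
and the input is the three warnings from this report with the mail
wrapping undone.

```python
import re

# The three warnings from this report, one per line.
DMESG = """\
BTRFS warning (device sdb2): csum failed ino 262 off 9985851392 csum 1474905414 expected csum 210901362
BTRFS warning (device sdb2): csum failed ino 262 off 9985982464 csum 1218422395 expected csum 1497608406
BTRFS warning (device sdb2): csum failed ino 262 off 9986113536 csum 3058027576 expected csum 25891403
"""

# \s+ between fields so that messages wrapped across lines still match.
CSUM_RE = re.compile(
    r"csum\s+failed\s+ino\s+(?P<ino>\d+)\s+off\s+(?P<off>\d+)\s+"
    r"csum\s+(?P<csum>\d+)\s+expected\s+csum\s+(?P<expected>\d+)"
)


def parse_csum_failures(text):
    """Return one dict of integer fields per 'csum failed' message in text."""
    return [
        {key: int(value) for key, value in match.groupdict().items()}
        for match in CSUM_RE.finditer(text)
    ]


failures = parse_csum_failures(DMESG)
offsets = [f["off"] for f in failures]
# Distance in bytes between consecutive failing offsets.
strides = [b - a for a, b in zip(offsets, offsets[1:])]
inodes = {f["ino"] for f in failures}

print(offsets, strides, inodes)
```

For the three warnings above, all failures are on inode 262 and
consecutive failing offsets are 131072 bytes (128 KiB) apart.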

** Affects: btrfs-tools (Ubuntu)
     Importance: Undecided
         Status: New


** Tags: xenial yakkety

** Attachment added: "dmesg from after failure"
   https://bugs.launchpad.net/bugs/1634377/+attachment/4762992/+files/dmesg_after_failure.log

https://bugs.launchpad.net/bugs/1634377

Title:
  btrfs discard issue after power event