On Fri, Apr 12, 2019 at 6:00 AM Patrick M. Hausen <hau...@punkt.de> wrote:

> Hi all,
>
> my problems seem not to be TRIM related after all … and I can now
> quickly reproduce it.
>
> =====
> root@freenas01[~]# sysctl vfs.zfs.trim.enabled
> vfs.zfs.trim.enabled: 0
> =====
> root@freenas01[~]# cd /mnt/zfs
> root@freenas01[/mnt/zfs]# dd if=/dev/urandom of=hurz bs=10m
> ^C — system freezes temporarily
>

This does one I/O at a time to the filesystem, which then repackages the
I/Os such that multiple I/Os are going on with the NVMe card.


> =====
> Apr 12 13:42:16 freenas01 nvme6: resetting controller
>

OK. This means that whatever I/O workload we've done has caused the NVME
card to stop responding for 30s, so we reset it.


> Apr 12 13:42:16 freenas01 nvme6: aborting outstanding i/o
> Apr 12 13:42:16 freenas01 nvme6: WRITE sqid:1 cid:117 nsid:1 lba:981825104
> len:176
> Apr 12 13:42:16 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1
> cid:117 cdw0:0
>

But only one request was in flight...  And we keep doing it over and over
again, but to different LBAs suggesting that we're stuttering: a few go
through and then things wedge again. This happens every 30ish seconds.


> Apr 12 13:42:49 freenas01 nvme6: resetting controller
> Apr 12 13:42:50 freenas01 nvme6: aborting outstanding i/o
> Apr 12 13:42:50 freenas01 nvme6: WRITE sqid:1 cid:127 nsid:1 lba:984107936
> len:96
> Apr 12 13:42:50 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1
> cid:127 cdw0:0
> Apr 12 13:43:35 freenas01 nvme6: resetting controller
> Apr 12 13:43:35 freenas01 nvme6: aborting outstanding i/o
> Apr 12 13:43:35 freenas01 nvme6: WRITE sqid:1 cid:112 nsid:1 lba:976172032
> len:176
> Apr 12 13:43:35 freenas01 nvme6: ABORTED - BY REQUEST (00/07) sqid:1
> cid:112 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: resetting controller
>

And then this one goes wonkies.


> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:111 nsid:1 lba:976199176
> len:248
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:111 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:102 nsid:1 lba:976199432
> len:248
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:102 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:112 nsid:1 lba:976199680
> len:8
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:112 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:105 nsid:1 lba:976199752
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:105 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:122 nsid:1 lba:976199816
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:122 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:103 nsid:1 lba:976199688
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:103 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:126 nsid:1 lba:976200136
> len:56
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:126 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:106 nsid:1 lba:976200192
> len:8
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:106 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:107 nsid:1 lba:976200200
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:107 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:127 nsid:1 lba:976200264
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:127 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:113 nsid:1 lba:976200328
> len:120
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:113 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:108 nsid:1 lba:976200448
> len:72
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:108 cdw0:0
> Apr 12 13:44:06 freenas01 nvme7: aborting outstanding i/o
> Apr 12 13:44:06 freenas01 nvme7: WRITE sqid:1 cid:116 nsid:1 lba:976200520
> len:64
> Apr 12 13:44:06 freenas01 nvme7: ABORTED - BY REQUEST (00/07) sqid:1
> cid:116 cdw0:0
> =====
> root@freenas01[~]# nvmecontrol identify nvme6
> Controller Capabilities/Features
> ================================
> Vendor ID:                  8086
> Subsystem Vendor ID:        8086
> Serial Number:              BTLJ90230EC61P0FGN
> Model Number:               INTEL SSDPE2KX010T8
>

So it's an intel card.


> Firmware Version:           VDV10131
> Recommended Arb Burst:      0
> IEEE OUI Identifier:        e4 d2 5c
> Multi-Interface Cap:        00
> Max Data Transfer Size:     131072
> Controller ID:              0x00
>
> Admin Command Set Attributes
> ============================
> Security Send/Receive:       Not Supported
> Format NVM:                  Supported
> Firmware Activate/Download:  Supported
> Namespace Managment:         Supported
> Abort Command Limit:         4
> Async Event Request Limit:   4
> Number of Firmware Slots:    1
> Firmware Slot 1 Read-Only:   No
> Per-Namespace SMART Log:     No
> Error Log Page Entries:      64
> Number of Power States:      1
>
> NVM Command Set Attributes
> ==========================
> Submission Queue Entry Size
>   Max:                       64
>   Min:                       64
> Completion Queue Entry Size
>   Max:                       16
>   Min:                       16
> Number of Namespaces:        1
> Compare Command:             Not Supported
> Write Uncorrectable Command: Supported
> Dataset Management Command:  Supported
> Volatile Write Cache:        Not Present
>
> Namespace Drive Attributes
> ==========================
> NVM total cap:               1000204886016
> NVM unallocated cap:         0
> =====
> root@freenas01[~]# zpool status
>   pool: freenas-boot
>  state: ONLINE
>   scan: scrub repaired 0 in 0 days 00:00:03 with 0 errors on Sun Apr  7
> 03:45:03 2019
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         freenas-boot  ONLINE       0     0     0
>           mirror-0  ONLINE       0     0     0
>             nvd0p2  ONLINE       0     0     0
>             nvd1p2  ONLINE       0     0     0
>
> errors: No known data errors
>
>   pool: zfs
>  state: ONLINE
>   scan: scrub repaired 0 in 0 days 00:01:53 with 0 errors on Fri Mar 22
> 19:53:37 2019
> config:
>
>         NAME                                            STATE     READ
> WRITE CKSUM
>         zfs                                             ONLINE       0
>  0     0
>           raidz2-0                                      ONLINE       0
>  0     0
>             gptid/97d0a7ce-44e5-11e9-982e-0025905f99ac  ONLINE       0
>  0     0
>             gptid/98053880-44e5-11e9-982e-0025905f99ac  ONLINE       0
>  0     0
>             gptid/983a9468-44e5-11e9-982e-0025905f99ac  ONLINE       0
>  0     0
>             gptid/987100f2-44e5-11e9-982e-0025905f99ac  ONLINE       0
>  0     0
>             gptid/98aa6e88-44e5-11e9-982e-0025905f99ac  ONLINE       0
>  0     0
>             gptid/98f20b8c-44e5-11e9-982e-0025905f99ac  ONLINE       0
>  0     0
>
> errors: No known data errors
> =====
>
> The problem only appears on the data pool built from 6 INTEL
> SSDPE2KX010T8. The system
> pool built from two KXG50ZNV256G TOSHIBA does not show any problem with
> write load.
>

OK. That suggests Intel has a problem with their firmware.


> All the Intel drives have the latest firmware according to the Intel
> support website.
>
> Could it possibly help to tweak dev.nvme.7.ioq0.num_entries and similar
> entries?
>

It might. Maybe. I'm unsure what the intel firmware bug is.


> What about switching to the nda device instead of nvd?
>

I doubt that would have any effect. They both throw as much I/O onto the
card as possible in the default config.

There's been some minor improvements in -current here. Any chance you could
experimentally try that with this test? You won't get as many I/O abort
errors (since we don't print those), and we have a few more workarounds for
the reset path (though honestly, it's still kinda stinky).

Warner
_______________________________________________
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"

Reply via email to