On Sat, Mar 14, 2026 at 4:38 PM Trond Myklebust <[email protected]> wrote:
>
> Hi Salvatore,
>
> On Sat, 2026-03-14 at 13:23 +0100, Salvatore Bonaccorso wrote:
> > Control: forwarded -1
> > https://lore.kernel.org/regressions/[email protected]
> > Control: tags -1 + upstream
> >
> > Hi Trond, hi Anna
> >
> > In Debian we got reports of an NFS client regression where large
> > rsize/wsize (1MB) causes EIO after the commit 2b092175f5e3 ("NFS: Fix
> > inheritance of the block sizes when automounting") and its backports
> > to the stable series. The report in full is at:
> > https://bugs.debian.org/1128834
> >
> > Maik reported:
> > > after upgrading from Linux 6.1.158 to 6.1.162, NFS client writes
> > > fail with input/output errors (EIO).
> > >
> > > Environment:
> > > - Debian Bookworm
> > > - Kernel: 6.1.0-43-amd64 (6.1.162-1)
> > > - NFSv4.2 (also reproducible with 4.1)
> > > - Default mount options include rsize=1048576,wsize=1048576
> > >
> > > Reproducer:
> > > dd if=/dev/zero of=~/testfile bs=1M count=500
> > > or
> > > dd if=/dev/zero of=~/testfile bs=4k count=100000
> > >
> > > On different computers and VMs!
> > >
> > >
> > > Result:
> > > dd: closing output file: Input/output error
> > >
> > > Workaround:
> > > Mount with:
> > > rsize=65536,wsize=65536
> > >
> > > With reduced I/O size, the issue disappears completely.
> > >
> > > Impact:
> > > - File writes fail (file >1M)
> > > - KDE Plasma crashes due to corrupted cache/config writes
> > >
> > > The issue does NOT occur on kernel 6.1.0-42 (6.1.158).
> >
> > I was not able to reproduce the problem, and it turned out that it
> > seems to be triggerable when a Dell EMC (Isilon) system is used on
> > the NFS server side. So the issue was not really considered initially
> > as being "our" issue.
> >
> > Valentin SAMIR, a second user affected, also reported the issue
> > to Dell, and Dell seems to point at a client issue instead. Valentin
> > writes:
> >
> > > We are facing the same issue.
> > > Dell seems to point to a client issue:
> > > The kernel treats the max size as the nfs payload max size whereas
> > > OneFS treats the max size as the overall compound packet max size
> > > (everything related to NFS in the call). Hence when OneFS receives
> > > a call with a payload of 1M, the overall NFS packet is slightly
> > > bigger and it returns an NFS4ERR_REQ_TOO_BIG.
> > >
> > > So the question is: should max req size/max resp size be treated as
> > > the nfs payload max size or the whole nfs packet max size?
> >
> > His reply in https://bugs.debian.org/1128834#55 contains a quote from
> > the response Valentin got from Dell. I'm quoting it in full here for
> > easier followup in case needed:
> >
> > > I have been looking at the action plan output we captured.
> > > Specifically around when you first mounted and then repro'ed the
> > > error.
> > >
> > > Looking at the pcap we gathered, firstly, let's concentrate on the
> > > "create session" calls between Client / Node.
> > > Here we can see these max sizes advertised - per screenshot.
> > >
> > >
> > > Frame 17: 306 bytes on wire (2448 bits), 306 bytes captured (2448 bits)
> > > Ethernet II, Src: SuperMicroCo_1d:7d:b2 (ac:1f:6b:1d:7d:b2), Dst: MellanoxTech_bd:8c:7a (c4:70:bd:bd:8c:7a)
> > > Internet Protocol Version 4, Src: 172.22.1.132, Dst: 172.22.16.29
> > > Transmission Control Protocol, Src Port: 810, Dst Port: 2049, Seq: 613, Ack: 277, Len: 240
> > > Remote Procedure Call, Type:Call XID:0x945b7e1d
> > > Network File System, Ops(1): CREATE_SESSION
> > > [Program Version: 4]
> > > [V4 Procedure: COMPOUND (1)]
> > > Tag: <EMPTY>
> > > minorversion: 2
> > > Operations (count: 1): CREATE_SESSION
> > > Opcode: CREATE_SESSION (43)
> > > clientid: 0x36adef626e919bf4
> > > seqid: 0x00000001
> > > csa_flags: 0x00000003, CREATE_SESSION4_FLAG_PERSIST, CREATE_SESSION4_FLAG_CONN_BACK_CHAN
> > > csa_fore_chan_attrs
> > > hdr pad size: 0
> > > max req size: 1049620
> > > max resp size: 1049480
> > > max resp size cached: 7584
> > > max ops: 8
> > > max reqs: 64
> > > csa_back_chan_attrs
> > > hdr pad size: 0
> > > max req size: 4096
> > > max resp size: 4096
> > > max resp size cached: 0
> > > max ops: 2
> > > max reqs: 16
> > > cb_program: 0x40000000
> > > flavor: 1
> > > stamp: 2087796144
> > > machine name: srv-transfert.ad.phedre.fr
> > > uid: 0
> > > gid: 0
> > > [Main Opcode: CREATE_SESSION (43)]
> > >
> > >
> > > And the Node responds, as expected confirming the max size of
> > > 1048576.
> > >
> > >
> > > Frame 19: 194 bytes on wire (1552 bits), 194 bytes captured (1552 bits)
> > > Ethernet II, Src: MellanoxTech_bd:8c:7a (c4:70:bd:bd:8c:7a), Dst: IETF-VRRP-VRID_3f (00:00:5e:00:01:3f)
> > > Internet Protocol Version 4, Src: 172.22.16.29, Dst: 172.22.1.132
> > > Transmission Control Protocol, Src Port: 2049, Dst Port: 810, Seq: 321, Ack: 853, Len: 128
> > > Remote Procedure Call, Type:Reply XID:0x945b7e1d
> > > Network File System, Ops(1): CREATE_SESSION
> > > [Program Version: 4]
> > > [V4 Procedure: COMPOUND (1)]
> > > Status: NFS4_OK (0)
> > > Tag: <EMPTY>
> > > Operations (count: 1)
> > > Opcode: CREATE_SESSION (43)
> > > Status: NFS4_OK (0)
> > > sessionid: f49b916e62efad36f200000006000000
> > > seqid: 0x00000001
> > > csr_flags: 0x00000002, CREATE_SESSION4_FLAG_CONN_BACK_CHAN
> > > csr_fore_chan_attrs
> > > hdr pad size: 0
> > > max req size: 1048576
> > > max resp size: 1048576
> > > max resp size cached: 7584
> > > max ops: 8
> > > max reqs: 64
> > > csr_back_chan_attrs
> > > hdr pad size: 0
> > > max req size: 4096
> > > max resp size: 4096
> > > max resp size cached: 0
> > > max ops: 2
> > > max reqs: 16
> > > [Main Opcode: CREATE_SESSION (43)]
> > >
> > >
> > > Now if we look later on in the sequence when the Client sends the
> > > write request to the Node - we see in the frame, the max size is as
> > > expected 1048576
> > >
> > >
> > > Frame 747: 1998 bytes on wire (15984 bits), 1998 bytes captured (15984 bits)
> > > Ethernet II, Src: SuperMicroCo_1d:7d:b2 (ac:1f:6b:1d:7d:b2), Dst: MellanoxTech_bd:8c:7a (c4:70:bd:bd:8c:7a)
> > > Internet Protocol Version 4, Src: 172.22.1.132, Dst: 172.22.16.29
> > > Transmission Control Protocol, Src Port: 810, Dst Port: 2049, Seq: 1054149, Ack: 6009, Len: 1932
> > > [345 Reassembled TCP Segments (1048836 bytes): #84(1448),
> > > #85(5792), #87(5792), #89(1448), #90(1448), #92(4344), #94(4344),
> > > #96(2896), #98(1448), #99(2896), #101(4344), #103(4344),
> > > #105(1448), #106(1448), #108(2896), #110(1448), #111(2896)]
> > > Remote Procedure Call, Type:Call XID:0xb45b7e1d
> > > Network File System, Ops(4): SEQUENCE, PUTFH, WRITE, GETATTR
> > > [Program Version: 4]
> > > [V4 Procedure: COMPOUND (1)]
> > > Tag: <EMPTY>
> > > minorversion: 2
> > > Operations (count: 4): SEQUENCE, PUTFH, WRITE, GETATTR
> > > Opcode: SEQUENCE (53)
> > > Opcode: PUTFH (22)
> > > Opcode: WRITE (38)
> > > StateID
> > > offset: 0
> > > stable: FILE_SYNC4 (2)
> > > Write length: 1048576
> > > Data: <DATA>
> > > Opcode: GETATTR (9)
> > > [Main Opcode: WRITE (38)]
> > >
> > >
> > > However we then see the Node reply a short time later with (as
> > > below) REQ_TOO_BIG - meaning the max size has been exceeded.
> > >
> > > Frame 749: 114 bytes on wire (912 bits), 114 bytes captured (912 bits)
> > > Ethernet II, Src: MellanoxTech_bd:8c:7a (c4:70:bd:bd:8c:7a), Dst: IETF-VRRP-VRID_3f (00:00:5e:00:01:3f)
> > > Internet Protocol Version 4, Src: 172.22.16.29, Dst: 172.22.1.132
> > > Transmission Control Protocol, Src Port: 2049, Dst Port: 810, Seq: 6009, Ack: 1056081, Len: 48
> > > Remote Procedure Call, Type:Reply XID:0xb45b7e1d
> > > Network File System, Ops(1): SEQUENCE(NFS4ERR_REQ_TOO_BIG)
> > > [Program Version: 4]
> > > [V4 Procedure: COMPOUND (1)]
> > > Status: NFS4ERR_REQ_TOO_BIG (10065)
> > > Tag: <EMPTY>
> > > Operations (count: 1)
> > > Opcode: SEQUENCE (53)
> > > Status: NFS4ERR_REQ_TOO_BIG (10065)
> > > [Main Opcode: SEQUENCE (53)]
> > >
> > >
> > > Why is this?
> > >
> > > The reason for this seems to be related to the Client.
> > >
> > > From the Cluster side, the max rsize/wsize is the overall compound
> > > packet max size (everything related to NFS in the call)
> > >
> > > So for example with a compound call in nfsv4.2 - this might include
> > > the below type detail which does not exceed the overall size
> > > 1048576:
> > >
> > > [
> > > COMPOUND header
> > > SEQUENCE ....
> > > PUTFH ...
> > > WRITE header
> > > WRITE payload
> > > ] (overall) < 1mb
> > >
> > >
> > > However the Client instead uses r/wsize from mount option, as a
> > > limit for the write payload.
> > >
> > > So with the same example
> > > COMPOUND header
> > > SEQUENCE ....
> > > PUTFH ...
> > > WRITE header
> > >
> > > [
> > > WRITE payload
> > > ] (write) < 1mb
> > >
> > > But overall this ends up being 1mb + all the overhead of write
> > > header, compound header, putfh etc, which puts it over the channel
> > > limit of 1048576 and hence the error returned.
> > >
> > > So it seems here the Client ignores that value and insists on the
> > > WRITE with a payload == wsize; which in total with WRITE overhead
> > > and all other requests in COMPOUND (PUTFH, etc) exceeds
> > > maxrequestsize, which prompts NFS4ERR_REQ_TOO_BIG.
> > >
> > >
> > > And as you can see, once you reduce the size within the mount
> > > options on the Client side, it no longer exceeds its limits.
> > > Meaning you don't get the I/O error.
> >
> > So the question: are we behaving correctly here, or is it our
> > problem, or is the issue still considered to be on Dell's side?
> >
> > #regzbot introduced: 2b092175f5e301cdaa935093edfef2be9defb6df
> > #regzbot monitor: https://bugs.debian.org/1128834
> >
> > How to proceed from here?
> >
> The Linux NFS client uses the 'maxread' and 'maxwrite' attributes (see
> RFC8881 Sections 5.8.2.20. and 5.8.2.21.) to decide how big a payload
> to request/send to the server in a READ/WRITE COMPOUND.

So maxread and/or maxwrite MUST NOT be larger than the client's maximum
RPC size?

>
> If Dell's implementation is returning a size of 1MB, then the Linux
> client will use that value. It won't cross check with the max request
> size, because it assumes that since both values derive from the server,
> there will be no conflict between them.

Maybe add an assert()-like warning to syslog if there is a mismatch?

Aurélien
--
Aurélien Couderc <[email protected]>
Big Data/Data mining expert, chess enthusiast
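
To make the mismatch concrete, here is a minimal Python sketch of the kind of cross-check suggested above, using only the numbers from the quoted capture. The function names (write_fits, largest_safe_write) are hypothetical and are not kernel APIs. The overhead figure is an approximation taken from the reassembled TCP size in frame 747, which may include transport framing.

```python
# Figures copied from the packet capture quoted in this thread.
NEGOTIATED_MAX_REQ_SIZE = 1048576  # "max req size" in the server's CREATE_SESSION reply
WRITE_PAYLOAD = 1048576            # "Write length" in frame 747 (equals wsize)
REASSEMBLED_REQUEST = 1048836      # reassembled TCP size reported for frame 747

# Approximate per-request overhead (RPC header, SEQUENCE, PUTFH, WRITE
# arguments, GETATTR). Assumption: derived from the capture, not from a spec.
overhead = REASSEMBLED_REQUEST - WRITE_PAYLOAD


def write_fits(maxwrite: int, max_req_size: int, overhead: int) -> bool:
    """Hypothetical check: does a WRITE of maxwrite bytes plus the COMPOUND
    overhead stay within the session's maximum request size?"""
    return maxwrite + overhead <= max_req_size


def largest_safe_write(max_req_size: int, overhead: int) -> int:
    """Hypothetical helper: largest payload that still fits in one request."""
    return max(0, max_req_size - overhead)


if __name__ == "__main__":
    print("overhead:", overhead)
    print("1 MiB write fits:", write_fits(WRITE_PAYLOAD, NEGOTIATED_MAX_REQ_SIZE, overhead))
    print("64 KiB write fits:", write_fits(65536, NEGOTIATED_MAX_REQ_SIZE, overhead))
    print("largest safe write:", largest_safe_write(NEGOTIATED_MAX_REQ_SIZE, overhead))
```

With these inputs the 1 MiB write does not fit, while the 64 KiB workaround does, which matches what was reported. A warning in the client could be triggered by the same comparison.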

