On 01/29/2016 02:31 PM, Christian Seiler wrote:
> On 01/29/2016 09:07 PM, Mike Christie wrote:
>> On 01/27/2016 05:17 PM, Christian Seiler wrote:
>>> The test setup is quite simple: LIO target running on the host with
>>> fileio backend (I tried both Debian Jessie host with 3.16 kernel and
>>> Ubuntu 15.04 host with 3.19 kernel, no difference), initiator
>>> running inside a libvirt/KVM instance (Debian sid); nothing special
>>> about the setup otherwise. (The target itself shows no problems.)
>>
>> So there are no other log messages on the LIO box? Something about max
>> sectors being violated or a memory allocation failure?
>
> I didn't notice that earlier, but I get:
>
> kernel: fd_do_rw() write returned -22
>
> (-22 is -EINVAL)
>
> Weird. I could have sworn that when I first encountered the problem
> I didn't see anything in the target's logs... Obviously I'm mistaken
> and I'm very sorry that I didn't notice that earlier.
>
>> What kernel version is LIO running on in these tests?
>
> LIO targets I've tried:
> - 3.16.7-ckt20 (Debian Jessie)
> - 3.19.0-43-generic (Ubuntu 15.04)
>
> I'll try a newer version tomorrow and see if the problem
> persists.
>
>> I think you are hitting a bug where the block layer is now sending
>> really large IOs that LIO cannot handle or does not want to.
>>
>> There was a change in LIO to better tell the initiator what size to
>> use, so get the LIO kernel version so we can check that.
>>
>> You can still hit memory allocation failures in LIO and hit a similar
>> issue, but with just a dd of 4MB IOs you should not hit the problem.
>
> I just checked: 8388608 bytes (8 MiB) in a single dd are fine,
> 8388609 (1 byte more) consistently reproduces the error.
>
> Ok, so this is then not actually an initiator problem but a LIO
> target problem... Ok, thanks, then I'll investigate in that
> direction (and ask the LIO people for help if I run into
> problems).
>
> But I'm curious - why haven't I seen any problems with older
> kernel versions for the initiator? I've been using the same LIO
> target for a long time (3.16 was released 1.5 years ago) and
> I've never had any problems with it, with multiple different
> kernel versions for the initiator (going back as far as 3.2).
>
> So something must have also changed with the initiator some
> time after 3.16 that it now triggers the bug...

The block layer used to cap block/fs IO at, I think, around 1024
sectors. It now lets you go up to the minimum of what the driver
supports and what the target says it can handle.

Check out

/sys/block/sdX/queue/max_hw_sectors_kb

and

/sys/block/sdX/queue/max_sectors_kb

in the different kernel versions. In the newer kernel you will see
that max_sectors_kb is a lot higher. You can work around the problem
by manually setting it lower. I think newer versions of LIO will now
also report an IO size.

> Btw. is there any setting (sysctl or iscsid configuration
> option) to adjust the IO size to work around the target
> problem?

The block layer sysfs file /sys/block/sdX/queue/max_sectors_kb.
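
The workaround above can be sketched in a few lines of Python. This is only an illustration, not something from iscsid or LIO: the helper names (sectors_to_kb, read_limit, clamp_max_sectors_kb, queue_dir_for) are made up for this example, 512-byte sectors are assumed, and "sdX" stands for whatever disk the iSCSI session created. Writing the real sysfs file requires root.

```python
from pathlib import Path

SECTOR_BYTES = 512  # assumption: sector counts are in 512-byte units


def sectors_to_kb(sectors: int) -> int:
    """Convert a sector count to KiB, the unit the sysfs files use."""
    return sectors * SECTOR_BYTES // 1024


def queue_dir_for(device: str, sys_root: str = "/sys") -> Path:
    """Return the queue directory for a block device, e.g. /sys/block/sdX/queue."""
    return Path(sys_root) / "block" / device / "queue"


def read_limit(queue_dir: Path, name: str) -> int:
    """Read one integer limit such as max_sectors_kb or max_hw_sectors_kb."""
    return int((queue_dir / name).read_text().strip())


def clamp_max_sectors_kb(queue_dir: Path, wanted_kb: int) -> int:
    """Set max_sectors_kb to wanted_kb, capped at max_hw_sectors_kb.

    max_sectors_kb may not exceed max_hw_sectors_kb, so the requested
    value is capped before it is written. Returns the value written.
    """
    hw_kb = read_limit(queue_dir, "max_hw_sectors_kb")
    new_kb = min(wanted_kb, hw_kb)
    (queue_dir / "max_sectors_kb").write_text(f"{new_kb}\n")
    return new_kb
```

For example, going back to roughly the old cap of 1024 sectors (512 KiB) would look like clamp_max_sectors_kb(queue_dir_for("sdX"), sectors_to_kb(1024)), run as root on the initiator. Whether 512 KiB is the right value for a given target is a guess; anything the target accepts should do.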