On 01/29/2016 02:31 PM, Christian Seiler wrote:
> On 01/29/2016 09:07 PM, Mike Christie wrote:
>> On 01/27/2016 05:17 PM, Christian Seiler wrote:
>>> The test setup is quite simple: LIO target running on the host with
>>> fileio backend (I tried both Debian Jessie host with 3.16 kernel and
>>> Ubuntu 15.04 host with 3.19 kernel, no difference), initiator
>>> running inside a libvirt/KVM instance (Debian sid); nothing special
>>> about the setup otherwise. (The target itself shows no problems.)
>>
>> So there are no other log messages on the LIO box? Something about max
>> sectors being violated or a memory allocation failure?
> 
> I didn't notice that earlier, but I get:
> 
> kernel: fd_do_rw() write returned -22
> 
> (-22 is -EINVAL)
> 
> Weird. I could have sworn that when I first encountered the problem
> I didn't see anything in the target's logs... Obviously I'm mistaken
> and I'm very sorry that I didn't notice that earlier.
> 
>> What kernel version is LIO running on in these tests?
> 
> LIO targets I've tried:
>  - 3.16.7-ckt20 (Debian Jessie)
>  - 3.19.0-43-generic (Ubuntu 15.04)
> 
> I'll try a newer version tomorrow and see if the problem
> persists.
> 
>> I think you are hitting a bug where the block layer is now sending
>> really large IOs that LIO cannot handle or does not want to.
>>
>> There was a change in LIO to better tell the initiator what size to
>> use, so get the LIO kernel version so we can check that.
>>
>> You can still hit memory allocation failures in LIO and hit a similar
>> issue, but with just a dd of 4MB IOs you should not hit the problem.
> 
> I just checked: 8388608 bytes (8 MiB) in a single dd are fine,
> 8388609 (1 byte more) consistently reproduces the error.
> 
> Ok, so this is then not actually an initiator problem but a LIO
> target problem... Ok, thanks, then I'll investigate in that
> direction (and ask the LIO people for help if I run into
> problems).
> 
> But I'm curious - why haven't I seen any problems with older
> kernel versions for the initiator? I've been using the same LIO
> target for a long time (3.16 was released 1.5 years ago) and
> I've never had any problems with it, with multiple different
> kernel versions for the initiator (going back as far as 3.2).
> 
> So something must have also changed with the initiator some
> time after 3.16 that it now triggers the bug...

The block layer used to cap block/fs IO at, I think, around 1024 sectors.
It now lets you go up to the minimum of what the driver and the target
say they can handle.

Check out

/sys/block/sdX/queue/max_hw_sectors_kb
and
/sys/block/sdX/queue/max_sectors_kb

in the different kernel versions. In the newer kernel you will see
max_sectors_kb a lot higher. You can work around the problem by manually
setting that lower.
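
To compare the two limits across kernel versions, something like the
following could be run on the initiator. It is only a sketch: the function
name read_queue_limits and the sysfs_root parameter are made up for
illustration, and "sdX" stays a placeholder for whatever device name the
iSCSI disk actually gets.

```python
from pathlib import Path


def read_queue_limits(device, sysfs_root="/sys/block"):
    """Return max_hw_sectors_kb and max_sectors_kb (both in KiB) for a device.

    sysfs_root is a hypothetical parameter that defaults to the real sysfs
    location; it only exists so the function can be tried on a fake tree.
    """
    queue = Path(sysfs_root) / device / "queue"
    return {
        name: int((queue / name).read_text().strip())
        for name in ("max_hw_sectors_kb", "max_sectors_kb")
    }


if __name__ == "__main__":
    import sys

    # Pass the real device names on the command line, e.g. "sdb".
    for dev in sys.argv[1:]:
        print(dev, read_queue_limits(dev))
```

Running it under the old and the new initiator kernel should make the
difference in max_sectors_kb visible.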

I think newer versions of LIO will now also report an IO size.

> 
> Btw. is there any setting (sysctl or iscsid configuration
> option) to adjust the IO size to work around the target
> problem?

The block layer sysfs file /sys/block/sdX/queue/max_sectors_kb.
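
A rough sketch of applying that workaround from a script, run as root on
the initiator. The helper name cap_max_sectors_kb and the sysfs_root
parameter are made up for illustration. The 8192 KiB figure is an
assumption taken from your dd test (8388608 bytes worked, one byte more
failed); a lower value may be the safer choice.

```python
from pathlib import Path


def cap_max_sectors_kb(device, limit_kib, sysfs_root="/sys/block"):
    """Lower max_sectors_kb for a device and return the value written.

    The requested limit is clamped to max_hw_sectors_kb, since the soft
    limit is not supposed to exceed the hardware limit. sysfs_root is a
    hypothetical parameter so the helper can be tried on a fake tree.
    """
    queue = Path(sysfs_root) / device / "queue"
    hw_limit = int((queue / "max_hw_sectors_kb").read_text().strip())
    new_value = min(limit_kib, hw_limit)
    (queue / "max_sectors_kb").write_text(f"{new_value}\n")
    return new_value


if __name__ == "__main__":
    import sys

    # Assumption from the dd test in this thread: 8388608 bytes is the
    # largest request that worked, which is 8192 KiB.
    largest_ok_kib = 8388608 // 1024
    for dev in sys.argv[1:]:
        print(dev, cap_max_sectors_kb(dev, largest_ok_kib))
```

The value is not persistent, so it would have to be set again after a
reboot or whenever the device is recreated.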

