> On 02.09.2014 at 21:30, Peter Lieven <p...@kamp.de> wrote:
> 
> Looking at the code, is it possible that it is not the guest causing trouble
> here, but the multiwrite_merge code?
> 
> From what I can see, the only limit it applies when merging requests is the
> number of IOVs.
> 
> 
> Any thoughts?
> 
> Mine are:
> a) Introducing bs->bl.max_request_size and set merge = 0 if the result would 
> be too big. Default
> max request size to 32768 sectors (see below).
> b) Hardcoding the limit in multiwrite_merge for now limiting the merged size 
> to 16MB (32768 sectors).
>     Which is the limit we already use in bdrv_co_discard and 
> bdrv_co_write_zeroes if we don't know
>     better.

or c) disabling multiwrite merge completely, either for raw or only for iSCSI.
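
For illustration only, here is a rough sketch of what the size cap in (a)/(b)
could look like; the struct and field names are simplified stand-ins, not the
actual block.c code:

#include <stdbool.h>
#include <stdint.h>

#define MAX_MERGE_SECTORS 32768   /* 16 MB at 512-byte sectors */

struct req {
    int64_t sector;      /* first sector of the request */
    int     nb_sectors;  /* length in sectors */
};

/* Refuse to merge two adjacent requests if the combined size would
 * exceed MAX_MERGE_SECTORS; otherwise allow the merge as before. */
static bool can_merge(const struct req *a, const struct req *b)
{
    if (a->sector + a->nb_sectors != b->sector) {
        return false;   /* not contiguous */
    }
    if (a->nb_sectors + b->nb_sectors > MAX_MERGE_SECTORS) {
        return false;   /* merged request would be too big */
    }
    return true;
}

A real patch would of course also keep the existing IOV-count check.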

Peter

> 
> Peter
> 
>> On 02.09.2014 at 17:28, ronnie sahlberg wrote:
>> That is one big request.  I assume the device reports "no limit" in
>> the VPD page so we can not state it is the guest/application going
>> beyond the allowed limit?
>> 
>> 
>> I am not entirely sure what meaning the target assigns to Protocol
>> Error here.
>> It could be that ~100M is way higher than MaxBurstLength? What is
>> the MaxBurstLength that was reported by the server during login
>> negotiation?
>> If so, we should make libiscsi check MaxBurstLength and fail the
>> request early. We would still fail the I/O, so it would not really solve
>> much, but at least we would not send the request to the server.
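
(For illustration, a minimal sketch of such an early check on the initiator
side; the names are made up for the example, not the actual libiscsi
internals:)

#include <stdint.h>

/* Sketch only: fail a write locally if its data-out length exceeds the
 * MaxBurstLength negotiated at login, instead of sending it to the
 * target and getting a Protocol Error back. */
static int check_burst_length(uint32_t out_len, uint32_t max_burst_length)
{
    if (max_burst_length != 0 && out_len > max_burst_length) {
        return -1;   /* caller fails the request before it hits the wire */
    }
    return 0;
}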
>> 
>> Best would probably be to take the smallest of a non-zero
>> Block-Limits.max_transfer_length and iscsi-MaxBurstLength/block-size
>> and pass this back to the guest in the emulated Block-Limits-VPD.
>> At least then you have tried to tell the guest "never do SCSI I/O
>> bigger than this".
>> 
>> I.e. even if the target reports BlockLimits.MaxTransferLength == 0 ==
>> no limit to QEMU, QEMU should probably take the iSCSI transport limit
>> into account and pass this to the guest by capping the emulated
>> BlockLimits page it exposes at the maximum that MaxBurstLength allows.
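
(As an illustration of the clamping described above; the variable names are
mine, not the actual QEMU/libiscsi fields:)

#include <stdint.h>

/* Return the maximum transfer length (in blocks) to report to the guest
 * in the emulated Block Limits VPD page: the smaller of the device limit
 * (where 0 means "no limit") and the transport limit derived from the
 * negotiated MaxBurstLength. */
static uint32_t guest_max_xfer_len(uint32_t dev_max_xfer_len,
                                   uint32_t max_burst_length,
                                   uint32_t block_size)
{
    uint32_t transport_max = max_burst_length / block_size;

    if (dev_max_xfer_len == 0 || dev_max_xfer_len > transport_max) {
        return transport_max;
    }
    return dev_max_xfer_len;
}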
>> 
>> 
>> Then if BTRFS or SG_IO in the guest ignores the BlockLimits it is
>> clearly a guest problem.
>> 
>> (A different interpretation for ProtocolError could be the mismatch
>> between the iscsi expected data transfer length and the scsi transfer
>> length, but that should result in residuals, not protocol error.)
>> 
>> 
>> 
>> Hypothetically there could be targets that support really huge
>> MaxBurstLengths > 32MB. For those you probably want to switch to
>> WRITE16 when the SCSI transfer length goes > 0xffff.
>> 
>> - if (iscsilun->use_16_for_rw)  {
>> + if (iscsilun->use_16_for_rw || num_sectors > 0xffff)  {
>> 
>> 
>> regards
>> ronnie sahlberg
>> 
>>> On Mon, Sep 1, 2014 at 8:21 AM, Peter Lieven <p...@kamp.de> wrote:
>>> On 17.06.2014 13:46, Paolo Bonzini wrote:
>>> 
>>> On 17/06/2014 13:37, Peter Lieven wrote:
>>> 
>>> On 17.06.2014 13:15, Paolo Bonzini wrote:
>>> 
>>> On 17/06/2014 08:14, Peter Lieven wrote:
>>> 
>>> 
>>> 
>>> BTW, while debugging a case with a bigger storage supplier I found
>>> that open-iscsi seems to show exactly this non-deterministic behaviour.
>>> I have a 3TB LUN. If I access sectors below 2TB it uses READ10/WRITE10, and
>>> if I go beyond 2TB it switches to READ16/WRITE16.
>>> 
>>> 
>>> Isn't that exactly what your latest patch does for >64K sector writes? :)
>>> 
>>> 
>>> Not exactly, we choose the default by checking the LUN size: 10-byte CDBs
>>> for < 2TB and 16-byte otherwise.
>>> 
>>> 
>>> Yeah, I meant introducing the non-determinism.
>>> 
>>> My latest patch makes an exception if a request is bigger than 64K
>>> sectors and switches to 16-byte requests. These would otherwise end in
>>> an I/O error.
>>> 
>>> 
>>> It could also be split at the block layer, like we do for unmap.  I think
>>> there's also a maximum transfer size somewhere in the VPD; we could switch
>>> to READ16/WRITE16 if it is >64K sectors.
>>> 
>>> 
>>> It seems that there might be a real world example where Linux issues >32MB
>>> write requests. Maybe someone familiar with btrfs can advise.
>>> I see iSCSI Protocol Errors in my logs:
>>> 
>>> Sep  1 10:10:14 libiscsi:0 PDU header: 01 a1 00 00 00 01 00 00 00 00 00 00
>>> 00 00 00 00 00 00 00 07 06 8f 30 00 00 00 00 06 00 00 00 0a 2a 00 01 09 9e
>>> 50 00 47 98 00 00 00 00 00 00 00 [XXX]
>>> Sep  1 10:10:14 qemu-2.0.0: iSCSI: Failed to write10 data to iSCSI lun.
>>> Request was rejected with reason: 0x04 (Protocol Error)
>>> 
>>> Looking at the headers, the xferlen in the iSCSI PDU is 110047232 bytes,
>>> which is 214936 sectors.
>>> 214936 % 65536 = 18328, which is exactly the number of blocks in the SCSI
>>> WRITE10 CDB.
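
(For illustration, the arithmetic above as a small standalone snippet; the
input values are taken from the PDU header, the rest is just the 16-bit
truncation a WRITE10 CDB imposes:)

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t xferlen_bytes = 110047232;        /* expected data transfer length from the PDU */
    uint32_t sectors = xferlen_bytes / 512;    /* 214936 sectors */
    uint16_t write10_len = (uint16_t)sectors;  /* WRITE10 transfer length field is 16 bits */

    /* 214936 does not fit in 16 bits, so the CDB carries 214936 % 65536 = 18328
     * blocks, which no longer matches the iSCSI expected data transfer length. */
    printf("sectors=%u, WRITE10 transfer length=%u\n", sectors, write10_len);
    return 0;
}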
>>> 
>>> Can someone advise whether this is something that btrfs can cause,
>>> or whether I have to
>>> blame the customer for issuing very big write requests with Direct I/O?
>>> 
>>> The user sees something like this in the log:
>>> [34640.489284] BTRFS: bdev /dev/vda2 errs: wr 8232, rd 0, flush 0, corrupt
>>> 0, gen 0
>>> [34640.490379] end_request: I/O error, dev vda, sector 17446880
>>> [34640.491251] end_request: I/O error, dev vda, sector 5150144
>>> [34640.491290] end_request: I/O error, dev vda, sector 17472080
>>> [34640.492201] end_request: I/O error, dev vda, sector 17523488
>>> [34640.492201] end_request: I/O error, dev vda, sector 17536592
>>> [34640.492201] end_request: I/O error, dev vda, sector 17599088
>>> [34640.492201] end_request: I/O error, dev vda, sector 17601104
>>> [34640.685611] end_request: I/O error, dev vda, sector 15495456
>>> [34640.685650] end_request: I/O error, dev vda, sector 7138216
>>> 
>>> Thanks,
>>> Peter
> 
