> On 02.09.2014 at 21:30, Peter Lieven <p...@kamp.de> wrote:
>
> Looking at the code, is it possible that not the guest is causing trouble
> here, but the multiwrite_merge code?
>
> From what I see, the only limit it has when merging requests is the number
> of IOVs.
>
> Any thoughts?
>
> Mine are:
> a) Introducing bs->bl.max_request_size and setting merge = 0 if the result
> would be too big. Default the max request size to 32768 sectors (see below).
> b) Hardcoding the limit in multiwrite_merge for now, limiting the merged
> size to 16MB (32768 sectors), which is the limit we already use in
> bdrv_co_discard and bdrv_co_write_zeroes if we don't know better.

or c) disabling multiwrite merge completely, either for raw or only for iSCSI.
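A minimal sketch of what b) could look like, using an illustrative stand-in
for QEMU's BlockRequest (the real multiwrite_merge works on BlockRequest
arrays in block.c and additionally checks the IOV count):

    #include <stdbool.h>
    #include <stdint.h>

    /* Cap merged multiwrite requests at 32768 sectors (16MB at 512-byte
     * sectors), the fallback limit bdrv_co_discard and bdrv_co_write_zeroes
     * already use. MultiwriteReq is an illustrative stand-in, not QEMU's
     * actual structure. */
    #define MAX_MERGED_SECTORS 32768

    typedef struct MultiwriteReq {
        int64_t sector;      /* first sector of the request */
        int nb_sectors;      /* request length in sectors */
    } MultiwriteReq;

    static bool can_merge(const MultiwriteReq *a, const MultiwriteReq *b)
    {
        /* Only merge requests that are contiguous on disk... */
        if (a->sector + a->nb_sectors != b->sector) {
            return false;
        }
        /* ...and only while the merged request stays within the cap. */
        return a->nb_sectors + b->nb_sectors <= MAX_MERGED_SECTORS;
    }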
Peter

>
> Peter
>
>> On 02.09.2014 at 17:28, ronnie sahlberg wrote:
>> That is one big request. I assume the device reports "no limit" in
>> the VPD page, so we can not state it is the guest/application going
>> beyond the allowed limit?
>>
>> I am not entirely sure what meaning the target assigns to Protocol
>> Error here.
>> It could be that ~100M is way higher than MaxBurstLength? What is
>> the MaxBurstLength that was reported by the server during login
>> negotiation?
>> If so, we should make libiscsi check the MaxBurstLength and fail the
>> request early. We would still fail the I/O, so it will not really solve
>> much, but at least we would not send the request to the server.
>>
>> Best would probably be to take the smallest of a non-zero
>> Block-Limits.max_transfer_length and iSCSI MaxBurstLength/block-size
>> and pass this back to the guest in the emulated Block Limits VPD.
>> At least then you have tried to tell the guest "never do SCSI I/O
>> bigger than this".
>>
>> I.e. even if the target reports BlockLimits.MaxTransferLength == 0 ==
>> no limit to QEMU, QEMU should probably take the iSCSI transport limit
>> into account and pass it to the guest by scaling the emulated
>> Block Limits page to the maximum that MaxBurstLength allows.
>>
>> Then if BTRFS or SG_IO in the guest ignores the Block Limits, it is
>> clearly a guest problem.
>>
>> (A different interpretation of Protocol Error could be a mismatch
>> between the iSCSI expected data transfer length and the SCSI transfer
>> length, but that should result in residuals, not a protocol error.)
>>
>> Hypothetically there could be targets that support really huge
>> MaxBurstLengths > 32MB. For those you probably want to switch to
>> WRITE16 when the SCSI transfer length goes > 0xffff:
>>
>> -    if (iscsilun->use_16_for_rw) {
>> +    if (iscsilun->use_16_for_rw || num_sectors > 0xffff) {
>>
>> regards
>> ronnie sahlberg
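A sketch of the clamping ronnie suggests above; the helper name and
parameters are illustrative assumptions, not QEMU's actual API:

    #include <stdint.h>

    /* Max transfer length (in blocks) to advertise in the emulated Block
     * Limits VPD page: the smaller of the target's own limit (0 means
     * "no limit") and what the negotiated iSCSI MaxBurstLength allows. */
    static uint32_t emulated_max_transfer_length(uint32_t bl_max_transfer_length,
                                                 uint32_t max_burst_length,
                                                 uint32_t block_size)
    {
        uint32_t transport_max = max_burst_length / block_size;

        if (bl_max_transfer_length == 0 ||
            bl_max_transfer_length > transport_max) {
            return transport_max;    /* the transport is the tighter limit */
        }
        return bl_max_transfer_length;
    }

With the RFC 3720 default MaxBurstLength of 262144 bytes and 512-byte
blocks, this would tell the guest to cap a single I/O at 512 blocks.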
>>> On Mon, Sep 1, 2014 at 8:21 AM, Peter Lieven <p...@kamp.de> wrote:
>>> On 17.06.2014 13:46, Paolo Bonzini wrote:
>>>
>>> On 17/06/2014 13:37, Peter Lieven wrote:
>>>
>>> On 17.06.2014 13:15, Paolo Bonzini wrote:
>>>
>>> On 17/06/2014 08:14, Peter Lieven wrote:
>>>
>>> BTW, while debugging a case with a bigger storage supplier I found
>>> that open-iscsi seems to do exactly this non-deterministic behaviour.
>>> I have a 3TB LUN. If I access sectors below 2TB it uses READ10/WRITE10,
>>> and if I go beyond 2TB it changes to READ16/WRITE16.
>>>
>>> Isn't that exactly what your latest patch does for >64K sector writes? :)
>>>
>>> Not exactly, we choose the default by checking the LUN size: 10 byte
>>> for < 2TB and 16 byte otherwise.
>>>
>>> Yeah, I meant introducing the non-determinism.
>>>
>>> My latest patch makes an exception if a request is bigger than 64K
>>> sectors and switches to 16 byte requests. These would otherwise end in
>>> an I/O error.
>>>
>>> It could also be split at the block layer, like we do for unmap. I think
>>> there's also a maximum transfer size somewhere in the VPD; we could do
>>> READ16/WRITE16 if it is >64K sectors.
>>>
>>> It seems that there might be a real-world example where Linux issues
>>> >32MB write requests. Maybe someone familiar with btrfs can advise.
>>>
>>> I see iSCSI Protocol Errors in my logs:
>>>
>>> Sep 1 10:10:14 libiscsi:0 PDU header: 01 a1 00 00 00 01 00 00 00 00 00 00
>>> 00 00 00 00 00 00 00 07 06 8f 30 00 00 00 00 06 00 00 00 0a 2a 00 01 09 9e
>>> 50 00 47 98 00 00 00 00 00 00 00 [XXX]
>>> Sep 1 10:10:14 qemu-2.0.0: iSCSI: Failed to write10 data to iSCSI lun.
>>> Request was rejected with reason: 0x04 (Protocol Error)
>>>
>>> Looking at the headers, the xferlen in the iSCSI PDU is 110047232 bytes,
>>> which is 214936 sectors.
>>> 214936 % 65536 = 18328, which is exactly the number of blocks in the
>>> SCSI WRITE10 CDB.
>>>
>>> Can someone advise if this is something that btrfs can cause, or do I
>>> have to blame the customer for issuing very big write requests with
>>> Direct I/O?
>>>
>>> The user sees something like this in the log:
>>> [34640.489284] BTRFS: bdev /dev/vda2 errs: wr 8232, rd 0, flush 0, corrupt
>>> 0, gen 0
>>> [34640.490379] end_request: I/O error, dev vda, sector 17446880
>>> [34640.491251] end_request: I/O error, dev vda, sector 5150144
>>> [34640.491290] end_request: I/O error, dev vda, sector 17472080
>>> [34640.492201] end_request: I/O error, dev vda, sector 17523488
>>> [34640.492201] end_request: I/O error, dev vda, sector 17536592
>>> [34640.492201] end_request: I/O error, dev vda, sector 17599088
>>> [34640.492201] end_request: I/O error, dev vda, sector 17601104
>>> [34640.685611] end_request: I/O error, dev vda, sector 15495456
>>> [34640.685650] end_request: I/O error, dev vda, sector 7138216
>>>
>>> Thanks,
>>> Peter
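For reference, a small worked decode of the rejected PDU quoted above; the
constants are copied from the logged header, with byte offsets per the
RFC 3720 BHS layout and the SBC WRITE10 CDB layout:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t xferlen = 0x068f3000; /* BHS bytes 20-23: expected data
                                        * transfer length = 110047232 bytes */
        uint16_t cdb_len = 0x4798;     /* WRITE10 CDB bytes 7-8: transfer
                                        * length = 18328 blocks */
        uint32_t sectors = xferlen / 512;

        /* 214936 % 65536 == 18328: the sector count wrapped modulo 2^16 in
         * the 16-bit WRITE10 transfer length field, so the iSCSI and SCSI
         * lengths disagree, matching the mismatch described above. */
        printf("sectors=%u, sectors%%65536=%u, cdb=%u\n",
               sectors, sectors % 65536, cdb_len);
        return 0;
    }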