Re: WRITE BUFFER commands through SG_IO getting rounded up to sector past 32k

2007-03-12 Thread Aravind Parchuri

[EMAIL PROTECTED] wrote:

Aravind Parchuri wrote:

[EMAIL PROTECTED] wrote:

Aravind Parchuri wrote:
  

My log messages were getting all mixed up, so I cleaned up my little
test to send just one command at a time. It actually looks like the mid
layer passes the command through to open-iscsi with the right size the
first time, but then it sends a second command with request_bufflen = 0.

I can verify that the command completed on the target just like the
regular ones did, so there should be no reason for a retry of any sort.

Here's the log for a 32896 byte command:
Mar  9 11:27:43 ITX000c292c3c8d kernel: sg_open: dev=0, flags=0x802
Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_add_sfp: sfp=0xcbadc000
Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_build_reserve: req_size=32768
Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_build_indirect:
buff_size=32768, blk_size=32768
Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_add_sfp:   bufflen=32768,
k_use_sg=1
Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_ioctl: sg0, cmd=0x2285
Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_common_write:  scsi
opcode=0x3b, cmd_size=10
Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_start_req: dxfer_len=32896
Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_build_indirect:
buff_size=32896, blk_size=33280
Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_write_xfer: num_xfer=32896,
iovec_count=0, k_use_sg=2
Mar  8 11:27:43 ITX000c292c3c8d kernel: iscsi_queuecommand: opcode 3b
request_bufflen 32896 transfersize 32896


Did you add your own output? Could you enable iscsi debugging? What
kernel is this with and what versions of open-iscsi (upstream or svn or
tarball release)?
  

No custom output, all of this is from scsi (SCSI_LOG_TIMEOUT=5) and
open-iscsi (DEBUG_SCSI enabled). I thought I mentioned this in an
earlier mail -  the kernel is 2.6.19, but the open-iscsi drivers and


The output here does not have DEBUG_SCSI enabled. It just has your
custom iscsi output. The first log had DEBUG_SCSI iscsi output.



I'm extremely sorry - I think I forgot to enable it while recompiling 
stuff with your patch, and I was fooled by that print that we put in 
iscsi_queuecommand.



But what I want is not just parts you cut out. I am looking for all of
it. If I run:

sg_write_buffer /dev/sdi -l 32896


I wasn't aware that such a utility existed - it must be part of 
sg_utils, I guess. I just hacked up a test to fire an SG_IO ioctl to a 
given device. I guess I should try with this one next, just to check.




with my patch and a netapp target, I get the error:

Write buffer command not supported

If I look at the iscsi log, I see

Mar 12 12:59:54 madmax kernel: iscsi: cmd [itt 0x25 total 32896 imm_data
32896 unsol count 0, unsol offset 32896]
Mar 12 12:59:54 madmax kernel: iscsi: ctask enq [write cid 0 sc
81000e82d6c0 cdb 0x3b itt 0x25 len 32896 cmdsn 294 win 128]
Mar 12 12:59:54 madmax kernel: iscsi: ctask deq [cid 0 xmstate 2 itt 0x25]
Mar 12 12:59:55 madmax kernel: iscsi: cmdrsp [op 0x21 cid 0 itt 0x25 len 24]
Mar 12 12:59:55 madmax kernel: iscsi: copied 22 bytes of sense
Mar 12 12:59:55 madmax kernel: iscsi: done [sc 81000e82d6c0 res 2
itt 0x25]


which shows the len got set and that the target failed the target and we
got some sense back. The command is then failed to sg and sg returns to
userspace.

Your output just shows that we may be retrying the same command over and
over and it never completes which would be strange because the command
is a block pc command. It would help to see what happens with iscsi's
DEBUG_SCSI option to see if the command failed or completed ok. It would
also be nice to get the scsi-ml output to see what the sense is and what
is going on there, but as a first step we can look at the iscsi output.



Thanks for explaining in detail what's useful in debugging the problem. 
Here are logs with DEBUG_SCSI enabled, SCSI_LOG_TIMEOUT set to level 5, 
SCSI_LOG_MLQUEUE to 4 and SCSI_LOG_MLCOMPLETE to 4. This is everything 
in /var/log/messages from the time I issued the ioctl.


Command with 32896 bytes:

Mar 12 11:02:33 ITX000c292c3c8d kernel: iscsi: mgmtpdu [op 0x40 hdr-itt 
0x datalen 0]
Mar 12 11:02:33 ITX000c292c3c8d kernel: iscsi: mtask deq [cid 0 state 4 
itt 0xa05]
Mar 12 11:03:03 ITX000c292c3c8d kernel: iscsi: mgmtpdu [op 0x40 hdr-itt 
0x datalen 0]
Mar 12 11:03:03 ITX000c292c3c8d kernel: iscsi: mtask deq [cid 0 state 4 
itt 0xa06]
Mar 12 11:03:32 ITX000c292c3c8d kernel: iscsi: mgmtpdu [op 0x40 hdr-itt 
0x datalen 0]
Mar 12 11:03:32 ITX000c292c3c8d kernel: iscsi: mtask deq [cid 0 state 4 
itt 0xa07]
Mar 12 11:04:02 ITX000c292c3c8d kernel: iscsi: mgmtpdu [op 0x40 hdr-itt 
0x datalen 0]
Mar 12 11:04:02 ITX000c292c3c8d kernel: iscsi: mtask deq [cid 0 state 4 
itt 0xa08]

Mar 12 11:04:21 ITX000c292c3c8d kernel: sg_open: dev=0, flags=0x802
Mar 12 11:04:21 ITX000c292c3c8d kernel: sg_add_sfp: sfp=0xcb554000
Mar 12 11:04:21 ITX000c292c3c8d kernel: sg_build_reserve: req_size=32768
Mar 12 11:04:21 ITX000c292c3c8d

Re: WRITE BUFFER commands through SG_IO getting rounded up to sector past 32k

2007-03-10 Thread Aravind Parchuri
[EMAIL PROTECTED] wrote:
 Aravind Parchuri wrote:
   
 My log messages were getting all mixed up, so I cleaned up my little
 test to send just one command at a time. It actually looks like the mid
 layer passes the command through to open-iscsi with the right size the
 first time, but then it sends a second command with request_bufflen = 0.

 I can verify that the command completed on the target just like the
 regular ones did, so there should be no reason for a retry of any sort.

 Here's the log for a 32896 byte command:
 Mar  9 11:27:43 ITX000c292c3c8d kernel: sg_open: dev=0, flags=0x802
 Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_add_sfp: sfp=0xcbadc000
 Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_build_reserve: req_size=32768
 Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_build_indirect:
 buff_size=32768, blk_size=32768
 Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_add_sfp:   bufflen=32768,
 k_use_sg=1
 Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_ioctl: sg0, cmd=0x2285
 Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_common_write:  scsi
 opcode=0x3b, cmd_size=10
 Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_start_req: dxfer_len=32896
 Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_build_indirect:
 buff_size=32896, blk_size=33280
 Mar  8 11:27:43 ITX000c292c3c8d kernel: sg_write_xfer: num_xfer=32896,
 iovec_count=0, k_use_sg=2
 Mar  8 11:27:43 ITX000c292c3c8d kernel: iscsi_queuecommand: opcode 3b
 request_bufflen 32896 transfersize 32896
 

 Did you add your own output? Could you enable iscsi debugging? What
 kernel is this with and what versions of open-iscsi (upstream or svn or
 tarball release)?
   
No custom output, all of this is from scsi (SCSI_LOG_TIMEOUT=5) and
open-iscsi (DEBUG_SCSI enabled). I thought I mentioned this in an
earlier mail -  the kernel is 2.6.19, but the open-iscsi drivers and
utils are from the 2.0-754 tarball.

I'm not sure what other debugging to enable in open-iscsi. Are you
talking about DEBUG_TCP or iscsid's debug? If you think it'll help, I
can enable them and try again. Has anyone tried to reproduce this
behavior independently though, maybe with a dummy LLD of some sort?

Aravind.
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: WRITE BUFFER commands through SG_IO getting rounded up to sector past 32k

2007-03-09 Thread Aravind Parchuri
: k_use_sg=2
Mar  8 12:22:43 ITX000c292c3c8d kernel: sg_release: sg0
Mar  8 12:22:43 ITX000c292c3c8d kernel: sg_fasync: sg0, mode=0
Mar  8 12:22:43 ITX000c292c3c8d kernel: sg_remove_scat: k_use_sg=1

It's still got something to do with the sector-aligned scatter-gather 
buffer allocation in sg_build_indirect. sg_cmd_done isn't ever called, 
so is there a check somewhere in the completion path that's still using 
the wrong size?


Aravind.





[EMAIL PROTECTED] wrote:

Aravind Parchuri wrote:

The patch has some problem. While ioctls with dxfer_len  32k still make
it through properly, the problematic ones now show up in open-iscsi's
queuecommand with request_bufflen = 0. I'm not sure what the problem is
now.



Could you send the sg and iscsi log output with the patch?


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: WRITE BUFFER commands through SG_IO getting rounded up to sector past 32k

2007-03-09 Thread Aravind Parchuri
: sg_write_xfer: num_xfer=33792, 
iovec_count=0, k_use_sg=2
Mar  8 12:22:43 ITX000c292c3c8d kernel: iscsi_queuecommand: opcode 3b 
request_bufflen 33792 transfersize 33792
Mar  8 12:22:43 ITX000c292c3c8d kernel: sg_cmd_done: sg0, pack_id=0, 
res=0x0

Mar  8 12:22:43 ITX000c292c3c8d kernel: sg_finish_rem_req: res_used=0
Mar  8 12:22:43 ITX000c292c3c8d kernel: sg_remove_scat: k_use_sg=2
Mar  8 12:22:43 ITX000c292c3c8d kernel: sg_release: sg0
Mar  8 12:22:43 ITX000c292c3c8d kernel: sg_fasync: sg0, mode=0
Mar  8 12:22:43 ITX000c292c3c8d kernel: sg_remove_scat: k_use_sg=1

It's still got something to do with the sector-aligned scatter-gather 
buffer allocation in sg_build_indirect. sg_cmd_done isn't ever called, 
so is there a check somewhere in the completion path that's still using 
the wrong size?


Aravind.





[EMAIL PROTECTED] wrote:

Aravind Parchuri wrote:

The patch has some problem. While ioctls with dxfer_len  32k still make
it through properly, the problematic ones now show up in open-iscsi's
queuecommand with request_bufflen = 0. I'm not sure what the problem is
now.



Could you send the sg and iscsi log output with the patch?






-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: WRITE BUFFER commands through SG_IO getting rounded up to sector past 32k

2007-03-08 Thread Aravind Parchuri
The patch has some problem. While ioctls with dxfer_len  32k still make 
it through properly, the problematic ones now show up in open-iscsi's 
queuecommand with request_bufflen = 0. I'm not sure what the problem is now.


Aravind.

[EMAIL PROTECTED] wrote:

Aravind Parchuri wrote:
  

In fact, even with the newer kernel, all commands smaller than 32k make
it through with the right size, which leads me to think it's something
to do with the scatter-gather list. I'm not particularly familiar with
the scsi code, but inside scsi_execute_async, in scsi_map_req_sg, should
the request's data_len be set to the dxfer_len instead of summing up the
scatter-gather list lengths? I must mention again that I haven't really
gone through the code thoroughly.




You are right. sg_build_indirect will do this
blk_size = (blk_size + SG_SECTOR_MSK)  (~SG_SECTOR_MSK)
so we should be using the bufflen passed into us. Attached is a patch
made over scsi-misc. It should also apply to scsi rc fixes.
  



sg's may have setup a the buffer with a different length than
the transfer length so we should be using the bufflen passed
in as the request's data len.

Signed-off-by: Mike Christie [EMAIL PROTECTED]

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 9f7482d..dfe3ccd 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -301,7 +301,7 @@ static int scsi_req_map_sg(struct reques
 {
struct request_queue *q = rq-q;
int nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1)  PAGE_SHIFT;
-   unsigned int data_len = 0, len, bytes, off;
+   unsigned int len, bytes, off;
struct page *page;
struct bio *bio = NULL;
int i, err, nr_vecs = 0;
@@ -310,7 +310,6 @@ static int scsi_req_map_sg(struct reques
page = sgl[i].page;
off = sgl[i].offset;
len = sgl[i].length;
-   data_len += len;
 
while (len  0) {

bytes = min_t(unsigned int, len, PAGE_SIZE - off);
@@ -350,7 +349,7 @@ static int scsi_req_map_sg(struct reques
}
 
rq-buffer = rq-data = NULL;

-   rq-data_len = data_len;
+   rq-data_len = bufflen;
return 0;
 
 free_bios:
  


-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


WRITE BUFFER commands through SG_IO getting rounded up to sector past 32k

2007-03-06 Thread Aravind Parchuri
While trying to fire WRITE BUFFER commands to an iscsi target, we found 
that if the command is larger than 32kB, it gets rounded up to the next 
sector before it reaches the LLD (in this case, open-iscsi). If this is 
expected behavior, please disregard the rest of the message. 
Unfortunately, I haven't found any documentation that mandates 
sector-multiple sizes for SG_IO ioctls, and besides,  the above behavior 
seems to have been introduced recently (post 2.6.12). The above scenario 
was produced in 2.6.19.2 (a Fedora update kernel, actually).


Here is the relevant log section for a command with dxfer_len 32896 
bytes, with SCSI_LOG_TIMEOUT=5 and DEBUG_SCSI enabled in open-iscsi:


Mar  5 12:39:12 ITX003048865470 kernel: sg_ioctl: sg4, cmd=0x2285
Mar  5 12:39:12 ITX003048865470 kernel: sg_common_write:  scsi 
opcode=0x3b, cmd_size=10

Mar  5 12:39:12 ITX003048865470 kernel: sg_start_req: dxfer_len=32896
Mar  5 12:39:12 ITX003048865470 kernel: sg_build_indirect: 
buff_size=32896, blk_size=33280
Mar  5 12:39:12 ITX003048865470 kernel: sg_write_xfer: num_xfer=32896, 
iovec_count=0, k_use_sg=2
Mar  5 12:39:12 ITX003048865470 kernel: iscsi: cmd [itt 0x23 total 33280 
imm_data 33280 unsol count 0, unsol offset 33280]
Mar  5 12:39:12 ITX003048865470 kernel: iscsi: ctask enq [write cid 0 sc 
e5ffcb00 cdb 0x3b itt 0x23 len 33280 cmdsn 36 win 257]
Mar  5 12:39:12 ITX003048865470 kernel: iscsi: ctask deq [cid 0 xmstate 
2 itt 0x23]
Mar  5 12:39:12 ITX003048865470 kernel:  connection0:0: iscsi: detected 
conn error (1011)

Mar  5 12:39:12 ITX003048865470 kernel: iscsi: blocking session
Mar  5 12:39:12 ITX003048865470 kernel: iscsi: failing in progress sc 
e5ffcb00 itt 0x23
Mar  5 12:39:12 ITX003048865470 kernel: sg_cmd_done: sg4, pack_id=0, 
res=0x2Mar  5 12:39:12 ITX003048865470 kernel: sg_finish_rem_req: 
res_used=0

Mar  5 12:39:12 ITX003048865470 kernel: sg_remove_scat: k_use_sg=2

The first print from open-iscsi is from inside its queuecommand, and it 
shows that the scsi_cmnd-request_bufflen is 33280 (total 33280). When 
I tried the same experiment with 2.6.12, the number matched the dxfer_len.


In fact, even with the newer kernel, all commands smaller than 32k make 
it through with the right size, which leads me to think it's something 
to do with the scatter-gather list. I'm not particularly familiar with 
the scsi code, but inside scsi_execute_async, in scsi_map_req_sg, should 
the request's data_len be set to the dxfer_len instead of summing up the 
scatter-gather list lengths? I must mention again that I haven't really 
gone through the code thoroughly.


Anyway, the above behavior leads to a problem with open-iscsi - it fills 
in a pdu with the Data Segment Length equal to the sector-rounded 
number, while the cdb contains the original dxfer_len. The target uses 
the cdb's value to allocate buffers for the command, so it fails to 
receive the whole packet. Any help would be appreciated. Thanks,



Aravind.

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html