Re: [RESEND] TCMUser: add read length support

2018-05-30 Thread Bodo Stroesser

On 05/26/18 19:14, kbuild test robot wrote:


Hi Bodo,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.17-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

Sorry for the late response.

The missing 'SCF_TREAT_READ_AS_NORMAL'is defined in another patch from 
Lee Duncan.
The title of the patch is '[PATCH v4] target: transport should handle st 
FM/EOM/ILI reads'

Link: https://www.spinics.net/lists/target-devel/msg16738.html

According to an email from Martin K. Petersen from 05/18/18 18:26 he 
applied this patch to 4.18/scsi-queue.

Link: https://www.spinics.net/lists/target-devel/msg16743.html

HTH

Bodo


url:
https://github.com/0day-ci/linux/commits/Bodo-Stroesser/TCMUser-add-read-length-support/20180526-231412
config: x86_64-randconfig-x003-201820 (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
 # save the attached .config to linux build tree
 make ARCH=x86_64

All errors (new ones prefixed by >>):

drivers//target/target_core_user.c: In function 'tcmu_handle_completion':

drivers//target/target_core_user.c:1077:28: error: 'SCF_TREAT_READ_AS_NORMAL' 
undeclared (first use in this function)

se_cmd->se_cmd_flags |= SCF_TREAT_READ_AS_NORMAL;
^~~~
drivers//target/target_core_user.c:1077:28: note: each undeclared 
identifier is reported only once for each function it appears in





[RESEND] TCMUser: add read length support

2018-05-24 Thread Bodo Stroesser
This is patch version 3.

Generally target core and TCMUser seem to work fine for tape devices and
media changers.
But there is at least one situation, where TCMUser is not able to support
sequential access device emulation correctly.

The situation is when an initiator sends a SCSI READ CDB with a length that is
greater than the length of the tape block to read. We can distinguish two
subcases:

A) The initiator sent the READ CDB with the SILI bit being set.
   In this case the sequential access device has to transfer the data from the
   tape block (only the length of the tape block) and transmit a good status.
   The current interface between TCMUser and the userspace does not support
   reduction of the read data size by the userspace program.

   The patch below fixes this subcase by allowing the userspace program to
   specify a reduced data size in read direction.

B) The initiator sent the READ CDB with the SILI bit not being set.
   In this case the sequential access device has to transfer the data from the
   tape block as in A), but additionally has to transmit CHECK CONDITION with
   the ILI bit set and NO SENSE in the sensebytes. The information field in
   the sensebytes must contain the residual count.

   With the below patch a user space program can specify the real read data
   length and appropriate sensebytes.
   TCMUser then uses the se_cmd flag SCF_TREAT_READ_AS_NORMAL, to force target
   core to transmit the real data size and the sensebytes.
   Note: the flag SCF_TREAT_READ_AS_NORMAL is introduced by Lee Duncan's
   patch "[PATCH v4] target: transport should handle st FM/EOM/ILI reads" from
   Tue, 15 May 2018 18:25:24 -0700.

This patch was created for kernel 4.15.9.


Changes from RFC:
- patch now uses SCF_TREAT_READ_AS_NORMAL to fix B) also.
- comment changed accordingly

Changes from V2:
- Cleaned up code according to review
- Userspace can set read_len for data in only, not for bidi.
- read_len from userspace no longer is checked implicitly by
  gather_data_area(), but the check is done explicitly
  in tcmu_handle_completion(). Now code is easier to understand.
  

Signed-off-by: Bodo Stroesser <bstroes...@ts.fujitsu.com>
Acked-by: Mike Christie <mchri...@redhat.com>

---
 include/uapi/linux/target_core_user.h |4 ++-
 drivers/target/target_core_user.c |   44 +++---
 2 files changed, 39 insertions(+), 9 deletions(-)

--- a/include/uapi/linux/target_core_user.h
+++ b/include/uapi/linux/target_core_user.h
@@ -43,6 +43,7 @@
 #define TCMU_MAILBOX_VERSION 2
 #define ALIGN_SIZE 64 /* Should be enough for most CPUs */
 #define TCMU_MAILBOX_FLAG_CAP_OOOC (1 << 0) /* Out-of-order completions */
+#define TCMU_MAILBOX_FLAG_CAP_READ_LEN (1 << 1) /* Read data length */
 
 struct tcmu_mailbox {
__u16 version;
@@ -70,6 +71,7 @@ struct tcmu_cmd_entry_hdr {
__u16 cmd_id;
__u8 kflags;
 #define TCMU_UFLAG_UNKNOWN_OP 0x1
+#define TCMU_UFLAG_READ_LEN   0x2
__u8 uflags;
 
 } __packed;
@@ -118,7 +120,7 @@ struct tcmu_cmd_entry {
__u8 scsi_status;
__u8 __pad1;
__u16 __pad2;
-   __u32 __pad3;
+   __u32 read_len;
char sense_buffer[TCMU_SENSE_BUFFERSIZE];
} rsp;
};
--- a/drivers/target/target_core_user.c
+++ b/drivers/target/target_core_user.c
@@ -576,7 +576,7 @@ static int scatter_data_area(struct tcmu
 }
 
 static void gather_data_area(struct tcmu_dev *udev, struct tcmu_cmd *cmd,
-bool bidi)
+bool bidi, uint32_t read_len)
 {
struct se_cmd *se_cmd = cmd->se_cmd;
int i, dbi;
@@ -609,7 +609,7 @@ static void gather_data_area(struct tcmu
for_each_sg(data_sg, sg, data_nents, i) {
int sg_remaining = sg->length;
to = kmap_atomic(sg_page(sg)) + sg->offset;
-   while (sg_remaining > 0) {
+   while (sg_remaining > 0 && read_len > 0) {
if (block_remaining == 0) {
if (from)
kunmap_atomic(from);
@@ -621,6 +621,8 @@ static void gather_data_area(struct tcmu
}
copy_bytes = min_t(size_t, sg_remaining,
block_remaining);
+   if (read_len < copy_bytes)
+   copy_bytes = read_len;
offset = DATA_BLOCK_SIZE - block_remaining;
tcmu_flush_dcache_range(from, copy_bytes);
memcpy(to + sg->length - sg_remaining, from + offset,
@@ -628,8 +630,11 @@ static void gather_data_area(struct tcmu
 
sg_remaining -= copy_bytes;
block_remai

Re: [PATCH v3] target: transport should allow st FM/EOM/ILI reads

2018-05-14 Thread Bodo Stroesser
> When a tape drive is exported via LIO using the pscsi module, a read that
> requests more bytes per block than the tape can supply returns an empty
> buffer. This is because the pscsi pass-through target module sees the
> "ILI" illegal length bit set and thinks there is no reason to return
> the data.
> 
> This is a long-standing transport issue, since it assumes that no data
> need be returned under a check condition, which isn't always the case
> for tape.
> 
> Add in a check for tape reads with the ILI, EOM, or FM bits set,
> with a sense code of NO_SENSE, treating such cases as if the read
> succeeded. The layered tape driver then "does the right thing" when
> it gets such a response.

I tested the patch. Using the st driver to handle the exported tape drive on
the initiator side, everything works fine.
But the st driver masks a problem that still persists. A real tape drive, if
the READ is longer than the data block to read, will transfer only the amount
of data from the tape block.
The exported tape drive with this patch transfers the full amount of data
as requested by the READ. The over scoring part of the transmitted data is
padded with NUL bytes.

Sending READ commands via sg with sg_raw I was able to test this behavior.
I also verified it using a FibreChannel analyzer.

I'd suggest to call target_complete_cmd_with_length() instead of
target_complete_cmd() in the descibed situation. That should fix the length
of the transmitted data, I think.

BR, Bodo

> 
> Changes from v2:
>  - Cleaned up subject line and bug text formatting
>  - Removed unneeded inner braces
>  - Removed ugly goto
>  - Also updated the "queue full" path to handle this case
> 
> Changes from RFC:
>  - Moved ugly code from transport to pscsi module
>  - Added checking EOM and FM bits, as well as ILI
>  - fixed malformed patch
>  - Clarified description a bit
> 
> Signed-off-by: Lee Duncan 
> ---
>  drivers/target/target_core_pscsi.c | 23 +++-
>  drivers/target/target_core_transport.c | 39 
> +-
>  include/target/target_core_base.h  |  1 +
>  3 files changed, 57 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/target/target_core_pscsi.c 
> b/drivers/target/target_core_pscsi.c
> index 0d99b242e82e..a9656368a96d 100644
> --- a/drivers/target/target_core_pscsi.c
> +++ b/drivers/target/target_core_pscsi.c
> @@ -689,8 +689,29 @@ static void pscsi_complete_cmd(struct se_cmd *cmd, u8 
> scsi_status,
>   }
>  after_mode_select:
>  
> - if (scsi_status == SAM_STAT_CHECK_CONDITION)
> + if (scsi_status == SAM_STAT_CHECK_CONDITION) {
>   transport_copy_sense_to_cmd(cmd, req_sense);
> +
> + /*
> +  * hack to check for TAPE device reads with
> +  * FM/EOM/ILI set, so that we can get data
> +  * back despite framework assumption that a
> +  * check condition means there is no data
> +  */
> + if (sd->type == TYPE_TAPE &&
> + cmd->data_direction == DMA_FROM_DEVICE) {
> + /*
> +  * is sense data valid, fixed format,
> +  * and have FM, EOM, or ILI set?
> +  */
> + if (req_sense[0] == 0xf0 && /* valid, fixed format 
> */
> + req_sense[2] & 0xe0 &&  /* FM, EOM, or ILI */
> + (req_sense[2] & 0xf) == 0) { /* key==NO_SENSE */
> + pr_debug("Tape FM/EOM/ILI status detected. 
> Treat as normal read.\n");
> + cmd->se_cmd_flags |= SCF_TREAT_READ_AS_NORMAL;
> + }
> + }
> + }
>  }
>  
>  enum {
> diff --git a/drivers/target/target_core_transport.c 
> b/drivers/target/target_core_transport.c
> index 74b646f165d4..a15a9e3dce11 100644
> --- a/drivers/target/target_core_transport.c
> +++ b/drivers/target/target_core_transport.c
> @@ -2084,12 +2084,24 @@ static void transport_complete_qf(struct se_cmd *cmd)
>   goto queue_status;
>   }
>  
> - if (cmd->se_cmd_flags & SCF_TRANSPORT_TASK_SENSE)
> + /*
> +  * Check if we need to send a sense buffer from
> +  * the struct se_cmd in question. We do NOT want
> +  * to take this path of the IO has been marked as
> +  * needing to be treated like a "normal read". This
> +  * is the case if it's a tape read, and either the
> +  * FM, EOM, or ILI bits are set, but there is no
> +  * sense data.
> +  */
> + if (!(cmd->se_cmd_flags & SCF_TREAT_READ_AS_NORMAL) &&
> + cmd->se_cmd_flags & SCF_TRANSPORT_TASK_SENSE)
>   goto queue_status;
>  
>   switch (cmd->data_direction) {
>   case DMA_FROM_DEVICE:
> - if (cmd->scsi_status)
> + /* queue status if not treating this as a normal read */
> + if (cmd->scsi_status &&
> + !(cmd->se_cmd_flags & 

[PATCH 3/3] st.ko: change enlarge_buffer result

2013-12-02 Thread Bodo Stroesser
From: Bodo Stroesser bstroes...@ts.fujitsu.com
Date: Mon, 2 Dec 2013 18:52:10 +0100
Subject: [PATCH 3/3] st.ko: change enlarge_buffer result

enlarge_buffer() just returns 1 or 0 if it could or could
not allocate the requested buffer.

In case of result 0, the callers always set the error to
EOVERFLOW. I think, this is not a good errno for those
cases, where enlarge_buffer() could not allocate the pages
it needed.

So I changed enlarge_buffer() to return a meaningful
result (-ENOMEM or -EOVERFLOW in case of error, 0 in case of
success) and the callers to use this result.

I also removed a check in setup_buffering() that is done
in enlarge_buffer() anyway.

Cc: Kai Makisara kai.makis...@kolumbus.fi
Signed-off-by: Bodo Stroesser bstroes...@ts.fujitsu.com

---

--- a/drivers/scsi/st.c 2013-12-02 18:52:10.0 +0100
+++ b/drivers/scsi/st.c 2013-12-02 18:52:10.0 +0100
@@ -1221,10 +1221,9 @@ static int st_open(struct inode *inode,
}
 
/* See that we have at least a one page buffer available */
-   if (!enlarge_buffer(STp-buffer, PAGE_SIZE, STp-restr_dma)) {
+   if ((retval = enlarge_buffer(STp-buffer, PAGE_SIZE, STp-restr_dma)) 
!= 0) {
printk(KERN_WARNING %s: Can't allocate one page tape 
buffer.\n,
   name);
-   retval = (-EOVERFLOW);
goto err_out;
}
 
@@ -1517,11 +1516,9 @@ static int setup_buffering(struct scsi_t
clear_buffer(STbp);
}
 
-   if (bufsize  STbp-buffer_size 
-   !enlarge_buffer(STbp, bufsize, STp-restr_dma)) {
+   if ((retval = enlarge_buffer(STbp, bufsize, STp-restr_dma)) != 
0) {
printk(KERN_WARNING %s: Can't allocate %d byte tape 
buffer.\n,
   tape_name(STp), bufsize);
-   retval = (-EOVERFLOW);
goto out;
}
if (STp-block_size)
@@ -3723,7 +3720,7 @@ static int enlarge_buffer(struct st_buff
gfp_t priority;
 
if (new_size = STbuffer-buffer_size)
-   return 1;
+   return 0;
 
max_segs = STbuffer-use_sg;
 
@@ -3747,7 +3744,7 @@ static int enlarge_buffer(struct st_buff
}
if (max_segs * (PAGE_SIZE  order)  new_size) {
if (order == ST_MAX_ORDER)
-   return 0;
+   return -EOVERFLOW;
normalize_buffer(STbuffer);
return enlarge_buffer(STbuffer, new_size, need_dma);
}
@@ -3760,7 +3757,7 @@ static int enlarge_buffer(struct st_buff
if (!page) {
DEB(STbuffer-buffer_size = got);
normalize_buffer(STbuffer);
-   return 0;
+   return -ENOMEM;
}
 
STbuffer-frp_segs += 1;
@@ -3771,7 +3768,7 @@ static int enlarge_buffer(struct st_buff
}
STbuffer-b_data = page_address(STbuffer-reserved_pages[0]);
 
-   return 1;
+   return 0;
 }
 
 
N‹§²æìr¸›yúèšØb²X¬¶Ç§vØ^–)Þº{.nÇ+‰·¥Š{±±ËŠ{ayºʇڙë,j­¢f£¢·hš‹àz¹®w¥¢¸
¢·¦j:+v‰¨ŠwèjØm¶Ÿÿ¾«‘êçzZ+ƒùšŽŠÝ¢jú!¶i

[PATCH 0/3] st.ko: some changes for enlarge_buffer

2013-12-02 Thread Bodo Stroesser
Hi Kai,

here is a short series of 3 small patches for st.ko

The first one removes a bug in enlarge_buffer() that can make
a read or write fail under special conditions.

The second one removes a small piece of code in enlarge_buffer(),
that AFAICS is not necessary.

If enlarge_buffer() fails, the errno currently presented to userland
always is EOVERFLOW, while I think it sometimes should be ENOMEM.
The third patch is a try to enhance this.

All three patches are slightly tested under SLES11-SP2 kernel
 3.0.80-0.7.1-x and after this adapted to 3.12.2 (just had to fix
some offset). They compile fine under 3.12.2

HTH

Thanks,
Bodo


[PATCH 2/3] st.ko: remove unnecessary normalize_buffer

2013-12-02 Thread Bodo Stroesser
From: Bodo Stroesser bstroes...@ts.fujitsu.com
Date: Mon, 2 Dec 2013 18:52:10 +0100
Subject: [PATCH 2/3] st.ko: remove unnecessary normalize_buffer

This patch removes an unnecessary call to normalize_buffer()
in enlarge_buffer()

In st_open() always a buffer of one page is allocated.
When the buffer needs to be enlarged later, it does not make
sense to free this page unconditionally.
 
Cc: Kai Makisara kai.makis...@kolumbus.fi
Signed-off-by: Bodo Stroesser bstroes...@ts.fujitsu.com

---

--- a/drivers/scsi/st.c 2013-12-02 18:52:10.0 +0100
+++ b/drivers/scsi/st.c 2013-12-02 18:52:10.0 +0100
@@ -3725,9 +3725,6 @@ static int enlarge_buffer(struct st_buff
if (new_size = STbuffer-buffer_size)
return 1;
 
-   if (STbuffer-buffer_size = PAGE_SIZE)
-   normalize_buffer(STbuffer);  /* Avoid extra segment */
-
max_segs = STbuffer-use_sg;
 
priority = GFP_KERNEL | __GFP_NOWARN;


[PATCH 1/3] st.ko: fix enlarge_buffer

2013-12-02 Thread Bodo Stroesser
From: Bodo Stroesser bstroes...@ts.fujitsu.com
Date: Mon, 2 Dec 2013 18:52:10 +0100
Subject: [PATCH 1/3] st.ko: fix enlarge_buffer

This patch removes a bug in enlarge_buffer() that can make a
read or write fail under special conditions.

After changing TRY_DIRECT_IO to 0 and ST_MAX_SG to 32 in
st_options.h, a program that writes a first block of 128k and
than a second bigger block (e.g. 256k) fails. The second write
returns errno EOVERFLOW, as enlarge_buffer() checks the sg list
and detects that it already is full.
As enlarge_buffer uses different page allocation orders
depending on the size of the buffer needed, the check does not
make sense.
 
Cc: Kai Makisara kai.makis...@kolumbus.fi
Signed-off-by: Bodo Stroesser bstroes...@ts.fujitsu.com

---

--- a/drivers/scsi/st.c 2013-12-02 18:52:10.0 +0100
+++ b/drivers/scsi/st.c 2013-12-02 18:52:10.0 +0100
@@ -3719,7 +3719,7 @@ static struct st_buffer *new_tape_buffer
 
 static int enlarge_buffer(struct st_buffer * STbuffer, int new_size, int 
need_dma)
 {
-   int segs, nbr, max_segs, b_size, order, got;
+   int segs, max_segs, b_size, order, got;
gfp_t priority;
 
if (new_size = STbuffer-buffer_size)
@@ -3729,9 +3729,6 @@ static int enlarge_buffer(struct st_buff
normalize_buffer(STbuffer);  /* Avoid extra segment */
 
max_segs = STbuffer-use_sg;
-   nbr = max_segs - STbuffer-frp_segs;
-   if (nbr = 0)
-   return 0;
 
priority = GFP_KERNEL | __GFP_NOWARN;
if (need_dma)
N‹§²æìr¸›yúèšØb²X¬¶Ç§vØ^–)Þº{.nÇ+‰·¥Š{±±ËŠ{ayºʇڙë,j­¢f£¢·hš‹àz¹®w¥¢¸
¢·¦j:+v‰¨ŠwèjØm¶Ÿÿ¾«‘êçzZ+ƒùšŽŠÝ¢jú!¶i

Re: PATCH: st.c: Fix blk_get_queue usage

2013-11-15 Thread Bodo Stroesser
On 14.11.2013, at 22.50, Joe Lawrence joe.lawre...@stratus.com wrote:

 On Thu, 14 Nov 2013 20:22:40 +0100
 Bodo Stroesser bstroes...@ts.fujitsu.com wrote:
 
  On 14.11.2013, at 20.05, Kai M??kisara (Kolumbus) 
  kai.makis...@kolumbus.fi wrote:
  
   
   On 14.11.2013, at 16.48, Bodo Stroesser bstroes...@ts.fujitsu.com wrote:
   

 snip 
 
   
   With this patch, blk_get_queue() is not called with the correct argument.
   Maybe change the call to blk_get_queue(SDp-request_queue) ?
   
if (blk_get_queue(disk-queue))
  
  Yes, thank you. You are obviously right. Below is the revised patch.
  Sorry for the mistake.
  
  Bodo
  
goto out_put_disk;
+   disk-queue = SDp-request_queue;
tpnt-driver = st_template;

tpnt-device = SDp--
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
;
   
   Thanks,
   Kai
   
  
  From: Bodo Stroesser bstroes...@ts.fujitsu.com
  Date: Thu, 14 Nov 2013 14:35:00 +0100
  Subject: [PATCH] sg: fix blk_get_queue usage
  
  If blk_queue_get() in st_probe fails, disk-queue must not
  be set to SDp-request_queue, as that would result in
  put_disk() dropping a not taken reference.
  
  Thus, disk-queue should be set only after a successful
  blk_queue_get().
  Revised patch due to a hint from Kai Makisara.
  
  Signed-off-by: Bodo Stroesser bstroes...@ts.fujitsu.com
  
  ---
  
  --- a/drivers/scsi/st.c 2013-11-14 14:10:40.0 +0100
  +++ b/drivers/scsi/st.c 2013-11-14 14:10:57.0 +0100
  @@ -4107,11 +4107,11 @@
  kref_init(tpnt-kref);
  tpnt-disk = disk;
  disk-private_data = tpnt-driver;
  -   disk-queue = SDp-request_queue;
  /* SCSI tape doesn't register this gendisk via add_disk().  Manually
   * take queue reference that release_disk() expects. */
  -   if (blk_get_queue(disk-queue))
  +   if (blk_get_queue(SDp-request_queue))
  goto out_put_disk;
  +   disk-queue = SDp-request_queue;
  tpnt-driver = st_template;
   
  tpnt-device = SDp;
   ??{.n?+???+%??lzwm??b??r??zX??  ?(?? 
  }?z?j:+v???zZ+??+zf???h???~i???z? ?w??)?? f
 
 Hi Bodo,
 
 Minor nit, wasn't blk_get_queue modified to return false on failure?
 
 09ac46c42946 block: misc updates to blk_get_queue()
 
 Just curious what tree this patch was tested against.
 
 Thanks,
 
 -- Joe
 

Hi Joe,

we are intensively using SuSE SLES11 (currently kernel 3.0.80-x). At the 
moment
I'm not involved with mainline.

The patch resulted from the hunt for another problem and was accepted by SuSE. 
They
told me that this would be relevant for mainline also. So I posted it but didn't
check mainline for changes :-(

Thus, here is the third revision of the patch that now is suited for mainline.
Sorry for the noise.

Thank you.

Bodo

---

From: Bodo Stroesser bstroes...@ts.fujitsu.com
Date: Thu, 14 Nov 2013 14:35:00 +0100
Subject: [PATCH] sg: fix blk_get_queue usage

If blk_queue_get() in st_probe fails, disk-queue must not
be set to SDp-request_queue, as that would result in
put_disk() dropping a not taken reference.

Thus, disk-queue should be set only after a successful
blk_queue_get().

2nd revised patch due to hints from Kai Makisara and Joe Lawrence.

Now this should be suited for mainline.

Signed-off-by: Bodo Stroesser bstroes...@ts.fujitsu.com

---

--- a/drivers/scsi/st.c 2013-11-15 11:58:39.0 +0100
+++ b/drivers/scsi/st.c 2013-11-15 11:59:10.0 +0100
@@ -4111,11 +4111,11 @@ static int st_probe(struct device *dev)
kref_init(tpnt-kref);
tpnt-disk = disk;
disk-private_data = tpnt-driver;
-   disk-queue = SDp-request_queue;
/* SCSI tape doesn't register this gendisk via add_disk().  Manually
 * take queue reference that release_disk() expects. */
-   if (!blk_get_queue(disk-queue))
+   if (!blk_get_queue(SDp-request_queue))
goto out_put_disk;
+   disk-queue = SDp-request_queue;
tpnt-driver = st_template;
 
tpnt-device = SDp;



PATCH: st.c: Fix blk_get_queue usage

2013-11-14 Thread Bodo Stroesser
Hi,

in st_probe(), st.c I stumbled across what I'd call a minor problem.

So I'd like to suggest the following patch.

Best Regards,
Bodo

P.S.: Please CC me, I'm not on the list.

-


From: Bodo Stroesser bstroes...@ts.fujitsu.com
Date: Thu, 14 Nov 2013 14:35:00 +0100
Subject: [PATCH] sg: fix blk_get_queue usage

If blk_queue_get() in st_probe fails, disk-queue must not
be set to SDp-request_queue, as that would result in
put_disk() dropping a not taken reference.

Thus, disk-queue should be set only after a successful
blk_queue_get().

Signed-off-by: Bodo Stroesser bstroes...@ts.fujitsu.com

---

--- a/drivers/scsi/st.c 2013-11-14 14:10:40.0 +0100
+++ b/drivers/scsi/st.c 2013-11-14 14:10:57.0 +0100
@@ -4107,11 +4107,11 @@
kref_init(tpnt-kref);
tpnt-disk = disk;
disk-private_data = tpnt-driver;
-   disk-queue = SDp-request_queue;
/* SCSI tape doesn't register this gendisk via add_disk().  Manually
 * take queue reference that release_disk() expects. */
if (blk_get_queue(disk-queue))
goto out_put_disk;
+   disk-queue = SDp-request_queue;
tpnt-driver = st_template;
 
tpnt-device = SDp;


Re: PATCH: st.c: Fix blk_get_queue usage

2013-11-14 Thread Bodo Stroesser
On 14.11.2013, at 20.05, Kai Mäkisara (Kolumbus) kai.makis...@kolumbus.fi 
wrote:

 
 On 14.11.2013, at 16.48, Bodo Stroesser bstroes...@ts.fujitsu.com wrote:
 
  Hi,
  
  in st_probe(), st.c I stumbled across what I'd call a minor problem.
  
  So I'd like to suggest the following patch.
  
  Best Regards,
  Bodo
  
  P.S.: Please CC me, I'm not on the list.
  
  -
  
  
  From: Bodo Stroesser bstroes...@ts.fujitsu.com
  Date: Thu, 14 Nov 2013 14:35:00 +0100
  Subject: [PATCH] sg: fix blk_get_queue usage
  
  If blk_queue_get() in st_probe fails, disk-queue must not
  be set to SDp-request_queue, as that would result in
  put_disk() dropping a not taken reference.
  
  Thus, disk-queue should be set only after a successful
  blk_queue_get().
  
  Signed-off-by: Bodo Stroesser bstroes...@ts.fujitsu.com
  
  ---
  
  --- a/drivers/scsi/st.c 2013-11-14 14:10:40.0 +0100
  +++ b/drivers/scsi/st.c 2013-11-14 14:10:57.0 +0100
  @@ -4107,11 +4107,11 @@
  kref_init(tpnt-kref);
  tpnt-disk = disk;
  disk-private_data = tpnt-driver;
  -   disk-queue = SDp-request_queue;
  /* SCSI tape doesn't register this gendisk via add_disk().  Manually
   * take queue reference that release_disk() expects. */
 
 With this patch, blk_get_queue() is not called with the correct argument.
 Maybe change the call to blk_get_queue(SDp-request_queue) ?
 
  if (blk_get_queue(disk-queue))

Yes, thank you. You are obviously right. Below is the revised patch.
Sorry for the mistake.

Bodo

  goto out_put_disk;
  +   disk-queue = SDp-request_queue;
  tpnt-driver = st_template;
  
  tpnt-device = SDp--
  To unsubscribe from this list: send the line unsubscribe linux-scsi in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
  ;
 
 Thanks,
 Kai
 

From: Bodo Stroesser bstroes...@ts.fujitsu.com
Date: Thu, 14 Nov 2013 14:35:00 +0100
Subject: [PATCH] sg: fix blk_get_queue usage

If blk_queue_get() in st_probe fails, disk-queue must not
be set to SDp-request_queue, as that would result in
put_disk() dropping a not taken reference.

Thus, disk-queue should be set only after a successful
blk_queue_get().
Revised patch due to a hint from Kai Makisara.

Signed-off-by: Bodo Stroesser bstroes...@ts.fujitsu.com

---

--- a/drivers/scsi/st.c 2013-11-14 14:10:40.0 +0100
+++ b/drivers/scsi/st.c 2013-11-14 14:10:57.0 +0100
@@ -4107,11 +4107,11 @@
kref_init(tpnt-kref);
tpnt-disk = disk;
disk-private_data = tpnt-driver;
-   disk-queue = SDp-request_queue;
/* SCSI tape doesn't register this gendisk via add_disk().  Manually
 * take queue reference that release_disk() expects. */
-   if (blk_get_queue(disk-queue))
+   if (blk_get_queue(SDp-request_queue))
goto out_put_disk;
+   disk-queue = SDp-request_queue;
tpnt-driver = st_template;
 
tpnt-device = SDp;


Re: lpfc: system freezing if FC connection is broken under load

2005-08-03 Thread Bodo Stroesser

Bodo Stroesser wrote:

Hi,

my dual Xeon machine freezes, if connection between
FC switch and tape drives is broken while writing to tapes.

There is one SCSI target with 16 tape LUNs connected to my
FC controller via FC switch. I can reproduce the problem by
starting dd if=/dev/zero of=/dev/st[0-7] bs=256K on the
first 8 LUNs. Then I unplug the connection between switch and
tapes.

It doesn't matter if using LP9802 or one channel of LP9402DC.
The problem happens immediately after cfg_nodev_tmo has
run out. If nodev_tmo is changed, time from breaking connection
to machine freezing changes accordingly.

After the problem happened, even NMIs no longer are handled.
I added nmi_watchdog=1 to cmdline and added some simple code
to nmi handler, that writes the nmi counter directly to video ram.
In case of error, nmi no longer counts (but I have no idea, how
this can happen, maybe there is some HW bug).


Sorry, I was wrong. The machine panics but I didn't note that,
because panic_blink() doesn't work and panic messages were not
written to the fb-console.

Meanwhile I could reproduce the problem with verbose logging in
lpfc-driver and with a modified panic(), that writes out full
log_buf to serial port. I've attached the logging starting from
lpfc insertion. I added some comments to it, to show what I was
doing.

If I understand panic info correctly, the machine crashes in
st_sleep_done[st] because of an invalid pointer STp, but the
real reason for the crash seems to be some unexpected event in
lpfc, after nodev_tmo expired.

I hope, this helps to find the problem. If other traces are
needed, please let me know.

Please CC me, I'm not on the lists.

Regards
Bodo

I inserted some comments (#), but did not remove or change
any console messages after reloading lpfc.


# lpfc reloaded 
4Emulex LightPulse Fibre Channel SCSI driver 8.0.28
6ACPI: PCI Interrupt :04:04.0[A] - GSI 54 (level, low) - IRQ 54
6scsi6 :  on PCI bus 04 device 20 irq 54
3lpfc :04:04.0: 0:1303 Link Up Event x1 received Data: x1 xf7 x8 x0
6ACPI: PCI Interrupt :04:05.0[A] - GSI 55 (level, low) - IRQ 55
6scsi7 :  on PCI bus 04 device 28 irq 55
3lpfc :04:05.0: 1:1303 Link Up Event x1 received Data: x1 xf7 x8 x0
6ACPI: PCI Interrupt :09:04.0[A] - GSI 72 (level, low) - IRQ 72
6scsi8 :  on PCI bus 09 device 20 irq 72
3lpfc :09:04.0: 2:1303 Link Up Event x1 received Data: x1 xf7 x8 x0
6ACPI: PCI Interrupt :09:05.0[A] - GSI 73 (level, low) - IRQ 73
6scsi9 :  on PCI bus 09 device 28 irq 73
3lpfc :09:05.0: 3:1303 Link Up Event x1 received Data: x1 xf7 x8 x0
6ACPI: PCI Interrupt :0a:01.0[A] - GSI 96 (level, low) - IRQ 96
6scsi10 :  on PCI bus 0a device 08 irq 96
3lpfc :0a:01.0: 4:1303 Link Up Event x1 received Data: x1 xf7 x8 x0

# FC connection between switch and tape devices plugged 
5  Vendor: IBM   Model: 03590E1A  Rev: E32E
5  Type:   Sequential-Access  ANSI SCSI revision: 03
4Attached scsi tape st0 at scsi10, channel 0, id 0, lun 0
4st0: try direct i/o: yes (alignment 512 B), max page reachable by HBA 
4294967295
5Attached scsi generic sg2 at scsi10, channel 0, id 0, lun 0,  type 1
5  Vendor: IBM   Model: 03590E1A  Rev: E32E
5  Type:   Sequential-Access  ANSI SCSI revision: 03
4Attached scsi tape st1 at scsi10, channel 0, id 0, lun 1
4st1: try direct i/o: yes (alignment 512 B), max page reachable by HBA 
4294967295
5Attached scsi generic sg3 at scsi10, channel 0, id 0, lun 1,  type 1
5  Vendor: IBM   Model: 03590E1A  Rev: E32E
5  Type:   Sequential-Access  ANSI SCSI revision: 03
4Attached scsi tape st2 at scsi10, channel 0, id 0, lun 2
4st2: try direct i/o: yes (alignment 512 B), max page reachable by HBA 
4294967295
5Attached scsi generic sg4 at scsi10, channel 0, id 0, lun 2,  type 1
5  Vendor: IBM   Model: 03590E1A  Rev: E32E
5  Type:   Sequential-Access  ANSI SCSI revision: 03
4Attached scsi tape st3 at scsi10, channel 0, id 0, lun 3
4st3: try direct i/o: yes (alignment 512 B), max page reachable by HBA 
4294967295
5Attached scsi generic sg5 at scsi10, channel 0, id 0, lun 3,  type 1
5  Vendor: IBM   Model: 03590E1A  Rev: E32E
5  Type:   Sequential-Access  ANSI SCSI revision: 03
4Attached scsi tape st4 at scsi10, channel 0, id 0, lun 4
4st4: try direct i/o: yes (alignment 512 B), max page reachable by HBA 
4294967295
5Attached scsi generic sg6 at scsi10, channel 0, id 0, lun 4,  type 1
5  Vendor: IBM   Model: 03590E1A  Rev: E32E
5  Type:   Sequential-Access  ANSI SCSI revision: 03
4Attached scsi tape st5 at scsi10, channel 0, id 0, lun 5
4st5: try direct i/o: yes (alignment 512 B), max page reachable by HBA 
4294967295
5Attached scsi generic sg7 at scsi10, channel 0, id 0, lun 5,  type 1
5  Vendor: IBM   Model: 03590E1A  Rev: E32E
5  Type:   Sequential-Access

Re: lpfc: System freezing if fiber is broken

2005-07-27 Thread Bodo Stroesser

Mike Anderson wrote:

Bodo Stroesser [EMAIL PROTECTED] wrote:


Hi James,

disrupting a working FC connection makes my i386 SMP server
(2.6.12.2) freeze just one or two seconds after this.
I'm normally using lpfc_nodev_tmo = 1. When I change this to the
default value of 35, the system stalls about 36 seconds after
disruption. So I guess, the problem is caused by nodev_tmo
expiring.
I activated the nmi_watchdog, but no output.

What can I do to analyze this problem?



Does changing the timeout for a scsi device also alter the problem. In the
past people have seen issues of the nodev_tmo expiring near the scsi
timeout. This past cases lead to devices being offlined, but may this
could be causing a different symptom on your system.

The amount of time between cutting the connection and the system freezing
is nearly the same as lpfc_nodev_tmo. Using the default nodev_tmo of 35 seconds
results in about 36 seconds, while setting nodev_tmo to 1 results in
2 seconds. As the devices on the Fibre Channel are tapedrives scsi timeout is
900 seconds.
There are 8 tests running that write 8 tape-LUNs at the same SCSI target.
If the connection is broken, some of the tests immediately receive a bad
result for write(), some keep waiting for a result.

Meanwhile I also did some tests with timeout set to 5 and nodev_tmo to 35
(The test I'm running doesn't fail with that small timeout). Those tests, that
do not receive a bad result, stay waiting for result even after 5 second timeout
is expired. In most cases, the system doesn't freeze after nodev_tmo with this 
test.
But about 5 seconds after plugging FC cable again, it freezes.


You can change the timeout for the device by echoing a higher value into
/sys/bus/scsi/devices/${nexus}/timeout.

Is this a full system freeze or only the controlling console?

Full freeze, no more replies via console or network.



-andmike
--
Michael Anderson
[EMAIL PROTECTED]

-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


lpfc: System freezing if fiber is broken

2005-07-26 Thread Bodo Stroesser

Hi James,

disrupting a working FC connection makes my i386 SMP server
(2.6.12.2) freeze just one or two seconds after this.
I'm normally using lpfc_nodev_tmo = 1. When I change this to the
default value of 35, the system stalls about 36 seconds after
disruption. So I guess, the problem is caused by nodev_tmo
expiring.
I activated the nmi_watchdog, but no output.

What can I do to analyze this problem?

Regards
Bodo


BTW:
I couldn't reproduce the problem of a wrong WWPN yet.
-
To unsubscribe from this list: send the line unsubscribe linux-scsi in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html