RE: [PATCH][drivers/scsi/u14-34f.c] duplicate test 'SCpnt->sc_data_direction == DMA_FROM_DEVICE'

2008-02-05 Thread Ballabio_Dario
 Good to know that somebody still uses the Ultrastor 14f board :).
Yes, this typo was introduced by somebody doing massive editing to all
scsi drivers long ago.
Cheers,
--db
 

-Original Message-
From: Roel Kluin [mailto:[EMAIL PROTECTED] 
Sent: Monday, February 04, 2008 11:37 PM
To: Ballabio, Dario
Cc: linux-scsi@vger.kernel.org; lkml
Subject: [PATCH][drivers/scsi/u14-34f.c] duplicate test
'SCpnt->sc_data_direction == DMA_FROM_DEVICE'

It should be like this, I guess? This patch has not been tested yet, please
confirm.
--
Note the duplicate test 'SCpnt->sc_data_direction == DMA_FROM_DEVICE'

from Documentation/DMA-API.txt:
DMA_TO_DEVICE   = PCI_DMA_TODEVICE    data is going from the
                                      memory to the device
DMA_FROM_DEVICE = PCI_DMA_FROMDEVICE  data is coming from
                                      the device to the
                                      memory

Signed-off-by: Roel Kluin <[EMAIL PROTECTED]>
---
diff --git a/drivers/scsi/u14-34f.c b/drivers/scsi/u14-34f.c
index 662c004..1e704f9 100644
--- a/drivers/scsi/u14-34f.c
+++ b/drivers/scsi/u14-34f.c
@@ -1208,15 +1208,15 @@ static void scsi_to_dev_dir(unsigned int i, unsigned int j) {
   };
 
struct mscp *cpp;
struct scsi_cmnd *SCpnt;
 
cpp = &HD(j)->cp[i]; SCpnt = cpp->SCpnt;
 
-   if (SCpnt->sc_data_direction == DMA_FROM_DEVICE) {
+   if (SCpnt->sc_data_direction == DMA_TO_DEVICE) {
   cpp->xdir = DTD_IN;
   return;
   }
else if (SCpnt->sc_data_direction == DMA_FROM_DEVICE) {
   cpp->xdir = DTD_OUT;
   return;
   }
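
For reference, here is a minimal userspace sketch of the mapping the two
branches are presumably meant to express. The enum names below are stand-ins,
not the kernel's dma_data_direction or the driver's DTD_* definitions, and the
sketch assumes that "IN" means data flowing from the device into memory, which
the names suggest but the thread does not confirm. Read that way, it may be the
second test rather than the first that needs changing, so this is worth
checking against the driver's own definitions before applying.

```c
#include <assert.h>

/* Stand-ins for the kernel's DMA direction values (illustrative only). */
enum dma_dir { DIR_BIDIRECTIONAL, DIR_TO_DEVICE, DIR_FROM_DEVICE, DIR_NONE };

/* Hypothetical transfer codes; assumed meaning: IN = device -> memory. */
enum xdir { XDIR_UNKNOWN, XDIR_IN, XDIR_OUT, XDIR_NONE };

/* Each direction is tested exactly once, so no branch can be shadowed
 * by an identical earlier test. */
static enum xdir dir_to_xdir(enum dma_dir d)
{
    switch (d) {
    case DIR_FROM_DEVICE: return XDIR_IN;   /* device -> memory */
    case DIR_TO_DEVICE:   return XDIR_OUT;  /* memory -> device */
    case DIR_NONE:        return XDIR_NONE; /* no data phase */
    default:              return XDIR_UNKNOWN;
    }
}
```

A switch makes this class of typo harder to write, since most compilers reject
a duplicated case label.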


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Bart Van Assche
On Feb 4, 2008 11:57 PM, Jeff Garzik <[EMAIL PROTECTED]> wrote:

> Networked block devices are attractive because the concepts and
> implementation are more simple than networked filesystems... but usually
> you want to run some sort of filesystem on top.  At that point you might
> as well run NFS or [gfs|ocfs|flavor-of-the-week], and ditch your
> networked block device (and associated complexity).

Running a filesystem on top of iSCSI results in better performance
than NFS, especially if the NFS client conforms to the NFS standard
(=synchronous writes).
By searching the web for the keywords NFS, iSCSI and
performance I found the following (6 years old) document:
http://www.technomagesinc.com/papers/ip_paper.html. A quote from the
conclusion:
Our results, generated by running some of industry standard benchmarks,
show that iSCSI significantly outperforms NFS for situations when
performing streaming, database like accesses and small file transactions.

Bart Van Assche.


Re: [PATCH][drivers/scsi/u14-34f.c] duplicate test 'SCpnt->sc_data_direction == DMA_FROM_DEVICE'

2008-02-05 Thread Roel Kluin
[EMAIL PROTECTED] wrote:
>  Good to know that somebody still uses the Ultrastor 14f board :).
> Yes, this typo was introduced by somebody doing massive editing to all
> scsi drivers long ago.
> Cheers,
> --db
Actually, I do not own an Ultrastor 14f board. I found this by searching for
if (test)
...
else if (exactly same test)
...
Thanks,
Roel
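
The kind of search described above can be approximated with a small scanner.
What follows is only a rough sketch, not the tool that was actually used: the
function names are made up for illustration, it matches the literal text
"if (" and "else if (", compares conditions as exact strings, and treats any
plain "if" (including a nested one) as the start of a new chain.

```c
#include <assert.h>
#include <string.h>

#define MAX_COND  256
#define MAX_CHAIN 16

/* Copy the balanced "(...)" starting at p (which must point at '(') into
 * out. Returns a pointer just past the closing ')' or NULL if unbalanced. */
static const char *read_cond(const char *p, char *out, size_t outsz)
{
    int depth = 0;
    size_t n = 0;

    for (; *p; p++) {
        if (*p == '(')
            depth++;
        else if (*p == ')')
            depth--;
        if (n + 1 < outsz)
            out[n++] = *p;
        if (depth == 0) {
            out[n] = '\0';
            return p + 1;
        }
    }
    return NULL;
}

/* Returns 1 if some "else if (X)" repeats a condition already tested
 * earlier in the same if/else-if chain, 0 otherwise. */
static int has_duplicate_test(const char *src)
{
    char chain[MAX_CHAIN][MAX_COND];
    int nchain = 0;
    const char *p = src;

    while ((p = strstr(p, "if (")) != NULL) {
        int is_else = (p - src >= 5) && strncmp(p - 5, "else ", 5) == 0;
        char cond[MAX_COND];
        const char *end = read_cond(p + 3, cond, sizeof(cond));

        if (!end)
            break;
        if (!is_else)
            nchain = 0;               /* a plain "if" starts a new chain */
        for (int i = 0; i < nchain; i++)
            if (strcmp(chain[i], cond) == 0)
                return 1;
        if (nchain < MAX_CHAIN)
            strcpy(chain[nchain++], cond);
        p = end;
    }
    return 0;
}
```

Whitespace differences inside the condition defeat the string comparison, so a
real search would want to normalise the text first.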


Re: [patch 08/13] sg: nopage

2008-02-05 Thread Andrew Morton
On Mon, 04 Feb 2008 23:53:21 -0800 [EMAIL PROTECTED] wrote:

> From: Nick Piggin <[EMAIL PROTECTED]>
> 
> Convert SG from nopage to fault.
> 

Please give this some additional attention.  We'd like to remove 
vm_operations_struct.nopage() altogether and we can't do that while
it's hanging around in various subsystems.

Thanks.


Re: [PATCH] bugfix for an underflow condition in usb storage & isd200.c

2008-02-05 Thread Boaz Harrosh
On Mon, Feb 04 2008 at 22:05 +0200, Alan Stern <[EMAIL PROTECTED]> wrote:
> On Sun, 3 Feb 2008, Matthew Dharm wrote:
> 
>> But, the modifications to usb_stor_access_xfer_buf() look good -- no
>> request from a sub-driver should be allowed to scribble into memory.  The
>> current code does make the implicit assumption that there is enough
>> storage, and will walk right off the end of the sg list if there isn't.
>>
>> I'm not sure I like the mods to usb_stor_set_xfer_buf().  Any place we set
>> a status that we know is going to be thrown away is an invitation for a
>> problem later if someone changes the code to preserve that status.  It's a
>> jack-in-the-box, waiting to spring open in our face later.  The limit check
>> (which mirrors the usb_stor_access_xfer_buf modification) and WARN_ON() are
>> probably good.
>>
>> In a strictly technical sense, the changes to protocol.c are sufficient.
>> That is, they will prevent a serious error.  There is a justification tho
>> to fix all of the users of usb_stor_access_buf() to not attempt to use more
>> SCSI buffer than exists.
>>
>> My opinion is this:  Let's make the protocol.c mods (modulo my comments
>> about setting useless status bits) now.  Then, let's decide if we're going
>> to patch all the other users of the usb_stor_*_xfer_buf() functions as a
>> separate discussion.
> 
> I think the correct approach is to modify those routines so that they 
> will never overrun the s-g buffer (like Boaz has done), and _document_ 
> this behavior.  Then the callers can feel free to try and transfer as 
> much as they want, knowing that an overrun can't occur.  There won't 
> be any need for a WARN_ON or anything else.
> 
> However the interface to usb_stor_access_xfer_buf() will have to change
> slightly.  Right now if it sees that *sgptr is NULL, it assumes this
> means it should start at the beginning of the s-g buffer.  But with 
> Boaz's change, *sgptr == NULL means the transfer has reached the end of 
> the buffer.  So I'll have to go through and audit all the callers.
> 
> Alan Stern
> 
> -
No it does not, this has not changed. Please look again.
Note that this patch was tested and working. It is a bug
in v2.6.24 and it should be accepted already. One way or
the other.

Callers of usb_stor_access_xfer_buf() need not change.
Matthew Dharm should decide if he wants the WARN_ON in
usb_stor_set_xfer_buf() or not and be done with it.

I have found and fixed the bug, but it is not a SCSI
related bug, and it is not due to any scsi changes. It
is a bug from the SG changes of early 2.6.24. Please
take it through the USB tree. Feel free to change it
the way you like it, and submit it.

Boaz




Re: [PATCH 5/24][RFC] dpt_i2o: Use new scsi_eh_cpy_sense()

2008-02-05 Thread Boaz Harrosh
On Mon, Feb 04 2008 at 20:32 +0200, "Salyzyn, Mark" <[EMAIL PROTECTED]> wrote:
> ACK with condition that community accepts the RFC's entire premise.
> 
> The removed code that shunted the REQUEST_SENSE was based on the assumption
> that the sense data in the current scsi command packet was left over from the
> previous command's execution with a check condition as the scsi command packet
> is reused to issue the REQUEST_SENSE. For a new, or second from the target's
> point of view, request sense to the target issued by these older kernels would
> always return an erased sense. The dpt_i2o driver does not itself maintain the
> sense history, nor does the Firmware. This behavior, I believe, is not the case
> for current kernels so the code fragment made little sense (pun not intended).
> If my historical knowledge is correct, this (now removed) workaround makes no
> more sense because the scsi layer correctly manages adapters that produce
> auto-request sense and does not ever turn around the command and send a second
> request for sense information.
>
> Given this understanding, I have no problem with the removed fragment of
> REQUEST_SENSE shunting. However, I do urge some target error recovery testing,
> tape drives being the likely type of target affected by this change. I have no
> such hardware to confirm...
> Sincerely -- Mark Salyzyn

I have removed this test because the midlayer does a scsi_eh_reset_sense() just
before the new invocation of a command. So even if the second bad REQUEST_SENSE
comes, this will not filter it out anymore. If such a thing still happens, a
driver state machine must be used to filter it out, or of course the midlayer
should be fixed.

Boaz


Re: [PATCH 0/7] blk_end_request: full I/O completion handler

2008-02-05 Thread Jens Axboe
On Tue, Feb 05 2008, S, Chandrakala (STSD) wrote:
> Hello,
> 
> We would like to know in which kernel version these patches are
> available.  

They were merged after 2.6.24 was released, so they will show up in the
2.6.25 kernel.

-- 
Jens Axboe



RE: [PATCH 0/7] blk_end_request: full I/O completion handler

2008-02-05 Thread S, Chandrakala (STSD)
Hello,

We would like to know in which kernel version these patches are
available.  
 
Thanks,
Chandrakala

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Jens Axboe
Sent: Monday, September 03, 2007 1:16 PM
To: Kiyoshi Ueda
Cc: [EMAIL PROTECTED]; linux-scsi@vger.kernel.org;
[EMAIL PROTECTED]; Miller, Mike (OS Dev);
[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [PATCH 0/7] blk_end_request: full I/O completion handler

On Fri, Aug 31 2007, Kiyoshi Ueda wrote:
> Hello,
> 
> This set of patches changes request completion interface between 
> device drivers and block layer to 1 step procedure from current 2 step
> procedures using end_that_request_{first/chunk} and
> end_that_request_last().
> 
> This change allows request-based multipath to hook in before 
> completing each chunk of request, check errors for it and retry it 
> using another path if error is detected.
> 
> Summaries of each patch are below:
>   1/7: add new request completion interface, blk_end_request()
>   2/7: add some macros to get the size of request in bytes
>   3/7: convert normal drivers to use blk_end_request()
>   4/7: convert odd drivers like cciss/cpqarray/xsysace to use
>blk_end_request()
>   5/7: convert ide-cd (cdrom_newpc_intr) to use blk_end_request()
>   6/7: remove/unexport no longer needed end_that_request_*
>   7/7: change rq->end_io to cover request completion as a whole
> 
> I have tested the patch on two machines, ia64+QLA1280+QLA2200 and 
> x86_64+SATA+IDE-CDROM.
> I can't test other device drivers for which I don't have hardware.
> So testing help and any comments would be very much appreciated.
> 
> The interface change causes code modifications of *ALL DEVICE DRIVERS*
> which are using end_that_request_{first/chunk/last} to complete request.
> But it should not affect the behavior.
> 
> Please review and apply if no problem.
> This patch-set should be applied on top of 2.6.23-rc3-mm1.
> 
> 
> BACKGROUND
> ==========
> The patch is necessary to allow device stacking at request level, that
> is request-based device-mapper multipath.
> Currently, device-mapper is implemented as a stacking block device at 
> BIO level.  OTOH, request-based DM will stack at request level to 
> allow better multipathing decision.
> To allow device stacking at request level, the completion procedure 
> need to provide a hook for it.
> For example, dm-multipath has to check errors and retry with other 
> paths if necessary before returning the I/O result to upper layer.
> struct request has 'end_io' hook currently.  But it's called at the 
> very late stage of completion handling where the I/O result is already
> returned to the upper layer.
> So we need something here.
> 
> The first approach to hook in completion of each chunk of request was 
> adding a new rq->end_io_first() hook and calling it on the top of 
> __end_that_request_first().
>   - http://marc.theaimsgroup.com/?l=linux-scsi&m=115520444515914&w=2
>   - http://marc.theaimsgroup.com/?l=linux-kernel&m=116656637425880&w=2
> However, Jens pointed out that redesigning rq->end_io() as a full 
> completion handler would be better:
> 
> On Thu, 21 Dec 2006 08:49:47 +0100, Jens Axboe <[EMAIL PROTECTED]> wrote:
> > Ok, I see what you are getting at. The current ->end_io() is called
> > when the request has fully completed, you want notification for each
> > chunk potentially completed.
> >
> > I think a better design here would be to use ->end_io() as the full
> > completion handler, similar to how bio->bi_end_io() works. A request
> > originating from __make_request() would set something ala:
> .
> > instead of calling the functions manually. That would allow you to 
> > get notification right at the beginning and do what you need, 
> > without adding a special hook for this.
> 
> I thought his comment was reasonable.
> So I modified the patches based on his suggestion.
> 
> 
> WHAT IS CHANGED
> ===============
> The change is basically illustrated by the following pseudo code:
> 
> [Before]
>   if (end_that_request_{first/chunk} succeeds) { <-- completes bios
>  
>  end_that_request_last() <-- calls end_io()
>  
>   } else {
>  
>   }
> 
> [After]
>   if (blk_end_request() succeeds) { <-- calls end_io(), completes bios
>  
>   } else {
>  
>   }
> 
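
The driver-facing difference in the pseudo code above can be modelled in
userspace roughly as follows. This is only a sketch under simplifying
assumptions: the model_* names and the struct are invented for illustration and
are not the block layer's functions, and the model does not capture how
rq->end_io() hooks into completion, which is the point of the patch set.

```c
#include <assert.h>

/* Toy request: just a byte count and a counter for the completion hook. */
struct model_rq {
    unsigned int bytes_left;
    int end_io_calls;
};

/* Old scheme, step 1: complete nbytes; nonzero means data remains. */
static int model_end_first(struct model_rq *rq, unsigned int nbytes)
{
    rq->bytes_left -= (nbytes < rq->bytes_left) ? nbytes : rq->bytes_left;
    return rq->bytes_left != 0;
}

/* Old scheme, step 2: the driver itself finishes the request. */
static void model_end_last(struct model_rq *rq)
{
    rq->end_io_calls++;
}

/* New scheme: a single call; returns 0 once the request is finished,
 * nonzero while part of it is still pending. */
static int model_end_request(struct model_rq *rq, unsigned int nbytes)
{
    if (model_end_first(rq, nbytes))
        return 1;
    model_end_last(rq);
    return 0;
}
```

With the single entry point the driver no longer decides when the final step
runs, which is what gives a stacking layer a place to intervene.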
> 
> In detail, request completion procedures are changed like below.
> 
> [Before]
>   o 2 steps completion using end_that_request_{first/chunk}
> and end_that_request_last().
>   o Device drivers have ownership of a request until they
> call end_that_request_last().
>   o rq->end_io() is called at the last stage of
> end_that_request_last() for some block layer codes need
> specific request handling when completing it.
> 
> [After]
>   o 1 step completion using blk_end_request().
> (end_that_request_* are no longer used from device drivers.)
>   o Device drivers give over ownership of a request
> when call

diskdump fails on x3850 (x86_64)

2008-02-05 Thread Ciju Rajan K

Hello folks,

Diskdump on x3850 with adp94xx driver is not working (Redhat Enterprise Linux 4.6). 
'Service diskdump restart' fails. Is there any plan to support it?


Thanks
Ciju


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Olivier Galibert
On Mon, Feb 04, 2008 at 05:57:47PM -0500, Jeff Garzik wrote:
> iSCSI and NBD were passe ideas at birth.  :)
> 
> Networked block devices are attractive because the concepts and 
> implementation are more simple than networked filesystems... but usually 
> you want to run some sort of filesystem on top.  At that point you might 
> as well run NFS or [gfs|ocfs|flavor-of-the-week], and ditch your 
> networked block device (and associated complexity).

Call me a sysadmin, but I find it easier to plug in and keep in place an
ethernet cable than these parallel scsi cables from hell.  Every
server has at least two ethernet ports by default, with rarely any
surprises at the kernel level.  Adding ethernet cards is inexpensive,
and you pretty much never hear of compatibility problems between
cards.

So ethernet as a connection medium is really nice compared to scsi.
Too bad iscsi is demented and ATAoE/NBD nonexistent.  Maybe external
SAS will be nice, but I don't see it getting to the level of
universality of ethernet any time soon.  And it won't get the same
amount of user-level compatibility testing in any case.

  OG.





Re: [PATCH] Marvell 6440 SAS/SATA driver

2008-02-05 Thread Ke Wei

Added support for hotplug and wide port.


Signed-off-by: Ke Wei <[EMAIL PROTECTED]>
---
 drivers/scsi/mvsas.c |  445 ++
 1 files changed, 339 insertions(+), 106 deletions(-)

diff --git a/drivers/scsi/mvsas.c b/drivers/scsi/mvsas.c
index 3bf009b..e5cf3ad 100755
--- a/drivers/scsi/mvsas.c
+++ b/drivers/scsi/mvsas.c
@@ -26,6 +26,12 @@
  structures.  this permits elimination of all the le32_to_cpu()
  and cpu_to_le32() conversions.
 
+   Changelog:
+   2008-02-05  0.4 Added support for hotplug and wide port.
+   2008-01-22  0.3 Added support for SAS HD and SATA Devices.
+   2008-01-09  0.2 detect SAS disk.
+   2007-09-95  0.1 rough draft, Initial version.
+
  */
 
 #include 
@@ -39,13 +45,13 @@
 #include 
 
 #define DRV_NAME   "mvsas"
-#define DRV_VERSION"0.3"
+#define DRV_VERSION"0.4"
 #define _MV_DUMP 0
 #define MVS_DISABLE_NVRAM
 
 #define mr32(reg)  readl(regs + MVS_##reg)
 #define mw32(reg,val)  writel((val), regs + MVS_##reg)
-#define mw32_f(reg,val)do {\
+#define mw32_f(reg,val)do {\
writel((val), regs + MVS_##reg);\
readl(regs + MVS_##reg);\
} while (0)
@@ -54,13 +60,19 @@
 #define MVS_CHIP_SLOT_SZ   (1U << mvi->chip->slot_width)
 
 /* offset for D2H FIS in the Received FIS List Structure */
-#define SATA_RECEIVED_D2H_FIS(reg_set) \
+#define SATA_RECEIVED_D2H_FIS(reg_set) \
((void *) mvi->rx_fis + 0x400 + 0x100 * reg_set + 0x40)
-#define SATA_RECEIVED_PIO_FIS(reg_set) \
+#define SATA_RECEIVED_PIO_FIS(reg_set) \
((void *) mvi->rx_fis + 0x400 + 0x100 * reg_set + 0x20)
-#define UNASSOC_D2H_FIS(id) \
+#define UNASSOC_D2H_FIS(id)\
((void *) mvi->rx_fis + 0x100 * id)
 
+#define for_each_phy(__lseq_mask, __mc, __lseq, __rest)		\
+   for ((__mc) = (__lseq_mask), (__lseq) = 0;  \
+   (__mc) != 0 && __rest;  \
+   (++__lseq), (__mc) >>= 1)   \
+   if (((__mc) & 1))
+
 /* driver compile-time configuration */
 enum driver_configuration {
MVS_TX_RING_SZ  = 1024, /* TX ring size (12-bit) */
@@ -130,6 +142,7 @@ enum hw_registers {
MVS_INT_STAT= 0x150, /* Central int status */
MVS_INT_MASK= 0x154, /* Central int enable */
MVS_INT_STAT_SRS= 0x158, /* SATA register set status */
+   MVS_INT_MASK_SRS= 0x15C,
 
 /* ports 1-3 follow after this */
MVS_P0_INT_STAT = 0x160, /* port0 interrupt status */
@@ -223,7 +236,7 @@ enum hw_register_bits {
 
/* shl for ports 1-3 */
CINT_PORT_STOPPED   = (1U << 16),   /* port0 stopped */
-   CINT_PORT   = (1U << 8),/* port0 event */
+   CINT_PORT   = (1U << 8),/* port0 event */
CINT_PORT_MASK_OFFSET   = 8,
CINT_PORT_MASK  = (0xFF << CINT_PORT_MASK_OFFSET),
 
@@ -300,6 +313,7 @@ enum hw_register_bits {
PHY_READY_MASK  = (1U << 20),
 
/* MVS_Px_INT_STAT, MVS_Px_INT_MASK (per-phy events) */
+   PHYEV_DEC_ERR   = (1U << 24),   /* Phy Decoding Error */
PHYEV_UNASSOC_FIS   = (1U << 19),   /* unassociated FIS rx'd */
PHYEV_AN= (1U << 18),   /* SATA async notification */
PHYEV_BIST_ACT  = (1U << 17),   /* BIST activate FIS */
@@ -501,6 +515,9 @@ enum status_buffer {
SB_RFB_MAX  =  0x400,   /* RFB size*/
 };
 
+enum error_info_rec {
+   CMD_ISS_STPD=  (1U << 31),  /* Cmd Issue Stopped */
+};
 
 struct mvs_chip_info {
u32 n_phy;
@@ -534,6 +551,7 @@ struct mvs_cmd_hdr {
 struct mvs_slot_info {
struct sas_task *task;
u32 n_elem;
+   u32 tx;
 
/* DMA buffer for storing cmd tbl, open addr frame, status buffer,
 * and PRD table
@@ -546,23 +564,28 @@ struct mvs_slot_info {
 
 struct mvs_port {
struct asd_sas_port sas_port;
-   u8  taskfileset;
+   u8  port_attached;
+   union {
+   u8  taskfileset;
+   u8  wide_port_phymap;
+   };
 };
 
 struct mvs_phy {
struct mvs_port *port;
struct asd_sas_phy  sas_phy;
-   struct sas_identify identify;
+   struct sas_identify identify;
+   struct scsi_device  *sdev;
u64 dev_sas_addr;
u64 att_dev_sas_addr;
u32 att_dev_info;
u32 dev_info;
-   u32 type;
+   u32 phy_type;
u32 phy_status;
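
As an aside on the for_each_phy() macro added in this patch: it walks the set
bits of a mask from the lowest bit up, stopping early once the extra condition
becomes false. The sketch below copies the macro into a userspace file to show
that behaviour; phys_in_mask() is a hypothetical helper written for this
illustration and is not part of the driver.

```c
#include <assert.h>

/* Macro as posted: visit each set bit of __lseq_mask, lowest first,
 * for as long as the extra condition __rest holds. */
#define for_each_phy(__lseq_mask, __mc, __lseq, __rest)          \
    for ((__mc) = (__lseq_mask), (__lseq) = 0;                   \
         (__mc) != 0 && __rest;                                  \
         (++__lseq), (__mc) >>= 1)                               \
        if (((__mc) & 1))

/* Hypothetical helper: store the indices of the set bits of mask in out,
 * at most max of them, and return how many were stored. */
static int phys_in_mask(unsigned int mask, int *out, int max)
{
    unsigned int mc;
    int phy, n = 0;

    for_each_phy(mask, mc, phy, n < max)
        out[n++] = phy;
    return n;
}
```

Because the macro ends in a bare if, an else placed directly after the loop
body would bind to that hidden if, so callers may prefer to brace the body.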

RE: [PATCH 5/24][RFC] dpt_i2o: Use new scsi_eh_cpy_sense()

2008-02-05 Thread Salyzyn, Mark
I do not think the midlayer needs to be fixed. I think this was a bug/feature 
that presented itself in the 2.2 tree when we were developing this driver in 
1996...

Sincerely -- Mark Salyzyn

> -Original Message-
> From: Boaz Harrosh [mailto:[EMAIL PROTECTED]
> Sent: Tuesday, February 05, 2008 3:52 AM
> To: Salyzyn, Mark
> Cc: James Bottomley; FUJITA Tomonori; Christoph Hellwig; Jens
> Axboe; Jeff Garzik; linux-scsi; Andrew Morton
> Subject: Re: [PATCH 5/24][RFC] dpt_i2o: Use new scsi_eh_cpy_sense()
>
> On Mon, Feb 04 2008 at 20:32 +0200, "Salyzyn, Mark"
> <[EMAIL PROTECTED]> wrote:
> > ACK with condition that community accepts the RFC's entire premise.
> >
> > The removed code that shunted the REQUEST_SENSE was based on the
> > assumption that the sense data in the current scsi command packet was
> > left over from the previous command's execution with a check condition
> > as the scsi command packet is reused to issue the REQUEST_SENSE. For a
> > new, or second from the target's point of view, request sense to the
> > target issued by these older kernels would always return an erased
> > sense. The dpt_i2o driver does not itself maintain the sense history,
> > nor does the Firmware. This behavior, I believe, is not the case for
> > current kernels so the code fragment made little sense (pun not
> > intended). If my historical knowledge is correct, this (now removed)
> > workaround makes no more sense because the scsi layer correctly manages
> > adapters that produce auto-request sense and does not ever turn around
> > the command and send a second request for sense information.
> >
> > Given this understanding, I have no problem with the removed fragment
> > of REQUEST_SENSE shunting. However, I do urge some target error
> > recovery testing, tape drives being the likely type of target affected
> > by this change. I have no such hardware to confirm...
> > Sincerely -- Mark Salyzyn
>
> I have removed this test because the midlayer does a
> scsi_eh_reset_sense() just before the new invocation of a command. So
> even if the second bad REQUEST_SENSE comes, this will not filter it out
> anymore. If such a thing still happens, a driver state machine must be
> used to filter it out, or of course the midlayer should be fixed.
>
> Boaz
>


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Tue, 05 Feb 2008 08:14:01 +0100
Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:

> James Bottomley wrote:
> 
> > These are both features being independently worked on, are they not?
> > Even if they weren't, the combination of the size of SCST in kernel plus
> > the problem of having to find a migration path for the current STGT
> > users still looks to me to involve the greater amount of work.
> 
> I don't want to be mean, but does anyone actually use STGT in
> production? Seriously?
> 
> In the latest development version of STGT, it's only possible to stop
> the tgtd target daemon using KILL / 9 signal - which also means all
> iSCSI initiator connections are corrupted when tgtd target daemon is
> started again (kernel upgrade, target daemon upgrade, server reboot etc.).

I don't know what "iSCSI initiator connections are corrupted"
means. But if you reboot a server, how can an iSCSI target
implementation keep iSCSI tcp connections?


> Imagine you have to reboot all your NFS clients when you reboot your NFS
> server. Not only that - your data is probably corrupted, or at least the
> filesystem deserves checking...


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Tue, 05 Feb 2008 05:43:10 +0100
Matteo Tescione <[EMAIL PROTECTED]> wrote:

> Hi all,
> And sorry for intrusion, i am not a developer but i work everyday with iscsi
> and i found it fantastic.
> Although AoE, FCoE and so on could be better, we have to look in real world
> implementations what is needed *now*, and if we look at vmware world,
> virtual iron, microsoft clustering etc, the answer is iSCSI.
> And now, SCST is the best open-source iSCSI target. So, from an end-user
> point of view, what are the really problems to not integrate scst in the
> mainstream kernel?

Currently, the best open-source iSCSI target implementation in Linux is
Nicholas's LIO, I guess.


Re: new scsi sense handling

2008-02-05 Thread FUJITA Tomonori
On Mon, 4 Feb 2008 18:39:22 -0800 (PST)
Luben Tuikov <[EMAIL PROTECTED]> wrote:

> --- On Mon, 2/4/08, Boaz Harrosh <[EMAIL PROTECTED]> wrote:
> > There are 3 usages of sense handling in drivers
> >
> > 1. sense is available in driver internal structure and is
> >    mem-copied to upper level
> > 2. A CHECK_CONDITION status was returned and the driver uses the
> >    scsi_eh_prep_cmnd() for a REQUEST_SENSE invocation to the target.
> >    Then returning the sense in scsi_eh_return_cmnd(). A variation on
> >    this is when the driver does nothing, the queue is frozen and the
> >    scsi watchdog timer does the above.
> > 3. The underlying host adapter does the REQUEST_SENSE and a
> >    pre-allocated and DMA mapped sense buffer receives the sense
> >    information from HW.
> 
> Many years ago when "ACA" had a constructive meaning,
> so did "Autosense".  Then about 5 years ago, "Autosense"
> disappeared completely since it became the de facto
> implementation of the then SCSI Execute Command "RPC",
> now just SCSI Execute Command procedure call.
> 
> At that point in time, the SCSI mid-layer decided
> to embrace this model and give the LLDD a scsi command
> structure which included the sense data buffer to
> a size that the SCSI mid-layer was interested in,
> at the moment 96 bytes, macro defined in
> include/scsi/scsi_cmnd.h.
> 
> The concept of "Autosense" was off-loaded to LLDD
> to emulate it if the specific target device to
> which the command was issued, didn't supply the
> sense data on CHECK CONDITION, and more so
> relevant to target devices which implemented
> queuing, thus the ACA.
> 
> And the mid-layer would consider extracting
> the sense data via REQUEST SENSE command
> as a _special case_ if the LLDD/transport layer
> didn't implement the "autosense" model.

Only SPI and USB?

Most of the LLDs using the transport protocols that we care about today
use a sense buffer in their own internal structure.

I think that the issue to solve to kill scsi_cmnd:sense_buffer is how
to share (or export) such a sense buffer with the scsi mid-layer.

For the old transport protocols, we could do something like what James
said in this thread to kill scsi_cmnd:sense_buffer.
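
To make the first usage concrete (sense held in a driver-internal structure and
copied upward), here is a minimal userspace sketch. The structure, the function
name and the 252-byte internal buffer are assumptions made for illustration;
only the 96-byte size of the mid-layer buffer comes from the discussion above.

```c
#include <assert.h>
#include <string.h>

#define SENSE_BUFFERSIZE 96   /* size quoted in the thread; name is a stand-in */

/* Hypothetical per-command driver state holding sense returned by the HBA. */
struct drv_cmd {
    unsigned char sense[252];
    size_t sense_len;
};

/* Copy driver-held sense into the fixed-size upper-level buffer without
 * overrunning it. Returns the number of bytes copied. */
static size_t copy_sense(unsigned char *dst, const struct drv_cmd *c)
{
    size_t n = c->sense_len < SENSE_BUFFERSIZE ? c->sense_len
                                               : SENSE_BUFFERSIZE;

    memset(dst, 0, SENSE_BUFFERSIZE);   /* no stale sense left behind */
    memcpy(dst, c->sense, n);
    return n;
}
```

Sharing or exporting the driver's own buffer, as suggested above, would remove
this copy and the second buffer altogether.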


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Mon, 4 Feb 2008 20:07:01 -0600
"Chris Weiss" <[EMAIL PROTECTED]> wrote:

> On Feb 4, 2008 11:30 AM, Douglas Gilbert <[EMAIL PROTECTED]> wrote:
> > Alan Cox wrote:
> > >> better. So for example, I personally suspect that ATA-over-ethernet is way
> > >> better than some crazy SCSI-over-TCP crap, but I'm biased for simple and
> > >> low-level, and against those crazy SCSI people to begin with.
> > >
> > > Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
> > > would probably trash iSCSI for latency if nothing else.
> >
> > And a variant that doesn't do ATA or IP:
> > http://www.fcoe.com/
> >
> 
> however, and interestingly enough, the open-fcoe software target
> depends on scst (for now anyway)

STGT also supports software FCoE target driver though it's still
experimental stuff.

http://www.mail-archive.com/linux-scsi@vger.kernel.org/msg12705.html

It works in user space like STGT's iSCSI (and iSER) target driver
(i.e. no kernel/user space interaction).


Re: [PATCH] bugfix for an underflow condition in usb storage & isd200.c

2008-02-05 Thread Alan Stern
On Tue, 5 Feb 2008, Boaz Harrosh wrote:

> > However the interface to usb_stor_access_xfer_buf() will have to change
> > slightly.  Right now if it sees that *sgptr is NULL, it assumes this
> > means it should start at the beginning of the s-g buffer.  But with 
> > Boaz's change, *sgptr == NULL means the transfer has reached the end of 
> > the buffer.  So I'll have to go through and audit all the callers.
> > 
> > Alan Stern
> > 
> > -
> No it does not, this has not changed. Please look again.

You look again.  Your patched code goes like this:

	struct scatterlist *sg = *sgptr;

	if (!sg)
		sg = (struct scatterlist *) srb->request_buffer;

Hence if *sgptr is NULL upon entry, it is taken to mean that the 
transfer should start at the beginning of the s-g buffer.

	/* This loop handles a single s-g list entry, which may
	 * include multiple pages.  Find the initial page structure
	 * and the starting offset within the page, and update
	 * the *offset and *index values for the next loop. */
	cnt = 0;
	while (cnt < buflen && sg) {

Hence if sg is NULL, it indicates the end of the buffer has been 
reached.  And then down near the end of the routine:

	*sgptr = sg;

Hence if the end is reached and the caller makes another call to try 
transferring more data, the additional data will get stored back at the 
beginning of the buffer.
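
The wrap-around described here can be reproduced with a small userspace model.
This is a sketch only: struct seg, store() and demo_wraparound() are invented
for the illustration and merely mimic the three quoted fragments (NULL on entry
means "start at the head", the loop stops when the list runs out, and the
cursor is written back on exit).

```c
#include <assert.h>
#include <stddef.h>

/* One-slot stand-in for a scatter-gather entry. */
struct seg {
    int value;
    struct seg *next;
};

/* Store up to n values into the list, resuming at *cursor.
 * A NULL *cursor on entry is taken to mean "start at head"; the cursor
 * is written back on exit, so it is NULL once the end has been reached. */
static size_t store(struct seg *head, struct seg **cursor,
                    const int *vals, size_t n)
{
    struct seg *s = *cursor;
    size_t cnt = 0;

    if (!s)
        s = head;
    while (cnt < n && s) {
        s->value = vals[cnt++];
        s = s->next;
    }
    *cursor = s;
    return cnt;
}

/* Returns the value left in the first segment after two calls. */
static int demo_wraparound(void)
{
    struct seg b = { 0, NULL }, a = { 0, &b };
    struct seg *cursor = NULL;
    const int first[] = { 1, 2 };
    const int more[] = { 9 };

    store(&a, &cursor, first, 2);   /* fills a and b; cursor is now NULL */
    store(&a, &cursor, more, 1);    /* NULL is read as "start": a is overwritten */
    return a.value;
}
```

In this model the second call silently replaces the first stored value, because
the same NULL encodes both "not started" and "finished".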

> Note that this patch was tested and working. It is a bug
> in v2.6.24 and it should be accepted already. One way or
> the other.
> 
> Callers of usb_stor_access_xfer_buf() need not change.
> Matthew Dharm should decide if he wants the WARN_ON in
> usb_stor_set_xfer_buf() or not and be done with it.
> 
> I have found and fixed the bug, but it is not a SCSI
> related bug, and it is not due to any scsi changes. It
> is a bug from the SG changes of early 2.6.24. Please
> take it through the USB tree. Feel free to change it
> the way you like it, and submit it.

I will post a new version of this which handles all these issues.  
Expect it in a day or so.

Alan Stern



Re: [PATCH] enclosure: add support for enclosure services

2008-02-05 Thread James Bottomley
On Mon, 2008-02-04 at 21:35 -0800, Luben Tuikov wrote:
> > > I guess the same could be said for STGT and SCST, right?
> > 
> > You mean both of their kernel pieces are modular?  That's correct.
> 
> No, you know very well what I mean.
> 
> By the same logic you're preaching to include your
> solution part of the kernel, you can also apply to
> SCST.

Ah, but it's not ... the current patch is merely exporting an interface.
The debate in STGT vs SCST is not whether to export an interface but
where to draw the line.

You could also argue in the same vein that sd is redundant because a
filesystem could talk directly to the device via /dev/sgX (in fact OSD
based filesystems already do this).  The argument is true, but misses
the bigger picture that the interfaces exported by sd are more portable
(apply to non-SCSI block devices) and easier to use.

> > > Yes, for which the transport layer, implements the
> > > scsi device node for the SES device.  It doesn't
> > really
> > > matter if the SCSI commands sent to the SES device go
> > > over SGPIO or FC or SAS or Bluetooth or I2C, etc, the
> > > transport layer can implement that and present the
> > > /dev/sgX node.
> > 
> > But it does matter if the enclosure device doesn't
> > speak SCSI.
> 
> Enclosure management isn't as simple as you're
> portraying it here.  The enclosure management
> device speaks either SES or SAF-TE.  The transport
> protocol to access it could be SGPIO or I2C or...

Look, just read the spec; SGPIO is a bus for driving enclosures ... it
doesn't require SES or SAF-TE or even any SCSI protocol.

> >  SGPIO
> > isn't a SCSI protocol ... it's a general purpose
> > serial bus protocol.
> > It's pretty simple and register based, but it might (or
> > might not) be
> > accessible via a SCSI bridge.
> 
> I see.  You've just discovered SGPIO -- good for you.
> 
> At any rate, I told you already that what is needed
> is not what you've provided but a _device node_
> exported by the kernel, either a processor or
> enclosure type.

Wrong ... we don't export non-SCSI devices as SCSI (with the single and
rather annoying exception of ATA via SAT).

James




Re: [patch 01/13] git-scsi-misc: fix isa/pcmcia compile problem

2008-02-05 Thread James Bottomley
On Mon, 2008-02-04 at 23:53 -0800, [EMAIL PROTECTED] wrote:
> From: Tejun Heo <[EMAIL PROTECTED]>
> 
> aha152x.c and fdomain are built twice - once for the isa driver and once
> for the PCMCIA one.  Through #ifdefs, the compiled codes are slightly
> different; thus, global symbols need to be given different names depending
> on which flavor is being built.  This patch adds GLOBAL() macro to
> aha152x.h and fdomain.h which change the symbol depending on PCMCIA.
> 
> This bug has always existed but has been masked by the fact the
> drivers/scsi/pcmcia used subdir-(y|m) instead of obj-(y|m) which made
> drivers/scsi/pcmcia/built_in.o not linked into the kernel and thus avoided
> the duplicate symbols during compilation.
> 
> [EMAIL PROTECTED]: coding-style fixes]
> Signed-off-by: Tejun Heo <[EMAIL PROTECTED]>
> Tested-by: Kamalesh Babulal <[EMAIL PROTECTED]>
> Cc: James Bottomley <[EMAIL PROTECTED]>
> Signed-off-by: Andrew Morton <[EMAIL PROTECTED]>

An alternative fix for this is already in.

Author: James Bottomley <[EMAIL PROTECTED]>
Date:   Fri Jan 18 17:47:56 2008 -0600

[SCSI] fix pcmcia compile problem


James




[PATCH] [SCSI] fix BUG when sum(scatterlist) > bufflen

2008-02-05 Thread Tony Battersby
When sending a SCSI command to a tape drive via the SCSI Generic (sg)
driver, if the command has a data transfer length more than
scatter_elem_sz (32 KB default) and not a multiple of 512, then I either
hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else
the command never completes (depending on the LLDD).

When constructing scatterlists, the sg driver rounds up the scatterlist
element sizes to be a multiple of 512.  This can result in
sum(scatterlist lengths) > bufflen.  In this case, scsi_req_map_sg()
incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to
bufflen.  When the command completes, req_bio_endio() detects that
bio->bi_size != 0, and so it doesn't call bio_endio().  This causes the
command to be resubmitted, resulting in BUG_ON or the command never
completing.

This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather
than to sum(scatterlist lengths), which fixes the problem.

Signed-off-by: Tony Battersby <[EMAIL PROTECTED]>
---
--- linux-2.6.24-git14/drivers/scsi/scsi_lib.c.orig 2008-02-05 09:33:05.0 -0500
+++ linux-2.6.24-git14/drivers/scsi/scsi_lib.c  2008-02-05 09:33:10.0 -0500
@@ -301,7 +301,6 @@ static int scsi_req_map_sg(struct reques
page = sg_page(sg);
off = sg->offset;
len = sg->length;
-   data_len += len;
 
while (len > 0 && data_len > 0) {
/*
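
The size mismatch described in this message can be illustrated with a small
user-space sketch. It is not the sg driver's code: the function names are
hypothetical, and the model (split the buffer into elements of at most
elem_sz bytes, then round each element up to a multiple of 512) is a
simplifying assumption based only on the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Round len up to the next multiple of align (hypothetical helper). */
size_t round_up_to(size_t len, size_t align)
{
    assert(align != 0);
    return (len + align - 1) / align * align;
}

/*
 * Simplified model: a buffer of bufflen bytes is split into elements of
 * at most elem_sz bytes, and every element length is rounded up to a
 * multiple of 512.  Returns the sum of the rounded element lengths.
 */
size_t sg_total_len(size_t bufflen, size_t elem_sz)
{
    size_t total = 0;

    assert(elem_sz != 0);
    while (bufflen > 0) {
        size_t chunk = bufflen < elem_sz ? bufflen : elem_sz;

        total += round_up_to(chunk, 512);
        bufflen -= chunk;
    }
    return total;
}
```

With a 32 KB element size, a transfer of 32768 + 100 bytes yields a sum of
32768 + 512 bytes in this model, so the sum exceeds bufflen by 412 bytes. A
transfer that is a multiple of 512 shows no difference.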




Re: 2.6.24 regression w/ QLA2300

2008-02-05 Thread Andrew Vasquez
On Tue, 05 Feb 2008, Alan D. Brunelle wrote:

> commit 9b73e76f3cf63379dcf45fcd4f112f5812418d0a
> Merge: 50d9a12... 23c3e29...
> Author: Linus Torvalds <[EMAIL PROTECTED]>
> Date:   Fri Jan 25 17:19:08 2008 -0800
> 
> Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
> 
> * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 
> commits)
> 
> I believe a regression was introduced. I'm running on a 4-way IA64,
> with straight 2.6.24 and 2 dual-port cards:
> 
> 40:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> 40:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> c0:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> c0:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> 
> the adapters failed initialization. In particular, I narrowed it down
> to failing the qla2x00_mbox_command call within qla2x00_init_firmware
> function. I went and removed the qla2x00-related parts of this (large-ish)
> merge, and the 4 ports initialized just fine.

Could you load the (default 2.6.24) driver with
ql2xextended_error_logging modules parameter set:

# insmod qla2xxx ql2xextended_error_logging=1

and send the resultant kernel logs?

> Specifically, reverting the "patch" below enabled the devices to initialize 
> properly.
> 
> If need be, I'm certainly willing to help narrow down to the specific part in
> this patch...

That's a rather large patch... :(   Any chance you could git-bisect?
Also, could you send your .config file you are using?

Thanks,
Andrew Vasquez


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Tomasz Chmielewski

FUJITA Tomonori wrote:
> On Tue, 05 Feb 2008 08:14:01 +0100
> Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> 
>> James Bottomley wrote:
>>
>>> These are both features being independently worked on, are they not?
>>> Even if they weren't, the combination of the size of SCST in kernel plus
>>> the problem of having to find a migration path for the current STGT
>>> users still looks to me to involve the greater amount of work.
>> I don't want to be mean, but does anyone actually use STGT in
>> production? Seriously?
>>
>> In the latest development version of STGT, it's only possible to stop
>> the tgtd target daemon using KILL / 9 signal - which also means all
>> iSCSI initiator connections are corrupted when tgtd target daemon is
>> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> 
> I don't know what "iSCSI initiator connections are corrupted"
> means. But if you reboot a server, how can an iSCSI target
> implementation keep iSCSI tcp connections?


The problem with tgtd is that you can't start it (configured) in an
"atomic" way.
Usually, one will start tgtd and its configuration in a script (I 
replaced some parameters with "..." to make it shorter and more readable):



tgtd
tgtadm --op new ...
tgtadm --lld iscsi --op new ...


However, this won't work - tgtd goes into the background immediately, while it 
is still starting, and the first tgtadm commands will fail:


# bash -x tgtd-start
+ tgtd
+ tgtadm --op new --mode target ...
tgtadm: can't connect to the tgt daemon, Connection refused
tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
not connected

+ tgtadm --lld iscsi --op new --mode account ...
tgtadm: can't connect to the tgt daemon, Connection refused
tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
not connected

+ tgtadm --lld iscsi --op bind --mode account --tid 1 ...
tgtadm: can't find the target
+ tgtadm --op new --mode logicalunit --tid 1 --lun 1 ...
tgtadm: can't find the target
+ tgtadm --op bind --mode target --tid 1 -I ALL
tgtadm: can't find the target
+ tgtadm --op new --mode target --tid 2 ...
+ tgtadm --op new --mode logicalunit --tid 2 --lun 1 ...
+ tgtadm --op bind --mode target --tid 2 -I ALL


OK, if tgtd takes longer to start, perhaps it's a good idea to sleep a 
second right after tgtd?


tgtd
sleep 1
tgtadm --op new ...
tgtadm --lld iscsi --op new ...


No, it is not a good idea - if tgtd listens on port 3260 *and* is 
not yet configured, any reconnecting initiator will fail, like below:


end_request: I/O error, dev sdb, sector 7045192
Buffer I/O error on device sdb, logical block 880649
lost page write due to I/O error on sdb
Aborting journal on device sdb.
ext3_abort called.
EXT3-fs error (device sdb): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
end_request: I/O error, dev sdb, sector 7045880
Buffer I/O error on device sdb, logical block 880735
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 6728
Buffer I/O error on device sdb, logical block 841
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 7045192
Buffer I/O error on device sdb, logical block 880649
lost page write due to I/O error on sdb
end_request: I/O error, dev sdb, sector 7045880
Buffer I/O error on device sdb, logical block 880735
lost page write due to I/O error on sdb
__journal_remove_journal_head: freeing b_frozen_data
__journal_remove_journal_head: freeing b_frozen_data


Ouch.

So the only way to start/restart tgtd reliably is to do hacks which are 
needed with yet another iSCSI kernel implementation (IET): use iptables.


iptables 
tgtd
sleep 1
tgtadm --op new ...
tgtadm --lld iscsi --op new ...
iptables 


A bit ugly, isn't it?
Having to tinker with a firewall in order to start a daemon is by no 
means a sign of a well-tested and mature project.


That's why I asked how many people use stgt in a production environment 
- James was worried about a potential migration path for current users.




--
Tomasz Chmielewski
http://wpkg.org



Re: 2.6.24 regression w/ QLA2300

2008-02-05 Thread Andrew Vasquez
On Tue, 05 Feb 2008, Andrew Vasquez wrote:

> On Tue, 05 Feb 2008, Alan D. Brunelle wrote:
> 
> > commit 9b73e76f3cf63379dcf45fcd4f112f5812418d0a
> > Merge: 50d9a12... 23c3e29...
> > Author: Linus Torvalds <[EMAIL PROTECTED]>
> > Date:   Fri Jan 25 17:19:08 2008 -0800
> > 
> > Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
> > 
> > * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: 
> > (200 commits)
> > 
> > I believe a regression was introduced. I'm running on a 4-way IA64,
> > with straight 2.6.24 and 2 dual-port cards:
> > 
> > 40:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> > 40:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> > c0:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> > c0:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
> > 
> > the adapters failed initialization. In particular, I narrowed it down
> > to failing the qla2x00_mbox_command call within qla2x00_init_firmware
> > function. I went and removed the qla2x00-related parts of this (large-ish)
> > merge, and the 4 ports initialized just fine.
> 
> Could you load the (default 2.6.24) driver with
> ql2xextended_error_logging modules parameter set:
> 
>   # insmod qla2xxx ql2xextended_error_logging=1
> 
> and send the resultant kernel logs?

Could you try the patch referenced here:

qla2xxx: Correct issue where incorrect init-fw mailbox command was used on 
non-NPIV capable ISPs.
http://article.gmane.org/gmane.linux.scsi/38240

Thanks, av

---

qla2xxx: Correct issue where incorrect init-fw mailbox command was used on 
non-NPIV capable ISPs.

BIT_2 of the firmware attributes is only valid on FW-interface-2
type HBAs.  Code in commit
c48339decceec8e011498b0fc4c7c7d8b2ea06c1 would cause the
incorrect initialize-firmware mailbox command to be issued for
non-NPIV capable ISPs.  Correct this by reverting to previously
used (and correct) pre-condition 'if' check.

Signed-off-by: Andrew Vasquez <[EMAIL PROTECTED]>
---
 drivers/scsi/qla2xxx/qla_mbx.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/qla2xxx/qla_mbx.c b/drivers/scsi/qla2xxx/qla_mbx.c
index 0c10c0b..99d29ff 100644
--- a/drivers/scsi/qla2xxx/qla_mbx.c
+++ b/drivers/scsi/qla2xxx/qla_mbx.c
@@ -980,7 +980,7 @@ qla2x00_init_firmware(scsi_qla_host_t *ha, uint16_t size)
DEBUG11(printk("qla2x00_init_firmware(%ld): entered.\n",
ha->host_no));
 
-   if (ha->fw_attributes & BIT_2)
+   if (ha->flags.npiv_supported)
mcp->mb[0] = MBC_MID_INITIALIZE_FIRMWARE;
else
mcp->mb[0] = MBC_INITIALIZE_FIRMWARE;
-- 
1.5.4.rc5.5.gab98
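
The effect of the one-line change can be sketched outside the driver. The
struct below is a hypothetical stand-in for the relevant fields of
scsi_qla_host_t, and the opcode values are assumptions for illustration
(0x48 matches the cmd=48 seen in failing logs elsewhere in this thread;
0x60 is assumed for the plain command). Only the two 'if' conditions come
from the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BIT_2                        (1u << 2)
/* Assumed opcode values, for illustration only. */
#define MBC_MID_INITIALIZE_FIRMWARE  0x48
#define MBC_INITIALIZE_FIRMWARE      0x60

/* Hypothetical stand-in for the fields the patch looks at. */
struct example_hba {
    uint16_t fw_attributes;   /* BIT_2 only meaningful on FW-interface-2 HBAs */
    int npiv_supported;       /* stand-in for ha->flags.npiv_supported */
};

/* Pre-patch condition: trusts BIT_2 on every HBA type. */
uint16_t init_cmd_by_fw_attr(const struct example_hba *ha)
{
    assert(ha != NULL);
    return (ha->fw_attributes & BIT_2) ? MBC_MID_INITIALIZE_FIRMWARE
                                       : MBC_INITIALIZE_FIRMWARE;
}

/* Post-patch condition: uses the NPIV capability flag instead. */
uint16_t init_cmd_by_npiv_flag(const struct example_hba *ha)
{
    assert(ha != NULL);
    return ha->npiv_supported ? MBC_MID_INITIALIZE_FIRMWARE
                              : MBC_INITIALIZE_FIRMWARE;
}
```

For an adapter that happens to have BIT_2 set but is not NPIV capable, the
first function picks the MID variant and the second picks the plain
initialize-firmware command, which is the difference the patch description
points at.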



Re: [PATCH 1/24][RFC] scsi_eh: Define new API for sense handling

2008-02-05 Thread Boaz Harrosh
On Mon, Feb 04 2008 at 19:33 +0200, James Bottomley <[EMAIL PROTECTED]> wrote:
> On Mon, 2008-02-04 at 17:30 +0200, Boaz Harrosh wrote:
>> This patch defines a new API for sense handling. All drivers will
>>   be converted to this API, before the sense handling implementation will
>>   change. API is as follows:
>>
>> void scsi_eh_cpy_sense(struct scsi_cmnd *cmd, void* sense,
>>   unsigned sense_bytes);
>> To be used by drivers, when they have sense-bits
>> and want to send them to upper layer. Max size
>> need not be a concern. If upper layer does not have
>> enough space it will be automatically truncated.
>>
>> u8 *scsi_make_sense(struct scsi_cmnd *cmd);
>> To be used by drivers, and scsi-midlayer. Returns a DMA-able
>> sense buffer. Must be returned by scsi_return_sense(). It should
>> never fail if .pre_allocate_sense && .sense_buffsize in host
>> template were properly set.
>> The buffer is shost->sense_buffsize bytes long.
>>
>> void *scsi_return_sense(struct scsi_cmnd *cmd, u8 *sb);
>> Frees and returns the sense to the upper layer,
>> copying only what's necessary.
>>
>> void scsi_eh_reset_sense(struct scsi_cmnd *cmd)
>> Should not be used or necessary.
>>
>> const u8 *scsi_sense(struct scsi_cmnd *cmd)
>> Used by ULDs and for inspecting the returned sense, can not
>> be modified. It is only valid after a call to
>> scsi_eh_cpy_sense() or a call to scsi_return_sense(). Before
>> that it will/should return an empty buffer.
>>
>> New members at scsi host template:
>> .sense_buffsize - if a driver calls scsi_make_sense() or
>>   scsi_eh_prep_cmnd(), this value should be non-zero,
>>   indicating the max sense size the driver
>>   supports. In most cases it should be
>>   SCSI_SENSE_BUFFERSIZE.
>>   If this value is zero the driver will only call
>>   scsi_eh_cpy_sense().
>>
>> .pre_allocate_sense - if a driver calls scsi_make_sense()
>>   in .queuecommand for every cmnd, this
>>   should be set to true. In which case
>>   scsi_make_sense() will not fail because
>>   midlayer will fail the command allocation.
>>   If the driver calls scsi_eh_prep_cmnd()
>>   then sense_buffsize is non-zero but this
>>   here is set to false.
> 
> My initial reaction to this is that you're doing too many contortions to
> ensure something we don't particularly care about:  whether we can
> allocate a sense buffer atomically or not.

I hope that now, once you've actually seen the implementation, my
motivation is clearer.  Perhaps I explained it badly, but the actual code
is pretty simple and contortions is not how I would describe it. The API
above is just a way for drivers to say how they intend to behave, and the
midlayer will accommodate easily. None of the solutions are hard and they
are all simpler than what exists today. The only added complexity introduced 
is the initial choice.
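
The truncation behaviour that the quoted API description attributes to
scsi_eh_cpy_sense() can be sketched in user space as follows. This is not
the patchset's implementation; copy_sense_truncated() and its parameters
are hypothetical names used only to show the "copy at most what fits" idea.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy at most dst_size bytes of sense data from src to dst.
 * Returns the number of bytes actually copied, so a caller can tell
 * whether the data was truncated.
 */
size_t copy_sense_truncated(unsigned char *dst, size_t dst_size,
                            const unsigned char *src, size_t src_bytes)
{
    size_t n = src_bytes < dst_size ? src_bytes : dst_size;

    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n);
    return n;
}
```

A driver holding more sense bytes than the upper layer has room for would
simply see the tail dropped, for example when a 264-byte source is copied
into a 96-byte destination.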

> 
> What all this code should be doing is simply allocating the sense buffer
> in scsi_eh_prep_cmnd() using tomo's existing slab (and GFP_ATOMIC) 

This is what we are doing. Only allocating the sense buffer in the very
unlikely event of the call to scsi_eh_prep_cmnd(). So we are in agreement
here.

> if
> that fails, we need a return from scsi_eh_prep_cmnd() telling us.  At
> that point, the driver should abandon the auto request sense attempt and
> instead just return the CC/UA without the DRIVER_SENSE bit set which
> will trigger the eh to collect the sense for us.
> 

This is a nightmare and a serious regression. It will cause an IO deadlock
in the event of an IO error during an IO-to-free-memory scenario.

The memory footprint of a system running with my patchset, after the very first
request, is the same as with the current (Post Tomo) code. Only thing is, my
system will preallocate a bit more memory, 96 bytes, per device scanned.  This
happens anyway, currently, with Tomo's code as soon as the device is used the
first time.

Preallocating the sense buffer during initialization eliminates the need to
allocate it for every command, providing considerable performance and memory
consumption benefits. All that without compromising robustness in the event of
an IO error on heavily loaded systems.

> Ideally, doing it this way might mean we could even dump the
> sense_buffer pointer from the command (although I don't see that as
> necessary).
> 
> This solves the 99% case without getting into preallocation contortions.
> 

After the final patch you can see that I have ditched the sense_buffer pointer
without sacrificing anything in reliability, and absolutely got rid of any
sense al

Re: [PATCH 21/24][RFC] scsi_tgt: use of sense accessors

2008-02-05 Thread Pete Wyckoff
[EMAIL PROTECTED] wrote on Mon, 04 Feb 2008 19:53 +0200:
>   FIXME: I need help with this driver (Pete?)
> I used scsi_sense() in a non-const way. But since
> scsi_tgt is the ULD here, it can just access its own sense
> buffer directly. I did not use scsi_eh_cpy_sense() because
> I did not want the extra copy. Pete will want to use a 260
> bytes buffer here.
> 
> Signed-off-by: Boaz Harrosh <[EMAIL PROTECTED]>
> Need-help-from: Pete Wyckoff <[EMAIL PROTECTED]>

FYI, I never use scsi_tgt.  Only just pure userspace on the target,
and a dumb ethernet NIC that does not know it is speaking any form
of SCSI.

People who need scsi_tgt have real target-enabled NICs like the
fancy qla4xxx.  Those act as SCSI targets across FC or IP or
whatever and bring commands into the kernel, which then relays them
to a userspace tgtd process, which does the read/write as necessary,
and returns a result code to the NIC to ship back across FC.

So sorry, I won't take a guess at what has to happen here.  But
yeah, you are right that an OSD target implementation would at times
need a sense buffer bigger than 96.  Protocol maximum length for all
sense data is 264.

-- Pete


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Ming Zhang
On Tue, 2008-02-05 at 17:07 +0100, Tomasz Chmielewski wrote:
> FUJITA Tomonori wrote:
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > 
> >> James Bottomley wrote:
> >>
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >>
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > means. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI tcp connections?
> 
> The problem with tgtd is that you can't start it (configured) in an
> "atomic" way.
> Usually, one will start tgtd and its configuration in a script (I 
> replaced some parameters with "..." to make it shorter and more readable):
> 
> 
> tgtd
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> 
> 
> However, this won't work - tgtd goes immediately in the background as it 
> is still starting, and the first tgtadm commands will fail:

This should be an easy fix: start tgtd, get the port set up in the forked
process, then signal the parent that it can exit. Or set up the port in the
parent, fork, and pass it to the daemon.
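
The first variant (child sets up, then tells the parent it may return) can
be sketched with a pipe. This is a generic illustration, not tgtd code:
spawn_and_wait_ready(), setup_ok() and setup_fail() are hypothetical names,
and the setup callbacks merely stand in for "bind and listen on the port".

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical setup callbacks; return 0 on success. */
int setup_ok(void)   { return 0; }
int setup_fail(void) { return 1; }

/*
 * Fork a child that runs setup(), and return in the parent only after
 * the child has reported readiness over a pipe.  Returns the child's
 * pid on success, or -1 if the child failed before becoming ready.
 */
pid_t spawn_and_wait_ready(int (*setup)(void))
{
    int fds[2];
    char status = 0;
    ssize_t n;
    pid_t pid;

    assert(setup != NULL);
    if (pipe(fds) < 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: finish setup first, then report readiness. */
        close(fds[0]);
        if (setup() != 0)
            _exit(1);          /* pipe closes unwritten, parent sees EOF */
        status = 'R';
        if (write(fds[1], &status, 1) != 1)
            _exit(1);
        close(fds[1]);
        /* A real daemon would enter its main loop here. */
        _exit(0);
    }

    /* Parent: block until the child reports readiness or exits. */
    close(fds[1]);
    do {
        n = read(fds[0], &status, 1);
    } while (n < 0 && errno == EINTR);
    close(fds[0]);

    if (n != 1 || status != 'R') {
        waitpid(pid, NULL, 0);
        return -1;
    }
    return pid;
}
```

With this shape, a start script that runs the daemon and then a
configuration tool would not reach the second command until the listening
side exists, which removes the need for a sleep.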


> 
> # bash -x tgtd-start
> + tgtd
> + tgtadm --op new --mode target ...
> tgtadm: can't connect to the tgt daemon, Connection refused
> tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
> not connected
> + tgtadm --lld iscsi --op new --mode account ...
> tgtadm: can't connect to the tgt daemon, Connection refused
> tgtadm: can't send the request to the tgt daemon, Transport endpoint is 
> not connected
> + tgtadm --lld iscsi --op bind --mode account --tid 1 ...
> tgtadm: can't find the target
> + tgtadm --op new --mode logicalunit --tid 1 --lun 1 ...
> tgtadm: can't find the target
> + tgtadm --op bind --mode target --tid 1 -I ALL
> tgtadm: can't find the target
> + tgtadm --op new --mode target --tid 2 ...
> + tgtadm --op new --mode logicalunit --tid 2 --lun 1 ...
> + tgtadm --op bind --mode target --tid 2 -I ALL
> 
> 
> OK, if tgtd takes longer to start, perhaps it's a good idea to sleep a 
> second right after tgtd?
> 
> tgtd
> sleep 1
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> 
> 
> No, it is not a good idea - if tgtd listens on port 3260 *and* is 
> unconfigured yet,  any reconnecting initiator will fail, like below:

This is another easy fix: tgtd could start in an unconfigured state, and
tgtadm could then configure it and switch it to a ready state.


Those are really minor usability issues. (I know it is painful for users,
I agree.)


The major question here is an architectural one: which one is better?
The Linux kernel should have one implementation that is good from the
foundation up.





> 
> end_request: I/O error, dev sdb, sector 7045192
> Buffer I/O error on device sdb, logical block 880649
> lost page write due to I/O error on sdb
> Aborting journal on device sdb.
> ext3_abort called.
> EXT3-fs error (device sdb): ext3_journal_start_sb: Detected aborted journal
> Remounting filesystem read-only
> end_request: I/O error, dev sdb, sector 7045880
> Buffer I/O error on device sdb, logical block 880735
> lost page write due to I/O error on sdb
> end_request: I/O error, dev sdb, sector 6728
> Buffer I/O error on device sdb, logical block 841
> lost page write due to I/O error on sdb
> end_request: I/O error, dev sdb, sector 7045192
> Buffer I/O error on device sdb, logical block 880649
> lost page write due to I/O error on sdb
> end_request: I/O error, dev sdb, sector 7045880
> Buffer I/O error on device sdb, logical block 880735
> lost page write due to I/O error on sdb
> __journal_remove_journal_head: freeing b_frozen_data
> __journal_remove_journal_head: freeing b_frozen_data
> 
> 
> Ouch.
> 
> So the only way to start/restart tgtd reliably is to do hacks which are 
> needed with yet another iSCSI kernel implementation (IET): use iptables.
> 
> iptables 
> tgtd
> sleep 1
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> iptables 
> 
> 
> A bit ugly, isn't it?
> Having to tinker with a firewall in order to start a daemon is by no 
> means a sign of a well-tested and mature project.
> 
> That's why I asked how many people use stgt in a production environment 
> - James was worried about a potential migration path for current users.
> 
> 
> 
> -- 
> Tomasz Chmielewski
> http://wpkg.org
> 
> 
> ---

Re: 2.6.24 regression w/ QLA2300

2008-02-05 Thread Andrew Vasquez
On Tue, 05 Feb 2008, Andrew Vasquez wrote:

> > Could you load the (default 2.6.24) driver with
> > ql2xextended_error_logging modules parameter set:
> > 
> > # insmod qla2xxx ql2xextended_error_logging=1
> > 
> > and send the resultant kernel logs?
> 
> Could you try the patch referenced here:
> 
> qla2xxx: Correct issue where incorrect init-fw mailbox command was used on 
> non-NPIV capable ISPs.
> http://article.gmane.org/gmane.linux.scsi/38240


BTW:  the regression in question is not present in vanilla 2.6.24.
Instead it was introduced early on in the 2.6.25 merge-window.  Linus'
tree currently has the patch referenced above as well.

--
av


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Bart Van Assche
Regarding the performance tests I promised to perform: although so far
I have only been able to run two tests (STGT + iSER versus SCST +
SRP), the results are interesting. I will run the remaining test cases
over the next few days.

About the test setup: dd and xdd were used to transfer 2 GB of data
between an initiator system and a target system via direct I/O over an
SDR InfiniBand network (1GB/s). The block size varied between 512
bytes and 1 GB, but was always a power of two.

Expected results:
* The measurement results are consistent with the numbers I published earlier.
* During data transfers all data is transferred in blocks between 4 KB
and 32 KB in size (according to the SCST statistics).
* For small and medium block sizes (<= 32 KB) transfer times can be
modeled very well by the following formula: (transfer time) = (setup
latency) + (bytes transferred)/(bandwidth). The correlation numbers
are very close to one.
* The latency and bandwidth parameters depend on the test tool (dd
versus xdd), on the kind of test performed (reading versus writing),
on the SCSI target and on the communication protocol.
* When using RDMA (iSER or SRP), SCST has a lower latency and higher
bandwidth than STGT (results from linear regression for block sizes <=
32 KB):
   Test                    Latency (us)  Bandwidth (MB/s)  Correlation
   STGT+iSER, read, dd               64               560         0.95
   STGT+iSER, read, xdd              65               556         0.94
   STGT+iSER, write, dd              53               394         0.71
   STGT+iSER, write, xdd             54               445         0.59
   SCST+SRP, read, dd                39               657         0.83
   SCST+SRP, read, xdd               41               668         0.87
   SCST+SRP, write, dd               52               449         0.62
   SCST+SRP, write, xdd              52               516         0.77
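
The linear model used above can be written out as a small helper for
plugging in the table values. The function name is hypothetical, and it
assumes that 1 MB/s means 10^6 bytes per second, which the message does
not state.

```c
#include <assert.h>

/*
 * (transfer time) = (setup latency) + (bytes transferred) / (bandwidth)
 *
 * latency_us:     setup latency in microseconds
 * bytes:          number of bytes transferred
 * bandwidth_mb_s: bandwidth in MB/s, assuming 1 MB = 10^6 bytes
 *
 * Since 1 MB/s equals 1 byte per microsecond under that assumption,
 * bytes / bandwidth_mb_s is already a time in microseconds.
 */
double transfer_time_us(double latency_us, double bytes, double bandwidth_mb_s)
{
    assert(bandwidth_mb_s > 0.0);
    return latency_us + bytes / bandwidth_mb_s;
}
```

For a single 32 KB read with dd, the fitted parameters give roughly 122.5
microseconds for STGT+iSER (64 us, 560 MB/s) and roughly 88.9 microseconds
for SCST+SRP (39 us, 657 MB/s) under this model.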

Results that I did not expect:
* A block transfer size of 1 MB is not enough to measure the maximal
throughput. The maximal throughput is only reached at much higher
block sizes (about 10 MB for SCST + SRP and about 100 MB for STGT +
iSER).
* There is one case where dd and xdd results are inconsistent: when
reading via SCST + SRP and for block sizes of about 1 MB.
* For block sizes > 64 KB the measurements differ from the model. This
is probably because all initiator-target transfers happen in blocks of
32 KB or less.

For the details and some graphs, see also
http://software.qlayer.com/display/iSCSI/Measurements .

Bart Van Assche.


Re: 2.6.24 regression w/ QLA2300

2008-02-05 Thread Alan D. Brunelle
Andrew Vasquez wrote:
> On Tue, 05 Feb 2008, Alan D. Brunelle wrote:
> 
>> commit 9b73e76f3cf63379dcf45fcd4f112f5812418d0a
>> Merge: 50d9a12... 23c3e29...
>> Author: Linus Torvalds <[EMAIL PROTECTED]>
>> Date:   Fri Jan 25 17:19:08 2008 -0800
>>
>> Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
>>
>> * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: (200 
>> commits)
>>
>> I believe a regression was introduced. I'm running on a 4-way IA64,
>> with straight 2.6.24 and 2 dual-port cards:
>>
>> 40:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
>> 40:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
>> c0:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
>> c0:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
>>
>> the adapters failed initialization. In particular, I narrowed it down
>> to failing the qla2x00_mbox_command call within qla2x00_init_firmware
>> function. I went and removed the qla2x00-related parts of this (large-ish)
>> merge, and the 4 ports initialized just fine.
> 
> Could you load the (default 2.6.24) driver with
> ql2xextended_error_logging modules parameter set:
> 
>   # insmod qla2xxx ql2xextended_error_logging=1
> 
> and send the resultant kernel logs?

Here's the output to the console (if there are other logs you need, let me 
know). I'll try the patch next, and sorry, hadn't realized merges were still 
coming in under 2.6.24 in Linus' tree... 

QLogic Fibre Channel HBA Driver
ACPI: PCI Interrupt :40:01.0[A] -> GSI 38 (level, low) -> IRQ 58
qla2xxx :40:01.0: Found an ISP2312, irq 58, iobase 0xc000a0041000
qla2xxx :40:01.0: Configuring PCI space...
qla2x00_get_flash_version(): Unrecognized code type ff at pcids da1c.
qla2x00_get_flash_version(): Unrecognized code type ff at pcids 1f61c.
qla2xxx :40:01.0: Configure NVRAM parameters...
qla2xxx :40:01.0: Verifying loaded RISC code...
scsi(14):  Load RISC code 
scsi(14): Verifying Checksum of loaded RISC code.
scsi(14): Checksum OK, start firmware.
qla2xxx :40:01.0: Allocated (412 KB) for firmware dump...
scsi(14): Issue init firmware.
qla2x00_mailbox_command(14):  FAILED. mbx0=4001, mbx1=0, mbx2=ba8a, cmd=48 

qla2x00_init_firmware(14): failed=102 mb0=4001.
scsi(14): Init firmware  FAILED .
qla2xxx :40:01.0: Failed to initialize adapter
scsi(14): Failed to initialize adapter - Adapter flags 10.
ACPI: PCI Interrupt :40:01.1[B] -> GSI 39 (level, low) -> IRQ 59
qla2xxx :40:01.1: Found an ISP2312, irq 59, iobase 0xc000a004
qla2xxx :40:01.1: Configuring PCI space...
qla2x00_get_flash_version(): Unrecognized code type ff at pcids da1c.
qla2x00_get_flash_version(): Unrecognized code type ff at pcids 1f61c.
qla2xxx :40:01.1: Configure NVRAM parameters...
qla2xxx :40:01.1: Verifying loaded RISC code...
scsi(15):  Load RISC code 
scsi(15): Verifying Checksum of loaded RISC code.
scsi(15): Checksum OK, start firmware.
qla2xxx :40:01.1: Allocated (412 KB) for firmware dump...
scsi(15): Issue init firmware.
qla2x00_mailbox_command(15):  FAILED. mbx0=4001, mbx1=0, mbx2=bac6, cmd=48 

qla2x00_init_firmware(15): failed=102 mb0=4001.
scsi(15): Init firmware  FAILED .
qla2xxx :40:01.1: Failed to initialize adapter
scsi(15): Failed to initialize adapter - Adapter flags 10.
ACPI: PCI Interrupt :c0:01.0[A] -> GSI 71 (level, low) -> IRQ 60
qla2xxx :c0:01.0: Found an ISP2312, irq 60, iobase 0xc000e0041000
qla2xxx :c0:01.0: Configuring PCI space...
qla2x00_get_flash_version(): Unrecognized code type ff at pcids c61c.
qla2x00_get_flash_version(): Unrecognized code type ff at pcids 1da1c.
qla2xxx :c0:01.0: Configure NVRAM parameters...
qla2xxx :c0:01.0: Verifying loaded RISC code...
scsi(16):  Load RISC code 
scsi(16): Verifying Checksum of loaded RISC code.
scsi(16): Checksum OK, start firmware.
qla2xxx :c0:01.0: Allocated (412 KB) for firmware dump...
scsi(16): Issue init firmware.
qla2x00_mailbox_command(16):  FAILED. mbx0=4001, mbx1=0, mbx2=bae3, cmd=48 

qla2x00_init_firmware(16): failed=102 mb0=4001.
scsi(16): Init firmware  FAILED .
qla2xxx :c0:01.0: Failed to initialize adapter
scsi(16): Failed to initialize adapter - Adapter flags 10.
ACPI: PCI Interrupt :c0:01.1[B] -> GSI 72 (level, low) -> IRQ 61
qla2xxx :c0:01.1: Found an ISP2312, irq 61, iobase 0xc000e004
qla2xxx :c0:01.1: Configuring PCI space...
qla2x00_get_flash_version(): Unrecognized code type ff at pcids c61c.
qla2x00_get_flash_version(): Unrecognized code type ff at pcids 1da1c.
qla2xxx :c0:01.1: Configure NVRAM parameters...
qla2xxx :c0:01.1: Verifying loaded RISC code...
scsi(17):  Load RISC code 
scsi(17): Verifying Checksum of loaded RISC code.
scsi(17): Checksum OK, start firmware.
qla2xxx :c0:01.1: Allocated (412 KB) for firmware dump...
sc

Re: 2.6.24 regression w/ QLA2300

2008-02-05 Thread Alan D. Brunelle
Andrew Vasquez wrote:
> On Tue, 05 Feb 2008, Andrew Vasquez wrote:
> 
>> On Tue, 05 Feb 2008, Alan D. Brunelle wrote:
>>
>>> commit 9b73e76f3cf63379dcf45fcd4f112f5812418d0a
>>> Merge: 50d9a12... 23c3e29...
>>> Author: Linus Torvalds <[EMAIL PROTECTED]>
>>> Date:   Fri Jan 25 17:19:08 2008 -0800
>>>
>>> Merge git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6
>>>
>>> * git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6: 
>>> (200 commits)
>>>
>>> I believe a regression was introduced. I'm running on a 4-way IA64,
>>> with straight 2.6.24 and 2 dual-port cards:
>>>
>>> 40:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
>>> 40:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
>>> c0:01.0 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
>>> c0:01.1 Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 03)
>>>
>>> the adapters failed initialization. In particular, I narrowed it down
>>> to failing the qla2x00_mbox_command call within qla2x00_init_firmware
>>> function. I went and removed the qla2x00-related parts of this (large-ish)
>>> merge, and the 4 ports initialized just fine.
>> Could you load the (default 2.6.24) driver with the
>> ql2xextended_error_logging module parameter set:
>>
>>  # insmod qla2xxx ql2xextended_error_logging=1
>>
>> and send the resultant kernel logs?
> 
> Could you try the patch referenced here:
> 
> qla2xxx: Correct issue where incorrect init-fw mailbox command was used on 
> non-NPIV capable ISPs.
> http://article.gmane.org/gmane.linux.scsi/38240
> 
> Thanks, av

The referenced patch worked fine Andrew, thanks much! 

Alan
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 2.6.24 regression w/ QLA2300

2008-02-05 Thread Andrew Vasquez
On Tue, 05 Feb 2008, Alan D. Brunelle wrote:

> > and send the resultant kernel logs?
> 
> Here's the output to the console (if there are other logs you need,
> let me know). I'll try the patch next, and sorry, hadn't realized
> merges were still coming in under 2.6.24 in Linus' tree... 
> 
> QLogic Fibre Channel HBA Driver
> ACPI: PCI Interrupt :40:01.0[A] -> GSI 38 (level, low) -> IRQ 58
> qla2xxx :40:01.0: Found an ISP2312, irq 58, iobase 0xc000a0041000
> qla2xxx :40:01.0: Configuring PCI space...
> qla2x00_get_flash_version(): Unrecognized code type ff at pcids da1c.
> qla2x00_get_flash_version(): Unrecognized code type ff at pcids 1f61c.
> qla2xxx :40:01.0: Configure NVRAM parameters...
> qla2xxx :40:01.0: Verifying loaded RISC code...
> scsi(14):  Load RISC code 
> scsi(14): Verifying Checksum of loaded RISC code.
> scsi(14): Checksum OK, start firmware.
> qla2xxx :40:01.0: Allocated (412 KB) for firmware dump...
> scsi(14): Issue init firmware.
> qla2x00_mailbox_command(14):  FAILED. mbx0=4001, mbx1=0, mbx2=ba8a, 
> cmd=48 

Ok, this is what I would have expected with Linus' tree prior to
the fix.  I just double-checked, the fix in question has yet to make
its way to Linus' tree.  It's currently in scsi-misc-2.6:

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=a571fdf7caa010e17f6a70c0c52e0992e87af7db

which should filter up to linux-2.6.git during Linus' next pull.

thanks, av


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Tue, 05 Feb 2008 17:07:07 +0100
Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:

> FUJITA Tomonori schrieb:
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > 
> >> James Bottomley schrieb:
> >>
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >>
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > means. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI tcp connections?
> 
> The problem with tgtd is that you can't start it (configured) in an
> "atomic" way.
> Usually, one will start tgtd and its configuration in a script (I
> replaced some parameters with "..." to make it shorter and more readable):

Thanks for the details. So the way to stop the daemon is not related
to your problem.

It's easily fixable. Can you start a new thread about this on
stgt-devel mailing list? When we agree on the interface to start the
daemon, I'll implement it.


> tgtd
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...

(snip)

> So the only way to start/restart tgtd reliably is to do hacks which are 
> needed with yet another iSCSI kernel implementation (IET): use iptables.
> 
> iptables 
> tgtd
> sleep 1
> tgtadm --op new ...
> tgtadm --lld iscsi --op new ...
> iptables 
> 
> 
> A bit ugly, isn't it?
> Having to tinker with a firewall in order to start a daemon is by no 
> means a sign of a well-tested and mature project.
> 
> That's why I asked how many people use stgt in a production environment 
> - James was worried about a potential migration path for current users.

I don't know how many people use stgt in a production environment but
I'm not sure that this problem prevents many people from using it in a
production environment.

You want to reboot a server running target devices while initiators
connect to it. Rebooting the target server behind the initiators
seldom works. System administrators in my workplace reboot storage
devices once a year and tell us to shut down the initiator machines
that use them before that.


Re: [PATCH] bugfix for an underflow condition in usb storage & isd200.c

2008-02-05 Thread Boaz Harrosh
On Tue, Feb 05 2008 at 17:42 +0200, Alan Stern <[EMAIL PROTECTED]> wrote:
> On Tue, 5 Feb 2008, Boaz Harrosh wrote:
> 
>>> However the interface to usb_stor_access_xfer_buf() will have to change
>>> slightly.  Right now if it sees that *sgptr is NULL, it assumes this
>>> means it should start at the beginning of the s-g buffer.  But with 
>>> Boaz's change, *sgptr == NULL means the transfer has reached the end of 
>>> the buffer.  So I'll have to go through and audit all the callers.
>>>
>>> Alan Stern
>>>
>>> -
>> No it does not, this has not changed. Please look again.
> 
> You look again.  Your patched code goes like this:
> 
>   struct scatterlist *sg = *sgptr;
> 
>   if (!sg)
>   sg = (struct scatterlist *) srb->request_buffer;
> 
> Hence if *sgptr is NULL upon entry, it is taken to mean that the 
> transfer should start at the beginning of the s-g buffer.
> 
>   /* This loop handles a single s-g list entry, which may
>* include multiple pages.  Find the initial page structure
>* and the starting offset within the page, and update
>* the *offset and *index values for the next loop. */
>   cnt = 0;
>   while (cnt < buflen && sg) {
> 
> Hence if sg is NULL, it indicates the end of the buffer has been 
> reached.  And then down near the end of the routine:
> 
>   *sgptr = sg;
> 
> Hence if the end is reached and the caller makes another call to try 
> transferring more data, the additional data will get stored back at the 
> beginning of the buffer.
> 
That behavior did not change. In the likely event of the sg length matching
buflen, the last call to sg_next will return NULL, which is then returned
in *sgptr. The end condition for an outside caller is either the sum of
returned counts reaching some target count, or *sgptr coming back as NULL.
The code before the sg change would have checked *indexptr >= some_sg_count,
but now we do not have an index, we have a pointer, and the termination
condition is *sgptr == NULL.

So I guess you are afraid that calling code that was converted from index
to pointer was converted wrongly, and that where something checked
*indexptr >= some_sg_count before, it does not check *sgptr == NULL now.

So I guess, yes you are welcome to check. I did not do the conversion so
I can not comment.
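
A minimal user-space sketch of the convention under discussion, not the
usb-storage code itself: struct seg and access_xfer_buf() are made-up names
standing in for the scatterlist and usb_stor_access_xfer_buf(). It assumes a
NULL cursor means "start at the head" on entry and "chain exhausted" on
return, which is the reading described above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one s-g entry; NOT the kernel's struct scatterlist. */
struct seg {
	unsigned char *data;
	size_t len;
	struct seg *next;	/* NULL on the last entry */
};

/*
 * Store up to buflen bytes from buf into the chain starting at head.
 *   *cursor == NULL on entry  -> start at head
 *   *cursor == NULL on return -> the chain has been exhausted
 * *offset is the position inside the current entry.
 * Returns the number of bytes actually stored; never writes past the chain.
 */
static size_t access_xfer_buf(struct seg *head, const unsigned char *buf,
			      size_t buflen, struct seg **cursor,
			      size_t *offset)
{
	struct seg *sg = *cursor;
	size_t cnt = 0;

	assert(head && buf && cursor && offset);

	if (!sg)
		sg = head;

	while (cnt < buflen && sg) {
		size_t room = sg->len - *offset;
		size_t n = buflen - cnt < room ? buflen - cnt : room;

		memcpy(sg->data + *offset, buf + cnt, n);
		cnt += n;
		*offset += n;
		if (*offset == sg->len) {	/* entry is full: advance */
			sg = sg->next;
			*offset = 0;
		}
	}

	*cursor = sg;
	return cnt;
}
```

With two 4-byte entries, one 8-byte call fills both and leaves the cursor
NULL. A further call with that same cursor starts again at the first entry,
so in this model a caller has to stop on the NULL cursor (or on its own byte
count) rather than call again.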

>> Note that this patch was tested and working. It is a bug
>> in v2.6.24 and it should be accepted already. One way or
>> the other.
>>
>> Callers of usb_stor_access_xfer_buf() need not change.
>> Matthew Dharm should decide if he wants the WARN_ON in 
>> usb_stor_set_xfer_buf() or not and be done with it.
>>
>> I have found and fixed the bug, but it is not a SCSI
>> related bug, and it is not due to any scsi changes. It
>> is a bug from the SG changes of early 2.6.24. Please
>> take it through the USB tree. Feel free to change it
>> the way you like it, and submit it.
> 
> I will post a new version of this which handles all these issues.  
> Expect it in a day or so.
> 

Please do. Thanks, that would be better.
Don't forget to also submit a patch for current head-of-line. It's exactly
the same fix but has diff conflicts with surrounding code.

> Alan Stern
> 

Boaz


Re: (fwd) Bug#11922: I/O error on blank tapes

2008-02-05 Thread Kai Makisara
On Mon, 4 Feb 2008, James Bottomley wrote:

> 
> On Mon, 2008-02-04 at 22:28 +0100, Borislav Petkov wrote:
> > On Mon, Feb 04, 2008 at 03:22:06PM +0100, maximilian attems wrote:
> > 
> > (Added Bart to CC)
> > 
> > > hello borislav,
> > > 
> > > may i forward you that *old* Debian kernel bug,
> > > have seen you working on ide-tape:
> > > http://bugs.debian.org/11922
> > > no we don't carry any ide patches anymore.
> > > 
> > > maybe you've already fixed it in latest?
> > > 
> > > thanks
> > > 
> > > -- 
> > > maks
> > > 
> > > - Forwarded message from Stephen Kitt <[EMAIL PROTECTED]> -
> > > 
> > > Subject: Bug#11922: I/O error on blank tapes
> > > Date: Sat, 1 Dec 2007 19:06:18 +0100
> > > From: Stephen Kitt <[EMAIL PROTECTED]>
> > > To: [EMAIL PROTECTED]
> > > 
> > > Hi,
> > > 
> > > This does still occur with 2.6.22; with a blank tape in my HP DDS-4 drive:
> > > 
> > > $ tar tzvf /dev/nst0
> > > tar: /dev/nst0: Cannot read: Input/output error
> 
> That's a SCSI tape, not an IDE one.  I cc'd the SCSI list
> 
This is not a bug, it is a feature. There is _nothing_ on the tape and if 
you try to read something, you get an error. The same thing applies to 
reading after the last filemark. Note that after writing a filemark at the 
beginning of the tape, the situation is different. Now there is a file and 
the normal EOF semantics apply although there still is no data.
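
A rough user-space sketch of how a reader could tell these cases apart.
This is only an illustration: classify_tape_read() and the enum names are
invented here, and it assumes the blank-tape case shows up as read()
returning -1 with errno set to EIO (as in the tar output quoted above),
while a filemark shows up as read() returning 0.

```c
#include <errno.h>
#include <sys/types.h>

/* Hypothetical classification of one read() call on a tape device. */
enum tape_read_result {
	TAPE_DATA,		/* read() returned a block of data */
	TAPE_FILEMARK,		/* read() returned 0: normal EOF at a filemark */
	TAPE_NOTHING_RECORDED,	/* -1/EIO: blank tape or past the last filemark */
	TAPE_OTHER_ERROR	/* any other failure */
};

/* nread is the return value of read(), err the errno value saved after it. */
static enum tape_read_result classify_tape_read(ssize_t nread, int err)
{
	if (nread > 0)
		return TAPE_DATA;
	if (nread == 0)
		return TAPE_FILEMARK;
	if (err == EIO)
		return TAPE_NOTHING_RECORDED;
	return TAPE_OTHER_ERROR;
}
```

Note that EIO is not specific to blank media, so a caller using something
like this still cannot distinguish "blank" from a genuine I/O failure by
the error code alone.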

I admit that the error return could be more descriptive but the st driver 
tries to be compatible with other Unices.

The behavior can be changed if Linux does not match other Unices. I don't 
remember if I have tested just this with other Unices. I will try to test 
this with Tru64 tomorrow. If anyone has data on other Unices, it would be 
helpful.

-- 
Kai


Re: [PATCH 21/24][RFC] scsi_tgt: use of sense accessors

2008-02-05 Thread FUJITA Tomonori
On Tue, 5 Feb 2008 11:21:33 -0500
Pete Wyckoff <[EMAIL PROTECTED]> wrote:

> [EMAIL PROTECTED] wrote on Mon, 04 Feb 2008 19:53 +0200:
> >   FIXME: I need help with this driver (Pete?)
> > I used scsi_sense() in a non-const way. But since
> > scsi_tgt is the ULD here, it can just access its own sense
> > buffer directly. I did not use scsi_eh_cpy_sense() because
> > I did not want the extra copy. Pete will want to use a 260
> > bytes buffer here.
> > 
> > Signed-off-by: Boaz Harrosh <[EMAIL PROTECTED]>
> > Need-help-from: Pete Wyckoff <[EMAIL PROTECTED]>
> 
> FYI, I never use scsi_tgt.  Only just pure userspace on the target,
> and a dumb ethernet NIC that does not know it is speaking any form
> of SCSI.

Seems that many people misunderstand STGT iSCSI (and iSER), FCoE, and
SRP (not implemented yet) software target drivers. They don't use the
tgt kernel module. They just run in user space like user-space nfs
daemon.


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Erez Zilber
Bart Van Assche wrote:
> On Jan 30, 2008 12:32 AM, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
>   
>> iSER has parameters to limit the maximum size of RDMA (it needs to
>> repeat RDMA with a poor configuration)?
>> 
>
> Please specify which parameters you are referring to. As you know I
> had already repeated my tests with ridiculously high values for the
> following iSER parameters: FirstBurstLength, MaxBurstLength and
> MaxRecvDataSegmentLength (16 MB, which is more than the 1 MB block
> size specified to dd).
>
>   
Using such large values for FirstBurstLength will give you poor
performance numbers for WRITE commands (with iSER). FirstBurstLength
specifies how much data is sent as unsolicited data (i.e. without
RDMA). It means that your WRITE commands were sent without RDMA.
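
As a back-of-the-envelope sketch of that split (split_write() and struct
write_split are names made up for this illustration, and it assumes the
negotiated settings allow unsolicited data at all, ignoring InitialR2T and
ImmediateData):

```c
#include <stdint.h>

/* Hypothetical helper: how one WRITE is divided under a given FirstBurstLength. */
struct write_split {
	uint64_t unsolicited;	/* sent up front, without RDMA */
	uint64_t solicited;	/* remainder, fetched by the target (RDMA with iSER) */
};

static struct write_split split_write(uint64_t xfer_len,
				      uint64_t first_burst_length)
{
	struct write_split s;

	/* Unsolicited data is capped by FirstBurstLength and by the command size. */
	s.unsolicited = xfer_len < first_burst_length ? xfer_len
						      : first_burst_length;
	s.solicited = xfer_len - s.unsolicited;
	return s;
}
```

With the values from the test above (1 MB dd block size, 16 MB
FirstBurstLength) the whole 1 MB ends up in the unsolicited part and nothing
is left for RDMA; with a FirstBurstLength of, say, 64 KB, all but the first
64 KB of each WRITE would go through RDMA.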

Erez


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Erez Zilber
Bart Van Assche wrote:
> As you probably know there is a trend in enterprise computing towards
> networked storage. This is illustrated by the emergence during the
> past few years of standards like SRP (SCSI RDMA Protocol), iSCSI
> (Internet SCSI) and iSER (iSCSI Extensions for RDMA). Two different
> pieces of software are necessary to make networked storage possible:
> initiator software and target software. As far as I know there exist
> three different SCSI target implementations for Linux:
> - The iSCSI Enterprise Target Daemon (IETD,
> http://iscsitarget.sourceforge.net/);
> - The Linux SCSI Target Framework (STGT, http://stgt.berlios.de/);
> - The Generic SCSI Target Middle Level for Linux project (SCST,
> http://scst.sourceforge.net/).
> Since I was wondering which SCSI target software would be best suited
> for an InfiniBand network, I started evaluating the STGT and SCST SCSI
> target implementations. Apparently the performance difference between
> STGT and SCST is small on 100 Mbit/s and 1 Gbit/s Ethernet networks,
> but the SCST target software outperforms the STGT software on an
> InfiniBand network. See also the following thread for the details:
> http://sourceforge.net/mailarchive/forum.php?thread_name=e2e108260801170127w2937b2afg9bef324efa945e43%40mail.gmail.com&forum_name=scst-devel.
>
>   
Sorry for the late response (but better late than never).

One may claim that STGT should have lower performance than SCST because
its data path is from userspace. However, your results show that for
non-IB transports, they both show the same numbers. Furthermore, with IB
there shouldn't be any additional difference between the 2 targets
because data transfer from userspace is as efficient as data transfer
from kernel space.

The only explanation that I see is that fine tuning for iSCSI & iSER is
required. As was already mentioned in this thread, with SDR you can get
~900 MB/sec with iSER (on STGT).

Erez


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Matteo Tescione
On 5-02-2008 14:38, "FUJITA Tomonori" <[EMAIL PROTECTED]> wrote:

> On Tue, 05 Feb 2008 08:14:01 +0100
> Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> 
>> James Bottomley schrieb:
>> 
>>> These are both features being independently worked on, are they not?
>>> Even if they weren't, the combination of the size of SCST in kernel plus
>>> the problem of having to find a migration path for the current STGT
>>> users still looks to me to involve the greater amount of work.
>> 
>> I don't want to be mean, but does anyone actually use STGT in
>> production? Seriously?
>> 
>> In the latest development version of STGT, it's only possible to stop
>> the tgtd target daemon using KILL / 9 signal - which also means all
>> iSCSI initiator connections are corrupted when tgtd target daemon is
>> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> 
> I don't know what "iSCSI initiator connections are corrupted"
> means. But if you reboot a server, how can an iSCSI target
> implementation keep iSCSI tcp connections?
> 
> 
>> Imagine you have to reboot all your NFS clients when you reboot your NFS
>> server. Not only that - your data is probably corrupted, or at least the
>> filesystem deserves checking...

Don't know if it matters, but in my setup (iscsi on top of drbd+heartbeat)
rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
manages stop/crash, by sending unit attention to clients on reconnect.
Drbd+heartbeat correctly manages those things too.
Still from an end-user POV, i was able to reboot/survive a crash only with
SCST, IETD still has reconnect problems and STGT is even worse.

Regards,
--matteo




aic7xxx build failure

2008-02-05 Thread Adrian Bunk
Commit 8891fec65ac5b5a74b50c705e31b66c92c3eddeb broke aic7xxx 
compilation:

<--  snip  -->

$ make O=../out/x86-full
...
  SHIPPED drivers/scsi/aic7xxx/aic79xx_seq.h
  SHIPPED drivers/scsi/aic7xxx/aic79xx_reg.h
  CC  drivers/scsi/aic7xxx/aic79xx_core.o
gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
gcc: no input files
make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1

<--  snip  -->

Next "make" run brings the same failure in 
drivers/scsi/aic7xxx/aic7xxx_core.c.

With the third "make" it works.

It might compile for people with SMP systems using -j?

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH] bugfix for an underflow condition in usb storage & isd200.c

2008-02-05 Thread Matthew Dharm
On Mon, Feb 04, 2008 at 03:05:58PM -0500, Alan Stern wrote:
> On Sun, 3 Feb 2008, Matthew Dharm wrote:
> 
> I think the correct approach is to modify those routines so that they 
> will never overrun the s-g buffer (like Boaz has done), and _document_ 
> this behavior.  Then the callers can feel free to try and transfer as 
> much as they want, knowing that an overrun can't occur.  There won't 
> be any need for a WARN_ON or anything else.

Six of one and a half-dozen of the other.  All we're arguing over is the
definition of "correct behavior" here.  You want to change the API so that
overrun is acceptable and handled; I prefer calling it a Bad Thing(tm).

We both agree that the code shouldn't run off the end of the s-g list.

Since you've already committed to updating the patch, then we can do it
your way.  Just make sure it's very very clear in the comments.

Matt

-- 
Matthew Dharm  Home: [EMAIL PROTECTED] 
Maintainer, Linux USB Mass Storage Driver

E:  You run this ship with Windows?!  YOU IDIOT!
L:  Give me a break, it came bundled with the computer!
-- ESR and Lan Solaris
User Friendly, 12/8/1998




Re: [PATCH 21/24][RFC] scsi_tgt: use of sense accessors

2008-02-05 Thread Jeff Garzik

FUJITA Tomonori wrote:
> On Tue, 5 Feb 2008 11:21:33 -0500
> Pete Wyckoff <[EMAIL PROTECTED]> wrote:
>
>> [EMAIL PROTECTED] wrote on Mon, 04 Feb 2008 19:53 +0200:
>>>   FIXME: I need help with this driver (Pete?)
>>> I used scsi_sense() in a non-const way. But since
>>> scsi_tgt is the ULD here, it can just access its own sense
>>> buffer directly. I did not use scsi_eh_cpy_sense() because
>>> I did not want the extra copy. Pete will want to use a 260
>>> bytes buffer here.
>>>
>>> Signed-off-by: Boaz Harrosh <[EMAIL PROTECTED]>
>>> Need-help-from: Pete Wyckoff <[EMAIL PROTECTED]>
>>
>> FYI, I never use scsi_tgt.  Only just pure userspace on the target,
>> and a dumb ethernet NIC that does not know it is speaking any form
>> of SCSI.
>
> Seems that many people misunderstand STGT iSCSI (and iSER), FCoE, and
> SRP (not implemented yet) software target drivers. They don't use the
> tgt kernel module. They just run in user space like user-space nfs
> daemon.


FWIW, some AHCI and other SATA chips implement ATA target mode.  I'm 
watching this SCSI work with interest, hoping that many of the concepts 
(and code?) can be applied to SATA as well.


If for no other reason than I can build a cheap ATA protocol analyzer, 
or bridge.


Jeff





Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Jeff Garzik

Bart Van Assche wrote:
> On Feb 4, 2008 11:57 PM, Jeff Garzik <[EMAIL PROTECTED]> wrote:
>
>> Networked block devices are attractive because the concepts and
>> implementation are more simple than networked filesystems... but usually
>> you want to run some sort of filesystem on top.  At that point you might
>> as well run NFS or [gfs|ocfs|flavor-of-the-week], and ditch your
>> networked block device (and associated complexity).
>
> Running a filesystem on top of iSCSI results in better performance
> than NFS, especially if the NFS client conforms to the NFS standard
> (=synchronous writes).
> By searching the web for the keywords NFS, iSCSI and
> performance I found the following (6 years old) document:
> http://www.technomagesinc.com/papers/ip_paper.html. A quote from the
> conclusion:
> Our results, generated by running some of industry standard benchmarks,
> show that iSCSI significantly outperforms NFS for situations when
> performing streaming, database like accesses and small file transactions.


async performs better than sync...  this is news?  Furthermore, NFSv4 
has not only async capability but delegation too (and RDMA if you like 
such things), so the comparison is not relevant to modern times.


But a networked filesystem (note I'm using that term, not "NFS", from 
here on) is simply far more useful to the average user.  A networked 
block device is a building block -- and a useful one.  A networked 
filesystem is an immediately usable solution.


For remotely accessing data, iSCSI+fs is quite simply more overhead than
a networked fs.  With iSCSI you are doing

    local VFS -> local blkdev -> network

whereas a networked filesystem is

    local VFS -> network

iSCSI+fs also adds new manageability issues, because unless the 
filesystem is single-computer (such as diskless iSCSI root fs), you 
still need to go across the network _once again_ to handle filesystem 
locking and coordination issues.


There is no _fundamental_ reason why remote shared storage via iSCSI OSD 
 is any faster than a networked filesystem.



SCSI-over-IP has its uses.  Absolutely.  It needed to be standardized. 
But let's not pretend iSCSI is anything more than what it is.  It's a
bloated cat5 cabling standard :)


Jeff





Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Jeff Garzik

Olivier Galibert wrote:
> On Mon, Feb 04, 2008 at 05:57:47PM -0500, Jeff Garzik wrote:
>> iSCSI and NBD were passe ideas at birth.  :)
>>
>> Networked block devices are attractive because the concepts and
>> implementation are more simple than networked filesystems... but usually
>> you want to run some sort of filesystem on top.  At that point you might
>> as well run NFS or [gfs|ocfs|flavor-of-the-week], and ditch your
>> networked block device (and associated complexity).
>
> Call me a sysadmin, but I find it easier to plug in and keep in place an
> ethernet cable than these parallel scsi cables from hell.  Every
> server has at least two ethernet ports by default, with rarely any
> surprises at the kernel level.  Adding ethernet cards is inexpensive,
> and you pretty much never hear of compatibility problems between
> cards.
>
> So ethernet as a connection medium is really nice compared to scsi.
> Too bad iscsi is demented and ATAoE/NBD nonexistent.  Maybe external
> SAS will be nice, but I don't see it getting to the level of
> universality of ethernet any time soon.  And it won't get the same
> amount of user-level compatibility testing in any case.


Indeed, at the end of the day iSCSI is a bloated cabling standard.  :)

It has its uses, but I don't see it as ever coming close to replacing 
direct-to-network (perhaps backed with local cachefs) filesystems... 
which is how all the hype comes across to me.


Cheap "Lintel" boxes everybody is familiar with _are_ the storage 
appliances.  Until mass-produced ATA and SCSI devices start shipping 
with ethernet connectors, anyway.


Jeff





Re: aic7xxx build failure

2008-02-05 Thread James Bottomley
On Tue, 2008-02-05 at 19:40 +0200, Adrian Bunk wrote:
> Commit 8891fec65ac5b5a74b50c705e31b66c92c3eddeb broke aic7xxx 
> compilation:
> 
> <--  snip  -->
> 
> $ make O=../out/x86-full
> ...
>   SHIPPED drivers/scsi/aic7xxx/aic79xx_seq.h
>   SHIPPED drivers/scsi/aic7xxx/aic79xx_reg.h
>   CC  drivers/scsi/aic7xxx/aic79xx_core.o
> gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
> gcc: no input files
> make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1
> 
> <--  snip  -->
> 
> Next "make" run brings the same failure in 
> drivers/scsi/aic7xxx/aic7xxx_core.c.
> 
> With the third "make" it works.
> 
> It might compile for people with SMP systems using -j?

I'd just say "weird behaviour"  the file being complained about is
definitely part of the tree ... does it actually exist in your tree when
gcc claims it doesn't? if so, I suspect some type of make path screwup
here.  The commit in question is this:

commit 8891fec65ac5b5a74b50c705e31b66c92c3eddeb
Author: Sam Ravnborg <[EMAIL PROTECTED]>
Date:   Sun Feb 3 21:55:49 2008 +0100

scsi: fix dependency bug in aic7 Makefile

Building the aic7xxx driver includes the copy
of an .h file from a _shipped file.

In a highly parallel build Ingo saw that the
build sometimes failed (included distcc usage).
It was tracked down to a missing dependency from the .c
source file to the generated .h file.
We started to build the .c file before the
copy (cat) operation of the .h file completed
and we then only got half of the definitions
from the copied .h file.

Add an explicit dependency from the .c files to the
generated .h files so make knows all dependencies and
finsih the build of the .h files before it starts
building the .o files.

Ingo tested this fix and reported:
good news: hundreds of successful kernel builds and no failures
overnight.

Signed-off-by: Sam Ravnborg <[EMAIL PROTECTED]>
Acked-by: Ingo Molnar <[EMAIL PROTECTED]>
Acked-by: James Bottomley <[EMAIL PROTECTED]>

diff --git a/drivers/scsi/aic7xxx/Makefile b/drivers/scsi/aic7xxx/Makefile
index e4f70c5..4c54954 100644
--- a/drivers/scsi/aic7xxx/Makefile
+++ b/drivers/scsi/aic7xxx/Makefile
@@ -44,13 +44,8 @@ clean-files += aic79xx_seq.h aic79xx_reg.h aic79xx_reg_print.c
 
 # Dependencies for generated files need to be listed explicitly
 
-$(obj)/aic7xxx_core.o: $(obj)/aic7xxx_seq.h
-$(obj)/aic7xxx_core.o: $(obj)/aic7xxx_reg.h
-$(obj)/aic79xx_core.o: $(obj)/aic79xx_seq.h
-$(obj)/aic79xx_core.o: $(obj)/aic79xx_reg.h
-
-$(addprefix $(obj)/,$(aic7xxx-y)): $(obj)/aic7xxx_seq.h
-$(addprefix $(obj)/,$(aic79xx-y)): $(obj)/aic79xx_seq.h
+$(addprefix $(src)/,$(aic7xxx-y:.o=.c)): $(obj)/aic7xxx_seq.h $(obj)/aic7xxx_reg.h
+$(addprefix $(src)/,$(aic79xx-y:.o=.c)): $(obj)/aic79xx_seq.h $(obj)/aic79xx_reg.h
 
 aic7xxx-gen-$(CONFIG_AIC7XXX_BUILD_FIRMWARE)   := $(obj)/aic7xxx_reg.h
 aic7xxx-gen-$(CONFIG_AIC7XXX_REG_PRETTY_PRINT) += $(obj)/aic7xxx_reg_print.c

The last two additions look wrong:  they make source files depend on
headers, which isn't right: it's object files that depend on headers (we
don't know how to rebuild the .c files because they're not auto
generated).  However, the commit log indicates the cause might be
deeper.

James




Re: aic7xxx build failure

2008-02-05 Thread James Bottomley

On Tue, 2008-02-05 at 19:40 +0200, Adrian Bunk wrote:
> Commit 8891fec65ac5b5a74b50c705e31b66c92c3eddeb broke aic7xxx 
> compilation:
> 
> <--  snip  -->
> 
> $ make O=../out/x86-full
> ...
>   SHIPPED drivers/scsi/aic7xxx/aic79xx_seq.h
>   SHIPPED drivers/scsi/aic7xxx/aic79xx_reg.h
>   CC  drivers/scsi/aic7xxx/aic79xx_core.o
> gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
> gcc: no input files
> make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1

Could you run this with V=1 to get us a verbose output of what the exact
files gcc is failing on are?

Thanks,

James




Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Linus Torvalds


On Tue, 5 Feb 2008, Bart Van Assche wrote:
> 
> Results that I did not expect:
> * A block transfer size of 1 MB is not enough to measure the maximal
> throughput. The maximal throughput is only reached at much higher
> block sizes (about 10 MB for SCST + SRP and about 100 MB for STGT +
> iSER).

Block transfer sizes over about 64kB are totally irrelevant for 99% of all 
people.

Don't even bother testing anything more. Yes, bigger transfers happen, but 
a lot of common loads have *smaller* transfers than 64kB.

So benchmarks that try to find "theoretical throughput" by just making big 
transfers should just be banned. They give numbers, yes, but the numbers 
are pointless.

Linus


Re: aic7xxx build failure

2008-02-05 Thread Adrian Bunk
On Tue, Feb 05, 2008 at 12:18:04PM -0600, James Bottomley wrote:
> 
> On Tue, 2008-02-05 at 19:40 +0200, Adrian Bunk wrote:
> > Commit 8891fec65ac5b5a74b50c705e31b66c92c3eddeb broke aic7xxx 
> > compilation:
> > 
> > <--  snip  -->
> > 
> > $ make O=../out/x86-full
> > ...
> >   SHIPPED drivers/scsi/aic7xxx/aic79xx_seq.h
> >   SHIPPED drivers/scsi/aic7xxx/aic79xx_reg.h
> >   CC  drivers/scsi/aic7xxx/aic79xx_core.o
> > gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
> > gcc: no input files
> > make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1
> 
> Could you run this with V=1 to get us a verbose output of what the exact
> files gcc is failing on are?

make -f /home/bunk/linux/kernel-2.6/git/linux-2.6/scripts/Makefile.build 
obj=drivers/scsi
make -f /home/bunk/linux/kernel-2.6/git/linux-2.6/scripts/Makefile.build 
obj=drivers/scsi/aacraid
(cat /dev/null; ) > drivers/scsi/aacraid/modules.order
make -f /home/bunk/linux/kernel-2.6/git/linux-2.6/scripts/Makefile.build 
obj=drivers/scsi/aic7xxx
  cat 
/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi/aic7xxx/aic79xx_seq.h_shipped
 > drivers/scsi/aic7xxx/aic79xx_seq.h
  cat 
/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi/aic7xxx/aic79xx_reg.h_shipped
 > drivers/scsi/aic7xxx/aic79xx_reg.h
  gcc -Wp,-MD,drivers/scsi/aic7xxx/.aic79xx_core.o.d  -nostdinc -isystem 
/usr/lib/gcc/i486-linux-gnu/4.2.3/include -D__KERNEL__ -Iinclude -Iinclude2 
-I/home/bunk/linux/kernel-2.6/git/linux-2.6/include -include 
include/linux/autoconf.h 
-I/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi/aic7xxx 
-Idrivers/scsi/aic7xxx -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs 
-fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os -m32 
-msoft-float -mregparm=3 -freg-struct-return -mpreferred-stack-boundary=2 
-march=athlon -ffreestanding -DCONFIG_AS_CFI=1 -DCONFIG_AS_CFI_SIGNAL_FRAME=1 
-pipe -Wno-sign-compare -fno-asynchronous-unwind-tables -mno-sse -mno-mmx 
-mno-sse2 -mno-3dnow 
-I/home/bunk/linux/kernel-2.6/git/linux-2.6/include/asm-x86/mach-generic 
-Iinclude/asm-x86/mach-generic 
-I/home/bunk/linux/kernel-2.6/git/linux-2.6/include/asm-x86/mach-default 
-Iinclude/asm-x86/mach-default -fno-omit-frame-pointer 
-fno-optimize-sibling-calls -fno-stack-protector -Wdeclaration-after-statement 
-Wno-pointer-sign -I/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi 
-Idrivers/scsi  -D"KBUILD_STR(s)=#s" 
-D"KBUILD_BASENAME=KBUILD_STR(aic79xx_core)"  
-D"KBUILD_MODNAME=KBUILD_STR(aic79xx)" -c -o 
drivers/scsi/aic7xxx/aic79xx_core.o drivers/scsi/aic7xxx/aic79xx_core.c
gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
gcc: no input files
make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1
make[3]: *** [drivers/scsi/aic7xxx] Error 2
make[2]: *** [drivers/scsi] Error 2
make[1]: *** [drivers] Error 2
make: *** [sub-make] Error 2


> Thanks,
> 
> James

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: aic7xxx build failure

2008-02-05 Thread James Bottomley

On Tue, 2008-02-05 at 20:24 +0200, Adrian Bunk wrote:
> On Tue, Feb 05, 2008 at 12:18:04PM -0600, James Bottomley wrote:
> > 
> > On Tue, 2008-02-05 at 19:40 +0200, Adrian Bunk wrote:
> > > Commit 8891fec65ac5b5a74b50c705e31b66c92c3eddeb broke aic7xxx 
> > > compilation:
> > > 
> > > <--  snip  -->
> > > 
> > > $ make O=../out/x86-full
> > > ...
> > >   SHIPPED drivers/scsi/aic7xxx/aic79xx_seq.h
> > >   SHIPPED drivers/scsi/aic7xxx/aic79xx_reg.h
> > >   CC  drivers/scsi/aic7xxx/aic79xx_core.o
> > > gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
> > > gcc: no input files
> > > make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1
> > 
> > Could you run this with V=1 to get us a verbose output of what the exact
> > files gcc is failing on are?
> 
> make -f /home/bunk/linux/kernel-2.6/git/linux-2.6/scripts/Makefile.build 
> obj=drivers/scsi
> make -f /home/bunk/linux/kernel-2.6/git/linux-2.6/scripts/Makefile.build 
> obj=drivers/scsi/aacraid
> (cat /dev/null; ) > drivers/scsi/aacraid/modules.order
> make -f /home/bunk/linux/kernel-2.6/git/linux-2.6/scripts/Makefile.build 
> obj=drivers/scsi/aic7xxx
>   cat 
> /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi/aic7xxx/aic79xx_seq.h_shipped
>  > drivers/scsi/aic7xxx/aic79xx_seq.h
>   cat 
> /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi/aic7xxx/aic79xx_reg.h_shipped
>  > drivers/scsi/aic7xxx/aic79xx_reg.h
>   gcc -Wp,-MD,drivers/scsi/aic7xxx/.aic79xx_core.o.d  -nostdinc -isystem 
> /usr/lib/gcc/i486-linux-gnu/4.2.3/include -D__KERNEL__ -Iinclude -Iinclude2 
> -I/home/bunk/linux/kernel-2.6/git/linux-2.6/include -include 
> include/linux/autoconf.h 
> -I/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi/aic7xxx 
> -Idrivers/scsi/aic7xxx -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs 
> -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os 
> -m32 -msoft-float -mregparm=3 -freg-struct-return 
> -mpreferred-stack-boundary=2 -march=athlon -ffreestanding -DCONFIG_AS_CFI=1 
> -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-sign-compare 
> -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow 
> -I/home/bunk/linux/kernel-2.6/git/linux-2.6/include/asm-x86/mach-generic 
> -Iinclude/asm-x86/mach-generic 
> -I/home/bunk/linux/kernel-2.6/git/linux-2.6/include/asm-x86/mach-default 
> -Iinclude/asm-x86/mach-default -fno-omit-frame-pointer 
> -fno-optimize-sibling-calls -fno-stack-protector 
> -Wdeclaration-after-statement -Wno-pointer-sign 
> -I/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi -Idrivers/scsi  
> -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(aic79xx_core)"  
> -D"KBUILD_MODNAME=KBUILD_STR(aic79xx)" -c -o 
> drivers/scsi/aic7xxx/aic79xx_core.o drivers/scsi/aic7xxx/aic79xx_core.c
> gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
> gcc: no input files
> make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1
> make[3]: *** [drivers/scsi/aic7xxx] Error 2
> make[2]: *** [drivers/scsi] Error 2
> make[1]: *** [drivers] Error 2
> make: *** [sub-make] Error 2

Do I assume from this that you have different source and object
directories?  There shouldn't be a failure if this is building
in /home/bunk/linux/kernel-2.6/git/linux-2.6/ because the source file
should be there.

James




Re: aic7xxx build failure

2008-02-05 Thread Adrian Bunk
On Tue, Feb 05, 2008 at 12:30:56PM -0600, James Bottomley wrote:
> 
> On Tue, 2008-02-05 at 20:24 +0200, Adrian Bunk wrote:
> > On Tue, Feb 05, 2008 at 12:18:04PM -0600, James Bottomley wrote:
> > > 
> > > On Tue, 2008-02-05 at 19:40 +0200, Adrian Bunk wrote:
> > > > Commit 8891fec65ac5b5a74b50c705e31b66c92c3eddeb broke aic7xxx 
> > > > compilation:
> > > > 
> > > > <--  snip  -->
> > > > 
> > > > $ make O=../out/x86-full
> > > > ...
> > > >   SHIPPED drivers/scsi/aic7xxx/aic79xx_seq.h
> > > >   SHIPPED drivers/scsi/aic7xxx/aic79xx_reg.h
> > > >   CC  drivers/scsi/aic7xxx/aic79xx_core.o
> > > > gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
> > > > gcc: no input files
> > > > make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1
> > > 
> > > Could you run this with V=1 to get us a verbose output of what the exact
> > > files gcc is failing on are?
> > 
> > make -f /home/bunk/linux/kernel-2.6/git/linux-2.6/scripts/Makefile.build 
> > obj=drivers/scsi
> > make -f /home/bunk/linux/kernel-2.6/git/linux-2.6/scripts/Makefile.build 
> > obj=drivers/scsi/aacraid
> > (cat /dev/null; ) > drivers/scsi/aacraid/modules.order
> > make -f /home/bunk/linux/kernel-2.6/git/linux-2.6/scripts/Makefile.build 
> > obj=drivers/scsi/aic7xxx
> >   cat 
> > /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi/aic7xxx/aic79xx_seq.h_shipped
> >  > drivers/scsi/aic7xxx/aic79xx_seq.h
> >   cat 
> > /home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi/aic7xxx/aic79xx_reg.h_shipped
> >  > drivers/scsi/aic7xxx/aic79xx_reg.h
> >   gcc -Wp,-MD,drivers/scsi/aic7xxx/.aic79xx_core.o.d  -nostdinc -isystem 
> > /usr/lib/gcc/i486-linux-gnu/4.2.3/include -D__KERNEL__ -Iinclude -Iinclude2 
> > -I/home/bunk/linux/kernel-2.6/git/linux-2.6/include -include 
> > include/linux/autoconf.h 
> > -I/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi/aic7xxx 
> > -Idrivers/scsi/aic7xxx -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs 
> > -fno-strict-aliasing -fno-common -Werror-implicit-function-declaration -Os 
> > -m32 -msoft-float -mregparm=3 -freg-struct-return 
> > -mpreferred-stack-boundary=2 -march=athlon -ffreestanding -DCONFIG_AS_CFI=1 
> > -DCONFIG_AS_CFI_SIGNAL_FRAME=1 -pipe -Wno-sign-compare 
> > -fno-asynchronous-unwind-tables -mno-sse -mno-mmx -mno-sse2 -mno-3dnow 
> > -I/home/bunk/linux/kernel-2.6/git/linux-2.6/include/asm-x86/mach-generic 
> > -Iinclude/asm-x86/mach-generic 
> > -I/home/bunk/linux/kernel-2.6/git/linux-2.6/include/asm-x86/mach-default 
> > -Iinclude/asm-x86/mach-default -fno-omit-frame-pointer 
> > -fno-optimize-sibling-calls -fno-stack-protector 
> > -Wdeclaration-after-statement -Wno-pointer-sign 
> > -I/home/bunk/linux/kernel-2.6/git/linux-2.6/drivers/scsi -Idrivers/scsi  
> > -D"KBUILD_STR(s)=#s" -D"KBUILD_BASENAME=KBUILD_STR(aic79xx_core)"  
> > -D"KBUILD_MODNAME=KBUILD_STR(aic79xx)" -c -o 
> > drivers/scsi/aic7xxx/aic79xx_core.o drivers/scsi/aic7xxx/aic79xx_core.c
> > gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
> > gcc: no input files
> > make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1
> > make[3]: *** [drivers/scsi/aic7xxx] Error 2
> > make[2]: *** [drivers/scsi] Error 2
> > make[1]: *** [drivers] Error 2
> > make: *** [sub-make] Error 2
> 
> Do I assume from this that you have different source and object
> directories?

Yes, as I wrote in my bug report:
  make O=../out/x86-full

> There shouldn't be a failure if this is building
> in /home/bunk/linux/kernel-2.6/git/linux-2.6/ because the source file
> should be there.
> 
> James

cu
Adrian




Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread James Bottomley
This email somehow didn't manage to make it to the list (I suspect
because it had html attachments).

James

---

From: Julian Satran <[EMAIL PROTECTED]>
To: Nicholas A. Bellinger <[EMAIL PROTECTED]>
Cc: Andrew Morton <[EMAIL PROTECTED]>, Alan Cox <[EMAIL PROTECTED]>,
    Bart Van Assche <[EMAIL PROTECTED]>, FUJITA Tomonori <[EMAIL PROTECTED]>,
    James Bottomley <[EMAIL PROTECTED]>, ...
Subject: Re: Integration of SCST in the mainstream Linux kernel
Date: Mon, 4 Feb 2008 21:31:48 -0500 (20:31 CST)


Well stated. In fact the "layers" above ethernet do provide the services 
that make the TCP/IP stack compelling - a whole complement of services.
ALL services required (naming, addressing, discovery, security etc.) will 
have to be recreated if you take the FCoE route. That makes good business 
for some but not necessarily for the users. Those services BTW are not on 
the data path and are not "overhead".
The TCP/IP stack pathlength is decently low. What makes most 
implementations poor is that they were naively extended in the SMP world. 
Recent implementations (published) from IBM and Intel show excellent 
performance (4-6 times the regular stack). Unfortunately I do not have 
latency numbers (as the community's major stress has been throughput) but I 
assume that RDMA (not necessarily hardware RDMA) and/or the use of 
infiniband for latency critical applications - within clusters - may be the 
ultimate low latency solution. Ethernet has some inherent latency issues 
(the bridges) that are inherited by anything on ethernet (FCoE included). 
The IP protocol stack is not inherently slow but some implementations are 
somewhat sluggish.
But instead of replacing them with new and half-baked contraptions we 
would all be better off improving what we have and understand.

In the whole debate around FCoE I heard a single argument that may have 
some merit - building converters iSCSI-FCP to support legacy islands of 
FCP (read storage products that do not support iSCSI natively) is 
expensive. It is correct technically - only that FCoE eliminates an 
expense at the wrong end of the wire - it reduces the cost of the storage 
box at the expense of added cost at the server (and usually there are many 
servers using a storage box). FCoE vendors are also bound to provide FCP 
like services for FCoE - naming, security, discovery etc. - that do not 
exist on Ethernet. It is a good business for FCoE vendors - a duplicate 
set of solutions for users.

It should be apparent by now that if one speaks about a "converged" 
network we should speak about an IP network and not about Ethernet.
If we take this route we might perhaps also get to "infrastructure 
physical variants" that support very low latency better than ethernet and 
we might be able to use them with the same "stack" - a definite forward 
looking solution.

IMHO it is foolish to insist on throwing away the whole stack whenever we 
make a slight improvement in the physical layer of the network. We have a 
substantial investment and body of knowledge in the protocol stack and 
nothing proposed improves on it - obviously neither in its total level of 
service nor in performance.

Julo



Re: aic7xxx build failure

2008-02-05 Thread Arjan van de Ven
On Tue, 05 Feb 2008 12:30:56 -0600
James Bottomley wrote:

> > make: *** [sub-make] Error 2
> 
> Do I assume from this that you have different source and object
> directories?  There shouldn't be a failure if this is building
> in /home/bunk/linux/kernel-2.6/git/linux-2.6/ because the source file
> should be there.
> 
> James

time to run a fsck as well just to rule stuff out?

-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org


Re: aic7xxx build failure

2008-02-05 Thread Sam Ravnborg
On Tue, Feb 05, 2008 at 07:40:24PM +0200, Adrian Bunk wrote:
> Commit 8891fec65ac5b5a74b50c705e31b66c92c3eddeb broke aic7xxx 
> compilation:
> 
> <--  snip  -->
> 
> $ make O=../out/x86-full
> ...
>   SHIPPED drivers/scsi/aic7xxx/aic79xx_seq.h
>   SHIPPED drivers/scsi/aic7xxx/aic79xx_reg.h
>   CC  drivers/scsi/aic7xxx/aic79xx_core.o
> gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
> gcc: no input files
> make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1
> 
> <--  snip  -->
> 
> Next "make" run brings the same failure in 
> drivers/scsi/aic7xxx/aic7xxx_core.c.
> 
> With the third "make" it works.
> 
> It might compile for people with SMP systems using -j?

I can reproduce it and will fix it.

Sam


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Vladislav Bolkhovitin

Erez Zilber wrote:
> Bart Van Assche wrote:
> > As you probably know there is a trend in enterprise computing towards
> > networked storage. This is illustrated by the emergence during the
> > past few years of standards like SRP (SCSI RDMA Protocol), iSCSI
> > (Internet SCSI) and iSER (iSCSI Extensions for RDMA). Two different
> > pieces of software are necessary to make networked storage possible:
> > initiator software and target software. As far as I know there exist
> > three different SCSI target implementations for Linux:
> > - The iSCSI Enterprise Target Daemon (IETD,
> > http://iscsitarget.sourceforge.net/);
> > - The Linux SCSI Target Framework (STGT, http://stgt.berlios.de/);
> > - The Generic SCSI Target Middle Level for Linux project (SCST,
> > http://scst.sourceforge.net/).
> > Since I was wondering which SCSI target software would be best suited
> > for an InfiniBand network, I started evaluating the STGT and SCST SCSI
> > target implementations. Apparently the performance difference between
> > STGT and SCST is small on 100 Mbit/s and 1 Gbit/s Ethernet networks,
> > but the SCST target software outperforms the STGT software on an
> > InfiniBand network. See also the following thread for the details:
> > http://sourceforge.net/mailarchive/forum.php?thread_name=e2e108260801170127w2937b2afg9bef324efa945e43%40mail.gmail.com&forum_name=scst-devel.
> 
> Sorry for the late response (but better late than never).
> 
> One may claim that STGT should have lower performance than SCST because
> its data path is from userspace. However, your results show that for
> non-IB transports, they both show the same numbers. Furthermore, with IB
> there shouldn't be any additional difference between the 2 targets
> because data transfer from userspace is as efficient as data transfer
> from kernel space.

And now consider if one target has zero-copy cached I/O. How much will 
that improve its performance?

> The only explanation that I see is that fine tuning for iSCSI & iSER is
> required. As was already mentioned in this thread, with SDR you can get
> ~900 MB/sec with iSER (on STGT).
> 
> Erez



Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Jeff Garzik

Vladislav Bolkhovitin wrote:
> Jeff Garzik wrote:
> > iSCSI is way, way too complicated. 
> 
> I fully agree. From one side, all that complexity is unavoidable for 
> case of multiple connections per session, but for the regular case of 
> one connection per session it must be a lot simpler.

Actually, think about those multiple connections...  we already had to 
implement fast-failover (and load bal) SCSI multi-pathing at a higher 
level.  IMO that portion of the protocol is redundant:   You need the 
same capability elsewhere in the OS _anyway_, if you are to support 
multi-pathing.

Jeff





Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread James Bottomley
On Tue, 2008-02-05 at 21:59 +0300, Vladislav Bolkhovitin wrote:
> >>Hmm, how can one write to an mmaped page and don't touch it?
> > 
> > I meant from user space ... the writes are done inside the kernel.
> 
> Sure, the mmap() approach agreed to be unpractical, but could you 
> elaborate more on this anyway, please? I'm just curious. Do you think 
> about implementing a new syscall, which would put pages with data in the 
> mmap'ed area?

No, it has to do with the way invalidation occurs.  When you mmap a
region from a device or file, the kernel places page translations for
that region into your vm_area.  The regions themselves aren't backed
until faulted.  For write (i.e. incoming command to target) you specify
the write flag and send the area off to receive the data.  The gather,
expecting the pages to be overwritten, backs them with pages marked
dirty but doesn't fault in the contents (unless it already exists in the
page cache).  The kernel writes the data to the pages and the dirty
pages go back to the user.  msync() flushes them to the device.
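
A minimal user-space sketch of the mmap()/msync() sequence described above, for illustration only. It assumes a regular file stands in for the backing device; the helper name `write_via_mmap` and any path passed to it are hypothetical and do not come from STGT. Note one deliberate difference: here user space fills the pages with memcpy(), whereas in the scheme described the kernel writes the incoming data into them.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: store len bytes of data at offset 0 of the file
 * at path through a shared, writable mapping, then flush with msync().
 * Returns 0 on success, -1 on error. len must be non-zero. */
int write_via_mmap(const char *path, const void *data, size_t len)
{
    assert(path != NULL && data != NULL && len > 0);

    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0)
        return -1;

    /* A mapping must not be touched past end of file, so size it first. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return -1;
    }

    /* PROT_WRITE with MAP_SHARED: stores dirty the file's cached pages. */
    void *map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    /* Stand-in for the kernel placing the incoming data in the pages. */
    memcpy(map, data, len);

    /* Synchronously flush the dirty pages to the backing store. */
    int rc = msync(map, len, MS_SYNC);

    munmap(map, len);
    close(fd);
    return rc;
}
```

The mapping is the only handle on the I/O here, which is the point made in the next paragraph: the address range lives in a process that has no interest in the data itself.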

The disadvantage of all this is that the handle for the I/O if you will
is a virtual address in a user process that doesn't actually care to see
the data. non-x86 architectures will do flushes/invalidates on this
address space as the I/O occurs.


> > However, as Linus has pointed out, this discussion is getting a bit off
> > topic. 
> 
> No, that isn't off topic. We've just proved that there is no good way to 
> implement zero-copy cached I/O for STGT. I see the only practical way 
> for that, proposed by FUJITA Tomonori some time ago: duplicating Linux 
> page cache in the user space. But will you like it?

Well, there's no real evidence that zero copy or lack of it is a problem
yet.

James




Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Vladislav Bolkhovitin

Jeff Garzik wrote:
> > > iSCSI is way, way too complicated. 
> > 
> > I fully agree. From one side, all that complexity is unavoidable for 
> > case of multiple connections per session, but for the regular case of 
> > one connection per session it must be a lot simpler.
> 
> Actually, think about those multiple connections...  we already had to 
> implement fast-failover (and load bal) SCSI multi-pathing at a higher 
> level.  IMO that portion of the protocol is redundant:   You need the 
> same capability elsewhere in the OS _anyway_, if you are to support 
> multi-pathing.

I think of MC/S as a way to improve performance using 
several physical links. There's no other way, except MC/S, to preserve 
command processing order in that case. So it's a really valuable 
property of iSCSI, although with a limited application.
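
To illustrate the ordering property mentioned above: iSCSI numbers commands per session (CmdSN), not per connection, so a target can restore the issue order even when commands arrive over several links. The sketch below is only an illustration under that assumption; the struct, the function names and the window size are hypothetical and are not taken from SCST, STGT or any real target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define WINDOW 8u  /* hypothetical reorder window; a power of two */

struct session {
    uint32_t exp_cmd_sn;       /* next CmdSN to hand to the SCSI layer */
    bool     present[WINDOW];  /* slot filled? indexed by CmdSN % WINDOW */
    int      conn_of[WINDOW];  /* connection each command arrived on */
};

void session_init(struct session *s, uint32_t first_cmd_sn)
{
    s->exp_cmd_sn = first_cmd_sn;
    for (size_t i = 0; i < WINDOW; i++)
        s->present[i] = false;
}

/* Record a command received on connection conn. Returns false if its
 * CmdSN lies outside [exp_cmd_sn, exp_cmd_sn + WINDOW). */
bool session_receive(struct session *s, uint32_t cmd_sn, int conn)
{
    /* Unsigned subtraction keeps the check valid across wraparound. */
    if (cmd_sn - s->exp_cmd_sn >= WINDOW)
        return false;
    s->present[cmd_sn % WINDOW] = true;
    s->conn_of[cmd_sn % WINDOW] = conn;
    return true;
}

/* Deliver every command that is now in order. Stores the delivered
 * CmdSNs in out (capacity cap) and returns how many were delivered. */
size_t session_deliver(struct session *s, uint32_t *out, size_t cap)
{
    assert(out != NULL);
    size_t n = 0;
    while (n < cap && s->present[s->exp_cmd_sn % WINDOW]) {
        s->present[s->exp_cmd_sn % WINDOW] = false;
        out[n++] = s->exp_cmd_sn++;
    }
    return n;
}
```

If CmdSN 101 arrives on one connection before CmdSN 100 arrives on another, nothing is delivered until 100 shows up, after which both are released in order. Independent multipath sessions have no shared counter of this kind.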


Vlad


Re: [PATCH] enclosure: add support for enclosure services

2008-02-05 Thread Luben Tuikov
--- On Tue, 2/5/08, James Bottomley <[EMAIL PROTECTED]> wrote:
> > > > I guess the same could be said for STGT and SCST, right?
> > > 
> > > You mean both of their kernel pieces are modular?  That's correct.
> > 
> > No, you know very well what I mean.
> > 
> > By the same logic you're preaching to include your
> > solution part of the kernel, you can also apply to
> > SCST.
> 
> Ah, but it's not ... the current patch is merely exporting an interface.
> The debate in STGT vs SCST is not whether to export an interface but
> where to draw the line.

"draw the line" -- I see.
BTW, what is wrong with "exporting the interface"?

What is wrong if both implementations are in the kernel
and then let the users and distros decide which one
they like best and use more?  It'll not be the first time
this has happened in the kernel.  Both are actively
maintained.

It seems highly arbitrary to say: "X is in the kernel, Y
is not. If you want Y, just forget about it and fix X."
Give people choice at config time.

This is off topic anyway.

> You could also argue in the same vein that sd is redundant because a
> filesystem could talk directly to the device via /dev/sgX (in fact OSD
> based filesystems already do this).

Yes, I've mentioned this thing before on this list.  Oh, maybe 3
years ago.  This is why I had wanted for transport protocols
to export ... (oh, let's not get this off topic).

(Apparently it takes 3 years...)

> The argument is true, but misses
> the bigger picture that the interfaces exported by sd are more portable
> (apply to non-SCSI block devices) and easier to use.

It isn't quite the same thing.  It's like comparing
apples to oranges.

> 
> > > > Yes, for which the transport layer, implements the
> > > > scsi device node for the SES device.  It doesn't really
> > > > matter if the SCSI commands sent to the SES device go
> > > > over SGPIO or FC or SAS or Bluetooth or I2C, etc, the
> > > > transport layer can implement that and present the
> > > > /dev/sgX node.
> > > 
> > > But it does matter if the enclosure device doesn't speak SCSI.
> > 
> > Enclosure management isn't as simple as you're
> > portraying it here.  The enclosure management
> > device speaks either SES or SAF-TE.  The transport
> > protocol to access it could be SGPIO or I2C or...
> 
> Look, just read the spec; SGPIO is a bus for driving enclosures ...

I thought Serial General Purpose Input Output
(SGPIO) was a method to serialize general purpose
IO signals.

> it doesn't require SES or SAF-TE or even any SCSI protocol.

That's true.  And this is why I mentioned a couple
of emails ago to simply export a sgpio device node *IF*
this is what is needed.  Of course devices that use SGPIO
abstract it away for their functional purpose, e.g.
enclosures, LED, etc, and provide a more general way to
control it -- highly hardware specific on one side.

Your abstraction currently deals with "SES" devices
and I'd rather leave that to user-space.  Alternatively,
which I presume is what you're thinking, a HW specific
core would be using your "abstraction" to provide
some unified access to raw features, and that "unified
access" isn't defined anywhere, and would likely not
be.  Alternatively that "unified access" is things
like SES and SAF-TE, which is what vendors prefer
to export, or they prefer to drive this directly
via other means.

That is, I fail to see the kernel bloat, for things
that aren't necessary in the kernel.

If you want your abstraction to fly, it first needs
a common usage model to abstract, and the latter is
missing _from the kernel_.

Unless I don't know the details and you've been
asked to implement this for a single vendor's HW solution.

> > >  SGPIO isn't a SCSI protocol ... it's a general purpose
> > > serial bus protocol.
> > > It's pretty simple and register based, but it might (or might not) be
> > > accessible via a SCSI bridge.
> > 
> > I see.  You've just discovered SGPIO -- good for you.
> > 
> > At any rate, I told you already that what is needed
> > is not what you've provided but a _device node_
> > exported by the kernel, either a processor or
> > enclosure type.
> 
> Wrong ... we don't export non-SCSI devices as SCSI (with the single and
> rather annoying exception of ATA via SAT).

I didn't say you should do that.  I had already
mentioned that vendors export such controls
as either enclosure or processor type devices,
and this is why I told you that that is what
needs to be exported, which incidentally is
a device node of that type.

Without a common usage model already in the kernel
to abstract (e.g. sd for block device, since you brought
that up) your abstraction seems redundant and arbitrary.

Your kernel code already uses READ DIAGNOSTIC, etc,
and I'd rather leave that to user-space.

   Luben


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Vladislav Bolkhovitin

Jeff Garzik wrote:
> Alan Cox wrote:
> > > better. So for example, I personally suspect that ATA-over-ethernet is way 
> > > better than some crazy SCSI-over-TCP crap, but I'm biased for simple and 
> > > low-level, and against those crazy SCSI people to begin with.
> > 
> > Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
> > would probably trash iSCSI for latency if nothing else.
> 
> AoE is truly a thing of beauty.  It has a two/three page RFC (say no more!).
> 
> But quite so...  AoE is limited to MTU size, which really hurts.  Can't 
> really do tagged queueing, etc.
> 
> iSCSI is way, way too complicated. 

I fully agree. From one side, all that complexity is unavoidable for 
case of multiple connections per session, but for the regular case of 
one connection per session it must be a lot simpler.

And now think about iSER, which brings iSCSI to a whole new level of 
complexity ;)

> It's an Internet protocol designed 
> by storage designers, what do you expect?
> 
> For years I have been hoping that someone will invent a simple protocol 
> (w/ strong auth) that can transit ATA and SCSI commands and responses. 
> Heck, it would be almost trivial if the kernel had a TLS/SSL implementation.
> 
> Jeff
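
As a rough illustration of the AoE MTU limit mentioned in the quoted text above, the sketch below computes how many whole 512-byte sectors fit in a single frame. The header sizes (a 10-byte AoE header plus a 12-byte ATA command header after the Ethernet header) are assumptions based on my reading of the AoE specification, and the function name is made up for this example.

```c
#include <assert.h>

enum {
    AOE_HDR_BYTES = 10,  /* assumed: AoE common header after Ethernet header */
    ATA_HDR_BYTES = 12,  /* assumed: AoE ATA command header */
    SECTOR_BYTES  = 512
};

/* Whole 512-byte sectors that fit in one frame for a given MTU.
 * The MTU counts the Ethernet payload only, not the Ethernet header. */
int aoe_sectors_per_frame(int mtu)
{
    assert(mtu >= 0);
    int room = mtu - AOE_HDR_BYTES - ATA_HDR_BYTES;
    return room > 0 ? room / SECTOR_BYTES : 0;
}
```

Under these assumptions a standard 1500-byte MTU leaves room for 2 sectors (1 KiB) per request and a 9000-byte jumbo frame for 17, since a single AoE request cannot span frames.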



Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Vladislav Bolkhovitin

Linus Torvalds wrote:
> > I'd assumed the move was primarily because of the difficulty of getting
> > correct semantics on a shared filesystem
> 
> .. not even shared. It was hard to get correct semantics full stop. 
> 
> Which is a traditional problem. The thing is, the kernel always has some 
> internal state, and it's hard to expose all the semantics that the kernel 
> knows about to user space.
> 
> So no, performance is not the only reason to move to kernel space. It can 
> easily be things like needing direct access to internal data queues (for a 
> iSCSI target, this could be things like barriers or just tagged commands - 
> yes, you can probably emulate things like that without access to the 
> actual IO queues, but are you sure the semantics will be entirely right?
> 
> The kernel/userland boundary is not just a performance boundary, it's an 
> abstraction boundary too, and these kinds of protocols tend to break 
> abstractions. NFS broke it by having "file handles" (which is not 
> something that really exists in user space, and is almost impossible to 
> emulate correctly), and I bet the same thing happens when emulating a SCSI 
> target in user space.

Yes, there is something like that for SCSI target as well. It's a "local 
initiator" or "local nexus", see 
http://thread.gmane.org/gmane.linux.scsi/31288 and 
http://news.gmane.org/find-root.php?message_id=%3c463F36AC.3010207%40vlnb.net%3e 
for more info about that.

In fact, the existence of a local nexus is one more reason why SCST is 
better than STGT: for STGT it's pretty hard to support (all locally 
generated commands would have to be passed through its daemon, which 
would be a total disaster for performance), while for SCST it can be 
done relatively simply.


Vlad


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Vladislav Bolkhovitin

James Bottomley wrote:

On Mon, 2008-02-04 at 21:38 +0300, Vladislav Bolkhovitin wrote:


James Bottomley wrote:


On Mon, 2008-02-04 at 20:56 +0300, Vladislav Bolkhovitin wrote:



James Bottomley wrote:



On Mon, 2008-02-04 at 20:16 +0300, Vladislav Bolkhovitin wrote:




James Bottomley wrote:



So, James, what is your opinion on the above? Or does the overall 
simplicity of the SCSI target project not matter much to you, and you 
think it's fine to duplicate the Linux page cache in user space to keep 
the in-kernel part of the project as small as possible?



The answers were pretty much contained here

http://marc.info/?l=linux-scsi&m=120164008302435

and here:

http://marc.info/?l=linux-scsi&m=120171067107293

Weren't they?


No, sorry, it doesn't look that way to me. They are about performance, 
but I'm asking about the overall project's architecture, namely about 
one part of it: simplicity. In particular, what do you think about 
duplicating the Linux page cache in user space to have zero-copy cached 
I/O? Or can you suggest another architectural solution for that problem 
in the STGT approach?



Isn't that an advantage of a user space solution?  It simply uses the
backing store of whatever device supplies the data.  That means it takes
advantage of the existing mechanisms for caching.


No, please reread this thread, especially this message: 
http://marc.info/?l=linux-kernel&m=120169189504361&w=2. This is one of 
the advantages of the kernel space implementation. The user space 
implementation has to have data copied between the cache and user space 
buffer, but the kernel space one can use pages in the cache directly, 
without extra copy.



Well, you've said it thrice (the bellman cried) but that doesn't make it
true.

The way a user space solution should work is to schedule mmapped I/O
from the backing store and then send this mmapped region off for target
I/O.  For reads, the page gather will ensure that the pages are up to
date from the backing store to the cache before sending the I/O out.
For writes, you actually have to do a msync on the region to get the
data secured to the backing store.
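
To make the write side of this concrete, here is a minimal user-space sketch in C, assuming a plain file as the backing store. The helper name write_via_mmap() is made up for illustration and is not taken from STGT or any other target. It only shows the mmap()/msync() sequence and where a write-back failure would surface; note that the memcpy() touches the pages from user space, which is exactly what the "avoid touching it" trick discussed later in this thread tries to prevent.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write len bytes to the start of the file at path
 * through a shared mapping, then msync() so that a write-back failure is
 * reported to the caller instead of being lost.
 * The file is resized to exactly len bytes.
 * Returns 0 on success, -1 on any error.
 */
static int write_via_mmap(const char *path, const void *buf, size_t len)
{
	int fd, rc = -1;
	void *map;

	assert(path != NULL && buf != NULL && len > 0);

	fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0)
		goto out_close;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;

	/* Touching the mapping here faults the pages in from the file. */
	memcpy(map, buf, len);

	/* MS_SYNC blocks until the data has been written back. */
	if (msync(map, len, MS_SYNC) == 0)
		rc = 0;

	munmap(map, len);
out_close:
	close(fd);
	return rc;
}
```

A caller would map this return value onto the SCSI status it reports for the write command; read errors on a mapping are a different matter, since they arrive as a signal rather than a return code.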


James, have you checked how fast mmapped I/O is if work size > size of 
RAM? It's several times slower compared to buffered I/O. This has been 
discussed many times on LKML and, it seems, VM people consider it unavoidable. 



Erm, but if you're using the case of work size > size of RAM, you'll
find buffered I/O won't help because you don't have the memory for
buffers either.


James, just check and you will see, buffered I/O is a lot faster.


So in an out of memory situation the buffers you don't have are a lot
faster than the pages I don't have?


There is no OOM in either case. Page reclamation/readahead just works 
much better in the buffered case.


So, using mmapped I/O isn't an option for high performance. Plus, mmapped 
I/O isn't an option for high reliability requirements, since it doesn't 
provide a practical way to handle I/O errors.


I think you'll find it does ... the page gather returns -EFAULT if
there's an I/O error in the gathered region. 


Err, returns to whom? If you try to read from an mmapped page which can't 
be populated due to an I/O error, you will get SIGBUS or SIGSEGV, I don't 
remember exactly. It's quite tricky to get back to the faulted command 
from the signal handler.


Or do you mean mmap(MAP_POPULATE)/munmap() for each command? Do you 
think that such mapping/unmapping is good for performance?




msync does something similar if there's a write failure.


You also have to pull tricks with the mmap region in the case of writes
to prevent useless data being read in from the backing store.


Can you be more exact and specify what kind of tricks should be done for 
that?


Actually, just avoid touching it seems to do the trick with a recent
kernel.


Hmm, how can one write to an mmapped page without touching it?


I meant from user space ... the writes are done inside the kernel.


Sure, we have agreed that the mmap() approach is impractical, but could 
you elaborate more on this anyway, please? I'm just curious. Are you 
thinking of implementing a new syscall which would put pages with data 
into the mmap'ed area?



However, as Linus has pointed out, this discussion is getting a bit off
topic. 


No, that isn't off topic. We've just proved that there is no good way to 
implement zero-copy cached I/O for STGT. I see only one practical way to 
do that, proposed by FUJITA Tomonori some time ago: duplicating the Linux 
page cache in user space. But would you like it?



There's no actual evidence that copy problems are causing any
performance issues for STGT.  In fact, there's evidence that
they're not for everything except IB networks.


The zero-copy cached I/O has not yet been implemented in SCST; I simply 
have not had time for that so far. Currently SCST performs better than STGT 
because of a simpler processing path and fewer context switches per 
command. Memcpy() speed on modern systems is about t

Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Vladislav Bolkhovitin

Linus Torvalds wrote:
So just going by what has happened in the past, I'd assume that iSCSI 
would eventually turn into "connecting/authentication in user space" with 
"data transfers in kernel space".


This is exactly how iSCSI-SCST (iSCSI target driver for SCST) is 
implemented, credits to IET and Ardis target developers.


Vlad


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Bart Van Assche
On Feb 5, 2008 6:10 PM, Erez Zilber <[EMAIL PROTECTED]> wrote:
> One may claim that STGT should have lower performance than SCST because
> its data path is from userspace. However, your results show that for
> non-IB transports, they both show the same numbers. Furthermore, with IB
> there shouldn't be any additional difference between the 2 targets
> because data transfer from userspace is as efficient as data transfer
> from kernel space.
>
> The only explanation that I see is that fine tuning for iSCSI & iSER is
> required. As was already mentioned in this thread, with SDR you can get
> ~900 MB/sec with iSER (on STGT).

My most recent measurements also show that one can get 900 MB/s with
STGT + iSER on an SDR IB network, but only for very large block sizes
(>= 100 MB). A quote from Linus Torvalds is relevant here (February 5,
2008):

Block transfer sizes over about 64kB are totally irrelevant for
99% of all people.

Please read my e-mail (posted earlier today) with a comparison for 4
KB - 64 KB block transfer sizes between SCST and STGT.

Bart Van Assche.


Re: new scsi sense handling

2008-02-05 Thread Luben Tuikov
--- On Tue, 2/5/08, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> On Mon, 4 Feb 2008 18:39:22 -0800 (PST)
> Luben Tuikov <[EMAIL PROTECTED]> wrote:
> 
> > --- On Mon, 2/4/08, Boaz Harrosh <[EMAIL PROTECTED]> wrote:
> > > There are 3 usages of sense handling in drivers
> > > 
> > > 1. sense is available in driver internal structure and is
> > >    mem-copied to upper level
> > > 2. A CHECK_CONDITION status was returned and the driver uses the
> > >    scsi_eh_prep_cmnd() for a REQUEST_SENSE invocation to the target.
> > >    Then returning the sense in scsi_eh_return_cmnd(). A variation on
> > >    this is when the driver does nothing, the queue is frozen and the
> > >    scsi watchdog timer does the above.
> > > 3. The underlying host adapter does the REQUEST_SENSE and a
> > >    pre-allocated and DMA mapped sense buffer receives the sense
> > >    information from HW.
> > 
> > Many years ago when "ACA" had a constructive meaning,
> > so did "Autosense".  Then about 5 years ago, "Autosense"
> > disappeared completely since it became the de facto
> > implementation of the then SCSI Execute Command "RPC",
> > now just SCSI Execute Command procedure call.
> > 
> > At that point in time, the SCSI mid-layer decided
> > to embrace this model and give the LLDD a scsi command
> > structure which included the sense data buffer to
> > a size that the SCSI mid-layer was interested in,
> > at the moment 96 bytes, macro defined in
> > include/scsi/scsi_cmnd.h.
> > 
> > The concept of "Autosense" was off-loaded to LLDD
> > to emulate it if the specific target device to
> > which the command was issued, didn't supply the
> > sense data on CHECK CONDITION, and more so
> > relevant to target devices which implemented
> > queuing, thus the ACA.
> > 
> > And the mid-layer would consider extracting
> > the sense data via REQUEST SENSE command
> > as a _special case_ if the LLDD/transport layer
> > didn't implement the "autosense" model.
> 
> Only SPI and USB?

I don't understand this question.

> 
> Most of the LLDs using the transport protocols that we care about today
> use a sense buffer in their own internal structure.

Yes.

> 
> I think that the issue to solve to kill scsi_cmnd:sense_buffer is how
> to share (or export) such sense buffer with the scsi mid-layer.

And therein lies the problem.  Sense data is SCSI specific,
it should be left to SCSI, unless of course you can
stipulate that _all_ block devices return sense data.
If that's not the case and you move it to the block
layer, then you get a whole bunch of other problems,
like does this device want/use it, should we allocate
it, etc. OTOH, if that _is_ the case, then you don't
have to worry about this and the model is pretty
much as the SCSI mid-layer has it, i.e. sense buffer
always present.  So I guess the question is, can
you stipulate that _all_ block devices return sense data?
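
To illustrate the first usage from Boaz's list (sense kept in a driver-internal structure and mem-copied to the upper level), here is a small stand-alone sketch in C. It is not kernel code: struct demo_cmd and copy_sense() are made-up names, and only the 96-byte size comes from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 96 bytes, the size the mid-layer currently reserves per command. */
#define DEMO_SENSE_BUFFERSIZE 96

/* Hypothetical stand-in for a command that always carries a sense buffer. */
struct demo_cmd {
	unsigned char sense_buffer[DEMO_SENSE_BUFFERSIZE];
};

/*
 * Copy sense data from a driver-internal buffer into the command,
 * truncating to the space the upper level provides.
 * Returns the number of bytes actually copied.
 */
static size_t copy_sense(struct demo_cmd *cmd,
			 const unsigned char *hw_sense, size_t hw_len)
{
	size_t n = hw_len < DEMO_SENSE_BUFFERSIZE ? hw_len : DEMO_SENSE_BUFFERSIZE;

	assert(cmd != NULL && hw_sense != NULL);

	/* Clear stale data so a short copy never exposes an older sense. */
	memset(cmd->sense_buffer, 0, DEMO_SENSE_BUFFERSIZE);
	memcpy(cmd->sense_buffer, hw_sense, n);
	return n;
}
```

The point of the sketch is only that the buffer is always present in the command; whether that can be stipulated for all block devices is the open question.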

Luben



Re: aic7xxx build failure

2008-02-05 Thread Sam Ravnborg
On Tue, Feb 05, 2008 at 07:47:35PM +0100, Sam Ravnborg wrote:
> On Tue, Feb 05, 2008 at 07:40:24PM +0200, Adrian Bunk wrote:
> > Commit 8891fec65ac5b5a74b50c705e31b66c92c3eddeb broke aic7xxx 
> > compilation:
> > 
> > <--  snip  -->
> > 
> > $ make O=../out/x86-full
> > ...
> >   SHIPPED drivers/scsi/aic7xxx/aic79xx_seq.h
> >   SHIPPED drivers/scsi/aic7xxx/aic79xx_reg.h
> >   CC  drivers/scsi/aic7xxx/aic79xx_core.o
> > gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
> > gcc: no input files
> > make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1
> > 
> > <--  snip  -->
> > 
> > Next "make" run brings the same failure in 
> > drivers/scsi/aic7xxx/aic7xxx_core.c.
> > 
> > With the third "make" it works.
> > 
> > It might compile for people with SMP systems using -j?
> 
> I can reproduce it and will fix it.
Seems I was sidetracked by some wrong assumptions.
Could you please test this fix.

Works for me but this time I will do more testing

Sam

diff --git a/drivers/scsi/aic7xxx/Makefile b/drivers/scsi/aic7xxx/Makefile
index 4c54954..6aa49e7 100644
--- a/drivers/scsi/aic7xxx/Makefile
+++ b/drivers/scsi/aic7xxx/Makefile
@@ -44,8 +44,8 @@ clean-files += aic79xx_seq.h aic79xx_reg.h aic79xx_reg_print.c
 
 # Dependencies for generated files need to be listed explicitly
 
-$(addprefix $(src)/,$(aic7xxx-y:.o=.c)): $(obj)/aic7xxx_seq.h $(obj)/aic7xxx_reg.h
-$(addprefix $(src)/,$(aic79xx-y:.o=.c)): $(obj)/aic79xx_seq.h $(obj)/aic79xx_reg.h
+$(addprefix $(src)/,$(aic7xxx-y)): $(obj)/aic7xxx_seq.h $(obj)/aic7xxx_reg.h
+$(addprefix $(src)/,$(aic79xx-y)): $(obj)/aic79xx_seq.h $(obj)/aic79xx_reg.h
 
 aic7xxx-gen-$(CONFIG_AIC7XXX_BUILD_FIRMWARE)   := $(obj)/aic7xxx_reg.h
 aic7xxx-gen-$(CONFIG_AIC7XXX_REG_PRETTY_PRINT) += $(obj)/aic7xxx_reg_print.c



Re: [PATCH 7/9] scsi_dh: Add support for SDEV_PASSIVE

2008-02-05 Thread Mike Christie

James Bottomley wrote:

On Mon, 2008-02-04 at 12:15 -0800, Chandra Seetharaman wrote:

On Mon, 2008-02-04 at 12:58 -0600, James Bottomley wrote:

On Wed, 2008-01-23 at 16:32 -0800, Chandra Seetharaman wrote:

Subject: scsi_dh: Add support for SDEV_PASSIVE

From: Chandra Seetharaman <[EMAIL PROTECTED]>

This patch adds a new device state SDEV_PASSIVE, to correspond to the
passive side access of an active/passive multipathed device.

Really, no; this isn't right.  The state field of a SCSI device is for
the SCSI state model.  Passive might be a valid device mapper state, but

Hi James,

It is not the "device mapper state", it is the state of the device
itself. These devices have active/passive paths, the passive paths will
be represented by SDEV_PASSIVE device state in SCSI.


Yes, it is .. you're killing commands on the basis of being in this
state, which nothing in SCSI ever sets.


SCSI does set this. See below.



A proper return from a passive path is the SCSI standard NOT_READY
LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED.  We expect to see
this, not the command being killed.



I think this part of the patch is trying to implement and detect the 
target port asymmetric access states from SPC-3 section 5.8.2.4 (it does 
not follow it exactly because devices like RDAC or old CLARiiONs did not 
implement the spec), and then use that info to fail commands before they 
are even sent to the device, to avoid startup delays when programs 
like udev, hal, and the kernel partition scanning probe the device.


For the LSI patch it works like the following:

When IO is sent to a path that cannot execute IO optimally, the scsi hw 
handler hook for sense processing (see rdac_check_sense in "[PATCH 8/9] 
scsi_dh: add lsi rdac device handler" and the scsi_error.c hook in 
"scsi_dh: add skeleton for SCSI Device Handlers") will detect this and 
set the state to passive so future IO is not executed on the path 
(SG_IO/passthrough is allowed).


I am not sure about alternatives. If we just exported the port access 
state in sysfs, but did not fail IO from scsi_prep_state_check, then the 
users could still check the state before sending IO. Would it be 
horrible to convert apps to do this?
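
For reference, the NOT READY return James mentions can be recognized from fixed-format sense data roughly as in the sketch below. This is stand-alone C and not the rdac_check_sense code from the patch series; the function and macro names are made up for illustration. The byte offsets follow the fixed sense data format (sense key in byte 2, ASC in byte 12, ASCQ in byte 13), and 02h/04h/02h is the code for LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_SENSE_KEY_NOT_READY 0x02
#define DEMO_ASC_LUN_NOT_READY   0x04
#define DEMO_ASCQ_INIT_CMD_REQ   0x02

/*
 * Return 1 if fixed-format sense data reports NOT READY with
 * LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED (02h/04h/02h),
 * 0 otherwise.  Descriptor-format sense (72h/73h) is not handled here.
 */
static int sense_is_init_cmd_required(const unsigned char *sense, size_t len)
{
	unsigned char code;

	assert(sense != NULL);

	/* Need at least bytes 0..13 to read key, ASC and ASCQ. */
	if (len < 14)
		return 0;

	code = sense[0] & 0x7f;		/* strip the VALID bit */
	if (code != 0x70 && code != 0x71)
		return 0;

	return (sense[2] & 0x0f) == DEMO_SENSE_KEY_NOT_READY &&
	       sense[12] == DEMO_ASC_LUN_NOT_READY &&
	       sense[13] == DEMO_ASCQ_INIT_CMD_REQ;
}
```

A device handler hook would run a check of this kind on the returned sense and then decide what to do with the path, instead of the command being killed before it is sent.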



Re: aic7xxx build failure

2008-02-05 Thread James Bottomley
On Tue, 2008-02-05 at 21:06 +0100, Sam Ravnborg wrote:
> On Tue, Feb 05, 2008 at 07:47:35PM +0100, Sam Ravnborg wrote:
> > On Tue, Feb 05, 2008 at 07:40:24PM +0200, Adrian Bunk wrote:
> > > Commit 8891fec65ac5b5a74b50c705e31b66c92c3eddeb broke aic7xxx 
> > > compilation:
> > > 
> > > <--  snip  -->
> > > 
> > > $ make O=../out/x86-full
> > > ...
> > >   SHIPPED drivers/scsi/aic7xxx/aic79xx_seq.h
> > >   SHIPPED drivers/scsi/aic7xxx/aic79xx_reg.h
> > >   CC  drivers/scsi/aic7xxx/aic79xx_core.o
> > > gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
> > > gcc: no input files
> > > make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1
> > > 
> > > <--  snip  -->
> > > 
> > > Next "make" run brings the same failure in 
> > > drivers/scsi/aic7xxx/aic7xxx_core.c.
> > > 
> > > With the third "make" it works.
> > > 
> > > It might compile for people with SMP systems using -j?
> > 
> > I can reproduce it and will fix it.
> Seems I was sidetracked by some wrong assumptions.
> Could you please test this fix.
> 
> Works for me but this time I will do more testing
> 
>   Sam
> 
> diff --git a/drivers/scsi/aic7xxx/Makefile b/drivers/scsi/aic7xxx/Makefile
> index 4c54954..6aa49e7 100644
> --- a/drivers/scsi/aic7xxx/Makefile
> +++ b/drivers/scsi/aic7xxx/Makefile
> @@ -44,8 +44,8 @@ clean-files += aic79xx_seq.h aic79xx_reg.h aic79xx_reg_print.c
>  
>  # Dependencies for generated files need to be listed explicitly
>  
> -$(addprefix $(src)/,$(aic7xxx-y:.o=.c)): $(obj)/aic7xxx_seq.h $(obj)/aic7xxx_reg.h
> -$(addprefix $(src)/,$(aic79xx-y:.o=.c)): $(obj)/aic79xx_seq.h $(obj)/aic79xx_reg.h
> +$(addprefix $(src)/,$(aic7xxx-y)): $(obj)/aic7xxx_seq.h $(obj)/aic7xxx_reg.h
> +$(addprefix $(src)/,$(aic79xx-y)): $(obj)/aic79xx_seq.h $(obj)/aic79xx_reg.h

OK, I think it's time for me to give up completely on understanding
kbuild.  To me this construction looks like you're adding source
directory prefixes to objects ... which can never be satisfied, can it,
if the objects are in the object directory?

James




Re: [PATCH] enclosure: add support for enclosure services

2008-02-05 Thread James Bottomley
On Tue, 2008-02-05 at 11:33 -0800, Luben Tuikov wrote:
> > Wrong ... we don't export non-SCSI devices as SCSI
> > (with the single and
> > rather annoying exception of ATA via SAT).
> 
> I didn't say you should do that.  I had already
> mentioned that vendors export such controls
> as either enclosure or processor type devices,
> and this is why I told you that that is what
> needs to be exported, which incidentally is
> a device node of that type.
> 
> Without a common usage model already in the kernel
> to abstract (e.g. sd for block device, since you brought
> that up) your abstraction seems redundant and arbitrary.

Exactly, so the first patch in this series (a while ago now) was a
common usage model abstraction of enclosures, and the second was an
implementation in terms of SES.   I will do one in terms of SGPIO as
well ... assuming I ever find a SGPIO enclosure ...

> Your kernel code already uses READ DIAGNOSTIC, etc,
> and I'd rather leave that to user-space.

You can do it in user space as well.  It's just a bit difficult to get
information out of a SES enclosure without using it, and getting some of
the information is a requirement of the abstraction.

James




Re: [PATCH] [SCSI] fix BUG when sum(scatterlist) > bufflen

2008-02-05 Thread Mike Christie

Tony Battersby wrote:

When sending a SCSI command to a tape drive via the SCSI Generic (sg)
driver, if the command has a data transfer length more than
scatter_elem_sz (32 KB default) and not a multiple of 512, then I either
hit BUG_ON(!valid_dma_direction(direction)) in dma_unmap_sg() or else
the command never completes (depending on the LLDD).

When constructing scatterlists, the sg driver rounds up the scatterlist
element sizes to be a multiple of 512.  This can result in
sum(scatterlist lengths) > bufflen.  In this case, scsi_req_map_sg()
incorrectly sets bio->bi_size to sum(scatterlist lengths) rather than to
bufflen.  When the command completes, req_bio_endio() detects that
bio->bi_size != 0, and so it doesn't call bio_endio().  This causes the
command to be resubmitted, resulting in BUG_ON or the command never
completing.

This patch makes scsi_req_map_sg() set bio->bi_size to bufflen rather
than to sum(scatterlist lengths), which fixes the problem.

Signed-off-by: Tony Battersby <[EMAIL PROTECTED]>
---
--- linux-2.6.24-git14/drivers/scsi/scsi_lib.c.orig 2008-02-05 09:33:05.0 -0500
+++ linux-2.6.24-git14/drivers/scsi/scsi_lib.c  2008-02-05 09:33:10.0 -0500
@@ -301,7 +301,6 @@ static int scsi_req_map_sg(struct reques
page = sg_page(sg);
off = sg->offset;
len = sg->length;
-   data_len += len;
 


Thanks for finding this. I am not sure what happened. That line got 
deleted in this commit when we fixed this problem:

http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commit;h=bd441deaf341c524b28fd72831ebf6fef88f1c41

but was added back here:
http://git.kernel.org/?p=linux/kernel/git/jejb/scsi-misc-2.6.git;a=commitdiff;h=c6132da1704be252ee6c923f47501083d835c238

Acked-by: Mike Christie <[EMAIL PROTECTED]>
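
The arithmetic behind Tony's report can be sketched in a few lines of stand-alone C. This is only a model of the rounding, not the sg or scsi_lib code; sg_total_len() and residual_after_completion() are made-up names, and the 512-byte rounding and 32 KB element size are the values given in the description above.

```c
#include <assert.h>

#define DEMO_SECTOR 512u

/* Round len up to the next multiple of 512. */
static unsigned int round_up_sector(unsigned int len)
{
	return (len + DEMO_SECTOR - 1) / DEMO_SECTOR * DEMO_SECTOR;
}

/*
 * Sum of the scatterlist element lengths when bufflen is split into
 * chunks of at most elem_sz bytes and every chunk is rounded up to a
 * multiple of 512, as the description says the sg driver does.
 */
static unsigned int sg_total_len(unsigned int bufflen, unsigned int elem_sz)
{
	unsigned int total = 0;

	assert(elem_sz > 0);

	while (bufflen > 0) {
		unsigned int chunk = bufflen < elem_sz ? bufflen : elem_sz;

		total += round_up_sector(chunk);
		bufflen -= chunk;
	}
	return total;
}

/*
 * Bytes still accounted to the bio after `done` bytes have completed.
 * A non-zero result models the case where bio_endio() is not called.
 */
static unsigned int residual_after_completion(unsigned int bi_size,
					       unsigned int done)
{
	return bi_size > done ? bi_size - done : 0;
}
```

With a 33000-byte transfer and 32768-byte elements, the list is 32768 + 512 = 33280 bytes long, so a bi_size taken from the scatterlist sum leaves 280 bytes outstanding after the command completes, while a bi_size of bufflen leaves none.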


Re: [PATCH] enclosure: add support for enclosure services

2008-02-05 Thread Luben Tuikov
--- On Tue, 2/5/08, James Bottomley <[EMAIL PROTECTED]> wrote:
> > > Wrong ... we don't export non-SCSI devices as SCSI
> > > (with the single and
> > > rather annoying exception of ATA via SAT).
> > 
> > I didn't say you should do that.  I had already
> > mentioned that vendors export such controls
> > as either enclosure or processor type devices,
> > and this is why I told you that that is what
> > needs to be exported, which incidentally is
> > a device node of that type.
> > 
> > Without a common usage model already in the kernel
> > to abstract (e.g. sd for block device, since you brought
> > that up) your abstraction seems redundant and arbitrary.
> 
> Exactly, so the first patch in this series (a while ago
^^^

See last paragraph.

> now) was a
> common usage model abstraction of enclosures, and the second was an
> implementation in terms of SES.   I will do one in terms of SGPIO as
> well ... assuming I ever find a SGPIO enclosure ...

The vendor would've abstracted that away most commonly
using SES.

> 
> > Your kernel code already uses READ DIAGNOSTIC, etc,
> > and I'd rather leave that to user-space.
> 
> You can do it in user space as well.  It's just a bit difficult to get
> information out of a SES enclosure without using it, and getting some of
> the information is a requirement of the abstraction.

You missed my point.  Your abstraction is redundant and
arbitrary -- it is not based on any known, in-practice,
usage model, already in place that needs a better, common
way of doing XYZ, and therefore needs an abstraction.

   Luben



Re: aic7xxx build failure

2008-02-05 Thread Adrian Bunk
On Tue, Feb 05, 2008 at 09:06:23PM +0100, Sam Ravnborg wrote:
> On Tue, Feb 05, 2008 at 07:47:35PM +0100, Sam Ravnborg wrote:
> > On Tue, Feb 05, 2008 at 07:40:24PM +0200, Adrian Bunk wrote:
> > > Commit 8891fec65ac5b5a74b50c705e31b66c92c3eddeb broke aic7xxx 
> > > compilation:
> > > 
> > > <--  snip  -->
> > > 
> > > $ make O=../out/x86-full
> > > ...
> > >   SHIPPED drivers/scsi/aic7xxx/aic79xx_seq.h
> > >   SHIPPED drivers/scsi/aic7xxx/aic79xx_reg.h
> > >   CC  drivers/scsi/aic7xxx/aic79xx_core.o
> > > gcc: drivers/scsi/aic7xxx/aic79xx_core.c: No such file or directory
> > > gcc: no input files
> > > make[4]: *** [drivers/scsi/aic7xxx/aic79xx_core.o] Error 1
> > > 
> > > <--  snip  -->
> > > 
> > > Next "make" run brings the same failure in 
> > > drivers/scsi/aic7xxx/aic7xxx_core.c.
> > > 
> > > With the third "make" it works.
> > > 
> > > It might compile for people with SMP systems using -j?
> > 
> > I can reproduce it and will fix it.
> Seems I was sidetracked by some wrong assumptions.
> Could you please test this fix.
> 
> Works for me but this time I will do more testing

Thanks, works fine for me.

>   Sam
> 
> diff --git a/drivers/scsi/aic7xxx/Makefile b/drivers/scsi/aic7xxx/Makefile
> index 4c54954..6aa49e7 100644
> --- a/drivers/scsi/aic7xxx/Makefile
> +++ b/drivers/scsi/aic7xxx/Makefile
> @@ -44,8 +44,8 @@ clean-files += aic79xx_seq.h aic79xx_reg.h aic79xx_reg_print.c
>  
>  # Dependencies for generated files need to be listed explicitly
>  
> -$(addprefix $(src)/,$(aic7xxx-y:.o=.c)): $(obj)/aic7xxx_seq.h $(obj)/aic7xxx_reg.h
> -$(addprefix $(src)/,$(aic79xx-y:.o=.c)): $(obj)/aic79xx_seq.h $(obj)/aic79xx_reg.h
> +$(addprefix $(src)/,$(aic7xxx-y)): $(obj)/aic7xxx_seq.h $(obj)/aic7xxx_reg.h
> +$(addprefix $(src)/,$(aic79xx-y)): $(obj)/aic79xx_seq.h $(obj)/aic79xx_reg.h
>  
>  aic7xxx-gen-$(CONFIG_AIC7XXX_BUILD_FIRMWARE) := $(obj)/aic7xxx_reg.h
>  aic7xxx-gen-$(CONFIG_AIC7XXX_REG_PRETTY_PRINT)   += $(obj)/aic7xxx_reg_print.c

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: [PATCH] Marvell 6440 SAS/SATA driver

2008-02-05 Thread Luben Tuikov
--- On Tue, 2/5/08, Ke Wei <[EMAIL PROTECTED]> wrote:
> + for_each_phy(port->wide_port_phymap, no, j, mvi->chip->n_phy) {
> + mvs_write_port_cfg_addr(mvi, no, PHYR_WIDE_PORT);
> + mvs_write_port_cfg_data(mvi, no , port->wide_port_phymap);
> + } else {
> + mvs_write_port_cfg_addr(mvi, no, PHYR_WIDE_PORT);
> + mvs_write_port_cfg_data(mvi, no , 0);
> + }
> +}

Don't do this.  Make the "if" explicit.

Since I can see you've taken this verbatim from the SAS code,
if "no" means number, then it is "j". "no" is just a temporary
register which gets shifted right each iteration and not of
much use outside the macro.

Also if "__rest" (which you added to the macro) is 0, then neither
statement would execute, which is probably not what you want.

If "n_phy" means "number of phys", then its usage that you added
into the macro is inconsistent. Furthermore it shouldn't be
necessary since wide_port_phymap & ~((2^n_phy)-1) must never be true.
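
As a rough illustration of what I mean by making the "if" explicit, here is a stand-alone sketch in C. It is not the driver code: wide_port_reg[] and update_wide_port() are hypothetical stand-ins for the mvs_write_port_cfg_addr()/mvs_write_port_cfg_data() pair, and "2^n_phy" above is read as 2 to the power n_phy.

```c
#include <assert.h>

#define DEMO_MAX_PHY 8

/* Hypothetical stand-in for the per-phy PHYR_WIDE_PORT register. */
static unsigned int wide_port_reg[DEMO_MAX_PHY];

/*
 * Walk every phy by index j and decide explicitly, per phy, whether
 * it is a member of the wide port described by phymap.
 */
static void update_wide_port(unsigned int phymap, unsigned int n_phy)
{
	unsigned int j;

	assert(n_phy <= DEMO_MAX_PHY);
	/* No bit at or above n_phy may ever be set in the map. */
	assert((phymap & ~((1u << n_phy) - 1)) == 0);

	for (j = 0; j < n_phy; j++) {
		if (phymap & (1u << j))
			wide_port_reg[j] = phymap;	/* member of the wide port */
		else
			wide_port_reg[j] = 0;		/* not a member */
	}
}
```

Written this way the loop index is the phy number, both branches are visible, and a map of 0 still clears every register.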

Luben



Re: aic7xxx build failure

2008-02-05 Thread Sam Ravnborg
> > index 4c54954..6aa49e7 100644
> > --- a/drivers/scsi/aic7xxx/Makefile
> > +++ b/drivers/scsi/aic7xxx/Makefile
> > @@ -44,8 +44,8 @@ clean-files += aic79xx_seq.h aic79xx_reg.h aic79xx_reg_print.c
> >  
> >  # Dependencies for generated files need to be listed explicitly
> >  
> > -$(addprefix $(src)/,$(aic7xxx-y:.o=.c)): $(obj)/aic7xxx_seq.h $(obj)/aic7xxx_reg.h
> > -$(addprefix $(src)/,$(aic79xx-y:.o=.c)): $(obj)/aic79xx_seq.h $(obj)/aic79xx_reg.h
> > +$(addprefix $(src)/,$(aic7xxx-y)): $(obj)/aic7xxx_seq.h $(obj)/aic7xxx_reg.h
> > +$(addprefix $(src)/,$(aic79xx-y)): $(obj)/aic79xx_seq.h $(obj)/aic79xx_reg.h
> 
> OK, I think it's time for me to give up completely on understanding
> kbuild.  To me this construction looks like you're adding source
> directory prefixes to objects ... which can never be satisfied, can it,
> if the objects are in the object directory?
Or maybe I'm just so damn tired that I should sleep instead of trying to fix
this Makefile for the 117th time.
You are right that it should read:

-$(addprefix $(src)/,$(aic7xxx-y:.o=.c)): $(obj)/aic7xxx_seq.h $(obj)/aic7xxx_reg.h
-$(addprefix $(src)/,$(aic79xx-y:.o=.c)): $(obj)/aic79xx_seq.h $(obj)/aic79xx_reg.h
+$(addprefix $(obj)/,$(aic7xxx-y)): $(obj)/aic7xxx_seq.h $(obj)/aic7xxx_reg.h
+$(addprefix $(obj)/,$(aic79xx-y)): $(obj)/aic79xx_seq.h $(obj)/aic79xx_reg.h

But for now the distinction between src and obj is purely for documentation
as they have the same value - also when O= is used.
So it should work anyway.

If you use M=... (or SUBDIRS=...) I think it matters but this
is not the case for this in-tree driver in normal usage situations.

I will test some more tomorrow and if feedback from Adrian is positive I
will submit the hopefully last update to this Makefile to Linus.
(I need to test if it can generate the files using the aicasm tool, for
instance.)

Sam


Re: [PATCH 7/9] scsi_dh: Add support for SDEV_PASSIVE

2008-02-05 Thread Mike Anderson
Mike Christie <[EMAIL PROTECTED]> wrote:
> When IO is sent to a path that cannot execute IO optimally, the scsi hw 
> handler hook for sense processing (see rdac_check_sense in "[PATCH 8/9] 
> scsi_dh: add lsi rdac device handler" and the scsi_error.c hook in 
> "scsi_dh: add skeleton for SCSI Device Handlers") will detect this and set 
> the state to passive so future IO is not executed on the path 
> (SG_IO/passthrough is allowed).
>
> I am not sure about alternatives. If we just exported the port access state 
> in sysfs, but did not fail IO from scsi_prep_state_check, then the users 
> could still check the state before sending IO. Would it be horrible to 
> convert apps to do this?

The majority of the boot up delays are caused by the kernel partition
scanning and other kernel init code (Chandra, please correct me if that is
not true). Sysfs attributes would not help here. One option may be to add
handling of the newer BLKERR_ codes in the generators of IO, or some
similar solution with a rollout possibly focused on the top generators of
IO.

A number of user apps like lvm scanning that execute media access commands
already have filter capability to filter devices that one does not want to
scan. Another class of device scanners just use inquiries, which are not
affected by the passive state (though some could probably use udevinfo and
reduce the amount of repeated SCSI inquiries executed on the system).

-andmike
--
Michael Anderson
[EMAIL PROTECTED]


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger
On Tue, 2008-02-05 at 22:21 +0300, Vladislav Bolkhovitin wrote:
> Jeff Garzik wrote:
> >>> iSCSI is way, way too complicated. 
> >>
> >> I fully agree. From one side, all that complexity is unavoidable for 
> >> case of multiple connections per session, but for the regular case of 
> >> one connection per session it must be a lot simpler.
> > 
> > Actually, think about those multiple connections...  we already had to 
> > implement fast-failover (and load bal) SCSI multi-pathing at a higher 
> > level.  IMO that portion of the protocol is redundant:   You need the 
> > same capability elsewhere in the OS _anyway_, if you are to support 
> > multi-pathing.
> 
> I'm thinking about MC/S as about a way to improve performance using 
> several physical links. There's no other way, except MC/S, to keep 
> commands processing order in that case. So, it's really valuable 
> property of iSCSI, although with a limited application.
> 
> Vlad
> 

Greetings,

I have always observed with LIO SE/iSCSI target mode (as well as with
other software initiators, which we can leave out of the discussion for
now; and congrats to the open/iscsi folks on their recent release :-) that
execution core hardware thread and inter-nexus performance per 1 Gb/sec
ethernet port scales very well with MC/S, up to 4x and 2x core x86_64.
I have been seeing 450 MB/sec using 2x socket 4x core x86_64 for
a number of years with MC/S.  The same goes for MC/S on 10 Gb/sec (on
PCI-X v2.0 266 MHz), which was the first transport LIO Target ran on
that was able to handle duplex ~1200 MB/sec with 3 initiators and
MC/S.  In the point to point 10 Gb/sec tests on IBM p404 machines, the
initiators were able to reach ~910 MB/sec with MC/S.  Open/iSCSI was
able to go a bit faster (~950 MB/sec) because it uses struct sk_buff
directly. 

A good rule to keep in mind while considering performance is that,
because of context switching overhead and pipeline <-> bus stalling
(along with other legacy OS specific storage stack limitations with BLOCK
and VFS with O_DIRECT, et al, which I will leave out of the discussion for
iSCSI and SE engine target mode), an initiator will scale roughly 1/2 as
well as a target, given comparable hardware and virsh output.  The
software target case also depends, to a great degree in many cases, on
whether we are talking about something as simple as doing contiguous DMA
memory allocations from a SINGLE kernel thread, and handling direction
execution to a storage hardware DMA ring that may not have been allocated
in the current kernel thread.  In MC/S mode this breaks down to:

1) Sorting logic that handles the pre-execution statemachine for
transport from local RDMA memory and OS specific data buffers (TCP
application data buffer, struct sk_buff, or RDMA struct page or SG).
This should be generic between iSCSI and iSER.

2) Allocation of said memory buffers to OS subsystem dependent code that
can be queued up to these drivers.  It breaks down to what you can get
drivers and OS subsystem folks to agree to implement, and can be made
generic in a Transport / BLOCK / VFS layered storage stack.  In the
"allocate thread DMA ring and use OS supported software and vendor
available hardware" case, I don't think the kernel space requirement
will ever completely be able to go away.

Without diving into RFC-3720 specifics, the statemachine on the MC/S
side covers memory allocation, login and logout generic to iSCSI and
iSER, and ERL=2 recovery.  My plan is to post the locations in the LIO
code where this has been implemented, and where we can make this easier,
etc.  Early in the development of what eventually became the LIO Target
code, ERL was broken into separate files and separate function
prefixes:

iscsi_target_erl0, iscsi_target_erl1 and iscsi_target_erl2.

The statemachine for ERL=0 and ERL=2 is pretty simple in RFC-3720 (have
a look for those interested in the discussion)

7.1.1.  State Descriptions for Initiators and Targets

The LIO target code is also pretty simple for this:

[EMAIL PROTECTED] target]# wc -l iscsi_target_erl*
  1115 iscsi_target_erl0.c
    45 iscsi_target_erl0.h
   526 iscsi_target_erl0.o
  1426 iscsi_target_erl1.c
    51 iscsi_target_erl1.h
  1253 iscsi_target_erl1.o
   605 iscsi_target_erl2.c
    45 iscsi_target_erl2.h
   447 iscsi_target_erl2.o
  5513 total

erl1.c is a bit larger than the others because it contains the MC/S
statemachine functions.  iscsi_target_erl1.c:iscsi_execute_cmd() and
iscsi_target_util.c:iscsi_check_received_cmdsn() do most of the work for
the LIO MC/S state machine.  It would probably benefit from being broken
out into, say, iscsi_target_mcs.c.  Note that all of this code is MC/S
safe, with the exception of the specific SCSI TMR functions.  For the
SCSI TMR pieces, I have always hoped to use SCST code for doing this...
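
For anyone who has not looked at session-wide CmdSN ordering before,
here is a rough userspace sketch of the kind of check a target makes
when a command arrives on any connection of an MC/S session.  This is
NOT the LIO iscsi_check_received_cmdsn() code; struct session_sn,
check_cmdsn() and the enum names are made up for illustration, and the
draining of queued commands and the advancing of MaxCmdSN are left out.
It only assumes the RFC-3720 rules that CmdSN is compared with 32-bit
serial number arithmetic against the ExpCmdSN..MaxCmdSN window.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names; a sketch only, not taken from any target code. */
enum cmdsn_result {
	CMDSN_EXECUTE,	/* next command in order, run it now */
	CMDSN_QUEUE,	/* inside the window but early, hold it */
	CMDSN_IGNORE,	/* outside the window, drop it */
};

struct session_sn {
	uint32_t exp_cmdsn;	/* next CmdSN the session expects */
	uint32_t max_cmdsn;	/* highest CmdSN currently allowed */
};

/* Serial number "a < b" modulo 2^32, so wraparound is handled. */
static int sn_lt(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/* Classify a CmdSN received on any connection of the session. */
static enum cmdsn_result check_cmdsn(struct session_sn *s, uint32_t cmdsn)
{
	assert(s != NULL);

	if (sn_lt(cmdsn, s->exp_cmdsn) || sn_lt(s->max_cmdsn, cmdsn))
		return CMDSN_IGNORE;

	if (cmdsn == s->exp_cmdsn) {
		s->exp_cmdsn++;		/* wraps naturally at 2^32 */
		return CMDSN_EXECUTE;
	}

	return CMDSN_QUEUE;
}
```

Because the counters live in the session and not in the connection, a
command that shows up early on one link just waits until the gap is
filled by a command arriving on another link.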

Most of the login/logout code is done in iscsi_target.c, which could
probably also benefit from getting broken out...

--nab



Re: [PATCH] enclosure: add support for enclosure services

2008-02-05 Thread Andrew Morton
On Sun, 03 Feb 2008 18:16:51 -0600
James Bottomley <[EMAIL PROTECTED]> wrote:

> 
> From: James Bottomley <[EMAIL PROTECTED]>
> Date: Sun, 3 Feb 2008 15:40:56 -0600
> Subject: [SCSI] enclosure: add support for enclosure services
> 
> The enclosure misc device is really just a library providing sysfs
> support for physical enclosure devices and their components.
> 

Thanks for sending it out for review.

> +struct enclosure_device *enclosure_find(struct device *dev)
> +{
> + struct enclosure_device *edev = NULL;
> +
> + mutex_lock(&container_list_lock);
> + list_for_each_entry(edev, &container_list, node) {
> + if (edev->cdev.dev == dev) {
> + mutex_unlock(&container_list_lock);
> + return edev;
> + }
> + }
> + mutex_unlock(&container_list_lock);
> +
> + return NULL;
> +}
> +EXPORT_SYMBOL_GPL(enclosure_find);

This looks a little odd.  We don't take a ref on the object after looking
it up, so what prevents some other thread of control from freeing or
otherwise altering the returned object while the caller is playing with it?
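
A minimal userspace sketch of the usual answer to this kind of lookup,
with made-up names (struct obj, obj_find_get(), obj_put()); it is not
the enclosure code.  A pthread mutex stands in for the kernel mutex and
a plain counter stands in for whatever refcounting the device model
would provide.  The only point is that the reference is taken before
the list lock is dropped, so the caller owns something that cannot go
away underneath it.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical object on a lock-protected singly linked list. */
struct obj {
	struct obj *next;
	const void *key;
	int refcount;		/* protected by list_lock in this sketch */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct obj *list_head;

/* Add an object; the list itself holds the first reference. */
static void obj_add(struct obj *o, const void *key)
{
	o->key = key;
	o->refcount = 1;
	pthread_mutex_lock(&list_lock);
	o->next = list_head;
	list_head = o;
	pthread_mutex_unlock(&list_lock);
}

/* Look up by key and take a reference while still holding the lock. */
static struct obj *obj_find_get(const void *key)
{
	struct obj *o;

	pthread_mutex_lock(&list_lock);
	for (o = list_head; o; o = o->next) {
		if (o->key == key) {
			o->refcount++;
			break;
		}
	}
	pthread_mutex_unlock(&list_lock);
	return o;		/* NULL when nothing matched */
}

/* Drop a reference; returns 1 when the caller dropped the last one. */
static int obj_put(struct obj *o)
{
	int last;

	pthread_mutex_lock(&list_lock);
	assert(o->refcount > 0);
	last = (--o->refcount == 0);
	pthread_mutex_unlock(&list_lock);
	return last;
}
```

Whoever sees obj_put() return 1 is the one responsible for the final
teardown; everyone else can keep using the pointer until their own put.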

> +/**
> + * enclosure_for_each_device - calls a function for each enclosure
> + * @fn:  the function to call
> + * @data:the data to pass to each call
> + *
> + * Loops over all the enclosures calling the function.
> + *
> + * Note, this function uses a mutex which will be held across calls to
> + * @fn, so it must have user context, and @fn should not sleep or

Probably "non atomic context" would be more accurate.

fn() actually _can_ sleep.

> + * otherwise cause the mutex to be held for indefinite periods
> + */
> +int enclosure_for_each_device(int (*fn)(struct enclosure_device *, void *),
> +   void *data)
> +{
> + int error = 0;
> + struct enclosure_device *edev;
> +
> + mutex_lock(&container_list_lock);
> + list_for_each_entry(edev, &container_list, node) {
> + error = fn(edev, data);
> + if (error)
> + break;
> + }
> + mutex_unlock(&container_list_lock);
> +
> + return error;
> +}
> +EXPORT_SYMBOL_GPL(enclosure_for_each_device);
> +
> +/**
> + * enclosure_register - register device as an enclosure
> + *
> + * @dev: device containing the enclosure
> + * @components:  number of components in the enclosure
> + *
> + * This sets up the device for being an enclosure.  Note that @dev does
> + * not have to be a dedicated enclosure device.  It may be some other type
> + * of device that additionally responds to enclosure services
> + */
> +struct enclosure_device *
> +enclosure_register(struct device *dev, const char *name, int components,
> +struct enclosure_component_callbacks *cb)
> +{
> + struct enclosure_device *edev =
> + kzalloc(sizeof(struct enclosure_device) +
> + sizeof(struct enclosure_component)*components,
> + GFP_KERNEL);
> + int err, i;
> +
> + if (!edev)
> + return ERR_PTR(-ENOMEM);
> +
> + if (!cb) {
> + kfree(edev);
> + return ERR_PTR(-EINVAL);
> + }

It would be less fuss if this were to test cb before doing the kzalloc().

Can cb==NULL actually and legitimately happen?
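
A sketch of that ordering in plain userspace C, assuming hypothetical
stand-ins (struct thing, struct callbacks, thing_create()) rather than
the real enclosure structures, with calloc() in place of kzalloc() and
an int return in place of ERR_PTR().  Checking the arguments first means
the -EINVAL path has nothing to free.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the structures in the patch. */
struct callbacks {
	int (*get_status)(int component);
};

struct thing {
	int components;
	const struct callbacks *cb;
	int component[];	/* one slot per component */
};

/*
 * Validate the arguments first, then allocate.  Returns 0 and stores
 * the new object in *out, or a negative errno value and leaves *out
 * untouched.
 */
static int thing_create(int components, const struct callbacks *cb,
			struct thing **out)
{
	struct thing *t;
	int i;

	assert(out != NULL);

	if (!cb || components < 0)
		return -EINVAL;

	t = calloc(1, sizeof(*t) + sizeof(t->component[0]) * components);
	if (!t)
		return -ENOMEM;

	t->components = components;
	t->cb = cb;
	for (i = 0; i < components; i++)
		t->component[i] = -1;	/* "not registered yet" */

	*out = t;
	return 0;
}
```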

> + edev->components = components;
> +
> + edev->cdev.class = &enclosure_class;
> + edev->cdev.dev = get_device(dev);
> + edev->cb = cb;
> + snprintf(edev->cdev.class_id, BUS_ID_SIZE, "%s", name);
> + err = class_device_register(&edev->cdev);
> + if (err)
> + goto err;
> +
> + for (i = 0; i < components; i++)
> + edev->component[i].number = -1;
> +
> + mutex_lock(&container_list_lock);
> + list_add_tail(&edev->node, &container_list);
> + mutex_unlock(&container_list_lock);
> +
> + return edev;
> +
> + err:
> + put_device(edev->cdev.dev);
> + kfree(edev);
> + return ERR_PTR(err);
> +}
> +EXPORT_SYMBOL_GPL(enclosure_register);
> +
> +static struct enclosure_component_callbacks enclosure_null_callbacks;
> +
> +/**
> + * enclosure_unregister - remove an enclosure
> + *
> + * @edev:the registered enclosure to remove;
> + */
> +void enclosure_unregister(struct enclosure_device *edev)
> +{
> + int i;
> +
> + if (!edev)
> + return;

Is this legal?

> + mutex_lock(&container_list_lock);
> + list_del(&edev->node);
> + mutex_unlock(&container_list_lock);

See, right now, someone who found this enclosure_device via
enclosure_find() could still be playing with it?

> + for (i = 0; i < edev->components; i++)
> + if (edev->component[i].number != -1)
> + class_device_unregister(&edev->component[i].cdev);
> +
> + /* prevent any callbacks into service user */
> + edev->cb = &enclosure_null_callbacks;
> + class_device_unregister(&edev->cdev);
> +}
> +EXPORT_SYMBOL_GPL(enclosure_unregister);
> +
> +/**
> + * enclosure_component_

Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger
On Tue, 2008-02-05 at 14:12 -0500, Jeff Garzik wrote:
> Vladislav Bolkhovitin wrote:
> > Jeff Garzik wrote:
> >> iSCSI is way, way too complicated. 
> > 
> > I fully agree. From one side, all that complexity is unavoidable for 
> > case of multiple connections per session, but for the regular case of 
> > one connection per session it must be a lot simpler.
> 
> 
> Actually, think about those multiple connections...  we already had to 
> implement fast-failover (and load bal) SCSI multi-pathing at a higher 
> level.  IMO that portion of the protocol is redundant:   You need the 
> same capability elsewhere in the OS _anyway_, if you are to support 
> multi-pathing.
> 
>   Jeff
> 
> 

Hey Jeff,

I put a whitepaper on the LIO cluster recently about this topic.  It is
from a few years ago but the datapoints are very relevant.

http://linux-iscsi.org/builds/user/nab/Inter.vs.OuterNexus.Multiplexing.pdf

The key advantage to MC/S and ERL=2 has always been that they are
completely OS independent.  They are designed to work together and
actually benefit from one another.

They are also protocol independent between traditional iSCSI and
iSER.

--nab

PS: A great thanks to my former colleague Edward Cheng for putting this
together.



Re: [PATCH 7/9] scsi_dh: Add support for SDEV_PASSIVE

2008-02-05 Thread Chandra Seetharaman
On Tue, 2008-02-05 at 13:56 -0800, Mike Anderson wrote:
> Mike Christie <[EMAIL PROTECTED]> wrote:
> > When IO is sent to a path that cannot execute IO optimally, the scsi hw 
> > handler hook for sense processing (see rdac_check_sense in "[PATCH 8/9] 
> > scsi_dh: add lsi rdac device handler" and the scsi_error.c hook in in 
> > "scsi_dh: add skeleton for SCSI Device Handlers") will detect this and set 
> > the state to passive so future IO is not execute on the path 
> > (SG_IO/passthrough is allowed).
> >
> > I am not sure about alternatives. If we just exported the port access state 
> > in sysfs, but did not fail IO from scsi_prep_state_check, then the users 
> > could still check the state before sending IO. Would it be horrible to 
> > convert apps to do this?
> 
> The majority of the boot up delays is caused by the kernel partition
> scanning and other kernel init code (Chandra please correct if that is not

Yes, this is the case.

Some level of scanning happens at the rc scripts level too. That can be
reduced by what Mikec is suggesting. But, as andmike is suggesting, it
won't be a complete solution.

> true). Sysfs attributes would not help here. One option maybe to add
> handling of the newer BLKERR_ codes in the generators of IO or some
> similar solution with a rollout possibly focused at the top generators of

Are you suggesting that the partition scanners (kernel) and lvm (user
space scanner) should stop sending I/Os to a passive device once they
realize that the device is passive (through BLKERR_ return codes)?

> IO.
> 
> A number of user apps like lvm scanning that execute media access commands
> already have filter capability to filter devices that one does not want to

Yes, it will help.  But it will lead to additional instructions for the
users which, if they do not follow them (due to not knowing about them
or some such), will lead to a delayed boot.

IMO, it would be good if it worked nicely out of the box.

> scan. Another class of device scanners just use inquiries which are not
> effected by the passive state (though some could probably use udevinfo and
> reduce the amount of repeated SCSI inquiries execute on the system.
> 
> -andmike
> --
> Michael Anderson
> [EMAIL PROTECTED]
-- 

--
Chandra Seetharaman   | Be careful what you choose
  - [EMAIL PROTECTED]   |  ...you may get it.
--




Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger
On Tue, 2008-02-05 at 22:01 +0300, Vladislav Bolkhovitin wrote:
> Jeff Garzik wrote:
> > Alan Cox wrote:
> > 
> >>>better. So for example, I personally suspect that ATA-over-ethernet is way 
> >>>better than some crazy SCSI-over-TCP crap, but I'm biased for simple and 
> >>>low-level, and against those crazy SCSI people to begin with.
> >>
> >>Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
> >>would probably trash iSCSI for latency if nothing else.
> > 
> > 
> > AoE is truly a thing of beauty.  It has a two/three page RFC (say no more!).
> > 
> > But quite so...  AoE is limited to MTU size, which really hurts.  Can't 
> > really do tagged queueing, etc.
> > 
> > 
> > iSCSI is way, way too complicated. 
> 
> I fully agree. From one side, all that complexity is unavoidable for 
> case of multiple connections per session, but for the regular case of 
> one connection per session it must be a lot simpler.
> 
> And now think about iSER, which brings iSCSI on the whole new complexity 
> level ;)

Actually, the iSER wire protocol itself is quite simple, because it
builds on iSCSI and IPS fundamentals, and because traditional iSCSI's
recovery logic for CRC failures (and hence a lot of acknowledgement
sequence PDUs that go missing, etc) is disabled, and instead the RDMA
Capable Protocol (RCaP) provides the 32-bit or greater data integrity.

The logic that iSER collectively disables is known as within-connection
and within-command recovery (negotiated as ErrorRecoveryLevel=1 on the
wire).  RFC-5046 requires that the iSCSI layer on which iSER is being
enabled disable CRC32C checksums and any associated timeouts for ERL=1.

Also, have a look at Appendix A. in the iSER spec.

  A.1. iWARP Message Format for iSER Hello Message
  A.2. iWARP Message Format for iSER HelloReply Message
  A.3. iWARP Message Format for SCSI Read Command PDU
  A.4. iWARP Message Format for SCSI Read Data
  A.5. iWARP Message Format for SCSI Write Command PDU
  A.6. iWARP Message Format for RDMA Read Request
  A.7. iWARP Message Format for Solicited SCSI Write Data
  A.8. iWARP Message Format for SCSI Response PDU

This is about 1/2 as many as the traditional iSCSI PDUs that iSER
encapsulates.

--nab



Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger
On Tue, 2008-02-05 at 16:48 -0800, Nicholas A. Bellinger wrote:
> On Tue, 2008-02-05 at 22:01 +0300, Vladislav Bolkhovitin wrote:
> > Jeff Garzik wrote:
> > > Alan Cox wrote:
> > > 
> > >>>better. So for example, I personally suspect that ATA-over-ethernet is 
> > >>>way 
> > >>>better than some crazy SCSI-over-TCP crap, but I'm biased for simple and 
> > >>>low-level, and against those crazy SCSI people to begin with.
> > >>
> > >>Current ATAoE isn't. It can't support NCQ. A variant that did NCQ and IP
> > >>would probably trash iSCSI for latency if nothing else.
> > > 
> > > 
> > > AoE is truly a thing of beauty.  It has a two/three page RFC (say no 
> > > more!).
> > > 
> > > But quite so...  AoE is limited to MTU size, which really hurts.  Can't 
> > > really do tagged queueing, etc.
> > > 
> > > 
> > > iSCSI is way, way too complicated. 
> > 
> > I fully agree. From one side, all that complexity is unavoidable for 
> > case of multiple connections per session, but for the regular case of 
> > one connection per session it must be a lot simpler.
> > 
> > And now think about iSER, which brings iSCSI on the whole new complexity 
> > level ;)
> 
> Actually, the iSER protocol wire protocol itself is quite simple,
> because it builds on iSCSI and IPS fundamentals, and because traditional
> iSCSI's recovery logic for CRC failures (and hence alot of
> acknowledgement sequence PDUs that go missing, etc) and the RDMA
> Capable
> Protocol (RCaP).

this should be:

.. and instead the RDMA Capable Protocol (RCaP) provides the 32-bit or
greater data integrity.

--nab



Re: new scsi sense handling

2008-02-05 Thread FUJITA Tomonori
On Tue, 5 Feb 2008 11:43:58 -0800 (PST)
Luben Tuikov <[EMAIL PROTECTED]> wrote:

> --- On Tue, 2/5/08, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > On Mon, 4 Feb 2008 18:39:22 -0800 (PST)
> > Luben Tuikov <[EMAIL PROTECTED]> wrote:
> > 
> > > --- On Mon, 2/4/08, Boaz Harrosh <[EMAIL PROTECTED]> wrote:
> > > > There are 3 usages of sense handling in drivers
> > > > 
> > > > 1. sense is available in driver internal structure and is
> > > >    mem-copied to upper level
> > > > 2. A CHECK_CONDITION status was returned and the driver uses the
> > > >    scsi_eh_prep_cmnd() for a REQUEST_SENSE invocation to the
> > > >    target. Then returning the sense in scsi_eh_return_cmnd().
> > > >    A variation on this is when the driver does nothing the queue
> > > >    is frozen an the scsi watchdog timer does the above.
> > > > 3. The underline host adapter does the REQUEST_SENSE and a
> > > >    pre-allocated and DMA mapped sense buffer receives the sense
> > > >    information from HW.
> > > 
> > > Many years ago when "ACA" had a constructive meaning,
> > > so did "Autosense".  Then about 5 years ago, "Autosense"
> > > disappeared completely since it became the de facto
> > > implementation of the then SCSI Execute Command "RPC",
> > > now just SCSI Execute Command procedure call.
> > > 
> > > At that point in time, the SCSI mid-layer decided
> > > to embrace this model and give the LLDD a scsi command
> > > structure which included the sense data buffer to
> > > a size that the SCSI mid-layer was interested in,
> > > at the moment 96 bytes, macro defined in
> > > include/scsi/scsi_cmnd.h.
> > > 
> > > The concept of "Autosense" was off-loaded to LLDD
> > > to emulate it if the specific target device to
> > > which the command was issued, didn't supply the
> > > sense data on CHECK CONDITION, and more so
> > > relevant to target devices which implemented
> > > queuing, thus the ACA.
> > > 
> > > And the mid-layer would consider extracting
> > > the sense data via REQUEST SENSE command
> > > as a _special case_ if the LLDD/transport layer
> > > didn't implement the "autosense" model.
> > 
> > Only SPI and USB?
> 
> I don't understand this question.

I meant, 'which transport protocols fall into the category of transport
protocols that don't implement the "autosense" model?'


> > The most of LLDs using the transport protocol that we care about
> > today uses sense buffer in their own internal structure.
> 
> Yes.
> 
> > 
> > I think that the issue to solve to kill scsi_cmnd:sense_buffer is
> > how to share (or export) such sense buffer with the scsi mid-layer.
> And therein lies the problem.  Sense data is SCSI specific,
> it should be left to SCSI, unless of course you can
> stipulate that _all_ block devices return sense data.

Yeah, sense data is SCSI specific and it should be left to SCSI. But
I'm not sure we need to stipulate that _all_ block devices return
sense data. Today the block layer users (sg, bsg, etc) use it only
when it's appropriate (or only if they want to use it).


> If that's not the case and you move it to the block
> layer, then you get a whole bunch of other problems,
> like does this device want/use it, should we allocate
> it, etc. OTOH, if that _is_ the case, then you don't
> have to worry about this and the model is pretty
> much as the SCSI mid-layer has it, i.e. sense buffer
> always present.  So I guess the question is, can
> you stipulate that _all_ block devices return sense data?


Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread FUJITA Tomonori
On Tue, 05 Feb 2008 18:09:15 +0100
Matteo Tescione <[EMAIL PROTECTED]> wrote:

> On 5-02-2008 14:38, "FUJITA Tomonori" <[EMAIL PROTECTED]> wrote:
> 
> > On Tue, 05 Feb 2008 08:14:01 +0100
> > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > 
> >> James Bottomley schrieb:
> >> 
> >>> These are both features being independently worked on, are they not?
> >>> Even if they weren't, the combination of the size of SCST in kernel plus
> >>> the problem of having to find a migration path for the current STGT
> >>> users still looks to me to involve the greater amount of work.
> >> 
> >> I don't want to be mean, but does anyone actually use STGT in
> >> production? Seriously?
> >> 
> >> In the latest development version of STGT, it's only possible to stop
> >> the tgtd target daemon using KILL / 9 signal - which also means all
> >> iSCSI initiator connections are corrupted when tgtd target daemon is
> >> started again (kernel upgrade, target daemon upgrade, server reboot etc.).
> > 
> > I don't know what "iSCSI initiator connections are corrupted"
> > mean. But if you reboot a server, how can an iSCSI target
> > implementation keep iSCSI tcp connections?
> > 
> > 
> >> Imagine you have to reboot all your NFS clients when you reboot your NFS
> >> server. Not only that - your data is probably corrupted, or at least the
> >> filesystem deserves checking...
> 
> Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
> rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
> manages stop/crash, by sending unit attention to clients on reconnect.
> Drbd+heartbeat correctly manages those things too.
> Still from an end-user POV, i was able to reboot/survive a crash only with
> SCST, IETD still has reconnect problems and STGT are even worst.

Please tell us on stgt-devel mailing list if you see problems. We will
try to fix them.

Thanks,


Re: Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger
On Tue, 2008-02-05 at 16:11 -0800, Nicholas A. Bellinger wrote:
> On Tue, 2008-02-05 at 22:21 +0300, Vladislav Bolkhovitin wrote:
> > Jeff Garzik wrote:
> > >>> iSCSI is way, way too complicated. 
> > >>
> > >> I fully agree. From one side, all that complexity is unavoidable for 
> > >> case of multiple connections per session, but for the regular case of 
> > >> one connection per session it must be a lot simpler.
> > > 
> > > Actually, think about those multiple connections...  we already had to 
> > > implement fast-failover (and load bal) SCSI multi-pathing at a higher 
> > > level.  IMO that portion of the protocol is redundant:   You need the 
> > > same capability elsewhere in the OS _anyway_, if you are to support 
> > > multi-pathing.
> > 
> > I'm thinking about MC/S as about a way to improve performance using 
> > several physical links. There's no other way, except MC/S, to keep 
> > commands processing order in that case. So, it's really valuable 
> > property of iSCSI, although with a limited application.
> > 
> > Vlad
> > 
> 
> Greetings,
> 
> I have always observed the case with LIO SE/iSCSI target mode (as well
> as with other software initiators we can leave out of the discussion for
> now, and congrats to the open/iscsi on folks recent release. :-) that
> execution core hardware thread and inter-nexus per 1 Gb/sec ethernet
> port performance scales up to 4x and 2x core x86_64 very well with
> MC/S).  I have been seeing 450 MB/sec using 2x socket 4x core x86_64 for
> a number of years with MC/S.  Using MC/S on 10 Gb/sec (on PCI-X v2.0
> 266mhz as well, which was the first transport that LIO Target ran on
> that was able to reach handle duplex ~1200 MB/sec with 3 initiators and
> MC/S.  In the point to point 10 GB/sec tests on IBM p404 machines, the
> initiators where able to reach ~910 MB/sec with MC/S.  Open/iSCSI was
> able to go a bit faster (~950 MB/sec) because it uses struct sk_buff
> directly. 
> 
 
Sorry, these were IBM p505 express (not p404, duh) which had a 2x
socket 2x core POWER5 setup.  These (along with an IBM X-series machine)
were the only ones available for PCI-X v2.0, and this probably is still
the case. :-)

Also, these numbers were with a ~9000 MTU (I don't recall what the
hardware limit on the 10 Gb/sec switch was) doing direct struct iovec
to preallocated struct page mapping for payload on the target side.
This is known as the RAMDISK_DR plugin in LIO-SE.  On the initiator, LTP
disktest and O_DIRECT were used for direct to SCSI block device access.

I can dig up this paper if anyone is interested.

--nab



Re: [Scst-devel] Integration of SCST in the mainstream Linux kernel

2008-02-05 Thread Nicholas A. Bellinger
On Wed, 2008-02-06 at 10:29 +0900, FUJITA Tomonori wrote:
> On Tue, 05 Feb 2008 18:09:15 +0100
> Matteo Tescione <[EMAIL PROTECTED]> wrote:
> 
> > On 5-02-2008 14:38, "FUJITA Tomonori" <[EMAIL PROTECTED]> wrote:
> > 
> > > On Tue, 05 Feb 2008 08:14:01 +0100
> > > Tomasz Chmielewski <[EMAIL PROTECTED]> wrote:
> > > 
> > >> James Bottomley schrieb:
> > >> 
> > >>> These are both features being independently worked on, are they not?
> > >>> Even if they weren't, the combination of the size of SCST in kernel plus
> > >>> the problem of having to find a migration path for the current STGT
> > >>> users still looks to me to involve the greater amount of work.
> > >> 
> > >> I don't want to be mean, but does anyone actually use STGT in
> > >> production? Seriously?
> > >> 
> > >> In the latest development version of STGT, it's only possible to stop
> > >> the tgtd target daemon using KILL / 9 signal - which also means all
> > >> iSCSI initiator connections are corrupted when tgtd target daemon is
> > >> started again (kernel upgrade, target daemon upgrade, server reboot 
> > >> etc.).
> > > 
> > > I don't know what "iSCSI initiator connections are corrupted"
> > > mean. But if you reboot a server, how can an iSCSI target
> > > implementation keep iSCSI tcp connections?
> > > 
> > > 
> > >> Imagine you have to reboot all your NFS clients when you reboot your NFS
> > >> server. Not only that - your data is probably corrupted, or at least the
> > >> filesystem deserves checking...
> > 

The TCP connection will drop; remember that the TCP connection state for
one side has completely vanished.  Depending on the iSCSI/iSER
ErrorRecoveryLevel that is set, this will mean:

1) Session Recovery, ERL=0 - Restarting the entire nexus and all
connections across all of the possible subnets or comm-links.  All
outstanding un-StatSN acknowledged commands will be returned back to the
SCSI subsystem with RETRY status.  Once a single connection has been
reestablished to start the nexus, the CDBs will be resent.

2) Connection Recovery, ERL=2 - CDBs from the failed connection(s) will
be retried (nothing changes in the PDU) to fill the iSCSI CmdSN ordering
gap, or be explicitly retried with TMR TASK_REASSIGN for ones already
acknowledged by the ExpCmdSN that are returned to the initiator in
response packets or by way of unsolicited NopINs.
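
The two cases can be put into a few lines of C as a rough initiator-side
sketch.  The names (enum cmd_recovery, recover_cmd()) are invented for
illustration and this is not code from any initiator; it also assumes
that ERL=1 is handled like ERL=0 when a connection is lost, since
connection recovery needs ERL=2.  A command counts as "acknowledged"
here when its CmdSN is below the last ExpCmdSN seen from the target,
compared with 32-bit serial number arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; a sketch of the decision only. */
enum cmd_recovery {
	CMD_RETURN_RETRY,	/* session recovery: hand back to SCSI with RETRY */
	CMD_RESEND_SAME_PDU,	/* fill the CmdSN gap with the unchanged PDU */
	CMD_TASK_REASSIGN,	/* target has the command: TMR TASK_REASSIGN */
};

/* Serial number "a < b" modulo 2^32. */
static int sn_lt(uint32_t a, uint32_t b)
{
	return (int32_t)(a - b) < 0;
}

/*
 * Decide what to do with one command that was outstanding on a failed
 * connection.  exp_cmdsn is the last ExpCmdSN received from the target.
 */
static enum cmd_recovery recover_cmd(int erl, uint32_t cmdsn,
				     uint32_t exp_cmdsn)
{
	assert(erl >= 0 && erl <= 2);

	if (erl < 2)
		return CMD_RETURN_RETRY;

	if (sn_lt(cmdsn, exp_cmdsn))
		return CMD_TASK_REASSIGN;

	return CMD_RESEND_SAME_PDU;
}
```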

> > Don't know if matters, but in my setup (iscsi on top of drbd+heartbeat)
> > rebooting the primary server doesn't affect my iscsi traffic, SCST correctly
> > manages stop/crash, by sending unit attention to clients on reconnect.
> > Drbd+heartbeat correctly manages those things too.
> > Still from an end-user POV, i was able to reboot/survive a crash only with
> > SCST, IETD still has reconnect problems and STGT are even worst.
> 
> Please tell us on stgt-devel mailing list if you see problems. We will
> try to fix them.
> 

FYI, the LIO code also supports rmmoding iscsi_target_mod while at full
10 Gb/sec speed.  I think it should be a requirement to be able to
control per initiator, per portal group, per LUN, per device, per HBA in
the design without restarting any other objects.

--nab

> Thanks,


Re: new scsi sense handling

2008-02-05 Thread Luben Tuikov
--- On Tue, 2/5/08, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > --- On Tue, 2/5/08, FUJITA Tomonori <[EMAIL PROTECTED]> wrote:
> > > On Mon, 4 Feb 2008 18:39:22 -0800 (PST)
> > > Luben Tuikov <[EMAIL PROTECTED]> wrote:
> > > 
> > > > --- On Mon, 2/4/08, Boaz Harrosh <[EMAIL PROTECTED]> wrote:
> > > > > There are 3 usages of sense handling in drivers
> > > > > 
> > > > > 1. sense is available in driver internal structure and is
> > > > >    mem-copied to upper level
> > > > > 2. A CHECK_CONDITION status was returned and the driver uses
> > > > >    the scsi_eh_prep_cmnd() for a REQUEST_SENSE invocation to
> > > > >    the target. Then returning the sense in
> > > > >    scsi_eh_return_cmnd(). A variation on this is when the
> > > > >    driver does nothing the queue is frozen an the scsi
> > > > >    watchdog timer does the above.
> > > > > 3. The underline host adapter does the REQUEST_SENSE and a
> > > > >    pre-allocated and DMA mapped sense buffer receives the
> > > > >    sense information from HW.
> > > > 
> > > > Many years ago when "ACA" had a constructive meaning,
> > > > so did "Autosense".  Then about 5 years ago, "Autosense"
> > > > disappeared completely since it became the de facto
> > > > implementation of the then SCSI Execute Command "RPC",
> > > > now just SCSI Execute Command procedure call.
> > > > 
> > > > At that point in time, the SCSI mid-layer decided
> > > > to embrace this model and give the LLDD a scsi command
> > > > structure which included the sense data buffer to
> > > > a size that the SCSI mid-layer was interested in,
> > > > at the moment 96 bytes, macro defined in
> > > > include/scsi/scsi_cmnd.h.
> > > > 
> > > > The concept of "Autosense" was off-loaded to LLDD
> > > > to emulate it if the specific target device to
> > > > which the command was issued, didn't supply the
> > > > sense data on CHECK CONDITION, and more so
> > > > relevant to target devices which implemented
> > > > queuing, thus the ACA.
> > > > 
> > > > And the mid-layer would consider extracting
> > > > the sense data via REQUEST SENSE command
> > > > as a _special case_ if the LLDD/transport layer
> > > > didn't implement the "autosense" model.
> > > 
> > > Only SPI and USB?
> > 
> > I don't understand this question.
> 
> I meant, 'what transport protocols are categorized into the
> transport protocol that doesn't implement the "autosense" model?'

If any transport protocol conforms to SAM, it supports it,
either emulated in the transport itself or supported
by the device (target) itself.  But ideally, the SCSI
mid-layer shouldn't have to get a CHECK CONDITION and
then turn around and send REQUEST SENSE, due to the
atomicity (per command) of the sense data, especially
if the target supports queuing.  There used to be
a mechanism to support this in SAM but it is now obsolete.
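
To make the "autosense" idea concrete, here is a small userspace sketch
of a completion path where the sense data arrives together with the
CHECK CONDITION status and is copied into a fixed per-command buffer, so
that no separate REQUEST SENSE is needed.  struct cmd and
complete_with_autosense() are hypothetical names and not the mid-layer's
structures; the 96 byte size is the one mentioned earlier in this
thread, and 0x02 is the SCSI CHECK CONDITION status code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SENSE_BUFFERSIZE	96	/* size quoted earlier in the thread */
#define STATUS_GOOD		0x00
#define STATUS_CHECK_CONDITION	0x02

/* Hypothetical per-command structure with an embedded sense buffer. */
struct cmd {
	unsigned char status;
	size_t sense_len;
	unsigned char sense[SENSE_BUFFERSIZE];
};

/*
 * Complete a command.  When the status is CHECK CONDITION and the
 * transport delivered sense data with it, copy that data (truncated to
 * the buffer size) into the command.
 */
static void complete_with_autosense(struct cmd *c, unsigned char status,
				    const unsigned char *sense, size_t len)
{
	assert(c != NULL);

	c->status = status;
	c->sense_len = 0;

	if (status != STATUS_CHECK_CONDITION || !sense)
		return;

	if (len > SENSE_BUFFERSIZE)
		len = SENSE_BUFFERSIZE;
	memcpy(c->sense, sense, len);
	c->sense_len = len;
}
```

Since the sense bytes travel with the status of the same command, there
is no window in which another queued command could overwrite them.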

> 
> 
> > > The most of LLDs using the transport protocol
> that we care
> > > about today
> > > uses sense buffer in their own internal
> structure.
> > 
> > Yes.
> > 
> > > 
> > > I think that the issue to solve to kill
> > > scsi_cmnd:sense_buffer is how to share (or export)
> > > such sense buffer with the scsi mid-layer.
> > 
> > And therein lies the problem.  Sense data is SCSI specific,
> > it should be left to SCSI, unless of course you can
> > stipulate that _all_ block devices return sense data.
> 
> Yeah, sense data is SCSI specific and it should be left to SCSI. But
> I'm not sure we need to stipulate that _all_ block devices return
> sense data. Today the block layer users (sg, bsg, etc) use it only
> when it's appropriate (or only if they want to use it).

I agree.

   Luben



Re: [PATCH] enclosure: add support for enclosure services

2008-02-05 Thread James Bottomley
On Tue, 2008-02-05 at 16:12 -0800, Andrew Morton wrote:
> On Sun, 03 Feb 2008 18:16:51 -0600
> James Bottomley <[EMAIL PROTECTED]> wrote:
> 
> > 
> > From: James Bottomley <[EMAIL PROTECTED]>
> > Date: Sun, 3 Feb 2008 15:40:56 -0600
> > Subject: [SCSI] enclosure: add support for enclosure services
> > 
> > The enclosure misc device is really just a library providing sysfs
> > support for physical enclosure devices and their components.
> > 
> 
> Thanks for sending it out for review.
> 
> > +struct enclosure_device *enclosure_find(struct device *dev)
> > +{
> > +   struct enclosure_device *edev = NULL;
> > +
> > +   mutex_lock(&container_list_lock);
> > +   list_for_each_entry(edev, &container_list, node) {
> > +   if (edev->cdev.dev == dev) {
> > +   mutex_unlock(&container_list_lock);
> > +   return edev;
> > +   }
> > +   }
> > +   mutex_unlock(&container_list_lock);
> > +
> > +   return NULL;
> > +}
> > +EXPORT_SYMBOL_GPL(enclosure_find);
> 
> This looks a little odd.  We don't take a ref on the object after looking
> it up, so what prevents some other thread of control from freeing or
> otherwise altering the returned object while the caller is playing with it?

The use case is for enclosure destruction, so the free should never
happen, but I take the point; I've added a class_device_get().
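To make the point about taking a reference concrete, here is a minimal userspace sketch, not the driver code. It assumes a plain integer count guarded by the same lock that guards the list; struct enc_sketch, enc_add(), enc_find_get() and enc_put() are hypothetical names, and a pthread mutex stands in for the kernel mutex. The only thing it is meant to show is that the count is bumped before the list lock is dropped, so the caller never holds a pointer that nothing pins.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct enclosure_device. */
struct enc_sketch {
	const void *dev;          /* key we search on */
	int refcount;             /* protected by list_lock in this sketch */
	struct enc_sketch *next;  /* singly linked list, for brevity */
};

static pthread_mutex_t list_lock = PTHREAD_MUTEX_INITIALIZER;
static struct enc_sketch *list_head;

/* Add an entry; the list itself holds one reference. */
static void enc_add(struct enc_sketch *e)
{
	pthread_mutex_lock(&list_lock);
	e->refcount = 1;
	e->next = list_head;
	list_head = e;
	pthread_mutex_unlock(&list_lock);
}

/* Look up by dev and take a reference before dropping the lock. */
static struct enc_sketch *enc_find_get(const void *dev)
{
	struct enc_sketch *e, *found = NULL;

	pthread_mutex_lock(&list_lock);
	for (e = list_head; e; e = e->next) {
		if (e->dev == dev) {
			e->refcount++;   /* pinned while still under the lock */
			found = e;
			break;
		}
	}
	pthread_mutex_unlock(&list_lock);
	return found;
}

/* Drop a reference; returns the count that remains. */
static int enc_put(struct enc_sketch *e)
{
	int left;

	pthread_mutex_lock(&list_lock);
	left = --e->refcount;
	pthread_mutex_unlock(&list_lock);
	return left;
}
```

A caller would pair every successful enc_find_get() with one enc_put(); whoever sees the count reach zero is the one allowed to free the object.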

> > +/**
> > + * enclosure_for_each_device - calls a function for each enclosure
> > + * @fn:the function to call
> > + * @data:  the data to pass to each call
> > + *
> > + * Loops over all the enclosures calling the function.
> > + *
> > + * Note, this function uses a mutex which will be held across calls to
> > + * @fn, so it must have user context, and @fn should not sleep or
> 
> Probably "non atomic context" would be more accurate.
> 
> fn() actually _can_ sleep.

"should" to me means you don't have to do this but ought to. I'll add a
may (but should not).

> > +   if (!cb) {
> > +   kfree(edev);
> > +   return ERR_PTR(-EINVAL);
> > +   }
> 
> It would be less fuss if this were to test cb before doing the kzalloc().
> 
> Can cb==NULL actually and legitimately happen?

Not really ... I'll make it a BUG_ON.

> > +void enclosure_unregister(struct enclosure_device *edev)
> > +{
> > +   int i;
> > +
> > +   if (!edev)
> > +   return;
> 
> Is this legal?

No ... it'll oops on the null deref later ... I'll remove this.

> > +   mutex_lock(&container_list_lock);
> > +   list_del(&edev->node);
> > +   mutex_unlock(&container_list_lock);
> 
> See, right now, someone who found this enclosure_device via
> enclosure_find() could still be playing with it?

Yes, fixed.

> > +   if (!edev || number >= edev->components)
> > +   return ERR_PTR(-EINVAL);
> 
> Is !edev possible and legitimate?

It shouldn't be, no ... I can remove it.

> > +   snprintf(cdev->class_id, BUS_ID_SIZE, "%d", number);
> 
> %u :)

Nitpicker!

> > +   return snprintf(buf, 40, "%d\n", edev->components);
> > +}
> 
> "40"?

I just followed precedent ;-P

There doesn't seem to be a define for this maximum length, so 40 is the
most commonly picked constant.

> > +static char *enclosure_type [] = {
> > +   [ENCLOSURE_COMPONENT_DEVICE] = "device",
> > +   [ENCLOSURE_COMPONENT_ARRAY_DEVICE] = "array device",
> > +};
> 
> One could play with const here, if sufficiently keen.

One will try to summon up the enthusiasm.

> > +static ssize_t set_component_fault(struct class_device *cdev, const char *buf,
> > +  size_t count)
> > +{
> > +   struct enclosure_device *edev = to_enclosure_device(cdev->parent);
> > +   struct enclosure_component *ecomp = to_enclosure_component(cdev);
> > +   int val = simple_strtoul(buf, NULL, 0);
> 
> hrm, we do this conversion about 1e99 times in the kernel and we have to go
> and pass three args where only one was needed. katoi()?

Yes ... I'll add it to the todo list.
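For illustration only, a one-argument helper of the kind suggested could look roughly like the userspace sketch below. "katoi" is just the name floated above; katoi_sketch() is hypothetical and uses the C library's strtoul() in place of simple_strtoul().

```c
#include <stdlib.h>

/* Hypothetical one-argument wrapper; not an existing kernel helper. */
static int katoi_sketch(const char *s)
{
	/* Base 0 accepts decimal, 0x-prefixed hex and 0-prefixed octal. */
	return (int)strtoul(s, NULL, 0);
}
```

Parsing stops at the first character that is not part of the number, so a trailing newline from echo is harmless here.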

> > +   for (i = 0; enclosure_status[i]; i++) {
> > +   if (strncmp(buf, enclosure_status[i],
> > +   strlen(enclosure_status[i])) == 0 &&
> > +   buf[strlen(enclosure_status[i])] == '\n')
> > +   break;
> > +   }
> 
> So if an application does
> 
>   write(fd, "foo", 3)
> 
> it won't work?  They have to do
> 
>   write(fd, "foo\n", 4)
> 
> ?

No ... it's designed for echo; however, I'll add a check for '\0' which
will catch the write case.
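A rough sketch of the relaxed comparison, assuming the buffer handed to the store routine is NUL-terminated. The status_names[] table and match_status() are hypothetical; the real enclosure_status[] contents are not shown in this mail.

```c
#include <string.h>

/* Hypothetical status names, for illustration only. */
static const char *const status_names[] = { "ok", "critical", "unknown", NULL };

/* Return the index of the matching name, or -1 if nothing matches. */
static int match_status(const char *buf)
{
	int i;

	for (i = 0; status_names[i]; i++) {
		size_t len = strlen(status_names[i]);

		/* Accept the name followed by '\n' (echo) or '\0' (bare write). */
		if (strncmp(buf, status_names[i], len) == 0 &&
		    (buf[len] == '\n' || buf[len] == '\0'))
			return i;
	}
	return -1;
}
```

Reading buf[len] is safe because strncmp() has already matched len non-NUL characters, so the buffer is at least that long.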

> > +#define to_enclosure_device(x) container_of((x), struct enclosure_device, cdev)
> > +#define to_enclosure_component(x) container_of((x), struct enclosure_component, cdev)
> 
> These could be C functions...

OK ... I was just following precedent again, but I can make them
inlines.
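Roughly what the inline form would look like, as a userspace sketch. The trimmed-down structures and the container_of_sketch() macro are stand-ins assumed for illustration, not the kernel definitions; what the inline buys over the macro is that the argument type gets checked by the compiler.

```c
#include <stddef.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical, trimmed-down structures for illustration only. */
struct class_device_sketch {
	int id;
};

struct enclosure_device_sketch {
	int components;
	struct class_device_sketch cdev;  /* embedded member we convert from */
};

/* Typed inline: passing anything but a class_device_sketch pointer is diagnosed. */
static inline struct enclosure_device_sketch *
to_enclosure_device_sketch(struct class_device_sketch *cdev)
{
	return container_of_sketch(cdev, struct enclosure_device_sketch, cdev);
}
```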

> Nice looking driver.

Thanks,

James

---

Here's the incremental diff.

diff --git a/drivers/misc/enclosure.c b/drivers/misc/enclosure.c
index 42e6e43..6fcb0e9 100644
--- a/drivers/misc/enclosure.c
+++ b/drivers/misc/enclosure.c
@@ -39,7 +39,8