Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-05 Thread Christoph Hellwig
On Fri, Mar 02, 2007 at 03:59:42PM -0600, Mike Christie wrote:
> If scsi_execute_async is going to be limited to what is in mainline
> until I can kill it, then we may not want to merge Boaz's patch and just
> have people convert the code to use blk_get_request, blk_rq_map_kern or
> blk_rq_map_user and blk_execute_rq_nowait.

scsi_execute_async should go away as soon as we're able to fix it's
users and it should not grow additional users.  a struct scatterlist is the
wrong representation for data going down the block layer.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-05 Thread Christoph Hellwig
On Sun, Mar 04, 2007 at 12:06:36PM -0700, Dachepalli, Sudhir wrote:
> James,
> 
> How about the following:
> 
> 1. Cmd= Scsi_get_command( on physical device )  
> 2. Clone the scsi_cmnd fields of the virtual cmnd( command received by
> failover driver) to physical cmnd (the command allocated by

No.  Cloning fields of an existing scsi_cmnd will always get you into
trouble.

Can please stop this idiotic discussion now and can you and LSI Logic
please contribute on request based multipath instead of reinventing
it wrongly?

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-04 Thread Dachepalli, Sudhir
James,

How about the following:

1. Cmd= Scsi_get_command( on physical device )  
2. Clone the scsi_cmnd fields of the virtual cmnd( command received by
failover driver) to physical cmnd (the command allocated by
scsi_get_command )
3. blk_get_request()
4. fill the request fields, req->special = Cmd ( point the special field
to allocated scsi_cmnd )
5. set the request flags to REQ_DONTPREP
6. blk_execute_req_nowait()
7. once the command comes back call the blk_put_request()


Sudhir

-Original Message-
From: James Bottomley [mailto:[EMAIL PROTECTED] 
Sent: Sunday, March 04, 2007 12:14 PM
To: Dachepalli, Sudhir
Cc: Mike Christie; Benny Halevy; Jens Axboe; Boaz Harrosh;
linux-scsi@vger.kernel.org
Subject: RE: Possible bug in scsi_lib.c:scsi_req_map_sg()

On Sun, 2007-03-04 at 11:00 -0700, Dachepalli, Sudhir wrote:
> Do you think the following could work If I used blk layer functions 
> instead of "scsi_execute_async":
> 
> Blk_get_request()
> Req->flags |= REQ_DONTPREP

There's additional complexity here:  if the request isn't prepared, no
command is allocated.  You can't use the one you have because it belongs
to a different device (plus it would be freed by this request in the
completion path).

> Blk_rq_map_kern()
> Blk_execute_req_nowait()

If the request is already mapped, you don't want to be mapping it again.

> Blk_put_request()

Without seeing the code it's hard to say definitively whether all this
is correct.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-04 Thread James Bottomley
On Sun, 2007-03-04 at 11:00 -0700, Dachepalli, Sudhir wrote:
> Do you think the following could work If I used blk layer functions
> instead of "scsi_execute_async":
> 
> Blk_get_request()
> Req->flags |= REQ_DONTPREP

There's additional complexity here:  if the request isn't prepared, no
command is allocated.  You can't use the one you have because it belongs
to a different device (plus it would be freed by this request in the
completion path).

> Blk_rq_map_kern()
> Blk_execute_req_nowait()

If the request is already mapped, you don't want to be mapping it again.

> Blk_put_request() 

Without seeing the code it's hard to say definitively whether all this
is correct.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-04 Thread Dachepalli, Sudhir
Do you think the following could work If I used blk layer functions
instead of "scsi_execute_async":

Blk_get_request()
Req->flags |= REQ_DONTPREP
Blk_rq_map_kern()
Blk_execute_req_nowait()
Blk_put_request() 

Regards,
Sudhir

-Original Message-
From: Mike Christie [mailto:[EMAIL PROTECTED] 
Sent: Sunday, March 04, 2007 11:07 AM
To: James Bottomley
Cc: Dachepalli, Sudhir; Benny Halevy; Jens Axboe; Boaz Harrosh;
linux-scsi@vger.kernel.org
Subject: Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

Mike Christie wrote:
> James Bottomley wrote:
>> On Sun, 2007-03-04 at 10:21 -0600, Mike Christie wrote:
>>> I think they get around this and other request settings that need 
>>> resetting by using scsi_execute_async. They will take the command, 
>>> data direction and buffer fields from the original scsi_cmnd, then 
>>> pass those on to scsi_ececute_async which would allocate a new 
>>> request and then as you know that new request gets sent to the scsi 
>>> layer and looks like a brand new request. So I misspoke above. It 
>>> might be better to say they are using it for routing what the other 
>>> multipath layers would call cloned commands.
>> But actually, that's what I don't think they want to do.
> 
> Yeah, you are right. Proper cloning is the better way to go.

I meant routing of commands and possibly cloning like you described, not
just cloning.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-04 Thread Mike Christie
Mike Christie wrote:
> James Bottomley wrote:
>> On Sun, 2007-03-04 at 10:21 -0600, Mike Christie wrote:
>>> I think they get around this and other request settings that need
>>> resetting by using scsi_execute_async. They will take the command, data
>>> direction and buffer fields from the original scsi_cmnd, then pass those
>>> on to scsi_ececute_async which would allocate a new request and then as
>>> you know that new request gets sent to the scsi layer and looks like a
>>> brand new request. So I misspoke above. It might be better to say they
>>> are using it for routing what the other multipath layers would call
>>> cloned commands.
>> But actually, that's what I don't think they want to do.
> 
> Yeah, you are right. Proper cloning is the better way to go.

I meant routing of commands and possibly cloning like you described, not
just cloning.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-04 Thread Mike Christie
James Bottomley wrote:
> On Sun, 2007-03-04 at 10:21 -0600, Mike Christie wrote:
>> I think they get around this and other request settings that need
>> resetting by using scsi_execute_async. They will take the command, data
>> direction and buffer fields from the original scsi_cmnd, then pass those
>> on to scsi_ececute_async which would allocate a new request and then as
>> you know that new request gets sent to the scsi layer and looks like a
>> brand new request. So I misspoke above. It might be better to say they
>> are using it for routing what the other multipath layers would call
>> cloned commands.
> 
> But actually, that's what I don't think they want to do.

Yeah, you are right. Proper cloning is the better way to go.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-04 Thread James Bottomley
On Sun, 2007-03-04 at 10:21 -0600, Mike Christie wrote:
> I think they get around this and other request settings that need
> resetting by using scsi_execute_async. They will take the command, data
> direction and buffer fields from the original scsi_cmnd, then pass those
> on to scsi_ececute_async which would allocate a new request and then as
> you know that new request gets sent to the scsi layer and looks like a
> brand new request. So I misspoke above. It might be better to say they
> are using it for routing what the other multipath layers would call
> cloned commands.

But actually, that's what I don't think they want to do.
scsi_execute_async() will try to map the sg list.  What they probably
get is a fully set up SCSI command which needs to be cloned to a new
device, so the request sg setup has already been done.  So what you need
to do is something like get a request for the new device, clone the
relevant parameters, set the flags to REQ_DONTPREP and inject it into
the correct queue.  You probably also need to clone the scsi_cmnd as
well.  Finally you need to chain the completions of the old and new
commands.  This should avoid having to setup and re tear down the
requests, while not using some kludgy mechanism like special commands.

However, obviously, this is speculation about the operation of an unseen
driver, which is not incredibly profitable ...

James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-04 Thread Mike Christie
James Bottomley wrote:
> On Sun, 2007-03-04 at 09:43 -0600, Mike Christie wrote:
>> There are trying to do a scsi level multipath driver for RDAC support.
>> And they are trying to do something similar to request based multipath
>> (route requests instead of bios but in their case they are routing
>> scsi_cmnds instead of requests).
> 
> I guessed it might be something like that ... 
> 
> The SCSI driver layer wasn't designed to stack like that ... hence the
> problems of setup and teardown of the sg list.  Wouldn't it just be
> easier to make request based multi-path work?  Unfortunately, you can't
> avoid the request system because what you want to do is resubmit an I/O
> without having the mid-layer execute the prepare function (which is
> where it's mapped) so you need the REQ_DONTPREP flag.
> 

I think they get around this and other request settings that need
resetting by using scsi_execute_async. They will take the command, data
direction and buffer fields from the original scsi_cmnd, then pass those
on to scsi_ececute_async which would allocate a new request and then as
you know that new request gets sent to the scsi layer and looks like a
brand new request. So I misspoke above. It might be better to say they
are using it for routing what the other multipath layers would call
cloned commands.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-04 Thread James Bottomley
On Sun, 2007-03-04 at 09:43 -0600, Mike Christie wrote:
> There are trying to do a scsi level multipath driver for RDAC support.
> And they are trying to do something similar to request based multipath
> (route requests instead of bios but in their case they are routing
> scsi_cmnds instead of requests).

I guessed it might be something like that ... 

The SCSI driver layer wasn't designed to stack like that ... hence the
problems of setup and teardown of the sg list.  Wouldn't it just be
easier to make request based multi-path work?  Unfortunately, you can't
avoid the request system because what you want to do is resubmit an I/O
without having the mid-layer execute the prepare function (which is
where it's mapped) so you need the REQ_DONTPREP flag.

James


James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-04 Thread Mike Christie
James Bottomley wrote:
> On Sun, 2007-03-04 at 00:36 -0700, Dachepalli, Sudhir wrote:
>> Our driver gets called in with the following fashion through the
>> queuecommand.
>>
>> scsi_request_fn() -> scsi_dispatch_cmd() -> rtn =
>> host->hostt->queuecommand(cmd, scsi_done);
>>
>> We are using the "cmd" ( scsi_cmnd) as a pass through with out touching
>> the "request_buffer" and "request_bufflen".
>>
>> We do not allocate memory similar to sg or st for page allocations.
>>
>> The request_buffer should already contain the scatter gather list built.
> 
> We're trying not to wrapper pass throughs in SCSI commands (unless
> defined by standard like the ATA ones).  However, I'm not entirely sure
> what you're trying to do ... can you post a link to the driver so that
> we can see if there's a better way?
> 

There are trying to do a scsi level multipath driver for RDAC support.
And they are trying to do something similar to request based multipath
(route requests instead of bios but in their case they are routing
scsi_cmnds instead of requests).
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-04 Thread James Bottomley
On Sun, 2007-03-04 at 00:36 -0700, Dachepalli, Sudhir wrote:
> Our driver gets called in with the following fashion through the
> queuecommand.
> 
> scsi_request_fn() -> scsi_dispatch_cmd() -> rtn =
> host->hostt->queuecommand(cmd, scsi_done);
> 
> We are using the "cmd" ( scsi_cmnd) as a pass through with out touching
> the "request_buffer" and "request_bufflen".
> 
> We do not allocate memory similar to sg or st for page allocations.
> 
> The request_buffer should already contain the scatter gather list built.

We're trying not to wrapper pass throughs in SCSI commands (unless
defined by standard like the ATA ones).  However, I'm not entirely sure
what you're trying to do ... can you post a link to the driver so that
we can see if there's a better way?

Thanks,

James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-03 Thread Dachepalli, Sudhir
Mike ,

Our driver gets called in with the following fashion through the
queuecommand.

scsi_request_fn() -> scsi_dispatch_cmd() -> rtn =
host->hostt->queuecommand(cmd, scsi_done);

We are using the "cmd" ( scsi_cmnd) as a pass through with out touching
the "request_buffer" and "request_bufflen".

We do not allocate memory similar to sg or st for page allocations.

The request_buffer should already contain the scatter gather list built.


Regards,
Sudhir 

-Original Message-
From: Mike Christie [mailto:[EMAIL PROTECTED] 
Sent: Saturday, March 03, 2007 6:04 PM
To: Dachepalli, Sudhir
Cc: Benny Halevy; Jens Axboe; Boaz Harrosh; linux-scsi@vger.kernel.org;
James Bottomley
Subject: Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

Dachepalli, Sudhir wrote:
> Where is the depricated warning that you mentioned about ?
> I tried to look in scsi_lib.c and scsi_device.h
> 

I meant in the first versions of the patches there was a warning. Did
you also try the workarounds mentioned in the bugzilla, to allocate
memory like sg and st?
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-03 Thread Mike Christie
Dachepalli, Sudhir wrote:
> Where is the depricated warning that you mentioned about ?
> I tried to look in scsi_lib.c and scsi_device.h
> 

I meant in the first versions of the patches there was a warning. Did
you also try the workarounds mentioned in the bugzilla, to allocate
memory like sg and st?
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-02 Thread Dachepalli, Sudhir
Mike,

I applied "patch 1" posted by Boaz Harrosh to my SLES 10 kernel.
So far IO's are running for 45 minutes which is the longest in my
numerous re-tries.


We were not aware that "scsi_execute_async" was temperory workaround.

Where is the depricated warning that you mentioned about ?
I tried to look in scsi_lib.c and scsi_device.h


Regards,
Sudhir 

-Original Message-
From: Mike Christie [mailto:[EMAIL PROTECTED] 
Sent: Friday, March 02, 2007 4:00 PM
To: Dachepalli, Sudhir
Cc: Benny Halevy; Jens Axboe; Boaz Harrosh; linux-scsi@vger.kernel.org;
James Bottomley
Subject: Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

Mike Christie wrote:
> Dachepalli, Sudhir wrote:
>> scsi_req_map_sg::i=2,len=1024,data_len=3072,off=2048,PAGE_SIZE=4096,b
>> yte
>> s=1024,nr_vecs=0, nr_pages=0
> 
> 
>> if (bio_add_pc_page(q, bio, page, bytes, off)
!=
>> bytes) {
>> printk("scsi_req_map_sg:: calling 
>> bio_put \n");
>>  
>> printk("scsi_req_map_sg::i=%d,len=%d,data_len=%d,off=%d,PAGE_SIZE=%ld
>> ,by
>> tes=%d,nr_vecs=%d, nr_pages=%d\n",
>>  
>> i,len,data_len,off,PAGE_SIZE,bytes,nr_vecs,nr_pages);
>> if( bio->bi_io_vec == NULL )
> 
> 
> I think Boaz's first patch in this thread that counts the offsets 
> correctly should be merged.

Just one clarification.

When I wrote scsi_execute_async, it was meant as a temp hack so we would
kill scsi_request and fix scatterlists for drivers like iscsi_tcp and
ib_iser, but in the original patches and the patches I am sending now I
modify the blk helpers and have sg use them directly.

scsi_execute_async was meant to be temporary only supported what sg and
st and other mainline drivers were sending at the time, so it did not
support something like:

sg[0].offset 0;
sg[0].length = 4096;
sg[1].offset = 1024;
sg[1].offset = 3072;

because sg and st can only have a offset for the first sg element
(offsets for the sg element is supported).

If we are going to support whatever scsi_do_req supported the we should
merge Boaz's patch.

If scsi_execute_async is going to be limited to what is in mainline
until I can kill it, then we may not want to merge Boaz's patch and just
have people convert the code to use blk_get_request, blk_rq_map_kern or
blk_rq_map_user and blk_execute_rq_nowait.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-02 Thread Mike Christie
Mike Christie wrote:
> Dachepalli, Sudhir wrote:
>> scsi_req_map_sg::i=2,len=1024,data_len=3072,off=2048,PAGE_SIZE=4096,byte
>> s=1024,nr_vecs=0, nr_pages=0
> 
> 
>> if (bio_add_pc_page(q, bio, page, bytes, off) !=
>> bytes) {
>> printk("scsi_req_map_sg:: calling
>> bio_put \n");
>>  
>> printk("scsi_req_map_sg::i=%d,len=%d,data_len=%d,off=%d,PAGE_SIZE=%ld,by
>> tes=%d,nr_vecs=%d, nr_pages=%d\n",
>>  
>> i,len,data_len,off,PAGE_SIZE,bytes,nr_vecs,nr_pages);
>> if( bio->bi_io_vec == NULL )
> 
> 
> I think Boaz's first patch in this thread that counts the offsets
> correctly should be merged.

Just one clarification.

When I wrote scsi_execute_async, it was meant as a temp hack so we would
kill scsi_request and fix scatterlists for drivers like iscsi_tcp and
ib_iser, but in the original patches and the patches I am sending now I
modify the blk helpers and have sg use them directly.

scsi_execute_async was meant to be temporary only supported what sg and
st and other mainline drivers were sending at the time, so it did not
support something like:

sg[0].offset 0;
sg[0].length = 4096;
sg[1].offset = 1024;
sg[1].offset = 3072;

because sg and st can only have a offset for the first sg element
(offsets for the sg element is supported).

If we are going to support whatever scsi_do_req supported the we should
merge Boaz's patch.

If scsi_execute_async is going to be limited to what is in mainline
until I can kill it, then we may not want to merge Boaz's patch and just
have people convert the code to use blk_get_request, blk_rq_map_kern or
blk_rq_map_user and blk_execute_rq_nowait.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-02 Thread Mike Christie
Dachepalli, Sudhir wrote:
> scsi_req_map_sg::i=2,len=1024,data_len=3072,off=2048,PAGE_SIZE=4096,byte
> s=1024,nr_vecs=0, nr_pages=0


> if (bio_add_pc_page(q, bio, page, bytes, off) !=
> bytes) {
> printk("scsi_req_map_sg:: calling
> bio_put \n");
>  
> printk("scsi_req_map_sg::i=%d,len=%d,data_len=%d,off=%d,PAGE_SIZE=%ld,by
> tes=%d,nr_vecs=%d, nr_pages=%d\n",
>  
> i,len,data_len,off,PAGE_SIZE,bytes,nr_vecs,nr_pages);
> if( bio->bi_io_vec == NULL )


I think Boaz's first patch in this thread that counts the offsets
correctly should be merged.

I am not sure about the second one. If allocating a bio with zero vecs
is valid, then I guess the patch should be merged. bio_alloc_bioset
looks like it allows this, but people probably never do it (just when
they hit bugs like this one :)). But I think if we are counting segments
correctly we will not need that patch for this problem will we Boaz?
Boaz, maybe you could also send the first patch in a seperate mail so
that it can be merged (I do not think James wants to cut and paste two
patches from one mail).

Also, you do not want to use scsi_execute_async. I am trying to kill it.
The original patches, never had that function and the later patches had
a warning that you should not use it and that you should just use the
request and the blk_rq* helpers directly.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: Possible bug in scsi_lib.c:scsi_req_map_sg()

2007-03-02 Thread Dachepalli, Sudhir
err = -ENOMEM;
goto free_bios;
}
bio->bi_end_io = scsi_bi_endio;
}

if (bio_add_pc_page(q, bio, page, bytes, off) !=
bytes) {
printk("scsi_req_map_sg:: calling
bio_put \n");
 
printk("scsi_req_map_sg::i=%d,len=%d,data_len=%d,off=%d,PAGE_SIZE=%ld,by
tes=%d,nr_vecs=%d, nr_pages=%d\n",
 
i,len,data_len,off,PAGE_SIZE,bytes,nr_vecs,nr_pages);
if( bio->bi_io_vec == NULL )
printk("scsi_req_map_sg:: bio->bi_io_vec is NULL\n");
bio_put(bio);
err = -EINVAL;
goto free_bios;
}

if (bio->bi_vcnt >= nr_vecs) {
err = scsi_merge_bio(rq, bio);
if (err) {
bio_endio(bio, bio->bi_size, 0);
goto free_bios;
}
bio = NULL;
}

page++;
len -= bytes;
off = 0;
}
}

rq->buffer = rq->data = NULL;
rq->data_len = data_len;
return 0;

free_bios:
while ((bio = rq->bio) != NULL) {
rq->bio = bio->bi_next;
/*
 * call endio instead of bio_put incase it was bounced
 */
bio_endio(bio, bio->bi_size, 0);
}

return err;
}






regards 
Sudhir Dachepalli  

-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Benny Halevy
Sent: Wednesday, November 29, 2006 3:30 AM
To: Jens Axboe
Cc: Mike Christie; Boaz Harrosh; linux-scsi@vger.kernel.org; James
Bottomley
Subject: Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

Jens Axboe wrote:
> On Mon, Nov 27 2006, Mike Christie wrote:
>> Mike Christie wrote:
>>> Boaz Harrosh wrote:
>>>> Playing with some tests which I admit are not 100% orthodox I have 
>>>> stumbled upon a bug that raises a serious question:
>>>>
>>>> In the call to scsi_execute_async() in the use_sg case, must the
>>>> scatterlist* (pointed to by buffer) map a buffer that's contiguous 
>>>> in virtual memory or is it allowed to map disjoint segments of
memory?
>>> I thought they were continguous. I think James has said before that 
>>> they can be disjoint. When we converted sg it did not look like sg 
>>> or st supported disjoint. The main non dio path used a buffer from 
>>> get_free_pages so I thought that would always be contiguous. The dio

>>> path then always set the first sg offset, but the rest it set to
zero.
>> And the len is set to page size for the middle entries too.
>>
>> But for the non DIO st path we can end up with some middle sg entires

>> that are not a full page so that code in scsi_execute_async is broken

>> for that.
> 
> If something doesn't work with non-contig sg entries, that would be a 
> bug. If the question is regarding holes in the sg list, that is 
> probably unchartered territory and I would not regard that as
supported.
> 

Jens, I'm not sure I understand the terms you used.  Can you please
define more clearly what you mean by "non-contig sg entries" vs.
"holes in the sg list"?

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED] More majordomo info
at  http://vger.kernel.org/majordomo-info.html
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-29 Thread Benny Halevy

Jens Axboe wrote:

On Mon, Nov 27 2006, Mike Christie wrote:

Mike Christie wrote:

Boaz Harrosh wrote:

Playing with some tests which I admit are not 100% orthodox I have
stumbled upon a bug that raises a serious question:

In the call to scsi_execute_async() in the use_sg case, must the
scatterlist* (pointed to by buffer) map a buffer that's contiguous in
virtual memory or is it allowed to map disjoint segments of memory?

I thought they were continguous. I think James has said before that they
can be disjoint. When we converted sg it did not look like sg or st
supported disjoint. The main non dio path used a buffer from
get_free_pages so I thought that would always be contiguous. The dio
path then always set the first sg offset, but the rest it set to zero.

And the len is set to page size for the middle entries too.

But for the non DIO st path we can end up with some middle sg entires
that are not a full page so that code in scsi_execute_async is broken
for that.


If something doesn't work with non-contig sg entries, that would be a
bug. If the question is regarding holes in the sg list, that is probably
unchartered territory and I would not regard that as supported.



Jens, I'm not sure I understand the terms you used.  Can you please
define more clearly what you mean by "non-contig sg entries" vs.
"holes in the sg list"?

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-28 Thread Jens Axboe
On Mon, Nov 27 2006, Mike Christie wrote:
> Mike Christie wrote:
> > Boaz Harrosh wrote:
> >> Playing with some tests which I admit are not 100% orthodox I have
> >> stumbled upon a bug that raises a serious question:
> >>
> >> In the call to scsi_execute_async() in the use_sg case, must the
> >> scatterlist* (pointed to by buffer) map a buffer that's contiguous in
> >> virtual memory or is it allowed to map disjoint segments of memory?
> > 
> > I thought they were continguous. I think James has said before that they
> > can be disjoint. When we converted sg it did not look like sg or st
> > supported disjoint. The main non dio path used a buffer from
> > get_free_pages so I thought that would always be contiguous. The dio
> > path then always set the first sg offset, but the rest it set to zero.
> 
> And the len is set to page size for the middle entries too.
> 
> But for the non DIO st path we can end up with some middle sg entires
> that are not a full page so that code in scsi_execute_async is broken
> for that.

If something doesn't work with non-contig sg entries, that would be a
bug. If the question is regarding holes in the sg list, that is probably
unchartered territory and I would not regard that as supported.

-- 
Jens Axboe

-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-28 Thread Mike Christie
Boaz Harrosh wrote:
> Mike Christie wrote:
>> Boaz Harrosh wrote:
>>> Playing with some tests which I admit are not 100% orthodox I have
>>> stumbled upon a bug that raises a serious question:
>>>
>>> In the call to scsi_execute_async() in the use_sg case, must the
>>> scatterlist* (pointed to by buffer) map a buffer that's contiguous in
>>> virtual memory or is it allowed to map disjoint segments of memory?
>>
>> I thought they were continguous. I think James has said before that they
>> can be disjoint. When we converted sg it did not look like sg or st
>> supported disjoint. The main non dio path used a buffer from
>> get_free_pages so I thought that would always be contiguous. The dio
>> path then always set the first sg offset, but the rest it set to zero.
>>
>> How did you hit this problem? Is it with sg or st, or with some other
>> code? Is it the mmap path maybe?
> 
> OK I admit, guilty as charged, I was using it from a kernel driver,
> OSD-Initiator from IBM. The code is unorthodox in mapping user space
> iovects into scatterlist*. I will have to work around it than.

Well, you do not have to work around it :)

I want to kill scsi_execute_async and just allow the ULDs to allocate a
request, call blk_rq_map_* (and add any new map helpers we need), then
call blk_execute_rq_nowait. This gives the ULDs some flexibility and
kills my ugly function. This is what I originally did here
http://marc.theaimsgroup.com/?l=linux-scsi&m=112356952007369&w=2

For some reason, I flip flopped and went with scsi_execute_async and the
scatterlist argument hack. I think I did this because I thought it would
be less problems in converting the ULDs in stages. First stage was to
remove scsi_request usage and clean/fix up scsi-ml and LLDs, next would
be to convert to block layer functions directly, but looking back it
might have been better to just go through one big headache.

I think Christoph Hellwig has patches to remove scsi_execute_async as
part of his bidi work. He needs help testing and reviewing them, so you
should help him out instead of working around it :)
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-28 Thread Boaz Harrosh

Mike Christie wrote:

Boaz Harrosh wrote:

Playing with some tests which I admit are not 100% orthodox I have
stumbled upon a bug that raises a serious question:

In the call to scsi_execute_async() in the use_sg case, must the
scatterlist* (pointed to by buffer) map a buffer that's contiguous in
virtual memory or is it allowed to map disjoint segments of memory?


I thought they were continguous. I think James has said before that they
can be disjoint. When we converted sg it did not look like sg or st
supported disjoint. The main non dio path used a buffer from
get_free_pages so I thought that would always be contiguous. The dio
path then always set the first sg offset, but the rest it set to zero.

How did you hit this problem? Is it with sg or st, or with some other
code? Is it the mmap path maybe?


OK I admit, guilty as charged, I was using it from a kernel driver, OSD-Initiator from 
IBM. The code is unorthodox in mapping user space iovects into scatterlist*. I will have 
to work around it than. Such a petty because it saves me a copy of an high bandwidth 
channel. with iSCSI the fix works well but I guess if the working assumption was 
"contiguous", than allowing it here will expose problems in drivers that don't 
expect it.

In any way the bio.c patch should go in. 1. Zero vecs bio cannot be freed with 
current code 2. It lets kernel exit gracefully with an error instead of a crash.

should we at least do below patch so people know what happened postmortem:

diff -Npu /tmp/tmp.5864.0 
/home/bharrosh/p4.local/local/scsi-misc-2.6-dev/linux/drivers/scsi/scsi_lib.c 
-L a/scsi_lib.c -L b/scsi_lib.c
--- a/scsi_lib.c
+++ b/scsi_lib.c
@@ -321,6 +321,9 @@ static int scsi_req_map_sg(struct reques
nr_vecs = min_t(int, BIO_MAX_PAGES, nr_pages);
nr_pages -= nr_vecs;

+/* most probably not a contiguous memory mapping */ 
+BUGON(!nr_vecs); 
+

bio = bio_alloc(gfp, nr_vecs);
if (!bio) {
err = -ENOMEM;


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-27 Thread James Bottomley
On Mon, 2006-11-27 at 13:13 -0600, Mike Christie wrote:
> I thought they were continguous. I think James has said before that
> they
> can be disjoint. When we converted sg it did not look like sg or st
> supported disjoint. The main non dio path used a buffer from
> get_free_pages so I thought that would always be contiguous. The dio
> path then always set the first sg offset, but the rest it set to zero.

For ordinary mappings, they're contiguous ... there's an edge case where
you go through the deprecated buffer cache you can end up with multiple
sg elements that are 512 bytes in length, but this probably isn't worth
worrying about any more.

James


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-27 Thread Kai Makisara
On Mon, 27 Nov 2006, Mike Christie wrote:

> Mike Christie wrote:
> > Boaz Harrosh wrote:
> >> Playing with some tests which I admit are not 100% orthodox I have
> >> stumbled upon a bug that raises a serious question:
> >>
> >> In the call to scsi_execute_async() in the use_sg case, must the
> >> scatterlist* (pointed to by buffer) map a buffer that's contiguous in
> >> virtual memory or is it allowed to map disjoint segments of memory?
> > 
> > I thought they were continguous. I think James has said before that they
> > can be disjoint. When we converted sg it did not look like sg or st
> > supported disjoint. The main non dio path used a buffer from
> > get_free_pages so I thought that would always be contiguous. The dio
> > path then always set the first sg offset, but the rest it set to zero.
> 
> And the len is set to page size for the middle entries too.
> 
> But for the non DIO st path we can end up with some middle sg entires
> that are not a full page so that code in scsi_execute_async is broken
> for that.

I don't think so. Only the last sg entry may have length != n * (page size) 
when the driver buffer is used.

-- 
Kai
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-27 Thread Mike Christie
Mike Christie wrote:
> Boaz Harrosh wrote:
>> Playing with some tests which I admit are not 100% orthodox I have
>> stumbled upon a bug that raises a serious question:
>>
>> In the call to scsi_execute_async() in the use_sg case, must the
>> scatterlist* (pointed to by buffer) map a buffer that's contiguous in
>> virtual memory or is it allowed to map disjoint segments of memory?
> 
> I thought they were continguous. I think James has said before that they
> can be disjoint. When we converted sg it did not look like sg or st
> supported disjoint. The main non dio path used a buffer from
> get_free_pages so I thought that would always be contiguous. The dio
> path then always set the first sg offset, but the rest it set to zero.

And the len is set to page size for the middle entries too.

But for the non DIO st path we can end up with some middle sg entires
that are not a full page so that code in scsi_execute_async is broken
for that.
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-27 Thread Mike Christie
Boaz Harrosh wrote:
> Playing with some tests which I admit are not 100% orthodox I have
> stumbled upon a bug that raises a serious question:
> 
> In the call to scsi_execute_async() in the use_sg case, must the
> scatterlist* (pointed to by buffer) map a buffer that's contiguous in
> virtual memory or is it allowed to map disjoint segments of memory?

I thought they were continguous. I think James has said before that they
can be disjoint. When we converted sg it did not look like sg or st
supported disjoint. The main non dio path used a buffer from
get_free_pages so I thought that would always be contiguous. The dio
path then always set the first sg offset, but the rest it set to zero.

How did you hit this problem? Is it with sg or st, or with some other
code? Is it the mmap path maybe?
-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Possible bug in scsi_lib.c:scsi_req_map_sg()

2006-11-27 Thread Boaz Harrosh

Playing with some tests which I admit are not 100% orthodox I have stumbled 
upon a bug that raises a serious question:

In the call to scsi_execute_async() in the use_sg case, must the scatterlist* 
(pointed to by buffer) map a buffer that's contiguous in virtual memory or is 
it allowed to map disjoint segments of memory?

In scsi_req_map_sg() nr_pages is calculated like this:
int nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT;

this calculation assumes that only the first page can have a non zero offset so 
disjoint memory segments pointed by sgl will fail mapping in some case when we 
do not allocate enough nr_pages for them.

Please consider below patch for a fix for such cases. With this patch my tests 
pass, including booting from a SATA drive (with a 2.6.18 code base).

Without the patch the kernel fails really badly. What happens is that 
bio_add_pc_page(..) fails and then inside bio_put() and later bio_free() at the call to 
mempool_free(bio->bi_io_vec, bio_set->bvec_pools[pool_idx]), bio->bi_io_vec == 
NULL and the kernel panics.

bio->bi_io_vec is set to NULL in bio_alloc_bioset() when nr_iovecs == 0.  As this 
seems like a valid use case bio_free() should free bio->bi_io_vec only if 
bio->bi_io_vec != NULL (see second patch).

Both patches are based off of scsi-misc-2.6.



Signed-off-by: Boaz Harrosh <[EMAIL PROTECTED]>
Signed-off-by: Benny Halevy <[EMAIL PROTECTED]>

 //depot/local/scsi-misc-2.6-dev/linux/drivers/scsi/scsi_lib.c#1 - 
/home/bharrosh/p4.local/local/scsi-misc-2.6-dev/linux/drivers/scsi/scsi_lib.c 

diff -Npu /tmp/tmp.20230.0 
/home/bharrosh/p4.local/local/scsi-misc-2.6-dev/linux/drivers/scsi/scsi_lib.c 
-L a/drivers/scsi/scsi_lib.c -L b/drivers/scsi/scsi_lib.c
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -299,16 +299,20 @@ static int scsi_bi_endio(struct bio *bio
 * sent to use, as some ULDs use that struct to only organize the pages.
 */
static int scsi_req_map_sg(struct request *rq, struct scatterlist *sgl,
-  int nsegs, unsigned bufflen, gfp_t gfp)
+  int nsegs, gfp_t gfp)
{
struct request_queue *q = rq->q;
-   int nr_pages = (bufflen + sgl[0].offset + PAGE_SIZE - 1) >> PAGE_SHIFT;
+   int nr_pages = 0;
unsigned int data_len = 0, len, bytes, off;
struct page *page;
struct bio *bio = NULL;
int i, err, nr_vecs = 0;

for (i = 0; i < nsegs; i++) {
+   nr_pages += (sgl[i].length + sgl[i].offset + PAGE_SIZE - 1) >> 
PAGE_SHIFT;
+   }
+
+   for (i = 0; i < nsegs; i++) {
page = sgl[i].page;
off = sgl[i].offset;
len = sgl[i].length;
@@ -402,7 +406,7 @@ int scsi_execute_async(struct scsi_devic
req->cmd_flags |= REQ_QUIET;

if (use_sg)
-   err = scsi_req_map_sg(req, buffer, use_sg, bufflen, gfp);
+   err = scsi_req_map_sg(req, buffer, use_sg, gfp);
else if (bufflen)
err = blk_rq_map_kern(req->q, req, buffer, bufflen, gfp);




Signed-off-by: Boaz Harrosh <[EMAIL PROTECTED]>
Signed-off-by: Benny Halevy <[EMAIL PROTECTED]>

diff --git a/fs/bio.c b/fs/bio.c
index f95c874..199b929 100644
--- a/fs/bio.c
+++ b/fs/bio.c
@@ -113,7 +113,8 @@ void bio_free(struct bio *bio, struct bi

BIO_BUG_ON(pool_idx >= BIOVEC_NR_POOLS);

-   mempool_free(bio->bi_io_vec, bio_set->bvec_pools[pool_idx]);
+   if (likely(bio->bi_io_vec))
+   mempool_free(bio->bi_io_vec, bio_set->bvec_pools[pool_idx]);
mempool_free(bio, bio_set->bio_pool);
}


-
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html