[PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Wang Sen
When using the commands below to write some data to a virtio-scsi LUN of the 
QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will crash.

# sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
# sudo mount /dev/sdb /mnt
# dd if=/dev/zero of=/mnt/file bs=1M count=1024

In current implementation, sg_set_buf is called to add buffers to sg list which
is put into the virtqueue eventually. But there are some HighMem pages in 
table->sgl can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
return NULL value. This will cause QEMU exit when virtqueue_map_sg is called 
in QEMU because an invalid GPA is passed by virtqueue.

My solution is using sg_set_page instead of sg_set_buf.

I have tested the patch on my workstation. QEMU would not crash any more.

Signed-off-by: Wang Sen 
---
 drivers/scsi/virtio_scsi.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 1b38431..fc5c88a 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -198,7 +198,8 @@ static void virtscsi_map_sgl(struct scatterlist *sg, 
unsigned int *p_idx,
int i;
 
for_each_sg(table->sgl, sg_elem, table->nents, i)
-   sg_set_buf(&sg[idx++], sg_virt(sg_elem), sg_elem->length);
+   sg_set_page(&sg[idx++], sg_page(sg_elem), sg_elem->length,
+   sg_elem->offset);
 
*p_idx = idx;
 }
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Paolo Bonzini
Il 25/07/2012 10:29, Wang Sen ha scritto:
> When using the commands below to write some data to a virtio-scsi LUN of the 
> QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will crash.
> 
>   # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
>   # sudo mount /dev/sdb /mnt
>   # dd if=/dev/zero of=/mnt/file bs=1M count=1024
> 
> In current implementation, sg_set_buf is called to add buffers to sg list 
> which
> is put into the virtqueue eventually. But there are some HighMem pages in 
> table->sgl can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
> return NULL value. This will cause QEMU exit when virtqueue_map_sg is called 
> in QEMU because an invalid GPA is passed by virtqueue.

Heh, I was compiling (almost) the same patch as we speak. :)

I've never seen QEMU crash; the VM would more likely just fail to boot
with a panic.  But it's the same bug anyway.

> My solution is using sg_set_page instead of sg_set_buf.
> 
> I have tested the patch on my workstation. QEMU would not crash any more.
> 
> Signed-off-by: Wang Sen 
> ---
>  drivers/scsi/virtio_scsi.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 1b38431..fc5c88a 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -198,7 +198,8 @@ static void virtscsi_map_sgl(struct scatterlist *sg, 
> unsigned int *p_idx,
>   int i;
>  
>   for_each_sg(table->sgl, sg_elem, table->nents, i)
> - sg_set_buf(&sg[idx++], sg_virt(sg_elem), sg_elem->length);
> + sg_set_page(&sg[idx++], sg_page(sg_elem), sg_elem->length,
> + sg_elem->offset);

This can simply be

   sg[idx++] = *sg_elem;

Can you repost it with this change, and also add sta...@vger.kernel.org
to the Cc?  Thanks very much!

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh
On 07/25/2012 11:44 AM, Paolo Bonzini wrote:

> Il 25/07/2012 10:29, Wang Sen ha scritto:
>> When using the commands below to write some data to a virtio-scsi LUN of the 
>> QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will 
>> crash.
>>
>>  # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
>>  # sudo mount /dev/sdb /mnt
>>  # dd if=/dev/zero of=/mnt/file bs=1M count=1024
>>
>> In current implementation, sg_set_buf is called to add buffers to sg list 
>> which
>> is put into the virtqueue eventually. But there are some HighMem pages in 
>> table->sgl can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
>> return NULL value. This will cause QEMU exit when virtqueue_map_sg is called 
>> in QEMU because an invalid GPA is passed by virtqueue.
> 
> Heh, I was compiling (almost) the same patch as we speak. :)
> 
> I've never seen QEMU crash; the VM would more likely just fail to boot
> with a panic.  But it's the same bug anyway.
> 
>> My solution is using sg_set_page instead of sg_set_buf.
>>
>> I have tested the patch on my workstation. QEMU would not crash any more.
>>
>> Signed-off-by: Wang Sen 
>> ---
>>  drivers/scsi/virtio_scsi.c |3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
>> index 1b38431..fc5c88a 100644
>> --- a/drivers/scsi/virtio_scsi.c
>> +++ b/drivers/scsi/virtio_scsi.c
>> @@ -198,7 +198,8 @@ static void virtscsi_map_sgl(struct scatterlist *sg, 
>> unsigned int *p_idx,
>>  int i;
>>  
>>  for_each_sg(table->sgl, sg_elem, table->nents, i)
>> -sg_set_buf(&sg[idx++], sg_virt(sg_elem), sg_elem->length);
>> +sg_set_page(&sg[idx++], sg_page(sg_elem), sg_elem->length,
>> +sg_elem->offset);
> 
> This can simply be
> 
>sg[idx++] = *sg_elem;
> 
> Can you repost it with this change, and also add sta...@vger.kernel.org
> to the Cc?  Thanks very much!
> 


No! Please use sg_set_page()! Look at sg_set_page(), which calls 
sg_assign_page().
It has all these jump over chained arrays. When you'll start using long
sg_lists (which you should) then jumping from chain to chain must go through
sg_page(sg_elem) && sg_assign_page(), As in the original patch.

Thanks
Boaz

> Paolo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Paolo Bonzini
Il 25/07/2012 11:22, Boaz Harrosh ha scritto:
>>> >>  for_each_sg(table->sgl, sg_elem, table->nents, i)
>>> >> -sg_set_buf(&sg[idx++], sg_virt(sg_elem), 
>>> >> sg_elem->length);
>>> >> +sg_set_page(&sg[idx++], sg_page(sg_elem), 
>>> >> sg_elem->length,
>>> >> +sg_elem->offset);
>> > 
>> > This can simply be
>> > 
>> >sg[idx++] = *sg_elem;
>> > 
>> > Can you repost it with this change, and also add sta...@vger.kernel.org
>> > to the Cc?  Thanks very much!
>> > 
> 
> No! Please use sg_set_page()! Look at sg_set_page(), which calls 
> sg_assign_page().
> It has all these jump over chained arrays. When you'll start using long
> sg_lists (which you should) then jumping from chain to chain must go through
> sg_page(sg_elem) && sg_assign_page(), As in the original patch.

Hi Boaz,

actually it seems to me that using sg_set_page is wrong, because it will
not copy the end marker from table->sgl to sg[].  If something chained
the sg[] scatterlist onto something else, sg_next's test for sg_is_last
would go beyond the table->nents-th item and access invalid memory.

Using chained sglists is on my to-do list, I expect that it would make a
nice performance improvement.  However, I was a bit confused as to
what's the plan there; there is hardly any user, and many arches still
do not define ARCH_HAS_SG_CHAIN.  Do you have any pointer to discussions
or LWN articles?

I would need to add support for the long sglists to virtio; this is not
a problem, but in the past Rusty complained that long sg-lists are not
well suited to virtio (which would like to add elements not just at the
beginning of a given sglist, but also at the end).  It seems to me that
virtio would prefer to work with a struct scatterlist ** rather than a
long sglist.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Rolf Eike Beer

Am 25.07.2012 10:29, schrieb Wang Sen:
When using the commands below to write some data to a virtio-scsi LUN 
of the

QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu
will crash.

# sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
# sudo mount /dev/sdb /mnt
# dd if=/dev/zero of=/mnt/file bs=1M count=1024

In current implementation, sg_set_buf is called to add buffers to sg
list which
is put into the virtqueue eventually.


The next sentence is somehow broken:


But there are some HighMem pages in
table->sgl can not get virtual address by sg_virt.


Maybe something like "But _if_ there are ... _you_ can not get ..."?

Eike
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Stefan Hajnoczi
On Wed, Jul 25, 2012 at 04:00:19PM +0800, Wang Sen wrote:
> When using the commands below to write some data to a virtio-scsi LUN of the 
> QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will crash.
> 
>   # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
>   # sudo mount /dev/sdb /mnt
>   # dd if=/dev/zero of=/mnt/file bs=1M count=1024
> 
> In current implementation, sg_set_buf is called to add buffers to sg list 
> which
> is put into the virtqueue eventually. But there are some HighMem pages in 
> table->sgl can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
> return NULL value. This will cause QEMU exit when virtqueue_map_sg is called 
> in QEMU because an invalid GPA is passed by virtqueue.
> 
> My solution is using sg_set_page instead of sg_set_buf.
> 
> I have tested the patch on my workstation. QEMU would not crash any more.
> 
> Signed-off-by: Wang Sen 
> ---
>  drivers/scsi/virtio_scsi.c |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Reviewed-by: Stefan Hajnoczi 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: 'Device not ready' issue on mpt2sas since 3.1.10

2012-07-25 Thread Reddy, Sreekanth
Hi,

We have done some analysis on this issue. From our analysis we observed that, 
this issue is reproducible on kernel 3.1.10 onwards but in 3.0.36 this issue is 
not reproducible. So, we have taken the mpt2sas code from 3.1.10 kernel and 
compiled and run it on 3.0.36 kernel. Here this issue is not reproducible (i.e. 
it is working fine). 

From 3.0.36 kernel onwards we have not added any patches that will cause this 
issue. So, what I mean to say is "this issue is not because of mpt2sas driver".

Regards,
Sreekanth.  

> -Original Message-
> From: linux-scsi-ow...@vger.kernel.org [mailto:linux-scsi-
> ow...@vger.kernel.org] On Behalf Of Matthias Prager
> Sent: Wednesday, July 25, 2012 3:34 AM
> To: Tejun Heo
> Cc: Robert Trace; linux-scsi@vger.kernel.org; Jens Axboe; Moore, Eric;
> James E.J. Bottomley; Alan; Darrick J. Wong; Matthias Prager
> Subject: Re: 'Device not ready' issue on mpt2sas since 3.1.10
> 
> Hello everyone,
> 
> I retested with a new firmware (P14 - released today), since it
> contains
> a bunch of sata and SATL fixes (according to the changelog).
> Unfortunately the observed behavior is unchanged (tested on a 3.4.5
> kernel).
> 
> Just wanted to let everyone know.
> 
> Cheers
> Matthias
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
> in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Stefan Hajnoczi
On Wed, Jul 25, 2012 at 10:44:14AM +0200, Paolo Bonzini wrote:
> Il 25/07/2012 10:29, Wang Sen ha scritto:
> > When using the commands below to write some data to a virtio-scsi LUN of 
> > the 
> > QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will 
> > crash.
> > 
> > # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
> > # sudo mount /dev/sdb /mnt
> > # dd if=/dev/zero of=/mnt/file bs=1M count=1024
> > 
> > In current implementation, sg_set_buf is called to add buffers to sg list 
> > which
> > is put into the virtqueue eventually. But there are some HighMem pages in 
> > table->sgl can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
> > return NULL value. This will cause QEMU exit when virtqueue_map_sg is 
> > called 
> > in QEMU because an invalid GPA is passed by virtqueue.
> 
> Heh, I was compiling (almost) the same patch as we speak. :)
> 
> I've never seen QEMU crash; the VM would more likely just fail to boot
> with a panic.  But it's the same bug anyway.

It's not a segfault "crash", I think it hits an abort(3) in QEMU's
virtio code when trying to map an invalid guest physical address.

Stefan

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Sen Wang
2012/7/25 Paolo Bonzini :
> Il 25/07/2012 10:29, Wang Sen ha scritto:
>> When using the commands below to write some data to a virtio-scsi LUN of the
>> QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will 
>> crash.
>>
>>   # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
>>   # sudo mount /dev/sdb /mnt
>>   # dd if=/dev/zero of=/mnt/file bs=1M count=1024
>>
>> In current implementation, sg_set_buf is called to add buffers to sg list 
>> which
>> is put into the virtqueue eventually. But there are some HighMem pages in
>> table->sgl can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
>> return NULL value. This will cause QEMU exit when virtqueue_map_sg is called
>> in QEMU because an invalid GPA is passed by virtqueue.
>
> Heh, I was compiling (almost) the same patch as we speak. :)

Uh, what a coincidence! :)

>
> I've never seen QEMU crash; the VM would more likely just fail to boot
> with a panic.  But it's the same bug anyway.

I never met this before. How this situation happens?

>
>> My solution is using sg_set_page instead of sg_set_buf.
>>
>> I have tested the patch on my workstation. QEMU would not crash any more.
>>
>> Signed-off-by: Wang Sen 
>> ---
>>  drivers/scsi/virtio_scsi.c |3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
>> index 1b38431..fc5c88a 100644
>> --- a/drivers/scsi/virtio_scsi.c
>> +++ b/drivers/scsi/virtio_scsi.c
>> @@ -198,7 +198,8 @@ static void virtscsi_map_sgl(struct scatterlist *sg, 
>> unsigned int *p_idx,
>>   int i;
>>
>>   for_each_sg(table->sgl, sg_elem, table->nents, i)
>> - sg_set_buf(&sg[idx++], sg_virt(sg_elem), sg_elem->length);
>> + sg_set_page(&sg[idx++], sg_page(sg_elem), sg_elem->length,
>> + sg_elem->offset);
>
> This can simply be
>
>sg[idx++] = *sg_elem;
>

Yes, I saw your another E-mail. I think you're right. Simply calling
sg_set_page can not handle
the flag bits correctly. So, I'll repost the patch soon. Thank you!

> Can you repost it with this change, and also add sta...@vger.kernel.org
> to the Cc?  Thanks very much!
>
> Paolo
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
--
Wang Sen
Addr: XUPT,Xi'an,Shaanxi,China
Email: kelvin.x...@gmail.com
--
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Sen Wang
2012/7/25 Rolf Eike Beer :
> Am 25.07.2012 10:29, schrieb Wang Sen:
>
>> When using the commands below to write some data to a virtio-scsi LUN of
>> the
>> QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu
>> will crash.
>>
>> # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
>> # sudo mount /dev/sdb /mnt
>> # dd if=/dev/zero of=/mnt/file bs=1M count=1024
>>
>> In current implementation, sg_set_buf is called to add buffers to sg
>> list which
>> is put into the virtqueue eventually.
>
>
> The next sentence is somehow broken:
>
>
>> But there are some HighMem pages in
>> table->sgl can not get virtual address by sg_virt.
>
>
> Maybe something like "But _if_ there are ... _you_ can not get ..."?

Thanks, I'll pay attention in next post.

>
> Eike
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
--
Wang Sen
Addr: XUPT,Xi'an,Shaanxi,China
Email: kelvin.x...@gmail.com
--
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Sen Wang
2012/7/25 Stefan Hajnoczi :
> On Wed, Jul 25, 2012 at 10:44:14AM +0200, Paolo Bonzini wrote:
>> Il 25/07/2012 10:29, Wang Sen ha scritto:
>> > When using the commands below to write some data to a virtio-scsi LUN of 
>> > the
>> > QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will 
>> > crash.
>> >
>> > # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
>> > # sudo mount /dev/sdb /mnt
>> > # dd if=/dev/zero of=/mnt/file bs=1M count=1024
>> >
>> > In current implementation, sg_set_buf is called to add buffers to sg list 
>> > which
>> > is put into the virtqueue eventually. But there are some HighMem pages in
>> > table->sgl can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
>> > return NULL value. This will cause QEMU exit when virtqueue_map_sg is 
>> > called
>> > in QEMU because an invalid GPA is passed by virtqueue.
>>
>> Heh, I was compiling (almost) the same patch as we speak. :)
>>
>> I've never seen QEMU crash; the VM would more likely just fail to boot
>> with a panic.  But it's the same bug anyway.
>
> It's not a segfault "crash", I think it hits an abort(3) in QEMU's
> virtio code when trying to map an invalid guest physical address.

How the guest boot fail? I never met this case.

>
> Stefan
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
--
Wang Sen
Addr: XUPT,Xi'an,Shaanxi,China
Email: kelvin.x...@gmail.com
--
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tcm_vhost: Expose ABI version via VHOST_SCSI_GET_ABI_VERSION

2012-07-25 Thread Stefan Hajnoczi
On Tue, Jul 24, 2012 at 01:45:24PM -0700, Nicholas A. Bellinger wrote:
> On Mon, 2012-07-23 at 18:56 -0700, Greg Kroah-Hartman wrote:
> > On Tue, Jul 24, 2012 at 01:26:20AM +, Nicholas A. Bellinger wrote:
> > > From: Nicholas Bellinger 
> > > 
> > > As requested by Anthony, here is a patch against 
> > > target-pending/for-next-merge
> > > to expose an ABI version to userspace via a new VHOST_SCSI_GET_ABI_VERSION
> > > ioctl operation.
> > > 
> > > As mentioned in the comment, ABI Rev 0 is for pre 2012 out-of-tree code, 
> > > and
> > > ABI Rev 1 (the current rev) is for current WIP v3.6 kernel merge candiate 
> > > code.
> > > 
> > > I think this is what you had in mind, and hopefully it will make MST 
> > > happy too.
> > > The incremental vhost-scsi patches against Zhi's QEMU are going out 
> > > shortly ahead
> > > of cutting a new vhost-scsi RFC over the next days.
> > > 
> > > Please have a look and let me know if you have any concerns here.
> > > 
> > > Thanks!
> > > 
> > > Reported-by: Anthony Liguori 
> > > Cc: Stefan Hajnoczi 
> > > Cc: Michael S. Tsirkin 
> > > Cc: Paolo Bonzini 
> > > Cc: Zhi Yong Wu 
> > > Signed-off-by: Nicholas Bellinger 
> > > ---
> > >  drivers/vhost/tcm_vhost.c |9 +
> > >  drivers/vhost/tcm_vhost.h |   11 +++
> > >  2 files changed, 20 insertions(+), 0 deletions(-)
> > > 
> 
> 
> 
> > > diff --git a/drivers/vhost/tcm_vhost.h b/drivers/vhost/tcm_vhost.h
> > > index e942df9..3d5378f 100644
> > > --- a/drivers/vhost/tcm_vhost.h
> > > +++ b/drivers/vhost/tcm_vhost.h
> > > @@ -80,7 +80,17 @@ struct tcm_vhost_tport {
> > >  
> > >  #include 
> > >  
> > > +/*
> > > + * Used by QEMU userspace to ensure a consistent vhost-scsi ABI.
> > > + *
> > > + * ABI Rev 0: All pre 2012 revisions used by prototype out-of-tree code
> > > + * ABI Rev 1: 2012 version for v3.6 kernel merge candiate
> > > + */
> > > +
> > > +#define VHOST_SCSI_ABI_VERSION   1
> > > +
> > >  struct vhost_scsi_target {
> > > + int abi_version;
> > >   unsigned char vhost_wwpn[TRANSPORT_IQN_LEN];
> > >   unsigned short vhost_tpgt;
> > >  };
> > > @@ -88,3 +98,4 @@ struct vhost_scsi_target {
> > >  /* VHOST_SCSI specific defines */
> > >  #define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct 
> > > vhost_scsi_target)
> > >  #define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct 
> > > vhost_scsi_target)
> > > +#define VHOST_SCSI_GET_ABI_VERSION _IOW(VHOST_VIRTIO, 0x42, struct 
> > > vhost_scsi_target)
> > 
> > No, you just broke the ABI for version "0" here, that's not how you do
> > this at all.
> > 
> 
> The intention of this patch is use ABI=1 as a starting point for
> tcm_vhost moving forward, with no back-wards compat for the ABI=0
> prototype userspace code because:
> 
> - It's based on a slightly older version of QEMU (updating the QEMU series 
> now)
> - It does not have an GET_ABI_VERSION ioctl cmd (that starts with ABI=1)
> - It has a small user-base of target + virtio-scsi developers
> 
> So I did consider just starting from ABI=0, but figured this would help
> reduce the confusion for QEMU userspace wrt to the vhost-scsi code
> that's been floating around out-of-tree for the last 2 years.

There is no real user base beyond the handful of people who have hacked
on this.  Adding the GET_ABI_VERSION ioctl() at this stage is fine,
especially considering that the userspace code that talks to tcm_vhost
isn't in mainline in userspace yet either.

Stefan

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [dm-devel] [RESEND PATCH 3/3] dm mpath: add ability to disable partition creation

2012-07-25 Thread Hannes Reinecke
On 07/11/2012 05:18 PM, Alasdair G Kergon wrote:
> On Tue, Jun 26, 2012 at 02:32:05PM -0400, Mike Snitzer wrote:
>> The new 'no_partitions' feature serves as a notifier to kpartx to _not_
>> create partitions on these multipath devices.
>  
> This isn't really multipath-specific so doesn't belong in the target.
> It could go into dm core, but we already have flags attached to
> udev cookies that can turn udev rules on and off and thereby could
> allow userspace multipath to control whether or not kpartx creates
> partitions on any particular device.
> 
> But first I'd like us to explore creating a config file for kpartx and
> controlling the behaviour from there.  Activation could then be
> triggered by target type, device name, scsi WWID, dm UUID etc. according
> to rules in that file.
> 
But that would mean one would have to maintain _two_ configuration
files, one for multipath and one for kpartx.

Also kpartx initially doesn't have any clue about device names, SCSI
WWIDs etc. It just takes a device-mapper device and acts upon it.
The only information it's got is the device-mapper name (which could
be anything) and the device-mapper UUID if present.

Same goes for udev rules; one would have to traverse the device
stack backwards to retrieve any information about the hardware.

So when moving the configuration into kpartx we would lose quite
some flexibility. And reliability, methinks.

Flexilibity is a valid goal, but not when it can achieved only by
adding complexity to other pieces.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sd: do not set changed flag on all unit attention conditions

2012-07-25 Thread Hannes Reinecke
On 07/17/2012 11:11 AM, James Bottomley wrote:
> On Tue, 2012-07-17 at 10:54 +0200, Paolo Bonzini wrote:
>> Il 17/07/2012 10:40, James Bottomley ha scritto:
>
> It's not specific to virtio-scsi, in fact I expect that virtio-scsi will
> be almost always used with non-removable disks.
>
> However, QEMU's SCSI target is not used just for virtio-scsi (for
> example it can be used for USB storage), and it lets you mark a disk as
> removable---why? because there exists real hardware that presents itself
> as an SBC removable disk.  The only thing that is specific to
> virtualization, is support for online resizing (which generates a unit
> attention condition CAPACITY DATA HAS CHANGED).
>>> So what's the problem?  If you're doing pass through of a physical disk,
>>> we pick up removable from its inquiry string ... a physical removable
>>> device doesn't get resized.  If you have a virtual disk you want to
>>> resize, you don't set the removable flag in the inquiry data.
>>
>> In practice people will do what you said, and it's not a problem.
>>
>> However, there's nothing that prevents you from running qemu with a
>> removable SCSI disk, and then resizing it.  I would like this to work,
>> because SBC allows it and there's no reason why it shouldn't.
> 
> There's no such thing in the market today as a removable disk that's
> resizeable.  Removable disks are for things like backup cartridges and
> ageing jazz drives.  Worse: most removeable devices today are USB card
> readers whose standards compliance varies from iffy to non existent.
> Resizeable disks are currently the province of storage arrays.
> 
Ho-hum. I beg to disagree.

drivers/scsi/aacraid/aachba.c:2266

/* Do not cache partition table for arrays */
scsicmd->device->removable = 1;

To the extend of my knowledge aacraid does this _precisely_ to allow
for resizing; in effect every open() will trigger a device revalidation.

So I guess by just setting the 'removable' flag you should be okay.
You might need to remount it, but that's another story.

Cheers,

Hannes
-- 
Dr. Hannes Reinecke   zSeries & Storage
h...@suse.de  +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v2] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Wang Sen
When using the commands below to write some data to a virtio-scsi LUN of the
QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will crash.

# sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
# sudo mount /dev/sdb /mnt
# dd if=/dev/zero of=/mnt/file bs=1M count=1024

In current implementation, sg_set_buf is called to add buffers to sg list which
is put into the virtqueue eventually. But if there are some HighMem pages in
table->sgl you can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
return NULL value. This will cause QEMU exit when virtqueue_map_sg is called
in QEMU because an invalid GPA is passed by virtqueue.

I take Paolo's solution mentioned in last thread to avoid failure on handling 
flag bits.

I have tested the patch on my workstation. QEMU would not crash any more.

Signed-off-by: Wang Sen 
---
 drivers/scsi/virtio_scsi.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index 1b38431..6661610 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -198,7 +198,7 @@ static void virtscsi_map_sgl(struct scatterlist *sg, 
unsigned int *p_idx,
int i;
 
for_each_sg(table->sgl, sg_elem, table->nents, i)
-   sg_set_buf(&sg[idx++], sg_virt(sg_elem), sg_elem->length);
+   sg[idx++] = *sg_elem;
 
*p_idx = idx;
 }
-- 
1.7.9.5

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Ben Hutchings
On Wed, 2012-07-25 at 20:13 +0800, Wang Sen wrote:
> When using the commands below to write some data to a virtio-scsi LUN of the
> QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will crash.
> 
> # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
> # sudo mount /dev/sdb /mnt
> # dd if=/dev/zero of=/mnt/file bs=1M count=1024
> 
> In current implementation, sg_set_buf is called to add buffers to sg list 
> which
> is put into the virtqueue eventually. But if there are some HighMem pages in
> table->sgl you can not get virtual address by sg_virt. So, sg_virt(sg_elem) 
> may
> return NULL value. This will cause QEMU exit when virtqueue_map_sg is called
> in QEMU because an invalid GPA is passed by virtqueue.
> 
> I take Paolo's solution mentioned in last thread to avoid failure on handling 
> flag bits.
> 
> I have tested the patch on my workstation. QEMU would not crash any more.
>
> Signed-off-by: Wang Sen 
[...]

This is not the correct way to submit a change for stable.  See
Documentation/stable_kernel_rules.txt.

Ben.

-- 
Ben Hutchings
If more than one person is responsible for a bug, no one is at fault.


signature.asc
Description: This is a digitally signed message part


Re: [PATCH v2] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Paolo Bonzini
Il 25/07/2012 14:13, Wang Sen ha scritto:
> When using the commands below to write some data to a virtio-scsi LUN of the
> QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will crash.
> 
> # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
> # sudo mount /dev/sdb /mnt
> # dd if=/dev/zero of=/mnt/file bs=1M count=1024
> 
> In current implementation, sg_set_buf is called to add buffers to sg list 
> which
> is put into the virtqueue eventually. But if there are some HighMem pages in
> table->sgl you can not get virtual address by sg_virt. So, sg_virt(sg_elem) 
> may
> return NULL value. This will cause QEMU exit when virtqueue_map_sg is called
> in QEMU because an invalid GPA is passed by virtqueue.
> 
> I take Paolo's solution mentioned in last thread to avoid failure on handling 
> flag bits.

Please include an URL or (better) summarize the reason why sg_set_page
is not correct in the commit message.  For example, replace this
paragraph with the following:

"To fix this, we can simply copy the original scatterlist entries into
virtio-scsi's; we need to copy the entries entirely, including the flag
bits, so using sg_set_page is not correct".

Please send v3 with this change and I'll add my Acked-by.

Paolo

> 
> I have tested the patch on my workstation. QEMU would not crash any more.
> 
> Signed-off-by: Wang Sen 
> ---
>  drivers/scsi/virtio_scsi.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
> index 1b38431..6661610 100644
> --- a/drivers/scsi/virtio_scsi.c
> +++ b/drivers/scsi/virtio_scsi.c
> @@ -198,7 +198,7 @@ static void virtscsi_map_sgl(struct scatterlist *sg, 
> unsigned int *p_idx,
>   int i;
>  
>   for_each_sg(table->sgl, sg_elem, table->nents, i)
> - sg_set_buf(&sg[idx++], sg_virt(sg_elem), sg_elem->length);
> + sg[idx++] = *sg_elem;
>  
>   *p_idx = idx;
>  }
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh
On 07/25/2012 12:41 PM, Paolo Bonzini wrote:

> Il 25/07/2012 11:22, Boaz Harrosh ha scritto:
>>  for_each_sg(table->sgl, sg_elem, table->nents, i)
>> -sg_set_buf(&sg[idx++], sg_virt(sg_elem), 
>> sg_elem->length);
>> +sg_set_page(&sg[idx++], sg_page(sg_elem), 
>> sg_elem->length,
>> +sg_elem->offset);

 This can simply be

sg[idx++] = *sg_elem;

 Can you repost it with this change, and also add sta...@vger.kernel.org
 to the Cc?  Thanks very much!

>>
>> No! Please use sg_set_page()! Look at sg_set_page(), which calls 
>> sg_assign_page().
>> It has all these jump over chained arrays. When you'll start using long
>> sg_lists (which you should) then jumping from chain to chain must go through
>> sg_page(sg_elem) && sg_assign_page(), As in the original patch.
> 
> Hi Boaz,
> 
> actually it seems to me that using sg_set_page is wrong, because it will
> not copy the end marker from table->sgl to sg[].  If something chained
> the sg[] scatterlist onto something else, sg_next's test for sg_is_last
> would go beyond the table->nents-th item and access invalid memory.
> 


Yes, you did not understand this structure. And Yes I am right, when
using chained lists you *must* use sg_set_page().

You see the chaining belongs to the allocation not the value of the
sg-elements. One must not copy the chaining marker to the destination
array which might have it's own. And one must not crap all over the
destination chaining markers, set at allocation time.

The sizes and mostly the pointers of source and destination are not
the same.

> Using chained sglists is on my to-do list, I expect that it would make a
> nice performance improvement.  However, I was a bit confused as to
> what's the plan there; there is hardly any user, and many arches still
> do not define ARCH_HAS_SG_CHAIN.  Do you have any pointer to discussions
> or LWN articles?
> 


Only the source code I'm afraid.

In SCSI land most LLDs should support chaining just by virtu of using the
for_each_sg macro. That all it takes. Your code above does support it.
(In Wang version).

Though more code need probably be added at sg allocation to actually
allocate and prepare a chain.

> I would need to add support for the long sglists to virtio; this is not
> a problem, but in the past Rusty complained that long sg-lists are not
> well suited to virtio (which would like to add elements not just at the
> beginning of a given sglist, but also at the end).  


Well that can be done as well, (If done carefully) It should be easy to add
chained fragments to both the end of a given chain just as at beginning.

It only means that the last element of the appended-to chain moves to
the next fragment and it's place is replaced by a link.

If you have ready made two long segments A and C which you would like not
to reallocate and copy, you insert a two-elements segment in the middle, say
call it B.

The first element of B is the last element of A which is now used as the pointer
to B, and the 2nd element of B is a pointer to C.

> It seems to me that
> virtio would prefer to work with a struct scatterlist ** rather than a
> long sglist.
> 


That's just going backwards, and lazy. As you said if you want to enjoy
the better performance cake you better break some eggs ;-)

> Paolo


Cheers
Boaz
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh
On 07/25/2012 02:44 PM, Sen Wang wrote:

> 2012/7/25 Paolo Bonzini :
>> Il 25/07/2012 10:29, Wang Sen ha scritto:
>>> When using the commands below to write some data to a virtio-scsi LUN of the
>>> QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will 
>>> crash.
>>>
>>>   # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
>>>   # sudo mount /dev/sdb /mnt
>>>   # dd if=/dev/zero of=/mnt/file bs=1M count=1024
>>>
>>> In current implementation, sg_set_buf is called to add buffers to sg list 
>>> which
>>> is put into the virtqueue eventually. But there are some HighMem pages in
>>> table->sgl can not get virtual address by sg_virt. So, sg_virt(sg_elem) may
>>> return NULL value. This will cause QEMU exit when virtqueue_map_sg is called
>>> in QEMU because an invalid GPA is passed by virtqueue.
>>
>> Heh, I was compiling (almost) the same patch as we speak. :)
> 
> Uh, what a coincidence! :)
> 
>>
>> I've never seen QEMU crash; the VM would more likely just fail to boot
>> with a panic.  But it's the same bug anyway.
> 
> I never met this before. How this situation happens?
> 
>>
>>> My solution is using sg_set_page instead of sg_set_buf.
>>>
>>> I have tested the patch on my workstation. QEMU would not crash any more.
>>>
>>> Signed-off-by: Wang Sen 
>>> ---
>>>  drivers/scsi/virtio_scsi.c |3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
>>> index 1b38431..fc5c88a 100644
>>> --- a/drivers/scsi/virtio_scsi.c
>>> +++ b/drivers/scsi/virtio_scsi.c
>>> @@ -198,7 +198,8 @@ static void virtscsi_map_sgl(struct scatterlist *sg, 
>>> unsigned int *p_idx,
>>>   int i;
>>>
>>>   for_each_sg(table->sgl, sg_elem, table->nents, i)
>>> - sg_set_buf(&sg[idx++], sg_virt(sg_elem), sg_elem->length);
>>> + sg_set_page(&sg[idx++], sg_page(sg_elem), sg_elem->length,
>>> + sg_elem->offset);
>>
>> This can simply be
>>
>>sg[idx++] = *sg_elem;
>>
> 
> Yes, I saw your another E-mail. I think you're right. Simply calling
> sg_set_page can not handle
> the flag bits correctly. So, I'll repost the patch soon. Thank you!
> 


No this code is correct, though you will need to make sure to properly
terminate the destination sg_list.

But since old code was using sg_set_buf(), than it means it was properly
terminated before, and there for this code is good as is please don't
touch it.

Thanks
Boaz

>> Can you repost it with this change, and also add sta...@vger.kernel.org
>> to the Cc?  Thanks very much!
>>
>> Paolo
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>> the body of a message to majord...@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh
On 07/25/2012 03:34 PM, Paolo Bonzini wrote:

> Il 25/07/2012 14:13, Wang Sen ha scritto:
>> When using the commands below to write some data to a virtio-scsi LUN of the
>> QEMU guest(32-bit) with 1G physical memory(qemu -m 1024), the qemu will 
>> crash.
>>
>> # sudo mkfs.ext4 /dev/sdb  (/dev/sdb is the virtio-scsi LUN.)
>> # sudo mount /dev/sdb /mnt
>> # dd if=/dev/zero of=/mnt/file bs=1M count=1024
>>
>> In current implementation, sg_set_buf is called to add buffers to sg list 
>> which
>> is put into the virtqueue eventually. But if there are some HighMem pages in
>> table->sgl you can not get virtual address by sg_virt. So, sg_virt(sg_elem) 
>> may
>> return NULL value. This will cause QEMU exit when virtqueue_map_sg is called
>> in QEMU because an invalid GPA is passed by virtqueue.
>>
>> I take Paolo's solution mentioned in last thread to avoid failure on 
>> handling 
>> flag bits.
> 
> Please include an URL or (better) summarize the reason why sg_set_page
> is not correct in the commit message.  For example, replace this
> paragraph with the following:
> 
> "To fix this, we can simply copy the original scatterlist entries into
> virtio-scsi's; we need to copy the entries entirely, including the flag
> bits, so using sg_set_page is not correct".
> 

> Please send v3 with this change and I'll add my Acked-by.
> 


NACK-by: Boaz Harrosh


Apart from the HighMem pages problem, where in previous sg_set_buf()
code was the marker copied? It was not because it is not needed because
the allocation of sg took care of that. For example in 64bit the is no
bugs, right?

If there was a destination sg_list termination bug, it should be fixed
as a separate patch from this "HighMem pages problem". But I bet if
you inspect the code carefully there isn't such a bug.

Cheers
Boaz

> Paolo
> 
>>
>> I have tested the patch on my workstation. QEMU would not crash any more.
>>
>> Signed-off-by: Wang Sen 
>> ---
>>  drivers/scsi/virtio_scsi.c |2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
>> index 1b38431..6661610 100644
>> --- a/drivers/scsi/virtio_scsi.c
>> +++ b/drivers/scsi/virtio_scsi.c
>> @@ -198,7 +198,7 @@ static void virtscsi_map_sgl(struct scatterlist *sg, 
>> unsigned int *p_idx,
>>  int i;
>>  
>>  for_each_sg(table->sgl, sg_elem, table->nents, i)
>> -sg_set_buf(&sg[idx++], sg_virt(sg_elem), sg_elem->length);
>> +sg[idx++] = *sg_elem;
>>  
>>  *p_idx = idx;
>>  }
>>
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Paolo Bonzini
Il 25/07/2012 14:34, Boaz Harrosh ha scritto:
>>> for_each_sg(table->sgl, sg_elem, table->nents, i)
>>> -   sg_set_buf(&sg[idx++], sg_virt(sg_elem), 
>>> sg_elem->length);
>>> +   sg_set_page(&sg[idx++], sg_page(sg_elem), 
>>> sg_elem->length,
>>> +   sg_elem->offset);
>
> This can simply be
>
>sg[idx++] = *sg_elem;
>
>>>
>>> No! Please use sg_set_page()! Look at sg_set_page(), which calls 
>>> sg_assign_page().
>>> It has all these jump over chained arrays. When you'll start using long
>>> sg_lists (which you should) then jumping from chain to chain must go through
>>> sg_page(sg_elem) && sg_assign_page(), As in the original patch.
>>
>> actually it seems to me that using sg_set_page is wrong, because it will
>> not copy the end marker from table->sgl to sg[].  If something chained
>> the sg[] scatterlist onto something else, sg_next's test for sg_is_last
>> would go beyond the table->nents-th item and access invalid memory.
> 
> 
> Yes, you did not understand this structure. And Yes I am right, when
> using chained lists you *must* use sg_set_page().
> 
> You see the chaining belongs to the allocation not the value of the
> sg-elements. One must not copy the chaining marker to the destination
> array which might have it's own.

Except here the destination array has to be given to virtio, which
doesn't (yet) understand chaining.  I'm using for_each_sg rather than a
simple memcpy exactly because I want to flatten the input scatterlist
onto consecutive scatterlist entries, which is what virtio expects (and
what I'll change when I get to it).

for_each_sg guarantees that I get non-chain scatterlists only, so it is
okay to value-assign them to sg[].

(Replying to your other message,

> No this code is correct, though you will need to make sure to properly
> terminate the destination sg_list.
> 
> But since old code was using sg_set_buf(), than it means it was properly
> terminated before, and there for this code is good as is please don't
> touch it.

It was _not_ properly terminated, and didn't matter because virtio
doesn't care about termination.  Changing all the virtio devices to
properly terminate chains (and to use for_each_sg) is a prerequisite for
properly supporting long sglists).

> In SCSI land most LLDs should support chaining just by virtu of using the
> for_each_sg macro. That all it takes. Your code above does support it.

Yes, it supports it but still has to undo them before passing to virtio.

What my LLD does is add a request descriptor in front of the scatterlist
that the LLD receives.  I would like to do this with a 2-item
scatterlist: one for the request descriptor, and one which is a chain to
the original scatterlist.  Except that if I call sg_chain and my
architecture does not define ARCH_HAS_SG_CHAIN, it will BUG out.  So I
need to keep all the scatterlist allocation and copying crap that I have
now within #ifdef, and it will bitrot.

>> I would need to add support for the long sglists to virtio; this is not
>> a problem, but in the past Rusty complained that long sg-lists are not
>> well suited to virtio (which would like to add elements not just at the
>> beginning of a given sglist, but also at the end).  
> 
> Well that can be done as well, (If done carefully) It should be easy to add
> chained fragments to both the end of a given chain just as at beginning.
> It only means that the last element of the appended-to chain moves to
> the next fragment and it's place is replaced by a link.

But you cannot do that in constant time, can you?  And that means you do
not enjoy any benefit in terms of cache misses etc.

Also, this assumes that I can modify the appended-to chain.  I'm not
sure I can do this?

>> It seems to me that virtio would prefer to work with a struct
>> scatterlist ** rather than a long sglist.
> 
> That's just going backwards, and lazy. As you said if you want to enjoy
> the better performance cake you better break some eggs ;-)

:)

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Paolo Bonzini
Il 25/07/2012 14:47, Boaz Harrosh ha scritto:
> 
> NACK-by: Boaz Harrosh
> 
> 
> Apart from the HighMem pages problem, where in previous sg_set_buf()
> code was the marker copied? It was not because it is not needed because
> the allocation of sg took care of that. For example in 64bit the is no
> bugs, right?
> 
> If there was a destination sg_list termination bug, it should be fixed
> as a separate patch from this "HighMem pages problem". But I bet if
> you inspect the code carefully there isn't such a bug.

Hmm, we're talking past each other.  Let's sort it out in the v1 thread.
 Sen, please hold up the patch for a moment.

Paolo

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tcm_vhost: Expose ABI version via VHOST_SCSI_GET_ABI_VERSION

2012-07-25 Thread Avi Kivity
On 07/24/2012 11:45 PM, Nicholas A. Bellinger wrote:

>> > diff --git a/drivers/vhost/tcm_vhost.h b/drivers/vhost/tcm_vhost.h
>> > index e942df9..3d5378f 100644
>> > --- a/drivers/vhost/tcm_vhost.h
>> > +++ b/drivers/vhost/tcm_vhost.h
>> > @@ -80,7 +80,17 @@ struct tcm_vhost_tport {
>> >  
>> >  #include 
>> >  
>> > +/*
>> > + * Used by QEMU userspace to ensure a consistent vhost-scsi ABI.
>> > + *
>> > + * ABI Rev 0: All pre 2012 revisions used by prototype out-of-tree code
>> > + * ABI Rev 1: 2012 version for v3.6 kernel merge candiate
>> > + */


If it's out of tree, why consider it at all?  Put a stable ABI in tree
and extend it in compatible ways.


-- 
error compiling committee.c: too many arguments to function


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh
On 07/25/2012 03:49 PM, Paolo Bonzini wrote:

> 
> Except here the destination array has to be given to virtio, which
> doesn't (yet) understand chaining.  I'm using for_each_sg rather than a
> simple memcpy exactly because I want to flatten the input scatterlist
> onto consecutive scatterlist entries, which is what virtio expects (and
> what I'll change when I get to it).
> 
> for_each_sg guarantees that I get non-chain scatterlists only, so it is
> okay to value-assign them to sg[].
> 


So if the virtio does not understand chaining at all then surly it will
not understand the 2-bit end marker and will get a wrong page pointer
with the 1st bit set.

As I said!! Since the last code did sg_set_buff() and worked then you must
change it with sg_set_page().

If there are *any* chaining-or-no-chaining markers errors these should
be fixed as a separate patch!

Please lets concentrate at the problem at hand.



> 
> It was _not_ properly terminated, and didn't matter because virtio
> doesn't care about termination.  Changing all the virtio devices to
> properly terminate chains (and to use for_each_sg) is a prerequisite for
> properly supporting long sglists).
> 


Fine then your code is now a crash because the terminating bit was just
copied over, which it was not before.


Now Back to the how to support chaining:

Lets separate the two topics from now on. Send me one mail concerning
the proper above patch, And a different mail for how to support chaining.

>> In SCSI land most LLDs should support chaining just by virtu of using the
>> for_each_sg macro. That all it takes. Your code above does support it.
> 
> Yes, it supports it but still has to undo them before passing to virtio.
> 
> What my LLD does is add a request descriptor in front of the scatterlist
> that the LLD receives.  I would like to do this with a 2-item
> scatterlist: one for the request descriptor, and one which is a chain to
> the original scatterlist.  


I hate that plan. Why yet override the scatter element yet again with a third
union of a "request descriptor"

The reason it was overloaded as a link-pointer in the first place was because
of historical compatibility reasons and not because of a good design.

You should have a proper "request descriptor" structure defined, pointing to
(or followed by), an sglist-chain. And all of the above is mute.
 

> Except that if I call sg_chain and my
> architecture does not define ARCH_HAS_SG_CHAIN, it will BUG out.  So I
> need to keep all the scatterlist allocation and copying crap that I have
> now within #ifdef, and it will bitrot.
> 


except that with the correct design you don't call sg_chain you just do:
request_descriptor->sg_list = sg;

>>> I would need to add support for the long sglists to virtio; this is not
>>> a problem, but in the past Rusty complained that long sg-lists are not
>>> well suited to virtio (which would like to add elements not just at the
>>> beginning of a given sglist, but also at the end).  
>>
>> Well that can be done as well, (If done carefully) It should be easy to add
>> chained fragments to both the end of a given chain just as at beginning.
>> It only means that the last element of the appended-to chain moves to
>> the next fragment and it's place is replaced by a link.
> 
> But you cannot do that in constant time, can you?  And that means you do
> not enjoy any benefit in terms of cache misses etc.
> 


I did not understand "constant time" it is O(0) if that what you meant.

(And surly today's code of copy the full list "cache misses")

> Also, this assumes that I can modify the appended-to chain.  I'm not
> sure I can do this?
> 


Each case it's own. If the appended-to chain is const, yes you'll have
to reallocate it and copy. Is that your case?

Cheers
Boaz

>>> It seems to me that virtio would prefer to work with a struct
>>> scatterlist ** rather than a long sglist.
>>
>> That's just going backwards, and lazy. As you said if you want to enjoy
>> the better performance cake you better break some eggs ;-)
> 
> :)
> 
> Paolo


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Paolo Bonzini
Il 25/07/2012 15:26, Boaz Harrosh ha scritto:
> On 07/25/2012 03:49 PM, Paolo Bonzini wrote:
> 
>>
>> Except here the destination array has to be given to virtio, which
>> doesn't (yet) understand chaining.  I'm using for_each_sg rather than a
>> simple memcpy exactly because I want to flatten the input scatterlist
>> onto consecutive scatterlist entries, which is what virtio expects (and
>> what I'll change when I get to it).
>>
>> for_each_sg guarantees that I get non-chain scatterlists only, so it is
>> okay to value-assign them to sg[].
> 
> So if the virtio does not understand chaining at all then surly it will
> not understand the 2-bit end marker and will get a wrong page pointer
> with the 1st bit set.

It doesn't understand chaining, but it does use sg_phys(x) so it will
not get a wrong page pointer for the end marker.

> Fine then your code is now a crash because the terminating bit was just
> copied over, which it was not before.

I did test the patch with value-assignment.

> Lets separate the two topics from now on. Send me one mail concerning
> the proper above patch, And a different mail for how to support chaining.

Ok, and I'll change the topic.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


virtio(-scsi) vs. chained sg_lists (was Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list)

2012-07-25 Thread Paolo Bonzini
Il 25/07/2012 15:26, Boaz Harrosh ha scritto:
>>> In SCSI land most LLDs should support chaining just by virtu of using the
>>> for_each_sg macro. That all it takes. Your code above does support it.
>>
>> Yes, it supports it but still has to undo them before passing to virtio.
>>
>> What my LLD does is add a request descriptor in front of the scatterlist
>> that the LLD receives.  I would like to do this with a 2-item
>> scatterlist: one for the request descriptor, and one which is a chain to
>> the original scatterlist.  
> 
> I hate that plan. Why yet override the scatter element yet again with a third
> union of a "request descriptor"

I'm not overriding (or did you mean overloading?) anything, and I think
you're reading too much in my words.

What I am saying is (for a WRITE command):

1) what I get is a scsi_cmnd which contains an N-element scatterlist.

2) virtio-scsi has to build the "packet" that is passed to the hardware
(it does not matter that the hardware is virtual).  This packet (per
virtio-scsi spec) has an N+1-element scatterlist, where the first
element is a request descriptor (struct virtio_scsi_cmd_req), and the
others describe the written data.

3) virtio takes care of converting the "packet" from a scatterlist
(which currently must be a flat one) to the hardware representation.
Here a walk is inevitable, so we don't care about this walk.

4) What I'm doing now: copying (and flattening) the N-element
scatterlist onto the last elements of an N+1 array that I pass to virtio.

  _ _ _ _ _ _
 |_|_|_|_|_|_|  scsi_cmnd scatterlist

 vvv COPY vvv
_ _ _ _ _ _ _
   |_|_|_|_|_|_|_|  scatterlist passed to virtio
|
virtio_scsi_cmd_req

Then I hand off the scatterlist to virtio.  virtio walks it and converts
it to hardware format.

5) What I want to do: create a 2-element scatterlist, the first being
the request descriptor and the second chaining to scsi_cmnd's N-element
scatterlist.

 _ _ _ _ _ _
|_|_|_|_|_|_|  scsi_cmnd scatterlist
_ _/
   |_|C|   scatterlist passed to virtio
|
virtio_scsi_cmd_req

Then I hand off the scatterlist to virtio.  virtio will still walk the
scatterlist chain, and convert it to N+1 elements for the hardware to
consume.  Still, removing one walk largely reduces the length of my
critical sections.  I also save some pointer-chasing because the
2-element scatterlist are short-lived and can reside on the stack.


Other details (you can probably skip these):

There is also a response descriptor.  In the case of writes this is the
only element that the hardware will write to, so in the case of writes
the "written by hardware" scatterlist has 1 element only and does not
need chaining.

Reads are entirely symmetric.  The hardware will read the request
descriptor from a 1-element scatterlist, and will write response+data
into an N+1-element scatterlist (the response descriptor precedes the
data that was read).  It can be treated in exactly the same way.  The
N+1-element scatterlist could also become a 2-element scatterlist with
chaining.

>> Except that if I call sg_chain and my
>> architecture does not define ARCH_HAS_SG_CHAIN, it will BUG out.  So I
>> need to keep all the scatterlist allocation and copying crap that I have
>> now within #ifdef, and it will bitrot.
> 
> except that with the correct design you don't call sg_chain you just do:
>   request_descriptor->sg_list = sg;

By the above it should be clear, that request_descriptor is not a
driver-private extension of the scsi_cmnd.  It is something passed to
the hardware.

>>> Well that can be done as well, (If done carefully) It should be easy to add
>>> chained fragments to both the end of a given chain just as at beginning.
>>> It only means that the last element of the appended-to chain moves to
>>> the next fragment and it's place is replaced by a link.
>>
>> But you cannot do that in constant time, can you?  And that means you do
>> not enjoy any benefit in terms of cache misses etc.
> 
> I did not understand "constant time" it is O(0) if that what you meant.

In the worst case it is a linked list, no?  So in the worst case
_finding_ the last element of the appended-to chain is O(n).

Actually, appending to the end is not a problem for virtio-scsi.  But
for example the virtio-blk spec places the response descriptor _after_
the input data.  I think this was a mistake, and I didn't repeat it for
virtio-scsi, but I cited it because in the past Rusty complained that
the long sglist implementation was "SCSI-centric".

>> Also, this assumes that I can modify the appended-to chain.  I'm not
>> sure I can do this?
> 
> Each case it's own. If the appended-to chain is const, yes you'll have
> to reallocate it and copy. Is that your case?

It will be virtio-blk's case, but we can ignore it.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.

Re: 'Device not ready' issue on mpt2sas since 3.1.10

2012-07-25 Thread James Bottomley
On Sun, 2012-07-22 at 10:31 -0700, Tejun Heo wrote:
> Hello,
> 
> On Sat, Jul 21, 2012 at 02:15:56PM +0200, Matthias Prager wrote:
> > Now I'm not sure this isn't taping over another bug. Which leads me to
> > my question: What is the correct behavior?
> > 
> > #1 Issuing a separate spin-up command (START UNIT?) prior to sending i/o
> > by setting allow_restart=1 for sata disks on sas controllers
> > 
> > or
> > 
> > #2 Teaching the sas drivers they do not need spin-up commands and can
> > simply start issuing i/o to sata disks
> 
> I haven't consulted SAT but it seems like a bug in SAS driver or
> firmware.  If it's a driver bug, we better fix it there.  If a
> firmware bug, working around those is one of major roles of drivers,
> so I think setting allow_restart is fine.

Actually, I don't think so.  SAT-2 section 8.12.2 does say 

if the device is in the stopped state as the result of
processing a START STOP UNIT command (see 9.11), then the SATL
shall terminate the TEST UNIT READY command with CHECK CONDITION
status with the sense key set to NOT READY and the additional
sense code of LOGICAL UNIT NOT READY, INITIALIZING COMMAND
REQUIRED;

START STOP UNIT (with START=0) translates to STANDBY IMMEDIATE, and
that's what hdparm -y issues.  We don't see this in /drivers/ata because
TEST UNIT READY always returns success.

So it looks like the mpt2sas SAT is doing the correct thing and we only
don't see this problem in normal SATA devices because of a bug in the
libata-scsi SAT.

However, the kernel log

Apr 04 22:55:10 [kernel] sd 1:0:1:0: [sdj] Device not ready
Apr 04 22:55:10 [kernel] sd 1:0:1:0: [sdj]  Result: hostbyte=DID_OK 
driverbyte=DRIVER_SENSE
Apr 04 22:55:10 [kernel] sd 1:0:1:0: [sdj]  Sense Key : Not Ready [current]
Apr 04 22:55:10 [kernel] sd 1:0:1:0: [sdj]  Add. Sense: Logical unit not ready, 
initializing command required
Apr 04 22:55:10 [kernel] sd 1:0:1:0: [sdj] CDB: Write(10): 2a 00 57 54 52 3f 00 
00 08 00

Indicates we got the NOT READY to a non-TUR command, so I suspect what's
happening is that sending the TUR causes the SAT to remember the standby
state and respond NOT READY to all subsequent commands.  However, if we
just send an ordinary command, not a TUR, it quietly wakes the drive and
we don't see any problems.

There is support in SAT for this behaviour because there's a note on the
START STOP UNIT command saying

After returning GOOD status for a START STOP UNIT command with
the START bit set to zero, the SATL shall consider the ATA
device to be in the Stopped power state (see SBC-2)

Which in SCSI terms would mean return NOT READY to any subsequent
commands.

Can someone verify this is indeed what the mpt2sas HBA is doing?

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list

2012-07-25 Thread Boaz Harrosh
On 07/25/2012 04:36 PM, Paolo Bonzini wrote:

> Il 25/07/2012 15:26, Boaz Harrosh ha scritto:
>> On 07/25/2012 03:49 PM, Paolo Bonzini wrote:
>>
>>>
>>> Except here the destination array has to be given to virtio, which
>>> doesn't (yet) understand chaining.  I'm using for_each_sg rather than a
>>> simple memcpy exactly because I want to flatten the input scatterlist
>>> onto consecutive scatterlist entries, which is what virtio expects (and
>>> what I'll change when I get to it).
>>>
>>> for_each_sg guarantees that I get non-chain scatterlists only, so it is
>>> okay to value-assign them to sg[].
>>
>> So if the virtio does not understand chaining at all then surly it will
>> not understand the 2-bit end marker and will get a wrong page pointer
>> with the 1st bit set.
> 
> It doesn't understand chaining, but it does use sg_phys(x) so it will
> not get a wrong page pointer for the end marker.
> 
>> Fine then your code is now a crash because the terminating bit was just
>> copied over, which it was not before.
> 
> I did test the patch with value-assignment.
> 


Still you should use the sg_set_page()!!
1. It is not allowed to directly manipulate sg entries. One should always
   use the proper accessor. Even if open coding does work and is not a bug
   it should not be used anyway!
2. Future code that will support chaining will need to do as I say so why
   change it then, again?

Please don't change two things in one patch. The fix is for high-pages
please fix only that here. You can blasphemy open-code the sg manipulation
in a separate patch.

Please
Boaz

>> Lets separate the two topics from now on. Send me one mail concerning
>> the proper above patch, And a different mail for how to support chaining.
> 
> Ok, and I'll change the topic.
> 
> Paolo


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


performance improvements for the sglist API (Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list)

2012-07-25 Thread Paolo Bonzini
Il 25/07/2012 16:36, Boaz Harrosh ha scritto:
>> > 
>> > I did test the patch with value-assignment.
>> > 
> 
> Still you should use the sg_set_page()!!
> 1. It is not allowed to directly manipulate sg entries. One should always
>use the proper accessor. Even if open coding does work and is not a bug
>it should not be used anyway!
> 2. Future code that will support chaining will need to do as I say so why
>change it then, again?

Future code that will support chaining will not copy anything at all.

Also, and more important, note that I am _not_ calling sg_init_table
before the loop, only once in the driver initialization.  That's because
memset in sg_init_table is an absolute performance killer, especially if
you have to do it in a critical section; and I'm not making this up, see
blk_rq_map_sg:

  * If the driver previously mapped a shorter
  * list, we could see a termination bit
  * prematurely unless it fully inits the sg
  * table on each mapping. We KNOW that there
  * must be more entries here or the driver
  * would be buggy, so force clear the
  * termination bit to avoid doing a full
  * sg_init_table() in drivers for each command.
  */
  sg->page_link &= ~0x02;
  sg = sg_next(sg);

So let's instead fix the API so that I (and blk-merge.c) can touch
memory just once.  For example you could add __sg_set_page and
__sg_set_buf, basically the equivalent of

memset(sg, 0, sizeof(*sg));
sg_set_{page,buf}(sg, page, len, offset);

Calling these functions would be fine if you later add a manual call to
sg_mark_end, again the same as blk-merge.c does.  See the attached
untested/uncompiled patch.

And value assignment would be the same as a

__sg_set_page(sg, sg_page(page), sg->length, sg->offset);

> Please don't change two things in one patch. The fix is for high-pages
> please fix only that here. You can blasphemy open-code the sg manipulation
> in a separate patch.

The blasphemy is already there (the scatterlist that is handed to virtio
won't have the right end-of-chain marker).  If anything,
value-assignment is trading a subtle blasphemy for a blatant one.
That's already an improvement, but let's just fix the API instead.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: performance improvements for the sglist API (Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list)

2012-07-25 Thread Paolo Bonzini
Il 25/07/2012 17:09, Paolo Bonzini ha scritto:
> Il 25/07/2012 16:36, Boaz Harrosh ha scritto:

 I did test the patch with value-assignment.

>>
>> Still you should use the sg_set_page()!!
>> 1. It is not allowed to directly manipulate sg entries. One should always
>>use the proper accessor. Even if open coding does work and is not a bug
>>it should not be used anyway!
>> 2. Future code that will support chaining will need to do as I say so why
>>change it then, again?
> 
> Future code that will support chaining will not copy anything at all.
> 
> Also, and more important, note that I am _not_ calling sg_init_table
> before the loop, only once in the driver initialization.  That's because
> memset in sg_init_table is an absolute performance killer, especially if
> you have to do it in a critical section; and I'm not making this up, see
> blk_rq_map_sg:
> 
>   * If the driver previously mapped a shorter
>   * list, we could see a termination bit
>   * prematurely unless it fully inits the sg
>   * table on each mapping. We KNOW that there
>   * must be more entries here or the driver
>   * would be buggy, so force clear the
>   * termination bit to avoid doing a full
>   * sg_init_table() in drivers for each command.
>   */
>   sg->page_link &= ~0x02;
>   sg = sg_next(sg);
> 
> So let's instead fix the API so that I (and blk-merge.c) can touch
> memory just once.  For example you could add __sg_set_page and
> __sg_set_buf, basically the equivalent of
> 
> memset(sg, 0, sizeof(*sg));
> sg_set_{page,buf}(sg, page, len, offset);
> 
> Calling these functions would be fine if you later add a manual call to
> sg_mark_end, again the same as blk-merge.c does.  See the attached
> untested/uncompiled patch.
> 
> And value assignment would be the same as a
> 
> __sg_set_page(sg, sg_page(page), sg->length, sg->offset);

ENOPATCH...

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 160035f..00ba3d4 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -146,7 +146,9 @@ int blk_rq_map_sg(struct request_queue *q, struct request 
*rq,
 new_segment:
if (!sg)
sg = sglist;
-   else {
+   else
+   sg = sg_next(sg);
+
/*
 * If the driver previously mapped a shorter
 * list, we could see a termination bit
@@ -158,11 +160,7 @@ new_segment:
 * termination bit to avoid doing a full
 * sg_init_table() in drivers for each command.
 */
-   sg->page_link &= ~0x02;
-   sg = sg_next(sg);
-   }
-
-   sg_set_page(sg, bvec->bv_page, nbytes, bvec->bv_offset);
+   __sg_set_page(sg, bvec->bv_page, nbytes, 
bvec->bv_offset);
nsegs++;
}
bvprv = bvec;
@@ -182,12 +180,11 @@ new_segment:
if (rq->cmd_flags & REQ_WRITE)
memset(q->dma_drain_buffer, 0, q->dma_drain_size);
 
-   sg->page_link &= ~0x02;
sg = sg_next(sg);
-   sg_set_page(sg, virt_to_page(q->dma_drain_buffer),
-   q->dma_drain_size,
-   ((unsigned long)q->dma_drain_buffer) &
-   (PAGE_SIZE - 1));
+   __sg_set_page(sg, virt_to_page(q->dma_drain_buffer),
+ q->dma_drain_size,
+ ((unsigned long)q->dma_drain_buffer) &
+ (PAGE_SIZE - 1));
nsegs++;
rq->extra_len += q->dma_drain_size;
}
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index ac9586d..d6a937e 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -44,32 +44,23 @@ struct sg_table {
 #define sg_chain_ptr(sg)   \
((struct scatterlist *) ((sg)->page_link & ~0x03))
 
-/**
- * sg_assign_page - Assign a given page to an SG entry
- * @sg:SG entry
- * @page:  The page
- *
- * Description:
- *   Assign page to sg entry. Also see sg_set_page(), the most commonly used
- *   variant.
- *
- **/
-static inline void sg_assign_page(struct scatterlist *sg, struct page *page)
+static inline void __sg_set_page(struct scatterlist *sg, struct page *page,
+unsigned int len, unsigned int offset)
 {
-   unsigned long page_link = sg->page_link & 0x3;
-
/*
 *

Re: virtio(-scsi) vs. chained sg_lists (was Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list)

2012-07-25 Thread Boaz Harrosh
On 07/25/2012 05:17 PM, Paolo Bonzini wrote:

> Il 25/07/2012 15:26, Boaz Harrosh ha scritto:
 In SCSI land most LLDs should support chaining just by virtu of using the
 for_each_sg macro. That all it takes. Your code above does support it.
>>>
>>> Yes, it supports it but still has to undo them before passing to virtio.
>>>
>>> What my LLD does is add a request descriptor in front of the scatterlist
>>> that the LLD receives.  I would like to do this with a 2-item
>>> scatterlist: one for the request descriptor, and one which is a chain to
>>> the original scatterlist.  
>>
>> I hate that plan. Why yet override the scatter element yet again with a third
>> union of a "request descriptor"
> 
> I'm not overriding (or did you mean overloading?) anything, and I think
> you're reading too much in my words.
> 
> What I am saying is (for a WRITE command):
> 
> 1) what I get is a scsi_cmnd which contains an N-element scatterlist.
> 
> 2) virtio-scsi has to build the "packet" that is passed to the hardware
> (it does not matter that the hardware is virtual).  This packet (per
> virtio-scsi spec) has an N+1-element scatterlist, where the first
> element is a request descriptor (struct virtio_scsi_cmd_req), and the
> others describe the written data.
> 


Then "virtio-scsi spec" is crap. It overloads the meaning of
"struct scatterlist" of the first element in an array. to be a
"struct virtio_scsi_cmd_req".

Instead of simply defining the protocol as passing a "struct 
virtio_scsi_cmd_req".

The following scatterlist can then be pointed to or followed in the stream.
But the above is just bad programming, regardless of standard or not.

Since you need to change the standard to support chaining then
it is a good time to fix this. (You need to change it because virtio
will now need to decode a guest-pointer at each chain-end to yet a new
guest-memory buffer)

In Practice you will get the same stream. a first packet of a
struct virtio_scsi_cmd_req followed or pointing to an array of
struct scatterlist. Only you don't have that contraption of virtio_scsi_cmd_req
must be the same size as "struct scatterlist" and all that union crap.

What where you guys smoking that day? ;-)

> 3) virtio takes care of converting the "packet" from a scatterlist
> (which currently must be a flat one) to the hardware representation.
> Here a walk is inevitable, so we don't care about this walk.
> 


"hardware representation" you mean aio or biovec, what ever the
IO submission path uses at the host end?

> 4) What I'm doing now: copying (and flattening) the N-element
> scatterlist onto the last elements of an N+1 array that I pass to virtio.
> 
>   _ _ _ _ _ _
>  |_|_|_|_|_|_|  scsi_cmnd scatterlist
> 
>  vvv COPY vvv
> _ _ _ _ _ _ _
>|_|_|_|_|_|_|_|  scatterlist passed to virtio
> |
> virtio_scsi_cmd_req
> 
> Then I hand off the scatterlist to virtio.  virtio walks it and converts
> it to hardware format.
> 


Crap design. All that extra full vector copy. Just for that request
header - virtio_scsi_cmd_req.

Why are they at all related. Why does virtio_scsi_cmd_req need to be
as part of scatterlist and of the same size as the first element?

> 5) What I want to do: create a 2-element scatterlist, the first being
> the request descriptor and the second chaining to scsi_cmnd's N-element
> scatterlist.
> 
>  _ _ _ _ _ _
> |_|_|_|_|_|_|  scsi_cmnd scatterlist
> _ _/
>|_|C|   scatterlist passed to virtio
> |
> virtio_scsi_cmd_req
> 
> Then I hand off the scatterlist to virtio.  virtio will still walk the
> scatterlist chain, and convert it to N+1 elements for the hardware to
> consume.  Still, removing one walk largely reduces the length of my
> critical sections.  I also save some pointer-chasing because the
> 2-element scatterlist are short-lived and can reside on the stack.
> 


Sure this is much better. 

But since you are already changing the two code sites, Here, and at
virtio to support chained scatterlist, why not fix it properly

  _ _ _ _ _ _
 |_|_|_|_|_|_|  scsi_cmnd scatterlist
 _ _/___
|virtio_scsi_cmd_req|virtio_scsi_cmd_req passed to virtio
 

Just a regularly defined header with an embedded pointer to a scatterlist.

And BTW the "scsi_cmnd scatterlist" can now be a chained one and virtio
can support that as well.

> 
> Other details (you can probably skip these):
> 
> There is also a response descriptor.  In the case of writes this is the
> only element that the hardware will write to, so in the case of writes
> the "written by hardware" scatterlist has 1 element only and does not
> need chaining.
> 


Again the scatterlist as a union for everything crap. Why why??
That is not at all SCSI. in iSCSI a response packet is well defined.

> Reads are entirely symmetric.  The hardware will read the request
> descriptor from a 1-element

Question about releasing buffers allocated to request->special

2012-07-25 Thread Aniket Kulkarni

Hello

I have a setup with a 12 daisy chained EXP3000 enclosures connected to
a server such that each of the disks are accessible via two paths
through multiple sas expanders. The server has 2 dual ported HBAs. I'm
running 2.6.32 kernel variant based on RHEL 6.0. I have seen this on
2.6.31 as well.

I am seeing a bunch of kernel panics pretty frequently with an
interesting pattern -

The kernel paniced trying to deference stuff from scsi_cmnd passed to  
scsi_dispatch_cmd. Looking at its caller


scsi_request_fn(request_queue *q)
device = q->quedata;
req = get_request_from_queue(q);

cmd = req->special;
if (!cmd)
return;

scsi_dispatch_cmd(cmd);

The interesting thing is cmd->device != device; they should've been  
the same. In fact it looks like req->special itself is garbage and  
that's why we have been looking at the wrong stuff in  
scsi_dispatch_cmd. It also explains why we can find the device on  
upper frames and prove that it (and all its attributes) are healthy  
all the way from MD to LSI.


The only place in scsi layer that frees req->special is

scsi_prep_return()

case BLKPREP_KILL:
req->errors = DID_NO_CONNECT << 16;
/* release the command and kill it */
if (req->special) {
struct scsi_cmnd *cmd = req->special;   <--
scsi_release_buffers(cmd);|__ not atomic
scsi_put_command(cmd);|
req->special = NULL;<--
}

I know this is a basic question, but I'm wondering how does one  
prevent threads from using req->special while its being deallocated  
here? I see checks for special == NULL at a lot of places but the  
release and reset of pointer (above) are not atomic.


Thanks in advance.
--
Aniket Kulkarni


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 'Device not ready' issue on mpt2sas since 3.1.10

2012-07-25 Thread Tejun Heo
Hello, James.

On Wed, Jul 25, 2012 at 06:19:13PM +0400, James Bottomley wrote:
> > I haven't consulted SAT but it seems like a bug in SAS driver or
> > firmware.  If it's a driver bug, we better fix it there.  If a
> > firmware bug, working around those is one of major roles of drivers,
> > so I think setting allow_restart is fine.
> 
> Actually, I don't think so.  SAT-2 section 8.12.2 does say 
> 
> if the device is in the stopped state as the result of
> processing a START STOP UNIT command (see 9.11), then the SATL
> shall terminate the TEST UNIT READY command with CHECK CONDITION
> status with the sense key set to NOT READY and the additional
> sense code of LOGICAL UNIT NOT READY, INITIALIZING COMMAND
> REQUIRED;
> 
> START STOP UNIT (with START=0) translates to STANDBY IMMEDIATE, and
> that's what hdparm -y issues.  We don't see this in /drivers/ata because
> TEST UNIT READY always returns success.

Urgh... ATA device in standby mode is ready for any command and
definitely doesn't need an "initializing command".  Oh, well...

> So it looks like the mpt2sas SAT is doing the correct thing and we only
> don't see this problem in normal SATA devices because of a bug in the
> libata-scsi SAT.

libata is inconsistent with the standard but I think the standard is
wrong here. :(

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio(-scsi) vs. chained sg_lists (was Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list)

2012-07-25 Thread Paolo Bonzini
Il 25/07/2012 17:28, Boaz Harrosh ha scritto:
>> 1) what I get is a scsi_cmnd which contains an N-element scatterlist.
>>
>> 2) virtio-scsi has to build the "packet" that is passed to the hardware
>> (it does not matter that the hardware is virtual).  This packet (per
>> virtio-scsi spec) has an N+1-element scatterlist, where the first
>> element is a request descriptor (struct virtio_scsi_cmd_req), and the
>> others describe the written data.
> 
> Then "virtio-scsi spec" is crap. It overloads the meaning of
> "struct scatterlist" of the first element in an array. to be a
> "struct virtio_scsi_cmd_req".

What the holy fuck?  The first element simply _points_ to the "struct
virtio_scsi_cmd_req", just like subsequent elements point to the data.

And the protocol of the device is _not_ a struct scatterlist[].  The
virtio _API_ takes that array and converts to a series of physical
address + offset pairs.

> Since you need to change the standard to support chaining then
> it is a good time to fix this.

Perhaps it is a good time for you to read the virtio spec.  You are
making a huge confusion between the LLD->virtio interface and the
virtio->hardware interface.  I'm talking only of the former.

>> 3) virtio takes care of converting the "packet" from a scatterlist
>> (which currently must be a flat one) to the hardware representation.
>> Here a walk is inevitable, so we don't care about this walk.
> 
> "hardware representation" you mean aio or biovec, what ever the
> IO submission path uses at the host end?

No, I mean the way the virtio spec encodes the physical address + offset
pairs.

I stopped reading here.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RESEND PATCH] SCSI: scsi_lib: fix scsi_io_completion's SG_IO error propagation

2012-07-25 Thread Mike Snitzer
The following v3.4-rc1 commit unmasked an existing bug in
scsi_io_completion's SG_IO error handling:
47ac56d [SCSI] scsi_error: classify some ILLEGAL_REQUEST sense as a permanent 
TARGET_ERROR

Given that certain ILLEGAL_REQUEST are now properly categorized as
TARGET_ERROR the host_byte is being set (before host_byte wasn't ever
set for these ILLEGAL_REQUEST).

In scsi_io_completion, initialize req->errors with cmd->result _after_
the SG_IO block that calls __scsi_error_from_host_byte (which may
modify the host_byte).

Before this fix:

cdb to send: 12 01 01 00 00 00
ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[12, 01, 01, 00, 00, 00],
mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=2, flags=0,
status=02, masked_status=01, sb[19]=[70, 00, 05, 00, 00, 00, 00, 0b,
00, 00, 00, 00, 24, 00, 00, 00, 00, 00, 00], host_status=0x10,
driver_status=0x8, resid=0, duration=0, info=0x1}) = 0
SCSI Status: Check Condition

Sense Information:
sense buffer empty

After:

cdb to send: 12 01 01 00 00 00
ioctl(3, SG_IO, {'S', SG_DXFER_NONE, cmd[6]=[12, 01, 01, 00, 00, 00],
mx_sb_len=32, iovec_count=0, dxfer_len=0, timeout=2, flags=0,
status=02, masked_status=01, sb[19]=[70, 00, 05, 00, 00, 00, 00, 0b,
00, 00, 00, 00, 24, 00, 00, 00, 00, 00, 00], host_status=0,
driver_status=0x8, resid=0, duration=0, info=0x1}) = 0
SCSI Status: Check Condition

Sense Information:
 Fixed format, current;  Sense key: Illegal Request
 Additional sense: Invalid field in cdb
 Raw sense data (in hex):
70 00 05 00 00 00 00 0b  00 00 00 00 24 00 00 00
00 00 00

Signed-off-by: Mike Snitzer 
Reported-by: Paolo Bonzini 
Tested-by: Paolo Bonzini 
Reviewed-by: Babu Moger 
Cc: sta...@vger.kernel.org [v3.4+]
---
 drivers/scsi/scsi_lib.c |5 -
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b583277..9377ed2 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -759,7 +759,6 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int 
good_bytes)
}
 
if (req->cmd_type == REQ_TYPE_BLOCK_PC) { /* SG_IO ioctl from block 
level */
-   req->errors = result;
if (result) {
if (sense_valid && req->sense) {
/*
@@ -775,6 +774,10 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned 
int good_bytes)
if (!sense_deferred)
error = __scsi_error_from_host_byte(cmd, 
result);
}
+   /*
+* __scsi_error_from_host_byte may have reset the host_byte
+*/
+   req->errors = cmd->result;
 
req->resid_len = scsi_get_resid(cmd);
 
-- 
1.7.4.4

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [resend PATCH 5/5] libsas, ipr: cleanup ata_host flags initialization via ata_host_init

2012-07-25 Thread Jeff Garzik

On 07/10/2012 12:06 AM, Dan Williams wrote:

libsas and ipr pass flags to ata_host_init that are meant for the port.

ata_host flags:
ATA_HOST_SIMPLEX= (1 << 0),   /* Host is simplex, one DMA 
channel per host only */
ATA_HOST_STARTED= (1 << 1),   /* Host started */
ATA_HOST_PARALLEL_SCAN  = (1 << 2),   /* Ports on this host can be 
scanned in parallel */
ATA_HOST_IGNORE_ATA = (1 << 3),   /* Ignore ATA devices on this 
host. */

flags passed by libsas:
ATA_FLAG_SATA   = (1 << 1),
ATA_FLAG_PIO_DMA= (1 << 7), /* PIO cmds via DMA */
ATA_FLAG_NCQ= (1 << 10), /* host supports NCQ */

The only one that aliases is ATA_HOST_STARTED which is a 'don't care' in
the libsas and ipr cases since ata_hosts from these sources are not
registered with libata.

Cc: Brian King 
Reported-by: Hannes Reinecke 
Signed-off-by: Dan Williams 
---
  drivers/ata/libata-core.c |   10 ++
  drivers/scsi/ipr.c|3 +--
  drivers/scsi/libsas/sas_ata.c |5 +
  include/linux/libata.h|3 +--


Acked-by: Jeff Garzik 




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio(-scsi) vs. chained sg_lists (was Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list)

2012-07-25 Thread Boaz Harrosh
On 07/25/2012 08:43 PM, Paolo Bonzini wrote:

> Il 25/07/2012 17:28, Boaz Harrosh ha scritto:
>>> 1) what I get is a scsi_cmnd which contains an N-element scatterlist.
>>>
>>> 2) virtio-scsi has to build the "packet" that is passed to the hardware
>>> (it does not matter that the hardware is virtual).  This packet (per
>>> virtio-scsi spec) has an N+1-element scatterlist, where the first
>>> element is a request descriptor (struct virtio_scsi_cmd_req), and the
>>> others describe the written data.
>>
>> Then "virtio-scsi spec" is crap. It overloads the meaning of
>> "struct scatterlist" of the first element in an array. to be a
>> "struct virtio_scsi_cmd_req".
> 
> What the holy fuck?  The first element simply _points_ to the "struct
> virtio_scsi_cmd_req", just like subsequent elements point to the data.
> 
> And the protocol of the device is _not_ a struct scatterlist[].  The
> virtio _API_ takes that array and converts to a series of physical
> address + offset pairs.
> 
>> Since you need to change the standard to support chaining then
>> it is a good time to fix this.
> 
> Perhaps it is a good time for you to read the virtio spec.  You are
> making a huge confusion between the LLD->virtio interface and the
> virtio->hardware interface.  I'm talking only of the former.
> 
>>> 3) virtio takes care of converting the "packet" from a scatterlist
>>> (which currently must be a flat one) to the hardware representation.
>>> Here a walk is inevitable, so we don't care about this walk.
>>
>> "hardware representation" you mean aio or biovec, what ever the
>> IO submission path uses at the host end?
> 
> No, I mean the way the virtio spec encodes the physical address + offset
> pairs.
> 
> I stopped reading here.
> 


The picture confused me. It looked like the first element is the 
virtio_scsi_cmd_req
not an sgilist-element that points to the struct's buffer.

In that case then yes your plan of making a two-elements fragment that points 
to the
original scsi-sglist is perfect. All you have to do is that, and all you have 
to do
at virtio is use the sg_for_each macro and you are done.

You don't need any sglist allocation or reshaping. And you can easily support
chaining. Looks like order of magnitude more simple then what you do now
So what is the problem?

And BTW you won't need that new __sg_set_page API anymore.

> Paolo


Cheers
Boaz
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 'Device not ready' issue on mpt2sas since 3.1.10

2012-07-25 Thread James Bottomley
On Wed, 2012-07-25 at 10:17 -0700, Tejun Heo wrote:
> Hello, James.
> 
> On Wed, Jul 25, 2012 at 06:19:13PM +0400, James Bottomley wrote:
> > > I haven't consulted SAT but it seems like a bug in SAS driver or
> > > firmware.  If it's a driver bug, we better fix it there.  If a
> > > firmware bug, working around those is one of major roles of drivers,
> > > so I think setting allow_restart is fine.
> > 
> > Actually, I don't think so.  SAT-2 section 8.12.2 does say 
> > 
> > if the device is in the stopped state as the result of
> > processing a START STOP UNIT command (see 9.11), then the SATL
> > shall terminate the TEST UNIT READY command with CHECK CONDITION
> > status with the sense key set to NOT READY and the additional
> > sense code of LOGICAL UNIT NOT READY, INITIALIZING COMMAND
> > REQUIRED;
> > 
> > START STOP UNIT (with START=0) translates to STANDBY IMMEDIATE, and
> > that's what hdparm -y issues.  We don't see this in /drivers/ata because
> > TEST UNIT READY always returns success.
> 
> Urgh... ATA device in standby mode is ready for any command and
> definitely doesn't need an "initializing command".  Oh, well...

Well, it does in sleep mode ... which seems to most closely map to what
SCSI thinks of as a stopped unit. I checked the specs just in case there
was an error ... they all say STANDBY not SLEEP.

> > So it looks like the mpt2sas SAT is doing the correct thing and we only
> > don't see this problem in normal SATA devices because of a bug in the
> > libata-scsi SAT.
> 
> libata is inconsistent with the standard but I think the standard is
> wrong here. :(

Well, reading it, so do I.  Unfortunately, we get to deal with the world
as it is rather than as we would wish it to be.  We likely have this
problem with a lot of USB SATLs as well ...

It looks like a hack like this might be needed.

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 4a6381c..7e59a7f 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -42,6 +42,8 @@
 
 #include 
 
+static void scsi_eh_done(struct scsi_cmnd *scmd);
+
 #define SENSE_TIMEOUT  (10*HZ)
 
 /*
@@ -241,6 +243,14 @@ static int scsi_check_sense(struct scsi_cmnd *scmd)
if (! scsi_command_normalize_sense(scmd, &sshdr))
return FAILED;  /* no valid sense data */
 
+   if (scmd->cmnd[0] == TEST_UNIT_READY && scmd->scsi_done != scsi_eh_done)
+   /* 
+* nasty: for mid-layer issued TURs, we need to return the
+* actual sense data without any recovery attempt.  For eh
+* issued ones, we need to try to recover and interpret
+*/
+   return SUCCESS;
+
if (scsi_sense_is_deferred(&sshdr))
return NEEDS_RETRY;
 
diff --git a/drivers/scsi/scsi_scan.c b/drivers/scsi/scsi_scan.c
index 56a9379..91d3366 100644
--- a/drivers/scsi/scsi_scan.c
+++ b/drivers/scsi/scsi_scan.c
@@ -764,6 +764,16 @@ static int scsi_add_lun(struct scsi_device *sdev, unsigned 
char *inq_result,
sdev->model = (char *) (sdev->inquiry + 16);
sdev->rev = (char *) (sdev->inquiry + 32);
 
+   if (strncmp(sdev->vendor, "ATA ", 8) == 0) {
+   /* 
+* sata emulation layer device.  This is a hack to work around
+* the SATL power management specifications which state that
+* when the SATL detects the device has gone into standby
+* mode, it shall respond with NOT READY.
+*/
+   sdev->allow_restart = 1;
+   }
+
if (*bflags & BLIST_ISROM) {
sdev->type = TYPE_ROM;
sdev->removable = 1;


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio(-scsi) vs. chained sg_lists (was Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list)

2012-07-25 Thread Paolo Bonzini
Il 25/07/2012 21:16, Boaz Harrosh ha scritto:
> The picture confused me. It looked like the first element is the 
> virtio_scsi_cmd_req
> not an sgilist-element that points to the struct's buffer.
> 
> In that case then yes your plan of making a two-elements fragment that points 
> to the
> original scsi-sglist is perfect. All you have to do is that, and all you have 
> to do
> at virtio is use the sg_for_each macro and you are done.
> 
> You don't need any sglist allocation or reshaping. And you can easily support
> chaining. Looks like order of magnitude more simple then what you do now

It is.

> So what is the problem?

That not all architectures have ARCH_HAS_SG_CHAIN (though all those I
care about do).  So I need to go through all architectures and make sure
they use for_each_sg, or at least to change ARCH_HAS_SG_CHAIN to a
Kconfig define so that dependencies can be expressed properly.

> And BTW you won't need that new __sg_set_page API anymore.

Kind of.

   sg_init_table(sg, 2);
   sg_set_buf(sg[0], req, sizeof(req));
   sg_chain(sg[1], scsi_out(sc));

is still a little bit worse than

   __sg_set_buf(sg[0], req, sizeof(req));
   __sg_chain(sg[1], scsi_out(sc));

Paolo
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [resend PATCH 5/5] libsas, ipr: cleanup ata_host flags initialization via ata_host_init

2012-07-25 Thread Brian King
On 07/09/2012 11:06 PM, Dan Williams wrote:
> libsas and ipr pass flags to ata_host_init that are meant for the port.
> 
> ata_host flags:
>   ATA_HOST_SIMPLEX= (1 << 0), /* Host is simplex, one DMA 
> channel per host only */
>   ATA_HOST_STARTED= (1 << 1), /* Host started */
>   ATA_HOST_PARALLEL_SCAN  = (1 << 2), /* Ports on this host can be 
> scanned in parallel */
>   ATA_HOST_IGNORE_ATA = (1 << 3), /* Ignore ATA devices on this 
> host. */
> 
> flags passed by libsas:
>   ATA_FLAG_SATA   = (1 << 1),
>   ATA_FLAG_PIO_DMA= (1 << 7), /* PIO cmds via DMA */
>   ATA_FLAG_NCQ= (1 << 10), /* host supports NCQ */
> 
> The only one that aliases is ATA_HOST_STARTED which is a 'don't care' in
> the libsas and ipr cases since ata_hosts from these sources are not
> registered with libata.
> 
> Cc: Brian King 
> Reported-by: Hannes Reinecke 
> Signed-off-by: Dan Williams 
> ---
>  drivers/ata/libata-core.c |   10 ++
>  drivers/scsi/ipr.c|3 +--
>  drivers/scsi/libsas/sas_ata.c |5 +
>  include/linux/libata.h|3 +--

Acked-by: Brian King 

-- 
Brian King
Power Linux I/O
IBM Linux Technology Center




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] scsi_lib: add NULL check to scsi_setup_blk_pc_cmnd

2012-07-25 Thread Jörn Engel
On Tue, 24 July 2012 09:01:41 +0400, James Bottomley wrote:
> On Mon, 2012-07-23 at 15:24 -0400, Jörn Engel wrote:
> > On Mon, 23 July 2012 23:45:55 +0400, James Bottomley wrote:
> > > 
> > > Have you checked this with the patches in scsi-misc?  There's a series
> > > of patches in there that alters the way sdev handling is done.
> > 
> > I would have liked to, but the tree referenced in MAINTAINERS does not
> > appear to exist:
> > 
> > git clone 
> > git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi-misc-2.6.git
> > Cloning into 'scsi-misc-2.6'...
> > fatal: The remote end hung up unexpectedly
> 
> git clone git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git
> 
> use the misc branch

joern@Sligo:/usr/src/kernel$ git clone --reference git/ 
git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git
Cloning into 'scsi'...
remote: Counting objects: 869, done.
remote: Compressing objects: 100% (420/420), done.
remote: Total 668 (delta 562), reused 274 (delta 248)
Receiving objects: 100% (668/668), 112.59 KiB, done.
Resolving deltas: 100% (562/562), completed with 110 local objects.
Checking out files: 100% (39127/39127), done.
joern@Sligo:/usr/src/kernel$ cd scsi/
joern@Sligo:/usr/src/kernel/scsi$ git branch
* master

Normally I never use branches, so there is a chance that I'm just
clueless.  But to my untrained eye, this looks as if there is no misc
branch.

Jörn

--
To my face you have the audacity to advise me to become a thief - the worst
kind of thief that is conceivable, a thief of spiritual things, a thief of
ideas! It is insufferable, intolerable!
-- M. Binet in Scarabouche
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git patches] libata updates

2012-07-25 Thread Jeff Garzik
On Wed, Jul 25, 2012 at 04:35:51PM -0400, Jeff Garzik wrote:
> Please pull 641589bff714f39b33ef1d7f02eaa009f2993b64 from
> git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git 
> tags/upstream
> 

Oh, I forgot to point out the merge commit, making my HEAD more recent
than might be expected.  There was a merge conflict and an API change
that needed to be dealt with, in order for your pull to be correct.

Jeff



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: virtio(-scsi) vs. chained sg_lists (was Re: [PATCH] scsi: virtio-scsi: Fix address translation failure of HighMem pages used by sg list)

2012-07-25 Thread Boaz Harrosh
On 07/25/2012 11:06 PM, Paolo Bonzini wrote:

> Il 25/07/2012 21:16, Boaz Harrosh ha scritto:
>> The picture confused me. It looked like the first element is the 
>> virtio_scsi_cmd_req
>> not an sgilist-element that points to the struct's buffer.
>>
>> In that case then yes your plan of making a two-elements fragment that 
>> points to the
>> original scsi-sglist is perfect. All you have to do is that, and all you 
>> have to do
>> at virtio is use the sg_for_each macro and you are done.
>>
>> You don't need any sglist allocation or reshaping. And you can easily support
>> chaining. Looks like order of magnitude more simple then what you do now
> 
> It is.
> 
>> So what is the problem?
> 
> That not all architectures have ARCH_HAS_SG_CHAIN (though all those I
> care about do).  So I need to go through all architectures and make sure
> they use for_each_sg, or at least to change ARCH_HAS_SG_CHAIN to a
> Kconfig define so that dependencies can be expressed properly.
> 


What is actually preventing ARCH_HAS_SG_CHAIN from all these ARCHES?
is that the DMA drivers not using for_each_sg(). Sounds like easy
to fix.

But yes a deep change would convert ARCH_HAS_SG_CHAIN to a Kconfig.

If you want to be lazy, like me, You might just put a BUILD_BUG_ON
in code, requesting the user to disable the driver for this ARCH.

I bet there is more things to do at ARCH to enable virtualization
then just support ARCH_HAS_SG_CHAIN. Be it just another requirement.

If you Document it and make sure current ARCHs are fine, it should
not ever trigger.

>> And BTW you won't need that new __sg_set_page API anymore.
> 
> Kind of.
> 
>sg_init_table(sg, 2);
>sg_set_buf(sg[0], req, sizeof(req));
>sg_chain(sg[1], scsi_out(sc));
> 
> is still a little bit worse than
> 
>__sg_set_buf(sg[0], req, sizeof(req));
>__sg_chain(sg[1], scsi_out(sc));
> 


I believe they are the same, specially for the
on the stack 2 elements array. Actually I think
In both cases you need to at least call sg_init_table()
once after allocation, No?

Your old code with big array copy and re-shaping was
a better example of the need for your new API. Which I agree.

But please for my sake do not call it __sg_chain. Call it
something like sg_chain_not_end(). I hate those "__" which
for god sack means what? 
(A good name is when I don't have to read the code, an "__"
 means "fuck you go read the code")

> Paolo


Thanks
Boaz
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tcm_vhost: Expose ABI version via VHOST_SCSI_GET_ABI_VERSION

2012-07-25 Thread Nicholas A. Bellinger
On Wed, 2012-07-25 at 16:10 +0300, Avi Kivity wrote:
> On 07/24/2012 11:45 PM, Nicholas A. Bellinger wrote:
> 
> >> > diff --git a/drivers/vhost/tcm_vhost.h b/drivers/vhost/tcm_vhost.h
> >> > index e942df9..3d5378f 100644
> >> > --- a/drivers/vhost/tcm_vhost.h
> >> > +++ b/drivers/vhost/tcm_vhost.h
> >> > @@ -80,7 +80,17 @@ struct tcm_vhost_tport {
> >> >  
> >> >  #include 
> >> >  
> >> > +/*
> >> > + * Used by QEMU userspace to ensure a consistent vhost-scsi ABI.
> >> > + *
> >> > + * ABI Rev 0: All pre 2012 revisions used by prototype out-of-tree code
> >> > + * ABI Rev 1: 2012 version for v3.6 kernel merge candiate
> >> > + */
> 
> 
> If it's out of tree, why consider it at all?  Put a stable ABI in tree
> and extend it in compatible ways.
> 
> 

This comment was supposed to convey that ABI=0 vhost-scsi userspace code
is not supported with tcm_vhost mainline code.

But obviously that was not clear enough here.  Updating the comment to
reflect to make this clear.

So the main question here was if it's fine to start with ABI=1, and
require >= ABI=1 for all vhost-scsi userspace code to function with
tcm_vhost.

The idea was to avoid confusion for the ABI=0 vhost-scsi code that's
been floating around for the last 2 years.

--nab







--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tcm_vhost: Expose ABI version via VHOST_SCSI_GET_ABI_VERSION

2012-07-25 Thread Nicholas A. Bellinger
On Wed, 2012-07-25 at 12:55 +0100, Stefan Hajnoczi wrote:
> On Tue, Jul 24, 2012 at 01:45:24PM -0700, Nicholas A. Bellinger wrote:
> > On Mon, 2012-07-23 at 18:56 -0700, Greg Kroah-Hartman wrote:
> > > On Tue, Jul 24, 2012 at 01:26:20AM +, Nicholas A. Bellinger wrote:
> > > > From: Nicholas Bellinger 



> > 
> > > > diff --git a/drivers/vhost/tcm_vhost.h b/drivers/vhost/tcm_vhost.h
> > > > index e942df9..3d5378f 100644
> > > > --- a/drivers/vhost/tcm_vhost.h
> > > > +++ b/drivers/vhost/tcm_vhost.h
> > > > @@ -80,7 +80,17 @@ struct tcm_vhost_tport {
> > > >  
> > > >  #include 
> > > >  
> > > > +/*
> > > > + * Used by QEMU userspace to ensure a consistent vhost-scsi ABI.
> > > > + *
> > > > + * ABI Rev 0: All pre 2012 revisions used by prototype out-of-tree code
> > > > + * ABI Rev 1: 2012 version for v3.6 kernel merge candiate
> > > > + */
> > > > +
> > > > +#define VHOST_SCSI_ABI_VERSION 1
> > > > +
> > > >  struct vhost_scsi_target {
> > > > +   int abi_version;
> > > > unsigned char vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > unsigned short vhost_tpgt;
> > > >  };
> > > > @@ -88,3 +98,4 @@ struct vhost_scsi_target {
> > > >  /* VHOST_SCSI specific defines */
> > > >  #define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct 
> > > > vhost_scsi_target)
> > > >  #define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct 
> > > > vhost_scsi_target)
> > > > +#define VHOST_SCSI_GET_ABI_VERSION _IOW(VHOST_VIRTIO, 0x42, struct 
> > > > vhost_scsi_target)
> > > 
> > > No, you just broke the ABI for version "0" here, that's not how you do
> > > this at all.
> > > 
> > 
> > The intention of this patch is use ABI=1 as a starting point for
> > tcm_vhost moving forward, with no back-wards compat for the ABI=0
> > prototype userspace code because:
> > 
> > - It's based on a slightly older version of QEMU (updating the QEMU series 
> > now)
> > - It does not have an GET_ABI_VERSION ioctl cmd (that starts with ABI=1)
> > - It has a small user-base of target + virtio-scsi developers
> > 
> > So I did consider just starting from ABI=0, but figured this would help
> > reduce the confusion for QEMU userspace wrt to the vhost-scsi code
> > that's been floating around out-of-tree for the last 2 years.
> 
> There is no real user base beyond the handful of people who have hacked
> on this.  Adding the GET_ABI_VERSION ioctl() at this stage is fine,
> especially considering that the userspace code that talks to tcm_vhost
> isn't in mainline in userspace yet either.


Do you have a preference for a VHOST_SCSI_ABI_VERSION starting point
here..?

I thought that v1 would be helpful to avoid confusion with the older
userspace code, but don't really have a strong opinion either way..

Let me know what you'd prefer here, and I'll make the changes to
tcm_vhost + vhost-scsi patch series accordingly.

Thanks!

--nab

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git patches] libata updates

2012-07-25 Thread Jeff Garzik

On 07/25/2012 04:35 PM, Jeff Garzik wrote:


Please pull 641589bff714f39b33ef1d7f02eaa009f2993b64 from
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git 
tags/upstream

(text copied from the upstream-linus tag)
Notable changes:

* Updating libata to directly bind with ACPI / runtime power mgmt.
   This is a pre-req for SATA ZPODD (CD-ROM power management).

   Touches ACPI (exports++) and SCSI in minor ways.  Has been in linux-next
   for weeks.

   The rest of [ZPODD] will probably come via SCSI tree, as it involves
   a lot of updates to the 'sr' driver etc.


BTW Lin and Aaron, note that this did not include these changes:

  sr: check support for device busy class events
  sr: support zero power ODD
  sr: make sure ODD is in resumed state in block ioctl

as in the end I wanted to put the brakes on SCSI-touching patches. 
These should be able to go into James' scsi-misc tree with the other 
SCSI-area ZPODD changes.


For those three 'sr' changes listed above, you may add

Acked-by: Jeff Garzik 

when moving them over.

Jeff



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git patches] libata updates

2012-07-25 Thread Linus Torvalds
On Wed, Jul 25, 2012 at 1:43 PM, Jeff Garzik  wrote:
> On Wed, Jul 25, 2012 at 04:35:51PM -0400, Jeff Garzik wrote:
>> Please pull 641589bff714f39b33ef1d7f02eaa009f2993b64 from
>> git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git 
>> tags/upstream
>>
>
> Oh, I forgot to point out the merge commit, making my HEAD more recent
> than might be expected.  There was a merge conflict and an API change
> that needed to be dealt with, in order for your pull to be correct.

So I'd *much* rather see an explanation of what the conflict is when
you ask me to pull, and let me handle it, rather than you pre-merging
things for me. I *want* to see conflicts between subsystems.
Seriously.

   Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git patches] libata updates

2012-07-25 Thread Jeff Garzik

On 07/25/2012 06:06 PM, Linus Torvalds wrote:

On Wed, Jul 25, 2012 at 1:43 PM, Jeff Garzik  wrote:

On Wed, Jul 25, 2012 at 04:35:51PM -0400, Jeff Garzik wrote:

Please pull 641589bff714f39b33ef1d7f02eaa009f2993b64 from
git://git.kernel.org/pub/scm/linux/kernel/git/jgarzik/libata-dev.git 
tags/upstream



Oh, I forgot to point out the merge commit, making my HEAD more recent
than might be expected.  There was a merge conflict and an API change
that needed to be dealt with, in order for your pull to be correct.


So I'd *much* rather see an explanation of what the conflict is when
you ask me to pull, and let me handle it, rather than you pre-merging
things for me. I *want* to see conflicts between subsystems.
Seriously.


Tried to add some explanation to the merge commit itself, giving plenty 
of detail.


Even so, separately, it still needed that post-merge compile fix.

Jeff



--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git patches] libata updates

2012-07-25 Thread Linus Torvalds
On Wed, Jul 25, 2012 at 3:26 PM, Jeff Garzik  wrote:
>
> Even so, separately, it still needed that post-merge compile fix.

And that's yet another example of how *NOT* to do things.

If the merge has errors like that, then they should be fixed up in the merge.

Please. Don't do this. Let me merge stuff, and you explain in the pull
request why it gets merge problems. Not this mess.

That merge itself was *trivial*. I do those kinds of fixups in my
sleep and you don't even need to explain those. The non-trivial part
you did as a separate commit. But neither of those should have been
"I'll pre-merge for Linus so that he doesn't see these problems".

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tcm_vhost: Expose ABI version via VHOST_SCSI_GET_ABI_VERSION

2012-07-25 Thread Stefan Hajnoczi
On Wed, Jul 25, 2012 at 02:14:50PM -0700, Nicholas A. Bellinger wrote:
> On Wed, 2012-07-25 at 12:55 +0100, Stefan Hajnoczi wrote:
> > On Tue, Jul 24, 2012 at 01:45:24PM -0700, Nicholas A. Bellinger wrote:
> > > On Mon, 2012-07-23 at 18:56 -0700, Greg Kroah-Hartman wrote:
> > > > On Tue, Jul 24, 2012 at 01:26:20AM +, Nicholas A. Bellinger wrote:
> > > > > From: Nicholas Bellinger 
> 
> 
> 
> > > 
> > > > > diff --git a/drivers/vhost/tcm_vhost.h b/drivers/vhost/tcm_vhost.h
> > > > > index e942df9..3d5378f 100644
> > > > > --- a/drivers/vhost/tcm_vhost.h
> > > > > +++ b/drivers/vhost/tcm_vhost.h
> > > > > @@ -80,7 +80,17 @@ struct tcm_vhost_tport {
> > > > >  
> > > > >  #include 
> > > > >  
> > > > > +/*
> > > > > + * Used by QEMU userspace to ensure a consistent vhost-scsi ABI.
> > > > > + *
> > > > > + * ABI Rev 0: All pre 2012 revisions used by prototype out-of-tree 
> > > > > code
> > > > > + * ABI Rev 1: 2012 version for v3.6 kernel merge candiate
> > > > > + */
> > > > > +
> > > > > +#define VHOST_SCSI_ABI_VERSION   1
> > > > > +
> > > > >  struct vhost_scsi_target {
> > > > > + int abi_version;
> > > > >   unsigned char vhost_wwpn[TRANSPORT_IQN_LEN];
> > > > >   unsigned short vhost_tpgt;
> > > > >  };
> > > > > @@ -88,3 +98,4 @@ struct vhost_scsi_target {
> > > > >  /* VHOST_SCSI specific defines */
> > > > >  #define VHOST_SCSI_SET_ENDPOINT _IOW(VHOST_VIRTIO, 0x40, struct 
> > > > > vhost_scsi_target)
> > > > >  #define VHOST_SCSI_CLEAR_ENDPOINT _IOW(VHOST_VIRTIO, 0x41, struct 
> > > > > vhost_scsi_target)
> > > > > +#define VHOST_SCSI_GET_ABI_VERSION _IOW(VHOST_VIRTIO, 0x42, struct 
> > > > > vhost_scsi_target)
> > > > 
> > > > No, you just broke the ABI for version "0" here, that's not how you do
> > > > this at all.
> > > > 
> > > 
> > > The intention of this patch is use ABI=1 as a starting point for
> > > tcm_vhost moving forward, with no back-wards compat for the ABI=0
> > > prototype userspace code because:
> > > 
> > > - It's based on a slightly older version of QEMU (updating the QEMU 
> > > series now)
> > > - It does not have an GET_ABI_VERSION ioctl cmd (that starts with ABI=1)
> > > - It has a small user-base of target + virtio-scsi developers
> > > 
> > > So I did consider just starting from ABI=0, but figured this would help
> > > reduce the confusion for QEMU userspace wrt to the vhost-scsi code
> > > that's been floating around out-of-tree for the last 2 years.
> > 
> > There is no real user base beyond the handful of people who have hacked
> > on this.  Adding the GET_ABI_VERSION ioctl() at this stage is fine,
> > especially considering that the userspace code that talks to tcm_vhost
> > isn't in mainline in userspace yet either.
> 
> 
> Do you have a preference for a VHOST_SCSI_ABI_VERSION starting point
> here..?
> 
> I thought that v1 would be helpful to avoid confusion with the older
> userspace code, but don't really have a strong opinion either way..
> 
> Let me know what you'd prefer here, and I'll make the changes to
> tcm_vhost + vhost-scsi patch series accordingly.

I don't think 0 for out-of-tree is needed.  I'd start at 0 but either
way is okay.

The main thing I would like to confirm is that this only versions the
tcm_vhost ioctls?  In that case a single version number works.

Stefan

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 'Device not ready' issue on mpt2sas since 3.1.10

2012-07-25 Thread tomm
Tejun Heo  kernel.org> writes:

> 
> Hello,
> 
> On Sat, Jul 21, 2012 at 02:15:56PM +0200, Matthias Prager wrote:
> > Now I'm not sure this isn't taping over another bug. Which leads me to
> > my question: What is the correct behavior?
> > 
> > #1 Issuing a separate spin-up command (START UNIT?) prior to sending i/o
> > by setting allow_restart=1 for sata disks on sas controllers
> > 
> > or
> > 
> > #2 Teaching the sas drivers they do not need spin-up commands and can
> > simply start issuing i/o to sata disks
> 
> I haven't consulted SAT but it seems like a bug in SAS driver or
> firmware.  If it's a driver bug, we better fix it there.  If a
> firmware bug, working around those is one of major roles of drivers,
> so I think setting allow_restart is fine.
> 
> Thanks.
> 
If this is a driver or firmware bug, then why would commit
85ef06d1d252f6a2e73b678591ab71caad4667bb 
cause this to happen?  What is the interaction between this issue
and this commit which just flushes events?

Also this issue does not happen with mvsas, only with mpt2sas.




--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git patches] libata updates

2012-07-25 Thread Jeff Garzik

On 07/25/2012 06:31 PM, Linus Torvalds wrote:

On Wed, Jul 25, 2012 at 3:26 PM, Jeff Garzik  wrote:


Even so, separately, it still needed that post-merge compile fix.


And that's yet another example of how *NOT* to do things.

If the merge has errors like that, then they should be fixed up in the merge.

Please. Don't do this. Let me merge stuff, and you explain in the pull
request why it gets merge problems. Not this mess.

That merge itself was *trivial*. I do those kinds of fixups in my
sleep and you don't even need to explain those. The non-trivial part
you did as a separate commit. But neither of those should have been
"I'll pre-merge for Linus so that he doesn't see these problems".


What is the right course in when a post-merge change is needed?

Jeff





--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [GIT PATCH 0/4] isci update for 3.6

2012-07-25 Thread Skirvin, Jeffrey D

James, is there any update available about the pull of the driver changes that
Dan mentioned?  Please let me know if there is anything we need to do to assist.

Regards,




-Original Message-
From: linux-scsi-ow...@vger.kernel.org 
[mailto:linux-scsi-ow...@vger.kernel.org] On Behalf Of Dan Williams
Sent: Wednesday, July 11, 2012 2:59 PM
To: jbottom...@parallels.com
Cc: linux-scsi@vger.kernel.org
Subject: [GIT PATCH 0/4] isci update for 3.6

James,

This is the backlog of the libsas indepedent driver changes for 3.6.  Just a
small collection of fixlets.  Please pull, thanks.

--
Dan


The following changes since commit f8f5701bdaf9134b1f90e5044a82c66324d2073f:

  Linux 3.5-rc1 (2012-06-02 18:29:26 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/djbw/isci.git tags/isci-for-3.6

for you to fetch changes up to 6734092e66011def7875bd67beef889d0fee1cc9:

  isci: add a couple __iomem annotations (2012-07-03 12:09:32 -0700)



isci update for 3.6

1/ Fix the workaround for drives that have a slow response to COMSAS.
   Drives with this problem intermittently take a long time to be
   identified, or fail to be identified altogether.

2/ A minor fix for the efi variable code failure path

3/ A handful of smatch fixups from Dan Carpenter


Dan Carpenter (2):
  isci: make function declaration match implementation
  isci: add a couple __iomem annotations

Dan Williams (1):
  isci: fix isci_pci_probe() generates warning on efi failure path

Dave Maurer (1):
  isci: fix COMSAS negation timout workaround for WD SAS drives

 drivers/scsi/isci/host.c| 2 +-
 drivers/scsi/isci/init.c| 1 -
 drivers/scsi/isci/phy.c | 4 +++-
 drivers/scsi/isci/probe_roms.c  | 1 -
 drivers/scsi/isci/remote_node_context.h | 2 +-
 5 files changed, 5 insertions(+), 5 deletions(-)
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git patches] libata updates

2012-07-25 Thread Linus Torvalds
On Wed, Jul 25, 2012 at 3:58 PM, Jeff Garzik  wrote:
>
> What is the right course in when a post-merge change is needed?

Just describe the issue and the required change. Than I can just do it
as part of the merge, and now the whole series is bisectable,
including the merge itself.

Here's a (fairly bad) example:

  http://www.spinics.net/lists/netdev/msg192349.html

and the reason I call that a bad example is not because that's a bad
pull request, but simply that those are all real data conflicts, not
the more subtle kind of "it merges fine, but because new code
introduced uses an interface that changed, you need to do xyz".

I couldn't find an example of that in a quick look, it's fairly
uncommon to have non-conflicting merges that had semantic - but not
contextual - conflicts. (Although it does happen, and sometimes it's
actually not the developer, but Stephen Rothwell who notices it in
-next and lets me know before the merge).

  Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: 'Device not ready' issue on mpt2sas since 3.1.10

2012-07-25 Thread Matthias Prager
Hello James,

Am 25.07.2012 21:55, schrieb James Bottomley:>
> It looks like a hack like this might be needed.
>
> James
>



I don't yet understand all the code but I'm following your discussion
with Tejun: I've set up a minimal vm running gentoo with a mpt2sas
driven controller in passthrough mode. I've applied your proposed patch
against the vanilla 3.5.0 kernel (which includes Tejun's commit), and
I'm happy to report the problem does seem to get fixed by it.
Well at least sending the sata drive in standby using 'hdparm -y' now
works (according to 'hdparm -C') without these nasty i/o errors on later
i/o. That is to say the drive wakes up again (e.g. from a 'fdisk -l
/dev/sda' command) and returns data.

--
Matthias
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git patches] libata updates

2012-07-25 Thread Jeff Garzik

On 07/25/2012 07:30 PM, Linus Torvalds wrote:

On Wed, Jul 25, 2012 at 3:58 PM, Jeff Garzik  wrote:


What is the right course in when a post-merge change is needed?


Just describe the issue and the required change. Than I can just do it
as part of the merge, and now the whole series is bisectable,
including the merge itself.

Here's a (fairly bad) example:

   http://www.spinics.net/lists/netdev/msg192349.html

and the reason I call that a bad example is not because that's a bad
pull request, but simply that those are all real data conflicts, not
the more subtle kind of "it merges fine, but because new code
introduced uses an interface that changed, you need to do xyz".


Thanks, so noted.  I guess if the merge gets more complex than something 
easily described in an email, that implies that maintainers should do 
more cross-coordination and maybe a merge tree.


What's the best way for libata to move forward, now that this hideous 
merge has been pushed out to the Well Known libata branches?  The 
pre-jgarzik-merge commit you would have pulled is 
dc7f71f486f4f5fa96f6dcf86833da020cde8a11 had my pull request been proper.


I can lop off the top 3 commits and force-update the libata-dev.git 
branches, then send a new pull request -- but you have grumbled at that 
sort of behavior in maintainer trees before too...


Jeff


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] tcm_vhost: Expose ABI version via VHOST_SCSI_GET_ABI_VERSION

2012-07-25 Thread Nicholas A. Bellinger
On Wed, 2012-07-25 at 23:35 +0100, Stefan Hajnoczi wrote:
> On Wed, Jul 25, 2012 at 02:14:50PM -0700, Nicholas A. Bellinger wrote:
> > On Wed, 2012-07-25 at 12:55 +0100, Stefan Hajnoczi wrote:
> > > On Tue, Jul 24, 2012 at 01:45:24PM -0700, Nicholas A. Bellinger wrote:
> > > > On Mon, 2012-07-23 at 18:56 -0700, Greg Kroah-Hartman wrote:
> > > > > On Tue, Jul 24, 2012 at 01:26:20AM +, Nicholas A. Bellinger wrote:
> > > > > > From: Nicholas Bellinger 



> > > > The intention of this patch is use ABI=1 as a starting point for
> > > > tcm_vhost moving forward, with no back-wards compat for the ABI=0
> > > > prototype userspace code because:
> > > > 
> > > > - It's based on a slightly older version of QEMU (updating the QEMU 
> > > > series now)
> > > > - It does not have an GET_ABI_VERSION ioctl cmd (that starts with ABI=1)
> > > > - It has a small user-base of target + virtio-scsi developers
> > > > 
> > > > So I did consider just starting from ABI=0, but figured this would help
> > > > reduce the confusion for QEMU userspace wrt to the vhost-scsi code
> > > > that's been floating around out-of-tree for the last 2 years.
> > > 
> > > There is no real user base beyond the handful of people who have hacked
> > > on this.  Adding the GET_ABI_VERSION ioctl() at this stage is fine,
> > > especially considering that the userspace code that talks to tcm_vhost
> > > isn't in mainline in userspace yet either.
> > 
> > 
> > Do you have a preference for a VHOST_SCSI_ABI_VERSION starting point
> > here..?
> > 
> > I thought that v1 would be helpful to avoid confusion with the older
> > userspace code, but don't really have a strong opinion either way..
> > 
> > Let me know what you'd prefer here, and I'll make the changes to
> > tcm_vhost + vhost-scsi patch series accordingly.
> 
> I don't think 0 for out-of-tree is needed.  I'd start at 0 but either
> way is okay.
> 



In that case, respinning a -v5 for tcm_vhost to start from ABI=0 and
will post an updated patch shortly.

> The main thing I would like to confirm is that this only versions the
> tcm_vhost ioctls?  In that case a single version number works.
> 

Correct, the GET_ABI_VERSION call is only intended to identify the
changing of tcm_vhost ioctls.

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git patches] libata updates

2012-07-25 Thread Aaron Lu

On 07/26/2012 05:38 AM, Jeff Garzik wrote:

On 07/25/2012 04:35 PM, Jeff Garzik wrote:

* Updating libata to directly bind with ACPI / runtime power mgmt.
This is a pre-req for SATA ZPODD (CD-ROM power management).

Touches ACPI (exports++) and SCSI in minor ways. Has been in linux-next
for weeks.

The rest of [ZPODD] will probably come via SCSI tree, as it involves
a lot of updates to the 'sr' driver etc.


BTW Lin and Aaron, note that this did not include these changes:

sr: check support for device busy class events
sr: support zero power ODD
sr: make sure ODD is in resumed state in block ioctl

as in the end I wanted to put the brakes on SCSI-touching patches. These
should be able to go into James' scsi-misc tree with the other SCSI-area
ZPODD changes.

For those three 'sr' changes listed above, you may add

Acked-by: Jeff Garzik 

when moving them over.


Thanks Jeff.

Hi James,
I'll prepare these dropped patches plus some other fixes for ZPODD which
I've sent v2 recently and merge them into v3 for you to review.

Thanks,
Aaron

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git patches] libata updates

2012-07-25 Thread James Bottomley
On Thu, 2012-07-26 at 12:47 +0800, Aaron Lu wrote:
> On 07/26/2012 05:38 AM, Jeff Garzik wrote:
> > On 07/25/2012 04:35 PM, Jeff Garzik wrote:
> >> * Updating libata to directly bind with ACPI / runtime power mgmt.
> >> This is a pre-req for SATA ZPODD (CD-ROM power management).
> >>
> >> Touches ACPI (exports++) and SCSI in minor ways. Has been in linux-next
> >> for weeks.
> >>
> >> The rest of [ZPODD] will probably come via SCSI tree, as it involves
> >> a lot of updates to the 'sr' driver etc.
> >
> > BTW Lin and Aaron, note that this did not include these changes:
> >
> > sr: check support for device busy class events
> > sr: support zero power ODD
> > sr: make sure ODD is in resumed state in block ioctl
> >
> > as in the end I wanted to put the brakes on SCSI-touching patches. These
> > should be able to go into James' scsi-misc tree with the other SCSI-area
> > ZPODD changes.
> >
> > For those three 'sr' changes listed above, you may add
> >
> > Acked-by: Jeff Garzik 
> >
> > when moving them over.
> 
> Thanks Jeff.
> 
> Hi James,
> I'll prepare these dropped patches plus some other fixes for ZPODD which
> I've sent v2 recently and merge them into v3 for you to review.

They weren't exactly dropped ... I've been waiting for you to address
Alan Stern's comments, since he's our resident expert on suspend/resume.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [GIT PATCH 0/4] isci update for 3.6

2012-07-25 Thread James Bottomley
On Wed, 2012-07-25 at 23:08 +, Skirvin, Jeffrey D wrote:
> James, is there any update available about the pull of the driver changes that
> Dan mentioned?  Please let me know if there is anything we need to do to 
> assist.

Oh, oops.  When I had to rebase the branches to accommodate Linus' fix
for the async problem, this got lost.  I'll readd it.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git patches] libata updates

2012-07-25 Thread Aaron Lu

On 07/26/2012 01:05 PM, James Bottomley wrote:

On Thu, 2012-07-26 at 12:47 +0800, Aaron Lu wrote:

On 07/26/2012 05:38 AM, Jeff Garzik wrote:

On 07/25/2012 04:35 PM, Jeff Garzik wrote:

* Updating libata to directly bind with ACPI / runtime power mgmt.
This is a pre-req for SATA ZPODD (CD-ROM power management).

Touches ACPI (exports++) and SCSI in minor ways. Has been in linux-next
for weeks.

The rest of [ZPODD] will probably come via SCSI tree, as it involves
a lot of updates to the 'sr' driver etc.


BTW Lin and Aaron, note that this did not include these changes:

sr: check support for device busy class events
sr: support zero power ODD
sr: make sure ODD is in resumed state in block ioctl

as in the end I wanted to put the brakes on SCSI-touching patches. These
should be able to go into James' scsi-misc tree with the other SCSI-area
ZPODD changes.

For those three 'sr' changes listed above, you may add

Acked-by: Jeff Garzik

when moving them over.


Thanks Jeff.

Hi James,
I'll prepare these dropped patches plus some other fixes for ZPODD which
I've sent v2 recently and merge them into v3 for you to review.


They weren't exactly dropped ... I've been waiting for you to address
Alan Stern's comments, since he's our resident expert on suspend/resume.


Oh, I forgot to mention, that I agree with Alan's comments and have
addressed them in my v2 patches here:
http://marc.info/?l=linux-scsi&m=134312317325650&w=2

The 2 patches Alan has comments are:
http://marc.info/?l=linux-scsi&m=134312311025619&w=2
http://marc.info/?l=linux-scsi&m=134312308225610&w=2

Hi Alan,
Are the v2 patches look OK to you?

And James,
Do you want me to rebase these patches on top of scsi-misc tree?

Thanks,
Aaron

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html