from:"Sarah Sharp"

Re: Kernel 3.16.0 USB crash

2014-08-14 Thread Sarah Sharp

Adding Mathias Nyman.  He is now the USB 3.0 maintainer.

Sarah Sharp

On Thu, Aug 14, 2014 at 11:46:33AM +0200, Hans de Goede wrote:
> Hi,
> 
> On 08/14/2014 10:39 AM, Claudio Bizzarri wrote:
> > Ciao,
> > 
> > thank you very much for the reply. You are right: it's the UAS module. Now
> > I'm using Ubuntu 14.04 with kernel 3.16.1 from
> > http://kernel.ubuntu.com/~kernel-ppa/mainline/. There is no /proc/config.gz,
> > but there is a config file in /boot:
> > 
> > b0@hp850ssd:~⟫ grep USB_UAS /boot/config-3.16.1-031601-generic
> > CONFIG_USB_UAS=m
> > 
> > When I attach my external USB disk I have 30 seconds before my laptop
> > freezes. Here is my dmesg output; the disk is not mounted:
> 
> Hmm, this sounds like a similar problem we've been having with JMicron UAS
> bridges over USB-2.
> 
> Can you collect "lsusb -v" output for the drive in question when connected
> through a usb-3 port? (The uas module does not need to be loaded.)
> 
> Also, can you try the following patch and see if that makes uas work?
> 
> diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c
> index 511b229..6cdc1b9 100644
> --- a/drivers/usb/storage/uas.c
> +++ b/drivers/usb/storage/uas.c
> @@ -1033,6 +1033,7 @@ static int uas_configure_endpoints(struct uas_dev_info *devinfo)
>  						    3, 256, GFP_NOIO);
>  		if (devinfo->qdepth < 0)
>  			return devinfo->qdepth;
> +		devinfo->qdepth = 32;
>  		devinfo->use_streams = 1;
>  	}
> 
> 
> This is in essence the fix we've done for using these devices with uas over
> usb-2. I would have expected this not to be necessary at superspeed, since
> there the number of streams the device supports is part of the usb
> descriptors, but maybe the device claims to support more streams than it can
> actually handle.
> 
> Note I'm on vacation next week, so don't expect another reply from me in this
> thread for at least a week.
> 
> Regards,
> 
> Hans
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
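
The test patch quoted above simply overrides the stream count returned by
usb_alloc_streams() with a fixed 32. A minimal sketch of the same idea written
as a clamp instead of an unconditional assignment follows. The helper name
clamp_qdepth() and its cap parameter are hypothetical and do not appear in
uas.c; the only value taken from the thread is 32.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of uas.c.
 *
 * advertised: stream count the device reports, or a negative
 *             errno-style value if stream allocation failed.
 * cap:        the largest queue depth we are willing to trust
 *             (the quoted patch hard-codes 32).
 *
 * Returns the error unchanged, otherwise the advertised depth
 * limited to cap.
 */
static int clamp_qdepth(int advertised, int cap)
{
	assert(cap > 0);		/* a non-positive cap makes no sense */

	if (advertised < 0)
		return advertised;	/* propagate the failure */
	if (advertised > cap)
		return cap;		/* device claims more than we trust */
	return advertised;
}
```

With a cap of 32, a device advertising 256 streams would be driven with 32,
while a device advertising 16 would keep its own value. Note that this differs
from the quoted patch, which sets 32 even when the device advertises fewer.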

Re: Kernel Debugging Support

2014-08-04 Thread Sarah Sharp

On Mon, Aug 04, 2014 at 07:11:07PM -0400, Nick Krause wrote:
> On Mon, Aug 4, 2014 at 7:03 PM, Paul Zimmerman
> <paul.zimmer...@synopsys.com> wrote:
> >> From: Nick Krause [mailto:xerofo...@gmail.com]

[snip]

> >> Paul,
> >> My computer is rather old now, from the Sandy Bridge days, so I probably
> >> can't test the patch on my own machine. However, I will look at the code
> >> and see if I can forward port it against the usb git tree I have a current
> >> version of. In addition, I would like the new xhci maintainer's information
> >> in order to send out a patch with the maintainer for xhci updated.
> >
> > Sarah already told you who the new maintainer is, and then CCed him
> > on this thread. Hint: There is a file named 'MAINTAINERS' in the root
> > of the kernel tree, which tells you who the maintainers are for all of
> > the subsystems. Please read Documentation/SubmittingPatches, it has a
> > lot of information like this that you need to know.
> >
> > --
> > Paul
> >
> Thanks, I will read this file, and thanks for the information. I know
> where the file is; I will add the information then.

You may be looking at an older version of MAINTAINERS.  Mathias has only
been marked as the maintainer since 3.15.  Which kernel version are you
working on?

Sarah Sharp

Re: Kernel Debugging Support

2014-08-04 Thread Sarah Sharp

On Sat, Aug 02, 2014 at 12:47:59AM -0400, Nick Krause wrote:
> Hey Sharp,

Hi Krause,

Please Cc the new xHCI driver maintainer, Mathias Nyman.  I'm officially
retired from USB development and have moved onto other projects. :)

> After reading around seems people want support for usb debugging in
> kgdb or other usb based solutions.

Yes, early boot USB debug over xHCI has been a low-priority feature
request.

Unfortunately, USB debug is an optional feature for xHCI host
controllers, so many of them don't have USB debug ports.  Even if it is
included, the USB debug ports are often either routed to internal USB
ports or not exposed on the board altogether.  Since it's not widely
available, it's not a high priority to implement.

> If you and the other developers are able to help me out a bit, as I am
> new, I can definitely write this area of kgdb support.

The kgdb support is all there.  It's the xHCI host controller driver
debug port support that is needed.

Hmm, I only see one commit from your email address in Greg's tree.  I
think you should work on some smaller clean ups and bug fixes, and get
some more patches into mainline before you go on to tackle a larger
feature like this.  Perhaps Mathias has a smaller task that would be
good for you to tackle?

There's a good tutorial here on how to create a kernel patch, and
respond properly to patch review and rework requests:

http://kernelnewbies.org/OPWfirstpatch

And here's my personal philosophy on how to create a patchset:

http://sarah.thesharps.us/2013/05/08/patchsets-for-dinner/

> Regards Nick
> P.S. If  you want Sharp I can change the commit message on my other
> commit you didn't like if me or you
> talk to Greg in order to remove in from mainline, if no that's OK too.

I don't understand what the above sentence means.  What commit message
are you referencing?  What do you mean by "remove in from mainline"?

Try again?

Sarah Sharp

Re: [PATCH v2] usb: xhci: Correct last context entry calculation for Configure Endpoint

2014-04-30 Thread Sarah Sharp

Script is attached now.

Sarah

On Wed, Apr 30, 2014 at 04:04:24PM -0700, Sarah Sharp wrote:
> Hi Mathias,
> 
> I tested both this patch and your global command queue patches on top of
> your for-usb-linus branch.  After reverting commit 400362f1d8dc "ALSA:
> usb-audio: Resume mixer values properly", I was able to get my USB
> webcam working. [1]
> 
> I wrote a small shell script (attached) to start and kill guvcview over
> and over, so that I could test the xHCI driver issuing Configure
> Endpoint commands, and proceeded to plug and unplug a VIA USB 3.0 hub in
> over and over again.  I got occasional descriptor fetch errors and once
> saw a Set Address timeout, and everything seemed to work as expected.
> 
> In short, I think it's fine to merge Julius' patch to usb-linus and your
> command queue patches to usb-next.
> 
> Sarah Sharp
> 
> [1] https://lkml.org/lkml/2014/4/19/117
> 
> On Tue, Apr 29, 2014 at 10:38:17AM -0700, Julius Werner wrote:
> > The current XHCI driver recalculates the Context Entries field in the
> > Slot Context on every add_endpoint() and drop_endpoint() call. In the
> > case of drop_endpoint(), it seems to assume that the add_flags will
> > always contain every endpoint for the new configuration, which is not
> > necessarily correct if you don't make assumptions about how the USB core
> > uses the add_endpoint/drop_endpoint interface (add_flags only contains
> > endpoints that are new additions in the new configuration).
> > 
> > Furthermore, EP0_FLAG is not consistently set in add_flags throughout
> > the lifetime of a device. This means that when all endpoints are
> > dropped, the Context Entries field can be set to 0 (which is invalid and
> > may cause a Parameter Error) or -1 (which is interpreted as 31 and
> > causes the driver to keep using the old, incorrect value).
> > 
> > The only surefire way to set this field right is to also take all
> > existing endpoints into account, and to force the value to 1 (meaning
> > only EP0 is active) if no other endpoint is found. This patch implements
> > that as a single step in the final check_bandwidth() call and removes
> > the intermediary calculations from add_endpoint() and drop_endpoint().
> > 
> > > Signed-off-by: Julius Werner <jwer...@chromium.org>
> > ---
> >  drivers/usb/host/xhci.c | 51 
> > +
> >  1 file changed, 18 insertions(+), 33 deletions(-)
> > 
> > diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> > index 924a6cc..fec6423 100644
> > --- a/drivers/usb/host/xhci.c
> > +++ b/drivers/usb/host/xhci.c
> > @@ -1562,12 +1562,10 @@ int xhci_drop_endpoint(struct usb_hcd *hcd, struct 
> > usb_device *udev,
> > struct xhci_hcd *xhci;
> > struct xhci_container_ctx *in_ctx, *out_ctx;
> > struct xhci_input_control_ctx *ctrl_ctx;
> > -   struct xhci_slot_ctx *slot_ctx;
> > -   unsigned int last_ctx;
> > unsigned int ep_index;
> > struct xhci_ep_ctx *ep_ctx;
> > u32 drop_flag;
> > -   u32 new_add_flags, new_drop_flags, new_slot_info;
> > +   u32 new_add_flags, new_drop_flags;
> > int ret;
> >  
> > ret = xhci_check_args(hcd, udev, ep, 1, true, __func__);
> > @@ -1614,24 +1612,13 @@ int xhci_drop_endpoint(struct usb_hcd *hcd, struct 
> > usb_device *udev,
> > ctrl_ctx->add_flags &= cpu_to_le32(~drop_flag);
> > new_add_flags = le32_to_cpu(ctrl_ctx->add_flags);
> >  
> > -   last_ctx = xhci_last_valid_endpoint(le32_to_cpu(ctrl_ctx->add_flags));
> > -   slot_ctx = xhci_get_slot_ctx(xhci, in_ctx);
> > -   /* Update the last valid endpoint context, if we deleted the last one */
> > -   if ((le32_to_cpu(slot_ctx->dev_info) & LAST_CTX_MASK) >
> > -   LAST_CTX(last_ctx)) {
> > -   slot_ctx->dev_info &= cpu_to_le32(~LAST_CTX_MASK);
> > -   slot_ctx->dev_info |= cpu_to_le32(LAST_CTX(last_ctx));
> > -   }
> > -   new_slot_info = le32_to_cpu(slot_ctx->dev_info);
> > -
> > xhci_endpoint_zero(xhci, xhci->devs[udev->slot_id], ep);
> >  
> > > -   xhci_dbg(xhci, "drop ep 0x%x, slot id %d, new drop flags = %#x, new add flags = %#x, new slot info = %#x\n",
> > > +   xhci_dbg(xhci, "drop ep 0x%x, slot id %d, new drop flags = %#x, new add flags = %#x\n",
> > (unsigned int) ep->desc.bEndpointAddress,
> > udev->slot_id,
> > (unsigned int) new_drop_flags,
> > -   (unsigned int) new_add_flags,
> > -

Re: [PATCH v2] usb: xhci: Correct last context entry calculation for Configure Endpoint

2014-04-30 Thread Sarah Sharp

Hi Mathias,

I tested both this patch and your global command queue patches on top of
your for-usb-linus branch.  After reverting commit 400362f1d8dc "ALSA:
usb-audio: Resume mixer values properly", I was able to get my USB
webcam working. [1]

I wrote a small shell script (attached) to start and kill guvcview over
and over, so that I could test the xHCI driver issuing Configure
Endpoint commands, and proceeded to plug and unplug a VIA USB 3.0 hub in
over and over again.  I got occasional descriptor fetch errors and once
saw a Set Address timeout, and everything seemed to work as expected.

In short, I think it's fine to merge Julius' patch to usb-linus and your
command queue patches to usb-next.

Sarah Sharp

[1] https://lkml.org/lkml/2014/4/19/117

On Tue, Apr 29, 2014 at 10:38:17AM -0700, Julius Werner wrote:
> The current XHCI driver recalculates the Context Entries field in the
> Slot Context on every add_endpoint() and drop_endpoint() call. In the
> case of drop_endpoint(), it seems to assume that the add_flags will
> always contain every endpoint for the new configuration, which is not
> necessarily correct if you don't make assumptions about how the USB core
> uses the add_endpoint/drop_endpoint interface (add_flags only contains
> endpoints that are new additions in the new configuration).
> 
> Furthermore, EP0_FLAG is not consistently set in add_flags throughout
> the lifetime of a device. This means that when all endpoints are
> dropped, the Context Entries field can be set to 0 (which is invalid and
> may cause a Parameter Error) or -1 (which is interpreted as 31 and
> causes the driver to keep using the old, incorrect value).
> 
> The only surefire way to set this field right is to also take all
> existing endpoints into account, and to force the value to 1 (meaning
> only EP0 is active) if no other endpoint is found. This patch implements
> that as a single step in the final check_bandwidth() call and removes
> the intermediary calculations from add_endpoint() and drop_endpoint().
> 
> Signed-off-by: Julius Werner <jwer...@chromium.org>
> ---
>  drivers/usb/host/xhci.c | 51 
> +
>  1 file changed, 18 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 924a6cc..fec6423 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -1562,12 +1562,10 @@ int xhci_drop_endpoint(struct usb_hcd *hcd, struct usb_device *udev,
>   struct xhci_hcd *xhci;
>   struct xhci_container_ctx *in_ctx, *out_ctx;
>   struct xhci_input_control_ctx *ctrl_ctx;
> - struct xhci_slot_ctx *slot_ctx;
> - unsigned int last_ctx;
>   unsigned int ep_index;
>   struct xhci_ep_ctx *ep_ctx;
>   u32 drop_flag;
> - u32 new_add_flags, new_drop_flags, new_slot_info;
> + u32 new_add_flags, new_drop_flags;
>   int ret;
>  
>   ret = xhci_check_args(hcd, udev, ep, 1, true, __func__);
> @@ -1614,24 +1612,13 @@ int xhci_drop_endpoint(struct usb_hcd *hcd, struct usb_device *udev,
>   ctrl_ctx->add_flags &= cpu_to_le32(~drop_flag);
>   new_add_flags = le32_to_cpu(ctrl_ctx->add_flags);
>  
> - last_ctx = xhci_last_valid_endpoint(le32_to_cpu(ctrl_ctx->add_flags));
> - slot_ctx = xhci_get_slot_ctx(xhci, in_ctx);
> - /* Update the last valid endpoint context, if we deleted the last one */
> - if ((le32_to_cpu(slot_ctx->dev_info) & LAST_CTX_MASK) >
> - LAST_CTX(last_ctx)) {
> - slot_ctx->dev_info &= cpu_to_le32(~LAST_CTX_MASK);
> - slot_ctx->dev_info |= cpu_to_le32(LAST_CTX(last_ctx));
> - }
> - new_slot_info = le32_to_cpu(slot_ctx->dev_info);
> -
>   xhci_endpoint_zero(xhci, xhci->devs[udev->slot_id], ep);
>  
> - xhci_dbg(xhci, "drop ep 0x%x, slot id %d, new drop flags = %#x, new add flags = %#x, new slot info = %#x\n",
> + xhci_dbg(xhci, "drop ep 0x%x, slot id %d, new drop flags = %#x, new add flags = %#x\n",
>   (unsigned int) ep->desc.bEndpointAddress,
>   udev->slot_id,
>   (unsigned int) new_drop_flags,
> - (unsigned int) new_add_flags,
> - (unsigned int) new_slot_info);
> + (unsigned int) new_add_flags);
>   return 0;
>  }
>  
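> @@ -1654,11 +1641,9 @@ int xhci_add_endpoint(struct usb_hcd *hcd, struct usb_device *udev,
>   struct xhci_hcd *xhci;
>   struct xhci_container_ctx *in_ctx, *out_ctx;
>   unsigned int ep_index;
> - struct xhci_slot_ctx *slot_ctx;
>   struct xhci_input_control_ctx *ctrl_ctx;
>   u32 added_ctxs;
> - unsigned int last_ctx;
> - u32 new_add_flags, new_drop_flags, new_slot_info;
> + u32 new_add_flags, new_drop_flags;
>   struct xhci_virt_device *virt_dev;
>   int ret = 0;
>  
> @@ -1673,7 +1658,6 @@ int xhci_add_endpoint(struct usb_hcd *hcd, struct usb_device *udev,
>   return -ENODEV;
>  
>   added_ctxs = xhci_get_endpoint_flag(&ep->desc);
> - last_ctx = xhci_last_valid_endpoint(added_ctxs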
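
The commit message quoted above describes computing the Context Entries field
in one step, from the endpoints that already exist plus the pending add and
drop flags, and forcing the result to 1 when only EP0 remains. A minimal
standalone sketch of that calculation follows. It is not the code from the
patch: the function name context_entries() and the "existing" bitmask are
hypothetical stand-ins for the driver's own bookkeeping, and the assertion
encodes the assumption that the slot context and EP0 are never dropped.

```c
#include <assert.h>
#include <stdint.h>

#define SLOT_FLAG (UINT32_C(1) << 0)	/* bit 0: slot context */
#define EP0_FLAG  (UINT32_C(1) << 1)	/* bit 1: default control endpoint */

/*
 * Hypothetical helper, not taken from xhci.c.
 *
 * existing:   bitmask of endpoint contexts that are already set up
 * add_flags:  contexts added by the pending configuration
 * drop_flags: contexts dropped by the pending configuration
 *
 * Bit i stands for context index i. The return value is the index of
 * the last context that will be valid afterwards, with a floor of 1.
 */
static unsigned int context_entries(uint32_t existing, uint32_t add_flags,
				    uint32_t drop_flags)
{
	unsigned int i;

	/* Assumption: the slot context and EP0 are never dropped. */
	assert(!(drop_flags & (SLOT_FLAG | EP0_FLAG)));

	/* Walk down from the highest context index to the first survivor. */
	for (i = 31; i > 1; i--) {
		uint32_t bit = UINT32_C(1) << i;

		if (((existing & bit) && !(drop_flags & bit)) ||
		    (add_flags & bit))
			return i;
	}
	return 1;	/* only EP0 left: never 0, never wrapped to 31 */
}
```

Because the loop stops at index 2 and falls through to 1, dropping every
endpoint can no longer produce the 0 or -1 (read as 31) values the commit
message warns about.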

Re: [PATCH v2 0/4] xhci: fixes for 3.15-rc usb-linus

2014-04-25 Thread Sarah Sharp

On Fri, Apr 25, 2014 at 09:35:05AM -0700, Greg KH wrote:
> On Fri, Apr 25, 2014 at 07:20:12PM +0300, Mathias Nyman wrote:
> > Hi Greg
> >
> > Second try at this xhci fixes series for 3.15-rc usb-linus.
> > Most of them are very small fixes that didn't make
> > it to 3.14, sitting and waiting for 3.15-rc1 to come out.
> >
> > Only the "Prefer endpoint context.." patch by Julius has a bit more
> > content.
> >
> > These patches are picked together with Sarah, they are tested on top of
> > 3.15-rc1, and apply on your current usb-linus branch
> >
> 
> Much better, all now applied.
> 
> What's with all that trailing whitespace in your email text?

I use the following lines in my .vimrc to highlight trailing whitespace:

hi link localWhitespaceError Error
au Syntax * syn match localWhitespaceError /\(\zs\%#\|\s\)\+$/ display
au Syntax * syn match localWhitespaceError / \+\ze\t/ display

I also have a macro to add stable tags and to note which commit ID
introduced the issue that the bug fix addresses:

iab backporthis
\<CR>Fixes: commitID ("commitDescription")
\<CR>Cc: sta...@vger.kernel.org # kernelVersion

Then when I'm amending someone's commit to add my Signed-off-by line, I
type backporthis, and vim expands it to the longer version.  (I've
deliberately misspelled it here, so it doesn't get expanded.) Sometimes
it takes a couple of rounds of `git blame` to figure out when the bug was
introduced.

Sarah Sharp

Re: [PATCH] usb: xhci: Correct last context entry calculation for Configure Endpoint

2014-03-31 Thread Sarah Sharp

On Tue, Mar 25, 2014 at 11:42:43AM -0700, Julius Werner wrote:
> The current XHCI driver recalculates the Context Entries field in the
> Slot Context on every add_endpoint() and drop_endpoint() call. In the
> case of drop_endpoint(), it seems to assume that the add_flags will
> always contain every endpoint for the new configuration, which is not
> necessarily correct if you don't make assumptions about how the USB core
> uses the add_endpoint/drop_endpoint interface (add_flags only contains
> endpoints that are new additions in the new configuration).

The last valid endpoint context has been discussed before:

http://marc.info/?l=linux-usb&m=137158978503741&w=2

There's an xHCI spec ambiguity:  Does the last valid context entry refer
to the last valid endpoint context in the *input* device context or the
*output* device context?

The code currently assumes it refers to the input device context, namely
the endpoints we're adding or changing.  If hardware needs the last
valid endpoint context for the re-calculated *output* device context,
then yes, this needs to be changed.  However, based on spec errata, I
believe that's not the intent of the spec authors:

http://marc.info/?l=linux-kernel&m=137208958411696&w=2

What is the impact if we calculate the last valid endpoint context
for the input context?  Do you have evidence of hardware misbehaving?
If so, which hardware?

> Furthermore, EP0_FLAG is not consistently set in add_flags throughout
> the lifetime of a device. This means that when all endpoints are
> dropped, the Context Entries field can be set to 0 (which is invalid and
> may cause a Parameter Error) or -1 (which is interpreted as 31 and
> causes the driver to keep using the old, incorrect value).

That should be fixed in a separate patch.

Sarah Sharp
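
For reference, the two bad values described above (a Context Entries
value of 0, or -1 wrapping to 31) fall out of taking the highest set bit
of add_flags alone.  The following is a minimal userspace sketch, not the
driver code: highest_bit(), context_entries_naive(),
context_entries_safe() and the active_eps bitmask are hypothetical names,
and the bit layout (bit 0 for the slot context, bit 1 for EP0) is assumed
to mirror the driver's SLOT_FLAG and EP0_FLAG.  The second variant follows
the idea in the patch description: consider every endpoint that stays
configured, and fall back to 1 when only EP0 is left.

```c
#include <stdint.h>

/* Assumed flag layout, mirroring the driver's SLOT_FLAG and EP0_FLAG:
 * bit 0 is the slot context, bit 1 is EP0, higher bits are other endpoints. */
#define SLOT_FLAG (1u << 0)
#define EP0_FLAG  (1u << 1)

/* 1-based position of the highest set bit; 0 when no bit is set. */
static int highest_bit(uint32_t x)
{
	int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

/* Looks at add_flags only, so an empty or slot-only mask yields -1 or 0. */
static int context_entries_naive(uint32_t add_flags)
{
	return highest_bit(add_flags) - 1;
}

/*
 * Hypothetical alternative: active_eps is a bitmask (same layout) of the
 * endpoints already configured.  Dropped endpoints are removed first, then
 * added ones are merged in, and the result never goes below 1 (EP0 only).
 */
static int context_entries_safe(uint32_t active_eps, uint32_t add_flags,
				uint32_t drop_flags)
{
	uint32_t remaining = (active_eps & ~drop_flags) | add_flags;
	int last;

	remaining &= ~SLOT_FLAG;	/* the slot context is not an endpoint */
	last = highest_bit(remaining) - 1;
	return last < 1 ? 1 : last;
}
```

With these definitions, context_entries_naive(0) gives -1 and
context_entries_naive(SLOT_FLAG) gives 0, while context_entries_safe()
returns 1 when no endpoint other than EP0 remains.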

> The only surefire way to set this field right is to also take all
> existing endpoints into account, and to force the value to 1 (meaning
> only EP0 is active) if no other endpoint is found. This patch implements
> that as a single step in the final check_bandwidth() call and removes
> the intermediary calculations from add_endpoint() and drop_endpoint().
> 
> This patch should be backported to kernels as old as 2.6.31 that contain
> the commit f94e0186312b0fc39f41eed4e21836ed74b7efe1 "USB: xhci:
> Bandwidth allocation support".
> 
> Signed-off-by: Julius Werner <jwer...@chromium.org>
> ---
>  drivers/usb/host/xhci.c | 51 
> +
>  1 file changed, 18 insertions(+), 33 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 924a6cc..e7d9dfa 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -1562,12 +1562,10 @@ int xhci_drop_endpoint(struct usb_hcd *hcd, struct 
> usb_device *udev,
>   struct xhci_hcd *xhci;
>   struct xhci_container_ctx *in_ctx, *out_ctx;
>   struct xhci_input_control_ctx *ctrl_ctx;
> - struct xhci_slot_ctx *slot_ctx;
> - unsigned int last_ctx;
>   unsigned int ep_index;
>   struct xhci_ep_ctx *ep_ctx;
>   u32 drop_flag;
> - u32 new_add_flags, new_drop_flags, new_slot_info;
> + u32 new_add_flags, new_drop_flags;
>   int ret;
>  
>   ret = xhci_check_args(hcd, udev, ep, 1, true, __func__);
> @@ -1614,24 +1612,13 @@ int xhci_drop_endpoint(struct usb_hcd *hcd, struct 
> usb_device *udev,
>   ctrl_ctx->add_flags &= cpu_to_le32(~drop_flag);
>   new_add_flags = le32_to_cpu(ctrl_ctx->add_flags);
>  
> - last_ctx = xhci_last_valid_endpoint(le32_to_cpu(ctrl_ctx->add_flags));
> - slot_ctx = xhci_get_slot_ctx(xhci, in_ctx);
> - /* Update the last valid endpoint context, if we deleted the last one */
> - if ((le32_to_cpu(slot_ctx->dev_info) & LAST_CTX_MASK) >
> - LAST_CTX(last_ctx)) {
> - slot_ctx->dev_info &= cpu_to_le32(~LAST_CTX_MASK);
> - slot_ctx->dev_info |= cpu_to_le32(LAST_CTX(last_ctx));
> - }
> - new_slot_info = le32_to_cpu(slot_ctx->dev_info);
> -
>   xhci_endpoint_zero(xhci, xhci->devs[udev->slot_id], ep);
>  
> - xhci_dbg(xhci, "drop ep 0x%x, slot id %d, new drop flags = %#x, new add 
> flags = %#x, new slot info = %#x\n",
> + xhci_dbg(xhci, "drop ep 0x%x, slot id %d, new drop flags = %#x, new add 
> flags = %#x\n",
>   (unsigned int) ep->desc.bEndpointAddress,
>   udev->slot_id,
>   (unsigned int) new_drop_flags,
> - (unsigned int) new_add_flags,
> - (unsigned int) new_slot_info);
> + (unsigned int) new_add_flags);
>   return 0;
>  }
>  
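> @@ -1654,11 +1641,9 @@ int xhci_add_endpoint(struct usb_hcd *hcd, struct 
> usb_device *udev,
>   struct xhci_hcd *xhci;
>   struct xhci_container_ctx *in_ctx, *out_ctx;
>   unsigned int ep_index;
> - struct xhci_slot_ctx *slot_ctx;
>   struct xhci_input_control_ctx *ctrl_ctx;
>   u32 added_ctxs;
> - unsigned int last_ctx;
> - u32 new_add_flags, new_drop_flags, new_slot_info;
> + u32 new_add_flags, new_drop_flags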

Re: [PATCH] #CleanUp non-gender-neutral README

2014-03-23 Thread Sarah Sharp

http://knowyourmeme.com/memes/events/c-plus-equality-c

Don't feed the trolls.

>  From: Feminist-Software-Foundation 
> 
> 
> This patch started as an effort inspired by EthicalCode's #CleanUpGitHub 
> project <http://ethicalco.de/events/cughinfo/> to find and replace 
> either hateful, hurtful or discriminatory text in GitHub repositories.  
> The Linux kernel, being the de facto crown jewel of FOSS, deserves 
> better than to conform to non-gender-neutral pronouns and articles in 
> its README.  This patch rectifies that.
> 
> We are the Feminist Software Foundation.  We are the inventor of C+=, 
> world's first feminist programming language 
> <https://github.com/Feminist-Software-Foundation/C-plus-Equality>).  As 
> our latest effort, we are lending our help to a very popular feminist 
> phenomenon in the programming scene: purging popular FOSS repositories 
> of their Patriarchal influences.
> 
> As reported by several developers, Linux's kernel development does not 
> follow the activities on GitHub.  Whereby, we are submitting this patch 
> to the LKML in hope that this will garner a more professional response, 
> in contrast to the blatant sexism and booing that this patch has 
> received from the GitHub brogrammer community.
> 
> Singed-off-by: Feminist-Software-Foundation 
> 
> ---
> --- README.orig   2014-03-24 00:28:33.506830489 +
> +++ README2014-03-24 00:27:37.754554028 +
> @@ -1,64 +1,64 @@
>   Linux kernel release 3.x 
> 
> -These are the release notes for Linux version 3.  Read them carefully,
> -as they tell you what this is all about, explain how to install the
> +These are xhe release notes for Linux version 3.  Read xhem carefully,
> +as xhey tell you what this is all about, explain how to install xhe
>   kernel, and what to do if something goes wrong.
> 
>   WHAT IS LINUX?
> 
> -  Linux is a clone of the operating system Unix, written from scratch 
> by
> +  Linux is a clone of xhe operating system Unix, written from scratch 
> by

Re: [PATCH v2 RESEND 4/4] xhci: Use pci_enable_msix_exact() instead of pci_enable_msix()

2014-03-06 Thread Sarah Sharp

On Thu, Mar 06, 2014 at 09:11:24PM +0100, Alexander Gordeev wrote:
> As result of deprecation of MSI-X/MSI enablement functions
> pci_enable_msix() and pci_enable_msi_block() all drivers
> using these two interfaces need to be updated to use the
> new pci_enable_msi_range()  or pci_enable_msi_exact()
> and pci_enable_msix_range() or pci_enable_msix_exact()
> interfaces.
> 
> This update also cleans up a bit xhci_setup_msi() and
> xhci_setup_msix() returning of success.

What do you mean by this sentence?  Are you fixing some bug in those two
functions, or just cleaning up how they look?  Either way, this should
really be two patches.

Sarah Sharp
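
As a rough illustration of the return-value question, here is a userspace
mock.  It assumes the convention that a range-style call returns the
number of vectors it allocated (or a negative errno), while an exact-style
call returns 0 only when every requested vector was granted.  struct
mock_dev, mock_enable_range() and mock_enable_exact() are made-up
stand-ins for illustration, not PCI core functions.

```c
#include <errno.h>

/* Hypothetical stand-in for a device: how many vectors it can provide. */
struct mock_dev {
	int vectors_available;
};

/*
 * Range-style call: returns the number of vectors allocated, somewhere
 * between minvec and maxvec, or a negative errno if minvec cannot be met.
 */
static int mock_enable_range(const struct mock_dev *dev, int minvec,
			     int maxvec)
{
	if (maxvec < minvec)
		return -ERANGE;
	if (dev->vectors_available < minvec)
		return -ENOSPC;
	return dev->vectors_available < maxvec ?
		dev->vectors_available : maxvec;
}

/* Exact-style call: all or nothing, so success is always reported as 0. */
static int mock_enable_exact(const struct mock_dev *dev, int nvec)
{
	int rc = mock_enable_range(dev, nvec, nvec);

	return rc < 0 ? rc : 0;
}
```

Under this convention a caller of the exact-style call can treat any
nonzero return as failure, so whether the success path ends in
"return ret" or "return 0" makes no difference to behaviour.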

> Signed-off-by: Alexander Gordeev 
> Cc: Sarah Sharp 
> Cc: Greg Kroah-Hartman 
> Cc: linux-...@vger.kernel.org
> Cc: linux-...@vger.kernel.org
> ---
>  drivers/usb/host/xhci.c |7 ---
>  1 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 6fe577d..dc7cfb5 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -232,9 +232,10 @@ static int xhci_setup_msi(struct xhci_hcd *xhci)
>   xhci_dbg_trace(xhci, trace_xhci_dbg_init,
>   "disable MSI interrupt");
>   pci_disable_msi(pdev);
> + return ret;
>   }
>  
> - return ret;
> + return 0;
>  }
>  
>  /*
> @@ -291,7 +292,7 @@ static int xhci_setup_msix(struct xhci_hcd *xhci)
>   xhci->msix_entries[i].vector = 0;
>   }
>  
> - ret = pci_enable_msix(pdev, xhci->msix_entries, xhci->msix_count);
> + ret = pci_enable_msix_exact(pdev, xhci->msix_entries, xhci->msix_count);
>   if (ret) {
>   xhci_dbg_trace(xhci, trace_xhci_dbg_init,
>   "Failed to enable MSI-X");
> @@ -307,7 +308,7 @@ static int xhci_setup_msix(struct xhci_hcd *xhci)
>   }
>  
>   hcd->msix_enabled = 1;
> - return ret;
> + return 0;
>  
>  disable_msix:
>   xhci_dbg_trace(xhci, trace_xhci_dbg_init, "disable MSI-X interrupt");
> -- 
> 1.7.7.6
> 

Re: [RFCv3 4/4] xhci: rework command timeout and cancellation,

2014-03-06 Thread Sarah Sharp

9:14 xanatos kernel: [ 5405.363836] INFO: lockdep is turned off.
Mar  6 13:09:14 xanatos kernel: [ 5405.363860] INFO: task alsa-sink-USB A:14271 
blocked for more than 120 seconds.
Mar  6 13:09:14 xanatos kernel: [ 5405.363862]   Not tainted 3.14.0-rc5+ 
#200
Mar  6 13:09:14 xanatos kernel: [ 5405.363864] "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar  6 13:09:14 xanatos kernel: [ 5405.363866] alsa-sink-USB A D 
8800c7168000 0 14271   1801 0x
Mar  6 13:09:14 xanatos kernel: [ 5405.363871]  8800c5657b90 
0046 8800c7168000 880118672180
Mar  6 13:09:14 xanatos kernel: [ 5405.363877]  8800c5657fd8 
00013b80 00013b80 8800c7168000
Mar  6 13:09:14 xanatos kernel: [ 5405.363882]  7fff 
880032314f68 880032314f60 8800c7168000
Mar  6 13:09:14 xanatos kernel: [ 5405.363888] Call Trace:
Mar  6 13:09:14 xanatos kernel: [ 5405.363894]  [] 
schedule+0x29/0x70
Mar  6 13:09:14 xanatos kernel: [ 5405.363899]  [] 
schedule_timeout+0x281/0x310
Mar  6 13:09:14 xanatos kernel: [ 5405.363905]  [] ? 
wait_for_completion+0x3b/0x120
Mar  6 13:09:14 xanatos kernel: [ 5405.363911]  [] 
wait_for_completion+0xb4/0x120
Mar  6 13:09:14 xanatos kernel: [ 5405.363917]  [] ? 
wake_up_state+0x20/0x20
Mar  6 13:09:14 xanatos kernel: [ 5405.363927]  [] 
xhci_configure_endpoint+0xcf/0x550 [xhci_hcd]
Mar  6 13:09:14 xanatos kernel: [ 5405.363936]  [] 
xhci_check_bandwidth+0x10d/0x2e0 [xhci_hcd]
Mar  6 13:09:14 xanatos kernel: [ 5405.363954]  [] 
usb_hcd_alloc_bandwidth+0x2a3/0x340 [usbcore]
Mar  6 13:09:14 xanatos kernel: [ 5405.363972]  [] 
usb_set_interface+0xca/0x390 [usbcore]
Mar  6 13:09:14 xanatos kernel: [ 5405.363986]  [] 
snd_usb_pcm_close.isra.9+0x51/0x80 [snd_usb_audio]
Mar  6 13:09:14 xanatos kernel: [ 5405.363996]  [] 
snd_usb_playback_close+0x14/0x20 [snd_usb_audio]
Mar  6 13:09:14 xanatos kernel: [ 5405.364008]  [] 
snd_pcm_release_substream.part.29+0x3f/0x90 [snd_pcm]
Mar  6 13:09:14 xanatos kernel: [ 5405.364018]  [] 
snd_pcm_release+0xb0/0xc0 [snd_pcm]
Mar  6 13:09:14 xanatos kernel: [ 5405.364025]  [] 
__fput+0xef/0x240
Mar  6 13:09:14 xanatos kernel: [ 5405.364031]  [] 
fput+0xe/0x10
Mar  6 13:09:14 xanatos kernel: [ 5405.364037]  [] 
task_work_run+0xac/0xe0
Mar  6 13:09:14 xanatos kernel: [ 5405.364045]  [] 
do_notify_resume+0x59/0x90
Mar  6 13:09:14 xanatos kernel: [ 5405.364049]  [] 
int_signal+0x12/0x17

And now khubd is hung, I can't unload the xHCI module, and the sound
layer is probably hung as well.

I think the issue is that when the xHCI host was marked as dying, we
halted the host, and the command ring will be marked as not running.
But there are still processes waiting on pending commands.  When the
timer fires because the first command times out, xhci_abort_cmd_ring
notices the command ring is not running, and assumes we'll get a command
event with the completion code set to COMP_CMD_STOP.  However, the host
is dead, so we'll never see that event.

I think the easiest way to fix this would be to have a new function that
handles when the host dies, that's called from both
xhci_stop_endpoint_command_watchdog and xhci_abort_cmd_ring (when
stopping the command ring fails).  That function should call
xhci_quiesce, xhci_halt, and flush out the entire command queue.

Also, what will happen if the xhci-hcd module is unloaded when a command
is still pending?  I think the code in xhci_mem_cleanup will leave those
processes hanging, waiting for their commands to complete:

list_for_each_entry_safe(cur_cmd, next_cmd,
		&xhci->cmd_list, cmd_list) {
	list_del(&cur_cmd->cmd_list);
	kfree(cur_cmd);
}

You should probably also audit all xHCI functions that submit commands,
and make them not submit if the host is dying.  I don't see anything
stopping the USB core from causing command submission to a dead xHCI
host.

This needs to get fixed before the code can be merged.  (And again,
reverting those patches that were causing xHCI regressions takes
priority to fixing this.)

Sarah Sharp
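
A minimal userspace sketch of the "flush out the entire command queue"
idea, under stated assumptions: struct cmd, struct cmd_queue, queue_push()
and flush_queue() are hypothetical, the completed flag stands in for a
kernel completion, and a command that has a waiter is assumed to be owned
(and later freed) by that waiter.  The point is only that each pending
command is marked aborted and its waiter signalled before the entry
leaves the list, rather than being freed silently.

```c
#include <stdlib.h>

#define CMD_PENDING	0
#define CMD_ABORTED	(-1)

/* Hypothetical command entry; "completed" stands in for a completion. */
struct cmd {
	struct cmd *next;
	int status;
	int completed;
	int has_waiter;	/* nonzero: a waiter owns this entry and frees it */
};

struct cmd_queue {
	struct cmd *head;
};

/* Append a command at the tail of the queue. */
static void queue_push(struct cmd_queue *q, struct cmd *c)
{
	struct cmd **pp = &q->head;

	while (*pp)
		pp = &(*pp)->next;
	c->next = NULL;
	*pp = c;
}

/*
 * Called when the host is declared dead: every pending command is
 * unlinked, marked aborted and signalled.  Entries without a waiter are
 * freed here; the others are left for their waiter to free.
 * Returns the number of commands flushed.
 */
static int flush_queue(struct cmd_queue *q)
{
	int flushed = 0;

	while (q->head) {
		struct cmd *c = q->head;

		q->head = c->next;
		c->next = NULL;
		c->status = CMD_ABORTED;
		c->completed = 1;
		if (!c->has_waiter)
			free(c);
		flushed++;
	}
	return flushed;
}
```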

> 
> Signed-off-by: Mathias Nyman 
> ---
>  drivers/usb/host/xhci-hub.c  |  11 +-
>  drivers/usb/host/xhci-mem.c  |  15 +-
>  drivers/usb/host/xhci-ring.c | 335 
> +--
>  drivers/usb/host/xhci.c  |  78 --
>  drivers/usb/host/xhci.h  |   8 +-
>  5 files changed, 138 insertions(+), 309 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
> index 0a57d95..8350fd9 100644
> --- a/drivers/usb/host/xhci-hub.c
> +++ b/drivers/usb/host/xhci-hub.c
> @@ -271,7 +271,6 @@ static int xhci_stop_device(struct xhci_hcd *xhci, int 
> slot_id, int suspend)
>   struct xhci_virt_device *virt_dev;
>   struct xhci_command *cmd;
>   unsigned long flags;
> - int timeleft;
>   int ret;
>   int i;
>

Re: [RFCv3 1/4] xhci: Use command structures when queuing commands on the command ring

2014-03-06 Thread Sarah Sharp

 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881695]  [] 
usb_start_wait_urb+0x74/0x190 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881722]  [] 
usb_control_msg+0xc5/0x110 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881744]  [] 
set_port_feature+0x48/0x50 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881764]  [] 
usb_port_suspend+0x15b/0x440 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881790]  [] 
generic_suspend+0x2a/0x40 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881815]  [] 
usb_suspend_both+0x1bf/0x1e0 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881838]  [] 
usb_runtime_suspend+0x33/0x70 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881859]  [] ? 
usb_probe_interface+0x300/0x300 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881867]  [] 
__rpm_callback+0x32/0x70
Mar  6 12:03:22 xanatos kernel: [ 1450.881872]  [] 
rpm_callback+0x24/0x80
Mar  6 12:03:22 xanatos kernel: [ 1450.881878]  [] 
rpm_suspend+0x126/0x6e0
Mar  6 12:03:22 xanatos kernel: [ 1450.881885]  [] 
__pm_runtime_suspend+0x5d/0xa0
Mar  6 12:03:22 xanatos kernel: [ 1450.881906]  [] ? 
usb_runtime_resume+0x20/0x20 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881928]  [] 
usb_runtime_idle+0x2a/0x40 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881935]  [] 
__rpm_callback+0x32/0x70
Mar  6 12:03:22 xanatos kernel: [ 1450.881942]  [] 
rpm_idle+0x1ed/0x300
Mar  6 12:03:22 xanatos kernel: [ 1450.881948]  [] 
pm_runtime_work+0xbf/0xd0
Mar  6 12:03:22 xanatos kernel: [ 1450.881956]  [] 
process_one_work+0x1f4/0x550
Mar  6 12:03:22 xanatos kernel: [ 1450.881962]  [] ? 
process_one_work+0x192/0x550
Mar  6 12:03:22 xanatos kernel: [ 1450.881968]  [] 
worker_thread+0x121/0x3a0
Mar  6 12:03:22 xanatos kernel: [ 1450.881975]  [] ? 
manage_workers.isra.22+0x2a0/0x2a0
Mar  6 12:03:22 xanatos kernel: [ 1450.881982]  [] 
kthread+0xfc/0x120
Mar  6 12:03:22 xanatos kernel: [ 1450.881990]  [] ? 
kthread_create_on_node+0x230/0x230
Mar  6 12:03:22 xanatos kernel: [ 1450.881999]  [] 
ret_from_fork+0x7c/0xb0
Mar  6 12:03:22 xanatos kernel: [ 1450.882005]  [] ? 
kthread_create_on_node+0x230/0x230

Can you fix this in a second revision?  (Note, this does not take
priority over reverting those two patches that are causing regressions.
Let me know if you need any help with that.)

Sarah Sharp


> Signed-off-by: Mathias Nyman 
> ---
>  drivers/usb/host/xhci-hub.c  |  21 +++--
>  drivers/usb/host/xhci-ring.c | 105 ---
>  drivers/usb/host/xhci.c  | 194 
> ---
>  drivers/usb/host/xhci.h  |  31 +++
>  4 files changed, 214 insertions(+), 137 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
> index 9992fbf..fb0f936 100644
> --- a/drivers/usb/host/xhci-hub.c
> +++ b/drivers/usb/host/xhci-hub.c
> @@ -20,7 +20,8 @@
>   * Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
>   */
>  
> -#include 
> +
> +#include 
>  #include 
>  
>  #include "xhci.h"
> @@ -284,12 +285,22 @@ static int xhci_stop_device(struct xhci_hcd *xhci, int 
> slot_id, int suspend)
>  
>   spin_lock_irqsave(&xhci->lock, flags);
>   for (i = LAST_EP_INDEX; i > 0; i--) {
> - if (virt_dev->eps[i].ring && virt_dev->eps[i].ring->dequeue)
> - xhci_queue_stop_endpoint(xhci, slot_id, i, suspend);
> + if (virt_dev->eps[i].ring && virt_dev->eps[i].ring->dequeue) {
> + struct xhci_command *command;
> + command = xhci_alloc_command(xhci, false, false,
> +  GFP_NOIO);

This GFP_NOIO should be GFP_ATOMIC, since you're holding the spinlock.
I'll fix that in your patch, since the code seems to be working fine
otherwise.

(Note, fixing this does not take priority over sending those two reverts
to Greg.  Let me know if you need help figuring out how to send a revert
patch.)

Sarah Sharp

> + if (!command) {
> + spin_unlock_irqrestore(&xhci->lock, flags);
> + xhci_free_command(xhci, cmd);
> + return -ENOMEM;
> +
> + }
> + xhci_queue_stop_endpoint(xhci, command, slot_id, i,
> +  suspend);
> + }
>   }
> - cmd->command_trb = xhci_find_next_enqueue(xhci->cmd_ring);
>   list_add_tail(&cmd->cmd_list, &virt_dev->cmd_list);
> - xhci_queue_stop_endpoint(xhci, slot_id, 0, suspend);
> + xhci_queue_stop_endpoint(xhci, cmd, slot_id, 0, suspend);
>   xhci_ring_cmd_db(xhci);
>   spin_unlock_irqrestore(&xhci->lock, flags);
>  
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index 0ed64eb..fa34c9b 100644
>

Re: xHCI regression in stable 3.13.5 with USB3 card reader (Bisected)

2014-03-06 Thread Sarah Sharp

[Adding Mathias.]

On Thu, Mar 06, 2014 at 12:27:59AM -0600, Robert Hancock wrote:
> On 05/03/14 11:17 PM, Robert Hancock wrote:
> >I have a USB 3.0 multi-card reader device:
> >
> >Bus 004 Device 002: ID 05e3:0743 Genesys Logic, Inc.
> >
> >which seems to work fine in 3.13.4 (Fedora version kernel-3.13.4-200
> >specifically) but fails in 3.13.5 (specifically kernel-3.13.5-202).
> >Below is what I get in dmesg. Essentially there's a bunch of
> >input/output errors making the reader mostly unusable.
> >
> >This is on an Intel Haswell machine with this controller:
> >
> >00:14.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series
> >Chipset Family USB xHCI [8086:8c31] (rev 05)
> >
> >It looks like there were some XHCI commits that went into 3.13.5 so it
> >seems likely one of those is the cause. I can try current git if there's
> >anything in there that's likely to fix it. But it does seem like a
> >regression got into the stable kernel in this respect.
> 
> Bisecting between 3.13.4 and 3.13.5 gives me this:
> 
> c8f44f98901994832ccecb87c3dd7900274b699a is the first bad commit
> commit c8f44f98901994832ccecb87c3dd7900274b699a
> Author: Sarah Sharp 
> Date:   Fri Jan 31 11:26:25 2014 -0800
> 
> xhci 1.0: Limit arbitrarily-aligned scatter gather.

Yes, this is a known regression.  That commit should be reverted in
3.14-rc shortly, and the patch will be backported to stable kernels.
Mathias is taking over as xHCI maintainer, and he will queue the revert
patches to Greg shortly.

Sarah Sharp

> 
> commit 247bf557273dd775505fb9240d2d152f4f20d304 upstream.
> 
> xHCI 1.0 hosts have a set of requirements on how to align transfer
> buffers on the endpoint rings called "TD fragment" rules.  When the
> ax88179_178a driver added support for scatter gather in 3.12, with
> commit 804fad45411b48233b48003e33a78f290d227c8 "USBNET: ax88179_178a:
> enable tso if usb host supports sg dma", it broke the device under xHCI
> 1.0 hosts.  Under certain network loads, the device would see an
> unexpected short packet from the host, which would cause the device to
> stop sending ethernet packets, even though USB packets would still be
> sent.
> 
> Commit 35773dac5f86 "usb: xhci: Link TRB must not occur within a USB
> payload burst" attempted to fix this.  It was a quick hack to partially
> implement the TD fragment rules.  However, it caused regressions in the
> usb-storage layer and userspace USB drivers using libusb.  The patches
> to attempt to fix this are too far reaching into the USB core, and we
> really need to implement the TD fragment rules correctly in the xHCI
> driver, instead of continuing to wallpaper over the issues.
> 
> Disable arbitrarily-aligned scatter-gather in the xHCI driver for 1.0
> hosts.  Only the ax88179_178a driver checks the no_sg_constraint flag,
> so don't set it for 1.0 hosts.  This should not impact usb-storage or
> usbfs behavior, since they pass down max packet sized aligned sg-list
> entries (512 for USB 2.0 and 1024 for USB 3.0).
> 
> Signed-off-by: Sarah Sharp 
> Tested-by: Mark Lord 
> Cc: David Laight 
> Cc: Bjørn Mork 
> Cc: Freddy Xin 
> Cc: Ming Lei 
> Signed-off-by: Greg Kroah-Hartman 
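
The alignment condition mentioned in the quoted commit message can be
sketched as a small check.  This is an illustrative userspace function,
not USB core code: sg_list_is_packet_aligned() is a hypothetical name, and
it assumes the rule is that every scatter-gather entry except the last
must be a whole multiple of the endpoint's max packet size (512 for USB
2.0 and 1024 for USB 3.0, as in the message).

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Returns true when every entry except the last is a multiple of
 * max_packet.  The last entry may be short, since it ends the transfer.
 */
static bool sg_list_is_packet_aligned(const size_t *lens, size_t n,
				      size_t max_packet)
{
	size_t i;

	if (max_packet == 0)
		return false;
	for (i = 0; i + 1 < n; i++) {
		if (lens[i] % max_packet != 0)
			return false;
	}
	return true;
}
```

A list such as {1024, 2048, 100} passes for a 1024-byte max packet size,
while {1024, 100, 1024} does not, because a short entry in the middle
would end the transfer early.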

Re: xHCI regression in stable 3.13.5 with USB3 card reader (Bisected)

2014-03-06 Thread Sarah Sharp

[Adding Mathias.]

On Thu, Mar 06, 2014 at 12:27:59AM -0600, Robert Hancock wrote:
> On 05/03/14 11:17 PM, Robert Hancock wrote:
> I have a USB 3.0 multi-card reader device:
> 
> Bus 004 Device 002: ID 05e3:0743 Genesys Logic, Inc.
> 
> which seems to work fine in 3.13.4 (Fedora version kernel-3.13.4-200
> specifically) but fails in 3.13.5 (specifically kernel-3.13.5-202).
> Below is what I get in dmesg. Essentially there's a bunch of
> input/output errors making the reader mostly unusable.
> 
> This is on an Intel Haswell machine with this controller:
> 
> 00:14.0 USB controller [0c03]: Intel Corporation 8 Series/C220 Series
> Chipset Family USB xHCI [8086:8c31] (rev 05)
> 
> It looks like there were some XHCI commits that went into 3.13.5 so it
> seems likely one of those is the cause. I can try current git if there's
> anything in there that's likely to fix it. But it does seem like a
> regression got into the stable kernel in this respect.
> 
> Bisecting between 3.13.4 and 3.13.5 gives me this:
> 
> c8f44f98901994832ccecb87c3dd7900274b699a is the first bad commit
> commit c8f44f98901994832ccecb87c3dd7900274b699a
> Author: Sarah Sharp <sarah.a.sh...@linux.intel.com>
> Date:   Fri Jan 31 11:26:25 2014 -0800
> 
> xhci 1.0: Limit arbitrarily-aligned scatter gather.

Yes, this is a known regression.  That commit should be reverted in
3.14-rc shortly, and the patch will be backported to stable kernels.
Mathias is taking over as xHCI maintainer, and he will queue the revert
patches to Greg shortly.

Sarah Sharp

 
> commit 247bf557273dd775505fb9240d2d152f4f20d304 upstream.
> 
> xHCI 1.0 hosts have a set of requirements on how to align transfer
> buffers on the endpoint rings called TD fragment rules.  When the
> ax88179_178a driver added support for scatter gather in 3.12, with
> commit 804fad45411b48233b48003e33a78f290d227c8 "USBNET: ax88179_178a:
> enable tso if usb host supports sg dma", it broke the device under xHCI
> 1.0 hosts.  Under certain network loads, the device would see an
> unexpected short packet from the host, which would cause the device to
> stop sending ethernet packets, even though USB packets would still be
> sent.
> 
> Commit 35773dac5f86 "usb: xhci: Link TRB must not occur within a USB
> payload burst" attempted to fix this.  It was a quick hack to partially
> implement the TD fragment rules.  However, it caused regressions in the
> usb-storage layer and userspace USB drivers using libusb.  The patches
> to attempt to fix this are too far reaching into the USB core, and we
> really need to implement the TD fragment rules correctly in the xHCI
> driver, instead of continuing to wallpaper over the issues.
> 
> Disable arbitrarily-aligned scatter-gather in the xHCI driver for 1.0
> hosts.  Only the ax88179_178a driver checks the no_sg_constraint flag,
> so don't set it for 1.0 hosts.  This should not impact usb-storage or
> usbfs behavior, since they pass down max packet sized aligned sg-list
> entries (512 for USB 2.0 and 1024 for USB 3.0).
> 
> Signed-off-by: Sarah Sharp <sarah.a.sh...@linux.intel.com>
> Tested-by: Mark Lord <ml...@pobox.com>
> Cc: David Laight <david.lai...@aculab.com>
> Cc: Bjørn Mork <bj...@mork.no>
> Cc: Freddy Xin <fre...@asix.com.tw>
> Cc: Ming Lei <ming@canonical.com>
> Signed-off-by: Greg Kroah-Hartman <gre...@linuxfoundation.org>

Re: [RFCv3 1/4] xhci: Use command structures when queuing commands on the command ring

2014-03-06 Thread Sarah Sharp

 12:03:22 xanatos kernel: [ 1450.881668]  [a0037d8c] 
usb_submit_urb+0x31c/0x5c0 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881695]  [a00387c4] 
usb_start_wait_urb+0x74/0x190 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881722]  [a00389a5] 
usb_control_msg+0xc5/0x110 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881744]  [a002bac8] 
set_port_feature+0x48/0x50 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881764]  [a002f28b] 
usb_port_suspend+0x15b/0x440 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881790]  [a00456ca] 
generic_suspend+0x2a/0x40 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881815]  [a003bd3f] 
usb_suspend_both+0x1bf/0x1e0 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881838]  [a003d103] 
usb_runtime_suspend+0x33/0x70 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881859]  [a003d0d0] ? 
usb_probe_interface+0x300/0x300 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881867]  [81448b52] 
__rpm_callback+0x32/0x70
Mar  6 12:03:22 xanatos kernel: [ 1450.881872]  [81448bb4] 
rpm_callback+0x24/0x80
Mar  6 12:03:22 xanatos kernel: [ 1450.881878]  [814490c6] 
rpm_suspend+0x126/0x6e0
Mar  6 12:03:22 xanatos kernel: [ 1450.881885]  [8144a8dd] 
__pm_runtime_suspend+0x5d/0xa0
Mar  6 12:03:22 xanatos kernel: [ 1450.881906]  [a003d160] ? 
usb_runtime_resume+0x20/0x20 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881928]  [a003d18a] 
usb_runtime_idle+0x2a/0x40 [usbcore]
Mar  6 12:03:22 xanatos kernel: [ 1450.881935]  [81448b52] 
__rpm_callback+0x32/0x70
Mar  6 12:03:22 xanatos kernel: [ 1450.881942]  [8144995d] 
rpm_idle+0x1ed/0x300
Mar  6 12:03:22 xanatos kernel: [ 1450.881948]  [8144aa5f] 
pm_runtime_work+0xbf/0xd0
Mar  6 12:03:22 xanatos kernel: [ 1450.881956]  [81069184] 
process_one_work+0x1f4/0x550
Mar  6 12:03:22 xanatos kernel: [ 1450.881962]  [81069122] ? 
process_one_work+0x192/0x550
Mar  6 12:03:22 xanatos kernel: [ 1450.881968]  [81069ec1] 
worker_thread+0x121/0x3a0
Mar  6 12:03:22 xanatos kernel: [ 1450.881975]  [81069da0] ? 
manage_workers.isra.22+0x2a0/0x2a0
Mar  6 12:03:22 xanatos kernel: [ 1450.881982]  [8107084c] 
kthread+0xfc/0x120
Mar  6 12:03:22 xanatos kernel: [ 1450.881990]  [81070750] ? 
kthread_create_on_node+0x230/0x230
Mar  6 12:03:22 xanatos kernel: [ 1450.881999]  [816677ec] 
ret_from_fork+0x7c/0xb0
Mar  6 12:03:22 xanatos kernel: [ 1450.882005]  [81070750] ? 
kthread_create_on_node+0x230/0x230

Can you fix this in a second revision?  (Note, this does not take
priority over reverting those two patches that are causing regressions.
Let me know if you need any help with that.)

Sarah Sharp


> Signed-off-by: Mathias Nyman <mathias.ny...@linux.intel.com>
> ---
>  drivers/usb/host/xhci-hub.c  |  21 +++--
>  drivers/usb/host/xhci-ring.c | 105 ---
>  drivers/usb/host/xhci.c  | 194 
> ---
>  drivers/usb/host/xhci.h  |  31 +++
>  4 files changed, 214 insertions(+), 137 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
> index 9992fbf..fb0f936 100644
> --- a/drivers/usb/host/xhci-hub.c
> +++ b/drivers/usb/host/xhci-hub.c
> @@ -20,7 +20,8 @@
>   * Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
>   */
>  
> -#include <linux/gfp.h>
> +
> +#include <linux/slab.h>
>  #include <asm/unaligned.h>
>  
>  #include "xhci.h"
> @@ -284,12 +285,22 @@ static int xhci_stop_device(struct xhci_hcd *xhci, int 
> slot_id, int suspend)
>  
>   spin_lock_irqsave(&xhci->lock, flags);
>   for (i = LAST_EP_INDEX; i > 0; i--) {
> - if (virt_dev->eps[i].ring && virt_dev->eps[i].ring->dequeue)
> - xhci_queue_stop_endpoint(xhci, slot_id, i, suspend);
> + if (virt_dev->eps[i].ring && virt_dev->eps[i].ring->dequeue) {
> + struct xhci_command *command;
> + command = xhci_alloc_command(xhci, false, false,
> +  GFP_NOIO);

This GFP_NOIO should be GFP_ATOMIC, since you're holding the spinlock.
I'll fix that in your patch, since the code seems to be working fine
otherwise.

(Note, fixing this does not take priority over sending those two reverts
to Greg.  Let me know if you need help figuring out how to send a revert
patch.)
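
A rough user-space model of why the allocation flag matters here. This is
not kernel code: the enum and function names are hypothetical, and the real
GFP_* flags are bitmasks rather than an enum. It only encodes the rule from
the comment above, namely that an allocation which may sleep is not safe
while a spinlock is held.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for two allocation contexts; these are NOT the
 * kernel's GFP_ATOMIC / GFP_NOIO values. */
enum sketch_gfp {
	SKETCH_GFP_ATOMIC,	/* never sleeps */
	SKETCH_GFP_NOIO		/* may sleep, but will not start I/O */
};

/* In this model only the atomic mode is guaranteed not to sleep. */
static bool sketch_alloc_may_sleep(enum sketch_gfp gfp)
{
	return gfp != SKETCH_GFP_ATOMIC;
}

/* An allocation is safe if it cannot sleep, or if no spinlock is held. */
static bool sketch_alloc_is_safe(enum sketch_gfp gfp, bool spinlock_held)
{
	return !(spinlock_held && sketch_alloc_may_sleep(gfp));
}
```

With the lock held, sketch_alloc_is_safe() rejects the NOIO mode and accepts
the atomic one, which is the substitution requested in the review.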

Sarah Sharp

> + if (!command) {
> + spin_unlock_irqrestore(&xhci->lock, flags);
> + xhci_free_command(xhci, cmd);
> + return -ENOMEM;
> +
> + }
> + xhci_queue_stop_endpoint(xhci, command, slot_id, i,
> +  suspend);
> + }
>   }
> - cmd->command_trb = xhci_find_next_enqueue(xhci->cmd_ring);
>   list_add_tail(&cmd->cmd_list, &virt_dev->cmd_list);
> - xhci_queue_stop_endpoint(xhci, slot_id

Re: [RFCv3 4/4] xhci: rework command timeout and cancellation,

2014-03-06 Thread Sarah Sharp

+0x49/0xa0
Mar  6 13:09:14 xanatos kernel: [ 5405.363834]  [81667896] 
system_call_fastpath+0x1a/0x1f
Mar  6 13:09:14 xanatos kernel: [ 5405.363836] INFO: lockdep is turned off.
Mar  6 13:09:14 xanatos kernel: [ 5405.363860] INFO: task alsa-sink-USB A:14271 
blocked for more than 120 seconds.
Mar  6 13:09:14 xanatos kernel: [ 5405.363862]   Not tainted 3.14.0-rc5+ 
#200
Mar  6 13:09:14 xanatos kernel: [ 5405.363864] "echo 0 > 
/proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar  6 13:09:14 xanatos kernel: [ 5405.363866] alsa-sink-USB A D 
8800c7168000 0 14271   1801 0x
Mar  6 13:09:14 xanatos kernel: [ 5405.363871]  8800c5657b90 
0046 8800c7168000 880118672180
Mar  6 13:09:14 xanatos kernel: [ 5405.363877]  8800c5657fd8 
00013b80 00013b80 8800c7168000
Mar  6 13:09:14 xanatos kernel: [ 5405.363882]  7fff 
880032314f68 880032314f60 8800c7168000
Mar  6 13:09:14 xanatos kernel: [ 5405.363888] Call Trace:
Mar  6 13:09:14 xanatos kernel: [ 5405.363894]  [81659519] 
schedule+0x29/0x70
Mar  6 13:09:14 xanatos kernel: [ 5405.363899]  [81658711] 
schedule_timeout+0x281/0x310
Mar  6 13:09:14 xanatos kernel: [ 5405.363905]  [8165a88b] ? 
wait_for_completion+0x3b/0x120
Mar  6 13:09:14 xanatos kernel: [ 5405.363911]  [8165a904] 
wait_for_completion+0xb4/0x120
Mar  6 13:09:14 xanatos kernel: [ 5405.363917]  [81082480] ? 
wake_up_state+0x20/0x20
Mar  6 13:09:14 xanatos kernel: [ 5405.363927]  [a015b0df] 
xhci_configure_endpoint+0xcf/0x550 [xhci_hcd]
Mar  6 13:09:14 xanatos kernel: [ 5405.363936]  [a015bead] 
xhci_check_bandwidth+0x10d/0x2e0 [xhci_hcd]
Mar  6 13:09:14 xanatos kernel: [ 5405.363954]  [a0037043] 
usb_hcd_alloc_bandwidth+0x2a3/0x340 [usbcore]
Mar  6 13:09:14 xanatos kernel: [ 5405.363972]  [a003a33a] 
usb_set_interface+0xca/0x390 [usbcore]
Mar  6 13:09:14 xanatos kernel: [ 5405.363986]  [a055f3e1] 
snd_usb_pcm_close.isra.9+0x51/0x80 [snd_usb_audio]
Mar  6 13:09:14 xanatos kernel: [ 5405.363996]  [a055f444] 
snd_usb_playback_close+0x14/0x20 [snd_usb_audio]
Mar  6 13:09:14 xanatos kernel: [ 5405.364008]  [a043654f] 
snd_pcm_release_substream.part.29+0x3f/0x90 [snd_pcm]
Mar  6 13:09:14 xanatos kernel: [ 5405.364018]  [a0436680] 
snd_pcm_release+0xb0/0xc0 [snd_pcm]
Mar  6 13:09:14 xanatos kernel: [ 5405.364025]  [811b3a3f] 
__fput+0xef/0x240
Mar  6 13:09:14 xanatos kernel: [ 5405.364031]  [811b3bde] 
fput+0xe/0x10
Mar  6 13:09:14 xanatos kernel: [ 5405.364037]  [8106d4fc] 
task_work_run+0xac/0xe0
Mar  6 13:09:14 xanatos kernel: [ 5405.364045]  [81002d69] 
do_notify_resume+0x59/0x90
Mar  6 13:09:14 xanatos kernel: [ 5405.364049]  [81667ba2] 
int_signal+0x12/0x17

And now khubd is hung, I can't unload the xHCI module, and the sound
layer is probably hung as well.

I think the issue is that when the xHCI host was marked as dying, we
halted the host, and the command ring will be marked as not running.
But there are still processes waiting on pending commands.  When the
timer fires because the first command times out, xhci_abort_cmd_ring
notices the command ring is not running, and assumes we'll get a command
event with the completion code set to COMP_CMD_STOP.  However, the host
is dead, so we'll never see that event.

I think the easiest way to fix this would be to have a new function that
handles when the host dies, that's called from both
xhci_stop_endpoint_command_watchdog and xhci_abort_cmd_ring (when
stopping the command ring fails).  That function should call
xhci_quiesce, xhci_halt, and flush out the entire command queue.

Also, what will happen if the xhci-hcd module is unloaded when a command
is still pending?  I think the code in xhci_mem_cleanup will leave those
processes hanging, waiting for their commands to complete:

list_for_each_entry_safe(cur_cmd, next_cmd,
&xhci->cmd_list, cmd_list) {
list_del(&cur_cmd->cmd_list);
kfree(cur_cmd);
}
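
One way to picture the flush suggested above, as a user-space sketch in
plain C rather than driver code. The struct and function names are
hypothetical, and a boolean flag stands in for a kernel completion. The
point is only the ordering: each waiter is marked done with an aborted
status before its command is freed, instead of freeing the command and
leaving the waiter blocked.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical status values for this sketch (not xHCI completion codes). */
enum sketch_status { SKETCH_PENDING = 0, SKETCH_ABORTED = 1 };

/* Stand-in for a completion: owned by the waiter, which checks `done`. */
struct sketch_completion {
	bool done;
	enum sketch_status status;
};

/* One queued command; `waiter` is NULL when nobody waits on it. */
struct sketch_cmd {
	struct sketch_cmd *next;
	struct sketch_completion *waiter;
};

/* Push a command onto the list; returns false if allocation fails. */
static bool sketch_queue_cmd(struct sketch_cmd **head,
			     struct sketch_completion *waiter)
{
	struct sketch_cmd *cmd = malloc(sizeof(*cmd));

	if (!cmd)
		return false;
	cmd->waiter = waiter;
	cmd->next = *head;
	*head = cmd;
	return true;
}

/* Flush the queue: wake every waiter with an aborted status, then free
 * the command. Returns the number of commands flushed. */
static size_t sketch_flush_cmds(struct sketch_cmd **head)
{
	size_t flushed = 0;

	while (*head) {
		struct sketch_cmd *cmd = *head;

		*head = cmd->next;
		if (cmd->waiter) {
			cmd->waiter->status = SKETCH_ABORTED;
			cmd->waiter->done = true;
		}
		free(cmd);
		flushed++;
	}
	return flushed;
}
```

A host-died handler or module cleanup path would call the flush once; after
it returns, the list is empty and no waiter is left pending.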

You should probably also audit all xHCI functions that submit commands,
and make them not submit if the host is dying.  I don't see anything
stopping the USB core from causing command submission to a dead xHCI
host.

This needs to get fixed before the code can be merged.  (And again,
reverting those patches that were causing xHCI regressions takes
priority over fixing this.)

Sarah Sharp

 
> Signed-off-by: Mathias Nyman <mathias.ny...@linux.intel.com>
> ---
>  drivers/usb/host/xhci-hub.c  |  11 +-
>  drivers/usb/host/xhci-mem.c  |  15 +-
>  drivers/usb/host/xhci-ring.c | 335 
> +--
>  drivers/usb/host/xhci.c  |  78 --
>  drivers/usb/host/xhci.h  |   8 +-
>  5 files changed, 138 insertions(+), 309 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-hub.c b/drivers/usb/host/xhci-hub.c
> index 0a57d95..8350fd9

Re: [PATCH v2 RESEND 4/4] xhci: Use pci_enable_msix_exact() instead of pci_enable_msix()

2014-03-06 Thread Sarah Sharp

On Thu, Mar 06, 2014 at 09:11:24PM +0100, Alexander Gordeev wrote:
> As result of deprecation of MSI-X/MSI enablement functions
> pci_enable_msix() and pci_enable_msi_block() all drivers
> using these two interfaces need to be updated to use the
> new pci_enable_msi_range()  or pci_enable_msi_exact()
> and pci_enable_msix_range() or pci_enable_msix_exact()
> interfaces.
> 
> This update also cleans up a bit xhci_setup_msi() and
> xhci_setup_msix() returning of success.

What do you mean by this sentence?  Are you fixing some bug in those two
functions, or just cleaning up how they look?  Either way, this should
really be two patches.

Sarah Sharp

> Signed-off-by: Alexander Gordeev <agord...@redhat.com>
> Cc: Sarah Sharp <sarah.a.sh...@linux.intel.com>
> Cc: Greg Kroah-Hartman <gre...@linuxfoundation.org>
> Cc: linux-...@vger.kernel.org
> Cc: linux-...@vger.kernel.org
> ---
>  drivers/usb/host/xhci.c |7 ---
>  1 files changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 6fe577d..dc7cfb5 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -232,9 +232,10 @@ static int xhci_setup_msi(struct xhci_hcd *xhci)
>   xhci_dbg_trace(xhci, trace_xhci_dbg_init,
>   "disable MSI interrupt");
>   pci_disable_msi(pdev);
> + return ret;
>   }
>  
> - return ret;
> + return 0;
>  }
>  
>  /*
> @@ -291,7 +292,7 @@ static int xhci_setup_msix(struct xhci_hcd *xhci)
>   xhci->msix_entries[i].vector = 0;
>   }
>  
> - ret = pci_enable_msix(pdev, xhci->msix_entries, xhci->msix_count);
> + ret = pci_enable_msix_exact(pdev, xhci->msix_entries, xhci->msix_count);
>   if (ret) {
>   xhci_dbg_trace(xhci, trace_xhci_dbg_init,
>   "Failed to enable MSI-X");
> @@ -307,7 +308,7 @@ static int xhci_setup_msix(struct xhci_hcd *xhci)
>   }
>  
>   hcd->msix_enabled = 1;
> - return ret;
> + return 0;
>  
>  disable_msix:
>   xhci_dbg_trace(xhci, trace_xhci_dbg_init, "disable MSI-X interrupt");
> -- 
> 1.7.7.6
 

Re: [PATCH v3 1/1] xhci: Prevent runtime pm from autosuspending during initialization

2014-03-04 Thread Sarah Sharp

On Tue, Mar 04, 2014 at 09:04:58AM -0800, Greg KH wrote:
> On Tue, Mar 04, 2014 at 01:50:55PM +0200, Mathias Nyman wrote:
> > On 03/03/2014 08:37 PM, Greg KH wrote:
> > >On Mon, Mar 03, 2014 at 07:30:17PM +0200, Mathias Nyman wrote:
> > >>xHCI driver has its own pci probe function that will call 
> > >>usb_hcd_pci_probe
> > >>to register its usb-2 bus, and then continue to manually register the
> > >>usb-3 bus. usb_hcd_pci_probe does a pm_runtime_put_noidle at the end and
> > >>might thus trigger a runtime suspend before the usb-3 bus is ready.
> > >
> > >What is the result if that happens?
> > 
> > Crashes. Null pointer dereference in xhci_suspend() when touching
> > xhci->shared_hcd before it's initialized. More info here:
> > 
> > http://marc.info/?l=linux-usb&m=138914518219334&w=2
> > 
> > >
> > >Is this a regression from 3.13?  Or something new for 3.14?
> > >
> > 
> > According to reporter it's been around since 3.7
> > 
> > commit 596d789a211d134dc5f94d1e5957248c204ef850
> > USB: set hub's default autosuspend delay as 0
> > 
> > But nobody else than the reporter is able to trigger it.
> 
> Then it can wait for 3.15-rc1, and then go back to the stable trees at
> that time, right?  I'd prefer that as it's not a regression and not
> common.

Ok, I'll queue it with the other features for 3.15.

Sarah Sharp

Re: [PATCH] usb: xhci: Prefer endpoint context dequeue pointer over stopped_trb

2014-02-28 Thread Sarah Sharp

On Thu, Feb 20, 2014 at 09:12:15PM -0800, Julius Werner wrote:
> We have observed a rare cycle state desync bug after Set TR Dequeue
> Pointer commands on Intel LynxPoint xHCs (resulting in an endpoint that
> doesn't fetch new TRBs and thus an unresponsive USB device). It always
> triggers when a previous Set TR Dequeue Pointer command has set the
> pointer to the final Link TRB of a segment, and then another URB gets
> enqueued and cancelled again before it can be completed. Further
> investigation showed that the xHC had returned the Link TRB in the TRB
> Pointer field of the Transfer Event (CC == Stopped -- Length Invalid),
> but when xhci_find_new_dequeue_state() later accesses the Endpoint
> Context's TR Dequeue Pointer field it is set to the first TRB of the
> next segment.
> 
> The driver expects those two values to be the same in this situation,
> and uses the cycle state of the latter together with the address of the
> former. This should be fine according to the XHCI specification, since
> the endpoint ring should be stopped when returning the Transfer Event
> and thus should not advance over the Link TRB before it gets restarted.
> However, real-world XHCI implementations apparently don't really care
> that much about these details, so the driver should follow a more
> defensive approach to try to work around HC spec violations.

Length Invalid can actually be returned when the host is "in between"
processing TRBs.  Perhaps it had processed the link TRB, but hadn't
started processing the TRB at the top of the ring.  So it's not a spec
violation per se, but definitely a spec ambiguity.

The patch looks fine.  Mathias is taking over for xHCI driver
maintainership in 3.15.  He's currently handling queuing bug fix patches
for 3.14 while I finish queueing feature patches for 3.15.  Mathias,
will you test and queue this up for 3.14?

Signed-off-by: Sarah Sharp <sarah.a.sh...@linux.intel.com>

> This patch removes the stopped_trb variable that had been used to store
> the TRB Pointer from the last Transfer Event of a stopped TRB. Instead,
> xhci_find_new_dequeue_state() now relies only on the Endpoint Context,
> requiring a small amount of additional processing to find the virtual
> address corresponding to the TR Dequeue Pointer. Some other parts of the
> function were slightly rearranged to better fit into this model.
> 
> This patch should be backported to kernels as old as 2.6.31 that contain
> the commit ae636747146ea97efa18e04576acd3416e2514f5 "USB: xhci: URB
> cancellation support."
> 
> Signed-off-by: Julius Werner <jwer...@chromium.org>
> ---
>  drivers/usb/host/xhci-ring.c | 66 
> 
>  drivers/usb/host/xhci.c  |  1 -
>  drivers/usb/host/xhci.h  |  2 --
>  3 files changed, 30 insertions(+), 39 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
> index a0b248c..b8277c7 100644
> --- a/drivers/usb/host/xhci-ring.c
> +++ b/drivers/usb/host/xhci-ring.c
> @@ -549,6 +549,7 @@ void xhci_find_new_dequeue_state(struct xhci_hcd *xhci,
>   struct xhci_generic_trb *trb;
>   struct xhci_ep_ctx *ep_ctx;
>   dma_addr_t addr;
> + u64 hw_dequeue;
>  
>   ep_ring = xhci_triad_to_transfer_ring(xhci, slot_id,
>   ep_index, stream_id);
> @@ -558,56 +559,56 @@ void xhci_find_new_dequeue_state(struct xhci_hcd *xhci,
>   stream_id);
>   return;
>   }
> - state->new_cycle_state = 0;
> - xhci_dbg_trace(xhci, trace_xhci_dbg_cancel_urb,
> - "Finding segment containing stopped TRB.");
> - state->new_deq_seg = find_trb_seg(cur_td->start_seg,
> - dev->eps[ep_index].stopped_trb,
> - &state->new_cycle_state);
> - if (!state->new_deq_seg) {
> - WARN_ON(1);
> - return;
> - }
>  
>   /* Dig out the cycle state saved by the xHC during the stop ep cmd */
>   xhci_dbg_trace(xhci, trace_xhci_dbg_cancel_urb,
>   "Finding endpoint context");
>   ep_ctx = xhci_get_ep_ctx(xhci, dev->out_ctx, ep_index);
> - state->new_cycle_state = 0x1 & le64_to_cpu(ep_ctx->deq);
> + hw_dequeue = le64_to_cpu(ep_ctx->deq);
> +
> + /* Find virtual address and segment of hardware dequeue pointer */
> + state->new_deq_seg = ep_ring->deq_seg;
> + state->new_deq_ptr = ep_ring->dequeue;
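> + while (xhci_trb_virt_to_dma(state->new_deq_seg, state->new_deq_ptr)
> + != (dma_addr_t)(hw_dequeue & ~0x1)) {
> + next_trb(xhci, ep_ring, &state->new_deq_seg,
> + &state->new_deq_ptr);
> + if (state->new_deq_ptr == ep_ring->dequeue) {
> + WARN_ON(1);
> + return;
> + }
> + }
>  
> + /*
> +  * Find cycle state for last_trb, starting at old cycle state of
> +  * hw_dequeue. If there is only one segment ring, find_trb_seg() will
> +  * return immediately and cannot toggle the cycle state if this search
> +  * wraps around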
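
The quoted hunk treats bit 0 of the Endpoint Context dequeue field as the
cycle state and the remaining bits as the TRB address, then walks the ring
looking for that address. A minimal user-space sketch of that idea follows,
assuming a flat array of TRB DMA addresses in place of linked ring segments.
All names are hypothetical, and the sketch also gives up if the walk wraps
back to where it started.

```c
#include <stdint.h>

/* Bit 0 of the 64-bit dequeue field is taken as the cycle state,
 * mirroring `0x1 & le64_to_cpu(ep_ctx->deq)` in the quoted code. */
static unsigned int sketch_deq_cycle_state(uint64_t hw_dequeue)
{
	return (unsigned int)(hw_dequeue & 0x1);
}

/* The TRB address is the same field with bit 0 masked off,
 * mirroring `hw_dequeue & ~0x1`. */
static uint64_t sketch_deq_addr(uint64_t hw_dequeue)
{
	return hw_dequeue & ~(uint64_t)0x1;
}

/* Walk a ring of `n` TRB addresses (n > 0) starting at index `start`.
 * Returns the index whose address matches the hardware dequeue pointer,
 * or -1 if the walk wraps around to `start` without finding it. */
static int sketch_find_trb(const uint64_t *ring, int n, int start,
			   uint64_t hw_dequeue)
{
	uint64_t target = sketch_deq_addr(hw_dequeue);
	int i = start;

	do {
		if (ring[i] == target)
			return i;
		i = (i + 1) % n;
	} while (i != start);

	return -1;
}
```

Returning -1 on wrap-around plays the role of a bail-out when the hardware
pointer is not on the ring: the caller gets an explicit failure instead of
looping forever.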

Re: [RFC PATCH v2] xhci: Prevent runtime pm from autosuspending during initialization

2014-02-28 Thread Sarah Sharp

On Mon, Feb 24, 2014 at 12:44:46PM -0500, Alan Stern wrote:
> On Mon, 24 Feb 2014, Mathias Nyman wrote:
> 
> > xHCI driver has its own pci probe function that will call usb_hcd_pci_probe
> > to register its usb-2 bus, and then continue to manually register the
> > usb-3 bus. usb_hcd_pci_probe does a pm_runtime_put_noidle at the end and
> > might thus trigger a runtime suspend before the usb-3 bus is ready.
> > 
> > Prevent the runtime suspend by increasing the usage count in the
> > beginning of xhci_pci_probe, and decrease it once the usb-3 bus is
> > ready.
> > 
> > xhci-platform driver is not using usb_hcd_pci_probe to set up
> > busses and should not need to have its usage count increased during probe.
> > 
> > Signed-off-by: Mathias Nyman <mathias.ny...@linux.intel.com>
> > ---
> >  drivers/usb/host/xhci-pci.c | 11 ++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> > index 04f986d..ea7158b 100644
> > --- a/drivers/usb/host/xhci-pci.c
> > +++ b/drivers/usb/host/xhci-pci.c
> > @@ -190,6 +190,10 @@ static int xhci_pci_probe(struct pci_dev *dev, const 
> > struct pci_device_id *id)
> > struct usb_hcd *hcd;
> >  
> > driver = (struct hc_driver *)id->driver_data;
> > +
> > +   /* Prevent USB-2 roothub runtime suspend until USB-3 is initialized. */
> > +   pm_runtime_get_noresume(&dev->dev);
> 
> Strictly speaking, this prevents the _controller_ from going into
> runtime suspend -- not the root hub.

Signed-off-by: Sarah Sharp <sarah.a.sh...@linux.intel.com>

Mathias, want to fix this comment and queue this up to Greg?  It should
probably be marked for stable.

> Apart from nit about the comment,
> 
> Acked-by: Alan Stern <st...@rowland.harvard.edu>
> 
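
The usage-count idea in the patch can be modelled in a few lines of
user-space C. This is only a toy: the struct and helper names are
hypothetical stand-ins for pm_runtime_get_noresume() and
pm_runtime_put_noidle(), and real runtime PM tracks far more state. It shows
why taking one extra reference early in probe keeps an autosuspend attempt
from succeeding until that reference is dropped.

```c
#include <stdbool.h>

/* Toy model of a device's runtime-PM state (not struct dev_pm_info). */
struct sketch_pm {
	int usage_count;
	bool suspended;
};

/* Modelled on pm_runtime_get_noresume(): take a reference, no resume. */
static void sketch_get_noresume(struct sketch_pm *pm)
{
	pm->usage_count++;
}

/* Modelled on pm_runtime_put_noidle(): drop a reference, no idle check. */
static void sketch_put_noidle(struct sketch_pm *pm)
{
	pm->usage_count--;
}

/* An autosuspend attempt succeeds only when no references are held. */
static bool sketch_try_autosuspend(struct sketch_pm *pm)
{
	if (pm->usage_count > 0)
		return false;
	pm->suspended = true;
	return true;
}
```

Starting from a count of one, an extra get followed by a single put leaves
the count above zero, so the suspend attempt fails; only the second put, made
once the usb-3 bus is ready, lets it through.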

Re: [3.14-rc4 xHCI] Regression haswell B85 with xHCI on

2014-02-28 Thread Sarah Sharp

h Gen Core
> Processor HD Audio Controller (rev 06)
> 00:16.0 Communication controller: Intel Corporation 8 Series/C220 Series
> Chipset Family MEI Controller #1 (rev 04)
> 00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset
> Family USB EHCI #2 (rev 05)
> 00:1b.0 Audio device: Intel Corporation 8 Series/C220 Series Chipset
> High Definition Audio Controller (rev 05)
> 00:1c.0 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset
> Family PCI Express Root Port #1 (rev d5)
> 00:1c.2 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset
> Family PCI Express Root Port #3 (rev d5)
> 00:1c.3 PCI bridge: Intel Corporation 8 Series/C220 Series Chipset
> Family PCI Express Root Port #4 (rev d5)
> 00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset
> Family USB EHCI #1 (rev 05)
> 00:1f.0 ISA bridge: Intel Corporation B85 Express LPC Controller (rev 05)
> 00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset
> Family 6-port SATA Controller 1 [AHCI mode] (rev 05)
> 00:1f.3 SMBus: Intel Corporation 8 Series/C220 Series Chipset Family
> SMBus Controller (rev 05)
> 02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
> 03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd.
> RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
> 
> Any Idea how to fix these?

I won't know until I see more dmesg.

Sarah Sharp

Re: [RESEND] [PATCH] xhci: Switch Intel Lynx Point ports to EHCI on shutdown.

2014-02-18 Thread Sarah Sharp

Sorry for the delay in reviewing this.  It helps me if you don't make the
patch in-reply-to a months-old thread. :)  I'll take a look at this
shortly.

Sarah Sharp

On Tue, Feb 18, 2014 at 09:42:39AM +0200, Denis Turischev wrote:
> This is the same issue as with Panther Point chipsets. If the USB ports are
> switched to xHCI on shutdown, the xHCI host will send a spurious interrupt,
> which will wake the system. Some BIOSes have a workaround for this, but not all.
> One example is Compulab's mini-desktop, the Intense-PC2.
> 
> The bug can be avoided if the USB ports are switched back to EHCI on
> shutdown.
> 
> Signed-off-by: Denis Turischev <de...@compulab.co.il>
> ---
>  drivers/usb/host/xhci-pci.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> index 3c898c1..9233d12 100644
> --- a/drivers/usb/host/xhci-pci.c
> +++ b/drivers/usb/host/xhci-pci.c
> @@ -134,6 +134,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
>*/
>   if (pdev->subsystem_vendor == PCI_VENDOR_ID_HP)
>   xhci->quirks |= XHCI_SPURIOUS_WAKEUP;
> +
> + xhci->quirks |= XHCI_SPURIOUS_REBOOT;
>   }
>   if (pdev->vendor == PCI_VENDOR_ID_ETRON &&
>   pdev->device == PCI_DEVICE_ID_ASROCK_P67) {
> -- 1.8.1.2
> 
> 

Re: [RFCv2 08/10] xhci: Add a global command queue

2014-02-05 Thread Sarah Sharp

On Tue, Feb 04, 2014 at 10:57:09PM -0800, Dan Williams wrote:
> On Thu, Jan 30, 2014 at 6:10 AM, Mathias Nyman
> <mathias.ny...@linux.intel.com> wrote:
> > @@ -1722,6 +1723,12 @@ void xhci_mem_cleanup(struct xhci_hcd *xhci)
> > kfree(cur_cd);
> > }
> >
> > +   list_for_each_entry_safe(cur_cmd, next_cmd,
> > +   &xhci->cmd_list, cmd_list) {
> > +   list_del(&cur_cmd->cmd_list);
> > +   kfree(cur_cmd);
> > +   }
> > +
> 
> Aren't commands on the cmd_list currently being executed, or are there
> other guarantees that make sure all commands have terminated?

By the time we get to xhci_mem_cleanup, we've done our best effort to
halt the xHCI host controller.  That could time out, I suppose, but I'm
not sure what we're expected to do in that case.  If the host won't
halt, I'm not sure it will be able to, say, respond to the request to
cancel the current command.

Sarah Sharp
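
A side note on the quoted hunk: it uses list_for_each_entry_safe() because every entry is freed while the list is being walked, so the pointer to the next entry has to be saved before the current one goes away. The same pattern, sketched in userspace C on a hand-rolled singly linked list (struct cmd, push_cmd() and free_all_cmds() are hypothetical names, not the driver's):

```c
#include <stdlib.h>

/* Hypothetical stand-in for a queued command; not the driver's structure. */
struct cmd {
	int id;
	struct cmd *next;
};

/* Push a command at the head. Returns the new head, or the old head if
 * the allocation fails. */
static struct cmd *push_cmd(struct cmd *head, int id)
{
	struct cmd *c = malloc(sizeof(*c));

	if (!c)
		return head;
	c->id = id;
	c->next = head;
	return c;
}

/* Free every queued command and return how many were freed. */
static int free_all_cmds(struct cmd **head)
{
	struct cmd *cur = *head;
	struct cmd *next;
	int freed = 0;

	while (cur) {
		next = cur->next;	/* save the successor before freeing cur */
		free(cur);
		freed++;
		cur = next;
	}
	*head = NULL;
	return freed;
}
```

Reading cur->next after free(cur) would be a use-after-free, which is what the plain (non-safe) iteration macro would amount to here.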

Re: [RFCv2 00/10] xhci: re-work command queue management

2014-01-30 Thread Sarah Sharp

On Thu, Jan 30, 2014 at 02:25:48PM +, David Laight wrote:
> I think it would be much simpler to allocate a parallel array to the actual
> hardware command ring that contains the additional information for the request
> (instead of allocating it per-request).
> This would immediately solve any problems allocating the memory from interrupt
> context and failing to free it correctly in all the code paths.
> 
> A similar solution could be used for the transfer rings, thus removing the
> need for the 'td' list - which there are reports of it failing to find
> transfers, and the code paths for aborting isoch transfers are badly broken.
> 
> Adding another list that will have its own set of bugs seems retrograde to me.

I do not have a problem with it.  The shadow ring is an optimization we
can look at later.

Sarah Sharp

Re: latest git usb3.0 ports not working

2014-01-29 Thread Sarah Sharp

On Tue, Jan 21, 2014 at 02:17:06PM -0800, Sarah Sharp wrote:
> On Tue, Jan 21, 2014 at 07:47:22PM +0100, Branimir Maksimovic wrote:
> > asus maximus v gene motherboard,
> > this is from dmesg:
> > 
> > [   75.576160] xhci_hcd :03:00.0: Timeout while waiting for a slot
> > [   88.991634] xhci_hcd :03:00.0: Stopped the command ring
> > failed, maybe the host is dead
> > [   88.991748] xhci_hcd :03:00.0: Abort command ring failed
> > [   88.991845] xhci_hcd :03:00.0: HC died; cleaning up
> > [   93.985489] xhci_hcd :03:00.0: Timeout while waiting for a slot
> > [   93.985494] xhci_hcd :03:00.0: Abort the command ring, but
> > the xHCI is dead.
> > [   98.982586] xhci_hcd :03:00.0: Timeout while waiting for a slot
> > [   98.982591] xhci_hcd :03:00.0: Abort the command ring, but
> > the xHCI is dead.
> > [  103.979696] xhci_hcd :03:00.0: Timeout while waiting for a slot
> > [  103.979702] xhci_hcd :03:00.0: Abort the command ring, but
> > the xHCI is dead.
> 
> By latest git, do you mean linus/master, or 3.13.0?  If it's
> linus/master, please provide the commit ID.  Which kernel version worked
> for you?

Try reverting commit 7dd09a1af2c7150269350aaa567a11b06e831003 "xhci:
replace xhci_write_64() with writeq()" and let me know if it worked for
you.

I would like to help you, but I can't if you don't respond.

Sarah Sharp

Re: [RFC 00/10] xhci: re-work command queue management

2014-01-27 Thread Sarah Sharp

Hi Keith,

You've told me in the past that you've run into an issue where you can
hang the xHCI driver when one of your TeleMetrum boards refuses to
respond to a Set Address command.

Can you test the following patchset, and see if it fixes your problem?
I've applied the patchset against 3.13 here:

https://git.kernel.org/cgit/linux/kernel/git/sarah/xhci.git/log/?h=for-usb-next-command-queue

Thanks,
Sarah Sharp

On Mon, Jan 13, 2014 at 05:05:49PM +0200, Mathias Nyman wrote:
> This is an attempt to re-work and solve the issues in xhci command
> queue management that Sarah has described earlier:
> 
> Right now, the command management in the xHCI driver is rather ad hoc.
> Different parts of the driver all submit commands, including interrupt
> handling routines, functions called from the USB core (with or without the
> bus bandwidth mutex held).
> Sometimes they need to wait for the command to complete, and sometimes
> they just issue the command and don't care about the result of the command.
> 
> The places that wait on a command all time the command for five seconds,
> and then attempt to cancel the command.
> Unfortunately, that means if several commands are issued at once, and one of
> them times out, all the commands time out, even though the host hasn't gotten
> a chance to service them yet.
> 
> This is apparent with some devices that take a long time to respond to the
> Set Address command during device enumeration (when the device is plugged in).
> If a driver for a different device attempts to change alternate interface
> settings at the same time (causing a Configure Endpoint command to be issued),
> both commands time out.
> 
> Instead of having each command time out after five seconds, the driver should
> wait indefinitely in an uninterruptible sleep on the command completion.
> A global command queue manager should time whatever command is currently
> running, and cancel that command after five seconds.
> 
> If the commands were in a list, like TDs currently are, it may be easier to
> keep track of where the command ring dequeue pointer is, and avoid racing
> with events.
> We may need to have parts of the driver that issue commands without waiting on
> them still put the commands in the command list.
> 
> The Implementation:
> ---
> 
> First step is to create a list of the commands submitted to the command queue.
> To accomplish this each command is required to be submitted with a properly
> filled command structure containing completion, status variable and a pointer
> to the command TRB that will be used.
> 
> The first 7 patches are all about creating these command structures and
> submitting them when we queue commands.
> The command structures are allocated on the fly; the commands that are
> submitted in interrupt context are allocated with GFP_ATOMIC.
> 
> Next, the global command queue is introduced. Commands are added to the queue
> when TRBs are queued, and removed when the command completes.
> Also switch to use the status variable and completion in the command struct.
> 
> A new timer handles command timeout. The timer is kicked every time a
> command finishes and there's a new command waiting in the queue, or when a new
> command is submitted to an _empty_ command queue.
> The timer is deleted when the last command on the queue finishes (empty queue).
> 
> The old cancel_cmd_list is removed.
> When the timer expires we simply tag the current command as "ABORTED" and
> start the ring abortion process. Functions waiting for an aborted command to
> finish are called after the command abortion is completed.
> 
> Mathias Nyman (10):
>   xhci: Use command structures when calling xhci_configure_endpoint
>   xhci: use a command structure internally in xhci_address_device()
>   xhci: use command structures for xhci_queue_slot_control()
>   xhci: use command structures for xhci_queue_stop_endpoint()
>   xhci: use command structure for xhci_queue_new_dequeue_state()
>   xhci: use command structures for xhci_queue_reset_ep()
>   xhci: Use command structured when queuing commands
>   xhci: Add a global command queue
>   xhci: Use completion and status in global command queue
>   xhci: rework command timeout and cancellation,
> 
>  drivers/usb/host/xhci-hub.c  |  42 ++--
>  drivers/usb/host/xhci-mem.c  |  22 +-
>  drivers/usb/host/xhci-ring.c | 532 
> ++-
>  drivers/usb/host/xhci.c  | 264 +++--
>  drivers/usb/host/xhci.h  |  43 ++--
>  5 files changed, 373 insertions(+), 530 deletions(-)
> 
> -- 
> 1.8.1.2
> 
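
The timer rules in the cover letter above can be modelled in a few lines. This is a toy userspace sketch, not code from the patchset: struct cmd_queue, its fields and the cmdq_*() functions are hypothetical, time is a plain counter in seconds, and "arming the timer" is reduced to recording a deadline.

```c
#include <stdbool.h>

#define CMD_TIMEOUT_SECS 5	/* the five seconds named in the cover letter */

/* Hypothetical model of the global command queue's timer state. */
struct cmd_queue {
	int pending;		/* queued commands, including the running one */
	bool timer_armed;
	long deadline;		/* only meaningful while timer_armed is true */
};

/* A command submitted to an empty queue starts the timer. A command queued
 * behind others does not touch it: only the running command is timed. */
static void cmdq_submit(struct cmd_queue *q, long now)
{
	if (q->pending == 0) {
		q->timer_armed = true;
		q->deadline = now + CMD_TIMEOUT_SECS;
	}
	q->pending++;
}

/* When a command finishes, restart the timer for the next one, or delete
 * the timer if the queue is now empty. */
static void cmdq_complete(struct cmd_queue *q, long now)
{
	if (q->pending == 0)
		return;
	q->pending--;
	if (q->pending > 0) {
		q->timer_armed = true;
		q->deadline = now + CMD_TIMEOUT_SECS;
	} else {
		q->timer_armed = false;
	}
}

/* True once the currently running command has used up its time budget. */
static bool cmdq_timed_out(const struct cmd_queue *q, long now)
{
	return q->timer_armed && now >= q->deadline;
}
```

With per-command timeouts, a command submitted at t=3 behind a slow one would expire at t=8 even if it only started running at t=4. In this model its deadline is counted from the moment it reaches the head of the queue.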

Re: latest git usb3.0 ports not working

2014-01-21 Thread Sarah Sharp

On Tue, Jan 21, 2014 at 07:47:22PM +0100, Branimir Maksimovic wrote:
> asus maximus v gene motherboard,
> this is from dmesg:
> 
> [   75.576160] xhci_hcd :03:00.0: Timeout while waiting for a slot
> [   88.991634] xhci_hcd :03:00.0: Stopped the command ring
> failed, maybe the host is dead
> [   88.991748] xhci_hcd :03:00.0: Abort command ring failed
> [   88.991845] xhci_hcd :03:00.0: HC died; cleaning up
> [   93.985489] xhci_hcd :03:00.0: Timeout while waiting for a slot
> [   93.985494] xhci_hcd :03:00.0: Abort the command ring, but
> the xHCI is dead.
> [   98.982586] xhci_hcd :03:00.0: Timeout while waiting for a slot
> [   98.982591] xhci_hcd :03:00.0: Abort the command ring, but
> the xHCI is dead.
> [  103.979696] xhci_hcd :03:00.0: Timeout while waiting for a slot
> [  103.979702] xhci_hcd :03:00.0: Abort the command ring, but
> the xHCI is dead.

By latest git, do you mean linus/master, or 3.13.0?  If it's
linus/master, please provide the commit ID.  Which kernel version worked
for you?

Sarah Sharp

Re: [BUGREPORT] Linux USB 3.0

2014-01-20 Thread Sarah Sharp

> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] PGD 0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Oops:  [#1] SMP
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Modules linked in:
> >>> videodev pci_stub vboxpci(OF) vboxnetadp(OF) vboxnetflt(OF)
> >>> vboxdrv(OF) dm_crypt snd_hda_codec_ca0132 snd_hda_intel snd_hda_codec
> >>> snd_hwdep snd_pcm snd_seq_midi dm_multipath psmouse scsi_dh
> >>> snd_rawmidi serio_raw sb_edac snd_seq_midi_event edac_core snd_seq
> >>> snd_timer snd_seq_device lpc_ich snd bnep rfcomm soundcore
> >>> snd_page_alloc bluetooth mei_me mei mac_hid ppdev nfsd w83627ehf
> >>> hwmon_vid nfs_acl auth_rpcgss coretemp nfs fscache lockd lp parport
> >>> sunrpc raid10 raid456 async_pq async_xor async_memcpy
> >>> async_raid6_recov async_tx raid0 multipath linear btrfs raid6_pq xor
> >>> libcrc32c osst st raid1 tg3 mptsas firewire_ohci ptp mxm_wmi
> >>> firewire_core ahci mptscsih pps_core crc_itu_t libahci mpt2sas mptbase
> >>> wmi scsi_transport_sas raid_class [last unloaded: vmnet]
> >>>
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] CPU: 0 PID: 0 Comm: 
> >>> swapper/0 Tainted: GF O 3.12.0-031200-generic 
> >>> #201311031935
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Hardware name: To Be 
> >>> Filled By O.E.M. To Be Filled By O.E.M./X79 Extreme9, BIOS P3.30 
> >>> 01/28/2013
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] task: 81c144a0 ti: 
> >>> 81c0 task.ti: 81c0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RIP: 0010:[]  [] 
> >>> finish_td+0x13f/0x250

It would help if your client could reproduce this oops on their machine,
and then run markup_oops.pl to find out exactly where the driver is
oopsing.  I suspect it has to do with the bad completion length in the
line above, but it could be unrelated.

> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RSP: 0018:88102fc03ca8 
> >>>  EFLAGS: 00010046
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RAX: 880f865d2b10 RBX: 
> >>> 880f865d2b00 RCX: 0006
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RDX: 880f865d2b10 RSI: 
> >>> 0007 RDI: 0046
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] RBP: 88102fc03d08 R08: 
> >>> 000a R09: 
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] R10: 06fd R11: 
> >>> 06fc R12: 880fd2de
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] R13: 880fd32b1780 R14: 
> >>>  R15: 880fd5c5f000
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] FS: () 
> >>> GS:88102fc0() knlGS:
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] CS:  0010 DS:  ES: 
> >>>  CR0: 80050033
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] CR2: 0004 CR3: 
> >>> 01c0d000 CR4: 000407f0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Stack:
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]  88102fc03ce8 
> >>> 880fd0bc8000 88102fc03d00 880fd268d1a0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]  88102fc03df4 
> >>> 00010002 880fd32b1780 880f865d2b00
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]  880fd268d1a0 
> >>> 880fd5c5f000 880fd2de 880fd2c497b0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450] Call Trace:
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] 
> >>> process_bulk_intr_td+0x116/0x2d0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] 
> >>> handle_tx_event+0x656/0xb50
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] ? 
> >>> __queue_work+0x3b0/0x3c0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] ? 
> >>> call_timer_fn+0x46/0x160
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] 
> >>> xhci_handle_event+0x1db/0x2a0
> >>> Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] ?  
> >>> run_timer_softirq+0x1b2/0x300
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] xhci_irq+0x120/0x1f0
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] xhci_msi_irq+0x11/0x20
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] 
> >>> handle_irq_event_percpu+0x5d/0x210
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] 
> >>> handle_irq_event+0x48/0x70
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] ?  
> >>> native_apic_msr_eoi_write+0x14/0x20
> >>> Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] 
> >>> handle_edge_irq+0x77/0x110

Sarah Sharp

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst [NEW HARDWARE]

2014-01-20 Thread Sarah Sharp

On Mon, Jan 20, 2014 at 11:21:14AM +, David Laight wrote:
> From: walt 
> > On 01/17/2014 06:34 AM, David Laight wrote:
> > 
> > > Can you try the patch I posted that stops the ownership on LINK TRBs
> > > being changed before that on the linked-to TRB?
> > 
> > Please disregard my earlier post about the patch not applying cleanly.
> > That was the usual html corruption, so I found the original on the usb
> > list and it was okay.
> > 
> > Sadly, the patch didn't fix the ASMedia lockup behavior, however :(
> > 
> > I did notice that the lockup occurred only when copying *to* the usb3
> > drive, and not when copying from it.  I think that may be new behavior
> > but I can't swear to it.
> 
> Consistent with another report that says that ethernet worked provided
> that TSO was disabled (i.e. no sg tx).
> (Without the patch to delay the ownership change on link TRBs it didn't
> work at all.)

Please be more clear.  What do you mean by these statements?  That
someone privately reported that your earlier patch [1] did not help
them, but applying your new patch [2] on top of the old patch did?

[1] http://marc.info/?l=linux-usb&m=138418996717941&w=2
[2] http://marc.info/?l=linux-usb&m=138996538403468&w=2

In general, will you please Cc me and the USB list when replying to
privately reported bugs/confirmations that patches work?  Or if the
confirmation was reported, please provide a link to the mailing list
discussion or bugzilla entry.  We need to keep bug and fix confirmations
publicly archived.  Please keep me on Cc since I filter mail based on
that.

> A guess...
> 
> In queue_bulk_sg_tx() try calling xhci_v1_0_td_remainder() instead
> of xhci_td_remainder().

Why?  Walt has a 0.96 xHCI host controller, and the format for how to
calculate the TD remainder changed between the 0.96 and the 1.0 spec.
That's why we have xhci_v1_0_td_remainder() and xhci_td_remainder().

Sarah Sharp
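
To make the difference between the two TD remainder formats concrete, here is a simplified userspace sketch. It reflects one reading of the driver of that era and should be checked against the xHCI specification: the 0.96 format counts remaining bytes in units of 1 KiB, the 1.0 format counts max-packet-sized packets still to come after the current TRB, and both saturate at 31 because the field is five bits wide. The function names and parameters are hypothetical, and the shift into the TRB's TD Size bit position is left out.

```c
#include <stdint.h>

#define TD_SIZE_MAX 31u	/* largest value a 5-bit TD Size field can hold */

/* 0.96-style remainder: bytes left in the TD (including this TRB),
 * right-shifted by 10 and capped at 31. */
static uint32_t td_size_v0_96(uint32_t bytes_remaining)
{
	uint32_t kib = bytes_remaining >> 10;

	return kib >= TD_SIZE_MAX ? TD_SIZE_MAX : kib;
}

/* 1.0-style remainder: packets left in the TD once this TRB is done,
 * capped at 31. bytes_queued includes the current TRB; max_packet must
 * be non-zero. */
static uint32_t td_size_v1_0(uint32_t total_packets, uint32_t bytes_queued,
			     uint32_t max_packet)
{
	uint32_t done = bytes_queued / max_packet;
	uint32_t left = total_packets > done ? total_packets - done : 0;

	return left > TD_SIZE_MAX ? TD_SIZE_MAX : left;
}
```

For a 5120-byte TD with 512-byte packets, after 1024 bytes have been queued the 1.0 format reports 8 packets, while the 0.96 format reports 4 for the 4096 bytes still remaining. The two encodings are not interchangeable on the same hardware.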

Re: [BUGREPORT] Linux USB 3.0

2014-01-20 Thread Sarah Sharp

 mei_me mei mac_hid ppdev nfsd w83627ehf
  hwmon_vid nfs_acl auth_rpcgss coretemp nfs fscache lockd lp parport
  sunrpc raid10 raid456 async_pq async_xor async_memcpy
  async_raid6_recov async_tx raid0 multipath linear btrfs raid6_pq xor
  libcrc32c osst st raid1 tg3 mptsas firewire_ohci ptp mxm_wmi
  firewire_core ahci mptscsih pps_core crc_itu_t libahci mpt2sas mptbase
  wmi scsi_transport_sas raid_class [last unloaded: vmnet]
 
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] CPU: 0 PID: 0 Comm: 
  swapper/0 Tainted: GF O 3.12.0-031200-generic 
  #201311031935
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] Hardware name: To Be 
  Filled By O.E.M. To Be Filled By O.E.M./X79 Extreme9, BIOS P3.30 
  01/28/2013
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] task: 81c144a0 ti: 
  81c0 task.ti: 81c0
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] RIP: 0010:[]  [] 
  finish_td+0x13f/0x250

It would help if your client could reproduce this oops on their machine,
and then run markup_oops.pl to find out exactly where the driver is
oopsing.  I suspect it has to do with the bad completion length in the
line above, but it could be unrelated.

  Dec 24 14:30:39 homenas kernel: [ 1469.822450] RSP: 0018:88102fc03ca8 
   EFLAGS: 00010046
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] RAX: 880f865d2b10 RBX: 
  880f865d2b00 RCX: 0006
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] RDX: 880f865d2b10 RSI: 
  0007 RDI: 0046
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] RBP: 88102fc03d08 R08: 
  000a R09: 
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] R10: 06fd R11: 
  06fc R12: 880fd2de
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] R13: 880fd32b1780 R14: 
   R15: 880fd5c5f000
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] FS: () 
  GS:88102fc0() knlGS:
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] CS:  0010 DS:  ES: 
   CR0: 80050033
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] CR2: 0004 CR3: 
  01c0d000 CR4: 000407f0
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] Stack:
  Dec 24 14:30:39 homenas kernel: [ 1469.822450]  88102fc03ce8 
  880fd0bc8000 88102fc03d00 880fd268d1a0
  Dec 24 14:30:39 homenas kernel: [ 1469.822450]  88102fc03df4 
  00010002 880fd32b1780 880f865d2b00
  Dec 24 14:30:39 homenas kernel: [ 1469.822450]  880fd268d1a0 
  880fd5c5f000 880fd2de 880fd2c497b0
  Dec 24 14:30:39 homenas kernel: [ 1469.822450] Call Trace:
  Dec 24 14:30:39 homenas kernel: [ 1469.822450]
  Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] 
  process_bulk_intr_td+0x116/0x2d0
  Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] 
  handle_tx_event+0x656/0xb50
  Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] ? 
  __queue_work+0x3b0/0x3c0
  Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] ? 
  call_timer_fn+0x46/0x160
  Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] 
  xhci_handle_event+0x1db/0x2a0
  Dec 24 14:30:39 homenas kernel: [ 1469.822450]  [] ?  
  run_timer_softirq+0x1b2/0x300
  Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] xhci_irq+0x120/0x1f0
  Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] xhci_msi_irq+0x11/0x20
  Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] 
  handle_irq_event_percpu+0x5d/0x210
  Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] 
  handle_irq_event+0x48/0x70
  Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] ?  
  native_apic_msr_eoi_write+0x14/0x20
  Dec 24 14:30:39 homenas kernel: [ 1470.312076]  [] 
  handle_edge_irq+0x77/0x110

Sarah Sharp

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst [NEW HARDWARE]

2014-01-16 Thread Sarah Sharp

On Tue, Jan 14, 2014 at 01:27:25PM -0800, walt wrote:
> On 01/14/2014 09:20 AM, Sarah Sharp wrote:
> > On Mon, Jan 13, 2014 at 03:39:07PM -0800, walt wrote:
> 
> >> Sarah, I just fixed my xhci bug for US$19.99 :)
> >>
> >> #lspci | tail -1
> >> 04:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller 
> >> (rev 03)
> >>
> >> This new NEC usb3 controller does everything the ASMedia controller should 
> >> have
> >> done from the start.
> 
>  
> > I just got a similar report from someone with a Fresco Logic host
> > controller, so I may need you to test a work-around patch for the
> > ASMedia host at some point.

Hmm, the Fresco Logic host issue seems unrelated.

> Oy, Sarah! ;)  I put the ASMedia adapter in my older amd64 machine, and, well,
> the stupid thing Just Works(TM) with kernel 3.12.7!  (Yes, with the same disk
> docking station, too.)

Ugh.  Well, I suppose we can chalk it up to hardware failure?  I think
you're the only one to report a verified issue with the Link TRB patch.

You are sure you're running a vanilla 3.12.7 kernel, right?

> I can't believe the adapter works perfectly in a different computer.  Have you
> seen this kind of thing before?

No, at least not this particular host-dying-only-on-one-machine failure
mode.

> At the moment I have two machines using your xhci driver and both work 
> perfectly,
> so I thank you again :)
> 
> I'm not sure where to go with this next.  I could put the adapter back in the
> other machine again if you have more patches to test.

I think any patches I was going to send are moot with this new
information.  The issue with your PCI add-in card only happens in
combination with a specific motherboard, so I don't think it makes sense
to disable the no-op TRBs for that host.

If lots of other people start reporting the same issue with the ASMedia
0.96 host, and reverting commit 35773dac5f862cb1c82ea151eba3e2f6de51ec3e
"usb: xhci: Link TRB must not occur within a USB payload burst" helps
them, then I'll reconsider that decision.

Thank you so much for your patience while debugging this issue, and
being willing to try all sorts of kernels and patches.  I'm glad we
finally figured out what the issue was, and you have working xHCI hosts
now.

Sarah Sharp

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst [NEW HARDWARE]

2014-01-14 Thread Sarah Sharp

On Mon, Jan 13, 2014 at 03:39:07PM -0800, walt wrote:
> On 01/09/2014 03:50 PM, Sarah Sharp wrote:
> 
> >>> On Tue, Jan 07, 2014 at 03:57:00PM -0800, walt wrote:
> >>
> >> I've wondered if my xhci problems might be caused by hardware quirks, and
> >> wondering why I seem to be the only one who has this problem.
> >>
> >> Maybe I could "take one for the team" by buying new hardware toys that I
> >> don't really need but I could use to test the xhci driver?  (I do enjoy
> >> buying new toys, necessary, or, um, maybe not :)
> > 
> > It would be appreciated if you could see if your device causes other
> > host controllers to fail.  Who am I to keep a geek from new toys? ;)
> 
> Sarah, I just fixed my xhci bug for US$19.99 :)
> 
> #lspci | tail -1
> 04:00.0 USB controller: NEC Corporation uPD720200 USB 3.0 Host Controller 
> (rev 03)
> 
> This new NEC usb3 controller does everything the ASMedia controller should 
> have
> done from the start.  I can even power-up the outboard usb3 disk docking 
> station
> after the computer is running and still everything Just Works :)
> 
> I appreciate all the debugging work you've done to fix the ASMedia problem but
> I think it's time to quit now.  If hundreds of irate linux users complain 
> about
> the same bug then I'll be happy to test more patches.

I just got a similar report from someone with a Fresco Logic host
controller, so I may need you to test a work-around patch for the
ASMedia host at some point.

I'm considering disabling the effect of David's patch for those two host
controllers.  That will mean USB storage works fine, but USB ethernet
may fail.

I had considered just disabling scatter-gather for the hosts, but we can
still run into the ethernet issue if we need to break a TRB at a 64KB
boundary.  So disabling scatter-gather would make USB ethernet work
_most of the time_, but fail intermittently, and USB storage performance
would be impacted.  Since USB ethernet will fail in either case, I would
rather keep USB storage performance and sacrifice USB ethernet on those
particular hosts.
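
As a rough illustration of why turning off scatter-gather does not remove
the problem, here is a small sketch that counts how many TRBs a single
contiguous buffer needs when no TRB may cross a 64KB boundary. The function
name is hypothetical and this is not the driver's queueing code; the only
assumption is the 64KB limit itself.

```c
#include <stdint.h>

/* A TRB's data buffer is assumed not to cross a 64KB boundary. */
#define TRB_MAX_BUFF_SIZE 0x10000u

/* Hypothetical helper: TRBs needed for one contiguous buffer. */
static unsigned int trbs_for_buffer(uint64_t addr, uint32_t len)
{
	unsigned int n = 0;

	do {
		/* Bytes left before the next 64KB boundary. */
		uint32_t chunk = TRB_MAX_BUFF_SIZE -
				 (uint32_t)(addr & (TRB_MAX_BUFF_SIZE - 1));

		if (chunk > len)
			chunk = len;
		addr += chunk;
		len -= chunk;
		n++;
	} while (len > 0);

	return n;
}
```

A 32-byte buffer that starts 16 bytes below a boundary (address 0xfff0) still
needs two TRBs, so even a non-sg transfer can be split into several TRBs.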

So please keep the ASMedia host around for testing, if possible.

Sarah Sharp

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-09 Thread Sarah Sharp

[Walt, please use reply-all to keep the list in the loop, thanks.]

On Wed, Jan 08, 2014 at 04:09:14PM +0000, David Laight wrote:
> > From: Sarah Sharp
> > On Tue, Jan 07, 2014 at 03:57:00PM -0800, walt wrote:
> > > On 01/07/2014 01:21 PM, Sarah Sharp wrote:
> > >
> > > > Can you please try the attached patch, on top of the previous three
> > > > patches, and send me dmesg?
> > >
> > > Hi Sarah, I just now finished running 0001-More-debugging.patch for the
> > > first time.  The previous dmesg didn't include that patch, but this one
> > > does.
> > >
> > > I read through this dmesg but I nodded off somewhere around line 500.
> > > I hope you can stay awake :)
> > 
> > Well, it has all the info I need, but the results don't make me too
> > happy.  Everything I've checked seems consistent, and I don't know why
> > the host stopped.  The link TRBs are intact, the dequeue pointer for the
> > endpoint was pointing to the transfer that timed out and it had the
> > cycle bit set correctly, etc.  Perhaps the no-op TRBs are really the
> > issue.
> > 
> > I'll have to take a look at the log again tomorrow.  I posted the dmesg
> > on pastebin if David wants to check it out as well:
> > http://pastebin.com/a4AUpsL1
> 
> I can't see anything obvious either.
> However there is no response to the 'stop endpoint' command.
> Section 4.6.9 (page 107 of rev1.0) states that the controller will complete
> any USB IN or OUT transaction before raising the command completion event.
> Possibly it is too 'stuck' to complete the transaction?

The host has to stop processing the transaction, it can't "wait" for the
transaction to finish.  "The Stop Endpoint Command is expected to stop
endpoint activity as soon as possible, which may mean that it stops in
the middle of a TRB."

Usually when hosts get into this kind of mode, something has seriously
gone wrong, like a bus error when it issues a bad memory access.

> The endpoint status is also still '1' (running).
> This also means that the 'TR dequeue pointer' is undefined - so the
> controller could easily be processing a later TRB.
> This field might even still contain the ring base address written by
> the driver much earlier.
> 
> This might mean that something 'catastrophic' has happened earlier.
> Maybe the controller isn't actually seeing any doorbell writes at all.
> Maybe the base addresses it has for the rings have all got corrupted.
> At least this looks like amd64 - so there aren't memory coherency issues.
> 
> Some hacks that might help isolate the problem:
> 1) Request an interrupt from the last nop data TRB.
> 2) Put a command nop (decimal 23) TRB into the command ring before
>    the 'stop endpoint'.
> 3) Comment out the code that adds the nop data TRBs.
> The first two might need code adding to handle the responses.
> 
> Do we know the actual xhci device?
> I think it reports version 0x96.
> (Sarah - it might be useful if that version were in one of the trace
> messages that is output by default.)

You mean print the PCI device and vendor ID?  Perhaps Subsystem vendor
as well?

On Tue, Jan 07, 2014 at 05:26:37PM -0800, walt wrote:
> On 01/07/2014 04:47 PM, Sarah Sharp wrote:
>  
> > Can you send me the output of `sudo lspci -vvv -n`?  Maybe we can just
> > turn off scatter-gather for your host controller until we get a proper
> > fix in that uses link TRBs instead of no-op TRBs.
> 
> The aftermarket usb3 adapter card and the usb3 outboard hard-drive docking
> station are the only two usb3 devices I have.
> 
> I've wondered if my xhci problems might be caused by hardware quirks, and
> wondering why I seem to be the only one who has this problem.
> 
> Maybe I could "take one for the team" by buying new hardware toys that I
> don't really need but I could use to test the xhci driver?  (I do enjoy
> buying new toys, necessary, or, um, maybe not :)

It would be appreciated if you could see if your device causes other
host controllers to fail.  Who am I to keep a geek from new toys? ;)

In the meantime, try this patch, which is something of a long shot.

Sarah Sharp
From 19e2ab85ac2cc0d84f56247dcf29bdce14bd70d5 Mon Sep 17 00:00:00 2001
From: Sarah Sharp 
Date: Thu, 9 Jan 2014 15:46:04 -0800
Subject: [PATCH] xhci: Enable Link TRB quirk for 0.96 ASMedia host.

A recent bug fix commit causes an ASMedia host to stop responding to
commands.  See if it needs the link TRB quirk.  This was generally only
necessary for 0.95 hosts, but maybe this 0.96 host needs it.

Signed-off-by: Sarah Sharp 
---
 drivers/usb/host/xhci-pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 3c898c12a06b..8196ac2289e4 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -92,6 +92,8 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
 		xhci->quirks |= XHCI_TRUST_TX_LENGTH;
 	}
 
+	if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA && pdev->device == 1042)
+		xhci->quirks |= XHCI_LINK_TRB_QUIRK;
 	if (pdev->vendor == PCI_VENDOR_ID_NEC)
 		xhci->quirks

Re: [PATCH] usb:hub set hub->change_bits when over-current happens

2014-01-08 Thread Sarah Sharp

On Wed, Jan 08, 2014 at 12:49:57PM -0500, Alan Stern wrote:
> On Wed, 8 Jan 2014, Greg KH wrote:
> 
> > On Wed, Jan 08, 2014 at 02:45:42PM +0800, Shen Guang wrote:
> > > > When doing compliance testing with xHCI, we found that if we
> > > > enable CONFIG_USB_SUSPEND and plug in a bad device that causes an
> > > > over-current condition on the root port, software is not notified.
> > > > The reason is that the current code doesn't set hub->change_bits in
> > > > hub_activate() when over-current happens, so hub_events() will
> > > > not check the port status because it thinks nothing changed.
> > > > If CONFIG_USB_SUSPEND is disabled, the interrupt pipe of the hub will
> > > > report the change and set hub->event_bits, and then hub_events() will
> > > > check what events happened. In this case over-current can be detected.
> > > 
> > > Signed-off-by: Shen Guang 
> > > ---
> > >  drivers/usb/core/hub.c |3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> > > index bd9dc35..98b5679 100644
> > > --- a/drivers/usb/core/hub.c
> > > +++ b/drivers/usb/core/hub.c
> > > @@ -1154,7 +1154,8 @@ static void hub_activate(struct usb_hub *hub,
> > > enum hub_activation_type type)
> > > /* Tell khubd to disconnect the device or
> > >  * check for a new connection
> > >  */
> > > -   if (udev || (portstatus & 
> > > USB_PORT_STAT_CONNECTION))
> > > +   if (udev || (portstatus & 
> > > USB_PORT_STAT_CONNECTION) ||
> > > +   (portstatus & USB_PORT_STAT_OVERCURRENT))
> > >     set_bit(port1, hub->change_bits);
> > > 
> > > } else if (portstatus & USB_PORT_STAT_ENABLE) {
> > > --
> > > 1.7.9.5
> > 
> > Alan and Sarah, any objection to this patch?
> 
> It seems okay to me.
> 
> Acked-by: Alan Stern 

Looks fine to me as well.

Acked-by: Sarah Sharp 
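
For anyone skimming the thread, the behavioural change boils down to one
extra term in the condition that decides whether a port gets flagged in
hub->change_bits. A minimal standalone sketch follows; the function names
are hypothetical, and the two bit values are assumed to match the wPortStatus
definitions in the kernel's ch11.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the wPortStatus bit definitions. */
#define USB_PORT_STAT_CONNECTION  0x0001
#define USB_PORT_STAT_OVERCURRENT 0x0008

/* Condition before the patch: over-current alone never flags the port. */
static bool port_flagged_before(bool has_udev, uint16_t portstatus)
{
	return has_udev || (portstatus & USB_PORT_STAT_CONNECTION);
}

/* Condition after the patch: over-current also flags the port. */
static bool port_flagged_after(bool has_udev, uint16_t portstatus)
{
	return has_udev || (portstatus & USB_PORT_STAT_CONNECTION) ||
	       (portstatus & USB_PORT_STAT_OVERCURRENT);
}
```

An empty port reporting only over-current is skipped by the first check and
caught by the second, which is the case the compliance test hit.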

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-07 Thread Sarah Sharp

On Tue, Jan 07, 2014 at 03:57:00PM -0800, walt wrote:
> On 01/07/2014 01:21 PM, Sarah Sharp wrote:
> 
> > Can you please try the attached patch, on top of the previous three
> > patches, and send me dmesg?
> 
> Hi Sarah, I just now finished running 0001-More-debugging.patch for the
> first time.  The previous dmesg didn't include that patch, but this one
> does.
> 
> I read through this dmesg but I nodded off somewhere around line 500.
> I hope you can stay awake :)

Well, it has all the info I need, but the results don't make me too
happy.  Everything I've checked seems consistent, and I don't know why
the host stopped.  The link TRBs are intact, the dequeue pointer for the
endpoint was pointing to the transfer that timed out and it had the
cycle bit set correctly, etc.  Perhaps the no-op TRBs are really the
issue.

I'll have to take a look at the log again tomorrow.  I posted the dmesg
on pastebin if David wants to check it out as well:
http://pastebin.com/a4AUpsL1

Can you send me the output of `sudo lspci -vvv -n`?  Maybe we can just
turn off scatter-gather for your host controller until we get a proper
fix in that uses link TRBs instead of no-op TRBs.

Sarah Sharp

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-07 Thread Sarah Sharp

On Tue, Jan 07, 2014 at 12:00:00PM -0800, walt wrote:
> Okay, I used log_buf_len to make dmesg bigger and now I think I have
> the whole thing.  It's attached.

Walt, can you make sure the patch I sent you was applied?  The output
doesn't look like it is.

Sarah Sharp

Re: [PATCH v1] xhci: Switch Intel Lynx Point ports to EHCI on shutdown

2014-01-07 Thread Sarah Sharp

On Tue, Jan 07, 2014 at 11:03:00AM +0100, Takashi Iwai wrote:
> At Mon, 06 Jan 2014 14:34:28 +0200,
> Denis Turischev wrote:
> > 
> > Hi Sarah,
> > 
> > On 01/03/2014 02:03 AM, Sarah Sharp wrote:
> > > Denis, do all of Compulab's Haswell systems reboot on shutdown?  Are
> > > they all running a Phoenix BIOS?  Can you send me the output of `sudo
> > > lspci -vvv -s` for the xHCI host?
> > 
> > oem@oem-Intense-PC2 ~ $ sudo lspci -vvv -s 00:14.0
> > 00:14.0 USB controller: Intel Corporation Lynx Point-LP USB xHCI HC (rev 
> > 04) (prog-if 30 [XHCI])
> > Subsystem: Intel Corporation Device 7270
> > Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
> > Stepping- SERR- FastB2B- DisINTx+
> > Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- 
> > <TAbort- <MAbort- >SERR- <PERR- INTx-
> > Latency: 0
> > Interrupt: pin A routed to IRQ 59
> > Region 0: Memory at f062 (64-bit, non-prefetchable) [size=64K]
> > Capabilities: [70] Power Management version 2
> > Flags: PMEClk- DSI- D1- D2- AuxCurrent=375mA 
> > PME(D0-,D1-,D2-,D3hot+,D3cold+)
> > Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> > Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
> > Address: fee0200c  Data: 41b1
> > Kernel driver in use: xhci_hcd
> > 
> > > Basically, I'm trying to find a common variable to key off.  I suspect
> > > BIOS vendor is probably the right thing, instead of system vendor.

Hmm, since Compulab isn't the subsystem vendor, we can't enable the same
HP quirk using that piece of information.  We can't enable the quirk to
put the host into D3 for all Lynx Point-LP hosts, since that quirk
breaks other vendors' systems.  Does this impact any Lynx Point (non-LP)
systems as well?

So far, two of the other systems that don't react well to the quirk are
both ASRock systems with American Megatrends BIOSes, based on info
provided by Art and Meng.  I can see from Giorgos' posted lspci that his
xHCI also lists ASRock as the Subsystem vendor, although I don't know
what the BIOS manufacturer is.

Niklas's xHCI subsystem VID:PID is 1558:7410, which is CLEVO/KAPOK
Computer Device.  Looks like Clevo is a laptop manufacturer.

Giorgos and Niklas, can you post output from `sudo dmidecode` please?

> > By the way the quirk introduced by commit 
> > e95829f474f0db3a4d940cae1423783edd966027 "xhci: Switch PPT
> > ports to EHCI on shutdown." works for Lynx Point as well at least on 
> > Intense-PC2. I mean we can add
> > XHCI_SPURIOUS_REBOOT flag that invokes usb_disable_xhci_ports().
> > May be this solution works for HP and other systems without side effects?
> 
> No, we already tested it at first, but didn't fix the behavior on HP
> machines.  It was harmless as far as we've tested, though.

Denis, what do you mean by "works for Lynx Point"?  Do you mean that
adding the quirk to switch the ports to EHCI on shutdown (e95829f474)
for the Intense-PC2 *instead of* the commit to put the host in D3 on
shutdown (638298dc66) works?  Or do you mean you need both patches for
your system?

If you only need the quirk to switch the ports to EHCI on shutdown, then
we could apply that broadly to Lynx Point LP, and see whether other
BIOSes tolerate that quirk.

The alternative would be to turn on the D3 quirk for systems with an HP
or Phoenix BIOS, by checking dmi_name_in_vendors() for those strings.

Sarah Sharp

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-07 Thread Sarah Sharp

On Tue, Jan 07, 2014 at 01:58:32PM +0000, David Laight wrote:
> The dmesg contains:
> 
> [  538.728064] EXT4-fs warning (device dm-0): ext4_end_bio:316: I/O error 
> writing to inode 23330865 (offset 0 size 8388608 starting block 812628)
> 
> An 8MB transfer will need at least 128 ring entries (TRB) even if the request
> is a single contiguous memory block.
> 
> Are you using the patch that increases the ring size from 64 to 256?

It's likely that the block layer is breaking up the EXT4 write into
several transfers, since usb-storage limits overall transfer size to
120 KB.  In any case, I added more debugging in the last patch to print
the number of TRBs necessary.  That way we can verify the patch to limit
the number of scatter-gather list entries is working.
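
The TRB arithmetic in this exchange can be sketched as follows.  This is
only a minimal illustration, assuming that the data buffer of one TRB may
not cross a 64 KB boundary (the assumption behind David's figure of 128
TRBs for 8 MB).  The helper name count_trbs and the constant name are made
up for this example; this is not the driver's code.

```c
#include <stdint.h>

/* Assumed per-TRB limit: a buffer may not cross a 64 KB boundary. */
#define TRB_BOUNDARY (UINT64_C(1) << 16)

/*
 * Hypothetical helper: number of TRBs needed for one physically
 * contiguous buffer of `len` bytes starting at bus address `addr`.
 */
static unsigned int count_trbs(uint64_t addr, uint64_t len)
{
	uint64_t first_chunk, last_chunk;

	if (len == 0)
		return 1;	/* assumed: a zero-length transfer still uses one TRB */

	first_chunk = addr / TRB_BOUNDARY;
	last_chunk = (addr + len - 1) / TRB_BOUNDARY;
	return (unsigned int)(last_chunk - first_chunk + 1);
}
```

Under these assumptions count_trbs(0, 8 * 1024 * 1024) gives 128, while a
120 KB usb-storage transfer that starts on a 64 KB boundary needs only 2.
An unaligned start or a fragmented scatter-gather list raises the count.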

Sarah Sharp

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-07 Thread Sarah Sharp

On Tue, Jan 07, 2014 at 05:29:48AM -0800, walt wrote:
> On 01/06/2014 04:31 PM, Sarah Sharp wrote:
> > Hi Walt,
> > 
> > I have a couple of patches for you to test.
> 
> > Please only apply the first patch (which is diagnostic only), trigger
> > your issue, and send me the resulting dmesg.  Then try applying the
> > other two patches, and see if the issue goes away.  (I suspect it won't
> > but I can't be sure.)
> 
> Thanks Sarah.  dmesg0 is from the diagnostic patch only.  dmesg1 has all
> three patches applied.  Some of the messages in dmesg1 fell off the end of
> the kernel buffer, so I may need to make the buffer larger next time but
> I'll need a reminder of how to do it.

Set CONFIG_LOG_BUF_SHIFT to 21.
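
The shift value is a power-of-two exponent for the kernel log buffer size.
A tiny sketch of the arithmetic (the function name is made up for
illustration):

```c
#include <stddef.h>

/* Hypothetical helper: log buffer size in bytes for a given shift. */
static size_t log_buf_bytes(unsigned int shift)
{
	return (size_t)1 << shift;
}
```

log_buf_bytes(21) is 2097152 bytes, i.e. 2 MiB.  The same size can also be
requested at boot with the log_buf_len= kernel parameter, without
rebuilding the kernel.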

> As you suspected, the patches didn't fix the problem, sorry.

Yep, I thought so.  I did glean one bit of information from the logs: it
seems that your host does handle no-op TRBs, at least for a while.
However, after a bigger chunk of TRBs, it goes off into la-la-land.

Assuming one of the rings is comprised of two segments:
0xbb711000 (start)
0xbb7113f0 (end)
0xbb711400 (start)
0xbb7117f0 (end)

The log shows no-ops were inserted at:
0xbb7207d0
0xbb7206a0
0xbb720be0
0xbb720be0
0xbb720bd0
0xbb7207e0
0xbb711370 = 8 no-ops
0xbb7117c0 = 3
0xbb7113b0 = 4
0xbb7113a0 = 5
0xbb7117d0 = 2
0xbb711340 = 11
0xbb711770 = 8
0xbb711230 = 28
0xbb7117e0 = 1
0xbb7117b0 = 4
0xbb7113d0 = 2
0xbb7117b0 = 4
0xbb711340 = 11
0xbb711690 = 22

So the host was able to process 28 no-op TRBs, but failed on 22 no-ops
later.  The event ring debugging shows the last event was for
0xbb711680, which is the last TRB before the first no-op inserted before
the host died.  There's no Stop Endpoint Command completion, and it
looks like the command was correctly put on the command ring, so it
seems the host is actually hanging for some reason.
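
The no-op counts in the list above follow directly from the addresses.  A
minimal sketch, assuming 16-byte TRBs and 1 KB-aligned ring segments whose
last slot holds the link TRB (both consistent with the segment start/end
addresses given above); the function and constant names are hypothetical:

```c
#include <stdint.h>

#define TRB_SIZE	16u	/* assumed size of one TRB in bytes */
#define SEGMENT_BYTES	1024u	/* assumed: 64 TRBs per 1 KB-aligned segment */

/*
 * Hypothetical helper: how many no-op TRBs are needed to pad from the
 * enqueue address up to the link TRB in the last slot of its segment.
 */
static unsigned int noops_to_link_trb(uint64_t enqueue)
{
	uint64_t seg_start = enqueue & ~(uint64_t)(SEGMENT_BYTES - 1);
	uint64_t link_trb = seg_start + SEGMENT_BYTES - TRB_SIZE;

	return (unsigned int)((link_trb - enqueue) / TRB_SIZE);
}
```

For example, 0xbb711230 is 0x1c0 bytes short of the link TRB at 0xbb7113f0,
which is 28 TRBs, and 0xbb711690 is 0x160 bytes short of 0xbb7117f0, which
is 22.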

Unfortunately, I made a mistake in the debugging patch I sent
you, so it didn't print out the endpoint rings when the host died.  I
need that info, to see whether the link TRB was still intact, or if we
over-wrote it and caused the host to go fetch some invalid memory.

Can you please try the attached patch, on top of the previous three
patches, and send me dmesg?

> I find that I can tell in advance whether the copy is going to succeed,
> just by watching the light flicker on the usb3 drive.  When the flicker
> is absolutely regular, with no variation whatever, I can tell in 10 or
> 15 seconds that the copy will fail.
> 
> At the same time the light on the main drive goes dark after 10 seconds,
> implying that the usb3 drive stops receiving any data from the main drive
> after 10 seconds, yet the light on the usb3 drive continues to flicker as
> if writing data -- even after the cp officially fails.  The light on the
> usb3 drive never stops flickering until I reboot the machine or unplug
> the usb cable.

Interesting.  Without a USB analyzer, we can't really tell what's
happening.  However, one hypothesis could be that the blinking light is
triggered by an active SCSI command (read/write, etc).

There are three phases of the command: setup, data, and status.  I think
your device is getting the setup phase, and the host is dying before it
sends the data phase.  If the light blinks when it gets a setup phase,
and turns off when the device sends a status phase, that would explain
its behavior.

But that's just a hypothesis, I have no idea whether it's correct.

Sarah Sharp
From d085fb5b9630e935d7954fe5947b48402e43bdc1 Mon Sep 17 00:00:00 2001
From: Sarah Sharp <sarah.a.sh...@linux.intel.com>
Date: Tue, 7 Jan 2014 12:39:47 -0800
Subject: [PATCH] More debugging.

Signed-off-by: Sarah Sharp <sarah.a.sh...@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 2afaf15009e8..228ab8cf868e 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -993,6 +993,9 @@ void xhci_stop_endpoint_command_watchdog(unsigned long arg)
 	for (i = 0; i < MAX_HC_SLOTS; i++) {
 		if (!xhci->devs[i])
 			continue;
+
+		xhci_dbg(xhci, "Slot %d output context\n", i);
+		xhci_dbg_ctx(xhci, xhci->devs[i]->out_ctx, 30);
 		for (j = 0; j < 31; j++) {
 			temp_ep = &xhci->devs[i]->eps[j];
 			ring = temp_ep->ring;
@@ -1001,6 +1004,10 @@ void xhci_stop_endpoint_command_watchdog(unsigned long arg)
 			xhci_dbg_trace(xhci, trace_xhci_dbg_cancel_urb,
 					"Killing URBs for slot ID %u, "
 					"ep index %u", i, j);
+			xhci_dbg(xhci, "Dev %i Ep 0x%x:\n", i,
+					xhci_get_endpoint_address(j));
+			xhci_debug_ring(xhci, ring);
+			xhci_dbg_ring_ptrs(xhci, ring);
 			while (!list_empty(&ring->td_list)) {
 				cur_td = list_first_entry(&ring->td_list,
 						struct xhci_td,
@@ -1011,12 +1018,6 @@ void xhci_stop_endpoint_command_watchdog(unsigned long arg)
 				xhci_giveback_urb_in_irq(xhci, cur_td,
 						-ESHUTDOWN, "killed");
 			}
-			if (!list_empty(&temp_ep->cancelled_td_list)) {
-				xhci_dbg(xhci, "Dev %i Ep 0x%x:\n", i

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-06 Thread Sarah Sharp

On Fri, Jan 03, 2014 at 03:29:29PM -0800, Sarah Sharp wrote:
> On Fri, Jan 03, 2014 at 01:21:18PM -0800, walt wrote:
> > I'm so sorry Sarah, that was another mistake.  The mistake is so stupid I'm 
> > not
> > going to publish it here :(
> > 
> > Once I finally ran the kernel with debugging actually compiled in, dmesg 
> > contains
> > xhci debugging messages.  Wow :)
> > 
> > It's a big file so I zipped and attached it, which I hope is acceptable in 
> > lkml.
> 
> Yep, that's fine.  Sticking it in pastebin (or up on your server) is
> also fine, if it gets really big.
> 
> > BTW, this dmesg is from a kernel with sg_tablesize = 31, which as I said 
> > before
> > doesn't fix the problem.  The cp stopped around 7GB just as before.
> > 
> > Sorry for the noise...
> 
> No worries! :)  With the dmesg, I can finally see what happened:
> 
> [  188.703059] xhci_hcd 0000:03:00.0: Cancel URB 8800b7d2e0c0, dev 1, ep 
> 0x2, starting at offset 0xbb7b9000
> [  188.703072] xhci_hcd 0000:03:00.0: // Ding dong!
> [  193.711022] xhci_hcd 0000:03:00.0: xHCI host not responding to stop 
> endpoint command.
> [  193.711029] xhci_hcd 0000:03:00.0: Assuming host is dying, halting host.
> [  193.711046] xhci_hcd 0000:03:00.0: // Halt the HC
> [  193.711060] xhci_hcd 0000:03:00.0: Killing URBs for slot ID 1, ep index 0
> [  193.711066] xhci_hcd 0000:03:00.0: Killing URBs for slot ID 1, ep index 2
> [  193.711078] xhci_hcd 0000:03:00.0: Killing URBs for slot ID 1, ep index 3
> [  193.711096] xhci_hcd 0000:03:00.0: Calling usb_hc_died()
> [  193.711103] xhci_hcd 0000:03:00.0: HC died; cleaning up
> [  193.76] xhci_hcd 0000:03:00.0: xHCI host controller is dead.
> 
> It seems that the xHCI driver tried to stop the endpoint ring in order
> to cancel a SCSI transfer, and the driver never got a response for that.
> 
> The offset is rather suspicious (0xbb7b9000), and it probably means the
> driver attempted to cancel a transfer that had been moved to the
> beginning of the ring segment, with no-op TRBs before the link TRB.
> 
> I suspect David's patch triggers a bug in the command cancellation code.
> There's also the unlikely possibility that the no-op TRBs did indeed
> cause the host to hang.  Either way, I'll have to look into it.
> 
> I'll let you know when I have some diagnostic patches ready.

Hi Walt,

I have a couple of patches for you to test.  You can either apply the
attached three patches, or you can pull down a kernel with:

git clone git://git.kernel.org/pub/scm/linux/kernel/git/sarah/xhci.git -b 3.12-td-fragment-failure

Please only apply the first patch (which is diagnostic only), trigger
your issue, and send me the resulting dmesg.  Then try applying the
other two patches, and see if the issue goes away.  (I suspect it won't
but I can't be sure.)

Sarah Sharp
From 0261dcd2711c010d786dcd940803a44e1bc19512 Mon Sep 17 00:00:00 2001
From: Sarah Sharp <sarah.a.sh...@linux.intel.com>
Date: Mon, 6 Jan 2014 16:06:27 -0800
Subject: [PATCH 1/3] TD fragment debugging

Signed-off-by: Sarah Sharp <sarah.a.sh...@linux.intel.com>
---
 drivers/usb/host/xhci-ring.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/drivers/usb/host/xhci-ring.c b/drivers/usb/host/xhci-ring.c
index 55fc0c39b7e1..d05f61dc8359 100644
--- a/drivers/usb/host/xhci-ring.c
+++ b/drivers/usb/host/xhci-ring.c
@@ -982,6 +982,14 @@ void xhci_stop_endpoint_command_watchdog(unsigned long arg)
 		 * doesn't touch the memory.
 		 */
 	}
+
+	xhci_dbg(xhci, "Command ring:\n");
+	xhci_debug_ring(xhci, xhci->cmd_ring);
+	xhci_dbg_ring_ptrs(xhci, xhci->cmd_ring);
+	xhci_dbg(xhci, "Event ring:\n");
+	xhci_debug_ring(xhci, xhci->event_ring);
+	xhci_dbg_ring_ptrs(xhci, xhci->event_ring);
+
 	for (i = 0; i < MAX_HC_SLOTS; i++) {
 		if (!xhci->devs[i])
 			continue;
@@ -1003,6 +1011,12 @@ void xhci_stop_endpoint_command_watchdog(unsigned long arg)
 				xhci_giveback_urb_in_irq(xhci, cur_td,
 						-ESHUTDOWN, "killed");
 			}
+			if (!list_empty(&temp_ep->cancelled_td_list)) {
+				xhci_dbg(xhci, "Dev %i Ep 0x%x:\n", i,
+						xhci_get_endpoint_address(j));
+				xhci_debug_ring(xhci, ring);
+				xhci_dbg_ring_ptrs(xhci, ring);
+			}
 			while (!list_empty(&temp_ep->cancelled_td_list)) {
 				cur_td = list_first_entry(
 						&temp_ep->cancelled_td_list,
@@ -2966,6 +2980,10 @@ static int prepare_ring(struct xhci_hcd *xhci, struct xhci_ring *ep_ring,
 						num_trbs, TRBS_PER_SEGMENT - 1);
 				return -ENOMEM;
 			}
+			xhci_dbg(xhci, "Insert no-op TRBs at 0x%llx\n",
+					(unsigned long long)
+					xhci_trb_virt_to_dma(ep_ring->enq_seg,
+						ep_ring->enqueue));
 
 			nop_cmd = cpu_to_le32(TRB_TYPE(TRB_TR_NOOP) |
 					ep_ring->cycle_state);
-- 
1.8.3.3

From 380071d6fa2430c7141faefc8acfc0909c75a0ed Mon Sep 17 00:00:00 2001
From: Ben Hutchings <b...@decadent.org.uk>
Date: Mon, 6 Jan 2014 03:16:32 +0000
Subject: [PATCH 2/3] xhci: Avoid infinite loop when sg urb requires too many
 trbs

Currently prepare_ring() returns -ENOMEM if the urb won't fit into a
single ring segment.  usb_sg_wait() treats this error as a temporary
condition

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-03 Thread Sarah Sharp

On Fri, Jan 03, 2014 at 01:21:18PM -0800, walt wrote:
> I'm so sorry Sarah, that was another mistake.  The mistake is so stupid I'm
> not going to publish it here :(
> 
> Once I finally ran the kernel with debugging actually compiled in, dmesg
> contains xhci debugging messages.  Wow :)
> 
> It's a big file so I zipped and attached it, which I hope is acceptable in
> lkml.

Yep, that's fine.  Sticking it in pastebin (or up on your server) is
also fine, if it gets really big.

> BTW, this dmesg is from a kernel with sg_tablesize = 31, which as I said
> before doesn't fix the problem.  The cp stopped around 7GB just as before.
> 
> Sorry for the noise...

No worries! :)  With the dmesg, I can finally see what happened:

[  188.703059] xhci_hcd :03:00.0: Cancel URB 8800b7d2e0c0, dev 1, ep 0x2, starting at offset 0xbb7b9000
[  188.703072] xhci_hcd :03:00.0: // Ding dong!
[  193.711022] xhci_hcd :03:00.0: xHCI host not responding to stop endpoint command.
[  193.711029] xhci_hcd :03:00.0: Assuming host is dying, halting host.
[  193.711046] xhci_hcd :03:00.0: // Halt the HC
[  193.711060] xhci_hcd :03:00.0: Killing URBs for slot ID 1, ep index 0
[  193.711066] xhci_hcd :03:00.0: Killing URBs for slot ID 1, ep index 2
[  193.711078] xhci_hcd :03:00.0: Killing URBs for slot ID 1, ep index 3
[  193.711096] xhci_hcd :03:00.0: Calling usb_hc_died()
[  193.711103] xhci_hcd :03:00.0: HC died; cleaning up
[  193.76] xhci_hcd :03:00.0: xHCI host controller is dead.

It seems that the xHCI driver tried to stop the endpoint ring in order
to cancel a SCSI transfer, and the driver never got a response for that.

The offset is rather suspicious (0xbb7b9000), and it probably means the
driver attempted to cancel a transfer that had been moved to the
beginning of the ring segment, with no-op TRBs before the link TRB.
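
A rough way to check the "beginning of the ring segment" reading of that
offset is sketched below.  It assumes 64 TRBs of 16 bytes per segment and
segments aligned to their own size; those constants and the helper name
trb_index_in_segment() are illustrative assumptions, not taken from the
driver.

```c
#include <stdint.h>

/* Assumed values for illustration only; the driver's own definitions
 * may differ between kernel versions. */
#define SKETCH_TRB_SIZE          16u
#define SKETCH_TRBS_PER_SEGMENT  64u
#define SKETCH_SEGMENT_SIZE      (SKETCH_TRB_SIZE * SKETCH_TRBS_PER_SEGMENT)

/* Hypothetical helper: index of the TRB at DMA address 'dma' within its
 * segment, assuming segments are aligned to SKETCH_SEGMENT_SIZE. */
static unsigned int trb_index_in_segment(uint64_t dma)
{
	return (unsigned int)((dma % SKETCH_SEGMENT_SIZE) / SKETCH_TRB_SIZE);
}
```

Under those assumptions, 0xbb7b9000 maps to index 0, i.e. the first TRB of
a segment, which is where a transfer would land after being pushed past
the link TRB.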

I suspect David's patch triggers a bug in the command cancellation code.
There's also the unlikely possibility that the no-op TRBs did indeed
cause the host to hang.  Either way, I'll have to look into it.

I'll let you know when I have some diagnostic patches ready.

Sarah Sharp
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-03 Thread Sarah Sharp

On Fri, Jan 03, 2014 at 07:40:33AM -0800, walt wrote:
> On 01/02/2014 11:15 AM, Sarah Sharp wrote:
> > On Tue, Dec 31, 2013 at 12:40:16PM -0800, walt wrote:
> >> On 12/18/2013 01:11 PM, Greg Kroah-Hartman wrote:
> >>> 3.12-stable review patch.  If anyone has any objections, please let me
> >>> know.
> >>>
> >>> --
> >>>
> >>> From: David Laight <david.lai...@aculab.com>
> >>>
> >>> commit 35773dac5f862cb1c82ea151eba3e2f6de51ec3e upstream.
> >>>
> >>> Section 4.11.7.1 of rev 1.0 of the xhci specification states that a link
> >>> TRB can only occur at a boundary between underlying USB frames (512 bytes
> >>> for high speed devices).
> >>>
> >>> If this isn't done the USB frames aren't formatted correctly and, for
> >>> example, the USB3 ethernet ax88179_178a card will stop sending...
> >>
> >>
> >> Unfortunately this patch causes a regression when copying large files to my
> >> outboard USB3 drive. (Nothing at all to do with networking.)
> 
> > Do you have CONFIG_USB_DEBUG turned on for 3.13?  If so, you should see
> > dmesg output from this statement shortly before your drive fails:
> > 
> > 	if (num_trbs >= TRBS_PER_SEGMENT) {
> > 		xhci_err(xhci, "Too many fragments %d, max %d\n",
> > 				num_trbs, TRBS_PER_SEGMENT - 1);
> > 		return -ENOMEM;
> > 	}
> 
> Well, the answers depend on whether the usb3 drive uses logical volumes or not
> (lvm2), which I can't explain.  What I've described so far is with lvm2.
> 
> When using lvm2 on the usb3 drive, turning on USB_DEBUG has *no* effect -- the
> console prints two or three lines stating that the ext4 journal has quit and
> the drive is remounted ro.  That particular drive stays wedged until the next
> reboot, but no other ill effects to the system.

Odd. In 3.12 xHCI has dynamic debugging, and turning on CONFIG_USB_DEBUG
should turn on debugging by default, so it's confusing that you didn't
see any messages.

Can I see your .config from /boot/?  Also, did you try capturing dmesg
with `tail -f /var/log/kern.log` or just dmesg?  Perhaps you need to run
`sudo dmesg -n 7`?

> OTOH, when I put a disk with just an ordinary ext4 partition in the usb3 dock,
> (no logical volumes) the copy failure becomes catastrophic, with kernel panic
> messages, leaving the system unresponsive and needing a hard reset to recover.
> 
> I also tried your other suggestion:
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 4265b48..1a6a43d 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -4714,7 +4714,7 @@ int xhci_gen_setup(struct usb_hcd *hcd, xhci_get_quirks_t get_quirks)
> int retval;
>  
> /* Accept arbitrarily long scatter-gather lists */
> -   hcd->self.sg_tablesize = ~0;
> +   hcd->self.sg_tablesize = 31;
>  
> /* support to build packet from discontinuous buffers */
> hcd->self.no_sg_constraint = 1;
> 
> Sadly it didn't fix the problem.  Did I get the patch right?

Yes, you did.  So perhaps the patch triggers a different bug.  I can't
tell until I see xHCI debugging output.

> Thanks for your help, and I'm happy to try more ideas, as always.

Thanks for your patience. :)

Sarah Sharp

Re: [PATCH v1] xhci: Switch Intel Lynx Point ports to EHCI on shutdown

2014-01-02 Thread Sarah Sharp

On Sun, Dec 22, 2013 at 09:47:49AM +0200, Denis Turischev wrote:
> On 12/21/2013 01:45 AM, Sarah Sharp wrote:
> > On Fri, Dec 20, 2013 at 12:41:11PM +0200, Denis Turischev wrote:
> >>> Also, which kernel are you experiencing this issue on?  In 3.12, I
> >>> queued a separate patch to deal with spurious reboot issues on Lynx
> >>> Point:
> >>>
> >>> commit 638298dc66ea36623dbc2757a24fc2c4ab41b016
> >> Sorry, I indeed tested not on the latest kernel version, Ubuntu 3.13-rc3
> >> has this patch and it works for me.
> > 
> > What does "Ubuntu 3.13-rc3" mean?  Where did you get your kernel from?
> Latest Ubuntu development kernel based on mainline 3.13-rc3.
> http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.13-rc3-trusty/
> > 
> > Also, do you have an HP system, or is this a different vendor?
> No, it's not HP system, it's Compulab's IntensePC-2 with Phoenix BIOS.

Ok, that's a bit of an issue then.  Your system needs the quirk
introduced by commit 638298dc66ea36623dbc2757a24fc2c4ab41b016 "xhci: Fix
spurious wakeups after S5 on Haswell".  That went into 3.12-rc3.
However, in 3.13-rc6, commit 6962d914f317b119e0db7189199b21ec77a4b3e0
"xhci: Limit the spurious wakeup fix only to HP machines" limited the
quirk to only HP systems.

That means your system worked fine in 3.13-rc3 (when the quirk was
applied broadly), but won't work for 3.13-rc6 (when the quirk was
narrowed to HP machines).  So we need the quirk to apply to your systems
as well.
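
The difference between the two commits can be sketched in a few lines of
C.  This is a simplified illustration, not the driver code: the struct and
function names are hypothetical, and only the two PCI vendor IDs are real
values.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_VENDOR_INTEL 0x8086u
#define SKETCH_VENDOR_HP    0x103cu

/* Hypothetical, simplified view of a PCI device for this sketch. */
struct sketch_pci_id {
	uint16_t vendor;
	uint16_t subsystem_vendor;
	bool is_lynx_point_xhci;
};

/* Broad rule (first commit): any Intel Lynx Point xHCI host. */
static bool quirk_broad(const struct sketch_pci_id *id)
{
	return id->vendor == SKETCH_VENDOR_INTEL && id->is_lynx_point_xhci;
}

/* Narrowed rule (second commit): also require an HP subsystem vendor. */
static bool quirk_hp_only(const struct sketch_pci_id *id)
{
	return quirk_broad(id) && id->subsystem_vendor == SKETCH_VENDOR_HP;
}
```

A non-HP Lynx Point board matches quirk_broad() but not quirk_hp_only(),
which is the situation described above.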

ISTR that the other folks on Cc (Meng, Niklas, Giorgos, and Art) all had
systems that broke when commit 638298dc66ea36623dbc2757a24fc2c4ab41b016
was introduced.  For those systems, what vendor was the system, and what
BIOS was it running?

Takashi, did the HP systems that needed the quirk have a Phoenix BIOS?

Denis, do all of Compulab's Haswell systems reboot on shutdown?  Are
they all running a Phoenix BIOS?  Can you send me the output of `sudo
lspci -vvv -s` for the xHCI host?

Basically, I'm trying to find a common variable to key off.  I suspect
BIOS vendor is probably the right thing, instead of system vendor.

Sarah Sharp

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-02 Thread Sarah Sharp

On Thu, Jan 02, 2014 at 04:01:29PM -0500, Mark Lord wrote:
> On 14-01-02 02:15 PM, Sarah Sharp wrote:
> > On Tue, Dec 31, 2013 at 12:40:16PM -0800, walt wrote:
> ..
> >> Unfortunately this patch causes a regression when copying large files to my
> >> outboard USB3 drive. (Nothing at all to do with networking.)
> >>
> >> When I try to copy a large (20GB) file to the USB3 drive, the copy dies
> >> after about 7GB, the ext4 journal aborts and the drive is remounted
> >> read-only.
> >>
> >> This bug is 100% reproducible (always pretty close to 7GB) and reverting
> >> this patch completely fixes the problem.
> > 
> > Ok, I had feared that would be a consequence of this patch.  I think the
> > problem is that the usb-storage driver submitted an URB with more
> > scatter-gather entries than would fit on the ring segment, the xHCI
> > driver rejected the URB with -ENOMEM, and the SCSI core eventually gave
> > up on the SCSI command.
> 
> Is there not a block layer / scheduler tunable for max sg entries or 
> something?

There is a USB host controller tunable for max number of sg entries
that's passed up to the SCSI or block layer.  We discussed changing it,
but there wasn't a good consensus on what to change it to:

http://marc.info/?l=linux-usb&m=138496358223904&w=2
http://marc.info/?l=linux-netdev&m=138496706007262&w=2

In the end, we thought we didn't need to limit the sglist size because
Paul thought usb-storage limits the overall transfer size to 120K, which
should fit in 31 TRBs:

http://marc.info/?l=linux-usb&m=138498190419312&w=2

Walt, could you see if limiting the sglist size helps, by setting
hcd->self.sg_tablesize in xhci.c to 31?
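
One plausible reading of the "120K fits in 31 TRBs" arithmetic is sketched
below.  It assumes 4 KiB pages and one TRB per page touched; both are
assumptions made for illustration, and pages_spanned() is a hypothetical
helper, not a kernel function.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size */

/* Number of page-sized pieces needed to cover 'len' bytes that start at
 * byte offset 'offset' within the first page. */
static size_t pages_spanned(size_t offset, size_t len)
{
	if (len == 0)
		return 0;
	offset %= SKETCH_PAGE_SIZE;
	return (offset + len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

A page-aligned 120K buffer touches 30 pages, and an unaligned one touches
31, which matches the limit suggested above.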

Sarah Sharp

Re: [PATCH 3.12 033/118] usb: xhci: Link TRB must not occur within a USB payload burst

2014-01-02 Thread Sarah Sharp

On Tue, Dec 31, 2013 at 12:40:16PM -0800, walt wrote:
> On 12/18/2013 01:11 PM, Greg Kroah-Hartman wrote:
> > 3.12-stable review patch.  If anyone has any objections, please let me know.
> > 
> > --
> > 
> > From: David Laight 
> > 
> > commit 35773dac5f862cb1c82ea151eba3e2f6de51ec3e upstream.
> > 
> > Section 4.11.7.1 of rev 1.0 of the xhci specification states that a link TRB
> > can only occur at a boundary between underlying USB frames (512 bytes for
> > high speed devices).
> > 
> > If this isn't done the USB frames aren't formatted correctly and, for
> > example, the USB3 ethernet ax88179_178a card will stop sending...
> 
> 
> Unfortunately this patch causes a regression when copying large files to my
> outboard USB3 drive. (Nothing at all to do with networking.)
> 
> When I try to copy a large (20GB) file to the USB3 drive, the copy dies after
> about 7GB, the ext4 journal aborts and the drive is remounted read-only.
> 
> This bug is 100% reproducible (always pretty close to 7GB) and reverting this
> patch completely fixes the problem.

Ok, I had feared that would be a consequence of this patch.  I think the
problem is that the usb-storage driver submitted an URB with more
scatter-gather entries than would fit on the ring segment, the xHCI
driver rejected the URB with -ENOMEM, and the SCSI core eventually gave
up on the SCSI command.

Do you have CONFIG_USB_DEBUG turned on for 3.13?  If so, you should see
dmesg output from this statement shortly before your drive fails:

	if (num_trbs >= TRBS_PER_SEGMENT) {
		xhci_err(xhci, "Too many fragments %d, max %d\n",
				num_trbs, TRBS_PER_SEGMENT - 1);
		return -ENOMEM;
	}
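
A user-space stand-in for that check is sketched below, to make the
boundary explicit.  The segment size of 64 TRBs is an assumed value and
sketch_check_fragments() is a hypothetical name; the real limit comes from
the driver's own TRBS_PER_SEGMENT.

```c
#include <errno.h>

#define SKETCH_TRBS_PER_SEGMENT 64  /* assumed; see the driver for the real value */

/* Hypothetical stand-in for the quoted check: reject a transfer that
 * needs a whole segment's worth of TRBs or more. */
static int sketch_check_fragments(int num_trbs)
{
	if (num_trbs >= SKETCH_TRBS_PER_SEGMENT)
		return -ENOMEM;
	return 0;
}
```

With this assumed size, a transfer needing 63 TRBs is accepted and one
needing 64 is rejected with -ENOMEM.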

> (Note to Sarah:  I recently emailed you about this problem, and I *wrongly*
> said that reverting the patch doesn't help.  That was a mistake, sorry.)
> 
> I'm happy to try any debugging suggestions/tricks.

Unfortunately a real fix for this is going to take a bit.  I have a
couple different solutions to the bug the patch solved, but they're much
more invasive than the original patch and will take a couple weeks to
implement and thoroughly test.

If David's patch is just reverted, USB ethernet on 3.12 and later breaks
under xHCI.  The networking folks added scatter-gather support in 3.12.
Those patches could be reverted, but I suspect David Miller will not be
happy with that solution, since the real problem is the xHCI driver
itself, and EHCI scatter-gather works fine.

I think the short term solution is to simply turn off scatter-gather
altogether under xHCI until this gets fixed.  It could mean a big
performance hit for USB storage devices, but that means we get correct
behavior for both USB ethernet and USB storage.

> BTW, please tell me if I've cc'd too many people.

Nope, you're fine.  I've Cc'ed the USB and SCSI mailing lists as well.

Sarah Sharp

Re: [PATCH v1] xhci: Switch Intel Lynx Point ports to EHCI on shutdown

2013-12-20 Thread Sarah Sharp

On Fri, Dec 20, 2013 at 12:41:11PM +0200, Denis Turischev wrote:
> > Also, which kernel are you experiencing this issue on?  In 3.12, I
> > queued a separate patch to deal with spurious reboot issues on Lynx
> > Point:
> > 
> > commit 638298dc66ea36623dbc2757a24fc2c4ab41b016
> Sorry, I indeed tested not on the latest kernel version, Ubuntu 3.13-rc3 has
> this patch and it works for me.

What does "Ubuntu 3.13-rc3" mean?  Where did you get your kernel from?

Also, do you have an HP system, or is this a different vendor?

Sarah Sharp

Re: [PATCH v1] xhci: Switch Intel Lynx Point ports to EHCI on shutdown

2013-12-19 Thread Sarah Sharp

This is actually v2.

On Thu, Dec 19, 2013 at 07:07:33PM +0200, Denis Turischev wrote:
> The same issue like with Panther Point chipsets. If the USB ports are
> switched to xHCI on shutdown, the xHCI host will send a spurious interrupt,
> which will wake the system. Some BIOS have work around for this, but not all.
> 
> The bug can be avoided if the USB ports are switched back to EHCI on
> shutdown.
> 
> v1: add new device id locally, not in <linux/pci_ids.h>

This line shouldn't go in the patch description.

> Signed-off-by: Denis Turischev <de...@compulab.co.il>

Instead, it should go after the --- line, which should be here, but
isn't.  How did you generate this patch?  `git format-patch` is
recommended, or `git send-email`.

Also, which kernel are you experiencing this issue on?  In 3.12, I
queued a separate patch to deal with spurious reboot issues on Lynx
Point:

commit 638298dc66ea36623dbc2757a24fc2c4ab41b016
Author: Takashi Iwai 
Date:   Thu Sep 12 08:11:06 2013 +0200

xhci: Fix spurious wakeups after S5 on Haswell

Haswell LynxPoint and LynxPoint-LP with the recent Intel BIOS show
mysterious wakeups after shutdown occasionally.  After discussing with
BIOS engineers, they explained that the new BIOS expects that the
wakeup sources are cleared and set to D3 for all wakeup devices when
the system is going to sleep or power off, but the current xhci driver
doesn't do this properly (partly intentionally).

This patch introduces a new quirk, XHCI_SPURIOUS_WAKEUP, for
fixing the spurious wakeups at S5 by calling xhci_reset() in the xhci
shutdown ops as done in xhci_stop(), and setting the device to PCI D3
at shutdown and remove ops.

The PCI D3 call is based on the initial fix patch by Oliver Neukum.

[Note: Sarah changed the quirk name from XHCI_HSW_SPURIOUS_WAKEUP to
XHCI_SPURIOUS_WAKEUP, since none of the other quirks have system names
in them.  Sarah also fixed a collision with a quirk submitted around the
same time, by changing the xhci->quirks bit from 17 to 18.]

This patch should be backported to kernels as old as 3.0, that
contain the commit 1c12443ab8eba71a658fae4572147e56d1f84f66 "xhci: Add
Lynx Point to list of Intel switchable hosts."

Cc: Oliver Neukum 
Signed-off-by: Takashi Iwai 
Signed-off-by: Sarah Sharp 
Cc: sta...@vger.kernel.org

This patch is in 3.12, but a patch to narrow the quirk to only apply to HP
systems will hit 3.13 shortly:

commit 6962d914f317b119e0db7189199b21ec77a4b3e0
Author: Takashi Iwai 
Date:   Mon Dec 9 14:53:36 2013 +0100

xhci: Limit the spurious wakeup fix only to HP machines

We've got regression reports that my previous fix for spurious wakeups
after S5 on HP Haswell machines leads to the automatic reboot at
shutdown on some machines.  It turned out that the fix for one side
triggers another BIOS bug in other side.  So, it's exclusive.

Since the original S5 wakeups have been confirmed only on HP machines,
it'd be safer to apply it only to limited machines.  As a wild guess,
limiting to machines with HP PCI SSID should suffice.

This patch should be backported to kernels as old as 3.12, that
contain the commit 638298dc66ea36623dbc2757a24fc2c4ab41b016 "xhci: Fix
spurious wakeups after S5 on Haswell".

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=66171
Cc: sta...@vger.kernel.org
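Signed-off-by: Takashi Iwai <ti...@suse.de>
Signed-off-by: Sarah Sharp <sarah.a.sh...@linux.intel.com>
Tested-by: <dashing.m...@gmail.com>
Reported-by: Niklas Schnelle <nik...@komani.de>
Reported-by: Giorgos <ganastasio...@gmail.com>
Reported-by: <a...@vhex.net>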
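
The narrowing described in that commit message can be sketched in plain
userspace C.  This is only an illustration, not the code from either commit:
`struct pci_ids` and `spurious_wakeup_quirk()` are hypothetical names, the
device ID is the one defined in the patch quoted further down, bit 18 comes
from the note in the first commit message, and 0x8086 / 0x103c are the usual
PCI vendor IDs for Intel and HP.

```c
#include <stdint.h>

/* Standard PCI vendor IDs for Intel and HP. */
#define PCI_VENDOR_ID_INTEL                 0x8086
#define PCI_VENDOR_ID_HP                    0x103c
/* Value taken from the patch quoted in this thread. */
#define PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI  0x9c31

/* Quirk bit 18, as described in the commit note (moved from bit 17). */
#define XHCI_SPURIOUS_WAKEUP                (1u << 18)

/* Hypothetical stand-in for the fields of struct pci_dev used here. */
struct pci_ids {
	uint16_t vendor;
	uint16_t device;
	uint16_t subsystem_vendor;
};

/*
 * Return the quirk bit only for an Intel Lynx Point xHCI controller
 * that sits in a machine with an HP subsystem vendor ID; return 0
 * for everything else.
 */
static uint32_t spurious_wakeup_quirk(const struct pci_ids *id)
{
	if (id->vendor != PCI_VENDOR_ID_INTEL)
		return 0;
	if (id->device != PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI)
		return 0;
	if (id->subsystem_vendor != PCI_VENDOR_ID_HP)
		return 0;
	return XHCI_SPURIOUS_WAKEUP;
}
```

A caller would OR the result into its quirk mask, so the same Intel
controller in a non-HP machine keeps the pre-3.12 shutdown behaviour.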

So, do you experience the spurious reboots on 3.11?  Do they go away in
3.12 with the first patch applied?

Sarah Sharp

> 
> diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
> --- a/drivers/usb/host/xhci-pci.c 2013-12-19 11:36:12.049589400 +0200
> +++ b/drivers/usb/host/xhci-pci.c 2013-12-19 11:37:27.261590385 +0200
> @@ -34,6 +34,8 @@
>  #define PCI_VENDOR_ID_ETRON  0x1b6f
>  #define PCI_DEVICE_ID_ASROCK_P67 0x7023
> 
> +#define PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI   0x9c31
> +
>  static const char hcd_name[] = "xhci_hcd";
> 
>  /* called after powerup, by probe or system-pm "wakeup" */
> @@ -91,8 +93,9 @@
>   xhci->quirks |= XHCI_LPM_SUPPORT;
>   xhci->quirks |= XHCI_INTEL_HOST;
>   }
> - if (pdev->vendor == PCI_VENDOR_ID_INTEL &&
> - pdev->device == PCI_DEVICE_ID_INTEL_PANTHERPOINT_XHCI) {
> + if (pdev->vendor == PCI_VENDOR_ID_INTEL && (
> + (pdev->device == PCI_DEVICE_ID_INTEL_PANTHERPOINT_XHCI) ||
> + (pdev->device == PCI_DEVICE_ID_INTEL_LYNXPOINT_XHCI))) {
>   xhci->quirks |= XHCI_EP_LIMIT_QUIRK;
>   xhci->limit_active_eps = 64

Re: [PATCH] USB: core: Add warm reset while reset-resuming SuperSpeed HUBs

2013-12-13 Thread Sarah Sharp

On Thu, Dec 12, 2013 at 11:05:04AM -0500, Alan Stern wrote:
> On Wed, 11 Dec 2013, Julius Werner wrote:
> 
> > >> ...although, the spec says that it does not wait for the port resets
> > >> to complete.  As far as I can see re-issuing a warm reset and waiting
> > >> is the only way to guarantee the core times the recovery.  Presumably
> > >> the portstatus debounce in hub_activate() mitigates this, but that
> > >> 100ms is less than a full reset timeout.
> > 
> > It's definitely not just a timing issue for us. I can't reproduce all
> > the same cases as Vikas, but when I attach a USB analyzer to the ones
> > I do see the host controller doesn't even start sending a reset.
> > 
> > >>> The xHCI spec requires that when the xHCI host is reset, a USB reset is
> > >>> driven down the USB 3.0 ports.  If hot reset fails, the port may migrate
> > >>> to warm reset.  See table 32 in the xHCI spec, in the definition of
> > >>> HCRST.  It sounds like this host doesn't drive a USB reset down USB 3.0
> > >>> ports at all on host controller reset?
> > 
> > Oh, interesting, I hadn't seen that yet. So I guess the spec itself is
> > fine if it were followed to the letter.
> > 
> > I did some more tests about this on my Exynos machine: when I put a
> > device to autosuspend (U3) and manually poke the xHC reset bit, I do
> > see an automatic warm reset on the analyzer and the ports manage to
> > retrain to U0. But after a system suspend/resume which calls
> > xhci_reset() in the process, there is no reset on the wire. I also
> > noticed that it doesn't drive a reset (even after manual poking) when
> > there is no device connected on the other end of the analyzer.
> > 
> > So this might be our problem: maybe these host controllers (Synopsys
> > DesignWare) issue the spec-mandated warm reset only on ports where
> > they think there is a device attached. But after a system
> > suspend/resume (where the whole IP block on the SoC was powered down),
> > the host controller cannot know that there is still a device with an
> > active power session attached, and therefore doesn't drive the reset
> > on its own.

Ok, that makes some sense.  I could see why host controllers wouldn't
want to drive reset on an unconnected port.

> > Even though this is a host controller bug, we still have to deal with
> > it somehow. I guess we could move the code into xhci_plat_resume() and
> > hide it behind a quirk to lessen the impact. But since reset_resume is
> > not a common case for most host controllers, it's hard to say if this
> > is DesignWare specific or a more widespread implementation mistake.
> 
> I was going to suggest something along these lines too.  This seems to 
> be a bug in xHCI.  Therefore the fix belongs in xhci-hcd, not in the 
> hub driver.

I agree.  Is there a chance that the Synopsys DesignWare will be a PCI
device instead of a platform device?  If so, it would be better to put
the code into xhci_resume instead of xhci_plat_resume.  That also allows
you to only issue the warm reset when the register restore state command
fails, after the xhci_reset call.

Also, I assume that other systems with the Synopsys DesignWare IP will
experience this issue?  I know of at least two other chipsets that will
include that IP, and it would be good to find a way to trigger on the
Synopsys IP, rather than off xHCI PCI vendor and device ID.  Otherwise
we'll be adding PCI IDs to the xHCI driver quirks for many many kernels
to come.

I'm actually leaning towards enabling the check for warm reset broadly.
It seems like it wouldn't hurt to issue a warm reset on the USB 3.0
ports if they're in compliance, poll, or rx.detect.  So, let's enable
this broadly in xhci_resume, mark the patch for stable, but ask for the
backport to be delayed until 3.13.3 is out, to allow for more testing.
If anyone complains of xHCI behavior changes, we'll change the code to
add a quirk.
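
A rough model of that proposal, as a userspace C sketch rather than driver
code: the enum, `wants_warm_reset()` and `plan_warm_resets()` are all
hypothetical names, and the link states are a simplified list, not the
kernel's encoding.  The assumption is that warm resets are only considered
when the register restore failed and the controller had to be reset, and
that only ports in compliance, polling, or rx.detect are picked.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical link states (not the kernel's bit encoding). */
enum link_state {
	LS_U0,
	LS_U3,
	LS_RX_DETECT,
	LS_POLLING,
	LS_RECOVERY,
	LS_COMPLIANCE,
	LS_INACTIVE
};

/* The three states named in the proposal above. */
static bool wants_warm_reset(enum link_state ls)
{
	return ls == LS_COMPLIANCE || ls == LS_POLLING || ls == LS_RX_DETECT;
}

/*
 * Mark which ports would get a warm reset on resume.  Nothing is
 * marked unless the register restore failed (i.e. the controller was
 * reset).  Returns the number of ports marked; flagged[i] tells
 * whether port i was selected.
 */
static size_t plan_warm_resets(bool restore_failed,
			       const enum link_state *ports, size_t n,
			       bool *flagged)
{
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		flagged[i] = restore_failed && wants_warm_reset(ports[i]);
		if (flagged[i])
			count++;
	}
	return count;
}
```

If behavior changes are reported later, the `restore_failed` test is the
natural place to add a per-host quirk check.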

Sarah Sharp

Re: [PATCH] USB: core: Add warm reset while reset-resuming SuperSpeed HUBs

2013-12-11 Thread Sarah Sharp

On Wed, Dec 11, 2013 at 11:00:13AM -0800, Julius Werner wrote:
> > I don't know what you mean by "fails".  The system goes to sleep and
> > then later on wakes up, doesn't it?
> >
> > Do you mean that the Jetflash device gets disconnected when the system
> > wakes up?  That's _supposed_ to happen under those circumstances.
> > When hub_activate() sees HUB_RESET_RESUME, all child devices get
> > disconnected except those where udev->persist_enabled is set.
> 
> This patch was written in response to the same bug as my "usb: hub:
> Use correct reset for wedged USB3 devices that are NOTATTACHED"
> submission. My patch only helps when the port gets stuck in Compliance
> Mode, but Vikas reports that he can sometimes see it stuck in Polling
> or Recovery states as well.
> 
> The underlying issue is a deadlock in the USB 3.0 link training state
> machine when the host controller is unilaterally reset on resume
> (without driving a reset on the bus).

The xHCI spec requires that when the xHCI host is reset, a USB reset is
driven down the USB 3.0 ports.  If hot reset fails, the port may migrate
to warm reset.  See table 32 in the xHCI spec, in the definition of
HCRST.  It sounds like this host doesn't drive a USB reset down USB 3.0
ports at all on host controller reset?

> The host port starts out back in
> Rx.Detect without remembering anything about its previous state, but
> the device is still in U3. The host detects Rx terminations, moves to
> Polling and starts sending LFPS link training packets, but the device
> doesn't expect those and interprets them as link problems (moving to
> Recovery). What happens next seems to be device specific, but
> apparently the device can end up in SS.Inactive while the host port
> gets stuck in Polling or Recovery (or some kind of livelock between
> those).
> 
> This patch tries to warm reset all USB 3.0 ports on reset-resume
> (after xhci_reset() was called) that had devices connected to them
> before suspend. This seems to be the only way to ensure the devices'
> state machines get back to a well-defined state that the host can work
> with. I don't think this is a specific hardware bug, it's just an
> unfortunate design flaw that the USB 3.0 spec doesn't account for a
> root hub port being reset independently of its connected device. I
> think Sarah is correct that it could be limited to root hubs, though.

Sarah Sharp

Re: [PATCH] USB: core: Add warm reset while reset-resuming SuperSpeed HUBs

2013-12-09 Thread Sarah Sharp

On Mon, Dec 09, 2013 at 10:24:52AM -0500, Alan Stern wrote:
> On Mon, 9 Dec 2013, Vikas Sajjan wrote:
> 
> > Does warm reset while activating SuperSpeed HUBs if the hub activate type
> > is HUB_RESET_RESUME.
> > 
> > When we do Suspend-to-RAM with (any one of the 16, 32, 64 Jetflash) Transcend
> > USB 3.0 device connected on 3.0 port, during resume I noticed that the
> > XHCI controller has sometimes moved to the RECOVERY, POLLING or INACTIVE state.
> > This behaviour is inconsistent and the connection with the connected USB 3.0
> > device on 3.0 port was LOST.

Does the device eventually re-connect on the USB port?  Or is warm reset
necessary to make the device connect?

Does the xHCI register restore complete after resume from S3, or is
power lost?  I'm trying to figure out whether xhci_reset is called
before your issue is triggered.

> > Doing warm reset while activating SuperSpeed HUBs if the hub
> > activate type is HUB_RESET_RESUME, gets the connected device to the stable
> > state.
> > 
> > Reviewed at https://chromium-review.googlesource.com/#/c/177132/
> > 
> > Tested on exynos5420 and exynos5250 with Transcend Jetflash USB 3.0 device
> > (8564:1000)

Is this issue specific to the particular USB device manufacturer
(Transcend)?  Does the same device lose connection on resume from S3
with other host controller vendors?  Have you seen this issue when the
USB 3.0 device is behind a USB 3.0 hub?

I ask because this sounds like a low-level link training issue that's
specific to the exynos host or USB device.  I would rather track down
which hardware is to blame than generically add a warm reset for all USB
3.0 devices.

> > rebased on Greg Kroah-Hartman's usb-next
> > git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
> > 
> > Signed-off-by: Vikas Sajjan <vikas.saj...@samsung.com>
> > ---
> >  drivers/usb/core/hub.c |   41 +
> >  1 file changed, 25 insertions(+), 16 deletions(-)
> > 
> > diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> > index a7c04e2..d8432b0 100644
> > --- a/drivers/usb/core/hub.c
> > +++ b/drivers/usb/core/hub.c
> 
> > @@ -1093,6 +1108,16 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
> > u16 portstatus, portchange;
> >  
> > portstatus = portchange = 0;
> > +
> > +   /* Some connected devices might be still in unknown state even
> > +* after reset-resume, a WARM_RESET gets the connected device
> > +* to the normal state.
> > +*/
> > +   if (udev && hub_is_superspeed(hub->hdev) &&
> > +   type == HUB_RESET_RESUME)
> > +   hub_port_reset(hub, port1, NULL,
> > +   HUB_BH_RESET_TIME, true);
> 
> Please don't do this all the time to every attached port.  Do it only 
> when it is really needed.

Agreed.  Can we at least limit the warm reset to devices directly
attached to roothubs?  You can also change this code to get the port
status and only do the warm reset if the port link state is
USB_SS_PORT_LS_POLLING, USB_SS_PORT_LS_RX_DETECT, or
USB_SS_PORT_LS_SS_INACTIVE.
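
Something along these lines, as a standalone sketch and not a patch against
hub.c: `port_needs_warm_reset()` and its `is_root_hub` parameter are
hypothetical, and the constants are copied from memory of
include/uapi/linux/usb/ch11.h, so please check them against the header
before relying on the values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Link-state field of the SuperSpeed port status word (from ch11.h;
 * values reproduced from memory, verify against the header). */
#define USB_PORT_STAT_LINK_STATE    0x01e0
#define USB_SS_PORT_LS_RX_DETECT    0x00a0
#define USB_SS_PORT_LS_SS_INACTIVE  0x00c0
#define USB_SS_PORT_LS_POLLING      0x00e0

/*
 * Decide whether a port should get a warm reset on reset-resume:
 * only ports on a root hub, and only when the link state read from
 * the port status is Polling, Rx.Detect, or SS.Inactive.
 */
static bool port_needs_warm_reset(bool is_root_hub, uint16_t portstatus)
{
	uint16_t link_state = portstatus & USB_PORT_STAT_LINK_STATE;

	if (!is_root_hub)
		return false;

	return link_state == USB_SS_PORT_LS_POLLING ||
	       link_state == USB_SS_PORT_LS_RX_DETECT ||
	       link_state == USB_SS_PORT_LS_SS_INACTIVE;
}
```

With a check like this, a port that resumed cleanly to U0 is left alone, and
devices behind external hubs are never touched.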

> Shouldn't you pass udev as the third argument?  If not, please explain
> why not.
> 
> Finally, I don't see why you put this in hub_activate().  Isn't it more 
> closely connected with the reset-resume procedure for the child device?

Sarah Sharp

Re: [PATCH] usb: hub: Use correct reset for wedged USB3 devices that are NOTATTACHED

2013-12-04 Thread Sarah Sharp


On Tue, Nov 19, 2013 at 02:53:22PM +, Cortes, Alexis wrote:
> Hi Sarah,
> 
> Sorry for my delayed response, I just saw your e-mail (it got filtered 
> somehow). About your question: actually I'm not sure, I'll have to check that 
> to confirm it. I'll get back to you with an answer as soon as I have it.

Ping, Alexis: any info on this question?

Sarah Sharp

> -----Original Message-----
> From: Sarah Sharp [mailto:sarah.a.sh...@linux.intel.com] 
> Sent: Thursday, November 14, 2013 5:31 PM
> To: Alan Stern
> Cc: Julius Werner; Greg Kroah-Hartman; LKML; linux-...@vger.kernel.org; 
> Benson Leung; Vincent Palatin; Cortes, Alexis
> Subject: Re: [PATCH] usb: hub: Use correct reset for wedged USB3 devices that 
> are NOTATTACHED
> 
> On Thu, Nov 07, 2013 at 10:32:33AM -0500, Alan Stern wrote:
> > On Wed, 6 Nov 2013, Julius Werner wrote:
> > 
> > > > What if the device is in USB_STATE_SUSPENDED?
> > > 
> > > I'm not sure that is possible at that point in hub_events(), I don't 
> > > know of a way that could lead to this situation. I could still add 
> > > the check just to be sure if you want it, though.
> > 
> > I don't know either.  But Sarah has said that ports can spontaneously 
> > go into Compliance Mode for no apparent reason.  If that can happen, 
> > maybe it can happen while the port is in U3 and the device is 
> > suspended.  In such cases, though, you'd need to do a reset-resume 
> > rather than a simple reset.
> 
> Looking at commits c3897aa5386faba77e5bbdf94902a1658d3a5b11 and 
> 71c731a296f1b08a3724bd1b514b64f1bda87a23, it seems that the TI host 
> controllers' ports can go into compliance mode only when a device is 
> inserted.  Once the device is link trained by the redriver, the port 
> shouldn't go into compliance mode.  So we should never see compliance mode on 
> a port with an attached USB device in suspend.
> 
> Alex, can you confirm that the TI host's port won't go into compliance mode 
> while a connected device is suspended?
> 
> > > > Not at all.  If a device is unplugged, its state changes to 
> > > > NOTATTACHED before the driver is unbound.  During that time, the 
> > > > driver will see all its URBs failing, so it may very well try to reset 
> > > > the device.
> > > > (For example, usbhid behaves like this.)  That isn't a bug.
> > > 
> > > Oh, okay, I wasn't quite sure how that plays together. Would you 
> > > think it's still valuable to print it out (maybe as dev_info() 
> > > instead of
> > > dev_warn()) instead of just silently ignoring the reset request? It 
> > > would have certainly been useful for me to find this problem faster, 
> > > but I can take it out again if you think it would result in too much 
> > > noise.
> > 
> > I think keeping dev_dbg() is best.  If you're searching for the 
> > solution to a problem, you should have debugging enabled and so you 
> > ought to see the message.
> > 
> > Alan Stern
> > 

Re: [PATCH] usb: hub: Use correct reset for wedged USB3 devices that are NOTATTACHED

2013-12-04 Thread Sarah Sharp


On Tue, Nov 19, 2013 at 02:53:22PM +, Cortes, Alexis wrote:
 Hi Sarah,
 
 Sorry for my delayed response, I just saw your e-mail (it got filtered 
 somehow). About your question: actually I'm not sure, I'll have to check that 
 to confirm it. I'll get back to you with an answer as soon as I have it.

Ping, Alexis: any info on this question?

Sarah Sharp

> -----Original Message-----
> From: Sarah Sharp [mailto:sarah.a.sh...@linux.intel.com]
> Sent: Thursday, November 14, 2013 5:31 PM
> To: Alan Stern
> Cc: Julius Werner; Greg Kroah-Hartman; LKML; linux-...@vger.kernel.org;
> Benson Leung; Vincent Palatin; Cortes, Alexis
> Subject: Re: [PATCH] usb: hub: Use correct reset for wedged USB3 devices that
> are NOTATTACHED
> 
> On Thu, Nov 07, 2013 at 10:32:33AM -0500, Alan Stern wrote:
> > On Wed, 6 Nov 2013, Julius Werner wrote:
> > 
> > > > What if the device is in USB_STATE_SUSPENDED?
> > > 
> > > I'm not sure that is possible at that point in hub_events(), I don't
> > > know of a way that could lead to this situation. I could still add
> > > the check just to be sure if you want it, though.
> > 
> > I don't know either.  But Sarah has said that ports can spontaneously
> > go into Compliance Mode for no apparent reason.  If that can happen,
> > maybe it can happen while the port is in U3 and the device is
> > suspended.  In such cases, though, you'd need to do a reset-resume
> > rather than a simple reset.
> 
> Looking at commits c3897aa5386faba77e5bbdf94902a1658d3a5b11 and
> 71c731a296f1b08a3724bd1b514b64f1bda87a23, it seems that the TI host
> controllers' ports can go into compliance mode only when a device is
> inserted.  Once the device is link trained by the redriver, the port
> shouldn't go into compliance mode.  So we should never see compliance mode on
> a port with an attached USB device in suspend.
> 
> Alex, can you confirm that the TI host's port won't go into compliance mode
> while a connected device is suspended?
> 
> > > > Not at all.  If a device is unplugged, its state changes to
> > > > NOTATTACHED before the driver is unbound.  During that time, the
> > > > driver will see all its URBs failing, so it may very well try to reset
> > > > the device.  (For example, usbhid behaves like this.)  That isn't a bug.
> > > 
> > > Oh, okay, I wasn't quite sure how that plays together. Would you
> > > think it's still valuable to print it out (maybe as dev_info()
> > > instead of dev_warn()) instead of just silently ignoring the reset
> > > request? It would have certainly been useful for me to find this
> > > problem faster, but I can take it out again if you think it would
> > > result in too much noise.
> > 
> > I think keeping dev_dbg() is best.  If you're searching for the
> > solution to a problem, you should have debugging enabled and so you
> > ought to see the message.
> > 
> > Alan Stern
> > 

Re: [PATCH] usb: hub: Use correct reset for wedged USB3 devices that are NOTATTACHED

2013-11-14 Thread Sarah Sharp

On Thu, Nov 07, 2013 at 10:32:33AM -0500, Alan Stern wrote:
> On Wed, 6 Nov 2013, Julius Werner wrote:
> 
> > > What if the device is in USB_STATE_SUSPENDED?
> > 
> > I'm not sure that is possible at that point in hub_events(), I don't
> > know of a way that could lead to this situation. I could still add the
> > check just to be sure if you want it, though.
> 
> I don't know either.  But Sarah has said that ports can spontaneously
> go into Compliance Mode for no apparent reason.  If that can happen,
> maybe it can happen while the port is in U3 and the device is
> suspended.  In such cases, though, you'd need to do a reset-resume
> rather than a simple reset.

Looking at commits c3897aa5386faba77e5bbdf94902a1658d3a5b11 and
71c731a296f1b08a3724bd1b514b64f1bda87a23, it seems that the TI host
controllers' ports can go into compliance mode only when a device is
inserted.  Once the device is link trained by the redriver, the port
shouldn't go into compliance mode.  So we should never see compliance
mode on a port with an attached USB device in suspend.

Alex, can you confirm that the TI host's port won't go into compliance
mode while a connected device is suspended?

> > > Not at all.  If a device is unplugged, its state changes to NOTATTACHED
> > > before the driver is unbound.  During that time, the driver will see
> > > all its URBs failing, so it may very well try to reset the device.
> > > (For example, usbhid behaves like this.)  That isn't a bug.
> > 
> > Oh, okay, I wasn't quite sure how that plays together. Would you think
> > it's still valuable to print it out (maybe as dev_info() instead of
> > dev_warn()) instead of just silently ignoring the reset request? It
> > would have certainly been useful for me to find this problem faster,
> > but I can take it out again if you think it would result in too much
> > noise.
> 
> I think keeping dev_dbg() is best.  If you're searching for the
> solution to a problem, you should have debugging enabled and so you
> ought to see the message.
> 
> Alan Stern
> 

Standing for the Technical Advisory Board - Sarah Sharp

2013-10-22 Thread Sarah Sharp

[Resending, since the Tech-board-discuss mailing list is
subscriber-only, and my original email didn't make it through.]

I'm running for The Linux Foundation Technical Advisory Board (TAB),
and I would appreciate your vote!  I will be at the joint Kernel Summit
and LinuxCon party tomorrow, where you can cast your vote.

The TAB is designed to be a bridge between the Linux community and the
Linux Foundation.  It advises the Linux Foundation on anything from
conference planning to community matters.  The TAB also advises the
Linux Foundation on technical matters, including how to influence the
larger technical industry in order to get better Linux support.  For
example, the TAB members were involved in creating an initial solution
for UEFI secure boot and advising the Linux Foundation on how to
influence the tech industry to allow booting Linux on Windows 8
computers.  Basically, the TAB is the voice of the Linux community.

If I'm elected, the TAB will benefit from my six years of experience
with influencing the wider tech industry.  I've worked directly with
everyone from hardware architects and BIOS developers to OEMs and Linux
distros.  I've influenced USB specification working groups in order to
ensure good Linux support.  I feel this experience working with industry
will help me advise the Linux Foundation and the TAB on how to balance
the business needs of companies while still influencing the larger
industry in directions that benefit Linux.

I also bring a unique perspective to the TAB.  I've been the USB 3.0
driver maintainer for the past four years, which is long enough to
master the Linux kernel patch creation process, from submitting an RFC
patchset to sending pull requests and working with the stable kernel
trees.  However, I haven't been working in the kernel long enough to
forget what it's like being a newbie to our community.  Through working
with Linux kernel interns in the FOSS Outreach Program for Women, I see
ways we could make our communities and conferences better for newcomers.
I would love to bring that knowledge and those ideas to the Linux
Foundation through the TAB.

I also plan on encouraging the Linux Foundation to continue its efforts
to increase diversity in the Linux community.  Since April, I have been
coordinating women interns and Linux kernel mentors through the FOSS
Outreach Program for Women.  We had 41 women apply for 7 internship
positions, which shows there is an interest and a need for these kinds
of programs.  Our summer interns were outstanding, and made the #13 top
contributor spot for the 3.11 kernel with 230 patches.  The Linux
Foundation was a part of that, through sponsoring three internship
slots.  However, there's a lot more the Linux Foundation could do to
encourage women and other minorities in our community.  I would love to
advise them through the TAB on how to increase diversity within the
Linux community and in Linux Foundation conferences.

I feel the TAB could benefit from my experience with influencing the
tech industry, and my experience as a Linux kernel maintainer.  Please
join me in improving our communities for newcomers and underrepresented
groups. Please vote for me for the Linux Foundation Technical Advisory
Board tomorrow!

[PATCH] xhci: Remove segments from radix tree on failed insert.

2013-10-17 Thread Sarah Sharp

If we're expanding a stream ring, we want to make sure we can add those
ring segments to the radix tree that maps segments to ring pointers.
Try the radix tree insert after the new ring segments have been allocated
(the last segment in the new ring chunk will point to the first newly
allocated segment), but before the new ring segments are linked into the
old ring.

If insert fails on any one segment, remove each segment from the radix
tree, deallocate the new segments, and return.  Otherwise, link the new
segments into the tree.

Signed-off-by: Sarah Sharp 
---

Something like this.  It's ugly, but it compiles.  I haven't tested it.
Hans, can you review and test this?

Sarah Sharp
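
For readers following along, here is a minimal user-space sketch of the
insert-then-roll-back idea described in the commit message above.  It is
not the driver code: toy_map is a hypothetical fixed-size stand-in for the
radix tree, SEGMENT_SHIFT is assumed to be 10 (1KB segments), and the
sketch assumes the new segments are not already mapped and that last is
reachable from first.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAP_SLOTS 8        /* hypothetical capacity, small so inserts can fail */
#define SEGMENT_SHIFT 10   /* assumed: 1KB segments */

struct segment {
	unsigned long dma;      /* start address of the segment */
	struct segment *next;
};

/* Toy stand-in for the radix tree: a fixed array of keys. */
struct toy_map {
	unsigned long keys[MAP_SLOTS];
	bool used[MAP_SLOTS];
};

static bool toy_lookup(const struct toy_map *m, unsigned long key)
{
	for (size_t i = 0; i < MAP_SLOTS; i++)
		if (m->used[i] && m->keys[i] == key)
			return true;
	return false;
}

/* Returns 0 on success, -1 when the map is full (models an -ENOMEM). */
static int toy_insert(struct toy_map *m, unsigned long key)
{
	for (size_t i = 0; i < MAP_SLOTS; i++) {
		if (!m->used[i]) {
			m->used[i] = true;
			m->keys[i] = key;
			return 0;
		}
	}
	return -1;
}

static void toy_remove(struct toy_map *m, unsigned long key)
{
	for (size_t i = 0; i < MAP_SLOTS; i++)
		if (m->used[i] && m->keys[i] == key)
			m->used[i] = false;
}

/*
 * Map every segment from first to last.  If any insert fails, remove the
 * segments this call already inserted, so the map is left as it was.
 */
static int map_segments(struct toy_map *m, struct segment *first,
			struct segment *last)
{
	struct segment *seg = first;
	struct segment *failed;

	for (;;) {
		if (toy_insert(m, seg->dma >> SEGMENT_SHIFT) != 0) {
			failed = seg;
			break;
		}
		if (seg == last)
			return 0;
		seg = seg->next;
	}

	/* Roll back: everything before the failed segment was inserted. */
	for (seg = first; seg != failed; seg = seg->next)
		toy_remove(m, seg->dma >> SEGMENT_SHIFT);
	return -1;
}
```

The caller would only link the new segments into the old ring after
map_segments() returns 0; on failure it frees them and nothing stale is
left behind in the map.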

 drivers/usb/host/xhci-mem.c | 106 +---
 1 file changed, 80 insertions(+), 26 deletions(-)

diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
index a455c56..6ce8d31 100644
--- a/drivers/usb/host/xhci-mem.c
+++ b/drivers/usb/host/xhci-mem.c
@@ -180,53 +180,98 @@ static void xhci_link_rings(struct xhci_hcd *xhci, struct xhci_ring *ring,
  * extended systems (where the DMA address can be bigger than 32-bits),
  * if we allow the PCI dma mask to be bigger than 32-bits.  So don't do that.
  */
-static int xhci_update_stream_mapping(struct xhci_ring *ring, gfp_t mem_flags)
+static int xhci_insert_segment_mapping(struct radix_tree_root *trb_address_map,
+   struct xhci_ring *ring,
+   struct xhci_segment *seg,
+   gfp_t mem_flags)
 {
-   struct xhci_segment *seg;
unsigned long key;
int ret;
 
-   if (WARN_ON_ONCE(ring->trb_address_map == NULL))
+   key = (unsigned long)(seg->dma >> TRB_SEGMENT_SHIFT);
+   /* Skip any segments that were already added. */
+   if (radix_tree_lookup(trb_address_map, key))
return 0;
 
-   seg = ring->first_seg;
-   do {
-   key = (unsigned long)(seg->dma >> TRB_SEGMENT_SHIFT);
-   /* Skip any segments that were already added. */
-   if (radix_tree_lookup(ring->trb_address_map, key))
-   continue;
+   ret = radix_tree_maybe_preload(mem_flags);
+   if (ret)
+   return ret;
+   ret = radix_tree_insert(trb_address_map,
+   key, ring);
+   radix_tree_preload_end();
+   return ret;
+}
 
-   ret = radix_tree_maybe_preload(mem_flags);
-   if (ret)
-   return ret;
-   ret = radix_tree_insert(ring->trb_address_map,
-   key, ring);
-   radix_tree_preload_end();
+static void xhci_remove_segment_mapping(struct radix_tree_root *trb_address_map,
+   struct xhci_segment *seg)
+{
+   unsigned long key;
+
+   key = (unsigned long)(seg->dma >> TRB_SEGMENT_SHIFT);
+   if (radix_tree_lookup(trb_address_map, key))
+   radix_tree_delete(trb_address_map, key);
+}
+
+static int xhci_update_stream_segment_mapping(
+   struct radix_tree_root *trb_address_map,
+   struct xhci_ring *ring,
+   struct xhci_segment *first_seg,
+   struct xhci_segment *last_seg,
+   gfp_t mem_flags)
+{
+   struct xhci_segment *seg;
+   struct xhci_segment *failed_seg;
+   int ret;
+
+   if (WARN_ON_ONCE(trb_address_map == NULL))
+   return 0;
+
+   seg = first_seg;
+   do {
+   ret = xhci_insert_segment_mapping(trb_address_map,
+   ring, seg, mem_flags);
if (ret)
-   return ret;
+   goto remove_streams;
+   if (seg == last_seg)
+   return 0;
seg = seg->next;
-   } while (seg != ring->first_seg);
+   } while (seg != first_seg);
 
return 0;
+
+remove_streams:
+   failed_seg = seg;
+   seg = first_seg;
+   do {
+   xhci_remove_segment_mapping(trb_address_map, seg);
+   if (seg == failed_seg)
+   return ret;
+   seg = seg->next;
+   } while (seg != first_seg);
+
+   return ret;
 }
 
 static void xhci_remove_stream_mapping(struct xhci_ring *ring)
 {
struct xhci_segment *seg;
-   unsigned long key;
 
if (WARN_ON_ONCE(ring->trb_address_map == NULL))
return;
 
seg = ring->first_seg;
do {
-   key = (unsigned long)(seg->dma >> TRB_SEGMENT_SHIFT);
-   if (radix_tree_lookup(ring->trb_address_map, key))
-   radix_tree_delete(ring->trb_address_map, key);
+   xhci_remove_segment_mapping(ring->trb_address_map, seg);
seg = seg->next;
} while (seg != ring->first_seg);
 }
 
+static int xhci_update_stream_mapping(struct xhci_ring *ring, gfp_t mem_fla

Re: [PATCH v2] xhci: fix usb3 streams

2013-10-17 Thread Sarah Sharp

The more I look at this patch, the more I hate it for the failure cases
it doesn't cover.

What happens if the radix_tree_insert fails in the middle of adding a
set of ring segments?  We leave those segments that were inserted in the
radix tree, which is a problem, since we could allocate those segments
out of the DMA pool later for a different stream ID.

That's OK for the initial stream ring allocation, since the xhci_ring
itself will get freed.  It's not OK for ring expansion though, since
the xhci_ring remains intact, and we simply fail the URB submission.
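
To make the stale-entry problem concrete, here is a small sketch of how
the tree key is derived from a TRB's DMA address.  The helper names are
hypothetical, and SEGMENT_SHIFT is assumed to be 10 to match the 1KB
segment example in the comment quoted below; the driver's own constant is
TRB_SEGMENT_SHIFT.

```c
#include <stdint.h>

#define SEGMENT_SHIFT 10   /* assumed: 1KB segments, as in the quoted example */

/*
 * Key used to find the owning ring: the DMA address with the
 * within-segment offset bits dropped.
 */
static unsigned long segment_key(uint64_t dma)
{
	return (unsigned long)(dma >> SEGMENT_SHIFT);
}

/* Two addresses hit the same tree entry exactly when their keys match. */
static int same_entry(uint64_t a, uint64_t b)
{
	return segment_key(a) == segment_key(b);
}
```

Since the key depends only on the address, a segment that is freed and
later handed out again at the same DMA address produces the same key.  If
the old entry was never removed, a lookup for the new segment returns the
old ring.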

I'm working on a patch to fix this, but may not get it done today.

Sarah Sharp

On Mon, Oct 14, 2013 at 06:54:24PM -0700, Gerd Hoffmann wrote:
> Gerd, Hans, any objections to this updated patch?  The warning is fixed
> with it.
> 
> The patch probably still needs to address the case where the ring
> expansion fails because we can't insert the new segments into the radix
> tree.  The patch should probably allocate the segments, attempt to add
> them to the radix tree, and fail without modifying the link TRBs of the
> ring.  I'd have to look more deeply into the code to see what exactly
> should be done there.
> 
> I would like that issue fixed before I merge these patches, so if you
> want to take a stab at fixing it, please do.
> 
> Sarah Sharp
> 
> 8<>8
> 
> xhci maintains a radix tree for each stream endpoint because it must
> be able to map a trb address to the stream ring.  Each ring segment
> must be added to the radix tree for this to work.  Currently xhci sticks
> only the first segment of each stream ring into the radix tree.
> 
> Result is that things work initially, but as soon as the first segment
> is full xhci can't map the trb address from the completion event to the
> stream ring any more -> BOOM.  You'll find this message in the logs:
> 
>   ERROR Transfer event for disabled endpoint or incorrect stream ring
> 
> This patch adds a helper function to update the radix tree, and a
> function to remove ring segments from the tree.  Both functions loop
> over the segment list and handle all segments instead of just the
> first.
> 
> [Note: Sarah changed this patch to add radix_tree_maybe_preload() and
> radix_tree_preload_end() calls around the radix tree insert, since we
> can now insert entries in interrupt context.  There are now two helper
> functions to make the code cleaner, and those functions are moved to
> make them static.]
> 
> Signed-off-by: Gerd Hoffmann 
> Signed-off-by: Hans de Goede 
> Signed-off-by: Sarah Sharp 
> ---
>  drivers/usb/host/xhci-mem.c | 132 
> +---
>  drivers/usb/host/xhci.h |   1 +
>  2 files changed, 90 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/usb/host/xhci-mem.c b/drivers/usb/host/xhci-mem.c
> index 83bcd13..8b1ba5b 100644
> --- a/drivers/usb/host/xhci-mem.c
> +++ b/drivers/usb/host/xhci-mem.c
> @@ -149,14 +149,95 @@ static void xhci_link_rings(struct xhci_hcd *xhci, struct xhci_ring *ring,
>   }
>  }
>  
> +/*
> + * We need a radix tree for mapping physical addresses of TRBs to which stream
> + * ID they belong to.  We need to do this because the host controller won't tell
> + * us which stream ring the TRB came from.  We could store the stream ID in an
> + * event data TRB, but that doesn't help us for the cancellation case, since the
> + * endpoint may stop before it reaches that event data TRB.
> + *
> + * The radix tree maps the upper portion of the TRB DMA address to a ring
> + * segment that has the same upper portion of DMA addresses.  For example, say I
> + * have segments of size 1KB, that are always 64-byte aligned.  A segment may
> + * start at 0x10c91000 and end at 0x10c913f0.  If I use the upper 10 bits, the
> + * key to the stream ID is 0x43244.  I can use the DMA address of the TRB to
> + * pass the radix tree a key to get the right stream ID:
> + *
> + *   0x10c90fff >> 10 = 0x43243
> + *   0x10c912c0 >> 10 = 0x43244
> + *   0x10c91400 >> 10 = 0x43245
> + *
> + * Obviously, only those TRBs with DMA addresses that are within the segment
> + * will make the radix tree return the stream ID for that ring.
> + *
> + * Caveats for the radix tree:
> + *
> + * The radix tree uses an unsigned long as a key pair.  On 32-bit systems, an
> + * unsigned long will be 32-bits; on a 64-bit system an unsigned long will be
> + * 64-bits.  Since we only request 32-bit DMA addresses, we can use that as the
> + * key on 32-bit or 64-bit systems (it would also be fine if we asked for 64-bit
> + * PCI DMA addresses on a 64-bit system).  There might be

Re: [PATCH v2] usb: hub: Clear Port Reset Change during init/resume

2013-10-16 Thread Sarah Sharp

On Tue, Oct 15, 2013 at 05:45:00PM -0700, Julius Werner wrote:
> This patch adds the Port Reset Change flag to the set of bits that are
> preemptively cleared on init/resume of a hub. In theory this bit should
> never be set unexpectedly... in practice it can still happen if BIOS,
> SMM or ACPI code plays around with USB devices without cleaning up
> correctly. This is especially dangerous for XHCI root hubs, which don't
> generate any more Port Status Change Events until all change bits are
> cleared, so this is a good precaution to have (similar to how it's
> already done for the Warm Port Reset Change flag).

Did you run into an issue where port status change events weren't being
generated because the Port Reset flag was set?  I'm trying to figure out
if this addresses a real issue you hit (and thus should be queued for
stable), or if this is just a precaution.

I do agree this is a good fix.  ISTR that with some xHCI vendors, when
the host is reset, the hardware will drive a port reset down all ports.
That may cause the port reset and even warm port reset bits to be set.
I have a vague recollection that some implementations might not even
wait for the port reset to complete before saying the reset is done.  We
can't tell which hosts will and won't drive a reset, so we rely on the
USB core to clear those bits if they are set.

So perhaps instead of the BIOS/SMM/ACPI code leaving the ports in an
unclean state, the root cause of your issue is actually the call to
xhci_reset?  You can check the port status before that call to find out.
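To make the failure mode concrete, here is a toy model of the behaviour described above: a port that raises a status change event only when no change bit is already latched, and a driver-side routine that clears stale bits at init/resume. It is a minimal sketch, not kernel code; the bit layout and the names model_port, model_port_latch and model_hub_activate are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical change-bit layout for this sketch (not taken from kernel headers). */
#define CHG_CONNECTION  (1u << 0)
#define CHG_ENABLE      (1u << 1)
#define CHG_RESET       (1u << 4)
#define CHG_WARM_RESET  (1u << 5)

struct model_port {
    uint16_t change;       /* latched change bits */
    unsigned events_sent;  /* status change events delivered so far */
};

/*
 * "Hardware" side: latch new change bits. An event is raised only if no
 * change bit was pending, which models a root hub that stays silent
 * until all change bits have been cleared.
 */
static void model_port_latch(struct model_port *port, uint16_t bits)
{
    assert(port != NULL);
    if (port->change == 0)
        port->events_sent++;
    port->change |= bits;
}

/*
 * "Driver" side: clear any stale change bits at init/resume.
 * Returns true if something was cleared and a debounce delay is needed.
 */
static bool model_hub_activate(struct model_port *port)
{
    const uint16_t stale = CHG_CONNECTION | CHG_ENABLE |
                           CHG_RESET | CHG_WARM_RESET;
    bool need_debounce_delay = false;

    assert(port != NULL);
    if (port->change & stale) {
        need_debounce_delay = true;
        port->change = (uint16_t)(port->change & ~stale);
    }
    return need_debounce_delay;
}
```

In this model, a reset change bit left behind before the driver starts (by firmware or by a host reset) swallows every later connect event until model_hub_activate() has run.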

> Signed-off-by: Julius Werner 
> ---
>  drivers/usb/core/hub.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/drivers/usb/core/hub.c b/drivers/usb/core/hub.c
> index e6b682c..c3dd64c 100644
> --- a/drivers/usb/core/hub.c
> +++ b/drivers/usb/core/hub.c
> @@ -1130,6 +1130,11 @@ static void hub_activate(struct usb_hub *hub, enum hub_activation_type type)
>   usb_clear_port_feature(hub->hdev, port1,
>   USB_PORT_FEAT_C_ENABLE);
>   }
> + if (portchange & USB_PORT_STAT_C_RESET) {
> + need_debounce_delay = true;
> + usb_clear_port_feature(hub->hdev, port1,
> + USB_PORT_FEAT_C_RESET);
> + }
>   if ((portchange & USB_PORT_STAT_C_BH_RESET) &&
>   hub_is_superspeed(hub->hdev)) {
>   need_debounce_delay = true;
> -- 
> 1.7.12.4
> 

Re: [PATCH v2] xhci: fix usb3 streams

2013-10-16 Thread Sarah Sharp

On Tue, Oct 15, 2013 at 10:53:57AM -0400, Alan Stern wrote:
> On Mon, 14 Oct 2013, Gerd Hoffmann wrote:
> 
> > Gerd, Hans, any objections to this updated patch?  The warning is fixed
> > with it.
> > 
> > The patch probably still needs to address the case where the ring
> > expansion fails because we can't insert the new segments into the radix
> > tree.  The patch should probably allocate the segments, attempt to add
> > them to the radix tree, and fail without modifying the link TRBs of the
> > ring.  I'd have to look more deeply into the code to see what exactly
> > should be done there.
> > 
> > I would like that issue fixed before I merge these patches, so if you
> > want to take a stab at fixing it, please do.
> > 
> > Sarah Sharp
> 
> Sarah, how did you manage to send an email with the "From:" line set to 
> Gerd's name and address?

I sent it using git format-patch and mutt, and accidentally left Gerd's
From line intact.  Looking at the headers, it seems like the default
Intel exchange servers simply passed the email through.  Header forging,
what fun!

> > 8<---------------------------------------------------------------->8
> > 
> > xhci maintains a radix tree for each stream endpoint because it must
> > be able to map a trb address to the stream ring.  Each ring segment
> > must be added to the ring for this to work.  Currently xhci sticks
> > only the first segment of each stream ring into the radix tree.
> > 
> > Result is that things work initially, but as soon as the first segment
> > is full xhci can't map the trb address from the completion event to the
> > stream ring any more -> BOOM.  You'll find this message in the logs:
> > 
> >   ERROR Transfer event for disabled endpoint or incorrect stream ring
> > 
> > This patch adds a helper function to update the radix tree, and a
> > function to remove ring segments from the tree.  Both functions loop
> > over the segment list and handle all segments instead of just the
> > first.
> 
> There may be a simpler approach to this problem.
> 
> When using a new ring segment, keep the first TRB entry in reserve.  
> Don't put a normal TRB in there, instead leave it as a no-op entry
> containing a pointer to the stream ring.  (Make the prior Link TRB
> point to the second entry in the new segment instead of the first.)
> 
> Then you won't have to add to or remove anything from the radix tree.

I don't understand how this would help.  Are you advocating a different
way of mapping TRB DMA addresses to stream rings that would allow us to
ditch the radix tree altogether?
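For context, the lookup the radix tree provides can be sketched roughly as below: each segment is registered under a key derived from its DMA address, and a TRB address from an event is shifted the same way to find its ring. This is only an illustration. It uses a small linear table instead of a radix tree, the 1 KiB segment size behind SEG_SHIFT is an assumption, and toy_ring, seg_map, seg_map_insert and seg_map_lookup are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SEG_SHIFT 10u  /* assumption: segments are 1 KiB and 1 KiB-aligned */
#define MAX_SEGS  8u   /* arbitrary capacity for this sketch */

struct toy_ring {
    int stream_id;
};

/* Stand-in for the radix tree: a flat table of (segment key -> ring). */
struct seg_map {
    uint64_t key[MAX_SEGS];
    struct toy_ring *ring[MAX_SEGS];
    size_t used;
};

/* Register one segment of a ring under its shifted DMA address. */
static int seg_map_insert(struct seg_map *map, uint64_t seg_dma,
                          struct toy_ring *ring)
{
    assert(map != NULL && ring != NULL);
    if (map->used == MAX_SEGS)
        return -1;
    map->key[map->used] = seg_dma >> SEG_SHIFT;
    map->ring[map->used] = ring;
    map->used++;
    return 0;
}

/* Map the DMA address of any TRB back to the ring that owns its segment. */
static struct toy_ring *seg_map_lookup(const struct seg_map *map,
                                       uint64_t trb_dma)
{
    uint64_t key = trb_dma >> SEG_SHIFT;

    assert(map != NULL);
    for (size_t i = 0; i < map->used; i++) {
        if (map->key[i] == key)
            return map->ring[i];
    }
    return NULL;  /* segment was never registered */
}
```

If only the first segment of a ring is inserted, a lookup for a TRB in the second segment returns NULL, which corresponds to the "incorrect stream ring" error quoted below.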

Ok, so with your solution, we have a virtual stream ring pointer as the
first TRB of the segment.  We get an event with the DMA address of a TRB
in one of many stream rings on an endpoint.  From that, I think we can
infer the DMA address of the first TRB on the segment, due to the
alignment requirements and ring size.
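The inference step could look something like the sketch below, assuming every segment is a power-of-two size and aligned to that size. The 1 KiB segment and 16-byte TRB sizes are assumptions for illustration, and seg_base_from_trb and trb_index_in_seg are hypothetical helpers, not existing driver functions.

```c
#include <assert.h>
#include <stdint.h>

#define SEG_BYTES 1024u  /* assumption: segment size, power of two */
#define TRB_BYTES 16u    /* assumption: size of one TRB */

/* DMA address of the first TRB in the segment that contains trb_dma. */
static uint64_t seg_base_from_trb(uint64_t trb_dma)
{
    assert(trb_dma % TRB_BYTES == 0);  /* TRBs are TRB_BYTES-aligned */
    return trb_dma & ~((uint64_t)SEG_BYTES - 1);
}

/* Index of the TRB within its segment. */
static unsigned trb_index_in_seg(uint64_t trb_dma)
{
    return (unsigned)((trb_dma & (SEG_BYTES - 1)) / TRB_BYTES);
}
```

Note that this only yields a DMA address; nothing here produces a CPU-visible pointer to the first TRB.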

And then what do we do with that?  We don't have the virtual address of
that first TRB, so the xHCI driver can't read the ring pointer from it.
I'm confused as to what the next steps would be to solve this.

Sarah Sharp
