Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors

2017-05-20 Thread Christoph Hellwig
On Sat, May 20, 2017 at 09:49:56AM -0700, Linus Torvalds wrote:
> Side note: why is it doing that " > 1" check, when any value _other_
> than 1 is wrong?

It's the same effect, so either one is fine with me.

> Also, to match the non-MSI implementation, wouldn't it be nicer to
> just write it that same way (and also verify "dev->irq"):
> 
> 	if (flags & PCI_IRQ_LEGACY) {
> 		if (min_vecs == 1 && dev->irq)
> 			return 1;
> 	}
> 	return -ENOSPC;
> 
> (the exact error value probably doesn't matter in practice, but the
> CONFIG_MSI case returns ENOSPC by default and that's what
> Documentation/PCI/MSI-HOWTO.txt says too).

Sure.  Just sent the previous version to Bjorn so that he could maybe
make it for -rc2, but I'll respin it.
--
To unsubscribe from this list: send the line "unsubscribe linux-usb" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors

2017-05-20 Thread Linus Torvalds
On Fri, May 19, 2017 at 5:46 AM, Christoph Hellwig wrote:
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 33c2b0b77429..5a7fd3b6a7b9 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1342,7 +1342,7 @@ pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
>  			       unsigned int max_vecs, unsigned int flags,
>  			       const struct irq_affinity *aff_desc)
>  {
> -	if (min_vecs > 1)
> +	if (min_vecs > 1 || !(flags & PCI_IRQ_LEGACY))
>  		return -EINVAL;
>  	return 1;
>  }

Side note: why is it doing that " > 1" check, when any value _other_
than 1 is wrong?

Also, to match the non-MSI implementation, wouldn't it be nicer to
just write it that same way (and also verify "dev->irq"):

	if (flags & PCI_IRQ_LEGACY) {
		if (min_vecs == 1 && dev->irq)
			return 1;
	}
	return -ENOSPC;

(the exact error value probably doesn't matter in practice, but the
CONFIG_MSI case returns ENOSPC by default and that's what
Documentation/PCI/MSI-HOWTO.txt says too).

Hmm?

 Linus
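
For comparison, the variants of the !CONFIG_PCI_MSI fallback discussed in
this thread can be sketched as a small userspace model. This is only an
illustration, not the kernel code: struct model_pci_dev and the fallback_*
function names are hypothetical, the PCI_IRQ_* values are assumed to mirror
include/linux/pci.h, and the affinity argument is dropped.

```c
#include <errno.h>

/* Assumed to mirror the kernel's flag values; stand-ins for this model. */
#define PCI_IRQ_LEGACY	(1 << 0)
#define PCI_IRQ_MSI	(1 << 1)
#define PCI_IRQ_MSIX	(1 << 2)

/* Hypothetical stand-in for struct pci_dev: only the legacy IRQ number. */
struct model_pci_dev {
	unsigned int irq;
};

/* Fallback before the patch: the flags are never looked at. */
static inline int fallback_original(struct model_pci_dev *dev,
		unsigned int min_vecs, unsigned int max_vecs,
		unsigned int flags)
{
	(void)dev; (void)max_vecs; (void)flags;
	if (min_vecs > 1)
		return -EINVAL;
	return 1;
}

/* Fallback with the posted patch: legacy must be explicitly allowed. */
static inline int fallback_patched(struct model_pci_dev *dev,
		unsigned int min_vecs, unsigned int max_vecs,
		unsigned int flags)
{
	(void)dev; (void)max_vecs;
	if (min_vecs > 1 || !(flags & PCI_IRQ_LEGACY))
		return -EINVAL;
	return 1;
}

/* Form suggested in the reply: exactly one vector and a valid dev->irq. */
static inline int fallback_suggested(struct model_pci_dev *dev,
		unsigned int min_vecs, unsigned int max_vecs,
		unsigned int flags)
{
	(void)max_vecs;
	if (flags & PCI_IRQ_LEGACY) {
		if (min_vecs == 1 && dev->irq)
			return 1;
	}
	return -ENOSPC;
}
```

In this model the " > 1" and "== 1" checks differ only for min_vecs == 0,
and the dev->irq test additionally rejects a device that has no legacy
interrupt assigned.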


Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors

2017-05-19 Thread Steven Rostedt
On Fri, 19 May 2017 14:46:25 +0200
Christoph Hellwig wrote:

> On Fri, May 19, 2017 at 08:37:21AM -0400, Steven Rostedt wrote:
> > ktest config bisect ended with:
> > 
> > ***
> > Found bad config: CONFIG_PCI_MSI
> > ***  
> 
> Oh, that's interesting.  I think there's been a bug in the !CONFIG_PCI_MSI
> fallback for pci_alloc_irq_vectors since the very beginning.  And it
> didn't matter for any driver so far, but xhci has a very odd way
> to set MSI(-X) vs legacy interrupts.
> 
> Can you try the patch below?

Works. Thanks!

Tested-by: Steven Rostedt (VMware) 

-- Steve

> 
> 
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 33c2b0b77429..5a7fd3b6a7b9 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -1342,7 +1342,7 @@ pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
>  unsigned int max_vecs, unsigned int flags,
>  const struct irq_affinity *aff_desc)
>  {
> - if (min_vecs > 1)
> + if (min_vecs > 1 || !(flags & PCI_IRQ_LEGACY))
>   return -EINVAL;
>   return 1;
>  }



Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors

2017-05-19 Thread Christoph Hellwig
On Fri, May 19, 2017 at 08:37:21AM -0400, Steven Rostedt wrote:
> ktest config bisect ended with:
> 
> ***
> Found bad config: CONFIG_PCI_MSI
> ***

Oh, that's interesting.  I think there's been a bug in the !CONFIG_PCI_MSI
fallback for pci_alloc_irq_vectors since the very beginning.  And it
didn't matter for any driver so far, but xhci has a very odd way
to set MSI(-X) vs legacy interrupts.

Can you try the patch below?


diff --git a/include/linux/pci.h b/include/linux/pci.h
index 33c2b0b77429..5a7fd3b6a7b9 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -1342,7 +1342,7 @@ pci_alloc_irq_vectors_affinity(struct pci_dev *dev, unsigned int min_vecs,
 			       unsigned int max_vecs, unsigned int flags,
 			       const struct irq_affinity *aff_desc)
 {
-	if (min_vecs > 1)
+	if (min_vecs > 1 || !(flags & PCI_IRQ_LEGACY))
 		return -EINVAL;
 	return 1;
 }
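
A rough model of why the old fallback could mislead a driver that probes
MSI-X first, then MSI, and only then falls back to the legacy interrupt.
The probe order is an assumption based on the description above, not a copy
of the xhci code; pick_irq_mode, alloc_before_patch and alloc_after_patch
are hypothetical names, and the PCI_IRQ_* values are assumed to mirror
include/linux/pci.h.

```c
#include <errno.h>

/* Assumed to mirror the kernel's flag values; stand-ins for this model. */
#define PCI_IRQ_LEGACY	(1 << 0)
#define PCI_IRQ_MSI	(1 << 1)
#define PCI_IRQ_MSIX	(1 << 2)

typedef int (*alloc_fn)(unsigned int min_vecs, unsigned int max_vecs,
			unsigned int flags);

/* !CONFIG_PCI_MSI fallback before the patch: flags are ignored. */
static inline int alloc_before_patch(unsigned int min_vecs,
		unsigned int max_vecs, unsigned int flags)
{
	(void)max_vecs; (void)flags;
	if (min_vecs > 1)
		return -EINVAL;
	return 1;
}

/* Fallback with the patch: fails unless the caller allows legacy IRQs. */
static inline int alloc_after_patch(unsigned int min_vecs,
		unsigned int max_vecs, unsigned int flags)
{
	(void)max_vecs;
	if (min_vecs > 1 || !(flags & PCI_IRQ_LEGACY))
		return -EINVAL;
	return 1;
}

enum irq_mode { IRQ_MODE_MSIX, IRQ_MODE_MSI, IRQ_MODE_LEGACY };

/*
 * Hypothetical probe sequence: ask for MSI-X only, then MSI only, and
 * treat the device as legacy only when both requests fail.
 */
static inline enum irq_mode pick_irq_mode(alloc_fn alloc,
					  unsigned int msix_count)
{
	if (alloc(msix_count, msix_count, PCI_IRQ_MSIX) > 0)
		return IRQ_MODE_MSIX;
	if (alloc(1, 1, PCI_IRQ_MSI) > 0)
		return IRQ_MODE_MSI;
	return IRQ_MODE_LEGACY;
}
```

With the old fallback, the MSI-only request for one vector reports success
even though MSI support is compiled out, so such a driver would believe it
owns an exclusive MSI vector while it actually sits on the legacy line.
With the patch, both requests fail and the driver reaches its legacy path.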


Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors

2017-05-19 Thread Steven Rostedt
On Fri, 19 May 2017 06:08:56 -0400
Steven Rostedt wrote:

> > But other configs on this same hardware work, can you do a diff of a
> > working vs. not working?  
> 
> I could probably run my config-bisect and see what it comes up with.

ktest config bisect ended with:

***
Found bad config: CONFIG_PCI_MSI
***

diffconfig good_config bad_config
-ENA_ETHERNET n
-FM10K n
-GENERIC_MSI_IRQ y
-GENERIC_MSI_IRQ_DOMAIN y
-I40EVF n
-INTEL_IOMMU n
-IRQ_REMAP n
-IXGBEVF n
-LIQUIDIO_VF n
-NFP n
-PCIE_DW_PLAT n
-PCI_MSI_IRQ_DOMAIN y
-VMD n
 PCI_MSI y -> n


When that is not set, it fails to boot. It boots fine if I enable it.

-- Steve


Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors

2017-05-19 Thread Steven Rostedt
On Fri, 19 May 2017 07:42:23 +0200
Greg Kroah-Hartman wrote:

> On Thu, May 18, 2017 at 11:42:34PM -0400, Steven Rostedt wrote:
> > 
> > One of the configs I use to test ftrace with (configs that have
> > caused failures in the past), has lots of irq issues and fails to
> > initialize the network of my box. I bisected the problem down to a
> > single commit, and when I revert that commit, my box boots without any
> > network or irq issues.
> > 
> > Note, my other configs work fine on this box. I haven't investigated
> > which config is also the culprit. But since it used to work with this
> > config, I want to report it.  
> 
> So what commit is causing the problem?

Ug, I forgot to cut and paste the sha1. I thought I did, but I only cut
and pasted the subject into the subject of this email.

commit 77d45b4500967de674b8f75a9a91f58d57d5704d

> 
> It looks like the ehci driver is having problems, but first, your
> interrupts are whack:

Could be. It's an old board.

> 
> >  irq 16: nobody cared (try booting with the "irqpoll" option)
> >  CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.12.0-rc1-test-dirty #24
> >  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
> >  Call Trace:
> >   
> >  devtmpfs: mounted
> >   dump_stack+0x9a/0xd6
> >   __report_bad_irq+0x35/0xc0
> >   note_interrupt+0x234/0x270
> >   handle_irq_event_percpu+0x45/0x60
> >   handle_irq_event+0x39/0x60
> >   handle_fasteoi_irq+0x8f/0x160
> >   handle_irq+0x6f/0x110
> >   do_IRQ+0x46/0xd0
> >   common_interrupt+0x93/0x93
> >  RIP: 0010:native_safe_halt+0x6/0x10
> >  RSP: :b54240cd7e90 EFLAGS: 0286 ORIG_RAX: ff7e
> >  RAX:  RBX: 8ea214498040 RCX: 
> >  RDX:  RSI:  RDI: 
> >  RBP: b54240cd7e90 R08: 0001 R09: 41129b0c
> >  R10: b54240cd7d68 R11: 0001 R12: 0002
> >  R13: 8ea214498040 R14:  R15: 8ea214498040
> >   
> >   default_idle+0x38/0x160
> >   arch_cpu_idle+0xf/0x20
> >   default_idle_call+0x28/0x50
> >   do_idle+0x182/0x220
> >   cpu_startup_entry+0x1d/0x20
> >   start_secondary+0x132/0x160
> >   secondary_startup_64+0x9f/0x9f
> >  handlers:
> >  [] xhci_msi_irq
> >  Disabling IRQ #16  
> 
> Have you tried taking the kernel's advice?  :)

You mean the "irqpoll"?  No. This works fine without that commit. Why
should I have to change?

> 
> >  ehci-pci :00:1a.0: new USB bus registered, assigned bus number 3
> >  ehci-pci :00:1a.0: debug port 2
> >  ehci-pci :00:1a.0: cache line size of 64 is not supported
> > >  genirq: Flags mismatch irq 16. 0080 (ehci_hcd:usb3) vs.  (xhci_hcd)
> 
> What does that mean?

No idea ;-)
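
For reference, the "Flags mismatch" message is printed by the genirq
sharing check in __setup_irq(): a second handler can be added to an
interrupt line only if both the existing and the new request pass
IRQF_SHARED (0x80, matching the "0080" above). A simplified sketch of just
that part of the rule follows; the real check also compares trigger type
and other flags, and can_share_irq is a hypothetical helper, not a kernel
function.

```c
#include <errno.h>

/* Same value as the kernel's IRQF_SHARED in include/linux/interrupt.h. */
#define IRQF_SHARED	0x00000080

/*
 * Hypothetical helper modelling only the IRQF_SHARED part of the check:
 * sharing is allowed when both the installed handler and the new request
 * set IRQF_SHARED, otherwise the new request is refused with -EBUSY.
 */
static inline int can_share_irq(unsigned long old_flags,
				unsigned long new_flags)
{
	if (!((old_flags & new_flags) & IRQF_SHARED))
		return -EBUSY;
	return 0;
}
```

A handler that was registered without IRQF_SHARED therefore blocks every
later request on the same line, even when the later request is shareable.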

> 
> > >  CPU: 0 PID: 307 Comm: modprobe Tainted: GE   4.12.0-rc1-test-dirty #24
> >  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
> >  Call Trace:
> >   dump_stack+0x9a/0xd6
> >   __setup_irq+0x5d4/0x630
> >   request_threaded_irq+0x10d/0x190
> >   usb_add_hcd+0x658/0x970
> >   ? for_each_companion+0x3e/0xb0
> >   usb_hcd_pci_probe+0x3e4/0x490
> >   ehci_pci_probe+0x36/0x40 [ehci_pci]
> >   local_pci_probe+0x45/0xa0
> >   ? pci_match_device+0xca/0x110
> >   pci_device_probe+0xdb/0x130
> >   driver_probe_device+0x2ed/0x480
> >   __driver_attach+0xd5/0x100
> >   ? driver_probe_device+0x480/0x480
> >   bus_for_each_dev+0x62/0xa0
> >   driver_attach+0x1e/0x20
> >   bus_add_driver+0x1c6/0x290
> >   driver_register+0x60/0xe0
> >   __pci_register_driver+0x60/0x70
> >   ? 0xc0346000
> >   ehci_pci_init+0x6a/0x1000 [ehci_pci]
> >   do_one_initcall+0x43/0x190
> >   ? kmem_cache_alloc_trace+0x1be/0x200
> >   do_init_module+0x7d/0x210
> >   load_module+0x1891/0x1eb0
> >   ? vmap_page_range_noflush+0x29b/0x370
> >   ? show_coresize+0x30/0x30
> >   SYSC_init_module+0x143/0x180
> >   ? load_module+0x5/0x1eb0
> >   ? SYSC_init_module+0x143/0x180
> >   SyS_init_module+0xe/0x10
> >   entry_SYSCALL_64_fastpath+0x23/0xc2
> >  RIP: 0033:0x3b918e0ffa
> >  RSP: 002b:7ffd11d575c8 EFLAGS: 0246 ORIG_RAX: 00af
> >  RAX: ffda RBX: 0061f950 RCX: 003b918e0ffa
> >  RDX: 0061f7d0 RSI: 36b0 RDI: 0062c9e0
> >  RBP:  R08: 00630090 R09: 7f019c07c700
> >  R10: 7ffd11d574f0 R11: 0246 R12: 00626200
> >  R13: 0061f930 R14:  R15: 0061f420
> >  ehci-pci :00:1a.0: request interrupt 16 failed  
> 
> So ehci can't use the same irq line as xhci?  No sharing allowed?
> 
> But other configs on this same hardware work, can you do a diff of a
> working vs. not working?

I could probably run my config-bisect and see what it comes up with.

-- Steve

Re: [REGRESSION] Failed network caused by: xhci: switch to pci_alloc_irq_vectors

2017-05-18 Thread Greg Kroah-Hartman
On Thu, May 18, 2017 at 11:42:34PM -0400, Steven Rostedt wrote:
> 
> One of the configs I use to test ftrace with (configs that have
> caused failures in the past), has lots of irq issues and fails to
> initialize the network of my box. I bisected the problem down to a
> single commit, and when I revert that commit, my box boots without any
> network or irq issues.
> 
> Note, my other configs work fine on this box. I haven't investigated
> which config is also the culprit. But since it used to work with this
> config, I want to report it.

So what commit is causing the problem?

It looks like the ehci driver is having problems, but first, your
interrupts are whack:

>  irq 16: nobody cared (try booting with the "irqpoll" option)
>  CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.12.0-rc1-test-dirty #24
>  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>  Call Trace:
>   
>  devtmpfs: mounted
>   dump_stack+0x9a/0xd6
>   __report_bad_irq+0x35/0xc0
>   note_interrupt+0x234/0x270
>   handle_irq_event_percpu+0x45/0x60
>   handle_irq_event+0x39/0x60
>   handle_fasteoi_irq+0x8f/0x160
>   handle_irq+0x6f/0x110
>   do_IRQ+0x46/0xd0
>   common_interrupt+0x93/0x93
>  RIP: 0010:native_safe_halt+0x6/0x10
>  RSP: :b54240cd7e90 EFLAGS: 0286 ORIG_RAX: ff7e
>  RAX:  RBX: 8ea214498040 RCX: 
>  RDX:  RSI:  RDI: 
>  RBP: b54240cd7e90 R08: 0001 R09: 41129b0c
>  R10: b54240cd7d68 R11: 0001 R12: 0002
>  R13: 8ea214498040 R14:  R15: 8ea214498040
>   
>   default_idle+0x38/0x160
>   arch_cpu_idle+0xf/0x20
>   default_idle_call+0x28/0x50
>   do_idle+0x182/0x220
>   cpu_startup_entry+0x1d/0x20
>   start_secondary+0x132/0x160
>   secondary_startup_64+0x9f/0x9f
>  handlers:
>  [] xhci_msi_irq
>  Disabling IRQ #16

Have you tried taking the kernel's advice?  :)

>  ehci-pci :00:1a.0: new USB bus registered, assigned bus number 3
>  ehci-pci :00:1a.0: debug port 2
>  ehci-pci :00:1a.0: cache line size of 64 is not supported
>  genirq: Flags mismatch irq 16. 0080 (ehci_hcd:usb3) vs.  (xhci_hcd)

What does that mean?

>  CPU: 0 PID: 307 Comm: modprobe Tainted: GE   4.12.0-rc1-test-dirty #24
>  Hardware name: MSI MS-7823/CSM-H87M-G43 (MS-7823), BIOS V1.6 02/22/2014
>  Call Trace:
>   dump_stack+0x9a/0xd6
>   __setup_irq+0x5d4/0x630
>   request_threaded_irq+0x10d/0x190
>   usb_add_hcd+0x658/0x970
>   ? for_each_companion+0x3e/0xb0
>   usb_hcd_pci_probe+0x3e4/0x490
>   ehci_pci_probe+0x36/0x40 [ehci_pci]
>   local_pci_probe+0x45/0xa0
>   ? pci_match_device+0xca/0x110
>   pci_device_probe+0xdb/0x130
>   driver_probe_device+0x2ed/0x480
>   __driver_attach+0xd5/0x100
>   ? driver_probe_device+0x480/0x480
>   bus_for_each_dev+0x62/0xa0
>   driver_attach+0x1e/0x20
>   bus_add_driver+0x1c6/0x290
>   driver_register+0x60/0xe0
>   __pci_register_driver+0x60/0x70
>   ? 0xc0346000
>   ehci_pci_init+0x6a/0x1000 [ehci_pci]
>   do_one_initcall+0x43/0x190
>   ? kmem_cache_alloc_trace+0x1be/0x200
>   do_init_module+0x7d/0x210
>   load_module+0x1891/0x1eb0
>   ? vmap_page_range_noflush+0x29b/0x370
>   ? show_coresize+0x30/0x30
>   SYSC_init_module+0x143/0x180
>   ? load_module+0x5/0x1eb0
>   ? SYSC_init_module+0x143/0x180
>   SyS_init_module+0xe/0x10
>   entry_SYSCALL_64_fastpath+0x23/0xc2
>  RIP: 0033:0x3b918e0ffa
>  RSP: 002b:7ffd11d575c8 EFLAGS: 0246 ORIG_RAX: 00af
>  RAX: ffda RBX: 0061f950 RCX: 003b918e0ffa
>  RDX: 0061f7d0 RSI: 36b0 RDI: 0062c9e0
>  RBP:  R08: 00630090 R09: 7f019c07c700
>  R10: 7ffd11d574f0 R11: 0246 R12: 00626200
>  R13: 0061f930 R14:  R15: 0061f420
>  ehci-pci :00:1a.0: request interrupt 16 failed

So ehci can't use the same irq line as xhci?  No sharing allowed?

But other configs on this same hardware work, can you do a diff of a
working vs. not working?

thanks,

greg k-h