On Sun, Jan 29, 2012 at 6:29 PM, Roland Dreier <rol...@purestorage.com> wrote:

> I'm having a strange problem passing an mlx4 device into a kvm guest.
> The device in question is:
>
>    05:00.0 InfiniBand [0c06]: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] [15b3:673c] (rev b0)
>
> running the latest (I believe) FW version 2.9.1000.
>
> The symptom of the problem is
> that when the mlx4_core driver starts, I get normal output like
>
>    mlx4_core 0000:00:04.0: FW version 2.9.1000 (cmd intf rev 3), max commands 16
>    mlx4_core 0000:00:04.0: Catastrophic error buffer at 0x1f020, size 0x10, BAR 0
>    mlx4_core 0000:00:04.0: FW size 385 KB
>
> up until the driver tries to enable interrupts, when I get a long
> stream of
>
>    Completion event for bogus CQ 00000000
>
> and then it gives up because the NOP command interrupt test
> fails.

Just to follow up on this, it turns out this is a bug in how the
Mellanox firmware deals with FLR (function level reset).  The
FW will be fixed in a future release, but in the meantime I've
been able to work around this with the following hack (probably
going to be whitespace-damaged by the Gmail web interface
I'm using, but you should be able to recreate it if you care):

--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -3085,6 +3085,12 @@ static int reset_intel_82599_sfp_virtfn(struct pci_dev *dev, int probe)
        return 0;
 }

+static int reset_mellanox_dev(struct pci_dev *dev, int probe)
+{
+       /* skip FLR, it busts the Mellanox FW */
+       return 0;
+}
+
 #define PCI_DEVICE_ID_INTEL_82599_SFP_VF   0x10ed

 static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
@@ -3092,6 +3098,8 @@ static const struct pci_dev_reset_methods pci_dev_reset_methods[] = {
                 reset_intel_82599_sfp_virtfn },
        { PCI_VENDOR_ID_INTEL, PCI_ANY_ID,
                reset_intel_generic_dev },
+       { PCI_VENDOR_ID_MELLANOX, 0x673c,
+               reset_mellanox_dev },
        { 0 }
 };
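
For anyone wondering why an empty function is enough to suppress FLR:
as far as I understand the PCI reset path in kernels of this era, the
device-specific table is consulted first, and generic methods such as
FLR are tried only when that lookup returns -ENOTTY. The following is
a userspace model of that dispatch, not kernel code. The names
model_dev, reset_method, skip_reset, model_flr and model_dev_reset,
and the device ID 0x0001, are made up for illustration; only the
15b3:673c ID pair comes from the lspci output above.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define ANY_ID          0xffffu   /* stands in for (u16)PCI_ANY_ID */
#define VENDOR_MELLANOX 0x15b3u   /* from the lspci output: [15b3:673c] */

/* Hypothetical stand-in for struct pci_dev. */
struct model_dev {
	uint16_t vendor;
	uint16_t device;
	int flr_count;            /* number of FLRs this model device has seen */
};

typedef int (*reset_fn)(struct model_dev *dev, int probe);

/* Hypothetical stand-in for struct pci_dev_reset_methods. */
struct reset_method {
	uint16_t vendor;
	uint16_t device;
	reset_fn reset;
};

/* Same idea as reset_mellanox_dev(): claim success without touching the device. */
static int skip_reset(struct model_dev *dev, int probe)
{
	(void)dev;
	(void)probe;
	return 0;
}

static const struct reset_method reset_methods[] = {
	{ VENDOR_MELLANOX, 0x673c, skip_reset },
	{ 0, 0, NULL }            /* terminator, like { 0 } in the patch */
};

/* Walk the table; -ENOTTY means "no device-specific method for this device". */
static int dev_specific_reset(struct model_dev *dev, int probe)
{
	const struct reset_method *m;

	for (m = reset_methods; m->reset; m++) {
		if ((m->vendor == dev->vendor || m->vendor == ANY_ID) &&
		    (m->device == dev->device || m->device == ANY_ID))
			return m->reset(dev, probe);
	}
	return -ENOTTY;
}

/* Stand-in for a generic FLR: only counts how often it was issued. */
static int model_flr(struct model_dev *dev, int probe)
{
	if (!probe)
		dev->flr_count++;
	return 0;
}

/* Assumed ordering: device-specific method first, FLR only as a fallback. */
static int model_dev_reset(struct model_dev *dev, int probe)
{
	int rc = dev_specific_reset(dev, probe);

	if (rc != -ENOTTY)
		return rc;
	return model_flr(dev, probe);
}
```

In this model, calling model_dev_reset() on a device with IDs
15b3:673c returns 0 and leaves flr_count at 0, while a device with no
table entry falls through to model_flr(). Note that the quirk also
returns 0 for probe calls, so callers will believe the device has a
working reset method even though nothing is actually reset.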