Re: mlx4 query in sriov mode
On Thu, Aug 28, 2014 at 10:58:50PM +0530, Bob Biloxi wrote:
>Hi All,
>
>I really appreciate this wonderful community which has immensely
>helped me broaden my knowledge and understanding.
>
>I was going through the mlx4 sriov code, trying to understand the
>communication between the VF driver and the PF driver.
>
>I was having a few queries..hoping to get a better understanding.
>
>As I understand, the commands are communicated between VF and PF
>through a mechanism called communication channel. VF writes to
>specific address in its BAR space, PF gets an event and then proceeds
>ahead to read the command from its BAR space and then complete the
>execution of it..
>
>Now, my query is, lets say the VF driver is not yet present and only
>the PF driver is there...
>
>In this case, can we simulate a VF command write and get notified
>through an event?
>

Hi, I am not that familiar with the mlx4 driver.

As you mentioned above, the VF communicates with the PF by writing a
word to its BAR, and the PF then picks it up. If that is true, I believe
it would work.

>For eg. we write to some offset in the PF BAR space itself upon
>completion of which, an event is generated because of the write? kind
>of like loopback mechanism.
>
>I searched through the code but couldn't find anywhere.
>
>Can anyone please help me understand if this is possible? And if there
>is any location in the code where i can find this?

Where did you find that the communication between PF and VF is done by
writing to the BAR?

>
>Thanks a lot in advance!!
>
>Best Regards,
>Bob
>--
>To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at http://vger.kernel.org/majordomo-info.html

-- 
Richard Yang
Help you, Help me
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
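To make the question above concrete, here is a toy, user-space model of a doorbell-style mailbox of the kind Bob describes: one side writes a command word, the other side notices that a new word has arrived and handles it. It does not touch hardware and is not the mlx4 register layout. The names `ToyChannel`, `post`, `pf_poll`, the toggle bit, and the field widths are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed, illustrative layout of a 32-bit command word (not taken from the thread):
# bit 31 = toggle, bits 16..23 = command code, bits 0..15 = parameter.
TOGGLE_SHIFT = 31
CMD_SHIFT = 16


def pack(cmd: int, param: int, toggle: int) -> int:
    """Pack command, parameter and toggle bit into one word."""
    return ((toggle & 1) << TOGGLE_SHIFT) | ((cmd & 0xFF) << CMD_SHIFT) | (param & 0xFFFF)


def unpack(word: int) -> Tuple[int, int, int]:
    """Return (cmd, param, toggle) from a packed word."""
    return (word >> CMD_SHIFT) & 0xFF, word & 0xFFFF, (word >> TOGGLE_SHIFT) & 1


@dataclass
class ToyChannel:
    """Hypothetical per-function mailbox: one write register plus toggle state."""
    write_reg: int = 0          # stands in for the word written into BAR space
    writer_toggle: int = 0      # toggle kept by whoever posts commands
    seen_toggle: int = 0        # last toggle value the PF side has handled
    handled: List[Tuple[int, int]] = field(default_factory=list)

    def post(self, cmd: int, param: int) -> None:
        """Writer side: flip the toggle and store a new command word."""
        self.writer_toggle ^= 1
        self.write_reg = pack(cmd, param, self.writer_toggle)

    def pf_poll(self) -> bool:
        """PF side: handle the word only if its toggle differs from the last one seen."""
        cmd, param, toggle = unpack(self.write_reg)
        if toggle == self.seen_toggle:
            return False        # nothing new since the last poll
        self.seen_toggle = toggle
        self.handled.append((cmd, param))
        return True


if __name__ == "__main__":
    ch = ToyChannel()
    ch.post(cmd=0x05, param=0x1234)   # could be a VF, or a test harness playing the VF
    print(ch.pf_poll(), ch.handled)
```

In this software model nothing distinguishes who performs the write, so a "loopback" test is trivial. On a real device the event is raised by the hardware, so whether a write made from the PF side produces the same notification is a property of the device that this sketch cannot answer.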
Re: rmmod mlx4_core panic 3.16-rc1
On Fri, Jun 20, 2014 at 07:02:41AM +0300, Or Gerlitz wrote:
>On Fri, Jun 20, 2014 at 6:51 AM, Wei Yang wrote:
>
>> >> mlx4_core :40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port
>> >> - single port VFs syntax is only supported when all ports are configured
>> >> as ethernet
>> >> BUG: unable to handle kernel NULL pointer dereference at 038c
>> >> IP: [] __mlx4_remove_one+0x20/0x380 [mlx4_core]
>>
>> From this log, it happens during probe?
>> If not, any action after probe?
>
>yep, maybe the bug still exists in the error flow of probe? you can probe with
>
>num_vfs=1,1,1 port_type_array=1,1 and see if you hit it
>

I tried this:

modprobe mlx4_core num_vfs=3 probe_vf=3 port_type_array=1,1

It looks good to me.

BTW, I didn't test 3.16-rc1, since the SRIOV patch on the power platform
is not rebased to the latest kernel yet.

>
>> >> PGD 45d3ba067 PUD 45ace8067 PMD 0
>> >> Oops: [#1] SMP DEBUG_PAGEALLOC
>> >> Modules linked in: mlx4_core(-) ebtable_nat ebtables ipt_MASQUERADE
>> >> iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc
>> >> autofs4 cpufreq_ondemand ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
>> >> iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
>> >> xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash
>> >> dm_log dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm iTCO_wdt
>> >> iTCO_vendor_support microcode ipmi_si ipmi_msghandler acpi_cpufreq pcspkr
>> >> i2c_i801 i2c_core lpc_ich mfd_core shpchp sg ioatdma ib_sa ib_mad ib_core
>> >> ib_addr ipv6 vxlan ixgbe dca ptp pps_core hwmon mdio ext3 jbd mbcache
>> >> sd_mod crc_t10dif crct10dif_common usb_storage ahci libahci mpt2sas
>> >> scsi_transport_sas raid_class [last unloaded: mlx4_core]
>> >> CPU: 13 PID: 7212 Comm: rmmod Not tainted 3.16.0-rc1+ #1
>> >> Hardware name: Oracle Corporation SUN FIRE X4170 M3 /ASSY,MOTHERBOARD,1U , BIOS 17050100 08/29/2013
>> >> task: 880461540110 ti: 88046500 task.ti: 88046500
>> >> RIP: 0010:[] [] __mlx4_remove_one+0x20/0x380 [mlx4_core]
>> >> RSP: 0018:880465003d88 EFLAGS: 00010296
>> >> RAX: 0001 RBX: RCX:
>> >> RDX: 0026 RSI: 0292 RDI: 880468b8f000
>> >> RBP: 880465003db8 R08: R09:
>> >> R10: 09f911029d74e35b R11: 09f911029d74e35b R12:
>> >> R13: 880468b8f000 R14: a036de40 R15: 0001
>> >> FS: 7ff287fc2700() GS:88046fce() knlGS:
>> >> CS: 0010 DS: ES: CR0: 80050033
>> >> CR2: 038c CR3: 00045cfae000 CR4: 000407e0
>> >> Stack:
>> >> 880465003da8 880468b8f000 880468b8f000
>> >> a036de40 0001 880465003dd8 a0350805
>> >> 880468b8f098 a036dd60 880465003e08 812ebaa6
>> >> Call Trace:
>> >> [] mlx4_remove_one+0x25/0x50 [mlx4_core]
>> >> [] pci_device_remove+0x46/0xc0
>> >> [] __device_release_driver+0x7f/0xf0
>> >> [] driver_detach+0xc8/0xd0
>> >> [] bus_remove_driver+0x59/0xd0
>> >> [] driver_unregister+0x30/0x70
>> >> [] pci_unregister_driver+0x23/0x80
>> >> [] mlx4_cleanup+0x10/0x1e [mlx4_core]
>> >> [] SyS_delete_module+0x189/0x210
>> >> [] system_call_fastpath+0x16/0x1b
>> >> Code: 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41
>> >> 54 53 48 83 ec 08 66 66 66 66 90 48 8b 9f 58 01 00 00 49 89 fd <44> 8b b3
>> >> 8c 03 00 00 45 85 f6 0f 85 41 02 00 00 f6 43 08 04 44
>> >> RIP [] __mlx4_remove_one+0x20/0x380 [mlx4_core]
>> >> RSP
>> >> CR2: 038c
>>
>> --
>> Richard Yang
>> Help you, Help me
>>

-- 
Richard Yang
Help you, Help me
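The two command lines in this message differ in a way that matters for the error quoted at the top: Or's suggestion uses the three-value `num_vfs=1,1,1` form, while the command that was actually run uses the single-value `num_vfs=3` form. As a rough illustration of the rule stated in the error message, the following sketch rejects the multi-value form unless every port is Ethernet. It is a model of the message text only, not the driver's code. The function names are hypothetical, and treating `2` as the Ethernet value of `port_type_array` is my understanding of the parameter rather than something this message states.

```python
from typing import List, Optional

ETH = 2  # assumed port_type_array value for an Ethernet port

ERR = ("Invalid syntax of num_vfs/probe_vfs with IB port - single port VFs "
       "syntax is only supported when all ports are configured as ethernet")


def parse_list(value: str) -> List[int]:
    """Split a comma-separated module parameter into integers."""
    return [int(item) for item in value.split(",")]


def check_vf_syntax(num_vfs: str, port_type_array: str) -> Optional[str]:
    """Return the error text if the per-port VF syntax is combined with a non-Ethernet port."""
    counts = parse_list(num_vfs)
    ports = parse_list(port_type_array)
    uses_per_port_syntax = len(counts) > 1          # e.g. "1,1,1" rather than "3"
    all_ethernet = all(p == ETH for p in ports)
    if uses_per_port_syntax and not all_ethernet:
        return ERR
    return None


if __name__ == "__main__":
    for num_vfs in ("3", "1,1,1"):
        print(num_vfs, "->", check_vf_syntax(num_vfs, "1,1"))
```

Under this reading, `num_vfs=3 port_type_array=1,1` never enters the failing path, so a clean run with it says little about the probe error flow that Or asked to exercise.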
Re: rmmod mlx4_core panic 3.16-rc1
On Fri, Jun 20, 2014 at 06:34:48AM +0300, Or Gerlitz wrote:
>On Thu, Jun 19, 2014 at 6:33 AM, Shirley Ma wrote:
>>
>> 1. Whether IB VFs is supported in ConnectX-2 (mlx4 driver)?
>>
>> I tried to num_vfs={port1,port2,port1+2} when loading mlx4_core module, it
>> failed with mlx4_core :40:00.0: Invalid syntax of num_vfs/probe_vfs with
>> IB port - single port VFs syntax is only supported when all ports are
>> configured as ethernet
>
>What do you mean by "port1" and "port2" -- can you give the exact
>command line you used?
>
>Single ported VFs are currently supported for Ethernet only
>configuration, that is not for only IB nor for VPI, that is only if
>you use port_type_array=2,2
>
>>
>> 2. After mlx4_core module is being loaded with num_vfs={} parameters,
>> when removing mlx4_core, it consistently hits below panic. Whether this
>> problem is being tracked?
>
>what do you mean by "num_vfs={}", is it num_vfs=N or {N}, also here,
>please send the exact setting you used. The crash you indicated below
>is supposed to be fixed by the upstream commit
>da1de8dfff09d33d4a5345762c21b487028e25f5 "net/mlx4_core: Keep only one
>driver entry release" - are you sure to have this commit in the tree
>you are working with?
>

Just checked, this patch is in 3.16-rc1.

>Or.
>
>>
>> mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v2.2-1 (Feb 2014)
>> mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)
>> mlx4_core: Initializing :40:00.0
>> mlx4_core :40:00.0: Enabling SR-IOV with 2 VFs
>> pci :40:00.1: [15b3:1002] type 00 class 0x0c0600
>> mlx4_core: Initializing :40:00.1
>> mlx4_core :40:00.1: enabling device ( -> 0002)
>> mlx4_core :40:00.1: Skipping virtual function:1
>> pci :40:00.2: [15b3:1002] type 00 class 0x0c0600
>> mlx4_core: Initializing :40:00.2
>> mlx4_core :40:00.2: enabling device ( -> 0002)
>> mlx4_core :40:00.2: Skipping virtual function:2
>> mlx4_core :40:00.0: Running in master mode
>> mlx4_core :40:00.0: PCIe BW is different than device's capability
>> mlx4_core :40:00.0: PCIe link speed is 5.0GT/s, device supports 8.0GT/s
>> mlx4_core :40:00.0: PCIe link width is x8, device supports x8
>> mlx4_core :40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port -
>> single port VFs syntax is only supported when all ports are configured as
>> ethernet
>> BUG: unable to handle kernel NULL pointer dereference at 038c
>> IP: [] __mlx4_remove_one+0x20/0x380 [mlx4_core]

From this log, it happens during probe? If not, any action after probe?

>> PGD 45d3ba067 PUD 45ace8067 PMD 0
>> Oops: [#1] SMP DEBUG_PAGEALLOC
>> Modules linked in: mlx4_core(-) ebtable_nat ebtables ipt_MASQUERADE
>> iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc
>> autofs4 cpufreq_ondemand ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4
>> iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6
>> xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash
>> dm_log dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm iTCO_wdt
>> iTCO_vendor_support microcode ipmi_si ipmi_msghandler acpi_cpufreq pcspkr
>> i2c_i801 i2c_core lpc_ich mfd_core shpchp sg ioatdma ib_sa ib_mad ib_core
>> ib_addr ipv6 vxlan ixgbe dca ptp pps_core hwmon mdio ext3 jbd mbcache sd_mod
>> crc_t10dif crct10dif_common usb_storage ahci libahci mpt2sas
>> scsi_transport_sas raid_class [last unloaded: mlx4_core]
>> CPU: 13 PID: 7212 Comm: rmmod Not tainted 3.16.0-rc1+ #1
>> Hardware name: Oracle Corporation SUN FIRE X4170 M3 /ASSY,MOTHERBOARD,1U , BIOS 17050100 08/29/2013
>> task: 880461540110 ti: 88046500 task.ti: 88046500
>> RIP: 0010:[] [] __mlx4_remove_one+0x20/0x380 [mlx4_core]
>> RSP: 0018:880465003d88 EFLAGS: 00010296
>> RAX: 0001 RBX: RCX:
>> RDX: 0026 RSI: 0292 RDI: 880468b8f000
>> RBP: 880465003db8 R08: R09:
>> R10: 09f911029d74e35b R11: 09f911029d74e35b R12:
>> R13: 880468b8f000 R14: a036de40 R15: 0001
>> FS: 7ff287fc2700() GS:88046fce() knlGS:
>> CS: 0010 DS: ES: CR0: 80050033
>> CR2: 038c CR3: 00045cfae000 CR4: 000407e0
>> Stack:
>> 880465003da8 880468b8f000 880468b8f000
>> a036de40 0001 880465003dd8 a0350805
>> 880468b8f098 a036dd60 880465003e08 812ebaa6
>> Call Trace:
>> [] mlx4_remove_one+0x25/0x50 [mlx4_core]
>> [] pci_device_remove+0x46/0xc0
>> [] __device_release_driver+0x7f/0xf0
>> [] driver_detach+0xc8/0xd0
>> [] bus_remove_driver+0x59/0xd0
>> [] driver_unregister+0x30/0x70
>> [] pci_unregister_driver+0x23/0x80
>> [] mlx4_cleanup+0x10/0x1e [mlx4_core]
>> [] SyS_delete_module+0x189/0x210
>> [] syst