Re: mlx4 query in sriov mode

2014-08-28 Thread Wei Yang
On Thu, Aug 28, 2014 at 10:58:50PM +0530, Bob Biloxi wrote:
>Hi All,
>
>
>I really appreciate this wonderful community which has immensely
>helped me broaden my knowledge and understanding.
>
>
>I was going through the mlx4 sriov code, trying to understand the
>communication between the VF driver and the PF driver.
>
>I have a few queries and am hoping to get a better understanding.
>
>
>As I understand it, commands are communicated between the VF and the PF
>through a mechanism called the communication channel. The VF writes to a
>specific address in its BAR space; the PF gets an event, then reads the
>command from its own BAR space and completes the execution of it.
>
>
>Now, my query is: let's say the VF driver is not yet present and only
>the PF driver is there...
>
>In this case, can we simulate a VF command write and get notified
>through an event?
>

Hi,

I am not that familiar with the mlx4 driver. As you mentioned above, the VF
communicates with the PF by writing a word to its BAR, and the PF picks it
up. If this is true, I believe it would work.
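
To make the mechanism described above concrete, here is a minimal toy model of
a doorbell-style channel: one word written by the VF side, one word written
back by the PF side, and a toggle bit that marks a new command. The class name,
the field names and the bit layout (toggle in bit 31, command in bits 16-23,
parameter in bits 0-15) are assumptions made for illustration, not taken from
the mlx4 source. It also does not settle whether real hardware raises an event
when the PF itself performs the write; it only shows that, in software, any
writer can play the VF role.

```python
class CommChannel:
    """Toy stand-in for a shared pair of 32-bit words (hypothetical layout)."""

    def __init__(self):
        self.slave_write = 0  # word written by the VF side
        self.slave_read = 0   # word written back by the PF side


def vf_post(chan, cmd, param, toggle):
    """VF side: publish a command by writing a single 32-bit word."""
    chan.slave_write = ((toggle & 1) << 31) | ((cmd & 0xFF) << 16) | (param & 0xFFFF)


def pf_poll(chan):
    """PF side: return (cmd, param) if a new command is pending, else None."""
    vf_toggle = chan.slave_write >> 31
    pf_toggle = chan.slave_read >> 31
    if vf_toggle == pf_toggle:
        return None  # nothing new since the last acknowledgement
    cmd = (chan.slave_write >> 16) & 0xFF
    param = chan.slave_write & 0xFFFF
    # Acknowledge by copying the VF's toggle bit into the PF's word.
    chan.slave_read = (chan.slave_read & 0x7FFFFFFF) | (vf_toggle << 31)
    return cmd, param


# "Loopback" use: the test code itself plays the VF role.
chan = CommChannel()
vf_post(chan, cmd=0x05, param=0x1234, toggle=1)
print(pf_poll(chan))
```

The toggle bit is what lets the PF side tell a fresh command from one it has
already handled, so a second poll without a new write reports nothing pending.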

>For example, we write to some offset in the PF BAR space itself, and upon
>completion of the write an event is generated, kind of like a loopback
>mechanism.
>
>
>I searched through the code but couldn't find anywhere.
>
>Can anyone please help me understand if this is possible? And if there
>is any location in the code where i can find this?

Where did you find that the communication between the PF and the VF is done
by writing to the BAR?

>
>Thanks a lot in advance!!
>
>
>Best Regards,
>Bob
>--
>To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>the body of a message to majord...@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
Richard Yang
Help you, Help me

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: rmmod mlx4_core panic 3.16-rc1

2014-06-19 Thread Wei Yang
On Fri, Jun 20, 2014 at 07:02:41AM +0300, Or Gerlitz wrote:
>On Fri, Jun 20, 2014 at 6:51 AM, Wei Yang wrote:
>
>> >> mlx4_core :40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port 
>> >> - single port VFs syntax is only supported when all ports are configured 
>> >> as ethernet
>> >> BUG: unable to handle kernel NULL pointer dereference at 038c
>> >> IP: [] __mlx4_remove_one+0x20/0x380 [mlx4_core]
>>
>> From this log, does it happen during probe?
>> If not, was any action taken after probe?
>
>yep, maybe the bug still exists in the error flow of probe? you can probe with
>
>num_vfs=1,1,1 port_type_array=1,1 and see if you hit it
>

I tried this

modprobe mlx4_core num_vfs=3 probe_vf=3 port_type_array=1,1

It looks good to me.
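
The rule discussed in this thread (single-port VFs are accepted only when all
ports are Ethernet) can be sketched as below. This is an illustrative model
based on the statements in the thread, not the driver's actual check. The
function names are hypothetical, and treating a single num_vfs value as
"all VFs dual-ported" is an assumption.

```python
ETH = 2  # port_type_array value for Ethernet, per "port_type_array=2,2" in the thread


def parse_num_vfs(value):
    """Return (port1_only, port2_only, dual_port) VF counts from a num_vfs string."""
    parts = [int(p) for p in value.split(",")]
    if len(parts) == 1:
        # Assumption: a single value requests only dual-port VFs.
        return (0, 0, parts[0])
    if len(parts) == 3:
        return tuple(parts)
    raise ValueError("num_vfs must be N or port1,port2,both")


def config_is_accepted(num_vfs, port_type_array):
    """Model of the rule: single-port VFs require every port to be Ethernet."""
    port1, port2, _both = parse_num_vfs(num_vfs)
    port_types = [int(t) for t in port_type_array.split(",")]
    wants_single_port = port1 > 0 or port2 > 0
    all_ethernet = all(t == ETH for t in port_types)
    return (not wants_single_port) or all_ethernet


# The command tried above versus the one suggested for hitting the error flow.
print(config_is_accepted("3", "1,1"))
print(config_is_accepted("1,1,1", "1,1"))
```

Under this model the command above passes because it asks for no single-port
VFs, while num_vfs=1,1,1 with port_type_array=1,1 is the combination that
would be rejected.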

BTW, I didn't test 3.16-rc1, since the SRIOV patch for the Power platform is
not rebased to the latest kernel yet.
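
For reference, the "Code:" line of the oops quoted below can be decoded by
hand to see why CR2 is 038c. The byte in angle brackets marks the start of the
faulting instruction. The sketch below decodes only the one form that appears
there (REX prefix, opcode 8b, register base plus 32-bit displacement); the
function names are hypothetical and this is not a general disassembler.

```python
CODE = (
    "66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41 "
    "54 53 48 83 ec 08 66 66 66 66 90 48 8b 9f 58 01 00 00 49 89 fd <44> 8b b3 "
    "8c 03 00 00 45 85 f6 0f 85 41 02 00 00 f6 43 08 04 44"
)

REGS = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
        "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def faulting_bytes(code_line):
    """Return the bytes starting at the instruction marked with <..>."""
    tokens = code_line.split()
    start = next(i for i, t in enumerate(tokens) if t.startswith("<"))
    return bytes(int(t.strip("<>"), 16) for t in tokens[start:])


def decode_mov_disp32(raw):
    """Decode 'REX 8b /r' with mod=10 (base register + disp32), nothing else."""
    rex, opcode, modrm = raw[0], raw[1], raw[2]
    if rex & 0xF0 != 0x40 or opcode != 0x8B:
        raise ValueError("not a REX-prefixed MOV r32, r/m32")
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0b10 or rm == 0b100:
        raise ValueError("only the base+disp32 form without SIB is handled")
    reg |= (rex & 0x4) << 1  # REX.R extends the destination register
    rm |= (rex & 0x1) << 3   # REX.B extends the base register
    disp = int.from_bytes(raw[3:7], "little", signed=True)
    return reg, rm, disp


dest, base, disp = decode_mov_disp32(faulting_bytes(CODE))
print(f"mov {hex(disp)}(%{REGS[base]}),%{REGS[dest]}d")  # mov 0x38c(%rbx),%r14d
```

The displacement 0x38c matches CR2, which is what a read through a NULL base
pointer would produce. RBX is blank in the archived register dump, presumably
because leading zeros were stripped. The bytes just before the marker
(48 8b 9f 58 01 00 00) load RBX from offset 0x158 of the structure passed in
RDI, so the pointer stored there was apparently NULL when rmmod ran.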

>
>>
>> >> PGD 45d3ba067 PUD 45ace8067 PMD 0
>> >> Oops:  [#1] SMP DEBUG_PAGEALLOC
>> >> Modules linked in: mlx4_core(-) ebtable_nat ebtables ipt_MASQUERADE 
>> >> iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc 
>> >> autofs4 cpufreq_ondemand ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 
>> >> iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 
>> >> xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash 
>> >> dm_log dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm iTCO_wdt 
>> >> iTCO_vendor_support microcode ipmi_si ipmi_msghandler acpi_cpufreq pcspkr 
>> >> i2c_i801 i2c_core lpc_ich mfd_core shpchp sg ioatdma ib_sa ib_mad ib_core 
>> >> ib_addr ipv6 vxlan ixgbe dca ptp pps_core hwmon mdio ext3 jbd mbcache 
>> >> sd_mod crc_t10dif crct10dif_common usb_storage ahci libahci mpt2sas 
>> >> scsi_transport_sas raid_class [last unloaded: mlx4_core]
>> >> CPU: 13 PID: 7212 Comm: rmmod Not tainted 3.16.0-rc1+ #1
>> >> Hardware name: Oracle Corporation SUN FIRE X4170 M3 
>> >> /ASSY,MOTHERBOARD,1U   , BIOS 17050100 08/29/2013
>> >> task: 880461540110 ti: 88046500 task.ti: 88046500
>> >> RIP: 0010:[]  [] 
>> >> __mlx4_remove_one+0x20/0x380 [mlx4_core]
>> >> RSP: 0018:880465003d88  EFLAGS: 00010296
>> >> RAX: 0001 RBX:  RCX: 
>> >> RDX: 0026 RSI: 0292 RDI: 880468b8f000
>> >> RBP: 880465003db8 R08:  R09: 
>> >> R10: 09f911029d74e35b R11: 09f911029d74e35b R12: 
>> >> R13: 880468b8f000 R14: a036de40 R15: 0001
>> >> FS:  7ff287fc2700() GS:88046fce() 
>> >> knlGS:
>> >> CS:  0010 DS:  ES:  CR0: 80050033
>> >> CR2: 038c CR3: 00045cfae000 CR4: 000407e0
>> >> Stack:
>> >>  880465003da8 880468b8f000  880468b8f000
>> >>  a036de40 0001 880465003dd8 a0350805
>> >>  880468b8f098 a036dd60 880465003e08 812ebaa6
>> >> Call Trace:
>> >>  [] mlx4_remove_one+0x25/0x50 [mlx4_core]
>> >>  [] pci_device_remove+0x46/0xc0
>> >>  [] __device_release_driver+0x7f/0xf0
>> >>  [] driver_detach+0xc8/0xd0
>> >>  [] bus_remove_driver+0x59/0xd0
>> >>  [] driver_unregister+0x30/0x70
>> >>  [] pci_unregister_driver+0x23/0x80
>> >>  [] mlx4_cleanup+0x10/0x1e [mlx4_core]
>> >>  [] SyS_delete_module+0x189/0x210
>> >>  [] system_call_fastpath+0x16/0x1b
>> >> Code: 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 41 57 41 56 41 55 41 
>> >> 54 53 48 83 ec 08 66 66 66 66 90 48 8b 9f 58 01 00 00 49 89 fd <44> 8b b3 
>> >> 8c 03 00 00 45 85 f6 0f 85 41 02 00 00 f6 43 08 04 44
>> >> RIP  [] __mlx4_remove_one+0x20/0x380 [mlx4_core]
>> >>  RSP 
>> >> CR2: 038c
>>
>> --
>> Richard Yang
>> Help you, Help me
>>

-- 
Richard Yang
Help you, Help me

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: rmmod mlx4_core panic 3.16-rc1

2014-06-19 Thread Wei Yang
On Fri, Jun 20, 2014 at 06:34:48AM +0300, Or Gerlitz wrote:
>On Thu, Jun 19, 2014 at 6:33 AM, Shirley Ma wrote:
>>
>> 1. Are IB VFs supported on ConnectX-2 (mlx4 driver)?
>>
>> I tried num_vfs={port1,port2,port1+2} when loading the mlx4_core module; it
>> failed with mlx4_core :40:00.0: Invalid syntax of num_vfs/probe_vfs with
>> IB port - single port VFs syntax is only supported when all ports are
>> configured as ethernet
>
>
>What do you mean by "port1" and "port2" -- can you give the exact
>command line you used?
>
>Single-ported VFs are currently supported only in an Ethernet-only
>configuration, not for IB-only nor for VPI; that is, only if
>you use port_type_array=2,2
>
>
>
>>
>>
>> 2. After the mlx4_core module is loaded with the num_vfs={} parameters,
>> removing mlx4_core consistently hits the panic below. Is this
>> problem being tracked?
>
>
>What do you mean by "num_vfs={}"? Is it num_vfs=N or {N}? Here too,
>please send the exact setting you used. The crash you indicated below
>is supposed to be fixed by the upstream commit
>da1de8dfff09d33d4a5345762c21b487028e25f5 "net/mlx4_core: Keep only one
>driver entry release" - are you sure you have this commit in the tree
>you are working with?
>

Just checked, this patch is in 3.16-rc1.

>Or.
>
>>
>>  mlx4_ib_add: mlx4_ib: Mellanox ConnectX InfiniBand driver v2.2-1 
>> (Feb 2014)
>> mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)
>> mlx4_core: Initializing :40:00.0
>> mlx4_core :40:00.0: Enabling SR-IOV with 2 VFs
>> pci :40:00.1: [15b3:1002] type 00 class 0x0c0600
>> mlx4_core: Initializing :40:00.1
>> mlx4_core :40:00.1: enabling device ( -> 0002)
>> mlx4_core :40:00.1: Skipping virtual function:1
>> pci :40:00.2: [15b3:1002] type 00 class 0x0c0600
>> mlx4_core: Initializing :40:00.2
>> mlx4_core :40:00.2: enabling device ( -> 0002)
>> mlx4_core :40:00.2: Skipping virtual function:2
>> mlx4_core :40:00.0: Running in master mode
>> mlx4_core :40:00.0: PCIe BW is different than device's capability
>> mlx4_core :40:00.0: PCIe link speed is 5.0GT/s, device supports 8.0GT/s
>> mlx4_core :40:00.0: PCIe link width is x8, device supports x8
>> mlx4_core :40:00.0: Invalid syntax of num_vfs/probe_vfs with IB port - 
>> single port VFs syntax is only supported when all ports are configured as 
>> ethernet
>> BUG: unable to handle kernel NULL pointer dereference at 038c
>> IP: [] __mlx4_remove_one+0x20/0x380 [mlx4_core]

From this log, does it happen during probe?
If not, was any action taken after probe?

>> PGD 45d3ba067 PUD 45ace8067 PMD 0
>> Oops:  [#1] SMP DEBUG_PAGEALLOC
>> Modules linked in: mlx4_core(-) ebtable_nat ebtables ipt_MASQUERADE 
>> iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc 
>> autofs4 cpufreq_ondemand ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 
>> iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 
>> xt_state nf_conntrack ip6table_filter ip6_tables dm_mirror dm_region_hash 
>> dm_log dm_mod vhost_net macvtap macvlan vhost tun kvm_intel kvm iTCO_wdt 
>> iTCO_vendor_support microcode ipmi_si ipmi_msghandler acpi_cpufreq pcspkr 
>> i2c_i801 i2c_core lpc_ich mfd_core shpchp sg ioatdma ib_sa ib_mad ib_core 
>> ib_addr ipv6 vxlan ixgbe dca ptp pps_core hwmon mdio ext3 jbd mbcache sd_mod 
>> crc_t10dif crct10dif_common usb_storage ahci libahci mpt2sas 
>> scsi_transport_sas raid_class [last unloaded: mlx4_core]
>> CPU: 13 PID: 7212 Comm: rmmod Not tainted 3.16.0-rc1+ #1
>> Hardware name: Oracle Corporation SUN FIRE X4170 M3 /ASSY,MOTHERBOARD,1U 
>>   , BIOS 17050100 08/29/2013
>> task: 880461540110 ti: 88046500 task.ti: 88046500
>> RIP: 0010:[]  [] 
>> __mlx4_remove_one+0x20/0x380 [mlx4_core]
>> RSP: 0018:880465003d88  EFLAGS: 00010296
>> RAX: 0001 RBX:  RCX: 
>> RDX: 0026 RSI: 0292 RDI: 880468b8f000
>> RBP: 880465003db8 R08:  R09: 
>> R10: 09f911029d74e35b R11: 09f911029d74e35b R12: 
>> R13: 880468b8f000 R14: a036de40 R15: 0001
>> FS:  7ff287fc2700() GS:88046fce() knlGS:
>> CS:  0010 DS:  ES:  CR0: 80050033
>> CR2: 038c CR3: 00045cfae000 CR4: 000407e0
>> Stack:
>>  880465003da8 880468b8f000  880468b8f000
>>  a036de40 0001 880465003dd8 a0350805
>>  880468b8f098 a036dd60 880465003e08 812ebaa6
>> Call Trace:
>>  [] mlx4_remove_one+0x25/0x50 [mlx4_core]
>>  [] pci_device_remove+0x46/0xc0
>>  [] __device_release_driver+0x7f/0xf0
>>  [] driver_detach+0xc8/0xd0
>>  [] bus_remove_driver+0x59/0xd0
>>  [] driver_unregister+0x30/0x70
>>  [] pci_unregister_driver+0x23/0x80
>>  [] mlx4_cleanup+0x10/0x1e [mlx4_core]
>>  [] SyS_delete_module+0x189/0x210
>>  [] syst