Multicast Filtering - mlx4_SET_MCAST_FLTR

2015-04-09 Thread Bob Biloxi
Hi,

I was going through the mlx4 code and noticed that this function
mlx4_SET_MCAST_FLTR calls the mlx4_SET_MCAST_FLTR_wrapper, which in
turn has an empty body.


So, I was just wondering if the multicast filtering functionality is disabled?

Is QP_ATTACH the replacement for this?

Couldn't understand so wanted your help on this...


Thanks

Best Regards,
Bob
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


mlx4 VLAN filtering support: mlx4_SET_VLAN_FLTR_wrapper

2015-02-24 Thread Bob Biloxi
Hi,

I was going through the code and noticed that this function
mlx4_SET_VLAN_FLTR_wrapper has an empty body and returns success.

Also this function mlx4_common_set_vlan_fltr doesn't have an implementation.

I was going through the 3.19 Linux kernel.


So, does that mean VLAN filtering is not being handled in mlx4?

Any other place this code has been moved to?


Thanks

Best Regards,
Ahsan


Re: mlx4: having trouble getting mlx4_NOP to succeed in the VF driver

2014-12-31 Thread Bob Biloxi
Hi Jack,

Thanks so much for the really quick response. This is very helpful..
Please find my answers below:


 This simply indicates that the VF did not receive a command-completion
 interrupt.

So, in this case, I think the VF didn't receive a command-completion
interrupt for the NOP command. I say this because it is still in
mlx4_setup_hca, and these logs were seen right after the VF
called the mlx4_NOP function.

 What is your setup topology?  Is the VF running on the Hypervisor?  Is
 it running on a VM?

 What is your O/S (Ubuntu X.Y, Fedora, SLES, etc).  What kernel are you
 running?

 I assume that you are running inbox under kernel 3.18.1.  Is this
 correct?

The VF is running on a virtual machine(LPAR). The PF runs on the hypervisor.

The VF runs on Red Hat 7. The kernel version is 3.10.0-123.el, which is
a bit older than 3.18.1.

I didn't understand what you meant by inbox... does that refer to a driver version?

 This is because GEN_EQE did not succeed in triggering the EQ which the
 VF uses for Async/command-completion events.

Oh okay...thank you.. now i am able to understand... but the GEN_EQE
command did return 0 status. i mean when i invoke the HCR command
after providing the slave(VF) number. So I think command status and
generation of the event to the VF EQ are two separate things...


 ConnectX does generate *send/receive* completion events directly to the
 VF. This is because each CQ is associated individually with an EQ, and
 the VF associates CQs it creates with its own EQs.

 Each VF also creates an Async/command-completion EQ.  However, this EQ
 is triggered by the PF via GEN_EQE (see explanation immediately below).

Thanks so much Jack...This is really helpful...now i understand in
more detail how the events get delivered(completion vs async)




 The issue here is that only the PF posts commands to the FW -- and
 receives the command-completion event when a command completes.
 The VF submits to the PF a command it wishes to post.  The PF posts
 the command to the firmware (i.e., the HCA), and fields the
 command-completion event.  It then invokes GEN_EQE to trigger the command
 completion event on the VF's async EQ.

Again, thanks so much for explaining. This is really very helpful. I
think I might have understood the issue, but I'm not sure...
So in my case, after receiving the NOP command from the VF, the PF
posts it to the HCA. Once the NOP is posted to the HCA,
the PF should get the interrupt (command-completion event) even though
the command originally came from the VF.
After handling the interrupt, it must use GEN_EQE to send the
event (interrupt) to the VF's async EQ. Did I understand correctly?
I will verify this by checking the logs to see whether the PF received an
interrupt (event) for the NOP that came from the VF.
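
The flow described above can be sketched as a toy model in C. This is only an illustration under stated assumptions: none of the names below (vf_state, pf_post_to_fw, pf_gen_eqe, pf_handle_vf_command) are mlx4 symbols, the firmware is modeled as always succeeding, and the eqe_delivered flag is a made-up stand-in for whether the event really reaches the VF's EQ. It illustrates the point raised in this thread that a zero command status from GEN_EQE and the arrival of the event on the VF side are two separate things.

```c
#include <stdbool.h>

/* Toy model only; these are hypothetical names, not mlx4 symbols. */
struct vf_state {
	int  slave_id;            /* VF (slave) number as seen by the PF */
	bool async_eq_triggered;  /* did the VF's async EQ see the event? */
};

/* The PF posts the VF's command to the firmware; 0 means success.
 * Assumption: the firmware always completes the command successfully. */
static int pf_post_to_fw(int opcode)
{
	(void)opcode;
	return 0;
}

/* The PF asks for an event on the VF's async EQ.
 * The return value models the command status only; 'delivered'
 * models whether the event actually reaches the VF. */
static int pf_gen_eqe(struct vf_state *vf, bool delivered)
{
	vf->async_eq_triggered = delivered;
	return 0;
}

/* Full path: the VF's command arrives at the PF, the PF posts it to
 * the firmware, then forwards the completion to the VF via GEN_EQE. */
static int pf_handle_vf_command(struct vf_state *vf, int opcode,
				bool eqe_delivered)
{
	int status = pf_post_to_fw(opcode);

	if (status)
		return status;
	return pf_gen_eqe(vf, eqe_delivered);
}
```

Calling pf_handle_vf_command(&vf, 0x31, false) returns 0 while vf.async_eq_triggered stays false, which mirrors the symptom in the logs: every command reports success on the PF side, yet the VF times out waiting for its interrupt.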



 You need to verify that the IOMMU options are activated in
 make menuconfig on the Hypervisor:

 --- IOMMU Hardware Support
 [*]   AMD IOMMU support
 [*] Export AMD IOMMU statistics to debugfs
  AMD IOMMU Version 2 driver
 [*]   Support for Intel IOMMU using DMA Remapping Devices
 [*] Enable Intel DMA Remapping Devices by default
 [*]   Support for Interrupt Remapping

 I suspect that this may not have been done.
 also, add intel_iommu=on to the kernel line in /boot/grub/menu.lst

Sure Jack... will check this out...
I had another question: if the PF driver runs not on Linux but on a
separate OS, is there any way we can map the above options to it?


Thanks so much Jack for taking the time to answer my query and
explaining it in a way that i understand. I really appreciate the
help.
I really hope I get past this error..


Thank you...

Best Regards,
Bob






On Wed, Dec 31, 2014 at 2:55 PM, Jack Morgenstein
ja...@dev.mellanox.co.il wrote:
 On Wed, 31 Dec 2014 02:26:07 +0530
 Bob Biloxi iambobbil...@gmail.com wrote:

 Hi,

 I was going through the mlx4 source code and had a few questions
 regarding the generation of interrupts upon execution of the NOP
 command from the VF driver.

 If i am running as a dedicated driver, then NOP seems to work fine(I
 get an interrupt)

 But if I enable SRIOV and then from the VF driver, i run the NOP
 command, I don't receive any interrupt(on the VF side)

 err = mlx4_NOP(dev); //this command when executed from VF driver
 doesn't raise any interrupt.

 I get the following from VF logs:

 [  117.879100] mlx4_core :01:00.0: communication channel command
 0x5 timed out
 [  117.879120] mlx4_core :01:00.0: failed execution of VHCR_POST
 commandopcode 0x31
 [  117.879127] mlx4_core :01:00.0: NOP command failed to generate
 MSI-X interrupt IRQ 24).


 This simply indicates that the VF did not receive a command-completion
 interrupt.


 I have checked the logs and it seems from the VHCR, NOP is received
 properly on the PF side and the HCR command is successful.

 Also GEN_EQE HCR command when executed in response to NOP is also
 successful.( i can see the return status of the command execution)


 What

mlx4: having trouble getting mlx4_NOP to succeed in the VF driver

2014-12-30 Thread Bob Biloxi
Hi,

I was going through the mlx4 source code and had a few questions
regarding the generation of interrupts upon execution of the NOP
command from the VF driver.

If i am running as a dedicated driver, then NOP seems to work fine(I
get an interrupt)

But if I enable SRIOV and then from the VF driver, i run the NOP
command, I don't receive any interrupt(on the VF side)

err = mlx4_NOP(dev); //this command when executed from VF driver
doesn't raise any interrupt.

I get the following from VF logs:

[  117.879100] mlx4_core :01:00.0: communication channel command
0x5 timed out
[  117.879120] mlx4_core :01:00.0: failed execution of VHCR_POST
commandopcode 0x31
[  117.879127] mlx4_core :01:00.0: NOP command failed to generate
MSI-X interrupt IRQ 24).


I have checked the logs and it seems from the VHCR, NOP is received
properly on the PF side and the HCR command is successful.

Also GEN_EQE HCR command when executed in response to NOP is also
successful.( i can see the return status of the command execution)



But on the VF side, the mlx4_eq_int function doesn't get called.

I have checked the return value of request_irq and it seems to be 0(no error)

mlx4_enable_msi_x is also successful.


Can anyone please help me if I am missing something?
Is there anything to be done so as to get interrupts in the mlx4 VF driver?

Can i check at any logs? dmesg output is the only place i was checking.



Also, can the ConnectX hardware generate interrupt to the VF driver?
Or is it that it only generates to the PF driver and PF driver uses
GEN_EQE? I understand that GEN_EQE is used to generate an event
towards a VF..But how are the interrupts routed to the VF driver?


I would be really very much grateful if I can get any kind of help.


Thanks so much !!


Best Regards,
Bob


Re: Query regarding MAD_DEMUX and Secure Host

2014-12-16 Thread Bob Biloxi
Hi Jack,

Thank you so much for clarifying this. Now I understand. It all ties
down to QP1.
To support QP1, we need to support MAD_DEMUX

This is really helpful.

Best Regards,
Bob


On Tue, Dec 16, 2014 at 9:03 PM, Jack Morgenstein
ja...@dev.mellanox.co.il wrote:
 On Mon, 15 Dec 2014 15:07:58 +0530
 Bob Biloxi iambobbil...@gmail.com wrote:

 am I correct in my understanding
 when i say that MAD_DEMUX feature is not required to be
 supported/implemented in Mellanox RoCE Drivers?

 It is required only for Infiniband drivers?

 Actually, you will need to support MAD_DEMUX anyway. If not, the
 CONF_SPECIAL_QP command will fail if Secure Host mode is operating.

 CONF_SPECIAL_QP is required for RoCE as well, since if it is not called
 we will not have QP1.  However, since this command maps QP0 as well to
 a QP, the MAD_DEMUX command is still required.

 -Jack


Query regarding MAD_DEMUX and Secure Host

2014-12-15 Thread Bob Biloxi
Hi,

I was going through the mlx4 code for Secure Host and MAD_DEMUX feature.

I had a few queries..hoping that i can get these clarified.


If i understand correctly mlx4 codebase (mlx4_core/mlx4_en/mlx4_ib
drivers) take care of both RoCE and Infiniband adapters ( in both
dedicated and shared/SRIOV mode)


(1) After reading through the RoCE standard, I understood that Subnet
Management features are not supported in RoCE(they are part of
Infiniband)

(2) Also, as i understand MAD_DEMUX mechanism is used to control which
management traffic is passed to the host.

So by correlating these two (1 & 2), am I correct in my understanding
when i say that MAD_DEMUX feature is not required to be
supported/implemented in Mellanox RoCE Drivers?

It is required only for Infiniband drivers?


I would be really thankful to get my understanding clarified.


Thanks so much


Best Regards,
Bob


Re: FMR Support in multi-function environment

2014-11-11 Thread Bob Biloxi
 In SRIOV, FMR is supported only for the PF, not for VFs (since this
 feature requires writing directly to mapped ICM memory).

Hi,

Thank you so much for pointing to the exact code!!

I have a related question. I was trying to figure out the use case for
FMR in this environment(where in only PF supports FMR).


As per my understanding if an application wants to register huge
amounts of memory and wants to avoid the overhead of SW2HW_MPT HCR
command, it can do so using the alloc fmr verb.
Now in the SRIOV case, the application sits on top of the VF driver
and I was curious as to how the VF communicates with the PF driver to
register memory/map using FMR.

More specifically, I was trying to understand how the following
function gets called by the VF driver:


int mlx4_map_phys_fmr(struct mlx4_dev *dev, struct mlx4_fmr *fmr, u64 *page_list,
		      int npages, u64 iova, u32 *lkey, u32 *rkey)
{
	u32 key;
	int i, err;

	err = mlx4_check_fmr(fmr, page_list, npages, iova);
	if (err)
		return err;

	++fmr->maps;

	key = key_to_hw_index(fmr->mr.key);
	key += dev->caps.num_mpts;
	*lkey = *rkey = fmr->mr.key = hw_index_to_key(key);

	*(u8 *) fmr->mpt = MLX4_MPT_STATUS_SW;

	/* Make sure MPT status is visible before writing MTT entries */
	wmb();

	dma_sync_single_for_cpu(&dev->pdev->dev, fmr->dma_handle,
				npages * sizeof(u64), DMA_TO_DEVICE);

	for (i = 0; i < npages; ++i)
		fmr->mtts[i] = cpu_to_be64(page_list[i] | MLX4_MTT_FLAG_PRESENT);

	dma_sync_single_for_device(&dev->pdev->dev, fmr->dma_handle,
				   npages * sizeof(u64), DMA_TO_DEVICE);

	fmr->mpt->key    = cpu_to_be32(key);
	fmr->mpt->lkey   = cpu_to_be32(key);
	fmr->mpt->length = cpu_to_be64(npages * (1ull << fmr->page_shift));
	fmr->mpt->start  = cpu_to_be64(iova);

	/* Make MTT entries are visible before setting MPT status */
	wmb();

	*(u8 *) fmr->mpt = MLX4_MPT_STATUS_HW;

	/* Make sure MPT status is visible before consumer can use FMR */
	wmb();

	return 0;
}

Because the way i understood, VF can communicate with PF driver by
posting VHCR commands which cause an event to be generated on the PF
side. I can see _WRAPPER calls to handle those cases.

As there doesn't seem to be FMR related VHCR
command(virtual/para-virtual command), I was struggling to understand
how the flow happens for FMR from
application -> kernel -> VF driver -> PF driver.

I would be much grateful, if you can help me understand this.


Thanks so much!! Your replies really helped me improve my understanding.



Best Regards,
Bob





On Tue, Nov 11, 2014 at 4:24 PM, Jack Morgenstein
ja...@dev.mellanox.co.il wrote:
 On Mon, 10 Nov 2014 19:58:46 +0530
 Bob Biloxi iambobbil...@gmail.com wrote:

 Hi,

 Is FMR (Fast Memory Regions) supported in a multi-function mode?

 In SRIOV, FMR is supported only for the PF, not for VFs (since this
 feature requires writing directly to mapped ICM memory).

 You can see this in file drivers/infiniband/hw/mlx4/main.c, function
 mlx4_ib_add() :


 if (!mlx4_is_slave(ibdev->dev)) {
 	ibdev->ib_dev.alloc_fmr     = mlx4_ib_fmr_alloc;
 	ibdev->ib_dev.map_phys_fmr  = mlx4_ib_map_phys_fmr;
 	ibdev->ib_dev.unmap_fmr     = mlx4_ib_unmap_fmr;
 	ibdev->ib_dev.dealloc_fmr   = mlx4_ib_fmr_dealloc;
 }

 i.e., the fmr functions are not put into the device virtual function
 table for slave (= VF) devices.
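
The pattern in the snippet above (leaving optional entries of a function table unset for slave devices) can be shown with a minimal, self-contained C sketch. All names here (toy_dev_ops, toy_fmr_alloc, toy_register_ops, toy_supports_fmr) are hypothetical and exist only for illustration; how the real RDMA core reacts to a missing FMR entry is not modeled.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a device function table. */
struct toy_dev_ops {
	int (*alloc_fmr)(void);   /* NULL means "not supported" */
};

/* Placeholder FMR allocation routine; always succeeds in this toy. */
static int toy_fmr_alloc(void)
{
	return 0;
}

/* Fill in the table. The FMR entry is only set for non-slave
 * (PF or dedicated) devices, mirroring the !mlx4_is_slave() check. */
static void toy_register_ops(struct toy_dev_ops *ops, bool is_slave)
{
	ops->alloc_fmr = NULL;
	if (!is_slave)
		ops->alloc_fmr = toy_fmr_alloc;
}

/* A caller can detect support by checking the pointer. */
static bool toy_supports_fmr(const struct toy_dev_ops *ops)
{
	return ops->alloc_fmr != NULL;
}
```

With this layout a consumer running on a VF finds no FMR entry at all, so no VF-to-PF command for FMR is needed in the first place.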

 -Jack


 If yes, I couldn't find the source code for the same in the mlx4
 codebase. Can anyone please point me to the right location...

 What I was trying to understand is this:

 Suppose a VF driver wants to register large amount of memory using
 FMR, will it be able to do so using the mlx4 code.

 Or FMR is supported only in dedicated mode?


 Thanks

 Best Regards,
 Bob


FMR Support in multi-function environment

2014-11-10 Thread Bob Biloxi
Hi,

Is FMR (Fast Memory Regions) supported in a multi-function mode?

If yes, I couldn't find the source code for the same in the mlx4
codebase. Can anyone please point me to the right location...

What I was trying to understand is this:

Suppose a VF driver wants to register large amount of memory using
FMR, will it be able to do so using the mlx4 code.

Or FMR is supported only in dedicated mode?


Thanks

Best Regards,
Bob


mlx4: RoCE support in SRIOV environment

2014-11-10 Thread Bob Biloxi
Hi,

I was going through the mlx4 code and previous mailing lists when I
came across the following thread:


http://marc.info/?l=linux-rdma&m=134398354428293&w=2


In that thread, it is mentioned as follows:

Some Limitations

1. FMRs are not currently supported on slaves. This will be corrected in a
   future submission.
2. RoCE is not currently supported on slaves. This will be corrected in a
   future submission.


As the thread dates back to 2012, I wanted to confirm if FMR & RoCE are
still not supported in an SRIOV environment (master & slave)?


Thanks so much in advance!!

Best Regards,
Bob


Re: mlx4: RoCE support in SRIOV environment

2014-11-10 Thread Bob Biloxi


 Hi, RoCE support in SRIOV was added in upstream kernel 3.14 :


Hi,

Thank you so much for pointing out when the support was added and also
the commit details.
This is really very helpful!!

I will go through these to further increase my understanding. Thanks so much...

Best Regards,
Bob



On Tue, Nov 11, 2014 at 11:46 AM, Jack Morgenstein
ja...@dev.mellanox.co.il wrote:
 On Mon, 10 Nov 2014 20:05:54 +0530
 Bob Biloxi iambobbil...@gmail.com wrote:

 Hi,

 I was going through the mlx4 code and previous mailing lists when I
 came across the following thread:


 http://marc.info/?l=linux-rdma&m=134398354428293&w=2


 In that thread, it is mentioned as follows:

 Some Limitations
 
 1. FMRs are not currently supported on slaves. This will be corrected
 in a future submission.
 2. RoCE is not currently supported on slaves. This will be corrected
 in a future submission.


 Hi, RoCE support in SRIOV was added in upstream kernel 3.14 :

 commit 39e7d095f9d0a82a78804650917cd57972a480ce
 Merge: 36f6fdb aa9a2d5
 Author: David S. Miller da...@davemloft.net
 Date:   Wed Mar 12 15:57:26 2014 -0400

 Merge branch 'mlx4-next'

 Or Gerlitz says:

 
 mlx4: Add SRIOV support for RoCE

 This series adds SRIOV support for RoCE (RDMA over Ethernet) to the mlx4 
 driver.

 The patches are against net-next, as of commit 2d8d40a pkt_sched: fq:
 do not hold qdisc lock while allocating memory

 changes from V1:
  - addressed feedback from Dave on patch #3 and changed
    get_real_sgid_index() to be called fill_in_real_sgid_index() and be
    a void function.
  - removed some checkpatch warnings on long lines

 changes from V0:
   - always check the return code of mlx4_get_roce_gid_from_slave().
 The call we fixed is introduced in patch #1 and later removed by
 patch #3 that allows guests to have multiple GIDS. The 1..3
 separation was done for proper division of patches to logical changes.
 

 Signed-off-by: David S. Miller da...@davemloft.net


 -Jack


 As the thread dates back to 2012, I wanted to confirm if FMR & RoCE are
 still not supported in an SRIOV environment (master & slave)?


 Thanks so much in advance!!

 Best Regards,
 Bob


Re: FMR Support in multi-function environment

2014-11-10 Thread Bob Biloxi
Hi,

 no, the proprietary FMRs are not supported for mlx4 VFs, nor for mlx5 both
 PF/VFs
 use the fast reg API, see for example this commit 5587856 IB/iser:
 Introduce fast memory registration model (FRWR) how this is done. The API
 was introduced in commit  00f7ec3 RDMA/core: Add memory management
 extensions support


Thanks so much for the detailed information. This is very much helpful.
I will go through these.


Best Regards,
Bob

On Tue, Nov 11, 2014 at 11:55 AM, Or Gerlitz ogerl...@mellanox.com wrote:
 On 11/10/2014 4:28 PM, Bob Biloxi wrote:

 Suppose a VF driver wants to register large amount of memory using
 FMR, will it be able to do so using the mlx4 code.


 no, the proprietary FMRs are not supported for mlx4 VFs, nor for mlx5 both
 PF/VFs


 Or FMR is supported only in dedicated mode?


 use the fast reg API, see for example this commit 5587856 IB/iser:
 Introduce fast memory registration model (FRWR) how this is done. The API
 was introduced in commit  00f7ec3 RDMA/core: Add memory management
 extensions support



mlx4 roce PF driver responsibilities

2014-10-28 Thread Bob Biloxi
Hi All,

I was going through the mlx4 RoCE driver. I wanted to understand the
functionality that is implemented only by the Physical Function Driver
(PF Driver).

Assuming there is no RDMA stack, can anyone please tell me what all a
PF driver needs to take care of?

Is it sufficient if it only implements the HCR commands that are
related to RDMA?

Any CM or SA related things to take care of ?


Any help is really appreciated!!


Thanks so much


Best Regards,
Bob


issue while executing RST2INIT_QP

2014-10-15 Thread Bob Biloxi
Hi All,

I ran into an issue while executing RST2INIT_QP HCR command on mlx4.

The command gets completed, but returns a status 3 instead of 0.

The 3 indicates that the QP number is reserved.

I did allocate ICM for all the QPs. I am using SRIOV mode.

The reserved number of QPs obtained from QUERY_DEV_CAP is 64.

My profile indicates I am using 2048 QPs. so as i understand QP
numbers from 65 to 2047 should not be reserved by the firmware..
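
The arithmetic behind this expectation can be written as a small C check. It is only a sketch under an explicit assumption: that the firmware-reserved QPs occupy the lowest numbers, 0 through reserved_qps - 1. The helper name qpn_is_usable is made up for this example, and any further ranges that a driver may set aside are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: is 'qpn' outside the firmware-reserved range
 * and inside the QP table sized by the profile?
 * Assumption: reserved QPs are numbers 0 .. reserved_qps - 1.
 */
static bool qpn_is_usable(uint32_t qpn, uint32_t reserved_qps,
			  uint32_t num_qps)
{
	return qpn >= reserved_qps && qpn < num_qps;
}
```

Under that assumption, with 64 reserved QPs and a profile of 2048 QPs, the helper accepts numbers 64 through 2047 and rejects 0 through 63 as well as anything at or above 2048.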


Any idea why I am seeing this error?
Please help me debug the same, will be very much grateful to you.


Thanks

Best Regards,
Bob


Re: Can a PF driver access the VF config space, BAR(MMIO) etc?

2014-09-25 Thread Bob Biloxi
Hi,

Thanks so much for the quick reply! This is really helpful.
I will try to do the same and would get back if i face any difficulties.


Thanks again!

Best Regards,
Bob

On Wed, Sep 24, 2014 at 11:12 PM, Sunil Kovvuri sunil.kovv...@gmail.com wrote:
 If you anyway want to simulate VF functionality in PF driver itself,
 i am not sure why do you need to access VF's config space from PF.

 FYI, VF's BAR(MMIO) are not used, MMIO regions are carved using
 VF BARs in PF's SRIOV config space.

 VFx BAR0 = PF SRIOV BAR0 + BAR_SIZE * x (VF_NUMBER);

 For accessing VF's MMIO regions you can try mapping
 PF's pci_dev->resource[PCI_IOV_RESOURCES] and using above formula
 to get exact MMIO base for corresponding VF.
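
The formula above can be expressed as a short, self-contained C helper. This is a sketch only: vf_bar0_base is a made-up name rather than a kernel API, and the base address and per-VF BAR size used in the usage note are invented values, not read from any device.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not a kernel API): compute the MMIO base of
 * VF number 'vf_index' from the start of the PF's SR-IOV BAR0 region
 * and the size of one VF BAR, following
 *     VFx BAR0 = PF SRIOV BAR0 + BAR_SIZE * x
 */
static uint64_t vf_bar0_base(uint64_t pf_iov_bar0_base,
			     uint64_t vf_bar_size,
			     unsigned int vf_index)
{
	return pf_iov_bar0_base + vf_bar_size * (uint64_t)vf_index;
}
```

For example, with an assumed region base of 0x90000000 and an assumed 8 MB (0x800000) BAR per VF, VF 3 would start at 0x90000000 + 3 * 0x800000 = 0x91800000.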

 Regards,
 Sunil.

 On Wed, Sep 24, 2014 at 10:10 PM, Bob Biloxi iambobbil...@gmail.com wrote:
 Hi,

 I am new to writing pci sriov drivers. So i could use your help and
 expertise here

 As I understand once sriov is enabled, the PF driver can access the
 PF(Physical Function) configuration space, BAR(MMIO) etc and the VF
 driver can access the VF(Virtual Function) configuration space,
 BAR(MMIO)...

 Is it possible for a PF driver to access the VF config space, BAR(MMIO)?
 If yes, can you please point me as to what needs to be done in order
 to do it(existing driver sources will be really helpful).

 As for why I need this: the PF driver is ready while the VF
 driver is still under development, and I want to simulate the VF
 functionality from the PF itself.

 It would be of immense help to me if anyone can help me understand my
 aforementioned query.


 Thanks a lot!!!


 Best Regards,
 Bob


Can a PF driver access the VF config space, BAR(MMIO) etc?

2014-09-24 Thread Bob Biloxi
Hi,

I am new to writing pci sriov drivers. So i could use your help and
expertise here

As I understand once sriov is enabled, the PF driver can access the
PF(Physical Function) configuration space, BAR(MMIO) etc and the VF
driver can access the VF(Virtual Function) configuration space,
BAR(MMIO)...

Is it possible for a PF driver to access the VF config space, BAR(MMIO)?
If yes, can you please point me as to what needs to be done in order
to do it(existing driver sources will be really helpful).

As for why I need this: the PF driver is ready while the VF
driver is still under development, and I want to simulate the VF
functionality from the PF itself.

It would be of immense help to me if anyone can help me understand my
aforementioned query.


Thanks a lot!!!


Best Regards,
Bob


Re: mlx4 query in sriov mode

2014-08-29 Thread Bob Biloxi
Hi,

 Where did you find that the communication between PF and VF is by writing its BAR?
http://lxr.free-electrons.com/source/drivers/net/ethernet/mellanox/mlx4/cmd.c#L1964


Best Regards,
Bob

On Fri, Aug 29, 2014 at 9:51 AM, Wei Yang weiy...@linux.vnet.ibm.com wrote:
 On Thu, Aug 28, 2014 at 10:58:50PM +0530, Bob Biloxi wrote:
Hi All,


I really appreciate this wonderful community which has immensely
helped me broaden my knowledge and understanding.


I was going through the mlx4 sriov code, trying to understand the
communication between the VF driver and the PF driver.

I was having a few queries..hoping to get a better understanding.


As I understand, the commands are communicated between VF and PF
through a mechanism called communication channel. VF writes to
specific address in its BAR space, PF gets an event and then proceeds
ahead to read the command from its BAR space and then complete the
execution of it..


Now, my query is, lets say the VF driver is not yet present and only
the PF driver is there...

In this case, can we simulate a VF command write and get notified
through an event?


 Hi,

 I am not that familiar with the mlx4 driver. As you mentioned previously, the VF
 communicates with the PF by writing some word in a BAR and the PF gets it. If this is
 true, I believe it would work.

For eg. we write to some offset in the PF BAR space itself upon
completion of which, an event is generated because of the write? kind
of like loopback mechanism.


I searched through the code but couldn't find anywhere.

Can anyone please help me understand if this is possible? And if there
is any location in the code where i can find this?

 Where did you find that the communication between PF and VF is by writing its BAR?


Thanks a lot in advance!!


Best Regards,
Bob

 --
 Richard Yang
 Help you, Help me



mlx4 query in sriov mode

2014-08-28 Thread Bob Biloxi
Hi All,


I really appreciate this wonderful community which has immensely
helped me broaden my knowledge and understanding.


I was going through the mlx4 sriov code, trying to understand the
communication between the VF driver and the PF driver.

I was having a few queries..hoping to get a better understanding.


As I understand, the commands are communicated between VF and PF
through a mechanism called communication channel. VF writes to
specific address in its BAR space, PF gets an event and then proceeds
ahead to read the command from its BAR space and then complete the
execution of it..


Now, my query is, lets say the VF driver is not yet present and only
the PF driver is there...

In this case, can we simulate a VF command write and get notified
through an event?

For eg. we write to some offset in the PF BAR space itself upon
completion of which, an event is generated because of the write? kind
of like loopback mechanism.


I searched through the code but couldn't find anywhere.

Can anyone please help me understand if this is possible? And if there
is any location in the code where i can find this?

Thanks a lot in advance!!


Best Regards,
Bob


Re: mlx4: using dma_sync_single_for_cpu/dma_sync_single_for_cpu for writing MTT instead of WRITE_MTT hcr command

2014-07-10 Thread Bob Biloxi
Hi Jack,

Thanks so much for the response.. This is really helpful!


Best Regards,
Marc

On Thu, Jul 10, 2014 at 1:25 PM, Jack Morgenstein
ja...@dev.mellanox.co.il wrote:
 On Wed, 9 Jul 2014 18:40:46 +0530
 Bob Biloxi iambobbil...@gmail.com wrote:

 Hi,

 I was going through the mr.c file as part of understanding WRITE_MTT
 command in the mlx4 code.

 I could see that instead of issuing the WRITE_MTT HCR command, in case
 of SRIOV, we're directly accessing the ICM space for the MTT Table,
 taking the ownership and updating it. We're doing this using
 dma_sync_single_for_cpu and dma_sync_single_for_device.

 I was curious as to why this approach was chosen instead of using the
 HCR command.

 Can anyone please explain the reason/motivation behind this approach?

 Performance. Direct write to memory is much faster than via HCR

 -Jack



 Thanks so much,


 Best Regards,
 Marc


mlx4: using dma_sync_single_for_cpu/dma_sync_single_for_cpu for writing MTT instead of WRITE_MTT hcr command

2014-07-09 Thread Bob Biloxi
Hi,

I was going through the mr.c file as part of understanding WRITE_MTT
command in the mlx4 code.

I could see that instead of issuing the WRITE_MTT HCR command, in case
of SRIOV, we're directly accessing the ICM space for the MTT Table,
taking the ownership and updating it. We're doing this using
dma_sync_single_for_cpu and dma_sync_single_for_device.

I was curious as to why this approach was chosen instead of using the
HCR command.

Can anyone please explain the reason/motivation behind this approach?



Thanks so much,


Best Regards,
Marc


Re: mlx4 - query regarding PF VF functionality division

2014-06-26 Thread Bob Biloxi
 I see, by the nature of the mlx4 SRIOV architecture under which
 there's no dedicated/separated PF vs VF driver, this is indeed a
 non-trivial task, need to think how to make it easier for you
Thanks for considering. It would be really helpful. I actually
appreciate the mlx4 architecture wherein 3 separate drivers are part
of 1 integrated code base which I think is really difficult to
accomplish. I mean the dedicated driver, VF driver, PF driver are all
part of single code base which i think is good.

The only thing is when one wants to understand these pieces (VF or PF)
separately, it takes some effort(as in my case)

For e.g

there is this flow:

mlx4_en_get_qp() -> mlx4_register_mac() -> mlx4_slave_cmd()

After this, the wrapper function gets called, whether for the
master (PF) or for a slave (VF).

Till now I was assuming that this makes sense only for the slave (VF),
because register_mac is supposed to be called by the VF. But from the code,
even the PF makes the call, so I was somewhat confused.

It would be of immense help if I could understand this somehow.


On Thu, Jun 26, 2014 at 2:13 AM, Or Gerlitz or.gerl...@gmail.com wrote:
 On Tue, Jun 24, 2014 at 4:42 PM, Bob Biloxi iambobbil...@gmail.com wrote:

  Not really, but let's take EIM approach, what's your goal/mission?
 Let's say I am going through code to understand only the PF related
 functionality without bothering about VF, what are the things I need
 to keep in mind.



 I see, by the nature of the mlx4 SRIOV architecture under which
 there's no dedicated/separated PF vs VF driver, this is indeed a
 non-trivial task, need to think how to make it easier for you

 Or.



 I mean do i need to :

 a. Go through all the files in mlx4
 b. If the code is not specifically under mlx4_is_master, I still need
 to understand it because it is also part of PF driver
 c. I can ignore the logic which executes if mlx4_is_master is false,
 because that would be the VF code

 Thanks for the help!!




 On Mon, Jun 23, 2014 at 5:30 PM, Or Gerlitz or.gerl...@gmail.com wrote:
  On Mon, Jun 23, 2014 at 12:33 PM, Bob Biloxi iambobbil...@gmail.com wrote:
  [...]
  Is there any way we can clearly separate the files that are used by PF
  vs the files that are used by VF in the (drivers/net/ethernet/mlx4
  sub-directory)?
  [...]
 
  Not really, but let's take EIM approach, what's your goal/mission?
 
  Or.


Re: mlx4 - query regarding PF VF functionality division

2014-06-24 Thread Bob Biloxi
 Not really, but let's take EIM approach, what's your goal/mission?
Let's say I am going through the code to understand only the PF-related
functionality without bothering about the VF. What are the things I need
to keep in mind?
I mean, do I need to:

a. Go through all the files in mlx4
b. If the code is not specifically under mlx4_is_master, I still need
to understand it because it is also part of the PF driver
c. I can ignore the logic which executes if mlx4_is_master is false,
because that would be the VF code

Thanks for the help!!




On Mon, Jun 23, 2014 at 5:30 PM, Or Gerlitz or.gerl...@gmail.com wrote:
 On Mon, Jun 23, 2014 at 12:33 PM, Bob Biloxi iambobbil...@gmail.com wrote:
 [...]
 Is there any way we can clearly separate the files that are used by PF
 vs the files that are used by VF in the (drivers/net/ethernet/mlx4
 sub-directory)?
 [...]

 Not really, but let's take EIM approach, what's your goal/mission?

 Or.


mlx4 - query regarding PF VF functionality division

2014-06-23 Thread Bob Biloxi
Hi All,

I was going through the Mellanox driver (mlx4) and had
difficulty understanding which portion of the code is executed by
the PF (Physical Function) driver and which portion by the
VF (Virtual Function) driver in SRIOV mode.

My confusion arises because I was of the understanding that the QPs, CQs
(and their creation, state management commands), etc. are handled by
the virtual function driver (VF driver).


And the role of the physical function driver(PF driver) is to just
take care of the resource_tracker.c and ICM allocation.


But of late, I think I may have understood it wrong. This is because
there is code that is executed specifically when mlx4_is_master is
true or false (indicating PF or VF).

And then there is code which is not surrounded by this test, which
indicates it is executed in both cases (PF driver as well as VF
driver).
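
To make the three cases concrete, here is a minimal sketch in Python (a
model, not driver code). It assumes the role is kept as flag bits on the
device and that mlx4_is_master, mlx4_is_slave and mlx4_is_mfunc are simple
tests on those bits. The bit values and the who_runs helper are
hypothetical, chosen only for illustration.

```python
# Illustrative model only: these bit values are assumptions, not taken
# from the driver headers.
FLAG_MASTER = 1 << 2   # device is the PF in SRIOV mode
FLAG_SLAVE = 1 << 3    # device is a VF in SRIOV mode


def is_master(flags):
    """Model of an mlx4_is_master-style test."""
    return bool(flags & FLAG_MASTER)


def is_slave(flags):
    """Model of an mlx4_is_slave-style test."""
    return bool(flags & FLAG_SLAVE)


def is_mfunc(flags):
    """Model of an mlx4_is_mfunc-style test: true for PF or VF."""
    return bool(flags & (FLAG_MASTER | FLAG_SLAVE))


def who_runs(guard):
    """Return the driver roles that execute a code path with this guard."""
    roles = {"native": 0, "PF": FLAG_MASTER, "VF": FLAG_SLAVE}
    return [name for name, flags in roles.items() if guard(flags)]


if __name__ == "__main__":
    print("guarded by is_master:", who_runs(is_master))
    print("guarded by is_slave: ", who_runs(is_slave))
    print("guarded by is_mfunc: ", who_runs(is_mfunc))
    print("unguarded:           ", who_runs(lambda flags: True))
```

Read this way, unguarded code is shared by the native (non-SRIOV) driver,
the PF and the VF, so it cannot be skipped when studying the PF alone.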

Is my understanding correct? If yes, then is the functionality related
to QPs, CQs and Ethernet tx/rx executed by both master and slave?

Is there any way we can clearly separate the files that are used by PF
vs the files that are used by VF in the (drivers/net/ethernet/mlx4
sub-directory)?

I would be really thankful and really appreciate all the
help/clarification I can get in understanding this.


Thank you so much.


Best Regards,
Bob


DMA window vs actual allocatable DMA memory

2014-06-11 Thread Bob Biloxi
Hi All,

I am having trouble understanding the DMA window and the actual amount of
addressable DMA memory.

I hope someone can explain it to me. Let me put my understanding and
doubts here:

Let's say I am writing code for an Ethernet device driver in a
virtualisation (hypervisor) environment.

Now, if the Ethernet adapter requires a certain amount of DMA memory, I
need to allocate heap memory, DMA map it and provide it to the
adapter.

From the hardware perspective, we have a 64GB DMA window.

I am having trouble understanding this value. Does it mean I can
allocate 64GB of RAM (heap memory) and DMA map it?

As I understand it, there might be a table that translates bus addresses
to physical (RAM) addresses. Each entry of such a table points to a 4KB
page. If the size of each entry is 8 bytes and there are 16M such
entries (16M * 4K = 64GB DMA window), the size of the table comes to
128M.
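
The arithmetic above can be checked with a short Python sketch. The 4KB
page size, 8-byte entry size and 64GB window are the figures from this
message; the variable names are only for illustration.

```python
KIB = 1 << 10
MIB = 1 << 20
GIB = 1 << 30

page_size = 4 * KIB    # each table entry maps one 4KB page
entry_size = 8         # bytes per translation entry
window = 64 * GIB      # size of the DMA window

# Number of entries needed to cover the whole window.
entries = window // page_size

# Memory taken by the translation table itself.
table_bytes = entries * entry_size

if __name__ == "__main__":
    print("entries:", entries // (1 << 20), "M")
    print("table size:", table_bytes // MIB, "MiB")
```

So the two numbers describe different things: 64GB is the range of bus
addresses the table can cover, and 128M is the memory the table itself
occupies when fully populated.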

Now do I have 64GB of DMA memory or 128M of DMA memory?

I want to know the maximum amount of memory that I can
allocate (heap), DMA map and provide to the adapter.

I will be really thankful for all the help that I can get!!

Thanks so much


Best Regards,
Marc


Re: DMA window vs actual allocatable DMA memory

2014-06-11 Thread Bob Biloxi
Thanks so much Bjorn for the reply. This is very helpful. Now my
understanding is clearer.
I was curious whether there is any limit from the operating
system (hypervisor) as to how much DMA memory drivers can map.
I'll try to investigate.

On Thu, Jun 12, 2014 at 5:42 AM, Bjorn Helgaas bhelg...@google.com wrote:
 On Wed, Jun 11, 2014 at 12:23 PM, Bob Biloxi iambobbil...@gmail.com wrote:
 Hi All,

 I am having trouble understanding DMA window and actual amount of
 addressable DMA memory.

 I hope someone explains me. Let me put my understanding and doubts here:

 Let's say I am writing code for an ethernet device driver in the
 virtualisation(hypervisor) environment.

 Now, if the ethernet adapter requires certain amount of DMA memory, I
 need to allocate heap memory and dma map it and provide to the
 adapter.

 From the hardware perspective, we have a 64GB DMA window.

 I am having trouble understanding this value. Does it mean i can
 allocate 64GB of RAM(Heap memory) and dma map it?

 As i understand there might be a table that translates bus address to
 physical(RAM) addresses. Each entry of such table points to a 4KB
 page. If the size of each entry is 8 bytes and there are 16M such
 entries( 16M * 4K = 64GB DMA window), the size of the table comes to
 around 128M

 Now do I have 64GB DMA memory or 128M DMA memory?

 Documentation/DMA-API-HOWTO.txt might help answer your questions.

 The bus address to RAM address translation is done by the IOMMU
 hardware.  The tables you mention are I/O page tables used by the
 IOMMU.  The 128M occupied by the tables is kernel bookkeeping overhead
 and has nothing to do with the adapter itself.

 The 64GB DMA window might be a hardware feature of the device, i.e.,
 maybe it can only generate 36-bit DMA addresses.  That doesn't mean
 you have to allocate memory for the whole window; I would guess
 drivers would only allocate and map what they need.  I don't know how
 they figure out how much to map.
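
 The 36-bit figure can be checked with a small Python sketch. The
 dma_bit_mask helper below mirrors the idea of the kernel's DMA_BIT_MASK
 macro; the fits helper and the sample addresses are hypothetical and
 only illustrate what an address limit means.

```python
GIB = 1 << 30


def dma_bit_mask(bits):
    """Highest address a device with `bits` address bits can generate."""
    return (1 << 64) - 1 if bits == 64 else (1 << bits) - 1


def fits(bus_addr, size, bits):
    """True if the buffer [bus_addr, bus_addr + size) is fully addressable."""
    return bus_addr + size - 1 <= dma_bit_mask(bits)


if __name__ == "__main__":
    # A 36-bit device addresses exactly 64GB of bus address space.
    print((dma_bit_mask(36) + 1) // GIB, "GB window")
    print(fits(64 * GIB - 4096, 4096, 36))  # last page inside the window
    print(fits(64 * GIB, 4096, 36))         # first page past the window
```

 The limit constrains where mapped buffers may sit in bus address space,
 not how much memory a driver has to allocate.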

 I want to know what is the max amount of memory that I can
 allocate(heap), dma map and provide it to the adapter.

 I will be really thankful in all the help that I can get!!

 Thanks so much


 Best Regards,
 Marc


Re: mlx4 qp allocation

2014-02-14 Thread Bob Biloxi
Hi Jack,

Thanks so much for clarifying my understanding!!

Best Regards,
Bob

On Thu, Feb 13, 2014 at 7:08 PM, Jack Morgenstein
ja...@dev.mellanox.co.il wrote:
 On Thu, 13 Feb 2014 00:18:22 +0530
 Bob Biloxi iambobbil...@gmail.com wrote:

 The VFs need to allocate the memory for Send Queue Buffer, Receive
 Queue Buffer, Completion Queue Buffer, Event Queue Buffer.

 Is that right?

 Yes.


 Also, as the QPs, CQs etc are created by the HCA when ALLOC_RES
 command is issued, does the PF driver need to maintain anything to
 associate the QPs, CQs created by the HCA with owners(VFs) possessing
 them?

 Of course. These resources must be de-allocated if, for example, the
 VM running the VF crashes -- or we have a resource leak.

 This also is used for security checking, to make sure that a VF does
 not mess around with resources that do not belong to it.
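
 The bookkeeping described above can be sketched as a small Python model.
 This is not the code in resource_tracker.c: the ResourceTracker class and
 its method names are hypothetical, and it only shows the three duties
 mentioned here (record the owner, check the owner, release everything a
 dead VF held).

```python
class ResourceTracker:
    """Toy model of PF-side tracking of which slave owns which resource."""

    def __init__(self):
        self._owner = {}  # (res_type, res_id) -> slave id

    def alloc(self, slave, res_type, res_id):
        """Record that `slave` now owns the resource."""
        key = (res_type, res_id)
        if key in self._owner:
            raise ValueError("resource already allocated: %r" % (key,))
        self._owner[key] = slave

    def check_owner(self, slave, res_type, res_id):
        """Security check: may `slave` operate on this resource?"""
        return self._owner.get((res_type, res_id)) == slave

    def free_all(self, slave):
        """Release everything owned by `slave`, e.g. after its VM crashed."""
        stale = [key for key, owner in self._owner.items() if owner == slave]
        for key in stale:
            del self._owner[key]
        return stale


if __name__ == "__main__":
    tracker = ResourceTracker()
    tracker.alloc(1, "QP", 0x40)
    tracker.alloc(2, "CQ", 0x80)
    print(tracker.check_owner(2, "QP", 0x40))  # VF 2 touching VF 1's QP
    print(tracker.free_all(1))                 # cleanup after VF 1 dies
```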

 -Jack



Re: mlx4 qp allocation

2014-02-12 Thread Bob Biloxi
Hi Jack,

Thanks for the reply. Now I understand. On a related note, I have the
following question and would really appreciate it if you could help
answer it:

Considering the resources (QPs, CQs, EQs, etc.), after going through the
code my understanding is that:

1. Physical Function Driver/Hypervisor allocates memory only for the
ICM space for these resources.
2. Virtual Function Driver needs to allocate corresponding system
memory for the resources

For example, let's say I need 32K QPs, 64K CQs and 512 EQs; the PF driver
allocates the memory only for the ICM.

The VFs need to allocate the memory for Send Queue Buffer, Receive
Queue Buffer, Completion Queue Buffer, Event Queue Buffer.

Is that right?

Also, as the QPs, CQs etc are created by the HCA when ALLOC_RES
command is issued, does the PF driver need to maintain anything to
associate the QPs, CQs created by the HCA with owners(VFs) possessing
them?

I would really appreciate your help!

Thanks so much..

Best Regards,
Bob


On Tue, Feb 11, 2014 at 5:01 PM, Jack Morgenstein
ja...@dev.mellanox.co.il wrote:
 On Wed, 29 Jan 2014 15:52:09 +0530
 Bob Biloxi iambobbil...@gmail.com wrote:

 These paths are taken based on the return value of mlx4_is_mfunc(dev).
 This is true for MASTER or SLAVE which I believe is Physical Function
 Driver/Virtual Function Driver. So for SRIOV, it covers all cases.

 The MAP_ICM portion which gets executed as part of __mlx4_qp_alloc_icm
 never gets called??

 For slaves (VFs), the command is sent via the comm channel to the
 Hypervisor.  It is the Hypervisor which invokes map_icm on behalf of
 that slave.
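
 The flow described above can be modelled in a few lines of Python. This
 is a simplified sketch, not the driver: the Dev class, the calls log and
 the function names are hypothetical stand-ins, and the comm channel is
 reduced to a direct function call into the PF. It assumes that the PF's
 own requests also pass through the wrapper.

```python
calls = []  # log of the steps that ran, for illustration only


class Dev:
    def __init__(self, role, master=None):
        self.role = role      # "native", "master" (PF) or "slave" (VF)
        self.master = master  # the PF device that serves a slave


def is_mfunc(dev):
    """True for both PF and VF, i.e. whenever SRIOV is active."""
    return dev.role in ("master", "slave")


def low_level_qp_alloc_icm(dev, qpn):
    """Stand-in for __mlx4_qp_alloc_icm: the step that really maps ICM."""
    calls.append(("map_icm", dev.role, qpn))


def alloc_res_wrapper(master, slave_id, qpn):
    """Stand-in for the PF-side handler of the ALLOC_RES virtual command."""
    calls.append(("wrapper", slave_id, qpn))
    low_level_qp_alloc_icm(master, qpn)


def qp_alloc_icm(dev, qpn, slave_id=0):
    """Stand-in for mlx4_qp_alloc_icm with its two paths."""
    if not is_mfunc(dev):
        # Native mode: call the low-level function directly.
        low_level_qp_alloc_icm(dev, qpn)
        return
    # SRIOV: issue the virtual command; it is always executed on the PF.
    master = dev if dev.role == "master" else dev.master
    alloc_res_wrapper(master, slave_id, qpn)


if __name__ == "__main__":
    pf = Dev("master")
    vf = Dev("slave", master=pf)
    qp_alloc_icm(vf, 7, slave_id=3)
    print(calls)
```

 In this model the low-level mapping step still runs for a VF request; it
 just runs in the PF's context rather than in the VF's.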

 -Jack


mlx4 qp allocation

2014-01-29 Thread Bob Biloxi
Hi,

I was going through linux/drivers/net/ethernet/mellanox/mlx4/qp.c

I have a few questions and would really appreciate it if someone could
clarify them.

In the function mlx4_qp_alloc_icm, there are 2 paths taken to allocate
a QP:

using the ALLOC_RES virtual command

using the MAP_ICM

These paths are taken based on the return value of mlx4_is_mfunc(dev).
This is true for MASTER or SLAVE, which I believe correspond to the
Physical Function driver and the Virtual Function driver. So for SRIOV,
it covers all cases.

Does that mean the MAP_ICM portion, which gets executed as part of
__mlx4_qp_alloc_icm, never gets called?

Am I understanding it properly? As per my understanding, ICM
needs to be allocated for all the QPs.

Please help me in understanding this.

Thanks so much.

Best Regards,

Bob