Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-19 Thread Or Gerlitz
On Thu, Mar 17, 2016 at 3:40 AM, Alexey Kardashevskiy  wrote:
> On 03/16/2016 08:45 PM, Or Gerlitz wrote:
>> On Wed, Mar 16, 2016 at 10:34 AM, Alexey Kardashevskiy 
>> wrote:
>>
>>> Oh. ok. It also looks like even with the reverted patch, mlx4 VF does not
>>> work in a guest:
>>
>>
>> So where is the breakage point for you? Does 4.4 work? If not, what?

> Ah, my bad. It is unrelated to the kernel version.
> I tried passing a PF to a guest while its VFs were already passed to another
> guest, to see how exactly it blows up (AER/EEH were thrown but the host
> recovered => good), but this left the device in a weird state where I could
> not use a VF in a guest anymore, although it seemed to keep working on the
> host.

> It seems like the actual adapter does not reset completely when the machine
> is rebooted; I had to unplug/replug the power cables to fix this.

So to make sure: do things now work fine with the patch reverted?


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-18 Thread Alexey Kardashevskiy

On 03/16/2016 08:45 PM, Or Gerlitz wrote:
> On Wed, Mar 16, 2016 at 10:34 AM, Alexey Kardashevskiy wrote:
>
>> Oh. ok. It also looks like even with the reverted patch, mlx4 VF does not
>> work in a guest:
>
> So where is the breakage point for you? Does 4.4 work? If not, what?

Ah, my bad. It is unrelated to the kernel version.

I tried passing a PF to a guest while its VFs were already passed to another
guest, to see how exactly it blows up (AER/EEH were thrown but the host
recovered => good), but this left the device in a weird state where I could
not use a VF in a guest anymore, although it seemed to keep working on the
host.

It seems like the actual adapter does not reset completely when the machine
is rebooted; I had to unplug/replug the power cables to fix this.



--
Alexey


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-16 Thread Or Gerlitz
On Wed, Mar 16, 2016 at 10:34 AM, Alexey Kardashevskiy  wrote:

> Oh. ok. It also looks like even with the reverted patch, mlx4 VF does not
> work in a guest:

So where is the breakage point for you? Does 4.4 work? If not, what?


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-16 Thread Alexey Kardashevskiy

On 03/16/2016 05:09 PM, Eli Cohen wrote:
> On Wed, Mar 16, 2016 at 04:49:00PM +1100, Alexey Kardashevskiy wrote:
>> On 03/16/2016 04:10 PM, Eli Cohen wrote:
>>> On Wed, Mar 16, 2016 at 01:07:58PM +1100, Alexey Kardashevskiy wrote:
>>>>
>>>> So with v4.5 as a host, there is no actual distro available today to
>>>> use as a guest in the next 6 months (or whatever it takes to
>>>> backport this particular patch back there).
>>>>
>>>> You could have added a module parameter to enforce the old behavior,
>>>> at least...
>>>>
>>>> And sorry but from the original commit log I could not understand
>>>> why exactly all existing guests need to be broken. Could you please
>>>> point me to a piece of documentation describing all this UAR
>>>> business (what is UAR, why 128 UARs are required and for what, etc).
>>>> Thanks.
>>>
>>> We are going to send a patch that fixes this using a module parameter.
>>> The patch will be on top of Huy's patch.
>>>
>>> Some background to the problem: mlx4 supported devices require 128 UAR
>>
>> What does UAR stand for?
>
> User Access Region. It's the way you interface with the hardware.
>
>>> pages from PCI memory space defined by BAR2-3. Each UAR page can be
>>> any power of 2 value from 4K up to 64K. Before Huy's patch the driver
>>> chose UAR page size to be equal to system page size. Since PowerPC's
>>> page size is 64K this means minimum requirement of UAR pages is not
>>> met (default UAR BAR is 8MB and only half of it is really reserved for
>>> UARs).
>>
>> And what was the downside? afaict the performance was good...
>
> It's not a performance issue. Defining 64KB for a UAR is not required
> and wastes pci memory mapped i/o space.
>
>>> More details can be found in the programmer's manual.
>>
>> Can you please point me to this manual on the website? I tried,
>> honestly, could not find it. Thanks.
>
> It's not publicly available. If you have an FAE that works with your
> company you can ask him how to get the doc.



Oh. ok. It also looks like even with the reverted patch, mlx4 VF does not 
work in a guest:


root@le-dbg:~# dhclient eth0
mlx4_en: eth0:   frag:0 - size:1518 prefix:0 stride:1536

mlx4_core :00:00.0: Internal error detected on the communication channel
mlx4_core :00:00.0: device is going to be reset
mlx4_core :00:00.0: VF reset is not needed
mlx4_core :00:00.0: device was reset successfully
mlx4_en :00:00.0: Internal error detected, restarting device
mlx4_core :00:00.0: command 0x5 failed: fw status = 0x1
mlx4_core :00:00.0: Failed to close slave function
mlx4_core :00:00.0: Detected virtual function - running in slave mode
mlx4_core :00:00.0: Sending reset


mlx4_core :00:00.0: slave is currently in the middle of FLR - Deferring probe
mlx4_core :00:00.0: mlx4_restart_one: ERROR: mlx4_load_one failed, pci_name=:00:00.0, err=-517

mlx4_core :00:00.0: mlx4_restart_one was ended, ret=-517

root@le-dbg:~# ifconfig -a
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:65536  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B)  TX bytes:0 (0.0 B)

root@le-dbg:~# lspci -v
00:00.0 Ethernet controller: Mellanox Technologies MT27500/MT27520 Family [ConnectX-3/ConnectX-3 Pro Virtual Function]
        Subsystem: IBM Device 61b0
        Physical Slot: C16
        Flags: bus master, fast devsel, latency 0
        Memory at 1012000 (64-bit, prefetchable) [size=64M]
        Capabilities: [60] Express Endpoint, MSI 00
        Capabilities: [9c] MSI-X: Enable- Count=52 Masked-
        Capabilities: [40] Power Management version 0
        Kernel driver in use: mlx4_core



--
Alexey


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-16 Thread Eli Cohen
On Wed, Mar 16, 2016 at 04:49:00PM +1100, Alexey Kardashevskiy wrote:
> On 03/16/2016 04:10 PM, Eli Cohen wrote:
> >On Wed, Mar 16, 2016 at 01:07:58PM +1100, Alexey Kardashevskiy wrote:
> >>
> >>So with v4.5 as a host, there is no actual distro available today to
> >>use as a guest in the next 6 months (or whatever it takes to
> >>backport this particular patch back there).
> >>
> >>You could have added a module parameter to enforce the old behavior,
> >>at least...
> >>
> >>And sorry but from the original commit log I could not understand
> >>why exactly all existing guests need to be broken. Could you please
> >>point me to a piece of documentation describing all this UAR
> >>business (what is UAR, why 128 UARs are required and for what, etc).
> >>Thanks.
> >>
> >
> >We are going to send a patch that fixes this using a module parameter.
> >The patch will be on top of Huy's patch.
> >
> >Some background to the problem: mlx4 supported devices require 128 UAR
> 
> What does UAR stand for?
User Access Region. It's the way you interface with the hardware.
> 
> >pages from PCI memory space defined by BAR2-3. Each UAR page can be
> >any power of 2 value from 4K up to 64K. Before Huy's patch the driver
> >chose UAR page size to be equal to system page size. Since PowerPC's
> >page size is 64K this means minimum requirement of UAR pages is not
> >met (default UAR BAR is 8MB and only half of it is really reserved for
> >UARs).
> 
> And what was the downside? afaict the performance was good...
>

It's not a performance issue. Defining 64KB for a UAR is not required
and wastes pci memory mapped i/o space.

> 
> >More details can be found in the programmer's manual.
> 
> Can you please point me to this manual on the website? I tried,
> honestly, could not find it. Thanks.
>
It's not publicly available. If you have an FAE that works with your
company you can ask him how to get the doc.
> 
> -- 
> Alexey


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Alexey Kardashevskiy

On 03/16/2016 04:10 PM, Eli Cohen wrote:
> On Wed, Mar 16, 2016 at 01:07:58PM +1100, Alexey Kardashevskiy wrote:
>>
>> So with v4.5 as a host, there is no actual distro available today to
>> use as a guest in the next 6 months (or whatever it takes to
>> backport this particular patch back there).
>>
>> You could have added a module parameter to enforce the old behavior,
>> at least...
>>
>> And sorry but from the original commit log I could not understand
>> why exactly all existing guests need to be broken. Could you please
>> point me to a piece of documentation describing all this UAR
>> business (what is UAR, why 128 UARs are required and for what, etc).
>> Thanks.
>
> We are going to send a patch that fixes this using a module parameter.
> The patch will be on top of Huy's patch.
>
> Some background to the problem: mlx4 supported devices require 128 UAR

What does UAR stand for?

> pages from PCI memory space defined by BAR2-3. Each UAR page can be
> any power of 2 value from 4K up to 64K. Before Huy's patch the driver
> chose UAR page size to be equal to system page size. Since PowerPC's
> page size is 64K this means minimum requirement of UAR pages is not
> met (default UAR BAR is 8MB and only half of it is really reserved for
> UARs).

And what was the downside? afaict the performance was good...

> More details can be found in the programmer's manual.

Can you please point me to this manual on the website? I tried, honestly,
could not find it. Thanks.



--
Alexey


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Eli Cohen
On Wed, Mar 16, 2016 at 01:07:58PM +1100, Alexey Kardashevskiy wrote:
> 
> So with v4.5 as a host, there is no actual distro available today to
> use as a guest in the next 6 months (or whatever it takes to
> backport this particular patch back there).
> 
> You could have added a module parameter to enforce the old behavior,
> at least...
> 
> And sorry but from the original commit log I could not understand
> why exactly all existing guests need to be broken. Could you please
> point me to a piece of documentation describing all this UAR
> business (what is UAR, why 128 UARs are required and for what, etc).
> Thanks.
> 

We are going to send a patch that fixes this using a module parameter.
The patch will be on top of Huy's patch.

Some background to the problem: mlx4 supported devices require 128 UAR
pages from PCI memory space defined by BAR2-3. Each UAR page can be
any power of 2 value from 4K up to 64K. Before Huy's patch the driver
chose UAR page size to be equal to system page size. Since PowerPC's
page size is 64K this means minimum requirement of UAR pages is not
met (default UAR BAR is 8MB and only half of it is really reserved for
UARs).
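
To make the arithmetic above concrete, here is a small illustrative sketch. It uses only the numbers quoted in this thread (8MB default UAR BAR, half of it reserved for UARs, 128 UAR pages required); the constant and function names are invented for the example and are not taken from the mlx4 driver.

```python
# Illustrative arithmetic only; all names are hypothetical, not mlx4 driver code.
KIB = 1024
MIB = 1024 * KIB

UAR_BAR_SIZE = 8 * MIB            # default UAR BAR size quoted in the thread
UAR_RESERVED = UAR_BAR_SIZE // 2  # "only half of it is really reserved for UARs"
REQUIRED_UARS = 128               # minimum number of UAR pages quoted in the thread


def uar_pages_available(uar_page_size: int) -> int:
    """Number of UAR pages that fit in the reserved half of the BAR."""
    return UAR_RESERVED // uar_page_size


for page_size in (4 * KIB, 64 * KIB):
    pages = uar_pages_available(page_size)
    verdict = "ok" if pages >= REQUIRED_UARS else "too few"
    print(f"UAR page size {page_size // KIB}K: {pages} pages ({verdict})")
```

Under these assumptions a 64K UAR page yields 64 pages, below the 128 required, while a 4K UAR page yields 1024.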

More details can be found in the programmer's manual.


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Alexey Kardashevskiy

On 03/15/2016 09:40 PM, Or Gerlitz wrote:
> On Tue, Mar 15, 2016 at 12:19 PM, Alexey Kardashevskiy wrote:
>> This reverts commit 85743f1eb34548ba4b056d2f184a3d107a3b8917.
>>
>> Without this revert, POWER "pseries" KVM guests with a VF passed to a guest
>> using VFIO fail to bring the driver up:
>>
>> mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)
>> mlx4_core: Initializing :00:00.0
>> mlx4_core :00:00.0: enabling device ( -> 0002)
>> mlx4_core :00:00.0: Detected virtual function - running in slave mode
>> mlx4_core :00:00.0: Sending reset
>> mlx4_core :00:00.0: Sending vhcr0
>> mlx4_core :00:00.0: HCA minimum page size:512
>> mlx4_core :00:00.0: UAR size:4096 != kernel PAGE_SIZE of 65536
>> mlx4_core :00:00.0: Failed to obtain slave caps
>>
>> Both host and guest use 64K system pages.
>>
>> How to fix this properly? Thanks.
>
> The commit message says:
>
> "[..] Regarding backward compatibility in SR-IOV, if hypervisor has
> this new code, the virtual OS must be updated. [...]"



So with v4.5 as a host, there is no actual distro available today to use as
a guest in the next 6 months (or whatever it takes to backport this
particular patch back there).


You could have added a module parameter to enforce the old behavior, at
least...


And sorry but from the original commit log I could not understand why
exactly all existing guests need to be broken. Could you please point me to
a piece of documentation describing all this UAR business (what is UAR, why
128 UARs are required and for what, etc). Thanks.



--
Alexey


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Alexey Kardashevskiy

On 03/16/2016 02:29 AM, Christoph Hellwig wrote:
> On Tue, Mar 15, 2016 at 04:23:33PM +0200, Or Gerlitz wrote:
>> Let us check. I was under (the maybe wrong) impression, that before this
>> patch both PF/VF drivers were not operative on some systems, so on those
>> systems it's fair to require the VF driver to be patched too.
>
> To me it sounds like the system worked before. Alexey, can you confirm?

It worked just fine for year(s), this is definitely a regression.


--
Alexey


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Christoph Hellwig
On Tue, Mar 15, 2016 at 04:23:33PM +0200, Or Gerlitz wrote:
> Let us check. I was under (the maybe wrong) impression, that before this
> patch both PF/VF drivers were not operative on some systems, so on those
> systems it's fair to require the VF driver to be patched too.

To me it sounds like the system worked before. Alexey, can you confirm?


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Or Gerlitz
On Tue, Mar 15, 2016 at 2:18 PM, Christoph Hellwig  wrote:
> On Tue, Mar 15, 2016 at 12:40:06PM +0200, Or Gerlitz wrote:
>> "[..] Regarding backward compatibility in SR-IOV, if hypervisor has
>> this new code, the virtual OS must be updated. [...]"

> Which is broken, we can't break user or guest VM ABIs ever.

Let us check. I was under (the maybe wrong) impression, that before this
patch both PF/VF drivers were not operative on some systems, so on those
systems it's fair to require the VF driver to be patched too.

Or.


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Christoph Hellwig
On Tue, Mar 15, 2016 at 12:40:06PM +0200, Or Gerlitz wrote:
> "[..] Regarding backward compatibility in SR-IOV, if hypervisor has
> this new code, the virtual OS must be updated. [...]"

Which is broken, we can't break user or guest VM ABIs ever.


Re: [RFC PATCH kernel] Revert "net/mlx4_core: Set UAR page size to 4KB regardless of system page size"

2016-03-15 Thread Or Gerlitz
On Tue, Mar 15, 2016 at 12:19 PM, Alexey Kardashevskiy  wrote:
> This reverts commit 85743f1eb34548ba4b056d2f184a3d107a3b8917.
>
> Without this revert, POWER "pseries" KVM guests with a VF passed to a guest
> using VFIO fail to bring the driver up:
>
> mlx4_core: Mellanox ConnectX core driver v2.2-1 (Feb, 2014)
> mlx4_core: Initializing :00:00.0
> mlx4_core :00:00.0: enabling device ( -> 0002)
> mlx4_core :00:00.0: Detected virtual function - running in slave mode
> mlx4_core :00:00.0: Sending reset
> mlx4_core :00:00.0: Sending vhcr0
> mlx4_core :00:00.0: HCA minimum page size:512
> mlx4_core :00:00.0: UAR size:4096 != kernel PAGE_SIZE of 65536
> mlx4_core :00:00.0: Failed to obtain slave caps

> Both host and guest use 64K system pages.
>
> How to fix this properly? Thanks.
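
For readers following the log: the last two lines suggest that the guest rejects a UAR page size that differs from its own kernel page size. A minimal sketch of such a check, inferred only from those log lines (the function name, the exception type and the way the two values are passed in are assumptions for illustration, not the driver's actual code):

```python
# Hypothetical sketch of the check implied by the log; not mlx4 driver code.
class SlaveCapsError(Exception):
    """Raised when the VF cannot accept the capabilities it was given."""


def check_uar_page_size(uar_page_size: int, kernel_page_size: int) -> None:
    # The guest in the log appears to require the UAR page size to equal its
    # own PAGE_SIZE, so a 4K UAR page on a 64K-page kernel is rejected.
    if uar_page_size != kernel_page_size:
        raise SlaveCapsError(
            f"UAR size:{uar_page_size} != kernel PAGE_SIZE of {kernel_page_size}"
        )


try:
    check_uar_page_size(uar_page_size=4096, kernel_page_size=65536)
except SlaveCapsError as err:
    print(f"{err}; Failed to obtain slave caps")
```

With both sizes at 65536 the same check would pass, which matches the report that the setup worked before the UAR page size was fixed at 4KB.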

The commit message says:

"[..] Regarding backward compatibility in SR-IOV, if hypervisor has
this new code, the virtual OS must be updated. [...]"


Or.