Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-30 Thread Or Gerlitz
Roland Dreier rol...@kernel.org wrote:
Jim Schutt jasc...@sandia.gov wrote:
 no good reason to insist that the VL usage is the same for both
 interswitch links, and switch-CA links. Do I need to change this?

 I don't think changing this is a high priority, since it's a pretty small
 slice of the world, and QoS on the edge links probably is important
 to an even smaller slice, but IMHO it would be better to give QoS to
 HCAs that only support 4 VLs by using a different SL2VL table for links to 
 CAs.

Jim,

AFAIK, the way opensm applies an SL-to-VL mapping specification (e.g.
one dictated by the admin or by your routing engine) on a specific link
is by folding it modulo the number of active VLs for that link - e.g. say
the identity mapping was requested and there are two VLs on that link, so
we'll have an SL-to-VL mapping of 0-0, 1-1, 2-0, 3-1 and so on. So in that
respect, I wasn't sure what the change here would be for you.
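The modulo folding described above can be sketched as follows (an illustrative helper of mine, not actual opensm code):

```python
# Illustrative sketch of how opensm folds a requested SL-to-VL table
# onto the VLs actually active on a link, as described above.
# Not opensm source; it only demonstrates the modulo behavior.

def fold_sl2vl(requested_sl2vl, active_vls):
    """Fold each requested VL onto the link's active VLs by modulo."""
    return [vl % active_vls for vl in requested_sl2vl]

identity = list(range(8))       # the "ID" mapping: SL n -> VL n
print(fold_sl2vl(identity, 2))  # -> [0, 1, 0, 1, 0, 1, 0, 1]
```

With two active VLs the identity request becomes 0-0, 1-1, 2-0, 3-1 and so on, matching the example above.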

Or.
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-30 Thread Or Gerlitz
 Roland Dreier rol...@kernel.org wrote:
 The obvious answer is no, and therefore we have mlx4_portX  attributes in
 sysfs that are per port.  MTU is the same way.  For example, you suggest
 that CX3 won't have the same limitation of only 4 VLs with 4K MTU.  In
 that case, think about a system with one CX2 and one CX3 -- should the
 CX3 be limited to 2K MTU because of CX2 limitations?

 Rather than having a completely different way of handling MTU, why can't
 we just handle it the same way as the port type, and have a sysfs attribute
 like mlx4_mtuN for each port?

Okay, got that. I'd like to make another round of thinking / checking on
whether we can make 4K MTU the default, and not configurable, also for
pre-CX3 devices. If yes, I guess we can avoid the per-port sysfs entry;
if not, I'll add that as part of the patch.

Or.


Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-26 Thread Or Gerlitz

Roland Dreier wrote:

I mean set the MTU port-by-port with the module loaded, the same way
we are supposed to be able to do for the port type.  Rather than having
one global module parameter.


The HCA has a set of per-port buffers, which is where the dependency
between VLs and MTU comes from. So with this code running for each IB
HCA/port, we're actually doing that logic port-by-port. I assume you
didn't mean letting the user specify a desired MTU for each HCA/port...
or am I still not fully with you?


Anyway, I'd be happy to provide at least the folks that use torus/mesh
and/or sophisticated QoS schemes the ability to use eight VLs with the
current HW, so how about keeping the module param, but with its default
value turned on?


Or.



Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-26 Thread Jim Schutt

Or Gerlitz wrote:

Roland Dreier wrote:

Bob Pearson rpear...@systemfabricworks.com wrote:
With lash+mesh, redsky required 6-7 VLs to wire up without deadlocks. I
think that Jim's version uses 8 SLs but only 2 VLs to work.
If someone was using a torus and also wanted to support QoS, and also
wanted to separate multicast and management on a separate VL to be
absolutely sure that there is no possibility of a deadlock, you might
end up with #QoS * 2 + 1, which would be 5 using the current algorithm.



But again you don't need all those VLs on the HCAs' links, do you?


Jason Gunthorpe wrote:

Routing algorithms only need VLs on interswitch links, not on HCA to
switch links. The only use of the HCA to switch VLs is for QoS. Mesh
topologies can usually be routed with only two VLs, but you need a lot
of SLs to make that work.


Bob, Jim, Alex

I wasn't sure whether the SL-to-VL mapping done by opensm is dictated by
the directives in the user config file, or whether the routing engine
itself is VL-aware. If the latter, do interswitch links use a different
mapping vs. HCA-to-switch links?


FWIW, the torus-2QoS routing engine uses VL bit 0 for torus deadlock
avoidance, VL bit 1 to route around a missing switch without deadlocks,
and VL bit 2 to provide two QoS levels.  It needs the port dependence
of the SL2VL maps to do this in switches.

The interswitch links and the HCA links use the same mapping, but only
VL bit 2 is needed on HCAs, to provide the QoS levels.
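Jim's bit assignment can be illustrated with a small sketch (my own composition; the function and parameter names are mine, not torus-2QoS source):

```python
# Sketch of the VL bit usage described above for torus-2QoS
# (illustrative only):
#   bit 0 - torus deadlock avoidance
#   bit 1 - routing around a missing switch without deadlock
#   bit 2 - selects one of two QoS levels

def torus2qos_vl(qos_level, reroute, deadlock_avoid):
    """Compose a VL number from the three independent bits."""
    assert qos_level in (0, 1) and reroute in (0, 1) and deadlock_avoid in (0, 1)
    return (qos_level << 2) | (reroute << 1) | deadlock_avoid

# On an HCA link only the QoS bit matters, so traffic lands on VL 0 or VL 4:
print(torus2qos_vl(qos_level=1, reroute=0, deadlock_avoid=0))  # -> 4
```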

I chose that bit usage because it seemed the proper ordering of
capabilities if there are fewer than 8 data VLs available - basic
deadlock avoidance is most important; some QoS is nice to have but
not that useful if the fabric can deadlock.

Is that what you were asking, at least WRT. torus-2QoS?

-- Jim










Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-26 Thread Roland Dreier
On Thu, May 26, 2011 at 7:30 AM, Jim Schutt jasc...@sandia.gov wrote:
 It occurred to me as soon I sent the above that there's no
 good reason to insist that the VL usage is the same for both
 interswitch links, and switch-CA links.

 Do I need to change this?

I don't think changing this is a high priority, since it's a pretty small
slice of the world, and QoS on the edge links probably is important
to an even smaller slice, but IMHO it would be better to give QoS to
HCAs that only support 4 VLs by using a different SL2VL table for
links to CAs.

 - R.


Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-25 Thread Or Gerlitz
Roland Dreier rol...@kernel.org wrote:

 Is the issue that we trade off VL cap for MTU?

yes, this is it

 [...] however I still think needing to set this with a module parameter
 kind of sucks for the end user.  Can we think of a better way to handle this?

With the above at hand, setting the MTU cap to 4K without an ability to
reduce it to 2K makes the patch disruptive for users that do need eight
VLs. Maybe it would be easier for the common user if we turn on the
module param by default.

 Does anyone really care about max VL cap with 2K MTU?

I'm not with you... can you elaborate a little further here? The
current HW generation supports four VLs with 4K MTU; newer HW might
support more.
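For reference, the trade-off being discussed can be summarized as a sketch (the VL/MTU figures are as stated in this thread for the current HW generation; the table itself is just my summary):

```python
# Per-port buffering is fixed, so the max MTU and the VL cap trade off
# against each other on the current HW generation, per this thread:
# 2K MTU allows eight VLs, 4K MTU only four.
mtu_to_vl_cap = {2048: 8, 4096: 4}

print(mtu_to_vl_cap[4096])  # -> 4, the VL cap when a port runs with 4K MTU
```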

Or.


Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-25 Thread Roland Dreier
On Wed, May 25, 2011 at 2:05 PM, Or Gerlitz or.gerl...@gmail.com wrote:
 Does anyone really care about max VL cap with 2K MTU?

 I'm not with you... can you elaborate a little further here? the
 current HW generation support four VLs with 4k mtu, newer HW might
 support more.

I mean is there anyone who really uses 4 VLs?  Presumably the
HW designers didn't think so, because they limited HW to 4 VLs
with 4K MTU.

At least can we make this a runtime thing?  If we're able to set a
port as IB vs ethernet then # of VLs seems like it should be doable too.

And 4K MTU should probably be the default, since almost all users
want 4K MTU vs. caring about VLs.  (Probably 99% of IB users
never set SL of anything)

 - R.


Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-25 Thread Or Gerlitz
Roland Dreier rol...@kernel.org wrote:

 And 4K MTU should probably be the default, since almost all users
 want 4K MTU vs. caring about VLs.  (Probably 99% of IB users
 never set SL of anything)

I agree that we want that to be the default. I'm not sure the 99%
figure is accurate, though, with more and more IB clusters (specifically
the huge ones) built in some sort of 2D/3D torus, mesh or similar
topology, for which routing engines such as DOR and LASH use multiple
VLs to avoid credit loops. Also, I assume that some users (maybe 5%)
would like to enjoy 8 HW traffic classes, so if pressed to the wall,
they would prefer 2K MTU with the current HW.

 I mean is there anyone who really uses 4 VLs?  Presumably the
 HW designers didn't think so, because they limited HW to 4 VLs with 4K MTU.

I'm not sure that 4 VLs are enough for all the topologies / algorithms I
mentioned above, so I do prefer to leave an option to run with eight
VLs. As for the HW designers comment, it's always good to look forward
to improvements in newer HCA drops (the patch for the CX3 series
device IDs is already committed as
31dd272e8cbb32ef31a411492cc642c363bb54b9, so one can expect the
actual cards to be coming soon as well).

 At least can we make this a runtime thing?  If we're able to set a
 port as IB vs ethernet then # of VLs seems like it should be doable too.

Here I've lost you again. The policy is dictated by the module param,
whose default value should be turned on. The code that sets the MTU
and VL cap is executed each time the function changed by the patch
is called, which in turn happens each time an IB link type is sensed
or dictated for the port.

Or.


Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-25 Thread Roland Dreier
On Wed, May 25, 2011 at 2:46 PM, Or Gerlitz or.gerl...@gmail.com wrote:
 At least can we make this a runtime thing?  If we're able to set a
 port as IB vs ethernet then # of VLs seems like it should be doable too.

 Here I've lost you again. The policy is dictated by the module param,
 whose default value should be turned on. The code that sets the MTU
 and VL cap is executed each time the function changed by the patch
 is called, which in turn happens each time an IB link type is sensed
 or dictated for the port.

I mean set the MTU port-by-port with the module loaded, the same way
we are supposed to be able to do for the port type.  Rather than having
one global module parameter.


RE: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-25 Thread Bob Pearson
With lash+mesh, redsky required 6-7 VLs to wire up without deadlocks. I
think that Jim's version uses 8 SLs but only 2 VLs to work.
If someone was using a torus and also wanted to support QoS, and also
wanted to separate multicast and management on a separate VL to be
absolutely sure that there is no possibility of a deadlock, you might
end up with #QoS * 2 + 1, which would be 5 using the current algorithm.
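Bob's count works out as follows (a sketch of the arithmetic only, assuming two QoS levels):

```python
# Bob's formula from above: each QoS level needs a pair of VLs for
# deadlock-free torus routing, plus one extra VL to isolate
# multicast/management traffic.
def vls_needed(num_qos_levels):
    return num_qos_levels * 2 + 1

print(vls_needed(2))  # -> 5, the number quoted above
```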

-Original Message-
From: linux-rdma-ow...@vger.kernel.org
[mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Roland Dreier
Sent: Wednesday, May 25, 2011 4:28 PM
To: Or Gerlitz
Cc: Or Gerlitz; linux-rdma; Vladimir Sokolovsky
Subject: Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

On Wed, May 25, 2011 at 2:05 PM, Or Gerlitz or.gerl...@gmail.com wrote:
 Does anyone really care about max VL cap with 2K MTU?

 I'm not with you... can you elaborate a little further here? the 
 current HW generation support four VLs with 4k mtu, newer HW might 
 support more.

I mean is there anyone who really uses 4 VLs?  Presumably the HW designers
didn't think so, because they limited HW to 4 VLs with 4K MTU.

At least can we make this a runtime thing?  If we're able to set a port as
IB vs ethernet then # of VLs seems like it should be doable too.

And 4K MTU should probably be the default, since almost all users want 4K
MTU vs. caring about VLs.  (Probably 99% of IB users never set SL of
anything)

 - R.


Re: [PATCH] mlx4: allow for 4K mtu configuration of IB ports

2011-05-25 Thread Roland Dreier
On Wed, May 25, 2011 at 3:19 PM, Bob Pearson
rpear...@systemfabricworks.com wrote:
 With lash+mesh, redsky required 6-7 VLs to wire up without deadlocks. I
 think that Jim's version uses 8 SLs but only 2 VLs to work.
 If someone was using a torus and also wanted to support QoS, and also
 wanted to separate multicast and management on a separate VL to be
 absolutely sure that there is no possibility of a deadlock, you might
 end up with #QoS * 2 + 1, which would be 5 using the current algorithm.

But again you don't need all those VLs on the HCAs' links, do you?

 - R.