RE: [openib-general] QoS RFC - Resend using a friendly mailer

2006-06-05 Thread Eitan Zahavi
Hi Sasha,

Please see my comments below

> >
> > 9. OpenSM features
> > ---
> > The QoS related functionality to be provided by OpenSM can be split into two
> > main parts:
> >
> > 3.1. Fabric Setup
> > During fabric initialization the SM should parse the policy and apply its
> > settings to the discovered fabric elements. The following actions should be
> > performed:
> > * Parsing of policy
> > * Node Group identification. Warning should be provided for each node not
> >   specified but found.
> > * SL2VL settings validation should be checked:
> >   + A warning will be provided if there are no matching targets for the SL2VL
> >     setting statement.
> >   + An error message will be printed to the log file if an invalid setting is
> >     found. A setting is invalid if it refers to:
> >     - Non existing port numbers of the target devices
> >     - Unsupported VLs for the target device. In the latter case the map to non
> >       existing VLs should be replaced with VL15 i.e. packets will be dropped.
> 
> Not sure that unsupported VLs mapping to VL15 is best option. Actually
> if SL2VL will be specified per port group this may mean that at least in
> "generic" case all group members should have similar physical
> capabilities or "reliable" part of SLs will be limited by lowest VLCap
> in this group (other SLs will be just dropped somewhere).
[EZ] I prefer not hiding the mismatch. In my mind an explicit setting
should be provided for each group of switches that do not share the
same VL support.
But this is not a strong requirement in my mind. In general I would
prefer to get a clear error message when the fabric cannot support the
given policy. Once such an error is provided I think we could use whatever
"recovery" option you have in mind.
> 
> In current SL2VL mapping implementation we are using such rule to replace
> unsupported VLs: (new VL) = (requested VL) % (operational data VLs)
> This may have some disadvantage too, but I think it is generally "safer".
[EZ] It is safer since it will not cause data loss. But then the QoS
will probably be broken.
> 
> Also I guess that by "unsupported VLs" you are referring to unsupported or
> non-configured VLs.
[EZ] Yes true.
> 
> > * SL2VL setting is to be performed
> > * VL Arbitration table settings should be validated according to the following
> >   rules:
> >   + A warning will be provided if there are no matching targets for the setting
> >     statement
> >   + An error will be provided if the port number exceeds the target ports
> >   + An error will be generated if the table length exceeds device capabilities
> >   + A warning will be generated if the table quotes a VL that is not supported
> >     by the target device
> 
> Should there be replacement rule for not supported VLs?
> 
> In IBTA spec (v.1, p.190, l.14) is stated that entry with unsupported VL
> may be skipped _OR_ "trusted" to other (supported) VL. I think if we will
> not care about unsupported replacement there may be hole for
> "device/vendor dependent" behavior.
[EZ] OK, good point. Let's have a replacement rule.
> 
> Sasha
___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] QoS RFC - Resend using a friendly mailer

2006-06-01 Thread Grant Grundler
On Thu, Jun 01, 2006 at 09:09:49PM +0300, Sasha Khapyorsky wrote:
> On 15:49 Tue 30 May , Grant Grundler wrote:
> > On Tue, May 30, 2006 at 10:09:36PM +0300, Sasha Khapyorsky wrote:
> > > > XML style syntax is provided for the policy file.
> > > 
> > > Why XML? It is not too much readable and writable (by human) format.
> > 
> > It is human readable and very portable.
> > An example is here:
> > http://svn.gnumonks.org/trunk/mmio_test/mmio_test.xml
> 
> Yes it is readable, but for many people it is _less_ readable and even
> _less_ writable than "plain" text.

This might be a good starting point for "many people":
http://ahds.ac.uk/creating/information-papers/xml-editors/

I tried conglomerate (debian) and it doesn't like mmio_test.xml
for some reason.  But I suppose that could be fixed.

Anyway, my point is there is no shortage of GUIs to edit XML files
and verify syntactical correctness.

hth,
grant



Re: [openib-general] QoS RFC - Resend using a friendly mailer

2006-06-01 Thread Sasha Khapyorsky
Hi Eitan,

Some more comments related to OpenSM.

On 17:53 Tue 30 May , Eitan Zahavi wrote:
> 
> 9. OpenSM features
> ---
> The QoS related functionality to be provided by OpenSM can be split into two 
> main parts:
> 
> 3.1. Fabric Setup
> During fabric initialization the SM should parse the policy and apply its 
> settings to the discovered fabric elements. The following actions should be 
> performed:
> * Parsing of policy
> * Node Group identification. Warning should be provided for each node not 
>   specified but found.
> * SL2VL settings validation should be checked:
>   + A warning will be provided if there are no matching targets for the SL2VL
>     setting statement.
>   + An error message will be printed to the log file if an invalid setting is
>     found. A setting is invalid if it refers to:
>     - Non existing port numbers of the target devices
>     - Unsupported VLs for the target device. In the latter case the map to non
>       existing VLs should be replaced with VL15 i.e. packets will be dropped.

Not sure that unsupported VLs mapping to VL15 is best option. Actually
if SL2VL will be specified per port group this may mean that at least in
"generic" case all group members should have similar physical
capabilities or "reliable" part of SLs will be limited by lowest VLCap
in this group (other SLs will be just dropped somewhere).

In current SL2VL mapping implementation we are using such rule to replace
unsupported VLs: (new VL) = (requested VL) % (operational data VLs)
This may have some disadvantage too, but I think it is generally "safer".
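
For illustration, the two replacement rules under discussion (map to VL15, or fold with the modulo rule) can be sketched as below. This is a minimal Python sketch under stated assumptions, not OpenSM code: the function names and the `operational_data_vls` parameter are hypothetical, and every requested entry is assumed to be a data VL (0..14).

```python
DROP_VL = 15  # per the RFC text, a data SL mapped to VL15 has its packets dropped


def remap_to_vl15(sl2vl, operational_data_vls):
    """RFC proposal: replace every unsupported VL with VL15 (packets are dropped)."""
    return [vl if vl < operational_data_vls else DROP_VL for vl in sl2vl]


def remap_modulo(sl2vl, operational_data_vls):
    """Rule quoted above: (new VL) = (requested VL) % (operational data VLs)."""
    return [vl % operational_data_vls for vl in sl2vl]


# Hypothetical 16-entry SL-to-VL table asking for VL0..VL7 on a port with 4 data VLs.
requested = [sl % 8 for sl in range(16)]
print(remap_to_vl15(requested, 4))
print(remap_modulo(requested, 4))
```

With 4 operational data VLs, the SLs that asked for VL4..VL7 are dropped under the first rule and share VL0..VL3 with other SLs under the second, so no data is lost but the intended separation of flows is.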

Also I guess that by "unsupported VLs" you are referring to unsupported or
non-configured VLs.

> * SL2VL setting is to be performed
> * VL Arbitration table settings should be validated according to the
>   following rules:
>   + A warning will be provided if there are no matching targets for the
>     setting statement
>   + An error will be provided if the port number exceeds the target ports
>   + An error will be generated if the table length exceeds device capabilities
>   + A warning will be generated if the table quotes a VL that is not
>     supported by the target device

Should there be replacement rule for not supported VLs?
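
The per-port validation rules quoted above can be sketched roughly as follows. This is only an illustration in Python, assuming a table is a list of (VL, weight) pairs; `validate_vlarb` and all of its parameters are hypothetical names, and the "no matching targets" warning is left out because it belongs to target resolution rather than to a single port.

```python
def validate_vlarb(port_num, num_ports, table, table_cap, supported_vls):
    """Return (errors, warnings) for one VL arbitration setting on one port.

    table is a list of (vl, weight) pairs; table_cap is the number of
    entries the device supports; supported_vls is a set of VL numbers.
    """
    errors, warnings = [], []
    if port_num > num_ports:
        errors.append(f"port {port_num} exceeds the {num_ports} ports of the target")
    if len(table) > table_cap:
        errors.append(f"table length {len(table)} exceeds device capability {table_cap}")
    for index, (vl, _weight) in enumerate(table):
        if vl not in supported_vls:
            # Only a warning: a replacement rule would decide what to do next.
            warnings.append(f"entry {index} quotes unsupported VL{vl}")
    return errors, warnings


errors, warnings = validate_vlarb(3, 2, [(0, 10), (5, 20)], 1, {0, 1})
print(errors)
print(warnings)
```

A replacement rule for unsupported VLs would then be applied to the entries listed in the warnings, instead of leaving the outcome to the device.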

In IBTA spec (v.1, p.190, l.14) is stated that entry with unsupported VL
may be skipped _OR_ "trusted" to other (supported) VL. I think if we will
not care about unsupported replacement there may be hole for
"device/vendor dependent" behavior.

Sasha



Re: [openib-general] QoS RFC - Resend using a friendly mailer

2006-06-01 Thread Sasha Khapyorsky
On 15:49 Tue 30 May , Grant Grundler wrote:
> On Tue, May 30, 2006 at 10:09:36PM +0300, Sasha Khapyorsky wrote:
> > > XML style syntax is provided for the policy file.
> > 
> > Why XML? It is not too much readable and writable (by human) format.
> 
> It is human readable and very portable.
> An example is here:
>   http://svn.gnumonks.org/trunk/mmio_test/mmio_test.xml

Yes it is readable, but for many people it is _less_ readable and even
_less_ writable than "plain" text.

> And GPL libraries can parse XML.

It is true, but currently we have "portability" complaints even against
using libpthread.

Sasha

> So the new code is fairly short:
>   http://svn.gnumonks.org/trunk/mmio_test/xmlin.c
> 
> hth,
> grant



RE: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-31 Thread Rimmer, Todd
I am a member of MgtWG and will look for the discussion there.

It doesn't seem like a topic LWG would cover.

Todd Rimmer

> -Original Message-
> From: Eitan Zahavi [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, May 31, 2006 1:35 AM
> To: Rimmer, Todd
> Cc: openib-general@openib.org
> Subject: RE: [openib-general] QoS RFC - Resend using a friendly mailer
> 
> 
> Hi Todd,
> 
> It is LWG. MgtWG will also be involved. 
> > 
> > I am a member of IBTA however I have not noticed this discussion on the IBTA
> > working groups.  Which working group have you engaged with this proposal?
> > 
> > Todd Rimmer
> 



RE: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Eitan Zahavi
Hi Todd,

It is LWG. MgtWG will also be involved. 
> 
> I am a member of IBTA however I have not noticed this discussion on the IBTA
> working groups.  Which working group have you engaged with this proposal?
> 
> Todd Rimmer



RE: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Hal Rosenstock
On Tue, 2006-05-30 at 18:05, Rimmer, Todd wrote:
> > Eitan Wrote:
> > > As Roland suggest, before implementing a non-standard approach, IBTA should be
> > > engaged to define an appropriate extension to the standard.  Such extensions would
> > > need to be carefully defined to avoid breaking existing applications and fabrics.
> > [EZ] You are welcome to join IBTA and work on this too.
> 
> I am a member of IBTA however I have not noticed this discussion 
> on the IBTA working groups.  Which working group have you engaged 
> with this proposal?

It's at the LWG.

-- Hal

> 
> Todd Rimmer 



Re: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Grant Grundler
On Tue, May 30, 2006 at 10:09:36PM +0300, Sasha Khapyorsky wrote:
> > XML style syntax is provided for the policy file.
> 
> Why XML? It is not too much readable and writable (by human) format.

It is human readable and very portable.
An example is here:
http://svn.gnumonks.org/trunk/mmio_test/mmio_test.xml

And GPL libraries can parse XML.
So the new code is fairly short:
http://svn.gnumonks.org/trunk/mmio_test/xmlin.c

hth,
grant



RE: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Rimmer, Todd

> Eitan Wrote:
> > As Roland suggest, before implementing a non-standard approach, IBTA should be
> > engaged to define an appropriate extension to the standard.  Such extensions would
> > need to be carefully defined to avoid breaking existing applications and fabrics.
> [EZ] You are welcome to join IBTA and work on this too.

I am a member of IBTA however I have not noticed this discussion on the IBTA 
working groups.  Which working group have you engaged with this proposal?

Todd Rimmer 



Re: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Sasha Khapyorsky
On 22:43 Tue 30 May , Eitan Zahavi wrote:
> > >
> > > XML style syntax is provided for the policy file.
> > 
> > Why XML? It is not too much readable and writable (by human) format.
> [EZ] Well, I agree with you but already got so many requests for XML
> that I could not resist. Maybe we could do both. If we have a nice BNF
> it would be just a matter of some yacc exercise.

I care less about parser complexity and more about people who will
want to edit the policy definitions with just 'vi' (or any other text
editor). And I agree that an OpenSM config -> XML converter may not be so
hard to do.

Sasha



RE: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Eitan Zahavi
Hi Sasha,

Thanks for your comments.
Please see my comments inside

> > 3. Supported Policy
> > 
> >
> > The QoS policy supported by this proposal is divided into 4 sub sections:
> >
> > * Node Group: a set of HCAs, Routers or Switches that share the same settings.
> > A node group might be a partition defined by the partition manager policy in
> > terms of GUIDs. Future implementations might provide support for NodeDescription
> > based definition of node groups.
> 
> Port/Node groups could be defined as separate configuration, then those
> definitions will be shared by different policies like Partitions, QoS (and
> maybe others in future).
[EZ] Great idea. I would suggest using NodeDescription as a way to get
node names. But this is yet another issue for discussion on the IBTA and
this list.
> 
> > * Fabric Setup:
> > Defines how the SL2VL and VLArb tables should be setup. This policy definition
> > assumes the computation of target behavior should be performed outside of
> > OpenSM.
> >
> > * QoS-Levels Definition:
> > This section defines the possible sets of parameters for QoS that a client might
> > be mapped to. Each set holds: SL and optionally: Max MTU, Max Rate, Path Bits
> > (in case LMC > 0 is used for QoS) and TClass.
> >
> > * Matching Rules:
> > A list of rules that match an incoming PathRecord request to a QoS-Level. The
> > rules are processed in order such that the first match is applied. Each rule is
> > built out of set of match expressions which should all match for the rule to
> > apply. The matching expressions are defined for the following fields
> > ** SRC and DST to lists of node groups
> > ** Service-ID to a list of Service-ID or Service-ID ranges
> > ** TClass to a list of TClass values or ranges
> >
> > XML style syntax is provided for the policy file.
> 
> Why XML? It is not too much readable and writable (by human) format.
[EZ] Well, I agree with you but already got so many requests for XML
that I could not resist. Maybe we could do both. If we have a nice BNF
it would be just a matter of some yacc exercise. IMO it is the least of
our problems.
> 
> Sasha



RE: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Eitan Zahavi
Hi Hal,

Please see my responses inside

Eitan
> >
> >   RFC: OpenFabrics Enhancements for QoS Support
> >  ===
> >
> > Authors: . Eitan Zahavi <[EMAIL PROTECTED]>
> > Date:  May 2006.
> > Revision:  0.1
> >
> > Table of contents:
> > 1. Overview
> > 2. Architecture
> > 3. Supported Policy
> > 4. CMA functionality
> > 5. IPoIB functionality
> > 6. SDP functionality
> > 7. SRP functionality
> > 8. iSER functionality
> > 9. OpenSM functionality
> >
> > 1. Overview
> > 
> > Quality of Service requirements stem from the realization of I/O consolidation
> > over IB network: As multiple applications and ULPs share the same fabric, means
> > to control their use of the network resources are becoming a must. The basic
> > need is to differentiate the service levels provided to different traffic flows.
> > Such that a policy could be enforced and control each flow utilization of the
> > fabric resources.
> >
> > IBTA specification defined several hardware features and management interfaces
> > to support QoS:
> > * Up to 15 Virtual Lanes (VL) could carry traffic in a non-blocking manner
> > * Arbitration between traffic of different VL is performed by a 2 priority
> >   levels weighted round robin arbiter. The arbiter is programmable with
> >   a sequence of (VL, weight) pairs and maximal number of high priority credits
> >   to be processed before low priority is served
> > * Packets carry class of service marking in the range 0 to 15 in their
> >   header SL field
> > * Each switch can map the incoming packet by its SL to a particular output
> >   VL based on programmable table VL=SL-to-VL-MAP(in-port, out-port, SL)
> > * The Subnet Administrator controls each communication flow parameters
> >   by providing them as a response to Path Record query
> >
> > The IB QoS features provide the means to implement a DiffServ like architecture.
> > DiffServ architecture (IETF RFC2474 2475) is widely used today in highly dynamic
> > fabrics.
> 
> Only certain DSCP code point equivalents are provided by IBA.
[EZ] True.
> 
> > This proposal provides the detailed functional definition for the various
> > software elements that are required to enable a DiffServ like architecture over
> > the OpenFabrics software stack.
> >
> >
> >
> >
> >
> > 2. Architecture
> > 
> > This proposal splits the QoS functionality between the SM/SA, CMA and the various
> > ULPs. We take the "chronology approach" to describe how the overall system
> > works:
> >
> > 2.1. The network manager (human) provides a set of rules (policy) that defines
> > how the network is being configured and how its resources are split to different
> > QoS-Levels. The policy also defines how to decide which QoS-Level each
> > application or ULP or service uses.
> 
> > 2.2. The SM analyzes the provided policy to see if it is realizable and performs
> > the necessary fabric setup. The SM may continuously monitor the policy and adapt
> > to changes in it.
> 
> Do you mean monitor the policy or the fabric here ?
[EZ] I mean monitor the policy such that changes in it are enforced.
> 
> >  Part of this policy defines the default QoS-Level of each
> > partition. The SA is being enhanced to match the requested Source, Destination,
> > TClass, Service-ID
> 
> Service ID does not apply to many ULPs. Also, how is it known what
> ULP/application a particular service ID refers to (other than perhaps
> some well known ones) ?
[EZ] True - only well known Service-IDs can have a predefined policy
attached to them.
But I disagree that services are unknown - if they are unknown,
how are they being found by the clients?
> 
> >  (and optionally SL and priority) against the policy. So
> > clients (ULPs, programs) can obtain a policy enforced QoS. The SM is also
> > enhanced to support setting up partitions with appropriate IPoIB broadcast
> > group. This broadcast group carries its QoS attributes: TClass, SL, MTU and
> > RATE.
> >
> > 2.3. IPoIB is being setup. IPoIB uses the SL, MTU and RATE available on the
> > multicast group which forms the broadcast group of this partition.
> >
> > 2.4. MPI which provides non IB based connection management should be configured
> > to run using hard coded SLs. It uses these SLs in every QP being opened.
> >
> > 2.5. ULPs that use CM interface (like SRP) should have their own pre-assigned
> > Service-ID and use it while obtaining PathRecord for establishing their
> > connections. The SA receiving the PathRecord should match it against the policy
> > and return the appropriate PathRecord including SL, MTU, RATE and TClass.
> >
> > 2.6. ULPs and programs using CMA to establish RC connection should provide the
> > CMA the target IP and Service-ID. Some of the ULPs might also provide TClass
> > (E.g. for SDP sockets that are provided the TOS socket option). The CMA should
> > then use the provided Service-ID and optional TClass and pass them in the
> > PathRecor

RE: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Eitan Zahavi
Hi Todd,
 
> While using the Service ID is an interesting idea, the problem is the Service ID values
> are not well defined by IBTA.  Rather each endpoint is permitted to define its own,
> potentially transient set of Service ID values.  The Service ID values are discovered via
> Service Records in the SA or Device Management queries which get their data from
> the IOU.
[EZ] Actually there are quite a few rules for how service IDs are made.
Different service vendors are supposed to use different Service-IDs.
Also this RFC does not enforce using Service-IDs in cases where they are not
defined. But it does provide the means to do that when such services are
defined. So in no way can you say it breaks existing implementations.
It just provides a way for applications that do make constant use of
Service-IDs to benefit from that property.

> 
> Hence while a few service ID values are well defined (such as those for SDP), many
> are not (such as those for MPI, uDAPL, SRP, etc) and may vary between both
> hardware and software suppliers.  Many are likely to be duplicated between different
> vendors target devices (for example a uDAPL target application may duplicate values
> used by an SRP target) and this would not be a problem provided both applications
> were never run on the same IB Node target device.  Some might even change on each
> reboot (IBTA spec implies this could be a 64 bit pointer or context in the target),
> although I'm not aware of any which do.
> 
> I believe it is for the above reasons that IBTA chose not to make ServiceID part of the
> PathRecord and MultiPathRecord queries.
> 
> As Roland suggest, before implementing a non-standard approach, IBTA should be
> engaged to define an appropriate extension to the standard.  Such extensions would
> need to be carefully defined to avoid breaking existing applications and fabrics.
[EZ] You are welcome to join IBTA and work on this too.
> 
> Todd Rimmer



Re: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Sasha Khapyorsky
Hi Eitan,

First comments...

On 17:53 Tue 30 May , Eitan Zahavi wrote:
> 
> 3. Supported Policy
>  
> 
> The QoS policy supported by this proposal is divided into 4 sub sections:
> 
> * Node Group: a set of HCAs, Routers or Switches that share the same settings.
> A node group might be a partition defined by the partition manager policy in
> terms of GUIDs. Future implementations might provide support for NodeDescription
> based definition of node groups.

Port/Node groups could be defined as separate configuration, then those
definitions will be shared by different policies like Partitions, QoS (and
maybe others in future).

> * Fabric Setup:
> Defines how the SL2VL and VLArb tables should be setup. This policy definition
> assumes the computation of target behavior should be performed outside of
> OpenSM.
> 
> * QoS-Levels Definition:
> This section defines the possible sets of parameters for QoS that a client might
> be mapped to. Each set holds: SL and optionally: Max MTU, Max Rate, Path Bits
> (in case LMC > 0 is used for QoS) and TClass.
> 
> * Matching Rules:
> A list of rules that match an incoming PathRecord request to a QoS-Level. The
> rules are processed in order such that the first match is applied. Each rule is
> built out of set of match expressions which should all match for the rule to
> apply. The matching expressions are defined for the following fields
> ** SRC and DST to lists of node groups
> ** Service-ID to a list of Service-ID or Service-ID ranges
> ** TClass to a list of TClass values or ranges
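
The first-match semantics of the Matching Rules section could look roughly like the sketch below. It is a Python illustration under stated assumptions, not a proposed implementation: the `Rule` type, its field names, and the convention that an empty criterion matches anything are all hypothetical and are not defined by the RFC.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """One matching rule; an empty criterion is assumed to match anything."""
    qos_level: str
    src_groups: set = field(default_factory=set)
    dst_groups: set = field(default_factory=set)
    service_ids: list = field(default_factory=list)  # (low, high) inclusive ranges
    tclasses: list = field(default_factory=list)     # (low, high) inclusive ranges


def in_ranges(value, ranges):
    return not ranges or any(low <= value <= high for low, high in ranges)


def in_groups(node_groups, wanted):
    return not wanted or bool(node_groups & wanted)


def match_qos_level(rules, src_groups, dst_groups, service_id, tclass, default=None):
    """Return the QoS-Level of the first rule whose expressions all match."""
    for rule in rules:
        if (in_groups(src_groups, rule.src_groups)
                and in_groups(dst_groups, rule.dst_groups)
                and in_ranges(service_id, rule.service_ids)
                and in_ranges(tclass, rule.tclasses)):
            return rule.qos_level
    return default


rules = [
    Rule("storage", dst_groups={"targets"}, service_ids=[(0x100, 0x1FF)]),
    Rule("bulk"),  # catch-all placed last, since order decides the result
]
print(match_qos_level(rules, {"hosts"}, {"targets"}, 0x150, 0))
```

Because the first match wins, rule order is part of the policy: a catch-all rule placed first would hide every rule after it.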
> 
> XML style syntax is provided for the policy file.

Why XML? It is not too much readable and writable (by human) format.

Sasha



Re: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Michael Krause



High-level feedback:
- An IB fabric could be used for a single ULP and still require
QoS.  The issue is how to differentiate flows on a given shared
element within the fabric.
- QoS controls must be dynamic. The document references initialization as
the time when decisions are made but obviously that is just a first pass
on use of the fabric and not what it will become in potentially a short
period of time.
- QoS also involves multi-path support (not really touched upon in terms
of specifics in this document).  Distributing or segregating work
even if for the same ULP should be done across multiple or distinct
paths.  In one sense this may complicate the work but in another it
is simpler in that arbitration controls for shared links become easier to
manage if the number of flows is reduced.  
- IP over IB defines a multicast group which is ultimately a spanning
tree.  That should not constrain what paths are used to communicate
between endnode pairs.  That only defines the multicast paths which
are not strongly ordered relative to the unicast traffic.  Further
IP over IB may operate using the RC mode between endnodes.  It is
very simple to replicate RC and then segregate these into QoS domains
(one could just align priority with the 802.1p for simplicity and
practical execution) which can in turn flow over shared or distinct
paths.
- IB is a centrally managed fabric.  Adding in SID into records and
such really isn't going to help solve the problem unless there is also a
centralized management entity well above IB that can prioritize
communication service rates for different ULP and endnode pairs. 
Given most of these centralized management entities are rather ignorant
of IB at the moment, this presents a chicken-egg dilemma which is further
complicated by developing SOA technology.  It might be more valuable
in one sense to examine SOA technology and how it is translating itself
to say Ethernet and then see how this can be leveraged to IB.
- QoS needs to examine the sums of the consumers of a given path and
their service rate requirements.  It isn't just about setting a
priority level but also about the packet injection rate to the fabric on
that priority.  This needs to be taken into account as
well.
Overall, it is not clear to me what the end value of this document is.
The challenge for any network admin is to translate SOA driven
requirements into fabric control knob setting.  Without such
translation algorithms / understanding, it is not clear that there is
anything truly missing in the IBTA spec suite or that this RFC will
really advance the integration of IB into the data center in a truly
meaningful manner.
Mike

At 07:53 AM 5/30/2006, Eitan Zahavi wrote:
To: OPENIB

Subject: QoS RFC - Resend using a friendly mailer
--text follows this line--
Hi All 
Please find the attached RFC describing how QoS policy support could be
implemented in the OpenFabrics stack.
Your comments are welcome.
Eitan
 
RFC: OpenFabrics Enhancements for QoS Support

===
Authors: . Eitan Zahavi <[EMAIL PROTECTED]>
Date:  May 2006.
Revision:  0.1
Table of contents:
1. Overview
2. Architecture
3. Supported Policy
4. CMA functionality
5. IPoIB functionality
6. SDP functionality
7. SRP functionality
8. iSER functionality
9. OpenSM functionality
1. Overview

Quality of Service requirements stem from the realization of I/O consolidation
over IB network: As multiple applications and ULPs share the same fabric, means
to control their use of the network resources are becoming a must. The basic
need is to differentiate the service levels provided to different traffic flows.
Such that a policy could be enforced and control each flow utilization of the
fabric resources.
IBTA specification defined several hardware features and management interfaces
to support QoS:
* Up to 15 Virtual Lanes (VL) could carry traffic in a non-blocking manner
* Arbitration between traffic of different VL is performed by a 2 priority
  levels weighted round robin arbiter. The arbiter is programmable with
  a sequence of (VL, weight) pairs and maximal number of high priority credits
  to be processed before low priority is served
* Packets carry class of service marking in the range 0 to 15 in their
  header SL field
* Each switch can map the incoming packet by its SL to a particular output
  VL based on programmable table VL=SL-to-VL-MAP(in-port, out-port, SL)
* The Subnet Administrator controls each communication flow parameters
  by providing them as a response to Path Record query
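
As a rough illustration of the features listed above, the sketch below models the SL-to-VL lookup and a two-priority weighted round robin arbiter in Python. It is a simplification under stated assumptions, not the hardware behavior: weights and the high-priority limit count whole packets here rather than credits, and `sl_to_vl`, `expand`, `arbitrate` and their parameters are hypothetical names.

```python
from collections import deque


def sl_to_vl(table, in_port, out_port, sl):
    """VL = SL-to-VL-MAP(in-port, out-port, SL); table maps port pairs to 16-entry lists."""
    return table[(in_port, out_port)][sl]


def expand(entries):
    """Turn (VL, weight) pairs into a slot sequence, one slot per unit of weight."""
    return [vl for vl, weight in entries for _ in range(weight)]


def arbitrate(queues, high, low, high_limit):
    """Drain queues (dict: VL -> deque of packets) and return the service order."""
    high_slots, low_slots = expand(high), expand(low)
    hi_pos = lo_pos = high_sent = 0
    order = []

    def next_packet(slots, pos):
        # Scan at most one full cycle of slots, starting where we stopped last time.
        for step in range(len(slots)):
            vl = slots[(pos + step) % len(slots)]
            if queues.get(vl):
                return (vl, queues[vl].popleft()), (pos + step + 1) % len(slots)
        return None, pos

    while True:
        picked = None
        if high_sent < high_limit:
            picked, hi_pos = next_packet(high_slots, hi_pos)
        if picked is not None:
            high_sent += 1
        else:
            # High budget used up, or nothing waiting: low priority gets one turn.
            high_sent = 0
            picked, lo_pos = next_packet(low_slots, lo_pos)
            if picked is None:
                picked, hi_pos = next_packet(high_slots, hi_pos)
                if picked is None:
                    return order
        order.append(picked)


queues = {0: deque(["a", "b", "c"]), 1: deque(["x", "y"])}
print(arbitrate(queues, high=[(0, 2)], low=[(1, 1)], high_limit=2))
```

In this model the high-priority limit is what keeps the low-priority table from starving: after `high_limit` high-priority packets, one low-priority packet is served before the high table resumes.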
The IB QoS features provide the means to implement a DiffServ like architecture.
DiffServ architecture (IETF RFC2474 2475) is widely used today in highly dynamic
fabrics.
This proposal provides the detailed functional definition for the various
software elements that are required to enable a DiffServ like architecture over
the OpenFabrics software stack.


2. Architecture
--

Re: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Hal Rosenstock
On Tue, 2006-05-30 at 10:53, Eitan Zahavi wrote:
> To: OPENIB 
> Subject: QoS RFC - Resend using a friendly mailer
> --text follows this line--
> Hi All 
> 
> Please find the attached RFC describing how QoS policy support could be 
> implemented in the OpenFabrics stack.
> Your comments are welcome.

Some initial comments.

> 
> Eitan
> 
>   RFC: OpenFabrics Enhancements for QoS Support
>  ===
> 
> Authors: . Eitan Zahavi <[EMAIL PROTECTED]>
> Date:  May 2006.
> Revision:  0.1
> 
> Table of contents:
> 1. Overview
> 2. Architecture
> 3. Supported Policy
> 4. CMA functionality
> 5. IPoIB functionality
> 6. SDP functionality
> 7. SRP functionality
> 8. iSER functionality
> 9. OpenSM functionality
> 
> 1. Overview
> 
> Quality of Service requirements stem from the realization of I/O consolidation
> over IB network: As multiple applications and ULPs share the same fabric, means
> to control their use of the network resources are becoming a must. The basic
> need is to differentiate the service levels provided to different traffic flows.
> Such that a policy could be enforced and control each flow utilization of the
> fabric resources.
> 
> IBTA specification defined several hardware features and management interfaces
> to support QoS:
> * Up to 15 Virtual Lanes (VL) could carry traffic in a non-blocking manner
> * Arbitration between traffic of different VL is performed by a 2 priority
>   levels weighted round robin arbiter. The arbiter is programmable with
>   a sequence of (VL, weight) pairs and maximal number of high priority credits
>   to be processed before low priority is served
> * Packets carry class of service marking in the range 0 to 15 in their
>   header SL field
> * Each switch can map the incoming packet by its SL to a particular output
>   VL based on programmable table VL=SL-to-VL-MAP(in-port, out-port, SL)
> * The Subnet Administrator controls each communication flow parameters
>   by providing them as a response to Path Record query
> 
> The IB QoS features provide the means to implement a DiffServ like
> architecture. DiffServ architecture (IETF RFC2474 2475) is widely used today
> in highly dynamic fabrics.

Only certain DSCP code point equivalents are provided by IBA.

> This proposal provides the detailed functional definition for the various
> software elements that are required to enable a DiffServ like architecture
> over the OpenFabrics software stack.
> 
> 
> 
> 
> 
> 2. Architecture
> 
> This proposal split the QoS functionality between the SM/SA, CMA and the
> various ULPS. We take the "chronology approach" to describe how the overall
> system works:
> 
> 2.1. The network manager (human) provides a set of rules (policy) that
> defines how the network is being configured and how its resources are split
> to different QoS-Levels. The policy also define how to decide which QoS-Level
> each application or ULP or service use.

> 2.2. The SM analyzes the provided policy to see if it is realizable and
> performs the necessary fabric setup. The SM may continuously monitor the
> policy and adapt to changes in it.

Do you mean monitor the policy or the fabric here ?

>  Part of this policy defines the default QoS-Level of each partition. The SA
> is being enhanced to match the requested Source, Destination, TClass,
> Service-ID

Service ID does not apply to many ULPs. Also, how is it known what
ULP/application a particular service ID refers to (other than perhaps
some well known ones) ?

>  (and optionally SL and priority) against the policy. So clients (ULPs,
> programs) can obtain a policy enforced QoS. The SM is also enhanced to
> support setting up partitions with appropriate IPoIB broadcast group. This
> broadcast group carries its QoS attributes: TClass, SL, MTU and RATE.
> 
> 2.3. IPoIB is being setup. IPoIB uses the SL, MTU and RATE available on the 
> multicast group which forms the broadcast group of this partition.
> 
> 2.4. MPI which provides non IB based connection management should be
> configured to run using hard coded SLs. It uses these SLs in every QP being
> opened.
> 
> 2.5. ULPs that use CM interface (like SRP) should have their own pre-assigned
> Service-ID and use it while obtaining PathRecord for establishing their
> connections. The SA receiving the PathRecord should match it against the
> policy and return the appropriate PathRecord including SL, MTU, RATE and
> TClass.
> 
> 2.6. ULPs and programs using CMA to establish RC connection should provide
> the CMA the target IP and Service-ID. Some of the ULPs might also provide
> TClass (E.g. for SDP sockets that are provided the TOS socket option). The
> CMA should then use the provided Service-ID and optional TClass and pass them
> in the PathRecord request. The resulting PathRecord should be used for
> configuring the
> connecti

RE: [openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Rimmer, Todd

> Eitan wrote:
> 2.2. The SM analyzes the provided policy to see if it is 
> realizable and performs 
> the necessary fabric setup. The SM may continuously monitor 
> the policy and adapt 
> to changes in it. Part of this policy defines the default 
> QoS-Level of each 
> partition. The SA is being enhanced to match the requested 
> Source, Destination, 
> TClass, Service-ID (and optionally SL and priority) against 
> the policy. So 
> clients (ULPs, programs) can obtain a policy enforced QoS. 
> The SM is also 
> enhanced to support setting up partitions with appropriate 
> IPoIB broadcast 
> group. This broadcast group carries its QoS attributes: 
> TClass, SL, MTU and 
> RATE.

While using the Service ID is an interesting idea, the problem is the Service 
ID values are not well defined by IBTA.  Rather each endpoint is permitted to 
define its own, potentially transient set of Service ID values.  The Service ID 
values are discovered via Service Records in the SA or Device Management 
queries which get their data from the IOU.

Hence while a few service ID values are well defined (such as those for SDP), 
many are not (such as those for MPI, uDAPL, SRP, etc) and may vary between both 
hardware and software suppliers.  Many are likely to be duplicated between 
different vendors target devices (for example a uDAPL target application may 
duplicate values used by an SRP target) and this would not be a problem 
provided both applications were never run on the same IB Node target device.  
Some might even change on each reboot (IBTA spec implies this could be a 64 bit 
pointer or context in the target), although I'm not aware of any which do.

I believe it is for the above reasons that IBTA chose not to make ServiceID 
part of the PathRecord and MultiPathRecord queries.

As Roland suggests, before implementing a non-standard approach, IBTA should be 
engaged to define an appropriate extension to the standard.  Such extensions 
would need to be carefully defined to avoid breaking existing applications and 
fabrics.

Todd Rimmer 


[openib-general] QoS RFC - Resend using a friendly mailer

2006-05-30 Thread Eitan Zahavi
Hi All 

Please find the attached RFC describing how QoS policy support could be 
implemented in the OpenFabrics stack.
Your comments are welcome.

Eitan

  RFC: OpenFabrics Enhancements for QoS Support
 ===

Authors: . Eitan Zahavi <[EMAIL PROTECTED]>
Date:  May 2006.
Revision:  0.1

Table of contents:
1. Overview
2. Architecture
3. Supported Policy
4. CMA functionality
5. IPoIB functionality
6. SDP functionality
7. SRP functionality
8. iSER functionality
9. OpenSM functionality

1. Overview

Quality of Service requirements stem from the realization of I/O consolidation
over an IB network: as multiple applications and ULPs share the same fabric, a
means to control their use of the network resources is becoming a must. The
basic need is to differentiate the service levels provided to different traffic
flows, so that a policy can be enforced to control each flow's utilization of
the fabric resources.

The IBTA specification defines several hardware features and management
interfaces to support QoS:
* Up to 15 Virtual Lanes (VLs) can carry traffic in a non-blocking manner
* Arbitration between traffic of different VLs is performed by a weighted
  round robin arbiter with 2 priority levels. The arbiter is programmable with
  a sequence of (VL, weight) pairs and a maximal number of high priority
  credits to be processed before low priority is served
* Packets carry a class of service marking in the range 0 to 15 in their
  header SL field
* Each switch can map an incoming packet by its SL to a particular output
  VL based on a programmable table VL=SL-to-VL-MAP(in-port, out-port, SL)
* The Subnet Administrator controls each communication flow's parameters
  by providing them in response to a Path Record query
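
To make the SL-to-VL-MAP bullet concrete, here is a minimal sketch of the
lookup a switch performs. The class name, the port numbers and the example
mapping are all hypothetical and only illustrate the
VL=SL-to-VL-MAP(in-port, out-port, SL) relation; this is not how any switch
or OpenSM actually stores the table.

```python
NUM_SLS = 16   # the SL header field carries values 0..15
DROP_VL = 15   # data packets whose SL maps to VL15 are dropped


class Sl2VlTable:
    """Per-switch SL-to-VL map, keyed by (in_port, out_port)."""

    def __init__(self):
        self._maps = {}

    def set_map(self, in_port, out_port, vls):
        """Install the 16-entry SL -> VL row for one port pair."""
        if len(vls) != NUM_SLS:
            raise ValueError("expected one VL per SL (16 entries)")
        if any(not 0 <= vl <= 15 for vl in vls):
            raise ValueError("VL values must be in 0..15")
        self._maps[(in_port, out_port)] = list(vls)

    def lookup(self, in_port, out_port, sl):
        """Return the output VL: VL = SL-to-VL-MAP(in-port, out-port, SL)."""
        if not 0 <= sl < NUM_SLS:
            raise ValueError("SL must be in 0..15")
        return self._maps[(in_port, out_port)][sl]


table = Sl2VlTable()
# Hypothetical policy: SL 0-7 share VL0, SL 8-14 share VL1, SL 15 is dropped.
table.set_map(1, 2, [0] * 8 + [1] * 7 + [DROP_VL])
print(table.lookup(1, 2, 3))   # 0
print(table.lookup(1, 2, 9))   # 1
```

Keying the table by (in-port, out-port) mirrors the formula above; a real
policy would of course be derived from the fabric's supported VLs.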

The IB QoS features provide the means to implement a DiffServ-like
architecture. The DiffServ architecture (IETF RFC 2474 and RFC 2475) is widely
used today in highly dynamic fabrics.
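
The differentiated service ultimately rests on the VL arbiter listed among the
hardware features above. The following is a simplified model of a 2 priority
level weighted round robin arbiter, not the IBTA-defined behavior: it assumes
weights and the high priority limit are counted in whole packets, and the
names WrrTable, VlArbiter and high_limit are made up for this sketch.

```python
from collections import deque


class WrrTable:
    """Weighted round robin over a sequence of (VL, weight) entries."""

    def __init__(self, entries):
        self.entries = [(vl, weight) for vl, weight in entries if weight > 0]
        self.index = 0
        self.left = self.entries[0][1] if self.entries else 0

    def pick(self, queues):
        """Return a VL that may send one packet now, or None."""
        if not self.entries:
            return None
        # Visit the current entry, then every entry once with a fresh weight.
        for _ in range(len(self.entries) + 1):
            vl = self.entries[self.index][0]
            if self.left > 0 and queues.get(vl):
                self.left -= 1
                return vl
            self.index = (self.index + 1) % len(self.entries)
            self.left = self.entries[self.index][1]
        return None


class VlArbiter:
    """High table is served first until high_limit packets went out in a row."""

    def __init__(self, high, low, high_limit):
        self.high = WrrTable(high)
        self.low = WrrTable(low)
        self.high_limit = high_limit
        self.high_sent = 0

    def next_packet(self, queues):
        low_first = self.high_sent >= self.high_limit
        order = (self.low, self.high) if low_first else (self.high, self.low)
        for table in order:
            vl = table.pick(queues)
            if vl is None:
                continue
            if table is self.high:
                self.high_sent += 1
            else:
                self.high_sent = 0
            return queues[vl].popleft()
        return None


queues = {0: deque(["a0", "a1", "a2", "a3"]), 1: deque(["b0", "b1"])}
arbiter = VlArbiter(high=[(0, 2)], low=[(1, 1)], high_limit=2)
order = []
while (packet := arbiter.next_packet(queues)) is not None:
    order.append(packet)
print(order)  # ['a0', 'a1', 'b0', 'a2', 'a3', 'b1']
```

With VL0 in the high priority table and a limit of 2, low priority VL1 still
gets one packet through after every two high priority packets.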

This proposal provides the detailed functional definition for the various 
software elements that are required to enable a DiffServ like architecture over 
the OpenFabrics software stack.





2. Architecture

This proposal splits the QoS functionality between the SM/SA, the CMA and the
various ULPs. We take the "chronology approach" to describe how the overall
system works:

2.1. The network manager (human) provides a set of rules (policy) that defines
how the network is configured and how its resources are split between different
QoS-Levels. The policy also defines how to decide which QoS-Level each
application, ULP or service uses.

2.2. The SM analyzes the provided policy to see if it is realizable and
performs the necessary fabric setup. The SM may continuously monitor the policy
and adapt to changes in it. Part of this policy defines the default QoS-Level
of each partition. The SA is being enhanced to match the requested Source,
Destination, TClass, Service-ID (and optionally SL and priority) against the
policy, so that clients (ULPs, programs) can obtain a policy-enforced QoS. The
SM is also enhanced to support setting up partitions with an appropriate IPoIB
broadcast group. This broadcast group carries its QoS attributes: TClass, SL,
MTU and RATE.
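
As an illustration of the matching step in 2.2, here is a minimal sketch of
how an SA could resolve a request to a QoS-Level. Everything in it is an
assumption made for the example: the rule format, the "first matching rule
wins" order, the fallback to the partition default, and all names and values
(QosLevel, MatchRule, the Service-ID 0x1000, the P_Key 0x7FFF).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class QosLevel:
    """Attributes handed back to the client (all values hypothetical)."""
    sl: int
    mtu: int
    rate: int


@dataclass(frozen=True)
class MatchRule:
    """One policy rule; a field left as None matches any request value."""
    qos_level: str
    source: Optional[str] = None
    destination: Optional[str] = None
    tclass: Optional[int] = None
    service_id: Optional[int] = None

    def matches(self, request):
        fields = ("source", "destination", "tclass", "service_id")
        return all(
            getattr(self, name) is None
            or getattr(self, name) == request.get(name)
            for name in fields
        )


def resolve_qos_level(request, rules, partition_defaults, levels):
    """First matching rule wins; otherwise use the partition's default."""
    for rule in rules:
        if rule.matches(request):
            return levels[rule.qos_level]
    return levels[partition_defaults[request["pkey"]]]


levels = {"storage": QosLevel(sl=2, mtu=2048, rate=10),
          "default": QosLevel(sl=0, mtu=2048, rate=10)}
rules = [MatchRule(qos_level="storage", service_id=0x1000)]
partition_defaults = {0x7FFF: "default"}

request = {"source": "node-a", "destination": "node-b",
           "pkey": 0x7FFF, "service_id": 0x1000, "tclass": None}
print(resolve_qos_level(request, rules, partition_defaults, levels).sl)  # 2
```

A request that matches no rule falls back to the default QoS-Level of its
partition, which is the behavior 2.2 describes for the partition default.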

2.3. IPoIB is set up. IPoIB uses the SL, MTU and RATE available on the
multicast group which forms the broadcast group of this partition.

2.4. MPI, which provides non-IB-based connection management, should be
configured to run using hard-coded SLs. It uses these SLs in every QP it opens.

2.5. ULPs that use the CM interface (like SRP) should have their own
pre-assigned Service-ID and use it while obtaining a PathRecord for
establishing their connections. The SA receiving the PathRecord query should
match it against the policy and return the appropriate PathRecord including
SL, MTU, RATE and TClass.

2.6. ULPs and programs using the CMA to establish an RC connection should
provide the CMA with the target IP and Service-ID. Some of the ULPs might also
provide TClass (e.g. for SDP sockets that are given the TOS socket option). The
CMA should then use the provided Service-ID and optional TClass and pass them
in the PathRecord request. The resulting PathRecord should be used for
configuring the connection QP.
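
The client side of 2.6 can be sketched in the same spirit. This is only a
model of the data flow (Service-ID and optional TClass go into the request,
SL, MTU, RATE and TClass come back and configure the QP). The function and
type names are hypothetical, the SA is replaced by a stub, and nothing here
corresponds to the real CMA API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PathRequest:
    """What the CMA passes to the SA in this model."""
    dest_ip: str
    service_id: int
    tclass: Optional[int] = None


@dataclass(frozen=True)
class PathRecord:
    """QoS-related fields of the SA response in this model."""
    sl: int
    mtu: int
    rate: int
    tclass: int


def resolve_and_configure(dest_ip, service_id, query_sa, tclass=None):
    """Query the SA and derive the QP attributes from the returned record."""
    request = PathRequest(dest_ip, service_id, tclass)
    record = query_sa(request)
    # The connection QP takes its QoS attributes from the PathRecord only.
    return {"sl": record.sl, "path_mtu": record.mtu,
            "rate": record.rate, "tclass": record.tclass}


def fake_sa(request):
    """Stand-in for the SA: echoes a requested TClass, else returns 0."""
    tclass = request.tclass if request.tclass is not None else 0
    return PathRecord(sl=1, mtu=2048, rate=10, tclass=tclass)


qp_attrs = resolve_and_configure("192.168.0.2", 0x1000, fake_sa, tclass=8)
print(qp_attrs["sl"], qp_attrs["tclass"])  # 1 8
```

The point of the sketch is that the ULP never chooses the SL itself; it only
supplies the Service-ID and, optionally, the TClass.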

PathRecord and MultiPathRecord enhancement for QoS: 
As mentioned above the PathRecord and MultiPathRecord attributes should be 
enhanced to carry the Service-ID which is a 64bit value. Given the existing 
definition for these attributes we propose to use the following fields for 
Service-ID:
* For PathRecord: use the first 2 reserved fields, which are 32 bits each
  (component masks 0x1 and 0x2). Component mask 1 should be used to refer to
  the merged Service-ID field
* For MultiPathRecord: use 2 reserved fields: 
  1. after the packet life (8 bits) which is component mask bit 0x1 (17)