[openib-general] nightly osm_sim report 2007-01-30:normal completion

2007-01-29 Thread Eitan Zahavi
OSM Simulation Regression Summary
OpenSM rev = Mon_Jan_29_10:06:23_2007 1f5e50 
ibutils rev = Wed_Jan_3_11:42:12_2007 913448 
Total=410 Pass=409 Fail=1

Pass:
30 Stability IS1-16.topo
30 Pkey IS1-16.topo
30 OsmTest IS1-16.topo
30 Multicast IS1-16.topo
30 LidMgr IS1-16.topo
29 OsmStress IS1-16.topo
10 Stability IS3-loop.topo
10 Stability IS3-128.topo
10 Pkey IS3-128.topo
10 OsmTest IS3-loop.topo
10 OsmTest IS3-128.topo
10 OsmStress IS3-128.topo
10 Multicast IS3-loop.topo
10 Multicast IS3-128.topo
10 LidMgr IS3-128.topo
10 FatTree part-4-ary-3-tree.topo
10 FatTree merge-roots-reorder-4-ary-2-tree.topo
10 FatTree merge-roots-4-ary-2-tree.topo
10 FatTree merge-root-4-ary-3-tree.topo
10 FatTree merge-root-12-ary-2-tree.topo
10 FatTree merge-2-ary-4-tree.topo
10 FatTree half-4-ary-3-tree.topo
10 FatTree blend-4-ary-2-tree.topo
10 FatTree 4-ary-4-tree.topo
10 FatTree 4-ary-3-tree.topo
10 FatTree 32nodes-3lvl-is1.topo
10 FatTree 2-ary-4-tree.topo
10 FatTree 12-node-spaced.topo
10 FatTree 12-ary-2-tree.topo

Failures:
1 OsmStress IS1-16.topo

___
openib-general mailing list
openib-general@openib.org
http://openib.org/mailman/listinfo/openib-general

To unsubscribe, please visit http://openib.org/mailman/listinfo/openib-general



Re: [openib-general] [openfabrics-ewg] [PATCH ofed1.2 2/2] libehca: change path to ehca.driver for make dist

2007-01-29 Thread Hoang-Nam Nguyen
applied both patches





Re: [openib-general] QoS in OSM

2007-01-29 Thread Hal Rosenstock
Hi Yevgeny,

On Mon, 2007-01-29 at 18:10, Yevgeny Kliteynik wrote:
> Hi guys.
> 
> I've finished the first implementation of QoS-aware PathRecord.
> The path selection logic itself is implemented in a separate function
> that is called only when QoS in OpenSM is on.
> It causes some code duplication, but as we've discussed, the idea is to
> minimize the changes in the existing logic in OSM.
> Tonight the regression testing is running on this OSM version to make
> sure that I didn't screw something up.
> Since none of the QoS patches has made its way to the trunk yet, the
> patch series will be pretty long. It will include:
>  - QoS policy file parser (Lex & Yacc files that implement the grammar,
>    C & H files that implement parser auxiliary functions)
>  - Additional fields in path_record_t (instead of 'reserved' fields)
>  - Additional command line option for OpenSM to specify the QoS
>    policy file name
>  - QoS-aware selection of PathRecord.
> I'll issue the patch series with all the details in the morning, and then
> I'll start working on MultiPath Record.
> 
> In addition to all the questions that you already have and I haven't answered
> yet, I'm sure you'll have many questions and remarks regarding these patches.

I would like some time to go over the new set of patches.

> I suggest that we set up a conference call to discuss all these questions - it
> might save us a lot of time and clear some issues.

Some might make sense via con call but there are a number of outstanding
ones which could be answered ahead of time. I'm not sure why these can't
be done on the list.

> How about tomorrow morning? (I mean Hal's morning). The earlier the better.

I'm not sure that leaves sufficient time for review. I'll look at them
tomorrow and we'll figure out a plan from there.

-- Hal


Re: [openib-general] QoS in OSM

2007-01-29 Thread Sasha Khapyorsky
On 01:10 Tue 30 Jan , Yevgeny Kliteynik wrote:
> Hi guys.
> 
> I've finished the first implementation of QoS-aware PathRecord.
> The path selection logic itself is implemented in a separate function
> that is called only when QoS in OpenSM is on.
> It causes some code duplication, but as we've discussed, the idea is to
> minimize the changes in the existing logic in OSM.
> Tonight the regression testing is running on this OSM version to make
> sure that I didn't screw something up.
> Since none of the QoS patches has made its way to the trunk yet, the
> patch series will be pretty long. It will include:
>  - QoS policy file parser (Lex & Yacc files that implement the grammar,
>    C & H files that implement parser auxiliary functions)
>  - Additional fields in path_record_t (instead of 'reserved' fields)
>  - Additional command line option for OpenSM to specify the QoS
>    policy file name
>  - QoS-aware selection of PathRecord.
> I'll issue the patch series with all the details in the morning, and then
> I'll start working on MultiPath Record.

And what about integration with VLArb and SL2VL port's setup?

Sasha


[openib-general] QoS in OSM

2007-01-29 Thread Yevgeny Kliteynik
Hi guys.

I've finished the first implementation of QoS-aware PathRecord.
The path selection logic itself is implemented in a separate function
that is called only when QoS in OpenSM is on.
It causes some code duplication, but as we've discussed, the idea is to
minimize the changes in the existing logic in OSM.
Tonight the regression testing is running on this OSM version to make
sure that I didn't screw something up.
Since none of the QoS patches has made its way to the trunk yet, the
patch series will be pretty long. It will include:
 - QoS policy file parser (Lex & Yacc files that implement the grammar,
   C & H files that implement parser auxiliary functions)
 - Additional fields in path_record_t (instead of 'reserved' fields)
 - Additional command line option for OpenSM to specify the QoS
   policy file name
 - QoS-aware selection of PathRecord.
I'll issue the patch series with all the details in the morning, and then
I'll start working on MultiPath Record.

In addition to all the questions that you already have and I haven't answered
yet, I'm sure you'll have many questions and remarks regarding these patches.

I suggest that we set up a conference call to discuss all these questions - it
might save us a lot of time and clear some issues.

How about tomorrow morning? (I mean Hal's morning). The earlier the better.

Please let me know what you think about it.

Thanks,

-- Yevgeny

Hal Rosenstock wrote:
> Hi again Yevgeny,
> 
> On Thu, 2007-01-25 at 11:53, Yevgeny Kliteynik wrote: 
>> Hi Hal.
>>
>> Hal Rosenstock wrote:
>>> Hi Yevgeny,
>>>
>>> On Wed, 2007-01-24 at 09:10, Yevgeny Kliteynik wrote:
>>>> Hi Hal, Sasha.
>>>>
>>>> Here's a description of the QoS policy file, and an
>>>> example of such a file (with more comments inside).
>>> This makes the start of a good document on this. If you add this to
>>> osm/doc, I will incorporate it into the opensm man page.
>> OK, I'll do that.
>>
>>>> QoS Policy file
>>>> ---------------
>>>>
>>>> The QoS policy file is divided into 4 subsections:
>>>>
>>>> * Node Group: a set of HCAs, Routers or Switches that share the
>>>>   same settings. A node group might be a partition defined by the
>>>>   partition manager policy in terms of GUIDs.
>>> Are these Node or Port Groups ? It looks like port groups from the
>>> below.
>> Good point - it should be "Port Groups".
>>
>>>>   Future implementations might provide support for NodeDescription
>>>>   based definition of node groups.
>>>>
>>>> * Fabric Setup:
>>>>   Defines how the SL2VL and VLArb tables should be set up. This
>>>>   policy definition assumes the computation of target behavior
>>>>   should be performed outside of OpenSM.
>>>>
>>>> * QoS-Levels Definition:
>>>>   This section defines the possible sets of parameters for QoS that
>>>>   a client might be mapped to. Each set holds: SL and optionally:
>>>>   Max MTU, Max Rate, Path Bits (in case LMC > 0 is used for QoS)
>>>>   and TClass.
>>> How does this relate to/interact with partition configuration ? Also,
>>> what about preexisting QoS ?
>> As I understand from the osm man or from the partition-config.txt,
>> partitions definition is intended to be used for IPoIB only.
>>[quote]
>>  sl= - specifies SL for this IPoIB MC group
>> (default is 0)
>>[/quote]
>>
>> I think that QoS policy may only "tighten" the constraints and enforce
>> lower-than-requested values, both in case of partition and in case of
>> preexisting QoS settings.
> 
> I'm not following you on this specific point. A specific SL is chosen by
> partition config so how can it be "tightened" ? Does it mean it might be
> changed to a different SL (in which case this QoS config supersedes the
> partition config for SL setting) ? Have you tried this to be sure ?
> 
> Are multicast groups handled as part of the QoS definition in the XML
> syntax ? If not, might this be a future addition ? If it is, how are
> they specified ?
> 
> The other half of the original question was how a QoS request is handled
> if the original QoS support is enabled rather than this new QoS support
> in terms of the SA PR and MPR code.
> 
>>>> * Matching Rules:
>>>>   A list of rules that match an incoming PathRecord request to a
>>>>   QoS-Level. The rules are processed in order, so that the first
>>>>   match is applied. Each rule is built out of a set of match
>>>>   expressions which should all match for the rule to apply. The
>>>>   matching expressions are defined for the following fields:
>>>>    - SRC and DST to lists of node groups
>>>>    - Service-ID to a list of Service-ID or Service-ID ranges
>>>>    - TClass to a list of TClass values or ranges
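
The first-match behaviour of the matching rules can be sketched in a few lines of C. This is only an illustration of the description above, not code from the patch series: the names qos_level, match_rule, path_request and match_qos_level are hypothetical, SRC/DST group matching is left out for brevity, and it assumes that a match expression which is absent from a rule does not constrain the match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical QoS level: SL is mandatory, TClass is optional.
 * Max MTU, Max Rate and Path Bits are omitted from this sketch. */
struct qos_level {
    uint8_t sl;
    bool has_tclass;
    uint8_t tclass;
};

/* Inclusive range; a single value is a range with lo == hi. */
struct id_range {
    uint64_t lo, hi;
};

/* Hypothetical rule: every expression present must match. */
struct match_rule {
    const struct id_range *service_ids;
    size_t n_service_ids;               /* 0 means "not constrained" (assumption) */
    const struct id_range *tclasses;
    size_t n_tclasses;                  /* 0 means "not constrained" (assumption) */
    const struct qos_level *level;
};

/* The fields of an incoming PathRecord request that this sketch looks at. */
struct path_request {
    uint64_t service_id;
    uint8_t tclass;
};

static bool in_ranges(const struct id_range *r, size_t n, uint64_t v)
{
    if (n == 0)
        return true;                    /* expression absent */
    for (size_t i = 0; i < n; i++)
        if (v >= r[i].lo && v <= r[i].hi)
            return true;
    return false;
}

/* Walk the rules in order and return the level of the first rule whose
 * expressions all match, or NULL when no rule applies. */
static const struct qos_level *
match_qos_level(const struct match_rule *rules, size_t n_rules,
                const struct path_request *req)
{
    assert(req != NULL);
    for (size_t i = 0; i < n_rules; i++) {
        if (in_ranges(rules[i].service_ids, rules[i].n_service_ids,
                      req->service_id) &&
            in_ranges(rules[i].tclasses, rules[i].n_tclasses, req->tclass))
            return rules[i].level;
    }
    return NULL;
}
```

With this shape, a catch-all rule with no expressions placed last acts as a default level, because earlier and more specific rules win.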

>>>> QoS policy file example
>>>> -----------------------

 
 
 
 
 
  
 Storage 
 our SRP storage targets
>>> Is the use clause more than commentary ? How is it "used" ?
>> The 'use' clause is just a description of the port group that
>> can 

Re: [openib-general] IPOIB CM with Non SRQ support

2007-01-29 Thread Pradeep Satyanarayana
Hello Michael,

Yes, the code seems to get complex with lots of small changes spread all
over the receive side. Plus, special-casing them with #ifdef makes it look
a little messy. It is unlikely I can get this out by Feb 1st.

As I was working through this I noticed a few things, and here are my
observations:

- ipoib_cm_modify_rx_rts() does not actually transition the passive side
  qp to the RTS state; it remains in the RTR state. However, the active
  side qp does transition to RTS.

- One artifact of the current send side implementation is that for every
  message we create a new set of tx qps. So, if one were to use IB for
  the cluster heartbeat mechanism, as an example, then for every heartbeat
  we end up creating an ipoib_cm_tx structure and initiating a set of CM
  exchanges. This might consume a lot of resources (even on an "idle"
  system). Changing this has a potential performance upside.

Pradeep
[EMAIL PROTECTED] 

"Michael S. Tsirkin" <[EMAIL PROTECTED]> wrote on 01/25/2007 11:41:28 PM:

> > Quoting Pradeep Satyanarayana <[EMAIL PROTECTED]>:
> > Subject: IPOIB CM with Non SRQ support
> > 
> > 
> > Michael, 
> > 
> > I am working on a prototype based on your IPOIB CM patch to
> > incorporate support for Non SRQ as well. IPOIB CM was planned to be
> > in OFED 1.2 if I remember correctly. If I were to submit a patch for
> > non SRQ support, what would be the cut off date to make it
> > into OFED 1.2?
> 
> I think it must be ready for merge by feature freeze on Feb 1st, but at
> this stage it really needs to be a small patch. I can't commit to merging
> it before I see it.
> 
> I have to warn you that I thought about this problem, and unfortunately
> I do not see a way to implement it in a robust fashion without
> complicating the code significantly. In this case, you just might have
> to maintain it as a separate patch until the code lands upstream, and
> propose it as a separate improvement later.
> 
> -- 
> MST

Re: [openib-general] [RFC][PATCH] rdma_cm: allow joins to return a unique address

2007-01-29 Thread Sean Hefty
>>To allow others to join this group, we need a way to determine
>>if additional join requests are for a specific MGID, or require
>>IP to MGID mapping.  This is done by comparing the requested
>>join address against SA assigned MGIDs.
> 
> 
> Still not understanding this part -- this means that I'm not able to get 
> some sort of portable handle for the group on the process that initially 
> joins the group, and pass it to other processes who would then use that 
> to join the group?

I believe that this patch lets you do what you're trying to do.  The group
handle would be the returned mgid from the initial join that created the
group.  The mgid would need to be passed to other processes as an IPv6
address, which then issue a join request on that group.  (The mgid is
available from the rdma_cm_event.param.ud.ah_attr.grh.dgid.)

Typically, the rdma_cm maps IP addresses to mgids using the ipoib ip mapping 
algorithm.  This patch avoids that mapping if the upper 32-bits of the IP 
address match a specific pattern.
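
As a rough userspace illustration of the two steps described here, the sketch below wraps a raw 16-byte mgid in an IPv6 socket address and then repeats the upper-32-bit test from the patch. The helper names mgid_to_sockaddr and looks_sa_assigned are hypothetical; only the 0xFF10A01B mask is taken from the patch, and the sketch does not call into the rdma_cm at all.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Wrap a raw 16-byte MGID in an IPv6 socket address, so that it can be
 * handed to another process and used there as the join address. */
static struct sockaddr_in6 mgid_to_sockaddr(const uint8_t mgid[16])
{
    struct sockaddr_in6 sin6;

    assert(mgid != NULL);
    memset(&sin6, 0, sizeof sin6);
    sin6.sin6_family = AF_INET6;
    memcpy(sin6.sin6_addr.s6_addr, mgid, 16);
    return sin6;
}

/* Userspace mirror of the check in the patch: read the upper 32 bits of
 * the address in network byte order and compare them against the mask. */
static bool looks_sa_assigned(const struct sockaddr_in6 *sin6)
{
    const uint8_t *b = sin6->sin6_addr.s6_addr;
    uint32_t upper = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                     ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];

    return (upper & 0xFF10A01Bu) == 0xFF10A01Bu;
}
```

Because the test is a mask compare rather than an equality compare, any address whose upper 32 bits have at least the mask bits set is treated as an SA assigned MGID; an IPoIB-style prefix such as ff12:401b does not have them all set and so still goes through the IP mapping.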

- Sean




Re: [openib-general] [RFC/BUG] libibverbs: DMA vs. CQ race

2007-01-29 Thread Roland Dreier
Hmm...

Well, first the changes to the userspace libmthca need to be such that
new libmthca continues to work with old kernels.  I'm OK with saying
to people, "You upgraded your kernel so you also have to upgrade your
userspace library."  But I'm not OK with saying to people, "To get a
fix for that bug, you need to upgrade libmthca, which means you also
need to upgrade your kernel," and I also don't want to tell people,
"If you reboot into an older kernel then you need to downgrade your
userspace library."

Also,

 > +off_t   offset;

 > +/* offset encodes CQ and cqn; lower PAGE_SHIFT bits MBZ */
 > +offset = cq->cqn;
 > +offset <<= 32;
 > +offset += MTHCA_MAGIC_CQ_OFFSET * page_size;

is obviously not going to work on architectures where off_t is 32 bits.
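
To make the 32-bit concern concrete, here is a small sketch of the same encoding done in an explicit 64-bit type, plus what is left when the result has to fit in 32 bits. The function names are hypothetical, and since the value of MTHCA_MAGIC_CQ_OFFSET is not shown in the quoted hunk it is passed in as the parameter magic_pages.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>

/* True only when off_t has room for a value shifted left by 32 bits. */
static bool off_t_is_wide_enough(void)
{
    return sizeof(off_t) * CHAR_BIT > 32;
}

/* The encoding from the quoted hunk, done in an explicit 64-bit type:
 * cqn in the upper 32 bits, magic page offset in the lower bits. */
static uint64_t encode_cq_offset(uint32_t cqn, uint64_t magic_pages,
                                 uint64_t page_size)
{
    assert(page_size > 0);
    return ((uint64_t)cqn << 32) + magic_pages * page_size;
}

/* What survives if the offset has to fit into 32 bits: the cqn is lost. */
static uint32_t truncate_to_32(uint64_t offset)
{
    return (uint32_t)offset;
}
```

For example, encode_cq_offset(7, 1, 4096) is 0x0000000700001000, while its 32-bit truncation is just 0x1000, so two different CQs would end up with the same offset.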

Even with that resolved this all seems rather unfortunate to me.  I
don't like the idea of having the kernel keep all these buffers around
and then have the userspace library have to map the right buffer.  It
leads to awkwardness like the fact that mthca_resize_cq() seems to be
totally screwed if ibv_cmd_resize_cq() fails for some reason -- it
already munmap'ed the original buffer, and it can't map the new
buffer, and so the CQ is dead with no chance to recover.

The really strange thing about this is that this Altix
coherent/consistent memory really isn't about the memory itself, but
about the relationship of that memory with DMA elsewhere -- as I
understand the code, doing dma_alloc_coherent() returns normal memory
with a special DMA address that tells the system to flush other DMAs
before doing DMA to the coherent region.  Which isn't really what most
people understand coherent memory to be, but it has the magic property
of making most drivers work.

So I'd really like a better solution, but I don't have one in mind
unfortunately.  Maybe we can all meditate on this and try to come up
with something cleaner -- I really hope there is a better way to
handle this.

 - R.




Re: [openib-general] [RFC][PATCH] rdma_cm: allow joins to return a unique address

2007-01-29 Thread Andrew Friedley
Sean Hefty wrote:
> Modify rdma_join_multicast to allow the user to specify that
> they want the underlying transport to assign them a unique
> multicast address.  This is done by specifying an IP address
> of 0, which will translate into an IB MGID of 0.
> 
> To allow others to join this group, we need a way to determine
> if additional join requests are for a specific MGID, or require
> IP to MGID mapping.  This is done by comparing the requested
> join address against SA assigned MGIDs.

Still not understanding this part -- this means that I'm not able to get 
some sort of portable handle for the group on the process that initially 
joins the group, and pass it to other processes who would then use that 
to join the group?

Andrew




[openib-general] [RFC][PATCH] rdma_cm: allow joins to return a unique address

2007-01-29 Thread Sean Hefty
Modify rdma_join_multicast to allow the user to specify that
they want the underlying transport to assign them a unique
multicast address.  This is done by specifying an IP address
of 0, which will translate into an IB MGID of 0.

To allow others to join this group, we need a way to determine
if additional join requests are for a specific MGID, or require
IP to MGID mapping.  This is done by comparing the requested
join address against SA assigned MGIDs.

Signed-off-by: Sean Hefty <[EMAIL PROTECTED]>
---
diff --git a/drivers/infiniband/core/cma.c b/drivers/infiniband/core/cma.c
index 827df2a..395cf2f 100644
--- a/drivers/infiniband/core/cma.c
+++ b/drivers/infiniband/core/cma.c
@@ -2490,13 +2490,36 @@ out:
 	return 0;
 }
 
+static void cma_set_mgid(struct rdma_id_private *id_priv,
+			 struct sockaddr *addr, union ib_gid *mgid)
+{
+	unsigned char mc_map[MAX_ADDR_LEN];
+	struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
+	struct sockaddr_in *sin = (struct sockaddr_in *) addr;
+	struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *) addr;
+
+	if (cma_any_addr(addr)) {
+		memset(mgid, 0, sizeof *mgid);
+	} else if ((addr->sa_family == AF_INET6) &&
+		   ((be32_to_cpu(sin6->sin6_addr.s6_addr32[0]) & 0xFF10A01B) ==
+		    0xFF10A01B)) {
+		/* IPv6 address is an SA assigned MGID. */
+		memcpy(mgid, &sin6->sin6_addr, sizeof *mgid);
+	} else {
+		ip_ib_mc_map(sin->sin_addr.s_addr, mc_map);
+		if (id_priv->id.ps == RDMA_PS_UDP)
+			mc_map[7] = 0x01;	/* Use RDMA CM signature */
+		mc_map[8] = ib_addr_get_pkey(dev_addr) >> 8;
+		mc_map[9] = (unsigned char) ib_addr_get_pkey(dev_addr);
+		*mgid = *(union ib_gid *) (mc_map + 4);
+	}
+}
+
 static int cma_join_ib_multicast(struct rdma_id_private *id_priv,
 				 struct cma_multicast *mc)
 {
 	struct ib_sa_mcmember_rec rec;
-	unsigned char mc_map[MAX_ADDR_LEN];
 	struct rdma_dev_addr *dev_addr = &id_priv->id.route.addr.dev_addr;
-	struct sockaddr_in *sin = (struct sockaddr_in *) &mc->addr;
 	ib_sa_comp_mask comp_mask;
 	int ret;
 
@@ -2506,15 +2529,9 @@ static int cma_join_ib_multicast(struct rdma_id_private *id_priv,
 	if (ret)
 		return ret;
 
-	ip_ib_mc_map(sin->sin_addr.s_addr, mc_map);
-	if (id_priv->id.ps == RDMA_PS_UDP) {
-		mc_map[7] = 0x01;	/* Use RDMA CM signature */
+	cma_set_mgid(id_priv, &mc->addr, &rec.mgid);
+	if (id_priv->id.ps == RDMA_PS_UDP)
 		rec.qkey = cpu_to_be32(RDMA_UDP_QKEY);
-	}
-	mc_map[8] = ib_addr_get_pkey(dev_addr) >> 8;
-	mc_map[9] = (unsigned char) ib_addr_get_pkey(dev_addr);
-
-	rec.mgid = *(union ib_gid *) (mc_map + 4);
 	ib_addr_get_sgid(dev_addr, &rec.port_gid);
 	rec.pkey = cpu_to_be16(ib_addr_get_pkey(dev_addr));
 	rec.join_state = 1;





Re: [openib-general] [openfabrics-ewg] [PATCH ofed-1.2 0/6] ehca (kernel space) patches for ofed-1.2

2007-01-29 Thread Hoang-Nam Nguyen
[EMAIL PROTECTED] wrote on 27.01.2007 17:11:34:
> > PS2: For backport on 2.6.16 resp. SLES10 I saw that there is a
> > hvcall.h under backport/2.6.16/include/linux. However that one
> > is not sufficient for ehca and include/linux is the wrong place.
> > Hence, I'm patching a new one under include/asm. If I'm right,
> > please remove include/linux/hvcall.h!
> I remember this was needed for iser backport for some reason.
> Does someone remember?
Can someone from iser group please check this?
On pseries hvcall.h is placed under include/asm which is a link to
include/asm-ppc resp include/asm-powerpc.
Thanks
Nam





Re: [openib-general] ipoib, ipv6 and multicast groups

2007-01-29 Thread Hal Rosenstock
On Mon, 2007-01-29 at 13:17, chas williams - CONTRACTOR wrote:
> recently our sm started throwing the following errors:
> 
> Jan 29 18:10:49 706710 [42003940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
> Jan 29 18:10:49 706721 [42003940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
> Jan 29 18:10:51 345113 [42804940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
> Jan 29 18:10:51 345132 [42804940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
> Jan 29 18:10:51 514312 [41802940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
> Jan 29 18:10:51 514320 [41802940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
> Jan 29 18:10:51 735732 [42804940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken

32 is too low for MLID space support IMO.

> we tracked this down to a problem with ipoib interaction
> with ipv6.  ipv6 joins two multicast groups, instead of 
> just one like ipv4.
> 
>   # netstat -A inet6 -g  -n
>   ...
>   IPv6/IPv4 Group Memberships
>   Interface       RefCnt Group
>   --------------- ------ ---------------------
>   lo              1      ff02::1
>   ib0             1      ff02::1:ff00:77a2
>   ib0             1      ff02::1
> 
> 
>   # netstat -A inet6 -g  -n
>   ...
>   IPv6/IPv4 Group Memberships
>   Interface       RefCnt Group
>   --------------- ------ ---------------------
>   lo              1      224.0.0.1
>   ib0             1      224.0.0.1
> 
> 
>   # cat /sys/kernel/debug/ipoib/ib0_mcg
>   GID: ff12:401b::0:0:0:0:1
> created: 4298482097
> queuelen: 0
> complete:   yes
> send_only:   no
> 
>   GID: ff12:401b::0:0:0::
> created: 4298482097
> queuelen: 0
> complete:   yes
> send_only:   no
> 
>   GID: ff12:601b::0:0:0:0:1
> created: 4298482097
> queuelen: 0
> complete:   yes
> send_only:   no
> 
>   GID: ff12:601b::0:0:1:ff00:77a2
> created: 4298482097
> queuelen: 0
> complete:   yes
> send_only:   no
> 
> 
> the ff02::1:ff00:77a2 group is specific to the interface (link local),
> so each of our ib hosts running ipv6 registers its own unique multicast
> group.  since our network is bigger than 32 hosts, it appears that we
> have exceeded the multicast tables in our local switches and this is
> making opensm generate the above error.
> 
> besides not running ipv6, are there any thoughts about this?

This has been discussed on the list before. Last time was a thread on
"IPv6 and IPoIB scalability issue" back in late November (11/30) to
early December (12/2). There are some options presented. None have been
pursued to the best of my knowledge.

-- Hal

> 
> 





[openib-general] [PATCH ofed1.2 2/2] libehca: change path to ehca.driver for make dist

2007-01-29 Thread Stefan Roscher
Hi,

this patch fixes the path to ehca.driver in Makefile.am.

Regards
Stefan


Signed-off-by: Stefan Roscher <[EMAIL PROTECTED]>
---



Files libehca_old/.git/index and libehca_new/.git/index differ
diff -Nurp libehca_old/Makefile.am libehca_new/Makefile.am
--- libehca_old/Makefile.am 2007-01-29 17:16:22.0 +0100
+++ libehca_new/Makefile.am 2007-01-29 17:17:18.0 +0100
@@ -70,7 +70,7 @@ EXTRA_DIST = src/ehca_asm.h \
 	src/ehca_utools.h \
 	src/hipz_hw.h \
 	src/libehca.map \
-	src/ehca.driver
+	ehca.driver
 
 # dist-hook: libehca.spec
 # cp libehca.spec $(distdir)









[openib-general] [PATCH ofed1.2 1/2] libehca: create config directory in autogen.sh

2007-01-29 Thread Stefan Roscher
Hi,

this patch changes autogen.sh so that the config directory is created if
it does not exist.

Regards
Stefan



Signed-off-by: Stefan Roscher <[EMAIL PROTECTED]>
---


diff -Nurp libehca_old/autogen.sh libehca_new/autogen.sh
--- libehca_old/autogen.sh  2007-01-29 17:16:22.0 +0100
+++ libehca_new/autogen.sh  2007-01-29 17:17:01.0 +0100
@@ -1,5 +1,6 @@
 #! /bin/sh
 
+mkdir -p config
 set -x
 aclocal -I config
 libtoolize --force --copy






[openib-general] ipoib, ipv6 and multicast groups

2007-01-29 Thread chas williams - CONTRACTOR
recently our sm started throwing the following errors:

Jan 29 18:10:49 706710 [42003940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
Jan 29 18:10:49 706721 [42003940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
Jan 29 18:10:51 345113 [42804940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
Jan 29 18:10:51 345132 [42804940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
Jan 29 18:10:51 514312 [41802940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken
Jan 29 18:10:51 514320 [41802940] -> osm_mcmr_rcv_create_new_mgrp: ERR 1B19: __get_new_mlid failed
Jan 29 18:10:51 735732 [42804940] -> __get_new_mlid: ERR 1B23: All available:32 mlids are taken

we tracked this down to a problem with ipoib interaction
with ipv6.  ipv6 joins two multicast groups, instead of 
just one like ipv4.

# netstat -A inet6 -g  -n
...
IPv6/IPv4 Group Memberships
Interface   RefCnt Group
--- -- -
lo  1  ff02::1
ib0 1  ff02::1:ff00:77a2
ib0 1  ff02::1


# netstat -A inet6 -g  -n
...
IPv6/IPv4 Group Memberships
Interface   RefCnt Group
--- -- -
lo  1  224.0.0.1
ib0 1  224.0.0.1


# cat /sys/kernel/debug/ipoib/ib0_mcg
GID: ff12:401b::0:0:0:0:1
  created: 4298482097
  queuelen: 0
  complete:   yes
  send_only:   no

GID: ff12:401b::0:0:0::
  created: 4298482097
  queuelen: 0
  complete:   yes
  send_only:   no

GID: ff12:601b::0:0:0:0:1
  created: 4298482097
  queuelen: 0
  complete:   yes
  send_only:   no

GID: ff12:601b::0:0:1:ff00:77a2
  created: 4298482097
  queuelen: 0
  complete:   yes
  send_only:   no


the ff02::1:ff00:77a2 group is specific to the interface (link local),
so each of our ib hosts running ipv6 registers its own unique multicast
group.  since our network is bigger than 32 hosts, it appears that we
have exceeded the multicast tables in our local switches and this is
making opensm generate the above error.
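
For reference, the per-host group is the IPv6 solicited-node multicast address (RFC 4291): the fixed prefix ff02::1:ff followed by the low 24 bits of the interface address. A minimal sketch of that derivation follows; the function names are made up for illustration and are not part of ipoib or opensm.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build the solicited-node multicast address (RFC 4291) for a unicast
 * IPv6 address: ff02::1:ffXX:XXXX, where XX:XXXX are the low-order
 * 24 bits of the unicast address. Name is illustrative only. */
static void solicited_node_mcast(const uint8_t unicast[16], uint8_t mcast[16])
{
    /* ff02:0000:0000:0000:0000:0001:ff.. (first 13 bytes are fixed) */
    static const uint8_t prefix[13] = {
        0xff, 0x02,
        0, 0, 0, 0, 0, 0, 0, 0, 0,
        0x01, 0xff
    };

    memcpy(mcast, prefix, sizeof(prefix));
    memcpy(mcast + 13, unicast + 13, 3);
}

/* Return 1 if two hosts end up in the same solicited-node group,
 * which only happens when their low 24 address bits match. */
static int same_group(const uint8_t a[16], const uint8_t b[16])
{
    uint8_t ma[16], mb[16];

    solicited_node_mcast(a, ma);
    solicited_node_mcast(b, mb);
    return memcmp(ma, mb, 16) == 0;
}
```

Since the low 24 bits normally differ per host, each ipv6 host adds one more distinct group on top of the shared ff02::1 group, which is consistent with the mlid exhaustion described above.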

besides not running ipv6, are there any thoughts about this?




Re: [openib-general] CM callbacks

2007-01-29 Thread Sean Hefty
Eric Barton wrote:
> Is the following possible?
> 
> 1. I listen for connection requests.
> 
> 2. RDMA_CM_EVENT_CONNECT_REQUEST is delivered, I rdma_accept() successfully and
>    return from the callback.
> 
> 3. RDMA_CM_EVENT_DISCONNECTED is delivered.
> 
> Am I wrong to assume I can only get RDMA_CM_EVENT_DISCONNECTED after I've seen
> RDMA_CM_EVENT_ESTABLISHED?  I thought I'd get one of the other callbacks
> (e.g. RDMA_CM_EVENT_CONNECT_ERROR) if something went wrong before the
> ESTABLISHED callback.

This is possible.  To see why, we need to follow the IB CM protocol:

client                          server
                                listen
connect
send REQ
                                recv REQ -> causes CONNECT_REQUEST
                                accept
                                send REP
recv REP
send RTU
RTU wanders away and gets lost
ESTABLISHED
disconnect
send DREQ
                                recv DREQ - DISCONNECTED event

From the viewpoint of the client, a connection was established, and data could
have been transferred over the connection.
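
On the server side this means a handler has to tolerate DISCONNECTED arriving directly after the accept. A rough sketch of such a state machine, with made-up state and event names (these are not the rdma_cm enums), might look like this:

```c
#include <assert.h>

/* Hypothetical names for illustration; these are not the rdma_cm API. */
enum conn_state { ST_LISTEN, ST_ACCEPTED, ST_CONNECTED, ST_CLOSED };
enum conn_event { EV_CONNECT_REQUEST, EV_ESTABLISHED, EV_DISCONNECTED };

/* Advance the server-side state. DISCONNECTED is handled from
 * ST_ACCEPTED as well as ST_CONNECTED, because the RTU that would
 * have produced ESTABLISHED may have been lost on the way. */
static enum conn_state server_step(enum conn_state s, enum conn_event e)
{
    switch (e) {
    case EV_CONNECT_REQUEST:
        return s == ST_LISTEN ? ST_ACCEPTED : s;
    case EV_ESTABLISHED:
        return s == ST_ACCEPTED ? ST_CONNECTED : s;
    case EV_DISCONNECTED:
        return (s == ST_ACCEPTED || s == ST_CONNECTED) ? ST_CLOSED : s;
    }
    return s; /* unknown event: leave the state unchanged */
}
```

The point of the sketch is only that cleanup should not be keyed on having seen ESTABLISHED first.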

- Sean




Re: [openib-general] OFED 1.2 release - to be reviewed in the meeting today

2007-01-29 Thread Michael S. Tsirkin
> Quoting Hal Rosenstock <[EMAIL PROTECTED]>:
> Subject: Re: OFED 1.2 release - to be reviewed in the meeting today
> 
> On Mon, 2007-01-29 at 09:05, Tziporet Koren wrote:
> > Hi,
> > 
> > This is the proposal for OFED 1.2 branching and tagging:
> > 
> > Sources developed in OFA:
> > 1. Each git owner will open a branch with the name ofed_1_2. This
> > branch should be opened on 31-Jan (based on code readiness we will
> > review today).
> > 2. Vlad will open a new /pub/ofed_1_2.
> > 3. All ofed_1_2 branches will be cloned to this directory. (Note:
> > libibverbs and libmthca will be cloned from kernel.org for Roland's
> > trees.)
> 
> I'm confused about releasing the libraries as libxxx-.tar.gz. 
> How is this to be handled (aside from what is in the ofed_1.2 branch) ?

I think this is not covered by this proposal; it needs to be discussed separately.

-- 
MST




Re: [openib-general] OFED 1.2 release - to be reviewed in the meeting today

2007-01-29 Thread Hal Rosenstock
On Mon, 2007-01-29 at 09:05, Tziporet Koren wrote:
> Hi,
> 
> This is the proposal for OFED 1.2 branching and tagging:
> 
> Sources developed in OFA:
> 1. Each git owner will open a branch with the name ofed_1_2. This
> branch should be opened on 31-Jan (based on code readiness we will
> review today).
> 2. Vlad will open a new /pub/ofed_1_2.
> 3. All ofed_1_2 branches will be cloned to this directory. (Note:
> libibverbs and libmthca will be cloned from kernel.org for Roland's
> trees.)

I'm confused about releasing the libraries as libxxx-.tar.gz. 
How is this to be handled (aside from what is in the ofed_1.2 branch) ?

-- Hal

> 4. Any change that should be included in the next OFED package will be
> first check-in to the maintainer ofed_1_2 branch. 
> A mail should be sent to Vlad (and cc the list) to pull this
> change.
> 5. A tag will be set before any package is build. Tag name convention:
> ofed_1_2_ where version will be the suffix of OFED package
> (e.g. 1.2-alpha1)
> 6. OFED package will be built based on this tag.
> 7. There will be a build script (as in OFED 1.1) to enable each owner
> to build the package for testing.
> 
> MPI packages:
> 1. MPI packages are provided as source RPMs
> 2. Each MPI owner will have an account on the OFA server and will open
> a directory named ofed_1_2
> 3. The SRPM package will be placed in this directory, with version
> indication in the filename (e.g.ompi-1.2.1-xxx)
> 4. There will be a file named latest.txt that will contain the package
> that should be taken in the OFED package
> 
> Any other external packages that supplied as SRPs (e.g bonding) and
> not source will use the same method as above.
> 
> Tziporet
> 
> 
> 





Re: [openib-general] oops at device removal

2007-01-29 Thread Sean Hefty
> @@ -71,6 +70,7 @@ struct mcast_device {
>   int start_port;
>   int end_port;
>   struct mcast_port   port[0];
> + struct ib_event_handler event_handler;
>  };

The mcast_port data is allocated at the end of the structure, so port[0] must
stay the last member.  event_handler will need to be placed before it.
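
To illustrate the layout constraint, here is a small userspace sketch using simplified stand-in types (the struct and function names are assumptions, not the kernel definitions). The trailing array is sized at allocation time, so any member declared after it would overlap the port entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the kernel types; names are illustrative. */
struct port { int num; };
struct handler { void *ctx; };

struct device {
    int start_port;
    int end_port;
    struct handler event_handler;  /* fixed-size members come first */
    struct port port[];            /* flexible array member must be last */
};

/* Allocate a zeroed device with room for nports trailing port entries. */
static struct device *device_alloc(int nports)
{
    struct device *dev;

    dev = calloc(1, sizeof(*dev) + (size_t)nports * sizeof(dev->port[0]));
    if (!dev)
        return NULL;
    dev->start_port = 1;
    dev->end_port = nports;
    return dev;
}
```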

- Sean




Re: [openib-general] [PATCH] ib_sa/multicast: Fix crash when multiple HCAs are present

2007-01-29 Thread Michael S. Tsirkin
> @@ -742,9 +742,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
>  {
>   struct mcast_device *dev;
>  
> - dev = ib_get_client_data(event->device, &mcast_client);
> - if (!dev)
> - return;
> + dev = container_of(handler, struct mcast_device, event_handler);
>  
>   switch (event->event) {
>   case IB_EVENT_PORT_ERR:

I'm seeing crashes with the patch that I posted.
This seems identical to my patch except for the container_of trick. Right?
Is there a reason why ib_get_client_data won't work?


-- 
MST




[openib-general] [PATCH] ib_sa/multicast: Fix crash when multiple HCAs are present

2007-01-29 Thread Sean Hefty
We need to use a per device event handler, rather than a single,
global handler that gets reinitialized when a new device is added
to the system.

Signed-off-by: Sean Hefty <[EMAIL PROTECTED]>
---
diff --git a/drivers/infiniband/core/multicast.c b/drivers/infiniband/core/multicast.c
index fde977e..039f1eb 100644
--- a/drivers/infiniband/core/multicast.c
+++ b/drivers/infiniband/core/multicast.c
@@ -51,7 +51,6 @@ static struct ib_client mcast_client = {
 };
 
 static struct ib_sa_client sa_client;
-static struct ib_event_handler event_handler;
 static struct workqueue_struct *mcast_wq;
 static union ib_gid mgid0;
 
@@ -68,6 +67,7 @@ struct mcast_port {
 
 struct mcast_device {
struct ib_device*device;
+   struct ib_event_handler event_handler;
int start_port;
int end_port;
struct mcast_port   port[0];
@@ -742,9 +742,7 @@ static void mcast_event_handler(struct ib_event_handler *handler,
 {
struct mcast_device *dev;
 
-   dev = ib_get_client_data(event->device, &mcast_client);
-   if (!dev)
-   return;
+   dev = container_of(handler, struct mcast_device, event_handler);
 
switch (event->event) {
case IB_EVENT_PORT_ERR:
@@ -793,8 +791,8 @@ static void mcast_add_one(struct ib_device *device)
dev->device = device;
ib_set_client_data(device, &mcast_client, dev);
 
-   INIT_IB_EVENT_HANDLER(&event_handler, device, mcast_event_handler);
-   ib_register_event_handler(&event_handler);
+   INIT_IB_EVENT_HANDLER(&dev->event_handler, device, mcast_event_handler);
+   ib_register_event_handler(&dev->event_handler);
 }
 
 static void mcast_remove_one(struct ib_device *device)
@@ -807,7 +805,7 @@ static void mcast_remove_one(struct ib_device *device)
if (!dev)
return;
 
-   ib_unregister_event_handler(&event_handler);
+   ib_unregister_event_handler(&dev->event_handler);
flush_workqueue(mcast_wq);
 
for (i = 0; i <= dev->end_port - dev->start_port; i++) {
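
For readers unfamiliar with the container_of pattern used in the patch above, a minimal userspace sketch follows. The macro is a plain offsetof-based equivalent of the kernel one, and the struct and function names are stand-ins invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace equivalent of the kernel's container_of() macro. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct handler { int dummy; };        /* stand-in for struct ib_event_handler */

struct device {
    int id;
    struct handler event_handler;     /* one handler embedded per device */
};

/* Recover the owning device from the handler pointer a callback receives. */
static struct device *handler_to_device(struct handler *h)
{
    return container_of(h, struct device, event_handler);
}
```

Because each device embeds its own handler, the callback can map the handler pointer back to the right device without any lookup, even with several devices registered.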





[openib-general] CM callbacks

2007-01-29 Thread Eric Barton

Is the following possible?

1. I listen for connection requests.

2. RDMA_CM_EVENT_CONNECT_REQUEST is delivered, I rdma_accept() successfully and
   return from the callback.

3. RDMA_CM_EVENT_DISCONNECTED is delivered.

Am I wrong to assume I can only get RDMA_CM_EVENT_DISCONNECTED after I've seen
RDMA_CM_EVENT_ESTABLISHED?  I thought I'd get one of the other callbacks
(e.g. RDMA_CM_EVENT_CONNECT_ERROR) if something went wrong before the
ESTABLISHED callback.

-- 

Cheers,
Eric





Re: [openib-general] [RFT] [PATCH] Add ABI compatibility for apps linked against libibverbs 1.0

2007-01-29 Thread Roland Dreier
 > Updated patch is below.  I would still appreciate test reports with
 > other apps, but now I think I'm confident enough that I will push this
 > out on the libibverbs.git master branch soon.

OK, I've committed this to the libibverbs master branch and pushed it out.

I think I'm getting close to a libibverbs 1.1-rc1 release -- the only
items remaining on my todo list are to add stub low-level driver
methods for reregister memory region and memory window handling, so
that we have a chance at adding those things to later libibverbs 1.1
releases without breaking ABI.

 - R.




Re: [openib-general] [PATCH] The ibv_cmd_* create functions need to set the context.

2007-01-29 Thread Roland Dreier
Thanks, applied to master and stable branches.




Re: [openib-general] [RFC] Performance Manager

2007-01-29 Thread Sean Hefty
> Initially ? It is also an implementation phasing issue as stated. The
> core support is needed in both so there is very little unneeded work to
> get to the first phase in terms of a distributed approach. We would
> certainly grow/evolve towards this after that initial implementation.

Based on what you're saying, then a phased approach makes sense to me.

>>>1. Use vendor specific MADs (which can be RMPP'd) and build on top of
>>>this
>>>2. Use IPoIB which is much more powerful as sockets can then be utilized.
>>
>>You could also use RC QP communication up/down the hierarchy.
> 
> 
> Wouldn't that have the same issues as approach 1 (as compared with
> approach 2) ?

MAD overhead is significant.  You would get less overhead plus RDMA 
capabilities, which could affect the implementation design.

- Sean




Re: [openib-general] OFED 1.2 release - to be reviewed in the meeting today

2007-01-29 Thread Michael S. Tsirkin
> Quoting Hoang-Nam Nguyen <[EMAIL PROTECTED]>:
> Subject: Re: [openib-general] OFED 1.2 release - to be reviewed in the 
> meeting today
> 
> > > This looks pretty similar to ofed-1.1/SVN release procedure. What about
> > > discussed idea of "per package release by maintainers"?
> >
> > I guess there's no conflict: maintainers can make ofed_1_2 point to
> > their release. But I agree we need maintainers' buy-in and commitment to 
> > schedule
> > that matches OFED release schedule.
> 
> Can you explain me what "per package release by maintainers" means for me?

This really has to do with package versioning, not branching/tagging.
Basically this boils down to the agreement that maintainers create a release
of their package that matches code in OFED, taking care of library versioning
(that is, assigning a version number to this package).

Then
1. OFED versions the packages according to what maintainers do.
2. Properly versioned packages are available separately for other distributions.

You can look at how libibverbs is versioned to get the idea.

-- 
MST




Re: [openib-general] [PATCH] ofed_1_2 Copy in the library cq rptr address only for non T3A devices.

2007-01-29 Thread Vladimir Sokolovsky
On Mon, 2007-01-29 at 08:40 -0600, Steve Wise wrote:
> This fixes a bug with the rev 0 Chelsio T3 hardware...
> 
> It needs to be pulled into ofed_1_2.  
> 
> Roland, it will need to be merged in with the T3 rdma driver.  I'm
> maintaining this in my git tree, so I can resend it to you once you
> finish reviewing/merging the T3 driver.
> 
> Thanks,
> 
> Steve.
> 
> ---
> 
> 
> Don't copy in the library cq rptr address for T3A devices.
> 
> T3A doesn't support kernel bypass, so we must _not_ save off the lib's
> cq rptr address for these devices.  Otherwise the re-arm logic will try
> and use the library rptr value for T3A re-arm.
> 
> Signed-off-by: Steve Wise <[EMAIL PROTECTED]>
> ---

Applied.

-- 
Vladimir Sokolovsky <[EMAIL PROTECTED]>
Mellanox Technologies Ltd.




Re: [openib-general] OFED 1.2 release - to be reviewed in the meeting today

2007-01-29 Thread Hoang-Nam Nguyen
Hi,
> > This looks pretty similar to ofed-1.1/SVN release procedure. What about
> > discussed idea of "per package release by maintainers"?
> I guess there's no conflict: maintainers can make ofed_1_2 point to
> their release.
> But I agree we need maintainers' buy-in and commitment to schedule
> that matches
> OFED release schedule.
Can you explain to me what "per package release by maintainers" means for me?
Thx
Nam





Re: [openib-general] OFED 1.2 release - to be reviewed in the meeting today

2007-01-29 Thread Michael S. Tsirkin
> Quoting Sasha Khapyorsky <[EMAIL PROTECTED]>:
> Subject: Re: [openib-general] OFED 1.2 release - to be reviewed in the 
> meeting today
> 
> Hi Tziporet,
> 
> On 16:05 Mon 29 Jan , Tziporet Koren wrote:
> > 
> > This is the proposal for OFED 1.2 branching and tagging:
> > 
> > *Sources developed in OFA:*
> > 1. Each git owner will open a branch with the name ofed_1_2. This branch 
> > should be opened on 31-Jan (based on code readiness we will review today).
> > 2. Vlad will open a new /pub/ofed_1_2.
> > 3. All ofed_1_2 branches will be cloned to this directory. (Note: 
> > libibverbs and libmthca will be cloned from kernel.org for Roland's trees.)
> > 4. Any change that should be included in the next OFED package will be 
> > first check-in to the maintainer ofed_1_2 branch.
> >A mail should be sent to Vlad (and cc the list) to pull this change.
> 
> This looks pretty similar to ofed-1.1/SVN release procedure. What about
> discussed idea of "per package release by maintainers"?

I guess there's no conflict: maintainers can make ofed_1_2 point to their 
release.
But I agree we need maintainers' buy-in and commitment to schedule that matches
OFED release schedule.

-- 
MST




[openib-general] [Bug 325] RDMA_CM and address translation broken on sles9sp3

2007-01-29 Thread bugzilla-daemon
https://bugs.openfabrics.org/show_bug.cgi?id=325


[EMAIL PROTECTED] changed:

   What|Removed |Added

 AssignedTo|[EMAIL PROTECTED] |[EMAIL PROTECTED]




--- Comment #1 from [EMAIL PROTECTED]  2007-01-29 07:59 ---
This is an iwarp-only issue, I think.  And I believe I have a solution.


-- 
Configure bugmail: https://bugs.openfabrics.org/userprefs.cgi?tab=email
--- You are receiving this mail because: ---
You are the assignee for the bug, or are watching the assignee.




Re: [openib-general] [openfabrics-ewg] OFED 1.0 Install problems - Kernel Compile error - RH 2.6.9-42EL

2007-01-29 Thread Tziporet Koren
Snider, Tim wrote:
> Trying to install OFED 1.0 on RH EL 2.6.9-42. Recompile of kernel 
> gives redefinition of gfp_t. Can someone point me to a fix? I suspect 
> there's a kernel setting I need to tweak.

I think we never tested OFED 1.0 on this kernel.
Can you try OFED 1.1?

Tziporet




[openib-general] iWARP/Chelsio OFED_1_2 status

2007-01-29 Thread Steve Wise

Quick status update on iWARP and Chelsio Support:

- Most of the chelsio driver backports integrated into the ofed_1_2.
Awaiting rhel5 merge from vlad

- compile, load, and successful iwarp rping test on rhel4u4, rhel5 beta
2, and sles 10

- neighbour change notifications needed for rdma_cm users have been
implemented and posted for merging into ofed_1_2.  Awaiting merge from
vlad

- sles9sp3 bug is prohibiting rdma_cm functionality.  Need a fix for
this (bug 325)

- ammasso driver/lib:  I could add support for this, but it will be late.
Does the group see value in adding ammasso?



Steve.










[openib-general] [PATCH] ofed_1_2 Copy in the library cq rptr address only for non T3A devices.

2007-01-29 Thread Steve Wise
This fixes a bug with the rev 0 Chelsio T3 hardware...

It needs to be pulled into ofed_1_2.  

Roland, it will need to be merged in with the T3 rdma driver.  I'm
maintaining this in my git tree, so I can resend it to you once you
finish reviewing/merging the T3 driver.

Thanks,

Steve.

---


Don't copy in the library cq rptr address for T3A devices.

T3A doesn't support kernel bypass, so we must _not_ save off the lib's
cq rptr address for these devices.  Otherwise the re-arm logic will try
and use the library rptr value for T3A re-arm.

Signed-off-by: Steve Wise <[EMAIL PROTECTED]>
---

 drivers/infiniband/hw/cxgb3/iwch_provider.c |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/infiniband/hw/cxgb3/iwch_provider.c b/drivers/infiniband/hw/cxgb3/iwch_provider.c
index 28be418..dbb3f71 100644
--- a/drivers/infiniband/hw/cxgb3/iwch_provider.c
+++ b/drivers/infiniband/hw/cxgb3/iwch_provider.c
@@ -151,7 +151,7 @@ static struct ib_cq *iwch_create_cq(stru
if (!chp)
return ERR_PTR(-ENOMEM);
 
-   if (context) {
+   if (context && !t3a_device(rhp)) {
if (ib_copy_from_udata(&ureq, udata, sizeof (ureq))) {
kfree(chp);
return ERR_PTR(-EFAULT);





Re: [openib-general] OFED 1.2 release - to be reviewed in the meeting today

2007-01-29 Thread Sasha Khapyorsky
Hi Tziporet,

On 16:05 Mon 29 Jan , Tziporet Koren wrote:
> 
> This is the proposal for OFED 1.2 branching and tagging:
> 
> *Sources developed in OFA:*
> 1. Each git owner will open a branch with the name ofed_1_2. This branch 
> should be opened on 31-Jan (based on code readiness we will review today).
> 2. Vlad will open a new /pub/ofed_1_2.
> 3. All ofed_1_2 branches will be cloned to this directory. (Note: 
> libibverbs and libmthca will be cloned from kernel.org for Roland's trees.)
> 4. Any change that should be included in the next OFED package will be 
> first check-in to the maintainer ofed_1_2 branch.
>A mail should be sent to Vlad (and cc the list) to pull this change.

This looks pretty similar to ofed-1.1/SVN release procedure. What about
discussed idea of "per package release by maintainers"?

Sasha

> 5. A tag will be set before any package is build. Tag name convention: 
> ofed_1_2_ where version will be the suffix of OFED package 
> (e.g. 1.2-alpha1)
> 6. OFED package will be built based on this tag.
> 7. There will be a build script (as in OFED 1.1) to enable each owner to 
> build the package for testing.
> 
> *MPI packages:*
> 1. MPI packages are provided as source RPMs
> 2. Each MPI owner will have an account on the OFA server and will open a 
> directory named ofed_1_2
> 3. The SRPM package will be placed in this directory, with version 
> indication in the filename (e.g.ompi-1.2.1-xxx)
> 4. There will be a file named latest.txt that will contain the package 
> that should be taken in the OFED package
> 
> Any other external packages that supplied as SRPs (e.g bonding) and not 
> source will use the same method as above.
> 
> Tziporet
> 
> 





[openib-general] OFED 1.2 release - to be reviewed in the meeting today

2007-01-29 Thread Tziporet Koren

Hi,

This is the proposal for OFED 1.2 branching and tagging:

*Sources developed in OFA:*
1. Each git owner will open a branch with the name ofed_1_2. This branch 
should be opened on 31-Jan (based on code readiness we will review today).

2. Vlad will open a new /pub/ofed_1_2.
3. All ofed_1_2 branches will be cloned to this directory. (Note: 
libibverbs and libmthca will be cloned from kernel.org for Roland's trees.)
4. Any change that should be included in the next OFED package will
first be checked in to the maintainer's ofed_1_2 branch.
   A mail should be sent to Vlad (and cc the list) to pull this change.
5. A tag will be set before any package is built. Tag name convention:
ofed_1_2_ where version will be the suffix of OFED package
(e.g. 1.2-alpha1)

6. OFED package will be built based on this tag.
7. There will be a build script (as in OFED 1.1) to enable each owner to 
build the package for testing.


*MPI packages:*
1. MPI packages are provided as source RPMs
2. Each MPI owner will have an account on the OFA server and will open a 
directory named ofed_1_2
3. The SRPM package will be placed in this directory, with a version
indication in the filename (e.g. ompi-1.2.1-xxx)
4. There will be a file named latest.txt that will contain the package 
that should be taken in the OFED package


Any other external packages that are supplied as SRPMs (e.g. bonding) and not
as source will use the same method as above.


Tziporet



[openib-general] OFED 1.0 Install problems - Kernel Compile error - RH 2.6.9-42EL

2007-01-29 Thread Snider, Tim
Trying to install OFED 1.0 on RH EL 2.6.9-42. Recompile of kernel gives
redefinition of gfp_t. Can someone point me to a fix? I suspect there's
a kernel setting I need to tweak.
 
[EMAIL PROTECTED] ~]# uname -a
Linux FedoraCore121 2.6.9-42.EL_lustre.1.5.95smp #1 SMP Thu Sep 28
06:36:13 MDT 2006 i686 i686 i386 GNU/Linux

[EMAIL PROTECTED] ~]# vim /tmp/OFED.1479.log
  gcc
-Wp,-MD,/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/.index.o.d
-nostdinc -iwithprefix include -D__KERNEL__
-I/var/tmp/OFED/tmp/openib/openib/include
-I/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/include
-Iinclude  -Iinclude2 -I/usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include
-I/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core -Wall
-Wstrict-prototypes -Wno-trigraphs -fno-strict-aliasing -fno-common -Os
-fomit-frame-pointer -Wdeclaration-after-statement -pipe -msoft-float
-m32 -fno-builtin-sprintf -fno-builtin-log2 -fno-builtin-puts
-mpreferred-stack-boundary=2 -fno-unit-at-a-time -march=i686 -mregparm=3
-I/usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/asm-i386/mach-generic
-Iinclude/asm-i386/mach-generic
-I/usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/asm-i386/mach-default
-Iinclude/asm-i386/mach-default
-I/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/include
-I/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/ulp/ipoib
-I/var/tmp/OFED/tmp/openib/openib/drivers/infiniband/debug -D__nocast=
-DMODULE -DKBUILD_BASENAME=index -DKBUILD_MODNAME=findex -c -o
/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/.tmp_index.o
/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/index.c
In file included from /usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/linux/slab.h:15,
                 from /var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/include/linux/slab.h:4,
                 from /usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/linux/percpu.h:4,
                 from /usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/linux/sched.h:31,
                 from /var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/include/linux/sched.h:4,
                 from /usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/linux/module.h:10,
                 from /var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/index.c:34:
/usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/linux/gfp.h:133: error: redefinition of typedef 'gfp_t'
/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/include/linux/types.h:7: error: previous declaration of 'gfp_t' was here
In file included from /usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/linux/percpu.h:4,
                 from /usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/linux/sched.h:31,
                 from /var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/include/linux/sched.h:4,
                 from /usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/linux/module.h:10,
                 from /var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/index.c:34:
/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/include/linux/slab.h:8: error: conflicting types for 'kzalloc'
/usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/linux/slab.h:101: error: previous declaration of 'kzalloc' was here
In file included from /usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/linux/module.h:10,
                 from /var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/index.c:34:
/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/include/linux/sched.h:8: error: static declaration of 'wait_for_completion_timeout' follows non-static declaration
/usr/src/linux-2.6.9-42.EL_lustre.1.5.95/include/linux/completion.h:32: error: previous declaration of 'wait_for_completion_timeout' was here
make[5]: *** [/var/tmp/OFED/tmp/openib/openib/src/linux-kernel/infiniband/core/index.o] Error 1

Timothy Snider 
Storage Architect
Strategic Planning, Technology and Architecture

LSI Logic Corporation
3718 North Rock Road
Wichita, KS 67226
(316) 636-8736 
[EMAIL PROTECTED]   

 




Re: [openib-general] [PATCH] ofed_1_2 Backport Chelsio to rhel5 (2.6.18_FC6).

2007-01-29 Thread Tziporet Koren
Michael S. Tsirkin wrote:
>>
>
> Actually, what's the reason to keep these backports around still?
> Vlad, let's remove.
>
>   
We will keep only the backport for FC6 and remove the one for FC4.

Tziporet




Re: [openib-general] [openfabrics-ewg] modules compilation status for OFED 1.2

2007-01-29 Thread Tziporet Koren
Betsy Zeller wrote:
> Bryan is working on recreating the backport patches for the ipath
> driver. It appears that all of the InfiniPath backport patches were
> removed from the OFED source tree late last year. 
>
> By early next week, we'll have a better sense of whether any of these
> patches will need to come in after Jan 31.
>
>   
Bryan - any news?

Tziporet




Re: [openib-general] [RFC] Performance Manager

2007-01-29 Thread Hal Rosenstock
On Fri, 2007-01-26 at 18:15, Sean Hefty wrote:
> >There are numerous PerfManager models which can be supported:
> >1. Integrated as thread(s) with OpenSM (run only when SM is master)
> >2. Standby SM
> >3. Standalone PerfManager (not running with master or standby SM)
> >4. Distributed PerfManager (most scalable approach)
> 
> IMO, we will eventually need distributed managers, 

Yes.

> so I would go with the last approach.

Initially ? It is also an implementation phasing issue as stated. The
core support is needed in both so there is very little unneeded work to
get to the first phase in terms of a distributed approach. We would
certainly grow/evolve towards this after that initial implementation.

> But, along those lines, if we had a distributed SM,

There has been some early discussion on a distributed SA. Distributing
SM is much harder IMO.

> would you still want to separate the performance manager from the SM?  
> It seems more flexible, but with additional load on the fabric.

Ideally, it would be a deployment choice and the implementation would
support both modes. The problem is that we've already seen that the SM
node has enough to do at times in a large cluster, so adding
PerfManagement, with its constant demands, on the same node is likely
not a good idea.

The additional fabric load is twofold: first, the reports for nodes
coming and going, and second, any intermanager communication. I don't
think the first is a significant load and I'm not yet sure about the
second. In any case, the second load can be constrained to the portion
of the subnet where the management nodes are in those cases where this
is a concern.

> >In terms of inter manager communication, there seem to be several
> >choices:
> >1. Use vendor specific MADs (which can be RMPP'd) and build on top of
> >this
> >2. Use IPoIB which is much more powerful as sockets can then be utilized.
> 
> You could also use RC QP communication up/down the hierarchy.

Wouldn't that have the same issues as approach 1 (as compared with
approach 2) ?

-- Hal

> - Sean





Re: [openib-general] [PATCH TRIVIAL] opensm: remove unused p_subn->node_lid_tbl field

2007-01-29 Thread Hal Rosenstock
On Sun, 2007-01-28 at 19:50, Sasha Khapyorsky wrote:
> This removes unused node_lid_tbl field in osm_subn_t structure.
> 
> Signed-off-by: Sasha Khapyorsky <[EMAIL PROTECTED]>

Thanks. Applied.

-- Hal





[openib-general] ofa_1_2_kernel 20070129-0200 daily build status

2007-01-29 Thread vlad
This email was generated automatically, please do not reply


Common build parameters:  --with-ipoib-mod --with-sdp-mod --with-srp-mod 
--with-user_mad-mod --with-user_access-mod --with-mthca-mod --with-core-mod 
--with-addr_trans-mod --with-cxgb3-mod 

Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.12
Passed on i686 with linux-2.6.17
Passed on i686 with linux-2.6.14
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.15
Passed on i686 with linux-2.6.13
Passed on i686 with linux-2.6.16
Passed on powerpc with linux-2.6.19
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.17
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.12
Passed on x86_64 with linux-2.6.15
Passed on x86_64 with linux-2.6.16
Passed on x86_64 with linux-2.6.14
Passed on x86_64 with linux-2.6.13
Passed on powerpc with linux-2.6.18
Passed on ppc64 with linux-2.6.12
Passed on powerpc with linux-2.6.13
Passed on powerpc with linux-2.6.17
Passed on ia64 with linux-2.6.19
Passed on ppc64 with linux-2.6.15
Passed on ppc64 with linux-2.6.19
Passed on powerpc with linux-2.6.12
Passed on powerpc with linux-2.6.14
Passed on powerpc with linux-2.6.15
Passed on ppc64 with linux-2.6.16
Passed on powerpc with linux-2.6.16
Passed on ppc64 with linux-2.6.13
Passed on ppc64 with linux-2.6.17
Passed on ia64 with linux-2.6.17
Passed on ia64 with linux-2.6.14
Passed on ia64 with linux-2.6.18
Passed on ppc64 with linux-2.6.14
Passed on ia64 with linux-2.6.16
Passed on ppc64 with linux-2.6.18
Passed on ia64 with linux-2.6.12
Passed on ia64 with linux-2.6.13
Passed on ia64 with linux-2.6.15

Failed:
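
A report in this format can be summarized per architecture with a short
script. The following is a minimal sketch, not part of the build system:
the function name tally is hypothetical, and the "Failed on ARCH with
KERNEL" line shape is an assumption mirrored from the "Passed on" lines,
since this report lists no failures.

```python
import re
from collections import defaultdict

# Matches result lines such as "Passed on i686 with linux-2.6.19".
# The "Failed on ..." form is assumed to mirror the "Passed on ..." form.
LINE_RE = re.compile(r"^(Passed|Failed) on (\S+) with (\S+)$")


def tally(report):
    """Group kernel names by architecture and by pass/fail status."""
    results = defaultdict(lambda: {"Passed": [], "Failed": []})
    for line in report.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            status, arch, kernel = match.groups()
            results[arch][status].append(kernel)
    return dict(results)


# A few lines copied from the report above; section headers such as
# "Passed:" and "Failed:" do not match the pattern and are skipped.
sample = """\
Passed:
Passed on i686 with 2.6.15-23-server
Passed on i686 with linux-2.6.19
Passed on powerpc with linux-2.6.19
Passed on x86_64 with linux-2.6.19
Passed on ia64 with linux-2.6.19

Failed:
"""

summary = tally(sample)
for arch in sorted(summary):
    passed = len(summary[arch]["Passed"])
    failed = len(summary[arch]["Failed"])
    print(arch, passed, "passed,", failed, "failed")
```

Keeping the kernel names rather than bare counts makes it easy to see
which kernel is missing for a given architecture when a build drops out.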
