[ewg] ofa_1_5_kernel 20091022-0200 daily build status

2009-10-22 Thread Vladimir Sokolovsky (Mellanox)
This email was generated automatically, please do not reply


git_url: git://git.openfabrics.org/ofed_1_5/linux-2.6.git
git_branch: ofed_kernel_1_5

Common build parameters: 

Passed:
Passed on i686 with linux-2.6.18
Passed on i686 with linux-2.6.19
Passed on i686 with linux-2.6.21.1
Passed on i686 with linux-2.6.26
Passed on i686 with linux-2.6.24
Passed on i686 with linux-2.6.22
Passed on i686 with linux-2.6.27
Passed on x86_64 with linux-2.6.16.60-0.54.5-smp
Passed on x86_64 with linux-2.6.16.60-0.21-smp
Passed on x86_64 with linux-2.6.18
Passed on x86_64 with linux-2.6.18-128.el5
Passed on x86_64 with linux-2.6.18-164.el5
Passed on x86_64 with linux-2.6.19
Passed on x86_64 with linux-2.6.20
Passed on x86_64 with linux-2.6.18-93.el5
Passed on x86_64 with linux-2.6.21.1
Passed on x86_64 with linux-2.6.24
Passed on x86_64 with linux-2.6.22
Passed on x86_64 with linux-2.6.26
Passed on x86_64 with linux-2.6.27
Passed on x86_64 with linux-2.6.25
Passed on x86_64 with linux-2.6.9-67.ELsmp
Passed on x86_64 with linux-2.6.9-78.ELsmp
Passed on ia64 with linux-2.6.18
Passed on ia64 with linux-2.6.19
Passed on ia64 with linux-2.6.23
Passed on ia64 with linux-2.6.21.1
Passed on ia64 with linux-2.6.22
Passed on ia64 with linux-2.6.26
Passed on ia64 with linux-2.6.24
Passed on ia64 with linux-2.6.25
Passed on ppc64 with linux-2.6.18
Passed on ppc64 with linux-2.6.19

Failed:
___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg


[ewg] RE: Agenda for EWG/OFED meeting on next Monday

2009-10-22 Thread Woodruff, Robert J
I also did a quick test of the OFED-1.5-RDMA stack with Intel MPI on
EL5.3, x86_64 and Itanium. I was able to get both to run OK,
although on Itanium, the startup script still tries to load the MLX4
driver and it fails to load. If I disable that, it all seems to work fine.

woody



From: Woodruff, Robert J
Sent: Monday, October 19, 2009 10:13 AM
To: 'Tziporet Koren'; ewg@lists.openfabrics.org
Subject: RE: Agenda for EWG/OFED meeting on next Monday

For my team,
we have been testing the following on
small clusters, 16 nodes or less.

OS
 - RHEL 5.3 and 5.4

Arch:
   - X86_64, ia64

ULPs

   OpenSM, Intel MPI over IPoIB, Intel MPI over uDAPL, ibutils and
   management tools

IHVs

  Mellanox, mthca and mlx4
  Intel (NetEffect) iWarp


For uDAPL, we are testing the latest package on a cluster of 338 nodes with
Intel MPI, but that cluster is still running the older base OFED.


From: ewg-boun...@lists.openfabrics.org 
[mailto:ewg-boun...@lists.openfabrics.org] On Behalf Of Tziporet Koren
Sent: Monday, October 19, 2009 8:31 AM
To: Tziporet Koren; ewg@lists.openfabrics.org
Subject: [ewg] RE: Agenda for EWG/OFED meeting on next Monday

Mellanox testing for OFED 1.5
=============================

Mellanox tests the OFED-RDMA package on most systems, and OFED on only a few machines.

We test all Mellanox HCAs, with the main focus on ConnectX and ConnectX-2 with QDR.


OS:
-  RHEL4: up6, up7, up8
-  RHEL5: up2, up3, up4
-  SLES10 SP2
-  SLES10 SP3 (not started)
-  SLES11
-  OEL5 up2
-  CentOS5: up2, up3
-  Kernel.org: 2.6.29, 2.6.30

Arch:
-  X64
-  x86_64
-  ppc64
-  ia64 - partial testing only

ULPs:
-  mvapich
-  Open MPI
-  IPoIB (with bonding too)
-  SDP
-  SRP
-  RDS
-  NFS/RDMA
-  Performance tests

Management:
-  OpenSM on the host
-  Management utilities
-  ibutils

[ewg] Re: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-22 Thread David J. Wilder

On Wed, 2009-10-21 at 17:08 -0600, Jason Gunthorpe wrote:


> This looks exactly like what I was thinking of - have you tested this?

Yes, I did do some testing, but that brings up a good question: I am not
sure what all should be tested.  I am running rping with different
destination addresses (and scoping).  On the ipv4 side:
rping -c -a ip-of-my-ib0-interface
rping -c -a ip-of-remote-nodes-ib0-interface

For ipv6 I ran what I described previously.  What I still need to do is add
an option to rping to specify a source address and run it with various
addresses.  Any help you can give defining what exactly needs to be tested
would be appreciated.
 
 
> If it is OK, I'd make it the first in the series.
>
> There were two things I was not sure about in my example.
>  1) Is 'init_net.loopback_dev' the correct reference for the loop device? Or
> is it something like dev_net(rt->idev->dev)->loopback_dev ?
>
> I'm sensing it may be the latter, but can't investigate right now
> Donno much about this new namespace stuff really

I think you may be correct; I will look at that more closely.  I did
explicitly verify the test worked in both cases.
 
 
>  2) Was rt->idev->dev the right choice for the ipv4 case? Or is it
> rt->u.dst.dev ?
>
> The TCP case kinda looks like
> int tcp_v4_connect(struct sock *sk, struct sockaddr *uaddr, int addr_len)
>         tmp = ip_route_connect(&rt, nexthop, inet->saddr,
>                                RT_CONN_FLAGS(sk), sk->sk_bound_dev_if,
>                                IPPROTO_TCP,
>                                inet->sport, usin->sin_port, sk, 1);
>         sk_setup_caps(sk, &rt->u.dst);
>
> void sk_setup_caps(struct sock *sk, struct dst_entry *dst)
>         __sk_dst_set(sk, dst);
>
> And all later things key off the sk_get_dst. So I'm thinking
> that u.dst.dev might be correct.
>
> I have no idea what the difference is though (can't look too hard
> right now)
>
> The main other fixup I see is to remove
>         ret = cma_bind_addr(id, src_addr, dst_addr);
>
> From rdma_resolve_addr and rely on the routing lookup in
> addr_resolve_remote called by addr_resolve_ip to setup the bind device
> from the routing lookup. (This is what I mentioned in my last email)
>
> Which then lets you fixup the checking and handling of the
> sin6_scopeid on the source address - and fixes the main other routing
> difference against the TCP stack.
>
> Thanks for working on this!
>
> Jason

Lots of discussion :) I will go through the mails, address the comments
and post the entire series of patches.

Thanks for all your input.

Dave. 
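
For readers following the sin6_scopeid point in the quoted text: a
minimal userspace sketch, not part of the patch under discussion, of why
a link-local IPv6 address needs a scope. The helper name
make_sockaddr_in6 is hypothetical and exists only for illustration;
sin6_scope_id, IN6_IS_ADDR_LINKLOCAL, inet_pton and if_nametoindex are
the standard sockets API names. It assumes a POSIX/Linux system and says
nothing about how the kernel RDMA CM code performs the check.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <net/if.h>
#include <netinet/in.h>
#include <string.h>

/*
 * Hypothetical helper: fill *sa from a textual IPv6 address.
 * For a link-local address (fe80::/10) an interface name is required,
 * because the same link-local address can exist on every link; the
 * interface index is stored in sin6_scope_id to disambiguate.
 * Returns 0 on success, -1 if the text does not parse or if a
 * link-local address comes without a usable interface name.
 */
int make_sockaddr_in6(struct sockaddr_in6 *sa, const char *addr,
                      const char *ifname)
{
    assert(sa != NULL && addr != NULL);

    memset(sa, 0, sizeof(*sa));
    sa->sin6_family = AF_INET6;

    if (inet_pton(AF_INET6, addr, &sa->sin6_addr) != 1)
        return -1;

    if (IN6_IS_ADDR_LINKLOCAL(&sa->sin6_addr)) {
        /* if_nametoindex() returns 0 when the interface is unknown. */
        unsigned int idx = ifname ? if_nametoindex(ifname) : 0;

        if (idx == 0)
            return -1;
        sa->sin6_scope_id = idx;
    }
    return 0;
}
```

With this helper, a call such as make_sockaddr_in6(&sa, "fe80::1", "ib0")
succeeds only if an interface named ib0 exists on the host, while a
global address such as 2001:db8::1 needs no interface name and leaves
sin6_scope_id at zero.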



[ewg] RE: [PATCH] link-local address fix for rdma_resolve_addr

2009-10-22 Thread Sean Hefty
> For ipv6 I ran what I described previously.  What I do need to do is add
> the option to rping to specify a source address and run it with various
> addresses.  Any help you can give defining what exactly needs to be tested
> would be appreciated.

You can also test with ucmatose to verify ipv4 still works.  Use the -b option
to bind to a specific address.

- Sean 



[ewg] RE: OFED 1.4.2 requires rebuild of kernel modules on each node

2009-10-22 Thread Woodruff, Robert J
Sending this to the EWG openfabrics list,
since this seems to be an OFED build/installation issue
rather than a general code problem.

One thing you might try: instead of copying the entire build
directory and re-running ./install.pl -c ofed.conf on each system,
build on one node, then copy just the binary RPMS directory and
the uninstall script to the other nodes.
Then run the uninstall script and
install the RPMs manually, e.g.,

./uninstall.sh
cd ./RPMS/redhat-release-/x86_64
rpm -i *

This method has worked for me in the past. 

woody


-----Original Message-----
From: linux-rdma-ow...@vger.kernel.org 
[mailto:linux-rdma-ow...@vger.kernel.org] On Behalf Of Bryan
Sent: Thursday, October 22, 2009 11:22 AM
To: linux-r...@vger.kernel.org
Subject: OFED 1.4.2 requires rebuild of kernel modules on each node

I was referred to this list by the general mailing list on OFED.
Emailing from my personal address since Lotus Notes insists that
anything it sends has to contain some portion of HTML.

This problem was observed on Red Hat Enterprise Linux 5 update 3.  I
searched the list but did not see anything immediately applicable.
We've seen similar issues in the past, where we were able to modify the
script to handle an RPM that didn't match the expected naming scheme,
but nothing stood out when looking at the scripts for
this version.

Copied from an internal bug reporting tool:

On installing OFED 1.4.2, the tarball was extracted. In the directory the
code was extracted to, ./install.pl was run and all components of OFED were
built/installed with the default settings.  Then this directory was
copied to another node, and ./install.pl -c ofed.conf was run.  Previously
this would just install the already built components, but with
OFED 1.4.2, the kernel RPM gets rebuilt when this is done.

This means that the build tools have to be on each node, and that
deployment of OFED takes longer.

Bryan Reese -- bre...@us.ibm.com
e1350 Linux Cluster Test Engineer
--
To unsubscribe from this list: send the line unsubscribe linux-rdma in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html