RE: [ewg] FW: complete test summary

2008-12-14 Thread Rupert Dance
Tziporet,

They have tested with same-vendor HCAs and the MPI tests are passing. 

Rupert

-Original Message-
From: Tziporet Koren [mailto:tzipo...@dev.mellanox.co.il] 
Sent: Thursday, December 11, 2008 5:22 AM
To: Rupert Dance
Cc: ewg@lists.openfabrics.org; ofa...@postal.iol.unh.edu
Subject: Re: [ewg] FW: complete test summary

Rupert Dance wrote:
> Hello Tziporet,
>
> Here is the final UNH IOL summary report of testing done on RC6. 
>   
Many thanks
> Regarding the mandatory tests, the Link Init failure is a specific 
> vendor issue and the IPoIB failure has been documented in Bug 1287.
>
> The MPI failures in the Beta tests are being researched now. I have 
> asked Jeff and Arlin Davis to look into these failures. UNH has noted 
> that the failure only occurs when the cluster includes HCAs from 
> multiple vendors and when the number of processors exceeds 38. Jeff 
> made the following comment: "I'm not entirely surprised that OMPI 
> fails when used with multiple vendor HCAs; I don't know if anyone has 
> ever tested that before...?  I would not make it a requirement for 
> passing that OMPI has to work in a single MPI job with multiple vendor 
> HCAs; I don't know of many (any?) real-world environments that do this."
>   
Can you test with the same vendor HCA on all nodes and see if this passes?
> Thanks
>
> Rupert
>
> -Original Message-
> From: Nickolas Wood [mailto:n...@iol.unh.edu]
> Sent: Wednesday, December 10, 2008 9:41 AM
> To: Rupert Dance
> Cc: ofa...@postal.iol.unh.edu
> Subject: complete test summary
>
> Hi,
> It was my understanding that incremental status reports were 
> acceptable regarding the rc6 testing. I have been told that they were 
> not; therefore, I have combined the previous emails into one for 
> easier use.
>
> All the results below were gathered using the complete, multi-vendor 
> cluster with OFED 1.4 RC6 and the topology used during the debug 
> event. This results in a 62-process MPI cluster.
>
> Mandatory test results:
>    Link Init: FAIL - link speed issue
>    Fabric init: pass
>    IPoIB-Datagram: FAIL - initial packet loss
>    iSER: NA - no iSER target to test against
>    SRP: pass
>    SDP: pass
>
> BETA tests completed:
>    IPoIB-Connected: pass
>    mvapich1: pingping, pingpong tests - pass; all tests - FAIL
>    mvapich2: pingping, pingpong tests - pass; all tests - FAIL
>    openmpi: pingping, pingpong tests - pass; all tests - FAIL
>    intelmpi: pingping, pingpong tests - pass; all tests - FAIL
>    hpmpi: all tests - FAIL
>    dapltest: pass
>
> -Nick
>
>





Re: [ewg] FW: complete test summary

2008-12-11 Thread Tziporet Koren

Rupert Dance wrote:

Hello Tziporet,

Here is the final UNH IOL summary report of testing done on RC6. 
  

Many thanks

Regarding the mandatory tests, the Link Init failure is a specific vendor
issue and the IPoIB failure has been documented in Bug 1287.


The MPI failures in the Beta tests are being researched now. I have asked
Jeff and Arlin Davis to look into these failures. UNH has noted that the
failure only occurs when the cluster includes HCAs from multiple vendors and
when the number of processors exceeds 38. Jeff made the following comment:
"I'm not entirely surprised that OMPI fails when used with multiple vendor
HCAs; I don't know if anyone has ever tested that before...?  I would not
make it a requirement for passing that OMPI has to work in a single MPI job
with multiple vendor HCAs; I don't know of many (any?) real-world
environments that do this."
  

Can you test with the same vendor HCA on all nodes and see if this passes?

Thanks

Rupert

-Original Message-
From: Nickolas Wood [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 10, 2008 9:41 AM

To: Rupert Dance
Cc: [EMAIL PROTECTED]
Subject: complete test summary

Hi,
It was my understanding that incremental status reports were acceptable
regarding the rc6 testing. I have been told that they were not; therefore, I
have combined the previous emails into one for easier use.

All the results below were gathered using the complete, multi-vendor
cluster with OFED 1.4 RC6 and the topology used during the debug event.
This results in a 62-process MPI cluster.

Mandatory test results:
   Link Init: FAIL - link speed issue
   Fabric init: pass
   IPoIB-Datagram: FAIL - initial packet loss
   iSER: NA - no iSER target to test against
   SRP: pass
   SDP: pass

BETA tests completed:
   IPoIB-Connected: pass
   mvapich1: pingping, pingpong tests - pass; all tests - FAIL
   mvapich2: pingping, pingpong tests - pass; all tests - FAIL
   openmpi: pingping, pingpong tests - pass; all tests - FAIL
   intelmpi: pingping, pingpong tests - pass; all tests - FAIL
   hpmpi: all tests - FAIL
   dapltest: pass

-Nick






[ewg] FW: complete test summary

2008-12-10 Thread Rupert Dance
Hello Tziporet,

Here is the final UNH IOL summary report of testing done on RC6. 

Regarding the mandatory tests, the Link Init failure is a specific vendor
issue and the IPoIB failure has been documented in Bug 1287.

The MPI failures in the Beta tests are being researched now. I have asked
Jeff and Arlin Davis to look into these failures. UNH has noted that the
failure only occurs when the cluster includes HCAs from multiple vendors and
when the number of processors exceeds 38. Jeff made the following comment:
"I'm not entirely surprised that OMPI fails when used with multiple vendor
HCAs; I don't know if anyone has ever tested that before...?  I would not
make it a requirement for passing that OMPI has to work in a single MPI job
with multiple vendor HCAs; I don't know of many (any?) real-world
environments that do this."
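
For reference, the pingping/pingpong tests mentioned in the results are simple
point-to-point exchanges between a pair of ranks. A minimal ping-pong sketch in
C against the standard MPI API (illustrative only; not the UNH IOL test code)
would look roughly like this:

    /*
     * Minimal MPI ping-pong sketch: rank 0 bounces a small buffer off
     * rank 1 and reports the average round-trip time. Illustrative only.
     */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, size, i;
        const int iters = 1000;
        char buf[8] = {0};

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (size < 2) {
            if (rank == 0)
                fprintf(stderr, "run with at least 2 processes\n");
            MPI_Finalize();
            return 1;
        }

        double t0 = MPI_Wtime();
        for (i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, sizeof(buf), MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
            } else if (rank == 1) {
                MPI_Recv(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                         MPI_STATUS_IGNORE);
                MPI_Send(buf, sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double t1 = MPI_Wtime();

        if (rank == 0)
            printf("average round trip: %.2f us over %d iterations\n",
                   (t1 - t0) * 1e6 / iters, iters);

        MPI_Finalize();
        return 0;
    }

With Open MPI, restricting a job to one vendor's HCA (as requested in the
replies above) is typically done at launch time through the openib BTL
parameters, for example something like "mpirun --mca btl openib,self --mca
btl_openib_if_include <device>"; the exact parameter and device name depend on
the Open MPI version and the vendor's driver.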

Thanks

Rupert

-Original Message-
From: Nickolas Wood [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, December 10, 2008 9:41 AM
To: Rupert Dance
Cc: [EMAIL PROTECTED]
Subject: complete test summary

Hi,
It was my understanding that incremental status reports were acceptable
regarding the rc6 testing. I have been told that they were not; therefore, I
have combined the previous emails into one for easier use.

All the results below were gathered using the complete, multi-vendor
cluster with OFED 1.4 RC6 and the topology used during the debug event.
This results in a 62-process MPI cluster.

Mandatory test results:
   Link Init: FAIL - link speed issue
   Fabric init: pass
   IPoIB-Datagram: FAIL - initial packet loss
   iSER: NA - no iSER target to test against
   SRP: pass
   SDP: pass

BETA tests completed:
   IPoIB-Connected: pass
   mvapich1: pingping, pingpong tests - pass; all tests - FAIL
   mvapich2: pingping, pingpong tests - pass; all tests - FAIL
   openmpi: pingping, pingpong tests - pass; all tests - FAIL
   intelmpi: pingping, pingpong tests - pass; all tests - FAIL
   hpmpi: all tests - FAIL
   dapltest: pass

-Nick
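
Since the MPI failures above correlate with mixing HCAs from different
vendors, one quick check is to record which device each node actually
exposes. A minimal sketch using libibverbs (assumes the libibverbs
development headers are installed and links with -libverbs; not part of the
original test report):

    /*
     * Print each InfiniBand device on this node with its vendor and
     * part IDs, so the HCA mix across the cluster can be compared.
     */
    #include <infiniband/verbs.h>
    #include <stdio.h>

    int main(void)
    {
        int num_devices, i;
        struct ibv_device **devices = ibv_get_device_list(&num_devices);

        if (!devices) {
            perror("ibv_get_device_list");
            return 1;
        }

        for (i = 0; i < num_devices; i++) {
            struct ibv_context *ctx = ibv_open_device(devices[i]);
            struct ibv_device_attr attr;

            if (!ctx)
                continue;
            if (ibv_query_device(ctx, &attr) == 0)
                printf("%s: vendor_id=0x%x vendor_part_id=0x%x fw=%s\n",
                       ibv_get_device_name(devices[i]),
                       attr.vendor_id, attr.vendor_part_id, attr.fw_ver);
            ibv_close_device(ctx);
        }

        ibv_free_device_list(devices);
        return 0;
    }

The stock ibv_devinfo utility shipped with libibverbs reports the same
information, so running it on every node is an equivalent way to confirm
whether the cluster is single- or multi-vendor.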


___
ewg mailing list
ewg@lists.openfabrics.org
http://lists.openfabrics.org/cgi-bin/mailman/listinfo/ewg