Re: [OMPI devel] vt compiler warnings and errors

2008-02-01 Thread Andreas Knüpfer
Hi everybody,

Now this is an interesting effect.

After a fresh checkout all files have the current time as their timestamp,
don't they? Is the timestamp explicitly saved somewhere?

Could it be that this is newer than Tim's local time yesterday? Maybe the
system time is not set to UTC or something like this? If so, then it should
be possible to reproduce this today. Could you give it a try, Tim?

Another cause could be slight differences in the files' times because one is
checked out earlier than the other. However, OTF's configure already ran
during the first global configure. Therefore, all files' timestamps should be
correct after this. So I don't believe in this explanation.
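
For background, make treats a generated file as stale when one of its
prerequisites carries a strictly newer modification time, which is why
checkout order or clock skew can trigger a re-run of automake and autoconf.
A minimal sketch of that comparison, assuming whole-second timestamps (the
function names are hypothetical and nothing here is taken from the VT or OTF
build system):

```c
#include <sys/stat.h>
#include <time.h>

/* make-style rule: the target is out of date when the prerequisite is
 * strictly newer. Equal timestamps count as up to date. */
static int is_out_of_date(time_t target_mtime, time_t prereq_mtime)
{
    return difftime(prereq_mtime, target_mtime) > 0.0;
}

/* Hypothetical wrapper: returns 1 if out of date, 0 if up to date, and
 * -1 if either file cannot be examined. */
static int file_out_of_date(const char *target, const char *prereq)
{
    struct stat t, p;

    if (stat(target, &t) != 0 || stat(prereq, &p) != 0) {
        return -1;
    }
    return is_out_of_date(t.st_mtime, p.st_mtime);
}
```

Calling file_out_of_date() on aclocal.m4 and configure.ac in a fresh checkout
would show whether the checked-in generated file looks older than its input.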

What do you think?


-- 
Dipl. Math. Andreas Knuepfer, 
Center for Information Services and 
High Performance Computing (ZIH), TU Dresden, 
Willersbau A114, Zellescher Weg 12, 01062 Dresden
phone +49-351-463-38323, fax +49-351-463-37773




Re: [OMPI devel] vt compiler warnings and errors

2008-02-01 Thread Ralf Wildenhues
* Jeff Squyres wrote on Thu, Jan 31, 2008 at 07:10:36PM CET:
> Ah -- I didn't notice this before -- do you have a configure script  
> committed to SVN?  If so, this could be the problem.

> > On Do, 2008-01-31 at 08:09 -0500, Tim Prins wrote:
[...]
> >> [tprins@sif test]$ make clean
> >> 
> >> Making clean in otf
> >> make[5]: Entering directory
> >> `/san/homedirs/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf'
> >>   cd . && /bin/sh /u/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf/missing --run automake-1.10 --gnu
> >> cd . && /bin/sh /u/tprins/sif/test/ompi/contrib/vt/vt/extlib/otf/missing --run autoconf
[...]

These files do not belong in SVN; they are generated by aclocal:
  ompi/contrib/vt/vt/extlib/otf/aclocal.m4
  ompi/contrib/vt/vt/aclocal.m4

Cheers,
Ralf


Re: [OMPI devel] VT in trunk + how to disable

2008-02-01 Thread Josh Hursey

Should the default be to *disable* vampirtrace?

I mention this since, I assume, most people do not depend on this  
tool for every Open MPI install. Meaning that Open MPI does not  
require this integration for correct MPI functionality unlike  
something like ROMIO [example of opt-out functionality which is 3rd  
party].


So I would suggest to the group that vampirtrace be an opt-in  
functionality.


What do others think?

-- Josh

On Jan 28, 2008, at 9:59 AM, Andreas Knüpfer wrote:


Hi everybody,

the vampirtrace integration arrived at the trunk today. There seems to be one
issue already, but we'll fix this asap.

As a general hint, this is how to completely disable anything we  
integrated:


configure --enable-contrib-no-build=vt ...

Then again, we'd like to see all the issues you may encounter and  
fix them.


Best regards, Andreas

--
Dipl. Math. Andreas Knuepfer,
Center for Information Services and
High Performance Computing (ZIH), TU Dresden,
Willersbau A114, Zellescher Weg 12, 01062 Dresden
phone +49-351-463-38323, fax +49-351-463-37773
___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel





Re: [OMPI devel] vt compiler warnings and errors

2008-02-01 Thread Jeff Squyres

On Feb 1, 2008, at 5:35 AM, Ralf Wildenhues wrote:


These files do not belong in SVN, they are generated by aclocal:
 ompi/contrib/vt/vt/extlib/otf/aclocal.m4
 ompi/contrib/vt/vt/aclocal.m4



I think both of these have their own configure scripts, meaning that  
they were autoconfed/automaked/whatever before they were put into OMPI.


And in hindsight, this fits in with exactly what our original goal  
was: take a VT tarball and dump it into OMPI's SVN.  Doh!


So I think the question still remains: can we hook VT's autoconf (et  
al.) requirements into the top-level autogen.sh so that the trunk copy  
of vt doesn't have configure/aclocal.m4/etc. and OMPI's top-level  
autogen.sh will create them?


--
Jeff Squyres
Cisco Systems



[OMPI devel] 32 bit openib btl warnings

2008-02-01 Thread Jeff Squyres

I noticed these in IBM's MTT runs on the rhc branch last night:

btl_openib_frag.c: In function 'out_constructor':
btl_openib_frag.c:74: warning: cast from pointer to integer of different size
btl_openib_frag.c: In function 'recv_constructor':
btl_openib_frag.c:120: warning: cast from pointer to integer of different size
btl_openib_frag.c: In function 'get_constructor':
btl_openib_frag.c:141: warning: cast from pointer to integer of different size
btl_openib_lex.c:1740: warning: 'yy_flex_realloc' defined but not used

This should be fairly recent with the OMPI trunk; I seem to recall  
seeing Ralph merge yesterday.


I don't test 32 bit builds; do we have some casting / size issues in  
32 bit?
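
GCC emits "cast from pointer to integer of different size" when a pointer is
cast directly to an integer type of another width, for example a 32-bit
pointer to a 64-bit integer. A minimal sketch of the usual remedy, assuming
the flagged lines store a pointer in a 64-bit field (the helper names are
hypothetical and not taken from btl_openib_frag.c):

```c
#include <stdint.h>

/* Hypothetical helper: widen a pointer to a 64-bit integer.
 * uintptr_t is pointer-sized, so the first cast never changes width;
 * the second cast is an ordinary integer conversion. */
static uint64_t ptr_to_u64(const void *p)
{
    return (uint64_t)(uintptr_t)p;
}

/* Hypothetical inverse: recover the pointer from the 64-bit value. */
static void *u64_to_ptr(uint64_t v)
{
    return (void *)(uintptr_t)v;
}
```

Going through uintptr_t keeps the same source warning-free on both 32-bit and
64-bit builds.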


--
Jeff Squyres
Cisco Systems



Re: [OMPI devel] 32 bit openib btl warnings

2008-02-01 Thread Tim Prins
These were fixed by Gleb yesterday in 
https://svn.open-mpi.org/trac/ompi/changeset/17346


Tim








Re: [OMPI devel] 32 bit openib btl warnings

2008-02-01 Thread Jeff Squyres

Cool; I missed that one -- thanks.

On Feb 1, 2008, at 9:25 AM, Tim Prins wrote:


These were fixed by Gleb yesterday in
https://svn.open-mpi.org/trac/ompi/changeset/17346

Tim





--
Jeff Squyres
Cisco Systems



[OMPI devel] More VT warnings

2008-02-01 Thread Tim Prins

With a fresh checkout, I get the following warnings:

vt_metric_papi.c:72: warning: no previous prototype for ‘vt_metric_error’
vt_metric_papi.c:86: warning: no previous prototype for ‘vt_metric_warning’
vt_metric_papi.c:100: warning: function declaration isn’t a prototype
vt_metric_papi.c: In function ‘vt_metric_descriptions’:
vt_metric_papi.c:126: warning: comparison between signed and unsigned
vt_metric_papi.c: At top level:
vt_metric_papi.c:147: warning: function declaration isn’t a prototype
vt_metric_papi.c:72: warning: no previous prototype for ‘vt_metric_error’
vt_metric_papi.c:86: warning: no previous prototype for ‘vt_metric_warning’
vt_metric_papi.c:100: warning: function declaration isn’t a prototype
vt_metric_papi.c: In function ‘vt_metric_descriptions’:
vt_metric_papi.c:126: warning: comparison between signed and unsigned
vt_metric_papi.c: At top level:
vt_metric_papi.c:147: warning: function declaration isn’t a prototype
vt_metric_papi.c:72: warning: no previous prototype for ‘vt_metric_error’
vt_metric_papi.c:86: warning: no previous prototype for ‘vt_metric_warning’
vt_metric_papi.c:100: warning: function declaration isn’t a prototype
vt_metric_papi.c: In function ‘vt_metric_descriptions’:
vt_metric_papi.c:126: warning: comparison between signed and unsigned
vt_metric_papi.c: At top level:
vt_metric_papi.c:147: warning: function declaration isn’t a prototype
vt_metric_papi.c:72: warning: no previous prototype for ‘vt_metric_error’
vt_metric_papi.c:86: warning: no previous prototype for ‘vt_metric_warning’
vt_metric_papi.c:100: warning: function declaration isn’t a prototype
vt_metric_papi.c: In function ‘vt_metric_descriptions’:
vt_metric_papi.c:126: warning: comparison between signed and unsigned
vt_metric_papi.c: At top level:
vt_metric_papi.c:147: warning: function declaration isn’t a prototype


Note that this indicates that the file vt_metric_papi.c is being
compiled *3* times. I am not using a parallel make here. Any ideas why
it is compiling 3 times? It should not be a timing issue; the NFS server
and the system clock seem to be well synchronized.
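
The three kinds of warning above each have a conventional fix. A minimal
sketch, using hypothetical stand-in functions rather than the real
vt_metric_papi.c code:

```c
#include <stddef.h>

/* "no previous prototype": declare the function before defining it
 * (normally in a shared header), or make it static if it is file-local. */
void metric_error(int code);
void metric_error(int code)
{
    (void)code; /* unused in this sketch */
}

/* "function declaration isn't a prototype": write (void), not (),
 * for a function that takes no arguments. */
static int metric_count(void)
{
    return 3;
}

/* "comparison between signed and unsigned": give the loop index the
 * same unsigned type as the bound it is compared against. */
static size_t count_nonzero(const int *values, size_t n)
{
    size_t i, hits = 0;

    for (i = 0; i < n; i++) {
        if (values[i] != 0) {
            hits++;
        }
    }
    return hits;
}
```

Whether the VT functions should become static or gain header declarations
depends on whether other files call them, which the warnings alone do not
show.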



Thanks,

Tim


Re: [OMPI devel] VT in trunk + how to disable

2008-02-01 Thread Terry Dontje

Josh Hursey wrote:

Should the default be to *disable* vampirtrace?

I mention this since, I assume, most people do not depend on this  
tool for every Open MPI install. Meaning that Open MPI does not  
require this integration for correct MPI functionality unlike  
something like ROMIO [example of opt-out functionality which is 3rd  
party].


So I would suggest to the group that vampirtrace be an opt-in  
functionality.


What do others think?
  
I am not completely against disabling it as a default.  However, once it
builds consistently, having it enabled by default shouldn't really cause
any problems for those not directly using it (well, outside of more time
to compile).   I imagine changing the default probably would help ORTE
move forward, but then I wonder if we will run into issues of the vampire
stuff not being able to resolve their issues because of ORTE problems
put back to the trunk.


--td

-- Josh

  




Re: [OMPI devel] VT in trunk + how to disable

2008-02-01 Thread Jeff Squyres

I think my position is about the same as Terry's.

I also think we have a precedent for building everything that is  
possible and letting the user choose at run-time what they want to  
do.  My $0.02 is that it's easier to tell random users (and  
customers!) "yes, OMPI should have built that for you by default; you  
use it like this..." vs. "No, sorry, you need to go re-install OMPI to  
have feature X."


We developers are probably a bit more sensitive to this issue since it  
makes longer builds (and we re-build all the time).  But remember that  
most people install OMPI only a small number of times -- so build time  
is less of an issue for them.


(I'm assuming that at least one of your motivations for asking was the  
longer build time...?)






--
Jeff Squyres
Cisco Systems




Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-02-01 Thread Tim Prins

Adrian,

For the most part this seems to work for me. But there are a few issues. 
I'm not sure which are introduced by this patch, and whether some may be 
expected behavior. But for completeness I will point them all out. 
First, let me explain I am working on a machine with 3 tcp interfaces, 
lo, eth0, and ib0. Both eth0 and ib0 connect all the compute nodes.


1. There are some warnings when compiling:
btl_tcp_proc.c:171: warning: no previous prototype for 'evaluate_assignment'
btl_tcp_proc.c:206: warning: no previous prototype for 'visit'
btl_tcp_proc.c:224: warning: no previous prototype for 'mca_btl_tcp_initialise_interface'
btl_tcp_proc.c: In function `mca_btl_tcp_proc_insert':
btl_tcp_proc.c:304: warning: pointer targets in passing arg 2 of `opal_ifindextomask' differ in signedness
btl_tcp_proc.c:313: warning: pointer targets in passing arg 2 of `opal_ifindextomask' differ in signedness

btl_tcp_proc.c:389: warning: comparison between signed and unsigned
btl_tcp_proc.c:400: warning: comparison between signed and unsigned
btl_tcp_proc.c:401: warning: comparison between signed and unsigned
btl_tcp_proc.c:459: warning: ISO C90 forbids variable-size array `a'
btl_tcp_proc.c:459: warning: ISO C90 forbids mixed declarations and code
btl_tcp_proc.c:465: warning: ISO C90 forbids mixed declarations and code
btl_tcp_proc.c:466: warning: comparison between signed and unsigned
btl_tcp_proc.c:480: warning: comparison between signed and unsigned
btl_tcp_proc.c:485: warning: comparison between signed and unsigned
btl_tcp_proc.c:495: warning: comparison between signed and unsigned

2. If I exclude all my tcp interfaces, the connection fails properly, 
but I do get a malloc request for 0 bytes:
[tprins@odin examples]$ mpirun -mca btl tcp,self -mca btl_tcp_if_exclude eth0,ib0,lo -np 2 ./ring_c
malloc debug: Request for 0 bytes (btl_tcp_component.c, 844)
malloc debug: Request for 0 bytes (btl_tcp_component.c, 844)


3. If the exclude list does not contain 'lo', or the include list 
contains 'lo', the job hangs when using multiple nodes:
[tprins@odin examples]$ mpirun -mca btl tcp,self -mca btl_tcp_if_exclude ib0 -np 2 -bynode ./ring_c
Process 0 sending 10 to 1, tag 201 (2 processes in ring)
[odin011][1,0][btl_tcp_endpoint.c:619:mca_btl_tcp_endpoint_complete_connect] connect() failed: Connection refused (111)

[tprins@odin examples]$ mpirun -mca btl tcp,self -mca btl_tcp_if_include eth0,lo -np 2 -bynode ./ring_c
Process 0 sending 10 to 1, tag 201 (2 processes in ring)
[odin011][1,0][btl_tcp_endpoint.c:619:mca_btl_tcp_endpoint_complete_connect] connect() failed: Connection refused (111)



However, the great news about this patch is that it appears to fix 
https://svn.open-mpi.org/trac/ompi/ticket/1027 for me.


Hope this helps,

Tim



Adrian Knoth wrote:

On Wed, Jan 30, 2008 at 06:48:54PM +0100, Adrian Knoth wrote:


What is the real issue behind this whole discussion?

Hanging connections.
I'll have a look at it tomorrow.


To everybody who's interested in BTL-TCP, especially George and (to a
minor degree) rhc:

I've integrated something that I call "magic address selection code".
See the comments in r17348.

Can you check

   https://svn.open-mpi.org/svn/ompi/tmp-public/btl-tcp

if it's working for you? Read: multi-rail TCP, FNN, whatever is
important to you?


The code is proof of concept and could use a little tuning (if it's
working at all. Over here, it satisfies all tests).

I vaguely remember that at least Ralph doesn't like

   int a[perm_size * sizeof(int)];

where perm_size is dynamically evaluated (read: array size is runtime
dependent)
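
One way to avoid the variable-length array (which is also what triggers the
"ISO C90 forbids variable-size array" warning) is a heap allocation. As an
array length, the expression above counts elements rather than bytes, so the
sketch sizes the buffer in bytes explicitly. This is a minimal sketch,
assuming perm_size is the number of ints needed; the helper name is
hypothetical and not part of the btl-tcp branch:

```c
#include <stdlib.h>

/* Hypothetical helper: allocate perm_size ints on the heap and fill them
 * with the identity permutation 0, 1, ..., perm_size - 1.
 * Returns NULL for perm_size == 0 or on allocation failure. */
static int *alloc_permutation(size_t perm_size)
{
    size_t i;
    int *a;

    if (perm_size == 0) {
        return NULL; /* nothing to allocate; also avoids malloc(0) */
    }
    a = malloc(perm_size * sizeof *a);
    if (a == NULL) {
        return NULL;
    }
    for (i = 0; i < perm_size; i++) {
        a[i] = (int)i;
    }
    return a;
}
```

The caller owns the buffer and releases it with free() once the permutation
is no longer needed.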

There are also some large arrays, search for MAX_KERNEL_INTERFACE_INDEX.
Perhaps it's better to replace them with an appropriate OMPI data
structure. I don't know what fits best, you guys know the details...


So please give the code a try, and if it's working, feel free to clean up
whatever is necessary to make it the OMPI style or give me some pointers
on what to change.


I'd like to point to Thomas' diploma thesis. The PDF explains the theory
behind the code; it's like a rationale. Unfortunately, the PDF has some
typos, but I guess you'll get the idea. It's a graph matching algorithm;
Chapter 3 covers everything in detail:

 http://cluster.inf-ra.uni-jena.de/~adi/peiselt-thesis.pdf


HTH





Re: [OMPI devel] More VT warnings

2008-02-01 Thread Ralf Wildenhues
* Tim Prins wrote on Fri, Feb 01, 2008 at 04:09:31PM CET:
> 
> Note that this indicates that the file vt_metric_papi.c is being 
> compiled *3* times. I am not using a parallel make here. Any ideas why 
> it is compiling 3 times?

The file is listed as a source file for four different libraries, and
per-target CFLAGS are used for these.  Between one and four of these
libraries are actually built, depending on decisions made at configure
time.

Cheers,
Ralf