Re: [OMPI devel] Additional excluded tcp inteface

2008-11-07 Thread Adrian Knoth
On Fri, Nov 07, 2008 at 09:49:43AM -0500, Rolf Vandevaart wrote: > I do not think anyone will have a problem with this, but just thought I > would mention that I am planning on adding an additional interface to > the excluded list for the tcp btl. I want to add "sppp" to the list. > This is

Re: [OMPI devel] TCP BTL routability (was: ticket #972)

2008-07-29 Thread Adrian Knoth
On Tue, Jul 29, 2008 at 03:25:00PM -0400, Jeff Squyres wrote: > For reference, the FAQ entry is here: > > http://www.open-mpi.org/faq/?category=tcp#tcp-routability > > It looks like we now *always* assume that two TCP peers are routable. As long as they share the same address family

Re: [OMPI devel] Funny warning message

2008-07-28 Thread Adrian Knoth
On Mon, Jul 28, 2008 at 05:14:29PM +0300, Lenny Verkhovsky wrote: > -advisable to configure rd_win smaller then (rd_num - rd_low), but currently > +advisable to configure rd_win bigger then (rd_num - rd_low), but currently ^ a -- Cluster and

Re: [OMPI devel] multiple GigE interfaces...

2008-06-23 Thread Adrian Knoth
On Wed, Jun 18, 2008 at 05:13:28PM -0700, Muhammad Atif wrote: > Hi again... I was on a break from Xensocket stuff This time some > general questions... Hi. > question. What if I have multiple Ethernet cards (say 5) on two of my > quad core machines. The IP addresses (and the subnets of

Re: [OMPI devel] Change in btl/tcp

2008-04-21 Thread Adrian Knoth
On Mon, Apr 21, 2008 at 09:04:28AM -0400, Josh Hursey wrote: > Adrian, Hi! > Has there been any progress on this bug? If you still cannot reproduce > it, if you send either Tim Prins or I a debugging patch we can run > with it. Or we can try to arrange access to one of our machines for you.

Re: [OMPI devel] Change in btl/tcp

2008-04-18 Thread Adrian Knoth
On Fri, Apr 18, 2008 at 08:04:17AM -0400, Tim Prins wrote: > Hi Adrian, Hi! > After this change, I am getting a lot of errors of the form: > [sif2][[12854,1],9][btl_tcp_frag.c:216:mca_btl_tcp_frag_recv] > mca_btl_tcp_frag_recv: readv failed: Connection reset by > peer (104) > > See for

[OMPI devel] Change in btl/tcp

2008-04-16 Thread Adrian Knoth
Hi! As of r18169, I've changed the acceptance rules for incoming BTL-TCP connections. The old code would have denied a connection in case of non-matching addresses (comparison between source address and expected source address). Unfortunately, you cannot always say which source address an

Re: [OMPI devel] --disable-ipv6 broken on trunk

2008-04-02 Thread Adrian Knoth
On Wed, Apr 02, 2008 at 06:36:02AM -0400, Josh Hursey wrote: > It seems that builds configured with '--disable-ipv6' are broken on > the trunk. I suspect r18055 for this break since the tarball from two > --- > oob_tcp.c: In function `mca_oob_tcp_fini': >

Re: [OMPI devel] Logo as a vector graphic

2008-03-13 Thread Adrian Knoth
On Thu, Mar 13, 2008 at 06:06:12PM +0100, Andreas Schäfer wrote: > > Heh. I usually use the png or jpg version and just crop there. :-) > As this seems to be of public interest, please find attached a vector > version of the logo without text. (-8 Now things are getting difficult... why is my

Re: [OMPI devel] Logo as a vector graphic

2008-03-13 Thread Adrian Knoth
On Thu, Mar 13, 2008 at 08:07:18AM -0500, Jeff Squyres wrote: > Try this one. Thanks, that's beautiful. I'll send you the slides once they are ready, the logo really fits well ;) > We usually snip off the words at the bottom. I also did so. How do you crop the image? I used pdfcrop which is

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-01-31 Thread Adrian Knoth
On Wed, Jan 30, 2008 at 06:48:54PM +0100, Adrian Knoth wrote: > > What is the real issue behind this whole discussion? > Hanging connections. > I'll have a look at it tomorrow. To everybody who's interested in BTL-TCP, especially George and (to a minor degree) rhc: I've integrat

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-01-30 Thread Adrian Knoth
On Wed, Jan 30, 2008 at 03:38:00PM +0100, Bogdan Costescu wrote: > The results is that, with the default Linux kernel settings, there is > no way to tell which way a connection will take in a multi-rail TCP/IP > setup. Even more, when the ARP cache expires and a new ARP request is > made, the

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-01-30 Thread Adrian Knoth
On Wed, Jan 30, 2008 at 12:05:50PM -0500, George Bosilca wrote: > What is the real issue behind this whole discussion? Hanging connections. See https://svn.open-mpi.org/trac/ompi/ticket/1206 The multi-address peer tries to connect, but btl_tcp_proc_accept denies due to not matching

Re: [OMPI devel] [OMPI svn] svn:open-mpi r17307

2008-01-30 Thread Adrian Knoth
On Tue, Jan 29, 2008 at 07:37:42PM -0500, George Bosilca wrote: > The previous code was correct. Each IP address correspond to a > specific endpoint, and therefore to a specific BTL. This enable us to > have multiple TCP BTL at the same time, and allow the OB1 PML to > stripe the data over

Re: [OMPI devel] Trunk borked

2008-01-28 Thread Adrian Knoth
On Mon, Jan 28, 2008 at 07:26:56AM -0700, Ralph H Castain wrote: > We seem to have a problem on the trunk this morning. I am building on a There are more errors: /tmp/ompi/src/ompi/contrib/vt/vt/vtlib/vt_iowrap.c: In function `fsetpos': /tmp/ompi/src/ompi/contrib/vt/vt/vtlib/vt_iowrap.c:850:

Re: [OMPI devel] btl tcp port to xensocket

2008-01-17 Thread Adrian Knoth
On Tue, Jan 15, 2008 at 04:07:02PM -0800, Muhammad Atif wrote: > Just for reference, I am trying to port btl/tcp to xensockets. Now if > i want to do modex send/recv , to my understanding, mca_btl_tcp_addr_t > is used (ref code/function is mca_btl_tcp_component_exchange). For > xensockets, I need

Re: [OMPI devel] btl tcp port to xensocket

2008-01-09 Thread Adrian Knoth
On Tue, Jan 08, 2008 at 10:51:45PM -0800, Muhammad Atif wrote: > I am planning to port tcp component to xensocket, which is a fast > interdomain communication mechanism for guest domains in Xen. I may Just to get things right: You first partition your SMP/Multicore system with Xen, and then want

[OMPI devel] IPv4 mapped IPv6 addresses

2007-12-14 Thread Adrian Knoth
Hi! The current BTL/TCP and OOB/TCP code contains separate sockets for IPv4 and IPv6. Though it has never been a problem for me, this might cause an out-of-FDs-error in large clusters. (IIRC, rhc has already pointed out this issue) A possible way to reduce FD consumption would be the use of IPv4
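For context on the addresses this message refers to: an IPv4-mapped IPv6 address has the form ::ffff:a.b.c.d, which lets a single AF_INET6 socket represent peers of both families. Below is a minimal sketch of how such an address can be recognised with the standard POSIX macro IN6_IS_ADDR_V4MAPPED. The helper name is_v4_mapped is hypothetical and is not taken from the Open MPI sources.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/* Hypothetical helper, not Open MPI code.
 * Returns 1 if text is an IPv4-mapped IPv6 address (::ffff:a.b.c.d),
 * 0 if it is any other valid IPv6 address, and -1 if it does not
 * parse as an IPv6 address at all. */
int is_v4_mapped(const char *text)
{
    struct in6_addr addr;

    if (inet_pton(AF_INET6, text, &addr) != 1) {
        return -1;
    }
    return IN6_IS_ADDR_V4MAPPED(&addr) ? 1 : 0;
}
```

A plain dotted-quad string such as 192.0.2.1 is rejected here because it is parsed with AF_INET6 only; whether the BTL/OOB code would want that behaviour is not stated in the message.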

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r16691

2007-11-08 Thread Adrian Knoth
On Thu, Nov 08, 2007 at 08:02:09AM -0500, Jeff Squyres wrote: > >> All ROMIO patches *must* be coordinated with the ROMIO maintainers. > > Upstream? That's the upstream patch. > That was extracted from ROMIO itself? Which release? From Jiri: The patch was extracted from a ROMIO sources that

Re: [OMPI devel] [OMPI svn-full] svn:open-mpi r16691

2007-11-08 Thread Adrian Knoth
On Thu, Nov 08, 2007 at 07:51:28AM -0500, Jeff Squyres wrote: [r16691] > Whoa; I'm not sure we want to apply this. Me neither. > All ROMIO patches *must* be coordinated with the ROMIO maintainers. Upstream? That's the upstream patch. Jiri Polach has extracted the fix for this problem.

Re: [OMPI devel] Small manual page patches from Debian package

2007-09-28 Thread Adrian Knoth
On Thu, Sep 27, 2007 at 09:18:39PM -0500, Dirk Eddelbuettel wrote: > Dear Open MPI developers, Hi! > The Debian (source) package for Open MPI still carries a few tiny patches > that we thought we had submitted to you, but then maybe we got that mixed up > with some new manual pages I sent in on

Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Adrian Knoth
On Fri, Aug 17, 2007 at 08:26:50AM -0400, Jeff Squyres wrote: > > Ok, --enable-progress-threads and --enable-mpi-threads cause the > > segfaults. If you compile without, everything works. > > > I'll now try if it's mpi-threads or the progress-threads, and also > > check > > the upcoming

Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#435581: [u...@hermann-uwe.de: Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-17 Thread Adrian Knoth
On Fri, Aug 17, 2007 at 02:11:02AM +0200, Uwe Hermann wrote: > > | The 1.2.3 release also works fine: > I think Adrian used a tarball, not the Debian package? > I'll try a local, manual install too, maybe the bug is Debian-related only? I've tried both: the tarball works fine, the Debian package

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Adrian Knoth
On Mon, Aug 13, 2007 at 04:26:31PM -0500, Dirk Eddelbuettel wrote: > > I'll now compile the 1.2.3 release tarball and see if I can reproduce The 1.2.3 release also works fine: adi@debian:~$ ./ompi123/bin/mpirun -np 2 ring 0: sending message (0) to 1 0: sent message 1: waiting for message 1: got

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-13 Thread Adrian Knoth
On Thu, Aug 02, 2007 at 10:51:13AM +0200, Adrian Knoth wrote: > > We (as in the Debian maintainer for Open MPI) got this bug report from > > Uwe who sees mpi apps segfault on Debian systems with the FreeBSD > > kernel. > > Any input would be greatly appreciated

Re: [OMPI devel] [u...@hermann-uwe.de: [Pkg-openmpi-maintainers] Bug#435581: openmpi-bin: Segfault on Debian GNU/kFreeBSD]

2007-08-02 Thread Adrian Knoth
On Thu, Aug 02, 2007 at 02:31:30AM +, Dirk Eddelbuettel wrote: > Dear Open MPI developers, Hi! > We (as in the Debian maintainer for Open MPI) got this bug report from > Uwe who sees mpi apps segfault on Debian systems with the FreeBSD > kernel. > Any input would be greatly appreciated!

Re: [OMPI devel] [Pkg-openmpi-maintainers] Bug#433142: openmpi: FTBFS on GNU/kFreeBSD

2007-07-24 Thread Adrian Knoth
On Sat, Jul 14, 2007 at 03:55:12PM -0500, Dirk Eddelbuettel wrote: > | the current version fails to build on GNU/kFreeBSD. > | > | It needs small fixups for munmap hackery and stacktrace. > | It also needs to exclude linux specific build-depends. > | Please find attached patch with that. > >

Re: [OMPI devel] Fwd: [Open MPI] #1101: MPI_ALLOC_MEM with 0 size must be valid

2007-07-24 Thread Adrian Knoth
On Tue, Jul 24, 2007 at 08:41:27AM -0600, Brian Barrett wrote: > > man malloc tells me this: > > "If size was equal to 0, either NULL or a pointer suitable to be > > passed to free() > > is returned". So may be we should just return NULL and be done with > > it? > > Which is also what POSIX
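The suggestion quoted above (return NULL for a zero-size request and treat it as success) can be sketched as follows. This is only an illustration under that assumption: alloc_mem_sketch is a hypothetical wrapper around malloc, not the MPI_ALLOC_MEM implementation and not the fix that was eventually committed.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper, not the Open MPI implementation.
 * A zero-byte request succeeds and yields NULL, so the result never
 * depends on what the platform's malloc(0) happens to return.
 * Returns 0 on success, -1 if the allocation failed. */
int alloc_mem_sketch(size_t size, void **out)
{
    if (size == 0) {
        *out = NULL;
        return 0;
    }
    *out = malloc(size);
    return (*out != NULL) ? 0 : -1;
}
```

Because free(NULL) is defined to do nothing, a caller can release the result of a zero-size request in the same way as any other allocation.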

Re: [OMPI devel] Add a bug fix to 1.2.x version

2007-05-02 Thread Adrian Knoth
On Wed, May 02, 2007 at 02:07:17PM +0300, Sharon Melamed wrote: > Hi, Hi! > Change set 14463 - [1]https://svn.open-mpi.org/trac/ompi/changeset/14463. > I would like to integrate this change to version 1.2.x. I guess you're looking for

Re: [OMPI devel] sockaddr* vs. sockaddr_storage*

2007-05-01 Thread Adrian Knoth
On Tue, May 01, 2007 at 07:39:07AM -0700, Jeff Squyres wrote: > > (b) that > > IPv6 was correctly operating...which were the two issues in this > > discussion. > We currently do not have any IPv6 setup in our MPI testing equipment We automatically check every trunk commit against our IPv6

Re: [OMPI devel] sockaddr* vs. sockaddr_storage*

2007-04-29 Thread Adrian Knoth
On Sun, Apr 29, 2007 at 06:07:03PM +0200, Adrian Knoth wrote: > > I have to ask you to remove r14549 quickly as it bring back the trunk > > to the stage it was before r14544 (only random support for multiple > I'll have a look how to accomplish both: IPv6 and a reverted r1454

Re: [OMPI devel] replace 'atoi' with 'strtol'

2007-04-18 Thread Adrian Knoth
On Wed, Apr 18, 2007 at 01:16:54PM -0400, George Bosilca wrote: > That's right, long and int have the same size on Windows 32 and 64 > bits (always 32 bits). However, they are considered as being > different types (!!!). How about (u)int32_t? When I was an Ada programmer, subtypes with the
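As background to this thread, the usual motivation for preferring strtol over atoi is that strtol can report malformed or out-of-range input. A minimal sketch follows; parse_int is a hypothetical helper, not code from the Open MPI tree, and it narrows the long result to int explicitly, since the two are distinct types even where they have the same size.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper. Parses a base-10 integer with strtol and reports
 * failures that atoi would silently hide.
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
int parse_int(const char *text, int *out)
{
    char *end = NULL;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);

    if (end == text || *end != '\0') {
        return -1;              /* no digits, or trailing garbage */
    }
    if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
        return -1;              /* does not fit in an int */
    }
    *out = (int)value;
    return 0;
}
```

On platforms where long and int have the same width, the INT_MIN/INT_MAX comparison is redundant and the ERANGE check alone catches overflow; keeping both makes the helper behave the same on LP64 systems.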

Re: [OMPI devel] SOS... help needed :(

2007-04-16 Thread Adrian Knoth
On Sun, Apr 15, 2007 at 10:25:06PM -0400, chaitali dherange wrote: > Hi, Hi! > giving more priority to the MPI calls over the non MPI ones. > static I mean.. we know that our clusters use Infiniband for MPI ... > so all the non MPI communication can be assumed to be TCP > communication using

Re: [OMPI devel] SOS!! Run-time error

2007-04-15 Thread Adrian Knoth
On Sun, Apr 15, 2007 at 01:40:01PM -0400, chaitali dherange wrote: > Hi, Hi! > I have downloaded the developer version of source code by downloading a > nightly Subversion snapshot tarball.And have installed the openmpi. Things are getting much clearer when you compile Open MPI with

Re: [OMPI devel] NFS race condition in romio

2007-01-09 Thread Adrian Knoth
On Tue, Jan 09, 2007 at 12:03:38AM +0100, Adrian Knoth wrote: > > The attached patch fixes this problem, but perhaps there is > New patch, I've missed the non-NFS case. This patch was wrong, too (containing a double free segfault). Don't code when dog-tired... ;) I've create ti

Re: [OMPI devel] Major revision to the RML/OOB

2006-12-08 Thread Adrian Knoth
On Thu, Dec 07, 2006 at 11:12:23AM -0500, Jeff Squyres wrote: Hi, > > I therefore suggest to move the OPAL changes into the trunk, > > also the small hostfile code (lex code for IPv6) and the btl code. > Can you describe the changes in opal that were made for IPv6? These changes are limited to

[OMPI devel] IPv6 up and working

2006-11-24 Thread Adrian Knoth
Hi, last week I've rewritten my btl-tcp component to improve several aspects, mainly no oversubscription of interfaces. I now have: - the MCA parameter btl_tcp_disable_family={4|6} to force the use of a special address family at runtime - a working include/exclude list for interfaces

Re: [OMPI devel] Cross-Cluster OpenMPI

2006-11-19 Thread Adrian Knoth
On Sun, Nov 19, 2006 at 02:35:27AM -0500, Resat Umit Payli wrote: > Hi; Hi! > I am interested in using OpenMPI cross-cluster runs on the Grid > environments. Though it's not Grid, but "our" IPv6 code is intended to be run on multi-clusters. (if you're only looking for using all of your

[OMPI devel] valgrind messages important?

2006-11-12 Thread Adrian Knoth
Hi, I'm currently tracing a segfault in mpi_init which is caused by ompi/runtime/ompi_mpi_init.c:569 ret = MCA_PML_CALL(add_procs(procs, nprocs)); free(procs); In most cases, no segfault occurs and everything works fine, but with some special combinations of machines, I can trigger the

Re: [OMPI devel] New oob/tcp?

2006-10-25 Thread Adrian Knoth
On Wed, Oct 25, 2006 at 02:48:33PM +0200, Adrian Knoth wrote: > > I don't see any new component, Adrian. There have been a few updates to the > > existing component, some of which might cause conflicts with the merge, but > > those shouldn't be too hard to resolve. > Ok,

[OMPI devel] New oob/tcp?

2006-10-25 Thread Adrian Knoth
Hi, I've seen a new oob/tcp component in the v1.2 branch (copied from the trunk). Of course, it doesn't merge with my IPv6 patch, so I'm currently using the old oob/tcp in my branch. Is this new component considered stable, thus making it worth to port the IPv6 patch? -- mail: a...@thur.de

Re: [OMPI devel] [IPv6] ORTE layer working

2006-09-22 Thread Adrian Knoth
On Tue, Sep 12, 2006 at 05:44:49PM +0200, Adrian Knoth wrote: > I'm glad to announce a first working version of IPv4+IPv6 orte. > > It contains: >- IPv6 interface discovery on Linux >- a single orte/mca/oob/tcp component >- a single module (no multiple instances)

[OMPI devel] [IPv6] ORTE layer working

2006-09-12 Thread Adrian Knoth
Hi, I'm glad to announce a first working version of IPv4+IPv6 orte. It contains: - IPv6 interface discovery on Linux - a single orte/mca/oob/tcp component - a single module (no multiple instances) - two listening sockets - two connecting sockets The listening sockets always stay

Re: [OMPI devel] [IPv6] new component oob/tcp6

2006-09-07 Thread Adrian Knoth
On Thu, Sep 07, 2006 at 07:51:28PM +0200, Adrian Knoth wrote: > No problem, just two hours ago, Christian and me decided to drop > the idea of oob/tcp6 and go on with only one oob-tcp-component. > It shouldn't be that hard and I'll try it tonight or tomorrow. Looks quite promising: a

Re: [OMPI devel] [IPv6] new component oob/tcp6

2006-09-07 Thread Adrian Knoth
On Thu, Sep 07, 2006 at 11:46:28AM -0400, Jeff Squyres wrote: > > On Fri, Sep 01, 2006 at 07:01:25AM -0600, Ralph Castain wrote: > > > >>> Do you agree to go on with two oob components, tcp and tcp6? > >> Yes, I think that's the right approach > > > > It's a deal. ;) > Actually, I would

[OMPI devel] [IPv6] new component oob/tcp6

2006-09-01 Thread Adrian Knoth
Hi, yesterday I felt impelled to create a new ORTE oob component: tcp6. I was able to either compile the library with IPv4 or IPv6 support, but not with both (so to say: two different ompi installations or at least two different DSO versions). As far as I can see, many functions use

[OMPI devel] First IPv6 communication with ORTE

2006-08-24 Thread Adrian Knoth
Hi, I'm glad to announce the first IPv6 launch of orted: tcp6 0960 2001:638:906:2:20:43810 2001:638:906:2::1:43421 ESTABLISHED 18368/orted Unit testing discovered the relevant bugs. They're now fixed and it's actually working. Who'd ever guess this? ;) I'm going to prepare some

[OMPI devel] A few notes on IPv6 status

2006-08-19 Thread Adrian Knoth
Hi, as mentioned earlier this year, I'm now working on IPv6 support for OpenMPI. The main design goals are: - do not break existing IPv4 code - compile on SUSv2 (without new socket API) - do not use mapped addresses - test the new code on many systems The porting of OPAL is more or

Re: [OMPI devel] OpenMPI not conforming with the C90 spec?

2006-08-19 Thread Adrian Knoth
On Thu, Aug 17, 2006 at 11:48:44PM +0100, Jonathan Underwood wrote: > Hi, Hi! > Compiling a file with the gcc options -Wall and -pedantic gives the > following warning: > mpi.h:147: warning: ISO C90 does not support 'long long' > Is this intentional, or is this a bug? If you do not insist on

Re: [OMPI devel] Building ompi occasionally touches the source files

2006-07-20 Thread Adrian Knoth
On Mon, Jul 17, 2006 at 10:05:05PM +0200, Adrian Knoth wrote: Hi, > The source is shared via svn, so it's for sure all are using the > same code. > 2. If compiling inside my directory layout, the build > > a) changes the following two files in trunk/src/ > > adi@te

Re: [OMPI devel] Building ompi occasionally touches the source files

2006-07-18 Thread Adrian Knoth
On Tue, Jul 18, 2006 at 12:34:21PM +0200, Christian Kauhaus wrote: > >b) fails to complete (see attachment), the errors are all > > related to lex. > What are the flex versions used on these systems? On Debian stable it is > flex 2.5.31 and on my Gentoo box it is flex 2.5.33, both

[OMPI devel] Building ompi occasionally touches the source files

2006-07-17 Thread Adrian Knoth
Hi, I have a bunch of boxes used to test and compile OMPI (we're talking about the openmpi-1.1 release). Two of them are Debian sarge (current stable), two are Debian testing (i386+amd64) and one is Debian unstable (amd64) The source is shared via svn, so it's for sure all are using the same

[OMPI devel] How to test OpenMPI?

2006-05-02 Thread Adrian Knoth
Hi, as already mentioned some weeks ago, we plan to provide IPv6-support for OpenMPI. Before touching the code, we'd like to have a test environment to ensure not to break anything. There is a test/-directory, but the tests inside seem to be very basic, no network testing or anything running

Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 05:21:42PM +0200, Ralf Wildenhues wrote: > > Perhaps it's a good idea to port any internal structure to > > IPv6, as it is able to represent the whole v4 namespace. > > One can always determine whether it is a real v6 or only > > a mapped v4 address (the common :::

Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 09:36:31AM -0500, Jeff Squyres (jsquyres) wrote: > I have no personal experience with IPv6, but one thought that strikes me > is that the components might be able to figure out what to do by looking > at/parsing either the hostnames or the results that come back from >

Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 09:07:39AM -0500, Brian Barrett wrote: > > I have a first quick and dirty patch, replacing AF_INET by AF_INET6, > > the sockaddr_in structs and so on. > Is there a way to do this to better support both IPv4 and IPv6? I think so, too. There are probably two different ways

Re: [OMPI devel] IPv6 support in OpenMPI?

2006-03-31 Thread Adrian Knoth
On Fri, Mar 31, 2006 at 10:44:11AM +0200, Christian Kauhaus wrote: > Hello *, Hi. > University of Jena (Germany). Our work group is digging into how to > connect several clusters on a campus. I think I'm also a member of this workgroup, though I am not working at University of Jena, but