Re: [OMPI devel] [OMPI users] Adding a new BTL

2016-02-25 Thread dpchoudh .
. On Thu, Feb 25, 2016 at 7:02 PM, Gilles Gouaillardet < gilles.gouaillar...@gmail.com> wrote: > on master/v2.x, you also have to > > rm -f opal/mca/btl/lf/.opal_ignore > > (and this file would have been .ompi_ignore on v1.10) > > Cheers, > > Gilles > > On

[OMPI devel] component progress function optional?

2016-03-01 Thread dpchoudh .
Hello all (As you might know), I am working on implementing a new BTL for a proprietary fabric, and, taking the path of least effort, copying and pasting code from various pre-implemented BTLs as is appropriate for our hardware. My question is: is there any guidance on which of the functions must

[OMPI devel] Network atomic operations

2016-03-03 Thread dpchoudh .
Hello all Here is a 101 level question: OpenMPI supports many transports, out of the box, and can be extended to support those which it does not. Some of these transports, such as InfiniBand, provide hardware atomic operations on remote memory, whereas others, such as iWARP, do not. My question

Re: [OMPI devel] Network atomic operations

2016-03-04 Thread dpchoudh .
ter), so I am in a bit of spaghetti situation. Thank you Durga Life is complex. It has real and imaginary parts. On Fri, Mar 4, 2016 at 11:06 AM, Nathan Hjelm wrote: > > On Thu, Mar 03, 2016 at 05:26:45PM -0500, dpchoudh . wrote: > >Hello all > > > >Here is a 101

[OMPI devel] Thread safety in the BTL layer

2016-03-06 Thread dpchoudh .
Hello all sorry for asking too many 101 questions; hopefully someone won't mind answering. It looks like, as of the current release, some BTLs (e.g. openib) are not thread safe, and the code explicitly bails out if it finds that MPI_Init() was called with THREAD_MULTIPLE. Then there are some BTLs

[OMPI devel] How to 'hook' a new BTL to OMPI call chain?

2016-03-16 Thread dpchoudh .
Hello all Sorry about asking too many 101 level questions, but here is another one: I have a BTL layer code, called 'lf' that is ready for unit testing; it compiles with the OMPI tool chain (by doing a ./configure; make from the top level directory) and has the basic data transport calls implemen

Re: [OMPI devel] How to 'hook' a new BTL to OMPI call chain?

2016-03-16 Thread dpchoudh .
a Life is complex. It has real and imaginary parts. On Wed, Mar 16, 2016 at 12:52 PM, dpchoudh . wrote: > Hello all > > Sorry about asking too many 101 level questions, but here is another one: > > I have a BTL layer code, called 'lf' that is ready for unit testing; it

[OMPI devel] mca_btl__prepare_dst

2016-03-18 Thread dpchoudh .
Hello developers It looks like in the trunk, the routine mca_btl__prepare_dst is no longer being implemented, at least in TCP and openib BTLs. Up until 1.10.2, it does exist. Is it a new MPI-3 related thing? What is the reason behind this? Thanks Durga Life is complex. It has real and imaginary

[OMPI devel] IP address to verb interface mapping

2016-04-07 Thread dpchoudh .
Hello all (Newbie warning! Sorry :-( ) Let's say my cluster has 7 nodes, connected via IP-over-Ethernet for control traffic and some kind of raw verbs (or anything else such as SRIO) interface for data transfer. Let's say my host file chooses 4 out of the 7 nodes for an MPI job, based on the IP

Re: [OMPI devel] IP address to verb interface mapping

2016-04-08 Thread dpchoudh .
vity are not used. > (for example, a large message is *not* split and send using native ib, > IPoIB and GbE because the openib btl > has a higher exclusivity than the tcp btl) > > > did this answer your question ? > > Cheers, > > Gilles > > > > On 4/8/2016 12:

Re: [OMPI devel] IP address to verb interface mapping

2016-04-08 Thread dpchoudh .
locket() > > Cheers, > > Gilles > > > On 4/8/2016 1:30 PM, dpchoudh . wrote: > > Hi Gilles > > Thanks for responding quickly; however, I am afraid I did not explain my > question clearly enough; my apologies. > > What I am trying to understand is this:

Re: [OMPI devel] Common symbol warnings in tarballs (was: make install warns about 'common symbols')

2016-04-20 Thread dpchoudh .
Dear all Just to clarify, I was doing a build (after adding code to support a new transport) from code pulled from git (a 'git clone') when I came across this warning, so I suppose this would be a 'developer build'. I know I am not a real MPI developer (I am doing OMPI internal development for th

[OMPI devel] modex receive

2016-04-28 Thread dpchoudh .
Hello all I have been struggling with this issue for the last few days and thought it would be prudent to ask for help from people who have way more experience than I do. There are two questions, interrelated in my mind, but may not be so in reality. Question 2 is the issue I am struggling with, and questio

[OMPI devel] Why is floating point number used for locality

2016-04-28 Thread dpchoudh .
Hello all I am wondering about the rationale of using floating point numbers for calculating 'distances' in the openib BTL. Is it because some distances can be infinite and there is no (conventional) way to represent infinity using integers? Thanks for your comments Durga The surgeon general a
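
The excerpt does not include the answer to this question, so the following is only an illustration of the property the question raises, not a statement of why the openib BTL uses floating point. It is a minimal C sketch, assuming a hypothetical helper `nearest_device()` and a hypothetical array of distances: an IEEE floating-point value can hold `INFINITY` for an unreachable device and still compare and add sensibly, whereas an integer would need a sentinel such as `INT_MAX` that overflows when anything is added to it.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of Open MPI): return the index of the
 * device with the smallest distance, or -1 if every device is
 * unreachable. Unreachable devices are marked with INFINITY.
 */
int nearest_device(const double *dist, int n)
{
    assert(dist != NULL);

    int best = -1;
    double best_dist = INFINITY;

    for (int i = 0; i < n; i++) {
        /* INFINITY < INFINITY is false, so unreachable devices never win. */
        if (dist[i] < best_dist) {
            best_dist = dist[i];
            best = i;
        }
    }
    return best;
}
```

Called with distances `{INFINITY, 2.5, 1.0}` the helper picks index 2, and adding a hop cost to an infinite distance leaves it infinite, so no special-case code is needed when distances are accumulated.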

Re: [OMPI devel] modex receive

2016-04-28 Thread dpchoudh .
ml_base_verbose 100 ? > > maybe the add_procs subroutine is not invoked because openmpi uses cm > instead of ob1 > > > Cheers, > > > Gilles > > On 4/28/2016 3:07 PM, dpchoudh . wrote: > > Hello all > > I have been struggling with this issue for the last few days

Re: [OMPI devel] modex receive

2016-04-29 Thread dpchoudh .
nd so cm is preferred > over ob1. > > what if you > mpirun --mca mtl ^psm ... > is cm selected over ob1 ? > > note PSM does not disqualify itself if there is no link, and this is > now being investigated at intel. > > Cheers, > > Gilles > > On Friday, Apr

[OMPI devel] Question about 'progress function'

2016-05-05 Thread dpchoudh .
Hi all Apologies for a 101 level question again, but here it is: A new BTL layer I am implementing hangs in MPI_Send(). Please keep in mind that at this stage, I am simply desperate to make MPI data move through this fabric in any way possible, so I have thrown all good programming practice out o

Re: [OMPI devel] Question about 'progress function'

2016-05-06 Thread dpchoudh .
ion because we are tied > directly with libevent. In your case you should provide a BTL progress > function, function that will be called at the end of libevent base loop > regularly. > > George. > > > On Thu, May 5, 2016 at 11:30 PM, dpchoudh . wrote: > >> Hi all

[OMPI devel] Process connectivity map

2016-05-14 Thread dpchoudh .
Hello all I have been struggling with this issue for a while and figured it might be a good idea to ask for help. Where (in the code path) is the connectivity map created? I can see that it is *used* in mca_bml_r2_endpoint_add_btl(), but obviously I am not setting it up right, because this routi

[OMPI devel] Misleading error messages?

2016-05-14 Thread dpchoudh .
In the file ompi/mca/bml/r2/bml_r2.c, it seems like the function name is incorrect in some error messages (seems like an unchecked copy-paste issue) in: 1. Function mca_bml_r2_allocate_endpoint() line 154 2. Function mca_bml_r2_endpoint_add_btl() line 200, 206 This is on master branch. Th

Re: [OMPI devel] Process connectivity map

2016-05-15 Thread dpchoudh .
> (e.g. tcp is never used if openib is available) > you can simply force your btl and self, and the ob1 pml, so you do not > have to worry about other btl exclusivity. > > Cheers, > > Gilles > > > On Sunday, May 15, 2016, dpchoudh . wrote: > >> Hello all >> >

Re: [OMPI devel] Process connectivity map

2016-05-15 Thread dpchoudh .
> this from happening) > > one more thing ... > now, master default behavior is > mpirun --mca mpi_add_procs_cutoff 0 ... > you might want to try > mpirun --mca mpi_add_procs_cutoff 1024 ... > and see if things make more sense. > if it helps, and iirc, there is a parameter so a btl ca

Re: [OMPI devel] Process connectivity map

2016-05-15 Thread dpchoudh .
em? Thank you Durga The surgeon general advises you to eat right, exercise regularly and quit ageing. On Sun, May 15, 2016 at 11:17 AM, dpchoudh . wrote: > Hello Gilles > > Setting -mca mpi_add_procs_cutoff 1024 indeed makes a difference to the > output, as follows: > > With -m

[OMPI devel] modex getting corrupted

2016-05-21 Thread dpchoudh .
Hello all I have a naive question: My 'cluster' consists of two nodes, connected back to back with a proprietary link as well as GbE (over a switch). I am calling OPAL_MODEX_SEND() and the modex consists of just this: struct modex {char name[20]; unsigned mtu;}; The mtu field is not currently be
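
For readers following along, here is a minimal C sketch of a payload like the one described above. It does not call OPAL_MODEX_SEND() or any other Open MPI API, and it is not the code from this thread. The names `MODEX_NAME_LEN`, `MODEX_WIRE_SIZE`, `modex_pack()` and `modex_unpack()` are hypothetical, and the choice of a fixed-width `uint32_t` for the mtu, sent in network byte order, is an assumption made so that both nodes would read the same bytes the same way.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* htonl(), ntohl() */

#define MODEX_NAME_LEN 20

/* Payload modelled on the post: members are separated by semicolons. */
struct modex {
    char     name[MODEX_NAME_LEN];
    uint32_t mtu;            /* assumption: fixed width instead of 'unsigned' */
};

/* Bytes on the wire: the name followed by the 4-byte mtu, no padding. */
#define MODEX_WIRE_SIZE (MODEX_NAME_LEN + sizeof(uint32_t))

/* Hypothetical helper: copy the struct into a flat byte buffer. */
void modex_pack(const struct modex *m, unsigned char *out)
{
    assert(m != NULL && out != NULL);

    uint32_t mtu_be = htonl(m->mtu);   /* network (big-endian) byte order */
    memcpy(out, m->name, MODEX_NAME_LEN);
    memcpy(out + MODEX_NAME_LEN, &mtu_be, sizeof mtu_be);
}

/* Hypothetical helper: rebuild the struct from the flat byte buffer. */
void modex_unpack(const unsigned char *in, struct modex *m)
{
    assert(in != NULL && m != NULL);

    uint32_t mtu_be;
    memcpy(m->name, in, MODEX_NAME_LEN);
    m->name[MODEX_NAME_LEN - 1] = '\0';   /* always NUL-terminate the name */
    memcpy(&mtu_be, in + MODEX_NAME_LEN, sizeof mtu_be);
    m->mtu = ntohl(mtu_be);
}
```

Packing into an explicit buffer keeps the transmitted size independent of compiler padding; a buffer of `MODEX_WIRE_SIZE` bytes is what a sender would hand to whatever exchange mechanism it uses, and the receiver would unpack the same number of bytes.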

Re: [OMPI devel] modex getting corrupted

2016-05-23 Thread dpchoudh .
d for both send/recv - you likely have an > error in the syntax > > > On May 20, 2016, at 9:36 PM, dpchoudh . wrote: > > Hello all > > I have a naive question: > > My 'cluster' consists of two nodes, connected back to back with a > proprietary link as well as

Re: [OMPI devel] modex getting corrupted

2016-05-23 Thread dpchoudh .
Hello Ralph and all Please ignore this mail. It is indeed due to a syntax error in my code. Sorry for the noise; I'll be more careful with my homework from now on. Best regards Durga We learn from history that we never learn from history. On Mon, May 23, 2016 at 2:13 AM, dpchoudh .

[OMPI devel] mpirun fails with the latest git pull

2016-05-26 Thread dpchoudh .
Hello all With a git pull of roughly 4 PM EDT (US), that had a .m4 file (something to do with MXM) in the change set, mpirun does not work anymore. The failure is like this: [durga@b-2 ~]$ sudo /usr/local/bin/mpirun --allow-run-as-root -np 2 -hostfile ~/hostfile -mca btl lf,self -mca btl_base_ver

[OMPI devel] Porting the underlying fabric interface

2016-02-04 Thread dpchoudh .
Hi developers I am trying to add support for a new (proprietary) RDMA capable fabric to OpenMPI and have the following question: As I understand, some networks are implemented as a PML framework and some are implemented as a BTL framework. It seems there is even overlap as Myrinet seems to exist

Re: [OMPI devel] [OMPI users] Adding a new BTL

2016-02-25 Thread dpchoudh .
. > > Sent from my phone. No type good. > > On Feb 25, 2016, at 2:06 PM, dpchoudh . wrote: > > Hello Gilles > > Thank you very much for your advice. Yes, I copied the templates from the > master branch to the 1.10.2 release, since the release does not have them. > And