Re: [OMPI users] Creating An MPI Job from Procs Launched by a Different Launcher

2022-01-25 Thread Saliya Ekanayake via users
Any pointers? On Tue, Jan 25, 2022 at 12:55 PM Ralph Castain via users < users@lists.open-mpi.org> wrote: > Short answer is yes, but it is a bit complicated to do. > > On Jan 25, 2022, at 12:28 PM, Saliya Ekanayake via users < > users@lists.open-mpi.org> wrote: > >

[OMPI users] Creating An MPI Job from Procs Launched by a Different Launcher

2022-01-25 Thread Saliya Ekanayake via users
Hi, I am trying to run an MPI program on a platform that launches the processes using a custom launcher (not mpiexec). This will end up spawning N processes of the program, but I am not sure if MPI_Init() would work or not in this case? Is it possible to have a group of processes launched by

Re: [OMPI users] OpenMPI Segfault when psm is enabled?

2017-03-15 Thread Saliya Ekanayake
en-mpi.org/msg27524.html) >> that suggested to disable psm as a solution. >> >> It worked, but I would like to know what this module is and is there a >> disadvantage in terms of performance by disabling it? >> >> Thank you, >> Saliya >> >> -- >

[OMPI users] OpenMPI Segfault when psm is enabled?

2017-03-11 Thread Saliya Ekanayake
this module is and is there a disadvantage in terms of performance by disabling it? Thank you, Saliya -- Saliya Ekanayake, Ph.D Applied Computer Scientist Network Dynamics and Simulation Science Laboratory (NDSSL) Virginia Tech, Blacksburg ___ users mailing

Re: [OMPI users] Forcing TCP btl

2016-07-19 Thread Saliya Ekanayake
nMPI has builtin support for mxm, you need to > > - force pml/ob1 (so mtl/mxm cannot be used by pml/cm) > > and > > - blacklist btl/openib > > your mpirun command line looks like this > > mpirun --mca pml ob1 --mca btl ^openib ... > > > Cheers, > > >
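The advice quoted above (force pml/ob1, then restrict the btl list) can be sketched as a small command builder. This is only an illustration: the helper names `mca_args` and `mpirun_command` are hypothetical, and the second command, which lists `tcp,self` explicitly instead of excluding openib, is an assumption about how one might pin the run to TCP rather than something stated in the thread.

```python
def mca_args(params):
    """Turn {"pml": "ob1", "btl": "^openib"} into a flat list of --mca arguments."""
    args = []
    for name, value in params.items():
        args += ["--mca", name, value]
    return args


def mpirun_command(nprocs, program, params):
    """Build an mpirun argv list; 'program' is the executable plus its own arguments."""
    return ["mpirun", "-np", str(nprocs), *mca_args(params), *program]


# Exclusion form, as quoted in the thread: ob1 with everything except openib.
exclude_ib = mpirun_command(4, ["./a.out"], {"pml": "ob1", "btl": "^openib"})

# Inclusion form (assumption): allow only the tcp and self btls.
tcp_only = mpirun_command(4, ["./a.out"], {"pml": "ob1", "btl": "tcp,self"})

print(" ".join(exclude_ib))
print(" ".join(tcp_only))
```

With the inclusion form, on-node traffic would also go over TCP, since no shared-memory btl is listed; that may or may not be what a given test wants.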

Re: [OMPI users] Forcing TCP btl

2016-07-19 Thread Saliya Ekanayake
n Jul 18, 2016, at 10:50 PM, Saliya Ekanayake <esal...@gmail.com> > wrote: > > > > Hi, > > > > I read in a previous thread ( > https://www.open-mpi.org/community/lists/users/2014/05/24475.php) that > Jeff mentions it's possible for OpenMPI to pick up the openi

[OMPI users] Forcing TCP btl

2016-07-19 Thread Saliya Ekanayake
for OpenMPI to use Infiniband and not TCP? Is there a way to guarantee that a test is using TCP, but not IB? Thank you, saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing | Digital Science Center Indiana University, Bloomington

Re: [OMPI users] Java-OpenMPI returns with SIGSEGV

2016-07-07 Thread Saliya Ekanayake
l: Aborted (6) >>>> [titan01:01172] Signal code: (-6) >>>> [titan01:01173] *** Process received signal *** >>>> [titan01:01173] Signal: Aborted (6) >>>> [titan01:01173] Signal code: (-6) >>>> [titan01:01172] [ 0] /usr/lib64/libpthread.so.0(+0xf100)[0x2b7e9596a100] >>>> [titan01:01172] [ 1] /usr/lib64/libc.so.6(gsignal+0x37)[0x2b7e95fc75f7] >>>> [titan01:01172] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2b7e95fc8ce8] >>>> [titan01:01172] [ 3] >>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x742ac5)[0x2b7e96a95ac5] >>>> [titan01:01172] [ 4] >>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a2137)[0x2b7e96bf5137] >>>> [titan01:01172] [ 5] >>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(JVM_handle_linux_signal+0x140)[0x2b7e96a995e0] >>>> [titan01:01172] [ 6] [titan01:01173] [ 0] >>>> /usr/lib64/libpthread.so.0(+0xf100)[0x2af694ded100] >>>> [titan01:01173] [ 1] /usr/lib64/libc.so.6(+0x35670)[0x2b7e95fc7670] >>>> [titan01:01172] [ 7] [0x2b7e9c86e3a1] >>>> [titan01:01172] *** End of error message *** >>>> /usr/lib64/libc.so.6(gsignal+0x37)[0x2af69544a5f7] >>>> [titan01:01173] [ 2] /usr/lib64/libc.so.6(abort+0x148)[0x2af69544bce8] >>>> [titan01:01173] [ 3] >>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x742ac5)[0x2af695f18ac5] >>>> [titan01:01173] [ 4] >>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(+0x8a2137)[0x2af696078137] >>>> [titan01:01173] [ 5] >>>> /home/gl069/bin/jdk1.7.0_25/jre/lib/amd64/server/libjvm.so(JVM_handle_linux_signal+0x140)[0x2af695f1c5e0] >>>> [titan01:01173] [ 6] /usr/lib64/libc.so.6(+0x35670)[0x2af69544a670] >>>> [titan01:01173] [ 7] [0x2af69c0693a1] >>>> [titan01:01173] *** End of error message *** >>>> --- >>>> Primary job terminated normally, but 1 process returned >>>> a non-zero exit code. Per user-direction, the job has been aborted. >>>> --- >>>> >>>> -- >>>> mpirun noticed that process rank 1 with PID 0 on node titan01 exited on >>>> signal 6 (Aborted). 
>>>> >>>> >>>> CONFIGURATION: >>>> I used the ompi master sources from github: >>>> commit 267821f0dd405b5f4370017a287d9a49f92e734a >>>> Author: Gilles Gouaillardet <gil...@rist.or.jp> >>>> Date: Tue Jul 5 13:47:50 2016 +0900 >>>> >>>> ./configure --enable-mpi-java >>>> --with-jdk-dir=/home/gl069/bin/jdk1.7.0_25 --disable-dlopen >>>> --disable-mca-dso >>>> >>>> Thanks a lot for your help! >>>> Gundram >>>> >>>> ___ >>>> users mailing list >>>> us...@open-mpi.org >>>> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users >>>> Link to this post: >>>> http://www.open-mpi.org/community/lists/users/2016/07/29584.php -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing | Digital Science Center Indiana University, Bloomington

Re: [OMPI users] The ompi/mca/cool/sm will not be used on multi-nodes?

2016-06-30 Thread Saliya Ekanayake
mpirun --mca coll_ml_priority 100 ... > > Cheers, > > Gilles > > On Thursday, June 30, 2016, Saliya Ekanayake <esal...@gmail.com> wrote: > >> Thank you, Gilles. The reason for digging into intra-node optimizations >> is that we've implemented several machine learning a

Re: [OMPI users] The ompi/mca/cool/sm will not be used on multi-nodes?

2016-06-30 Thread Saliya Ekanayake
abric, but I do not know the details...) > > Cheers, > > Gilles > > On Thursday, June 30, 2016, Saliya Ekanayake <esal...@gmail.com> wrote: > >> OK, I am beginning to see how it works now. One question I still have is, >> in the case of a multi-node communicator it seems c

Re: [OMPI users] The ompi/mca/cool/sm will not be used on multi-nodes?

2016-06-30 Thread Saliya Ekanayake
. Cheers, Gilles On Thursday, June 30, 2016, Saliya Ekanayake <esal...@gmail.com> wrote: > Thank you, Gilles. > > What is the bcast I should look for? In general, how do I know which > module was used for which communication - can I print this info? > On Jun 30, 2016 3:19 AM,

Re: [OMPI users] The ompi/mca/cool/sm will not be used on multi-nodes?

2016-06-30 Thread Saliya Ekanayake
Cheers, > > > Gilles > On 6/30/2016 3:04 PM, Saliya Ekanayake wrote: > > Hi, > > Looking at the *ompi/mca/coll/sm/coll_sm_module.c* it seems this module > will be used only if the calling communicator solely groups processes > within a node. I've got two questions here. >

[OMPI users] The ompi/mca/cool/sm will not be used on multi-nodes?

2016-06-30 Thread Saliya Ekanayake
-- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing | Digital Science Center Indiana University, Bloomington

[OMPI users] How to know if SM collective is being used?

2016-06-29 Thread Saliya Ekanayake
Hi, I see in *mca_coll_sm_comm_query()* of *ompi/mca/coll/sm/coll_sm_module.c* that allreduce and bcast have shared memory implementations. Is there a way to know if this implementation is being used when running my program that calls these collectives? Thank you, Saliya -- Saliya

Re: [OMPI users] OpenMP explicit thread affinity with MPI

2016-06-29 Thread Saliya Ekanayake
rsion seem to support that, though. On Wed, Jun 29, 2016 at 1:20 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > Thank you, Ralph and Gilles. > > I didn't know about the OMPI_COMM_WORLD_LOCAL_RANK variable. Essentially, > this means I should be able to wrap my applic

Re: [OMPI users] OpenMP explicit thread affinity with MPI

2016-06-29 Thread Saliya Ekanayake
M_WORLD_LOCAL_RANK > envar, and then use that to calculate the offset location for your threads > (i.e., local rank 0 is on socket 0, local rank 1 is on socket 1, etc.). You > can then putenv the correct value of the GOMP envar > > > On Jun 28, 2016, at 8:40 PM, Saliya
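The suggestion above (read the local rank, compute an offset, then set the GOMP variable) might look like the following sketch. Several things here are assumptions, not details from the thread: the variable is taken to be `GOMP_CPU_AFFINITY`, cores are assumed to be numbered contiguously so that each local rank owns one block of `cores_per_rank` cores, and the function names are hypothetical. It would have to run in a wrapper before the OpenMP runtime starts.

```python
import os


def gomp_affinity_for(local_rank, cores_per_rank):
    """Return a core range such as "6-11" for the block owned by this local rank."""
    first = local_rank * cores_per_rank
    last = first + cores_per_rank - 1
    return f"{first}-{last}"


def configure_env(env, cores_per_rank):
    """Return a copy of env with GOMP_CPU_AFFINITY derived from the local rank.

    OMPI_COMM_WORLD_LOCAL_RANK is set by Open MPI for each launched process;
    it defaults to 0 here so the sketch also runs outside mpirun.
    """
    local_rank = int(env.get("OMPI_COMM_WORLD_LOCAL_RANK", "0"))
    new_env = dict(env)
    new_env["GOMP_CPU_AFFINITY"] = gomp_affinity_for(local_rank, cores_per_rank)
    return new_env


# Example: 6 cores per rank, so local rank 1 gets the second block of six cores.
example = configure_env({"OMPI_COMM_WORLD_LOCAL_RANK": "1"}, cores_per_rank=6)
print(example["GOMP_CPU_AFFINITY"])
```

A wrapper could pass the resulting environment to `os.execvpe` to start the real application; whether the core numbers match the machine's layout should be checked with lstopo first.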

Re: [OMPI users] Why communication performance change with binding PEs?

2016-06-23 Thread Saliya Ekanayake
few threads are used per process, i guess case 1 and case 2 > will become pretty close. > > i also suggest that for cases 2 and 3, you bind processes to a socket > instead of no binding at all > > Cheers, > > Gilles > > On 6/23/2016 2:41 PM, Saliya Ekanayake wrote: > > Thank

Re: [OMPI users] Why communication performance change with binding PEs?

2016-06-23 Thread Saliya Ekanayake
r do time sharing. > but if the task is bound on more than one core, then the task and the > helper run in parallel. > > > Cheers, > > Gilles > > On 6/23/2016 1:21 PM, Saliya Ekanayake wrote: > > Hi, > > I am trying to understand this peculiar behavior whe

[OMPI users] Why communication performance change with binding PEs?

2016-06-23 Thread Saliya Ekanayake
Hi, I am trying to understand this peculiar behavior where the communication time in OpenMPI changes depending on the number of process elements (cores) the process is bound to. Is this expected? Thank you, saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics

Re: [OMPI users] Broadcast faster than barrier

2016-05-30 Thread Saliya Ekanayake
etails. > > Jeff > > > -- > Jeff Hammond > jeff.scie...@gmail.com > http://jeffhammond.github.io/ > > ___ > users mailing list > us...@open-mpi.org > Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users > Link to this post: > http://www.open-mpi.o

Re: [OMPI users] Broadcast faster than barrier

2016-05-30 Thread Saliya Ekanayake
you measuring these times? > > Thanks, > > Matthieu > > -- > *From:* users [users-boun...@open-mpi.org] on behalf of Saliya Ekanayake [ > esal...@gmail.com] > *Sent:* Monday, May 30, 2016 7:53 AM > *To:* Open MPI Users > *Subject:* [OMPI u

[OMPI users] Broadcast faster than barrier

2016-05-30 Thread Saliya Ekanayake
Hi, I ran Ohio micro benchmarks for openmpi and noticed broadcast with smaller number of bytes is faster than a barrier - 2us vs 120us. I'm trying to understand how this could happen? Thank you Saliya

Re: [OMPI users] mpirun java

2016-05-23 Thread Saliya Ekanayake
..@open-mpi.org >> Subscription: https://www.open-mpi.org/mailman/listinfo.cgi/users >> Link to this post: >> http://www.open-mpi.org/community/lists/users/2016/05/29285.php >> > > > ___ > users mailing list > us...

Re: [OMPI users] The effect of --bind-to in the presence of PE=N in --map-by

2016-05-21 Thread Saliya Ekanayake
other physical resource. > > It is true that we generally configure our schedulers to set the max > #slots on each node to equal the #cores on the node - but that is purely a > configuration choice. > > > On May 19, 2016, at 4:29 PM, Saliya Ekanayake <esal...@gmail.com> wrote: >

Re: [OMPI users] The effect of --bind-to in the presence of PE=N in --map-by

2016-05-19 Thread Saliya Ekanayake
s and would > > like to pin them to each core the process has been bound to. > > > > On Thu, May 19, 2016 at 3:46 PM, Ralph Castain <r...@open-mpi.org>wrote: > > Perhaps we should error out, but at the moment, PE=4 forces bind-to-core > and so the bind-to socket i

Re: [OMPI users] The effect of --bind-to in the presence of PE=N in --map-by

2016-05-19 Thread Saliya Ekanayake
c bound to 4 cores. That is what we will do - as I said, the > --bind-to socket directive will be ignored. > > On May 19, 2016, at 1:03 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > > So if bind-to-core is in effect, does that mean it'll run only on 1 core > even though

Re: [OMPI users] The effect of --bind-to in the presence of PE=N in --map-by

2016-05-19 Thread Saliya Ekanayake
Castain <r...@open-mpi.org> wrote: > Perhaps we should error out, but at the moment, PE=4 forces bind-to-core > and so the bind-to socket is being ignored > > On May 19, 2016, at 12:06 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > > Hi, > > I understand

[OMPI users] The effect of --bind-to in the presence of PE=N in --map-by

2016-05-19 Thread Saliya Ekanayake
-to socket My understanding is that this will give each process 4 cores. Now, with bind to socket, does that mean it's possible that within a socket the assigned 4 cores for a process may change? Or will they stay in the same 4 cores always? Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate

Re: [OMPI users] Java MPI Code for NAS Benchmarks

2016-03-11 Thread Saliya Ekanayake
paper at > > https://github.com/open-mpi/ompi-java-test > > Howard > > > 2016-02-27 23:01 GMT-07:00 Saliya Ekanayake <esal...@gmail.com>: > >> Hi, >> >> I see this paper from Oscar refers to a Java implementation of NAS >> benchmarks. Is this work publ

[OMPI users] Java MPI Code for NAS Benchmarks

2016-02-28 Thread Saliya Ekanayake
/291695433_SPIDAL_Java_High_Performance_Data_Analytics_with_Java_and_MPI_on_Large_Multicore_HPC_Clusters) and would like to test out the work in the above paper as well. Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing | Digital Science Center Indiana University, Bloomington

Re: [OMPI users] How to allocate more memory to java OpenMPI

2016-01-20 Thread Saliya Ekanayake
extra flags to the java command line so the JVM can > allocate more memory. > java -Xmx=... > or something like that (and that could be JVM dependent) > > Cheers, > > Gilles > > > On Thursday, January 21, 2016, Saliya Ekanayake <esal...@gmail.com> wrote: > >> Hi I
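As a hedged illustration of the "extra flags to the java command line" idea: on HotSpot JVMs the maximum-heap flag is written without an equals sign, for example `-Xmx4096m`. The sketch below only assembles the command; the function name, the classpath `app.jar`, and the class name `Hello` are placeholders invented for the example.

```python
def java_mpi_command(nprocs, heap_mb, classpath, main_class):
    """Build an mpirun argv that starts one JVM per rank with a fixed max heap."""
    heap_flag = f"-Xmx{heap_mb}m"  # HotSpot syntax: -Xmx<size>[k|m|g], no '='
    return ["mpirun", "-np", str(nprocs),
            "java", heap_flag, "-cp", classpath, main_class]


cmd = java_mpi_command(2, 4096, "app.jar", "Hello")
print(" ".join(cmd))
```

Each rank gets its own JVM, so the heap size applies per process; several ranks on one node multiply the memory needed.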

Re: [OMPI users] How to allocate more memory to java OpenMPI

2016-01-20 Thread Saliya Ekanayake
__ users mailing list > us...@open-mpi.org Subscription: > http://www.open-mpi.org/mailman/listinfo.cgi/users Link to this post: > http://www.open-mpi.org/community/lists/users/2016/01/28302.php > > ___ > users mailing list > us...@

Re: [OMPI users] Help with Binding in 1.8.8: Use only second socket

2015-12-21 Thread Saliya Ekanayake
>> us...@open-mpi.org >> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users >> Link to this post: >> http://www.open-mpi.org/community/lists/users/2015/12/28190.php >> >> >> >> _______ >> users mailing list >>

Re: [OMPI users] Setting coll_sm_priority = 35 didn't improve communication performance

2015-12-09 Thread Saliya Ekanayake
NULL; > > that means the coll sm module does *not* implement allgatherv, so openmpi > will use the next module > (which is very likely the default module, that is why there is no > performance improvement in your specific benchmark) > > Cheers, > > Gilles > > >

[OMPI users] Setting coll_sm_priority = 35 didn't improve communication performance

2015-12-09 Thread Saliya Ekanayake
for different number of processes per node on 48 nodes. The total message size is kept constant at 240 bytes (or 2.28MB). Am I doing something wrong here? Thank you, saliya [image: Inline image 1] -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing

[OMPI users] Binding to hardware thread

2015-09-27 Thread Saliya Ekanayake
Hi, I couldn't find any option in OpenMPI to bind a process to a hardware thread. I am assuming this is not yet supported through binding options. Could specifying a rank file be used as a workaround for this? Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Saliya Ekanayake
ut that error is a bug and I’ll have > to fix it. > > > On Sep 13, 2015, at 1:10 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > > I could get it working by manually generating a rankfile all the ranks and > not using any --map-by options. > > I'll try the --map-by c
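The workaround mentioned above, generating a rankfile for all ranks by hand, could be scripted roughly as follows. This is a sketch under assumptions: it uses the `rank N=host slot=S` rankfile line format, the host names are made up, and whether the slot numbers correspond to the cores you expect depends on how the machine numbers them.

```python
def rankfile_lines(hosts, slots):
    """Assign consecutive ranks to each (host, slot) pair, host by host."""
    lines = []
    rank = 0
    for host in hosts:
        for slot in slots:
            lines.append(f"rank {rank}={host} slot={slot}")
            rank += 1
    return lines


# Twelve ranks per node on the even-numbered slots 0,2,...,22 (hypothetical hosts).
even_slots = list(range(0, 24, 2))
lines = rankfile_lines(["node01", "node02"], even_slots)
print("\n".join(lines[:3]))
```

The lines would be written to a file and handed to mpirun with its rankfile option, without any --map-by directive, as described in the message.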

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Saliya Ekanayake
> > > On 09/13/2015 09:41 AM, Saliya Ekanayake wrote: > > I tried, > > --map-by ppr:12:node --slot-list 0,2,4,6,8,10,12,14,16,18,20,22 --bind-to > core -np 12 > > but it complains, > > "Conflicting directives for binding policy are causing the policy > to

Re: [OMPI users] Help with Specific Binding

2015-09-13 Thread Saliya Ekanayake
t as I don’t know how your machine numbers them, and I > can’t guarantee it will work - but it’s worth a shot. If it doesn’t, then I > may have to add an option for such purposes > > Ralph > > On Sep 12, 2015, at 7:39 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > >

[OMPI users] Help with Specific Binding

2015-09-12 Thread Saliya Ekanayake
a process bind to 2 cores, which is not what I want. --map-by ppr:12:node:PE=1,SPAN Thank you, Saliya [image: Inline image 1] -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing | Digital Science Center Indiana University, Bloomington Cell 812-391-4914 http

Re: [OMPI users] OpenMPI optimizations for intra-node process communication

2015-09-01 Thread Saliya Ekanayake
mplemented by the coll mca >> >> the sm coll mca is optimized for shared memory, but supports intra node >> communicators only. >> the ml and hierarch coll have some optimizations for intra node >> communications. >> as far as i know, none of these are used in producti

Re: [OMPI users] OpenMPI optimizations for intra-node process communication

2015-09-01 Thread Saliya Ekanayake
to use vader > > you can run > ompi_info --all | grep vader > to check the btl parameters, > of course, reading the source code is the best way to understand what the > vader btl can do and how > > Cheers, > > Gilles > > > > On 9/1/2015 1:28 PM, Saliya

Re: [OMPI users] OpenMPI optimizations for intra-node process communication

2015-09-01 Thread Saliya Ekanayake
> Gilles > > > On 9/1/2015 5:59 AM, Saliya Ekanayake wrote: > > Hi, > > Just trying to see if there are any optimizations (or options) in OpenMPI > to improve communication between intra node processes. For example do they > use something like shared memory? > > Thank

[OMPI users] OpenMPI optimizations for intra-node process communication

2015-08-31 Thread Saliya Ekanayake
Hi, Just trying to see if there are any optimizations (or options) in OpenMPI to improve communication between intra node processes. For example do they use something like shared memory? Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics

Re: [OMPI users] Passing a rank specific argument to the JVM

2015-07-20 Thread Saliya Ekanayake
Thank you. This is very nice! On Sun, Jul 19, 2015 at 2:25 PM, Ralph Castain <r...@open-mpi.org> wrote: > Yes > > On Jul 19, 2015, at 10:47 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > > So does this mean I can have different options for each process by >

Re: [OMPI users] Passing a rank specific argument to the JVM

2015-07-19 Thread Saliya Ekanayake
Nick Papior <nickpap...@gmail.com> wrote: > > Wrap the call in a bash script or the like, there are several examples on > this mailing list. > > I am sorry I am not at my computer so cannot find them. > On 19 Jul 2015 06:34, "Saliya Ekanayake" <esal...@gmail.com
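The wrapper idea above can be sketched as a launcher that derives a per-rank value and places it on the java command line. The thread suggests a bash script; this version is written in Python purely for illustration. The system property name `example.port`, the base port, and the class and jar names are hypothetical; only the `OMPI_COMM_WORLD_RANK` environment variable comes from Open MPI itself.

```python
import os


def per_rank_java_command(env, base_port, classpath, main_class):
    """Build a java argv whose JVM option carries a port unique to this rank."""
    rank = int(env.get("OMPI_COMM_WORLD_RANK", "0"))  # 0 when run outside mpirun
    port = base_port + rank
    return ["java", f"-Dexample.port={port}", "-cp", classpath, main_class]


# What ranks 0 and 1 would each execute, given a base port of 9000.
rank0 = per_rank_java_command({"OMPI_COMM_WORLD_RANK": "0"}, 9000, "app.jar", "Main")
rank1 = per_rank_java_command({"OMPI_COMM_WORLD_RANK": "1"}, 9000, "app.jar", "Main")
print(" ".join(rank0))
print(" ".join(rank1))
```

In a real wrapper the list built from `os.environ` would be passed to `os.execvp("java", argv)`, and mpirun would launch the wrapper instead of java directly.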

[OMPI users] Passing a rank specific argument to the JVM

2015-07-19 Thread Saliya Ekanayake
the port is passed as an option to the java command and not to the program. Now the port has to be different for the 2 MPI procs and I am not sure how this could be done. Any thoughts? Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing

Re: [OMPI users] What collective implementation is used when?

2015-07-09 Thread Saliya Ekanayake
rchical collective, which means they should be optimized for multi > node / multi tasks per node. > that being said, ml is not production ready, and i am not sure whether > hierarch is actively maintained) > > i hope this helps > > Gilles > > > On 7/9/2015 5:37 AM, Saliya

[OMPI users] What collective implementation is used when?

2015-07-08 Thread Saliya Ekanayake
the advantage of shared memory? [1] https://www.open-mpi.org/faq/?category=sm Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing | Digital Science Center Indiana University, Bloomington Cell 812-391-4914 http://saliya.org

Re: [OMPI users] Binding width affects allgatherv performance?

2015-07-06 Thread Saliya Ekanayake
Just checking if anyone has experienced a similar situation or has any pointers to understand this. Thank you Saliya On Jul 1, 2015 9:27 PM, "Saliya Ekanayake" <esal...@gmail.com> wrote: > Hi, > > I am getting strange performance results for allgatherv operation for t

[OMPI users] Binding width affects allgatherv performance?

2015-07-01 Thread Saliya Ekanayake
pected with binding width? I am a bit puzzled and would appreciate any help to understand this. [image: Inline image 1] Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing | Digital Science Center Indiana University, Bloomington Cell 8

Re: [OMPI users] Running 1 proc per socket but no more

2015-07-01 Thread Saliya Ekanayake
for "numa"). On Wed, Jul 1, 2015 at 4:04 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > Thank you Ralph > > Saliya > > On Wed, Jul 1, 2015 at 4:01 PM, Ralph Castain <r...@open-mpi.org> wrote: > >> Scenario 2: --map-by ppr:12:node,span --bind-

Re: [OMPI users] Running 1 proc per socket but no more

2015-07-01 Thread Saliya Ekanayake
> > > On Wed, Jul 1, 2015 at 2:42 PM, Saliya Ekanayake <esal...@gmail.com> > wrote: > >> Hi, >> >> I am doing some benchmarks and would like to test the following two >> scenarios. Each machine has 4 sockets each with 6 cores (lstopo image >> attach

[OMPI users] Running 1 proc per socket but no more

2015-07-01 Thread Saliya Ekanayake
to just 1 core. This is what I don't know how to do, because if I do --map-by socket:PE=1 then mpirun will put more than 12 procs per node as it can do so. I'd appreciate any help on this. Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics

Re: [OMPI users] Allgather Implementation Details

2015-07-01 Thread Saliya Ekanayake
Thank you George. This is very informative. Is it possible to pass the option at runtime rather than setting it in the config file? Thank you Saliya On Tue, Jun 30, 2015 at 7:20 PM, George Bosilca <bosi...@icl.utk.edu> wrote: > Saliya, > > On Tue, Jun 30, 2015 at 10:50 AM, Saliya

[OMPI users] Allgather Implementation Details

2015-06-30 Thread Saliya Ekanayake
://www.researchgate.net/profile/William_Gropp/publication/221597354_A_Simple_Pipelined_Algorithm_for_Large_Irregular_All-gather_Problems/links/00b49525d291830c6700.pdf Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing | Digital

Re: [OMPI users] Process Binding Warning

2015-03-13 Thread Saliya Ekanayake
Thank you. It worked! On Fri, Mar 13, 2015 at 10:37 AM, Ralph Castain <r...@open-mpi.org> wrote: > You shouldn’t have to do so > > On Mar 13, 2015, at 7:14 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > > Thanks Ralph. Do I need to specify where to find numactl-dev

Re: [OMPI users] Process Binding Warning

2015-03-13 Thread Saliya Ekanayake
ame location as your proc. > As the warning indicates, it can impact performance but won't stop you from > running > > > On Mar 12, 2015, at 12:51 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > > Hi, > > I am getting the following binding warning an

[OMPI users] Process Binding Warning

2015-03-12 Thread Saliya Ekanayake
[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]], socket 0[core 4[hwt 0]], socket 0[core 5[hwt 0]]: [B/B/B/B/B/B][./././././.][./././././.][./././././.] Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing | Digital Science Center

Re: [OMPI users] Max Registerable Memory Warning

2015-02-09 Thread Saliya Ekanayake
> What OFED version are you running? If not the latest, is it possible to > upgrade to the latest OFED? Otherwise, can you try the latest OMPI release (>= > v1.8.4), where this warning is ignored on older OFEDs > > -Devendar > > On Sun, Feb 8, 2015 at 12:37 PM, Saliya Ekanayake <esal...

[OMPI users] Max Registerable Memory Warning

2015-02-08 Thread Saliya Ekanayake
memory (kbytes, -v) unlimited file locks (-x) unlimited Thank you, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing | Digital Science Center Indiana University, Bloomington Cell 812-391-4914 http://saliya.org

Re: [OMPI users] Accessing Process Affinity within MPI Program

2015-01-07 Thread Saliya Ekanayake
MPI extension (you must configure Open MPI with > --enable-mpi-ext=affinity or --enable-mpi-ext=all). See: > > http://www.open-mpi.org/doc/v1.8/man3/OMPI_Affinity_str.3.php > > > > On Dec 21, 2014, at 1:57 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > >

Re: [OMPI users] OMPI users] What could cause a segfault in OpenMPI?

2014-12-29 Thread Saliya Ekanayake
> So you are saying the test worked, but you are still encountering an error > when executing an MPI job? Or are you saying things now work? > > > On Dec 28, 2014, at 5:58 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > > Thank you Ralph. This produced the warni

Re: [OMPI users] OMPI users] What could cause a segfault in OpenMPI?

2014-12-28 Thread Saliya Ekanayake
admin try running the ibv_ud_pingpong test - that will exercise > the portion of the system under discussion. > > > On Dec 28, 2014, at 2:31 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > > What I heard from the administrator is that, > > "The tests that work are the

Re: [OMPI users] OMPI users] What could cause a segfault in OpenMPI?

2014-12-28 Thread Saliya Ekanayake
What I heard from the administrator is that, "The tests that work are the simple utilities ib_read_lat and ib_read_bw that measures latency and bandwidth between two nodes. They are part of the "perftest" repo package." On Dec 28, 2014 10:20 AM, "Saliya Ekanayake"

Re: [OMPI users] OMPI users] What could cause a segfault in OpenMPI?

2014-12-28 Thread Saliya Ekanayake
ct/btl_openib_connect_udcm.c:736: udcm_module_finalize: > Assertion `((0xdeafbeedULL << 32) + 0xdeafbeedULL) == ((opal_object_t *) > (>cm_recv_msg_queue))->obj_magic_id' failed. > > Thank you, > Saliya > > On Mon, Nov 10, 2014 at 10:01 AM, Saliya Ekanayake <esal...@gm

Re: [OMPI users] What could cause a segfault in OpenMPI?

2014-12-27 Thread Saliya Ekanayake
ject_t *) (>cm_recv_msg_queue))->obj_magic_id' failed. Thank you, Saliya On Mon, Nov 10, 2014 at 10:01 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > Thank you Jeff, I'll try this and let you know. > > Saliya > On Nov 10, 2014 6:42 AM, "Jeff Squyres (jsquyres)" <j

Re: [OMPI users] Question on Mapping and Binding

2014-12-22 Thread Saliya Ekanayake
Thank you and one last question. Is it possible to avoid a core and instruct OMPI to use only the other cores? On Mon, Dec 22, 2014 at 2:08 PM, Ralph Castain <r...@open-mpi.org> wrote: > > On Dec 22, 2014, at 10:45 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > > Hi R

Re: [OMPI users] Question on Mapping and Binding

2014-12-22 Thread Saliya Ekanayake
N was specified, but maybe that got lost. > > > On Dec 22, 2014, at 8:32 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > > Hi, > > I've been using --map-by socket:PE=N, where N is used to control the > number of cores a proc gets mapped to. Does this also guarant

[OMPI users] Question on Mapping and Binding

2014-12-22 Thread Saliya Ekanayake
, Saliya -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics and Computing | Digital Science Center Indiana University, Bloomington Cell 812-391-4914 http://saliya.org

[OMPI users] Accessing Process Affinity within MPI Program

2014-12-21 Thread Saliya Ekanayake
Hi, Is it possible to get information on the process affinity that's set in mpirun command within the MPI program? For example I'd like to know the number of cores that a given rank is bound to. Thank you -- Saliya Ekanayake Ph.D. Candidate | Research Assistant School of Informatics
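One OS-level way to look at the question above, independent of any MPI call, is to ask the kernel which CPUs the current process may run on. The sketch below is an assumption-laden illustration, not the answer given on the list: it relies on Linux's affinity interface as exposed by Python, the function name is hypothetical, and it counts logical CPUs (hardware threads), which can exceed the number of physical cores when hyperthreading is enabled.

```python
import os


def bound_cpu_ids():
    """Return the sorted logical CPU ids this process is allowed to run on."""
    if hasattr(os, "sched_getaffinity"):        # available on Linux
        return sorted(os.sched_getaffinity(0))  # 0 means the calling process
    # Fallback for platforms without an affinity API: assume all CPUs.
    return list(range(os.cpu_count() or 1))


ids = bound_cpu_ids()
rank = os.environ.get("OMPI_COMM_WORLD_RANK", "n/a")
print(f"rank {rank}: bound to {len(ids)} logical CPU(s): {ids}")
```

Run under mpirun with a binding policy, each rank would report only the CPUs in its own binding; run unbound, it would list every CPU on the node.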

Re: [OMPI users] What could cause a segfault in OpenMPI?

2014-11-10 Thread Saliya Ekanayake
> no additional information to give a clue as to what is happening. :-( > > > > On Nov 9, 2014, at 11:43 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > > > Hi Jeff, > > > > You are probably busy, but just checking if you had a chance to look at > this.

Re: [OMPI users] What could cause a segfault in OpenMPI?

2014-11-09 Thread Saliya Ekanayake
Hi Jeff, You are probably busy, but just checking if you had a chance to look at this. Thanks, Saliya On Thu, Nov 6, 2014 at 9:19 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > Hi Jeff, > > I've attached a tar file with information. > > Thank you, > Saliya > >

Re: [OMPI users] What could cause a segfault in OpenMPI?

2014-11-06 Thread Saliya Ekanayake
pi.org/community/help/ > > > > On Nov 4, 2014, at 1:10 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > > > Hi, > > > > I am using OpenMPI 1.8.1 in a Linux cluster that we recently setup. It > builds fine, but when I try to run even the simplest hello.c prog

Re: [OMPI users] What could cause a segfault in OpenMPI?

2014-11-04 Thread Saliya Ekanayake
wrote: > Hello Saliya, > > Would you mind trying to reproduce the problem using the latest 1.8 > release - 1.8.3? > > Thanks, > > Howard > > > 2014-11-04 11:10 GMT-07:00 Saliya Ekanayake <esal...@gmail.com>: > >> Hi, >> >> I am using Ope

[OMPI users] What could cause a segfault in OpenMPI?

2014-11-04 Thread Saliya Ekanayake
. The ompi_info is attached. 2. cd to examples directory and mpicc hello_c.c 3. mpirun -np 2 ./a.out 4. Error text is attached. Please let me know if you need more info. Thank you, Saliya -- Saliya Ekanayake esal...@gmail.com Cell 812-391-4914 Home 812-961-6383 http://saliya.org

Re: [OMPI users] Best way to communicate a 2d array with Java binding

2014-08-22 Thread Saliya Ekanayake
Please find inline comments. On Fri, Aug 22, 2014 at 3:45 PM, Rob Latham <r...@mcs.anl.gov> wrote: > > > On 08/22/2014 02:40 PM, Saliya Ekanayake wrote: > >> Yes, these are all MPI_DOUBLE >> > > well, yeah, but since you are talking about copying

Re: [OMPI users] Best way to communicate a 2d array with Java binding

2014-08-22 Thread Saliya Ekanayake
Yes, these are all MPI_DOUBLE On Fri, Aug 22, 2014 at 3:38 PM, Rob Latham <r...@mcs.anl.gov> wrote: > > > On 08/22/2014 10:10 AM, Saliya Ekanayake wrote: > >> Hi, >> >> I've a quick question about the usage of Java binding. >> >> Say there's a 2

Re: [OMPI users] Best way to communicate a 2d array with Java binding

2014-08-22 Thread Saliya Ekanayake
for copying? Thank you, Saliya On Fri, Aug 22, 2014 at 3:24 PM, Oscar Vega-Gisbert <ov...@dsic.upv.es> wrote: > El 22/08/14 20:44, Saliya Ekanayake escribió: > > Thank you Oscar for the detailed information, but I'm still wondering how >> would the copying in 2 would be diffe

Re: [OMPI users] Best way to communicate a 2d array with Java binding

2014-08-22 Thread Saliya Ekanayake
Thank you Oscar for the detailed information, but I'm still wondering how the copying in 2 would be different from what's done here with copying to a buffer. On Fri, Aug 22, 2014 at 2:17 PM, Oscar Vega-Gisbert <ov...@dsic.upv.es> wrote: > El 22/08/14 17:10, Saliya Ekanayake

[OMPI users] Best way to communicate a 2d array with Java binding

2014-08-22 Thread Saliya Ekanayake
it I guess 2 would internally do the copying to a buffer and use it, so suggesting 1. is the best option. Is this the case or is there a better way to do this? Thank you, Saliya -- Saliya Ekanayake esal...@gmail.com http://saliya.org

Re: [OMPI users] SIGSEGV for Java program in openmpi-1.8.2rc2 on Solaris 10

2014-07-25 Thread Saliya Ekanayake
> us...@open-mpi.org > > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users > > Link to this post: > http://www.open-mpi.org/community/lists/users/2014/07/24870.php > > > -- > Jeff Squyres > jsquy...@cisco.com > For corporate legal information go to: > http://www.cisco.com/web/about/doing_business/legal/cri/ > > ___ > users mailing list > us...@open-mpi.org > Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/users > Link to this post: > http://www.open-mpi.org/community/lists/users/2014/07/24874.php > -- Saliya Ekanayake esal...@gmail.com Cell 812-391-4914 Home 812-961-6383 http://saliya.org

Re: [OMPI users] OpenMPI with Gemini Interconnect

2014-04-16 Thread Saliya Ekanayake
ly, due to a Cray bug, case 80503, that > has > >not yet worked. > >Ray > > > >On 4/16/2014 4:44 PM, Saliya Ekanayake wrote: > > > > Hi, > > We have a Cray XE6/XK7 supercomputer (BigRed II) and I was

Re: [OMPI users] OpenMPI with Gemini Interconnect

2014-04-16 Thread Saliya Ekanayake
for the CCM > virtual cluster. Unfortunately, due to a Cray bug, case 80503, that has > not yet worked. > Ray > > > On 4/16/2014 4:44 PM, Saliya Ekanayake wrote: > > Hi, > > We have a Cray XE6/XK7 supercomputer (BigRed II) and I was trying to get > OpenM

[OMPI users] OpenMPI with Gemini Interconnect

2014-04-16 Thread Saliya Ekanayake
could give some suggestions on how to build OpenMPI with Gemini support. [1] https://www.open-mpi.org/papers/cug-2012/cug_2012_open_mpi_for_cray_xe_xk.pdf Thank you, Saliya -- Saliya Ekanayake esal...@gmail.com http://saliya.org

Re: [OMPI users] Optimal mapping/binding when threads are used?

2014-04-12 Thread Saliya Ekanayake
Just an update. Yes, binding to all is the same as binding to none. I was misled by my memory :) On Fri, Apr 11, 2014 at 1:22 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > Thank you Ralph for the details and it's a good point you mentioned on > mapping by node vs socket. We

Re: [OMPI users] Optimal mapping/binding when threads are used?

2014-04-11 Thread Saliya Ekanayake
emory, and so messaging will run slower - > and you want the ranks that share a node to be the ones that most > frequently communicate to each other, if you can identify them. > > HTH > Ralph > > On Apr 10, 2014, at 5:59 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > >

[OMPI users] Optimal mapping/binding when threads are used?

2014-04-10 Thread Saliya Ekanayake
eed up these Tx*1*xN cases? Also, I expected B to perform better than A as threads could utilize all 8 cores, but it wasn't the case. Thank you, Saliya [image: Inline image 1] -- Saliya Ekanayake esal...@gmail.com Cell 812-391-4914 Home 812-961-6383 http://saliya.org

Re: [OMPI users] Contributing Examples for Java Binding

2014-04-08 Thread Saliya Ekanayake
org/faq/?category=java). > > > On Apr 3, 2014, at 7:09 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > > > Great. I will cleanup and send you a tarball. > > > > Thank you > > Saliya > > > > On Apr 3, 2014 5:51 PM, "Ralph Castain&

Re: [OMPI users] Contributing Examples for Java Binding

2014-04-03 Thread Saliya Ekanayake
> offlist. > > Thanks! > Ralph > > On Apr 3, 2014, at 1:44 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > > Hi, > > I've been working on some applications in our group where I've been using > OpenMPI Java binding. Over the course of this work, I've accumulate

[OMPI users] Contributing Examples for Java Binding

2014-04-03 Thread Saliya Ekanayake
clustering and matrix multiplication. If possible I'd like to contribute these to OpenMPI and wonder what's your input on this. Thank you, Saliya -- Saliya Ekanayake esal...@gmail.com Cell 812-391-4914 Home 812-961-6383 http://saliya.org

Re: [OMPI users] How to replace --cpus-per-proc by --map-by

2014-03-27 Thread Saliya Ekanayake
[B/B/B/B][./././.] > > [i52:31765] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket 0[core 1 > [hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: > [B/B/B/B][./././.] > > > > > > Is there a better way without using -cpus-per-proc as suggested to

[OMPI users] How to replace --cpus-per-proc by --map-by

2014-03-27 Thread Saliya Ekanayake
][./././.]* *[i52:31765] MCW rank 1 bound to socket 0[core 0[hwt 0]], socket 0[core 1[hwt 0]], socket 0[core 2[hwt 0]], socket 0[core 3[hwt 0]]: [B/B/B/B][./././.]* Is there a better way without using -cpus-per-proc as suggested to get the same effect? Thank you, Saliya -- Saliya Ekanayake esal

Re: [OMPI users] OpenMPI + Hadoop

2014-03-21 Thread Saliya Ekanayake
/Greenplum_RalphCastain-1up.pdf, but wonder if there's some detailed steps on getting a simple MR program running with OpenMPI. Thank you, Saliya On Mon, Feb 24, 2014 at 1:22 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > Thank you Ralph. I'll get back to you if I run into issues. > > &

Re: [OMPI users] efficient strategy with temporary message copy

2014-03-17 Thread Saliya Ekanayake
Also, this presentation might be useful http://extremecomputingtraining.anl.gov/files/2013/07/tuesday-slides2.pdf Thank you, Saliya On Mar 17, 2014 2:18 PM, "christophe petit" wrote: > Thanks Jeff, I understand better the different cases and how to choose as > a

Re: [OMPI users] SIGSEV when running OMPI Java binding

2014-03-14 Thread Saliya Ekanayake
ssue was an errant large array on the stack in debug builds, which > would cause JVMs to run out of stack space. > > The fix is on the SVN trunk now; it will be on the v1.7 branch shortly. > > > On Mar 11, 2014, at 5:06 PM, Saliya Ekanayake <esal...@gmail.com> wrote: >

Re: [OMPI users] SIGSEV when running OMPI Java binding

2014-03-13 Thread Saliya Ekanayake
Just checking if there's some solution for this. Thank you, Saliya On Tue, Mar 11, 2014 at 10:54 PM, Saliya Ekanayake <esal...@gmail.com> wrote: > I forgot to mention that I tried the hello.c version instead of Java and > it too failed in a similar manner, but > > 1. On a sing

Re: [OMPI users] SIGSEV when running OMPI Java binding

2014-03-11 Thread Saliya Ekanayake
ou try: >> >> mpirun --mca coll ^ml ... >> >> This will deactivate the "ml" collective component. See if that enables >> you to run (this particular component has nothing to do with Java). >> >> >> On Mar 11, 2014, at 1:33 AM, Saliya Eka

Re: [OMPI users] SIGSEV when running OMPI Java binding

2014-03-11 Thread Saliya Ekanayake
ponent. See if that enables > you to run (this particular component has nothing to do with Java). > > > On Mar 11, 2014, at 1:33 AM, Saliya Ekanayake <esal...@gmail.com> wrote: > > > Just tested that this happens even with the simple Hello.java program > given in OMPI di

Re: [OMPI users] SIGSEV when running OMPI Java binding

2014-03-11 Thread Saliya Ekanayake
lar > MPI function that you're using that results in this segv (e.g., perhaps we > have a specific bug somewhere)? > > Can you reduce the segv to a small example that we can reproduce (and > therefore fix)? > > > On Mar 10, 2014, at 12:05 AM, Saliya Ekanayake <esal...@gmail
