On Jun 28, 2011, at 10:04 AM, David E Hudak wrote:
>
> On Jun 24, 2011, at 6:32 PM, Igor Peshansky wrote:
>
>> Hi, Dave,
>>
>> Since 2.1.2, X10 comes with what we call a multi-vm Java backend
>> implementation. It runs with the sockets transport. For sockets, the
>> Java runner, "x10", uses the same launcher as the C++ backend
>> ("X10Launcher"), so one can run it on, e.g., a Linux cluster by
>> setting X10_NPLACES and X10_HOSTFILE, just like you would for a C++
>> launch.
>
> I am using X10 2.2.0. I have figured out how to launch multi-vm across
> multiple nodes of the OSC cluster using the x10 command with X10_NPLACES and
> X10_HOSTFILE.
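
For anyone following along, the launch setup described above can be sketched in a few lines of Python. This is only an illustration: X10_NPLACES, X10_HOSTFILE and the x10 command come from the thread, while the helper names, the node names and the one-hostname-per-line hostfile layout are assumptions made for the example.

```python
import os
import subprocess


def x10_launch_env(hosts, nplaces, hostfile_path):
    """Write a hostfile and return a copy of the environment with the X10 variables set."""
    with open(hostfile_path, "w") as f:
        # Assumed hostfile layout: one hostname per line.
        f.write("\n".join(hosts) + "\n")
    env = dict(os.environ)
    env["X10_NPLACES"] = str(nplaces)   # number of places (one JVM each in this test)
    env["X10_HOSTFILE"] = hostfile_path
    return env


def launch(main_class, env, args=()):
    """Run the `x10` script found on PATH with the prepared environment."""
    return subprocess.run(["x10", main_class, *args], env=env).returncode
```

With hypothetical node names, x10_launch_env(["node01", "node02", "node03", "node04"], 4, "hosts.txt") followed by launch("MontyPi", env) would correspond to the four-node run described below.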
>
> However, for our cluster we are always concerned about process cleanup. So,
> as a test, I ran a four-node job running MontyPi (good old MontyPi!), running
> one Java instance per node. I was logged into the nodes and ran top on each.
> Sure enough, one 'java' process per node.
>
> Then, I ran it again, but this time I manually killed one of the JVMs - one
> of the launcher "children". A sibling on one node exited, but a sibling on
> another node did not, and the parent did not exit. I was hoping that an exit
> of any process would cause the entire set of processes to shut down. Looking
> at the source code for launcher.cc, it looks like there are hooks in there
> (Launcher::handleDeadChild and Launcher::handleDeadParent), but I don't know
> the expected behavior.
>
> Typically, we rely on the Torque resource manager to start processes. There
> is a torque daemon on each node of the job. The daemons fork child processes
> to do the work. If one child exits, the daemon is notified and in turn
> notifies the other nodes' daemons. If the X10 launcher is the supported
> process management mechanism, it would be a good idea to have it work with
> resource managers. Open MPI used to have a standalone project called Open
> Run Time Environment (Open RTE or ORTE) which may be an interesting fit.
> However, at this point, I think I am going to do something in my job scripts
> to manually shut down processes after an exit as a workaround.
Oops! Hang on, just found X10_LAUNCHER_SSH. I'll let you know if I can get
that to work...
Dave
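
In case it helps, the "any exit shuts everything down" behavior hoped for above can be sketched as a small supervisor. This is a minimal local illustration, not what launcher.cc does: run_fail_fast and its parameters are hypothetical names, and on a real cluster each command would be whatever starts the per-node process.

```python
import subprocess
import sys
import time


def run_fail_fast(commands, poll_interval=0.1, grace=2.0):
    """Start every command; once any one exits, stop all the others.

    Returns the exit codes in the same order as `commands`.
    """
    procs = [subprocess.Popen(cmd) for cmd in commands]
    try:
        # Poll until at least one child has exited.
        while all(p.poll() is None for p in procs):
            time.sleep(poll_interval)
    finally:
        # Ask the survivors to exit, then force-kill any that ignore the request.
        for p in procs:
            if p.poll() is None:
                p.terminate()
        deadline = time.monotonic() + grace
        for p in procs:
            try:
                p.wait(timeout=max(0.0, deadline - time.monotonic()))
            except subprocess.TimeoutExpired:
                p.kill()
                p.wait()
    return [p.returncode for p in procs]


if __name__ == "__main__":
    # Hypothetical stand-ins for per-node workers: one exits at once, two would sleep.
    sleeper = [sys.executable, "-c", "import time; time.sleep(60)"]
    quitter = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(run_fail_fast([sleeper, quitter, sleeper]))
```

The cleanup sits in a finally block so that interrupting the supervisor itself also takes the children down, which is the property a resource manager would want from a launcher.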
>
>>
>> As far as I know, you cannot use the MPI transport with multi-vm.
>
> Understood. This is one advantage of the MPI transport: I use mpiexec to
> launch processes via Torque and the cleanup works correctly.
>
> Thanks,
> Dave
>
>> Igor
>>
>> On Fri, Jun 24, 2011 at 4:47 PM, David E Hudak <[email protected]> wrote:
>>> Hi All,
>>>
>>> I have a colleague with a Java implementation of a genetic algorithm. He
>>> is interested in parallelizing the application for both multicore and
>>> multinode execution.
>>>
>>> In the initial implementation, there is a set of classes for specifying
>>> fitness functions, expressing genes and implementing gene manipulations.
>>> There is a top-level simulation object that runs a variable number of
>>> generations. My plan was to try using the Java native interface to use the
>>> existing Java classes for organisms and fitness, and rewrite the top level
>>> simulation in X10.
>>>
>>> I have been evaluating X10 for purely numeric applications on our cluster
>>> (C++ back end, MPI runtime and mpiexec as a process launcher). I believe I
>>> read somewhere that the Java native interface requires the Java back end.
>>> In that case, I'd need to make sure we could run the sockets runtime and
>>> whatever process launcher we have for Java (x10run?).
>>>
>>> Anyone have any advice?
>>>
>>> Thanks,
>>> Dave
>>
>> ------------------------------------------------------------------------------
>> All the data continuously generated in your IT infrastructure contains a
>> definitive record of customers, application performance, security
>> threats, fraudulent activity and more. Splunk takes this data and makes
>> sense of it. Business sense. IT sense. Common sense..
>> http://p.sf.net/sfu/splunk-d2d-c1
>> _______________________________________________
>> X10-users mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/x10-users
>
> ---
> David E. Hudak, Ph.D. [email protected]
> Program Director, HPC Engineering
> Ohio Supercomputer Center
> http://www.osc.edu
>
---
David E. Hudak, Ph.D. [email protected]
Program Director, HPC Engineering
Ohio Supercomputer Center
http://www.osc.edu