Hello Fangfang,
On 18/09/12 03:08 PM, Fangfang Xia wrote:
> Thanks. I forgot to ask you one more question.
>
> Are there some ranks that work differently in Ray?
>
> Do we need to profile each rank?
>
[I CC'ed the Ray mailing list as this is informative to the community as well.]
Each Ray process is a virtual state machine running inside a supervisor.
Each Ray process has a slave mode and a master mode. Each process has a
slave mode handler table to handle slave modes, a master mode handler table to
handle master modes, and a message tag handler table to handle received
messages.
These 3 tables are managed by RayPlatform.
The runtime behavior of Ray is defined by the plugins it contains. Ray plugins
tell RayPlatform what the handles are and which handlers handle them.
Handles are integers and handlers are function pointers. Plugins are written
in C++ though, so handler registration is performed with some easy-to-use
binding adapters.
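
To make the handle/handler idea concrete, here is a minimal sketch in C++ of
integer handles mapped to function pointers in three tables. Every name in it
(HandlerTable, Process, registerExamplePlugin, the SLAVE_MODE_* values) is
hypothetical and chosen for illustration only; this is not the RayPlatform API
and it does not show the real binding adapters.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical names throughout; this is NOT the actual RayPlatform API.
typedef int Handle;                   // handles are integers
struct Process;                       // forward declaration
typedef void (*Handler)(Process&);    // handlers are function pointers

const std::size_t MAX_HANDLES = 16;   // arbitrary size for this sketch

// A table that maps an integer handle to a function pointer.
struct HandlerTable {
    Handler slots[MAX_HANDLES];

    HandlerTable() {
        for (std::size_t i = 0; i < MAX_HANDLES; ++i) slots[i] = 0;
    }

    // Associate a handler with a handle.
    void set(Handle h, Handler f) {
        assert(h >= 0 && static_cast<std::size_t>(h) < MAX_HANDLES);
        slots[h] = f;
    }

    // Call the handler for a handle; returns false if none is registered.
    bool call(Handle h, Process& p) const {
        if (h < 0 || static_cast<std::size_t>(h) >= MAX_HANDLES) return false;
        if (slots[h] == 0) return false;
        slots[h](p);
        return true;
    }
};

// One process owns the three tables described above.
struct Process {
    HandlerTable slaveModes;
    HandlerTable masterModes;
    HandlerTable messageTags;
    int counter;
    Process() : counter(0) {}
};

// Example handles and handlers that an imaginary plugin could provide.
enum { SLAVE_MODE_DO_NOTHING = 0, SLAVE_MODE_COUNT = 1 };
void doNothingHandler(Process&) {}
void countHandler(Process& p) { ++p.counter; }

// The plugin tells the platform which handler goes with which handle.
void registerExamplePlugin(Process& p) {
    p.slaveModes.set(SLAVE_MODE_DO_NOTHING, doNothingHandler);
    p.slaveModes.set(SLAVE_MODE_COUNT, countHandler);
}
```

Calling p.slaveModes.call(SLAVE_MODE_COUNT, p) after registerExamplePlugin(p)
runs countHandler, while a handle with no registered handler simply returns
false.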
These 3 tables (slave modes, master modes, message tags) are pretty much what
Linux and Minix use for delegating system calls and for managing interrupts.
But RayPlatform is not event-driven (these things are event-driven: node.js,
hardware interrupts, your favorite word processing software) because MPI is
not event-driven.
The ki_functions table in Kiki [1] is mostly equivalent to the slave modes of
Ray, although Kiki has different groups of Kiki processes whereas Ray only has
MPI_COMM_WORLD.
The slave mode of any Ray process changes depending on the current step.
For instance, the slave mode of a process can change upon reception of
a message with a particular message tag.
Ray processes are not synchronized either: one Ray process can have the slave
mode RAY_SLAVE_MODE_SEQUENCE_BIOLOGICAL_ABUNDANCES while another Ray process
has the slave mode RAY_SLAVE_MODE_DO_NOTHING.
And even if the slave mode of a Ray process is RAY_SLAVE_MODE_DO_NOTHING, this
process will still respond to incoming messages and send replies to them.
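
The interplay between message tags and slave modes can be sketched as follows.
This is a toy model under stated assumptions: the inbox/outbox queues stand in
for MPI, the tick() function and its ordering (message first, then slave mode)
are my simplification, and all names (MODE_*, TAG_*, Process, tick) are
hypothetical rather than taken from Ray or RayPlatform.

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Hypothetical modes and tags, for illustration only.
enum SlaveMode { MODE_DO_NOTHING, MODE_WORK };
enum Tag { TAG_START_WORK, TAG_PING, TAG_PING_REPLY };

struct Message {
    Tag tag;
    int source;
    int destination;
};

struct Process {
    int rank;
    SlaveMode slaveMode;
    std::queue<Message> inbox;    // stands in for received MPI messages
    std::vector<Message> outbox;  // stands in for messages to send
    int workDone;
    explicit Process(int r)
        : rank(r), slaveMode(MODE_DO_NOTHING), workDone(0) {}
};

// Message tag handling: runs whatever the current slave mode is.
void handleMessage(Process& p, const Message& m) {
    switch (m.tag) {
    case TAG_START_WORK:
        p.slaveMode = MODE_WORK;  // a received tag changes the slave mode
        break;
    case TAG_PING: {
        Message reply = {TAG_PING_REPLY, p.rank, m.source};
        p.outbox.push_back(reply);  // reply even while doing nothing
        break;
    }
    case TAG_PING_REPLY:
        break;
    }
}

// Slave mode handling: only MODE_WORK does anything here.
void runSlaveMode(Process& p) {
    if (p.slaveMode == MODE_WORK) ++p.workDone;
}

// One iteration: handle at most one message, then run the slave mode.
void tick(Process& p) {
    if (!p.inbox.empty()) {
        Message m = p.inbox.front();
        p.inbox.pop();
        handleMessage(p, m);
    }
    runSlaveMode(p);
}
```

A process in MODE_DO_NOTHING that receives TAG_PING still queues a
TAG_PING_REPLY, and it only starts accumulating work after TAG_START_WORK
arrives.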
The master mode of everyone except the Ray process with MPI rank 0 is
RAY_MASTER_MODE_DO_NOTHING.
The master mode of rank 0 changes often, but it does not account for much work
because the master mode is only used to manage the macro scheduling of all the
processes (start a step, count how many Ray processes reached a given
checkpoint, and so on).
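
As a rough illustration of that macro scheduling, here is a sketch of rank 0
counting how many processes reached a checkpoint before starting the next
step. The Master struct and its member names are hypothetical, and the real
master modes in Ray are certainly more involved than this.

```cpp
#include <cassert>

// Hypothetical bookkeeping kept by rank 0 only; not actual Ray code.
struct Master {
    int numberOfRanks;      // total number of Ray processes
    int ranksAtCheckpoint;  // how many reported the current checkpoint
    int currentStep;        // index of the step being run

    explicit Master(int n)
        : numberOfRanks(n), ranksAtCheckpoint(0), currentStep(0) {}

    // Called once per "checkpoint reached" message received by rank 0.
    // Returns true when every rank has reported and the next step starts.
    bool onCheckpointReached() {
        ++ranksAtCheckpoint;
        if (ranksAtCheckpoint < numberOfRanks) return false;
        ranksAtCheckpoint = 0;  // reset the count for the next checkpoint
        ++currentStep;          // move on to the next step
        return true;
    }
};
```

With 3 ranks, the first two calls to onCheckpointReached() return false and
the third one advances currentStep.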
To achieve scalability, Ray/RayPlatform uses only point-to-point
communication and no collectives (except for a few barriers). So profiling
these point-to-point calls will be useful.
And as I said, Ray (via RayPlatform) generates profiling information by
default in RayOutputDirectory/Scheduling/*
With the Cray Gemini Hardware Counters [2]:
If you profile only one Ray process, you will get communication imbalance as
the profiled Ray process will run slower than all the others.
Sébastien
[1] https://github.com/GeneAssembly/kiki
[2] http://docs.cray.com/books/S-0025-10/