Re: [OMPI devel] Affect of compression on modex and launch messages

2008-04-07 Thread Gleb Natapov
On Fri, Apr 04, 2008 at 10:52:38AM -0600, Ralph H Castain wrote:
> With compression "on", you will get output telling you the original size of
> the message and its compressed size so you can see what was done.
> 
I see this output:
uncompressed allgather msg orig size 67521 compressed size 4162.

What is "allgather msg"?

--
Gleb.


Re: [OMPI devel] Affect of compression on modex and launch messages

2008-04-07 Thread Ralph H Castain



On 4/7/08 7:04 AM, "Gleb Natapov"  wrote:

> On Fri, Apr 04, 2008 at 10:52:38AM -0600, Ralph H Castain wrote:
>> With compression "on", you will get output telling you the original size of
>> the message and its compressed size so you can see what was done.
>> 
> I see this output:
> uncompressed allgather msg orig size 67521 compressed size 4162.
> 
> What is "allgather msg"

It is the modex message - it is "shared" across all the procs via an
allgather procedure

> 
> --
> Gleb.




Re: [OMPI devel] Affect of compression on modex and launch messages

2008-04-07 Thread Gleb Natapov
On Mon, Apr 07, 2008 at 07:07:38AM -0600, Ralph H Castain wrote:
> 
> 
> 
> On 4/7/08 7:04 AM, "Gleb Natapov"  wrote:
> 
> > On Fri, Apr 04, 2008 at 10:52:38AM -0600, Ralph H Castain wrote:
> >> With compression "on", you will get output telling you the original size of
> >> the message and its compressed size so you can see what was done.
> >> 
> > I see this output:
> > uncompressed allgather msg orig size 67521 compressed size 4162.
> > 
> > What is "allgather msg"
> 
> It is the modex message - it is "shared" across all the procs via an
> allgather procedure
> 
If I divide the allgather msg size by the number of processes, I should get
the modex size of one process. Is this correct? Also, can you explain how
allgather is implemented in orte (sorry if you already explained this once
and I missed it)?

--
Gleb.


Re: [OMPI devel] Affect of compression on modex and launch messages

2008-04-07 Thread Ralph H Castain



On 4/7/08 7:15 AM, "Gleb Natapov"  wrote:

> On Mon, Apr 07, 2008 at 07:07:38AM -0600, Ralph H Castain wrote:
>> 
>> 
>> 
>> On 4/7/08 7:04 AM, "Gleb Natapov"  wrote:
>> 
>>> On Fri, Apr 04, 2008 at 10:52:38AM -0600, Ralph H Castain wrote:
>>>> With compression "on", you will get output telling you the original size of
>>>> the message and its compressed size so you can see what was done.
>>>> 
>>> I see this output:
>>> uncompressed allgather msg orig size 67521 compressed size 4162.
>>> 
>>> What is "allgather msg"
>> 
>> It is the modex message - it is "shared" across all the procs via an
>> allgather procedure
>> 
> If I'll divide allgather msg size by number of processes I should get a
> modex size of one process. Is this correct?

Pretty much - there is some slight overhead added so orte knows what to do
with the message, but that is only a few bytes.
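
As a worked example of that division (a minimal sketch; the process count of 64 and the function name are hypothetical, only the 67521-byte total comes from the output quoted above):

```python
def per_proc_modex_size(total_bytes, nprocs, overhead_bytes=0):
    """Rough per-process modex size from the total allgather message size.

    overhead_bytes stands in for the few bytes of routing overhead that
    orte adds; its exact value is not given in this thread, so it
    defaults to 0 here.
    """
    return (total_bytes - overhead_bytes) / nprocs

# 67521 is the uncompressed size reported above; 64 procs is an assumption.
estimate = per_proc_modex_size(67521, 64)
print(f"about {estimate:.0f} bytes of modex data per proc")
```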

> Also can you explain how
> allgather is implemented in orte (sorry if you already explained this once
> and I missed it).

The default method is for each proc to send its modex data to its local
daemon. The local daemon collects the messages until all of its local procs
have contributed, then sends the collected data to the rank=0 application
proc. Once rank=0 has received a message from every daemon, it xcasts the
collected result to all procs in its job.
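
The three steps above can be sketched as a small simulation. This is Python for illustration only, not the actual ORTE C code; the function name, the dictionaries, and the node names are all hypothetical, and no real messaging is done.

```python
from collections import defaultdict

def modex_allgather(proc_data, proc_to_node):
    """Simulate the default gather-to-rank-0 flow described above.

    proc_data:    {rank: bytes}  modex contribution of each proc
    proc_to_node: {rank: str}    node whose daemon is local to that proc
    Returns {rank: bytes}: what each proc holds after the xcast.
    """
    # Step 1: each proc sends its modex data to its local daemon.
    daemon_buckets = defaultdict(dict)
    for rank, data in proc_data.items():
        daemon_buckets[proc_to_node[rank]][rank] = data

    # Step 2: once all of its local procs have contributed, each daemon
    # forwards its collected data to the rank=0 application proc.
    rank0_collected = {}
    for bucket in daemon_buckets.values():
        rank0_collected.update(bucket)

    # Step 3: once rank=0 has heard from every daemon, it xcasts the
    # combined result to all procs in the job.
    combined = b"".join(rank0_collected[r] for r in sorted(rank0_collected))
    return {rank: combined for rank in proc_data}

procs = {0: b"aaa", 1: b"bb", 2: b"c", 3: b"dddd"}
nodes = {0: "n0", 1: "n0", 2: "n1", 3: "n1"}
result = modex_allgather(procs, nodes)
```

In this sketch every proc ends up with the same combined buffer, and rank=0 receives one message per daemon rather than one per proc.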

I am currently working on a more scalable version of this that has the
daemons do a tree-like gather instead of just sending everything to rank=0.
Probably about a week from completion.


> 
> --
> Gleb.




Re: [OMPI devel] Affect of compression on modex and launch messages

2008-04-07 Thread Gleb Natapov
On Mon, Apr 07, 2008 at 07:28:07AM -0600, Ralph H Castain wrote:
> > Also can you explain how
> > allgather is implemented in orte (sorry if you already explained this once
> > and I missed it).
> 
> The default method is for each proc to send its modex data to its local
> daemon. The local daemon collects the messages until all of its local procs
> have contributed, then sends the collected data to the rank=0 application
> proc. One rank=0 has received a message from every daemon, it xcasts the
> collected result to all procs in its job.
>
Is only the collected result compressed, or are the messages from each proc
to the local daemon and from the local daemon to rank=0 compressed too?
And, maybe a stupid question, but I have to ask :) When rank=0 xcasts the
collected modex, does it compress it once or for each rank separately?
Also, I think that if rank=0 compresses each modex message as it is
received, it can save some work during the xcast.

--
Gleb.


Re: [OMPI devel] Affect of compression on modex and launch messages

2008-04-07 Thread Ralph H Castain



On 4/7/08 7:45 AM, "Gleb Natapov"  wrote:

> On Mon, Apr 07, 2008 at 07:28:07AM -0600, Ralph H Castain wrote:
>>> Also can you explain how
>>> allgather is implemented in orte (sorry if you already explained this once
>>> and I missed it).
>> 
>> The default method is for each proc to send its modex data to its local
>> daemon. The local daemon collects the messages until all of its local procs
>> have contributed, then sends the collected data to the rank=0 application
>> proc. One rank=0 has received a message from every daemon, it xcasts the
>> collected result to all procs in its job.
>> 
> Only collected result is compressed or messages from each proc to local
> daemon and messages from local daemon to rank=0 are compressed too?

The individual inbound messages are not currently compressed prior to
sending - too small to bother

> And, may be a stupid question, but I have to ask :) When rank=0 xcast
> collected modex it compress it once or for each rank separately.

It compresses only once, on the total message.

So there is only one compress being done - the total modex message is
collected "raw" and then compressed just prior to xcast. Each proc then
uncompresses the result it receives from rank=0 before processing it.


> Also I think if rank=0 will compress each modex message during
> receive it can save some work during xcast.

Seems to me like one compress of the entire message has to be a great deal
faster than N compressions of N small messages...
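
A rough way to compare the two approaches, sketched in Python with zlib. The thread does not say which compression library orte uses, so zlib is an assumption here, and the modex entries are made-up strings. The sketch looks at output size and the number of compress calls only; it does not measure speed.

```python
import zlib

# Hypothetical per-proc modex entries: small and similar to one another.
entries = [
    f"proc={r};btl=openib;lid={1000 + r};qp={r * 7}".encode()
    for r in range(256)
]

# Approach described above: collect everything "raw", compress once.
total = b"".join(entries)
once = zlib.compress(total)

# Alternative: compress each small message on its own (N compress calls).
per_entry = [zlib.compress(e) for e in entries]
separate_size = sum(len(c) for c in per_entry)

# Each receiving proc would uncompress the single buffer before use.
assert zlib.decompress(once) == total

print(f"raw {len(total)} B, one compress {len(once)} B, "
      f"{len(entries)} separate compresses {separate_size} B")
```

With inputs this small, each separate compress pays its own header and cannot reuse patterns seen in the other messages, so the single compress of the total message comes out smaller as well.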

> 
> --
> Gleb.




Re: [OMPI devel] Affect of compression on modex and launch messages

2008-04-07 Thread Gleb Natapov
On Mon, Apr 07, 2008 at 07:54:38AM -0600, Ralph H Castain wrote:
> 
> 
> 
> On 4/7/08 7:45 AM, "Gleb Natapov"  wrote:
> 
> > On Mon, Apr 07, 2008 at 07:28:07AM -0600, Ralph H Castain wrote:
> >>> Also can you explain how
> >>> allgather is implemented in orte (sorry if you already explained this once
> >>> and I missed it).
> >> 
> >> The default method is for each proc to send its modex data to its local
> >> daemon. The local daemon collects the messages until all of its local procs
> >> have contributed, then sends the collected data to the rank=0 application
> >> proc. One rank=0 has received a message from every daemon, it xcasts the
> >> collected result to all procs in its job.
> >> 
> > Only collected result is compressed or messages from each proc to local
> > daemon and messages from local daemon to rank=0 are compressed too?
> 
> The individual inbound messages are not currently compressed prior to
> sending - too small to bother
Makes sense.

> > Also I think if rank=0 will compress each modex message during
> > receive it can save some work during xcast.
> 
> Seems to me like one compress of the entire message has to be a great deal
> faster than N compressions of N small messages...
The idea is that the modex receive and the compression will overlap.
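
One way to get that overlap while still producing a single compressed message is a streaming compressor that is fed each chunk as it arrives. A minimal sketch in Python, assuming zlib (the thread does not name the library orte uses) and with a hypothetical function name and made-up chunks:

```python
import zlib

def compress_as_received(chunks):
    """Feed each incoming chunk to one streaming compressor.

    The work is spread across the receives, but the output is still a
    single zlib stream, so receivers uncompress it exactly once.
    """
    comp = zlib.compressobj()
    out = []
    for chunk in chunks:          # in real code: one iteration per receive
        out.append(comp.compress(chunk))
    out.append(comp.flush())      # emit whatever is still buffered
    return b"".join(out)

incoming = [f"proc={r};lid={1000 + r}".encode() for r in range(128)]
stream = compress_as_received(incoming)
```

The receiver side is unchanged: `zlib.decompress(stream)` yields the concatenation of all chunks.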

--
Gleb.


[OMPI devel] Memchecker errors on trunk

2008-04-07 Thread Ralph H Castain
Hello

We have a problem this morning on the trunk - recent commits r18084-7
involving the ompi/include/ompi/memchecker.h file contain arithmetic
involving a void* pointer and other problems:

../../../../ompi/include/ompi/memchecker.h: In function
'memchecker_convertor_call':
../../../../ompi/include/ompi/memchecker.h:46: warning: ISO C90 forbids
mixed declarations and code
../../../../ompi/include/ompi/memchecker.h:47: warning: ISO C90 forbids
mixed declarations and code
../../../../ompi/include/ompi/memchecker.h:56: warning: pointer of type
'void *' used in arithmetic
../../../../ompi/include/ompi/memchecker.h:45: warning: unused variable
'pStack'

Shiqing: could you please fix this?

Thanks
Ralph




Re: [OMPI devel] Memchecker errors on trunk

2008-04-07 Thread George Bosilca
That's gcc being really mean !!! There was a double ; at the end of  
the line, and apparently the second one is interpreted as code ...  
Commit r18090 should fix the problem.


  george.

On Apr 7, 2008, at 10:27 AM, Ralph H Castain wrote:

> Hello
>
> We have a problem this morning on the trunk - recent commits r18084-7
> involving the ompi/include/ompi/memchecker.h file contain arithmetic
> involving a void* pointer and other problems:
>
> ../../../../ompi/include/ompi/memchecker.h: In function
> 'memchecker_convertor_call':
> ../../../../ompi/include/ompi/memchecker.h:46: warning: ISO C90 forbids
> mixed declarations and code
> ../../../../ompi/include/ompi/memchecker.h:47: warning: ISO C90 forbids
> mixed declarations and code
> ../../../../ompi/include/ompi/memchecker.h:56: warning: pointer of type
> 'void *' used in arithmetic
> ../../../../ompi/include/ompi/memchecker.h:45: warning: unused variable
> 'pStack'
>
> Shiqing: could you please fix this?
>
> Thanks
> Ralph


___
devel mailing list
de...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/devel






Re: [OMPI devel] Memchecker errors on trunk

2008-04-07 Thread Ralph H Castain
Thanks George!


On 4/7/08 8:48 AM, "George Bosilca"  wrote:

> That's gcc being really mean !!! There was a double ; at the end of
> the line, and apparently the second one is interpreted as code ...
> Commit r18090 should fix the problem.
> 
>george.
> 
> On Apr 7, 2008, at 10:27 AM, Ralph H Castain wrote:
>> Hello
>> 
>> We have a problem this morning on the trunk - recent commits r18084-7
>> involving the ompi/include/ompi/memchecker.h file contain arithmetic
>> involving a void* pointer and other problems:
>> 
>> ../../../../ompi/include/ompi/memchecker.h: In function
>> 'memchecker_convertor_call':
>> ../../../../ompi/include/ompi/memchecker.h:46: warning: ISO C90 forbids
>> mixed declarations and code
>> ../../../../ompi/include/ompi/memchecker.h:47: warning: ISO C90 forbids
>> mixed declarations and code
>> ../../../../ompi/include/ompi/memchecker.h:56: warning: pointer of type
>> 'void *' used in arithmetic
>> ../../../../ompi/include/ompi/memchecker.h:45: warning: unused variable
>> 'pStack'
>> 
>> Shiqing: could you please fix this?
>> 
>> Thanks
>> Ralph




Re: [OMPI devel] Memchecker errors on trunk

2008-04-07 Thread Shiqing Fan

Thanks a lot, George.

I didn't get this message this afternoon, sorry.


Shiqing

On Mon, 7 Apr 2008 10:48:03 -0400
 George Bosilca  wrote:
> That's gcc being really mean !!! There was a double ; at the end of
> the line, and apparently the second one is interpreted as code ...
> Commit r18090 should fix the problem.
>
>   george.
>
> On Apr 7, 2008, at 10:27 AM, Ralph H Castain wrote:
>> Hello
>>
>> We have a problem this morning on the trunk - recent commits r18084-7
>> involving the ompi/include/ompi/memchecker.h file contain arithmetic
>> involving a void* pointer and other problems:
>>
>> ../../../../ompi/include/ompi/memchecker.h: In function
>> 'memchecker_convertor_call':
>> ../../../../ompi/include/ompi/memchecker.h:46: warning: ISO C90 forbids
>> mixed declarations and code
>> ../../../../ompi/include/ompi/memchecker.h:47: warning: ISO C90 forbids
>> mixed declarations and code
>> ../../../../ompi/include/ompi/memchecker.h:56: warning: pointer of type
>> 'void *' used in arithmetic
>> ../../../../ompi/include/ompi/memchecker.h:45: warning: unused variable
>> 'pStack'
>>
>> Shiqing: could you please fix this?
>>
>> Thanks
>> Ralph