> There are cases where the
> presence or absence of code that isn't executed can influence timings
> (perhaps because code will come out of the instruction cache differently),
> but all that is speculation. It's all a guess that what you're really
> seeing isn't really MPI related.
Hi, Eugene,
You said:
" The bottom line here is that from a causal point of view it would seem
that B should not impact the timings. Presumably, some other variable is
actually responsible here."
Could you explain the second sentence in more detail? Thanks a lot.
Linbao
On Thu, Oct 21,
fect on time spent between t1 and t2. But
> extraneous effects might cause it to do so -- e.g., are you running in an
> oversubscribed scenario? And so on.
>
No. We have 1024 nodes available and I'm using 500.
>
> On Oct 21, 2010, at 9:24 AM, Storm Zhang wrote:
>
> >
MPI_Barrier.
Linbao
On Thu, Oct 21, 2010 at 5:17 AM, Jeff Squyres wrote:
> On Oct 20, 2010, at 5:51 PM, Storm Zhang wrote:
>
> > I need to measure t2-t1 to see the time spent on code A between these
> > two MPI_Barriers. I notice that if I comment out code B, the time seems
> > much less
> use MPI_Wtime(), not
> clock()
>
> regards
> jody
>
> On Wed, Oct 20, 2010 at 11:51 PM, Storm Zhang wrote:
> > Dear all,
> >
> > I got confused with my recent C++ MPI program's behavior. I have an MPI
> > program in which I use clock() to measure the ti
Dear all,
I am confused by my recent C++ MPI program's behavior. I have an MPI
program in which I use clock() to measure the time spent between two
MPI_Barriers, like this:
MPI::COMM_WORLD.Barrier();
if (rank == master) t1 = clock();
// code A
MPI::COMM_WORLD.Barrier();
if (rank == master) t2 = clock();
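As jody suggests above, MPI_Wtime() is the better tool here: clock() measures
CPU time consumed by the process, not elapsed wall-clock time. A minimal
sketch of the same measurement using MPI_Wtime() (rank, master, and the
code A placeholder are taken from the post):

double t1 = 0.0, t2 = 0.0;
MPI::COMM_WORLD.Barrier();                // all ranks enter together
if (rank == master) t1 = MPI::Wtime();    // wall-clock seconds
// code A
MPI::COMM_WORLD.Barrier();                // wait until every rank finishes code A
if (rank == master) t2 = MPI::Wtime();    // t2 - t1 = elapsed time for code A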
I could not find the
bind-to-core info. I only see the bynode and byslot options. Are they the same
as bind-to-core? My mpirun shows version 1.3.3 but ompi_info shows 1.4.2.
Thanks a lot.
Linbao
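A side note on that version mismatch: mpirun reporting 1.3.3 while ompi_info
reports 1.4.2 usually means two different Open MPI installations are being
picked up through the PATH, and an older mpirun may not recognize
-bind-to-core at all. A quick way to check which binaries are being used
(standard shell commands; install paths will vary by system):

which mpirun
which ompi_info
mpirun --version

Also note that bynode and byslot control how ranks are mapped across nodes,
whereas bind-to-core pins each process to a core, so they are not the same thing.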
On Mon, Oct 4, 2010 at 9:18 PM, Eugene Loh wrote:
> Storm Zhang wrote:
>
>
>> Here is what I meant: the
Should I run it
like:
mpirun --mca btl_tcp_if_include eth0 -np 600 -bind-to-core scatttest
Thank you very much.
Linbao
On Mon, Oct 4, 2010 at 4:42 PM, Ralph Castain wrote:
>
> On Oct 4, 2010, at 1:48 PM, Storm Zhang wrote:
>
> Thanks a lot, Ralph. As I said, I also tried to use SGE (also showi
> I am not surprised that you see a performance hit
> when requesting > 512 compute units. We should really get input from a
> hyperthreading expert, preferably from Intel.
>
> Doug Reeder
> On Oct 4, 2010, at 9:53 AM, Storm Zhang wrote:
>
> We have 64 compute nodes, each with dual quad-core, hyperthreaded CPUs.
> So
We have 64 compute nodes, each with dual quad-core, hyperthreaded CPUs. So
we have 1024 compute units shown in the ROCKS 5.3 system. I'm trying to
scatter an array from the master node to the compute nodes, using C++
compiled with mpiCC and launched with mpirun.
Here is my test:
The array size is 18KB * Number of com
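A minimal, self-contained sketch of the scatter test described above (a hedged
reconstruction: the 18KB figure is from the post, while the char buffers, root
rank 0, and the timing around the call are assumptions):

#include <mpi.h>
#include <cstdio>
#include <vector>

int main(int argc, char** argv) {
    MPI::Init(argc, argv);
    int rank = MPI::COMM_WORLD.Get_rank();
    int size = MPI::COMM_WORLD.Get_size();
    const int chunk = 18 * 1024;               // 18KB per receiving rank (assumed)
    std::vector<char> sendbuf;
    if (rank == 0)
        sendbuf.resize((size_t)chunk * size);  // master holds the whole array
    std::vector<char> recvbuf(chunk);
    MPI::COMM_WORLD.Barrier();
    double t1 = MPI::Wtime();
    MPI::COMM_WORLD.Scatter(rank == 0 ? &sendbuf[0] : 0, chunk, MPI::CHAR,
                            &recvbuf[0], chunk, MPI::CHAR, 0);
    double t2 = MPI::Wtime();
    if (rank == 0)
        std::printf("scatter of %d bytes per rank took %f s\n", chunk, t2 - t1);
    MPI::Finalize();
    return 0;
}

Compiled with mpiCC and launched with an mpirun line like the one quoted
earlier in the thread.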