Hello,

I don't have a common file system for all cluster nodes.

I've tried to run the application again with VT_UNIFY=no and to call vtunify manually. It works well. I managed to get the .otf file.

Thank you.

Thomas Ropars


Andreas Knüpfer wrote:
Hello Thomas,

sorry for the delay. My first assumption about the cause of your problem is the so-called "unify" process. This is a post-processing step that is performed automatically after the trace run. This step needs read access to all trace files, though. So, do you have a common file system for all cluster nodes?

If yes, set the env variable VT_PFORM_GDIR to point there. The traces will then be copied there from the location given by VT_PFORM_LDIR, which can still be a node-local directory. Everything else will be handled automatically.
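
As a rough illustration of the shared-file-system case, the two variables could be set as sketched here. The directory /shared/traces is a hypothetical path, and /tmp is assumed only because it appears as the local trace location in the log further down; the commented mpirun line assumes Open MPI's -x option for forwarding environment variables and an application binary called ./ring, neither of which is confirmed in this thread.

```shell
# Assumed paths; replace them with directories that exist on your cluster.
export VT_PFORM_GDIR=/shared/traces   # hypothetical directory visible to all nodes
export VT_PFORM_LDIR=/tmp             # node-local directory for intermediate trace files

# With Open MPI, the variables can be forwarded to remote ranks, for example:
#   mpirun -np 4 -x VT_PFORM_GDIR -x VT_PFORM_LDIR ./ring
```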

If not, please set VT_UNIFY=no in order to disable automatic unification. Then you need to call vtunify manually. Please copy all files from the run directory that start with your OTF file prefix to a common directory and call

%> vtunify <number of processes> <file prefix>

there. This should give you the <prefix>.otf file.
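
The manual procedure above might be scripted roughly as follows. This is only a sketch: the function name gather_and_unify and the COPY_CMD and VTUNIFY override variables are made up for illustration and are not part of VampirTrace, and the use of scp with host:/path sources is an assumption about how the nodes can be reached.

```shell
# Hypothetical helper: collect the per-node trace files, then unify them.
# Arguments: <number of processes> <file prefix> <common dir> <source>...
# Each source is "host:/path" when copying with scp, or a plain directory
# when COPY_CMD is set to cp. COPY_CMD and VTUNIFY are illustrative hooks only.
gather_and_unify() {
    nprocs=$1
    prefix=$2
    dest=$3
    shift 3

    mkdir -p "$dest" || return 1

    for src in "$@"; do
        # Copy every file whose name starts with the OTF file prefix.
        ${COPY_CMD:-scp} "$src/$prefix"* "$dest"/ || return 1
    done

    # Run the unification inside the common directory.
    ( cd "$dest" && ${VTUNIFY:-vtunify} "$nprocs" "$prefix" )
}

# Example call (host names and paths are placeholders):
#   gather_and_unify 4 ring-vt ./traces node1:/home/user/run node2:/home/user/run
```

If the call succeeds, the common directory should then contain the <prefix>.otf file.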

Please give this a try. If it is not working, please give me an 'ls -alh' from your trace directory/directories.

Best regards, Andreas


P.S.: Please have my email on CC, I'm not on the us...@open-mpi.org list.



From: Thomas Ropars <trop...@irisa.fr>
Date: August 11, 2008 3:47:54 PM IST
To: us...@open-mpi.org
Subject: [OMPI users] Problem using VampirTrace
Reply-To: Open MPI Users <us...@open-mpi.org>

Hi all,

I'm trying to use VampirTrace.
I'm working with r19234 of svn trunk.

When I try to run a simple application with 4 processes on the same
computer, it works well.
But if I try to run the same application with the 4 processes executed
on 4 different computers, I never get the .otf file.

I've tried to run with VT_VERBOSE=yes, and I get the following trace:

VampirTrace: Thread object #0 created, total number is 1
VampirTrace: Opened OTF writer stream [namestub /tmp/ring-vt.fffffffffe8349ca.3294 id 1] for generation [buffer 32000000 bytes]
VampirTrace: Thread object #0 created, total number is 1
VampirTrace: Opened OTF writer stream [namestub /tmp/ring-vt.fffffffffe834bca.3020 id 1] for generation [buffer 32000000 bytes]
VampirTrace: Thread object #0 created, total number is 1
VampirTrace: Opened OTF writer stream [namestub /tmp/ring-vt.fffffffffe834aca.3040 id 1] for generation [buffer 32000000 bytes]
VampirTrace: Thread object #0 created, total number is 1
VampirTrace: Opened OTF writer stream [namestub /tmp/ring-vt.fffffffffe834fca.3011 id 1] for generation [buffer 32000000 bytes]
Ring : Start
Ring : End
[1]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-vt.fffffffffe834aca.3040 id 1]
[2]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-vt.fffffffffe834bca.3020 id 1]
[1]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-vt.fffffffffe834aca.3040 id 1]
[3]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-vt.fffffffffe834fca.3011 id 1]
[2]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-vt.fffffffffe834bca.3020 id 1]
[0]VampirTrace: Flushed OTF writer stream [namestub /tmp/ring-vt.fffffffffe8349ca.3294 id 1]
[1]VampirTrace: Wrote unify control file ./ring-vt.2.uctl
[2]VampirTrace: Wrote unify control file ./ring-vt.3.uctl
[3]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-vt.fffffffffe834fca.3011 id 1]
[0]VampirTrace: Closed OTF writer stream [namestub /tmp/ring-vt.fffffffffe8349ca.3294 id 1]
[0]VampirTrace: Wrote unify control file ./ring-vt.1.uctl
[0]VampirTrace: Checking for ./ring-vt.1.uctl ...
[0]VampirTrace: Checking for ./ring-vt.2.uctl ...
[1]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834aca.3040.1.def
[2]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834bca.3020.1.def
[3]VampirTrace: Wrote unify control file ./ring-vt.4.uctl
[1]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834aca.3040.1.events
[2]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834bca.3020.1.events
[3]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834fca.3011.1.def
[1]VampirTrace: Thread object #0 deleted, leaving 0
[2]VampirTrace: Thread object #0 deleted, leaving 0
[3]VampirTrace: Removed trace file /tmp/ring-vt.fffffffffe834fca.3011.1.events
[3]VampirTrace: Thread object #0 deleted, leaving 0


Regards

Thomas
_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users



