In general, have you had a look at the giraph.metrics.enable option? When set to true, it prints metrics after each superstep to each worker's stdout log.
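If you still want to call your own script, here is a minimal Java sketch of the "once per worker, not once per thread" part. It is not Giraph code: the class name PhaseScriptRunner, the phase labels and the script path are all made up for illustration. The idea is a static guard that lets only the first caller in a JVM launch the script for a given phase, so it does not matter how many compute threads reach the call. I believe Giraph's WorkerObserver interface (registered through giraph.worker.observers) is a reasonable place to call it from, but please check that against the version you are running.

```java
import java.io.IOException;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper: runs an external command at most once per phase per JVM. */
final class PhaseScriptRunner {
    // Phases that have already been claimed in this JVM (thread-safe set).
    private static final Set<String> SEEN = ConcurrentHashMap.newKeySet();

    private PhaseScriptRunner() { }

    /** Returns true only for the first caller of a given phase in this JVM. */
    static boolean claim(String phase) {
        return SEEN.add(phase);
    }

    /**
     * Runs the command if this phase has not been claimed yet.
     * Returns the exit code of the command, or -1 if the call was skipped.
     */
    static int runOnce(String phase, List<String> command)
            throws IOException, InterruptedException {
        if (!claim(phase)) {
            return -1; // another thread already ran the script for this phase
        }
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.inheritIO(); // script output ends up in the worker's own logs
        return builder.start().waitFor();
    }
}
```

A call would then look like PhaseScriptRunner.runOnce("setup", List.of("/path/to/collect_stats.sh", "setup")), where the path is a placeholder for your script. Note that the guard is per JVM, so if you run several workers on one machine the script would still run once per worker on that machine.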
-------- Original message --------
From: Steven Harenberg <sdhar...@ncsu.edu>
Date: 14.01.2015 19:55 (GMT+01:00)
To: user <user@giraph.apache.org>
Subject: Performance metrics / calling external scripts

Hi all,

I am attempting to measure some performance metrics (such as runtime, memory usage, network communication, etc.) using an external bash script that grabs some machine stats. I am having difficulty figuring out where to call this script from within Giraph. In particular, I would like to call it at several key points in Giraph's execution, such as input/setup, beginning of computation, and output.

The issue is that I can't work out where to place the external calls, because I can't tell where these "phases" actually happen in Giraph's source. I also have the added difficulty that I only want the script to be called once per machine/worker, not once per thread. This means it should not be inside the vertex computation code, for example.

Summary: my goal is to call an external script once per machine at the beginning of setup, computation (at/before superstep 0), and output. Is this possible? If so, could anyone please point me to where these phases happen in the code, so that I can place such an external call? I am guessing this would be the MasterThread file, as this is where all the GiraphTimers are.

Any general advice would be appreciated.

Thanks and regards,
Steve