On Tue, Oct/06/2009 10:23:48AM, Ashley Pittman wrote:
>
> Further to the mail linked below, padb is able to perform diagnostics,
> including backtraces on hung jobs and integrates well into automated
> testing environments.

Can padb get a backtrace from a non-debuggable MPI (e.g., not
compiled with -g)?

-Ethan

>
> The attached patch is a minimal change which should enable the
> functionality. I don't, however, have access to a working MTT
> installation to test this.
>
> http://www.open-mpi.org/community/lists/mtt-devel/2009/06/0415.php
>
> This will require a HEAD version of padb, at least r273, to allow it to
> accept the pid of mpirun rather than a jobid assigned by the underlying
> resource manager.
>
> Yours,
>
> Ashley,
>
> --
>
> Ashley Pittman, Bath, UK.
>
> Padb - A parallel job inspection tool for cluster computing
> http://padb.pittman.org.uk

> Index: lib/MTT/DoCommand.pm
> ===================================================================
> --- lib/MTT/DoCommand.pm    (revision 1322)
> +++ lib/MTT/DoCommand.pm    (working copy)
> @@ -359,6 +359,7 @@
>      }
>      my $killed_status = undef;
>      my $last_over = 0;
> +    my $padb_output;
>      while ($done > 0) {
>          my $nfound = select($rout = $rin, undef, undef, $t);
>          if (vec($rout, fileno(OUTread), 1) == 1) {
> @@ -410,6 +411,8 @@
>                  my $timeout_email_recipient = $MTT::Globals::Values->{docommand_timeout_notify_email};
>                  my $timeout_notify_timeout = $MTT::Globals::Values->{docommand_timeout_notify_timeout};
>
> +                $padb_output = `padb --config-option rmgr=mpirun --full-report=$pid`;
> +
>                  if (defined($timeout_sentinel_file)) {
>
>                      # Email someone, if an email address has been specified
> @@ -493,6 +496,9 @@
>      # Return an anonymous hash containing the relevant data
>
>      $ret->{result_stdout} = join('', @out);
> +    if ( defined $padb_output ) {
> +      $ret->{result_stdout} .= "\n$padb_output";
> +    }
>      $ret->{result_stderr} = join('', @err),
>          if (!$merge_output);
>      return $ret;
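
The patch captures the padb report with backticks, which goes through a shell and returns whatever was printed even when padb is missing or exits with an error. As a rough sketch only, and untested against MTT, the capture step could be wrapped in a small helper that skips the shell and returns undef on failure, so that the existing "defined $padb_output" check in the patch decides whether anything gets appended. The helper name capture_diagnostic is hypothetical and is not part of MTT or padb; the padb arguments in the usage comment are the ones from the patch, and $pid is assumed to hold the pid of mpirun as it does there.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of MTT or padb): run a command without
# a shell and return everything it wrote to stdout, or undef if the
# command could not be started or exited with a non-zero status.
sub capture_diagnostic {
    my (@cmd) = @_;

    # List-form pipe open: the arguments are passed to the program
    # directly, so nothing in them is interpreted by a shell.
    open(my $fh, '-|', @cmd) or return;

    # Slurp the whole output in one read.
    local $/;
    my $output = <$fh>;

    # close() on a pipe returns false when the child exited non-zero.
    close($fh) or return;

    return $output;
}

# Usage, with the arguments taken from the patch ($pid is assumed to be
# the pid of mpirun, as in DoCommand.pm):
#
#   my $padb_output = capture_diagnostic(
#       'padb', '--config-option', 'rmgr=mpirun', "--full-report=$pid");
#   $ret->{result_stdout} .= "\n$padb_output" if defined $padb_output;
```

The trade-off is that output from a padb run that fails part-way is discarded rather than appended. If a partial report is still useful for a hung job, the exit-status check on close() could be dropped.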