Julian Seward wrote:
> On Thursday, July 29, 2010, Barry L. Rountree wrote:
>> John Reiser wrote:
>>>>> ... Valgrind could stop
>>>>> simulated CPU and revert back to the real CPU part way through
>>>>> program execution."
>>>>>
>>>>> I'm not seeing this mentioned anywhere else in the documentation.
>>>>> Does this capability still exist?
>>>> No .. that description I think is somewhat out of date. There
>>>> is no provision for switching back to native execution.
>>> In case somebody is really motivated, then example code for such
>>> a feature for x86 can be found by searching for the string 'letgo' in
>>> http://**bitwagon.com/valgrind+uml/valgrind-3.3.0-2007-12-27.patch.gz
>> And reversing that process would put you back into emulation mode?
>>
>> (Yes, that question is eliding a lot of hairy details. But if the
>> program is started under valgrind and valgrind let's it go back into
>> native mode, then switching back to valgrind should /just/ [!] be a
>> question of updating valgrind's registers and hitting the big green
>> "GO" button, right?)
>
> But how would you do that? Once you switch to running on the
> real CPU, you lose all control and you no longer have the
> ability to decide when to switch back to emulation.
>
> Additionally, as John points out, the whole point of running
> on an emulator is to collect side-data about what's going on.
> For some tools (eg profilers) the missing parts of the execution
> is not too bad, you'll just get wrong statistics. But for the
> error checking tools, at least Memcheck and the thread checkers,
> you'll get guaranteed absolute chaos.
>
> So .. what is it you're _really_ trying to do?
>
I'm trying to parallelize valgrind in general, and memcheck in
particular.

The big picture looks like this: you have a buggy serial program that
takes a long time to run. You have a few supercomputers onsite. Run
the serial program simultaneously on N cores, where each core
instruments only 1/Nth of the program. That will get you N nonsensical
error reports which, when stitched together, should reduce to one
sensible error report.

Once that's working, porting it to MPI looks to be pretty
straightforward. If you want to instrument node 0 of an MPI
application, use a PMPI library that fires off N "fake" node 0s and
runs the parallel version of valgrind on those. Any MPI messages sent
to the real node 0 get duplicated to the fake node 0s, and MPI messages
sent from the fake node 0s get dropped on the floor. Things get a
little more complicated with some of the more obscure collective
communication calls, but this is similar enough to work I've done in
the past that I think it's a tractable problem.

Back to the implementation details: I've been working my way through
the source, documentation and publications for the past couple of
weeks, and I've got four different parallelization approaches sketched
out. If John Reiser's patch works with the latest version and I can
use it as a template to go from native to valgrind, then that looks to
be the most straightforward approach. As to how to notify the app that
it's supposed to go back into valgrind, that can be as simple as
switching back in every X instructions and using PAPI to figure out
when that happens. I could also hack something up on the MPI layer,
but I'd like this to be usable for non-MPI applications as well.
Comments are very welcome.

When I proposed this project I thought I could get away with a
combination of turning instrumentation on and off and ignoring address
ranges. That only gets you down to the nullgrind overhead, and for
this to be interesting the slowdown needs to be < 2x for large N.
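
To make the 1/N split a bit more concrete, here is a rough sketch of one
possible schedule. It assumes the run is cut into fixed-length slices
counted in retired instructions (the "every X instructions" idea above)
and that the slices are handed out round-robin to the N copies. Both
choices, and all of the names below, are made up for illustration; this
is not anything that exists in valgrind, and it says nothing about how
Memcheck's state would survive the native stretches.

```python
from typing import List, Tuple


def owner_of(instr_count: int, slice_len: int, n_workers: int) -> int:
    """Return the worker that instruments the slice containing instr_count.

    Assumption: slices are fixed-length and assigned round-robin.
    """
    return (instr_count // slice_len) % n_workers


def instrumented_slices(
    worker: int, n_workers: int, slice_len: int, total_instrs: int
) -> List[Tuple[int, int]]:
    """List the half-open [start, end) instruction ranges a worker instruments.

    Outside these ranges the worker would run natively (hypothetically).
    """
    slices = []
    start = worker * slice_len
    while start < total_instrs:
        slices.append((start, min(start + slice_len, total_instrs)))
        start += n_workers * slice_len  # skip the slices owned by the others
    return slices


if __name__ == "__main__":
    # Illustrative numbers only: 4 copies, 1000-instruction slices.
    for w in range(4):
        print(w, instrumented_slices(w, 4, 1000, 6500))
```

Each copy ends up instrumenting roughly 1/N of the instructions, and
together the copies cover the whole run exactly once.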
I think that's doable, but I need to understand the internals of
coregrind first.

Your thoughts?

Thanks,

Barry Rountree

> J
>
