On Thu, Oct 20, 2016 at 09:30:27AM -0700, Stefan Beller wrote:
> On Thu, Oct 20, 2016 at 5:31 AM, Jeff King <[email protected]> wrote:
>
> >
> > $ perl -lne '/execve\("(.*?)"/ and print $1' /tmp/foo.out | sort | uniq -c | sort -rn | head
> > 152271 /home/peff/compile/git/git
> > 57340 /home/peff/compile/git/t/../bin-wrappers/git
> > 16865 /bin/sed
> > 12650 /bin/rm
> > 11257 /bin/cat
> > 9326 /home/peff/compile/git/git-sh-i18n--envsubst
> > 9079 /usr/bin/diff
> > 8013 /usr/bin/wc
> > 5924 /bin/mv
> > 4566 /bin/grep
> >
>
> I am not an expert on perl nor tracing, but is it feasible to find out
> how many internal calls there are? i.e. either some shell script (rebase,
> submodule) calling git itself a couple of times or even from compile/git/git
> itself, e.g. some submodule operations use forking in there.

The script below is my attempt, though I think it is not quite right, as
"make" should be the single apex of the graph. You can run it like:

  strace -f -o /tmp/foo.out -e clone,execve make test
  perl graph.pl /tmp/foo.out | less -S

One thing that it counts (that was not counted above) is the number of
forks for subshells, which is considerable. I don't know how expensive
that is versus, say, running "cat" (if your fork() doesn't
copy-on-write, and you implement sub-programs via an efficient spawn()
call, it's possible that the subshells are significantly more
expensive).
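To put a rough number on the fork-versus-exec split, something like the sketch below might do. It counts how many cloned children show up in the trace and how many of those never call execve() (i.e., plain subshells). The `tally` helper and the sample lines are made up for illustration, and pid recycling is ignored here, so on a real trace the numbers would only be approximate.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: given "strace -f -o" lines, return the number of
# cloned children and the number of those that never exec'd anything.
# Caveat: pids are treated as unique; recycled pids will skew the counts.
sub tally {
	my @lines = @_;
	my (%cloned, %execed);
	for (@lines) {
		# <pid> execve("some-prog", ...
		if (/^(\d+)\s+execve\("/) {
			$execed{$1} = 1;
		}
		# <pid> clone(...) = <child>
		elsif (/^\d+\s+.*clone.*\) = (\d+)$/) {
			$cloned{$1} = 1;
		}
	}
	my $fork_only = grep { !$execed{$_} } keys %cloned;
	return (scalar(keys %cloned), $fork_only);
}

# Made-up sample lines, not taken from a real trace.
my @sample = (
	'100 execve("/usr/bin/make", ["make", "test"], [/* 20 vars */]) = 0',
	'100 clone(child_stack=0, flags=SIGCHLD) = 101',
	'101 execve("/bin/sh", ["sh", "t0000.sh"], [/* 20 vars */]) = 0',
	'101 clone(child_stack=0, flags=SIGCHLD) = 102',
	'101 clone(child_stack=0, flags=SIGCHLD) = 103',
	'103 execve("/bin/cat", ["cat"], [/* 20 vars */]) = 0',
);
my ($children, $fork_only) = tally(@sample);
print "children: $children, fork-only: $fork_only\n";
```

On a real trace, the sample array would be replaced by the lines of /tmp/foo.out.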

-Peff

-- >8 --
#!/usr/bin/perl
use strict;
use warnings;

my %clone;    # node => list of child nodes
my %exec;     # node => list of programs exec'd by that node
my %is_child; # nodes that appeared as the result of a clone
my %counter;  # pid => number of exits seen so far

while (<>) {
	# <pid> execve("some-prog", ...
	if (/^(\d+)\s+execve\("(.*?)"/) {
		push @{$exec{node($1)}}, $2;
	}
	# <pid> clone(...) = <child>
	#   or
	# <pid> <... clone resumed> ...) = <child>
	elsif (/^(\d+)\s+.*clone.*\) = (\d+)$/) {
		push @{$clone{node($1)}}, node($2);
		$is_child{node($2)} = 1;
	}
	# <pid> +++ exited with <code> +++
	# We have to keep track of this because pids get recycled,
	# and so are not unique node names in our graph.
	elsif (/^(\d+)\s+.*exited with/) {
		$counter{$1}++;
	}
}

# Roots are nodes that cloned something but were never cloned themselves.
show($_, 0) for grep { !$is_child{$_} } keys(%clone);

sub show {
	my ($pid, $indent) = @_;
	# A node with no execve is a plain fork (e.g., a subshell).
	my @progs = @{$exec{$pid} || []};
	if (!@progs) {
		@progs = ("(fork)");
	}
	print ' ' x $indent;
	print "$pid: ", shift @progs;
	print " => $_" for @progs;
	print "\n";
	show($_, $indent + 2) for @{$clone{$pid} || []};
}

sub node {
	my $pid = shift;
	my $c = $counter{$pid} || "0";
	return "$pid-$c";
}
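
As a small illustration of why node() appends the exit counter, the sketch below feeds a few made-up trace lines in which pid 200 is used by two different processes in turn. The sample lines and the `@names` array are hypothetical; only the naming scheme mirrors the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %counter;

# Same naming scheme as node() above: the pid plus the number of
# times that pid has already been seen exiting.
sub node {
	my $pid = shift;
	return $pid . '-' . ($counter{$pid} || 0);
}

# Made-up sample lines: pid 200 is recycled after the first exit.
my @sample = (
	'200 execve("/bin/rm", ["rm", "-f", "x"], [/* 20 vars */]) = 0',
	'200 +++ exited with 0 +++',
	'200 execve("/bin/cat", ["cat", "y"], [/* 20 vars */]) = 0',
	'200 +++ exited with 0 +++',
);

my @names;
for (@sample) {
	if (/^(\d+)\s+execve\("(.*?)"/) {
		push @names, node($1) . " $2";
	}
	elsif (/^(\d+)\s+\+\+\+ exited with/) {
		$counter{$1}++;
	}
}

# The two processes get distinct node names despite sharing a pid.
print "$_\n" for @names;
```

This prints "200-0 /bin/rm" and then "200-1 /bin/cat". Without the counter, both would collapse into a single "200" node.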