Re: git grep parallelism question

2013-05-05 Thread Pete Wyckoff
torva...@linux-foundation.org wrote on Fri, 26 Apr 2013 13:31 -0700:
 Anyway, I think your patch is good if for no other reason that it
 allows this kind of testing, but at least for my machine, clearly the
 current default of eight threads is actually good enough. Maybe
 somebody with a very different machine might want to run the above
 script and see if how sensitive other machines are to this parameter..

NFS numbers behave as expected:  IO concurrency is key.

WARM
1 real 0m23.147s
2 real 0m13.913s
4 real 0m6.958s
8 real 0m4.104s
16 real 0m3.588s
32 real 0m3.212s
64 real 0m3.173s

COLD
1 real 1m36.969s
2 real 0m51.627s
4 real 0m32.994s
8 real 0m25.657s
16 real 0m21.260s
32 real 0m18.138s
64 real 0m17.265s

I am tempted to change the default locally from 8 to 32.

-- Pete
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-30 Thread John Keeping
On Mon, Apr 29, 2013 at 03:22:24PM -0700, Junio C Hamano wrote:
 John Keeping j...@keeping.me.uk writes:
  No, I was the one missing something (--root to be precise).  But with
  TEST_OUTPUT_DIRECTORY you also get the result files in your temporary
  location, not just the trash directory.
 
 With your patch, doesn't t-*.sh --root $there automatically
 use the fast $there temporary location as the result depot, too?

No, the current code uses:

$TEST_OUTPUT_DIRECTORY/$root/trash\ directory.t

where we don't prepend $TEST_OUTPUT_DIRECTORY/ if $root is absolute.

 If it doesn't with the current code, shouldn't it?

I think the current behaviour is fine and the two options complement
each other.

TEST_OUTPUT_DIRECTORY is something you set once and forget about which
says all of the test output should go over here, whereas --root is
passed to a specific test and says put your output here but does not
affect the result aggregation which is not specific to that test.

Note that setting TEST_OUTPUT_DIRECTORY in config.mak affects all tests
no matter how you run them (via make or as ./t-.sh) whereas
setting --root=... in GIT_TEST_OPTS only affect tests run via make.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-30 Thread Jeff King
On Tue, Apr 30, 2013 at 09:08:49AM +0100, John Keeping wrote:

  With your patch, doesn't t-*.sh --root $there automatically
  use the fast $there temporary location as the result depot, too?
 
 No, the current code uses:
 
 $TEST_OUTPUT_DIRECTORY/$root/trash\ directory.t
 
 where we don't prepend $TEST_OUTPUT_DIRECTORY/ if $root is absolute.
 
  If it doesn't with the current code, shouldn't it?
 
 I think the current behaviour is fine and the two options complement
 each other.
 
 TEST_OUTPUT_DIRECTORY is something you set once and forget about which
 says all of the test output should go over here, whereas --root is
 passed to a specific test and says put your output here but does not
 affect the result aggregation which is not specific to that test.

The original intent of --root (and how I use it) is to set and forget
it, too, via GIT_TEST_OPTS. I intentionally didn't move test results
with it, because to me the point was a pure optimization: put the trash
directories on a faster disk, and leave everything else identical.  With
--root, any scripts which later want to look at test-results will find
them in the usual place.

Your patch updates all of the in-tree spots which look at the results,
but any third-party scripts would need to take it into account, too
(though I have no idea if any such scripts even exist).

I'm curious if there is a good reason to want to move the results. Some
possibilities I can think of are:

  1. More optimization, as results are written to the faster filesystem.
 I doubt this is noticeable, though, as the amount of data written
 is relatively small compared to the tests themselves (which are
 constantly creating and deleting repos).

  2. You can run tests in a read-only git checkout. I'm not sure how
 useful that is, though, since you would already need to compile
 git.

  3. You could have multiple sets of test results to keep or compare.
 I'd think you'd want to keep the built versions of git around, too,
 though. Which would mean that a full checkout like git-new-workdir
 would be a much simpler way to accomplish the same thing.

So I'm not against TEST_OUTPUT_DIRECTORY as a concept, but I'm having
trouble seeing how it is more useful than --root.

 Note that setting TEST_OUTPUT_DIRECTORY in config.mak affects all tests
 no matter how you run them (via make or as ./t-.sh) whereas
 setting --root=... in GIT_TEST_OPTS only affect tests run via make.

I actually consider that a feature of --root. When I run make test
everything happens fast. When I run the script manually (which is
usually because I'm debugging), the trash directory appears in the
current directory, so I can easily investigate it. And if you are
running a single test, the performance impact is usually negligible
(where you really notice it is when running make -j32 test).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-30 Thread John Keeping
On Tue, Apr 30, 2013 at 11:59:39AM -0400, Jeff King wrote:
 On Tue, Apr 30, 2013 at 09:08:49AM +0100, John Keeping wrote:
 
   With your patch, doesn't t-*.sh --root $there automatically
   use the fast $there temporary location as the result depot, too?
  
  No, the current code uses:
  
  $TEST_OUTPUT_DIRECTORY/$root/trash\ directory.t
  
  where we don't prepend $TEST_OUTPUT_DIRECTORY/ if $root is absolute.
  
   If it doesn't with the current code, shouldn't it?
  
  I think the current behaviour is fine and the two options complement
  each other.
  
  TEST_OUTPUT_DIRECTORY is something you set once and forget about which
  says all of the test output should go over here, whereas --root is
  passed to a specific test and says put your output here but does not
  affect the result aggregation which is not specific to that test.
 
 The original intent of --root (and how I use it) is to set and forget
 it, too, via GIT_TEST_OPTS. I intentionally didn't move test results
 with it, because to me the point was a pure optimization: put the trash
 directories on a faster disk, and leave everything else identical.  With
 --root, any scripts which later want to look at test-results will find
 them in the usual place.
 
 Your patch updates all of the in-tree spots which look at the results,
 but any third-party scripts would need to take it into account, too
 (though I have no idea if any such scripts even exist).
 
 I'm curious if there is a good reason to want to move the results. Some
 possibilities I can think of are:
 
   1. More optimization, as results are written to the faster filesystem.
  I doubt this is noticeable, though, as the amount of data written
  is relatively small compared to the tests themselves (which are
  constantly creating and deleting repos).
 
   2. You can run tests in a read-only git checkout. I'm not sure how
  useful that is, though, since you would already need to compile
  git.
 
   3. You could have multiple sets of test results to keep or compare.
  I'd think you'd want to keep the built versions of git around, too,
  though. Which would mean that a full checkout like git-new-workdir
  would be a much simpler way to accomplish the same thing.
 
 So I'm not against TEST_OUTPUT_DIRECTORY as a concept, but I'm having
 trouble seeing how it is more useful than --root.

I think the original intent of TEST_OUTPUT_DIRECTORY was to allow other
users of the test framework (in contrib/ or the performance tests) to
put their output in a sensible place for those tests, like you describe
below.

The patch being discussed here [1] just makes sure that it applies
to everything - previously it was applied to test-results/
inconsistently; test-lib.sh used TEST_OUTPUT_DIRECTORY but the makefile
didn't.  So we haven't actually changed where test-results/ live as a
result of this change, just where the makefile looks in order to display
the aggregate results and clean them up.

  Note that setting TEST_OUTPUT_DIRECTORY in config.mak affects all tests
  no matter how you run them (via make or as ./t-.sh) whereas
  setting --root=... in GIT_TEST_OPTS only affect tests run via make.
 
 I actually consider that a feature of --root. When I run make test
 everything happens fast. When I run the script manually (which is
 usually because I'm debugging), the trash directory appears in the
 current directory, so I can easily investigate it. And if you are
 running a single test, the performance impact is usually negligible
 (where you really notice it is when running make -j32 test).

This confirms to me that the patch as it currently stands is correct: we
have made TEST_OUTPUT_DIRECTORY consistent and --root still works as
before.

[1] http://article.gmane.org/gmane.comp.version-control.git/222555
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-30 Thread Jeff King
On Tue, Apr 30, 2013 at 05:12:08PM +0100, John Keeping wrote:

  So I'm not against TEST_OUTPUT_DIRECTORY as a concept, but I'm having
  trouble seeing how it is more useful than --root.
 
 I think the original intent of TEST_OUTPUT_DIRECTORY was to allow other
 users of the test framework (in contrib/ or the performance tests) to
 put their output in a sensible place for those tests, like you describe
 below.
 
 The patch being discussed here [1] just makes sure that it applies
 to everything - previously it was applied to test-results/
 inconsistently; test-lib.sh used TEST_OUTPUT_DIRECTORY but the makefile
 didn't.  So we haven't actually changed where test-results/ live as a
 result of this change, just where the makefile looks in order to display
 the aggregate results and clean them up.

Ah, I see. Thanks, that was the piece I was missing.

 This confirms to me that the patch as it currently stands is correct: we
 have made TEST_OUTPUT_DIRECTORY consistent and --root still works as
 before.
 
 [1] http://article.gmane.org/gmane.comp.version-control.git/222555

Yeah, if it is about harmonizing the Makefile and the test scripts, that
is definitely a bug fix, and the right thing to do.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-29 Thread Ramkumar Ramachandra
Linus Torvalds wrote:
 Anyway, I think your patch is good if for no other reason that it
 allows this kind of testing, but at least for my machine, clearly the
 current default of eight threads is actually good enough. Maybe
 somebody with a very different machine might want to run the above
 script and see if how sensitive other machines are to this parameter..

I have four cores with HT, so I expect similar gains from CPU
parallelism (in the hot-cache case).  And the test results meet this
expectation:

1 real 0m1.287s
2 real 0m0.635s
4 real 0m0.460s
8 real 0m0.413s
16 real 0m0.443s
32 real 0m0.715s

I have a rotating hard disk, and expect the IO parallelism (cold-cache
case) benefits to peak at a larger number of threads (than in your SSD
case).  These are the test results:

1 real 0m28.566s
4 real 0m20.361s
8 real 0m16.990s
16 real 0m15.278s
32 real 0m13.710s
64 real 0m12.405s
128 real 0m11.913s
256 real 0m11.759s
512 real 0m12.022s

It looks like 128~256 threads is the sweet spot.

It's impossible for git to determine if the cache is hot or cold,
correct?  Cold cache is a very rare case, and I'm not sure how we can
optimize specifically for that case.  So, I'm not sure how we can
improve grep.

On a related note, one place that IO parallelism can provide massive
benefits is in executing shell scripts.  Accordingly, I always use the
following commands to compile and test git respectively:

make -j 8 CFLAGS=-g -O0 -Wall
make -j 8 DEFAULT_TEST_TARGET=prove GIT_PROVE_OPTS=-j 16 test

i.e. always use 8 threads when the task is known to be CPU intensive,
and always use 16 threads when the task is known to be IO intensive.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-29 Thread John Keeping
On Mon, Apr 29, 2013 at 07:35:01PM +0530, Ramkumar Ramachandra wrote:
 On a related note, one place that IO parallelism can provide massive
 benefits is in executing shell scripts.  Accordingly, I always use the
 following commands to compile and test git respectively:
 
 make -j 8 CFLAGS=-g -O0 -Wall
 make -j 8 DEFAULT_TEST_TARGET=prove GIT_PROVE_OPTS=-j 16 test
 
 i.e. always use 8 threads when the task is known to be CPU intensive,
 and always use 16 threads when the task is known to be IO intensive.

On this tangent, I recently added a TEST_OUTPUT_DIRECTORY line to my
config.mak which points into a tmpfs mount.  Keeping all of the test
repositories in RAM makes the tests significantly faster for me and
works nicely when you have the patches in jk/test-output (without those
patches the individual tests work but the reporting of aggregate results
doesn't).
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-29 Thread Thomas Rast
John Keeping j...@keeping.me.uk writes:

 On Mon, Apr 29, 2013 at 07:35:01PM +0530, Ramkumar Ramachandra wrote:
 On a related note, one place that IO parallelism can provide massive
 benefits is in executing shell scripts.  Accordingly, I always use the
 following commands to compile and test git respectively:
 
 make -j 8 CFLAGS=-g -O0 -Wall
 make -j 8 DEFAULT_TEST_TARGET=prove GIT_PROVE_OPTS=-j 16 test
 
 i.e. always use 8 threads when the task is known to be CPU intensive,
 and always use 16 threads when the task is known to be IO intensive.

 On this tangent, I recently added a TEST_OUTPUT_DIRECTORY line to my
 config.mak which points into a tmpfs mount.  Keeping all of the test
 repositories in RAM makes the tests significantly faster for me and
 works nicely when you have the patches in jk/test-output (without those
 patches the individual tests work but the reporting of aggregate results
 doesn't).

But that's been possible for quite some time now, using --root, or am I
missing something?

(Not that the fix as such is a bad idea, but other readers might not
want to wait for it to hit master.)

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-29 Thread John Keeping
On Mon, Apr 29, 2013 at 08:04:10PM +0200, Thomas Rast wrote:
 John Keeping j...@keeping.me.uk writes:
 
  On Mon, Apr 29, 2013 at 07:35:01PM +0530, Ramkumar Ramachandra wrote:
  On a related note, one place that IO parallelism can provide massive
  benefits is in executing shell scripts.  Accordingly, I always use the
  following commands to compile and test git respectively:
  
  make -j 8 CFLAGS=-g -O0 -Wall
  make -j 8 DEFAULT_TEST_TARGET=prove GIT_PROVE_OPTS=-j 16 test
  
  i.e. always use 8 threads when the task is known to be CPU intensive,
  and always use 16 threads when the task is known to be IO intensive.
 
  On this tangent, I recently added a TEST_OUTPUT_DIRECTORY line to my
  config.mak which points into a tmpfs mount.  Keeping all of the test
  repositories in RAM makes the tests significantly faster for me and
  works nicely when you have the patches in jk/test-output (without those
  patches the individual tests work but the reporting of aggregate results
  doesn't).
 
 But that's been possible for quite some time now, using --root, or am I
 missing something?

No, I was the one missing something (--root to be precise).  But with
TEST_OUTPUT_DIRECTORY you also get the result files in your temporary
location, not just the trash directory.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-29 Thread Junio C Hamano
John Keeping j...@keeping.me.uk writes:

 On Mon, Apr 29, 2013 at 08:04:10PM +0200, Thomas Rast wrote:
 John Keeping j...@keeping.me.uk writes:
 
  On Mon, Apr 29, 2013 at 07:35:01PM +0530, Ramkumar Ramachandra wrote:
  On a related note, one place that IO parallelism can provide massive
  benefits is in executing shell scripts.  Accordingly, I always use the
  following commands to compile and test git respectively:
  
  make -j 8 CFLAGS=-g -O0 -Wall
  make -j 8 DEFAULT_TEST_TARGET=prove GIT_PROVE_OPTS=-j 16 test
  
  i.e. always use 8 threads when the task is known to be CPU intensive,
  and always use 16 threads when the task is known to be IO intensive.
 
  On this tangent, I recently added a TEST_OUTPUT_DIRECTORY line to my
  config.mak which points into a tmpfs mount.  Keeping all of the test
  repositories in RAM makes the tests significantly faster for me and
  works nicely when you have the patches in jk/test-output (without those
  patches the individual tests work but the reporting of aggregate results
  doesn't).
 
 But that's been possible for quite some time now, using --root, or am I
 missing something?

 No, I was the one missing something (--root to be precise).  But with
 TEST_OUTPUT_DIRECTORY you also get the result files in your temporary
 location, not just the trash directory.

With your patch, doesn't t-*.sh --root $there automatically
use the fast $there temporary location as the result depot, too?
If it doesn't with the current code, shouldn't it?
 
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-27 Thread Thomas Rast
Linus Torvalds torva...@linux-foundation.org writes:

 Anyway, I think your patch is good if for no other reason that it
 allows this kind of testing, but at least for my machine, clearly the
 current default of eight threads is actually good enough. Maybe
 somebody with a very different machine might want to run the above
 script and see if how sensitive other machines are to this parameter..

I think the last time Duy, Peff, me and others looked into grep
threading (which was a while ago) we basically reached the conclusion
that it's very unstable across machines.

Back then I tested things on a 2x6-core Xeon OS X machine and there the
performance (for the hot-cache, worktree case, which should parallelize
nicely) flatlines at 5 threads for no apparent reason.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git grep parallelism question

2013-04-26 Thread Linus Torvalds
Since I reboot fairly regularly to test new kernels, I don't *always*
have the kernel source tree in my caches, so I still care about some
cold-cache performance. And git grep tends to be the most noticeable
one.

Now, I have a SSD, and even the cold-cache case takes just five
seconds or so, but that's still somethng I react to, since the normal
kernel tree in cache case ends up bring close enough to
instantaneous (half a second) that then when it takes longer I react
to it.

And I started thinking about it, and our git grep parallelism seems
to be limited to 8.

Which is fine most of the time for CPU parallelism (although maybe
some people with big machines would prefer higher numbers), but for IO
parallelism I thought that maybe we'd like a higher number...

So I tried it out, and with THREADS set to 32, I get a roughly 15%
performance boost for the cold-cache case (the error bar is high
enough to not give a very precise number, but I see it going from
~4.8-4.9s on my machine down to 4.2..4.6s).

That's on an SSD, though - the performance implications might be very
different for other use cases (NFS would likely prefer higher IO
parallelism and might show bigger improvement, while a rotational disk
might end up just thrashing more)

Is this a big deal? Probably not. But I did react to how annoying it
was to set the parallelism factor (recompile git with a new number).
Wouldn't it be lovely if it was slightly smarter (something more akin
to the index preloading that takes number of files into account) or at
least allowed people to set the parallelism explicitly with a command
line switch?

Right now it disables the parallel grep entirely for UP, for example.
Which makes perfect sense if grep is all about CPU use. But even UP
might improve from parallel IO..

  Linus
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-26 Thread Junio C Hamano
Linus Torvalds torva...@linux-foundation.org writes:

 Wouldn't it be lovely if it was slightly smarter (something more akin
 to the index preloading that takes number of files into account) or at
 least allowed people to set the parallelism explicitly with a command
 line switch?

Yeah, a reasonable starting point for auto-tuning may be to use the
same kind of parameters and heuristics (i.e. max parallel of 20
threads, assume a cost to use an extra thread is the same as running
500 greps), and then tweak them (for example, thread cost of 500 may
be reasonable for lstat() but it would be way too big for grep()).

The real issue may be that we do not have a good estimate of how
many paths are involved in the request before starting these
threads, though.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-26 Thread Linus Torvalds
On Fri, Apr 26, 2013 at 11:47 AM, Junio C Hamano gits...@pobox.com wrote:

 The real issue may be that we do not have a good estimate of how
 many paths are involved in the request before starting these
 threads, though.

Yes. Also, I'm not sure if the 15% possible improvement on my SSD case
is even worth it for something that in the end isn't necessarily the
common case. I *suspect* that it might be a much bigger deal on NFS
(IO parallelism really does end up being a big deal sometimes, and
caching tends to be less aggressive too), but on rotational media it
might be much less clear, or even a loss..

Are there people out there who use git grep over NFS and have been
unhappy with performance? If are willing to recompile git with a
different THREAD value in builtin/grep.c, then on a Linux client you
can try

echo 3  /proc/sys/vm/drop_caches

to largely force cold-cache behavior for testing (I say largely,
because it won't drop busy/dirty pages, but for git grep kind of
loads it should be good).

Of course, you need root for it, so..

  Linus
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-26 Thread Junio C Hamano
Linus Torvalds torva...@linux-foundation.org writes:

 Yes. Also, I'm not sure if the 15% possible improvement on my SSD case
 is even worth it for something that in the end isn't necessarily the
 common case.

Cold cache being uncommon case would be forever true but more and
more people are on SSD, and 15% is not a trivial improvement.

 Are there people out there who use git grep over NFS and have been
 unhappy with performance? If are willing to recompile git with a
 different THREAD value in builtin/grep.c,...

OK, you have to recompile at least once for experiment, so perhaps
building the test binary with this patch may help.

 builtin/grep.c | 30 ++
 1 file changed, 22 insertions(+), 8 deletions(-)

diff --git a/builtin/grep.c b/builtin/grep.c
index 159e65d..f635cd5 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -26,12 +26,14 @@ static char const * const grep_usage[] = {
 static int use_threads = 1;
 
 #ifndef NO_PTHREADS
-#define THREADS 8
-static pthread_t threads[THREADS];
+#define MAX_THREADS 100
+static int num_threads = 8;
+static pthread_t threads[MAX_THREADS];
 
-/* We use one producer thread and THREADS consumer
- * threads. The producer adds struct work_items to 'todo' and the
- * consumers pick work items from the same array.
+/*
+ * We use one producer thread and THREADS consumer threads. The
+ * producer adds struct work_items to 'todo' and the consumers pick
+ * work items from the same array.
  */
 struct work_item {
struct grep_source source;
@@ -205,7 +207,7 @@ static void start_threads(struct grep_opt *opt)
strbuf_init(todo[i].out, 0);
}
 
-   for (i = 0; i  ARRAY_SIZE(threads); i++) {
+   for (i = 0; i  num_threads; i++) {
int err;
struct grep_opt *o = grep_opt_dup(opt);
o-output = strbuf_out;
@@ -237,7 +239,7 @@ static int wait_all(void)
pthread_cond_broadcast(cond_add);
grep_unlock();
 
-   for (i = 0; i  ARRAY_SIZE(threads); i++) {
+   for (i = 0; i  num_threads; i++) {
void *h;
pthread_join(threads[i], h);
hit |= (int) (intptr_t) h;
@@ -636,6 +638,7 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
int i;
int dummy;
int use_index = 1;
+   int num_threads_explicit = -1;
int pattern_type_arg = GREP_PATTERN_TYPE_UNSPECIFIED;
 
struct option options[] = {
@@ -743,6 +746,8 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
N_(allow calling of grep(1) (ignored by this 
build))),
{ OPTION_CALLBACK, 0, help-all, options, NULL, N_(show 
usage),
  PARSE_OPT_HIDDEN | PARSE_OPT_NOARG, help_callback },
+   OPT_INTEGER(0, threads, num_threads_explicit,
+   N_(use threads when searching)),
OPT_END()
};
 
@@ -773,6 +778,15 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
 PARSE_OPT_NO_INTERNAL_HELP);
grep_commit_pattern_type(pattern_type_arg, opt);
 
+   if (MAX_THREADS = num_threads_explicit) {
+   warning(limiting --threads to %d, MAX_THREADS);
+   num_threads = MAX_THREADS;
+   } else if (num_threads_explicit  0) {
+   ; /* keep num_threads to compiled-in default */
+   } else {
+   num_threads = num_threads_explicit;
+   }
+
if (use_index  !startup_info-have_repository)
/* die the same way as if we did it at the beginning */
setup_git_directory();
@@ -834,7 +848,7 @@ int cmd_grep(int argc, const char **argv, const char 
*prefix)
}
 
 #ifndef NO_PTHREADS
-   if (list.nr || cached || online_cpus() == 1)
+   if ((list.nr || cached || online_cpus() == 1)  num_threads_explicit  
0)
use_threads = 0;
 #else
use_threads = 0;
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git grep parallelism question

2013-04-26 Thread Linus Torvalds
On Fri, Apr 26, 2013 at 12:19 PM, Junio C Hamano gits...@pobox.com wrote:

 OK, you have to recompile at least once for experiment, so perhaps
 building the test binary with this patch may help.

So here's my experiment on my machine with this patch and the kernel
tree. First I did the warm-cache case:

  for i in 1 4 8 16 32 64
  do
echo $i:
for j in 1 2 3 4
do
  t=$(sh -c time git grep --threads=$i hjahsja 21 | grep real)
  echo $i $t
done
  done

and the numbers are pretty stable, here's just he summary (best of
four tries for each case):

   1 real 0m0.598s
   4 real 0m0.253s
   8 real 0m0.235s
  16 real 0m0.269s
  32 real 0m0.412s
  64 real 0m0.420s

so for this machine, 8 threads (our old number) is actually optimal
even if it has just four cores (actually, two cores with HT). I
suspect it's just because the load is slightly unbalanced, so a few
extra threads helps. Looks like anything in the 4-16 thread range is
ok, though. But 32 threads is clearly too much.

Then I did the exact same thing, but with the echo 3 
/proc/sys/vm/drop_caches just before the timing of the git grep.
Again, summarizing (best-of-four number, the variation wasn't that
big):

   1 real 0m17.866s
   4 real 0m6.367s
   8 real 0m4.855s
  16 real 0m4.307s
  32 real 0m4.153s
  64 real 0m4.128s

here, the numbers continue to improve up to 64 threads, although the
difference between 32 and 64 is starting to be in the noise. I suspect
it's a combination of better IO overlap (although not all SSD's will
even improve all that much from overlapping IO past a certain point)
and probably a more noticeable imbalance between threads.

Anyway, for *my* machine and for *this* particular load, I'd say that
we're already pretty close to optimal. I did some trials just to see,
but the best hot-cache numbers were fairly reliably for 7 or 8
threads.

Looking at the numbers, I can't really convince myself that it would
be worth it to do (say) 12 threads on this machine. Yes, the
cold-cache case improves from the 8-thread case (best-of-four for 12
threads: 0m4.467s), but the hot-cache case has gotten sufficiently
worse (0m0.251s) that I'm not sure..

Of course, in *absolute* numbers the cold-cache case is so much slower
that a half-second improvement from going to 16 threads might be
considered worth it, because while the the hot-cache case gets worse,
we may just not care because it's fast enough that it's not
noticeable.

Anyway, I think your patch is good if for no other reason that it
allows this kind of testing, but at least for my machine, clearly the
current default of eight threads is actually good enough. Maybe
somebody with a very different machine might want to run the above
script and see if how sensitive other machines are to this parameter..

   Linus
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html