On Wed, 8 Feb 2012 08:36:09 +0000, David Edmondson <dme at dme.org> wrote:
> Optimize thread tagging by combining all the tagging operations to a
> single "notmuch tag" call.
>
> For threads in the order of tens or a hundred inbox tagged messages,
> this gives a noticeable speedup. On two different machines, archiving
> a thread of about 50 inbox tagged messages goes down from 10+ seconds
> to about 0.5 seconds.
>
> The bottleneck is not within emacs; the same behaviour can be observed
> in the CLI. This approach has the added benefit of being more
> reliable: any of the individual tagging operations might face a locked
> database, leading to partial results.
>
> This introduces a limitation to the number of messages that can be
> archived at the same time (through ARG_MAX limiting the command
> line). While at least on Linux this seems more like a theoretical
> limitation than a real one, it could be avoided by archiving at most a
> few hundred messages at a time.
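
The "at most a few hundred messages at a time" idea could be sketched roughly as below. This is only an illustration in Python, not the patch's elisp: the function names, the 500-message cap and the 100000-byte budget are all made up for the example, and the message IDs are joined with " or " without any query escaping.

```python
def batch_queries(message_ids, max_bytes=100_000, max_ids=500):
    """Yield queries like 'id:a or id:b', each within both limits.

    Hypothetical helper: max_bytes and max_ids are illustrative defaults,
    not values taken from notmuch or from the patch.
    """
    batch, size = [], 0
    for mid in message_ids:
        term = "id:" + mid
        # Joining with " or " adds 4 bytes in front of every term but the first.
        cost = len(term.encode()) + (4 if batch else 0)
        if batch and (size + cost > max_bytes or len(batch) >= max_ids):
            yield " or ".join(batch)
            batch, size = [], 0
            cost = len(term.encode())
        batch.append(term)
        size += cost
    if batch:
        yield " or ".join(batch)


def tag_commands(message_ids, tag_changes=("-inbox",), **limits):
    """Build one 'notmuch tag' argument list per batch (nothing is executed)."""
    return [["notmuch", "tag", *tag_changes, "--", query]
            for query in batch_queries(message_ids, **limits)]


if __name__ == "__main__":
    ids = ["msg%d@example.org" % i for i in range(1200)]
    for cmd in tag_commands(ids):
        print(len(cmd[-1]), "bytes of query in one call")
```

Each batch is still a separate "notmuch tag" call, so a locked database could again leave partial results between batches; the batch size trades that risk against the command line limit.
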

I did a simple test program:

--8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<--
#!/usr/bin/perl

use strict;
use warnings;

die "Usage: $0 <arglen> <# of args>\n" unless @ARGV == 2;

my $arg = 'x' x $ARGV[0];
my @args = ( $arg ) x $ARGV[1];

print "One arg: '$arg'\n";
print 'Number of args: ', scalar @args, "\n";

exec '/bin/true', @args;
--8<----8<----8<----8<----8<----8<----8<----8<----8<----8<----8<--

This program executes /bin/true with a given number of args, all of the
same given length; e.g.

  ./test_cmdlimit.pl 100 10000

makes 10000 100-character arguments (a total of one million characters)
and executes /bin/true with those ten thousand 100-char args.

On one machine where getconf ARG_MAX returns 131072 (Debian Lenny ia32)

  ./test_cmdlimit.pl 200 10000

succeeds but

  ./test_cmdlimit.pl 20 100000

gives

  Can't exec "/bin/true": Argument list too long at ./test_cmdlimit.pl line 14.

Hmm, actually there:

  ./test_cmdlimit.pl 19 100000   succeeds and
  ./test_cmdlimit.pl 209 10000   fails

More with args whose lengths are 1 and 2 chars:

  ./test_cmdlimit.pl 1 1046163   succeeds
  ./test_cmdlimit.pl 1 1046164   fails

  ./test_cmdlimit.pl 2 697442    succeeds
  ./test_cmdlimit.pl 2 697443    fails

1046163 is close to 1048576 (1024 * 1024) -- but 697442 * 2 is 1394884...

From these I can make an educated guess that when message IDs are
typically between 30 and 70 characters, something like 20 000 messages
are safe to tag at once on this test system
(./test_cmdlimit.pl 50 40000 succeeds).

>
> Based on code from Jani Nikula <jani at nikula.org>.

Tomi
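
One reading that fits all of the numbers above: assume each argument costs its length plus one byte (the terminating NUL). Then 1046163 * 2 and 697442 * 3 are both 2092326, and every failing run is above that. The check below is a small sketch of that arithmetic in Python; the "+ 1 per argument" rule is my assumption, not something read from the kernel source, and the run lists are just the results reported above.

```python
# (argument length, number of arguments) for each run reported above.
SUCCEEDED = [(200, 10000), (19, 100000), (1, 1046163), (2, 697442), (50, 40000)]
FAILED = [(20, 100000), (209, 10000), (1, 1046164), (2, 697443)]


def cost(arg_len, count):
    """Assumed cost model: every argument takes its length plus a NUL byte."""
    return (arg_len + 1) * count


largest_ok = max(cost(*run) for run in SUCCEEDED)
smallest_bad = min(cost(*run) for run in FAILED)

print("largest succeeding total:", largest_ok)    # 2092326
print("smallest failing total:  ", smallest_bad)  # 2092328
print("consistent with one byte limit:", largest_ok < smallest_bad)  # True
```

Both totals sit a little under 2 MiB (2097152). The remaining few kilobytes might be the environment and argv[0], but that part is a guess.
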