Hi notmuch folks--

I just wanted to note a performance problem that i'm seeing with
notmuch.  I've attached a demonstration script
("crypto-performance-test"), which sets up a temporary GnuPG homedir and
notmuch installation, adds 20 encrypted+signed messages in a thread
(using indexed cleartext with stashed session IDs), and then tries to
render the thread.  It compares the performance of this rendering with
manual extraction of the ciphertext, fed in a loop to manual invocations
of gpg.


Problem Statement
-----------------

On a long thread like this where each message is both signed and
encrypted, "notmuch show" takes a long time to emit json ouput, even if
all the session keys for the messages are stashed.

I believe the bulk of the time spent rendering this is signature
verification.

I also note that if a comparable test is run with a large keyring, the
duration increases as the keyring increases, but this test doesn't show
that particular behavior.

$ ./crypto-performance-test
gpg: key 18523A27026D55EB marked as ultimately trusted
Found 0 total files (that's not much mail).
No new mail.
gpg: checking the trustdb
gpg: marginals needed: 3  completes needed: 1  trust model: pgp
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: next trustdb check due at 2021-09-30
Processed 20 total files in 2s (8 files/sec.).
Added 20 new messages to the database.

performance of notmuch show:
real    0m1.582s
user    0m0.049s
sys     0m0.042s

performance of piped gpg:
real    0m1.110s
user    0m1.164s
sys     0m0.200s

performance of piped gpg --no-keyring:
real    0m0.249s
user    0m0.305s
sys     0m0.175s
$

note in particular the large delay for notmuch between wall-clock time
("real") and userspace ("user") and kernel ("sys") CPU time.

Diagnosis
---------

If i hack up gmime to disable signature verification (see
id:87v9t9xw0l....@fifthhorseman.net over on gmime-devel-l...@gnome.org,
which effectively adds the --no-keyring option to gpgme's gpg
invocation), then i can reduce the kernel time and the overall delay:

$ ./crypto-performance-test
gpg: key B9F59627422A84DB marked as ultimately trusted
Found 0 total files (that's not much mail).
No new mail.
gpg: checking the trustdb
gpg: marginals needed: 3  completes needed: 1  trust model: pgp
gpg: depth: 0  valid:   1  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 1u
gpg: next trustdb check due at 2021-09-30
Processed 20 total files in 1s (11 files/sec.).
Added 20 new messages to the database.

performance of notmuch show:
real    0m0.161s
user    0m0.037s
sys     0m0.017s

performance of piped gpg:
real    0m1.083s
user    0m1.158s
sys     0m0.176s

performance of piped gpg --no-keyring:
real    0m0.247s
user    0m0.301s
sys     0m0.177s
$

But there even so, there is a real difference between the CPU time and
the wall clock time.  The system i've run these tests on is not starved
for CPU or RAM, and it is not I/O bound.  Even if there were disk
troubles, the filesystem i'm running it on (where mktemp -d creates the
working directory) is a tmpfs.

My current guess is that bash's "time" built-in isn't tabulating all
user+sys CPU time actually used by notmuch because gmime's gpgme
invocations might be detaching its child processes entirely
(i.e. double-fork, orphaning the grandchild). The gpg processes that
appear during the "notmuch show" runs have parent pid 1.

How to fix
----------

I think the right thing to do here is related to stashing signature
verifications when they are first encountered, but i foresee trouble
with that approach given notmuch's willingness to collapse messages that
have the same message ID.

That is: we want to stash the signature verification the first time we
view it, and when displaying a message, we want to show the original
signature verification (rather than computing it again), *as long as*
the message has not changed since the previous verification.

We can't do this at all right now until GMime gains the capability, but
even if that happens i don't know how to do it safely.

Any suggestion on how we ensure that a cached verification applies to
all parts of the message being rendered via "notmuch show" ?

    --dkg

#!/bin/bash

set -e

workdir=$(mktemp -d)

cleanup() {
    rm -rf "$workdir"
}
trap cleanup EXIT

setup_gpg() {
    export GNUPGHOME="$workdir/g"
    mkdir -m 0700 -p "$GNUPGHOME"
    gpgargs=(--batch --pinentry-mode=loopback --passphrase '' --quiet --no-tty)

    gpg "${gpgargs[@]}"  --quick-gen-key 'Test User <t...@example.org>' futuredefault
}

setup_notmuch() {
    mkdir -p "$workdir/m"
    export NOTMUCH_CONFIG="$workdir/notmuch-config"
    cat > "$NOTMUCH_CONFIG" <<EOF
[database]
path=$workdir/m
[user]
name=Test User
primary_email=t...@example.org
[new]
tags=unread;inbox;
EOF
    notmuch new
}

body() {
    cat <<EOF
Content-Type: text/plain

This is test message $1
EOF
}

encmsg() {
    cat <<EOF
From: Test User <t...@example.org>
To: Test User <t...@example.org>
Date: $(date -R)
Subject: Message $x
Mime-Version: 1.0
Message-Id: <crypto-performance-$x...@example.org>
References: <crypto-performa...@example.org>
In-Reply-To: <crypto-performa...@example.org>
Content-Type: multipart/encrypted;
 boundary="xxxx";
 protocol="application/pgp-encrypted"

--xxxx
Content-Type: application/pgp-encrypted

Version: 1

--xxxx
Content-Type: application/octet-stream

$(body $x | gpg --armor --sign -r t...@example.org --encrypt)
--xxxx--
EOF
}

sk() {
    notmuch dump id:"crypto-performance-$1...@example.org" | grep session-key= \
        | sed -r -e 's/.*session-key=([A-F0-9%a-f]+)*/\1/' -e 's/%3a/:/'
}

setup_gpg
setup_notmuch
for x in $(seq 1 20); do
    encmsg > "$workdir/m/$x.eml"
done
notmuch new --decrypt=true
printf '\nperformance of notmuch show:'
time notmuch show --format=json id:crypto-performanc...@example.org >/dev/null
printf '\nperformance of piped gpg:'
time for x in $(seq 1 20); do
    notmuch show --entire-thread=false --decrypt=false --format=raw --part=3 id:crypto-performance-$x...@example.org \
        | gpg --quiet --batch --override-session-key $(sk $x) --decrypt >/dev/null 2>/dev/null
done
printf '\nperformance of piped gpg --no-keyring:'
time for x in $(seq 1 20); do
    notmuch show --entire-thread=false --decrypt=false --format=raw --part=3 id:crypto-performance-$x...@example.org \
        | gpg --quiet --batch --no-keyring --override-session-key $(sk $x) --decrypt >/dev/null 2>/dev/null || true
done

Attachment: signature.asc
Description: PGP signature

_______________________________________________
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch

Reply via email to