allow indexing cleartext of encrypted messages

2015-12-09 Thread Daniel Kahn Gillmor
Notmuch currently doesn't index the cleartext of encrypted mail.  This
is the right choice by default, because the index is basically
cleartext-equivalent, and we wouldn't want every indexed mailstore to
leak the contents of its encrypted mails.

However, if a notmuch user has their index in a protected location,
they may prefer the convenience of being able to search the contents
of (at least some of) their encrypted mail.

This series of patches enables notmuch to index the cleartext of
specific encrypted messages when they're being added via "notmuch new"
or "notmuch insert", via a new --try-decrypt flag.

If --try-decrypt is used, and decryption is successful for part of a
message, the message gets an additional "index-decrypted" tag.  If
decryption of part of a message fails, the message gets an additional
"index-decryption-failed" tag.
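
To make the tagging rule concrete, here is a minimal sketch of it in
Python.  The two tag names come from the description above; the
function name and the one-boolean-per-encrypted-part input are
assumptions made purely for illustration, and are not part of the
patches.  Note that a message with several encrypted parts could end
up with both tags.

```python
# Tag names are from the description above.  The function and its input
# format are hypothetical, for illustration only.
TAG_DECRYPTED = "index-decrypted"
TAG_FAILED = "index-decryption-failed"


def decryption_tags(part_results):
    """Return the tags a message would gain after a --try-decrypt pass.

    part_results holds one boolean per encrypted part: True if that
    part decrypted successfully, False if decryption failed.
    """
    tags = set()
    if any(part_results):
        # at least one part was decrypted and indexed in the clear
        tags.add(TAG_DECRYPTED)
    if not all(part_results):
        # at least one part could not be decrypted
        tags.add(TAG_FAILED)
    return tags
```

A message with no encrypted parts gets neither tag under this reading.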

This tagging approach should allow people to figure out which messages
have been indexed in the clear (or not), and can be used to
selectively reindex them in batch with something like:


#!/usr/bin/env python3

'''notmuch-reindex.py -- a quick and dirty pythonic mechanism to
re-index specific messages in a notmuch database.  This should
probably be properly implemented as a subcommand for /usr/bin/notmuch
itself'''

import notmuch
import sys

d = notmuch.Database(mode=notmuch.Database.MODE.READ_WRITE)

query = sys.argv[1]

q = d.create_query(query)

for m in q.search_messages():
    mainfilename = m.get_filename()
    origtags = m.get_tags()
    # keep every tag except the two that record decryption status
    tags = []
    for t in origtags:
        if t not in ['index-decrypted', 'index-decryption-failed']:
            tags += [t]
    d.begin_atomic()
    # drop every copy of the message, then re-add it with decryption
    for f in m.get_filenames():
        d.remove_message(f)
    # try_decrypt is the keyword argument added by this patch series
    (newm,stat) = d.add_message(mainfilename, try_decrypt=True)
    for tag in tags:
        newm.add_tag(tag)
    d.end_atomic()


A couple key points:

 * There is some code duplication between crypto.c (for the
   notmuch-client) and lib/database.cc and lib/index.cc (for the
   library) because both parts of the codebase use gmime to handle the
   decryption.  I don't want to contaminate the libnotmuch API with
   gmime implementation details, so i don't quite see how to reuse the
   code cleanly.  I'd love suggestions on how to reduce the
   duplications.

 * the libnotmuch API is extended with
   notmuch_database_add_message_try_decrypt().  This should probably
   ultimately be more general, because there are a few additional
   knobs that i can imagine fiddling at indexing time.  For example:

     * verifying cryptographic signatures and storing something about
       those verifications in the notmuch db

     * extracting OpenPGP session key information for a given message
       and storing it in a lookaside table in the notmuch db, so that
       it's possible to securely destroy old encryption-capable keys
       and still have local access to the cleartext of the remaining
       messages.

   Some of these additional features might be orthogonal to one
   another as well.  I welcome suggestions for how to improve the API
   so that we don't end up with a combinatorial explosion of
   n_d_add_message_foo() functions.

 * To properly complete this patch series, i think i want to make
   notmuch-reindex.c and add a reindex subcommand, also with a
   --try-decrypt option.  It's not clear to me if the right approach
   for that is to have a C implementation of the python script above
   without modifying libnotmuch, or if i should start by creating a
   notmuch_message_reindex function in libnotmuch, with a try_decrypt
   flag.  Again, suggestions welcome.

 * Is the tagging approach the right thing to do to record success or
   failure of decryption at index time?  Is there a better approach?


___
notmuch mailing list
notmuch@notmuchmail.org
https://notmuchmail.org/mailman/listinfo/notmuch


Re: allow indexing cleartext of encrypted messages

2015-12-11 Thread Daniel Kahn Gillmor
On Wed 2015-12-09 22:39:37 -0500, Daniel Kahn Gillmor wrote:
>  * the libnotmuch API is extended with
>    notmuch_database_add_message_try_decrypt().  This should probably
>    ultimately be more general, because there are a few additional
>    knobs that i can imagine fiddling at indexing time.  For example:
>
>      * verifying cryptographic signatures and storing something about
>        those verifications in the notmuch db
>
>      * extracting OpenPGP session key information for a given message
>        and storing it in a lookaside table in the notmuch db, so that
>        it's possible to securely destroy old encryption-capable keys
>        and still have local access to the cleartext of the remaining
>        messages.
>
>    Some of these additional features might be orthogonal to one
>    another as well.  I welcome suggestions for how to improve the API
>    so that we don't end up with a combinatorial explosion of
>    n_d_add_message_foo() functions.

I have a proposal for how to do this better:

I'll introduce a notmuch_index_options_t, with the usual constructors
and destructors and a couple functions:

  notmuch_index_options_set_try_decrypt()
  notmuch_index_options_get_try_decrypt()
  notmuch_index_options_set_gpg_path()
  notmuch_index_options_get_gpg_path()

Then i'll add:

  notmuch_database_add_message_with_options(db, fname, options, &message)

If we add new indexing features, they can be set directly in the
index_options object (including features that might be more complex than
a string or a bool, like a chain of command-line filters).

a few nice features of this approach:

 * The user of the library can craft a set of index options and repeat
   it easily, and the options can contain cached/lazily-initialized
   things (like GMimeCryptoContexts) if needed.

 * The user can index different messages with different options if they
   prefer (no need to set the options on the database object itself)

 * the capability of the indexing features in the library is visible
   directly in the exposed API.
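
As a rough illustration of the shape of this proposal (not of
libnotmuch itself), the options-object pattern might look like the
Python sketch below.  IndexOptions and add_message_with_options are
hypothetical stand-ins for the proposed notmuch_index_options_t and
notmuch_database_add_message_with_options(), and the "database" is
just a dict; none of this is real notmuch API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexOptions:
    """Hypothetical stand-in for the proposed notmuch_index_options_t."""
    try_decrypt: bool = False
    gpg_path: Optional[str] = None
    # A future indexing knob would become one more field here, with no
    # new add_message_foo() entry point needed.


def add_message_with_options(index, fname, options):
    """Toy stand-in for notmuch_database_add_message_with_options().

    `index` is a plain dict rather than a notmuch database; it only
    records which options each file was indexed with.
    """
    index[fname] = options
    return fname


# One options object can be built once and reused across messages,
# while other messages are indexed with different options.
index = {}
decrypting = IndexOptions(try_decrypt=True, gpg_path="/usr/bin/gpg")
add_message_with_options(index, "mail/cur/1", decrypting)
add_message_with_options(index, "mail/cur/2", decrypting)
add_message_with_options(index, "mail/cur/3", IndexOptions())
```

The point of the pattern is that the add function's signature stays
fixed while the set of options grows.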

any thoughts on this?

--dkg


Re: allow indexing cleartext of encrypted messages

2015-12-11 Thread Tomi Ollila
On Fri, Dec 11 2015, Daniel Kahn Gillmor  wrote:

> On Wed 2015-12-09 22:39:37 -0500, Daniel Kahn Gillmor wrote:
>>  * the libnotmuch API is extended with
>>    notmuch_database_add_message_try_decrypt().  This should probably
>>    ultimately be more general, because there are a few additional
>>    knobs that i can imagine fiddling at indexing time.  For example:
>>
>>      * verifying cryptographic signatures and storing something about
>>        those verifications in the notmuch db
>>
>>      * extracting OpenPGP session key information for a given message
>>        and storing it in a lookaside table in the notmuch db, so that
>>        it's possible to securely destroy old encryption-capable keys
>>        and still have local access to the cleartext of the remaining
>>        messages.
>>
>>    Some of these additional features might be orthogonal to one
>>    another as well.  I welcome suggestions for how to improve the API
>>    so that we don't end up with a combinatorial explosion of
>>    n_d_add_message_foo() functions.
>
> I have a proposal for how to do this better:
>
> I'll introduce a notmuch_index_options_t, with the usual constructors
> and destructors and a couple functions:
>
>   notmuch_index_options_set_try_decrypt()
>   notmuch_index_options_get_try_decrypt()
>   notmuch_index_options_set_gpg_path()
>   notmuch_index_options_get_gpg_path()
>
> Then i'll add:
>
>   notmuch_database_add_message_with_options(db, fname, options, &message)
>
> If we add new indexing features, they can be set directly in the
> index_options object (including features that might be more complex than
> a string or a bool, like a chain of command-line filters).
>
> a few nice features of this approach:
>
>  * The user of the library can craft a set of index options and repeat
>    it easily, and the options can contain cached/lazily-initialized
>    things (like GMimeCryptoContexts) if needed.
>
>  * The user can index different messages with different options if they
>    prefer (no need to set the options on the database object itself)
>
>  * the capability of the indexing features in the library is visible
>    directly in the exposed API.
>
> any thoughts on this?

sounds good (on paper) (*)

>
> --dkg

Tomi

(*) deliberately declined to write 'looks good' >;) (but it's good)


Allow indexing cleartext of encrypted messages (v2)

2016-01-19 Thread Daniel Kahn Gillmor
This is the second draft of the series initially announced in
id:1449718786-28000-1-git-send-email-...@fifthhorseman.net:

> Notmuch currently doesn't index the cleartext of encrypted mail.  This
> is the right choice by default, because the index is basically
> cleartext-equivalent, and we wouldn't want every indexed mailstore to
> leak the contents of its encrypted mails.
> 
> However, if a notmuch user has their index in a protected location,
> they may prefer the convenience of being able to search the contents
> of (at least some of) their encrypted mail.
> 
> This series of patches enables notmuch to index the cleartext of
> specific encrypted messages when they're being added via "notmuch new"
> or "notmuch insert", via a new --try-decrypt flag.
> 
> If --try-decrypt is used, and decryption is successful for part of a
> message, the message gets an additional "index-decrypted" tag.  If
> decryption of part of a message fails, the message gets an additional
> "index-decryption-failed" tag.

v2 addresses the concerns raised from the helpful feedback on the
previous series, and adds a notmuch_indexopts_t object that can be
used to declare options for indexing messages, including a
"try_decrypt" boolean.

Additionally, this series adds a new function to libnotmuch:

  notmuch_message_reindex (notmuch_message_t *message,
                           notmuch_indexopts_t *indexopts)

which allows a user of the library to adjust the indexing options of a
given message.

The CLI is additionally augmented with a new notmuch subcommand,
"notmuch reindex", which also has a --try-decrypt flag.

So a user who has their message index stored securely and wants to
index the cleartext of all encrypted messages they've received can do
something like:

  notmuch reindex --try-decrypt tag:encrypted and not tag:index-decrypted

Or can clear all indexed cleartext from their database with:

  notmuch reindex tag:encrypted and tag:index-decrypted
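
For anyone who wants to drive this from a script, the two invocations
above can be assembled as argument lists, as in the sketch below.  The
"reindex" subcommand and the --try-decrypt flag are the ones proposed
in this series and may not match any released notmuch; the helper name
reindex_argv is made up for illustration.  The sketch only builds the
command and does not run it.

```python
def reindex_argv(search_terms, try_decrypt=False):
    """Build the argv for the "notmuch reindex" command proposed here.

    search_terms is a list of notmuch search terms.  The flag name
    --try-decrypt is taken from this patch series, not from a release.
    """
    argv = ["notmuch", "reindex"]
    if try_decrypt:
        argv.append("--try-decrypt")
    argv.extend(search_terms)
    return argv


# index the cleartext of encrypted messages not yet indexed that way
index_cmd = reindex_argv(
    ["tag:encrypted", "and", "not", "tag:index-decrypted"],
    try_decrypt=True,
)

# re-index without decryption, clearing previously indexed cleartext
clear_cmd = reindex_argv(["tag:encrypted", "and", "tag:index-decrypted"])
```

Either list could then be handed to subprocess.run() on a system where
a notmuch with this series applied is installed.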




Allow indexing cleartext of encrypted messages (v3)

2016-01-31 Thread Daniel Kahn Gillmor
This is the third draft of the series initially announced in
id:1449718786-28000-1-git-send-email-...@fifthhorseman.net (second
draft was in
id:1453258369-7366-1-git-send-email-...@fifthhorseman.net).  It
differs from v2 in that it incorporates the recent improvements in
detecting and processing S/MIME signatures.

From the v2 description:

> Notmuch currently doesn't index the cleartext of encrypted mail.  This
> is the right choice by default, because the index is basically
> cleartext-equivalent, and we wouldn't want every indexed mailstore to
> leak the contents of its encrypted mails.
> 
> However, if a notmuch user has their index in a protected location,
> they may prefer the convenience of being able to search the contents
> of (at least some of) their encrypted mail.
> 
> This series of patches enables notmuch to index the cleartext of
> specific encrypted messages when they're being added via "notmuch new"
> or "notmuch insert", via a new --try-decrypt flag.
> 
> If --try-decrypt is used, and decryption is successful for part of a
> message, the message gets an additional "index-decrypted" tag.  If
> decryption of part of a message fails, the message gets an additional
> "index-decryption-failed" tag.

v2 addresses the concerns raised from the helpful feedback on the
previous series, and adds a notmuch_indexopts_t object that can be
used to declare options for indexing messages, including a
"try_decrypt" boolean.

Additionally, this series adds a new function to libnotmuch:

  notmuch_message_reindex (notmuch_message_t *message,
                           notmuch_indexopts_t *indexopts)

which allows a user of the library to adjust the indexing options of a
given message.

The CLI is additionally augmented with a new notmuch subcommand,
"notmuch reindex", which also has a --try-decrypt flag.

So a user who has their message index stored securely and wants to
index the cleartext of all encrypted messages they've received can do
something like:

  notmuch reindex --try-decrypt tag:encrypted and not tag:index-decrypted

Or can clear all indexed cleartext from their database with:

  notmuch reindex tag:encrypted and tag:index-decrypted




Allow indexing cleartext of encrypted messages (v4)

2016-07-08 Thread Daniel Kahn Gillmor
This is the fourth draft of the series that enables indexing cleartext
of encrypted message parts.

previous versions start at:

v1:  id:1449718786-28000-1-git-send-email-...@fifthhorseman.net
v2:  id:1453258369-7366-1-git-send-email-...@fifthhorseman.net
v3:  id:1454272801-23623-1-git-send-email-...@fifthhorseman.net

v4 differs from v3 in that it uses Bremner's "message properties" to
record its information instead of trampling on the user-visible tag
space.

It also depends on one additional fix i pushed to the "message
properties" series, allowing notmuch queries with a "has:" prefix to
search the property namespace.  In particular:

 id:1467969336-7605-1-git-send-email-...@fifthhorseman.net

I welcome feedback!


Re: Allow indexing cleartext of encrypted messages (v3)

2016-02-06 Thread Tomi Ollila
On Sun, Jan 31 2016, Daniel Kahn Gillmor  wrote:

> This is the third draft of the series initially announced in
> id:1449718786-28000-1-git-send-email-...@fifthhorseman.net (second
> draft was in
> id:1453258369-7366-1-git-send-email-...@fifthhorseman.net).  It
> differs from v2 in that it incorporates the recent improvements in
> detecting and processing S/MIME signatures.

Looks pretty good. Nothing to bikeshed. Did not run tests yet.

Tomi


>
> From the v2 description:
>
>> Notmuch currently doesn't index the cleartext of encrypted mail.  This
>> is the right choice by default, because the index is basically
>> cleartext-equivalent, and we wouldn't want every indexed mailstore to
>> leak the contents of its encrypted mails.
>> 
>> However, if a notmuch user has their index in a protected location,
>> they may prefer the convenience of being able to search the contents
>> of (at least some of) their encrypted mail.
>> 
>> This series of patches enables notmuch to index the cleartext of
>> specific encrypted messages when they're being added via "notmuch new"
>> or "notmuch insert", via a new --try-decrypt flag.
>> 
>> If --try-decrypt is used, and decryption is successful for part of a
>> message, the message gets an additional "index-decrypted" tag.  If
>> decryption of part of a message fails, the message gets an additional
>> "index-decryption-failed" tag.
>
> v2 addresses the concerns raised from the helpful feedback on the
> previous series, and adds a notmuch_indexopts_t object that can be
> used to declare options for indexing messages, including a
> "try_decrypt" boolean.
>
> Additionally, this series adds a new function to libnotmuch:
>
>   notmuch_message_reindex (notmuch_message_t *message,
>                            notmuch_indexopts_t *indexopts)
>
> Which allows user of the library to adjust the indexing options of a
> given message.
>
> The CLI is additionally augmented with a new notmuch subcommand,
> "notmuch reindex", which also has a --try-decrypt flag.
>
> So a user who has their message index stored securely and wants to
> index the cleartext of all encrypted messages they've received can do
> something like:
>
>   notmuch reindex --try-decrypt tag:encrypted and not tag:index-decrypted
>
> Or can clear all indexed cleartext from their database with:
>
>   notmuch reindex tag:encrypted and tag:index-decrypted
>
>


Re: Allow indexing cleartext of encrypted messages (v3)

2016-02-09 Thread Jameson Graef Rollins
I've done only a cursory read-through of the code, but structurally it
looks good to me.  I like the reworking of crypto.c as a utility
library.  It applies cleanly to trunk, and all tests pass, including
the 11 new ones.

jamie.

