Hi all, here's something to debate. ;)

We have a homebrew message header parser in the lib, and we also parse messages, including headers, using gmime during indexing. This means that for messages that get indexed, we parse the headers twice. (Duplicates and non-emails only get parsed using our own parser.)

The two parsers handle some things differently (tab handling in header folding, for example), which may cause confusion. In the interest of reducing the amount of somewhat complicated code to maintain, this series just nukes the homebrew parser in favor of gmime. I did not look into the history of why we have our own parser to begin with; it was more fun to just do some coding. ;)

Patches 1-3 do prep work to fix some of the differences between the parsers in advance. Arguably they are not that bad regardless of the parser change.

Patches 4-5 actually make the change. Having two patches is a somewhat artificial division, but perhaps makes it easier to review.

Patch 6 is just a hack to make the perf tests not ignore so many mails... we have quite a few non-emails in the corpus by gmime parser standards. And this illustrates one of the differences between the parsers.

BR,
Jani.


Austin Clements (1):
  emacs: Sanitize authors and subjects in search and show

Jani Nikula (5):
  cli: sanitize tabs to spaces in notmuch search
  cli: make the hacky from guessing more liberal
  lib: replace the header parser with gmime
  lib: parse messages only once
  HACK: fix broken messages in the perf test corpus

 emacs/notmuch-lib.el              |   6 +
 emacs/notmuch-show.el             |   7 +-
 emacs/notmuch.el                  |   6 +-
 lib/database.cc                   |   6 +-
 lib/index.cc                      |  70 +-------
 lib/message-file.c                | 351 +++++++++++++-------------------------
 lib/message.cc                    |   6 +
 lib/notmuch-private.h             |  19 ++-
 notmuch-reply.c                   |   4 +-
 notmuch-search.c                  |   4 +-
 performance-test/perf-test-lib.sh |   4 +
 11 files changed, 172 insertions(+), 311 deletions(-)

-- 
1.8.4.rc3