Re: Gnu sieve vs Dovecot sieve-filter - sieve-filter extremely slow at lda (writing emails to local mbox files)
Oh, one last bit for now regarding piping. Given my current sieve-filter command:

  MLOC="mail_location=mbox:~/mail:INBOX=~/mail/Inbox:INDEX=:UTF-8:VOLATILEDIR=/tmp/dovecot-volatile/%2.256Nu/%u:SUBSCRIPTIONS=dovecot_subscriptions"
  SCRIPT=~/etc/email/sieve.rc
  sieve-filter -veWD -c $SIEVE_CONF -o $MLOC $SCRIPT emails-incoming

I can imagine trying a pipe as suggested, as follows:

  cat ~/mail/emails-incoming | sieve-filter -veW -c $SIEVE_CONF -o $MLOC $SCRIPT

But I see no suggestion in the sieve-filter man page that this would work.

ISTM that sieve-filter is just not designed to work in a local mbox email environment.
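One way to read the "pipe the messages" suggestion is that each message is fed separately to a delivery program on stdin, rather than the whole mbox being piped in one go. A minimal sketch of that idea follows, using Python's standard mailbox module. The function name pipe_each_message is made up for illustration, and the delivery command shown in the comment (dovecot-lda at its usual Debian path, with -c for the config file) is an assumption that would need checking against the local install.

```python
import mailbox
import subprocess
import sys


def pipe_each_message(mbox_path, command):
    """Run `command` once per message in the mbox, with the message on stdin.

    `command` is an argument list, e.g. (an assumption, not tested here):
        ["/usr/lib/dovecot/dovecot-lda", "-c", "/path/to/config.conf"]
    Returns the list of exit codes, one per message.
    """
    exit_codes = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for message in box:
            # as_bytes() omits the mbox "From " separator line, so the
            # command receives a plain RFC 5322 message.
            raw = message.as_bytes()
            result = subprocess.run(command, input=raw)
            exit_codes.append(result.returncode)
    finally:
        box.close()
    return exit_codes
```

This starts one process per message, so it is unlikely to be faster than a single batch run; it only shows what per-message piping would look like. The source mbox is left untouched, so successfully delivered messages would still have to be removed separately.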
Re: Gnu sieve vs Dovecot sieve-filter - sieve-filter extremely slow at lda (writing emails to local mbox files)
(I did subscribe to this mailing list, albeit with zen at freedbms.net, so either way I'm getting all your emails - thank you -so- much for replying...)

My MUA is mutt, reading email in a terminal (sorry, forgot to mention this before). Over many years my email folder (mbox files) collection has grown to many GiB, mostly mailing lists.

If I am to change email storage format, it should be mutt compatible; looking at https://wiki2.dovecot.org/MailboxFormat I see that only DJB's Maildir is compatible with both Dovecot ("a reliable choice" says the wiki) and mutt.

I can imagine that sdbox or mdbox could be made "mutt compatible", so to speak, by running some sort of local IMAP server and accessing my email from mutt that way. This is undesirable to my mind because it would mean:

 1) a new learning curve wrt mutt and reading email on IMAP servers
 2) a new learning curve to set up a local IMAP server (securely)
 3) the inability to use mutt without a local IMAP server to read my local email

But such a setup would also have some quite desirable benefits:

 1) once set up, multiple MUAs could be used, and I'd have a beginning grasp on setting up an IMAP server and front ends (this is something on my bucket list, to assist my local church with)
 2) simpler remote "online" access to my local "offline" email store (e.g. using my mobile phone when on the road) by setting up a webmail server (much simpler (read "possible" to use on a mobile phone) than using a vpn and mutt...), thus freeing me up from the behemoth web email providers...

Next, I do not know how to "pipe the messages to the dovecot lda".
After downloading from my POP3 provider into a local mbox file (this is my step 1), I then sort the emails (this is my step 2). The following should be on a single line:

  /usr/bin/sieve-filter -veW -c $HOME/etc/email/sieve-dovecot-config.conf -o mail_location=mbox:~/mail:INBOX=~/mail/Inbox:INDEX=:UTF-8:VOLATILEDIR=/tmp/dovecot-volatile/%2.256Nu/%u:SUBSCRIPTIONS=dovecot_subscriptions ~/etc/email/sieve.rc email-incoming-unsorted

As you can see from the above command, sieve-filter is given the name of the mbox ("mail folder") to sort as its very last argument on the command line - so in this instance, sieve-filter really has no excuse, and should not be re-reading the sieve rules script for each email. Now perhaps that's not happening; I only made an assumption because of a CPU hitting 100% for a minute or two just to process a few hundred emails...

What could also be happening (again, an assumption) is that sieve-filter is written to assume that dovecot index files exist. I disabled those with the "INDEX=" clause you see in the command above, which has deliberately been given no value.

The reason I figured out how to disable the creation of the indexes in the .imap directories is that, for my setup, Gnu sieve has proven that I should not need such indexes - with mbox files, just append each email to the end of the target "mailbox folder" mbox file, and we're done! This literally should not cost 100% CPU, even for one millisecond!

But more importantly, because my working email folder is ~30GiB, without disabling this index creation step, sieve-filter forced the creation of indexes, which "took so long I gave up and hit CTRL-C, which did not work, so I kill -9'ed the sieve-filter and whatever other process was not stopping".

Last year someone on debian-user recommended I upgrade to using Dovecot/Pigeonhole's sieve-filter (rather than Gnu sieve) due to the issues with Gnu sieve.
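To make the "just append each email to the end of the target mbox file" idea concrete, here is a rough sketch using Python's standard mailbox module. It is not a sieve implementation: the function name sort_mbox and the rule format (a substring of the List-Id header mapped to a target file) are hypothetical stand-ins for real sieve rules, and it assumes nothing else is writing to the mbox files at the same time (no locking is done).

```python
import mailbox


def sort_mbox(source_path, rules, default_path):
    """Append each message in source_path to a target mbox chosen by List-Id.

    rules: list of (substring, target_mbox_path) pairs; first match wins.
    Messages matching no rule go to default_path.
    Returns a dict mapping target path to the number of messages appended.
    """
    counts = {}
    targets = {}
    source = mailbox.mbox(source_path, create=False)
    try:
        for message in source:
            list_id = str(message.get("List-Id", "")).lower()
            target_path = default_path
            for substring, path in rules:
                if substring.lower() in list_id:
                    target_path = path
                    break
            # Open each target mbox once and reuse it for the whole batch.
            if target_path not in targets:
                targets[target_path] = mailbox.mbox(target_path)
            targets[target_path].add(message)
            counts[target_path] = counts.get(target_path, 0) + 1
    finally:
        # Sync and close the targets once, at the end of the batch.
        for box in targets.values():
            box.flush()
            box.close()
        source.close()
    return counts
```

One caveat: as far as I know, the mailbox module scans a target file once when it is first used, so with multi-GiB folders a plain file append would be cheaper than this. The sketch also only copies; it does not expunge messages from the source mbox.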
I am starting to think that I should perhaps try to figure out if it's possible to (re)process the emails Gnu sieve has a problem with, to massage them into a shape that Gnu sieve accepts - then my immediate problem would certainly be solved...

Thank you all again.

Zenaan
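A possible shape for that "massaging" step is sketched below. It assumes the header Gnu sieve objects to is a missing Message-ID, which is only a guess; the real cause of the "malformed" errors has not been confirmed. The function name add_missing_message_ids and the default domain are made up for illustration.

```python
import mailbox
from email.utils import make_msgid


def add_missing_message_ids(source_path, dest_path, domain="localhost"):
    """Copy source mbox to dest mbox, adding a Message-ID header where absent.

    Returns the number of messages that were given a new Message-ID.
    """
    fixed = 0
    source = mailbox.mbox(source_path, create=False)
    dest = mailbox.mbox(dest_path)
    try:
        for message in source:
            if not str(message.get("Message-ID", "")).strip():
                # Drop an empty header if there is one, then add a generated ID.
                del message["Message-ID"]
                message["Message-ID"] = make_msgid(domain=domain)
                fixed += 1
            dest.add(message)
    finally:
        dest.flush()
        dest.close()
        source.close()
    return fixed
```

The output goes to a separate file so the original download stays intact; the sieve run would then be pointed at the rewritten mbox instead.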
Re: Gnu sieve vs Dovecot sieve-filter - sieve-filter extremely slow at lda (writing emails to local mbox files)
I am wondering why sieve-filter is so slow compared to Gnu sieve.

I run mpop (like getmail) to download from a POP3 server to a local mbox file:

  ~/mail/email-incoming-unsorted

This step is very fast.

In the next step, I throw the email-incoming-unsorted mbox file at a sieve processor, to sort the emails from that mbox into other mboxes, according to the sieve rules file.

Up until a couple of days ago I was using Gnu sieve. Gnu sieve balks on emails which have no x-message-id (?? something like this) header field, so after a few years, I finally decided to switch "up" to Dovecot/Pigeonhole's "sieve-filter" command.

Using Gnu sieve, this mbox sorting step was even faster than mpop (/ getmail) - and mpop and getmail are really fast (compared with fetchmail), since they pipeline the email downloads. Even with 100s of emails, Gnu sieve would take only 10 to 20 seconds at most. Super fast.

Using sieve-filter, all emails are being processed - including those without "message id header". This is good. But sieve-filter is also much slower - slower than the download step by an order of magnitude or two.

See below for details, any ideas appreciated.

To add to the below, I added:

  mbox_very_dirty_syncs = yes

to the sieve-filter config, which slightly improves performance, but not by much (in comparison with Gnu sieve).

TIA,

- Forwarded message from Zenaan Harkness -

From: Zenaan Harkness
To: debian-u...@lists.debian.org
Date: Thu, 12 Sep 2019 08:06:12 +1000
Subject: Re: Gnu sieve vs Dovecot sieve-filter - sieve-filter extremely slow at lda (writing emails to local mbox files)

On Thu, Sep 12, 2019 at 07:55:23AM +1000, Zenaan Harkness wrote:
> Why is Gnu sieve so extremely fast to batch process an mbox file, but
> while Dovecot's sieve-filter is an order of magnitude slower?
>
> Sequence:
>
> - mpop or getmail to pipeline download emails into temp mbox file
> - filter that file
>
> Gnu sieve just flies through a local mbox file and saving emails to
> other local mbox files.
>
> Gnu sieve rejects too many emails with "malformed" errors, so after a
> few years I bit the bullet and upgraded to Dovecot's sieve-filter.
>
> Dovecot's sieve-filter, at present, is an order of magnitude slower.
>
> Here's my filter command (one line):
>
> /usr/bin/sieve-filter -veW -c $HOME/etc/email/sieve-dovecot-config.conf -o
> mail_location=mbox:~/mail:INBOX=~/mail/Inbox:INDEX=:UTF-8:VOLATILEDIR=/tmp/dovecot-volatile/%2.256Nu/%u:SUBSCRIPTIONS=dovecot_subscriptions
> ~/etc/email/sieve.rc email-incoming-unsorted
>
> The sieve script is fine now that I have the correct "require"
> clauses (hint: "capability strings").
>
> File ~/etc/email/sieve-dovecot-config.conf:
>
> protocols = pop
> lda_mailbox_autocreate = yes
> lda_mailbox_autosubscribe = yes
> mail_fsync = never
>
> There's no re-sending of emails into my local Postfix SMTP server - I
> checked the system logs and confirmed this (journalctl -f).
>
> I suspect that Gnu sieve was directly writing each email to the
> appropriate sieve-determined mbox file (perhaps with only a sync at
> the end of a single batch process - what I've attempted to achieve
> above with sieve-filter), and that sieve-filter is instead passing
> each email through some (dovecot) lda?
>
> Here's the output for a sieve-filter batch processing of 11 emails:
>
> $ /usr/bin/sieve-filter -veW -c /home/zen/etc/email/sieve-dovecot-config.conf
> -o
> mail_location=mbox:/home/zen/mail:INBOX=/home/zen/mail/Inbox:INDEX=:UTF-8:VOLATILEDIR=/tmp/dovecot-volatile/%2.256Nu/%u:SUBSCRIPTIONS=dovecot_subscriptions
> /home/zen/etc/email/sieve.rc email-incoming-unsorted
> # PS0 Timestamp: 20190912@07:02:23
> info: filtering: [Tue, 3 Sep 2019 05:17:16 -0500; 10240 bytes] `Re:
> VentureBeat: The death of disk? H...'.
> info:
> msgid=:
> stored mail into mailbox 'l/cp/cp'.
> info: message expunged from source mailbox upon successful move.
> info: filtering: [Tue, 3 Sep 2019 07:29:53 -0400; 12968 bytes] `[zfs-devel]
> xattr naming format in Zo...'.
> info: msgid=<15675101930.d5ba2e.12...@composer.zfsonlinux.topicbox.com>:
> stored mail into mailbox 'l/z/zdev'.
> info: message expunged from source mailbox upon successful move.
> info: filtering: [Tue, 03 Sep 2019 15:29:09 +0300; 20461 bytes] `Re:
> [zfs-devel] xattr naming format i...'.
> info: msgid=<23955051567513...@sas1-02732547ccc0.qloud-c.yandex.net>: stored
> mail into mailbox 'l/z/zdev'.
> info: message expunged from source mailbox upon successful move.
> info: filtering: [Tue, 3 Sep 2019 18:20:42 +0530; 18065 bytes] `Re:
> [Gluster-users] Issues with Geo-r...'.
> info:
> msgid=:
> stored mail into mailbox 'l/gl/user'.
> info: message expunged from source mailbox upon successful move.
> info: filtering: [Tue, 3 Sep 2019 09:34:20 -0400; 13342 bytes] `Re: tasksel'.
> info: msgid=<20190903133420.gs6...@eeg.ccf.org>: stored mail into mailbox
> 'l/deb/user'.
> info: message expunged from source mailbox upon successf