Re: [Dovecot] using dsync to convert mailboxes looses caching options
On 10.12.2011, at 13.32, Mark Zealey wrote:

> 10-12-2011 13:07, Timo Sirainen wrote:
>>> It could well be because of the conversion to sdbox then - the ctime/mtime
>>> of the files are not being preserved by dsync (in stock 2.0.16). The
>>> date.saved timestamp is only put into the cache on the second dsync run;
>>> presumably therefore it picks it up from the filesystem.
>> With sdbox the file's mtime isn't even tried to be preserved. The
>> received-time and saved-time are written to the metadata block inside the
>> file.
>
> Ah yes; I saw the R metadata but not the C header key.

The C is the file's create time. It's not actually used for anything.

> Looking deeper at this I think I was expecting the date.save time to be about
> the same as the date.receive; however the ctime for these files is quite
> recent presumably affected by setting of message flags in a maildir or
> something (we're using nfs).

Yes, maildir flag changes change the ctime, which also changes the save
date if it's not already cached.

> so ctime/sdbox C entry are close enough by my calculations (not sure where
> the 61 seconds of difference comes from though). It is a bit strange you
> wouldn't use the source cache's value for date.save if it is available as
> ctime can be pretty unreliable?

It is using the cached value.

Anyway, I remembered wrong how sdbox's save date is looked up. It's taken
from the sdbox file's ctime. The reason is similar to maildir: the save date
is used mainly to figure out when to automatically expunge messages from
Trash after they've been there for n days. So if you copy a 1 year old
message to Trash, you don't want it expunged immediately (based on mtime or
some metadata inside the file), you want it expunged n days after the move.
And ctime is really the only nice way to do it automatically, because
copying a message with sdbox is done with hard linking. mdbox stores the
save date in the index file. sdbox could do it too, but that's just extra
work and probably not worth the trouble. And unlike atime/mtime, ctime can't
be changed using any syscalls (except to current time).

So, I think everything here works as intended, although not really as
expected. :)
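The ctime reasoning above can be tried out with a small Python sketch. It is only an illustration on a POSIX filesystem, not Dovecot code: the file names, the `demo_hardlink_ctime` and `should_expunge` helpers and the 30-day figure are all made up for the example. It backdates a file's mtime by a year, hard-links it (standing in for an sdbox copy to Trash), and shows that the ctime still reflects the time of the link.

```python
import os
import tempfile
import time

ONE_YEAR = 365 * 24 * 3600


def demo_hardlink_ctime(directory):
    """Backdate a file, hard-link it, and return (mtime, ctime_before, ctime_after)."""
    src = os.path.join(directory, "u.1")        # illustrative file name
    dst = os.path.join(directory, "trash-u.1")  # illustrative file name
    with open(src, "w") as f:
        f.write("Subject: old mail\n\nbody\n")
    old = time.time() - ONE_YEAR
    os.utime(src, (old, old))            # atime/mtime can be set to any value
    ctime_before = os.stat(src).st_ctime
    os.link(src, dst)                    # "copy" by hard link
    st = os.stat(dst)                    # same inode, so same timestamps as src
    return st.st_mtime, ctime_before, st.st_ctime


def should_expunge(save_time, now, days):
    """Hypothetical Trash rule: expunge once the mail has been saved for `days` days."""
    return now - save_time >= days * 86400


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        mtime, before, after = demo_hardlink_ctime(d)
    now = time.time()
    print("expunge if save date = mtime:", should_expunge(mtime, now, 30))
    print("expunge if save date = ctime:", should_expunge(after, now, 30))
```

With mtime as the save date the year-old message would be expunged straight away; with ctime it would stay until n days after the link was made.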
Re: [Dovecot] using dsync to convert mailboxes looses caching options
10-12-2011 13:07, Timo Sirainen wrote:
>> It could well be because of the conversion to sdbox then - the ctime/mtime
>> of the files are not being preserved by dsync (in stock 2.0.16). The
>> date.saved timestamp is only put into the cache on the second dsync run;
>> presumably therefore it picks it up from the filesystem.
> With sdbox the file's mtime isn't even tried to be preserved. The
> received-time and saved-time are written to the metadata block inside the
> file.

Ah yes; I saw the R metadata but not the C header key.

Looking deeper at this I think I was expecting the date.save time to be
about the same as the date.receive; however the ctime for these files is
quite recent, presumably affected by setting of message flags in a maildir
or something (we're using NFS).

The source cache says:
- date.received: 1301978447 (4f9d9a4d)
- date.save: 1322465550 (0e39d34e)

The message file itself has mtime 1301978447 and ctime 1323514077; and in
the sdbox header/metadata we have:

C4ee3391a
R4d9a9d4f

so the ctime and the sdbox C entry are close enough by my calculations (not
sure where the 61 seconds of difference comes from though). It is a bit
strange you wouldn't use the source cache's value for date.save if it is
available, as ctime can be pretty unreliable?

Mark
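The numbers in this message can be cross-checked with a few lines of Python. This is a sketch under one assumption that the thread does not state: the hex in parentheses after the cache values is read here as the raw little-endian bytes of a 32-bit timestamp, while the value after the R/C key in the sdbox metadata is read as an ordinary hex number. The function names are made up for the example.

```python
import struct


def decode_sdbox_hex(value):
    """Read the hex digits after an sdbox metadata key (R or C) as a plain number."""
    return int(value, 16)


def decode_cache_dump_hex(value):
    """Read the hex shown next to a cache value, assuming little-endian byte order."""
    return struct.unpack("<I", bytes.fromhex(value))[0]


if __name__ == "__main__":
    print(decode_sdbox_hex("4d9a9d4f"))       # R entry
    print(decode_cache_dump_hex("4f9d9a4d"))  # date.received in the cache
    print(decode_cache_dump_hex("0e39d34e"))  # date.save in the cache
    print(decode_sdbox_hex("4ee3391a") - 1323514077)  # C entry minus file ctime
```

Under that reading the R entry and the cached date.received both come out as 1301978447, the cached date.save as 1322465550, and the C entry is 61 seconds later than the file's ctime, matching the figures quoted above.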
Re: [Dovecot] using dsync to convert mailboxes looses caching options
On 10.12.2011, at 13.03, Mark Zealey wrote:

> Ah-ha it's doing the same in 2.0.16 - looking deeper it's because I haven't
> accessed the tmp fields in a week or two so I guess the decision has been
> taken not to migrate them.

Yes, most likely the reason. Could this also explain the date.saved?

> It could well be because of the conversion to sdbox then - the ctime/mtime of
> the files are not being preserved by dsync (in stock 2.0.16). The date.saved
> timestamp is only put into the cache on the second dsync run; presumably
> therefore it picks it up from the filesystem.

With sdbox the file's mtime isn't even tried to be preserved. The
received-time and saved-time are written to the metadata block inside the
file.
Re: [Dovecot] using dsync to convert mailboxes looses caching options
10-12-2011 08:28, Timo Sirainen wrote:
> On Thu, 2011-12-08 at 14:45 +, Mark Zealey wrote:
>> With 2.0.16 hdr.xxx fields get copied fine (but of course without
>> timestamp). With the patch you provided they don't get copied whether
>> using mirror or backup & starting from scratch. I'm doing a Maildir to
>> sdbox migration otherwise don't think I'm doing anything strange.
> Show the whole list of cache decisions in source and destination?

Ah-ha it's doing the same in 2.0.16 - looking deeper it's because I haven't
accessed the tmp fields in a week or two so I guess the decision has been
taken not to migrate them.

Mark
Re: [Dovecot] using dsync to convert mailboxes looses caching options
10-12-2011 08:27, Timo Sirainen wrote:
> On Thu, 2011-12-08 at 16:10 +, Mark Zealey wrote:
>> By the way, another bug I noticed with dsync is that when converting from
>> Maildir to sdbox the date.saved field is not preserved - it's just the
>> time when the first dsync command happened. Presumably it should be the
>> mtime of the Maildir message file
> With Maildir the date.saved is taken from the mail file's ctime (yes, it's
> not perfect, but it's good enough for what it's used for). It's preserved
> in my tests.

It could well be because of the conversion to sdbox then - the ctime/mtime
of the files are not being preserved by dsync (in stock 2.0.16). The
date.saved timestamp is only put into the cache on the second dsync run;
presumably therefore it picks it up from the filesystem.

Mark
Re: [Dovecot] using dsync to convert mailboxes looses caching options
On Thu, 2011-12-08 at 14:45 +, Mark Zealey wrote:

> With 2.0.16 hdr.xxx fields get copied fine (but of course without timestamp).
> With the patch you provided they don't get copied whether using mirror or
> backup & starting from scratch. I'm doing a Maildir to sdbox migration
> otherwise don't think I'm doing anything strange.

Show the whole list of cache decisions in source and destination?
Re: [Dovecot] using dsync to convert mailboxes looses caching options
On Thu, 2011-12-08 at 16:10 +, Mark Zealey wrote:

> By the way, another bug I noticed with dsync is that when converting from
> Maildir to sdbox the date.saved field is not preserved - it's just
> the time when the first dsync command happened. Presumably it should be the
> mtime of the Maildir message file

With Maildir the date.saved is taken from the mail file's ctime (yes, it's
not perfect, but it's good enough for what it's used for). It's preserved in
my tests.
Re: [Dovecot] using dsync to convert mailboxes looses caching options
By the way, another bug I noticed with dsync is that when converting from
Maildir to sdbox the date.saved field is not preserved - it's just the time
when the first dsync command happened. Presumably it should be the mtime of
the Maildir message file.

Mark
Re: [Dovecot] using dsync to convert mailboxes looses caching options
With 2.0.16 hdr.xxx fields get copied fine (but of course without
timestamp). With the patch you provided they don't get copied whether using
mirror or backup & starting from scratch. I'm doing a Maildir to sdbox
migration; otherwise I don't think I'm doing anything strange.

Mark

From: Mark Zealey
Sent: 08 December 2011 09:35
To: Timo Sirainen
Cc: Dovecot Mailing List
Subject: RE: [Dovecot] using dsync to convert mailboxes looses caching options

OK I'll test the header copying more fully. The reason we want to preserve
caching decisions is to avoid an IO storm when users log in to their
mailboxes after an sdbox upgrade so it would be great to be able to have
some way to warm caches.

Mark
Re: [Dovecot] using dsync to convert mailboxes looses caching options
OK I'll test the header copying more fully. The reason we want to preserve
caching decisions is to avoid an IO storm when users log in to their
mailboxes after an sdbox upgrade so it would be great to be able to have
some way to warm caches.

Mark

From: Timo Sirainen [t...@iki.fi]
Sent: 08 December 2011 09:27
To: Mark Zealey
Cc: Dovecot Mailing List
Subject: RE: [Dovecot] using dsync to convert mailboxes looses caching options

On Thu, 2011-12-08 at 09:19 +, Mark Zealey wrote:

> OK now it's copying the timestamp fields for tmp ones. However:
>
> 1) hdr.* fields are not being copied at all (unlike in previous releases)

They are in my tests.. This also happens if the destination doesn't exist?

> 2) although the decisions are now being recorded; the items are not actually
> being put into the cache for previously sync'd mails. New mails are having
> all the cache information produced however.

This is intentional. Doing anything else would be horribly inefficient. Note
that dsync isn't *copying* cached data. It's simply setting the caching
decisions, and the mail saving code parses the mails and updates cache.

> Perhaps this should be activated by a new option to dsync; if people are
> using this for backup (rather than conversion) caches could get relatively
> large?

Hm. Maybe..
Re: [Dovecot] using dsync to convert mailboxes looses caching options
On Thu, 2011-12-08 at 09:19 +, Mark Zealey wrote:

> OK now it's copying the timestamp fields for tmp ones. However:
>
> 1) hdr.* fields are not being copied at all (unlike in previous releases)

They are in my tests.. This also happens if the destination doesn't exist?

> 2) although the decisions are now being recorded; the items are not actually
> being put into the cache for previously sync'd mails. New mails are having
> all the cache information produced however.

This is intentional. Doing anything else would be horribly inefficient. Note
that dsync isn't *copying* cached data. It's simply setting the caching
decisions, and the mail saving code parses the mails and updates cache.

> Perhaps this should be activated by a new option to dsync; if people are
> using this for backup (rather than conversion) caches could get relatively
> large?

Hm. Maybe..
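The distinction between copying cached data and setting caching decisions can be sketched in Python. This is a toy model, not Dovecot's implementation: the decision names follow the yes/tmp/no wording used in the thread (spelled "temp" here), and `fields_to_cache`, `save_mail` and the list-based cache are invented for the example. The decisions travel with the mailbox; the cache content is produced only when a mail is parsed while being saved.

```python
from email import message_from_string


def fields_to_cache(decisions):
    """Names of the fields whose caching decision is 'yes' or 'temp'."""
    return [name for name, d in decisions.items() if d in ("yes", "temp")]


def save_mail(raw_mail, decisions, cache):
    """Parse one mail while saving it and cache only the wanted hdr.* fields."""
    msg = message_from_string(raw_mail)
    record = {}
    for name in fields_to_cache(decisions):
        if name.startswith("hdr."):
            value = msg.get(name[4:])  # header lookup is case-insensitive
            if value is not None:
                record[name] = value
    cache.append(record)
    return record


if __name__ == "__main__":
    decisions = {"hdr.subject": "temp", "hdr.from": "no", "hdr.message-id": "yes"}
    raw = "Subject: hi\nFrom: a@example.com\nMessage-ID: <1@example.com>\n\nbody\n"
    print(save_mail(raw, decisions, []))
```

In this model a mail saved before the decisions were set has no cache record until something parses it again, which is the behaviour described in point 2 above.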
Re: [Dovecot] using dsync to convert mailboxes looses caching options
OK now it's copying the timestamp fields for tmp ones. However:

1) hdr.* fields are not being copied at all (unlike in previous releases)

2) although the decisions are now being recorded; the items are not
   actually being put into the cache for previously sync'd mails. New mails
   are having all the cache information produced however.

Note: this is only when using the -f option to dsync; when not using -f it
doesn't even get round to generating a cache so no fields are put there.

Perhaps this should be activated by a new option to dsync; if people are
using this for backup (rather than conversion) caches could get relatively
large?

Mark

From: Timo Sirainen [t...@iki.fi]
Sent: 08 December 2011 07:33
To: Dovecot Mailing List
Cc: Mark Zealey
Subject: Re: [Dovecot] using dsync to convert mailboxes looses caching options

On Thu, 2011-12-08 at 07:53 +0200, Timo Sirainen wrote:

> But yes, it is a problem that dsync doesn't update caching decisions..
> Hmm. I guess I'll have to fix that for v2.1.

Could you try if the attached patch fixes your problems when patching
against latest v2.1 hg? It's annoyingly large, and it makes v2.1 dsync
incompatible with v2.0, but maybe it's better to do it sooner than later..
Re: [Dovecot] using dsync to convert mailboxes looses caching options
On Thu, 2011-12-08 at 07:53 +0200, Timo Sirainen wrote:

> But yes, it is a problem that dsync doesn't update caching decisions..
> Hmm. I guess I'll have to fix that for v2.1.

Could you try if the attached patch fixes your problems when patching
against latest v2.1 hg? It's annoyingly large, and it makes v2.1 dsync
incompatible with v2.0, but maybe it's better to do it sooner than later..

diff -r ddfe3a0f75e6 src/doveadm/doveadm-dump-index.c
--- a/src/doveadm/doveadm-dump-index.c	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/doveadm/doveadm-dump-index.c	Thu Dec 08 09:32:03 2011 +0200
@@ -9,7 +9,6 @@
 #include "message-part-serialize.h"
 #include "mail-index-private.h"
 #include "mail-cache-private.h"
-#include "mail-cache-private.h"
 #include "mail-index-modseq.h"
 #include "doveadm-dump.h"
 
@@ -344,7 +343,7 @@
 			printf(" - ");
 		printf("%-4s %.16s\n",
 		       cache_decision2str(field->decision),
-		       unixdate2str(cache->fields[cache_idx].last_used));
+		       unixdate2str(field->last_used));
 	}
 }
 
diff -r ddfe3a0f75e6 src/dsync/dsync-data.c
--- a/src/dsync/dsync-data.c	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/dsync/dsync-data.c	Thu Dec 08 09:32:03 2011 +0200
@@ -10,7 +10,8 @@
 dsync_mailbox_dup(pool_t pool, const struct dsync_mailbox *box)
 {
 	struct dsync_mailbox *dest;
-	const char *const *cache_fields = NULL, *dup;
+	const struct mailbox_cache_field *cache_fields = NULL;
+	struct mailbox_cache_field *dup;
 	unsigned int i, count = 0;
 
 	dest = p_new(pool, struct dsync_mailbox, 1);
@@ -24,8 +25,9 @@
 	else {
 		p_array_init(&dest->cache_fields, pool, count);
 		for (i = 0; i < count; i++) {
-			dup = p_strdup(pool, cache_fields[i]);
-			array_append(&dest->cache_fields, &dup, 1);
+			dup = array_append_space(&dest->cache_fields);
+			*dup = cache_fields[i];
+			dup->name = p_strdup(pool, dup->name);
 		}
 	}
 	return dest;
diff -r ddfe3a0f75e6 src/dsync/dsync-data.h
--- a/src/dsync/dsync-data.h	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/dsync/dsync-data.h	Thu Dec 08 09:32:03 2011 +0200
@@ -28,7 +28,7 @@
 	   otherwise it's the last rename timestamp. */
 	time_t last_change;
 	enum dsync_mailbox_flags flags;
-	ARRAY_TYPE(const_string) cache_fields;
+	ARRAY_TYPE(mailbox_cache_field) cache_fields;
 };
 ARRAY_DEFINE_TYPE(dsync_mailbox, struct dsync_mailbox *);
 #define dsync_mailbox_is_noselect(dsync_box) \
diff -r ddfe3a0f75e6 src/dsync/dsync-proxy-client.c
--- a/src/dsync/dsync-proxy-client.c	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/dsync/dsync-proxy-client.c	Thu Dec 08 09:32:03 2011 +0200
@@ -893,7 +893,7 @@
 static void
 proxy_client_worker_select_mailbox(struct dsync_worker *_worker,
 				   const mailbox_guid_t *mailbox,
-				   const ARRAY_TYPE(const_string) *cache_fields)
+				   const ARRAY_TYPE(mailbox_cache_field) *cache_fields)
 {
 	struct proxy_client_dsync_worker *worker =
 		(struct proxy_client_dsync_worker *)_worker;
@@ -908,7 +908,7 @@
 		str_append(str, "BOX-SELECT\t");
 		dsync_proxy_mailbox_guid_export(str, mailbox);
 		if (cache_fields != NULL)
-			dsync_proxy_strings_export(str, cache_fields);
+			dsync_proxy_cache_fields_export(str, cache_fields);
 		str_append_c(str, '\n');
 		proxy_client_worker_cmd(worker, str);
 	} T_END;
diff -r ddfe3a0f75e6 src/dsync/dsync-proxy-server-cmd.c
--- a/src/dsync/dsync-proxy-server-cmd.c	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/dsync/dsync-proxy-server-cmd.c	Thu Dec 08 09:32:03 2011 +0200
@@ -315,7 +315,7 @@
 cmd_box_select(struct dsync_proxy_server *server, const char *const *args)
 {
 	struct dsync_mailbox box;
-	unsigned int i, count;
+	const char *error;
 
 	memset(&box, 0, sizeof(box));
 	if (args[0] == NULL ||
@@ -325,10 +325,11 @@
 	}
 	args++;
 
-	count = str_array_length(args);
-	t_array_init(&box.cache_fields, count + 1);
-	for (i = 0; i < count; i++)
-		array_append(&box.cache_fields, &args[i], 1);
+	if (dsync_proxy_cache_fields_import(args, pool_datastack_create(),
+					    &box.cache_fields, &error) < 0) {
+		i_error("box-select: %s", error);
+		return -1;
+	}
 	dsync_worker_select_mailbox(server->worker, &box);
 	return 1;
 }
diff -r ddfe3a0f75e6 src/dsync/dsync-proxy.c
--- a/src/dsync/dsync-proxy.c	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/dsync/dsync-proxy.c	Thu Dec 08 09:32:03 2011 +0200
@@ -8,27 +8,104 @@
 #include "hex-binary.h"
 #include "mail-types.h"
 #include "imap-util.h"
+#include "mail-cache.h"
 #include "dsync-data.h"
 #include "dsync-proxy.h"
 
 #include
 
-void dsync_proxy_strings_export(string_t *str,
-				const ARRAY_TYPE(const_string) *strings)
+#define DSYNC_CACHE_DECISION_NO 'n'
+#define DSYNC_CACHE_DECISION_YES 'y'
+#define DSYNC_CACHE_DECISION_TEMP 't'
+#define DSYNC_CACHE_DECISION_FORCED 'f'
+
+void dsync_proxy_cache_fields_export(string_t *str,
+				     const ARRAY_TYPE(mailbox_cache_field) *_fields)
 {
-	const char *const *fields;
+	const struct mailbox_cache_field *fields;
 	unsigned int i, count;
 
-	if (!array_is_created(strings))
+	if (!array_is_created(_fields))
 		return;
 
-	fields = arra
Re: [Dovecot] using dsync to convert mailboxes looses caching options
Apologies for top-posting but I can't figure out how to make this client do
inline...

I am seeing on the first run (we are using 'backup') we don't get any of the
cache copied, just the index files created. On the second run (i.e. when
dest exists), a cache file is created and populated with the bits that are
required for the sync, presumably - guid. As you say the yes/tmp caching
decisions are copied over (and visible in the cache file) but because the
last used date is not copied, these fields are not activated for any of the
messages so none of their data actually gets cached. I'm not seeing a
compression at the end as the tmp etc fields are still there (mostly don't
have any yes fields in our source caches) but as I say, because they don't
have a last used date, none of them are ever actually used until the
client requests them via pop/imap.

Mark

From: Timo Sirainen [t...@iki.fi]
Sent: 08 December 2011 05:53
To: Mark Zealey
Cc: Dovecot Mailing List
Subject: Re: [Dovecot] using dsync to convert mailboxes looses caching options

On Sat, 2011-11-26 at 18:33 +0200, Mark Zealey wrote:

> We're trying to convert users from Maildir to sdbox at present; I'm
> using dsync to achieve this (2.0.16) however when the user's have been
> converted we only get minimal information in the caching files. Is there
> some way to preserve all the caching decisions that were previously made
> so that when the user logs in to the new mailbox we don't have to cause
> an io storm rebuilding the cache that we know was good? Dovecot seems to
> be partially doing this - if i remove the logs/cache from the source
> mailbox no cache files are built in the conversion; if i put them back
> then we get a cache file built but it only contains a few bits of
> information (guid, date.save). Looking into this a bit further i find
> that when the caches are present at source the fields are preserved but
> the 'last used' date and caching decisions are not which I suspect means
> dsync doesn't bother caching on import - only fields with a yes decision
> in the source are copied (but their decision is only copied as a tmp
> with the date of import). For example:

How are you calling dsync? Does the destination already exist? I tried with:

rm -rf /tmp/foo; dsync -u tss -m INBOX mirror sdbox:/tmp/foo

It sets all of the cache fields with "yes" or "tmp" decision, as it should.
But yes, the "last used" field should probably be copied as well. Perhaps
the problem with you is that dsync actually writes all of the cache fields,
but then it does a "cache compression" at the end, which sees that the "last
used" fields are so old, so it deletes them.

But yes, it is a problem that dsync doesn't update caching decisions..
Hmm. I guess I'll have to fix that for v2.1.
Re: [Dovecot] using dsync to convert mailboxes looses caching options
On Sat, 2011-11-26 at 18:33 +0200, Mark Zealey wrote:

> We're trying to convert users from Maildir to sdbox at present; I'm
> using dsync to achieve this (2.0.16) however when the user's have been
> converted we only get minimal information in the caching files. Is there
> some way to preserve all the caching decisions that were previously made
> so that when the user logs in to the new mailbox we don't have to cause
> an io storm rebuilding the cache that we know was good? Dovecot seems to
> be partially doing this - if i remove the logs/cache from the source
> mailbox no cache files are built in the conversion; if i put them back
> then we get a cache file built but it only contains a few bits of
> information (guid, date.save). Looking into this a bit further i find
> that when the caches are present at source the fields are preserved but
> the 'last used' date and caching decisions are not which I suspect means
> dsync doesn't bother caching on import - only fields with a yes decision
> in the source are copied (but their decision is only copied as a tmp
> with the date of import). For example:

How are you calling dsync? Does the destination already exist? I tried with:

rm -rf /tmp/foo; dsync -u tss -m INBOX mirror sdbox:/tmp/foo

It sets all of the cache fields with "yes" or "tmp" decision, as it should.
But yes, the "last used" field should probably be copied as well. Perhaps
the problem with you is that dsync actually writes all of the cache fields,
but then it does a "cache compression" at the end, which sees that the "last
used" fields are so old, so it deletes them.

But yes, it is a problem that dsync doesn't update caching decisions..
Hmm. I guess I'll have to fix that for v2.1.
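The suspected interaction between "last used" and cache compression can be modelled in a few lines of Python. This is a guess at the shape of the logic, not Dovecot's code: the `CacheField` record, the `compress` function and the 30-day cutoff are all assumptions made for the sketch. Fields whose "last used" time is older than the cutoff are dropped, so a field that arrives with a decision but no "last used" time does not survive.

```python
from dataclasses import dataclass


@dataclass
class CacheField:
    name: str
    decision: str   # "yes", "temp" or "no"
    last_used: int  # Unix timestamp; 0 if it was never copied over


def compress(fields, now, max_unused_days=30):
    """Keep only wanted fields that were used within the assumed cutoff."""
    cutoff = now - max_unused_days * 86400
    return [f for f in fields if f.decision != "no" and f.last_used >= cutoff]


if __name__ == "__main__":
    now = 1323514077
    fields = [
        CacheField("hdr.subject", "temp", now - 86400),  # used yesterday
        CacheField("hdr.from", "temp", 0),               # decision copied, date lost
    ]
    print([f.name for f in compress(fields, now)])
```

Only hdr.subject survives in this toy run; hdr.from has a valid decision but is dropped because its "last used" time was not carried across.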