Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-10 Thread Mark Zealey

10-12-2011 08:27, Timo Sirainen wrote:

On Thu, 2011-12-08 at 16:10 +, Mark Zealey wrote:

By the way, another bug I noticed with dsync is that when converting from
Maildir to sdbox the date.saved field is not preserved - it's just the
time when the first dsync command happened. Presumably it should be the mtime
of the Maildir message file.

With Maildir the date.saved is taken from the mail file's ctime (yes,
it's not perfect, but it's good enough for what it's used for). It's
preserved in my tests.


It could well be because of the conversion to sdbox then - the 
ctime/mtime of the files are not being preserved by dsync (in stock 
2.0.16). The date.saved timestamp is only put into the cache on the 
second dsync run; presumably therefore it picks it up from the filesystem.


Mark


Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-10 Thread Mark Zealey


10-12-2011 08:28, Timo Sirainen wrote:

On Thu, 2011-12-08 at 14:45 +, Mark Zealey wrote:

With 2.0.16 hdr.xxx fields get copied fine (but of course without timestamp). With
the patch you provided they don't get copied, whether using mirror or backup
starting from scratch. I'm doing a Maildir to sdbox migration; otherwise I don't think
I'm doing anything strange.

Show the whole list of cache decisions in source and destination?


Ah-ha, it's doing the same in 2.0.16 - looking deeper, it's because I
haven't accessed the tmp fields in a week or two, so I guess the decision
has been taken not to migrate them.


Mark


Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-10 Thread Timo Sirainen
On 10.12.2011, at 13.03, Mark Zealey wrote:

 Ah-ha, it's doing the same in 2.0.16 - looking deeper, it's because I haven't
 accessed the tmp fields in a week or two, so I guess the decision has been
 taken not to migrate them.

Yes, most likely the reason. Could this also explain the date.saved?

 It could well be because of the conversion to sdbox then - the ctime/mtime of 
 the files are not being preserved by dsync (in stock 2.0.16). The date.saved 
 timestamp is only put into the cache on the second dsync run; presumably 
 therefore it picks it up from the filesystem.

With sdbox, no attempt is even made to preserve the file's mtime. The received-time
and saved-time are written to the metadata block inside the file.

Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-10 Thread Mark Zealey



10-12-2011 13:07, Timo Sirainen wrote:

It could well be because of the conversion to sdbox then - the ctime/mtime of 
the files are not being preserved by dsync (in stock 2.0.16). The date.saved 
timestamp is only put into the cache on the second dsync run; presumably 
therefore it picks it up from the filesystem.

With sdbox the file's mtime isn't even tried to be preserved. The received-time 
and saved-time are written to the metadata block inside the file.


Ah yes; I saw the R metadata but not the C header key. Looking deeper at
this, I think I was expecting the date.save time to be about the same as
the date.received; however, the ctime for these files is quite recent,
presumably affected by the setting of message flags in a maildir or
something (we're using NFS). The source cache says:


- date.received: 1301978447 (4f9d9a4d)
- date.save: 1322465550 (0e39d34e)

The message file itself has mtime 1301978447 and ctime 1323514077; and 
in the sdbox header/metadata we have:


C4ee3391a
R4d9a9d4f

so ctime/sdbox C entry are close enough by my calculations (not sure 
where the 61 seconds of difference comes from though). It is a bit 
strange you wouldn't use the source cache's value for date.save if it is 
available as ctime can be pretty unreliable?
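
The hex values quoted above can be cross-checked with a small sketch. It assumes, based only on the numbers in this message, that the cache dump prints the timestamp's four bytes in little-endian order while the sdbox R and C keys hold the value as ordinary hex. The function names are invented for illustration and are not part of Dovecot.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse an ordinary hex string, such as the value following the
   sdbox "R" or "C" key. */
uint32_t parse_hex_u32(const char *hex)
{
	assert(hex != NULL);
	return (uint32_t)strtoul(hex, NULL, 16);
}

/* Reverse the byte order of a 32-bit value. */
uint32_t swap_bytes_u32(uint32_t v)
{
	return ((v & 0x000000ffU) << 24) | ((v & 0x0000ff00U) << 8) |
	       ((v & 0x00ff0000U) >> 8) | ((v & 0xff000000U) >> 24);
}

/* Decode a value shown as a raw byte dump (assumed little-endian),
   as in the parenthesised hex of the cache dump above. */
uint32_t parse_cache_dump_u32(const char *hex)
{
	return swap_bytes_u32(parse_hex_u32(hex));
}
```

Under those assumptions R4d9a9d4f decodes to 1301978447, which matches date.received, and C4ee3391a decodes to 1323514138, which is 61 seconds after the file's ctime of 1323514077.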


Mark


Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-10 Thread Timo Sirainen
On 10.12.2011, at 13.32, Mark Zealey wrote:

 10-12-2011 13:07, Timo Sirainen wrote:
 It could well be because of the conversion to sdbox then - the ctime/mtime 
 of the files are not being preserved by dsync (in stock 2.0.16). The 
 date.saved timestamp is only put into the cache on the second dsync run; 
 presumably therefore it picks it up from the filesystem.
 With sdbox the file's mtime isn't even tried to be preserved. The 
 received-time and saved-time are written to the metadata block inside the 
 file.
 
 Ah yes; I saw the R metadata but not the C header key.

The C is the file's create time. It's not actually used for anything.

 Looking deeper at this I think I was expecting the date.save time to be about 
 the same as the date.receive; however the ctime for these files is quite 
 recent presumably affected by setting of message flags in a maildir or 
 something (we're using nfs).

Yes, maildir flag changes change the ctime, which also changes the save date if 
it's not already cached.

 so ctime/sdbox C entry are close enough by my calculations (not sure where 
 the 61 seconds of difference comes from though). It is a bit strange you 
 wouldn't use the source cache's value for date.save if it is available as 
 ctime can be pretty unreliable?


It is using the cached value.

Anyway, I remembered wrong how sdbox's save date is looked up. It's taken from
the sdbox file's ctime. The reason is similar to maildir: the save date is used
mainly to figure out when to automatically expunge messages from Trash after
they've been there for n days. So if you copy a 1-year-old message to Trash, you
don't want it expunged immediately (based on mtime or some metadata inside the
file); you want it expunged n days after the move. And ctime is really the only
nice way to do it automatically, because copying a message with sdbox is done
with hard linking.
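
The Trash reasoning above can be illustrated with a minimal sketch. This is not Dovecot's code: the function name and the keep_days parameter are hypothetical, and it only shows why the choice of timestamp matters for an "expunge after n days" rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

#define SECS_PER_DAY (24 * 60 * 60)

/* Return true if a message whose save date is save_time has been in the
   mailbox for at least keep_days days at time now. */
bool trash_should_expunge(time_t save_time, time_t now, unsigned int keep_days)
{
	assert(now >= save_time);
	return (now - save_time) >= (time_t)keep_days * SECS_PER_DAY;
}
```

Taking the file discussed earlier in the thread as an example (mtime 1301978447, ctime 1323514077) with a 7-day limit: one day after the ctime, passing the mtime as save_time makes the message look months old and it is expunged at once, while passing the ctime keeps it until the full 7 days have passed.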

mdbox stores the save date in the index file. sdbox could do it too, but that's
just extra work and probably not worth the trouble. And unlike atime/mtime,
ctime can't be changed using any syscalls (except to current time).

So, I think everything here works as intended, although not really as expected. 
:)

Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-09 Thread Timo Sirainen
On Thu, 2011-12-08 at 16:10 +, Mark Zealey wrote:
 By the way, another bug I noticed with dsync is that when converting from
 Maildir to sdbox the date.saved field is not preserved - it's just
 the time when the first dsync command happened. Presumably it should be the
 mtime of the Maildir message file.

With Maildir the date.saved is taken from the mail file's ctime (yes,
it's not perfect, but it's good enough for what it's used for). It's
preserved in my tests.




Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-09 Thread Timo Sirainen
On Thu, 2011-12-08 at 14:45 +, Mark Zealey wrote:
 With 2.0.16 hdr.xxx fields get copied fine (but of course without timestamp).
 With the patch you provided they don't get copied, whether using mirror or
 backup starting from scratch. I'm doing a Maildir to sdbox migration;
 otherwise I don't think I'm doing anything strange.

Show the whole list of cache decisions in source and destination?




Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-08 Thread Mark Zealey
OK now it's copying the timestamp fields for tmp ones. However:

1) hdr.* fields are not being copied at all (unlike in previous releases)
2) although the decisions are now being recorded; the items are not actually 
being put into the cache for previously sync'd mails. New mails are having all 
the cache information produced however.

Note: this is only when using the -f option to dsync; when not using -f it
doesn't even get round to generating a cache, so no fields are put there.

Perhaps this should be activated by a new option to dsync; if people are using 
this for backup (rather than conversion) caches could get relatively large?

Mark

Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-08 Thread Timo Sirainen
On Thu, 2011-12-08 at 09:19 +, Mark Zealey wrote:
 OK now it's copying the timestamp fields for tmp ones. However:
 
 1) hdr.* fields are not being copied at all (unlike in previous releases)

They are in my tests.. This also happens if the destination doesn't
exist?

 2) although the decisions are now being recorded; the items are not actually 
 being put into the cache for previously sync'd mails. New mails are having 
 all the cache information produced however.

This is intentional. Doing anything else would be horribly inefficient.
Note that dsync isn't *copying* cached data. It's simply setting the
caching decisions, and the mail saving code parses the mails and updates
cache.

 Perhaps this should be activated by a new option to dsync; if people are 
 using this for backup (rather than conversion) caches could get relatively 
 large?

Hm. Maybe..




Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-08 Thread Mark Zealey
OK I'll test the header copying more fully. The reason we want to preserve 
caching decisions is to avoid an IO storm when users log in to their mailboxes 
after an sdbox upgrade so it would be great to be able to have some way to warm 
caches.

Mark

Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-08 Thread Mark Zealey
With 2.0.16 hdr.xxx fields get copied fine (but of course without timestamp).
With the patch you provided they don't get copied, whether using mirror or
backup starting from scratch. I'm doing a Maildir to sdbox migration;
otherwise I don't think I'm doing anything strange.

Mark


Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-08 Thread Mark Zealey
By the way, another bug I noticed with dsync is that when converting from
Maildir to sdbox the date.saved field is not preserved - it's just the
time when the first dsync command happened. Presumably it should be the mtime
of the Maildir message file.

Mark


Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-07 Thread Timo Sirainen
On Sat, 2011-11-26 at 18:33 +0200, Mark Zealey wrote:

 We're trying to convert users from Maildir to sdbox at present; I'm
 using dsync to achieve this (2.0.16); however, when the users have been
 converted we only get minimal information in the caching files. Is there
 some way to preserve all the caching decisions that were previously made
 so that when the user logs in to the new mailbox we don't have to cause
 an IO storm rebuilding the cache that we know was good? Dovecot seems to
 be partially doing this - if I remove the logs/cache from the source
 mailbox no cache files are built in the conversion; if I put them back
 then we get a cache file built but it only contains a few bits of
 information (guid, date.save). Looking into this a bit further I find
 that when the caches are present at source the fields are preserved but
 the 'last used' date and caching decisions are not, which I suspect means
 dsync doesn't bother caching on import - only fields with a yes decision
 in the source are copied (but their decision is only copied as a tmp
 with the date of import). For example:

How are you calling dsync? Does the destination already exist? I tried
with:

rm -rf /tmp/foo; dsync -u tss -m INBOX mirror sdbox:/tmp/foo

It sets all of the cache fields with yes or tmp decision, as it
should. But yes, the last used field should probably be copied as
well.

Perhaps the problem in your case is that dsync actually writes all of the
cache fields, but then it does a cache compression at the end, which
sees that the last used timestamps are so old that it deletes the fields.
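
A rough sketch of the effect described above, for illustration only. It is not Dovecot's implementation: the enum, the struct, the function name and the max_unused_secs threshold are all hypothetical. It models a compression pass that drops any field whose last-used timestamp is too old.

```c
#include <assert.h>
#include <stdbool.h>
#include <time.h>

/* Caching decisions as named in this thread; not Dovecot's real enum. */
enum cache_decision {
	CACHE_DECISION_NO,
	CACHE_DECISION_TEMP,
	CACHE_DECISION_YES
};

struct cache_field {
	const char *name;
	enum cache_decision decision;
	time_t last_used;
};

/* Hypothetical compression rule: a field that has not been used within
   max_unused_secs is dropped by setting its decision to NO.
   Returns true if the field is still cached afterwards. */
bool cache_field_survives_compress(struct cache_field *field, time_t now,
				   time_t max_unused_secs)
{
	assert(field != NULL);
	if (field->decision == CACHE_DECISION_NO)
		return false;
	if (now - field->last_used > max_unused_secs) {
		field->decision = CACHE_DECISION_NO;
		return false;
	}
	return true;
}
```

Under this model a field copied with its decision but with last_used left at 0 looks infinitely old, so it would be dropped by the first compression regardless of the decision that was copied.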

But yes, it is a problem that dsync doesn't update caching decisions..
Hmm. I guess I'll have to fix that for v2.1.



Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-07 Thread Mark Zealey
Apologies for top-posting but I can't figure out how to make this client do
inline... On the first run (we are using 'backup') I am seeing that none
of the cache is copied; only the index files are created. On the second run (i.e. when
dest exists), a cache file is created and populated with the bits that are
presumably required for the sync - guid. As you say, the yes/tmp caching
decisions are copied over (and visible in the cache file), but because the last
used date is not copied, these fields are not activated for any of the messages,
so none of their data actually gets cached. I'm not seeing a compression at the
end, as the tmp etc. fields are still there (we mostly don't have any yes fields in
our source caches), but as I say, because they don't have a last used date,
none of them are ever actually used until the client requests them via
pop/imap.

Mark


Re: [Dovecot] using dsync to convert mailboxes looses caching options

2011-12-07 Thread Timo Sirainen
On Thu, 2011-12-08 at 07:53 +0200, Timo Sirainen wrote:

 But yes, it is a problem that dsync doesn't update caching decisions..
 Hmm. I guess I'll have to fix that for v2.1.

Could you try if the attached patch fixes your problems when patching
against latest v2.1 hg? It's annoyingly large, and it makes v2.1 dsync
incompatible with v2.0, but maybe it's better to do it sooner than
later..

diff -r ddfe3a0f75e6 src/doveadm/doveadm-dump-index.c
--- a/src/doveadm/doveadm-dump-index.c	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/doveadm/doveadm-dump-index.c	Thu Dec 08 09:32:03 2011 +0200
@@ -9,7 +9,6 @@
 #include "message-part-serialize.h"
 #include "mail-index-private.h"
 #include "mail-cache-private.h"
-#include "mail-cache-private.h"
 #include "mail-index-modseq.h"
 #include "doveadm-dump.h"
 
@@ -344,7 +343,7 @@
 			printf("   - ");
 		printf("%-4s %.16s\n",
 		   cache_decision2str(field->decision),
-		   unixdate2str(cache->fields[cache_idx].last_used));
+		   unixdate2str(field->last_used));
 	}
 }
 
diff -r ddfe3a0f75e6 src/dsync/dsync-data.c
--- a/src/dsync/dsync-data.c	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/dsync/dsync-data.c	Thu Dec 08 09:32:03 2011 +0200
@@ -10,7 +10,8 @@
 dsync_mailbox_dup(pool_t pool, const struct dsync_mailbox *box)
 {
 	struct dsync_mailbox *dest;
-	const char *const *cache_fields = NULL, *dup;
+	const struct mailbox_cache_field *cache_fields = NULL;
+	struct mailbox_cache_field *dup;
 	unsigned int i, count = 0;
 
 	dest = p_new(pool, struct dsync_mailbox, 1);
@@ -24,8 +25,9 @@
 	else {
 		p_array_init(&dest->cache_fields, pool, count);
 		for (i = 0; i < count; i++) {
-			dup = p_strdup(pool, cache_fields[i]);
-			array_append(&dest->cache_fields, &dup, 1);
+			dup = array_append_space(&dest->cache_fields);
+			*dup = cache_fields[i];
+			dup->name = p_strdup(pool, dup->name);
 		}
 	}
 	return dest;
diff -r ddfe3a0f75e6 src/dsync/dsync-data.h
--- a/src/dsync/dsync-data.h	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/dsync/dsync-data.h	Thu Dec 08 09:32:03 2011 +0200
@@ -28,7 +28,7 @@
 	   otherwise it's the last rename timestamp. */
 	time_t last_change;
 	enum dsync_mailbox_flags flags;
-	ARRAY_TYPE(const_string) cache_fields;
+	ARRAY_TYPE(mailbox_cache_field) cache_fields;
 };
 ARRAY_DEFINE_TYPE(dsync_mailbox, struct dsync_mailbox *);
 #define dsync_mailbox_is_noselect(dsync_box) \
diff -r ddfe3a0f75e6 src/dsync/dsync-proxy-client.c
--- a/src/dsync/dsync-proxy-client.c	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/dsync/dsync-proxy-client.c	Thu Dec 08 09:32:03 2011 +0200
@@ -893,7 +893,7 @@
 static void
 proxy_client_worker_select_mailbox(struct dsync_worker *_worker,
    const mailbox_guid_t *mailbox,
-   const ARRAY_TYPE(const_string) *cache_fields)
+   const ARRAY_TYPE(mailbox_cache_field) *cache_fields)
 {
 	struct proxy_client_dsync_worker *worker =
 		(struct proxy_client_dsync_worker *)_worker;
@@ -908,7 +908,7 @@
 		str_append(str, "BOX-SELECT\t");
 		dsync_proxy_mailbox_guid_export(str, mailbox);
 		if (cache_fields != NULL)
-			dsync_proxy_strings_export(str, cache_fields);
+			dsync_proxy_cache_fields_export(str, cache_fields);
 		str_append_c(str, '\n');
 		proxy_client_worker_cmd(worker, str);
 	} T_END;
diff -r ddfe3a0f75e6 src/dsync/dsync-proxy-server-cmd.c
--- a/src/dsync/dsync-proxy-server-cmd.c	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/dsync/dsync-proxy-server-cmd.c	Thu Dec 08 09:32:03 2011 +0200
@@ -315,7 +315,7 @@
 cmd_box_select(struct dsync_proxy_server *server, const char *const *args)
 {
 	struct dsync_mailbox box;
-	unsigned int i, count;
+	const char *error;
 
 	memset(&box, 0, sizeof(box));
 	if (args[0] == NULL ||
@@ -325,10 +325,11 @@
 	}
 	args++;
 
-	count = str_array_length(args);
-	t_array_init(&box.cache_fields, count + 1);
-	for (i = 0; i < count; i++)
-		array_append(&box.cache_fields, &args[i], 1);
+	if (dsync_proxy_cache_fields_import(args, pool_datastack_create(),
+	&box.cache_fields, &error) < 0) {
+		i_error("box-select: %s", error);
+		return -1;
+	}
 	dsync_worker_select_mailbox(server->worker, &box);
 	return 1;
 }
diff -r ddfe3a0f75e6 src/dsync/dsync-proxy.c
--- a/src/dsync/dsync-proxy.c	Thu Dec 08 09:30:14 2011 +0200
+++ b/src/dsync/dsync-proxy.c	Thu Dec 08 09:32:03 2011 +0200
@@ -8,27 +8,104 @@
 #include "hex-binary.h"
 #include "mail-types.h"
 #include "imap-util.h"
+#include "mail-cache.h"
 #include "dsync-data.h"
 #include "dsync-proxy.h"
 
 #include <stdlib.h>
 
-void dsync_proxy_strings_export(string_t *str,
-const ARRAY_TYPE(const_string) *strings)
+#define DSYNC_CACHE_DECISION_NO 'n'
+#define DSYNC_CACHE_DECISION_YES 'y'
+#define DSYNC_CACHE_DECISION_TEMP 't'
+#define DSYNC_CACHE_DECISION_FORCED 'f'
+
+void dsync_proxy_cache_fields_export(string_t *str,
+ const ARRAY_TYPE(mailbox_cache_field) *_fields)
 {
-	const char *const *fields;
+	const struct mailbox_cache_field *fields;
 	unsigned int i, count;
 
-	if (!array_is_created(strings))
+	if (!array_is_created(_fields))
 		return;
 
-	fields = array_get(strings, &count);
+	fields =