Add new dump/restore format and batch tagging.
Ethan Glasser-Camp writes: > - Patch 4 still has a subject line that ends in a period. I don't think > this is mandatory for everyone but some people consider it best > practice. Best practice, of course, would be to remove the period at the end of the subject line. Patch 12 also has one. Ethan
Add new dump/restore format and batch tagging.
david at tethera.net writes: > which was revied by Tomi and Ethan. I think I implemented their > suggestions. Actually, I don't think you implemented all of mine. - Patch 4 still has a subject line that ends in a period. I don't think this is mandatory for everyone but some people consider it best practice. You still have the spelling "seperate" (also, patch 7 has "seperated"). - In patch 4, I still think this would look better if you switched the branches. +if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) { + _notmuch_message_add_term (message, "type", "mail"); +} else { + return NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; +} - In patch 5, I still think this looks funny: + int num_tags = random () % (max_tags + 1); + int this_mid_len = random () % message_id_len + 1; Additionally, I would like a check that message_id_len is reasonable (more than 1, say). - Patch 8: +int +parse_tag_stream (void *ctx, + notmuch_database_t *notmuch, + FILE *input, + tag_callback_t callback, + tag_op_flag_t flags, + volatile sig_atomic_t *interrupted); Am I going crazy, or does this function not get implemented? - Patch 11: DUMP_FORMAT can change from DUMP_FORMAT_BATCH_TAG to DUMP_FORMAT_SUP if a paren is anywhere on the first line. I'd prefer this only happens if we have DUMP_FORMAT_AUTO. You probably want to move the comment "Dump output is..." closer to the regex. I don't see why it's necessary to have this: + query_string = query_string + 3; - Patch 13: +cp /dev/null EXPECTED.$test_count +notmuch dump --format=batch-tag -- from:cworth |\ +awk "{ print \"+$enc1 +$enc2 +$enc3 -- \" \$5 }" > EXPECTED.$test_count What's the purpose of the CP here? It just creates an empty file. You could do it with touch. Why even bother since you're going to fill it with stuff in a second? Actually, care to explain the dump and sed call? It looks like you're using this dump to encode the message IDs. If format=batch-tag skips a message for some reason or terminates early, the test won't fail. - Patch 16: +message-ids may contained arbitrary non-null characters. Note also I think this should be "may contain", or something else entirely if you're trying to describe past behavior of sup? Ethan
Add new dump/restore format and batch tagging.
This is a patch series with some history, if you will forgive the semi-inadvertant pun. I list that at that end, for the curious. If this series goes in, in the future we might consider whether restore --accumulate provides important functionality over batch tagging; at the moment I suggest leaving it as the extra code to support it is minimal, and it does support peoples existing scripts using the old dump/restore format. History --- About 1 year ago, Petter Reinholdtsen observed a problem with dumping and restoring message-id's with spaces in them. id:2flfwhht87d.fsf at diskless.uio.no There followed a proposed fix id:1323808075-7417-1-git-send-email-david at tethera.net Which Dmitry had a few helpful things to say about the hex encoding libs. Jani took that foundation and proposed two versions of the batch tagging, most recently at id:cover.1334404979.git.jani at nikula.org. There was some discussion with Jamie about the file format for batch tagging in the thread id:cover.1333231401.git.jani at nikula.org id:1323808075-7417-1-git-send-email-david at tethera.net The first 6 of these patches obsolete the series id:1345382314-5330-1-git-send-email-david at tethera.net which was revied by Tomi and Ethan. I think I implemented their suggestions. Although I didn't re-read that whole thread, I believe this version of the patches address's Jamie's concerns by using exactly the same format for restore and tag --batch (renamed from Jani's choice of --stdin).
Re: Add new dump/restore format and batch tagging.
Ethan Glasser-Camp writes: > - Patch 4 still has a subject line that ends in a period. I don't think > this is mandatory for everyone but some people consider it best > practice. Best practice, of course, would be to remove the period at the end of the subject line. Patch 12 also has one. Ethan ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
Re: Add new dump/restore format and batch tagging.
da...@tethera.net writes: > which was revied by Tomi and Ethan. I think I implemented their > suggestions. Actually, I don't think you implemented all of mine. - Patch 4 still has a subject line that ends in a period. I don't think this is mandatory for everyone but some people consider it best practice. You still have the spelling "seperate" (also, patch 7 has "seperated"). - In patch 4, I still think this would look better if you switched the branches. +if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) { + _notmuch_message_add_term (message, "type", "mail"); +} else { + return NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID; +} - In patch 5, I still think this looks funny: + int num_tags = random () % (max_tags + 1); + int this_mid_len = random () % message_id_len + 1; Additionally, I would like a check that message_id_len is reasonable (more than 1, say). - Patch 8: +int +parse_tag_stream (void *ctx, + notmuch_database_t *notmuch, + FILE *input, + tag_callback_t callback, + tag_op_flag_t flags, + volatile sig_atomic_t *interrupted); Am I going crazy, or does this function not get implemented? - Patch 11: DUMP_FORMAT can change from DUMP_FORMAT_BATCH_TAG to DUMP_FORMAT_SUP if a paren is anywhere on the first line. I'd prefer this only happens if we have DUMP_FORMAT_AUTO. You probably want to move the comment "Dump output is..." closer to the regex. I don't see why it's necessary to have this: + query_string = query_string + 3; - Patch 13: +cp /dev/null EXPECTED.$test_count +notmuch dump --format=batch-tag -- from:cworth |\ +awk "{ print \"+$enc1 +$enc2 +$enc3 -- \" \$5 }" > EXPECTED.$test_count What's the purpose of the CP here? It just creates an empty file. You could do it with touch. Why even bother since you're going to fill it with stuff in a second? Actually, care to explain the dump and sed call? It looks like you're using this dump to encode the message IDs. If format=batch-tag skips a message for some reason or terminates early, the test won't fail. - Patch 16: +message-ids may contained arbitrary non-null characters. Note also I think this should be "may contain", or something else entirely if you're trying to describe past behavior of sup? Ethan ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
Add new dump/restore format and batch tagging.
This is a patch series with some history, if you will forgive the semi-inadvertant pun. I list that at that end, for the curious. If this series goes in, in the future we might consider whether restore --accumulate provides important functionality over batch tagging; at the moment I suggest leaving it as the extra code to support it is minimal, and it does support peoples existing scripts using the old dump/restore format. History --- About 1 year ago, Petter Reinholdtsen observed a problem with dumping and restoring message-id's with spaces in them. id:2flfwhht87d@diskless.uio.no There followed a proposed fix id:1323808075-7417-1-git-send-email-da...@tethera.net Which Dmitry had a few helpful things to say about the hex encoding libs. Jani took that foundation and proposed two versions of the batch tagging, most recently at id:cover.1334404979.git.j...@nikula.org. There was some discussion with Jamie about the file format for batch tagging in the thread id:cover.1333231401.git.j...@nikula.org id:1323808075-7417-1-git-send-email-da...@tethera.net The first 6 of these patches obsolete the series id:1345382314-5330-1-git-send-email-da...@tethera.net which was revied by Tomi and Ethan. I think I implemented their suggestions. Although I didn't re-read that whole thread, I believe this version of the patches address's Jamie's concerns by using exactly the same format for restore and tag --batch (renamed from Jani's choice of --stdin). ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
New dump/restore format.
This version of this series add fairly extensive testing with strange message ids full of spaces and punctuation, and some documentation.
New dump/restore format.
This version of this series add fairly extensive testing with strange message ids full of spaces and punctuation, and some documentation. ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch
New Dump/Restore Format
Hi All; There are some style/doc issues remaining, but because bugs in dump and restore really suck, I thought I would ask for early feedback on functionality. I'm particularly interested in how the new dump format works for weird message-ids (spaces and so on). If you have public messages with tricky message-id's, I'd appreciate adding those messages to the test suite. Things to bikeshed now: name(s) of the formats; sup and notmuch are maybe not ideal. The format itself? The encoding format? The latter is chosen for compatibility with nmbug, but we could discussing using a bigger character set. Things I know about - not enough tests - no man page, online docs. - no API docs for hex_encode/blah. I think the code in hex-escape.[ch] is otherwise ready for (second) review; I'll probably do another review of the code in notmuch-(dump|restore).c myself for clarity, so you might want to wait for the next round before diving in. If you prefer pull from git, you can get these patches on branch "new-dump" at git://pivot.cs.unb.ca/notmuch.git
New Dump/Restore Format
Hi All; There are some style/doc issues remaining, but because bugs in dump and restore really suck, I thought I would ask for early feedback on functionality. I'm particularly interested in how the new dump format works for weird message-ids (spaces and so on). If you have public messages with tricky message-id's, I'd appreciate adding those messages to the test suite. Things to bikeshed now: name(s) of the formats; sup and notmuch are maybe not ideal. The format itself? The encoding format? The latter is chosen for compatibility with nmbug, but we could discussing using a bigger character set. Things I know about - not enough tests - no man page, online docs. - no API docs for hex_encode/blah. I think the code in hex-escape.[ch] is otherwise ready for (second) review; I'll probably do another review of the code in notmuch-(dump|restore).c myself for clarity, so you might want to wait for the next round before diving in. If you prefer pull from git, you can get these patches on branch "new-dump" at git://pivot.cs.unb.ca/notmuch.git ___ notmuch mailing list notmuch@notmuchmail.org http://notmuchmail.org/mailman/listinfo/notmuch