[PATCH v3 04/10] notmuch-dump: add --format=(notmuch|sup)
On Sun, 15 Jan 2012 15:35:11 -0500, Austin Clements wrote:
> We definitely need a round-trip-able dump format.  Did you consider
> using JSON to allow for future flexibility (e.g., expansion of what we
> store in the database) and so we don't have to invent our own
> encodings?  A JSON format wouldn't necessarily be a reason *not* to
> also have this format, especially considering how
> shell-script-friendly this is (versus how shell-script-unfriendly JSON
> is), I'm just curious what trade-offs you're considering.

I was looking for something fairly close to what we have, to allow
people to migrate their various scripts (e.g. nmbug) to the new format
without too much pain.  Maybe some small amount of header information at
the start of the file would support extensibility, while still being
shell script friendly.

I'm also not too sure how much overhead the JSON quoting would
induce.  My tags file is currently about 10M, and on my old laptop takes
about 15s to dump.  That's a long 15s when I'm trying to sync my
mail.  For "normal" backup use, a little more overhead doesn't matter,
although the stories of non-linear slowdowns that people report suggest
we shouldn't get too cavalier about that.

> You might want to call this format something more self-descriptive
> like "text" or "hextext" or something in case we do want to expand in
> the future.  "sup" is probably fine for the legacy format since that's
> set in stone at this point.

yeah, I'm definitely open to better suggestions for a name
[PATCH v3 04/10] notmuch-dump: add --format=(notmuch|sup)
Quoth David Bremner on Jan 14 at 9:40 pm:
> From: David Bremner
>
> sup is the old format, and remains the default.
>
> Each line of the notmuch format is "msg_id tag tag...tag" where each
> space separated token is 'hex-encoded' to remove troubling characters.
> In particular this format won't have the same problem with e.g. spaces
> in message-ids or tags; they will be round-trip-able.

We definitely need a round-trip-able dump format.  Did you consider
using JSON to allow for future flexibility (e.g., expansion of what we
store in the database) and so we don't have to invent our own
encodings?  A JSON format wouldn't necessarily be a reason *not* to
also have this format, especially considering how
shell-script-friendly this is (versus how shell-script-unfriendly JSON
is), I'm just curious what trade-offs you're considering.

You might want to call this format something more self-descriptive
like "text" or "hextext" or something in case we do want to expand in
the future.  "sup" is probably fine for the legacy format since that's
set in stone at this point.
[PATCH v3 04/10] notmuch-dump: add --format=(notmuch|sup)
From: David Bremner

sup is the old format, and remains the default.

Each line of the notmuch format is "msg_id tag tag...tag" where each
space separated token is 'hex-encoded' to remove troubling characters.
In particular this format won't have the same problem with e.g. spaces
in message-ids or tags; they will be round-trip-able.
---
 dump-restore-private.h |   12
 notmuch-dump.c         |   47 +++
 2 files changed, 51 insertions(+), 8 deletions(-)
 create mode 100644 dump-restore-private.h

diff --git a/dump-restore-private.h b/dump-restore-private.h
new file mode 100644
index 000..34a5022
--- /dev/null
+++ b/dump-restore-private.h
@@ -0,0 +1,12 @@
+#ifndef DUMP_RESTORE_PRIVATE_H
+#define DUMP_RESTORE_PRIVATE_H
+
+#include "hex-escape.h"
+#include "command-line-arguments.h"
+
+typedef enum dump_formats {
+    DUMP_FORMAT_SUP,
+    DUMP_FORMAT_NOTMUCH
+} dump_format_t;
+
+#endif
diff --git a/notmuch-dump.c b/notmuch-dump.c
index a735875..0231db2 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -19,6 +19,7 @@
  */
 
 #include "notmuch-client.h"
+#include "dump-restore-private.h"
 
 int
 notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
@@ -44,9 +45,15 @@ notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
     char *output_file_name = NULL;
     int opt_index;
 
+    int output_format = DUMP_FORMAT_SUP;
+
     notmuch_opt_desc_t options[] = {
-        { NOTMUCH_OPT_POSITION, &output_file_name, 0, 0, 0 },
-        { 0, 0, 0, 0, 0 }
+        { NOTMUCH_OPT_KEYWORD, &output_format, "format", 'f',
+          (notmuch_keyword_t []){ { "sup", DUMP_FORMAT_SUP },
+                                  { "notmuch", DUMP_FORMAT_NOTMUCH },
+                                  {0, 0} } },
+        { NOTMUCH_OPT_POSITION, &output_file_name, 0, 0, 0 },
+        { 0, 0, 0, 0, 0 }
     };
 
     opt_index = parse_arguments (argc, argv, options, 1);
@@ -85,29 +92,53 @@ notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
      */
     notmuch_query_set_sort (query, NOTMUCH_SORT_UNSORTED);
 
+    char *buffer = NULL;
+    size_t buffer_size = 0;
+
     for (messages = notmuch_query_search_messages (query);
          notmuch_messages_valid (messages);
          notmuch_messages_move_to_next (messages))
     {
         int first = 1;
-        message = notmuch_messages_get (messages);
+        const char *message_id;
 
-        fprintf (output,
-                 "%s (", notmuch_message_get_message_id (message));
+        message = notmuch_messages_get (messages);
+        message_id = notmuch_message_get_message_id (message);
+
+        if (output_format == DUMP_FORMAT_SUP) {
+            fprintf (output, "%s (", message_id);
+        } else {
+            if (hex_encode (notmuch, message_id,
+                            &buffer, &buffer_size) != HEX_SUCCESS)
+                return 1;
+            fprintf (output, "%s ", buffer);
+        }
 
         for (tags = notmuch_message_get_tags (message);
              notmuch_tags_valid (tags);
              notmuch_tags_move_to_next (tags))
         {
+            const char *tag_str = notmuch_tags_get (tags);
+
             if (! first)
-                fprintf (output, " ");
+                fputs (" ", output);
 
-            fprintf (output, "%s", notmuch_tags_get (tags));
+            if (output_format == DUMP_FORMAT_SUP) {
+                fputs (tag_str, output);
+            } else {
+                if (hex_encode (notmuch, tag_str,
+                                &buffer, &buffer_size) != HEX_SUCCESS)
+                    return 1;
+                fputs (buffer, output);
+            }
 
             first = 0;
         }
 
-        fprintf (output, ")\n");
+        if (output_format == DUMP_FORMAT_SUP)
+            fputs (")\n", output);
+        else
+            fputs ("\n", output);
 
         notmuch_message_destroy (message);
     }
-- 
1.7.7.3
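
For readers who want to see what "hex-encoding each space separated token" could look like, here is a rough sketch.  It is not the hex_encode() from hex-escape.h, which is not part of this patch.  The helper names (sketch_is_safe, sketch_encode, sketch_decode), the '%xx' escape syntax and the set of characters left unescaped are all assumptions made for illustration.  The point is only that a space inside a tag or message-id never reaches the output unescaped, so splitting a dump line on spaces and decoding each token gets the original strings back.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical: which bytes may appear unescaped in a token.  The set
 * chosen here is an assumption, not taken from hex-escape.h. */
static int
sketch_is_safe (unsigned char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
	   (c >= '0' && c <= '9') ||
	   (c != '\0' && strchr ("+-_.:@=/,", c) != NULL);
}

/* Hypothetical: value of one hex digit, or -1 if c is not a hex digit. */
static int
sketch_hex_value (char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Escape every unsafe byte as '%' plus two hex digits (assumed syntax).
 * Returns a malloc'd string the caller frees, or NULL on allocation
 * failure. */
static char *
sketch_encode (const char *in)
{
    static const char digits[] = "0123456789abcdef";
    const unsigned char *p;
    char *out, *q;

    assert (in != NULL);
    /* Worst case: every input byte becomes three output bytes. */
    out = malloc (3 * strlen (in) + 1);
    if (out == NULL)
	return NULL;

    q = out;
    for (p = (const unsigned char *) in; *p; p++) {
	if (sketch_is_safe (*p)) {
	    *q++ = (char) *p;
	} else {
	    *q++ = '%';
	    *q++ = digits[*p >> 4];
	    *q++ = digits[*p & 0x0f];
	}
    }
    *q = '\0';
    return out;
}

/* Inverse of sketch_encode.  Returns a malloc'd string the caller
 * frees, or NULL on allocation failure or a malformed escape. */
static char *
sketch_decode (const char *in)
{
    char *out, *q;

    assert (in != NULL);
    /* Decoded text is never longer than the encoded text. */
    out = malloc (strlen (in) + 1);
    if (out == NULL)
	return NULL;

    q = out;
    while (*in) {
	if (*in == '%') {
	    int hi = sketch_hex_value (in[1]);
	    /* Only look at in[2] once in[1] is known to be a digit. */
	    int lo = hi < 0 ? -1 : sketch_hex_value (in[2]);

	    if (lo < 0) {
		free (out);
		return NULL;
	    }
	    *q++ = (char) (hi * 16 + lo);
	    in += 3;
	} else {
	    *q++ = *in++;
	}
    }
    *q = '\0';
    return out;
}
```

Under these assumptions, sketch_encode ("tag with space") gives "tag%20with%20space", and passing that to sketch_decode returns the original string.  A token made only of safe characters, such as "inbox", comes out unchanged, which keeps ordinary dump lines readable from shell scripts.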