Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
On Thu, Jun 25, 2015 at 12:36 AM, Johannes Schindelin johannes.schinde...@gmx.de wrote: On 2015-06-18 13:25, Paul Tan wrote: diff --git a/builtin/am.c b/builtin/am.c index 7b97ea8..d6434e4 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -94,6 +126,105 @@ static int read_state_file(struct strbuf *sb, const char *file, size_t hint, int } /** + * Reads a KEY=VALUE shell variable assignment from fp, and returns the VALUE + * in `value`. VALUE must be a quoted string, and the KEY must match `key`. + * Returns 0 on success, -1 on failure. + * + * This is used by read_author_script() to read the GIT_AUTHOR_* variables from + * the author-script. + */ +static int read_shell_var(struct strbuf *value, FILE *fp, const char *key) +{ + struct strbuf sb = STRBUF_INIT; + char *str; + + if (strbuf_getline(sb, fp, '\n')) + return -1; + + if (!skip_prefix(sb.buf, key, (const char **)str)) + return -1; + + if (!skip_prefix(str, =, (const char **)str)) + return -1; + + str = sq_dequote(str); + if (!str) + return -1; + + strbuf_reset(value); + strbuf_addstr(value, str); + + strbuf_release(sb); + + return 0; +} How about using `strbuf_remove()` and keeping `str` as `const char *`? OK, I'll try that out. Looks like this now: static char *read_shell_var(FILE *fp, const char *key) { struct strbuf sb = STRBUF_INIT; const char *str; if (strbuf_getline(sb, fp, '\n')) return NULL; if (!skip_prefix(sb.buf, key, str)) return NULL; if (!skip_prefix(str, =, str)) return NULL; strbuf_remove(sb, 0, str - sb.buf); str = sq_dequote(sb.buf); if (!str) return NULL; return strbuf_detach(sb, NULL); } I also think we can fold it into the `read_author_script()` function and make it more resilient with regards to the order of the variables. Something like this: [...] Hmm, I think we should be very strict about parsing the author-script file though. As explained in read_author_script(), git-am.sh evals the author-script, which we can't in C. I would much rather we barf at the first sign that the author-script is not what we expect, rather than attempt to parse it as much as possible, but end up with the wrong results as compared to git-am.sh. Besides, currently git-am.sh will always write the author-script with the order of GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL and GIT_AUTHOR_DATE. If the order is wrong, I would think it means that something is messing with the author-script, and it would be better if we bail out immediately, instead of potentially doing something wrong. +/** + * Saves state-author_name, state-author_email and state-author_date in + * `filename` as an author script, which is the format used by git-am.sh. + */ +static void write_author_script(const struct am_state *state) +{ + static const char fmt[] = GIT_AUTHOR_NAME=%s\n + GIT_AUTHOR_EMAIL=%s\n + GIT_AUTHOR_DATE=%s\n; + struct strbuf author_name = STRBUF_INIT; + struct strbuf author_email = STRBUF_INIT; + struct strbuf author_date = STRBUF_INIT; + + sq_quote_buf(author_name, state-author_name.buf); + sq_quote_buf(author_email, state-author_email.buf); + sq_quote_buf(author_date, state-author_date.buf); The `sq_quote_buf()` function does not call `strbuf_reset()`. Therefore you could just use a single strbuf to construct the entire three lines and then write that out. Again, if you follow my suggestion to keep a scratch pad strbuf in am_state, you could reuse that. Right, makes sense. I've implemented it on my end. Thanks, Paul -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
Paul Tan pyoka...@gmail.com writes: OK, I'll try that out. Looks like this now: static char *read_shell_var(FILE *fp, const char *key) { ... str = sq_dequote(sb.buf); if (!str) return NULL; You are unlikely to get !str, but if it does, you leak sb here, don't you? return strbuf_detach(sb, NULL); This call is OK; if you passed the length to detach, you're likely to get a wrong result, though ;-) sq_dequote() is one of the older parts of the API set and its we know it cannot do anything but shrink, so we'd do it in-place attitude, which may be vastly useful in practice, is now showing some impedance mismatch with newer parts of the API like strbuf. +/** + * Saves state-author_name, state-author_email and state-author_date in + * `filename` as an author script, which is the format used by git-am.sh. + */ +static void write_author_script(const struct am_state *state) +{ + static const char fmt[] = GIT_AUTHOR_NAME=%s\n + GIT_AUTHOR_EMAIL=%s\n + GIT_AUTHOR_DATE=%s\n; + struct strbuf author_name = STRBUF_INIT; + struct strbuf author_email = STRBUF_INIT; + struct strbuf author_date = STRBUF_INIT; + + sq_quote_buf(author_name, state-author_name.buf); + sq_quote_buf(author_email, state-author_email.buf); + sq_quote_buf(author_date, state-author_date.buf); The `sq_quote_buf()` function does not call strbuf_reset()`. Therefore you could just use a single strbuf to construct the entire three lines and then write that out. Yup. quote appends to the output, so you could do this: add(out, GIT_AUTHOR_NAME=); quote(out, state-author_name); add(out, \\nGIT_AUTHOR_EMAIL=); quote(out, state-author_email); ... I am not sure if that is easier to read than what you have, though. Again, if you follow my suggestion to keep a scratch pad strbuf in am_state, you could reuse that. Don't do scratch pad in a structure that is passed around to various people. Somebody may be tempted to use the scratch pad while he has the control, but as soon as he becomes complex enough to require calling some helper functions, the ownership rules of the scratch pad will become cumbersome to manage and understandable only by the person who originally wrote the codepath. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
On Wed, Jun 24, 2015 at 11:59 PM, Junio C Hamano gits...@pobox.com wrote: Paul Tan pyoka...@gmail.com writes: 3. I'm over-thinking this and you just want the struct strbufs in the struct am_state to be switched to char*s? Yes, everybody interacts with am_state, and these fields are supposed to be constant during the processing of each patch input message; they should be simple strings, not strbufs, to make sure if anybody _does_ muck with them in-place, that would be very visible. The helpers to initialize them are free to use strbuf API to prepare these simple string fields, of course. OK. I've implemented it on my end. (On a somewhat related thought, currently we do write_file() all over the place, which is really ugly. I'm leaning heavily on introducing an am_save() function, for I don't care how it is done but just update the contents of the am state directory so that it matches the contents of the struct am_state. Sure; the scripted Porcelain may have done echo here, echo there instead of concatenate into a $var and then 'echo $var' at end as that is more natural way to program in that environment. You are doing this in C and prepare the thing in-core and write it all at the point to snapshot may well be the more natural way to program. As long as a process that stops in the middle does not leave on-disk state inconsistent, batching would be fine. For example, you may apply and commit two (or more) patches without updating the on-disk state as you do not see need to give control back to the user (i.e. they applied cleanly) and then write out the on-disk state with .next incremented by two (or more) before giving the control back could be a valid optimization (take this example with a grain of salt, though; I haven't thought too deeply about what should happen if you Ctrl-C the process in the middle). Right, I briefly wondered if we could hold off writing the am_state to disk as much as possible, and only write to disk when we are about to terminate. This would involve installing an atexit() and SIGTERM signal handler, but I haven't thought too deeply about that. Anyway, moving all the writing am_state to disk logic to a central function am_save() would be a good immediate step, I think, so I've implemented it on my end. Thanks, Paul -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
Okay, let's focus only on the API design issues. For the record, I'm not completely satisfied with the current code organization and API, however I don't have really good ideas at hand to improve it, so any ideas with examples would be appreciated. On Fri, Jun 19, 2015 at 09:13:00AM -0700, Junio C Hamano wrote: Paul Tan pyoka...@gmail.com writes: With the above fields, it is clear that the above fields are per-message thing. So the loop to process multiple messages is conceptually: set up the entire am_state (for things like cur=1, last=N) for each message { update per-message fields like cur, author, msg, etc. process the single message clean up per-message fileds like cur++, free(msg), etc. } tear down the entire am_state Reusing the same strbuf and rely on them to clean up after themselves is simply a bad taste. Hmm, reading reusing the same strbuf makes me wonder if I'm misunderstanding something. Do you mean that: 1. We should release the memory in state-msg, state-author_* after processing each message? Because that is what am_next() already does. Granted, it does strbuf_reset(), but it could just as well be strbuf_release(). or 2. The per-message fields (state-msg, state-author_*) should become local variables in am_run()? or 3. I'm over-thinking this and you just want the struct strbufs in the struct am_state to be switched to char*s? If that is so, I've attached a sample patch below. For (2), my idea is that am_state represents the contents of the state directory at any point in time -- a design feature carried over from git-am.sh. This is why parse_patch() modifies the am_state struct directly -- the msg and author_* fields would be written to the final-commit and author-script files directly after. I think it would be confusing if we don't update the am_state struct directly, as we will never know if the am_state struct contains the actual current state of the patch application session. (On a somewhat related thought, currently we do write_file() all over the place, which is really ugly. I'm leaning heavily on introducing an am_save() function, for I don't care how it is done but just update the contents of the am state directory so that it matches the contents of the struct am_state. To save pointless writes, we can compare the am_state with the last-written/loaded am_state, so we only write the files that have changed. What do others think?) More importantly, we'd want to make sure that people who will be touching the process the single message part in the above loop to think twice before mucking with read-only fields like author. Okay, I understand and agree with your reasoning. However, note that this process the single message part consists solely of run_apply() and do_commit(). Both of them take a const struct am_state*, which means that the compiler will warn/fail if anybody tries to touch any of the fields in the am_state. This extends to strbufs as well. Isn't this already safe enough? Or do you want this safety to extend throughout am_run() as well? If they are char *, they would need to allocate new storage themselves, format a new value into there, free the original, and replace the field with the new value. It takes a conscious effort and very much visible code and would be clear to reviewers what is going on, and gives them a chance to question if it is a good idea to update things in-place in the first place. Okay, although personally I think reviewers may tend to miss out that a single line like: state-msg = strbuf_detach(sb, NULL); introduces a memory leak. If you left it in strbuf, that invites let's extend it temporarily with strbuf_add() and then return to the original once I am done with this single step, which is an entry to a slippery slope to let's extend it with strbuf_add() for my purpose, and I do not even bother to clean up because I know I am the last person to touch this. So, no, please don't leave a field that won't change during the bulk of the processing in strbuf, unless you have a compelling reason to do so (and compelling reasons are pretty much limited to after all, that field we originally thought it won't change is something we need to change very often). Anyway, assuming that you just want the strbufs in the am_state struct to be switched to char*s, here's a quick diff of the result. Can I confirm that this is the direction that you want to take? (since the changes are throughout the entire patch series) Regards, Paul --- 8 --- builtin/am.c | 202 ++- 1 file changed, 131 insertions(+), 71 deletions(-) diff --git a/builtin/am.c b/builtin/am.c index f55dd14..944e35a 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -98,10 +98,10 @@ struct am_state { int last; /* commit message and metadata */ - struct strbuf author_name; -
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
Paul Tan pyoka...@gmail.com writes: 3. I'm over-thinking this and you just want the struct strbufs in the struct am_state to be switched to char*s? Yes, everybody interacts with am_state, and these fields are supposed to be constant during the processing of each patch input message; they should be simple strings, not strbufs, to make sure if anybody _does_ muck with them in-place, that would be very visible. The helpers to initialize them are free to use strbuf API to prepare these simple string fields, of course. (On a somewhat related thought, currently we do write_file() all over the place, which is really ugly. I'm leaning heavily on introducing an am_save() function, for I don't care how it is done but just update the contents of the am state directory so that it matches the contents of the struct am_state. Sure; the scripted Porcelain may have done echo here, echo there instead of concatenate into a $var and then 'echo $var' at end as that is more natural way to program in that environment. You are doing this in C and prepare the thing in-core and write it all at the point to snapshot may well be the more natural way to program. As long as a process that stops in the middle does not leave on-disk state inconsistent, batching would be fine. For example, you may apply and commit two (or more) patches without updating the on-disk state as you do not see need to give control back to the user (i.e. they applied cleanly) and then write out the on-disk state with .next incremented by two (or more) before giving the control back could be a valid optimization (take this example with a grain of salt, though; I haven't thought too deeply about what should happen if you Ctrl-C the process in the middle). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
Hi Paul, On 2015-06-18 13:25, Paul Tan wrote: diff --git a/builtin/am.c b/builtin/am.c index 7b97ea8..d6434e4 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -94,6 +126,105 @@ static int read_state_file(struct strbuf *sb, const char *file, size_t hint, int } /** + * Reads a KEY=VALUE shell variable assignment from fp, and returns the VALUE + * in `value`. VALUE must be a quoted string, and the KEY must match `key`. + * Returns 0 on success, -1 on failure. + * + * This is used by read_author_script() to read the GIT_AUTHOR_* variables from + * the author-script. + */ +static int read_shell_var(struct strbuf *value, FILE *fp, const char *key) +{ + struct strbuf sb = STRBUF_INIT; + char *str; + + if (strbuf_getline(sb, fp, '\n')) + return -1; + + if (!skip_prefix(sb.buf, key, (const char **)str)) + return -1; + + if (!skip_prefix(str, =, (const char **)str)) + return -1; + + str = sq_dequote(str); + if (!str) + return -1; + + strbuf_reset(value); + strbuf_addstr(value, str); + + strbuf_release(sb); + + return 0; +} How about using `strbuf_remove()` and keeping `str` as `const char *`? I also think we can fold it into the `read_author_script()` function and make it more resilient with regards to the order of the variables. Something like this: static int read_author_script(struct am_state *state) { struct strbuf sb = STRBUF_INIT; const char *filename = am_path(state, author-script); FILE *fp = fopen(filename, r); if (!fp) { if (errno == ENOENT) return 0; die_errno(_(could not open '%s' for reading), filename); } while (!strbuf_getline(sb, fp, '\n')) { char *equal = strchr(sb.buf, '='), **var; if (!equal) { error: fclose(fp); return -1; } *equal = '\0'; if (!strcmp(sb.buf, GIT_AUTHOR_NAME)) var = state-author_name; else if (!strcmp(sb.buf, GIT_AUTHOR_EMAIL)) var = state-author_email; else if (!strcmp(sb.buf, GIT_AUTHOR_DATE)) var = state-author_date; else goto error; *var = xstrdup(sq_dequote(equal + 1)); } fclose(fp); return -1; } If you follow my earlier suggestion to keep a strbuf inside the am_state, you could reuse that here, too. +/** + * Saves state-author_name, state-author_email and state-author_date in + * `filename` as an author script, which is the format used by git-am.sh. + */ +static void write_author_script(const struct am_state *state) +{ + static const char fmt[] = GIT_AUTHOR_NAME=%s\n + GIT_AUTHOR_EMAIL=%s\n + GIT_AUTHOR_DATE=%s\n; + struct strbuf author_name = STRBUF_INIT; + struct strbuf author_email = STRBUF_INIT; + struct strbuf author_date = STRBUF_INIT; + + sq_quote_buf(author_name, state-author_name.buf); + sq_quote_buf(author_email, state-author_email.buf); + sq_quote_buf(author_date, state-author_date.buf); The `sq_quote_buf()` function does not call `strbuf_reset()`. Therefore you could just use a single strbuf to construct the entire three lines and then write that out. Again, if you follow my suggestion to keep a scratch pad strbuf in am_state, you could reuse that. That scratch pad could come in handy in a couple of other places in the rest of this patch. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
Hi Paul, On 2015-06-19 18:15, Paul Tan wrote: On Fri, Jun 19, 2015 at 11:09 PM, Junio C Hamano gits...@pobox.com wrote: The primary thing I care about is to discourage callers of the API element am_state from touching these fields with strbuf functions. If it is char * then the would think twice before modifying (which would involve allocating the new buffer, forming the new value and assigning into it). With the fields left as strbuf after they are read or formed by the API (by reading by the API implementation from $GIT_DIR/rebase-apply/), it is too tempting to do strbuf_add(), strbuf_trim(), etc., without restoring the value to the original saying I'm the last user of it anyway--that is the sloppiness we can prevent by not leaving it as strbuf. I don't think this is a good deterrent. If the code wanted to, it could just use strbuf_attach()/strbuf_detach() as well, no? Sadly, I am a bit too busy with some loose Git for Windows ends, so I haven't had the chance to look at your code. But I still wanted to throw this idea out there: would it maybe possible to avoid having those values as global variables, and instead pass them as const char * parameters to the respective functions? That should help resolve the concerns of both sides because it would allow us to keep the strbuf instances, just as local variables. Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
On Fri, Jun 19, 2015 at 5:10 AM, Junio C Hamano gits...@pobox.com wrote: Paul Tan pyoka...@gmail.com writes: + /* commit message and metadata */ + struct strbuf author_name; + struct strbuf author_email; + struct strbuf author_date; + struct strbuf msg; Same comment as dir in the earlier patch applies to these. If the fields are read or computed and then kept constant, use a temporary variable that is a strbuf to read/compute the final value, and then detach to a const char * field. If they are constantly changing and in-place updates are vital, then they can and should be strbufs, but I do not think that is the case for these. I do think it is the case here. The commit message and metadata fields change for every patch we process, and we could be processing a large volume of patches, so we must be careful to not leak any memory. We could use a char* field, but then to prevent memory leaks we would then have to free() it and malloc() a new string for every patch we process, which will lead to lots of malloc()s and free()s over a large volume of patches that can lead to memory fragmentation. Regards, Paul -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
On Fri, Jun 19, 2015 at 11:09 PM, Junio C Hamano gits...@pobox.com wrote: You do realize that strbuf internally does alloc/free so as a solution to fragmentation issue you are at the mercy of the same alloc/free, don't you? Yes, of course, but it has the alloc variable to keep track of the size of the allocated buffer, and provided that ALLOC_GROW() uses a sensible factor to grow the buffer, it won't be calling malloc/free that much to be a problem. Of course, we could do the same and just add a alloc variable as well and then use it with strbuf_attach()/strbuf_detach(), but at this point why not just store it in an strbuf then and avoid the extra alloc struct member? The primary thing I care about is to discourage callers of the API element am_state from touching these fields with strbuf functions. If it is char * then the would think twice before modifying (which would involve allocating the new buffer, forming the new value and assigning into it). With the fields left as strbuf after they are read or formed by the API (by reading by the API implementation from $GIT_DIR/rebase-apply/), it is too tempting to do strbuf_add(), strbuf_trim(), etc., without restoring the value to the original saying I'm the last user of it anyway--that is the sloppiness we can prevent by not leaving it as strbuf. I don't think this is a good deterrent. If the code wanted to, it could just use strbuf_attach()/strbuf_detach() as well, no? I believe this kind of issue would be better solved with a big WARNING comment and code review. Also, there may be use cases where the user may wish to modify the commit message/author fields. E.g., user may wish to modify the message of the commit that results from am --continue to take into account the conflicts that caused the am to fail. Regards, Paul -- To unsubscribe from this list: send the line unsubscribe git in
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
Paul Tan pyoka...@gmail.com writes: On Fri, Jun 19, 2015 at 5:10 AM, Junio C Hamano gits...@pobox.com wrote: Paul Tan pyoka...@gmail.com writes: + /* commit message and metadata */ + struct strbuf author_name; + struct strbuf author_email; + struct strbuf author_date; + struct strbuf msg; Same comment as dir in the earlier patch applies to these. If the fields are read or computed and then kept constant, use a temporary variable that is a strbuf to read/compute the final value, and then detach to a const char * field. If they are constantly changing and in-place updates are vital, then they can and should be strbufs, but I do not think that is the case for these. I do think it is the case here. The commit message and metadata fields change for every patch we process, and we could be processing a large volume of patches, so we must be careful to not leak any memory. With the above fields, it is clear that the above fields are per-message thing. So the loop to process multiple messages is conceptually: set up the entire am_state (for things like cur=1, last=N) for each message { update per-message fields like cur, author, msg, etc. process the single message clean up per-message fileds like cur++, free(msg), etc. } tear down the entire am_state Reusing the same strbuf and rely on them to clean up after themselves is simply a bad taste. More importantly, we'd want to make sure that people who will be touching the process the single message part in the above loop to think twice before mucking with read-only fields like author. If they are char *, they would need to allocate new storage themselves, format a new value into there, free the original, and replace the field with the new value. It takes a conscious effort and very much visible code and would be clear to reviewers what is going on, and gives them a chance to question if it is a good idea to update things in-place in the first place. If you left it in strbuf, that invites let's extend it temporarily with strbuf_add() and then return to the original once I am done with this single step, which is an entry to a slippery slope to let's extend it with strbuf_add() for my purpose, and I do not even bother to clean up because I know I am the last person to touch this. So, no, please don't leave a field that won't change during the bulk of the processing in strbuf, unless you have a compelling reason to do so (and compelling reasons are pretty much limited to after all, that field we originally thought it won't change is something we need to change very often). Thanks. -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
For the purpose of applying the patch and committing the results, implement extracting the patch data, commit message and authorship from an e-mail message using git-mailinfo. git-mailinfo is run as a separate process, but ideally in the future, we should be be able to access its functionality directly without spawning a new process. Helped-by: Junio C Hamano gits...@pobox.com Helped-by: Jeff King p...@peff.net Signed-off-by: Paul Tan pyoka...@gmail.com --- Notes: v3 * Style fixes builtin/am.c | 232 +++ 1 file changed, 232 insertions(+) diff --git a/builtin/am.c b/builtin/am.c index 7b97ea8..d6434e4 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -9,6 +9,23 @@ #include parse-options.h #include dir.h #include run-command.h +#include quote.h + +/** + * Returns 1 if the file is empty or does not exist, 0 otherwise. + */ +static int is_empty_file(const char *filename) +{ + struct stat st; + + if (stat(filename, st) 0) { + if (errno == ENOENT) + return 1; + die_errno(_(could not stat %s), filename); + } + + return !st.st_size; +} enum patch_format { PATCH_FORMAT_UNKNOWN = 0, @@ -23,6 +40,12 @@ struct am_state { int cur; int last; + /* commit message and metadata */ + struct strbuf author_name; + struct strbuf author_email; + struct strbuf author_date; + struct strbuf msg; + /* number of digits in patch filename */ int prec; }; @@ -36,6 +59,11 @@ static void am_state_init(struct am_state *state) strbuf_init(state-dir, 0); + strbuf_init(state-author_name, 0); + strbuf_init(state-author_email, 0); + strbuf_init(state-author_date, 0); + strbuf_init(state-msg, 0); + state-prec = 4; } @@ -45,6 +73,10 @@ static void am_state_init(struct am_state *state) static void am_state_release(struct am_state *state) { strbuf_release(state-dir); + strbuf_release(state-author_name); + strbuf_release(state-author_email); + strbuf_release(state-author_date); + strbuf_release(state-msg); } /** @@ -94,6 +126,105 @@ static int read_state_file(struct strbuf *sb, const char *file, size_t hint, int } /** + * Reads a KEY=VALUE shell variable assignment from fp, and returns the VALUE + * in `value`. VALUE must be a quoted string, and the KEY must match `key`. + * Returns 0 on success, -1 on failure. + * + * This is used by read_author_script() to read the GIT_AUTHOR_* variables from + * the author-script. + */ +static int read_shell_var(struct strbuf *value, FILE *fp, const char *key) +{ + struct strbuf sb = STRBUF_INIT; + char *str; + + if (strbuf_getline(sb, fp, '\n')) + return -1; + + if (!skip_prefix(sb.buf, key, (const char **)str)) + return -1; + + if (!skip_prefix(str, =, (const char **)str)) + return -1; + + str = sq_dequote(str); + if (!str) + return -1; + + strbuf_reset(value); + strbuf_addstr(value, str); + + strbuf_release(sb); + + return 0; +} + +/** + * Parses the author script `filename`, and sets state-author_name, + * state-author_email and state-author_date accordingly. We are strict with + * our parsing, as the author script is supposed to be eval'd, and loosely + * parsing it may not give the results the user expects. + * + * The author script is of the format: + * + * GIT_AUTHOR_NAME='$author_name' + * GIT_AUTHOR_EMAIL='$author_email' + * GIT_AUTHOR_DATE='$author_date' + * + * where $author_name, $author_email and $author_date are quoted. + */ +static int read_author_script(struct am_state *state) +{ + const char *filename = am_path(state, author-script); + FILE *fp = fopen(filename, r); + if (!fp) { + if (errno == ENOENT) + return 0; + die_errno(_(could not open '%s' for reading), filename); + } + + if (read_shell_var(state-author_name, fp, GIT_AUTHOR_NAME)) + return -1; + + if (read_shell_var(state-author_email, fp, GIT_AUTHOR_EMAIL)) + return -1; + + if (read_shell_var(state-author_date, fp, GIT_AUTHOR_DATE)) + return -1; + + if (fgetc(fp) != EOF) + return -1; + + fclose(fp); + return 0; +} + +/** + * Saves state-author_name, state-author_email and state-author_date in + * `filename` as an author script, which is the format used by git-am.sh. + */ +static void write_author_script(const struct am_state *state) +{ + static const char fmt[] = GIT_AUTHOR_NAME=%s\n + GIT_AUTHOR_EMAIL=%s\n + GIT_AUTHOR_DATE=%s\n; + struct strbuf author_name = STRBUF_INIT; + struct strbuf author_email = STRBUF_INIT; + struct strbuf author_date = STRBUF_INIT; + +
Re: [PATCH/WIP v3 07/31] am: extract patch, message and authorship with git-mailinfo
Paul Tan pyoka...@gmail.com writes: + /* commit message and metadata */ + struct strbuf author_name; + struct strbuf author_email; + struct strbuf author_date; + struct strbuf msg; Same comment as dir in the earlier patch applies to these. If the fields are read or computed and then kept constant, use a temporary variable that is a strbuf to read/compute the final value, and then detach to a const char * field. If they are constantly changing and in-place updates are vital, then they can and should be strbufs, but I do not think that is the case for these. For example... +/** + * Saves state-author_name, state-author_email and state-author_date in + * `filename` as an author script, which is the format used by git-am.sh. + */ +static void write_author_script(const struct am_state *state) +{ + static const char fmt[] = GIT_AUTHOR_NAME=%s\n + GIT_AUTHOR_EMAIL=%s\n + GIT_AUTHOR_DATE=%s\n; + struct strbuf author_name = STRBUF_INIT; + struct strbuf author_email = STRBUF_INIT; + struct strbuf author_date = STRBUF_INIT; + + sq_quote_buf(author_name, state-author_name.buf); + sq_quote_buf(author_email, state-author_email.buf); + sq_quote_buf(author_date, state-author_date.buf); As you use a separate author_name variable here, what gets sq-quoted that is in *state does not have to be strbuf. The code to read is the same story: +static int read_shell_var(struct strbuf *value, FILE *fp, const char *key) +{ + struct strbuf sb = STRBUF_INIT; + char *str; + + if (strbuf_getline(sb, fp, '\n')) + return -1; +... + strbuf_reset(value); + strbuf_addstr(value, str); + + strbuf_release(sb); + + return 0; +} As you use a separate sb strbuf variable here, there is no need for value to be pointing at a strbuf; it could be char ** that sb's contents is detached into. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html