Re: [PATCH/WIP v3 04/31] am: implement patch queue mechanism
On Wed, Jun 24, 2015 at 10:59 PM, Johannes Schindelin johannes.schinde...@gmx.de wrote: diff --git a/builtin/am.c b/builtin/am.c index dbc8836..af68c51 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -6,6 +6,158 @@ #include cache.h #include builtin.h #include exec_cmd.h +#include parse-options.h +#include dir.h + +struct am_state { + /* state directory path */ + struct strbuf dir; + + /* current and last patch numbers, 1-indexed */ + int cur; + int last; +}; + +/** + * Initializes am_state with the default values. + */ +static void am_state_init(struct am_state *state) +{ + memset(state, 0, sizeof(*state)); + + strbuf_init(state-dir, 0); +} With strbufs, we use the initializer STRBUF_INIT. How about using #define AM_STATE_INIT { STRBUF_INIT, 0, 0 } here? Later in the patch series am_state_init() will also take into account config settings when filling up the default values. e.g. see patches 25/31[1] or 31/31[2]. I think that is reasonable: the purpose of am_state_init() is to initialize the am_state struct with the default values, and the default values can be set by the user through the config settings. This means, though, that we can't use initializers without introducing global variables. [1] http://thread.gmane.org/gmane.comp.version-control.git/271967/focus=271972 [2] http://thread.gmane.org/gmane.comp.version-control.git/271967/focus=271972 +/** + * Reads the contents of `file`. The third argument can be used to give a hint + * about the file size, to avoid reallocs. Returns number of bytes read on + * success, -1 if the file does not exist. If trim is set, trailing whitespace + * will be removed from the file contents. + */ +static int read_state_file(struct strbuf *sb, const char *file, size_t hint, int trim) +{ + strbuf_reset(sb); + if (strbuf_read_file(sb, file, hint) = 0) { + if (trim) + strbuf_trim(sb); + + return sb-len; + } + + if (errno == ENOENT) + return -1; + + die_errno(_(could not read '%s'), file); +} A couple of thoughts: - why not reuse the strbuf by making it a part of the am_state()? That way, you can allocate, say, 1024 bytes (should be plenty enough for most of our operations) and just reuse them in all of the functions. We will not make any of this multi-threaded anyway, I don't think. But too much usage of this temporary strbuf may lead to a situation where one function calls another, and both functions write to the strbuf and clobber its contents. Besides, if we are talking about read_state_file(), it takes an external strbuf, so it gives the caller the freedom to choose which strbuf it uses (e.g. if it is stack allocated or in the am_state struct). I think it's more flexible this way. - Given that we only read short files all the time, why not skip the hint parameter? Especially if we reuse the strbuf, it should be good enough to allocate a reasonable buffer first and then just assume that we do not have to reallocate it all that often anyway. Doh! Right, the hint parameter is quite useless, since in am_load() we use the same strbuf anyway. (And strbuf_init() can set a hint as well) I've removed it on my side. - Since we only read files from the state directory, why not pass the basename as parameter? That way we can avoid calling `am_path()` explicitly over and over again (and yours truly cannot forget to call `am_path()` in future patches). Makes sense. After all, this function is called read_STATE_file() ;-) - If you agree with these suggestions, the signature would become something like static void read_state_file(struct am_state *state, const char *basename, int trim); So for now, my function signature is static void read_state_file(struct strbuf *sb, const struct am_state *state, const char *basename, int trim); +/** + * Remove the am_state directory. + */ +static void am_destroy(const struct am_state *state) +{ + struct strbuf sb = STRBUF_INIT; + + strbuf_addstr(sb, state-dir.buf); + remove_dir_recursively(sb, 0); + strbuf_release(sb); +} Given that `remove_dir_recursively()` has to reset the strbuf with the directory's path to the original value before it returns (because it recurses into itself, therefore the value *has* to be reset when returning), we can just call remove_dir_recursively(state-dir, 0); and do not need another temporary strbuf. Ah right. Although, state-dir is not an strbuf anymore on my side. As Junio[3] rightfully noted, state-dir is not modified by the am_*() API, so it's been changed to a char*. Which means an strbuf is required to be passed to remove_dir_recursively(); +/** + * Increments the patch pointer, and cleans am_state for the application of the + * next patch. + */ +static void am_next(struct am_state *state) +{ + state-cur++; + write_file(am_path(state, next), 1,
Re: [PATCH/WIP v3 04/31] am: implement patch queue mechanism
Hi Paul, On 2015-06-18 13:25, Paul Tan wrote: diff --git a/builtin/am.c b/builtin/am.c index dbc8836..af68c51 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -6,6 +6,158 @@ #include cache.h #include builtin.h #include exec_cmd.h +#include parse-options.h +#include dir.h + +struct am_state { + /* state directory path */ + struct strbuf dir; + + /* current and last patch numbers, 1-indexed */ + int cur; + int last; +}; + +/** + * Initializes am_state with the default values. + */ +static void am_state_init(struct am_state *state) +{ + memset(state, 0, sizeof(*state)); + + strbuf_init(state-dir, 0); +} With strbufs, we use the initializer STRBUF_INIT. How about using #define AM_STATE_INIT { STRBUF_INIT, 0, 0 } here? +/** + * Reads the contents of `file`. The third argument can be used to give a hint + * about the file size, to avoid reallocs. Returns number of bytes read on + * success, -1 if the file does not exist. If trim is set, trailing whitespace + * will be removed from the file contents. + */ +static int read_state_file(struct strbuf *sb, const char *file, size_t hint, int trim) +{ + strbuf_reset(sb); + if (strbuf_read_file(sb, file, hint) = 0) { + if (trim) + strbuf_trim(sb); + + return sb-len; + } + + if (errno == ENOENT) + return -1; + + die_errno(_(could not read '%s'), file); +} A couple of thoughts: - why not reuse the strbuf by making it a part of the am_state()? That way, you can allocate, say, 1024 bytes (should be plenty enough for most of our operations) and just reuse them in all of the functions. We will not make any of this multi-threaded anyway, I don't think. - Given that we only read short files all the time, why not skip the hint parameter? Especially if we reuse the strbuf, it should be good enough to allocate a reasonable buffer first and then just assume that we do not have to reallocate it all that often anyway. - Since we only read files from the state directory, why not pass the basename as parameter? That way we can avoid calling `am_path()` explicitly over and over again (and yours truly cannot forget to call `am_path()` in future patches). - If you agree with these suggestions, the signature would become something like static void read_state_file(struct am_state *state, const char *basename, int trim); +/** + * Remove the am_state directory. + */ +static void am_destroy(const struct am_state *state) +{ + struct strbuf sb = STRBUF_INIT; + + strbuf_addstr(sb, state-dir.buf); + remove_dir_recursively(sb, 0); + strbuf_release(sb); +} Given that `remove_dir_recursively()` has to reset the strbuf with the directory's path to the original value before it returns (because it recurses into itself, therefore the value *has* to be reset when returning), we can just call remove_dir_recursively(state-dir, 0); and do not need another temporary strbuf. +/** + * Increments the patch pointer, and cleans am_state for the application of the + * next patch. + */ +static void am_next(struct am_state *state) +{ + state-cur++; + write_file(am_path(state, next), 1, %d, state-cur); +} Locking and re-checking the contents of next before writing the incremented value would probably be a little too paranoid... (Just saying it out loud, the current code is fine by me.) Ciao, Dscho -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/WIP v3 04/31] am: implement patch queue mechanism
On Fri, Jun 19, 2015 at 4:43 AM, Junio C Hamano gits...@pobox.com wrote: Paul Tan pyoka...@gmail.com writes: diff --git a/builtin/am.c b/builtin/am.c index dbc8836..af68c51 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -6,6 +6,158 @@ #include cache.h #include builtin.h #include exec_cmd.h +#include parse-options.h +#include dir.h + +struct am_state { + /* state directory path */ + struct strbuf dir; Is this a temporary variable you will append /patch, etc. to form a different string to use for fopen() etc., or do you use separate strbufs for things like that and this is only used to initialize them? - If the former then dir is a misnomer. - If the latter, then it probably does not have to be a strbuf; rather, it should probably be a const char *. Unless you pass this directly to functions that take a strbuf, such as remove_dir_recursively(), that is. It's the latter, and yes it does not need to be an strbuf since it will not usually change. However, I don't think it should be a const char*, as it means that the API user has to keep track of the lifetime of the dir string. Note that we will have to git_pathdup() as git_path() returns a static buffer: char *dir = git_pathdup(rebase-apply); struct am_state state; am_state_init(state); state.dir = dir; ...stuff... if (foo) { ... separate code path ... am_state_release(state); free(dir); return 0; } ... separate code path ... am_state_release(state); free(dir); return 0; Note the additional memory bookkeeping. Instead, I would rather the am_state struct take ownership of the dir string and free it in am_state_release(). So dir will be a char*: struct am_state state; am_state_init(state, git_path(rebase-apply)); ... stuff ... if (foo) { ... separate code path ... am_state_release(state); return 0; } ... separate code path ... am_state_release(state); return 0; +/** + * Release memory allocated by an am_state. + */ Everybody else in this file seems to say Initializes, Returns, Reads, etc. While I personally prefer to use imperative (i.e. give command to this function to release memory allocated), you would want to be consistent throughout the file; Releases memory would make it so. OK +/** + * Setup a new am session for applying patches + */ +static void am_setup(struct am_state *state) +{ + if (mkdir(state-dir.buf, 0777) 0 errno != EEXIST) + die_errno(_(failed to create directory '%s'), state-dir.buf); + + write_file(am_path(state, next), 1, %d, state-cur); + + write_file(am_path(state, last), 1, %d, state-last); These two lines are closely related pair; drop the blank in between. I am tno sure if write_file() is an appropriate thing to use, though. What happens when you get interrupted after opening the file but before you manage to write and close? Shouldn't we be doing the usual write to temp, close and then rename to final dance? This comment applies to all the other use of write_file(). We could, although I don't think it would help us much. The state directory is made up of many different files, so even if we made modifications to single files atomic, there's no point if we terminate unexpectedly in the middle of writing multiple files to the state directory as the state, as a whole, is corrupted anyway. So, we must be able to make updates to the entire state directory as a single transaction, which is a more difficult problem... Furthermore, while applying patches, we may face abnormal termination between e.g. the patch application and commit, so I think it is safe to say that if abnormal termination occurs, we will not be able to reliably --resume or --skip anyway, and the user should just run --abort to go back to the last known state. However, it would be an improvement if we wrote the next and last files last, as we use their presence to determine if we have an am session in progress or not, so if we terminate in am_setup() we will just have a stray directory. I will make this change in the next iteration. Regards, Paul -- To unsubscribe from this list: send the line unsubscribe git in
[PATCH/WIP v3 04/31] am: implement patch queue mechanism
git-am applies a series of patches. If the process terminates abnormally, we want to be able to resume applying the series of patches. This requires the session state to be saved in a persistent location. Implement the mechanism of a patch queue, represented by 2 integers -- the index of the current patch we are applying and the index of the last patch, as well as its lifecycle through the following functions: * am_setup(), which will set up the state directory $GIT_DIR/rebase-apply. As such, even if the process exits abnormally, the last-known state will still persist. * am_load(), which is called if there is an am session in progress, to load the last known state from the state directory so we can resume applying patches. * am_run(), which will do the actual patch application. After applying a patch, it calls am_next() to increment the current patch index. The logic for applying and committing a patch is not implemented yet. * am_destroy(), which is finally called when we successfully applied all the patches in the queue, to clean up by removing the state directory and its contents. Signed-off-by: Paul Tan pyoka...@gmail.com --- builtin/am.c | 168 +++ 1 file changed, 168 insertions(+) diff --git a/builtin/am.c b/builtin/am.c index dbc8836..af68c51 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -6,6 +6,158 @@ #include cache.h #include builtin.h #include exec_cmd.h +#include parse-options.h +#include dir.h + +struct am_state { + /* state directory path */ + struct strbuf dir; + + /* current and last patch numbers, 1-indexed */ + int cur; + int last; +}; + +/** + * Initializes am_state with the default values. + */ +static void am_state_init(struct am_state *state) +{ + memset(state, 0, sizeof(*state)); + + strbuf_init(state-dir, 0); +} + +/** + * Release memory allocated by an am_state. + */ +static void am_state_release(struct am_state *state) +{ + strbuf_release(state-dir); +} + +/** + * Returns path relative to the am_state directory. + */ +static inline const char *am_path(const struct am_state *state, const char *path) +{ + return mkpath(%s/%s, state-dir.buf, path); +} + +/** + * Returns 1 if there is an am session in progress, 0 otherwise. + */ +static int am_in_progress(const struct am_state *state) +{ + struct stat st; + + if (lstat(state-dir.buf, st) 0 || !S_ISDIR(st.st_mode)) + return 0; + if (lstat(am_path(state, last), st) || !S_ISREG(st.st_mode)) + return 0; + if (lstat(am_path(state, next), st) || !S_ISREG(st.st_mode)) + return 0; + return 1; +} + +/** + * Reads the contents of `file`. The third argument can be used to give a hint + * about the file size, to avoid reallocs. Returns number of bytes read on + * success, -1 if the file does not exist. If trim is set, trailing whitespace + * will be removed from the file contents. + */ +static int read_state_file(struct strbuf *sb, const char *file, size_t hint, int trim) +{ + strbuf_reset(sb); + if (strbuf_read_file(sb, file, hint) = 0) { + if (trim) + strbuf_trim(sb); + + return sb-len; + } + + if (errno == ENOENT) + return -1; + + die_errno(_(could not read '%s'), file); +} + +/** + * Loads state from disk. + */ +static void am_load(struct am_state *state) +{ + struct strbuf sb = STRBUF_INIT; + + read_state_file(sb, am_path(state, next), 8, 1); + state-cur = strtol(sb.buf, NULL, 10); + + read_state_file(sb, am_path(state, last), 8, 1); + state-last = strtol(sb.buf, NULL, 10); + + strbuf_release(sb); +} + +/** + * Remove the am_state directory. + */ +static void am_destroy(const struct am_state *state) +{ + struct strbuf sb = STRBUF_INIT; + + strbuf_addstr(sb, state-dir.buf); + remove_dir_recursively(sb, 0); + strbuf_release(sb); +} + +/** + * Setup a new am session for applying patches + */ +static void am_setup(struct am_state *state) +{ + if (mkdir(state-dir.buf, 0777) 0 errno != EEXIST) + die_errno(_(failed to create directory '%s'), state-dir.buf); + + write_file(am_path(state, next), 1, %d, state-cur); + + write_file(am_path(state, last), 1, %d, state-last); +} + +/** + * Increments the patch pointer, and cleans am_state for the application of the + * next patch. + */ +static void am_next(struct am_state *state) +{ + state-cur++; + write_file(am_path(state, next), 1, %d, state-cur); +} + +/** + * Applies all queued patches. + */ +static void am_run(struct am_state *state) +{ + while (state-cur = state-last) { + + /* TODO: Patch application not implemented yet */ + + am_next(state); + } + + am_destroy(state); +} + +static struct am_state state; + +static const char * const am_usage[] = {
Re: [PATCH/WIP v3 04/31] am: implement patch queue mechanism
On Thu, Jun 18, 2015 at 4:25 AM, Paul Tan pyoka...@gmail.com wrote: +/** + * Reads the contents of `file`. The third argument can be used to give a hint I would avoid `third` here. (I needed to count twice to be sure which argument you were referring to, as I was confused.) Also how do you abstain from giving a hint? (0 or negative or MAX_INT?) So maybe /** * Reads the contents of `file`. Returns number of bytes read on success, * -1 if the file does not exist. If trim is set, trailing whitespace will be removed * from the file contents. If `hint` is non-zero, it is used as a hint for initial * allocation to avoid reallocs. */ + * about the file size, to avoid reallocs. Returns number of bytes read on + * success, -1 if the file does not exist. If trim is set, trailing whitespace + * will be removed from the file contents. + */ +static int read_state_file(struct strbuf *sb, const char *file, size_t hint, int trim) +{ + strbuf_reset(sb); + if (strbuf_read_file(sb, file, hint) = 0) { + if (trim) + strbuf_trim(sb); + + return sb-len; + } + + if (errno == ENOENT) + return -1; + + die_errno(_(could not read '%s'), file); +} + +/** + * Loads state from disk. + */ +static void am_load(struct am_state *state) +{ + struct strbuf sb = STRBUF_INIT; + + read_state_file(sb, am_path(state, next), 8, 1); + state-cur = strtol(sb.buf, NULL, 10); + + read_state_file(sb, am_path(state, last), 8, 1); + state-last = strtol(sb.buf, NULL, 10); + + strbuf_release(sb); +} + +/** + * Remove the am_state directory. + */ +static void am_destroy(const struct am_state *state) +{ + struct strbuf sb = STRBUF_INIT; + + strbuf_addstr(sb, state-dir.buf); + remove_dir_recursively(sb, 0); + strbuf_release(sb); +} + +/** + * Setup a new am session for applying patches + */ +static void am_setup(struct am_state *state) +{ + if (mkdir(state-dir.buf, 0777) 0 errno != EEXIST) + die_errno(_(failed to create directory '%s'), state-dir.buf); + + write_file(am_path(state, next), 1, %d, state-cur); + + write_file(am_path(state, last), 1, %d, state-last); +} + +/** + * Increments the patch pointer, and cleans am_state for the application of the + * next patch. + */ +static void am_next(struct am_state *state) +{ + state-cur++; + write_file(am_path(state, next), 1, %d, state-cur); +} + +/** + * Applies all queued patches. + */ +static void am_run(struct am_state *state) +{ + while (state-cur = state-last) { + + /* TODO: Patch application not implemented yet */ + + am_next(state); + } + + am_destroy(state); +} + +static struct am_state state; + +static const char * const am_usage[] = { + N_(git am [options] [(mbox|Maildir)...]), + NULL +}; + +static struct option am_options[] = { + OPT_END() +}; int cmd_am(int argc, const char **argv, const char *prefix) { @@ -24,5 +176,21 @@ int cmd_am(int argc, const char **argv, const char *prefix) setup_work_tree(); } + git_config(git_default_config, NULL); + + am_state_init(state); + strbuf_addstr(state.dir, git_path(rebase-apply)); + + argc = parse_options(argc, argv, prefix, am_options, am_usage, 0); + + if (am_in_progress(state)) + am_load(state); + else + am_setup(state); + + am_run(state); + + am_state_release(state); + return 0; } -- 2.1.4 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/WIP v3 04/31] am: implement patch queue mechanism
Paul Tan pyoka...@gmail.com writes: diff --git a/builtin/am.c b/builtin/am.c index dbc8836..af68c51 100644 --- a/builtin/am.c +++ b/builtin/am.c @@ -6,6 +6,158 @@ #include cache.h #include builtin.h #include exec_cmd.h +#include parse-options.h +#include dir.h + +struct am_state { + /* state directory path */ + struct strbuf dir; Is this a temporary variable you will append /patch, etc. to form a different string to use for fopen() etc., or do you use separate strbufs for things like that and this is only used to initialize them? - If the former then dir is a misnomer. - If the latter, then it probably does not have to be a strbuf; rather, it should probably be a const char *. Unless you pass this directly to functions that take a strbuf, such as remove_dir_recursively(), that is. +/** + * Release memory allocated by an am_state. + */ Everybody else in this file seems to say Initializes, Returns, Reads, etc. While I personally prefer to use imperative (i.e. give command to this function to release memory allocated), you would want to be consistent throughout the file; Releases memory would make it so. +/** + * Setup a new am session for applying patches + */ +static void am_setup(struct am_state *state) +{ + if (mkdir(state-dir.buf, 0777) 0 errno != EEXIST) + die_errno(_(failed to create directory '%s'), state-dir.buf); + + write_file(am_path(state, next), 1, %d, state-cur); + + write_file(am_path(state, last), 1, %d, state-last); These two lines are closely related pair; drop the blank in between. I am tno sure if write_file() is an appropriate thing to use, though. What happens when you get interrupted after opening the file but before you manage to write and close? Shouldn't we be doing the usual write to temp, close and then rename to final dance? This comment applies to all the other use of write_file(). -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html