Re: [PATCH/WIP v3 04/31] am: implement patch queue mechanism

2015-06-25 Thread Paul Tan
On Wed, Jun 24, 2015 at 10:59 PM, Johannes Schindelin
johannes.schinde...@gmx.de wrote:
 diff --git a/builtin/am.c b/builtin/am.c
 index dbc8836..af68c51 100644
 --- a/builtin/am.c
 +++ b/builtin/am.c
 @@ -6,6 +6,158 @@
  #include cache.h
  #include builtin.h
  #include exec_cmd.h
 +#include parse-options.h
 +#include dir.h
 +
 +struct am_state {
 + /* state directory path */
 + struct strbuf dir;
 +
 + /* current and last patch numbers, 1-indexed */
 + int cur;
 + int last;
 +};
 +
 +/**
 + * Initializes am_state with the default values.
 + */
 +static void am_state_init(struct am_state *state)
 +{
 + memset(state, 0, sizeof(*state));
 +
 + strbuf_init(state-dir, 0);
 +}

 With strbufs, we use the initializer STRBUF_INIT. How about using

 #define AM_STATE_INIT { STRBUF_INIT, 0, 0 }

 here?

Later in the patch series am_state_init() will also take into account
config settings when filling up the default values. e.g. see patches
25/31[1] or 31/31[2]. I think that is reasonable: the purpose of
am_state_init() is to initialize the am_state struct with the default
values, and the default values can be set by the user through the
config settings.

This means, though, that we can't use initializers without introducing
global variables.

[1] http://thread.gmane.org/gmane.comp.version-control.git/271967/focus=271972
[2] http://thread.gmane.org/gmane.comp.version-control.git/271967/focus=271972

 +/**
 + * Reads the contents of `file`. The third argument can be used to give a 
 hint
 + * about the file size, to avoid reallocs. Returns number of bytes read on
 + * success, -1 if the file does not exist. If trim is set, trailing 
 whitespace
 + * will be removed from the file contents.
 + */
 +static int read_state_file(struct strbuf *sb, const char *file,
 size_t hint, int trim)
 +{
 + strbuf_reset(sb);
 + if (strbuf_read_file(sb, file, hint) = 0) {
 + if (trim)
 + strbuf_trim(sb);
 +
 + return sb-len;
 + }
 +
 + if (errno == ENOENT)
 + return -1;
 +
 + die_errno(_(could not read '%s'), file);
 +}

 A couple of thoughts:

 - why not reuse the strbuf by making it a part of the am_state()? That way, 
 you can allocate, say, 1024 bytes (should be plenty enough for most of our 
 operations) and just reuse them in all of the functions. We will not make any 
 of this multi-threaded anyway, I don't think.

But too much usage of this temporary strbuf may lead to a situation
where one function calls another, and both functions write to the
strbuf and clobber its contents.

Besides, if we are talking about read_state_file(), it takes an
external strbuf, so it gives the caller the freedom to choose which
strbuf it uses (e.g. if it is stack allocated or in the am_state
struct). I think it's more flexible this way.

 - Given that we only read short files all the time, why not skip the hint 
 parameter? Especially if we reuse the strbuf, it should be good enough to 
 allocate a reasonable buffer first and then just assume that we do not have 
 to reallocate it all that often anyway.

Doh! Right, the hint parameter is quite useless, since in am_load() we
use the same strbuf anyway. (And strbuf_init() can set a hint as well)
I've removed it on my side.

 - Since we only read files from the state directory, why not pass the 
 basename as parameter? That way we can avoid calling `am_path()` explicitly 
 over and over again (and yours truly cannot forget to call `am_path()` in 
 future patches).

Makes sense. After all, this function is called read_STATE_file() ;-)

 - If you agree with these suggestions, the signature would become something 
 like

 static void read_state_file(struct am_state *state, const char *basename, int 
 trim);

So for now, my function signature is

static void read_state_file(struct strbuf *sb, const struct am_state
*state, const char *basename, int trim);

 +/**
 + * Remove the am_state directory.
 + */
 +static void am_destroy(const struct am_state *state)
 +{
 + struct strbuf sb = STRBUF_INIT;
 +
 + strbuf_addstr(sb, state-dir.buf);
 + remove_dir_recursively(sb, 0);
 + strbuf_release(sb);
 +}

 Given that `remove_dir_recursively()` has to reset the strbuf with the 
 directory's path to the original value before it returns (because it recurses 
 into itself, therefore the value *has* to be reset when returning), we can 
 just call

 remove_dir_recursively(state-dir, 0);

 and do not need another temporary strbuf.

Ah right. Although, state-dir is not an strbuf anymore on my side. As
Junio[3] rightfully noted, state-dir is not modified by the am_*()
API, so it's been changed to a char*. Which means an strbuf is
required to be passed to remove_dir_recursively();

 +/**
 + * Increments the patch pointer, and cleans am_state for the application of 
 the
 + * next patch.
 + */
 +static void am_next(struct am_state *state)
 +{
 + state-cur++;
 + write_file(am_path(state, next), 1, 

Re: [PATCH/WIP v3 04/31] am: implement patch queue mechanism

2015-06-24 Thread Johannes Schindelin
Hi Paul,

On 2015-06-18 13:25, Paul Tan wrote:

 diff --git a/builtin/am.c b/builtin/am.c
 index dbc8836..af68c51 100644
 --- a/builtin/am.c
 +++ b/builtin/am.c
 @@ -6,6 +6,158 @@
  #include cache.h
  #include builtin.h
  #include exec_cmd.h
 +#include parse-options.h
 +#include dir.h
 +
 +struct am_state {
 + /* state directory path */
 + struct strbuf dir;
 +
 + /* current and last patch numbers, 1-indexed */
 + int cur;
 + int last;
 +};
 +
 +/**
 + * Initializes am_state with the default values.
 + */
 +static void am_state_init(struct am_state *state)
 +{
 + memset(state, 0, sizeof(*state));
 +
 + strbuf_init(state-dir, 0);
 +}

With strbufs, we use the initializer STRBUF_INIT. How about using

#define AM_STATE_INIT { STRBUF_INIT, 0, 0 }

here?

 +/**
 + * Reads the contents of `file`. The third argument can be used to give a 
 hint
 + * about the file size, to avoid reallocs. Returns number of bytes read on
 + * success, -1 if the file does not exist. If trim is set, trailing 
 whitespace
 + * will be removed from the file contents.
 + */
 +static int read_state_file(struct strbuf *sb, const char *file,
 size_t hint, int trim)
 +{
 + strbuf_reset(sb);
 + if (strbuf_read_file(sb, file, hint) = 0) {
 + if (trim)
 + strbuf_trim(sb);
 +
 + return sb-len;
 + }
 +
 + if (errno == ENOENT)
 + return -1;
 +
 + die_errno(_(could not read '%s'), file);
 +}

A couple of thoughts:

- why not reuse the strbuf by making it a part of the am_state()? That way, you 
can allocate, say, 1024 bytes (should be plenty enough for most of our 
operations) and just reuse them in all of the functions. We will not make any 
of this multi-threaded anyway, I don't think.

- Given that we only read short files all the time, why not skip the hint 
parameter? Especially if we reuse the strbuf, it should be good enough to 
allocate a reasonable buffer first and then just assume that we do not have to 
reallocate it all that often anyway.

- Since we only read files from the state directory, why not pass the basename 
as parameter? That way we can avoid calling `am_path()` explicitly over and 
over again (and yours truly cannot forget to call `am_path()` in future 
patches).

- If you agree with these suggestions, the signature would become something like

static void read_state_file(struct am_state *state, const char *basename, int 
trim);

 +/**
 + * Remove the am_state directory.
 + */
 +static void am_destroy(const struct am_state *state)
 +{
 + struct strbuf sb = STRBUF_INIT;
 +
 + strbuf_addstr(sb, state-dir.buf);
 + remove_dir_recursively(sb, 0);
 + strbuf_release(sb);
 +}

Given that `remove_dir_recursively()` has to reset the strbuf with the 
directory's path to the original value before it returns (because it recurses 
into itself, therefore the value *has* to be reset when returning), we can just 
call

remove_dir_recursively(state-dir, 0);

and do not need another temporary strbuf.

 +/**
 + * Increments the patch pointer, and cleans am_state for the application of 
 the
 + * next patch.
 + */
 +static void am_next(struct am_state *state)
 +{
 + state-cur++;
 + write_file(am_path(state, next), 1, %d, state-cur);
 +}

Locking and re-checking the contents of next before writing the incremented 
value would probably be a little too paranoid...

(Just saying it out loud, the current code is fine by me.)

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/WIP v3 04/31] am: implement patch queue mechanism

2015-06-19 Thread Paul Tan
On Fri, Jun 19, 2015 at 4:43 AM, Junio C Hamano gits...@pobox.com wrote:
 Paul Tan pyoka...@gmail.com writes:

 diff --git a/builtin/am.c b/builtin/am.c
 index dbc8836..af68c51 100644
 --- a/builtin/am.c
 +++ b/builtin/am.c
 @@ -6,6 +6,158 @@
  #include cache.h
  #include builtin.h
  #include exec_cmd.h
 +#include parse-options.h
 +#include dir.h
 +
 +struct am_state {
 + /* state directory path */
 + struct strbuf dir;

 Is this a temporary variable you will append /patch, etc. to form
 a different string to use for fopen() etc., or do you use separate
 strbufs for things like that and this is only used to initialize
 them?

  - If the former then dir is a misnomer.

  - If the latter, then it probably does not have to be a strbuf;
rather, it should probably be a const char *.  Unless you pass
this directly to functions that take a strbuf, such as
remove_dir_recursively(), that is.

It's the latter, and yes it does not need to be an strbuf since it
will not usually change.

However, I don't think it should be a const char*, as it means that
the API user has to keep track of the lifetime of the dir string.
Note that we will have to git_pathdup() as git_path() returns a static
buffer:

char *dir = git_pathdup(rebase-apply);
struct am_state state;
am_state_init(state);
state.dir = dir;
...stuff...
if (foo) {
 ... separate code path ...
 am_state_release(state);
 free(dir);
 return 0;
}
... separate code path ...
am_state_release(state);
free(dir);
return 0;

Note the additional memory bookkeeping.

Instead, I would rather the am_state struct take ownership of the
dir string and free it in am_state_release(). So dir will be a
char*:

struct am_state state;
am_state_init(state, git_path(rebase-apply));
... stuff ...
if (foo) {
 ... separate code path ...
 am_state_release(state);
 return 0;
}
... separate code path ...
am_state_release(state);
return 0;

 +/**
 + * Release memory allocated by an am_state.
 + */

 Everybody else in this file seems to say Initializes, Returns,
 Reads, etc.  While I personally prefer to use imperative
 (i.e. give command to this function to release memory allocated),
 you would want to be consistent throughout the file; Releases
 memory would make it so.

OK

 +/**
 + * Setup a new am session for applying patches
 + */
 +static void am_setup(struct am_state *state)
 +{
 + if (mkdir(state-dir.buf, 0777)  0  errno != EEXIST)
 + die_errno(_(failed to create directory '%s'), 
 state-dir.buf);
 +
 + write_file(am_path(state, next), 1, %d, state-cur);
 +
 + write_file(am_path(state, last), 1, %d, state-last);

 These two lines are closely related pair; drop the blank in between.

 I am tno sure if write_file() is an appropriate thing to use,
 though.  What happens when you get interrupted after opening the
 file but before you manage to write and close?  Shouldn't we be
 doing the usual write to temp, close and then rename to final
 dance?  This comment applies to all the other use of write_file().

We could, although I don't think it would help us much. The state
directory is made up of many different files, so even if we made
modifications to single files atomic, there's no point if we terminate
unexpectedly in the middle of writing multiple files to the state
directory as the state, as a whole, is corrupted anyway.

So, we must be able to make updates to the entire state directory as a
single transaction, which is a more difficult problem...

Furthermore, while applying patches, we may face abnormal termination
between e.g. the patch application and commit, so I think it is safe
to say that if abnormal termination occurs, we will not be able to
reliably --resume or --skip anyway, and the user should just run
--abort to go back to the last known state.

However, it would be an improvement if we wrote the next and last
files last, as we use their presence to determine if we have an am
session in progress or not, so if we terminate in am_setup() we will
just have a stray directory. I will make this change in the next
iteration.

Regards,
Paul
--
To unsubscribe from this list: send the line unsubscribe git in


[PATCH/WIP v3 04/31] am: implement patch queue mechanism

2015-06-18 Thread Paul Tan
git-am applies a series of patches. If the process terminates
abnormally, we want to be able to resume applying the series of patches.
This requires the session state to be saved in a persistent location.

Implement the mechanism of a patch queue, represented by 2 integers --
the index of the current patch we are applying and the index of the last
patch, as well as its lifecycle through the following functions:

* am_setup(), which will set up the state directory
  $GIT_DIR/rebase-apply. As such, even if the process exits abnormally,
  the last-known state will still persist.

* am_load(), which is called if there is an am session in
  progress, to load the last known state from the state directory so we
  can resume applying patches.

* am_run(), which will do the actual patch application. After applying a
  patch, it calls am_next() to increment the current patch index. The
  logic for applying and committing a patch is not implemented yet.

* am_destroy(), which is finally called when we successfully applied all
  the patches in the queue, to clean up by removing the state directory
  and its contents.

Signed-off-by: Paul Tan pyoka...@gmail.com
---
 builtin/am.c | 168 +++
 1 file changed, 168 insertions(+)

diff --git a/builtin/am.c b/builtin/am.c
index dbc8836..af68c51 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -6,6 +6,158 @@
 #include cache.h
 #include builtin.h
 #include exec_cmd.h
+#include parse-options.h
+#include dir.h
+
+struct am_state {
+   /* state directory path */
+   struct strbuf dir;
+
+   /* current and last patch numbers, 1-indexed */
+   int cur;
+   int last;
+};
+
+/**
+ * Initializes am_state with the default values.
+ */
+static void am_state_init(struct am_state *state)
+{
+   memset(state, 0, sizeof(*state));
+
+   strbuf_init(state-dir, 0);
+}
+
+/**
+ * Release memory allocated by an am_state.
+ */
+static void am_state_release(struct am_state *state)
+{
+   strbuf_release(state-dir);
+}
+
+/**
+ * Returns path relative to the am_state directory.
+ */
+static inline const char *am_path(const struct am_state *state, const char 
*path)
+{
+   return mkpath(%s/%s, state-dir.buf, path);
+}
+
+/**
+ * Returns 1 if there is an am session in progress, 0 otherwise.
+ */
+static int am_in_progress(const struct am_state *state)
+{
+   struct stat st;
+
+   if (lstat(state-dir.buf, st)  0 || !S_ISDIR(st.st_mode))
+   return 0;
+   if (lstat(am_path(state, last), st) || !S_ISREG(st.st_mode))
+   return 0;
+   if (lstat(am_path(state, next), st) || !S_ISREG(st.st_mode))
+   return 0;
+   return 1;
+}
+
+/**
+ * Reads the contents of `file`. The third argument can be used to give a hint
+ * about the file size, to avoid reallocs. Returns number of bytes read on
+ * success, -1 if the file does not exist. If trim is set, trailing whitespace
+ * will be removed from the file contents.
+ */
+static int read_state_file(struct strbuf *sb, const char *file, size_t hint, 
int trim)
+{
+   strbuf_reset(sb);
+   if (strbuf_read_file(sb, file, hint) = 0) {
+   if (trim)
+   strbuf_trim(sb);
+
+   return sb-len;
+   }
+
+   if (errno == ENOENT)
+   return -1;
+
+   die_errno(_(could not read '%s'), file);
+}
+
+/**
+ * Loads state from disk.
+ */
+static void am_load(struct am_state *state)
+{
+   struct strbuf sb = STRBUF_INIT;
+
+   read_state_file(sb, am_path(state, next), 8, 1);
+   state-cur = strtol(sb.buf, NULL, 10);
+
+   read_state_file(sb, am_path(state, last), 8, 1);
+   state-last = strtol(sb.buf, NULL, 10);
+
+   strbuf_release(sb);
+}
+
+/**
+ * Remove the am_state directory.
+ */
+static void am_destroy(const struct am_state *state)
+{
+   struct strbuf sb = STRBUF_INIT;
+
+   strbuf_addstr(sb, state-dir.buf);
+   remove_dir_recursively(sb, 0);
+   strbuf_release(sb);
+}
+
+/**
+ * Setup a new am session for applying patches
+ */
+static void am_setup(struct am_state *state)
+{
+   if (mkdir(state-dir.buf, 0777)  0  errno != EEXIST)
+   die_errno(_(failed to create directory '%s'), state-dir.buf);
+
+   write_file(am_path(state, next), 1, %d, state-cur);
+
+   write_file(am_path(state, last), 1, %d, state-last);
+}
+
+/**
+ * Increments the patch pointer, and cleans am_state for the application of the
+ * next patch.
+ */
+static void am_next(struct am_state *state)
+{
+   state-cur++;
+   write_file(am_path(state, next), 1, %d, state-cur);
+}
+
+/**
+ * Applies all queued patches.
+ */
+static void am_run(struct am_state *state)
+{
+   while (state-cur = state-last) {
+
+   /* TODO: Patch application not implemented yet */
+
+   am_next(state);
+   }
+
+   am_destroy(state);
+}
+
+static struct am_state state;
+
+static const char * const am_usage[] = {

Re: [PATCH/WIP v3 04/31] am: implement patch queue mechanism

2015-06-18 Thread Stefan Beller
On Thu, Jun 18, 2015 at 4:25 AM, Paul Tan pyoka...@gmail.com wrote:
 +/**
 + * Reads the contents of `file`. The third argument can be used to give a 
 hint

I would avoid `third` here. (I needed to count twice to be sure which
argument you
were referring to, as I was confused.) Also how do you abstain from
giving a hint?
(0 or negative or MAX_INT?)

So maybe

/**
 * Reads the contents of `file`. Returns number of bytes read on success,
 * -1 if the file does not exist. If trim is set, trailing
whitespace will be removed
 * from the file contents. If `hint` is non-zero, it is used as a
hint for initial
 * allocation to avoid reallocs.
 */

 + * about the file size, to avoid reallocs. Returns number of bytes read on
 + * success, -1 if the file does not exist. If trim is set, trailing 
 whitespace
 + * will be removed from the file contents.
 + */
 +static int read_state_file(struct strbuf *sb, const char *file, size_t hint, 
 int trim)
 +{
 +   strbuf_reset(sb);
 +   if (strbuf_read_file(sb, file, hint) = 0) {
 +   if (trim)
 +   strbuf_trim(sb);
 +
 +   return sb-len;
 +   }
 +
 +   if (errno == ENOENT)
 +   return -1;
 +
 +   die_errno(_(could not read '%s'), file);
 +}
 +
 +/**
 + * Loads state from disk.
 + */
 +static void am_load(struct am_state *state)
 +{
 +   struct strbuf sb = STRBUF_INIT;
 +
 +   read_state_file(sb, am_path(state, next), 8, 1);
 +   state-cur = strtol(sb.buf, NULL, 10);
 +
 +   read_state_file(sb, am_path(state, last), 8, 1);
 +   state-last = strtol(sb.buf, NULL, 10);
 +
 +   strbuf_release(sb);
 +}
 +
 +/**
 + * Remove the am_state directory.
 + */
 +static void am_destroy(const struct am_state *state)
 +{
 +   struct strbuf sb = STRBUF_INIT;
 +
 +   strbuf_addstr(sb, state-dir.buf);
 +   remove_dir_recursively(sb, 0);
 +   strbuf_release(sb);
 +}
 +
 +/**
 + * Setup a new am session for applying patches
 + */
 +static void am_setup(struct am_state *state)
 +{
 +   if (mkdir(state-dir.buf, 0777)  0  errno != EEXIST)
 +   die_errno(_(failed to create directory '%s'), 
 state-dir.buf);
 +
 +   write_file(am_path(state, next), 1, %d, state-cur);
 +
 +   write_file(am_path(state, last), 1, %d, state-last);
 +}
 +
 +/**
 + * Increments the patch pointer, and cleans am_state for the application of 
 the
 + * next patch.
 + */
 +static void am_next(struct am_state *state)
 +{
 +   state-cur++;
 +   write_file(am_path(state, next), 1, %d, state-cur);
 +}
 +
 +/**
 + * Applies all queued patches.
 + */
 +static void am_run(struct am_state *state)
 +{
 +   while (state-cur = state-last) {
 +
 +   /* TODO: Patch application not implemented yet */
 +
 +   am_next(state);
 +   }
 +
 +   am_destroy(state);
 +}
 +
 +static struct am_state state;
 +
 +static const char * const am_usage[] = {
 +   N_(git am [options] [(mbox|Maildir)...]),
 +   NULL
 +};
 +
 +static struct option am_options[] = {
 +   OPT_END()
 +};

  int cmd_am(int argc, const char **argv, const char *prefix)
  {
 @@ -24,5 +176,21 @@ int cmd_am(int argc, const char **argv, const char 
 *prefix)
 setup_work_tree();
 }

 +   git_config(git_default_config, NULL);
 +
 +   am_state_init(state);
 +   strbuf_addstr(state.dir, git_path(rebase-apply));
 +
 +   argc = parse_options(argc, argv, prefix, am_options, am_usage, 0);
 +
 +   if (am_in_progress(state))
 +   am_load(state);
 +   else
 +   am_setup(state);
 +
 +   am_run(state);
 +
 +   am_state_release(state);
 +
 return 0;
  }
 --
 2.1.4

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH/WIP v3 04/31] am: implement patch queue mechanism

2015-06-18 Thread Junio C Hamano
Paul Tan pyoka...@gmail.com writes:

 diff --git a/builtin/am.c b/builtin/am.c
 index dbc8836..af68c51 100644
 --- a/builtin/am.c
 +++ b/builtin/am.c
 @@ -6,6 +6,158 @@
  #include cache.h
  #include builtin.h
  #include exec_cmd.h
 +#include parse-options.h
 +#include dir.h
 +
 +struct am_state {
 + /* state directory path */
 + struct strbuf dir;

Is this a temporary variable you will append /patch, etc. to form
a different string to use for fopen() etc., or do you use separate
strbufs for things like that and this is only used to initialize
them?

 - If the former then dir is a misnomer.

 - If the latter, then it probably does not have to be a strbuf;
   rather, it should probably be a const char *.  Unless you pass
   this directly to functions that take a strbuf, such as
   remove_dir_recursively(), that is.

 +/**
 + * Release memory allocated by an am_state.
 + */

Everybody else in this file seems to say Initializes, Returns,
Reads, etc.  While I personally prefer to use imperative
(i.e. give command to this function to release memory allocated),
you would want to be consistent throughout the file; Releases
memory would make it so.

 +/**
 + * Setup a new am session for applying patches
 + */
 +static void am_setup(struct am_state *state)
 +{
 + if (mkdir(state-dir.buf, 0777)  0  errno != EEXIST)
 + die_errno(_(failed to create directory '%s'), state-dir.buf);
 +
 + write_file(am_path(state, next), 1, %d, state-cur);
 +
 + write_file(am_path(state, last), 1, %d, state-last);

These two lines are closely related pair; drop the blank in between.

I am tno sure if write_file() is an appropriate thing to use,
though.  What happens when you get interrupted after opening the
file but before you manage to write and close?  Shouldn't we be
doing the usual write to temp, close and then rename to final
dance?  This comment applies to all the other use of write_file().

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html