Re: Re*: [PATCH v4 3/4] string-list: provide `string_list_appendf()`

2018-05-20 Thread Jeff King
On Mon, May 21, 2018 at 09:25:01AM +0900, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > I have a feeling that argv_array might be a better fit for the
> > purpose of keeping track of to_free[] strings in the context of this
> > series.  Moving away from string_list would allow us to sidestep the
> > storage ownership issues the API has, and we do not need the .util
> > thing string_list gives us (which is one distinct advantage string_list
> > has over argv_array, if the application needs that feature).
> >
> > We would need to make _pushf() and friends return "const char *" if
> > we go that route to make the resulting API more useful, though.
> 
> ... and redoing the 4/4 patch using argv_array_pushf() makes the
> result look like this, which does not look too bad.

Agreed.

-Peff


Re: Re*: [PATCH v4 3/4] string-list: provide `string_list_appendf()`

2018-05-20 Thread Jeff King
On Mon, May 21, 2018 at 09:01:05AM +0900, Junio C Hamano wrote:

> Jacob Keller  writes:
> 
> > On Sun, May 20, 2018 at 3:17 AM, Martin Ågren  
> > wrote:
> >> +/**
> >> + * Add formatted string to the end of `list`. This function ignores
> >> + * the value of `list->strdup_strings` and always appends a freshly
> >> + * allocated string, so you will probably not want to use it with
> >> + * `strdup_strings = 0`.
> >> + */
> >> +struct string_list_item *string_list_appendf(struct string_list *list,
> >> +const char *fmt, ...);
> >> +
> >
> > Would it make sense to verify that strdup_strings == 0? I guess we'd
> > have to use die or BUG(), but that would mean that the program could
> > crash..
> 
> It probably is clear to readers that any reasonable implementation
> of *_appendf() will create a new and unique string, as the point of
> *f() is to give a customized instantiation of fmt string for given
> parameters.  So it would be natural to expect that the storage that
> holds the generated string will belong to the list.  We _could_ make
> it honor strdup_strings and make one extra copy when strdup_strings
> is set to true, but the only effect such a stupid implementation has
> is to unnecessarily leak ;-)
> 
> I think it is probably OK to check and BUG() when strdup_strings==0,
> but such a check means that we now declare that a string list must
> either borrow all of its strings from elsewhere or own all of its
> strings itself, and mixture is not allowed.
>
> The (overly) flexible string_list API could be used to mix both
> borrowed and owned strings (an obvious strategy to do this without
> leaking and crashing is to use the .util field to mark which ones
> are owned and which ones are borrowed), so there might already be
> current users of the API that violates that rule.

IMHO such a mixed use is mildly crazy. At any rate, we would know that
anybody using appendf() would not have this problem, since we are just
introducing it now.

> I have a feeling that argv_array might be a better fit for the
> purpose of keeping track of to_free[] strings in the context of this
> series.  Moving away from string_list would allow us to sidestep the
> storage ownership issues the API has, and we do not need the .util
> thing string_list gives us (which is one distinct advantage string_list
> has over argv_array, if the application needs that feature).

I do agree that argv_array is generally a better fit for most cases.
Didn't we want to rename it to strarray or something? That's probably
too much yak-shaving for this series, though. :)

> We would need to make _pushf() and friends return "const char *" if
> we go that route to make the resulting API more useful, though.

This is the first time I think that's been suggested, but I agree it's
the only sensible thing for the functions to return.

> -- >8 --
> Subject: argv-array: return the pushed string from argv_push*()
> 
> Such an API change allows us to use an argv_array this way:
> 
>   struct argv_array to_free = ARGV_ARRAY_INIT;
> const char *msg;
> 
> if (some condition) {
>   msg = "constant string message";
>   ... other logic ...
>   } else {
>   msg = argv_pushf(_free, "format %s", var);
>   }
>   ... use "msg" ...
>   ... do other things ...
>   argv_clear(_free);
> 
> Note that argv_array_pushl() and argv_array_pushv() are used to push
> one or more strings with a single call, so we do not return any one
> of these strings from these two functions in order to reduce the
> chance to misuse the API.
> 
> Signed-off-by: Junio C Hamano 
> ---

Yup, this looks good to me.

-Peff


Re: Re*: [PATCH v4 3/4] string-list: provide `string_list_appendf()`

2018-05-20 Thread Junio C Hamano
Junio C Hamano  writes:

> I have a feeling that argv_array might be a better fit for the
> purpose of keeping track of to_free[] strings in the context of this
> series.  Moving away from string_list would allow us to sidestep the
> storage ownership issues the API has, and we do not need the .util
> thing string_list gives us (which is one distinct advantage string_list
> has over argv_array, if the application needs that feature).
>
> We would need to make _pushf() and friends return "const char *" if
> we go that route to make the resulting API more useful, though.

... and redoing the 4/4 patch using argv_array_pushf() makes the
result look like this, which does not look too bad.

-- >8 --
From: Junio C Hamano 
Subject: [PATCH] unpack_trees_options: keep track of owned messages with 
argv_array

Instead of the string_list API, which is overly flexible and require
callers to be careful about memory ownership issues, use the
argv_array API that always takes ownership to redo the earlier
commit.

Signed-off-by: Junio C Hamano 
---
 unpack-trees.c | 16 ++--
 unpack-trees.h |  4 ++--
 2 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 86046b987a..b28f0c6e9d 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1,5 +1,6 @@
 #define NO_THE_INDEX_COMPATIBILITY_MACROS
 #include "cache.h"
+#include "argv-array.h"
 #include "repository.h"
 #include "config.h"
 #include "dir.h"
@@ -103,11 +104,7 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
const char **msgs = opts->msgs;
const char *msg;
 
-   /*
-* As we add strings using `...appendf()`, this does not matter,
-* but when we clear the string list, we want them to be freed.
-*/
-   opts->msgs_to_free.strdup_strings = 1;
+   argv_array_init(>msgs_to_free);
 
if (!strcmp(cmd, "checkout"))
msg = advice_commit_before_merge
@@ -125,7 +122,7 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
  "Please commit your changes or stash them before you 
%s.")
  : _("Your local changes to the following files would be 
overwritten by %s:\n%%s");
msgs[ERROR_WOULD_OVERWRITE] = msgs[ERROR_NOT_UPTODATE_FILE] =
-   string_list_appendf(>msgs_to_free, msg, cmd, cmd)->string;
+   argv_array_pushf(>msgs_to_free, msg, cmd, cmd);
 
msgs[ERROR_NOT_UPTODATE_DIR] =
_("Updating the following directories would lose untracked 
files in them:\n%s");
@@ -146,7 +143,7 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
  "Please move or remove them before you %s.")
  : _("The following untracked working tree files would be 
removed by %s:\n%%s");
msgs[ERROR_WOULD_LOSE_UNTRACKED_REMOVED] =
-   string_list_appendf(>msgs_to_free, msg, cmd, cmd)->string;
+   argv_array_pushf(>msgs_to_free, msg, cmd, cmd);
 
if (!strcmp(cmd, "checkout"))
msg = advice_commit_before_merge
@@ -164,7 +161,7 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
  "Please move or remove them before you %s.")
  : _("The following untracked working tree files would be 
overwritten by %s:\n%%s");
msgs[ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN] =
-   string_list_appendf(>msgs_to_free, msg, cmd, cmd)->string;
+   argv_array_pushf(>msgs_to_free, msg, cmd, cmd);
 
/*
 * Special case: ERROR_BIND_OVERLAP refers to a pair of paths, we
@@ -189,8 +186,7 @@ void setup_unpack_trees_porcelain(struct 
unpack_trees_options *opts,
 
 void clear_unpack_trees_porcelain(struct unpack_trees_options *opts)
 {
-   string_list_clear(>msgs_to_free, 0);
-   memset(opts->msgs, 0, sizeof(opts->msgs));
+   argv_array_clear(>msgs_to_free);
 }
 
 static int do_add_entry(struct unpack_trees_options *o, struct cache_entry *ce,
diff --git a/unpack-trees.h b/unpack-trees.h
index 5a84123a40..c2b434c606 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -2,7 +2,7 @@
 #define UNPACK_TREES_H
 
 #include "tree-walk.h"
-#include "string-list.h"
+#include "argv-array.h"
 
 #define MAX_UNPACK_TREES 8
 
@@ -62,7 +62,7 @@ struct unpack_trees_options {
struct pathspec *pathspec;
merge_fn_t fn;
const char *msgs[NB_UNPACK_TREES_ERROR_TYPES];
-   struct string_list msgs_to_free;
+   struct argv_array msgs_to_free;
/*
 * Store error messages in an array, each case
 * corresponding to a error message type
-- 
2.17.0-582-gccdcbd54c4



Re*: [PATCH v4 3/4] string-list: provide `string_list_appendf()`

2018-05-20 Thread Junio C Hamano
Jacob Keller  writes:

> On Sun, May 20, 2018 at 3:17 AM, Martin Ågren  wrote:
>> +/**
>> + * Add formatted string to the end of `list`. This function ignores
>> + * the value of `list->strdup_strings` and always appends a freshly
>> + * allocated string, so you will probably not want to use it with
>> + * `strdup_strings = 0`.
>> + */
>> +struct string_list_item *string_list_appendf(struct string_list *list,
>> +const char *fmt, ...);
>> +
>
> Would it make sense to verify that strdup_strings == 0? I guess we'd
> have to use die or BUG(), but that would mean that the program could
> crash..

It probably is clear to readers that any reasonable implementation
of *_appendf() will create a new and unique string, as the point of
*f() is to give a customized instantiation of fmt string for given
parameters.  So it would be natural to expect that the storage that
holds the generated string will belong to the list.  We _could_ make
it honor strdup_strings and make one extra copy when strdup_strings
is set to true, but the only effect such a stupid implementation has
is to unnecessarily leak ;-)

I think it is probably OK to check and BUG() when strdup_strings==0,
but such a check means that we now declare that a string list must
either borrow all of its strings from elsewhere or own all of its
strings itself, and mixture is not allowed.

The (overly) flexible string_list API could be used to mix both
borrowed and owned strings (an obvious strategy to do this without
leaking and crashing is to use the .util field to mark which ones
are owned and which ones are borrowed), so there might already be
current users of the API that violates that rule.

I have a feeling that argv_array might be a better fit for the
purpose of keeping track of to_free[] strings in the context of this
series.  Moving away from string_list would allow us to sidestep the
storage ownership issues the API has, and we do not need the .util
thing string_list gives us (which is one distinct advantage string_list
has over argv_array, if the application needs that feature).

We would need to make _pushf() and friends return "const char *" if
we go that route to make the resulting API more useful, though.

-- >8 --
Subject: argv-array: return the pushed string from argv_push*()

Such an API change allows us to use an argv_array this way:

struct argv_array to_free = ARGV_ARRAY_INIT;
const char *msg;

if (some condition) {
msg = "constant string message";
... other logic ...
} else {
msg = argv_pushf(_free, "format %s", var);
}
... use "msg" ...
... do other things ...
argv_clear(_free);

Note that argv_array_pushl() and argv_array_pushv() are used to push
one or more strings with a single call, so we do not return any one
of these strings from these two functions in order to reduce the
chance to misuse the API.

Signed-off-by: Junio C Hamano 
---
 argv-array.c | 6 --
 argv-array.h | 4 ++--
 2 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/argv-array.c b/argv-array.c
index 5d370fa336..449dfc105a 100644
--- a/argv-array.c
+++ b/argv-array.c
@@ -21,12 +21,13 @@ static void argv_array_push_nodup(struct argv_array *array, 
const char *value)
array->argv[array->argc] = NULL;
 }
 
-void argv_array_push(struct argv_array *array, const char *value)
+const char *argv_array_push(struct argv_array *array, const char *value)
 {
argv_array_push_nodup(array, xstrdup(value));
+   return array->argv[array->argc - 1];
 }
 
-void argv_array_pushf(struct argv_array *array, const char *fmt, ...)
+const char *argv_array_pushf(struct argv_array *array, const char *fmt, ...)
 {
va_list ap;
struct strbuf v = STRBUF_INIT;
@@ -36,6 +37,7 @@ void argv_array_pushf(struct argv_array *array, const char 
*fmt, ...)
va_end(ap);
 
argv_array_push_nodup(array, strbuf_detach(, NULL));
+   return array->argv[array->argc - 1];
 }
 
 void argv_array_pushl(struct argv_array *array, ...)
diff --git a/argv-array.h b/argv-array.h
index 29056e49a1..715c93b246 100644
--- a/argv-array.h
+++ b/argv-array.h
@@ -12,9 +12,9 @@ struct argv_array {
 #define ARGV_ARRAY_INIT { empty_argv, 0, 0 }
 
 void argv_array_init(struct argv_array *);
-void argv_array_push(struct argv_array *, const char *);
+const char *argv_array_push(struct argv_array *, const char *);
 __attribute__((format (printf,2,3)))
-void argv_array_pushf(struct argv_array *, const char *fmt, ...);
+const char *argv_array_pushf(struct argv_array *, const char *fmt, ...);
 LAST_ARG_MUST_BE_NULL
 void argv_array_pushl(struct argv_array *, ...);
 void argv_array_pushv(struct argv_array *, const char **);


Re: [PATCH v4 3/4] string-list: provide `string_list_appendf()`

2018-05-20 Thread Jacob Keller
On Sun, May 20, 2018 at 3:17 AM, Martin Ågren  wrote:
> +/**
> + * Add formatted string to the end of `list`. This function ignores
> + * the value of `list->strdup_strings` and always appends a freshly
> + * allocated string, so you will probably not want to use it with
> + * `strdup_strings = 0`.
> + */
> +struct string_list_item *string_list_appendf(struct string_list *list,
> +const char *fmt, ...);
> +

Would it make sense to verify that strdup_strings == 0? I guess we'd
have to use die or BUG(), but that would mean that the program could
crash..

I doubt this could be verified at compilation time

Thanks,
Jake


[PATCH v4 3/4] string-list: provide `string_list_appendf()`

2018-05-20 Thread Martin Ågren
Add a function `string_list_appendf(list, fmt, ...)` to the string-list
API. The next commit will add a user.

This function naturally ignores the `strdup_strings`-setting and always
appends a freshly allocated string. Thus, using this function with
`strdup_strings = 0` risks making ownership unclear and leaking memory.
With `strdup_strings = 1` on the other hand, we can easily add formatted
strings without going through `string_list_append_nodup()` or playing
with `strdup_strings`.

Suggested-by: Jeff King 
Signed-off-by: Martin Ågren 
---
 string-list.h |  9 +
 string-list.c | 13 +
 2 files changed, 22 insertions(+)

diff --git a/string-list.h b/string-list.h
index ff8f6094a3..3a73b86ffa 100644
--- a/string-list.h
+++ b/string-list.h
@@ -208,6 +208,15 @@ void string_list_remove_duplicates(struct string_list 
*sorted_list, int free_uti
  */
 struct string_list_item *string_list_append(struct string_list *list, const 
char *string);
 
+/**
+ * Add formatted string to the end of `list`. This function ignores
+ * the value of `list->strdup_strings` and always appends a freshly
+ * allocated string, so you will probably not want to use it with
+ * `strdup_strings = 0`.
+ */
+struct string_list_item *string_list_appendf(struct string_list *list,
+const char *fmt, ...);
+
 /**
  * Like string_list_append(), except string is never copied.  When
  * list->strdup_strings is set, this function can be used to hand
diff --git a/string-list.c b/string-list.c
index a0cf0cfe88..b54d31c1cf 100644
--- a/string-list.c
+++ b/string-list.c
@@ -224,6 +224,19 @@ struct string_list_item *string_list_append(struct 
string_list *list,
list->strdup_strings ? xstrdup(string) : (char 
*)string);
 }
 
+struct string_list_item *string_list_appendf(struct string_list *list,
+const char *fmt, ...)
+{
+   struct string_list_item *retval;
+   va_list ap;
+
+   va_start(ap, fmt);
+   retval = string_list_append_nodup(list, xstrvfmt(fmt, ap));
+   va_end(ap);
+
+   return retval;
+}
+
 static int cmp_items(const void *a, const void *b, void *ctx)
 {
compare_strings_fn cmp = ctx;
-- 
2.17.0.840.g5d83f92caf