Potentially misleading color.* defaults explanation in git-config(1)

2016-09-15 Thread Anatoly Borodin
Hi All!

git-config(1) says:

   color.branch
   A boolean to enable/disable color in the output of git-branch(1).
   May be set to always, false (or never) or auto (or true), in which
   case colors are used only when the output is to a terminal.
   Defaults to false.

If the value false is the default, and neither color.branch nor
color.ui is set in any config file, one can expect that

(1) git branch

and

(2) git config color.branch false ; git branch

produce the same result. But only (2) produces colorless output; (1)
uses colors (which probably depend on the default value of color.ui).

The same story with color.diff and git-show, color.grep, etc.

Is it me being a non-native English speaker, or does this part really
need to be rewritten?

PS git version 2.9.3

-- 
Mit freundlichen Grüßen,
Anatoly Borodin



[PATCH 05/11] Resumable clone: add output parsing to connect.c

2016-09-15 Thread Kevin Wern
Add a method for the transport to call when parsing prime-clone output
from stdin. Suppress stderr when using git_connect with ssh, unless the
output is verbose.
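
A minimal caller sketch (not part of this patch) of how the new flag and
the parser could be combined. The service name "git-prime-clone" and the
overall wiring are assumptions; git_connect, get_alt_res_connect and
finish_connect come from connect.h as changed below, and struct
alt_resource is introduced later in the series:

#include "cache.h"
#include "connect.h"
#include "run-command.h"
#include "transport.h"

/* probe the remote for a prime-clone advertisement, quietly */
static const struct alt_resource *probe_prime_clone(const char *url)
{
	int fd[2];
	struct child_process *conn;
	const struct alt_resource *res;

	/* CONNECT_SUPPRESS_ERRORS keeps ssh noise off stderr */
	conn = git_connect(fd, url, "git-prime-clone",
			   CONNECT_SUPPRESS_ERRORS);
	res = get_alt_res_connect(fd[0], CONNECT_SUPPRESS_ERRORS);

	close(fd[0]);
	close(fd[1]);
	finish_connect(conn);
	return res;
}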

Signed-off-by: Kevin Wern 
---
 connect.c | 47 +++
 connect.h | 10 ++
 2 files changed, 53 insertions(+), 4 deletions(-)

diff --git a/connect.c b/connect.c
index 0478631..284de53 100644
--- a/connect.c
+++ b/connect.c
@@ -804,6 +804,10 @@ struct child_process *git_connect(int fd[2], const char 
*url,
}
argv_array_push(&conn->args, cmd.buf);
 
+   if (flags & CONNECT_SUPPRESS_ERRORS) {
+   conn->no_stderr = 1;
+   }
+
if (start_command(conn))
die("unable to fork");
 
@@ -831,3 +835,46 @@ int finish_connect(struct child_process *conn)
free(conn);
return code;
 }
+
+const struct alt_resource *const get_alt_res_connect(int fd, int flags)
+{
+   struct alt_resource *res = NULL;
+   const char *line;
+   char *url = NULL, *filetype = NULL;
+   char *error_string = NULL;
+
+   while ((line = packet_read_line_gentle(fd, NULL))) {
+   const char *space = strchr(line, ' ');
+
+   /*
+    * We will eventually support multiple resources, so always
+    * parse the whole message.
+    */
+   if ((filetype && url) || error_string) {
+   continue;
+   }
+   if (skip_prefix(line, "ERR ", &line) || !space ||
+   strchr(space + 1, ' ')) {
+   error_string = xstrdup(line);
+   continue;
+   }
+   filetype = xstrndup(line, (space - line));
+   url = xstrdup(space + 1);
+   }
+
+   if (filetype && url && !error_string){
+   res = xcalloc(1, sizeof(*res));
+   res->filetype = filetype;
+   res->url = url;
+   }
+   else {
+   if (!(flags & CONNECT_SUPPRESS_ERRORS)) {
+   if (error_string)
+   fprintf(stderr, "prime clone protocol error: "
+   "got '%s'\n", error_string);
+   else
+   fprintf(stderr, "did not get required "
+   "components for alternate resource\n");
+   }
+   }
+
+   return res;
+}
diff --git a/connect.h b/connect.h
index 01f14cd..966c0eb 100644
--- a/connect.h
+++ b/connect.h
@@ -1,10 +1,11 @@
 #ifndef CONNECT_H
 #define CONNECT_H
 
-#define CONNECT_VERBOSE   (1u << 0)
-#define CONNECT_DIAG_URL  (1u << 1)
-#define CONNECT_IPV4  (1u << 2)
-#define CONNECT_IPV6  (1u << 3)
+#define CONNECT_VERBOSE         (1u << 0)
+#define CONNECT_DIAG_URL        (1u << 1)
+#define CONNECT_IPV4            (1u << 2)
+#define CONNECT_IPV6            (1u << 3)
+#define CONNECT_SUPPRESS_ERRORS (1u << 4)
 extern struct child_process *git_connect(int fd[2], const char *url, const 
char *prog, int flags);
 extern int finish_connect(struct child_process *conn);
 extern int git_connection_is_socket(struct child_process *conn);
@@ -12,5 +13,6 @@ extern int server_supports(const char *feature);
 extern int parse_feature_request(const char *features, const char *feature);
 extern const char *server_feature_value(const char *feature, int *len_ret);
 extern int url_is_local_not_ssh(const char *url);
+const struct alt_resource *const get_alt_res_connect(int fd, int flags);
 
 #endif
-- 
2.7.4



[PATCH 06/11] Resumable clone: implement transport_prime_clone

2016-09-15 Thread Kevin Wern
Create transport_prime_clone API, as well as all internal methods.
Create representations of alt_resource and prime-clone path options.

The intention of get_alt_res_helper is solely to parse the output of
remote-curl, because transport-helper does not handle verbose options
or speak to the user verbosely. Therefore, all error parsing is done
in remote-curl, and any protocol breach between remote-curl and
transport-helper will be treated as a bug and result in death.
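
A sketch of the intended call pattern (not from this patch; the field
names url and filetype are taken from the assignments below, everything
else is an assumption):

#include "cache.h"
#include "transport.h"

static void report_alt_resource(struct transport *transport)
{
	const struct alt_resource *alt_res = transport_prime_clone(transport);

	if (!alt_res) {
		/* nothing advertised, or a silently tolerated protocol error */
		fprintf(stderr, "no alternate resource offered\n");
		return;
	}
	fprintf(stderr, "would prime clone from %s (%s)\n",
		alt_res->url, alt_res->filetype);
}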

Signed-off-by: Kevin Wern 
---
 transport-helper.c | 51 ++-
 transport.c| 44 
 transport.h| 20 
 3 files changed, 114 insertions(+), 1 deletion(-)

diff --git a/transport-helper.c b/transport-helper.c
index b934183..eb185d5 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -28,7 +28,8 @@ struct helper_data {
signed_tags : 1,
check_connectivity : 1,
no_disconnect_req : 1,
-   no_private_update : 1;
+   no_private_update : 1,
+   prime_clone : 1;
char *export_marks;
char *import_marks;
/* These go from remote name (as in "list") to private name */
@@ -180,6 +181,8 @@ static struct child_process *get_helper(struct transport 
*transport)
data->export = 1;
else if (!strcmp(capname, "check-connectivity"))
data->check_connectivity = 1;
+   else if (!strcmp(capname, "prime-clone"))
+   data->prime_clone = 1;
else if (!data->refspecs && skip_prefix(capname, "refspec ", 
)) {
ALLOC_GROW(refspecs,
   refspec_nr + 1,
@@ -248,6 +251,7 @@ static int disconnect_helper(struct transport *transport)
 }
 
 static const char *unsupported_options[] = {
+   TRANS_OPT_PRIMECLONE,
TRANS_OPT_UPLOADPACK,
TRANS_OPT_RECEIVEPACK,
TRANS_OPT_THIN,
@@ -1054,6 +1058,50 @@ static struct ref *get_refs_list(struct transport 
*transport, int for_push)
return ret;
 }
 
+static const struct alt_resource *const get_alt_res_helper(struct transport 
*transport)
+{
+   struct helper_data *data = transport->data;
+   char *url = NULL, *filetype = NULL;
+   struct alt_resource *ret = NULL;
+   struct strbuf out = STRBUF_INIT;
+   struct child_process *helper = get_helper(transport);
+   int err = 0;
+
+   helper = get_helper(transport);
+   write_constant(helper->in, "prime-clone\n");
+
+   while (!recvline(data, &out)) {
+   char *space = strchr(out.buf, ' ');
+
+   if (!*out.buf)
+   break;
+
+   if (starts_with(out.buf, "error")) {
+   err = 1;
+   continue;
+   }
+
+   if (!space || strchr(space + 1, ' '))
+   die("malformed alternate resource response: %s\n",
+   out.buf);
+
+   if ((filetype && url) || err)
+   continue;
+
+   filetype = xstrndup(out.buf, (space - out.buf));
+   url = xstrdup(space + 1);
+   }
+
+   if (filetype && url && !err) {
+   ret = xcalloc(1, sizeof(*ret));
+   ret->filetype = filetype;
+   ret->url = url;
+   }
+
+   strbuf_release(&out);
+   return ret;
+}
+
 int transport_helper_init(struct transport *transport, const char *name)
 {
struct helper_data *data = xcalloc(1, sizeof(*data));
@@ -1067,6 +1115,7 @@ int transport_helper_init(struct transport *transport, 
const char *name)
transport->data = data;
transport->set_option = set_helper_option;
transport->get_refs_list = get_refs_list;
+   transport->prime_clone = get_alt_res_helper;
transport->fetch = fetch;
transport->push_refs = push_refs;
transport->disconnect = release_helper;
diff --git a/transport.c b/transport.c
index 7bd3206..dd0d839 100644
--- a/transport.c
+++ b/transport.c
@@ -131,6 +131,9 @@ static int set_git_option(struct git_transport_options 
*opts,
} else if (!strcmp(name, TRANS_OPT_RECEIVEPACK)) {
opts->receivepack = value;
return 0;
+   } else if (!strcmp(name, TRANS_OPT_PRIMECLONE)) {
+   opts->primeclone = value;
+   return 0;
} else if (!strcmp(name, TRANS_OPT_THIN)) {
opts->thin = !!value;
return 0;
@@ -533,6 +536,42 @@ static int git_transport_push(struct transport *transport, 
struct ref *remote_re
return ret;
 }
 
+const struct alt_resource *const get_alt_res_via_connect(struct transport 
*transport)
+{
+   struct git_transport_data *data = transport->data;
+   const struct alt_resource *res = NULL;
+   int flags = 

[PATCH 07/11] Resumable clone: add resumable download to http/curl

2016-09-15 Thread Kevin Wern
Create a resumable download procedure and a progress display function.
The conversion from B to KB is done because otherwise the byte counts
for large repos (e.g. Linux) overflow when calculating the percentage.
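
For illustration (not part of the patch), a small standalone program
showing the overflow the message refers to, assuming the percentage is
computed in 32-bit unsigned arithmetic as n * 100 / total:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t total_bytes = 1200u * 1024 * 1024;	/* ~1.2 GB pack */
	uint32_t now_bytes = 600u * 1024 * 1024;	/* halfway through */
	uint32_t total_kb = total_bytes / 1024;
	uint32_t now_kb = now_bytes / 1024;

	/* wraps around: 600 MiB * 100 does not fit in 32 bits */
	printf("bytes: %u%%\n", now_bytes * 100 / total_bytes);
	/* correct: prints 50% */
	printf("KB:    %u%%\n", now_kb * 100 / total_kb);
	return 0;
}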

The download protocol includes the resource's URL, and the directory
the resource will be downloaded to. The url passed to remote-curl on
invocation does not matter (git clone will use the resource url
again here).

Signed-off-by: Kevin Wern 
---
 http.c| 86 ++-
 http.h|  7 -
 remote-curl.c | 27 +++
 3 files changed, 118 insertions(+), 2 deletions(-)

diff --git a/http.c b/http.c
index 1d5e3bb..93d6324 100644
--- a/http.c
+++ b/http.c
@@ -10,6 +10,8 @@
 #include "pkt-line.h"
 #include "gettext.h"
 #include "transport.h"
+#include "progress.h"
+#include "dir.h"
 
 #if LIBCURL_VERSION_NUM >= 0x070a08
 long int git_curl_ipresolve = CURL_IPRESOLVE_WHATEVER;
@@ -1136,7 +1138,10 @@ static int handle_curl_result(struct slot_results 
*results)
curl_easy_strerror(results->curl_result),
sizeof(curl_errorstr));
 #endif
-   return HTTP_ERROR;
+   if (results->http_code >= 400)
+   return HTTP_ERROR;
+   else
+   return HTTP_ERROR_RESUMABLE;
}
 }
 
@@ -1365,6 +1370,40 @@ static void http_opt_request_remainder(CURL *curl, off_t 
pos)
 #define HTTP_REQUEST_STRBUF 0
 #define HTTP_REQUEST_FILE   1
 
+static int bytes_to_rounded_kb(double bytes)
+{
+   return (int) (bytes + 512)/1024;
+}
+
+int progress_func(void *data, double total_to_download, double now_downloaded,
+ double total_to_upload, double now_uploaded)
+{
+   struct progress **progress = data;
+   int kilobytes = total_to_download >= 1024;
+
+   if (total_to_download <= 0.0) {
+   return 0;
+   }
+   if (kilobytes) {
+   now_downloaded = bytes_to_rounded_kb(now_downloaded);
+   total_to_download = bytes_to_rounded_kb(total_to_download);
+   }
+   if (!*progress && now_downloaded < total_to_download) {
+   if (total_to_download > 1024)
+   *progress = start_progress("Downloading (KB)",
+  total_to_download);
+   else
+   *progress = start_progress("Downloading (B)",
+  total_to_download);
+   }
+   display_progress(*progress, now_downloaded);
+   if (now_downloaded == total_to_download) {
+   stop_progress(progress);
+   }
+   return 0;
+}
+
+
 static int http_request(const char *url,
void *result, int target,
const struct http_get_options *options)
@@ -1373,6 +1412,7 @@ static int http_request(const char *url,
struct slot_results results;
struct curl_slist *headers = NULL;
struct strbuf buf = STRBUF_INIT;
+   struct progress *progress = NULL;
const char *accept_language;
int ret;
 
@@ -1389,6 +1429,16 @@ static int http_request(const char *url,
off_t posn = ftello(result);
curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
 fwrite);
+   if (options && options->progress) {
+   curl_easy_setopt(slot->curl,
+CURLOPT_NOPROGRESS, 0);
+   curl_easy_setopt(slot->curl,
+CURLOPT_PROGRESSFUNCTION,
+progress_func);
+   curl_easy_setopt(slot->curl,
+CURLOPT_PROGRESSDATA,
+&progress);
+   }
if (posn > 0)
http_opt_request_remainder(slot->curl, posn);
} else
@@ -1559,6 +1609,40 @@ cleanup:
return ret;
 }
 
+int http_download_primer(const char *url, const char *out_file)
+{
+   int ret = 0, try_count = HTTP_TRY_COUNT;
+   struct http_get_options options = {0};
+   options.progress = 1;
+
+   if (file_exists(out_file)) {
+   fprintf(stderr,
+   "File already downloaded: '%s', skipping...\n",
+   out_file);
+   return ret;
+   }
+
+   do {
+   if (try_count != HTTP_TRY_COUNT) {
+   fprintf(stderr, "Connection interrupted for some "
+   "reason, retrying (%d attempts left)\n",
+   try_count);
+   struct timeval time = {10, 0}; // 1s
+  

[PATCH 11/11] Resumable clone: implement primer logic in git-clone

2016-09-15 Thread Kevin Wern
Use transport_download_primer and transport_prime_clone in git clone.
This only supports a fully connected packfile.

transport_prime_clone and transport_download_primer are executed
completely independently of transport_(get|fetch)_remote_refs, et al.
transport_download_primer is executed based on the existence of an
alt_resource. The idea is that the "prime clone" execution should be
able to attempt retrieving an alternate resource without dying, as
opposed to depending on the result of upload pack's "capabilities" to
indicate whether or not the client can attempt it.

If a resumable resource is available, execute a codepath with the
following modular components:
- downloading resource to a specific directory
- using the resource (for pack, indexing and generating the bundle
  file)
- cleaning up the resource (if the download or use fails)
- cleaning up the resource (if the download or use succeeds)

If resume is interrupted on the client side, the alternate resource
info is written to the RESUMABLE file in the git directory.

On resume, the required info is extracted by parsing the created
config file, and that info is used to determine the work and git
directories. If these cannot be determined, the program exits.
The writing of the refspec and determination of the initial git
directories are skipped, along with transport_prime_clone.

The main purpose of this series of patches is to flesh out a codepath
for automatic resuming, manual resuming, and leaving a resumable
directory on exit--the logic for when to do these still needs more
work.

Signed-off-by: Kevin Wern 
---
 Documentation/git-clone.txt |  16 ++
 builtin/clone.c | 590 +---
 t/t9904-git-prime-clone.sh  | 181 ++
 3 files changed, 698 insertions(+), 89 deletions(-)
 create mode 100755 t/t9904-git-prime-clone.sh

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index b7c467a..5934bb6 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -16,6 +16,7 @@ SYNOPSIS
 [--depth <depth>] [--[no-]single-branch]
 [--recursive | --recurse-submodules] [--] <repository>
 [<directory>]
+'git clone --resume <directory>'
 
 DESCRIPTION
 ---
@@ -172,6 +173,12 @@ objects from the source repository into a pack in the 
cloned repository.
via ssh, this specifies a non-default path for the command
run on the other end.
 
+--prime-clone ::
+-p ::
+   When given and the repository to clone from is accessed
+   via ssh, this specifies a non-default path for the command
+   run on the other end.
+
 --template=::
Specify the directory from which templates will be used;
(See the "TEMPLATE DIRECTORY" section of linkgit:git-init[1].)
@@ -232,6 +239,15 @@ objects from the source repository into a pack in the 
cloned repository.
for `host.xz:foo/.git`).  Cloning into an existing directory
is only allowed if the directory is empty.
 
+--resume::
+   Resume a partially cloned repo in a "resumable" state. This
+   can only be specified with a single local directory
+   (<directory>). This is incompatible with all other options.
+
+<directory>::
+   The directory of the partial clone. This could be either the
+   work directory or the git directory.
+
 :git-clone: 1
 include::urls.txt[]
 
diff --git a/builtin/clone.c b/builtin/clone.c
index 9ac6c01..d9a13dc 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -8,7 +8,9 @@
  * Clone a repository into a different directory that does not yet exist.
  */
 
+#include "cache.h"
 #include "builtin.h"
+#include "bundle.h"
 #include "lockfile.h"
 #include "parse-options.h"
 #include "fetch-pack.h"
@@ -40,17 +42,20 @@ static const char * const builtin_clone_usage[] = {
 
 static int option_no_checkout, option_bare, option_mirror, 
option_single_branch = -1;
 static int option_local = -1, option_no_hardlinks, option_shared, 
option_recursive;
+static int option_resume;
 static char *option_template, *option_depth;
-static char *option_origin = NULL;
+static const char *option_origin = NULL;
 static char *option_branch = NULL;
 static const char *real_git_dir;
 static char *option_upload_pack = "git-upload-pack";
+static char *option_prime_clone = "git-prime-clone";
 static int option_verbosity;
 static int option_progress = -1;
 static enum transport_family family;
 static struct string_list option_config;
 static struct string_list option_reference;
 static int option_dissociate;
+static const struct alt_resource *alt_res = NULL;
 
 static struct option builtin_clone_options[] = {
OPT__VERBOSITY(&option_verbosity),
@@ -85,10 +90,14 @@ static struct option builtin_clone_options[] = {
   N_("checkout  instead of the remote's HEAD")),
OPT_STRING('u', "upload-pack", _upload_pack, N_("path"),
   N_("path to git-upload-pack on the remote")),
+   OPT_STRING('p', "prime-clone", _prime_clone, N_("path"),
+  

[PATCH 09/11] path: add resumable marker

2016-09-15 Thread Kevin Wern
Create a function to get the gitdir file RESUMABLE.
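
A minimal usage sketch (not from this patch): record the alternate
resource in the new RESUMABLE file so a later resume can find it. The
two-line file format here is an assumption; only git_path_resumable()
comes from the change below.

#include "cache.h"

static void save_resumable_info(const char *url, const char *filetype)
{
	FILE *fp = fopen(git_path_resumable(), "w");

	if (!fp)
		die_errno("could not write '%s'", git_path_resumable());
	fprintf(fp, "%s\n%s\n", filetype, url);
	fclose(fp);
}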

Signed-off-by: Kevin Wern 
---
 cache.h | 1 +
 path.c  | 1 +
 2 files changed, 2 insertions(+)

diff --git a/cache.h b/cache.h
index d7ff46e..1f4117c 100644
--- a/cache.h
+++ b/cache.h
@@ -811,6 +811,7 @@ const char *git_path_merge_mode(void);
 const char *git_path_merge_head(void);
 const char *git_path_fetch_head(void);
 const char *git_path_shallow(void);
+const char *git_path_resumable(void);
 
 /*
  * Return the name of the file in the local object database that would
diff --git a/path.c b/path.c
index 8b7e168..9360ed9 100644
--- a/path.c
+++ b/path.c
@@ -1201,4 +1201,5 @@ GIT_PATH_FUNC(git_path_merge_rr, "MERGE_RR")
 GIT_PATH_FUNC(git_path_merge_mode, "MERGE_MODE")
 GIT_PATH_FUNC(git_path_merge_head, "MERGE_HEAD")
 GIT_PATH_FUNC(git_path_fetch_head, "FETCH_HEAD")
+GIT_PATH_FUNC(git_path_resumable, "RESUMABLE")
 GIT_PATH_FUNC(git_path_shallow, "shallow")
-- 
2.7.4



[PATCH 08/11] Resumable clone: create transport_download_primer

2016-09-15 Thread Kevin Wern
Create the function transport_download_primer and the components
to invoke remote-curl and pass commands to it.
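
A minimal caller sketch (not from this patch) of the intended shape of
the API; the helper name fetch_primer and the warning are assumptions,
the declarations come from the transport.h change below:

#include "cache.h"
#include "transport.h"

static const char *fetch_primer(struct transport *transport,
				const struct alt_resource *alt_res,
				const char *base_dir)
{
	const char *downloaded;

	if (!alt_res)
		return NULL;	/* nothing to prime from */
	downloaded = transport_download_primer(transport, alt_res, base_dir);
	if (!downloaded)
		warning("could not download primer from %s", alt_res->url);
	return downloaded;	/* path reported by the helper, or NULL */
}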

Signed-off-by: Kevin Wern 
---
 transport-helper.c | 24 
 transport.c|  9 +
 transport.h|  7 +++
 3 files changed, 40 insertions(+)

diff --git a/transport-helper.c b/transport-helper.c
index eb185d5..2ff96ef 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -29,6 +29,7 @@ struct helper_data {
check_connectivity : 1,
no_disconnect_req : 1,
no_private_update : 1,
+   download_primer : 1,
prime_clone : 1;
char *export_marks;
char *import_marks;
@@ -183,6 +184,8 @@ static struct child_process *get_helper(struct transport 
*transport)
data->check_connectivity = 1;
else if (!strcmp(capname, "prime-clone"))
data->prime_clone = 1;
+   else if (!strcmp(capname, "download-primer"))
+   data->download_primer = 1;
else if (!data->refspecs && skip_prefix(capname, "refspec ", 
)) {
ALLOC_GROW(refspecs,
   refspec_nr + 1,
@@ -1058,6 +1061,26 @@ static struct ref *get_refs_list(struct transport 
*transport, int for_push)
return ret;
 }
 
+static const char *download_primer(struct transport *transport,
+  const struct alt_resource *res,
+  const char *base_path)
+{
+   struct helper_data *data = transport->data;
+   struct child_process *helper;
+   struct strbuf buf = STRBUF_INIT, out = STRBUF_INIT;
+   char *ret = NULL;
+
+   helper = get_helper(transport);
+
+   strbuf_addf(, "download-primer %s %s\n", res->url, base_path);
+   sendline(data, );
+   recvline(data, );
+   strbuf_release();
+   if (out.len > 0)
+   ret = strbuf_detach(, NULL);
+   return ret;
+}
+
 static const struct alt_resource *const get_alt_res_helper(struct transport 
*transport)
 {
struct helper_data *data = transport->data;
@@ -1115,6 +1138,7 @@ int transport_helper_init(struct transport *transport, 
const char *name)
transport->data = data;
transport->set_option = set_helper_option;
transport->get_refs_list = get_refs_list;
+   transport->download_primer = download_primer;
transport->prime_clone = get_alt_res_helper;
transport->fetch = fetch;
transport->push_refs = push_refs;
diff --git a/transport.c b/transport.c
index dd0d839..3b33029 100644
--- a/transport.c
+++ b/transport.c
@@ -572,6 +572,15 @@ const struct alt_resource *const 
transport_prime_clone(struct transport *transpo
return transport->alt_res;
 }
 
+const char *transport_download_primer(struct transport *transport,
+ const struct alt_resource *alt_res,
+ const char *base_dir)
+{
+   if (transport->download_primer)
+   return transport->download_primer(transport, alt_res, base_dir);
+   return NULL;
+}
+
 static int connect_git(struct transport *transport, const char *name,
   const char *executable, int fd[2])
 {
diff --git a/transport.h b/transport.h
index 2bb6963..1484d6d 100644
--- a/transport.h
+++ b/transport.h
@@ -83,6 +83,10 @@ struct transport {
 **/
const struct alt_resource *const (*prime_clone)(struct transport 
*transport);
 
+   const char *(*download_primer)(struct transport *transport,
+   const struct alt_resource *alt_res,
+   const char *base_path);
+
/**
 * Fetch the objects for the given refs. Note that this gets
 * an array, and should ignore the list structure.
@@ -228,6 +232,9 @@ int transport_push(struct transport *connection,
 
 const struct ref *transport_get_remote_refs(struct transport *transport);
 const struct alt_resource *const transport_prime_clone(struct transport 
*transport);
+const char *transport_download_primer(struct transport *transport,
+   const struct alt_resource *alt_res,
+   const char *base_path);
 
 int transport_fetch_refs(struct transport *transport, struct ref *refs);
 void transport_unlock_pack(struct transport *transport);
-- 
2.7.4



[PATCH 10/11] run command: add RUN_COMMAND_NO_STDOUT

2016-09-15 Thread Kevin Wern
Add option RUN_COMMAND_NO_STDOUT, which sets no_stdout on a child
process.

This will be used by git clone when calling index-pack on a downloaded
packfile.
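
A minimal sketch of that intended use (not from this patch; the
index-pack arguments are assumptions, and RUN_GIT_CMD is the existing
flag for running a git command):

#include "cache.h"
#include "run-command.h"

static int index_downloaded_pack(const char *pack_path)
{
	/* run "git index-pack <pack>" without letting its stdout through */
	const char *argv[] = { "index-pack", pack_path, NULL };

	return run_command_v_opt(argv, RUN_GIT_CMD | RUN_COMMAND_NO_STDOUT);
}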

Signed-off-by: Kevin Wern 
---
 run-command.c | 1 +
 run-command.h | 1 +
 2 files changed, 2 insertions(+)

diff --git a/run-command.c b/run-command.c
index 863dad5..c4f82f9 100644
--- a/run-command.c
+++ b/run-command.c
@@ -574,6 +574,7 @@ int run_command_v_opt_cd_env(const char **argv, int opt, 
const char *dir, const
cmd.silent_exec_failure = opt & RUN_SILENT_EXEC_FAILURE ? 1 : 0;
cmd.use_shell = opt & RUN_USING_SHELL ? 1 : 0;
cmd.clean_on_exit = opt & RUN_CLEAN_ON_EXIT ? 1 : 0;
+   cmd.no_stdout = opt & RUN_COMMAND_NO_STDOUT ? 1 : 0;
cmd.dir = dir;
cmd.env = env;
return run_command(&cmd);
diff --git a/run-command.h b/run-command.h
index 42917e8..2d2c871 100644
--- a/run-command.h
+++ b/run-command.h
@@ -70,6 +70,7 @@ extern int run_hook_ve(const char *const *env, const char 
*name, va_list args);
 #define RUN_SILENT_EXEC_FAILURE 8
 #define RUN_USING_SHELL 16
 #define RUN_CLEAN_ON_EXIT 32
+#define RUN_COMMAND_NO_STDOUT 64
 int run_command_v_opt(const char **argv, int opt);
 
 /*
-- 
2.7.4



[PATCH 04/11] Resumable clone: add prime-clone to remote-curl

2016-09-15 Thread Kevin Wern
Add a function and interface to handle prime-clone input, extracting
the functionality shared with discover_refs into a new function,
request_service, and reusing it.

Because part of our goal is for prime_clone to recover from errors,
HTTP errors are only optionally printed to screen and never cause
death in this case.
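
A sketch of the resulting pattern (not from this patch; the helper name
report_or_die is hypothetical, while http_handle_result and the
HTTP_ERROR_GENTLE flag come from the change below):

static void report_or_die(int http_ret, int flags)
{
	char *err = http_handle_result(http_ret);

	if (!err)
		return;			/* HTTP_OK */
	if (!(flags & HTTP_ERROR_GENTLE))
		die("%s", err);		/* hard failure, as before */
	if (options.verbosity > 1)
		fprintf(stderr, "%s\n", err);
	free(err);
}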

Signed-off-by: Kevin Wern 
---
 remote-curl.c | 165 ++
 1 file changed, 121 insertions(+), 44 deletions(-)

diff --git a/remote-curl.c b/remote-curl.c
index 15e48e2..8ebb587 100644
--- a/remote-curl.c
+++ b/remote-curl.c
@@ -13,6 +13,8 @@
 #include "sha1-array.h"
 #include "send-pack.h"
 
+#define HTTP_ERROR_GENTLE (1u << 0)
+
 static struct remote *remote;
 /* always ends with a trailing slash */
 static struct strbuf url = STRBUF_INIT;
@@ -244,7 +246,31 @@ static int show_http_message(struct strbuf *type, struct 
strbuf *charset,
return 0;
 }
 
-static struct discovery *discover_refs(const char *service, int for_push)
+static char *http_handle_result(int http_return)
+{
+   struct strbuf error = STRBUF_INIT;
+
+   switch (http_return) {
+   case HTTP_OK:
+   return NULL;
+   case HTTP_MISSING_TARGET:
+   strbuf_addf(&error, "repository '%s' not found", url.buf);
+   break;
+   case HTTP_NOAUTH:
+   strbuf_addf(&error, "Authentication failed for '%s'",
+   url.buf);
+   break;
+   default:
+   strbuf_addf(&error, "unable to access '%s': %s", url.buf,
+   curl_errorstr);
+   break;
+   }
+
+   return strbuf_detach(&error, NULL);
+}
+
+static int request_service(char const *const service, char **buffer_full,
+   char **buffer_msg, size_t *buffer_len, int flags)
 {
struct strbuf exp = STRBUF_INIT;
struct strbuf type = STRBUF_INIT;
@@ -252,13 +278,9 @@ static struct discovery *discover_refs(const char 
*service, int for_push)
struct strbuf buffer = STRBUF_INIT;
struct strbuf refs_url = STRBUF_INIT;
struct strbuf effective_url = STRBUF_INIT;
-   struct discovery *last = last_discovery;
-   int http_ret, maybe_smart = 0;
-   struct http_get_options options;
-
-   if (last && !strcmp(service, last->service))
-   return last;
-   free_discovery(last);
+   int http_ret, maybe_smart = 0, ran_smart = 0;
+   struct http_get_options get_options;
+   const char *error_string;
 
strbuf_addf(_url, "%sinfo/refs", url.buf);
if ((starts_with(url.buf, "http://;) || starts_with(url.buf, 
"https://;)) &&
@@ -271,45 +293,41 @@ static struct discovery *discover_refs(const char 
*service, int for_push)
strbuf_addf(_url, "service=%s", service);
}
 
-   memset(&options, 0, sizeof(options));
-   options.content_type = &type;
-   options.charset = &charset;
-   options.effective_url = &effective_url;
-   options.base_url = &url;
-   options.no_cache = 1;
-   options.keep_error = 1;
-
-   http_ret = http_get_strbuf(refs_url.buf, &buffer, &options);
-   switch (http_ret) {
-   case HTTP_OK:
-   break;
-   case HTTP_MISSING_TARGET:
-   show_http_message(&type, &charset, &buffer);
-   die("repository '%s' not found", url.buf);
-   case HTTP_NOAUTH:
-   show_http_message(&type, &charset, &buffer);
-   die("Authentication failed for '%s'", url.buf);
-   default:
-   show_http_message(&type, &charset, &buffer);
-   die("unable to access '%s': %s", url.buf, curl_errorstr);
+   memset(&get_options, 0, sizeof(get_options));
+   get_options.content_type = &type;
+   get_options.charset = &charset;
+   get_options.effective_url = &effective_url;
+   get_options.base_url = &url;
+   get_options.no_cache = 1;
+   get_options.keep_error = 1;
+
+   http_ret = http_get_strbuf(refs_url.buf, &buffer, &get_options);
+   error_string = http_handle_result(http_ret);
+   if (error_string) {
+   if (!(flags & HTTP_ERROR_GENTLE)) {
+   show_http_message(&type, &charset, &buffer);
+   die("%s", error_string);
+   }
+   else if (options.verbosity > 1) {
+   show_http_message(&type, &charset, &buffer);
+   fprintf(stderr, "%s\n", error_string);
+   }
}
 
-   last = xcalloc(1, sizeof(*last_discovery));
-   last->service = service;
-   last->buf_alloc = strbuf_detach(&buffer, &last->len);
-   last->buf = last->buf_alloc;
+   *buffer_full = strbuf_detach(&buffer, buffer_len);
+   *buffer_msg = *buffer_full;
 
strbuf_addf(&exp, "application/x-%s-advertisement", service);
if (maybe_smart &&
-   (5 <= last->len && last->buf[4] == '#') &&
-   !strbuf_cmp(&exp, &type)) {
+   (5 <= *buffer_len && (*buffer_msg)[4] == '#') &&
+   !strbuf_cmp(&exp, &type) && http_ret == HTTP_OK) {
char *line;
 
/*
 * smart HTTP 

[PATCH 01/11] Resumable clone: create service git-prime-clone

2016-09-15 Thread Kevin Wern
Create git-prime-clone, a program to be executed on the server that
returns the location and type of static resource to download before
performing the rest of a clone.

Additionally, as this executable's location will be configurable (see:
upload-pack and receive-pack), add the program to
BINDIR_PROGRAMS_NEED_X, in addition to the usual builtin places. Add
the git-prime-clone executable to .gitignore as well.

Signed-off-by: Kevin Wern 
---
 .gitignore|  1 +
 Documentation/git-prime-clone.txt | 39 
 Makefile  |  2 +
 builtin.h |  1 +
 builtin/prime-clone.c | 77 +++
 git.c |  1 +
 6 files changed, 121 insertions(+)
 create mode 100644 Documentation/git-prime-clone.txt
 create mode 100644 builtin/prime-clone.c

diff --git a/.gitignore b/.gitignore
index 5087ce1..bfea25c 100644
--- a/.gitignore
+++ b/.gitignore
@@ -106,6 +106,7 @@
 /git-pack-refs
 /git-parse-remote
 /git-patch-id
+/git-prime-clone
 /git-prune
 /git-prune-packed
 /git-pull
diff --git a/Documentation/git-prime-clone.txt 
b/Documentation/git-prime-clone.txt
new file mode 100644
index 000..fc5917d
--- /dev/null
+++ b/Documentation/git-prime-clone.txt
@@ -0,0 +1,39 @@
+git-prime-clone(1)
+
+
+NAME
+
+git-prime-clone - Get the location of an alternate resource
+to fetch before clone
+
+
+SYNOPSIS
+
+[verse]
+'git prime-clone' [--strict] <directory>
+
+DESCRIPTION
+---
+
+Outputs the resource, if configured to do so. Otherwise, returns
+nothing (packet flush ).
+
+CONFIGURE
+-
+
+primeclone.url::
+   The full url of the resource (e.g.
+   http://examplehost/pack-$NAME.pack).
+
+primeclone.filetype::
+   The type of the resource (e.g. pack).
+
+primeclone.enabled::
+   When 'false', git-prime-clone will return an empty response,
+   regardless of what the rest of the configuration specifies;
+   otherwise, it will return the configured response. Is 'true'
+   by default.
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index 24bef8d..f2564ec 100644
--- a/Makefile
+++ b/Makefile
@@ -648,6 +648,7 @@ OTHER_PROGRAMS = git$X
 # what test wrappers are needed and 'install' will install, in bindir
 BINDIR_PROGRAMS_NEED_X += git
 BINDIR_PROGRAMS_NEED_X += git-upload-pack
+BINDIR_PROGRAMS_NEED_X += git-prime-clone
 BINDIR_PROGRAMS_NEED_X += git-receive-pack
 BINDIR_PROGRAMS_NEED_X += git-upload-archive
 BINDIR_PROGRAMS_NEED_X += git-shell
@@ -904,6 +905,7 @@ BUILTIN_OBJS += builtin/pack-objects.o
 BUILTIN_OBJS += builtin/pack-redundant.o
 BUILTIN_OBJS += builtin/pack-refs.o
 BUILTIN_OBJS += builtin/patch-id.o
+BUILTIN_OBJS += builtin/prime-clone.o
 BUILTIN_OBJS += builtin/prune-packed.o
 BUILTIN_OBJS += builtin/prune.o
 BUILTIN_OBJS += builtin/pull.o
diff --git a/builtin.h b/builtin.h
index 6b95006..c9e2254 100644
--- a/builtin.h
+++ b/builtin.h
@@ -97,6 +97,7 @@ extern int cmd_notes(int argc, const char **argv, const char 
*prefix);
 extern int cmd_pack_objects(int argc, const char **argv, const char *prefix);
 extern int cmd_pack_redundant(int argc, const char **argv, const char *prefix);
 extern int cmd_patch_id(int argc, const char **argv, const char *prefix);
+extern int cmd_prime_clone(int argc, const char **argv, const char *prefix);
 extern int cmd_prune(int argc, const char **argv, const char *prefix);
 extern int cmd_prune_packed(int argc, const char **argv, const char *prefix);
 extern int cmd_pull(int argc, const char **argv, const char *prefix);
diff --git a/builtin/prime-clone.c b/builtin/prime-clone.c
new file mode 100644
index 000..ce914d3
--- /dev/null
+++ b/builtin/prime-clone.c
@@ -0,0 +1,77 @@
+#include "cache.h"
+#include "parse-options.h"
+#include "pkt-line.h"
+
+static char const * const prime_clone_usage[] = {
+   N_("git prime-clone [--strict] "),
+   NULL
+};
+
+static unsigned int enabled = 1;
+static const char *url = NULL, *filetype = NULL;
+static int strict;
+
+static struct option prime_clone_options[] = {
+   OPT_BOOL(0, "strict", , N_("Do not attempt /.git if  "
+ "is not a git directory")),
+   OPT_END(),
+};
+
+static void prime_clone(void)
+{
+   if (!enabled) {
+   fprintf(stderr, _("prime-clone not enabled\n"));
+   }
+   else if (url && filetype){
+   packet_write(1, "%s %s\n", filetype, url);
+   }
+   else if (url || filetype) {
+   if (filetype)
+   fprintf(stderr, _("prime-clone not properly "
+ "configured: missing url\n"));
+   else if (url)
+   fprintf(stderr, _("prime-clone not properly "
+ "configured: missing filetype\n"));
+   }
+   packet_flush(1);
+}
+
+static int 

[PATCH 02/11] Resumable clone: add prime-clone endpoints

2016-09-15 Thread Kevin Wern
Add logic to serve git-prime-clone to git and http clients.

Do not pass --stateless-rpc and --advertise-refs options to
prime-clone. It is inherently stateless and an 'advertisement'.

Signed-off-by: Kevin Wern 
---
 Documentation/git-daemon.txt   |  7 +++
 Documentation/git-http-backend.txt |  7 +++
 daemon.c   |  7 +++
 http-backend.c | 22 +-
 4 files changed, 38 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-daemon.txt b/Documentation/git-daemon.txt
index a69b361..853faab 100644
--- a/Documentation/git-daemon.txt
+++ b/Documentation/git-daemon.txt
@@ -231,6 +231,13 @@ receive-pack::
enabled by setting `daemon.receivepack` configuration item to
`true`.
 
+primeclone::
+   This serves 'git prime-clone' service to clients, allowing
+   'git clone' clients to get the location of a static resource
+   to download and integrate before performing an incremental
+   fetch. It is 'false' by default, but can be enabled by setting
+   it to `true`.
+
 EXAMPLES
 
 We assume the following in /etc/services::
diff --git a/Documentation/git-http-backend.txt 
b/Documentation/git-http-backend.txt
index 9268fb6..40be74e 100644
--- a/Documentation/git-http-backend.txt
+++ b/Documentation/git-http-backend.txt
@@ -54,6 +54,13 @@ http.receivepack::
disabled by setting this item to `false`, or enabled for all
users, including anonymous users, by setting it to `true`.
 
+http.primeclone::
+   This serves 'git prime-clone' service to clients, allowing
+   'git clone' clients to get the location of a static resource
+   to download and integrate before performing an incremental
+   fetch. It is 'false' by default, but can be enabled by setting
+   it to `true`.
+
 URL TRANSLATION
 ---
 To determine the location of the repository on disk, 'git http-backend'
diff --git a/daemon.c b/daemon.c
index 8d45c33..c2f539c 100644
--- a/daemon.c
+++ b/daemon.c
@@ -475,10 +475,17 @@ static int receive_pack(void)
return run_service_command(argv);
 }
 
+static int prime_clone(void)
+{
+   static const char *argv[] = { "prime-clone", "--strict", ".", NULL };
+   return run_service_command(argv);
+}
+
 static struct daemon_service daemon_service[] = {
{ "upload-archive", "uploadarch", upload_archive, 0, 1 },
{ "upload-pack", "uploadpack", upload_pack, 1, 1 },
{ "receive-pack", "receivepack", receive_pack, 0, 1 },
+   { "prime-clone", "primeclone", prime_clone, 0, 1 },
 };
 
 static void enable_service(const char *name, int ena)
diff --git a/http-backend.c b/http-backend.c
index 8870a26..9c89a10 100644
--- a/http-backend.c
+++ b/http-backend.c
@@ -27,6 +27,7 @@ struct rpc_service {
 static struct rpc_service rpc_service[] = {
{ "upload-pack", "uploadpack", 1, 1 },
{ "receive-pack", "receivepack", 0, -1 },
+   { "prime-clone", "primeclone", 0, -1 },
 };
 
 static struct string_list *get_parameters(void)
@@ -450,11 +451,22 @@ static void get_info_refs(char *arg)
hdr_nocache();
 
if (service_name) {
-   const char *argv[] = {NULL /* service name */,
-   "--stateless-rpc", "--advertise-refs",
-   ".", NULL};
+   struct argv_array argv;
struct rpc_service *svc = select_service(service_name);
 
+   argv_array_init(&argv);
+   argv_array_push(&argv, svc->name);
+
+   /*
+    * prime-clone does not need the --stateless-rpc and
+    * --advertise-refs options. Maybe it will in the future, but
+    * until then it seems best to do this instead of adding
+    * "dummy" options.
+    */
+   if (strcmp(svc->name, "prime-clone") != 0) {
+   argv_array_pushl(&argv, "--stateless-rpc",
+"--advertise-refs", NULL);
+   }
+
+   argv_array_pushl(&argv, ".", NULL);
strbuf_addf(&buf, "application/x-git-%s-advertisement",
svc->name);
hdr_str(content_type, buf.buf);
@@ -463,8 +475,8 @@ static void get_info_refs(char *arg)
packet_write(1, "# service=git-%s\n", svc->name);
packet_flush(1);
 
-   argv[0] = svc->name;
-   run_service(argv, 0);
+   run_service(argv.argv, 0);
+   argv_array_clear(&argv);
 
} else {
select_getanyfile();
-- 
2.7.4



[PATCH 03/11] pkt-line: create gentle packet_read_line functions

2016-09-15 Thread Kevin Wern
Create functions that can read malformed messages without dying. This
includes the creation of the flag PACKET_READ_GENTLE_ALL, for use in
handling prime-clone (or other server error) responses.
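
A minimal sketch of the intended use (not from this patch): read an
untrusted advertisement and simply stop on a flush packet or on a
protocol error, both of which return NULL, instead of dying.

#include "cache.h"
#include "pkt-line.h"

static void dump_advertisement(int fd)
{
	const char *line;

	while ((line = packet_read_line_gentle(fd, NULL))) {
		/* a flush packet or a malformed packet both return NULL */
		fprintf(stderr, "got: %s\n", line);
	}
}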

Signed-off-by: Kevin Wern 
---
 pkt-line.c | 47 ++-
 pkt-line.h | 16 
 2 files changed, 54 insertions(+), 9 deletions(-)

diff --git a/pkt-line.c b/pkt-line.c
index 62fdb37..96060e5 100644
--- a/pkt-line.c
+++ b/pkt-line.c
@@ -155,13 +155,17 @@ static int get_packet_data(int fd, char **src_buf, size_t 
*src_size,
*src_size -= ret;
} else {
ret = read_in_full(fd, dst, size);
-   if (ret < 0)
+   if (ret < 0) {
+   if (options & PACKET_READ_GENTLE_ALL)
+   return -1;
+
die_errno("read error");
+   }
}
 
/* And complain if we didn't get enough bytes to satisfy the read. */
if (ret < size) {
-   if (options & PACKET_READ_GENTLE_ON_EOF)
+   if (options & (PACKET_READ_GENTLE_ON_EOF | 
PACKET_READ_GENTLE_ALL))
return -1;
 
die("The remote end hung up unexpectedly");
@@ -205,15 +209,23 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
if (ret < 0)
return ret;
len = packet_length(linelen);
-   if (len < 0)
+   if (len < 0) {
+   if (options & PACKET_READ_GENTLE_ALL)
+   return -1;
+
die("protocol error: bad line length character: %.4s", linelen);
+   }
if (!len) {
packet_trace("", 4, 0);
return 0;
}
len -= 4;
-   if (len >= size)
+   if (len >= size) {
+   if (options & PACKET_READ_GENTLE_ALL)
+   return -1;
+
die("protocol error: bad line length %d", len);
+   }
ret = get_packet_data(fd, src_buf, src_len, buffer, len, options);
if (ret < 0)
return ret;
@@ -229,22 +241,39 @@ int packet_read(int fd, char **src_buf, size_t *src_len,
 
 static char *packet_read_line_generic(int fd,
  char **src, size_t *src_len,
- int *dst_len)
+ int *dst_len, int flags)
 {
int len = packet_read(fd, src, src_len,
  packet_buffer, sizeof(packet_buffer),
- PACKET_READ_CHOMP_NEWLINE);
+ flags);
if (dst_len)
*dst_len = len;
-   return len ? packet_buffer : NULL;
+   return len > 0 ? packet_buffer : NULL;
 }
 
 char *packet_read_line(int fd, int *len_p)
 {
-   return packet_read_line_generic(fd, NULL, NULL, len_p);
+   return packet_read_line_generic(fd, NULL, NULL, len_p,
+   PACKET_READ_CHOMP_NEWLINE);
 }
 
 char *packet_read_line_buf(char **src, size_t *src_len, int *dst_len)
 {
-   return packet_read_line_generic(-1, src, src_len, dst_len);
+   return packet_read_line_generic(-1, src, src_len, dst_len,
+   PACKET_READ_CHOMP_NEWLINE);
+}
+
+char *packet_read_line_gentle(int fd, int *len_p)
+{
+   return packet_read_line_generic(fd, NULL, NULL, len_p,
+   PACKET_READ_CHOMP_NEWLINE |
+   PACKET_READ_GENTLE_ALL);
+}
+
+
+char *packet_read_line_buf_gentle(char **src, size_t *src_len, int *dst_len)
+{
+   return packet_read_line_generic(-1, src, src_len, dst_len,
+   PACKET_READ_CHOMP_NEWLINE |
+   PACKET_READ_GENTLE_ALL);
 }
diff --git a/pkt-line.h b/pkt-line.h
index 3cb9d91..553e42e 100644
--- a/pkt-line.h
+++ b/pkt-line.h
@@ -52,11 +52,15 @@ void packet_buf_write(struct strbuf *buf, const char *fmt, 
...) __attribute__((f
  * condition 4 (truncated input), but instead return -1. However, we will still
  * die for the other 3 conditions.
  *
+ * If options contains PACKET_READ_GENTLE_ALL, we will not die on any of the
+ * conditions, but return -1 instead.
+ *
  * If options contains PACKET_READ_CHOMP_NEWLINE, a trailing newline (if
  * present) is removed from the buffer before returning.
  */
 #define PACKET_READ_GENTLE_ON_EOF (1u<<0)
 #define PACKET_READ_CHOMP_NEWLINE (1u<<1)
+#define PACKET_READ_GENTLE_ALL    (1u<<2)
 int packet_read(int fd, char **src_buffer, size_t *src_len, char
*buffer, unsigned size, int options);
 
@@ -75,6 +79,18 @@ char *packet_read_line(int fd, int *size);
  */
 char *packet_read_line_buf(char **src_buf, size_t *src_len, int *size);
 
+/*
+ * Same as packet_read_line, but does not die on any errors (uses
+ * PACKET_READ_GENTLE_ALL).
+ */
+char *packet_read_line_gentle(int fd, int *len_p);
+
+/*
+ * Same as packet_read_line_buf, but does not die on any errors (uses
+ * 

[PATCH 00/11] Resumable clone

2016-09-15 Thread Kevin Wern
Hey, all,

It's been a while (sent a very short patch in May), but I've
still been working on the resumable clone feature and checking up on
the mailing list for any updates. After submitting the prime-clone
service alone, I figured implementing the whole thing would be the best
way to understand the full scope of the problem (this is my first real
contribution here, and learning while working on such an involved
feature has not been easy). 

This is a functional implementation handling a direct http/ftp URI to a
single, fully connected packfile (i.e. the link is a direct path to the
file, not a prefix or guess). My hope is that this acts as a bare
minimum cross-section spanning the full requirements that can expand in
width as more cases are added (.info file, split bundle, daemon
download service). This is certainly not perfect, but I think it at
least prototypes each component involved in the workflow.

This patch series is based on jc/bundle, because the logic to find the
tips of a pack's history already exists there (I call index-pack
--clone-bundle on the downloaded file, and read the file to write the
references to a temporary directory). If I need to re-implement this
logic or base it on another branch, let me know. For ease of pulling
and testing, I included the branch here:

https://github.com/kevinwern/git/tree/feature/prime-clone

Although there are a few changes internally from the last patch,
the "alternate resource" url to download is configured on the
server side in exactly the same way:

[primeclone]
url = http://location/pack-$NAME.pack
filetype = pack

The prime-clone service simply outputs the components as:

url filetype


On the client side, the transport_prime_clone and
transport_download_primer APIs are built to be more robust (i.e. read
messages without dying due to protocol errors), so that git clone can
always try them without being dependent on the capability output of
git-upload-pack. transport_download_primer is dependent on the success
of transport_prime_clone, but transport_prime_clone is always run on an
initial clone. Part of achieving this robustness involves adding
*_gentle functions to pkt_line, so that prime_clone can fail silently
without dying.

The transport_download_primer function uses a resumable download,
which is applicable to both automatic and manual resuming. Automatic
is programmatically reconnecting to the resource after being
interrupted (up to a set number of times). Manual is using a newly
taught --resume option on the command line:

git clone --resume <directory>

Right now, a manually resumable directory is left behind only if the
*client* is interrupted while a new junk mode, JUNK_LEAVE_RESUMABLE,
is set (right before the download). For an initial clone, if the
connection fails after automatic resuming, the client erases the
partial resources and falls through to a normal clone. However, once a
resumable directory is left behind by the program, it is NEVER
deleted/abandoned after it is continued with --resume.

I think determining when a resource is "unsalvageable" should be more
nuanced. Especially in a case where a connection is perpetually poor
and the user wishes to resume over a long period of time. The timeout
logic itself *definitely* needs more nuance than "repeat 5 times", such
as expanding wait times and using earlier successes when deciding to
try again. Right now, I think the most important part of this patch is
that these two paths (falling through after a failed download, exiting
to be manually resumed later) exist.

Off the top of my head, outstanding issues/TODOs include:
- The above issue of determining when to fall through, when to
  reattempt, and when to write the resumable info and exit
  in git clone.
- Creating git-daemon service to download a resumable resource.
  Pretty straightforward, I think, especially if
  http.getanyfile already exists. This falls more under
  "haven't gotten to yet" than dilemma.
- Logic for git clone to determine when a full clone would
  be superior, such as when a clone is local or a reference is
  given.
- Configuring prime-clone for multiple resources, in two
  dimensions: (a) resources to choose from (e.g. fall back to
  a second resource if the first one doesn't work) and (b)
  resources to be downloaded together or in sequence (e.g.
  download http://host/this, then http://host/that). Maybe
  prime-clone could also handle client preferences in terms of
  filetype or protocol. For this, I just have to re-read a few
  discussions about the filetypes we use to see if there are
  any outliers that aren't representable in this way. I think
  this is another "haven't gotten to yet".
- Related to the above, seeing if there are any outlying
  resource types whose process can't be modularized into:
  download 

Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread brian m. carlson
On Thu, Sep 15, 2016 at 08:31:00PM +0200, René Scharfe wrote:
> Replace uses of strbuf_addf() for adding strings with more lightweight
> strbuf_addstr() calls.  This makes the intent clearer and avoids
> potential issues with printf format specifiers.
> 
> 02962d36845b89145cd69f8bc65e015d78ae3434 already converted six cases,
> this patch covers eleven more.
> 
> A semantic patch for Coccinelle is included for easier checking for
> new cases that might be introduced in the future.

I think all three of these patches look good.  I'm glad to see us
getting better use out of Coccinelle.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
+1 832 623 2791 | https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204




Re: [RFC] extending pathspec support to submodules

2016-09-15 Thread Stefan Beller
On Thu, Sep 15, 2016 at 3:08 PM, Junio C Hamano  wrote:
> Brandon Williams  writes:
>
>> You're right that seems like the best course of action and it already falls
>> inline with what I did with a first patch to ls-files to support submodules.
>> In that patch I did exactly as you suggest and pass in the prefix to the
>> submodule and make the child responsible for prepending the prefix to all of
>> its output.  This way we can simply pass through the whole pathspec (as 
>> opposed
>> to my original idea of stripping the prefix off the pathspec prior to passing
>> it to the child...which can get complicated with wild characters) to the
>> childprocess and when checking if a file matches the pathspec we can check if
>> the prefix + file path matches.
>
> That's brilliant.  A few observations.
>
>  * With that change to tell the command that is spawned in a
>submodule directory where the submodule repository is in the
>context of the top-level superproject _and_ require it to take a
>pathspec as relative to the top-level superproject, you no longer
>worry about having to find where to cut the pathspec given at the
>top-level to adjust it for the submodule's context.  That may
>simplify things.

I wonder how this plays together with the prefix in the superproject, e.g.

cd super/unrelated-path
# when invoking a git command the internal prefix is "unrelated-path/"
git ls-files ../submodule-*
# a submodule in submodule-A would be run in submodule-A
# with a superproject prefix of super/ ? but additionally we need
# to know we're not at the root of the superproject.

>So we may have to rethink what this option name should be.  "You
>are running in a repository that is used as a submodule in a
>larger context, which has the submodule at this path" is what the
>option tells the command; if any existing command already has
>such an option, we should use it.  If we are inventing one,
>perhaps "--submodule-path" (I didn't check if there are existing
>options that sound similar to it and mean completely different
>things, in which case that name is not usable)?

Would it make sense to add the '--submodule-path' to a more generic
part of the code? It's not just ls-files/grep that have to solve exactly this
problem. Up to now we just did not go for those commands, though.

Thanks


Hello

2016-09-15 Thread Hasher Al Maktoum
Dear Friend,

Your contact details came to me by recommendation. I am interested in investing
in your country and I believe you have the capabilities of providing the needed
assistance, solutions and advice in actualizing this. Let me know if you are
willing to undertake this task for me so we can discuss more. I hope to hear
from you soon.

Regards,
Hasher Al Maktoum
Chairman/Chief Executive Officer
Dubai International Holding Company.


Re: [RFC] extending pathspec support to submodules

2016-09-15 Thread Junio C Hamano
Brandon Williams  writes:

> You're right that seems like the best course of action and it already falls
> inline with what I did with a first patch to ls-files to support submodules.
> In that patch I did exactly as you suggest and pass in the prefix to the
> submodule and make the child responsible for prepending the prefix to all of
> its output.  This way we can simply pass through the whole pathspec (as 
> opposed
> to my original idea of stripping the prefix off the pathspec prior to passing
> it to the child...which can get complicated with wild characters) to the
> childprocess and when checking if a file matches the pathspec we can check if
> the prefix + file path matches.

That's brilliant.  A few observations.

 * With that change to tell the command that is spawned in a
   submodule directory where the submodule repository is in the
   context of the top-level superproject _and_ require it to take a
   pathspec as relative to the top-level superproject, you no longer
   worry about having to find where to cut the pathspec given at the
   top-level to adjust it for the submodule's context.  That may
   simplify things.

 * Your program that runs in the top-level superproject still needs
   to be able to say "this pathspec from the top cannot possibly
   match anything in the submodule, so let's not even bother
   descending into it".

 * Earlier while reviewing "ls-files" recursion, I suggested (and
   you took) --output-path-prefix as the option name, because it was
   meant to be "when you output any path, prefix this string".  But
   the suggested name is suboptimal, as it is no longer an option
   that is only about "output".  A command that runs in a submodule
   would:

   - enumerate paths in the context of the submodule repository,
   - prepend the "prefix" to these paths,
   - filter by applying the full-tree pathspec, and
   - work on the surviving paths after filtering.

   When the last step, "work on", involves just "printing", the
   whole path (with "prefix") is sent to the output.  If it involves
   some operation relative to the submodule repository (e.g. seeing
   if it is in the index), the "prefix" may have to be stripped
   while the operation is carried out.

   So we may have to rethink what this option name should be.  "You
   are running in a repository that is used as a submodule in a
   larger context, which has the submodule at this path" is what the
   option tells the command; if any existing command already has
   such an option, we should use it.  If we are inventing one,
   perhaps "--submodule-path" (I didn't check if there are existing
   options that sound similar to it and mean completely different
   things, in which case that name is not usable)?

Thanks.
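
A minimal sketch of the enumerate / prepend / filter / work flow
described above (editorial illustration, not code from the series; it
assumes dir.h's match_pathspec() for the filtering step, and "work on"
is reduced to printing):

#include "cache.h"
#include "dir.h"
#include "pathspec.h"

static void show_if_matching(const char *submodule_prefix,
			     const char *path_in_submodule,
			     const struct pathspec *full_tree_pathspec)
{
	struct strbuf full = STRBUF_INIT;

	/* enumerate in submodule context, then prepend the prefix ... */
	strbuf_addf(&full, "%s%s", submodule_prefix, path_in_submodule);

	/* ... filter with the pathspec given at the top level ... */
	if (match_pathspec(full_tree_pathspec, full.buf, full.len,
			   0, NULL, 0))
		/* ... and work on (here: print) the surviving path */
		printf("%s\n", full.buf);

	strbuf_release(&full);
}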

   


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread Junio C Hamano
René Scharfe  writes:

> Am 15.09.2016 um 22:01 schrieb Junio C Hamano:
>> René Scharfe  writes:
>> 
>>> Take this for example:
>>>
>>> -   strbuf_addf(>obuf, _("(bad commit)\n"));
>>> +   strbuf_addstr(>obuf, _("(bad commit)\n"));
>>>
>>> If there's a language that uses percent signs instead of parens or as
>>> regular letters, then they need to be escaped in the translated string
>>> before, but not after the patch.  As I wrote: silly.
>> 
>> Ahh, OK, so "This use of addf only has format part and nothing else,
>> hence the format part can be taken as-is" which is the Coccinelle rule
>> used to produce this patch is incomplete and always needs manual
>> inspection, in case the format part wanted to give a literal % in
>> the output.  E.g. it is a bug to convert this
>> 
>>  strbuf_addf(, _("this is 100%% wrong!"));
>> 
>> to
>> 
>>  strbuf_addstr(, _("this is 100%% wrong!"));
>
> Right.  Such strings seem to be quite rare in practice, though. 
>
>> Thanks for clarification.  Perhaps the strbuf.cocci rule file can
>> have some comment to warn the person who builds *.patch file to look
>> for % in E2, or something?
>
> Something like this?

Yup, with something like that I would have understood where that
puzzling question came from.

Thanks.

>
> ---
>  contrib/coccinelle/strbuf.cocci | 2 ++
>  1 file changed, 2 insertions(+)
>
> diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
> index 7932d48..3f535ca 100644
> --- a/contrib/coccinelle/strbuf.cocci
> +++ b/contrib/coccinelle/strbuf.cocci
> @@ -1,3 +1,5 @@
> +// Careful, this is not fully equivalent: "%" is no longer treated
> +// specially.  Check for "%%", "%m" etc. in the format string (E2).
>  @@
>  expression E1, E2;
>  @@


Re: [PATCH 2/2] SQUASH??? Undecided

2016-09-15 Thread Brandon Williams
Yeah if that is the convention then I have no problem with the change.

-Brandon

On Thu, Sep 15, 2016 at 2:12 PM, Stefan Beller  wrote:
> + cc Brandon
>
> On Thu, Sep 15, 2016 at 1:51 PM, Junio C Hamano  wrote:
>> If we were to follow the convention to leave an optional string
>> variable to NULL, we'd need to do this on top.  I am not sure if it
>> is a good change, though.
>
> I think it is a good change.
>
> Thanks,
> Stefan
>
>> ---
>>  builtin/ls-files.c | 7 ---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
>> index 6e78c71..687e475 100644
>> --- a/builtin/ls-files.c
>> +++ b/builtin/ls-files.c
>> @@ -29,7 +29,7 @@ static int show_valid_bit;
>>  static int line_terminator = '\n';
>>  static int debug_mode;
>>  static int show_eol;
>> -static const char *output_path_prefix = "";
>> +static const char *output_path_prefix;
>>  static int recurse_submodules;
>>
>>  static const char *prefix;
>> @@ -78,7 +78,7 @@ static void write_name(const char *name)
>>  * churn.
>>  */
>> static struct strbuf full_name = STRBUF_INIT;
>> -   if (*output_path_prefix) {
>> +   if (output_path_prefix && *output_path_prefix) {
>> strbuf_reset(&full_name);
>> strbuf_addstr(&full_name, output_path_prefix);
>> strbuf_addstr(&full_name, name);
>> @@ -181,7 +181,8 @@ static void show_gitlink(const struct cache_entry *ce)
>> argv_array_push(&cp.args, "ls-files");
>> argv_array_push(&cp.args, "--recurse-submodules");
>> argv_array_pushf(&cp.args, "--output-path-prefix=%s%s/",
>> -output_path_prefix, ce->name);
>> +output_path_prefix ? output_path_prefix : "",
>> +ce->name);
>> cp.git_cmd = 1;
>> cp.dir = ce->name;
>> status = run_command(&cp);
>> --
>> 2.10.0-458-g97b4043
>>


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread René Scharfe
Am 15.09.2016 um 22:01 schrieb Junio C Hamano:
> René Scharfe  writes:
> 
>> Take this for example:
>>
>> -strbuf_addf(>obuf, _("(bad commit)\n"));
>> +strbuf_addstr(>obuf, _("(bad commit)\n"));
>>
>> If there's a language that uses percent signs instead of parens or as
>> regular letters, then they need to be escaped in the translated string
>> before, but not after the patch.  As I wrote: silly.
> 
> Ahh, OK, so "This use of addf only has format part and nothing else,
> hence the format part can be taken as-is" which is the Coccinelle rule
> used to produce this patch is incomplete and always needs manual
> inspection, in case the format part wanted to give a literal % in
> the output.  E.g. it is a bug to convert this
> 
>   strbuf_addf(, _("this is 100%% wrong!"));
> 
> to
> 
>   strbuf_addstr(, _("this is 100%% wrong!"));

Right.  Such strings seem to be quite rare in practice, though. 

> Thanks for clarification.  Perhaps the strbuf.cocci rule file can
> have some comment to warn the person who builds *.patch file to look
> for % in E2, or something?

Something like this?

---
 contrib/coccinelle/strbuf.cocci | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
index 7932d48..3f535ca 100644
--- a/contrib/coccinelle/strbuf.cocci
+++ b/contrib/coccinelle/strbuf.cocci
@@ -1,3 +1,5 @@
+// Careful, this is not fully equivalent: "%" is no longer treated
+// specially.  Check for "%%", "%m" etc. in the format string (E2).
 @@
 expression E1, E2;
 @@
-- 
2.10.0
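
For the record, a tiny standalone illustration of that caveat using the
strbuf API (editorial example, not part of the patch): with addf the
"%%" is a format escape and comes out as one "%", while addstr copies
it literally, so a blind conversion changes the output.

#include "cache.h"
#include "strbuf.h"

int main(void)
{
	struct strbuf a = STRBUF_INIT, b = STRBUF_INIT;

	strbuf_addf(&a, "this is 100%% wrong!");
	strbuf_addstr(&b, "this is 100%% wrong!");

	printf("addf:   %s\n", a.buf);	/* this is 100% wrong!  */
	printf("addstr: %s\n", b.buf);	/* this is 100%% wrong! */

	strbuf_release(&a);
	strbuf_release(&b);
	return 0;
}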





[PATCH v2 0/1] git-p4: Add --checkpoint-period option to sync/clone

2016-09-15 Thread Ori Rawlings
Importing a long history from Perforce into git using the git-p4 tool
can be especially challenging. The `git p4 clone` operation is based
on an all-or-nothing transactionality guarantee. Under real-world
conditions like network unreliability or a busy Perforce server,
`git p4 clone` and  `git p4 sync` operations can easily fail, forcing a
user to restart the import process from the beginning. The longer the
history being imported, the more likely a fault occurs during the
process. Long enough imports thus become statistically unlikely to ever
succeed.

My idea was to leverage the checkpoint feature of git fast-import.
I've included a patch which exposes a new option to the sync/clone
commands in the git-p4 tool. The option enables explicit checkpoints on
a periodic basis (approximately every x seconds).

If the sync/clone command fails during processing of Perforce changes,
the user can craft a new git p4 sync command that will identify
changes that have already been imported and proceed with importing
only changes more recent than the last successful checkpoint.
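
For illustration, here is roughly what that would look like (depot path,
directory name and period are made up; the exact resume invocation has to
be crafted to match the original import):

    # initial import, asking for a checkpoint roughly every 20 minutes
    git p4 clone --checkpoint-period 1200 --destination proj //depot/proj@all

    # if the transfer dies part-way, branches stay at the last checkpoint;
    # continue from there with a follow-up sync inside the new repository
    cd proj
    git p4 sync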

In v2 of this patch series I've added some basic test scenarios,
documentation, and did some minor clean up of the implementation
based on feedback on v1.

Ori Rawlings (1):
  git-p4: Add --checkpoint-period option to sync/clone

 Documentation/git-p4.txt| 12 ++-
 git-p4.py   |  7 -
 t/t9830-git-p4-checkpoint-period.sh | 59 ++-
 3 files changed, 78 insertions(+), 0 deletions(-)
 create mode 100755 t/t9830-git-p4-checkpoint-period.sh

-- 
git-series 0.8.10


[PATCH v2 1/1] git-p4: Add --checkpoint-period option to sync/clone

2016-09-15 Thread Ori Rawlings
Importing a long history from Perforce into git using the git-p4 tool
can be especially challenging. The `git p4 clone` operation is based
on an all-or-nothing transactionality guarantee. Under real-world
conditions like network unreliability or a busy Perforce server,
`git p4 clone` and  `git p4 sync` operations can easily fail, forcing a
user to restart the import process from the beginning. The longer the
history being imported, the more likely a fault occurs during the
process. Long enough imports thus become statistically unlikely to ever
succeed.

The underlying git fast-import protocol supports an explicit checkpoint
command. The idea here is to optionally allow the user to force an
explicit checkpoint every <n> seconds. If the sync/clone operation fails,
branches are left updated at the appropriate commit available during the
latest checkpoint. This allows a user to resume importing Perforce
history while only having to repeat at most approximately <n> seconds'
worth of import activity.

Signed-off-by: Ori Rawlings 
---
 Documentation/git-p4.txt| 12 ++-
 git-p4.py   |  7 -
 t/t9830-git-p4-checkpoint-period.sh | 59 ++-
 3 files changed, 78 insertions(+), 0 deletions(-)
 create mode 100755 t/t9830-git-p4-checkpoint-period.sh

diff --git a/Documentation/git-p4.txt b/Documentation/git-p4.txt
index c83aaf3..e48ed6d 100644
--- a/Documentation/git-p4.txt
+++ b/Documentation/git-p4.txt
@@ -252,6 +252,18 @@ Git repository:
Use a client spec to find the list of interesting files in p4.
See the "CLIENT SPEC" section below.
 
+--checkpoint-period <n>::
+   Issue explicit 'checkpoint' commands to the underlying
+   linkgit:git-fast-import[1] approximately every 'n' seconds. If
+   syncing or cloning from the Perforce server is interrupted, the
+   process can be resumed from the most recent checkpoint with a
+   new 'sync' invocation. This is useful in the situations where a
+   large amount of changes are being imported over an unreliable
+   network connection. Explicit checkpoints can take up to several
+   minutes each, so a suitable value for the checkpoint period is
+   approximately 1200 seconds. By default, no explicit checkpoints 
+   are performed.
+
 -/ ::
Exclude selected depot paths when cloning or syncing.
 
diff --git a/git-p4.py b/git-p4.py
index fd5ca52..4c84871 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2244,6 +2244,7 @@ class P4Sync(Command, P4UserMap):
 optparse.make_option("-/", dest="cloneExclude",
  action="append", type="string",
  help="exclude depot path"),
+optparse.make_option("--checkpoint-period", dest="checkpointPeriod", type="int", help="Period in seconds between explicit git fast-import checkpoints (by default, no explicit checkpoints are performed)"),
 ]
 self.description = """Imports from Perforce into a git repository.\n
 example:
@@ -2276,6 +2277,7 @@ class P4Sync(Command, P4UserMap):
 self.tempBranches = []
 self.tempBranchLocation = "refs/git-p4-tmp"
 self.largeFileSystem = None
+self.checkpointPeriod = None
 
 if gitConfig('git-p4.largeFileSystem'):
 largeFileSystemConstructor = 
globals()[gitConfig('git-p4.largeFileSystem')]
@@ -3031,6 +3033,7 @@ class P4Sync(Command, P4UserMap):
 
 def importChanges(self, changes):
 cnt = 1
+self.lastCheckpointTime = time.time()
 for change in changes:
 description = p4_describe(change)
 self.updateOptionDict(description)
@@ -3107,6 +3110,10 @@ class P4Sync(Command, P4UserMap):
 self.initialParent)
 # only needed once, to connect to the previous commit
 self.initialParent = ""
+
+if self.checkpointPeriod >= 0 and time.time() - self.lastCheckpointTime >= self.checkpointPeriod:
+self.checkpoint()
+self.lastCheckpointTime = time.time()
 except IOError:
 print self.gitError.read()
 sys.exit(1)
diff --git a/t/t9830-git-p4-checkpoint-period.sh b/t/t9830-git-p4-checkpoint-period.sh
new file mode 100755
index 0000000..6ba4914
--- /dev/null
+++ b/t/t9830-git-p4-checkpoint-period.sh
@@ -0,0 +1,59 @@
+#!/bin/sh
+
+test_description='git p4 checkpoint-period tests'
+
+. ./lib-git-p4.sh
+
+p4_submit_each () {
+   for file in $@
+   do
+   echo $file > "$file" &&
+   p4 add "$file" &&
+   p4 submit -d "$file"
+   done
+}
+
+test_expect_success 'start p4d' '
+   start_p4d
+'
+
+test_expect_success 'no explicit checkpoints' '
+   cd "$cli" &&
+   p4_submit_each file1 file2 file3 &&
+   git p4 clone --dest="$git" //depot@all &&
+   

Re: [PATCH 2/2] SQUASH??? Undecided

2016-09-15 Thread Stefan Beller
+ cc Brandon

On Thu, Sep 15, 2016 at 1:51 PM, Junio C Hamano  wrote:
> If we were to follow the convention to leave an optional string
> variable to NULL, we'd need to do this on top.  I am not sure if it
> is a good change, though.

I think it is a good change.

Thanks,
Stefan

> ---
>  builtin/ls-files.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/builtin/ls-files.c b/builtin/ls-files.c
> index 6e78c71..687e475 100644
> --- a/builtin/ls-files.c
> +++ b/builtin/ls-files.c
> @@ -29,7 +29,7 @@ static int show_valid_bit;
>  static int line_terminator = '\n';
>  static int debug_mode;
>  static int show_eol;
> -static const char *output_path_prefix = "";
> +static const char *output_path_prefix;
>  static int recurse_submodules;
>
>  static const char *prefix;
> @@ -78,7 +78,7 @@ static void write_name(const char *name)
>  * churn.
>  */
> static struct strbuf full_name = STRBUF_INIT;
> -   if (*output_path_prefix) {
> +   if (output_path_prefix && *output_path_prefix) {
> strbuf_reset(&full_name);
> strbuf_addstr(&full_name, output_path_prefix);
> strbuf_addstr(&full_name, name);
> @@ -181,7 +181,8 @@ static void show_gitlink(const struct cache_entry *ce)
> argv_array_push(&cp.args, "ls-files");
> argv_array_push(&cp.args, "--recurse-submodules");
> argv_array_pushf(&cp.args, "--output-path-prefix=%s%s/",
> -output_path_prefix, ce->name);
> +output_path_prefix ? output_path_prefix : "",
> +ce->name);
> cp.git_cmd = 1;
> cp.dir = ce->name;
> status = run_command(&cp);
> --
> 2.10.0-458-g97b4043
>


Re: [PATCH 3/2] batch check whether submodule needs pushing into one call

2016-09-15 Thread Junio C Hamano
Heiko Voigt  writes:

>   if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
>   struct child_process cp = CHILD_PROCESS_INIT;
> -	const char *argv[] = {"rev-list", NULL, "--not", "--remotes", "-n", "1" , NULL};
> +
> +	argv_array_push(&cp.args, "rev-list");
> +	sha1_array_for_each_unique(hashes, append_hash_to_argv, &cp.args);
> +	argv_array_pushl(&cp.args, "--not", "--remotes", "-n", "1" , NULL);
> +
>   struct strbuf buf = STRBUF_INIT;
>   int needs_pushing = 0;

These two become decl-after-stmt; move your new lines a bit lower,
perhaps?

> - argv[1] = sha1_to_hex(sha1);
> - cp.argv = argv;
>   prepare_submodule_repo_env(&cp.env_array);

By the way, with the two new patches, 'pu' seems to start failing
some tests, e.g. 5533 5404 5405.



[PATCH 1/2] SQUASH???

2016-09-15 Thread Junio C Hamano
---
 builtin/ls-files.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index c0bce00..6e78c71 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -78,7 +78,7 @@ static void write_name(const char *name)
 * churn.
 */
static struct strbuf full_name = STRBUF_INIT;
-   if (output_path_prefix != '\0') {
+   if (*output_path_prefix) {
strbuf_reset(&full_name);
strbuf_addstr(&full_name, output_path_prefix);
strbuf_addstr(&full_name, name);
-- 
2.10.0-458-g97b4043



[PATCH 2/2] SQUASH??? Undecided

2016-09-15 Thread Junio C Hamano
If we were to follow the convention to leave an optional string
variable to NULL, we'd need to do this on top.  I am not sure if it
is a good change, though.
---
 builtin/ls-files.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 6e78c71..687e475 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -29,7 +29,7 @@ static int show_valid_bit;
 static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
-static const char *output_path_prefix = "";
+static const char *output_path_prefix;
 static int recurse_submodules;
 
 static const char *prefix;
@@ -78,7 +78,7 @@ static void write_name(const char *name)
 * churn.
 */
static struct strbuf full_name = STRBUF_INIT;
-   if (*output_path_prefix) {
+   if (output_path_prefix && *output_path_prefix) {
strbuf_reset(&full_name);
strbuf_addstr(&full_name, output_path_prefix);
strbuf_addstr(&full_name, name);
@@ -181,7 +181,8 @@ static void show_gitlink(const struct cache_entry *ce)
argv_array_push(&cp.args, "ls-files");
argv_array_push(&cp.args, "--recurse-submodules");
argv_array_pushf(&cp.args, "--output-path-prefix=%s%s/",
-output_path_prefix, ce->name);
+output_path_prefix ? output_path_prefix : "",
+ce->name);
cp.git_cmd = 1;
cp.dir = ce->name;
status = run_command(&cp);
-- 
2.10.0-458-g97b4043



Re: [PATCH v2] ls-files: adding support for submodules

2016-09-15 Thread Junio C Hamano
Junio C Hamano  writes:

> Thanks, will queue with a minimum fix.

So here are two squashable patches, one is the "minimum" one, the
other is a bit more invasive one to use "a pointer to an optional
setting is set to NULL" convention.  I am undecided, and I'll stay
to be without further comments from others, on the latter one.

I understand that many internal changes in your work environment
are titled like "DOing X", but our convention around here
is to label them "DO X", as if you are giving an order to somebody
else, either telling the codebase "to be like so", or telling the
patch-monkey maintainer "to make it so".  So I'd retitle it

ls-files: optionally recurse into submodules

or something like that.  It is an added advantage of being a lot
more descriptive than "adding support", which does not say what kind
of support it is adding.


Re: [PATCH v2] ls-files: adding support for submodules

2016-09-15 Thread Junio C Hamano
If we were to follow the convention to leave an optional string
variable to NULL, we'd need to do this on top.  I am not sure if it
is a good change, though.
---
 builtin/ls-files.c | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index 6e78c71..687e475 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -29,7 +29,7 @@ static int show_valid_bit;
 static int line_terminator = '\n';
 static int debug_mode;
 static int show_eol;
-static const char *output_path_prefix = "";
+static const char *output_path_prefix;
 static int recurse_submodules;
 
 static const char *prefix;
@@ -78,7 +78,7 @@ static void write_name(const char *name)
 * churn.
 */
static struct strbuf full_name = STRBUF_INIT;
-   if (*output_path_prefix) {
+   if (output_path_prefix && *output_path_prefix) {
strbuf_reset(&full_name);
strbuf_addstr(&full_name, output_path_prefix);
strbuf_addstr(&full_name, name);
@@ -181,7 +181,8 @@ static void show_gitlink(const struct cache_entry *ce)
argv_array_push(&cp.args, "ls-files");
argv_array_push(&cp.args, "--recurse-submodules");
argv_array_pushf(&cp.args, "--output-path-prefix=%s%s/",
-output_path_prefix, ce->name);
+output_path_prefix ? output_path_prefix : "",
+ce->name);
cp.git_cmd = 1;
cp.dir = ce->name;
status = run_command(&cp);
-- 
2.10.0-458-g97b4043



Re: [PATCH v2] ls-files: adding support for submodules

2016-09-15 Thread Junio C Hamano
Here is an absolute mininum fix ;-)

 builtin/ls-files.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/ls-files.c b/builtin/ls-files.c
index c0bce00..6e78c71 100644
--- a/builtin/ls-files.c
+++ b/builtin/ls-files.c
@@ -78,7 +78,7 @@ static void write_name(const char *name)
 * churn.
 */
static struct strbuf full_name = STRBUF_INIT;
-   if (output_path_prefix != '\0') {
+   if (*output_path_prefix) {
strbuf_reset(&full_name);
strbuf_addstr(&full_name, output_path_prefix);
strbuf_addstr(&full_name, name);
-- 
2.10.0-458-g97b4043



Re: [PATCH v2] ls-files: adding support for submodules

2016-09-15 Thread Junio C Hamano
Junio C Hamano  writes:

> Thanks, will queue with a minimum fix.

So here are two squashable patches, one is the "minimum" one, the
other is a bit more invasive one to use "a pointer to an optional
setting is set to NULL" convention.  I am undecided, and I'll stay
to be without further comments from others, on the latter one.

I understand that many internal changes in your work environment are
titled like "DOing X", but our convention around here is to label
them "DO X", as if you are giving an order to somebody else, either
telling the codebase "to be like so", or telling the patch-monkey
maintainer "to make it so".  So I'd retitle it

ls-files: optionally recurse into submodules

or something like that.  It is an added advantage of being a lot
more descriptive than "adding support", which does not say what kind
of support it is adding.


Re: [PATCH v7 04/10] pkt-line: add packet_flush_gently()

2016-09-15 Thread Junio C Hamano
Lars Schneider  writes:

>> So the "right" pattern is either:
>> 
>>  1. Return -1 and the caller is responsible for telling the user.
>> 

... which is valid only if there aren't different kinds of errors
that all return -1; with "return error(...)" with different
messages, the users can tell what kind of error they got (while the
caller may just do the same abort-procedure no matter what kind of
error it got), but if all of them are replaced with "return -1", the
caller cannot produce different error messages to tell the users.

>>  2. Return -1 and stuff the error into an error strbuf, so it can be
>> passed up the call chain easily (and callers do not have to come up
>> with their own wording).

... and this would become one of the viable options (the other is to
define your own error code so that the caller can tell what error it
got).

>> But if all current callers would just call error() themselves anyway,
>> then it's OK to punt on this and let somebody else handle it later if
>> they add a new caller who wants different behavior (and that is what
>> Junio was saying above, I think).

Yes.  Just keeping it noisy until somebody wants a quiet-and-gentle
version is probably the best course to take.


Re: [PATCH v7 10/10] convert: add filter..process option

2016-09-15 Thread Junio C Hamano
Lars Schneider  writes:

>> On 13 Sep 2016, at 17:22, Junio C Hamano  wrote:
>> 
>> larsxschnei...@gmail.com writes:
>> 
>>> diff --git a/contrib/long-running-filter/example.pl 
>>> b/contrib/long-running-filter/example.pl
>>> ...
>>> +packet_write( "clean=true\n" );
>>> +packet_write( "smudge=true\n" );
>>
>> These extra SP around the contents inside () pair look unfamiliar
>> and somewhat strange to me, but as long as they are consistently
>> done (and I think you are mostly being consistent), it is OK.
>
> Ups. I forgot to run PerlTidy here. I run PerlTidy with the flag 
> "-pbp" (= Perl Best Practices). This seems to add no extra SP for
> functions with one parameter (e.g. `foo("bar")`) and extra SP
> for functions with multiple parameters (e.g. `foo( "bar", 1 )`).
> Is this still OK?

Your choice.  I already said I do not care too much either way as
long as you are consistent.

If you prefer PerlTidy's PBP output over what you wrote, and if you
are resending the patch anyway, then why not? ;-)

> Does anyone have a "Git PerlTidy configuration"?

Not me.

Thanks.


Re: [PATCH v3 13/14] i18n: show-branch: mark plural strings for translation

2016-09-15 Thread Junio C Hamano
Vasco Almeida  writes:

> Mark plural string for translation using Q_().
>
> Although we already know that the plural sentence is always used in the
> English source, other languages have complex plural rules they must
> comply with according to the value of MAX_REVS.

Nicely explained.  Thanks.


Re: [PATCH v7 04/10] pkt-line: add packet_flush_gently()

2016-09-15 Thread Lars Schneider

> On 15 Sep 2016, at 21:44, Jeff King  wrote:
> 
> On Thu, Sep 15, 2016 at 05:42:58PM +0100, Lars Schneider wrote:
> 
>> +int packet_flush_gently(int fd)
>> +{
> >> +packet_trace("0000", 4, 1);
> >> +if (write_in_full(fd, "0000", 4) == 4)
>> +return 0;
>> +error("flush packet write failed");
>> +return -1;
>> [...]
> I suspect that it is a strong sign that the caller wants to be in
> control of when and what error message is produced; otherwise it
> wouldn't be calling the _gently() variant, no?
 
 Agreed!
>>> 
>>> I am also OK with the current form, too.  Those who need to enhance
>>> it to packet_flush_gently(int fd, int quiet) can come later.
>> 
>> "caller wants to be in control [...] otherwise it wouldn't be calling 
>> the _gently() variant" convinced me. I would like to change it like
>> this:
>> 
>>  trace_printf_key(&trace_packet, "flush packet write failed");
>>  return -1;
>> 
>> Objections?
> 
> I'm not sure that a trace makes sense, because it means that 99% of the
> time we are silent. AFAICT, the question is not "sometimes the user
> needs to see an error and sometimes not, and they should decide before
> starting the program". It is "sometimes the caller will report the error
> to the user as appropriate, and sometimes we need to do so". And only
> the calling code knows which is which.
> 
> So the "right" pattern is either:
> 
>  1. Return -1 and the caller is responsible for telling the user.
> 
> or
> 
>  2. Return -1 and stuff the error into an error strbuf, so it can be
> passed up the call chain easily (and callers do not have to come up
> with their own wording).
> 
> But if all current callers would just call error() themselves anyway,
> then it's OK to punt on this and let somebody else handle it later if
> they add a new caller who wants different behavior (and that is what
> Junio was saying above, I think).

OK. I'll go with 1. then.

Thanks,
Lars


Re: [PATCH v7 10/10] convert: add filter..process option

2016-09-15 Thread Lars Schneider

> On 13 Sep 2016, at 17:22, Junio C Hamano  wrote:
> 
> larsxschnei...@gmail.com writes:
> 
>> diff --git a/contrib/long-running-filter/example.pl 
>> b/contrib/long-running-filter/example.pl
>> ...
>> +sub packet_read {
>> +my $buffer;
>> +my $bytes_read = read STDIN, $buffer, 4;
>> +if ( $bytes_read == 0 ) {
>> +
>> +# EOF - Git stopped talking to us!
>> +exit();
>> +...
>> +packet_write( "clean=true\n" );
>> +packet_write( "smudge=true\n" );
>> +packet_flush();
>> +
>> +while (1) {
> 
> These extra SP around the contents inside () pair look unfamiliar
> and somewhat strange to me, but as long as they are consistently
> done (and I think you are mostly being consistent), it is OK.

Ups. I forgot to run PerlTidy here. I run PerlTidy with the flag 
"-pbp" (= Perl Best Practices). This seems to add no extra SP for
functions with one parameter (e.g. `foo("bar")`) and extra SP
for functions with multiple parameters (e.g. `foo( "bar", 1 )`).
Is this still OK?

Does anyone have a "Git PerlTidy configuration"?


> 
>> +#define CAP_CLEAN(1u<<0)
>> +#define CAP_SMUDGE   (1u<<1)
> 
> As these are meant to be usable together, i.e. bits in a single flag
> word, they are of type "unsigned int", which makes perfect sense.
> 
> Make sure your variables and fields that store them are of the same
> type.  I think I saw "int' used to pass them in at least one place.

Fixed!


>> +static int apply_filter(const char *path, const char *src, size_t len,
>> +int fd, struct strbuf *dst, struct convert_driver 
>> *drv,
>> +const int wanted_capability)
>> +{
>> +const char* cmd = NULL;
> 
> "const char *cmd = NULL;" of course.

Fixed!


>> diff --git a/unpack-trees.c b/unpack-trees.c
>> index 11c37fb..f6798f8 100644
>> --- a/unpack-trees.c
>> +++ b/unpack-trees.c
>> @@ -10,6 +10,7 @@
>> #include "attr.h"
>> #include "split-index.h"
>> #include "dir.h"
>> +#include "convert.h"
>> 
>> /*
>>  * Error messages expected by scripts out of plumbing commands such as
> 
> Why?  The resulting file seems to compile without this addition.

Of course. That shouldn't have been part of this commit.


Thank you,
Lars






Re: [PATCH v3 04/14] i18n: blame: mark error messages for translation

2016-09-15 Thread Junio C Hamano
Vasco Almeida  writes:

> @@ -2790,7 +2790,7 @@ int cmd_blame(int argc, const char **argv, const char 
> *prefix)
>   else {
>   o = get_origin(&sb, sb.final, path);
>   if (fill_blob_sha1_and_mode(o))
> - die("no such path %s in %s", path, final_commit_name);
> + die(_("no such path %s in %s"), path, 
> final_commit_name);

This was missing in the earlier round, which is good to make it translated.

> - die("file %s has only %lu lines", path, lno);
> + die(Q_("file %s has only %lu line",
> +"file %s has only %lu lines",
> +lno), path, lno);

Looks good here, too.  I would have moved "lno)," at the beginning
of the third line to the end of the second line to make it easier to
read, but this is OK.



Re: [PATCH v7 10/10] convert: add filter..process option

2016-09-15 Thread Junio C Hamano
Lars Schneider  writes:

> Wouldn't that complicate the pathname parsing on the filter side?
> Can't we just define in our filter protocol documentation that our 
> "pathname" packet _always_ has a trailing "\n"? That would mean the 
> receiver would know a packet "pathname=ABC\n\n" encodes the path
> "ABC\n" [1].

That's fine, too.  If you declare that pathname over the protocol is
a binary thing, you can also define that the packet does not have
the terminating \n, i.e. the example encodes the path "ABC\n\n",
which is also OK ;-)

As long as the rule is clearly documented, easy for filter
implementors to follow it, and hard for them to get it wrong, I'd be
perfectly happy.

Thanks.


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread Junio C Hamano
René Scharfe  writes:

> Take this for example:
>
> - strbuf_addf(&o->obuf, _("(bad commit)\n"));
> + strbuf_addstr(&o->obuf, _("(bad commit)\n"));
>
> If there's a language that uses percent signs instead of parens or as
> regular letters, then they need to be escaped in the translated string
> before, but not after the patch.  As I wrote: silly.

Ahh, OK, so "This use of addf only has format part and nothing else,
hence the format part can be taken as-is" which is the Coccinelle rule
used to produce this patch is incomplete and always needs manual
inspection, in case the format part wanted to give a literal % in
the output.  E.g. it is a bug to convert this

strbuf_addf(, _("this is 100%% wrong!"));

to

strbuf_addstr(, _("this is 100%% wrong!"));

Thanks for clarification.  Perhaps the strbuf.cocci rule file can
have some comment to warn the person who builds *.patch file to look
for % in E2, or something?



Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread René Scharfe

Am 15.09.2016 um 21:38 schrieb Jeff King:

On Thu, Sep 15, 2016 at 12:25:43PM -0700, Junio C Hamano wrote:


Silly question: Is there a natural language that uses percent signs
as letters or e.g. instead of commas? :)


I don't know, but if they do, they'd better get used to escaping them.
:)


I do not know either, but I am curious where that question comes
from.  I stared at this patch for a few minutes but couldn't guess.


My initial thought is that the next step after picking this low-hanging
fruit would be to find cases where the strings do not contain "%", and
thus we do not have to care about formatting. But a case like:

  strbuf_addf(, "this does not have any percents!", foo);

is simply broken (albeit in a way that we ignore foo, so it's just ugly
code, not a real bug).

So I dunno. I too am curious.


Take this for example:

-   strbuf_addf(&o->obuf, _("(bad commit)\n"));
+   strbuf_addstr(&o->obuf, _("(bad commit)\n"));

If there's a language that uses percent signs instead of parens or as 
regular letters, then they need to be escaped in the translated string 
before, but not after the patch.  As I wrote: silly.


René


Re: [PATCH v7 04/10] pkt-line: add packet_flush_gently()

2016-09-15 Thread Jeff King
On Thu, Sep 15, 2016 at 05:42:58PM +0100, Lars Schneider wrote:

>  +int packet_flush_gently(int fd)
>  +{
>  +packet_trace("0000", 4, 1);
>  +if (write_in_full(fd, "0000", 4) == 4)
>  +return 0;
>  +error("flush packet write failed");
>  +return -1;
> [...]
> >>> I suspect that it is a strong sign that the caller wants to be in
> >>> control of when and what error message is produced; otherwise it
> >>> wouldn't be calling the _gently() variant, no?
> >> 
> >> Agreed!
> > 
> > I am also OK with the current form, too.  Those who need to enhance
> > it to packet_flush_gently(int fd, int quiet) can come later.
> 
> "caller wants to be in control [...] otherwise it wouldn't be calling 
> the _gently() variant" convinced me. I would like to change it like
> this:
> 
>   trace_printf_key(&trace_packet, "flush packet write failed");
>   return -1;
> 
> Objections?

I'm not sure that a trace makes sense, because it means that 99% of the
time we are silent. AFAICT, the question is not "sometimes the user
needs to see an error and sometimes not, and they should decide before
starting the program". It is "sometimes the caller will report the error
to the user as appropriate, and sometimes we need to do so". And only
the calling code knows which is which.

So the "right" pattern is either:

  1. Return -1 and the caller is responsible for telling the user.

or

  2. Return -1 and stuff the error into an error strbuf, so it can be
 passed up the call chain easily (and callers do not have to come up
 with their own wording).

But if all current callers would just call error() themselves anyway,
then it's OK to punt on this and let somebody else handle it later if
they add a new caller who wants different behavior (and that is what
Junio was saying above, I think).
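
(For readers following along, a minimal sketch of pattern 2 in git's
internal style; the helper and caller names here are invented, and only
write_in_full(), strbuf_addf() and warning() are the real APIs:)

    /*
     * Sketch only: the helper reports failure through an "err" strbuf
     * instead of printing anything itself.
     */
    static int flush_gently_sketch(int fd, struct strbuf *err)
    {
    	if (write_in_full(fd, "0000", 4) == 4)
    		return 0;
    	strbuf_addf(err, "flush packet write failed on fd %d", fd);
    	return -1;
    }

    /* the caller, not the helper, decides how loud to be */
    static void caller_sketch(int fd)
    {
    	struct strbuf err = STRBUF_INIT;

    	if (flush_gently_sketch(fd, &err))
    		warning("%s", err.buf);	/* or die(), or stay silent */
    	strbuf_release(&err);
    }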

-Peff


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread Jeff King
On Thu, Sep 15, 2016 at 12:25:43PM -0700, Junio C Hamano wrote:

> >> Silly question: Is there a natural language that uses percent signs
> >> as letters or e.g. instead of commas? :)
> >
> > I don't know, but if they do, they'd better get used to escaping them.
> > :)
> 
> I do not know either, but I am curious where that question comes
> from.  I stared at this patch for a few minutes but couldn't guess.

My initial thought is that the next step after picking this low-hanging
fruit would be to find cases where the strings do not contain "%", and
thus we do not have to care about formatting. But a case like:

  strbuf_addf(, "this does not have any percents!", foo);

is simply broken (albeit in a way that we ignore foo, so it's just ugly
code, not a real bug).

So I dunno. I too am curious.

-Peff


Re: [PATCH v7 10/10] convert: add filter..process option

2016-09-15 Thread Lars Schneider

> On 13 Sep 2016, at 17:42, Junio C Hamano  wrote:
> 
> Torsten Bögershausen  writes:
> 
>> I would really consider treating pathnames as binary, and not adding a
>> trailing '\n'; are there other opinions?
> 
> It would be the most consistent if the same format as
> write_name_quoted() is used for this, I would think.

Is that the solution you had in mind?

quote_c_style(path, &quoted_path, NULL, 0);
err = packet_write_fmt_gently(process->in, "pathname=%s\n", quoted_path.buf);

Wouldn't that complicate the pathname parsing on the filter side?
Can't we just define in our filter protocol documentation that our 
"pathname" packet _always_ has a trailing "\n"? That would mean the 
receiver would know a packet "pathname=ABC\n\n" encodes the path
"ABC\n" [1].

Thanks,
Lars


[1] Following Torsten's example in 
http://public-inbox.org/git/96554f6d-988d-e0b8-7936-8d0f29a75...@web.de )



Re: [PATCH] pkt-line: mark a file-local symbol static

2016-09-15 Thread Lars Schneider

> On 14 Sep 2016, at 14:31, Ramsay Jones  wrote:
> 
> 
> Signed-off-by: Ramsay Jones 
> ---
> 
> Hi Lars,
> 
> If you need to re-roll your 'ls/filter-process' branch, could you
> please squash this into the relevant patch; commit 2afd9b22
> ("pkt-line: add packet_write_gently()", 08-09-2016).
> 
> [...]
> -int packet_write_gently(const int fd_out, const char *buf, size_t size)
> +static int packet_write_gently(const int fd_out, const char *buf, size_t 
> size)
> {
>   static char packet_write_buffer[LARGE_PACKET_MAX];

Done!

Thank you,
Lars

Re: [PATCH v7 04/10] pkt-line: add packet_flush_gently()

2016-09-15 Thread Lars Schneider

> On 13 Sep 2016, at 23:44, Junio C Hamano  wrote:
> 
> Lars Schneider  writes:
> 
>>> On 13 Sep 2016, at 00:30, Junio C Hamano  wrote:
>>> 
>>> larsxschnei...@gmail.com writes:
>>> 
 From: Lars Schneider 
 
 packet_flush() would die in case of a write error even though for some
 callers an error would be acceptable. Add packet_flush_gently() which
 writes a pkt-line flush packet and returns `0` for success and `-1` for
 failure.
 ...
 +int packet_flush_gently(int fd)
 +{
 +  packet_trace("0000", 4, 1);
 +  if (write_in_full(fd, "0000", 4) == 4)
 +  return 0;
 +  error("flush packet write failed");
 +  return -1;
>>> 
>>> It is more idiomatic to do
>>> 
>>> return error(...);
>>> 
>>> but more importantly, does the caller even want an error message
>>> unconditionally printed here?
>>> 
>>> I suspect that it is a strong sign that the caller wants to be in
>>> control of when and what error message is produced; otherwise it
>>> wouldn't be calling the _gently() variant, no?
>> 
>> Agreed!
> 
> I am also OK with the current form, too.  Those who need to enhance
> it to packet_flush_gently(int fd, int quiet) can come later.

"caller wants to be in control [...] otherwise it wouldn't be calling 
the _gently() variant" convinced me. I would like to change it like
this:

trace_printf_key(&trace_packet, "flush packet write failed");
return -1;

Objections?

Thanks,
Lars


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Sep 15, 2016 at 08:31:00PM +0200, René Scharfe wrote:
>
>> Replace uses of strbuf_addf() for adding strings with more lightweight
>> strbuf_addstr() calls.  This makes the intent clearer and avoids
>> potential issues with printf format specifiers.
>> 
>> 02962d36845b89145cd69f8bc65e015d78ae3434 already converted six cases,
>> this patch covers eleven more.
>
> Great, these all look obviously correct.

Yes.

>> Silly question: Is there a natural language that uses percent signs
>> as letters or e.g. instead of commas? :)
>
> I don't know, but if they do, they'd better get used to escaping them.
> :)

I do not know either, but I am curious where that question comes
from.  I stared at this patch for a few minutes but couldn't guess.


Re: [PATCH v2 21/25] sequencer: refactor write_message()

2016-09-15 Thread Junio C Hamano
Johannes Sixt  writes:

> Am 11.09.2016 um 12:55 schrieb Johannes Schindelin:
>> -static int write_message(struct strbuf *msgbuf, const char *filename)
>> +static int write_with_lock_file(const char *filename,
>> +const void *buf, size_t len, int append_eol)
>>  {
>>  static struct lock_file msg_file;
>>
>>  int msg_fd = hold_lock_file_for_update(&msg_file, filename, 0);
>>  if (msg_fd < 0)
>>  return error_errno(_("Could not lock '%s'"), filename);
>> -if (write_in_full(msg_fd, msgbuf->buf, msgbuf->len) < 0)
>> -return error_errno(_("Could not write to %s"), filename);
>> -strbuf_release(msgbuf);
>> +if (write_in_full(msg_fd, buf, len) < 0)
>> +return error_errno(_("Could not write to '%s'"), filename);
>> +if (append_eol && write(msg_fd, "\n", 1) < 0)
>> +return error_errno(_("Could not write eol to '%s"), filename);
>>  if (commit_lock_file(&msg_file) < 0)
>>  return error(_("Error wrapping up %s."), filename);
>>
>>  return 0;
>>  }
>
> The two error paths in the added lines should both
>
>   rollback_lock_file(&msg_file);
>
> , I think. But I do notice that this is not exactly new, so...

It may not be new for this step, but overall the series is aiming to
libify the stuff, so we should fix fd and lockfile leaks like this
as we notice them.
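
Concretely, something like this on top of the quoted hunk (just a sketch
of the idea, reusing the same msg_file lock and helpers):

    	int msg_fd = hold_lock_file_for_update(&msg_file, filename, 0);
    	if (msg_fd < 0)
    		return error_errno(_("Could not lock '%s'"), filename);
    	if (write_in_full(msg_fd, buf, len) < 0) {
    		rollback_lock_file(&msg_file);
    		return error_errno(_("Could not write to '%s'"), filename);
    	}
    	if (append_eol && write(msg_fd, "\n", 1) < 0) {
    		rollback_lock_file(&msg_file);
    		return error_errno(_("Could not write eol to '%s'"), filename);
    	}
    	if (commit_lock_file(&msg_file) < 0)
    		return error(_("Error wrapping up %s."), filename);
    	return 0;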

Thanks.


Re: [PATCH v2 10/25] sequencer: get rid of the subcommand field

2016-09-15 Thread Junio C Hamano
Johannes Schindelin  writes:

> The subcommands are used exactly once, at the very beginning of
> sequencer_pick_revisions(), to determine what to do. This is an
> unnecessary level of indirection: we can simply call the correct
> function to begin with. So let's do that.

Makes sense.  And the diffstat is also pleasant to the eyes.

>  builtin/revert.c | 36 
>  sequencer.c  | 35 +++
>  sequencer.h  | 13 -
>  3 files changed, 31 insertions(+), 53 deletions(-)


Re: Tracking down a segfault in delta_base_cache

2016-09-15 Thread Jeff King
On Thu, Sep 15, 2016 at 10:34:43AM -0700, Junio C Hamano wrote:

> Jonathon Mah  writes:
> 
> >> On 2016-09-14, at 17:56, Jeff King  wrote:
> >> 
> >> Have you tried with the patch in:
> >> 
> >>  
> >> http://public-inbox.org/git/20160912164616.vg33kldazuthf...@sigill.intra.peff.net/
> > All the examples I've tried work when I use that. Thanks!
> 
> Peff, thanks for a quick suggestion and Jonathon, thanks for a quick
> confirmation.

Better still would have been for me not to introduce the segfault in the
first place. ;)

-Peff


Re: [PATCH] object: measure time needed for resolving hash collisions

2016-09-15 Thread Jeff King
On Thu, Sep 15, 2016 at 09:26:22AM -0700, Stefan Beller wrote:

> > It may also be possible to really micro-optimize it on some platforms,
> > because we know the size in advance (I'd kind of expect the compiler to
> > do that, but if we're ending up in glibc memcmp then it sounds like it
> > is not the case).
> 
> That stackoverflow link suggests that glibc already has microoptimisations
> for a variety of platforms.

It's definitely micro-optimized in glibc. What I was trying to say is
that if we are hitting the glibc implementation, then we know we are
handling the "20" at runtime. Whereas the compiler should know that "20"
is a constant, and could in theory skip the memcmp() call entirely in
favor of something like an unrolled loop.
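
(To make that concrete, a toy comparison, not a proposed change, of the
two shapes of the same 20-byte check:)

    #include <string.h>

    /* what the hot path effectively does: the length is a compile-time constant */
    static inline int hashcmp_memcmp(const unsigned char *a, const unsigned char *b)
    {
    	return memcmp(a, b, 20);
    }

    /*
     * What a compiler could in principle reduce it to: a fixed-count byte
     * loop with no libc call, free to be unrolled or vectorized.
     */
    static inline int hashcmp_bytewise(const unsigned char *a, const unsigned char *b)
    {
    	int i;
    	for (i = 0; i < 20; i++)
    		if (a[i] != b[i])
    			return a[i] < b[i] ? -1 : +1;
    	return 0;
    }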

-Peff


Re: [PATCH v4 3/4] read-cache: introduce chmod_index_entry

2016-09-15 Thread Thomas Gummerer
On 09/14, Junio C Hamano wrote:
> I've queued this trivial SQUASH??? on top, which I think should be
> squashed into 3/4.

Yeah, I missed this.  The SQUASH??? definitely makes sense, would be
great if you could just squash that in.

> Thanks.

Thanks for the reviews and for helping me get the series into good shape!

> 
> 
>  read-cache.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/read-cache.c b/read-cache.c
> index 2445e30..c2b2e97 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -779,7 +779,7 @@ int chmod_index_entry(struct index_state *istate, struct 
> cache_entry *ce,
>   default:
>   return -2;
>   }
> - cache_tree_invalidate_path(&the_index, ce->name);
> + cache_tree_invalidate_path(istate, ce->name);
>   ce->ce_flags |= CE_UPDATE_IN_BASE;
>   istate->cache_changed |= CE_ENTRY_CHANGED;
>  
> -- 
> 2.10.0-458-g8cce42d
> 

-- 
Thomas


Re: [PATCH] object: measure time needed for resolving hash collisions

2016-09-15 Thread Jeff King
On Thu, Sep 15, 2016 at 10:45:34AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Measuring _just_ the collisions is more like the patch below. In my
> > measurements it's more like 30ms, compared to 10s for all of the
> > hashcmps.
> >
> > So we really aren't dealing with collisions, but rather just verifying
> > that our hash landed at the right spot. And _any_ data structure is
> > going to have to do that.
> 
> The reverse side of the coin may be if we can shrink the hashtable
> smaller and load it more heavily without sacrificing performance by
> making the necessary "have we landed at the right spot" check cheap
> enough, I guess.

I think that's where things like cuckoo hashing come into play. They
didn't have any effect for us because we already keep the table very
unloaded. But you could _probably_ increase the load factor without
sacrificing performance using a more clever scheme.

It's not clear to me that the current table size is a big problem,
though. It might be hurting us with cache effects, but I think the only
way we'd know is to measure.

-Peff


Re: [PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread Jeff King
On Thu, Sep 15, 2016 at 08:31:00PM +0200, René Scharfe wrote:

> Replace uses of strbuf_addf() for adding strings with more lightweight
> strbuf_addstr() calls.  This makes the intent clearer and avoids
> potential issues with printf format specifiers.
> 
> 02962d36845b89145cd69f8bc65e015d78ae3434 already converted six cases,
> this patch covers eleven more.

Great, these all look obviously correct.

> A semantic patch for Coccinelle is included for easier checking for
> new cases that might be introduced in the future.

I think there was some discussion in brian's object_id patches about
whether we wanted to carry Coccinelle transformations in the tree, but I
don't remember the outcome. I don't have an opinion myself.

> Silly question: Is there a natural language that uses percent signs
> as letters or e.g. instead of commas? :)

I don't know, but if they do, they'd better get used to escaping them.
:)

-Peff


[PATCH] add coccicheck make target

2016-09-15 Thread René Scharfe
Provide a simple way to run Coccinelle against all source files, in the
form of a Makefile target.  Running "make coccicheck" applies each
.cocci file in contrib/coccinelle/ on all source files.  It generates
a .patch file for each .cocci file, containing the actual changes for
effecting the transformations described by the semantic patches.

Non-empty .patch files are reported.  They can be applied to the work
tree using "patch -p0", but should be checked to e.g. make sure they
don't screw up formatting or create circular references.

Coccinelle's diagnostic output (stderr) is piped into .log files.

Linux has a much more elaborate make target of the same name; let's
start nice and easy.
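
For example, a rough transcript (file names arbitrary, progress lines
abbreviated to whatever the echo in the rule prints):

    $ make coccicheck
     SPATCH contrib/coccinelle/strbuf.cocci
     SPATCH result: contrib/coccinelle/strbuf.cocci.patch
    $ patch -p0 <contrib/coccinelle/strbuf.cocci.patch
    $ cat contrib/coccinelle/strbuf.cocci.patch.log   # spatch stderr, if curious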

Signed-off-by: Rene Scharfe 
---
 Makefile | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/Makefile b/Makefile
index 7f18492..74b2788 100644
--- a/Makefile
+++ b/Makefile
@@ -461,6 +461,7 @@ CURL_CONFIG = curl-config
 PTHREAD_LIBS = -lpthread
 PTHREAD_CFLAGS =
 GCOV = gcov
+SPATCH = spatch
 
 export TCL_PATH TCLTK_PATH
 
@@ -2307,6 +2308,18 @@ check: common-cmds.h
exit 1; \
fi
 
+C_SOURCES = $(patsubst %.o,%.c,$(C_OBJ))
+%.cocci.patch: %.cocci $(C_SOURCES)
+   @echo '' SPATCH $<; \
+   for f in $(C_SOURCES); do \
+   $(SPATCH) --sp-file $< $$f; \
+   done >$@ 2>$@.log; \
+   if test -s $@; \
+   then \
+   echo '' SPATCH result: $@; \
+   fi
+coccicheck: $(patsubst %.cocci,%.cocci.patch,$(wildcard contrib/coccinelle/*.cocci))
+
 ### Installation rules
 
 ifneq ($(filter /%,$(firstword $(template_dir))),)
@@ -2498,6 +2511,7 @@ clean: profile-clean coverage-clean
$(RM) -r $(GIT_TARNAME) .doc-tmp-dir
$(RM) $(GIT_TARNAME).tar.gz git-core_$(GIT_VERSION)-*.tar.gz
$(RM) $(htmldocs).tar.gz $(manpages).tar.gz
+   $(RM) contrib/coccinelle/*.cocci.patch*
$(MAKE) -C Documentation/ clean
 ifndef NO_PERL
$(MAKE) -C gitweb clean
-- 
2.10.0



[PATCH] use strbuf_addstr() for adding constant strings to a strbuf, part 2

2016-09-15 Thread René Scharfe
Replace uses of strbuf_addf() for adding strings with more lightweight
strbuf_addstr() calls.  This makes the intent clearer and avoids
potential issues with printf format specifiers.

02962d36845b89145cd69f8bc65e015d78ae3434 already converted six cases,
this patch covers eleven more.

A semantic patch for Coccinelle is included for easier checking for
new cases that might be introduced in the future.

Signed-off-by: Rene Scharfe 
---
Silly question: Is there a natural language that uses percent signs
as letters or e.g. instead of commas? :)

 builtin/fmt-merge-msg.c | 2 +-
 builtin/merge.c | 2 +-
 builtin/submodule--helper.c | 5 +++--
 contrib/coccinelle/strbuf.cocci | 5 +
 merge-recursive.c   | 2 +-
 remote.c| 8 
 wt-status.c | 6 +++---
 7 files changed, 18 insertions(+), 12 deletions(-)
 create mode 100644 contrib/coccinelle/strbuf.cocci

diff --git a/builtin/fmt-merge-msg.c b/builtin/fmt-merge-msg.c
index ac84e99..dc2e9e4 100644
--- a/builtin/fmt-merge-msg.c
+++ b/builtin/fmt-merge-msg.c
@@ -395,7 +395,7 @@ static void shortlog(const char *name,
 
for (i = 0; i < subjects.nr; i++)
if (i >= limit)
-   strbuf_addf(out, "  ...\n");
+   strbuf_addstr(out, "  ...\n");
else
strbuf_addf(out, "  %s\n", subjects.items[i].string);
 
diff --git a/builtin/merge.c b/builtin/merge.c
index 0ae099f..a8b57c7 100644
--- a/builtin/merge.c
+++ b/builtin/merge.c
@@ -940,7 +940,7 @@ static void write_merge_state(struct commit_list 
*remoteheads)
 
strbuf_reset(&buf);
if (fast_forward == FF_NO)
-   strbuf_addf(&buf, "no-ff");
+   strbuf_addstr(&buf, "no-ff");
write_file_buf(git_path_merge_mode(), buf.buf, buf.len);
 }
 
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 9d79f19..ad23155 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -859,8 +859,9 @@ static int update_clone_get_next_task(struct child_process 
*child,
ce = suc->failed_clones[index];
if (!prepare_to_clone_next_submodule(ce, child, suc, err)) {
suc->current ++;
-   strbuf_addf(err, "BUG: submodule considered for 
cloning,"
-   "doesn't need cloning any more?\n");
+   strbuf_addstr(err, "BUG: submodule considered for "
+  "cloning, doesn't need cloning "
+  "any more?\n");
return 0;
}
p = xmalloc(sizeof(*p));
diff --git a/contrib/coccinelle/strbuf.cocci b/contrib/coccinelle/strbuf.cocci
new file mode 100644
index 0000000..7932d48
--- /dev/null
+++ b/contrib/coccinelle/strbuf.cocci
@@ -0,0 +1,5 @@
+@@
+expression E1, E2;
+@@
+- strbuf_addf(E1, E2);
++ strbuf_addstr(E1, E2);
diff --git a/merge-recursive.c b/merge-recursive.c
index e349126..d2b191b 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -206,7 +206,7 @@ static void output_commit_title(struct merge_options *o, 
struct commit *commit)
find_unique_abbrev(commit->object.oid.hash,
DEFAULT_ABBREV));
if (parse_commit(commit) != 0)
-   strbuf_addf(&o->obuf, _("(bad commit)\n"));
+   strbuf_addstr(&o->obuf, _("(bad commit)\n"));
else {
const char *title;
const char *msg = get_commit_buffer(commit, NULL);
diff --git a/remote.c b/remote.c
index d29850a..ad6c542 100644
--- a/remote.c
+++ b/remote.c
@@ -2073,7 +2073,7 @@ int format_tracking_info(struct branch *branch, struct 
strbuf *sb)
_("Your branch is based on '%s', but the upstream is 
gone.\n"),
base);
if (advice_status_hints)
-   strbuf_addf(sb,
+   strbuf_addstr(sb,
_("  (use \"git branch --unset-upstream\" to 
fixup)\n"));
} else if (!ours && !theirs) {
strbuf_addf(sb,
@@ -2086,7 +2086,7 @@ int format_tracking_info(struct branch *branch, struct 
strbuf *sb)
   ours),
base, ours);
if (advice_status_hints)
-   strbuf_addf(sb,
+   strbuf_addstr(sb,
_("  (use \"git push\" to publish your local 
commits)\n"));
} else if (!ours) {
strbuf_addf(sb,
@@ -2097,7 +2097,7 @@ int format_tracking_info(struct branch *branch, struct 
strbuf *sb)
   theirs),
base, theirs);
if (advice_status_hints)
-   strbuf_addf(sb,
+   

[PATCH] contrib/coccinelle: fix semantic patch for oid_to_hex_r()

2016-09-15 Thread René Scharfe
Both sha1_to_hex_r() and oid_to_hex_r() take two parameters, so use two
expressions in the semantic patch for transforming calls of the former
to the latter one.

Signed-off-by: Rene Scharfe 
---
 contrib/coccinelle/object_id.cocci | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/contrib/coccinelle/object_id.cocci 
b/contrib/coccinelle/object_id.cocci
index 8ccdbb5..0307624 100644
--- a/contrib/coccinelle/object_id.cocci
+++ b/contrib/coccinelle/object_id.cocci
@@ -23,16 +23,16 @@ expression E1;
 + oid_to_hex(E1)
 
 @@
-expression E1;
+expression E1, E2;
 @@
-- sha1_to_hex_r(E1.hash)
-+ oid_to_hex_r(&E1)
+- sha1_to_hex_r(E1, E2.hash)
++ oid_to_hex_r(E1, &E2)
 
 @@
-expression E1;
+expression E1, E2;
 @@
-- sha1_to_hex_r(E1->hash)
-+ oid_to_hex_r(E1)
+- sha1_to_hex_r(E1, E2->hash)
++ oid_to_hex_r(E1, E2)
 
 @@
 expression E1;
-- 
2.10.0



Re: [wishlist?] make submodule commands robust to having non-submodule Subprojects

2016-09-15 Thread Junio C Hamano
Yaroslav Halchenko  writes:

> do you foresee any unpleasant side-effects from above manual editing
> .gitmodules/submodule update --init ?

I do not think so; because we lack such a canned command, you
essentially did by hand what it should have done for you.


Re: [wishlist?] make submodule commands robust to having non-submodule Subprojects

2016-09-15 Thread Junio C Hamano
Stefan Beller  writes:

> When searching around the net, some people use half
> initialized submodules intentionally,...
>
> Not sure I agree with such a setup, but people use it.

In such a top-level project, people would not use "git submodule"
command, would they?  I do not think anybody in this thread was
pushing to forbid such a use, and it may be perfectly fine if "git
submodule" does not work for such a gitlink; after all such a
subdirectory is not even meant to be a submodule.

> So how about this fictional work flow:
>
>  $ git init top
>  $ cd top
>  $ git commit --allow-empty -m 'initial in top'
>  $ git init sub
>  $ git -C sub commit --allow-empty -m 'initial in sub'
>  $ git add sub
> You added a gitlink, but no corresponding entry in
> .gitmodules is found. This is fine for gits core functionality, but
> the submodule command gets confused by this unless you add 'sub'
> to your .gitmodules via `git submodule add --already-in-tree \
> --reuse-submodules-origin-as-URL sub`. Alternatively you can make this
> message disappear by configuring advice.gitlinkPitfalls.

I am not sure if I agree with that direction.

If the trend in Git community collectively these days is to make
usage of submodules easier and smoother, I'd imagine that you would
want to teach "git add" that was given a submodule to "git submodule
add" instead by default, with an option "git add --no-gitmodules
sub" to disable it, or something like that.

>  $ git submodule add --fixup-modules-file ./sub sub
>  Adding .gitmodules entry only for `sub` to use `git -C remote
> show origin` as URL.

I agree that a feature like this is needed regardless of what
happens at "git add" time.


Re: [wishlist?] make submodule commands robust to having non-submodule Subprojects

2016-09-15 Thread Stefan Beller
On Thu, Sep 15, 2016 at 11:12 AM, Yaroslav Halchenko  
wrote:

>
> do you foresee any unpleasant side-effects from above manual editing
> .gitmodules/submodule update --init ?

I think that is fine, but un(der)documented. So you have to figure out
from experience what to do exactly.


Re: [wishlist?] make submodule commands robust to having non-submodule Subprojects

2016-09-15 Thread Stefan Beller
On Thu, Sep 15, 2016 at 11:02 AM, Junio C Hamano  wrote:

> I think that is a more pressing thing to address.  Once we make it
> easier for the user to bring a half-initialized submodule properly
> into the world view of the submodule subsystem, we would have to
> worry about the reported failure case even less and you do not need
> to pile on workaround options to let things continue in a state that
> is half-broken (that is, in a state that is perfectly sane to the
> core layer, but is not liked by the submodule layer).

Heh, I see.

When searching around the net, some people use half
initialized submodules intentionally, e.g. I'll store some private keys in
sub and I publish the superproject asking for collaborators to my new shiny
webbased thing. The submodule containing the private keys never leaves
my hard drive, hence no .gitmodules entry is necessary.

Not sure I agree with such a setup, but people use it.

So how about this fictional work flow:

 $ git init top
 $ cd top
 $ git commit --allow-empty -m 'initial in top'
 $ git init sub
 $ git -C sub commit --allow-empty -m 'initial in sub'
 $ git add sub
You added a gitlink, but no corresponding entry in
.gitmodules is found. This is fine for gits core functionality, but
the submodule command gets confused by this unless you add 'sub'
to your .gitmodules via `git submodule add --already-in-tree \
--reuse-submodules-origin-as-URL sub`. Alternatively you can make this
message disappear by configuring advice.gitlinkPitfalls.

 $ git submodule
 ... similar advice goes here...
 fatal: no submodule mapping found in .gitmodules for path 'sub'

 $ git submodule add --fixup-modules-file ./sub sub
 Adding .gitmodules entry only for `sub` to use `git -C remote
show origin` as URL.

 # user is happy now.


Re: [wishlist?] make submodule commands robust to having non-submodule Subprojects

2016-09-15 Thread Yaroslav Halchenko

On Thu, 15 Sep 2016, Junio C Hamano wrote:

> >> which then stops without even looking at other submodules.

> >> I think it would be more logical to make it a 'warning:' not a 'fatal:' and
> >> proceed.

> Making "git submodule" listing to continue from that point may be
> one thing, but do we have a sensible way in "git submodule add" to
> allow the user to recover from this condition?  That is, "git add"
> is a right way to tell the core level that there is a gitlink, but
> as Yaroslav correctly observed in the early part of his message,
> having that gitlink alone is not good enough for the world view of 
> "git submodule" that sits at higher-layer.  And the usual way to
> tell the submodule layer that there is a submodule is with "git
> submodule add", but

>   $ git init top
> $ cd top
> $ git commit --allow-empty -m 'initial in top'
> $ git init sub
> $ git -C sub commit --allow-empty -m 'initial in sub'

> $ git add sub
>   $ git submodule
> fatal: no submodule mapping found in .gitmodules for path 'sub'

>   $ git submodule add ./sub sub
> 'sub' already exists in the index
>   $ git submodule add -f ./sub sub
> 'sub' already exists in the index

FWIW
I could have sworn that I have tried to 'submodule add' it and it
worked... but pragmatically I just did edit .gitmodules, added the
record for it, committed it, and then iirc git submodule update --init
which seemed to make  git happy... FTR:

$> git submodule add ./sub ./sub
'sub' already exists in the index

$> git submodule add ./sub/ ./sub/
'sub' already exists in the index

$> vim .gitmodules

$> git add .gitmodules
cached/staged changes:  


  
 .gitmodules | 4 
 sub | 1 +

$> git submodule update --init
Submodule 'sub' (/tmp/111/top/sub) registered for path 'sub'
cached/staged changes:  


  
 .gitmodules | 4 
 sub | 1 +

$> git commit -m 'added finally'
[master aa6d912] added finally
 2 files changed, 5 insertions(+)
 create mode 100644 .gitmodules
 create mode 160000 sub

$> git submodule
 6f574b298ef51aebd36daafad450a3e38802ca03 sub (heads/master)


> I highly suspect that the user will then get stuck at this point,
> after trying to "submodule add" and then even attempting to force
> it.

> I think that is a more pressing thing to address.  Once we make it
> easier for the user to bring a half-initialized submodule properly
> into the world view of the submodule subsystem, we would have to
> worry about the reported failure case even less and you do not need 
> to pile on workaround options to let things continue in a state that
> is half-broken (that is, in a state that is perfectly sane to the
> core layer, but is not liked by the submodule layer).

do you foresee any unpleasant side-effects from above manual editing
.gitmodules/submodule update --init ?

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834   Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik


Re: [wishlist?] make submodule commands robust to having non-submodule Subprojects

2016-09-15 Thread Junio C Hamano
Stefan Beller  writes:

> On Thu, Sep 15, 2016 at 6:02 AM, Yaroslav Halchenko  
> wrote:
>>
>> If e.g. you just 'git add' a subdirectory which is a git repository, git
>> adds it as a subproject but doesn't initiate any entry in .gitmodules
>> since it is the job done by submodule and git core itself is
>> agnostic of those beasts.
>> ...
>> $> git submodule
>>  cc6a09ac06c13cf06b4f4c8b54cda9a535e4e385 ds01 (2.0.0+4)
>>  0a9f3b66e06a2137311a537b7377c336f1fb30ad ds02 (1.0.0-3-g0a9f3b6)
>>  9da7e4f4221699915645ac2003298c6aba2db109 ds03 (1.1.0+4)
>>  fe16cacb5cb9b4d53c50e498298fab182500e147 ds05 (2.0.0+3)
>>  6898d99ff3ba26880183ed3672a458a7fcde1737 ds06 (2.0.0+2)
>>  bbd10f634fe87e9d5853df3a891edbdb18cda7f9 ds07 (2.0.0+3)
>>  138e6730193c0585a69b8baf5b9d7a4439e83ecc ds08 (2.0.0+2)
>>  ddf3a4cf7ce51a01a664e6faff4b8334b8414b1f ds09 (2.0.1+1)
>>  7fa73b4df8166dba950c7dc07c3f8cdd50fca313 ds11 (1.0.0-5-g7fa73b4)
>> fatal: no submodule mapping found in .gitmodules for path 'ds17'
>>
>> which then stops without even looking at other submodules.
>>
>> I think it would be more logical to make it a 'warning:' not a 'fatal:' and
>> proceed.

Making "git submodule" listing to continue from that point may be
one thing, but do we have a sensible way in "git submodule add" to
allow the user to recover from this condition?  That is, "git add"
is a right way to tell the core level that there is a gitlink, but
as Yaroslav correctly observed in the early part of his message,
having that gitlink alone is not good enough for the world view of 
"git submodule" that sits at higher-layer.  And the usual way to
tell the submodule layer that there is a submodule is with "git
submodule add", but

$ git init top
$ cd top
$ git commit --allow-empty -m 'initial in top'
$ git init sub
$ git -C sub commit --allow-empty -m 'initial in sub'

$ git add sub
$ git submodule
fatal: no submodule mapping found in .gitmodules for path 'sub'

$ git submodule add ./sub sub
'sub' already exists in the index
$ git submodule add -f ./sub sub
'sub' already exists in the index

I highly suspect that the user will then get stuck at this point,
after trying to "submodule add" and then even attempting to force
it.

I think that is a more pressing thing to address.  Once we make it
easier for the user to bring a half-initialized submodule properly
into the world view of the submodule subsystem, we would have to
worry about the reported failure case even less and you do not need 
to pile on workaround options to let things continue in a state that
is half-broken (that is, in a state that is perfectly sane to the
core layer, but is not liked by the submodule layer).



Re: git add --intent-to-add silently creates empty commits

2016-09-15 Thread Junio C Hamano
Aviv Eyal  writes:

> Using `git add -N` allows creating of empty commits:
>
> git init test && cd test
> echo text > file
> git add --intent-to-add file
> git commit -m 'Empty commit'
> echo $?# prints 0
> ...
> I'd expect `git commit` to error out instead of producing an empty commit.
>
> I've seen this with git 2.8.1 and 2.10.0.129.g35f6318

I think I've seen this reported some time ago.

https://public-inbox.org/git/%3ccacsjy8a8-rgpyxysjbalrmia7d3dfqpr4cxasnsalycnmgm...@mail.gmail.com%3E/

I do not offhand recall what happend to the topic after that.


Re: [PATCH] object: measure time needed for resolving hash collisions

2016-09-15 Thread Junio C Hamano
Jeff King  writes:

> Measuring _just_ the collisions is more like the patch below. In my
> measurements it's more like 30ms, compared to 10s for all of the
> hashcmps.
>
> So we really aren't dealing with collisions, but rather just verifying
> that our hash landed at the right spot. And _any_ data structure is
> going to have to do that.

The reverse side of the coin may be if we can shrink the hashtable
smaller and load it more heavily without sacrificing performance by
making the necessary "have we landed at the right spot" check cheap
enough, I guess.



Re: Tracking down a segfault in delta_base_cache

2016-09-15 Thread Junio C Hamano
Jonathon Mah  writes:

>> On 2016-09-14, at 17:56, Jeff King  wrote:
>> 
>> Have you tried with the patch in:
>> 
>>  
>> http://public-inbox.org/git/20160912164616.vg33kldazuthf...@sigill.intra.peff.net/
> All the examples I've tried work when I use that. Thanks!

Peff, thanks for a quick suggestion and Jonathon, thanks for a quick
confirmation.



Re: [wishlist?] make submodule commands robust to having non-submodule Subprojects

2016-09-15 Thread Stefan Beller
On Thu, Sep 15, 2016 at 9:40 AM, Yaroslav Halchenko  wrote:
>
> On Thu, 15 Sep 2016, Stefan Beller wrote:
>
>> > I think it would be more logical to make it a 'warning:' not a 'fatal:' and
>> > proceed.
>
>> So maybe we would want to introduce a switch
>>   `--existing-but-unconfigured-gitlinks=(warn|ignore)`
>> as well as
>> `git config submodule.existing-but-unconfigured (warn|ignore)`
>> for a more permanent solution?
>
> possibly ignorant question:  is  gitlink === Subproject?  or a
> Subproject is a kind of gitlink and there are other gitlinks which
> aren't Subprojects? ;)
>

gitlink is the internal name (just like a file is called blob, or a directory is
called tree; there is no file system equivalent for e.g. commits, or
gitlinks)

A gitlink, as a basic building block, only points at a sha1 that is
recorded as part of the repository.

Submodules use gitlinks to point at (usually) different projects, e.g.
a library.

"Subproject" in Git is a loose term: it could mean a submodule, or part of
your repo merged in via the git-subtree command, or just a loose
repository inside your repository.

A gitlink does not necessarily need to track another project; it's "just a
pointer" to a specific version of a repository.
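
To make that concrete: in a tree object and in the index, a gitlink is
nothing but an entry whose mode is 0160000 and whose object name is a
commit. A minimal sketch of that classification (illustration only, not
git code; the octal constants match the tree-entry modes git uses):

    #include <stdio.h>

    /* Illustration only: these octal values match git's tree-entry modes. */
    #define MODE_TREE    0040000
    #define MODE_GITLINK 0160000   /* a commit pointer, e.g. a submodule */
    #define MODE_FMT     0170000

    static const char *entry_kind(unsigned int mode)
    {
            switch (mode & MODE_FMT) {
            case MODE_TREE:    return "tree (directory)";
            case MODE_GITLINK: return "gitlink (pointer to a commit)";
            default:           return "blob (file or symlink contents)";
            }
    }

    int main(void)
    {
            printf("0160000 -> %s\n", entry_kind(MODE_GITLINK));
            printf("0100644 -> %s\n", entry_kind(0100644));
            return 0;
    }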

So to come back to your question: yeah, in that response I meant
submodule==gitlink, though I named the internal data structure to be
more explicit. (What if we invent a new way to do submodules that does
not use gitlinks? Then that config option may be less confusing and
backwards compatible. That is unlikely though. So I guess we'll settle
on a different name.)


Re: [PATCH] Move format-patch base commit and prerequisites before email signature

2016-09-15 Thread Junio C Hamano
Josh Triplett  writes:

> I'd suggest squashing in an *additional* patch to the testsuite to
> ensure the presence of the blank line:

Thanks, will do.

> diff --git a/t/t4014-format-patch.sh b/t/t4014-format-patch.sh
> index 535857e..8d90a6e 100755
> --- a/t/t4014-format-patch.sh
> +++ b/t/t4014-format-patch.sh
> @@ -1515,8 +1515,9 @@ test_expect_success 'format-patch -o overrides 
> format.outputDirectory' '
>  
>  test_expect_success 'format-patch --base' '
>   git checkout side &&
> - git format-patch --stdout --base=HEAD~3 -1 | tail -n 6 >actual &&
> - echo "base-commit: $(git rev-parse HEAD~3)" >expected &&
> + git format-patch --stdout --base=HEAD~3 -1 | tail -n 7 >actual &&
> + echo >expected &&
> + echo "base-commit: $(git rev-parse HEAD~3)" >>expected &&
>   echo "prerequisite-patch-id: $(git show --patch HEAD~2 | git patch-id 
> --stable | awk "{print \$1}")" >>expected &&
>   echo "prerequisite-patch-id: $(git show --patch HEAD~1 | git patch-id 
> --stable | awk "{print \$1}")" >>expected &&
>   signature >> expected &&


Re: [wishlist?] make submodule commands robust to having non-submodule Subprojects

2016-09-15 Thread Yaroslav Halchenko

On Thu, 15 Sep 2016, Stefan Beller wrote:

> > I think it would be more logical to make it a 'warning:' not a 'fatal:' and
> > proceed.

> So maybe we would want to introduce a switch
>   `--existing-but-unconfigured-gitlinks=(warn|ignore)`
> as well as
> `git config submodule.existing-but-unconfigured (warn|ignore)`
> for a more permanent solution?

possibly ignorant question:  is  gitlink === Subproject?  or a
Subproject is a kind of gitlink and there are other gitlinks which
aren't Subprojects? ;)

-- 
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834   Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik


Re: [PATCH] object: measure time needed for resolving hash collisions

2016-09-15 Thread Stefan Beller
On Wed, Sep 14, 2016 at 11:47 PM, Jeff King  wrote:
> On Wed, Sep 14, 2016 at 07:01:41PM -0700, Stefan Beller wrote:
>
>>  According to Jeff, sending patches that don't get accepted is the new hype!
>
> It is what all the cool kids are doing. Unfortunately, it does not save
> you from nitpicky reviews...;)
>
>>   first = i = hash_obj(sha1, obj_hash_size);
>> + clock_gettime(CLOCK_MONOTONIC, );
>>   while ((obj = obj_hash[i]) != NULL) {
>>   if (!hashcmp(sha1, obj->oid.hash))
>>   break;
>> @@ -98,6 +131,9 @@ struct object *lookup_object(const unsigned char *sha1)
>>   if (i == obj_hash_size)
>>   i = 0;
>>   }
>> + clock_gettime(CLOCK_MONOTONIC, );
>> + diff(, , _diff);
>> + add_time_to(, _diff);
>>   if (obj && i != first) {
>
> I don't think this is actually measuring the time spent on collisions.
> It's measuring the time we spend in hashcmp(), but that includes the
> non-collision case where we find it in the first hashcmp.

Right. I measured all lookup times, i.e. this function accounts for
1/3 of the run
time.

>
> Measuring _just_ the collisions is more like the patch below. In my
> measurements it's more like 30ms, compared to 10s for all of the
> hashcmps.

So off by a few orders of magnitude, i.e. we don't have to worry about
the time spent in resolving collisions.

>
> So we really aren't dealing with collisions, but rather just verifying
> that our hash landed at the right spot. And _any_ data structure is
> going to have to do that. If you want to make it faster, I'd try
> optimizing hashcmp (and you can see why the critbit tree was slower; if
> we spend so much time just on hashcmp() to make sure we're at the right
> key, then making that slower with a bunch of branching is not going to
> help).
>
> I notice we still open-code hashcmp. I get a slight speedup by switching
> it to memcmp(). About 2.5%, which is similar to what I showed in
>
>   http://public-inbox.org/git/20130318073229.ga5...@sigill.intra.peff.net/
>
> a few years ago (though it's more pronounced as a portion of the whole
> now, because we've optimized some of the other bits).
>
> The main driver there was memcmp() improvements that went into glibc
> 2.13 several years ago. It might be time to start assuming that memcmp()
> beats our open-coded loop.

http://stackoverflow.com/questions/21106801/why-is-memcmp-so-much-faster-than-a-for-loop-check

seems to agree with you; so I'd think I'll agree with switching over.
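
For reference, a rough sketch of the shape of that change (hedged: this is
not the actual cache.h definition, just an illustration assuming a 20-byte
raw hash):

    #include <string.h>

    #define RAWSZ 20   /* assumption: SHA-1 raw length */

    /* Roughly the open-coded comparison being discussed. */
    static inline int hashcmp_loop(const unsigned char *a, const unsigned char *b)
    {
            int i;
            for (i = 0; i < RAWSZ; i++, a++, b++)
                    if (*a != *b)
                            return *a - *b;
            return 0;
    }

    /* The proposed variant: let the platform's optimized memcmp do the work. */
    static inline int hashcmp_memcmp(const unsigned char *a, const unsigned char *b)
    {
            return memcmp(a, b, RAWSZ);
    }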

>
> It may also be possible to really micro-optimize it on some platforms,
> because we know the size in advance (I'd kind of expect the compiler to
> do that, but if we're ending up in glibc memcmp then it sounds like it
> is not the case).

That stackoverflow link suggests that glibc already has microoptimisations
for a variety of platforms.

>
> -Peff


Re: [wishlist?] make submodule commands robust to having non-submodule Subprojects

2016-09-15 Thread Stefan Beller
On Thu, Sep 15, 2016 at 6:02 AM, Yaroslav Halchenko  wrote:
> NB echoes some questions of mine a few days back on IRC about Subprojects
> and submodules
>
> If e.g. you just 'git add' a subdirectory which is a git repository, git
> adds it as a subproject but doesn't create any entry in .gitmodules,
> since that is the job done by 'git submodule' and git core itself is
> agnostic of those beasts.
>
> But then having these "Subproject"s which aren't registered as submodules
> (and I haven't found any other use for them besides being a submodule)
> breaks "git submodule" commands, e.g.
>
> $> git submodule
>  cc6a09ac06c13cf06b4f4c8b54cda9a535e4e385 ds01 (2.0.0+4)
>  0a9f3b66e06a2137311a537b7377c336f1fb30ad ds02 (1.0.0-3-g0a9f3b6)
>  9da7e4f4221699915645ac2003298c6aba2db109 ds03 (1.1.0+4)
>  fe16cacb5cb9b4d53c50e498298fab182500e147 ds05 (2.0.0+3)
>  6898d99ff3ba26880183ed3672a458a7fcde1737 ds06 (2.0.0+2)
>  bbd10f634fe87e9d5853df3a891edbdb18cda7f9 ds07 (2.0.0+3)
>  138e6730193c0585a69b8baf5b9d7a4439e83ecc ds08 (2.0.0+2)
>  ddf3a4cf7ce51a01a664e6faff4b8334b8414b1f ds09 (2.0.1+1)
>  7fa73b4df8166dba950c7dc07c3f8cdd50fca313 ds11 (1.0.0-5-g7fa73b4)
> fatal: no submodule mapping found in .gitmodules for path 'ds17
>
> which then stops without even looking at other submodules.
>
> I think it would be more logical to make it a 'warning:' not a 'fatal:' and
> proceed.

So maybe we would want to introduce a switch
  `--existing-but-unconfigured-gitlinks=(warn|ignore)`
as well as
`git config submodule.existing-but-unconfigured (warn|ignore)`
for a more permanent solution?



>
> Thank you for consideration
> --
> Yaroslav O. Halchenko
> Center for Open Neuroscience http://centerforopenneuroscience.org
> Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
> Phone: +1 (603) 646-9834   Fax: +1 (603) 646-1419
> WWW:   http://www.linkedin.com/in/yarik


Re: Tracking down a segfault in delta_base_cache

2016-09-15 Thread Jonathon Mah

> On 2016-09-14, at 17:56, Jeff King  wrote:
> 
> On Wed, Sep 14, 2016 at 05:42:29PM -0700, Jonathon Mah wrote:
> 
>> Hi git, I've been seeing git segfault over the past few days. I'm on Mac OS 
>> X 10.12, 64-bit, compiling with clang (Apple LLVM version 8.0.0 
>> (clang-800.0.40)).
>> [...]
>> Bisect says:
>> 
>> 8261e1f139db3f8aa6f9fd7d98c876cbeb0f927c is the first bad commit
>> commit 8261e1f139db3f8aa6f9fd7d98c876cbeb0f927c
>> Author: Jeff King 
>> Date:   Mon Aug 22 18:00:07 2016 -0400
>> 
>>delta_base_cache: use hashmap.h
> 
> Have you tried with the patch in:
> 
>  
> http://public-inbox.org/git/20160912164616.vg33kldazuthf...@sigill.intra.peff.net/
> 
> ?

All the examples I've tried work when I use that. Thanks!

>> $ lldb /Users/jmah/Documents/Streams/git/git-log -- -u
>> (lldb) target create "/Users/jmah/Documents/Streams/git/git-log"
>> Current executable set to '/Users/jmah/Documents/Streams/git/git-log' 
>> (x86_64).
>> (lldb) settings set -- target.run-args  "-u"
>> (lldb) process launch -o /dev/null
>> Process 92815 launched: '/Users/jmah/Documents/Streams/git/git-log' (x86_64)
>> Process 92815 stopped
>> * thread #1: tid = 0x1c30677, 0x0001001bba80 
>> git-log`release_delta_base_cache(ent=0xffd0) + 16 at 
>> sha1_file.c:2171, queue = 'com.apple.main-thread', stop reason = 
>> EXC_BAD_ACCESS (code=1, address=0x10)
>>frame #0: 0x0001001bba80 
>> git-log`release_delta_base_cache(ent=0xffd0) + 16 at 
>> sha1_file.c:2171
>>   2168   
>>   2169   static inline void release_delta_base_cache(struct 
>> delta_base_cache_entry *ent)
>>   2170   {
>> -> 2171  free(ent->data);
>>   2172   detach_delta_base_cache_entry(ent);
> 
> The problems I saw with valgrind weren't here, but would explain this.
> We free() the previous node, then walk forward from its "next" pointer.
> On my Linux box, that happens to work, but we could be feeding total
> junk to the list pointer, which would mean ent->data is junk, and
> free() notices.
> 
> -Peff
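
To illustrate the failure mode described in the quote above, here is a
minimal, self-contained sketch (not the actual delta_base_cache/hashmap
code) of freeing a list node and then reading its next pointer, next to
the safe variant that saves the successor first:

    #include <stdlib.h>

    struct node {
            struct node *next;
            void *data;
    };

    /* Buggy pattern: cur->next is read after cur has been freed. */
    static void drop_all_buggy(struct node *cur)
    {
            for (; cur; cur = cur->next) {  /* use-after-free on cur->next */
                    free(cur->data);
                    free(cur);
            }
    }

    /* Safe pattern: remember the successor before freeing the node. */
    static void drop_all_safe(struct node *cur)
    {
            while (cur) {
                    struct node *next = cur->next;
                    free(cur->data);
                    free(cur);
                    cur = next;
            }
    }

    int main(void)
    {
            struct node *b = calloc(1, sizeof(*b));
            struct node *a = calloc(1, sizeof(*a));
            a->next = b;
            drop_all_safe(a);  /* drop_all_buggy(a) may crash, as in the report */
            return 0;
    }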



Re: [RFC] extending pathspec support to submodules

2016-09-15 Thread Brandon Williams
On Thu, Sep 15, 2016 at 4:57 AM, Heiko Voigt  wrote:
>
> The problem when you do that is that the child is not aware that it is
> actually run as a submodule process. E.g.
>
>git grep --recurse-submodules foobar -- sub/dir/a
>
> would report back matches in 'dir/a' instead of 'sub/dir/a'. From the
> user's perspective this will be confusing since she started the grep in
> the superproject.
>
> So what I would suggest is that we make the children aware of the fact
> that they are called from a superproject by passing a prefix when
> calling it. E.g. like this
>
>git grep --submodule-prefix=sub foobar -- sub/dir/a
>
> Whether we pass the full or stripped path is now a matter of taste or
> ease of implementation, since we could either preprend or strip it in
> the child. But the important difference is that we have information
> about the original path.
>
> Another approach could, of course, be to omit passing the prefix and
> parse the output of the child and try to substitute paths, but that
> seems quite error prone to me.
>
> Cheers Heiko

You're right, that seems like the best course of action, and it already falls
in line with what I did in a first patch to ls-files to support submodules.
In that patch I did exactly as you suggest: pass the prefix to the
submodule and make the child responsible for prepending the prefix to all of
its output.  This way we can simply pass through the whole pathspec (as opposed
to my original idea of stripping the prefix off the pathspec prior to passing
it to the child...which can get complicated with wildcard characters) to the
child process, and when checking if a file matches the pathspec we can check if
the prefix + file path matches.

-Brandon
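
A tiny sketch of that prepend-and-match idea (purely illustrative;
submodule_prefix, report_match and the literal prefix matching are
stand-ins, not actual git API):

    #include <stdio.h>
    #include <string.h>

    /* Hypothetical prefix handed down from the superproject, e.g. "sub/". */
    static const char *submodule_prefix = "sub/";

    /* Report a match the way the user expects: with the superproject path. */
    static void report_match(const char *path_in_submodule)
    {
            printf("%s%s: match\n", submodule_prefix, path_in_submodule);
    }

    /* Check the unmodified superproject pathspec against prefix + path. */
    static int matches_pathspec(const char *pathspec, const char *path_in_submodule)
    {
            char full[4096];
            snprintf(full, sizeof(full), "%s%s", submodule_prefix, path_in_submodule);
            /* Stand-in for real pathspec matching: treat it as a literal prefix. */
            return !strncmp(full, pathspec, strlen(pathspec));
    }

    int main(void)
    {
            if (matches_pathspec("sub/dir/a", "dir/a"))
                    report_match("dir/a");   /* prints "sub/dir/a: match" */
            return 0;
    }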


[PATCH v3 12/14] i18n: show-branch: mark error messages for translation

2016-09-15 Thread Vasco Almeida
Lowercase the first word of some messages to match the style of the others.

Signed-off-by: Vasco Almeida 
---
 builtin/show-branch.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 2566935..5809371 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -538,7 +538,7 @@ static void append_one_rev(const char *av)
for_each_ref(append_matching_ref, NULL);
if (saved_matches == ref_name_cnt &&
ref_name_cnt < MAX_REVS)
-   error("no matching refs with %s", av);
+   error(_("no matching refs with %s"), av);
if (saved_matches + 1 < ref_name_cnt)
sort_ref_range(saved_matches, ref_name_cnt);
return;
@@ -701,8 +701,8 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
 *
 * Also --all and --remotes do not make sense either.
 */
-   die("--reflog is incompatible with --all, --remotes, "
-   "--independent or --merge-base");
+   die(_("--reflog is incompatible with --all, --remotes, "
+ "--independent or --merge-base"));
}
 
/* If nothing is specified, show all branches by default */
@@ -725,16 +725,16 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
av = fake_av;
ac = 1;
if (!*av)
-   die("no branches given, and HEAD is not valid");
+   die(_("no branches given, and HEAD is not 
valid"));
}
if (ac != 1)
-   die("--reflog option needs one branch name");
+   die(_("--reflog option needs one branch name"));
 
if (MAX_REVS < reflog)
die("Only %d entries can be shown at one time.",
MAX_REVS);
if (!dwim_ref(*av, strlen(*av), oid.hash, ))
-   die("No such ref %s", *av);
+   die(_("no such ref %s"), *av);
 
/* Has the base been specified? */
if (reflog_base) {
@@ -828,10 +828,10 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
if (MAX_REVS <= num_rev)
die("cannot handle more than %d revs.", MAX_REVS);
if (get_sha1(ref_name[num_rev], revkey.hash))
-   die("'%s' is not a valid ref.", ref_name[num_rev]);
+   die(_("'%s' is not a valid ref."), ref_name[num_rev]);
commit = lookup_commit_reference(revkey.hash);
if (!commit)
-   die("cannot find commit %s (%s)",
+   die(_("cannot find commit %s (%s)"),
ref_name[num_rev], oid_to_hex());
parse_commit(commit);
mark_seen(commit, );
-- 
2.7.4



[PATCH v3 13/14] i18n: show-branch: mark plural strings for translation

2016-09-15 Thread Vasco Almeida
Mark plural string for translation using Q_().

Although we already know that the plural sentence is always used in the
English source, other languages have complex plural rules they must
comply with according to the value of MAX_REVS.

Signed-off-by: Vasco Almeida 
---
 builtin/show-branch.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/builtin/show-branch.c b/builtin/show-branch.c
index 5809371..623ca56 100644
--- a/builtin/show-branch.c
+++ b/builtin/show-branch.c
@@ -373,8 +373,9 @@ static int append_ref(const char *refname, const struct 
object_id *oid,
return 0;
}
if (MAX_REVS <= ref_name_cnt) {
-   warning("ignoring %s; cannot handle more than %d refs",
-   refname, MAX_REVS);
+   warning(Q_("ignoring %s; cannot handle more than %d ref",
+  "ignoring %s; cannot handle more than %d refs",
+  MAX_REVS), refname, MAX_REVS);
return 0;
}
ref_name[ref_name_cnt++] = xstrdup(refname);
@@ -731,8 +732,9 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
die(_("--reflog option needs one branch name"));
 
if (MAX_REVS < reflog)
-   die("Only %d entries can be shown at one time.",
-   MAX_REVS);
+   die(Q_("only %d entry can be shown at one time.",
+  "only %d entries can be shown at one time.",
+  MAX_REVS), MAX_REVS);
if (!dwim_ref(*av, strlen(*av), oid.hash, ))
die(_("no such ref %s"), *av);
 
@@ -826,7 +828,9 @@ int cmd_show_branch(int ac, const char **av, const char 
*prefix)
unsigned int flag = 1u << (num_rev + REV_SHIFT);
 
if (MAX_REVS <= num_rev)
-   die("cannot handle more than %d revs.", MAX_REVS);
+   die(Q_("cannot handle more than %d rev.",
+  "cannot handle more than %d revs.",
+  MAX_REVS), MAX_REVS);
if (get_sha1(ref_name[num_rev], revkey.hash))
die(_("'%s' is not a valid ref."), ref_name[num_rev]);
commit = lookup_commit_reference(revkey.hash);
-- 
2.7.4



[PATCH v3 09/14] i18n: notes: mark error messages for translation

2016-09-15 Thread Vasco Almeida
Signed-off-by: Vasco Almeida 
---
 builtin/notes.c | 20 +++-
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index f848b89..229ad6d 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -340,7 +340,9 @@ static struct notes_tree *init_notes_check(const char 
*subcommand,
 
ref = (flags & NOTES_INIT_WRITABLE) ? t->update_ref : t->ref;
if (!starts_with(ref, "refs/notes/"))
-   die("Refusing to %s notes in %s (outside of refs/notes/)",
+   /* TRANSLATORS: the first %s will be replaced by a
+  git notes command: 'add', 'merge', 'remove', etc.*/
+   die(_("Refusing to %s notes in %s (outside of refs/notes/)"),
subcommand, ref);
return t;
 }
@@ -680,11 +682,11 @@ static int merge_abort(struct notes_merge_options *o)
 */
 
if (delete_ref("NOTES_MERGE_PARTIAL", NULL, 0))
-   ret += error("Failed to delete ref NOTES_MERGE_PARTIAL");
+   ret += error(_("Failed to delete ref NOTES_MERGE_PARTIAL"));
if (delete_ref("NOTES_MERGE_REF", NULL, REF_NODEREF))
-   ret += error("Failed to delete ref NOTES_MERGE_REF");
+   ret += error(_("Failed to delete ref NOTES_MERGE_REF"));
if (notes_merge_abort(o))
-   ret += error("Failed to remove 'git notes merge' worktree");
+   ret += error(_("Failed to remove 'git notes merge' worktree"));
return ret;
 }
 
@@ -704,11 +706,11 @@ static int merge_commit(struct notes_merge_options *o)
 */
 
if (get_sha1("NOTES_MERGE_PARTIAL", sha1))
-   die("Failed to read ref NOTES_MERGE_PARTIAL");
+   die(_("Failed to read ref NOTES_MERGE_PARTIAL"));
else if (!(partial = lookup_commit_reference(sha1)))
-   die("Could not find commit from NOTES_MERGE_PARTIAL.");
+   die(_("Could not find commit from NOTES_MERGE_PARTIAL."));
else if (parse_commit(partial))
-   die("Could not parse commit from NOTES_MERGE_PARTIAL.");
+   die(_("Could not parse commit from NOTES_MERGE_PARTIAL."));
 
if (partial->parents)
hashcpy(parent_sha1, partial->parents->item->object.oid.hash);
@@ -721,10 +723,10 @@ static int merge_commit(struct notes_merge_options *o)
o->local_ref = local_ref_to_free =
resolve_refdup("NOTES_MERGE_REF", 0, sha1, NULL);
if (!o->local_ref)
-   die("Failed to resolve NOTES_MERGE_REF");
+   die(_("Failed to resolve NOTES_MERGE_REF"));
 
if (notes_merge_commit(o, t, partial, sha1))
-   die("Failed to finalize notes merge");
+   die(_("Failed to finalize notes merge"));
 
/* Reuse existing commit message in reflog message */
memset(_ctx, 0, sizeof(pretty_ctx));
-- 
2.7.4



[PATCH v3 14/14] i18n: update-index: mark warnings for translation

2016-09-15 Thread Vasco Almeida
Signed-off-by: Vasco Almeida 
---
 builtin/update-index.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index ba04b19..7a17ce1 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -1127,9 +1127,9 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
break;
case UC_DISABLE:
if (git_config_get_untracked_cache() == 1)
-   warning("core.untrackedCache is set to true; "
-   "remove or change it, if you really want to "
-   "disable the untracked cache");
+   warning(_("core.untrackedCache is set to true; "
+ "remove or change it, if you really want to "
+ "disable the untracked cache"));
remove_untracked_cache(_index);
report(_("Untracked cache disabled"));
break;
@@ -1139,9 +1139,9 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
case UC_ENABLE:
case UC_FORCE:
if (git_config_get_untracked_cache() == 0)
-   warning("core.untrackedCache is set to false; "
-   "remove or change it, if you really want to "
-   "enable the untracked cache");
+   warning(_("core.untrackedCache is set to false; "
+ "remove or change it, if you really want to "
+ "enable the untracked cache"));
add_untracked_cache(_index);
report(_("Untracked cache enabled for '%s'"), 
get_git_work_tree());
break;
-- 
2.7.4



[PATCH v3 10/14] notes: lowercase first word of error messages

2016-09-15 Thread Vasco Almeida
Follow the usual case style.

Update one test to reflect these changes.

Signed-off-by: Vasco Almeida 
---
 builtin/notes.c  | 64 
 t/t3320-notes-merge-worktrees.sh |  2 +-
 2 files changed, 33 insertions(+), 33 deletions(-)

diff --git a/builtin/notes.c b/builtin/notes.c
index 229ad6d..5248a9b 100644
--- a/builtin/notes.c
+++ b/builtin/notes.c
@@ -191,7 +191,7 @@ static void prepare_note_data(const unsigned char *object, 
struct note_data *d,
strbuf_reset(>buf);
 
if (launch_editor(d->edit_path, >buf, NULL)) {
-   die(_("Please supply the note contents using either -m 
or -F option"));
+   die(_("please supply the note contents using either -m 
or -F option"));
}
strbuf_stripspace(>buf, 1);
}
@@ -202,7 +202,7 @@ static void write_note_data(struct note_data *d, unsigned 
char *sha1)
if (write_sha1_file(d->buf.buf, d->buf.len, blob_type, sha1)) {
error(_("unable to write note object"));
if (d->edit_path)
-   error(_("The note contents have been left in %s"),
+   error(_("the note contents have been left in %s"),
d->edit_path);
exit(128);
}
@@ -251,14 +251,14 @@ static int parse_reuse_arg(const struct option *opt, 
const char *arg, int unset)
strbuf_addch(>buf, '\n');
 
if (get_sha1(arg, object))
-   die(_("Failed to resolve '%s' as a valid ref."), arg);
+   die(_("failed to resolve '%s' as a valid ref."), arg);
if (!(buf = read_sha1_file(object, , ))) {
free(buf);
-   die(_("Failed to read object '%s'."), arg);
+   die(_("failed to read object '%s'."), arg);
}
if (type != OBJ_BLOB) {
free(buf);
-   die(_("Cannot read note data from non-blob object '%s'."), arg);
+   die(_("cannot read note data from non-blob object '%s'."), arg);
}
strbuf_add(>buf, buf, len);
free(buf);
@@ -298,13 +298,13 @@ static int notes_copy_from_stdin(int force, const char 
*rewrite_cmd)
 
split = strbuf_split(, ' ');
if (!split[0] || !split[1])
-   die(_("Malformed input line: '%s'."), buf.buf);
+   die(_("malformed input line: '%s'."), buf.buf);
strbuf_rtrim(split[0]);
strbuf_rtrim(split[1]);
if (get_sha1(split[0]->buf, from_obj))
-   die(_("Failed to resolve '%s' as a valid ref."), 
split[0]->buf);
+   die(_("failed to resolve '%s' as a valid ref."), 
split[0]->buf);
if (get_sha1(split[1]->buf, to_obj))
-   die(_("Failed to resolve '%s' as a valid ref."), 
split[1]->buf);
+   die(_("failed to resolve '%s' as a valid ref."), 
split[1]->buf);
 
if (rewrite_cmd)
err = copy_note_for_rewrite(c, from_obj, to_obj);
@@ -313,7 +313,7 @@ static int notes_copy_from_stdin(int force, const char 
*rewrite_cmd)
combine_notes_overwrite);
 
if (err) {
-   error(_("Failed to copy notes from '%s' to '%s'"),
+   error(_("failed to copy notes from '%s' to '%s'"),
  split[0]->buf, split[1]->buf);
ret = 1;
}
@@ -342,7 +342,7 @@ static struct notes_tree *init_notes_check(const char 
*subcommand,
if (!starts_with(ref, "refs/notes/"))
/* TRANSLATORS: the first %s will be replaced by a
   git notes command: 'add', 'merge', 'remove', etc.*/
-   die(_("Refusing to %s notes in %s (outside of refs/notes/)"),
+   die(_("refusing to %s notes in %s (outside of refs/notes/)"),
subcommand, ref);
return t;
 }
@@ -369,13 +369,13 @@ static int list(int argc, const char **argv, const char 
*prefix)
t = init_notes_check("list", 0);
if (argc) {
if (get_sha1(argv[0], object))
-   die(_("Failed to resolve '%s' as a valid ref."), 
argv[0]);
+   die(_("failed to resolve '%s' as a valid ref."), 
argv[0]);
note = get_note(t, object);
if (note) {
puts(sha1_to_hex(note));
retval = 0;
} else
-   retval = error(_("No note found for object %s."),
+   retval = error(_("no note found for object %s."),
   sha1_to_hex(object));
} else
retval = for_each_note(t, 0, list_each_note, NULL);
@@ -424,7 +424,7 @@ static int add(int 

[PATCH v3 05/14] i18n: branch: mark option description for translation

2016-09-15 Thread Vasco Almeida
Signed-off-by: Vasco Almeida 
---
 builtin/branch.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index 7df0543..d5d93a8 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -657,7 +657,7 @@ int cmd_branch(int argc, const char **argv, const char 
*prefix)
OPT_SET_INT( 0, "set-upstream",  , N_("change upstream 
info"),
BRANCH_TRACK_OVERRIDE),
OPT_STRING('u', "set-upstream-to", _upstream, 
N_("upstream"), N_("change the upstream info")),
-   OPT_BOOL(0, "unset-upstream", _upstream, "Unset the 
upstream info"),
+   OPT_BOOL(0, "unset-upstream", _upstream, N_("Unset the 
upstream info")),
OPT__COLOR(_use_color, N_("use colored output")),
OPT_SET_INT('r', "remotes", , N_("act on 
remote-tracking branches"),
FILTER_REFS_REMOTES),
-- 
2.7.4



[PATCH v3 06/14] i18n: config: mark error message for translation

2016-09-15 Thread Vasco Almeida
Signed-off-by: Vasco Almeida 
---
 builtin/config.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/config.c b/builtin/config.c
index 6cbf733..05843a0 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -622,8 +622,8 @@ int cmd_config(int argc, const char **argv, const char 
*prefix)
value = normalize_value(argv[0], argv[1]);
ret = git_config_set_in_file_gently(given_config_source.file, 
argv[0], value);
if (ret == CONFIG_NOTHING_SET)
-   error("cannot overwrite multiple values with a single 
value\n"
-   "   Use a regexp, --add or --replace-all to change 
%s.", argv[0]);
+   error(_("cannot overwrite multiple values with a single 
value\n"
+   "   Use a regexp, --add or --replace-all to change 
%s."), argv[0]);
return ret;
}
else if (actions == ACTION_SET_ALL) {
-- 
2.7.4



[PATCH v3 01/14] i18n: apply: mark plural string for translation

2016-09-15 Thread Vasco Almeida
Mark plural string for translation using Q_().

Signed-off-by: Vasco Almeida 
---
 builtin/apply.c | 10 ++
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index 1a488f9..ef03c74 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -4768,10 +4768,12 @@ static int apply_all_patches(struct apply_state *state,
   state->whitespace_error),
state->whitespace_error);
if (state->applied_after_fixing_ws && state->apply)
-   warning("%d line%s applied after"
-   " fixing whitespace errors.",
-   state->applied_after_fixing_ws,
-   state->applied_after_fixing_ws == 1 ? "" : "s");
+   warning(Q_("%d line applied after"
+  " fixing whitespace errors.",
+  "%d lines applied after"
+  " fixing whitespace errors.",
+  state->applied_after_fixing_ws),
+   state->applied_after_fixing_ws);
else if (state->whitespace_error)
warning(Q_("%d line adds whitespace errors.",
   "%d lines add whitespace errors.",
-- 
2.7.4



[PATCH v3 08/14] i18n: merge-recursive: mark verbose message for translation

2016-09-15 Thread Vasco Almeida
Signed-off-by: Vasco Almeida 
---
 builtin/merge-recursive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index 3b09610..0dd9021 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -74,7 +74,7 @@ int cmd_merge_recursive(int argc, const char **argv, const 
char *prefix)
o.branch2 = better_branch_name(o.branch2);
 
if (o.verbosity >= 3)
-   printf("Merging %s with %s\n", o.branch1, o.branch2);
+   printf(_("Merging %s with %s\n"), o.branch1, o.branch2);
 
failed = merge_recursive_generic(, , , bases_count, bases, 
);
if (failed < 0)
-- 
2.7.4



[PATCH v3 07/14] i18n: merge-recursive: mark error messages for translation

2016-09-15 Thread Vasco Almeida
Lowercase first word of such error messages following the usual style.

Signed-off-by: Vasco Almeida 
---
 builtin/merge-recursive.c | 17 ++---
 1 file changed, 10 insertions(+), 7 deletions(-)

diff --git a/builtin/merge-recursive.c b/builtin/merge-recursive.c
index fd2c455..3b09610 100644
--- a/builtin/merge-recursive.c
+++ b/builtin/merge-recursive.c
@@ -42,30 +42,33 @@ int cmd_merge_recursive(int argc, const char **argv, const 
char *prefix)
if (!arg[2])
break;
if (parse_merge_opt(, arg + 2))
-   die("Unknown option %s", arg);
+   die(_("unknown option %s"), arg);
continue;
}
if (bases_count < ARRAY_SIZE(bases)-1) {
struct object_id *oid = xmalloc(sizeof(struct 
object_id));
if (get_oid(argv[i], oid))
-   die("Could not parse object '%s'", argv[i]);
+   die(_("could not parse object '%s'"), argv[i]);
bases[bases_count++] = oid;
}
else
-   warning("Cannot handle more than %d bases. "
-   "Ignoring %s.",
+   warning(Q_("cannot handle more than %d base. "
+  "Ignoring %s.",
+  "cannot handle more than %d bases. "
+  "Ignoring %s.",
+   (int)ARRAY_SIZE(bases)-1),
(int)ARRAY_SIZE(bases)-1, argv[i]);
}
if (argc - i != 3) /* "--" "" "" */
-   die("Not handling anything other than two heads merge.");
+   die(_("not handling anything other than two heads merge."));
 
o.branch1 = argv[++i];
o.branch2 = argv[++i];
 
if (get_oid(o.branch1, ))
-   die("Could not resolve ref '%s'", o.branch1);
+   die(_("could not resolve ref '%s'"), o.branch1);
if (get_oid(o.branch2, ))
-   die("Could not resolve ref '%s'", o.branch2);
+   die(_("could not resolve ref '%s'"), o.branch2);
 
o.branch1 = better_branch_name(o.branch1);
o.branch2 = better_branch_name(o.branch2);
-- 
2.7.4



[PATCH v3 04/14] i18n: blame: mark error messages for translation

2016-09-15 Thread Vasco Almeida
Mark error messages passed to the die() function for translation.
Change "Cannot" to lowercase, following the usual style.

Reflect the changes in the tests by using test_i18ngrep.

Signed-off-by: Vasco Almeida 
---
 builtin/blame.c   | 18 ++
 t/t8003-blame-corner-cases.sh |  4 ++--
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index a5bbf91..27aea79 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -2601,7 +2601,7 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
 
if (incremental || (output_option & OUTPUT_PORCELAIN)) {
if (show_progress > 0)
-   die("--progress can't be used with --incremental or 
porcelain formats");
+   die(_("--progress can't be used with --incremental or 
porcelain formats"));
show_progress = 0;
} else if (show_progress < 0)
show_progress = isatty(2);
@@ -2727,7 +2727,7 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
sb.commits.compare = compare_commits_by_commit_date;
}
else if (contents_from)
-   die("--contents and --reverse do not blend well.");
+   die(_("--contents and --reverse do not blend well."));
else {
final_commit_name = prepare_initial();
sb.commits.compare = compare_commits_by_reverse_commit_date;
@@ -2747,12 +2747,12 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
add_pending_object(, &(sb.final->object), ":");
}
else if (contents_from)
-   die("Cannot use --contents with final commit object name");
+   die(_("cannot use --contents with final commit object name"));
 
if (reverse && revs.first_parent_only) {
final_commit = find_single_final(sb.revs, NULL);
if (!final_commit)
-   die("--reverse and --first-parent together require 
specified latest commit");
+   die(_("--reverse and --first-parent together require 
specified latest commit"));
}
 
/*
@@ -2779,7 +2779,7 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
}
 
if (oidcmp(>object.oid, >object.oid))
-   die("--reverse --first-parent together require range 
along first-parent chain");
+   die(_("--reverse --first-parent together require range 
along first-parent chain"));
}
 
if (is_null_oid(>object.oid)) {
@@ -2790,7 +2790,7 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
else {
o = get_origin(, sb.final, path);
if (fill_blob_sha1_and_mode(o))
-   die("no such path %s in %s", path, final_commit_name);
+   die(_("no such path %s in %s"), path, 
final_commit_name);
 
if (DIFF_OPT_TST(>diffopt, ALLOW_TEXTCONV) &&
textconv_object(path, o->mode, o->blob_sha1, 1, (char **) 
_buf,
@@ -2801,7 +2801,7 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
  _buf_size);
 
if (!sb.final_buf)
-   die("Cannot read blob %s for path %s",
+   die(_("cannot read blob %s for path %s"),
sha1_to_hex(o->blob_sha1),
path);
}
@@ -2820,7 +2820,9 @@ int cmd_blame(int argc, const char **argv, const char 
*prefix)
, , sb.path))
usage(blame_usage);
if (lno < top || ((lno || bottom) && lno < bottom))
-   die("file %s has only %lu lines", path, lno);
+   die(Q_("file %s has only %lu line",
+  "file %s has only %lu lines",
+  lno), path, lno);
if (bottom < 1)
bottom = 1;
if (top < 1)
diff --git a/t/t8003-blame-corner-cases.sh b/t/t8003-blame-corner-cases.sh
index e48370d..661f9d4 100755
--- a/t/t8003-blame-corner-cases.sh
+++ b/t/t8003-blame-corner-cases.sh
@@ -212,12 +212,12 @@ EOF
 
 test_expect_success 'blame -L with invalid start' '
test_must_fail git blame -L5 tres 2>errors &&
-   grep "has only 2 lines" errors
+   test_i18ngrep "has only 2 lines" errors
 '
 
 test_expect_success 'blame -L with invalid end' '
test_must_fail git blame -L1,5 tres 2>errors &&
-   grep "has only 2 lines" errors
+   test_i18ngrep "has only 2 lines" errors
 '
 
 test_expect_success 'blame parses  part of -L' '
-- 
2.7.4



[PATCH v3 11/14] i18n: receive-pack: mark messages for translation

2016-09-15 Thread Vasco Almeida
Mark messages refuse_unconfigured_deny_msg and
refuse_unconfigured_deny_delete_current_msg for translation.

Signed-off-by: Vasco Almeida 
---
 builtin/receive-pack.c | 58 ++
 1 file changed, 25 insertions(+), 33 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index f1ce05c..896b16f 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -781,47 +781,39 @@ static int is_ref_checked_out(const char *ref)
return !strcmp(head_name, ref);
 }
 
-static char *refuse_unconfigured_deny_msg[] = {
-   "By default, updating the current branch in a non-bare repository",
-   "is denied, because it will make the index and work tree inconsistent",
-   "with what you pushed, and will require 'git reset --hard' to match",
-   "the work tree to HEAD.",
-   "",
-   "You can set 'receive.denyCurrentBranch' configuration variable to",
-   "'ignore' or 'warn' in the remote repository to allow pushing into",
-   "its current branch; however, this is not recommended unless you",
-   "arranged to update its work tree to match what you pushed in some",
-   "other way.",
-   "",
-   "To squelch this message and still keep the default behaviour, set",
-   "'receive.denyCurrentBranch' configuration variable to 'refuse'."
-};
+static char *refuse_unconfigured_deny_msg =
+   N_("By default, updating the current branch in a non-bare repository\n"
+  "is denied, because it will make the index and work tree 
inconsistent\n"
+  "with what you pushed, and will require 'git reset --hard' to 
match\n"
+  "the work tree to HEAD.\n"
+  "\n"
+  "You can set 'receive.denyCurrentBranch' configuration variable to\n"
+  "'ignore' or 'warn' in the remote repository to allow pushing into\n"
+  "its current branch; however, this is not recommended unless you\n"
+  "arranged to update its work tree to match what you pushed in some\n"
+  "other way.\n"
+  "\n"
+  "To squelch this message and still keep the default behaviour, set\n"
+  "'receive.denyCurrentBranch' configuration variable to 'refuse'.");
 
 static void refuse_unconfigured_deny(void)
 {
-   int i;
-   for (i = 0; i < ARRAY_SIZE(refuse_unconfigured_deny_msg); i++)
-   rp_error("%s", refuse_unconfigured_deny_msg[i]);
+   rp_error("%s", _(refuse_unconfigured_deny_msg));
 }
 
-static char *refuse_unconfigured_deny_delete_current_msg[] = {
-   "By default, deleting the current branch is denied, because the next",
-   "'git clone' won't result in any file checked out, causing confusion.",
-   "",
-   "You can set 'receive.denyDeleteCurrent' configuration variable to",
-   "'warn' or 'ignore' in the remote repository to allow deleting the",
-   "current branch, with or without a warning message.",
-   "",
-   "To squelch this message, you can set it to 'refuse'."
-};
+static char *refuse_unconfigured_deny_delete_current_msg =
+   N_("By default, deleting the current branch is denied, because the 
next\n"
+  "'git clone' won't result in any file checked out, causing 
confusion.\n"
+  "\n"
+  "You can set 'receive.denyDeleteCurrent' configuration variable to\n"
+  "'warn' or 'ignore' in the remote repository to allow deleting the\n"
+  "current branch, with or without a warning message.\n"
+  "\n"
+  "To squelch this message, you can set it to 'refuse'.");
 
 static void refuse_unconfigured_deny_delete_current(void)
 {
-   int i;
-   for (i = 0;
-i < ARRAY_SIZE(refuse_unconfigured_deny_delete_current_msg);
-i++)
-   rp_error("%s", refuse_unconfigured_deny_delete_current_msg[i]);
+   rp_error("%s", _(refuse_unconfigured_deny_delete_current_msg));
 }
 
 static int command_singleton_iterator(void *cb_data, unsigned char sha1[20]);
-- 
2.7.4



[PATCH v3 03/14] i18n: apply: mark info messages for translation

2016-09-15 Thread Vasco Almeida
Mark messages printed to stderr for translation.

Signed-off-by: Vasco Almeida 
---
 builtin/apply.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index ef2c084..43ab7c5 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -3525,7 +3525,7 @@ static int try_threeway(struct apply_state *state,
 read_blob_object(, pre_sha1, patch->old_mode))
return error(_("repository lacks the necessary blob to fall 
back on 3-way merge."));
 
-   fprintf(stderr, "Falling back to three-way merge...\n");
+   fprintf(stderr, _("Falling back to three-way merge...\n"));
 
img = strbuf_detach(, );
prepare_image(_image, img, len, 1);
@@ -3555,7 +3555,7 @@ static int try_threeway(struct apply_state *state,
status = three_way_merge(image, patch->new_name,
 pre_sha1, our_sha1, post_sha1);
if (status < 0) {
-   fprintf(stderr, "Failed to fall back on three-way merge...\n");
+   fprintf(stderr, _("Failed to fall back on three-way 
merge...\n"));
return status;
}
 
@@ -3567,9 +3567,9 @@ static int try_threeway(struct apply_state *state,
hashcpy(patch->threeway_stage[0].hash, pre_sha1);
hashcpy(patch->threeway_stage[1].hash, our_sha1);
hashcpy(patch->threeway_stage[2].hash, post_sha1);
-   fprintf(stderr, "Applied patch to '%s' with conflicts.\n", 
patch->new_name);
+   fprintf(stderr, _("Applied patch to '%s' with conflicts.\n"), 
patch->new_name);
} else {
-   fprintf(stderr, "Applied patch to '%s' cleanly.\n", 
patch->new_name);
+   fprintf(stderr, _("Applied patch to '%s' cleanly.\n"), 
patch->new_name);
}
return 0;
 }
-- 
2.7.4



[PATCH v3 02/14] i18n: apply: mark error messages for translation

2016-09-15 Thread Vasco Almeida
Mark error messages passed to the error() and die() functions for
translation.

Signed-off-by: Vasco Almeida 
---
 builtin/apply.c | 46 +++---
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index ef03c74..ef2c084 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -3065,8 +3065,8 @@ static int apply_binary_fragment(struct apply_state 
*state,
/* Binary patch is irreversible without the optional second hunk */
if (state->apply_in_reverse) {
if (!fragment->next)
-   return error("cannot reverse-apply a binary patch "
-"without the reverse hunk to '%s'",
+   return error(_("cannot reverse-apply a binary patch "
+  "without the reverse hunk to '%s'"),
 patch->new_name
 ? patch->new_name : patch->old_name);
fragment = fragment->next;
@@ -3111,8 +3111,8 @@ static int apply_binary(struct apply_state *state,
strlen(patch->new_sha1_prefix) != 40 ||
get_sha1_hex(patch->old_sha1_prefix, sha1) ||
get_sha1_hex(patch->new_sha1_prefix, sha1))
-   return error("cannot apply binary patch to '%s' "
-"without full index line", name);
+   return error(_("cannot apply binary patch to '%s' "
+  "without full index line"), name);
 
if (patch->old_name) {
/*
@@ -3121,16 +3121,16 @@ static int apply_binary(struct apply_state *state,
 */
hash_sha1_file(img->buf, img->len, blob_type, sha1);
if (strcmp(sha1_to_hex(sha1), patch->old_sha1_prefix))
-   return error("the patch applies to '%s' (%s), "
-"which does not match the "
-"current contents.",
+   return error(_("the patch applies to '%s' (%s), "
+  "which does not match the "
+  "current contents."),
 name, sha1_to_hex(sha1));
}
else {
/* Otherwise, the old one must be empty. */
if (img->len)
-   return error("the patch applies to an empty "
-"'%s' but it is not empty", name);
+   return error(_("the patch applies to an empty "
+  "'%s' but it is not empty"), name);
}
 
get_sha1_hex(patch->new_sha1_prefix, sha1);
@@ -3147,8 +3147,8 @@ static int apply_binary(struct apply_state *state,
 
result = read_sha1_file(sha1, , );
if (!result)
-   return error("the necessary postimage %s for "
-"'%s' cannot be read",
+   return error(_("the necessary postimage %s for "
+  "'%s' cannot be read"),
 patch->new_sha1_prefix, name);
clear_image(img);
img->buf = result;
@@ -3523,7 +3523,7 @@ static int try_threeway(struct apply_state *state,
write_sha1_file("", 0, blob_type, pre_sha1);
else if (get_sha1(patch->old_sha1_prefix, pre_sha1) ||
 read_blob_object(, pre_sha1, patch->old_mode))
-   return error("repository lacks the necessary blob to fall back 
on 3-way merge.");
+   return error(_("repository lacks the necessary blob to fall 
back on 3-way merge."));
 
fprintf(stderr, "Falling back to three-way merge...\n");
 
@@ -3541,11 +3541,11 @@ static int try_threeway(struct apply_state *state,
/* our_sha1[] is ours */
if (patch->is_new) {
if (load_current(state, _image, patch))
-   return error("cannot read the current contents of '%s'",
+   return error(_("cannot read the current contents of 
'%s'"),
 patch->new_name);
} else {
if (load_preimage(state, _image, patch, st, ce))
-   return error("cannot read the current contents of '%s'",
+   return error(_("cannot read the current contents of 
'%s'"),
 patch->old_name);
}
write_sha1_file(tmp_image.buf, tmp_image.len, blob_type, our_sha1);
@@ -4020,29 +4020,29 @@ static void build_fake_ancestor(struct patch *list, 
const char *filename)
if (!preimage_sha1_in_gitlink_patch(patch, sha1))
; /* ok, the textual part looks sane */
else
-  

[wishlist?] make submodule commands robust to having non-submodule Subprojects

2016-09-15 Thread Yaroslav Halchenko
NB echoes some questions of mine a few days back on IRC about Subprojects
and submodules 

If e.g. you just 'git add' a subdirectory which is a git repository, git
adds it as a subproject but doesn't create any entry in .gitmodules,
since that is the job done by 'git submodule' and git core itself is
agnostic of those beasts.

But then having these "Subproject"s which aren't registered as submodules
(and I haven't found any other use for them besides being a submodule)
breaks "git submodule" commands, e.g.

$> git submodule
 cc6a09ac06c13cf06b4f4c8b54cda9a535e4e385 ds01 (2.0.0+4)
 0a9f3b66e06a2137311a537b7377c336f1fb30ad ds02 (1.0.0-3-g0a9f3b6)
 9da7e4f4221699915645ac2003298c6aba2db109 ds03 (1.1.0+4)
 fe16cacb5cb9b4d53c50e498298fab182500e147 ds05 (2.0.0+3)
 6898d99ff3ba26880183ed3672a458a7fcde1737 ds06 (2.0.0+2)
 bbd10f634fe87e9d5853df3a891edbdb18cda7f9 ds07 (2.0.0+3)
 138e6730193c0585a69b8baf5b9d7a4439e83ecc ds08 (2.0.0+2)
 ddf3a4cf7ce51a01a664e6faff4b8334b8414b1f ds09 (2.0.1+1)
 7fa73b4df8166dba950c7dc07c3f8cdd50fca313 ds11 (1.0.0-5-g7fa73b4)
fatal: no submodule mapping found in .gitmodules for path 'ds17

which then stops without even looking at other submodules.

I think it would be more logical to make it a 'warning:' not a 'fatal:' and
proceed.

Thank you for consideration
-- 
Yaroslav O. Halchenko
Center for Open Neuroscience http://centerforopenneuroscience.org
Dartmouth College, 419 Moore Hall, Hinman Box 6207, Hanover, NH 03755
Phone: +1 (603) 646-9834   Fax: +1 (603) 646-1419
WWW:   http://www.linkedin.com/in/yarik


[PATCH 4/2] use actual start hashes for submodule push check instead of local refs

2016-09-15 Thread Heiko Voigt
Push knows the actual revision range it is pushing to a remote.
Let's use the start revisions to reduce the number of checked revisions
instead of the locally cached remote refs, which might be out of date.

This actually changes behavior as it can now also properly handle pushes
with URLs. That is also the reason why we change some tests. When
passing the remote as a URL the push is made unconditionally in on-demand
mode. This is wrong since on-demand should only push the submodule if its
SHA1 reference got changed in the superproject history that is getting
pushed. The reason behind that is that the exclusion list for revisions
was built using the parameters "--not --remotes=" to
reduce the number of revisions for the revision walk. In case of a URL
this does not match anything and thus we would always do a full revision
walk down to the root commit.

Signed-off-by: Heiko Voigt 
---
And here is another one which makes the check more correct. For push
--recurse-submodules=check|on-demand with URLs this is also a
performance improvement since at the moment we iterate over all
revisions unconditionally as explained above.

This patch changes the behavior (now it is how it is supposed to be), so
there may be fallout from users complaining that their submodules do not
get pushed with on-demand anymore. This is only for users using URL for
pushing though.

Cheers Heiko

BTW: Peff, thanks for the good diagnosis of all these problems.

 submodule.c| 12 ++--
 submodule.h|  6 +++---
 t/t5531-deep-submodule-push.sh | 24 ++--
 transport.c| 40 ++--
 4 files changed, 57 insertions(+), 25 deletions(-)

diff --git a/submodule.c b/submodule.c
index 28bb74e..1f82974 100644
--- a/submodule.c
+++ b/submodule.c
@@ -639,8 +639,8 @@ static void free_submodules_sha1s(struct string_list 
*submodules)
string_list_clear(submodules, 1);
 }
 
-int find_unpushed_submodules(struct sha1_array *hashes,
-   const char *remotes_name, struct string_list *needs_pushing)
+int find_unpushed_submodules(struct sha1_array *old_hashes,
+   struct sha1_array *new_hashes, struct string_list 
*needs_pushing)
 {
struct rev_info rev;
struct commit *commit;
@@ -652,9 +652,9 @@ int find_unpushed_submodules(struct sha1_array *hashes,
 
/* argv.argv[0] will be ignored by setup_revisions */
argv_array_push(, "find_unpushed_submodules");
-   sha1_array_for_each_unique(hashes, append_hash_to_argv, );
+   sha1_array_for_each_unique(new_hashes, append_hash_to_argv, );
argv_array_push(, "--not");
-   argv_array_pushf(, "--remotes=%s", remotes_name);
+   sha1_array_for_each_unique(old_hashes, append_hash_to_argv, );
 
setup_revisions(argv.argc, argv.argv, , NULL);
if (prepare_revision_walk())
@@ -700,12 +700,12 @@ static int push_submodule(const char *path)
return 1;
 }
 
-int push_unpushed_submodules(struct sha1_array *hashes, const char 
*remotes_name)
+int push_unpushed_submodules(struct sha1_array *old_hashes, struct sha1_array 
*new_hashes)
 {
int i, ret = 1;
struct string_list needs_pushing = STRING_LIST_INIT_DUP;
 
-   if (!find_unpushed_submodules(hashes, remotes_name, _pushing))
+   if (!find_unpushed_submodules(old_hashes, new_hashes, _pushing))
return 1;
 
for (i = 0; i < needs_pushing.nr; i++) {
diff --git a/submodule.h b/submodule.h
index 065b2f0..55c6c91 100644
--- a/submodule.h
+++ b/submodule.h
@@ -63,9 +63,9 @@ int submodule_uses_gitfile(const char *path);
 int ok_to_remove_submodule(const char *path);
 int merge_submodule(unsigned char result[20], const char *path, const unsigned 
char base[20],
const unsigned char a[20], const unsigned char b[20], int 
search);
-int find_unpushed_submodules(struct sha1_array *hashes, const char 
*remotes_name,
-   struct string_list *needs_pushing);
-int push_unpushed_submodules(struct sha1_array *hashes, const char 
*remotes_name);
+int find_unpushed_submodules(struct sha1_array *old_hashes,
+   struct sha1_array *new_hashes, struct string_list 
*needs_pushing);
+int push_unpushed_submodules(struct sha1_array *old_hashes, struct sha1_array 
*new_hashes);
 void connect_work_tree_and_git_dir(const char *work_tree, const char *git_dir);
 int parallel_submodules(void);
 
diff --git a/t/t5531-deep-submodule-push.sh b/t/t5531-deep-submodule-push.sh
index 198ce84..a5b8ef6 100755
--- a/t/t5531-deep-submodule-push.sh
+++ b/t/t5531-deep-submodule-push.sh
@@ -154,6 +154,8 @@ test_expect_success 'push recurse-submodules on command 
line overrides config' '
git fetch ../pub.git &&
git diff --quiet FETCH_HEAD master &&
(cd gar/bage && git diff --quiet origin/master master^) &&
+   # Since this push was executed reset to 

[PATCH 3/2] batch check whether submodule needs pushing into one call

2016-09-15 Thread Heiko Voigt
We run a command for each sha1 change in a submodule. This is
unnecessary since we can simply batch all sha1's we want to check into
one command. Let's do it so we can speed up the check when many submodule
changes are in need of checking.

Signed-off-by: Heiko Voigt 
---
On Wed, Sep 14, 2016 at 03:30:53PM -0700, Junio C Hamano wrote:
> Heiko Voigt  writes:
> 
> > Sorry about the late reply. I was not able to process emails until now.
> > Here are two patches that should help to improve the situation and batch
> > up some processing. This one is for repositories with submodules, so
> > that they do not iterate over the same submodule twice with the same
> > hash.
> >
> > The second one will be the one people without submodules are interested
> > in.
> 
> Thanks.  Will take a look at later as I'm already deep in today's
> integration cycle.  Very much appreciated.

No problem. While I am at it: here are another two patches that
should make the life of submodule users easier (push times of big pushes).

In numbers, with the qt5[1] superproject and all submodules initialized,
the same --mirror test as before with the git repository:

# Without patch:

book:qt5 hvoigt (5.6)$
rm -rf ~/Downloads/git-test && mkdir ~/Downloads/git-test &&
   (cd ~/Downloads/git-test && git init) &&
   time git push --mirror --recurse-submodules=check ~/Downloads/git-test

real    4m0.881s
user    3m30.139s
sys     0m22.329s

Without --recurse-submodules=check

real    0m0.251s
user    0m0.218s
sys     0m0.082s


# With patch:

real    0m1.167s
user    0m0.846s
sys     0m0.262s

real    0m1.110s
user    0m0.815s
sys     0m0.247s

real    0m1.111s
user    0m0.818s
sys     0m0.251s

Without --recurse-submodules=check

real    0m0.294s
user    0m0.221s
sys     0m0.104s

real    0m0.248s
user    0m0.216s
sys     0m0.080s

real    0m0.247s
user    0m0.212s
sys     0m0.082s

[1] git://code.qt.io/qt/qt5.git

 submodule.c | 75 -
 1 file changed, 39 insertions(+), 36 deletions(-)

diff --git a/submodule.c b/submodule.c
index a15e346..28bb74e 100644
--- a/submodule.c
+++ b/submodule.c
@@ -522,27 +522,54 @@ static int has_remote(const char *refname, const struct 
object_id *oid,
return 1;
 }
 
-static int submodule_needs_pushing(const char *path, const unsigned char 
sha1[20])
+static void append_hash_to_argv(const unsigned char sha1[20], void *data)
 {
-   if (add_submodule_odb(path) || !lookup_commit_reference(sha1))
+   struct argv_array *argv = (struct argv_array *) data;
+   argv_array_push(argv, sha1_to_hex(sha1));
+}
+
+static void check_has_hash(const unsigned char sha1[20], void *data)
+{
+   int *has_hash = (int *) data;
+
+   if (!lookup_commit_reference(sha1))
+   *has_hash = 0;
+}
+
+static int submodule_has_hashes(const char *path, struct sha1_array *hashes)
+{
+   int has_hash = 1;
+
+   if (add_submodule_odb(path))
+   return 0;
+
+   sha1_array_for_each_unique(hashes, check_has_hash, _hash);
+   return has_hash;
+}
+
+static int submodule_needs_pushing(const char *path, struct sha1_array *hashes)
+{
+   if (!submodule_has_hashes(path, hashes))
return 0;
 
if (for_each_remote_ref_submodule(path, has_remote, NULL) > 0) {
struct child_process cp = CHILD_PROCESS_INIT;
-   const char *argv[] = {"rev-list", NULL, "--not", "--remotes", 
"-n", "1" , NULL};
+
+   argv_array_push(, "rev-list");
+   sha1_array_for_each_unique(hashes, append_hash_to_argv, 
);
+   argv_array_pushl(, "--not", "--remotes", "-n", "1" , 
NULL);
+
struct strbuf buf = STRBUF_INIT;
int needs_pushing = 0;
 
-   argv[1] = sha1_to_hex(sha1);
-   cp.argv = argv;
prepare_submodule_repo_env(_array);
cp.git_cmd = 1;
cp.no_stdin = 1;
cp.out = -1;
cp.dir = path;
if (start_command())
-   die("Could not run 'git rev-list %s --not --remotes -n 
1' command in submodule %s",
-   sha1_to_hex(sha1), path);
+   die("Could not run 'git rev-list  --not 
--remotes -n 1' command in submodule %s",
+   path);
if (strbuf_read(, cp.out, 41))
needs_pushing = 1;
finish_command();
@@ -601,21 +628,6 @@ static void find_unpushed_submodule_commits(struct commit 
*commit,
diff_tree_combined_merge(commit, 1, );
 }
 
-struct collect_submodule_from_sha1s_data {
-   char *submodule_path;
-   struct string_list *needs_pushing;
-};
-
-static void collect_submodules_from_sha1s(const unsigned char sha1[20],
-   void *data)
-{
-   struct collect_submodule_from_sha1s_data *me =
-   (struct 

Re: [RFC] extending pathspec support to submodules

2016-09-15 Thread Heiko Voigt
Hi,

On Wed, Sep 14, 2016 at 04:57:53PM -0700, Brandon Williams wrote:
> ---
> I've been trying to think through how we could potentially add pathspec 
> support
> for --recurse-submodule options (for builtins like ls-files or grep down the
> line).  This is something that could be useful if the user supply's a pathspec
> that could match to a file in a submodule.  We could match the submodule to 
> the
> pathspec and then fork the process to recursively run the command on the
> submodule which can be passed a modified pathspec.
> 
> For example with a pathspec 'sub/dir/a', where sub is a submodule in the root
> directory of the supermodule's repo, we could match 'sub' to that spec and 
> then
> recursively call the git command with a pathspec of 'dir/a'.  The child 
> process
> would then have the responsibility of matching 'dir/a' to files in its repo.
> 
> Does this seem like a reasonable feature to add? And if so are how is my
> initial approach at solving the problem?

The problem when you do that is that the child is not aware that it is
actually run as a submodule process. E.g.

   git grep --recurse-submodules foobar -- sub/dir/a

would report back matches in 'dir/a' instead of 'sub/dir/a'. From the
user's perspective this will be confusing since she started the grep in
the superproject.

So what I would suggest is that we make the children aware of the fact
that they are called from a superproject by passing a prefix when
calling it. E.g. like this

   git grep --submodule-prefix=sub foobar -- sub/dir/a

Whether we pass the full or stripped path is now a matter of taste or
ease of implementation, since we could either prepend or strip it in
the child. But the important difference is that we have information
about the original path.

Another approach could, of course, be to omit passing the prefix and
parse the output of the child and try to substitute paths, but that
seems quite error prone to me.

Cheers Heiko


Re: [RFC 0/1] mailinfo: de-quote quoted-pair in header fields

2016-09-15 Thread Jeff King
On Thu, Sep 15, 2016 at 07:15:33AM +0200, Kevin Daudt wrote:

> > > Another small thing I am not sure about is if the \ quoting can hide
> > > an embedded newline in the author name.  Would we end up turning
> > > 
> > >   From: "Jeff \
> > > King" 
> > > 
> > > or somesuch into
> > > 
> > >   Author: Jeff
> > > King
> > > Email: p...@peff.net
> > > 
> > > ;-)
> > 
> > Heh, yeah. That is another reason to clean up and sanitize as much as
> > possible before stuffing it into another text format that will be
> > parsed.
> 
> A quoted string cannot contain newlines according to the RFC, so I think
> we don't need to care about that.

I wondered how we handled something like:

  From: =?UTF-8?q?J=0Aff=20King?= 

which sticks a newline into the middle of the buffer. We do decode it
that way, but eventually call cleanup_space() which converts a run of 1
or more isspace() characters into a single space (0x20). So you end up
with:

  Author: J ff King

which is probably reasonable.
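
To make that squashing concrete, here is a small standalone model of
the behaviour described above (this is not mailinfo's actual
cleanup_space(), just a sketch of the "run of whitespace becomes one
space" rule applied to the decoded header):

#include <ctype.h>
#include <stdio.h>

/* collapse every run of isspace() characters into a single 0x20, in place */
static void squash_space(char *buf)
{
	char *out = buf;

	while (*buf) {
		if (isspace((unsigned char)*buf)) {
			*out++ = ' ';
			while (*buf && isspace((unsigned char)*buf))
				buf++;
		} else {
			*out++ = *buf++;
		}
	}
	*out = '\0';
}

int main(void)
{
	char name[] = "J\nff King";	/* decoded =?UTF-8?q?J=0Aff=20King?= */

	squash_space(name);
	printf("Author: %s\n", name);	/* prints "Author: J ff King" */
	return 0;
}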

> Makes sense, the current iteration of my patch already strips exterior
> quotes, no matter where they happen.
> 
> I will send a patch soon.

Great. Thanks for working on this.

-Peff


Re: [PATCH] object: measure time needed for resolving hash collisions

2016-09-15 Thread Jeff King
On Wed, Sep 14, 2016 at 11:47:01PM -0700, Jeff King wrote:

> > first = i = hash_obj(sha1, obj_hash_size);
> > +   clock_gettime(CLOCK_MONOTONIC, );
> > while ((obj = obj_hash[i]) != NULL) {
> > if (!hashcmp(sha1, obj->oid.hash))
> > break;
> > @@ -98,6 +131,9 @@ struct object *lookup_object(const unsigned char *sha1)
> > if (i == obj_hash_size)
> > i = 0;
> > }
> > +   clock_gettime(CLOCK_MONOTONIC, );
> > +   diff(, , _diff);
> > +   add_time_to(, _diff);
> > if (obj && i != first) {
> 
> I don't think this is actually measuring the time spent on collisions.
> It's measuring the time we spend in hashcmp(), but that includes the
> non-collision case where we find it in the first hashcmp.
> 
> Measuring _just_ the collisions is more like the patch below. In my
> measurements it's more like 30ms, compared to 10s for all of the
> hashcmps.

I forgot to send the patch. Which is just as well, because I realized it
was totally buggy.

Here's a patch that I believe is correct (it counts only times when we
move past the first hash slot). It spends about 280ms. Which is still a
lot less than 10s, so I think my other comments stand.

---
diff --git a/object.c b/object.c
index e9e73e0..7a74a1d 100644
--- a/object.c
+++ b/object.c
@@ -123,17 +123,20 @@ struct object *lookup_object(const unsigned char *sha1)
return NULL;
 
first = i = hash_obj(sha1, obj_hash_size);
-   clock_gettime(CLOCK_MONOTONIC, );
while ((obj = obj_hash[i]) != NULL) {
if (!hashcmp(sha1, obj->oid.hash))
break;
+   if (first == i)
+   clock_gettime(CLOCK_MONOTONIC, );
i++;
if (i == obj_hash_size)
i = 0;
}
-   clock_gettime(CLOCK_MONOTONIC, );
-   diff(, , _diff);
-   add_time_to(, _diff);
+   if (i != first) {
+   clock_gettime(CLOCK_MONOTONIC, );
+   diff(, , _diff);
+   add_time_to(, _diff);
+   }
if (obj && i != first) {
/*
 * Move object to where we started to look for it so


Re: [PATCH] object: measure time needed for resolving hash collisions

2016-09-15 Thread Jeff King
On Wed, Sep 14, 2016 at 07:01:41PM -0700, Stefan Beller wrote:

>  According to Jeff, sending patches that don't get accepted is the new hype!

It is what all the cool kids are doing. Unfortunately, it does not save
you from nitpicky reviews...;)

>   first = i = hash_obj(sha1, obj_hash_size);
> + clock_gettime(CLOCK_MONOTONIC, );
>   while ((obj = obj_hash[i]) != NULL) {
>   if (!hashcmp(sha1, obj->oid.hash))
>   break;
> @@ -98,6 +131,9 @@ struct object *lookup_object(const unsigned char *sha1)
>   if (i == obj_hash_size)
>   i = 0;
>   }
> + clock_gettime(CLOCK_MONOTONIC, );
> + diff(, , _diff);
> + add_time_to(, _diff);
>   if (obj && i != first) {

I don't think this is actually measuring the time spent on collisions.
It's measuring the time we spend in hashcmp(), but that includes the
non-collision case where we find it in the first hashcmp.

Measuring _just_ the collisions is more like the patch below. In my
measurements it's more like 30ms, compared to 10s for all of the
hashcmps.

So we really aren't dealing with collisions, but rather just verifying
that our hash landed at the right spot. And _any_ data structure is
going to have to do that. If you want to make it faster, I'd try
optimizing hashcmp (and you can see why the critbit tree was slower; if
we spend so much time just on hashcmp() to make sure we're at the right
key, then making that slower with a bunch of branching is not going to
help).

I notice we still open-code hashcmp. I get a slight speedup by switching
it to memcmp(). About 2.5%, which is similar to what I showed in

  http://public-inbox.org/git/20130318073229.ga5...@sigill.intra.peff.net/

a few years ago (though it's more pronounced as a portion of the whole
now, because we've optimized some of the other bits).

The main driver there was memcmp() improvements that went into glibc
2.13 several years ago. It might be time to start assuming that memcmp()
beats our open-coded loop.

It may also be possible to really micro-optimize it on some platforms,
because we know the size in advance (I'd kind of expect the compiler to
do that, but if we're ending up in glibc memcmp then it sounds like it
is not the case).
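
For reference, the two variants being compared look roughly like this
(a standalone sketch, not the git source; HASH_RAWSZ stands in for the
20-byte SHA-1 length):

#include <stdio.h>
#include <string.h>

#define HASH_RAWSZ 20

/* roughly the open-coded loop that hashcmp() has used */
static inline int hashcmp_loop(const unsigned char *a, const unsigned char *b)
{
	int i;

	for (i = 0; i < HASH_RAWSZ; i++, a++, b++) {
		if (*a != *b)
			return *a - *b;
	}
	return 0;
}

/* memcmp() variant; the fixed, known length lets libc/compiler optimize */
static inline int hashcmp_memcmp(const unsigned char *a, const unsigned char *b)
{
	return memcmp(a, b, HASH_RAWSZ);
}

int main(void)
{
	unsigned char a[HASH_RAWSZ] = { 0 }, b[HASH_RAWSZ] = { 0 };

	b[HASH_RAWSZ - 1] = 1;	/* differ only in the last byte */
	printf("loop: %d, memcmp: %d\n",
	       hashcmp_loop(a, b), hashcmp_memcmp(a, b));
	return 0;
}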

-Peff


Re: [PATCH 2/2] use zstd zlib wrapper

2016-09-15 Thread Jeff King
On Wed, Sep 14, 2016 at 06:22:17PM -0700, Stefan Beller wrote:

> > Disappointingly, the answer seems to be "no".
> 
> After having looked at the data, I disagree with the conclusion.
> And for that I think we need to reason about the frequency
> of the operations happening.

I definitely agree that reads outnumber writes, and it's OK to have an
asymmetric tradeoff between the two. zstd5 isn't _too_ bad in that
respect. I guess I was just disappointed that the pack size was still
bigger, as I was really hoping to see some speed tradeoff without
getting a worse pack.

The other thing to weigh against is "if we were designing it today"
versus "is it worth the compatibility headaches now". A 6% improvement
in "rev-list --objects" is not that amazing for a data format change.
Bitmaps were an _easier_ data format change and are more like a 99%
speedup.

They do not apply to every operation, but we may be able to do similar
space/time tradeoffs that are easier to handle in terms of backwards
compatibility, and which yield bigger results.

-Peff