Re: [PATCH v6 1/2] http.c: implement the GIT_TRACE_CURL environment variable

2016-05-25 Thread Jeff King
On Fri, May 20, 2016 at 01:39:06PM -0700, Junio C Hamano wrote:

> > +{
> > +   size_t i;
> > +   struct strbuf out = STRBUF_INIT;
> > +   unsigned int width = 80;
> 
> In a few places Git limits the width of the output, like using function
> name in hunk header lines and drawing diffstat in "diff --stat", we
> do default to limit the total width to 80 display columns.
> 
> Given that this routine prefixes each and every line with a short
> heading like "=> Send header: " that costs at around 15-20 columns,
> and the loop we see below only counts the true payload without
> counting the heading, you would probably want to reduce this
> somewhat so that the whole thing would fit within 80 columns.

I think that may be a losing battle. Remember that we are getting the
usual trace header on top of this, which I think leaves only about 20
characters for actual data on each line.

I kind of doubt people will manually read the body data. In my
experience, debugging git-over-http you want either:

  1. Just the request/response headers to see what was said.

  2. A complete dump of each body with no other formatting (e.g., so you
 can replay it with curl).

This trace output gives you (1). You can in theory generate (2) from it
with a perl snippet, but the non-printing characters have been
irreversibly transformed.

So I dunno. I do not mind having the body stuff there for completeness,
but I doubt I'd use it myself. And I don't care much either way about
how long its lines are.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 1/2] http.c: implement the GIT_TRACE_CURL environment variable

2016-05-20 Thread Junio C Hamano
Elia Pinto  writes:

> diff --git a/http.c b/http.c
> index df6dd01..ba32bac 100644
> --- a/http.c
> +++ b/http.c
> @@ -11,6 +11,7 @@
>  #include "gettext.h"
>  #include "transport.h"
>  
> +static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
>  #if LIBCURL_VERSION_NUM >= 0x070a08
>  long int git_curl_ipresolve = CURL_IPRESOLVE_WHATEVER;
>  #else
> @@ -477,6 +478,125 @@ static void set_curl_keepalive(CURL *c)
>  }
>  #endif
>  
> +static void curl_dump_header(const char *text, unsigned char *ptr, size_t 
> size, int nopriv_header)
> +{
> + struct strbuf out = STRBUF_INIT;
> + const char *header;
> + struct strbuf **header_list, **ptr_list;
> +
> + strbuf_addf(, "%s, %10.10ld bytes (0x%8.8lx)\n",
> + text, (long)size, (long)size);
> + trace_strbuf(_curl, );
> + strbuf_reset();
> + strbuf_add(,ptr,size);
> + header_list = strbuf_split_max(, '\n', 0);
> +
> + for (ptr_list = header_list; *ptr_list; ptr_list++) {
> + /*
> +  * if we are called with nopriv_header substitute a dummy value
> +  * in the Authorization or Proxy-Authorization http header if
> +  * present.
> +  */
> + if (nopriv_header &&
> + (skip_prefix((*ptr_list)->buf , "Authorization:", )
> + || skip_prefix((*ptr_list)->buf , "Proxy-Authorization:", 
> ))) { 
> + /* The first token is the type, which is OK to log */
> + while (isspace(*header))
> + header++;
> + while (*header && !isspace(*header))
> + header++;
> + /* Everything else is opaque and possibly sensitive */
> + strbuf_setlen((*ptr_list),  header - (*ptr_list)->buf );
> + strbuf_addstr((*ptr_list), " ");
> + }
> + strbuf_insert((*ptr_list), 0, text, strlen(text));
> + strbuf_insert((*ptr_list), strlen(text), ": ", 2);
> + strbuf_rtrim((*ptr_list));
> + strbuf_addch((*ptr_list), '\n');
> + trace_strbuf(_curl, (*ptr_list));

This funny indentation makes me wonder why you didn't make it into a
helper function.  If it would require too many parameters, I could
understand the aversion, as it would end up looking like:

for (header = headers; *header; header++) {
if (hide_sensitive_header)
redact_sensitive_header(header, , , ,
, ,
, , );
strbuf_insert(*header, 0, text, strlen(text));
strbuf_insert(*header, strlen(text), ": ", 2);
...
trace_strbuf(_curl, *header);
}

but I think redact_sensitive_header() helper would only need to take
a single strbuf, from the look of your actual implementation.

Yes, I am also suggesting to rename header_list and ptr_list to
headers and header, respectively.  header_list may be an OK name (as
it is "a list, each element of which is a header"), but ptr_list is
a poor name--"a pointer into a list" is a much less interesting thing
for the reader of the code to learn from the name than it represents
"one header".

And your "const char *header" does not have to be here (it would be
an implementation detail of redact_sensitive_header() function), so
such a renaming would not cause conflicts in variable names.

> + }
> + strbuf_list_free(header_list);
> + strbuf_release();
> +}
> +static void curl_dump_data(const char *text, unsigned char *ptr, size_t size)

A blank line, between the end of previous function body and the
begining of this function, is missing.

> +{
> + size_t i;
> + struct strbuf out = STRBUF_INIT;
> + unsigned int width = 80;

In a few places Git limits the width of the output, like using function
name in hunk header lines and drawing diffstat in "diff --stat", we
do default to limit the total width to 80 display columns.

Given that this routine prefixes each and every line with a short
heading like "=> Send header: " that costs at around 15-20 columns,
and the loop we see below only counts the true payload without
counting the heading, you would probably want to reduce this
somewhat so that the whole thing would fit within 80 columns.

> + strbuf_addf(, "%s, %10.10ld bytes (0x%8.8lx)\n",
> + text, (long)size, (long)size);
> + trace_strbuf(_curl, );
> +
> + for (i = 0; i < size; i += width) {
> + size_t w;
> +
> + strbuf_reset();
> + strbuf_addf(, "%s: ", text);
> + for (w = 0; (w < width) && (i + w < size); w++) {
> + strbuf_addch(, (ptr[i + w] >= 0x20)
> + && (ptr[i + w] < 0x80) ? ptr[i + w] : '.');

Please think twice to make sure a long expression that has to span
multiple lines is cut at a sensible place.  This cuts at the worst
place where the syntactic element that binds strongest sits in the
expression.

The weakest 

[PATCH v6 1/2] http.c: implement the GIT_TRACE_CURL environment variable

2016-05-20 Thread Elia Pinto
Implement the GIT_TRACE_CURL environment variable to allow a
greater degree of detail of GIT_CURL_VERBOSE, in particular
the complete transport header and all the data payload exchanged.
It might be useful if a particular situation could require a more
thorough debugging analysis. Document the new GIT_TRACE_CURL
environment variable.

Helped-by: Torsten Bögershausen 
Helped-by: Ramsay Jones 
Helped-by: Junio C Hamano 
Helped-by: Eric Sunshine 
Helped-by: Jeff King 
Signed-off-by: Elia Pinto 
---
 Documentation/git.txt |   8 
 http.c| 124 +-
 http.h|   2 +
 3 files changed, 132 insertions(+), 2 deletions(-)

diff --git a/Documentation/git.txt b/Documentation/git.txt
index dd6dbf7..a46a356 100644
--- a/Documentation/git.txt
+++ b/Documentation/git.txt
@@ -1077,6 +1077,14 @@ of clones and fetches.
cloning of shallow repositories.
See 'GIT_TRACE' for available trace output options.
 
+'GIT_TRACE_CURL'::
+   Enables a curl full trace dump of all incoming and outgoing data,
+   including descriptive information, of the git transport protocol.
+   This is similar to doing curl --trace-ascii on the command line.
+   This option overrides setting the GIT_CURL_VERBOSE environment
+   variable.
+   See 'GIT_TRACE' for available trace output options.
+
 'GIT_LITERAL_PATHSPECS'::
Setting this variable to `1` will cause Git to treat all
pathspecs literally, rather than as glob patterns. For example,
diff --git a/http.c b/http.c
index df6dd01..ba32bac 100644
--- a/http.c
+++ b/http.c
@@ -11,6 +11,7 @@
 #include "gettext.h"
 #include "transport.h"
 
+static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
 #if LIBCURL_VERSION_NUM >= 0x070a08
 long int git_curl_ipresolve = CURL_IPRESOLVE_WHATEVER;
 #else
@@ -477,6 +478,125 @@ static void set_curl_keepalive(CURL *c)
 }
 #endif
 
+static void curl_dump_header(const char *text, unsigned char *ptr, size_t 
size, int nopriv_header)
+{
+   struct strbuf out = STRBUF_INIT;
+   const char *header;
+   struct strbuf **header_list, **ptr_list;
+
+   strbuf_addf(, "%s, %10.10ld bytes (0x%8.8lx)\n",
+   text, (long)size, (long)size);
+   trace_strbuf(_curl, );
+   strbuf_reset();
+   strbuf_add(,ptr,size);
+   header_list = strbuf_split_max(, '\n', 0);
+
+   for (ptr_list = header_list; *ptr_list; ptr_list++) {
+   /*
+* if we are called with nopriv_header substitute a dummy value
+* in the Authorization or Proxy-Authorization http header if
+* present.
+*/
+   if (nopriv_header &&
+   (skip_prefix((*ptr_list)->buf , "Authorization:", )
+   || skip_prefix((*ptr_list)->buf , "Proxy-Authorization:", 
))) { 
+   /* The first token is the type, which is OK to log */
+   while (isspace(*header))
+   header++;
+   while (*header && !isspace(*header))
+   header++;
+   /* Everything else is opaque and possibly sensitive */
+   strbuf_setlen((*ptr_list),  header - (*ptr_list)->buf );
+   strbuf_addstr((*ptr_list), " ");
+   }
+   strbuf_insert((*ptr_list), 0, text, strlen(text));
+   strbuf_insert((*ptr_list), strlen(text), ": ", 2);
+   strbuf_rtrim((*ptr_list));
+   strbuf_addch((*ptr_list), '\n');
+   trace_strbuf(_curl, (*ptr_list));
+   }
+   strbuf_list_free(header_list);
+   strbuf_release();
+}
+static void curl_dump_data(const char *text, unsigned char *ptr, size_t size)
+{
+   size_t i;
+   struct strbuf out = STRBUF_INIT;
+   unsigned int width = 80;
+
+   strbuf_addf(, "%s, %10.10ld bytes (0x%8.8lx)\n",
+   text, (long)size, (long)size);
+   trace_strbuf(_curl, );
+
+   for (i = 0; i < size; i += width) {
+   size_t w;
+
+   strbuf_reset();
+   strbuf_addf(, "%s: ", text);
+   for (w = 0; (w < width) && (i + w < size); w++) {
+   strbuf_addch(, (ptr[i + w] >= 0x20)
+   && (ptr[i + w] < 0x80) ? ptr[i + w] : '.');
+   }
+   strbuf_addch(, '\n');
+   trace_strbuf(_curl, );
+   }
+   strbuf_release();
+}
+
+static int curl_trace(CURL *handle, curl_infotype type, char *data, size_t 
size, void *userp)
+{
+   const char *text;
+   int nopriv_header = 0;  /*
+* default: there are no sensitive data
+* in the trace to be skipped
+*/
+
+   switch (type) {
+   case CURLINFO_TEXT:
+   trace_printf_key(_curl, "== Info: %s", data);
+   default:/* we ignore