Re: [PATCH v2 4/8] Specify explicitly where we parse timestamps

2017-04-02 Thread Torsten Bögershausen

[]

--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -319,6 +319,8 @@ extern char *gitdirname(char *);
 #define PRIo32 "o"
 #endif

+#define parse_timestamp strtoul
+


Would
#define parse_timestamp(a,b,c)  strtoul((a),(b),(c))
be more strict ?



Re: [PATCH v2 1/8] ref-filter: avoid using `unsigned long` for catch-all data type

2017-04-02 Thread Torsten Bögershausen



On 02/04/17 21:06, Johannes Schindelin wrote:

In its `atom_value` struct, the ref-filter source code wants to store
different values in a field called `ul` (for `unsigned long`), e.g.
timestamps.

However, as we are about to switch the data type of timestamps away from
`unsigned long` (because it may be 32-bit even when `time_t` is 64-bit),
that data type is not large enough.

Simply change that field to use `uintmax_t` instead.

This patch is a bit larger than the mere change of the data type
because the field's name was tied to its data type, which has been fixed
at the same time.

Signed-off-by: Johannes Schindelin 
---
 ref-filter.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 9c82b5b9d63..8538328fc7f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -351,7 +351,7 @@ struct ref_formatting_state {
 struct atom_value {
const char *s;
void (*handler)(struct atom_value *atomv, struct ref_formatting_state 
*state);
-   unsigned long ul; /* used for sorting when not FIELD_STR */
+   uintmax_t value; /* used for sorting when not FIELD_STR */
struct used_atom *atom;
 };

@@ -723,7 +723,7 @@ static void grab_common_values(struct atom_value *val, int 
deref, struct object
if (!strcmp(name, "objecttype"))
v->s = typename(obj->type);
else if (!strcmp(name, "objectsize")) {
-   v->ul = sz;
+   v->value = sz;
v->s = xstrfmt("%lu", sz);
}
else if (deref)
@@ -770,8 +770,8 @@ static void grab_commit_values(struct atom_value *val, int 
deref, struct object
v->s = xstrdup(oid_to_hex(>tree->object.oid));
}
else if (!strcmp(name, "numparent")) {
-   v->ul = commit_list_count(commit->parents);
-   v->s = xstrfmt("%lu", v->ul);
+   v->value = commit_list_count(commit->parents);
+   v->s = xstrfmt("%lu", (unsigned long)v->value);


If we want to get rid of "%lu" at some day, we can do like this:
v->s = xstrfmt("%" PRIuMAX, v->value);
Or, to make clear that under all circumstances an unsigned long is big enough to
hold the counter, for readers in the future, use something like this:
v->s = xstrfmt("%lu", (xulong_t)v->value);

(this is more a reminder to myself, to send such a patch )


chmod: changing permissions of `blib/arch/auto/Git': Operation not permitted

2017-04-02 Thread Jeffrey Walton
This is kind of unusual. I'm seeing it under Debian 7 on a ci20 mipsel
dev-board when building/installing Git 2.12.2:

...
317 translated messages.
GEN gitk-wish
307 translated messages.
SUBDIR perl
chmod: changing permissions of `blib/lib': Operation not permitted
chmod: changing permissions of `blib/arch': Operation not
permittedmake[2]: *** [blib/lib/.exists] Error 1
make[2]: *** Waiting for unfinished jobs

make[2]: *** [blib/arch/.exists] Error 1
chmod: changing permissions of `blib/arch/auto/Git': Operation not permitted
make[2]: *** [blib/arch/auto/Git/.exists] Error 1
chmod: changing permissions of `blib/lib/auto/Git': Operation not permitted
make[2]: *** [blib/lib/auto/Git/.exists] Error 1
make[1]: *** [all] Error 2
make: *** [all] Error 2
SUBDIR git-gui
SUBDIR gitk-git
...


When I check permissions:

$ ls -Al git-2.12.2/perl/blib/bin/
total 0
-rw-r--r-- 1 root root 0 Apr  3 05:06 .exists

It appears one of the Git processes is creating the directory
structure git-2.12.2/perl/... It also appears to be doing it under
sudo/root since I build under my account.

Jeff


Re: [PATCH v7 5/5] remove_subtree(): reimplement using iterators

2017-04-02 Thread Michael Haggerty
On 04/02/2017 10:03 PM, Daniel Ferreira wrote:
> From: Daniel Ferreira 
> 
> Use dir_iterator to traverse through remove_subtree()'s directory tree,
> avoiding the need for recursive calls to readdir(). Simplify
> remove_subtree()'s code.
> 
> A conversion similar in purpose was previously done at 46d092a
> ("for_each_reflog(): reimplement using iterators", 2016-05-21).
> 
> Signed-off-by: Daniel Ferreira 

This patch has a different author email than the others. If possible,
please choose one email address that you will use for your commits, and
try to be consistent.

> ---
>  entry.c | 38 --
>  1 file changed, 12 insertions(+), 26 deletions(-)
> 
> diff --git a/entry.c b/entry.c
> index d2b512d..bd06f41 100644
> --- a/entry.c
> +++ b/entry.c
> @@ -3,6 +3,8 @@
>  #include "dir.h"
>  #include "streaming.h"
>  #include "submodule.h"
> +#include "iterator.h"
> +#include "dir-iterator.h"
>  
>  static void create_directories(const char *path, int path_len,
>  const struct checkout *state)
> @@ -45,33 +47,17 @@ static void create_directories(const char *path, int 
> path_len,
>   free(buf);
>  }
>  
> -static void remove_subtree(struct strbuf *path)
> +static void remove_subtree(const char *path)
>  {
> - DIR *dir = opendir(path->buf);
> - struct dirent *de;
> - int origlen = path->len;
> -
> - if (!dir)
> - die_errno("cannot opendir '%s'", path->buf);
> - while ((de = readdir(dir)) != NULL) {
> - struct stat st;
> -
> - if (is_dot_or_dotdot(de->d_name))
> - continue;
> -
> - strbuf_addch(path, '/');
> - strbuf_addstr(path, de->d_name);
> - if (lstat(path->buf, ))
> - die_errno("cannot lstat '%s'", path->buf);
> - if (S_ISDIR(st.st_mode))
> - remove_subtree(path);
> - else if (unlink(path->buf))
> - die_errno("cannot unlink '%s'", path->buf);
> - strbuf_setlen(path, origlen);
> + struct dir_iterator *diter = dir_iterator_begin(path, 
> DIR_ITERATOR_POST_ORDER_TRAVERSAL | DIR_ITERATOR_LIST_ROOT_DIR);

The line above is way too long. Try to limit lines to 80 characters max.
(This is documented in `Documentation/CodingGuidelines`.)

> + while (dir_iterator_advance(diter) == ITER_OK) {
> + if (S_ISDIR(diter->st.st_mode)) {
> + if (rmdir(diter->path.buf))
> + die_errno("cannot rmdir '%s'", diter->path.buf);
> + } else if (unlink(diter->path.buf))
> + die_errno("cannot unlink '%s'", diter->path.buf);
>   }
> - closedir(dir);
> - if (rmdir(path->buf))
> - die_errno("cannot rmdir '%s'", path->buf);
>  }
>  
>  static int create_file(const char *path, unsigned int mode)
> @@ -312,7 +298,7 @@ int checkout_entry(struct cache_entry *ce,
>   return 0;
>   if (!state->force)
>   return error("%s is a directory", path.buf);
> - remove_subtree();
> + remove_subtree(path.buf);
>   } else if (unlink(path.buf))
>   return error_errno("unable to unlink old '%s'", 
> path.buf);
>   } else if (state->not_new)
> 

That's a gratifying decrease in code size :-)

Michael



Re: [PATCH v7 4/5] dir_iterator: refactor state machine model

2017-04-02 Thread Michael Haggerty
On 04/02/2017 10:03 PM, Daniel Ferreira wrote:
> Perform major refactor of dir_iterator_advance(). dir_iterator has
> ceased to rely on a convoluted state machine mechanism of two loops and
> two state variables (level.initialized and level.dir_state). This serves
> to ease comprehension of the iterator mechanism and ease addition of new
> features to the iterator.
> 
> Create an option for the dir_iterator API to iterate over subdirectories
> only after having iterated through their contents. This feature was
> predicted, although not implemented by 0fe5043 ("dir_iterator: new API
> for iterating over a directory tree", 2016-06-18). This is useful for
> recursively removing a directory and calling rmdir() on a directory only
> after all of its contents have been wiped.
> 
> Add an option for the dir_iterator API to iterate over the root
> directory (the one it was initialized with) as well.
> 
> Add the "flags" parameter to dir_iterator_create, allowing for the
> aforementioned new features to be enabled. The new default behavior
> (i.e. flags set to 0) does not iterate over directories. Flag
> DIR_ITERATOR_PRE_ORDER_TRAVERSAL iterates over a directory before doing
> so over its contents. DIR_ITERATOR_POST_ORDER_TRAVERSAL iterates over a
> directory after doing so over its contents. DIR_ITERATOR_LIST_ROOT_DIR
> iterates over the root directory. These flags do not conflict with each
> other and may be used simultaneously.
> 
> Amend a call to dir_iterator_begin() in refs/files-backend.c to pass
> the flags parameter introduced.
> 
> Improve t/t0065-dir-iterator.sh and t/helper/test-dir-iterator.c to
> test "post-order" and "iterate-over-root" modes.

Nice. I think this version is a lot more understandable than the old
code. I also think it's getting very close to done.

> Signed-off-by: Daniel Ferreira 
> ---
>  dir-iterator.c   | 155 
> +++
>  dir-iterator.h   |  28 ++--
>  refs/files-backend.c |   2 +-
>  t/helper/test-dir-iterator.c |   6 +-
>  t/t0065-dir-iterator.sh  |  61 -
>  5 files changed, 183 insertions(+), 69 deletions(-)
> 
> diff --git a/dir-iterator.c b/dir-iterator.c
> index ce8bf81..18b7e68 100644
> --- a/dir-iterator.c
> +++ b/dir-iterator.c
> @@ -4,8 +4,6 @@
>  #include "dir-iterator.h"
>  
>  struct dir_iterator_level {
> - int initialized;
> -
>   DIR *dir;
>  
>   /*
> @@ -20,8 +18,11 @@ struct dir_iterator_level {
>* iteration and also iterated into):
>*/
>   enum {
> - DIR_STATE_ITER,
> - DIR_STATE_RECURSE
> + DIR_STATE_PUSHED,
> + DIR_STATE_PRE_ITERATION,
> + DIR_STATE_ITERATING,
> + DIR_STATE_POST_ITERATION,
> + DIR_STATE_EXHAUSTED
>   } dir_state;
>  };
>  
> @@ -48,15 +49,20 @@ struct dir_iterator_int {
>* that will be included in this iteration.
>*/
>   struct dir_iterator_level *levels;
> +
> + /* Holds the flags passed to dir_iterator_begin(). */
> + unsigned flags;
>  };
>  
>  static inline void push_dir_level(struct dir_iterator_int *iter, struct 
> dir_iterator_level *level)
>  {
> - level->dir_state = DIR_STATE_RECURSE;
>   ALLOC_GROW(iter->levels, iter->levels_nr + 1,
>  iter->levels_alloc);
> +
> + /* Push a new level */
>   level = >levels[iter->levels_nr++];
> - level->initialized = 0;
> + level->dir = NULL;
> + level->dir_state = DIR_STATE_PUSHED;
>  }
>  
>  static inline int pop_dir_level(struct dir_iterator_int *iter)
> @@ -75,18 +81,35 @@ static inline int set_iterator_data(struct 
> dir_iterator_int *iter, struct dir_it
>   }
>  
>   /*
> -  * We have to set these each time because
> -  * the path strbuf might have been realloc()ed.
> +  * Check if we are dealing with the root directory as an
> +  * item that's being iterated through.
>*/
> - iter->base.relative_path =
> - iter->base.path.buf + iter->levels[0].prefix_len;
> + if (level->dir_state != DIR_STATE_ITERATING &&
> + iter->levels_nr == 1) {
> + iter->base.relative_path = ".";

Doesn't `iter->base.basename` also need special handling in this case?

> + }
> + else {

The Git project standard is to including the `else` on the same line as
the curly brace ending the `if` block:

} else {

> + iter->base.relative_path =
> + iter->base.path.buf + iter->levels[0].prefix_len;
> + }
>   iter->base.basename =
>   iter->base.path.buf + level->prefix_len;
> - level->dir_state = DIR_STATE_ITER;
>  
>   return 0;
>  }
>  
> +/*
> + * This function uses a state machine with the following states:
> + * -> DIR_STATE_PUSHED: the directory has been pushed to the
> + * iterator traversal tree.
> + * -> DIR_STATE_PRE_ITERATION: the directory is *NOT* initialized. The
> + * 

Bug? git submodule update --reference doesn't use the referenced repository

2017-04-02 Thread Maxime Viargues

Hi there,

I have been trying to use the --reference option to clone a big 
repository using a local copy, but I can't manage to make it work using 
sub-module update. I believe this is a bug, unless I missed something.

I am on Windows, Git 2.12.0

So the problem is as follow:
- I have got a repository with multiple sub-modules, say
main
lib1
sub-module1.git
lib2
sub-module2.git
- The original repositories are in GitHub, which makes it slow
- I have done a normal git clone of the entire repository (not bare) and 
put it on a file server, say \\fileserver\ref_repo\

(Note that the problem also happens with local copy)

So if I do a clone to get the repo and all the submodules with...
git clone --reference-if-able \\fileserver\ref-repo --recursive 
g...@github.com:company/main

...then it all works, all the sub-modules get cloned and the it's fast.

Now in my case I am working with Jenkins jobs and I need to first do a 
clone, and then get the sub-modules, but if I do...
git clone --reference-if-able \\fileserver\ref-repo 
g...@github.com:company/main (so non-recursive)

cd main
git submodule update --init --reference \\fileserver\ref-repo
... then this takes ages, as it would normally do without the use of 
--reference. I suspect it's not actually using it.
The git clone documentation mentions that the reference is then passed 
to the sub-module clone commands, so I would expect "git clone 
--recursive" to work the same as "git submodule update", as far as 
--reference is concerned.


I noticed for a single module, doing a...
git submodule update --init --reference 
\\fileserver\ref-repo\lib1\sub-module1 -- lib1/sub-module1
...i.e. adding the sub-module path to the reference path, works. Which 
kind of make sense but then how do you do to apply it to all the 
sub-modules? (without writing a script to do that)


If someone can confirm the problem or explain me what I am dong wrong 
that would be great.


Maxime


[PATCH] Fix 'git am' in-body header continuations

2017-04-02 Thread Linus Torvalds

From: Linus Torvalds 
Date: Sat, 1 Apr 2017 12:14:39 -0700
Subject: [PATCH] Fix 'git am' in-body header continuations

An empty line should stop any pending in-body headers, and start the
actual body parsing.

This also modifies the original test for the in-body headers to actually
have a real commit body that starts with spaces, and changes the test to
check that the long line matches _exactly_, and doesn't get extra data
from the body.

Fixes:6b4b013f1884 ("mailinfo: handle in-body header continuations")
Cc: Jonathan Tan 
Cc: Jeff King 
Signed-off-by: Linus Torvalds 
---

On Sun, 2 Apr 2017, Junio C Hamano wrote:
> 
> And that is exactly your patch does.  The change "feels" correct to
> me.

Ok, resent with the test-case for the original behavior changed to be 
stricter (so it fails without this fix), and with Signed-off lines etc.

I didn't really test the test-case very much, but it seemed to fail 
without this patch (because the "Body test" thing from the body becomes 
part of the long first line), and passes with it.

But somebody who is more used to the test-suite should double-check my 
stupid test edit.

 mailinfo.c| 7 ++-
 t/t4150-am.sh | 6 --
 2 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/mailinfo.c b/mailinfo.c
index a489d9d0f..68037758f 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -757,8 +757,13 @@ static int handle_commit_msg(struct mailinfo *mi, struct 
strbuf *line)
assert(!mi->filter_stage);
 
if (mi->header_stage) {
-   if (!line->len || (line->len == 1 && line->buf[0] == '\n'))
+   if (!line->len || (line->len == 1 && line->buf[0] == '\n')) {
+   if (mi->inbody_header_accum.len) {
+   flush_inbody_header_accum(mi);
+   mi->header_stage = 0;
+   }
return 0;
+   }
}
 
if (mi->use_inbody_headers && mi->header_stage) {
diff --git a/t/t4150-am.sh b/t/t4150-am.sh
index 89a5bacac..44807e218 100755
--- a/t/t4150-am.sh
+++ b/t/t4150-am.sh
@@ -983,7 +983,9 @@ test_expect_success 'am works with multi-line in-body 
headers' '
rm -fr .git/rebase-apply &&
git checkout -f first &&
echo one >> file &&
-   git commit -am "$LONG" --author="$LONG " &&
+   git commit -am "$LONG
+
+Body test" --author="$LONG " &&
git format-patch --stdout -1 >patch &&
# bump from, date, and subject down to in-body header
perl -lpe "
@@ -997,7 +999,7 @@ test_expect_success 'am works with multi-line in-body 
headers' '
git am msg &&
# Ensure that the author and full message are present
git cat-file commit HEAD | grep "^author.*l...@example.com" &&
-   git cat-file commit HEAD | grep "^$LONG"
+   git cat-file commit HEAD | grep "^$LONG$"
 '
 
 test_done
-- 
2.12.2.578.g5c4e54f4e



Re: [GSoC] Proposal: turn git-add--interactive.perl into a builtin

2017-04-02 Thread Daniel Ferreira (theiostream)
On Sun, Apr 2, 2017 at 4:43 PM, Johannes Schindelin
 wrote:
> We ask to accomplish a microproject before evaluating the proposals for
> one reason: to have a good understanding how well the students would
> interact with the project if they were accepted. As such, the
> microprojects really are about the flow of the contribution, not to tackle
> the project already.
> Meaning: I would recommend staying with your microproject, in particular
> if it is already in full swing.

Oh, when I mentioned these bugfixes I meant I'd be willing to do those
*plus* my microproject before the coding period begins as a "warm-up"
to the project. I'm certainly staying with the microproject until the
end!

-- Daniel.


[PATCH v7 5/5] remove_subtree(): reimplement using iterators

2017-04-02 Thread Daniel Ferreira
From: Daniel Ferreira 

Use dir_iterator to traverse through remove_subtree()'s directory tree,
avoiding the need for recursive calls to readdir(). Simplify
remove_subtree()'s code.

A conversion similar in purpose was previously done at 46d092a
("for_each_reflog(): reimplement using iterators", 2016-05-21).

Signed-off-by: Daniel Ferreira 
---
 entry.c | 38 --
 1 file changed, 12 insertions(+), 26 deletions(-)

diff --git a/entry.c b/entry.c
index d2b512d..bd06f41 100644
--- a/entry.c
+++ b/entry.c
@@ -3,6 +3,8 @@
 #include "dir.h"
 #include "streaming.h"
 #include "submodule.h"
+#include "iterator.h"
+#include "dir-iterator.h"
 
 static void create_directories(const char *path, int path_len,
   const struct checkout *state)
@@ -45,33 +47,17 @@ static void create_directories(const char *path, int 
path_len,
free(buf);
 }
 
-static void remove_subtree(struct strbuf *path)
+static void remove_subtree(const char *path)
 {
-   DIR *dir = opendir(path->buf);
-   struct dirent *de;
-   int origlen = path->len;
-
-   if (!dir)
-   die_errno("cannot opendir '%s'", path->buf);
-   while ((de = readdir(dir)) != NULL) {
-   struct stat st;
-
-   if (is_dot_or_dotdot(de->d_name))
-   continue;
-
-   strbuf_addch(path, '/');
-   strbuf_addstr(path, de->d_name);
-   if (lstat(path->buf, ))
-   die_errno("cannot lstat '%s'", path->buf);
-   if (S_ISDIR(st.st_mode))
-   remove_subtree(path);
-   else if (unlink(path->buf))
-   die_errno("cannot unlink '%s'", path->buf);
-   strbuf_setlen(path, origlen);
+   struct dir_iterator *diter = dir_iterator_begin(path, 
DIR_ITERATOR_POST_ORDER_TRAVERSAL | DIR_ITERATOR_LIST_ROOT_DIR);
+
+   while (dir_iterator_advance(diter) == ITER_OK) {
+   if (S_ISDIR(diter->st.st_mode)) {
+   if (rmdir(diter->path.buf))
+   die_errno("cannot rmdir '%s'", diter->path.buf);
+   } else if (unlink(diter->path.buf))
+   die_errno("cannot unlink '%s'", diter->path.buf);
}
-   closedir(dir);
-   if (rmdir(path->buf))
-   die_errno("cannot rmdir '%s'", path->buf);
 }
 
 static int create_file(const char *path, unsigned int mode)
@@ -312,7 +298,7 @@ int checkout_entry(struct cache_entry *ce,
return 0;
if (!state->force)
return error("%s is a directory", path.buf);
-   remove_subtree();
+   remove_subtree(path.buf);
} else if (unlink(path.buf))
return error_errno("unable to unlink old '%s'", 
path.buf);
} else if (state->not_new)
-- 
2.7.4 (Apple Git-66)



[PATCH v7 4/5] dir_iterator: refactor state machine model

2017-04-02 Thread Daniel Ferreira
Perform major refactor of dir_iterator_advance(). dir_iterator has
ceased to rely on a convoluted state machine mechanism of two loops and
two state variables (level.initialized and level.dir_state). This serves
to ease comprehension of the iterator mechanism and ease addition of new
features to the iterator.

Create an option for the dir_iterator API to iterate over subdirectories
only after having iterated through their contents. This feature was
predicted, although not implemented by 0fe5043 ("dir_iterator: new API
for iterating over a directory tree", 2016-06-18). This is useful for
recursively removing a directory and calling rmdir() on a directory only
after all of its contents have been wiped.

Add an option for the dir_iterator API to iterate over the root
directory (the one it was initialized with) as well.

Add the "flags" parameter to dir_iterator_create, allowing for the
aforementioned new features to be enabled. The new default behavior
(i.e. flags set to 0) does not iterate over directories. Flag
DIR_ITERATOR_PRE_ORDER_TRAVERSAL iterates over a directory before doing
so over its contents. DIR_ITERATOR_POST_ORDER_TRAVERSAL iterates over a
directory after doing so over its contents. DIR_ITERATOR_LIST_ROOT_DIR
iterates over the root directory. These flags do not conflict with each
other and may be used simultaneously.

Amend a call to dir_iterator_begin() in refs/files-backend.c to pass
the flags parameter introduced.

Improve t/t0065-dir-iterator.sh and t/helper/test-dir-iterator.c to
test "post-order" and "iterate-over-root" modes.

Signed-off-by: Daniel Ferreira 
---
 dir-iterator.c   | 155 +++
 dir-iterator.h   |  28 ++--
 refs/files-backend.c |   2 +-
 t/helper/test-dir-iterator.c |   6 +-
 t/t0065-dir-iterator.sh  |  61 -
 5 files changed, 183 insertions(+), 69 deletions(-)

diff --git a/dir-iterator.c b/dir-iterator.c
index ce8bf81..18b7e68 100644
--- a/dir-iterator.c
+++ b/dir-iterator.c
@@ -4,8 +4,6 @@
 #include "dir-iterator.h"
 
 struct dir_iterator_level {
-   int initialized;
-
DIR *dir;
 
/*
@@ -20,8 +18,11 @@ struct dir_iterator_level {
 * iteration and also iterated into):
 */
enum {
-   DIR_STATE_ITER,
-   DIR_STATE_RECURSE
+   DIR_STATE_PUSHED,
+   DIR_STATE_PRE_ITERATION,
+   DIR_STATE_ITERATING,
+   DIR_STATE_POST_ITERATION,
+   DIR_STATE_EXHAUSTED
} dir_state;
 };
 
@@ -48,15 +49,20 @@ struct dir_iterator_int {
 * that will be included in this iteration.
 */
struct dir_iterator_level *levels;
+
+   /* Holds the flags passed to dir_iterator_begin(). */
+   unsigned flags;
 };
 
 static inline void push_dir_level(struct dir_iterator_int *iter, struct 
dir_iterator_level *level)
 {
-   level->dir_state = DIR_STATE_RECURSE;
ALLOC_GROW(iter->levels, iter->levels_nr + 1,
   iter->levels_alloc);
+
+   /* Push a new level */
level = >levels[iter->levels_nr++];
-   level->initialized = 0;
+   level->dir = NULL;
+   level->dir_state = DIR_STATE_PUSHED;
 }
 
 static inline int pop_dir_level(struct dir_iterator_int *iter)
@@ -75,18 +81,35 @@ static inline int set_iterator_data(struct dir_iterator_int 
*iter, struct dir_it
}
 
/*
-* We have to set these each time because
-* the path strbuf might have been realloc()ed.
+* Check if we are dealing with the root directory as an
+* item that's being iterated through.
 */
-   iter->base.relative_path =
-   iter->base.path.buf + iter->levels[0].prefix_len;
+   if (level->dir_state != DIR_STATE_ITERATING &&
+   iter->levels_nr == 1) {
+   iter->base.relative_path = ".";
+   }
+   else {
+   iter->base.relative_path =
+   iter->base.path.buf + iter->levels[0].prefix_len;
+   }
iter->base.basename =
iter->base.path.buf + level->prefix_len;
-   level->dir_state = DIR_STATE_ITER;
 
return 0;
 }
 
+/*
+ * This function uses a state machine with the following states:
+ * -> DIR_STATE_PUSHED: the directory has been pushed to the
+ * iterator traversal tree.
+ * -> DIR_STATE_PRE_ITERATION: the directory is *NOT* initialized. The
+ * dirpath has already been returned if pre-order traversal is set.
+ * -> DIR_STATE_ITERATING: the directory is initialized. We are traversing
+ * through it.
+ * -> DIR_STATE_POST_ITERATION: the directory has been iterated through.
+ * We are ready to close it.
+ * -> DIR_STATE_EXHAUSTED: the directory is closed and ready to be popped.
+ */
 int dir_iterator_advance(struct dir_iterator *dir_iterator)
 {
struct dir_iterator_int *iter =
@@ -97,7 +120,18 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)

[PATCH v7 3/5] dir_iterator: add helpers to dir_iterator_advance

2017-04-02 Thread Daniel Ferreira
Create inline helpers to dir_iterator_advance(). Make
dir_iterator_advance()'s code more legible and allow some behavior to
be reusable.

Signed-off-by: Daniel Ferreira 
---
 dir-iterator.c | 65 +-
 1 file changed, 42 insertions(+), 23 deletions(-)

diff --git a/dir-iterator.c b/dir-iterator.c
index 34182a9..ce8bf81 100644
--- a/dir-iterator.c
+++ b/dir-iterator.c
@@ -50,6 +50,43 @@ struct dir_iterator_int {
struct dir_iterator_level *levels;
 };
 
+static inline void push_dir_level(struct dir_iterator_int *iter, struct 
dir_iterator_level *level)
+{
+   level->dir_state = DIR_STATE_RECURSE;
+   ALLOC_GROW(iter->levels, iter->levels_nr + 1,
+  iter->levels_alloc);
+   level = >levels[iter->levels_nr++];
+   level->initialized = 0;
+}
+
+static inline int pop_dir_level(struct dir_iterator_int *iter)
+{
+   return --iter->levels_nr;
+}
+
+static inline int set_iterator_data(struct dir_iterator_int *iter, struct 
dir_iterator_level *level)
+{
+   if (lstat(iter->base.path.buf, >base.st) < 0) {
+   if (errno != ENOENT)
+   warning("error reading path '%s': %s",
+   iter->base.path.buf,
+   strerror(errno));
+   return -1;
+   }
+
+   /*
+* We have to set these each time because
+* the path strbuf might have been realloc()ed.
+*/
+   iter->base.relative_path =
+   iter->base.path.buf + iter->levels[0].prefix_len;
+   iter->base.basename =
+   iter->base.path.buf + level->prefix_len;
+   level->dir_state = DIR_STATE_ITER;
+
+   return 0;
+}
+
 int dir_iterator_advance(struct dir_iterator *dir_iterator)
 {
struct dir_iterator_int *iter =
@@ -84,11 +121,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 * over; now prepare to iterate into
 * it.
 */
-   level->dir_state = DIR_STATE_RECURSE;
-   ALLOC_GROW(iter->levels, iter->levels_nr + 1,
-  iter->levels_alloc);
-   level = >levels[iter->levels_nr++];
-   level->initialized = 0;
+   push_dir_level(iter, level);
continue;
} else {
/*
@@ -104,7 +137,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
 * This level is exhausted (or wasn't opened
 * successfully); pop up a level.
 */
-   if (--iter->levels_nr == 0)
+   if (pop_dir_level(iter) == 0)
return dir_iterator_abort(dir_iterator);
 
continue;
@@ -129,7 +162,7 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
iter->base.path.buf, 
strerror(errno));
 
level->dir = NULL;
-   if (--iter->levels_nr == 0)
+   if (pop_dir_level(iter) == 0)
return dir_iterator_abort(dir_iterator);
break;
}
@@ -138,23 +171,9 @@ int dir_iterator_advance(struct dir_iterator *dir_iterator)
continue;
 
strbuf_addstr(>base.path, de->d_name);
-   if (lstat(iter->base.path.buf, >base.st) < 0) {
-   if (errno != ENOENT)
-   warning("error reading path '%s': %s",
-   iter->base.path.buf,
-   strerror(errno));
-   continue;
-   }
 
-   /*
-* We have to set these each time because
-* the path strbuf might have been realloc()ed.
-*/
-   iter->base.relative_path =
-   iter->base.path.buf + 
iter->levels[0].prefix_len;
-   iter->base.basename =
-   iter->base.path.buf + level->prefix_len;
-   level->dir_state = DIR_STATE_ITER;
+   if (set_iterator_data(iter, level))
+   continue;
 
return ITER_OK;
}
-- 
2.7.4 (Apple Git-66)



[PATCH v7 1/5] dir_iterator: add tests for dir_iterator API

2017-04-02 Thread Daniel Ferreira
Create t/helper/test-dir-iterator.c, which prints relevant information
about a directory tree iterated over with dir_iterator.

Create t/t0065-dir-iterator.sh, which tests that dir_iterator does
iterate through a whole directory tree.

Signed-off-by: Daniel Ferreira 
---
 Makefile |  1 +
 t/helper/.gitignore  |  1 +
 t/helper/test-dir-iterator.c | 28 +++
 t/t0065-dir-iterator.sh  | 54 
 4 files changed, 84 insertions(+)
 create mode 100644 t/helper/test-dir-iterator.c
 create mode 100755 t/t0065-dir-iterator.sh

diff --git a/Makefile b/Makefile
index 9b36068..41ce9ab 100644
--- a/Makefile
+++ b/Makefile
@@ -614,6 +614,7 @@ TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-config
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
+TEST_PROGRAMS_NEED_X += test-dir-iterator
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
 TEST_PROGRAMS_NEED_X += test-dump-split-index
 TEST_PROGRAMS_NEED_X += test-dump-untracked-cache
diff --git a/t/helper/.gitignore b/t/helper/.gitignore
index 758ed2e..a7d74d3 100644
--- a/t/helper/.gitignore
+++ b/t/helper/.gitignore
@@ -3,6 +3,7 @@
 /test-config
 /test-date
 /test-delta
+/test-dir-iterator
 /test-dump-cache-tree
 /test-dump-split-index
 /test-dump-untracked-cache
diff --git a/t/helper/test-dir-iterator.c b/t/helper/test-dir-iterator.c
new file mode 100644
index 000..06f03fc
--- /dev/null
+++ b/t/helper/test-dir-iterator.c
@@ -0,0 +1,28 @@
+#include "git-compat-util.h"
+#include "strbuf.h"
+#include "iterator.h"
+#include "dir-iterator.h"
+
+int cmd_main(int argc, const char **argv) {
+   struct strbuf path = STRBUF_INIT;
+   struct dir_iterator *diter;
+
+   if (argc < 2) {
+   return 1;
+   }
+
+   strbuf_add(, argv[1], strlen(argv[1]));
+
+   diter = dir_iterator_begin(path.buf);
+
+   while (dir_iterator_advance(diter) == ITER_OK) {
+   if (S_ISDIR(diter->st.st_mode))
+   printf("[d] ");
+   else
+   printf("[f] ");
+
+   printf("(%s) %s\n", diter->relative_path, diter->path.buf);
+   }
+
+   return 0;
+}
diff --git a/t/t0065-dir-iterator.sh b/t/t0065-dir-iterator.sh
new file mode 100755
index 000..b857c07
--- /dev/null
+++ b/t/t0065-dir-iterator.sh
@@ -0,0 +1,54 @@
+#!/bin/sh
+
+test_description='Test directory iteration.'
+
+. ./test-lib.sh
+
+cat >expect-sorted-output <<-\EOF &&
+[d] (a) ./dir/a
+[d] (a/b) ./dir/a/b
+[d] (a/b/c) ./dir/a/b/c
+[d] (d) ./dir/d
+[d] (d/e) ./dir/d/e
+[d] (d/e/d) ./dir/d/e/d
+[f] (a/b/c/d) ./dir/a/b/c/d
+[f] (a/e) ./dir/a/e
+[f] (b) ./dir/b
+[f] (c) ./dir/c
+[f] (d/e/d/a) ./dir/d/e/d/a
+EOF
+
+test_expect_success 'dir-iterator should iterate through all files' '
+   mkdir -p dir &&
+   mkdir -p dir/a/b/c/ &&
+   >dir/b &&
+   >dir/c &&
+   mkdir -p dir/d/e/d/ &&
+   >dir/a/b/c/d &&
+   >dir/a/e &&
+   >dir/d/e/d/a &&
+
+   test-dir-iterator ./dir | sort >./actual-pre-order-sorted-output &&
+   rm -rf dir &&
+
+   test_cmp expect-sorted-output actual-pre-order-sorted-output
+'
+
+cat >expect-pre-order-output <<-\EOF &&
+[d] (a) ./dir/a
+[d] (a/b) ./dir/a/b
+[d] (a/b/c) ./dir/a/b/c
+[f] (a/b/c/d) ./dir/a/b/c/d
+EOF
+
+test_expect_success 'dir-iterator should list files in the correct order' '
+   mkdir -p dir/a/b/c/ &&
+   >dir/a/b/c/d &&
+
+   test-dir-iterator ./dir >actual-pre-order-output &&
+   rm -rf dir &&
+
+   test_cmp expect-pre-order-output actual-pre-order-output
+'
+
+test_done
-- 
2.7.4 (Apple Git-66)



[PATCH v7 2/5] remove_subtree(): test removing nested directories

2017-04-02 Thread Daniel Ferreira
Test removing a nested directory when an attempt is made to restore the
index to a state where it does not exist. A similar test could be found
previously in t/t2000-checkout-cache-clash.sh, but it did not check for
nested directories, which could allow a faulty implementation of
remove_subtree() pass the tests.

Signed-off-by: Daniel Ferreira 
---
 t/t2000-checkout-cache-clash.sh | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/t/t2000-checkout-cache-clash.sh b/t/t2000-checkout-cache-clash.sh
index de3edb5..ac10ba3 100755
--- a/t/t2000-checkout-cache-clash.sh
+++ b/t/t2000-checkout-cache-clash.sh
@@ -57,4 +57,15 @@ test_expect_success SYMLINKS 'checkout-index -f twice with 
--prefix' '
git checkout-index -a -f --prefix=there/
 '
 
+test_expect_success 'git checkout-index -f should remove nested subtrees' '
+   echo content >path &&
+   git update-index --add path &&
+   rm path &&
+   mkdir -p path/with/nested/paths &&
+   echo content >path/file1 &&
+   echo content >path/with/nested/paths/file2 &&
+   git checkout-index -f -a &&
+   test ! -d path
+'
+
 test_done
-- 
2.7.4 (Apple Git-66)



[PATCH v7 0/5] [GSoC] remove_subtree(): reimplement using iterators

2017-04-02 Thread Daniel Ferreira
This is the seventh version of a patch series that implements the GSoC
microproject of converting a recursive call to readdir() to use dir_iterator.

v1: 
https://public-inbox.org/git/CAGZ79kZwT-9mHTiOJ5CEjk2wDFkn6+NcogjX0=vjhsah16a...@mail.gmail.com/T/#t
v2: 
https://public-inbox.org/git/cacsjy8dxh-qpbblfafwpawusba9gvxa7x+mxljevykhk1zo...@mail.gmail.com/T/#t
v3: 
https://public-inbox.org/git/CAGZ79kYtpmURSQWPumobA=e3jbfjkhwcdv_lphkcd71zrwm...@mail.gmail.com/T/#t
v4: 
https://public-inbox.org/git/1490747533-89143-1-git-send-email-bnm...@gmail.com/T/#e437a63e0c22c00c69b5d92977c9b438ed2b9fd3a
v5: 
https://public-inbox.org/git/1490844730-47634-1-git-send-email-bnm...@gmail.com/T/#m2323f15e45de699f2e09364f40a62e17047cf453
v6: 
https://public-inbox.org/git/1491107726-21504-1-git-send-email-bnm...@gmail.com/T/#t

I screwed up in v6 because I had introduced a bug that in case git tried to open
a directory that did not exist using dir_iterator, the program would segfault. 
This
was amended and all commits are passing the tests. Sorry for not having tested
my changes properly.

CI build: https://travis-ci.org/theiostream/git

Since the changes in v6 were not reviewed, I'll just copy what was sent
back there.

> Back in v5, Michael had a number of suggestions, all of which were applied
> to this version (including a slightly modified version of his "biggish 
> rewrite"
> project to make dir_iterator's state machine simpler). The only suggestion 
> that
> did not make it into this series was that of not traversing into 
> subdirectories,
> since I believe it would be better off in another series that actually 
> required
> that feature (that is, I do not want a series to implement a feature it will
> not need). The same goes for Junio's thought on a flag to list *only* 
> directories
> and no files on the v4 discussion.

> Junio and Peff's comments about how to write to files in the tests were also
> considered, and the tests were adjusted.

> I chose to squash both the state machine refactor and the addition of the
> new flags in a single commit. I do not know whether you will feel this is
> the right choice but it seemed natural, since most of the state machine's
> new logic would not even make sense without encompassing the new features.
> I am, of course, open for feedback on this decision.

Daniel Ferreira (5):
  dir_iterator: add tests for dir_iterator API
  remove_subtree(): test removing nested directories
  dir_iterator: add helpers to dir_iterator_advance
  dir_iterator: refactor state machine model
  remove_subtree(): reimplement using iterators

 Makefile|   1 +
 dir-iterator.c  | 196 ++--
 dir-iterator.h  |  28 --
 entry.c |  38 +++-
 refs/files-backend.c|   2 +-
 t/helper/.gitignore |   1 +
 t/helper/test-dir-iterator.c|  32 +++
 t/t0065-dir-iterator.sh | 109 ++
 t/t2000-checkout-cache-clash.sh |  11 +++
 9 files changed, 316 insertions(+), 102 deletions(-)
 create mode 100644 t/helper/test-dir-iterator.c
 create mode 100755 t/t0065-dir-iterator.sh

--
2.7.4 (Apple Git-66)



Re: [GSoC] Proposal: turn git-add--interactive.perl into a builtin

2017-04-02 Thread Johannes Schindelin
Hi Daniel,

On Fri, 31 Mar 2017, Daniel Ferreira (theiostream) wrote:

> Question: do you suggest any pending bugfix to git-add--interactive or
> to something related that might give some useful knowledge in advance?
> (for the pre-code period). My microproject involves playing with the
> dir_iterator interface, which is a great exercise in code refactoring
> but really does not teach me too much about Git's architecture.

We ask to accomplish a microproject before evaluating the proposals for
one reason: to have a good understanding how well the students would
interact with the project if they were accepted. As such, the
microprojects really are about the flow of the contribution, not to tackle
the project already.

Meaning: I would recommend staying with your microproject, in particular
if it is already in full swing.

Ciao,
Johannes


[PATCH v2 8/8] Use uintmax_t for timestamps

2017-04-02 Thread Johannes Schindelin
Previously, we used `unsigned long` for timestamps. This was only a good
choice on Linux, where we know implicitly that `unsigned long` is what is
used for `time_t`.

However, we want to use a different data type for timestamps for two
reasons:

- there is nothing that says that `unsigned long` should be the same data
  type as `time_t`, and indeed, on 64-bit Windows for example, it is not:
  `unsigned long` is 32-bit but `time_t` is 64-bit.

- even on 32-bit Linux, where `unsigned long` (and thereby `time_t`) is
  32-bit, we *want* to be able to encode timestamps in Git that are
  currently absurdly far in the future, *even if* the system library is
  not able to format those timestamps into date strings.

So let's just switch to the maximal integer type available, which should
be at least 64-bit for all practical purposes these days. It certainly
cannot be worse than `unsigned long`, so...

Signed-off-by: Johannes Schindelin 
---
 git-compat-util.h | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 72c12173a14..c678ca94b8f 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -319,10 +319,14 @@ extern char *gitdirname(char *);
 #define PRIo32 "o"
 #endif
 
-typedef unsigned long timestamp_t;
-#define PRItime "lu"
-#define parse_timestamp strtoul
+typedef uintmax_t timestamp_t;
+#define PRItime PRIuMAX
+#define parse_timestamp strtoumax
+#ifdef ULLONG_MAX
+#define TIME_MAX ULLONG_MAX
+#else
 #define TIME_MAX ULONG_MAX
+#endif
 
 #ifndef PATH_SEP
 #define PATH_SEP ':'
-- 
2.12.2.windows.1


[PATCH v2 7/8] Abort if the system time cannot handle one of our timestamps

2017-04-02 Thread Johannes Schindelin
We are about to switch to a new data type for time stamps that is
definitely not smaller or equal, but larger or equal to time_t.

So before using the system functions to process or format timestamps,
let's make extra certain that they can handle what we feed them.

Signed-off-by: Johannes Schindelin 
---
 date.c | 17 ++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/date.c b/date.c
index 92ab31aa441..db3435df3e4 100644
--- a/date.c
+++ b/date.c
@@ -43,10 +43,13 @@ static time_t gm_time_t(timestamp_t time, int tz)
 {
int minutes;
 
+   if (date_overflows(time))
+   die("Timestamp too large for this system: %"PRItime, time);
+
minutes = tz < 0 ? -tz : tz;
minutes = (minutes / 100)*60 + (minutes % 100);
minutes = tz < 0 ? -minutes : minutes;
-   return time + minutes * 60;
+   return (time_t)time + minutes * 60;
 }
 
 /*
@@ -56,7 +59,12 @@ static time_t gm_time_t(timestamp_t time, int tz)
  */
 static struct tm *time_to_tm(timestamp_t time, int tz)
 {
-   time_t t = gm_time_t(time, tz);
+   time_t t;
+
+   if (date_overflows(time))
+   die("Timestamp too large for this system: %"PRItime, time);
+
+   t = gm_time_t((time_t)time, tz);
return gmtime();
 }
 
@@ -70,7 +78,10 @@ static int local_tzoffset(timestamp_t time)
struct tm tm;
int offset, eastwest;
 
-   t = time;
+   if (date_overflows(time))
+   die("Timestamp too large for this system: %"PRItime, time);
+
+   t = (time_t)time;
localtime_r(, );
t_local = tm_to_time_t();
 
-- 
2.12.2.windows.1




[PATCH v2 6/8] Introduce a new data type for timestamps

2017-04-02 Thread Johannes Schindelin
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).

So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.

By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.

As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.

Signed-off-by: Johannes Schindelin 
---
 Documentation/technical/api-parse-options.txt |  8 ++--
 archive-tar.c |  5 +-
 archive-zip.c |  6 ++-
 archive.h |  2 +-
 builtin/am.c  |  2 +-
 builtin/blame.c   |  8 ++--
 builtin/fsck.c|  4 +-
 builtin/gc.c  |  2 +-
 builtin/log.c |  2 +-
 builtin/merge-base.c  |  2 +-
 builtin/name-rev.c|  6 +--
 builtin/pack-objects.c|  4 +-
 builtin/prune.c   |  4 +-
 builtin/receive-pack.c|  6 +--
 builtin/reflog.c  | 24 +-
 builtin/show-branch.c |  4 +-
 builtin/worktree.c|  4 +-
 bundle.c  |  2 +-
 cache.h   | 14 +++---
 commit.c  | 12 ++---
 commit.h  |  2 +-
 config.c  |  2 +-
 credential-cache--daemon.c| 12 ++---
 date.c| 66 +--
 fetch-pack.c  |  6 +--
 git-compat-util.h |  2 +
 http-backend.c|  4 +-
 parse-options-cb.c|  4 +-
 pretty.c  |  2 +-
 reachable.c   |  9 ++--
 reachable.h   |  4 +-
 ref-filter.c  |  4 +-
 reflog-walk.c |  8 ++--
 refs.c| 14 +++---
 refs.h|  8 ++--
 refs/files-backend.c  |  4 +-
 revision.c|  6 +--
 revision.h|  4 +-
 sha1_name.c   |  6 +--
 t/helper/test-date.c  |  8 ++--
 t/helper/test-parse-options.c |  2 +-
 tag.c |  2 +-
 tag.h |  2 +-
 upload-pack.c |  4 +-
 vcs-svn/fast_export.c |  4 +-
 vcs-svn/fast_export.h |  4 +-
 vcs-svn/svndump.c |  2 +-
 wt-status.c   |  2 +-
 48 files changed, 162 insertions(+), 156 deletions(-)

diff --git a/Documentation/technical/api-parse-options.txt 
b/Documentation/technical/api-parse-options.txt
index 36768b479e1..829b5581105 100644
--- a/Documentation/technical/api-parse-options.txt
+++ b/Documentation/technical/api-parse-options.txt
@@ -183,13 +183,13 @@ There are some macros to easily define options:
scale the provided value by 1024, 1024^2 or 1024^3 respectively.
The scaled value is put into `unsigned_long_var`.
 
-`OPT_DATE(short, long, _var, description)`::
+`OPT_DATE(short, long, _t_var, description)`::
Introduce an option with date argument, see `approxidate()`.
-   The timestamp is put into `int_var`.
+   The timestamp is put into `timestamp_t_var`.
 
-`OPT_EXPIRY_DATE(short, long, _var, description)`::
+`OPT_EXPIRY_DATE(short, long, _t_var, description)`::
Introduce an option with expiry date argument, see 
`parse_expiry_date()`.
-   The timestamp is put into `int_var`.
+   The timestamp is put into `timestamp_t_var`.
 
 `OPT_CALLBACK(short, long, , arg_str, description, func_ptr)`::
Introduce an option with argument.
diff --git a/archive-tar.c b/archive-tar.c
index 380e3aedd23..695339a2369 100644
--- a/archive-tar.c
+++ b/archive-tar.c
@@ -27,9 +27,12 @@ static int write_tar_filter_archive(const struct archiver 
*ar,
  */
 #if ULONG_MAX == 0x
 #define USTAR_MAX_SIZE ULONG_MAX
-#define USTAR_MAX_MTIME ULONG_MAX
 #else
 #define USTAR_MAX_SIZE 0777UL
+#endif
+#if TIME_MAX == 0x

[PATCH v2 5/8] Introduce a new "printf format" for timestamps

2017-04-02 Thread Johannes Schindelin
Currently, Git's source code treats all timestamps as if they were
unsigned longs. Therefore, it is okay to write "%lu" when printing them.

There is a substantial problem with that, though: at least on Windows,
time_t is *larger* than unsigned long, and hence we will want to switch
away from the ill-specified `unsigned long` data type.

So let's introduce the pseudo format "PRItime" (currently simply being
defined to "lu") to make it easier to change the data type used for
timestamps.

Signed-off-by: Johannes Schindelin 
---
 builtin/blame.c   |  6 +++---
 builtin/fsck.c|  2 +-
 builtin/log.c |  2 +-
 builtin/receive-pack.c|  4 ++--
 builtin/rev-list.c|  2 +-
 builtin/rev-parse.c   |  3 ++-
 date.c| 26 +-
 fetch-pack.c  |  2 +-
 git-compat-util.h |  1 +
 refs/files-backend.c  |  2 +-
 t/helper/test-date.c  |  2 +-
 t/helper/test-parse-options.c |  2 +-
 upload-pack.c |  2 +-
 vcs-svn/fast_export.c |  4 ++--
 14 files changed, 31 insertions(+), 29 deletions(-)

diff --git a/builtin/blame.c b/builtin/blame.c
index f7aa95f4bab..ff7b2df023b 100644
--- a/builtin/blame.c
+++ b/builtin/blame.c
@@ -1727,11 +1727,11 @@ static int emit_one_suspect_detail(struct origin 
*suspect, int repeat)
get_commit_info(suspect->commit, , 1);
printf("author %s\n", ci.author.buf);
printf("author-mail %s\n", ci.author_mail.buf);
-   printf("author-time %lu\n", ci.author_time);
+   printf("author-time %"PRItime"\n", ci.author_time);
printf("author-tz %s\n", ci.author_tz.buf);
printf("committer %s\n", ci.committer.buf);
printf("committer-mail %s\n", ci.committer_mail.buf);
-   printf("committer-time %lu\n", ci.committer_time);
+   printf("committer-time %"PRItime"\n", ci.committer_time);
printf("committer-tz %s\n", ci.committer_tz.buf);
printf("summary %s\n", ci.summary.buf);
if (suspect->commit->object.flags & UNINTERESTING)
@@ -1844,7 +1844,7 @@ static const char *format_time(unsigned long time, const 
char *tz_str,
 
strbuf_reset(_buf);
if (show_raw_time) {
-   strbuf_addf(_buf, "%lu %s", time, tz_str);
+   strbuf_addf(_buf, "%"PRItime" %s", time, tz_str);
}
else {
const char *time_str;
diff --git a/builtin/fsck.c b/builtin/fsck.c
index f76e4163abb..af7b985c6eb 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -407,7 +407,7 @@ static void fsck_handle_reflog_oid(const char *refname, 
struct object_id *oid,
if (timestamp && name_objects)
add_decoration(fsck_walk_options.object_names,
obj,
-   xstrfmt("%s@{%ld}", refname, 
timestamp));
+   xstrfmt("%s@{%"PRItime"}", refname, 
timestamp));
obj->used = 1;
mark_object_reachable(obj);
} else {
diff --git a/builtin/log.c b/builtin/log.c
index 670229cbb4c..079c659c754 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -903,7 +903,7 @@ static void get_patch_ids(struct rev_info *rev, struct 
patch_ids *ids)
 static void gen_message_id(struct rev_info *info, char *base)
 {
struct strbuf buf = STRBUF_INIT;
-   strbuf_addf(, "%s.%lu.git.%s", base,
+   strbuf_addf(, "%s.%"PRItime".git.%s", base,
(unsigned long) time(NULL),

git_committer_info(IDENT_NO_NAME|IDENT_NO_DATE|IDENT_STRICT));
info->message_id = strbuf_detach(, NULL);
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index fd8a24dd47e..8814e49893e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -459,12 +459,12 @@ static char *prepare_push_cert_nonce(const char *path, 
unsigned long stamp)
struct strbuf buf = STRBUF_INIT;
unsigned char sha1[20];
 
-   strbuf_addf(, "%s:%lu", path, stamp);
+   strbuf_addf(, "%s:%"PRItime, path, stamp);
hmac_sha1(sha1, buf.buf, buf.len, cert_nonce_seed, 
strlen(cert_nonce_seed));;
strbuf_release();
 
/* RFC 2104 5. HMAC-SHA1-80 */
-   strbuf_addf(, "%lu-%.*s", stamp, 20, sha1_to_hex(sha1));
+   strbuf_addf(, "%"PRItime"-%.*s", stamp, 20, sha1_to_hex(sha1));
return strbuf_detach(, NULL);
 }
 
diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 0aa93d58919..a30fbce3341 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -80,7 +80,7 @@ static void show_commit(struct commit *commit, void *data)
}
 
if (info->show_timestamp)
-   printf("%lu ", commit->date);
+   printf("%"PRItime" ", commit->date);
if (info->header_prefix)
fputs(info->header_prefix, stdout);
 
diff --git 

[PATCH v2 0/8] Introduce timestamp_t for timestamps

2017-04-02 Thread Johannes Schindelin
Git v2.9.2 was released in a hurry to accomodate for platforms like
Windows, where the `unsigned long` data type is 32-bit even for 64-bit
setups.

The quick fix was to simply disable all the testing with "absurd" future
dates.

However, we can do much better than that, as we already make use of
64-bit data types internally. There is no good reason why we should not
use the same for timestamps. Hence, let's use uintmax_t for timestamps.

Note: while the `time_t` data type exists and is meant to be used for
timestamps, on 32-bit Linux it is *still* 32-bit. An earlier iteration
used `time_t` for that reason, but it came with a few serious downsides:
as `time_t` can be signed (and indeed, on Windows it is an int64_t),
Git's expectation that 0 is the minimal value does no longer hold true,
introducing its own set of interesting challenges. Besides, if we *can*
handle far in the future timestamps (except for formatting them using
the system libraries), it is more consistent to do so.

The upside of using `uintmax_t` for timestamps is that we do a much
better job to support far in the future timestamps across all platforms,
including 32-bit ones. The downside is that those platforms that use a
32-bit `time_t` will barf when parsing or formatting those timestamps.

Changes since v1 (sorry, the interdiff is huge due to the switch from
time_t to uintmax_t):

- moved an `unsigned long` -> `time_t` change into 6/6 (it inadvertently
  was done as part of the patch that wanted to introduce
  parse_timestamp()).

- using `uintmax_t` for timestamps instead of `time_t` now.

- as 32-bit Linux still uses 32-bit time_t, while we may be able to
  represent timestamps far in the future internally, as soon as we try
  to format them using strftime() it will fail. Added safeguards against
  that and prepared t0006 and t5000 for this scenario (i.e. skip those
  tests that require strftime() to handle 64-bit timestamps).

- found a couple more places where strtoul() was still hardcoded for
  parsing timestamps, and fixed them.


Johannes Schindelin (8):
  ref-filter: avoid using `unsigned long` for catch-all data type
  t0006 & t5000: prepare for 64-bit timestamps
  t0006 & t5000: skip "far in the future" test when time_t is too
limited
  Specify explicitly where we parse timestamps
  Introduce a new "printf format" for timestamps
  Introduce a new data type for timestamps
  Abort if the system time cannot handle one of our timestamps
  Use uintmax_t for timestamps

 Documentation/technical/api-parse-options.txt |   8 +-
 archive-tar.c |   5 +-
 archive-zip.c |   6 +-
 archive.h |   2 +-
 builtin/am.c  |   4 +-
 builtin/blame.c   |  14 ++--
 builtin/fsck.c|   6 +-
 builtin/gc.c  |   2 +-
 builtin/log.c |   4 +-
 builtin/merge-base.c  |   2 +-
 builtin/name-rev.c|   6 +-
 builtin/pack-objects.c|   4 +-
 builtin/prune.c   |   4 +-
 builtin/receive-pack.c|  14 ++--
 builtin/reflog.c  |  24 +++---
 builtin/rev-list.c|   2 +-
 builtin/rev-parse.c   |   3 +-
 builtin/show-branch.c |   4 +-
 builtin/worktree.c|   4 +-
 bundle.c  |   4 +-
 cache.h   |  14 ++--
 commit.c  |  18 ++--
 commit.h  |   2 +-
 config.c  |   2 +-
 credential-cache--daemon.c|  12 +--
 date.c| 113 ++
 fetch-pack.c  |   8 +-
 fsck.c|   2 +-
 git-compat-util.h |   9 ++
 http-backend.c|   4 +-
 parse-options-cb.c|   4 +-
 pretty.c  |   4 +-
 reachable.c   |   9 +-
 reachable.h   |   4 +-
 ref-filter.c  |  22 ++---
 reflog-walk.c |   8 +-
 refs.c|  14 ++--
 refs.h|   8 +-
 refs/files-backend.c  |   8 +-
 revision.c|   6 +-
 revision.h|   4 +-
 sha1_name.c   |   6 +-
 t/helper/test-date.c  |  18 ++--
 t/helper/test-parse-options.c |   4 +-

[PATCH v2 4/8] Specify explicitly where we parse timestamps

2017-04-02 Thread Johannes Schindelin
Currently, Git's source code represents all timestamps as `unsigned
long`. In preparation for using a more appropriate data type, let's
introduce a symbol `parse_timestamp` (currently being defined to
`strtoul`) where appropriate, so that we can later easily switch to,
say, use `strtoull()` instead.

Signed-off-by: Johannes Schindelin 
---
 builtin/am.c   | 2 +-
 builtin/receive-pack.c | 4 ++--
 bundle.c   | 2 +-
 commit.c   | 6 +++---
 date.c | 6 +++---
 fsck.c | 2 +-
 git-compat-util.h  | 2 ++
 pretty.c   | 2 +-
 ref-filter.c   | 2 +-
 refs/files-backend.c   | 2 +-
 t/helper/test-date.c   | 2 +-
 tag.c  | 4 ++--
 upload-pack.c  | 2 +-
 13 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/builtin/am.c b/builtin/am.c
index f7a7a971fbe..2c93adc69c3 100644
--- a/builtin/am.c
+++ b/builtin/am.c
@@ -882,7 +882,7 @@ static int hg_patch_to_mail(FILE *out, FILE *in, int 
keep_cr)
char *end;
 
errno = 0;
-   timestamp = strtoul(str, , 10);
+   timestamp = parse_timestamp(str, , 10);
if (errno)
return error(_("invalid timestamp"));
 
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index aca9c33d8d8..fd8a24dd47e 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -534,7 +534,7 @@ static const char *check_nonce(const char *buf, size_t len)
retval = NONCE_BAD;
goto leave;
}
-   stamp = strtoul(nonce, , 10);
+   stamp = parse_timestamp(nonce, , 10);
if (bohmac == nonce || bohmac[0] != '-') {
retval = NONCE_BAD;
goto leave;
@@ -552,7 +552,7 @@ static const char *check_nonce(const char *buf, size_t len)
 * would mean it was issued by another server with its clock
 * skewed in the future.
 */
-   ostamp = strtoul(push_cert_nonce, NULL, 10);
+   ostamp = parse_timestamp(push_cert_nonce, NULL, 10);
nonce_stamp_slop = (long)ostamp - (long)stamp;
 
if (nonce_stamp_slop_limit &&
diff --git a/bundle.c b/bundle.c
index bbf4efa0a0a..f43bfcf5ff3 100644
--- a/bundle.c
+++ b/bundle.c
@@ -227,7 +227,7 @@ static int is_tag_in_date_range(struct object *tag, struct 
rev_info *revs)
line = memchr(line, '>', lineend ? lineend - line : buf + size - line);
if (!line++)
goto out;
-   date = strtoul(line, NULL, 10);
+   date = parse_timestamp(line, NULL, 10);
result = (revs->max_age == -1 || revs->max_age < date) &&
(revs->min_age == -1 || revs->min_age > date);
 out:
diff --git a/commit.c b/commit.c
index 73c78c2b80c..0d2d0fa1984 100644
--- a/commit.c
+++ b/commit.c
@@ -89,8 +89,8 @@ static unsigned long parse_commit_date(const char *buf, const 
char *tail)
/* nada */;
if (buf >= tail)
return 0;
-   /* dateptr < buf && buf[-1] == '\n', so strtoul will stop at buf-1 */
-   return strtoul(dateptr, NULL, 10);
+   /* dateptr < buf && buf[-1] == '\n', so parsing will stop at buf-1 */
+   return parse_timestamp(dateptr, NULL, 10);
 }
 
 static struct commit_graft **commit_graft;
@@ -607,7 +607,7 @@ static void record_author_date(struct author_date_slab 
*author_date,
!ident.date_begin || !ident.date_end)
goto fail_exit; /* malformed "author" line */
 
-   date = strtoul(ident.date_begin, _end, 10);
+   date = parse_timestamp(ident.date_begin, _end, 10);
if (date_end != ident.date_end)
goto fail_exit; /* malformed date */
*(author_date_slab_at(author_date, commit)) = date;
diff --git a/date.c b/date.c
index a996331f5b3..495c207c64f 100644
--- a/date.c
+++ b/date.c
@@ -510,7 +510,7 @@ static int match_digit(const char *date, struct tm *tm, int 
*offset, int *tm_gmt
char *end;
unsigned long num;
 
-   num = strtoul(date, , 10);
+   num = parse_timestamp(date, , 10);
 
/*
 * Seconds since 1970? We trigger on that for any numbers with
@@ -658,7 +658,7 @@ static int match_object_header_date(const char *date, 
unsigned long *timestamp,
 
if (*date < '0' || '9' < *date)
return -1;
-   stamp = strtoul(date, , 10);
+   stamp = parse_timestamp(date, , 10);
if (*end != ' ' || stamp == ULONG_MAX || (end[1] != '+' && end[1] != 
'-'))
return -1;
date = end + 2;
@@ -1066,7 +1066,7 @@ static const char *approxidate_digit(const char *date, 
struct tm *tm, int *num,
 time_t now)
 {
char *end;
-   unsigned long number = strtoul(date, , 10);
+   unsigned long number = parse_timestamp(date, , 10);
 
switch (*end) {
case ':':
diff --git a/fsck.c b/fsck.c
index 

[PATCH v2 3/8] t0006 & t5000: skip "far in the future" test when time_t is too limited

2017-04-02 Thread Johannes Schindelin
Git's source code refers to timestamps as unsigned long, which is
ill-defined, as there is no guarantee about the number of bits that
data type has.

In preparation of switching to another data type that is large enough
to hold "far in the future" dates, we need to prepare the t0006-date.sh
script for the case where we *still* cannot format those dates if the
system library uses 32-bit time_t.

Signed-off-by: Johannes Schindelin 
---
 t/helper/test-date.c | 5 -
 t/t0006-date.sh  | 4 ++--
 t/t5000-tar-tree.sh  | 2 +-
 t/test-lib.sh| 1 +
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 4727bea255c..ac7c66c733b 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -5,7 +5,8 @@ static const char *usage_msg = "\n"
 "  test-date show: [time_t]...\n"
 "  test-date parse [date]...\n"
 "  test-date approxidate [date]...\n"
-"  test-date is64bit\n";
+"  test-date is64bit\n"
+"  test-date time_t-is64bit\n";
 
 static void show_relative_dates(const char **argv, struct timeval *now)
 {
@@ -96,6 +97,8 @@ int cmd_main(int argc, const char **argv)
parse_approxidate(argv+1, );
else if (!strcmp(*argv, "is64bit"))
return sizeof(unsigned long) == 8 ? 0 : 1;
+   else if (!strcmp(*argv, "time_t-is64bit"))
+   return sizeof(time_t) == 8 ? 0 : 1;
else
usage(usage_msg);
return 0;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index 9539b425ffb..42d4ea61ef5 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -53,8 +53,8 @@ check_show unix-local "$TIME" '146600'
 
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
-check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT
-check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" TIME_IS_64BIT
+check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" 
TIME_IS_64BIT,TIME_T_IS_64BIT
+check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" 
TIME_IS_64BIT,TIME_T_IS_64BIT
 
 check_parse() {
echo "$1 -> $2" >expect
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 997aa9dea28..fe2d4f15a73 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -402,7 +402,7 @@ test_expect_success TIME_IS_64BIT 'generate tar with future 
mtime' '
git archive HEAD >future.tar
 '
 
-test_expect_success TAR_HUGE,TIME_IS_64BIT 'system tar can read our future 
mtime' '
+test_expect_success TAR_HUGE,TIME_IS_64BIT,TIME_T_IS_64BIT 'system tar can 
read our future mtime' '
echo 4147 >expect &&
tar_info future.tar | cut -d" " -f2 >actual &&
test_cmp expect actual
diff --git a/t/test-lib.sh b/t/test-lib.sh
index beee1d847ff..8d25cb7c183 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1166,3 +1166,4 @@ test_lazy_prereq LONG_IS_64BIT '
 '
 
 test_lazy_prereq TIME_IS_64BIT 'test-date is64bit'
+test_lazy_prereq TIME_T_IS_64BIT 'test-date time_t-is64bit'
-- 
2.12.2.windows.1




[PATCH v2 1/8] ref-filter: avoid using `unsigned long` for catch-all data type

2017-04-02 Thread Johannes Schindelin
In its `atom_value` struct, the ref-filter source code wants to store
different values in a field called `ul` (for `unsigned long`), e.g.
timestamps.

However, as we are about to switch the data type of timestamps away from
`unsigned long` (because it may be 32-bit even when `time_t` is 64-bit),
that data type is not large enough.

Simply change that field to use `uintmax_t` instead.

This patch is a bit larger than the mere change of the data type
because the field's name was tied to its data type, which has been fixed
at the same time.

Signed-off-by: Johannes Schindelin 
---
 ref-filter.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/ref-filter.c b/ref-filter.c
index 9c82b5b9d63..8538328fc7f 100644
--- a/ref-filter.c
+++ b/ref-filter.c
@@ -351,7 +351,7 @@ struct ref_formatting_state {
 struct atom_value {
const char *s;
void (*handler)(struct atom_value *atomv, struct ref_formatting_state 
*state);
-   unsigned long ul; /* used for sorting when not FIELD_STR */
+   uintmax_t value; /* used for sorting when not FIELD_STR */
struct used_atom *atom;
 };
 
@@ -723,7 +723,7 @@ static void grab_common_values(struct atom_value *val, int 
deref, struct object
if (!strcmp(name, "objecttype"))
v->s = typename(obj->type);
else if (!strcmp(name, "objectsize")) {
-   v->ul = sz;
+   v->value = sz;
v->s = xstrfmt("%lu", sz);
}
else if (deref)
@@ -770,8 +770,8 @@ static void grab_commit_values(struct atom_value *val, int 
deref, struct object
v->s = xstrdup(oid_to_hex(>tree->object.oid));
}
else if (!strcmp(name, "numparent")) {
-   v->ul = commit_list_count(commit->parents);
-   v->s = xstrfmt("%lu", v->ul);
+   v->value = commit_list_count(commit->parents);
+   v->s = xstrfmt("%lu", (unsigned long)v->value);
}
else if (!strcmp(name, "parent")) {
struct commit_list *parents;
@@ -875,11 +875,11 @@ static void grab_date(const char *buf, struct atom_value 
*v, const char *atomnam
if ((tz == LONG_MIN || tz == LONG_MAX) && errno == ERANGE)
goto bad;
v->s = xstrdup(show_date(timestamp, tz, _mode));
-   v->ul = timestamp;
+   v->value = timestamp;
return;
  bad:
v->s = "";
-   v->ul = 0;
+   v->value = 0;
 }
 
 /* See grab_values */
@@ -1934,9 +1934,9 @@ static int cmp_ref_sorting(struct ref_sorting *s, struct 
ref_array_item *a, stru
else if (cmp_type == FIELD_STR)
cmp = cmp_fn(va->s, vb->s);
else {
-   if (va->ul < vb->ul)
+   if (va->value < vb->value)
cmp = -1;
-   else if (va->ul == vb->ul)
+   else if (va->value == vb->value)
cmp = cmp_fn(a->refname, b->refname);
else
cmp = 1;
-- 
2.12.2.windows.1




[PATCH v2 2/8] t0006 & t5000: prepare for 64-bit timestamps

2017-04-02 Thread Johannes Schindelin
Git's source code refers to timestamps as unsigned longs. On 32-bit
platforms, as well as on Windows, unsigned long is not large enough to
capture dates that are "absurdly far in the future".

It is perfectly valid by the C standard, of course, for the `long` data
type to refer to 32-bit integers. That is why the `time_t` data type
exists: so that it can be 64-bit even if `long` is 32-bit. Git's source
code simply uses an incorrect data type for timestamps, is all.

The earlier quick fix 6b9c38e14cd (t0006: skip "far in the future" test
when unsigned long is not long enough, 2016-07-11) papered over this
issue simply by skipping the respective test cases on platforms where
they would fail due to the data type in use.

This quick fix, however, tests for *long* to be 64-bit or not. What we
need, though, is a test that says whether *whatever data type we use for
timestamps* is 64-bit or not.

The same quick fix was used to handle the similar problem where Git's
source code uses `unsigned long` to represent size, instead of `size_t`,
conflating the two issues.

So let's just add another prerequisite to test specifically whether
timestamps are represented by a 64-bit data type or not. Later, after we
switch to a larger data type, we can flip that prerequisite to test
`time_t` instead of `long`.

Signed-off-by: Johannes Schindelin 
---
 t/helper/test-date.c | 5 -
 t/t0006-date.sh  | 4 ++--
 t/t5000-tar-tree.sh  | 6 +++---
 t/test-lib.sh| 2 ++
 4 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/t/helper/test-date.c b/t/helper/test-date.c
index 506054bcd5d..4727bea255c 100644
--- a/t/helper/test-date.c
+++ b/t/helper/test-date.c
@@ -4,7 +4,8 @@ static const char *usage_msg = "\n"
 "  test-date relative [time_t]...\n"
 "  test-date show: [time_t]...\n"
 "  test-date parse [date]...\n"
-"  test-date approxidate [date]...\n";
+"  test-date approxidate [date]...\n"
+"  test-date is64bit\n";
 
 static void show_relative_dates(const char **argv, struct timeval *now)
 {
@@ -93,6 +94,8 @@ int cmd_main(int argc, const char **argv)
parse_dates(argv+1, );
else if (!strcmp(*argv, "approxidate"))
parse_approxidate(argv+1, );
+   else if (!strcmp(*argv, "is64bit"))
+   return sizeof(unsigned long) == 8 ? 0 : 1;
else
usage(usage_msg);
return 0;
diff --git a/t/t0006-date.sh b/t/t0006-date.sh
index c0c910867d7..9539b425ffb 100755
--- a/t/t0006-date.sh
+++ b/t/t0006-date.sh
@@ -53,8 +53,8 @@ check_show unix-local "$TIME" '146600'
 
 # arbitrary time absurdly far in the future
 FUTURE="5758122296 -0400"
-check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" LONG_IS_64BIT
-check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" LONG_IS_64BIT
+check_show iso   "$FUTURE" "2152-06-19 18:24:56 -0400" TIME_IS_64BIT
+check_show iso-local "$FUTURE" "2152-06-19 22:24:56 +" TIME_IS_64BIT
 
 check_parse() {
echo "$1 -> $2" >expect
diff --git a/t/t5000-tar-tree.sh b/t/t5000-tar-tree.sh
index 886b6953e40..997aa9dea28 100755
--- a/t/t5000-tar-tree.sh
+++ b/t/t5000-tar-tree.sh
@@ -390,7 +390,7 @@ test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can 
read our huge size' '
test_cmp expect actual
 '
 
-test_expect_success LONG_IS_64BIT 'set up repository with far-future commit' '
+test_expect_success TIME_IS_64BIT 'set up repository with far-future commit' '
rm -f .git/index &&
echo content >file &&
git add file &&
@@ -398,11 +398,11 @@ test_expect_success LONG_IS_64BIT 'set up repository with 
far-future commit' '
git commit -m "tempori parendum"
 '
 
-test_expect_success LONG_IS_64BIT 'generate tar with future mtime' '
+test_expect_success TIME_IS_64BIT 'generate tar with future mtime' '
git archive HEAD >future.tar
 '
 
-test_expect_success TAR_HUGE,LONG_IS_64BIT 'system tar can read our future 
mtime' '
+test_expect_success TAR_HUGE,TIME_IS_64BIT 'system tar can read our future 
mtime' '
echo 4147 >expect &&
tar_info future.tar | cut -d" " -f2 >actual &&
test_cmp expect actual
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 13b5696822d..beee1d847ff 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -1164,3 +1164,5 @@ build_option () {
 test_lazy_prereq LONG_IS_64BIT '
test 8 -le "$(build_option sizeof-long)"
 '
+
+test_lazy_prereq TIME_IS_64BIT 'test-date is64bit'
-- 
2.12.2.windows.1




Re: Bug in "git am" when the body starts with spaces

2017-04-02 Thread Junio C Hamano
Linus Torvalds  writes:

> On Fri, Mar 31, 2017 at 5:52 PM, Linus Torvalds
>  wrote:
>>
>> The continuation logic is oddly complex, and I can't follow the logic.
>> But it is completely broken in how it thinks empty lines are somehow
>> "continuations".
>
> The attached patch seems to work for me. Comments?

We start at header_stage set to 1, keep skipping empty lines while
in that state, and we stay in that state as long as we see in-body
header (or a continuation of the in-body header we saw earlier).  We
get out of this state when we see a blank line after we are done
with the in-body headers.  Once header_stage is set to 0 with a
blank line, we don't do in-body headers (scissors will roll back the
whole thing and irrelevant to the analysis of correctness).

But you found that "keep skipping empty" done unconditionally is
wrong, because we may have already seen an in-body header and are
trying to see if the line is a continuation (in which case
check_inbody_header() would append to the previous) or another
in-body header (in which case again check_inbody_header() would use
it), or something else (in which case check_inbody_header() will
return false, we get out of header_stage=1 state and process this
line as the first line of the log message.  An empty line we see in
this state must trigger "we are no longer taking in-body header"
logic, too.

And that is exactly your patch does.  The change "feels" correct to
me.


>  mailinfo.c | 7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mailinfo.c b/mailinfo.c
> index a489d9d0f..68037758f 100644
> --- a/mailinfo.c
> +++ b/mailinfo.c
> @@ -757,8 +757,13 @@ static int handle_commit_msg(struct mailinfo *mi, struct 
> strbuf *line)
>   assert(!mi->filter_stage);
>  
>   if (mi->header_stage) {
> - if (!line->len || (line->len == 1 && line->buf[0] == '\n'))
> + if (!line->len || (line->len == 1 && line->buf[0] == '\n')) {
> + if (mi->inbody_header_accum.len) {
> + flush_inbody_header_accum(mi);
> + mi->header_stage = 0;
> + }
>   return 0;
> + }
>   }
>  
>   if (mi->use_inbody_headers && mi->header_stage) {


Re: [PATCH 2/2] [GSOC] show_bisect_vars(): Use unsigned int instead of signed int for flags

2017-04-02 Thread Junio C Hamano
Robert Stanca  writes:

> The question is : If the flags field of the structure is used in
> function calls should i update flags that deep?(there are other
> cases where the field is used in nested calls )

[administrivia: please don't top-post around here].

There won't be any fast and clear rule and you'd need to grow and
use common sense and good taste, but it probably is helpful to go
back and think from the first principle, i.e. why you are doing this
conversion.

For example, in your PATCH 2/2 that we are discussing, you updated
the local variable "flags" to unsigned in show_bisect_vars(),
because it receives the value from "info->flags", which is becoming
an "unsigned" because it is a collection of independent bits.

The function uses this "flags" (now unsigned) twice, and one is to
pass it to filter_skipped() as its 3rd parameter.  This helper
function takes "int", but you didn't update it to "unsigned".  And
you made the right decision to stop there.

The reason why it is the right place to stop is because the function
does not use its 3rd parameter as a collection of bits; it wants its
callers to give Yes/No there--anything non-zero is yes.  Because you
know "flags & BISECT_SHOW_ALL", which is unsigned, would be passed
as a non-zero "int" to filter_skipped(), iff flags has that bit set,
you know you do not have to touch that function and you stop there.

There will be similar places in the callchain that stop propagating
the "collection-of-independent-bits"-ness.  And that is where you
would stop--because beyond that point there is no "arrgh, we use
signed int to represent a collection of bits?" problem, which is
what you are cleaning up.




Re: [PATCH] contrib/git-resurrect.sh: do not write \t for HT in sed scripts

2017-04-02 Thread Junio C Hamano
Jakub Narębski  writes:

> W dniu 01.04.2017 o 06:14, Junio C Hamano pisze:
>
>> Just like we did in 0d1d6e50 ("t/t7003: replace \t with literal tab
>> in sed expression", 2010-08-12), avoid writing "\t" for HT in sed
>> scripts, which is not portable.
>
> I guess it is not possible to use HT variable (similar to LF that we
> have in t/test-lib.sh, and which is defined by some scripts) set to
> literal tab
>
>   HT=""
>
> wouldn't work there, is it?
>
> Using this variable would make literal tab character more visible.

Patches welcome.


Re: [PATCH v2 00/20] Separate `ref_cache` into a separate module

2017-04-02 Thread Junio C Hamano
Ramsay Jones  writes:

>> In that sense, Michael's series and Duy's later two series are
>> "tangled" (i.e. shares some common commits that are still not in
>> 'master').  If nd/files-backend-git-dir that is shared among them is
>> ever rebased, all of them need to be rebased on top of it
>> consistently.
>> 
>> But if either of nd/prune-in-worktree and nd/worktree-kill-parse-ref
>> needs to be rerolled, that can be done independently from Michael's
>> series, as long as nd/files-backend-git-dir is solid and unchanging.
>
> I suspected as much (hence the 'maybe not now' comment), but I noticed
> that all four branches were merged into 'jch'. So, it seemed possible
> to me that they were all being considered for merging into next.

Yes they were (and still are, but I do not expect they will be
moving to 'next' while I am offline ;-).  What you can do to help is
to review and say things like "nd/files-backend-git-dir and
Michael's one on top of it looked good, no opinion on others",
"the others have this and that issues that need a reroll", etc.

Thanks.



Re: [PATCH v2 00/20] Separate `ref_cache` into a separate module

2017-04-02 Thread Ramsay Jones


On 02/04/17 04:38, Junio C Hamano wrote:
> Ramsay Jones  writes:
> 
>>> I am getting the impression that the files-backend thing as well as
>>> this topic are ready for 'next'.  Please stop me if I missed something
>>> in these topics (especially the other one) that needs updating
>>> before that happens.
>>
>> Hmm, are these branches 'tangled' with nd/prune-in-worktree?
>> (I think they were at one point, but maybe not now?).
> 
> Michael's mh/separate-ref-cache builds directly on top of
> nd/files-backend-git-dir topic.
> 
> nd/prune-in-worktree builds directly on top of
> nd/worktree-kill-parse-ref, which in turn builds directly on top of
> nd/files-backend-git-dir.
> 
> In that sense, Michael's series and Duy's later two series are
> "tangled" (i.e. shares some common commits that are still not in
> 'master').  If nd/files-backend-git-dir that is shared among them is
> ever rebased, all of them need to be rebased on top of it
> consistently.
> 
> But if either of nd/prune-in-worktree and nd/worktree-kill-parse-ref
> needs to be rerolled, that can be done independently from Michael's
> series, as long as nd/files-backend-git-dir is solid and unchanging.

I suspected as much (hence the 'maybe not now' comment), but I noticed
that all four branches were merged into 'jch'. So, it seemed possible
to me that they were all being considered for merging into next.

Thanks!

ATB,
Ramsay Jones




Re: [PATCH 2/2] [GSOC] show_bisect_vars(): Use unsigned int instead of signed int for flags

2017-04-02 Thread Robert Stanca
One more quesestion regarding flags used in structures :
for example: update_callback_data has a flags field (type int) ,
in function void update_callback()  the field flags of that structure
is used as param for add_file_to_index(..., data->flags)
and this function is define as: int add_file_to_index(..., int flags)
in read-cache.c
The question is : If the flags field of the structure is used in
function calls should i update flags that deep?(there are other
cases where the field is used in nested calls )

On Sun, Apr 2, 2017 at 6:30 AM, Junio C Hamano  wrote:
> Robert Stanca  writes:
>
>> I am used to 1change per patch so it's easier  to redo specific
>> patches...if there are small changes(10 lines max) can i send them as
>> 1 patch?
>
> It's not number of lines.  One line per patch does not make sense if
> you need to make corresponding changes to two places, one line each,
> in order to make the end result consistent.  If you change a type of
> a structure field, and that field is assigned to a variable
> somewhere, you would change the type of both that field and the
> variable that receives its value at the same time in a single
> commit, as that would be the logical unit of a smallest change that
> still makes sense (i.e. either half of that change alone would not
> make sense).
>
>
>


Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows

2017-04-02 Thread Jakub Narębski
W dniu 02.04.2017 o 09:45, Jeff King pisze:
> On Sat, Apr 01, 2017 at 08:31:27PM +0200, Jakub Narębski wrote:
> 
>> W dniu 01.04.2017 o 08:08, Jeff King pisze:
>>> On Fri, Mar 31, 2017 at 03:24:48PM +0200, Jakub Narębski wrote:
>>>
> I suspect in the normal case that git is doing line-ending conversion,
> but it's suppressed when textconv is in use.

 I would not consider this a bug if not for the fact that there is no ^M
 without using iconv as textconv.
>>>
>>> I don't think it's a bug, though. You have told Git that you will
>>> convert the contents (whatever their format) into the canonical format,
>>> but your program to do so includes a CR.
>>
>> Well, I have not declared file binary with "binary = true" in diff driver
>> definition, isn't it?
> 
> I don't think binary has anything to do with it. A textconv filter takes
> input (binary or not) and delivers a normalized representation to feed
> to the diff algorithm. There's no further post-processing, and it's the
> responsibility of the filter to deliver the bytes it wants diffed.
> 
> Like I said, I could see an argument for treating the filter output as
> text to be pre-processed, but that's not how it works (and I don't think
> it is a good idea to change it now, unless by adding an option to the
> diff filter).

I think that actually there is something wrong.

If textconv really gets normalized representation of pre-image (the index
version) and post-image (the filesystem version), as it should I think,
both pre-image lines ('-') and post-image lines ('+') should use CRLF,
so there should be no warning, i.e. ^M

Or textconv filter gets normalized representation (it looks this way
when examining diff result saved to file with `git diff test.tex >test.diff`;
I were unable to use `tr '\r' 'Q', either I got "fatal: bad config line"
from Git, or "tr: extra operand" from tr), and somehow Git mistakes
what is happening and writes those ^M.

If I understand it correctly, if pre-image, post-image and context
all use the same eol, there should be no warning, isn't it?

> 
>> P.S. What do you think about Git supporting 'encoding' attribute (or
>> 'core.encoding' config) plus 'core.outputEncoding' in-core?
> 
> Supporting an "encoding" attribute to normalize file encodings in diffs
> seems reasonable to me. But it would have to be enabled only for
> human-readable diffs, as the result could not be applied (so the same as
> textconv).

I was thinking about human readable diffs, and 'git show ', same
as with textconv.

> 
> I don't think core.outputEncoding is necessarily a good idea. We are not
> really equipped anything that isn't an ascii superset, as we intermingle
> the bytes with ascii diff headers (though I think that is true of the
> commitEncoding stuff; I assume everything breaks horribly if you tried
> to set that to UTF-16, but I've never tried it).

Well, the understanding would be that the same limitation as for 
core.logOutputEncoding (documented if it isn't) that only encodings that
are ASCII compatibile are supported.
 
-- 
Jakub Narębski



Re: [PATCH] contrib/git-resurrect.sh: do not write \t for HT in sed scripts

2017-04-02 Thread Jakub Narębski
W dniu 01.04.2017 o 06:14, Junio C Hamano pisze:

> Just like we did in 0d1d6e50 ("t/t7003: replace \t with literal tab
> in sed expression", 2010-08-12), avoid writing "\t" for HT in sed
> scripts, which is not portable.

I guess it is not possible to use HT variable (similar to LF that we
have in t/test-lib.sh, and which is defined by some scripts) set to
literal tab

  HT="  "

wouldn't work there, is it?

Using this variable would make literal tab character more visible.
-- 
Jakub Narębski



Re: Bug: 'git config --local user.email=' fails silently?

2017-04-02 Thread Knut Omang
On Sun, 2017-04-02 at 03:38 -0400, Jeff King wrote:
> On Sun, Apr 02, 2017 at 07:47:23AM +0200, Knut Omang wrote:
> 
> > From the documentation I would have expected 
> > 
> > git config --local user.email=alt.email@alt.domain
> > 
> > to create a section 
> > 
> > [user]
> > email=alt.email@alt.domain
> > 
> > in the local .git/config.
> 
> When it sees one argument, git-config treats that argument as a key to
> be retrieved. When given two, the second is a value to be set. E.g.:
> 
>   $ git config foo.bar
>   $ git config foo.bar some-value
>   $ git config foo.bar
>   some-value
> 
> So your command was interpreted as a request to fetch the value, which
> doesn't exist.
> 
> > Instead it returns status 1 with no error message.
> 
> Hopefully that explains the response you saw; we do not emit an error
> message when a key isn't found, which makes it easy for scripts to do
> things like:
> 
>   value=$(git config foo.bar || echo default-value)
> 
> without being unnecessarily noisy.
> 
> Usually we'd catch an error like yours and complain, because the key is
> syntactically invalid ("=" is not generally allowed in key names):
> 
>   $ git config foo.bar=some-value
>   error: invalid key: foo.bar=some-value
> 
> But your argument actually _is_ a syntactically valid key, because of
> the dots. In a three-level key like "one.two.three", the second level
> subsection is allowed to contain any character (including "=" and more
> dots). So your "user.email=alt.email@alt.domain" tries to look up the
> config represented by:
> 
>   [user "email=alt.email@alt"]
>   domain
> 
> Which of course did not exist.
> 
> > Is this intentional?
> 
> Yes, everything is working as intended. The documentation in
> git-config(1) seems to be quite poor at describing the various operating
> modes, though.

Ah - I see! 
Thanks for the quick answer and excellent explanation,
and sorry for the confusion - I should know well that config takes
the write argument after a blank.

I think I'll go and get myself another cup of coffee 
before I ask more questions anywhere...

Regards,
Knut

> 
> -Peff


Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows

2017-04-02 Thread Jeff King
On Sat, Apr 01, 2017 at 08:31:27PM +0200, Jakub Narębski wrote:

> W dniu 01.04.2017 o 08:08, Jeff King pisze:
> > On Fri, Mar 31, 2017 at 03:24:48PM +0200, Jakub Narębski wrote:
> > 
> >>> I suspect in the normal case that git is doing line-ending conversion,
> >>> but it's suppressed when textconv is in use.
> >>
> >> I would not consider this a bug if not for the fact that there is no ^M
> >> without using iconv as textconv.
> > 
> > I don't think it's a bug, though. You have told Git that you will
> > convert the contents (whatever their format) into the canonical format,
> > but your program to do so includes a CR.
> 
> Well, I have not declared file binary with "binary = true" in diff driver
> definition, isn't it?

I don't think binary has anything to do with it. A textconv filter takes
input (binary or not) and delivers a normalized representation to feed
to the diff algorithm. There's no further post-processing, and it's the
responsibility of the filter to deliver the bytes it wants diffed.

Like I said, I could see an argument for treating the filter output as
text to be pre-processed, but that's not how it works (and I don't think
it is a good idea to change it now, unless by adding an option to the
diff filter).

> P.S. What do you think about Git supporting 'encoding' attribute (or
> 'core.encoding' config) plus 'core.outputEncoding' in-core?

Supporting an "encoding" attribute to normalize file encodings in diffs
seems reasonable to me. But it would have to be enabled only for
human-readable diffs, as the result could not be applied (so the same as
textconv).

I don't think core.outputEncoding is necessarily a good idea. We are not
really equipped anything that isn't an ascii superset, as we intermingle
the bytes with ascii diff headers (though I think that is true of the
commitEncoding stuff; I assume everything breaks horribly if you tried
to set that to UTF-16, but I've never tried it).

-Peff


Re: Bug: 'git config --local user.email=' fails silently?

2017-04-02 Thread Jeff King
On Sun, Apr 02, 2017 at 07:47:23AM +0200, Knut Omang wrote:

> From the documentation I would have expected 
> 
> git config --local user.email=alt.email@alt.domain
> 
> to create a section 
> 
> [user]
>   email=alt.email@alt.domain
> 
> in the local .git/config.

When it sees one argument, git-config treats that argument as a key to
be retrieved. When given two, the second is a value to be set. E.g.:

  $ git config foo.bar
  $ git config foo.bar some-value
  $ git config foo.bar
  some-value

So your command was interpreted as a request to fetch the value, which
doesn't exist.

> Instead it returns status 1 with no error message.

Hopefully that explains the response you saw; we do not emit an error
message when a key isn't found, which makes it easy for scripts to do
things like:

  value=$(git config foo.bar || echo default-value)

without being unnecessarily noisy.

Usually we'd catch an error like yours and complain, because the key is
syntactically invalid ("=" is not generally allowed in key names):

  $ git config foo.bar=some-value
  error: invalid key: foo.bar=some-value

But your argument actually _is_ a syntactically valid key, because of
the dots. In a three-level key like "one.two.three", the second level
subsection is allowed to contain any character (including "=" and more
dots). So your "user.email=alt.email@alt.domain" tries to look up the
config represented by:

  [user "email=alt.email@alt"]
  domain

Which of course did not exist.

> Is this intentional?

Yes, everything is working as intended. The documentation in
git-config(1) seems to be quite poor at describing the various operating
modes, though.

-Peff