Re: [PATCH v3 0/3] Multiple worktrees vs. submodules fixes

2014-12-08 Thread Max Kirillov
On Tue, Dec 09, 2014 at 06:44:40AM +0200, Max Kirillov wrote:
> After discussion I ended up with basically the same series as v1.
> 
> * Resubmitting the 2 patches which were not taken into the worktrees reroll;
>   they fix a visible issue. Mostly unchanged except for a small cleanup in the test.
> * Added GIT_COMMON_DIR to local_repo_env. While it is obviously the right
>   thing to do, I wasn't able to observe any change in behavior.
> 
> Max Kirillov (3):
>   submodule refactor: use git_path_submodule() in add_submodule_odb()
>   path: implement common_dir handling in git_path_submodule()
>   Add GIT_COMMON_DIR to local_repo_env
> 
>  cache.h  |  1 +
>  environment.c|  1 +
>  path.c   | 24 
>  setup.c  | 17 -
>  submodule.c  | 28 ++--
>  t/t7410-submodule-checkout-to.sh | 10 ++
>  6 files changed, 54 insertions(+), 27 deletions(-)
> 
> -- 
> 2.2.0.50.gb2b6831
>

This should be applied on top of
http://thread.gmane.org/gmane.comp.version-control.git/260387
with _all_ patches included; currently that is df56607dff.

-- 
Max
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 2/3] path: implement common_dir handling in git_path_submodule()

2014-12-08 Thread Max Kirillov
This allows submodules to be linked workdirs.

The handling is the same as for .git, except that the GIT_COMMON_DIR
environment variable is ignored: it names the common directory of the
parent repository and does not make sense for a submodule.

Also add a test for functionality which uses this call.

Signed-off-by: Max Kirillov 
---
 cache.h  |  1 +
 path.c   | 24 
 setup.c  | 17 -
 t/t7410-submodule-checkout-to.sh | 10 ++
 4 files changed, 43 insertions(+), 9 deletions(-)

diff --git a/cache.h b/cache.h
index 3f60a11..e8f465a 100644
--- a/cache.h
+++ b/cache.h
@@ -437,6 +437,7 @@ extern char *get_object_directory(void);
 extern char *get_index_file(void);
 extern char *get_graft_file(void);
 extern int set_git_dir(const char *path);
+extern int get_common_dir_noenv(struct strbuf *sb, const char *gitdir);
 extern int get_common_dir(struct strbuf *sb, const char *gitdir);
 extern const char *get_git_namespace(void);
 extern const char *strip_namespace(const char *namespaced_ref);
diff --git a/path.c b/path.c
index a5c51a3..78f718f 100644
--- a/path.c
+++ b/path.c
@@ -98,7 +98,7 @@ static const char *common_list[] = {
NULL
 };
 
-static void update_common_dir(struct strbuf *buf, int git_dir_len)
+static void update_common_dir(struct strbuf *buf, int git_dir_len, const char* common_dir)
 {
char *base = buf->buf + git_dir_len;
const char **p;
@@ -115,12 +115,17 @@ static void update_common_dir(struct strbuf *buf, int git_dir_len)
path++;
is_dir = 1;
}
+
+   if (!common_dir) {
+   common_dir = get_git_common_dir();
+   }
+
if (is_dir && dir_prefix(base, path)) {
-   replace_dir(buf, git_dir_len, get_git_common_dir());
+   replace_dir(buf, git_dir_len, common_dir);
return;
}
if (!is_dir && !strcmp(base, path)) {
-   replace_dir(buf, git_dir_len, get_git_common_dir());
+   replace_dir(buf, git_dir_len, common_dir);
return;
}
}
@@ -160,7 +165,7 @@ static void adjust_git_path(struct strbuf *buf, int git_dir_len)
else if (git_db_env && dir_prefix(base, "objects"))
replace_dir(buf, git_dir_len + 7, get_object_directory());
else if (git_common_dir_env)
-   update_common_dir(buf, git_dir_len);
+   update_common_dir(buf, git_dir_len, NULL);
 }
 
 static void do_git_path(struct strbuf *buf, const char *fmt, va_list args)
@@ -256,6 +261,8 @@ const char *git_path_submodule(const char *path, const char *fmt, ...)
 {
struct strbuf *buf = get_pathname();
const char *git_dir;
+   struct strbuf git_submodule_common_dir = STRBUF_INIT;
+   struct strbuf git_submodule_dir = STRBUF_INIT;
va_list args;
 
strbuf_addstr(buf, path);
@@ -269,11 +276,20 @@ const char *git_path_submodule(const char *path, const char *fmt, ...)
strbuf_addstr(buf, git_dir);
}
strbuf_addch(buf, '/');
+   strbuf_addstr(&git_submodule_dir, buf->buf);
 
va_start(args, fmt);
strbuf_vaddf(buf, fmt, args);
va_end(args);
+
+   if (get_common_dir_noenv(&git_submodule_common_dir, git_submodule_dir.buf)) {
+   update_common_dir(buf, git_submodule_dir.len, git_submodule_common_dir.buf);
+   }
+
strbuf_cleanup_path(buf);
+
+   strbuf_release(&git_submodule_dir);
+   strbuf_release(&git_submodule_common_dir);
return buf->buf;
 }
 
diff --git a/setup.c b/setup.c
index 05a8955..45e90c4 100644
--- a/setup.c
+++ b/setup.c
@@ -226,14 +226,21 @@ void verify_non_filename(const char *prefix, const char *arg)
 
 int get_common_dir(struct strbuf *sb, const char *gitdir)
 {
+   const char *git_env_common_dir = getenv(GIT_COMMON_DIR_ENVIRONMENT);
+   if (git_env_common_dir) {
+   strbuf_addstr(sb, git_env_common_dir);
+   return 1;
+   } else {
+   return get_common_dir_noenv(sb, gitdir);
+   }
+}
+
+int get_common_dir_noenv(struct strbuf *sb, const char *gitdir)
+{
struct strbuf data = STRBUF_INIT;
struct strbuf path = STRBUF_INIT;
-   const char *git_common_dir = getenv(GIT_COMMON_DIR_ENVIRONMENT);
int ret = 0;
-   if (git_common_dir) {
-   strbuf_addstr(sb, git_common_dir);
-   return 1;
-   }
+
strbuf_addf(&path, "%s/commondir", gitdir);
if (file_exists(path.buf)) {
if (strbuf_read_file(&data, path.buf, 0) <= 0)
diff --git a/t/t7410-submodule-checkout-to.sh b/t/t7410-submodule-checkout-to.sh
index 8f30aed..b43391a 100755
--- a/t/t7410-submodule-checkout-to.sh
+++ b/t/t7410-submodule-checkout-to.sh
@@ -47,4 +47,
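
The split between get_common_dir() and get_common_dir_noenv() in the patch above can be illustrated with a small standalone sketch. This is not git's code: resolve_common_dir() and its parameters are hypothetical, the environment value and the contents of the commondir file are passed in as plain strings so the logic stays self-contained, and treating an empty commondir file as "absent" is an assumption made only for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not git's API: choose the common directory for a
 * repository whose git dir is `gitdir`.
 *   env_value      - value of GIT_COMMON_DIR, or NULL to ignore the
 *                    environment (the "noenv" case used for submodules)
 *   commondir_file - contents of <gitdir>/commondir, or NULL if absent
 * Returns 1 if a separate common dir was found, 0 if gitdir itself is used.
 */
static int resolve_common_dir(char *out, size_t outlen, const char *gitdir,
			      const char *env_value, const char *commondir_file)
{
	size_t len;

	assert(out && outlen > 0 && gitdir);

	/* The environment wins when the caller chooses to honor it. */
	if (env_value) {
		snprintf(out, outlen, "%s", env_value);
		return 1;
	}

	/* Drop trailing line terminators from the file contents. */
	len = commondir_file ? strlen(commondir_file) : 0;
	while (len > 0 && (commondir_file[len - 1] == '\n' ||
			   commondir_file[len - 1] == '\r'))
		len--;

	/* No usable commondir file: the git dir is its own common dir. */
	if (len == 0) {
		snprintf(out, outlen, "%s", gitdir);
		return 0;
	}

	/* A relative path in the file is taken relative to the git dir. */
	if (commondir_file[0] == '/')
		snprintf(out, outlen, "%.*s", (int)len, commondir_file);
	else
		snprintf(out, outlen, "%s/%.*s", gitdir, (int)len, commondir_file);
	return 1;
}
```

For a submodule the caller would pass NULL as env_value, so that a GIT_COMMON_DIR set for the superproject never leaks into the submodule's path computation.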

[PATCH v3 3/3] Add GIT_COMMON_DIR to local_repo_env

2014-12-08 Thread Max Kirillov
This is obviously the right thing to do, because a submodule repository
does not use the common directory of its super repository.

Suggested-by: Jens Lehmann 
Signed-off-by: Max Kirillov 
---
 environment.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/environment.c b/environment.c
index 8351007..85ce3c4 100644
--- a/environment.c
+++ b/environment.c
@@ -94,6 +94,7 @@ const char * const local_repo_env[] = {
CONFIG_DATA_ENVIRONMENT,
DB_ENVIRONMENT,
GIT_DIR_ENVIRONMENT,
+   GIT_COMMON_DIR_ENVIRONMENT,
GIT_WORK_TREE_ENVIRONMENT,
GIT_IMPLICIT_WORK_TREE_ENVIRONMENT,
GRAFT_ENVIRONMENT,
-- 
2.2.0.50.gb2b6831



[PATCH v3 1/3] submodule refactor: use git_path_submodule() in add_submodule_odb()

2014-12-08 Thread Max Kirillov
Signed-off-by: Max Kirillov 
---
 submodule.c | 28 ++--
 1 file changed, 10 insertions(+), 18 deletions(-)

diff --git a/submodule.c b/submodule.c
index 34094f5..4aad3d4 100644
--- a/submodule.c
+++ b/submodule.c
@@ -122,43 +122,35 @@ void stage_updated_gitmodules(void)
 
 static int add_submodule_odb(const char *path)
 {
-   struct strbuf objects_directory = STRBUF_INIT;
struct alternate_object_database *alt_odb;
+   const char* objects_directory;
int ret = 0;
-   const char *git_dir;
 
-   strbuf_addf(&objects_directory, "%s/.git", path);
-   git_dir = read_gitfile(objects_directory.buf);
-   if (git_dir) {
-   strbuf_reset(&objects_directory);
-   strbuf_addstr(&objects_directory, git_dir);
-   }
-   strbuf_addstr(&objects_directory, "/objects/");
-   if (!is_directory(objects_directory.buf)) {
+   objects_directory = git_path_submodule(path, "objects/");
+   if (!is_directory(objects_directory)) {
ret = -1;
goto done;
}
+
/* avoid adding it twice */
for (alt_odb = alt_odb_list; alt_odb; alt_odb = alt_odb->next)
-   if (alt_odb->name - alt_odb->base == objects_directory.len &&
-   !strncmp(alt_odb->base, objects_directory.buf,
-   objects_directory.len))
+   if (alt_odb->name - alt_odb->base == strlen(objects_directory) &&
+   !strcmp(alt_odb->base, objects_directory))
goto done;
 
-   alt_odb = xmalloc(objects_directory.len + 42 + sizeof(*alt_odb));
+   alt_odb = xmalloc(strlen(objects_directory) + 42 + sizeof(*alt_odb));
alt_odb->next = alt_odb_list;
-   strcpy(alt_odb->base, objects_directory.buf);
-   alt_odb->name = alt_odb->base + objects_directory.len;
+   strcpy(alt_odb->base, objects_directory);
+   alt_odb->name = alt_odb->base + strlen(objects_directory);
alt_odb->name[2] = '/';
alt_odb->name[40] = '\0';
alt_odb->name[41] = '\0';
alt_odb_list = alt_odb;
 
/* add possible alternates from the submodule */
-   read_info_alternates(objects_directory.buf, 0);
+   read_info_alternates(objects_directory, 0);
prepare_alt_odb();
 done:
-   strbuf_release(&objects_directory);
return ret;
 }
 
-- 
2.2.0.50.gb2b6831



[PATCH v3 0/3] Multiple worktrees vs. submodules fixes

2014-12-08 Thread Max Kirillov
After discussion I ended up with basically the same series as v1.

* Resubmitting the 2 patches which were not taken into the worktrees reroll;
  they fix a visible issue. Mostly unchanged except for a small cleanup in the test.
* Added GIT_COMMON_DIR to local_repo_env. While it is obviously the right
  thing to do, I wasn't able to observe any change in behavior.

Max Kirillov (3):
  submodule refactor: use git_path_submodule() in add_submodule_odb()
  path: implement common_dir handling in git_path_submodule()
  Add GIT_COMMON_DIR to local_repo_env

 cache.h  |  1 +
 environment.c|  1 +
 path.c   | 24 
 setup.c  | 17 -
 submodule.c  | 28 ++--
 t/t7410-submodule-checkout-to.sh | 10 ++
 6 files changed, 54 insertions(+), 27 deletions(-)

-- 
2.2.0.50.gb2b6831



Re: [PATCH v2] doc: make clear --assume-unchanged's user contract

2014-12-08 Thread Sérgio Basto
On Sat, 2014-12-06 at 15:04 +0000, Philip Oakley wrote:
> Many users misunderstand the --assume-unchanged contract, believing
> it means Git won't look at the flagged file.
> 
> Be explicit that the --assume-unchanged contract is by the user that
> they will NOT change the file so that Git does not need to look (and
> expend, for example, lstat(2) cycles)
> 
> Mentioning "Git stops checking" does not help the reader, as it is
> only one possible consequence of what that assumption allows Git to
> do, but
> 
>(1) there are things other than "stop checking" that Git can do
>based on that assumption; and
>(2) Git is not obliged to stop checking; it merely is allowed to.
> 
> Also, this is a single flag bit, correct the plural to singular, and
> the verb, accordingly.
> 
> Drop the stale and incorrect information about "poor-man's ignore",
> which is not what this flag bit is about at all.
> 
> Signed-off-by: Philip Oakley 
> ---
>  Documentation/git-update-index.txt | 18 --
>  1 file changed, 8 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
> index e0a8702..da1ccbc 100644
> --- a/Documentation/git-update-index.txt
> +++ b/Documentation/git-update-index.txt
> @@ -78,20 +78,18 @@ OPTIONS
>  Set the execute permissions on the updated files.
>  
>  --[no-]assume-unchanged::
> - When these flags are specified, the object names recorded
> - for the paths are not updated.  Instead, these options
> - set and unset the "assume unchanged" bit for the
> - paths.  When the "assume unchanged" bit is on, Git stops
> - checking the working tree files for possible
> - modifications, so you need to manually unset the bit to
> - tell Git when you change the working tree file. This is
> + When this flag is specified, the object names recorded
> + for the paths are not updated.  Instead, this option
> + sets/unsets the "assume unchanged" bit for the
> + paths.  When the "assume unchanged" bit is on, the user
> + promises not to change the file and allows Git to assume
> + that the working tree file matches what is recorded in
> + the index.  If you want to change the working tree file,
> + you need to unset the bit to tell Git.  This is
>   sometimes helpful when working with a big project on a
>   filesystem that has very slow lstat(2) system call
>   (e.g. cifs).
>  +
> -This option can be also used as a coarse file-level mechanism
> -to ignore uncommitted changes in tracked files (akin to what
> -`.gitignore` does for untracked files).
>  Git will fail (gracefully) in case it needs to modify this file
>  in the index e.g. when merging in a commit;
>  thus, in case the assumed-untracked file is changed upstream,

I don't understand why you insist that we have a contract, when
"git diff .", "git diff -a" and "git commit -a" behave differently
from "git commit .". This is not about any contract; it is about
coherence and being user friendly.

At least, if you want to keep things like that, state clearly in the doc
that the assume-unchanged flag *is not* meant to make Git ignore changes
in tracked files, that it currently does not ignore files for git commit  and
that it may not work in other cases.

I also don't understand why --assumed-untracked shouldn't deal with
changed files, instead of falling back on "the user promises not to
change the file", which sometimes works and sometimes does not.

Also, if this is the contract, then when a file differs from the commit,
Git should warn the user that the contract has been broken (files that
are assumed-untracked were modified).


Thanks, 
-- 
Sérgio M. B.



Re: fast-import should not care about core.ignorecase

2014-12-08 Thread Joshua Jensen

Jonathan Nieder wrote on 12/8/2014 6:31 PM:
> Joshua Jensen wrote:
>
>> I think it has been discussed before, but maybe Git needs a
>> core.casefold in addition to core.ignorecase.)
>
> Would it work for --casefold to be a commandline flag to fast-import,
> instead of a global option affecting multiple Git commands?

Given that core.ignorecase=true means to fold filename case in quite a 
number of places within Git right now, I would expect the same behavior 
within a repository where fast-import is being run against 
core.ignorecase=true.


So, I don't know what core.ignorecase should mean, but I'm pretty sure I 
know what core.foldcase should mean.


Would --casefold work?  Sure, but it would be a special case against the 
existing core.ignorecase behavior that I don't think makes much sense.


Josh


Re: [PATCH] commit: ignore assume-unchanged files in "commit " mode

2014-12-08 Thread Sérgio Basto
On Fri, 2014-12-05 at 17:56 +0700, Nguyễn Thái Ngọc Duy wrote:
> In the same spirit of 7fce6e3 (commit: correctly respect skip-worktree
> bit - 2009-12-14), if a file is marked unchanged, skip it.
> 
> Noticed-by: Sérgio Basto 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  builtin/commit.c |  2 +-
>  t/t2106-update-index-assume-unchanged.sh | 10 ++
>  2 files changed, 11 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/commit.c b/builtin/commit.c
> index e108c53..ee3de12 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -252,7 +252,7 @@ static int list_paths(struct string_list *list, const char *with_tree,
>   if (!ce_path_match(ce, pattern, m))
>   continue;
>   item = string_list_insert(list, ce->name);
> - if (ce_skip_worktree(ce))
> + if (ce->ce_flags & (CE_VALID | CE_SKIP_WORKTREE))
>   item->util = item; /* better a valid pointer than a fake one */
>   }
>  
> diff --git a/t/t2106-update-index-assume-unchanged.sh b/t/t2106-update-index-assume-unchanged.sh
> index 99d858c..dc332f5 100755
> --- a/t/t2106-update-index-assume-unchanged.sh
> +++ b/t/t2106-update-index-assume-unchanged.sh
> @@ -21,4 +21,14 @@ test_expect_success 'do not switch branches with dirty file' \
>git update-index --assume-unchanged file &&
>test_must_fail git checkout master'
>  
> +test_expect_success 'commit  ignore assume-unchanged files' '
> + : >anotherfile &&
> + git add anotherfile &&
> + echo dirty >anotherfile &&
> + git commit -m one -- file anotherfile &&
> + git diff --name-only HEAD^ HEAD >actual &&
> + echo anotherfile >expected &&
> + test_cmp expected actual
> +'
> +
>  test_done


Works great, many thanks,

-- 
Sérgio M. B.
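
The one-line change in list_paths() quoted above widens a flag test from one bit to a mask of two. A minimal sketch of that difference follows. The struct, the helper names and the DEMO_ bit values are stand-ins chosen for this example and should not be read as git's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative bit values standing in for CE_VALID / CE_SKIP_WORKTREE. */
#define DEMO_CE_VALID         (1u << 15)
#define DEMO_CE_SKIP_WORKTREE (1u << 30)

/* Hypothetical, trimmed-down index entry. */
struct demo_entry {
	const char *name;
	unsigned int flags;
};

/* Before the patch: only skip-worktree entries are left out of the commit. */
static int skipped_before_patch(const struct demo_entry *ce)
{
	assert(ce != NULL);
	return (ce->flags & DEMO_CE_SKIP_WORKTREE) != 0;
}

/* After the patch: either bit being set is enough to skip the entry. */
static int skipped_after_patch(const struct demo_entry *ce)
{
	assert(ce != NULL);
	return (ce->flags & (DEMO_CE_VALID | DEMO_CE_SKIP_WORKTREE)) != 0;
}
```

With these helpers, an entry carrying only the assume-unchanged bit is committed by the old test and skipped by the new one, which matches what the added test in t2106 checks for "file" versus "anotherfile".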



Re: fast-import should not care about core.ignorecase

2014-12-08 Thread Jonathan Nieder
Joshua Jensen wrote:

> I think it has been discussed before, but maybe Git needs a
> core.casefold in addition to core.ignorecase.)

Would it work for --casefold to be a commandline flag to fast-import,
instead of a global option affecting multiple Git commands?

Curious,
Jonathan


Re: fast-import should not care about core.ignorecase

2014-12-08 Thread Joshua Jensen

Mike Hommey wrote on 12/8/2014 5:12 PM:
> While it makes sense for checkouts and local commits, it doesn't make
> sense to me that using git fast-import with the same import script would
> have a different behavior depending on whether the file system is
> case-sensitive or not.

I have used fast-import with Perforce inputs.  When you run a Windows 
Perforce server, filenames can be submitted with ANY case, but given the 
case insensitive nature of the file system, a synced Perforce file will 
end up using whatever case happens to be on the file system at that point.


That may not be clear, so here goes:

Revision 1: abc/DEF/ghi/FILE.dat

Revision 2: ABC/def/GHI/file.dat

^^ Yes, Perforce stores the filename internally in that manner and does 
not fold the case.


If you happen to sync Revision 2 on an empty directory tree, you'll get 
ABC/def/GHI/file.dat.  If you then sync Revision 1, the filename case 
remains ABC/def/GHI/file.dat.


Likewise, if you happen to sync Revision 1 into an empty directory tree, 
you'll get abc/DEF/ghi/FILE.dat.  If you then sync Revision 2, the 
filename case remains as abc/DEF/ghi/FILE.dat.


I was the one who originally submitted the patch for this some 4 years 
ago.  It was commit 50906e04e8f48215b0b09841686709b92a2ab2e4. 'git 
fast-import' with core.ignorecase=true will fold the case of the 
filename specified in Revision 2 to the case currently stored in the Git 
repository from Revision 1.


If it does not do this, then Git internally stores FILE.dat and 
file.dat, and bad things happen on case-insensitive file systems.
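
The folding behavior described above, where an incoming name takes on the case of a name that is already stored, can be sketched in a few lines. This is only an illustration under simplifying assumptions: fold_to_existing() is a hypothetical helper, it compares whole paths with ASCII-only case folding, and it does a linear scan, whereas a real implementation would more likely use a hash of names and handle each directory component.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* ASCII-only, case-insensitive comparison; returns 0 when equal. */
static int ascii_casecmp(const char *a, const char *b)
{
	while (*a && *b) {
		int ca = tolower((unsigned char)*a);
		int cb = tolower((unsigned char)*b);
		if (ca != cb)
			return ca - cb;
		a++;
		b++;
	}
	return tolower((unsigned char)*a) - tolower((unsigned char)*b);
}

/*
 * Hypothetical helper: if `path` matches an already known name when case
 * is ignored, return the known spelling; otherwise return `path` itself.
 */
static const char *fold_to_existing(const char *const *known, size_t n,
				    const char *path)
{
	size_t i;

	assert(known != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (!ascii_casecmp(known[i], path))
			return known[i];
	return path;
}
```

Applied to the Perforce example, importing Revision 2 after Revision 1 would store the path under the Revision 1 spelling, so only one of the two spellings ever reaches the repository.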


(Further, there are still a few paths into Git where 
core.ignorecase=true does not fold the case of the filename, and this 
can cause 'repository corruptions' on case-insensitive file systems.  
One such place is in 'git update-index' directly used by 'git gui'.  I 
really need to get this submitted, as we've been beating on it for a 
long time now, but here is the partial patch for informational purposes 
only.


I think it has been discussed before, but maybe Git needs a 
core.casefold in addition to core.ignorecase.)


-Josh

 builtin/update-index.c 


index aaa6f78..4cfedc1 100644
@@ -99,6 +99,7 @@ static int add_one_path(const struct cache_entry *old, const char *path, int len

 memcpy(ce->name, path, len);
 ce->ce_flags = create_ce_flags(0);
 ce->ce_namelen = len;
+fold_ce_name_case(&the_index, ce);
 fill_stat_cache_info(ce, st);
 ce->ce_mode = ce_mode_from_stat(old, st->st_mode);

@@ -234,6 +235,7 @@ static int add_cacheinfo(unsigned int mode, const unsigned char *sha1,

 memcpy(ce->name, path, len);
 ce->ce_flags = create_ce_flags(stage);
 ce->ce_namelen = len;
+fold_ce_name_case(&the_index, ce);
 ce->ce_mode = create_ce_mode(mode);
 if (assume_unchanged)
 ce->ce_flags |= CE_VALID;






Re: fast-import should not care about core.ignorecase

2014-12-08 Thread Mike Hommey
On Tue, Dec 09, 2014 at 09:12:11AM +0900, Mike Hommey wrote:
> Hi,
> 
> As you now know, I'm working on a mercurial remote helper for git. As
> such, it uses fast-import.
> 
> In the mercurial history of mozilla-central, there have been various
> renames of files with only case changes, and it so happens that my
> remote helper blows things up on case insensitive file systems. The
> reason is git clone probing the file system and setting core.ignorecase
> appropriately.
> 
> While it makes sense for checkouts and local commits, it doesn't make
> sense to me that using git fast-import with the same import script would
> have a different behavior depending on whether the file system is
> case-sensitive or not.

Heh, I just found this thread:
http://marc.info/?t=13913470871&r=1&w=2

It doesn't seem to have led to something actually being committed,
though.

Mike


fast-import should not care about core.ignorecase

2014-12-08 Thread Mike Hommey
Hi,

As you now know, I'm working on a mercurial remote helper for git. As
such, it uses fast-import.

In the mercurial history of mozilla-central, there have been various
renames of files with only case changes, and it so happens that my
remote helper blows things up on case insensitive file systems. The
reason is git clone probing the file system and setting core.ignorecase
appropriately.

While it makes sense for checkouts and local commits, it doesn't make
sense to me that using git fast-import with the same import script would
have a different behavior depending on whether the file system is
case-sensitive or not.

Reduced testcase:

$ git init
$ git fast-import <<EOF
blob
mark :1
data 2
a

commit refs/FOO
committer  0 +0000
data 0

M 644 :1 a

commit refs/FOO
committer  0 +0000
data 0

R a A
EOF

This is what you get on a case sensitive FS:

$ git log refs/FOO -p -M
commit be1497308f30f883343eefd0da7ddf1e747133f8
Author:  
Date:   Thu Jan 1 00:00:00 1970 +0000

diff --git a/a b/A
similarity index 100%
rename from a
rename to A

commit 8d37f958cfc0702c577b918c86769a902fe109f8
Author:  
Date:   Thu Jan 1 00:00:00 1970 +0000

diff --git a/a b/a
new file mode 100644
index 0000000..7898192
--- /dev/null
+++ b/a
@@ -0,0 +1 @@
+a

This is what you get on a case insensitive FS:

$ git log refs/FOO -p -M
commit 208c0c4cf58cd54512301e0de33ccb8a78d6b226
Author:  
Date:   Thu Jan 1 00:00:00 1970 +0000

commit 8d37f958cfc0702c577b918c86769a902fe109f8
Author:  
Date:   Thu Jan 1 00:00:00 1970 +0000

diff --git a/a b/a
new file mode 100644
index 0000000..7898192
--- /dev/null
+++ b/a
@@ -0,0 +1 @@
+a

Note, this applies equally to filerename commands or filedelete +
filemodify combinations.

Mike


Re: [PATCH] git-svn: Support for git-svn propset

2014-12-08 Thread Alfred Perlstein


> On Dec 8, 2014, at 1:36 PM, Eric Wong  wrote:
> 
> Alfred Perlstein  wrote:
>> Appearing here:
>>  http://marc.info/?l=git&m=125259772625008&w=2
> 
> Probably better to use a mid URL here, too
> 
> http://mid.gmane.org/1927112650.1281253084529659.javamail.r...@klofta.sjsoft.com
> 
> such a long URL, though...
> 
>> --- a/perl/Git/SVN/Editor.pm
>> +++ b/perl/Git/SVN/Editor.pm
>> @@ -288,6 +288,44 @@ sub apply_autoprops {
>>}
>> }
>> 
>> +sub check_attr {
>> +my ($attr,$path) = @_;
>> +my $fh = command_output_pipe("check-attr", $attr, "--", $path);
>> +return undef if (!$fh);
>> +
>> +my $val = <$fh>;
>> +close $fh;
>> +if ($val) { $val =~ s/^[^:]*:\s*[^:]*:\s*(.*)\s*$/$1/; }
>> +return $val;
>> +}
> 
> I just noticed command_output_pipe didn't use a corresponding
> command_close_pipe to check for errors, but command_oneline is even
> better.  I'll squash the following:
> 
> --- a/perl/Git/SVN/Editor.pm
> +++ b/perl/Git/SVN/Editor.pm
> @@ -290,11 +290,7 @@ sub apply_autoprops {
> 
> sub check_attr {
>my ($attr,$path) = @_;
> -my $fh = command_output_pipe("check-attr", $attr, "--", $path);
> -return undef if (!$fh);
> -
> -my $val = <$fh>;
> -close $fh;
> +my $val = command_oneline("check-attr", $attr, "--", $path);
>if ($val) { $val =~ s/^[^:]*:\s*[^:]*:\s*(.*)\s*$/$1/; }
>return $val;
> }
> 
> In your test, "local" isn't portable, unfortunately, but tests seem to
> work fine without local so I've removed them:
> 
> --- a/t/t9148-git-svn-propset.sh
> +++ b/t/t9148-git-svn-propset.sh
> @@ -29,10 +29,9 @@ test_expect_success 'fetch revisions from svn' '
>git svn fetch
>'
> 
> -set_props()
> -{
> -local subdir="$1"
> -local file="$2"
> +set_props () {
> +subdir="$1"
> +file="$2"
>shift;shift;
>(cd "$subdir" &&
>while [ $# -gt 0 ] ; do
> @@ -43,10 +42,9 @@ set_props()
>git commit -m "testing propset" "$file")
> }
> 
> -confirm_props()
> -{
> -local subdir="$1"
> -local file="$2"
> +confirm_props () {
> +subdir="$1"
> +file="$2"
>shift;shift;
>(set -e ; cd "svn_project/$subdir" &&
>while [ $# -gt 0 ] ; do
> 
> Unless there's other improvements we missed, I'll push out your v3 with
> my changes squashed in for Junio to pull in a day or two.  Thank you
> again for working on this!
> 

Eric,

All looks good to me. 

Thank you all very much for the feedback and help.  It's made this a very 
rewarding endeavor. 

-Alfred.


Re: [PATCH 22/23] lock_any_ref_for_update(): inline function

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:34AM +0100, Michael Haggerty wrote:
> From: Ronnie Sahlberg 
> 
> Inline the function at its one remaining caller (which is within
> refs.c) and remove it.
> 


> Signed-off-by: Michael Haggerty 

It's originally from Ronnie, but his sign off is missing?

If that sign off is found again, 
Reviewed-by: Stefan Beller 

> ---
>  refs.c | 9 +
>  refs.h | 9 +
>  2 files changed, 2 insertions(+), 16 deletions(-)
> 


Re: [PATCH 20/23] reflog_expire(): new function in the reference API

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:32AM +0100, Michael Haggerty wrote:
> Move expire_reflog() into refs.c and rename it to reflog_expire().
> Turn the three policy functions into function pointers that are passed
> into reflog_expire(). Add function prototypes and documentation to
> refs.h.
> 
> Signed-off-by: Michael Haggerty 

With or without the nits fixed
Reviewed-by: Stefan Beller 
as the nits are not degrading functionality.

> ---
>  builtin/reflog.c | 133 
> +++
>  refs.c   | 114 +++
>  refs.h   |  45 +++
>  3 files changed, 174 insertions(+), 118 deletions(-)
> 



> +static int expire_reflog_ent(unsigned char *osha1, unsigned char *nsha1,
> + const char *email, unsigned long timestamp, int tz,
> + const char *message, void *cb_data)

Nit: According to our CodingGuidelines we want to indent it further, so it
aligns with the arguments from the first line.

+static int expire_reflog_ent(unsigned char *osha1, unsigned char *nsha1,
+                             const char *email, unsigned long timestamp, int tz,
+                             const char *message, void *cb_data)

> + }
> + return 0;

Why do we need the return value for expire_reflog_ent?
The "return 0;" at the very end of the function is the only return I see here.

> +enum expire_reflog_flags {
> + EXPIRE_REFLOGS_DRY_RUN = 1 << 0,
> + EXPIRE_REFLOGS_UPDATE_REF = 1 << 1,
> + EXPIRE_REFLOGS_VERBOSE = 1 << 2,
> + EXPIRE_REFLOGS_REWRITE = 1 << 3
> +};

Sometimes we align the assigned numbers and sometimes we don't in git, so an
alternative would be

enum expire_reflog_flags {
	EXPIRE_REFLOGS_DRY_RUN    = 1 << 0,
	EXPIRE_REFLOGS_UPDATE_REF = 1 << 1,
	EXPIRE_REFLOGS_VERBOSE    = 1 << 2,
	EXPIRE_REFLOGS_REWRITE    = 1 << 3
};

Do we have a preference in the coding style on this one?




> + *
> + * reflog_expiry_select_fn -- Called once for each entry in the
> + * existing reflog. It should return true iff that entry should be
> + * pruned.

Although I know how we got here, I wonder if we should invert the logic here
(in a later patch). "select" sounds to me as if the line is selected to be kept.
However the opposite is true: to actually keep the line we need to return 0.
Would it make sense to rename this to reflog_expiry_should_prune_fn?
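
To make the naming question concrete, here is a small sketch of a policy callback whose name says what a true return value means. Everything in it is hypothetical: should_prune_fn, struct demo_policy and expire_entries() are invented for illustration and are far simpler than the reflog_expire() interface under review; only the idea of passing policy state through an opaque "void *cb_data" is taken from the series.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: a nonzero return means "prune this entry". */
typedef int should_prune_fn(unsigned long timestamp, void *cb_data);

/* Policy state, opaque to the generic loop below. */
struct demo_policy {
	unsigned long expire_total;
};

/* Example policy: prune everything older than the cutoff. */
static int prune_older_than(unsigned long timestamp, void *cb_data)
{
	struct demo_policy *policy = cb_data;

	return timestamp < policy->expire_total;
}

/*
 * Generic loop: compacts `timestamps` in place, keeping the entries the
 * policy does not want pruned, and returns how many were kept.
 */
static size_t expire_entries(unsigned long *timestamps, size_t n,
			     should_prune_fn *should_prune, void *cb_data)
{
	size_t i, kept = 0;

	assert(should_prune != NULL);
	for (i = 0; i < n; i++)
		if (!should_prune(timestamps[i], cb_data))
			timestamps[kept++] = timestamps[i];
	return kept;
}
```

With a "should prune" name the call site reads naturally as "if not should_prune, keep the entry", which is the readability argument made above.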



Re: [PATCH 19/23] expire_reflog(): treat the policy callback data as opaque

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:31AM +0100, Michael Haggerty wrote:
> Now that expire_reflog() doesn't actually look in the
> expire_reflog_policy_cb data structure, we can make it opaque:
> 
> * Change its callers to pass it a pointer to an entire "struct
>   expire_reflog_policy_cb".
> 
> * Change it to pass the pointer through as a "void *".
> 
> * Change the policy functions, reflog_expiry_prepare(),
>   reflog_expiry_cleanup(), and should_expire_reflog_ent(), to accept
>   "void *cb_data" arguments and cast them to "struct
>   expire_reflog_policy_cb" internally.
> 
> Signed-off-by: Michael Haggerty 

Reviewed-by: Stefan Beller 
> ---
>  builtin/reflog.c | 73 
> 
>  1 file changed, 36 insertions(+), 37 deletions(-)
> 
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 01b76d0..c30936bb 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -43,7 +43,7 @@ struct expire_reflog_policy_cb {
>   } unreachable_expire_kind;
>   struct commit_list *mark_list;
>   unsigned long mark_limit;
> - struct cmd_reflog_expire_cb *cmd;
> + struct cmd_reflog_expire_cb cmd;
>   struct commit *tip_commit;
>   struct commit_list *tips;
>  };
> @@ -309,22 +309,22 @@ static int should_expire_reflog_ent(unsigned char *osha1, unsigned char *nsha1,
>   struct expire_reflog_policy_cb *cb = cb_data;
>   struct commit *old, *new;
>  
> - if (timestamp < cb->cmd->expire_total)
> + if (timestamp < cb->cmd.expire_total)
>   return 1;
>  
>   old = new = NULL;
> - if (cb->cmd->stalefix &&
> + if (cb->cmd.stalefix &&
>   (!keep_entry(&old, osha1) || !keep_entry(&new, nsha1)))
>   return 1;
>  
> - if (timestamp < cb->cmd->expire_unreachable) {
> + if (timestamp < cb->cmd.expire_unreachable) {
>   if (cb->unreachable_expire_kind == UE_ALWAYS)
>   return 1;
>   if (unreachable(cb, old, osha1) || unreachable(cb, new, nsha1))
>   return 1;
>   }
>  
> - if (cb->cmd->recno && --(cb->cmd->recno) == 0)
> + if (cb->cmd.recno && --(cb->cmd.recno) == 0)
>   return 1;
>  
>   return 0;
> @@ -378,9 +378,11 @@ static int push_tip_to_list(const char *refname, const unsigned char *sha1,
>  
>  static void reflog_expiry_prepare(const char *refname,
> const unsigned char *sha1,
> -   struct expire_reflog_policy_cb *cb)
> +   void *cb_data)
>  {
> - if (!cb->cmd->expire_unreachable || !strcmp(refname, "HEAD")) {
> + struct expire_reflog_policy_cb *cb = cb_data;
> +
> + if (!cb->cmd.expire_unreachable || !strcmp(refname, "HEAD")) {
>   cb->tip_commit = NULL;
>   cb->unreachable_expire_kind = UE_HEAD;
>   } else {
> @@ -391,7 +393,7 @@ static void reflog_expiry_prepare(const char *refname,
>   cb->unreachable_expire_kind = UE_NORMAL;
>   }
>  
> - if (cb->cmd->expire_unreachable <= cb->cmd->expire_total)
> + if (cb->cmd.expire_unreachable <= cb->cmd.expire_total)
>   cb->unreachable_expire_kind = UE_ALWAYS;
>  
>   cb->mark_list = NULL;
> @@ -405,13 +407,15 @@ static void reflog_expiry_prepare(const char *refname,
>   } else {
>   commit_list_insert(cb->tip_commit, &cb->mark_list);
>   }
> - cb->mark_limit = cb->cmd->expire_total;
> + cb->mark_limit = cb->cmd.expire_total;
>   mark_reachable(cb);
>   }
>  }
>  
> -static void reflog_expiry_cleanup(struct expire_reflog_policy_cb *cb)
> +static void reflog_expiry_cleanup(void *cb_data)
>  {
> + struct expire_reflog_policy_cb *cb = cb_data;
> +
>   if (cb->unreachable_expire_kind != UE_ALWAYS) {
>   if (cb->unreachable_expire_kind == UE_HEAD) {
>   struct commit_list *elem;
> @@ -427,19 +431,16 @@ static void reflog_expiry_cleanup(struct 
> expire_reflog_policy_cb *cb)
>  static struct lock_file reflog_lock;
>  
>  static int expire_reflog(const char *refname, const unsigned char *sha1,
> -  unsigned int flags, void *cb_data)
> +  unsigned int flags, void *policy_cb_data)
>  {
> - struct cmd_reflog_expire_cb *cmd = cb_data;
>   struct expire_reflog_cb cb;
> - struct expire_reflog_policy_cb policy_cb;
>   struct ref_lock *lock;
>   char *log_file;
>   int status = 0;
>  
>   memset(&cb, 0, sizeof(cb));
> - memset(&policy_cb, 0, sizeof(policy_cb));
>   cb.flags = flags;
> - cb.policy_cb = &policy_cb;
> + cb.policy_cb = policy_cb_data;
>  
>   /*
>* we take the lock for the ref itself to prevent it from
> @@ -462,11 +463,9 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>   goto failure;
>   }
>  
> - polic

Re: [PATCH 18/23] Move newlog and last_kept_sha1 to "struct expire_reflog_cb"

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:30AM +0100, Michael Haggerty wrote:
> These members are not needed by the policy functions.
> 
> Signed-off-by: Michael Haggerty 

Reviewed-by: Stefan Beller 
> ---
>  builtin/reflog.c | 20 ++--
>  1 file changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 6294406..01b76d0 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -36,7 +36,6 @@ struct cmd_reflog_expire_cb {
>  };
>  
>  struct expire_reflog_policy_cb {
> - FILE *newlog;
>   enum {
>   UE_NORMAL,
>   UE_ALWAYS,
> @@ -45,7 +44,6 @@ struct expire_reflog_policy_cb {
>   struct commit_list *mark_list;
>   unsigned long mark_limit;
>   struct cmd_reflog_expire_cb *cmd;
> - unsigned char last_kept_sha1[20];
>   struct commit *tip_commit;
>   struct commit_list *tips;
>  };
> @@ -53,6 +51,8 @@ struct expire_reflog_policy_cb {
>  struct expire_reflog_cb {
>   unsigned int flags;
>   void *policy_cb;
> + FILE *newlog;
> + unsigned char last_kept_sha1[20];
>  };
>  
>  struct collected_reflog {
> @@ -338,23 +338,23 @@ static int expire_reflog_ent(unsigned char *osha1, 
> unsigned char *nsha1,
>   struct expire_reflog_policy_cb *policy_cb = cb->policy_cb;
>  
>   if (cb->flags & EXPIRE_REFLOGS_REWRITE)
> - osha1 = policy_cb->last_kept_sha1;
> + osha1 = cb->last_kept_sha1;
>  
>   if (should_expire_reflog_ent(osha1, nsha1, email, timestamp, tz,
>message, policy_cb)) {
> - if (!policy_cb->newlog)
> + if (!cb->newlog)
>   printf("would prune %s", message);
>   else if (cb->flags & EXPIRE_REFLOGS_VERBOSE)
>   printf("prune %s", message);
>   } else {
> - if (policy_cb->newlog) {
> + if (cb->newlog) {
>   char sign = (tz < 0) ? '-' : '+';
>   int zone = (tz < 0) ? (-tz) : tz;
> - fprintf(policy_cb->newlog, "%s %s %s %lu %c%04d\t%s",
> + fprintf(cb->newlog, "%s %s %s %lu %c%04d\t%s",
>   sha1_to_hex(osha1), sha1_to_hex(nsha1),
>   email, timestamp, sign, zone,
>   message);
> - hashcpy(policy_cb->last_kept_sha1, nsha1);
> + hashcpy(cb->last_kept_sha1, nsha1);
>   }
>   if (cb->flags & EXPIRE_REFLOGS_VERBOSE)
>   printf("keep %s", message);
> @@ -457,8 +457,8 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>   if (!(flags & EXPIRE_REFLOGS_DRY_RUN)) {
>   if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0)
>   goto failure;
> - policy_cb.newlog = fdopen_lock_file(&reflog_lock, "w");
> - if (!policy_cb.newlog)
> + cb.newlog = fdopen_lock_file(&reflog_lock, "w");
> + if (!cb.newlog)
>   goto failure;
>   }
>  
> @@ -474,7 +474,7 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>   strerror(errno));
>   } else if ((flags & EXPIRE_REFLOGS_UPDATE_REF) &&
>   (write_in_full(lock->lock_fd,
> - sha1_to_hex(policy_cb.last_kept_sha1), 40) != 40 ||
> + sha1_to_hex(cb.last_kept_sha1), 40) != 40 ||
>write_str_in_full(lock->lock_fd, "\n") != 1 ||
>close_ref(lock) < 0)) {
>   status |= error("Couldn't write %s",
> -- 
> 2.1.3
> 
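
For readers following the series, here is a minimal sketch in C of the split this patch completes: the expiry machinery owns the output stream and the last kept id, while the policy state sits behind an opaque pointer. All `sketch_*` names are hypothetical, the ids are plain strings rather than 20-byte SHA-1s, and the policy is reduced to a single timestamp check, so this illustrates the layering and is not the code in builtin/reflog.c.

```c
#include <stdio.h>

/* Policy state: only what is needed to decide whether an entry expires. */
struct sketch_policy_cb {
	unsigned long expire_total;
};

/* Machinery state: flags, output stream, and an opaque policy pointer. */
struct sketch_expire_cb {
	unsigned int flags;
	void *policy_cb;
	FILE *newlog;		/* NULL means nothing is written (dry run) */
	char last_kept[41];	/* hex id of the last entry written out */
};

/* Policy function: returns 1 if the entry should be expired. */
static int sketch_should_expire(unsigned long timestamp, void *cb_data)
{
	struct sketch_policy_cb *policy = cb_data;

	return timestamp < policy->expire_total;
}

/*
 * Machinery: asks the policy for a verdict and, if the entry is kept and
 * a new log is open, writes it and remembers its id. Returns 1 if the
 * entry was kept, 0 if it was expired.
 */
static int sketch_expire_ent(const char *new_hex, unsigned long timestamp,
			     struct sketch_expire_cb *cb)
{
	if (sketch_should_expire(timestamp, cb->policy_cb))
		return 0;
	if (cb->newlog) {
		fprintf(cb->newlog, "%s %lu\n", new_hex, timestamp);
		snprintf(cb->last_kept, sizeof(cb->last_kept), "%s", new_hex);
	}
	return 1;
}
```

The policy function never touches `newlog` or `last_kept`, which is why the patch can move both members out of the policy struct.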


Re: [PATCH 17/23] expire_reflog(): move rewrite to flags argument

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:29AM +0100, Michael Haggerty wrote:
> The policy objects don't care about "--rewrite". So move it to
> expire_reflog()'s flags parameter.
> 
> Signed-off-by: Michael Haggerty 

Reviewed-by: Stefan Beller 

> ---
>  builtin/reflog.c | 10 +-
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index cc7a220..6294406 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -23,13 +23,13 @@ static unsigned long default_reflog_expire_unreachable;
>  enum expire_reflog_flags {
>   EXPIRE_REFLOGS_DRY_RUN = 1 << 0,
>   EXPIRE_REFLOGS_UPDATE_REF = 1 << 1,
> - EXPIRE_REFLOGS_VERBOSE = 1 << 2
> + EXPIRE_REFLOGS_VERBOSE = 1 << 2,
> + EXPIRE_REFLOGS_REWRITE = 1 << 3
>  };
>  
>  struct cmd_reflog_expire_cb {
>   struct rev_info revs;
>   int stalefix;
> - int rewrite;
>   unsigned long expire_total;
>   unsigned long expire_unreachable;
>   int recno;
> @@ -337,7 +337,7 @@ static int expire_reflog_ent(unsigned char *osha1, 
> unsigned char *nsha1,
>   struct expire_reflog_cb *cb = cb_data;
>   struct expire_reflog_policy_cb *policy_cb = cb->policy_cb;
>  
> - if (policy_cb->cmd->rewrite)
> + if (cb->flags & EXPIRE_REFLOGS_REWRITE)
>   osha1 = policy_cb->last_kept_sha1;
>  
>   if (should_expire_reflog_ent(osha1, nsha1, email, timestamp, tz,
> @@ -673,7 +673,7 @@ static int cmd_reflog_expire(int argc, const char **argv, 
> const char *prefix)
>   else if (!strcmp(arg, "--stale-fix"))
>   cb.stalefix = 1;
>   else if (!strcmp(arg, "--rewrite"))
> - cb.rewrite = 1;
> + flags |= EXPIRE_REFLOGS_REWRITE;
>   else if (!strcmp(arg, "--updateref"))
>   flags |= EXPIRE_REFLOGS_UPDATE_REF;
>   else if (!strcmp(arg, "--all"))
> @@ -755,7 +755,7 @@ static int cmd_reflog_delete(int argc, const char **argv, 
> const char *prefix)
>   if (!strcmp(arg, "--dry-run") || !strcmp(arg, "-n"))
>   flags |= EXPIRE_REFLOGS_DRY_RUN;
>   else if (!strcmp(arg, "--rewrite"))
> - cb.rewrite = 1;
> + flags |= EXPIRE_REFLOGS_REWRITE;
>   else if (!strcmp(arg, "--updateref"))
>   flags |= EXPIRE_REFLOGS_UPDATE_REF;
>   else if (!strcmp(arg, "--verbose"))
> -- 
> 2.1.3
> 
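
What the rewrite flag does to the entries can be sketched as follows, based on the expire_reflog_ent() hunk quoted above: when the flag is set, the "old" id of each kept entry is replaced by the "new" id of the previously kept entry, so the chain stays continuous after pruning. The struct, the 7-character ids, and the `expired` field (standing in for the policy decision) are assumptions made for the example.

```c
#include <string.h>

enum rewrite_sketch_flags {
	REWRITE_SKETCH_REWRITE = 1 << 3
};

struct rewrite_sketch_entry {
	char old_id[8];
	char new_id[8];
	int expired;	/* stand-in for the policy decision */
};

/*
 * Walk the entries in order and compact the kept ones to the front.
 * With REWRITE_SKETCH_REWRITE set, each kept entry's old_id becomes the
 * new_id of the previously kept entry (all zeros for the first one).
 * Returns the number of kept entries.
 */
static int rewrite_sketch_expire(struct rewrite_sketch_entry *ents, int nr,
				 unsigned int flags)
{
	char last_kept[8] = "0000000";
	int i, kept = 0;

	for (i = 0; i < nr; i++) {
		if (ents[i].expired)
			continue;
		ents[kept] = ents[i];
		if (flags & REWRITE_SKETCH_REWRITE)
			strcpy(ents[kept].old_id, last_kept);
		strcpy(last_kept, ents[kept].new_id);
		kept++;
	}
	return kept;
}
```

Without the flag, a kept entry that follows a pruned one still names the pruned entry's id as its predecessor.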


Re: [PATCH 16/23] expire_reflog(): move verbose to flags argument

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:28AM +0100, Michael Haggerty wrote:
> The policy objects don't care about "--verbose". So move it to
> expire_reflog()'s flags parameter.
> 
> Signed-off-by: Michael Haggerty 

Reviewed-by: Stefan Beller 

> ---
>  builtin/reflog.c | 24 
>  1 file changed, 12 insertions(+), 12 deletions(-)
> 
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 1512b67..cc7a220 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -20,11 +20,16 @@ static const char reflog_delete_usage[] =
>  static unsigned long default_reflog_expire;
>  static unsigned long default_reflog_expire_unreachable;
>  
> +enum expire_reflog_flags {
> + EXPIRE_REFLOGS_DRY_RUN = 1 << 0,
> + EXPIRE_REFLOGS_UPDATE_REF = 1 << 1,
> + EXPIRE_REFLOGS_VERBOSE = 1 << 2
> +};
> +
>  struct cmd_reflog_expire_cb {
>   struct rev_info revs;
>   int stalefix;
>   int rewrite;
> - int verbose;
>   unsigned long expire_total;
>   unsigned long expire_unreachable;
>   int recno;
> @@ -339,7 +344,7 @@ static int expire_reflog_ent(unsigned char *osha1, 
> unsigned char *nsha1,
>message, policy_cb)) {
>   if (!policy_cb->newlog)
>   printf("would prune %s", message);
> - else if (policy_cb->cmd->verbose)
> + else if (cb->flags & EXPIRE_REFLOGS_VERBOSE)
>   printf("prune %s", message);
>   } else {
>   if (policy_cb->newlog) {
> @@ -351,7 +356,7 @@ static int expire_reflog_ent(unsigned char *osha1, 
> unsigned char *nsha1,
>   message);
>   hashcpy(policy_cb->last_kept_sha1, nsha1);
>   }
> - if (policy_cb->cmd->verbose)
> + if (cb->flags & EXPIRE_REFLOGS_VERBOSE)
>   printf("keep %s", message);
>   }
>   return 0;
> @@ -421,11 +426,6 @@ static void reflog_expiry_cleanup(struct 
> expire_reflog_policy_cb *cb)
>  
>  static struct lock_file reflog_lock;
>  
> -enum expire_reflog_flags {
> - EXPIRE_REFLOGS_DRY_RUN = 1 << 0,
> - EXPIRE_REFLOGS_UPDATE_REF = 1 << 1
> -};
> -
>  static int expire_reflog(const char *refname, const unsigned char *sha1,
>unsigned int flags, void *cb_data)
>  {
> @@ -679,7 +679,7 @@ static int cmd_reflog_expire(int argc, const char **argv, 
> const char *prefix)
>   else if (!strcmp(arg, "--all"))
>   do_all = 1;
>   else if (!strcmp(arg, "--verbose"))
> - cb.verbose = 1;
> + flags |= EXPIRE_REFLOGS_VERBOSE;
>   else if (!strcmp(arg, "--")) {
>   i++;
>   break;
> @@ -697,10 +697,10 @@ static int cmd_reflog_expire(int argc, const char 
> **argv, const char *prefix)
>*/
>   if (cb.stalefix) {
>   init_revisions(&cb.revs, prefix);
> - if (cb.verbose)
> + if (flags & EXPIRE_REFLOGS_VERBOSE)
>   printf("Marking reachable objects...");
>   mark_reachable_objects(&cb.revs, 0, 0, NULL);
> - if (cb.verbose)
> + if (flags & EXPIRE_REFLOGS_VERBOSE)
>   putchar('\n');
>   }
>  
> @@ -759,7 +759,7 @@ static int cmd_reflog_delete(int argc, const char **argv, 
> const char *prefix)
>   else if (!strcmp(arg, "--updateref"))
>   flags |= EXPIRE_REFLOGS_UPDATE_REF;
>   else if (!strcmp(arg, "--verbose"))
> - cb.verbose = 1;
> + flags |= EXPIRE_REFLOGS_VERBOSE;
>   else if (!strcmp(arg, "--")) {
>   i++;
>   break;
> -- 
> 2.1.3
> 
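
The option handling that this patch and its neighbours converge on can be sketched as a loop that folds each recognised option into one bitmask. The names below are hypothetical and the loop is simplified: unlike cmd_reflog_expire(), it simply stops at the first unrecognised argument instead of printing usage.

```c
#include <string.h>

enum parse_sketch_flags {
	PARSE_SKETCH_DRY_RUN    = 1 << 0,
	PARSE_SKETCH_UPDATE_REF = 1 << 1,
	PARSE_SKETCH_VERBOSE    = 1 << 2
};

/*
 * Fold recognised options into *flags. Parsing stops after "--" or at
 * the first argument that is not a known option. Returns the index of
 * the first argument that was not consumed.
 */
static int parse_sketch_flags(int argc, const char **argv,
			      unsigned int *flags)
{
	int i;

	for (i = 1; i < argc; i++) {
		const char *arg = argv[i];

		if (!strcmp(arg, "--dry-run") || !strcmp(arg, "-n"))
			*flags |= PARSE_SKETCH_DRY_RUN;
		else if (!strcmp(arg, "--updateref"))
			*flags |= PARSE_SKETCH_UPDATE_REF;
		else if (!strcmp(arg, "--verbose"))
			*flags |= PARSE_SKETCH_VERBOSE;
		else if (!strcmp(arg, "--")) {
			i++;
			break;
		} else
			break;
	}
	return i;
}
```

Callers then test individual bits, for example `flags & PARSE_SKETCH_VERBOSE`, instead of reading separate int members of the command struct.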


Re: [PATCH 15/23] expire_reflog(): pass flags through to expire_reflog_ent()

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:27AM +0100, Michael Haggerty wrote:
> Add a flags field to "struct expire_reflog_cb", and pass the flags
> argument through to expire_reflog_ent(). In a moment we will start
> using it to pass through flags that expire_reflog_ent() needs.
> 
> Signed-off-by: Michael Haggerty 

Reviewed-by: Stefan Beller

> ---
>  builtin/reflog.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 5dfa53a..1512b67 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -46,6 +46,7 @@ struct expire_reflog_policy_cb {
>  };
>  
>  struct expire_reflog_cb {
> + unsigned int flags;
>   void *policy_cb;
>  };
>  
> @@ -437,6 +438,7 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>  
>   memset(&cb, 0, sizeof(cb));
>   memset(&policy_cb, 0, sizeof(policy_cb));
> + cb.flags = flags;
>   cb.policy_cb = &policy_cb;
>  
>   /*
> -- 
> 2.1.3
> 


Re: [PATCH 14/23] struct expire_reflog_cb: a new callback data type

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:26AM +0100, Michael Haggerty wrote:
> Add a new data type, "struct expire_reflog_cb", for holding the data
> that expire_reflog() passes to expire_reflog_ent() via
> for_each_reflog_ent(). For now it only holds a pointer to "struct
> expire_reflog_policy_cb". In future commits we will move some data
> from the latter to the former.
> 
> Signed-off-by: Michael Haggerty 

Reviewed-by: Stefan Beller
> ---
>  builtin/reflog.c | 43 ++-
>  1 file changed, 26 insertions(+), 17 deletions(-)
> 
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 3538e4b..5dfa53a 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -45,10 +45,15 @@ struct expire_reflog_policy_cb {
>   struct commit_list *tips;
>  };
>  
> +struct expire_reflog_cb {
> + void *policy_cb;
> +};
> +
>  struct collected_reflog {
>   unsigned char sha1[20];
>   char reflog[FLEX_ARRAY];
>  };
> +
>  struct collect_reflog_cb {
>   struct collected_reflog **e;
>   int alloc;
> @@ -323,28 +328,29 @@ static int expire_reflog_ent(unsigned char *osha1, 
> unsigned char *nsha1,
>   const char *email, unsigned long timestamp, int tz,
>   const char *message, void *cb_data)
>  {
> - struct expire_reflog_policy_cb *cb = cb_data;
> + struct expire_reflog_cb *cb = cb_data;
> + struct expire_reflog_policy_cb *policy_cb = cb->policy_cb;
>  
> - if (cb->cmd->rewrite)
> - osha1 = cb->last_kept_sha1;
> + if (policy_cb->cmd->rewrite)
> + osha1 = policy_cb->last_kept_sha1;
>  
>   if (should_expire_reflog_ent(osha1, nsha1, email, timestamp, tz,
> -  message, cb_data)) {
> - if (!cb->newlog)
> +  message, policy_cb)) {
> + if (!policy_cb->newlog)
>   printf("would prune %s", message);
> - else if (cb->cmd->verbose)
> + else if (policy_cb->cmd->verbose)
>   printf("prune %s", message);
>   } else {
> - if (cb->newlog) {
> + if (policy_cb->newlog) {
>   char sign = (tz < 0) ? '-' : '+';
>   int zone = (tz < 0) ? (-tz) : tz;
> - fprintf(cb->newlog, "%s %s %s %lu %c%04d\t%s",
> + fprintf(policy_cb->newlog, "%s %s %s %lu %c%04d\t%s",
>   sha1_to_hex(osha1), sha1_to_hex(nsha1),
>   email, timestamp, sign, zone,
>   message);
> - hashcpy(cb->last_kept_sha1, nsha1);
> + hashcpy(policy_cb->last_kept_sha1, nsha1);
>   }
> - if (cb->cmd->verbose)
> + if (policy_cb->cmd->verbose)
>   printf("keep %s", message);
>   }
>   return 0;
> @@ -423,12 +429,15 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>unsigned int flags, void *cb_data)
>  {
>   struct cmd_reflog_expire_cb *cmd = cb_data;
> - struct expire_reflog_policy_cb cb;
> + struct expire_reflog_cb cb;
> + struct expire_reflog_policy_cb policy_cb;
>   struct ref_lock *lock;
>   char *log_file;
>   int status = 0;
>  
>   memset(&cb, 0, sizeof(cb));
> + memset(&policy_cb, 0, sizeof(policy_cb));
> + cb.policy_cb = &policy_cb;
>  
>   /*
>* we take the lock for the ref itself to prevent it from
> @@ -446,16 +455,16 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>   if (!(flags & EXPIRE_REFLOGS_DRY_RUN)) {
>   if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0)
>   goto failure;
> - cb.newlog = fdopen_lock_file(&reflog_lock, "w");
> - if (!cb.newlog)
> + policy_cb.newlog = fdopen_lock_file(&reflog_lock, "w");
> + if (!policy_cb.newlog)
>   goto failure;
>   }
>  
> - cb.cmd = cmd;
> + policy_cb.cmd = cmd;
>  
> - reflog_expiry_prepare(refname, sha1, &cb);
> + reflog_expiry_prepare(refname, sha1, &policy_cb);
>   for_each_reflog_ent(refname, expire_reflog_ent, &cb);
> - reflog_expiry_cleanup(&cb);
> + reflog_expiry_cleanup(&policy_cb);
>  
>   if (!(flags & EXPIRE_REFLOGS_DRY_RUN)) {
>   if (close_lock_file(&reflog_lock)) {
> @@ -463,7 +472,7 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>   strerror(errno));
>   } else if ((flags & EXPIRE_REFLOGS_UPDATE_REF) &&
>   (write_in_full(lock->lock_fd,
> - sha1_to_hex(cb.last_kept_sha1), 40) != 40 ||
> + sha1_to_hex(policy_cb.last_kept_sha1), 40) != 40 ||
>write_str_in_full(lock->lock_fd, "\n") != 1 ||
>

Re: [PATCH 13/23] Rename expire_reflog_cb to expire_reflog_policy_cb

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:25AM +0100, Michael Haggerty wrote:
> This is the first step towards separating the data needed by the
> policy code from the data needed by the reflog expiration machinery.
> 
> Signed-off-by: Michael Haggerty 

Reviewed-by: Stefan Beller
> ---
>  builtin/reflog.c | 19 ++-
>  1 file changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 597c547..3538e4b 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -30,7 +30,7 @@ struct cmd_reflog_expire_cb {
>   int recno;
>  };
>  
> -struct expire_reflog_cb {
> +struct expire_reflog_policy_cb {
>   FILE *newlog;
>   enum {
>   UE_NORMAL,
> @@ -220,7 +220,7 @@ static int keep_entry(struct commit **it, unsigned char 
> *sha1)
>   * the expire_limit and queue them back, so that the caller can call
>   * us again to restart the traversal with longer expire_limit.
>   */
> -static void mark_reachable(struct expire_reflog_cb *cb)
> +static void mark_reachable(struct expire_reflog_policy_cb *cb)
>  {
>   struct commit *commit;
>   struct commit_list *pending;
> @@ -259,7 +259,7 @@ static void mark_reachable(struct expire_reflog_cb *cb)
>   cb->mark_list = leftover;
>  }
>  
> -static int unreachable(struct expire_reflog_cb *cb, struct commit *commit, unsigned char *sha1)
> +static int unreachable(struct expire_reflog_policy_cb *cb, struct commit *commit, unsigned char *sha1)
>  {
>   /*
>* We may or may not have the commit yet - if not, look it
> @@ -295,7 +295,7 @@ static int should_expire_reflog_ent(unsigned char *osha1, 
> unsigned char *nsha1,
>   const char *email, unsigned long timestamp, int tz,
>   const char *message, void *cb_data)
>  {
> - struct expire_reflog_cb *cb = cb_data;
> + struct expire_reflog_policy_cb *cb = cb_data;
>   struct commit *old, *new;
>  
>   if (timestamp < cb->cmd->expire_total)
> @@ -323,7 +323,7 @@ static int expire_reflog_ent(unsigned char *osha1, 
> unsigned char *nsha1,
>   const char *email, unsigned long timestamp, int tz,
>   const char *message, void *cb_data)
>  {
> - struct expire_reflog_cb *cb = cb_data;
> + struct expire_reflog_policy_cb *cb = cb_data;
>  
>   if (cb->cmd->rewrite)
>   osha1 = cb->last_kept_sha1;
> @@ -350,7 +350,8 @@ static int expire_reflog_ent(unsigned char *osha1, 
> unsigned char *nsha1,
>   return 0;
>  }
>  
> -static int push_tip_to_list(const char *refname, const unsigned char *sha1, int flags, void *cb_data)
> +static int push_tip_to_list(const char *refname, const unsigned char *sha1,
> + int flags, void *cb_data)
>  {
>   struct commit_list **list = cb_data;
>   struct commit *tip_commit;
> @@ -365,7 +366,7 @@ static int push_tip_to_list(const char *refname, const 
> unsigned char *sha1, int
>  
>  static void reflog_expiry_prepare(const char *refname,
> const unsigned char *sha1,
> -   struct expire_reflog_cb *cb)
> +   struct expire_reflog_policy_cb *cb)
>  {
>   if (!cb->cmd->expire_unreachable || !strcmp(refname, "HEAD")) {
>   cb->tip_commit = NULL;
> @@ -397,7 +398,7 @@ static void reflog_expiry_prepare(const char *refname,
>   }
>  }
>  
> -static void reflog_expiry_cleanup(struct expire_reflog_cb *cb)
> +static void reflog_expiry_cleanup(struct expire_reflog_policy_cb *cb)
>  {
>   if (cb->unreachable_expire_kind != UE_ALWAYS) {
>   if (cb->unreachable_expire_kind == UE_HEAD) {
> @@ -422,7 +423,7 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>unsigned int flags, void *cb_data)
>  {
>   struct cmd_reflog_expire_cb *cmd = cb_data;
> - struct expire_reflog_cb cb;
> + struct expire_reflog_policy_cb cb;
>   struct ref_lock *lock;
>   char *log_file;
>   int status = 0;
> -- 
> 2.1.3
> 


Re: [PATCH 12/23] expire_reflog(): move updateref to flags argument

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:24AM +0100, Michael Haggerty wrote:
> The policy objects don't care about "--updateref". So move it to
> expire_reflog()'s flags parameter.
> 
> Signed-off-by: Michael Haggerty 

Reviewed-by: Stefan Beller

> ---
>  builtin/reflog.c | 12 ++--
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index a490193..597c547 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -24,7 +24,6 @@ struct cmd_reflog_expire_cb {
>   struct rev_info revs;
>   int stalefix;
>   int rewrite;
> - int updateref;
>   int verbose;
>   unsigned long expire_total;
>   unsigned long expire_unreachable;
> @@ -415,7 +414,8 @@ static void reflog_expiry_cleanup(struct expire_reflog_cb 
> *cb)
>  static struct lock_file reflog_lock;
>  
>  enum expire_reflog_flags {
> - EXPIRE_REFLOGS_DRY_RUN = 1 << 0
> + EXPIRE_REFLOGS_DRY_RUN = 1 << 0,
> + EXPIRE_REFLOGS_UPDATE_REF = 1 << 1
>  };
>  
>  static int expire_reflog(const char *refname, const unsigned char *sha1,
> @@ -460,7 +460,7 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>   if (close_lock_file(&reflog_lock)) {
>   status |= error("Couldn't write %s: %s", log_file,
>   strerror(errno));
> - } else if (cmd->updateref &&
> + } else if ((flags & EXPIRE_REFLOGS_UPDATE_REF) &&
>   (write_in_full(lock->lock_fd,
>   sha1_to_hex(cb.last_kept_sha1), 40) != 40 ||
>write_str_in_full(lock->lock_fd, "\n") != 1 ||
> @@ -471,7 +471,7 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>   } else if (commit_lock_file(&reflog_lock)) {
>   status |= error("cannot rename %s.lock to %s",
>   log_file, log_file);
> - } else if (cmd->updateref && commit_ref(lock)) {
> + } else if ((flags & EXPIRE_REFLOGS_UPDATE_REF) && commit_ref(lock)) {
>   status |= error("Couldn't set %s", lock->ref_name);
>   }
>   }
> @@ -663,7 +663,7 @@ static int cmd_reflog_expire(int argc, const char **argv, 
> const char *prefix)
>   else if (!strcmp(arg, "--rewrite"))
>   cb.rewrite = 1;
>   else if (!strcmp(arg, "--updateref"))
> - cb.updateref = 1;
> + flags |= EXPIRE_REFLOGS_UPDATE_REF;
>   else if (!strcmp(arg, "--all"))
>   do_all = 1;
>   else if (!strcmp(arg, "--verbose"))
> @@ -745,7 +745,7 @@ static int cmd_reflog_delete(int argc, const char **argv, 
> const char *prefix)
>   else if (!strcmp(arg, "--rewrite"))
>   cb.rewrite = 1;
>   else if (!strcmp(arg, "--updateref"))
> - cb.updateref = 1;
> + flags |= EXPIRE_REFLOGS_UPDATE_REF;
>   else if (!strcmp(arg, "--verbose"))
>   cb.verbose = 1;
>   else if (!strcmp(arg, "--")) {
> -- 
> 2.1.3
> 


Re: [PATCH 11/23] expire_reflog(): move dry_run to flags argument

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:23AM +0100, Michael Haggerty wrote:
> The policy objects don't care about "--dry-run". So move it to
> expire_reflog()'s flags parameter.
> 
> Signed-off-by: Michael Haggerty 

Reviewed-by: Stefan Beller

> ---
>  builtin/reflog.c | 13 -
>  1 file changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index 319f0d2..a490193 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -22,7 +22,6 @@ static unsigned long default_reflog_expire_unreachable;
>  
>  struct cmd_reflog_expire_cb {
>   struct rev_info revs;
> - int dry_run;
>   int stalefix;
>   int rewrite;
>   int updateref;
> @@ -415,6 +414,10 @@ static void reflog_expiry_cleanup(struct 
> expire_reflog_cb *cb)
>  
>  static struct lock_file reflog_lock;
>  
> +enum expire_reflog_flags {
> + EXPIRE_REFLOGS_DRY_RUN = 1 << 0
> +};
> +
>  static int expire_reflog(const char *refname, const unsigned char *sha1,
>unsigned int flags, void *cb_data)
>  {
> @@ -439,7 +442,7 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>   }
>  
>   log_file = git_pathdup("logs/%s", refname);
> - if (!cmd->dry_run) {
> + if (!(flags & EXPIRE_REFLOGS_DRY_RUN)) {
>   if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0)
>   goto failure;
>   cb.newlog = fdopen_lock_file(&reflog_lock, "w");
> @@ -453,7 +456,7 @@ static int expire_reflog(const char *refname, const 
> unsigned char *sha1,
>   for_each_reflog_ent(refname, expire_reflog_ent, &cb);
>   reflog_expiry_cleanup(&cb);
>  
> - if (cb.newlog) {
> + if (!(flags & EXPIRE_REFLOGS_DRY_RUN)) {
>   if (close_lock_file(&reflog_lock)) {
>   status |= error("Couldn't write %s: %s", log_file,
>   strerror(errno));
> @@ -644,7 +647,7 @@ static int cmd_reflog_expire(int argc, const char **argv, 
> const char *prefix)
>   for (i = 1; i < argc; i++) {
>   const char *arg = argv[i];
>   if (!strcmp(arg, "--dry-run") || !strcmp(arg, "-n"))
> - cb.dry_run = 1;
> + flags |= EXPIRE_REFLOGS_DRY_RUN;
>   else if (starts_with(arg, "--expire=")) {
>   if (parse_expiry_date(arg + 9, &cb.expire_total))
>   die(_("'%s' is not a valid timestamp"), arg);
> @@ -738,7 +741,7 @@ static int cmd_reflog_delete(int argc, const char **argv, 
> const char *prefix)
>   for (i = 1; i < argc; i++) {
>   const char *arg = argv[i];
>   if (!strcmp(arg, "--dry-run") || !strcmp(arg, "-n"))
> - cb.dry_run = 1;
> + flags |= EXPIRE_REFLOGS_DRY_RUN;
>   else if (!strcmp(arg, "--rewrite"))
>   cb.rewrite = 1;
>   else if (!strcmp(arg, "--updateref"))
> -- 
> 2.1.3
> 
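
The dry-run gating in expire_reflog() follows a write-then-rename pattern: unless the dry-run bit is set, the new log is written to a lock file that replaces the original only once the write has succeeded. Below is a rough sketch of that pattern using plain stdio. The function name and the "would rewrite" message are made up for the example, and it does not use git's lock_file API, which among other things creates the lock file exclusively so that concurrent writers are excluded.

```c
#include <stdio.h>

enum { LOCK_SKETCH_DRY_RUN = 1 << 0 };

/*
 * Replace the contents of `path` with `kept`. In a dry run nothing is
 * opened or written. Otherwise the data goes to "<path>.lock" first and
 * is renamed over `path` only after it has been written and closed.
 * Returns 0 on success, -1 on failure.
 */
static int sketch_rewrite_log(const char *path, const char *kept,
			      unsigned int flags)
{
	char lock_path[4096];
	FILE *newlog;
	int n;

	if (flags & LOCK_SKETCH_DRY_RUN) {
		printf("would rewrite %s\n", path);
		return 0;
	}

	n = snprintf(lock_path, sizeof(lock_path), "%s.lock", path);
	if (n < 0 || n >= (int)sizeof(lock_path))
		return -1;

	/* Note: plain "w" does not give the exclusivity a real lock needs. */
	newlog = fopen(lock_path, "w");
	if (!newlog)
		return -1;

	if (fputs(kept, newlog) == EOF) {
		fclose(newlog);
		remove(lock_path);
		return -1;
	}
	if (fclose(newlog) == EOF || rename(lock_path, path) != 0) {
		remove(lock_path);
		return -1;
	}
	return 0;
}
```

Testing the flag itself, rather than whether the output stream is non-NULL, keeps the commit step tied to the caller's intent, which is the change the second hunk of this patch makes.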


Re: [PATCH 10/23] expire_reflog(): add a "flags" argument

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:22AM +0100, Michael Haggerty wrote:
> We want to separate the options relevant to the expiry machinery from
> the options affecting the expiration policy. So add a "flags" argument
> to expire_reflog() to hold the former.
> 
> The argument doesn't yet do anything.
> 
> Signed-off-by: Michael Haggerty 

Reviewed-by: Stefan Beller

> ---
>  builtin/reflog.c | 11 +++
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/builtin/reflog.c b/builtin/reflog.c
> index ebfa635..319f0d2 100644
> --- a/builtin/reflog.c
> +++ b/builtin/reflog.c
> @@ -415,7 +415,8 @@ static void reflog_expiry_cleanup(struct expire_reflog_cb 
> *cb)
>  
>  static struct lock_file reflog_lock;
>  
> -static int expire_reflog(const char *refname, const unsigned char *sha1, void *cb_data)
> +static int expire_reflog(const char *refname, const unsigned char *sha1,
> +  unsigned int flags, void *cb_data)
>  {
>   struct cmd_reflog_expire_cb *cmd = cb_data;
>   struct expire_reflog_cb cb;
> @@ -627,6 +628,7 @@ static int cmd_reflog_expire(int argc, const char **argv, 
> const char *prefix)
>   unsigned long now = time(NULL);
>   int i, status, do_all;
>   int explicit_expiry = 0;
> + unsigned int flags = 0;
>  
>   default_reflog_expire_unreachable = now - 30 * 24 * 3600;
>   default_reflog_expire = now - 90 * 24 * 3600;
> @@ -696,7 +698,7 @@ static int cmd_reflog_expire(int argc, const char **argv, 
> const char *prefix)
>   for (i = 0; i < collected.nr; i++) {
>   struct collected_reflog *e = collected.e[i];
>   set_reflog_expiry_param(&cb, explicit_expiry, e->reflog);
> - status |= expire_reflog(e->reflog, e->sha1, &cb);
> + status |= expire_reflog(e->reflog, e->sha1, flags, &cb);
>   free(e);
>   }
>   free(collected.e);
> @@ -710,7 +712,7 @@ static int cmd_reflog_expire(int argc, const char **argv, 
> const char *prefix)
>   continue;
>   }
>   set_reflog_expiry_param(&cb, explicit_expiry, ref);
> - status |= expire_reflog(ref, sha1, &cb);
> + status |= expire_reflog(ref, sha1, flags, &cb);
>   }
>   return status;
>  }
> @@ -729,6 +731,7 @@ static int cmd_reflog_delete(int argc, const char **argv, 
> const char *prefix)
>  {
>   struct cmd_reflog_expire_cb cb;
>   int i, status = 0;
> + unsigned int flags = 0;
>  
>   memset(&cb, 0, sizeof(cb));
>  
> @@ -781,7 +784,7 @@ static int cmd_reflog_delete(int argc, const char **argv, 
> const char *prefix)
>   cb.expire_total = 0;
>   }
>  
> - status |= expire_reflog(ref, sha1, &cb);
> + status |= expire_reflog(ref, sha1, flags, &cb);
>   free(ref);
>   }
>   return status;
> -- 
> 2.1.3
> 


Re: [PATCH 08/23] Extract function should_expire_reflog_ent()

2014-12-08 Thread Stefan Beller
On Fri, Dec 05, 2014 at 12:08:20AM +0100, Michael Haggerty wrote:
> Extracted from expire_reflog_ent() a function that is solely
> responsible for deciding whether a reflog entry should be expired. By
> separating this "business logic" from the mechanics of actually
> expiring entries, we are working towards the goal of encapsulating
> reflog expiry within the refs API, with policy decided by a callback
> function passed to it by its caller.
> 
> Signed-off-by: Michael Haggerty 

Reviewed-by: Stefan Beller 

The comments below are just thoughts; they don't need to be
addressed in this commit.


> + if (should_expire_reflog_ent(osha1, nsha1, email, timestamp, tz,
> +  message, cb_data)) {
> + if (!cb->newlog)
> + printf("would prune %s", message);
> + else if (cb->cmd->verbose)
> + printf("prune %s", message);

Since this commit only moves code around, we don't want to introduce
changes here. A question for a possible follow-up, though:
"git reflog" is listed as an ancillary manipulator, which still counts as
porcelain, so maybe we want to mark "[would] prune" for translation?


> + char sign = (tz < 0) ? '-' : '+';
> + int zone = (tz < 0) ? (-tz) : tz;
> + fprintf(cb->newlog, "%s %s %s %lu %c%04d\t%s",
> + sha1_to_hex(osha1), sha1_to_hex(nsha1),
> + email, timestamp, sign, zone,
> + message);

This is fine for a commit that just moves code around.
I'll send a patch on top of this one to remove the manual calculation of the
sign and zone and let fprintf() figure it out.
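
One way this could work, assuming the follow-up relies on printf's `+` and `0` flags (that patch is not shown in this thread, so the exact format string is a guess): `%+05d` prints an explicit sign and zero-pads to five characters including the sign, which matches what the manual `%c%04d` version produces. The helper names are hypothetical.

```c
#include <stdio.h>

/* The manual version, as in the quoted code: split sign and magnitude. */
static void sketch_tz_manual(char *buf, size_t len, int tz)
{
	char sign = (tz < 0) ? '-' : '+';
	int zone = (tz < 0) ? (-tz) : tz;

	snprintf(buf, len, "%c%04d", sign, zone);
}

/* Let printf do it: '+' forces a sign, '0' pads, 5 is the total width. */
static void sketch_tz_printf(char *buf, size_t len, int tz)
{
	snprintf(buf, len, "%+05d", tz);
}
```

Both helpers give the same string for the usual timezone range, for example "-0500" for -500 and "+0000" for 0.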



Re: [PATCH 0/8] Making reflog modifications part of the transactions API

2014-12-08 Thread Jonathan Nieder
Hi,

Stefan Beller wrote:

> How do we view an empty reflog, i.e. an empty file at logs/refs/heads/master?

That's a good question.  I'm a little concerned about what 'git reflog
expire --updateref' would do in this case.

It looks like the longstanding behavior is for 'git reflog expire' to
expire any entry that is old enough, even if that entry describes the
current value of the ref and the reflog becomes empty.  The empty
reflog file is kind of like having no reflog at all, except it means
git should log updates to that ref even if core.logallrefupdates is
false.

That suggests that two separate operations delete_reflog (for 'git
reflog delete') and truncate_reflog (for 'git reflog expire') would be
needed.

Thanks for catching it.
Jonathan
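
To make the distinction concrete, here is a small sketch of the two operations on a plain file. The names delete_reflog and truncate_reflog come from the suggestion above; the `sketch_` prefix and the bodies are assumptions, using only stdio on an ordinary path with no locking and none of git's refs code.

```c
#include <stdio.h>

/* Returns 1 if the log file can be opened for reading, 0 otherwise. */
static int sketch_reflog_exists(const char *path)
{
	FILE *f = fopen(path, "r");

	if (!f)
		return 0;
	fclose(f);
	return 1;
}

/* 'git reflog delete' analogue: afterwards there is no log file at all. */
static int sketch_delete_reflog(const char *path)
{
	return remove(path) == 0 ? 0 : -1;
}

/*
 * 'git reflog expire' analogue when every entry is old enough: the file
 * stays but becomes empty. Refuses to create a log that did not exist.
 */
static int sketch_truncate_reflog(const char *path)
{
	FILE *f;

	if (!sketch_reflog_exists(path))
		return -1;
	f = fopen(path, "w");	/* "w" truncates an existing file */
	if (!f)
		return -1;
	return fclose(f) == EOF ? -1 : 0;
}
```

After a truncate the empty file still exists, which per the explanation above is what keeps updates to that ref logged even when core.logallrefupdates is false; after a delete it is gone.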


Re: [PATCH/RFC v2] Squashed changes for multiple worktrees vs. submodules

2014-12-08 Thread Max Kirillov
On Mon, Dec 08, 2014 at 09:40:59PM +0100, Jens Lehmann wrote:
> Huh? I think we already have that: If you ignore the url
> config it's as if the submodule was never initialized, so
> you can just *not* run the "git submodule update" command
> at all to get that effect. No new option needed ;-)

You are right. I was thinking about a minimal change to
submodules that would let the user check them out selectively,
but the most minimal one is to just run `submodule update`
selectively. I think no changes to git-submodule are required
within the scope of this feature.

>> btw, have you tried alternates? It does reduce the number of
>> objects you need to keep very strongly. You can put in the
>> alternate store only released branches which are guaranteed
>> to be not force-updated, to avoid issues with missing
>> objects, and it still helps.

> Which is exactly what we do *not* want to do on a CI server,
> its purpose is to endlessly build development branches that
> are force-updated on a regular basis.

Yes, but they still are only somewhat ahead of some stable
branch. And not very much, if you count space: _All_ git
development, with whatever unstable branches, takes 5-10
times less space than its carved in stone history under
`master`.

-- 
Max


Re: [PATCH] git-svn: Support for git-svn propset

2014-12-08 Thread Eric Wong
Alfred Perlstein  wrote:
> Appearing here:
>   http://marc.info/?l=git&m=125259772625008&w=2

Probably better to use a mid URL here, too

http://mid.gmane.org/1927112650.1281253084529659.javamail.r...@klofta.sjsoft.com

such a long URL, though...

> --- a/perl/Git/SVN/Editor.pm
> +++ b/perl/Git/SVN/Editor.pm
> @@ -288,6 +288,44 @@ sub apply_autoprops {
>   }
>  }
>  
> +sub check_attr {
> + my ($attr,$path) = @_;
> + my $fh = command_output_pipe("check-attr", $attr, "--", $path);
> + return undef if (!$fh);
> +
> + my $val = <$fh>;
> + close $fh;
> + if ($val) { $val =~ s/^[^:]*:\s*[^:]*:\s*(.*)\s*$/$1/; }
> + return $val;
> +}

I just noticed command_output_pipe didn't use a corresponding
command_close_pipe to check for errors, but command_oneline is even
better.  I'll squash the following:

--- a/perl/Git/SVN/Editor.pm
+++ b/perl/Git/SVN/Editor.pm
@@ -290,11 +290,7 @@ sub apply_autoprops {
 
 sub check_attr {
my ($attr,$path) = @_;
-   my $fh = command_output_pipe("check-attr", $attr, "--", $path);
-   return undef if (!$fh);
-
-   my $val = <$fh>;
-   close $fh;
+   my $val = command_oneline("check-attr", $attr, "--", $path);
if ($val) { $val =~ s/^[^:]*:\s*[^:]*:\s*(.*)\s*$/$1/; }
return $val;
 }
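
For readers who want to see what the substitution extracts, here is a
minimal shell sketch of the same idea. It relies on "git check-attr"
printing lines of the form "<path>: <attr>: <value>". The helper name
attr_value is hypothetical, and unlike the regex it splits on ": "
(colon plus space), which is an assumption of this sketch.

```shell
#!/bin/sh
# Hypothetical helper: extract the value from one line of
# "git check-attr" output, which looks like "<path>: <attr>: <value>".
# Assumes neither <path> nor <attr> contains the sequence ": ".
attr_value () {
	line=$1
	rest=${line#*: }              # drop "<path>: "
	printf '%s\n' "${rest#*: }"   # drop "<attr>: "
}
```

For example, attr_value 'dir/file: text: unspecified' prints the last
field of that line.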

In your test, "local" isn't portable, unfortunately, but tests seem to
work fine without local so I've removed them:

--- a/t/t9148-git-svn-propset.sh
+++ b/t/t9148-git-svn-propset.sh
@@ -29,10 +29,9 @@ test_expect_success 'fetch revisions from svn' '
git svn fetch
'
 
-set_props()
-{
-   local subdir="$1"
-   local file="$2"
+set_props () {
+   subdir="$1"
+   file="$2"
shift;shift;
(cd "$subdir" &&
while [ $# -gt 0 ] ; do
@@ -43,10 +42,9 @@ set_props()
git commit -m "testing propset" "$file")
 }
 
-confirm_props()
-{
-   local subdir="$1"
-   local file="$2"
+confirm_props () {
+   subdir="$1"
+   file="$2"
shift;shift;
(set -e ; cd "svn_project/$subdir" &&
while [ $# -gt 0 ] ; do
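
Without "local", subdir and file are assigned in the calling shell and
stay set after the function returns. If that ever matters, one portable
alternative is to make the whole function body a subshell. The sketch
below is only an illustration; show_target is a hypothetical name and
is not part of the test script.

```shell
#!/bin/sh
# Hypothetical example: the function body is a subshell "( ... )",
# so its variable assignments never reach the caller and no "local"
# keyword is needed.
show_target () (
	subdir="$1"
	file="$2"
	echo "$subdir/$file"
)
```

The trade-off is that a subshell body cannot change the caller's
variables or working directory, which is fine for helpers that only
run commands and report a status.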

Unless there are other improvements we missed, I'll push out your v3 with
my changes squashed in for Junio to pull in a day or two.  Thank you
again for working on this!


Re: [PATCH/RFC v2] Squashed changes for multiple worktrees vs. submodules

2014-12-08 Thread Jens Lehmann

Am 07.12.2014 um 07:42 schrieb Max Kirillov:

On Sat, Dec 06, 2014 at 02:06:08PM +0100, Jens Lehmann wrote:

Am 05.12.2014 um 07:32 schrieb Max Kirillov:

Currently I'm evaluating an approach where submodules which have a .git
file or directory inside are updated, and those which do not are not.
I have added a config variable submodule.updateIgnoringConfigUrl (because
usually submodule.<name>.url is what turns on the update). It seems to work;
maybe I'll even set the variable when checkout --to is used.



But it's not only submodule.<name>.url, the list goes on with
update, fetch & ignore and then there are the global options
like diff.submodule, diff.ignoreSubmodules and some more.


I believe those parameters are important for some uses, but I
know several tens of git users who have no idea about them,
and I myself only learned about them while working on this.


But we still want to support them all properly, no?


Having a submodule not initialized in some worktree is
what I really need. I was sure before that this was managed by
having the submodule checked out. Probably I just did not
run `submodule update` in the worktree where I did not use
submodules, but I cannot rely on that.  I see now from
211b7f19c7 that adding a parameter for all updates will break
the initialization. Maybe it would be better to have a
runtime argument: `git submodule update --ignore-config-url`


Huh? I think we already have that: If you ignore the url
config it's as if the submodule was never initialized, so
you can just *not* run the "git submodule update" command
at all to get that effect. No new option needed ;-)


Thanks to you and Duy for discussing this with me! I'd sum it
up like this:

*) Multiple worktrees are meant to couple separate worktrees
with a single repository to avoid having to push and fetch
each time to sync refs and also to avoid having to sync
settings manually (with the benefit of some disk space
savings). That's a cool feature and explains why a branch
should be protected against being modified in different
worktrees.


I should note that I am not the author of the feature;
maybe Duy has a different vision.


The first level submodule settings are shared between the
multiple worktrees; submodule objects, settings and refs
aren't (because the .git/modules directory isn't shared).

Looks like that would work with just what we have now, no?


Yes, very much like what I proposed in $gmane/258173, but I
need to have something about preventing checkout. And I
should review what I've done since that, maybe there are
more things to fix.


Hmm, I do not get the "preventing checkout" part. If you ran
"git submodule init " in just one of the multiple work
trees a later "git submodule update" in any of the multiple
work trees will checkout the submodule there. The only way I
can imagine to change that is to implement separate worktree
configurations for each of the multiple worktrees.


*) I'd love to see a solution for sharing the object database
between otherwise unrelated clones of the same project (so
that fetching in one clone updates the objects in the common
dir and gc cannot throw anything away used by one of the
clones). But I'd expect a bare repository as the common one
where we put the worktrees refs into their own namespaces.


There is a GIT_NAMESPACE already, maybe it should be just
extended to work with all commands?


As you already noticed, it isn't a solution for my problem.


btw, have you tried alternates? It does reduce the number of
objects you need to keep very strongly. You can put in the
alternate store only released branches which are guaranteed
to be not force-updated, to avoid issues with missing
objects, and it still helps.


Which is exactly what we do *not* want to do on a CI server,
its purpose is to endlessly build development branches that
are force-updated on a regular basis.


Re: [PATCH 0/8] Making reflog modifications part of the transactions API

2014-12-08 Thread Stefan Beller
So I reviewed and examined this series and ran into some problems.

How do we view an empty reflog, i.e. an empty file at logs/refs/heads/master?
I was told this is not a valid state for a reflog. However, even the
test suite sometimes produces empty reflog files.
Below is a patch which highlights the empty reflog when running
t1301.sh verbosely.


---
 t/t1301-shared-repo.sh | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/t/t1301-shared-repo.sh b/t/t1301-shared-repo.sh
index de42d21..6a35fc0 100755
--- a/t/t1301-shared-repo.sh
+++ b/t/t1301-shared-repo.sh
@@ -113,7 +113,9 @@ done

 test_expect_success POSIXPERM 'git reflog expire honors core.sharedRepository' '
  git config core.sharedRepository group &&
- git reflog expire --all &&
+ git reflog master &&
+ git reflog expire --all --verbose &&
+ git reflog master &&
  actual="$(ls -l .git/logs/refs/heads/master)" &&
  case "$actual" in
  -rw-rw-*)
-- 
2.2.0


Re: Blobs not referenced by file (anymore) are not removed by GC

2014-12-08 Thread Roberto Tyley
Hi Martin, I'm the developer of the BFG - I'd guess that there
probably isn't a bug for Git developers here, so you might want to
open one or more issues at
https://github.com/rtyley/bfg-repo-cleaner/issues, where I'd be happy
to take a look.

best regards,
Roberto

> On 8 Dec 2014 16:35, "Martin Scherer"  wrote:
>>
>> Hi,
>>
>> after using BFG on a repo with certain directory globs, all of those
>> file names are gone from history, but the blobs cannot be collected by
>> garbage collection. So the blobs of the underlying files are not deleted;
>> only the file names are no longer associated with the blobs. I
>> wonder if I discovered a bug (at least in bfg). I would expect git to
>> discover that these blobs are not used in any way (so they have to
>> be associated with something, right?)
>>
>> # invoke bfg --delete-folders something multiple times with different
>> pattern.
>>
>> # try to cleanup
>>
>> git gc --aggressive --prune=now # big blobs still in history
>> git fsck # no results
>> git fsck --full  --unreachable --dangling # no results
>>
>> to verify if the blobs are still there, see the output of
>>
>> git gc && git verify-pack -v .git/objects/pack/pack-*.idx | egrep "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt
>>
>> head bigobjects.txt # outputs 9451427d7335395779b91864418630d2f0af780a
>> blob   7895212 1869047 7657491
>>
>>
>> Also if bfg is being told to remove the biggest blob (bfg -B 1) with
>> no-blob-protection, it does not succeed in removing it.
>>
>> --- output of bfg -B 1
>>
>> Found 1 blob ids for large blobs - biggest=7895212 smallest=7895212
>> 
>>
>> BFG aborting: No refs to update - no dirty commits found??
>> ---
>>
>> The repo can be found here.
>>
>> https://github.com/marscher/stallone_stale_objects
>>
>> I will restart all over to cleanup the history, but I guess this might
>> be interesting for git developers.
>>
>>
>> Best,
>> Martin


Re: [RFC/PATCH 0/5] git-glossary

2014-12-08 Thread Michael J Gruber
Christian Couder schrieb am 08.12.2014 um 17:16:
> On Mon, Dec 8, 2014 at 4:38 PM, Michael J Gruber
>  wrote:
>> More and more people use Git in localised setups, which usually means
>> mixed localisation setups - not only, but also because of our English
>> man pages.
>>
>> Here's an attempt at leveraging our current infrastructure for helping
>> those poor mixed localisation folks. The idea is to keep the most
>> important terms in the glossary and translate at least these.
> 
> If the problem is related to all the man pages, shouldn't the solution
> apply to all the man pages?

Huh? I'm not going to translate the man pages.

This is about providing a localised glossary. It just so happens that we
have a glossary as a man page already, so I'm leveraging it.

>> 1/5: generate glossary term list automatically from gitglossary.txt
>> 2/5: introduce git-glossary command which helps with lookups
> 
> Couldn't you improve git-help ?

I think git help is good as is.

Or do you suggest integrating the glossary lookup command in "git help"?
For my taste, "git help" does a lot of magic already (in terms of
resolving "foo" in "git help foo"). What it does not do is translating.
Integrating "glossary" in "help" would require the use of mode changing
options to get the same functionality as "git glossary" and "git
glossary foo". So, git help is really for help about commands, and git
younameit for localised help about terms.

Michael



[PATCH] Avoid gcc compiler warning

2014-12-08 Thread Johannes Schindelin
At least on this developer's MacOSX (Snow Leopard, gcc-4.2.1), GCC prints
a warning that 'hash' may be used uninitialized when compiling
test-hashmap (but GCC 4.6.3 on this developer's Ubuntu server does not
report this problem).

Since hash() is called from perf_hashmap() which accepts an unchecked
integer value from the command line, the warning appears to be legitimate,
even if the test-hashmap command is only called from the test suite.

Signed-off-by: Johannes Schindelin 
---
 test-hashmap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/test-hashmap.c b/test-hashmap.c
index 07aa7ec..40a126d 100644
--- a/test-hashmap.c
+++ b/test-hashmap.c
@@ -62,6 +62,8 @@ static unsigned int hash(unsigned int method, unsigned int i, const char *key)
case HASH_METHOD_0:
hash = 0;
break;
+   default:
+   die("Unknown method: %d", method);
}
 
if (method & HASH_METHOD_X2)
-- 
2.0.0.rc3.9669.g840d1f9


Blobs not referenced by file (anymore) are not removed by GC

2014-12-08 Thread Martin Scherer
Hi,

after using BFG on a repo with certain directory globs, all of those
file names are gone from history, but the blobs cannot be collected by
garbage collection. So the blobs of the underlying files are not deleted;
only the file names are no longer associated with the blobs. I
wonder if I discovered a bug (at least in bfg). I would expect git to
discover that these blobs are not used in any way (so they have to
be associated with something, right?)

# invoke bfg --delete-folders something multiple times with different
pattern.

# try to cleanup

git gc --aggressive --prune=now # big blobs still in history
git fsck # no results
git fsck --full  --unreachable --dangling # no results

to verify if the blobs are still there, see the output of

git gc && git verify-pack -v .git/objects/pack/pack-*.idx | egrep "^\w+ blob\W+[0-9]+ [0-9]+ [0-9]+$" | sort -k 3 -n -r > bigobjects.txt

head bigobjects.txt # outputs 9451427d7335395779b91864418630d2f0af780a
blob   7895212 1869047 7657491
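
The same filter can be sketched with awk, which avoids the long regular
expression. This is only an illustration under stated assumptions: the
function name largest_blobs is hypothetical, and it relies on non-delta
entries in "git verify-pack -v" output having exactly five fields
(object name, type, size, size in pack, offset).

```shell
#!/bin/sh
# Hypothetical helper: read "git verify-pack -v" output on stdin and
# print the non-delta blob entries, largest first.
# Non-delta entries have five fields:
#   <sha1> <type> <size> <size-in-pack> <offset>
largest_blobs () {
	awk '$2 == "blob" && NF == 5' | sort -k3,3nr
}
```

It would be used as
"git verify-pack -v .git/objects/pack/pack-*.idx | largest_blobs | head".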


Also if bfg is being told to remove the biggest blob (bfg -B 1) with
no-blob-protection, it does not succeed in removing it.

--- output of bfg -B 1

Found 1 blob ids for large blobs - biggest=7895212 smallest=7895212


BFG aborting: No refs to update - no dirty commits found??
---

The repo can be found here.

https://github.com/marscher/stallone_stale_objects

I will restart all over to cleanup the history, but I guess this might
be interesting for git developers.


Best,
Martin


[PATCHv2 1/2] t3200-branch: test -M

2014-12-08 Thread Michael J Gruber
Signed-off-by: Michael J Gruber 
Signed-off-by: Junio C Hamano 
---
 t/t3200-branch.sh | 9 +
 1 file changed, 9 insertions(+)

diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index 432921b..0b3b8f5 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -97,6 +97,15 @@ test_expect_success 'git branch -m o/o o should fail when o/p exists' '
test_must_fail git branch -m o/o o
 '
 
+test_expect_success 'git branch -m o/q o/p should fail when o/p exists' '
+   git branch o/q &&
+   test_must_fail git branch -m o/q o/p
+'
+
+test_expect_success 'git branch -M o/q o/p should work when o/p exists' '
+   git branch -M o/q o/p
+'
+
 test_expect_success 'git branch -m q r/q should fail when r exists' '
git branch q &&
git branch r &&
-- 
2.2.0.345.g7041aac



[PATCHv2 0/2] Make git branch -f forceful

2014-12-08 Thread Michael J Gruber
For many git commands, '-f/--force' is a way to force actions which
would otherwise error out. Way more than once, I've been trying this
with 'git branch -d' and 'git branch -m'...

I've had these two patches sitting in my tree for 3 years now it seems.
Here's a rebase.

In v2 I rename force_create to force and spell out the "-f" behaviour
for other options in the commit message.

Michael J Gruber (2):
  t3200-branch: test -M
  branch: allow -f with -m and -d

 builtin/branch.c  | 13 +
 t/t3200-branch.sh | 14 ++
 2 files changed, 23 insertions(+), 4 deletions(-)

-- 
2.2.0.345.g7041aac



[PATCHv2 2/2] branch: allow -f with -m and -d

2014-12-08 Thread Michael J Gruber
-f/--force is the standard way to force an action, and is used by branch
for the recreation of existing branches, but not for deleting unmerged
branches nor for renaming to an existing branch.

Make "-m -f" equivalent to "-M" and "-d -f" equivalent to "-D", i.e.
allow -f/--force to be used with -m/-d also.

For the list modes, "-f" is simply ignored.
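
The patch gets this effect by doubling the delete and rename counters
when force is set. A minimal shell model of that arithmetic follows; it
assumes the convention that 1 means the plain form (-d / -m) and any
value above 1 means the forced form (-D / -M). The function name
effective_mode is hypothetical.

```shell
#!/bin/sh
# Hypothetical model of the flag handling.
# $1 = mode as parsed: 0 = not requested, 1 = plain (-d / -m),
#      2 = forced (-D / -M)
# $2 = 1 if -f/--force was given, 0 otherwise
effective_mode () {
	mode=$1
	if [ "$2" -ne 0 ]; then
		mode=$((mode * 2))   # 0 stays 0, 1 becomes 2 (forced)
	fi
	echo "$mode"
}
```

Doubling leaves an unrequested mode at 0, so "-f" on its own does not
turn on deletion or renaming.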

Signed-off-by: Michael J Gruber 
---
 builtin/branch.c  | 13 +
 t/t3200-branch.sh |  5 +
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/builtin/branch.c b/builtin/branch.c
index 3b79c50..dc6f0b2 100644
--- a/builtin/branch.c
+++ b/builtin/branch.c
@@ -800,7 +800,7 @@ static int edit_branch_description(const char *branch_name)
 
 int cmd_branch(int argc, const char **argv, const char *prefix)
 {
-   int delete = 0, rename = 0, force_create = 0, list = 0;
+   int delete = 0, rename = 0, force = 0, list = 0;
int verbose = 0, abbrev = -1, detached = 0;
int reflog = 0, edit_description = 0;
int quiet = 0, unset_upstream = 0;
@@ -848,7 +848,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
OPT_BOOL('l', "create-reflog", &reflog, N_("create the branch's reflog")),
OPT_BOOL(0, "edit-description", &edit_description,
 N_("edit the description for the branch")),
-   OPT__FORCE(&force_create, N_("force creation (when already exists)")),
+   OPT__FORCE(&force, N_("force creation, move/rename, deletion")),
{
OPTION_CALLBACK, 0, "no-merged", &merge_filter_ref,
N_("commit"), N_("print only not merged branches"),
@@ -891,7 +891,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
if (with_commit || merge_filter != NO_FILTER)
list = 1;
 
-   if (!!delete + !!rename + !!force_create + !!new_upstream +
+   if (!!delete + !!rename + !!new_upstream +
list + unset_upstream > 1)
usage_with_options(builtin_branch_usage, options);
 
@@ -904,6 +904,11 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
colopts = 0;
}
 
+   if (force) {
+   delete *= 2;
+   rename *= 2;
+   }
+
if (delete) {
if (!argc)
die(_("branch name required"));
@@ -1020,7 +1025,7 @@ int cmd_branch(int argc, const char **argv, const char *prefix)
 
branch_existed = ref_exists(branch->refname);
create_branch(head, argv[0], (argc == 2) ? argv[1] : head,
- force_create, reflog, 0, quiet, track);
+ force, reflog, 0, quiet, track);
 
/*
 * We only show the instructions if the user gave us
diff --git a/t/t3200-branch.sh b/t/t3200-branch.sh
index 0b3b8f5..ddea498 100755
--- a/t/t3200-branch.sh
+++ b/t/t3200-branch.sh
@@ -106,6 +106,11 @@ test_expect_success 'git branch -M o/q o/p should work when o/p exists' '
git branch -M o/q o/p
 '
 
+test_expect_success 'git branch -m -f o/q o/p should work when o/p exists' '
+   git branch o/q &&
+   git branch -m -f o/q o/p
+'
+
 test_expect_success 'git branch -m q r/q should fail when r exists' '
git branch q &&
git branch r &&
-- 
2.2.0.345.g7041aac



Re: [RFC/PATCH 0/5] git-glossary

2014-12-08 Thread Christian Couder
On Mon, Dec 8, 2014 at 4:38 PM, Michael J Gruber
 wrote:
> More and more people use Git in localised setups, which usually means
> mixed localisation setups - not only, but also because of our English
> man pages.
>
> Here's an attempt at leveraging our current infrastructure for helping
> those poor mixed localisation folks. The idea is to keep the most
> important terms in the glossary and translate at least these.

If the problem is related to all the man pages, shouldn't the solution
apply to all the man pages?

> 1/5: generate glossary term list automatically from gitglossary.txt
> 2/5: introduce git-glossary command which helps with lookups

Couldn't you improve git-help ?

Thanks,
Christian.


[PATCH 08/18] Make fsck_commit() warn-friendly

2014-12-08 Thread Johannes Schindelin
When fsck_commit() identifies a problem with the commit, it should try
to make it possible to continue checking the commit object, in case the
user wants to demote the detected errors to mere warnings.

Note that some problems are too problematic to simply ignore. For
example, when the header lines are mixed up, we punt after encountering
an incorrect line. Therefore, demoting certain errors to warnings can
hide other problems. Example: demoting the missing-author error to
a warning would hide a problematic committer line.
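
The control flow can be modelled in a few lines of shell. This is a
sketch of the pattern only, not of fsck.c: report and check_two are
hypothetical stand-ins, and the severity is passed in directly instead
of being looked up from options.

```shell
#!/bin/sh
# Hypothetical model: report() returns non-zero only for problems that
# are still classified as errors, so the caller can keep checking after
# anything that was demoted to a warning.
# $1 = severity (error, warn or ignore), $2 = message
report () {
	case $1 in
	ignore) return 0 ;;
	warn)   echo "warning: $2"; return 0 ;;
	*)      echo "error: $2"; return 1 ;;
	esac
}

# Run two checks in order and stop at the first one that is still an
# error.
# $1 = severity of the first problem, $2 = severity of the second
check_two () {
	report "$1" "bad tree sha1" || return 1
	report "$2" "bad parent sha1" || return 1
	echo "done"
}
```

With both problems demoted, check_two warn warn prints two warnings and
reaches "done"; with check_two error warn it stops after the first
message.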

Signed-off-by: Johannes Schindelin 
---
 fsck.c | 28 
 1 file changed, 20 insertions(+), 8 deletions(-)

diff --git a/fsck.c b/fsck.c
index 256f567..a63654c 100644
--- a/fsck.c
+++ b/fsck.c
@@ -499,12 +499,18 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
 
if (!skip_prefix(buffer, "tree ", &buffer))
return report(options, &commit->object, FSCK_MSG_MISSING_TREE, "invalid format - expected 'tree' line");
-   if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n')
-   return report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+   if (get_sha1_hex(buffer, tree_sha1) || buffer[40] != '\n') {
+   err = report(options, &commit->object, FSCK_MSG_BAD_TREE_SHA1, "invalid 'tree' line format - bad sha1");
+   if (err)
+   return err;
+   }
buffer += 41;
while (skip_prefix(buffer, "parent ", &buffer)) {
-   if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
-   return report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+   if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
+   err = report(options, &commit->object, FSCK_MSG_BAD_PARENT_SHA1, "invalid 'parent' line format - bad sha1");
+   if (err)
+   return err;
+   }
buffer += 41;
parent_line_count++;
}
@@ -513,11 +519,17 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
if (graft) {
if (graft->nr_parent == -1 && !parent_count)
; /* shallow commit */
-   else if (graft->nr_parent != parent_count)
-   return report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+   else if (graft->nr_parent != parent_count) {
+   err = report(options, &commit->object, FSCK_MSG_MISSING_GRAFT, "graft objects missing");
+   if (err)
+   return err;
+   }
} else {
-   if (parent_count != parent_line_count)
-   return report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+   if (parent_count != parent_line_count) {
+   err = report(options, &commit->object, FSCK_MSG_MISSING_PARENT, "parent objects missing");
+   if (err)
+   return err;
+   }
}
if (!skip_prefix(buffer, "author ", &buffer))
return report(options, &commit->object, FSCK_MSG_MISSING_AUTHOR, "invalid format - expected 'author' line");
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 15/18] Document the new receive.fsck.* options.

2014-12-08 Thread Johannes Schindelin
Signed-off-by: Johannes Schindelin 
---
 Documentation/config.txt | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 7deae0b..b3276ee 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2109,6 +2109,20 @@ receive.fsckObjects::
Defaults to false. If not set, the value of `transfer.fsckObjects`
is used instead.
 
+receive.fsck.*::
+   When `receive.fsckObjects` is set to true, errors can be switched
+   to warnings and vice versa by setting e.g. `receive.fsck.bad-name`
+   to `warn` or `error` (or `ignore` to hide those errors
+   completely). For convenience, fsck prefixes the error/warning
+   with the name of the option, e.g. "missing-email: invalid
+   author/committer line - missing email" means that setting
+   `receive.fsck.missing-email` to `ignore` will hide that issue.
+   For convenience, camelCased options are accepted, too (e.g.
+   `receive.fsck.missingEmail`).
++
+This feature is intended to support working with legacy repositories
+which would not pass pushing when `receive.fsckObjects = true`.
+
 receive.unpackLimit::
If the number of objects received in a push is below this
limit then the objects will be unpacked into loose object
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 18/18] git receive-pack: support excluding objects from fsck'ing

2014-12-08 Thread Johannes Schindelin
The optional new config option `receive.fsck.skip-list` specifies the path
to a file listing the names, i.e. SHA-1s, one per line, of objects that
are to be ignored by `git receive-pack` when `receive.fsckObjects = true`.

This is extremely handy in case of legacy repositories where it would
cause more pain to change incorrect objects than to live with them
(e.g. a duplicate 'author' line in an early commit object).

The intended use case is for server administrators to inspect objects
that are reported by `git push` as being too problematic to enter the
repository, and to add the objects' SHA-1 to a (preferably sorted) file
when the objects are legitimate, i.e. when it is determined that those
problematic objects should be allowed to enter the server.
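
As written, the patch reads the file in fixed 41-byte records (40 hex
digits plus a newline), so blank lines and comments are not accepted.
An administrator could sanity-check a candidate file before pointing
the config option at it. The sketch below is one way to do that; the
function name check_skip_list is hypothetical, and it assumes
lower-case hex so that byte-wise sort order matches the order of the
object names.

```shell
#!/bin/sh
# Hypothetical helper: check that a skip-list file holds one
# 40-character lower-case hexadecimal object name per line, and say
# whether it is already sorted.
# $1 = path to the skip-list file
check_skip_list () {
	if [ ! -r "$1" ]; then
		echo unreadable
		return 1
	fi
	if grep -Evq '^[0-9a-f]{40}$' "$1"; then
		echo invalid      # at least one line is not a 40-digit hex name
	elif LC_ALL=C sort -c "$1" 2>/dev/null; then
		echo sorted
	else
		echo unsorted
	fi
}
```

An unsorted file is still usable according to the commit message; a
sorted one is only preferred.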

Signed-off-by: Johannes Schindelin 
---
 builtin/receive-pack.c  |  9 +++
 fsck.c  | 59 +++--
 fsck.h  |  2 ++
 t/t5504-fetch-receive-strict.sh | 12 +
 4 files changed, 80 insertions(+), 2 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 111e514..5169f1f 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -110,6 +110,15 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
return 0;
}
 
+   if (starts_with(var, "receive.fsck.skip-list")) {
+   const char *path = is_absolute_path(value) ?
+   value : git_path("%s", value);
+   if (fsck_strict_mode.len)
+   strbuf_addch(&fsck_strict_mode, ',');
+   strbuf_addf(&fsck_strict_mode, "skip-list=%s", path);
+   return 0;
+   }
+
if (starts_with(var, "receive.fsck.")) {
if (fsck_strict_mode.len)
strbuf_addch(&fsck_strict_mode, ',');
diff --git a/fsck.c b/fsck.c
index 154f361..00693f2 100644
--- a/fsck.c
+++ b/fsck.c
@@ -7,6 +7,7 @@
 #include "tag.h"
 #include "fsck.h"
 #include "refs.h"
+#include "sha1-array.h"
 
 #define FOREACH_MSG_ID(FUNC) \
/* fatal errors */ \
@@ -56,7 +57,9 @@
FUNC(ZERO_PADDED_FILEMODE) \
/* infos (reported as warnings, but ignored by default) */ \
FUNC(INVALID_TAG_NAME) \
-   FUNC(MISSING_TAGGER_ENTRY)
+   FUNC(MISSING_TAGGER_ENTRY) \
+   /* special value */ \
+   FUNC(SKIP_LIST)
 
 #define FIRST_NON_FATAL_ERROR FSCK_MSG_BAD_DATE
 #define FIRST_WARNING FSCK_MSG_BAD_FILEMODE
@@ -109,6 +112,43 @@ int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
 }
 
+static void init_skip_list(struct fsck_options *options, const char *path)
+{
+   static struct sha1_array skip_list = SHA1_ARRAY_INIT;
+   int sorted, fd;
+   char buffer[41];
+   unsigned char sha1[20];
+
+   if (options->skip_list)
+   sorted = options->skip_list->sorted;
+   else {
+   sorted = 1;
+   options->skip_list = &skip_list;
+   }
+
+   fd = open(path, O_RDONLY);
+   if (fd < 0)
+   die("Could not open skip list: %s", path);
+   for (;;) {
+   int result = read_in_full(fd, buffer, sizeof(buffer));
+   if (result < 0)
+   die_errno("Could not read '%s'", path);
+   if (!result)
+   break;
+   if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n')
+   die("Invalid SHA-1: %s", buffer);
+   sha1_array_append(&skip_list, sha1);
+   if (sorted && skip_list.nr > 1 &&
+   hashcmp(skip_list.sha1[skip_list.nr - 2],
+   sha1) > 0)
+   sorted = 0;
+   }
+   close(fd);
+
+   if (sorted)
+   skip_list.sorted = 1;
+}
+
 static inline int substrcmp(const char *string, int len, const char *match)
 {
int match_len = strlen(match);
@@ -141,6 +181,18 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
if (mode[equal] == '=')
break;
 
+   msg_id = parse_msg_id(mode, equal);
+   if (msg_id == FSCK_MSG_SKIP_LIST) {
+   char *path = xstrndup(mode + equal + 1, len - equal - 1);
+
+   if (equal == len)
+   die("skip-list requires a path");
+   init_skip_list(options, path);
+   free(path);
+   mode += len;
+   continue;
+   }
+
if (equal < len) {
const char *type_str = mode + equal + 1;
int type_len = len - equal - 1;
@@ -155,7 +207,6 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
  

[PATCH 14/18] fsck: allow upgrading fsck warnings to errors

2014-12-08 Thread Johannes Schindelin
The 'invalid tag name' and 'missing tagger entry' warnings can now be
upgraded to errors by setting receive.fsck.invalid-tag-name and
receive.fsck.missing-tagger-entry to 'error'.

Incidentally, the missing tagger warning is now really shown as a warning
(as opposed to being reported with the "error:" prefix, as it used to be
the case before this commit).

Signed-off-by: Johannes Schindelin 
---
 fsck.c| 24 
 t/t5302-pack-index.sh |  2 +-
 2 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/fsck.c b/fsck.c
index abfd3af..154f361 100644
--- a/fsck.c
+++ b/fsck.c
@@ -52,13 +52,15 @@
FUNC(HAS_DOT) \
FUNC(HAS_DOTDOT) \
FUNC(HAS_DOTGIT) \
-   FUNC(INVALID_TAG_NAME) \
-   FUNC(MISSING_TAGGER_ENTRY) \
FUNC(NULL_SHA1) \
-   FUNC(ZERO_PADDED_FILEMODE)
+   FUNC(ZERO_PADDED_FILEMODE) \
+   /* infos (reported as warnings, but ignored by default) */ \
+   FUNC(INVALID_TAG_NAME) \
+   FUNC(MISSING_TAGGER_ENTRY)
 
 #define FIRST_NON_FATAL_ERROR FSCK_MSG_BAD_DATE
 #define FIRST_WARNING FSCK_MSG_BAD_FILEMODE
+#define FIRST_INFO FSCK_MSG_INVALID_TAG_NAME
 
 #define MSG_ID(x) FSCK_MSG_##x,
 enum fsck_msg_id {
@@ -103,7 +105,7 @@ int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
if (options->strict_mode && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
return options->strict_mode[msg_id];
if (options->strict)
-   return FSCK_ERROR;
+   return msg_id < FIRST_INFO ? FSCK_ERROR : FSCK_WARN;
return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
 }
 
@@ -643,13 +645,19 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
goto done;
}
strbuf_addf(&sb, "refs/tags/%.*s", (int)(eol - buffer), buffer);
-   if (check_refname_format(sb.buf, 0))
-   report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME, "invalid 'tag' name: %s", buffer);
+   if (check_refname_format(sb.buf, 0)) {
+   ret = report(options, &tag->object, FSCK_MSG_INVALID_TAG_NAME, "invalid 'tag' name: %s", buffer);
+   if (ret)
+   goto done;
+   }
buffer = eol + 1;
 
-   if (!skip_prefix(buffer, "tagger ", &buffer))
+   if (!skip_prefix(buffer, "tagger ", &buffer)) {
/* early tags do not contain 'tagger' lines; warn only */
-   report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+   ret = report(options, &tag->object, FSCK_MSG_MISSING_TAGGER_ENTRY, "invalid format - expected 'tagger' line");
+   if (ret)
+   goto done;
+   }
else
ret = fsck_ident(&buffer, &tag->object, options);
 
diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
index 61bc8da..3dc5ec4 100755
--- a/t/t5302-pack-index.sh
+++ b/t/t5302-pack-index.sh
@@ -259,7 +259,7 @@ EOF
 thirtyeight=${tag#??} &&
 rm -f .git/objects/${tag%$thirtyeight}/$thirtyeight &&
 git index-pack --strict tag-test-${pack1}.pack 2>err &&
-grep "^error:.* expected .tagger. line" err
+grep "^warning:.* expected .tagger. line" err
 '
 
 test_done
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 17/18] Introduce `git fsck --quick`

2014-12-08 Thread Johannes Schindelin
This option avoids unpacking each and every object, and just verifies the
connectivity. Particularly with large repositories, this speeds up the
operation, at the expense of missing corrupt blobs and ignoring
unreachable objects, if any.

Signed-off-by: Johannes Schindelin 
---
 Documentation/git-fsck.txt |  7 ++-
 builtin/fsck.c |  7 ++-
 t/t1450-fsck.sh| 22 ++
 3 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 25c431d..b98fb43 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-[--[no-]full] [--strict] [--verbose] [--lost-found]
+[--[no-]full] [--quick] [--strict] [--verbose] [--lost-found]
 [--[no-]dangling] [--[no-]progress] [<object>*]
 
 DESCRIPTION
@@ -60,6 +60,11 @@ index file, all SHA-1 references in `refs` namespace, and all reflogs
object pools.  This is now default; you can turn it off
with --no-full.
 
+--quick::
+   Check only the connectivity of tags, commits and tree objects. By
+   avoiding to unpack blobs, this speeds up the operation, at the
+   expense of missing corrupt objects.
+
 --strict::
Enable more strict checking, namely to catch a file mode
recorded with g+w bit set, which was created by older
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2b8faa4..dcea9b0 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -23,6 +23,7 @@ static int show_tags;
 static int show_unreachable;
 static int include_reflogs = 1;
 static int check_full = 1;
+static int quick;
 static int check_strict;
 static int keep_cache_objects;
 static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
@@ -184,6 +185,8 @@ static void check_reachable_object(struct object *obj)
if (!(obj->flags & HAS_OBJ)) {
if (has_sha1_pack(obj->sha1))
return; /* it is in pack - forget about it */
+   if (quick && has_sha1_file(obj->sha1))
+   return;
printf("missing %s %s\n", typename(obj->type), sha1_to_hex(obj->sha1));
errors_found |= ERROR_REACHABLE;
return;
@@ -618,6 +621,7 @@ static struct option fsck_opts[] = {
OPT_BOOL(0, "cache", &keep_cache_objects, N_("make index objects head nodes")),
OPT_BOOL(0, "reflogs", &include_reflogs, N_("make reflogs head nodes (default)")),
OPT_BOOL(0, "full", &check_full, N_("also consider packs and alternate objects")),
+   OPT_BOOL(0, "quick", &quick, N_("check only connectivity")),
OPT_BOOL(0, "strict", &check_strict, N_("enable more strict checking")),
OPT_BOOL(0, "lost-found", &write_lost_and_found,
N_("write dangling objects in .git/lost-found")),
@@ -654,7 +658,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
git_config(fsck_config, NULL);
 
fsck_head_link();
-   fsck_object_dir(get_object_directory());
+   if (!quick)
+   fsck_object_dir(get_object_directory());
 
prepare_alt_odb();
for (alt = alt_odb_list; alt; alt = alt->next) {
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index d74df19..d389d4a 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -407,4 +407,26 @@ test_expect_success 'fsck notices ref pointing to missing tag' '
test_must_fail git -C missing fsck
 '
 
+test_expect_success 'fsck --quick' '
+   rm -rf quick &&
+   git init quick &&
+   (
+   cd quick &&
+   touch empty &&
+   git add empty &&
+   test_commit empty &&
+   empty=.git/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 &&
+   rm -f $empty &&
+   echo invalid >$empty &&
+   test_must_fail git fsck --strict &&
+   git fsck --strict --quick &&
+   tree=$(git rev-parse HEAD:) &&
+   suffix=${tree#??} &&
+   tree=.git/objects/${tree%$suffix}/$suffix &&
+   rm -f $tree &&
+   echo invalid >$tree &&
+   test_must_fail git fsck --strict --quick
+   )
+'
+
 test_done
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 16/18] fsck: support demoting errors to warnings

2014-12-08 Thread Johannes Schindelin
We already have support in `git receive-pack` to deal with some legacy
repositories which have non-fatal issues.

Let's make `git fsck` itself useful with such repositories, too, by
allowing users to ignore known issues, or at least demote those issues
to mere warnings.

Example: `git -c fsck.missing-email=ignore fsck` would hide problems with
missing emails in author, committer and tagger lines.

Signed-off-by: Johannes Schindelin 
---
 Documentation/config.txt | 13 +
 builtin/fsck.c   | 15 +++
 t/t1450-fsck.sh  | 11 +++
 3 files changed, 39 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b3276ee..fa58c26 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1192,6 +1192,19 @@ filter.<driver>.smudge::
object to a worktree file upon checkout.  See
linkgit:gitattributes[5] for details.
 
+fsck.*::
+   With these options, fsck errors can be switched to warnings and
+   vice versa by setting e.g. `fsck.bad-name` to `warn` or `error`
+   (or `ignore` to hide those errors completely). For convenience,
+   fsck prefixes the error/warning with the name of the option, e.g.
+   "missing-email: invalid author/committer line - missing email"
+   means that setting `fsck.missing-email` to `ignore` will hide that
+   issue.  For convenience, camelCased options are accepted, too (e.g.
+   `fsck.missingEmail`).
++
+This feature is intended to support working with legacy repositories
+which cannot be repaired without disruptive changes.
+
 gc.aggressiveDepth::
The depth parameter used in the delta compression
algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 99d4538..2b8faa4 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,6 +46,19 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)->d_ino)
 #endif
 
+static int fsck_config(const char *var, const char *value, void *cb)
+{
+   if (starts_with(var, "fsck.")) {
+   struct strbuf sb = STRBUF_INIT;
+   strbuf_addf(&sb, "%s=%s", var + 5, value ? value : "error");
+   fsck_strict_mode(&fsck_obj_options, sb.buf);
+   strbuf_release(&sb);
+   return 0;
+   }
+
+   return git_default_config(var, value, cb);
+}
+
 static void objreport(struct object *obj, const char *severity,
   const char *err)
 {
@@ -638,6 +651,8 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
include_reflogs = 0;
}
 
+   git_config(fsck_config, NULL);
+
fsck_head_link();
fsck_object_dir(get_object_directory());
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 019fddd..d74df19 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -283,6 +283,17 @@ test_expect_success 'rev-list --verify-objects with bad sha1' '
grep -q "error: sha1 mismatch 63ff" 
out
 '
 
+test_expect_success 'force fsck to ignore double author' '
+   git cat-file commit HEAD >basis &&
+   sed "s/^author .*/&,&/" multiple-authors &&
+   new=$(git hash-object -t commit -w --stdin [remainder of this patch is missing from the archive]


[PATCH 06/18] fsck: report the ID of the error/warning

2014-12-08 Thread Johannes Schindelin
Some legacy code has objects with non-fatal fsck issues; to enable the
user to ignore those issues, let's print out the ID (e.g. when
encountering "missing-email", the user might want to call `git config
receive.fsck.missing-email warn`).
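
For illustration only (not part of the patch): a minimal standalone sketch of the ID-to-prefix conversion described above, writing into a caller-supplied buffer instead of a strbuf. The function name `demo_format_msg_id` and the buffer-based interface are assumptions made for this example.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Hypothetical standalone variant of append_msg_id(): writes the
 * lower-cased form of `id`, with '_' replaced by '-', followed by ": "
 * into `out`. At most `outlen` bytes are written and the result is
 * always NUL-terminated when outlen > 0; an ID that does not fit is
 * truncated.
 */
void demo_format_msg_id(const char *id, char *out, size_t outlen)
{
    size_t n = 0;

    if (!outlen)
        return;
    /* Keep room for the character itself, ": " and the final NUL. */
    for (; *id && n + 3 < outlen; id++)
        out[n++] = (*id == '_') ? '-' : (char)tolower((unsigned char)*id);
    if (n + 2 < outlen) {
        out[n++] = ':';
        out[n++] = ' ';
    }
    out[n] = '\0';
}
```

Calling it with "MISSING_EMAIL" yields the prefix "missing-email: ", which is the part of the message a user would copy into the config key.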

Signed-off-by: Johannes Schindelin 
---
 fsck.c | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/fsck.c b/fsck.c
index 9e6d70f..ff50a87 100644
--- a/fsck.c
+++ b/fsck.c
@@ -154,6 +154,23 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
}
 }
 
+static void append_msg_id(struct strbuf *sb, const char *msg_id)
+{
+   for (;;) {
+   char c = *(msg_id)++;
+
+   if (!c)
+   break;
+   if (c == '_')
+   c = '-';
+   else
+   c = tolower(c);
+   strbuf_addch(sb, c);
+   }
+
+   strbuf_addstr(sb, ": ");
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
enum fsck_msg_id id, const char *fmt, ...)
@@ -162,6 +179,8 @@ static int report(struct fsck_options *options, struct object *object,
struct strbuf sb = STRBUF_INIT;
int msg_type = fsck_msg_type(id, options), result;
 
+   append_msg_id(&sb, msg_id_str[id]);
+
va_start(ap, fmt);
strbuf_vaddf(&sb, fmt, ap);
result = options->error_func(object, msg_type, sb.buf);
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 11/18] Add a simple test for receive.fsck.*

2014-12-08 Thread Johannes Schindelin
Signed-off-by: Johannes Schindelin 
---
 t/t5504-fetch-receive-strict.sh | 20 
 1 file changed, 20 insertions(+)

diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 69ee13c..db79e56 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -115,4 +115,24 @@ test_expect_success 'push with transfer.fsckobjects' '
test_cmp exp act
 '
 
+cat >bogus-commit << EOF
+tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
+author Bugs Bunny 1234567890 +
+committer Bugs Bunny  1234567890 +
+
+This commit object intentionally broken
+EOF
+
+test_expect_success 'push with receive.fsck.missing-mail = warn' '
+   commit="$(git hash-object -t commit -w --stdin < bogus-commit)" &&
+   git push . $commit:refs/heads/bogus &&
+   rm -rf dst &&
+   git init dst &&
+   git --git-dir=dst/.git config receive.fsckobjects true &&
+   test_must_fail git push --porcelain dst bogus &&
+   git --git-dir=dst/.git config receive.fsck.missing-email warn &&
+   git push --porcelain dst bogus >act 2>&1 &&
+   grep "missing-email" act
+'
+
 test_done
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 13/18] Optionally ignore specific fsck issues completely

2014-12-08 Thread Johannes Schindelin
An fsck issue in a legacy repository might be so common that one would
like not to bother the user with mentioning it at all. With this change,
that is possible by setting the respective error to "ignore".

This change "abuses" the missing-email=warn test to verify that "ignore"
is also accepted and works correctly. And while at it, it makes sure
that multiple options work, too (they are passed to unpack-objects or
index-pack as a comma-separated list via the --strict=... command-line
option).

Signed-off-by: Johannes Schindelin 
---
 fsck.c  | 5 +
 fsck.h  | 1 +
 t/t5504-fetch-receive-strict.sh | 7 ++-
 3 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index f8339af..abfd3af 100644
--- a/fsck.c
+++ b/fsck.c
@@ -146,6 +146,8 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
type = FSCK_ERROR;
else if (!substrcmp(type_str, type_len, "warn"))
type = FSCK_WARN;
+   else if (!substrcmp(type_str, type_len, "ignore"))
+   type = FSCK_IGNORE;
else
die("Unknown fsck message type: '%.*s'",
len - equal - 1, type_str);
@@ -184,6 +186,9 @@ static int report(struct fsck_options *options, struct object *object,
struct strbuf sb = STRBUF_INIT;
int msg_type = fsck_msg_type(id, options), result;
 
+   if (msg_type == FSCK_IGNORE)
+   return 0;
+
append_msg_id(&sb, msg_id_str[id]);
 
va_start(ap, fmt);
diff --git a/fsck.h b/fsck.h
index 9d67ea2..82bedf9 100644
--- a/fsck.h
+++ b/fsck.h
@@ -3,6 +3,7 @@
 
 #define FSCK_ERROR 1
 #define FSCK_WARN 2
+#define FSCK_IGNORE 3
 
 struct fsck_options;
 
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index 8a47db2..0e521d9 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -132,7 +132,12 @@ test_expect_success 'push with receive.fsck.missing-mail = warn' '
test_must_fail git push --porcelain dst bogus &&
git --git-dir=dst/.git config receive.fsck.missing-email warn &&
git push --porcelain dst bogus >act 2>&1 &&
-   grep "missing-email" act
+   grep "missing-email" act &&
+   git --git-dir=dst/.git branch -D bogus &&
+   git  --git-dir=dst/.git config receive.fsck.missing-email ignore &&
+   git  --git-dir=dst/.git config receive.fsck.bad-date warn &&
+   git push --porcelain dst bogus >act 2>&1 &&
+   test_must_fail grep "missing-email" act
 '
 
 test_expect_success 'receive.fsck.unterminated-header = warn triggers error' '
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 12/18] Disallow demoting grave fsck errors to warnings

2014-12-08 Thread Johannes Schindelin
Some kinds of errors are intrinsically unrecoverable (e.g. errors while
uncompressing objects). It does not make sense to allow demoting them to
mere warnings.
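
For illustration only (not part of the patch): a minimal standalone sketch of the guard, relying on the fatal IDs being declared first so that a single comparison identifies them. The shortened ID list and all `demo_`/`DEMO_` names are assumptions made for this example.

```c
/* Hypothetical, shortened ID list; the order encodes the severity class. */
enum demo_msg_id {
    /* fatal errors */
    DEMO_NUL_IN_HEADER,
    DEMO_UNTERMINATED_HEADER,
    /* errors */
    DEMO_BAD_DATE,
    DEMO_MISSING_EMAIL,
    /* warnings */
    DEMO_BAD_FILEMODE,
    DEMO_MSG_MAX
};
#define DEMO_FIRST_NON_FATAL_ERROR DEMO_BAD_DATE

enum { DEMO_TYPE_ERROR = 1, DEMO_TYPE_WARN = 2 };

/*
 * Returns 1 if `id` may be set to `type`, or 0 if doing so would demote
 * a fatal error (the patch calls die() in that case).
 */
int demo_may_set_type(enum demo_msg_id id, int type)
{
    if (type != DEMO_TYPE_ERROR && id < DEMO_FIRST_NON_FATAL_ERROR)
        return 0;
    return 1;
}
```

Setting a fatal ID to the error type is still accepted, since that does not demote anything.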

Signed-off-by: Johannes Schindelin 
---
 fsck.c  | 8 ++--
 t/t5504-fetch-receive-strict.sh | 9 +
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index c1e7a85..f8339af 100644
--- a/fsck.c
+++ b/fsck.c
@@ -9,6 +9,9 @@
 #include "refs.h"
 
 #define FOREACH_MSG_ID(FUNC) \
+   /* fatal errors */ \
+   FUNC(NUL_IN_HEADER) \
+   FUNC(UNTERMINATED_HEADER) \
/* errors */ \
FUNC(BAD_DATE) \
FUNC(BAD_EMAIL) \
@@ -39,10 +42,8 @@
FUNC(MISSING_TYPE_ENTRY) \
FUNC(MULTIPLE_AUTHORS) \
FUNC(NOT_SORTED) \
-   FUNC(NUL_IN_HEADER) \
FUNC(TAG_OBJECT_NOT_TAG) \
FUNC(UNKNOWN_TYPE) \
-   FUNC(UNTERMINATED_HEADER) \
FUNC(ZERO_PADDED_DATE) \
/* warnings */ \
FUNC(BAD_FILEMODE) \
@@ -56,6 +57,7 @@
FUNC(NULL_SHA1) \
FUNC(ZERO_PADDED_FILEMODE)
 
+#define FIRST_NON_FATAL_ERROR FSCK_MSG_BAD_DATE
 #define FIRST_WARNING FSCK_MSG_BAD_FILEMODE
 
 #define MSG_ID(x) FSCK_MSG_##x,
@@ -150,6 +152,8 @@ void fsck_strict_mode(struct fsck_options *options, const char *mode)
}
 
msg_id = parse_msg_id(mode, equal);
+   if (type != FSCK_ERROR && msg_id < FIRST_NON_FATAL_ERROR)
+   die("Cannot demote %.*s", len, mode);
options->strict_mode[msg_id] = type;
mode += len;
}
diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
index db79e56..8a47db2 100755
--- a/t/t5504-fetch-receive-strict.sh
+++ b/t/t5504-fetch-receive-strict.sh
@@ -135,4 +135,13 @@ test_expect_success 'push with receive.fsck.missing-mail = warn' '
grep "missing-email" act
 '
 
+test_expect_success 'receive.fsck.unterminated-header = warn triggers error' '
+   rm -rf dst &&
+   git init dst &&
+   git --git-dir=dst/.git config receive.fsckobjects true &&
+   git --git-dir=dst/.git config receive.fsck.unterminated-header warn &&
+   test_must_fail git push --porcelain dst HEAD >act 2>&1 &&
+   grep "Cannot demote unterminated-header=warn" act
+'
+
 test_done
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 07/18] Make fsck_ident() warn-friendly

2014-12-08 Thread Johannes Schindelin
When fsck_ident() identifies a problem with the ident, it should still
advance the pointer to the next line so that fsck can continue in the
case of a mere warning.
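
For illustration only (not part of the patch): a minimal standalone sketch of the "advance first, validate with a local pointer" pattern. The single check performed here and the name `demo_check_ident` are assumptions made for this example; plain strchr() is used instead of strchrnul() to keep the sketch portable.

```c
#include <string.h>

/*
 * Hypothetical mini-check: reports whether the ident line at *line
 * contains a '<' before any '>' or newline. Regardless of the outcome,
 * *line is advanced past the terminating newline (or to the final NUL),
 * so the caller can continue with the next header line.
 */
int demo_check_ident(const char **line)
{
    const char *p = *line;
    const char *eol = strchr(p, '\n');

    /* Advance the caller's pointer first; every return below keeps it. */
    *line = eol ? eol + 1 : p + strlen(p);

    p += strcspn(p, "<>\n");
    if (*p != '<')
        return -1;  /* corresponds to "missing email" */
    return 0;
}
```

Because the caller's pointer is updated before any check runs, an early return for a demoted problem no longer leaves it in the middle of the broken line.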

Signed-off-by: Johannes Schindelin 
---
 fsck.c | 49 +++--
 1 file changed, 27 insertions(+), 22 deletions(-)

diff --git a/fsck.c b/fsck.c
index ff50a87..256f567 100644
--- a/fsck.c
+++ b/fsck.c
@@ -444,40 +444,45 @@ static int require_end_of_header(const void *data, unsigned long size,
 
 static int fsck_ident(const char **ident, struct object *obj, struct fsck_options *options)
 {
+   const char *p = *ident;
char *end;
 
-   if (**ident == '<')
+   *ident = strchrnul(*ident, '\n');
+   if (**ident == '\n')
+   (*ident)++;
+
+   if (*p == '<')
return report(options, obj, FSCK_MSG_MISSING_NAME_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-   *ident += strcspn(*ident, "<>\n");
-   if (**ident == '>')
+   p += strcspn(p, "<>\n");
+   if (*p == '>')
return report(options, obj, FSCK_MSG_BAD_NAME, "invalid author/committer line - bad name");
-   if (**ident != '<')
+   if (*p != '<')
return report(options, obj, FSCK_MSG_MISSING_EMAIL, "invalid author/committer line - missing email");
-   if ((*ident)[-1] != ' ')
+   if (p[-1] != ' ')
return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_EMAIL, "invalid author/committer line - missing space before email");
-   (*ident)++;
-   *ident += strcspn(*ident, "<>\n");
-   if (**ident != '>')
+   p++;
+   p += strcspn(p, "<>\n");
+   if (*p != '>')
return report(options, obj, FSCK_MSG_BAD_EMAIL, "invalid author/committer line - bad email");
-   (*ident)++;
-   if (**ident != ' ')
+   p++;
+   if (*p != ' ')
return report(options, obj, FSCK_MSG_MISSING_SPACE_BEFORE_DATE, "invalid author/committer line - missing space before date");
-   (*ident)++;
-   if (**ident == '0' && (*ident)[1] != ' ')
+   p++;
+   if (*p == '0' && p[1] != ' ')
return report(options, obj, FSCK_MSG_ZERO_PADDED_DATE, "invalid author/committer line - zero-padded date");
-   if (date_overflows(strtoul(*ident, &end, 10)))
+   if (date_overflows(strtoul(p, &end, 10)))
return report(options, obj, FSCK_MSG_DATE_OVERFLOW, "invalid author/committer line - date causes integer overflow");
-   if (end == *ident || *end != ' ')
+   if ((end == p || *end != ' '))
return report(options, obj, FSCK_MSG_BAD_DATE, "invalid author/committer line - bad date");
-   *ident = end + 1;
-   if ((**ident != '+' && **ident != '-') ||
-   !isdigit((*ident)[1]) ||
-   !isdigit((*ident)[2]) ||
-   !isdigit((*ident)[3]) ||
-   !isdigit((*ident)[4]) ||
-   ((*ident)[5] != '\n'))
+   p = end + 1;
+   if ((*p != '+' && *p != '-') ||
+   !isdigit(p[1]) ||
+   !isdigit(p[2]) ||
+   !isdigit(p[3]) ||
+   !isdigit(p[4]) ||
+   (p[5] != '\n'))
return report(options, obj, FSCK_MSG_BAD_TIMEZONE, "invalid author/committer line - bad time zone");
-   (*ident) += 6;
+   p += 6;
return 0;
 }
 
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 09/18] fsck: handle multiple authors in commits specially

2014-12-08 Thread Johannes Schindelin
This problem has been detected in the wild, and is the primary reason
to introduce an option to demote certain fsck errors to warnings. Let's
offer to ignore this particular problem specifically.

Technically, we could handle such repositories by setting
missing-committer = warn, but that could hide missing tree objects in the
same commit because we cannot continue verifying any commit object after
encountering a missing committer line, while we can continue in the case
of multiple author lines.
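
For illustration only (not part of the patch): a minimal standalone sketch of skipping surplus 'author' lines so that verification can continue with the 'committer' line. The helper name `demo_skip_extra_authors` and its counting return value are assumptions made for this example; strncmp() stands in for git's skip_prefix().

```c
#include <string.h>

/*
 * Hypothetical helper: *buffer points just past the first, already
 * verified "author" line. Skips any further "author " lines and returns
 * how many were skipped. Like the patch, it expects every header line to
 * end in a newline; an unterminated line stops the loop.
 */
int demo_skip_extra_authors(const char **buffer)
{
    int extra = 0;

    while (!strncmp(*buffer, "author ", 7)) {
        const char *eol = strchr(*buffer, '\n');

        if (!eol)
            break;  /* defensive: no newline, nothing more to skip */
        *buffer = eol + 1;
        extra++;
    }
    return extra;
}
```

After the call, the buffer points at whatever follows the duplicated lines, so a subsequent 'committer' check sees the right line.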

Signed-off-by: Johannes Schindelin 
---
 fsck.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/fsck.c b/fsck.c
index a63654c..21ff35b 100644
--- a/fsck.c
+++ b/fsck.c
@@ -37,6 +37,7 @@
FUNC(MISSING_TREE) \
FUNC(MISSING_TYPE) \
FUNC(MISSING_TYPE_ENTRY) \
+   FUNC(MULTIPLE_AUTHORS) \
FUNC(NOT_SORTED) \
FUNC(NUL_IN_HEADER) \
FUNC(TAG_OBJECT_NOT_TAG) \
@@ -536,6 +537,13 @@ static int fsck_commit_buffer(struct commit *commit, const char *buffer,
err = fsck_ident(&buffer, &commit->object, options);
if (err)
return err;
+   while (skip_prefix(buffer, "author ", &buffer)) {
+   err = report(options, &commit->object, FSCK_MSG_MULTIPLE_AUTHORS, "invalid format - multiple 'author' lines");
+   if (err)
+   return err;
+   /* require_end_of_header() ensured that there is a newline */
+   buffer = strchr(buffer, '\n') + 1;
+   }
if (!skip_prefix(buffer, "committer ", &buffer))
return report(options, &commit->object, FSCK_MSG_MISSING_COMMITTER, "invalid format - expected 'committer' line");
err = fsck_ident(&buffer, &commit->object, options);
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 10/18] Make fsck_tag() warn-friendly

2014-12-08 Thread Johannes Schindelin
When fsck_tag() identifies a problem with the tag, it should try
to make it possible to continue checking the tag object, in case the
user wants to demote the detected errors to mere warnings.

As with fsck_commit(), there are certain problems that could hide other
issues with the same tag object. For example, if the 'type' line is not
encountered in the correct position, the 'tag' line (if there is any)
would not be handled at all.

Signed-off-by: Johannes Schindelin 
---
 fsck.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fsck.c b/fsck.c
index 21ff35b..c1e7a85 100644
--- a/fsck.c
+++ b/fsck.c
@@ -604,7 +604,8 @@ static int fsck_tag_buffer(struct tag *tag, const char *data,
}
if (get_sha1_hex(buffer, sha1) || buffer[40] != '\n') {
ret = report(options, &tag->object, FSCK_MSG_INVALID_OBJECT_SHA1, "invalid 'object' line format - bad sha1");
-   goto done;
+   if (ret)
+   goto done;
}
buffer += 41;
 
-- 
2.0.0.rc3.9669.g840d1f9


[PATCH 04/18] Offer a function to demote fsck errors to warnings

2014-12-08 Thread Johannes Schindelin
There are legacy repositories out there whose older commits and tags
have issues that prevent pushing them when 'receive.fsckObjects' is set.
One real-life example is a commit object that has been hand-crafted to
list two authors.

Often, it is not possible to fix those issues without disrupting the
work with said repositories, yet it is still desirable to perform checks
by setting `receive.fsckObjects = true`. This commit is the first step
to allow demoting specific fsck issues to mere warnings.

The function added by this commit parses a list of settings in the form:

missing-email=warn,bad-name=warn,...

Unfortunately, the FSCK_WARN/FSCK_ERROR flag is only really heeded by
git fsck so far, but other call paths (e.g. git index-pack --strict)
error out *always* no matter what type was specified. Therefore, we
need to take extra care to default to all FSCK_ERROR in those cases.
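
For illustration only (not part of the patch): a minimal standalone sketch of parsing such a settings list. The three-entry ID table, the `demo_`/`MODE_` names, the error return instead of die(), and resetting the type to "error" for each entry are all assumptions made for this example.

```c
#include <string.h>

enum { MODE_ERROR = 1, MODE_WARN = 2 };

/* Hypothetical subset of message IDs, for illustration only. */
#define DEMO_ID_COUNT 3
static const char *demo_ids[DEMO_ID_COUNT] = {
    "missing-email", "bad-name", "bad-date"
};

/* Compare a length-delimited substring with a NUL-terminated string. */
static int demo_substr_eq(const char *s, size_t len, const char *match)
{
    return strlen(match) == len && !memcmp(s, match, len);
}

/*
 * Parse "id=type,id=type,..." and store the type of each listed ID in
 * types[]. Returns 0 on success, -1 on an unknown ID or type. An entry
 * without "=type" is treated as "error" (an assumption of this sketch).
 */
int demo_parse_modes(const char *mode, int types[DEMO_ID_COUNT])
{
    while (*mode) {
        size_t len = strcspn(mode, " ,|");
        size_t equal, i;
        int type = MODE_ERROR;

        if (!len) {
            mode++;     /* skip a separator */
            continue;
        }
        for (equal = 0; equal < len; equal++)
            if (mode[equal] == '=')
                break;
        if (equal < len) {
            const char *t = mode + equal + 1;
            size_t tlen = len - equal - 1;

            if (demo_substr_eq(t, tlen, "warn"))
                type = MODE_WARN;
            else if (!demo_substr_eq(t, tlen, "error"))
                return -1;
        }
        for (i = 0; i < DEMO_ID_COUNT; i++)
            if (demo_substr_eq(mode, equal, demo_ids[i]))
                break;
        if (i == DEMO_ID_COUNT)
            return -1;
        types[i] = type;
        mode += len;
    }
    return 0;
}
```

IDs that are not mentioned in the list keep whatever value the caller put into types[] beforehand, which is where the "default to all FSCK_ERROR" concern above comes in.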

Signed-off-by: Johannes Schindelin 
---
 fsck.c | 58 ++
 fsck.h |  7 +--
 2 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/fsck.c b/fsck.c
index 05b146c..9e6d70f 100644
--- a/fsck.c
+++ b/fsck.c
@@ -97,9 +97,63 @@ static int parse_msg_id(const char *text, int len)
 
 int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
 {
+   if (options->strict_mode && msg_id >= 0 && msg_id < FSCK_MSG_MAX)
+   return options->strict_mode[msg_id];
+   if (options->strict)
+   return FSCK_ERROR;
return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
 }
 
+static inline int substrcmp(const char *string, int len, const char *match)
+{
+   int match_len = strlen(match);
+   if (match_len != len)
+   return -1;
+   return memcmp(string, match, len);
+}
+
+void fsck_strict_mode(struct fsck_options *options, const char *mode)
+{
+   int type = FSCK_ERROR;
+
+   if (!options->strict_mode) {
+   int i;
+   int *strict_mode = malloc(sizeof(int) * FSCK_MSG_MAX);
+   for (i = 0; i < FSCK_MSG_MAX; i++)
+   strict_mode[i] = fsck_msg_type(i, options);
+   options->strict_mode = strict_mode;
+   }
+
+   while (*mode) {
+   int len = strcspn(mode, " ,|"), equal, msg_id;
+
+   if (!len) {
+   mode++;
+   continue;
+   }
+
+   for (equal = 0; equal < len; equal++)
+   if (mode[equal] == '=')
+   break;
+
+   if (equal < len) {
+   const char *type_str = mode + equal + 1;
+   int type_len = len - equal - 1;
+   if (!substrcmp(type_str, type_len, "error"))
+   type = FSCK_ERROR;
+   else if (!substrcmp(type_str, type_len, "warn"))
+   type = FSCK_WARN;
+   else
+   die("Unknown fsck message type: '%.*s'",
+   len - equal - 1, type_str);
+   }
+
+   msg_id = parse_msg_id(mode, equal);
+   options->strict_mode[msg_id] = type;
+   mode += len;
+   }
+}
+
 __attribute__((format (printf, 4, 5)))
 static int report(struct fsck_options *options, struct object *object,
enum fsck_msg_id id, const char *fmt, ...)
@@ -585,6 +639,10 @@ int fsck_object(struct object *obj, void *data, unsigned long size,
 
 int fsck_error_function(struct object *obj, int type, const char *message)
 {
+   if (type == FSCK_WARN) {
+   warning("object %s: %s", sha1_to_hex(obj->sha1), message);
+   return 0;
+   }
error("object %s: %s", sha1_to_hex(obj->sha1), message);
return 1;
 }
diff --git a/fsck.h b/fsck.h
index a18e9a6..9d67ea2 100644
--- a/fsck.h
+++ b/fsck.h
@@ -6,6 +6,8 @@
 
 struct fsck_options;
 
+void fsck_strict_mode(struct fsck_options *options, const char *mode);
+
 /*
  * callback function for fsck_walk
  * type is the expected type of the object or OBJ_ANY
@@ -25,10 +27,11 @@ struct fsck_options {
fsck_walk_func walk;
fsck_error error_func;
int strict:1;
+   int *strict_mode;
 };
 
-#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0 }
-#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1 }
+#define FSCK_OPTIONS_DEFAULT { NULL, fsck_error_function, 0, NULL }
+#define FSCK_OPTIONS_STRICT { NULL, fsck_error_function, 1, NULL }
 
 /* descend in all linked child objects
  * the return value is:
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 01/18] Introduce fsck options

2014-12-08 Thread Johannes Schindelin
We are about to introduce more fsck settings; therefore, just as the diff
machinery does, it makes sense to carry them around as a (pointer to a)
struct containing all of them.

Signed-off-by: Johannes Schindelin 
---
 builtin/fsck.c   |  20 +--
 builtin/index-pack.c |   9 +--
 builtin/unpack-objects.c |  11 ++--
 fsck.c   | 150 +++
 fsck.h   |  17 +-
 5 files changed, 114 insertions(+), 93 deletions(-)

diff --git a/builtin/fsck.c b/builtin/fsck.c
index a27515a..2241e29 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -25,6 +25,8 @@ static int include_reflogs = 1;
 static int check_full = 1;
 static int check_strict;
 static int keep_cache_objects;
+static struct fsck_options fsck_walk_options = FSCK_OPTIONS_DEFAULT;
+static struct fsck_options fsck_obj_options = FSCK_OPTIONS_DEFAULT;
 static unsigned char head_sha1[20];
 static const char *head_points_at;
 static int errors_found;
@@ -76,7 +78,7 @@ static int fsck_error_func(struct object *obj, int type, const char *err, ...)
 
 static struct object_array pending;
 
-static int mark_object(struct object *obj, int type, void *data)
+static int mark_object(struct object *obj, int type, void *data, struct fsck_options *options)
 {
struct object *parent = data;
 
@@ -119,7 +121,7 @@ static int mark_object(struct object *obj, int type, void *data)
 
 static void mark_object_reachable(struct object *obj)
 {
-   mark_object(obj, OBJ_ANY, NULL);
+   mark_object(obj, OBJ_ANY, NULL, NULL);
 }
 
 static int traverse_one_object(struct object *obj)
@@ -132,7 +134,7 @@ static int traverse_one_object(struct object *obj)
if (parse_tree(tree) < 0)
return 1; /* error already displayed */
}
-   result = fsck_walk(obj, mark_object, obj);
+   result = fsck_walk(obj, obj, &fsck_walk_options);
if (tree)
free_tree_buffer(tree);
return result;
@@ -158,7 +160,7 @@ static int traverse_reachable(void)
return !!result;
 }
 
-static int mark_used(struct object *obj, int type, void *data)
+static int mark_used(struct object *obj, int type, void *data, struct fsck_options *options)
 {
if (!obj)
return 1;
@@ -296,9 +298,9 @@ static int fsck_obj(struct object *obj)
fprintf(stderr, "Checking %s %s\n",
typename(obj->type), sha1_to_hex(obj->sha1));
 
-   if (fsck_walk(obj, mark_used, NULL))
+   if (fsck_walk(obj, NULL, &fsck_obj_options))
objerror(obj, "broken links");
-   if (fsck_object(obj, NULL, 0, check_strict, fsck_error_func))
+   if (fsck_object(obj, NULL, 0, &fsck_obj_options))
return -1;
 
if (obj->type == OBJ_TREE) {
@@ -630,6 +632,12 @@ int cmd_fsck(int argc, const char **argv, const char *prefix)
 
argc = parse_options(argc, argv, prefix, fsck_opts, fsck_usage, 0);
 
+   fsck_walk_options.walk = mark_object;
+   fsck_obj_options.walk = mark_used;
+   fsck_obj_options.error_func = fsck_error_func;
+   if (check_strict)
+   fsck_obj_options.strict = 1;
+
if (show_progress == -1)
show_progress = isatty(2);
if (verbose)
diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index a369f55..1c17c3f 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -74,6 +74,7 @@ static int nr_threads;
 static int from_stdin;
 static int strict;
 static int do_fsck_object;
+static struct fsck_options fsck_options = FSCK_OPTIONS_STRICT;
 static int verbose;
 static int show_stat;
 static int check_self_contained_and_connected;
@@ -191,7 +192,7 @@ static void cleanup_thread(void)
 #endif
 
 
-static int mark_link(struct object *obj, int type, void *data)
+static int mark_link(struct object *obj, int type, void *data, struct fsck_options *options)
 {
if (!obj)
return -1;
@@ -782,10 +783,10 @@ static void sha1_object(const void *data, struct object_entry *obj_entry,
if (!obj)
die(_("invalid %s"), typename(type));
if (do_fsck_object &&
-   fsck_object(obj, buf, size, 1,
-   fsck_error_function))
+   fsck_object(obj, buf, size, &fsck_options))
die(_("Error in object"));
-   if (fsck_walk(obj, mark_link, NULL))
+   fsck_options.walk = mark_link;
+   if (fsck_walk(obj, NULL, &fsck_options))
die(_("Not all child objects of %s are reachable"), sha1_to_hex(obj->sha1));
 
if (obj->type == OBJ_TREE) {
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 855d94b..e9e8bec 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -20,6 +20,7 @@ static un

[PATCH 05/18] Allow demoting errors to warnings via receive.fsck. = warn

2014-12-08 Thread Johannes Schindelin
For example, missing emails in commit and tag objects can be demoted to
mere warnings with

git config receive.fsck.missing-email warn

As git receive-pack does not actually perform the checks, it hands off
the setting to index-pack or unpack-objects in the form of an optional
argument to the --strict option.

Signed-off-by: Johannes Schindelin 
---
 builtin/index-pack.c |  4 
 builtin/receive-pack.c   | 27 +++
 builtin/unpack-objects.c |  5 +
 3 files changed, 32 insertions(+), 4 deletions(-)
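
As a reading aid for the diff below: receive-pack collects every
receive.fsck.* setting into one comma-separated string and hands it to
the child process as an argument to --strict. What follows is a minimal
standalone sketch of that accumulation, using plain C buffers instead of
Git's strbuf and argv_array. The names append_fsck_setting() and
format_strict_arg() are hypothetical and do not appear in the patch.

```c
#include <stdio.h>
#include <string.h>

/*
 * Append "id=severity" to the comma-separated list in buf.
 * A missing severity defaults to "error", as in the patch.
 * Returns 0 on success, -1 if the result would not fit.
 */
static int append_fsck_setting(char *buf, size_t size,
			       const char *id, const char *severity)
{
	size_t used = strlen(buf);
	int n;

	if (!severity)
		severity = "error";
	n = snprintf(buf + used, size - used, "%s%s=%s",
		     used ? "," : "", id, severity);
	if (n < 0 || (size_t)n >= size - used) {
		buf[used] = '\0';	/* undo the partial write */
		return -1;
	}
	return 0;
}

/*
 * Build the option for the child: "--strict=<list>" when settings
 * were collected, plain "--strict" otherwise.
 */
static int format_strict_arg(char *out, size_t size, const char *settings)
{
	int n;

	if (settings[0])
		n = snprintf(out, size, "--strict=%s", settings);
	else
		n = snprintf(out, size, "--strict");
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

The receiving side only has to skip the nine characters of "--strict="
to get at the list, which is what the "arg + 9" in the index-pack and
unpack-objects hunks does.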

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 1c17c3f..34a11b3 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -1565,6 +1565,10 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
} else if (!strcmp(arg, "--strict")) {
strict = 1;
do_fsck_object = 1;
+   } else if (starts_with(arg, "--strict=")) {
+   strict = 1;
+   do_fsck_object = 1;
+   fsck_strict_mode(&fsck_options, arg + 9);
} else if (!strcmp(arg, "--check-self-contained-and-connected")) {
strict = 1;
check_self_contained_and_connected = 1;
diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index e908d07..111e514 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -35,6 +35,7 @@ static enum deny_action deny_current_branch = DENY_UNCONFIGURED;
 static enum deny_action deny_delete_current = DENY_UNCONFIGURED;
 static int receive_fsck_objects = -1;
 static int transfer_fsck_objects = -1;
+static struct strbuf fsck_strict_mode = STRBUF_INIT;
 static int receive_unpack_limit = -1;
 static int transfer_unpack_limit = -1;
 static int unpack_limit = 100;
@@ -109,6 +110,14 @@ static int receive_pack_config(const char *var, const char *value, void *cb)
return 0;
}
 
+   if (starts_with(var, "receive.fsck.")) {
+   if (fsck_strict_mode.len)
+   strbuf_addch(&fsck_strict_mode, ',');
+   strbuf_addf(&fsck_strict_mode,
+   "%s=%s", var + 13, value ? value : "error");
+   return 0;
+   }
+
if (strcmp(var, "receive.fsckobjects") == 0) {
receive_fsck_objects = git_config_bool(var, value);
return 0;
@@ -1266,8 +1275,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
argv_array_pushl(&child.args, "unpack-objects", hdr_arg, NULL);
if (quiet)
argv_array_push(&child.args, "-q");
-   if (fsck_objects)
-   argv_array_push(&child.args, "--strict");
+   if (fsck_objects) {
+   if (fsck_strict_mode.len)
+   argv_array_pushf(&child.args, "--strict=%s",
+   fsck_strict_mode.buf);
+   else
+   argv_array_push(&child.args, "--strict");
+   }
child.no_stdout = 1;
child.err = err_fd;
child.git_cmd = 1;
@@ -1284,8 +1298,13 @@ static const char *unpack(int err_fd, struct shallow_info *si)
 
argv_array_pushl(&child.args, "index-pack",
 "--stdin", hdr_arg, keep_arg, NULL);
-   if (fsck_objects)
-   argv_array_push(&child.args, "--strict");
+   if (fsck_objects) {
+   if (fsck_strict_mode.len)
+   argv_array_pushf(&child.args, "--strict=%s",
+   fsck_strict_mode.buf);
+   else
+   argv_array_push(&child.args, "--strict");
+   }
if (fix_thin)
argv_array_push(&child.args, "--fix-thin");
child.out = -1;
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index e9e8bec..916616f 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -530,6 +530,11 @@ int cmd_unpack_objects(int argc, const char **argv, const char *prefix)
strict = 1;
continue;
}
+   if (starts_with(arg, "--strict=")) {
+   strict = 1;
+   fsck_strict_mode(&fsck_options, arg + 9);
+   continue;
+   }
if (starts_with(arg, "--pack_header=")) {
struct pack_header *hdr;
char *c;
-- 
2.0.0.rc3.9669.g840d1f9


[PATCH 03/18] Provide a function to parse fsck message IDs

2014-12-08 Thread Johannes Schindelin
This function will be used in the next commits to allow the user to
ask fsck to handle specific problems differently, e.g. demoting certain
errors to warnings. It has to handle partial strings because we would
like to be able to parse, say, '--strict=missing-email=warn' command
lines.

To make the parsing robust, we generate strings from the enum keys, and we
will match both lower-case, dash-separated values as well as camelCased
ones (e.g. both "missing-email" and "missingEmail" will match the
"MISSING_EMAIL" key).

Signed-off-by: Johannes Schindelin 
---
 fsck.c | 32 
 1 file changed, 32 insertions(+)
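
To illustrate the matching rule described above, here is a small
standalone sketch. It is not the parse_msg_id() from the diff below: it
uses a hypothetical three-entry key table, takes the underscore in a key
as an optional separator in the input, and returns -1 for unknown input
instead of calling die(). The names example_keys, matches_key() and
lookup_key() are made up for this note.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical subset of the message IDs, for illustration only. */
static const char *example_keys[] = {
	"BAD_DATE",
	"MISSING_EMAIL",
	"MISSING_SPACE_BEFORE_DATE",
};

/*
 * Return 1 if the first len bytes of text name key. The key is
 * upper-case with underscores; the text may be lower-case with dashes,
 * camelCased, or spelled exactly like the key.
 */
static int matches_key(const char *key, const char *text, size_t len)
{
	size_t j = 0;

	while (*key) {
		if (*key == '_') {
			key++;
			/* the separator is optional: camelCase omits it */
			if (j < len && (text[j] == '-' || text[j] == '_'))
				j++;
			continue;
		}
		if (j >= len || toupper((unsigned char)text[j]) != *key)
			return 0;
		key++;
		j++;
	}
	/* the whole input must be consumed, so prefixes do not match */
	return j == len;
}

/* Return the index of the matching key, or -1 if there is none. */
static int lookup_key(const char *text, size_t len)
{
	size_t i;

	for (i = 0; i < sizeof(example_keys) / sizeof(example_keys[0]); i++)
		if (matches_key(example_keys[i], text, len))
			return (int)i;
	return -1;
}
```

Passing the length separately is what makes partial strings work: for
"missing-email=warn" the caller passes 13, and only "missing-email" is
compared.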

diff --git a/fsck.c b/fsck.c
index 3cea034..05b146c 100644
--- a/fsck.c
+++ b/fsck.c
@@ -63,6 +63,38 @@ enum fsck_msg_id {
FSCK_MSG_MAX
 };
 
+#define STR(x) #x
+#define MSG_ID_STR(x) STR(x),
+static const char *msg_id_str[FSCK_MSG_MAX + 1] = {
+   FOREACH_MSG_ID(MSG_ID_STR)
+   NULL
+};
+
+static int parse_msg_id(const char *text, int len)
+{
+   int i, j;
+
+   for (i = 0; i < FSCK_MSG_MAX; i++) {
+   const char *key = msg_id_str[i];
+   /* msg_id_str is upper-case, with underscores */
+   for (j = 0; j < len; j++) {
+   char c = *(key++);
+   if (c == '_') {
+   if (isalpha(text[j]))
+   c = *(key++);
+   else if (text[j] != '_')
+   c = '-';
+   }
+   if (toupper(text[j]) != c)
+   break;
+   }
+   if (j == len && !*key)
+   return i;
+   }
+
+   die("Unhandled type: %.*s", len, text);
+}
+
 int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
 {
return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
-- 
2.0.0.rc3.9669.g840d1f9



[PATCH 02/18] Introduce identifiers for fsck messages

2014-12-08 Thread Johannes Schindelin
Rather than specifying only whether a message by the fsck machinery
constitutes an error or a warning, let's specify an identifier relating
to the concrete problem that was encountered. This is necessary for
upcoming support to be able to demote certain errors to warnings.

In the course, simplify the requirements on the calling code: instead of
having to handle full-blown varargs in every callback, we now send a
string buffer ready to be used by the callback.

Signed-off-by: Johannes Schindelin 
---
 builtin/fsck.c |  24 +++-
 fsck.c | 185 +++--
 fsck.h |   5 +-
 3 files changed, 137 insertions(+), 77 deletions(-)
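
For readers unfamiliar with the pattern, the diff below uses a single
list macro to generate the enum of message IDs, and a report() helper
that formats the message once so that callbacks receive a finished
string. A minimal standalone sketch of both ideas follows. Everything
prefixed with example_ or EXAMPLE_ is hypothetical, the ID list is
shortened, and a fixed-size buffer with vsnprintf() stands in for Git's
strbuf.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical, shortened ID list: errors first, then warnings. */
#define FOREACH_EXAMPLE_ID(FUNC) \
	FUNC(BAD_DATE) \
	FUNC(MISSING_EMAIL) \
	FUNC(NULL_SHA1)

/* First expansion of the list: one enum constant per ID. */
#define EXAMPLE_ID(x) EXAMPLE_MSG_##x,
enum example_msg_id {
	FOREACH_EXAMPLE_ID(EXAMPLE_ID)
	EXAMPLE_MSG_MAX
};

/* Everything from this ID onward counts as a warning. */
#define EXAMPLE_FIRST_WARNING EXAMPLE_MSG_NULL_SHA1

/* Second expansion of the same list: the ID names as strings. */
#define EXAMPLE_STR(x) #x,
static const char *example_id_names[EXAMPLE_MSG_MAX + 1] = {
	FOREACH_EXAMPLE_ID(EXAMPLE_STR)
	NULL
};

enum example_severity { EXAMPLE_ERROR = 1, EXAMPLE_WARN = 2 };

/* Callbacks get a ready-made string, not a format plus varargs. */
typedef int (*example_error_func)(enum example_severity severity,
				  const char *message);

static enum example_severity example_msg_type(enum example_msg_id id)
{
	return id < EXAMPLE_FIRST_WARNING ? EXAMPLE_ERROR : EXAMPLE_WARN;
}

/* Format once here, then hand the finished text to the callback. */
static int example_report(example_error_func cb, enum example_msg_id id,
			  const char *fmt, ...)
{
	char buf[256];
	va_list ap;

	va_start(ap, fmt);
	vsnprintf(buf, sizeof(buf), fmt, ap);
	va_end(ap);
	return cb(example_msg_type(id), buf);
}

/* Sample callback: remembers the last message, nonzero for errors. */
static char last_message[300];
static int remember_message(enum example_severity severity,
			    const char *message)
{
	snprintf(last_message, sizeof(last_message), "%s: %s",
		 severity == EXAMPLE_WARN ? "warning" : "error", message);
	return severity == EXAMPLE_WARN ? 0 : 1;
}
```

Because the enum and the string table come from the same list, adding
an ID in one place keeps both in sync.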

diff --git a/builtin/fsck.c b/builtin/fsck.c
index 2241e29..99d4538 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -47,32 +47,22 @@ static int show_dangling = 1;
 #endif
 
 static void objreport(struct object *obj, const char *severity,
-  const char *err, va_list params)
+  const char *err)
 {
-   fprintf(stderr, "%s in %s %s: ",
-   severity, typename(obj->type), sha1_to_hex(obj->sha1));
-   vfprintf(stderr, err, params);
-   fputs("\n", stderr);
+   fprintf(stderr, "%s in %s %s: %s\n",
+   severity, typename(obj->type), sha1_to_hex(obj->sha1), err);
 }
 
-__attribute__((format (printf, 2, 3)))
-static int objerror(struct object *obj, const char *err, ...)
+static int objerror(struct object *obj, const char *err)
 {
-   va_list params;
-   va_start(params, err);
errors_found |= ERROR_OBJECT;
-   objreport(obj, "error", err, params);
-   va_end(params);
+   objreport(obj, "error", err);
return -1;
 }
 
-__attribute__((format (printf, 3, 4)))
-static int fsck_error_func(struct object *obj, int type, const char *err, ...)
+static int fsck_error_func(struct object *obj, int type, const char *message)
 {
-   va_list params;
-   va_start(params, err);
-   objreport(obj, (type == FSCK_WARN) ? "warning" : "error", err, params);
-   va_end(params);
+   objreport(obj, (type == FSCK_WARN) ? "warning" : "error", message);
return (type == FSCK_WARN) ? 0 : 1;
 }
 
diff --git a/fsck.c b/fsck.c
index d6f539f..3cea034 100644
--- a/fsck.c
+++ b/fsck.c
@@ -8,6 +8,83 @@
 #include "fsck.h"
 #include "refs.h"
 
+#define FOREACH_MSG_ID(FUNC) \
+   /* errors */ \
+   FUNC(BAD_DATE) \
+   FUNC(BAD_EMAIL) \
+   FUNC(BAD_NAME) \
+   FUNC(BAD_PARENT_SHA1) \
+   FUNC(BAD_TIMEZONE) \
+   FUNC(BAD_TREE_SHA1) \
+   FUNC(DATE_OVERFLOW) \
+   FUNC(DUPLICATE_ENTRIES) \
+   FUNC(INVALID_OBJECT_SHA1) \
+   FUNC(INVALID_TAG_OBJECT) \
+   FUNC(INVALID_TREE) \
+   FUNC(INVALID_TYPE) \
+   FUNC(MISSING_AUTHOR) \
+   FUNC(MISSING_COMMITTER) \
+   FUNC(MISSING_EMAIL) \
+   FUNC(MISSING_GRAFT) \
+   FUNC(MISSING_NAME_BEFORE_EMAIL) \
+   FUNC(MISSING_OBJECT) \
+   FUNC(MISSING_PARENT) \
+   FUNC(MISSING_SPACE_BEFORE_DATE) \
+   FUNC(MISSING_SPACE_BEFORE_EMAIL) \
+   FUNC(MISSING_TAG) \
+   FUNC(MISSING_TAG_ENTRY) \
+   FUNC(MISSING_TAG_OBJECT) \
+   FUNC(MISSING_TREE) \
+   FUNC(MISSING_TYPE) \
+   FUNC(MISSING_TYPE_ENTRY) \
+   FUNC(NOT_SORTED) \
+   FUNC(NUL_IN_HEADER) \
+   FUNC(TAG_OBJECT_NOT_TAG) \
+   FUNC(UNKNOWN_TYPE) \
+   FUNC(UNTERMINATED_HEADER) \
+   FUNC(ZERO_PADDED_DATE) \
+   /* warnings */ \
+   FUNC(BAD_FILEMODE) \
+   FUNC(EMPTY_NAME) \
+   FUNC(FULL_PATHNAME) \
+   FUNC(HAS_DOT) \
+   FUNC(HAS_DOTDOT) \
+   FUNC(HAS_DOTGIT) \
+   FUNC(INVALID_TAG_NAME) \
+   FUNC(MISSING_TAGGER_ENTRY) \
+   FUNC(NULL_SHA1) \
+   FUNC(ZERO_PADDED_FILEMODE)
+
+#define FIRST_WARNING FSCK_MSG_BAD_FILEMODE
+
+#define MSG_ID(x) FSCK_MSG_##x,
+enum fsck_msg_id {
+   FOREACH_MSG_ID(MSG_ID)
+   FSCK_MSG_MAX
+};
+
+int fsck_msg_type(enum fsck_msg_id msg_id, struct fsck_options *options)
+{
+   return msg_id < FIRST_WARNING ? FSCK_ERROR : FSCK_WARN;
+}
+
+__attribute__((format (printf, 4, 5)))
+static int report(struct fsck_options *options, struct object *object,
+   enum fsck_msg_id id, const char *fmt, ...)
+{
+   va_list ap;
+   struct strbuf sb = STRBUF_INIT;
+   int msg_type = fsck_msg_type(id, options), result;
+
+   va_start(ap, fmt);
+   strbuf_vaddf(&sb, fmt, ap);
+   result = options->error_func(object, msg_type, sb.buf);
+   strbuf_release(&sb);
+   va_end(ap);
+
+   return result;
+}
+
 static int fsck_walk_tree(struct tree *tree, void *data, struct fsck_options *options)
 {
struct tree_desc desc;
@@ -216,25 +293,25 @@ static int fsck_tree(struct tree *item, struct fsck_options *options)
 
retval = 0;
if (has_null_sha1)
-   retval += options->error_func(&item->object, FSCK_WARN, "contains entries pointing to null sha1");
+   retval += re

[PATCH 00/18] Introduce an internal API to interact with the fsck machinery

2014-12-08 Thread Johannes Schindelin
At the moment, git-fsck's integrity checks are targeted toward the
end user, i.e. the error messages are really just messages, intended for
human consumption.

Under certain circumstances, some of those errors should be allowed to
be turned into mere warnings, though, because the cost of fixing the
issues might well be larger than the cost of carrying those flawed
objects. For example, when an already-public repository contains a
commit object with two authors for years, it does not make sense to
force the maintainer to rewrite the history, affecting all contributors
negatively by forcing them to update.

This branch introduces an internal fsck API to be able to turn some of
the errors into warnings, and to make it easier to call the fsck
machinery from elsewhere in general.

I am proud to report that this work has been sponsored by GitHub.


Johannes Schindelin (18):
  Introduce fsck options
  Introduce identifiers for fsck messages
  Provide a function to parse fsck message IDs
  Offer a function to demote fsck errors to warnings
  Allow demoting errors to warnings via receive.fsck. = warn
  fsck: report the ID of the error/warning
  Make fsck_ident() warn-friendly
  Make fsck_commit() warn-friendly
  fsck: handle multiple authors in commits specially
  Make fsck_tag() warn-friendly
  Add a simple test for receive.fsck.*
  Disallow demoting grave fsck errors to warnings
  Optionally ignore specific fsck issues completely
  fsck: allow upgrading fsck warnings to errors
  Document the new receive.fsck.* options.
  fsck: support demoting errors to warnings
  Introduce `git fsck --quick`
  git receive-pack: support excluding objects from fsck'ing

 Documentation/config.txt|  27 +++
 Documentation/git-fsck.txt  |   7 +-
 builtin/fsck.c  |  66 --
 builtin/index-pack.c|  13 +-
 builtin/receive-pack.c  |  36 ++-
 builtin/unpack-objects.c|  16 +-
 fsck.c  | 512 +++-
 fsck.h  |  28 ++-
 t/t1450-fsck.sh |  33 +++
 t/t5302-pack-index.sh   |   2 +-
 t/t5504-fetch-receive-strict.sh |  46 
 11 files changed, 624 insertions(+), 162 deletions(-)

-- 
2.0.0.rc3.9669.g840d1f9


Re: [PATCH] check-ignore: clarify treatment of tracked files

2014-12-08 Thread Michael J Gruber
Junio C Hamano wrote on 04.12.2014 at 21:15:
> Michael J Gruber  writes:
> 
>> By default, check-ignore does not list tracked files at all since
>> they are not subject to ignore patterns.
>>
>> Make this clearer in the man page.
>>
>> Reported-by: Guilherme 
>> Signed-off-by: Michael J Gruber 
>> ---
>> That really is a bit confusing. Does this help?
> 
> Thanks.
> 
> "git check-ignore" is a tool to debug your .gitignore settings when
> your expectation does not match the reality, so having this new
> sentence here is a good thing to do, but I wonder if there is a more
> prominent and central place where people learn about the ignore
> mechanism in the first place.  If we had this sentence there, too, that
> may reduce the need to debug their .gitignore settings in the first
> place.
> 
> Perhaps Documentation/gitignore.txt?  Documentation/user-manual.txt?

gitignore.txt has

DESCRIPTION
   A gitignore file specifies intentionally untracked files that Git
   should ignore. Files already tracked by Git are not affected; see the
   NOTES below for details.

It doesn't get any clearer. But then the notes read:

NOTES
   The purpose of gitignore files is to ensure that certain files
   not tracked by Git remain untracked.

   To ignore uncommitted changes in a file that is already tracked,
   use git update-index --assume-unchanged.

   To stop tracking a file that is currently tracked, use git rm
   --cached.

That is again clear for our case (line 1), but line 2 is troublesome,
isn't it?

user-manual mainly refers to gitignore. So I guess it's good, but that
line about assume-unchanged doesn't quite match with the discussion in
another current thread.

Michael


Re: [RFC/PATCH 0/5] git-glossary

2014-12-08 Thread Michael J Gruber
Michael J Gruber wrote on 08.12.2014 at 16:38:
> More and more people use Git in localised setups, which usually means
> mixed localisation setups - not only, but also because of our English
> man pages.
> 
> Here's an attempt at leveraging our current infrastructure for helping
> those poor mixed localisation folks. The idea is to keep the most
> important terms in the glossary and translate at least these.
> 
> 1/5: generate glossary term list automatically from gitglossary.txt
> 2/5: introduce git-glossary command which helps with lookups
> 3/5: introduce git-glossary.txt, the man page for the command
> 4/5: git.pot update
> 5/5: sample de.po update
> 
> Without 4/5 and 5/5, a few terms from the glossary can be translated
> already by coincidence with localised messages from some git commands.
> 
> Michael J Gruber (5):
>   glossary.h: generate a glossary list from the Makefile
>   glossary: introduce glossary lookup command
>   glossary: man page
>   l10n: git-glossary
>   l10n: de: git-glossary
> 
>  .gitignore |2 +
>  Documentation/git-glossary.txt |   48 ++
>  Makefile   |8 +-
>  builtin.h  |1 +
>  builtin/glossary.c |  104 +++
>  command-list.txt   |1 +
>  generate-glossary.sh   |8 +
>  git.c  |1 +
>  po/de.po   | 1382 
>  po/git.pot | 1362 +++
>  10 files changed, 1839 insertions(+), 1078 deletions(-)
>  create mode 100644 Documentation/git-glossary.txt
>  create mode 100644 builtin/glossary.c
>  create mode 100755 generate-glossary.sh

While I did send 5/5 with UTF-8 encoding (or rather: git send-email
helpfully did so), it seems it didn't get through. Anyway, this stuff
is also available here:

https://github.com/mjg/git/tree/glossary-cmd

Or rather:

The following changes since commit a0de725a8ff02c1f2a9452c2234bee819242395c:

  Sync with Git 2.2 (2014-11-26 13:20:21 -0800)

are available in the git repository at:

  git://github.com/mjg/git glossary-cmd

for you to fetch changes up to 1265605787662a72c2457be0623a76d4d2a74bc1:

  l10n: de: git-glossary (2014-12-08 16:26:31 +0100)


Michael J Gruber (5):
  glossary.h: generate a glossary list from the Makefile
  glossary: introduce glossary lookup command
  glossary: man page
  l10n: git-glossary
  l10n: de: git-glossary



Re: Accept-language test fails on Mac OS

2014-12-08 Thread Christian Couder
On Sun, Dec 7, 2014 at 8:18 AM, Jeff King  wrote:
> On Sat, Dec 06, 2014 at 10:04:06PM +0100, Torsten Bögershausen wrote:
>
>> I get this:
>>
>>
>> expecting success:
>> check_language "ko-KR, *;q=0.1" ko_KR.UTF-8 de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 &&
>> check_language "de-DE, *;q=0.1" ""          de_DE.UTF-8 ja_JP.UTF-8 en_US.UTF-8 &&
>> check_language "ja-JP, *;q=0.1" ""          ""          ja_JP.UTF-8 en_US.UTF-8 &&
>> check_language "en-US, *;q=0.1" ""          ""          ""          en_US.UTF-8
>>
>> --- expect  2014-12-06 21:00:59.0 +
>> +++ actual  2014-12-06 21:00:59.0 +
>> @@ -1 +0,0 @@
>> -Accept-Language: de-DE, *;q=0.1
>> not ok 25 - git client sends Accept-Language based on LANGUAGE, LC_ALL, LC_MESSAGES and LANG
>
> I can reproduce the same problem here (Debian unstable). I actually ran
> into three issues (aside from needing to use Junio's SQUASH commit, to
> avoid the "\r" bash-ism):
>
>   1. I couldn't build without including locale.h, for the
>  definition of setlocale() and the LC_MESSAGES constant (both used
>  in get_preferred_languages).
>
>  I'm not sure what portability issues there are with including it
>  unconditionally. Should this possibly be tied into gettext.c, which
>  already uses setlocale?

Yeah, pu build is broken on Ubuntu 14.04 too, because of
7567fad2431eb38291fd74a70f603e5746c6f728 (http: send Accept-Language
header if possible).

Thanks,
Christian.
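
For reference, a minimal sketch of why the header matters: setlocale()
and the LC_* category macros are declared in <locale.h> (LC_MESSAGES
comes from POSIX rather than ISO C), so any file that queries the
message locale needs that include to build. The helper below is
hypothetical and is not the function from the commit mentioned above.

```c
#include <locale.h>
#include <stddef.h>

/* Hypothetical helper: report the locale currently used for messages. */
static const char *current_messages_locale(void)
{
	/* With a NULL second argument, setlocale() only queries. */
	return setlocale(LC_MESSAGES, NULL);
}
```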


[RFC/PATCH 4/5] l10n: git-glossary

2014-12-08 Thread Michael J Gruber
git.pot update for the new glossary command.

Signed-off-by: Michael J Gruber 
---
 po/git.pot | 1362 
 1 file changed, 829 insertions(+), 533 deletions(-)

diff --git a/po/git.pot b/po/git.pot
index ee91402..d725a5d 100644
--- a/po/git.pot
+++ b/po/git.pot
@@ -8,7 +8,7 @@ msgid ""
 msgstr ""
 "Project-Id-Version: PACKAGE VERSION\n"
 "Report-Msgid-Bugs-To: Git Mailing List \n"
-"POT-Creation-Date: 2014-11-20 09:42+0800\n"
+"POT-Creation-Date: 2014-12-08 16:02+0100\n"
 "PO-Revision-Date: YEAR-MO-DA HO:MI+ZONE\n"
 "Last-Translator: FULL NAME \n"
 "Language-Team: LANGUAGE \n"
@@ -70,8 +70,8 @@ msgstr ""
 #: archive.c:422 builtin/archive.c:88 builtin/blame.c:2517
 #: builtin/blame.c:2518 builtin/config.c:57 builtin/fast-export.c:986
 #: builtin/fast-export.c:988 builtin/grep.c:712 builtin/hash-object.c:101
-#: builtin/ls-files.c:489 builtin/ls-files.c:492 builtin/notes.c:411
-#: builtin/notes.c:568 builtin/read-tree.c:109 parse-options.h:151
+#: builtin/ls-files.c:489 builtin/ls-files.c:492 builtin/notes.c:394
+#: builtin/notes.c:557 builtin/read-tree.c:109 parse-options.h:151
 msgid "file"
 msgstr ""
 
@@ -103,7 +103,7 @@ msgstr ""
 msgid "list supported archive formats"
 msgstr ""
 
-#: archive.c:441 builtin/archive.c:90 builtin/clone.c:85
+#: archive.c:441 builtin/archive.c:90 builtin/clone.c:86
 msgid "repo"
 msgstr ""
 
@@ -111,7 +111,7 @@ msgstr ""
 msgid "retrieve the archive from remote repository "
 msgstr ""
 
-#: archive.c:443 builtin/archive.c:92 builtin/notes.c:490
+#: archive.c:443 builtin/archive.c:92 builtin/notes.c:478
 msgid "command"
 msgstr ""
 
@@ -236,7 +236,7 @@ msgstr ""
 msgid "unrecognized header: %s%s (%d)"
 msgstr ""
 
-#: bundle.c:87 builtin/commit.c:788
+#: bundle.c:87 builtin/commit.c:834
 #, c-format
 msgid "could not open '%s'"
 msgstr ""
@@ -245,9 +245,9 @@ msgstr ""
 msgid "Repository lacks these prerequisite commits:"
 msgstr ""
 
-#: bundle.c:163 sequencer.c:641 sequencer.c:1096 builtin/blame.c:2706
-#: builtin/branch.c:652 builtin/commit.c:1085 builtin/log.c:330
-#: builtin/log.c:823 builtin/log.c:1432 builtin/log.c:1669 builtin/merge.c:357
+#: bundle.c:163 sequencer.c:645 sequencer.c:1100 builtin/blame.c:2706
+#: builtin/branch.c:652 builtin/commit.c:1107 builtin/log.c:330
+#: builtin/log.c:823 builtin/log.c:1432 builtin/log.c:1669 builtin/merge.c:358
 #: builtin/shortlog.c:158
 msgid "revision walk setup failed"
 msgstr ""
@@ -604,11 +604,11 @@ msgstr[1] ""
 msgid "%s: %s - %s"
 msgstr ""
 
-#: lockfile.c:275
+#: lockfile.c:283
 msgid "BUG: reopen a lockfile that is still open"
 msgstr ""
 
-#: lockfile.c:277
+#: lockfile.c:285
 msgid "BUG: reopen a lockfile that has been committed"
 msgstr ""
 
@@ -616,8 +616,8 @@ msgstr ""
 msgid "failed to read the cache"
 msgstr ""
 
-#: merge.c:94 builtin/checkout.c:356 builtin/checkout.c:562
-#: builtin/clone.c:659
+#: merge.c:94 builtin/checkout.c:374 builtin/checkout.c:580
+#: builtin/clone.c:662
 msgid "unable to write new index file"
 msgstr ""
 
@@ -664,7 +664,7 @@ msgstr ""
 msgid "blob expected for %s '%s'"
 msgstr ""
 
-#: merge-recursive.c:792 builtin/clone.c:318
+#: merge-recursive.c:792 builtin/clone.c:321
 #, c-format
 msgid "failed to open '%s'"
 msgstr ""
@@ -861,7 +861,7 @@ msgstr ""
 msgid "Could not parse object '%s'"
 msgstr ""
 
-#: merge-recursive.c:2019 builtin/merge.c:666
+#: merge-recursive.c:2019 builtin/merge.c:667
 msgid "Unable to write index."
 msgstr ""
 
@@ -1017,32 +1017,32 @@ msgstr ""
 msgid "Internal error"
 msgstr ""
 
-#: remote.c:1968
+#: remote.c:1980
 #, c-format
 msgid "Your branch is based on '%s', but the upstream is gone.\n"
 msgstr ""
 
-#: remote.c:1972
+#: remote.c:1984
 msgid "  (use \"git branch --unset-upstream\" to fixup)\n"
 msgstr ""
 
-#: remote.c:1975
+#: remote.c:1987
 #, c-format
 msgid "Your branch is up-to-date with '%s'.\n"
 msgstr ""
 
-#: remote.c:1979
+#: remote.c:1991
 #, c-format
 msgid "Your branch is ahead of '%s' by %d commit.\n"
 msgid_plural "Your branch is ahead of '%s' by %d commits.\n"
 msgstr[0] ""
 msgstr[1] ""
 
-#: remote.c:1985
+#: remote.c:1997
 msgid "  (use \"git push\" to publish your local commits)\n"
 msgstr ""
 
-#: remote.c:1988
+#: remote.c:2000
 #, c-format
 msgid "Your branch is behind '%s' by %d commit, and can be fast-forwarded.\n"
 msgid_plural ""
@@ -1050,11 +1050,11 @@ msgid_plural ""
 msgstr[0] ""
 msgstr[1] ""
 
-#: remote.c:1996
+#: remote.c:2008
 msgid "  (use \"git pull\" to update your local branch)\n"
 msgstr ""
 
-#: remote.c:1999
+#: remote.c:2011
 #, c-format
 msgid ""
 "Your branch and '%s' have diverged,\n"
@@ -1065,7 +1065,7 @@ msgid_plural ""
 msgstr[0] ""
 msgstr[1] ""
 
-#: remote.c:2009
+#: remote.c:2021
 msgid "  (use \"git pull\" to merge the remote branch into yours)\n"
 msgstr ""
 
@@ -1086,14 +1086,14 @@ msgstr ""
 msgid "the receiving end does not support --signed push"
 msgstr ""
 
-#: sequencer.c:172 builtin/merge.c:781 builtin/merge.c:892
-#: builtin/merge

[RFC/PATCH 2/5] glossary: introduce glossary lookup command

2014-12-08 Thread Michael J Gruber
When using a localised git, there are many reasons why a correspondence
between English and localised git terms is needed:
- connect localised messages with English ones (porcelain vs. plumbing)
- connect localised messages with English man pages or online docs
- help out someone in a different locale

Introduce a `git glossary' command that leverages the existing infrastructure
in three possible ways:
- `git glossary' lists all glossary terms along with their translation
- `git glossary foo' matches `foo' in the glossary (both English and
  localisation; partial matches shown)
- `git glossary -a foo' matches `foo' in the git message catalogue
  (English, exact match only).

Signed-off-by: Michael J Gruber 
---
Some bike-shedding expected regarding the interface...
Once I've learned how to test l10n stuff, this will be amended.

 .gitignore |   1 +
 Makefile   |   1 +
 builtin.h  |   1 +
 builtin/glossary.c | 104 +
 command-list.txt   |   1 +
 git.c  |   1 +
 6 files changed, 109 insertions(+)
 create mode 100644 builtin/glossary.c
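
The partial-match lookup described above (a term matches if it occurs
in either the English entry or its translation) can be sketched on its
own as follows. This is an illustration only: the real command gets
translations from gettext via _(), whereas this sketch uses a
hypothetical hard-coded table, and the names example_entry,
example_glossary and find_matches() are made up.

```c
#include <stddef.h>
#include <string.h>

struct example_entry {
	const char *english;
	const char *translated;
};

/* Hypothetical entries; the translations are for illustration only. */
static const struct example_entry example_glossary[] = {
	{ "branch", "Branch" },
	{ "commit", "Commit" },
	{ "working tree", "Arbeitsverzeichnis" },
};

/*
 * Collect up to max entries whose English or translated form contains
 * term as a substring. Returns the number of entries stored in out.
 */
static size_t find_matches(const char *term,
			   const struct example_entry **out, size_t max)
{
	size_t count = sizeof(example_glossary) / sizeof(example_glossary[0]);
	size_t i, n = 0;

	for (i = 0; i < count && n < max; i++) {
		const struct example_entry *e = &example_glossary[i];

		if (strstr(e->english, term) || strstr(e->translated, term))
			out[n++] = e;
	}
	return n;
}
```

Note that strstr() is case-sensitive, so in this sketch "Branch" and
"branch" are different search terms.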

diff --git a/.gitignore b/.gitignore
index fb4ebaa..ff627a6 100644
--- a/.gitignore
+++ b/.gitignore
@@ -64,6 +64,7 @@
 /git-fsck-objects
 /git-gc
 /git-get-tar-commit-id
+/git-glossary
 /git-grep
 /git-hash-object
 /git-help
diff --git a/Makefile b/Makefile
index ae74fdf..8fc9de2 100644
--- a/Makefile
+++ b/Makefile
@@ -824,6 +824,7 @@ BUILTIN_OBJS += builtin/for-each-ref.o
 BUILTIN_OBJS += builtin/fsck.o
 BUILTIN_OBJS += builtin/gc.o
 BUILTIN_OBJS += builtin/get-tar-commit-id.o
+BUILTIN_OBJS += builtin/glossary.o
 BUILTIN_OBJS += builtin/grep.o
 BUILTIN_OBJS += builtin/hash-object.o
 BUILTIN_OBJS += builtin/help.o
diff --git a/builtin.h b/builtin.h
index b87df70..dcaf220 100644
--- a/builtin.h
+++ b/builtin.h
@@ -68,6 +68,7 @@ extern int cmd_format_patch(int argc, const char **argv, const char *prefix);
 extern int cmd_fsck(int argc, const char **argv, const char *prefix);
 extern int cmd_gc(int argc, const char **argv, const char *prefix);
 extern int cmd_get_tar_commit_id(int argc, const char **argv, const char *prefix);
+extern int cmd_glossary(int argc, const char **argv, const char *prefix);
 extern int cmd_grep(int argc, const char **argv, const char *prefix);
 extern int cmd_hash_object(int argc, const char **argv, const char *prefix);
 extern int cmd_help(int argc, const char **argv, const char *prefix);
diff --git a/builtin/glossary.c b/builtin/glossary.c
new file mode 100644
index 000..4ad8c51
--- /dev/null
+++ b/builtin/glossary.c
@@ -0,0 +1,104 @@
+/*
+ * Builtin help command
+ */
+#include "cache.h"
+#include "builtin.h"
+#include "exec_cmd.h"
+#include "parse-options.h"
+#include "run-command.h"
+#include "column.h"
+#include "glossary.h"
+
+
+static int match_all = 0;
+static unsigned int colopts;
+static struct option builtin_glossary_options[] = {
+   OPT_BOOL('a', "all", &match_all, N_("match all English git messages")),
+   OPT_END(),
+};
+
+static const char * const builtin_glossary_usage[] = {
+   N_("git glossary [-a|--all] [term]..."),
+   NULL
+};
+
+
+/*
+static int git_glossary_config(const char *var, const char *value, void *cb)
+{
+   if (starts_with(var, "column."))
+   return git_column_config(var, value, "help", &colopts);
+
+   return git_default_config(var, value, cb);
+}
+*/
+
+static void emit_one(const char *one, const char* two, int pad)
+{
+   printf("   %s   ", one);
+   for (; pad; pad--)
+   putchar(' ');
+   puts(two);
+}
+
+static void lookup_all(int n, const char **terms)
+{
+   int i;
+   for (i = 0; i < n; i++)
+   emit_one(terms[i], _(terms[i]), 0);
+}
+
+static void lookup_glossary(int n, const char **terms)
+{
+   int i, j;
+   for (i = 0; i < ARRAY_SIZE(glossary); i++) {
+   for (j = 0; j < n; j++) {
+   if (strstr(glossary[i], terms[j]) || strstr(_(glossary[i]), terms[j])) {
+   emit_one(glossary[i], _(glossary[i]), 0);
+   break;
+   }
+   }
+   }
+}
+
+static void list_glossary()
+{
+   int i, longest = 0;
+
+   for (i = 0; i < ARRAY_SIZE(glossary); i++) {
+   if (longest < strlen(glossary[i]))
+   longest = strlen(glossary[i]);
+   }
+
+   for (i = 0; i < ARRAY_SIZE(glossary); i++)
+   emit_one(glossary[i], _(glossary[i]), longest - strlen(glossary[i]));
+}
+
+int cmd_glossary(int argc, const char **argv, const char *prefix)
+{
+   int nongit;
+
+   argc = parse_options(argc, argv, prefix, builtin_glossary_options,
+   builtin_glossary_usage, 0);
+
+   if (match_all && !argc) {
+   printf(_("usage: %s%s"), _(builtin_glossary_usage[0]), "\n\n");
+   exit(1);
+   }
+
+
+/*
+   setup_git_directory_gent

[RFC/PATCH 1/5] glossary.h: generate a glossary list from the Makefile

2014-12-08 Thread Michael J Gruber
Generate a header file which lists all terms defined in the glossary
in a way suitable for localisation. This will be used by the new
glossary command.

Signed-off-by: Michael J Gruber 
---
I also snuck in a change to the clean target, so that we don't have to update
it (its definition) as long as we keep GENERATED_H up to date.

 .gitignore   | 1 +
 Makefile | 7 +--
 generate-glossary.sh | 8 
 3 files changed, 14 insertions(+), 2 deletions(-)
 create mode 100755 generate-glossary.sh
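
As a reading aid for the sed expression in generate-glossary.sh below:
it keeps whatever sits between the last "]]" on a line and a trailing
"::", and skips lines that do not have both. The following C sketch
mirrors that rule under the assumption that glossary entries look like
"[[def_branch]]branch::". The function name extract_term() is
hypothetical and not part of the patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Copy the text between the last "]]" and the trailing "::" of line
 * into out. Returns 0 on success, -1 if the line does not have that
 * shape or the term does not fit into out.
 */
static int extract_term(const char *line, char *out, size_t size)
{
	size_t len = strlen(line);
	const char *start = NULL, *end, *p;

	if (len < 2 || strcmp(line + len - 2, "::") != 0)
		return -1;
	end = line + len - 2;

	/* remember the position just past the last "]]" before end */
	for (p = line; p + 2 <= end; p++)
		if (p[0] == ']' && p[1] == ']')
			start = p + 2;

	if (!start || (size_t)(end - start) >= size)
		return -1;
	memcpy(out, start, (size_t)(end - start));
	out[end - start] = '\0';
	return 0;
}
```

The script then wraps each extracted term in N_("...") so that xgettext
picks it up for translation.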

diff --git a/.gitignore b/.gitignore
index a052419..fb4ebaa 100644
--- a/.gitignore
+++ b/.gitignore
@@ -208,6 +208,7 @@
 /test-urlmatch-normalization
 /test-wildmatch
 /common-cmds.h
+/glossary.h
 *.tar.gz
 *.dsc
 *.deb
diff --git a/Makefile b/Makefile
index 14d5ac1..ae74fdf 100644
--- a/Makefile
+++ b/Makefile
@@ -627,7 +627,7 @@ LIB_FILE = libgit.a
 XDIFF_LIB = xdiff/lib.a
 VCSSVN_LIB = vcs-svn/lib.a
 
-GENERATED_H += common-cmds.h
+GENERATED_H += common-cmds.h glossary.h
 
 LIB_H = $(shell $(FIND) . \
-name .git -prune -o \
@@ -1649,6 +1649,9 @@ common-cmds.h: ./generate-cmdlist.sh command-list.txt
 common-cmds.h: $(wildcard Documentation/git-*.txt)
$(QUIET_GEN)./generate-cmdlist.sh > $@+ && mv $@+ $@
 
+glossary.h: ./generate-glossary.sh Documentation/glossary-content.txt
+   $(QUIET_GEN)./generate-glossary.sh > $@+ && mv $@+ $@
+
 SCRIPT_DEFINES = $(SHELL_PATH_SQ):$(DIFF_SQ):$(GIT_VERSION):\
$(localedir_SQ):$(NO_CURL):$(USE_GETTEXT_SCHEME):$(SANE_TOOL_PATH_SQ):\
$(gitwebdir_SQ):$(PERL_PATH_SQ)
@@ -2356,7 +2359,7 @@ clean: profile-clean coverage-clean
$(RM) $(TEST_PROGRAMS) $(NO_INSTALL)
$(RM) -r bin-wrappers $(dep_dirs)
$(RM) -r po/build/
-   $(RM) *.spec *.pyc *.pyo */*.pyc */*.pyo common-cmds.h $(ETAGS_TARGET) tags cscope*
+   $(RM) *.spec *.pyc *.pyo */*.pyc */*.pyo $(GENERATED_H) $(ETAGS_TARGET) tags cscope*
$(RM) -r $(GIT_TARNAME) .doc-tmp-dir
$(RM) $(GIT_TARNAME).tar.gz git-core_$(GIT_VERSION)-*.tar.gz
$(RM) $(htmldocs).tar.gz $(manpages).tar.gz
diff --git a/generate-glossary.sh b/generate-glossary.sh
new file mode 100755
index 000..41f1eb3
--- /dev/null
+++ b/generate-glossary.sh
@@ -0,0 +1,8 @@
+#!/bin/sh
+
+echo "/* Automatically generated by $0 */
+
+static const char *glossary[] = {"
+
+sed -n -e 's/^.*\]\]\(.*\)::$/\tN_("\1"),/p' Documentation/glossary-content.txt
+echo "};"
-- 
2.2.0.345.g7041aac



[RFC/PATCH 0/5] git-glossary

2014-12-08 Thread Michael J Gruber
More and more people use Git in localised setups, which usually means
mixed localisation setups - not only, but also because of our English
man pages.

Here's an attempt at leveraging our current infrastructure for helping
those poor mixed localisation folks. The idea is to keep the most
important terms in the glossary and translate at least these.

1/5: generate glossary term list automatically from gitglossary.txt
2/5: introduce git-glossary command which helps with lookups
3/5: introduce git-glossary.txt, the man page for the command
4/5: git.pot update
5/5: sample de.po update

Without 4/5 and 5/5, a few terms from the glossary can be translated
already by coincidence with localised messages from some git commands.

Michael J Gruber (5):
  glossary.h: generate a glossary list from the Makefile
  glossary: introduce glossary lookup command
  glossary: man page
  l10n: git-glossary
  l10n: de: git-glossary

 .gitignore |2 +
 Documentation/git-glossary.txt |   48 ++
 Makefile   |8 +-
 builtin.h  |1 +
 builtin/glossary.c |  104 +++
 command-list.txt   |1 +
 generate-glossary.sh   |8 +
 git.c  |1 +
 po/de.po   | 1382 
 po/git.pot | 1362 +++
 10 files changed, 1839 insertions(+), 1078 deletions(-)
 create mode 100644 Documentation/git-glossary.txt
 create mode 100644 builtin/glossary.c
 create mode 100755 generate-glossary.sh

-- 
2.2.0.345.g7041aac



[RFC/PATCH 3/5] glossary: man page

2014-12-08 Thread Michael J Gruber
Signed-off-by: Michael J Gruber 
---
This also means you have to use "git help gitglossary" if you want to have
the glossary only. But it's included in "git help glossary" for convenience.

 Documentation/git-glossary.txt | 48 ++
 1 file changed, 48 insertions(+)
 create mode 100644 Documentation/git-glossary.txt

diff --git a/Documentation/git-glossary.txt b/Documentation/git-glossary.txt
new file mode 100644
index 000..f2605c5
--- /dev/null
+++ b/Documentation/git-glossary.txt
@@ -0,0 +1,48 @@
+git-glossary(1)
+===
+
+NAME
+
+git-glossary - List and translate terms from the glossary
+
+SYNOPSIS
+
+[verse]
+'git glossary' [-a | --all] ...
+
+DESCRIPTION
+---
+Look up each term in the glossary and display it along with its
+translation. This works with localised (translated) versions
+of git only. 
+
+OPTIONS
+---
+...::
+   Term(s) to look up.
++
+If no term is specified, list all terms from the glossary along with its translation.
+
+-a::
+   Look up terms in the message catalogue of all git commands, instead of the glossary.
+   This matches with complete english messages only.
+
+
+DISCUSSION
+--
+Unless `-a' is used, partial matches in both english and the translated
+entry are shown.
+
+The list of terms is taken from the linkgit:gitglossary[7].
+Currently they are:
+
+include::glossary-content.txt[]
+
+
+SEE ALSO
+
+linkgit:gitglossary[7]
+
+GIT
+---
+Part of the linkgit:git[1] suite
-- 
2.2.0.345.g7041aac



[PATCH v2] index-pack: terminate object buffers with NUL

2014-12-08 Thread Johannes Schindelin
From: Duy Nguyen 

We have some tricky checks in fsck that rely on a side effect of
require_end_of_header(), and would otherwise easily run outside
non-NUL-terminated buffers. This is a bit brittle, so let's make sure
that only NUL-terminated buffers are passed around to begin with.

Jeff "Peff" King contributed the detailed analysis of which call paths are
involved and pointed out that we also have to patch the get_data()
function in unpack-objects.c, which is what Johannes "Dscho" Schindelin
implemented.

Signed-off-by: Nguyễn Thái Ngọc Duy 
Analyzed-by: Jeff King 
Signed-off-by: Johannes Schindelin 
---

On Mon, 8 Dec 2014, Duy Nguyen wrote:

> Subject: [PATCH] index-pack: terminate object buffers with NUL
> 
> Signed-off-by: Nguyễn Thái Ngọc Duy 

Here is a patch that is updated with Peff's suggested
unpack-objects.c:get_data change.

While it is not as good as Peff's analysis, I can provide an
additional data point: the test suite passes cleanly even with

https://github.com/dscho/git/commit/567ad592

applied (and with 567ad592, but without the changes below, at least
t1050 does not pass).
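
To illustrate why a NUL-terminated allocation helps here, the following is
a minimal sketch and not git's actual code. sketch_mallocz() is a
hypothetical stand-in for xmallocz(), assumed to allocate size + 1 bytes
and store a NUL in the extra byte; sketch_terminated_len() is likewise made
up for the illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for xmallocz(): allocate size + 1 bytes and
 * store a NUL in the extra byte. Not git's implementation.
 */
static void *sketch_mallocz(size_t size)
{
	char *buf;

	if (size + 1 < size) {	/* size + 1 would wrap around */
		fprintf(stderr, "allocation size overflow\n");
		exit(128);
	}
	buf = malloc(size + 1);
	if (!buf) {
		fprintf(stderr, "out of memory\n");
		exit(128);
	}
	buf[size] = '\0';
	return buf;
}

/*
 * Copy a payload that carries no terminator of its own into a
 * NUL-terminated buffer and measure it with strlen(). The scan is
 * bounded by the extra byte even if the payload contains no NUL.
 */
static size_t sketch_terminated_len(const char *payload, size_t size)
{
	char *buf = sketch_mallocz(size);
	size_t len;

	memcpy(buf, payload, size);
	assert(buf[size] == '\0');	/* memcpy() left the terminator alone */
	len = strlen(buf);
	free(buf);
	return len;
}
```

With a plain malloc(size) buffer, the strlen() call above could read past
the end of the allocation; that is the kind of overrun the fsck checks
would otherwise have to guard against themselves.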

 builtin/index-pack.c | 4 ++--
 builtin/unpack-objects.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index a369f55..4632117 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -447,7 +447,7 @@ static void *unpack_entry_data(unsigned long offset, unsigned long size,
if (type == OBJ_BLOB && size > big_file_threshold)
buf = fixed_buf;
else
-   buf = xmalloc(size);
+   buf = xmallocz(size);
 
memset(&stream, 0, sizeof(stream));
git_inflate_init(&stream);
@@ -552,7 +552,7 @@ static void *unpack_data(struct object_entry *obj,
git_zstream stream;
int status;
 
-   data = xmalloc(consume ? 64*1024 : obj->size);
+   data = xmallocz(consume ? 64*1024 : obj->size);
inbuf = xmalloc((len < 64*1024) ? len : 64*1024);
 
memset(&stream, 0, sizeof(stream));
diff --git a/builtin/unpack-objects.c b/builtin/unpack-objects.c
index 855d94b..ac66672 100644
--- a/builtin/unpack-objects.c
+++ b/builtin/unpack-objects.c
@@ -91,7 +91,7 @@ static void use(int bytes)
 static void *get_data(unsigned long size)
 {
git_zstream stream;
-   void *buf = xmalloc(size);
+   void *buf = xmallocz(size);
 
memset(&stream, 0, sizeof(stream));
 
-- 
1.8.4.msysgit.0.dirty


Re: [PATCH 07/23] expire_reflog(): use a lock_file for rewriting the reflog file

2014-12-08 Thread Michael Haggerty
On 12/05/2014 01:23 AM, Jonathan Nieder wrote:
> Michael Haggerty wrote:
> 
>> We don't actually need the locking functionality, because we already
>> hold the lock on the reference itself, which is how the reflog file is
>> locked. But the lock_file code still does some of the bookkeeping for
>> us and is more careful than the old code here was.
> 
> As you say, the ref lock takes care of mutual exclusion, so we do not
> have to be too careful about compatibility with other tools that might
> not know to lock the reflog.  And this is not tying our hands for a
> future when I might want to lock logs/refs/heads/topic/1 while
> logs/refs/heads/topic still exists as part of the implementation of
> "git mv topic/1 topic".
> 
> Stefan and I had forgotten about that guarantee when looking at that
> kind of operation --- thanks for the reminder.

This reminder is important (and forgettable) enough that I will add a
comment within the function explaining it.

> Should updates to the HEAD reflog acquire HEAD.lock?  (They don't
> currently.)

Yes, they should; good catch. I assume that you are referring to the
code at the bottom of write_ref_sha1()? Or did you find a problem in
this patch series?

If the former, then I propose that we address this bug in a separate
patch series.

> [...]
>> --- a/builtin/reflog.c
>> +++ b/builtin/reflog.c
>> @@ -349,12 +349,14 @@ static int push_tip_to_list(const char *refname, const unsigned char *sha1, int
>>  return 0;
>>  }
>>  
>> +static struct lock_file reflog_lock;
> 
> If this lockfile is only used in that one function, it can be declared
> inside the function.
> 
> If it is meant to be used throughout the 'git reflog' command, then it
> can go near the top of the file.

For now it is only used within this function, so I will move it into the
function as you suggest. (As you know, it does need to remain static,
because of the way the lock_file module takes over ownership of these
objects.)

>> +
>>  static int expire_reflog(const char *refname, const unsigned char *sha1, void *cb_data)
>>  {
>>  struct cmd_reflog_expire_cb *cmd = cb_data;
>>  struct expire_reflog_cb cb;
>>  struct ref_lock *lock;
>> -char *log_file, *newlog_path = NULL;
>> +char *log_file;
>>  struct commit *tip_commit;
>>  struct commit_list *tips;
>>  int status = 0;
>> @@ -372,10 +374,14 @@ static int expire_reflog(const char *refname, const unsigned char *sha1, void *c
>>  unlock_ref(lock);
>>  return 0;
>>  }
>> +
>>  log_file = git_pathdup("logs/%s", refname);
>>  if (!cmd->dry_run) {
>> -newlog_path = git_pathdup("logs/%s.lock", refname);
>> -cb.newlog = fopen(newlog_path, "w");
>> +if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0)
>> +goto failure;
> 
> hold_lock_file_for_update doesn't print a message.  Code to print one
> looks like
> 
>   if (hold_lock_file_for_update(&reflog_lock, log_file, 0) < 0) {
>   unable_to_lock_message(log_file, errno, &err);
>   error("%s", err.buf);
>   goto failure;
>   }

Thanks; will add.

> (A patch in flight changes that to
> 
>   if (hold_lock_file_for_update(&reflog_lock, log_file, 0, &err) < 0) {
>   error("%s", err.buf);
>   goto failure;
>   }
> 
> )

Thanks for the heads-up. The compiler will complain when the branches
are merged, and hopefully the fix will be obvious.

>> +cb.newlog = fdopen_lock_file(&reflog_lock, "w");
>> +if (!cb.newlog)
>> +goto failure;
> 
> Hm.  lockfile.c::fdopen_lock_file ought to use xfdopen to make this
> case impossible.  And xfdopen should use try_to_free_routine() and
> try again on failure.

That sounds reasonable, but it is not manifestly obvious given that at
least one caller of fdopen_lock_file() (in fast-import.c) tries to
recover if fdopen_lock_file() fails. Let's address this in a separate
patch series if that is OK with you. For now I will add explicit
error-reporting code here before "goto failure".

> [...]
>> @@ -423,10 +429,9 @@ static int expire_reflog(const char *refname, const unsigned char *sha1, void *c
>>  }
>>  
>>  if (cb.newlog) {
>> -if (fclose(cb.newlog)) {
>> -status |= error("%s: %s", strerror(errno),
>> -newlog_path);
>> -unlink(newlog_path);
>> +if (close_lock_file(&reflog_lock)) {
>> +status |= error("Couldn't write %s: %s", log_file,
>> +strerror(errno));
> 
> Style nit: error messages usually start with a lowercase letter
> (though I realize nearby examples are already inconsistent).

Thanks; will fix.

> commit_lock_file() can take care of the close_lock_file automatically.

The existing code is a tiny bit safer: first make sure both files can be
written, *then* ren

Re: [PATCH] checkout: add --ignore-other-wortrees

2014-12-08 Thread Duy Nguyen
Oops. This one does not belong to the series. I cleaned up `pwd`, then
jumped to another one for testing and forgot to clean up patches again
:(
-- 
Duy


[PATCH v3 20/23] update-index: test the system before enabling untracked cache

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Helped-by: Eric Sunshine 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-update-index.txt |   6 ++
 builtin/update-index.c | 148 +
 2 files changed, 154 insertions(+)

diff --git a/Documentation/git-update-index.txt b/Documentation/git-update-index.txt
index f9a35cd..ed32bae 100644
--- a/Documentation/git-update-index.txt
+++ b/Documentation/git-update-index.txt
@@ -180,6 +180,12 @@ may not support it yet.
system must change `st_mtime` field of a directory if files
are added or deleted in that directory.
 
+--force-untracked-cache::
+   For safety, `--untracked-cache` performs tests on the working
+   directory to make sure untracked cache can be used. These
+   tests can take a few seconds. `--force-untracked-cache` can be
+   used to skip the tests.
+
 \--::
Do not interpret any more arguments as options.
 
diff --git a/builtin/update-index.c b/builtin/update-index.c
index 3d2dedd..f23ec83 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -47,6 +47,147 @@ static void report(const char *fmt, ...)
va_end(vp);
 }
 
+static void remove_test_directory(void)
+{
+   struct strbuf sb = STRBUF_INIT;
+   strbuf_addstr(&sb, "dir-mtime-test");
+   remove_dir_recursively(&sb, 0);
+   strbuf_release(&sb);
+}
+
+static void xmkdir(const char *path)
+{
+   if (mkdir(path, 0700))
+   die_errno(_("failed to create directory %s"), path);
+}
+
+static int xstat(const char *path, struct stat *st)
+{
+   if (stat(path, st))
+   die_errno(_("failed to stat %s"), path);
+   return 0;
+}
+
+static int create_file(const char *path)
+{
+   int fd = open(path, O_CREAT | O_RDWR, 0644);
+   if (fd < 0)
+   die_errno(_("failed to create file %s"), path);
+   return fd;
+}
+
+static void xunlink(const char *path)
+{
+   if (unlink(path))
+   die_errno(_("failed to delete file %s"), path);
+}
+
+static void xrmdir(const char *path)
+{
+   if (rmdir(path))
+   die_errno(_("failed to delete directory %s"), path);
+}
+
+static void avoid_racy(void)
+{
+   /*
+* not use if we could usleep(10) if USE_NSEC is defined. The
+* field nsec could be there, but the OS could choose to
+* ignore it?
+*/
+   sleep(1);
+}
+
+static int test_if_untracked_cache_is_supported(void)
+{
+   struct stat st;
+   struct stat_data base;
+   int fd;
+
+   fprintf(stderr, _("Testing "));
+   xmkdir("dir-mtime-test");
+   atexit(remove_test_directory);
+   xstat("dir-mtime-test", &st);
+   fill_stat_data(&base, &st);
+   fputc('.', stderr);
+
+   avoid_racy();
+   fd = create_file("dir-mtime-test/newfile");
+   xstat("dir-mtime-test", &st);
+   if (!match_stat_data(&base, &st)) {
+   close(fd);
+   fputc('\n', stderr);
+   fprintf_ln(stderr,_("directory stat info does not "
+   "change after adding a new file"));
+   return 0;
+   }
+   fill_stat_data(&base, &st);
+   fputc('.', stderr);
+
+   avoid_racy();
+   xmkdir("dir-mtime-test/new-dir");
+   xstat("dir-mtime-test", &st);
+   if (!match_stat_data(&base, &st)) {
+   close(fd);
+   fputc('\n', stderr);
+   fprintf_ln(stderr, _("directory stat info does not change "
+"after adding a new directory"));
+   return 0;
+   }
+   fill_stat_data(&base, &st);
+   fputc('.', stderr);
+
+   avoid_racy();
+   write_or_die(fd, "data", 4);
+   close(fd);
+   xstat("dir-mtime-test", &st);
+   if (match_stat_data(&base, &st)) {
+   fputc('\n', stderr);
+   fprintf_ln(stderr, _("directory stat info changes "
+"after updating a file"));
+   return 0;
+   }
+   fputc('.', stderr);
+
+   avoid_racy();
+   close(create_file("dir-mtime-test/new-dir/new"));
+   xstat("dir-mtime-test", &st);
+   if (match_stat_data(&base, &st)) {
+   fputc('\n', stderr);
+   fprintf_ln(stderr, _("directory stat info changes after "
+"adding a file inside subdirectory"));
+   return 0;
+   }
+   fputc('.', stderr);
+
+   avoid_racy();
+   xunlink("dir-mtime-test/newfile");
+   xstat("dir-mtime-test", &st);
+   if (!match_stat_data(&base, &st)) {
+   fputc('\n', stderr);
+   fprintf_ln(stderr, _("directory stat info does not "
+"change after deleting a file"));
+   return 0;
+   }
+   fill_stat_data(&base, &st);
+   fputc('.', stderr);
+
+   avoid_racy();
+   xunlink("dir-mtime-test/new-dir/new");
+   xrmdir("dir-mtime-test/new-d

[PATCH v3 21/23] t7063: tests for untracked cache

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 .gitignore |   1 +
 Makefile   |   1 +
 t/t7063-status-untracked-cache.sh (new +x) | 353 +
 test-dump-untracked-cache.c (new)  |  61 +
 4 files changed, 416 insertions(+)
 create mode 100755 t/t7063-status-untracked-cache.sh
 create mode 100644 test-dump-untracked-cache.c

diff --git a/.gitignore b/.gitignore
index 81e12c0..e2bb375 100644
--- a/.gitignore
+++ b/.gitignore
@@ -182,6 +182,7 @@
 /test-delta
 /test-dump-cache-tree
 /test-dump-split-index
+/test-dump-untracked-cache
 /test-scrap-cache-tree
 /test-genrandom
 /test-hashmap
diff --git a/Makefile b/Makefile
index 9f984a9..fa58a53 100644
--- a/Makefile
+++ b/Makefile
@@ -555,6 +555,7 @@ TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
 TEST_PROGRAMS_NEED_X += test-dump-split-index
+TEST_PROGRAMS_NEED_X += test-dump-untracked-cache
 TEST_PROGRAMS_NEED_X += test-genrandom
 TEST_PROGRAMS_NEED_X += test-hashmap
 TEST_PROGRAMS_NEED_X += test-index-version
diff --git a/t/t7063-status-untracked-cache.sh b/t/t7063-status-untracked-cache.sh
new file mode 100755
index 000..2b2ffd7
--- /dev/null
+++ b/t/t7063-status-untracked-cache.sh
@@ -0,0 +1,353 @@
+#!/bin/sh
+
+test_description='test untracked cache'
+
+. ./test-lib.sh
+
+avoid_racy() {
+   sleep 1
+}
+
+git update-index --untracked-cache
+# It's fine if git update-index returns an error code other than one,
+# it'll be caught in the first test.
+if test $? -eq 1; then
+   skip_all='This system does not support untracked cache'
+   test_done
+fi
+
+test_expect_success 'setup' '
+   git init worktree &&
+   cd worktree &&
+   mkdir done dtwo dthree &&
+   touch one two three done/one dtwo/two dthree/three &&
+   git add one two done/one &&
+   : >.git/info/exclude &&
+   git update-index --untracked-cache
+'
+
+test_expect_success 'untracked cache is empty' '
+   test-dump-untracked-cache >../actual &&
+   cat >../expect <../status.expect <../dump.expect <../trace &&
+   GIT_TRACE_UNTRACKED_STATS="$TRASH_DIRECTORY/trace" \
+   git status --porcelain >../actual &&
+   test_cmp ../status.expect ../actual &&
+   cat >../trace.expect <../actual &&
+   test_cmp ../dump.expect ../actual
+'
+
+test_expect_success 'status second time (fully populated cache)' '
+   avoid_racy &&
+   : >../trace &&
+   GIT_TRACE_UNTRACKED_STATS="$TRASH_DIRECTORY/trace" \
+   git status --porcelain >../actual &&
+   test_cmp ../status.expect ../actual &&
+   cat >../trace.expect <../actual &&
+   test_cmp ../dump.expect ../actual
+'
+
+test_expect_success 'modify in root directory, one dir invalidation' '
+   avoid_racy &&
+   : >four &&
+   : >../trace &&
+   GIT_TRACE_UNTRACKED_STATS="$TRASH_DIRECTORY/trace" \
+   git status --porcelain >../actual &&
+   cat >../status.expect <../trace.expect <../actual &&
+   cat >../expect <.gitignore &&
+   : >../trace &&
+   GIT_TRACE_UNTRACKED_STATS="$TRASH_DIRECTORY/trace" \
+   git status --porcelain >../actual &&
+   cat >../status.expect <../trace.expect <../actual &&
+   cat >../expect <>.git/info/exclude &&
+   : >../trace &&
+   GIT_TRACE_UNTRACKED_STATS="$TRASH_DIRECTORY/trace" \
+   git status --porcelain >../actual &&
+   cat >../status.expect <../trace.expect <../actual &&
+   cat >../expect <../actual &&
+   cat >../expect <../trace &&
+   GIT_TRACE_UNTRACKED_STATS="$TRASH_DIRECTORY/trace" \
+   git status --porcelain >../actual &&
+   cat >../status.expect <../trace.expect <../actual &&
+   cat >../expect <../actual &&
+   cat >../expect <../trace &&
+   GIT_TRACE_UNTRACKED_STATS="$TRASH_DIRECTORY/trace" \
+   git status --porcelain >../actual &&
+   cat >../status.expect <../trace.expect <../actual &&
+   cat >../expect name);
+}
+
+static void dump(struct untracked_cache_dir *ucd, struct strbuf *base)
+{
+   int i, len;
+   qsort(ucd->untracked, ucd->untracked_nr, sizeof(*ucd->untracked),
+ compare_untracked);
+   qsort(ucd->dirs, ucd->dirs_nr, sizeof(*ucd->dirs),
+ compare_dir);
+   len = base->len;
+   strbuf_addf(base, "%s/", ucd->name);
+   printf("%s %s", base->buf,
+  sha1_to_hex(ucd->exclude_sha1));
+   if (ucd->recurse)
+   fputs(" recurse", stdout);
+   if (ucd->check_only)
+   fputs(" check_only", stdout);
+   if (ucd->valid)
+   fputs(" valid", stdout);
+   printf("\n");
+   for (i = 0; i < ucd->untracked_nr; i++)
+   printf("%s\n", ucd->untracked[i]);
+   for (i = 0; i < ucd->dirs_nr; i++)
+   dump(ucd->dirs[i], base);
+   strbuf_setlen(base, len);
+}
+
+int main(int ac, char **av)
+{
+   st

[PATCH v3 22/23] mingw32: add uname()

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Helped-by: Eric Sunshine 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 compat/mingw.c | 11 +++
 compat/mingw.h |  9 +
 2 files changed, 20 insertions(+)

diff --git a/compat/mingw.c b/compat/mingw.c
index c5c37e5..88140e4 100644
--- a/compat/mingw.c
+++ b/compat/mingw.c
@@ -2128,3 +2128,14 @@ void mingw_startup()
/* initialize Unicode console */
winansi_init();
 }
+
+int uname(struct utsname *buf)
+{
+   DWORD v = GetVersion();
+   memset(buf, 0, sizeof(*buf));
+   strcpy(buf->sysname, "Windows");
+   sprintf(buf->release, "%u.%u", v & 0xff, (v >> 8) & 0xff);
+   /* assuming NT variants only.. */
+   sprintf(buf->version, "%u", (v >> 16) & 0x7fff);
+   return 0;
+}
diff --git a/compat/mingw.h b/compat/mingw.h
index df0e320..d00ba7a 100644
--- a/compat/mingw.h
+++ b/compat/mingw.h
@@ -77,6 +77,14 @@ struct itimerval {
 };
 #define ITIMER_REAL 0
 
+struct utsname {
+   char sysname[16];
+   char nodename[1];
+   char release[16];
+   char version[16];
+   char machine[1];
+};
+
 /*
  * sanitize preprocessor namespace polluted by Windows headers defining
  * macros which collide with git local versions
@@ -166,6 +174,7 @@ struct passwd *getpwuid(uid_t uid);
 int setitimer(int type, struct itimerval *in, struct itimerval *out);
 int sigaction(int sig, struct sigaction *in, struct sigaction *out);
 int link(const char *oldpath, const char *newpath);
+int uname(struct utsname *buf);
 
 /*
  * replacements of existing functions
-- 
2.2.0.60.gb7b3c64



[PATCH v3 23/23] untracked cache: guard and disable on system changes

2014-12-08 Thread Nguyễn Thái Ngọc Duy
If the user enables untracked cache, then

 - moves the worktree to an unsupported filesystem
 - or simply upgrades the OS
 - or moves the whole (portable) disk from one machine to another
 - or accesses a shared fs from another machine

there's no guarantee that untracked cache can still function properly.
Record the worktree location and OS footprint in the cache. If it
changes, err on the safe side and disable the cache. The user can
run 'update-index --untracked-cache' again to make sure all conditions
are met.

This adds a new requirement that setup_git_directory* must be called
before read_cache() because we need worktree location by then, or the
cache is dropped.

This change does not cover all bases; you can fool it if you try
hard. The point is to stop accidents.
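
The guard can be sketched as below. This is an illustration under
assumptions, not the code from the patch: sketch_ident_string() and
sketch_cache_usable() are hypothetical names, the worktree path is passed
in rather than looked up, and a fixed-size buffer replaces strbuf. Only
the POSIX uname() call and the "Location ..., system ..." format are taken
from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Build "Location <worktree>, system <sysname> <release> <version>"
 * into out. Returns 0 on success, -1 if uname() fails or the buffer
 * is too small.
 */
static int sketch_ident_string(char *out, size_t outlen, const char *worktree)
{
	struct utsname uts;
	int n;

	assert(out && worktree);
	if (uname(&uts) < 0)
		return -1;
	n = snprintf(out, outlen, "Location %s, system %s %s %s",
		     worktree, uts.sysname, uts.release, uts.version);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/*
 * Return 1 if the identity stored with the cache still matches the
 * current environment, 0 otherwise. Any failure counts as a mismatch,
 * so the cache is dropped rather than trusted.
 */
static int sketch_cache_usable(const char *stored, const char *worktree)
{
	char current[1024];

	if (sketch_ident_string(current, sizeof(current), worktree))
		return 0;
	return !strcmp(stored, current);
}
```

A plain string comparison keeps the check cheap, at the cost of
invalidating the cache on any kernel upgrade, which matches the "err on
the safe side" intent described above.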

Helped-by: Eric Sunshine 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/technical/index-format.txt |  3 +++
 dir.c| 28 
 git-compat-util.h|  1 +
 test-dump-untracked-cache.c  |  1 +
 4 files changed, 33 insertions(+)

diff --git a/Documentation/technical/index-format.txt b/Documentation/technical/index-format.txt
index b97ac8d..5dc2bee 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -242,6 +242,9 @@ Git index format
 
   The extension starts with
 
+  - A NUL-terminated string describing the environment when the cache
+is created.
+
   - Stat data of $GIT_DIR/info/exclude. See "Index entry" section from
 ctime field until "file size".
 
diff --git a/dir.c b/dir.c
index 916f1ed..ef58547 100644
--- a/dir.c
+++ b/dir.c
@@ -2246,10 +2246,21 @@ static void write_one_dir(struct untracked_cache_dir *untracked,
write_one_dir(untracked->dirs[i], wd);
 }
 
+static void get_ident_string(struct strbuf *sb)
+{
+   struct utsname uts;
+
+   if (uname(&uts))
+   die_errno(_("failed to get kernel name and information"));
+   strbuf_addf(sb, "Location %s, system %s %s %s", get_git_work_tree(),
+   uts.sysname, uts.release, uts.version);
+}
+
 void write_untracked_extension(struct strbuf *out, struct untracked_cache *untracked)
 {
struct ondisk_untracked_cache *ouc;
struct write_data wd;
+   struct strbuf sb = STRBUF_INIT;
unsigned char varbuf[16];
int len = 0, varint_len;
if (untracked->exclude_per_dir)
@@ -2261,6 +2272,11 @@ void write_untracked_extension(struct strbuf *out, struct untracked_cache *untra
hashcpy(ouc->excludes_file_sha1, untracked->ss_excludes_file.sha1);
ouc->dir_flags = htonl(untracked->dir_flags);
memcpy(ouc->exclude_per_dir, untracked->exclude_per_dir, len + 1);
+
+   get_ident_string(&sb);
+   strbuf_add(out, sb.buf, sb.len + 1);
+   strbuf_release(&sb);
+
strbuf_add(out, ouc, sizeof(*ouc) + len);
if (!untracked->root) {
varint_len = encode_varint(0, varbuf);
@@ -2444,12 +2460,24 @@ struct untracked_cache *read_untracked_extension(const void *data, unsigned long
struct untracked_cache *uc;
struct read_data rd;
const unsigned char *next = data, *end = data + sz;
+   struct strbuf sb = STRBUF_INIT;
int len;
 
if (sz <= 1 || end[-1] != '\0')
return NULL;
end--;
 
+   get_ident_string(&sb);
+   if (strcmp(sb.buf, (const char *)next)) {
+   warning(_("system identification does not match, untracked cache disabled.\n"
+ "Stored: %s\nCurrent: %s\n"),
+   next, sb.buf);
+   strbuf_release(&sb);
+   return NULL;
+   }
+   next += sb.len + 1;
+   strbuf_release(&sb);
+
ouc = (const struct ondisk_untracked_cache *)next;
if (next + sizeof(*ouc) > end)
return NULL;
diff --git a/git-compat-util.h b/git-compat-util.h
index f587749..e9502a1 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -132,6 +132,7 @@
 #elif defined(_MSC_VER)
 #include "compat/msvc.h"
 #else
+#include 
 #include 
 #include 
 #include 
diff --git a/test-dump-untracked-cache.c b/test-dump-untracked-cache.c
index 710441e..25d855d 100644
--- a/test-dump-untracked-cache.c
+++ b/test-dump-untracked-cache.c
@@ -44,6 +44,7 @@ int main(int ac, char **av)
 {
struct untracked_cache *uc;
struct strbuf base = STRBUF_INIT;
+   setup_git_directory();
if (read_cache() < 0)
die("unable to read index file");
uc = the_index.untracked;
-- 
2.2.0.60.gb7b3c64



[PATCH v3 18/23] status: enable untracked cache

2014-12-08 Thread Nguyễn Thái Ngọc Duy
update_index_if_able() is moved down so that the updated untracked
cache could be written out.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/commit.c | 5 +++--
 wt-status.c  | 2 ++
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 5ed6036..bdcfa61 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -1372,13 +1372,14 @@ int cmd_status(int argc, const char **argv, const char *prefix)
refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, &s.pathspec, NULL, NULL);
 
fd = hold_locked_index(&index_lock, 0);
-   if (0 <= fd)
-   update_index_if_able(&the_index, &index_lock);
 
s.is_initial = get_sha1(s.reference, sha1) ? 1 : 0;
s.ignore_submodule_arg = ignore_submodule_arg;
wt_status_collect(&s);
 
+   if (0 <= fd)
+   update_index_if_able(&the_index, &index_lock);
+
if (s.relative_paths)
s.prefix = prefix;
 
diff --git a/wt-status.c b/wt-status.c
index 27da529..8880c3b 100644
--- a/wt-status.c
+++ b/wt-status.c
@@ -585,6 +585,8 @@ static void wt_status_collect_untracked(struct wt_status *s)
DIR_SHOW_OTHER_DIRECTORIES | DIR_HIDE_EMPTY_DIRECTORIES;
if (s->show_ignored_files)
dir.flags |= DIR_SHOW_IGNORED_TOO;
+   else
+   dir.untracked = the_index.untracked;
setup_standard_excludes(&dir);
 
fill_directory(&dir, &s->pathspec);
-- 
2.2.0.60.gb7b3c64



[PATCH v3 05/23] untracked cache: make a wrapper around {open,read,close}dir()

2014-12-08 Thread Nguyễn Thái Ngọc Duy
This allows us to feed different info to read_directory_recursive()
based on untracked cache in the next patch.
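
As a rough, self-contained sketch of the wrapper idea (hypothetical
sketch_* names, and without the untracked-cache member the real struct
carries), the caller loops over open/read/close helpers instead of calling
opendir()/readdir()/closedir() directly, so a later change can swap in a
different source of entries behind the same three functions:

```c
#include <assert.h>
#include <dirent.h>
#include <string.h>

/* Hypothetical, simplified wrapper state. */
struct sketch_cached_dir {
	DIR *fdir;
	struct dirent *de;
};

static int sketch_open_dir(struct sketch_cached_dir *cdir, const char *path)
{
	memset(cdir, 0, sizeof(*cdir));
	cdir->fdir = opendir(*path ? path : ".");
	return cdir->fdir ? 0 : -1;
}

/* Returns 0 and sets cdir->de while entries remain, -1 at the end. */
static int sketch_read_dir(struct sketch_cached_dir *cdir)
{
	if (!cdir->fdir)
		return -1;
	cdir->de = readdir(cdir->fdir);
	return cdir->de ? 0 : -1;
}

static void sketch_close_dir(struct sketch_cached_dir *cdir)
{
	if (cdir->fdir)
		closedir(cdir->fdir);
}

/*
 * Example caller that only sees the wrapper. Returns the number of
 * entries other than "." and "..", or -1 if the directory cannot be
 * opened.
 */
static int sketch_count_entries(const char *path)
{
	struct sketch_cached_dir cdir;
	int n = 0;

	if (sketch_open_dir(&cdir, path))
		return -1;
	while (!sketch_read_dir(&cdir)) {
		assert(cdir.de);
		if (strcmp(cdir.de->d_name, ".") && strcmp(cdir.de->d_name, ".."))
			n++;
	}
	sketch_close_dir(&cdir);
	return n;
}
```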

Helped-by: Ramsay Jones 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c | 55 +++
 1 file changed, 47 insertions(+), 8 deletions(-)

diff --git a/dir.c b/dir.c
index 6e91315..fb6ed86 100644
--- a/dir.c
+++ b/dir.c
@@ -31,6 +31,15 @@ enum path_treatment {
path_untracked
 };
 
+/*
+ * Support data structure for our opendir/readdir/closedir wrappers
+ */
+struct cached_dir {
+   DIR *fdir;
+   struct untracked_cache_dir *untracked;
+   struct dirent *de;
+};
+
 static enum path_treatment read_directory_recursive(struct dir_struct *dir,
const char *path, int len, struct untracked_cache_dir *untracked,
int check_only, const struct path_simplify *simplify);
@@ -1417,12 +1426,13 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
 
 static enum path_treatment treat_path(struct dir_struct *dir,
  struct untracked_cache_dir *untracked,
- struct dirent *de,
+ struct cached_dir *cdir,
  struct strbuf *path,
  int baselen,
  const struct path_simplify *simplify)
 {
int dtype;
+   struct dirent *de = cdir->de;
 
if (is_dot_or_dotdot(de->d_name) || !strcmp(de->d_name, ".git"))
return path_none;
@@ -1444,6 +1454,37 @@ static void add_untracked(struct untracked_cache_dir *dir, const char *name)
dir->untracked[dir->untracked_nr++] = xstrdup(name);
 }
 
+static int open_cached_dir(struct cached_dir *cdir,
+  struct dir_struct *dir,
+  struct untracked_cache_dir *untracked,
+  struct strbuf *path,
+  int check_only)
+{
+   memset(cdir, 0, sizeof(*cdir));
+   cdir->untracked = untracked;
+   cdir->fdir = opendir(path->len ? path->buf : ".");
+   if (!cdir->fdir)
+   return -1;
+   return 0;
+}
+
+static int read_cached_dir(struct cached_dir *cdir)
+{
+   if (cdir->fdir) {
+   cdir->de = readdir(cdir->fdir);
+   if (!cdir->de)
+   return -1;
+   return 0;
+   }
+   return -1;
+}
+
+static void close_cached_dir(struct cached_dir *cdir)
+{
+   if (cdir->fdir)
+   closedir(cdir->fdir);
+}
+
 /*
  * Read a directory tree. We currently ignore anything but
  * directories, regular files and symlinks. That's because git
@@ -1460,23 +1501,21 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
struct untracked_cache_dir *untracked, int check_only,
const struct path_simplify *simplify)
 {
-   DIR *fdir;
+   struct cached_dir cdir;
enum path_treatment state, subdir_state, dir_state = path_none;
-   struct dirent *de;
struct strbuf path = STRBUF_INIT;
 
strbuf_add(&path, base, baselen);
 
-   fdir = opendir(path.len ? path.buf : ".");
-   if (!fdir)
+   if (open_cached_dir(&cdir, dir, untracked, &path, check_only))
goto out;
 
if (untracked)
untracked->check_only = !!check_only;
 
-   while ((de = readdir(fdir)) != NULL) {
+   while (!read_cached_dir(&cdir)) {
/* check how the file or directory should be treated */
-   state = treat_path(dir, untracked, de, &path, baselen, simplify);
+   state = treat_path(dir, untracked, &cdir, &path, baselen, simplify);
 
if (state > dir_state)
dir_state = state;
@@ -1529,7 +1568,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir,
break;
}
}
-   closedir(fdir);
+   close_cached_dir(&cdir);
  out:
strbuf_release(&path);
 
-- 
2.2.0.60.gb7b3c64



[PATCH v3 17/23] untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE

2014-12-08 Thread Nguyễn Thái Ngọc Duy
This can be used to double-check that results with untracked cache are
correct, compared to the vanilla version. The untracked cache remains in
the index, but is not used.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 9b0e1a1..916f1ed 100644
--- a/dir.c
+++ b/dir.c
@@ -1800,7 +1800,7 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
struct untracked_cache_dir *root;
int i;
 
-   if (!dir->untracked)
+   if (!dir->untracked || getenv("GIT_DISABLE_UNTRACKED_CACHE"))
return NULL;
 
/*
-- 
2.2.0.60.gb7b3c64



[PATCH v3 12/23] untracked cache: invalidate at index addition or removal

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Ideally we should implement untracked_cache_remove_from_index() and
untracked_cache_add_to_index() so that they update untracked cache
right away instead of invalidating it and waiting for read_directory()
next time to deal with it. But that may need some more work in
unpack-trees.c. So keep it simple as a first step.

The new call in add_index_entry_with_check() may look strange because
new calls usually stay close to cache_tree_invalidate_path(). We do it
a bit later than c_t_i_p() in this function because if it's about
replacing the entry with the same name, we don't care (but cache-tree
does).
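
The invalidation itself only needs the directory that contains the changed
path. A minimal sketch of that step follows; the sketch_* names and the
two-field struct are hypothetical, and the lookup of the directory node in
the cache tree is left out:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced version of a cached directory node. */
struct sketch_dir {
	int valid;
	int untracked_nr;
};

/*
 * Length of the directory part of a slash-separated path: the bytes
 * before the last '/', or 0 when the entry lives in the root directory.
 */
static size_t sketch_parent_dir_len(const char *path)
{
	const char *sep;

	assert(path);
	sep = strrchr(path, '/');
	return sep ? (size_t)(sep - path) : 0;
}

/*
 * Mark a directory node stale so the next directory scan rereads it
 * instead of trusting the cached list of untracked entries.
 */
static void sketch_invalidate(struct sketch_dir *d)
{
	d->valid = 0;
	d->untracked_nr = 0;
}
```

For "dir/sub/file.c" the directory part is "dir/sub" (7 bytes); for a path
without a slash the root node is the one to invalidate.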

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c  | 31 +++
 dir.h  |  4 
 read-cache.c   |  4 
 unpack-trees.c |  7 +--
 4 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/dir.c b/dir.c
index 0126a0d..a14b48f 100644
--- a/dir.c
+++ b/dir.c
@@ -2493,3 +2493,34 @@ done2:
}
return uc;
 }
+
+void untracked_cache_invalidate_path(struct index_state *istate,
+const char *path)
+{
+   const char *sep;
+   struct untracked_cache_dir *d;
+   if (!istate->untracked || !istate->untracked->root)
+   return;
+   sep = strrchr(path, '/');
+   if (sep)
+   d = lookup_untracked(istate->untracked,
+istate->untracked->root,
+path, sep - path);
+   else
+   d = istate->untracked->root;
+   istate->untracked->dir_invalidated++;
+   d->valid = 0;
+   d->untracked_nr = 0;
+}
+
+void untracked_cache_remove_from_index(struct index_state *istate,
+  const char *path)
+{
+   untracked_cache_invalidate_path(istate, path);
+}
+
+void untracked_cache_add_to_index(struct index_state *istate,
+ const char *path)
+{
+   untracked_cache_invalidate_path(istate, path);
+}
diff --git a/dir.h b/dir.h
index 40a679a..2ce7dd3 100644
--- a/dir.h
+++ b/dir.h
@@ -298,6 +298,10 @@ static inline int dir_path_match(const struct dir_entry *ent,
  has_trailing_dir);
 }
 
+void untracked_cache_invalidate_path(struct index_state *, const char *);
+void untracked_cache_remove_from_index(struct index_state *, const char *);
+void untracked_cache_add_to_index(struct index_state *, const char *);
+
 void free_untracked_cache(struct untracked_cache *);
 struct untracked_cache *read_untracked_extension(const void *data, unsigned long sz);
 void write_untracked_extension(struct strbuf *out, struct untracked_cache *untracked);
diff --git a/read-cache.c b/read-cache.c
index 3736a56..d643a3f 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -78,6 +78,7 @@ void rename_index_entry_at(struct index_state *istate, int nr, const char *new_n
memcpy(new->name, new_name, namelen + 1);
 
cache_tree_invalidate_path(istate, old->name);
+   untracked_cache_remove_from_index(istate, old->name);
remove_index_entry_at(istate, nr);
add_index_entry(istate, new, ADD_CACHE_OK_TO_ADD|ADD_CACHE_OK_TO_REPLACE);
 }
@@ -537,6 +538,7 @@ int remove_file_from_index(struct index_state *istate, const char *path)
if (pos < 0)
pos = -pos-1;
cache_tree_invalidate_path(istate, path);
+   untracked_cache_remove_from_index(istate, path);
while (pos < istate->cache_nr && !strcmp(istate->cache[pos]->name, path))
remove_index_entry_at(istate, pos);
return 0;
@@ -968,6 +970,8 @@ static int add_index_entry_with_check(struct index_state *istate, struct cache_e
}
pos = -pos-1;
 
+   untracked_cache_add_to_index(istate, ce->name);
+
/*
 * Inserting a merged entry ("stage 0") into the index
 * will always replace all non-merged entries..
diff --git a/unpack-trees.c b/unpack-trees.c
index 629c658..e5ddb0c 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -9,6 +9,7 @@
 #include "refs.h"
 #include "attr.h"
 #include "split-index.h"
+#include "dir.h"
 
 /*
  * Error messages expected by scripts out of plumbing commands such as
@@ -1255,8 +1256,10 @@ static int verify_uptodate_sparse(const struct cache_entry *ce,
 static void invalidate_ce_path(const struct cache_entry *ce,
   struct unpack_trees_options *o)
 {
-   if (ce)
-   cache_tree_invalidate_path(o->src_index, ce->name);
+   if (!ce)
+   return;
+   cache_tree_invalidate_path(o->src_index, ce->name);
+   untracked_cache_invalidate_path(o->src_index, ce->name);
 }
 
 /*
-- 
2.2.0.60.gb7b3c64



[PATCH v3 11/23] untracked cache: load from UNTR index extension

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c| 220 +++
 dir.h|   2 +
 read-cache.c |   5 ++
 3 files changed, 227 insertions(+)

diff --git a/dir.c b/dir.c
index a0a7330..0126a0d 100644
--- a/dir.c
+++ b/dir.c
@@ -2273,3 +2273,223 @@ void write_untracked_extension(struct strbuf *out, struct untracked_cache *untra
strbuf_release(&wd.sb_stat);
strbuf_release(&wd.sb_sha1);
 }
+
+static void free_untracked(struct untracked_cache_dir *ucd)
+{
+   int i;
+   if (!ucd)
+   return;
+   for (i = 0; i < ucd->dirs_nr; i++)
+   free_untracked(ucd->dirs[i]);
+   for (i = 0; i < ucd->untracked_nr; i++)
+   free(ucd->untracked[i]);
+   free(ucd->untracked);
+   free(ucd->dirs);
+   free(ucd);
+}
+
+void free_untracked_cache(struct untracked_cache *uc)
+{
+   if (uc)
+   free_untracked(uc->root);
+   free(uc);
+}
+
+struct read_data {
+   int index;
+   struct untracked_cache_dir **ucd;
+   struct ewah_bitmap *check_only;
+   struct ewah_bitmap *valid;
+   struct ewah_bitmap *sha1_valid;
+   const unsigned char *data;
+   const unsigned char *end;
+};
+
+static void stat_data_from_disk(struct stat_data *to, const struct stat_data *from)
+{
+   to->sd_ctime.sec  = get_be32(&from->sd_ctime.sec);
+   to->sd_ctime.nsec = get_be32(&from->sd_ctime.nsec);
+   to->sd_mtime.sec  = get_be32(&from->sd_mtime.sec);
+   to->sd_mtime.nsec = get_be32(&from->sd_mtime.nsec);
+   to->sd_dev        = get_be32(&from->sd_dev);
+   to->sd_ino        = get_be32(&from->sd_ino);
+   to->sd_uid        = get_be32(&from->sd_uid);
+   to->sd_gid        = get_be32(&from->sd_gid);
+   to->sd_size       = get_be32(&from->sd_size);
+}
+
+static int read_one_dir(struct untracked_cache_dir **untracked_,
+   struct read_data *rd)
+{
+#define NEXT(x) \
+   next = data + (x); \
+   if (next > rd->end) \
+   return -1;
+
+   struct untracked_cache_dir ud, *untracked;
+   const unsigned char *next, *data = rd->data, *end = rd->end;
+   unsigned int value;
+   int i, len;
+
+   memset(&ud, 0, sizeof(ud));
+
+   next = data;
+   value = decode_varint(&next);
+   if (next > end)
+   return -1;
+   ud.recurse = 1;
+   ud.untracked_alloc = value;
+   ud.untracked_nr    = value;
+   if (ud.untracked_nr)
+   ud.untracked = xmalloc(sizeof(*ud.untracked) * ud.untracked_nr);
+   data = next;
+
+   next = data;
+   ud.dirs_alloc = ud.dirs_nr = decode_varint(&next);
+   if (next > end)
+   return -1;
+   ud.dirs = xmalloc(sizeof(*ud.dirs) * ud.dirs_nr);
+   data = next;
+
+   len = strlen((const char *)data);
+   NEXT(len + 1);
+   *untracked_ = untracked = xmalloc(sizeof(*untracked) + len);
+   memcpy(untracked, &ud, sizeof(ud));
+   memcpy(untracked->name, data, len + 1);
+   data = next;
+
+   for (i = 0; i < untracked->untracked_nr; i++) {
+   len = strlen((const char *)data);
+   NEXT(len + 1);
+   untracked->untracked[i] = xstrdup((const char*)data);
+   data = next;
+   }
+
+   rd->ucd[rd->index++] = untracked;
+   rd->data = data;
+
+   for (i = 0; i < untracked->dirs_nr; i++) {
+   len = read_one_dir(untracked->dirs + i, rd);
+   if (len < 0)
+   return -1;
+   }
+   return 0;
+}
+
+static void set_check_only(size_t pos, void *cb)
+{
+   struct read_data *rd = cb;
+   struct untracked_cache_dir *ud = rd->ucd[pos];
+   ud->check_only = 1;
+}
+
+static void read_stat(size_t pos, void *cb)
+{
+   struct read_data *rd = cb;
+   struct untracked_cache_dir *ud = rd->ucd[pos];
+   if (rd->data + sizeof(struct stat_data) > rd->end) {
+   rd->data = rd->end + 1;
+   return;
+   }
+   stat_data_from_disk(&ud->stat_data, (struct stat_data *)rd->data);
+   rd->data += sizeof(struct stat_data);
+   ud->valid = 1;
+}
+
+static void read_sha1(size_t pos, void *cb)
+{
+   struct read_data *rd = cb;
+   struct untracked_cache_dir *ud = rd->ucd[pos];
+   if (rd->data + 20 > rd->end) {
+   rd->data = rd->end + 1;
+   return;
+   }
+   hashcpy(ud->exclude_sha1, rd->data);
+   rd->data += 20;
+}
+
+static void load_sha1_stat(struct sha1_stat *sha1_stat,
+  const struct stat_data *stat,
+  const unsigned char *sha1)
+{
+   stat_data_from_disk(&sha1_stat->stat, stat);
+   hashcpy(sha1_stat->sha1, sha1);
+   sha1_stat->valid = 1;
+}
+
+struct untracked_cache *read_untracked_extension(const void *data, unsigned long sz)
+{
+   const struct ondisk_untracked_cache *ouc;
+   struct unt

[PATCH v3 04/23] untracked cache: invalidate dirs recursively if .gitignore changes

2014-12-08 Thread Nguyễn Thái Ngọc Duy
It's easy to see that if an existing .gitignore changes, its SHA-1
would be different and invalidate_gitignore() is called.

If .gitignore is removed, add_excludes() will treat it like an empty
.gitignore, which again should invalidate the cached directory data.

If .gitignore is added, lookup_untracked() already fills in the initial
.gitignore SHA-1 as "empty file", so again invalidate_gitignore() is
called.
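
The three cases above all reduce to "the .gitignore hash differs from
the cached one, so drop the whole subtree". A simplified sketch,
assuming a toy tree with a fixed-size child array; struct toy_dir and
the toy_* functions are hypothetical and only mirror the shape of the
patch:

```c
#include <string.h>

#define TOY_MAX_SUBDIRS 4

/* Simplified, hypothetical stand-in for struct untracked_cache_dir. */
struct toy_dir {
	int valid;
	unsigned char exclude_sha1[20];
	int dirs_nr;
	struct toy_dir *dirs[TOY_MAX_SUBDIRS];
};

/* A changed ignore file affects this directory and everything below. */
static void toy_invalidate_recursive(struct toy_dir *dir)
{
	int i;
	dir->valid = 0;
	for (i = 0; i < dir->dirs_nr; i++)
		toy_invalidate_recursive(dir->dirs[i]);
}

/*
 * Compare the freshly computed .gitignore hash with the cached one.
 * Returns 1 if the subtree was invalidated, 0 if nothing changed.
 */
static int toy_check_gitignore(struct toy_dir *dir,
			       const unsigned char *new_sha1)
{
	if (!memcmp(dir->exclude_sha1, new_sha1, 20))
		return 0;
	toy_invalidate_recursive(dir);
	memcpy(dir->exclude_sha1, new_sha1, 20);
	return 1;
}
```

A removed or newly added .gitignore needs no special case here, since
both are represented by a hash that differs from the cached one.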

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c | 18 +-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 44ed9f2..6e91315 100644
--- a/dir.c
+++ b/dir.c
@@ -1010,7 +1010,23 @@ static void prep_exclude(struct dir_struct *dir, const char *base, int baselen)
add_excludes(el->src, el->src, stk->baselen, el, 1,
 untracked ? &sha1_stat : NULL);
}
-   if (untracked) {
+   /*
+* NEEDSWORK: when untracked cache is enabled, prep_exclude()
+* will first be called in valid_cached_dir() then maybe many
+* times more in last_exclude_matching(). When the cache is
+* used, last_exclude_matching() will not be called and
+* reading .gitignore content will be a waste.
+*
+* So when it's called by valid_cached_dir() and we can get
+* .gitignore SHA-1 from the index (i.e. .gitignore is not
+* modified on work tree), we could delay reading the
+* .gitignore content until we absolutely need it in
+* last_exclude_matching(). Be careful about ignore rule
+* order, though, if you do that.
+*/
+   if (untracked &&
+   hashcmp(sha1_stat.sha1, untracked->exclude_sha1)) {
+   invalidate_gitignore(dir->untracked, untracked);
hashcpy(untracked->exclude_sha1, sha1_stat.sha1);
}
dir->exclude_stack = stk;
-- 
2.2.0.60.gb7b3c64



[PATCH v3 13/23] read-cache.c: split racy stat test to a separate function

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 read-cache.c | 24 +++-
 1 file changed, 15 insertions(+), 9 deletions(-)

diff --git a/read-cache.c b/read-cache.c
index d643a3f..f12a185 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -270,20 +270,26 @@ static int ce_match_stat_basic(const struct cache_entry *ce, struct stat *st)
return changed;
 }
 
-static int is_racy_timestamp(const struct index_state *istate,
-const struct cache_entry *ce)
+static int is_racy_stat(const struct index_state *istate,
+   const struct stat_data *sd)
 {
-   return (!S_ISGITLINK(ce->ce_mode) &&
-   istate->timestamp.sec &&
+   return (istate->timestamp.sec &&
 #ifdef USE_NSEC
 /* nanosecond timestamped files can also be racy! */
-   (istate->timestamp.sec < ce->ce_stat_data.sd_mtime.sec ||
-(istate->timestamp.sec == ce->ce_stat_data.sd_mtime.sec &&
- istate->timestamp.nsec <= ce->ce_stat_data.sd_mtime.nsec))
+   (istate->timestamp.sec < sd->sd_mtime.sec ||
+(istate->timestamp.sec == sd->sd_mtime.sec &&
+ istate->timestamp.nsec <= sd->sd_mtime.nsec))
 #else
-   istate->timestamp.sec <= ce->ce_stat_data.sd_mtime.sec
+   istate->timestamp.sec <= sd->sd_mtime.sec
 #endif
-);
+   );
+}
+
+static int is_racy_timestamp(const struct index_state *istate,
+const struct cache_entry *ce)
+{
+   return (!S_ISGITLINK(ce->ce_mode) &&
+   is_racy_stat(istate, &ce->ce_stat_data));
 }
 
 int ie_match_stat(const struct index_state *istate,
-- 
2.2.0.60.gb7b3c64



[PATCH v3 06/23] untracked cache: record/validate dir mtime and reuse cached output

2014-12-08 Thread Nguyễn Thái Ngọc Duy
The main readdir loop in read_directory_recursive() is replaced with a
new one that checks if the cached results of a directory are still valid.

If a file is added to or removed from the index, the containing directory
is invalidated (but not its subdirs). If a directory's mtime changes,
the same happens. If a .gitignore is updated, the containing directory
and all subdirs are invalidated recursively. If dir_struct#flags or
other conditions change, the cache is ignored.

If a directory is invalidated, we opendir/readdir/closedir and run the
exclude machinery on that directory listing as usual. If untracked
cache is also enabled, we'll update the cache along the way. If a
directory is validated, we simply pull the untracked listing out from
the cache. The cache also records the list of direct subdirs that we
have to recurse into. Fully excluded directories are seen as "untracked
files".

In the best case when no dirs are invalidated, read_directory()
becomes a series of

  stat(dir), open(.gitignore), fstat(), read(), close() and optionally
  hash_sha1_file()

For comparison, standard read_directory() is a sequence of

  opendir(), readdir(), open(.gitignore), fstat(), read(), close(), the
  expensive last_exclude_matching() and closedir().

We already try not to open(.gitignore) if we know it does not exist,
so open/fstat/read/close sequence does not apply to every
directory. The sequence could be reduced further, as noted in
prep_exclude() in another patch. So in theory, the entire best-case
read_directory sequence could be reduced to a series of stat() and
nothing else.

This is not a silver bullet approach. When you compile a C file, for
example, the old .o file is removed and a new one with the same name
created, effectively invalidating the containing directory's cache
(but not its subdirectories). If your build process touches every
directory, this cache adds extra overhead for nothing, so it's a good
idea to separate generated files from tracked files. Editors may use
the same strategy for saving files. And of course you're out of luck
running your repo on an unsupported filesystem and/or operating system.
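
The per-directory check in the best case can be pictured roughly as
follows. This is only a sketch using plain stat() and the mtime in
seconds; struct toy_cached_dir and toy_dir_cache_usable() are invented
for the example, and the real code compares more stat fields.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical per-directory cache record. */
struct toy_cached_dir {
	int valid;
	time_t mtime;	/* mtime recorded when the listing was cached */
};

/*
 * Return 1 if the cached listing for "path" can be reused, 0 if the
 * directory has to be re-read with opendir()/readdir().
 */
static int toy_dir_cache_usable(struct toy_cached_dir *cached,
				const char *path)
{
	struct stat st;

	if (stat(path, &st) < 0 || !S_ISDIR(st.st_mode)) {
		cached->valid = 0;
		return 0;
	}
	if (!cached->valid || cached->mtime != st.st_mtime) {
		/* Remember the new mtime; the caller refills the listing. */
		cached->mtime = st.st_mtime;
		cached->valid = 0;
		return 0;
	}
	return 1;
}
```

Creating or deleting an entry in a directory changes that directory's
mtime, which is why rebuilding a .o file is enough to force a re-read
of the directory that contains it.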

Helped-by: Eric Sunshine 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c | 121 --
 dir.h |   2 ++
 2 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/dir.c b/dir.c
index fb6ed86..8c989e3 100644
--- a/dir.c
+++ b/dir.c
@@ -37,7 +37,12 @@ enum path_treatment {
 struct cached_dir {
DIR *fdir;
struct untracked_cache_dir *untracked;
+   int nr_files;
+   int nr_dirs;
+
struct dirent *de;
+   const char *file;
+   struct untracked_cache_dir *ucd;
 };
 
 static enum path_treatment read_directory_recursive(struct dir_struct *dir,
@@ -607,6 +612,14 @@ static void invalidate_gitignore(struct untracked_cache *uc,
do_invalidate_gitignore(dir);
 }
 
+static void invalidate_directory(struct untracked_cache *uc,
+struct untracked_cache_dir *dir)
+{
+   uc->dir_invalidated++;
+   dir->valid = 0;
+   dir->untracked_nr = 0;
+}
+
 /*
  * Given a file with name "fname", read it (either from disk, or from
  * the index if "check_index" is non-zero), parse it and store the
@@ -1424,6 +1437,39 @@ static enum path_treatment treat_one_path(struct dir_struct *dir,
}
 }
 
+static enum path_treatment treat_path_fast(struct dir_struct *dir,
+  struct untracked_cache_dir *untracked,
+  struct cached_dir *cdir,
+  struct strbuf *path,
+  int baselen,
+  const struct path_simplify *simplify)
+{
+   strbuf_setlen(path, baselen);
+   if (!cdir->ucd) {
+   strbuf_addstr(path, cdir->file);
+   return path_untracked;
+   }
+   strbuf_addstr(path, cdir->ucd->name);
+   /* treat_one_path() does this before it calls treat_directory() */
+   if (path->buf[path->len - 1] != '/')
+   strbuf_addch(path, '/');
+   if (cdir->ucd->check_only)
+   /*
+* check_only is set as a result of treat_directory() getting
+* to its bottom. Verify again the same set of directories
+* with check_only set.
+*/
+   return read_directory_recursive(dir, path->buf, path->len,
+   cdir->ucd, 1, simplify);
+   /*
+* We get path_recurse in the first run when
+* directory_exists_in_index() returns index_nonexistent. We
+* are sure that new changes in the index does not impact the
+* outcome. Return now.
+*/
+   return path_recurse;
+}
+
 static enum path_treatment treat_path(struct dir_struct *dir,
  struct untracked_cache

[PATCH v3 16/23] untracked cache: mark index dirty if untracked cache is updated

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 cache.h  | 1 +
 dir.c| 9 +
 read-cache.c | 2 +-
 3 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/cache.h b/cache.h
index f8b3dc5..fca979b 100644
--- a/cache.h
+++ b/cache.h
@@ -295,6 +295,7 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define RESOLVE_UNDO_CHANGED   (1 << 4)
 #define CACHE_TREE_CHANGED (1 << 5)
 #define SPLIT_INDEX_ORDERED    (1 << 6)
+#define UNTRACKED_CHANGED  (1 << 7)
 
 struct split_index;
 struct untracked_cache;
diff --git a/dir.c b/dir.c
index 14dbd7a..9b0e1a1 100644
--- a/dir.c
+++ b/dir.c
@@ -1933,6 +1933,15 @@ int read_directory(struct dir_struct *dir, const char *path, int len, const stru
 dir->untracked->gitignore_invalidated,
 dir->untracked->dir_invalidated,
 dir->untracked->dir_opened);
+   if (dir->untracked == the_index.untracked &&
+   (dir->untracked->dir_opened ||
+dir->untracked->gitignore_invalidated ||
+dir->untracked->dir_invalidated))
+   the_index.cache_changed |= UNTRACKED_CHANGED;
+   if (dir->untracked != the_index.untracked) {
+   free(dir->untracked);
+   dir->untracked = NULL;
+   }
}
return dir->nr;
 }
diff --git a/read-cache.c b/read-cache.c
index 0ecba05..71d8e20 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -42,7 +42,7 @@ static struct cache_entry *refresh_cache_entry(struct cache_entry *ce,
 /* changes that can be kept in $GIT_DIR/index (basically all extensions) */
 #define EXTMASK (RESOLVE_UNDO_CHANGED | CACHE_TREE_CHANGED | \
 CE_ENTRY_ADDED | CE_ENTRY_REMOVED | CE_ENTRY_CHANGED | \
-SPLIT_INDEX_ORDERED)
+SPLIT_INDEX_ORDERED | UNTRACKED_CHANGED)
 
 struct index_state the_index;
 static const char *alternate_index_output;
-- 
2.2.0.60.gb7b3c64



[PATCH v3 09/23] ewah: add convenient wrapper ewah_serialize_strbuf()

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 ewah/ewah_io.c | 13 +
 ewah/ewok.h|  2 ++
 split-index.c  | 11 ++-
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c
index 1c2d7af..43481b9 100644
--- a/ewah/ewah_io.c
+++ b/ewah/ewah_io.c
@@ -19,6 +19,7 @@
  */
 #include "git-compat-util.h"
 #include "ewok.h"
+#include "strbuf.h"
 
 int ewah_serialize_native(struct ewah_bitmap *self, int fd)
 {
@@ -110,6 +111,18 @@ int ewah_serialize(struct ewah_bitmap *self, int fd)
return ewah_serialize_to(self, write_helper, (void *)(intptr_t)fd);
 }
 
+static int write_strbuf(void *user_data, const void *data, size_t len)
+{
+   struct strbuf *sb = user_data;
+   strbuf_add(sb, data, len);
+   return len;
+}
+
+int ewah_serialize_strbuf(struct ewah_bitmap *self, struct strbuf *sb)
+{
+   return ewah_serialize_to(self, write_strbuf, sb);
+}
+
 int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len)
 {
const uint8_t *ptr = map;
diff --git a/ewah/ewok.h b/ewah/ewok.h
index f6ad190..4d7f5e9 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -30,6 +30,7 @@
 #  define ewah_calloc xcalloc
 #endif
 
+struct strbuf;
 typedef uint64_t eword_t;
 #define BITS_IN_WORD (sizeof(eword_t) * 8)
 
@@ -97,6 +98,7 @@ int ewah_serialize_to(struct ewah_bitmap *self,
  void *out);
 int ewah_serialize(struct ewah_bitmap *self, int fd);
 int ewah_serialize_native(struct ewah_bitmap *self, int fd);
+int ewah_serialize_strbuf(struct ewah_bitmap *self, struct strbuf *);
 
 int ewah_deserialize(struct ewah_bitmap *self, int fd);
 int ewah_read_mmap(struct ewah_bitmap *self, const void *map, size_t len);
diff --git a/split-index.c b/split-index.c
index 21485e2..968b780 100644
--- a/split-index.c
+++ b/split-index.c
@@ -41,13 +41,6 @@ int read_link_extension(struct index_state *istate,
return 0;
 }
 
-static int write_strbuf(void *user_data, const void *data, size_t len)
-{
-   struct strbuf *sb = user_data;
-   strbuf_add(sb, data, len);
-   return len;
-}
-
 int write_link_extension(struct strbuf *sb,
 struct index_state *istate)
 {
@@ -55,8 +48,8 @@ int write_link_extension(struct strbuf *sb,
strbuf_add(sb, si->base_sha1, 20);
if (!si->delete_bitmap && !si->replace_bitmap)
return 0;
-   ewah_serialize_to(si->delete_bitmap, write_strbuf, sb);
-   ewah_serialize_to(si->replace_bitmap, write_strbuf, sb);
+   ewah_serialize_strbuf(si->delete_bitmap, sb);
+   ewah_serialize_strbuf(si->replace_bitmap, sb);
return 0;
 }
 
-- 
2.2.0.60.gb7b3c64



[PATCH v3 19/23] update-index: manually enable or disable untracked cache

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Overall time saving on "git status" is about 40% in the best case
scenario, removing wt_status_collect_untracked() as the most time-consuming
function. Read and refresh index operations are now at the top (which
should drop when index-helper and/or watchman support is added). More
numbers and analysis below.

webkit.git
==========

169k files. 6k dirs. Lots of test data (i.e. not touched most of the
time)

Base status
-----------

Index version 4 in split index mode and cache-tree populated. No
untracked cache. It shows how time is consumed by "git status". The
same settings are used for other repos below.

18:28:10.199679 builtin/commit.c:1394   performance: 0.00451 s: cmd_status:setup
18:28:10.474847 read-cache.c:1407   performance: 0.274873831 s: read_index
18:28:10.475295 read-cache.c:1407   performance: 0.00656 s: read_index
18:28:10.728443 preload-index.c:131 performance: 0.253147487 s: read_index_preload
18:28:10.741422 read-cache.c:1254   performance: 0.012868340 s: refresh_index
18:28:10.752300 wt-status.c:623 performance: 0.010421357 s: wt_status_collect_changes_worktree
18:28:10.762069 wt-status.c:629 performance: 0.009644748 s: wt_status_collect_changes_index
18:28:11.601019 wt-status.c:632 performance: 0.838859547 s: wt_status_collect_untracked
18:28:11.605939 builtin/commit.c:1421   performance: 0.004835004 s: cmd_status:update_index
18:28:11.606580 trace.c:415 performance: 1.407878388 s: git command: 'git' 'status'

Populating status
-----------------

This is after enabling the untracked cache, while the cache is still empty.
We see a slight increase in wt_status_collect_untracked() and update_index
(because the new cache has to be written to $GIT_DIR/index).

18:28:18.915213 builtin/commit.c:1394   performance: 0.00326 s: cmd_status:setup
18:28:19.197364 read-cache.c:1407   performance: 0.281901416 s: read_index
18:28:19.197754 read-cache.c:1407   performance: 0.00546 s: read_index
18:28:19.451355 preload-index.c:131 performance: 0.253599607 s: read_index_preload
18:28:19.464400 read-cache.c:1254   performance: 0.012935336 s: refresh_index
18:28:19.475115 wt-status.c:623 performance: 0.010236920 s: wt_status_collect_changes_worktree
18:28:19.486022 wt-status.c:629 performance: 0.010801685 s: wt_status_collect_changes_index
18:28:20.362660 wt-status.c:632 performance: 0.876551366 s: wt_status_collect_untracked
18:28:20.396199 builtin/commit.c:1421   performance: 0.033447969 s: cmd_status:update_index
18:28:20.396939 trace.c:415 performance: 1.482695902 s: git command: 'git' 'status'

Populated status
----------------

After the cache is populated, wt_status_collect_untracked() drops 82%
from 0.838s to 0.144s. Overall time drops 45%. Top offenders are now
read_index() and read_index_preload().

18:28:20.408605 builtin/commit.c:1394   performance: 0.00457 s: cmd_status:setup
18:28:20.692864 read-cache.c:1407   performance: 0.283980458 s: read_index
18:28:20.693273 read-cache.c:1407   performance: 0.00661 s: read_index
18:28:20.958814 preload-index.c:131 performance: 0.265540254 s: read_index_preload
18:28:20.972375 read-cache.c:1254   performance: 0.013437429 s: refresh_index
18:28:20.983959 wt-status.c:623 performance: 0.011146646 s: wt_status_collect_changes_worktree
18:28:20.993948 wt-status.c:629 performance: 0.009879094 s: wt_status_collect_changes_index
18:28:21.138125 wt-status.c:632 performance: 0.144084737 s: wt_status_collect_untracked
18:28:21.173678 builtin/commit.c:1421   performance: 0.035463949 s: cmd_status:update_index
18:28:21.174251 trace.c:415 performance: 0.766707355 s: git command: 'git' 'status'

gentoo-x86.git
==============

This repository is a strange one with a balanced, wide and shallow
worktree (about 100k files and 23k dirs) and no .gitignore in
worktree. .._collect_untracked() time drops 88%, total time drops 56%.

Base status
-----------

18:20:40.828642 builtin/commit.c:1394   performance: 0.00496 s: cmd_status:setup
18:20:41.027233 read-cache.c:1407   performance: 0.198130532 s: read_index
18:20:41.027670 read-cache.c:1407   performance: 0.00581 s: read_index
18:20:41.171716 preload-index.c:131 performance: 0.144045594 s: read_index_preload
18:20:41.179171 read-cache.c:1254   performance: 0.007320424 s: refresh_index
18:20:41.185785 wt-status.c:623 performance: 0.006144638 s: wt_status_collect_changes_worktree
18:20:41.192701 wt-status.c:629 performance: 0.006780184 s: wt_status_collect_changes_index
18:20:41.991723 wt-status.c:632 performance: 0.798927029 s: wt_status_collect_untracked
18:20:41.994664 builtin/commit.c:1421   performance: 0.002852772 s: cmd_status:update_index
18:20:41.995458 trace.c:415 performance: 1.168427502 s: git command: 'git' 'status'

Populating status
-----------------

18:20:48.968848 builtin

[PATCH v3 14/23] untracked cache: avoid racy timestamps

2014-12-08 Thread Nguyễn Thái Ngọc Duy
When a directory is updated within the same second that its timestamp
is last saved, we cannot tell that the directory has been updated by
checking timestamps. Assume the worst (something has been updated). See
29e4d36 (Racy GIT - 2005-12-20) for more information.
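
A stripped-down picture of the rule, using whole seconds only (the
non-USE_NSEC case): if the recorded mtime is not older than the time
the index was written, a later change within that same second would be
invisible, so the entry is reported as changed. The toy_* names below
are illustrative only and are not the functions added by this patch.

```c
#include <time.h>

/*
 * "saved_at" is when the index was last written, "cached_mtime" is
 * the directory mtime recorded in it. A zero "saved_at" means no
 * timestamp is known, in which case nothing is treated as racy.
 */
static int toy_is_racy(time_t saved_at, time_t cached_mtime)
{
	return saved_at && saved_at <= cached_mtime;
}

/* Non-zero means "treat the directory as changed". */
static int toy_dir_changed(time_t saved_at, time_t cached_mtime,
			   time_t current_mtime)
{
	if (toy_is_racy(saved_at, cached_mtime))
		return 1;	/* assume the worst */
	return cached_mtime != current_mtime;
}
```

The cost is an occasional unnecessary re-read of a directory that was
touched just before the index was written.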

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 cache.h  | 2 ++
 dir.c| 4 ++--
 read-cache.c | 8 
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/cache.h b/cache.h
index b14d6e2..f8b3dc5 100644
--- a/cache.h
+++ b/cache.h
@@ -561,6 +561,8 @@ extern void fill_stat_data(struct stat_data *sd, struct stat *st);
  * INODE_CHANGED, and DATA_CHANGED.
  */
 extern int match_stat_data(const struct stat_data *sd, struct stat *st);
+extern int match_stat_data_racy(const struct index_state *istate,
+   const struct stat_data *sd, struct stat *st);
 
 extern void fill_stat_cache_info(struct cache_entry *ce, struct stat *st);
 
diff --git a/dir.c b/dir.c
index a14b48f..7fa372e 100644
--- a/dir.c
+++ b/dir.c
@@ -682,7 +682,7 @@ static int add_excludes(const char *fname, const char *base, int baselen,
if (sha1_stat) {
int pos;
if (sha1_stat->valid &&
-   !match_stat_data(&sha1_stat->stat, &st))
+   !match_stat_data_racy(&the_index, &sha1_stat->stat, &st))
; /* no content change, ss->sha1 still good */
else if (check_index &&
 (pos = cache_name_pos(fname, strlen(fname))) >= 0 &&
@@ -1538,7 +1538,7 @@ static int valid_cached_dir(struct dir_struct *dir,
return 0;
}
if (!untracked->valid ||
-   match_stat_data(&untracked->stat_data, &st)) {
+   match_stat_data_racy(&the_index, &untracked->stat_data, &st)) {
if (untracked->valid)
invalidate_directory(dir->untracked, untracked);
fill_stat_data(&untracked->stat_data, &st);
diff --git a/read-cache.c b/read-cache.c
index f12a185..0ecba05 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -292,6 +292,14 @@ static int is_racy_timestamp(const struct index_state *istate,
is_racy_stat(istate, &ce->ce_stat_data));
 }
 
+int match_stat_data_racy(const struct index_state *istate,
+const struct stat_data *sd, struct stat *st)
+{
+   if (is_racy_stat(istate, sd))
+   return MTIME_CHANGED;
+   return match_stat_data(sd, st);
+}
+
 int ie_match_stat(const struct index_state *istate,
  const struct cache_entry *ce, struct stat *st,
  unsigned int options)
-- 
2.2.0.60.gb7b3c64



[PATCH v3 07/23] untracked cache: mark what dirs should be recursed/saved

2014-12-08 Thread Nguyễn Thái Ngọc Duy
If we redid this in a functional style, we would have one struct
untracked_dir as the input tree and another as the output. The input is
used for verification. The output is a brand new tree, reflecting the
current worktree.

But that means recreating a lot of dir nodes even if a lot could be
shared between input and output trees in good cases. So we go with the
messy but efficient way, combining both input and output trees into
one. We need a way to know which nodes in this combined tree belong to
the output. This is the purpose of the "recurse" flag.

The "valid" bit can't be used for this because it covers the data of the
node except the subdirs. When we invalidate a directory, we want to keep
the cached data of the subdirs intact even though we don't really know
which subdirs still exist (yet). Then we check the worktree to see which
subdirs actually remain on disk. Those will have the 'recurse' bit set
again. If cached data for those is still valid, we may be able to
avoid computing exclude files for them. Those subdirs that are deleted
will have 'recurse' remain clear and their 'valid' bits do not
matter.
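
To make the two bits easier to tell apart, here is a toy model of the
combined tree. Invalidating a directory clears 'recurse' on its
children but leaves their 'valid' bits alone, and only nodes with
'recurse' set count as part of the output. struct toy_node and the
toy_* helpers are hypothetical, and toy_count_saved() merely stands in
for whatever walks the tree at saving time.

```c
#define TOY_MAX_SUBDIRS 4

/* Hypothetical, simplified node of the combined input/output tree. */
struct toy_node {
	const char *name;
	unsigned int valid : 1;
	unsigned int recurse : 1;
	int dirs_nr;
	struct toy_node *dirs[TOY_MAX_SUBDIRS];
};

/* Drop this directory's own data; subdirs must be rediscovered. */
static void toy_invalidate_directory(struct toy_node *dir)
{
	int i;
	dir->valid = 0;
	for (i = 0; i < dir->dirs_nr; i++)
		dir->dirs[i]->recurse = 0;
}

/* Count the nodes that belong to the output tree. */
static int toy_count_saved(const struct toy_node *dir)
{
	int i, n = 1;
	if (!dir->recurse)
		return 0;
	for (i = 0; i < dir->dirs_nr; i++)
		n += toy_count_saved(dir->dirs[i]);
	return n;
}
```

A subdirectory found again on disk gets 'recurse' set back to 1 and can
still reuse its cached data if its own 'valid' bit survived.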

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c | 14 +-
 dir.h |  3 ++-
 2 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/dir.c b/dir.c
index 8c989e3..02cdc26 100644
--- a/dir.c
+++ b/dir.c
@@ -615,9 +615,12 @@ static void invalidate_gitignore(struct untracked_cache *uc,
 static void invalidate_directory(struct untracked_cache *uc,
 struct untracked_cache_dir *dir)
 {
+   int i;
uc->dir_invalidated++;
dir->valid = 0;
dir->untracked_nr = 0;
+   for (i = 0; i < dir->dirs_nr; i++)
+   dir->dirs[i]->recurse = 0;
 }
 
 /*
@@ -1577,6 +1580,10 @@ static int read_cached_dir(struct cached_dir *cdir)
}
while (cdir->nr_dirs < cdir->untracked->dirs_nr) {
struct untracked_cache_dir *d = cdir->untracked->dirs[cdir->nr_dirs];
+   if (!d->recurse) {
+   cdir->nr_dirs++;
+   continue;
+   }
cdir->ucd = d;
cdir->nr_dirs++;
return 0;
@@ -1598,8 +1605,10 @@ static void close_cached_dir(struct cached_dir *cdir)
 * We have gone through this directory and found no untracked
 * entries. Mark it valid.
 */
-   if (cdir->untracked)
+   if (cdir->untracked) {
cdir->untracked->valid = 1;
+   cdir->untracked->recurse = 1;
+   }
 }
 
 /*
@@ -1842,6 +1851,9 @@ static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct *d
invalidate_gitignore(dir->untracked, root);
dir->untracked->ss_excludes_file = dir->ss_excludes_file;
}
+
+   /* Make sure this directory is not dropped out at saving phase */
+   root->recurse = 1;
return root;
 }
 
diff --git a/dir.h b/dir.h
index ff3d99b..95baf01 100644
--- a/dir.h
+++ b/dir.h
@@ -115,8 +115,9 @@ struct untracked_cache_dir {
unsigned int untracked_alloc, dirs_nr, dirs_alloc;
unsigned int untracked_nr;
unsigned int check_only : 1;
-   /* all data in this struct are good */
+   /* all data except 'dirs' in this struct are good */
unsigned int valid : 1;
+   unsigned int recurse : 1;
/* null SHA-1 means this directory does not have .gitignore */
unsigned char exclude_sha1[20];
char name[FLEX_ARRAY];
-- 
2.2.0.60.gb7b3c64



[PATCH v3 03/23] untracked cache: initial untracked cache validation

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Make sure the starting conditions and all global exclude files are
good to go. If not, either disable untracked cache completely, or wipe
out the cache and start fresh.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c | 113 --
 dir.h |   4 +++
 2 files changed, 114 insertions(+), 3 deletions(-)

diff --git a/dir.c b/dir.c
index 27734f0..44ed9f2 100644
--- a/dir.c
+++ b/dir.c
@@ -582,6 +582,22 @@ static struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
return d;
 }
 
+static void do_invalidate_gitignore(struct untracked_cache_dir *dir)
+{
+   int i;
+   dir->valid = 0;
+   dir->untracked_nr = 0;
+   for (i = 0; i < dir->dirs_nr; i++)
+   do_invalidate_gitignore(dir->dirs[i]);
+}
+
+static void invalidate_gitignore(struct untracked_cache *uc,
+struct untracked_cache_dir *dir)
+{
+   uc->gitignore_invalidated++;
+   do_invalidate_gitignore(dir);
+}
+
 /*
  * Given a file with name "fname", read it (either from disk, or from
  * the index if "check_index" is non-zero), parse it and store the
@@ -697,6 +713,13 @@ static void add_excludes_from_file_1(struct dir_struct 
*dir, const char *fname,
 struct sha1_stat *sha1_stat)
 {
struct exclude_list *el;
+   /*
+* catch setup_standard_excludes() that's called before
+* dir->untracked is assigned. That function behaves
+* differently when dir->untracked is non-NULL.
+*/
+   if (!dir->untracked)
+   dir->unmanaged_exclude_files++;
el = add_exclude_list(dir, EXC_FILE, fname);
if (add_excludes(fname, "", 0, el, 0, sha1_stat) < 0)
die("cannot use %s as an exclude file", fname);
@@ -704,6 +727,7 @@ static void add_excludes_from_file_1(struct dir_struct 
*dir, const char *fname,
 
 void add_excludes_from_file(struct dir_struct *dir, const char *fname)
 {
+   dir->unmanaged_exclude_files++; /* see validate_untracked_cache() */
add_excludes_from_file_1(dir, fname, NULL);
 }
 
@@ -1572,9 +1596,87 @@ static int treat_leading_path(struct dir_struct *dir,
return rc;
 }
 
+static struct untracked_cache_dir *validate_untracked_cache(struct dir_struct 
*dir,
+ int base_len,
+ const struct pathspec 
*pathspec)
+{
+   struct untracked_cache_dir *root;
+
+   if (!dir->untracked)
+   return NULL;
+
+   /*
+* We only support $GIT_DIR/info/exclude and core.excludesfile
+* as the global ignore rule files. Any other additions
+* (e.g. from command line) invalidate the cache. This
+* condition also catches running setup_standard_excludes()
+* before setting dir->untracked!
+*/
+   if (dir->unmanaged_exclude_files)
+   return NULL;
+
+   /*
+* Optimize for the main use case only: whole-tree git
+* status. More work involved in treat_leading_path() if we
+* use cache on just a subset of the worktree. pathspec
+* support could make the matter even worse.
+*/
+   if (base_len || (pathspec && pathspec->nr))
+   return NULL;
+
+   /* Different set of flags may produce different results */
+   if (dir->flags != dir->untracked->dir_flags ||
+   /*
+* See treat_directory(), case index_nonexistent. Without
+* this flag, we may need to also cache .git file content
+* for the resolve_gitlink_ref() call, which we don't.
+*/
+   !(dir->flags & DIR_SHOW_OTHER_DIRECTORIES) ||
+   /* We don't support collecting ignore files */
+   (dir->flags & (DIR_SHOW_IGNORED | DIR_SHOW_IGNORED_TOO |
+  DIR_COLLECT_IGNORED)))
+   return NULL;
+
+   /*
+* If we use .gitignore in the cache and now you change it to
+* .gitexclude, everything will go wrong.
+*/
+   if (dir->exclude_per_dir != dir->untracked->exclude_per_dir &&
+   strcmp(dir->exclude_per_dir, dir->untracked->exclude_per_dir))
+   return NULL;
+
+   /*
+* EXC_CMDL is not considered in the cache. If people set it,
+* skip the cache.
+*/
+   if (dir->exclude_list_group[EXC_CMDL].nr)
+   return NULL;
+
+   if (!dir->untracked->root) {
+   const int len = sizeof(*dir->untracked->root);
+   dir->untracked->root = xmalloc(len);
+   memset(dir->untracked->root, 0, len);
+   }
+
+   /* Validate $GIT_DIR/info/exclude and core.excludesfile */
+   root = dir->untracked->root;
+   if (hashcmp(dir->ss_info_exclude.sha1,
+   dir->untracked->ss_info_exclude.sha1)) {
+   invalidate_gitignore(dir->untracked, root);
+

[PATCH v3 10/23] untracked cache: save to an index extension

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/technical/index-format.txt |  58 +
 cache.h  |   3 +
 dir.c| 134 +++
 dir.h|   1 +
 read-cache.c |  12 +++
 5 files changed, 208 insertions(+)

diff --git a/Documentation/technical/index-format.txt 
b/Documentation/technical/index-format.txt
index fe6f316..b97ac8d 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -233,3 +233,61 @@ Git index format
   The remaining index entries after replaced ones will be added to the
   final index. These added entries are also sorted by entry namme then
   stage.
+
+== Untracked cache
+
+  Untracked cache saves the untracked file list and necessary data to
+  verify the cache. The signature for this extension is { 'U', 'N',
+  'T', 'R' }.
+
+  The extension starts with
+
+  - Stat data of $GIT_DIR/info/exclude. See "Index entry" section from
+ctime field until "file size".
+
+  - Stat data of core.excludesfile
+
+  - 32-bit dir_flags (see struct dir_struct)
+
+  - 160-bit SHA-1 of $GIT_DIR/info/exclude. Null SHA-1 means the file
+does not exist.
+
+  - 160-bit SHA-1 of core.excludesfile. Null SHA-1 means the file does
+not exist.
+
+  - NUL-terminated string of per-dir exclude file name. This usually
+is ".gitignore".
+
+  - The number of following directory blocks, variable width
+encoding. If this number is zero, the extension ends here with a
+following NUL.
+
+  - A number of directory blocks in depth-first-search order, each
+consists of
+
+- The number of untracked entries, variable width encoding.
+
+- The number of sub-directory blocks, variable width encoding.
+
+- The directory name terminated by NUL.
+
+- A number of untracked file/dir names terminated by NUL.
+
+The remaining data of each directory block is grouped by type:
+
+  - An ewah bitmap, the n-th bit marks whether the n-th directory has
+valid untracked cache entries.
+
+  - An ewah bitmap, the n-th bit records "check-only" bit of
+read_directory_recursive() for the n-th directory.
+
+  - An ewah bitmap, the n-th bit indicates whether SHA-1 and stat data
+is valid for the n-th directory and exists in the next data.
+
+  - An array of stat data. The n-th data corresponds with the n-th
+"one" bit in the previous ewah bitmap.
+
+  - An array of SHA-1. The n-th SHA-1 corresponds with the n-th "one" bit
+in the previous ewah bitmap.
+
+  - One NUL.
diff --git a/cache.h b/cache.h
index dcf3a2a..b14d6e2 100644
--- a/cache.h
+++ b/cache.h
@@ -297,6 +297,8 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define SPLIT_INDEX_ORDERED(1 << 6)
 
 struct split_index;
+struct untracked_cache;
+
 struct index_state {
struct cache_entry **cache;
unsigned int version;
@@ -310,6 +312,7 @@ struct index_state {
struct hashmap name_hash;
struct hashmap dir_hash;
unsigned char sha1[20];
+   struct untracked_cache *untracked;
 };
 
 extern struct index_state the_index;
diff --git a/dir.c b/dir.c
index 95a0f0a..a0a7330 100644
--- a/dir.c
+++ b/dir.c
@@ -12,6 +12,8 @@
 #include "refs.h"
 #include "wildmatch.h"
 #include "pathspec.h"
+#include "varint.h"
+#include "ewah/ewok.h"
 
 struct path_simplify {
int len;
@@ -2139,3 +2141,135 @@ void clear_directory(struct dir_struct *dir)
}
strbuf_release(&dir->basebuf);
 }
+
+struct ondisk_untracked_cache {
+   struct stat_data info_exclude_stat;
+   struct stat_data excludes_file_stat;
+   uint32_t dir_flags;
+   unsigned char info_exclude_sha1[20];
+   unsigned char excludes_file_sha1[20];
+   char exclude_per_dir[1];
+};
+
+struct write_data {
+   int index; /* number of written untracked_cache_dir */
+   struct ewah_bitmap *check_only; /* from untracked_cache_dir */
+   struct ewah_bitmap *valid;  /* from untracked_cache_dir */
+   struct ewah_bitmap *sha1_valid; /* set if exclude_sha1 is not null */
+   struct strbuf out;
+   struct strbuf sb_stat;
+   struct strbuf sb_sha1;
+};
+
+static void stat_data_to_disk(struct stat_data *to, const struct stat_data 
*from)
+{
+   to->sd_ctime.sec  = htonl(from->sd_ctime.sec);
+   to->sd_ctime.nsec = htonl(from->sd_ctime.nsec);
+   to->sd_mtime.sec  = htonl(from->sd_mtime.sec);
+   to->sd_mtime.nsec = htonl(from->sd_mtime.nsec);
+   to->sd_dev= htonl(from->sd_dev);
+   to->sd_ino= htonl(from->sd_ino);
+   to->sd_uid= htonl(from->sd_uid);
+   to->sd_gid= htonl(from->sd_gid);
+   to->sd_size   = htonl(from->sd_size);
+}
+
+static void write_one_dir(struct untracked_cache_dir *untracked,
+ struct write_data *wd)
+{
+   struct stat_data stat_data;
+   struct 

[PATCH v3 08/23] untracked cache: don't open non-existent .gitignore

2014-12-08 Thread Nguyễn Thái Ngọc Duy
This cuts down a significant number of open(.gitignore) calls because most
directories do not have .gitignore files.
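
The condition added to prep_exclude() below can be read as a small
predicate. The following is a sketch under simplifying assumptions:
struct cached_dir_info and should_open_gitignore() are hypothetical
names, and the skip_worktree case handled elsewhere in the patch is
left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, reduced view of a cached directory entry. */
struct cached_dir_info {
	int valid;			/* cached listing is still trusted */
	unsigned char exclude_sha1[20];	/* all zero: no .gitignore was seen */
};

static int sha1_is_null(const unsigned char *sha1)
{
	static const unsigned char null_sha1[20];	/* zero-initialized */
	return !memcmp(sha1, null_sha1, 20);
}

/* Return 1 if the per-directory exclude file should be opened. */
int should_open_gitignore(const struct cached_dir_info *info)
{
	if (!info || !info->valid)
		return 1;	/* no trustworthy cache: look on disk */
	/*
	 * The directory is unchanged since it was cached. If no
	 * .gitignore existed then, opening it now would only fail.
	 */
	return !sha1_is_null(info->exclude_sha1);
}
```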

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c | 26 +-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/dir.c b/dir.c
index 02cdc26..95a0f0a 100644
--- a/dir.c
+++ b/dir.c
@@ -1019,7 +1019,21 @@ static void prep_exclude(struct dir_struct *dir, const 
char *base, int baselen)
/* Try to read per-directory file */
hashclr(sha1_stat.sha1);
sha1_stat.valid = 0;
-   if (dir->exclude_per_dir) {
+   if (dir->exclude_per_dir &&
+   /*
+* If we know that no files have been added in
+* this directory (i.e. valid_cached_dir() has
+* been executed and set untracked->valid) ..
+*/
+   (!untracked || !untracked->valid ||
+/*
+ * .. and .gitignore does not exist before
+ * (i.e. null exclude_sha1 and skip_worktree is
+ * not set). Then we can skip loading .gitignore,
+ * which would result in ENOENT anyway.
+ * skip_worktree is taken care in read_directory()
+ */
+!is_null_sha1(untracked->exclude_sha1))) {
/*
 * dir->basebuf gets reused by the traversal, but we
 * need fname to remain unchanged to ensure the src
@@ -1782,6 +1796,7 @@ static struct untracked_cache_dir 
*validate_untracked_cache(struct dir_struct *d
  const struct pathspec 
*pathspec)
 {
struct untracked_cache_dir *root;
+   int i;
 
if (!dir->untracked)
return NULL;
@@ -1833,6 +1848,15 @@ static struct untracked_cache_dir 
*validate_untracked_cache(struct dir_struct *d
if (dir->exclude_list_group[EXC_CMDL].nr)
return NULL;
 
+   /*
+* An optimization in prep_exclude() does not play well with
+* CE_SKIP_WORKTREE. It's a rare case anyway, if a single
+* entry has that bit set, disable the whole untracked cache.
+*/
+   for (i = 0; i < active_nr; i++)
+   if (ce_skip_worktree(active_cache[i]))
+   return NULL;
+
if (!dir->untracked->root) {
const int len = sizeof(*dir->untracked->root);
dir->untracked->root = xmalloc(len);
-- 
2.2.0.60.gb7b3c64



[PATCH v3 15/23] untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS

2014-12-08 Thread Nguyễn Thái Ngọc Duy
This could be used to verify correct behavior in tests.
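
As an illustration of how a test might consume these numbers, the
sketch below formats four counters with the same format string as the
trace call in this patch, so the result can be compared against an
expected block of text. struct uc_stats and format_uc_stats() are
hypothetical stand-ins, not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the counters kept in struct untracked_cache. */
struct uc_stats {
	unsigned dir_created;
	unsigned gitignore_invalidated;
	unsigned dir_invalidated;
	unsigned dir_opened;
};

/*
 * Format the counters into buf using the layout of the trace output.
 * Returns what snprintf() returns: the length that was (or would have
 * been) written, excluding the terminating NUL.
 */
int format_uc_stats(const struct uc_stats *s, char *buf, size_t len)
{
	assert(s != NULL && buf != NULL);
	return snprintf(buf, len,
			"node creation: %u\n"
			"gitignore invalidation: %u\n"
			"directory invalidation: %u\n"
			"opendir: %u\n",
			s->dir_created, s->gitignore_invalidated,
			s->dir_invalidated, s->dir_opened);
}
```

A test could then expect, for example, that a second scan of an
unchanged worktree reports zero for every counter.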

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/dir.c b/dir.c
index 7fa372e..14dbd7a 100644
--- a/dir.c
+++ b/dir.c
@@ -1922,6 +1922,18 @@ int read_directory(struct dir_struct *dir, const char 
*path, int len, const stru
free_simplify(simplify);
qsort(dir->entries, dir->nr, sizeof(struct dir_entry *), cmp_name);
qsort(dir->ignored, dir->ignored_nr, sizeof(struct dir_entry *), 
cmp_name);
+   if (dir->untracked) {
+   static struct trace_key trace_untracked_stats = 
TRACE_KEY_INIT(UNTRACKED_STATS);
+   trace_printf_key(&trace_untracked_stats,
+"node creation: %u\n"
+"gitignore invalidation: %u\n"
+"directory invalidation: %u\n"
+"opendir: %u\n",
+dir->untracked->dir_created,
+dir->untracked->gitignore_invalidated,
+dir->untracked->dir_invalidated,
+dir->untracked->dir_opened);
+   }
return dir->nr;
 }
 
-- 
2.2.0.60.gb7b3c64



[PATCH v3 00/23] nd/untracked-cache updates

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Compared to v1 (the one on 'pu', as v2 never got to 'pu'):

 - New cleanup patch 09/23

 - New patch 17/23 allows ignoring the untracked cache without
   destroying it (for comparison and verification)

 - New patches 22/23 and 23/23 add some protection against filesystem
   or operating system changes

 - Document UNTR extension

 - Reorganize UNTR to avoid saving SHA-1/stat data when it's useless

 - Fix writing UNTR to base index in split index mode

 - Various review comments since v1
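
The UNTR documentation in the diff below stores its counts in a
"variable width encoding". The sketch here assumes this means the
offset-based varint scheme Git already uses elsewhere (dir.c in this
series includes "varint.h"). The function names are hypothetical and
overflow checking is omitted for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Encode value into out, most significant group first. Every byte
 * except the last has its high bit set. Returns the number of bytes
 * written; out must have room for at least 10 bytes.
 */
size_t sketch_encode_varint(uint64_t value, unsigned char *out)
{
	unsigned char tmp[16];
	size_t pos = sizeof(tmp) - 1;

	assert(out != NULL);
	tmp[pos] = value & 127;
	while (value >>= 7)
		tmp[--pos] = 128 | (--value & 127);
	memcpy(out, tmp + pos, sizeof(tmp) - pos);
	return sizeof(tmp) - pos;
}

/* Decode one value and advance *bufp past it. No overflow checking. */
uint64_t sketch_decode_varint(const unsigned char **bufp)
{
	const unsigned char *buf = *bufp;
	unsigned char c = *buf++;
	uint64_t val = c & 127;

	while (c & 128) {
		val += 1;	/* undo the offset applied by the encoder */
		c = *buf++;
		val = (val << 7) + (c & 127);
	}
	*bufp = buf;
	return val;
}
```

Under this scheme values up to 127 take a single byte, which suits
per-directory entry counts that are usually small.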

-- 8< --
diff --git a/Documentation/technical/index-format.txt 
b/Documentation/technical/index-format.txt
index fe6f316..5dc2bee 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -233,3 +233,64 @@ Git index format
   The remaining index entries after replaced ones will be added to the
   final index. These added entries are also sorted by entry namme then
   stage.
+
+== Untracked cache
+
+  Untracked cache saves the untracked file list and necessary data to
+  verify the cache. The signature for this extension is { 'U', 'N',
+  'T', 'R' }.
+
+  The extension starts with
+
+  - A NUL-terminated string describing the environment when the cache
+is created.
+
+  - Stat data of $GIT_DIR/info/exclude. See "Index entry" section from
+ctime field until "file size".
+
+  - Stat data of core.excludesfile
+
+  - 32-bit dir_flags (see struct dir_struct)
+
+  - 160-bit SHA-1 of $GIT_DIR/info/exclude. Null SHA-1 means the file
+does not exist.
+
+  - 160-bit SHA-1 of core.excludesfile. Null SHA-1 means the file does
+not exist.
+
+  - NUL-terminated string of per-dir exclude file name. This usually
+is ".gitignore".
+
+  - The number of following directory blocks, variable width
+encoding. If this number is zero, the extension ends here with a
+following NUL.
+
+  - A number of directory blocks in depth-first-search order, each
+consists of
+
+- The number of untracked entries, variable width encoding.
+
+- The number of sub-directory blocks, variable width encoding.
+
+- The directory name terminated by NUL.
+
+- A number of untracked file/dir names terminated by NUL.
+
+The remaining data of each directory block is grouped by type:
+
+  - An ewah bitmap, the n-th bit marks whether the n-th directory has
+valid untracked cache entries.
+
+  - An ewah bitmap, the n-th bit records "check-only" bit of
+read_directory_recursive() for the n-th directory.
+
+  - An ewah bitmap, the n-th bit indicates whether SHA-1 and stat data
+is valid for the n-th directory and exists in the next data.
+
+  - An array of stat data. The n-th data corresponds with the n-th
+"one" bit in the previous ewah bitmap.
+
+  - An array of SHA-1. The n-th SHA-1 corresponds with the n-th "one" bit
+in the previous ewah bitmap.
+
+  - One NUL.
diff --git a/builtin/update-index.c b/builtin/update-index.c
index c1c18db..f23ec83 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -115,6 +115,7 @@ static int test_if_untracked_cache_is_supported(void)
fd = create_file("dir-mtime-test/newfile");
xstat("dir-mtime-test", &st);
if (!match_stat_data(&base, &st)) {
+   close(fd);
fputc('\n', stderr);
fprintf_ln(stderr,_("directory stat info does not "
"change after adding a new file"));
@@ -127,6 +128,7 @@ static int test_if_untracked_cache_is_supported(void)
xmkdir("dir-mtime-test/new-dir");
xstat("dir-mtime-test", &st);
if (!match_stat_data(&base, &st)) {
+   close(fd);
fputc('\n', stderr);
fprintf_ln(stderr, _("directory stat info does not change "
 "after adding a new directory"));
@@ -1094,10 +1096,10 @@ int cmd_update_index(int argc, const char **argv, const 
char *prefix)
/* should be the same flags used by git-status */
uc->dir_flags = DIR_SHOW_OTHER_DIRECTORIES | 
DIR_HIDE_EMPTY_DIRECTORIES;
the_index.untracked = uc;
-   the_index.cache_changed |= SOMETHING_CHANGED;
+   the_index.cache_changed |= UNTRACKED_CHANGED;
} else if (!untracked_cache && the_index.untracked) {
the_index.untracked = NULL;
-   the_index.cache_changed |= SOMETHING_CHANGED;
+   the_index.cache_changed |= UNTRACKED_CHANGED;
}
 
if (active_cache_changed) {
diff --git a/cache.h b/cache.h
index 201b22e..fca979b 100644
--- a/cache.h
+++ b/cache.h
@@ -295,7 +295,7 @@ static inline unsigned int canon_mode(unsigned int mode)
 #define RESOLVE_UNDO_CHANGED   (1 << 4)
 #define CACHE_TREE_CHANGED (1 << 5)
 #define SPLIT_INDEX_ORDERED(1 << 6)
-#define UNTRACKED_CHANGED   (1 << 7)
+#define UNTRACKED_CHANGED  (1 << 7)
 
 struct split_index;
 struct untracked_cache;
diff --git a/compat/mingw.c b/compat/mingw.c
index c5c37e5..88

[PATCH v3 01/23] dir.c: optionally compute sha-1 of a .gitignore file

2014-12-08 Thread Nguyễn Thái Ngọc Duy
This is not used anywhere yet. But the goal is to compare quickly if a
.gitignore file has changed when we have the SHA-1 of both old (cached
somewhere) and new (from index or a tree) versions.
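
Once both SHA-1s are available, the quick comparison mentioned above
amounts to a 20-byte memory compare. A minimal sketch follows;
SHA1_RAWSZ and gitignore_changed() are hypothetical names chosen for
the illustration.

```c
#include <assert.h>
#include <string.h>

#define SHA1_RAWSZ 20	/* hypothetical name; a raw SHA-1 is 20 bytes */

/* Return 1 if the exclude file content differs between the versions. */
int gitignore_changed(const unsigned char *cached_sha1,
		      const unsigned char *current_sha1)
{
	assert(cached_sha1 != NULL && current_sha1 != NULL);
	return memcmp(cached_sha1, current_sha1, SHA1_RAWSZ) != 0;
}
```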

Helped-by: Junio C Hamano 
Helped-by: Torsten Bögershausen 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c | 53 ++---
 dir.h |  6 ++
 2 files changed, 52 insertions(+), 7 deletions(-)

diff --git a/dir.c b/dir.c
index fcb6872..4cc936b 100644
--- a/dir.c
+++ b/dir.c
@@ -466,7 +466,8 @@ void add_exclude(const char *string, const char *base,
x->el = el;
 }
 
-static void *read_skip_worktree_file_from_index(const char *path, size_t *size)
+static void *read_skip_worktree_file_from_index(const char *path, size_t *size,
+   struct sha1_stat *sha1_stat)
 {
int pos, len;
unsigned long sz;
@@ -485,6 +486,10 @@ static void *read_skip_worktree_file_from_index(const char 
*path, size_t *size)
return NULL;
}
*size = xsize_t(sz);
+   if (sha1_stat) {
+   memset(&sha1_stat->stat, 0, sizeof(sha1_stat->stat));
+   hashcpy(sha1_stat->sha1, active_cache[pos]->sha1);
+   }
return data;
 }
 
@@ -529,11 +534,18 @@ static void trim_trailing_spaces(char *buf)
*last_space = '\0';
 }
 
-int add_excludes_from_file_to_list(const char *fname,
-  const char *base,
-  int baselen,
-  struct exclude_list *el,
-  int check_index)
+/*
+ * Given a file with name "fname", read it (either from disk, or from
+ * the index if "check_index" is non-zero), parse it and store the
+ * exclude rules in "el".
+ *
+ * If "ss" is not NULL, compute SHA-1 of the exclude file and fill
+ * stat data from disk (only valid if add_excludes returns zero). If
+ * ss_valid is non-zero, "ss" must contain good value as input.
+ */
+static int add_excludes(const char *fname, const char *base, int baselen,
+   struct exclude_list *el, int check_index,
+   struct sha1_stat *sha1_stat)
 {
struct stat st;
int fd, i, lineno = 1;
@@ -547,7 +559,7 @@ int add_excludes_from_file_to_list(const char *fname,
if (0 <= fd)
close(fd);
if (!check_index ||
-   (buf = read_skip_worktree_file_from_index(fname, &size)) == 
NULL)
+   (buf = read_skip_worktree_file_from_index(fname, &size, 
sha1_stat)) == NULL)
return -1;
if (size == 0) {
free(buf);
@@ -560,6 +572,11 @@ int add_excludes_from_file_to_list(const char *fname,
} else {
size = xsize_t(st.st_size);
if (size == 0) {
+   if (sha1_stat) {
+   fill_stat_data(&sha1_stat->stat, &st);
+   hashcpy(sha1_stat->sha1, EMPTY_BLOB_SHA1_BIN);
+   sha1_stat->valid = 1;
+   }
close(fd);
return 0;
}
@@ -571,6 +588,21 @@ int add_excludes_from_file_to_list(const char *fname,
}
buf[size++] = '\n';
close(fd);
+   if (sha1_stat) {
+   int pos;
+   if (sha1_stat->valid &&
+   !match_stat_data(&sha1_stat->stat, &st))
+   ; /* no content change, ss->sha1 still good */
+   else if (check_index &&
+(pos = cache_name_pos(fname, strlen(fname))) 
>= 0 &&
+!ce_stage(active_cache[pos]) &&
+ce_uptodate(active_cache[pos]))
+   hashcpy(sha1_stat->sha1, 
active_cache[pos]->sha1);
+   else
+   hash_sha1_file(buf, size, "blob", 
sha1_stat->sha1);
+   fill_stat_data(&sha1_stat->stat, &st);
+   sha1_stat->valid = 1;
+   }
}
 
el->filebuf = buf;
@@ -589,6 +621,13 @@ int add_excludes_from_file_to_list(const char *fname,
return 0;
 }
 
+int add_excludes_from_file_to_list(const char *fname, const char *base,
+  int baselen, struct exclude_list *el,
+  int check_index)
+{
+   return add_excludes(fname, base, baselen, el, check_index, NULL);
+}
+
 struct exclude_list *add_exclude_list(struct dir_struct *dir,
  int group_type, const char *src)
 {
diff --git a/dir.h b/dir.h
index 6c45e9d..cdca71b 100644
--- a/dir.h
+++ b/dir.h
@@ -73,6 +73,12 @@ struct exclude_list_group {
struct exclude_list *el;
 };
 

[PATCH] checkout: add --ignore-other-worktrees

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-checkout.txt |  6 ++
 builtin/checkout.c | 19 +++
 t/t2025-checkout-to.sh |  7 +++
 3 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/Documentation/git-checkout.txt b/Documentation/git-checkout.txt
index 0c13825..71d9e4e 100644
--- a/Documentation/git-checkout.txt
+++ b/Documentation/git-checkout.txt
@@ -232,6 +232,12 @@ section of linkgit:git-add[1] to learn how to operate the 
`--patch` mode.
specific files such as HEAD, index... See "MULTIPLE WORKING
TREES" section for more information.
 
+--ignore-other-worktrees::
+   `git checkout` refuses when the wanted ref is already checked out
+   by another worktree. This option makes `git checkout` check the
+   ref out anyway. In other words, the ref is held by more than one
+   worktree.
+
 ::
Branch to checkout; if it refers to a branch (i.e., a name that,
when prepended with "refs/heads/", is a valid ref), then that
diff --git a/builtin/checkout.c b/builtin/checkout.c
index 953b763..8b2bf20 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -37,6 +37,7 @@ struct checkout_opts {
int writeout_stage;
int overwrite_ignore;
int ignore_skipworktree;
+   int ignore_other_worktrees;
 
const char *new_branch;
const char *new_branch_force;
@@ -1079,11 +1080,12 @@ static void check_linked_checkouts(struct branch_info 
*new)
 static int parse_branchname_arg(int argc, const char **argv,
int dwim_new_local_branch_ok,
struct branch_info *new,
-   struct tree **source_tree,
-   unsigned char rev[20],
-   const char **new_branch,
-   int force_detach)
+   struct checkout_opts *opts,
+   unsigned char rev[20])
 {
+   struct tree **source_tree = &opts->source_tree;
+   const char **new_branch = &opts->new_branch;
+   int force_detach = opts->force_detach;
int argcount = 0;
unsigned char branch_rev[20];
const char *arg;
@@ -1209,7 +1211,8 @@ static int parse_branchname_arg(int argc, const char 
**argv,
int flag;
char *head_ref = resolve_refdup("HEAD", 0, sha1, &flag);
if (head_ref &&
-   (!(flag & REF_ISSYMREF) || strcmp(head_ref, new->path)))
+   (!(flag & REF_ISSYMREF) || strcmp(head_ref, new->path)) &&
+   !opts->ignore_other_worktrees)
check_linked_checkouts(new);
free(head_ref);
}
@@ -1340,6 +1343,8 @@ int cmd_checkout(int argc, const char **argv, const char 
*prefix)
N_("second guess 'git checkout 
no-such-branch'")),
OPT_FILENAME(0, "to", &opts.new_worktree,
   N_("check a branch out in a separate working 
directory")),
+   OPT_BOOL(0, "ignore-other-worktrees", 
&opts.ignore_other_worktrees,
+N_("do not check if another worktree is holding the 
given ref")),
OPT_END(),
};
 
@@ -1420,9 +1425,7 @@ int cmd_checkout(int argc, const char **argv, const char 
*prefix)
opts.track == BRANCH_TRACK_UNSPECIFIED &&
!opts.new_branch;
int n = parse_branchname_arg(argc, argv, dwim_ok,
-&new, &opts.source_tree,
-rev, &opts.new_branch,
-opts.force_detach);
+&new, &opts, rev);
argv += n;
argc -= n;
}
diff --git a/t/t2025-checkout-to.sh b/t/t2025-checkout-to.sh
index 915b506..f8e4df4 100755
--- a/t/t2025-checkout-to.sh
+++ b/t/t2025-checkout-to.sh
@@ -79,6 +79,13 @@ test_expect_success 'die the same branch is already checked 
out' '
)
 '
 
+test_expect_success 'not die the same branch is already checked out' '
+   (
+   cd here &&
+   git checkout --ignore-other-worktrees --to anothernewmaster 
newmaster
+   )
+'
+
 test_expect_success 'not die on re-checking out current branch' '
(
cd there &&
-- 
2.2.0.60.gb7b3c64



[PATCH v3 02/23] untracked cache: record .gitignore information and dir hierarchy

2014-12-08 Thread Nguyễn Thái Ngọc Duy
The idea is that if we can capture all input and (non-recursive) output
of read_directory_recursive(), and can verify later that all the input
is the same, then the second r_d_r() should produce the same output as
the first run.

The requirement for this to work is stat info of a directory MUST
change if an entry is added to or removed from that directory (and
should not change often otherwise). If your OS and filesystem do not
meet this requirement, untracked cache is not for you. Most file
systems on *nix should be fine. On Windows, NTFS is fine while FAT may
not be [1] even though FAT on Linux seems to be fine.

The list of inputs to r_d_r() is in the big comment block in dir.h. In
short, the output of a directory (not counting subdirs) mainly depends
on stat info of the directory in question, all .gitignore leading to
it and the check_only flag when r_d_r() is called recursively. This
patch records all this info (and the output) as r_d_r() runs.

Two hash_sha1_file() calls are required for $GIT_DIR/info/exclude and
core.excludesfile unless their stat data matches. For .gitignore files
in the worktree, hash_sha1_file() is only needed when they are modified;
otherwise their SHA-1 in the index is used (see the previous patch).
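
The order in which a SHA-1 is obtained for an exclude file can be
summarized as below. The sketch mirrors the branch order of the
add_excludes() hunk in the previous patch, but the enum, the function
name and the three boolean inputs are hypothetical simplifications.

```c
#include <assert.h>

/* Hypothetical labels for where the SHA-1 of an exclude file comes from. */
enum sha1_source {
	SHA1_KEEP_CACHED,	/* stat data matches: cached SHA-1 still good */
	SHA1_FROM_INDEX,	/* index entry is up to date: copy its SHA-1 */
	SHA1_REHASH		/* content must be hashed again */
};

enum sha1_source pick_sha1_source(int cached_valid, int stat_matches,
				  int index_entry_uptodate)
{
	if (cached_valid && stat_matches)
		return SHA1_KEEP_CACHED;
	if (index_entry_uptodate)
		return SHA1_FROM_INDEX;
	return SHA1_REHASH;
}
```

Only the last case costs a hash computation, which is why an
unmodified .gitignore that is already in the index is cheap to check.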

We could store stat data for .gitignore files so we don't have to
rehash them if their content is different from the index, but I think
.gitignore files are rarely modified, so it is not worth the extra cache
data (and the hashing penalty in read-cache.c:verify_hdr(), as we will
be storing this as an index extension).

The implication is, if you change .gitignore, you had better add it to
the index soon or you lose all the benefit of untracked cache because a
modified .gitignore invalidates all subdirs recursively. This is
especially bad for .gitignore at root.
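
The cost of a modified .gitignore near the root is easiest to see as a
recursive walk over the cached directory tree. The sketch below uses a
hypothetical struct dir_node rather than the real struct
untracked_cache_dir, and returns the number of directories that lose
their cached result.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced directory node of the cache tree. */
struct dir_node {
	int valid;			/* cached result can be trusted */
	size_t nr_children;
	struct dir_node **children;	/* may be NULL when nr_children is 0 */
};

/* Invalidate d and everything below it; return how many nodes were hit. */
size_t invalidate_tree(struct dir_node *d)
{
	size_t i, count = 1;

	assert(d != NULL);
	d->valid = 0;
	for (i = 0; i < d->nr_children; i++)
		count += invalidate_tree(d->children[i]);
	return count;
}
```

Calling this on the root node touches every cached directory, so the
next scan gets no benefit from the cache at all.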

This cached output is about untracked files only, not ignored files,
because the number of untracked files is usually small, so the cache
overhead is small, while the number of ignored files could go really
high (e.g. *.o files mixing with source code).

[1] "Description of NTFS date and time stamps for files and folders"
http://support.microsoft.com/kb/299648

Helped-by: Torsten Bögershausen 
Helped-by: David Turner 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 dir.c | 142 +-
 dir.h |  60 
 2 files changed, 183 insertions(+), 19 deletions(-)

diff --git a/dir.c b/dir.c
index 4cc936b..27734f0 100644
--- a/dir.c
+++ b/dir.c
@@ -32,7 +32,7 @@ enum path_treatment {
 };
 
 static enum path_treatment read_directory_recursive(struct dir_struct *dir,
-   const char *path, int len,
+   const char *path, int len, struct untracked_cache_dir *untracked,
int check_only, const struct path_simplify *simplify);
 static int get_dtype(struct dirent *de, const char *path, int len);
 
@@ -535,6 +535,54 @@ static void trim_trailing_spaces(char *buf)
 }
 
 /*
+ * Given a subdirectory name and "dir" of the current directory,
+ * search the subdir in "dir" and return it, or create a new one if it
+ * does not exist in "dir".
+ *
+ * If "name" has the trailing slash, it'll be excluded in the search.
+ */
+static struct untracked_cache_dir *lookup_untracked(struct untracked_cache *uc,
+   struct untracked_cache_dir 
*dir,
+   const char *name, int len)
+{
+   int first, last;
+   struct untracked_cache_dir *d;
+   if (!dir)
+   return NULL;
+   if (len && name[len - 1] == '/')
+   len--;
+   first = 0;
+   last = dir->dirs_nr;
+   while (last > first) {
+   int cmp, next = (last + first) >> 1;
+   d = dir->dirs[next];
+   cmp = strncmp(name, d->name, len);
+   if (!cmp && strlen(d->name) > len)
+   cmp = -1;
+   if (!cmp)
+   return d;
+   if (cmp < 0) {
+   last = next;
+   continue;
+   }
+   first = next+1;
+   }
+
+   uc->dir_created++;
+   d = xmalloc(sizeof(*d) + len + 1);
+   memset(d, 0, sizeof(*d));
+   memcpy(d->name, name, len);
+   d->name[len] = '\0';
+
+   ALLOC_GROW(dir->dirs, dir->dirs_nr + 1, dir->dirs_alloc);
+   memmove(dir->dirs + first + 1, dir->dirs + first,
+   (dir->dirs_nr - first) * sizeof(*dir->dirs));
+   dir->dirs_nr++;
+   dir->dirs[first] = d;
+   return d;
+}
+
+/*
  * Given a file with name "fname", read it (either from disk, or from
  * the index if "check_index" is non-zero), parse it and store the
  * exclude rules in "el".
@@ -645,14 +693,20 @@ struct exclude_list *add_exclude_list(struct dir_struct 
*dir,
 /*
  * Used to set up core.excludesfile and .git/info/exclude lists.
  */
-void add_excludes_from_file(struct dir_struct *dir, const char *fname)
+static void add_excludes_from_file_1(

Re: [PATCH] fsck: properly bound "invalid tag name" error message

2014-12-08 Thread Johannes Schindelin
Hi Peff,

On Mon, 8 Dec 2014, Jeff King wrote:

> On Mon, Dec 08, 2014 at 12:35:27PM +0100, Johannes Schindelin wrote:
> 
> > On Mon, 8 Dec 2014, Duy Nguyen wrote:
> > 
> > > On Mon, Dec 08, 2014 at 12:57:06AM -0500, Jeff King wrote:
> > > > I do admit that I am tempted to teach index-pack to always NUL-terminate
> > > > objects in memory that we feed to fsck, just to be on the safe side. It
> > > > doesn't cost much, and could prevent a silly mistake (either in the
> > > > future, or one that I missed in my analysis).
> > > 
> > > I think I'm missing a "but.." here.
> > 
> > The "but..."s I have are:
> > 
> > 1) we potentially waste space, and
> 
> I think this can be ignored. It's 1 byte per object, and only while we
> keep the object in RAM. Also, we already do it for buffers read from
> read_sha1_file, so when you run "git log" every commit buffer we keep in
> RAM is already doing this (and has been since basically day one).

Fine with me.

> > 2) I would like to make really certain, preferably with static analysis,
> >that fsck_object() only receives buffers that are NUL terminated, and
> >that no call path is missed.
> 
> I know this is not as good as a real static analysis, but I was
> concerned about this exact thing about a year ago (I think in relation
> to commit parsing for pretty-printing) and traced all of the paths
> through which you can get an object; they all end up in the same few
> code paths that all xmallocz: unpack_sha1_file for loose objects,
> unpack_compressed_entry for pack bases, and patch_delta for deltas.

Thank you for sharing the analysis. This is exactly what I was looking
for.

> Index-pack and unpack-objects are the odd men out here because they are
> processing objects that are not actually in the repository yet. I think
> the spots Duy pointed out probably cover index-pack. It looks like
> builtin/unpack-objects.c:get_data needs the same treatment.

I just started working on that. To see the progress, please have a look
here:

https://github.com/dscho/git/pull/5

Ciao,
Dscho


[PATCH] index-format.txt: add a missing closing quote

2014-12-08 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/technical/index-format.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/technical/index-format.txt 
b/Documentation/technical/index-format.txt
index 1250b5c..35112e4 100644
--- a/Documentation/technical/index-format.txt
+++ b/Documentation/technical/index-format.txt
@@ -207,7 +207,7 @@ Git index format
   in a separate file. This extension records the changes to be made on
   top of that to produce the final index.
 
-  The signature for this extension is { 'l', 'i, 'n', 'k' }.
+  The signature for this extension is { 'l', 'i', 'n', 'k' }.
 
   The extension consists of:
 
-- 
2.2.0.60.gb7b3c64



Re: [PATCH 1/3] tree.c: update read_tree_recursive callback to pass strbuf as base

2014-12-08 Thread Duy Nguyen
On Wed, Dec 3, 2014 at 11:13 PM, Junio C Hamano  wrote:
> A question during the review, especially on proposed log messages
> and documentation changes, is rarely just a request to explain it to
> the questioner in the discussion. It is an indication that what is
> being commented on needs tweaking to be understood.

I do the same at work and somehow forgot to apply the same principle
here :D Do you want me to resend with Eric's wording?
-- 
Duy


Re: [PATCH] fsck: properly bound "invalid tag name" error message

2014-12-08 Thread Jeff King
On Mon, Dec 08, 2014 at 12:35:27PM +0100, Johannes Schindelin wrote:

> On Mon, 8 Dec 2014, Duy Nguyen wrote:
> 
> > On Mon, Dec 08, 2014 at 12:57:06AM -0500, Jeff King wrote:
> > > I do admit that I am tempted to teach index-pack to always NUL-terminate
> > > objects in memory that we feed to fsck, just to be on the safe side. It
> > > doesn't cost much, and could prevent a silly mistake (either in the
> > > future, or one that I missed in my analysis).
> > 
> > I think I'm missing a "but.." here.
> 
> The "but..."s I have are:
> 
> 1) we potentially waste space, and

I think this can be ignored. It's 1 byte per object, and only while we
keep the object in RAM. Also, we already do it for buffers read from
read_sha1_file, so when you run "git log" every commit buffer we keep in
RAM is already doing this (and has been since basically day one).

> 2) I would like to make really certain, preferably with static analysis,
>    that fsck_object() only receives buffers that are NUL terminated, and
>    that no call path is missed.

I know this is not as good as a real static analysis, but I was
concerned about this exact thing about a year ago (I think in relation
to commit parsing for pretty-printing) and traced all of the paths
through which you can get an object; they all end up in the same few
code paths that all xmallocz: unpack_sha1_file for loose objects,
unpack_compressed_entry for pack bases, and patch_delta for deltas.
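
To illustrate what "all xmallocz" buys us, here is a minimal sketch of
the pattern: allocate one byte more than the object size and store a NUL
there, so string functions that scan for a terminator cannot run past
the buffer. The helpers alloc_nul_terminated and copy_object_data are
hypothetical stand-ins written for this example, not git's actual
xmallocz or any of the unpack functions named above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not git's xmallocz): allocate size + 1 bytes and
 * put a NUL in the extra byte. This is the "1 byte per object" cost.
 */
static char *alloc_nul_terminated(size_t size)
{
	char *buf;

	assert(size < (size_t)-1);	/* size + 1 must not wrap around */
	buf = malloc(size + 1);
	if (!buf)
		return NULL;
	buf[size] = '\0';
	return buf;
}

/*
 * Hypothetical helper: copy raw object bytes, which carry no terminator
 * of their own, into a NUL-terminated buffer.
 */
static char *copy_object_data(const void *data, size_t size)
{
	char *buf = alloc_nul_terminated(size);

	if (buf)
		memcpy(buf, data, size);
	return buf;
}
```

With a buffer obtained this way, a call such as strchr(buf, '\n') on
object data that contains no newline stops at the added NUL instead of
reading beyond the allocation.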

Index-pack and unpack-objects are the odd men out here because they are
processing objects that are not actually in the repository yet. I think
the spots Duy pointed out probably cover index-pack. It looks like
builtin/unpack-objects.c:get_data needs the same treatment.

I know that Duy mentioned a while ago killing off unpack-objects and
rolling its functionality into index-pack. That would be a very nice
thing to do if somebody can find the time. There's a fair bit of
duplication, and index-pack receives a lot more attention (so it's
faster, and probably more robust against weird incoming packs).

-Peff


Re: Antw: Re: Enhancement Request: "locale" git option

2014-12-08 Thread David Kastrup
"Ulrich Windl"  writes:

> Ralf Thielow wrote on 06.12.2014 at 20:28:
>> 2014-12-05 16:45 GMT+01:00 Torsten Bögershausen :
>>>
>>> I do not know who was first, and who came later, but
>>> [...]chverfolgen>
>>>
>>> uses "versioniert" as "tracked"
>>>
>>>
>>> LANG=de_DE.UTF-8 git status
>>> gives:
>>> nichts zum Commit vorgemerkt, aber es gibt unbeobachtete Dateien (benutzen
>>> Sie "git add" zum Beobachten)
>>>
>>>
>>> Does it make sense to replace "beobachten" with "versionieren" ?
>>>
>> 
>> I think it makes sense. "versionieren" describes the concept of tracking
>> better than "beobachten", IMO. I'll send a patch for that.
>
> Isolated from usage, "versionieren" and "tracking" have no common translation;
> what about "verfolgen" (~follow) for "tracking"?

What about "bekannt", "unbekannt" and "bekanntmachen"?  "unregistriert",
"registriert", "anmelden"?  Or "ungemeldet", "angemeldet", "anmelden"?

-- 
David Kastrup


Re: [PATCH] fsck: properly bound "invalid tag name" error message

2014-12-08 Thread Johannes Schindelin
Hi Duy,

On Mon, 8 Dec 2014, Duy Nguyen wrote:

> On Mon, Dec 08, 2014 at 12:57:06AM -0500, Jeff King wrote:
> > I do admit that I am tempted to teach index-pack to always NUL-terminate
> > objects in memory that we feed to fsck, just to be on the safe side. It
> > doesn't cost much, and could prevent a silly mistake (either in the
> > future, or one that I missed in my analysis).
> 
> I think I'm missing a "but.." here.

The "but..."s I have are:

1) we potentially waste space, and

2) I would like to make really certain, preferably with static analysis,
   that fsck_object() only receives buffers that are NUL terminated, and
   that no call path is missed.

The patch looks good, of course, but I lack the broad overview of Git's
source code - it has been years since I was familiar enough with it to
know the places touching particular functions off the top of my head.

Ciao,
Dscho

