date:20180221

Re: test bare repository for unit tests

2018-02-21 Thread Basin Ilya

Hi.
git-fast-export would work too, but it creates an additional step. I don't 
commit to the model repo during tests, but I do commit when I want to modify 
the tests.
So far, I configured core.compression=0 and gc.auto=0, created the 
.gitattributes inside the bare repo dir containing one line: * binary
I also created two empty .gitignore files inside refs/ and objects/

I still haven't found a way to force prune without pack after each push.

On 22.02.2018 1:53, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Feb 21 2018, Basin Ilya jotted:
> 
>> Hi.
>> I want to the test-repo-git under 
>> https://github.com/apache/maven-wagon/tree/master/wagon-providers/wagon-scm/src/test/resources/
>> just like test-repo-cvs and test-repo-svn
>>
>> Which configuation options would suit that?
>> I think core.compression 0 for human readable diffs.
>> also, I need to force loose, gc after each push.
> 
> It looks like you have unit tests that are going to do integration tests
> of some SVN/CVS repos as used by some other tool, and want to add git to
> that.
> 
> Since you have git already, the most straightforward thing to do would
> be to ship the output of git-fast-export in the repo, and have the test
> setup code create a repo locally out of that, then test it.
> 
> Or do you really need to commit the raw repo files as-is for some reason
> I've missed?
>

Re: bug in HTTP protocol spec

2018-02-21 Thread Dorian Taylor


> On Feb 21, 2018, at 9:37 PM, Jonathan Nieder  wrote:
> 
> Thanks for writing it.
> 
> Do you mind if we forge your sign-off?  (See Documentation/SubmittingPatches
> item '(5) Certify your work' for details about what this means.)

Sure, or I can just re-paste:

Signed-off-by: Dorian Taylor 

---
Documentation/technical/http-protocol.txt | 10 +++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/http-protocol.txt 
b/Documentation/technical/http-protocol.txt
index a0e45f2889e6e..19d73f7efb338 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -214,14 +214,17 @@ smart server reply:
  S: Cache-Control: no-cache
  S:
  S: 001e# service=git-upload-pack\n
+   S: 
  S: 004895dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint\0multi_ack\n
  S: 0042d049f6c27a2244e12041955e262a404c7faba355 refs/heads/master\n
  S: 003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0\n
  S: 003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}\n
+   S: 

The client may send Extra Parameters (see
Documentation/technical/pack-protocol.txt) as a colon-separated string
-in the Git-Protocol HTTP header.
+in the Git-Protocol HTTP header. Note as well that there is *no* newline
+after the ``.

Dumb Server Response

@@ -264,8 +267,8 @@ Servers MUST set $servicename to be the request parameter 
value.
Servers SHOULD include an LF at the end of this line.
Clients MUST ignore an LF at the end of the line.

-Servers MUST terminate the response with the magic `` end
-pkt-line marker.
+Servers MUST follow the first pkt-line, as well as terminate the
+response, with the magic `` end pkt-line marker.

The returned response is a pkt-line stream describing each ref and
its known value.  The stream SHOULD be sorted by name according to
@@ -278,6 +281,7 @@ Extra Parameter.

 smart_reply =  PKT-LINE("# service=$servicename" LF)
 *1("version 1")
+""
 ref_list
 ""
 ref_list=  empty_list / non_empty_list

---

> 
>> Note I am not sure what the story is behind that `version 1`
>> element, whether it's supposed to go before or after the null packet
>> or if there should be another null packet or what. Perhaps somebody
>> more fluent with the smart protocol can advise.
> 
> I believe the 'version 1' goes after the flush-packet.

I took a traipse through the code and couldn’t determine it one way or another, 
but my money is on that looking something like `000aversion 1\n` on the wire.

--
Dorian Taylor
Make things. Make sense.
https://doriantaylor.com



signature.asc
Description: Message signed with OpenPGP using GPGMail

Re: [PATCH 26/27] sha1_file: allow map_sha1_file to handle arbitrary repositories

2018-02-21 Thread Jonathan Nieder

Stefan Beller wrote:

> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> ---
>  object-store.h | 3 +--
>  sha1_file.c| 5 +++--
>  2 files changed, 4 insertions(+), 4 deletions(-)

Thanks: this is short and simple, making my life as a reviewer very
easy.

Reviewed-by: Jonathan Nieder

Re: [PATCH 21/27] sha1_file: add repository argument to sha1_loose_object_info

2018-02-21 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:

> Add a repository argument to allow the sha1_loose_object_info caller
> to be more specific about which repository to act on. This is a small
> mechanical change; it doesn't change the implementation to handle
> repositories other than the_repository yet.
>
> As with the previous commits, use a macro to catch callers passing a
> repository other than the_repository at compile time.
>
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> ---
>  sha1_file.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)

Simple and obviously correct.
Reviewed-by: Jonathan Nieder

Re: [PATCH 17/27] sha1_file: add repository argument to stat_sha1_file

2018-02-21 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:

> Add a repository argument to allow the stat_sha1_file caller to be
> more specific about which repository to act on. This is a small
> mechanical change; it doesn't change the implementation to handle
> repositories other than the_repository yet.
>
> As with the previous commits, use a macro to catch callers passing a
> repository other than the_repository at compile time.
>
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> ---
>  sha1_file.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)

Simple and obviously correct.
Reviewed-by: Jonathan Nieder

Re: [PATCH 13/27] sha1_file: add repository argument to prepare_alt_odb

2018-02-21 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:

> See previous patch for explanation.
>
> While at it, move the declaration to object-store.h,
> where it should be easier to find.

Which declaration?  It looks like prepare_alt_odb is already in
object-store.h.

[...]
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -717,7 +717,7 @@ int cmd_fsck(int argc, const char **argv, const char 
> *prefix)
>   } else {
>   fsck_object_dir(get_object_directory());
>  
> - prepare_alt_odb();
> + prepare_alt_odb(the_repository);

Patch 2 added a #include of "repository.h".  Good.

(I checked because with the definition of prepare_alt_odb as a macro,
the function call would compile correctly even if the_repository
weren't in scope, but we want to include what we use as a matter of
style/maintainability.)

[...]
> --- a/packfile.c
> +++ b/packfile.c
> @@ -890,7 +890,7 @@ void prepare_packed_git(void)
>   if (the_repository->objects.packed_git_initialized)
>   return;
>   prepare_packed_git_one(get_object_directory(), 1);
> - prepare_alt_odb();
> + prepare_alt_odb(the_repository);

Also good, since patch 3 added a #include of "repository.h".

[...]
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -23,6 +23,7 @@
>  #include "sha1-lookup.h"
>  #include "bulk-checkin.h"
>  #include "repository.h"
> +#include "object-store.h"

Should this #include have been added in an earlier patch, since the
file both uses and defines prepare_alt_odb, which is declared there?

The rest looks good.

Thanks,
Jonathan

[PATCH] t: send verbose test-helper output to fd 4

2018-02-21 Thread Jeff King

This is a repost of the two patches from:

  https://public-inbox.org/git/20180209185710.ga23...@sigill.intra.peff.net/

(now just one patch, since sg/test-i18ngrep graduated and we can do it
all in one step). The idea got positive feedback, but nobody commented
on patches and I didn't see them in "What's cooking".

-- >8 --
Test helper functions like test_must_fail may produce
messages to stderr when they see a problem. When the tests
are run with "--verbose", this ends up on the test script's
stderr, and the user can read it.

But there's a problem. Some tests record stderr as part of
the test, like:

  test_must_fail git foo 2>output &&
  test_i18ngrep expected.message output

In this case the error text goes into "output". This makes
the --verbose output less useful (it also means we might
accidentally match it in the second, though in practice we
tend to produce these messages only on error, so we'd abort
the test when the first command fails).

Let's instead send this user-facing output directly to
descriptor 4, which always points to the original stderr (or
/dev/null in non-verbose mode). And it's already forbidden
to redirect descriptor 4, since we use it for BASH_XTRACEFD,
as explained in 9be795fbce (t5615: avoid re-using descriptor
4, 2017-12-08).

Signed-off-by: Jeff King 
---
 t/test-lib-functions.sh | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 67b5994afb..aabee13e5d 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -625,22 +625,22 @@ test_must_fail () {
exit_code=$?
if test $exit_code -eq 0 && ! list_contains "$_test_ok" success
then
-   echo >&2 "test_must_fail: command succeeded: $*"
+   echo >&4 "test_must_fail: command succeeded: $*"
return 1
elif test_match_signal 13 $exit_code && list_contains "$_test_ok" 
sigpipe
then
return 0
elif test $exit_code -gt 129 && test $exit_code -le 192
then
-   echo >&2 "test_must_fail: died by signal $(($exit_code - 128)): 
$*"
+   echo >&4 "test_must_fail: died by signal $(($exit_code - 128)): 
$*"
return 1
elif test $exit_code -eq 127
then
-   echo >&2 "test_must_fail: command not found: $*"
+   echo >&4 "test_must_fail: command not found: $*"
return 1
elif test $exit_code -eq 126
then
-   echo >&2 "test_must_fail: valgrind error: $*"
+   echo >&4 "test_must_fail: valgrind error: $*"
return 1
fi
return 0
@@ -678,7 +678,7 @@ test_expect_code () {
return 0
fi
 
-   echo >&2 "test_expect_code: command exited with $exit_code, we wanted 
$want_code $*"
+   echo >&4 "test_expect_code: command exited with $exit_code, we wanted 
$want_code $*"
return 1
 }
 
@@ -742,18 +742,18 @@ test_i18ngrep () {
shift
! grep "$@" && return 0
 
-   echo >&2 "error: '! grep $@' did find a match in:"
+   echo >&4 "error: '! grep $@' did find a match in:"
else
grep "$@" && return 0
 
-   echo >&2 "error: 'grep $@' didn't find a match in:"
+   echo >&4 "error: 'grep $@' didn't find a match in:"
fi
 
if test -s "$last_arg"
then
-   cat >&2 "$last_arg"
+   cat >&4 "$last_arg"
else
-   echo >&2 ""
+   echo >&4 ""
fi
 
return 1
@@ -764,7 +764,7 @@ test_i18ngrep () {
 # not output anything when they fail.
 verbose () {
"$@" && return 0
-   echo >&2 "command failed: $(git rev-parse --sq-quote "$@")"
+   echo >&4 "command failed: $(git rev-parse --sq-quote "$@")"
return 1
 }
 
-- 
2.16.2.580.g650ee5408b

Re: [PATCH 05/27] object-store: move packed_git and packed_git_mru to object store

2018-02-21 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:

> In a process with multiple repositories open, packfile accessors
> should be associated to a single repository and not shared globally.
> Move packed_git and packed_git_mru into the_repository and adjust
> callers to reflect this.
>
> Patch generated by
>
>  1. Moving the struct packed_git declaration to object-store.h
> and packed_git, packed_git_mru globals to struct object_store.
>
>  2. Applying the semantic patch
> contrib/coccinelle/refactoring/packed_git.cocci to adjust callers.
> This semantic patch is placed in a sub directory of the coccinelle
> contrib dir, as this semantic patch is not expected to be of general
> usefulness; it is only useful during developing this series and
> merging it with other topics in flight. At a later date, just
> delete that semantic patch.

Can the semantic patch go in the commit message instead?  It is very
brief.

Actually, I don't see this semantic patch in the diffstat.  Is the
commit message stale?

>  3. Applying line wrapping fixes from "make style" to break the
> resulting long lines.
>
>  4. Adding missing #includes of repository.h and object-store.h
> where needed.

Is there a way to automate this step?  (I'm asking for my own
reference when writing future patches, not because of any concern
about the correctness of this one.)
>
>  5. As the packfiles are now owned by an objectstore/repository, which
> is ephemeral unlike globals, we introduce memory leaks. So address
> them in raw_object_store_clear().

The compound words are confusing me.  What is an
objectstore/repository?  Are these referring to particular identifiers
or something else?

Would some wording like the following work?

   5. Freeing packed_git and packed_git_mru in raw_object_store_clear
  to avoid a per-repository memory leak.  Previously they were
  global singletons, so code to free them did not exist.

[...]
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -12,6 +12,7 @@
>  #include "exec_cmd.h"
>  #include "streaming.h"
>  #include "thread-utils.h"
> +#include "object-store.h"
>  #include "packfile.h"
>  
>  static const char index_pack_usage[] =

Change from a different patch leaked into this one?

[...]
> +++ b/builtin/pack-objects.c
[...]
> @@ -1044,7 +1045,7 @@ static int want_object_in_pack(const struct object_id 
> *oid,
>   }
>   want = want_found_object(exclude, p);
>   if (!exclude && want > 0)
> - list_move(>mru, _git_mru);
> + list_move(>mru, 
> _repository->objects.packed_git_mru);

Long line.  Can "make style" catch this?

[...]
> +++ b/builtin/receive-pack.c
> @@ -7,6 +7,7 @@
>  #include "sideband.h"
>  #include "run-command.h"
>  #include "exec_cmd.h"
> +#include "object-store.h"
>  #include "commit.h"
>  #include "object.h"
>  #include "remote.h"

Another change leaked in?

[...]
> --- a/cache.h
> +++ b/cache.h
> @@ -1585,35 +1585,6 @@ struct pack_window {
>   unsigned int inuse_cnt;
>  };
>  
> -extern struct packed_git {
[...]
> -} *packed_git;

Move detecting diff confirms that this wasn't modified.  Thanks for
creating it.

[...]
> +++ b/fast-import.c
[...]
> @@ -1110,7 +1112,7 @@ static int store_object(
>   if (e->idx.offset) {
>   duplicate_count_by_type[type]++;
>   return 1;
> - } else if (find_sha1_pack(oid.hash, packed_git)) {
> + } else if (find_sha1_pack(oid.hash, 
> the_repository->objects.packed_git)) {

Long line.  (I'll refrain from commenting about any further ones.)

[...]
> +++ b/http-push.c
> @@ -1,4 +1,5 @@
>  #include "cache.h"
> +#include "object-store.h"
>  #include "commit.h"
>  #include "tag.h"
>  #include "blob.h"

Stray change?

> diff --git a/http-walker.c b/http-walker.c
> index 07c2b1af82..8bb5d991bb 100644
> --- a/http-walker.c
> +++ b/http-walker.c
> @@ -4,6 +4,7 @@
>  #include "http.h"
>  #include "list.h"
>  #include "transport.h"
> +#include "object-store.h"
>  #include "packfile.h"
>  
>  struct alt_base {

Same question.

> diff --git a/http.c b/http.c
> index 31755023a4..a4a9e583c7 100644
> --- a/http.c
> +++ b/http.c
> @@ -1,6 +1,7 @@
>  #include "git-compat-util.h"
>  #include "http.h"
>  #include "config.h"
> +#include "object-store.h"
>  #include "pack.h"
>  #include "sideband.h"
>  #include "run-command.h"

Likewise.

> diff --git a/object-store.h b/object-store.h
> index e78eea1dde..1de9e07102 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -52,6 +52,30 @@ void add_to_alternates_memory(const char *dir);
>   */
>  struct strbuf *alt_scratch_buf(struct alternate_object_database *alt);
>  
> +struct packed_git {
> + struct packed_git *next;
> + struct list_head mru;
> + struct pack_window *windows;
> + off_t pack_size;
> + const void *index_data;
> + size_t index_size;
> + uint32_t num_objects;
> + uint32_t num_bad_objects;
> +

Re: [PATCH 01/27] repository: introduce raw object store field

2018-02-21 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:

> The raw object store field will contain any objects needed for
> access to objects in a given repository.
>
> This patch introduces the raw object store and populates it with the
> `objectdir`, which used to be part of the repository struct.
>
> As the struct gains members, we'll also populate the function to clear
> the memory for these members.
>
> In a later we'll introduce a struct object_parser, that will complement
> the object parsing in a repository struct: The raw object parser is the
> layer that will provide access to raw object content, while the higher
> level object parser code will parse raw objects and keeps track of
> parenthood and other object relationships using 'struct object'.
> For now only add the lower level to the repository struct.
>
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> ---

Heh, I suppose that sign-off makes me not a great candidate for
reviewing this.  It's been long enough since I looked at the patch
that I feel okay trying anyway.

[...]
> --- /dev/null
> +++ b/object-store.h
> @@ -0,0 +1,15 @@
> +#ifndef OBJECT_STORE_H
> +#define OBJECT_STORE_H
> +
> +struct raw_object_store {
> + /*
> +  * Path to the repository's object store.
> +  * Cannot be NULL after initialization.
> +  */
> + char *objectdir;
> +};
> +#define RAW_OBJECT_STORE_INIT { NULL }

Who owns and is responsible for freeing objectdir?

> +
> +void raw_object_store_clear(struct raw_object_store *o);

Oh, that answers that.

It could be worth a note in the doc comment anyway, but I don't mind
not having one.

[...]
> +
> +void raw_object_store_clear(struct raw_object_store *o)
> +{
> + free(o->objectdir);
> +}

Should this use FREE_AND_NULL?

[...]
> --- a/repository.c
> +++ b/repository.c
[...]
> @@ -42,8 +49,8 @@ static void repo_setup_env(struct repository *repo)
>   !repo->ignore_env);
>   free(repo->commondir);
>   repo->commondir = strbuf_detach(, NULL);
> - free(repo->objectdir);
> - repo->objectdir = git_path_from_env(DB_ENVIRONMENT, repo->commondir,
> + free(repo->objects.objectdir);

Should this call raw_object_store_clear instead of calling free
directly?

> + repo->objects.objectdir = git_path_from_env(DB_ENVIRONMENT, 
> repo->commondir,
>   "objects", !repo->ignore_env);

Long line.  One way to break it would be

repo->objects.objectdir =
git_path_from_env(DB_ENVIRONMENT, repo->commondir,
  "objects", !repo->ignore_env);

>   free(repo->graft_file);
>   repo->graft_file = git_path_from_env(GRAFT_ENVIRONMENT, repo->commondir,
> @@ -209,12 +216,14 @@ void repo_clear(struct repository *repo)
>  {
>   FREE_AND_NULL(repo->gitdir);
>   FREE_AND_NULL(repo->commondir);
> - FREE_AND_NULL(repo->objectdir);
>   FREE_AND_NULL(repo->graft_file);
>   FREE_AND_NULL(repo->index_file);
>   FREE_AND_NULL(repo->worktree);
>   FREE_AND_NULL(repo->submodule_prefix);
>  
> + raw_object_store_clear(>objects);
> + memset(>objects, 0, sizeof(repo->objects));

If raw_object_store_clear uses FREE_AND_NULL, this memset won't be necessary.

[...]
> --- a/repository.h
> +++ b/repository.h
> @@ -1,6 +1,8 @@
>  #ifndef REPOSITORY_H
>  #define REPOSITORY_H
>  
> +#include "object-store.h"
> +
>  struct config_set;
>  struct index_state;
>  struct submodule_cache;
> @@ -21,10 +23,9 @@ struct repository {
>   char *commondir;
>  
>   /*
> -  * Path to the repository's object store.
> -  * Cannot be NULL after initialization.
> +  * Holds any information related to the object store.
>*/

This comment doesn't make it clear to me what the field represents.
Can it be replaced with some of the description from the commit
message?

> - char *objectdir;
> + struct raw_object_store objects;
>  

Thanks,
Jonathan

Re: bug in HTTP protocol spec

2018-02-21 Thread Jonathan Nieder

(+cc: bmwill@, who is working on protocol v2)
Hi,

Dorian Taylor wrote:
> On Feb 21, 2018, at 2:15 PM, Jeff King  wrote:

>> Thanks, I agree the document is buggy. Do you want to submit a patch?
>
> Will this do?

Thanks for writing it.

Do you mind if we forge your sign-off?  (See Documentation/SubmittingPatches
item '(5) Certify your work' for details about what this means.)

> Note I am not sure what the story is behind that `version 1`
> element, whether it's supposed to go before or after the null packet
> or if there should be another null packet or what. Perhaps somebody
> more fluent with the smart protocol can advise.

I believe the 'version 1' goes after the flush-packet.

Thanks,
Jonathan

Re: [PATCH 2/2] ref-filter: get rid of goto

2018-02-21 Thread Оля Тележная

2018-02-21 20:41 GMT+03:00 Junio C Hamano :
> Olga Telezhnaya  writes:
>
>> Get rid of goto command in ref-filter for better readability.
>>
>> Signed-off-by: Olga Telezhnaia 
>> Mentored-by: Christian Couder 
>> Mentored by: Jeff King 
>> ---
>>  ref-filter.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> It looks like this is the same change as the bottom-most change on
> your "cat-file --batch" series (and is obviously correct).
>
> I am puzzled by your intention---are you re-organizing and rebooting
> the series?  Either 'Yes' or 'No' is an acceptable answer, and so is
> anything else.  I just want to know what you want to happen to the
> merge conflicts if I queued this while still keeping your "cat-file
> --batch" thing I have on 'pu').

Thanks for the question, I needed to mention that.
We (with Peff) decided that it's better and easier to remake whole
previous patch. Before that, I want to make some refactorings in
ref-filter so after that migrating should go much easier. I want to do
that by small separate patches, so this is first in the series. I hope
it would be both easier to review for you and easier to fix for me.
So, if everything is fine, you could merge it to master. I will
rewrite most parts of previous patch anyway.

Thanks!

>
>>
>> diff --git a/ref-filter.c b/ref-filter.c
>> index 83ffd84affe52..28df6e21fb996 100644
>> --- a/ref-filter.c
>> +++ b/ref-filter.c
>> @@ -1494,11 +1494,11 @@ static void populate_value(struct ref_array_item 
>> *ref)
>>   for (i = 0; i < used_atom_cnt; i++) {
>>   struct atom_value *v = >value[i];
>>   if (v->s == NULL)
>> - goto need_obj;
>> + break;
>>   }
>> - return;
>> + if (used_atom_cnt <= i)
>> + return;
>>
>> - need_obj:
>>   get_object(ref, >objectname, 0, );
>>
>>   /*
>>
>> --
>> https://github.com/git/git/pull/460

Re: Bug Report: git status triggers a file change event

2018-02-21 Thread Jonathan Nieder

+git-for-windows
Hi,

Raining Chain wrote:

> On Windows 10, git version 2.16.2.windows.1, running the command
>
> git status
>
> will trigger a file change event to file C:\myPath\.git  "Attributes changed."
>
> This causes problems when using scripts that detect file changes such
> as tsc -w (Typescript compiler) and using softwares that regularly
> call git status such as VisualStudioCode.
>
> Thanks.

Can you say more about how "tsc -w" reacts to the file change?  Is there
a way to tell it to exclude particular files from the files it watches?

Thanks,
Jonathan

Bug Report: git status triggers a file change event

2018-02-21 Thread Raining Chain

On Windows 10, git version 2.16.2.windows.1, running the command

git status

will trigger a file change event to file C:\myPath\.git  "Attributes changed."

This causes problems when using scripts that detect file changes such
as tsc -w (Typescript compiler) and using softwares that regularly
call git status such as VisualStudioCode.

Thanks.

Re: Git should preserve modification times at least on request

2018-02-21 Thread 'Peter Backes'

On Wed, Feb 21, 2018 at 06:58:34PM -0500, Randall S. Becker wrote:
> May I suggest storing the date/time in UTC+0 in all cases. I can see
> potential issues a couple of times a year where holes exist. I cannot even
> fathom what would happen on a merge or edit of history.

I consider storing the timestamp simply in the traditional 
seconds-since-epoch UNIX timestamp format. But I'm not entirely sure 
yet (see below).

If a timestamp includes the offset, there shouldn't be any issue with 
holes. UTC+0 is nice, too, of course, though some might want to 
preserve the timezone in which the timestamp was actually created.

The bigger issue is usually to copy with those pesky leap seconds. It 
makes a difference whether one uses solar seconds ("posix" style; those 
are more commonly seen) or atomic seconds ("right" style) for the UNIX 
timestamp. Those differences accumulate over time, so you can have 
almost half a minute delta if you are not careful with timestamp 
conversion. If I remember correctly, rcs uses some rather awkward 
interative convergence algorithm to portably convert from 
human-readable date and time to UNIX timestamps.

Thus I'm still not sure whether it will be a UNIX-format timestamp or 
whether a human-readable date/time might be preferrable.

Best wishes
Peter

-- 
Peter Backes, r...@helen.plasma.xg8.de

Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Derrick Stolee


On 2/21/2018 6:13 PM, Jeff King wrote:

On Wed, Feb 21, 2018 at 02:17:11PM -0500, Derrick Stolee wrote:


The get_cached_commit_buffer() method provides access to the buffer
loaded for a struct commit, if it was ever loadead and was not freed.

Two places use this to inform how to output information about commits.

log-tree.c uses this method to short-circuit the output of commit
information when the buffer is not cached. However, this leads to
incorrect output in 'git log --oneline' where the short-OID is written
but then the rest of the commit information is dropped and the next
commit is written on the same line.

rev-list uses this method for two reasons:

- First, if the revision walk visits a commit twice, the buffer was
   freed by rev-list in the first write. The output then does not
   match the format expectations, since the OID is written without the
   rest of the content.

I'm not sure after my earlier digging if there is even a way to trigger
this (and if so, it is probably accidental, since those lines were added
explicitly for --show-all).

And actually after re-reading the commit message for 3131b7130 again, I
think the current behavior is definitely not something that was
carefully planned. So I'd propose a commit message like below.


I only submitted my patch to avoid making you do the work of writing the 
commit message. My messages still don't have quite the right amount of 
detail (or the correct details, in this case).


Junio: please add

Reported-by: Derrick Stolee 

Thanks,

-Stolee




-- >8 --
Subject: [PATCH] commit: drop uses of get_cached_commit_buffer()

The "--show-all" revision option shows UNINTERESTING
commits. Some of these commits may be unparsed when we try
to show them (since we may or may not need to walk their
parents to fulfill the request).

Commit 3131b71301 (Add "--show-all" revision walker flag for
debugging, 2008-02-09) resolved this by just skipping
pretty-printing for commits without their object contents
cached, saying:

   Because we now end up listing commits we may not even have been parsed
   at all "show_log" and "show_commit" need to protect against commits
   that don't have a commit buffer entry.

That was the easy fix to avoid the pretty-printer segfaulting,
but:

   1. It doesn't work for all formats. E.g., --oneline
  prints the oid for each such commit but not a trailing
  newline, leading to jumbled output.

   2. It only affects some commits, depending on whether we
  happened to parse them or not (so if they were at the
  tip of an UNINTERESTING starting point, or if we
  happened to traverse over them, you'd see more data).

   3. It unncessarily ties the decision to show the verbose
  header to whether the commit buffer was cached. That
  makes it harder to change the logic around caching
  (e.g., if we could traverse without actually loading
  the full commit objects).

These days it's safe to feed such a commit to the
pretty-print code. Since be5c9fb904 (logmsg_reencode: lazily
load missing commit buffers, 2013-01-26), we'll load it on
demand in such a case. So let's just always show the verbose
headers.

This does change the behavior of plumbing, but:

   a. The --show-all option was explicitly introduced as a
  debugging aid, and was never documented (and has rarely
  even been mentioned on the list by git devs).

   b. Avoiding the commits was already not deterministic due
  to (2) above. So the caller might have seen full
  headers for these commits anyway, and would need to be
  prepared for it.

Signed-off-by: Jeff King 
---
  builtin/rev-list.c | 2 +-
  log-tree.c | 3 ---
  2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 48300d9e11..d320b6f1e3 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -134,7 +134,7 @@ static void show_commit(struct commit *commit, void *data)
else
putchar('\n');
  
-	if (revs->verbose_header && get_cached_commit_buffer(commit, NULL)) {

+   if (revs->verbose_header) {
struct strbuf buf = STRBUF_INIT;
struct pretty_print_context ctx = {0};
ctx.abbrev = revs->abbrev;
diff --git a/log-tree.c b/log-tree.c
index fc0cc0d6d1..22b2fb6c58 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -659,9 +659,6 @@ void show_log(struct rev_info *opt)
show_mergetag(opt, commit);
}
  
-	if (!get_cached_commit_buffer(commit, NULL))

-   return;
-
if (opt->show_notes) {
int raw;
struct strbuf notebuf = STRBUF_INIT;

Re: What's cooking in git.git (Feb 2018, #03; Wed, 21)

2018-02-21 Thread Brandon Williams

On 02/21, Junio C Hamano wrote:
> * bw/c-plus-plus (2018-02-14) 38 commits
>  - fixup! diff: rename 'this' variables
>  - replace: rename 'new' variables
>  - trailer: rename 'template' variables
>  - tempfile: rename 'template' variables
>  - wrapper: rename 'template' variables
>  - environment: rename 'namespace' variables
>  - diff: rename 'template' variables
>  - environment: rename 'template' variables
>  - init-db: rename 'template' variables
>  - unpack-trees: rename 'new' variables
>  - trailer: rename 'new' variables
>  - submodule: rename 'new' variables
>  - split-index: rename 'new' variables
>  - remote: rename 'new' variables
>  - ref-filter: rename 'new' variables
>  - read-cache: rename 'new' variables
>  - line-log: rename 'new' variables
>  - imap-send: rename 'new' variables
>  - http: rename 'new' variables
>  - entry: rename 'new' variables
>  - diffcore-delta: rename 'new' variables
>  - diff: rename 'new' variables
>  - diff-lib: rename 'new' variable
>  - commit: rename 'new' variables
>  - combine-diff: rename 'new' variables
>  - remote: rename 'new' variables
>  - reflog: rename 'new' variables
>  - pack-redundant: rename 'new' variables
>  - help: rename 'new' variables
>  - checkout: rename 'new' variables
>  - apply: rename 'new' variables
>  - apply: rename 'try' variables
>  - diff: rename 'this' variables
>  - rev-parse: rename 'this' variable
>  - pack-objects: rename 'this' variables
>  - blame: rename 'this' variables
>  - object: rename function 'typename' to 'type_name'
>  - object_info: change member name from 'typename' to 'type_name'
> 
>  Avoid using identifiers that clash with C++ keywords.  Even though
>  it is not a goal to compile Git with C++ compilers, changes like
>  this help use of code analysis tools that targets C++ on our
>  codebase.
> 
>  Is the 'fixup!' cleanly squashable to the problematic one, or does
>  this series require another reroll to get it in a good enough shape?

Yeah the fixup patch looks good to me.  I don't think there was anything
else that needed attention so it should be in good shape with the fixup
patch.


-- 
Brandon Williams

Re: [PATCH v3 0/3] Add "git rebase --show-current-patch"

2018-02-21 Thread Tim Landscheidt

Junio C Hamano  wrote:

>> Compared to v2:

>> - the potential loss of errno before it's printed out in builtin/am.c
>>   is fixed.
>> - keep update_ref() in sequencer.c non-fatal like this rest
>> - rename ORIG_COMMIT to REBASE_HEAD

>> Interdiff:

> This round hasn't seen any comments.  Is everybody happy with it?

> I personally do not have strong opinion for the feature but didn't
> spot anything against the execution, either, so...

Sorry for the late reply: I dislike REBASE_/HEAD/ because
ORIG_/HEAD/ refers to the tip of the original branch, and
/ORIG/_HEAD refers to the original branch, so
/REBASE/_/HEAD/ is doubly confusing IMHO.  I consider
ORIG_COMMIT easier to understand because ORIG_HEAD refers to
the tip of the original branch, and ORIG_COMMIT would refer
to one of the commits making up that original branch, but as
I suggested it myself I might not be very objective in that
regard :-).

(BTW, thanks, Duy, for doing the actual work!)

Tim

Re: [PATCH 16/27] sha1_file: add repository argument to sha1_file_name

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> Add a repository argument to allow sha1_file_name callers to be more
> specific about which repository to handle. This is a small mechanical
> change; it doesn't change the implementation to handle repositories
> other than the_repository yet.
> 
> As with the previous commits, use a macro to catch callers passing a
> repository other than the_repository at compile time.
> 
> While at it, move the declaration to object-store.h, where it should
> be easier to find.

Seems like we may want to make a sha1-file.h or an oid-file.h or
something like that at some point as that seems like a better place for
the function than in the object-store.h file?


-- 
Brandon Williams

Re: [PATCH 14/27] sha1_file: allow link_alt_odb_entries to handle arbitrary repositories

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> Actually this also allows read_info_alternates and link_alt_odb_entry to
> handle arbitrary repositories, but link_alt_odb_entries is the most
> interesting function in this set of functions, hence the commit subject.
> 
> These functions span a strongly connected component in the function
> graph, i.e. the recursive call chain might look like
> 
>   -> link_alt_odb_entries
> -> link_alt_odb_entry
>   -> read_info_alternates
> -> link_alt_odb_entries
> 
> That is why we need to convert all these functions at the same time.
> 
> Signed-off-by: Jonathan Nieder 
> Signed-off-by: Stefan Beller 
> ---

Looks good.

>  object-store.h |  4 
>  sha1_file.c| 36 
>  2 files changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/object-store.h b/object-store.h
> index 514ad94287..f283fbdba9 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -18,6 +18,10 @@ struct alternate_object_database {
>   char loose_objects_subdir_seen[256];
>   struct oid_array loose_objects_cache;
>  
> + /*
> +  * Path to the alternative object store. If this is a relative path,
> +  * it is relative to the current working directory.
> +  */

Thanks for adding the comment.

>   char path[FLEX_ARRAY];
>  };
>  #define prepare_alt_odb(r) prepare_alt_odb_##r()
> diff --git a/sha1_file.c b/sha1_file.c
> index c26921df4a..6e5105a252 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -390,11 +390,10 @@ static int alt_odb_usable(struct raw_object_store *o,
>   * SHA1, an extra slash for the first level indirection, and the
>   * terminating NUL.
>   */
> -#define read_info_alternates(r, rb, d) read_info_alternates_##r(rb, d)
> -static void read_info_alternates_the_repository(const char *relative_base,
> - int depth);
> -#define link_alt_odb_entry(r, e, rb, d, n) link_alt_odb_entry_##r(e, rb, d, 
> n)
> -static int link_alt_odb_entry_the_repository(const char *entry,
> +static void read_info_alternates(struct repository *r,
> +  const char *relative_base,
> +  int depth);
> +static int link_alt_odb_entry(struct repository *r, const char *entry,
>   const char *relative_base, int depth, const char *normalized_objdir)
>  {
>   struct alternate_object_database *ent;
> @@ -420,7 +419,7 @@ static int link_alt_odb_entry_the_repository(const char 
> *entry,
>   while (pathbuf.len && pathbuf.buf[pathbuf.len - 1] == '/')
>   strbuf_setlen(, pathbuf.len - 1);
>  
> - if (!alt_odb_usable(_repository->objects, , 
> normalized_objdir)) {
> + if (!alt_odb_usable(>objects, , normalized_objdir)) {
>   strbuf_release();
>   return -1;
>   }
> @@ -428,12 +427,12 @@ static int link_alt_odb_entry_the_repository(const char 
> *entry,
>   ent = alloc_alt_odb(pathbuf.buf);
>  
>   /* add the alternate entry */
> - *the_repository->objects.alt_odb_tail = ent;
> - the_repository->objects.alt_odb_tail = &(ent->next);
> + *r->objects.alt_odb_tail = ent;
> + r->objects.alt_odb_tail = &(ent->next);
>   ent->next = NULL;
>  
>   /* recursively add alternates */
> - read_info_alternates(the_repository, pathbuf.buf, depth + 1);
> + read_info_alternates(r, pathbuf.buf, depth + 1);
>  
>   strbuf_release();
>   return 0;
> @@ -468,12 +467,8 @@ static const char *parse_alt_odb_entry(const char 
> *string,
>   return end;
>  }
>  
> -#define link_alt_odb_entries(r, a, s, rb, d) \
> - link_alt_odb_entries_##r(a, s, rb, d)
> -static void link_alt_odb_entries_the_repository(const char *alt,
> - int sep,
> - const char *relative_base,
> - int depth)
> +static void link_alt_odb_entries(struct repository *r, const char *alt,
> +  int sep, const char *relative_base, int depth)
>  {
>   struct strbuf objdirbuf = STRBUF_INIT;
>   struct strbuf entry = STRBUF_INIT;
> @@ -487,7 +482,7 @@ static void link_alt_odb_entries_the_repository(const 
> char *alt,
>   return;
>   }
>  
> - strbuf_add_absolute_path(, get_object_directory());
> + strbuf_add_absolute_path(, r->objects.objectdir);
>   if (strbuf_normalize_path() < 0)
>   die("unable to normalize object directory: %s",
>   objdirbuf.buf);
> @@ -496,15 +491,16 @@ static void link_alt_odb_entries_the_repository(const 
> char *alt,
>   alt = parse_alt_odb_entry(alt, sep, );
>   if (!entry.len)
>   continue;
> - link_alt_odb_entry(the_repository, entry.buf,
> + link_alt_odb_entry(r, entry.buf,
>  relative_base, depth, objdirbuf.buf);
>   }
>

Re: [PATCH 08/27] pack: move approximate object count to object store

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> The approximate_object_count() function maintains a rough count of
> objects in a repository to estimate how long object name abbreviates
> should be.  Object names are scoped to a repository and the
> appropriate length may differ by repository, so the object count
> should not be global.
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> ---
>  object-store.h | 10 +-
>  packfile.c | 11 +--
>  2 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/object-store.h b/object-store.h
> index 6cecba3951..bd1e4fcd8b 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -93,6 +93,14 @@ struct raw_object_store {
>   struct alternate_object_database *alt_odb_list;
>   struct alternate_object_database **alt_odb_tail;
>  
> + /*
> +  * A fast, rough count of the number of objects in the repository.
> +  * These two fields are not meant for direct access. Use
> +  * approximate_object_count() instead.
> +  */
> + unsigned long approximate_object_count;
> + unsigned approximate_object_count_valid : 1;

Patch looks fine and is effectively a no-op, though what is the need for
both of these variables?  Maybe it can be simplified down to just use
one?  Just musing as its out of the scope of this patch and we probably
shouldn't try to change that in this series.

> +
>   /*
>* Whether packed_git has already been populated with this repository's
>* packs.
> @@ -107,7 +115,7 @@ struct raw_object_store {
>   * that macro on member variables. Use NULL instead as that is defined
>   * and accepted, deferring the real init to prepare_packed_git_mru(). */
>  #define __MRU_INIT { NULL, NULL }
> -#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL, 0 }
> +#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL, 0, 0, 0 }
>  
>  void raw_object_store_clear(struct raw_object_store *o);
>  
> diff --git a/packfile.c b/packfile.c
> index a8a2e7fe43..693bafbc98 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -803,8 +803,6 @@ static void prepare_packed_git_one(char *objdir, int 
> local)
>   strbuf_release();
>  }
>  
> -static int approximate_object_count_valid;
> -
>  /*
>   * Give a fast, rough count of the number of objects in the repository. This
>   * ignores loose objects completely. If you have a lot of them, then either
> @@ -814,8 +812,8 @@ static int approximate_object_count_valid;
>   */
>  unsigned long approximate_object_count(void)
>  {
> - static unsigned long count;
> - if (!approximate_object_count_valid) {
> + if (!the_repository->objects.approximate_object_count_valid) {
> + unsigned long count;
>   struct packed_git *p;
>  
>   prepare_packed_git();
> @@ -825,8 +823,9 @@ unsigned long approximate_object_count(void)
>   continue;
>   count += p->num_objects;
>   }
> + the_repository->objects.approximate_object_count = count;
>   }
> - return count;
> + return the_repository->objects.approximate_object_count;
>  }
>  
>  static void *get_next_packed_git(const void *p)
> @@ -901,7 +900,7 @@ void prepare_packed_git(void)
>  
>  void reprepare_packed_git(void)
>  {
> - approximate_object_count_valid = 0;
> + the_repository->objects.approximate_object_count_valid = 0;
>   the_repository->objects.packed_git_initialized = 0;
>   prepare_packed_git();
>  }
> -- 
> 2.16.1.291.g4437f3f132-goog
> 

-- 
Brandon Williams

Re: [PATCH 24/27] sha1_file: allow open_sha1_file to handle arbitrary repositories

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:27 -0800
Stefan Beller  wrote:

> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 

Reviewed-by: Jonathan Tan

Re: [PATCH 22/27] sha1_file: allow sha1_file_name to handle arbitrary repositories

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:25 -0800
Stefan Beller  wrote:

> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 

Reviewed-by: Jonathan Tan 

> -void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
> *sha1)
> +void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned 
> char *sha1)
>  {
> - strbuf_addstr(buf, get_object_directory());
> + strbuf_addstr(buf, r->objects.objectdir);
>   strbuf_addch(buf, '/');
>   fill_sha1_path(buf, sha1);
>  }

In the future, we should probably have:
 - a function to get the object store out of a repo (so that it can
   lazily initialize the object store struct if necessary)
 - when the object store is obtained, its objectdir field is guaranteed
   to be populated
 - sha1_file_name should take the object store struct, not the repo
   struct

but this is outside the scope of this patch.

Re: [PATCH 06/27] object-store: close all packs upon clearing the object store

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> Signed-off-by: Stefan Beller 

Straight forward change, looks good.

> ---
>  builtin/am.c   | 2 +-
>  builtin/clone.c| 2 +-
>  builtin/fetch.c| 2 +-
>  builtin/merge.c| 2 +-
>  builtin/receive-pack.c | 2 +-
>  object.c   | 6 ++
>  packfile.c | 4 ++--
>  packfile.h | 2 +-
>  8 files changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/builtin/am.c b/builtin/am.c
> index 5bdd2d7578..4762a702e3 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1859,7 +1859,7 @@ static void am_run(struct am_state *state, int resume)
>*/
>   if (!state->rebasing) {
>   am_destroy(state);
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
>   }
>  }
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 101c27a593..13cfaa6488 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1217,7 +1217,7 @@ int cmd_clone(int argc, const char **argv, const char 
> *prefix)
>   transport_disconnect(transport);
>  
>   if (option_dissociate) {
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   dissociate_from_references();
>   }
>  
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 8ee998ea2e..4d72efca78 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1478,7 +1478,7 @@ int cmd_fetch(int argc, const char **argv, const char 
> *prefix)
>  
>   string_list_clear(, 0);
>  
> - close_all_packs();
> + close_all_packs(_repository->objects);
>  
>   argv_array_pushl(_gc_auto, "gc", "--auto", NULL);
>   if (verbosity < 0)
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 30264cfd7c..907ae44ab5 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -411,7 +411,7 @@ static void finish(struct commit *head_commit,
>* We ignore errors in 'gc --auto', since the
>* user should see them.
>*/
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
>   }
>   }
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index b2eac79a6e..954fc72c7c 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -2027,7 +2027,7 @@ int cmd_receive_pack(int argc, const char **argv, const 
> char *prefix)
>   proc.git_cmd = 1;
>   proc.argv = argv_gc_auto;
>  
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   if (!start_command()) {
>   if (use_sideband)
>   copy_to_sideband(proc.err, -1, NULL);
> diff --git a/object.c b/object.c
> index c76b62572a..34daaf37b3 100644
> --- a/object.c
> +++ b/object.c
> @@ -4,6 +4,7 @@
>  #include "tree.h"
>  #include "commit.h"
>  #include "tag.h"
> +#include "packfile.h"
>  
>  static struct object **obj_hash;
>  static int nr_objs, obj_hash_size;
> @@ -469,8 +470,5 @@ void raw_object_store_clear(struct raw_object_store *o)
>  
>   while (!list_empty(>packed_git_mru))
>   list_del(>packed_git_mru);
> - /*
> -  * TODO: call close_all_packs once migrated to
> -  * take an object store argument
> -  */
> + close_all_packs(o);
>  }
> diff --git a/packfile.c b/packfile.c
> index d41e4c83d0..511a2b0cdf 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -311,11 +311,11 @@ static void close_pack(struct packed_git *p)
>   close_pack_index(p);
>  }
>  
> -void close_all_packs(void)
> +void close_all_packs(struct raw_object_store *o)
>  {
>   struct packed_git *p;
>  
> - for (p = the_repository->objects.packed_git; p; p = p->next)
> + for (p = o->packed_git; p; p = p->next)
>   if (p->do_not_close)
>   die("BUG: want to close pack marked 'do-not-close'");
>   else
> diff --git a/packfile.h b/packfile.h
> index a7fca598d6..6a2c57045c 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -63,7 +63,7 @@ extern void close_pack_index(struct packed_git *);
>  
>  extern unsigned char *use_pack(struct packed_git *, struct pack_window **, 
> off_t, unsigned long *);
>  extern void close_pack_windows(struct packed_git *);
> -extern void close_all_packs(void);
> +extern void close_all_packs(struct raw_object_store *o);
>  extern void unuse_pack(struct pack_window **);
>  extern void clear_delta_base_cache(void);
>  extern struct packed_git *add_packed_git(const char *path, size_t path_len, 
> int local);
> -- 
> 2.16.1.291.g4437f3f132-goog
> 

-- 
Brandon Williams

Re: [PATCH 04/27] object-store: free alt_odb_list

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> Free the memory and reset alt_odb_{list, tail} to NULL.

Good to see memory leaks being avoided (well they will be on other
repository objects)

> 
> Signed-off-by: Stefan Beller 
> ---
>  object.c | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/object.c b/object.c
> index 11d904c033..17b1da6d6b 100644
> --- a/object.c
> +++ b/object.c
> @@ -446,7 +446,24 @@ void clear_commit_marks_all(unsigned int flags)
>   }
>  }
>  
> +static void free_alt_odb(struct alternate_object_database *alt)
> +{
> + strbuf_release(>scratch);
> + oid_array_clear(>loose_objects_cache);
> +}
> +
> +static void free_alt_odbs(struct raw_object_store *o)
> +{
> + while (o->alt_odb_list) {
> + free_alt_odb(o->alt_odb_list);
> + o->alt_odb_list = o->alt_odb_list->next;
> + }
> +}
> +
>  void raw_object_store_clear(struct raw_object_store *o)
>  {
>   free(o->objectdir);
> +
> + free_alt_odbs(o);
> + o->alt_odb_tail = NULL;
>  }
> -- 
> 2.16.1.291.g4437f3f132-goog
> 

-- 
Brandon Williams

Re: [PATCH 03/27] object-store: move alt_odb_list and alt_odb_tail to object store

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> In a process with multiple repositories open, alternates should be
> associated to a single repository and not shared globally. Move
> alt_odb_list and alt_odb_tail into the_repository and adjust callers
> to reflect this.
> 
> Now that the alternative object data base is per repository, we're
> leaking its memory upon freeing a repository. The next patch plugs
> this hole.
> 
> No functional change intended.
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 

Looks good aside from the one spacing issue (though it's not that big of
a deal).

> ---
>  builtin/fsck.c |  4 +++-
>  object-store.h |  9 ++---
>  packfile.c |  3 ++-
>  sha1_file.c| 25 -
>  sha1_name.c|  3 ++-
>  5 files changed, 25 insertions(+), 19 deletions(-)
> 
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 7a8a679d4f..908e4f092a 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -1,5 +1,6 @@
>  #include "builtin.h"
>  #include "cache.h"
> +#include "repository.h"
>  #include "config.h"
>  #include "commit.h"
>  #include "tree.h"
> @@ -716,7 +717,8 @@ int cmd_fsck(int argc, const char **argv, const char 
> *prefix)
>   fsck_object_dir(get_object_directory());
>  
>   prepare_alt_odb();
> - for (alt = alt_odb_list; alt; alt = alt->next)
> + for (alt = the_repository->objects.alt_odb_list;
> + alt; alt = alt->next)

Indentation on this line looks odd.


-- 
Brandon Williams

Re: [PATCH 02/27] object-store: migrate alternates struct and functions from cache.h

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> Migrate the struct alternate_object_database and all its related
> functions to the object store as these functions are easier found in
> that header. The migration is just a verbatim copy, no need to
> include the object store header at any C file, because cache.h includes
> repository.h which in turn includes the object-store.h

Nice.  Always love seeing cache.h get smaller


-- 
Brandon Williams

Re: [PATCH 19/27] sha1_file: add repository argument to map_sha1_file_1

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:22 -0800
Stefan Beller  wrote:

> Add a repository argument to allow the map_sha1_file_1 caller to be
> more specific about which repository to act on. This is a small
> mechanical change; it doesn't change the implementation to handle
> repositories other than the_repository yet.
> 
> As with the previous commits, use a macro to catch callers passing a
> repository other than the_repository at compile time.
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 

Reviewed-by: Jonathan Tan

Re: [PATCH 15/27] sha1_file: allow prepare_alt_odb to handle arbitrary repositories

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:18 -0800
Stefan Beller  wrote:

> -void prepare_alt_odb_the_repository(void)
> +void prepare_alt_odb(struct repository *r)
>  {
> - const char *alt;
> -
> - if (the_repository->objects.alt_odb_tail)
> + if (r->objects.alt_odb_tail)
>   return;
>  
> - alt = getenv(ALTERNATE_DB_ENVIRONMENT);
> + r->objects.alt_odb_tail = >objects.alt_odb_list;
> +
> + if (!r->ignore_env) {
> + const char *alt = getenv(ALTERNATE_DB_ENVIRONMENT);
> + if (!alt)
> + alt = "";

alt can be NULL, just like in the existing code, so these 2 lines are
unnecessary. (link_alt_odb_entries() will just do nothing in either
case.)

(I also think that the check of absence of alt should be done by the
caller of link_alt_odb_entries(), not by link_alt_odb_entries() itself,
but that is much beyond the scope of this patch set.)

>  
> - the_repository->objects.alt_odb_tail =
> - _repository->objects.alt_odb_list;
> - link_alt_odb_entries(the_repository, alt,
> -  PATH_SEP, NULL, 0);
> + link_alt_odb_entries(r, alt, PATH_SEP, NULL, 0);
> + }
>  
> - read_info_alternates(the_repository, get_object_directory(), 0);
> + read_info_alternates(r, r->objects.objectdir, 0);
>  }

What's cooking in git.git (Feb 2018, #03; Wed, 21)

2018-02-21 Thread Junio C Hamano

Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ab/sha1dc-build (2017-12-08) 3 commits
  (merged to 'next' on 2018-02-08 at ba9ff2b836)
 + sha1dc_git.h: re-arrange an ifdef chain for a subsequent change
 + Makefile: under "make dist", include the sha1collisiondetection submodule
 + Makefile: don't error out under DC_SHA1_EXTERNAL if DC_SHA1_SUBMODULE=auto

 Push the submodule version of collision-detecting SHA-1 hash
 implementation a bit harder on builders.


* ab/wildmatch-tests (2018-01-30) 10 commits
  (merged to 'next' on 2018-02-08 at f999a3d732)
 + wildmatch test: mark test as EXPENSIVE_ON_WINDOWS
 + test-lib: add an EXPENSIVE_ON_WINDOWS prerequisite
 + wildmatch test: create & test files on disk in addition to in-memory
 + wildmatch test: perform all tests under all wildmatch() modes
 + wildmatch test: use test_must_fail, not ! for test-wildmatch
 + wildmatch test: remove dead fnmatch() test code
 + wildmatch test: use a paranoia pattern from nul_match()
 + wildmatch test: don't try to vertically align our output
 + wildmatch test: use more standard shell style
 + wildmatch test: indent with tabs, not spaces

 More tests for wildmatch functions.


* bc/hash-algo (2018-02-09) 13 commits
  (merged to 'next' on 2018-02-09 at 4437f3f132)
 + hash: update obsolete reference to SHA1_HEADER
  (merged to 'next' on 2018-02-08 at 18f36d12ed)
 + bulk-checkin: abstract SHA-1 usage
 + csum-file: abstract uses of SHA-1
 + csum-file: rename sha1file to hashfile
 + read-cache: abstract away uses of SHA-1
 + pack-write: switch various SHA-1 values to abstract forms
 + pack-check: convert various uses of SHA-1 to abstract forms
 + fast-import: switch various uses of SHA-1 to the_hash_algo
 + sha1_file: switch uses of SHA-1 to the_hash_algo
 + builtin/unpack-objects: switch uses of SHA-1 to the_hash_algo
 + builtin/index-pack: improve hash function abstraction
 + hash: create union for hash context allocation
 + hash: move SHA-1 macros to hash.h

 More abstraction of hash function from the codepath.


* cc/perf-aggregate (2018-02-02) 3 commits
  (merged to 'next' on 2018-02-08 at d8f074e6fb)
 + perf/aggregate: sort JSON fields in output
 + perf/aggregate: add --reponame option
 + perf/aggregate: add --subsection option

 "make perf" enhancement.


* en/merge-recursive-fixes (2018-01-19) 3 commits
  (merged to 'next' on 2018-02-08 at c254292070)
 + merge-recursive: add explanation for src_entry and dst_entry
 + merge-recursive: fix logic ordering issue
 + Tighten and correct a few testcases for merging and cherry-picking
 (this branch is used by en/rename-directory-detection.)


* gs/rebase-allow-empty-message (2018-02-07) 1 commit
  (merged to 'next' on 2018-02-08 at 9d81a2496c)
 + rebase: add --allow-empty-message option

 "git rebase" learned to take "--allow-empty-message" option.


* jc/worktree-add-short-help (2018-01-17) 1 commit
  (merged to 'next' on 2018-02-08 at 9f59ca72ab)
 + worktree: say that "add" takes an arbitrary commit in short-help

 Error message fix.


* jt/fsck-code-cleanup (2018-01-23) 1 commit
  (merged to 'next' on 2018-02-08 at 199ad41486)
 + fsck: fix leak when traversing trees

 Plug recently introduced leaks in fsck.


* kg/packed-ref-cache-fix (2018-01-24) 6 commits
  (merged to 'next' on 2018-02-08 at 370f06a565)
 + packed_ref_cache: don't use mmap() for small files
 + load_contents(): don't try to mmap an empty file
 + packed_ref_iterator_begin(): make optimization more general
 + find_reference_location(): make function safe for empty snapshots
 + create_snapshot(): use `xmemdupz()` rather than a strbuf
 + struct snapshot: store `start` rather than `header_len`

 Avoid mmapping small files while using packed refs (especially ones
 with zero size, which would cause later munmap() to fail).
 A change to a binsearch loop to work around picky complers was
 unnecessarily hard to reason about, but it should do.


* lw/daemon-log-destination (2018-02-05) 1 commit
  (merged to 'next' on 2018-02-08 at da91bd56f4)
 + daemon: add --log-destination=(stderr|syslog|none)

 The log from "git daemon" can be redirected with a new option; one
 relevant use case is to send the log to standard error (instead of
 syslog) when running it from inetd.


* nd/format-patch-stat-width (2018-02-02) 2 commits
  (merged to 'next' on 2018-02-08 at c03e8a084e)
 + format-patch: reduce patch diffstat width to 72
 + format-patch: keep cover-letter diffstat wrapped in 72 columns

 "git format-patch" learned to give 72-cols to diffstat, which is
 consistent with other

Re: [PATCH 09/27] sha1_file: add raw_object_store argument to alt_odb_usable

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:12 -0800
Stefan Beller  wrote:

> Add a raw_object_store to alt_odb_usable to be more specific about which
> repository to act on. The choice of the repository is delegated to its
> only caller link_alt_odb_entry.
> 
> Signed-off-by: Stefan Beller 

I checked that alt_odb_usable no longer depends on any repository-like
globals.

Reviewed-by: Jonathan Tan

Re: [PATCHv3 00/27] Moving global state into the repository object (part 1)

2018-02-21 Thread Stefan Beller

On Tue, Feb 20, 2018 at 5:54 PM, Stefan Beller  wrote:
> v3:
> * reverted back to use the repository for most of the functions
>   (including unduplicating 'ignore_env')
> * rebased on master again (I lost that state when doing v2, as
>   I did both rebase as well as conversion to object store in one go)
> * one additional patch, that moves the alternates related things to
>   object-store.h, such that inclusion of cache.h (in object-store.h)
>   is not required any more.

This is also available at
https://github.com/stefanbeller/git/tree/object-store-part1-v3

Re: [PATCH 00/36] object_id part 12

2018-02-21 Thread brian m. carlson

On Wed, Feb 21, 2018 at 10:47:19AM -0800, Junio C Hamano wrote:
> "brian m. carlson"  writes:
> 
> > This is the twelfth in a series of patches to convert from unsigned char
> > [20] to struct object_id.  This series is based on next.
> >
> > Included in this series are conversions for find_unique_abbrev and
> > lookup_replace_object, as well as parts of the sha1_file code.
> >
> > Conflicts with pu are average in number but minor, mostly because of the
> > type_name conversion.  None of them are tricky, except that the
> > introduction of get_tree_entry_if_blob requires a conversion of that
> > function.
> 
> And the reason why this is based on 'next' is...?  Which topic(s) do
> we have to wait for until we can queue this series, in other words?
> 
> Thanks for working on this, though.

It was waiting on the hash_algo changes I had submitted, and I don't
believe they'd made it into master.  When they have, I'll rebase and
send a v2.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Re: [PATCH v5 00/17] document & test fetch pruning & add fetch.pruneTags

2018-02-21 Thread Junio C Hamano

Ævar Arnfjörð Bjarmason   writes:

> Here's a v5 (correct subject line this time!). Many thanks to Eric for
> a thorough review.

We haven't seen any comments on this round.  Is everybody happy?

I do not have a strong opinion on the new feature, either for or
against.  I didn't find anything majorly questionable in the
execution, though, so...

Re: [PATCH v3 0/3] Add "git rebase --show-current-patch"

2018-02-21 Thread Junio C Hamano

Nguyễn Thái Ngọc Duy   writes:

> Compared to v2:
>
> - the potential loss of errno before it's printed out in builtin/am.c
>   is fixed.
> - keep update_ref() in sequencer.c non-fatal like this rest
> - rename ORIG_COMMIT to REBASE_HEAD
>
> Interdiff:

This round hasn't seen any comments.  Is everybody happy with it?

I personally do not have strong opinion for the feature but didn't
spot anything against the execution, either, so...

Re: [PATCH 07/27] pack: move prepare_packed_git_run_once to object store

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:10 -0800
Stefan Beller  wrote:

> Each repository's object store can be initialized independently, so
> they must not share a run_once variable.
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 

Reviewed-by: Jonathan Tan 

> -#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL }
> +#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL, 0 }

Optional: Trailing zeros are not needed in struct initializers.

Re: [PATCH v3 35/35] remote-curl: don't request v2 when pushing

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:13:12 -0800
Brandon Williams  wrote:

> +test_expect_success 'push with http:// and a config of v2 does not request 
> v2' '
> + # Till v2 for push is designed, make sure that if a client has
> + # protocol.version configured to use v2, that the client instead falls
> + # back and uses v0.
> +
> + test_commit -C http_child three &&
> +
> + # Push to another branch, as the target repository has the
> + # master branch checked out and we cannot push into it.
> + GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
> + push origin HEAD:client_branch && 2>log &&

Should it be protocol.version=2? Also, two double ampersands?

Also, optionally, it might be better to do
GIT_TRACE_PACKET="$(pwd)/log", so that it does not get mixed with other
stderr output.

Re: [PATCH v3 32/35] http: allow providing extra headers for http requests

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:13:09 -0800
Brandon Williams  wrote:

> @@ -172,6 +172,8 @@ struct http_get_options {
>* for details.
>*/
>   struct strbuf *base_url;
> +
> + struct string_list *extra_headers;

Document this? For example:

  If not NULL, additional HTTP headers to be sent with the request. The
  strings in the list must not be freed until after the request.

Re: [PATCH v3 30/35] remote-curl: create copy of the service name

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:13:07 -0800
Brandon Williams  wrote:

> Make a copy of the service name being requested instead of relying on
> the buffer pointed to by the passed in 'const char *' to remain
> unchanged.
> 
> Signed-off-by: Brandon Williams 

Probably worth mentioning in the commit message:

  Currently, all service names are string constants, but a subsequent
  patch will introduce service names from external sources.

Other than that,

Reviewed-by: Jonathan Tan

Re: [PATCH v3 28/35] transport-helper: introduce stateless-connect

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:13:05 -0800
Brandon Williams  wrote:

> Introduce the transport-helper capability 'stateless-connect'.  This
> capability indicates that the transport-helper can be requested to run
> the 'stateless-connect' command which should attempt to make a
> stateless connection with a remote end.  Once established, the
> connection can be used by the git client to communicate with
> the remote end natively in a stateless-rpc manner as supported by
> protocol v2.  This means that the client must send everything the server
> needs in a single request as the client must not assume any
> state-storing on the part of the server or transport.

Maybe it's worth mentioning that support in the actual remote helpers
will be added in a subsequent patch.

> If a stateless connection cannot be established then the remote-helper
> will respond in the same manner as the 'connect' command indicating that
> the client should fallback to using the dumb remote-helper commands.

This makes sense, but there doesn't seem to be any code in this patch
that implements this.

> @@ -612,6 +615,11 @@ static int process_connect_service(struct transport 
> *transport,
>   if (data->connect) {
>   strbuf_addf(, "connect %s\n", name);
>   ret = run_connect(transport, );
> + } else if (data->stateless_connect) {
> + strbuf_addf(, "stateless-connect %s\n", name);
> + ret = run_connect(transport, );
> + if (ret)
> + transport->stateless_rpc = 1;

Why is process_connect_service() falling back to stateless_connect if
connect doesn't work? I don't think this fallback would work, as a
client that needs "connect" might need its full capabilities.

RE: Git should preserve modification times at least on request

2018-02-21 Thread Randall S. Becker

On February 21, 2018 6:13 PM, Peter Backes wrote:
> On Wed, Feb 21, 2018 at 11:44:13PM +0100, Ævar Arnfjörð Bjarmason wrote:
> > If it were added as a first-level feature to git it would present a
> > lot of UX confusion. E.g. you run "git add" and it'll be showing the
> > mtime somehow, or you get a formatted patch over E-Mail and it doesn't
> > only include the commit time but also times for individual files.
> 
> But that's pretty standard. patch format has timestamp fields for
precisely
> this purpose:
> 
> % echo a > x
> % echo b > y
> % diff -u x y
> --- x 2018-02-21 23:56:29.574029523 +0100
> +++ y 2018-02-21 23:56:31.430003389 +0100

May I suggest storing the date/time in UTC+0 in all cases. I can see
potential issues a couple of times a year where holes exist. I cannot even
fathom what would happen on a merge or edit of history.

Cheers,
Randall

Bug: git log: boundary commits do not respect order (e.g. date-order, topo-order) 2

2018-02-21 Thread Josh Tepper

When using git log, boundary commits (ie, those commits added by
specifying --boundary) do not respect the order (e.g., --date-order,
--topo-order).  Consider the following commit history, where number
indicates the order of the commit timestamps:


0125  <--A
   \ \
 346  <--B


Executing the following command:

$ git log --boundary --date-order ^A B

Should produce the following order (boundary commits shown with dashes):
6 -5 4 3 -1

However, it in fact produces:
6 4 3 -5 -1

Please advise.

Best,
~Josh

Re: Bug: git log: boundary commits do not respect order (e.g. date-order, topo-order)

2018-02-21 Thread Josh Tepper

TYPO IN EXPECTED OUTPUT.

To avoid inevitable confusion, creating new thread "Bug: git log:
boundary commits do not respect order (e.g. date-order, topo-order)
2".

DON'T REPLY TO THIS MESSAGE.  Instead reply to the new message

On Wed, Feb 21, 2018 at 6:28 PM, Josh Tepper  wrote:
> When using git log, boundary commits (ie, those commits added by
> specifying --boundary) do not respect the order (e.g., --date-order,
> --topo-order).  Consider the following commit history, where number
> indicates the order of the commit timestamps:
>
> 
> 0125  <--A
>\ \
>  346  <--B
>
>
> Executing the following command:
>
> $ git log --boundary --date-order ^A B
>
> Should produce the following order (boundary commits shown with dashes):
> 6 -5 4 3 -1
>
> However, it in fact produces:
> 6 4 3 -7 -1
>
> Please advise.
>
> Best,
> ~Josh

Re: [PATCH v3 24/35] connect: refactor git_connect to only get the protocol version once

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:13:01 -0800
Brandon Williams  wrote:

> Instead of having each builtin transport asking for which protocol
> version the user has configured in 'protocol.version' by calling
> `get_protocol_version_config()` multiple times, factor this logic out
> so there is just a single call at the beginning of `git_connect()`.
> 
> This will be helpful in the next patch where we can have centralized
> logic which determines if we need to request a different protocol
> version than what the user has configured.
> 
> Signed-off-by: Brandon Williams 

Reviewed-by: Jonathan Tan

Re: bug in HTTP protocol spec

2018-02-21 Thread Dorian Taylor


> On Feb 21, 2018, at 2:15 PM, Jeff King  wrote:
> 
> Thanks, I agree the document is buggy. Do you want to submit a patch?

Will this do?

Note I am not sure what the story is behind that `version 1` element, whether 
it's supposed to go before or after the null packet or if there should be 
another null packet or what. Perhaps somebody more fluent with the smart 
protocol can advise.

---
Documentation/technical/http-protocol.txt | 10 +++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/http-protocol.txt 
b/Documentation/technical/http-protocol.txt
index a0e45f2889e6e..19d73f7efb338 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -214,14 +214,17 @@ smart server reply:
   S: Cache-Control: no-cache
   S:
   S: 001e# service=git-upload-pack\n
+   S: 
   S: 004895dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint\0multi_ack\n
   S: 0042d049f6c27a2244e12041955e262a404c7faba355 refs/heads/master\n
   S: 003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0\n
   S: 003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}\n
+   S: 

The client may send Extra Parameters (see
Documentation/technical/pack-protocol.txt) as a colon-separated string
-in the Git-Protocol HTTP header.
+in the Git-Protocol HTTP header. Note as well that there is *no* newline
+after the ``.

Dumb Server Response

@@ -264,8 +267,8 @@ Servers MUST set $servicename to be the request parameter 
value.
Servers SHOULD include an LF at the end of this line.
Clients MUST ignore an LF at the end of the line.

-Servers MUST terminate the response with the magic `` end
-pkt-line marker.
+Servers MUST follow the first pkt-line, as well as terminate the
+response, with the magic `` end pkt-line marker.

The returned response is a pkt-line stream describing each ref and
its known value.  The stream SHOULD be sorted by name according to
@@ -278,6 +281,7 @@ Extra Parameter.

  smart_reply =  PKT-LINE("# service=$servicename" LF)
 *1("version 1")
+""
 ref_list
 ""
  ref_list=  empty_list / non_empty_list

--
Dorian Taylor
Make things. Make sense.
https://doriantaylor.com

Re: [PATCH v3 20/35] upload-pack: introduce fetch server command

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:57 -0800
Brandon Williams  wrote:

> +want 
> + Indicates to the server an object which the client wants to
> + retrieve.

Mention that the client can "want" anything even if not advertised by
the server (like uploadpack.allowanysha1inwant).

> +output = *section
> +section = (acknowledgments | packfile)
> +   (flush-pkt | delim-pkt)
> +
> +acknowledgments = PKT-LINE("acknowledgments" LF)
> +   *(ready | nak | ack)

Can this part be described more precisely in the BNF section? I see that
you describe later that there can be multiple ACKs or one NAK (but not
both), and "ready" can be sent regardless of whether ACKs or a NAK is
sent.

> +ready = PKT-LINE("ready" LF)
> +nak = PKT-LINE("NAK" LF)
> +ack = PKT-LINE("ACK" SP obj-id LF)
> +
> +packfile = PKT-LINE("packfile" LF)
> +[PACKFILE]
> +
> +
> +acknowledgments section
> + * Always begins with the section header "acknowledgments"
> +
> + * The server will respond with "NAK" if none of the object ids sent
> +   as have lines were common.
> +
> + * The server will respond with "ACK obj-id" for all of the
> +   object ids sent as have lines which are common.
> +
> + * A response cannot have both "ACK" lines as well as a "NAK"
> +   line.
> +
> + * The server will respond with a "ready" line indicating that
> +   the server has found an acceptable common base and is ready to
> +   make and send a packfile (which will be found in the packfile
> +   section of the same response)
> +
> + * If the client determines that it is finished with negotiations
> +   by sending a "done" line, the acknowledgments sections can be
> +   omitted from the server's response as an optimization.

Should this be changed to "must"? The current implementation does not
support it (on the client side).

> +#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, 0, 0, 0, 
> 0, 0, 0 }

Optional: the trailing zeroes can be omitted. (That's shorter, and also
easier to maintain when we add new fields.)

> +int upload_pack_v2(struct repository *r, struct argv_array *keys,
> +struct argv_array *args)
> +{
> + enum fetch_state state = FETCH_PROCESS_ARGS;
> + struct upload_pack_data data = UPLOAD_PACK_DATA_INIT;
> + use_sideband = LARGE_PACKET_MAX;
> +
> + while (state != FETCH_DONE) {
> + switch (state) {
> + case FETCH_PROCESS_ARGS:
> + process_args(args, );
> +
> + if (!want_obj.nr) {
> + /*
> +  * Request didn't contain any 'want' lines,
> +  * guess they didn't want anything.
> +  */
> + state = FETCH_DONE;
> + } else if (data.haves.nr) {
> + /*
> +  * Request had 'have' lines, so lets ACK them.
> +  */
> + state = FETCH_SEND_ACKS;
> + } else {
> + /*
> +  * Request had 'want's but no 'have's so we can
> +  * immedietly go to construct and send a pack.
> +  */
> + state = FETCH_SEND_PACK;
> + }
> + break;
> + case FETCH_READ_HAVES:
> + read_haves();
> + state = FETCH_SEND_ACKS;
> + break;

This branch seems to never be taken?

> + case FETCH_SEND_ACKS:
> + if (process_haves_and_send_acks())
> + state = FETCH_SEND_PACK;
> + else
> + state = FETCH_DONE;
> + break;
> + case FETCH_SEND_PACK:
> + packet_write_fmt(1, "packfile\n");
> + create_pack_file();
> + state = FETCH_DONE;
> + break;
> + case FETCH_DONE:
> + continue;
> + }
> + }
> +
> + upload_pack_data_clear();
> + return 0;
> +}

Re: [PATCH] revision: drop --show-all option

2018-02-21 Thread Linus Torvalds

On Wed, Feb 21, 2018 at 3:27 PM, Jeff King  wrote:
>
> We'll skip the usual deprecation period because this was
> explicitly a debugging aid that was never documented.

Ack. I don't think I've used it since, and probably nobody else ever used it.

  Linus

Re: [PATCH 04/26] upload-pack: convert to a builtin

2018-02-21 Thread Junio C Hamano

Jonathan Nieder  writes:

> For defense in depth, it would be comforting if the git wrapper had
> some understanding of "don't support --help in handle_builtin when
> invoked as a dashed command".  That is, I don't expect that anyone has
> been relying on
>
>   git-add --help
>
> acting like
>
>   git help add
>
> instead of printing the usage message from
>
>   git add -h

Sounds like a neat trick.

> It's a little fussy because today we rewrite "git add --help" to
> "git-add --help" before rewriting it to "git help add"; we'd have to
> skip that middle hop for this to work.

I do not quite get this part.  "git add --help" goes through run_argv()
and then to handle_builtin() which is what does this "git help add"
swapping.

"git-add --help" does get thrown into the same codepath by
pretending as if we got "add --help" as an argument to "git"
command, and that happens without going through run_argv(),
so presumably we can add another perameter to handle_builtin()
so that the callee can tell these two invocation sites apart, no?

> I don't think that has to block this patch or series, though --- it's
> just a separate thought about hardening.

Yeah, I agree with this assessment.

Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 02:19:17PM -0500, Derrick Stolee wrote:

> > These behaviors are undocumented, untested, and unlikely to be
> > expected by users or other software attempting to parse this output.
> > 
> > Helped-by: Jeff King 
> > Signed-off-by: Derrick Stolee 
> 
> This would be a good time to allow multiple authors, or to just change the
> author, since this is exactly the diff you (Peff) provided in an earlier
> email. The commit message hopefully summarizes our discussion, but I welcome
> edits.

The point is moot if we take the revision I just sent (though in
retrospect I really ought to have put in a Reported-by: for you there).

But some communities are settling on Co-authored-by as a trailer for
this case. And GitHub has started parsing and showing that along with
author information:

  https://github.com/blog/2496-commit-together-with-co-authors

-Peff

Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 03:22:02PM -0800, Stefan Beller wrote:

> > Subject: [PATCH] commit: drop uses of get_cached_commit_buffer()
> > ---
> >  builtin/rev-list.c | 2 +-
> >  log-tree.c | 3 ---
> >  2 files changed, 1 insertion(+), 4 deletions(-)
> 
> Now if we'd get around to rewrite pretty.c as well, we could make it static,
> giving a stronger reason of not using that function. But it looks a bit
> complicated to me, who is not familiar in that area of the code.
> 
> Thanks for making less use of this suboptimal API,

I'm not sure the API is suboptimal. It's not wrong to ask "do you have a
cached copy of this?". It was just being used poorly here. :)

See the discussion in

  https://public-inbox.org/git/20180221184811.gd4...@sigill.intra.peff.net/

-Peff

Bug: git log: boundary commits do not respect order (e.g. date-order, topo-order)

2018-02-21 Thread Josh Tepper

When using git log, boundary commits (ie, those commits added by
specifying --boundary) do not respect the order (e.g., --date-order,
--topo-order).  Consider the following commit history, where number
indicates the order of the commit timestamps:


0125  <--A
   \ \
 346  <--B


Executing the following command:

$ git log --boundary --date-order ^A B

Should produce the following order (boundary commits shown with dashes):
6 -5 4 3 -1

However, it in fact produces:
6 4 3 -7 -1

Please advise.

Best,
~Josh

[PATCH] revision: drop --show-all option

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 03:03:18PM -0800, Junio C Hamano wrote:

> > So what I'm wondering is whether we should consider just ripping it out
> > (but I'm OK with keeping it, as once the commit-buffer stuff is fixed,
> > it's probably not hurting anybody).
> 
> I see no problem in removing it.  With more "interesting" features
> relying on post-processing (like 'simplify-merges'), show_all whose
> primary focus was how limit_list() behaves soft of outlived its
> usefulness, I would think.

So here's a patch to do that (textually independent but conceptually on
top of the earlier one to fix the commit-buffer thing). It actually
doesn't remove that much code, so I could go either way on it.

+cc Linus in case he's secretly in love with this feature.

-- >8 --
Subject: [PATCH] revision: drop --show-all option

This was an undocumented debugging aid that does not seem to
have come in handy in the past decade, judging from its lack
of mentions on the mailing list.

Let's drop it in the name of simplicity. This is morally a
revert of 3131b71301 (Add "--show-all" revision walker flag
for debugging, 2008-02-09), but note that I did leave in the
mapping of UNINTERESTING to "^" in get_revision_mark(). I
don't think this would be possible to trigger with the
current code, but it's the only sensible marker.

We'll skip the usual deprecation period because this was
explicitly a debugging aid that was never documented.

Signed-off-by: Jeff King 
---
 revision.c   | 10 -
 revision.h   |  1 -
 t/t6015-rev-list-show-all-parents.sh | 31 
 3 files changed, 42 deletions(-)
 delete mode 100755 t/t6015-rev-list-show-all-parents.sh

diff --git a/revision.c b/revision.c
index 5ce9b93baa..5c1cb7277c 100644
--- a/revision.c
+++ b/revision.c
@@ -1065,14 +1065,9 @@ static int limit_list(struct rev_info *revs)
return -1;
if (obj->flags & UNINTERESTING) {
mark_parents_uninteresting(commit);
-   if (revs->show_all)
-   p = _list_insert(commit, p)->next;
slop = still_interesting(list, date, slop, 
_cache);
if (slop)
continue;
-   /* If showing all, add the whole pending list to the 
end */
-   if (revs->show_all)
-   *p = list;
break;
}
if (revs->min_age != -1 && (commit->date > revs->min_age))
@@ -1864,8 +1859,6 @@ static int handle_revision_opt(struct rev_info *revs, int 
argc, const char **arg
revs->dense = 1;
} else if (!strcmp(arg, "--sparse")) {
revs->dense = 0;
-   } else if (!strcmp(arg, "--show-all")) {
-   revs->show_all = 1;
} else if (!strcmp(arg, "--in-commit-order")) {
revs->tree_blobs_in_commit_order = 1;
} else if (!strcmp(arg, "--remove-empty")) {
@@ -3094,8 +3087,6 @@ enum commit_action get_commit_action(struct rev_info 
*revs, struct commit *commi
return commit_ignore;
if (revs->unpacked && has_sha1_pack(commit->object.oid.hash))
return commit_ignore;
-   if (revs->show_all)
-   return commit_show;
if (commit->object.flags & UNINTERESTING)
return commit_ignore;
if (revs->min_age != -1 &&
@@ -3194,7 +3185,6 @@ enum commit_action simplify_commit(struct rev_info *revs, 
struct commit *commit)
enum commit_action action = get_commit_action(revs, commit);
 
if (action == commit_show &&
-   !revs->show_all &&
revs->prune && revs->dense && want_ancestry(revs)) {
/*
 * --full-diff on simplified parents is no good: it
diff --git a/revision.h b/revision.h
index 3dee97bfb9..b8c47b98e2 100644
--- a/revision.h
+++ b/revision.h
@@ -90,7 +90,6 @@ struct rev_info {
unsigned intdense:1,
prune:1,
no_walk:2,
-   show_all:1,
remove_empty_trees:1,
simplify_history:1,
topo_order:1,
diff --git a/t/t6015-rev-list-show-all-parents.sh 
b/t/t6015-rev-list-show-all-parents.sh
deleted file mode 100755
index 3c73c93ba6..00
--- a/t/t6015-rev-list-show-all-parents.sh
+++ /dev/null
@@ -1,31 +0,0 @@
-#!/bin/sh
-
-test_description='--show-all --parents does not rewrite TREESAME commits'
-
-. ./test-lib.sh
-
-test_expect_success 'set up --show-all --parents test' '
-   test_commit one foo.txt &&
-   commit1=$(git rev-list -1 HEAD) &&
-   test_commit two bar.txt &&
-   commit2=$(git rev-list -1 HEAD) &&
-   test_commit three foo.txt &&
-   commit3=$(git rev-list -1 HEAD)
-   '
-
-test_expect_success '--parents rewrites

Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Stefan Beller

> Subject: [PATCH] commit: drop uses of get_cached_commit_buffer()
> ---
>  builtin/rev-list.c | 2 +-
>  log-tree.c | 3 ---
>  2 files changed, 1 insertion(+), 4 deletions(-)

Now if we'd get around to rewrite pretty.c as well, we could make it static,
giving a stronger reason of not using that function. But it looks a bit
complicated to me, who is not familiar in that area of the code.

Thanks for making less use of this suboptimal API,
Stefan

Re: Git should preserve modification times at least on request

2018-02-21 Thread Peter Backes

On Wed, Feb 21, 2018 at 11:44:13PM +0100, Ævar Arnfjörð Bjarmason wrote:
> If it were added as a first-level feature to git it would present a lot
> of UX confusion. E.g. you run "git add" and it'll be showing the mtime
> somehow, or you get a formatted patch over E-Mail and it doesn't only
> include the commit time but also times for individual files.

But that's pretty standard. patch format has timestamp fields for 
precisely this purpose:

% echo a > x  
% echo b > y
% diff -u x y
--- x   2018-02-21 23:56:29.574029523 +0100
+++ y   2018-02-21 23:56:31.430003389 +0100
@@ -1 +1 @@
-a
+b

At present, git simply leaves those fields blank...

> The VC systems that had this feature in the past were centralized, so
> they could (in theory anyway) ensure that timestamps were monotonically
> increasing. This won't be the case with git, we have plenty of timestamp
> drift in e.g. linux.git and other git repos.

I don't see where monotonicity would be an issue any more than it is 
for centralized version control systems.

Even in the centralized setting, monotonicity is not guaranteed, since 
you might have local timestamps deviating from the repository; you 
might have added a line, compiled, and removed it again later on, 
without running make again. Now if you checkout changes from the 
repository, and it sets the timestamp, that timestamp might be older 
than before the compile, and the file would not be rebuilt if you run 
make. So you cannot avoid those issues in centralized setttings either.

> So if these mtimes were used by default they'd interact badly with stuff
> like "make" in those cases, because you might check out a modified
> version with a timestamp in the past.

That's very clearly the case, and I have stressed in my initial email 
that I fully agree with the reasoning of the FAQ in this regard. It is, 
however, merely an argument against *restoring* the timestamps *by 
default*, to comply with the principle of least astonishment. It is, by 
itself, not an argument against *storing* the timestamps, let alone 
against restoring them *on request*.

For the initial checkout, it should not even be harmful to restore the 
timestamps by default.

> any case, I just wanted to point out a workaround (but then digressed
> into critiquing the idea above...).

Well, Johannes's proposed solution seems pretty reasonable and 
realistic to me.  Thanks to Phillip's hint about unquote_path() in 
Git.pm it seems I now have all the needed ingredients to implement this 
feature.

Best wishes
Peter

-- 
Peter Backes, r...@helen.plasma.xg8.de

Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 02:17:11PM -0500, Derrick Stolee wrote:

> The get_cached_commit_buffer() method provides access to the buffer
> loaded for a struct commit, if it was ever loadead and was not freed.
> 
> Two places use this to inform how to output information about commits.
> 
> log-tree.c uses this method to short-circuit the output of commit
> information when the buffer is not cached. However, this leads to
> incorrect output in 'git log --oneline' where the short-OID is written
> but then the rest of the commit information is dropped and the next
> commit is written on the same line.
> 
> rev-list uses this method for two reasons:
> 
> - First, if the revision walk visits a commit twice, the buffer was
>   freed by rev-list in the first write. The output then does not
>   match the format expectations, since the OID is written without the
>   rest of the content.

I'm not sure after my earlier digging if there is even a way to trigger
this (and if so, it is probably accidental, since those lines were added
explicitly for --show-all).

And actually after re-reading the commit message for 3131b7130 again, I
think the current behavior is definitely not something that was
carefully planned. So I'd propose a commit message like below.

-- >8 --
Subject: [PATCH] commit: drop uses of get_cached_commit_buffer()

The "--show-all" revision option shows UNINTERESTING
commits. Some of these commits may be unparsed when we try
to show them (since we may or may not need to walk their
parents to fulfill the request).

Commit 3131b71301 (Add "--show-all" revision walker flag for
debugging, 2008-02-09) resolved this by just skipping
pretty-printing for commits without their object contents
cached, saying:

  Because we now end up listing commits we may not even have been parsed
  at all "show_log" and "show_commit" need to protect against commits
  that don't have a commit buffer entry.

That was the easy fix to avoid the pretty-printer segfaulting,
but:

  1. It doesn't work for all formats. E.g., --oneline
 prints the oid for each such commit but not a trailing
 newline, leading to jumbled output.

  2. It only affects some commits, depending on whether we
 happened to parse them or not (so if they were at the
 tip of an UNINTERESTING starting point, or if we
 happened to traverse over them, you'd see more data).

  3. It unncessarily ties the decision to show the verbose
 header to whether the commit buffer was cached. That
 makes it harder to change the logic around caching
 (e.g., if we could traverse without actually loading
 the full commit objects).

These days it's safe to feed such a commit to the
pretty-print code. Since be5c9fb904 (logmsg_reencode: lazily
load missing commit buffers, 2013-01-26), we'll load it on
demand in such a case. So let's just always show the verbose
headers.

This does change the behavior of plumbing, but:

  a. The --show-all option was explicitly introduced as a
 debugging aid, and was never documented (and has rarely
 even been mentioned on the list by git devs).

  b. Avoiding the commits was already not deterministic due
 to (2) above. So the caller might have seen full
 headers for these commits anyway, and would need to be
 prepared for it.

Signed-off-by: Jeff King 
---
 builtin/rev-list.c | 2 +-
 log-tree.c | 3 ---
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 48300d9e11..d320b6f1e3 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -134,7 +134,7 @@ static void show_commit(struct commit *commit, void *data)
else
putchar('\n');

-   if (revs->verbose_header && get_cached_commit_buffer(commit, NULL)) {
+   if (revs->verbose_header) {
struct strbuf buf = STRBUF_INIT;
struct pretty_print_context ctx = {0};
ctx.abbrev = revs->abbrev;
diff --git a/log-tree.c b/log-tree.c
index fc0cc0d6d1..22b2fb6c58 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -659,9 +659,6 @@ void show_log(struct rev_info *opt)
show_mergetag(opt, commit);
}

-   if (!get_cached_commit_buffer(commit, NULL))
-   return;
-
if (opt->show_notes) {
int raw;
struct strbuf notebuf = STRBUF_INIT;
-- 
2.16.2.555.g885a024879

Investment Project

2018-02-21 Thread Ms Melisa Mehmet

Hello,

I have a business proposal i would like to discuss with you, reply if you are 
interested for more info.

Thank You.
Ms. Melisa Mehmet
My Skype ID: ms.melisameh...@hotmail.com
WhatsApp": +90 534 716 2603

Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Junio C Hamano

Jeff King  writes:

> Out of curiosity, do you actually use --show-all for anything?

Absolutely not.  I'd actually love it if I could say "not anymore"
instead, but I haven't had an opportunity to debug the revision
traversal code for quite some time so I do not even remember when
was the last time I used it, which disqualifies me from saying even
that.

> So what I'm wondering is whether we should consider just ripping it out
> (but I'm OK with keeping it, as once the commit-buffer stuff is fixed,
> it's probably not hurting anybody).

I see no problem in removing it.  With more "interesting" features
relying on post-processing (like 'simplify-merges'), show_all whose
primary focus was how limit_list() behaves soft of outlived its
usefulness, I would think.

Re: [PATCH v3 16/35] transport: convert transport_get_remote_refs to take a list of ref patterns

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:53 -0800
Brandon Williams  wrote:

> -const struct ref *transport_get_remote_refs(struct transport *transport)
> +const struct ref *transport_get_remote_refs(struct transport *transport,
> + const struct argv_array 
> *ref_patterns)
>  {
>   if (!transport->got_remote_refs) {
> - transport->remote_refs = 
> transport->vtable->get_refs_list(transport, 0, NULL);
> + transport->remote_refs =
> + transport->vtable->get_refs_list(transport, 0,
> +  ref_patterns);
>   transport->got_remote_refs = 1;
>   }

Should we do our own client-side filtering if the server side cannot do
it for us (because it doesn't support protocol v2)? Either way, this
decision should be mentioned in the commit message.

Re: [PATCH v3 15/35] transport: convert get_refs_list to take a list of ref patterns

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:52 -0800
Brandon Williams  wrote:

> @@ -21,7 +22,8 @@ struct transport_vtable {
>* the ref without a huge amount of effort, it should store it
>* in the ref's old_sha1 field; otherwise it should be all 0.
>**/
> - struct ref *(*get_refs_list)(struct transport *transport, int for_push);
> + struct ref *(*get_refs_list)(struct transport *transport, int for_push,
> +  const struct argv_array *ref_patterns);

Also mention in the documentation that this function is allowed to
return refs that do not match the ref patterns.

Proposal

2018-02-21 Thread melisa mehmet

Hello

Greetings to you and everyone around you please did you get my previous email 
regarding my proposal ?
please let me know if we can work together on this.

Best Reagrds

Proposal

2018-02-21 Thread melisa mehmet

Hello

Greetings to you and everyone around you please did you get my previous email 
regarding my proposal ?
please let me know if we can work together on this.

Best Reagrds

Re: [PATCH v3 14/35] connect: request remote refs using v2

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:51 -0800
Brandon Williams  wrote:

> +extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> + struct ref **list, int for_push,
> + const struct argv_array *ref_patterns);

I haven't looked at the rest of this patch in detail, but the type of
ref_patterns is probably better as struct string_list, since this is not
a true argument array (e.g. with flags starting with --). Same comment
for the next few patches that deal with ref patterns.

Re: test bare repository for unit tests

2018-02-21 Thread Ævar Arnfjörð Bjarmason


On Wed, Feb 21 2018, Basin Ilya jotted:

> Hi.
> I want to the test-repo-git under 
> https://github.com/apache/maven-wagon/tree/master/wagon-providers/wagon-scm/src/test/resources/
> just like test-repo-cvs and test-repo-svn
>
> Which configuation options would suit that?
> I think core.compression 0 for human readable diffs.
> also, I need to force loose, gc after each push.

It looks like you have unit tests that are going to do integration tests
of some SVN/CVS repos as used by some other tool, and want to add git to
that.

Since you have git already, the most straightforward thing to do would
be to ship the output of git-fast-export in the repo, and have the test
setup code create a repo locally out of that, then test it.

Or do you really need to commit the raw repo files as-is for some reason
I've missed?

Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 02:22:05PM -0800, Junio C Hamano wrote:

> > I think that repeating the oid is intentional; the point is to dump how
> > the traversal code is hitting the endpoints, even if we do so multiple
> > times.
> >
> > The --oneline behavior just looks like a bug. I think --format is broken
> > with --show-all, too (it does not show anything!).
> 
> I do not know about the --format thing,[...]

Hmm, maybe it is fine. I thought before that I got funny output out of
"git log --show-all --format", but I can't seem to reproduce it now.

> being a bug is correct.  I've known about the oneline that does not
> show anything other than the oid (not even end-of-line) for unparsed
> commits for a long time---I just didn't bother looking into fixing
> it exactly because this is only a debugging aid ;-)

Out of curiosity, do you actually use --show-all for anything? I didn't
even know it existed until this thread, and AFAICT nobody but Linus has
ever recommended its use. And it does not even seem that useful as a
general debugging aid.

So what I'm wondering is whether we should consider just ripping it out
(but I'm OK with keeping it, as once the commit-buffer stuff is fixed,
it's probably not hurting anybody).

-Peff

Re: [PATCH v3 12/35] serve: introduce git-serve

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:49 -0800
Brandon Williams  wrote:

>  .gitignore  |   1 +
>  Documentation/technical/protocol-v2.txt | 114 +++
>  Makefile|   2 +
>  builtin.h   |   1 +
>  builtin/serve.c |  30 
>  git.c   |   1 +
>  serve.c | 250 
> 
>  serve.h |  15 ++
>  t/t5701-git-serve.sh|  60 
>  9 files changed, 474 insertions(+)
>  create mode 100644 Documentation/technical/protocol-v2.txt
>  create mode 100644 builtin/serve.c
>  create mode 100644 serve.c
>  create mode 100644 serve.h
>  create mode 100755 t/t5701-git-serve.sh

As someone who is implementing the server side of protocol V2 in JGit, I
now have a bit more insight into this :-)

First of all, I used to not have a strong opinion on the existence of a
new endpoint, but now I think that it's better to *not* have git-serve.
As it is, as far as I can tell, upload-pack also needs to support (and
does support, as of the end of this patch set) protocol v2 anyway, so it
might be better to merely upgrade upload-pack.

> +A client then responds to select the command it wants with any particular
> +capabilities or arguments.  There is then an optional section where the
> +client can provide any command specific parameters or queries.
> +
> +command-request = command
> +   capability-list
> +   (command-args)

If you are stating that this is optional, write "*1command-args". (RFC
5234 also supports square brackets, but "*1" is already used in
pack-protocol.txt and http-protocol.txt.)

> +   flush-pkt
> +command = PKT-LINE("command=" key LF)
> +command-args = delim-pkt
> +*arg
> +arg = 1*CHAR

arg should be wrapped in PKT-LINE, I think, and terminated by an LF.

Re: Git should preserve modification times at least on request

2018-02-21 Thread Ævar Arnfjörð Bjarmason

On Wed, Feb 21 2018, Peter Backes jotted:

> On Wed, Feb 21, 2018 at 10:33:05PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> This sounds like a sensible job for a git import tool, i.e. import a
>> target directory into git, and instead of 'git add'-ing the whole thing
>> it would look at the mtimes, sort files by mtime, then add them in order
>> and only commit those files that had the same mtime in the same commit
>> (or within some boundary).
>
> I think that this would be The Wrong Thing to do.

I'm merely pointing out that if you have the use-case Derek Fawcus
describes you can get per-file mtimes via something similar to the the
hook method Theodore Ts'o described today with a simple import tool with
no changes to git or its object format required.

To the extent that there's a convention for this in git that's the
convention, e.g. if you use github or gitlab they'll render the
modification time of a file in the tree view, and that time is the time
of the commit that last touched it: https://github.com/git/git

> The commit time is just that: The time the commit was done. The commit
> is an atomic group of changes to a number of files that hopefully bring
> the tree from one usable state into the next.
>
> The mtime, in contrast, tells us when a file was most recently modified.
>
> It may well be that main.c was most recently modified yesterday, and
> feature.c was modified this morning, and that only both changes taken
> together make sense as a commit, despite the long time in between.
>
> Even worse, it may be that feature A took a long time to implement, so
> we have huge gaps in between the mtimes, but feature B was quickly done
> after A was finished.

...

> Such an algorithm would probably split feature A
> incorrectly into several commits, and group the more recently changed
> files of feature A with those of feature B.

Right, but that's a trade-off you can pick at import time in this
hypothetical tar-to-commits tool, you could decide to do no merging and
suffer to signal loss.

> And if Feature A and Feature B were developed in parallel, things get
> completely messy.
>
>> The advantage of doing this via such a tool is that you could tweak it
>> to commit by any criteria you wanted, e.g. not mtime but ctime or even
>> atime.
>
> Maybe, but it would be rather useless to commit by ctime or atime. You
> do one grep -r and the atime is different. You do one chmod or chown
> and the ctime is different. Those timestamps are really only useful for
> very limited purposes.
>
> That ctime exists seems reasonable, since it's only ever updated when
> the inode is written anyway.
>
> atime, in contrast, was clearly one of the rather nonsensical
> innovations of UNIX: Do one write to the disk for each read from the
> disk. C'mon, really? It would have been a lot more reasonable to simply
> provide a generic way for tracing read() system calls instead; then
> userspace could decide what to do with that information and which of it
> is useful and should be kept and perhaps stored on disk. Now we have
> this ugly hack called relatime to deal with the problem.

Yes, that [ac]time example was a stretch. A better example would be
committing the file mode, or extended attributes, or "this is on a
different FS", or whatever other per-file/dir attribute we're not
currently capturing.

>> You'd get the same thing as you'd get if git's tree format would change
>> to include mtimes (which isn't going to happen), but with a lot more
>> flexibility.
>
> Well, from basic logic, I don't see how a decision not to implement a
> feature could possibly increase flexility. The opposite seems to be the
> case.

I'm not trying to argue the usefulness of this mtime-per-file thing in
theory, just providing Derek Fawcus with a suggestion for a viable
workaround.

What I meant by this offhand comment, and which you may or may not know
(and I see no references to it from skimming the thread) is that there's
simply no space in the tree objects to add *anything* without breaking
the object format and requiring a major upgrade, although the plan to
switch to a new hash function is relevant to this.

Even if we suppose that git was being implemented today I don't think
this would make any sense as a first-level feature.

Empirical evidence suggests that people use git on a massive scale
largely without caring about this, and the users who do have a
workaround.

If it were added as a first-level feature to git it would present a lot
of UX confusion. E.g. you run "git add" and it'll be showing the mtime
somehow, or you get a formatted patch over E-Mail and it doesn't only
include the commit time but also times for individual files.

The VC systems that had this feature in the past were centralized, so
they could (in theory anyway) ensure that timestamps were monotonically
increasing. This won't be the case with git, we have plenty of timestamp
drift in e.g. linux.git and other git repos.

So if these mtimes were used by default they'd

Proposal

2018-02-21 Thread melisa mehmet

Hello

Greetings to you and everyone around you please did you get my previous email 
regarding my proposal ?
please let me know if we can work together on this.

Best Reagrds

Re: [PATCH v2 0/3] Re: t7006: add tests for how git config paginates

2018-02-21 Thread Junio C Hamano

Thanks.  The entire thing looked reasonable.

Will replace.

Proposal

2018-02-21 Thread melisa mehmet

Hello

Greetings to you and everyone around you please did you get my previous email 
regarding my proposal ?
please let me know if we can work together on this.

Best Reagrds

Proposal

2018-02-21 Thread melisa mehmet

Hello

Greetings to you and everyone around you please did you get my previous email 
regarding my proposal ?
please let me know if we can work together on this.

Best Reagrds

Re: [PATCH v4 12/13] commit-graph: read only from specific pack-indexes

2018-02-21 Thread Stefan Beller

On Mon, Feb 19, 2018 at 10:53 AM, Derrick Stolee  wrote:
>
> Teach git-commit-graph to inspect the objects only in a certain list
> of pack-indexes within the given pack directory. This allows updating
> the commit graph iteratively, since we add all commits stored in a
> previous commit graph.
>
> Signed-off-by: Derrick Stolee 
> ---
>  Documentation/git-commit-graph.txt | 11 +++
>  builtin/commit-graph.c | 32 +---
>  commit-graph.c | 26 --
>  commit-graph.h |  4 +++-
>  packfile.c |  4 ++--
>  packfile.h |  2 ++
>  t/t5318-commit-graph.sh| 16 
>  7 files changed, 87 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git-commit-graph.txt 
> b/Documentation/git-commit-graph.txt
> index b9b4031..93d50d1 100644
> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -42,6 +42,10 @@ With the `--delete-expired` option, delete the graph files 
> in the pack
>  directory that are not referred to by the graph-latest file. To avoid race
>  conditions, do not delete the file previously referred to by the
>  graph-latest file if it is updated by the `--set-latest` option.
> ++
> +With the `--stdin-packs` option, generate the new commit graph by
> +walking objects only in the specified packfiles and any commits in
> +the existing graph-head.

A general question on this series:
How do commit graph buildups deal with garbage collected commits?
(my personal workflow is heavy on rebase, which generates lots of
dangling commits, to be thrown out later)

The second half of the sentence makes it sound like once a
commit is in the graph it cannot be pulled out easily again, hence
the question on the impact of graphs on a long living repository
which is garbage collected frequently.

AFAICT you could just run
git commit-graph write --set-latest [--delete-expired]
as that actually looks up objects from outside the existing graph files,
such that lost objects are ignored?

> +   const char **lines = NULL;
> +   int nr_lines = 0;
> +   int alloc_lines = 0;

(nit:)
I had the impression that these triplet-variables, that are used in
ALLOC_GROW are allo X, X_nr and X_allow, but I might be wrong.

> @@ -170,7 +178,25 @@ static int graph_write(int argc, const char **argv)
>
> old_graph_name = get_graph_latest_contents(opts.obj_dir);
>
> -   graph_name = write_commit_graph(opts.obj_dir);
> +   if (opts.stdin_packs) {
> +   struct strbuf buf = STRBUF_INIT;
> +   nr_lines = 0;
> +   alloc_lines = 128;

alloc_lines has been initialized before, so why redo it here again?
Also what is the rationale for choosing 128 as a good default?
I would guess 0 is just as fine, because ALLOC_GROW makes sure
that it growth fast in the first couple entries by having an additional
offset. (no need to fine tune the starting allocation IMHO)

> +   ALLOC_ARRAY(lines, alloc_lines);
> +
> +   while (strbuf_getline(, stdin) != EOF) {
> +   ALLOC_GROW(lines, nr_lines + 1, alloc_lines);
> +   lines[nr_lines++] = buf.buf;
> +   strbuf_detach(, NULL);

strbuf_detach returns its previous buf.buf, such that you can combine these
two lines as
lines[nr_lines++] = strbuf_detach(, NULL);


> +   }
> +
> +   pack_indexes = lines;
> +   nr_packs = nr_lines;

Technically we do not need to strbuf_release() here, because
strbuf_detach is always called, and by knowing its implementation,
it is just as good.


> @@ -579,7 +581,27 @@ char *write_commit_graph(const char *obj_dir)
> oids.alloc = 1024;
> ALLOC_ARRAY(oids.list, oids.alloc);
>
> -   for_each_packed_object(if_packed_commit_add_to_list, , 0);
> +   if (pack_indexes) {
> +   struct strbuf packname = STRBUF_INIT;
> +   int dirlen;
> +   strbuf_addf(, "%s/pack/", obj_dir);
> +   dirlen = packname.len;
> +   for (i = 0; i < nr_packs; i++) {
> +   struct packed_git *p;
> +   strbuf_setlen(, dirlen);
> +   strbuf_addstr(, pack_indexes[i]);
> +   p = add_packed_git(packname.buf, packname.len, 1);
> +   if (!p)
> +   die("error adding pack %s", packname.buf);
> +   if (open_pack_index(p))
> +   die("error opening index for %s", 
> packname.buf);
> +   for_each_object_in_pack(p, 
> if_packed_commit_add_to_list, );
> +   close_pack(p);
> +   }

strbuf_release();

> +   }
> +   else

(micro style nit)

} else

Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Junio C Hamano

Jeff King  writes:

> I think that repeating the oid is intentional; the point is to dump how
> the traversal code is hitting the endpoints, even if we do so multiple
> times.
>
> The --oneline behavior just looks like a bug. I think --format is broken
> with --show-all, too (it does not show anything!).

I do not know about the --format thing, but the part about --oneline
being a bug is correct.  I've known about the oneline that does not
show anything other than the oid (not even end-of-line) for unparsed
commits for a long time---I just didn't bother looking into fixing
it exactly because this is only a debugging aid ;-)

> Though I think it would be equally correct to have set_commit_buffer()
> just throw away the existing cache entry and replace it with this one. I
> don't think there's a real reason to prefer the old to the new. And that
> might be worth doing if it would let us drop get_cached_commit_buffer()
> as a public function. But...
> ...
> In my opinion it's not really worth trying to make it private. The
> confusion you're fixing in the first two calls is not due to a bad API,
> but due to some subtly confusing logic in that code's use of the API. ;)

Yup.

> So I'd probably do this:
> ...

Makes sense to me.

Re: bug in HTTP protocol spec

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 10:29:35AM -0800, Dorian Taylor wrote:

> I didn’t get an insight until I ran with GIT_TRACE_PACKET=true on a 
> known-good remote (i.e. GitHub), that the null packet-line `` has to 
> follow the service line. This is not reflected in the example here:
> 
> https://github.com/git/git/blob/6464679d9620d91b639e2681b9cc6473f3856d09/Documentation/technical/http-protocol.txt#L216

This is missing the trailing flush after the ref advertisement, too.

> It is also not reflected in the BNF:
> 
> https://github.com/git/git/blob/6464679d9620d91b639e2681b9cc6473f3856d09/Documentation/technical/http-protocol.txt#L279
> 
> (Note these links are from the most recent commit of this file as of this 
> writing.)
> 
> Just thought somebody would like to know.

Thanks, I agree the document is buggy. Do you want to submit a patch?

-Peff

Re: Git should preserve modification times at least on request

2018-02-21 Thread Peter Backes

On Wed, Feb 21, 2018 at 10:33:05PM +0100, Ævar Arnfjörð Bjarmason wrote:
> This sounds like a sensible job for a git import tool, i.e. import a
> target directory into git, and instead of 'git add'-ing the whole thing
> it would look at the mtimes, sort files by mtime, then add them in order
> and only commit those files that had the same mtime in the same commit
> (or within some boundary).

I think that this would be The Wrong Thing to do.

The commit time is just that: The time the commit was done. The commit 
is an atomic group of changes to a number of files that hopefully bring 
the tree from one usable state into the next.

The mtime, in contrast, tells us when a file was most recently modified.

It may well be that main.c was most recently modified yesterday, and 
feature.c was modified this morning, and that only both changes taken 
together make sense as a commit, despite the long time in between.

Even worse, it may be that feature A took a long time to implement, so 
we have huge gaps in between the mtimes, but feature B was quickly done 
after A was finished. Such an algorithm would probably split feature A 
incorrectly into several commits, and group the more recently changed 
files of feature A with those of feature B.

And if Feature A and Feature B were developed in parallel, things get 
completely messy.

> The advantage of doing this via such a tool is that you could tweak it
> to commit by any criteria you wanted, e.g. not mtime but ctime or even
> atime.

Maybe, but it would be rather useless to commit by ctime or atime. You 
do one grep -r and the atime is different. You do one chmod or chown 
and the ctime is different. Those timestamps are really only useful for 
very limited purposes.

That ctime exists seems reasonable, since it's only ever updated when 
the inode is written anyway.

atime, in contrast, was clearly one of the rather nonsensical 
innovations of UNIX: Do one write to the disk for each read from the 
disk. C'mon, really? It would have been a lot more reasonable to simply 
provide a generic way for tracing read() system calls instead; then 
userspace could decide what to do with that information and which of it 
is useful and should be kept and perhaps stored on disk. Now we have 
this ugly hack called relatime to deal with the problem.

> You'd get the same thing as you'd get if git's tree format would change
> to include mtimes (which isn't going to happen), but with a lot more
> flexibility.

Well, from basic logic, I don't see how a decision not to implement a 
feature could possibly increase flexility. The opposite seems to be the 
case.

Best wishes
Peter

-- 
Peter Backes, r...@helen.plasma.xg8.de

Re: [PATCH v3 08/35] connect: discover protocol version outside of get_remote_heads

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:45 -0800
Brandon Williams  wrote:

> - get_remote_heads(fd[0], NULL, 0, , 0, NULL, );
> +
> + packet_reader_init(, fd[0], NULL, 0,
> +PACKET_READ_CHOMP_NEWLINE |
> +PACKET_READ_GENTLE_ON_EOF);
> +
> + switch (discover_version()) {
> + case protocol_v1:
> + case protocol_v0:
> + get_remote_heads(, , 0, NULL, );
> + break;
> + case protocol_unknown_version:
> + BUG("unknown protocol version");
> + }

This inlining is repeated a few times, which raises the question: if the
intention is to keep the v0/1 logic separately from v2, why not have a
single function that wraps them all? Looking at the end result (after
all the patches in this patch set are applied), it seems that the v2
version does not have extra_have or shallow parameters, which is a good
enough reason for me (I don't think functions that take in many
arguments and then selectively use them is a good idea). I think that
other reviewers will have this question too, so maybe discuss this in
the commit message.

> diff --git a/remote.h b/remote.h
> index 1f6611be2..2016461df 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
>  void free_refs(struct ref *ref);
>  
>  struct oid_array;
> -extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> +struct packet_reader;
> +extern struct ref **get_remote_heads(struct packet_reader *reader,
>struct ref **list, unsigned int flags,
>struct oid_array *extra_have,
> -  struct oid_array *shallow);
> +  struct oid_array *shallow_points);

This change probably does not belong in this patch, especially since
remote.c is unchanged.

Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Derrick Stolee


On 2/21/2018 2:17 PM, Derrick Stolee wrote:

The get_cached_commit_buffer() method provides access to the buffer
loaded for a struct commit, if it was ever loadead and was not freed.

Two places use this to inform how to output information about commits.

log-tree.c uses this method to short-circuit the output of commit
information when the buffer is not cached. However, this leads to
incorrect output in 'git log --oneline' where the short-OID is written
but then the rest of the commit information is dropped and the next
commit is written on the same line.

rev-list uses this method for two reasons:

- First, if the revision walk visits a commit twice, the buffer was
   freed by rev-list in the first write. The output then does not
   match the format expectations, since the OID is written without the
   rest of the content.

- Second, if the revision walk visits a commit that was marked
   UNINTERESTING, the walk may never load a buffer and hence rev-list
   will not output the verbose information.

These behaviors are undocumented, untested, and unlikely to be
expected by users or other software attempting to parse this output.

Helped-by: Jeff King 
Signed-off-by: Derrick Stolee 


This would be a good time to allow multiple authors, or to just change 
the author, since this is exactly the diff you (Peff) provided in an 
earlier email. The commit message hopefully summarizes our discussion, 
but I welcome edits.



---
  builtin/rev-list.c | 2 +-
  log-tree.c | 3 ---
  2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 48300d9..d320b6f 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -134,7 +134,7 @@ static void show_commit(struct commit *commit, void *data)
else
putchar('\n');
  
-	if (revs->verbose_header && get_cached_commit_buffer(commit, NULL)) {

+   if (revs->verbose_header) {
struct strbuf buf = STRBUF_INIT;
struct pretty_print_context ctx = {0};
ctx.abbrev = revs->abbrev;
diff --git a/log-tree.c b/log-tree.c
index fc0cc0d..22b2fb6 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -659,9 +659,6 @@ void show_log(struct rev_info *opt)
show_mergetag(opt, commit);
}
  
-	if (!get_cached_commit_buffer(commit, NULL))

-   return;
-
if (opt->show_notes) {
int raw;
struct strbuf notebuf = STRBUF_INIT;

Re: [PATCH 05/27] object-store: move packed_git and packed_git_mru to object store

2018-02-21 Thread Junio C Hamano

Stefan Beller  writes:

> +
> +/*
> + * The mru list_head is supposed to be initialized using
> + * the LIST_HEAD macro, assigning prev/next to itself.
> + * However this doesn't work in this case as some compilers dislike
> + * that macro on member variables. Use NULL instead as that is defined
> + * and accepted, deferring the real init to prepare_packed_git_mru(). */
> +#define __MRU_INIT { NULL, NULL }
> +#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL }

I do not think it has to be this way to abuse two NULLs, if you
faithfully mimicked how LIST_HEAD() macro is constructed.  The
reason why it does not try to introduce

struct list_head x = LIST_HEAD_INIT;

and instead, uses

LIST_HEAD(x);

is because of the need for self referral.  If we learn from it, we
can do the same, i.e. instead of doing

struct raw_object_store x = RAW_OBJECT_STORE_INIT;

we can do

RAW_OBJECT_STORE(x);

that expands to

struct raw_object_store x = {
...
{ _git_mru, _git_mru },
...
};

no?  Then we do not need such a lengthy comment that reads only like
an excuse ;-)

Re: [PATCH 04/26] upload-pack: convert to a builtin

2018-02-21 Thread Jonathan Nieder

Brandon Williams wrote:
> On 01/03, Stefan Beller wrote:
> > On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams  wrote:

>>> In order to allow for code sharing with the server-side of fetch in
>>> protocol-v2 convert upload-pack to be a builtin.
>>
>> What is the security aspect of this patch?
>>
>> By making upload-pack builtin, it gains additional abilities,
>> such as answers to '-h' or '--help' (which would start a pager).
>> Is there an easy way to sooth my concerns? (best put into the
>> commit message)
>
> receive-pack is already a builtin, so theres that.

*nod*

Since v2.4.12~1^2 (shell: disallow repo names beginning with dash,
2017-04-29), git-shell refuses to pass --help to upload-pack, limiting
the security impact in configurations that use git-shell (e.g.
gitolite installations).

If you're not using git-shell, then hopefully you have some other form
of filtering preventing arbitrary options being passed to
git-upload-pack.  If you don't, then you're in trouble, for the
reasons described in that commit.

Since some installations may be allowing access to git-upload-pack
(for read-only access) without allowing access to git-receive-pack,
this does increase the chance of attack.  On the other hand, I suspect
the maintainability benefit is worth it.

For defense in depth, it would be comforting if the git wrapper had
some understanding of "don't support --help in handle_builtin when
invoked as a dashed command".  That is, I don't expect that anyone has
been relying on

git-add --help

acting like

git help add

instead of printing the usage message from

git add -h

It's a little fussy because today we rewrite "git add --help" to
"git-add --help" before rewriting it to "git help add"; we'd have to
skip that middle hop for this to work.

I don't think that has to block this patch or series, though --- it's
just a separate thought about hardening.

Cc-ing Sitaram Chamarty since he tends to be wiser about this kind of
thing than I am.

What do you think?

Thanks,
Jonathan

Re: [PATCH v3 04/35] upload-pack: convert to a builtin

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:41 -0800
Brandon Williams  wrote:

> In order to allow for code sharing with the server-side of fetch in
> protocol-v2 convert upload-pack to be a builtin.
> 
> Signed-off-by: Brandon Williams 

As Stefan mentioned in [1], also mention in the commit message that this
means that the "git-upload-pack" invocation gains additional
capabilities (for example, invoking a pager for --help).

Having said that, the main purpose of this patch seems to be to libify
upload-pack, and the move to builtin is just a way of putting the
program somewhere - we could have easily renamed upload-pack.c and
created a new upload-pack.c containing the main(), preserving the
non-builtin-ness of upload-pack, while still gaining the benefits of
libifying upload-pack.

If the community does want to make upload-pack a builtin, I would write
the commit message this way:

  upload-pack: libify

  Libify upload-pack. The main() function is moved to
  builtin/upload-pack.c, thus making upload-pack a builtin. Note that
  this means that "git-upload-pack" gains functionality such as the
  ability to invoke a pager when passed "--help".

And if not:

  upload-pack: libify

  Libify upload-pack by moving most of the functionality in
  upload-pack.c into a file upload-pack-lib.c (or some other name),
  to be used in subsequent patches.

[1] 
https://public-inbox.org/git/CAGZ79kb2=uu0_k8wr27gndnx-t+p+7gvdgc5ebdyc3zbobs...@mail.gmail.com/

> -static void upload_pack(void)
> -{
> - struct string_list symref = STRING_LIST_INIT_DUP;
> -
> - head_ref_namespaced(find_symref, );
> -
> - if (advertise_refs || !stateless_rpc) {
> - reset_timeout();
> - head_ref_namespaced(send_ref, );
> - for_each_namespaced_ref(send_ref, );
> - advertise_shallow_grafts(1);
> - packet_flush(1);
> - } else {
> - head_ref_namespaced(check_ref, NULL);
> - for_each_namespaced_ref(check_ref, NULL);
> - }
> - string_list_clear(, 1);
> - if (advertise_refs)
> - return;
> -
> - receive_needs();
> - if (want_obj.nr) {
> - get_common_commits();
> - create_pack_file();
> - }
> -}

I see that this function had to be moved to the bottom because it now
also needs to make use of functions like upload_pack_config() - that's
fine.

> +struct upload_pack_options {
> + int stateless_rpc;
> + int advertise_refs;
> + unsigned int timeout;
> + int daemon_mode;
> +};

I would have expected "unsigned stateless_rpc : 1" etc., but I see that
this makes it easier to use with OPT_BOOL (which needs us to pass it a
pointer-to-int).

As for what existing code does, files like fetch-pack and diff use
"unsigned : 1", but they also process arguments without OPT_, so I don't
think they are relevant.

I think that we should decide if we're going to prefer "unsigned : 1" or
"int" for flags in new code. Personally, I prefer "unsigned : 1"
(despite the slight inconvenience in that argument parsers will need to
declare their own temporary "int" and then assign its contents to the
options struct) because of the stronger type, but I'm OK either way.
Whatever the decision, I don't think it needs to block the review of
this patch set.

Congratulations i willed to you

2018-02-21 Thread Micheal P Anderson

 
   My Dear friend,

 I am Mr. Micheal Pascal Anderson I got your contact on my personal search
of the person.. I willed the only funds left in my account to you.

If you want to know why I have willed the only funds left
in my account to you please contact the bank manager whose
name and address I will give to you as soon as you reply to
this mail. Right now I am in the hospital. Please,

Mr. Micheal Pascal Anderson

Re: [PATCH v4 08/13] commit-graph: implement --delete-expired

2018-02-21 Thread Stefan Beller

On Mon, Feb 19, 2018 at 10:53 AM, Derrick Stolee  wrote:

> graph_name = write_commit_graph(opts.obj_dir);
>
> if (graph_name) {
> if (opts.set_latest)
> set_latest_file(opts.obj_dir, graph_name);
>
> +   if (opts.delete_expired)
> +   do_delete_expired(opts.obj_dir,
> + old_graph_name,
> + graph_name);
> +

So this only allows to delete expired things and setting the latest
when writing a new graph. Would we ever envision a user to produce
a new graph (e.g. via obtaining a graph that they got from a server) and
then manually rerouting the latest to that new graph file without writing
that graph file in the same process? The same for expired.

I guess these operations are just available via editing the
latest or deleting files manually, which slightly contradicts
e.g. "git update-ref", which in olden times was just a fancy way
of rewriting the refs file manually. (though it claims to be less
prone to errors as it takes lock files)

>
>  extern char *get_graph_latest_filename(const char *obj_dir);
> +extern char *get_graph_latest_contents(const char *obj_dir);

Did
https://public-inbox.org/git/20180208213806.ga6...@sigill.intra.peff.net/
ever make it into tree? (It is sort of new, but I feel we'd want to
strive for consistency in the code base, eventually.)

Re: Git should preserve modification times at least on request

2018-02-21 Thread Ævar Arnfjörð Bjarmason

On Wed, Feb 21 2018, Derek Fawcus jotted:

> On Mon, Feb 19, 2018 at 10:22:36PM +0100, Peter Backes wrote:
>>
>> It is pretty annoying that git cannot, even if I know what I am doing,
>> and explicitly want it to, preserve the modification time.
>
> The use case I've come across where it would be of value is for code
> archeology, either importing a bunch of tar files, or importing a
> repo from some other VCS.
>
> There preserving the mod times can be useful when one is subsequently
> figuring out what changed, and the scope of the 'commits' is too big
> (i.e. the granularity of the tar files themselves).
>
> e.g. initial commits are done on tar boundaries, but one may try to
> figure out individual changes from a ChangeLog file.  I've done this
> a couple of times, but to date it has required keeping the untarred
> trees around (or a timestamp list file from each tree), in addition
> to the git repro in to which one is then synthesizing smaller commits.

This sounds like a sensible job for a git import tool, i.e. import a
target directory into git, and instead of 'git add'-ing the whole thing
it would look at the mtimes, sort files by mtime, then add them in order
and only commit those files that had the same mtime in the same commit
(or within some boundary).

The advantage of doing this via such a tool is that you could tweak it
to commit by any criteria you wanted, e.g. not mtime but ctime or even
atime.

You'd get the same thing as you'd get if git's tree format would change
to include mtimes (which isn't going to happen), but with a lot more
flexibility.

Re: Git should preserve modification times at least on request

2018-02-21 Thread Phillip Wood


On 20/02/18 22:48, Peter Backes wrote:


On Tue, Feb 20, 2018 at 11:32:23PM +0100, Johannes Schindelin wrote:

Hi Peter,

On Tue, 20 Feb 2018, Peter Backes wrote:


On Tue, Feb 20, 2018 at 11:46:38AM +0100, Johannes Schindelin wrote:


I would probably invent a file format (``)


I'm stuck there because of  being munged.


 From which command do you want to get it? If you are looking at `git
diff`, you may want to use the `-z --name-only` options to avoid munging
the paths.


I plan to use "git diff-tree --name-only $w_tree HEAD" and subtract
all lines from "git diff-index --name-only HEAD" to get the files for
which the timestamp should be stored..

If I use "-z" I get the non-munged path, but I cannot safely store such
paths in the proposed file format; they might contain newlines (sigh).
So at one point I have to munge. Then the same question arises when I
have to get the actual path from the munged path when restoring the
timestamps.

If there's no ready-made functionality to munge and unmunge paths, I
have to write some awk for this. At first I thought this might add one
more dependency to git, but it seems that awk is already used in
git-mergetool.sh, so I suppose it's okay to use in git-stash.sh etc,
too.


In recent versions of git there's unquote_path() in Git.pm, you could 
possibly use that with perl -e from your script


Best Wishes

Phillip

Best wishes
Peter

Re: Git should preserve modification times at least on request

2018-02-21 Thread Derek Fawcus

On Mon, Feb 19, 2018 at 10:22:36PM +0100, Peter Backes wrote:
> 
> It is pretty annoying that git cannot, even if I know what I am doing, 
> and explicitly want it to, preserve the modification time.

The use case I've come across where it would be of value is for code
archeology, either importing a bunch of tar files, or importing a
repo from some other VCS.

There preserving the mod times can be useful when one is subsequently
figuring out what changed, and the scope of the 'commits' is too big
(i.e. the granularity of the tar files themselves).

e.g. initial commits are done on tar boundaries, but one may try to
figure out individual changes from a ChangeLog file.  I've done this
a couple of times, but to date it has required keeping the untarred
trees around (or a timestamp list file from each tree), in addition
to the git repro in to which one is then synthesizing smaller commits.

DF

Re: [PATCH] t/known-leaky: add list of known-leaky test scripts

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 08:53:16AM -0800, Junio C Hamano wrote:

> Even though keeping track of list of known-leaky tests may not be so
> interesting, we can still salvage useful pieces from the discussion
> and make them available to developers, e.g.  something like
> 
> prove --dry --state=failed |
> perl -lne '/^(t[0-9]{4})-.*\.sh$/ and print $1' | sort >$@+
> if cmp >/dev/null $@ $@+; then rm $@+; else mv $@+ $@; fi
> 
> could be made into a target to stash away the list of failing tests
> after a test run?

Unfortunately there are some caveats in that snippet:

  1. You are using prove.

  2. You are using --state=save in the initial run.

I think we might be better off having the test scripts write to
test-results/*.counts even when run under a TAP harness, and then we can
have a consistent way to get the list of failed tests (we already have a
"make failed" that works this way).

-Peff

[PATCH v2 3/3] config: change default of `pager.config` to "on"

2018-02-21 Thread Martin Ågren

This is similar to ff1e72483 (tag: change default of `pager.tag` to
"on", 2017-08-02) and is safe now that we do not consider `pager.config`
at all when we are not listing or getting configuration. This change
will help with listing large configurations, but will not hurt users of
`git config --edit` as it would have before the previous commit.

Signed-off-by: Martin Ågren 
---
 Documentation/git-config.txt |  1 +
 t/t7006-pager.sh | 12 ++--
 builtin/config.c |  2 +-
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 249090ac84..e09ed5d7d5 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -237,6 +237,7 @@ CONFIGURATION
 -
 `pager.config` is only respected when listing configuration, i.e., when
 using `--list` or any of the `--get-*` which may return multiple results.
+The default is to use a pager.
 
 [[FILES]]
 FILES
diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index 95bd26f0b2..7541ba5edb 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -267,23 +267,23 @@ test_expect_success TTY 'git config --get ignores 
pager.config' '
! test -e paginated.out
 '
 
-test_expect_success TTY 'git config --get-urlmatch defaults to not paging' '
+test_expect_success TTY 'git config --get-urlmatch defaults to paging' '
rm -f paginated.out &&
test_terminal git -c http."https://foo.com/".bar=foo \
  config --get-urlmatch http https://foo.com &&
-   ! test -e paginated.out
+   test -e paginated.out
 '
 
 test_expect_success TTY 'git config --get-all respects pager.config' '
rm -f paginated.out &&
-   test_terminal git -c pager.config config --get-all foo.bar &&
-   test -e paginated.out
+   test_terminal git -c pager.config=false config --get-all foo.bar &&
+   ! test -e paginated.out
 '
 
-test_expect_success TTY 'git config --list defaults to not paging' '
+test_expect_success TTY 'git config --list defaults to paging' '
rm -f paginated.out &&
test_terminal git config --list &&
-   ! test -e paginated.out
+   test -e paginated.out
 '
 
 
diff --git a/builtin/config.c b/builtin/config.c
index a732d9b56c..01169dd628 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -602,7 +602,7 @@ int cmd_config(int argc, const char **argv, const char 
*prefix)
}
 
if (actions & PAGING_ACTIONS)
-   setup_auto_pager("config", 0);
+   setup_auto_pager("config", 1);
 
if (actions == ACTION_LIST) {
check_argc(argc, 0, 0);
-- 
2.16.2.246.ga4ee8f

Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 09:13:22AM -0500, Derrick Stolee wrote:

> > So there it is. It does show commits multiple times, but suppresses the
> > verbose header after the first showing. If we do something like this:
> > 
> >git rev-list --show-all --pretty --boundary c93150cfb0^-
> > 
> > you'll see some boundary commits that _don't_ have their pretty headers
> > shown. And with your proposed patch, we'd show them again. To keep the
> > same behavior we need to store that "we've already seen this" boolean
> > somewhere else (e.g., in an object flag; possibly SEEN, but that might
> > run afoul of other logic).
> 
> What confuses me about this behavior is that the OID is still shown on the
> repeat (and in the case of `git log --oneline` will not actually have a line
> break between two short-OIDs). I don't believe this behavior is something to
> preserve.

I think that repeating the oid is intentional; the point is to dump how
the traversal code is hitting the endpoints, even if we do so multiple
times.

The --oneline behavior just looks like a bug. I think --format is broken
with --show-all, too (it does not show anything!).

> Unless I am misunderstanding, the current behavior on a repeated commit is
> already incorrect: some amount of output occurs before checking the buffer,
> so the output includes repeated records but with formatting that violates
> the expectation. By doing the simple change of swapping
> get_cached_commit_buffer() with get_commit_buffer(), we correct that format
> violation but have duplicate copies.

Yeah, I'd agree with that assessment.

> The most-correct thing to do (in my opinion) is to put the requirement of
> "no repeats" into the revision walk logic and stop having the formatting
> methods expect them. Then, however we change this boolean setting of "we
> have seen this before" it will not require the formatting methods to change.

But then you wouldn't show repeats at all. If I'm understanding you
correctly.

TBH, I do not think it is worth spending a lot of effort on this
--show-all feature. It seems mostly like forgotten debugging cruft to
me. That's why I'd be OK with showing the whole header as the simplest
fix (i.e., just removing those calls entirely, not even converting them
to get_commit_buffer).

> I can start working on a patch to move the duplicate-removal logic into
> revision.c instead of these three callers:
> 
> builtin/rev-list.c: if (revs->verbose_header &&
> get_cached_commit_buffer(commit, NULL)) {
> log-tree.c: if (!get_cached_commit_buffer(commit, NULL))
> object.c:   if (!get_cached_commit_buffer(commit, NULL))

Those first two are duplicate detection. The third one in object.c
should stay, though. We've been fed a commit buffer to parse, and we
want to know whether we should attach it as the cached buffer for that
commit. But if we already have a cached buffer, there's no point in
doing so. And that's what we're checking there.

Though I think it would be equally correct to have set_commit_buffer()
just throw away the existing cache entry and replace it with this one. I
don't think there's a real reason to prefer the old to the new. And that
might be worth doing if it would let us drop get_cached_commit_buffer()
as a public function. But...

> But this caller seems pretty important in pretty.c:
> 
>     /*
>  * Otherwise, we still want to munge the encoding header in the
>  * result, which will be done by modifying the buffer. If we
>  * are using a fresh copy, we can reuse it. But if we are using
>  * the cached copy from get_commit_buffer, we need to duplicate it
>  * to avoid munging the cached copy.
>  */
>     if (msg == get_cached_commit_buffer(commit, NULL))
>     out = xstrdup(msg);
>     else
>     out = (char *)msg

Like the one in object.c, this really does want to know about the cached
entry. And it should be unaffected by your patch, since we will have
called get_commit_buffer() at the top of that function.

If we wanted to write this one without get_cached_commit_buffer(), we'd
really need a function to ask "did this pointer come from the cache, or
was it freshly allocated?". That's the same thing we do for
unuse_commit_buffer(). So in theory we could have a boolean function
that would check that, and that would let us make
get_cached_commit_buffer() private.

In my opinion it's not really worth trying to make it private. The
confusion you're fixing in the first two calls is not due to a bad API,
but due to some subtly confusing logic in that code's use of the API. ;)

So I'd probably do this:

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index d94062bc84..3af56921c8 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -150,7 +150,7 @@ static void show_commit(struct commit *commit, void *data)
else
putchar('\n');

-   if (revs->verbose_header && get_cached_commit_buffer(commit,

Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 01:48:11PM -0500, Jeff King wrote:

> > What confuses me about this behavior is that the OID is still shown on the
> > repeat (and in the case of `git log --oneline` will not actually have a line
> > break between two short-OIDs). I don't believe this behavior is something to
> > preserve.
> 
> I think that repeating the oid is intentional; the point is to dump how
> the traversal code is hitting the endpoints, even if we do so multiple
> times.
> 
> The --oneline behavior just looks like a bug. I think --format is broken
> with --show-all, too (it does not show anything!).

I poked at one of the examples a little more closely. I actually think
these are not repeats, but simply UNINTERESTING parents that we never
needed to look at in our traversal (because we hit a point where
everything was UNINTERESTING).

So we are relying not on finish_commit() to have freed the buffer, but
on the traversal code to have never parsed those commits in the first
place. Which is doubly subtle.

I think the rest of my email stands, though: we should just show the
full headers for those commits.

-Peff

Re: [PATCH v4 05/13] commit-graph: implement 'git-commit-graph write'

2018-02-21 Thread Junio C Hamano

Derrick Stolee  writes:

> +static int graph_write(int argc, const char **argv)
> +{
> + ...
> + graph_name = write_commit_graph(opts.obj_dir);
> +
> + if (graph_name) {
> + printf("%s\n", graph_name);
> + FREE_AND_NULL(graph_name);
> + }
> +
> + return 0;
> +}

After successfully writing a graph file out, write_commit_graph()
signals that fact by returning a non-NULL pointer, so that this
caller can report the filename to the end user.  This caller
protects itself from a NULL return, presumably because the callee
uses it to signal an error when writing the graph file out?  

Is it OK to lose that 1-bit of information, or should we have more like

if (graph_name) {
printf;
return 0;
} else {
return -1;
}

>  int cmd_commit_graph(int argc, const char **argv, const char *prefix)
>  {
>   static struct option builtin_commit_graph_options[] = {
> - { OPTION_STRING, 'p', "object-dir", _dir,
> + { OPTION_STRING, 'o', "object-dir", _dir,
>   N_("dir"),
>   N_("The object directory to store the graph") },
>   OPT_END(),

The same comment for a no-op patch from an earlier step applies
here, and we have another one that we saw above in graph_write().

> @@ -31,6 +67,11 @@ int cmd_commit_graph(int argc, const char **argv, const 
> char *prefix)
>builtin_commit_graph_usage,
>PARSE_OPT_STOP_AT_NON_OPTION);
>  
> + if (argc > 0) {
> + if (!strcmp(argv[0], "write"))
> + return graph_write(argc, argv);

And if we fix "graph_write" to report an error with negative return,
this needs to become something like

return !!graph_write(argc, argv);

as we do not want to return a negative value to be passed via
run_builtin() to exit(3) in handle_builtin().

> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> new file mode 100755
> index 000..6a5e93c
> --- /dev/null
> +++ b/t/t5318-commit-graph.sh
> @@ -0,0 +1,119 @@
> +#!/bin/sh
> +
> +test_description='commit graph'
> +. ./test-lib.sh
> +
> +test_expect_success 'setup full repo' '
> + rm -rf .git &&

I am perfectly OK with creating a separate subdirectory called
'full' in the trash directory given by the test framework, but
unless absolutely necessary I'd rather not to see "rm -rf", 
especially on ".git", in our test scripts.  People can screw up
doing various things (like copying and pasting).

> + mkdir full &&
> + cd full &&
> + git init &&
> + objdir=".git/objects"
> +'

And I absolutely do not want to see "cd full" that leaves and stays
in the subdirectory after this step is done.  

Imagine what happens if any earlier step fails before doing "cd
full", causing this "setup full" step to report failure, and then
the test goes on to the next step?  We will not be in "full" and
worse yet because we do not have "$TRASH_DIRECTORY/.git" (you
removed it), the "git commit-graph write --object-dir" command we
end up doing next will see the git source repository as the
repository it is working on.  Never risk trashing our source
repository with your test.  That is why we give you $TRASH_DIRECTORY
to play in.  Make use of it when you can.

I'd make this step just a single

git init full

and then the next one

git -C full commit-graph write --object-dir .

In later tests that have multi-step things, I'd instead make them

(
cd full &&
... whatever you do  &&
... in that separate  &&
... 'full' repository
)

if I were writing this test *and* if I really wanted to do things
inside $TRASH_DIRECTORY/full/.git repository.  I am not convinced
yet about the latter.  I know that it will make certain things
simpler to use a separate /full hierarchy (e.g. cleaning up, having
another unrelated test repository, etc.) while making other things
more cumbersome (e.g. you need to be careful when you "cd" and the
easiest way to do so is to ( do things in a subshell )).  I just do
not know what the trade-off would look like in this particular case.

A simple rule of thumb I try to follow is not to change $(pwd) for
the process that runs these test_expect_success shell functions.

> +
> +test_expect_success 'write graph with no packs' '
> + git commit-graph write --object-dir .
> +'
> +
> +test_expect_success 'create commits and repack' '
> + for i in $(test_seq 3)
> + do
> + test_commit $i &&
> + git branch commits/$i
> + done &&
> + git repack
> +'

Re: [PATCH v4 02/13] graph: add commit graph design document

2018-02-21 Thread Stefan Beller

> +[3] 
> https://public-inbox.org/git/20170907094718.b6kuzp2uhvkmw...@sigill.intra.peff.net/t/#m7a2ea7b355aeda962e6b86404bcbadc648abfbba
> +More discussion about generation numbers and not storing them inside
> +commit objects. A valuable quote:

Unlike the other public inbox links this links to a discussion with
all messages on one page,
https://public-inbox.org/git/20170908034739.4op3w4f2ma5s6...@sigill.intra.peff.net/
would
have this be more inline with the other links. (this is a super small
nit, which I am not sure if
we care about at all; the rest of the doc is awesome!)

bug in HTTP protocol spec

2018-02-21 Thread Dorian Taylor

Hello list,

I had been banging my head all morning trying to figure out why I couldn’t get 
a little HTTP implementation to clone/push via the smart protocol (just 
wrapping git-receive-pack/git-upload-pack). I kept getting the following 
(likely familiar to some) error:

```
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
```

I didn’t get an insight until I ran with GIT_TRACE_PACKET=true on a known-good 
remote (i.e. GitHub), that the null packet-line `` has to follow the 
service line. This is not reflected in the example here:

https://github.com/git/git/blob/6464679d9620d91b639e2681b9cc6473f3856d09/Documentation/technical/http-protocol.txt#L216

It is also not reflected in the BNF:

https://github.com/git/git/blob/6464679d9620d91b639e2681b9cc6473f3856d09/Documentation/technical/http-protocol.txt#L279

(Note these links are from the most recent commit of this file as of this 
writing.)

Just thought somebody would like to know.

Regards,

--
Dorian Taylor
Make things. Make sense.
https://doriantaylor.com

Re: [PATCH v4 01/13] commit-graph: add format document

2018-02-21 Thread Stefan Beller

>>
>> [ so in small repos, where there are fewer than 256 objects,
>> F[i] == F[i+1], for all i'th where there is no object starting with i
>> byte]
>
>
> Correct. I'm not sure this additional information is valuable for the
> document, though.

It is not, I was just making sure I'd understand correctly.

Re: [PATCH] submodule: indicate that 'submodule.recurse' doesn't apply to clone

2018-02-21 Thread Junio C Hamano

Brandon Williams  writes:

> Update the documentation for the 'submodule.recurse' config to identify
> that the clone command does not respect it.
>
> Signed-off-by: Brandon Williams 
> ---
>  Documentation/config.txt | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Thanks.

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index f57e9cf10..59ff29d7a 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -3210,7 +3210,8 @@ submodule.active::
>  
>  submodule.recurse::
>   Specifies if commands recurse into submodules by default. This
> - applies to all commands that have a `--recurse-submodules` option.
> + applies to all commands that have a `--recurse-submodules` option,
> + except `clone`.
>   Defaults to false.
>  
>  submodule.fetchJobs::

Re: [PATCH v4 06/13] commit-graph: implement git commit-graph read

2018-02-21 Thread Junio C Hamano

Derrick Stolee  writes:

> +'read'::
> +
> +Read a graph file given by the graph-head file and output basic
> +details about the graph file.
> ++
> +With `--file=` option, consider the graph stored in the file at
> +the path  /info/.
> +

A sample reader confusion after reading the above twice:

What is "the graph-head file" and how does the user specify it?  Is
it given by  the value for the "--file=" command line option?

Another sample reader reaction after reading the above:

What are the kind of "basic details" we can learn from this
command is unclear, but perhaps there is an example to help me
decide if this command is worth studying.

> @@ -44,6 +53,12 @@ EXAMPLES
>  $ git commit-graph write
>  
>  
> +* Read basic information from a graph file.
> ++
> +
> +$ git commit-graph read --file=
> +
> +

And the sample reader is utterly disappointed at this point.

> +static int graph_read(int argc, const char **argv)
> +{
> + struct commit_graph *graph = 0;
> + struct strbuf full_path = STRBUF_INIT;
> +
> + static struct option builtin_commit_graph_read_options[] = {
> + { OPTION_STRING, 'o', "object-dir", _dir,
> + N_("dir"),
> + N_("The object directory to store the graph") },
> + { OPTION_STRING, 'H', "file", _file,
> + N_("file"),
> + N_("The filename for a specific commit graph file in 
> the object directory."),
> + PARSE_OPT_OPTARG, NULL, (intptr_t) "" },
> + OPT_END(),
> + };

The same comment as all the previous ones apply, wrt short options
and non-use of OPT_STRING().

Also, I suspect that these two would want to use OPT_FILENAME
instead, if we anticipate that the command might want to be
sometimes run from a subdirectory.  Otherwise wouldn't

cd t && git commit-graph read --file=../.git/object/info/$whatever

end up referring to a wrong place because the code that uses the
value obtained from OPTION_STRING does not do the equivalent of
parse-options.c::fix_filename()?  The same applies to object-dir
handling.

> + argc = parse_options(argc, argv, NULL,
> +  builtin_commit_graph_read_options,
> +  builtin_commit_graph_read_usage, 0);
> +
> + if (!opts.obj_dir)
> + opts.obj_dir = get_object_directory();
> +
> + if (!opts.graph_file)
> + die("no graph hash specified");
> +
> + strbuf_addf(_path, "%s/info/%s", opts.obj_dir, opts.graph_file);

Ahh, I was fooled by a misnamed option.  --file does *not* name the
file.  It is a filename in a fixed place that is determined by other
things.

So it would be a mistake to use OPT_FILENAME() in the parser for
that misnamed "--file" option.  The parser for --object-dir still
would want to be OPT_FILENAME(), but quite honestly, I do not see
the point of having --object-dir option in the first place.  The
graph file is not relative to it but is forced to have /info/ in
between that directory and the filename, so it is not like the user
gets useful flexibility out of being able to specify two different
places using --object-dir= option and $GIT_OBJECT_DIRECTORY
environment (iow, a caller that wants to work on a specific object
directory can use the environment, which is how it would tell any
other Git subcommand which object store it wants to work with).

But stepping back a bit, I think the way --file argument is defined
is halfway off from two possible more useful ways to define it.  If
it were just "path to the file" (iow, what OPT_FILENAME() is suited
for parsing it), then a user could say "I have this graph file that
I created for testing, it is not installed in its usual place in
$GIT_OBJECT_DIRECTORY/info/ at all, but I want you to read it
because I am debugging".  That is one possible useful extreme.  The
other possibility would be to allow *only* the hash part to be
specified, iow, not just forcing /info/ relative to object
directory, you would force the "graph-" prefix and ".graph" suffix.
That would be the other extreme that is useful (less typing and less
error prone).

For a low-level command line this, my gut feeling is that it would
be better to allow paths to the object directory and the graph file
to be totally independently specified.

> + if (graph_signature != GRAPH_SIGNATURE) {
> + munmap(graph_map, graph_size);
> + close(fd);
> + die("graph signature %X does not match signature %X",
> + graph_signature, GRAPH_SIGNATURE);
> + }
> +
> + graph_version = *(unsigned char*)(data + 4);
> + if (graph_version != GRAPH_VERSION) {
> + munmap(graph_map, graph_size);
> + close(fd);
> + die("graph version %X does not match

[PATCH v2 1/3] t7006: add tests for how git config paginates

2018-02-21 Thread Martin Ågren

The next couple of commits will change how `git config` handles
`pager.config`, similar to how de121ffe5 (tag: respect `pager.tag` in
list-mode only, 2017-08-02) and ff1e72483 (tag: change default of
`pager.tag` to "on", 2017-08-02) changed `git tag`. Similar work has
also been done to `git branch`.

Add tests in this area to make sure that we don't regress and so that
the upcoming commits can be made clearer by adapting the tests. Add
tests for simple config-setting, `--edit`, `--get`, `--get-urlmatch`,
`get-all`, and `--list`. Those represent a fair portion of the various
options that will be affected by the next two commits.

Use `test_expect_failure` to document that we currently respect the
pager-configuration with `--edit`. The current behavior is buggy since
the pager interferes with the editor and makes the end result completely
broken. See also b3ee740c8 (t7006: add tests for how git tag paginates,
2017-08-02).

The next commit will teach simple config-setting and `--get` to ignore
`pager.config`. Test the current behavior as "success", not "failure",
since the currently expected behavior according to documentation would
be to page. The next commit will change that expectation by updating the
documentation on `git config` and will redefine those successful tests.

Remove the test added in commit 3ba7e6e29a (config: run
setup_git_directory_gently() sooner, 2010-08-05) since it has some
overlap with these. We could leave it or tweak it, or place new tests
like these next to it, but let's instead make the tests for `git config`
as similar as possible to the ones for `git tag` and `git branch`, and
place them after those.

Signed-off-by: Martin Ågren 
---
 t/t7006-pager.sh | 49 ++---
 1 file changed, 42 insertions(+), 7 deletions(-)

diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index f5f46a95b4..a46a079339 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -110,13 +110,6 @@ test_expect_success TTY 'configuration can disable pager' '
! test -e paginated.out
 '
 
-test_expect_success TTY 'git config uses a pager if configured to' '
-   rm -f paginated.out &&
-   test_config pager.config true &&
-   test_terminal git config --list &&
-   test -e paginated.out
-'
-
 test_expect_success TTY 'configuration can enable pager (from subdir)' '
rm -f paginated.out &&
mkdir -p subdir &&
@@ -252,6 +245,48 @@ test_expect_success TTY 'git branch --set-upstream-to 
ignores pager.branch' '
! test -e paginated.out
 '
 
+test_expect_success TTY 'git config respects pager.config when setting' '
+   rm -f paginated.out &&
+   test_terminal git -c pager.config config foo.bar bar &&
+   test -e paginated.out
+'
+
+test_expect_failure TTY 'git config --edit ignores pager.config' '
+   rm -f paginated.out editor.used &&
+   write_script editor <<-\EOF &&
+   touch editor.used
+   EOF
+   EDITOR=./editor test_terminal git -c pager.config config --edit &&
+   ! test -e paginated.out &&
+   test -e editor.used
+'
+
+test_expect_success TTY 'git config --get respects pager.config' '
+   rm -f paginated.out &&
+   test_terminal git -c pager.config config --get foo.bar &&
+   test -e paginated.out
+'
+
+test_expect_success TTY 'git config --get-urlmatch defaults to not paging' '
+   rm -f paginated.out &&
+   test_terminal git -c http."https://foo.com/".bar=foo \
+ config --get-urlmatch http https://foo.com &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git config --get-all respects pager.config' '
+   rm -f paginated.out &&
+   test_terminal git -c pager.config config --get-all foo.bar &&
+   test -e paginated.out
+'
+
+test_expect_success TTY 'git config --list defaults to not paging' '
+   rm -f paginated.out &&
+   test_terminal git config --list &&
+   ! test -e paginated.out
+'
+
+
 # A colored commit log will begin with an appropriate ANSI escape
 # for the first color; the text "commit" comes later.
 colorful() {
-- 
2.16.2.246.ga4ee8f

Re: [PATCH v4 01/13] commit-graph: add format document

2018-02-21 Thread Derrick Stolee


On 2/21/2018 2:23 PM, Stefan Beller wrote:

On Mon, Feb 19, 2018 at 10:53 AM, Derrick Stolee  wrote:

+In order to allow extensions that add extra data to the graph, we organize
+the body into "chunks" and provide a binary lookup table at the beginning
+of the body. The header includes certain values, such as number of chunks,
+hash lengths and types.
+
+All 4-byte numbers are in network order.
+
+HEADER:
+
+  4-byte signature:
+  The signature is: {'C', 'G', 'P', 'H'}
+
+  1-byte version number:
+  Currently, the only valid version is 1.
+
+  1-byte Object Id Version (1 = SHA-1)
+
+  1-byte number (C) of "chunks"
+
+  1-byte (reserved for later use)

What should clients of today do with it?
* ignore it completely [as they have no idea what it is] or
* throw hands up in the air if it is anything other than 0 ?
   [because clearly we will increment the version
or have new information in a new chunk instead of just sneaking
in information here?]


They should ignore it completely, which will allow using the value for 
something meaningful later without causing a version change (which we DO 
die() for). A user could downgrade from a version that uses this byte 
for something meaningful and not require a new commit-graph file.


The "commit-graph read" subcommand does output this byte, so we can 
verify that the "write" subcommand places a 0 in this position.





+CHUNK LOOKUP:
+
+  (C + 1) * 12 bytes listing the table of contents for the chunks:
+  First 4 bytes describe chunk id. Value 0 is a terminating label.
+  Other 8 bytes provide offset in current file for chunk to start.

offset [in bytes? I could imagine having a larger granularity here,
because chunks don't sound small.]


It is good to specify "offset in bytes".




+  (Chunks are ordered contiguously in the file, so you can infer
+  the length using the next chunk position if necessary.)
+
+  The remaining data in the body is described one chunk at a time, and
+  these chunks may be given in any order. Chunks are required unless
+  otherwise specified.
+
+CHUNK DATA:
+
+  OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
+  The ith entry, F[i], stores the number of OIDs with first
+  byte at most i. Thus F[255] stores the total
+  number of commits (N).

[ so in small repos, where there are fewer than 256 objects,
F[i] == F[i+1], for all i'th where there is no object starting with i byte]


Correct. I'm not sure this additional information is valuable for the 
document, though.





+  OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
+  The OIDs for all commits in the graph, sorted in ascending order.
+
+  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
+* The first H bytes are for the OID of the root tree.
+* The next 8 bytes are for the int-ids of the first two parents
+  of the ith commit. Stores value 0x if no parent in that
+  position. If there are more than two parents, the second value
+  has its most-significant bit on and the other bits store an array
+  position into the Large Edge List chunk.
+* The next 8 bytes store the generation number of the commit and
+  the commit time in seconds since EPOCH. The generation number
+  uses the higher 30 bits of the first 4 bytes, while the commit
+  time uses the 32 bits of the second 4 bytes, along with the lowest
+  2 bits of the lowest byte, storing the 33rd and 34th bit of the
+  commit time.
+
+  Large Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
+  This list of 4-byte values store the second through nth parents for
+  all octopus merges. The second parent value in the commit data stores
+  an array position within this list along with the most-significant bit
+  on. Starting at that array position, iterate through this list of int-ids
+  for the parents until reaching a value with the most-significant bit on.
+  The other bits correspond to the int-id of the last parent.
+
+TRAILER:
+
+   H-byte HASH-checksum of all of the above.
+
--
2.7.4

Makes sense so far, I'll read on.
I agree with Junio, that I could read this documentation without
the urge to point out nits. :)

Thanks,
Stefan

[PATCH v2 0/3] Re: t7006: add tests for how git config paginates

2018-02-21 Thread Martin Ågren

This is v2 of my series to teach `git config` to only respect
`pager.config` when listing configuration, then changing the default to
"on". Thanks to Duy and Junio for feedback on the first version.

Based on Duy's feeback, I've changed the approach to more carefully
divide the various getters into "may produce multiple lines, so let's
page" vs "may not, so don't".

Junio hesitated whether we should add tests using `test_expect_success`,
then flip the test-definition, or whether we should start with a
"failure" that we then flip to "success". I have not done anything about
that, except to try and motivate the choice better in the commit message
of the second patch.

Martin

Martin Ågren (3):
  t7006: add tests for how git config paginates
  config: respect `pager.config` in list/get-mode only
  config: change default of `pager.config` to "on"

 Documentation/git-config.txt |  6 ++
 t/t7006-pager.sh | 49 +---
 builtin/config.c | 10 +
 git.c|  2 +-
 4 files changed, 59 insertions(+), 8 deletions(-)

-- 
2.16.2.246.ga4ee8f

Re: [PATCH v3 00/35] protocol version 2

2018-02-21 Thread Brandon Williams

On 02/06, Brandon Williams wrote:
> Changes in v3:
>  * There were some comments about how the protocol should be designed
>stateless first.  I've made this change and instead of having to
>supply the `stateless-rpc=true` capability to force stateless
>behavior, the protocol just requires all commands to be stateless.
>  
>  * Added some patches towards the end of the series to force the client
>to not request to use protocol v2 when pushing (even if configured to
>use v2).  This is to ease the roll-out process of a push command in
>protocol v2.  This way when servers gain the ability to accept
>pushing in v2 (and they start responding using v2 when requests are
>sent to the git-receive-pack endpoint) that clients who still don't
>understand how to push using v2 won't request to use v2 and then die
>when they recognize that the server does indeed know how to accept a
>push under v2.
> 
>  * I implemented the `shallow` feature for fetch.  This feature
>encapsulates the existing functionality of all the shallow/deepen
>capabilities in v0.  So now a server can process shallow requests.
> 
>  * Various other small tweaks that I can't remember :)
> 
> After all of that I think the series is in a pretty good state, baring
> any more critical reviewing feedback.
> 
> Thanks!

I'm hoping to get some more in depth review before I do any more
re-rolls, but for those interested I will need to do a re-roll to
eliminate the prelude from the http transport.  This is the prelude
which includes the service line followed by any number of packet lines
culminating in a flush-pkt like so:

  # service=git-upload-pack
  some
  other
  optional
  lines
  

With this eliminated all transports will be exactly the same, the only
difference will be how the protocol is tunneled.

> 
> Brandon Williams (35):
>   pkt-line: introduce packet_read_with_status
>   pkt-line: introduce struct packet_reader
>   pkt-line: add delim packet support
>   upload-pack: convert to a builtin
>   upload-pack: factor out processing lines
>   transport: use get_refs_via_connect to get refs
>   connect: convert get_remote_heads to use struct packet_reader
>   connect: discover protocol version outside of get_remote_heads
>   transport: store protocol version
>   protocol: introduce enum protocol_version value protocol_v2
>   test-pkt-line: introduce a packet-line test helper
>   serve: introduce git-serve
>   ls-refs: introduce ls-refs server command
>   connect: request remote refs using v2
>   transport: convert get_refs_list to take a list of ref patterns
>   transport: convert transport_get_remote_refs to take a list of ref
> patterns
>   ls-remote: pass ref patterns when requesting a remote's refs
>   fetch: pass ref patterns when fetching
>   push: pass ref patterns when pushing
>   upload-pack: introduce fetch server command
>   fetch-pack: perform a fetch using v2
>   upload-pack: support shallow requests
>   fetch-pack: support shallow requests
>   connect: refactor git_connect to only get the protocol version once
>   connect: don't request v2 when pushing
>   transport-helper: remove name parameter
>   transport-helper: refactor process_connect_service
>   transport-helper: introduce stateless-connect
>   pkt-line: add packet_buf_write_len function
>   remote-curl: create copy of the service name
>   remote-curl: store the protocol version the server responded with
>   http: allow providing extra headers for http requests
>   http: don't always add Git-Protocol header
>   remote-curl: implement stateless-connect command
>   remote-curl: don't request v2 when pushing
> 
>  .gitignore  |   1 +
>  Documentation/technical/protocol-v2.txt | 338 +
>  Makefile|   7 +-
>  builtin.h   |   2 +
>  builtin/clone.c |   2 +-
>  builtin/fetch-pack.c|  21 +-
>  builtin/fetch.c |  14 +-
>  builtin/ls-remote.c |   7 +-
>  builtin/receive-pack.c  |   6 +
>  builtin/remote.c|   2 +-
>  builtin/send-pack.c |  20 +-
>  builtin/serve.c |  30 ++
>  builtin/upload-pack.c   |  74 
>  connect.c   | 352 +-
>  connect.h   |   7 +
>  fetch-pack.c| 319 +++-
>  fetch-pack.h|   4 +-
>  git.c   |   2 +
>  http.c  |  25 +-
>  http.h  |   2 +
>  ls-refs.c   |  96 +
>  ls-refs.h   |   9 +
>  pkt-line.c  | 149 +++-
>  pkt-line.h  |  77 
>  protocol.c

Re: [PATCH v4 01/13] commit-graph: add format document

2018-02-21 Thread Stefan Beller

On Mon, Feb 19, 2018 at 10:53 AM, Derrick Stolee  wrote:
> Add document specifying the binary format for commit graphs. This
> format allows for:
>
> * New versions.
> * New hash functions and hash lengths.
> * Optional extensions.
>
> Basic header information is followed by a binary table of contents
> into "chunks" that include:
>
> * An ordered list of commit object IDs.
> * A 256-entry fanout into that list of OIDs.
> * A list of metadata for the commits.
> * A list of "large edges" to enable octopus merges.
>
> The format automatically includes two parent positions for every
> commit. This favors speed over space, since using only one position
> per commit would cause an extra level of indirection for every merge
> commit. (Octopus merges suffer from this indirection, but they are
> very rare.)
>
> Signed-off-by: Derrick Stolee 
> ---
>  Documentation/technical/commit-graph-format.txt | 90 
> +
>  1 file changed, 90 insertions(+)
>  create mode 100644 Documentation/technical/commit-graph-format.txt
>
> diff --git a/Documentation/technical/commit-graph-format.txt 
> b/Documentation/technical/commit-graph-format.txt
> new file mode 100644
> index 000..11b18b5
> --- /dev/null
> +++ b/Documentation/technical/commit-graph-format.txt
> @@ -0,0 +1,90 @@
> +Git commit graph format
> +===
> +
> +The Git commit graph stores a list of commit OIDs and some associated
> +metadata, including:
> +
> +- The generation number of the commit. Commits with no parents have
> +  generation number 1; commits with parents have generation number
> +  one more than the maximum generation number of its parents. We
> +  reserve zero as special, and can be used to mark a generation
> +  number invalid or as "not computed".
> +
> +- The root tree OID.
> +
> +- The commit date.
> +
> +- The parents of the commit, stored using positional references within
> +  the graph file.
> +
> +== graph-*.graph files have the following format:
> +
> +In order to allow extensions that add extra data to the graph, we organize
> +the body into "chunks" and provide a binary lookup table at the beginning
> +of the body. The header includes certain values, such as number of chunks,
> +hash lengths and types.
> +
> +All 4-byte numbers are in network order.
> +
> +HEADER:
> +
> +  4-byte signature:
> +  The signature is: {'C', 'G', 'P', 'H'}
> +
> +  1-byte version number:
> +  Currently, the only valid version is 1.
> +
> +  1-byte Object Id Version (1 = SHA-1)
> +
> +  1-byte number (C) of "chunks"
> +
> +  1-byte (reserved for later use)

What should clients of today do with it?
* ignore it completely [as they have no idea what it is] or
* throw hands up in the air if it is anything other than 0 ?
  [because clearly we will increment the version
   or have new information in a new chunk instead of just sneaking
   in information here?]

> +CHUNK LOOKUP:
> +
> +  (C + 1) * 12 bytes listing the table of contents for the chunks:
> +  First 4 bytes describe chunk id. Value 0 is a terminating label.
> +  Other 8 bytes provide offset in current file for chunk to start.

offset [in bytes? I could imagine having a larger granularity here,
because chunks don't sound small.]

> +  (Chunks are ordered contiguously in the file, so you can infer
> +  the length using the next chunk position if necessary.)
> +
> +  The remaining data in the body is described one chunk at a time, and
> +  these chunks may be given in any order. Chunks are required unless
> +  otherwise specified.
> +
> +CHUNK DATA:
> +
> +  OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
> +  The ith entry, F[i], stores the number of OIDs with first
> +  byte at most i. Thus F[255] stores the total
> +  number of commits (N).

[ so in small repos, where there are fewer than 256 objects,
F[i] == F[i+1], for all i'th where there is no object starting with i byte]

> +  OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
> +  The OIDs for all commits in the graph, sorted in ascending order.
> +
> +  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
> +* The first H bytes are for the OID of the root tree.
> +* The next 8 bytes are for the int-ids of the first two parents
> +  of the ith commit. Stores value 0x if no parent in that
> +  position. If there are more than two parents, the second value
> +  has its most-significant bit on and the other bits store an array
> +  position into the Large Edge List chunk.
> +* The next 8 bytes store the generation number of the commit and
> +  the commit time in seconds since EPOCH. The generation number
> +  uses the higher 30 bits of the first 4 bytes, while the commit
> +  time uses the 32 bits of the second 4 bytes, along with the lowest
> +  2 bits of the lowest byte, storing the 33rd and 34th bit of the
> +  commit time.
> +
> +  Large Edge List (ID: {'E',

[PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Derrick Stolee

The get_cached_commit_buffer() method provides access to the buffer
loaded for a struct commit, if it was ever loadead and was not freed.

Two places use this to inform how to output information about commits.

log-tree.c uses this method to short-circuit the output of commit
information when the buffer is not cached. However, this leads to
incorrect output in 'git log --oneline' where the short-OID is written
but then the rest of the commit information is dropped and the next
commit is written on the same line.

rev-list uses this method for two reasons:

- First, if the revision walk visits a commit twice, the buffer was
  freed by rev-list in the first write. The output then does not
  match the format expectations, since the OID is written without the
  rest of the content.

- Second, if the revision walk visits a commit that was marked
  UNINTERESTING, the walk may never load a buffer and hence rev-list
  will not output the verbose information.

These behaviors are undocumented, untested, and unlikely to be
expected by users or other software attempting to parse this output.

Helped-by: Jeff King 
Signed-off-by: Derrick Stolee 
---
 builtin/rev-list.c | 2 +-
 log-tree.c | 3 ---
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 48300d9..d320b6f 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -134,7 +134,7 @@ static void show_commit(struct commit *commit, void *data)
else
putchar('\n');
 
-   if (revs->verbose_header && get_cached_commit_buffer(commit, NULL)) {
+   if (revs->verbose_header) {
struct strbuf buf = STRBUF_INIT;
struct pretty_print_context ctx = {0};
ctx.abbrev = revs->abbrev;
diff --git a/log-tree.c b/log-tree.c
index fc0cc0d..22b2fb6 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -659,9 +659,6 @@ void show_log(struct rev_info *opt)
show_mergetag(opt, commit);
}
 
-   if (!get_cached_commit_buffer(commit, NULL))
-   return;
-
if (opt->show_notes) {
int raw;
struct strbuf notebuf = STRBUF_INIT;
-- 
2.7.4

Re: [PATCH v4 03/13] commit-graph: create git-commit-graph builtin

2018-02-21 Thread Junio C Hamano

Junio C Hamano  writes:

> Derrick Stolee  writes:
>
>> +int cmd_commit_graph(int argc, const char **argv, const char *prefix)
>> +{
>> +static struct option builtin_commit_graph_options[] = {
>> +{ OPTION_STRING, 'p', "object-dir", _dir,
>> +N_("dir"),
>> +N_("The object directory to store the graph") },
>
> I have a suspicion that this was modeled after some other built-in
> that has a similar issue (perhaps written long time ago), but isn't
> OPT_STRING() sufficient to define this element these days?
>
> Or am I missing something?
>
> Why squat on short-and-sweet "-p"?  For that matter, since this is
> not expected to be end-user facing command anyway, I suspect that we
> do not want to allocate a single letter option from day one, which
> paints ourselves into a corner from where we cannot escape.

I suspect that exactly the same comment applies to patches in this
series that add other subcommands (I just saw one in the patch for
adding 'write').

1 2 >

1 - 100 of 119 matches

Mail list logo