Re: [PATCH v4 06/13] commit-graph: implement git commit-graph read

2018-02-21 Thread Junio C Hamano
Derrick Stolee  writes:

> +'read'::
> +
> +Read a graph file given by the graph-head file and output basic
> +details about the graph file.
> ++
> +With `--file=` option, consider the graph stored in the file at
> +the path  /info/.
> +

A sample reader confusion after reading the above twice:

What is "the graph-head file" and how does the user specify it?  Is
it given by  the value for the "--file=" command line option?

Another sample reader reaction after reading the above:

What are the kind of "basic details" we can learn from this
command is unclear, but perhaps there is an example to help me
decide if this command is worth studying.

> @@ -44,6 +53,12 @@ EXAMPLES
>  $ git commit-graph write
>  
>  
> +* Read basic information from a graph file.
> ++
> +
> +$ git commit-graph read --file=
> +
> +

And the sample reader is utterly disappointed at this point.

> +static int graph_read(int argc, const char **argv)
> +{
> + struct commit_graph *graph = 0;
> + struct strbuf full_path = STRBUF_INIT;
> +
> + static struct option builtin_commit_graph_read_options[] = {
> + { OPTION_STRING, 'o', "object-dir", _dir,
> + N_("dir"),
> + N_("The object directory to store the graph") },
> + { OPTION_STRING, 'H', "file", _file,
> + N_("file"),
> + N_("The filename for a specific commit graph file in 
> the object directory."),
> + PARSE_OPT_OPTARG, NULL, (intptr_t) "" },
> + OPT_END(),
> + };

The same comment as all the previous ones apply, wrt short options
and non-use of OPT_STRING().

Also, I suspect that these two would want to use OPT_FILENAME
instead, if we anticipate that the command might want to be
sometimes run from a subdirectory.  Otherwise wouldn't

cd t && git commit-graph read --file=../.git/object/info/$whatever

end up referring to a wrong place because the code that uses the
value obtained from OPTION_STRING does not do the equivalent of
parse-options.c::fix_filename()?  The same applies to object-dir
handling.

> + argc = parse_options(argc, argv, NULL,
> +  builtin_commit_graph_read_options,
> +  builtin_commit_graph_read_usage, 0);
> +
> + if (!opts.obj_dir)
> + opts.obj_dir = get_object_directory();
> +
> + if (!opts.graph_file)
> + die("no graph hash specified");
> +
> + strbuf_addf(_path, "%s/info/%s", opts.obj_dir, opts.graph_file);

Ahh, I was fooled by a misnamed option.  --file does *not* name the
file.  It is a filename in a fixed place that is determined by other
things.

So it would be a mistake to use OPT_FILENAME() in the parser for
that misnamed "--file" option.  The parser for --object-dir still
would want to be OPT_FILENAME(), but quite honestly, I do not see
the point of having --object-dir option in the first place.  The
graph file is not relative to it but is forced to have /info/ in
between that directory and the filename, so it is not like the user
gets useful flexibility out of being able to specify two different
places using --object-dir= option and $GIT_OBJECT_DIRECTORY
environment (iow, a caller that wants to work on a specific object
directory can use the environment, which is how it would tell any
other Git subcommand which object store it wants to work with).

But stepping back a bit, I think the way --file argument is defined
is halfway off from two possible more useful ways to define it.  If
it were just "path to the file" (iow, what OPT_FILENAME() is suited
for parsing it), then a user could say "I have this graph file that
I created for testing, it is not installed in its usual place in
$GIT_OBJECT_DIRECTORY/info/ at all, but I want you to read it
because I am debugging".  That is one possible useful extreme.  The
other possibility would be to allow *only* the hash part to be
specified, iow, not just forcing /info/ relative to object
directory, you would force the "graph-" prefix and ".graph" suffix.
That would be the other extreme that is useful (less typing and less
error prone).

For a low-level command line this, my gut feeling is that it would
be better to allow paths to the object directory and the graph file
to be totally independently specified.

> + if (graph_signature != GRAPH_SIGNATURE) {
> + munmap(graph_map, graph_size);
> + close(fd);
> + die("graph signature %X does not match signature %X",
> + graph_signature, GRAPH_SIGNATURE);
> + }
> +
> + graph_version = *(unsigned char*)(data + 4);
> + if (graph_version != GRAPH_VERSION) {
> + munmap(graph_map, graph_size);
> + close(fd);
> + die("graph version %X does not match 

Re: [PATCH] submodule: indicate that 'submodule.recurse' doesn't apply to clone

2018-02-21 Thread Junio C Hamano
Brandon Williams  writes:

> Update the documentation for the 'submodule.recurse' config to identify
> that the clone command does not respect it.
>
> Signed-off-by: Brandon Williams 
> ---
>  Documentation/config.txt | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

Thanks.

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index f57e9cf10..59ff29d7a 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -3210,7 +3210,8 @@ submodule.active::
>  
>  submodule.recurse::
>   Specifies if commands recurse into submodules by default. This
> - applies to all commands that have a `--recurse-submodules` option.
> + applies to all commands that have a `--recurse-submodules` option,
> + except `clone`.
>   Defaults to false.
>  
>  submodule.fetchJobs::


Re: [PATCH v4 01/13] commit-graph: add format document

2018-02-21 Thread Stefan Beller
>>
>> [ so in small repos, where there are fewer than 256 objects,
>> F[i] == F[i+1], for all i'th where there is no object starting with i
>> byte]
>
>
> Correct. I'm not sure this additional information is valuable for the
> document, though.

It is not, I was just making sure I'd understand correctly.


bug in HTTP protocol spec

2018-02-21 Thread Dorian Taylor
Hello list,

I had been banging my head all morning trying to figure out why I couldn’t get 
a little HTTP implementation to clone/push via the smart protocol (just 
wrapping git-receive-pack/git-upload-pack). I kept getting the following 
(likely familiar to some) error:

```
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
```

I didn’t get an insight until I ran with GIT_TRACE_PACKET=true on a known-good 
remote (i.e. GitHub), that the null packet-line `` has to follow the 
service line. This is not reflected in the example here:

https://github.com/git/git/blob/6464679d9620d91b639e2681b9cc6473f3856d09/Documentation/technical/http-protocol.txt#L216

It is also not reflected in the BNF:

https://github.com/git/git/blob/6464679d9620d91b639e2681b9cc6473f3856d09/Documentation/technical/http-protocol.txt#L279

(Note these links are from the most recent commit of this file as of this 
writing.)

Just thought somebody would like to know.

Regards,

--
Dorian Taylor
Make things. Make sense.
https://doriantaylor.com



Re: [PATCH v4 02/13] graph: add commit graph design document

2018-02-21 Thread Stefan Beller
> +[3] 
> https://public-inbox.org/git/20170907094718.b6kuzp2uhvkmw...@sigill.intra.peff.net/t/#m7a2ea7b355aeda962e6b86404bcbadc648abfbba
> +More discussion about generation numbers and not storing them inside
> +commit objects. A valuable quote:

Unlike the other public inbox links this links to a discussion with
all messages on one page,
https://public-inbox.org/git/20170908034739.4op3w4f2ma5s6...@sigill.intra.peff.net/
would
have this be more inline with the other links. (this is a super small
nit, which I am not sure if
we care about at all; the rest of the doc is awesome!)


Re: [PATCH v4 05/13] commit-graph: implement 'git-commit-graph write'

2018-02-21 Thread Junio C Hamano
Derrick Stolee  writes:

> +static int graph_write(int argc, const char **argv)
> +{
> + ...
> + graph_name = write_commit_graph(opts.obj_dir);
> +
> + if (graph_name) {
> + printf("%s\n", graph_name);
> + FREE_AND_NULL(graph_name);
> + }
> +
> + return 0;
> +}

After successfully writing a graph file out, write_commit_graph()
signals that fact by returning a non-NULL pointer, so that this
caller can report the filename to the end user.  This caller
protects itself from a NULL return, presumably because the callee
uses it to signal an error when writing the graph file out?  

Is it OK to lose that 1-bit of information, or should we have more like

if (graph_name) {
printf;
return 0;
} else {
return -1;
}

>  int cmd_commit_graph(int argc, const char **argv, const char *prefix)
>  {
>   static struct option builtin_commit_graph_options[] = {
> - { OPTION_STRING, 'p', "object-dir", _dir,
> + { OPTION_STRING, 'o', "object-dir", _dir,
>   N_("dir"),
>   N_("The object directory to store the graph") },
>   OPT_END(),

The same comment for a no-op patch from an earlier step applies
here, and we have another one that we saw above in graph_write().

> @@ -31,6 +67,11 @@ int cmd_commit_graph(int argc, const char **argv, const 
> char *prefix)
>builtin_commit_graph_usage,
>PARSE_OPT_STOP_AT_NON_OPTION);
>  
> + if (argc > 0) {
> + if (!strcmp(argv[0], "write"))
> + return graph_write(argc, argv);

And if we fix "graph_write" to report an error with negative return,
this needs to become something like

return !!graph_write(argc, argv);

as we do not want to return a negative value to be passed via
run_builtin() to exit(3) in handle_builtin().

> diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
> new file mode 100755
> index 000..6a5e93c
> --- /dev/null
> +++ b/t/t5318-commit-graph.sh
> @@ -0,0 +1,119 @@
> +#!/bin/sh
> +
> +test_description='commit graph'
> +. ./test-lib.sh
> +
> +test_expect_success 'setup full repo' '
> + rm -rf .git &&

I am perfectly OK with creating a separate subdirectory called
'full' in the trash directory given by the test framework, but
unless absolutely necessary I'd rather not to see "rm -rf", 
especially on ".git", in our test scripts.  People can screw up
doing various things (like copying and pasting).

> + mkdir full &&
> + cd full &&
> + git init &&
> + objdir=".git/objects"
> +'

And I absolutely do not want to see "cd full" that leaves and stays
in the subdirectory after this step is done.  

Imagine what happens if any earlier step fails before doing "cd
full", causing this "setup full" step to report failure, and then
the test goes on to the next step?  We will not be in "full" and
worse yet because we do not have "$TRASH_DIRECTORY/.git" (you
removed it), the "git commit-graph write --object-dir" command we
end up doing next will see the git source repository as the
repository it is working on.  Never risk trashing our source
repository with your test.  That is why we give you $TRASH_DIRECTORY
to play in.  Make use of it when you can.

I'd make this step just a single

git init full

and then the next one

git -C full commit-graph write --object-dir .

In later tests that have multi-step things, I'd instead make them

(
cd full &&
... whatever you do  &&
... in that separate  &&
... 'full' repository
)

if I were writing this test *and* if I really wanted to do things
inside $TRASH_DIRECTORY/full/.git repository.  I am not convinced
yet about the latter.  I know that it will make certain things
simpler to use a separate /full hierarchy (e.g. cleaning up, having
another unrelated test repository, etc.) while making other things
more cumbersome (e.g. you need to be careful when you "cd" and the
easiest way to do so is to ( do things in a subshell )).  I just do
not know what the trade-off would look like in this particular case.

A simple rule of thumb I try to follow is not to change $(pwd) for
the process that runs these test_expect_success shell functions.

> +
> +test_expect_success 'write graph with no packs' '
> + git commit-graph write --object-dir .
> +'
> +
> +test_expect_success 'create commits and repack' '
> + for i in $(test_seq 3)
> + do
> + test_commit $i &&
> + git branch commits/$i
> + done &&
> + git repack
> +'


Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Jeff King
On Wed, Feb 21, 2018 at 01:48:11PM -0500, Jeff King wrote:

> > What confuses me about this behavior is that the OID is still shown on the
> > repeat (and in the case of `git log --oneline` will not actually have a line
> > break between two short-OIDs). I don't believe this behavior is something to
> > preserve.
> 
> I think that repeating the oid is intentional; the point is to dump how
> the traversal code is hitting the endpoints, even if we do so multiple
> times.
> 
> The --oneline behavior just looks like a bug. I think --format is broken
> with --show-all, too (it does not show anything!).

I poked at one of the examples a little more closely. I actually think
these are not repeats, but simply UNINTERESTING parents that we never
needed to look at in our traversal (because we hit a point where
everything was UNINTERESTING).

So we are relying not on finish_commit() to have freed the buffer, but
on the traversal code to have never parsed those commits in the first
place. Which is doubly subtle.

I think the rest of my email stands, though: we should just show the
full headers for those commits.

-Peff


Re: Git should preserve modification times at least on request

2018-02-21 Thread Derek Fawcus
On Mon, Feb 19, 2018 at 10:22:36PM +0100, Peter Backes wrote:
> 
> It is pretty annoying that git cannot, even if I know what I am doing, 
> and explicitly want it to, preserve the modification time.

The use case I've come across where it would be of value is for code
archeology, either importing a bunch of tar files, or importing a
repo from some other VCS.

There preserving the mod times can be useful when one is subsequently
figuring out what changed, and the scope of the 'commits' is too big
(i.e. the granularity of the tar files themselves).

e.g. initial commits are done on tar boundaries, but one may try to
figure out individual changes from a ChangeLog file.  I've done this
a couple of times, but to date it has required keeping the untarred
trees around (or a timestamp list file from each tree), in addition
to the git repro in to which one is then synthesizing smaller commits.

DF


Re: [PATCH] t/known-leaky: add list of known-leaky test scripts

2018-02-21 Thread Jeff King
On Wed, Feb 21, 2018 at 08:53:16AM -0800, Junio C Hamano wrote:

> Even though keeping track of list of known-leaky tests may not be so
> interesting, we can still salvage useful pieces from the discussion
> and make them available to developers, e.g.  something like
> 
> prove --dry --state=failed |
> perl -lne '/^(t[0-9]{4})-.*\.sh$/ and print $1' | sort >$@+
> if cmp >/dev/null $@ $@+; then rm $@+; else mv $@+ $@; fi
> 
> could be made into a target to stash away the list of failing tests
> after a test run?

Unfortunately there are some caveats in that snippet:

  1. You are using prove.

  2. You are using --state=save in the initial run.

I think we might be better off having the test scripts write to
test-results/*.counts even when run under a TAP harness, and then we can
have a consistent way to get the list of failed tests (we already have a
"make failed" that works this way).

-Peff


Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Jeff King
On Wed, Feb 21, 2018 at 09:13:22AM -0500, Derrick Stolee wrote:

> > So there it is. It does show commits multiple times, but suppresses the
> > verbose header after the first showing. If we do something like this:
> > 
> >git rev-list --show-all --pretty --boundary c93150cfb0^-
> > 
> > you'll see some boundary commits that _don't_ have their pretty headers
> > shown. And with your proposed patch, we'd show them again. To keep the
> > same behavior we need to store that "we've already seen this" boolean
> > somewhere else (e.g., in an object flag; possibly SEEN, but that might
> > run afoul of other logic).
> 
> What confuses me about this behavior is that the OID is still shown on the
> repeat (and in the case of `git log --oneline` will not actually have a line
> break between two short-OIDs). I don't believe this behavior is something to
> preserve.

I think that repeating the oid is intentional; the point is to dump how
the traversal code is hitting the endpoints, even if we do so multiple
times.

The --oneline behavior just looks like a bug. I think --format is broken
with --show-all, too (it does not show anything!).

> Unless I am misunderstanding, the current behavior on a repeated commit is
> already incorrect: some amount of output occurs before checking the buffer,
> so the output includes repeated records but with formatting that violates
> the expectation. By doing the simple change of swapping
> get_cached_commit_buffer() with get_commit_buffer(), we correct that format
> violation but have duplicate copies.

Yeah, I'd agree with that assessment.

> The most-correct thing to do (in my opinion) is to put the requirement of
> "no repeats" into the revision walk logic and stop having the formatting
> methods expect them. Then, however we change this boolean setting of "we
> have seen this before" it will not require the formatting methods to change.

But then you wouldn't show repeats at all. If I'm understanding you
correctly.

TBH, I do not think it is worth spending a lot of effort on this
--show-all feature. It seems mostly like forgotten debugging cruft to
me. That's why I'd be OK with showing the whole header as the simplest
fix (i.e., just removing those calls entirely, not even converting them
to get_commit_buffer).

> I can start working on a patch to move the duplicate-removal logic into
> revision.c instead of these three callers:
> 
> builtin/rev-list.c: if (revs->verbose_header &&
> get_cached_commit_buffer(commit, NULL)) {
> log-tree.c: if (!get_cached_commit_buffer(commit, NULL))
> object.c:   if (!get_cached_commit_buffer(commit, NULL))

Those first two are duplicate detection. The third one in object.c
should stay, though. We've been fed a commit buffer to parse, and we
want to know whether we should attach it as the cached buffer for that
commit. But if we already have a cached buffer, there's no point in
doing so. And that's what we're checking there.

Though I think it would be equally correct to have set_commit_buffer()
just throw away the existing cache entry and replace it with this one. I
don't think there's a real reason to prefer the old to the new. And that
might be worth doing if it would let us drop get_cached_commit_buffer()
as a public function. But...

> But this caller seems pretty important in pretty.c:
> 
>     /*
>  * Otherwise, we still want to munge the encoding header in the
>  * result, which will be done by modifying the buffer. If we
>  * are using a fresh copy, we can reuse it. But if we are using
>  * the cached copy from get_commit_buffer, we need to duplicate it
>  * to avoid munging the cached copy.
>  */
>     if (msg == get_cached_commit_buffer(commit, NULL))
>     out = xstrdup(msg);
>     else
>     out = (char *)msg

Like the one in object.c, this really does want to know about the cached
entry. And it should be unaffected by your patch, since we will have
called get_commit_buffer() at the top of that function.

If we wanted to write this one without get_cached_commit_buffer(), we'd
really need a function to ask "did this pointer come from the cache, or
was it freshly allocated?". That's the same thing we do for
unuse_commit_buffer(). So in theory we could have a boolean function
that would check that, and that would let us make
get_cached_commit_buffer() private.

In my opinion it's not really worth trying to make it private. The
confusion you're fixing in the first two calls is not due to a bad API,
but due to some subtly confusing logic in that code's use of the API. ;)

So I'd probably do this:

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index d94062bc84..3af56921c8 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -150,7 +150,7 @@ static void show_commit(struct commit *commit, void *data)
else
putchar('\n');
 
-   if (revs->verbose_header && get_cached_commit_buffer(commit, 

[PATCH v2 3/3] config: change default of `pager.config` to "on"

2018-02-21 Thread Martin Ågren
This is similar to ff1e72483 (tag: change default of `pager.tag` to
"on", 2017-08-02) and is safe now that we do not consider `pager.config`
at all when we are not listing or getting configuration. This change
will help with listing large configurations, but will not hurt users of
`git config --edit` as it would have before the previous commit.

Signed-off-by: Martin Ågren 
---
 Documentation/git-config.txt |  1 +
 t/t7006-pager.sh | 12 ++--
 builtin/config.c |  2 +-
 3 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 249090ac84..e09ed5d7d5 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -237,6 +237,7 @@ CONFIGURATION
 -
 `pager.config` is only respected when listing configuration, i.e., when
 using `--list` or any of the `--get-*` which may return multiple results.
+The default is to use a pager.
 
 [[FILES]]
 FILES
diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index 95bd26f0b2..7541ba5edb 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -267,23 +267,23 @@ test_expect_success TTY 'git config --get ignores 
pager.config' '
! test -e paginated.out
 '
 
-test_expect_success TTY 'git config --get-urlmatch defaults to not paging' '
+test_expect_success TTY 'git config --get-urlmatch defaults to paging' '
rm -f paginated.out &&
test_terminal git -c http."https://foo.com/".bar=foo \
  config --get-urlmatch http https://foo.com &&
-   ! test -e paginated.out
+   test -e paginated.out
 '
 
 test_expect_success TTY 'git config --get-all respects pager.config' '
rm -f paginated.out &&
-   test_terminal git -c pager.config config --get-all foo.bar &&
-   test -e paginated.out
+   test_terminal git -c pager.config=false config --get-all foo.bar &&
+   ! test -e paginated.out
 '
 
-test_expect_success TTY 'git config --list defaults to not paging' '
+test_expect_success TTY 'git config --list defaults to paging' '
rm -f paginated.out &&
test_terminal git config --list &&
-   ! test -e paginated.out
+   test -e paginated.out
 '
 
 
diff --git a/builtin/config.c b/builtin/config.c
index a732d9b56c..01169dd628 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -602,7 +602,7 @@ int cmd_config(int argc, const char **argv, const char 
*prefix)
}
 
if (actions & PAGING_ACTIONS)
-   setup_auto_pager("config", 0);
+   setup_auto_pager("config", 1);
 
if (actions == ACTION_LIST) {
check_argc(argc, 0, 0);
-- 
2.16.2.246.ga4ee8f



Re: Git should preserve modification times at least on request

2018-02-21 Thread Phillip Wood

On 20/02/18 22:48, Peter Backes wrote:


On Tue, Feb 20, 2018 at 11:32:23PM +0100, Johannes Schindelin wrote:

Hi Peter,

On Tue, 20 Feb 2018, Peter Backes wrote:


On Tue, Feb 20, 2018 at 11:46:38AM +0100, Johannes Schindelin wrote:


I would probably invent a file format (``)


I'm stuck there because of  being munged.


 From which command do you want to get it? If you are looking at `git
diff`, you may want to use the `-z --name-only` options to avoid munging
the paths.


I plan to use "git diff-tree --name-only $w_tree HEAD" and subtract
all lines from "git diff-index --name-only HEAD" to get the files for
which the timestamp should be stored..

If I use "-z" I get the non-munged path, but I cannot safely store such
paths in the proposed file format; they might contain newlines (sigh).
So at one point I have to munge. Then the same question arises when I
have to get the actual path from the munged path when restoring the
timestamps.

If there's no ready-made functionality to munge and unmunge paths, I
have to write some awk for this. At first I thought this might add one
more dependency to git, but it seems that awk is already used in
git-mergetool.sh, so I suppose it's okay to use in git-stash.sh etc,
too.


In recent versions of git there's unquote_path() in Git.pm, you could 
possibly use that with perl -e from your script


Best Wishes

Phillip

Best wishes
Peter





Re: Git should preserve modification times at least on request

2018-02-21 Thread Ævar Arnfjörð Bjarmason

On Wed, Feb 21 2018, Derek Fawcus jotted:

> On Mon, Feb 19, 2018 at 10:22:36PM +0100, Peter Backes wrote:
>>
>> It is pretty annoying that git cannot, even if I know what I am doing,
>> and explicitly want it to, preserve the modification time.
>
> The use case I've come across where it would be of value is for code
> archeology, either importing a bunch of tar files, or importing a
> repo from some other VCS.
>
> There preserving the mod times can be useful when one is subsequently
> figuring out what changed, and the scope of the 'commits' is too big
> (i.e. the granularity of the tar files themselves).
>
> e.g. initial commits are done on tar boundaries, but one may try to
> figure out individual changes from a ChangeLog file.  I've done this
> a couple of times, but to date it has required keeping the untarred
> trees around (or a timestamp list file from each tree), in addition
> to the git repro in to which one is then synthesizing smaller commits.

This sounds like a sensible job for a git import tool, i.e. import a
target directory into git, and instead of 'git add'-ing the whole thing
it would look at the mtimes, sort files by mtime, then add them in order
and only commit those files that had the same mtime in the same commit
(or within some boundary).

The advantage of doing this via such a tool is that you could tweak it
to commit by any criteria you wanted, e.g. not mtime but ctime or even
atime.

You'd get the same thing as you'd get if git's tree format would change
to include mtimes (which isn't going to happen), but with a lot more
flexibility.


Re: [PATCH v4 08/13] commit-graph: implement --delete-expired

2018-02-21 Thread Stefan Beller
On Mon, Feb 19, 2018 at 10:53 AM, Derrick Stolee  wrote:

> graph_name = write_commit_graph(opts.obj_dir);
>
> if (graph_name) {
> if (opts.set_latest)
> set_latest_file(opts.obj_dir, graph_name);
>
> +   if (opts.delete_expired)
> +   do_delete_expired(opts.obj_dir,
> + old_graph_name,
> + graph_name);
> +

So this only allows to delete expired things and setting the latest
when writing a new graph. Would we ever envision a user to produce
a new graph (e.g. via obtaining a graph that they got from a server) and
then manually rerouting the latest to that new graph file without writing
that graph file in the same process? The same for expired.

I guess these operations are just available via editing the
latest or deleting files manually, which slightly contradicts
e.g. "git update-ref", which in olden times was just a fancy way
of rewriting the refs file manually. (though it claims to be less
prone to errors as it takes lock files)

>
>  extern char *get_graph_latest_filename(const char *obj_dir);
> +extern char *get_graph_latest_contents(const char *obj_dir);

Did
https://public-inbox.org/git/20180208213806.ga6...@sigill.intra.peff.net/
ever make it into tree? (It is sort of new, but I feel we'd want to
strive for consistency in the code base, eventually.)


Congratulations i willed to you

2018-02-21 Thread Micheal P Anderson
 
   My Dear friend,

 I am Mr. Micheal Pascal Anderson I got your contact on my personal search
of the person.. I willed the only funds left in my account to you.

If you want to know why I have willed the only funds left
in my account to you please contact the bank manager whose
name and address I will give to you as soon as you reply to
this mail. Right now I am in the hospital. Please,

Mr. Micheal Pascal Anderson


Re: [PATCH v3 04/35] upload-pack: convert to a builtin

2018-02-21 Thread Jonathan Tan
On Tue,  6 Feb 2018 17:12:41 -0800
Brandon Williams  wrote:

> In order to allow for code sharing with the server-side of fetch in
> protocol-v2 convert upload-pack to be a builtin.
> 
> Signed-off-by: Brandon Williams 

As Stefan mentioned in [1], also mention in the commit message that this
means that the "git-upload-pack" invocation gains additional
capabilities (for example, invoking a pager for --help).

Having said that, the main purpose of this patch seems to be to libify
upload-pack, and the move to builtin is just a way of putting the
program somewhere - we could have easily renamed upload-pack.c and
created a new upload-pack.c containing the main(), preserving the
non-builtin-ness of upload-pack, while still gaining the benefits of
libifying upload-pack.

If the community does want to make upload-pack a builtin, I would write
the commit message this way:

  upload-pack: libify

  Libify upload-pack. The main() function is moved to
  builtin/upload-pack.c, thus making upload-pack a builtin. Note that
  this means that "git-upload-pack" gains functionality such as the
  ability to invoke a pager when passed "--help".

And if not:

  upload-pack: libify

  Libify upload-pack by moving most of the functionality in
  upload-pack.c into a file upload-pack-lib.c (or some other name),
  to be used in subsequent patches.

[1] 
https://public-inbox.org/git/CAGZ79kb2=uu0_k8wr27gndnx-t+p+7gvdgc5ebdyc3zbobs...@mail.gmail.com/

> -static void upload_pack(void)
> -{
> - struct string_list symref = STRING_LIST_INIT_DUP;
> -
> - head_ref_namespaced(find_symref, );
> -
> - if (advertise_refs || !stateless_rpc) {
> - reset_timeout();
> - head_ref_namespaced(send_ref, );
> - for_each_namespaced_ref(send_ref, );
> - advertise_shallow_grafts(1);
> - packet_flush(1);
> - } else {
> - head_ref_namespaced(check_ref, NULL);
> - for_each_namespaced_ref(check_ref, NULL);
> - }
> - string_list_clear(, 1);
> - if (advertise_refs)
> - return;
> -
> - receive_needs();
> - if (want_obj.nr) {
> - get_common_commits();
> - create_pack_file();
> - }
> -}

I see that this function had to be moved to the bottom because it now
also needs to make use of functions like upload_pack_config() - that's
fine.

> +struct upload_pack_options {
> + int stateless_rpc;
> + int advertise_refs;
> + unsigned int timeout;
> + int daemon_mode;
> +};

I would have expected "unsigned stateless_rpc : 1" etc., but I see that
this makes it easier to use with OPT_BOOL (which needs us to pass it a
pointer-to-int).

As for what existing code does, files like fetch-pack and diff use
"unsigned : 1", but they also process arguments without OPT_, so I don't
think they are relevant.

I think that we should decide if we're going to prefer "unsigned : 1" or
"int" for flags in new code. Personally, I prefer "unsigned : 1"
(despite the slight inconvenience in that argument parsers will need to
declare their own temporary "int" and then assign its contents to the
options struct) because of the stronger type, but I'm OK either way.
Whatever the decision, I don't think it needs to block the review of
this patch set.


Re: [PATCH 04/26] upload-pack: convert to a builtin

2018-02-21 Thread Jonathan Nieder
Brandon Williams wrote:
> On 01/03, Stefan Beller wrote:
> > On Tue, Jan 2, 2018 at 4:18 PM, Brandon Williams  wrote:

>>> In order to allow for code sharing with the server-side of fetch in
>>> protocol-v2 convert upload-pack to be a builtin.
>>
>> What is the security aspect of this patch?
>>
>> By making upload-pack builtin, it gains additional abilities,
>> such as answers to '-h' or '--help' (which would start a pager).
>> Is there an easy way to sooth my concerns? (best put into the
>> commit message)
>
> receive-pack is already a builtin, so theres that.

*nod*

Since v2.4.12~1^2 (shell: disallow repo names beginning with dash,
2017-04-29), git-shell refuses to pass --help to upload-pack, limiting
the security impact in configurations that use git-shell (e.g.
gitolite installations).

If you're not using git-shell, then hopefully you have some other form
of filtering preventing arbitrary options being passed to
git-upload-pack.  If you don't, then you're in trouble, for the
reasons described in that commit.

Since some installations may be allowing access to git-upload-pack
(for read-only access) without allowing access to git-receive-pack,
this does increase the chance of attack.  On the other hand, I suspect
the maintainability benefit is worth it.

For defense in depth, it would be comforting if the git wrapper had
some understanding of "don't support --help in handle_builtin when
invoked as a dashed command".  That is, I don't expect that anyone has
been relying on

git-add --help

acting like

git help add

instead of printing the usage message from

git add -h

It's a little fussy because today we rewrite "git add --help" to
"git-add --help" before rewriting it to "git help add"; we'd have to
skip that middle hop for this to work.

I don't think that has to block this patch or series, though --- it's
just a separate thought about hardening.

Cc-ing Sitaram Chamarty since he tends to be wiser about this kind of
thing than I am.

What do you think?

Thanks,
Jonathan


Re: [PATCH 05/27] object-store: move packed_git and packed_git_mru to object store

2018-02-21 Thread Junio C Hamano
Stefan Beller  writes:

> +
> +/*
> + * The mru list_head is supposed to be initialized using
> + * the LIST_HEAD macro, assigning prev/next to itself.
> + * However this doesn't work in this case as some compilers dislike
> + * that macro on member variables. Use NULL instead as that is defined
> + * and accepted, deferring the real init to prepare_packed_git_mru(). */
> +#define __MRU_INIT { NULL, NULL }
> +#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL }

I do not think it has to be this way to abuse two NULLs, if you
faithfully mimicked how LIST_HEAD() macro is constructed.  The
reason why it does not try to introduce

struct list_head x = LIST_HEAD_INIT;

and instead, uses

LIST_HEAD(x);

is because of the need for self referral.  If we learn from it, we
can do the same, i.e. instead of doing

struct raw_object_store x = RAW_OBJECT_STORE_INIT;

we can do

RAW_OBJECT_STORE(x);

that expands to

struct raw_object_store x = {
...
{ _git_mru, _git_mru },
...
};

no?  Then we do not need such a lengthy comment that reads only like
an excuse ;-)



Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Derrick Stolee

On 2/21/2018 2:17 PM, Derrick Stolee wrote:

The get_cached_commit_buffer() method provides access to the buffer
loaded for a struct commit, if it was ever loadead and was not freed.

Two places use this to inform how to output information about commits.

log-tree.c uses this method to short-circuit the output of commit
information when the buffer is not cached. However, this leads to
incorrect output in 'git log --oneline' where the short-OID is written
but then the rest of the commit information is dropped and the next
commit is written on the same line.

rev-list uses this method for two reasons:

- First, if the revision walk visits a commit twice, the buffer was
   freed by rev-list in the first write. The output then does not
   match the format expectations, since the OID is written without the
   rest of the content.

- Second, if the revision walk visits a commit that was marked
   UNINTERESTING, the walk may never load a buffer and hence rev-list
   will not output the verbose information.

These behaviors are undocumented, untested, and unlikely to be
expected by users or other software attempting to parse this output.

Helped-by: Jeff King 
Signed-off-by: Derrick Stolee 


This would be a good time to allow multiple authors, or to just change 
the author, since this is exactly the diff you (Peff) provided in an 
earlier email. The commit message hopefully summarizes our discussion, 
but I welcome edits.



---
  builtin/rev-list.c | 2 +-
  log-tree.c | 3 ---
  2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 48300d9..d320b6f 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -134,7 +134,7 @@ static void show_commit(struct commit *commit, void *data)
else
putchar('\n');
  
-	if (revs->verbose_header && get_cached_commit_buffer(commit, NULL)) {

+   if (revs->verbose_header) {
struct strbuf buf = STRBUF_INIT;
struct pretty_print_context ctx = {0};
ctx.abbrev = revs->abbrev;
diff --git a/log-tree.c b/log-tree.c
index fc0cc0d..22b2fb6 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -659,9 +659,6 @@ void show_log(struct rev_info *opt)
show_mergetag(opt, commit);
}
  
-	if (!get_cached_commit_buffer(commit, NULL))

-   return;
-
if (opt->show_notes) {
int raw;
struct strbuf notebuf = STRBUF_INIT;




<    1   2