date:20180221

test bare repository for unit tests

2018-02-21 Thread Basin Ilya

Hi.
I want to the test-repo-git under 
https://github.com/apache/maven-wagon/tree/master/wagon-providers/wagon-scm/src/test/resources/
just like test-repo-cvs and test-repo-svn

Which configuation options would suit that?
I think core.compression 0 for human readable diffs.
also, I need to force loose, gc after each push.

Re: Git should preserve modification times at least on request

2018-02-21 Thread Jacob Keller

On Tue, Feb 20, 2018 at 2:05 PM, Peter Backes  wrote:
> Hi Jeff,
>
> On Tue, Feb 20, 2018 at 04:16:34PM -0500, Jeff King wrote:
>> I think there are some references buried somewhere in that wiki, but did
>> you look at any of the third-party tools that store file metadata
>> alongside the files in the repository? E.g.:
>>
>>   https://etckeeper.branchable.com/
>>
>> or
>>
>>   https://github.com/przemoc/metastore
>>
>> I didn't see either of those mentioned in this thread (though I also do
>> not have personal experience with them, either).
>>
>> Modification times are a subset of the total metadata you might care
>> about, so they are solving a much more general problem. Which may also
>> partially answer your question about why this isn't built into git. The
>> general problem gets much bigger when you start wanting to carry things
>> like modes (which git doesn't actually track; we really only care about
>> the executable bit) or extended attributes (acls, etc).
>
> I know about those, but that's not what I am looking for. Those tools
> serve entirely different purposes, ie., tracking file system changes.
> I, however, am specifically interested in version control.
>
> In version control, the user checks out his own copy of the tree for
> working. For this purpose, it is thus pointless to track ownership,
> permissions (except for the x bit), xattrs, or any other metadata. In
> fact, it can be considered the wrong thing to do.
>
> The modification time, however, is special. It clearly has its place in
> version control. It tells us when the last modification was actually
> done to the file. I am often working on some feature, and one part is
> finished and is lying around, but I am still working on other parts in
> other files. Then, maybe after some weeks, the other parts are
> finished. Now, when committing, the information about modification time
> is lost. Maybe some weeks later I want to figure out when I last
> modified those files that were committed. But that information is now
> gone, at least in the git repository. Sure, I could do lots of WIP
> commits, but this would clutter up the history unneccessarly and I
> would have lots of versions that might not even compile, let alone run.

You could have git figure this out by the commit time of the last
commit which modified a file. This gets a bit weird for cherry-picks
or other things like rebase, but that should get what you want.

If you only ever need this information sometimes, you can look it up
by doing something like:

git log -1 --pretty="%cd" -- 

That should show the commit time of the latest commit which touches
that file, which is "essentially" the modify time of the file in terms
of  the version control history.

Obviously, this wouldn't work if you continually amend a change of
multiple files, since git wouldn't track the files separately, and
this only really shows you the time of the last commit.

However, in "version control" sense, this *is* the last time a file
was modified, since it doesn't really care about the stuff that
happens outside of version control.

I'm not really sure if this is enough for you, or if you really want
to store the actual mtime for some reason? (I think you can likely
solve your problem in some other way though).

>
> As far as I remember, bitkeeper had this distinction between checkins
> and commits. You could check in a file at any time, and any number of
> times, and then group all those checkins together with a commit. Git
> seems to have avoided this principle, or have kept it only
> rudimentarily via git add (but git add cannot add more than one version
> of the same file). Perhaps for simplificiation of use, perhaps for
> simplification of implementation, I don't know.
>

You can do lots of commits on a branch and then one merge commit to
merge it into the main line. This is a common strategy used by many
people.

Thanks,
Jake

> I assume, if it were not for the build tool issues, git would have
> tracked mtime from the very start.
>

Maybe. Personally, I would hate having my mtime not be "the time I
checked the file out", since this is intuitive to me at this point.
I'm sure if I lived in a different world I'd be used to that way also,
though.

The build issue *is* important though, because many build systems rely
on the mtime to figure out what to rebuild, and a complete rebuild
isn't a good idea for very large projects.

Thanks,
Jake

> Best wishes
> Peter
> --
> Peter Backes, r...@helen.plasma.xg8.de

Re: Duplicate safecrlf warning for racily clean index entry

2018-02-21 Thread Matt McCutchen

On Wed, 2018-02-21 at 08:53 +0100, Torsten Bögershausen wrote:
> I don't hava a pointer, but what should happen ?
> 2 warnings for 2 "git add" should be OK, I think.
> 
> 1 warning is part of the optimization, that Git does to handle
> hundrets and thousands of files efficciently.
> 
> Is the 1/2 warning  real live problem  ?

As I've suggested, my opinion is that the nondeterministic second
warning can result in significant user confusion and should be avoided.
 (If it were always shown, I'd be less concerned.)  We'll see what the
decision-makers think.

Matt

Re: [PATCH v2 5/9] t3701: add failing test for pathological context lines

2018-02-21 Thread Phillip Wood


On 19/02/18 11:29, Phillip Wood wrote:

From: Phillip Wood 

When a hunk is skipped by add -i the offsets of subsequent hunks are
not adjusted to account for any missing insertions due to the skipped
hunk. Most of the time this does not matter as apply uses the context
lines to apply the subsequent hunks in the correct place, however in
pathological cases the context lines will match at the now incorrect
offset and the hunk will be applied in the wrong place. The offsets of
hunks following an edited hunk that has had the number of insertions
or deletions changed also need to be updated in the same way. Add
failing tests to demonstrate this.

Signed-off-by: Phillip Wood 
---

Notes:
 changes since v1:
  - changed edit test to use the existing fake editor and to strip
the hunk header and some context lines as well as deleting an
insertion
  - style fixes

  t/t3701-add-interactive.sh | 31 +++
  1 file changed, 31 insertions(+)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 
70748393f28c93f4bc5d43b07bd96bd208aba3e9..06c4747f506a1b05ccad0f9387e1fbd69cfd7784
 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -511,4 +511,35 @@ test_expect_success 'status ignores dirty submodules 
(except HEAD)' '
! grep dirty-otherwise output
  '
  
+test_expect_success 'set up pathological context' '

+   git reset --hard &&
+   test_write_lines a a a a a a a a a a a >a &&
+   git add a &&
+   git commit -m a &&
+   test_write_lines c b a a a a a a a b a a a a >a &&
+   test_write_lines a a a a a a a b a a a a >expected-1 &&
+   test_write_lines   b a a a a a a a b a a a a >expected-2 &&
+   # check editing can cope with missing header and deleted context lines
+   # as well as changes to other lines
+   test_write_lines +b " a" >patch
+'
+
+test_expect_failure 'add -p works with pathological context lines' '
+   git reset &&
+   printf "%s\n" n y |
+   git add -p &&
+   git cat-file blob :a >actual &&
+   test_cmp expected-1 actual
+'
+
+test_expect_failure 'add -p patch editing works with pathological context 
lines' '
+   git reset &&
+   test_set_editor "$FAKE_EDITOR" &&

In light of the discussion of patch 2, I think this line should be deleted



+   # n q q below is in case edit fails
+   printf "%s\n" e yn q q |
+   git add -p &&
+   git cat-file blob :a >actual &&
+   test_cmp expected-2 actual
+'
+
  test_done

Re: Why git-revert doesn't invoke the pre-commit and the commit-msg hooks?

2018-02-21 Thread Gustavo Chaves

2018-02-20 15:00 GMT-03:00 Junio C Hamano :
> It would make more sense (if we were to add
> an option to run any hook we currently do not run to the command) to
> run pre-revert/revert-msg hooks instead, and then people who happen
> to want to do the same thing in these hooks what they do for
> ordinary commits can just call their pre-commit/commit-msg hooks
> from there, perhaps.

I like this idea very much as it doesn't break a long standing behaviour and
simply introduces a new feature.

-- 
Gustavo.

Re: Duplicate safecrlf warning for racily clean index entry

2018-02-21 Thread Matt McCutchen

On Tue, 2018-02-20 at 08:42 -0500, Matt McCutchen wrote:
> In either case, if "git update-index --refresh" (or "git status") is
> run before "git add", then "git add" does not print the warning.  On
> the other hand, if line endings in the working tree file are changed,
> then git shows the file as having an unstaged change, even though the
> content that would be added to the index after CRLF conversion is
> identical.  So it seems that git remembers the pre-conversion file
> content and uses it for "git update-index --refresh" and would just
> need to use it for "git add" as well.

On further testing, this analysis is wrong.  What I was seeing is that
if the size of the working tree file has changed, git reports an
unstaged change.  (I suppose that reporting an unstaged change in this
case without checking whether the post-conversion content has changed
may be an important optimization.)  If the line endings are changed
without changing the size or post-conversion content, then no unstaged
change is reported.  It does not appear that git saves the pre-
conversion content.

Thus, if it were possible to create a file that doesn't need a safecrlf
warning, add it to the index, and then modify it so that it does need a
safecrlf warning without changing the size or post-conversion content,
we would have a bug where no warning is shown in the case where "git
status" is run before the second "git add".  I believe this bug can't
occur in the particular case of CRLF conversion without other filters
because the file that doesn't need a safecrlf warning has a unique
minimum (LF) or maximum (CRLF) size, though I presume it could occur
with custom filters.  My proposal would then be that "git add" should
not show a safecrlf warning if the size and post-conversion content
haven't changed; it would merely bring "git add" to parity with the
potential bug in the "git status" case.

Matt

Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Derrick Stolee


On 2/20/2018 5:57 PM, Jeff King wrote:

On Tue, Feb 20, 2018 at 05:12:50PM -0500, Derrick Stolee wrote:


In rev-list, the "--header" option outputs a value and expects the buffer to
be cached. It outputs the header info only if get_cached_commit_buffer()
returns a non-null buffer, giving incorrect output. If it called
get_commit_buffer() instead, it would immediately call
get_cached_commit_buffer() and on failure actually load the buffer.

This has not been a problem before, since the buffer was always loaded at
some point for each commit (and saved because of the save_commit_buffer
global).

I propose to make get_cached_commit_buffer() static to commit.c and convert
all callers to get_commit_buffer(). Is there any reason to _not_ do this? It
seems that there is no functional or performance change.

That helper was added in 152ff1cceb (provide helpers to access the
commit buffer, 2014-06-10). I think interesting part is the final
paragraph:

   Note that we also need to add a "get_cached" variant which
   returns NULL when we do not have a cached buffer. At first
   glance this seems to defeat the purpose of "get", which is
   to always provide a return value. However, some log code
   paths actually use the NULL-ness of commit->buffer as a
   boolean flag to decide whether to try printing the
   commit. At least for now, we want to continue supporting
   that use.

So I think a conversion to get_commit_buffer() would break the callers
that use the boolean nature for something useful. Unfortunately the
author did not enumerate exactly what those uses are, so we'll have to
dig. :)

My guess is that it has to do with showing some commits twice, since we
would normally have the buffer available the first time we hit the
commit, and then free it in finish_commit().

If we blame that rev-list line (and then walk back over a bunch of
uninteresting commits via parent re-blaming), it comes from 3131b71301
(Add "--show-all" revision walker flag for debugging, 2008-02-09).


Thanks for doing this digging. I appreciate the breadcrumbs, too, so I 
can do a better job of digging next time.



So there it is. It does show commits multiple times, but suppresses the
verbose header after the first showing. If we do something like this:

   git rev-list --show-all --pretty --boundary c93150cfb0^-

you'll see some boundary commits that _don't_ have their pretty headers
shown. And with your proposed patch, we'd show them again. To keep the
same behavior we need to store that "we've already seen this" boolean
somewhere else (e.g., in an object flag; possibly SEEN, but that might
run afoul of other logic).


What confuses me about this behavior is that the OID is still shown on 
the repeat (and in the case of `git log --oneline` will not actually 
have a line break between two short-OIDs). I don't believe this behavior 
is something to preserve.


You are right that we definitely don't want to show the full content twice.


It looks like the call in log-tree comes from the same commit, and
serves the same purpose.

Aside from storing the boolean "did we show it" in another way, the
other option is to simply change the behavior and accept that we might
pretty-print the commit twice. This is a backwards-incompatible change,
but I'm not sure if anybody would care. According to that commit,
--show-all was added explicitly for debugging, and it has never been
documented.  I couldn't find any reference to people actually using it
on the list (a grep of the whole archive turns up 32 messages, most of
which are just it appearing in context; the only person mentioning its
actual use was Linus in 2008.


Unless I am misunderstanding, the current behavior on a repeated commit 
is already incorrect: some amount of output occurs before checking the 
buffer, so the output includes repeated records but with formatting that 
violates the expectation. By doing the simple change of swapping 
get_cached_commit_buffer() with get_commit_buffer(), we correct that 
format violation but have duplicate copies.


The most-correct thing to do (in my opinion) is to put the requirement 
of "no repeats" into the revision walk logic and stop having the 
formatting methods expect them. Then, however we change this boolean 
setting of "we have seen this before" it will not require the formatting 
methods to change.


I can start working on a patch to move the duplicate-removal logic into 
revision.c instead of these three callers:


builtin/rev-list.c: if (revs->verbose_header && 
get_cached_commit_buffer(commit, NULL)) {

log-tree.c: if (!get_cached_commit_buffer(commit, NULL))
object.c:   if (!get_cached_commit_buffer(commit, 
NULL)) {


But this caller seems pretty important in pretty.c:

    /*
 * Otherwise, we still want to munge the encoding header in the
 * result, which will be done by modifying the buffer. If we
 * are using a fresh copy, we can reuse it. But if

Re: Git should preserve modification times at least on request

2018-02-21 Thread Peter Backes

On Wed, Feb 21, 2018 at 10:33:05PM +0100, Ævar Arnfjörð Bjarmason wrote:
> This sounds like a sensible job for a git import tool, i.e. import a
> target directory into git, and instead of 'git add'-ing the whole thing
> it would look at the mtimes, sort files by mtime, then add them in order
> and only commit those files that had the same mtime in the same commit
> (or within some boundary).

I think that this would be The Wrong Thing to do.

The commit time is just that: The time the commit was done. The commit 
is an atomic group of changes to a number of files that hopefully bring 
the tree from one usable state into the next.

The mtime, in contrast, tells us when a file was most recently modified.

It may well be that main.c was most recently modified yesterday, and 
feature.c was modified this morning, and that only both changes taken 
together make sense as a commit, despite the long time in between.

Even worse, it may be that feature A took a long time to implement, so 
we have huge gaps in between the mtimes, but feature B was quickly done 
after A was finished. Such an algorithm would probably split feature A 
incorrectly into several commits, and group the more recently changed 
files of feature A with those of feature B.

And if Feature A and Feature B were developed in parallel, things get 
completely messy.

> The advantage of doing this via such a tool is that you could tweak it
> to commit by any criteria you wanted, e.g. not mtime but ctime or even
> atime.

Maybe, but it would be rather useless to commit by ctime or atime. You 
do one grep -r and the atime is different. You do one chmod or chown 
and the ctime is different. Those timestamps are really only useful for 
very limited purposes.

That ctime exists seems reasonable, since it's only ever updated when 
the inode is written anyway.

atime, in contrast, was clearly one of the rather nonsensical 
innovations of UNIX: Do one write to the disk for each read from the 
disk. C'mon, really? It would have been a lot more reasonable to simply 
provide a generic way for tracing read() system calls instead; then 
userspace could decide what to do with that information and which of it 
is useful and should be kept and perhaps stored on disk. Now we have 
this ugly hack called relatime to deal with the problem.

> You'd get the same thing as you'd get if git's tree format would change
> to include mtimes (which isn't going to happen), but with a lot more
> flexibility.

Well, from basic logic, I don't see how a decision not to implement a 
feature could possibly increase flexility. The opposite seems to be the 
case.

Best wishes
Peter

-- 
Peter Backes, r...@helen.plasma.xg8.de

Proposal

2018-02-21 Thread melisa mehmet

Hello

Greetings to you and everyone around you please did you get my previous email 
regarding my proposal ?
please let me know if we can work together on this.

Best Reagrds

Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 02:22:05PM -0800, Junio C Hamano wrote:

> > I think that repeating the oid is intentional; the point is to dump how
> > the traversal code is hitting the endpoints, even if we do so multiple
> > times.
> >
> > The --oneline behavior just looks like a bug. I think --format is broken
> > with --show-all, too (it does not show anything!).
> 
> I do not know about the --format thing,[...]

Hmm, maybe it is fine. I thought before that I got funny output out of
"git log --show-all --format", but I can't seem to reproduce it now.

> being a bug is correct.  I've known about the oneline that does not
> show anything other than the oid (not even end-of-line) for unparsed
> commits for a long time---I just didn't bother looking into fixing
> it exactly because this is only a debugging aid ;-)

Out of curiosity, do you actually use --show-all for anything? I didn't
even know it existed until this thread, and AFAICT nobody but Linus has
ever recommended its use. And it does not even seem that useful as a
general debugging aid.

So what I'm wondering is whether we should consider just ripping it out
(but I'm OK with keeping it, as once the commit-buffer stuff is fixed,
it's probably not hurting anybody).

-Peff

What's cooking in git.git (Feb 2018, #03; Wed, 21)

2018-02-21 Thread Junio C Hamano

Here are the topics that have been cooking.  Commits prefixed with
'-' are only in 'pu' (proposed updates) while commits prefixed with
'+' are in 'next'.  The ones marked with '.' do not appear in any of
the integration branches, but I am still holding onto them.

You can find the changes described here in the integration branches
of the repositories listed at

http://git-blame.blogspot.com/p/git-public-repositories.html

--
[Graduated to "master"]

* ab/sha1dc-build (2017-12-08) 3 commits
  (merged to 'next' on 2018-02-08 at ba9ff2b836)
 + sha1dc_git.h: re-arrange an ifdef chain for a subsequent change
 + Makefile: under "make dist", include the sha1collisiondetection submodule
 + Makefile: don't error out under DC_SHA1_EXTERNAL if DC_SHA1_SUBMODULE=auto

 Push the submodule version of collision-detecting SHA-1 hash
 implementation a bit harder on builders.


* ab/wildmatch-tests (2018-01-30) 10 commits
  (merged to 'next' on 2018-02-08 at f999a3d732)
 + wildmatch test: mark test as EXPENSIVE_ON_WINDOWS
 + test-lib: add an EXPENSIVE_ON_WINDOWS prerequisite
 + wildmatch test: create & test files on disk in addition to in-memory
 + wildmatch test: perform all tests under all wildmatch() modes
 + wildmatch test: use test_must_fail, not ! for test-wildmatch
 + wildmatch test: remove dead fnmatch() test code
 + wildmatch test: use a paranoia pattern from nul_match()
 + wildmatch test: don't try to vertically align our output
 + wildmatch test: use more standard shell style
 + wildmatch test: indent with tabs, not spaces

 More tests for wildmatch functions.


* bc/hash-algo (2018-02-09) 13 commits
  (merged to 'next' on 2018-02-09 at 4437f3f132)
 + hash: update obsolete reference to SHA1_HEADER
  (merged to 'next' on 2018-02-08 at 18f36d12ed)
 + bulk-checkin: abstract SHA-1 usage
 + csum-file: abstract uses of SHA-1
 + csum-file: rename sha1file to hashfile
 + read-cache: abstract away uses of SHA-1
 + pack-write: switch various SHA-1 values to abstract forms
 + pack-check: convert various uses of SHA-1 to abstract forms
 + fast-import: switch various uses of SHA-1 to the_hash_algo
 + sha1_file: switch uses of SHA-1 to the_hash_algo
 + builtin/unpack-objects: switch uses of SHA-1 to the_hash_algo
 + builtin/index-pack: improve hash function abstraction
 + hash: create union for hash context allocation
 + hash: move SHA-1 macros to hash.h

 More abstraction of hash function from the codepath.


* cc/perf-aggregate (2018-02-02) 3 commits
  (merged to 'next' on 2018-02-08 at d8f074e6fb)
 + perf/aggregate: sort JSON fields in output
 + perf/aggregate: add --reponame option
 + perf/aggregate: add --subsection option

 "make perf" enhancement.


* en/merge-recursive-fixes (2018-01-19) 3 commits
  (merged to 'next' on 2018-02-08 at c254292070)
 + merge-recursive: add explanation for src_entry and dst_entry
 + merge-recursive: fix logic ordering issue
 + Tighten and correct a few testcases for merging and cherry-picking
 (this branch is used by en/rename-directory-detection.)


* gs/rebase-allow-empty-message (2018-02-07) 1 commit
  (merged to 'next' on 2018-02-08 at 9d81a2496c)
 + rebase: add --allow-empty-message option

 "git rebase" learned to take "--allow-empty-message" option.


* jc/worktree-add-short-help (2018-01-17) 1 commit
  (merged to 'next' on 2018-02-08 at 9f59ca72ab)
 + worktree: say that "add" takes an arbitrary commit in short-help

 Error message fix.


* jt/fsck-code-cleanup (2018-01-23) 1 commit
  (merged to 'next' on 2018-02-08 at 199ad41486)
 + fsck: fix leak when traversing trees

 Plug recently introduced leaks in fsck.


* kg/packed-ref-cache-fix (2018-01-24) 6 commits
  (merged to 'next' on 2018-02-08 at 370f06a565)
 + packed_ref_cache: don't use mmap() for small files
 + load_contents(): don't try to mmap an empty file
 + packed_ref_iterator_begin(): make optimization more general
 + find_reference_location(): make function safe for empty snapshots
 + create_snapshot(): use `xmemdupz()` rather than a strbuf
 + struct snapshot: store `start` rather than `header_len`

 Avoid mmapping small files while using packed refs (especially ones
 with zero size, which would cause later munmap() to fail).
 A change to a binsearch loop to work around picky complers was
 unnecessarily hard to reason about, but it should do.


* lw/daemon-log-destination (2018-02-05) 1 commit
  (merged to 'next' on 2018-02-08 at da91bd56f4)
 + daemon: add --log-destination=(stderr|syslog|none)

 The log from "git daemon" can be redirected with a new option; one
 relevant use case is to send the log to standard error (instead of
 syslog) when running it from inetd.


* nd/format-patch-stat-width (2018-02-02) 2 commits
  (merged to 'next' on 2018-02-08 at c03e8a084e)
 + format-patch: reduce patch diffstat width to 72
 + format-patch: keep cover-letter diffstat wrapped in 72 columns

 "git format-patch" learned to give 72-cols to diffstat, which is
 consistent with other

Re: test bare repository for unit tests

2018-02-21 Thread Ævar Arnfjörð Bjarmason


On Wed, Feb 21 2018, Basin Ilya jotted:

> Hi.
> I want to the test-repo-git under 
> https://github.com/apache/maven-wagon/tree/master/wagon-providers/wagon-scm/src/test/resources/
> just like test-repo-cvs and test-repo-svn
>
> Which configuation options would suit that?
> I think core.compression 0 for human readable diffs.
> also, I need to force loose, gc after each push.

It looks like you have unit tests that are going to do integration tests
of some SVN/CVS repos as used by some other tool, and want to add git to
that.

Since you have git already, the most straightforward thing to do would
be to ship the output of git-fast-export in the repo, and have the test
setup code create a repo locally out of that, then test it.

Or do you really need to commit the raw repo files as-is for some reason
I've missed?

Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Junio C Hamano

Jeff King  writes:

> Out of curiosity, do you actually use --show-all for anything?

Absolutely not.  I'd actually love it if I could say "not anymore"
instead, but I haven't had an opportunity to debug the revision
traversal code for quite some time so I do not even remember when
was the last time I used it, which disqualifies me from saying even
that.

> So what I'm wondering is whether we should consider just ripping it out
> (but I'm OK with keeping it, as once the commit-buffer stuff is fixed,
> it's probably not hurting anybody).

I see no problem in removing it.  With more "interesting" features
relying on post-processing (like 'simplify-merges'), show_all whose
primary focus was how limit_list() behaves soft of outlived its
usefulness, I would think.

Re: [PATCHv3 00/27] Moving global state into the repository object (part 1)

2018-02-21 Thread Stefan Beller

On Tue, Feb 20, 2018 at 5:54 PM, Stefan Beller  wrote:
> v3:
> * reverted back to use the repository for most of the functions
>   (including unduplicating 'ignore_env')
> * rebased on master again (I lost that state when doing v2, as
>   I did both rebase as well as conversion to object store in one go)
> * one additional patch, that moves the alternates related things to
>   object-store.h, such that inclusion of cache.h (in object-store.h)
>   is not required any more.

This is also available at
https://github.com/stefanbeller/git/tree/object-store-part1-v3

Re: [PATCH 15/27] sha1_file: allow prepare_alt_odb to handle arbitrary repositories

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:18 -0800
Stefan Beller  wrote:

> -void prepare_alt_odb_the_repository(void)
> +void prepare_alt_odb(struct repository *r)
>  {
> - const char *alt;
> -
> - if (the_repository->objects.alt_odb_tail)
> + if (r->objects.alt_odb_tail)
>   return;
>  
> - alt = getenv(ALTERNATE_DB_ENVIRONMENT);
> + r->objects.alt_odb_tail = >objects.alt_odb_list;
> +
> + if (!r->ignore_env) {
> + const char *alt = getenv(ALTERNATE_DB_ENVIRONMENT);
> + if (!alt)
> + alt = "";

alt can be NULL, just like in the existing code, so these 2 lines are
unnecessary. (link_alt_odb_entries() will just do nothing in either
case.)

(I also think that the check of absence of alt should be done by the
caller of link_alt_odb_entries(), not by link_alt_odb_entries() itself,
but that is much beyond the scope of this patch set.)

>  
> - the_repository->objects.alt_odb_tail =
> - _repository->objects.alt_odb_list;
> - link_alt_odb_entries(the_repository, alt,
> -  PATH_SEP, NULL, 0);
> + link_alt_odb_entries(r, alt, PATH_SEP, NULL, 0);
> + }
>  
> - read_info_alternates(the_repository, get_object_directory(), 0);
> + read_info_alternates(r, r->objects.objectdir, 0);
>  }

Re: [PATCH 08/27] pack: move approximate object count to object store

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> The approximate_object_count() function maintains a rough count of
> objects in a repository to estimate how long object name abbreviates
> should be.  Object names are scoped to a repository and the
> appropriate length may differ by repository, so the object count
> should not be global.
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> ---
>  object-store.h | 10 +-
>  packfile.c | 11 +--
>  2 files changed, 14 insertions(+), 7 deletions(-)
> 
> diff --git a/object-store.h b/object-store.h
> index 6cecba3951..bd1e4fcd8b 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -93,6 +93,14 @@ struct raw_object_store {
>   struct alternate_object_database *alt_odb_list;
>   struct alternate_object_database **alt_odb_tail;
>  
> + /*
> +  * A fast, rough count of the number of objects in the repository.
> +  * These two fields are not meant for direct access. Use
> +  * approximate_object_count() instead.
> +  */
> + unsigned long approximate_object_count;
> + unsigned approximate_object_count_valid : 1;

Patch looks fine and is effectively a no-op, though what is the need for
both of these variables?  Maybe it can be simplified down to just use
one?  Just musing as its out of the scope of this patch and we probably
shouldn't try to change that in this series.

> +
>   /*
>* Whether packed_git has already been populated with this repository's
>* packs.
> @@ -107,7 +115,7 @@ struct raw_object_store {
>   * that macro on member variables. Use NULL instead as that is defined
>   * and accepted, deferring the real init to prepare_packed_git_mru(). */
>  #define __MRU_INIT { NULL, NULL }
> -#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL, 0 }
> +#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL, 0, 0, 0 }
>  
>  void raw_object_store_clear(struct raw_object_store *o);
>  
> diff --git a/packfile.c b/packfile.c
> index a8a2e7fe43..693bafbc98 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -803,8 +803,6 @@ static void prepare_packed_git_one(char *objdir, int 
> local)
>   strbuf_release();
>  }
>  
> -static int approximate_object_count_valid;
> -
>  /*
>   * Give a fast, rough count of the number of objects in the repository. This
>   * ignores loose objects completely. If you have a lot of them, then either
> @@ -814,8 +812,8 @@ static int approximate_object_count_valid;
>   */
>  unsigned long approximate_object_count(void)
>  {
> - static unsigned long count;
> - if (!approximate_object_count_valid) {
> + if (!the_repository->objects.approximate_object_count_valid) {
> + unsigned long count;
>   struct packed_git *p;
>  
>   prepare_packed_git();
> @@ -825,8 +823,9 @@ unsigned long approximate_object_count(void)
>   continue;
>   count += p->num_objects;
>   }
> + the_repository->objects.approximate_object_count = count;
>   }
> - return count;
> + return the_repository->objects.approximate_object_count;
>  }
>  
>  static void *get_next_packed_git(const void *p)
> @@ -901,7 +900,7 @@ void prepare_packed_git(void)
>  
>  void reprepare_packed_git(void)
>  {
> - approximate_object_count_valid = 0;
> + the_repository->objects.approximate_object_count_valid = 0;
>   the_repository->objects.packed_git_initialized = 0;
>   prepare_packed_git();
>  }
> -- 
> 2.16.1.291.g4437f3f132-goog
> 

-- 
Brandon Williams

Re: [PATCH 24/27] sha1_file: allow open_sha1_file to handle arbitrary repositories

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:27 -0800
Stefan Beller  wrote:

> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 

Reviewed-by: Jonathan Tan

Proposal

2018-02-21 Thread melisa mehmet

Hello

Greetings to you and everyone around you please did you get my previous email 
regarding my proposal ?
please let me know if we can work together on this.

Best Reagrds

Re: [PATCH v3 14/35] connect: request remote refs using v2

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:51 -0800
Brandon Williams  wrote:

> +extern struct ref **get_remote_refs(int fd_out, struct packet_reader *reader,
> + struct ref **list, int for_push,
> + const struct argv_array *ref_patterns);

I haven't looked at the rest of this patch in detail, but the type of
ref_patterns is probably better as struct string_list, since this is not
a true argument array (e.g. with flags starting with --). Same comment
for the next few patches that deal with ref patterns.

Re: bug in HTTP protocol spec

2018-02-21 Thread Dorian Taylor


> On Feb 21, 2018, at 2:15 PM, Jeff King  wrote:
> 
> Thanks, I agree the document is buggy. Do you want to submit a patch?

Will this do?

Note I am not sure what the story is behind that `version 1` element, whether 
it's supposed to go before or after the null packet or if there should be 
another null packet or what. Perhaps somebody more fluent with the smart 
protocol can advise.

---
Documentation/technical/http-protocol.txt | 10 +++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/http-protocol.txt 
b/Documentation/technical/http-protocol.txt
index a0e45f2889e6e..19d73f7efb338 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -214,14 +214,17 @@ smart server reply:
   S: Cache-Control: no-cache
   S:
   S: 001e# service=git-upload-pack\n
+   S: 
   S: 004895dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint\0multi_ack\n
   S: 0042d049f6c27a2244e12041955e262a404c7faba355 refs/heads/master\n
   S: 003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0\n
   S: 003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}\n
+   S: 

The client may send Extra Parameters (see
Documentation/technical/pack-protocol.txt) as a colon-separated string
-in the Git-Protocol HTTP header.
+in the Git-Protocol HTTP header. Note as well that there is *no* newline
+after the ``.

Dumb Server Response

@@ -264,8 +267,8 @@ Servers MUST set $servicename to be the request parameter 
value.
Servers SHOULD include an LF at the end of this line.
Clients MUST ignore an LF at the end of the line.

-Servers MUST terminate the response with the magic `` end
-pkt-line marker.
+Servers MUST follow the first pkt-line, as well as terminate the
+response, with the magic `` end pkt-line marker.

The returned response is a pkt-line stream describing each ref and
its known value.  The stream SHOULD be sorted by name according to
@@ -278,6 +281,7 @@ Extra Parameter.

  smart_reply =  PKT-LINE("# service=$servicename" LF)
 *1("version 1")
+""
 ref_list
 ""
  ref_list=  empty_list / non_empty_list

--
Dorian Taylor
Make things. Make sense.
https://doriantaylor.com

RE: Git should preserve modification times at least on request

2018-02-21 Thread Randall S. Becker

On February 21, 2018 6:13 PM, Peter Backes wrote:
> On Wed, Feb 21, 2018 at 11:44:13PM +0100, Ævar Arnfjörð Bjarmason wrote:
> > If it were added as a first-level feature to git it would present a
> > lot of UX confusion. E.g. you run "git add" and it'll be showing the
> > mtime somehow, or you get a formatted patch over E-Mail and it doesn't
> > only include the commit time but also times for individual files.
> 
> But that's pretty standard. patch format has timestamp fields for
precisely
> this purpose:
> 
> % echo a > x
> % echo b > y
> % diff -u x y
> --- x 2018-02-21 23:56:29.574029523 +0100
> +++ y 2018-02-21 23:56:31.430003389 +0100

May I suggest storing the date/time in UTC+0 in all cases. I can see
potential issues a couple of times a year where holes exist. I cannot even
fathom what would happen on a merge or edit of history.

Cheers,
Randall

Bug: git log: boundary commits do not respect order (e.g. date-order, topo-order) 2

2018-02-21 Thread Josh Tepper

When using git log, boundary commits (ie, those commits added by
specifying --boundary) do not respect the order (e.g., --date-order,
--topo-order).  Consider the following commit history, where number
indicates the order of the commit timestamps:


0125  <--A
   \ \
 346  <--B


Executing the following command:

$ git log --boundary --date-order ^A B

Should produce the following order (boundary commits shown with dashes):
6 -5 4 3 -1

However, it in fact produces:
6 4 3 -5 -1

Please advise.

Best,
~Josh

Re: [PATCH 07/27] pack: move prepare_packed_git_run_once to object store

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:10 -0800
Stefan Beller  wrote:

> Each repository's object store can be initialized independently, so
> they must not share a run_once variable.
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 

Reviewed-by: Jonathan Tan 

> -#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL }
> +#define RAW_OBJECT_STORE_INIT { NULL, NULL, __MRU_INIT, NULL, NULL, 0 }

Optional: Trailing zeros are not needed in struct initializers.

Re: [PATCH 14/27] sha1_file: allow link_alt_odb_entries to handle arbitrary repositories

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> Actually this also allows read_info_alternates and link_alt_odb_entry to
> handle arbitrary repositories, but link_alt_odb_entries is the most
> interesting function in this set of functions, hence the commit subject.
> 
> These functions span a strongly connected component in the function
> graph, i.e. the recursive call chain might look like
> 
>   -> link_alt_odb_entries
> -> link_alt_odb_entry
>   -> read_info_alternates
> -> link_alt_odb_entries
> 
> That is why we need to convert all these functions at the same time.
> 
> Signed-off-by: Jonathan Nieder 
> Signed-off-by: Stefan Beller 
> ---

Looks good.

>  object-store.h |  4 
>  sha1_file.c| 36 
>  2 files changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/object-store.h b/object-store.h
> index 514ad94287..f283fbdba9 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -18,6 +18,10 @@ struct alternate_object_database {
>   char loose_objects_subdir_seen[256];
>   struct oid_array loose_objects_cache;
>  
> + /*
> +  * Path to the alternative object store. If this is a relative path,
> +  * it is relative to the current working directory.
> +  */

Thanks for adding the comment.

>   char path[FLEX_ARRAY];
>  };
>  #define prepare_alt_odb(r) prepare_alt_odb_##r()
> diff --git a/sha1_file.c b/sha1_file.c
> index c26921df4a..6e5105a252 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -390,11 +390,10 @@ static int alt_odb_usable(struct raw_object_store *o,
>   * SHA1, an extra slash for the first level indirection, and the
>   * terminating NUL.
>   */
> -#define read_info_alternates(r, rb, d) read_info_alternates_##r(rb, d)
> -static void read_info_alternates_the_repository(const char *relative_base,
> - int depth);
> -#define link_alt_odb_entry(r, e, rb, d, n) link_alt_odb_entry_##r(e, rb, d, 
> n)
> -static int link_alt_odb_entry_the_repository(const char *entry,
> +static void read_info_alternates(struct repository *r,
> +  const char *relative_base,
> +  int depth);
> +static int link_alt_odb_entry(struct repository *r, const char *entry,
>   const char *relative_base, int depth, const char *normalized_objdir)
>  {
>   struct alternate_object_database *ent;
> @@ -420,7 +419,7 @@ static int link_alt_odb_entry_the_repository(const char 
> *entry,
>   while (pathbuf.len && pathbuf.buf[pathbuf.len - 1] == '/')
>   strbuf_setlen(, pathbuf.len - 1);
>  
> - if (!alt_odb_usable(_repository->objects, , 
> normalized_objdir)) {
> + if (!alt_odb_usable(>objects, , normalized_objdir)) {
>   strbuf_release();
>   return -1;
>   }
> @@ -428,12 +427,12 @@ static int link_alt_odb_entry_the_repository(const char 
> *entry,
>   ent = alloc_alt_odb(pathbuf.buf);
>  
>   /* add the alternate entry */
> - *the_repository->objects.alt_odb_tail = ent;
> - the_repository->objects.alt_odb_tail = &(ent->next);
> + *r->objects.alt_odb_tail = ent;
> + r->objects.alt_odb_tail = &(ent->next);
>   ent->next = NULL;
>  
>   /* recursively add alternates */
> - read_info_alternates(the_repository, pathbuf.buf, depth + 1);
> + read_info_alternates(r, pathbuf.buf, depth + 1);
>  
>   strbuf_release();
>   return 0;
> @@ -468,12 +467,8 @@ static const char *parse_alt_odb_entry(const char 
> *string,
>   return end;
>  }
>  
> -#define link_alt_odb_entries(r, a, s, rb, d) \
> - link_alt_odb_entries_##r(a, s, rb, d)
> -static void link_alt_odb_entries_the_repository(const char *alt,
> - int sep,
> - const char *relative_base,
> - int depth)
> +static void link_alt_odb_entries(struct repository *r, const char *alt,
> +  int sep, const char *relative_base, int depth)
>  {
>   struct strbuf objdirbuf = STRBUF_INIT;
>   struct strbuf entry = STRBUF_INIT;
> @@ -487,7 +482,7 @@ static void link_alt_odb_entries_the_repository(const 
> char *alt,
>   return;
>   }
>  
> - strbuf_add_absolute_path(, get_object_directory());
> + strbuf_add_absolute_path(, r->objects.objectdir);
>   if (strbuf_normalize_path() < 0)
>   die("unable to normalize object directory: %s",
>   objdirbuf.buf);
> @@ -496,15 +491,16 @@ static void link_alt_odb_entries_the_repository(const 
> char *alt,
>   alt = parse_alt_odb_entry(alt, sep, );
>   if (!entry.len)
>   continue;
> - link_alt_odb_entry(the_repository, entry.buf,
> + link_alt_odb_entry(r, entry.buf,
>  relative_base, depth, objdirbuf.buf);
>   }
>

Re: bug in HTTP protocol spec

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 10:29:35AM -0800, Dorian Taylor wrote:

> I didn’t get an insight until I ran with GIT_TRACE_PACKET=true on a 
> known-good remote (i.e. GitHub), that the null packet-line `` has to 
> follow the service line. This is not reflected in the example here:
> 
> https://github.com/git/git/blob/6464679d9620d91b639e2681b9cc6473f3856d09/Documentation/technical/http-protocol.txt#L216

This is missing the trailing flush after the ref advertisement, too.

> It is also not reflected in the BNF:
> 
> https://github.com/git/git/blob/6464679d9620d91b639e2681b9cc6473f3856d09/Documentation/technical/http-protocol.txt#L279
> 
> (Note these links are from the most recent commit of this file as of this 
> writing.)
> 
> Just thought somebody would like to know.

Thanks, I agree the document is buggy. Do you want to submit a patch?

-Peff

Re: [PATCH v3 16/35] transport: convert transport_get_remote_refs to take a list of ref patterns

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:53 -0800
Brandon Williams  wrote:

> -const struct ref *transport_get_remote_refs(struct transport *transport)
> +const struct ref *transport_get_remote_refs(struct transport *transport,
> + const struct argv_array 
> *ref_patterns)
>  {
>   if (!transport->got_remote_refs) {
> - transport->remote_refs = 
> transport->vtable->get_refs_list(transport, 0, NULL);
> + transport->remote_refs =
> + transport->vtable->get_refs_list(transport, 0,
> +  ref_patterns);
>   transport->got_remote_refs = 1;
>   }

Should we do our own client-side filtering if the server side cannot do
it for us (because it doesn't support protocol v2)? Either way, this
decision should be mentioned in the commit message.

Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 03:22:02PM -0800, Stefan Beller wrote:

> > Subject: [PATCH] commit: drop uses of get_cached_commit_buffer()
> > ---
> >  builtin/rev-list.c | 2 +-
> >  log-tree.c | 3 ---
> >  2 files changed, 1 insertion(+), 4 deletions(-)
> 
> Now if we'd get around to rewrite pretty.c as well, we could make it static,
> giving a stronger reason of not using that function. But it looks a bit
> complicated to me, who is not familiar in that area of the code.
> 
> Thanks for making less use of this suboptimal API,

I'm not sure the API is suboptimal. It's not wrong to ask "do you have a
cached copy of this?". It was just being used poorly here. :)

See the discussion in

  https://public-inbox.org/git/20180221184811.gd4...@sigill.intra.peff.net/

-Peff

Bug: git log: boundary commits do not respect order (e.g. date-order, topo-order)

2018-02-21 Thread Josh Tepper

When using git log, boundary commits (ie, those commits added by
specifying --boundary) do not respect the order (e.g., --date-order,
--topo-order).  Consider the following commit history, where number
indicates the order of the commit timestamps:


0125  <--A
   \ \
 346  <--B


Executing the following command:

$ git log --boundary --date-order ^A B

Should produce the following order (boundary commits shown with dashes):
6 -5 4 3 -1

However, it in fact produces:
6 4 3 -7 -1

Please advise.

Best,
~Josh

Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 02:19:17PM -0500, Derrick Stolee wrote:

> > These behaviors are undocumented, untested, and unlikely to be
> > expected by users or other software attempting to parse this output.
> > 
> > Helped-by: Jeff King 
> > Signed-off-by: Derrick Stolee 
> 
> This would be a good time to allow multiple authors, or to just change the
> author, since this is exactly the diff you (Peff) provided in an earlier
> email. The commit message hopefully summarizes our discussion, but I welcome
> edits.

The point is moot if we take the revision I just sent (though in
retrospect I really ought to have put in a Reported-by: for you there).

But some communities are settling on Co-authored-by as a trailer for
this case. And GitHub has started parsing and showing that along with
author information:

  https://github.com/blog/2496-commit-together-with-co-authors

-Peff

Re: [PATCH 04/26] upload-pack: convert to a builtin

2018-02-21 Thread Junio C Hamano

Jonathan Nieder  writes:

> For defense in depth, it would be comforting if the git wrapper had
> some understanding of "don't support --help in handle_builtin when
> invoked as a dashed command".  That is, I don't expect that anyone has
> been relying on
>
>   git-add --help
>
> acting like
>
>   git help add
>
> instead of printing the usage message from
>
>   git add -h

Sounds like a neat trick.

> It's a little fussy because today we rewrite "git add --help" to
> "git-add --help" before rewriting it to "git help add"; we'd have to
> skip that middle hop for this to work.

I do not quite get this part.  "git add --help" goes through run_argv()
and then to handle_builtin() which is what does this "git help add"
swapping.

"git-add --help" does get thrown into the same codepath by
pretending as if we got "add --help" as an argument to "git"
command, and that happens without going through run_argv(),
so presumably we can add another perameter to handle_builtin()
so that the callee can tell these two invocation sites apart, no?

> I don't think that has to block this patch or series, though --- it's
> just a separate thought about hardening.

Yeah, I agree with this assessment.

Re: [PATCH v3 20/35] upload-pack: introduce fetch server command

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:57 -0800
Brandon Williams  wrote:

> +want 
> + Indicates to the server an object which the client wants to
> + retrieve.

Mention that the client can "want" anything even if not advertised by
the server (like uploadpack.allowanysha1inwant).

> +output = *section
> +section = (acknowledgments | packfile)
> +   (flush-pkt | delim-pkt)
> +
> +acknowledgments = PKT-LINE("acknowledgments" LF)
> +   *(ready | nak | ack)

Can this part be described more precisely in the BNF section? I see that
you describe later that there can be multiple ACKs or one NAK (but not
both), and "ready" can be sent regardless of whether ACKs or a NAK is
sent.

> +ready = PKT-LINE("ready" LF)
> +nak = PKT-LINE("NAK" LF)
> +ack = PKT-LINE("ACK" SP obj-id LF)
> +
> +packfile = PKT-LINE("packfile" LF)
> +[PACKFILE]
> +
> +
> +acknowledgments section
> + * Always begins with the section header "acknowledgments"
> +
> + * The server will respond with "NAK" if none of the object ids sent
> +   as have lines were common.
> +
> + * The server will respond with "ACK obj-id" for all of the
> +   object ids sent as have lines which are common.
> +
> + * A response cannot have both "ACK" lines as well as a "NAK"
> +   line.
> +
> + * The server will respond with a "ready" line indicating that
> +   the server has found an acceptable common base and is ready to
> +   make and send a packfile (which will be found in the packfile
> +   section of the same response)
> +
> + * If the client determines that it is finished with negotiations
> +   by sending a "done" line, the acknowledgments sections can be
> +   omitted from the server's response as an optimization.

Should this be changed to "must"? The current implementation does not
support it (on the client side).

> +#define UPLOAD_PACK_DATA_INIT { OBJECT_ARRAY_INIT, OID_ARRAY_INIT, 0, 0, 0, 
> 0, 0, 0 }

Optional: the trailing zeroes can be omitted. (That's shorter, and also
easier to maintain when we add new fields.)

> +int upload_pack_v2(struct repository *r, struct argv_array *keys,
> +struct argv_array *args)
> +{
> + enum fetch_state state = FETCH_PROCESS_ARGS;
> + struct upload_pack_data data = UPLOAD_PACK_DATA_INIT;
> + use_sideband = LARGE_PACKET_MAX;
> +
> + while (state != FETCH_DONE) {
> + switch (state) {
> + case FETCH_PROCESS_ARGS:
> + process_args(args, );
> +
> + if (!want_obj.nr) {
> + /*
> +  * Request didn't contain any 'want' lines,
> +  * guess they didn't want anything.
> +  */
> + state = FETCH_DONE;
> + } else if (data.haves.nr) {
> + /*
> +  * Request had 'have' lines, so lets ACK them.
> +  */
> + state = FETCH_SEND_ACKS;
> + } else {
> + /*
> +  * Request had 'want's but no 'have's so we can
> +  * immedietly go to construct and send a pack.
> +  */
> + state = FETCH_SEND_PACK;
> + }
> + break;
> + case FETCH_READ_HAVES:
> + read_haves();
> + state = FETCH_SEND_ACKS;
> + break;

This branch seems to never be taken?

> + case FETCH_SEND_ACKS:
> + if (process_haves_and_send_acks())
> + state = FETCH_SEND_PACK;
> + else
> + state = FETCH_DONE;
> + break;
> + case FETCH_SEND_PACK:
> + packet_write_fmt(1, "packfile\n");
> + create_pack_file();
> + state = FETCH_DONE;
> + break;
> + case FETCH_DONE:
> + continue;
> + }
> + }
> +
> + upload_pack_data_clear();
> + return 0;
> +}

Re: [PATCH] revision: drop --show-all option

2018-02-21 Thread Linus Torvalds

On Wed, Feb 21, 2018 at 3:27 PM, Jeff King  wrote:
>
> We'll skip the usual deprecation period because this was
> explicitly a debugging aid that was never documented.

Ack. I don't think I've used it since, and probably nobody else ever used it.

  Linus

Re: [PATCH v3 35/35] remote-curl: don't request v2 when pushing

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:13:12 -0800
Brandon Williams  wrote:

> +test_expect_success 'push with http:// and a config of v2 does not request 
> v2' '
> + # Till v2 for push is designed, make sure that if a client has
> + # protocol.version configured to use v2, that the client instead falls
> + # back and uses v0.
> +
> + test_commit -C http_child three &&
> +
> + # Push to another branch, as the target repository has the
> + # master branch checked out and we cannot push into it.
> + GIT_TRACE_PACKET=1 git -C http_child -c protocol.version=1 \
> + push origin HEAD:client_branch && 2>log &&

Should it be protocol.version=2? Also, two double ampersands?

Also, optionally, it might be better to do
GIT_TRACE_PACKET="$(pwd)/log", so that it does not get mixed with other
stderr output.

Re: [PATCH v3 0/3] Add "git rebase --show-current-patch"

2018-02-21 Thread Junio C Hamano

Nguyễn Thái Ngọc Duy   writes:

> Compared to v2:
>
> - the potential loss of errno before it's printed out in builtin/am.c
>   is fixed.
> - keep update_ref() in sequencer.c non-fatal like this rest
> - rename ORIG_COMMIT to REBASE_HEAD
>
> Interdiff:

This round hasn't seen any comments.  Is everybody happy with it?

I personally do not have strong opinion for the feature but didn't
spot anything against the execution, either, so...

Re: [PATCH 09/27] sha1_file: add raw_object_store argument to alt_odb_usable

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:12 -0800
Stefan Beller  wrote:

> Add a raw_object_store to alt_odb_usable to be more specific about which
> repository to act on. The choice of the repository is delegated to its
> only caller link_alt_odb_entry.
> 
> Signed-off-by: Stefan Beller 

I checked that alt_odb_usable no longer depends on any repository-like
globals.

Reviewed-by: Jonathan Tan

Re: [PATCH 03/27] object-store: move alt_odb_list and alt_odb_tail to object store

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> In a process with multiple repositories open, alternates should be
> associated to a single repository and not shared globally. Move
> alt_odb_list and alt_odb_tail into the_repository and adjust callers
> to reflect this.
> 
> Now that the alternative object data base is per repository, we're
> leaking its memory upon freeing a repository. The next patch plugs
> this hole.
> 
> No functional change intended.
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 

Looks good aside from the one spacing issue (though it's not that big of
a deal).

> ---
>  builtin/fsck.c |  4 +++-
>  object-store.h |  9 ++---
>  packfile.c |  3 ++-
>  sha1_file.c| 25 -
>  sha1_name.c|  3 ++-
>  5 files changed, 25 insertions(+), 19 deletions(-)
> 
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 7a8a679d4f..908e4f092a 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -1,5 +1,6 @@
>  #include "builtin.h"
>  #include "cache.h"
> +#include "repository.h"
>  #include "config.h"
>  #include "commit.h"
>  #include "tree.h"
> @@ -716,7 +717,8 @@ int cmd_fsck(int argc, const char **argv, const char 
> *prefix)
>   fsck_object_dir(get_object_directory());
>  
>   prepare_alt_odb();
> - for (alt = alt_odb_list; alt; alt = alt->next)
> + for (alt = the_repository->objects.alt_odb_list;
> + alt; alt = alt->next)

Indentation on this line looks odd.


-- 
Brandon Williams

Re: [PATCH 04/27] object-store: free alt_odb_list

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> Free the memory and reset alt_odb_{list, tail} to NULL.

Good to see memory leaks being avoided (well they will be on other
repository objects)

> 
> Signed-off-by: Stefan Beller 
> ---
>  object.c | 17 +
>  1 file changed, 17 insertions(+)
> 
> diff --git a/object.c b/object.c
> index 11d904c033..17b1da6d6b 100644
> --- a/object.c
> +++ b/object.c
> @@ -446,7 +446,24 @@ void clear_commit_marks_all(unsigned int flags)
>   }
>  }
>  
> +static void free_alt_odb(struct alternate_object_database *alt)
> +{
> + strbuf_release(>scratch);
> + oid_array_clear(>loose_objects_cache);
> +}
> +
> +static void free_alt_odbs(struct raw_object_store *o)
> +{
> + while (o->alt_odb_list) {
> + free_alt_odb(o->alt_odb_list);
> + o->alt_odb_list = o->alt_odb_list->next;
> + }
> +}
> +
>  void raw_object_store_clear(struct raw_object_store *o)
>  {
>   free(o->objectdir);
> +
> + free_alt_odbs(o);
> + o->alt_odb_tail = NULL;
>  }
> -- 
> 2.16.1.291.g4437f3f132-goog
> 

-- 
Brandon Williams

Re: Question about get_cached_commit_buffer()

2018-02-21 Thread Junio C Hamano

Jeff King  writes:

> I think that repeating the oid is intentional; the point is to dump how
> the traversal code is hitting the endpoints, even if we do so multiple
> times.
>
> The --oneline behavior just looks like a bug. I think --format is broken
> with --show-all, too (it does not show anything!).

I do not know about the --format thing, but the part about --oneline
being a bug is correct.  I've known about the oneline that does not
show anything other than the oid (not even end-of-line) for unparsed
commits for a long time---I just didn't bother looking into fixing
it exactly because this is only a debugging aid ;-)

> Though I think it would be equally correct to have set_commit_buffer()
> just throw away the existing cache entry and replace it with this one. I
> don't think there's a real reason to prefer the old to the new. And that
> might be worth doing if it would let us drop get_cached_commit_buffer()
> as a public function. But...
> ...
> In my opinion it's not really worth trying to make it private. The
> confusion you're fixing in the first two calls is not due to a bad API,
> but due to some subtly confusing logic in that code's use of the API. ;)

Yup.

> So I'd probably do this:
> ...

Makes sense to me.

Re: [PATCH v3 15/35] transport: convert get_refs_list to take a list of ref patterns

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:52 -0800
Brandon Williams  wrote:

> @@ -21,7 +22,8 @@ struct transport_vtable {
>* the ref without a huge amount of effort, it should store it
>* in the ref's old_sha1 field; otherwise it should be all 0.
>**/
> - struct ref *(*get_refs_list)(struct transport *transport, int for_push);
> + struct ref *(*get_refs_list)(struct transport *transport, int for_push,
> +  const struct argv_array *ref_patterns);

Also mention in the documentation that this function is allowed to
return refs that do not match the ref patterns.

Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Stefan Beller

> Subject: [PATCH] commit: drop uses of get_cached_commit_buffer()
> ---
>  builtin/rev-list.c | 2 +-
>  log-tree.c | 3 ---
>  2 files changed, 1 insertion(+), 4 deletions(-)

Now if we'd get around to rewrite pretty.c as well, we could make it static,
giving a stronger reason of not using that function. But it looks a bit
complicated to me, who is not familiar in that area of the code.

Thanks for making less use of this suboptimal API,
Stefan

Re: Bug: git log: boundary commits do not respect order (e.g. date-order, topo-order)

2018-02-21 Thread Josh Tepper

TYPO IN EXPECTED OUTPUT.

To avoid inevitable confusion, creating new thread "Bug: git log:
boundary commits do not respect order (e.g. date-order, topo-order)
2".

DON'T REPLY TO THIS MESSAGE.  Instead reply to the new message

On Wed, Feb 21, 2018 at 6:28 PM, Josh Tepper  wrote:
> When using git log, boundary commits (ie, those commits added by
> specifying --boundary) do not respect the order (e.g., --date-order,
> --topo-order).  Consider the following commit history, where number
> indicates the order of the commit timestamps:
>
> 
> 0125  <--A
>\ \
>  346  <--B
>
>
> Executing the following command:
>
> $ git log --boundary --date-order ^A B
>
> Should produce the following order (boundary commits shown with dashes):
> 6 -5 4 3 -1
>
> However, it in fact produces:
> 6 4 3 -7 -1
>
> Please advise.
>
> Best,
> ~Josh

Re: [PATCH v3 30/35] remote-curl: create copy of the service name

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:13:07 -0800
Brandon Williams  wrote:

> Make a copy of the service name being requested instead of relying on
> the buffer pointed to by the passed in 'const char *' to remain
> unchanged.
> 
> Signed-off-by: Brandon Williams 

Probably worth mentioning in the commit message:

  Currently, all service names are string constants, but a subsequent
  patch will introduce service names from external sources.

Other than that,

Reviewed-by: Jonathan Tan

Re: [PATCH 19/27] sha1_file: add repository argument to map_sha1_file_1

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:22 -0800
Stefan Beller  wrote:

> Add a repository argument to allow the map_sha1_file_1 caller to be
> more specific about which repository to act on. This is a small
> mechanical change; it doesn't change the implementation to handle
> repositories other than the_repository yet.
> 
> As with the previous commits, use a macro to catch callers passing a
> repository other than the_repository at compile time.
> 
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 

Reviewed-by: Jonathan Tan

Re: Git should preserve modification times at least on request

2018-02-21 Thread 'Peter Backes'

On Wed, Feb 21, 2018 at 06:58:34PM -0500, Randall S. Becker wrote:
> May I suggest storing the date/time in UTC+0 in all cases. I can see
> potential issues a couple of times a year where holes exist. I cannot even
> fathom what would happen on a merge or edit of history.

I consider storing the timestamp simply in the traditional 
seconds-since-epoch UNIX timestamp format. But I'm not entirely sure 
yet (see below).

If a timestamp includes the offset, there shouldn't be any issue with 
holes. UTC+0 is nice, too, of course, though some might want to 
preserve the timezone in which the timestamp was actually created.

The bigger issue is usually to copy with those pesky leap seconds. It 
makes a difference whether one uses solar seconds ("posix" style; those 
are more commonly seen) or atomic seconds ("right" style) for the UNIX 
timestamp. Those differences accumulate over time, so you can have 
almost half a minute delta if you are not careful with timestamp 
conversion. If I remember correctly, rcs uses some rather awkward 
interative convergence algorithm to portably convert from 
human-readable date and time to UNIX timestamps.

Thus I'm still not sure whether it will be a UNIX-format timestamp or 
whether a human-readable date/time might be preferrable.

Best wishes
Peter

-- 
Peter Backes, r...@helen.plasma.xg8.de

Re: [PATCH v2 0/3] Re: t7006: add tests for how git config paginates

2018-02-21 Thread Junio C Hamano

Thanks.  The entire thing looked reasonable.

Will replace.

Re: Git should preserve modification times at least on request

2018-02-21 Thread Peter Backes

On Wed, Feb 21, 2018 at 11:44:13PM +0100, Ævar Arnfjörð Bjarmason wrote:
> If it were added as a first-level feature to git it would present a lot
> of UX confusion. E.g. you run "git add" and it'll be showing the mtime
> somehow, or you get a formatted patch over E-Mail and it doesn't only
> include the commit time but also times for individual files.

But that's pretty standard. patch format has timestamp fields for 
precisely this purpose:

% echo a > x  
% echo b > y
% diff -u x y
--- x   2018-02-21 23:56:29.574029523 +0100
+++ y   2018-02-21 23:56:31.430003389 +0100
@@ -1 +1 @@
-a
+b

At present, git simply leaves those fields blank...

> The VC systems that had this feature in the past were centralized, so
> they could (in theory anyway) ensure that timestamps were monotonically
> increasing. This won't be the case with git, we have plenty of timestamp
> drift in e.g. linux.git and other git repos.

I don't see where monotonicity would be an issue any more than it is 
for centralized version control systems.

Even in the centralized setting, monotonicity is not guaranteed, since 
you might have local timestamps deviating from the repository; you 
might have added a line, compiled, and removed it again later on, 
without running make again. Now if you checkout changes from the 
repository, and it sets the timestamp, that timestamp might be older 
than before the compile, and the file would not be rebuilt if you run 
make. So you cannot avoid those issues in centralized setttings either.

> So if these mtimes were used by default they'd interact badly with stuff
> like "make" in those cases, because you might check out a modified
> version with a timestamp in the past.

That's very clearly the case, and I have stressed in my initial email 
that I fully agree with the reasoning of the FAQ in this regard. It is, 
however, merely an argument against *restoring* the timestamps *by 
default*, to comply with the principle of least astonishment. It is, by 
itself, not an argument against *storing* the timestamps, let alone 
against restoring them *on request*.

For the initial checkout, it should not even be harmful to restore the 
timestamps by default.

> any case, I just wanted to point out a workaround (but then digressed
> into critiquing the idea above...).

Well, Johannes's proposed solution seems pretty reasonable and 
realistic to me.  Thanks to Phillip's hint about unquote_path() in 
Git.pm it seems I now have all the needed ingredients to implement this 
feature.

Best wishes
Peter

-- 
Peter Backes, r...@helen.plasma.xg8.de

Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 02:17:11PM -0500, Derrick Stolee wrote:

> The get_cached_commit_buffer() method provides access to the buffer
> loaded for a struct commit, if it was ever loadead and was not freed.
> 
> Two places use this to inform how to output information about commits.
> 
> log-tree.c uses this method to short-circuit the output of commit
> information when the buffer is not cached. However, this leads to
> incorrect output in 'git log --oneline' where the short-OID is written
> but then the rest of the commit information is dropped and the next
> commit is written on the same line.
> 
> rev-list uses this method for two reasons:
> 
> - First, if the revision walk visits a commit twice, the buffer was
>   freed by rev-list in the first write. The output then does not
>   match the format expectations, since the OID is written without the
>   rest of the content.

I'm not sure after my earlier digging if there is even a way to trigger
this (and if so, it is probably accidental, since those lines were added
explicitly for --show-all).

And actually after re-reading the commit message for 3131b7130 again, I
think the current behavior is definitely not something that was
carefully planned. So I'd propose a commit message like below.

-- >8 --
Subject: [PATCH] commit: drop uses of get_cached_commit_buffer()

The "--show-all" revision option shows UNINTERESTING
commits. Some of these commits may be unparsed when we try
to show them (since we may or may not need to walk their
parents to fulfill the request).

Commit 3131b71301 (Add "--show-all" revision walker flag for
debugging, 2008-02-09) resolved this by just skipping
pretty-printing for commits without their object contents
cached, saying:

  Because we now end up listing commits we may not even have been parsed
  at all "show_log" and "show_commit" need to protect against commits
  that don't have a commit buffer entry.

That was the easy fix to avoid the pretty-printer segfaulting,
but:

  1. It doesn't work for all formats. E.g., --oneline
 prints the oid for each such commit but not a trailing
 newline, leading to jumbled output.

  2. It only affects some commits, depending on whether we
 happened to parse them or not (so if they were at the
 tip of an UNINTERESTING starting point, or if we
 happened to traverse over them, you'd see more data).

  3. It unncessarily ties the decision to show the verbose
 header to whether the commit buffer was cached. That
 makes it harder to change the logic around caching
 (e.g., if we could traverse without actually loading
 the full commit objects).

These days it's safe to feed such a commit to the
pretty-print code. Since be5c9fb904 (logmsg_reencode: lazily
load missing commit buffers, 2013-01-26), we'll load it on
demand in such a case. So let's just always show the verbose
headers.

This does change the behavior of plumbing, but:

  a. The --show-all option was explicitly introduced as a
 debugging aid, and was never documented (and has rarely
 even been mentioned on the list by git devs).

  b. Avoiding the commits was already not deterministic due
 to (2) above. So the caller might have seen full
 headers for these commits anyway, and would need to be
 prepared for it.

Signed-off-by: Jeff King 
---
 builtin/rev-list.c | 2 +-
 log-tree.c | 3 ---
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 48300d9e11..d320b6f1e3 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -134,7 +134,7 @@ static void show_commit(struct commit *commit, void *data)
else
putchar('\n');

-   if (revs->verbose_header && get_cached_commit_buffer(commit, NULL)) {
+   if (revs->verbose_header) {
struct strbuf buf = STRBUF_INIT;
struct pretty_print_context ctx = {0};
ctx.abbrev = revs->abbrev;
diff --git a/log-tree.c b/log-tree.c
index fc0cc0d6d1..22b2fb6c58 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -659,9 +659,6 @@ void show_log(struct rev_info *opt)
show_mergetag(opt, commit);
}

-   if (!get_cached_commit_buffer(commit, NULL))
-   return;
-
if (opt->show_notes) {
int raw;
struct strbuf notebuf = STRBUF_INIT;
-- 
2.16.2.555.g885a024879

[PATCH] revision: drop --show-all option

2018-02-21 Thread Jeff King

On Wed, Feb 21, 2018 at 03:03:18PM -0800, Junio C Hamano wrote:

> > So what I'm wondering is whether we should consider just ripping it out
> > (but I'm OK with keeping it, as once the commit-buffer stuff is fixed,
> > it's probably not hurting anybody).
> 
> I see no problem in removing it.  With more "interesting" features
> relying on post-processing (like 'simplify-merges'), show_all whose
> primary focus was how limit_list() behaves soft of outlived its
> usefulness, I would think.

So here's a patch to do that (textually independent but conceptually on
top of the earlier one to fix the commit-buffer thing). It actually
doesn't remove that much code, so I could go either way on it.

+cc Linus in case he's secretly in love with this feature.

-- >8 --
Subject: [PATCH] revision: drop --show-all option

This was an undocumented debugging aid that does not seem to
have come in handy in the past decade, judging from its lack
of mentions on the mailing list.

Let's drop it in the name of simplicity. This is morally a
revert of 3131b71301 (Add "--show-all" revision walker flag
for debugging, 2008-02-09), but note that I did leave in the
mapping of UNINTERESTING to "^" in get_revision_mark(). I
don't think this would be possible to trigger with the
current code, but it's the only sensible marker.

We'll skip the usual deprecation period because this was
explicitly a debugging aid that was never documented.

Signed-off-by: Jeff King 
---
 revision.c   | 10 -
 revision.h   |  1 -
 t/t6015-rev-list-show-all-parents.sh | 31 
 3 files changed, 42 deletions(-)
 delete mode 100755 t/t6015-rev-list-show-all-parents.sh

diff --git a/revision.c b/revision.c
index 5ce9b93baa..5c1cb7277c 100644
--- a/revision.c
+++ b/revision.c
@@ -1065,14 +1065,9 @@ static int limit_list(struct rev_info *revs)
return -1;
if (obj->flags & UNINTERESTING) {
mark_parents_uninteresting(commit);
-   if (revs->show_all)
-   p = _list_insert(commit, p)->next;
slop = still_interesting(list, date, slop, 
_cache);
if (slop)
continue;
-   /* If showing all, add the whole pending list to the 
end */
-   if (revs->show_all)
-   *p = list;
break;
}
if (revs->min_age != -1 && (commit->date > revs->min_age))
@@ -1864,8 +1859,6 @@ static int handle_revision_opt(struct rev_info *revs, int 
argc, const char **arg
revs->dense = 1;
} else if (!strcmp(arg, "--sparse")) {
revs->dense = 0;
-   } else if (!strcmp(arg, "--show-all")) {
-   revs->show_all = 1;
} else if (!strcmp(arg, "--in-commit-order")) {
revs->tree_blobs_in_commit_order = 1;
} else if (!strcmp(arg, "--remove-empty")) {
@@ -3094,8 +3087,6 @@ enum commit_action get_commit_action(struct rev_info 
*revs, struct commit *commi
return commit_ignore;
if (revs->unpacked && has_sha1_pack(commit->object.oid.hash))
return commit_ignore;
-   if (revs->show_all)
-   return commit_show;
if (commit->object.flags & UNINTERESTING)
return commit_ignore;
if (revs->min_age != -1 &&
@@ -3194,7 +3185,6 @@ enum commit_action simplify_commit(struct rev_info *revs, 
struct commit *commit)
enum commit_action action = get_commit_action(revs, commit);
 
if (action == commit_show &&
-   !revs->show_all &&
revs->prune && revs->dense && want_ancestry(revs)) {
/*
 * --full-diff on simplified parents is no good: it
diff --git a/revision.h b/revision.h
index 3dee97bfb9..b8c47b98e2 100644
--- a/revision.h
+++ b/revision.h
@@ -90,7 +90,6 @@ struct rev_info {
unsigned intdense:1,
prune:1,
no_walk:2,
-   show_all:1,
remove_empty_trees:1,
simplify_history:1,
topo_order:1,
diff --git a/t/t6015-rev-list-show-all-parents.sh 
b/t/t6015-rev-list-show-all-parents.sh
deleted file mode 100755
index 3c73c93ba6..00
--- a/t/t6015-rev-list-show-all-parents.sh
+++ /dev/null
@@ -1,31 +0,0 @@
-#!/bin/sh
-
-test_description='--show-all --parents does not rewrite TREESAME commits'
-
-. ./test-lib.sh
-
-test_expect_success 'set up --show-all --parents test' '
-   test_commit one foo.txt &&
-   commit1=$(git rev-list -1 HEAD) &&
-   test_commit two bar.txt &&
-   commit2=$(git rev-list -1 HEAD) &&
-   test_commit three foo.txt &&
-   commit3=$(git rev-list -1 HEAD)
-   '
-
-test_expect_success '--parents rewrites

Re: [PATCH v5 00/17] document & test fetch pruning & add fetch.pruneTags

2018-02-21 Thread Junio C Hamano

Ævar Arnfjörð Bjarmason   writes:

> Here's a v5 (correct subject line this time!). Many thanks to Eric for
> a thorough review.

We haven't seen any comments on this round.  Is everybody happy?

I do not have a strong opinion on the new feature, either for or
against.  I didn't find anything majorly questionable in the
execution, though, so...

Re: [PATCH 22/27] sha1_file: allow sha1_file_name to handle arbitrary repositories

2018-02-21 Thread Jonathan Tan

On Tue, 20 Feb 2018 17:54:25 -0800
Stefan Beller  wrote:

> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 

Reviewed-by: Jonathan Tan 

> -void sha1_file_name_the_repository(struct strbuf *buf, const unsigned char 
> *sha1)
> +void sha1_file_name(struct repository *r, struct strbuf *buf, const unsigned 
> char *sha1)
>  {
> - strbuf_addstr(buf, get_object_directory());
> + strbuf_addstr(buf, r->objects.objectdir);
>   strbuf_addch(buf, '/');
>   fill_sha1_path(buf, sha1);
>  }

In the future, we should probably have:
 - a function to get the object store out of a repo (so that it can
   lazily initialize the object store struct if necessary)
 - when the object store is obtained, its objectdir field is guaranteed
   to be populated
 - sha1_file_name should take the object store struct, not the repo
   struct

but this is outside the scope of this patch.

Re: [PATCH 06/27] object-store: close all packs upon clearing the object store

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> Signed-off-by: Stefan Beller 

Straight forward change, looks good.

> ---
>  builtin/am.c   | 2 +-
>  builtin/clone.c| 2 +-
>  builtin/fetch.c| 2 +-
>  builtin/merge.c| 2 +-
>  builtin/receive-pack.c | 2 +-
>  object.c   | 6 ++
>  packfile.c | 4 ++--
>  packfile.h | 2 +-
>  8 files changed, 10 insertions(+), 12 deletions(-)
> 
> diff --git a/builtin/am.c b/builtin/am.c
> index 5bdd2d7578..4762a702e3 100644
> --- a/builtin/am.c
> +++ b/builtin/am.c
> @@ -1859,7 +1859,7 @@ static void am_run(struct am_state *state, int resume)
>*/
>   if (!state->rebasing) {
>   am_destroy(state);
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
>   }
>  }
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 101c27a593..13cfaa6488 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1217,7 +1217,7 @@ int cmd_clone(int argc, const char **argv, const char 
> *prefix)
>   transport_disconnect(transport);
>  
>   if (option_dissociate) {
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   dissociate_from_references();
>   }
>  
> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 8ee998ea2e..4d72efca78 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1478,7 +1478,7 @@ int cmd_fetch(int argc, const char **argv, const char 
> *prefix)
>  
>   string_list_clear(, 0);
>  
> - close_all_packs();
> + close_all_packs(_repository->objects);
>  
>   argv_array_pushl(_gc_auto, "gc", "--auto", NULL);
>   if (verbosity < 0)
> diff --git a/builtin/merge.c b/builtin/merge.c
> index 30264cfd7c..907ae44ab5 100644
> --- a/builtin/merge.c
> +++ b/builtin/merge.c
> @@ -411,7 +411,7 @@ static void finish(struct commit *head_commit,
>* We ignore errors in 'gc --auto', since the
>* user should see them.
>*/
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   run_command_v_opt(argv_gc_auto, RUN_GIT_CMD);
>   }
>   }
> diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> index b2eac79a6e..954fc72c7c 100644
> --- a/builtin/receive-pack.c
> +++ b/builtin/receive-pack.c
> @@ -2027,7 +2027,7 @@ int cmd_receive_pack(int argc, const char **argv, const 
> char *prefix)
>   proc.git_cmd = 1;
>   proc.argv = argv_gc_auto;
>  
> - close_all_packs();
> + close_all_packs(_repository->objects);
>   if (!start_command()) {
>   if (use_sideband)
>   copy_to_sideband(proc.err, -1, NULL);
> diff --git a/object.c b/object.c
> index c76b62572a..34daaf37b3 100644
> --- a/object.c
> +++ b/object.c
> @@ -4,6 +4,7 @@
>  #include "tree.h"
>  #include "commit.h"
>  #include "tag.h"
> +#include "packfile.h"
>  
>  static struct object **obj_hash;
>  static int nr_objs, obj_hash_size;
> @@ -469,8 +470,5 @@ void raw_object_store_clear(struct raw_object_store *o)
>  
>   while (!list_empty(>packed_git_mru))
>   list_del(>packed_git_mru);
> - /*
> -  * TODO: call close_all_packs once migrated to
> -  * take an object store argument
> -  */
> + close_all_packs(o);
>  }
> diff --git a/packfile.c b/packfile.c
> index d41e4c83d0..511a2b0cdf 100644
> --- a/packfile.c
> +++ b/packfile.c
> @@ -311,11 +311,11 @@ static void close_pack(struct packed_git *p)
>   close_pack_index(p);
>  }
>  
> -void close_all_packs(void)
> +void close_all_packs(struct raw_object_store *o)
>  {
>   struct packed_git *p;
>  
> - for (p = the_repository->objects.packed_git; p; p = p->next)
> + for (p = o->packed_git; p; p = p->next)
>   if (p->do_not_close)
>   die("BUG: want to close pack marked 'do-not-close'");
>   else
> diff --git a/packfile.h b/packfile.h
> index a7fca598d6..6a2c57045c 100644
> --- a/packfile.h
> +++ b/packfile.h
> @@ -63,7 +63,7 @@ extern void close_pack_index(struct packed_git *);
>  
>  extern unsigned char *use_pack(struct packed_git *, struct pack_window **, 
> off_t, unsigned long *);
>  extern void close_pack_windows(struct packed_git *);
> -extern void close_all_packs(void);
> +extern void close_all_packs(struct raw_object_store *o);
>  extern void unuse_pack(struct pack_window **);
>  extern void clear_delta_base_cache(void);
>  extern struct packed_git *add_packed_git(const char *path, size_t path_len, 
> int local);
> -- 
> 2.16.1.291.g4437f3f132-goog
> 

-- 
Brandon Williams

Re: bug in HTTP protocol spec

2018-02-21 Thread Dorian Taylor


> On Feb 21, 2018, at 9:37 PM, Jonathan Nieder  wrote:
> 
> Thanks for writing it.
> 
> Do you mind if we forge your sign-off?  (See Documentation/SubmittingPatches
> item '(5) Certify your work' for details about what this means.)

Sure, or I can just re-paste:

Signed-off-by: Dorian Taylor 

---
Documentation/technical/http-protocol.txt | 10 +++---
1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/Documentation/technical/http-protocol.txt 
b/Documentation/technical/http-protocol.txt
index a0e45f2889e6e..19d73f7efb338 100644
--- a/Documentation/technical/http-protocol.txt
+++ b/Documentation/technical/http-protocol.txt
@@ -214,14 +214,17 @@ smart server reply:
  S: Cache-Control: no-cache
  S:
  S: 001e# service=git-upload-pack\n
+   S: 
  S: 004895dcfa3633004da0049d3d0fa03f80589cbcaf31 refs/heads/maint\0multi_ack\n
  S: 0042d049f6c27a2244e12041955e262a404c7faba355 refs/heads/master\n
  S: 003c2cb58b79488a98d2721cea644875a8dd0026b115 refs/tags/v1.0\n
  S: 003fa3c2e2402b99163d1d59756e5f207ae21cccba4c refs/tags/v1.0^{}\n
+   S: 

The client may send Extra Parameters (see
Documentation/technical/pack-protocol.txt) as a colon-separated string
-in the Git-Protocol HTTP header.
+in the Git-Protocol HTTP header. Note as well that there is *no* newline
+after the ``.

Dumb Server Response

@@ -264,8 +267,8 @@ Servers MUST set $servicename to be the request parameter 
value.
Servers SHOULD include an LF at the end of this line.
Clients MUST ignore an LF at the end of the line.

-Servers MUST terminate the response with the magic `` end
-pkt-line marker.
+Servers MUST follow the first pkt-line, as well as terminate the
+response, with the magic `` end pkt-line marker.

The returned response is a pkt-line stream describing each ref and
its known value.  The stream SHOULD be sorted by name according to
@@ -278,6 +281,7 @@ Extra Parameter.

 smart_reply =  PKT-LINE("# service=$servicename" LF)
 *1("version 1")
+""
 ref_list
 ""
 ref_list=  empty_list / non_empty_list

---

> 
>> Note I am not sure what the story is behind that `version 1`
>> element, whether it's supposed to go before or after the null packet
>> or if there should be another null packet or what. Perhaps somebody
>> more fluent with the smart protocol can advise.
> 
> I believe the 'version 1' goes after the flush-packet.

I took a traipse through the code and couldn’t determine it one way or another, 
but my money is on that looking something like `000aversion 1\n` on the wire.

--
Dorian Taylor
Make things. Make sense.
https://doriantaylor.com



signature.asc
Description: Message signed with OpenPGP using GPGMail

Re: test bare repository for unit tests

2018-02-21 Thread Basin Ilya

Hi.
git-fast-export would work too, but it creates an additional step. I don't 
commit to the model repo during tests, but I do commit when I want to modify 
the tests.
So far, I configured core.compression=0 and gc.auto=0, created the 
.gitattributes inside the bare repo dir containing one line: * binary
I also created two empty .gitignore files inside refs/ and objects/

I still haven't found a way to force prune without pack after each push.

On 22.02.2018 1:53, Ævar Arnfjörð Bjarmason wrote:
> 
> On Wed, Feb 21 2018, Basin Ilya jotted:
> 
>> Hi.
>> I want to the test-repo-git under 
>> https://github.com/apache/maven-wagon/tree/master/wagon-providers/wagon-scm/src/test/resources/
>> just like test-repo-cvs and test-repo-svn
>>
>> Which configuation options would suit that?
>> I think core.compression 0 for human readable diffs.
>> also, I need to force loose, gc after each push.
> 
> It looks like you have unit tests that are going to do integration tests
> of some SVN/CVS repos as used by some other tool, and want to add git to
> that.
> 
> Since you have git already, the most straightforward thing to do would
> be to ship the output of git-fast-export in the repo, and have the test
> setup code create a repo locally out of that, then test it.
> 
> Or do you really need to commit the raw repo files as-is for some reason
> I've missed?
>

Re: [PATCH v3 08/35] connect: discover protocol version outside of get_remote_heads

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:45 -0800
Brandon Williams  wrote:

> - get_remote_heads(fd[0], NULL, 0, , 0, NULL, );
> +
> + packet_reader_init(, fd[0], NULL, 0,
> +PACKET_READ_CHOMP_NEWLINE |
> +PACKET_READ_GENTLE_ON_EOF);
> +
> + switch (discover_version()) {
> + case protocol_v1:
> + case protocol_v0:
> + get_remote_heads(, , 0, NULL, );
> + break;
> + case protocol_unknown_version:
> + BUG("unknown protocol version");
> + }

This inlining is repeated a few times, which raises the question: if the
intention is to keep the v0/1 logic separately from v2, why not have a
single function that wraps them all? Looking at the end result (after
all the patches in this patch set are applied), it seems that the v2
version does not have extra_have or shallow parameters, which is a good
enough reason for me (I don't think functions that take in many
arguments and then selectively use them is a good idea). I think that
other reviewers will have this question too, so maybe discuss this in
the commit message.

> diff --git a/remote.h b/remote.h
> index 1f6611be2..2016461df 100644
> --- a/remote.h
> +++ b/remote.h
> @@ -150,10 +150,11 @@ int check_ref_type(const struct ref *ref, int flags);
>  void free_refs(struct ref *ref);
>  
>  struct oid_array;
> -extern struct ref **get_remote_heads(int in, char *src_buf, size_t src_len,
> +struct packet_reader;
> +extern struct ref **get_remote_heads(struct packet_reader *reader,
>struct ref **list, unsigned int flags,
>struct oid_array *extra_have,
> -  struct oid_array *shallow);
> +  struct oid_array *shallow_points);

This change probably does not belong in this patch, especially since
remote.c is unchanged.

Proposal

2018-02-21 Thread melisa mehmet

Hello

Greetings to you and everyone around you please did you get my previous email 
regarding my proposal ?
please let me know if we can work together on this.

Best Reagrds

Re: [PATCH v3 12/35] serve: introduce git-serve

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:12:49 -0800
Brandon Williams  wrote:

>  .gitignore  |   1 +
>  Documentation/technical/protocol-v2.txt | 114 +++
>  Makefile|   2 +
>  builtin.h   |   1 +
>  builtin/serve.c |  30 
>  git.c   |   1 +
>  serve.c | 250 
> 
>  serve.h |  15 ++
>  t/t5701-git-serve.sh|  60 
>  9 files changed, 474 insertions(+)
>  create mode 100644 Documentation/technical/protocol-v2.txt
>  create mode 100644 builtin/serve.c
>  create mode 100644 serve.c
>  create mode 100644 serve.h
>  create mode 100755 t/t5701-git-serve.sh

As someone who is implementing the server side of protocol V2 in JGit, I
now have a bit more insight into this :-)

First of all, I used to not have a strong opinion on the existence of a
new endpoint, but now I think that it's better to *not* have git-serve.
As it is, as far as I can tell, upload-pack also needs to support (and
does support, as of the end of this patch set) protocol v2 anyway, so it
might be better to merely upgrade upload-pack.

> +A client then responds to select the command it wants with any particular
> +capabilities or arguments.  There is then an optional section where the
> +client can provide any command specific parameters or queries.
> +
> +command-request = command
> +   capability-list
> +   (command-args)

If you are stating that this is optional, write "*1command-args". (RFC
5234 also supports square brackets, but "*1" is already used in
pack-protocol.txt and http-protocol.txt.)

> +   flush-pkt
> +command = PKT-LINE("command=" key LF)
> +command-args = delim-pkt
> +*arg
> +arg = 1*CHAR

arg should be wrapped in PKT-LINE, I think, and terminated by an LF.

Re: Git should preserve modification times at least on request

2018-02-21 Thread Ævar Arnfjörð Bjarmason

On Wed, Feb 21 2018, Peter Backes jotted:

> On Wed, Feb 21, 2018 at 10:33:05PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> This sounds like a sensible job for a git import tool, i.e. import a
>> target directory into git, and instead of 'git add'-ing the whole thing
>> it would look at the mtimes, sort files by mtime, then add them in order
>> and only commit those files that had the same mtime in the same commit
>> (or within some boundary).
>
> I think that this would be The Wrong Thing to do.

I'm merely pointing out that if you have the use-case Derek Fawcus
describes you can get per-file mtimes via something similar to the the
hook method Theodore Ts'o described today with a simple import tool with
no changes to git or its object format required.

To the extent that there's a convention for this in git that's the
convention, e.g. if you use github or gitlab they'll render the
modification time of a file in the tree view, and that time is the time
of the commit that last touched it: https://github.com/git/git

> The commit time is just that: The time the commit was done. The commit
> is an atomic group of changes to a number of files that hopefully bring
> the tree from one usable state into the next.
>
> The mtime, in contrast, tells us when a file was most recently modified.
>
> It may well be that main.c was most recently modified yesterday, and
> feature.c was modified this morning, and that only both changes taken
> together make sense as a commit, despite the long time in between.
>
> Even worse, it may be that feature A took a long time to implement, so
> we have huge gaps in between the mtimes, but feature B was quickly done
> after A was finished.

...

> Such an algorithm would probably split feature A
> incorrectly into several commits, and group the more recently changed
> files of feature A with those of feature B.

Right, but that's a trade-off you can pick at import time in this
hypothetical tar-to-commits tool, you could decide to do no merging and
suffer to signal loss.

> And if Feature A and Feature B were developed in parallel, things get
> completely messy.
>
>> The advantage of doing this via such a tool is that you could tweak it
>> to commit by any criteria you wanted, e.g. not mtime but ctime or even
>> atime.
>
> Maybe, but it would be rather useless to commit by ctime or atime. You
> do one grep -r and the atime is different. You do one chmod or chown
> and the ctime is different. Those timestamps are really only useful for
> very limited purposes.
>
> That ctime exists seems reasonable, since it's only ever updated when
> the inode is written anyway.
>
> atime, in contrast, was clearly one of the rather nonsensical
> innovations of UNIX: Do one write to the disk for each read from the
> disk. C'mon, really? It would have been a lot more reasonable to simply
> provide a generic way for tracing read() system calls instead; then
> userspace could decide what to do with that information and which of it
> is useful and should be kept and perhaps stored on disk. Now we have
> this ugly hack called relatime to deal with the problem.

Yes, that [ac]time example was a stretch. A better example would be
committing the file mode, or extended attributes, or "this is on a
different FS", or whatever other per-file/dir attribute we're not
currently capturing.

>> You'd get the same thing as you'd get if git's tree format would change
>> to include mtimes (which isn't going to happen), but with a lot more
>> flexibility.
>
> Well, from basic logic, I don't see how a decision not to implement a
> feature could possibly increase flexility. The opposite seems to be the
> case.

I'm not trying to argue the usefulness of this mtime-per-file thing in
theory, just providing Derek Fawcus with a suggestion for a viable
workaround.

What I meant by this offhand comment, and which you may or may not know
(and I see no references to it from skimming the thread) is that there's
simply no space in the tree objects to add *anything* without breaking
the object format and requiring a major upgrade, although the plan to
switch to a new hash function is relevant to this.

Even if we suppose that git was being implemented today I don't think
this would make any sense as a first-level feature.

Empirical evidence suggests that people use git on a massive scale
largely without caring about this, and the users who do have a
workaround.

If it were added as a first-level feature to git it would present a lot
of UX confusion. E.g. you run "git add" and it'll be showing the mtime
somehow, or you get a formatted patch over E-Mail and it doesn't only
include the commit time but also times for individual files.

The VC systems that had this feature in the past were centralized, so
they could (in theory anyway) ensure that timestamps were monotonically
increasing. This won't be the case with git, we have plenty of timestamp
drift in e.g. linux.git and other git repos.

So if these mtimes were used by default they'd

Proposal

2018-02-21 Thread melisa mehmet

Hello

Greetings to you and everyone around you please did you get my previous email 
regarding my proposal ?
please let me know if we can work together on this.

Best Reagrds

Re: [PATCH v3 24/35] connect: refactor git_connect to only get the protocol version once

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:13:01 -0800
Brandon Williams  wrote:

> Instead of having each builtin transport asking for which protocol
> version the user has configured in 'protocol.version' by calling
> `get_protocol_version_config()` multiple times, factor this logic out
> so there is just a single call at the beginning of `git_connect()`.
> 
> This will be helpful in the next patch where we can have centralized
> logic which determines if we need to request a different protocol
> version than what the user has configured.
> 
> Signed-off-by: Brandon Williams 

Reviewed-by: Jonathan Tan

Re: [PATCH v3 28/35] transport-helper: introduce stateless-connect

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:13:05 -0800
Brandon Williams  wrote:

> Introduce the transport-helper capability 'stateless-connect'.  This
> capability indicates that the transport-helper can be requested to run
> the 'stateless-connect' command which should attempt to make a
> stateless connection with a remote end.  Once established, the
> connection can be used by the git client to communicate with
> the remote end natively in a stateless-rpc manner as supported by
> protocol v2.  This means that the client must send everything the server
> needs in a single request as the client must not assume any
> state-storing on the part of the server or transport.

Maybe it's worth mentioning that support in the actual remote helpers
will be added in a subsequent patch.

> If a stateless connection cannot be established then the remote-helper
> will respond in the same manner as the 'connect' command indicating that
> the client should fallback to using the dumb remote-helper commands.

This makes sense, but there doesn't seem to be any code in this patch
that implements this.

> @@ -612,6 +615,11 @@ static int process_connect_service(struct transport 
> *transport,
>   if (data->connect) {
>   strbuf_addf(, "connect %s\n", name);
>   ret = run_connect(transport, );
> + } else if (data->stateless_connect) {
> + strbuf_addf(, "stateless-connect %s\n", name);
> + ret = run_connect(transport, );
> + if (ret)
> + transport->stateless_rpc = 1;

Why is process_connect_service() falling back to stateless_connect if
connect doesn't work? I don't think this fallback would work, as a
client that needs "connect" might need its full capabilities.

Re: [PATCH 16/27] sha1_file: add repository argument to sha1_file_name

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> Add a repository argument to allow sha1_file_name callers to be more
> specific about which repository to handle. This is a small mechanical
> change; it doesn't change the implementation to handle repositories
> other than the_repository yet.
> 
> As with the previous commits, use a macro to catch callers passing a
> repository other than the_repository at compile time.
> 
> While at it, move the declaration to object-store.h, where it should
> be easier to find.

Seems like we may want to make a sha1-file.h or an oid-file.h or
something like that at some point as that seems like a better place for
the function than in the object-store.h file?


-- 
Brandon Williams

Re: [PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Derrick Stolee


On 2/21/2018 6:13 PM, Jeff King wrote:

On Wed, Feb 21, 2018 at 02:17:11PM -0500, Derrick Stolee wrote:


The get_cached_commit_buffer() method provides access to the buffer
loaded for a struct commit, if it was ever loadead and was not freed.

Two places use this to inform how to output information about commits.

log-tree.c uses this method to short-circuit the output of commit
information when the buffer is not cached. However, this leads to
incorrect output in 'git log --oneline' where the short-OID is written
but then the rest of the commit information is dropped and the next
commit is written on the same line.

rev-list uses this method for two reasons:

- First, if the revision walk visits a commit twice, the buffer was
   freed by rev-list in the first write. The output then does not
   match the format expectations, since the OID is written without the
   rest of the content.

I'm not sure after my earlier digging if there is even a way to trigger
this (and if so, it is probably accidental, since those lines were added
explicitly for --show-all).

And actually after re-reading the commit message for 3131b7130 again, I
think the current behavior is definitely not something that was
carefully planned. So I'd propose a commit message like below.


I only submitted my patch to avoid making you do the work of writing the 
commit message. My messages still don't have quite the right amount of 
detail (or the correct details, in this case).


Junio: please add

Reported-by: Derrick Stolee 

Thanks,

-Stolee




-- >8 --
Subject: [PATCH] commit: drop uses of get_cached_commit_buffer()

The "--show-all" revision option shows UNINTERESTING
commits. Some of these commits may be unparsed when we try
to show them (since we may or may not need to walk their
parents to fulfill the request).

Commit 3131b71301 (Add "--show-all" revision walker flag for
debugging, 2008-02-09) resolved this by just skipping
pretty-printing for commits without their object contents
cached, saying:

   Because we now end up listing commits we may not even have been parsed
   at all "show_log" and "show_commit" need to protect against commits
   that don't have a commit buffer entry.

That was the easy fix to avoid the pretty-printer segfaulting,
but:

   1. It doesn't work for all formats. E.g., --oneline
  prints the oid for each such commit but not a trailing
  newline, leading to jumbled output.

   2. It only affects some commits, depending on whether we
  happened to parse them or not (so if they were at the
  tip of an UNINTERESTING starting point, or if we
  happened to traverse over them, you'd see more data).

   3. It unncessarily ties the decision to show the verbose
  header to whether the commit buffer was cached. That
  makes it harder to change the logic around caching
  (e.g., if we could traverse without actually loading
  the full commit objects).

These days it's safe to feed such a commit to the
pretty-print code. Since be5c9fb904 (logmsg_reencode: lazily
load missing commit buffers, 2013-01-26), we'll load it on
demand in such a case. So let's just always show the verbose
headers.

This does change the behavior of plumbing, but:

   a. The --show-all option was explicitly introduced as a
  debugging aid, and was never documented (and has rarely
  even been mentioned on the list by git devs).

   b. Avoiding the commits was already not deterministic due
  to (2) above. So the caller might have seen full
  headers for these commits anyway, and would need to be
  prepared for it.

Signed-off-by: Jeff King 
---
  builtin/rev-list.c | 2 +-
  log-tree.c | 3 ---
  2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 48300d9e11..d320b6f1e3 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -134,7 +134,7 @@ static void show_commit(struct commit *commit, void *data)
else
putchar('\n');
  
-	if (revs->verbose_header && get_cached_commit_buffer(commit, NULL)) {

+   if (revs->verbose_header) {
struct strbuf buf = STRBUF_INIT;
struct pretty_print_context ctx = {0};
ctx.abbrev = revs->abbrev;
diff --git a/log-tree.c b/log-tree.c
index fc0cc0d6d1..22b2fb6c58 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -659,9 +659,6 @@ void show_log(struct rev_info *opt)
show_mergetag(opt, commit);
}
  
-	if (!get_cached_commit_buffer(commit, NULL))

-   return;
-
if (opt->show_notes) {
int raw;
struct strbuf notebuf = STRBUF_INIT;

Bug Report: git status triggers a file change event

2018-02-21 Thread Raining Chain

On Windows 10, git version 2.16.2.windows.1, running the command

git status

will trigger a file change event to file C:\myPath\.git  "Attributes changed."

This causes problems when using scripts that detect file changes such
as tsc -w (Typescript compiler) and using softwares that regularly
call git status such as VisualStudioCode.

Thanks.

Re: Bug Report: git status triggers a file change event

2018-02-21 Thread Jonathan Nieder

+git-for-windows
Hi,

Raining Chain wrote:

> On Windows 10, git version 2.16.2.windows.1, running the command
>
> git status
>
> will trigger a file change event to file C:\myPath\.git  "Attributes changed."
>
> This causes problems when using scripts that detect file changes such
> as tsc -w (Typescript compiler) and using softwares that regularly
> call git status such as VisualStudioCode.
>
> Thanks.

Can you say more about how "tsc -w" reacts to the file change?  Is there
a way to tell it to exclude particular files from the files it watches?

Thanks,
Jonathan

Re: bug in HTTP protocol spec

2018-02-21 Thread Jonathan Nieder

(+cc: bmwill@, who is working on protocol v2)
Hi,

Dorian Taylor wrote:
> On Feb 21, 2018, at 2:15 PM, Jeff King  wrote:

>> Thanks, I agree the document is buggy. Do you want to submit a patch?
>
> Will this do?

Thanks for writing it.

Do you mind if we forge your sign-off?  (See Documentation/SubmittingPatches
item '(5) Certify your work' for details about what this means.)

> Note I am not sure what the story is behind that `version 1`
> element, whether it's supposed to go before or after the null packet
> or if there should be another null packet or what. Perhaps somebody
> more fluent with the smart protocol can advise.

I believe the 'version 1' goes after the flush-packet.

Thanks,
Jonathan

Re: [PATCH 13/27] sha1_file: add repository argument to prepare_alt_odb

2018-02-21 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:

> See previous patch for explanation.
>
> While at it, move the declaration to object-store.h,
> where it should be easier to find.

Which declaration?  It looks like prepare_alt_odb is already in
object-store.h.

[...]
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -717,7 +717,7 @@ int cmd_fsck(int argc, const char **argv, const char 
> *prefix)
>   } else {
>   fsck_object_dir(get_object_directory());
>  
> - prepare_alt_odb();
> + prepare_alt_odb(the_repository);

Patch 2 added a #include of "repository.h".  Good.

(I checked because with the definition of prepare_alt_odb as a macro,
the function call would compile correctly even if the_repository
weren't in scope, but we want to include what we use as a matter of
style/maintainability.)

[...]
> --- a/packfile.c
> +++ b/packfile.c
> @@ -890,7 +890,7 @@ void prepare_packed_git(void)
>   if (the_repository->objects.packed_git_initialized)
>   return;
>   prepare_packed_git_one(get_object_directory(), 1);
> - prepare_alt_odb();
> + prepare_alt_odb(the_repository);

Also good, since patch 3 added a #include of "repository.h".

[...]
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -23,6 +23,7 @@
>  #include "sha1-lookup.h"
>  #include "bulk-checkin.h"
>  #include "repository.h"
> +#include "object-store.h"

Should this #include have been added in an earlier patch, since the
file both uses and defines prepare_alt_odb, which is declared there?

The rest looks good.

Thanks,
Jonathan

Re: [PATCH 17/27] sha1_file: add repository argument to stat_sha1_file

2018-02-21 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:

> Add a repository argument to allow the stat_sha1_file caller to be
> more specific about which repository to act on. This is a small
> mechanical change; it doesn't change the implementation to handle
> repositories other than the_repository yet.
>
> As with the previous commits, use a macro to catch callers passing a
> repository other than the_repository at compile time.
>
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> ---
>  sha1_file.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)

Simple and obviously correct.
Reviewed-by: Jonathan Nieder

Re: [PATCH v4 12/13] commit-graph: read only from specific pack-indexes

2018-02-21 Thread Stefan Beller

On Mon, Feb 19, 2018 at 10:53 AM, Derrick Stolee  wrote:
>
> Teach git-commit-graph to inspect the objects only in a certain list
> of pack-indexes within the given pack directory. This allows updating
> the commit graph iteratively, since we add all commits stored in a
> previous commit graph.
>
> Signed-off-by: Derrick Stolee 
> ---
>  Documentation/git-commit-graph.txt | 11 +++
>  builtin/commit-graph.c | 32 +---
>  commit-graph.c | 26 --
>  commit-graph.h |  4 +++-
>  packfile.c |  4 ++--
>  packfile.h |  2 ++
>  t/t5318-commit-graph.sh| 16 
>  7 files changed, 87 insertions(+), 8 deletions(-)
>
> diff --git a/Documentation/git-commit-graph.txt 
> b/Documentation/git-commit-graph.txt
> index b9b4031..93d50d1 100644
> --- a/Documentation/git-commit-graph.txt
> +++ b/Documentation/git-commit-graph.txt
> @@ -42,6 +42,10 @@ With the `--delete-expired` option, delete the graph files 
> in the pack
>  directory that are not referred to by the graph-latest file. To avoid race
>  conditions, do not delete the file previously referred to by the
>  graph-latest file if it is updated by the `--set-latest` option.
> ++
> +With the `--stdin-packs` option, generate the new commit graph by
> +walking objects only in the specified packfiles and any commits in
> +the existing graph-head.

A general question on this series:
How do commit graph buildups deal with garbage collected commits?
(my personal workflow is heavy on rebase, which generates lots of
dangling commits, to be thrown out later)

The second half of the sentence makes it sound like once a
commit is in the graph it cannot be pulled out easily again, hence
the question on the impact of graphs on a long living repository
which is garbage collected frequently.

AFAICT you could just run
git commit-graph write --set-latest [--delete-expired]
as that actually looks up objects from outside the existing graph files,
such that lost objects are ignored?

> +   const char **lines = NULL;
> +   int nr_lines = 0;
> +   int alloc_lines = 0;

(nit:)
I had the impression that these triplet-variables, that are used in
ALLOC_GROW are allo X, X_nr and X_allow, but I might be wrong.

> @@ -170,7 +178,25 @@ static int graph_write(int argc, const char **argv)
>
> old_graph_name = get_graph_latest_contents(opts.obj_dir);
>
> -   graph_name = write_commit_graph(opts.obj_dir);
> +   if (opts.stdin_packs) {
> +   struct strbuf buf = STRBUF_INIT;
> +   nr_lines = 0;
> +   alloc_lines = 128;

alloc_lines has been initialized before, so why redo it here again?
Also what is the rationale for choosing 128 as a good default?
I would guess 0 is just as fine, because ALLOC_GROW makes sure
that it growth fast in the first couple entries by having an additional
offset. (no need to fine tune the starting allocation IMHO)

> +   ALLOC_ARRAY(lines, alloc_lines);
> +
> +   while (strbuf_getline(, stdin) != EOF) {
> +   ALLOC_GROW(lines, nr_lines + 1, alloc_lines);
> +   lines[nr_lines++] = buf.buf;
> +   strbuf_detach(, NULL);

strbuf_detach returns its previous buf.buf, such that you can combine these
two lines as
lines[nr_lines++] = strbuf_detach(, NULL);


> +   }
> +
> +   pack_indexes = lines;
> +   nr_packs = nr_lines;

Technically we do not need to strbuf_release() here, because
strbuf_detach is always called, and by knowing its implementation,
it is just as good.


> @@ -579,7 +581,27 @@ char *write_commit_graph(const char *obj_dir)
> oids.alloc = 1024;
> ALLOC_ARRAY(oids.list, oids.alloc);
>
> -   for_each_packed_object(if_packed_commit_add_to_list, , 0);
> +   if (pack_indexes) {
> +   struct strbuf packname = STRBUF_INIT;
> +   int dirlen;
> +   strbuf_addf(, "%s/pack/", obj_dir);
> +   dirlen = packname.len;
> +   for (i = 0; i < nr_packs; i++) {
> +   struct packed_git *p;
> +   strbuf_setlen(, dirlen);
> +   strbuf_addstr(, pack_indexes[i]);
> +   p = add_packed_git(packname.buf, packname.len, 1);
> +   if (!p)
> +   die("error adding pack %s", packname.buf);
> +   if (open_pack_index(p))
> +   die("error opening index for %s", 
> packname.buf);
> +   for_each_object_in_pack(p, 
> if_packed_commit_add_to_list, );
> +   close_pack(p);
> +   }

strbuf_release();

> +   }
> +   else

(micro style nit)

} else

Proposal

2018-02-21 Thread melisa mehmet

Hello

Greetings to you and everyone around you please did you get my previous email 
regarding my proposal ?
please let me know if we can work together on this.

Best Reagrds

Investment Project

2018-02-21 Thread Ms Melisa Mehmet

Hello,

I have a business proposal i would like to discuss with you, reply if you are 
interested for more info.

Thank You.
Ms. Melisa Mehmet
My Skype ID: ms.melisameh...@hotmail.com
WhatsApp": +90 534 716 2603

Re: [PATCH v3 32/35] http: allow providing extra headers for http requests

2018-02-21 Thread Jonathan Tan

On Tue,  6 Feb 2018 17:13:09 -0800
Brandon Williams  wrote:

> @@ -172,6 +172,8 @@ struct http_get_options {
>* for details.
>*/
>   struct strbuf *base_url;
> +
> + struct string_list *extra_headers;

Document this? For example:

  If not NULL, additional HTTP headers to be sent with the request. The
  strings in the list must not be freed until after the request.

Re: [PATCH 00/36] object_id part 12

2018-02-21 Thread brian m. carlson

On Wed, Feb 21, 2018 at 10:47:19AM -0800, Junio C Hamano wrote:
> "brian m. carlson"  writes:
> 
> > This is the twelfth in a series of patches to convert from unsigned char
> > [20] to struct object_id.  This series is based on next.
> >
> > Included in this series are conversions for find_unique_abbrev and
> > lookup_replace_object, as well as parts of the sha1_file code.
> >
> > Conflicts with pu are average in number but minor, mostly because of the
> > type_name conversion.  None of them are tricky, except that the
> > introduction of get_tree_entry_if_blob requires a conversion of that
> > function.
> 
> And the reason why this is based on 'next' is...?  Which topic(s) do
> we have to wait for until we can queue this series, in other words?
> 
> Thanks for working on this, though.

It was waiting on the hash_algo changes I had submitted, and I don't
believe they'd made it into master.  When they have, I'll rebase and
send a v2.
-- 
brian m. carlson / brian with sandals: Houston, Texas, US
https://www.crustytoothpaste.net/~bmc | My opinion only
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature

Re: [PATCH 02/27] object-store: migrate alternates struct and functions from cache.h

2018-02-21 Thread Brandon Williams

On 02/20, Stefan Beller wrote:
> Migrate the struct alternate_object_database and all its related
> functions to the object store as these functions are easier found in
> that header. The migration is just a verbatim copy, no need to
> include the object store header at any C file, because cache.h includes
> repository.h which in turn includes the object-store.h

Nice.  Always love seeing cache.h get smaller


-- 
Brandon Williams

Re: What's cooking in git.git (Feb 2018, #03; Wed, 21)

2018-02-21 Thread Brandon Williams

On 02/21, Junio C Hamano wrote:
> * bw/c-plus-plus (2018-02-14) 38 commits
>  - fixup! diff: rename 'this' variables
>  - replace: rename 'new' variables
>  - trailer: rename 'template' variables
>  - tempfile: rename 'template' variables
>  - wrapper: rename 'template' variables
>  - environment: rename 'namespace' variables
>  - diff: rename 'template' variables
>  - environment: rename 'template' variables
>  - init-db: rename 'template' variables
>  - unpack-trees: rename 'new' variables
>  - trailer: rename 'new' variables
>  - submodule: rename 'new' variables
>  - split-index: rename 'new' variables
>  - remote: rename 'new' variables
>  - ref-filter: rename 'new' variables
>  - read-cache: rename 'new' variables
>  - line-log: rename 'new' variables
>  - imap-send: rename 'new' variables
>  - http: rename 'new' variables
>  - entry: rename 'new' variables
>  - diffcore-delta: rename 'new' variables
>  - diff: rename 'new' variables
>  - diff-lib: rename 'new' variable
>  - commit: rename 'new' variables
>  - combine-diff: rename 'new' variables
>  - remote: rename 'new' variables
>  - reflog: rename 'new' variables
>  - pack-redundant: rename 'new' variables
>  - help: rename 'new' variables
>  - checkout: rename 'new' variables
>  - apply: rename 'new' variables
>  - apply: rename 'try' variables
>  - diff: rename 'this' variables
>  - rev-parse: rename 'this' variable
>  - pack-objects: rename 'this' variables
>  - blame: rename 'this' variables
>  - object: rename function 'typename' to 'type_name'
>  - object_info: change member name from 'typename' to 'type_name'
> 
>  Avoid using identifiers that clash with C++ keywords.  Even though
>  it is not a goal to compile Git with C++ compilers, changes like
>  this help use of code analysis tools that targets C++ on our
>  codebase.
> 
>  Is the 'fixup!' cleanly squashable to the problematic one, or does
>  this series require another reroll to get it in a good enough shape?

Yeah the fixup patch looks good to me.  I don't think there was anything
else that needed attention so it should be in good shape with the fixup
patch.


-- 
Brandon Williams

Re: [PATCH v3 0/3] Add "git rebase --show-current-patch"

2018-02-21 Thread Tim Landscheidt

Junio C Hamano  wrote:

>> Compared to v2:

>> - the potential loss of errno before it's printed out in builtin/am.c
>>   is fixed.
>> - keep update_ref() in sequencer.c non-fatal like this rest
>> - rename ORIG_COMMIT to REBASE_HEAD

>> Interdiff:

> This round hasn't seen any comments.  Is everybody happy with it?

> I personally do not have strong opinion for the feature but didn't
> spot anything against the execution, either, so...

Sorry for the late reply: I dislike REBASE_/HEAD/ because
ORIG_/HEAD/ refers to the tip of the original branch, and
/ORIG/_HEAD refers to the original branch, so
/REBASE/_/HEAD/ is doubly confusing IMHO.  I consider
ORIG_COMMIT easier to understand because ORIG_HEAD refers to
the tip of the original branch, and ORIG_COMMIT would refer
to one of the commits making up that original branch, but as
I suggested it myself I might not be very objective in that
regard :-).

(BTW, thanks, Duy, for doing the actual work!)

Tim

Re: [PATCH 26/27] sha1_file: allow map_sha1_file to handle arbitrary repositories

2018-02-21 Thread Jonathan Nieder

Stefan Beller wrote:

> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> ---
>  object-store.h | 3 +--
>  sha1_file.c| 5 +++--
>  2 files changed, 4 insertions(+), 4 deletions(-)

Thanks: this is short and simple, making my life as a reviewer very
easy.

Reviewed-by: Jonathan Nieder

Re: [PATCH 21/27] sha1_file: add repository argument to sha1_loose_object_info

2018-02-21 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:

> Add a repository argument to allow the sha1_loose_object_info caller
> to be more specific about which repository to act on. This is a small
> mechanical change; it doesn't change the implementation to handle
> repositories other than the_repository yet.
>
> As with the previous commits, use a macro to catch callers passing a
> repository other than the_repository at compile time.
>
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> ---
>  sha1_file.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)

Simple and obviously correct.
Reviewed-by: Jonathan Nieder

Re: [PATCH 01/27] repository: introduce raw object store field

2018-02-21 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:

> The raw object store field will contain any objects needed for
> access to objects in a given repository.
>
> This patch introduces the raw object store and populates it with the
> `objectdir`, which used to be part of the repository struct.
>
> As the struct gains members, we'll also populate the function to clear
> the memory for these members.
>
> In a later we'll introduce a struct object_parser, that will complement
> the object parsing in a repository struct: The raw object parser is the
> layer that will provide access to raw object content, while the higher
> level object parser code will parse raw objects and keeps track of
> parenthood and other object relationships using 'struct object'.
> For now only add the lower level to the repository struct.
>
> Signed-off-by: Stefan Beller 
> Signed-off-by: Jonathan Nieder 
> ---

Heh, I suppose that sign-off makes me not a great candidate for
reviewing this.  It's been long enough since I looked at the patch
that I feel okay trying anyway.

[...]
> --- /dev/null
> +++ b/object-store.h
> @@ -0,0 +1,15 @@
> +#ifndef OBJECT_STORE_H
> +#define OBJECT_STORE_H
> +
> +struct raw_object_store {
> + /*
> +  * Path to the repository's object store.
> +  * Cannot be NULL after initialization.
> +  */
> + char *objectdir;
> +};
> +#define RAW_OBJECT_STORE_INIT { NULL }

Who owns and is responsible for freeing objectdir?

> +
> +void raw_object_store_clear(struct raw_object_store *o);

Oh, that answers that.

It could be worth a note in the doc comment anyway, but I don't mind
not having one.

[...]
> +
> +void raw_object_store_clear(struct raw_object_store *o)
> +{
> + free(o->objectdir);
> +}

Should this use FREE_AND_NULL?

[...]
> --- a/repository.c
> +++ b/repository.c
[...]
> @@ -42,8 +49,8 @@ static void repo_setup_env(struct repository *repo)
>   !repo->ignore_env);
>   free(repo->commondir);
>   repo->commondir = strbuf_detach(, NULL);
> - free(repo->objectdir);
> - repo->objectdir = git_path_from_env(DB_ENVIRONMENT, repo->commondir,
> + free(repo->objects.objectdir);

Should this call raw_object_store_clear instead of calling free
directly?

> + repo->objects.objectdir = git_path_from_env(DB_ENVIRONMENT, 
> repo->commondir,
>   "objects", !repo->ignore_env);

Long line.  One way to break it would be

repo->objects.objectdir =
git_path_from_env(DB_ENVIRONMENT, repo->commondir,
  "objects", !repo->ignore_env);

>   free(repo->graft_file);
>   repo->graft_file = git_path_from_env(GRAFT_ENVIRONMENT, repo->commondir,
> @@ -209,12 +216,14 @@ void repo_clear(struct repository *repo)
>  {
>   FREE_AND_NULL(repo->gitdir);
>   FREE_AND_NULL(repo->commondir);
> - FREE_AND_NULL(repo->objectdir);
>   FREE_AND_NULL(repo->graft_file);
>   FREE_AND_NULL(repo->index_file);
>   FREE_AND_NULL(repo->worktree);
>   FREE_AND_NULL(repo->submodule_prefix);
>  
> + raw_object_store_clear(>objects);
> + memset(>objects, 0, sizeof(repo->objects));

If raw_object_store_clear uses FREE_AND_NULL, this memset won't be necessary.

[...]
> --- a/repository.h
> +++ b/repository.h
> @@ -1,6 +1,8 @@
>  #ifndef REPOSITORY_H
>  #define REPOSITORY_H
>  
> +#include "object-store.h"
> +
>  struct config_set;
>  struct index_state;
>  struct submodule_cache;
> @@ -21,10 +23,9 @@ struct repository {
>   char *commondir;
>  
>   /*
> -  * Path to the repository's object store.
> -  * Cannot be NULL after initialization.
> +  * Holds any information related to the object store.
>*/

This comment doesn't make it clear to me what the field represents.
Can it be replaced with some of the description from the commit
message?

> - char *objectdir;
> + struct raw_object_store objects;
>  

Thanks,
Jonathan

Re: [PATCH 05/27] object-store: move packed_git and packed_git_mru to object store

2018-02-21 Thread Jonathan Nieder

Hi,

Stefan Beller wrote:

> In a process with multiple repositories open, packfile accessors
> should be associated to a single repository and not shared globally.
> Move packed_git and packed_git_mru into the_repository and adjust
> callers to reflect this.
>
> Patch generated by
>
>  1. Moving the struct packed_git declaration to object-store.h
> and packed_git, packed_git_mru globals to struct object_store.
>
>  2. Applying the semantic patch
> contrib/coccinelle/refactoring/packed_git.cocci to adjust callers.
> This semantic patch is placed in a sub directory of the coccinelle
> contrib dir, as this semantic patch is not expected to be of general
> usefulness; it is only useful during developing this series and
> merging it with other topics in flight. At a later date, just
> delete that semantic patch.

Can the semantic patch go in the commit message instead?  It is very
brief.

Actually, I don't see this semantic patch in the diffstat.  Is the
commit message stale?

>  3. Applying line wrapping fixes from "make style" to break the
> resulting long lines.
>
>  4. Adding missing #includes of repository.h and object-store.h
> where needed.

Is there a way to automate this step?  (I'm asking for my own
reference when writing future patches, not because of any concern
about the correctness of this one.)
>
>  5. As the packfiles are now owned by an objectstore/repository, which
> is ephemeral unlike globals, we introduce memory leaks. So address
> them in raw_object_store_clear().

The compound words are confusing me.  What is an
objectstore/repository?  Are these referring to particular identifiers
or something else?

Would some wording like the following work?

   5. Freeing packed_git and packed_git_mru in raw_object_store_clear
  to avoid a per-repository memory leak.  Previously they were
  global singletons, so code to free them did not exist.

[...]
> --- a/builtin/index-pack.c
> +++ b/builtin/index-pack.c
> @@ -12,6 +12,7 @@
>  #include "exec_cmd.h"
>  #include "streaming.h"
>  #include "thread-utils.h"
> +#include "object-store.h"
>  #include "packfile.h"
>  
>  static const char index_pack_usage[] =

Change from a different patch leaked into this one?

[...]
> +++ b/builtin/pack-objects.c
[...]
> @@ -1044,7 +1045,7 @@ static int want_object_in_pack(const struct object_id 
> *oid,
>   }
>   want = want_found_object(exclude, p);
>   if (!exclude && want > 0)
> - list_move(>mru, _git_mru);
> + list_move(>mru, 
> _repository->objects.packed_git_mru);

Long line.  Can "make style" catch this?

[...]
> +++ b/builtin/receive-pack.c
> @@ -7,6 +7,7 @@
>  #include "sideband.h"
>  #include "run-command.h"
>  #include "exec_cmd.h"
> +#include "object-store.h"
>  #include "commit.h"
>  #include "object.h"
>  #include "remote.h"

Another change leaked in?

[...]
> --- a/cache.h
> +++ b/cache.h
> @@ -1585,35 +1585,6 @@ struct pack_window {
>   unsigned int inuse_cnt;
>  };
>  
> -extern struct packed_git {
[...]
> -} *packed_git;

Move detecting diff confirms that this wasn't modified.  Thanks for
creating it.

[...]
> +++ b/fast-import.c
[...]
> @@ -1110,7 +1112,7 @@ static int store_object(
>   if (e->idx.offset) {
>   duplicate_count_by_type[type]++;
>   return 1;
> - } else if (find_sha1_pack(oid.hash, packed_git)) {
> + } else if (find_sha1_pack(oid.hash, 
> the_repository->objects.packed_git)) {

Long line.  (I'll refrain from commenting about any further ones.)

[...]
> +++ b/http-push.c
> @@ -1,4 +1,5 @@
>  #include "cache.h"
> +#include "object-store.h"
>  #include "commit.h"
>  #include "tag.h"
>  #include "blob.h"

Stray change?

> diff --git a/http-walker.c b/http-walker.c
> index 07c2b1af82..8bb5d991bb 100644
> --- a/http-walker.c
> +++ b/http-walker.c
> @@ -4,6 +4,7 @@
>  #include "http.h"
>  #include "list.h"
>  #include "transport.h"
> +#include "object-store.h"
>  #include "packfile.h"
>  
>  struct alt_base {

Same question.

> diff --git a/http.c b/http.c
> index 31755023a4..a4a9e583c7 100644
> --- a/http.c
> +++ b/http.c
> @@ -1,6 +1,7 @@
>  #include "git-compat-util.h"
>  #include "http.h"
>  #include "config.h"
> +#include "object-store.h"
>  #include "pack.h"
>  #include "sideband.h"
>  #include "run-command.h"

Likewise.

> diff --git a/object-store.h b/object-store.h
> index e78eea1dde..1de9e07102 100644
> --- a/object-store.h
> +++ b/object-store.h
> @@ -52,6 +52,30 @@ void add_to_alternates_memory(const char *dir);
>   */
>  struct strbuf *alt_scratch_buf(struct alternate_object_database *alt);
>  
> +struct packed_git {
> + struct packed_git *next;
> + struct list_head mru;
> + struct pack_window *windows;
> + off_t pack_size;
> + const void *index_data;
> + size_t index_size;
> + uint32_t num_objects;
> + uint32_t num_bad_objects;
> +

Re: [PATCH 2/2] ref-filter: get rid of goto

2018-02-21 Thread Оля Тележная

2018-02-21 20:41 GMT+03:00 Junio C Hamano :
> Olga Telezhnaya  writes:
>
>> Get rid of goto command in ref-filter for better readability.
>>
>> Signed-off-by: Olga Telezhnaia 
>> Mentored-by: Christian Couder 
>> Mentored by: Jeff King 
>> ---
>>  ref-filter.c | 6 +++---
>>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> It looks like this is the same change as the bottom-most change on
> your "cat-file --batch" series (and is obviously correct).
>
> I am puzzled by your intention---are you re-organizing and rebooting
> the series?  Either 'Yes' or 'No' is an acceptable answer, and so is
> anything else.  I just want to know what you want to happen to the
> merge conflicts if I queued this while still keeping your "cat-file
> --batch" thing I have on 'pu').

Thanks for the question, I needed to mention that.
We (with Peff) decided that it's better and easier to remake whole
previous patch. Before that, I want to make some refactorings in
ref-filter so after that migrating should go much easier. I want to do
that by small separate patches, so this is first in the series. I hope
it would be both easier to review for you and easier to fix for me.
So, if everything is fine, you could merge it to master. I will
rewrite most parts of previous patch anyway.

Thanks!

>
>>
>> diff --git a/ref-filter.c b/ref-filter.c
>> index 83ffd84affe52..28df6e21fb996 100644
>> --- a/ref-filter.c
>> +++ b/ref-filter.c
>> @@ -1494,11 +1494,11 @@ static void populate_value(struct ref_array_item 
>> *ref)
>>   for (i = 0; i < used_atom_cnt; i++) {
>>   struct atom_value *v = >value[i];
>>   if (v->s == NULL)
>> - goto need_obj;
>> + break;
>>   }
>> - return;
>> + if (used_atom_cnt <= i)
>> + return;
>>
>> - need_obj:
>>   get_object(ref, >objectname, 0, );
>>
>>   /*
>>
>> --
>> https://github.com/git/git/pull/460

[PATCH] t: send verbose test-helper output to fd 4

2018-02-21 Thread Jeff King

This is a repost of the two patches from:

  https://public-inbox.org/git/20180209185710.ga23...@sigill.intra.peff.net/

(now just one patch, since sg/test-i18ngrep graduated and we can do it
all in one step). The idea got positive feedback, but nobody commented
on patches and I didn't see them in "What's cooking".

-- >8 --
Test helper functions like test_must_fail may produce
messages to stderr when they see a problem. When the tests
are run with "--verbose", this ends up on the test script's
stderr, and the user can read it.

But there's a problem. Some tests record stderr as part of
the test, like:

  test_must_fail git foo 2>output &&
  test_i18ngrep expected.message output

In this case the error text goes into "output". This makes
the --verbose output less useful (it also means we might
accidentally match it in the second, though in practice we
tend to produce these messages only on error, so we'd abort
the test when the first command fails).

Let's instead send this user-facing output directly to
descriptor 4, which always points to the original stderr (or
/dev/null in non-verbose mode). And it's already forbidden
to redirect descriptor 4, since we use it for BASH_XTRACEFD,
as explained in 9be795fbce (t5615: avoid re-using descriptor
4, 2017-12-08).

Signed-off-by: Jeff King 
---
 t/test-lib-functions.sh | 20 ++--
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 67b5994afb..aabee13e5d 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -625,22 +625,22 @@ test_must_fail () {
exit_code=$?
if test $exit_code -eq 0 && ! list_contains "$_test_ok" success
then
-   echo >&2 "test_must_fail: command succeeded: $*"
+   echo >&4 "test_must_fail: command succeeded: $*"
return 1
elif test_match_signal 13 $exit_code && list_contains "$_test_ok" 
sigpipe
then
return 0
elif test $exit_code -gt 129 && test $exit_code -le 192
then
-   echo >&2 "test_must_fail: died by signal $(($exit_code - 128)): 
$*"
+   echo >&4 "test_must_fail: died by signal $(($exit_code - 128)): 
$*"
return 1
elif test $exit_code -eq 127
then
-   echo >&2 "test_must_fail: command not found: $*"
+   echo >&4 "test_must_fail: command not found: $*"
return 1
elif test $exit_code -eq 126
then
-   echo >&2 "test_must_fail: valgrind error: $*"
+   echo >&4 "test_must_fail: valgrind error: $*"
return 1
fi
return 0
@@ -678,7 +678,7 @@ test_expect_code () {
return 0
fi
 
-   echo >&2 "test_expect_code: command exited with $exit_code, we wanted 
$want_code $*"
+   echo >&4 "test_expect_code: command exited with $exit_code, we wanted 
$want_code $*"
return 1
 }
 
@@ -742,18 +742,18 @@ test_i18ngrep () {
shift
! grep "$@" && return 0
 
-   echo >&2 "error: '! grep $@' did find a match in:"
+   echo >&4 "error: '! grep $@' did find a match in:"
else
grep "$@" && return 0
 
-   echo >&2 "error: 'grep $@' didn't find a match in:"
+   echo >&4 "error: 'grep $@' didn't find a match in:"
fi
 
if test -s "$last_arg"
then
-   cat >&2 "$last_arg"
+   cat >&4 "$last_arg"
else
-   echo >&2 ""
+   echo >&4 ""
fi
 
return 1
@@ -764,7 +764,7 @@ test_i18ngrep () {
 # not output anything when they fail.
 verbose () {
"$@" && return 0
-   echo >&2 "command failed: $(git rev-parse --sq-quote "$@")"
+   echo >&4 "command failed: $(git rev-parse --sq-quote "$@")"
return 1
 }
 
-- 
2.16.2.580.g650ee5408b

Re: [PATCH v2 4/9] t3701: don't hard code sha1 hash values

2018-02-21 Thread Junio C Hamano

Phillip Wood  writes:

> Keeping the permission bits makes sense (I'd not thought of them when
> I created the patch) as we want to check that the file has the correct
> permissions. As for the all-zero object name, is it really worth
> leaving it in - if a file has been created or deleted then we'll still
> have /dev/null as the file name for one side or the other and the diff
> lines will show it as well. As these tests are just to check the state
> of the index then I'm not sure the hash values add anything. How do
> you feel about a filter like
>
> sed "/^   index/s/ [0-9a-f][0-9a-f]*\.\.[0-9a-f][0-9a-f]*/x/"

Something like that, with perhaps the single 'x' replaced with
something more normal looking and the all-zero thing special cased,
was what I had in mind.

Special casing all-zero matters.  I know that the current code uses
all-zero on the missing side.  The thing is, tests are about
protecting the correctness we currently have, and we want to catch
the next idiot that breaks the code to stop showing all-zero when
talking about the missing side.

RE: Stackdump from stash save on Windows 10 64-bit

2018-02-21 Thread Johannes Schindelin

Hi Tim,

I re-Cc:ed the Git mailing list, as I do not like the pressure of being
the only one you ask for help.

On Wed, 21 Feb 2018, Tim Mayo wrote:

>   > Can you please test with v2.16.2
> 
> Is there a built version somewhere .. or do I have to build it myself?

Yes, official Git for Windows versions are always available from
https://gitforwindows.org/: either click the big Download button (which
will download the installer right away), or click on the version in the
upper left (if you want to choose which flavor of Git for Windows you want
to download).

Ciao,
Johannes

Re: [PATCH] t/known-leaky: add list of known-leaky test scripts

2018-02-21 Thread Junio C Hamano

Martin Ågren  writes:

> On 19 February 2018 at 22:29, Jeff King  wrote:
> ...
>> Or alternatively, we could just not bother with checking this into the
>> repository, and it becomes a local thing for people interested in
>> leak-testing. What's the value in having a shared known-leaky list,
>> especially if we don't expect most people to run it.
>
> This sums up my feeling about this.

Even though keeping track of list of known-leaky tests may not be so
interesting, we can still salvage useful pieces from the discussion
and make them available to developers, e.g.  something like

prove --dry --state=failed |
perl -lne '/^(t[0-9]{4})-.*\.sh$/ and print $1' | sort >$@+
if cmp >/dev/null $@ $@+; then rm $@+; else mv $@+ $@; fi

could be made into a target to stash away the list of failing tests
after a test run?

Re: [PATCH v2 4/9] t3701: don't hard code sha1 hash values

2018-02-21 Thread Phillip Wood


On 20/02/18 17:39, Junio C Hamano wrote:

Phillip Wood  writes:


From: Phillip Wood 

Purge the index lines from diffs so we're not hard coding sha1 hash
values in the expected output.


The motivation of this patch is clear, but all-zero object name for
missing side of deletion or creation patch should not change when we
transition to any hash function.  Neither the permission bits shown
in the output (and whether the index line has the bits are shown on
it in the first place, i.e. the index line of a creation patch does
not, whilethe one in a modification patch does).

So I am a bit ambivalent about this change.

Perhaps have a filter that redacts, instead of removes, selected
pieces of information that are likely to change while hash
transition, and use that consistently to filter both the expected
output and the actual output before comparing?

Keeping the permission bits makes sense (I'd not thought of them when I 
created the patch) as we want to check that the file has the correct 
permissions. As for the all-zero object name, is it really worth leaving 
it in - if a file has been created or deleted then we'll still have 
/dev/null as the file name for one side or the other and the diff lines 
will show it as well. As these tests are just to check the state of the 
index then I'm not sure the hash values add anything. How do you feel 
about a filter like


sed "/^index/s/ [0-9a-f][0-9a-f]*\.\.[0-9a-f][0-9a-f]*/x/"

Best Wishes

Phillip

Re: [PATCH v2 3/9] t3701: use test_write_lines and write_script

2018-02-21 Thread Phillip Wood


On 20/02/18 17:19, Junio C Hamano wrote:

Eric Sunshine  writes:


  test_expect_success 'setup fake editor' '
-   echo "#!$SHELL_PATH" >fake_editor.sh &&
-   cat >>fake_editor.sh <<-\EOF &&
+   FAKE_EDITOR="$(pwd)/fake-editor.sh" &&
+   write_script "$FAKE_EDITOR" <<-\EOF &&
 mv -f "$1" oldpatch &&
 mv -f patch "$1"
 EOF
-   chmod a+x fake_editor.sh &&
-   test_set_editor "$(pwd)/fake_editor.sh"
+   test_set_editor "$FAKE_EDITOR"
  '


The very first thing that test_set_editor() does is set FAKE_EDITOR to
the value of $1, so it is confusing to see it getting setting it here
first; the reader has to spend extra brain cycles wondering if
something non-obvious is going on that requires this manual
assignment. Perhaps drop the assignment altogether and just write
literal "fake_editor.sh" in the couple places it's needed (as was done
in the original code)?


Yeah, I think $(pwd)/ prefix is needed at the final step (i.e. as
the first argument to test_set_editor) for this to be a faithful
rewrite but it is distracting having to write it anywhere else.

Other than that, this looks like a quite straight-forward cleanup.

Thanks, both.  Here is what I'd be queuing tentatively.


That looks good, thanks for fixing it up

Phillip


  t/t3701-add-interactive.sh | 33 +
  1 file changed, 5 insertions(+), 28 deletions(-)

diff --git a/t/t3701-add-interactive.sh b/t/t3701-add-interactive.sh
index 39c7423069..4a369fcb51 100755
--- a/t/t3701-add-interactive.sh
+++ b/t/t3701-add-interactive.sh
@@ -87,13 +87,8 @@ test_expect_success 'setup expected' '
EOF
  '
  
-test_expect_success 'setup fake editor' '

-   >fake_editor.sh &&
-   chmod a+x fake_editor.sh &&
-   test_set_editor "$(pwd)/fake_editor.sh"
-'
-
  test_expect_success 'dummy edit works' '
+   test_set_editor : &&
(echo e; echo a) | git add -p &&
git diff > diff &&
test_cmp expected diff
@@ -110,12 +105,10 @@ test_expect_success 'setup patch' '
  '
  
  test_expect_success 'setup fake editor' '

-   echo "#!$SHELL_PATH" >fake_editor.sh &&
-   cat >>fake_editor.sh <<-\EOF &&
+   write_script "fake_editor.sh" <<-\EOF &&
mv -f "$1" oldpatch &&
mv -f patch "$1"
EOF
-   chmod a+x fake_editor.sh &&
test_set_editor "$(pwd)/fake_editor.sh"
  '
  
@@ -302,18 +295,12 @@ test_expect_success 'deleting an empty file' '
  
  test_expect_success 'split hunk setup' '

git reset --hard &&
-   for i in 10 20 30 40 50 60
-   do
-   echo $i
-   done >test &&
+   test_write_lines 10 20 30 40 50 60 >test &&
git add test &&
test_tick &&
git commit -m test &&
  
-	for i in 10 15 20 21 22 23 24 30 40 50 60

-   do
-   echo $i
-   done >test
+   test_write_lines 10 15 20 21 22 23 24 30 40 50 60 >test
  '
  
  test_expect_success 'split hunk "add -p (edit)"' '

@@ -334,17 +321,7 @@ test_expect_success 'split hunk "add -p (edit)"' '
  '
  
  test_expect_failure 'split hunk "add -p (no, yes, edit)"' '

-   cat >test <<-\EOF &&
-   5
-   10
-   20
-   21
-   30
-   31
-   40
-   50
-   60
-   EOF
+   test_write_lines 5 10 20 21 30 31 40 50 60 >test &&
git reset &&
# test sequence is s(plit), n(o), y(es), e(dit)
# q n q q is there to make sure we exit at the end.

[ANNOUNCE] Git Rev News edition 36

2018-02-21 Thread Christian Couder

Hi everyone,

The 36th edition of Git Rev News is now published:

  https://git.github.io/rev_news/2018/02/21/edition-36/

Thanks a lot to all the contributors!

Enjoy,
Christian, Jakub, Markus and Gabriel.

Re: [PATCH 2/2] ref-filter: get rid of goto

2018-02-21 Thread Junio C Hamano

Olga Telezhnaya  writes:

> Get rid of goto command in ref-filter for better readability.
>
> Signed-off-by: Olga Telezhnaia 
> Mentored-by: Christian Couder 
> Mentored by: Jeff King 
> ---
>  ref-filter.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)

It looks like this is the same change as the bottom-most change on
your "cat-file --batch" series (and is obviously correct).

I am puzzled by your intention---are you re-organizing and rebooting
the series?  Either 'Yes' or 'No' is an acceptable answer, and so is
anything else.  I just want to know what you want to happen to the
merge conflicts if I queued this while still keeping your "cat-file
--batch" thing I have on 'pu').

>
> diff --git a/ref-filter.c b/ref-filter.c
> index 83ffd84affe52..28df6e21fb996 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1494,11 +1494,11 @@ static void populate_value(struct ref_array_item *ref)
>   for (i = 0; i < used_atom_cnt; i++) {
>   struct atom_value *v = >value[i];
>   if (v->s == NULL)
> - goto need_obj;
> + break;
>   }
> - return;
> + if (used_atom_cnt <= i)
> + return;
>  
> - need_obj:
>   get_object(ref, >objectname, 0, );
>  
>   /*
>
> --
> https://github.com/git/git/pull/460

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-21 Thread Lars Schneider


> On 16 Feb 2018, at 19:55, Junio C Hamano  wrote:
> 
> Jeff King  writes:
> 
>> So a full proposal would support both cases: "check this out in the
>> local platform's preferred encoding" and "always check this out in
>> _this_ encoding". And Lars's proposal is just the second half of that.
> 
> Actually, what you seem to take as a whole is just half of the
> story.  The other half that is an ability to say "what is in the
> repository for this path is stored in this encoding".  I agree that
> "check it out in this encoding" is a useful thing to have, and using
> the in-tree .gitattributes as a place to state the project-wide
> preference may be OK (and .git/info/attributes should be able to
> override it if needed -- this probably deserves to be added to a
> test somewhere by this series).

Good call! I'll add this test case!


> Luckily, lack of 'in-repository-encoding' attribute is not a show
> stopper for this series.  A later topic could start with "earlier,
> in order to make use of w-t-e attribute, you had to have your
> contents in UTF-8.  Teach the codepath to honor a new attribute that
> tells in what encoding the blob contents are stored." without having
> to be a part of this topic.

I have the impression that this is the purpose of the already existing 
"encoding" attribute, no? AFAIK this attribute is only respected by 
gitk, though. A future series could make Git respect this attribute too.


- Lars

Re: Duplicate safecrlf warning for racily clean index entry

2018-02-21 Thread Junio C Hamano

Matt McCutchen  writes:

> ... may be an important optimization.)  If the line endings are changed
> without changing the size or post-conversion content, then no unstaged
> change is reported.  It does not appear that git saves the pre-
> conversion content.

Correct.  The cached-stat information is meant to be compared with
the size on the filesystem.  Based on your observation, it seems
that what you are seeing is not specific to safe-crlf thing.

If you reconfigure anything that affects the checkout conversion
codepath, including the "smudge" filter, an entry in the index that
used to be up-to-date will still have cached-stat info like
timestamp and size that match the on-disk file, even though if you
_were_ to check it out afresh out of the index, the reconfigured
checkout codepath may produce different file contents on-disk.  A
consequence of this is that you may cause Git to still say that the
path is clean, even though it is no longer true.

There is no single "right" solution out of this situation, as it
depends on the reason why you made such a reconfiguration in the
first place.

 - If the reason is because you found that what is stored in the
   index is correct but their contents are checked out incorrectly
   (e.g. both the index and the working tree files end their lines
   with LF, but you want your working tree files to be converted to
   CRLF, and you futzed with .gitattributes or core.crlf), then you
   would want to "correct" it by checking them out, bypassing the
   "if the cached-stat information says we already have the matching
   contents on disk, do not write the file out" optimization.

 - If the reason is the other way around, then you would want to
   "correct" the indexed contents by rehashing what you have on
   disk.  Perhaps a recently added "git add --renormalize" is what
   you are looking for.

Re: [PATCH 1/2] ref-filter: get rid of duplicate code

2018-02-21 Thread Junio C Hamano

Olga Telezhnaya  writes:

> Make one function from 2 duplicate pieces and invoke it twice.
>
> Signed-off-by: Olga Telezhnaia 
> Mentored-by: Christian Couder 
> Mentored by: Jeff King 
> ---
>  ref-filter.c | 45 +
>  1 file changed, 21 insertions(+), 24 deletions(-)

Nice code reduction, removing 2 instances of 11-line section and
replacing with a 17-line function ;-)

This looks related to your earlier changes for "cat-file --batch"
but I do not recall seeing this refactoring.  It seems to be done
correctly.

Good spotting.

> diff --git a/ref-filter.c b/ref-filter.c
> index f9e25aea7a97e..83ffd84affe52 100644
> --- a/ref-filter.c
> +++ b/ref-filter.c
> @@ -1354,15 +1354,31 @@ static const char *get_refname(struct used_atom 
> *atom, struct ref_array_item *re
>   return show_ref(>u.refname, ref->refname);
>  }
>  
> +static void get_object(struct ref_array_item *ref, const struct object_id 
> *oid,
> +int deref, struct object **obj)
> +{
> + int eaten;
> + unsigned long size;
> + void *buf = get_obj(oid, obj, , );
> + if (!buf)
> + die(_("missing object %s for %s"),
> + oid_to_hex(oid), ref->refname);
> + if (!*obj)
> + die(_("parse_object_buffer failed on %s for %s"),
> + oid_to_hex(oid), ref->refname);
> +
> + grab_values(ref->value, deref, *obj, buf, size);
> + if (!eaten)
> + free(buf);
> +}
> +
>  /*
>   * Parse the object referred by ref, and grab needed value.
>   */
>  static void populate_value(struct ref_array_item *ref)
>  {
> - void *buf;
>   struct object *obj;
> - int eaten, i;
> - unsigned long size;
> + int i;
>   const struct object_id *tagged;
>  
>   ref->value = xcalloc(used_atom_cnt, sizeof(struct atom_value));
> @@ -1483,17 +1499,7 @@ static void populate_value(struct ref_array_item *ref)
>   return;
>  
>   need_obj:
> - buf = get_obj(>objectname, , , );
> - if (!buf)
> - die(_("missing object %s for %s"),
> - oid_to_hex(>objectname), ref->refname);
> - if (!obj)
> - die(_("parse_object_buffer failed on %s for %s"),
> - oid_to_hex(>objectname), ref->refname);
> -
> - grab_values(ref->value, 0, obj, buf, size);
> - if (!eaten)
> - free(buf);
> + get_object(ref, >objectname, 0, );
>  
>   /*
>* If there is no atom that wants to know about tagged
> @@ -1514,16 +1520,7 @@ static void populate_value(struct ref_array_item *ref)
>* is not consistent with what deref_tag() does
>* which peels the onion to the core.
>*/
> - buf = get_obj(tagged, , , );
> - if (!buf)
> - die(_("missing object %s for %s"),
> - oid_to_hex(tagged), ref->refname);
> - if (!obj)
> - die(_("parse_object_buffer failed on %s for %s"),
> - oid_to_hex(tagged), ref->refname);
> - grab_values(ref->value, 1, obj, buf, size);
> - if (!eaten)
> - free(buf);
> + get_object(ref, tagged, 1, );
>  }
>  
>  /*
>
> --
> https://github.com/git/git/pull/460

Re: [PATCH 00/36] object_id part 12

2018-02-21 Thread Junio C Hamano

"brian m. carlson"  writes:

> This is the twelfth in a series of patches to convert from unsigned char
> [20] to struct object_id.  This series is based on next.
>
> Included in this series are conversions for find_unique_abbrev and
> lookup_replace_object, as well as parts of the sha1_file code.
>
> Conflicts with pu are average in number but minor, mostly because of the
> type_name conversion.  None of them are tricky, except that the
> introduction of get_tree_entry_if_blob requires a conversion of that
> function.

And the reason why this is based on 'next' is...?  Which topic(s) do
we have to wait for until we can queue this series, in other words?

Thanks for working on this, though.

[PATCH v2 2/3] config: respect `pager.config` in list/get-mode only

2018-02-21 Thread Martin Ågren

Similar to de121ffe5 (tag: respect `pager.tag` in list-mode only,
2017-08-02), use the DELAY_PAGER_CONFIG-mechanism to only respect
`pager.config` when we are listing or "get"ing config.

We have several getters and some are guaranteed to give at most one line
of output. Paging all getters including those could be convenient from a
documentation point-of-view. The downside would be that a misconfigured
or not so modern pager might wait for user interaction before
terminating. Let's instead respect the config for precisely those
getters which may produce more than one line of output.

`--get-urlmatch` may or may not produce multiple lines of output,
depending on the exact usage. Let's not try to recognize the two modes,
but instead make `--get-urlmatch` always respect the config. Analyzing
the detailed usage might be trivial enough here, but could establish a
precedent that we will never be able to enforce throughout the codebase
and that will just open a can of worms.

This fixes the failing test added in the previous commit. Also adapt the
test for whether `git config foo.bar bar` and `git config --get foo.bar`
respects `pager.config`.

Signed-off-by: Martin Ågren 
---
 Documentation/git-config.txt |  5 +
 t/t7006-pager.sh | 10 +-
 builtin/config.c | 10 ++
 git.c|  2 +-
 4 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/Documentation/git-config.txt b/Documentation/git-config.txt
index 14da5fc157..249090ac84 100644
--- a/Documentation/git-config.txt
+++ b/Documentation/git-config.txt
@@ -233,6 +233,11 @@ See also <>.
using `--file`, `--global`, etc) and `on` when searching all
config files.
 
+CONFIGURATION
+-
+`pager.config` is only respected when listing configuration, i.e., when
+using `--list` or any of the `--get-*` which may return multiple results.
+
 [[FILES]]
 FILES
 -
diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index a46a079339..95bd26f0b2 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -245,13 +245,13 @@ test_expect_success TTY 'git branch --set-upstream-to 
ignores pager.branch' '
! test -e paginated.out
 '
 
-test_expect_success TTY 'git config respects pager.config when setting' '
+test_expect_success TTY 'git config ignores pager.config when setting' '
rm -f paginated.out &&
test_terminal git -c pager.config config foo.bar bar &&
-   test -e paginated.out
+   ! test -e paginated.out
 '
 
-test_expect_failure TTY 'git config --edit ignores pager.config' '
+test_expect_success TTY 'git config --edit ignores pager.config' '
rm -f paginated.out editor.used &&
write_script editor <<-\EOF &&
touch editor.used
@@ -261,10 +261,10 @@ test_expect_failure TTY 'git config --edit ignores 
pager.config' '
test -e editor.used
 '
 
-test_expect_success TTY 'git config --get respects pager.config' '
+test_expect_success TTY 'git config --get ignores pager.config' '
rm -f paginated.out &&
test_terminal git -c pager.config config --get foo.bar &&
-   test -e paginated.out
+   ! test -e paginated.out
 '
 
 test_expect_success TTY 'git config --get-urlmatch defaults to not paging' '
diff --git a/builtin/config.c b/builtin/config.c
index ab5f95476e..a732d9b56c 100644
--- a/builtin/config.c
+++ b/builtin/config.c
@@ -48,6 +48,13 @@ static int show_origin;
 #define ACTION_GET_COLORBOOL (1<<14)
 #define ACTION_GET_URLMATCH (1<<15)
 
+/*
+ * The actions "ACTION_LIST | ACTION_GET_*" which may produce more than
+ * one line of output and which should therefore be paged.
+ */
+#define PAGING_ACTIONS (ACTION_LIST | ACTION_GET_ALL | \
+   ACTION_GET_REGEXP | ACTION_GET_URLMATCH)
+
 #define TYPE_BOOL (1<<0)
 #define TYPE_INT (1<<1)
 #define TYPE_BOOL_OR_INT (1<<2)
@@ -594,6 +601,9 @@ int cmd_config(int argc, const char **argv, const char 
*prefix)
usage_with_options(builtin_config_usage, 
builtin_config_options);
}
 
+   if (actions & PAGING_ACTIONS)
+   setup_auto_pager("config", 0);
+
if (actions == ACTION_LIST) {
check_argc(argc, 0, 0);
if (config_with_options(show_all_config, NULL,
diff --git a/git.c b/git.c
index c870b9719c..e5c9b8729d 100644
--- a/git.c
+++ b/git.c
@@ -389,7 +389,7 @@ static struct cmd_struct commands[] = {
{ "column", cmd_column, RUN_SETUP_GENTLY },
{ "commit", cmd_commit, RUN_SETUP | NEED_WORK_TREE },
{ "commit-tree", cmd_commit_tree, RUN_SETUP },
-   { "config", cmd_config, RUN_SETUP_GENTLY },
+   { "config", cmd_config, RUN_SETUP_GENTLY | DELAY_PAGER_CONFIG },
{ "count-objects", cmd_count_objects, RUN_SETUP },
{ "credential", cmd_credential, RUN_SETUP_GENTLY },
{ "describe", cmd_describe, RUN_SETUP },
-- 
2.16.2.246.ga4ee8f

Re: [PATCH v4 03/13] commit-graph: create git-commit-graph builtin

2018-02-21 Thread Junio C Hamano

Junio C Hamano  writes:

> Derrick Stolee  writes:
>
>> +int cmd_commit_graph(int argc, const char **argv, const char *prefix)
>> +{
>> +static struct option builtin_commit_graph_options[] = {
>> +{ OPTION_STRING, 'p', "object-dir", _dir,
>> +N_("dir"),
>> +N_("The object directory to store the graph") },
>
> I have a suspicion that this was modeled after some other built-in
> that has a similar issue (perhaps written long time ago), but isn't
> OPT_STRING() sufficient to define this element these days?
>
> Or am I missing something?
>
> Why squat on short-and-sweet "-p"?  For that matter, since this is
> not expected to be end-user facing command anyway, I suspect that we
> do not want to allocate a single letter option from day one, which
> paints ourselves into a corner from where we cannot escape.

I suspect that exactly the same comment applies to patches in this
series that add other subcommands (I just saw one in the patch for
adding 'write').

[PATCH] commit: drop uses of get_cached_commit_buffer()

2018-02-21 Thread Derrick Stolee

The get_cached_commit_buffer() method provides access to the buffer
loaded for a struct commit, if it was ever loadead and was not freed.

Two places use this to inform how to output information about commits.

log-tree.c uses this method to short-circuit the output of commit
information when the buffer is not cached. However, this leads to
incorrect output in 'git log --oneline' where the short-OID is written
but then the rest of the commit information is dropped and the next
commit is written on the same line.

rev-list uses this method for two reasons:

- First, if the revision walk visits a commit twice, the buffer was
  freed by rev-list in the first write. The output then does not
  match the format expectations, since the OID is written without the
  rest of the content.

- Second, if the revision walk visits a commit that was marked
  UNINTERESTING, the walk may never load a buffer and hence rev-list
  will not output the verbose information.

These behaviors are undocumented, untested, and unlikely to be
expected by users or other software attempting to parse this output.

Helped-by: Jeff King 
Signed-off-by: Derrick Stolee 
---
 builtin/rev-list.c | 2 +-
 log-tree.c | 3 ---
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/builtin/rev-list.c b/builtin/rev-list.c
index 48300d9..d320b6f 100644
--- a/builtin/rev-list.c
+++ b/builtin/rev-list.c
@@ -134,7 +134,7 @@ static void show_commit(struct commit *commit, void *data)
else
putchar('\n');
 
-   if (revs->verbose_header && get_cached_commit_buffer(commit, NULL)) {
+   if (revs->verbose_header) {
struct strbuf buf = STRBUF_INIT;
struct pretty_print_context ctx = {0};
ctx.abbrev = revs->abbrev;
diff --git a/log-tree.c b/log-tree.c
index fc0cc0d..22b2fb6 100644
--- a/log-tree.c
+++ b/log-tree.c
@@ -659,9 +659,6 @@ void show_log(struct rev_info *opt)
show_mergetag(opt, commit);
}
 
-   if (!get_cached_commit_buffer(commit, NULL))
-   return;
-
if (opt->show_notes) {
int raw;
struct strbuf notebuf = STRBUF_INIT;
-- 
2.7.4

Re: [PATCH v4 01/13] commit-graph: add format document

2018-02-21 Thread Stefan Beller

On Mon, Feb 19, 2018 at 10:53 AM, Derrick Stolee  wrote:
> Add document specifying the binary format for commit graphs. This
> format allows for:
>
> * New versions.
> * New hash functions and hash lengths.
> * Optional extensions.
>
> Basic header information is followed by a binary table of contents
> into "chunks" that include:
>
> * An ordered list of commit object IDs.
> * A 256-entry fanout into that list of OIDs.
> * A list of metadata for the commits.
> * A list of "large edges" to enable octopus merges.
>
> The format automatically includes two parent positions for every
> commit. This favors speed over space, since using only one position
> per commit would cause an extra level of indirection for every merge
> commit. (Octopus merges suffer from this indirection, but they are
> very rare.)
>
> Signed-off-by: Derrick Stolee 
> ---
>  Documentation/technical/commit-graph-format.txt | 90 
> +
>  1 file changed, 90 insertions(+)
>  create mode 100644 Documentation/technical/commit-graph-format.txt
>
> diff --git a/Documentation/technical/commit-graph-format.txt 
> b/Documentation/technical/commit-graph-format.txt
> new file mode 100644
> index 000..11b18b5
> --- /dev/null
> +++ b/Documentation/technical/commit-graph-format.txt
> @@ -0,0 +1,90 @@
> +Git commit graph format
> +===
> +
> +The Git commit graph stores a list of commit OIDs and some associated
> +metadata, including:
> +
> +- The generation number of the commit. Commits with no parents have
> +  generation number 1; commits with parents have generation number
> +  one more than the maximum generation number of its parents. We
> +  reserve zero as special, and can be used to mark a generation
> +  number invalid or as "not computed".
> +
> +- The root tree OID.
> +
> +- The commit date.
> +
> +- The parents of the commit, stored using positional references within
> +  the graph file.
> +
> +== graph-*.graph files have the following format:
> +
> +In order to allow extensions that add extra data to the graph, we organize
> +the body into "chunks" and provide a binary lookup table at the beginning
> +of the body. The header includes certain values, such as number of chunks,
> +hash lengths and types.
> +
> +All 4-byte numbers are in network order.
> +
> +HEADER:
> +
> +  4-byte signature:
> +  The signature is: {'C', 'G', 'P', 'H'}
> +
> +  1-byte version number:
> +  Currently, the only valid version is 1.
> +
> +  1-byte Object Id Version (1 = SHA-1)
> +
> +  1-byte number (C) of "chunks"
> +
> +  1-byte (reserved for later use)

What should clients of today do with it?
* ignore it completely [as they have no idea what it is] or
* throw hands up in the air if it is anything other than 0 ?
  [because clearly we will increment the version
   or have new information in a new chunk instead of just sneaking
   in information here?]

> +CHUNK LOOKUP:
> +
> +  (C + 1) * 12 bytes listing the table of contents for the chunks:
> +  First 4 bytes describe chunk id. Value 0 is a terminating label.
> +  Other 8 bytes provide offset in current file for chunk to start.

offset [in bytes? I could imagine having a larger granularity here,
because chunks don't sound small.]

> +  (Chunks are ordered contiguously in the file, so you can infer
> +  the length using the next chunk position if necessary.)
> +
> +  The remaining data in the body is described one chunk at a time, and
> +  these chunks may be given in any order. Chunks are required unless
> +  otherwise specified.
> +
> +CHUNK DATA:
> +
> +  OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
> +  The ith entry, F[i], stores the number of OIDs with first
> +  byte at most i. Thus F[255] stores the total
> +  number of commits (N).

[ so in small repos, where there are fewer than 256 objects,
F[i] == F[i+1], for all i'th where there is no object starting with i byte]

> +  OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
> +  The OIDs for all commits in the graph, sorted in ascending order.
> +
> +  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
> +* The first H bytes are for the OID of the root tree.
> +* The next 8 bytes are for the int-ids of the first two parents
> +  of the ith commit. Stores value 0x if no parent in that
> +  position. If there are more than two parents, the second value
> +  has its most-significant bit on and the other bits store an array
> +  position into the Large Edge List chunk.
> +* The next 8 bytes store the generation number of the commit and
> +  the commit time in seconds since EPOCH. The generation number
> +  uses the higher 30 bits of the first 4 bytes, while the commit
> +  time uses the 32 bits of the second 4 bytes, along with the lowest
> +  2 bits of the lowest byte, storing the 33rd and 34th bit of the
> +  commit time.
> +
> +  Large Edge List (ID: {'E',

Re: [PATCH v3 00/35] protocol version 2

2018-02-21 Thread Brandon Williams

On 02/06, Brandon Williams wrote:
> Changes in v3:
>  * There were some comments about how the protocol should be designed
>stateless first.  I've made this change and instead of having to
>supply the `stateless-rpc=true` capability to force stateless
>behavior, the protocol just requires all commands to be stateless.
>  
>  * Added some patches towards the end of the series to force the client
>to not request to use protocol v2 when pushing (even if configured to
>use v2).  This is to ease the roll-out process of a push command in
>protocol v2.  This way when servers gain the ability to accept
>pushing in v2 (and they start responding using v2 when requests are
>sent to the git-receive-pack endpoint) that clients who still don't
>understand how to push using v2 won't request to use v2 and then die
>when they recognize that the server does indeed know how to accept a
>push under v2.
> 
>  * I implemented the `shallow` feature for fetch.  This feature
>encapsulates the existing functionality of all the shallow/deepen
>capabilities in v0.  So now a server can process shallow requests.
> 
>  * Various other small tweaks that I can't remember :)
> 
> After all of that I think the series is in a pretty good state, baring
> any more critical reviewing feedback.
> 
> Thanks!

I'm hoping to get some more in depth review before I do any more
re-rolls, but for those interested I will need to do a re-roll to
eliminate the prelude from the http transport.  This is the prelude
which includes the service line followed by any number of packet lines
culminating in a flush-pkt like so:

  # service=git-upload-pack
  some
  other
  optional
  lines
  

With this eliminated all transports will be exactly the same, the only
difference will be how the protocol is tunneled.

> 
> Brandon Williams (35):
>   pkt-line: introduce packet_read_with_status
>   pkt-line: introduce struct packet_reader
>   pkt-line: add delim packet support
>   upload-pack: convert to a builtin
>   upload-pack: factor out processing lines
>   transport: use get_refs_via_connect to get refs
>   connect: convert get_remote_heads to use struct packet_reader
>   connect: discover protocol version outside of get_remote_heads
>   transport: store protocol version
>   protocol: introduce enum protocol_version value protocol_v2
>   test-pkt-line: introduce a packet-line test helper
>   serve: introduce git-serve
>   ls-refs: introduce ls-refs server command
>   connect: request remote refs using v2
>   transport: convert get_refs_list to take a list of ref patterns
>   transport: convert transport_get_remote_refs to take a list of ref
> patterns
>   ls-remote: pass ref patterns when requesting a remote's refs
>   fetch: pass ref patterns when fetching
>   push: pass ref patterns when pushing
>   upload-pack: introduce fetch server command
>   fetch-pack: perform a fetch using v2
>   upload-pack: support shallow requests
>   fetch-pack: support shallow requests
>   connect: refactor git_connect to only get the protocol version once
>   connect: don't request v2 when pushing
>   transport-helper: remove name parameter
>   transport-helper: refactor process_connect_service
>   transport-helper: introduce stateless-connect
>   pkt-line: add packet_buf_write_len function
>   remote-curl: create copy of the service name
>   remote-curl: store the protocol version the server responded with
>   http: allow providing extra headers for http requests
>   http: don't always add Git-Protocol header
>   remote-curl: implement stateless-connect command
>   remote-curl: don't request v2 when pushing
> 
>  .gitignore  |   1 +
>  Documentation/technical/protocol-v2.txt | 338 +
>  Makefile|   7 +-
>  builtin.h   |   2 +
>  builtin/clone.c |   2 +-
>  builtin/fetch-pack.c|  21 +-
>  builtin/fetch.c |  14 +-
>  builtin/ls-remote.c |   7 +-
>  builtin/receive-pack.c  |   6 +
>  builtin/remote.c|   2 +-
>  builtin/send-pack.c |  20 +-
>  builtin/serve.c |  30 ++
>  builtin/upload-pack.c   |  74 
>  connect.c   | 352 +-
>  connect.h   |   7 +
>  fetch-pack.c| 319 +++-
>  fetch-pack.h|   4 +-
>  git.c   |   2 +
>  http.c  |  25 +-
>  http.h  |   2 +
>  ls-refs.c   |  96 +
>  ls-refs.h   |   9 +
>  pkt-line.c  | 149 +++-
>  pkt-line.h  |  77 
>  protocol.c

[PATCH v2 0/3] Re: t7006: add tests for how git config paginates

2018-02-21 Thread Martin Ågren

This is v2 of my series to teach `git config` to only respect
`pager.config` when listing configuration, then changing the default to
"on". Thanks to Duy and Junio for feedback on the first version.

Based on Duy's feeback, I've changed the approach to more carefully
divide the various getters into "may produce multiple lines, so let's
page" vs "may not, so don't".

Junio hesitated whether we should add tests using `test_expect_success`,
then flip the test-definition, or whether we should start with a
"failure" that we then flip to "success". I have not done anything about
that, except to try and motivate the choice better in the commit message
of the second patch.

Martin

Martin Ågren (3):
  t7006: add tests for how git config paginates
  config: respect `pager.config` in list/get-mode only
  config: change default of `pager.config` to "on"

 Documentation/git-config.txt |  6 ++
 t/t7006-pager.sh | 49 +---
 builtin/config.c | 10 +
 git.c|  2 +-
 4 files changed, 59 insertions(+), 8 deletions(-)

-- 
2.16.2.246.ga4ee8f

Re: [PATCH v4 01/13] commit-graph: add format document

2018-02-21 Thread Derrick Stolee


On 2/21/2018 2:23 PM, Stefan Beller wrote:

On Mon, Feb 19, 2018 at 10:53 AM, Derrick Stolee  wrote:

+In order to allow extensions that add extra data to the graph, we organize
+the body into "chunks" and provide a binary lookup table at the beginning
+of the body. The header includes certain values, such as number of chunks,
+hash lengths and types.
+
+All 4-byte numbers are in network order.
+
+HEADER:
+
+  4-byte signature:
+  The signature is: {'C', 'G', 'P', 'H'}
+
+  1-byte version number:
+  Currently, the only valid version is 1.
+
+  1-byte Object Id Version (1 = SHA-1)
+
+  1-byte number (C) of "chunks"
+
+  1-byte (reserved for later use)

What should clients of today do with it?
* ignore it completely [as they have no idea what it is] or
* throw hands up in the air if it is anything other than 0 ?
   [because clearly we will increment the version
or have new information in a new chunk instead of just sneaking
in information here?]


They should ignore it completely, which will allow using the value for 
something meaningful later without causing a version change (which we DO 
die() for). A user could downgrade from a version that uses this byte 
for something meaningful and not require a new commit-graph file.


The "commit-graph read" subcommand does output this byte, so we can 
verify that the "write" subcommand places a 0 in this position.





+CHUNK LOOKUP:
+
+  (C + 1) * 12 bytes listing the table of contents for the chunks:
+  First 4 bytes describe chunk id. Value 0 is a terminating label.
+  Other 8 bytes provide offset in current file for chunk to start.

offset [in bytes? I could imagine having a larger granularity here,
because chunks don't sound small.]


It is good to specify "offset in bytes".




+  (Chunks are ordered contiguously in the file, so you can infer
+  the length using the next chunk position if necessary.)
+
+  The remaining data in the body is described one chunk at a time, and
+  these chunks may be given in any order. Chunks are required unless
+  otherwise specified.
+
+CHUNK DATA:
+
+  OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes)
+  The ith entry, F[i], stores the number of OIDs with first
+  byte at most i. Thus F[255] stores the total
+  number of commits (N).

[ so in small repos, where there are fewer than 256 objects,
F[i] == F[i+1], for all i'th where there is no object starting with i byte]


Correct. I'm not sure this additional information is valuable for the 
document, though.





+  OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes)
+  The OIDs for all commits in the graph, sorted in ascending order.
+
+  Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes)
+* The first H bytes are for the OID of the root tree.
+* The next 8 bytes are for the int-ids of the first two parents
+  of the ith commit. Stores value 0x if no parent in that
+  position. If there are more than two parents, the second value
+  has its most-significant bit on and the other bits store an array
+  position into the Large Edge List chunk.
+* The next 8 bytes store the generation number of the commit and
+  the commit time in seconds since EPOCH. The generation number
+  uses the higher 30 bits of the first 4 bytes, while the commit
+  time uses the 32 bits of the second 4 bytes, along with the lowest
+  2 bits of the lowest byte, storing the 33rd and 34th bit of the
+  commit time.
+
+  Large Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional]
+  This list of 4-byte values store the second through nth parents for
+  all octopus merges. The second parent value in the commit data stores
+  an array position within this list along with the most-significant bit
+  on. Starting at that array position, iterate through this list of int-ids
+  for the parents until reaching a value with the most-significant bit on.
+  The other bits correspond to the int-id of the last parent.
+
+TRAILER:
+
+   H-byte HASH-checksum of all of the above.
+
--
2.7.4

Makes sense so far, I'll read on.
I agree with Junio, that I could read this documentation without
the urge to point out nits. :)

Thanks,
Stefan

[PATCH v2 1/3] t7006: add tests for how git config paginates

2018-02-21 Thread Martin Ågren

The next couple of commits will change how `git config` handles
`pager.config`, similar to how de121ffe5 (tag: respect `pager.tag` in
list-mode only, 2017-08-02) and ff1e72483 (tag: change default of
`pager.tag` to "on", 2017-08-02) changed `git tag`. Similar work has
also been done to `git branch`.

Add tests in this area to make sure that we don't regress and so that
the upcoming commits can be made clearer by adapting the tests. Add
tests for simple config-setting, `--edit`, `--get`, `--get-urlmatch`,
`get-all`, and `--list`. Those represent a fair portion of the various
options that will be affected by the next two commits.

Use `test_expect_failure` to document that we currently respect the
pager-configuration with `--edit`. The current behavior is buggy since
the pager interferes with the editor and makes the end result completely
broken. See also b3ee740c8 (t7006: add tests for how git tag paginates,
2017-08-02).

The next commit will teach simple config-setting and `--get` to ignore
`pager.config`. Test the current behavior as "success", not "failure",
since the currently expected behavior according to documentation would
be to page. The next commit will change that expectation by updating the
documentation on `git config` and will redefine those successful tests.

Remove the test added in commit 3ba7e6e29a (config: run
setup_git_directory_gently() sooner, 2010-08-05) since it has some
overlap with these. We could leave it or tweak it, or place new tests
like these next to it, but let's instead make the tests for `git config`
as similar as possible to the ones for `git tag` and `git branch`, and
place them after those.

Signed-off-by: Martin Ågren 
---
 t/t7006-pager.sh | 49 ++---
 1 file changed, 42 insertions(+), 7 deletions(-)

diff --git a/t/t7006-pager.sh b/t/t7006-pager.sh
index f5f46a95b4..a46a079339 100755
--- a/t/t7006-pager.sh
+++ b/t/t7006-pager.sh
@@ -110,13 +110,6 @@ test_expect_success TTY 'configuration can disable pager' '
! test -e paginated.out
 '
 
-test_expect_success TTY 'git config uses a pager if configured to' '
-   rm -f paginated.out &&
-   test_config pager.config true &&
-   test_terminal git config --list &&
-   test -e paginated.out
-'
-
 test_expect_success TTY 'configuration can enable pager (from subdir)' '
rm -f paginated.out &&
mkdir -p subdir &&
@@ -252,6 +245,48 @@ test_expect_success TTY 'git branch --set-upstream-to 
ignores pager.branch' '
! test -e paginated.out
 '
 
+test_expect_success TTY 'git config respects pager.config when setting' '
+   rm -f paginated.out &&
+   test_terminal git -c pager.config config foo.bar bar &&
+   test -e paginated.out
+'
+
+test_expect_failure TTY 'git config --edit ignores pager.config' '
+   rm -f paginated.out editor.used &&
+   write_script editor <<-\EOF &&
+   touch editor.used
+   EOF
+   EDITOR=./editor test_terminal git -c pager.config config --edit &&
+   ! test -e paginated.out &&
+   test -e editor.used
+'
+
+test_expect_success TTY 'git config --get respects pager.config' '
+   rm -f paginated.out &&
+   test_terminal git -c pager.config config --get foo.bar &&
+   test -e paginated.out
+'
+
+test_expect_success TTY 'git config --get-urlmatch defaults to not paging' '
+   rm -f paginated.out &&
+   test_terminal git -c http."https://foo.com/".bar=foo \
+ config --get-urlmatch http https://foo.com &&
+   ! test -e paginated.out
+'
+
+test_expect_success TTY 'git config --get-all respects pager.config' '
+   rm -f paginated.out &&
+   test_terminal git -c pager.config config --get-all foo.bar &&
+   test -e paginated.out
+'
+
+test_expect_success TTY 'git config --list defaults to not paging' '
+   rm -f paginated.out &&
+   test_terminal git config --list &&
+   ! test -e paginated.out
+'
+
+
 # A colored commit log will begin with an appropriate ANSI escape
 # for the first color; the text "commit" comes later.
 colorful() {
-- 
2.16.2.246.ga4ee8f

1 2 >

1 - 100 of 119 matches

Mail list logo