Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)

2018-03-18 Thread Sergey Organov
Igor Djordjevic  writes:

> On 15/03/2018 00:11, Igor Djordjevic wrote:
>> 
>> > > > Second side note: if we can fast-forward, currently we prefer
>> > > > that, and I think we should keep that behavior with -R, too.
>> > >
>> > > I agree.
>> > 
>> > I'm admittedly somewhat lost in the discussion, but are you
>> > talking fast-forward on _rebasing_ existing merge? Where would it
>> > go in any of the suggested algorithms of rebasing and why?
>> > 
>> > I readily see how it can break merges. E.g., any "git merge
>> > --ff-only --no-ff" merge will magically disappear. So, even if
>> > somehow supported, fast-forward should not be performed by default
>> > during _rebasing_ of a merge.
>> 
>> Hmm, now that you brought this up, I can only agree, of course.
>> 
>> What I had in my mind was more similar to "no-rebase-cousins", like 
>> if we can get away without actually rebasing the merge but still 
>> using the original one, do it. But I guess that`s not what Johannes 
>> originally asked about.
>> 
>> This is another definitive difference between rebasing (`pick`?) and 
>> recreating (`merge`) a merge commit - in the case where we`re rebasing, 
>> of course it doesn`t make sense to drop commit this time (due to 
>> fast-forward). This does make sense in recreating the merge (only).
>
> Eh, I might take this back. I think my original interpretation (and 
> agreement) to fast-forwarding is correct.
>
> But the confusion here comes from `--no-ff` as used for merging, as 
> opposed to `--no-ff` as used for rebasing. I _think_ Johannes meant 
> the latter one.
>
> In rebasing, `--no-ff` means that even if a commit inside todo list 
> isn`t to be changed, do not reuse it but create a new one. Here`s 
> excerpt from the docs[1]:
>
>   --no-ff
> With --interactive, cherry-pick all rebased commits instead of 
> fast-forwarding over the unchanged ones. This ensures that the 
> entire history of the rebased branch is composed of new commits.
>
> Without --interactive, this is a synonym for --force-rebase.
>
>
> So fast-forwarding in case of rebasing (merge commits as well) is 
> something you would want by default, as it wouldn`t drop/lose 
> anything, but merely reuse existing commit (if unchanged), instead of 
> cherry-picking (rebasing) it into a new (merge) commit anyway.

This sounds like breakage. E.g., it seems to be breaking every "-x ours"
merge out there.

Fast-forwarding existing merge, one way or another, still seems to be
wrong idea to me, as merge commit is not only about content change, but
also about joint point at particular place in the DAG.

As for fast-forwarding re-merge, explicitly requested, I'm not sure. On
one hand, it's inline with the default "git merge" behavior, on the
other hand, it still feels wrong, somehow.

-- Sergey


Re: [PATCH v6 1/3] Makefile: generate Perl header from template file

2018-03-18 Thread Eric Sunshine
On Sun, Mar 18, 2018 at 10:50 PM, Dan Jacques  wrote:
> Currently, the generated Perl script headers are emitted by commands in
> the Makefile. This mechanism restricts options to introduce alternative
> header content, needed by Perl runtime prefix support, and obscures the
> origin of the Perl script header.
>
> Change the Makefile to generate a header by processing a template file and
> move the header content into the "perl/" subdirectory. The processed
> generated will now be stored in the "GIT-PERL-HEADER" file. This allows

"processed generated"?

> the content of the Perl header to be controlled by changing the path of
> the template in the Makefile.
>
> Signed-off-by: Dan Jacques 


[PATCH v6 2/3] Makefile: add Perl runtime prefix support

2018-03-18 Thread Dan Jacques
Add a new Makefile flag, RUNTIME_PREFIX_PERL, which, when enabled,
configures Perl scripts to locate the Git installation's Perl support
libraries by resolving against the script's path, rather than
hard-coding that path at build-time.

RUNTIME_PREFIX_PERL requires that system paths are expressed relative to
a common installation directory, and uses that relationship to locate
support files based on the known starting point of the script being
executed, much like RUNTIME_PREFIX does for the Git binary.

This change enables Git's Perl scripts to work when their Git installation
is relocated or moved to another system.

Signed-off-by: Dan Jacques 
Thanks-to: Ævar Arnfjörð Bjarmason 
Thanks-to: Johannes Schindelin 
---
 Makefile | 67 +++-
 perl/Git/I18N.pm |  2 +-
 perl/header_templates/runtime_prefix.template.pl | 40 ++
 3 files changed, 107 insertions(+), 2 deletions(-)
 create mode 100644 perl/header_templates/runtime_prefix.template.pl

diff --git a/Makefile b/Makefile
index e479822ce..101a98a78 100644
--- a/Makefile
+++ b/Makefile
@@ -434,6 +434,13 @@ all::
 #
 # When cross-compiling, define HOST_CPU as the canonical name of the CPU on
 # which the built Git will run (for instance "x86_64").
+#
+# Define RUNTIME_PREFIX to configure Git to resolve its ancillary tooling and
+# support files relative to the location of the runtime binary, rather than
+# hard-coding them into the binary. Git installations built with RUNTIME_PREFIX
+# can be moved to arbitrary filesystem locations. RUNTIME_PREFIX also causes
+# Perl scripts to use a modified entry point header allowing them to resolve
+# support files at runtime.
 
 GIT-VERSION-FILE: FORCE
@$(SHELL_PATH) ./GIT-VERSION-GEN
@@ -471,6 +478,8 @@ ARFLAGS = rcs
 #   mandir
 #   infodir
 #   htmldir
+#   localedir
+#   perllibdir
 # This can help installing the suite in a relocatable way.
 
 prefix = $(HOME)
@@ -492,9 +501,12 @@ lib = lib
 # DESTDIR =
 pathsep = :
 
+gitexecdir_relative = $(patsubst $(prefix)/%,%,$(gitexecdir))
 mandir_relative = $(patsubst $(prefix)/%,%,$(mandir))
 infodir_relative = $(patsubst $(prefix)/%,%,$(infodir))
+localedir_relative = $(patsubst $(prefix)/%,%,$(localedir))
 htmldir_relative = $(patsubst $(prefix)/%,%,$(htmldir))
+perllibdir_relative = $(patsubst $(prefix)/%,%,$(perllibdir))
 
 export prefix bindir sharedir sysconfdir gitwebdir perllibdir localedir
 
@@ -1740,10 +1752,13 @@ mandir_relative_SQ = $(subst ','\'',$(mandir_relative))
 infodir_relative_SQ = $(subst ','\'',$(infodir_relative))
 perllibdir_SQ = $(subst ','\'',$(perllibdir))
 localedir_SQ = $(subst ','\'',$(localedir))
+localedir_relative_SQ = $(subst ','\'',$(localedir_relative))
 gitexecdir_SQ = $(subst ','\'',$(gitexecdir))
+gitexecdir_relative_SQ = $(subst ','\'',$(gitexecdir_relative))
 template_dir_SQ = $(subst ','\'',$(template_dir))
 htmldir_relative_SQ = $(subst ','\'',$(htmldir_relative))
 prefix_SQ = $(subst ','\'',$(prefix))
+perllibdir_relative_SQ = $(subst ','\'',$(perllibdir_relative))
 gitwebdir_SQ = $(subst ','\'',$(gitwebdir))
 
 SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))
@@ -1754,6 +1769,31 @@ TCLTK_PATH_SQ = $(subst ','\'',$(TCLTK_PATH))
 DIFF_SQ = $(subst ','\'',$(DIFF))
 PERLLIB_EXTRA_SQ = $(subst ','\'',$(PERLLIB_EXTRA))
 
+# RUNTIME_PREFIX's resolution logic requires resource paths to be expressed
+# relative to each other and share an installation path.
+#
+# This is a dependnecy in:
+# - Git's binary RUNTIME_PREFIX logic in (see "exec_cmd.c").
+# - The runtime prefix Perl header (see
+#   "perl/header_templates/runtime_prefix.template.pl").
+ifdef RUNTIME_PREFIX
+
+ifneq ($(filter /%,$(firstword $(gitexecdir_relative))),)
+$(error RUNTIME_PREFIX requires a relative gitexecdir, not: $(gitexecdir))
+endif
+
+ifneq ($(filter /%,$(firstword $(localedir_relative))),)
+$(error RUNTIME_PREFIX requires a relative localedir, not: $(localedir))
+endif
+
+ifndef NO_PERL
+ifneq ($(filter /%,$(firstword $(perllibdir_relative))),)
+$(error RUNTIME_PREFIX requires a relative perllibdir, not: $(perllibdir))
+endif
+endif
+
+endif
+
 # We must filter out any object files from $(GITLIBS),
 # as it is typically used like:
 #
@@ -1974,10 +2014,31 @@ git.res: git.rc GIT-VERSION-FILE
 # This makes sure we depend on the NO_PERL setting itself.
 $(SCRIPT_PERL_GEN): GIT-BUILD-OPTIONS
 
+# Used for substitution in Perl modules. Disabled when using RUNTIME_PREFIX
+# since the locale directory is injected.
+perl_localedir_SQ = $(localedir_SQ)
+
 ifndef NO_PERL
 PERL_HEADER_TEMPLATE = perl/header_templates/fixed_prefix.template.pl
 PERL_DEFINES = $(PERL_PATH_SQ):$(PERLLIB_EXTRA_SQ):$(perllibdir_SQ)
 
+PERL_DEFINES := $(PERL_PATH_SQ) $(PERLLIB_EXTRA_SQ) $(perllibdir_SQ)
+PERL_DEFINES += $(RUNTIME_PREFIX)
+
+# Support Perl runtime prefix. In this mode, a different header is installed
+# into Perl 

[PATCH v6 1/3] Makefile: generate Perl header from template file

2018-03-18 Thread Dan Jacques
Currently, the generated Perl script headers are emitted by commands in
the Makefile. This mechanism restricts options to introduce alternative
header content, needed by Perl runtime prefix support, and obscures the
origin of the Perl script header.

Change the Makefile to generate a header by processing a template file and
move the header content into the "perl/" subdirectory. The processed
generated will now be stored in the "GIT-PERL-HEADER" file. This allows
the content of the Perl header to be controlled by changing the path of
the template in the Makefile.

Signed-off-by: Dan Jacques 
Thanks-to: Ævar Arnfjörð Bjarmason 
Thanks-to: Johannes Schindelin 
---
 .gitignore |  1 +
 Makefile   | 27 +++---
 perl/header_templates/fixed_prefix.template.pl |  1 +
 3 files changed, 18 insertions(+), 11 deletions(-)
 create mode 100644 perl/header_templates/fixed_prefix.template.pl

diff --git a/.gitignore b/.gitignore
index 833ef3b0b..89bd7bd8a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -3,6 +3,7 @@
 /GIT-LDFLAGS
 /GIT-PREFIX
 /GIT-PERL-DEFINES
+/GIT-PERL-HEADER
 /GIT-PYTHON-VARS
 /GIT-SCRIPT-DEFINES
 /GIT-USER-AGENT
diff --git a/Makefile b/Makefile
index a1d8775ad..e479822ce 100644
--- a/Makefile
+++ b/Makefile
@@ -1975,20 +1975,15 @@ git.res: git.rc GIT-VERSION-FILE
 $(SCRIPT_PERL_GEN): GIT-BUILD-OPTIONS
 
 ifndef NO_PERL
-$(SCRIPT_PERL_GEN):
-
+PERL_HEADER_TEMPLATE = perl/header_templates/fixed_prefix.template.pl
 PERL_DEFINES = $(PERL_PATH_SQ):$(PERLLIB_EXTRA_SQ):$(perllibdir_SQ)
-$(SCRIPT_PERL_GEN): % : %.perl GIT-PERL-DEFINES GIT-VERSION-FILE
+
+$(SCRIPT_PERL_GEN): % : %.perl GIT-PERL-DEFINES GIT-PERL-HEADER 
GIT-VERSION-FILE
$(QUIET_GEN)$(RM) $@ $@+ && \
-   INSTLIBDIR='$(perllibdir_SQ)' && \
-   INSTLIBDIR_EXTRA='$(PERLLIB_EXTRA_SQ)' && \
-   INSTLIBDIR="$$INSTLIBDIR$${INSTLIBDIR_EXTRA:+:$$INSTLIBDIR_EXTRA}" && \
sed -e '1{' \
-e 's|#!.*perl|#!$(PERL_PATH_SQ)|' \
-   -e 'h' \
-   -e 's=.*=use lib (split(/$(pathsep)/, $$ENV{GITPERLLIB} || 
"'"$$INSTLIBDIR"'"));=' \
-   -e 'H' \
-   -e 'x' \
+   -e 'rGIT-PERL-HEADER' \
+   -e 'G' \
-e '}' \
-e 's/@@GIT_VERSION@@/$(GIT_VERSION)/g' \
$< >$@+ && \
@@ -2002,6 +1997,16 @@ GIT-PERL-DEFINES: FORCE
echo "$$FLAGS" >$@; \
fi
 
+GIT-PERL-HEADER: $(PERL_HEADER_TEMPLATE) GIT-PERL-DEFINES Makefile
+   $(QUIET_GEN)$(RM) $@ && \
+   INSTLIBDIR='$(perllibdir_SQ)' && \
+   INSTLIBDIR_EXTRA='$(PERLLIB_EXTRA_SQ)' && \
+   INSTLIBDIR="$$INSTLIBDIR$${INSTLIBDIR_EXTRA:+:$$INSTLIBDIR_EXTRA}" && \
+   sed -e 's=@@PATHSEP@@=$(pathsep)=g' \
+   -e 's=@@INSTLIBDIR@@='$$INSTLIBDIR'=g' \
+   -e 's=@@PERLLIBDIR@@='$(perllibdir_SQ)'=g' \
+   $< >$@+ && \
+   mv $@+ $@
 
 .PHONY: gitweb
 gitweb:
@@ -2770,7 +2775,7 @@ ifndef NO_TCLTK
 endif
$(RM) GIT-VERSION-FILE GIT-CFLAGS GIT-LDFLAGS GIT-BUILD-OPTIONS
$(RM) GIT-USER-AGENT GIT-PREFIX
-   $(RM) GIT-SCRIPT-DEFINES GIT-PERL-DEFINES GIT-PYTHON-VARS
+   $(RM) GIT-SCRIPT-DEFINES GIT-PERL-DEFINES GIT-PERL-HEADER 
GIT-PYTHON-VARS
 
 .PHONY: all install profile-clean clean strip
 .PHONY: shell_compatibility_test please_set_SHELL_PATH_to_a_more_modern_shell
diff --git a/perl/header_templates/fixed_prefix.template.pl 
b/perl/header_templates/fixed_prefix.template.pl
new file mode 100644
index 0..857b4391a
--- /dev/null
+++ b/perl/header_templates/fixed_prefix.template.pl
@@ -0,0 +1 @@
+use lib (split(/@@PATHSEP@@/, $ENV{GITPERLLIB} || '@@INSTLIBDIR@@'));
-- 
2.15.0.chromium12



[PATCH v6 3/3] exec_cmd: RUNTIME_PREFIX on some POSIX systems

2018-03-18 Thread Dan Jacques
Enable Git to resolve its own binary location using a variety of
OS-specific and generic methods, including:

- procfs via "/proc/self/exe" (Linux)
- _NSGetExecutablePath (Darwin)
- KERN_PROC_PATHNAME sysctl on BSDs.
- argv0, if absolute (all, including Windows).

This is used to enable RUNTIME_PREFIX support for non-Windows systems,
notably Linux and Darwin. When configured with RUNTIME_PREFIX, Git will
do a best-effort resolution of its executable path and automatically use
this as its "exec_path" for relative helper and data lookups, unless
explicitly overridden.

Small incidental formatting cleanup of "exec_cmd.c".

Signed-off-by: Dan Jacques 
Thanks-to: Robbie Iannucci 
Thanks-to: Junio C Hamano 
---
 Makefile   |  26 +-
 cache.h|   1 +
 common-main.c  |   4 +-
 config.mak.uname   |   7 ++
 exec_cmd.c | 241 ++---
 exec_cmd.h |   4 +-
 gettext.c  |   8 +-
 git.c  |   2 +-
 t/t0061-run-command.sh |   2 +-
 9 files changed, 254 insertions(+), 41 deletions(-)

diff --git a/Makefile b/Makefile
index 101a98a78..df17a62a4 100644
--- a/Makefile
+++ b/Makefile
@@ -418,6 +418,16 @@ all::
 #
 # Define HAVE_BSD_SYSCTL if your platform has a BSD-compatible sysctl function.
 #
+# Define HAVE_BSD_KERN_PROC_SYSCTL if your platform supports the KERN_PROC BSD
+# sysctl function.
+#
+# Define PROCFS_EXECUTABLE_PATH if your platform mounts a "procfs" filesystem
+# capable of resolving the path of the current executable. If defined, this
+# must be the canonical path for the "procfs" current executable path.
+#
+# Define HAVE_NS_GET_EXECUTABLE_PATH if your platform supports calling
+# _NSGetExecutablePath to retrieve the path of the running executable.
+#
 # Define HAVE_GETDELIM if your system has the getdelim() function.
 #
 # Define PAGER_ENV to a SP separated VAR=VAL pairs to define
@@ -1664,10 +1674,23 @@ ifdef HAVE_BSD_SYSCTL
BASIC_CFLAGS += -DHAVE_BSD_SYSCTL
 endif
 
+ifdef HAVE_BSD_KERN_PROC_SYSCTL
+   BASIC_CFLAGS += -DHAVE_BSD_KERN_PROC_SYSCTL
+endif
+
 ifdef HAVE_GETDELIM
BASIC_CFLAGS += -DHAVE_GETDELIM
 endif
 
+ifneq ($(PROCFS_EXECUTABLE_PATH),)
+   procfs_executable_path_SQ = $(subst ','\'',$(PROCFS_EXECUTABLE_PATH))
+   BASIC_CFLAGS += 
'-DPROCFS_EXECUTABLE_PATH="$(procfs_executable_path_SQ)"'
+endif
+
+ifdef HAVE_NS_GET_EXECUTABLE_PATH
+   BASIC_CFLAGS += -DHAVE_NS_GET_EXECUTABLE_PATH
+endif
+
 ifeq ($(TCLTK_PATH),)
 NO_TCLTK = NoThanks
 endif
@@ -2216,6 +2239,7 @@ endif
 exec_cmd.sp exec_cmd.s exec_cmd.o: GIT-PREFIX
 exec_cmd.sp exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \
'-DGIT_EXEC_PATH="$(gitexecdir_SQ)"' \
+   '-DGIT_LOCALE_PATH="$(localedir_relative_SQ)"' \
'-DBINDIR="$(bindir_relative_SQ)"' \
'-DPREFIX="$(prefix_SQ)"'
 
@@ -2233,7 +2257,7 @@ attr.sp attr.s attr.o: EXTRA_CPPFLAGS = \
 
 gettext.sp gettext.s gettext.o: GIT-PREFIX
 gettext.sp gettext.s gettext.o: EXTRA_CPPFLAGS = \
-   -DGIT_LOCALE_PATH='"$(localedir_SQ)"'
+   -DGIT_LOCALE_PATH='"$(localedir_relative_SQ)"'
 
 http-push.sp http.sp http-walker.sp remote-curl.sp imap-send.sp: SPARSE_FLAGS 
+= \
-DCURL_DISABLE_TYPECHECK
diff --git a/cache.h b/cache.h
index d06932ed0..2d1999a5f 100644
--- a/cache.h
+++ b/cache.h
@@ -428,6 +428,7 @@ static inline enum object_type object_type(unsigned int 
mode)
 #define GIT_ICASE_PATHSPECS_ENVIRONMENT "GIT_ICASE_PATHSPECS"
 #define GIT_QUARANTINE_ENVIRONMENT "GIT_QUARANTINE_PATH"
 #define GIT_OPTIONAL_LOCKS_ENVIRONMENT "GIT_OPTIONAL_LOCKS"
+#define GIT_TEXT_DOMAIN_DIR_ENVIRONMENT "GIT_TEXTDOMAINDIR"
 
 /*
  * Environment variable used in handshaking the wire protocol.
diff --git a/common-main.c b/common-main.c
index 6a689007e..6516a1f89 100644
--- a/common-main.c
+++ b/common-main.c
@@ -32,12 +32,12 @@ int main(int argc, const char **argv)
 */
sanitize_stdfds();
 
+   git_resolve_executable_dir(argv[0]);
+
git_setup_gettext();
 
attr_start();
 
-   git_extract_argv0_path(argv[0]);
-
restore_sigpipe_to_default();
 
return cmd_main(argc, argv);
diff --git a/config.mak.uname b/config.mak.uname
index 6a1d0de0c..e1cfe5e5e 100644
--- a/config.mak.uname
+++ b/config.mak.uname
@@ -37,6 +37,7 @@ ifeq ($(uname_S),Linux)
HAVE_GETDELIM = YesPlease
SANE_TEXT_GREP=-a
FREAD_READS_DIRECTORIES = UnfortunatelyYes
+   PROCFS_EXECUTABLE_PATH = /proc/self/exe
 endif
 ifeq ($(uname_S),GNU/kFreeBSD)
HAVE_ALLOCA_H = YesPlease
@@ -111,6 +112,7 @@ ifeq ($(uname_S),Darwin)
BASIC_CFLAGS += -DPROTECT_HFS_DEFAULT=1
HAVE_BSD_SYSCTL = YesPlease
FREAD_READS_DIRECTORIES = UnfortunatelyYes
+   HAVE_NS_GET_EXECUTABLE_PATH = YesPlease
 endif
 ifeq ($(uname_S),SunOS)
NEEDS_SOCKET = YesPlease
@@ -205,6 +207,7 @@ ifeq ($(uname_S),FreeBSD)

[PATCH v6 0/3] RUNTIME_PREFIX relocatable Git

2018-03-18 Thread Dan Jacques
This patch set expands support for the RUNTIME_PREFIX configuration flag,
currently only used on Windows builds, to include Linux, Darwin, and
FreeBSD. When Git is built with RUNTIME_PREFIX enabled, it resolves its
ancillary paths relative to the runtime location of its executable
rather than hard-coding them at compile-time, allowing a Git
installation to be deployed to a path other than the one in which it
was built/installed.

Note that RUNTIME_PREFIX is not currently used outside of Windows.
This patch set should not have an impact on default Git builds.

I'm dusting this back off now that avarab@'s Perl Makefile simplification
patch set has landed. It's been a few months, so I'm a bit rusty, but I think
that I've incorporated all of the feedback. Please take a look and let me know
what you think!

Previous threads:
v1: https://public-inbox.org/git/20171116170523.28696-1-...@google.com/
v2: https://public-inbox.org/git/20171119173141.4896-1-...@google.com/
v3: https://public-inbox.org/git/20171127164055.93283-1-...@google.com/
v4: https://public-inbox.org/git/20171129223807.91343-1-...@google.com/
v5: https://public-inbox.org/git/20180108030239.92036-1-...@google.com/

Changes in v6 from v5:

- Rebased on top of "master".
- Updated commit messages.
- Updated runtime prefix Perl header comment and code to clarify when and
  why FindBin is used.
- With Johannes' blessing on Git-for-Windows, folded "RUNTIME_PREFIX_PERL"
  functionality into "RUNTIME_PREFIX".
- Updated "run-command" test to accommodate RUNTIME_PREFIX trace messages.

=== Testing ===

The latest patch set is available for testing on my GitHub fork, including
"travis.ci" testing. The "runtime-prefix" branch includes a "config.mak"
commit that enables runtime prefix for the Travis build; the
"runtime-prefix-no-config" omits this file, testing this patch without
runtime prefix enabled:
- https://github.com/danjacques/git/tree/runtime-prefix
- https://github.com/danjacques/git/tree/runtime-prefix-no-config
- https://travis-ci.org/danjacques/git/branches

Built/tested locally using this "config.mak" w/ autoconf:

=== Example config.mak ===

## (BEGIN config.mak)

RUNTIME_PREFIX = YesPlease
RUNTIME_PREFIX_PERL = YesPlease
gitexecdir = libexec/git-core
template_dir = share/git-core/templates
sysconfdir = etc

## (END config.mak)

=== Revision History ===

Changes in v5 from v4:

- Rebase on top of "next", notably incorporating the
  "ab/simplify-perl-makefile" branch.
- Cleaner Makefile relative path enforcement.
- Update Perl header template path now that the "perl/" directory has
  fewer build-related files in it.
- Update Perl runtime prefix header to use a general system path resolution
  function.
- Implemented the injection of the locale directory into Perl's
  "Git/I18N.pm" module from the runtime prefix Perl script header.
- Updated Perl's "Git/I18N.pm" module to accept injected locale directory.
- Added more content to some comments.


Changes in v4 from v3:

- Incorporated some quoting and Makefile dependency fixes, courtesy of
  .

Changes in v3 from v2:

- Broken into multiple patches now that Perl is isolated in its own
  RUNTIME_PREFIX_PERL flag.
- Working with avarab@, several changes to Perl script runtime prefix
  support:
  - Moved Perl header body content from Makefile into external template
file(s).
  - Added generic "perllibdir" variable to override Perl installation
path.
  - RUNTIME_PREFIX_PERL generated script header is more descriptive and
consistent with how the C version operates.
  - Fixed Generated Perl header Makefile dependency, should rebuild
when dependent files and flags change.
- Changed some of the new RUNTIME_PREFIX trace strings to use consistent
  formatting and terminology.

Changes in v2 from v1:

- Added comments and formatting to improve readability of
  platform-sepecific executable path resolution sleds in
  `git_get_exec_path`.
- Consolidated "cached_exec_path" and "argv_exec_path" globals
  into "exec_path_value".
- Use `strbuf_realpath` instead of `realpath` for procfs resolution.
- Removed new environment variable exports. Git with RUNTIME_PREFIX no
  longer exports or consumes any additional environment information.
- Updated Perl script resolution strategy: rather than having Git export
  the relative executable path to the Perl scripts, they now resolve
  it independently when RUNTIME_PREFIX_PERL is enabled.
- Updated resolution strategy for "gettext()": use system_path() instead
  of special environment variable.
- Added `sysctl` executable resolution support for BSDs that don't
  mount "procfs" by default (most of them).

Dan Jacques (3):
  Makefile: generate Perl header from template file
  Makefile: add Perl runtime prefix support
  exec_cmd: RUNTIME_PREFIX on some POSIX systems

 .gitignore   |   1 +
 Makefile | 120 +--
 cache.h  

Hi Pretty,

2018-03-18 Thread Jack
Hi Dear, my name is Jack and i am seeking for a relationship in which i will 
feel loved after a series of failed relationships. 

I am hoping that you would be interested and we could possibly get to know each 
other more if you do not mind. I am open to answering questions from you as i 
think my approach is a little inappropriate. Hope to hear back from you.

Jack.


Re: Draft of Git Rev News edition 37

2018-03-18 Thread Christian Couder
On Sun, Mar 18, 2018 at 11:41 PM, Jacob Keller  wrote:
>
> I don't have a good summary yet, but I think a section about the
> discussion regarding the new recreate-merges and rebasing merges
> that's been on going might be useful?

Yeah sure, we would gladly accept a summary of this discussion.

> a lot of that discussion
> occurred prior to git-merge (tho it's been ongoing since then?).

If you want to take the latest discussions into account, the summary
could be either split into two parts, one for this edition and the
other one for the next edition. Or we could wait and have the whole
summary in the next edition.

Thanks,
Christian.


Re: user-manual: patch proposals and questions

2018-03-18 Thread Eric Sunshine
On Sun, Mar 18, 2018 at 7:49 PM, kalle  wrote:
> 1.I wonder, why the "user-manual" is so hidden on the (official?) site
> git-scm.com [it is accessible at git-scm.com/docs/user-manual ,but is
> not viewable in /docs ]

The git-scm.com website is maintained as a distinct project[1] at
Github; it is not directly related to the Git project itself. A good
way to voice your concern about the website is either to open an issue
ticket[2] or submit a pull request[3] if you have a specific change in
mind.

[1]: https://github.com/git/git-scm.com
[2]: https://github.com/git/git-scm.com/issues
[3]: https://github.com/git/git-scm.com/pulls

> 2.I did not receive an answer to my mail. Maybe it could have to do with
> a possible stopped maintainment of the 'user-manual'

More likely it was because your email was not composed in a way for
people to digest and respond to it easily. There are some things you
can do to help make it easier for people to respond:

* Send patches inline rather than attachments since it is much easier
for people to respond to them directly when inline. When they are
attachments, people are forced to open the attachment, then copy/paste
parts of the patch back into the email for response, and most people
won't bother with such effort. Also, make each patch a separate email
with a meaningful commit message ("git format-patch" and "git
send-email" can help with this).

* For your questions about documentation, quote the section of
documentation you want to talk about directly in your email, then ask
a question about it. This saves people the effort of manually having
to open the file and locate the line or paragraph in question. Also,
line numbers change as changes are made to files, so the line number
you cite might not match the line number in a version of the file
someone else is looking at.

> 3.it would be for non-graphics-users to have the Git-Pro-book in
> text-format.

Like the website, the Pro Git book is a distinct project[4], not
directly related to the Git project. To suggest making the book
available as plain text, you can open an issue ticket[5] or submit a
pull request[6] if you implement it yourself.

[4]: https://github.com/progit/progit2
[5]: https://github.com/progit/progit2/issues
[6]: https://github.com/progit/progit2/pulls


>  Weitergeleitete Nachricht 
> Betreff: user-manual: patch proposals and questions
> Datum: Tue, 6 Mar 2018 00:08:55 +0100
> Von: kalle 
> An: git@vger.kernel.org
>
> The patches are attached.
> Further some questions:
>
> -see the explanations of the branch command, ca. line 280: wouldn't it
> be better to use other words than 'references'?
> -sentence "it shows all commits reachable from the parent commit": it
> seems wrong to me. The last commit is also shown.
> - chapter "Browsing revisions": it seems counterintuitive to me to have
> two different logics for the meaning of "branch1..branch2" and
> "branch1...branch2", according to whether it's the argument of `git log'
> or `git diff'
> -section "Check whether two branches point at the same history": 'git
> diff origin..master' -> shouldn't it be rather 'git diff
> branch1..branch2'? … or rewrite the example with branch1=origin and
> branch2=master.
>
> greetings,
> kalle


Fwd: user-manual: patch proposals and questions

2018-03-18 Thread kalle

hello,

1.I wonder, why the "user-manual" is so hidden on the (official?) site
git-scm.com [it is accessible at git-scm.com/docs/user-manual ,but is
not viewable in /docs ]

2.I did not receive an answer to my mail. Maybe it could have to do with
a possible stopped maintainment of the 'user-manual'

3.it would be for non-graphics-users to have the Git-Pro-book in
text-format.

thanks,
kalle

 Weitergeleitete Nachricht 
Betreff: user-manual: patch proposals and questions
Datum: Tue, 6 Mar 2018 00:08:55 +0100
Von: kalle 
An: git@vger.kernel.org

The patches are attached.
Further some questions:

-see the explanations of the branch command, ca. line 280: wouldn't it
be better to use other words than 'references'?
-sentence "it shows all commits reachable from the parent commit": it
seems wrong to me. The last commit is also shown.
- chapter "Browsing revisions": it seems counterintuitive to me to have
two different logics for the meaning of "branch1..branch2" and
"branch1...branch2", according to whether it's the argument of `git log'
or `git diff'
-section "Check whether two branches point at the same history": 'git
diff origin..master' -> shouldn't it be rather 'git diff
branch1..branch2'? … or rewrite the example with branch1=origin and
branch2=master.

greetings,
kalle


>From 19061a0dd4363edbf8757a5e9eee8ace210f4029 Mon Sep 17 00:00:00 2001
From: kalledaballe 
Date: Fri, 9 Feb 2018 20:46:52 +0100
Subject: [PATCH 2/5] 3 small formulation changes in
 Documentation/user-manual.txt

---
 Documentation/user-manual.txt | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
index eff7890..e3efc26 100644
--- a/Documentation/user-manual.txt
+++ b/Documentation/user-manual.txt
@@ -322,7 +322,7 @@ do so (now or later) by using -b with the checkout command again. Example:
 HEAD is now at 427abfa Linux v2.6.17
 
 
-The HEAD then refers to the SHA-1 of the commit instead of to a branch,
+If you haven't created a new branch yet, the HEAD then refers to the SHA-1 of the commit instead of to a branch,
 and git branch shows that you are no longer on a branch:
 
 
@@ -370,7 +370,7 @@ be updated by `git fetch` (hence `git pull`) and `git push`. See
 <> for details.
 
 You might want to build on one of these remote-tracking branches
-on a branch of your own, just as you would for a tag:
+a branch of your own, just as you would for a tag:
 
 
 $ git checkout -b my-todo-copy origin/todo
@@ -404,7 +404,7 @@ they may also be packed together in a single file; see
 linkgit:git-pack-refs[1]).
 
 As another useful shortcut, the "HEAD" of a repository can be referred
-to just using the name of that repository.  So, for example, "origin"
+to by just using the name of that repository.  So, for example, "origin"
 is usually a shortcut for the HEAD branch in the repository "origin".
 
 For the complete list of paths which Git checks for references, and
-- 
2.1.4



>From 3917eeaf8d21b8b90a773c46ee5b9d12eac901e3 Mon Sep 17 00:00:00 2001
From: kalledaballe 
Date: Sat, 10 Feb 2018 16:08:45 +0100
Subject: [PATCH 3/5] I changed the sequence of 2 sentences.

---
 Documentation/user-manual.txt | 10 +++---
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/Documentation/user-manual.txt b/Documentation/user-manual.txt
index e3efc26..b9dc17a 100644
--- a/Documentation/user-manual.txt
+++ b/Documentation/user-manual.txt
@@ -609,13 +609,9 @@ $ git show HEAD^2   # show the second parent of HEAD
 In addition to HEAD, there are several other special names for
 commits:
 
-Merges (to be discussed later), as well as operations such as
-`git reset`, which change the currently checked-out commit, generally
-set ORIG_HEAD to the value HEAD had before the current operation.
+ORIG_HEAD is set to the value HEAD before merging (to be discussed later) operations as well as operations such as `git reset'. which change the currently checked-out commit.
 
-The `git fetch` operation always stores the head of the last fetched
-branch in FETCH_HEAD.  For example, if you run `git fetch` without
-specifying a local branch as the target of the operation
+FETCH_HEAD after a `git fetch' operation stores the head of the last fetched branch. For example, if you run `git fetch' without specifying a local branch as the target of the operation.
 
 -
 $ git fetch git://example.com/proj.git theirbranch
@@ -664,7 +660,7 @@ can also make more specific requests:
 
 -
 $ git log v2.5..	# commits since (not reachable from) v2.5
-$ git log test..master	# commits reachable from master but not test
+$ git log ..master	# commits reachable from master but not 

Investment Proposal

2018-03-18 Thread FARDEL Anastasia
Hello

Greetings to you and everyone around you please did you get my previous email 
regarding my proposal ?
please let me know if we can work together on this.

Best Reagrds
Melisa Mehmet


Re: [PATCH v2 2/2] git-svn: allow empty email-address in authors-prog and authors-file

2018-03-18 Thread Eric Wong
Andreas Heiduk  wrote:
> The email address in --authors-file and --authors-prog can be empty but
> git-svn translated it into a syntethic email address in the form
> $USERNAME@$REPO_UUID. Now git-svn behaves like git-commit: If the email
> is explicitly set to the empty string, the commit does not contain
> an email address.

What is missing is WHY "<>" is preferable to "<$USERNAME@$REPO_UUID>".

$USERNAME is good anyways since projects/organizations tie their
SVN usernames to email usernames via LDAP, making it easy to
infer their email address from $USERNAME.  The latter can also
be used to disambiguate authors if they happen to have the same
real name.

"<>" is completely meaningless.


Re: [PATCH v2 0/3] git worktree prune improvements

2018-03-18 Thread Eric Sunshine
On Thu, Mar 15, 2018 at 12:44 PM, Nguyễn Thái Ngọc Duy
 wrote:
> v2 fixes comments from Eric and rebases on 'master' since 'worktree
> move' has been merged. Commit messages are updated to reflect this.

Thanks. Aside from a couple minor commit message tweaks (not worth a
re-roll), I didn't find anything else on which to comment on this
version.

> There's one thing I didn't do is moving the new paragraph up [1]. I
> still feel this it's not right to put it there since it starts with
> 'gc --auto' introduction, then goes to loose objects and packs.
> Inserting a note between the loose objects paragraph and the packs one
> does not make much sense.

Fair enough.


Re: [PATCH v2 3/3] worktree prune: improve prune logic when worktree is moved

2018-03-18 Thread Eric Sunshine
On Thu, Mar 15, 2018 at 12:44 PM, Nguyễn Thái Ngọc Duy
 wrote:
> Automatic detection of worktree relocation by a user (via 'mv', for
> instance) was removed by 618244e160 (worktree: stop supporting moving
> worktrees manually - 2016-01-22). Prior to that,
> .git/worktrees//gitdir was updated whenever the worktree was
> accessed in order to let the pruning logic know that the worktree was
> "active" even if it disappeared for a while (due to being located on
> removable media, for instance).

This is my fault since I suggested this rewrite[1], but other
documentation calls this , not , so: s/tag/id/

(Again, not worth a re-roll; perhaps Junio can tweak it when queuing.)

> "git worktree move" has come so we don't really need this, but since
> it's easy to do, perhaps we could keep supporting manual worktree move
> a bit longer. Notice that when a worktree is active, the "index" file
> should be updated pretty often in common case. The logic is updated to
> check for index mtime to see if the worktree is alive.
>
> The old logic of checking gitdir's mtime is dropped because nobody
> updates it anyway. The new corner case is, if the index file does not
> exist, we immediately remove the stale worktree. But if the "index"
> file does not exist, you may have a bigger problem.

Makes sense.

> Signed-off-by: Nguyễn Thái Ngọc Duy 

[1]: 
https://public-inbox.org/git/capig+cqg8yequnnujoeapy-w9+ttzwadhgjdl6udnyrg0ye...@mail.gmail.com/


Re: [PATCH v2 2/3] worktree: delete dead code

2018-03-18 Thread Eric Sunshine
On Thu, Mar 15, 2018 at 12:44 PM, Nguyễn Thái Ngọc Duy
 wrote:
> This "link" was a feature in early iterations of multiple worktree
> functionality for some reason it was dropped [1]. Since nobody creates
> this "link", there's no need to check it.
>
> This is mostly used to let the user moves a worktree manually [2]. If

s/moves/move/

(Not worth a re-roll; perhaps Junio can tweak this when queuing.)

> you move a worktree within the same file system, this hard link count
> lets us know the worktree is still there even if we don't know where it
> is.
>
> We support 'worktree move' now and don't need this anymore.
>
> [1] last appearance in v4 message-id:
> 1393675983-3232-25-git-send-email-pclo...@gmail.com
> and the reason in v5 was "revisit later", message-id:
> 1394246900-31535-1-git-send-email-pclo...@gmail.com
> [2] 23af91d102 (prune: strategies for linked checkouts - 2014-11-30)
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 


Re: Draft of Git Rev News edition 37

2018-03-18 Thread Jacob Keller
On Sun, Mar 18, 2018 at 11:21 AM, Christian Couder
 wrote:
> Hi,
>
> A draft of a new Git Rev News edition is available here:
>
>   
> https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-37.md
>
> Everyone is welcome to contribute in any section either by editing the
> above page on GitHub and sending a pull request, or by commenting on
> this GitHub issue:
>
>   https://github.com/git/git.github.io/issues/279
>
> You can also reply to this email.
>
> In general all kinds of contribution, for example proofreading,
> suggestions for articles or links, help on the issues in GitHub, and
> so on, are very much appreciated.
>
> I tried to cc everyone who appears in this edition, but maybe I missed
> some people, sorry about that.
>
> Jakub, Markus, Gabriel and me plan to publish this edition on
> Wednesday March 21st.
>
> Thanks,
> Christian.

I don't have a good summary yet, but I think a section about the
discussion regarding the new recreate-merges and rebasing merges
that's been on going might be useful? a lot of that discussion
occurred prior to git-merge (tho it's been ongoing since then?).

Thanks,
Jake


[GSoC] Regarding "Convert scripts to builtins"

2018-03-18 Thread Paul Sebastian Ungureanu
Hello,

I am interested in the "Convert scripts to builtins" project. I have
recently started to analyze it better and see exactly what it entails
and a few questions came to my mind.

First of all, I find it difficult to pick which scripts would benefit
the most by being rewritten. I am thinking of git bisect, git stash
and git rebase since these are maybe some of the most popular commands
of Git. However, on the other side, scripts like
git-rebase--interactive.sh and git-bisect.sh are also subject of other
GSoC projects. Should I steer away from these projects or can I
consider them?

Secondly, what is too little or too much? On one hand, I do want to do
my best and help the Git community as much as I can. On the other
hand, I do not want to have too much on my plate and not be able to
finish my project. Considering that mentors have already decided that
git rebase --interactive and git bisect are enough for two projects,
how could I quantify the time required for each command? Looking back
at the previous editions of GSoC I noticed that most projects were
focused on only one command.

>From my research, these are the scripts that could be subject of this
project. Which ones do you think could be the best choice for a
project of this kind?

 * git/git-add--interactive.perl
 * git/git-archimport.perl
 * git/git-bisect.sh -- there is a project about this
 * git/git-cvsexportcommit.perl
 * git/git-cvsimport.perl
 * git/git-cvsserver.perl
 * git/git-difftool--helper.sh
 * git/git-filter-branch.sh
 * git/git-instaweb.sh
 * git/git-merge-octopus.sh
 * git/git-merge-one-file.sh
 * git/git-merge-resolve.sh
 * git/git-mergetool--lib.sh
 * git/git-mergetool.sh
 * git/git-quiltimport.sh
 * git/git-rebase--am.sh
 * git/git-rebase--interactive.sh -- there is a project about this
 * git/git-rebase--merge.sh
 * git/git-rebase.sh
 * git/git-remote-testgit.sh
 * git/git-request-pull.sh
 * git/git-send-email.perl
 * git/git-stash.sh
 * git/git-submodule.sh -- there was a project about this
 * git/git-svn.perl
 * git/git-web--browse.sh

I look forward to hearing from you. I will also submit a draft of my
proposal soon enough.

Best regards,
Paul Ungureanu


Re: [PATCH v2 0/2] git-svn: --author-prog improvements

2018-03-18 Thread Eric Sunshine
On Sun, Mar 18, 2018 at 5:19 PM, Andreas Heiduk  wrote:
> No comments on this one?

I can't speak for Eric W., but my impression was that v2 did not
address his concern that patch 2/2's unconditional change of behavior
was unacceptable[1]. He didn't say so explicitly, but perhaps the
implication was that he might be more open to the behavior change
being an opt-in.

[1]: https://public-inbox.org/git/20180305202017.GA26533@whir/


Re: [PATCH v2 0/2] git-svn: --author-prog improvements

2018-03-18 Thread Andreas Heiduk
No comments on this one?


HELLO

2018-03-18 Thread Mrs Sarah Boardman



--
Hello Dear
 Am a dying woman here in the hospital, i was diagnose as a cancer
patient 2 years ago. I am from (Zarqa) Jordan, a business woman


dealing with Gold Exportation. I have a charitable and unfufilment
project that am about to handover to you, if you are interested please
reply through my private email address for more details,


(sarahboard...@mail.com)


hope to hear from you.
Regard
Mrs sarah boardman
--



Re: feature-request: git "cp" like there is git mv.

2018-03-18 Thread Stefan Moch
* Junio C Hamano  [2018-02-07T11:49:39-0800]:
> Stefan Moch  writes:
> 
> > * Jonathan Nieder  [2017-12-15T17:31:30-0800]:  
> >> This sounds like a reasonable thing to add.  See builtin/mv.c for
> >> how "git mv" works if you're looking for inspiration.
> >> 
> >> cmd_mv in that file looks rather long, so I'd also be happy if
> >> someone interested refactors to break it into multiple
> >> self-contained pieces for easier reading (git mostly follows
> >> https://www.kernel.org/doc/html/latest/process/coding-style.html#functions).
> >>   
> >
> > I looked at builtin/mv.c and have a rough idea how to split it
> > up to support both mv and cp commands.
> >
> > But first I noticed and removed a redundant check in cmd_mv,
> > also added a test case to check if mv --dry-run does not move
> > the file.  
> 
> I guess these two patches went unnoticed when posted at the end of
> last year.  Reading them again, I think they are good changes.

Thanks.

Are such redundant checks in general a pattern worth searching
for and cleaning up globally? Or is this rather in the category
of cleaning up only when noticed?


> As a no-op clean-up of a127331c ("mv: allow moving nested
> submodules", 2016-04-19), the attached would also make sense, I
> would think.
> 
> Thanks.
> 
>  builtin/mv.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/mv.c b/builtin/mv.c
> index 9662804d23..9cb07990fd 100644
> --- a/builtin/mv.c
> +++ b/builtin/mv.c
> @@ -266,10 +266,11 @@ int cmd_mv(int argc, const char **argv, const
> char *prefix) const char *src = source[i], *dst = destination[i];
>   enum update_mode mode = modes[i];
>   int pos;
> - if (show_only || verbose)
> - printf(_("Renaming %s to %s\n"), src, dst);
> - if (show_only)
> + if (show_only) {
> + if (verbose)
> + printf(_("Renaming %s to %s\n"),
> src, dst); continue;
> + }
>   if (mode != INDEX && rename(src, dst) < 0) {
>   if (ignore_errors)
>   continue;
> 

As Stefan Beller already noted, this changes the printing
behavior:


See also the output of

git mv -n
git mv -n -v
git mv -v


without your patch:

$ git mv -n 1 2
Checking rename of '1' to '2'
Renaming 1 to 2
$ git mv -n -v 1 2
Checking rename of '1' to '2'
Renaming 1 to 2
$ git mv -v 1 2
Renaming 1 to 2


and with your patch:

$ git mv -n 1 2
Checking rename of '1' to '2'
$ git mv -n -v 1 2
Checking rename of '1' to '2'
Renaming 1 to 2
$ git mv -v 1 2


Having different outputs of “git mv -n” and “git mv -n -v” seems
odd, but not necessarily wrong. However, “git mv -v” with no
output at all, does not what the documentation says:

   -v, --verbose
   Report the names of files as they are moved.




Re: [ANNOUNCE] git-sizer: compute various size-related metrics for your Git repository

2018-03-18 Thread Michael Haggerty
On Fri, Mar 16, 2018 at 10:29 PM, Jeff King  wrote:
> On Fri, Mar 16, 2018 at 09:01:42PM +0100, Ævar Arnfjörð Bjarmason wrote:
>> One thing that can make repositories very pathological is if the ratio
>> of trees to commits is too low.
>>
>> I was dealing with a repo the other day that had several thousand files
>> all in the same root directory, and no subdirectories.
>
> We've definitely run into this problem before (CocoaPods/Specs, for
> example). The metric that would hopefully show this off is "what is the
> tree object with the most entries". Or possibly "what is the average
> number of entries in a tree object".

I find that the best metric for determining this sort of problem is
"Overall repository size -> Trees -> Total tree entries". If you have
a big directory that is being changed frequently, the *real* problem
is that every commit has to rewrite the whole tree, with all of its
many entries. So "Total tree entries" (or equivalently, the total tree
size) skyrockets. And this means that a history traversal has to
*expand* all of those trees again. So a repository that is problematic
for this reason will have a very large number of tree entries.

If you want to detect a bad repository layout like this *before* it
becomes a problem, then probably something like "average tree entries
per commit" might be a good leading indicator of a problem.

Michael


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-18 Thread Ramsay Jones


On 18/03/18 15:55, Duy Nguyen wrote:
> On Sun, Mar 18, 2018 at 9:18 AM, Nguyễn Thái Ngọc Duy  
> wrote:
>> +ifneq ($(or $(filter gcc6,$(COMPILER_FEATURES)),$(filter 
>> clang4,$(COMPILER_FEATURES))),)
>> +CFLAGS += -Wextra
> 
> Another thing we can add here is -Og instead of standard -O2 (or -O0
> in my build), which is supported since gcc 4.8. clang seems to support
> it too (mapped to -O1 at least for clang5) but I don't know what
> version added that flag.

I've been doing a lot of compiling recently, using 10 'different'
versions of clang and gcc ('different' if you count 64-bit and 32-bit
compilers with the same version number as different!)

However, I can tell you that clang version 3.4 and 3.8.0 don't
support -Og, but clang version 5.0.1 does.

ATB,
Ramsay Jones




Draft of Git Rev News edition 37

2018-03-18 Thread Christian Couder
Hi,

A draft of a new Git Rev News edition is available here:

  https://github.com/git/git.github.io/blob/master/rev_news/drafts/edition-37.md

Everyone is welcome to contribute in any section either by editing the
above page on GitHub and sending a pull request, or by commenting on
this GitHub issue:

  https://github.com/git/git.github.io/issues/279

You can also reply to this email.

In general all kinds of contribution, for example proofreading,
suggestions for articles or links, help on the issues in GitHub, and
so on, are very much appreciated.

I tried to cc everyone who appears in this edition, but maybe I missed
some people, sorry about that.

Jakub, Markus, Gabriel and me plan to publish this edition on
Wednesday March 21st.

Thanks,
Christian.


Websites & Web Based Applications

2018-03-18 Thread Huma
Dear Sir/Ma’am,

We are an Online Marketing Group, comprising young and dynamic professionals. 
We have executed a number of projects in the below mentioned categories:

Website Design & Development
PHP & MySQL website application development
E-Commerce Applications/ Websites
We Specialize in Laravel Based Developments

We are extremely competitive and have worked for a lot of companies outside 
India, assisting them in developing their company’s websites and customized 
business applications.

Please do not hesitate to get in touch with us, or any queries you may have.

Huma
Online Marketing Manager
Whatsapp: +91 6393002436


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-18 Thread Duy Nguyen
On Sun, Mar 18, 2018 at 9:18 AM, Nguyễn Thái Ngọc Duy  wrote:
> +ifneq ($(or $(filter gcc6,$(COMPILER_FEATURES)),$(filter 
> clang4,$(COMPILER_FEATURES))),)
> +CFLAGS += -Wextra

Another thing we can add here is -Og instead of standard -O2 (or -O0
in my build), which is supported since gcc 4.8. clang seems to support
it too (mapped to -O1 at least for clang5) but I don't know what
version added that flag.
-- 
Duy


OK

2018-03-18 Thread Ahmed Zama
Dear Sir,



Please can both of us handle a lucrative deal.?? I will give you the
full detail explanation as soon as I hear from you.



Faithfully yours,
Mr Ahmed Zama


Re: [PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-18 Thread Ævar Arnfjörð Bjarmason

On Sun, Mar 18 2018, Nguyễn Thái Ngọc Duy jotted:

> v6 fixes the one optimization that I just couldn't get right, fixes
> two off-by-one error messages and a couple commit message update
> (biggest change is in 11/11 to record some numbers from AEvar)

Thanks, aside from the minor typo I just noted in
https://public-inbox.org/git/878tapcucc@evledraar.gmail.com/ (which
I trust Junio can fix up) this all looks good to me.


Re: [PATCH v6 09/11] pack-objects: shrink size field in struct object_entry

2018-03-18 Thread Ævar Arnfjörð Bjarmason

On Sun, Mar 18 2018, Nguyễn Thái Ngọc Duy jotted:

> It's very very rare that an uncompressedd object is larger than 4GB

So this went from a typo of "uncompressd" in v5 to "uncompressedd",
needs one less "d": "uncompressed".


[PATCH v6 06/11] pack-objects: move in_pack out of struct object_entry

2018-03-18 Thread Nguyễn Thái Ngọc Duy
Instead of using 8 bytes (on 64 bit arch) to store a pointer to a
pack. Use an index instead since the number of packs should be
relatively small.

This limits the number of packs we can handle to 16k. For now if you hit
16k pack files limit, pack-objects will simply fail [1].

[1] The escape hatch is .keep file to limit the non-kept pack files
below 16k limit. Then you can go for another pack-objects run to
combine another 16k pack files. Repeat until you're satisfied.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/git-pack-objects.txt |  9 ++
 builtin/pack-objects.c | 40 +--
 cache.h|  1 +
 pack-objects.h | 44 +-
 4 files changed, 79 insertions(+), 15 deletions(-)

diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index 3503c9e3e6..b8d936ccf5 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -269,6 +269,15 @@ Unexpected missing object will raise an error.
locally created objects [without .promisor] and objects from the
promisor remote [with .promisor].)  This is used with partial clone.
 
+LIMITATIONS
+---
+
+This command could only handle 16384 existing pack files at a time.
+If you have more than this, you need to exclude some pack files with
+".keep" file and --honor-pack-keep option, to combine 16k pack files
+in one, then remove these .keep files and run pack-objects one more
+time.
+
 SEE ALSO
 
 linkgit:git-rev-list[1]
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index e1244918a5..9792d31e46 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -29,6 +29,8 @@
 #include "list.h"
 #include "packfile.h"
 
+#define IN_PACK(obj) oe_in_pack(_pack, obj)
+
 static const char *pack_usage[] = {
N_("git pack-objects --stdout [...] [<  | < 
]"),
N_("git pack-objects [...]  [<  | < 
]"),
@@ -367,7 +369,7 @@ static unsigned long write_no_reuse_object(struct hashfile 
*f, struct object_ent
 static off_t write_reuse_object(struct hashfile *f, struct object_entry *entry,
unsigned long limit, int usable_delta)
 {
-   struct packed_git *p = entry->in_pack;
+   struct packed_git *p = IN_PACK(entry);
struct pack_window *w_curs = NULL;
struct revindex_entry *revidx;
off_t offset;
@@ -478,7 +480,7 @@ static off_t write_object(struct hashfile *f,
 
if (!reuse_object)
to_reuse = 0;   /* explicit */
-   else if (!entry->in_pack)
+   else if (!IN_PACK(entry))
to_reuse = 0;   /* can't reuse what we don't have */
else if (oe_type(entry) == OBJ_REF_DELTA ||
 oe_type(entry) == OBJ_OFS_DELTA)
@@ -1025,7 +1027,7 @@ static int want_object_in_pack(const struct object_id 
*oid,
if (*found_pack) {
want = want_found_object(exclude, *found_pack);
if (want != -1)
-   return want;
+   goto done;
}
 
list_for_each(pos, _git_mru) {
@@ -1048,11 +1050,16 @@ static int want_object_in_pack(const struct object_id 
*oid,
if (!exclude && want > 0)
list_move(>mru, _git_mru);
if (want != -1)
-   return want;
+   goto done;
}
}
 
-   return 1;
+   want = 1;
+done:
+   if (want && *found_pack && !(*found_pack)->index)
+   oe_add_pack(_pack, *found_pack);
+
+   return want;
 }
 
 static void create_object_entry(const struct object_id *oid,
@@ -1074,7 +1081,7 @@ static void create_object_entry(const struct object_id 
*oid,
else
nr_result++;
if (found_pack) {
-   entry->in_pack = found_pack;
+   oe_set_in_pack(entry, found_pack);
entry->in_pack_offset = found_offset;
}
 
@@ -1399,8 +1406,8 @@ static void cleanup_preferred_base(void)
 
 static void check_object(struct object_entry *entry)
 {
-   if (entry->in_pack) {
-   struct packed_git *p = entry->in_pack;
+   if (IN_PACK(entry)) {
+   struct packed_git *p = IN_PACK(entry);
struct pack_window *w_curs = NULL;
const unsigned char *base_ref = NULL;
struct object_entry *base_entry;
@@ -1535,14 +1542,16 @@ static int pack_offset_sort(const void *_a, const void 
*_b)
 {
const struct object_entry *a = *(struct object_entry **)_a;
const struct object_entry *b = *(struct object_entry **)_b;
+   const struct packed_git *a_in_pack = IN_PACK(a);
+   const struct packed_git *b_in_pack = IN_PACK(b);
 
/* avoid filesystem trashing with loose objects */
-   if (!a->in_pack && !b->in_pack)
+   if 

[PATCH v6 08/11] pack-objects: shrink z_delta_size field in struct object_entry

2018-03-18 Thread Nguyễn Thái Ngọc Duy
We only cache deltas when it's smaller than a certain limit. This limit
defaults to 1000 but save its compressed length in a 64-bit field.
Shrink that field down to 16 bits, so you can only cache 65kb deltas.
Larger deltas must be recomputed at when the pack is written down.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/config.txt |  3 ++-
 builtin/pack-objects.c   | 22 --
 pack-objects.h   |  3 ++-
 3 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 9bd3f5a789..00fa824448 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2449,7 +2449,8 @@ pack.deltaCacheLimit::
The maximum size of a delta, that is cached in
linkgit:git-pack-objects[1]. This cache is used to speed up the
writing object phase by not having to recompute the final delta
-   result once the best match for all objects is found. Defaults to 1000.
+   result once the best match for all objects is found.
+   Defaults to 1000. Maximum value is 65535.
 
 pack.threads::
Specifies the number of threads to spawn when searching for best
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index b39234f7fb..372afe48c4 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2105,12 +2105,19 @@ static void find_deltas(struct object_entry **list, 
unsigned *list_size,
 * between writes at that moment.
 */
if (entry->delta_data && !pack_to_stdout) {
-   entry->z_delta_size = do_compress(>delta_data,
- entry->delta_size);
-   cache_lock();
-   delta_cache_size -= entry->delta_size;
-   delta_cache_size += entry->z_delta_size;
-   cache_unlock();
+   unsigned long size;
+
+   size = do_compress(>delta_data, 
entry->delta_size);
+   if (size < (1 << OE_Z_DELTA_BITS)) {
+   entry->z_delta_size = size;
+   cache_lock();
+   delta_cache_size -= entry->delta_size;
+   delta_cache_size += entry->z_delta_size;
+   cache_unlock();
+   } else {
+   FREE_AND_NULL(entry->delta_data);
+   entry->z_delta_size = 0;
+   }
}
 
/* if we made n a delta, and if n is already at max
@@ -3089,6 +3096,9 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
if (depth >= (1 << OE_DEPTH_BITS))
die(_("delta chain depth %d is greater than maximum limit %d"),
depth, (1 << OE_DEPTH_BITS) - 1);
+   if (cache_max_small_delta_size >= (1 << OE_Z_DELTA_BITS))
+   die(_("pack.deltaCacheLimit is greater than maximum limit %d"),
+   (1 << OE_Z_DELTA_BITS) - 1);
 
argv_array_push(, "pack-objects");
if (thin) {
diff --git a/pack-objects.h b/pack-objects.h
index 594a213554..c12219385a 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -4,6 +4,7 @@
 #define OE_DFS_STATE_BITS  2
 #define OE_DEPTH_BITS  12
 #define OE_IN_PACK_BITS14
+#define OE_Z_DELTA_BITS16
 
 /*
  * State flags for depth-first search used for analyzing delta cycles.
@@ -79,7 +80,7 @@ struct object_entry {
 */
void *delta_data;   /* cached delta (uncompressed) */
unsigned long delta_size;   /* delta data size (uncompressed) */
-   unsigned long z_delta_size; /* delta data size (compressed) */
+   unsigned z_delta_size:OE_Z_DELTA_BITS;
unsigned type_:TYPE_BITS;
unsigned in_pack_type:TYPE_BITS; /* could be delta */
unsigned type_valid:1;
-- 
2.17.0.rc0.347.gf9cf61673a



[PATCH v6 04/11] pack-objects: use bitfield for object_entry::depth

2018-03-18 Thread Nguyễn Thái Ngọc Duy
Because of struct packing from now on we can only handle max depth
4095 (or even lower when new booleans are added in this struct). This
should be ok since long delta chain will cause significant slow down
anyway.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 Documentation/config.txt   | 1 +
 Documentation/git-pack-objects.txt | 4 +++-
 Documentation/git-repack.txt   | 4 +++-
 builtin/pack-objects.c | 4 
 pack-objects.h | 5 ++---
 5 files changed, 13 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f57e9cf10c..9bd3f5a789 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -2412,6 +2412,7 @@ pack.window::
 pack.depth::
The maximum delta depth used by linkgit:git-pack-objects[1] when no
maximum depth is given on the command line. Defaults to 50.
+   Maximum value is 4095.
 
 pack.windowMemory::
The maximum size of memory that is consumed by each thread
diff --git a/Documentation/git-pack-objects.txt 
b/Documentation/git-pack-objects.txt
index 81bc490ac5..3503c9e3e6 100644
--- a/Documentation/git-pack-objects.txt
+++ b/Documentation/git-pack-objects.txt
@@ -96,7 +96,9 @@ base-name::
it too deep affects the performance on the unpacker
side, because delta data needs to be applied that many
times to get to the necessary object.
-   The default value for --window is 10 and --depth is 50.
++
+The default value for --window is 10 and --depth is 50. The maximum
+depth is 4095.
 
 --window-memory=::
This option provides an additional limit on top of `--window`;
diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index ae750e9e11..25c83c4927 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -90,7 +90,9 @@ other objects in that pack they already have locally.
space. `--depth` limits the maximum delta depth; making it too deep
affects the performance on the unpacker side, because delta data needs
to be applied that many times to get to the necessary object.
-   The default value for --window is 10 and --depth is 50.
++
+The default value for --window is 10 and --depth is 50. The maximum
+depth is 4095.
 
 --threads=::
This option is passed through to `git pack-objects`.
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 83f8154865..205e1f646c 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3068,6 +3068,10 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
if (pack_to_stdout != !base_name || argc)
usage_with_options(pack_usage, pack_objects_options);
 
+   if (depth >= (1 << OE_DEPTH_BITS))
+   die(_("delta chain depth %d is greater than maximum limit %d"),
+   depth, (1 << OE_DEPTH_BITS) - 1);
+
argv_array_push(, "pack-objects");
if (thin) {
use_internal_rev_list = 1;
diff --git a/pack-objects.h b/pack-objects.h
index 8507e1b869..59407aae3c 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -2,6 +2,7 @@
 #define PACK_OBJECTS_H
 
 #define OE_DFS_STATE_BITS  2
+#define OE_DEPTH_BITS  12
 
 /*
  * State flags for depth-first search used for analyzing delta cycles.
@@ -89,9 +90,7 @@ struct object_entry {
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
unsigned dfs_state:OE_DFS_STATE_BITS;
-
-   int depth;
-
+   unsigned depth:OE_DEPTH_BITS;
 };
 
 struct packing_data {
-- 
2.17.0.rc0.347.gf9cf61673a



[PATCH v6 09/11] pack-objects: shrink size field in struct object_entry

2018-03-18 Thread Nguyễn Thái Ngọc Duy
It's very very rare that an uncompressedd object is larger than 4GB
(partly because Git does not handle those large files very well to
begin with). Let's optimize it for the common case where object size
is smaller than this limit.

Shrink size field down to 32 bits [1] and one overflow bit. If the size
is too large, we read it back from disk.

Add two compare helpers that can take advantage of the overflow
bit (e.g. if the file is 4GB+, chances are it's already larger than
core.bigFileThreshold and there's no point in comparing the actual
value).

A small note about the conditional oe_set_size() in
check_object(). Technically if we don't get a valid type, it's not
wrong if we set uninitialized value "size" (we don't pre-initialize
this and sha1_object_info will not assign anything when it fails to
get the info).

This how changes the writing code path slightly which emits different
error messages (either way we die). One of our tests in t5530 depends
on this specific error message. Let's just keep the test as-is and
play safe by not assigning random value. That might trigger valgrind
anyway.

[1] it's actually already 32 bits on Windows

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 49 ++-
 pack-objects.h | 58 +-
 2 files changed, 88 insertions(+), 19 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 372afe48c4..89ed4b5125 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -274,7 +274,7 @@ static unsigned long write_no_reuse_object(struct hashfile 
*f, struct object_ent
 
if (!usable_delta) {
if (oe_type(entry) == OBJ_BLOB &&
-   entry->size > big_file_threshold &&
+   oe_size_greater_than(entry, big_file_threshold) &&
(st = open_istream(entry->idx.oid.hash, , , 
NULL)) != NULL)
buf = NULL;
else {
@@ -384,12 +384,13 @@ static off_t write_reuse_object(struct hashfile *f, 
struct object_entry *entry,
unsigned char header[MAX_PACK_OBJECT_HEADER],
  dheader[MAX_PACK_OBJECT_HEADER];
unsigned hdrlen;
+   unsigned long entry_size = oe_size(entry);
 
if (DELTA(entry))
type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
hdrlen = encode_in_pack_object_header(header, sizeof(header),
- type, entry->size);
+ type, entry_size);
 
offset = entry->in_pack_offset;
revidx = find_pack_revindex(p, offset);
@@ -406,7 +407,7 @@ static off_t write_reuse_object(struct hashfile *f, struct 
object_entry *entry,
datalen -= entry->in_pack_header_size;
 
if (!pack_to_stdout && p->index_version == 1 &&
-   check_pack_inflate(p, _curs, offset, datalen, entry->size)) {
+   check_pack_inflate(p, _curs, offset, datalen, entry_size)) {
error("corrupt packed object for %s",
  oid_to_hex(>idx.oid));
unuse_pack(_curs);
@@ -1412,6 +1413,8 @@ static void cleanup_preferred_base(void)
 
 static void check_object(struct object_entry *entry)
 {
+   unsigned long size;
+
if (IN_PACK(entry)) {
struct packed_git *p = IN_PACK(entry);
struct pack_window *w_curs = NULL;
@@ -1431,13 +1434,14 @@ static void check_object(struct object_entry *entry)
 */
used = unpack_object_header_buffer(buf, avail,
   ,
-  >size);
+  );
if (used == 0)
goto give_up;
 
if (type < 0)
die("BUG: invalid type %d", type);
entry->in_pack_type = type;
+   oe_set_size(entry, size);
 
/*
 * Determine if this is a delta and if so whether we can
@@ -1505,7 +1509,7 @@ static void check_object(struct object_entry *entry)
 */
oe_set_type(entry, entry->in_pack_type);
SET_DELTA(entry, base_entry);
-   entry->delta_size = entry->size;
+   entry->delta_size = oe_size(entry);
entry->delta_sibling_idx = base_entry->delta_child_idx;
SET_DELTA_CHILD(base_entry, entry);
unuse_pack(_curs);
@@ -1513,14 +1517,17 @@ static void check_object(struct object_entry *entry)
}
 
if (oe_type(entry)) {
+   unsigned long size;
+
+   size = get_size_from_delta(p, _curs,
+   

[PATCH v6 05/11] pack-objects: move in_pack_pos out of struct object_entry

2018-03-18 Thread Nguyễn Thái Ngọc Duy
This field is only need for pack-bitmap, which is an optional
feature. Move it to a separate array that is only allocated when
pack-bitmap is used (it's not freed in the same way that objects[] is
not).

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c |  3 ++-
 pack-bitmap-write.c|  8 +---
 pack-bitmap.c  |  2 +-
 pack-bitmap.h  |  4 +++-
 pack-objects.h | 16 +++-
 5 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 205e1f646c..e1244918a5 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -879,7 +879,8 @@ static void write_pack_file(void)
 
if (write_bitmap_index) {
bitmap_writer_set_checksum(oid.hash);
-   bitmap_writer_build_type_index(written_list, 
nr_written);
+   bitmap_writer_build_type_index(
+   _pack, written_list, nr_written);
}
 
finish_tmp_packfile(, pack_tmp_name,
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index fd11f08940..f7c897515b 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -48,7 +48,8 @@ void bitmap_writer_show_progress(int show)
 /**
  * Build the initial type index for the packfile
  */
-void bitmap_writer_build_type_index(struct pack_idx_entry **index,
+void bitmap_writer_build_type_index(struct packing_data *to_pack,
+   struct pack_idx_entry **index,
uint32_t index_nr)
 {
uint32_t i;
@@ -57,12 +58,13 @@ void bitmap_writer_build_type_index(struct pack_idx_entry 
**index,
writer.trees = ewah_new();
writer.blobs = ewah_new();
writer.tags = ewah_new();
+   ALLOC_ARRAY(to_pack->in_pack_pos, to_pack->nr_objects);
 
for (i = 0; i < index_nr; ++i) {
struct object_entry *entry = (struct object_entry *)index[i];
enum object_type real_type;
 
-   entry->in_pack_pos = i;
+   oe_set_in_pack_pos(to_pack, entry, i);
 
switch (oe_type(entry)) {
case OBJ_COMMIT:
@@ -147,7 +149,7 @@ static uint32_t find_object_pos(const unsigned char *sha1)
"(object %s is missing)", sha1_to_hex(sha1));
}
 
-   return entry->in_pack_pos;
+   return oe_in_pack_pos(writer.to_pack, entry);
 }
 
 static void show_object(struct object *object, const char *name, void *data)
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 9270983e5f..865d9ecc4e 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1032,7 +1032,7 @@ int rebuild_existing_bitmaps(struct packing_data *mapping,
oe = packlist_find(mapping, sha1, NULL);
 
if (oe)
-   reposition[i] = oe->in_pack_pos + 1;
+   reposition[i] = oe_in_pack_pos(mapping, oe) + 1;
}
 
rebuild = bitmap_new();
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 3742a00e14..5ded2f139a 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -44,7 +44,9 @@ int rebuild_existing_bitmaps(struct packing_data *mapping, 
khash_sha1 *reused_bi
 
 void bitmap_writer_show_progress(int show);
 void bitmap_writer_set_checksum(unsigned char *sha1);
-void bitmap_writer_build_type_index(struct pack_idx_entry **index, uint32_t 
index_nr);
+void bitmap_writer_build_type_index(struct packing_data *to_pack,
+   struct pack_idx_entry **index,
+   uint32_t index_nr);
 void bitmap_writer_reuse_bitmaps(struct packing_data *to_pack);
 void bitmap_writer_select_commits(struct commit **indexed_commits,
unsigned int indexed_commits_nr, int max_bitmaps);
diff --git a/pack-objects.h b/pack-objects.h
index 59407aae3c..4a11653657 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -79,7 +79,6 @@ struct object_entry {
unsigned in_pack_type:TYPE_BITS; /* could be delta */
unsigned type_valid:1;
uint32_t hash;  /* name hint hash */
-   unsigned int in_pack_pos;
unsigned char in_pack_header_size;
unsigned preferred_base:1; /*
* we do not pack this, but is available
@@ -99,6 +98,8 @@ struct packing_data {
 
int32_t *index;
uint32_t index_size;
+
+   unsigned int *in_pack_pos;
 };
 
 struct object_entry *packlist_alloc(struct packing_data *pdata,
@@ -144,4 +145,17 @@ static inline void oe_set_type(struct object_entry *e,
e->type_ = (unsigned)type;
 }
 
+static inline unsigned int oe_in_pack_pos(const struct packing_data *pack,
+ const struct object_entry *e)
+{
+   return pack->in_pack_pos[e - pack->objects];
+}
+
+static inline void oe_set_in_pack_pos(const struct packing_data *pack,
+ 

[PATCH v6 10/11] pack-objects: shrink delta_size field in struct object_entry

2018-03-18 Thread Nguyễn Thái Ngọc Duy
Allowing a delta size of 64 bits is crazy. Shrink this field down to
31 bits with one overflow bit.

If we find an existing delta larger than 2GB, we do not cache
delta_size at all and will get the value from oe_size(), potentially
from disk if it's larger than 4GB.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 24 ++--
 pack-objects.h | 23 ++-
 2 files changed, 36 insertions(+), 11 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 89ed4b5125..4406af640f 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -30,10 +30,12 @@
 #include "packfile.h"
 
 #define IN_PACK(obj) oe_in_pack(_pack, obj)
+#define DELTA_SIZE(obj) oe_delta_size(_pack, obj)
 #define DELTA(obj) oe_delta(_pack, obj)
 #define DELTA_CHILD(obj) oe_delta_child(_pack, obj)
 #define DELTA_SIBLING(obj) oe_delta_sibling(_pack, obj)
 #define SET_DELTA(obj, val) oe_set_delta(_pack, obj, val)
+#define SET_DELTA_SIZE(obj, val) oe_set_delta_size(_pack, obj, val)
 #define SET_DELTA_CHILD(obj, val) oe_set_delta_child(_pack, obj, val)
 #define SET_DELTA_SIBLING(obj, val) oe_set_delta_sibling(_pack, obj, val)
 
@@ -140,7 +142,7 @@ static void *get_delta(struct object_entry *entry)
oid_to_hex((entry)->idx.oid));
delta_buf = diff_delta(base_buf, base_size,
   buf, size, _size, 0);
-   if (!delta_buf || delta_size != entry->delta_size)
+   if (!delta_buf || delta_size != DELTA_SIZE(entry))
die("delta size changed");
free(buf);
free(base_buf);
@@ -291,14 +293,14 @@ static unsigned long write_no_reuse_object(struct 
hashfile *f, struct object_ent
FREE_AND_NULL(entry->delta_data);
entry->z_delta_size = 0;
} else if (entry->delta_data) {
-   size = entry->delta_size;
+   size = DELTA_SIZE(entry);
buf = entry->delta_data;
entry->delta_data = NULL;
type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
} else {
buf = get_delta(entry);
-   size = entry->delta_size;
+   size = DELTA_SIZE(entry);
type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
}
@@ -1509,7 +1511,7 @@ static void check_object(struct object_entry *entry)
 */
oe_set_type(entry, entry->in_pack_type);
SET_DELTA(entry, base_entry);
-   entry->delta_size = oe_size(entry);
+   SET_DELTA_SIZE(entry, oe_size(entry));
entry->delta_sibling_idx = base_entry->delta_child_idx;
SET_DELTA_CHILD(base_entry, entry);
unuse_pack(_curs);
@@ -1895,7 +1897,7 @@ static int try_delta(struct unpacked *trg, struct 
unpacked *src,
max_size = trg_size/2 - 20;
ref_depth = 1;
} else {
-   max_size = trg_entry->delta_size;
+   max_size = DELTA_SIZE(trg_entry);
ref_depth = trg->depth;
}
max_size = (uint64_t)max_size * (max_depth - src->depth) /
@@ -1966,10 +1968,12 @@ static int try_delta(struct unpacked *trg, struct 
unpacked *src,
delta_buf = create_delta(src->index, trg->data, trg_size, _size, 
max_size);
if (!delta_buf)
return 0;
+   if (delta_size >= (1 << OE_DELTA_SIZE_BITS))
+   return 0;
 
if (DELTA(trg_entry)) {
/* Prefer only shallower same-sized deltas. */
-   if (delta_size == trg_entry->delta_size &&
+   if (delta_size == DELTA_SIZE(trg_entry) &&
src->depth + 1 >= trg->depth) {
free(delta_buf);
return 0;
@@ -1984,7 +1988,7 @@ static int try_delta(struct unpacked *trg, struct 
unpacked *src,
free(trg_entry->delta_data);
cache_lock();
if (trg_entry->delta_data) {
-   delta_cache_size -= trg_entry->delta_size;
+   delta_cache_size -= DELTA_SIZE(trg_entry);
trg_entry->delta_data = NULL;
}
if (delta_cacheable(src_size, trg_size, delta_size)) {
@@ -1997,7 +2001,7 @@ static int try_delta(struct unpacked *trg, struct 
unpacked *src,
}
 
SET_DELTA(trg_entry, src_entry);
-   trg_entry->delta_size = delta_size;
+   SET_DELTA_SIZE(trg_entry, delta_size);
trg->depth = src->depth + 1;
 
return 1;
@@ -2120,11 +2124,11 @@ static void find_deltas(struct object_entry **list, 
unsigned *list_size,
if (entry->delta_data && !pack_to_stdout) {
unsigned long size;
 
-   size = do_compress(>delta_data, 

[PATCH v6 11/11] pack-objects: reorder members to shrink struct object_entry

2018-03-18 Thread Nguyễn Thái Ngọc Duy
Previous patches leave lots of holes and padding in this struct. This
patch reorders the members and shrinks the struct down to 80 bytes
(from 136 bytes, before any field shrinking is done) with 16 bits to
spare (and a couple more in in_pack_header_size when we really run out
of bits).

This is the last in a series of memory reduction patches (see
"pack-objects: a bit of document about struct object_entry" for the
first one).

Overall they've reduced repack memory size on linux-2.6.git from
3.747G to 3.424G, or by around 320M, a decrease of 8.5%. The runtime
of repack has stayed the same throughout this series. Ævar's testing
on a big monorepo he has access to (bigger than linux-2.6.git) has
shown a 7.9% reduction, so the overall expected improvement should be
somewhere around 8%.

See 87po42cwql@evledraar.gmail.com on-list
(https://public-inbox.org/git/87po42cwql@evledraar.gmail.com/) for
more detailed numbers and a test script used to produce the numbers
cited above.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 pack-objects.h | 17 +
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/pack-objects.h b/pack-objects.h
index cbd5cf61ca..af40211105 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -71,35 +71,36 @@ enum dfs_state {
  */
 struct object_entry {
struct pack_idx_entry idx;
-   /* object uncompressed size _if_ size_valid is true */
-   uint32_t size_;
-   unsigned size_valid:1;
-   unsigned in_pack_idx:OE_IN_PACK_BITS;   /* already in pack */
+   void *delta_data;   /* cached delta (uncompressed) */
off_t in_pack_offset;
+   uint32_t hash;  /* name hint hash */
+   uint32_t size_; /* object uncompressed size _if_ size_valid is true */
uint32_t delta_idx; /* delta base object */
uint32_t delta_child_idx; /* deltified objects who bases me */
uint32_t delta_sibling_idx; /* other deltified objects who
 * uses the same base as me
 */
-   void *delta_data;   /* cached delta (uncompressed) */
uint32_t delta_size_:OE_DELTA_SIZE_BITS; /* delta data size 
(uncompressed) */
uint32_t delta_size_valid:1;
+   unsigned in_pack_idx:OE_IN_PACK_BITS;   /* already in pack */
+   unsigned size_valid:1;
unsigned z_delta_size:OE_Z_DELTA_BITS;
+   unsigned type_valid:1;
unsigned type_:TYPE_BITS;
unsigned in_pack_type:TYPE_BITS; /* could be delta */
-   unsigned type_valid:1;
-   uint32_t hash;  /* name hint hash */
-   unsigned char in_pack_header_size;
unsigned preferred_base:1; /*
* we do not pack this, but is available
* to be used as the base object to delta
* objects against.
*/
unsigned no_try_delta:1;
+   unsigned char in_pack_header_size;
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
unsigned dfs_state:OE_DFS_STATE_BITS;
unsigned depth:OE_DEPTH_BITS;
+
+   /* size: 80, bit_padding: 16 bits */
 };
 
 struct packing_data {
-- 
2.17.0.rc0.347.gf9cf61673a



[PATCH v6 07/11] pack-objects: refer to delta objects by index instead of pointer

2018-03-18 Thread Nguyễn Thái Ngọc Duy
These delta pointers always point to elements in the objects[] array
in packing_data struct. We can only hold maximum 4G of those objects
because the array size in nr_objects is uint32_t. We could use
uint32_t indexes to address these elements instead of pointers. On
64-bit architecture (8 bytes per pointer) this would save 4 bytes per
pointer.

Convert these delta pointers to indexes. Since we need to handle NULL
pointers as well, the index is shifted by one [1].

[1] This means we can only index 2^32-2 objects even though nr_objects
could contain 2^32-1 objects. It should not be a problem in
practice because when we grow objects[], nr_alloc would probably
blow up long before nr_objects hits the wall.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 116 ++---
 pack-objects.h |  67 ++--
 2 files changed, 124 insertions(+), 59 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 9792d31e46..b39234f7fb 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -30,6 +30,12 @@
 #include "packfile.h"
 
 #define IN_PACK(obj) oe_in_pack(_pack, obj)
+#define DELTA(obj) oe_delta(_pack, obj)
+#define DELTA_CHILD(obj) oe_delta_child(_pack, obj)
+#define DELTA_SIBLING(obj) oe_delta_sibling(_pack, obj)
+#define SET_DELTA(obj, val) oe_set_delta(_pack, obj, val)
+#define SET_DELTA_CHILD(obj, val) oe_set_delta_child(_pack, obj, val)
+#define SET_DELTA_SIBLING(obj, val) oe_set_delta_sibling(_pack, obj, val)
 
 static const char *pack_usage[] = {
N_("git pack-objects --stdout [...] [<  | < 
]"),
@@ -127,11 +133,11 @@ static void *get_delta(struct object_entry *entry)
buf = read_sha1_file(entry->idx.oid.hash, , );
if (!buf)
die("unable to read %s", oid_to_hex(>idx.oid));
-   base_buf = read_sha1_file(entry->delta->idx.oid.hash, ,
+   base_buf = read_sha1_file(DELTA(entry)->idx.oid.hash, ,
  _size);
if (!base_buf)
die("unable to read %s",
-   oid_to_hex(>delta->idx.oid));
+   oid_to_hex((entry)->idx.oid));
delta_buf = diff_delta(base_buf, base_size,
   buf, size, _size, 0);
if (!delta_buf || delta_size != entry->delta_size)
@@ -288,12 +294,12 @@ static unsigned long write_no_reuse_object(struct 
hashfile *f, struct object_ent
size = entry->delta_size;
buf = entry->delta_data;
entry->delta_data = NULL;
-   type = (allow_ofs_delta && entry->delta->idx.offset) ?
+   type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
} else {
buf = get_delta(entry);
size = entry->delta_size;
-   type = (allow_ofs_delta && entry->delta->idx.offset) ?
+   type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
}
 
@@ -317,7 +323,7 @@ static unsigned long write_no_reuse_object(struct hashfile 
*f, struct object_ent
 * encoding of the relative offset for the delta
 * base from this object's position in the pack.
 */
-   off_t ofs = entry->idx.offset - entry->delta->idx.offset;
+   off_t ofs = entry->idx.offset - DELTA(entry)->idx.offset;
unsigned pos = sizeof(dheader) - 1;
dheader[pos] = ofs & 127;
while (ofs >>= 7)
@@ -343,7 +349,7 @@ static unsigned long write_no_reuse_object(struct hashfile 
*f, struct object_ent
return 0;
}
hashwrite(f, header, hdrlen);
-   hashwrite(f, entry->delta->idx.oid.hash, 20);
+   hashwrite(f, DELTA(entry)->idx.oid.hash, 20);
hdrlen += 20;
} else {
if (limit && hdrlen + datalen + 20 >= limit) {
@@ -379,8 +385,8 @@ static off_t write_reuse_object(struct hashfile *f, struct 
object_entry *entry,
  dheader[MAX_PACK_OBJECT_HEADER];
unsigned hdrlen;
 
-   if (entry->delta)
-   type = (allow_ofs_delta && entry->delta->idx.offset) ?
+   if (DELTA(entry))
+   type = (allow_ofs_delta && DELTA(entry)->idx.offset) ?
OBJ_OFS_DELTA : OBJ_REF_DELTA;
hdrlen = encode_in_pack_object_header(header, sizeof(header),
  type, entry->size);
@@ -408,7 +414,7 @@ static off_t write_reuse_object(struct hashfile *f, struct 
object_entry *entry,
}
 
if (type == OBJ_OFS_DELTA) {
-   off_t ofs = entry->idx.offset - entry->delta->idx.offset;
+   off_t ofs = entry->idx.offset - DELTA(entry)->idx.offset;
unsigned pos = sizeof(dheader) - 1;
 

[PATCH v6 03/11] pack-objects: use bitfield for object_entry::dfs_state

2018-03-18 Thread Nguyễn Thái Ngọc Duy
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c |  3 +++
 pack-objects.h | 28 +---
 2 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 647c01ea34..83f8154865 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3049,6 +3049,9 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
OPT_END(),
};
 
+   if (DFS_NUM_STATES > (1 << OE_DFS_STATE_BITS))
+   die("BUG: too many dfs states, increase OE_DFS_STATE_BITS");
+
check_replace_refs = 0;
 
reset_pack_idx_option(_idx_opts);
diff --git a/pack-objects.h b/pack-objects.h
index b883d7aa10..8507e1b869 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -1,6 +1,21 @@
 #ifndef PACK_OBJECTS_H
 #define PACK_OBJECTS_H
 
+#define OE_DFS_STATE_BITS  2
+
+/*
+ * State flags for depth-first search used for analyzing delta cycles.
+ *
+ * The depth is measured in delta-links to the base (so if A is a delta
+ * against B, then A has a depth of 1, and B a depth of 0).
+ */
+enum dfs_state {
+   DFS_NONE = 0,
+   DFS_ACTIVE,
+   DFS_DONE,
+   DFS_NUM_STATES
+};
+
 /*
  * basic object info
  * -
@@ -73,19 +88,10 @@ struct object_entry {
unsigned no_try_delta:1;
unsigned tagged:1; /* near the very tip of refs */
unsigned filled:1; /* assigned write-order */
+   unsigned dfs_state:OE_DFS_STATE_BITS;
 
-   /*
-* State flags for depth-first search used for analyzing delta cycles.
-*
-* The depth is measured in delta-links to the base (so if A is a delta
-* against B, then A has a depth of 1, and B a depth of 0).
-*/
-   enum {
-   DFS_NONE = 0,
-   DFS_ACTIVE,
-   DFS_DONE
-   } dfs_state;
int depth;
+
 };
 
 struct packing_data {
-- 
2.17.0.rc0.347.gf9cf61673a



[PATCH v6 02/11] pack-objects: turn type and in_pack_type to bitfields

2018-03-18 Thread Nguyễn Thái Ngọc Duy
An extra field type_valid is added to carry the equivalent of OBJ_BAD
in the original "type" field. in_pack_type always contains a valid
type so we only need 3 bits for it.

A note about accepting OBJ_NONE as "valid" type. The function
read_object_list_from_stdin() can pass this value [1] and it
eventually calls create_object_entry() where current code skip setting
"type" field if the incoming type is zero. This does not have any bad
side effects because "type" field should be memset()'d anyway.

But since we also need to set type_valid now, skipping oe_set_type()
leaves type_valid zero/false, which will make oe_type() return
OBJ_BAD, not OBJ_NONE anymore. Apparently we do care about OBJ_NONE in
prepare_pack(). This switch from OBJ_NONE to OBJ_BAD may trigger

fatal: unable to get type of object ...

Accepting OBJ_NONE [2] does sound wrong, but this is how it is has
been for a very long time and I haven't time to dig in further.

[1] See 5c49c11686 (pack-objects: better check_object() performances -
2007-04-16)

[2] 21666f1aae (convert object type handling from a string to a number
- 2007-02-26)

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 builtin/pack-objects.c | 60 --
 cache.h|  2 ++
 object.h   |  1 -
 pack-bitmap-write.c|  6 ++---
 pack-objects.h | 20 --
 5 files changed, 58 insertions(+), 31 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 5c674b2843..647c01ea34 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -265,7 +265,7 @@ static unsigned long write_no_reuse_object(struct hashfile 
*f, struct object_ent
struct git_istream *st = NULL;
 
if (!usable_delta) {
-   if (entry->type == OBJ_BLOB &&
+   if (oe_type(entry) == OBJ_BLOB &&
entry->size > big_file_threshold &&
(st = open_istream(entry->idx.oid.hash, , , 
NULL)) != NULL)
buf = NULL;
@@ -371,7 +371,7 @@ static off_t write_reuse_object(struct hashfile *f, struct 
object_entry *entry,
struct pack_window *w_curs = NULL;
struct revindex_entry *revidx;
off_t offset;
-   enum object_type type = entry->type;
+   enum object_type type = oe_type(entry);
off_t datalen;
unsigned char header[MAX_PACK_OBJECT_HEADER],
  dheader[MAX_PACK_OBJECT_HEADER];
@@ -480,11 +480,12 @@ static off_t write_object(struct hashfile *f,
to_reuse = 0;   /* explicit */
else if (!entry->in_pack)
to_reuse = 0;   /* can't reuse what we don't have */
-   else if (entry->type == OBJ_REF_DELTA || entry->type == OBJ_OFS_DELTA)
+   else if (oe_type(entry) == OBJ_REF_DELTA ||
+oe_type(entry) == OBJ_OFS_DELTA)
/* check_object() decided it for us ... */
to_reuse = usable_delta;
/* ... but pack split may override that */
-   else if (entry->type != entry->in_pack_type)
+   else if (oe_type(entry) != entry->in_pack_type)
to_reuse = 0;   /* pack has delta which is unusable */
else if (entry->delta)
to_reuse = 0;   /* we want to pack afresh */
@@ -705,8 +706,8 @@ static struct object_entry **compute_write_order(void)
 * And then all remaining commits and tags.
 */
for (i = last_untagged; i < to_pack.nr_objects; i++) {
-   if (objects[i].type != OBJ_COMMIT &&
-   objects[i].type != OBJ_TAG)
+   if (oe_type([i]) != OBJ_COMMIT &&
+   oe_type([i]) != OBJ_TAG)
continue;
add_to_write_order(wo, _end, [i]);
}
@@ -715,7 +716,7 @@ static struct object_entry **compute_write_order(void)
 * And then all the trees.
 */
for (i = last_untagged; i < to_pack.nr_objects; i++) {
-   if (objects[i].type != OBJ_TREE)
+   if (oe_type([i]) != OBJ_TREE)
continue;
add_to_write_order(wo, _end, [i]);
}
@@ -1066,8 +1067,7 @@ static void create_object_entry(const struct object_id 
*oid,
 
entry = packlist_alloc(_pack, oid->hash, index_pos);
entry->hash = hash;
-   if (type)
-   entry->type = type;
+   oe_set_type(entry, type);
if (exclude)
entry->preferred_base = 1;
else
@@ -1407,6 +1407,7 @@ static void check_object(struct object_entry *entry)
unsigned long avail;
off_t ofs;
unsigned char *buf, c;
+   enum object_type type;
 
buf = use_pack(p, _curs, entry->in_pack_offset, );
 
@@ -1415,11 +1416,15 @@ static void check_object(struct object_entry *entry)
 * since non-delta representations could still be reused.
  

[PATCH v6 00/11] nd/pack-objects-pack-struct updates

2018-03-18 Thread Nguyễn Thái Ngọc Duy
v6 fixes the one optimization that I just couldn't get right, fixes
two off-by-one error messages and a couple commit message update
(biggest change is in 11/11 to record some numbers from AEvar)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index fb2aba80bf..4406af640f 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3112,10 +3112,10 @@ int cmd_pack_objects(int argc, const char **argv, const 
char *prefix)
 
if (depth >= (1 << OE_DEPTH_BITS))
die(_("delta chain depth %d is greater than maximum limit %d"),
-   depth, (1 << OE_DEPTH_BITS));
+   depth, (1 << OE_DEPTH_BITS) - 1);
if (cache_max_small_delta_size >= (1 << OE_Z_DELTA_BITS))
die(_("pack.deltaCacheLimit is greater than maximum limit %d"),
-   1 << OE_Z_DELTA_BITS);
+   (1 << OE_Z_DELTA_BITS) - 1);
 
argv_array_push(, "pack-objects");
if (thin) {
diff --git a/pack-objects.h b/pack-objects.h
index 55358da9f3..af40211105 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -275,7 +275,7 @@ static inline unsigned long oe_size(const struct 
object_entry *e)
}
 }
 
-static inline int contains_in_32bits(unsigned long limit)
+static inline int oe_fits_in_32bits(unsigned long limit)
 {
uint32_t truncated_limit = (uint32_t)limit;
 
@@ -287,8 +287,8 @@ static inline int oe_size_less_than(const struct 
object_entry *e,
 {
if (e->size_valid)
return e->size_ < limit;
-   if (contains_in_32bits(limit))
-   return 1;
+   if (oe_fits_in_32bits(limit)) /* limit < 2^32 <= size ? */
+   return 0;
return oe_size(e) < limit;
 }
 
@@ -297,8 +297,8 @@ static inline int oe_size_greater_than(const struct 
object_entry *e,
 {
if (e->size_valid)
return e->size_ > limit;
-   if (contains_in_32bits(limit))
-   return 0;
+   if (oe_fits_in_32bits(limit)) /* limit < 2^32 <= size ? */
+   return 1;
return oe_size(e) > limit;
 }
 
@@ -307,6 +307,14 @@ static inline void oe_set_size(struct object_entry *e,
 {
e->size_ = size;
e->size_valid = e->size_ == size;
+
+   if (!e->size_valid) {
+   unsigned long real_size;
+
+   if (sha1_object_info(e->idx.oid.hash, _size) < 0 ||
+   size != real_size)
+   die("BUG: 'size' is supposed to be the object size!");
+   }
 }
 
 static inline unsigned long oe_delta_size(struct packing_data *pack,

-- 
2.17.0.rc0.347.gf9cf61673a



[PATCH v6 01/11] pack-objects: a bit of document about struct object_entry

2018-03-18 Thread Nguyễn Thái Ngọc Duy
The role of this comment block becomes more important after we shuffle
fields around to shrink this struct. It will be much harder to see what
field is related to what.

Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 pack-objects.h | 45 +
 1 file changed, 45 insertions(+)

diff --git a/pack-objects.h b/pack-objects.h
index 03f1191659..c0a1f61aac 100644
--- a/pack-objects.h
+++ b/pack-objects.h
@@ -1,6 +1,51 @@
 #ifndef PACK_OBJECTS_H
 #define PACK_OBJECTS_H
 
+/*
+ * basic object info
+ * -
+ * idx.oid is filled up before delta searching starts. idx.crc32 is
+ * only valid after the object is written out and will be used for
+ * generating the index. idx.offset will be both gradually set and
+ * used in writing phase (base objects get offset first, then deltas
+ * refer to them)
+ *
+ * "size" is the uncompressed object size. Compressed size of the raw
+ * data for an object in a pack is not stored anywhere but is computed
+ * and made available when reverse .idx is made.
+ *
+ * "hash" contains a path name hash which is used for sorting the
+ * delta list and also during delta searching. Once prepare_pack()
+ * returns it's no longer needed.
+ *
+ * source pack info
+ * 
+ * The (in_pack, in_pack_offset) tuple contains the location of the
+ * object in the source pack. in_pack_header_size allows quickly
+ * skipping the header and going straight to the zlib stream.
+ *
+ * "type" and "in_pack_type" both describe object type. in_pack_type
+ * may contain a delta type, while type is always the canonical type.
+ *
+ * deltas
+ * --
+ * Delta links (delta, delta_child and delta_sibling) are created to
+ * reflect that delta graph from the source pack then updated or added
+ * during delta searching phase when we find better deltas.
+ *
+ * delta_child and delta_sibling are last needed in
+ * compute_write_order(). "delta" and "delta_size" must remain valid
+ * at object writing phase in case the delta is not cached.
+ *
+ * If a delta is cached in memory and is compressed, delta_data points
+ * to the data and z_delta_size contains the compressed size. If it's
+ * uncompressed [1], z_delta_size must be zero. delta_size is always
+ * the uncompressed size and must be valid even if the delta is not
+ * cached.
+ *
+ * [1] during try_delta phase we don't bother with compressing because
+ * the delta could be quickly replaced with a better one.
+ */
 struct object_entry {
struct pack_idx_entry idx;
unsigned long size; /* uncompressed size */
-- 
2.17.0.rc0.347.gf9cf61673a



Re: [PATCH v6 07/14] commit-graph: implement 'git-commit-graph write'

2018-03-18 Thread Ævar Arnfjörð Bjarmason

On Wed, Mar 14 2018, Derrick Stolee jotted:

> +'git commit-graph write'  [--object-dir ]
> +
> +
> +DESCRIPTION
> +---
> +
> +Manage the serialized commit graph file.
> +
> +
> +OPTIONS
> +---
> +--object-dir::
> + Use given directory for the location of packfiles and commit graph
> + file. The commit graph file is expected to be at /info/commit-graph
> + and the packfiles are expected to be in /pack.

Maybe this was covered in a previous round, this series is a little hard
to follow since each version isn't In-Reply-To the version before it,
but why is this option needed, i.e. why would you do:

git commit-graph write --object-dir=/some/path/.git/objects

As opposed to just pigging-backing on what we already have with both of:

git --git-dir=/some/path/.git commit-graph write
git -C /some/path commit-graph write

Is there some use-case where you have *just* the objects dir and not the
rest of the .git folder?


NICE TO MEET YOU,

2018-03-18 Thread Jack
Hi Dear, my name is Jack and i am seeking for a relationship in which i will 
feel loved after a series of failed relationships. 

I am hoping that you would be interested and we could possibly get to know each 
other more if you do not mind. I am open to answering questions from you as i 
think my approach is a little inappropriate. Hope to hear back from you.

Jack.


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-18 Thread Duy Nguyen
On Sun, Mar 18, 2018 at 10:26 AM, Jeff King  wrote:
> On Sun, Mar 18, 2018 at 09:18:34AM +0100, Nguyễn Thái Ngọc Duy wrote:
>
>> The set of extra warnings we enable when DEVELOPER has to be
>> conservative because we can't assume any compiler version the
>> developer may use. Detect the compiler version so we know when it's
>> safe to enable -Wextra and maybe more.
>
> Hrm, I was planning to expand my patch into a series.

I started this so I think people expect me to move it forward. But
I'll gladly hand it to you. It looks like you have more compilers to
play with anyway (I'm on gentoo and installing a new compiler,
especially llvm-based, takes ages).
-- 
Duy


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-18 Thread Eric Sunshine
On Sun, Mar 18, 2018 at 5:28 AM, Jeff King  wrote:
> On Sun, Mar 18, 2018 at 05:06:07AM -0400, Eric Sunshine wrote:
>> On MacOS, "cc -v" output is:
>> --- >8 ---
>> Apple LLVM version 9.0.0 (clang-900.0.39.2)
>> Target: x86_64-apple-darwin16.7.0
>> Thread model: posix
>> InstalledDir: ...
>> --- >8 ---
>
> Is that really way ahead of the clang releases (which are at 7.0,
> AFAIK), or do they use a totally different number scheme?

I have no idea, but from past experience with Apple, I'd guess that
the version number is an Apple invention.

{...goes and researches a bit...}

Indeed, there doesn't seem to be any correlation between what Apple
reports and the actual clang version number[1].

[1]: 
https://stackoverflow.com/questions/33603027/get-apple-clang-version-and-corresponding-upstream-llvm-version


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-18 Thread Jeff King
On Sun, Mar 18, 2018 at 05:06:07AM -0400, Eric Sunshine wrote:

> On MacOS, "cc -v" output is:
> 
> --- >8 ---
> Apple LLVM version 9.0.0 (clang-900.0.39.2)
> Target: x86_64-apple-darwin16.7.0
> Thread model: posix
> InstalledDir: ...
> --- >8 ---

Is that really way ahead of the clang releases (which are at 7.0,
AFAIK), or do they use a totally different number scheme?

If the latter, I guess we'd have to treat it separately from clang and
have all of the conditionals treat it independently? Yuck.

-Peff


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-18 Thread Jeff King
On Sun, Mar 18, 2018 at 09:18:34AM +0100, Nguyễn Thái Ngọc Duy wrote:

> The set of extra warnings we enable when DEVELOPER has to be
> conservative because we can't assume any compiler version the
> developer may use. Detect the compiler version so we know when it's
> safe to enable -Wextra and maybe more.

Hrm, I was planning to expand my patch into a series. In particular, in
my experiments some of those warnings do not need to be disabled under
-Wextra (I tested going back to gcc 5 and clang 4.0).

I think it's worth splitting up the boilerplate changes from the
decisions on individual warnings, because I suspect we'll need to revise
the latter going forward.

I also noticed that all of those options seem to go back at least to
gcc-5. It makes me wonder how much we even need this version-detecting
framework. There are one or two "clang vs gcc" things I've found, but I
haven't found a case where old versions of gcc don't support a
particular option.

IOW, could we get away with just adding these to the DEVELOPER knob and
assuming that people who use it have a reasonably modern gcc or clang
setup? That's more or less what the existing knobs do.

But I didn't follow whether there was any earlier discussion on specific
problems.

>  -dumpversion does not work correctly for clang. So I use "-v" instead
>  which seems to produce the same output for both gcc and clang (with a
>  minor difference in freebsd where it says "FreeBSD clang" instead of
>  just "clang"). Not sure if it helps your "cc on debian" case though

Heh. At first I was confused, as it seems to work for me:

  $ clang-4.0 -dumpversion
  4.2.1

But then I tried this:

  $ clang-5.0 -dumpversion
  4.2.1

Whoops. It returns the same value up through clang 7.

-Peff


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-18 Thread Eric Sunshine
On Sun, Mar 18, 2018 at 5:17 AM, Duy Nguyen  wrote:
> On Sun, Mar 18, 2018 at 10:06 AM, Eric Sunshine  
> wrote:
>> On MacOS, "cc -v" output is:
>> --- >8 ---
>> Apple LLVM version 9.0.0 (clang-900.0.39.2)
>> Target: x86_64-apple-darwin16.7.0
>> Thread model: posix
>> InstalledDir: ...
>> --- >8 ---
>
> Does it still build ok for you with your changes squashed in? I think
> the check for clang4/gcc6 and setting -Wextra in config.mak.dev may
> backfire because clang9 probably has a lot more warnings enabled and
> some of them we don't want and cause compile error...

The project does still compile cleanly for me. (And I used "make V=1"
to verify that the new flags are used.)


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-18 Thread Jeff King
On Sun, Mar 18, 2018 at 10:17:41AM +0100, Duy Nguyen wrote:

> On Sun, Mar 18, 2018 at 10:06 AM, Eric Sunshine  
> wrote:
> > On Sun, Mar 18, 2018 at 09:18:34AM +0100, Nguyễn Thái Ngọc Duy wrote:
> >> The set of extra warnings we enable when DEVELOPER has to be
> >> conservative because we can't assume any compiler version the
> >> developer may use. Detect the compiler version so we know when it's
> >> safe to enable -Wextra and maybe more.
> >>
> >> Helped-by: Jeff King 
> >> Signed-off-by: Nguyễn Thái Ngọc Duy 
> >> ---
> >> diff --git a/detect-compiler b/detect-compiler
> >> --- /dev/null
> >> +++ b/detect-compiler
> >> @@ -0,0 +1,50 @@
> >> +get_version_line() {
> >> + "$CC" -v 2>&1 | grep ' version '
> >> +}
> >
> > On MacOS, "cc -v" output is:
> >
> > --- >8 ---
> > Apple LLVM version 9.0.0 (clang-900.0.39.2)
> > Target: x86_64-apple-darwin16.7.0
> > Thread model: posix
> > InstalledDir: ...
> > --- >8 ---
> 
> Does it still build ok for you with your changes squashed in? I think
> the check for clang4/gcc6 and setting -Wextra in config.mak.dev may
> backfire because clang9 probably has a lot more warnings enabled and
> some of them we don't want and cause compile error...

I don't think we can predict that going forward, though. We'd have to
wait for people to use the new compiler and then send a patch disabling
whatever causes a problem (or better yet, fixing the code if
appropriate).

This framework can only help us write those patches; I don't think it
can save us from having to do them in the first place.

-Peff


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-18 Thread Duy Nguyen
On Sun, Mar 18, 2018 at 10:06 AM, Eric Sunshine  wrote:
> On Sun, Mar 18, 2018 at 09:18:34AM +0100, Nguyễn Thái Ngọc Duy wrote:
>> The set of extra warnings we enable when DEVELOPER has to be
>> conservative because we can't assume any compiler version the
>> developer may use. Detect the compiler version so we know when it's
>> safe to enable -Wextra and maybe more.
>>
>> Helped-by: Jeff King 
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>> diff --git a/detect-compiler b/detect-compiler
>> --- /dev/null
>> +++ b/detect-compiler
>> @@ -0,0 +1,50 @@
>> +get_version_line() {
>> + "$CC" -v 2>&1 | grep ' version '
>> +}
>
> On MacOS, "cc -v" output is:
>
> --- >8 ---
> Apple LLVM version 9.0.0 (clang-900.0.39.2)
> Target: x86_64-apple-darwin16.7.0
> Thread model: posix
> InstalledDir: ...
> --- >8 ---

Does it still build ok for you with your changes squashed in? I think
the check for clang4/gcc6 and setting -Wextra in config.mak.dev may
backfire because clang9 probably has a lot more warnings enabled and
some of them we don't want and cause compile error...
-- 
Duy


Re: [PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-18 Thread Eric Sunshine
On Sun, Mar 18, 2018 at 09:18:34AM +0100, Nguyễn Thái Ngọc Duy wrote:
> The set of extra warnings we enable when DEVELOPER has to be
> conservative because we can't assume any compiler version the
> developer may use. Detect the compiler version so we know when it's
> safe to enable -Wextra and maybe more.
> 
> Helped-by: Jeff King 
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
> diff --git a/detect-compiler b/detect-compiler
> --- /dev/null
> +++ b/detect-compiler
> @@ -0,0 +1,50 @@
> +get_version_line() {
> + "$CC" -v 2>&1 | grep ' version '
> +}

On MacOS, "cc -v" output is:

--- >8 ---
Apple LLVM version 9.0.0 (clang-900.0.39.2)
Target: x86_64-apple-darwin16.7.0
Thread model: posix
InstalledDir: ...
--- >8 ---

> +get_family() {
> + get_version_line | sed 's/^\(.*\) version [0-9][^ ]* .*/\1/'
> +}

So, this returns "Apple LLVM".

> +case "$(get_family)" in
> +gcc)
> + print_flags gcc
> + ;;
> +*clang)
> + print_flags clang
> + ;;
> +"FreeBSD clang")
> + print_flags clang
> + ;;
> +*)
> + : unknown compiler family
> + ;;
> +esac

Which means you probably want to squash in:

--- >8 ---
diff --git a/detect-compiler b/detect-compiler
index bc2ea39ef5..3140416b40 100755
--- a/detect-compiler
+++ b/detect-compiler
@@ -41,6 +41,9 @@ gcc)
 *clang)
print_flags clang
;;
+"Apple LLVM")
+   print_flags clang
+   ;;
 "FreeBSD clang")
print_flags clang
;;
--- >8 ---


Re: [PATCH v5 11/11] pack-objects.h: reorder members to shrink struct object_entry

2018-03-18 Thread Duy Nguyen
On Sat, Mar 17, 2018 at 8:53 PM, Ævar Arnfjörð Bjarmason
 wrote:
>
> On Sat, Mar 17 2018, Nguyễn Thái Ngọc Duy jotted:
>
>> Previous patches leave lots of holes and padding in this struct. This
>> patch reorders the members and shrinks the struct down to 80 bytes
>> (from 136 bytes, before any field shrinking is done) with 16 bits to
>> spare (and a couple more in in_pack_header_size when we really run out
>> of bits).
>
> Given what I mentioned in 87po42cwql@evledraar.gmail.com just now I
> think we should add this to the commit message.
>
> This is the last in a series of memory reduction patches (see
> "pack-objects: a bit of document about struct object_entry" for the
> first one).
>
> Overall they've reduced repack memory size on linux.git from 3.747G
> to 3.424G, or by around 320M, a decrease of 8.5%. The runtime of
> repack has stayed the same throughout this series. Ævar's testing on
> a big monorepo he has access to (bigger than linux.git) has shown a
> 7.9% reduction, so the overall expected improvement should be
> somewhere around 8%.
>
> See 87po42cwql@evledraar.gmail.com on-list
> (https://public-inbox.org/git/87po42cwql@evledraar.gmail.com/)
> for more detailed numbers and a test script used to produce the
> numbers cited above.

Yeah.

I probably should add something that was on my mind but never written
out. These shrinking and packing definitely slow down access to these
struct members (more instructions to read or write). However, since
pack-objects is mostly IO-bound, and when it's CPU-bound, I think the
hot path is either inflating objects or generating deltas, these
slowdowns do not matter (and smaller cache footprint helps too)
-- 
Duy


Re: [PATCH 04/36] t/helper: merge test-lazy-init-name-hash into test-tool

2018-03-18 Thread Eric Sunshine
On Sun, Mar 18, 2018 at 4:25 AM, Duy Nguyen  wrote:
> On Sun, Mar 18, 2018 at 3:11 AM, Eric Sunshine  
> wrote:
>> On Sat, Mar 17, 2018 at 3:53 AM, Nguyễn Thái Ngọc Duy  
>> wrote:
>>> -extern int test_lazy_init_name_hash(struct index_state *istate, int 
>>> try_threaded);
>>> +extern int lazy_init_name_hash_for_testing(struct index_state *istate, int 
>>> try_threaded);
>>
>> I get why you renamed this since the "main" function in the test
>> program wants to be called 'test_lazy_init_name_hash'...
>>
>>> diff --git a/t/helper/test-lazy-init-name-hash.c 
>>> b/t/helper/test-lazy-init-name-hash.c
>>> @@ -9,6 +10,9 @@ static int perf;
>>> +static int (*init_name_hash)(struct index_state *istate, int try_threaded) 
>>> =
>>> +   lazy_init_name_hash_for_testing;
>>> +
>>> @@ -33,9 +37,9 @@ static void dump_run(void)
>>> if (single) {
>>> -   test_lazy_init_name_hash(_index, 0);
>>> +   init_name_hash(_index, 0);
>>
>> ... but I'm having trouble understanding why this indirection through
>> 'init_name_hash' is used rather than just calling
>> lazy_init_name_hash_for_testing() directly. Am I missing something
>> obvious or is 'init_name_hash' just an unneeded artifact of an earlier
>> iteration before the rename in cache.{c,h}?
>
> Nope. It just feels too long to me and because we're already in the
> test I don't feel the need to repeat _for_testing everywhere. If it's
> confusing, I'll remove init_name_hash.

Without an explanatory in-code comment, I'd guess that someone coming
across this in the future will also stumble over it just like I did in
the review.

What if, instead, you rename test_lazy_init_name_hash() to
lazy_init_name_hash_test(), shifting 'test' from the front to the back
of the name? That way, the name remains the same length at the call
sites in the test helper, and you don't have to introduce the
confusing, otherwise unneeded 'init_name_hash'.


Re: Why does pack-objects use so much memory on incremental packing?

2018-03-18 Thread Duy Nguyen
On Sat, Mar 17, 2018 at 11:05 PM, Ævar Arnfjörð Bjarmason
 wrote:
>
> On Wed, Feb 28 2018, Duy Nguyen jotted:
>
>> linux-2.6.git current has 6483999 objects. "git gc" on my poor laptop
>> consumes 1.7G out of 4G RAM, pushing lots of data to swap and making
>> all apps nearly unusuable (granted the problem is partly Linux I/O
>> scheduler too). So I wonder if we can reduce pack-objects memory
>> footprint a bit.
>>
>> This demonstration patch (probably breaks some tests) would reduce the
>> size of struct object_entry from from 136 down to 112 bytes on
>> x86-64. There are 6483999 of these objects, so the saving is 17% or
>> 148 MB.
>
> Splitting this off into its own thread. Aside from the improvements in
> your repack memory reduction (20180317141033.21545-1-pclo...@gmail.com)
> and gc config (20180316192745.19557-1-pclo...@gmail.com) series's I'm
> wondering why repack takes so much memory to incrementally repack new
> stuff when you leave out the base pack.
>
> Repacking git.git takes around 290MB of memory on my system, but I'd
> think that this would make it take a mere few megabytes, since all I'm
> asking it to do is pack up the few loose objects that got added and keep
> the base pack:
>
> ...
>

I left some clue in the new estimate_repack_memory() function in my gc
series that could help you find this out. I haven't really tested this
case but my guess is the two cache pools we have will likely be filled
up close to full anyway and hit delta_base_cache_limit and
max_delta_cache_size limits. When these are really full on default
configuration, they'll take roughly ~300mb.

The second is, I think we still go through all objects to mark which
one is included in the new pack, which one not (and probably which one
can be delta base candidates). Try calling alloc_report() function at
the end of repack to see exactly how much memory is locked in there.
This we could perhaps improve for incremental repack by avoiding
running rev-list.
-- 
Duy


Re: [PATCH 04/36] t/helper: merge test-lazy-init-name-hash into test-tool

2018-03-18 Thread Duy Nguyen
On Sun, Mar 18, 2018 at 3:11 AM, Eric Sunshine  wrote:
> On Sat, Mar 17, 2018 at 3:53 AM, Nguyễn Thái Ngọc Duy  
> wrote:
>> diff --git a/cache.h b/cache.h
>> @@ -333,7 +333,7 @@ struct index_state {
>> -extern int test_lazy_init_name_hash(struct index_state *istate, int 
>> try_threaded);
>> +extern int lazy_init_name_hash_for_testing(struct index_state *istate, int 
>> try_threaded);
>
> I get why you renamed this since the "main" function in the test
> program wants to be called 'test_lazy_init_name_hash'...
>
>> diff --git a/t/helper/test-lazy-init-name-hash.c 
>> b/t/helper/test-lazy-init-name-hash.c
>> @@ -9,6 +10,9 @@ static int perf;
>> +static int (*init_name_hash)(struct index_state *istate, int try_threaded) =
>> +   lazy_init_name_hash_for_testing;
>> +
>> @@ -33,9 +37,9 @@ static void dump_run(void)
>> if (single) {
>> -   test_lazy_init_name_hash(_index, 0);
>> +   init_name_hash(_index, 0);
>
> ... but I'm having trouble understanding why this indirection through
> 'init_name_hash' is used rather than just calling
> lazy_init_name_hash_for_testing() directly. Am I missing something
> obvious or is 'init_name_hash' just an unneeded artifact of an earlier
> iteration before the rename in cache.{c,h}?

Nope. It just feels too long to me and because we're already in the
test I don't feel the need to repeat _for_testing everywhere. If it's
confusing, I'll remove init_name_hash.
-- 
Duy


Re: [PATCH v5 09/11] pack-objects: shrink size field in struct object_entry

2018-03-18 Thread Duy Nguyen
On Sun, Mar 18, 2018 at 6:09 AM, Junio C Hamano  wrote:
>> + uint32_t truncated_limit = (uint32_t)limit;
>> +
>> + return limit == truncated_limit;
>> +}
>
> I am guessing that a compiler that is clever enough will make this
> function a no-op on a 32-bit arch and that is why it is a static
> inline function?

It's a separate function because I don't want to duplicate this ==
logic twice. Even if the compiler does not optimize this, it's still
much cheaper than oe_sze() which involves disk access.

>> +static inline int oe_size_less_than(const struct object_entry *e,
>> + unsigned long limit)
>> +{
>> + if (e->size_valid)
>> + return e->size_ < limit;
>
> e->size_ is the true size so we can compare it to see if it is smaller
> than limit.
>
>> + if (contains_in_32bits(limit))
>> + return 1;
>
> If limit is small enough, and because e->size_valid means e->size_
> does not fit in 32-bit, we know size is larger than limit.
> Shouldn't we be returning 0 that means "no, the size is not less
> than limit" from here?

Argh!!! This logic keeps messing with my brain.
-- 
Duy


[PATCH] Makefile: detect compiler and enable more warnings in DEVELOPER=1

2018-03-18 Thread Nguyễn Thái Ngọc Duy
The set of extra warnings we enable when DEVELOPER has to be
conservative because we can't assume any compiler version the
developer may use. Detect the compiler version so we know when it's
safe to enable -Wextra and maybe more.

Helped-by: Jeff King 
Signed-off-by: Nguyễn Thái Ngọc Duy 
---
 -dumpversion does not work correctly for clang. So I use "-v" instead
 which seems to produce the same output for both gcc and clang (with a
 minor difference in freebsd where it says "FreeBSD clang" instead of
 just "clang"). Not sure if it helps your "cc on debian" case though

 Tested with clang-5.0.1 and gcc-6.4.0 (too lazy to test on freebsd
 clang 3.4.1 but I don't expect surprises there)

 I will still need to pull more modern gcc/clang on travis, but that
 can wait until this patch settles.

 Makefile| 11 +--
 config.mak.dev  | 28 +++
 detect-compiler | 50 +
 3 files changed, 79 insertions(+), 10 deletions(-)
 create mode 100644 config.mak.dev
 create mode 100755 detect-compiler

diff --git a/Makefile b/Makefile
index a1d8775adb..9dfd152a1e 100644
--- a/Makefile
+++ b/Makefile
@@ -442,15 +442,6 @@ GIT-VERSION-FILE: FORCE
 # CFLAGS and LDFLAGS are for the users to override from the command line.
 
 CFLAGS = -g -O2 -Wall
-DEVELOPER_CFLAGS = -Werror \
-   -Wdeclaration-after-statement \
-   -Wno-format-zero-length \
-   -Wold-style-definition \
-   -Woverflow \
-   -Wpointer-arith \
-   -Wstrict-prototypes \
-   -Wunused \
-   -Wvla
 LDFLAGS =
 ALL_CFLAGS = $(CPPFLAGS) $(CFLAGS)
 ALL_LDFLAGS = $(LDFLAGS)
@@ -1051,7 +1042,7 @@ include config.mak.uname
 -include config.mak
 
 ifdef DEVELOPER
-CFLAGS += $(DEVELOPER_CFLAGS)
+include config.mak.dev
 endif
 
 comma := ,
diff --git a/config.mak.dev b/config.mak.dev
new file mode 100644
index 00..59aef342c4
--- /dev/null
+++ b/config.mak.dev
@@ -0,0 +1,28 @@
+CFLAGS += -Werror
+CFLAGS += -Wdeclaration-after-statement
+CFLAGS += -Wno-format-zero-length
+CFLAGS += -Wold-style-definition
+CFLAGS += -Woverflow
+CFLAGS += -Wpointer-arith
+CFLAGS += -Wstrict-prototypes
+CFLAGS += -Wunused
+CFLAGS += -Wvla
+
+COMPILER_FEATURES := $(shell ./detect-compiler $(CC))
+
+ifneq ($(filter clang4,$(COMPILER_FEATURES)),)
+CFLAGS += -Wtautological-constant-out-of-range-compare
+endif
+
+ifneq ($(or $(filter gcc6,$(COMPILER_FEATURES)),$(filter 
clang4,$(COMPILER_FEATURES))),)
+CFLAGS += -Wextra
+CFLAGS += -Wmissing-prototypes
+CFLAGS += -Wno-empty-body
+CFLAGS += -Wno-missing-field-initializers
+CFLAGS += -Wno-sign-compare
+CFLAGS += -Wno-unused-function
+CFLAGS += -Wno-unused-parameter
+ifneq ($(filter gcc6,$(COMPILER_FEATURES)),)
+CFLAGS += -Wno-maybe-uninitialized
+endif
+endif
diff --git a/detect-compiler b/detect-compiler
new file mode 100755
index 00..bc2ea39ef5
--- /dev/null
+++ b/detect-compiler
@@ -0,0 +1,50 @@
+#!/bin/sh
+#
+# Probe the compiler for vintage, version, etc. This is used for setting
+# optional make knobs under the DEVELOPER knob.
+
+CC="$*"
+
+# we get something like (this is at least true for gcc and clang)
+#
+# FreeBSD clang version 3.4.1 (tags/RELEASE...)
+get_version_line() {
+   "$CC" -v 2>&1 | grep ' version '
+}
+
+get_family() {
+   get_version_line | sed 's/^\(.*\) version [0-9][^ ]* .*/\1/'
+}
+
+get_version() {
+   get_version_line | sed 's/^.* version \([0-9][^ ]*\) .*/\1/'
+}
+
+print_flags() {
+   family=$1
+   version=$(get_version | cut -f 1 -d .)
+
+   # Print a feature flag not only for the current version, but also
+   # for any prior versions we encompass. This avoids needing to do
+   # numeric comparisons in make, which are awkward.
+   while test "$version" -gt 0
+   do
+   echo $family$version
+   version=$((version - 1))
+   done
+}
+
+case "$(get_family)" in
+gcc)
+   print_flags gcc
+   ;;
+*clang)
+   print_flags clang
+   ;;
+"FreeBSD clang")
+   print_flags clang
+   ;;
+*)
+   : unknown compiler family
+   ;;
+esac
-- 
2.17.0.rc0.347.gf9cf61673a



Re: [PATCH v11 06/10] convert: add 'working-tree-encoding' attribute

2018-03-18 Thread Torsten Bögershausen
Some comments inline

On Fri, Mar 09, 2018 at 06:35:32PM +0100, lars.schnei...@autodesk.com wrote:
> From: Lars Schneider 
> 
> Git recognizes files encoded with ASCII or one of its supersets (e.g.
> UTF-8 or ISO-8859-1) as text files. All other encodings are usually
> interpreted as binary and consequently built-in Git text processing
> tools (e.g. 'git diff') as well as most Git web front ends do not
> visualize the content.
> 
> Add an attribute to tell Git what encoding the user has defined for a
> given file. If the content is added to the index, then Git converts the

Minor comment:
"Git converts the content"
Everywhere else (?) "encodes or reencodes" is used.
"Git reencodes the content" may be more consistent.


[No comments on the .gitattributes]

>  
> diff --git a/convert.c b/convert.c
> index b976eb968c..aa59ecfe49 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -7,6 +7,7 @@
>  #include "sigchain.h"
>  #include "pkt-line.h"
>  #include "sub-process.h"
> +#include "utf8.h"
>  
>  /*
>   * convert.c - convert a file when checking it out and checking it in.
> @@ -265,6 +266,78 @@ static int will_convert_lf_to_crlf(size_t len, struct 
> text_stat *stats,
>  
>  }
>  
> +static const char *default_encoding = "UTF-8";
> +
> +static int encode_to_git(const char *path, const char *src, size_t src_len,
> +  struct strbuf *buf, const char *enc, int conv_flags)
> +{
> + char *dst;
> + int dst_len;
> + int die_on_error = conv_flags & CONV_WRITE_OBJECT;
> +
> + /*
> +  * No encoding is specified or there is nothing to encode.
> +  * Tell the caller that the content was not modified.
> +  */
> + if (!enc || (src && !src_len))
> + return 0;

(This may have been discussed before.
 As we checked (enc != NULL) I think we can add here:)
if (is_encoding_utf8(enc))
return 0;

> +
> + /*
> +  * Looks like we got called from "would_convert_to_git()".
> +  * This means Git wants to know if it would encode (= modify!)
> +  * the content. Let's answer with "yes", since an encoding was
> +  * specified.
> +  */
> + if (!buf && !src)
> + return 1;
> +
> + dst = reencode_string_len(src, src_len, default_encoding, enc,
> +   _len);
> + if (!dst) {
> + /*
> +  * We could add the blob "as-is" to Git. However, on checkout
> +  * we would try to reencode to the original encoding. This
> +  * would fail and we would leave the user with a messed-up
> +  * working tree. Let's try to avoid this by screaming loud.
> +  */
> + const char* msg = _("failed to encode '%s' from %s to %s");
> + if (die_on_error)
> + die(msg, path, enc, default_encoding);
> + else {
> + error(msg, path, enc, default_encoding);
> + return 0;
> + }
> + }
> +
> + strbuf_attach(buf, dst, dst_len, dst_len + 1);
> + return 1;
> +}
> +
> +static int encode_to_worktree(const char *path, const char *src, size_t 
> src_len,
> +   struct strbuf *buf, const char *enc)
> +{
> + char *dst;
> + int dst_len;
> +
> + /*
> +  * No encoding is specified or there is nothing to encode.
> +  * Tell the caller that the content was not modified.
> +  */
> + if (!enc || (src && !src_len))
> + return 0;

 Same as above:
if (is_encoding_utf8(enc))
return 0;

> +
> + dst = reencode_string_len(src, src_len, enc, default_encoding,
> +   _len);
> + if (!dst) {
> + error("failed to encode '%s' from %s to %s",
> + path, default_encoding, enc);
> + return 0;
> + }
> +
> + strbuf_attach(buf, dst, dst_len, dst_len + 1);
> + return 1;
> +}
> +
>  static int crlf_to_git(const struct index_state *istate,
>  const char *path, const char *src, size_t len,
>  struct strbuf *buf,
> @@ -978,6 +1051,25 @@ static int ident_to_worktree(const char *path, const 
> char *src, size_t len,
>   return 1;
>  }
>  
> +static const char *git_path_check_encoding(struct attr_check_item *check)
> +{
> + const char *value = check->value;
> +
> + if (ATTR_UNSET(value) || !strlen(value))
> + return NULL;
> +


> + if (ATTR_TRUE(value) || ATTR_FALSE(value)) {
> + error(_("working-tree-encoding attribute requires a value"));
> + return NULL;
> + }

TRUE or false are values, but just wrong ones.
If this test is removed, the user will see "failed to encode "TRUE" to "UTF-8",
which should give enough information to fix it.

> +
> + /* Don't encode to the default encoding */
> + if (!strcasecmp(value, default_encoding))
> + return NULL;
 Same as above ?: