Re: Apparent bug in git-gc

2014-09-16 Thread Dale R. Worley
 From: Jeff King p...@peff.net
 
  I have an 11 GB repository.  It passes git-fsck (though with a number
  of dangling objects).  But when I run git-gc on it, the file
  refs/heads/master disappears.
 
 That's the expected behavior. Gc runs git pack-refs, which puts an
 entry into packed-refs and prunes the loose ref.

Arrrgh, damn!  I knew that but I forgot it.  Sorry about the trouble.

(At least I don't have to worry about fixing it now.)

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Apparent bug in git-gc

2014-09-15 Thread Dale R. Worley
I have an unpleasant bug in git-gc:

Git version:  1.8.3.1
Running on:  Fedora 19 Gnu/Linux

I have an 11 GB repository.  It passes git-fsck (though with a number
of dangling objects).  But when I run git-gc on it, the file
refs/heads/master disappears.  Since HEAD points to refs/heads/master,
this makes the repository unusable.

What should I do next?

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


What happens when the repository is bigger than gc.autopacklimit * pack.packSizeLimit?

2014-08-27 Thread Dale R. Worley
[Previously sent to the git-users mailing list, but it probably should
be addressed here.]

A number of commands invoke git gc --auto to clean up the repository
when there might be a lot of dangling objects and/or there might be
far too many unpacked files.  The manual pages say:

git gc:
   --auto
   With this option, git gc checks whether any housekeeping is
   required; if not, it exits without performing any work. Some git
   commands run git gc --auto after performing operations that could
   create many loose objects.

   Housekeeping is required if there are too many loose objects or too
   many packs in the repository. If the number of loose objects
   exceeds the value of the gc.auto configuration variable, then all
   loose objects are combined into a single pack using git repack -d
   -l. Setting the value of gc.auto to 0 disables automatic packing of
   loose objects.

git config:
   gc.autopacklimit
   When there are more than this many packs that are not marked with
   *.keep file in the repository, git gc --auto consolidates them into
   one larger pack. The default value is 50. Setting this to 0
   disables it.

What happens when the amount of data in the repository exceeds
gc.autopacklimit * pack.packSizeLimit?  According to the
documentation, git gc --auto will then *always* repack the
repository, whether it needs it or not, because the data will require
more than gc.autopacklimit pack files.

And it appears from an experiment that this is what happens.  I have a
repository with pack.packSizeLimit = 99m, and there are 104 pack
files, and even when git gc is done, if I do git gc --auto, it
will do git-repack again.

Looking at the code, I see:

builtin/gc.c:
static int too_many_packs(void)
{
struct packed_git *p;
int cnt;

if (gc_auto_pack_limit = 0)
return 0;

prepare_packed_git();
for (cnt = 0, p = packed_git; p; p = p-next) {
if (!p-pack_local)
continue;
if (p-pack_keep)
continue;
/*
 * Perhaps check the size of the pack and count only
 * very small ones here?
 */
cnt++;
}
return gc_auto_pack_limit = cnt;
}

Yes, perhaps you *should* check the size of the pack!

What is a good strategy for making this function behave as we want it to?

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-29 Thread Dale R. Worley
 From: David Lang da...@lang.hm

 well, as others noted, the problem is actually caused by doing the diffs, and 
 that is something that is a very common thing to do with source code.

To some degree, my attitude comes from When I Was A Boy, when you got
16k for both your bytecode and your data, so you never kept more than
one line of a file in memory unless you had to.

Regardless of that, the problem with git fsck is not due to doing
diffs, and git commit by default does diffs even if you don't ask
for them, so the observed problems cannot be subsumed under the label
you asked for a diff of a file that can't be diffed.

 And I would assume that files of several MB would be able to fit in
 memory (again, this was assumed to be for development, and compilers
 take a lot of ram to run, so having enough ram to hold any
 individual source file while the compiler is _not_ using ram doesn't
 seem likely to be a problem)

At least the first versions of GCC only kept one function (and the
global symbol table) in memory at once, so you could compile a source
file that was larger than the available memory.

In any case, if Git is no longer limited to handling source files, it
needs to be updated so it can handle large files.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Dale R. Worley
 From: Duy Nguyen pclo...@gmail.com

 I don't know how many commands are hit by this. If you have time and
 gdb, please put a break point in die_builtin() function and send
 backtraces for those that fail. You could speed up the process by
 creating a smaller file and set the environment variable
 GIT_ALLOC_LIMIT (in kilobytes) to a number lower than that size. If
 git attempts to allocate a block larger than that limit it'll die.

I don't use Git enough to exercise it well.  And there are dozens of
commands with hundreds of options.

As someone else has noted, if I run 'git commit -q --no-status', it
doesn't crash.

It seems that much of Git was coded under the assumption that any file
could always be held entirely in RAM.  Who made that mistake?  Are
people so out of touch with reality?

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com

 You need to have enough memory (virtual is fine if you have enough
 time) to do fsck.  Some part of index-pack could be refactored into
 a common helper function that could be called from fsck, but I think
 it would be a lot of work.

How much memory is enough?  And how is it that fsck is coded so that
the available RAM is a limit on which repositories can be checked?
Did someone forget that RAM may be considerably smaller than the
amount of data a program may have to deal with?

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git chokes on large file

2014-05-28 Thread Dale R. Worley
 From: David Lang da...@lang.hm

 Git was designed to track source code, there are warts that show up
 in the implementation when you use individual files 4GB

I'd expect that if you want to deal with files over 100k, you should
assume that it doesn't all fit in memory.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Git chokes on large file

2014-05-27 Thread Dale R. Worley
I've discovered a problem using Git.  It's not clear to me what the
correct behavior should be, but it seems to me that Git is failing
in an undesirable way.

The problem arises when trying to handle a very large file.  For
example:

$ git --version
git version 1.8.3.1
$ mkdir $$
$ cd $$
$ git init
Initialized empty Git repository in 
/common/not-replicated/worley/temp/5627/.git/
$ truncate --size=20G big_file
$ ls -l
total 0
-rw-rw-r--. 1 worley worley 21474836480 May 27 11:59 big_file
$ time git add big_file

real4m48.752s
user4m31.295s
sys 0m16.747s
$

At this point, either 'git fsck' or 'git commit' fails:

$ git fsck --full --strict
notice: HEAD points to an unborn branch (master)
Checking object directories: 100% (256/256), done.
fatal: Out of memory, malloc failed (tried to allocate 21474836481 bytes)

$ git commit -m Test.
[master (root-commit) 3df3655] Test.
fatal: Out of memory, malloc failed (tried to allocate 21474836481 bytes)

The central problem is that one can accidentally add a file that
leaves the repository in a broken state, where various normal
commands simply don't work.  The most worrying aspect is that git
fsck fails -- of all the commands, the one that verifies the validity
of the repository (and diagnoses errors) should be the most robust!

Even doing a 'git reset' does not put the repository in a state where
'git fsck' will complete:

$ git reset
$ git fsck --full --strict
notice: HEAD points to an unborn branch (master)
Checking object directories: 100% (256/256), done.
fatal: Out of memory, malloc failed (tried to allocate 21474836481 bytes)

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git fsck fails on malloc of 80 G

2013-12-18 Thread Dale R. Worley
 From: Jeff King p...@peff.net

 One of the problems I ran into recently is that
 corrupt data can cause it to make a large allocation

One thing I notice is that in unpack_compressed_entry() in
sha1_file.c, there is a mallocz of size bytes.  It appears that
size is the size of the object that is being unpacked.  If so, this
code cannot be correct, because it assumes that any file that is
stored in the repository can be put into a buffer allocated in RAM.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git fsck fails on malloc of 80 G

2013-12-16 Thread Dale R. Worley
I have a large repository (17 GiB of disk used), although no single
file in the repository is over 1 GiB.  (I have pack.packSizeLimit set
to 1g.)  I don't know how many files are in the repository, but it
shouldn't exceed several tens of commits each containing several tens
of thousands of files.

Due to Git crashing while performing an operation, I want to verify
that the repository is consistent.  However, when I run git fsck it
fails, apparently because it is trying to allocate 80 G of memory.  (I
can still do adds, commits, etc.)

# git fsck
Checking object directories: 100% (256/256), done.
fatal: Out of memory, malloc failed (tried to allocate 80530636801 bytes)
#

I don't know if this is due to an outright bug or not.  But it seems
to me that git fsck should not need to allocate any more memory than
the size (1 GiB) of a single pack file.  And given its purpose, git
fsck should be one of the *most* robust Git tools!

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Working patterns

2013-10-23 Thread Dale R. Worley
 The pattern I use is to have this:
 
 /repository/.git
   with core.worktree = /working
 /working/...
 
 then
 
 cd /repository
 git add /working/x/y
 git ...

The point I'm trying to make is that it appears that all of the Git
commands implemented as programs use the same code at the beginning to
determine the location of the Git repository and the root directory of
the work tree -- to determine GIT_DIR and GIT_WORK_TREE effectively.
And the code works with my work pattern, which seems intuitively
correct to me.

It seems to me that this code should implement the design, and the
design should match what the code does.

However, the parallel code in the Git commands that are scripts, which
seems to be in git-sh-setup, does not implement the same design as the
C code does.

It seems to me that the two sets of Git commands should be invokable
under the same circumstances, that there is a design specification as
to how Git can be invoked, and both implementations should match that.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-21 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com

  ... it's not clear why GIT_WORK_TREE exists, ...
 
 The configuration item came _way_ later than the environment, and we
 need to keep users and scripts from old world working, that is why.

OK, that explains a great deal.  IIRC, I first became aware that
detached worktrees are possible through the documentation of
core.worktree.  As Git's architecture has a tight binding between the
repository and the worktree, it made a great deal of sense to me that
the repository points to the detached worktree.  And the absence of
core.worktree, a non-detached worktree, is essentially equivalent to
having core.worktree specify the directory containing the .git
directory.

So the obvious way (to me) to invoke Git with a detached worktree is
to set GIT_DIR to point to the repository, and the repository points
to the root of the worktree.  If the command operates on the worktree,
Git can compare the cwd with the worktree root to determine the
relative path of files.

(And you can see that in this situation, Git doesn't have to search
upward to try to determine where the worktree root is.)

What you're saying is that there's an older mode of operation where
the repository does not point to the worktree.  Instead, the caller
has to set GIT_DIR to locate the repository and GIT_WORK_TREE to
locate the worktree.  This would be prone to error, because the user
is responsible for always pairing the repository with the correct
worktree; it doesn't enforce the architectural assumption that the
repository is paired with a work tree.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-18 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com

   Side note: without GIT_WORK_TREE environment (or
   core.worktree), there is no way to tell where the top level
   is, so you were limited to always be at the top level of
   your working tree if you used GIT_DIR to refer to a
   repository that is not embedded in your working tree.  There
   were some changes in this area, but I do not recall the
   details offhand.

That's not true.  The core.worktree config variable tells the top of
the worktree, so once you've located the repository, you know where
the worktree is.

Indeed, it's not clear why GIT_WORK_TREE exists, as that allows the
user to set GIT_WORK_TREE inconsistently with core.worktree.  (What
happens if you do that?)

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-18 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com

 Now, when you say the cwd contains the .git directory, do you mean
 
   cd /repositories
 git add ../working/trees/proj-wt1/file
 
 updates file in the /repositories/proj.git/index?  Or do you mean
 this?

The pattern I use is to have this:

/repository/.git
/working/...

then

cd /repository
git add /working/x/y/z

works as you'd expect it to.  git rm seems to work correctly under
these circumstances as well.

I seem to recall that using relative path values doesn't work under
some conditions involving symbolic links, but I can't recall the
details right now.

 you talk about starting Git command _outside_ the working tree
 (whether the working tree has its repository embedded in it is not
 very relevant).

The above pattern is what I mean, where the cwd is not within the work
tree.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-18 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com

 It was unclear to me which part of our documentation needs updating
 and how, and that was (and still is) what I was primarily interested
 in finding out.

It seems to me that what is missing is a description of the
circumstances under which Git can be run.  With Subversion (the only
other source control system I know in detail), the working tree that
is operated on is at and below the cwd, and the working tree always
points to the repository.  (A subdirectory of a working tree is also a
valid working tree.)

With Git, it seems that the basic usage is that Git searches upward
from the cwd to find the top of the work tree, which is distinguished
by having a .git subdirectory.  The rules when the worktree is
detached are more complicated, and don't seem to be written in any
single place.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-17 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com
 
 wor...@alum.mit.edu (Dale R. Worley) writes:
 
  In general, Git commands on a repository with a detached worktree can
  be executed by cd'ing into the directory containing the .git
  directory, ...
 
 Eh?  News to me; it might happened to have appeared to work by
 accident, but that is not by design.

I must admit I've never seen the design (and I personally doubt that
the design has ever been written down).  But at least the following
commands work correctly on a detached worktree if the current
directory contains the .git directory, because I am using them in a
production manner:

git add
git cat-file
git commit
git commit-tree
git config
git gc
git log
git ls-tree
git reset
git rev-list
git update-ref

In my situation, the worktree is not, in my mind, dependent on the
repository; the repository is intended to keep backups of the contents
of the directories that are worktree.  Indeed, one could establish
several detached repositories to back up different subsets of the same
worktree.  So it is conceptually natural to execute Git in the
repository directory.  And, after all, the current directory
identifies the repository and the repository contains a pointer to the
worktree.

 Not exporting GIT_DIR variable in sh-setup was done not by accident
 but as a very deliberate design choice, IIRC.

The intention of my change is that it appears that all of the failures
of my use pattern are when the command is implemented by a shell
script, and it appears that all shell scripts initially invoke
git-sh-setup.

The change specifically detects my use pattern and, for the remainder
of the script, changes the use pattern into a pattern closely related
to the one that Junio documents:

 - export GIT_DIR that points at the correct .git directory;

 - GIT_WORK_TREE is left unset

 - set the current directory to the top of the working tree

Perhaps the change should also set GIT_WORK_TREE.  I haven't noticed
setting GIT_WORK_TREE to be necessary for git filter-branch, but
perhaps that is coincidental.

It seems to me that this change would uniformly support the use
pattern I use without affecting Git's behavior in any other case.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[git-users] Problem using detached worktrees with commands implemented in scripts

2013-10-16 Thread Dale R. Worley
In Git, one can set up a repository with a detached worktree, where
the .git directory is not a subdirectory of the top directory of the
work tree.

In general, Git commands on a repository with a detached worktree can
be executed by cd'ing into the directory containing the .git
directory, and executing the Git command there.  E.g., git add and
git commit execute as one would expect.  (I think they can also be
executed by cd'ing to the worktree and setting GIT_DIR.)

However, this approach does not work with git filter-branch, which
objects with You need to run this command from the toplevel of the
working tree.

I suspect that it does not work with other Git commands that are
implemented with shell scripts.  The problem appears to be in the
git-sh-setup script, which is called by the Git shell scripts to set
up the environment and do preliminary tests.

It seems to me that this inconsistency between the script commands and
the binary commands can be fixed by updating git-sh-setup in this way:

--- git-sh-setup.Custom.orig2013-06-20 12:59:45.0 -0400
+++ git-sh-setup2013-10-07 22:34:06.719946134 -0400
@@ -297,14 +297,18 @@
 # if we require to be in a git repository.
 if test -z $NONGIT_OK
 then
-   GIT_DIR=$(git rev-parse --git-dir) || exit
+   export GIT_DIR=$(git rev-parse --git-dir) || exit
if [ -z $SUBDIRECTORY_OK ]
then
-   test -z $(git rev-parse --show-cdup) || {
-   exit=$?
-   echo 2 You need to run this command from the 
toplevel of the working tree.
-   exit $exit
-   }
+   cdup=$(git rev-parse --show-cdup)
+   if [ -n $cdup ]
+   then
+   # Current directory is not the toplevel.
+   # Set GIT_DIR to the absolute path of the repository.
+   GIT_DIR=$(cd $GIT_DIR  pwd)
+   # cd to the toplevel.
+   cd $cdup
+   fi
fi
test -n $GIT_DIR  GIT_DIR=$(cd $GIT_DIR  pwd) || {
echo 2 Unable to determine absolute path of git directory

What this change does is, when a command is invoked from a directory
containing a repository with a detached worktree, is to set GIT_DIR to
the directory of the repository, then cd to the top of the worktree.
After that, the script command should work as expected.

I am far from being an expert in Git internals, so I don't know
whether this is the correct approach to take to this problem or not.

Does anyone have any feedback on this?

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Using filter-branch without messing up the working copy

2013-10-03 Thread Dale R. Worley
I'm working on using git filter-branch to remove the history of a
large file from my repository so as to reduce the size of the
repository.  This pattern of use is effective for me:

1. $ git filter-branch --index-filter 'git rm --cached --ignore-unmatch 
core.4563' HEAD

2. edit .git/packed-refs to remove the line
   c6589bef2c776277407686a3a81c5a76bfe41ba2 refs/original/refs/heads/hobgoblin
(so that there is no reference to the old HEAD)

3. git gc --quiet --aggressive

The difficulty I am having is that I do not want to disturb the
working copy while doing this.  The working copy is my entire home
directory, because I am using the repository as a historical backup
system for my home directory.  Currently, Git will not execute
filter-branch if there are unstaged changes in the working copy,
despite the fact that git filter-branch --index-filter does not
touch the working copy itself.  (In this case, the file to be deleted,
core.4563, is not now in the working copy.)

A further difficulty is that the repository is remote, the .git
directory is not in my home directory, and core.worktree is
/home/worley.

The best set of steps I have found that accomplishes my goals is to
(1) disconnect the repository from the working copy by removing the
core.worktree value, (2) use git checkout -q -f master to create in
the repository's location an entire copy of my home directory, (3)
perform the above filtering steps, (4) reconnect the repository by
adding the core.worktree value, and (5) deleting the temporary working
copy files.

Surely there is a better way than this.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: the pager

2013-09-02 Thread Dale R. Worley
 From: Matthieu Moy matthieu@grenoble-inp.fr

  const char *git_pager(int stdout_is_tty)
  {
  const char *pager;
 
  if (!stdout_is_tty)
  return NULL;
 
  pager = getenv(GIT_PAGER);
  if (!pager) {
  if (!pager_program)
  git_config(git_default_config, NULL);
  pager = pager_program;
  }
  if (!pager)
  pager = getenv(PAGER);
  if (!pager)
  pager = DEFAULT_PAGER;
  else if (!*pager || !strcmp(pager, cat))
  pager = NULL;
 
 I guess the else could and should be dropped. If you do so (and
 possibly insert a blank line between the DEFAULT_PAGER case and the
 pager = NULL case), you get a nice pattern
 
 if (!pager)
   try_something();
 if (!pager)
   try_next_option();

That's true, but it would change the effect of using cat as a value:
cat as a value of DEFAULT_PAGER would cause git_pager() to return
NULL, whereas now it causes git_pager() to return cat.  (All other
places where cat can be a value are translated to NULL already.)

This is why I want to know what the *intended* behavior is, because we
might be changing Git's behavior, and I want to know that if we do
that, we're changing it to what it should be.  And I haven't seen
anyone venture an opinion on what the intended behavior is.

 I agree that a comment like this would help, though:
 
 --- a/cache.h
 +++ b/cache.h
 @@ -1266,7 +1266,7 @@ static inline ssize_t write_str_in_full(int fd, const 
 char *str)
  
  /* pager.c */
  extern void setup_pager(void);
 -extern const char *pager_program;
 +extern const char *pager_program; /* value read from git_config() */
  extern int pager_in_use(void);
  extern int pager_use_color;
  extern int term_columns(void);

First off, the wording is wrong, it should be value set by
git_config().

But that doesn't tell the reader what the significance of the value
is.  I suspect that a number of global variables need to be marked:

 /* The pager program name, or cat if there is no pager.
  * Can be overridden by the pager.cmd configuration value for a
  * single command, or suppressed by the --no-pager option.
  * Set by calling git_config().
  * NULL if hasn't been set yet by calling git_config(). */
 extern const char *pager_program;

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: the pager

2013-09-02 Thread Dale R. Worley
 I've noticed that Git by default puts long output through less as a
 pager.  I don't like that, but this is not the time to change
 established behavior.  But while tracking that down, I noticed that
 the paging behavior is controlled by at least 5 things:
 
 the -p/--paginate/--no-pager options
 the GIT_PAGER environment variable
 the PAGER environment variable
 the core.pager Git configuration variable
 the build-in default (which seems to usually be less)

One complication is the meaning of -p/--no-pager:

With the remaining sources, we assume that there is a priority
sequence, and that is used to determine what the pager is.

There is a somewhat independent question of when the pager is
activated.  What I know so far is that some commands use the pager by
default and some by default do not.  My expectation is that
--no-pager can be used to suppress the pager for *any* command.  Is it
also true that -p can force the pager for *any* command, or are there
commands which will not page even with -p?

I assume that if -p is specified but the which pager selection is
cat (or some other specification of no pager), then there is no
paging operation.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: the pager

2013-08-29 Thread Dale R. Worley
So I set out to verify in the code that the order of priority of pager
specification is

GIT_PAGER  core.pager  PAGER  default

I discovered that there is also a pager.command configuration
variable.

I was expecting the code to be simple, uniform (with regard to the 5
sources), and reasonably well documented.  The relevant parts of the
code that I have located so far include:

in environment.c:

const char *pager_program;

in config.c:

int git_config_with_options(config_fn_t fn, void *data,
const char *filename,
const char *blob_ref,
int respect_includes)
{
char *repo_config = NULL;
int ret;
struct config_include_data inc = CONFIG_INCLUDE_INIT;

if (respect_includes) {
inc.fn = fn;
inc.data = data;
fn = git_config_include;
data = inc;
}

/*
 * If we have a specific filename, use it. Otherwise, follow the
 * regular lookup sequence.
 */
if (filename)
return git_config_from_file(fn, filename, data);
else if (blob_ref)
return git_config_from_blob_ref(fn, blob_ref, data);

repo_config = git_pathdup(config);
ret = git_config_early(fn, data, repo_config);
if (repo_config)
free(repo_config);
return ret;
}

in pager.c:

/* returns 0 for no pager, 1 for use pager, and -1 for not specified 
*/
int check_pager_config(const char *cmd)
{
struct pager_config c;
c.cmd = cmd;
c.want = -1;
c.value = NULL;
git_config(pager_command_config, c);
if (c.value)
pager_program = c.value;
return c.want;
}

const char *git_pager(int stdout_is_tty)
{
const char *pager;

if (!stdout_is_tty)
return NULL;

pager = getenv(GIT_PAGER);
if (!pager) {
if (!pager_program)
git_config(git_default_config, NULL);
pager = pager_program;
}
if (!pager)
pager = getenv(PAGER);
if (!pager)
pager = DEFAULT_PAGER;
else if (!*pager || !strcmp(pager, cat))
pager = NULL;

return pager;
}

What's with the code?  It's not simple, it's not uniform (e.g.,
setting env. var. PAGER to cat will cause git_pager() to return
NULL, but setting preprocessor var. DEFAULT_PAGER to cat will cause
it to return cat), and it's barely got any comments at all (a global
variable has *no description whatsoever*).

I'd like to clean up the manual pages at least, but it would take me
hours to figure out what the code *does*.

I know I'm griping here, but I thought that part of the reward for
contributing to an open-source project was as a showcase of one's
work.  Commenting your code is what you learn first in programming.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2] git-diff: Clarify operation when not inside a repository.

2013-08-29 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com

 diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
 index b1630ba..33fbd8c 100644
 --- a/Documentation/git-diff.txt
 +++ b/Documentation/git-diff.txt
 @@ -28,11 +28,15 @@ two blob objects, or changes between two files on disk.
   words, the differences are what you _could_ tell Git to
   further add to the index but you still haven't.  You can
   stage these changes by using linkgit:git-add[1].
 -+
 -If exactly two paths are given and at least one points outside
 -the current repository, 'git diff' will compare the two files /
 -directories. This behavior can be forced by --no-index or by
 -executing 'git diff' outside of a working tree.
 +
 +'git diff' --no-index [--options] [--] [path...]::
 +
 + This form is to compare the given two paths on the
 + filesystem.  You can omit the `--no-index` option when
 + running the command in a working tree controlled by Git and
 + at least one of the paths points outside the working tree,
 + or when running the command outside a working tree
 + controlled by Git.

That does break out the --no-index case in a clearer way.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: the pager

2013-08-28 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com
 
  I've noticed that Git by default puts long output through less as a
  pager.  I don't like that, but this is not the time to change
  established behavior.  But while tracking that down, I noticed that
  the paging behavior is controlled by at least 5 things:
 
  the -p/--paginate/--no-pager options
  the GIT_PAGER environment variable
  the PAGER environment variable
  the core.pager Git configuration variable
  the build-in default (which seems to usually be less)
  ...
  What is the (intended) order of precedence of specifiers of paging
  behavior?  My guess is that it should be the order I've given above.
 
 I think that sounds about right (I didn't check the code, though).
 The most specific to the command line invocation (i.e. option)
 trumps the environment, which trumps the configured default, and the
 hard wired stuff is used as the fallback default.
 
 I am not sure about PAGER environment and core.pager, though.
 People want Git specific pager that applies only to Git process
 specified to core.pager, and still want to use their own generic
 PAGER to other programs, so my gut feeling is that it would make
 sense to consider core.pager a way to specify GIT_PAGER without
 contaminating the environment, and use both to override the generic
 PAGER (in other words, core.pager should take precedence over PAGER
 as far as Git is concerned).

I've just discovered this bit of documentation.  Within the git-var
manual page is this entry:

   GIT_PAGER
   Text viewer for use by git commands (e.g., less). The value is
   meant to be interpreted by the shell. The order of preference is
   the $GIT_PAGER environment variable, then core.pager configuration,
   then $PAGER, and then finally less.

This suggests that the ordering is GIT_PAGER  core.pager  PAGER 
default.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


the pager

2013-08-26 Thread Dale R. Worley
I've noticed that Git by default puts long output through less as a
pager.  I don't like that, but this is not the time to change
established behavior.  But while tracking that down, I noticed that
the paging behavior is controlled by at least 5 things:

the -p/--paginate/--no-pager options
the GIT_PAGER environment variable
the PAGER environment variable
the core.pager Git configuration variable
the build-in default (which seems to usually be less)

There is documentation in git.1 and git-config.1, and the two are not
coordinated to make it clear what happens in all cases.  And the
built-in default is not mentioned at all.

What is the (intended) order of precedence of specifiers of paging
behavior?  My guess is that it should be the order I've given above.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv2] git-diff: Clarify operation when not inside a repository.

2013-08-23 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com

 I suspect that it may be a good idea to split the section altogether
 to reduce confusion like what triggered this thread, e.g.
 
 'git diff' [--options] [--] [path...]::
 
 This form is to view the changes you made relative to
 the index (staging area for the next commit).  In other
 words, the differences are what you _could_ tell Git to
 further add to the index but you still haven't.  You can
 stage these changes by using linkgit:git-add[1].
 
 'git diff' --no-index [--options] [--] path path::
 
   This form is to compare the given two paths on the
   filesystem.  When run in a working tree controlled by
   Git, if at least one of the paths points outside the
   working tree, or when run outside a working tree
   controlled by Git, you can omit the `--no-index` option.
 
 For now, I'll queue your version as-is modulo style fixes, while
 waiting for others to help polishing the documentation better.

It'd difficult to figure out how to describe it well.  In my opinion,
the problem here is the DWIM nature of the command, which means that
there is a lot of interaction between the options that are specified,
the number of path arguments, and the circumstances.  My preference is
for do what I say, that the options restrict the command to operate
in exactly one way, which determines the way the paths are used (and
thus their number) and the context in which it can be used.  But
that's not how git-diff works.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2] git-diff: Clarify operation when not inside a repository.

2013-08-22 Thread Dale R. Worley
Clarify documentation for git-diff:  State that when not inside a
repository, --no-index is implied (and thus two arguments are
mandatory).

Clarify error message from diff-no-index to inform user that CWD is
not inside a repository and thus two arguments are mandatory.

Signed-off-by: Dale Worley wor...@ariadne.com
---


The error message has been updated from [PATCH].  git diff outside a
repository now produces:

Not a git repository
To compare two paths outside a working tree:
usage: git diff [--no-index] path path

This should inform the user of his error regardless of whether he
intended to perform a within-repository git diff or an
out-of-repository git diff.

This message is closer to the message that other Git commands produce:

fatal: Not a git repository (or any parent up to mount parent )
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

git diff --no-index produces the same message as before (since the
user is clearly invoking the non-repository behavior):

usage: git diff --no-index path path

Regarding the change to git-diff.txt, perhaps forced ... by executing
'git diff' outside of a working tree is not the best wording, but it
should be clear to the reader that (1) it is possible to execute 'git
diff' outside of a working tree, and (2) when doing so, the behavior
will be as if '--no-index' was specified.

I've also added some comments for the new code.


 Documentation/git-diff.txt |3 ++-
 diff-no-index.c|   12 +++-
 2 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 78d6d50..9f74989 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -31,7 +31,8 @@ two blob objects, or changes between two files on disk.
 +
 If exactly two paths are given and at least one points outside
 the current repository, 'git diff' will compare the two files /
-directories. This behavior can be forced by --no-index.
+directories. This behavior can be forced by --no-index or by 
+executing 'git diff' outside of a working tree.
 
 'git diff' [--options] --cached [commit] [--] [path...]::
 
diff --git a/diff-no-index.c b/diff-no-index.c
index e66fdf3..9734ec3 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -215,9 +215,19 @@ void diff_no_index(struct rev_info *revs,
 path_inside_repo(prefix, argv[i+1])))
return;
}
-   if (argc != i + 2)
+   if (argc != i + 2) {
+   if (!no_index) {
+   /* There was no --no-index and there were not two
+* paths.  It is possible that the user intended
+* to do an inside-repository operation. */
+   fprintf(stderr, Not a git repository\n);
+   fprintf(stderr,
+   To compare two paths outside a working 
tree:\n);
+   }
+   /* Give the usage message for non-repository usage and exit. */
usagef(git diff %s path path,
   no_index ? --no-index : [--no-index]);
+   }
 
diff_setup(revs-diffopt);
for (i = 1; i  argc - 2; ) {
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] git-diff: Clarify operation when not inside a repository.

2013-08-21 Thread Dale R. Worley
Clarify documentation for git-diff:  State that when not inside a
repository, --no-index is implied (and thus two arguments are
mandatory).

Clarify error message from diff-no-index to inform user that CWD is
not inside a repository and thus two arguments are mandatory.

Signed-off-by: Dale Worley wor...@ariadne.com
---


This clarification is to avoid a problem I ran into.  I executed 'git
diff' in the remote working tree of a repository, and not in the
repository directory itself.  Because of that, git-diff assumed
git-diff --no-index, and executed diff-no-index.  Since I hadn't
provided paths, diff-no-index produced an error message.
Unfortunately, the error message presupposes that the decision to
execute diff-no-index reflects the user's intention, thus leaving me
confused, as the error message is only:
usage: git diff [--no-index] path path
and does not cover the case I intended.  This patch changes the
message to notify the user that he is getting --no-index semantics
because he is outside of a repository:
Not within a git repository:
usage: git diff [--no-index] path path
The additional line is suppressed if the user specified --no-index.

The documentation is expanded to state that execution outside of a
repository forces --no-index behavior.  Previously, the manual implied
this but did not state it, making it easy for the user to overlook
that it's possible to run git-diff outside of a repository.

Dale


 Documentation/git-diff.txt |3 ++-
 diff-no-index.c|6 +-
 2 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-diff.txt b/Documentation/git-diff.txt
index 78d6d50..9f74989 100644
--- a/Documentation/git-diff.txt
+++ b/Documentation/git-diff.txt
@@ -31,7 +31,8 @@ two blob objects, or changes between two files on disk.
 +
 If exactly two paths are given and at least one points outside
 the current repository, 'git diff' will compare the two files /
-directories. This behavior can be forced by --no-index.
+directories. This behavior can be forced by --no-index or by 
+executing 'git diff' outside of a working tree.
 
 'git diff' [--options] --cached [commit] [--] [path...]::
 
diff --git a/diff-no-index.c b/diff-no-index.c
index e66fdf3..98c5f76 100644
--- a/diff-no-index.c
+++ b/diff-no-index.c
@@ -215,9 +215,13 @@ void diff_no_index(struct rev_info *revs,
 path_inside_repo(prefix, argv[i+1])))
return;
}
-   if (argc != i + 2)
+   if (argc != i + 2) {
+   if (!no_index) {
+   fprintf(stderr, Not within a git repository:\n);
+   }
usagef(git diff %s path path,
   no_index ? --no-index : [--no-index]);
+   }
 
diff_setup(revs-diffopt);
for (i = 1; i  argc - 2; ) {
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH revised] git_mkstemps: add test suite test

2013-08-06 Thread Dale R. Worley
Commit a2cb86 (git_mkstemps: correctly test return value of open(),
12 Jul 2013) fixes a bug regarding testing the return of an open()
call for success/failure.  Add a testsuite test for that fix.  The
test exercises a situation where that open() is known to return 0.

Signed-off-by: Dale Worley wor...@ariadne.com
---
This version of the patch cleans up a number of errors in my previous
version (which were ultimately due to my faulty updating of my master
branch).  The commit that added the open() test is now correctly
described.  Since the test was not present in the test suite at all,
the patch is described as adding the test rather than improving it.

a2cb86 is on branch tr/fd-gotcha-fixes, but that has been merged into
master now.

(Thanks for your patience with this.)

Dale

 t/t0070-fundamental.sh | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh
index 986b2a8..d427f3a 100755
--- a/t/t0070-fundamental.sh
+++ b/t/t0070-fundamental.sh
@@ -25,6 +25,13 @@ test_expect_success POSIXPERM,SANITY 'mktemp to unwritable 
directory prints file
grep cannotwrite/test err
 '
 
+test_expect_success 'git_mkstemps_mode does not fail if fd 0 is not open' '
+   git init 
+   echo Test. test-file 
+   git add test-file 
+   git commit -m Message. -
+'
+
 test_expect_success 'check for a bug in the regex routines' '
# if this test fails, re-build git with NO_REGEX=1
test-regex
-- 
1.8.4.rc1.24.gd407a5c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] SubmittingPatches: clarify some details of the patch format

2013-08-06 Thread Dale R. Worley
---
This is a first draft of a patch that clarifies a number of points
about how patches should be formatted that have tripped me up.  I have
re-filled a few of the paragraphs, which makes it hard to see from the
diff what I've changed.  This listing shows the changed words between
{ ... }:

 { This first line should be a separate paragraph, that is, it should be
followed by an empty line, which is then followed by the body of the
commit message } .

 { At the end of the commit message should be a Signed-off-by line giving
your name.  This can be added to the commit message automatically by
giving git commit the --signoff option.  This line has legal
implications, see Sign your work below } .

People on the Git mailing list need to be able to read and comment on
the changes you are submitting.  It is important for a developer to be
able to quote your changes, using standard e-mail tools, so that
they may comment on specific portions of your code.  For this reason,
all patches should be submitted inline { rather than as message
attachments } .  If your log message (including your name on the
Signed-off-by line) is not writable in ASCII, make sure that you send
off { the } message in the correct encoding.

git format-patch command follows the best current practice to
format the { patch for transmission as an e-mail message.  The files
that it outputs are sample e-mail messages in mh format.  The
initial lines are sample From, To, Date, and Subject headers, which
you will likely have to remove or revise, depending on how your MUA
operates } .

At the beginning of the patch should come your commit message ( { the
first line in the Subject header, the remainder in the body of the
message } ), ending with the Signed-off-by: lines, and a line that
consists of three dashes, followed by the diffstat information and the
patch itself.  If you are forwarding a patch from somebody else,
optionally, at the beginning of the e-mail message just before the
commit message starts, you can put a From:  line to name that
person.

You often want to add additional explanation about the patch,
other than the commit message itself.  Place such cover letter
material between the three-dash { line } and the diffstat. Git-notes
can also be inserted using the `--notes` option.

 From: Junio C Hamano gits...@pobox.com
 
 I am not sure if SubmittingPatches is a good place, though.
 The document is a guidance for people who contribute to _this_
 project.
 
 But the specialness of the first paragraph applies to any project
 that uses Git, so people other than those who contribute to this
 project should be aware of it.

All of that is true.  But what can we do to ensure that someone who
submits a patch does so with the right format?  The special treatment
of the first line is not a requirement.  See, e.g., the git-commit
manual page:

   Though not required, it’s a good idea to begin the commit message with
   a single short (less than 50 character) line summarizing the change,
   followed by a blank line and then a more thorough description. Tools
   that turn commits into email, for example, use the first line on the
   Subject: line and the rest of the commit in the body.

 Originally we literally used first line, but that made many things
 like shortlog output and patch Subject: useless when people write a
 block of text starting from the first line without a title.  Also
 after resurrecting such a text from e-mail, am couldn't tell if
 the first line on the Subject: is meant to be the first line of
 the same first paragraph (which is not what we encourage), or it is
 properly a single line title, and need a blank line before the first
 line of the body.  So quite a while ago, we changed the rule to take
 the first paragraph and use that in these places where we want to
 give a title of a patch.

And as you note, if organizing the first line this way was a
requirement, git-am would be able to unambiguously reconstruct the
commit message from an e-mail.  The only way to minimize errors in
this probject is to clearly specify what is required within this
project.

Dale


 Documentation/SubmittingPatches |   47 +-
 1 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches
index d0a4733..e974dd7 100644
--- a/Documentation/SubmittingPatches
+++ b/Documentation/SubmittingPatches
@@ -85,6 +85,10 @@ identifier for the general area of the code being modified, 
e.g.
 If in doubt which identifier to use, run git log --no-merges on the
 files you are modifying to see the current conventions.
 
+This first line should be a separate paragraph, that is, it should be
+followed by an empty line, which is then followed by the body of the
+commit message.
+
 The body should provide a meaningful 

Re: [PATCH revised] git_mkstemps: add test suite test

2013-08-06 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com
 
 Thanks. I thought I've already queued 
 
 Message-ID: 7vfvuokpr0@alter.siamese.dyndns.org
 aka 
 http://article.gmane.org/gmane.comp.version-control.git/231680
 
 which tests
 
 git commit --allow-empty -m message -

My mistake...  I've been so intent on revising my repository and
rewriting the patch that I overlooked that you'd done the revision
already.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH revised] git_mkstemps: add test suite test

2013-08-06 Thread Dale R. Worley
 git commit --allow-empty -m message -

Though as of [fb56570] Sync with maint to grab trivial doc fixes,
that test doesn't fail for me if I revert to

fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
if (fd  0)
return fd;

I haven't been watching the code changes carefully; has there been a
fix that is expected to cause that?

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Making a patch: git format-patch does not produce the documented format

2013-08-02 Thread Dale R. Worley
 From: John Keeping j...@keeping.me.uk
 
 git-format-patch(1) says:
 
 By default, the subject of a single patch is [PATCH]  followed
 by the concatenation of lines from the commit message up to the
 first blank line (see the DISCUSSION section of git-commit(1)).
 
 I think that accurately describes what you're seeing.  The referenced
 DISCUSSION section describes how to write a commit message that is
 formatted in a suitable way, with a short first subject line and then a
 blank line before the body of the message.

Thanks for the confirmation.  I've figured out what is going wrong:
Documentation/SubmittingPatches says:

The first line of the commit message should be a short description (50
characters is the soft limit, see DISCUSSION in git-commit(1)), and
should skip the full stop.

What it *doesn't* say is that the second line of the commit message
should be empty -- precisely so that git format-patch turns the first
line into the Subject: but does not merge the remainder of the commit
message (the body) into the Subject: line.  Now that I know to look
for this, I can see that the commit messages in the Git repository
show this pattern.

I'm preparing some clarifications of SubmittingPatches to explain
things that a new person (e.g., me) would not know.  To fix this
issue, I am inserting:

This first line should be a separate paragraph, that is, it should be
followed by an empty line, which is then followed by the body of the
commit message.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] git_mkstemps: improve test suite test

2013-08-02 Thread Dale R. Worley
Commit 52749 fixes a bug regarding testing the return of an open()
call for success/failure.  Improve the testsuite test for that fix by
removing the helper program 'test-close-fd-0' and replacing it with
the shell redirection '-'.  (The redirection is Posix, so it should
be portable.)

Signed-off-by: Dale Worley wor...@ariadne.com
---

 From: Junio C Hamano gits...@pobox.com
 Date: Fri, 19 Jul 2013 07:29:47 -0700
 
 The change itself looks good; care to write it up as a proper patch
 with a proposed log message?

My apologies for the delay; I've had to do some yak-shaving to learn
how to construct patches properly.  (I've written some clarifications
for Document/SubmittingPatches, which I will submit separately.)

Someone has gone ahead and made the code change, so all that remains
is to update the testsuite test by replacing the helper program
'test-close-fd-0' with the Posix shell redirection '-'.

Dale


 Makefile  |1 -
 test-close-fd-0.c |   14 --
 2 files changed, 0 insertions(+), 15 deletions(-)
 delete mode 100644 test-close-fd-0.c

diff --git a/Makefile b/Makefile
index 8ad40d4..3588ca1 100644
--- a/Makefile
+++ b/Makefile
@@ -557,7 +557,6 @@ X =
 PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))
 
 TEST_PROGRAMS_NEED_X += test-chmtime
-TEST_PROGRAMS_NEED_X += test-close-fd-0
 TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
diff --git a/test-close-fd-0.c b/test-close-fd-0.c
deleted file mode 100644
index 3745c34..000
--- a/test-close-fd-0.c
+++ /dev/null
@@ -1,14 +0,0 @@
-#include unistd.h
-
-/* Close file descriptor 0 (which is standard-input), then execute the
- * remainder of the command line as a command. */
-
-int main(int argc, char **argv)
-{
-   /* Close fd 0. */
-   close(0);
-   /* Execute the requested command. */
-   execvp(argv[1], argv[1]);
-   /* If execve() failed, return an error. */
-   return 1;
-}
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Difficulty adding a symbolic link, part 3

2013-08-01 Thread Dale R. Worley
 From: Duy Nguyen pclo...@gmail.com

  With the above change, the test suite runs with zero failures, so it
  doesn't affect any common Git usage.
 
 It means the test suite is incomplete. As you can see, the commit
 introducing this change does not come with a test case to catch people
 changing this.

Who should be blamed for omitting the test?

  Can someone give me advice on what this code *should* do?
 
 It does as the function name says: given cwd, a prefix (i.e. a
 relative path with no .. components) and a path relative to
 cwd+prefix, convert 'path' to something relative to cwd. In the
 simplest case, prepending the prefix to 'path' is enough. cwd is also
 get_git_work_tree().

But as you can see, the exact behavior that the function is intended
to exhibit regarding symlinks is not clear from the function name;
there should have been a real explanation in the comment above the
function.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Difficulty adding a symbolic link, part 3

2013-08-01 Thread Dale R. Worley
 From: Duy Nguyen pclo...@gmail.com

  Can someone give me advice on what this code *should* do?
 
 It does as the function name says: given cwd, a prefix (i.e. a
 relative path with no .. components) and a path relative to
 cwd+prefix, convert 'path' to something relative to cwd. In the
 simplest case, prepending the prefix to 'path' is enough. cwd is also
 get_git_work_tree().
 
 I agree with you that this code should not resolve anything in the
 components after 'cwd', after rebasing the path to 'cwd' (not just the
 final component). Not sure how to do it correctly though because we do
 need to resolve symlinks before cwd. Maybe a new variant of real_path
 that stops at 'cwd'?
 
 We may also have problems with resolve symlinks before cwd when 'path'
 is relative, as normalize_path_copy() does not resolve symlinks. We
 just found out emacs has this bug [1] but did not realize we also have
 one :-P.

Thanks for the detailed information.  It seems to me that the minimum
needed change is that the handling of relative and absolute paths
should be made consistent.

 [1] http://thread.gmane.org/gmane.comp.version-control.git/231268

That problem isn't so much a matter of not resolving symlinks but
rather the question of what .. means.  In the case shown in that
message, the current directory *is* {topdir}/z/b, but it was entered
with cd {topdir}/b -- and the shell records the latter as the value
of $PWD.  (Actually, the bash shell can handle symbolic-linked
directories *either* way, depending on whether it is given the -P
option.)

When Emacs is given the file name ../a/file, does the .. mean to
go up one directory *textually* in the pathspec based on $PWD

{topdir}/b/../a/file - {topdir}/a/file (which does not exist)

or according to the directory linkage on the disk (where the ..
entry in the current directory goes to the directory named {topdir}/z,
which does contain a file a/file.  It looks like Emacs uses the first
method.

The decision of which implementation to use depends on the semantics
you *want*.  As I noted, bash allows the user to choose *either*
interpretation, which shows that both semantics are considered valid
(by some people).

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Difficulty adding a symbolic link, part 3

2013-07-31 Thread Dale R. Worley
I've run into a problem (with Git 1.8.3.3) where I cannot add a
symbolic link (as such) to the repository *if* its path is given
absolutely; instead Git adds the file the symbolic link points to.
(If I give the path relatively, Git does what I expect, that is, adds
the symbolic link.)

I've written a test script that shows the problem and included it
below.

I would not expect *how* a path is presented to Git to change how Git
processes the path.  In the test case, I would expect /bin/awk and
../../bin/awk to produce the same effect when used as arguments to
git add.

What is going on in the code is this:  In git add, all paths are
normalized by the function prefix_path_gently() in abspath.c.  That
function removes symbolic links from the pathspec *only if* it is
absolute, as shown in the first few lines of the function:

 static char *prefix_path_gently(const char *prefix, int len, const char *path)
 {
 const char *orig = path;
 char *sanitized;
 if (is_absolute_path(orig)) {
-const char *temp = real_path(path);
+const char *temp = absolute_path(path);
 sanitized = xmalloc(len + strlen(temp) + 1);
 strcpy(sanitized, temp);
 } else {

real_path() is specified to remove symbolic links.  As shown, I've
replaced real_path() with absolute_path(), based on the comment at the
top of real_path():

/*
 * Return the real path (i.e., absolute path, with symlinks resolved
 * and extra slashes removed) equivalent to the specified path.  (If
 * you want an absolute path but don't mind links, use
 * absolute_path().)  The return value is a pointer to a static
 * buffer.
 *

If I replace real_path() with absolute_path() as shown, the problem I
am testing for disappears.

With the above change, the test suite runs with zero failures, so it
doesn't affect any common Git usage.

But I don't know enough about the internal architecture of Git to know
that my change is correct in all cases.  I'm almost certain that the
normalization process for pathspecs should *not* normalize a final
component that is a symbolic link.  But I would expect it would be
desirable to normalize non-final components that are symbolic links.
On the other hand, that might not matter.

Can someone give me advice on what this code *should* do?

I believe I can prepare a proper test for the test suite for this, so
once I know what the code change should be, I can prepare a proper
patch for it.

Dale
--  
Here's a test case for adding a symbolic link.  This test exploits the
fact that on my system, /bin/awk is a symbolic link to gawk.  As you
can see, the behavior of Git differs if the link's path is given to
git add as an absolute path or a relative path.

Here is the test script:
--
#! /bin/bash

# Illustrates a problem with applying git add to a symbolic link.

set -x

# To be run from a directory one step below the root directory.  E.g.,
# /tmp.
# On this system, /bin/awk is a symbolic link to gawk, which
# means /tmp/gawk.

# Show the Git version.
git version

# Make a test directory and cd to it.
DIR=temp.$$
mkdir $DIR
cd $DIR

# Create a Git repository.
git init
# Set the worktree to be /
git config core.worktree /
# Create an empty commit.
git commit --allow-empty -m Empty.

# Add the symbolic link using its absolute name.
ABSOLUTE=/bin/awk
ls -l $ABSOLUTE
git add $ABSOLUTE
# Notice that the target of the link is added, not the link itself.
git status -uno

# Reset the index.
git reset

# Add the symbolic link using its relative name.
# Remember that we are two directory levels below the root directory now.
RELATIVE=../..$ABSOLUTE
ls -l $RELATIVE
git add $RELATIVE
# Notice that now the link itself is added.
git status -uno
--
Here is sample output of the script:
--
+ git version
git version 1.8.3.3.756.g07a2553.dirty
+ DIR=temp.22366
+ mkdir temp.22366
+ cd temp.22366
+ git init
Initialized empty Git repository in /git-add-link/temp.22366/.git/
+ git config core.worktree /
+ git commit --allow-empty -m Empty.
[master (root-commit) fb232e5] Empty.
+ ABSOLUTE=/bin/awk
+ ls -l /bin/awk
lrwxrwxrwx. 1 root root 4 Nov  2  2012 /bin/awk - gawk
+ git add /bin/awk
+ git status -uno
# On branch master
# Changes to be committed:
#   (use git reset HEAD file... to unstage)
#
#   new file:   ../../bin/gawk
#
# Untracked files not listed (use -u option to show untracked files)
+ git reset
+ RELATIVE=../../bin/awk
+ ls -l ../../bin/awk
lrwxrwxrwx. 1 root root 4 Nov  2  2012 ../../bin/awk - gawk
+ git add ../../bin/awk
+ git status -uno
# On branch master
# Changes to be committed:
#   (use git reset HEAD file... to unstage)
#
#   new file:   ../../bin/awk
#
# Untracked files not 

Making a patch: git format-patch does not produce the documented format

2013-07-31 Thread Dale R. Worley
I'm working on writing a patch, but I'm running into a problem.  The
patch itself is from this commit:

$ git log -1
commit 07a25537909dd277426818a39d9bc4235e755383
Author: Dale Worley wor...@ariadne.com
Date:   Thu Jul 18 18:43:12 2013 -0400

open() returns -1 on failure, and indeed 0 is a possible success value
if the user closed stdin in our process.  Fix the test.
$ 

But the output of git format-patch is:

From 07a25537909dd277426818a39d9bc4235e755383 Mon Sep 17 00:00:00 2001
From: Dale Worley wor...@ariadne.com
Date: Thu, 18 Jul 2013 18:43:12 -0400
Subject: [PATCH] open() returns -1 on failure, and indeed 0 is a possible
 success value if the user closed stdin in our process.  Fix
 the test.

---
 t/t0070-fundamental.sh |7 +++
 wrapper.c  |2 +-
 2 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh
index 986b2a8..d427f3a 100755
--- a/t/t0070-fundamental.sh
+++ b/t/t0070-fundamental.sh
@@ -25,6 +25,13 @@ test_expect_success POSIXPERM,SANITY 'mktemp to 
unwritable directory prints file
grep cannotwrite/test err
[...]

Notice that the whole commit message has been formatted as if it is
part of the Subject line, and the line breaks in the commit message
have been refilled.

The file Documentation/SubmittingPatches says that git format-patch
produces patches in the best format, but the manual page shows an
example more like this:

From 8f72bad1baf19a53459661343e21d6491c3908d3 Mon Sep 17 00:00:00 2001
From: Tony Luck tony.l...@intel.com
Date: Tue, 13 Jul 2010 11:42:54 -0700
Subject: [PATCH] Put ia64 config files on the
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

arch/arm config files were slimmed down using a python script
(See commit c2330e286f68f1c408b4aa6515ba49d57f05beae comment)
[...]

That is, the first line of the commit message is in the Subject and
the remaining lines are in the message body.  As far as I can tell,
that's what SubmittingPatches prescribes.  And that is what I see in
the Git mailing list on vger.

(This is with git 1.8.3.3.)

Exactly how should the commit message be inserted into the patch
e-mail?  What needs to be updated so the code is consistent with the
documentation?

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] git_mkstemps: correctly test return value of open()

2013-07-18 Thread Dale R. Worley
I've been looking into writing a proper test for this patch.  My first
attempt tests the symptom that was seen initially, that git commit
fails if fd 0 is closed.

One problem is how to arrange for fd 0 to be closed.  I could use the
bash redirection -, but I think you want to be more portable than
that.  This version uses execvp() inside a small C program, and
execvp() is a Posix function.

I've tested that this test does what it should:  If you remove the
fix, fd = 0, the test fails.  If you then remove test-close-fd-0
from before git init in the test, the test is nullified and succeeds
again.

Here is the diff.  What do people think of it?

diff --git a/Makefile b/Makefile
index 0600eb4..6b410f5 100644
--- a/Makefile
+++ b/Makefile
@@ -557,6 +557,7 @@ X =
 PROGRAMS += $(patsubst %.o,git-%$X,$(PROGRAM_OBJS))
 
 TEST_PROGRAMS_NEED_X += test-chmtime
+TEST_PROGRAMS_NEED_X += test-close-fd-0
 TEST_PROGRAMS_NEED_X += test-ctype
 TEST_PROGRAMS_NEED_X += test-date
 TEST_PROGRAMS_NEED_X += test-delta
diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh
index 986b2a8..6a31103 100755
--- a/t/t0070-fundamental.sh
+++ b/t/t0070-fundamental.sh
@@ -25,6 +25,13 @@ test_expect_success POSIXPERM,SANITY 'mktemp to unwritable 
directory prints file
grep cannotwrite/test err
 '
 
+test_expect_success 'git_mkstemps_mode does not fail if fd 0 is not open' '
+   git init 
+   echo Test. test-file 
+   git add test-file 
+   test-close-fd-0 git commit -m Message.
+'
+
 test_expect_success 'check for a bug in the regex routines' '
# if this test fails, re-build git with NO_REGEX=1
test-regex
diff --git a/test-close-fd-0.c b/test-close-fd-0.c
new file mode 100644
index 000..3745c34
--- /dev/null
+++ b/test-close-fd-0.c
@@ -0,0 +1,14 @@
+#include unistd.h
+
+/* Close file descriptor 0 (which is standard-input), then execute the
+ * remainder of the command line as a command. */
+
+int main(int argc, char **argv)
+{
+   /* Close fd 0. */
+   close(0);
+   /* Execute the requested command. */
+   execvp(argv[1], argv[1]);
+   /* If execve() failed, return an error. */
+   return 1;
+}
diff --git a/wrapper.c b/wrapper.c
index dd7ecbb..6a015de 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -322,7 +322,7 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int 
mode)
template[5] = letters[v % num_letters]; v /= num_letters;
 
fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
-   if (fd  0)
+   if (fd = 0)
return fd;
/*
 * Fatal error (EPERM, ENOSPC etc).

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] git_mkstemps: correctly test return value of open()

2013-07-18 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com
 
 That's just a plain-vanilla part of POSIX shell behaviour, no?
 
 http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_07_05

Close standard input is so weird I never thought it was Posix.  In
that case, we can eliminate the C helper program:

diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh
index 986b2a8..d427f3a 100755
--- a/t/t0070-fundamental.sh
+++ b/t/t0070-fundamental.sh
@@ -25,6 +25,13 @@ test_expect_success POSIXPERM,SANITY 'mktemp to unwritable 
directory prints file
grep cannotwrite/test err
 '
 
+test_expect_success 'git_mkstemps_mode does not fail if fd 0 is not open' '
+   git init 
+   echo Test. test-file 
+   git add test-file 
+   git commit -m Message. -
+'
+
 test_expect_success 'check for a bug in the regex routines' '
# if this test fails, re-build git with NO_REGEX=1
test-regex
diff --git a/wrapper.c b/wrapper.c
index dd7ecbb..6a015de 100644
--- a/wrapper.c
+++ b/wrapper.c
@@ -322,7 +322,7 @@ int git_mkstemps_mode(char *pattern, int suffix_len, int 
mode)
template[5] = letters[v % num_letters]; v /= num_letters;
 
fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
-   if (fd  0)
+   if (fd = 0)
return fd;
/*
 * Fatal error (EPERM, ENOSPC etc).

Is this a sensible place to put this test?

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 1/2] git_mkstemps: correctly test return value of open()

2013-07-18 Thread Dale R. Worley
 From: Junio C Hamano gits...@pobox.com

  +test_expect_success 'git_mkstemps_mode does not fail if fd 0 is not open' '
  +   git init 
  +   echo Test. test-file 
  +   git add test-file 
  +   git commit -m Message. -
  +'
  +
 
 Yup.  I wonder how it would fail without the fix, though ;-)

Eh, what?  You could run it and see.  The test script system will just
say not ok for this test.  If you execute those commands from a
shell, you see:

 $ git init
Initialized empty Git repository in /common/not-replicated/worley/temp/1/.git/
 $ echo Test. test-file
 $ git add test-file
 $ git commit -m Message. -
error: unable to create temporary sha1 filename : No such file or directory

error: Error building trees
 $ 

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Bug: Failure if stdin is closed.

2013-07-11 Thread Dale R. Worley
(The original problem and the discussion that ensued is on the
git-users mailing list:
https://groups.google.com/forum/#!topic/git-users/lNQ7Cn35EqA)

git commit (and probably other operations) fail if standard input
(fd 0) is closed when git starts.  A simple test case follows.  (The
execution is version 1.7.7.6, but the code listed below is from a very
recent commit.)


$ git --version
git version 1.7.7.6
$ git status
# On branch master
nothing to commit (working directory clean)
$ echo This is a test 
$ git status
# On branch master
# Untracked files:
#   (use git add file... to include in what will be committed)
#
#   
nothing added to commit but untracked files present (use git add to track)
$ git add 
$ # The notation 0- means close standard input (fd 0) in the process that
$ # executes this command.  It may be specific to the bash shell.
$ git commit -m  0-
error: unable to create temporary sha1 filename : No such file or directory

error: Error building trees
$ git status
# On branch master
# Changes to be committed:
#   (use git reset HEAD file... to unstage)
#
#   new file:   
#
$ git commit -m 
[master 54c146c] 
 1 files changed, 1 insertions(+), 0 deletions(-)
 create mode 100644 
$ git status
# On branch master
# Your branch is ahead of 'origin/master' by 1 commit.
#
nothing to commit (working directory clean)
$ 


The fundamental error is that git_mkstemps_mode() in wrapper.c assumes
that if open() is successful, its return value must be 0.  That is
not true, because if fd 0 is closed, then open() can successfully
return 0.  I have not tested it, but it looks like the fix is:

 int git_mkstemps_mode(char *pattern, int suffix_len, int mode)
 {
 static const char letters[] =
 abcdefghijklmnopqrstuvwxyz
 ABCDEFGHIJKLMNOPQRSTUVWXYZ
 0123456789;
 static const int num_letters = 62;
 uint64_t value;
 struct timeval tv;
 char *template;
 size_t len;
 int fd, count;

 len = strlen(pattern);

 if (len  6 + suffix_len) {
 errno = EINVAL;
 return -1;
 }

 if (strncmp(pattern[len - 6 - suffix_len], XX, 6)) {
 errno = EINVAL;
 return -1;
 }

 /*
  * Replace pattern's XX characters with randomness.
  * Try TMP_MAX different filenames.
  */
 gettimeofday(tv, NULL);
 value = ((size_t)(tv.tv_usec  16)) ^ tv.tv_sec ^ getpid();
 template = pattern[len - 6 - suffix_len];
 for (count = 0; count  TMP_MAX; ++count) {
 uint64_t v = value;
 /* Fill in the random bits. */
 template[0] = letters[v % num_letters]; v /= num_letters;
 template[1] = letters[v % num_letters]; v /= num_letters;
 template[2] = letters[v % num_letters]; v /= num_letters;
 template[3] = letters[v % num_letters]; v /= num_letters;
 template[4] = letters[v % num_letters]; v /= num_letters;
 template[5] = letters[v % num_letters]; v /= num_letters;

 fd = open(pattern, O_CREAT | O_EXCL | O_RDWR, mode);
-if (fd  0)
+if (fd = 0)
 return fd;
 /*
  * Fatal error (EPERM, ENOSPC etc).
  * It doesn't make sense to loop.
  */
 if (errno != EEXIST)
 break;
 /*
  * This is a random value.  It is only necessary that
  * the next TMP_MAX values generated by adding  to
  * VALUE are different with (module 2^32).
  */
 value += ;
 }
 /* We return the null string if we can't find a unique file name.  */
 pattern[0] = '\0';
 return -1;
 }


However, when looking at the code, I noticed that few of the functions
have comments describing what they do, and none describe their input
and output values.  In particular, there are no comments specifying
what the error return values are.  This is appalling for a supposedly
professional-quality project!

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Difficulty adding a symbolic link, part 2

2013-06-24 Thread Dale R. Worley
Here's a slightly simpler test case for adding a symbolic link.  This
test exploits the fact that on my system, /bin/awk is a symbolic link
to gawk.  As you can see, the behavior of Git differs if the link's
path is given to git add as an absolute path or a relative path.

Here is the test script:
--
#! /bin/bash

# Illustrates a problem with applying git add to a symbolic link.

set -x

# To be run from a directory one step below the root directory.  E.g.,
# /git-add-link.
# Exploits the fact that /bin/awk is a symbolic link to gawk.

# Show the Git version.
git version

# Make a test directory and cd to it.
DIR=temp.$$
mkdir $DIR
cd $DIR

# Create a Git repository.
git init
# Set the worktree to be /
git config core.worktree /
# Create an empty commit.
git commit --allow-empty -m Empty.

# Add the symbolic link using its absolute name.
ABSOLUTE=/bin/awk
ls -l $ABSOLUTE
git add $ABSOLUTE
# Notice that the target of the link is added, not the link itself.
git status -uno

# Reset the index.
git reset

# Add the symbolic link using its relative name.
# Remember that we are two directory levels below the root directory now.
RELATIVE=../..$ABSOLUTE
ls -l $RELATIVE
git add $RELATIVE
# Notice that now the link itself is added.
git status -uno
--
Here is the output of the script:
--
+ git version
git version 1.7.7.6
+ DIR=temp.22366
+ mkdir temp.22366
+ cd temp.22366
+ git init
Initialized empty Git repository in /git-add-link/temp.22366/.git/
+ git config core.worktree /
+ git commit --allow-empty -m Empty.
[master (root-commit) fb232e5] Empty.
+ ABSOLUTE=/bin/awk
+ ls -l /bin/awk
lrwxrwxrwx. 1 root root 4 Nov  2  2012 /bin/awk - gawk
+ git add /bin/awk
+ git status -uno
# On branch master
# Changes to be committed:
#   (use git reset HEAD file... to unstage)
#
#   new file:   ../../bin/gawk
#
# Untracked files not listed (use -u option to show untracked files)
+ git reset
+ RELATIVE=../../bin/awk
+ ls -l ../../bin/awk
lrwxrwxrwx. 1 root root 4 Nov  2  2012 ../../bin/awk - gawk
+ git add ../../bin/awk
+ git status -uno
# On branch master
# Changes to be committed:
#   (use git reset HEAD file... to unstage)
#
#   new file:   ../../bin/awk
#
# Untracked files not listed (use -u option to show untracked files)
--
I can't see any principle of operation of Git that would cause git
add /bin/awk and git add ../../bin/awk to be handled differently.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Difficulty adding a symbolic link

2013-06-14 Thread Dale R. Worley
I'm having a problem with git add in version 1.7.7.6.

The situation is that I have a repository that is contained in a
second-level directory, a sub-sub-directory of /.  The core.worktree
of the repository is /, so the working directory is the entire file
tree.  I want this repository to track selected files, which I add
with git add.

The difficulty is that git add seems to not add specific files,
depending on how they are specified as arguments to git add.  In
particular, /dev/dvd (which is a symbolic link) can be added with git
add ../../dev/dvd but not with git add /dev/dvd.  On the other
hand, /etc/hosts (an ordinary file) can be added with either git add
/etc/hosts or git add ../../etc/hosts.

To demonstrate the problem, I've written a script.  A typical
execution of the script is as follows.  The lines at the left margin
start with $, and are the lines of the script.  The lines indented 4
spaces start with +, and are the simple commands as they are
executed by the shell, showing the values substituted for the shell
variables.  The lines indented 8 spaces are the output of the various
commands.

Someone suggested that the problem may be triggered by the fact that
/dev is in a different filesystem than / and /etc.  I added a third
section to the script by creating a symbolic link from /etc/hosts.link
to /etc/hosts, which is thus in the same filesystem as / and the
repository.  Git handles it as expected.

Any help with this would be appreciated.

Dale

$ set -x
$ 
$ # Show git version.
$ git version
+ git version
git version 1.7.7.6
$ 
$ # To be run in a directory that is contained in /.
$ pwd
+ pwd
/git-add-issue
$ 
$ # Make a test directory.
$ DIR=temp.$$
+ DIR=temp.10714
$ mkdir $DIR
+ mkdir temp.10714
$ cd $DIR
+ cd temp.10714
++ pwd
$ DIR_ABSOLUTE=$( pwd )
+ DIR_ABSOLUTE=/git-add-issue/temp.10714
$ 
$ # Create a new repository.
$ git init
+ git init
Initialized empty Git repository in /git-add-issue/temp.10714/.git/
$ # Set the worktree to be /
$ git config core.worktree /
+ git config core.worktree /
$ # Create empty commit.
$ git commit --allow-empty -m Empty.
+ git commit --allow-empty -m Empty.
[master (root-commit) 62d86cb] Empty.
$ 
$ # Show empty status
$ git status -uno
+ git status -uno
# On branch master
nothing to commit (use -u to show untracked files)
$ 
$ # First test:  /dev/dvd, which is a symbolic link.
$ 
$ # Try to add /dev/dvd
$ ABSOLUTE_NAME=/dev/dvd
+ ABSOLUTE_NAME=/dev/dvd
$ ll $ABSOLUTE_NAME
+ ls -al /dev/dvd
lrwxrwxrwx. 1 root root 9 Jun 12 22:23 /dev/dvd - /dev/dvd1
$ git add $ABSOLUTE_NAME
+ git add /dev/dvd
$ git status -uno
+ git status -uno
# On branch master
nothing to commit (use -u to show untracked files)
$ git reset
+ git reset
$ 
$ # Try with alternative name ../../dev/dvd
$ RELATIVE_NAME=../../dev/dvd
+ RELATIVE_NAME=../../dev/dvd
$ ll $RELATIVE_NAME
+ ls -al ../../dev/dvd
lrwxrwxrwx. 1 root root 9 Jun 12 22:23 ../../dev/dvd - /dev/dvd1
$ git add $RELATIVE_NAME
+ git add ../../dev/dvd
$ git status -uno
+ git status -uno
# On branch master
# Changes to be committed:
#   (use git reset HEAD file... to unstage)
#
#   new file:   ../../dev/dvd
#
# Untracked files not listed (use -u option to show untracked files)
$ git reset
+ git reset
$ 
$ # Second test:  /etc/hosts, which is an ordinary file.
$ 
$ # Try to add /etc/hosts
$ ABSOLUTE_NAME=/etc/hosts
+ ABSOLUTE_NAME=/etc/hosts
$ ll $ABSOLUTE_NAME
+ ls -al /etc/hosts
-rw-r--r--. 1 root root 222 Nov  4  2012 /etc/hosts
$ git add $ABSOLUTE_NAME
+ git add /etc/hosts
$ git status -uno
+ git status -uno
# On branch master
# Changes to be committed:
#   (use git reset HEAD file... to unstage)
#
#   new file:   ../../etc/hosts
#
# Untracked files not listed (use -u option to show untracked files)
$ git reset
+ git reset
$ 
$ # Try with alternative name ../../etc/hosts
$ RELATIVE_NAME=../../etc/hosts
+ RELATIVE_NAME=../../etc/hosts
$ ll $RELATIVE_NAME
+ ls -al ../../etc/hosts
-rw-r--r--. 1 root root 222 Nov  4  2012 ../../etc/hosts
$ git add $RELATIVE_NAME
+ git add ../../etc/hosts
$ git status -uno
+ git status -uno
# On branch master
# Changes to be committed:
#   (use git reset HEAD file... to unstage)
#
#   new file:   ../../etc/hosts
#
# Untracked files not listed (use -u option to show untracked files)
$ git reset
+ git reset
$ 
$ # Third test:  /etc/hosts.link, which is a link to /etc/hosts
$ 
$ # Try to add /etc/hosts.link
$ ABSOLUTE_NAME=/etc/hosts.link
+ ABSOLUTE_NAME=/etc/hosts.link
$ ll $ABSOLUTE_NAME
+ ls -al /etc/hosts.link
lrwxrwxrwx. 1 root root 5 Jun 14 12:04 /etc/hosts.link - hosts
$ git add 

[PATCHv2] git-submodule.txt: Clarify 'init' and 'add' subcommands.

2013-05-15 Thread Dale R. Worley
Describe how 'add' sets the submodule's logical name, which is used in
the configuration entry names.

Clarify that 'init' only sets up the configuration entries for
submodules that have already been added elsewhere.  Describe that
path arguments limit the submodules that are configured.

Signed-off-by: Dale Worley wor...@ariadne.com
---
This patch seems to have all the features that we have discussed:

- Describes how 'add' selects the submodule logical name, including
  the effect of --name.  (My first patch was on a version of Git that
  did not support --name, so I didn't know of it.)

- Corrects description of 'init' to clarify that its behavior is
  driven by the gitlinks recorded in the index, rather than implying
  it is driven by the contents of .gitmodules.

- Describes 'init' behavior to be driven by the index, rather than my
  previous incorrect use of file tree.  (Of course, gitlinks aren't
  visible in the file tree.)

- Updated text for 'init' is shorter and less technical than my
  previous patch.

- Since (which were added and committed elsewhere) is stated in the
  first sentence, I've removed the later sentence explaining that
  submodules must be added before they can be inited.

- Explains the effect of path arguments to 'init' subcommand.

 Documentation/git-submodule.txt |8 ++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt
index c99d795..83235c0 100644
--- a/Documentation/git-submodule.txt
+++ b/Documentation/git-submodule.txt
@@ -76,6 +76,8 @@ argument path is the relative location for the cloned 
submodule
 to exist in the superproject. If path is not given, the
 humanish part of the source repository is used (repo for
 /path/to/repo.git and foo for host.xz:foo/.git).
+The path is also used as the submodule's logical name in its
+configuration entries unless `--name` is used to specify a logical name.
 +
 repository is the URL of the new submodule's origin repository.
 This may be either an absolute URL, or (if it begins with ./
@@ -123,8 +125,10 @@ linkgit:git-status[1] and linkgit:git-diff[1] will provide 
that information
 too (and can also report changes to a submodule's work tree).
 
 init::
-   Initialize the submodules, i.e. register each submodule name
-   and url found in .gitmodules into .git/config.
+   Initialize the submodules recorded in the index (which were
+   added and committed elsewhere) by copying submodule
+   names and urls from .gitmodules to .git/config.
+   Optional path arguments limit which submodules will be initialized.
It will also copy the value of `submodule.$name.update` into
.git/config.
The key used in .git/config is `submodule.$name.url`.
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCHv2] CodingGuidelines: make it clear which files in Documentation/ are the sources

2013-05-08 Thread Dale R. Worley
From e87227498ef3d50dc20584c24c53071cce63c555 Mon Sep 17 00:00:00 2001
From: Dale Worley wor...@ariadne.com
Date: Tue, 7 May 2013 13:39:46 -0400
Subject: [PATCH] CodingGuidelines:  make it clear which files in
 Documentation/ are the sources

Signed-off-by: Dale R. Worley wor...@ariadne.com
---
While learning about making a documentation patch, I noticed that
Documentation/CodingGuideles isn't as clear as it could be regarding
how to edit the documentation.  In particular, it says Most (if not
all) of the documentation pages are written in AsciiDoc - and
processed into HTML output and manpages. without really specifying
the details for those of us who aren't familiar with AsciiDoc.  So I
added a sentence stating explicitly which files are the sources and
which are derived.

It's also a test for submitting a patch.  I've read SubmittingPatches
again, more carefully, and have corrected some problem with my
previous message.

 Documentation/CodingGuidelines |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 7e4d571..b8eef7c 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -238,7 +238,9 @@ For Python scripts:
 Writing Documentation:
 
  Most (if not all) of the documentation pages are written in AsciiDoc
- and processed into HTML output and manpages.
+ and processed into HTML output and manpages.  This means that the *.txt
+ files in this directory are usually the sources from which the
+ corresponding *.html, *.1, and *.xml files are generated.
 
  Every user-visible change should be reflected in the documentation.
  The same general rule as for code applies -- imitate the existing
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] CodingGuidelines: make it clear which files in Documentation are the sources

2013-05-07 Thread Dale R. Worley
While learning about making a documentation patch, I noticed that
Documentation/CodingGuideles isn't as clear as it could be regarding
how to edit the documentation.  In particular, it says Most (if not
all) of the documentation pages are written in AsciiDoc - and
processed into HTML output and manpages. without really specifying
the details for those of us who aren't familiar with AsciiDoc.  So I
added a sentence stating explicitly which files are the sources and
which are derived.

It's also a test for submitting a patch.

Dale



From e87227498ef3d50dc20584c24c53071cce63c555 Mon Sep 17 00:00:00 2001
From: Dale Worley wor...@ariadne.com
Date: Tue, 7 May 2013 13:39:46 -0400
Subject: [PATCH] CodingGuidelines:  make it clear which files in
 Documentation are the sources

---
 Documentation/CodingGuidelines |4 +++-
 1 files changed, 3 insertions(+), 1 deletions(-)

diff --git a/Documentation/CodingGuidelines b/Documentation/CodingGuidelines
index 7e4d571..b8eef7c 100644
--- a/Documentation/CodingGuidelines
+++ b/Documentation/CodingGuidelines
@@ -238,7 +238,9 @@ For Python scripts:
 Writing Documentation:
 
  Most (if not all) of the documentation pages are written in AsciiDoc
- and processed into HTML output and manpages.
+ and processed into HTML output and manpages.  This means that the *.txt
+ files in this directory are usually the sources from which the
+ corresponding *.html, *.1, and *.xml files are generated.
 
  Every user-visible change should be reflected in the documentation.
  The same general rule as for code applies -- imitate the existing
-- 
1.7.7.6

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Suggestion for improving the manual page for git submodule

2013-05-02 Thread Dale R. Worley
Several people have made similar mistakes in beliving that git
submodule init can be used for adding submodules to a working
directory, whereas git submodule add is the command that should be
used.  That *is* documented at the top of the manual page for git
submodule, but my error was enhanced by a subtle error in the
documentation of init.

The text as it is written suggests that init's behavior is driven by
the contents of .submodules.  But in fact, its behavior is driven by
the existing gitlinks in the file tree, possibly limited by the path
arguments.  (Which is *why* git submodule init can't *add*
submodules; it only processes *existing* submodules.)

I would like to suggest that the manual page be updated to remove the
error in the description of the init subcommand, along with another
addition to tell the submodule logical name that is used by the add
subcommand:

--- man12013-04-26 12:02:16.752702146 -0400
+++ man32013-05-02 21:06:00.020353100 -0400
@@ -61,6 +61,8 @@
to exist in the superproject. If path is not given, the
humanish part of the source repository is used (repo for
/path/to/repo.git and foo for host.xz:foo/.git).
+   The path is used as the submodule's logical name in its
+   configuration entries.
 
repository is the URL of the new submodule’s origin repository.
This may be either an absolute URL, or (if it begins with ./ or
@@ -109,7 +111,9 @@
too (and can also report changes to a submodule’s work tree).
 
init
-   Initialize the submodules, i.e. register each submodule name and
+   Initialize the submodules, i.e. register each submodule for which
+   there is a gitlink recorded (or the specific gitlinks specified by
+   path ...) by copying the name and
url found in .gitmodules into .git/config. The key used in
.git/config is submodule.$name.url. This command does not alter
existing information in .git/config. You can then customize the
@@ -118,6 +122,10 @@
submodule update --init without the explicit init step if you do
not intend to customize any submodule locations.
 
+   (Because init only operates on existing gitlinks, it cannot
+   be used to create submodules, regardless of the contents of
+   .gitmodules.  Use the add subcommand to create submodules.)
+
update
Update the registered submodules, i.e. clone missing submodules and
checkout the commit specified in the index of the containing

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What I want rebase to do

2013-03-07 Thread Dale R. Worley
 From: Thomas Rast tr...@student.ethz.ch
 
 wor...@alum.mit.edu (Dale R. Worley) writes:
 [...snip...]
 
 Isn't that just a very long-winded way of restating what Junio said
 earlier:
 
   It was suggested to make it apply the first-parent diff and record
   the result, I think.  If that were an acceptable approach (I didn't
   think about it through myself, though), that would automatically
   cover the evil-merge case as well.

Well, I believe what I said was a fleshed-out way of saying what I
*think* Junio said, but...

 You can fake that with something like
 
 git rev-list --first-parent --reverse RANGE_TO_REBASE |
 while read rev; do
 if git rev-parse $rev^2 /dev/null 21; then
 git cherry-pick -n -m1 $rev
 git rev-parse $rev^2 .git/MERGE_HEAD
 git commit -C$rev
 else
 git cherry-pick $rev
 fi
 done

This code doesn't do that.  I don't want something that rebases a
single thread of the current branch, I want something that rebases
*all* the commits between the head commit and the merge base.  Which
is what is illustrated in my message.

 [1]  If you don't get the sarcasm: that would amount to reinventing
 large parts of git-rebase.

Yes, that is the point of the exercise.

I've done a proof-of-concept implementation of what I want to see,
calling it git-rebase--merge-safe.  But I'm new here and likely that
is a pretty crude solution.  I suspect that a real implementation
could be done by inserting this logic into the framework of
git-filter-tree.  Following is git-rebase--merge-safe, and the script
I use to test it (and explore rebase problems).

Dale
--
git-rebase--merge-safe

#!/bin/bash

. git-sh-setup

prec=4

set -ex

# Ensure the work tree is clean.
require_clean_work_tree rebase Please commit or stash them.

onto_name=$1
onto=$(git rev-parse --verify ${onto_name}^0) ||
die Does not point to a valid commit: $1

head_name=$( git symbolic-ref HEAD )
orig_head=$(git rev-parse --verify $head_name) ||
exit 1

echo onto=$onto
echo head_name=$head_name
echo orig_head=$orig_head

# Get the merge base, which is the root of the branch that we are rebasing.
# (For now, ignore the question of whether there is more than one merge base.)
mb=$(git merge-base $onto $orig_head)
echo mb=$mb

# Get the list of commits to rebase, which is everything between $mb and
# $orig_head.
# Note that $mb is not included.
revisions=`git rev-list --reverse --ancestry-path $mb..$orig_head`
echo revisions=$revisions

# Set up the list mapping the commits on the original branch to the commits
# on the branch we are creating.
# Its format is ,old-hash1/new-hash1,old-hash2/new-hash2,...,.
# The initial value maps $mb to $onto.
map=,$mb/$onto,

# Export these so git commit can see them.
export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL GIT_AUTHOR_DATE

# Process each commit in forward topological order.
for cmt in $revisions
do
# Examine the commit to extract information we will need to reconstruct it.
# First parent of the commit that has a mapping, i.e., is part of the
# branch (and has thus been rebuilt already.
first_mapped_parent=
# The new commit that was made of $first_mapped_parent.
first_mapped_parent_mapped=
# List of -p options naming the parent commits, or their new commits if they
# are in the branch.
parents=
   # Dissect the old commit's data.
# Output the commit data into FD 3.
exec 3 ( git cat-file commit $cmt )

while read keyword rest 3
do
case $keyword in
tree)
# Ignored
;;
parent)
# See if the parent is mapped, i.e., is in the
# original branch.
if [[ $map == *,$rest/* ]]
then
# This parent has been mapped.  Get the new commit.
parent_mapped=${map#*,$rest/}
parent_mapped=${parent_mapped%%,*}
if test -z $first_mapped_parent
then
first_mapped_parent=$rest
first_mapped_parent_mapped=$parent_mapped
fi
else
# This parent has not been mapped.
parent_mapped=$rest
fi
# $parent_mapped is a parent of the new commit.
parents=$parents -p $parent_mapped
;;
author)
# Extract the information about the author.
GIT_AUTHOR_NAME=${rest%% *}
GIT_AUTHOR_EMAIL=${rest##* }
GIT_AUTHOR_EMAIL=${GIT_AUTHOR_EMAIL%% *}
GIT_AUTHOR_DATE=${rest##* }
;;
committer)
# Ignored:  The new commit will have this use's name
# as committer.
;;
'')
# End of fixed fields, remainder

What I want rebase to do

2013-03-06 Thread Dale R. Worley
This is how I see what rebase should do:

The simple case for rebase starts from

   P---Q---R---S master
\
 A---B---C topic

Then git checkout topic ; git rebase master will change it to

   P---Q---R---S master
\
 A'--B'--C' topic

A' is created by a three-way merge that can be symbolized

Q--S
|   |
v   v
A--A'

That is, Q, applying the changes from Q to A and the changes from Q to
S, becomes A'.

After that

A--A'
|   |
v   v
B--B'
|   |
v   v
C--C'

A more complex case is when there is a merge from an external source

   P---Q---R---S master
\
 A---M---C topic
/
---X

We want to produce

   P---Q---R---S master
\
 A'--M'--C' topic
/
---X

So we have to merge

Q--S
|   |
v   v
A--A'
|   |
v   v
M--M'
|   |
v   v
C--C'

Any evil changes in M will be in the changes A-M (along with the
changes introduced from X), and so they will be reincorporated in
A'-M'.  M' lists A' and X as its parents.  (And not M!)

If there is an internal merge in the topic branch, things look like
this

   P---Q---R---S master
\
 \   B
  \ / \
   A   M---D topic
\ /
 C

and we want to produce this

   P---Q---R---S master
\
 \B'
  \  /  \
   A'M'--D' topic
 \  /
  C'

Which can be done with these merges


Q--S
|   |
v   v
A--A'A--A'
|   | |   | 
v   v v   v 
B--B'C--C'

There are two choices for constructing M' (which ought to produce the
same results under ordinary circumstances)

B--B'C--C'
|   | |   | 
v   v v   v 
M--M'M--M'

and finally

M--M'
|   |
v   v
D--D'

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git rebase loses the additional changes in evil merges

2013-03-04 Thread Dale R. Worley
(git version 1.7.7.6)

I've been learning how to use Git.  While exploring git rebase, I've
discovered that if the branch being rebased contains an evil merge,
that is, a merge which contains changes that are in addition to the
changes in any of the parent commits, the rebase operation will
silenty lose those additional changes.

I believe that this is a serious bug.

The problem is not that Git does not propagate the additional changes
(although I'd like it to do so), the problem is that the rebase
*silently loses* changes in the code.  It would be acceptable if Git
detected the evil merge, notified me of it, and forced me to manually
re-do it in order to finish the rebase.

Throughout the Git design (as in other SCMs), care has been taken to
not lose changes that the user has introduced.  In this situation,
that principle is violated.

I have a script that demonstrates this failure, if that would be
useful for testing purposes.

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Bug in git log --graph -p -m (version 1.7.7.6)

2013-02-06 Thread Dale R. Worley
 From: Matthieu Moy matthieu@grenoble-inp.fr
 
 In any case, I can't reproduce with 1.8.1.2.526.gf51a757: I don't get
 undless output. On the other hand, I get a slightly misformatted output:
 
 *   commit a393ed598e9fb11436f85bd58f1a38c82f2cadb7 (from 
 2c1e6a36f4b712e914fac994463da7d0fdb2bc6d)
 |\  Merge: 2c1e6a3 33e70e7
 | | Author: Matthieu Moy matthieu@imag.fr
 | | Date:   Tue Feb 5 22:05:33 2013 +0100
 | | 
 | | Commit S
 | | 
 | | diff --git a/file b/file
 | | index 6bb4d3e..afd2e75 100644
 | | --- a/file
 | | +++ b/file
 | | @@ -1,4 +1,5 @@
 | |  1
 | |  1a
 | |  2
 | | +2a
 | |  3
 | | 
 commit a393ed598e9fb11436f85bd58f1a38c82f2cadb7 (from 
 33e70e70c0173d634826b998bdc304f93c0966b8)
 | | Merge: 2c1e6a3 33e70e7
 | | Author: Matthieu Moy matthieu@imag.fr
 | | Date:   Tue Feb 5 22:05:33 2013 +0100
 
 The second commit line (diff with second parent) doesn't have the
 | | prefix, I don't think this is intentional.

The second commit line should start with | * :

 | | 
 | * commit a393ed598e9fb11436f85bd58f1a38c82f2cadb7 (from 
 33e70e70c0173d634826b998bdc304f93c0966b8)
 | | Merge: 2c1e6a3 33e70e7
 | | Author: Matthieu Moy matthieu@imag.fr
 | | Date:   Tue Feb 5 22:05:33 2013 +0100

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Bug in git log --graph -p -m (version 1.7.7.6)

2013-02-05 Thread Dale R. Worley
I have found a situation where git log produces (apparently)
endless output.  Presumably this is a bug.  Following is a (Linux)
script that reliably reproduces the error for me (on Fedora 16):

--
set -ve

# Print the git version.
git --version

# Create respository.
rm -rf .git
git init

# Initial commit.
( echo 1 ; echo 2 ; echo 3 ) file
git add file
git commit -m 'Commit P'
git branch B HEAD

# Next commit on master adds line 1a.
( echo 1 ; echo 1a ; echo 2 ; echo 3 ) file
git add file
git commit -m 'Commit Q'

git checkout B

# Next commit on B adds line 2a.
( echo 1 ; echo 2 ; echo 2a ; echo 3 ) file
git add file
git commit -m 'Commit R'

# Merge the two commits, but add line 3a to the commit as well.
git checkout master
git merge --no-commit B
# Show what the merge produces.
cat file
# Add line 3a.
( echo 1 ; echo 1a ; echo 2 ; echo 2a ; echo 3 ; echo 3a ) file
git commit -m 'Commit S'

# These log commands work.
git log
git log --graph
git log --graph -p

# This log command produces infinite output.
git log --graph -p -m
--

Dale
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html