Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2013-02-20 Thread Anders Kaseorg
On Tue, 19 Feb 2013, Junio C Hamano wrote:
> Assuming that this says "yes":
> 
>   D=/afs/athena.mit.edu/user/a/n/andersk/my/dir
> cd "$D"
> test "$(/bin/pwd)" = "$D" && echo yes

Correct.

> Perhaps existing of an empty element in the list would do?  E.g.
> 
>   GIT_CEILING_DIRECTORIES=:/afs/athena.mit.edu/users/a/n/andesk
> 
> or something like that.  And in such a case, we do not run realpath on 
> the elements on the list before comparing them with what we get from 
> getcwd(3).

That seems reasonable, and has the advantage of backwards compatibility 
with versions before 1.8.1.2, I guess.

Anders
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2013-02-19 Thread Junio C Hamano
Anders Kaseorg  writes:

> On 10/29/2012 01:10 AM, Michael Haggerty wrote:
>> How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a
>> slowdown?
>
> Sorry to bring up this old thread again, but I just realized why my
> computer has been acting so slow when I’m not connected to the
> network. I put various network filesystem paths in
> $GIT_CEILING_DIRECTORIES, such as /afs/athena.mit.edu/user/a/n/andersk
> (to avoid hitting its parents /afs/athena.mit.edu,
> /afs/athena.mit.edu/user/a, and /afs/athena.mit.edu/user/a/n which all
> live in different AFS volumes). Now when I’m not connected to the
> network, every invocation of Git, including the __git_ps1 in my shell
> prompt, waits for AFS to timeout.

Thanks for a report.

Assuming that this says "yes":

D=/afs/athena.mit.edu/user/a/n/andersk/my/dir
cd "$D"
test "$(/bin/pwd)" = "$D" && echo yes

iow, AFS mounting does not have funny symbolic link issues that make
the logical and physical PWD to be different like some automounter
implementation used to have, perhaps we can introduce a way for the
user to say "The element of this CEILING list do not have any
alias due to funny symlinks" to solve this.  Perhaps existing of an
empty element in the list would do?  E.g.

GIT_CEILING_DIRECTORIES=:/afs/athena.mit.edu/users/a/n/andesk

or something like that.  And in such a case, we do not run realpath
on the elements on the list before comparing them with what we get
from getcwd(3).

Of course, you could condionally unset the environment while
offline, but that feels like an ugly hack.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2013-02-19 Thread Anders Kaseorg

On 10/29/2012 01:10 AM, Michael Haggerty wrote:

How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a
slowdown?


Sorry to bring up this old thread again, but I just realized why my 
computer has been acting so slow when I’m not connected to the network. 
 I put various network filesystem paths in $GIT_CEILING_DIRECTORIES, 
such as /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its 
parents /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and 
/afs/athena.mit.edu/user/a/n which all live in different AFS volumes). 
Now when I’m not connected to the network, every invocation of Git, 
including the __git_ps1 in my shell prompt, waits for AFS to timeout.


Obviously I’m going to stop using $GIT_CEILING_DIRECTORIES now that I 
know what the problem is, but I figured you might want to know why this 
feature is now useless for me.


Anders

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2012-11-15 Thread Michael Haggerty
On 11/13/2012 09:50 PM, David Aguilar wrote:
> On Mon, Nov 12, 2012 at 9:47 AM, Junio C Hamano  wrote:
>> Michael Haggerty  writes:
>>
>>> The log message of the original commit (0454dd93bf) described the
>>> following scenario: a /home partition under which user home directories
>>> are automounted, and setting GIT_CEILING_DIRECTORIES=/home to avoid
>>> hitting /home/.git, /home/.git/objects, and /home/objects (which would
>>> attempt to automount those directories).  I believe that this scenario
>>> would not be slowed down by my patches.
>>>
>>> How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a
>>> slowdown?
>>
>> Yeah, I was also wondering about that.
>>
>> David?
> 
> I double-checked our configuration and all the parent directories
> of those listed in GIT_CEILING_DIRECTORIES are local,
> so our particular configuration would not have a performance hit.
> 
> We do have multiple directories listed there.  Some of them share
> a parent directory.  I'm assuming the implementation is simple and
> does not try and avoid repeating the check when the parent dir is
> the same across multiple entries.
> 
> In any case, it won't be a problem in practice based on my
> reading of the current code.

OK, so we're back to the following status: some people (including me)
are nervous that this change could cause a performance regression,
though it seems that the most sensible ways of using the
GIT_CEILING_DIRECTORIES feature would not be affected.

In favor: Currently, if a directory containing a symlink is added to
GIT_CEILING_DIRECTORIES, then GIT_CEILING_DIRECTORIES will not work, git
has no way of recognizing that there is a problem, and the only symptom
observable by the user is that the hoped-for performance improvement
from using GIT_CEILING_DIRECTORIES will not materialize (or will
disappear after a filesystem reorg) [1].

Against: The change will cause a performance regression if a
slow-to-stat directory is listed in GIT_CEILING_DIRECTORIES.  The
slowdown will occur whenever git is run outside of a true git-managed
project, most nastily in the case of using __git_ps1 in a shell prompt.

I don't have a preference either way about whether these patches should
be merged.

Michael

[1] It is also conceivable that GIT_CEILING_DIRECTORIES is being used to
*hide* an enclosing git project rather than to inform git that there are
no enclosing projects, in which case the enclosing project would *not*
be hidden.  This is in fact the mechanism by which the problem causes
failures in our test suite.  But I don't expect that this is a common
real-world scenario, and anyway such a failure would be obvious to the
user and quickly fixed.

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2012-11-13 Thread David Aguilar
On Mon, Nov 12, 2012 at 9:47 AM, Junio C Hamano  wrote:
> Michael Haggerty  writes:
>
>> The log message of the original commit (0454dd93bf) described the
>> following scenario: a /home partition under which user home directories
>> are automounted, and setting GIT_CEILING_DIRECTORIES=/home to avoid
>> hitting /home/.git, /home/.git/objects, and /home/objects (which would
>> attempt to automount those directories).  I believe that this scenario
>> would not be slowed down by my patches.
>>
>> How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a
>> slowdown?
>
> Yeah, I was also wondering about that.
>
> David?

I double-checked our configuration and all the parent directories
of those listed in GIT_CEILING_DIRECTORIES are local,
so our particular configuration would not have a performance hit.

We do have multiple directories listed there.  Some of them share
a parent directory.  I'm assuming the implementation is simple and
does not try and avoid repeating the check when the parent dir is
the same across multiple entries.

In any case, it won't be a problem in practice based on my
reading of the current code.



>>> Is there another way to accomplish this without the performance hit?
>>> Maybe something that can be solved with configuration?
>>
>> Without doing the symlink expansion there is no way for git to detect
>> that GIT_CEILING_DIRECTORIES contains symlinks and is therefore
>> ineffective.  So the user has no warning about the misconfiguration
>> (except that git runs slowly).
>>
>> On 10/29/2012 02:42 AM, Junio C Hamano wrote:
>>> Perhaps not canonicalize elements on the CEILING list ourselves? If
>>> we make it a user error to put symlinked alias in the variable, and
>>> document it clearly, wouldn't it suffice?
>>
>> There may be no other choice.  (That, and fix the test suite in another
>> way to tolerate a $PWD that involves symlinks.)
-- 
David
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2012-11-12 Thread Junio C Hamano
Michael Haggerty  writes:

> The log message of the original commit (0454dd93bf) described the
> following scenario: a /home partition under which user home directories
> are automounted, and setting GIT_CEILING_DIRECTORIES=/home to avoid
> hitting /home/.git, /home/.git/objects, and /home/objects (which would
> attempt to automount those directories).  I believe that this scenario
> would not be slowed down by my patches.
>
> How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a
> slowdown?

Yeah, I was also wondering about that.

David?

>> Is there another way to accomplish this without the performance hit?
>> Maybe something that can be solved with configuration?
>
> Without doing the symlink expansion there is no way for git to detect
> that GIT_CEILING_DIRECTORIES contains symlinks and is therefore
> ineffective.  So the user has no warning about the misconfiguration
> (except that git runs slowly).
>
> On 10/29/2012 02:42 AM, Junio C Hamano wrote:
>> Perhaps not canonicalize elements on the CEILING list ourselves? If
>> we make it a user error to put symlinked alias in the variable, and
>> document it clearly, wouldn't it suffice?
>
> There may be no other choice.  (That, and fix the test suite in another
> way to tolerate a $PWD that involves symlinks.)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2012-10-28 Thread Lars Damerow
>From David Aguilar , Sun, Oct 28, 2012 at 05:15:29PM -0700:
> 
> In 8030e44215fe8f34edd57d711a35f2f0f97a0423 Lars added
> GIT_ONE_FILESYSTEM to fix a related issue.
> Do you guys have GIT_CEILING_DIRECTORIES set too?

Once GIT_DISCOVERY_ACROSS_FILESYSTEM (the eventual name for GIT_ONE_FILESYSTEM) 
was accepted, we stopped setting GIT_CEILING_DIRECTORIES in our environment.

Even so, the way our filesystems work, stat() or lstat() calls on the paths 
that we used to have in GIT_CEILING_DIRECTORIES wouldn't have caused a problem.

I don't personally have an objection to patches that explicitly access the 
GIT_CEILING_DIRECTORIES paths, but it seems like other folks' configurations 
could be harmed by them.

-lars

--
lars r. damerow :: button pusher :: pixar animation studios
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2012-10-28 Thread Michael Haggerty
On 10/29/2012 01:15 AM, David Aguilar wrote:
> On Sat, Oct 20, 2012 at 11:51 PM, Junio C Hamano  wrote:
>> Michael Haggerty  writes:
>>
>>> This patch series has the side effect that all of the directories
>>> listed in GIT_CEILING_DIRECTORIES are accessed *unconditionally* to
>>> resolve any symlinks that are present in their paths.  It is
>>> admittedly odd that a feature intended to avoid accessing expensive
>>> directories would now *intentionally* access directories near the
>>> expensive ones.  In the above scenario this shouldn't be a problem,
>>> because /home would be the directory listed in
>>> GIT_CEILING_DIRECTORIES, and accessing /home itself shouldn't be
>>> expensive.
>>
>> Interesting observation.  In the last sentence, "accessing /home"
>> does not exactly mean accessing /home, but accessing / to learn
>> about "home" in it, no?
>>
>>> But there might be other scenarios for which this patch
>>> series causes a performance regression.
>>
>> Yeah, after merging this to 'next', we should ask people who care
>> about CEILING to test it sufficiently.
>>
>> Thanks for rerolling.
> 
> GIT_CEILING_DIRECTORIES was always about trying to avoid
> hitting them at all; they can be (busy) NFS volumes there.
> 
> Here's the description from the 1.6.0 release notes:
> 
> * A new environment variable GIT_CEILING_DIRECTORIES can be used to stop
>   the discovery process of the toplevel of working tree; this may be useful
>   when you are working in a slow network disk and are outside any working 
> tree,
>   as bash-completion and "git help" may still need to run in these places.
> 
> In 8030e44215fe8f34edd57d711a35f2f0f97a0423 Lars added
> GIT_ONE_FILESYSTEM to fix a related issue.
> Do you guys have GIT_CEILING_DIRECTORIES set too?
> 
> We use GIT_CEILING_DIRECTORIES and I'm pretty sure
> we don't want every git command hitting them, so this would
> be a regression when seen from the POV of our current usage
> of this variable, which would be a bummer.

I would certainly withdraw the patch series if it causes a performance hit.

The log message of the original commit (0454dd93bf) described the
following scenario: a /home partition under which user home directories
are automounted, and setting GIT_CEILING_DIRECTORIES=/home to avoid
hitting /home/.git, /home/.git/objects, and /home/objects (which would
attempt to automount those directories).  I believe that this scenario
would not be slowed down by my patches.

How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a
slowdown?

> Is there another way to accomplish this without the performance hit?
> Maybe something that can be solved with configuration?

Without doing the symlink expansion there is no way for git to detect
that GIT_CEILING_DIRECTORIES contains symlinks and is therefore
ineffective.  So the user has no warning about the misconfiguration
(except that git runs slowly).

On 10/29/2012 02:42 AM, Junio C Hamano wrote:
> Perhaps not canonicalize elements on the CEILING list ourselves? If
> we make it a user error to put symlinked alias in the variable, and
> document it clearly, wouldn't it suffice?

There may be no other choice.  (That, and fix the test suite in another
way to tolerate a $PWD that involves symlinks.)

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2012-10-28 Thread Junio C Hamano


David Aguilar  wrote:

>Is there another way to accomplish this without the performance hit?

Perhaps not canonicalize elements on the CEILING list ourselves? If we make it 
a user error to put symlinked alias in the variable, and document it clearly, 
wouldn't it suffice?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2012-10-28 Thread David Aguilar
On Sat, Oct 20, 2012 at 11:51 PM, Junio C Hamano  wrote:
> Michael Haggerty  writes:
>
>> This patch series has the side effect that all of the directories
>> listed in GIT_CEILING_DIRECTORIES are accessed *unconditionally* to
>> resolve any symlinks that are present in their paths.  It is
>> admittedly odd that a feature intended to avoid accessing expensive
>> directories would now *intentionally* access directories near the
>> expensive ones.  In the above scenario this shouldn't be a problem,
>> because /home would be the directory listed in
>> GIT_CEILING_DIRECTORIES, and accessing /home itself shouldn't be
>> expensive.
>
> Interesting observation.  In the last sentence, "accessing /home"
> does not exactly mean accessing /home, but accessing / to learn
> about "home" in it, no?
>
>> But there might be other scenarios for which this patch
>> series causes a performance regression.
>
> Yeah, after merging this to 'next', we should ask people who care
> about CEILING to test it sufficiently.
>
> Thanks for rerolling.

GIT_CEILING_DIRECTORIES was always about trying to avoid
hitting them at all; they can be (busy) NFS volumes there.

Here's the description from the 1.6.0 release notes:

* A new environment variable GIT_CEILING_DIRECTORIES can be used to stop
  the discovery process of the toplevel of working tree; this may be useful
  when you are working in a slow network disk and are outside any working tree,
  as bash-completion and "git help" may still need to run in these places.

In 8030e44215fe8f34edd57d711a35f2f0f97a0423 Lars added
GIT_ONE_FILESYSTEM to fix a related issue.
Do you guys have GIT_CEILING_DIRECTORIES set too?

We use GIT_CEILING_DIRECTORIES and I'm pretty sure
we don't want every git command hitting them, so this would
be a regression when seen from the POV of our current usage
of this variable, which would be a bummer.

Is there another way to accomplish this without the performance hit?
Maybe something that can be solved with configuration?

I'd be happy to lend a hand if you guys have some ideas
on how we can help keep it fast.  Thoughts?

Original patches for those just joining us:
http://thread.gmane.org/gmane.comp.version-control.git/208102
-- 
David
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2012-10-22 Thread Michael Haggerty
On 10/21/2012 08:51 AM, Junio C Hamano wrote:
> Michael Haggerty  writes:
> 
>> This patch series has the side effect that all of the directories
>> listed in GIT_CEILING_DIRECTORIES are accessed *unconditionally* to
>> resolve any symlinks that are present in their paths.  It is
>> admittedly odd that a feature intended to avoid accessing expensive
>> directories would now *intentionally* access directories near the
>> expensive ones.  In the above scenario this shouldn't be a problem,
>> because /home would be the directory listed in
>> GIT_CEILING_DIRECTORIES, and accessing /home itself shouldn't be
>> expensive.
> 
> Interesting observation.  In the last sentence, "accessing /home"
> does not exactly mean accessing /home, but accessing / to learn
> about "home" in it, no?

This is the extra overhead on my system for using
GIT_CEILING_DIRECTORIES=/home:

stat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
getcwd("/home/mhagger", 1024)   = 14
chdir("/home")  = 0
getcwd("/home", 4096)   = 6
lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
chdir("/home/mhagger")  = 0

If I use GIT_CEILING_DIRECTORIES=/dev/shm, which is a symlink to
/run/shm on my system, the overhead is comparable:

stat("/dev/shm", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=200, ...}) = 0
getcwd("/home/mhagger", 1024)   = 14
chdir("/dev/shm")   = 0
getcwd("/run/shm", 4096)= 9
lstat("/run/shm", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=200, ...}) = 0
chdir("/home/mhagger")  = 0

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2012-10-20 Thread Junio C Hamano
Michael Haggerty  writes:

> This patch series has the side effect that all of the directories
> listed in GIT_CEILING_DIRECTORIES are accessed *unconditionally* to
> resolve any symlinks that are present in their paths.  It is
> admittedly odd that a feature intended to avoid accessing expensive
> directories would now *intentionally* access directories near the
> expensive ones.  In the above scenario this shouldn't be a problem,
> because /home would be the directory listed in
> GIT_CEILING_DIRECTORIES, and accessing /home itself shouldn't be
> expensive.

Interesting observation.  In the last sentence, "accessing /home"
does not exactly mean accessing /home, but accessing / to learn
about "home" in it, no?

> But there might be other scenarios for which this patch
> series causes a performance regression.

Yeah, after merging this to 'next', we should ask people who care
about CEILING to test it sufficiently.

Thanks for rerolling.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks

2012-10-20 Thread Michael Haggerty
v3 of the series, reworked WRT v2:

* Change longest_ancestor_length() to take a string_list for its
  prefixes argument (instead of a PATH_SEP-separated string).

* Move the responsibility for canonicalizing prefixes from
  longest_ancestor_length() to its caller in
  setup_git_directory_gently_1().  This keeps
  longest_ancestor_length() testable, though the test inputs now have
  to be canonicalized.

* Remove function string_list_longest_prefix(), which was mainly
  intended to be used in the implementation of
  longest_ancestor_length() but is now unneeded.

Thanks to Junio for his comments.

I would like to explicitly point out a possible objection to this
whole enterprise.  GIT_CEILING_DIRECTORIES is used to prevent git from
mucking around in certain directories, to avoid expensive accesses
(for example of remote directories).  The original motivation for the
feature was a user whose home directory /home/$USER was automounted.
When git was invoked outside of a git tree, it would crawl up the
filesystem tree, eventually reaching /home, and then try to access the
paths

/home
/home/.git
/home/.git/objects
/home/objects

The last three accesses would be very expensive because the system
would attempt to automount the entries listed.

This patch series has the side effect that all of the directories
listed in GIT_CEILING_DIRECTORIES are accessed *unconditionally* to
resolve any symlinks that are present in their paths.  It is
admittedly odd that a feature intended to avoid accessing expensive
directories would now *intentionally* access directories near the
expensive ones.  In the above scenario this shouldn't be a problem,
because /home would be the directory listed in
GIT_CEILING_DIRECTORIES, and accessing /home itself shouldn't be
expensive.  But there might be other scenarios for which this patch
series causes a performance regression.

Another point: After this change, it would be trivially possible to
support non-absolute paths in GIT_CEILING_DIRECTORIES; just remove the
call to is_absolute_path() from normalize_ceiling_entry().  This might
be convenient for the test suite.

Michael Haggerty (8):
  Introduce new static function real_path_internal()
  real_path_internal(): add comment explaining use of cwd
  Introduce new function real_path_if_valid()
  longest_ancestor_length(): use string_list_split()
  longest_ancestor_length(): take a string_list argument for prefixes
  longest_ancestor_length(): require prefix list entries to be
normalized
  normalize_ceiling_entry(): resolve symlinks
  string_list_longest_prefix(): remove function

 Documentation/technical/api-string-list.txt |   8 ---
 abspath.c   | 105 ++--
 cache.h |   3 +-
 path.c  |  46 ++--
 setup.c |  32 -
 string-list.c   |  20 --
 string-list.h   |   8 ---
 t/t0060-path-utils.sh   |  41 ---
 t/t0063-string-list.sh  |  30 
 test-path-utils.c   |   8 ++-
 test-string-list.c  |  20 --
 11 files changed, 157 insertions(+), 164 deletions(-)

-- 
1.7.11.3

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html