Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
On Tue, 19 Feb 2013, Junio C Hamano wrote: > Assuming that this says "yes": > > D=/afs/athena.mit.edu/user/a/n/andersk/my/dir > cd "$D" > test "$(/bin/pwd)" = "$D" && echo yes Correct. > Perhaps existing of an empty element in the list would do? E.g. > > GIT_CEILING_DIRECTORIES=:/afs/athena.mit.edu/users/a/n/andesk > > or something like that. And in such a case, we do not run realpath on > the elements on the list before comparing them with what we get from > getcwd(3). That seems reasonable, and has the advantage of backwards compatibility with versions before 1.8.1.2, I guess. Anders -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
Anders Kaseorg writes: > On 10/29/2012 01:10 AM, Michael Haggerty wrote: >> How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a >> slowdown? > > Sorry to bring up this old thread again, but I just realized why my > computer has been acting so slow when I’m not connected to the > network. I put various network filesystem paths in > $GIT_CEILING_DIRECTORIES, such as /afs/athena.mit.edu/user/a/n/andersk > (to avoid hitting its parents /afs/athena.mit.edu, > /afs/athena.mit.edu/user/a, and /afs/athena.mit.edu/user/a/n which all > live in different AFS volumes). Now when I’m not connected to the > network, every invocation of Git, including the __git_ps1 in my shell > prompt, waits for AFS to timeout. Thanks for a report. Assuming that this says "yes": D=/afs/athena.mit.edu/user/a/n/andersk/my/dir cd "$D" test "$(/bin/pwd)" = "$D" && echo yes iow, AFS mounting does not have funny symbolic link issues that make the logical and physical PWD to be different like some automounter implementation used to have, perhaps we can introduce a way for the user to say "The element of this CEILING list do not have any alias due to funny symlinks" to solve this. Perhaps existing of an empty element in the list would do? E.g. GIT_CEILING_DIRECTORIES=:/afs/athena.mit.edu/users/a/n/andesk or something like that. And in such a case, we do not run realpath on the elements on the list before comparing them with what we get from getcwd(3). Of course, you could condionally unset the environment while offline, but that feels like an ugly hack. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
On 10/29/2012 01:10 AM, Michael Haggerty wrote: How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a slowdown? Sorry to bring up this old thread again, but I just realized why my computer has been acting so slow when I’m not connected to the network. I put various network filesystem paths in $GIT_CEILING_DIRECTORIES, such as /afs/athena.mit.edu/user/a/n/andersk (to avoid hitting its parents /afs/athena.mit.edu, /afs/athena.mit.edu/user/a, and /afs/athena.mit.edu/user/a/n which all live in different AFS volumes). Now when I’m not connected to the network, every invocation of Git, including the __git_ps1 in my shell prompt, waits for AFS to timeout. Obviously I’m going to stop using $GIT_CEILING_DIRECTORIES now that I know what the problem is, but I figured you might want to know why this feature is now useless for me. Anders -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
On 11/13/2012 09:50 PM, David Aguilar wrote: > On Mon, Nov 12, 2012 at 9:47 AM, Junio C Hamano wrote: >> Michael Haggerty writes: >> >>> The log message of the original commit (0454dd93bf) described the >>> following scenario: a /home partition under which user home directories >>> are automounted, and setting GIT_CEILING_DIRECTORIES=/home to avoid >>> hitting /home/.git, /home/.git/objects, and /home/objects (which would >>> attempt to automount those directories). I believe that this scenario >>> would not be slowed down by my patches. >>> >>> How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a >>> slowdown? >> >> Yeah, I was also wondering about that. >> >> David? > > I double-checked our configuration and all the parent directories > of those listed in GIT_CEILING_DIRECTORIES are local, > so our particular configuration would not have a performance hit. > > We do have multiple directories listed there. Some of them share > a parent directory. I'm assuming the implementation is simple and > does not try and avoid repeating the check when the parent dir is > the same across multiple entries. > > In any case, it won't be a problem in practice based on my > reading of the current code. OK, so we're back to the following status: some people (including me) are nervous that this change could cause a performance regression, though it seems that the most sensible ways of using the GIT_CEILING_DIRECTORIES feature would not be affected. In favor: Currently, if a directory containing a symlink is added to GIT_CEILING_DIRECTORIES, then GIT_CEILING_DIRECTORIES will not work, git has no way of recognizing that there is a problem, and the only symptom observable by the user is that the hoped-for performance improvement from using GIT_CEILING_DIRECTORIES will not materialize (or will disappear after a filesystem reorg) [1]. Against: The change will cause a performance regression if a slow-to-stat directory is listed in GIT_CEILING_DIRECTORIES. The slowdown will occur whenever git is run outside of a true git-managed project, most nastily in the case of using __git_ps1 in a shell prompt. I don't have a preference either way about whether these patches should be merged. Michael [1] It is also conceivable that GIT_CEILING_DIRECTORIES is being used to *hide* an enclosing git project rather than to inform git that there are no enclosing projects, in which case the enclosing project would *not* be hidden. This is in fact the mechanism by which the problem causes failures in our test suite. But I don't expect that this is a common real-world scenario, and anyway such a failure would be obvious to the user and quickly fixed. -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
On Mon, Nov 12, 2012 at 9:47 AM, Junio C Hamano wrote: > Michael Haggerty writes: > >> The log message of the original commit (0454dd93bf) described the >> following scenario: a /home partition under which user home directories >> are automounted, and setting GIT_CEILING_DIRECTORIES=/home to avoid >> hitting /home/.git, /home/.git/objects, and /home/objects (which would >> attempt to automount those directories). I believe that this scenario >> would not be slowed down by my patches. >> >> How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a >> slowdown? > > Yeah, I was also wondering about that. > > David? I double-checked our configuration and all the parent directories of those listed in GIT_CEILING_DIRECTORIES are local, so our particular configuration would not have a performance hit. We do have multiple directories listed there. Some of them share a parent directory. I'm assuming the implementation is simple and does not try and avoid repeating the check when the parent dir is the same across multiple entries. In any case, it won't be a problem in practice based on my reading of the current code. >>> Is there another way to accomplish this without the performance hit? >>> Maybe something that can be solved with configuration? >> >> Without doing the symlink expansion there is no way for git to detect >> that GIT_CEILING_DIRECTORIES contains symlinks and is therefore >> ineffective. So the user has no warning about the misconfiguration >> (except that git runs slowly). >> >> On 10/29/2012 02:42 AM, Junio C Hamano wrote: >>> Perhaps not canonicalize elements on the CEILING list ourselves? If >>> we make it a user error to put symlinked alias in the variable, and >>> document it clearly, wouldn't it suffice? >> >> There may be no other choice. (That, and fix the test suite in another >> way to tolerate a $PWD that involves symlinks.) -- David -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
Michael Haggerty writes: > The log message of the original commit (0454dd93bf) described the > following scenario: a /home partition under which user home directories > are automounted, and setting GIT_CEILING_DIRECTORIES=/home to avoid > hitting /home/.git, /home/.git/objects, and /home/objects (which would > attempt to automount those directories). I believe that this scenario > would not be slowed down by my patches. > > How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a > slowdown? Yeah, I was also wondering about that. David? >> Is there another way to accomplish this without the performance hit? >> Maybe something that can be solved with configuration? > > Without doing the symlink expansion there is no way for git to detect > that GIT_CEILING_DIRECTORIES contains symlinks and is therefore > ineffective. So the user has no warning about the misconfiguration > (except that git runs slowly). > > On 10/29/2012 02:42 AM, Junio C Hamano wrote: >> Perhaps not canonicalize elements on the CEILING list ourselves? If >> we make it a user error to put symlinked alias in the variable, and >> document it clearly, wouldn't it suffice? > > There may be no other choice. (That, and fix the test suite in another > way to tolerate a $PWD that involves symlinks.) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
>From David Aguilar , Sun, Oct 28, 2012 at 05:15:29PM -0700: > > In 8030e44215fe8f34edd57d711a35f2f0f97a0423 Lars added > GIT_ONE_FILESYSTEM to fix a related issue. > Do you guys have GIT_CEILING_DIRECTORIES set too? Once GIT_DISCOVERY_ACROSS_FILESYSTEM (the eventual name for GIT_ONE_FILESYSTEM) was accepted, we stopped setting GIT_CEILING_DIRECTORIES in our environment. Even so, the way our filesystems work, stat() or lstat() calls on the paths that we used to have in GIT_CEILING_DIRECTORIES wouldn't have caused a problem. I don't personally have an objection to patches that explicitly access the GIT_CEILING_DIRECTORIES paths, but it seems like other folks' configurations could be harmed by them. -lars -- lars r. damerow :: button pusher :: pixar animation studios -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
On 10/29/2012 01:15 AM, David Aguilar wrote: > On Sat, Oct 20, 2012 at 11:51 PM, Junio C Hamano wrote: >> Michael Haggerty writes: >> >>> This patch series has the side effect that all of the directories >>> listed in GIT_CEILING_DIRECTORIES are accessed *unconditionally* to >>> resolve any symlinks that are present in their paths. It is >>> admittedly odd that a feature intended to avoid accessing expensive >>> directories would now *intentionally* access directories near the >>> expensive ones. In the above scenario this shouldn't be a problem, >>> because /home would be the directory listed in >>> GIT_CEILING_DIRECTORIES, and accessing /home itself shouldn't be >>> expensive. >> >> Interesting observation. In the last sentence, "accessing /home" >> does not exactly mean accessing /home, but accessing / to learn >> about "home" in it, no? >> >>> But there might be other scenarios for which this patch >>> series causes a performance regression. >> >> Yeah, after merging this to 'next', we should ask people who care >> about CEILING to test it sufficiently. >> >> Thanks for rerolling. > > GIT_CEILING_DIRECTORIES was always about trying to avoid > hitting them at all; they can be (busy) NFS volumes there. > > Here's the description from the 1.6.0 release notes: > > * A new environment variable GIT_CEILING_DIRECTORIES can be used to stop > the discovery process of the toplevel of working tree; this may be useful > when you are working in a slow network disk and are outside any working > tree, > as bash-completion and "git help" may still need to run in these places. > > In 8030e44215fe8f34edd57d711a35f2f0f97a0423 Lars added > GIT_ONE_FILESYSTEM to fix a related issue. > Do you guys have GIT_CEILING_DIRECTORIES set too? > > We use GIT_CEILING_DIRECTORIES and I'm pretty sure > we don't want every git command hitting them, so this would > be a regression when seen from the POV of our current usage > of this variable, which would be a bummer. I would certainly withdraw the patch series if it causes a performance hit. The log message of the original commit (0454dd93bf) described the following scenario: a /home partition under which user home directories are automounted, and setting GIT_CEILING_DIRECTORIES=/home to avoid hitting /home/.git, /home/.git/objects, and /home/objects (which would attempt to automount those directories). I believe that this scenario would not be slowed down by my patches. How do you use GIT_CEILING_DIRECTORIES that the proposed changes cause a slowdown? > Is there another way to accomplish this without the performance hit? > Maybe something that can be solved with configuration? Without doing the symlink expansion there is no way for git to detect that GIT_CEILING_DIRECTORIES contains symlinks and is therefore ineffective. So the user has no warning about the misconfiguration (except that git runs slowly). On 10/29/2012 02:42 AM, Junio C Hamano wrote: > Perhaps not canonicalize elements on the CEILING list ourselves? If > we make it a user error to put symlinked alias in the variable, and > document it clearly, wouldn't it suffice? There may be no other choice. (That, and fix the test suite in another way to tolerate a $PWD that involves symlinks.) Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
David Aguilar wrote: >Is there another way to accomplish this without the performance hit? Perhaps not canonicalize elements on the CEILING list ourselves? If we make it a user error to put symlinked alias in the variable, and document it clearly, wouldn't it suffice? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
On Sat, Oct 20, 2012 at 11:51 PM, Junio C Hamano wrote: > Michael Haggerty writes: > >> This patch series has the side effect that all of the directories >> listed in GIT_CEILING_DIRECTORIES are accessed *unconditionally* to >> resolve any symlinks that are present in their paths. It is >> admittedly odd that a feature intended to avoid accessing expensive >> directories would now *intentionally* access directories near the >> expensive ones. In the above scenario this shouldn't be a problem, >> because /home would be the directory listed in >> GIT_CEILING_DIRECTORIES, and accessing /home itself shouldn't be >> expensive. > > Interesting observation. In the last sentence, "accessing /home" > does not exactly mean accessing /home, but accessing / to learn > about "home" in it, no? > >> But there might be other scenarios for which this patch >> series causes a performance regression. > > Yeah, after merging this to 'next', we should ask people who care > about CEILING to test it sufficiently. > > Thanks for rerolling. GIT_CEILING_DIRECTORIES was always about trying to avoid hitting them at all; they can be (busy) NFS volumes there. Here's the description from the 1.6.0 release notes: * A new environment variable GIT_CEILING_DIRECTORIES can be used to stop the discovery process of the toplevel of working tree; this may be useful when you are working in a slow network disk and are outside any working tree, as bash-completion and "git help" may still need to run in these places. In 8030e44215fe8f34edd57d711a35f2f0f97a0423 Lars added GIT_ONE_FILESYSTEM to fix a related issue. Do you guys have GIT_CEILING_DIRECTORIES set too? We use GIT_CEILING_DIRECTORIES and I'm pretty sure we don't want every git command hitting them, so this would be a regression when seen from the POV of our current usage of this variable, which would be a bummer. Is there another way to accomplish this without the performance hit? Maybe something that can be solved with configuration? I'd be happy to lend a hand if you guys have some ideas on how we can help keep it fast. Thoughts? Original patches for those just joining us: http://thread.gmane.org/gmane.comp.version-control.git/208102 -- David -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
On 10/21/2012 08:51 AM, Junio C Hamano wrote: > Michael Haggerty writes: > >> This patch series has the side effect that all of the directories >> listed in GIT_CEILING_DIRECTORIES are accessed *unconditionally* to >> resolve any symlinks that are present in their paths. It is >> admittedly odd that a feature intended to avoid accessing expensive >> directories would now *intentionally* access directories near the >> expensive ones. In the above scenario this shouldn't be a problem, >> because /home would be the directory listed in >> GIT_CEILING_DIRECTORIES, and accessing /home itself shouldn't be >> expensive. > > Interesting observation. In the last sentence, "accessing /home" > does not exactly mean accessing /home, but accessing / to learn > about "home" in it, no? This is the extra overhead on my system for using GIT_CEILING_DIRECTORIES=/home: stat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 getcwd("/home/mhagger", 1024) = 14 chdir("/home") = 0 getcwd("/home", 4096) = 6 lstat("/home", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0 chdir("/home/mhagger") = 0 If I use GIT_CEILING_DIRECTORIES=/dev/shm, which is a symlink to /run/shm on my system, the overhead is comparable: stat("/dev/shm", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=200, ...}) = 0 getcwd("/home/mhagger", 1024) = 14 chdir("/dev/shm") = 0 getcwd("/run/shm", 4096)= 9 lstat("/run/shm", {st_mode=S_IFDIR|S_ISVTX|0777, st_size=200, ...}) = 0 chdir("/home/mhagger") = 0 Michael -- Michael Haggerty mhag...@alum.mit.edu http://softwareswirl.blogspot.com/ -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
Michael Haggerty writes: > This patch series has the side effect that all of the directories > listed in GIT_CEILING_DIRECTORIES are accessed *unconditionally* to > resolve any symlinks that are present in their paths. It is > admittedly odd that a feature intended to avoid accessing expensive > directories would now *intentionally* access directories near the > expensive ones. In the above scenario this shouldn't be a problem, > because /home would be the directory listed in > GIT_CEILING_DIRECTORIES, and accessing /home itself shouldn't be > expensive. Interesting observation. In the last sentence, "accessing /home" does not exactly mean accessing /home, but accessing / to learn about "home" in it, no? > But there might be other scenarios for which this patch > series causes a performance regression. Yeah, after merging this to 'next', we should ask people who care about CEILING to test it sufficiently. Thanks for rerolling. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3 0/8] Fix GIT_CEILING_DIRECTORIES that contain symlinks
v3 of the series, reworked WRT v2: * Change longest_ancestor_length() to take a string_list for its prefixes argument (instead of a PATH_SEP-separated string). * Move the responsibility for canonicalizing prefixes from longest_ancestor_length() to its caller in setup_git_directory_gently_1(). This keeps longest_ancestor_length() testable, though the test inputs now have to be canonicalized. * Remove function string_list_longest_prefix(), which was mainly intended to be used in the implementation of longest_ancestor_length() but is now unneeded. Thanks to Junio for his comments. I would like to explicitly point out a possible objection to this whole enterprise. GIT_CEILING_DIRECTORIES is used to prevent git from mucking around in certain directories, to avoid expensive accesses (for example of remote directories). The original motivation for the feature was a user whose home directory /home/$USER was automounted. When git was invoked outside of a git tree, it would crawl up the filesystem tree, eventually reaching /home, and then try to access the paths /home /home/.git /home/.git/objects /home/objects The last three accesses would be very expensive because the system would attempt to automount the entries listed. This patch series has the side effect that all of the directories listed in GIT_CEILING_DIRECTORIES are accessed *unconditionally* to resolve any symlinks that are present in their paths. It is admittedly odd that a feature intended to avoid accessing expensive directories would now *intentionally* access directories near the expensive ones. In the above scenario this shouldn't be a problem, because /home would be the directory listed in GIT_CEILING_DIRECTORIES, and accessing /home itself shouldn't be expensive. But there might be other scenarios for which this patch series causes a performance regression. Another point: After this change, it would be trivially possible to support non-absolute paths in GIT_CEILING_DIRECTORIES; just remove the call to is_absolute_path() from normalize_ceiling_entry(). This might be convenient for the test suite. Michael Haggerty (8): Introduce new static function real_path_internal() real_path_internal(): add comment explaining use of cwd Introduce new function real_path_if_valid() longest_ancestor_length(): use string_list_split() longest_ancestor_length(): take a string_list argument for prefixes longest_ancestor_length(): require prefix list entries to be normalized normalize_ceiling_entry(): resolve symlinks string_list_longest_prefix(): remove function Documentation/technical/api-string-list.txt | 8 --- abspath.c | 105 ++-- cache.h | 3 +- path.c | 46 ++-- setup.c | 32 - string-list.c | 20 -- string-list.h | 8 --- t/t0060-path-utils.sh | 41 --- t/t0063-string-list.sh | 30 test-path-utils.c | 8 ++- test-string-list.c | 20 -- 11 files changed, 157 insertions(+), 164 deletions(-) -- 1.7.11.3 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html