Re: [PATCH 7/7] t0000: verify that real_path() removes extra slashes

2012-09-06 Thread Junio C Hamano
Nguyen Thai Ngoc Duy pclo...@gmail.com writes:

 Just an idea. We could unify [a-z]: and //host into dos root
 concept. That would teach other code paths about UNC paths too.
 ...
 diff --git a/path.c b/path.c
 index 66acd24..0e4e2d7 100644
 --- a/path.c
 +++ b/path.c
 @@ -498,11 +498,12 @@ const char *relative_path(const char *abs, const char 
 *base)
  int normalize_path_copy(char *dst, const char *src)
  {
   char *dst0;
 + int i, len;
  
 - if (has_dos_drive_prefix(src)) {
 + len = offset_1st_component(src, 1);
 + for (i = 0; i  len; i++)
   *dst++ = *src++;
 - *dst++ = *src++;
 - }
 +
   dst0 = dst;

Modulo that I suspect you could get rid of offset_1st_component()
altogether and has_dos_drive_prefix() return the length of the d:
or //d part (which needs to be copied literally regardless of the
normalization), what you suggest feels like the right approach.
Why do you need the keep_root parameter and do things differently
depending on the setting by the way?  Wouldn't skip the root level
when computing the offset of the first path component something the
caller can easily decide to do or not to do, and wouldn't it make
the semantics of the function cleaner and simpler by making it do
only one thing and one thing well?


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] t0000: verify that real_path() removes extra slashes

2012-09-06 Thread Michael Haggerty
On 09/05/2012 10:40 AM, Johannes Sixt wrote:
 Am 9/4/2012 10:14, schrieb mhag...@alum.mit.edu:
 From: Michael Haggerty mhag...@alum.mit.edu

 These tests already pass, but make sure they don't break in the
 future.

 Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
 ---

 It would be great if somebody would check whether these tests pass on
 Windows, and if not, give me a tip about how to fix them.

  t/t-basic.sh | 11 +++
  1 file changed, 11 insertions(+)

 diff --git a/t/t-basic.sh b/t/t-basic.sh
 index d929578..3c75e97 100755
 --- a/t/t-basic.sh
 +++ b/t/t-basic.sh
 @@ -468,6 +468,17 @@ test_expect_success 'real path works on absolute paths' 
 '
  test $d/$nopath = $(test-path-utils real_path $d/$nopath)
  '
  
 +test_expect_success 'real path removes extra slashes' '
 +nopath=hopefully-absent-path 
 +test / = $(test-path-utils real_path ///) 
 +test /$nopath = $(test-path-utils real_path ///$nopath) 
 
 Same here: test-path-utils operates on a DOS-style absolute path.

ACK.

 Furthermore, as Junio pointed out elsewhere, it is desirable that slashes
 (dir-separators) at the beginning are not collapsed if there are exactly
 two of them. Three or more can be collapsed to a single slash.

The empirical behavior of real_path() on Linux is (assuming /tmp
exists and /foo does not)

 Before my changes   After my changes
real_path(/tmp)/tmp/tmp
real_path(//tmp)   /tmp/tmp
real_path(/foo)$(pwd)/foo  /foo
real_path(//foo)   /foo/foo

real_path(/tmp/bar)/tmp/bar/tmp/bar
real_path(//tmp/bar)   /tmp/bar/tmp/bar
real_path(/foo/bar)--dies----dies--
real_path(//foo/bar)   --dies----dies--

So I think that my changes makes things strictly better (albeit probably
not perfect).

real_path() relies on chdir() / getcwd() to canonicalize the path,
except for one case:

If the specified path is not itself a directory, then it strips off the
last component of the name, tries chdir() / getcwd() on the shortened
path, then pastes the last component on again.  The stripping off is
done by find_last_dir_sep(), which literally just looks for the last '/'
(or for Windows also '\') in the string.  Please note that the pasting
back is done using '/' as a separator.

So I think that the only problematic case is a path that only includes a
single group of slashes, like //foo or maybe c:\\foo, and also is
not is_directory() [1].  What should be the algorithm for such cases,
and how should it depend on the platform?  (And does it really matter?
Top-level Linux paths are usually directories.  Are Windows-style
network shares also considered directories in the sense of is_directory()?)

I will make an easy improvement: not to remove *any* slashes when
stripping the last path component from the end of the path (letting
chdir() deal with them).  This results in the same results for Linux and
perhaps more hope under Windows:

On Linux: //foo - (chdir(//), foo) - (/, foo) - /foo

On Windows: //foo - (chdir(//), foo) - (?, foo) - ?
(what is the result of chdir(//) then getcwd()?)

On Windows: c:\foo - (chdir(c:\), foo) - (?, foo) - ?
(what is the result of chdir(c:\) then getcwd()?)

 +# We need an existing top-level directory to use in the
 +# remaining tests.  Use the top-level ancestor of $(pwd):
 +d=$(pwd -P | sed -e s|^\(/[^/]*\)/.*|\1|) 
 
 Same here: Account for the drive letter-colon.

ACK

 +test $d = $(test-path-utils real_path //$d///) 
 +test $d/$nopath = $(test-path-utils real_path //$d///$nopath)
 
 Since $d is a DOS-style path on Windows, //$d means something entirely
 different than $d. You should split the test in two: One test checks that
 slashes at the end or before $nopath are collapsed (this must work on
 Windows as well), and another test protected by POSIX prerequisite to
 check that slashes at the beginning are collapsed.

Done.

Thanks again for your help.

Michael

[1] If there is more than one group of slashes in the name, like
//foo//bar or c:\\foo\\bar then:

* All but the last groups of slashes are handled by
  chdir(//foo/)/getcwd() and presumably de-duplicated or not as
  appropriate

* The extras from the last group of slashes are trailing slashes in
  the string passed to chdir() and are therefore removed.

so everything should be OK.

-- 
Michael Haggerty
mhag...@alum.mit.edu
http://softwareswirl.blogspot.com/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [msysGit] Re: [PATCH 7/7] t0000: verify that real_path() removes extra slashes

2012-09-06 Thread Nguyen Thai Ngoc Duy
On Fri, Sep 7, 2012 at 12:34 AM, Junio C Hamano gits...@pobox.com wrote:
 Modulo that I suspect you could get rid of offset_1st_component()
 altogether and has_dos_drive_prefix() return the length of the d:
 or //d part (which needs to be copied literally regardless of the
 normalization), what you suggest feels like the right approach.
 Why do you need the keep_root parameter and do things differently
 depending on the setting by the way?

That's how offset_1st_component() originally works, root slash if
present is counted.

 Wouldn't skip the root level
 when computing the offset of the first path component something the
 caller can easily decide to do or not to do, and wouldn't it make
 the semantics of the function cleaner and simpler by making it do
 only one thing and one thing well?

Yeah. I'll have a closer look later and see if we can simplify the function.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] t0000: verify that real_path() removes extra slashes

2012-09-05 Thread Johannes Sixt
Am 9/4/2012 10:14, schrieb mhag...@alum.mit.edu:
 From: Michael Haggerty mhag...@alum.mit.edu
 
 These tests already pass, but make sure they don't break in the
 future.
 
 Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
 ---
 
 It would be great if somebody would check whether these tests pass on
 Windows, and if not, give me a tip about how to fix them.
 
  t/t-basic.sh | 11 +++
  1 file changed, 11 insertions(+)
 
 diff --git a/t/t-basic.sh b/t/t-basic.sh
 index d929578..3c75e97 100755
 --- a/t/t-basic.sh
 +++ b/t/t-basic.sh
 @@ -468,6 +468,17 @@ test_expect_success 'real path works on absolute paths' '
   test $d/$nopath = $(test-path-utils real_path $d/$nopath)
  '
  
 +test_expect_success 'real path removes extra slashes' '
 + nopath=hopefully-absent-path 
 + test / = $(test-path-utils real_path ///) 
 + test /$nopath = $(test-path-utils real_path ///$nopath) 

Same here: test-path-utils operates on a DOS-style absolute path.
Furthermore, as Junio pointed out elsewhere, it is desirable that slashes
(dir-separators) at the beginning are not collapsed if there are exactly
two of them. Three or more can be collapsed to a single slash.

 + # We need an existing top-level directory to use in the
 + # remaining tests.  Use the top-level ancestor of $(pwd):
 + d=$(pwd -P | sed -e s|^\(/[^/]*\)/.*|\1|) 

Same here: Account for the drive letter-colon.

 + test $d = $(test-path-utils real_path //$d///) 
 + test $d/$nopath = $(test-path-utils real_path //$d///$nopath)

Since $d is a DOS-style path on Windows, //$d means something entirely
different than $d. You should split the test in two: One test checks that
slashes at the end or before $nopath are collapsed (this must work on
Windows as well), and another test protected by POSIX prerequisite to
check that slashes at the beginning are collapsed.

-- Hannes
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] t0000: verify that real_path() removes extra slashes

2012-09-05 Thread Nguyen Thai Ngoc Duy
On Wed, Sep 5, 2012 at 1:19 AM, Junio C Hamano gits...@pobox.com wrote:
 mhag...@alum.mit.edu writes:

 From: Michael Haggerty mhag...@alum.mit.edu

 These tests already pass, but make sure they don't break in the
 future.

 Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
 ---

 It would be great if somebody would check whether these tests pass on
 Windows, and if not, give me a tip about how to fix them.

 I think this (perhaps unwarranted) removal of the double leading
 slashes is the root cause of

 http://thread.gmane.org/gmane.comp.version-control.git/204469

 (people involved in that thread Cc'ed).

I gave up on that thread because I did not have a proper environment
to further troubleshoot. Thanks Mike for giving me the clue about
real_path(). I traced link_alt_odb_entry() and it does this:

if (!is_absolute_path(entry)  relative_base) {
strbuf_addstr(pathbuf, real_path(relative_base));
strbuf_addch(pathbuf, '/');
}
strbuf_add(pathbuf, entry, len);
normalize_path_copy(pathbuf.buf, pathbuf.buf);

The culprit might be normalize_path_copy (I don't have Windows to test
whether real_path() strips the leading slash in //server/somewhere). I
tested normalize_path_copy() separately and it does strip leading
slashes, so it needs to be fixed. Something like maybe:

diff --git a/path.c b/path.c
index 66acd24..ad2881c 100644
--- a/path.c
+++ b/path.c
@@ -503,6 +503,10 @@ int normalize_path_copy(char *dst, const char *src)
*dst++ = *src++;
*dst++ = *src++;
}
+#ifdef WIN32
+   else if (src[0] == '/'  src[1] == '/')
+   *dst++ = *src++;
+#endif
dst0 = dst;

if (is_dir_sep(*src)) {

I'm not suitable for following this up as I don't have environment to
test it. Maybe some msysgit guy want to step in?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] t0000: verify that real_path() removes extra slashes

2012-09-05 Thread Nguyen Thai Ngoc Duy
On Wed, Sep 5, 2012 at 6:05 PM, Orgad and Raizel Shaneh
org...@gmail.com wrote:
 This seems to fix it. Just change src[x] == '/' to is_dir_sep(src[x]).

So you are able to test the changes. Would you mind submitting a patch
so other users benefit it as well?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] t0000: verify that real_path() removes extra slashes

2012-09-05 Thread Junio C Hamano
Nguyen Thai Ngoc Duy pclo...@gmail.com writes:

 diff --git a/path.c b/path.c
 index 66acd24..ad2881c 100644
 --- a/path.c
 +++ b/path.c
 @@ -503,6 +503,10 @@ int normalize_path_copy(char *dst, const char *src)
 *dst++ = *src++;
 *dst++ = *src++;
 }
 +#ifdef WIN32
 +   else if (src[0] == '/'  src[1] == '/')
 +   *dst++ = *src++;
 +#endif

The two-byte copy we see above the context is conditional on a nice
abstraction has_dos_drive_prefix() so that we do not have to
suffer from these ugly ifdefs.  Could we do something similar?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 7/7] t0000: verify that real_path() removes extra slashes

2012-09-04 Thread mhagger
From: Michael Haggerty mhag...@alum.mit.edu

These tests already pass, but make sure they don't break in the
future.

Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
---

It would be great if somebody would check whether these tests pass on
Windows, and if not, give me a tip about how to fix them.

 t/t-basic.sh | 11 +++
 1 file changed, 11 insertions(+)

diff --git a/t/t-basic.sh b/t/t-basic.sh
index d929578..3c75e97 100755
--- a/t/t-basic.sh
+++ b/t/t-basic.sh
@@ -468,6 +468,17 @@ test_expect_success 'real path works on absolute paths' '
test $d/$nopath = $(test-path-utils real_path $d/$nopath)
 '
 
+test_expect_success 'real path removes extra slashes' '
+   nopath=hopefully-absent-path 
+   test / = $(test-path-utils real_path ///) 
+   test /$nopath = $(test-path-utils real_path ///$nopath) 
+   # We need an existing top-level directory to use in the
+   # remaining tests.  Use the top-level ancestor of $(pwd):
+   d=$(pwd -P | sed -e s|^\(/[^/]*\)/.*|\1|) 
+   test $d = $(test-path-utils real_path //$d///) 
+   test $d/$nopath = $(test-path-utils real_path //$d///$nopath)
+'
+
 test_expect_success SYMLINKS 'real path works on symlinks' '
mkdir first 
ln -s ../.git first/.git 
-- 
1.7.11.3

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 7/7] t0000: verify that real_path() removes extra slashes

2012-09-04 Thread Junio C Hamano
mhag...@alum.mit.edu writes:

 From: Michael Haggerty mhag...@alum.mit.edu

 These tests already pass, but make sure they don't break in the
 future.

 Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
 ---

 It would be great if somebody would check whether these tests pass on
 Windows, and if not, give me a tip about how to fix them.

I think this (perhaps unwarranted) removal of the double leading
slashes is the root cause of

http://thread.gmane.org/gmane.comp.version-control.git/204469

(people involved in that thread Cc'ed).

  t/t-basic.sh | 11 +++
  1 file changed, 11 insertions(+)

 diff --git a/t/t-basic.sh b/t/t-basic.sh
 index d929578..3c75e97 100755
 --- a/t/t-basic.sh
 +++ b/t/t-basic.sh
 @@ -468,6 +468,17 @@ test_expect_success 'real path works on absolute paths' '
   test $d/$nopath = $(test-path-utils real_path $d/$nopath)
  '
  
 +test_expect_success 'real path removes extra slashes' '
 + nopath=hopefully-absent-path 
 + test / = $(test-path-utils real_path ///) 
 + test /$nopath = $(test-path-utils real_path ///$nopath) 
 + # We need an existing top-level directory to use in the
 + # remaining tests.  Use the top-level ancestor of $(pwd):
 + d=$(pwd -P | sed -e s|^\(/[^/]*\)/.*|\1|) 
 + test $d = $(test-path-utils real_path //$d///) 
 + test $d/$nopath = $(test-path-utils real_path //$d///$nopath)
 +'
 +
  test_expect_success SYMLINKS 'real path works on symlinks' '
   mkdir first 
   ln -s ../.git first/.git 
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html