Re: [PATCH v7] git-p4: improve path encoding verbose output

2015-09-15 Thread Luke Diamand

On 15/09/15 08:31, Luke Diamand wrote:
> On 14/09/15 18:10, larsxschnei...@gmail.com wrote:
>
> It would be better to query this once at startup. Otherwise we're
> potentially forking "git config" twice per file which on a large repo
> could become significant. Make it an instance variable perhaps?


This is of course complete nonsense since gitConfig caches its results!
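
For readers unfamiliar with the point about caching, here is a minimal sketch of why a cached lookup avoids repeated "git config" forks. It is not the actual gitConfig code from git-p4.py: the names cached_config, fake_read and _config_cache are hypothetical, and the reader function is a stand-in for running "git config".

```python
# Hypothetical sketch of a caching config lookup; not the real git-p4 code.
_config_cache = {}


def cached_config(key, read_value):
    """Return the value for key, calling read_value(key) only on first use."""
    if key not in _config_cache:
        _config_cache[key] = read_value(key)
    return _config_cache[key]


calls = []


def fake_read(key):
    # Stand-in for forking "git config <key>"; records every invocation.
    calls.append(key)
    return "cp1252"


# Two lookups per file, as in the patch, still reach the reader only once.
first = cached_config("git-p4.pathEncoding", fake_read)
second = cached_config("git-p4.pathEncoding", fake_read)
```

With a cache like this, the second lookup is a dictionary hit, so the per-file cost that the earlier comment worried about does not arise.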


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7] git-p4: improve path encoding verbose output

2015-09-15 Thread Luke Diamand

On 14/09/15 18:10, larsxschnei...@gmail.com wrote:
> From: Lars Schneider
>
> If a path with non-ASCII characters is detected then print always the

s/print always/print/

> encoding and the encoded string in verbose mode.
>
> Signed-off-by: Lars Schneider
> ---
>  git-p4.py | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index d45cf2b..da25d3f 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
>              text = regexp.sub(r'$\1$', text)
>              contents = [ text ]
>
> -        if gitConfig("git-p4.pathEncoding"):
> -            relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
> -        elif self.verbose:
> -            try:
> -                relPath.decode('ascii')
> -            except:
> -                print (
> -                    "Path with Non-ASCII characters detected and no path encoding defined. "
> -                    "Please check the encoding: %s" % relPath
> -                )
> +        try:
> +            relPath.decode('ascii')
> +        except:
> +            encoding = 'utf8'
> +            if gitConfig('git-p4.pathEncoding'):
> +                encoding = gitConfig('git-p4.pathEncoding')

It would be better to query this once at startup. Otherwise we're
potentially forking "git config" twice per file which on a large repo
could become significant. Make it an instance variable perhaps?

> +                relPath = relPath.decode(encoding).encode('utf8', 'replace')
> +            if self.verbose:
> +                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath)
>
>          self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))






Re: [PATCH v7] git-p4: improve path encoding verbose output

2015-09-15 Thread Lars Schneider

On 15 Sep 2015, at 09:31, Luke Diamand  wrote:

> On 14/09/15 18:10, larsxschnei...@gmail.com wrote:
>> From: Lars Schneider 
>> 
>> If a path with non-ASCII characters is detected then print always the
> 
> s/print always/print/
I will fix it.

> 
> 
>> encoding and the encoded string in verbose mode.
>> 
>> Signed-off-by: Lars Schneider 
>> ---
>>  git-p4.py | 19 +++++++++----------
>>  1 file changed, 9 insertions(+), 10 deletions(-)
>>
>> diff --git a/git-p4.py b/git-p4.py
>> index d45cf2b..da25d3f 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
>>              text = regexp.sub(r'$\1$', text)
>>              contents = [ text ]
>>
>> -        if gitConfig("git-p4.pathEncoding"):
>> -            relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
>> -        elif self.verbose:
>> -            try:
>> -                relPath.decode('ascii')
>> -            except:
>> -                print (
>> -                    "Path with Non-ASCII characters detected and no path encoding defined. "
>> -                    "Please check the encoding: %s" % relPath
>> -                )
>> +        try:
>> +            relPath.decode('ascii')
>> +        except:
>> +            encoding = 'utf8'
>> +            if gitConfig('git-p4.pathEncoding'):
>> +                encoding = gitConfig('git-p4.pathEncoding')
>
> It would be better to query this once at startup. Otherwise we're potentially
> forking "git config" twice per file which on a large repo could become
> significant. Make it an instance variable perhaps?
solved in other email

>
>> +                relPath = relPath.decode(encoding).encode('utf8', 'replace')
>> +            if self.verbose:
>> +                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath)
>>
>>          self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))

Thanks!


Re: [PATCH v7] git-p4: improve path encoding verbose output

2015-09-15 Thread Lars Schneider

On 14 Sep 2015, at 20:40, Junio C Hamano  wrote:

> larsxschnei...@gmail.com writes:
> 
>> From: Lars Schneider 
>> 
>> If a path with non-ASCII characters is detected then print always the
>> encoding and the encoded string in verbose mode.
> 
> Earlier if the user tells us that s/he knows what she is doing
> by setting the configuration, we just followed the instruction
> without complaining or notifying.  The differences in this version
> are
> 
> (1) if the path is in ASCII, the configuration is not even
> consulted, and we didn't do any path munging.
Correct!


> (2) for a non-ASCII path, even if the user tells us that s/he knows
> what she is doing, we notify what we did under "--verbose"
> mode.
Correct!


> I think (1) is a definite improvement, but it is not immediately
> obvious why (2) is an improvement.  It is clearly a good thing to
> let the user know when we munged the path without being told, but
> when the configuration is given, it can be argued both ways.  It may
> be a good thing to reassure that the configuration is kicking in, or
> it may be a needless noise to tell the user that we did what we were
> told to do.
I get your point. However, changing file names in a repository is a pretty 
significant action, and therefore I would prefer to tell the user about it 
explicitly. Some encodings differ only slightly, so I would like to have an easy 
way to look at all the changed paths to ensure I picked the right encoding (e.g. 
grep “Path with non-ASCII characters detected”). I also assume the user is OK 
with noise since s/he enabled “verbose” mode :-)


> In any case, I suspect that the call to decode-encode to munge
> relPath is indented one level too deep in this patch.  You would
> want to use the configured value if exists and utf8 if there is no
> configuration, but in either case you would want to munge relPath
> when it does not decode as ASCII, no?
Good catch! It works with the indented code too because UTF8 is the default 
encoding for relPath later on. However, with your suggestion the code is more 
explicit. I will change it in the next roll.

Thanks!

> 
>> Signed-off-by: Lars Schneider 
>> ---
>>  git-p4.py | 19 +++++++++----------
>>  1 file changed, 9 insertions(+), 10 deletions(-)
>>
>> diff --git a/git-p4.py b/git-p4.py
>> index d45cf2b..da25d3f 100755
>> --- a/git-p4.py
>> +++ b/git-p4.py
>> @@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
>>              text = regexp.sub(r'$\1$', text)
>>              contents = [ text ]
>>
>> -        if gitConfig("git-p4.pathEncoding"):
>> -            relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
>> -        elif self.verbose:
>> -            try:
>> -                relPath.decode('ascii')
>> -            except:
>> -                print (
>> -                    "Path with Non-ASCII characters detected and no path encoding defined. "
>> -                    "Please check the encoding: %s" % relPath
>> -                )
>> +        try:
>> +            relPath.decode('ascii')
>> +        except:
>> +            encoding = 'utf8'
>> +            if gitConfig('git-p4.pathEncoding'):
>> +                encoding = gitConfig('git-p4.pathEncoding')
>> +                relPath = relPath.decode(encoding).encode('utf8', 'replace')
>> +            if self.verbose:
>> +                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath)
>>
>>          self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))



[PATCH v7] git-p4: improve path encoding verbose output

2015-09-14 Thread larsxschneider
From: Lars Schneider 

If a path with non-ASCII characters is detected then print always the
encoding and the encoded string in verbose mode.

Signed-off-by: Lars Schneider 
---
 git-p4.py | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/git-p4.py b/git-p4.py
index d45cf2b..da25d3f 100755
--- a/git-p4.py
+++ b/git-p4.py
@@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
             text = regexp.sub(r'$\1$', text)
             contents = [ text ]
 
-        if gitConfig("git-p4.pathEncoding"):
-            relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
-        elif self.verbose:
-            try:
-                relPath.decode('ascii')
-            except:
-                print (
-                    "Path with Non-ASCII characters detected and no path encoding defined. "
-                    "Please check the encoding: %s" % relPath
-                )
+        try:
+            relPath.decode('ascii')
+        except:
+            encoding = 'utf8'
+            if gitConfig('git-p4.pathEncoding'):
+                encoding = gitConfig('git-p4.pathEncoding')
+                relPath = relPath.decode(encoding).encode('utf8', 'replace')
+            if self.verbose:
+                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath)
 
         self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))
 
-- 
2.5.1



Re: [PATCH v7] git-p4: improve path encoding verbose output

2015-09-14 Thread Junio C Hamano
larsxschnei...@gmail.com writes:

> From: Lars Schneider 
>
> If a path with non-ASCII characters is detected then print always the
> encoding and the encoded string in verbose mode.

Earlier if the user tells us that s/he knows what she is doing
by setting the configuration, we just followed the instruction
without complaining or notifying.  The differences in this version
are

 (1) if the path is in ASCII, the configuration is not even
 consulted, and we didn't do any path munging.

 (2) for a non-ASCII path, even if the user tells us that s/he knows
 what she is doing, we notify what we did under "--verbose"
 mode.

I think (1) is a definite improvement, but it is not immediately
obvious why (2) is an improvement.  It is clearly a good thing to
let the user know when we munged the path without being told, but
when the configuration is given, it can be argued both ways.  It may
be a good thing to reassure that the configuration is kicking in, or
it may be a needless noise to tell the user that we did what we were
told to do.

In any case, I suspect that the call to decode-encode to munge
relPath is indented one level too deep in this patch.  You would
want to use the configured value if exists and utf8 if there is no
configuration, but in either case you would want to munge relPath
when it does not decode as ASCII, no?
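
A rough sketch of the control flow described in the paragraph above, written as a standalone Python 3 function rather than as the Python 2 code in git-p4.py. The function name to_utf8_path and the configured_encoding parameter are hypothetical; the parameter stands in for the value of git-p4.pathEncoding.

```python
def to_utf8_path(rel_path, configured_encoding=None):
    """Re-encode a bytes path to UTF-8 if it is not plain ASCII.

    Returns (path_bytes, encoding_used); encoding_used is None when the
    path was ASCII and left untouched.
    """
    try:
        rel_path.decode('ascii')
    except UnicodeDecodeError:
        # Use the configured value if it exists, utf8 otherwise ...
        encoding = configured_encoding or 'utf8'
        # ... but munge the path in either case (one level up from the "if").
        # As in the patch, bytes that are invalid in the chosen encoding
        # would raise here; the sketch does not handle that.
        return rel_path.decode(encoding).encode('utf8', 'replace'), encoding
    return rel_path, None


ascii_result = to_utf8_path(b'dir/file.txt')
cp1252_result = to_utf8_path('ä.txt'.encode('cp1252'), 'cp1252')
default_result = to_utf8_path('ä.txt'.encode('utf8'))
```

The returned encoding name is what a caller could print in verbose mode; an ASCII path never consults the configuration at all.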

> Signed-off-by: Lars Schneider 
> ---
>  git-p4.py | 19 +++++++++----------
>  1 file changed, 9 insertions(+), 10 deletions(-)
>
> diff --git a/git-p4.py b/git-p4.py
> index d45cf2b..da25d3f 100755
> --- a/git-p4.py
> +++ b/git-p4.py
> @@ -2220,16 +2220,15 @@ class P4Sync(Command, P4UserMap):
>              text = regexp.sub(r'$\1$', text)
>              contents = [ text ]
>
> -        if gitConfig("git-p4.pathEncoding"):
> -            relPath = relPath.decode(gitConfig("git-p4.pathEncoding")).encode('utf8', 'replace')
> -        elif self.verbose:
> -            try:
> -                relPath.decode('ascii')
> -            except:
> -                print (
> -                    "Path with Non-ASCII characters detected and no path encoding defined. "
> -                    "Please check the encoding: %s" % relPath
> -                )
> +        try:
> +            relPath.decode('ascii')
> +        except:
> +            encoding = 'utf8'
> +            if gitConfig('git-p4.pathEncoding'):
> +                encoding = gitConfig('git-p4.pathEncoding')
> +                relPath = relPath.decode(encoding).encode('utf8', 'replace')
> +            if self.verbose:
> +                print 'Path with non-ASCII characters detected. Used %s to encode: %s ' % (encoding, relPath)
>
>          self.gitStream.write("M %s inline %s\n" % (git_mode, relPath))