[PATCH v8 4/5] pretty: Add failing tests: --format output should honor logOutputEncoding

2013-07-01 Alexey Shumkin
One can set an alias
$ git config alias.lg "log --graph --pretty=format:'%Cred%h%Creset
-%C(yellow)%d%Creset %s %Cgreen(%cd) %C(bold blue)<%an>%Creset'
--abbrev-commit --date=local"

to see the log as a pretty tree (like *gitk* but in a terminal).

However, log messages stored in an i18n.commitEncoding that differs from the
terminal encoding are shown corrupted, even when i18n.logOutputEncoding
matches the terminal encoding (e.g. log messages committed on a Cygwin box
with Windows-1251 encoding and viewed on a Linux box with UTF-8 encoding, or
vice versa).
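
To make the corruption concrete, here is a minimal sketch comparing the bytes of one sample word in two encodings. The UTF-8 octal escapes are the ones used in the tests of this patch; the ISO-8859-1 byte (octal 374, i.e. 0xFC for "ü") is a property of that encoding and does not appear in the patch, and the variable names are illustrative only.

```shell
#!/bin/sh
# "hinzugefügt" as UTF-8 bytes (same octal escapes as in the tests).
utf8=$(printf 'hinzugef\303\274gt')
# The same word as ISO-8859-1 bytes: "ü" is the single byte 0xFC (octal 374).
latin1=$(printf 'hinzugef\374gt')

# Count bytes, not characters; $(( )) strips any padding wc may emit.
utf8_len=$(( $(printf '%s' "$utf8" | wc -c) ))
latin1_len=$(( $(printf '%s' "$latin1" | wc -c) ))

echo "UTF-8: $utf8_len bytes, ISO-8859-1: $latin1_len bytes"
```

The byte sequences differ, so a message stored in one encoding and written unconverted to a terminal that expects the other cannot render the non-ASCII character correctly.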

As a simplified example, the following two commands are expected
to give the same output on a terminal:

$ git log --oneline --no-color
$ git log --pretty=format:'%h %s'

However, the former pays attention to i18n.logOutputEncoding
configuration, while the latter does not when it formats "%s".

The same corruption occurs with
$ git diff --submodule=log
and
$ git rev-list --pretty=format:%s HEAD
and
$ git reset --hard

This patch adds failing tests for these cases; the next patch fixes them.

Signed-off-by: Alexey Shumkin 
---
 t/t4041-diff-submodule-option.sh |  35 ++
 t/t4205-log-pretty-formats.sh    | 146 ---
 t/t6006-rev-list-format.sh       |  74 +++-
 t/t7102-reset.sh                 |  31 -
 4 files changed, 198 insertions(+), 88 deletions(-)

diff --git a/t/t4041-diff-submodule-option.sh b/t/t4041-diff-submodule-option.sh
index 32d4a60..d300d0c 100755
--- a/t/t4041-diff-submodule-option.sh
+++ b/t/t4041-diff-submodule-option.sh
@@ -1,6 +1,7 @@
 #!/bin/sh
 #
 # Copyright (c) 2009 Jens Lehmann, based on t7401 by Ping Yin
+# Copyright (c) 2013 Alexey Shumkin (+ non-UTF-8 commit encoding tests)
 #
 
 test_description='Support for verbose submodule differences in git diff
@@ -10,6 +11,9 @@ This test tries to verify the sanity of the --submodule option of git diff.
 
 . ./test-lib.sh
 
+# String "added" in German (translated with Google Translate), encoded in UTF-8,
+# used in sample commit log messages in add_file() function below.
+added=$(printf "hinzugef\303\274gt")
 add_file () {
(
cd "$1" &&
@@ -19,7 +23,8 @@ add_file () {
echo "$name" >"$name" &&
git add "$name" &&
test_tick &&
-   git commit -m "Add $name" || exit
+   msg_added_iso88591=$(echo "Add $name ($added $name)" | iconv -f utf-8 -t iso8859-1) &&
+   git -c 'i18n.commitEncoding=iso8859-1' commit -m "$msg_added_iso88591"
done >/dev/null &&
git rev-parse --short --verify HEAD
)
@@ -89,29 +94,29 @@ test_expect_success 'diff.submodule does not affect plumbing' '
 commit_file sm1 &&
 head2=$(add_file sm1 foo3)
 
-test_expect_success 'modified submodule(forward)' '
+test_expect_failure 'modified submodule(forward)' '
git diff-index -p --submodule=log HEAD >actual &&
cat >expected <<-EOF &&
Submodule sm1 $head1..$head2:
- > Add foo3
+ > Add foo3 ($added foo3)
EOF
test_cmp expected actual
 '
 
-test_expect_success 'modified submodule(forward)' '
+test_expect_failure 'modified submodule(forward)' '
git diff --submodule=log >actual &&
cat >expected <<-EOF &&
Submodule sm1 $head1..$head2:
- > Add foo3
+ > Add foo3 ($added foo3)
EOF
test_cmp expected actual
 '
 
-test_expect_success 'modified submodule(forward) --submodule' '
+test_expect_failure 'modified submodule(forward) --submodule' '
git diff --submodule >actual &&
cat >expected <<-EOF &&
Submodule sm1 $head1..$head2:
- > Add foo3
+ > Add foo3 ($added foo3)
EOF
test_cmp expected actual
 '
@@ -138,25 +143,25 @@ head3=$(
git rev-parse --short --verify HEAD
 )
 
-test_expect_success 'modified submodule(backward)' '
+test_expect_failure 'modified submodule(backward)' '
git diff-index -p --submodule=log HEAD >actual &&
cat >expected <<-EOF &&
Submodule sm1 $head2..$head3 (rewind):
- < Add foo3
- < Add foo2
+ < Add foo3 ($added foo3)
+ < Add foo2 ($added foo2)
EOF
test_cmp expected actual
 '
 
 head4=$(add_file sm1 foo4 foo5)
-test_expect_success 'modified submodule(backward and forward)' '
+test_expect_failure 'modified submodule(backward and forward)' '
git diff-index -p --submodule=log HEAD >actual &&
cat >expected <<-EOF &&
Submodule sm1 $head2...$head4:
- > Add foo5
- > Add foo4
- < Add foo3
- < Add foo2
+ > Add foo5 ($added foo5)
+ > Add foo4 ($added foo4)
+ < Add foo3 ($added foo3)
+ < Add foo2 ($added foo2)
EOF
test_cmp expected actual
 '
diff --git a/t/t4205-l

Re: [PATCH v8 4/5] pretty: Add failing tests: --format output should honor logOutputEncoding

2013-07-01 Johannes Sixt
On 7/2/2013 1:19, Alexey Shumkin wrote:
> +commit_msg() {
> + # String "initial. initial" partly in German
> +   # (translated with Google Translate),
> + # encoded in UTF-8, used as a commit log message below.
> + msg=$(printf "initial. anf\303\244nglich")
> + if test -n "$1"
> + then
> + msg=$(echo $msg | iconv -f utf-8 -t $1)
> + fi
> + if test -n "$2" -a -n "$3"
> + then
> + # cut string, replace cut part with two dots
> + # $2 - chars count from the beginning of the string
> + # $3 - "trailing" chars
> + # LC_ALL is set to make `sed` interpret "." as a UTF-8 char not a byte
> + # as it does with C locale
> + msg=$(echo $msg | LC_ALL=en_US.UTF-8 sed -e "s/^\(.\{$2\}\)$3/\1../")
> + fi
> + echo $msg
> +}

Ignoring failure reports is not very helpful. Anyway, here is how I would
adjust this patch. (There are trivial conflicts when 5/5 is applied on
top.) Notice the comment I added in test case 'left alignment formatting
with ltrunc'.
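
The reworked helper in the diff below keeps only the part of the message that the caller's \( \) group matches. Here is a minimal sketch of how it might be called outside the test suite. The two regular expressions are illustrative assumptions, not the ones used in the tests; they touch only the ASCII prefix so that the result should not depend on the locale sed runs in.

```shell
#!/bin/sh
# Same message as in the patch: "initial. anfänglich", encoded in UTF-8.
initial_msg=$(printf "initial. anf\303\244nglich")

# Helper as written in the diff: keep the part of the message matched by \( \).
extract_msg() {
	echo "$initial_msg" | sed -e "s/$1/\1/"
}

# Hypothetical call: keep only the leading run of lowercase ASCII letters.
first_word=$(extract_msg "^\([a-z]*\).*")

# Hypothetical call: keep the first 8 characters, all of which are ASCII here.
prefix=$(extract_msg "^\(.\{8\}\).*")

echo "$first_word"
echo "$prefix"
```

Because the pattern is supplied by each caller, no fixed UTF-8 locale such as en_US.UTF-8 has to be forced on sed inside the helper.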

Signed-off-by: Johannes Sixt 

To be squashed into v8 4/5:

diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
index a23da67..ef3a226 100755
--- a/t/t4205-log-pretty-formats.sh
+++ b/t/t4205-log-pretty-formats.sh
@@ -7,25 +7,13 @@
 test_description='Test pretty formats'
 . ./test-lib.sh
 
-commit_msg() {
-   # String "initial. initial" partly in German
-   # (translated with Google Translate),
-   # encoded in UTF-8, used as a commit log message below.
-   msg=$(printf "initial. anf\303\244nglich")
-   if test -n "$1"
-   then
-   msg=$(echo $msg | iconv -f utf-8 -t $1)
-   fi
-   if test -n "$2" -a -n "$3"
-   then
-   # cut string, replace cut part with two dots
-   # $2 - chars count from the beginning of the string
-   # $3 - "trailing" chars
-   # LC_ALL is set to make `sed` interpret "." as a UTF-8 char not a byte
-   # as it does with C locale
-   msg=$(echo $msg | LC_ALL=en_US.UTF-8 sed -e "s/^\(.\{$2\}\)$3/\1../")
-   fi
-   echo $msg
+# String "initial. initial" partly in German encoded in UTF-8
+initial_msg=$(printf "initial. anf\303\244nglich")
+
+# extract part of the initial commit message
+# $1 - a RE with \( \) brackets that specify which part to keep
+extract_msg() {
+   echo "$initial_msg" | sed -e "s/$1/\1/"
 }
 
 test_expect_success 'set up basic repos' '
@@ -33,12 +21,11 @@ test_expect_success 'set up basic repos' '
>bar &&
git add foo &&
test_tick &&
-   git config i18n.commitEncoding iso8859-1 &&
-   git commit -m "$(commit_msg iso8859-1)" &&
+   test_config i18n.commitEncoding iso8859-1 &&
+   git commit -m "$(echo "$initial_msg" | iconv -f utf-8 -t iso8859-1)" &&
git add bar &&
test_tick &&
-   git commit -m "add bar" &&
-   git config --unset i18n.commitEncoding
+   git commit -m "add bar"
 '
 
 test_expect_success 'alias builtin format' '
@@ -63,10 +50,9 @@ test_expect_success 'alias user-defined format' '
 '
 
 test_expect_success 'alias user-defined tformat with %s (iso8859-1 encoding)' '
-   git config i18n.logOutputEncoding iso8859-1 &&
+   test_config i18n.logOutputEncoding iso8859-1 &&
git log --oneline >expected-s &&
git log --pretty="tformat:%h %s" >actual-s &&
-   git config --unset i18n.logOutputEncoding &&
test_cmp expected-s actual-s
 '
 
@@ -110,13 +96,13 @@ test_expect_success 'alias loop' '
 '
 
 test_expect_failure 'NUL separation' '
-   printf "add bar\0$(commit_msg)" >expected &&
+   printf "add bar\0$initial_msg" >expected &&
git log -z --pretty="format:%s" >actual &&
test_cmp expected actual
 '
 
 test_expect_failure 'NUL termination' '
-   printf "add bar\0$(commit_msg)\0" >expected &&
+   printf "add bar\0$initial_msg\0" >expected &&
git log -z --pretty="tformat:%s" >actual &&
test_cmp expected actual
 '
@@ -124,7 +110,7 @@ test_expect_failure 'NUL termination' '
 test_expect_failure 'NUL separation with --stat' '
stat0_part=$(git diff --stat HEAD^ HEAD) &&
stat1_part=$(git diff-tree --no-commit-id --stat --root HEAD^) &&
-   printf "add bar\n$stat0_part\n\0$(commit_msg)\n$stat1_part\n" >expected &&
+   printf "add bar\n$stat0_part\n\0$initial_msg\n$stat1_part\n" >expected &&
git log -z --stat --pretty="format:%s" >actual &&
test_i18ncmp expected actual
 '
@@ -132,7 +118,7 @@ test_expect_failure 'NUL separation with --stat' '
 test_expect_failure 'NUL termination with --stat' '
stat0_part=$(git diff --stat HEAD^ HEAD) &&
stat1_part=$(git diff-tree --no-commit-id --stat --root HEAD^) &&
-   printf "add bar\n$stat0_part\n\0$(commit_msg)\n$stat1_part\n0" >expected &&
+   printf "add bar\n$stat0_part\n\0$initial_msg\n$stat1_part\n0" >