Re: Git messes up 'ø' character

2015-01-22 Thread Michael J Gruber
Noralf Trønnes wrote on 20.01.2015 at 23:26:
 On 20.01.2015 23:18, Nico Williams wrote:
 On Tue, Jan 20, 2015 at 10:38:40PM +0100, Noralf Trønnes wrote:
 Yes:
 $ echo Noralf Trønnes | xxd
 000: 4e6f 7261 6c66 2054 72f8 6e6e 6573 0a  Noralf Tr.nnes.

 Is there a command I can run that shows that I'm using ISO-8859-1 ?
 I need something to google with, my previous search only gave locale
 stuff, which seems fine.
 The locale(1) command tells you what your locale is set to, but it
 doesn't say anything about your input method -- it only tells you what
 your shell and commands started from it expect for input and what they
 should produce for output.

 The input method will generally be part of your windowing environment,
 for which you'll have to search how to check/configure your OS
 (sometimes it can be set on a per-window basis, sometimes it's a global
 setting).

 Even if the windowing environment is set to UTF-8, your terminal
 emulator might be set to ISO-8859-something, so check the terminal
 emulator (e.g., rxvt, Terminator, GNOME Terminal, PuTTY, ...).
 
 I use putty which was set to ISO-8859-1. Changing this to UTF-8 gave me 
 the correct result:
 $ echo Noralf Trønnes | xxd
 000: 4e6f 7261 6c66 2054 72c3 b86e 6e65 730a  Noralf Tr..nnes.
 
 Thank you all for helping me!
 

You can also check the encoding of your config file with

file .git/config

or :set fileencoding in vim. :set fileencoding=utf8 would allow you
to convert it easily.

(This assumes that the file does not mix encodings.)
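
If you would rather not open an editor, iconv(1) can do the same conversion. A minimal sketch, assuming the file is entirely ISO-8859-1 and that iconv and mktemp are available; the temporary files and variable names are hypothetical stand-ins for a real config file:

```shell
# Work on throwaway files so no real config file is touched.
latin1_file=$(mktemp)
utf8_file=$(mktemp)

# Write a name with ø as the single Latin-1 byte 0xF8 (octal 370).
printf 'name = Noralf Tr\370nnes\n' > "$latin1_file"

# Convert ISO-8859-1 to UTF-8; ø becomes the two bytes 0xC3 0xB8.
iconv -f ISO-8859-1 -t UTF-8 "$latin1_file" > "$utf8_file"

# Hex-dump both files as compact byte strings for comparison.
before_hex=$(od -An -tx1 "$latin1_file" | tr -d ' \n')
after_hex=$(od -An -tx1 "$utf8_file" | tr -d ' \n')
echo "before: $before_hex"
echo "after:  $after_hex"

rm -f "$latin1_file" "$utf8_file"
```

For a real file you would write the converted output to a new file, check it, and only then replace the original.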

Michael
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Git messes up 'ø' character

2015-01-20 Thread Torsten Bögershausen
On 2015-01-20 20.46, Noralf Trønnes wrote:
Could it be that your ø is not encoded as UTF-8,
but as ISO-8859-15 (or so)?

 $ git log -1
 commit b2a4f6abdb097c4dc092b56995a2af8e42fbea79
 Author: Noralf Tr<F8>nnes no...@tronnes.org
What does 
git config -l | grep Noralf | xxd
say ?



Re: Git messes up 'ø' character

2015-01-20 Thread Noralf Trønnes

On 20.01.2015 21:07, Torsten Bögershausen wrote:

On 2015-01-20 20.46, Noralf Trønnes wrote:
Could it be that your ø is not encoded as UTF-8,
but as ISO-8859-15 (or so)?


$ git log -1
commit b2a4f6abdb097c4dc092b56995a2af8e42fbea79
Author: Noralf Tr<F8>nnes no...@tronnes.org

What does
git config -l | grep Noralf | xxd
say?


$ git config -l | grep Noralf | xxd
000: 7573 6572 2e6e 616d 653d 4e6f 7261 6c66  user.name=Noralf
010: 2054 72f8 6e6e 6573 0a   Tr.nnes.

$ file ~/.gitconfig
/home/pi/.gitconfig: ISO-8859 text



Re: Git messes up 'ø' character

2015-01-20 Thread Ævar Arnfjörð Bjarmason
On Tue, Jan 20, 2015 at 10:23 PM, Noralf Trønnes no...@tronnes.org wrote:
 On 20.01.2015 21:45, Ævar Arnfjörð Bjarmason wrote:

 On Tue, Jan 20, 2015 at 9:17 PM, Noralf Trønnes no...@tronnes.org wrote:

 On 20.01.2015 21:07, Torsten Bögershausen wrote:

 On 2015-01-20 20.46, Noralf Trønnes wrote:
 Could it be that your ø is not encoded as UTF-8,
 but as ISO-8859-15 (or so)?

 $ git log -1
 commit b2a4f6abdb097c4dc092b56995a2af8e42fbea79
 Author: Noralf Tr<F8>nnes no...@tronnes.org

 What does
 git config -l | grep Noralf | xxd
 say?

 $ git config -l | grep Noralf | xxd
 000: 7573 6572 2e6e 616d 653d 4e6f 7261 6c66  user.name=Noralf
 010: 2054 72f8 6e6e 6573 0a   Tr.nnes.

 $ file ~/.gitconfig
 /home/pi/.gitconfig: ISO-8859 text

 What's happened here is that:

   1. You've authored your commit in ISO-8859-1
   2. Git itself has no place for the encoding of the author name in the
 commit object format
   3. git-format-patch has a --compose-encoding which I think would sort
 this out if you set it to ISO-8859-1, but it defaults to UTF-8
   4. Your patch is actually an ISO-8859-1 byte sequence, but is
 advertised as UTF-8
   5. You end up with a screwed-up commit

 You could work around this, but I suggest just joining the 21st
 century and working exclusively in UTF-8, it makes things much easier,
 speaking as someone with 3x more non-ASCII characters in his name
 than you :)


 Ok, then the question is: How do I switch to UTF-8?

 To me it seems I'm already using it:
 $ locale charmap
 UTF-8

Your .gitconfig has an ISO-8859-1 string, from an earlier mail of yours:

 $ git config -l | grep Noralf | xxd
 000: 7573 6572 2e6e 616d 653d 4e6f 7261 6c66  user.name=Noralf
 010: 2054 72f8 6e6e 6573 0a   Tr.nnes.

On a system configured for UTF-8 this would be:

$ echo Noralf Trønnes | xxd
000: 4e6f 7261 6c66 2054 72c3 b86e 6e65 730a  Noralf Tr..nnes.

Note the f8 vs. c3 b8.
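
The difference can be reproduced without relying on how your terminal is configured. A small sketch, assuming iconv and od are installed; the helper name `hex` is made up for this example:

```shell
# Hex-dump stdin as a compact string of bytes.
hex() { od -An -tx1 | tr -d ' \n'; }

# ø in UTF-8 is the two bytes C3 B8 (octal 303 270).
utf8_hex=$(printf '\303\270' | hex)

# Re-encoding those bytes as ISO-8859-1 yields the single byte F8.
latin1_hex=$(printf '\303\270' | iconv -f UTF-8 -t ISO-8859-1 | hex)

echo "UTF-8:      $utf8_hex"
echo "ISO-8859-1: $latin1_hex"
```

The bytes are written as octal escapes on purpose, so the result does not depend on the encoding of the terminal or of the script file.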


Re: Git messes up 'ø' character

2015-01-20 Thread Ævar Arnfjörð Bjarmason
On Tue, Jan 20, 2015 at 10:38 PM, Noralf Trønnes no...@tronnes.org wrote:
 On 20.01.2015 22:26, Ævar Arnfjörð Bjarmason wrote:

 On Tue, Jan 20, 2015 at 10:23 PM, Noralf Trønnes no...@tronnes.org
 wrote:

 On 20.01.2015 21:45, Ævar Arnfjörð Bjarmason wrote:

 On Tue, Jan 20, 2015 at 9:17 PM, Noralf Trønnes no...@tronnes.org
 wrote:

 On 20.01.2015 21:07, Torsten Bögershausen wrote:

 On 2015-01-20 20.46, Noralf Trønnes wrote:
 Could it be that your ø is not encoded as UTF-8,
 but as ISO-8859-15 (or so)?

 $ git log -1
 commit b2a4f6abdb097c4dc092b56995a2af8e42fbea79
 Author: Noralf Tr<F8>nnes no...@tronnes.org

 What does
 git config -l | grep Noralf | xxd
 say?

 $ git config -l | grep Noralf | xxd
 000: 7573 6572 2e6e 616d 653d 4e6f 7261 6c66  user.name=Noralf
 010: 2054 72f8 6e6e 6573 0a   Tr.nnes.

 $ file ~/.gitconfig
 /home/pi/.gitconfig: ISO-8859 text

 What's happened here is that:

1. You've authored your commit in ISO-8859-1
2. Git itself has no place for the encoding of the author name in the
 commit object format
3. git-format-patch has a --compose-encoding which I think would sort
 this out if you set it to ISO-8859-1, but it defaults to UTF-8
4. Your patch is actually an ISO-8859-1 byte sequence, but is
 advertised as UTF-8
5. You end up with a screwed-up commit

 You could work around this, but I suggest just joining the 21st
 century and working exclusively in UTF-8, it makes things much easier,
 speaking as someone with 3x more non-ASCII characters in his name
 than you :)

 Ok, then the question is: How do I switch to UTF-8?

 To me it seems I'm already using it:
 $ locale charmap
 UTF-8

 Your .gitconfig has an ISO-8859-1 string, from an earlier mail of yours:

 $ git config -l | grep Noralf | xxd
 000: 7573 6572 2e6e 616d 653d 4e6f 7261 6c66  user.name=Noralf
 010: 2054 72f8 6e6e 6573 0a   Tr.nnes.

 On a system configured for UTF-8 this would be:

 $ echo Noralf Trønnes | xxd
 000: 4e6f 7261 6c66 2054 72c3 b86e 6e65 730a  Noralf Tr..nnes.

 Note the f8 vs. c3 b8.


 Yes:
 $ echo Noralf Trønnes | xxd
 000: 4e6f 7261 6c66 2054 72f8 6e6e 6573 0a  Noralf Tr.nnes.

 Is there a command I can run that shows that I'm using ISO-8859-1 ?
 I need something to google with, my previous search only gave locale stuff,
 which seems fine.

What does this give you? Here is my output, which is UTF-8:

$ echo git commit --author=Noralf Trønnes no...@tronnes.org | xxd
000: 6769 7420 636f 6d6d 6974 202d 2d61 7574  git commit --aut
010: 686f 723d 4e6f 7261 6c66 2054 72c3 b86e  hor=Noralf Tr..n
020: 6e65 7320 3c6e 6f74 726f 4074 726f 6e6e  nes <notro@tronn
030: 6573 2e6f 7267 3e0a  es.org>.

To see whether you're using UTF-8, look at the bytes emitted for the
non-ASCII characters you're using and check whether they're valid UTF-8.
E.g. you can check this out:
http://en.wikipedia.org/wiki/%C3%98#Computers

That shows that ø is C3 B8 in UTF-8 but F8 in Latin-1. You're emitting
F8; I'm emitting C3 B8.
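
One way to automate that check is to ask iconv(1) to decode the bytes as UTF-8 and see whether it complains. A rough sketch, assuming an iconv that exits non-zero on an invalid input sequence (glibc's does); the function name `is_valid_utf8` is made up for this example:

```shell
# Succeeds if stdin is well-formed UTF-8, fails otherwise.
# iconv exits non-zero when it meets an invalid input sequence.
is_valid_utf8() { iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1; }

# "Trønnes" with ø as the UTF-8 bytes C3 B8 (octal 303 270).
if printf 'Tr\303\270nnes' | is_valid_utf8; then utf8_ok=yes; else utf8_ok=no; fi

# "Trønnes" with ø as the lone Latin-1 byte F8 (octal 370).
if printf 'Tr\370nnes' | is_valid_utf8; then latin1_ok=yes; else latin1_ok=no; fi

echo "C3 B8 accepted as UTF-8: $utf8_ok"
echo "F8 accepted as UTF-8:    $latin1_ok"
```

To test real input, pipe it in instead, e.g. `git config -l | is_valid_utf8 && echo ok`.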


Re: Git messes up 'ø' character

2015-01-20 Thread Greg Kroah-Hartman
On Tue, Jan 20, 2015 at 09:45:46PM +0100, Ævar Arnfjörð Bjarmason wrote:
 On Tue, Jan 20, 2015 at 9:17 PM, Noralf Trønnes no...@tronnes.org wrote:
  On 20.01.2015 21:07, Torsten Bögershausen wrote:
 
  On 2015-01-20 20.46, Noralf Trønnes wrote:
  Could it be that your ø is not encoded as UTF-8,
  but as ISO-8859-15 (or so)?
 
  $ git log -1
  commit b2a4f6abdb097c4dc092b56995a2af8e42fbea79
  Author: Noralf Tr<F8>nnes no...@tronnes.org
 
  What does
  git config -l | grep Noralf | xxd
  say?
 
  $ git config -l | grep Noralf | xxd
  000: 7573 6572 2e6e 616d 653d 4e6f 7261 6c66  user.name=Noralf
  010: 2054 72f8 6e6e 6573 0a   Tr.nnes.
 
  $ file ~/.gitconfig
  /home/pi/.gitconfig: ISO-8859 text
 
 What's happened here is that:
 
  1. You've authored your commit in ISO-8859-1
  2. Git itself has no place for the encoding of the author name in the
 commit object format
  3. git-format-patch has a --compose-encoding which I think would sort
 this out if you set it to ISO-8859-1, but it defaults to UTF-8
  4. Your patch is actually an ISO-8859-1 byte sequence, but is
 advertised as UTF-8
  5. You end up with a screwed-up commit
 
 You could work around this, but I suggest just joining the 21st
 century and working exclusively in UTF-8, it makes things much easier,
 speaking as someone with 3x more non-ASCII characters in his name
 than you :)

So how exactly do you fix this using UTF-8?  Git is exporting a UTF-8
From: line so it thinks the character is correct, but it's not
creating something properly here.

confused,

greg k-h


Re: Git messes up 'ø' character

2015-01-20 Thread Noralf Trønnes

On 20.01.2015 22:26, Ævar Arnfjörð Bjarmason wrote:

On Tue, Jan 20, 2015 at 10:23 PM, Noralf Trønnes no...@tronnes.org wrote:

On 20.01.2015 21:45, Ævar Arnfjörð Bjarmason wrote:


On Tue, Jan 20, 2015 at 9:17 PM, Noralf Trønnes no...@tronnes.org wrote:

On 20.01.2015 21:07, Torsten Bögershausen wrote:

On 2015-01-20 20.46, Noralf Trønnes wrote:
Could it be that your ø is not encoded as UTF-8,
but as ISO-8859-15 (or so)?


$ git log -1
commit b2a4f6abdb097c4dc092b56995a2af8e42fbea79
Author: Noralf Tr<F8>nnes no...@tronnes.org

What does
git config -l | grep Noralf | xxd
say?


$ git config -l | grep Noralf | xxd
000: 7573 6572 2e6e 616d 653d 4e6f 7261 6c66  user.name=Noralf
010: 2054 72f8 6e6e 6573 0a   Tr.nnes.

$ file ~/.gitconfig
/home/pi/.gitconfig: ISO-8859 text

What's happened here is that:

   1. You've authored your commit in ISO-8859-1
   2. Git itself has no place for the encoding of the author name in the
commit object format
   3. git-format-patch has a --compose-encoding which I think would sort
this out if you set it to ISO-8859-1, but it defaults to UTF-8
   4. Your patch is actually an ISO-8859-1 byte sequence, but is
advertised as UTF-8
   5. You end up with a screwed-up commit

You could work around this, but I suggest just joining the 21st
century and working exclusively in UTF-8, it makes things much easier,
speaking as someone with 3x more non-ASCII characters in his name
than you :)


Ok, then the question is: How do I switch to UTF-8?

To me it seems I'm already using it:
$ locale charmap
UTF-8

Your .gitconfig has an ISO-8859-1 string, from an earlier mail of yours:


$ git config -l | grep Noralf | xxd
000: 7573 6572 2e6e 616d 653d 4e6f 7261 6c66  user.name=Noralf
010: 2054 72f8 6e6e 6573 0a   Tr.nnes.

On a system configured for UTF-8 this would be:

$ echo Noralf Trønnes | xxd
000: 4e6f 7261 6c66 2054 72c3 b86e 6e65 730a  Noralf Tr..nnes.

Note the f8 vs. c3 b8.



Yes:
$ echo Noralf Trønnes | xxd
000: 4e6f 7261 6c66 2054 72f8 6e6e 6573 0a  Noralf Tr.nnes.

Is there a command I can run that shows that I'm using ISO-8859-1 ?
I need something to google with, my previous search only gave locale 
stuff, which seems fine.




Re: Git messes up 'ø' character

2015-01-20 Thread Ævar Arnfjörð Bjarmason
On Tue, Jan 20, 2015 at 10:20 PM, Jeff King p...@peff.net wrote:
 On Tue, Jan 20, 2015 at 09:45:46PM +0100, Ævar Arnfjörð Bjarmason wrote:

 What's happened here is that:

  1. You've authored your commit in ISO-8859-1
  2. Git itself has no place for the encoding of the author name in the
 commit object format

 Is (2) right? The encoding header in a commit object should apply not
 just to the commit message, but also to the author (and committer) name.

 I think the real problem is simply that it defaults to UTF-8, but he is
 giving it iso-8859-1 characters. Setting i18n.commitEncoding should fix
 it.

True, I forgot about that setting.

 -Peff

 PS If you try experimenting with this, you may fall afoul of 08a94a1
(commit/commit-tree: correct latin1 to utf-8, 2012-06-28), which will
silently correct Latin1 characters into UTF-8 (when the commit
message is expected to be in UTF-8, of course). So it actually
_should_ just work under modern gits, but only for Latin1.


Re: Git messes up 'ø' character

2015-01-20 Thread Jeff King
On Tue, Jan 20, 2015 at 09:45:46PM +0100, Ævar Arnfjörð Bjarmason wrote:

 What's happened here is that:
 
  1. You've authored your commit in ISO-8859-1
  2. Git itself has no place for the encoding of the author name in the
 commit object format

Is (2) right? The encoding header in a commit object should apply not
just to the commit message, but also to the author (and committer) name.

I think the real problem is simply that it defaults to UTF-8, but he is
giving it iso-8859-1 characters. Setting i18n.commitEncoding should fix
it.
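
For reference, a sketch of what setting it looks like, assuming you really do want to keep committing in ISO-8859-1 rather than switch to UTF-8. It runs in a throwaway repository (the `repo` and `commit_encoding` variables are hypothetical) so no real configuration is changed:

```shell
# Use a throwaway repository so no real configuration is changed.
repo=$(mktemp -d)
git init -q "$repo"

# Declare that commits made in this repository are ISO-8859-1.
# Git records a non-UTF-8 value in the commit's "encoding" header.
git -C "$repo" config i18n.commitEncoding ISO-8859-1

# Read the value back to confirm it was stored.
commit_encoding=$(git -C "$repo" config i18n.commitEncoding)
echo "i18n.commitEncoding = $commit_encoding"

rm -rf "$repo"
```

In a real repository only the `git config` line is needed; add `--global` to apply it to every repository of the current user.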

-Peff

PS If you try experimenting with this, you may fall afoul of 08a94a1
   (commit/commit-tree: correct latin1 to utf-8, 2012-06-28), which will
   silently correct Latin1 characters into UTF-8 (when the commit
   message is expected to be in UTF-8, of course). So it actually
   _should_ just work under modern gits, but only for Latin1.


Re: Git messes up 'ø' character

2015-01-20 Thread Nico Williams
On Tue, Jan 20, 2015 at 10:38:40PM +0100, Noralf Trønnes wrote:
 Yes:
 $ echo Noralf Trønnes | xxd
 000: 4e6f 7261 6c66 2054 72f8 6e6e 6573 0a  Noralf Tr.nnes.
 
 Is there a command I can run that shows that I'm using ISO-8859-1 ?
 I need something to google with, my previous search only gave locale
 stuff, which seems fine.

The locale(1) command tells you what your locale is set to, but it
doesn't say anything about your input method -- it only tells you what
your shell and commands started from it expect for input and what they
should produce for output.

The input method will generally be part of your windowing environment,
for which you'll have to search how to check/configure your OS
(sometimes it can be set on a per-window basis, sometimes it's a global
setting).

Even if the windowing environment is set to UTF-8, your terminal
emulator might be set to ISO-8859-something, so check the terminal
emulator (e.g., rxvt, Terminator, GNOME Terminal, PuTTY, ...).

Finally, check what stty(1) says (e.g., on Linux it should show that
iutf8 is enabled) (this is mostly so that when you backspace in cooked
mode the line discipline knows how many bytes to drop from the buffer).
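
A quick way to see a mismatch between the locale and the terminal is to compare what locale(1) reports with the number of bytes the terminal sends for one non-ASCII character. A rough sketch; the helper name `byte_count` is made up, and the two sample strings are written as octal escapes to simulate each kind of terminal:

```shell
# Count the bytes in a string, independent of the current locale.
byte_count() { printf '%s' "$1" | LC_ALL=C wc -c | tr -d ' '; }

# A UTF-8 terminal sends ø as two bytes (C3 B8)...
utf8_len=$(byte_count "$(printf '\303\270')")

# ...while an ISO-8859-1 terminal sends it as one byte (F8).
latin1_len=$(byte_count "$(printf '\370')")

# What the shell and the commands started from it expect.
charmap=$(locale charmap 2>/dev/null || echo unknown)

echo "locale charmap:          $charmap"
echo "bytes for ø, UTF-8:      $utf8_len"
echo "bytes for ø, ISO-8859-1: $latin1_len"
```

At an interactive prompt, typing `byte_count ø` yourself shows what your terminal actually sent: 1 while the locale says UTF-8 points at the terminal emulator setting.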

Nico


Re: Git messes up 'ø' character

2015-01-20 Thread Noralf Trønnes

On 20.01.2015 21:45, Ævar Arnfjörð Bjarmason wrote:

On Tue, Jan 20, 2015 at 9:17 PM, Noralf Trønnes no...@tronnes.org wrote:

On 20.01.2015 21:07, Torsten Bögershausen wrote:

On 2015-01-20 20.46, Noralf Trønnes wrote:
Could it be that your ø is not encoded as UTF-8,
but as ISO-8859-15 (or so)?


$ git log -1
commit b2a4f6abdb097c4dc092b56995a2af8e42fbea79
Author: Noralf Tr<F8>nnes no...@tronnes.org

What does
git config -l | grep Noralf | xxd
say?


$ git config -l | grep Noralf | xxd
000: 7573 6572 2e6e 616d 653d 4e6f 7261 6c66  user.name=Noralf
010: 2054 72f8 6e6e 6573 0a   Tr.nnes.

$ file ~/.gitconfig
/home/pi/.gitconfig: ISO-8859 text

What's happened here is that:

  1. You've authored your commit in ISO-8859-1
  2. Git itself has no place for the encoding of the author name in the
commit object format
  3. git-format-patch has a --compose-encoding which I think would sort
this out if you set it to ISO-8859-1, but it defaults to UTF-8
  4. Your patch is actually an ISO-8859-1 byte sequence, but is
advertised as UTF-8
  5. You end up with a screwed-up commit

You could work around this, but I suggest just joining the 21st
century and working exclusively in UTF-8, it makes things much easier,
speaking as someone with 3x more non-ASCII characters in his name
than you :)



Ok, then the question is: How do I switch to UTF-8?

To me it seems I'm already using it:
$ locale charmap
UTF-8



Re: Git messes up 'ø' character

2015-01-20 Thread Noralf Trønnes

On 20.01.2015 23:18, Nico Williams wrote:

On Tue, Jan 20, 2015 at 10:38:40PM +0100, Noralf Trønnes wrote:

Yes:
$ echo Noralf Trønnes | xxd
000: 4e6f 7261 6c66 2054 72f8 6e6e 6573 0a  Noralf Tr.nnes.

Is there a command I can run that shows that I'm using ISO-8859-1 ?
I need something to google with, my previous search only gave locale
stuff, which seems fine.

The locale(1) command tells you what your locale is set to, but it
doesn't say anything about your input method -- it only tells you what
your shell and commands started from it expect for input and what they
should produce for output.

The input method will generally be part of your windowing environment,
for which you'll have to search how to check/configure your OS
(sometimes it can be set on a per-window basis, sometimes it's a global
setting).

Even if the windowing environment is set to UTF-8, your terminal
emulator might be set to ISO-8859-something, so check the terminal
emulator (e.g., rxvt, Terminator, GNOME Terminal, PuTTY, ...).


I use putty which was set to ISO-8859-1. Changing this to UTF-8 gave me 
the correct result:

$ echo Noralf Trønnes | xxd
000: 4e6f 7261 6c66 2054 72c3 b86e 6e65 730a  Noralf Tr..nnes.

Thank you all for helping me!



Re: Git messes up 'ø' character

2015-01-20 Thread Ævar Arnfjörð Bjarmason
On Tue, Jan 20, 2015 at 9:17 PM, Noralf Trønnes no...@tronnes.org wrote:
 On 20.01.2015 21:07, Torsten Bögershausen wrote:

 On 2015-01-20 20.46, Noralf Trønnes wrote:
 Could it be that your ø is not encoded as UTF-8,
 but as ISO-8859-15 (or so)?

 $ git log -1
 commit b2a4f6abdb097c4dc092b56995a2af8e42fbea79
 Author: Noralf Tr<F8>nnes no...@tronnes.org

 What does
 git config -l | grep Noralf | xxd
 say?

 $ git config -l | grep Noralf | xxd
 000: 7573 6572 2e6e 616d 653d 4e6f 7261 6c66  user.name=Noralf
 010: 2054 72f8 6e6e 6573 0a   Tr.nnes.

 $ file ~/.gitconfig
 /home/pi/.gitconfig: ISO-8859 text

What's happened here is that:

 1. You've authored your commit in ISO-8859-1
 2. Git itself has no place for the encoding of the author name in the
commit object format
 3. git-format-patch has a --compose-encoding which I think would sort
this out if you set it to ISO-8859-1, but it defaults to UTF-8
 4. Your patch is actually an ISO-8859-1 byte sequence, but is
advertised as UTF-8
 5. You end up with a screwed-up commit

You could work around this, but I suggest just joining the 21st
century and working exclusively in UTF-8, it makes things much easier,
speaking as someone with 3x more non-ASCII characters in his name
than you :)