Re: RFC: Native clean/smudge filter for UTF-16 files

2017-12-04 Thread Jeff King
On Sun, Dec 03, 2017 at 07:48:01PM +0100, Lars Schneider wrote:
> > - if core.convertEncoding is true, then for any file with an
> >   encoding=foo attribute, internally run iconv(foo, utf8) in
> >   convert_to_git(), and likewise iconv(utf8, foo) in
> >   convert_to_working_tree.
> >
> > - I'm

Re: RFC: Native clean/smudge filter for UTF-16 files

2017-12-03 Thread Lars Schneider
> On 24 Nov 2017, at 19:04, Jeff King wrote:
>
> On Thu, Nov 23, 2017 at 04:18:59PM +0100, Lars Schneider wrote:
>
>> Alternatively, I could add a native attribute to Git that translates UTF-16
>> to UTF-8 and back. A conversion function is already available in "mingw.h"
>>

Re: RFC: Native clean/smudge filter for UTF-16 files

2017-11-24 Thread Junio C Hamano
Jeff King writes:

> So anyway, that is an alternate strategy, but I think I like "canonical
> in-repo text is utf-8" approach a lot more, since then git operations
> work consistently. There are still a few rough edges (e.g., I'm not sure

Sounds like a good way forward.

> if you

Re: RFC: Native clean/smudge filter for UTF-16 files

2017-11-24 Thread Jeff King
On Thu, Nov 23, 2017 at 04:18:59PM +0100, Lars Schneider wrote:
> Alternatively, I could add a native attribute to Git that translates UTF-16
> to UTF-8 and back. A conversion function is already available in "mingw.h" [3]
> on Windows. Limiting this feature to Windows wouldn't be a problem from

Re: RFC: Native clean/smudge filter for UTF-16 files

2017-11-23 Thread Torsten Bögershausen
On Thu, Nov 23, 2017 at 04:18:59PM +0100, Lars Schneider wrote:
> Hi,
>
> I am working with a team that owns a repository with lots of UTF-16 files.
> Converting these files to UTF-8 is no good option as downstream applications
> require the UTF-16 encoding. Keeping the files in UTF-16 is no good

RFC: Native clean/smudge filter for UTF-16 files

2017-11-23 Thread Lars Schneider
Hi, I am working with a team that owns a repository with lots of UTF-16 files. Converting these files to UTF-8 is no good option as downstream applications require the UTF-16 encoding. Keeping the files in UTF-16 is no good option either as Git and Git related tools (e.g. GitHub) consider the