Re: [PATCH v7 0/7] convert: add support for different encodings

2018-03-04 Thread Torsten Bögershausen
On 2018-02-28 14:21, Jeff King wrote: > On Wed, Feb 28, 2018 at 09:20:05AM +0100, Torsten Bögershausen wrote: > >>> 2. auto-detect utf-16 (your patch) >>> - Just Works for existing repositories storing utf-16 >>> >>> - carries some risk of kicking in when people would like it not to

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-28 Thread Jeff King
On Wed, Feb 28, 2018 at 09:42:27AM -0800, Junio C Hamano wrote: > > I also think we'd want a plan for this to be used consistently in other > > diff-like tools. E.g., "git blame" uses textconv for the starting file > > content, and it would be nice for this to kick in then, too. Ditto for > >

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-28 Thread Lars Schneider
> On 27 Feb 2018, at 22:25, Jeff King wrote: > > On Tue, Feb 27, 2018 at 10:05:17PM +0100, Torsten Bögershausen wrote: > > Of the three solutions, I think the relative merits are something like > this: > > 1. baked-in textconv (my patch) > > - reuses an existing diff

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-28 Thread Junio C Hamano
Jeff King writes: >> The binary patch is still supported, but that detail may need some more >> explanation >> in the commit message. Please see t4066-diff-encoding.sh > > Yeah, but if you don't have binary-patches enabled we'd generate a bogus > patch. Which, granted, without

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-28 Thread Jeff King
On Wed, Feb 28, 2018 at 09:20:05AM +0100, Torsten Bögershausen wrote: > > 2. auto-detect utf-16 (your patch) > > - Just Works for existing repositories storing utf-16 > > > > - carries some risk of kicking in when people would like it not to > > (e.g., when they really do want
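
The "risk of kicking in" trade-off in option 2 can be illustrated with a small sketch. This is not the code from the patch under discussion. It assumes, purely for illustration, that detection means checking for a UTF-16 byte order mark, and `looks_like_utf16` is a hypothetical helper name.

```python
import codecs


def looks_like_utf16(data: bytes) -> bool:
    """Hypothetical check: treat content as UTF-16 if it starts with a UTF-16 BOM."""
    return data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))


# Content written as UTF-16 with a BOM is detected without any configuration.
utf16_text = "hello\n".encode("utf-16")
print(looks_like_utf16(utf16_text))

# Plain ASCII/UTF-8 content carries no UTF-16 BOM and is left alone.
print(looks_like_utf16(b"hello\n"))

# The UTF-32 LE BOM begins with the same two bytes as the UTF-16 LE BOM,
# so this naive check also fires on UTF-32 LE data it was not meant for.
utf32_text = codecs.BOM_UTF32_LE + "hello\n".encode("utf-32-le")
print(looks_like_utf16(utf32_text))
```

A check like this needs no attributes, which is the "Just Works" half. The UTF-32 case shows one way it can trigger on data the user did not intend to be converted.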

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-28 Thread Torsten Bögershausen
On Tue, Feb 27, 2018 at 04:25:38PM -0500, Jeff King wrote: > On Tue, Feb 27, 2018 at 10:05:17PM +0100, Torsten Bögershausen wrote: > > > The other question is: > > Would this help showing diffs of UTF-16 encoded files on a "git hoster", > > github/bitbucket/ ? > > Almost. There's probably

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-27 Thread Jeff King
On Tue, Feb 27, 2018 at 02:10:20PM -0800, Junio C Hamano wrote: > > I thought it solved that by the hosting folks never seeing the strange > > binary-looking data. They see only utf8, which diffs well. > > Ah, OK, that is a "fix" in a wider context (in a narrower context, > "work around" is a

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-27 Thread Junio C Hamano
Jeff King writes: > On Tue, Feb 27, 2018 at 01:55:02PM -0800, Junio C Hamano wrote: > >> Jeff King writes: >> >> > Of the three solutions, I think the relative merits are something like >> > this: >> > ... >> > 3. w-t-e (Lars's patch) >> >> I thought Lars's

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-27 Thread Jeff King
On Tue, Feb 27, 2018 at 01:55:02PM -0800, Junio C Hamano wrote: > Jeff King writes: > > > Of the three solutions, I think the relative merits are something like > > this: > > ... > > 3. w-t-e (Lars's patch) > > I thought Lars's w-t-e was about keeping the in-repo contents in >

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-27 Thread Junio C Hamano
Jeff King writes: > Of the three solutions, I think the relative merits are something like > this: > ... > 3. w-t-e (Lars's patch) I thought Lars's w-t-e was about keeping the in-repo contents in UTF-8 and externalize in whatever encoding (e.g. UTF-16), so it won't help the
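
As a rough illustration of the model described here (in-repo contents kept in UTF-8, working-tree copy written out in another encoding), the round trip might look like the sketch below. The function names are hypothetical and this is not how Git implements it; it only shows the direction of each conversion.

```python
def to_working_tree(blob: bytes, encoding: str = "utf-16") -> bytes:
    """On checkout: re-encode UTF-8 repository content into the working-tree encoding."""
    return blob.decode("utf-8").encode(encoding)


def to_repo(data: bytes, encoding: str = "utf-16") -> bytes:
    """On add: re-encode working-tree content back to UTF-8 for storage."""
    return data.decode(encoding).encode("utf-8")


in_repo = "grüße\n".encode("utf-8")
checked_out = to_working_tree(in_repo)

# The working-tree bytes differ from the stored bytes ...
assert checked_out != in_repo
# ... but converting back yields the original UTF-8 content.
assert to_repo(checked_out) == in_repo
```

Because only UTF-8 is ever stored under this model, a repository that already holds raw UTF-16 blobs is not covered by it, which is the point being raised in this message.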

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-27 Thread Jeff King
On Tue, Feb 27, 2018 at 10:05:17PM +0100, Torsten Bögershausen wrote: > The other question is: > Would this help showing diffs of UTF-16 encoded files on a "git hoster", > github/bitbucket/ ? Almost. There's probably one more thing needed. We don't currently read in-tree .gitattributes when

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-27 Thread Torsten Bögershausen
On Mon, Feb 26, 2018 at 03:46:35PM -0500, Jeff King wrote: > On Mon, Feb 26, 2018 at 06:35:33PM +0100, Torsten Bögershausen wrote: > > > > diff --git a/userdiff.c b/userdiff.c > > > index dbfb4e13cd..48fa7e8bdd 100644 > > > --- a/userdiff.c > > > +++ b/userdiff.c > > > @@ -161,6 +161,7 @@

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-26 Thread Jeff King
On Mon, Feb 26, 2018 at 06:35:33PM +0100, Torsten Bögershausen wrote: > > diff --git a/userdiff.c b/userdiff.c > > index dbfb4e13cd..48fa7e8bdd 100644 > > --- a/userdiff.c > > +++ b/userdiff.c > > @@ -161,6 +161,7 @@ IPATTERN("css", > > "-?[_a-zA-Z][-_a-zA-Z0-9]*" /* identifiers */ > >

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-26 Thread Torsten Bögershausen
On Sun, Feb 25, 2018 at 08:44:46PM -0500, Jeff King wrote: > On Sat, Feb 24, 2018 at 04:18:36PM +0100, Lars Schneider wrote: > > > > We always use the in-repo contents when generating 'diff'. I think > > > by "attribute to be used in diff", what you are really after is > > > to convert the

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-25 Thread Jeff King
On Sat, Feb 24, 2018 at 04:18:36PM +0100, Lars Schneider wrote: > > We always use the in-repo contents when generating 'diff'. I think > > by "attribute to be used in diff", what you are really after is > > to convert the in-repo contents to that encoding _BEFORE_ running > > 'diff' on it.

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-24 Thread Lars Schneider
> On 23 Feb 2018, at 21:11, Junio C Hamano wrote: > > Junio C Hamano writes: > >> Lars Schneider writes: >> >>> I still think it would be nice to see diffs for arbitrary encodings. >>> Would it be an option to read the

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-23 Thread Junio C Hamano
Junio C Hamano writes: > Lars Schneider writes: > >> I still think it would be nice to see diffs for arbitrary encodings. >> Would it be an option to read the `encoding` attribute and use it in >> `git diff`? > > Reusing that gitk-only thing and

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-23 Thread Junio C Hamano
Lars Schneider writes: > I still think it would be nice to see diffs for arbitrary encodings. > Would it be an option to read the `encoding` attribute and use it in > `git diff`? Reusing that gitk-only thing and suddenly start doing so would break gitk users, no? The

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-22 Thread Jeff King
On Thu, Feb 22, 2018 at 09:00:45PM +0100, Lars Schneider wrote: > > If it was only about a diff of UTF-16 files, I may suggest a patch. > > I simply copy-paste it here for review, if someone thinks that it may > > be useful, I can send it as a real patch/RFC. > > That's a nice idea but I see two

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-22 Thread Lars Schneider
> On 16 Feb 2018, at 17:58, Torsten Bögershausen wrote: > > On Fri, Feb 16, 2018 at 03:42:35PM +0100, Lars Schneider wrote: > [] >> >> Agreed. However, people using ShiftJIS are not my target audience. >> My target audience are: >> >> (1) People that have to encode their text

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-21 Thread Lars Schneider
> On 16 Feb 2018, at 19:55, Junio C Hamano wrote: > > Jeff King writes: > >> So a full proposal would support both cases: "check this out in the >> local platform's preferred encoding" and "always check this out in >> _this_ encoding". And Lars's proposal is

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-16 Thread Junio C Hamano
Jeff King writes: > In which case yeah, I could see choosing an in-repo encoding to possibly > be useful (but it also seems like a feature that could easily be tacked > on later if somebody cares). Yes, I think we are on the same page---in-repo-encoding that is a natural

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-16 Thread Jeff King
On Fri, Feb 16, 2018 at 02:25:41PM -0500, Jeff King wrote: > On Fri, Feb 16, 2018 at 10:55:58AM -0800, Junio C Hamano wrote: > > > Jeff King writes: > > > > > So a full proposal would support both cases: "check this out in the > > > local platform's preferred encoding" and

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-16 Thread Jeff King
On Fri, Feb 16, 2018 at 10:55:58AM -0800, Junio C Hamano wrote: > Jeff King writes: > > > So a full proposal would support both cases: "check this out in the > > local platform's preferred encoding" and "always check this out in > > _this_ encoding". And Lars's proposal is just

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-16 Thread Junio C Hamano
Lars Schneider writes: >> One thing I find more problematic is that the above places *too* >> much stress on the UTF-8 centric worldview. It is perfectly valid >> to store your text contents encoded in ShiftJIS and check them out >> as-is, with or without this patch.

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-16 Thread Junio C Hamano
Jeff King writes: > So a full proposal would support both cases: "check this out in the > local platform's preferred encoding" and "always check this out in > _this_ encoding". And Lars's proposal is just the second half of that. Actually, what you seem to take as a whole is just

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-16 Thread Torsten Bögershausen
On Fri, Feb 16, 2018 at 03:42:35PM +0100, Lars Schneider wrote: [] > > Agreed. However, people using ShiftJIS are not my target audience. > My target audience are: > > (1) People that have to encode their text files in UTF-16 (for > whatever reason - usually because of legacy processes or

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-16 Thread Lars Schneider
> On 15 Feb 2018, at 21:03, Junio C Hamano wrote: > > lars.schnei...@autodesk.com writes: > >> -- Git clients that do not support the `working-tree-encoding` attribute >> - will checkout the respective files UTF-8 encoded and not in the >> - expected encoding.

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-15 Thread Jeff King
On Thu, Feb 15, 2018 at 12:03:06PM -0800, Junio C Hamano wrote: > And from that point of view, perhaps w-t-e attribute is somewhat > misdesigned. > > In general, an attribute is about the project's contents in the > manner independent of platform or environment. You define "this > file is a C

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-15 Thread Junio C Hamano
lars.schnei...@autodesk.com writes: > -- Git clients that do not support the `working-tree-encoding` attribute > - will checkout the respective files UTF-8 encoded and not in the > - expected encoding. Consequently, these files will appear different > - which typically causes trouble. This is

[PATCH v7 0/7] convert: add support for different encodings

2018-02-15 Thread lars . schneider
From: Lars Schneider Hi, Patches 1-4 and 6 are preparation and helper functions. Patches 5 and 7 are the actual change. This series depends on Torsten's 8462ff43e4 (convert_to_git(): safe_crlf/checksafe becomes int conv_flags, 2018-01-13) which is already in master. Changes