[PATCH v9 6/8] convert: check for detectable errors in UTF encodings

2018-03-04 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Check that new content is valid with respect to the user-defined 'working-tree-encoding' attribute. Signed-off-by: Lars Schneider <larsxschnei...@gmail.com> --- convert.c | 50

[PATCH v9 8/8] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-03-04 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> UTF supports lossless conversion round tripping and conversions between UTF and other encodings are mostly round trip safe as Unicode aims to be a superset of all other character encodings. However, certain encodings (e.g. SHIFT-JIS) are

[PATCH v9 5/8] convert: add 'working-tree-encoding' attribute

2018-03-04 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Git recognizes files encoded with ASCII or one of its supersets (e.g. UTF-8 or ISO-8859-1) as text files. All other encodings are usually interpreted as binary and consequently built-in Git text processing tools (e.g. 'git diff') as well as mo

[PATCH v9 0/8] convert: add support for different encodings

2018-03-04 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Hi, Patches 1-4,7 are preparation and helper functions. Patch 5,6,8 are the actual change. This series depends on Torsten's 8462ff43e4 (convert_to_git(): safe_crlf/checksafe becomes int conv_flags, 2018-01-13) which is already in master. C

[PATCH v9 1/8] strbuf: remove unnecessary NUL assignment in xstrdup_tolower()

2018-03-04 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we allocate the buffer for the lower case string with xmallocz(). This already ensures a NUL at the end of the allocated buffer. Remove the unnecessary assignment. Sign

[PATCH v9 4/8] utf8: add function to detect a missing UTF-16/32 BOM

2018-03-04 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> If the endianness is not defined in the encoding name, then let's be strict and require a BOM to avoid any encoding confusion. The is_missing_required_utf_bom() function returns true if a required BOM is missing. The Unicode standard ins
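
A minimal sketch of the check described above, not the code from the patch: when the encoding name carries no endianness ("UTF-16" or "UTF-32"), the content must start with a BOM. The names bom_is_missing() and has_prefix() are hypothetical, and the sketch assumes the encoding name has already been upper-cased.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if data begins with the given byte sequence. */
static int has_prefix(const char *data, size_t len,
		      const char *bom, size_t bom_len)
{
	return len >= bom_len && !memcmp(data, bom, bom_len);
}

/*
 * Hypothetical helper: returns 1 if enc leaves the endianness open
 * ("UTF-16" or "UTF-32") and data does not start with a BOM of that
 * family. Assumes enc is already upper case.
 */
int bom_is_missing(const char *enc, const char *data, size_t len)
{
	static const char utf16_be[] = { '\xFE', '\xFF' };
	static const char utf16_le[] = { '\xFF', '\xFE' };
	static const char utf32_be[] = { '\0', '\0', '\xFE', '\xFF' };
	static const char utf32_le[] = { '\xFF', '\xFE', '\0', '\0' };

	if (!strcmp(enc, "UTF-16"))
		return !(has_prefix(data, len, utf16_be, sizeof(utf16_be)) ||
			 has_prefix(data, len, utf16_le, sizeof(utf16_le)));
	if (!strcmp(enc, "UTF-32"))
		return !(has_prefix(data, len, utf32_be, sizeof(utf32_be)) ||
			 has_prefix(data, len, utf32_le, sizeof(utf32_le)));
	return 0; /* endianness is named, or not a UTF-16/32 encoding */
}
```

Encodings that name their endianness (e.g. "UTF-16LE") return 0 here because no BOM is required for them.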

[PATCH v9 3/8] utf8: add function to detect prohibited UTF-16/32 BOM

2018-03-04 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE or UTF-32LE a BOM must not be used [1]. The function returns true if this is the case. This function is used in a subsequent commit. [1] http://unicode.org/faq/utf_bo
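
A minimal sketch of the rule stated above, not the code from the patch: if the declared encoding already names its endianness, a leading BOM is an error. The names bom_is_prohibited() and has_prefix() are hypothetical; the sketch assumes an upper-case encoding name and, as a further assumption, treats either BOM of the same family as prohibited.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if data begins with the given byte sequence. */
static int has_prefix(const char *data, size_t len,
		      const char *bom, size_t bom_len)
{
	return len >= bom_len && !memcmp(data, bom, bom_len);
}

/*
 * Hypothetical helper: returns 1 if enc is UTF-16BE/LE or UTF-32BE/LE
 * and data nevertheless starts with a BOM. Assumes enc is upper case.
 */
int bom_is_prohibited(const char *enc, const char *data, size_t len)
{
	static const char utf16_be[] = { '\xFE', '\xFF' };
	static const char utf16_le[] = { '\xFF', '\xFE' };
	static const char utf32_be[] = { '\0', '\0', '\xFE', '\xFF' };
	static const char utf32_le[] = { '\xFF', '\xFE', '\0', '\0' };

	if (!strcmp(enc, "UTF-16BE") || !strcmp(enc, "UTF-16LE"))
		return has_prefix(data, len, utf16_be, sizeof(utf16_be)) ||
		       has_prefix(data, len, utf16_le, sizeof(utf16_le));
	if (!strcmp(enc, "UTF-32BE") || !strcmp(enc, "UTF-32LE"))
		return has_prefix(data, len, utf32_be, sizeof(utf32_be)) ||
		       has_prefix(data, len, utf32_le, sizeof(utf32_le));
	return 0; /* BOM is allowed (or irrelevant) for other encodings */
}
```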

[PATCH v9 2/8] strbuf: add xstrdup_toupper()

2018-03-04 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Create a copy of an existing string and make all characters upper case. Similar to xstrdup_tolower(). This function is used in a subsequent commit. Signed-off-by: Lars Schneider <larsxschnei...@gmail.com> --- strbuf.c | 12 ++
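
A minimal sketch of what such a function might look like, not the code from the patch. The name dup_toupper() is hypothetical, and plain malloc() stands in for Git's own allocation helpers, so the caller has to handle a NULL return.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for an upper-casing string copy.
 * Returns a newly allocated string that the caller must free(),
 * or NULL if the allocation fails.
 */
char *dup_toupper(const char *s)
{
	size_t len = strlen(s);
	char *out = malloc(len + 1);
	size_t i;

	if (!out)
		return NULL;
	/* Cast to unsigned char: toupper() is undefined for negative values. */
	for (i = 0; i < len; i++)
		out[i] = (char)toupper((unsigned char)s[i]);
	out[len] = '\0';
	return out;
}
```

A helper like this would let encoding names such as "utf-16le" be normalized once before they are compared.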

Re: [PATCH v8 7/7] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-03-04 Thread Lars Schneider
ist of encodings, to define for what encodings Git should check the >> conversion round trip if they are used in the 'working-tree-encoding' >> attribute. >> >> Set SHIFT-JIS as default value for 'core.checkRoundtripEncoding'. >> >> [1] >> https://support.micros

Re: What's cooking in git.git (Mar 2018, #01; Thu, 1)

2018-03-02 Thread Lars Schneider
> On 02 Mar 2018, at 18:11, Junio C Hamano wrote: > > Junio C Hamano writes: > >> SZEDER Gábor writes: >> >>> On Thu, Mar 1, 2018 at 11:20 PM, Junio C Hamano wrote: >>> *

Re: [PATCH v8 3/7] utf8: add function to detect prohibited UTF-16/32 BOM

2018-02-28 Thread Lars Schneider
> On 27 Feb 2018, at 06:17, Eric Sunshine <sunsh...@sunshineco.com> wrote: > > On Sun, Feb 25, 2018 at 6:35 AM, Lars Schneider > <larsxschnei...@gmail.com> wrote: >>> On 25 Feb 2018, at 04:41, Eric Sunshine <sunsh...@sunshineco.com> wrote: >>> Is

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-28 Thread Lars Schneider
> On 27 Feb 2018, at 22:25, Jeff King wrote: > > On Tue, Feb 27, 2018 at 10:05:17PM +0100, Torsten Bögershausen wrote: > > Of the three solutions, I think the relative merits are something like > this: > > 1. baked-in textconv (my patch) > > - reuses an existing diff

Re: [PATCH v8 5/7] convert: add 'working-tree-encoding' attribute

2018-02-27 Thread Lars Schneider
fined for a >> given file. If the content is added to the index, then Git converts the >> content to a canonical UTF-8 representation. On checkout Git will >> reverse the conversion. >> >> Signed-off-by: Lars Schneider <larsxschnei...@gmail.com> >> --- >>

Re: [PATCH v8 4/7] utf8: add function to detect a missing UTF-16/32 BOM

2018-02-25 Thread Lars Schneider
in HTML5 recommends to assume little-endian to "deal with deployed >> content" [3]. Strictly requiring a BOM seems to be the safest option >> for content in Git. >> >> Signed-off-by: Lars Schneider <larsxschnei...@gmail.com> >> --- >> diff --git

Re: [PATCH v8 3/7] utf8: add function to detect prohibited UTF-16/32 BOM

2018-02-25 Thread Lars Schneider
ed [1]. The function returns true if >> this is the case. >> >> [1] http://unicode.org/faq/utf_bom.html#bom10 >> >> Signed-off-by: Lars Schneider <larsxschnei...@gmail.com> >> --- >> diff --git a/utf8.c b/utf8.c >> @@ -538,6 +538,30 @@ char *reenc

[PATCH v8 7/7] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-02-24 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> UTF supports lossless conversion round tripping and conversions between UTF and other encodings are mostly round trip safe as Unicode aims to be a superset of all other character encodings. However, certain encodings (e.g. SHIFT-JIS) are

[PATCH v8 4/7] utf8: add function to detect a missing UTF-16/32 BOM

2018-02-24 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> If the endianness is not defined in the encoding name, then let's be strict and require a BOM to avoid any encoding confusion. The is_missing_required_utf_bom() function returns true if a required BOM is missing. The Unicode standard ins

[PATCH v8 6/7] convert: add tracing for 'working-tree-encoding' attribute

2018-02-24 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable tracing for content that is reencoded with the 'working-tree-encoding' attribute. This is useful to debug encoding issues. Signed-off-by: Lars Schneider <la

[PATCH v8 1/7] strbuf: remove unnecessary NUL assignment in xstrdup_tolower()

2018-02-24 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we allocate the buffer for the lower case string with xmallocz(). This already ensures a NUL at the end of the allocated buffer. Remove the unnecessary assignment. Sign

[PATCH v8 5/7] convert: add 'working-tree-encoding' attribute

2018-02-24 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Git recognizes files encoded with ASCII or one of its supersets (e.g. UTF-8 or ISO-8859-1) as text files. All other encodings are usually interpreted as binary and consequently built-in Git text processing tools (e.g. 'git diff') as well as mo

[PATCH v8 3/7] utf8: add function to detect prohibited UTF-16/32 BOM

2018-02-24 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE or UTF-32LE a BOM must not be used [1]. The function returns true if this is the case. This function is used in a subsequent commit. [1] http://unicode.org/faq/utf_bo

[PATCH v8 2/7] strbuf: add xstrdup_toupper()

2018-02-24 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Create a copy of an existing string and make all characters upper case. Similar to xstrdup_tolower(). This function is used in a subsequent commit. Signed-off-by: Lars Schneider <larsxschnei...@gmail.com> --- strbuf.c | 12 ++

[PATCH v8 0/7] convert: add support for different encodings

2018-02-24 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Hi, Patches 1-4, 6 are preparation and helper functions. Patch 5,7 are the actual change. This series depends on Torsten's 8462ff43e4 (convert_to_git(): safe_crlf/checksafe becomes int conv_flags, 2018-01-13) which is already in master. C

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-24 Thread Lars Schneider
> On 23 Feb 2018, at 21:11, Junio C Hamano <gits...@pobox.com> wrote: > > Junio C Hamano <gits...@pobox.com> writes: > >> Lars Schneider <larsxschnei...@gmail.com> writes: >> >>> I still think it would be nice to see diffs for arb

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-22 Thread Lars Schneider
> On 16 Feb 2018, at 17:58, Torsten Bögershausen <tbo...@web.de> wrote: > > On Fri, Feb 16, 2018 at 03:42:35PM +0100, Lars Schneider wrote: > [] >> >> Agreed. However, people using ShiftJIS are not my target audience. >> My target audience are: >> >

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-21 Thread Lars Schneider
> On 16 Feb 2018, at 19:55, Junio C Hamano wrote: > > Jeff King writes: > >> So a full proposal would support both cases: "check this out in the >> local platform's preferred encoding" and "always check this out in >> _this_ encoding". And Lars's proposal is

Re: [PATCH v3] worktree: add: fix 'post-checkout' not knowing new worktree location

2018-02-16 Thread Lars Schneider
when the new worktree is created from > a sibling worktree (as opposed to the main worktree); (2) verify that > the hook is provided with correct context when the new worktree is > created from a bare repository (test provided by Lars Schneider). Thanks! This patch works great and fixes t

Re: [PATCH v7 0/7] convert: add support for different encodings

2018-02-16 Thread Lars Schneider
> On 15 Feb 2018, at 21:03, Junio C Hamano wrote: > > lars.schnei...@autodesk.com writes: > >> -- Git clients that do not support the `working-tree-encoding` attribute >> - will checkout the respective files UTF-8 encoded and not in the >> - expected encoding.

[PATCH v7 0/7] convert: add support for different encodings

2018-02-15 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Hi, Patches 1-4, 6 are preparation and helper functions. Patch 5,7 are the actual change. This series depends on Torsten's 8462ff43e4 (convert_to_git(): safe_crlf/checksafe becomes int conv_flags, 2018-01-13) which is already in master. C

[PATCH v7 1/7] strbuf: remove unnecessary NUL assignment in xstrdup_tolower()

2018-02-15 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we allocate the buffer for the lower case string with xmallocz(). This already ensures a NUL at the end of the allocated buffer. Remove the unnecessary assignment. Sign

[PATCH v7 2/7] strbuf: add xstrdup_toupper()

2018-02-15 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Create a copy of an existing string and make all characters upper case. Similar to xstrdup_tolower(). This function is used in a subsequent commit. Signed-off-by: Lars Schneider <larsxschnei...@gmail.com> --- strbuf.c | 12 ++

[PATCH v7 3/7] utf8: add function to detect prohibited UTF-16/32 BOM

2018-02-15 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE or UTF-32LE a BOM must not be used [1]. The function returns true if this is the case. This function is used in a subsequent commit. [1] http://unicode.org/faq/utf_bo

[PATCH v7 4/7] utf8: add function to detect a missing UTF-16/32 BOM

2018-02-15 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> If the endianness is not defined in the encoding name, then let's be strict and require a BOM to avoid any encoding confusion. The is_missing_required_utf_bom() function returns true if a required BOM is missing. The Unicode standard ins

[PATCH v7 5/7] convert: add 'working-tree-encoding' attribute

2018-02-15 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Git recognizes files encoded with ASCII or one of its supersets (e.g. UTF-8 or ISO-8859-1) as text files. All other encodings are usually interpreted as binary and consequently built-in Git text processing tools (e.g. 'git diff') as well as mo

[PATCH v7 7/7] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-02-15 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> UTF supports lossless conversion round tripping and conversions between UTF and other encodings are mostly round trip safe as Unicode aims to be a superset of all other character encodings. However, certain encodings (e.g. SHIFT-JIS) are

[PATCH v7 6/7] convert: add tracing for 'working-tree-encoding' attribute

2018-02-15 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable tracing for content that is reencoded with the 'working-tree-encoding' attribute. This is useful to debug encoding issues. Signed-off-by: Lars Schneider <la

Re: [PATCH v6 5/7] convert: add 'working-tree-encoding' attribute

2018-02-14 Thread Lars Schneider
> On 10 Feb 2018, at 10:48, Torsten Bögershausen <tbo...@web.de> wrote: > > On Fri, Feb 09, 2018 at 02:28:28PM +0100, lars.schnei...@autodesk.com wrote: >> From: Lars Schneider <larsxschnei...@gmail.com> >> >> ... >> >> +Please note that u

Re: [PATCH 1/2] run-command: teach 'run_hook' about alternate worktrees

2018-02-12 Thread Lars Schneider
> On 12 Feb 2018, at 04:15, Eric Sunshine wrote: > > Git commands which run hooks do so at the top level of the worktree in > which the command itself was invoked. However, the 'git worktree' > command may need to run hooks within some other directory. For > instance,

Re: [PATCH 2/2] worktree: add: change to new worktree directory before running hook

2018-02-12 Thread Lars Schneider
tree is created from a sibling > worktree (as opposed to the main worktree). > > Reported-by: Lars Schneider <larsxschnei...@gmail.com> > Signed-off-by: Eric Sunshine <sunsh...@sunshineco.com> > --- > builtin/worktree.c | 11 --- > t/t2025-worktree-add.sh |

Re: [PATCH v1] worktree: set worktree environment in post-checkout hook

2018-02-09 Thread Lars Schneider
> On 10 Feb 2018, at 02:01, lars.schnei...@autodesk.com wrote: > > From: Lars Schneider <larsxschnei...@gmail.com> > > In ade546be47 (worktree: invoke post-checkout hook (unless > --no-checkout), 2017-12-07) we taught Git to run the post-checkout hook >

Re: [PATCH v6 0/7] convert: add support for different encodings

2018-02-09 Thread Lars Schneider
> On 09 Feb 2018, at 21:09, Junio C Hamano wrote: > > Documentation has core.checkRoundtripEncoding while t0028 and a > comment in convert.c capitalize it differently. I suspect that it > would be more reader-friendly to update the documentation to match. Agreed. I will

[PATCH v1] worktree: set worktree environment in post-checkout hook

2018-02-09 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> In ade546be47 (worktree: invoke post-checkout hook (unless --no-checkout), 2017-12-07) we taught Git to run the post-checkout hook in worktrees. Unfortunately, the environment of the hook was not made aware of the worktree. Consequently, a 'g

Re: [PATCH v6 4/7] utf8: add function to detect a missing UTF-16/32 BOM

2018-02-09 Thread Lars Schneider
> On 09 Feb 2018, at 20:28, Junio C Hamano <gits...@pobox.com> wrote: > > lars.schnei...@autodesk.com writes: > >> From: Lars Schneider <larsxschnei...@gmail.com> >> >> If the endianness is not defined in the encoding name, then let's >> ...

[PATCH v6 6/7] convert: add tracing for 'working-tree-encoding' attribute

2018-02-09 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable tracing for content that is reencoded with the 'working-tree-encoding' attribute. This is useful to debug encoding issues. Signed-off-by: Lars Schneider <la

[PATCH v6 1/7] strbuf: remove unnecessary NUL assignment in xstrdup_tolower()

2018-02-09 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we allocate the buffer for the lower case string with xmallocz(). This already ensures a NUL at the end of the allocated buffer. Remove the unnecessary assignment. Sign

[PATCH v6 2/7] strbuf: add xstrdup_toupper()

2018-02-09 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Create a copy of an existing string and make all characters upper case. Similar to xstrdup_tolower(). This function is used in a subsequent commit. Signed-off-by: Lars Schneider <larsxschnei...@gmail.com> --- strbuf.c | 12 ++

[PATCH v6 0/7] convert: add support for different encodings

2018-02-09 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Hi, Patches 1-4, 6 are preparation and helper functions. Patch 5,7 are the actual change. This series depends on Torsten's 8462ff43e4 (convert_to_git(): safe_crlf/checksafe becomes int conv_flags, 2018-01-13) which is already in next. C

[PATCH v6 7/7] convert: add round trip check based on 'core.checkRoundtripEncoding'

2018-02-09 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> UTF supports lossless conversion round tripping and conversions between UTF and other encodings are mostly round trip safe as Unicode aims to be a superset of all other character encodings. However, certain encodings (e.g. SHIFT-JIS) are

[PATCH v6 4/7] utf8: add function to detect a missing UTF-16/32 BOM

2018-02-09 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> If the endianness is not defined in the encoding name, then let's be strict and require a BOM to avoid any encoding confusion. The is_missing_required_utf_bom() function returns true if a required BOM is missing. The Unicode standard ins

[PATCH v6 3/7] utf8: add function to detect prohibited UTF-16/32 BOM

2018-02-09 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE or UTF-32LE a BOM must not be used [1]. The function returns true if this is the case. This function is used in a subsequent commit. [1] http://unicode.org/faq/utf_bo

[PATCH v6 5/7] convert: add 'working-tree-encoding' attribute

2018-02-09 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Git recognizes files encoded with ASCII or one of its supersets (e.g. UTF-8 or ISO-8859-1) as text files. All other encodings are usually interpreted as binary and consequently built-in Git text processing tools (e.g. 'git diff') as well as mo

Re: "git branch" issue in 2.16.1

2018-02-08 Thread Lars Schneider
> On 08 Feb 2018, at 17:19, Kevin Daudt <m...@ikke.info> wrote: > > On Thu, Feb 08, 2018 at 12:27:07PM +0100, Lars Schneider wrote: >> >>> On 08 Feb 2018, at 12:13, Lars Schneider <larsxschnei...@gmail.com> wrote: >>> >>> >>

Re: "git branch" issue in 2.16.1

2018-02-08 Thread Lars Schneider
> On 08 Feb 2018, at 12:13, Lars Schneider <larsxschnei...@gmail.com> wrote: > > >> On 08 Feb 2018, at 09:50, Jeff King <p...@peff.net> wrote: >> >> On Wed, Feb 07, 2018 at 11:20:08PM +0100, Lars Schneider wrote: >> >>>> 1. You hav

Re: "git branch" issue in 2.16.1

2018-02-08 Thread Lars Schneider
> On 08 Feb 2018, at 09:50, Jeff King <p...@peff.net> wrote: > > On Wed, Feb 07, 2018 at 11:20:08PM +0100, Lars Schneider wrote: > >>> 1. You have $LESS in your environment (without "F") on one platform >>>but not the other. >> >

Re: "git branch" issue in 2.16.1

2018-02-07 Thread Lars Schneider
> On 07 Feb 2018, at 21:08, Jeff King <p...@peff.net> wrote: > > On Wed, Feb 07, 2018 at 06:54:23PM +0100, Lars Schneider wrote: > >>> Maybe the number of branches changed since then? >>> As the pager only comes to life when the output fills >>

Re: "git branch" issue in 2.16.1

2018-02-07 Thread Lars Schneider
n your work and personal machine? Plus, what shell do you use and what terminal application? Thanks, Lars PS: Please don't top post on the git mailing list :-) https://en.wikipedia.org/wiki/Posting_style > Thanks! > > - Jason > > >> On Feb 7, 2018, at 9:54 AM, Lars S

Re: "git branch" issue in 2.16.1

2018-02-07 Thread Lars Schneider
> On 06 Feb 2018, at 21:05, Stefan Beller wrote: > > On Tue, Feb 6, 2018 at 11:57 AM, Todd Zullinger wrote: >> Hi Jason, >> >> Jason Racey wrote: >>> After upgrading git from 2.16.0 to 2.16.1 (via Homebrew - >>> I’m on macOS) I noticed that the “git branch”

Re: [PATCH 0/2] minor GETTEXT_POISON fixes

2018-02-06 Thread Lars Schneider
> On 06 Feb 2018, at 09:42, Jeff King wrote: > > I set NO_GETTEXT=1 in my config.mak, and happened to notice that running > the tests with GETTEXT_POISON fails. I think this has been broken for > years, but I don't generally play with GETTEXT_POISON. ;) On Travis we run

Re: [PATCH/RFC v5 7/7] Careful with CRLF when using e.g. UTF-16 for working-tree-encoding

2018-01-31 Thread Lars Schneider
> On 31 Jan 2018, at 18:28, Torsten Bögershausen wrote: > > [] >>> That is a good one. >>> If you ever plan a re-roll (I don't at the moment) the *.proj extension >>> makes much more sense in Documentation/gitattributes than *.tx >>> There are no text files encoded in UTF-16 which are

Re: [PATCH v5 5/7] convert: add 'working-tree-encoding' attribute

2018-01-31 Thread Lars Schneider
> On 30 Jan 2018, at 22:56, Junio C Hamano <gits...@pobox.com> wrote: > > Lars Schneider <larsxschnei...@gmail.com> writes: > >>> On 30 Jan 2018, at 21:05, Junio C Hamano <gits...@pobox.com> wrote: >>> >>> tbo...@web.de writes: >>

Re: [PATCH v5 4/7] utf8: add function to detect a missing UTF-16/32 BOM

2018-01-30 Thread Lars Schneider
> On 30 Jan 2018, at 20:15, Junio C Hamano <gits...@pobox.com> wrote: > > tbo...@web.de writes: > >> From: Lars Schneider <larsxschnei...@gmail.com> >> >> If the endianness is not defined in the encoding name, then let's >> be strict a

Re: [PATCH v5 5/7] convert: add 'working-tree-encoding' attribute

2018-01-30 Thread Lars Schneider
> On 30 Jan 2018, at 21:05, Junio C Hamano wrote: > > tbo...@web.de writes: > >> +if ((conv_flags & CONV_WRITE_OBJECT) && !strcmp(enc->name, >> "SHIFT-JIS")) { >> +char *re_src; >> +int re_src_len; > > I think it is a bad idea to > > (1) not

Re: [PATCH/RFC v5 7/7] Careful with CRLF when using e.g. UTF-16 for working-tree-encoding

2018-01-30 Thread Lars Schneider
> On 30 Jan 2018, at 15:40, Torsten Bögershausen <tbo...@web.de> wrote: > > On Tue, Jan 30, 2018 at 12:23:47PM +0100, Lars Schneider wrote: >> >>> On 29 Jan 2018, at 21:19, tbo...@web.de wrote: >>> >>> From: Torsten Bögershausen <tbo..

Re: [PATCH/RFC v5 7/7] Careful with CRLF when using e.g. UTF-16 for working-tree-encoding

2018-01-30 Thread Lars Schneider
> On 29 Jan 2018, at 21:19, tbo...@web.de wrote: > > From: Torsten Bögershausen > > UTF-16 encoded files are treated as "binary" by Git, and no CRLF > conversion is done. > When the UTF-16 encoded files are converted into UF-8 using the new s/UF-8/UTF-8/ >

Re: [ANNOUNCE] Git Merge Contributor's Summit Mar 7, 2018, Barcelona

2018-01-27 Thread Lars Schneider
Hi Peff, I would like to register for the contributor summit :-) --- As I am writing to you, I thought I could ask you a question: "git verify-pack" tells me the "size-in-packfile" which is kind of the "real" size of a file in a Git repo. Are you aware of a way to get this number via the GitHub

Re: [PATCH 3/3] read-cache: don't write index twice if we can't write shared index

2018-01-26 Thread Lars Schneider
> On 22 Jan 2018, at 19:27, SZEDER Gábor wrote: > > > On Thu, Jan 18, 2018 at 1:47 PM, Duy Nguyen wrote: >> On Thu, Jan 18, 2018 at 6:36 PM, SZEDER Gábor wrote: >>> This series, queued as 'nd/shared-index-fix', makes the 32 bit

[PATCH v2] SQUASH convert: add tracing for 'working-tree-encoding' attribute

2018-01-23 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Hi Junio, I overlooked a typo pointed out in Simon's review. Here is a new patch for squashing. Sorry for the trouble! @Eric: Thanks for spotting this! Cheers, Lars convert.c | 8 ++-- t/t0028-workin

SQUASH convert: add tracing for 'working-tree-encoding' attribute

2018-01-22 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Hi Junio, this attached patch addresses Simon's review comments. Can you squash the patch if you apply "[PATCH v4 5/6] convert: add 'working-tree-encoding' attribute"? https://public-inbox.org/git/20180120152418.528

Re: [PATCH v4 5/6] convert: add 'working-tree-encoding' attribute

2018-01-22 Thread Lars Schneider
> On 21 Jan 2018, at 15:22, Simon Ruderich wrote: > > On Sat, Jan 20, 2018 at 04:24:17PM +0100, lars.schnei...@autodesk.com wrote: >> +static struct encoding *git_path_check_encoding(struct attr_check_item >> *check) >> +{ >> +const char *value = check->value; >> +

[PATCH v4 6/6] convert: add tracing for 'working-tree-encoding' attribute

2018-01-20 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Add the GIT_TRACE_CHECKOUT_ENCODING environment variable to enable tracing for content that is reencoded with the 'working-tree-encoding' attribute. This is useful to debug encoding issues. Signed-off-by: Lars Schneider <larsxschnei...@

[PATCH v4 4/6] utf8: add function to detect a missing UTF-16/32 BOM

2018-01-20 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> If the endianness is not defined in the encoding name, then let's be strict and require a BOM to avoid any encoding confusion. The has_missing_utf_bom() function returns true if a required BOM is missing. The Unicode standard instructs to

[PATCH v4 3/6] utf8: add function to detect prohibited UTF-16/32 BOM

2018-01-20 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE or UTF-32LE a BOM must not be used [1]. The function returns true if this is the case. This function is used in a subsequent commit. [1] http://unicode.org/faq/utf_bo

[PATCH v4 5/6] convert: add 'working-tree-encoding' attribute

2018-01-20 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Git recognizes files encoded with ASCII or one of its supersets (e.g. UTF-8 or ISO-8859-1) as text files. All other encodings are usually interpreted as binary and consequently built-in Git text processing tools (e.g. 'git diff') as well as mo

[PATCH v4 2/6] strbuf: add xstrdup_toupper()

2018-01-20 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Create a copy of an existing string and make all characters upper case. Similar to xstrdup_tolower(). This function is used in a subsequent commit. Signed-off-by: Lars Schneider <larsxschnei...@gmail.com> --- strbuf.c | 12 ++

[PATCH v4 1/6] strbuf: remove unnecessary NUL assignment in xstrdup_tolower()

2018-01-20 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we allocate the buffer for the lower case string with xmallocz(). This already ensures a NUL at the end of the allocated buffer. Remove the unnecessary assignment. Sign

[PATCH v4 0/6] convert: add support for different encodings

2018-01-20 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Hi, Patches 1-4 and 6 are preparation and helper functions. Patch 5 is the actual change. This series depends on Torsten's "convert_to_git(): safe_crlf/checksafe becomes int conv_flags" patch: https://public-inbox.org/git/201801

Re: [PATCH] describe: use strbuf_add_unique_abbrev() for adding short hashes

2018-01-18 Thread Lars Schneider
> On 18 Jan 2018, at 23:40, SZEDER Gábor wrote: > > On Thu, Jan 18, 2018 at 10:40 PM, René Scharfe wrote: >> Am 16.01.2018 um 18:11 schrieb SZEDER Gábor: >>> Unfortunately, most of the changes coming from 'strbuf.cocci' don't >>> make any sense, they appear

Re: What's cooking in git.git (Dec 2017, #03; Wed, 13)

2018-01-09 Thread Lars Schneider
> On 14 Dec 2017, at 00:00, Junio C Hamano wrote: > > Here are the topics that have been cooking. Commits prefixed with > '-' are only in 'pu' (proposed updates) while commits prefixed with > '+' are in 'next'. The ones marked with '.' do not appear in any of > the

Re: [PATCH v3 5/7] convert_to_git(): safe_crlf/checksafe becomes int conv_flags

2018-01-08 Thread Lars Schneider
> On 08 Jan 2018, at 22:28, Junio C Hamano wrote: > > lars.schnei...@autodesk.com writes: > >> diff --git a/sha1_file.c b/sha1_file.c >> index afe4b90f6e..dcb02e9ffd 100644 >> --- a/sha1_file.c >> +++ b/sha1_file.c >> @@ -75,14 +75,14 @@ static struct cached_object

Re: [PATCH] travis-ci: build Git during the 'script' phase

2018-01-08 Thread Lars Schneider
> On 08 Jan 2018, at 23:07, Junio C Hamano wrote: > > SZEDER Gábor writes: > >> The reason why Travis CI does it this way and why it's a better >> approach than ours lies in how unsuccessful build jobs are >> categorized. ... >> ... >> This makes it

Re: [PATCH v3 0/7] convert: add support for different encodings

2018-01-08 Thread Lars Schneider
> On 07 Jan 2018, at 10:38, Torsten Bögershausen <tbo...@web.de> wrote: > > On Sat, Jan 06, 2018 at 01:48:01AM +0100, lars.schnei...@autodesk.com wrote: >> From: Lars Schneider <larsxschnei...@gmail.com> >> >> Hi, >> >> Patches 1-5 and

Re: [PATCH 1/5] convert_to_git(): checksafe becomes an integer

2018-01-05 Thread Lars Schneider
> On 06 Jan 2018, at 00:22, Junio C Hamano <gits...@pobox.com> wrote: > > Lars Schneider <larsxschnei...@gmail.com> writes: > >>> On 31 Dec 2017, at 09:05, tbo...@web.de wrote: >>> >>> From: Torsten Bögershausen <tbo...@web.de> >>

[PATCH v3 1/7] strbuf: remove unnecessary NUL assignment in xstrdup_tolower()

2018-01-05 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we allocate the buffer for the lower case string with xmallocz(). This already ensures a NUL at the end of the allocated buffer. Remove the unnecessary assignment. Sign

[PATCH v3 6/7] convert: add support for 'checkout-encoding' attribute

2018-01-05 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Git recognizes files encoded with ASCII or one of its supersets (e.g. UTF-8 or ISO-8859-1) as text files. All other encodings are usually interpreted as binary and consequently built-in Git text processing tools (e.g. 'git diff') as well as mo

[PATCH v3 2/7] strbuf: add xstrdup_toupper()

2018-01-05 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Create a copy of an existing string and make all characters upper case. Similar to xstrdup_tolower(). This function is used in a subsequent commit. Signed-off-by: Lars Schneider <larsxschnei...@gmail.com> --- strbuf.c | 12 ++

[PATCH v3 7/7] convert: add tracing for checkout-encoding

2018-01-05 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Add the GIT_TRACE_CHECKOUT_ENCODING environment variable to enable tracing for content that is reencoded with the checkout-encoding attribute. Signed-off-by: Lars Schneider <larsxschnei...@gmail.com> --- convert.c

[PATCH v3 4/7] utf8: add function to detect a missing UTF-16/32 BOM

2018-01-05 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> If the endianness is not defined in the encoding name, then let's be strict and require a BOM to avoid any encoding confusion. The has_missing_utf_bom() function returns true if a required BOM is missing. The Unicode standard instructs to

[PATCH v3 3/7] utf8: add function to detect prohibited UTF-16/32 BOM

2018-01-05 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE or UTF-32LE a BOM must not be used [1]. The function returns true if this is the case. This function is used in a subsequent commit. [1] http://unicode.org/faq/utf_bo
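For illustration, the check described above might be sketched like this: when the encoding name already fixes the byte order, a leading BOM counts as an error. The names below are hypothetical, the encoding name is assumed to be upper case, and this is not the function from the patch.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if `data` begins with the given BOM bytes. */
static int sketch_starts_with(const char *data, size_t len,
			      const char *bom, size_t bom_len)
{
	return len >= bom_len && memcmp(data, bom, bom_len) == 0;
}

/*
 * Sketch only (assumed name and signature): returns 1 if `enc` names an
 * explicit byte order (UTF-16BE/LE, UTF-32BE/LE) and `data` still starts
 * with a BOM of the matching width.
 */
int sketch_has_prohibited_bom(const char *enc, const char *data, size_t len)
{
	if (strcmp(enc, "UTF-16BE") == 0 || strcmp(enc, "UTF-16LE") == 0)
		return sketch_starts_with(data, len, "\xfe\xff", 2) ||
		       sketch_starts_with(data, len, "\xff\xfe", 2);
	if (strcmp(enc, "UTF-32BE") == 0 || strcmp(enc, "UTF-32LE") == 0)
		return sketch_starts_with(data, len, "\x00\x00\xfe\xff", 4) ||
		       sketch_starts_with(data, len, "\xff\xfe\x00\x00", 4);
	return 0; /* byte order not fixed by the name: a BOM is allowed */
}
```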

[PATCH v3 5/7] convert_to_git(): safe_crlf/checksafe becomes int conv_flags

2018-01-05 Thread lars . schneider
E and KEEP_CRLF. Therefore, an enum is not ideal. Let's use an integer bit pattern instead and rename the parameter to conv_flags to make it more generically usable. This allows us to extend the bit pattern in a subsequent commit. Helped-By: Lars Schneider <larsxschnei...@gmail.com> Signed-off-by:
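The move from an enum to an integer bit pattern can be pictured with a small sketch. The flag names and values below are invented for this example and are not the constants from the patch; the point is only that independent bits can be combined and later extended, which mutually exclusive enum values cannot.

```c
/* Hypothetical flag names and values, for illustration only. */
#define SKETCH_CONV_DIE       (1 << 0)
#define SKETCH_CONV_WARN      (1 << 1)
#define SKETCH_CONV_KEEP_CRLF (1 << 2)

/* Returns 1 if `flag` is set in the `conv_flags` bit pattern. */
int sketch_has_flag(int conv_flags, int flag)
{
	return (conv_flags & flag) != 0;
}
```

A caller could then pass `SKETCH_CONV_WARN | SKETCH_CONV_KEEP_CRLF` as a single argument, and a later commit could add a further bit without changing any function signature.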

[PATCH v3 0/7] convert: add support for different encodings

2018-01-05 Thread lars . schneider
From: Lars Schneider <larsxschnei...@gmail.com> Hi, Patches 1-5 and 6 are helper functions and preparation. Patch 6 is the actual change. I am still torn between "checkout-encoding" and "working-tree-encoding" as attribute name. I am happy to hear arguments fo

Re: [PATCH v3 1/1] convert_to_git(): checksafe becomes int conv_flags

2018-01-05 Thread Lars Schneider
enum value SAFE_CRLF_FALSE. > > Turn the whole call chain to use an integer with single bits, which > can be extended in the next commits: > - The global configuration variable safe_crlf is now conv_flags_eol. > - The parameter checksafe is renamed into conv_flags. > &

Re: [PATCH v2 3/3] travis: run tests with GIT_TEST_SPLIT_INDEX

2018-01-05 Thread Lars Schneider
> On 04 Jan 2018, at 21:13, Thomas Gummerer <t.gumme...@gmail.com> wrote: > > On 12/18, Lars Schneider wrote: >> >>> On 17 Dec 2017, at 23:51, Thomas Gummerer <t.gumme...@gmail.com> wrote: >>> >>> Split index mode only has a few dedicated t

Re: [PATCH v1] convert: add support for 'encoding' attribute

2018-01-03 Thread Lars Schneider
On 03 Jan 2018, at 20:15, Junio C Hamano wrote: > Torsten Bögershausen writes: > >> May be. >> Originally utf8.c was about encoding and all kind of UTF-8 related stuff. >> Especially it didn't know anything about strbuf. >> I don't know why strbuf.h and other

Re: [PATCH 2/2] travis-ci: check that all build artifacts are .gitignore-d

2018-01-03 Thread Lars Schneider
> On 03 Jan 2018, at 00:12, SZEDER Gábor <szeder@gmail.com> wrote: > > On Tue, Jan 2, 2018 at 8:40 PM, Lars Schneider <larsxschnei...@gmail.com> > wrote: >> >>> On 31 Dec 2017, at 17:02, SZEDER Gábor <szeder@gmail.com> wrote: >>> &

Re: [PATCH v3 1/1] convert_to_git(): checksafe becomes int conv_flags

2018-01-03 Thread Lars Schneider
> On 03 Jan 2018, at 06:36, Torsten Bögershausen <tbo...@web.de> wrote: > > On Tue, Jan 02, 2018 at 08:11:51PM +0100, Lars Schneider wrote: > > [snip] > >>> /* >>> diff --git a/diff.c

Re: [PATCH 2/2] travis-ci: check that all build artifacts are .gitignore-d

2018-01-02 Thread Lars Schneider
> On 31 Dec 2017, at 17:02, SZEDER Gábor wrote: > > Every once in a while our explicit .gitignore files get out of sync > when our build process learns to create new artifacts, like test > helper executables, but the .gitignore files are not updated > accordingly. > > Use

Re: [PATCH 1/2] travis-ci: don't store P4 and Git LFS in the working tree

2018-01-02 Thread Lars Schneider
> On 31 Dec 2017, at 17:02, SZEDER Gábor wrote: > > The Clang and GCC 64 bit Linux build jobs download and store the P4 > and Git LFS executables under the current directory, which is the > working tree that we are about to build and test. This means that Git > commands

Re: [PATCH v3 1/1] convert_to_git(): checksafe becomes int conv_flags

2018-01-02 Thread Lars Schneider
enum value SAFE_CRLF_FALSE. > > Turn the whole call chain to use an integer with single bits, which > can be extended in the next commits: > - The global configuration variable safe_crlf is now conv_flags_eol. > - The parameter checksafe is renamed into conv_flags. > &

Re: [PATCH 1/5] convert_to_git(): checksafe becomes an integer

2017-12-31 Thread Lars Schneider
> On 31 Dec 2017, at 09:05, tbo...@web.de wrote: > > From: Torsten Bögershausen > > When calling convert_to_git(), the checksafe parameter has been used to > check if commit would give a non-roundtrip conversion of EOL. > > When checksafe was introduced, 3 values had been in

Re: [PATCHv2 0/3] Travis CI: skip commits with successfully built and tested trees

2017-12-31 Thread Lars Schneider
> On 31 Dec 2017, at 11:12, SZEDER Gábor wrote: > > This is the second iteration of 'sg/travis-skip-identical-test', > addressing the comments of Lars and Jonathan: > > - Colorize the "Tip of $TRAVIS_BRANCH is exactly at $TAG" message >in the new patch 1/3. > > -
