> On Sep 24, 2018, at 7:24 PM, Elijah Newren wrote:
>
> On Sun, Sep 23, 2018 at 6:08 AM Lars Schneider
> wrote:
>>
>> Hi,
>>
>> I recently had to purge files from large Git repos (many files, many
>> commits).
>> The usual recommenda
> On Sep 23, 2018, at 4:55 PM, Eric Sunshine wrote:
>
> On Sun, Sep 23, 2018 at 9:05 AM Lars Schneider
> wrote:
>> I recently had to purge files from large Git repos (many files, many
>> commits).
>> The usual recommendation is to use `git filter-branc
Hi,
I recently had to purge files from large Git repos (many files, many commits).
The usual recommendation is to use `git filter-branch --index-filter` to purge
files. However, this is *very* slow for large repos (e.g. it takes 45min to
remove the `builtin` directory from git core). I realized
> On Jul 19, 2018, at 11:19 PM, Stefan Beller wrote:
>
> On Thu, Jul 19, 2018 at 2:02 PM Lars Schneider
> wrote:
>>
>> Hi,
>>
>> I have a blob hash and I would like to know what commit referenced
>> this blob first in a given Git repo.
>
>
Hi,
I have a blob hash and I would like to know what commit referenced
this blob first in a given Git repo.
I could iterate through all commits sorted by date (or generation
number) and then recursively search in the referenced trees until
I find my blob. I wonder, is this the most efficient
> On Jul 8, 2018, at 8:30 PM, larsxschnei...@gmail.com wrote:
>
> From: Lars Schneider
>
> In 107642fe26 ("convert: add 'working-tree-encoding' attribute",
> 2018-04-15) we added an attribute which defines the working tree
> encoding of a file.
>
>
> On Jul 8, 2018, at 8:30 PM, larsxschnei...@gmail.com wrote:
>
> From: Lars Schneider
>
> Refactor conversion driver config parsing to ease the parsing of new
> configs in a subsequent patch.
>
> No functional change intended.
>
> Signed-off-by: Lars Sch
> -----Lars Schneider wrote: -
> To: Jeff King
> From: Lars Schneider
> Date: 06/28/2018 18:21
> Cc: "brian m. carlson" , Steve Groeger
> , git@vger.kernel.org
> Subject: Re: Use of new .gitattributes working-tree-encoding attribute across
> different
> On Jun 28, 2018, at 4:34 PM, Jeff King wrote:
>
> On Thu, Jun 28, 2018 at 02:44:47AM +, brian m. carlson wrote:
>
>> On Wed, Jun 27, 2018 at 07:54:52AM +, Steve Groeger wrote:
>>> We have common code that is supposed to be usable across different
>>> platforms and hence different
> On 04 Jun 2018, at 11:55, Jeff King wrote:
>
> On Mon, Jun 04, 2018 at 12:18:59PM -0400, Martin-Louis Bright wrote:
>
>> Why must the credentials must be deleted after receiving the 401 (or
>> any) error? What's the rationale for this?
>
> Because Git only tries a single credential per
> On 04 Jun 2018, at 06:53, Junio C Hamano wrote:
>
> A release candidate Git v2.18.0-rc1 is now available for testing
> at the usual places. It is comprised of 842 non-merge commits
> since v2.17.0, contributed by 65 people, 20 of which are new faces.
>
> ...
>
> * The new
From: Lars Schneider
If a Git HTTP server responds with 401 or 407, then Git tells the
credential helper to reject and delete the credentials. In general
this is good.
However, in certain automation environments it is not desired to remove
credentials automatically. This is in particular
> On 16 May 2018, at 11:29, Ævar Arnfjörð Bjarmason <ava...@gmail.com> wrote:
>
>
> On Wed, May 16 2018, Lars Schneider wrote:
>
>> I am looking into different options to cache Git repositories on build
>> machines. The two most promising ways seem to be git-w
Hi,
I am looking into different options to cache Git repositories on build
machines. The two most promising ways seem to be git-worktree [1] and
git-alternates [2].
I wonder if you see an advantage of one over the other?
My impression is that git-worktree supersedes git-alternates. Would
that
> On 16 Apr 2018, at 19:45, Jacob Keller <jacob.kel...@gmail.com> wrote:
>
> On Mon, Apr 16, 2018 at 10:43 AM, Jacob Keller <jacob.kel...@gmail.com> wrote:
>> On Mon, Apr 16, 2018 at 9:07 AM, Lars Schneider
>> <larsxschnei...@gmail.com> wrote:
>>>
> On 16 Apr 2018, at 19:04, Ævar Arnfjörð Bjarmason <ava...@gmail.com> wrote:
>
>
> On Mon, Apr 16 2018, Lars Schneider wrote:
>
>>> On 16 Apr 2018, at 04:03, Linus Torvalds <torva...@linux-foundation.org>
>>> wrote:
>>>
>>> On
> On 16 Apr 2018, at 04:03, Linus Torvalds
> wrote:
>
> On Sun, Apr 15, 2018 at 6:44 PM, Junio C Hamano wrote:
>>
>> I think Elijah's corrected was_tracked() also does not care "has
>> this been renamed".
>
> I'm perfectly happy with the
From: Lars Schneider <larsxschnei...@gmail.com>
UTF supports lossless conversion round tripping and conversions between
UTF and other encodings are mostly round trip safe as Unicode aims to be
a superset of all other character encodings. However, certain encodings
(e.g. SHIFT-JIS) are
From: Lars Schneider <larsxschnei...@gmail.com>
Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable
tracing for content that is reencoded with the 'working-tree-encoding'
attribute. This is useful to debug encoding issues.
Signed-off-by: Lars Schneider <la
From: Lars Schneider <larsxschnei...@gmail.com>
Git recognizes files encoded with ASCII or one of its supersets (e.g.
UTF-8 or ISO-8859-1) as text files. All other encodings are usually
interpreted as binary and consequently built-in Git text processing
tools (e.g. 'git diff') as well as mo
From: Lars Schneider <larsxschnei...@gmail.com>
If the endianness is not defined in the encoding name, then let's
be strict and require a BOM to avoid any encoding confusion. The
is_missing_required_utf_bom() function returns true if a required BOM
is missing.
The Unicode standard ins
From: Lars Schneider <larsxschnei...@gmail.com>
Check in a case insensitive manner if one string is a prefix of another
string.
This function is used in a subsequent commit.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
git-compat-util.h | 1 +
strbuf.c | 9
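A minimal sketch of the case-insensitive prefix check described in the patch above. This is not the code from the patch: the name has_prefix_icase() is hypothetical, and the sketch uses only the C standard library rather than Git's own helpers.

```c
#include <ctype.h>

/*
 * Hypothetical helper (not Git's implementation): return 1 if `prefix`
 * is a prefix of `str` when ASCII case is ignored, 0 otherwise.
 */
int has_prefix_icase(const char *str, const char *prefix)
{
	for (; *prefix; str++, prefix++) {
		/*
		 * If `str` ends first, its NUL byte differs from the
		 * remaining prefix character and the check fails.
		 */
		if (tolower((unsigned char)*str) !=
		    tolower((unsigned char)*prefix))
			return 0;
	}
	return 1;
}
```

With this sketch, has_prefix_icase("utf-16le", "UTF-") is true, which is the kind of lenient encoding-name match the series appears to need.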
From: Lars Schneider <larsxschnei...@gmail.com>
The function same_encoding() could only recognize alternative names for
UTF-8 encodings. Teach it to recognize all kinds of alternative UTF
encoding names (e.g. utf16).
While we are at it, fix a crash that would occur if same_encoding() was
From: Lars Schneider <larsxschnei...@gmail.com>
Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we
allocate the buffer for the lower case string with xmallocz(). This
already ensures a NUL at the end of the allocated buffer.
Remove the unnecessary assignment.
Sign
From: Lars Schneider <larsxschnei...@gmail.com>
Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE
or UTF-32LE a BOM must not be used [1]. The function returns true if
this is the case.
This function is used in a subsequent commit.
[1] http://unicode.org/faq/utf_bo
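The two BOM rules in this series (a BOM is prohibited when the encoding name states the byte order, and required when it does not) can be sketched as follows. This is an illustration under assumptions, not the patch itself: the function names bom_is_prohibited() and bom_is_missing() and the helper starts_with_bytes() are hypothetical, and encoding names are compared exactly in their canonical upper-case form.

```c
#include <stddef.h>
#include <string.h>

/* Byte order marks as defined by the Unicode standard. */
static const unsigned char bom16_be[] = { 0xFE, 0xFF };
static const unsigned char bom16_le[] = { 0xFF, 0xFE };
static const unsigned char bom32_be[] = { 0x00, 0x00, 0xFE, 0xFF };
static const unsigned char bom32_le[] = { 0xFF, 0xFE, 0x00, 0x00 };

/* Hypothetical helper: does `data` begin with the given BOM bytes? */
static int starts_with_bytes(const char *data, size_t len,
			     const unsigned char *bom, size_t bom_len)
{
	return len >= bom_len && !memcmp(data, bom, bom_len);
}

/* Return 1 if the name fixes the byte order but the data carries a BOM. */
int bom_is_prohibited(const char *enc, const char *data, size_t len)
{
	if (!strcmp(enc, "UTF-16BE") || !strcmp(enc, "UTF-16LE"))
		return starts_with_bytes(data, len, bom16_be, 2) ||
		       starts_with_bytes(data, len, bom16_le, 2);
	if (!strcmp(enc, "UTF-32BE") || !strcmp(enc, "UTF-32LE"))
		return starts_with_bytes(data, len, bom32_be, 4) ||
		       starts_with_bytes(data, len, bom32_le, 4);
	return 0;
}

/* Return 1 if the name leaves the byte order open and no BOM is present. */
int bom_is_missing(const char *enc, const char *data, size_t len)
{
	if (!strcmp(enc, "UTF-16"))
		return !(starts_with_bytes(data, len, bom16_be, 2) ||
			 starts_with_bytes(data, len, bom16_le, 2));
	if (!strcmp(enc, "UTF-32"))
		return !(starts_with_bytes(data, len, bom32_be, 4) ||
			 starts_with_bytes(data, len, bom32_le, 4));
	return 0;
}
```

Both functions return 0 for any other encoding name, so in this sketch non-UTF encodings are never rejected because of a BOM.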
From: Lars Schneider <larsxschnei...@gmail.com>
Hi,
Patches 1-6,9 are preparation and helper functions.
Patch 7,8,10 are the actual change.
This series is based on v2.16.0 and Torsten's 8462ff43e4 (convert_to_git():
safe_crlf/checksafe becomes int conv_flags, 2018-01-13).
The seri
From: Lars Schneider <larsxschnei...@gmail.com>
Create a copy of an existing string and make all characters upper case.
Similar to xstrdup_tolower().
This function is used in a subsequent commit.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
strbuf.c | 12 ++
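A minimal sketch of an upper-casing string copy like the one this patch describes. It is not the patch's code: dup_toupper() is a hypothetical name, and it uses plain malloc() and returns NULL on allocation failure, whereas the commit messages in this series indicate that Git's lower-case counterpart allocates with xmallocz().

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not Git's implementation): return a newly
 * allocated upper-case copy of `s`, or NULL if allocation fails.
 * The caller owns the result and must free() it.
 */
char *dup_toupper(const char *s)
{
	size_t len = strlen(s);
	char *out = malloc(len + 1);
	size_t i;

	if (!out)
		return NULL;
	for (i = 0; i < len; i++)
		out[i] = (char)toupper((unsigned char)s[i]);
	out[len] = '\0';
	return out;
}
```

Casting to unsigned char before calling toupper() avoids undefined behavior for bytes with the high bit set on platforms where char is signed.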
From: Lars Schneider <larsxschnei...@gmail.com>
Check that new content is valid with respect to the user defined
'working-tree-encoding' attribute.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
convert.c| 61
> On 05 Apr 2018, at 18:41, Torsten Bögershausen <tbo...@web.de> wrote:
>
> On 01.04.18 15:24, Lars Schneider wrote:
>>> TRUE or false are values, but just wrong ones.
>>> If this test is removed, the user will see "failed to encode "TRUE"
> On 04 Jan 2018, at 20:26, Jeff King wrote:
>
> On Wed, Dec 27, 2017 at 09:41:30AM -0800, Junio C Hamano wrote:
>
>> Jeff King writes:
>>
>>> I, too, had a funny feeling about calling this "core". But I didn't have
>>> a better name, as I'm not sure what other
> On 02 Apr 2018, at 20:31, Lars Schneider <larsxschnei...@gmail.com> wrote:
>
>
>> On 29 Mar 2018, at 20:37, Junio C Hamano <gits...@pobox.com> wrote:
>>
>> lars.schnei...@autodesk.com writes:
>>
>>> From: Lars Schneider <larsxsc
> On 29 Mar 2018, at 20:37, Junio C Hamano <gits...@pobox.com> wrote:
>
> lars.schnei...@autodesk.com writes:
>
>> From: Lars Schneider <larsxschnei...@gmail.com>
>>
>> Patches 1-6,9 are preparation and helper functions. Patch 4 is new.
>> Patch
> On 13 Mar 2018, at 18:45, Siddhartha Mishra <sidm1...@gmail.com> wrote:
>
> On Mon, Mar 12, 2018 at 3:49 PM, Lars Schneider
> <larsxschnei...@gmail.com> wrote:
>> Hi,
>>
>> That looks interesting but I agree with Dscho that we should not limit
>>
> On 16 Mar 2018, at 19:19, Eric Sunshine wrote:
>
> On Fri, Mar 16, 2018 at 1:50 PM, Junio C Hamano wrote:
>> Eric Sunshine writes:
>>> However, I'm having a tough time imagining cases in which callers
>>> would want
> On 18 Mar 2018, at 08:24, Torsten Bögershausen <tbo...@web.de> wrote:
>
> Some comments inline
>
> On Fri, Mar 09, 2018 at 06:35:32PM +0100, lars.schnei...@autodesk.com wrote:
>> From: Lars Schneider <larsxschnei...@gmail.com>
>>
>> Git reco
> On 30 Mar 2018, at 12:32, Lars Schneider <larsxschnei...@gmail.com> wrote:
>
>
>> On 30 Mar 2018, at 11:24, Ævar Arnfjörð Bjarmason <ava...@gmail.com> wrote:
>>
>>
>> On Wed, Mar 28 2018, Junio C. Hamano wrote:
>>
>>> * ls/checko
> On 30 Mar 2018, at 11:24, Ævar Arnfjörð Bjarmason wrote:
>
>
> On Wed, Mar 28 2018, Junio C. Hamano wrote:
>
>> * ls/checkout-encoding (2018-03-16) 10 commits
>> - convert: add round trip check based on 'core.checkRoundtripEncoding'
>> - convert: add tracing for
> On 17 Mar 2018, at 09:01, Duy Nguyen wrote:
>
> On Fri, Mar 16, 2018 at 10:22 PM, Jeff King wrote:
>>> diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
>>> index 3735ce413f..f6f346c468 100755
>>> --- a/ci/run-build-and-tests.sh
>>> +++
> On 14 Mar 2018, at 21:43, Junio C Hamano wrote:
>
> Derrick Stolee writes:
>
>> This v6 includes feedback around csum-file.c and the rename of hashclose()
>> to finalize_hashfile(). These are the first two commits of the series, so
>> they could be
kinds of alternative UTF encoding
>> names.
>>
>> Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
>> ---
>> diff --git a/utf8.c b/utf8.c
>> @@ -401,11 +401,27 @@ void strbuf_utf8_replace(struct strbuf *sb_src, int
>>
> On 15 Mar 2018, at 20:18, Lars Schneider <larsxschnei...@gmail.com> wrote:
>
>
>> On 15 Mar 2018, at 02:34, Junio C Hamano <gits...@pobox.com> wrote:
>>
>> ...
>>
>> * ls/checkout-encoding (2018-03-09) 10 commits
>> - convert: a
From: Lars Schneider <larsxschnei...@gmail.com>
Check that new content is valid with respect to the user defined
'working-tree-encoding' attribute.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
convert.c| 61
From: Lars Schneider <larsxschnei...@gmail.com>
Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable
tracing for content that is reencoded with the 'working-tree-encoding'
attribute. This is useful to debug encoding issues.
Signed-off-by: Lars Schneider <la
From: Lars Schneider <larsxschnei...@gmail.com>
UTF supports lossless conversion round tripping and conversions between
UTF and other encodings are mostly round trip safe as Unicode aims to be
a superset of all other character encodings. However, certain encodings
(e.g. SHIFT-JIS) are
From: Lars Schneider <larsxschnei...@gmail.com>
The function same_encoding() checked only for alternative UTF-8 encoding
names. Teach it to check for all kinds of alternative UTF encoding
names.
This function is used in a subsequent commit.
Signed-off-by: Lars Schneider <la
From: Lars Schneider <larsxschnei...@gmail.com>
Git recognizes files encoded with ASCII or one of its supersets (e.g.
UTF-8 or ISO-8859-1) as text files. All other encodings are usually
interpreted as binary and consequently built-in Git text processing
tools (e.g. 'git diff') as well as mo
From: Lars Schneider <larsxschnei...@gmail.com>
If the endianness is not defined in the encoding name, then let's
be strict and require a BOM to avoid any encoding confusion. The
is_missing_required_utf_bom() function returns true if a required BOM
is missing.
The Unicode standard ins
From: Lars Schneider <larsxschnei...@gmail.com>
Create a copy of an existing string and make all characters upper case.
Similar to xstrdup_tolower().
This function is used in a subsequent commit.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
strbuf.c | 12 ++
From: Lars Schneider <larsxschnei...@gmail.com>
Hi,
Patches 1-6,9 are preparation and helper functions. Patch 4 is new.
Patch 7,8,10 are the actual change.
This series depends on Torsten's 8462ff43e4 (convert_to_git():
safe_crlf/checksafe becomes int conv_flags, 2018-01-13) which is
a
From: Lars Schneider <larsxschnei...@gmail.com>
Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE
or UTF-32LE a BOM must not be used [1]. The function returns true if
this is the case.
This function is used in a subsequent commit.
[1] http://unicode.org/faq/utf_bo
From: Lars Schneider <larsxschnei...@gmail.com>
Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we
allocate the buffer for the lower case string with xmallocz(). This
already ensures a NUL at the end of the allocated buffer.
Remove the unnecessary assignment.
Sign
From: Lars Schneider <larsxschnei...@gmail.com>
Check in a case insensitive manner if one string is a prefix of another
string.
This function is used in a subsequent commit.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
git-compat-util.h | 1 +
strbuf.c | 9
> On 09 Mar 2018, at 20:11, Junio C Hamano <gits...@pobox.com> wrote:
>
> lars.schnei...@autodesk.com writes:
>
>> From: Lars Schneider <larsxschnei...@gmail.com>
>>
>> The canonical name of an UTF encoding has the format UTF, dash, number,
>>
> On 09 Mar 2018, at 20:10, Junio C Hamano wrote:
>
> lars.schnei...@autodesk.com writes:
>
>> +static const char *default_encoding = "UTF-8";
>> +
>> ...
>> +static const char *git_path_check_encoding(struct attr_check_item *check)
>> +{
>> +const char *value =
> On 15 Mar 2018, at 02:34, Junio C Hamano wrote:
>
> ...
>
> * ls/checkout-encoding (2018-03-09) 10 commits
> - convert: add round trip check based on 'core.checkRoundtripEncoding'
> - convert: add tracing for 'working-tree-encoding' attribute
> - convert: advise canonical
> On 14 Mar 2018, at 23:20, Jeff King <p...@peff.net> wrote:
>
> On Wed, Mar 14, 2018 at 05:56:04PM +0100, Lars Schneider wrote:
>
>> I am investigating a Git merge (a86dd40fe) in which an older version of
>> a file won over the newer version. I try to understa
> On 14 Mar 2018, at 18:02, Derrick Stolee <sto...@gmail.com> wrote:
>
> On 3/14/2018 12:56 PM, Lars Schneider wrote:
>> Hi,
>>
>> I am investigating a Git merge (a86dd40fe) in which an older version of
>> a file won over the newer version. I try to un
Hi,
I am investigating a Git merge (a86dd40fe) in which an older version of
a file won over the newer version. I try to understand why this is the
case. I can reproduce the merge with the following commands:
$ git checkout -b test a02fa3303
$ GIT_MERGE_VERBOSITY=5 git merge --verbose c1b82995c
> On 14 Mar 2018, at 09:33, Michael Haggerty <mhag...@alum.mit.edu> wrote:
>
> On Wed, Mar 14, 2018 at 9:14 AM, Lars Schneider
> <larsxschnei...@gmail.com> wrote:
>> I am using Michael's fantastic Git repo analyzer tool "git-sizer" [*]
>> and it
Hi,
I am using Michael's fantastic Git repo analyzer tool "git-sizer" [*]
and it detected a very large commit of 7.33 MiB in my repo (see chart
below).
This large commit is expected. I've imported that repo from another
version control system but excluded all binary files (e.g. images) and
some
Hi,
That looks interesting but I agree with Dscho that we should not limit
this to master/maint.
I assume you did run this on TravisCI already? Can you share a link?
I assume you did find errors? Can we fix them or are there too many?
If there are existing errors, how do we define a "successful"
Hi Viet,
> On 12 Mar 2018, at 03:20, Viet Hung Tran wrote:
>
> This is my submission as a microproject for the Google Summer of code.
> I apologize for not setting the [GSoC] in my previous email
> at <20180312020855.7950-1-viethtran1...@gmail.com>.
> Please ignore it.
> On 09 Mar 2018, at 20:00, Junio C Hamano wrote:
>
> lars.schnei...@autodesk.com writes:
>
>> +const char *advise_msg = _(
>> +"The file '%s' contains a byte order "
>> +"mark (BOM). Please use %.6s
From: Lars Schneider <larsxschnei...@gmail.com>
Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE
or UTF-32LE a BOM must not be used [1]. The function returns true if
this is the case.
This function is used in a subsequent commit.
[1] http://unicode.org/faq/utf_bo
From: Lars Schneider <larsxschnei...@gmail.com>
Check that new content is valid with respect to the user defined
'working-tree-encoding' attribute.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
convert.c| 48 +++
From: Lars Schneider <larsxschnei...@gmail.com>
Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable
tracing for content that is reencoded with the 'working-tree-encoding'
attribute. This is useful to debug encoding issues.
Signed-off-by: Lars Schneider <la
From: Lars Schneider <larsxschnei...@gmail.com>
Git recognizes files encoded with ASCII or one of its supersets (e.g.
UTF-8 or ISO-8859-1) as text files. All other encodings are usually
interpreted as binary and consequently built-in Git text processing
tools (e.g. 'git diff') as well as mo
From: Lars Schneider <larsxschnei...@gmail.com>
Check in a case insensitive manner if one string is a prefix of another
string.
This function is used in a subsequent commit.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
git-compat-util.h | 1 +
strbuf.c | 9
From: Lars Schneider <larsxschnei...@gmail.com>
If the endianness is not defined in the encoding name, then let's
be strict and require a BOM to avoid any encoding confusion. The
is_missing_required_utf_bom() function returns true if a required BOM
is missing.
The Unicode standard ins
From: Lars Schneider <larsxschnei...@gmail.com>
Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we
allocate the buffer for the lower case string with xmallocz(). This
already ensures a NUL at the end of the allocated buffer.
Remove the unnecessary assignment.
Sign
From: Lars Schneider <larsxschnei...@gmail.com>
The canonical name of an UTF encoding has the format UTF, dash, number,
and an optionally byte order in upper case (e.g. UTF-8 or UTF-16BE).
Some iconv versions support alternative names without a dash or with
lower case characters.
To
From: Lars Schneider <larsxschnei...@gmail.com>
UTF supports lossless conversion round tripping and conversions between
UTF and other encodings are mostly round trip safe as Unicode aims to be
a superset of all other character encodings. However, certain encodings
(e.g. SHIFT-JIS) are
From: Lars Schneider <larsxschnei...@gmail.com>
Create a copy of an existing string and make all characters upper case.
Similar to xstrdup_tolower().
This function is used in a subsequent commit.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
strbuf.c | 12 ++
From: Lars Schneider <larsxschnei...@gmail.com>
Hi,
Patches 1-5,9 are preparation and helper functions.
Patch 6-8,10 are the actual change. Patch 8 is new.
This series depends on Torsten's 8462ff43e4 (convert_to_git():
safe_crlf/checksafe becomes int conv_flags, 2018-01-13) which is
a
> On 07 Mar 2018, at 19:04, Eric Sunshine <sunsh...@sunshineco.com> wrote:
>
> On Wed, Mar 7, 2018 at 12:30 PM, <lars.schnei...@autodesk.com> wrote:
>> Check that new content is valid with respect to the user defined
>> 'working-tree-encoding' attribute.
>
> On 09 Mar 2018, at 00:12, Junio C Hamano wrote:
>
> Duy Nguyen writes:
>
>>> extern int starts_with(const char *str, const char *prefix);
>>> +extern int startscase_with(const char *str, const char *prefix);
>>
>> This name is a bit hard to read. Boost
> On 07 Mar 2018, at 23:57, Junio C Hamano <gits...@pobox.com> wrote:
>
> Lars Schneider <larsxschnei...@gmail.com> writes:
>
>> At this point I thought it would make sense to make the advised
>> encoding name uppercase in both situations. OK with
> On 07 Mar 2018, at 23:52, Junio C Hamano <gits...@pobox.com> wrote:
>
> Lars Schneider <larsxschnei...@gmail.com> writes:
>
>> I don't think HT makes too much sense. However, isspace() is nice
>> and I will use it. Being more permissive on the inputs should
content is added to the index, then Git converts the
>> content to a canonical UTF-8 representation. On checkout Git will
>> reverse the conversion.
>>
>> Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
>> ---
>> Documentation/gitattributes.txt
> On 07 Mar 2018, at 23:32, Junio C Hamano <gits...@pobox.com> wrote:
>
> Lars Schneider <larsxschnei...@gmail.com> writes:
>
>> I also would have liked to advise "UTF-16" instead of "UTF16" as
>> you suggested. However, that requ
> On 07 Mar 2018, at 20:59, Junio C Hamano wrote:
>
> lars.schnei...@autodesk.com writes:
>
>> +static int check_roundtrip(const char* enc_name)
>
> The asterisk sticks to the variable, not type.
Argh. I need to put this check into Travis CI ;-)
>> +{
>> +/*
>> +
> On 07 Mar 2018, at 20:49, Junio C Hamano wrote:
>
> lars.schnei...@autodesk.com writes:
>
>> +static int validate_encoding(const char *path, const char *enc,
>> + const char *data, size_t len, int die_on_error)
>> +{
>> +/* We only check for UTF here
From: Lars Schneider <larsxschnei...@gmail.com>
Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable
tracing for content that is reencoded with the 'working-tree-encoding'
attribute. This is useful to debug encoding issues.
Signed-off-by: Lars Schneider <la
From: Lars Schneider <larsxschnei...@gmail.com>
Git recognizes files encoded with ASCII or one of its supersets (e.g.
UTF-8 or ISO-8859-1) as text files. All other encodings are usually
interpreted as binary and consequently built-in Git text processing
tools (e.g. 'git diff') as well as mo
From: Lars Schneider <larsxschnei...@gmail.com>
If the endianness is not defined in the encoding name, then let's
be strict and require a BOM to avoid any encoding confusion. The
is_missing_required_utf_bom() function returns true if a required BOM
is missing.
The Unicode standard ins
From: Lars Schneider <larsxschnei...@gmail.com>
Whenever a data stream is declared to be UTF-16BE, UTF-16LE, UTF-32BE
or UTF-32LE a BOM must not be used [1]. The function returns true if
this is the case.
This function is used in a subsequent commit.
[1] http://unicode.org/faq/utf_bo
From: Lars Schneider <larsxschnei...@gmail.com>
Hi,
Patches 1-5,8 are preparation and helper functions. Patch 3 is new.
Patch 6,7,9 are the actual change.
This series depends on Torsten's 8462ff43e4 (convert_to_git():
safe_crlf/checksafe becomes int conv_flags, 2018-01-13) which is
a
From: Lars Schneider <larsxschnei...@gmail.com>
Since 3733e69464 (use xmallocz to avoid size arithmetic, 2016-02-22) we
allocate the buffer for the lower case string with xmallocz(). This
already ensures a NUL at the end of the allocated buffer.
Remove the unnecessary assignment.
Sign
From: Lars Schneider <larsxschnei...@gmail.com>
Check that new content is valid with respect to the user defined
'working-tree-encoding' attribute.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
convert.c| 55
From: Lars Schneider <larsxschnei...@gmail.com>
UTF supports lossless conversion round tripping and conversions between
UTF and other encodings are mostly round trip safe as Unicode aims to be
a superset of all other character encodings. However, certain encodings
(e.g. SHIFT-JIS) are
From: Lars Schneider <larsxschnei...@gmail.com>
Check in a case insensitive manner if one string is a prefix of another
string.
This function is used in a subsequent commit.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
git-compat-util.h | 1 +
strbuf.c | 9
From: Lars Schneider <larsxschnei...@gmail.com>
Create a copy of an existing string and make all characters upper case.
Similar to xstrdup_tolower().
This function is used in a subsequent commit.
Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
---
strbuf.c | 12 ++
> On 07 Mar 2018, at 00:07, Junio C Hamano <gits...@pobox.com> wrote:
>
> Junio C Hamano <gits...@pobox.com> writes:
>
>> Lars Schneider <larsxschnei...@gmail.com> writes:
>>
>>>> Also "UTF16" or other spelling
>>
> On 06 Mar 2018, at 23:53, Junio C Hamano <gits...@pobox.com> wrote:
>
> Lars Schneider <larsxschnei...@gmail.com> writes:
>
>>> Also "UTF16" or other spelling
>>> the platform may support but this code fails to recognise will go
>>>
> On 06 Mar 2018, at 21:50, Junio C Hamano wrote:
>
> lars.schnei...@autodesk.com writes:
>
>> +int is_missing_required_utf_bom(const char *enc, const char *data, size_t
>> len)
>> +{
>> +return (
>> + !strcmp(enc, "UTF-16") &&
>> + !(has_bom_prefix(data,
. All other encodings are usually
>> interpreted as binary and consequently built-in Git text processing
>> tools (e.g. 'git diff') as well as most Git web front ends do not
>> visualize the content.
>> [...]
>> Signed-off-by: Lars Schneider <larsxschnei...@gmail.com>
> On 06 Mar 2018, at 02:23, Junio C Hamano <gits...@pobox.com> wrote:
>
> Lars Schneider <larsxschnei...@gmail.com> writes:
>
>>> On 05 Mar 2018, at 22:50, Junio C Hamano <gits...@pobox.com> wrote:
>>>
>>> lars.schnei...@autodesk.com w
> On 05 Mar 2018, at 22:50, Junio C Hamano wrote:
>
> lars.schnei...@autodesk.com writes:
>
>> +static int validate_encoding(const char *path, const char *enc,
>> + const char *data, size_t len, int die_on_error)
>> +{
>> +if (!memcmp("UTF-", enc, 4)) {
> On 03 Mar 2018, at 11:39, Jeff King wrote:
>
> On Sat, Mar 03, 2018 at 05:30:10AM -0500, Jeff King wrote:
>
>> As in past years, I plan to run it like an unconference. Attendees are
>> expected to bring topics for group discussion. Short presentations are
>> also welcome.
From: Lars Schneider <larsxschnei...@gmail.com>
Add the GIT_TRACE_WORKING_TREE_ENCODING environment variable to enable
tracing for content that is reencoded with the 'working-tree-encoding'
attribute. This is useful to debug encoding issues.
Signed-off-by: Lars Schneider <la