from:"Torsten Bögershausen"

Re: t6044 broken on pu

2016-05-08 Thread Torsten Bögershausen

On 08.05.16 04:21, Junio C Hamano wrote:
> Torsten Bögershausen <tbo...@web.de> writes:
>
>> That's true, but the test passes anyway.
> You can also remove the body of the test and replace it with "true"
> and say "the test passes anyway".  Changing the test to use a file
> with only one line is irresponsible, if you do not know the nature
> of expected future bug that requires 10 lines to be there to
> manifest that the test wants to try.
>
> test_seq was invented exactly for the purpose of accomodating
> platforms that lack seq, so using it would probably be the best
> first step.  Updating implementation of test_seq to avoid $PERL
> would be a separate step, if desired (I personally do not think
> that is worth it).
We don't need to invoke perl, when the shell can do that internally ?

May a  simple
 printf "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n"

be an option ?


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

t6044 broken on pu

2016-05-07 Thread Torsten Bögershausen

The "seq" is not understood by all shells,
using printf fixes this,

index 20a3ffe..48d964e 100755
--- a/t/t6044-merge-unrelated-index-changes.sh
+++ b/t/t6044-merge-unrelated-index-changes.sh
@@ -20,7 +20,7 @@ test_description="merges with unrelated index changes"
 #   Commit E: renames a->subdir/a, adds subdir/e

 test_expect_success 'setup trivial merges' '
-   seq 1 10 >a &&
+   printf 1 2 3 4 5 7 8 9 10 >a &&
git add a &&
test_tick && git commit -m A &&

@@ -42,7 +42,7 @@ test_expect_success 'setup trivial merges' '
test_tick && git commit -m C &&

git checkout D &&
-   seq 2 10 >a &&
+   printf 2 3 4 5 7 8 9 10 >a &&
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

t6392 broken on pu (Mac OS X)

2016-05-07 Thread Torsten Bögershausen

These tests fail here under Mac OS,
they pass under Linux:
commit ff3d9b660a4b6e9d3eeb664ce1febe717adff737
I haven't had a chance to dig further.


expecting success:
git for-each-ref --format="%(if)%(authorname)%(then)%(authorname):
%(refname)%(end)" >actual &&
cat >expect <<-\EOF &&
A U Thor: refs/heads/master
A U Thor: refs/heads/side
A U Thor: refs/odd/spot



A U Thor: refs/tags/foo1.10
A U Thor: refs/tags/foo1.3
A U Thor: refs/tags/foo1.6
A U Thor: refs/tags/four
A U Thor: refs/tags/one

A U Thor: refs/tags/three
A U Thor: refs/tags/two
EOF
test_cmp expect actual

--- expect  2016-05-07 16:08:32.0 +
+++ actual  2016-05-07 16:08:32.0 +
@@ -3,12 +3,10 @@
 A U Thor: refs/odd/spot


-
 A U Thor: refs/tags/foo1.10
 A U Thor: refs/tags/foo1.3
 A U Thor: refs/tags/foo1.6
 A U Thor: refs/tags/four
 A U Thor: refs/tags/one
-
 A U Thor: refs/tags/three
 A U Thor: refs/tags/two
not ok 34 - check %(if)...%(then)...%(end) atoms
#   
#   git for-each-ref 
--format="%(if)%(authorname)%(then)%(authorname):
%(refname)%(end)" >actual &&
#   cat >expect <<-\EOF &&
#   A U Thor: refs/heads/master
#   A U Thor: refs/heads/side
#   A U Thor: refs/odd/spot
#   
#   
#   
#   A U Thor: refs/tags/foo1.10
#   A U Thor: refs/tags/foo1.3
#   A U Thor: refs/tags/foo1.6
#   A U Thor: refs/tags/four
#   A U Thor: refs/tags/one
#   
#   A U Thor: refs/tags/three
#   A U Thor: refs/tags/two
#   EOF
#   test_cmp expect actual
#   

expecting success:
git for-each-ref 
--format="%(if)%(authorname)%(then)%(authorname)%(else)No
author%(end): %(refname)" >actual &&
cat >expect <<-\EOF &&
A U Thor: refs/heads/master
A U Thor: refs/heads/side
A U Thor: refs/odd/spot
No author: refs/tags/annotated-tag
No author: refs/tags/doubly-annotated-tag
No author: refs/tags/doubly-signed-tag
A U Thor: refs/tags/foo1.10
A U Thor: refs/tags/foo1.3
A U Thor: refs/tags/foo1.6
A U Thor: refs/tags/four
A U Thor: refs/tags/one
No author: refs/tags/signed-tag
A U Thor: refs/tags/three
A U Thor: refs/tags/two
EOF
test_cmp expect actual

--- expect  2016-05-07 16:08:32.0 +
+++ actual  2016-05-07 16:08:32.0 +
@@ -3,12 +3,10 @@
 A U Thor: refs/odd/spot
 No author: refs/tags/annotated-tag
 No author: refs/tags/doubly-annotated-tag
-No author: refs/tags/doubly-signed-tag
 A U Thor: refs/tags/foo1.10
 A U Thor: refs/tags/foo1.3
 A U Thor: refs/tags/foo1.6
 A U Thor: refs/tags/four
 A U Thor: refs/tags/one
-No author: refs/tags/signed-tag
 A U Thor: refs/tags/three
 A U Thor: refs/tags/two
not ok 35 - check %(if)...%(then)...%(else)...%(end) atoms
#   
#   git for-each-ref 
--format="%(if)%(authorname)%(then)%(authorname)%(else)No
author%(end): %(refname)" >actual &&
#   cat >expect <<-\EOF &&
#   A U Thor: refs/heads/master
#   A U Thor: refs/heads/side
#   A U Thor: refs/odd/spot
#   No author: refs/tags/annotated-tag
#   No author: refs/tags/doubly-annotated-tag
#   No author: refs/tags/doubly-signed-tag
#   A U Thor: refs/tags/foo1.10
#   A U Thor: refs/tags/foo1.3
#   A U Thor: refs/tags/foo1.6
#   A U Thor: refs/tags/four
#   A U Thor: refs/tags/one
#   No author: refs/tags/signed-tag
#   A U Thor: refs/tags/three
#   A U Thor: refs/tags/two
#   EOF
#   test_cmp expect actual
#   

expecting success:
git for-each-ref --format="%(refname:short): %(if)%(HEAD)%(then)Head
ref%(else)Not Head ref%(end)" >actual &&
cat >expect <<-\EOF &&
master: Head ref
side: Not Head ref
odd/spot: Not Head ref
annotated-tag: Not Head ref
doubly-annotated-tag: Not Head ref
doubly-signed-tag: Not Head ref
foo1.10: Not Head ref
foo1.3: Not Head ref
foo1.6: Not Head ref
four: Not Head ref
one: Not Head ref
signed-tag: Not Head ref
three: Not Head ref
two: Not Head ref
EOF
test_cmp expect actual

--- expect  2016-05-07 16:08:32.0 +
+++ actual  2016-05-07 16:08:32.0 +
@@ -3,12 +3,10 @@
 odd/spot: Not Head ref
 annotated-tag: Not Head ref
 doubly-annotated-tag: Not Head ref
-doubly-signed-tag: Not Head ref
 foo1.10: Not Head ref
 foo1.3: Not Head ref
 foo1.6: Not Head ref
 four: Not Head ref
 one: Not Head ref
-signed-tag: Not Head ref
 three: Not Head ref
 two: Not Head ref
not ok 36 - ignore

Re: t6044 broken on pu

2016-05-07 Thread Torsten Bögershausen

On 2016-05-07 14.19, Andreas Schwab wrote:
> Torsten Bögershausen <tbo...@web.de> writes:
> 
>> The "seq" is not understood by all shells,
>> using printf fixes this,
>>
>> index 20a3ffe..48d964e 100755
>> --- a/t/t6044-merge-unrelated-index-changes.sh
>> +++ b/t/t6044-merge-unrelated-index-changes.sh
>> @@ -20,7 +20,7 @@ test_description="merges with unrelated index changes"
>>  #   Commit E: renames a->subdir/a, adds subdir/e
>>
>>  test_expect_success 'setup trivial merges' '
>> -   seq 1 10 >a &&
>> +   printf 1 2 3 4 5 7 8 9 10 >a &&
> 
> $ printf 1 2 3 4 5 7 8 9 10
> 1

That's true, but the test passes anyway.
So do we need the sequence ?
Is there something that can be improved in the test ?


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: t6044 broken on pu

2016-05-08 Thread Torsten Bögershausen

On 08.05.16 20:20, Junio C Hamano wrote:
> Torsten Bögershausen <tbo...@web.de> writes:
>
>> May a  simple
>>  printf "1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n"
>>
>> be an option ?
> If you were to do that, at least have the decency to make it more
> readable by doing something like:
>
>   printf "%s\n" 1 2 3 4 5 6 7 8 9 10
>
> ;-)
>
> But as I said, as a response to "t6044 broken on pu" bug report,
> s/seq/test_seq/ is the only sensible change.
>
> Improving "test_seq, the alternative to seq" is a separate topic.
>
> If you have aversion to $PERL, perhaps do them without using
> anything what is not expected to be built-in in modern shells,
> perhaps like this?
Please don't get me wrong -
I wasn't really clear why:
I think that the invocation of an external program
to produce a 10 line test program is a waste of CPU cycles  - in this very use 
case.

We can try to use the ideal tool to do the job, in this case
the fork() that needs to be done to invoke perl seems rather expensive in 
relation
to what we get in terms of functionality - a file with 10 lines of content.

I recently read a message why "grep | sed" is not ideal, when sed can do 
everything
that grep needs to do, and only 1 external program needs to be invoked - not 2.

I try to apply the same pattern to this TC: Stay in the shell, as long as 
possible.
But if you really need perl for e.g. regexp, then use it.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v10 12/12] bisect--helper: `get_terms` & `bisect_terms` shell function in C

2016-07-25 Thread Torsten Bögershausen

On 07/25/2016 06:53 PM, Junio C Hamano wrote:

Pranit Bauva  writes:

>>> +enum terms_defined {
>>> +   TERM_BAD = 1,
>>> +   TERM_GOOD = 2,
>>> +   TERM_NEW = 4,
>>> +   TERM_OLD = 8
>>> +};
>>> +

>>
>> What does TERM stand for  ?

The terms (words) used to denote the newer and the older parts of
the history.  Traditionally, as a regression-hunting tool (i.e. it
used to work, where did I break it?), we called older parts of the
history "good" and newer one "bad", but as people gained experience
with the tool, it was found that the pair of words was error-prone
to use for an opposite use case "I do not recall fixing it, but it
seems to have been fixed magically, when did that happen?", and a
more explicit "new" and "old" were introduced.

Thanks for the explanation.
Is there any risk that a more generic term like "TERM_BAD" may collide
with some other definition some day ?

Would it make sense to call it GIT_BISECT_TERM_BAD, GBS_TERM_BAD,
BIS_TERM_BAD or something more unique ?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 4/5] convert: generate large test files only once

2016-07-26 Thread Torsten Bögershausen




On 07/27/2016 02:06 AM, larsxschnei...@gmail.com wrote:

From: Lars Schneider 

Generate a more interesting large test file with random characters in
between and reuse this test file in multiple tests. Run tests formerly
marked as EXPENSIVE every time but with a smaller test file.

Signed-off-by: Lars Schneider 
---
 t/t0021-conversion.sh | 35 +--
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 7b45136..b9911a4 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -4,6 +4,13 @@ test_description='blob conversion via gitattributes'

 . ./test-lib.sh

+if test_have_prereq EXPENSIVE
+then
+   T0021_LARGE_FILE_SIZE=2048
+else
+   T0021_LARGE_FILE_SIZE=30
+fi
+
 cat test.i &&
git add test test.t test.i &&
rm -f test test.t test.i &&
-   git checkout -- test test.t test.i
+   git checkout -- test test.t test.i &&
+
+   mkdir -p generated-test-data &&
+   for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
+   do
+   # Generate 1MB of empty data and 100 bytes of random characters
+   printf "%1048576d" 1
+   printf "$(LC_ALL=C tr -dc "A-Za-z0-9" >8)) count=1 2>/dev/null)"

I'm not sure how portable /dev/urandom is.
The other thing, that "really random" numbers are an overkill, and
it may be easier to use pre-defined numbers,

The rest of 1..4 looks good, I will look at 5/5 later.


+   done >generated-test-data/large.file
 '

 script='s/^\$Id: \([0-9a-f]*\) \$/\1/p'
@@ -199,9 +214,9 @@ test_expect_success 'required filter clean failure' '
 test_expect_success 'filtering large input to small output should use little 
memory' '
test_config filter.devnull.clean "cat >/dev/null" &&
test_config filter.devnull.required true &&
-   for i in $(test_seq 1 30); do printf "%1048576d" 1; done >30MB &&
-   echo "30MB filter=devnull" >.gitattributes &&
-   GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB
+   cp generated-test-data/large.file large.file &&
+   echo "large.file filter=devnull" >.gitattributes &&
+   GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add large.file
 '

 test_expect_success 'filter that does not read is fine' '
@@ -214,15 +229,15 @@ test_expect_success 'filter that does not read is fine' '
test_cmp expect actual
 '

-test_expect_success EXPENSIVE 'filter large file' '
+test_expect_success 'filter large file' '
test_config filter.largefile.smudge cat &&
test_config filter.largefile.clean cat &&
-   for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB &&
-   echo "2GB filter=largefile" >.gitattributes &&
-   git add 2GB 2>err &&
+   echo "large.file filter=largefile" >.gitattributes &&
+   cp generated-test-data/large.file large.file &&
+   git add large.file 2>err &&
test_must_be_empty err &&
-   rm -f 2GB &&
-   git checkout -- 2GB 2>err &&
+   rm -f large.file &&
+   git checkout -- large.file 2>err &&
test_must_be_empty err
 '



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v2 5/5] convert: add filter..process option

2016-07-28 Thread Torsten Bögershausen



On 07/27/2016 02:06 AM, larsxschnei...@gmail.com wrote:
Some comments inline
[]
> The filter is expected to respond with the result content size as
> ASCII number in bytes and the result content in packet format with
> a flush packet at the end:
> 
> packet:  git< 57
> packet:  git< SMUDGED_CONTENT
> packet:  git< 
how does the filter report possible errors here ?
Let's say I want to convert UTF-8 into ISO-8859-1,
but the conversion is impossible.

 > packet:  git< -1
 > packet:  git< Error message, (or empty), followed by a '\n'
 > packet:  git< 

Side note: a filter may return length 0.
Suggestion: add 2 test cases.


> 
> Please note: In a future version of Git the capability "stream"
> might be supported. In that case the content size must not be
> part of the filter response.
>
> Afterwards the filter is expected to wait for the next command.
>
> Signed-off-by: Lars Schneider 
> Helped-by: Martin-Louis Bright 
> Signed-off-by: Lars Schneider 
> ---
>  Documentation/gitattributes.txt |  54 +++-
>  convert.c   | 269 
++--

>  t/t0021-conversion.sh   | 175 ++
>  t/t0021/rot13-filter.pl | 146 ++
>  4 files changed, 631 insertions(+), 13 deletions(-)
>  create mode 100755 t/t0021/rot13-filter.pl
>
> diff --git a/Documentation/gitattributes.txt 
b/Documentation/gitattributes.txt

> index 8882a3e..8fb40d2 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -300,7 +300,11 @@ checkout, when the `smudge` command is 
specified, the command is

>  fed the blob object from its standard input, and its standard
>  output is used to update the worktree file.  Similarly, the
>  `clean` command is used to convert the contents of worktree file
> -upon checkin.
> +upon checkin. By default these commands process only a single
> +blob and terminate. If a long running filter process (see section
> +below) is used then Git can process all blobs with a single filter
> +invocation for the entire life of a single Git command (e.g.
> +`git add .`).
>
>  One use of the content filtering is to massage the content into a shape
>  that is more convenient for the platform, filesystem, and the user 
to use.

> @@ -375,6 +379,54 @@ substitution.  For example:
>  
>
>
> +Long Running Filter Process
> +^^^
> +
> +If the filter command (string value) is defined via
> +filter..process then Git can process all blobs with a
> +single filter invocation for the entire life of a single Git
> +command by talking with the following packet format (pkt-line)
> +based protocol over standard input and standard output.
> +
> +Git starts the filter on first usage and expects a welcome
> +message, protocol version number, and filter capabilities
> +seperated by spaces:
> +
> +packet:  git< git-filter-protocol
> +packet:  git< version 2
> +packet:  git< clean smudge
> +
> +Supported filter capabilities are "clean" and "smudge".
> +
> +Afterwards Git sends a command (e.g. "smudge" or "clean" - based
> +on the supported capabilities), the filename, the content size as
> +ASCII number in bytes, and the content in packet format with a
> +flush packet at the end:
> +
> +packet:  git> smudge
> +packet:  git> testfile.dat
> +packet:  git> 7
> +packet:  git> CONTENT
> +packet:  git> 
> +
> +
> +The filter is expected to respond with the result content size as
> +ASCII number in bytes and the result content in packet format with
> +a flush packet at the end:
> +
> +packet:  git< 57
> +packet:  git< SMUDGED_CONTENT
> +packet:  git< 
> +
> +Please note: In a future version of Git the capability "stream"
> +might be supported. In that case the content size must not be
> +part of the filter response.
> +
> +Afterwards the filter is expected to wait for the next command.
> +A demo implementation can be found in `t/t0021/rot13-filter.pl`
> +located in the Git core repository.
> +
> +
>  Interaction between checkin/checkout attributes
>  ^^^
>
> diff --git a/convert.c b/convert.c
> index 522e2c5..5ff200b 100644
> --- a/convert.c
> +++ b/convert.c
> @@ -3,6 +3,7 @@
>  #include "run-command.h"
>  #include "quote.h"
>  #include "sigchain.h"
> +#include "pkt-line.h"
>
>  /*
>   * convert.c - convert a file when checking it out and checking it in.
> @@ -481,11 +482,232 @@ static int apply_filter(const char *path, 
const char *src, size_t len, int fd,

>return ret;
>  }
>
> +static off_t multi_packet_read(struct

Re: [PATCH v1 3/3] convert: add filter..useProtocol option

2016-07-22 Thread Torsten Bögershausen




On 07/22/2016 05:49 PM, larsxschnei...@gmail.com wrote:

From: Lars Schneider 

Git's clean/smudge mechanism invokes an external filter process for every
single blob that is affected by a filter. If Git filters a lot of blobs
then the startup time of the external filter processes can become a
significant part of the overall Git execution time.

This patch adds the filter..useProtocol option which, if enabled,
keeps the external filter process running and processes all blobs with
the following protocol over stdin/stdout.

1. Git starts the filter on first usage and expects a welcome message
with protocol version number:
Git <-- Filter: "git-filter-protocol\n"
Git <-- Filter: "version 1"

Is there no terminator here ?
How long should the reading side wait without a '\n', or should it read
"version 1\n" ?



2. Git sends the command (either "smudge" or "clean"), the filename, the
content size in bytes, and the content separated by a newline character:
Git --> Filter: "smudge\n"
Git --> Filter: "testfile.dat\n"
Git --> Filter: "7\n"
Git --> Filter: "CONTENT"

3. The filter is expected to answer with the result content size in
bytes and the result content separated by a newline character:
Git <-- Filter: "15\n"
Git <-- Filter: "SMUDGED_CONTENT"

4. The filter is expected to wait for the next file in step 2, again.

Please note that the protocol filters do not support stream processing
with this implemenatation because the filter needs to know the length of

typo

the result in advance. A protocol version 2 could address this in a
future patch.

Signed-off-by: Lars Schneider 
Helped-by: Martin-Louis Bright 
---
 Documentation/gitattributes.txt |  41 +++-
 convert.c   | 210 ++--
 t/t0021-conversion.sh   | 170 
 t/t0021/rot13.pl|  80 +++
 4 files changed, 494 insertions(+), 7 deletions(-)
 create mode 100755 t/t0021/rot13.pl

diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
index 8882a3e..7026d62 100644
--- a/Documentation/gitattributes.txt
+++ b/Documentation/gitattributes.txt
@@ -300,7 +300,10 @@ checkout, when the `smudge` command is specified, the 
command is
 fed the blob object from its standard input, and its standard
 output is used to update the worktree file.  Similarly, the
 `clean` command is used to convert the contents of worktree file
-upon checkin.
+upon checkin. By default these commands process only a single
+blob and terminate. If the setting filter..useProtocol is
+enabled then Git can process all blobs with a single filter command
+invocation (see filter protocol below).

 One use of the content filtering is to massage the content into a shape
 that is more convenient for the platform, filesystem, and the user to use.
@@ -375,6 +378,42 @@ substitution.  For example:
 


+Filter Protocol
+^^^
+
+If the setting filter..useProtocol is enabled then Git
+can process all blobs with a single filter command invocation
+by talking with the following protocol over stdin/stdout.
+
+Git starts the filter on first usage and expects a welcome
+message with protocol version number:
+
+Git < Filter: "git-filter-protocol\n"
+Git < Filter: "version 1"
+
+
+Afterwards Git sends a blob command (either "smudge" or "clean"),
+the filename, the content size in bytes, and the content separated
+by a newline character:
+
+Git > Filter: "smudge\n"
+Git > Filter: "testfile.dat\n"
+Git > Filter: "7\n"
+Git > Filter: "CONTENT"
+
+
+The filter is expected to answer with the result content size in
+bytes and the result content separated by a newline character:
+
+Git < Filter: "15\n"
+Git < Filter: "SMUDGED_CONTENT"
+
+
+Afterwards the filter is expected to wait for the next blob process
+command. A demo implementation can be found in `t/t0021/rot13.pl`
+located in the Git core repository.
+
+
 Interaction between checkin/checkout attributes
 ^^^

diff --git a/convert.c b/convert.c
index 522e2c5..91ce86f 100644
--- a/convert.c
+++ b/convert.c
@@ -481,12 +481,188 @@ static int apply_filter(const char *path, const char 
*src, size_t len, int fd,
return ret;
 }

+static int cmd_process_map_init = 0;
+static struct hashmap cmd_process_map;
+
+struct cmd2process {
+   struct hashmap_entry ent; /* must be the first member! */
+   const char *cmd;
+   long protocol;
+   struct child_process process;
+};
+
+static int cmd2process_cmp(const struct cmd2process *e1,
+   const struct 
cmd2process *e2,
+

Re: [PATCH v10 12/12] bisect--helper: `get_terms` & `bisect_terms` shell function in C

2016-07-21 Thread Torsten Bögershausen




On 07/20/2016 11:47 PM, Pranit Bauva wrote:

Reimplement the `get_terms` and `bisect_terms` shell function in C and
add `bisect-terms` subcommand to `git bisect--helper` to call it from
git-bisect.sh .

In the shell version, the terms were identified by strings but in C
version its done by bit manipulation and passing the integer value to
the function.

Using `--bisect-terms` subcommand is a temporary measure to port shell
function in C so as to use the existing test suite. As more functions
are ported, this subcommand will be retired and will be called by some
other methods.

Mentored-by: Lars Schneider 
Mentored-by: Christian Couder 
Signed-off-by: Pranit Bauva 
---
 builtin/bisect--helper.c | 74 +++-
 git-bisect.sh| 35 ++-
 2 files changed, 75 insertions(+), 34 deletions(-)

diff --git a/builtin/bisect--helper.c b/builtin/bisect--helper.c
index 001096a..185a8ad 100644
--- a/builtin/bisect--helper.c
+++ b/builtin/bisect--helper.c
@@ -8,6 +8,13 @@
 #include "run-command.h"
 #include "prompt.h"

+enum terms_defined {
+   TERM_BAD = 1,
+   TERM_GOOD = 2,
+   TERM_NEW = 4,
+   TERM_OLD = 8
+};
+

What does TERM stand for  ?
It could be TERMinal, TERMinator or just TERM.
Something like BIS_TERM_DEF_BAD .. may be more intuitive,
and may avoid name clashes in the long run.

And why are the defines 1,2,4,8 ?
It looks as if a #define bitmap may be a better choice here ?
How do we do these kind of bit-wise opions otherwise ?


 static GIT_PATH_FUNC(git_path_bisect_terms, "BISECT_TERMS")
 static GIT_PATH_FUNC(git_path_bisect_expected_rev, "BISECT_EXPECTED_REV")
 static GIT_PATH_FUNC(git_path_bisect_ancestors_ok, "BISECT_ANCESTORS_OK")
@@ -26,6 +33,7 @@ static const char * const git_bisect_helper_usage[] = {
N_("git bisect--helper --bisect-write 
[]"),
N_("git bisect--helper --bisect-check-and-set-terms   
"),
N_("git bisect--helper --bisect-next-check []  
term_bad, fp) == EOF ||
+ strbuf_getline(>term_good, fp) == EOF;
+
+   fclose(fp);
+   return res ? -1 : 0;
+}
+
+static int bisect_terms(struct bisect_terms *terms, int term_defined)
+{
+   if (get_terms(terms)) {
+   fprintf(stderr, "no terms defined\n");
+   return -1;
+   }
+   if (!term_defined) {
+   printf("Your current terms are %s for the old state\nand "
+  "%s for the new state.\n", terms->term_good.buf,
+  terms->term_bad.buf);
+   return 0;
+   }
+
+   if (term_defined == TERM_GOOD || term_defined == TERM_OLD)
+   printf("%s\n", terms->term_good.buf);
+   if (term_defined == TERM_BAD || term_defined == TERM_NEW)
+   printf("%s\n", terms->term_bad.buf);

May be a switch-case ?
Or at least "else if" ?


+
+   return 0;
+}
+
 int cmd_bisect__helper(int argc, const char **argv, const char *prefix)
 {
enum {
@@ -369,9 +414,11 @@ int cmd_bisect__helper(int argc, const char **argv, const 
char *prefix)
CHECK_EXPECTED_REVS,
BISECT_WRITE,
CHECK_AND_SET_TERMS,
-   BISECT_NEXT_CHECK
+   BISECT_NEXT_CHECK,
+   BISECT_TERMS
} cmdmode = 0;
int no_checkout = 0, res = 0;
+   enum terms_defined term_defined = 0;
struct option options[] = {
OPT_CMDMODE(0, "next-all", ,
 N_("perform 'git bisect next'"), NEXT_ALL),
@@ -389,6 +436,16 @@ int cmd_bisect__helper(int argc, const char **argv, const 
char *prefix)
 N_("check and set terms in a bisection state"), 
CHECK_AND_SET_TERMS),
OPT_CMDMODE(0, "bisect-next-check", ,
 N_("check whether bad or good terms exist"), 
BISECT_NEXT_CHECK),
+   OPT_CMDMODE(0, "bisect-terms", ,
+N_("print out the bisect terms"), BISECT_TERMS),
+   OPT_BIT(0, "term-bad", _defined,
+N_("show the bad term"), TERM_BAD),
+   OPT_BIT(0, "term-good", _defined,
+N_("show the good term"), TERM_GOOD),
+   OPT_BIT(0, "term-new", _defined,
+N_("show the new term"), TERM_NEW),
+   OPT_BIT(0, "term-old", _defined,
+N_("show the old

Re: t0027 racy?

2016-08-11 Thread Torsten Bögershausen

[]
> FWIW I would strongly prefer to use the warning of `git add` and not even
> bother with `git commit`. What we are interested in is the warning
> message, generated by convert_to_git(). 
The commit is needed, because we check the content of the commit later.
> Not using the first one and
[]
> 
> On that matter, I wonder whether there would be a chance to revamp t0027
> in a major way, with the following goals:
> 
> - to make it very obvious to the casual reader what is being tested
> 
> - to combine Git invocations when possible, e.g. running one big `git add`
>   on a couple of files and then verify the relevant parts of the output
> 
> - dramatically decreasing the time required to run the test, without
>   sacrificing correctness (I would wager a bet that not only a few of
>   those 1388 test cases essentially exercise identical code paths)
Good ideas, I will work on a series that fixes bugs first, and then we
can see if there is room for optimization.

What do you think about this as a starting point,
more things will follow.
I like to here comments about the commit msg first ;-)

commit 3754404d3d1ea4a0cbbed4986cc4ac1b5fe6b66e
Author: Torsten Bögershausen <tbo...@web.de>
Date:   Thu Aug 11 18:47:29 2016 +0200

t0027: Correct raciness in NNO test

When a non-reversible CRLF conversion is done in "git add",
a warning is printed on stderr.

The commit_chk_wrnNNO() function  in t0027 was written to test this,
but did the wrong thing: Instead of looking at the warning
from "git add", it looked at the warning from "git commit".

Correct this and replace the commit for each and every file with a commit
of all files in one go.

The function commit_chk_wrnNNO() will to be renamed in a separate commit.
Thanks to Jeff King <p...@peff.net> for analizing t0027.
Reporyed-By: Johannes Schindelin <johannes.schinde...@gmx.de>

diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index 2860d2d..6e44382 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -119,8 +119,7 @@ commit_chk_wrnNNO () {
fname=${pfx}_$f.txt &&
cp $f $fname &&
printf Z >>"$fname" &&
-   git -c core.autocrlf=$crlf add $fname 2>/dev/null &&
-   git -c core.autocrlf=$crlf commit -m "commit_$fname" $fname 
>"${pfx}_$f.err" 2>&1
+   git -c core.autocrlf=$crlf add $fname 2>"${pfx}_$f.err"
done
 
test_expect_success "commit NNO files crlf=$crlf attr=$attr LF" '
@@ -394,11 +393,10 @@ test_expect_success 'commit files attr=crlf' '
 
 # attrLFCRLF  CRLFmixLF   
LF_mix_CR   CRLFNUL
 commit_chk_wrnNNO ""  ""  false   """"""  ""   
   ""
-commit_chk_wrnNNO ""  ""  trueLF_CRLF   """"  ""   
   ""
+commit_chk_wrnNNO ""  ""  true""""""  ""   
   ""
 commit_chk_wrnNNO ""  ""  input   """"""  ""   
   ""
-
-commit_chk_wrnNNO "auto"  ""  false   "$WILC"   """"  ""   
   ""
-commit_chk_wrnNNO "auto"  ""  trueLF_CRLF   """"  ""   
   ""
+commit_chk_wrnNNO "auto"  ""  false   """"""  ""   
   ""
+commit_chk_wrnNNO "auto"  ""  true""""""  ""   
   ""
 commit_chk_wrnNNO "auto"  ""  input   """"""  ""   
   ""
 for crlf in true false input
 do
@@ -408,7 +406,7 @@ do
commit_chk_wrnNNO ""lf  $crlf   ""   CRLF_LFCRLF_LF 
 "" CRLF_LF
commit_chk_wrnNNO ""crlf$crlf   LF_CRLF   ""LF_CRLF 
LF_CRLF ""
commit_chk_wrnNNO auto  lf  $crlf   """"""  
""  ""
-   commit_chk_wrnNNO auto  crlf$crlf   LF_CRLF   """"  
""  ""
+   commit_chk_wrnNNO auto  crlf$crlf   """"""  
""  ""
commit_chk_wrnNNO text  lf  $crlf   ""   CRLF_LFCRLF_LF 
""  CRLF_LF
commit_chk_wrnNNO text  crlf$crlf   LF_CRLF   ""LF_CRLF 
LF_CRLF ""
 done
@@ -417,7 +415,8 @@ commit_chk_wrnNNO "text"  ""  false   "$WILC"   "$WICL" 
  "$WAMIX""$WILC
 commit_chk_wrnNNO "text"  ""  trueLF_CRLF   ""LF_CRLF 
LF_CRLF ""
 commit_chk_wrnNNO "text"  ""  input   ""CRLF_LF   CRLF_LF ""   
   CRLF_LF
 
-test_expect_success 'create files cleanup' '
+test_expect_success 'commit NNO and cleanup' '
+   git commit -m "commit files on top of NNO" &&
rm -f *.txt &&
git -c core.autocrlf=false reset --hard
 '
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Making file permissions match

2016-08-03 Thread Torsten Bögershausen

On Wed, Aug 03, 2016 at 09:46:06AM -0400, jonsm...@gmail.com wrote:
> I'm working with some Windows programmers that don't believe in file
> permissions They keep sending me zip files of their source tree.  I
> have my copy of the tree in git on Linux with all of the correct file
> permissions.
> 
> So I unzip the archive they send me and to see what they changed. I
> unzip it on top of my git tree. But now all of the file permissions
> don't match. The code deltas are there but there is also all of this
> noise from the file permissions.
> 
> Is there an easy way to deal with this? I want to keep the code deltas
> and ignore the permission changes. Since there are about 100K files
> this is too much to do manually.

(I'm not sure if I understand "match" correctly)

You can ignore changes in the executable permissions:
git config core.filemode false

Please let us know if it works
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v1 1/2] t0027: Correct raciness in NNO test

2016-08-14 Thread Torsten Bögershausen

On Sat, Aug 13, 2016 at 06:50:19PM +0200, Johannes Sixt wrote:
> Am 12.08.2016 um 18:51 schrieb tbo...@web.de:
> >From: Torsten Bögershausen <tbo...@web.de>
> >
> >When a non-reversible CRLF conversion is done in "git add",
> >a warning is printed on stderr (or Git dies, depending on checksafe)
> >
> >The function commit_chk_wrnNNO() in t0027 was written to test this,
> >but did the wrong thing: Instead of looking at the warning
> >from "git add", it looked at the warning from "git commit".
> >
> >This is racy because "git commit" may not have to do CRLF conversion
> >at all if it can use the sha1 value from the index (which depends on
> >whether "add" and "commit" run in a single second).
> >
> >Correct this and replace the commit for each and every file with a commit
> >of all files in one go.
> 
> The new test code does not only fix the race condition, but also
> tests different things, or prepares test cases in a different
> manner. I would have appreciated an explanation why this is
> necessary. Is it "on my machine, the race condition was triggered
> consistently for a bunch of tests, and so I recorded the wrong
> expected output in the test cases"?
> 
Good point. The origanal intention was to let t0027 pass, and fix
the bug in convert.c in a later commit.
(and rename NNO in a 3rd commit, that is postponed until we figured this out).

Looking at t0027, it turns out that the changes in the test matrix done in 1/2
are reverted in 2/2.
This is not a good thing.
Either (a) they should be marked as test_expect_failure in 1/2 and
corrected in 2/2,
or (b) 1/2 and 2/2 are squashed together and the noise in t0027 is minimal.
I'll send a patch for (b) in a second.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: What's cooking in git.git (Jun 2016, #05; Thu, 16)

2016-07-13 Thread Torsten Bögershausen




On 07/13/2016 12:20 AM, Joey Hess wrote:

Torsten Bögershausen wrote re jh/clean-smudge-annex:

The thing is that we need to check the file system to find .gitatttibutes,
even if we just did it 1 nanosecond ago.

So the .gitattributes is done 3 times:
-1 would_convert_to_git_filter_fd(
-2 assert(would_convert_to_git_filter_fd(path));
-3 convert.c/convert_to_git_filter_fd()

The only situation where this could be useful is when the .gitattributes
change between -1 and -2,
but then they would have changed between -1 and -3, and convert.c
will die().

Does it make sense to remove -2 ?


There's less redundant work going on than at first seems, because
.gitattribute files are only read once and cached. Verified by strace.


OK, I think I missed that work (not enough time for Git at the moment)
Junio, please help me out, do we have a cache here now?
I tried to figure out that following your attr branch, but failed.

So, the redundant work is only in the processing that convert_attrs() does
of the cached .gitattributes.

Notice that there was a similar redundant triple call to convert_attrs()
before my patch set:

1. index_fd checks would_convert_to_git_filter_fd
2. index_stream_convert_blob does assert(would_convert_to_git_filter_fd(path))
   (Again redundantly since 1. is its only caller and has already
   checked.)
3. in convert_to_git_filter_fd

If convert_attrs() is somehow expensive, it might be worth passing a
struct conv_attrs * into the functions that currently call
convert_attrs(). But it does not look expensive to me.

I have that on the list, but seems to be uneccesary now.


I think it would be safe enough to remove both asserts, at least as the
code is structured now.


OK.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Looking for help to understand external filter driver code

2016-07-19 Thread Torsten Bögershausen




On 07/20/2016 12:01 AM, Lars Schneider wrote:


On 19 Jul 2016, at 23:33, Junio C Hamano  wrote:


Lars Schneider  writes:


Git writes --> 4 byte filename length
Git writes --> filename string


Why limit to 32GB?  Perhaps NUL termination is more appropriate
here?

OK, I will use NUL termination for the filename.
You're also right about the limit - I will use 8 byte to encode the
content length.

Is there any reason to encode the file length in binary format?
With all the discussions about big endianess, little endianess, 4GiB or 
32 GiB.

How about simply writing the length as ASCII ?

Unless we don't want to have a "spare" field for future extensions,
it could be good to add an option field, which may be empty.
On top of that, do we want a field separator different from the line
separator ?

How about this:


 may be "var=value;var2=value2" or simply ""
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 2/5] t5000: test tar files that overflow ustar headers

2016-07-15 Thread Torsten Bögershausen




On 07/15/2016 12:38 AM, Jeff King wrote:

On Thu, Jul 14, 2016 at 03:30:58PM -0700, Junio C Hamano wrote:


If we move to time_t everywhere, I think we'll need an extra
TIME_T_IS_64BIT, but we can cross that bridge when we come to it.

Likewise I think we'll need SIZE_T_IS_64BIT eventually (for real 32-bit
systems; LLP64 systems like Windows will then be able to run the test).


I guess I wrote essentially the same thing before refreshing my
Inbox.

I am a bit fuzzy between off_t and size_t; the former is for the
size of things you see on the filesystem, while the latter is for
you to give malloc(3).  I would have thought that off_t is the type
we would want at the end of the raw object header, denoting the size
of a blob object when deflated, which could be larger than the size
of a region of memory we can get from malloc(3), in which case we
would use the streaming interface.


Yeah, your understanding is right (s/deflated/inflated/). I agree that
off_t is probably a better size for blobs. Traditionally git assumed any
object could fit in memory. The streaming interface helps that somewhat,
but I think there are cases where we still must load a blob (e.g., if it
is stored as a delta). In theory that never happens because of
core.bigfilethreshold, but you may get a packfile from somebody with a
higher threshold than you.

I wouldn't be surprised if there are other cases that aren't smart
enough to use the streaming interface yet, but the solution there is to
make them smarter. :)

So off_t is probably better. We do need to be careful, though, when
allocating objects. E.g., this:

  off_t size;
  struct git_istream *stream;
  void *buf;

  stream = open_istream(sha1, , , NULL);
  buf = xmalloc(size);
  while (1) {
/* read stream into buf */
  }

is a security hole when size_t is less than off_t (it gets truncated in
the call to xmalloc, which allocates too few bytes). This is a toy
example, obviously, but it's something to watch out for.

-Peff

That code is "illegal", it should be
 buf = xmalloc(xsize_t(size));

And the transition from off_t into size_t
should always got via xsize_t():

static inline size_t xsize_t(off_t len)
{
if (len > (size_t) len)
die("Cannot handle files this big");
return (size_t)len;
}

There are some more things to be done, on the long run:
- convert "unsigned long" into either off_t of size_t in e.g. convert.c
- Use the streaming interface to analyze if blobs are binary
  (That is already on my list, the old "stream and early out"
  from the olc 10/10, gmane/$293010 or so can be reused)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge

2016-07-11 Thread Torsten Bögershausen



How do things look at this point?  This version is what I ended up
queuing in 'pu', but I took your "Thanks" in $gmane/299120 to only
mean "Thanks for feeding some ideas to help me move forward", not
necessarily "Tnanks that looks like the right approach." yet, so
right now both topics are stalled and waiting for an action from
you.

Yes, the code looks good to me.
And the commit message does explain what is going on.

For my taste, these 3 lines don't explain too much,may be remove them ?
> The test update was taken from a series by Torsten Bögershausen
> that attempted to fix this with a different approach (which was a
> lot more intrusive).
So thanks for your efforts, ack from my side.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 3/5] archive-tar: write extended headers for file sizes >= 8GB

2016-07-14 Thread Torsten Bögershausen




On 07/14/2016 06:48 PM, Johannes Sixt wrote:

Am 30.06.2016 um 11:09 schrieb Jeff King:

+/*
+ * This is the max value that a ustar size header can specify, as it
is fixed
+ * at 11 octal digits. POSIX specifies that we switch to extended
headers at
+ * this size.
+ */
+#define USTAR_MAX_SIZE 0777UL


This is too large by one bit for our 32-bit unsigned long on Windows:

archive-tar.c: In function 'write_tar_entry':
archive-tar.c:295: warning: integer constant is too large for 'unsigned
long' type
archive-tar.c: In function 'write_global_extended_header':
archive-tar.c:332: warning: integer constant is too large for 'unsigned
long' type
archive-tar.c:335: warning: integer constant is too large for 'unsigned
long' type
archive-tar.c:335: warning: overflow in implicit constant conversion

-- Hannes

Similar problem on 32 Bit Linux:
 In function ‘write_global_extended_header’:
archive-tar.c:29:25: warning: overflow in implicit constant conversion 
[-Woverflow]

 #define USTAR_MAX_MTIME 0777UL
 ^
archive-tar.c:335:16: note: in expansion of macro ‘USTAR_MAX_MTIME’
   args->time = USTAR_MAX_MTIME;

--
I want to volunteer to do more tests on 32 bit Linux ;-)
Does this fix it and look as a good thing to change ?


--- a/archive-tar.c
+++ b/archive-tar.c
@@ -332,7 +332,7 @@ static void write_global_extended_header(struct 
archiver_args *args)

if (args->time > USTAR_MAX_MTIME) {
strbuf_append_ext_header_uint(_header, "mtime",
  args->time);
-   args->time = USTAR_MAX_MTIME;
+   args->time = (time_t)USTAR_MAX_MTIME;
}
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 0/6] worktree lock/unlock

2016-06-27 Thread Torsten Bögershausen

>On 03.06.16 14:19, Nguyễn Thái Ngọc Duy wrote:

Minor problem:
t2028 fails, when the test is run from a directory that is

a softlink.
(In my case 
/Users/tb/projects/git/git.pu
is a softlink to
/Users/tb/NoBackup/projects/git/git.pu/


[master (root-commit) 2519212] init
 Author: A U Thor 
 1 file changed, 1 insertion(+)
 create mode 100644 init.t
Preparing source (identifier source)
HEAD is now at 2519212 init
--- expected2016-06-27 09:50:19.0 +
+++ actual  2016-06-27 09:50:19.0 +
@@ -1,2 +1,2 @@
-worktree /Users/tb/projects/git/git.pu/t/trash directory.t2028-worktree-move
-worktree /Users/tb/projects/git/git.pu/t/trash 
directory.t2028-worktree-move/source
+worktree /Users/tb/NoBackup/projects/git/git.pu/t/trash 
directory.t2028-worktree-move
+worktree /Users/tb/NoBackup/projects/git/git.pu/t/trash 
directory.t2028-worktree-move/source
not ok 1 - setup
#

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge

2016-07-07 Thread Torsten Bögershausen

On 2016-07-06 16.57, Junio C Hamano wrote:
> Torsten Bögershausen <tbo...@web.de> writes:
> 
>> At some point inside the merge, Git calls read-cache.c/ce_compare_data(),
>>   to check if the path named "file" is clean.
>>   According to the tree information, the path "file" has the sha1 99b633.
>>   #Note:
>>   #sha1 99b633 is "first line\nsame line\n"
> 
> I thought your scenario was that our side had CRLF both in the blob
> and in the working tree.  So this is a different example?  Our side
> has LF in the blob that is checked out with CRLF in the working tree,
> and their side has CRLF in the blob?
> 
That was probably to confuse myself, and the rest of the world,
sorry for confusion.

This is the callstack:

merge-recursive.c/add_cacheinfo(unsigned int mode, const unsigned char *sha1,
const char *path, int stage, int refresh, int options)
{
struct cache_entry *ce;
ce = make_cache_entry
if (!ce)
return error(_("addinfo_cache failed for path '%s'"), path);
return add_cache_entry(ce, options);
}
#Calls
read-cache.c/make_cache_entry(path=file sha1=0x99b633 stage=0)


struct cache_entry *make_cache_entry(unsigned int mode,
const unsigned char *sha1, const char *path, int stage,
unsigned int refresh_options)
{
[snip]
ret = refresh_cache_entry(ce, refresh_options);
if (ret != ce)
free(ce);
return ret;
}

#Calls
refresh_cache_ent(_index, ce, options, NULL, NULL);

#Calls
ie_modified()

#Calls
read-cache.c/ie_match_stat()

#Calls
changed |= ce_modified_check_fs(ce, st);

#Calls
ce_compare_data(path=file sha1=0x99b633)

#Calls
index_fd(..., ..., ce->name, ...)
#Note the sha is lost here, the parameter sha is the output.

Deep down, has_cr_in_index(path) will look at ad55e2 instead,
which is wrong.


>> ce_compare_data() starts the work:
>> OK, lets run index_fd(...,ce->name,...)
>> # index_fd will simulate a "git add"  and return the sha1 (via the sha1 
>> pointer)
>> # after the hashing.
>>
>> # ce_compare_data() will later compare ce->sha1 with the result stored in
>> # the local sha1. That's why a sha1 is in the parameter list.
>> # To return the resulting hash:
>>
>> ce_compare_data() calls index_fd(sha1, ..., ce->name, ...)
>>
>> #Down in index_fd():
>>
>> sha1_file.c/index_fd() calls index_core() (after consulting the attributes)
>> index_core() reads the file from the working tree into memory using
>> read_in_full(), and after that index_mem() calls hash_sha1_file(buf)
>> to calculate the sha1.
>>
>> Before the hashing is done, index_mem() runs
>> convert_to_git(path, buf, size, , SAFE_CRLF_FALSE)
>> to convert  "blobs to git internal format".
>>
>>
>> Here, convert_to_git() consults the .gitattributes (again) to find out that
>> crlf_action is CRLF_AUTO in our case.
>> The "new safer autocrlf handling" says that if a blob as any CR, don't 
>> convert
>> CRLFs at "git add".
>>
>> convert.c/crlf_to_git() starts to do it's job:
>> - look at buf (It has CRLF, conversion may be needed)
>> - consult blob_has_cr()
>>   # Note: Before this patch, has_cr_in_index(path)) was used
>>
>> #Again, before this patch,
>> has_cr_in_index(path) calls read_blob_data_from_cache(path, ) to read the
>> blob into memory.
>>
>> Now read_blob_data_from_cache() is a macro, and we end up in
>> read_blob_data_from_index(_index, (path), (sz))
>>
>> read-cache.c/read_blob_data_from_index() starts its work:
>>  pos = index_name_pos(istate, path, len);
>>  if (pos < 0) {
>>  /*
>>   * We might be in the middle of a merge, in which
>>   * case we would read stage #2 (ours).
>>   */
>>
>> # And here, and this is important to notice, "ours" is sha1 ad55e2,
>> # which corresponds to "first line\r\nsame line\r\n"
> 
> Where did 99b633 come from then?  There still is something missing
> in this description.
> 
> Puzzled...
This is an unfinished attempt for a commit message:
--
correct ce_compare_data() at the end of a merge

The following didn't work as expected:

 - At the end of a merge
 - merge.renormalize is true,
 - .gitattributes = "* text=auto"
 - core.eol = crlf

Merge a blob with CRLF "first line\r\nsame line\r\n" and a blob
with LF "first line\nsame line\n".

The expected result of the merge is "first line\nsame line\n".

The content in the working tre

Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge

2016-07-08 Thread Torsten Bögershausen




On 07/07/16 20:43, Junio C Hamano wrote:
> Torsten Bögershausen <tbo...@web.de> writes:
>
>> This is the call stack:
>>
>> merge-recursive.c/add_cacheinfo(unsigned int mode, const unsigned 
char *sha1,

>>const char *path, int stage, int refresh, int options)
>> {
>>struct cache_entry *ce;
>>ce = make_cache_entry
>>if (!ce)
>>return error(_("addinfo_cache failed for path '%s'"), path);
>>return add_cache_entry(ce, options);
>> }
>> #Calls
>> read-cache.c/make_cache_entry(path=file sha1=0x99b633 stage=0)
>>
>>
>> struct cache_entry *make_cache_entry(unsigned int mode,
>>const unsigned char *sha1, const char *path, int stage,
>>unsigned int refresh_options)
>> {
>> [snip]
>>ret = refresh_cache_entry(ce, refresh_options);
>>if (ret != ce)
>>free(ce);
>>return ret;
>> }
>>
>> #Calls
>> refresh_cache_ent(_index, ce, options, NULL, NULL);
>>
>> #Calls
>> ie_modified()
>>
>> #Calls
>> read-cache.c/ie_match_stat()
>>
>> #Calls
>> changed |= ce_modified_check_fs(ce, st);
>>
>> #Calls
>> ce_compare_data(path=file sha1=0x99b633)
>>
>> #Calls
>> index_fd(..., ..., ce->name, ...)
>> #Note the sha is lost here, the parameter sha is the output.
>>
>> Deep down, has_cr_in_index(path) will look at ad55e2 instead,
>> which is wrong.
> The call to add_cacheinfo() is made with 99b633 to record the 3-way
> merge result to the index in this callchain:
>
>  merge_trees()
>  -> git_merge_trees()
>  -> process_renames() # does nothing for this case
>  -> process_entry()
> -> merge_content()
>-> merge_file_special_markers()
>   -> merge_file_1()
>  -> merge_3way()
> -> ll_merge() # correctly handles the renormalization
>  -> write_sha1_file() # this is what gives us 99b633
> -> update_file() # this is fed the 99b633 write_sha1_file() computed
>-> update_file_flags()
>   -> read_sha1_file() # reads 99b633
>   -> convert_to_working_tree()
>   -> write_in_full() # updates the working tree
>   -> add_cacheinfo() # to record 99b633 at "file" in the index
>
> So refresh() may incorrectly find that 99b633 does not match what is
> in the filesystem if it looks at _any_ entry in the index.  We know
> we have written the file ourselves, so by definition it should match
> ;-)

Does it, always ?
There was a lengthy discussion around January, if
git checkout -f
should always give a clean workspace.
(My suggestion was yes, please), but the conclusion
was to always check the other way around:
What would "git add" say ?
If the smudge/clean don't provide
a round trip, then the worktree should not be clean.

My understanding is, that after merge, the filters must be round trip
(and all other conversions), otherwise the merge willl fail.

In this case the merge fails because of a bug in Git.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge

2016-07-08 Thread Torsten Bögershausen




On 07/08/2016 06:36 PM, Junio C Hamano wrote:

Torsten Bögershausen <tbo...@web.de> writes:


I dunno.  I really do not like that extra sha1 argument added all
over the callchain by this patch.

Or did you mean other calls to add_cacheinfo()?


I didn't mean too much - the whole call chain touches code where I
am not able to comment on details.
I'm happy to test other implementations, if someone suggests a
path, so to say.


I did a bit of experiment.

When 1/3 alone is applied, and then only changes for t/t6038 from
3/3 is picked, (i.e. we do not add the extra "don't look at index,
check this contents"), your "Merge addition of text=auto eol=CRLF"
test would fail.

And then with this further on top:

diff --git a/merge-recursive.c b/merge-recursive.c
index b880ae5..628c8ed 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -202,6 +202,9 @@ static int add_cacheinfo(unsigned int mode, const unsigned 
char *sha1,
const char *path, int stage, int refresh, int options)
 {
struct cache_entry *ce;
+
+   if (!stage)
+   remove_file_from_cache(path);
ce = make_cache_entry(mode, sha1 ? sha1 : null_sha1, path, stage,
  (refresh ? (CE_MATCH_REFRESH |
  CE_MATCH_IGNORE_MISSING) : 0 ));


Thanks :-)
Did that experiment made it to a branch somewhere ?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge

2016-07-08 Thread Torsten Bögershausen




On 07/07/16 20:43, Junio C Hamano wrote:

Torsten Bögershausen <tbo...@web.de> writes:


This is the callstack:

merge-recursive.c/add_cacheinfo(unsigned int mode, const unsigned char *sha1,
const char *path, int stage, int refresh, int options)
{
struct cache_entry *ce;
ce = make_cache_entry
if (!ce)
return error(_("addinfo_cache failed for path '%s'"), path);
return add_cache_entry(ce, options);
}
#Calls
read-cache.c/make_cache_entry(path=file sha1=0x99b633 stage=0)


struct cache_entry *make_cache_entry(unsigned int mode,
const unsigned char *sha1, const char *path, int stage,
unsigned int refresh_options)
{
 [snip]
ret = refresh_cache_entry(ce, refresh_options);
if (ret != ce)
free(ce);
return ret;
}

#Calls
refresh_cache_ent(_index, ce, options, NULL, NULL);

#Calls
ie_modified()

#Calls
read-cache.c/ie_match_stat()

#Calls
changed |= ce_modified_check_fs(ce, st);

#Calls
ce_compare_data(path=file sha1=0x99b633)

#Calls
index_fd(..., ..., ce->name, ...)
#Note the sha is lost here, the parameter sha is the output.

Deep down, has_cr_in_index(path) will look at ad55e2 instead,
which is wrong.

The call to add_cacheinfo() is made with 99b633 to record the 3-way
merge result to the index in this callchain:

  merge_trees()
  -> git_merge_trees()
  -> process_renames() # does nothing for this case
  -> process_entry()
 -> merge_content()
-> merge_file_special_markers()
   -> merge_file_1()
  -> merge_3way()
 -> ll_merge() # correctly handles the renormalization
  -> write_sha1_file() # this is what gives us 99b633
 -> update_file() # this is fed the 99b633 write_sha1_file() computed
-> update_file_flags()
   -> read_sha1_file() # reads 99b633
   -> convert_to_working_tree()
   -> write_in_full() # updates the working tree
   -> add_cacheinfo() # to record 99b633 at "file" in the index

So refresh() may incorrectly find that 99b633 does not match what is
in the filesystem if it looks at _any_ entry in the index.  We know
we have written the file ourselves, so by definition it should match
;-) Even though I am not sure why that should affect the overall
correctness of the program (because we have written the correct
result to both working tree and to the index), this needs to be
fixed.

I am however not convinced passing the full SHA-1 is a good
solution.

It seems at least that this is the correct way to do it.

The make_cache_entry() function may be creating a cache
entry to stuff in a non-default index (i.e. not "the_index"), but
its caller does not have a way to tell it which index the resulting
cache entry will go, and calls refresh_cache_entry(), which always
assumes that the caller is interested in "the_index", so what
add_cacheinfo() does by calling make_cache_entry() feels wrong in
the first place.  Also, the call from update_file_flags() knows that
the working tree is in sync with the resulting cache entry.

Perhaps update_file_flags() should not even call add_cacheinfo()
there in the first place?  I wonder if it should just call
add_file_to_index()---after all, even though the current code
already knows that 99b633 _is_ the result it wants in the index, it
still goes to the filesystem and rehashes.

I dunno.  I really do not like that extra sha1 argument added all
over the callchain by this patch.

Or did you mean other calls to add_cacheinfo()?

I didn't mean too much - the whole call chain touches code where I am not able 
to
comment on details.
I'm happy to test other implementations, if someone suggests a path, so to say.




--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 11/12] convert: add filter..process option

2016-08-05 Thread Torsten Bögershausen

On 2016-08-03 18.42, larsxschnei...@gmail.com wrote:
> The filter is expected to respond with the result content in zero
> or more pkt-line packets and a flush packet at the end. Finally, a
> "result=success" packet is expected if everything went well.
> 
> packet:  git< SMUDGED_CONTENT
> packet:  git< 
> packet:  git< result=success\n
> 
I would really send the diagnostics/return codes before the content.

> If the result content is empty then the filter is expected to respond
> only with a flush packet and a "result=success" packet.
> 
> packet:  git< 
> packet:  git< result=success\n
> 

Which may be:

packet:  git< result=success\n
packet:  git< SMUDGED_CONTENT
packet:  git< 

or for an empty file:

packet:  git< result=success\n
packet:  git< SMUDGED_CONTENT
packet:  git< 


or in case of an error:
packet:  git< result=reject\n
# And this will not send the "" packet

Does this makes sense ?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: t0027 racy?

2016-08-08 Thread Torsten Bögershausen

On 2016-08-08 17.05, Johannes Schindelin wrote:
> Hi Torsten,
> 
> I remember that you did a ton of work on t0027. Now I see problems, and
> not only that the entire script now takes a whopping 4 minutes 20 seconds
> to run on my high-end Windows machine.
> 
> It appears that t0027 fails randomly for me, in seemingly random places.
> Sometimes all 1388 cases pass. Sometimes "29 - commit NNO files crlf=true
> attr=auto LF" fails. Sometimes it is "24 - commit NNO files crlf=false
> attr=auto LF". Sometimes it is "114 - commit NNO files crlf=false
> attr=auto LF", and sometimes "111 - commit NNO files attr=auto aeol=lf
> crlf=false CRLF_mix_LF".
> 
> When I run it with -i -v -x --tee, it passes every single time (taking
> over 5 minutes, just to make things worse)...
> 
> Any idea about any possible races?
Just to double-check: I assume that you have this
commit ded2444ad8ab8128cae2b91b8efa57ea2dd8c7a5
Author: Torsten Bögershausen <tbo...@web.de>
Date:   Mon Apr 25 18:56:27 2016 +0200

t0027: make commit_chk_wrnNNO() reliable
in your tree ?

Is there a special pattern ?
Did you
a) Update the machine ?
b) Update Git code base ?
or both ?
(Yes, I know that this may be stupid questions)

Is it only the NNO tests that fail ?
Did they ever pass ?
(I think I run them some time ago on a Virtual machine running Windows 7)

I see only "commit NNO files" in you report, they belong to
check_warning(), which is called around line 126 in t0027.

check_warning() does a grep on a file which has been re-directed from stderr.

How reproducible is the problem ?
If you add
exit 0
After the last "commit_chk_wrnNNO" line (line 418),
does
ls -l crlf*.err
give you any hint ?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: t0027 racy?

2016-08-08 Thread Torsten Bögershausen

On Mon, Aug 08, 2016 at 11:29:26AM -0400, Jeff King wrote:
> On Mon, Aug 08, 2016 at 05:05:07PM +0200, Johannes Schindelin wrote:
> 
> > I remember that you did a ton of work on t0027. Now I see problems, and
> > not only that the entire script now takes a whopping 4 minutes 20 seconds
> > to run on my high-end Windows machine.
> > 
> > It appears that t0027 fails randomly for me, in seemingly random places.
> > Sometimes all 1388 cases pass. Sometimes "29 - commit NNO files crlf=true
> > attr=auto LF" fails. Sometimes it is "24 - commit NNO files crlf=false
> > attr=auto LF". Sometimes it is "114 - commit NNO files crlf=false
> > attr=auto LF", and sometimes "111 - commit NNO files attr=auto aeol=lf
> > crlf=false CRLF_mix_LF".
> > 
> > When I run it with -i -v -x --tee, it passes every single time (taking
> > over 5 minutes, just to make things worse)...
> > 
> > Any idea about any possible races?
> 
> Try:
> 
>   https://github.com/peff/git/blob/meta/stress
> 
> which you can run as "sh /path/to/stress t0027" in the top-level of your
> git repository. I got failure within about 30 seconds on t0027 (though 5
> minutes? Yeesh. It runs in 9s on my laptop. I weep for you).
> 
> The verbose output is not very exciting, though:
> 
>   expecting success: 
>   check_warning "$lfwarn" ${pfx}_LF.err
> 
>   --- NNO_attr_auto_aeol_crlf_false_LF.err.expect 2016-08-08 
> 15:26:37.061701392 +
>   +++ NNO_attr_auto_aeol_crlf_false_LF.err.actual 2016-08-08 
> 15:26:37.061701392 +
>   @@ -1 +0,0 @@
>   -warning: LF will be replaced by CRLF
>   not ok 114 - commit NNO files crlf=false attr=auto LF

(I realized that t0027 is not yet self-explaining, I have it on my list)

NNO_attr_auto_aeol_crlf_false_LF means:

NNO:   "Not NOrmalized" A file had been commited with CRLF in the repo
attr_auto: .gitattributes has "* text=auto"
aeol_eol   .gitattributes has "eol=crlf"
crlf_false git config core.autocrlf = false
LF We commit a file with LF line endings.

This should happend:
- The file is commited "as is", with LF line endings.
- While commiting, git should print the warning
- "warning: LF will be replaced by CRLF" to stderr
- stderr is piped (redirected) into NNO_attr_auto_aeol_crlf_false_LF.err
- we grep for "will be replaced by" in xx.err and pipe it into xx.err.actual

The rest is test_cmp, this is what you see.
The warning is missing, but should be there:

The file has LF, and after commit and a new checkout these LF will
be convertet into CRLF.

So why isn't the warning there (but here on my oldish machines)
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 07/12] run-command: add clean_on_exit_handler

2016-08-05 Thread Torsten Bögershausen

On 2016-08-05 15.08, Lars Schneider wrote:

[]
> Yeah it could do that. But then the filter cannot do things like
> modifying the index after the fact... however, that might be considered
> nasty by the Git community anyways... I am thinking about dropping
> this patch in the next roll as it is not strictly necessary for my
> current use case.
(Thanks Peff for helping me out with the EOF explanation)

I would say that a filter is a filter, and should do nothing else than filtering
one file,
(or a stream).
When you want to modify the index, a hook may be your friend.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v4 11/12] convert: add filter..process option

2016-08-06 Thread Torsten Bögershausen

On 2016-08-06 00.06, Junio C Hamano wrote:
> Torsten Bögershausen <tbo...@web.de> writes:
>
>> On 2016-08-03 18.42, larsxschnei...@gmail.com wrote:
>>> The filter is expected to respond with the result content in zero
>>> or more pkt-line packets and a flush packet at the end. Finally, a
>>> "result=success" packet is expected if everything went well.
>>> 
>>> packet:  git< SMUDGED_CONTENT
>>> packet:  git< 
>>> packet:  git< result=success\n
>>> 
>> I would really send the diagnostics/return codes before the content.
> I smell the assumption "by the time the filter starts output, it
> must have finished everything and knows both size and the status".
>
> I'd prefer to have a protocol that allows us to do streaming I/O on
> both ends when possible, even if the initial version of the filters
> (and the code that sits on the Git side) hold everything in-core
> before starting to talk.
>
>>> If the result content is empty then the filter is expected to respond
>>> only with a flush packet and a "result=success" packet.
>> ...
>> Which may be:
>>
>> packet:  git< result=success\n
>> packet:  git< SMUDGED_CONTENT
>> packet:  git< 
>>
>> or for an empty file:
>>
>> packet:  git< result=success\n
>> packet:  git< SMUDGED_CONTENT
>> packet:  git< 
> The above two look the same to me.
Copy-paste error.
i see that we need a status after the complete transfer,
and after some thinking I would like to take back my comment.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: t0027 racy?

2016-08-09 Thread Torsten Bögershausen

On Tue, Aug 09, 2016 at 02:51:10AM -0400, Jeff King wrote:
> On Mon, Aug 08, 2016 at 08:32:24PM +0000, Torsten Bögershausen wrote:
> 
> > > The verbose output is not very exciting, though:
> > > 
> > >   expecting success: 
> > >   check_warning "$lfwarn" ${pfx}_LF.err
> > > 
> > >   --- NNO_attr_auto_aeol_crlf_false_LF.err.expect 2016-08-08 
> > > 15:26:37.061701392 +
> > >   +++ NNO_attr_auto_aeol_crlf_false_LF.err.actual 2016-08-08 
> > > 15:26:37.061701392 +
> > >   @@ -1 +0,0 @@
> > >   -warning: LF will be replaced by CRLF
> > >   not ok 114 - commit NNO files crlf=false attr=auto LF
> > [...]
> > The warning is missing, but should be there:
> > 
> > The file has LF, and after commit and a new checkout these LF will
> > be convertet into CRLF.
> > 
> > So why isn't the warning there (but here on my oldish machines)
> 
> To be clear, the warning _is_ there when I just run t0027 by itself, and
> the test passes.  It's only under heavy load that it isn't. So it's a
> race condition either in the test script or in git itself.
> 
> Usually race conditions like these are due to one of:
> 
>   - git dying from SIGPIPE before it has a chance to output the command.
> But I don't see any pipes being used in the test script.
> 
>   - index raciness causing us to avoid reading file content. For
> example, if you do:
> 
>   echo foo >bar
>   git add bar
> 
> Then _usually_ "bar" and the index will have the same mtime. And
> therefore subsequent commands that need to refresh the index will
> re-read the content of "bar", because they cannot tell from the stat
> information if we have the latest version of "bar" in the index or
> not (it could have been written after the index update, but in the
> same second).
> 
> But on a slow or heavily loaded system (or if you simply get unlucky
> in crossing the boundary to a new second), they'll have different
> mtimes. And therefore git knows it can skip reading the content from
> the filesystem.
> 
> So if your test relies on git actually re-converting the file
> content, it would sometimes randomly fail.
> 
> The second one seems plausible, given the history of issues with
> changing CRLF settings for an existing checkout. I'm not sure if it
> would be feasible to reset the index completely before each tested
> command, but that would probably solve it.
The content of the file has been changed (we appended the letter 'Z' to it,
so even if mtime is the same, st.st_size should differ.
And it seems as if the commit is triggered, see below.
> 

(I got the stress script working; no I can reproduce it nicely)

Is it possible to vote for a 3rd option ?
I added some more warnings, to have always some output in stderr. 

This is the good case, this test case passed:
cat NNO_attr_auto_aeol_crlf_true_LF.err 

warning: check_safe_crlf in NNO_attr_auto_aeol_crlf_true_LF.txt 2
warning: LF was here in NNO_attr_auto_aeol_crlf_true_LF.txt..
warning: LF will be replaced by CRLF in NNO_attr_auto_aeol_crlf_true_LF.txt.
The file will have its original line endings in your working directory.
warning: check_safe_crlf in NNO_attr__aeol_crlf_true_CRLF.txt 0
warning: check_safe_crlf in NNO_attr__aeol_crlf_true_CRLF_mix_LF.txt 0
warning: check_safe_crlf in NNO_attr__aeol_crlf_true_LF.txt 0
warning: check_safe_crlf in NNO_attr__aeol_lf_true_CRLF.txt 0
warning: check_safe_crlf in NNO_attr__aeol_lf_true_CRLF_mix_LF.txt 0
warning: check_safe_crlf in NNO_attr__aeol_lf_true_LF.txt 0
warning: check_safe_crlf in NNO_attr_auto_aeol_lf_true_LF.txt 0
[master 2decee0] commit_NNO_attr_auto_aeol_crlf_true_LF.txt
 Author: A U Thor <aut...@example.com>
 1 file changed, 2 insertions(+), 2 deletions(-)
---
And this one failed, the same code base, but a different file:
cat commit_NNO_attr_auto_aeol_crlf_input_LF.

[master ce2045a] commit_NNO_attr_auto_aeol_crlf_input_LF.txt
 Author: A U Thor <aut...@example.com>
 1 file changed, 2 insertions(+), 2 deletions(-)

---
Both had been generated with a patched convert.c:
git diff convert.c
index 67d69b5..ae64a58 100644
--- a/convert.c
+++ b/convert.c
@@ -191,6 +191,7 @@ static enum eol output_eol(enum crlf_action crlf_action)
 static void check_safe_crlf(const char *path, enum crlf_action crlf_action,
 struct text_stat *stats, enum safe_crlf checksafe)
 {
+   warning("check_safe_crlf in %s %d", path, (int)checksafe);
if (!checksafe)
return;
 
@@ -210,6 +211,7 @@ static void check_safe_crlf(const char *path, enum 
crlf_action crlf_action,
 * C

Re: t0027 racy?

2016-08-09 Thread Torsten Bögershausen

On Tue, Aug 09, 2016 at 07:49:38AM -0400, Jeff King wrote:
> On Tue, Aug 09, 2016 at 11:33:37AM +0000, Torsten Bögershausen wrote:
> 
> > > The second one seems plausible, given the history of issues with
> > > changing CRLF settings for an existing checkout. I'm not sure if it
> > > would be feasible to reset the index completely before each tested
> > > command, but that would probably solve it.
> > The content of the file has been changed (we appended the letter 'Z' to it,
> > so even if mtime is the same, st.st_size should differ.
> > And it seems as if the commit is triggered, see below.
> 
> I don't think I made myself clear. It's not a question of whether there
> is something to commit. It's that when git asks the index "what is the
> sha1 of the content at this path?", the index may be able to answer
> directly (the file is up-to-date, so we return the index value), or it
> may have to go to the filesystem and read the file content. It is this
> latter which triggers convert_to_git(), which is what generates the
> message in question.
> 
> For a more stripped-down example, try:
> 
>   git add foo
>   git commit -m msg
> 
> versus:
> 
>   git add foo
>   sleep 1
>   git commit -m msg
> 
> In the latter case, we should not generally need convert_to_git() in the
> "commit" step. It was already done by "git add", and we reuse the cached
> result.
> 
> Whereas in the first one, we may run into the racy-index problem and
> have to re-read the file to be on the safe side.
> 
> -Peff

Thanks for the explanation, so there are 2 chances for a race.

I assume that the suggested "touch" will fix race#2 in most cases.
In my understanding, the change of the file size will be more reliable:
 

diff --git a/t/t0027-auto-crlf.sh b/t/t0027-auto-crlf.sh
index 2860d2d..9933a9b 100755
--- a/t/t0027-auto-crlf.sh
+++ b/t/t0027-auto-crlf.sh
@@ -120,6 +120,7 @@ commit_chk_wrnNNO () {
cp $f $fname &&
printf Z >>"$fname" &&
git -c core.autocrlf=$crlf add $fname 2>/dev/null &&
+   printf Z >>"$fname" &&
git -c core.autocrlf=$crlf commit -m "commit_$fname" $fname 
>"${pfx}_$f.err" 2>&1
done
---
Does anybody agree ?
And, by the way, the convert warning may be issued twice, once in
"git add" and once in "git commit".
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: What's cooking in git.git (Aug 2016, #02; Thu, 4)

2016-08-08 Thread Torsten Bögershausen

On 2016-08-05 01.32, Junio C Hamano wrote:
> Mike Hommey  writes:
[]

>> What kind of support are you expecting?
> 
> The only rationale I recall you justifying this series was that this
> makes the resulting code easier to read, but I do not recall other
> people agreeing with you, and I do not particularly see the
> resulting code easier to follow.
> 
If that helps:
I can offer to make a code review,
and come back at the end of the week or so.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 03/10] pkt-line: add packet_flush_gentle()

2016-08-02 Thread Torsten Bögershausen

On Sun, Jul 31, 2016 at 11:45:08PM +0200, Lars Schneider wrote:
> 
> > On 31 Jul 2016, at 22:36, Torstem Bögershausen  wrote:
> > 
> > 
> > 
> >> Am 29.07.2016 um 20:37 schrieb larsxschnei...@gmail.com:
> >> 
> >> From: Lars Schneider 
> >> 
> >> packet_flush() would die in case of a write error even though for some 
> >> callers
> >> an error would be acceptable.
> > What happens if there is a write error ?
> > Basically the protocol is out of synch.
> > Lenght information is mixed up with payload, or the other way
> > around.
> > It may be, that the consequences of a write error are acceptable,
> > because a filter is allowed to fail.
> > What is not acceptable is a "broken" protocol.
> > The consequence schould be to close the fd and tear down all
> > resources. connected to it.
> > In our case to terminate the external filter daemon in some way,
> > and to never use this instance again.
> 
> Correct! That is exactly what is happening in kill_protocol2_filter()
> here:

Wait a second.
Is kill the same as shutdown ?
I would expect that
The process terminates itself as soon as it detects EOF.
As there is nothing more read.

Then the next question: The combination of kill & protocol in kill_protocol(),
what does it mean ?
Is it more like a graceful shutdown_protocol() ?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: t0027 racy?

2016-08-09 Thread Torsten Bögershausen

>   git -c core.autocrlf=$crlf add $fname >"${pfx}_$f.err" 2>&1
> 
> would make more sense. We _know_ that we have to do convert_to_git() in
> that step because the content is changed. And then you can ignore the
> warnings from "git commit" (which are racy), or you can simply commit as
> a whole later, as some other loops do.
> 
> But like Dscho, I do not actually understand what this test is checking.
> The function is called commit_chk_wrnNNO(), so perhaps you really are
> interested in what "commit" has to say. But IMHO that is not an
> interesting test. We know that if it has to read the content from disk,
> it will call convert_to_git(), which is the exact same code path used by
> "git add". So I do not understand what it is accomplishing to make a
> commit at all here.

It seems as if the test has been written without understanding the raciness.
It should commit files with different line endings on top of
a file with mixed line endings.
The warning should be checked (and here "git add" can be used,
or the file can be commited directly).
I'm not sure why the test ended up in doing both.

However, doing it the right way triggers a bug in convert.c,
(some warnings are missing, so I need some days to come up
with a proper patch)

Thanks for reporting & digging.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge

2016-06-30 Thread Torsten Bögershausen

On 29.06.16 18:14, Junio C Hamano wrote:
> tbo...@web.de writes:
> 
>> From: Torsten Bögershausen <tbo...@web.de>
>>
>> The following didn't work as expected:
> 
> Sorry for being slow (not in response but in understanding), but
> let's examine the expectation first.

Thanks for the patience.
There is one detail missing in my description:
The check if a file in the working tree is clean:

static int ce_compare_data(const struct cache_entry *ce, struct stat *st)
{
int match = -1;
int fd = open(ce->name, O_RDONLY);

if (fd >= 0) {
unsigned char sha1[20];
unsigned flags = HASH_USE_SHA_NOT_PATH;
--
Remove the "new feature":

git diff  read-cache.c
diff --git a/read-cache.c b/read-cache.c
index dd98b36..1f951ea 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -170,7 +170,7 @@ static int ce_compare_data(const struct cache_entry *ce, 
struct stat *st)
 
if (fd >= 0) {
unsigned char sha1[20];
-   unsigned flags = HASH_USE_SHA_NOT_PATH;
+   unsigned flags = 0;//HASH_USE_SHA_NOT_PATH;
memcpy(sha1, ce->sha1, sizeof(sha1));
if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, flags))
match = hashcmp(sha1, ce->sha1);

The the problem can be re-produced, and debugged with help of t6038:

not ok 5 - Merge addition of text=auto eol=CRLF


> 
>>  - In a middle of a merge
>>  - merge.renormalize is true,
> 
> gitattributes(5) tells us that this would make a "virtual check-out
> and check-in of all stages", i.e. each of the three blob objects
> involved is run through convert_to_working_tree(), and the result is
> run through convert_to_git().  Now, what are these two operations
> told to do?
> 
>>  - .gitattributes = "* text=auto"
>>  - core.eol = crlf
> 
> git-config(1) tells us that a text file will be checked out with
> CRLF line endings, so convert_to_working_tree() would work that
> way.  Even though core.eol entry in that manual page does not tell
> us anything, presumably convert_to_git() is expected to turn it back
> to LF line endings.
> 
Yes, the git-config(1) may need improvements, but that can go as a later path.

>> Merge a blob with CRLF "first line\r\nsame line\r\n" and a blob
>> with LF "first line\nsame line\n".
I may have written:
Merge a blob with CRLF "first line\r\nsame line\r\n" 
(which is checked out as
"first line\r\nsame line\r\n" 
in the working tree, and therefore clean)
and a blob with LF "first line\nsame line\n".

> 
> The former, when renormalized, should (we are discussing the
> expectation, not necessarily what is done by today's code) be
> processed like this:
> 
>  * Pretend as if blob "first line\r\nsame line\r\n" is in the stage
>#0 of the index at the path;
>  * Pretend as if the user said "git checkout" that path;
>  * Pretend as if the user then said "git add" that path.
> 
> The checkout process hopefully does not blindly turn LF into CRLF,
> making it "first line \r\r\nsame line\r\rn".  Instead the (virtual)
> working tree file will have "first line\r\nsame line\r\n".
Yes
> 
> Then "git add" should turn that CRLF into LF, which would give us
> "first line\nsame line\n", but because the "safer autocrlf" rule
> prevents it from happening.  The (real) index already has CR in it
> in the blob in stage #2, so the check-in process would (this is not
> describing the ideal, but what is done by today's code) disable
> CRLF->LF conversion.
> 
> Is that the problem you are trying to solve?
Not sure, if that problem isn't already solved.

> 
> If that is the case, I do not see how "don't use stage #2 of the
> real index; use the blob being renormalized instead" would help.
> The blob being renormalized may have CR in it, triggering that
> "safer autocrlf" rule and cause you the same trouble, no?
> 
> To me, it appears that if you consider that the "safer autocrlf" is
> a sensible thing, you _must_ expect that the renormalization would
> not work at all, in the scenario you presented.  Also,
> 
>> The expected result of the merge is "first line\nsame line\n".
> 
> if you expect this, to me, it appears that you need to reject the
> "safer autocrlf", at least during renormalization, as a sensible
> thing.  And if you _are_ to disable the "safer autocrlf" thing, then
> it does not matter what is currently in the index--the conversion
> can happen solely based on the data being converted and the
> configuration and attribute settings.
> 
>

Re: [PATCH v3 3/3] correct ce_compare_data() in a middle of a merge

2016-07-02 Thread Torsten Bögershausen


On 2016-07-02 00.11, Junio C Hamano wrote:
[snip]

diff --git a/read-cache.c b/read-cache.c
index a3ef967..c109b6d 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -163,7 +163,9 @@ static int ce_compare_data(const struct cache_entry *ce, 
struct stat *st)

if (fd >= 0) {
unsigned char sha1[20];
-   if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, 0))
+   unsigned flags = HASH_USE_SHA_NOT_PATH;
+   memcpy(sha1, ce->sha1, sizeof(sha1));
+   if (!index_fd(sha1, fd, st, OBJ_BLOB, ce->name, flags))
match = hashcmp(sha1, ce->sha1);
/* index_fd() closed the file descriptor already */
}

I do not quite see how "Don't use ce->sha1, but use sha1 given to
you" flag affects this call, when sha1 and ce->sha1 are the same due
to memcpy() just before the call?



Lets step back a second. and try to explain whats going on.

Do a merge (and renormalize the files).

At some point inside the merge, Git calls read-cache.c/ce_compare_data(),
  to check if the path named "file" is clean.
  According to the tree information, the path "file" has the sha1 99b633.
  #Note:
  #sha1 99b633 is "first line\nsame line\n"

ce_compare_data() starts the work:
OK, lets run index_fd(...,ce->name,...)
# index_fd will simulate a "git add"  and return the sha1 (via the sha1 pointer)
# after the hashing.

# ce_compare_data() will later compare ce->sha1 with the result stored in
# the local sha1. That's why a sha1 is in the parameter list.
# To return the resulting hash:

ce_compare_data() calls index_fd(sha1, ..., ce->name, ...)

#Down in index_fd():

sha1_file.c/index_fd() calls index_core() (after consulting the attributes)
index_core() reads the file from the working tree into memory using
read_in_full(), and after that index_mem() calls hash_sha1_file(buf)
to calculate the sha1.

Before the hashing is done, index_mem() runs
convert_to_git(path, buf, size, , SAFE_CRLF_FALSE)
to convert  "blobs to git internal format".


Here, convert_to_git() consults the .gitattributes (again) to find out that
crlf_action is CRLF_AUTO in our case.
The "new safer autocrlf handling" says that if a blob as any CR, don't convert
CRLFs at "git add".

convert.c/crlf_to_git() starts to do it's job:
- look at buf (It has CRLF, conversion may be needed)
- consult blob_has_cr()
  # Note: Before this patch, has_cr_in_index(path)) was used

#Again, before this patch,
has_cr_in_index(path) calls read_blob_data_from_cache(path, ) to read the
blob into memory.

Now read_blob_data_from_cache() is a macro, and we end up in
read_blob_data_from_index(_index, (path), (sz))

read-cache.c/read_blob_data_from_index() starts its work:
pos = index_name_pos(istate, path, len);
if (pos < 0) {
/*
 * We might be in the middle of a merge, in which
 * case we would read stage #2 (ours).
 */

# And here, and this is important to notice, "ours" is sha1 ad55e2,
# which corresponds to "first line\r\nsame line\r\n"
# On our way in the callstack the information what to check had been lost:
# blob 99b633 and nothing else.
# read_blob_data_from_index() doesn't know the sha, but makes a guess,
# derived from path.
# The guess works OK for most callers: to take "ours" in a middle
# of a merge, but not here.

#Back to crlf_to_git():
The (wrong) blob has CRs in it, so crlf_to_git() decides not to convert CRLFs,
(and this is wrong).

# And now we go all the way back in the call chain:

The content in the worktree which is "first line\r\nsame line\r\n" is hashed:
hash_sha1_file(buf, size, typename(type), sha1);
#and the result is stored in the return pointer "sha1": ad55e2

#Back to ce_compare_data():
match = hashcmp(sha1, ce->sha1);

#And here sha1=ad55e2 != ce->sha199b633

The whole merge is aborted:

   error: addinfo_cache failed for path 'file'
file: unmerged (cbd69ec7cd12dd0989e853923867d94c8519aa52)
file: unmerged (ad55e240aeb42e0d9a0e18d6d8b02dd82ee3e527)
file: unmerged (99b633103c15c20cebebf821133ab526b0ff90b2)
fatal: git write-tree failed to write a tree
#Side note: cbd69ec is another file in the working tree.

###

With this patch,
ce_compare_data() feeds the sha1 99b633 into
blob_has_cr(const unsigned char *index_blob_sha1).

crlf_to_git() says:
"first line\r\nsame line\r\n" in the worktree needs to be converted
into
"first line\nsame line\n" as the "new" blob.

The new (converted) blob "first line\nsame line\n"
is hashed into 99b633, so ce_compare_data() says
fine, converting "first line\r\nsame line\r\n" will give sha1 99b633,
lets continue with the merge, and do the normalization.

# Note1:
# The renormalization is outside what this patch fixes.
# But I didn't manage to construct a test case, which triggered
# "fatal: git write-tree failed to write a tree" without using
# merge.renormalize

# Note2:
# ce_compare_data() has (if I understand things right)
#

Re: [ANNOUNCE] Git v2.10.0-rc0

2016-08-15 Thread Torsten Bögershausen

On 15.08.16 00:47, Junio C Hamano wrote:
> Torsten Bögershausen (1):
>   convert: unify the "auto" handling of CRLF

Should we mention this change in the release notes?

The handling of "*.text = auto" was changed, and now

$ echo "* text=auto eol=crlf" >.gitattributes
has the same effect as
$ git config core.autocrlf true


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [ANNOUNCE] Git v2.10.0-rc0

2016-08-17 Thread Torsten Bögershausen

On 17.08.16 19:38, Junio C Hamano wrote:
> Junio C Hamano <gits...@pobox.com> writes:
> 
>> Torsten Bögershausen <tbo...@web.de> writes:
>>
>>> Is this `text=auto` correct ?
>>
>> Thanks for spotting a typo.  Definitely.
>>
>>> I think it should be
>>>
>>>used to have the same effect as
>>>$ echo "* text eol=crlf" >.gitattributes
>>
>> Thanks.
>>
>>> # In other words, the `auto` was ignored, as explained here:
>>> +   $ git config core.eol crlf
>>> +   i.e. declaring all files are text; the combination now is
>>> +   equivalent to doing
>>> +   $ git config core.autocrlf true
>>> +
> 
> Just to make sure I got you right, how does this look?
> 
>  Documentation/RelNotes/2.10.0.txt | 8 
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/RelNotes/2.10.0.txt 
> b/Documentation/RelNotes/2.10.0.txt
> index 179e575..ccf5f64 100644
> --- a/Documentation/RelNotes/2.10.0.txt
> +++ b/Documentation/RelNotes/2.10.0.txt
> @@ -235,13 +235,13 @@ Performance, Internal Implementation, Development 
> Support etc.
>   * The API to iterate over all the refs (i.e. for_each_ref(), etc.)
> has been revamped.
>  
> - * The handling of the "text = auto" attribute has been updated.
> + * The handling of the "text=auto" attribute has been corrected.
> $ echo "* text=auto eol=crlf" >.gitattributes
> used to have the same effect as
> -   $ echo "* text=auto eol=crlf" >.gitattributes
> +   $ echo "* text eol=crlf" >.gitattributes
# Now we describe/define  the eol at checkouts twice.
# I think we can drop this line:
> $ git config core.eol crlf
# The rest looks good:
> -   i.e. declaring all files are text; the combination now is
> -   equivalent to doing
> +   i.e. declaring all files are text (ignoring "auto").  The
> +   combination has been fixed to be equivalent to doing
> $ git config core.autocrlf true
>  
>   * A few tests that specifically target "git rebase -i" have been
> 

# Just to be sure: the whole paragraph may look like this:

* The handling of the "text=auto" attribute has been corrected.
  $ echo "* text=auto eol=crlf" >.gitattributes
  used to have the same effect as
  $ echo "* text eol=crlf" >.gitattributes
  i.e. declaring all files are text (ignoring "auto").  The
  combination has been fixed to be equivalent to doing
  $ git config core.autocrlf true
















--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/4] cat-file: introduce the --filters option

2016-08-18 Thread Torsten Bögershausen

On Thu, Aug 18, 2016 at 02:46:17PM +0200, Johannes Schindelin wrote:
> As suggested by its name, the --filters option applies the filters

[]
> diff --git a/t/t8010-cat-file-filters.sh b/t/t8010-cat-file-filters.sh
Does it make sense to integrate tests into t1006-cat-file ?
> new file mode 100755
> index 000..e466634
> --- /dev/null
> +++ b/t/t8010-cat-file-filters.sh
> @@ -0,0 +1,34 @@
> +#!/bin/sh
> +
> +test_description='git cat-file filters support'
> +. ./test-lib.sh
> +
> +test_expect_success 'setup ' '
> + echo "*.txt eol=crlf diff=txt" >.gitattributes &&
> + echo "hello" | append_cr >world.txt &&
> + git add .gitattributes world.txt &&
> + test_tick &&
> + git commit -m "Initial commit"
> +'
> +
> +has_cr () {
> + tr '\015' Q <"$1" | grep Q >/dev/null
> +}
> +
> +test_expect_success 'no filters with `git show`' '
> + git show HEAD:world.txt >actual &&
I would prefer to have something using
  cat >expect <<-\EOF &&
  xxx
  test_cmp expect actual
to make it easier to debug in case of a failure ?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/4] cat-file: introduce the --filters option

2016-08-19 Thread Torsten Bögershausen

[]
On Thu, Aug 18, 2016 at 02:46:17PM +0200, Johannes Schindelin wrote:

> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index 071029b..7d48735 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -9,15 +9,15 @@ git-cat-file - Provide content or type and size information 
> for repository objec
>  SYNOPSIS
>  
>  [verse]
> -'git cat-file' (-t [--allow-unknown-type]| -s [--allow-unknown-type]| -e | 
> -p |  | --textconv ) 
> +'git cat-file' (-t [--allow-unknown-type]| -s [--allow-unknown-type]| -e | 
> -p |  | --textconv | --filters ) 
>  'git cat-file' (--batch | --batch-check) [--follow-symlinks]
>  
>  DESCRIPTION
>  ---
>  In its first form, the command provides the content or the type of an object 
> in
>  the repository. The type is required unless `-t` or `-p` is used to find the
> -object type, or `-s` is used to find the object size, or `--textconv` is used
> -(which implies type "blob").
> +object type, or `-s` is used to find the object size, or `--textconv` or
> +`--filters` is used (which imply type "blob").
>  
>  In the second form, a list of objects (separated by linefeeds) is provided on
>  stdin, and the SHA-1, type, and size of each object is printed on stdout.
> @@ -58,6 +58,12 @@ OPTIONS
>   order to apply the filter to the content recorded in the index at
>   .
>  
> +--filters::
> + Show the content as transformed by the filters configured in
Minor comment:
s/transformed/converted/ ?

Does it make sense to be more specific here:
The order of conversion is
- ident
- CRLF
- smudge
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-18 Thread Torsten Bögershausen

On Thu, Aug 18, 2016 at 02:46:28PM +0200, Johannes Schindelin wrote:
> With this patch, --batch can be combined with --textconv or --filters.
> For this to work, the input needs to have the form
> 
>   
> 
> so that the filters can be chosen appropriately.
> 
> Signed-off-by: Johannes Schindelin 
> ---
>  Documentation/git-cat-file.txt | 18 +++-
>  builtin/cat-file.c | 49 
> +-
>  t/t8010-cat-file-filters.sh| 10 +
>  3 files changed, 67 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index 59a3c37..1f4d954 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -10,7 +10,7 @@ SYNOPSIS
>  
>  [verse]
>  'git cat-file' (-t [--allow-unknown-type]| -s [--allow-unknown-type]| -e | 
> -p |  | --textconv | --filters ) [--use-path=] 
> -'git cat-file' (--batch | --batch-check) [--follow-symlinks]
> +'git cat-file' (--batch | --batch-check) [ --textconv | --filters ] 
> [--follow-symlinks]
>  
>  DESCRIPTION
>  ---
> @@ -20,7 +20,11 @@ object type, or `-s` is used to find the object size, or 
> `--textconv` or
>  `--filters` is used (which imply type "blob").
>  
>  In the second form, a list of objects (separated by linefeeds) is provided on
> -stdin, and the SHA-1, type, and size of each object is printed on stdout.
> +stdin, and the SHA-1, type, and size of each object is printed on stdout. The
> +output format can be overridden using the optional `` argument. If
> +either `--textconv` or `--filters` was specified, the input is expected to
> +list the object names followed by the path name, separated by a single white
> +space, so that the appropriate drivers can be determined.
>  
>  OPTIONS
>  ---
> @@ -72,13 +76,17 @@ OPTIONS
>  --batch::
>  --batch=::
>   Print object information and contents for each object provided
> - on stdin.  May not be combined with any other options or arguments.
> - See the section `BATCH OUTPUT` below for details.
> + on stdin.  May not be combined with any other options or arguments
> + except `--textconv` or `--filters`, in which case the input lines
> + also need to specify the path, separated by white space.  See the
> + section `BATCH OUTPUT` below for details.
>  
>  --batch-check::
>  --batch-check=::
>   Print object information for each object provided on stdin.  May
> - not be combined with any other options or arguments.  See the
> + not be combined with any other options or arguments except
> + `--textconv` or `--filters`, in which case the input lines also
> + need to specify the path, separated by white space.  See the
>   section `BATCH OUTPUT` below for details.
>  
>  --batch-all-objects::
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 5ff58b3..5f91cf4 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -17,6 +17,7 @@ struct batch_options {
>   int print_contents;
>   int buffer_output;
>   int all_objects;
> + int cmdmode; /* may be 'w' or 'c' for --filters or --textconv */
How do I read 'w' and 'c' ?
wilter and cextconv ? Does it make sense to use an enum here ?
Or a #define ?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git diff --quiet exits with 1 on clean tree with CRLF conversions

2017-02-28 Thread Torsten Bögershausen

On 2017-02-27 21:17, Junio C Hamano wrote:

> Torsten, you've been quite active in fixing various glitches around
> the EOL conversion in the latter half of last year.  Have any
> thoughts to share on this topic?
> 
> Thanks.

Sorry for the delay, being too busy with other things.
I followed the discussion, but didn't have good things to contribute.
I am not an expert in diff.c, but there seems to be a bug, thanks everybody
for digging.

Back to business:

My understanding is that git diff --quiet should be quiet, when
git add will not do anything (but the file is "touched".
The touched means that Git will detect e.g a new mtime or inode
or file size when doing lstat().

mtime is tricky, inodes are problematic under Windows.
What is easy to change is the file length.
I don't thing that we need a test file with LF, nor do we need
the sleep, touch or anything.
Would the the following work ?
(This is copy-paste, so the TABs may be corrupted)

#!/bin/sh
#
# Copyright (c) 2017 Mike Crowe
#
# These tests ensure that files changing line endings in the presence
# of .gitattributes to indicate that line endings should be ignored
# don't cause 'git diff' or 'git diff --quiet' to think that they have
# been changed.

test_description='git diff with files that require CRLF conversion'

. ./test-lib.sh

test_expect_success setup '
echo "* text=auto" > .gitattributes &&
printf "Hello\r\nWorld\r\n" >crlf.txt &&
git add .gitattributes crlf.txt lf.txt &&
git commit -m "initial"
'

test_expect_success 'quiet diff works on file with line-ending change that has
no effect on repository' '
printf "Hello\r\nWorld\n" >crlf.txt &&
git status &&
git diff --quiet
'

test_done

Re: git-svn bridge and line endings

2016-08-23 Thread Torsten Bögershausen

On Mon, Aug 22, 2016 at 05:04:47PM -0700, Lucian Smith wrote:
> I'm attempting to use the git-svn bridge, and am having problems with
> line endings on Windows.
> 
> The setup is that we have a git repository on github, and I've checked
> out a branch on my Windows machine using Tortoise svn.  I make
> changes, commit them, and the branch is updated.  In general, this
> works fine.

Just to make sure:
The repo is in git format.
Is it a public repo ?
Or could you make a piblic demo repo ?
Do I understand it right: Tortoise SVN talks directly to the Git server ?
Isn't Tortoise SVN a client to talk to SVN server?
What goes over the wire to the remote Git server, git or SVN ?

To my understanding, "git svn" can use Git locally, and talk to an SVN server.
What do I miss ?

> 
> If this was just SVN, I could set the 'eol-style' for files to
> 'native' to let it know to expect Windows/linux/mac line endings for
> particular files.  This seems to be handled in git by using the
> '.gitattributes' file instead.  Unfortunately, the git/svn bridge
> doesn't seem to be translate the information in the .gitattributes
> file to appropriate eol-style settings in SVN.  Checking out a file
> using SVN on Windows leaves me with a file without CRLF's, and if I
> check in a CRLF file, that's the way it goes into the repository.
> Differences in CRLF alone show up as 'real' differences that can be
> checked in, and, if this happens, this causes problems with other
> people's repositories.
> 
> Am I doing something wrong; is there another way to handle this; or
> can I file this as a bug report/feature request?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: core.autocrlf, was Re: [git-for-windows] Re: [ANNOUNCE] Git for Windows 2.9.3

2016-08-25 Thread Torsten Bögershausen




I was not talking about the cost of correcting mistakes. Running --filters
is potentially very costly. Just so you understand what I am talking
about: I have a report that says that checking out a sizeable worktree
with core.autocrlf=true is 58% slower than with core.autocrlf=false. That
is horrible. []


Is this a public repo ?

Or is there a benchmark repo somewhere ?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: How to simulate a real checkout to test a new smudge filter?

2016-09-06 Thread Torsten Bögershausen

On 06.09.16 19:47, john smith wrote:
> I am looking for a way to force smudge filter to run by simulating a
> real life checkout. Let's say I just created a new branch and did not
> modify any files but want to test my new smudge filter. According to
> some answers such as
> https://stackoverflow.com/questions/22909620/git-smudge-clean-filter-between-branches
> and
> https://stackoverflow.com/questions/21652242/git-re-checkout-files-after-creating-smudge-filter
> it should be possible by running:
>
> git checkout HEAD --
>
> but in doesn't work with git 2.9.0. Method suggested in accepted
> answer here
> https://stackoverflow.com/questions/17223527/how-do-i-force-git-to-checkout-the-master-branch-and-remove-carriage-returns-aft
> works but I don't like because it seems fragile. Is there a safe way
> to do what I want to do in Git still today?
>
It depends what you mean with "safe way".

git checkout, git checkout -f or other combinations will only

overwrite/rewrite the files in the working tree, if, and only if,

git comes to the conclusion that "git add" will do something,

like replace a blob for a file in the index.

(And by running "rm .git/index git will evaluate the "clean" filters,

and the CRLF->LF conversion).


If you want to test a smudge filter, simply remove the file:

mv file /tmp && git checkout file

Re: [PATCH v6 13/13] read-cache: make sure file handles are not inherited by child processes

2016-08-30 Thread Torsten Bögershausen


> 
> diff --git a/sha1_file.c b/sha1_file.c
> index d5e1121..759991e 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1485,7 +1485,7 @@ int check_sha1_signature(const unsigned char *sha1, 
> void *map,
>  
>  int git_open_noatime(const char *name)

Hm, should the function then be renamed into

git_open_noatime_cloexec()

>  {
> - static int sha1_file_open_flag = O_NOATIME;
> + static int sha1_file_open_flag = O_NOATIME | O_CLOEXEC;
>  
>   for (;;) {
>   int fd;
> 
> 
>

Re: [PATCH v6 12/13] convert: add filter..process option

2016-08-30 Thread Torsten Bögershausen

On Tue, Aug 30, 2016 at 03:23:10PM -0700, Junio C Hamano wrote:
> Lars Schneider  writes:
> 
> > On 30 Aug 2016, at 20:59, Junio C Hamano  wrote:
> >> 
> >>> "abort" could be ambiguous because it could be read as "abort only
> >>> this file". "abort-all" would work, though. Would you prefer to see
> >>> "error" replaced by "abort" and "error-all" by "abort-all"?
> >> 
> >> No.
> >> 
> >> I was primarily reacting to "-all" part, so anything that ends with
> >> "-all" is equally ugly from my point of view and not an improvement.
> >> 
> >> As I said, "error-all" as long as other reviewers are happy with is
> >> OK by me, too.
> >
> > Now, I see your point. How about "error-and-stop" or "error-stop"
> > instead of "error-all"?
> 
> Not really.  I was primarily reacting to having "error" and
> "error-all", that share the same prefix.  Changing "-all" suffix to
> "-stop" does not make all that difference to me.  But in any case,
> as long as other reviewers are happy with it, it is OK by me, too.
> 
> > Good argument. However, my intention here was to mimic the v1 filter
> > mechanism.
> 
> I am not making any argument and in no way against the chosen
> behaviour.  That "intention here" that was revealed after two
> iterations is something we would want to see as the reasoning
> behind the choice of the final behaviour recorded somewhere,
> and now I drew it out of you, I achieved what I set out to do
> initially ;-)
> 
> > I am not sure I understand your last sentence. Just to be clear:
> > Would you prefer it, if Git would just close the pipe to the filter process
> > on Git exit and leave the filter running?
> 
> As a potential filter writer, I would probably feel it the easiest
> to handle if there is no signal involved.  My job would be to read
> from the command pipe, react to it, and when the command pipe gives
> me EOF, I am expected to exit myself.  That would be simple enough.
> 
> >> I meant it as primarily an example people can learn from when they
> >> want to write their own.
> >
> > I think `t/t0021/rot13-filter.pl` (part of this patch) serves this purpose 
> > already.
> 
> I would expect them to peek at contrib/, but do you seriously expect
> people to comb through t/ directory?

How about a filter written in C, and placed in ./t/helper/ ?
At least I feel that a filter in C-language could be a starting point
for others which prefer, depending on the task the filter is going to do,
to use shell scripts, perl, python or any other high-level language.

A test case, where data can not be filtered, would be a minimum.
As Jakub pointed out, you can use iconv with good and bad data.

Re: [PATCH v7 10/10] convert: add filter..process option

2016-09-13 Thread Torsten Bögershausen

On 12.09.16 11:49, Lars Schneider wrote:

>> How do we send pathnames the have '\n' ?
>> Not really recommended, but allowed.
>> And here I am a little bit lost, is each of the lines packed into
>> a pkt-line ?
>> command=smudge is packet as pkt-line and pathname= is packed into
>> another one ? (The we don't need the '\n' at all)
> 
> Every line is a dedicated packet. That's why '\n' in a path name would
> not be a problem as the receiver is expected to read the entire packet
> when parsing the value (and the receiver knows the packet length, too).
> 
> The '\n' at the end is required by the pkt-line format:
> "A non-binary line SHOULD BE terminated by an LF..."
> (see protocol-common.txt)

That is only the half part of the story:
A non-binary line SHOULD BE terminated by an LF, which if present
MUST be included in the total length. Receivers MUST treat pkt-lines
with non-binary data the same whether or not they contain the trailing
LF (stripping the LF if present, and not complaining when it is
missing).

How do we treat pathnames ?
They can have each byte value except '\0'.
What should a receiver do, which reads a string like "ABC\n\n" ?
Is it "ABC\n" or "ABC\n\n" ?

I would really consider to treat pathnames as binary, and not add a trailing 
'\n',
are there other opinions ?

Re: [PATCH v7 10/10] convert: add filter..process option

2016-09-10 Thread Torsten Bögershausen

On Thu, Sep 08, 2016 at 08:21:32PM +0200, larsxschnei...@gmail.com wrote:
[]
> +packet:  git> git-filter-client
> +packet:  git> version=2
> +packet:  git> version=42
> +packet:  git> 
> +packet:  git< git-filter-server
> +packet:  git< version=2
> +packet:  git> clean=true
> +packet:  git> smudge=true
> +packet:  git> not-yet-invented=true
> +packet:  git> 
> +packet:  git< clean=true
> +packet:  git< smudge=true
> +packet:  git< 

It's probalby only me who has difficulties to distinguish
'>' from '<'.

packet:  git> git-filter-client
packet:  git> version=2
packet:  git> version=42
packet:  git> 
packet:   filter> git-filter-server
packet:   filter> version=2

(Otherwise the dialoge description is nice)

> +
> +Supported filter capabilities in version 2 are "clean" and
> +"smudge".
> +
> +Afterwards Git sends a list of "key=value" pairs terminated with
> +a flush packet. The list will contain at least the filter command
> +(based on the supported capabilities) and the pathname of the file
> +to filter relative to the repository root. Right after these packets
> +Git sends the content split in zero or more pkt-line packets and a
> +flush packet to terminate content.
> +
> +packet:  git> command=smudge\n
> +packet:  git> pathname=path/testfile.dat\n

How do we send pathnames the have '\n' ?
Not really recommended, but allowed.
And here I am a little bit lost, is each of the lines packed into
a pkt-line ?
command=smudge is packet as pkt-line and pathname= is packed into
another one ? (The we don't need the '\n' at all)
Or go both lines into one pkt-line (thats what I think), then
we don't need the '\n' afther the pathname.
And in this case the pathname is always the last element, and a '\n'
may occur in the pathname, since we know the length of the packet
we know how long the pathname must be.



> +packet:  git> 
> +packet:  git> CONTENT
> +packet:  git> 
> +
> +


> +The filter is expected to respond with a list of "key=value" pairs
> +terminated with a flush packet. If the filter does not experience
> +problems then the list must contain a "success" status. Right after
> +these packets the filter is expected to send the content in zero
> +or more pkt-line packets and a flush packet at the end. Finally, a
> +second list of "key=value" pairs terminated with a flush packet
> +is expected. The filter can change the status in the second list.
> +
> +packet:  git< status=success\n
> +packet:  git< 
> +packet:  git< SMUDGED_CONTENT
> +packet:  git< 
> +packet:  git<   # empty list!
> +
> +
> +If the result content is empty then the filter is expected to respond
> +with a success status and an empty list.
> +
> +packet:  git< status=success\n
> +packet:  git< 
> +packet:  git<   # empty content!
> +packet:  git<   # empty list!
> +
> +
> +In case the filter cannot or does not want to process the content,

Does not want ? 
I can see something like "I read through the file, there is nothing
to do. So Git, I don't send anything back, you know where the file is.

> +it is expected to respond with an "error" status. Depending on the
> +`filter..required` flag Git will interpret that as error
> +but it will not stop or restart the filter process.
> +
> +packet:  git< status=error\n
> +packet:  git< 
> +
> +
> +If the filter experiences an error during processing, then it can
> +send the status "error" after the content was (partially or
> +completely) sent. Depending on the `filter..required` flag
> +Git will interpret that as error but it will not stop or restart the
> +filter process.
> +
> +packet:  git< status=success\n
> +packet:  git< 
> +packet:  git< HALF_WRITTEN_ERRONEOUS_CONTENT
> +packet:  git< 
> +packet:  git< status=error\n
> +packet:  git< 
> +
> +
> +If the filter dies during the communication or does not adhere to
> +the protocol then Git will stop the filter process and restart it

My personal comment:
When a filter is mis-behaving, Git should say so loud and clear, and
die(). 
The filter process can be left running, so that it can be debugged.

Here I stopped the review for a moment, 
I still think that Git shouldn't kill anything, because we loose
the ability to debug these processes.

Re: Gitattributes file is not respected when switching between branches

2016-09-12 Thread Torsten Bögershausen

On 12.09.16 21:35, Torsten Bögershausen wrote:
> On 12.09.16 14:55, Виталий Ищенко wrote:
>> Good day
>>
>> I faced following issue with gitattributes file (at least eol setting)
>> when was trying to force `lf` mode on windows.
>>
>> We have 2 branches: master & dev. With master set as HEAD in repository
>>
>> I've added `.gitattributes` with following content to `dev` branch
>>
>> ```
>> * text eol=lf
>> ```
>>
>> Now when you clone this repo on other machine and checkout dev branch,
>> eol setting is not respected.
>> As a workaround you can rm all files except .git folder and do hard reset.
>>
>> Issue is reproducible on windows & unix versions. Test repo can be
>> found on github
>> https://github.com/betalb/gitattributes-issue
>>
>> master branch - one file without gitattributes
>> feature-branch - .gitattributes added with eol=lf
>> unix-feature-branch - .gitattributes added with eol=crlf
>>
>> Thanks,
>> Vitalii
> Some more information may be needed, to help to debug.
>
> Which version of Git are you using ?
> What does
>
> git ls-files --eol
>
> say ?
Obs, All information was in the email.

tb@xxx:/tmp/gitattributes-issue> git ls-files --eol
i/lfw/lfattr/   testfile-crlf.txt
tb@xxx:/tmp/gitattributes-issue> ls -al
total 8
drwxr-xr-x   4 tbwheel  136 Sep 12 21:38 .
drwxrwxrwt  19 root  wheel  646 Sep 12 21:38 ..
drwxr-xr-x  13 tbwheel  442 Sep 12 21:38 .git
-rw-r--r--   1 tbwheel   60 Sep 12 21:38 testfile-crlf.txt
tb@xxx:/tmp/gitattributes-issue>

Could it be that you didn't commit the file ".gitattributes" ?
This could help:
git add .gitattributes && git commit -m "Add .gitattributes"

Re: Gitattributes file is not respected when switching between branches

2016-09-12 Thread Torsten Bögershausen

On 12.09.16 14:55, Виталий Ищенко wrote:
> Good day
>
> I faced following issue with gitattributes file (at least eol setting)
> when was trying to force `lf` mode on windows.
>
> We have 2 branches: master & dev. With master set as HEAD in repository
>
> I've added `.gitattributes` with following content to `dev` branch
>
> ```
> * text eol=lf
> ```
>
> Now when you clone this repo on other machine and checkout dev branch,
> eol setting is not respected.
> As a workaround you can rm all files except .git folder and do hard reset.
>
> Issue is reproducible on windows & unix versions. Test repo can be
> found on github
> https://github.com/betalb/gitattributes-issue
>
> master branch - one file without gitattributes
> feature-branch - .gitattributes added with eol=lf
> unix-feature-branch - .gitattributes added with eol=crlf
>
> Thanks,
> Vitalii
Some more information may be needed, to help to debug.

Which version of Git are you using ?
What does

git ls-files --eol

say ?

Re: [PATCH v4 1/3] tests: move test_lazy_prereq JGIT to test-lib.sh

2016-09-09 Thread Torsten Bögershausen

On Fri, Sep 09, 2016 at 10:36:28AM -0700, Jonathan Tan wrote:
[]
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index d731d66..c9c1037 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -1072,6 +1072,10 @@ test_lazy_prereq NOT_ROOT '
>   test "$uid" != 0
>  '
>  
> +test_lazy_prereq JGIT '
> + type jgit
> +'
> +

Minor note: 
Typically the stdout of `which` is suppressed like this:

if ! type cvs >/dev/null 2>&1

Re: [PATCH v7 10/10] convert: add filter..process option

2016-09-10 Thread Torsten Bögershausen

[]

One general question up here, more comments inline.
The current order for a clean-filter is like this, I removed the error handling:

int convert_to_git()
{
ret |= apply_filter(path, src, len, -1, dst, filter);
if (ret && dst) {
src = dst->buf;
len = dst->len;
}
ret |= crlf_to_git(path, src, len, dst, ca.crlf_action, checksafe);
return ret | ident_to_git(path, src, len, dst, ca.ident);
}

The first step is the clean filter, the CRLF-LF conversion (if needed),
then ident.
The current implementation streams the whole file content to the filter,
(STDIN of the filter) and reads back STDOUT from the filter into a STRBUF.
This is to use the UNIX-like STDIN--STDOUT method for writing a filter.

However, int would_convert_to_git_filter_fd() and convert_to_git_filter_fd()
offer a sort of short-cut:
The filter reads from the file directly, and the output of the filter is
read into a STRBUF.

It looks as if the multi-filter approach can use this in a similar way:
Give the pathname to the filter, the filter opens the file for reading
and stream the result via the pkt-line protocol into Git.
This needs some more changes, and may be very well go into a separate patch
series. (and should).

What I am asking for:
When a multi-filter is used, the content is handled to the filter via pkt-line,
and the result is given to Git via pkt-line ?
Nothing wrong with it, I just wonder, if it should be mentioned somewhere.

> diff --git a/contrib/long-running-filter/example.pl 
> b/contrib/long-running-filter/example.pl
> new file mode 100755
> index 000..279fbfb
> --- /dev/null
> +++ b/contrib/long-running-filter/example.pl
> @@ -0,0 +1,111 @@
> +#!/usr/bin/perl
> +#
> +# Example implementation for the Git filter protocol version 2
> +# See Documentation/gitattributes.txt, section "Filter Protocol"
> +#
> +
> +use strict;
> +use warnings;
> +
> +my $MAX_PACKET_CONTENT_SIZE = 65516;
> +
> +sub packet_read {
> +my $buffer;
> +my $bytes_read = read STDIN, $buffer, 4;
> +if ( $bytes_read == 0 ) {
> +
> +# EOF - Git stopped talking to us!
> +exit();
> +}
> +elsif ( $bytes_read != 4 ) {
> +die "invalid packet size '$bytes_read' field";
> +}

This is half-kosher, I would say,
(And I really. really would like to see an implementation in C ;-)

A read function may look like this:

   ret = read(0, , 4);
   if (ret < 0) {
 /* Error, nothing we can do */
 exit(1);
   } else if (ret == 0) {
 /* EOF  */
 exit(0);
   } else if (ret < 4) {
 /* 
  * Go and read more, until we have 4 bytes or EOF or Error */
   } else {
 /* Good case, see below */
   }

> +my $pkt_size = hex($buffer);
> +if ( $pkt_size == 0 ) {
> +return ( 1, "" );
> +}
> +elsif ( $pkt_size > 4 ) {
> +my $content_size = $pkt_size - 4;
> +$bytes_read = read STDIN, $buffer, $content_size;
> +if ( $bytes_read != $content_size ) {
> +die "invalid packet ($content_size expected; $bytes_read read)";
> +}
> +return ( 0, $buffer );
> +}
> +else {
> +die "invalid packet size";
> +}
> +}
> +
> +sub packet_write {
> +my ($packet) = @_;
> +print STDOUT sprintf( "%04x", length($packet) + 4 );
> +print STDOUT $packet;
> +STDOUT->flush();
> +}
> +
> +sub packet_flush {
> +print STDOUT sprintf( "%04x", 0 );
> +STDOUT->flush();
> +}
> +
> +( packet_read() eq ( 0, "git-filter-client" ) ) || die "bad initialization";
> +( packet_read() eq ( 0, "version=2" ) ) || die "bad version";
> +( packet_read() eq ( 1, "" ) )  || die "bad version end";
> +
> +packet_write("git-filter-server\n");
> +packet_write("version=2\n");
> +
> +( packet_read() eq ( 0, "clean=true" ) )  || die "bad capability";
> +( packet_read() eq ( 0, "smudge=true" ) ) || die "bad capability";
> +( packet_read() eq ( 1, "" ) )|| die "bad capability end";
> +
> +packet_write( "clean=true\n" );
> +packet_write( "smudge=true\n" );
> +packet_flush();
> +
> +while (1) {
> +my ($command) = packet_read() =~ /^command=([^=]+)\n$/;
> +my ($pathname) = packet_read() =~ /^pathname=([^=]+)\n$/;
> +
> +packet_read();
> +
> +my $input = "";
> +{
> +binmode(STDIN);
> +my $buffer;
> +my $done = 0;
> +while ( !$done ) {
> +( $done, $buffer ) = packet_read();
> +$input .= $buffer;
> +}
> +}
> +
> +my $output;
> +if ( $command eq "clean" ) {
> +### Perform clean here ###
> +$output = $input;
> +}
> +elsif ( $command eq "smudge" ) {
> +### Perform smudge here ###
> +$output = $input;
> +}
> +else {
> +die "bad command '$command'";
> +}
> +
> +packet_write("status=success\n");
> +packet_flush();
> +while ( length($output) > 0 ) {
> +my $packet = substr( $output, 0,

Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Torsten Bögershausen




On 29/09/16 19:57, Lars Schneider wrote:

On 29 Sep 2016, at 18:57, Junio C Hamano <gits...@pobox.com> wrote:

Torsten Bögershausen <tbo...@web.de> writes:


1) Git exits
2) The filter process receives EOF and prints "STOP" to the log
3) t0021 checks the content of the log

Sometimes 3 happened before 2 which makes the test fail.
(Example: https://travis-ci.org/git/git/jobs/162660563 )

I added a this to wait until the filter process terminates:

+wait_for_filter_termination () {
+   while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
2>&1
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}

Does this look OK to you?

Do we need the ps at all ?
How about this:

+wait_for_filter_termination () {
+   while ! grep "STOP"  LOGFILENAME >/dev/null
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}

Running "ps" and grepping for a command is not suitable for script
to reliably tell things, so it is out of question.  Compared to
that, your version looks slightly better, but what if the machinery
that being tested, i.e. the part that drives the filter process, is
buggy or becomes buggy and causes the filter process that writes
"STOP" to die before it actually writes that string?

I have a feeling that the machinery being tested needs to be fixed
so that the sequence is always be:

0) Git spawns the filter process, as it needs some contents to
   be filtered.

1) Git did everything it needed to do and decides that is time
   to go.

2) Filter process receives EOF and prints "STOP" to the log.

3) Git waits until the filter process finishes.

4) t0021, after Git finishes, checks the log.

Repeated sleep combined with grep is probably just sweeping the real
problem under the rug.  Do we have enough information to do the
above?

An inspiration may be in the way we centrally clean all tempfiles
and lockfiles before exiting.  We have a central registry of these
files that need cleaning up and have a single atexit(3) handler to
clean them up.  Perhaps we need a registry that filter processes
spawned by the mechanism Lars introduces in this series, and have an
atexit(3) handler that closes the pipe to them (which signals the
filters that it is time for them to go) and wait(2) on them, or
something?  I do not think we want any kill(2) to be involved in
this clean-up procedure, but I do think we should wait(2) on what we
spawn, as long as these processes are meant to be shut down when the
main process of Git exits (this is different from things like
credential-cache daemon where they are expected to persist and meant
to serve multiple Git processes).

We discussed that issue in v4 and v6:
http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/
http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/

My impression was that you don't want Git to wait for the filter process.
If Git waits for the filter process - how long should Git wait?

Thanks,
Lars


Hm,
I would agree that  Git should not wait for the filter.
But does the test suite need to wait for the filter ?
May be, in this case we test the filter and Git, which is good.
Adding a 1 second delay, if, and only if, there is a racy condition,
is not that bad (or do we have better ways to check for a process to
be terminated ?)

Re: [PATCH] run-command: fix build on cygwin (stdin is a macro)

2016-10-05 Thread Torsten Bögershausen


>I am not suggesting that you apply this exact patch (stdin_ is not a good
choice

How about fd_stdin ?

Re: [PATCH v10 13/14] convert: add filter..process option

2016-10-08 Thread Torsten Bögershausen

On 09.10.16 01:06, Jakub Narębski wrote:
>> +
>> > +packet:  git< status=abort
>> > +packet:  git< 
>> > +
>> > +
>> > +After the filter has processed a blob it is expected to wait for
>> > +the next "key=value" list containing a command. Git will close
>> > +the command pipe on exit. The filter is expected to detect EOF
>> > +and exit gracefully on its own.
> Any "kill filter" solutions should probably be put here.  I guess
> that filter exiting means EOF on its standard output when read
> by Git command, isn't it?
>
Isn't it that Git closes the command pipe, then filter sees EOF on it's stdin

and does a graceful exit.

Re: [PATCH v10 14/14] contrib/long-running-filter: add long running filter example

2016-10-08 Thread Torsten Bögershausen

On 08.10.16 13:25, larsxschnei...@gmail.com wrote:
> From: Lars Schneider 
> 
> Add a simple pass-thru filter as example implementation for the Git
> filter protocol version 2. See Documentation/gitattributes.txt, section
> "Filter Protocol" for more info.
> 

Nothing wrong with code in contrib.
I may have missed parts of the discussion, was there a good reason to
drop the test case completely?

>When adding a new feature, make sure that you have new tests to show
>the feature triggers the new behavior when it should, and to show the
>feature does not trigger when it shouldn't.  After any code change, make
>sure that the entire test suite passes.

Or is there a plan to add them later ?

Re: A head scratcher, clone results in modified files (tested linux and cygwin) - .gitattributes file?

2016-10-09 Thread Torsten Bögershausen




On 09/10/16 08:48, Jason Pyeron wrote:


The whole .gitattributes needs to be adopted, I think

Git 2.10 or higher has "git ls-files --eol":

git ls-files --eol   | grep "i/crlf.*auto"
i/crlf  w/crlf  attr/text=auto src/site/xdoc/upgradeto2_3.xml
i/crlf  w/crlf  attr/text=auto 
src/test/resources/org/apache/commons/io/FileUtilsTestDataCRLF.dat

i/crlf  w/crlf  attr/text=auto src/test/resources/test-file-gbk.bin
i/crlf  w/crlf  attr/text=auto 
src/test/resources/test-file-iso8859-1-shortlines-win-linebr.bin

i/crlf  w/crlf  attr/text=auto src/test/resources/test-file-utf8-win-linebr.bin
i/crlf  w/crlf  attr/text=auto src/test/resources/test-file-windows-31j.bin
i/crlf  w/crlf  attr/text=auto src/test/resources/test-file-x-windows-949.bin
i/crlf  w/crlf  attr/text=auto src/test/resources/test-file-x-windows-950.bin

Problem:
xml file had been commited  with CRLF : either normalize it or declare "-text".

The same is valid for the other files as well.
They are identified by auto as text, and commited with CRLF.
My feeling is that they should be declared as "-text".
Or, to be more compatible, with "-crlf":

Solution:
Make up your mind about the xml file and the html files.
If they are text, they need to be normalized.


Question:
What happens, if you do this:
# Auto detect text files and perform LF normalization
*crlf=auto

*.bin-crlf
*.dat-crlf
*.java   crlf diff=java
*.html   crlf diff=html
*.csscrlf
*.js crlf
*.sqlcrlf

Re: Gitattributes file is not respected when switching between branches

2016-09-18 Thread Torsten Bögershausen

On 16.09.16 08:51, Виталий Ищенко wrote:
> Sorry for delay.
> 
No problem about the delay.

(And please no top-posting)

If you say
> ".gitattributes" indeed is not present in "master", but this is intentionally
then nobody has (to my knowledge) thought about this situation/workflow yet.

The short version:
Git is designed to have the same .gitattributes in different branches.
At least not in the long run.

A typical use case is to create a repo, adjust the
.gitattributes and keep this in all branches.

Re: [ANNOUNCE] Git v2.10.0-rc0

2016-08-17 Thread Torsten Bögershausen

On Mon, Aug 15, 2016 at 08:42:48AM -0700, Junio C Hamano wrote:
> Torsten Bögershausen <tbo...@web.de> writes:
> 
> > On 15.08.16 00:47, Junio C Hamano wrote:
> >> Torsten Bögershausen (1):
> >>   convert: unify the "auto" handling of CRLF
> >
> > Should we mention this change in the release notes?
> >
> > The handling of "*.text = auto" was changed, and now
> >
> > $ echo "* text=auto eol=crlf" >.gitattributes
> > has the same effect as
> > $ git config core.autocrlf true
> 
> Oh, definitely.  Thanks.
Hm, one question remains:
 git show 07c92928f2b782330df6e78
[]
+ * The handling of the "text = auto" attribute has been updated.
+   $ echo "* text=auto eol=crlf" >.gitattributes
+   used to have the same effect as
+   $ echo "* text=auto eol=crlf" >.gitattributes
Is this `text=auto` correct ?
I think it should be

   used to have the same effect as
   $ echo "* text eol=crlf" >.gitattributes

# In other words, the `auto` was ignored, as explained here:
+   $ git config core.eol crlf
+   i.e. declaring all files are text; the combination now is
+   equivalent to doing
+   $ git config core.autocrlf true
+
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: git-svn bridge and line endings

2016-08-23 Thread Torsten Bögershausen

On 23.08.16 19:50, Lucian Smith wrote:
> On Tue, Aug 23, 2016 at 10:36 AM, Julian Phillips
>  wrote:
>> On 23/08/2016 17:14, Lucian Smith wrote:
>>>
>>> Thanks for the quick responses!
>>>
>>> My situation is that the git side is entirely whatever github.org is
>>> running; presumably the latest stable version?  They provide a URL for
>>> repositories hosted there that can be accessed by an SVN
>>> client--somewhere on github is the 'git-svn bridge' (as I understand
>>> it): something that receives SVN requests, translates them to
>>> git-speak, and replies with what SVN expects.
>>
>>
>> The ability to use a Subversion client is functionality provided by GitHub,
>> and not native to git itself.  So unless someone for the appropriate GitHub
>> team happens to read this thread I expect there isn't much we can do to
>> help.  I don't know if they've even provided any real detail of how they
>> implemented the bridge.
> 
> All right, that makes sense.  And yeah, it was hard to find any
> information about the bridge, which is why I ended up here...
> 
>>> There is indeed a .gitattributes file in the repository, but the SVN
>>> client doesn't know what to do with it.  My hope was that something in
>>> the bridge code, that translated SVN requests to git and back, could
>>> take the SVN request, "Please give me this file; I'm on Windows" look
>>> at the .gitattributes file in the repository, and hand out a file with
>>> CR/LF's in it.  Conversely, when SVN tells git "Here is the new
>>> version of the file to check in," the bridge could look at the file,
>>> realize it had CR/LF's in it, look at the .gitattributes file to know
>>> if it needed to be converted, and then convert it appropriately.
>>>
>>> I can imagine a full-blown bridge that could even translate the SVN
>>> EOL propset back and forth appropriately, but I'm not sure if going
>>> that far is necessary and/or helpful.
>>>
>>> I don't know if this is the right mailing list for that particular bit
>>> of software, but it at least seemed like a good place to start.  Thank
>>> you!
>>
>>
>> Supported properties are listed here:
>> https://help.github.com/articles/subversion-properties-supported-by-github/
>>
>> You'll need to ask GitHub to implement support for the svn:eol-style
>> property I expect.
> 
> Thanks for finding that!  That even has an 'ask a human' button, which
> I expect is my next step.
> 
>> Might be easier to just use Tortoise Git?
> 
> It may be!  But thanks for the responses anyway.
Most text-files have been commited with LF:
/tmp/ttt/sbml-test-suite> git ls-files --eol | grep "i/lf" | wc -l
   10266
Some have been commited with CRLF:
/tmp/ttt/sbml-test-suite> git ls-files --eol | grep "i/crlf" | wc -l
1620


The whole repo deserves to be normalized and equipped with a .gitattributes 
file,
see

https://www.kernel.org/pub/software/scm/git/docs/gitattributes.html


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH 2/4] cat-file: introduce the --filters option

2016-08-22 Thread Torsten Bögershausen

On 19.08.16 17:00, Johannes Schindelin wrote:
> Hi Torsten,
>
> On Fri, 19 Aug 2016, Torsten Bögershausen wrote:
>
>> On Thu, Aug 18, 2016 at 02:46:17PM +0200, Johannes Schindelin wrote:
>>
>>> +--filters::
>>> +   Show the content as transformed by the filters configured in
>> Minor comment:
>> s/transformed/converted/ ?
> Sure.
>
>> Does it make sense to be more specific here:
>> The order of conversion is
>> - ident
>> - CRLF
>> - smudge
> I do not think it makes sense to complexify the documentation in that
> manner. The filters should always be applied in the same order, methinks,
> and it would only clutter the man page to repeat that order here.
Can we can shorten the description and have something like this:

--filters::
+   Show the content converted by the filters configured in
+   the current working tree for the given . 
+has to be of the form : or :.
+


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v1 0/3] Update eol documentation

2016-08-26 Thread Torsten Bögershausen




On 25/08/16 22:31, Junio C Hamano wrote:

[]



This [0/3] is meant to be a cover for [1/2] and [2/2]?

I am trying to see if we broke format-patch recently, or it is a
manual editing error.  The latter I do not care about; the former I
do.



No worry, 0/3 is patched by me by hand, sorry for the confusion.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: [PATCH v7 10/10] convert: add filter..process option

2016-09-29 Thread Torsten Bögershausen

On 15.09.16 22:04, Junio C Hamano wrote:
> Lars Schneider  writes:
> 
>> Wouldn't that complicate the pathname parsing on the filter side?
>> Can't we just define in our filter protocol documentation that our 
>> "pathname" packet _always_ has a trailing "\n"? That would mean the 
>> receiver would know a packet "pathname=ABC\n\n" encodes the path
>> "ABC\n" [1].
> 
> That's fine, too.  If you declare that pathname over the protocol is
> a binary thing, you can also define that the packet does not have
> the terminating \n, i.e. the example encodes the path "ABC\n\n",
> which is also OK ;-)
> 
> As long as the rule is clearly documented, easy for filter
> implementors to follow it, and hard for them to get it wrong, I'd be
> perfectly happy.
>

(Sorry for the late reply)

In V8 the additional "\n" is clearly documented.

On the long run,
I would suggest to be more clear what BINARY is:

--- a/Documentation/technical/protocol-common.txt
+++ b/Documentation/technical/protocol-common.txt
@@ -61,6 +61,9 @@ the length's hexadecimal representation.
 A pkt-line MAY contain binary data, so implementors MUST ensure
 pkt-line parsing/formatting routines are 8-bit clean.
 
+Each pkt-line that may contain ASCII control characters should
+be treated as binary.
+

Re: Impossible to change working directory

2016-10-01 Thread Torsten Bögershausen

On 29.09.16 21:30, Sebastian Feldmann wrote:
> Hi there,
>
> I have a problem executing a pre-commit hook.
> The hook script has to change the working directory to work and if I use plain
>
> git commit
>
> it works as expected, the script executes without errors, but if I use
>
> git commit —only file.x file.y
>
> the script fails because changing the current working directory fails.
> If I echo the current working directory it always echoes the root repository 
> path
>
> Is this expected behavior?
> Thanks for your feedback.
Is there any chance to send us the content of the script ?
(Or a demo example, which doesn't work)

Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Torsten Bögershausen




On 29/09/16 12:28, Lars Schneider wrote:

On 28 Sep 2016, at 23:49, Junio C Hamano  wrote:

I suspect that you are preparing a reroll already, but the one that
is sitting in 'pu' seems to be flaky in t/t0021 and I seem to see
occasional failures from it.

I didn't trace where the test goes wrong, but one easy mistake you
could make (I am not saying that is the reason of the failure) is to
assume your filter will not be called under certain condition (like
immediately after you checked out from the index to the working
tree), when the automated test goes fast enough and get you into a
"racy git" situation---the filter may be asked to filter the
contents from the working tree again to re-validate what's there is
still what is in the index.

Thanks for the heads-up!

This is what happens:

1) Git exits
2) The filter process receives EOF and prints "STOP" to the log
3) t0021 checks the content of the log

Sometimes 3 happened before 2 which makes the test fail.
(Example: https://travis-ci.org/git/git/jobs/162660563 )

I added a this to wait until the filter process terminates:

+wait_for_filter_termination () {
+   while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
2>&1
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}

Does this look OK to you?

Do we need the ps at all ?
How about this:

+wait_for_filter_termination () {
+   while ! grep "STOP"  LOGFILENAME >/dev/null
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}

Re: [PATCH v1 1/2] config.mak.in: set NO_OPENSSL and APPLE_COMMON_CRYPTO for macOS >10.11

2016-11-09 Thread Torsten Bögershausen

On 07.11.16 18:26, Jeff King wrote:
> On Sun, Nov 06, 2016 at 08:35:04PM +0100, Lars Schneider wrote:
> 
>> Good point. I think I found an even easier way to achieve the same.
>> What do you think about the patch below?
>>
>> [...]
>>
>> diff --git a/Makefile b/Makefile
>> index 9d6c245..f53fcc9 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -1047,6 +1047,7 @@ ifeq ($(uname_S),Darwin)
>>  endif
>>  endif
>>  ifndef NO_APPLE_COMMON_CRYPTO
>> +NO_OPENSSL = YesPlease
>>  APPLE_COMMON_CRYPTO = YesPlease
>>  COMPAT_CFLAGS += -DAPPLE_COMMON_CRYPTO
>>  endif
> 
> That is much simpler.
[]
I don't know if that is a correct solution.

If I have Mac OS 10.12 and Mac Ports installed, I may want to use
OPENSSL from Mac Ports.
How about this:


diff --git a/Makefile b/Makefile
index ee89c06..e93511f 100644
--- a/Makefile
+++ b/Makefile
@@ -1038,17 +1038,22 @@ ifeq ($(uname_S),Darwin)
ifeq ($(shell test -d /sw/lib && echo y),y)
BASIC_CFLAGS += -I/sw/include
BASIC_LDFLAGS += -L/sw/lib
+   HAS_OPENSSL = Yes
endif
endif
ifndef NO_DARWIN_PORTS
ifeq ($(shell test -d /opt/local/lib && echo y),y)
BASIC_CFLAGS += -I/opt/local/include
BASIC_LDFLAGS += -L/opt/local/lib
+   HAS_OPENSSL = Yes
endif
endif
ifndef NO_APPLE_COMMON_CRYPTO
APPLE_COMMON_CRYPTO = YesPlease
COMPAT_CFLAGS += -DAPPLE_COMMON_CRYPTO
+   ifndef HAS_OPENSSL
+   NO_OPENSSL = YesPlease
+   endif
endif
NO_REGEX = YesPlease
PTHREAD_LIBS =

Re: [PATCH v1 1/2] config.mak.in: set NO_OPENSSL and APPLE_COMMON_CRYPTO for macOS >10.11

2016-11-09 Thread Torsten Bögershausen

On 09.11.16 10:29, Lars Schneider wrote:
> 
>> On 09 Nov 2016, at 09:18, Torsten Bögershausen <tbo...@web.de> wrote:
>>
>> On 07.11.16 18:26, Jeff King wrote:
>>> On Sun, Nov 06, 2016 at 08:35:04PM +0100, Lars Schneider wrote:
>>>
>>>> Good point. I think I found an even easier way to achieve the same.
>>>> What do you think about the patch below?
>>>>
>>>> [...]
>>>>
>>>> diff --git a/Makefile b/Makefile
>>>> index 9d6c245..f53fcc9 100644
>>>> --- a/Makefile
>>>> +++ b/Makefile
>>>> @@ -1047,6 +1047,7 @@ ifeq ($(uname_S),Darwin)
>>>>endif
>>>>endif
>>>>ifndef NO_APPLE_COMMON_CRYPTO
>>>> +  NO_OPENSSL = YesPlease
>>>>APPLE_COMMON_CRYPTO = YesPlease
>>>>COMPAT_CFLAGS += -DAPPLE_COMMON_CRYPTO
>>>>endif
>>>
>>> That is much simpler.
>> []
>> I don't know if that is a correct solution.
>>
>> If I have Mac OS 10.12 and Mac Ports installed, I may want to use
>> OPENSSL from Mac Ports.
> 
> Can't you define `NO_APPLE_COMMON_CRYPTO` in that case? 
> I think if you use OpenSSL then you don't need the Apple crypto lib, right?

After re-reading the Makefile: that makes sense :-)

Do you want to send a new patch ?

Feel free to omit
"Original-patch-by: Torsten Bögershausen <tbo...@web.de>"

Re: [PATCH v1 0/2] Fix default macOS build locally and on Travis CI

2016-11-09 Thread Torsten Bögershausen



On 10/11/16 00:39, Junio C Hamano wrote:


I've followed what was available at the public-inbox archive, but it
is unclear what the conclusion was.

For the first one your "how about" non-patch, to which Peff said
"that's simple and good", looked good to me as well, but is it
available as a final patch that I can just take and apply (otherwise
I think I can do the munging myself, but I'd rather be spoon-fed
when able ;-).


If you can munge the Peff version that may be easiest ?
I couldn't find a branch in your repo, but am happy to review
whatever shows up there and in pu.

Re: What's cooking in git.git (Oct 2016, #09; Mon, 31)

2016-11-05 Thread Torsten Bögershausen

On Thu, Nov 03, 2016 at 09:30:44AM -0400, Martin-Louis Bright wrote:
> I will see if I can find a OSX 10.6 system to test with, and I'll try with
> perl 5.10.
> 
> --Martin

No need to worry too much:

I have tested Peffs patch applied on next OK- 
And the integration into pu that came the 2nd Novevember is tested OK as well.

(And please everybody: avoid top-posting here)

Re: What's cooking in git.git (Oct 2016, #09; Mon, 31)

2016-11-05 Thread Torsten Bögershausen

> * st/verify-tag (2016-10-10) 7 commits
>  - t/t7004-tag: Add --format specifier tests
>  - t/t7030-verify-tag: Add --format specifier tests
>  - builtin/tag: add --format argument for tag -v
>  - builtin/verify-tag: add --format to verify-tag
>  - tag: add format specifier to gpg_verify_tag
>  - ref-filter: add function to print single ref_array_item
>  - gpg-interface, tag: add GPG_VERIFY_QUIET flag
> 
>  "git tag" and "git verify-tag" learned to put GPG verification
>  status in their "--format=" output format.
> 
>  Waiting for a reroll.
>  cf. <20161007210721.20437-1-santi...@nyu.edu>

I don't know if this has been reported before:
Some tests needs to be protected by GPG:
+test_expect_success GPG 'verifying a proper tag with --format pass and format 
accordingly' 

test_expect_success GPG 'verifying a forged tag with --format fail and format 
accordingly'

test_expect_success GPG 'verifying a forged tag with --format fail and format 
accordingly'

Re: [PATCH] rev-list: restore the NUL commit separator in --header mode

2016-10-20 Thread Torsten Bögershausen


diff --git a/t/t6000-rev-list-misc.sh b/t/t6000-rev-list-misc.sh

index 3e752ce..a2acff3 100755
--- a/t/t6000-rev-list-misc.sh
+++ b/t/t6000-rev-list-misc.sh
@@ -100,4 +100,11 @@ test_expect_success '--bisect and --first-parent can not 
be combined' '
test_must_fail git rev-list --bisect --first-parent HEAD
  '
  
+test_expect_success '--header shows a NUL after each commit' '

+   touch expect &&
+   printf "\0" > expect &&

Micronit: No ' ' after '>'
(And why do we need the touch ?)

+   printf "\0" >expect &&



+   git rev-list --header --max-count=1 HEAD | tail -n1 >actual &&
+   test_cmp_bin expect actual
+'
+
  test_done

Re: [PATCH] t0021, t5615: use $PWD instead of $(pwd) in PATH-like shell variables

2016-11-14 Thread Torsten Bögershausen

On 14.11.16 10:11, Lars Schneider wrote:
> 
>> On 13 Nov 2016, at 02:13, Junio C Hamano  wrote:
>>
>> Johannes Schindelin  writes:
>>
 Thanks.  Dscho, does this fix both of these issues to you?
>>>
>>> Apparently it does because the CI jobs for `master` and for `next` pass.
>>
>> OK, thanks for a quick confirmation.
>>
>>> The one for `pu` still times out, of course.
>>
>> Earlier you said 'pu' did even not build, but do we know where this
>> "still times out" comes from?  As long as I don't merge anything
>> prematurely, which I need to be careful about, it shouldn't impact
>> the upcoming release, but we'd need to figure it out before moving
>> things forward post release.
> 
> What is the goal for 'pu'?
> 

> (1) Builds clean on all platforms + passes all tests
Yes
> (2) Builds clean on all platforms
Yes
> (3) Builds clean on Linux
Yes
> (4) Just makes new topics easily available to a broader audience
Yes

> 
> My understanding was always (4) but the discussion above sounds 
> more like (1) or (2)?
All commits should work on all platforms - in the ideal world there is no 
problem.

>From time to time things sneak in, which are not portable.
(in the sense that not all "supported" compile/run tests without breakages)

And if everybody reports breakages and problems found on the pu
branch, there is a good chance that they don't reach next or master.

Does this make sense ?

> 
> --
> 
> Git 'pu' does not compile on macOS right now:
> builtin/bisect--helper.c:299:6: error: variable 'good_syn' is used 
> uninitialized 
> whenever 'if' condition is true [-Werror,-Wsometimes-uninitialized]
> ...
> 
> More info here:
> https://api.travis-ci.org/jobs/175417712/log.txt?deansi=true
> 
> --
> 
> Cheers,
> Lars
>

Re: [PATCH v8 07/10] convert: unify the "auto" handling of CRLF

2016-11-25 Thread Torsten Bögershausen

RFH, the normalization as descrived in Documentation/gitattributes.txt
does not work anymore:


>From a clean working directory:

-
$ echo "* text=auto" >.gitattributes
$ rm .git/index # Remove the index to force Git to
$ git reset # re-scan the working directory
$ git status# Show files that will be normalized
$ git add -u
$ git add .gitattributes
$ git commit -m "Introduce end-of-line normalization"
-


I have different ideas, how a a normalizatio  can be done:

A) 
-
$ echo "* text=auto" >.gitattributes
$ rm .git/index
$ git add .
$ git status# Show files that will be normalized
$ git commit -m "Introduce end-of-line normalization"
-

B)
$ echo "* text=auto" >.gitattributes &&
$ git add .gitattributes &&
$ git ls-files --eol | egrep '^i/(crlf|mixed).*attr/(text|auto)' | ( 
TAB=$(printf "\t") ; sed -e "s/.*$TAB/dos2unix /" ) >/tmp/$$ &&
$ /bin/sh /tmp/$$ &&
$ rm -f /tmp/$$ &&
$ git add -u &&
$ git commit -m "Introduce end-of-line normalization"

C) 
Teach "git add" to learn --renormalize and then
-
$ echo "* text=auto" >.gitattributes
$ git add -u --renormalize
$ git add .gitattributes
$ git commit -m "Introduce end-of-line normalization"
-

(None of them is really tested)

A) may loose the execute bit
B) dos2unix is not installed everywhere (like Mac OS)
C) seems to most attractive, but I couldn't find out how to forward
   options from "git add" into convert.c


Any help is appreciated.

Re: cherry-pick -Xrenormalize fails with formerly CRLF files

2016-11-28 Thread Torsten Bögershausen

On Sun, Nov 27, 2016 at 10:19:35PM -0800, Eevee (Lexy Munroe) wrote:
> I'm working with a repo that used to be all CRLF.  At some point it
> was changed to all LF, with `text=auto` in .gitattributes for the
> sake of Windows devs.  I'm on Linux and have never touched any
> twiddles relating to line endings.  I'm trying to cherry-pick some
> commits from before the switchover.
> 
> Straightforward cherry-picking causes entire files at a time to
> conflict, which I've seen before when switching from tabs to spaces.
> So I tried -Xrenormalize and got:
> 
> fatal: CRLF would be replaced by LF in [path]
> 

Which version of Git are you using, what does
git --version
say?


> The error comes from check_safe_crlf, which warns if checksafe is
> CRLF_SAFE_WARN and dies if it's (presumably) CRLF_SAFE_FAIL.  The
> funny thing is that it's CRLF_SAFE_RENORMALIZE.
> 
> I don't know what the semantics of this value are, but the caller
> (crlf_to_git) explicitly checks for CRLF_SAFE_RENORMALIZE and
> changes it to CRLF_SAFE_FALSE instead.  But that check only happens
> if crlf_action is CRLF_AUTO*, and for me it's CRLF_TEXT_INPUT.
> 
> I moved the check to happen regardless of the value of crlf_action,
> and at least in this case, git appears to happily do the right
> thing.  So I think this is a bug, but line endings are such a tangle
> that I'm really not sure.  :)
>
I am not sure either.
Could you send me the diff you made ?
git diff

I am happy to look into it, (in the amount of time I have).

> The repository in question is ZDoom: https://github.com/rheit/zdoom
> I'm trying to cherry-pick commits from the 3dfloors3 branch (e.g.,
> 9fb2daf58e9d512170859302a1ac0ea9c2ec5993) onto a slightly outdated
> master, 6384e81d0f135a2c292ac3e874f6fe26093f45b1.

This is what I tried:

user@pc:~/NoBackup> cd zdoom/  9fb2daf58e9d512170859302a1ac0ea9c2ec5993 
 t9fb2daf5Note: checking out 
'9fb2daf58e9d512170859302a1ac0ea9c2ec5993'.02a1ac0ea9c2ec5993 

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b 

HEAD is now at 9fb2daf... - Force use of GL nodes when 3D floors are present. - 
Move MAPINFO line modification flags into P_AdjustLine().
user@pc:~/NoBackup/zdoom> git format-patch HEAD^..HEAD --stdout >9fb2daf.patch
user@pc:~/NoBackup/zdoom> git format-patch HEAD^..HEAD --stdout | tr -d "\r" 
>9fb2daf-noCRLF.patch
user@pc:~/NoBackup/zdoom> git checkout 6384e81d0f135a2c292ac3e874f6fe26093f45b1
Previous HEAD position was 9fb2daf... - Force use of GL nodes when 3D floors 
are present. - Move MAPINFO line modification flags into P_AdjustLine().
HEAD is now at 6384e81... - Add support for Skulltag ACS IsNetworkGame.
user@pc:~/NoBackup/zdoom> git cherry-pick 
9fb2daf58e9d512170859302a1ac0ea9c2ec5993'.02a1ac0ea9c2ec5993
> 
user@pc:~/NoBackup/zdoom> git cherry-pick 
9fb2daf58e9d512170859302a1ac0ea9c2ec5993
error: could not apply 9fb2daf... - Force use of GL nodes when 3D floors are 
present.
hint: after resolving the conflicts, mark the corrected paths
hint: with 'git add ' or 'git rm '
hint: and commit the result with 'git commit'
user@pc:~/NoBackup/zdoom> git status
HEAD detached at 6384e81
You are currently cherry-picking commit 9fb2daf.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Unmerged paths:
  (use "git add ..." to mark resolution)

both modified:   src/p_setup.cpp

Untracked files:
  (use "git add ..." to include in what will be committed)

9fb2daf-noCRLF.patch
9fb2daf.patch

no changes added to commit (use "git add" and/or "git commit -a")
user@pc:~/NoBackup/zdoom> git cherry-pick --abort
user@pc:~/NoBackup/zdoom> git am < 9fb2daf-noCRLF.patch 
Applying: - Force use of GL nodes when 3D floors are present. - Move MAPINFO 
line modification flags into P_AdjustLine().
error: patch failed: src/p_setup.cpp:1798
error: src/p_setup.cpp: patch does not apply
Patch failed at 0001 - Force use of GL nodes when 3D floors are present. - Move 
MAPINFO line modification flags into P_AdjustLine().
The copy of the patch that failed is found in: .git/rebase-apply/patch
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".



The patch did not apply, but for different reasons.

Could you send us, what exactly you did, to help me out ?

Re: [PATCH v1 2/2] travis-ci: disable GIT_TEST_HTTPD for macOS

2016-11-16 Thread Torsten Bögershausen

On 16.11.16 15:39, Heiko Voigt wrote:
> On Tue, Nov 15, 2016 at 10:31:59AM -0500, Jeff King wrote:
>> On Tue, Nov 15, 2016 at 01:07:18PM +0100, Heiko Voigt wrote:
>>
>>> On Fri, Nov 11, 2016 at 09:22:51AM +0100, Lars Schneider wrote:
 To all macOS users on the list:
 Does anyone execute the tests with GIT_TEST_HTTPD enabled successfully?
>>>
>>> Nope. The following tests fail for me on master: 5539, 5540, 5541, 5542,
>>> 5550, 5551, 5561, 5812.
>>
>> Failing how? Does apache fail to start up? Do tests fails? What does
>> "-v" say? Is there anything interesting in httpd/error.log in the trash
>> directory?
> 
> This is what I see for 5539:
> 
> $ GIT_TEST_HTTPD=1 ./t5539-fetch-http-shallow.sh -v
> Initialized empty Git repository in /Users/hvoigt/Repository/git4/t/trash 
> directory.t5539-fetch-http-shallow/.git/
> checking prerequisite: NOT_ROOT
> 
> mkdir -p "$TRASH_DIRECTORY/prereq-test-dir" &&
> (
>   cd "$TRASH_DIRECTORY/prereq-test-dir" &&
>   uid=$(id -u) &&
>   test "$uid" != 0
> 
> )
> prerequisite NOT_ROOT ok
> httpd: Syntax error on line 65 of 
> /Users/hvoigt/Repository/git4/t/lib-httpd/apache.conf: Cannot load 
> modules/mod_mpm_prefork.so into server: 
> dlopen(/Users/hvoigt/Repository/git4/t/trash 
> directory.t5539-fetch-http-shallow/httpd/modules/mod_mpm_prefork.so, 10): 
> image not found
> error: web server setup failed
> 
Yes, same here.

If we take that out:

diff --git a/t/lib-httpd/apache.conf b/t/lib-httpd/apache.conf
index c3e6313..1925fdb 100644
--- a/t/lib-httpd/apache.conf
+++ b/t/lib-httpd/apache.conf
@@ -61,9 +61,6 @@ LockFile accept.lock
 
LoadModule access_compat_module modules/mod_access_compat.so
 
-
-   LoadModule mpm_prefork_module modules/mod_mpm_prefork.so
-
 
LoadModule unixd_module modules/mod_unixd.so
 

I run into other issues:

 [core:emerg] [pid 2502] (2)No such file or directory: AH00023: Couldn't create 
the rewrite-map mutex (file /private/var/run/rewrite-map.2502)
AH00016: Configuration Failed

(apache2 comes via MacPorts)

Re: merge --no-ff is NOT mentioned in help

2016-11-17 Thread Torsten Bögershausen




On 17/11/16 18:10, Junio C Hamano wrote:

Mike Rappazzo  writes:


(Please reply inline)

Indeed ;-)


On Wed, Nov 16, 2016 at 10:48 AM, Vanderhoof, Tzadik
 wrote:

I am running:git version 2.10.1.windows.1

I typed: git merge -h

and got:

usage: git merge [] [...]
or: git merge []  HEAD 
or: git merge --abort

 -ndo not show a diffstat at the end of the merge
...
 --overwrite-ignoreupdate ignored files (default)

Notice there is NO mention of the "--no-ff" option

I understand.  On my system I can reproduce this by providing a bad
argument to `git merge`.  This is the output from the arg setup.  For
"boolean" arguments (like '--ff'), there is an automatic counter
argument with "no-" in there ('--no-ff') to disable the option.  Maybe
it would make sense to word the output to include both.

I think that was a deliberate design decision to avoid cluttering
the short help text with mention of both --option and --no-option.

People interested may want to try the attached single-liner patch to
see how the output from _ALL_ commands that use parse-options API
looks when given "-h".  It could be that the result may not be too
bad.

I suspect that we may discover that some options that should be
marked with NONEG are not marked along the way, which need to be
fixed.


  parse-options.c | 4 +++-
  1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/parse-options.c b/parse-options.c
index 312a85dbde..348be6b240 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -626,7 +626,9 @@ static int usage_with_options_internal(struct 
parse_opt_ctx_t *ctx,
if (opts->long_name && opts->short_name)
pos += fprintf(outfile, ", ");
if (opts->long_name)
-   pos += fprintf(outfile, "--%s", opts->long_name);
+   pos += fprintf(outfile, "--%s%s",
+  (opts->flags & PARSE_OPT_NONEG) ? "" : 
"[no-]",
+  opts->long_name);
if (opts->type == OPTION_NUMBER)
pos += utf8_fprintf(outfile, _("-NUM"));
  

+1 from my side
(As I once spend some time to find out that the "no--" is automatically 
available)

Re: [PATCH] Remove dependency on deprecated Net::SMTP::SSL

2016-11-20 Thread Torsten Bögershausen





On 20/11/16 22:18, Mike Fisher  wrote:


Thanks for contributing to Git.
One comment on the head line:
>Refactor send_message() to remove dependency on deprecated
Net::SMTP::SSL
The word "refactor" may be used in other way: Re-structure the code,
and use the same API.


"Remove dependency on deprecated Net::SMTP::SSL"


Refactor send_message() to remove dependency on deprecated
Net::SMTP::SSL:

Is there a security risk with require Net::SMTP::SSL ?
If yes, the commit message should state this.
If no:
Even if it is deprecated, is it still in use somewhere ?
Does it hurt someone, is there any OS release where the old code doesn't work 
anymore ?

Or is it "only" nice to have ?
Since when does Net::SMTP include Net::SMTP::SSL ?
On which system has the change been tested ?

I think the commit message could and should give more information like this.

My comments may be over-critical.
Lets see if other people from the list know more than me.





Signed-off-by: Mike Fisher 
---
 git-send-email.perl | 54 +
 1 file changed, 25 insertions(+), 29 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index da81be4..fc166c5 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1330,15 +1330,17 @@ Message-Id: $message_id
 print $sm "$header\n$message";
 close $sm or die $!;
 } else {
-

I can see one refactoring, that is the removal of an empty line.



 if (!defined $smtp_server) {
 die "The required SMTP server is not properly defined."
 }

+require Net::SMTP;
+$smtp_domain ||= maildomain();
+my $smtp_ssl = 0;
+
 if ($smtp_encryption eq 'ssl') {
 $smtp_server_port ||= 465; # ssmtp
-require Net::SMTP::SSL;
-$smtp_domain ||= maildomain();
+$smtp_ssl = 1;
 require IO::Socket::SSL;

 # Suppress "variable accessed once" warning.
@@ -1347,37 +1349,31 @@ Message-Id: $message_id
 $IO::Socket::SSL::DEBUG = 1;
 }

-# Net::SMTP::SSL->new() does not forward any SSL options
 IO::Socket::SSL::set_client_defaults(
 ssl_verify_params());
-$smtp ||= Net::SMTP::SSL->new($smtp_server,
-  Hello => $smtp_domain,
-  Port => $smtp_server_port,
-  Debug => $debug_net_smtp);
 }
 else {
-require Net::SMTP;
-$smtp_domain ||= maildomain();
 $smtp_server_port ||= 25;
-$smtp ||= Net::SMTP->new($smtp_server,
- Hello => $smtp_domain,
- Debug => $debug_net_smtp,
- Port => $smtp_server_port);
-if ($smtp_encryption eq 'tls' && $smtp) {
-require Net::SMTP::SSL;
-$smtp->command('STARTTLS');
-$smtp->response();
-if ($smtp->code == 220) {
-$smtp = Net::SMTP::SSL->start_SSL($smtp,
-  ssl_verify_params())
-or die "STARTTLS failed! ".IO::Socket::SSL::errstr();
-$smtp_encryption = '';
-# Send EHLO again to receive fresh
-# supported commands
-$smtp->hello($smtp_domain);
-} else {
-die "Server does not support STARTTLS! ".$smtp->message;
-}
+}
+
+$smtp ||= Net::SMTP->new($smtp_server,
+ Hello => $smtp_domain,
+ Port => $smtp_server_port,
+ Debug => $debug_net_smtp,
+ SSL => $smtp_ssl);
+
+if ($smtp_encryption eq 'tls' && $smtp) {
+$smtp->command('STARTTLS');
+$smtp->response();
+if ($smtp->code == 220) {
+$smtp->starttls(ssl_verify_params())
+or die "STARTTLS failed! ".IO::Socket::SSL::errstr();
+$smtp_encryption = '';
+# Send EHLO again to receive fresh
+# supported commands
+$smtp->hello($smtp_domain);
+} else {
+die "Server does not support STARTTLS! ".$smtp->message;
 }
 }

Re: [PATCH v2 2/2] convert.c: stream and fast search for binary

2016-11-01 Thread Torsten Bögershausen

[]
> This probably should be done as four more patches to become
> reviewable.
> 
>  - One to use the CONVERT_STAT_BITS a lot more for the conversion
>decision than before, 
> 
>  - another to allow the caller to tell gather_stats() to give up
>early with the "search_only" bits, 
> 
>  - another to update the get_*_convert_stats() functions to use
>get_convert_stats_sha1(), and then finally 
> 
>  - use the streaming interface when reading from blob and file.
> 
> or something line that.

Many thanks for the detailed review. Let's see if I can come up
with a better series the next weeks or so.

Re: What's cooking in git.git (Oct 2016, #09; Mon, 31)

2016-11-02 Thread Torsten Bögershausen

> * ls/filter-process (2016-10-17) 14 commits
>   (merged to 'next' on 2016-10-19 at ffd0de042c)

Some (late, as I recently got a new battery for the Mac OS 10.6 test system) 
comments:
t0021 failes here:


Can't locate object method "flush" via package "IO::Handle" at 
/Users/tb/projects/git/git.next/t/t0021/rot13-filter.pl line 90.
fatal: The remote end hung up unexpectedly


perl itself is 5.10 and we use the one shipped with Mac OS.
Why that ?
t0021 uses the hard-coded path:
t0021/rot13-filter.pl (around line 345) and the nice macro
PERL_PATH from the Makefile is fully ignored.

Commenting out the different "flush" makes the test hang, and I haven't digged 
further.

Re: [PATCH 4/4] t0021: fix filehandle usage on older perl

2016-11-02 Thread Torsten Bögershausen


> +use IO::File;

That did the trick (the J6T version work as well)

Thanks for the fast responses.

Re: What's cooking in git.git (Oct 2016, #09; Mon, 31)

2016-11-02 Thread Torsten Bögershausen

> 
> * tb/convert-stream-check (2016-10-27) 2 commits
>  - convert.c: stream and fast search for binary
>  - read-cache: factor out get_sha1_from_index() helper
> 
>  End-of-line conversion sometimes needs to see if the current blob
>  in the index has NULs and CRs to base its decision.  We used to
>  always get a full statistics over the blob, but in many cases we
>  can return early when we have seen "enough" (e.g. if we see a
>  single NUL, the blob will be handled as binary).  The codepaths
>  have been optimized by using streaming interface.
> 
>  The tip seems to do too much in a single commit and may be better split.
>  cf. <20161012134724.28287-1-tbo...@web.de>
>  cf. 

Reviews have been done, thanks everybody.

It looks to be a good "proof of concept".

The current series is far away from being ready, so please kick it
out of pu and feel free to discard.

Re: [PATCH v10 13/14] convert: add filter..process option

2016-10-11 Thread Torsten Bögershausen

On Tue, Oct 11, 2016 at 10:11:22AM +0200, Lars Schneider wrote:
> 
> > On 10 Oct 2016, at 21:58, Junio C Hamano  wrote:
> > 
> > larsxschnei...@gmail.com writes:
> > 
> > [...]
> >> +# Count unique lines except clean invocations in two files and compare
> >> +# them. Clean invocations are not counted because their number can vary.
> >> +# c.f. 
> >> http://public-inbox.org/git/xmqqshv18i8i@gitster.mtv.corp.google.com/
> >> +test_cmp_count_except_clean () {
> >> +  for FILE in $@
> >> +  do
> >> +  sort $FILE | uniq -c | sed "s/^[ ]*//" |
> >> +  sed "s/^\([0-9]\) IN: clean/x IN: clean/" >$FILE.tmp
> >> +  cat $FILE.tmp >$FILE
> >> +  done &&
> >> +  test_cmp $@
> >> +}
> > 
> > Why do you even _care_ about the number of invocations?  While I
> > told you why "clean" could be called multiple times under racy Git
> > as an example, that was not meant to be an exhaustive example.  I
> > wouldn't be surprised if we needed to run smudge twice, for example,
> > in some weirdly racy cases in the future.
> > 
> > Can we just have the correctness (i.e. "we expect that the working
> > tree file gets this as the result of checking it out, and we made
> > sure that is the case") test without getting into such an
> > implementation detail?
> 
> My goal is to check that clean/smudge is invoked at least once. I could
> just run `uniq` to achieve that but then all other filter commands could
> happen multiple times and the test would not detect that.
> 
> I also prefer to check the filter commands to ensure the filter is 
> working as expected (e.g. no multiple start ups etc) in addition to 
> checking the working tree.
> 
> Would the patch below work for you? If yes, then please squash it into
> "convert: add filter..process option".
> 
> Thank you,
> Lars
> 
> 
> 
> diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
> index 9f892c0..714f706 100755
> --- a/t/t0021-conversion.sh
> +++ b/t/t0021-conversion.sh
> @@ -31,38 +31,33 @@ filter_git () {
>   rm -f git-stderr.log
>  }
>  
> -# Count unique lines in two files and compare them.
> -test_cmp_count () {
> - for FILE in $@
> - do
> - sort $FILE | uniq -c | sed "s/^[ ]*//" >$FILE.tmp
> - cat $FILE.tmp >$FILE
> - done &&
> - test_cmp $@
> -}
> -
> -# Count unique lines except clean invocations in two files and compare
> -# them. Clean invocations are not counted because their number can vary.
> +# Compare two files and ensure that `clean` and `smudge` respectively are
> +# called at least once if specified in the `expect` file. The actual
> +# invocation count is not relevant because their number can vary.
>  # c.f. 
> http://public-inbox.org/git/xmqqshv18i8i@gitster.mtv.corp.google.com/
> -test_cmp_count_except_clean () {
> - for FILE in $@

> +test_cmp_count () {
> + expect=$1 actual=$2

That could be 
expect="$1"
actual="$2"

> + for FILE in "$expect" "$actual"
>   do

> + sort "$FILE" | uniq -c | sed "s/^[ ]*//" |
> + sed "s/^\([0-9]\) IN: clean/x IN: clean/" |
> + sed "s/^\([0-9]\) IN: smudge/x IN: smudge/" 
> >"$FILE.tmp" &&
> + cat "$FILE.tmp" >"$FILE"

How about 
cp "$FILE.tmp" "$FILE"

Re: 2 directories same spelling one directory is camel cased

2016-10-13 Thread Torsten Bögershausen

On 12.10.16 18:05, David Brown wrote:
> Howdy git gurus,
> 
> I have the dubious distinction of working with a remote repo (master) that 
> has a class loader run-time error when cloned, built and executed.
> 
> The reason for the runtime issue is a directory hierarchical path has to 
> directories (folders) with the same name spelling but one of the directories 
> is camel-cased. The package names are the same.
> 
> The compiler doesn't care but the run-time class loader has an issue with the 
> 2 'same like named' classes.
> 
> How to remove the offending directories and files at the locally cloned repo 
> but not push 'deleted' directories and files back to origin/master the remote 
> repo?
> 
> Please advise.
> 
> Regards.
This email did not resolve from here: David Brown 

It is somewhat unclear, which issue the class loader has.
What does 
git ls-files | grep -i "sameNameBitDifferent"
say ?

What do you mean with 
"How to remove the offending directories" ?

Just run
rm -rf "offending directories" ?

Or, may be
git mv dir1 NewDir1

If you don't want to push, you don't have to, or what do I miss ?

Re: [RFC] Case insensitive Git attributes

2016-10-16 Thread Torsten Bögershausen




On 17/10/16 05:07, Stefan Beller wrote:

On Sun, Oct 16, 2016 at 6:04 PM, Lars Schneider
 wrote:

Hi,

Git attributes for path names are generally case sensitive. However, on
a case insensitive file system (e.g. macOS/Windows) they appear to be
case insensitive (`*.bar` would match `foo.bar` and `foo.BAR`).

This feels like a bug:

$ git diff
diff --git a/.gitattributes b/.gitattributes
index 5e98806..1419867 100644
--- a/.gitattributes
+++ b/.gitattributes
@@ -1,3 +1,4 @@
 * whitespace=!indent,trail,space
 *.[ch] whitespace=indent,trail,space
 *.sh whitespace=indent,trail,space
+*.C text

---
$ git -c core.ignorecase=false check-attr --all git.c
git.c: whitespace: indent,trail,space

#But running on a case insensitve FS I get:
$ git  check-attr --all git.c
git.c: text: set
git.c: whitespace: indent,trail,space



That
works great until a Git users joins the party with a case sensitive file
system. For this Git user only files that match the exact case of the
attribute pattern get the attributes (only `foo.bar`).

This inconsistent behavior can confuse Git users. An advanced Git user
could use a glob pattern (e.g. `*.[bB][aA][rR]) to match files in a
case insensitive way. However, this can get confusing quickly, too.

I wonder if we can do something about this. One idea could be to add an
attribute "case-sensitive" (or "caseSensitive") and set it to false
(if desired) for all files in .gitattributes for a given repo.

FYI: I am currently refactoring the attr subsystem (e.g.
https://public-inbox.org/git/20161012224109.23410-1-sbel...@google.com/
"attr: convert to new threadsafe API")


### .gitattributes example ###

* case-sensitive=false

How about
* ignorecase=true
 ?


Would this modify the current file only or the whole stack of attrs?
(In just one way or the whole stack, i.e. can you add this in .git/info/exclude
and the attribute file in the home dir also behaves differently? Or rather the
other way round when the system wide attr file enables case insensitivity,
each repository local config is set automatically? both ways?)


*.bar something

###

I haven't looked into the feasibility of an implementation, yet. However,
would that be an acceptable approach?

Conceptually I would prefer if we had a single switch that indicates a
case insensitive FS. That could be used for different purposes as well,
that are FS relevant such as checking in, checking out/renaming files
in the working tree? (does any such switch already exist for case
sensitivity?)

Thanks,
Stefan


Thanks,
Lars

Re: [PATCH v3 07/25] sequencer: completely revamp the "todo" script parsing

2016-10-15 Thread Torsten Bögershausen

On 15.10.16 19:19, Jeff King wrote:
> On Sat, Oct 15, 2016 at 07:03:46PM +0200, Torsten Bögershausen wrote:
> 
>> sequencer.c:633:14: warning: comparison of constant 2 with expression of 
>> type 'const enum todo_command' is always true 
>> [-Wtautological-constant-out-of-range-compare]
>> if (command < ARRAY_SIZE(todo_command_strings))
>> ~~~ ^ 
>> 1 warning generated.
>>
>> 53f8024e (Johannes Schindelin   2016-10-10 19:25:07 +0200  633) if 
>> (command < ARRAY_SIZE(todo_command_strings))
>>
> 
> Interesting. The compiler is right that this _should_ never happen, but
> I think the patch is quite reasonable to be defensive in case the enum
> happens to get a value outside of its acceptable range (which is
> probably undefined behavior, but...).
> 
> I wonder if:
> 
>   if ((int)command < ARRAY_SIZE(todo_command_strings))
> 
> silences the warning (I suppose size_t is probably an even better type,
> though obviously it does not matter in practice).
> 
> -Peff
> 
Both do (silence the warning)

enum may be signed or unsigned, right ?
So the size_t variant seams to be a better choice

Re: [PATCH v3 07/25] sequencer: completely revamp the "todo" script parsing

2016-10-15 Thread Torsten Bögershausen


Not sure is this has been reported before:


sequencer.c:633:14: warning: comparison of constant 2 with expression of type 
'const enum todo_command' is always true 
[-Wtautological-constant-out-of-range-compare]
if (command < ARRAY_SIZE(todo_command_strings))
~~~ ^ 
1 warning generated.


53f8024e (Johannes Schindelin   2016-10-10 19:25:07 +0200  633) if 
(command < ARRAY_SIZE(todo_command_strings))

Re: [PATCH] real_path: make real_path thread-safe

2016-12-07 Thread Torsten Bögershausen

On Wed, Dec 07, 2016 at 02:13:35PM -0800, Brandon Williams wrote:
> On 12/07, Junio C Hamano wrote:
> > Torsten Bögershausen <tbo...@web.de> writes:
> > 
> > > But in any case it seems that e.g.
> > > //SEFVER/SHARE/DIR1/DIR2/..
> > > must be converted into
> > > //SEFVER/SHARE/DIR1
> > >
> > > and 
> > > \\SEFVER\SHARE\DIR1\DIR2\..
> > > must be converted into
> > > \\SEFVER\SHARE\DIR1
> > 
> > Additional questions that may be interesting are:
> > 
> > //A/B/../C  is it //A/C?  is it an error?
Yes, at least under Windows.
If I have e.g. a Raspi with SAMBA, I can put a git Repository here: 

//raspi/torsten/projects/git
If I use
git push //raspi/torsten/../junio/projects/git
that should be an error.

> > //A/B/../../C/D is it //C/D?  is it an error?
> > 

Same for
git push /raspi/../raspi2/torsten//projects/git


> 
> 
> Also is //.. the same as //?  I would assume so since /.. is /
> 
Under Windows //.. is simply illegal, I would say.
The documentation here
https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247(v=vs.85).aspx

Mentions these 2 examples, what to feed into the WIN32 file API:

a)
\\?\D:\very-long-path

b)
\\server\share\path\file"

c)
"\\?\UNC\server\share\path" 

So whatever we do, the ".." resoltion is only allowed to look at 
"very-long-path" or "path".

Some conversion may be done in mingw.c:
https://github.com/github/git-msysgit/blob/master/compat/mingw.c
So what I understand, '/' in Git are already converted into '\' if needed ?

It seams that we may wnat a function get_start_of_path(uncpath),
which returns:

get_start_of_path_win("//?/D:/very-long-path") "/very-long-path" 
get_start_of_path_win("//server/share/path/file")  "/path/file"
get_start_of_path_win("//?/UNC/server/share/path") "/path"
(I don't know if we need the variant with '\', but is would'n hurt):
get_start_of_path_win("?\\D:\\very-long-path") "\\very-long-path" 
get_start_of_path_win("server\\share\\path\\file")  "\\path\\file"
get_start_of_path_win("?\\UNC\\server\\share\\path") "\\path"

Then the non-windows version could simply return
get_start_of_path_non_win(something) something

Does this make sense ?

Re: [PATCH] real_path: make real_path thread-safe

2016-12-07 Thread Torsten Bögershausen

On Wed, Dec 07, 2016 at 01:12:25AM +, Ramsay Jones wrote:
> 
> 
> On 07/12/16 00:10, Brandon Williams wrote:
> > On 12/06, Junio C Hamano wrote:
> >> POSIX cares about treating "//" at the very beginning of the path
> >> specially.  Is that supposed to be handled here, or by a lot higher
> >> level up in the callchain?
> > 
> > What exactly does "//" mean in this context? (I'm just naive in this
> > area)
> 
> This refers to a UNC path (ie Universal Naming Convention) which
> is used to refer to servers, printers and other 'network resources'.
> Although this started (many moons ago) in unix, it isn't used too
> much outside of windows networks! (where it is usually denoted by
> \\servername\path).
> 
> You can see the relics of unix UNC paths if you look at the wording
> for basename() in the POSIX standard. Note the special treatment of
> the path which 'is exactly "//"', see 
> http://pubs.opengroup.org/onlinepubs/009695399/functions/basename.html
> 
> ATB,
> Ramsay Jones

Please allow one more comment about UNC:
They are used under Windows, and typically wotk together with Git.
One breakage between 2.10 and 2.11 has been observed, saying that
pushing to \\SERVER\SHARE\DIRECTORY does not work any more.

It has been reported under
 "git 2.11.0 error when pushing to remote located on a windows share"
both here and 
here:
https://github.com/git-for-windows/git/issues/979#issuecomment-264816175

I don't have a Windows box at the moment, and I don't know if the
breakage was introduced by changes in real_patyh().

But in any case it seems that e.g.
//SEFVER/SHARE/DIR1/DIR2/..
must be converted into
//SEFVER/SHARE/DIR1

and 
\\SEFVER\SHARE\DIR1\DIR2\..
must be converted into
\\SEFVER\SHARE\DIR1

Re: [PATCH 1/3] update_unicode.sh: update the uniset repo if it exists

2016-12-11 Thread Torsten Bögershausen

On 2016-12-12 00:34, Beat Bolli wrote:
> We need to track the new commits in uniset, otherwise their and our code
> get out of sync.
> 
> Signed-off-by: Beat Bolli 
> ---
> 
> Junio, these go on top of my bb/unicode-9.0 branch, please.
> 
> Thanks!
> 
>  update_unicode.sh | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/update_unicode.sh b/update_unicode.sh
> index 4c1ec8d..9ca7d8b 100755
> --- a/update_unicode.sh
> +++ b/update_unicode.sh
> @@ -14,6 +14,11 @@ fi &&
>   http://www.unicode.org/Public/UCD/latest/ucd/EastAsianWidth.txt 
> &&
>   if ! test -d uniset; then
>   git clone https://github.com/depp/uniset.git
> + else
> + (
> + cd uniset &&
> + git pull
If upstream has accepted your patches, that's nice.

Minor question, especially to the next commit:
Should we make sure to checkout the exact version, which has been tested?
In this case  cb97792880625e24a9f581412d03659091a0e54f

And this is for both a fresh clone and the git pull
needs to be replaced by
git fetch && git checkout cb97792880625e24a9f581412d03659091a0e54f


(Which of course is a shell variable

Re: [PATCH 1/3] update_unicode.sh: update the uniset repo if it exists

2016-12-12 Thread Torsten Bögershausen




Sure, and I'd rather see the update-unicode.sh script moved
somewhere in contrib/ while at it.  Those who are interested in
keeping up with the unicode standard are tiny minority of the
developer population, and most of us would treat the built width
table as the source (after all, that is what we ship).

To be bluntly honest, I'd rather not to see "update-unicode.sh"
download and build uniset at all.  It's as if po/ hierarchy shipping
with its own script to download and build msgmerge--that's madness.
Needless to say, shipping the sources for uniset embedded in our
project tree (either as a snapshot-fork or as a submodule) is even
worse.  Those who want to muck with po/ are expected to have
msgmerge and friends.  Why not expect the same for those who want to
update the unicode width table?

I'd rather see a written instruction telling which snapshot to get
and from where to build and place on their $PATH in the README file,
sitting next to the update-unicode.sh script in contrib/uniwidth/
directory, for those who are interested in building the width table
"from the source", and the update-unicode.sh script to assume that
uniset is available.

OK with the contrib - that's an improvement.
About the instructions how to download and compile:
(we don't need to change the  $PATH, do we ?)
I don't know.
The typical instructions I have seen are a sequence of shell commands
to be executed, which hopefully simply work by doing "copy-and-paste".
I find this error-prone, as you you may loose the last character while
moving the mouse, or don't check the error message or return codes.
Having a pre-baked shell script, which does use "&&" is in that way more 
attractive,

and the README can be as simple as run "update-unicode.sh" and that's it.

uniset is a small project and where should we put it ?
a) inside the Git tree?
b) /tmp ?
c) into the $HOME directory ?
d) /usr/local

a) is quick and dirty
b) probably OK
c) Not sure about tha
d) Needs super user rights
Can we try to find a good place ?

"contrib/uniwidth/" may be different to find, how about contrib/update-unicode ?
 


OK. So please don't merge bb/unicode-9.0 to next yet; I'll prepare a
reroll following your description.

Torsten, is this alright with you?


sure


Cheers, Beat

Re: [PATCH v2] fix pushing to //server/share/dir on Windows

2016-12-14 Thread Torsten Bögershausen




On 14/12/16 20:37, Johannes Sixt wrote:

normalize_path_copy() is not prepared to keep the double-slash of a
//server/share/dir kind of path, but treats it like a regular POSIX
style path and transforms it to /server/share/dir.

The bug manifests when 'git push //server/share/dir master' is run,
because tmp_objdir_add_as_alternate() uses the path in normalized
form when it registers the quarantine object database via
link_alt_odb_entries(). Needless to say that the directory cannot be
accessed using the wrongly normalized path.

Fix it by skipping all of the root part, not just a potential drive
prefix. offset_1st_component takes care of this, see the
implementation in compat/mingw.c::mingw_offset_1st_component().

Signed-off-by: Johannes Sixt 
---
Am 14.12.2016 um 18:30 schrieb Jeff King:

Would it be reasonable to
write:

   /* Copy initial part of absolute path, converting separators on Windows */
   const char *end = src + offset_1st_component(src);
   while (src < end) {
  char c = *src++;
  if (c == '\\')
  c = '/';
  *dst++ = c;
   }

Makes a lot of sense! I haven't had an opportunity, though, to test
on Windows.

I'm not sure, if a conversion should be done here, in this part of code.
To my knowledge,

C:\dir1\file
is the same
as
C:/dir1/file
and that is handled by windows.

The \\server\share\dir1\file is native to windows,
and I can't see good reasons to change '\' into '/' somewhere in Git,
when UNC is used.

Cygwin does a translation from
//server/share/dir1/file
into
\\server\share\dir1\file

In other words:
The patch looks good as is, and once I get a Windows machine,
may be able to do some testing and come up with test cases



[]

Re: [PATCH 1/3] update_unicode.sh: update the uniset repo if it exists

2016-12-12 Thread Torsten Bögershausen


>> Minor question, especially to the next commit:
>> Should we make sure to checkout the exact version, which has been tested?
>> In this case  cb97792880625e24a9f581412d03659091a0e54f
>>
>> And this is for both a fresh clone and the git pull
>> needs to be replaced by
>> git fetch && git checkout cb97792880625e24a9f581412d03659091a0e54f
>>
>>
>> (Which of course is a shell variable)
> 
> I was actually wondering what the policy was for adding submodules to the Git 
> repo,
> but then decided against it. Another option would be to fork uniset on GitHub 
> and
> just let it stay on a working commit.
> 
> Junio, what's your stance on this?
> 
> Beat

If I run  ./update_unicode.sh on the latest master of   
https://github.com/depp/uniset.git ,
commit  a5fac4a091857dd5429cc2d, I get a diff in  unicode_width.h like this:

-{ 0x0300, 0x036F },

+{ 768, 879 },

IOW, all hex values are printed as decimal values.
Not a problem for the compiler, but for the human
to check the unicode tables.

So I think we should "pin" the version of uniset.

Re: [PATCH v2 1/1] convert: git cherry-pick -Xrenormalize did not work

2016-12-02 Thread Torsten Bögershausen




Yup, this is what I queued.


Looks good, thanks you all.

Re: [PATCH v4 1/3] update-unicode.sh: automatically download newer definition files

2016-12-03 Thread Torsten Bögershausen

On Sat, Dec 03, 2016 at 10:00:47PM +0100, Beat Bolli wrote:
> Checking just for the unicode data files' existence is not sufficient;
> we should also download them if a newer version exists on the Unicode
> consortium's servers. Option -N of wget does this nicely for us.
> 
> Reviewed-by: Torsten Boegershausen 

Minor remark (Not sure if this motivates v5, may be Junio can fix it locally?)
s/oe/ö/

Beside this: Thanks again (and I learned about the -N option of wget)

Re: git 2.11.0 error when pushing to remote located on a windows share

2016-12-04 Thread Torsten Bögershausen

On Fri, Dec 02, 2016 at 05:37:50PM -0500, Jeff King wrote:
> On Fri, Dec 02, 2016 at 06:02:16PM +, thomas.attw...@stfc.ac.uk wrote:
> 
> > After updating git from 2.10.0 to 2.11.0 when trying to push any
> > changes to a repo located in a windows share, the following error
> > occurs:
> > 
> > $ git push origin test
> > Counting objects: 2, done.
> > Delta compression using up to 8 threads.
> > Compressing objects: 100% (2/2), done.
> > Writing objects: 100% (2/2), 284 bytes | 0 bytes/s, done.
> > Total 2 (delta 1), reused 1 (delta 0)
> > remote: error: object directory /path/to/dir/objects does not exist; check 
> > .git/objects/info/alternates.
> > remote: fatal: unresolved deltas left after unpacking
> > error: unpack failed: unpack-objects abnormal exit
> > To //path/to/dir
> >  ! [remote rejected] test -> test (unpacker error)
> > error: failed to push some refs to '//path/to/dir'
> 
> Hmm. This is probably related to the quarantine-push change in v2.11;
> the receiving end will write the objects into a temporary directory but
> point to the original via GIT_ALTERNATE_OBJECT_DIRECTORIES. That pointer
> isn't working for some reason, so the receiver can't resolve the deltas
> it needs.
> 
> As you noted, the extra "/" is missing in the error message, and that
> sounds like a plausible cause for what you're seeing. I'm not sure where
> the slash is getting dropped, though. The value in the environment comes
> from calling absolute_path(get_object_directory()), so I suspect the
> real problem is not in the quarantine code, but it's just triggering a
> latent bug elsewhere (either in absolute_path(), or in the code which
> generates the objdir path).
> 
> > No error occurs if pushing to the same repo (a direct copy into a local 
> > directory) using 2.11.0.
> > 
> > $ git push local_test test
> > Counting objects: 2, done.
> > Delta compression using up to 8 threads.
> > Compressing objects: 100% (2/2), done.
> > Writing objects: 100% (2/2), 284 bytes | 0 bytes/s, done.
> > Total 2 (delta 1), reused 1 (delta 0)
> > To C:/path/to/dir
> >  * [new branch]  test -> test
> 
> The fact that it works using the non-UNC path reinforces my feeling that
> something is normalizing the absolute path incorrectly.
> 
> > Using `git fsck --full` in both 2.11.0 and 2.10.0, it doesn't reveal any 
> > additional problems.
> 
> Yeah, I don't think there is anything wrong with your repo. It's just a
> path-building issue internal to the receiving process.
> 

There seems to be another issue, which may or may not being related:
https://github.com/git-for-windows/git/issues/979

This is pure speculation:
Could it be that a '/' is lost because of a change in the underlying
Msys2 between 2.10 and 2.11 ?

Dscho, (or anybody else) any ideas?

Re: [PATCH v1 1/1] convert: git cherry-pick -Xrenormalize did not work

2016-11-29 Thread Torsten Bögershausen

On Tue, Nov 29, 2016 at 10:42:18AM -0800, Junio C Hamano wrote:
> tbo...@web.de writes:
> 
> > From: Torsten Bögershausen <tbo...@web.de>
> >
> > Working with a repo that used to be all CRLF. At some point it
> > was changed to all LF, with `text=auto` in .gitattributes.
> > Trying to cherry-pick a commit from before the switchover fails:
> >
> > $ git cherry-pick -Xrenormalize 
> > fatal: CRLF would be replaced by LF in [path]
> 
> OK.  That's a very clear description of the symptom that can be
> observed from the surface.
> 
> > Whenever crlf_action is CRLF_TEXT_XXX and not CRLF_AUTO_XXX,
> > SAFE_CRLF_RENORMALIZE must be turned into CRLF_SAFE_FALSE.
> 
> Aside from needing s/CRLF_SAFE/SAFE_CRLF/, this however lacks
> "Otherwise, because of X and Y, Z ends up doing W" to explain
> the "must be" part.  Care to explain it a bit more?

Thanks for the review - how about this:

convert: git cherry-pick -Xrenormalize did not work

Working with a repo that used to be all CRLF. At some point it
was changed to all LF, with `text=auto` in .gitattributes.
Trying to cherry-pick a commit from before the switchover fails:

$ git cherry-pick -Xrenormalize 
fatal: CRLF would be replaced by LF in [path]

Commit 65237284 "unify the "auto" handling of CRLF" introduced
a regression:

Whenever crlf_action is CRLF_TEXT_XXX and not CRLF_AUTO_XXX,
SAFE_CRLF_RENORMALIZE was feed into check_safe_crlf().
This is wrong because here everything else than SAFE_CRLF_WARN is
treated as SAFE_CRLF_FAIL.

Solution: Turn SAFE_CRLF_RENORMALIZE into SAFE_CRLF_FALSE before
calling check_safe_crlf().

Reported-by: Eevee (Lexy Munroe) <ee...@veekun.com>
Signed-off-by: Torsten Bögershausen <tbo...@web.de>

Re: git 2.11.0 error when pushing to remote located on a windows share

2016-12-05 Thread Torsten Bögershausen

On 2016-12-05 12:05, thomas.attw...@stfc.ac.uk wrote:
> On Sun, 4 Dec 2016 08:09:14 +, Torsten BÃ¶gershausen wrote:
>> There seems to be another issue, which may or may not being related:
>> https://github.com/git-for-windows/git/issues/979
> 
> I think this is the same issue. I've posted my trace command output there as
> It might be more appropriate:
> https://github.com/git-for-windows/git/issues/979#issuecomment-264816175
> 
Thanks for the trace.
I think that the problem comes from the "cwd", when a UNC name is used.

cd //SERVER/share/somedir
does not work under Windows, the is no chance to change into that directory.
Does anybody know out of his head why and since when we change the directory
like this ?
Or "git bisect" may help.

Re: [PATCH v4 0/5] road to reentrant real_path

2017-01-03 Thread Torsten Bögershausen

On 04.01.17 01:48, Jeff King wrote:
> On Tue, Jan 03, 2017 at 11:09:18AM -0800, Brandon Williams wrote:
> 
>> Only change with v4 is in [1/5] renaming the #define MAXSYMLINKS back to
>> MAXDEPTH due to a naming conflict brought up by Lars Schneider.
> 
> Hmm. Isn't MAXSYMLINKS basically what you want here, though? It what's
> what all other similar functions will be using.
> 
> The only problem was that we were redefining the macro. So maybe:
> 
>   #ifndef MAXSYMLINKS
>   #define MAXSYMLINKS 5
>   #endif
> 
> would be a good solution?
Why 5  ? (looking at the  20..30 below)
And why 5 on one system and e.g. on my Mac OS
#define MAXSYMLINKS 32  

Would the same value value for all Git installations on all platforms make 
sense?
#define GITMAXSYMLINKS 20

> 
> It looks like the "usual" value is more like 20 or 30 on most systems,
> though.  We should probably also set errno to ELOOP when we hit the
> limit, which is what other symlink-resolving functions would do.
> 
> I'm surprised we didn't hit this on Linux, which has MAXSYMLINKS, too.
> We should be picking it up from .
> 
> -Peff
>

Re: Git filesystem case-insensitive to case-sensitive system broken

2017-01-07 Thread Torsten Bögershausen

On Fri, Jan 06, 2017 at 01:56:36PM -0800, Steven Robertson wrote:
> Hello,
> 
> I was doing development on a linux box on AWS, when we found a code
> bug that had me switching to running the code on a Mac instead. We
> discovered that we had accidentally named two files the same when
> looked at case-insensitively, which made git commands afterwards
> display the wrong thing. It looked like this (ignoring some things
> that aren't relevant):
> 
> $ git status
> 
> 
>modified:   tests/test_system/show_19_L.txt
> 
> 
> no changes added to commit (use "git add" and/or "git commit -a")
> 
> $ git checkout tests/test_system/show_19_L.txt
> 
> $ git status
> 
> 
>modified:   tests/test_system/show_19_l.txt
> 
> 
> no changes added to commit (use "git add" and/or "git commit -a")
> 
> $ git checkout tests/test_system/show_19_l.txt
> 
> $ git status
> 
> 
>modified:   tests/test_system/show_19_L.txt
> 
> 
> no changes added to commit (use "git add" and/or "git commit -a")
> 
> $ diff tests/test_system/show_19_L.txt tests/test_system/show_19_l.txt
> 
> $
> 
> 
> Those two files are different in our repo, and as such git thinks that
> we modified one of them when we try and pull it down from github
> again.
> 
> 
> Thanks for looking at this!
> -- Steven

I assume that you are on Mac OS ?
This is what I would have done:

- find the twin of your file:
$  git ls-files | grep -i tests/test_system/show_19_L.txt

- Let's assume it is the little brother:
  "tests/test_system/show_19_l.txt"
$  git mv tests/test_system/show_19_l.txt tests/test_system/show_19_l2.txt

- Check out the original:
$ git checkout tests/test_system/show_19_L.txt

- check:
$ git status

Re: Test failures when Git is built with libpcre and grep is built without it

2017-01-01 Thread Torsten Bögershausen

On 01.01.17 05:59, A. Wilcox wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
> 
> Hello!
> 
> I'm attempting to package Git for our new Linux distribution and I
> have run in to a failure on our PowerPC builder while running the test
> suite.
> 
> The PowerPC builder runs a tiny version of grep(1) that was not built
> with PCRE.  As such, grep -P returns 2 and prints:
> 
> grep: support for the -P option is not compiled into this
> - --disable-perl-regexp binary
> 
> However, our Git build *does* link against libpcre.  This causes a
> tests numbered 142 and 143 to fail in t7810-grep.sh.
> 
> I am not sure the best way to handle this but I felt it would be
> prudent to inform you of this issue.  I will be happy to provide any
> other information you may require.

The first thing you can do is to run the test with debug and verbose:

debug=t verbose=t ./t7810-grep.sh  
and post the output of these 2 test cases here:
(mine looks like this)

expecting success: 
echo "ab:a+bc" >expected &&
git \
-c grep.patterntype=extended \
-c grep.patterntype=fixed \
-c grep.patterntype=basic \
grep "a+b*c" ab >actual &&
test_cmp expected actual

ok 142 - grep pattern with grep.patternType=extended, =fixed, =basic

expecting success: 
echo "ab:abc" >expected &&
git grep -F -G -E "a+b*c" ab >actual &&
test_cmp expected actual

ok 143 - grep -F -G -E pattern

< 5 6 7 8 9 10 11 12 13 >

901 - 1000 of 1221 matches

Mail list logo