Re: [PATCH v2 4/5] convert: generate large test files only once

2016-07-27 Thread Lars Schneider

> On 27 Jul 2016, at 15:32, Jeff King  wrote:
> 
> On Wed, Jul 27, 2016 at 04:35:32AM +0200, Torsten Bögershausen wrote:
> 
>>> +   mkdir -p generated-test-data &&
>>> +   for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
>>> +   do
>>> +   # Generate 1MB of empty data and 100 bytes of random characters
>>> +   printf "%1048576d" 1
>>> +   printf "$(LC_ALL=C tr -dc "A-Za-z0-9" </dev/urandom | dd bs=$((RANDOM>>8)) count=1 2>/dev/null)"
>> I'm not sure how portable /dev/urandom is.
>> The other thing is that "really random" numbers are overkill, and
>> it may be easier to use pre-defined numbers.
> 
> Right, there are a few reasons not to use /dev/urandom:
> 
>  - it's not portable
> 
>  - if we have to generate a lot of numbers, it drains the system's
>entropy pool, which is an unfriendly thing to do (and may also be
>slow)
> 
>  - it makes our tests random! This sounds like a good thing, but it
>means that if some input happens to cause failure, you are unlikely
>to be able to reproduce it.
> 
> Instead, use test-genrandom, which is an LCG that starts at a seed. So
> you get a large amount of random-ish data quickly and portably, and you get
> the same data each time.

Thank you! That's exactly what I need here :-)

- Lars


Re: [PATCH v2 4/5] convert: generate large test files only once

2016-07-27 Thread Jeff King
On Wed, Jul 27, 2016 at 04:35:32AM +0200, Torsten Bögershausen wrote:

> > +   mkdir -p generated-test-data &&
> > +   for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
> > +   do
> > +   # Generate 1MB of empty data and 100 bytes of random characters
> > +   printf "%1048576d" 1
> > +   printf "$(LC_ALL=C tr -dc "A-Za-z0-9" </dev/urandom | dd bs=$((RANDOM>>8)) count=1 2>/dev/null)"
> I'm not sure how portable /dev/urandom is.
> The other thing is that "really random" numbers are overkill, and
> it may be easier to use pre-defined numbers.

Right, there are a few reasons not to use /dev/urandom:

  - it's not portable

  - if we have to generate a lot of numbers, it drains the system's
entropy pool, which is an unfriendly thing to do (and may also be
slow)

  - it makes our tests random! This sounds like a good thing, but it
means that if some input happens to cause failure, you are unlikely
to be able to reproduce it.

Instead, use test-genrandom, which is an LCG that starts at a seed. So
you get a large amount of random-ish data quickly and portably, and you get
the same data each time.
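
Something like this might work (an untested sketch; the seed string and
the 100-byte count are placeholders, not from the patch):

	mkdir -p generated-test-data &&
	for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
	do
		# 1MB of padding, then 100 pseudo-random bytes; the same
		# seed string produces the same bytes on every run
		printf "%1048576d" 1
		test-genrandom "t0021-$i" 100
	done >generated-test-data/large.file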

-Peff


Re: [PATCH v2 4/5] convert: generate large test files only once

2016-07-26 Thread Torsten Bögershausen



On 07/27/2016 02:06 AM, larsxschnei...@gmail.com wrote:

From: Lars Schneider 

Generate a more interesting large test file with random characters in
between and reuse this test file in multiple tests. Run tests formerly
marked as EXPENSIVE on every run, but with a smaller test file.

Signed-off-by: Lars Schneider 
---
 t/t0021-conversion.sh | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 7b45136..b9911a4 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -4,6 +4,13 @@ test_description='blob conversion via gitattributes'

 . ./test-lib.sh

+if test_have_prereq EXPENSIVE
+then
+   T0021_LARGE_FILE_SIZE=2048
+else
+   T0021_LARGE_FILE_SIZE=30
+fi
+
 cat <<EOF >rot13.sh
 #!/bin/sh
 tr \
@@ -31,7 +38,15 @@ test_expect_success setup '
 cat test >test.i &&
git add test test.t test.i &&
rm -f test test.t test.i &&
-   git checkout -- test test.t test.i
+   git checkout -- test test.t test.i &&
+
+   mkdir -p generated-test-data &&
+   for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
+   do
+   # Generate 1MB of empty data and 100 bytes of random characters
+   printf "%1048576d" 1
+   printf "$(LC_ALL=C tr -dc "A-Za-z0-9" >8)) count=1 2>/dev/null)"

I'm not sure how portable /dev/urandom is.
The other thing is that "really random" numbers are overkill, and
it may be easier to use pre-defined numbers.
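
Just a sketch of what I mean (untested): the loop counter itself could
serve as the "random" part, e.g.

	printf "%1048576d" $i

which keeps each 1MB chunk distinct but fully reproducible.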

The rest of 1..4 looks good; I will look at 5/5 later.


+   done >generated-test-data/large.file
 '

 script='s/^\$Id: \([0-9a-f]*\) \$/\1/p'
@@ -199,9 +214,9 @@ test_expect_success 'required filter clean failure' '
 test_expect_success 'filtering large input to small output should use little memory' '
test_config filter.devnull.clean "cat >/dev/null" &&
test_config filter.devnull.required true &&
-   for i in $(test_seq 1 30); do printf "%1048576d" 1; done >30MB &&
-   echo "30MB filter=devnull" >.gitattributes &&
-   GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB
+   cp generated-test-data/large.file large.file &&
+   echo "large.file filter=devnull" >.gitattributes &&
+   GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add large.file
 '

 test_expect_success 'filter that does not read is fine' '
@@ -214,15 +229,15 @@ test_expect_success 'filter that does not read is fine' '
test_cmp expect actual
 '

-test_expect_success EXPENSIVE 'filter large file' '
+test_expect_success 'filter large file' '
test_config filter.largefile.smudge cat &&
test_config filter.largefile.clean cat &&
-   for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB &&
-   echo "2GB filter=largefile" >.gitattributes &&
-   git add 2GB 2>err &&
+   echo "large.file filter=largefile" >.gitattributes &&
+   cp generated-test-data/large.file large.file &&
+   git add large.file 2>err &&
test_must_be_empty err &&
-   rm -f 2GB &&
-   git checkout -- 2GB 2>err &&
+   rm -f large.file &&
+   git checkout -- large.file 2>err &&
test_must_be_empty err
 '



[PATCH v2 4/5] convert: generate large test files only once

2016-07-26 Thread larsxschneider
From: Lars Schneider 

Generate a more interesting large test file with random characters in
between and reuse this test file in multiple tests. Run tests formerly
marked as EXPENSIVE on every run, but with a smaller test file.

Signed-off-by: Lars Schneider 
---
 t/t0021-conversion.sh | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/t/t0021-conversion.sh b/t/t0021-conversion.sh
index 7b45136..b9911a4 100755
--- a/t/t0021-conversion.sh
+++ b/t/t0021-conversion.sh
@@ -4,6 +4,13 @@ test_description='blob conversion via gitattributes'
 
 . ./test-lib.sh
 
+if test_have_prereq EXPENSIVE
+then
+   T0021_LARGE_FILE_SIZE=2048
+else
+   T0021_LARGE_FILE_SIZE=30
+fi
+
 cat <<EOF >rot13.sh
 #!/bin/sh
 tr \
@@ -31,7 +38,15 @@ test_expect_success setup '
 cat test >test.i &&
git add test test.t test.i &&
rm -f test test.t test.i &&
-   git checkout -- test test.t test.i
+   git checkout -- test test.t test.i &&
+
+   mkdir -p generated-test-data &&
+   for i in $(test_seq 1 $T0021_LARGE_FILE_SIZE)
+   do
+   # Generate 1MB of empty data and 100 bytes of random characters
+   printf "%1048576d" 1
+   printf "$(LC_ALL=C tr -dc "A-Za-z0-9" >8)) count=1 2>/dev/null)"
+   done >generated-test-data/large.file
 '
 
 script='s/^\$Id: \([0-9a-f]*\) \$/\1/p'
@@ -199,9 +214,9 @@ test_expect_success 'required filter clean failure' '
 test_expect_success 'filtering large input to small output should use little memory' '
test_config filter.devnull.clean "cat >/dev/null" &&
test_config filter.devnull.required true &&
-   for i in $(test_seq 1 30); do printf "%1048576d" 1; done >30MB &&
-   echo "30MB filter=devnull" >.gitattributes &&
-   GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add 30MB
+   cp generated-test-data/large.file large.file &&
+   echo "large.file filter=devnull" >.gitattributes &&
+   GIT_MMAP_LIMIT=1m GIT_ALLOC_LIMIT=1m git add large.file
 '
 
 test_expect_success 'filter that does not read is fine' '
@@ -214,15 +229,15 @@ test_expect_success 'filter that does not read is fine' '
test_cmp expect actual
 '
 
-test_expect_success EXPENSIVE 'filter large file' '
+test_expect_success 'filter large file' '
test_config filter.largefile.smudge cat &&
test_config filter.largefile.clean cat &&
-   for i in $(test_seq 1 2048); do printf "%1048576d" 1; done >2GB &&
-   echo "2GB filter=largefile" >.gitattributes &&
-   git add 2GB 2>err &&
+   echo "large.file filter=largefile" >.gitattributes &&
+   cp generated-test-data/large.file large.file &&
+   git add large.file 2>err &&
test_must_be_empty err &&
-   rm -f 2GB &&
-   git checkout -- 2GB 2>err &&
+   rm -f large.file &&
+   git checkout -- large.file 2>err &&
test_must_be_empty err
 '
 
-- 
2.9.0
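
Note for anyone running this: the EXPENSIVE prereq is normally enabled by
setting GIT_TEST_LONG when invoking the test, e.g.

	cd t && GIT_TEST_LONG=1 ./t0021-conversion.sh

without it, the loop above generates the smaller ~30MB file.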
