RE: Test iterations

2014-08-09 Thread Uwe Schindler
I opened https://issues.apache.org/jira/browse/LUCENE-5881

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de



RE: Test iterations

2014-08-09 Thread Uwe Schindler
Slightly improved patch:
Forbidding tests.iters is not needed. It still makes sense to beast 20 rounds 
with each test repeated 20 times (with the same static class seed), too -> 400 
reps. The loop is also more Groovy-like.
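
The 20 x 20 arithmetic can be sketched as follows. This is only an illustrative model in Python, not the patch itself: plan_runs and its parameters are hypothetical names, and the random number stands in for whatever class-level seed a fresh run picks.

```python
import random

def plan_runs(beast_rounds, tests_iters, rng):
    """Return one (round, iteration, class_seed) tuple per test execution."""
    runs = []
    for rnd in range(beast_rounds):
        # Each beast round is a fresh run, so it draws a new class-level seed.
        class_seed = rng.getrandbits(64)
        for it in range(tests_iters):
            # tests.iters repeats inside the same run and shares that seed.
            runs.append((rnd, it, class_seed))
    return runs

runs = plan_runs(20, 20, random.Random(0))
print(len(runs))  # 400 executions in total
```

In this model only the outer rounds contribute new class-level seeds; the inner repetitions add executions but reuse the seed of their round.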

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de



RE: Test iterations

2014-08-09 Thread Uwe Schindler
Hi,

attached you will find the beaster:

- Only modifies common-build.xml, so it is not inherited down (that would make 
no sense anyway, as you would never run "ant beast-test" from the top level). 
So you have to go to the correct submodule and run 
"ant beast-test -Dbeast.iters=n -Dtestcase=..." from there.
- Uses "antcall" in a loop, invoking the internal dependency-less "-test" 
target. My first implementation used the test-macro directly, but this did not 
work, because test-macro sets non-local properties, which are then still 
available in the second round, causing errors or always using the same seed. 
Antcall creates a new project each time and runs the tests.
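
For comparison, the same repetition can be emulated from outside Ant. The sketch below is not the attached patch: it is a hypothetical Python wrapper (beast_command and beast_ant are made-up names) that simply reruns the public "test" target. Each invocation is a fresh Ant process, so no properties leak between rounds, but unlike the antcall on "-test" it re-executes the dependencies every time.

```python
import subprocess

def beast_command(testcase, ant="ant", target="test"):
    """Build the argv for one round; -Dtestcase is the switch used in this thread."""
    return [ant, target, "-Dtestcase=" + testcase]

def beast_ant(testcase, iters, run=subprocess.call):
    """Run the test target `iters` times and return the number of failed rounds."""
    failures = 0
    for _ in range(iters):
        # subprocess.call returns the process exit code; non-zero means failure.
        if run(beast_command(testcase)) != 0:
            failures += 1
    return failures
```

Calling beast_ant("TestFoo", 20) from inside a submodule directory would run 20 rounds; TestFoo is a placeholder test name. The run parameter exists only so the loop can be exercised without Ant installed.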

I can open an issue or just commit this :-)

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de



Re: Test iterations

2014-08-08 Thread Dawid Weiss
I know we could fork separate class loaders, Uwe. But I had exactly
the same kind of concerns you already so accurately pinpointed; if no
real gain is to be had I typically vote for simplicity.

Mike -- Ant's overhead is indeed a problem. Uwe's solution with
antcontrib (which I already mentioned a while ago I believe) will make
it more palatable, but it's still a half-way thing because if we had
"real" seed reiteration in the runner then it could also run multiple
concurrent copies of the same class (with different master seeds) in
the forked JVMs. This would nicely play with what's already in the
code.

I will get down to it and look at the possibilities and problems
again. Thanks for bringing it up, Ryan. I admit it's been on my queue
for a long time but I was hesitant to open this particular can of
worms...

Dawid



RE: Test iterations

2014-08-08 Thread Uwe Schindler
Hi,

I will look into that as a Groovy script. The main problem is that you cannot 
simply use <antcall> in a loop, because this would also execute the 
dependencies on each run.

My idea is to do the following:
- maybe subclass antcall Task with Groovy (not sure if this is needed)
- instantiate it with current project
- execute dependent targets
- execute the inner target multiple times: store the project properties first 
and restore them after execution. This is done because Ant properties can only 
be set *once*. If you don't give a fixed test seed, each run would pick a new 
one (because the project properties are reset, so the seed from the previous 
execution is gone).
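
Why the store/restore step matters can be shown with a toy model. The Python below does not use any Ant API: ToyProject is a hypothetical stand-in whose only behavior is that a property keeps its first value, and "tests.seed" is an assumed property name for the test seed.

```python
import random

class ToyProject:
    """Toy stand-in for an Ant project: a property keeps its first value."""
    def __init__(self, properties=None):
        self.properties = dict(properties or {})

    def set_property(self, name, value):
        # Properties are write-once: a second assignment is silently ignored.
        self.properties.setdefault(name, value)

def run_round(project, rng):
    # The test run proposes a fresh seed; it only sticks if none is set yet.
    project.set_property("tests.seed", "%016X" % rng.getrandbits(64))
    return project.properties["tests.seed"]

def beast_shared(iters, rng):
    """Reuse one project for all rounds: the first seed leaks into later rounds."""
    project = ToyProject()
    return [run_round(project, rng) for _ in range(iters)]

def beast_restored(iters, rng, fixed=None):
    """Snapshot the properties before the loop and restore them after each round."""
    project = ToyProject(fixed)
    saved = dict(project.properties)
    seeds = []
    for _ in range(iters):
        seeds.append(run_round(project, rng))
        project.properties = dict(saved)  # back to the pre-round state
    return seeds
```

In this model beast_shared returns the same seed for every round, while beast_restored picks a new one each time unless a fixed seed is passed in up front, which then survives every restore.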

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de



Re: Test iterations

2014-08-08 Thread Ryan Ernst
Thanks for the extremely thorough answer, Dawid!  Entertaining as always. :)

> Should we provide this "beaster" in common-build?

I would use it! It sounds like there is a lot of work involved in
making tests.iters work better with LuceneTestCase.  In the mean time,
this sounds like a quick solution that might not be as efficient
(multiple JVMs), but still better than having to come up with a bash
script?


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



Re: Test iterations

2014-08-08 Thread Michael McCandless
+1, this sounds awesome?

Mike McCandless

http://blog.mikemccandless.com





RE: Test iterations

2014-08-08 Thread Uwe Schindler
Hi,

We could emulate the same thing (the repeating beaster) with pure Ant:

Just repeat the "test" target, which can be done using ant-contrib's "for" task 
or (much simpler) a Groovy script using antcall on the test target.
Should we provide this "beaster" in common-build?

"ant beast-tests -Dbeast.iter=100 -Dtestcase=..."

Very easy to implement and makes it easier to use for the python haters - and 
comes embedded...

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





Re: Test iterations

2014-08-08 Thread Michael McCandless
On Fri, Aug 8, 2014 at 9:35 AM, Uwe Schindler  wrote:
> Hi Dawid,
>
> Thanks for the very good explanation! Indeed the main problem with 
> tests.iters is the static initializers. Maybe put that explanation into the 
> Wiki! I sometimes also need to remember it, so it should be documented.
>
> One (only theoretical) way to solve the whole thing could be:
> Load the class(es) in a separate classloader for every repeated execution,... 
> but of course this will very fast blow up your permgen (java 6, 7) or 
> anything else we don't know about (java 8). In fact the separate classloader 
> approach is not different from Mike's scripts, just that Mike's script 
> creates a new classloader by forking a new JVM. In fact I don't think the 
> separate classloader approach would be much faster, because the class clones 
> will all have separate compilation paths in Hotspot, so Hotspot cannot share 
> the same assembler code. So except the JVM startup time, you gain nothing. 
> Just permgen issues :-)

The big thing the python beasting script avoids is all the ant
overhead to just get to the point where it actually spawns the JVM to
run the test.  Really, that's all the beasting script does: directly
spawn the JVM on the test runner (after running "ant test-compile" up
front) and then parse its output/events.
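
A rough sketch of that idea, under stated assumptions: this is not the actual beasting script. It assumes the tests can be launched through JUnit 4's org.junit.runner.JUnitCore entry point, that the compiled classes are already on the given classpath (after "ant test-compile"), and that a seed can be passed as -Dtests.seed; jvm_command and beast_jvm are hypothetical names.

```python
import subprocess

def jvm_command(classpath, test_class, seed=None, java="java"):
    """Build the argv that launches one test class directly, bypassing Ant."""
    cmd = [java, "-cp", classpath]
    if seed is not None:
        cmd.append("-Dtests.seed=" + seed)  # assumed name of the seed property
    cmd += ["org.junit.runner.JUnitCore", test_class]
    return cmd

def beast_jvm(classpath, test_class, iters, run=subprocess.run):
    """Spawn a fresh JVM per iteration; return (iteration, output) for failures."""
    failed = []
    for i in range(iters):
        result = run(jvm_command(classpath, test_class),
                     capture_output=True, text=True)
        # JUnitCore exits with a non-zero status when any test fails.
        if result.returncode != 0:
            failed.append((i, result.stdout))
    return failed
```

The per-iteration cost here is only JVM startup plus the test itself, which is the saving described above; the real script additionally parses the runner's output/events, which this sketch leaves out.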

The distributed test runner, which uses rsync/ssh to run tests on N
machines, is very different from the beasting script: it runs all of
Lucene's tests (instead of a single test over and over) across N JVMs
on M machines.  It "cheats" by taking the union of all CLASSPATHs ...
but this is a huge win because it means all testing is fully
concurrent, not just concurrent within one module.  This script can
also repeat: once all Lucene tests finish, it re-enqueues all of them
again.

Mike McCandless

http://blog.mikemccandless.com




RE: Test iterations

2014-08-08 Thread Uwe Schindler
Hi Dawid,

Thanks for the very good explanation! Indeed, the main problem with tests.iters
is the static initializers. Maybe put that explanation into the Wiki! I
sometimes need to be reminded of it myself, so it should be documented.

One (only theoretical) way to solve the whole thing could be:
load the class(es) in a separate classloader for every repeated execution...
but of course this will very quickly blow up your permgen (Java 6, 7) or
something else we don't know about (Java 8). In fact, the separate classloader
approach is not different from Mike's scripts, except that Mike's script gets a
new classloader by forking a new JVM. I don't think the separate classloader
approach would be much faster, because the class clones will all have separate
compilation paths in HotSpot, so HotSpot cannot share the same assembler code.
So apart from the JVM startup time, you gain nothing. Just permgen issues :-)

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


> -Original Message-
> From: Dawid Weiss [mailto:dawid.we...@gmail.com]
> Sent: Friday, August 08, 2014 3:10 PM
> To: dev@lucene.apache.org
> Subject: Re: Test iterations
> 

Re: Test iterations

2014-08-08 Thread Dawid Weiss
Hi Ryan,

So. I discussed this a while ago, but here it comes again. Let me
first clear up a few things from what you said.

> Only in the last month or so did I learn that -Dtests.iters doesn't really 
> "work".  What I mean is in regards to randomization.

This is not true. It works (as I will explain below). Try it, for
example (the annotation has the same effect as providing
-Dtests.iters=10):

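import org.junit.Test;
import com.carrotsearch.randomizedtesting.RandomizedTest;
import com.carrotsearch.randomizedtesting.annotations.Repeat;
import com.carrotsearch.randomizedtesting.annotations.Seed;

@Repeat(iterations = 10)
@Seed("0")
public class Test001 extends RandomizedTest {
  @Test public void test() {
    // Each repetition draws from its own derived randomness.
    System.out.println(randomAsciiOfLength(10));
  }
}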

I fixed the initial seed to make it reproducible. This will print:

nKtjLXhWQw
awHTHLIGAq
vEYgnxTkWv
mSAloRXtIV
iBhCJuZNzP
DHAIyqecSS
zaEoTAWAOa
CoraUrKuib
fKxUZnyQTx
beFtvsUTHc

> Each iteration currently is *exactly* the same as far as randomization
> (each iteration uses the same master seed).

You can see above that it isn't true. Every iteration is different and
uses different randomness (and this randomness is "derived" from the
(master, iteration) pair so it is fully reproducible in each run).
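
One way to picture "derived from the (master, iteration) pair" is the
sketch below. It is plain Java on top of java.util.Random, not the
runner's code: iterationSeed() and its mixing constant are hypothetical
stand-ins for whatever the runner really does, and randomAscii() only
imitates randomAsciiOfLength(). The point is that the same master seed
always yields the same per-iteration seeds, while different iterations
get different ones.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class SeedDerivationSketch {
    // Hypothetical mixing step: an odd constant times (iteration + 1), XORed
    // into the master seed, so distinct iterations map to distinct seeds.
    static long iterationSeed(long masterSeed, int iteration) {
        return masterSeed ^ (0x9E3779B97F4A7C15L * (iteration + 1));
    }

    // Stand-in for randomAsciiOfLength(): lowercase letters only.
    static String randomAscii(Random r, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + r.nextInt(26)));
        }
        return sb.toString();
    }

    // One string per iteration, each drawn from that iteration's own Random.
    static List<String> run(long masterSeed, int iterations) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < iterations; i++) {
            out.add(randomAscii(new Random(iterationSeed(masterSeed, i)), 10));
        }
        return out;
    }

    public static void main(String[] args) {
        for (String s : run(0L, 10)) {
            System.out.println(s);
        }
    }
}
```

Running main twice prints the same ten lines both times, which is the
reproducibility property described above.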

> Why not create a different seed for each iteration when -Dtests.iters is
> used?

Let's talk about JUnit unit tests and how (any) runner should execute
them. I will demonstrate this on a simple class like this one (pseudo
code):

class Foo {
  @BeforeClass beforeClassHook() {}

  @Before beforeHook() {}
  @Test test1() {}
  @After afterHook() {}

  @AfterClass afterClassHook() {}
}

There are a couple of "stages" to be executed. Simplifying a bit, it
looks like this.

0. Prerequisite:

- class available, possibly loaded and initialized

1. Setup:

- extract test methods

2. Execution:

- run class-before hooks (rules, @BeforeClass)
- for each test:
    run before hooks (@Before, rules)
    run the test itself
    run after hooks (@After, rules)
- run class-after hooks (rules, @AfterClass)

For the class above, the sequence of method calls would be:

beforeClassHook()

new() // constructor
beforeHook()
test1()
afterHook()

afterClassHook()

If you were to multiply test executions manually, you would copy-paste
the test method, giving it a different name:

class Foo {
  @BeforeClass beforeClassHook() {}

  @Before beforeHook() {}
  @Test test1() {}
  @Test test2() {}
  @After afterHook() {}

  @AfterClass afterClassHook() {}
}

which would result in a sequence of calls like this one:

beforeClassHook()

new() // constructor
beforeHook()
test1()
afterHook()

new() // constructor (new instance)
beforeHook()
test2()
afterHook()

afterClassHook()

So, first of all, note that duplicating tests is *not* equivalent to
just looping around the method body. Each execution should be run on a
new instance and wrapped with setup and teardown hooks, otherwise it's
not really an isolated JUnit test anymore (and it would be against
JUnit's informal execution flow).
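
The call sequences above can be reproduced with a small stand-alone
sketch. It does not use JUnit at all: LifecycleSketch and its runSuite()
method are hypothetical and only imitate the order in which a runner
would invoke the hooks, recording each call by name.

```java
import java.util.ArrayList;
import java.util.List;

class LifecycleSketch {
    static final List<String> calls = new ArrayList<>();

    // Mirrors the pseudo-code class Foo; every method just records its name.
    static class Foo {
        static void beforeClassHook() { calls.add("beforeClassHook()"); }
        static void afterClassHook()  { calls.add("afterClassHook()"); }
        Foo()             { calls.add("new()"); }
        void beforeHook() { calls.add("beforeHook()"); }
        void test(int n)  { calls.add("test" + n + "()"); }
        void afterHook()  { calls.add("afterHook()"); }
    }

    // Class-level hooks run once; every test copy gets a fresh instance
    // wrapped in its own before/after hooks.
    static List<String> runSuite(int testCopies) {
        calls.clear();
        Foo.beforeClassHook();
        for (int n = 1; n <= testCopies; n++) {
            Foo instance = new Foo();
            instance.beforeHook();
            instance.test(n);
            instance.afterHook();
        }
        Foo.afterClassHook();
        return new ArrayList<>(calls);
    }

    public static void main(String[] args) {
        for (String call : runSuite(2)) {
            System.out.println(call);
        }
    }
}
```

Note that only the per-test part is inside the loop; the class-level
hooks stay outside it no matter how many copies are run.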

This is, in short, what -Dtests.iters does (and what @Repeat does) --
it replicates every test, making sure they have unique names (IDEs get
confused if they don't) and trying to work around other issues I won't
discuss here. It does work. The reason you believe it doesn't is that
most of the stuff in LuceneTestCase is initialized at the *static*
class level, which by definition is executed only once, regardless of
the number of tests in a class. Let's modify our initial example a
bit:

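import org.junit.BeforeClass;
import org.junit.Test;
import com.carrotsearch.randomizedtesting.RandomizedTest;
import com.carrotsearch.randomizedtesting.annotations.Repeat;
import com.carrotsearch.randomizedtesting.annotations.Seed;

@Repeat(iterations = 10)
@Seed("0")
public class Test002 extends RandomizedTest {
  static String s;

  @BeforeClass
  public static void beforeClass() {
    // Runs once per class, so s is the same for all repetitions.
    s = randomAsciiOfLength(10);
  }

  @Test public void test() {
    System.out.println(s);
  }
}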

If you run this, you'll see:

SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD
SXVNjhPdQD

This works as expected because beforeClass() is invoked only once (even
though every test has different randomness available to it).
LuceneTestCase does this for performance reasons (that static
initialization is fairly costly). This ends the JUnit part of the
story.
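
The same effect can be shown without JUnit in a few lines. This is only
an illustration under assumptions: StaticOnceSketch is hypothetical, the
per-repetition seed (master + i) is made up for the example, and
randomAscii() merely imitates randomAsciiOfLength(). Each output line
holds the static value followed by a value drawn inside the repetition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

class StaticOnceSketch {
    static String s; // plays the role of the static field in Test002

    // Stand-in for randomAsciiOfLength(): lowercase letters only.
    static String randomAscii(Random r, int length) {
        StringBuilder sb = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            sb.append((char) ('a' + r.nextInt(26)));
        }
        return sb.toString();
    }

    static List<String> run(long masterSeed, int iterations) {
        // "Class-before" hook: runs once, however many repetitions follow.
        s = randomAscii(new Random(masterSeed), 10);
        List<String> printed = new ArrayList<>();
        for (int i = 1; i <= iterations; i++) {
            // Each repetition still gets its own randomness (made-up derivation)...
            String local = randomAscii(new Random(masterSeed + i), 10);
            // ...but the static field was fixed before the loop started.
            printed.add(s + " " + local);
        }
        return printed;
    }

    public static void main(String[] args) {
        for (String line : run(0L, 10)) {
            System.out.println(line);
        }
    }
}
```

The first column never changes across the ten lines, because nothing
inside the loop touches the static field.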

But wait, there is more. If you take a look above at how JUnit runners
should work, they load the class (or in fact are given an initialized
class) before they can do anything. So if there are static class
initializers (static { field = foo(); }) then these may get executed
well before the runner has any chance to initialize its own stuff --
that's why you *have* to use @BeforeClass methods if you want to use
RandomizedTest's randomness; doing this:

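import org.junit.Test;
import com.carrotsearch.randomizedtesting.RandomizedTest;
import com.carrotsearch.randomizedtesting.annotations.Repeat;
import com.carrotsearch.randomizedtesting.annotations.Seed;

@Repeat(iterations = 10)
@Seed("0")
public class Test003 extends RandomizedTest {
  static final String s;
  static {
    // Executed at class initialization, before the runner has set up its context.
    s = randomAsciiOfLength(10);
  }

  @Test public void test() {
    System.out.println(s);
  }
}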

will result in an initialization exception complaining about missing
random context:

java.lang.IllegalStateException: No context information for thread:
Thread[id=11, name=SUITE-Test003-seed#[0], state=RUNNABLE,
group=TGRP-Test003]. Is this thread running under a class
com.carrotsearch.randomizedtesting.RandomizedRunner ru

Re: Test iterations

2014-08-07 Thread Dawid Weiss
It is a longer story, Ryan. And *not* a trivial change to the runner. I
will reply tomorrow. I am at a pub right now. To you, cheers :)
On Aug 7, 2014 11:36 PM, "Ryan Ernst"  wrote:

> Only in the last month or so did I learn that -Dtests.iters doesn't
> really "work".  What I mean is in regards to randomization.  Each
> iteration currently is *exactly* the same as far as randomization
> (each iteration uses the same master seed).  And because of this, I
> understand that different people have their own "beasting" scripts
> that run the test essentially N times from a shell to force different
> seeds in each iteration.
>
> Why not create a different seed for each iteration when -Dtests.iters
> is used?  This way the test would still spit out a reproducible run
> line for a specific iteration, but each iteration would have good
> randomization (so trying to hit a rare bug could be done with
> -Dtests.iters).
>
> I'm curious whether there is history here as to why test iters is done
> this way, or what people's opinions are on moving towards the approach I
> suggested above.
>
> Thanks!
> Ryan
>