Re: [testing] Tests scores on http://harmonytest.org Was: [DRLVM] General stability

2006-11-12 Thread Anton Luht

Alexei,


I like your approach to result comparison. 10% can be the default
value for a form field - anyone can change it if needed.


OK, let's try it and see if it satisfies the community.


Probably more conventional would be to parse
 for some metric. Does the site
already parse TEST-* files?


It parses TESTS-TestSuites.xml - as far as I can see, it contains the
same information as the TEST-* files. Is there a common output format
for the stress tests you're talking about?




Thank you, Alexei

On 11/10/06, Anton Luht <[EMAIL PROTECTED]> wrote:
> Hello, Alexei,
>
> > I have related question. How can we improve http://harmonytest.org to
> > make it possible to publish not just pass, fail, or error but numeric
> > test scores?
>
> Easily - test results in JUnit reports have a 'time' property -
> execution time in seconds. We can import and show them in the results.
> What else is needed? Maybe add something like 'show regressions' to
> the 'compare runs' page? For example, show tests that increased
> execution time by more than 10%, sorted by increase rate, descending?
>
> --
> Regards,
> Anton Luht,
> Intel Java & XML Engineering
>


--
Thank you,
Alexei




--
Regards,
Anton Luht,
Intel Java & XML Engineering



Re: [testing] Tests scores on http://harmonytest.org Was: [DRLVM] General stability

2006-11-10 Thread Alexei Fedotov

Anton,

I like your approach to result comparison. 10% can be the default
value for a form field - anyone can change it if needed.

As for the test execution time reported by JUnit, it is applicable to
stress tests as well if we gradually increase the load over time.
Though using the time field for stress tests and documentation
rankings is not quite conventional.
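The ramp-up idea could be sketched like this - a hypothetical harness (all names here are illustrative, not an existing Harmony tool) that increases the load until the workload stops passing and reports the highest passing load as the test's numeric score:

```java
import java.util.function.IntPredicate;

// Hypothetical sketch: derive a numeric score for a stress test by
// gradually increasing the load until the workload fails.
// The score is the highest load level that still passed.
public class StressScore {
    public static int score(IntPredicate passesUnderLoad,
                            int startLoad, int maxLoad, int step) {
        int best = 0;
        for (int load = startLoad; load <= maxLoad; load += step) {
            if (!passesUnderLoad.test(load)) {
                break;           // first failure ends the run
            }
            best = load;         // remember the last passing load
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy workload that starts failing above load 70.
        System.out.println(score(load -> load <= 70, 10, 100, 10));  // prints 70
    }
}
```

A score like this could then be published and compared between runs just like the JUnit execution times.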

Probably more conventional would be to parse
 for some metric. Does the site
already parse TEST-* files?

Thank you, Alexei

On 11/10/06, Anton Luht <[EMAIL PROTECTED]> wrote:

Hello, Alexei,

> I have related question. How can we improve http://harmonytest.org to
> make it possible to publish not just pass, fail, or error but numeric
> test scores?

Easily - test results in JUnit reports have a 'time' property -
execution time in seconds. We can import and show them in the results.
What else is needed? Maybe add something like 'show regressions' to
the 'compare runs' page? For example, show tests that increased
execution time by more than 10%, sorted by increase rate, descending?

--
Regards,
Anton Luht,
Intel Java & XML Engineering




--
Thank you,
Alexei


Re: [testing] Tests scores on http://harmonytest.org Was: [DRLVM] General stability

2006-11-09 Thread Anton Luht

Hello, Alexei,


I have related question. How can we improve http://harmonytest.org to
make it possible to publish not just pass, fail, or error but numeric
test scores?


Easily - test results in JUnit reports have a 'time' property -
execution time in seconds. We can import and show them in the results.
What else is needed? Maybe add something like 'show regressions' to
the 'compare runs' page? For example, show tests that increased
execution time by more than 10%, sorted by increase rate, descending?
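The 'show regressions' idea could look like this minimal sketch (a hypothetical `RegressionReport` helper; the per-test times would come from the imported JUnit 'time' values of two runs):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: compare per-test execution times from two runs
// and report tests whose time grew by more than a threshold (e.g. 10%),
// sorted by relative increase, descending.
public class RegressionReport {
    public static List<String> regressions(Map<String, Double> baseline,
                                           Map<String, Double> current,
                                           double threshold) {
        return baseline.entrySet().stream()
            .filter(e -> current.containsKey(e.getKey()) && e.getValue() > 0)
            .map(e -> Map.entry(e.getKey(),
                    (current.get(e.getKey()) - e.getValue()) / e.getValue()))
            .filter(e -> e.getValue() > threshold)
            .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
            .map(e -> String.format("%s +%.0f%%", e.getKey(), e.getValue() * 100))
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Double> base = Map.of("testA", 1.0, "testB", 2.0, "testC", 0.5);
        Map<String, Double> curr = Map.of("testA", 1.3, "testB", 2.1, "testC", 0.5);
        // testA grew 30%, testB grew 5% (below threshold), testC is unchanged.
        System.out.println(regressions(base, curr, 0.10));  // prints [testA +30%]
    }
}
```

The 10% threshold would be the form-field default mentioned above, adjustable per query.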

--
Regards,
Anton Luht,
Intel Java & XML Engineering


Re: [DRLVM] General stability

2006-11-09 Thread Geir Magnusson Jr.



Egor Pasko wrote:

On the 0x21C day of Apache Harmony Mikhail Loenko wrote:

2006/11/9, Geir Magnusson Jr. <[EMAIL PROTECTED]>:



I do think of us having a 'zero regression' policy except in cases where
we make the explicit decision to break.  (like we did with TM, for example)

+1 for 'zero regression' unless explicitly agreed


I am +1 for "'zero regression' unless explicitly agreed", at least for
(c/jit-)unit, kernel, and classlib tests. Do you think of including
Eclipse UT, DaCapo, etc. in the 'zero regression' set?


That would probably be good, but I am afraid of too-heavy checks for
each small patch.


This wouldn't be done by each committer - it's too heavy. We'd have CI
running continuously to tell us when we broke something. Our
commitment would be to fix things that CI found immediately...


geir




Thanks,
Mikhail




I hesitate to say that again, but we also need to decide about VM we
will use for that release. I like the following mission: "Class library
and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is
challenging, understandable and short enough.

Well, we'll need Windows XP and RHEL as well.



Writing down this mission certainly shouldn't inhibit individuals from
achieving other goals at Harmony. But it would help the rest of
community to concentrate on the common task.

1.
http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_
on

With best regards,
Alexei Fedotov,
Intel Java & XML Engineering


-Original Message-
From: Alexey Petrenko [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 08, 2006 10:36 AM
To: harmony-dev@incubator.apache.org
Subject: Re: [DRLVM] General stability

2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>:

On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:

Probably it's time to create some release plan :)


So let's start this discussion?
Good idea!
The only release I can imagine is Harmony Java5SE 100% compatible.

To be Java5SE 100% compatible we need TCK first.
So we could think about some less impressive goal for the first release :)

SY, Alexey






Re: [DRLVM] General stability

2006-11-09 Thread Egor Pasko
On the 0x21C day of Apache Harmony Mikhail Loenko wrote:
> 2006/11/9, Geir Magnusson Jr. <[EMAIL PROTECTED]>:
> >
> >
> > Fedotov, Alexei A wrote:
> > > Alexey Petrenko wrote,
> > >> The only release I can imagine is Harmony Java5SE 100% compatible.
> > >> To be Java5SE 100% compatible we need TCK first.
> > >
> > > +1
> > >
> >
> > Yes - and I still think that talk of a release is a bit premature right now.
> >
> > The key things that I believe we need to focus on are
> >
> >  a) stability and
> >
> >  b) completeness.
> >
> >  c) reliability (which may be 'stability')
> >
> > (and not always in that order :)
> >
> >
> > Things I'd like to see us do :
> >
> > 1)  We need to drive to fully working unit tests for both DRLVM and
> > classlib  (using DRLVM).  Great progress has been made in this area, and
> >  we should probably make this a "campaign" for DRLVM as we did for
> > classlib.
> >
> > 2) Add stress tests
> >
> > 3) Get our CC-based build-test framework patched and running on as many
> > platforms as possible, reporting breakage into the list.
> >
> > 4) Identify problem areas and focus on them.  For example, threading in
> > DRLVM...
> >
> > I do think of us having a 'zero regression' policy except in cases where
> > we make the explicit decision to break.  (like we did with TM, for example)
> 
> +1 for 'zero regression' unless explicitly agreed

I am +1 for "'zero regression' unless explicitly agreed", at least for
(c/jit-)unit, kernel, and classlib tests. Do you think of including
Eclipse UT, DaCapo, etc. in the 'zero regression' set?

That would probably be good, but I am afraid of too-heavy checks for
each small patch.

> Thanks,
> Mikhail
> 
> >
> >
> > > I hesitate to say that again, but we also need to decide about VM we
> > > will use for that release. I like the following mission: "Class library
> > > and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is
> > > challenging, understandable and short enough.
> >
> > Well, we'll need Windows XP and RHEL as well.
> >
> >
> > >
> > > Writing down this mission certainly shouldn't inhibit individuals from
> > > achieving other goals at Harmony. But it would help the rest of
> > > community to concentrate on the common task.
> > >
> > > 1.
> > > http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_
> > > on
> > >
> > > With best regards,
> > > Alexei Fedotov,
> > > Intel Java & XML Engineering
> > >
> > >> -Original Message-
> > >> From: Alexey Petrenko [mailto:[EMAIL PROTECTED]
> > >> Sent: Wednesday, November 08, 2006 10:36 AM
> > >> To: harmony-dev@incubator.apache.org
> > >> Subject: Re: [DRLVM] General stability
> > >>
> > >> 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>:
> > >>> On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:
> > >>>> Probably it's time to create some release plan :)
> > >>>>
> > >>> So let's start this discussion?
> > >>> Good idea!
> > >>> The only release I can imagine is Harmony Java5SE 100% compatible.
> > >> To be Java5SE 100% compatible we need TCK first.
> > >> So we could think about some less impressive goal for the first release
> > > :)
> > >> SY, Alexey
> > >
> >
> >
> 

-- 
Egor Pasko



Tests scores on http://harmonytest.org Was: [DRLVM] General stability

2006-11-09 Thread Fedotov, Alexei A
Geir,
I like the overall letter. 

Anton,

I have related question. How can we improve http://harmonytest.org to
make it possible to publish not just pass, fail, or error but numeric
test scores? 

How is this related to the letter? I believe that the stress tests
mentioned in the letter may have scores the way performance tests do;
see
http://incubator.apache.org/harmony/subcomponents/stresstest/index.html

BTW, Egor Pasko requested scores to report documentation quality.

With best regards,
Alexei Fedotov,
Intel Java & XML Engineering

>-Original Message-
>From: Geir Magnusson Jr. [mailto:[EMAIL PROTECTED]
>Sent: Thursday, November 09, 2006 4:51 AM
>To: harmony-dev@incubator.apache.org
>Subject: Re: [DRLVM] General stability
>
>
>
>Fedotov, Alexei A wrote:
>> Alexey Petrenko wrote,
>>> The only release I can imagine is Harmony Java5SE 100% compatible.
>>> To be Java5SE 100% compatible we need TCK first.
>>
>> +1
>>
>
>Yes - and I still think that talk of a release is a bit premature right
>now.
>
>The key things that I believe we need to focus on are
>
>  a) stability and
>
>  b) completeness.
>
>  c) reliability (which may be 'stability')
>
>(and not always in that order :)
>
>
>Things I'd like to see us do :
>
>1)  We need to drive to fully working unit tests for both DRLVM and
>classlib  (using DRLVM).  Great progress has been made in this area,
and
>  we should probably make this a "campaign" for DRLVM as we did for
>classlib.
>
>2) Add stress tests
>
>3) Get our CC-based build-test framework patched and running on as many
>platforms as possible, reporting breakage into the list.
>
>4) Identify problem areas and focus on them.  For example, threading in
>DRLVM...
>
>I do think of us having a 'zero regression' policy except in cases
where
>we make the explicit decision to break.  (like we did with TM, for
example)
>
>
>> I hesitate to say that again, but we also need to decide about VM we
>> will use for that release. I like the following mission: "Class
library
>> and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which
is
>> challenging, understandable and short enough.
>
>Well, we'll need Windows XP and RHEL as well.
>
>
>>
>> Writing down this mission certainly shouldn't inhibit individuals
from
>> achieving other goals at Harmony. But it would help the rest of
>> community to concentrate on the common task.
>>
>> 1.
>>
http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_
>> on
>>
>> With best regards,
>> Alexei Fedotov,
>> Intel Java & XML Engineering
>>
>>> -Original Message-
>>> From: Alexey Petrenko [mailto:[EMAIL PROTECTED]
>>> Sent: Wednesday, November 08, 2006 10:36 AM
>>> To: harmony-dev@incubator.apache.org
>>> Subject: Re: [DRLVM] General stability
>>>
>>> 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>:
>>>> On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:
>>>>> Probably it's time to create some release plan :)
>>>>>
>>>> So let's start this discussion?
>>>> Good idea!
>>>> The only release I can imagine is Harmony Java5SE 100% compatible.
>>> To be Java5SE 100% compatible we need TCK first.
>>> So we could think about some less impressive goal for the first
release
>> :)
>>> SY, Alexey
>>


Re: [DRLVM] General stability

2006-11-09 Thread Mikhail Loenko

2006/11/9, Geir Magnusson Jr. <[EMAIL PROTECTED]>:



Fedotov, Alexei A wrote:
> Alexey Petrenko wrote,
>> The only release I can imagine is Harmony Java5SE 100% compatible.
>> To be Java5SE 100% compatible we need TCK first.
>
> +1
>

Yes - and I still think that talk of a release is a bit premature right now.

The key things that I believe we need to focus on are

 a) stability and

 b) completeness.

 c) reliability (which may be 'stability')

(and not always in that order :)


Things I'd like to see us do :

1)  We need to drive to fully working unit tests for both DRLVM and
classlib  (using DRLVM).  Great progress has been made in this area, and
 we should probably make this a "campaign" for DRLVM as we did for
classlib.

2) Add stress tests

3) Get our CC-based build-test framework patched and running on as many
platforms as possible, reporting breakage into the list.

4) Identify problem areas and focus on them.  For example, threading in
DRLVM...

I do think of us having a 'zero regression' policy except in cases where
we make the explicit decision to break.  (like we did with TM, for example)


+1 for 'zero regression' unless explicitly agreed

Thanks,
Mikhail




> I hesitate to say that again, but we also need to decide about VM we
> will use for that release. I like the following mission: "Class library
> and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is
> challenging, understandable and short enough.

Well, we'll need Windows XP and RHEL as well.


>
> Writing down this mission certainly shouldn't inhibit individuals from
> achieving other goals at Harmony. But it would help the rest of
> community to concentrate on the common task.
>
> 1.
> http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_
> on
>
> With best regards,
> Alexei Fedotov,
> Intel Java & XML Engineering
>
>> -Original Message-
>> From: Alexey Petrenko [mailto:[EMAIL PROTECTED]
>> Sent: Wednesday, November 08, 2006 10:36 AM
>> To: harmony-dev@incubator.apache.org
>> Subject: Re: [DRLVM] General stability
>>
>> 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>:
>>> On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:
>>>> Probably it's time to create some release plan :)
>>>>
>>> So let's start this discussion?
>>> Good idea!
>>> The only release I can imagine is Harmony Java5SE 100% compatible.
>> To be Java5SE 100% compatible we need TCK first.
>> So we could think about some less impressive goal for the first release
> :)
>> SY, Alexey
>




Re: [DRLVM] General stability

2006-11-09 Thread Oleg Oleinik

The key things that I believe we need to focus on are
a) stability and
b) completeness.
c) reliability (which may be 'stability')


Just 2 cents - to be clear on the terms "stability" and "reliability":
we started using them in this thread, and I feel we are not
distinguishing them.

I propose that when we talk about "reliability" we mean: the
capability of the Harmony runtime to run given workloads correctly (as
defined by the J2SE specifications) for a certain period of time.
This is an interpretation of the IEEE definition of "reliability".

There is no definition of "stability" in the IEEE software glossary,
so I propose that when we talk about stability we mean stability of
the runtime / code over time, i.e. progressing, not regressing.

In these terms I completely agree with the order.



On 11/9/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote:




Fedotov, Alexei A wrote:
> Alexey Petrenko wrote,
>> The only release I can imagine is Harmony Java5SE 100% compatible.
>> To be Java5SE 100% compatible we need TCK first.
>
> +1
>

Yes - and I still think that talk of a release is a bit premature right
now.

The key things that I believe we need to focus on are

a) stability and

b) completeness.

c) reliability (which may be 'stability')

(and not always in that order :)


Things I'd like to see us do :

1)  We need to drive to fully working unit tests for both DRLVM and
classlib  (using DRLVM).  Great progress has been made in this area, and
we should probably make this a "campaign" for DRLVM as we did for
classlib.

2) Add stress tests

3) Get our CC-based build-test framework patched and running on as many
platforms as possible, reporting breakage into the list.

4) Identify problem areas and focus on them.  For example, threading in
DRLVM...

I do think of us having a 'zero regression' policy except in cases where
we make the explicit decision to break.  (like we did with TM, for
example)


> I hesitate to say that again, but we also need to decide about VM we
> will use for that release. I like the following mission: "Class library
> and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is
> challenging, understandable and short enough.

Well, we'll need Windows XP and RHEL as well.


>
> Writing down this mission certainly shouldn't inhibit individuals from
> achieving other goals at Harmony. But it would help the rest of
> community to concentrate on the common task.
>
> 1.
> http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_
> on
>
> With best regards,
> Alexei Fedotov,
> Intel Java & XML Engineering
>
>> -Original Message-
>> From: Alexey Petrenko [mailto:[EMAIL PROTECTED]
>> Sent: Wednesday, November 08, 2006 10:36 AM
>> To: harmony-dev@incubator.apache.org
>> Subject: Re: [DRLVM] General stability
>>
>> 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>:
>>> On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:
>>>> Probably it's time to create some release plan :)
>>>>
>>> So let's start this discussion?
>>> Good idea!
>>> The only release I can imagine is Harmony Java5SE 100% compatible.
>> To be Java5SE 100% compatible we need TCK first.
>> So we could think about some less impressive goal for the first release
> :)
>> SY, Alexey
>




Re: [DRLVM] General stability

2006-11-08 Thread Geir Magnusson Jr.



Fedotov, Alexei A wrote:

Alexey Petrenko wrote,

The only release I can imagine is Harmony Java5SE 100% compatible.
To be Java5SE 100% compatible we need TCK first.


+1



Yes - and I still think that talk of a release is a bit premature right now.

The key things that I believe we need to focus on are

 a) stability and

 b) completeness.

 c) reliability (which may be 'stability')

(and not always in that order :)


Things I'd like to see us do :

1)  We need to drive to fully working unit tests for both DRLVM and 
classlib  (using DRLVM).  Great progress has been made in this area, and 
 we should probably make this a "campaign" for DRLVM as we did for 
classlib.


2) Add stress tests

3) Get our CC-based build-test framework patched and running on as many 
platforms as possible, reporting breakage into the list.


4) Identify problem areas and focus on them.  For example, threading in 
DRLVM...


I do think of us having a 'zero regression' policy except in cases where 
we make the explicit decision to break.  (like we did with TM, for example)




I hesitate to say that again, but we also need to decide about VM we
will use for that release. I like the following mission: "Class library
and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is
challenging, understandable and short enough.


Well, we'll need Windows XP and RHEL as well.




Writing down this mission certainly shouldn't inhibit individuals from
achieving other goals at Harmony. But it would help the rest of
community to concentrate on the common task.

1.
http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_
on

With best regards,
Alexei Fedotov,
Intel Java & XML Engineering


-Original Message-
From: Alexey Petrenko [mailto:[EMAIL PROTECTED]
Sent: Wednesday, November 08, 2006 10:36 AM
To: harmony-dev@incubator.apache.org
Subject: Re: [DRLVM] General stability

2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>:

On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:

Probably it's time to create some release plan :)


So let's start this discussion?
Good idea!
The only release I can imagine is Harmony Java5SE 100% compatible.

To be Java5SE 100% compatible we need TCK first.
So we could think about some less impressive goal for the first release :)

SY, Alexey






RE: [DRLVM] General stability

2006-11-08 Thread Fedotov, Alexei A
Alexey Petrenko wrote,
>The only release I can imagine is Harmony Java5SE 100% compatible.
>To be Java5SE 100% compatible we need TCK first.

+1

I hesitate to say that again, but we also need to decide about VM we
will use for that release. I like the following mission: "Class library
and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is
challenging, understandable and short enough.

Writing down this mission certainly shouldn't inhibit individuals from
achieving other goals at Harmony. But it would help the rest of
community to concentrate on the common task.

1.
http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_
on

With best regards,
Alexei Fedotov,
Intel Java & XML Engineering

>-Original Message-
>From: Alexey Petrenko [mailto:[EMAIL PROTECTED]
>Sent: Wednesday, November 08, 2006 10:36 AM
>To: harmony-dev@incubator.apache.org
>Subject: Re: [DRLVM] General stability
>
>2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>:
>> On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:
>> >
>> > Probably it's time to create some release plan :)
>> >
>> So let's start this discussion?
>> >
>> Good idea!
>> The only release I can imagine is Harmony Java5SE 100% compatible.
>To be Java5SE 100% compatible we need TCK first.
>So we could think about some less impressive goal for the first release :)
>
>SY, Alexey


Re: [DRLVM] General stability

2006-11-08 Thread Alexey Varlamov

2006/11/7, Vladimir Ivanov <[EMAIL PROTECTED]>:

On 11/7/06, Alexey Varlamov <[EMAIL PROTECTED]> wrote:
>
>
> But do we have needed scripts/tools readily available to run and
> analyze such stability testing? I'm also pretty sure existing c-unit
> and smoke tests would help to reveal certain problems if run
> repeatedly - just need to add this stuff to CC and run it nightly.
> Anybody volunteer?
> And yet there are a lot of excluded tests in smoke suite...



Actually, we have one. The 'ant test' task from the drlvm module is
running under CC on Linux and Windows boxes. Everyone can easily
reproduce it (check out the 'buildtest' module and run it; an updated
version is available in issue 995).

The problem is that CC is useful only for tracking regressions. While
we have some failing tests, they should be fixed ASAP. At present,
some issues that prevent successful CC runs have been waiting for
integration for more than a month :(


AFAIU CC only tracks status changes between subsequent runs, right?
That is not really helpful for detecting stability issues.
Apparently a few race conditions are present in DRLVM thread
suspension, classloading, and/or elsewhere. So one rather needs to
collect statistics to spot suspicious areas. Results can be
represented as the percentage of failures per test, with failures
grouped by symptom, e.g. the same assert failing for different tests.
Further analysis could breed better tests with a higher failure
probability...
An alternative approach could be to employ a custom "stressing" test
harness, running a test in several concurrent threads, etc.

Although this is just a temporary solution until we have thorough
stress tests and more decent coverage of VM code by unit tests, the
latter will hardly happen in the foreseeable future. So we should try
to derive maximum benefit from the already available tests.
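The "stressing" harness idea might look like this minimal sketch (the `StressHarness` class and its parameters are illustrative, not an existing Harmony tool): run one test body from many threads at once and report the observed failure rate, which is exactly the per-test statistic suggested above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: hammer a single test body from several
// concurrent threads and report the fraction of runs that failed.
public class StressHarness {
    public static double failureRate(Runnable testBody, int threads, int iterations)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger failures = new AtomicInteger();
        int total = threads * iterations;
        for (int i = 0; i < total; i++) {
            pool.execute(() -> {
                try {
                    testBody.run();
                } catch (Throwable t) {
                    // Any exception or assertion error counts as a failure.
                    failures.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return (double) failures.get() / total;
    }

    public static void main(String[] args) throws InterruptedException {
        // A trivially passing body: expect a 0.0 failure rate.
        System.out.println(failureRate(() -> { }, 4, 25));  // prints 0.0
    }
}
```

Intermittent races would then show up as a non-zero rate even when a single sequential run passes.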

--
Regards,
Alexey


 thanks, Vladimir




Re: [DRLVM] General stability

2006-11-08 Thread Mikhail Loenko

In general I agree with what Oleg says. Some comments are below.

2006/11/8, Oleg Oleinik <[EMAIL PROTECTED]>:

> "no regression" policy should be relevant to a number of *small* tests
> that are easy to run and are running fast, to make them good as pre-commit
> criteria.

Actually, I'm thinking about the following model (which goes a little bit
beyond pre-commit testing):

**Unit testing: new feature is developed; unit (or whatever) tests are
created; the tests are passing on certain platforms / runtime
configurations; the feature along with tests goes into JIRA.


But we welcome any contributions, even without unit tests.



**Pre-commit testing: done by committer with agreed set of tests, typically,
quickly running (example: classlib unit tests + vm/jit-specific tests).
Regression? - no commit or exclude tests / ignore failures, if reasonable
and agreed.

**Code integrity testing: done automatically ~hourly, the same set of tests
as for pre-commit testing may be used. Regression? - notify and fix asap (or
even roll-back changes if appropriate) or just exclude tests, if reasonable
and agreed.


It should be done on all the platforms we support.
(And probably "support" of some platform should imply that we do
integrity testing on it)

Thanks,
Mikhail



**QA testing (say, nightly): one runs automated workloads (from
buildtest/) on his platform(s) nightly (or from time to time).
Regression? - for example, 3 EUT tests or some Eclipse scenario
started failing after a certain commit - notify harmony-dev about the
regression; then it should be decided whether to stop new commits and
fix the regression ASAP, or to accept the regression and just exclude
the tests. The bugfixer can take automation scripts from buildtest/
and play with the failing tests or scenario.

I think the more we care about regressions on an ongoing basis, the
less time we will need to achieve a milestone's stability
requirements, and the more sense application-enabling activities make.

Who knows whether 2 months is sufficient for reaching the established
stability level? Why enable something if tomorrow it has a good chance
of not working?

> Many successful projects (probably, all of them) have stability periods,
> even stability releases (and, yes, stability branches). That is considered
> effective. And IMO our project should act the same.

I support having milestones and releases. But after a milestone I
don't like to see achieved results lost.



On 08 Nov 2006 12:48:48 +0600, Egor Pasko <[EMAIL PROTECTED]> wrote:
>
> On the 0x21B day of Apache Harmony Oleg Oleinik wrote:
> > Such a model works, but there is a risk of fixing again "from scratch"
> > those bugs which were fixed once in previous milestones.
>
> sometimes it is easier to fix a couple of bugs "from scratch" than to
> spend a large amount of resources on regular complex checks (that also
> do not guarantee 100% stability)
>
> > We can eliminate this if follow "no regression" policy - if something
> works
> > (classlib unit tests, Tomcat or Eclipse Unit Tests pass 100%, for
> example),
> > it should continue working - any regression is a subject for reporting
> and
> > fixing as soon as possible (it is easier to find root cause and fix
> since we
> > will know which commit caused regression).
> >
> > Will this model work? Isn't it a little bit better than focusing on
> runtime
> > stability periodically?
>
> "no regression" policy should be relevant to a number of *small* tests
> that are easy to run and are running fast, to make them good as
> pre-commit criteria.
>
> Complex workloads _cannot_ be run as pre-commit criteria. So there
> _should be regressions_. That's because:
> * we cannot afford to run them as pre-commit
> * we cannot afford complex rollbacks and stop-the-world
>
> Many successful projects (probably, all of them) have stability
> periods, even stability releases (and, yes, stability branches). That
> is considered effective. And IMO our project should act the same.
>
> We _have to_ allow some bugs in order to continue active development.
> But not too many. It is always a tradeoff.
>
> To summarize. I support your idea to improve the regression test base
> and infrastructure. Let it be a step-by-step improvement. Then we can
> decide which tests to run as pre-commit and which are to measure the
> overall stability.
>
> > On 11/8/06, Tim Ellison <[EMAIL PROTECTED]> wrote:
> > >
> > > I wouldn't go so far as to label issues as "won't fix" unless they are
> > > really high risk and low value items.
> > >
> > > It's useful to go through a stabilization period where the focus is on
> > > getting the code solid again and delaying significant new
> functionality
> > > until it is achieved.  A plan that aims to deliver stable milestones
> on
> > > regular periods is, in my experience, a good way to focus the
> > > development effort.
> > >
> > > Regards,
> > > Tim
> > >
> > > Weldon Washburn wrote:
> > > > Folks,
> > > >
> > > > I have spent the last two months committing patches to the
> VM.  While we
> > > > h

Re: [DRLVM] General stability

2006-11-08 Thread Oleg Oleinik

"no regression" policy should be relevant to a number of *small* tests
that are easy to run and are running fast, to make them good as pre-commit
criteria.

Actually, I'm thinking about the following model (which goes a little bit
beyond pre-commit testing):

**Unit testing: new feature is developed; unit (or whatever) tests are
created; the tests are passing on certain platforms / runtime
configurations; the feature along with tests goes into JIRA.

**Pre-commit testing: done by committer with agreed set of tests, typically,
quickly running (example: classlib unit tests + vm/jit-specific tests).
Regression? - no commit or exclude tests / ignore failures, if reasonable
and agreed.

**Code integrity testing: done automatically ~hourly, the same set of tests
as for pre-commit testing may be used. Regression? - notify and fix asap (or
even roll-back changes if appropriate) or just exclude tests, if reasonable
and agreed.

**QA testing (say, nightly): one runs automated workloads (from
buildtest/) on his platform(s) nightly (or from time to time).
Regression? - for example, 3 EUT tests or some Eclipse scenario
started failing after a certain commit - notify harmony-dev about the
regression; then it should be decided whether to stop new commits and
fix the regression ASAP, or to accept the regression and just exclude
the tests. The bugfixer can take automation scripts from buildtest/
and play with the failing tests or scenario.

I think the more we care about regressions on an ongoing basis, the
less time we will need to achieve a milestone's stability
requirements, and the more sense application-enabling activities make.

Who knows whether 2 months is sufficient for reaching the established
stability level? Why enable something if tomorrow it has a good chance
of not working?


Many successful projects (probably, all of them) have stability periods,
even stability releases (and, yes, stability branches). That is considered
effective. And IMO our project should act the same.

I support having milestones and releases. But after a milestone I
don't like to see achieved results lost.



On 08 Nov 2006 12:48:48 +0600, Egor Pasko <[EMAIL PROTECTED]> wrote:


On the 0x21B day of Apache Harmony Oleg Oleinik wrote:
> Such a model works, but there is a risk of fixing again "from scratch"
> those bugs which were fixed once in previous milestones.

sometimes it is easier to fix a couple of bugs "from scratch" than to
spend a large amount of resources on regular complex checks (that also
do not guarantee 100% stability)

> We can eliminate this if follow "no regression" policy - if something
works
> (classlib unit tests, Tomcat or Eclipse Unit Tests pass 100%, for
example),
> it should continue working - any regression is a subject for reporting
and
> fixing as soon as possible (it is easier to find root cause and fix
since we
> will know which commit caused regression).
>
> Will this model work? Isn't it a little bit better than focusing on
runtime
> stability periodically?

"no regression" policy should be relevant to a number of *small* tests
that are easy to run and are running fast, to make them good as
pre-commit criteria.

Complex workloads _cannot_ be run as pre-commit criteria. So there
_should be regressions_. That's because:
* we cannot afford to run them as pre-commit
* we cannot afford complex rollbacks and stop-the-world

Many successful projects (probably, all of them) have stability
periods, even stability releases (and, yes, stability branches). That
is considered effective. And IMO our project should act the same.

We _have to_ allow some bugs to continue active development. But not
too many. It is always a tradeoff.

To summarize: I support your idea to improve the regression test base
and infrastructure. Let it be a step-by-step improvement. Then we can
decide which tests to run as pre-commit and which are to measure the
overall stability.

> On 11/8/06, Tim Ellison <[EMAIL PROTECTED]> wrote:
> >
> > I wouldn't go so far as to label issues as "won't fix" unless they are
> > really high risk and low value items.
> >
> > It's useful to go through a stabilization period where the focus is on
> > getting the code solid again and delaying significant new functionality
> > until it is achieved.  A plan that aims to deliver stable milestones on
> > regular periods is, in my experience, a good way to focus the
> > development effort.
> >
> > Regards,
> > Tim
> >
> > Weldon Washburn wrote:
> > > Folks,
> > >
> > > I have spent the last two months committing patches to the VM.  While we
> > > have added a ton of much needed functionality, the stability of the system
> > > has been ignored.  By chance, I looked at thread synchronization design
> > > problems this week.  It's very apparent that we lack the regression testing
> > > to really find threading bugs, test the fixes and test against regression.
> > > No doubt there are similar problems in other VM subsystems.  "build test"
> > > is necessary but not sufficient for where we need to go.

Re: [DRLVM] General stability

2006-11-08 Thread Mikhail Fursov

On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:


> If we want to have "unnamed" milestones, the solution could be: every last
> month of a quarter is a "stability" period. No new features are accepted
> during this period. This could work for a long period of time without the
> need for additional discussions.
But we need to define what stability means in this case. Unit tests? They are
a pre-commit criterion and should never fail.
Big applications? Which applications? Which scenarios?
If we add new features, the application list should grow.
This needs discussions and so on.



I propose having this discussion only once and deriving a solution that works
for a long time.
The global releases we will have (Java5, Java6) are something special.
The minor releases could be done automatically without additional
discussions: for example, we can release a snapshot every quarter.
If the trunk is frozen for new features (the last month of every quarter), the
only way to contribute is to fix bugs and improve the quality of existing
code. IMO any applications and any tests could be used here. Unit tests or
some application tests could be announced as showstoppers for a release, of
course.

--
Mikhail Fursov


Re: [DRLVM] General stability

2006-11-07 Thread Alexey Petrenko

2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>:

If our classlib is not 100% compatible we can't stop addition of new
features.
The same is with VM.

Nobody is trying to freeze new features forever. At least I'm not :)


If we want to have "unnamed" milestones, the solution could be: every last
month of a quarter is a "stability" period. No new features are accepted
during this period. This could work for a long period of time without the
need for additional discussions.

But we need to define what stability means in this case. Unit tests? They are
a pre-commit criterion and should never fail.
Big applications? Which applications? Which scenarios?
If we add new features, the application list should grow.
This needs discussions and so on.

SY, Alexey


On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:
>
> 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>:
> > On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:
> > >
> > > Probably it's time to create some release plan :)
> > >
> > So let's start this discussion?
> > >
> > Good idea!
> > The only release I can imagine is Harmony Java5SE 100% compatible.
> To be Java5SE 100% compatible we need TCK first.
> So we could think about some less impressive goal for the first release :)
>
> SY, Alexey


Re: [DRLVM] General stability

2006-11-07 Thread Mikhail Fursov

If our classlib is not 100% compatible we can't stop addition of new
features.
The same is with VM.

If we want to have "unnamed" milestones, the solution could be: every last
month of a quarter is a "stability" period. No new features are accepted
during this period. This could work for a long period of time without the
need for additional discussions.
?

On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:


2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>:
> On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:
> >
> > Probably it's time to create some release plan :)
> >
> So let's start this discussion?
> >
> Good idea!
> The only release I can imagine is Harmony Java5SE 100% compatible.
To be Java5SE 100% compatible we need TCK first.
So we could think about some less impressive goal for the first release :)

SY, Alexey





--
Mikhail Fursov


Re: [DRLVM] General stability

2006-11-07 Thread Alexey Petrenko

2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>:

On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:
>
> Probably it's time to create some release plan :)
>
So let's start this discussion?
>
Good idea!
The only release I can imagine is Harmony Java5SE 100% compatible.

To be Java5SE 100% compatible we need TCK first.
So we could think about some less impressive goal for the first release :)

SY, Alexey


Re: [DRLVM] General stability

2006-11-07 Thread Mikhail Fursov

On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote:


Probably it's time to create some release plan :)


So let's start this discussion?



Good idea!
The only release I can imagine is Harmony Java5SE 100% compatible. We can
declare a feature freeze 2 or 3 months before that date, after we have all
features in classlib and VM implemented.
If this proposal is OK, we will have a feature freeze only in Q1 2007... ?

--
Mikhail Fursov


Re: [DRLVM] General stability

2006-11-07 Thread Alexey Petrenko

2006/11/4, Weldon Washburn <[EMAIL PROTECTED]>:

Folks,

I have spent the last two months committing patches to the VM.  While we
have added a ton of much needed functionality, the stability of the system
has been ignored.  By chance, I looked at thread synchronization design
problems this week.  It's very apparent that we lack the regression testing
to really find threading bugs, test the fixes and test against regression.
No doubt there are similar problems in other VM subsystems.   "build test"
is necessary but not sufficient for where we need to go.  In a sense,
committing code with only "build test" to prevent regression is the
equivalent to flying in the fog without instrumentation.

So that we can get engineers focused on stability, I am thinking of coding
the JIRAs that involve new features as "later" or even "won't fix".  Please
feel free to comment.

I prefer to move an issue to the next milestone or release if I'm not
sure that it won't break the stability of the product and we are in a
"near release" phase.

Probably it's time to create some release plan :)
I think it will make many things much easier. If we have a list
of features for a release and a release date, it will be much easier to
decide whether we want to apply a fix or postpone it to the next release.

So let's start this discussion?

SY, Alexey


We also need to restart the old email threads on regression tests.  For
example, we need some sort of automated test script that runs Eclipse and
tomcat, etc. in a deterministic fashion so that we can compare test
results.  It does not have to be perfect for starts, just repeatable and
easy to use.  Feel free to beat me to starting these threads :)

--
Weldon Washburn
Intel Enterprise Solutions Software Division




Re: [DRLVM] General stability

2006-11-07 Thread Egor Pasko
On the 0x21B day of Apache Harmony Oleg Oleinik wrote:
> Such model works but there is a risk of fixing again "from scratch" those
> bugs which were fixed once on the previous milestones.

sometimes it is easier to fix a couple of bugs "from scratch" than to
spend a large amount of resources on regular complex checks (which also
do not guarantee 100% stability)

> We can eliminate this if we follow a "no regression" policy - if something
> works (classlib unit tests, Tomcat or Eclipse Unit Tests pass 100%, for
> example), it should continue working - any regression is a subject for
> reporting and fixing as soon as possible (it is easier to find the root
> cause and fix since we will know which commit caused the regression).
> 
> Will this model work? Isn't it a little bit better than focusing on runtime
> stability periodically?

A "no regression" policy should apply to a number of *small* tests
that are easy to run and run fast, making them good
pre-commit criteria.

Complex workloads _cannot_ be run as a pre-commit criterion. So there
_should be regressions_. That's because:
* we cannot afford to run them as pre-commit
* we cannot afford complex rollbacks and stop-the-world

Many successful projects (probably, all of them) have stability
periods, even stability releases (and, yes, stability branches). That
is considered effective. And IMO our project should act the same.

We _have to_ allow some bugs in order to continue active development. But not
too many. It is always a tradeoff.

To summarize: I support your idea to improve the regression test base
and infrastructure. Let it be a step-by-step improvement. Then we can
decide which tests to run as pre-commit and which are to measure the
overall stability.

> On 11/8/06, Tim Ellison <[EMAIL PROTECTED]> wrote:
> >
> > I wouldn't go so far as to label issues as "won't fix" unless they are
> > really high risk and low value items.
> >
> > It's useful to go through a stabilization period where the focus is on
> > getting the code solid again and delaying significant new functionality
> > until it is achieved.  A plan that aims to deliver stable milestones on
> > regular periods is, in my experience, a good way to focus the
> > development effort.
> >
> > Regards,
> > Tim
> >
> > Weldon Washburn wrote:
> > > Folks,
> > >
> > > I have spent the last two months committing patches to the VM.  While we
> > > have added a ton of much needed functionality, the stability of the system
> > > has been ignored.  By chance, I looked at thread synchronization design
> > > problems this week.  It's very apparent that we lack the regression testing
> > > to really find threading bugs, test the fixes and test against regression.
> > > No doubt there are similar problems in other VM subsystems.  "build test"
> > > is necessary but not sufficient for where we need to go.  In a sense,
> > > committing code with only "build test" to prevent regression is the
> > > equivalent to flying in the fog without instrumentation.
> > >
> > > So that we can get engineers focused on stability, I am thinking of coding
> > > the JIRAs that involve new features as "later" or even "won't fix".  Please
> > > feel free to comment.
> > >
> > > We also need to restart the old email threads on regression tests.  For
> > > example, we need some sort of automated test script that runs Eclipse and
> > > tomcat, etc. in a deterministic fashion so that we can compare test
> > > results.  It does not have to be perfect for starts, just repeatable and
> > > easy to use.  Feel free to beat me to starting these threads :)
> > >
> >
> > --
> >
> > Tim Ellison ([EMAIL PROTECTED])
> > IBM Java technology centre, UK.
> >

-- 
Egor Pasko
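[Editorial note: the pre-commit vs. nightly split Egor describes can be wired up as a trivial gate script. The sketch below is purely illustrative - the FAST_SUITE command is a placeholder, not an actual Harmony build target.]

```shell
#!/bin/sh
# Sketch of a pre-commit gate: run only the small, fast suites before a
# commit; leave heavy workloads (Eclipse, Tomcat) to nightly runs.
# FAST_SUITE is a placeholder - in Harmony it might be "ant test" or the
# c-unit/smoke suites.
FAST_SUITE="true"   # placeholder command; substitute the real fast suite

if $FAST_SUITE; then
    echo "fast suite passed - OK to commit"
else
    echo "fast suite FAILED - fix before committing" >&2
    exit 1
fi
```

With the placeholder command this prints "fast suite passed - OK to commit"; a failing suite aborts with a nonzero exit code, which is what a hook mechanism would key on.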



Re: [DRLVM] General stability

2006-11-07 Thread Oleg Oleinik

Such model works but there is a risk of fixing again "from scratch" those
bugs which were fixed once on the previous milestones.

We can eliminate this if we follow a "no regression" policy - if something works
(classlib unit tests, Tomcat or Eclipse Unit Tests pass 100%, for example),
it should continue working - any regression is a subject for reporting and
fixing as soon as possible (it is easier to find the root cause and fix since we
will know which commit caused the regression).

Will this model work? Isn't it a little bit better than focusing on runtime
stability periodically?


On 11/8/06, Tim Ellison <[EMAIL PROTECTED]> wrote:


I wouldn't go so far as to label issues as "won't fix" unless they are
really high risk and low value items.

It's useful to go through a stabilization period where the focus is on
getting the code solid again and delaying significant new functionality
until it is achieved.  A plan that aims to deliver stable milestones on
regular periods is, in my experience, a good way to focus the
development effort.

Regards,
Tim

Weldon Washburn wrote:
> Folks,
>
> I have spent the last two months committing patches to the VM.  While we
> have added a ton of much needed functionality, the stability of the system
> has been ignored.  By chance, I looked at thread synchronization design
> problems this week.  It's very apparent that we lack the regression testing
> to really find threading bugs, test the fixes and test against regression.
> No doubt there are similar problems in other VM subsystems.  "build test"
> is necessary but not sufficient for where we need to go.  In a sense,
> committing code with only "build test" to prevent regression is the
> equivalent to flying in the fog without instrumentation.
>
> So that we can get engineers focused on stability, I am thinking of coding
> the JIRAs that involve new features as "later" or even "won't fix".  Please
> feel free to comment.
>
> We also need to restart the old email threads on regression tests.  For
> example, we need some sort of automated test script that runs Eclipse and
> tomcat, etc. in a deterministic fashion so that we can compare test
> results.  It does not have to be perfect for starts, just repeatable and
> easy to use.  Feel free to beat me to starting these threads :)
>

--

Tim Ellison ([EMAIL PROTECTED])
IBM Java technology centre, UK.



Re: [DRLVM] General stability

2006-11-07 Thread Oleg Oleinik

> "build test" is necessary but not sufficient for where we need to go.  In
> a sense, committing code with only "build test" to prevent regression is the
> equivalent to flying in the fog without instrumentation.

Right.

Here I see two aspects - creating VM/JIT regression tests, and running various
tests regularly / tracking regressions in a timely manner.

1. VM/JIT regression test development.

We can start with these guidelines:

Who creates/integrates a regression test? - the committer.
How? - typically, from the bug's reproducer code.
Where? - drlvm\trunk\src\test\vm\\harmony-\ or
drlvm\trunk\src\test\jit\\harmony-\
Format: JUnit, Jasmin
When? - consider creating a regression test for each fixed bug against
Harmony DRLVM or JIT, if reasonable and technically possible.
A regression test may be omitted if:
- a bug in documentation, build, test, a code comment, code style, or an
enhancement is fixed;
- a performance-related bug is fixed;
- the bug is found by an existing Harmony test (Harmony regression or unit
test) // to avoid duplication of tests

Try to use public APIs to keep the tests implementation-independent and
stable.

This test suite can/should be used in pre-commit testing.
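[Editorial note: as an illustration of the guideline above, a threading regression test distilled from a hypothetical bug reproducer might look like the sketch below. It is plain Java with a main() so it is self-contained; in the tree it would follow the JUnit format and directory layout Oleg describes. The class name and thread/iteration counts are invented, not from the Harmony suite.]

```java
// Hypothetical threading regression test: many threads hammer a
// synchronized counter; a broken monitor implementation in the VM would
// lose updates, which is exactly the kind of bug discussed in this thread.
public class MonitorRegressionTest {
    private int counter = 0;

    private synchronized void increment() {
        counter++;   // must be atomic under a correct monitor implementation
    }

    /** Returns true iff no increments were lost. */
    public boolean run() {
        final int THREADS = 8;
        final int ITERATIONS = 10000;
        Thread[] workers = new Thread[THREADS];
        for (int i = 0; i < THREADS; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < ITERATIONS; j++) {
                        increment();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();          // wait for every worker to finish
            } catch (InterruptedException e) {
                return false;      // treat interruption as a failure
            }
        }
        return counter == THREADS * ITERATIONS;
    }

    public static void main(String[] args) {
        // prints PASS on a correct VM, FAIL if updates were lost
        System.out.println(new MonitorRegressionTest().run() ? "PASS" : "FAIL");
    }
}
```

Because a race only manifests sometimes, such a test is most useful when run repeatedly, which ties into the stability-run discussion elsewhere in this thread.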


2. Timely regression tracking via test execution.

Now we have a solution for code integrity testing via CruiseControl
(buildtest/trunk/CC + HARMONY-995) - a good starting point for tracking
regressions hourly using the classlib unit tests and "build test" on your
specific platform.

I see HARMONY-2038, which is about automating Eclipse Unit Tests execution
in the context of the buildtest/ infrastructure - starting to run EUT regularly
using buildtest/ (say, nightly) and reporting new test failures (and the
commits that caused them) in a timely manner.

I think it would be great if we continued adding more automation scripts
to buildtest/ for various public unit test suites and scenarios (Derby,
Tomcat, etc.), so that one could take the automation scripts from buildtest/,
run them regularly on a specific platform, and report regressions promptly
(which is what Nina, I suppose, is going to do with EUT).



On 11/4/06, Weldon Washburn <[EMAIL PROTECTED]> wrote:


Folks,

I have spent the last two months committing patches to the VM.  While we
have added a ton of much needed functionality, the stability of the system
has been ignored.  By chance, I looked at thread synchronization design
problems this week.  It's very apparent that we lack the regression testing
to really find threading bugs, test the fixes and test against regression.
No doubt there are similar problems in other VM subsystems.   "build test"
is necessary but not sufficient for where we need to go.  In a sense,
committing code with only "build test" to prevent regression is the
equivalent to flying in the fog without instrumentation.

So that we can get engineers focused on stability, I am thinking of coding
the JIRAs that involve new features as "later" or even "won't fix".  Please
feel free to comment.

We also need to restart the old email threads on regression tests.  For
example, we need some sort of automated test script that runs Eclipse and
tomcat, etc. in a deterministic fashion so that we can compare test
results.  It does not have to be perfect for starts, just repeatable and
easy to use.  Feel free to beat me to starting these threads :)

--
Weldon Washburn
Intel Enterprise Solutions Software Division




Re: [DRLVM] General stability

2006-11-07 Thread Rana Dasgupta

+1

This makes more sense than putting a sudden unannounced stop to new
features. Possibly Weldon means "on a case by case basis after risk
assessment". However, announced stability periods at the end of milestones
are a good practice. For example, since we have been focusing on development
for a while, and there are many open JIRA bugs, we could round off the
year with a "no new features, only bugfixing" period of 3-4
weeks starting December 1. We can fork and create a
development branch to support some new work during this period. Going
forward, we can try to plan for milestones and repeat this
plan-development-stabilization process.

Thanks,
Rana



On 11/7/06, Tim Ellison <[EMAIL PROTECTED]> wrote:


I wouldn't go so far as to label issues as "won't fix" unless they are
really high risk and low value items.

It's useful to go through a stabilization period where the focus is on
getting the code solid again and delaying significant new functionality
until it is achieved.  A plan that aims to deliver stable milestones on
regular periods is, in my experience, a good way to focus the
development effort.

Regards,
Tim




Re: [DRLVM] General stability

2006-11-07 Thread Tim Ellison
I wouldn't go so far as to label issues as "won't fix" unless they are
really high risk and low value items.

It's useful to go through a stabilization period where the focus is on
getting the code solid again and delaying significant new functionality
until it is achieved.  A plan that aims to deliver stable milestones on
regular periods is, in my experience, a good way to focus the
development effort.

Regards,
Tim

Weldon Washburn wrote:
> Folks,
> 
> I have spent the last two months committing patches to the VM.  While we
> have added a ton of much needed functionality, the stability of the system
> has been ignored.  By chance, I looked at thread synchronization design
> problems this week.  It's very apparent that we lack the regression testing
> to really find threading bugs, test the fixes and test against regression.
> No doubt there are similar problems in other VM subsystems.   "build test"
> is necessary but not sufficient for where we need to go.  In a sense,
> committing code with only "build test" to prevent regression is the
> equivalent to flying in the fog without instrumentation.
> 
> So that we can get engineers focused on stability, I am thinking of coding
> the JIRAs that involve new features as "later" or even "won't fix".  Please
> feel free to comment.
> 
> We also need to restart the old email threads on regression tests.  For
> example, we need some sort of automated test script that runs Eclipse and
> tomcat, etc. in a deterministic fashion so that we can compare test
> results.  It does not have to be perfect for starts, just repeatable and
> easy to use.  Feel free to beat me to starting these threads :)
> 

-- 

Tim Ellison ([EMAIL PROTECTED])
IBM Java technology centre, UK.


Re: [DRLVM] General stability

2006-11-07 Thread Vladimir Ivanov

On 11/7/06, Alexey Varlamov <[EMAIL PROTECTED]> wrote:



But do we have the needed scripts/tools readily available to run and
analyze such stability testing? I'm also pretty sure the existing c-unit
and smoke tests would help to reveal certain problems if run
repeatedly - we just need to add this stuff to CC and run it nightly.
Anybody volunteer?
And yet there are a lot of excluded tests in the smoke suite...




Actually, we have one. The task 'ant test' from the drlvm module is running
under CC on Linux and Windows boxes. Everyone can easily reproduce it
(check out the 'buildtest' module and run it; an updated version is available
in issue 995).

The problem is: CC will only be useful for tracking regressions. While we
have failing tests, they should be fixed ASAP. At present, some issues
that prevent successful CC runs have waited for integration for more than a
month :(

thanks, Vladimir
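[Editorial note: for readers unfamiliar with the CC setup being discussed, a nightly drlvm project entry could look roughly like the following sketch. The element names follow CruiseControl conventions, but the paths, schedule, and addresses are invented, not taken from buildtest/trunk/CC.]

```xml
<!-- Hypothetical sketch of a nightly CruiseControl project for drlvm;
     paths, interval, and addresses are illustrative only. -->
<project name="drlvm-nightly">
  <modificationset quietperiod="60">
    <!-- rebuild only when new commits appear in the working copy -->
    <svn localWorkingCopy="drlvm/trunk"/>
  </modificationset>
  <schedule interval="86400">        <!-- once a day -->
    <ant buildfile="drlvm/trunk/build.xml" target="test"/>
  </schedule>
  <publishers>
    <!-- mail new failures so regressions are caught in a timely manner -->
    <htmlemail mailhost="localhost" returnaddress="cc@harmony.example"/>
  </publishers>
</project>
```

The point of the sketch is the split Vladimir describes: CC only pays off for tracking regressions once the baseline suites pass cleanly.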


Re: [DRLVM] General stability

2006-11-06 Thread Alexey Varlamov

2006/11/7, Gregory Shimansky <[EMAIL PROTECTED]>:

Weldon Washburn wrote:
> Folks,
>
> I have spent the last two months committing patches to the VM.  While we
> have added a ton of much needed functionality, the stability of the system
> has been ignored.  By chance, I looked at thread synchronization design
> problems this week.  It's very apparent that we lack the regression testing
> to really find threading bugs, test the fixes and test against regression.
> No doubt there are similar problems in other VM subsystems.   "build test"
> is necessary but not sufficient for where we need to go.  In a sense,
> committing code with only "build test" to prevent regression is the
> equivalent to flying in the fog without instrumentation.
>
> So that we can get engineers focused on stability, I am thinking of coding
> the JIRAs that involve new features as "later" or even "won't fix".  Please
> feel free to comment.
>
> We also need to restart the old email threads on regression tests.  For
> example, we need some sort of automated test script that runs Eclipse and
> tomcat, etc. in a deterministic fashion so that we can compare test
> results.  It does not have to be perfect for starts, just repeatable and
> easy to use.  Feel free to beat me to starting these threads :)

In my experience working with drlvm, stability problems are often
discovered by the existing VM acceptance tests. Big applications like
eclipse or tomcat with long workloads usually reveal problems like lack
of class unloading, unless they crash on something like threading
problems. The acceptance VM tests that we have already are a good start
for testing stability if they are run nonstop many times.


Gregory,

But do we have the needed scripts/tools readily available to run and
analyze such stability testing? I'm also pretty sure the existing c-unit
and smoke tests would help to reveal certain problems if run
repeatedly - we just need to add this stuff to CC and run it nightly.
Anybody volunteer?
And yet there are a lot of excluded tests in the smoke suite...

--
Alexey


I don't say that we shouldn't have real application workloads. I just
want to say that the acceptance tests already reveal threading
problems quite well if they are run many times, since race conditions
happen only in some circumstances.

However, at the moment we already have failing tests; some of them, like
gc.LOS on WinXP, don't need multiple runs to fail. There's
also java.lang.ThreadTest, which fails for me on Windows 2003 Server SP1
and has now started to fail on Linux as well.

--
Gregory




Re: [DRLVM] General stability

2006-11-06 Thread Gregory Shimansky

Weldon Washburn wrote:

Folks,

I have spent the last two months committing patches to the VM.  While we
have added a ton of much needed functionality, the stability of the system
has been ignored.  By chance, I looked at thread synchronization design
problems this week.  It's very apparent that we lack the regression testing
to really find threading bugs, test the fixes and test against regression.
No doubt there are similar problems in other VM subsystems.   "build test"
is necessary but not sufficient for where we need to go.  In a sense,
committing code with only "build test" to prevent regression is the
equivalent to flying in the fog without instrumentation.

So that we can get engineers focused on stability, I am thinking of coding
the JIRAs that involve new features as "later" or even "won't fix".  Please
feel free to comment.

We also need to restart the old email threads on regression tests.  For
example, we need some sort of automated test script that runs Eclipse and
tomcat, etc. in a deterministic fashion so that we can compare test
results.  It does not have to be perfect for starts, just repeatable and
easy to use.  Feel free to beat me to starting these threads :)


In my experience working with drlvm, stability problems are often 
discovered by the existing VM acceptance tests. Big applications like 
eclipse or tomcat with long workloads usually reveal problems like lack 
of class unloading, unless they crash on something like threading 
problems. The acceptance VM tests that we have already are a good start 
for testing stability if they are run nonstop many times.


I don't say that we shouldn't have real application workloads. I just 
want to say that the acceptance tests already reveal threading 
problems quite well if they are run many times, since race conditions 
happen only in some circumstances.


However, at the moment we already have failing tests; some of them, like 
gc.LOS on WinXP, don't need multiple runs to fail. There's 
also java.lang.ThreadTest, which fails for me on Windows 2003 Server SP1 
and has now started to fail on Linux as well.


--
Gregory
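[Editorial note: Gregory's point about running the acceptance tests nonstop can be scripted with a small repeat-runner. This sketch is illustrative only - TEST_CMD is a placeholder for whatever suite (e.g. "ant test") one wants to hammer, and the run count is arbitrary.]

```shell
#!/bin/sh
# Re-run a test command many times to shake out intermittent threading
# failures that only show up in some circumstances.
TEST_CMD="true"     # placeholder for the real (possibly flaky) test command
RUNS=100
i=1
failed=0
while [ "$i" -le "$RUNS" ]; do
    if ! $TEST_CMD; then
        failed=$i           # remember the first failing iteration
        break
    fi
    i=$((i + 1))
done
if [ "$failed" -eq 0 ]; then
    echo "all $RUNS iterations passed"
else
    echo "FAILED on iteration $failed"
fi
```

With the placeholder command this prints "all 100 iterations passed"; substituting a real suite turns it into the kind of stability loop discussed above.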



Re: [DRLVM] General stability

2006-11-04 Thread Geir Magnusson Jr.



Weldon Washburn wrote:

Folks,

I have spent the last two months committing patches to the VM.  While we
have added a ton of much needed functionality, the stability of the system
has been ignored.  By chance, I looked at thread synchronization design
problems this week.  It's very apparent that we lack the regression testing
to really find threading bugs, test the fixes and test against regression.
No doubt there are similar problems in other VM subsystems.   "build test"
is necessary but not sufficient for where we need to go.  In a sense,
committing code with only "build test" to prevent regression is the
equivalent to flying in the fog without instrumentation.

So that we can get engineers focused on stability, I am thinking of coding
the JIRAs that involve new features as "later" or even "won't fix".  Please
feel free to comment.


Please don't, without discussion on each one.  I see no reason to do that 
right now: while I 100% agree that stabilization is a necessary goal, we're 
nowhere near the point where we need to call a halt to adding new 
stuff that's missing.




We also need to restart the old email threads on regression tests.  For
example, we need some sort of automated test script that runs Eclipse and
tomcat, etc. in a deterministic fashion so that we can compare test
results.  It does not have to be perfect for starts, just repeatable and
easy to use.  Feel free to beat me to starting these threads :)


Feel free to add a script to the build-test framework :)

geir





[DRLVM] General stability

2006-11-03 Thread Weldon Washburn

Folks,

I have spent the last two months committing patches to the VM.  While we
have added a ton of much needed functionality, the stability of the system
has been ignored.  By chance, I looked at thread synchronization design
problems this week.  It's very apparent that we lack the regression testing
to really find threading bugs, test the fixes and test against regression.
No doubt there are similar problems in other VM subsystems.   "build test"
is necessary but not sufficient for where we need to go.  In a sense,
committing code with only "build test" to prevent regression is the
equivalent to flying in the fog without instrumentation.

So that we can get engineers focused on stability, I am thinking of coding
the JIRAs that involve new features as "later" or even "won't fix".  Please
feel free to comment.

We also need to restart the old email threads on regression tests.  For
example, we need some sort of automated test script that runs Eclipse and
tomcat, etc. in a deterministic fashion so that we can compare test
results.  It does not have to be perfect for starts, just repeatable and
easy to use.  Feel free to beat me to starting these threads :)

--
Weldon Washburn
Intel Enterprise Solutions Software Division