Re: [testing] Tests scores on http://harmonytest.org Was: [DRLVM] General stability
Alexei, I like your approach to result comparison. 10% can be a default value for some form field - anyone can change it if needed. OK, let's try it and see if it satisfies the community. Probably more conventional would be to parse for some metric. Does the site already parse TEST-* files? It parses TESTS-TestSuites.xml - as far as I can see, it contains the same information as the TEST-* files. Is there a common format of output for the stress tests you're talking about? Thank you, Alexei On 11/10/06, Anton Luht <[EMAIL PROTECTED]> wrote: > Hello, Alexei, > > > I have a related question. How can we improve http://harmonytest.org to > > make it possible to publish not just pass, fail, or error but numeric > > test scores? > > Easily - test results in JUnit reports have a 'time' property - > execution time in seconds. We can import and show them in the results. > What else is needed? Maybe add something like 'show regressions' to > the 'compare runs' page? For example, show tests that increased > execution time by more than 10%, sorted by increase rate descending? > > -- > Regards, > Anton Luht, > Intel Java & XML Engineering > -- Thank you, Alexei -- Regards, Anton Luht, Intel Java & XML Engineering
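A sketch of the comparison Anton describes - pulling the 'time' attribute out of a JUnit-style report and listing tests whose execution time grew by more than a configurable threshold (10% by default), sorted by increase rate descending. The class and method names here are illustrative, not part of the actual harmonytest.org code:

```java
import java.io.StringReader;
import java.util.*;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class TimeRegressions {
    /** Parse a JUnit-style report and map each test name to its execution time in seconds. */
    static Map<String, Double> parseTimes(String xml) throws Exception {
        Map<String, Double> times = new HashMap<>();
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList cases = doc.getElementsByTagName("testcase");
        for (int i = 0; i < cases.getLength(); i++) {
            Element tc = (Element) cases.item(i);
            String name = tc.getAttribute("classname") + "." + tc.getAttribute("name");
            times.put(name, Double.parseDouble(tc.getAttribute("time")));
        }
        return times;
    }

    /** Names of tests whose time grew by more than 'threshold' (e.g. 0.10 for 10%)
        between two runs, sorted by relative increase, descending. */
    static List<String> regressions(Map<String, Double> base,
                                    Map<String, Double> current,
                                    double threshold) {
        List<Map.Entry<String, Double>> grew = new ArrayList<>();
        for (Map.Entry<String, Double> e : current.entrySet()) {
            Double old = base.get(e.getKey());
            if (old != null && old > 0) {
                double increase = (e.getValue() - old) / old;
                if (increase > threshold) {
                    grew.add(new AbstractMap.SimpleEntry<>(e.getKey(), increase));
                }
            }
        }
        grew.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        List<String> names = new ArrayList<>();
        for (Map.Entry<String, Double> e : grew) names.add(e.getKey());
        return names;
    }
}
```

The same comparison would work off TESTS-TestSuites.xml, since it carries the same `testcase`/`time` data as the per-class TEST-* files.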
Re: [testing] Tests scores on http://harmonytest.org Was: [DRLVM] General stability
Anton, I like your approach to result comparison. 10% can be a default value for some form field - anyone can change it if needed. As for test execution time reported by JUnit, it is applicable for stress tests as well if we gradually increase the load over time. Though using the time field for stress tests and documentation rankings is not quite conventional. Probably more conventional would be to parse for some metric. Does the site already parse TEST-* files? Thank you, Alexei On 11/10/06, Anton Luht <[EMAIL PROTECTED]> wrote: Hello, Alexei, > I have a related question. How can we improve http://harmonytest.org to > make it possible to publish not just pass, fail, or error but numeric > test scores? Easily - test results in JUnit reports have a 'time' property - execution time in seconds. We can import and show them in the results. What else is needed? Maybe add something like 'show regressions' to the 'compare runs' page? For example, show tests that increased execution time by more than 10%, sorted by increase rate descending? -- Regards, Anton Luht, Intel Java & XML Engineering -- Thank you, Alexei
Re: [testing] Tests scores on http://harmonytest.org Was: [DRLVM] General stability
Hello, Alexei, I have a related question. How can we improve http://harmonytest.org to make it possible to publish not just pass, fail, or error but numeric test scores? Easily - test results in JUnit reports have a 'time' property - execution time in seconds. We can import and show them in the results. What else is needed? Maybe add something like 'show regressions' to the 'compare runs' page? For example, show tests that increased execution time by more than 10%, sorted by increase rate descending? -- Regards, Anton Luht, Intel Java & XML Engineering
Re: [DRLVM] General stability
Egor Pasko wrote: On the 0x21C day of Apache Harmony Mikhail Loenko wrote: 2006/11/9, Geir Magnusson Jr. <[EMAIL PROTECTED]>: I do think of us having a 'zero regression' policy except in cases where we make the explicit decision to break. (like we did with TM, for example) +1 for 'zero regression' unless explicitly agreed I am +1 for "'zero regression' unless explicitly agreed" at least for (c/jit-)unit, kernel, classlib tests. Do you think of including Eclipse UT, DaCapo, etc. in the 'zero regression' set? That would probably be good, but I am afraid of too heavy checks for each small patch. This wouldn't be done by each committer - it's too heavy. We'd have CI running continuously to tell us when we broke something. Our commitment would be to fix things that CI found immediately... geir Thanks, Mikhail I hesitate to say that again, but we also need to decide about the VM we will use for that release. I like the following mission: "Class library and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is challenging, understandable and short enough. Well, we'll need Windows XP and RHEL as well. Writing down this mission certainly shouldn't inhibit individuals from achieving other goals at Harmony. But it would help the rest of the community to concentrate on the common task. 1. http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_ on With best regards, Alexei Fedotov, Intel Java & XML Engineering -Original Message- From: Alexey Petrenko [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 08, 2006 10:36 AM To: harmony-dev@incubator.apache.org Subject: Re: [DRLVM] General stability 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>: On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: Probably it's time to create some release plan :) So let's start this discussion? Good idea! The only release I can imagine is Harmony Java5SE 100% compatible. To be Java5SE 100% compatible we need TCK first. 
So we could think about some less impressive goal for the first release :) SY, Alexey
Re: [DRLVM] General stability
On the 0x21C day of Apache Harmony Mikhail Loenko wrote: > 2006/11/9, Geir Magnusson Jr. <[EMAIL PROTECTED]>: > > > > > > Fedotov, Alexei A wrote: > > > Alexey Petrenko wrote, > > >> The only release I can imagine is Harmony Java5SE 100% compatible. > > >> To be Java5SE 100% compatible we need TCK first. > > > > > > +1 > > > > > > > Yes - and I still think that talk of a release is a bit premature right now. > > > > The key things that I believe we need to focus on are > > > > a) stability and > > > > b) completeness. > > > > c) reliability (which may be 'stability') > > > > (and not always in that order :) > > > > > > Things I'd like to see us do : > > > > 1) We need to drive to fully working unit tests for both DRLVM and > > classlib (using DRLVM). Great progress has been made in this area, and > > we should probably make this a "campaign" for DRLVM as we did for > > classlib. > > > > 2) Add stress tests > > > > 3) Get our CC-based build-test framework patched and running on as many > > platforms as possible, reporting breakage into the list. > > > > 4) Identify problem areas and focus on them. For example, threading in > > DRLVM... > > > > I do think of us having a 'zero regression' policy except in cases where > > we make the explicit decision to break. (like we did with TM, for example) > > +1 for 'zero regression' unless explicitly agreed I am +1 for "'zero regression' unless explicitly agreed" at least for (c/jit-)unit, kernel, classlib tests. Do you think of including Eclipse UT, DaCapo, etc. in the 'zero regression' set? That would probably be good, but I am afraid of too heavy checks for each small patch. > Thanks, > Mikhail > > > > > > > > I hesitate to say that again, but we also need to decide about VM we > > > will use for that release. I like the following mission: "Class library > > > and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is > > > challenging, understandable and short enough. 
> > > > Well, we'll need Windows XP and RHEL as well. > > > > > > > > > > Writing down this mission certainly shouldn't inhibit individuals from > > > achieving other goals at Harmony. But it would help the rest of > > > community to concentrate on the common task. > > > > > > 1. > > > http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_ > > > on > > > > > > With best regards, > > > Alexei Fedotov, > > > Intel Java & XML Engineering > > > > > >> -Original Message- > > >> From: Alexey Petrenko [mailto:[EMAIL PROTECTED] > > >> Sent: Wednesday, November 08, 2006 10:36 AM > > >> To: harmony-dev@incubator.apache.org > > >> Subject: Re: [DRLVM] General stability > > >> > > >> 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>: > > >>> On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: > > >>>> Probably it's time to create some release plan :) > > >>>> > > >>> So let's start this discussion? > > >>> Good idea! > > >>> The only release I can imagine is Harmony Java5SE 100% compatible. > > >> To be Java5SE 100% compatible we need TCK first. > > >> So we could think about some less impressive goal for the first release > > > :) > > >> SY, Alexey > > > > > > > > -- Egor Pasko
Tests scores on http://harmonytest.org Was: [DRLVM] General stability
Geir, I like the overall letter. Anton, I have a related question. How can we improve http://harmonytest.org to make it possible to publish not just pass, fail, or error but numeric test scores? How is this related to the letter? I believe that the stress tests which were mentioned in the letter may have scores the way performance tests do, see http://incubator.apache.org/harmony/subcomponents/stresstest/index.html BTW, Egor Pasko requested scores to report documentation quality. With best regards, Alexei Fedotov, Intel Java & XML Engineering >-Original Message- >From: Geir Magnusson Jr. [mailto:[EMAIL PROTECTED] >Sent: Thursday, November 09, 2006 4:51 AM >To: harmony-dev@incubator.apache.org >Subject: Re: [DRLVM] General stability > > > >Fedotov, Alexei A wrote: >> Alexey Petrenko wrote, >>> The only release I can imagine is Harmony Java5SE 100% compatible. >>> To be Java5SE 100% compatible we need TCK first. >> >> +1 >> > >Yes - and I still think that talk of a release is a bit premature right >now. > >The key things that I believe we need to focus on are > > a) stability and > > b) completeness. > > c) reliability (which may be 'stability') > >(and not always in that order :) > > >Things I'd like to see us do : > >1) We need to drive to fully working unit tests for both DRLVM and >classlib (using DRLVM). Great progress has been made in this area, and > we should probably make this a "campaign" for DRLVM as we did for >classlib. > >2) Add stress tests > >3) Get our CC-based build-test framework patched and running on as many >platforms as possible, reporting breakage into the list. > >4) Identify problem areas and focus on them. For example, threading in >DRLVM... > >I do think of us having a 'zero regression' policy except in cases where >we make the explicit decision to break. (like we did with TM, for example) > > >> I hesitate to say that again, but we also need to decide about VM we >> will use for that release. 
I like the following mission: "Class library >> and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is >> challenging, understandable and short enough. > >Well, we'll need Windows XP and RHEL as well. > > >> >> Writing down this mission certainly shouldn't inhibit individuals from >> achieving other goals at Harmony. But it would help the rest of >> community to concentrate on the common task. >> >> 1. >> http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_ >> on >> >> With best regards, >> Alexei Fedotov, >> Intel Java & XML Engineering >> >>> -Original Message- >>> From: Alexey Petrenko [mailto:[EMAIL PROTECTED] >>> Sent: Wednesday, November 08, 2006 10:36 AM >>> To: harmony-dev@incubator.apache.org >>> Subject: Re: [DRLVM] General stability >>> >>> 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>: >>>> On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: >>>>> Probably it's time to create some release plan :) >>>>> >>>> So let's start this discussion? >>>> Good idea! >>>> The only release I can imagine is Harmony Java5SE 100% compatible. >>> To be Java5SE 100% compatible we need TCK first. >>> So we could think about some less impressive goal for the first release >> :) >>> SY, Alexey >>
Re: [DRLVM] General stability
2006/11/9, Geir Magnusson Jr. <[EMAIL PROTECTED]>: Fedotov, Alexei A wrote: > Alexey Petrenko wrote, >> The only release I can imagine is Harmony Java5SE 100% compatible. >> To be Java5SE 100% compatible we need TCK first. > > +1 > Yes - and I still think that talk of a release is a bit premature right now. The key things that I believe we need to focus on are a) stability and b) completeness. c) reliability (which may be 'stability') (and not always in that order :) Things I'd like to see us do : 1) We need to drive to fully working unit tests for both DRLVM and classlib (using DRLVM). Great progress has been made in this area, and we should probably make this a "campaign" for DRLVM as we did for classlib. 2) Add stress tests 3) Get our CC-based build-test framework patched and running on as many platforms as possible, reporting breakage into the list. 4) Identify problem areas and focus on them. For example, threading in DRLVM... I do think of us having a 'zero regression' policy except in cases where we make the explicit decision to break. (like we did with TM, for example) +1 for 'zero regression' unless explicitly agreed Thanks, Mikhail > I hesitate to say that again, but we also need to decide about VM we > will use for that release. I like the following mission: "Class library > and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is > challenging, understandable and short enough. Well, we'll need Windows XP and RHEL as well. > > Writing down this mission certainly shouldn't inhibit individuals from > achieving other goals at Harmony. But it would help the rest of > community to concentrate on the common task. > > 1. 
> http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_ > on > > With best regards, > Alexei Fedotov, > Intel Java & XML Engineering > >> -Original Message- >> From: Alexey Petrenko [mailto:[EMAIL PROTECTED] >> Sent: Wednesday, November 08, 2006 10:36 AM >> To: harmony-dev@incubator.apache.org >> Subject: Re: [DRLVM] General stability >> >> 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>: >>> On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: >>>> Probably it's time to create some release plan :) >>>> >>> So let's start this discussion? >>> Good idea! >>> The only release I can imagine is Harmony Java5SE 100% compatible. >> To be Java5SE 100% compatible we need TCK first. >> So we could think about some less impressive goal for the first release > :) >> SY, Alexey >
Re: [DRLVM] General stability
The key things that I believe we need to focus on are a) stability and b) completeness. c) reliability (which may be 'stability') Just 2 cents - just to be clear on the terms "stability" and "reliability" - we started using them in this thread and I feel like we do not separate them. I propose that when we are talking about "reliability" we mean: the capability of the Harmony runtime to run given workloads correctly (as defined by the J2SE specifications) for a certain period of time. This is an interpretation of the definition given by IEEE for "reliability". There is no definition for stability in the IEEE software glossary, so I propose that when we are talking about stability we mean stability of the runtime / code over time, i.e. progressing, not regressing. In these terms I completely agree with the order. On 11/9/06, Geir Magnusson Jr. <[EMAIL PROTECTED]> wrote: Fedotov, Alexei A wrote: > Alexey Petrenko wrote, >> The only release I can imagine is Harmony Java5SE 100% compatible. >> To be Java5SE 100% compatible we need TCK first. > > +1 > Yes - and I still think that talk of a release is a bit premature right now. The key things that I believe we need to focus on are a) stability and b) completeness. c) reliability (which may be 'stability') (and not always in that order :) Things I'd like to see us do : 1) We need to drive to fully working unit tests for both DRLVM and classlib (using DRLVM). Great progress has been made in this area, and we should probably make this a "campaign" for DRLVM as we did for classlib. 2) Add stress tests 3) Get our CC-based build-test framework patched and running on as many platforms as possible, reporting breakage into the list. 4) Identify problem areas and focus on them. For example, threading in DRLVM... I do think of us having a 'zero regression' policy except in cases where we make the explicit decision to break. (like we did with TM, for example) > I hesitate to say that again, but we also need to decide about VM we > will use for that release. 
I like the following mission: "Class library > and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is > challenging, understandable and short enough. Well, we'll need Windows XP and RHEL as well. > > Writing down this mission certainly shouldn't inhibit individuals from > achieving other goals at Harmony. But it would help the rest of > community to concentrate on the common task. > > 1. > http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_ > on > > With best regards, > Alexei Fedotov, > Intel Java & XML Engineering > >> -Original Message- >> From: Alexey Petrenko [mailto:[EMAIL PROTECTED] >> Sent: Wednesday, November 08, 2006 10:36 AM >> To: harmony-dev@incubator.apache.org >> Subject: Re: [DRLVM] General stability >> >> 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>: >>> On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: >>>> Probably it's time to create some release plan :) >>>> >>> So let's start this discussion? >>> Good idea! >>> The only release I can imagine is Harmony Java5SE 100% compatible. >> To be Java5SE 100% compatible we need TCK first. >> So we could think about some less impressive goal for the first release > :) >> SY, Alexey >
Re: [DRLVM] General stability
Fedotov, Alexei A wrote: Alexey Petrenko wrote, The only release I can imagine is Harmony Java5SE 100% compatible. To be Java5SE 100% compatible we need TCK first. +1 Yes - and I still think that talk of a release is a bit premature right now. The key things that I believe we need to focus on are a) stability and b) completeness. c) reliability (which may be 'stability') (and not always in that order :) Things I'd like to see us do : 1) We need to drive to fully working unit tests for both DRLVM and classlib (using DRLVM). Great progress has been made in this area, and we should probably make this a "campaign" for DRLVM as we did for classlib. 2) Add stress tests 3) Get our CC-based build-test framework patched and running on as many platforms as possible, reporting breakage into the list. 4) Identify problem areas and focus on them. For example, threading in DRLVM... I do think of us having a 'zero regression' policy except in cases where we make the explicit decision to break. (like we did with TM, for example) I hesitate to say that again, but we also need to decide about VM we will use for that release. I like the following mission: "Class library and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is challenging, understandable and short enough. Well, we'll need Windows XP and RHEL as well. Writing down this mission certainly shouldn't inhibit individuals from achieving other goals at Harmony. But it would help the rest of community to concentrate on the common task. 1. 
http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_ on With best regards, Alexei Fedotov, Intel Java & XML Engineering -Original Message- From: Alexey Petrenko [mailto:[EMAIL PROTECTED] Sent: Wednesday, November 08, 2006 10:36 AM To: harmony-dev@incubator.apache.org Subject: Re: [DRLVM] General stability 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>: On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: Probably it's time to create some release plan :) So let's start this discussion? Good idea! The only release I can imagine is Harmony Java5SE 100% compatible. To be Java5SE 100% compatible we need TCK first. So we could think about some less impressive goal for the first release :) SY, Alexey
RE: [DRLVM] General stability
Alexey Petrenko wrote, >The only release I can imagine is Harmony Java5SE 100% compatible. >To be Java5SE 100% compatible we need TCK first. +1 I hesitate to say that again, but we also need to decide about VM we will use for that release. I like the following mission: "Class library and DRLVM pass TCK on Ubuntu 6". I'm open for any other mission which is challenging, understandable and short enough. Writing down this mission certainly shouldn't inhibit individuals from achieving other goals at Harmony. But it would help the rest of community to concentrate on the common task. 1. http://wiki.apache.org/harmony/Platforms_to_Run_Harmony_Development_Kit_ on With best regards, Alexei Fedotov, Intel Java & XML Engineering >-Original Message- >From: Alexey Petrenko [mailto:[EMAIL PROTECTED] >Sent: Wednesday, November 08, 2006 10:36 AM >To: harmony-dev@incubator.apache.org >Subject: Re: [DRLVM] General stability > >2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>: >> On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: >> > >> > Probably it's time to create some release plan :) >> > >> So let's start this discussion? >> > >> Good idea! >> The only release I can imagine is Harmony Java5SE 100% compatible. >To be Java5SE 100% compatible we need TCK first. >So we could think about some less impressive goal for the first release :) > >SY, Alexey
Re: [DRLVM] General stability
2006/11/7, Vladimir Ivanov <[EMAIL PROTECTED]>: On 11/7/06, Alexey Varlamov <[EMAIL PROTECTED]> wrote: > > > But do we have the needed scripts/tools readily available to run and > analyze such stability testing? I'm also pretty sure existing c-unit > and smoke tests would help to reveal certain problems if run > repeatedly - we just need to add this stuff to CC and run it nightly. > Anybody volunteer? > And yet there are a lot of excluded tests in the smoke suite... Actually, we have one. The task 'ant test' from the drlvm module is running under CC on linux and windows boxes. Everyone can easily reproduce it (check out the 'buildtest' module and run it; an updated version is available in issue 995). The problem is: CC will be useful only to track regressions. While we have some failed tests, they should be fixed asap. At present, some issues that prevent successful CC runs have been waiting for integration more than 1 month :( AFAIU CC only tracks status changes between subsequent runs, right? This is not really helpful for detecting stability issues. Apparently a few race conditions are present in DRLVM thread suspension, classloading, and/or elsewhere. So one rather needs to collect statistics to spot suspicious areas. Results can be represented as %% of failures per test and groups of failures by symptoms, e.g. the same assert failed for different tests. Further analysis could breed better tests with higher failure probability... An alternative approach could be to employ a custom "stressing" test harness, running a test in several concurrent threads, etc. Although this is just a temp solution until we have thorough stress tests and more decent coverage of VM code by unit tests, the latter will hardly happen in the foreseeable future. So we should try to derive maximum benefit from the already available tests. -- Regards, Alexey thanks, Vladimir
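Alexey's "stressing" harness idea - running the same test in several concurrent threads and collecting a failure percentage rather than a single pass/fail bit - could look roughly like the sketch below. The class and method names are illustrative, not existing Harmony code; a real harness would also group failures by symptom as described above:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

public class StressRunner {
    /** Run 'test' in 'threads' concurrent threads, 'iterations' times per thread,
        and return the percentage of runs that threw any Throwable. */
    static double failureRate(Runnable test, int threads, int iterations)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger failures = new AtomicInteger();
        int total = threads * iterations;
        CountDownLatch done = new CountDownLatch(total);
        for (int i = 0; i < total; i++) {
            pool.execute(() -> {
                try {
                    test.run();
                } catch (Throwable t) {
                    // Count the failure; a fuller harness would also bucket by symptom.
                    failures.incrementAndGet();
                } finally {
                    done.countDown();
                }
            });
        }
        done.await();       // wait until every run has finished
        pool.shutdown();
        return 100.0 * failures.get() / total;
    }
}
```

Running a race-prone test this way over many iterations gives the per-test failure percentage Alexey suggests collecting, instead of the one-shot result CC reports.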
Re: [DRLVM] General stability
In general I agree with what Oleg says. Some comments are below 2006/11/8, Oleg Oleinik <[EMAIL PROTECTED]>: > "no regression" policy should be relevant to a number of *small* tests that are easy to run and are running fast, to make them good as pre-commit criteria. Actually, I'm thinking about the following model (which goes a little bit beyond pre-commit testing): **Unit testing: new feature is developed; unit (or whatever) tests are created; the tests are passing on certain platforms / runtime configurations; the feature along with tests goes into JIRA. But we welcome any contributions even without unit tests **Pre-commit testing: done by committer with agreed set of tests, typically quickly running (example: classlib unit tests + vm/jit-specific tests). Regression? - no commit, or exclude tests / ignore failures, if reasonable and agreed. **Code integrity testing: done automatically ~hourly, the same set of tests as for pre-commit testing may be used. Regression? - notify and fix asap (or even roll back changes if appropriate) or just exclude tests, if reasonable and agreed. It should be done on all the platforms we support. (And probably "support" of some platform should imply that we do integrity testing on it) Thanks, Mikhail **QA testing (say, nightly): one runs automated workloads (from buildtest/) on his platform(s) nightly (or from time to time). Regression? - for example, 3 EUT tests or some Eclipse scenario started failing after a certain commit - notify harmony-dev about the regression; then, it should be decided whether or not to stop new commits and fix the regression asap, or accept the regression and just exclude tests. A bugfixer can take automation scripts from buildtest/ and play with the failing tests or scenario. I think the more we care about regressions on an ongoing basis, the less time we will need to achieve the milestone's stability requirements, and the more sense application-enabling activities make. 
Who knows whether 2 months is sufficient for reaching the established stability level or not? Why enable something if tomorrow there is a good chance it will not work? > Many successful projects (probably, all of them) have stability periods, even stability releases (and, yes, stability branches). That is considered effective. And IMO our project should act the same. I support having milestones, releases. But, after a milestone, I don't like seeing achieved results lost. On 08 Nov 2006 12:48:48 +0600, Egor Pasko <[EMAIL PROTECTED]> wrote: > > On the 0x21B day of Apache Harmony Oleg Oleinik wrote: > > Such a model works but there is a risk of fixing again "from scratch" > those > > bugs which were fixed once on the previous milestones. > > sometimes it is easier to fix a couple of bugs "from scratch" than to > spend a large amount of resources on regular complex checks (that also > do not guarantee 100% stability) > > > We can eliminate this if we follow a "no regression" policy - if something > works > > (classlib unit tests, Tomcat or Eclipse Unit Tests pass 100%, for > example), > > it should continue working - any regression is a subject for reporting > and > > fixing as soon as possible (it is easier to find the root cause and fix > since we > > will know which commit caused the regression). > > > > Will this model work? Isn't it a little bit better than focusing on > runtime > > stability periodically? > > "no regression" policy should be relevant to a number of *small* tests > that are easy to run and are running fast, to make them good as > pre-commit criteria. > > Complex workloads _cannot_ be run as pre-commit criteria. So there > _should be regressions_. That's because: > * we cannot afford to run them as pre-commit > * we cannot afford complex rollbacks and stop-the-world > > Many successful projects (probably, all of them) have stability > periods, even stability releases (and, yes, stability branches). That > is considered effective. And IMO our project should act the same. 
> > We _have to_ allow some bugs to continue active development. But not > too many. It is always a tradeoff. > > To summarize. I support your idea to improve the regression test base > and infrastructure. Let it be a step-by-step improvement. Then we can > decide which tests to run as pre-commit and which are to measure the > overall stability. > > > On 11/8/06, Tim Ellison <[EMAIL PROTECTED]> wrote: > > > > > > I wouldn't go so far as to label issues as "won't fix" unless they are > > > really high risk and low value items. > > > > > > It's useful to go through a stabilization period where the focus is on > > > getting the code solid again and delaying significant new > functionality > > > until it is achieved. A plan that aims to deliver stable milestones > on > > > regular periods is, in my experience, a good way to focus the > > > development effort. > > > > > > Regards, > > > Tim > > > > > > Weldon Washburn wrote: > > > > Folks, > > > > > > > > I have spent the last two months committing patches to the > VM. While we > > > > h
Re: [DRLVM] General stability
"no regression" policy should be relevant to a number of *small* tests that are easy to run and are running fast, to make them good as pre-commit criteria. Actually, I'm thinking about the following model (which goes a little bit beyond pre-commit testing): **Unit testing: new feature is developed; unit (or whatever) tests are created; the tests are passing on certain platforms / runtime configurations; the feature along with tests goes into JIRA. **Pre-commit testing: done by committer with agreed set of tests, typically quickly running (example: classlib unit tests + vm/jit-specific tests). Regression? - no commit, or exclude tests / ignore failures, if reasonable and agreed. **Code integrity testing: done automatically ~hourly, the same set of tests as for pre-commit testing may be used. Regression? - notify and fix asap (or even roll back changes if appropriate) or just exclude tests, if reasonable and agreed. **QA testing (say, nightly): one runs automated workloads (from buildtest/) on his platform(s) nightly (or from time to time). Regression? - for example, 3 EUT tests or some Eclipse scenario started failing after a certain commit - notify harmony-dev about the regression; then, it should be decided whether or not to stop new commits and fix the regression asap, or accept the regression and just exclude tests. A bugfixer can take automation scripts from buildtest/ and play with the failing tests or scenario. I think the more we care about regressions on an ongoing basis, the less time we will need to achieve the milestone's stability requirements, and the more sense application-enabling activities make. 
I support having milestones, releases. But, after milestone I don't like seeing loosing achieved results. On 08 Nov 2006 12:48:48 +0600, Egor Pasko <[EMAIL PROTECTED]> wrote: On the 0x21B day of Apache Harmony Oleg Oleinik wrote: > Such model works but there is a risk of fixing again "from scratch" those > bugs which were fixed once on the previous milestones. sometimes it is easier to fix a couple of bugs "from scratch" than to spend large amount of resources on regular complex checks (that also do not guarantee 100% stability) > We can eliminate this if follow "no regression" policy - if something works > (classlib unit tests, Tomacat or Eclipse Unit Tests pass 100%, for example), > it should continue working - any regression is a subject for reporting and > fixing as soon as possible (it is easier to find root cause and fix since we > will know which commit caused regression). > > Will this model work? Isn't it a little bit better than focusing on runtime > stability periodically? "no regression" policy should be relevant to a number of *small* tests that are easy to run and are running fast, to make them good as pre-commit criteria. Complex workloads _cannot_ be run as a pre-commit criteria. So there _should be regressions_. That's because: * we cannot afford to run them as pre-commit * we cannot afford complex rollbacks and stop-the world Many successful projects (probably, all of them) have stability periods, even stability releases (and, yes, stability branches). That is considered effective. And IMO our project should act the same. We _have to_ allow some bugs to continue active development. But not too many. It is always a tradeoff. To summarize. I support you idea to improve the regression test base and infrastructure. Let it be a step-by-step improvement. Then we can decide which tests to run as pre-commit and which are to measure the overall stability. 
> On 11/8/06, Tim Ellison <[EMAIL PROTECTED]> wrote: > > > > I wouldn't go so far as to label issues as "won't fix" unless they are > > really high risk and low value items. > > > > It's useful to go through a stabilization period where the focus is on > > getting the code solid again and delaying significant new functionality > > until it is achieved. A plan that aims to deliver stable milestones on > > regular periods is, in my experience, a good way to focus the > > development effort. > > > > Regards, > > Tim > > > > Weldon Washburn wrote: > > > Folks, > > > > > > I have spent the last two months committing patches to the VM. While we > > > have added a ton of much needed functionality, the stability of the > > system > > > has been ignored. By chance, I looked at thread synchronization design > > > problems this week. Its very apparent that we lack the regression > > testing > > > to really find threading bugs, test the fixes and test against > > regression. > > > No doubt there are similar problems in other VM subsystems. "build > > test" > > > is necessary but not sufficient for wher
Re: [DRLVM] General stability
On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: > If we want to have "unnamed" milestones, the solution could be: every last > month of a quarter is a "stability" period. No new features are accepted during > this period. This could work for a long period of time without need > for additional discussions. But we need to define what stability is in this case. Unit tests? It's a precommit criterion and should never fail. Big applications? Which applications? Which scenarios? If we add some new features then the application list should grow. This needs discussions and so on. I propose making this discussion only once and deriving a solution that works for a long time. The global releases we will have (Java5, Java6) are something special. The minor releases could be done automatically without additional discussions: for example we can release a snapshot every quarter. If the trunk is frozen for new features (the last month of every quarter) the only way to contribute is to fix bugs/improve the quality of existing code. IMO any applications and any tests could be used here. Unit tests or some application tests could be announced as showstoppers for a release, of course. -- Mikhail Fursov
Re: [DRLVM] General stability
2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>: If our classlib is not 100% compatible we can't stop the addition of new features. The same is true for the VM. Nobody is trying to freeze new features forever. At least I'm not :) If we want to have "unnamed" milestones, the solution could be: every last month of a quarter is a "stability" period. No new features are accepted during this period. This could work for a long period of time without need for additional discussions. But we need to define what stability is in this case. Unit tests? It's a precommit criterion and should never fail. Big applications? Which applications? Which scenarios? If we add some new features then the application list should grow. This needs discussions and so on. SY, Alexey On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: > > 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>: > > On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: > > > > > > Probably it's time to create some release plan :) > > > > > So let's start this discussion? > > > > > Good idea! > > The only release I can imagine is Harmony Java5SE 100% compatible. > To be Java5SE 100% compatible we need TCK first. > So we could think about some less impressive goal for the first release :) > > SY, Alexey
Re: [DRLVM] General stability
If our classlib is not 100% compatible we can't stop the addition of new features. The same is true for the VM. If we want to have "unnamed" milestones, the solution could be: every last month of a quarter is a "stability" period. No new features are accepted during this period. This could work for a long period of time without need for additional discussions. On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: 2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>: > On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: > > > > Probably it's time to create some release plan :) > > > So let's start this discussion? > > > Good idea! > The only release I can imagine is Harmony Java5SE 100% compatible. To be Java5SE 100% compatible we need TCK first. So we could think about some less impressive goal for the first release :) SY, Alexey -- Mikhail Fursov
Re: [DRLVM] General stability
2006/11/8, Mikhail Fursov <[EMAIL PROTECTED]>: On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: > > Probably it's time to create some release plan :) > So let's start this discussion? > Good idea! The only release I can imagine is Harmony Java5SE 100% compatible. To be Java5SE 100% compatible we need TCK first. So we could think about some less impressive goal for the first release :) SY, Alexey
Re: [DRLVM] General stability
On 11/8/06, Alexey Petrenko <[EMAIL PROTECTED]> wrote: Probably it's time to create some release plan :) So let's start this discussion? Good idea! The only release I can imagine is Harmony Java5SE 100% compatible. We can make a feature freeze 2 or 3 months before this date, after we have all features in classlib and vm implemented. If this proposal is OK we will have a feature freeze only in Q1 2007? -- Mikhail Fursov
Re: [DRLVM] General stability
2006/11/4, Weldon Washburn <[EMAIL PROTECTED]>: Folks, I have spent the last two months committing patches to the VM. While we have added a ton of much needed functionality, the stability of the system has been ignored. By chance, I looked at thread synchronization design problems this week. It's very apparent that we lack the regression testing to really find threading bugs, test the fixes and test against regression. No doubt there are similar problems in other VM subsystems. "build test" is necessary but not sufficient for where we need to go. In a sense, committing code with only "build test" to prevent regression is the equivalent to flying in the fog without instrumentation. So that we can get engineers focused on stability, I am thinking of coding the JIRAs that involve new features as "later" or even "won't fix". Please feel free to comment. I prefer to move an issue to the next milestone or release if I'm not sure that it won't break the stability of the product and we are in a "near release" phase. Probably it's time to create some release plan :) I think it will make many things much easier. If we have a list of features for a release and a release date, it will be much easier to decide whether we want to apply a fix or postpone it to the next release. So let's start this discussion? SY, Alexey We also need to restart the old email threads on regression tests. For example, we need some sort of automated test script that runs Eclipse and Tomcat, etc. in a deterministic fashion so that we can compare test results. It does not have to be perfect for a start, just repeatable and easy to use. Feel free to beat me to starting these threads :) -- Weldon Washburn Intel Enterprise Solutions Software Division
Re: [DRLVM] General stability
On the 0x21B day of Apache Harmony Oleg Oleinik wrote: > Such a model works but there is a risk of fixing again "from scratch" those > bugs which were fixed once on the previous milestones. sometimes it is easier to fix a couple of bugs "from scratch" than to spend a large amount of resources on regular complex checks (that also do not guarantee 100% stability) > We can eliminate this if we follow a "no regression" policy - if something works > (classlib unit tests, Tomcat or Eclipse Unit Tests pass 100%, for example), > it should continue working - any regression is a subject for reporting and > fixing as soon as possible (it is easier to find the root cause and fix since we > will know which commit caused the regression). > > Will this model work? Isn't it a little bit better than focusing on runtime > stability periodically? A "no regression" policy should be relevant to a number of *small* tests that are easy to run and run fast, to make them good as pre-commit criteria. Complex workloads _cannot_ be run as a pre-commit criterion. So there _should be regressions_. That's because: * we cannot afford to run them as pre-commit * we cannot afford complex rollbacks and stop-the-world Many successful projects (probably, all of them) have stability periods, even stability releases (and, yes, stability branches). That is considered effective. And IMO our project should act the same. We _have to_ allow some bugs in order to continue active development. But not too many. It is always a tradeoff. To summarize: I support your idea to improve the regression test base and infrastructure. Let it be a step-by-step improvement. Then we can decide which tests to run as pre-commit and which are to measure the overall stability. > On 11/8/06, Tim Ellison <[EMAIL PROTECTED]> wrote: > > > > I wouldn't go so far as to label issues as "won't fix" unless they are > > really high risk and low value items. 
> > > > It's useful to go through a stabilization period where the focus is on > > getting the code solid again and delaying significant new functionality > > until it is achieved. A plan that aims to deliver stable milestones on > > regular periods is, in my experience, a good way to focus the > > development effort. > > > > Regards, > > Tim > > > > Weldon Washburn wrote: > > > Folks, > > > > > > I have spent the last two months committing patches to the VM. While we > > > have added a ton of much needed functionality, the stability of the > > system > > > has been ignored. By chance, I looked at thread synchronization design > > > problems this week. Its very apparent that we lack the regression > > testing > > > to really find threading bugs, test the fixes and test against > > regression. > > > No doubt there are similar problems in other VM subsystems. "build > > test" > > > is necessary but not sufficient for where we need to go. In a sense, > > > committing code with only "build test" to prevent regression is the > > > equivalent to flying in the fog without instrumentation. > > > > > > So that we can get engineers focused on stability, I am thinking of > > coding > > > the JIRAs that involve new features as "later" or even "won't > > fix". Please > > > feel free to comment. > > > > > > We also need to restart the old email threads on regression tests. For > > > example, we need some sort of automated test script that runs Eclipse > > and > > > tomcat, etc. in a deterministic fashion so that we can compare test > > > results. It does not have to be perfect for starts, just repeatable and > > > easy to use. Feel free to beat me to starting these threads :) > > > > > > > -- > > > > Tim Ellison ([EMAIL PROTECTED]) > > IBM Java technology centre, UK. > > -- Egor Pasko
Re: [DRLVM] General stability
Such a model works but there is a risk of fixing again "from scratch" those bugs which were fixed once on the previous milestones. We can eliminate this if we follow a "no regression" policy - if something works (classlib unit tests, Tomcat or Eclipse Unit Tests pass 100%, for example), it should continue working - any regression is a subject for reporting and fixing as soon as possible (it is easier to find the root cause and fix since we will know which commit caused the regression). Will this model work? Isn't it a little bit better than focusing on runtime stability periodically? On 11/8/06, Tim Ellison <[EMAIL PROTECTED]> wrote: I wouldn't go so far as to label issues as "won't fix" unless they are really high risk and low value items. It's useful to go through a stabilization period where the focus is on getting the code solid again and delaying significant new functionality until it is achieved. A plan that aims to deliver stable milestones on regular periods is, in my experience, a good way to focus the development effort. Regards, Tim Weldon Washburn wrote: > Folks, > > I have spent the last two months committing patches to the VM. While we > have added a ton of much needed functionality, the stability of the system > has been ignored. By chance, I looked at thread synchronization design > problems this week. It's very apparent that we lack the regression testing > to really find threading bugs, test the fixes and test against regression. > No doubt there are similar problems in other VM subsystems. "build test" > is necessary but not sufficient for where we need to go. In a sense, > committing code with only "build test" to prevent regression is the > equivalent to flying in the fog without instrumentation. > > So that we can get engineers focused on stability, I am thinking of coding > the JIRAs that involve new features as "later" or even "won't fix". Please > feel free to comment. > > We also need to restart the old email threads on regression tests. 
For > example, we need some sort of automated test script that runs Eclipse and > tomcat, etc. in a deterministic fashion so that we can compare test > results. It does not have to be perfect for starts, just repeatable and > easy to use. Feel free to beat me to starting these threads :) > -- Tim Ellison ([EMAIL PROTECTED]) IBM Java technology centre, UK.
Re: [DRLVM] General stability
"build test" is necessary but not sufficient for where we need to go. In a sense, committing code with only "build test" to prevent regression is the equivalent to flying in the fog without instrumentation. Right. Here I see 2 aspects - creating VM/JIT regression tests, and running various tests regularly / tracking regressions in a timely manner.
1. VM/JIT regression test development. We can start with these guidelines:
Who creates/integrates a regression test? - the committer.
How? - typically, from the bug's reproducer code.
Where? - drlvm\trunk\src\test\vm\\harmony-\ or drlvm\trunk\src\test\jit\\harmony-\
Format: JUnit, Jasmin.
When? - consider creating a regression test for each fixed bug against Harmony DRLVM or JIT, if reasonable and technically possible. Creation of a regression test may be omitted if:
- a bug in documentation, build, test, code comment, code style, or an enhancement is fixed;
- a performance-related bug is fixed;
- the bug is found by an existing Harmony test (Harmony regression or unit test). // to avoid duplication of tests
Try to use public API to provide implementation independence and stability of the tests. This test suite can/should be used in pre-commit testing.
2. Timely regression tracking via test execution. Now we have a solution for code integrity testing via Cruise Control (buildtest/trunk/CC + HARMONY-995) - a good starting point to track regressions hourly using classlib unit tests and "build test" on your specific platform. I see HARMONY-2038 which is about automation of Eclipse Unit Tests execution in the context of the buildtest/ infrastructure - to start running EUT regularly using buildtest/ (say, nightly) and reporting in a timely way about new test failures (and the commits causing them). 
I think, it would be great if we continue adding more automation scripts into buildtest/ for various public unit test suites and scenarios (Derby, Tomcat, etc.), so that one could take automation scripts from buildtest/ and run them regularly on his specific platform, report regressions timely (what Nina, I suppose, is going to do with EUT). On 11/4/06, Weldon Washburn <[EMAIL PROTECTED]> wrote: Folks, I have spent the last two months committing patches to the VM. While we have added a ton of much needed functionality, the stability of the system has been ignored. By chance, I looked at thread synchronization design problems this week. Its very apparent that we lack the regression testing to really find threading bugs, test the fixes and test against regression. No doubt there are similar problems in other VM subsystems. "build test" is necessary but not sufficient for where we need to go. In a sense, committing code with only "build test" to prevent regression is the equivalent to flying in the fog without instrumentation. So that we can get engineers focused on stability, I am thinking of coding the JIRAs that involve new features as "later" or even "won't fix". Please feel free to comment. We also need to restart the old email threads on regression tests. For example, we need some sort of automated test script that runs Eclipse and tomcat, etc. in a deterministic fashion so that we can compare test results. It does not have to be perfect for starts, just repeatable and easy to use. Feel free to beat me to starting these threads :) -- Weldon Washburn Intel Enterprise Solutions Software Division
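For concreteness, the "from the bug's reproducer code" guideline above could look roughly like the sketch below. Everything here is invented for illustration - the issue number (HARMONY-1234) and the reproducer are hypothetical, and the JUnit format the guidelines call for is approximated with a dependency-free main() so the sketch stays self-contained:

```java
// Sketch of a regression test for a hypothetical issue "HARMONY-1234".
// The guidelines above suggest JUnit format and naming the test after
// the fixed bug; the reproducer here is invented for illustration.
public class Harmony1234RegressionTest {

    // Hypothetical reproducer: a started thread must observe a write
    // made before Thread.start() (guaranteed by the JMM happens-before
    // edge on start()).
    static volatile int shared = 0;

    public static void main(String[] args) throws Exception {
        shared = 42;
        final int[] seen = new int[1];
        Thread t = new Thread(new Runnable() {
            public void run() { seen[0] = shared; }
        });
        t.start();
        t.join();
        if (seen[0] != 42) {
            System.out.println("FAIL: expected 42, got " + seen[0]);
            System.exit(1);
        }
        System.out.println("PASS");
    }
}
```

Note the sketch uses only public API (java.lang.Thread), in line with the implementation-independence guideline.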
Re: [DRLVM] General stability
+1 This makes more sense than putting a sudden unannounced stop to new features. Possibly Weldon means "on a case by case basis after risk assessment". However, announced stability periods at the end of milestones are a good practice. For example, since we have been focussing on development for a while, and there are many open JIRA bugs, we could consider rounding off the year with a "no new features, only bugfixing" period of 3-4 weeks starting December 1. We can fork and create a development branch to support some new work during this period. Going forward, we can try to plan for milestones and repeat this plan-development-stabilization process. Thanks, Rana On 11/7/06, Tim Ellison <[EMAIL PROTECTED]> wrote: I wouldn't go so far as to label issues as "won't fix" unless they are really high risk and low value items. It's useful to go through a stabilization period where the focus is on getting the code solid again and delaying significant new functionality until it is achieved. A plan that aims to deliver stable milestones on regular periods is, in my experience, a good way to focus the development effort. Regards, Tim
Re: [DRLVM] General stability
I wouldn't go so far as to label issues as "won't fix" unless they are really high risk and low value items. It's useful to go through a stabilization period where the focus is on getting the code solid again and delaying significant new functionality until it is achieved. A plan that aims to deliver stable milestones on regular periods is, in my experience, a good way to focus the development effort. Regards, Tim Weldon Washburn wrote: > Folks, > > I have spent the last two months committing patches to the VM. While we > have added a ton of much needed functionality, the stability of the system > has been ignored. By chance, I looked at thread synchronization design > problems this week. Its very apparent that we lack the regression testing > to really find threading bugs, test the fixes and test against regression. > No doubt there are similar problems in other VM subsystems. "build test" > is necessary but not sufficient for where we need to go. In a sense, > committing code with only "build test" to prevent regression is the > equivalent to flying in the fog without instrumentation. > > So that we can get engineers focused on stability, I am thinking of coding > the JIRAs that involve new features as "later" or even "won't fix". Please > feel free to comment. > > We also need to restart the old email threads on regression tests. For > example, we need some sort of automated test script that runs Eclipse and > tomcat, etc. in a deterministic fashion so that we can compare test > results. It does not have to be perfect for starts, just repeatable and > easy to use. Feel free to beat me to starting these threads :) > -- Tim Ellison ([EMAIL PROTECTED]) IBM Java technology centre, UK.
Re: [DRLVM] General stability
On 11/7/06, Alexey Varlamov <[EMAIL PROTECTED]> wrote: But do we have the needed scripts/tools readily available to run and analyze such stability testing? I'm also pretty sure existing c-unit and smoke tests would help to reveal certain problems if run repeatedly - we just need to add this stuff to CC and run it nightly. Anybody volunteer? And yet there are a lot of excluded tests in the smoke suite... Actually, we have one. The 'ant test' task from the drlvm module is running under CC on Linux and Windows boxes. Everyone can easily reproduce it (check out the 'buildtest' module and run it; an updated version is available in issue 995). The problem is: CC will be useful only to track regressions. While we have failing tests, they should be fixed asap. At present some issues that prevent successful CC runs have been waiting for integration for more than a month :( thanks, Vladimir
Re: [DRLVM] General stability
2006/11/7, Gregory Shimansky <[EMAIL PROTECTED]>: Weldon Washburn wrote: > Folks, > > I have spent the last two months committing patches to the VM. While we > have added a ton of much needed functionality, the stability of the system > has been ignored. By chance, I looked at thread synchronization design > problems this week. Its very apparent that we lack the regression testing > to really find threading bugs, test the fixes and test against regression. > No doubt there are similar problems in other VM subsystems. "build test" > is necessary but not sufficient for where we need to go. In a sense, > committing code with only "build test" to prevent regression is the > equivalent to flying in the fog without instrumentation. > > So that we can get engineers focused on stability, I am thinking of coding > the JIRAs that involve new features as "later" or even "won't fix". Please > feel free to comment. > > We also need to restart the old email threads on regression tests. For > example, we need some sort of automated test script that runs Eclipse and > tomcat, etc. in a deterministic fashion so that we can compare test > results. It does not have to be perfect for starts, just repeatable and > easy to use. Feel free to beat me to starting these threads :) In my experience on working with drlvm, stability problems are often discovered on the existing VM acceptance tests. Big applications like eclipse or tomcat with long workloads usually reveal problems like lack of class unloading unless they crash on something like threading problems. The acceptance VM tests that we have already are a good start to test stability if they are ran nonstop many times. Gregory, But do we have needed scripts/tools readily available to run and analyze such stability testing? I'm also pretty sure existing c-unit and smoke tests would help to reveal certain problems if run repeatedly - just need to add this stuff to CC and run it nightly. Anybody volunteer? 
And yet there are a lot of excluded tests in the smoke suite... -- Alexey I don't say that we shouldn't have real application workloads. I just want to say that acceptance tests already usually reveal threading problems quite well if they are run many times, as race conditions happen in some circumstances. However at the moment we already have failing tests; some of them, like gc.LOS on WinXP, don't need multiple runs to fail. There's also java.lang.ThreadTest which fails for me on Windows 2003 Server SP1 and has now started to fail on Linux as well. -- Gregory
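The point about running acceptance tests nonstop many times to flush out race conditions can be sketched as a trivial repeat-runner. The workload below is a stand-in invented for illustration; in practice the loop body would invoke an existing smoke or acceptance test:

```java
// Minimal repeat-runner sketch: run a check many times and stop on the
// first failure, since race conditions often need repetition to show up.
// The workload is a hypothetical stand-in for a real acceptance test.
public class RepeatRunner {

    static int counter;
    static final Object lock = new Object();

    // Stand-in workload: two threads increment a shared counter under a
    // lock; the invariant (counter == 2000) must hold on every run.
    static boolean runOnce() throws InterruptedException {
        counter = 0;
        Runnable inc = new Runnable() {
            public void run() {
                for (int i = 0; i < 1000; i++) {
                    synchronized (lock) { counter++; }
                }
            }
        };
        Thread a = new Thread(inc), b = new Thread(inc);
        a.start(); b.start();
        a.join(); b.join();
        return counter == 2000;
    }

    public static void main(String[] args) throws InterruptedException {
        int iterations = 100;  // nightly runs would use far more
        for (int i = 0; i < iterations; i++) {
            if (!runOnce()) {
                System.out.println("FAIL on iteration " + i);
                System.exit(1);
            }
        }
        System.out.println("PASS: " + iterations + " iterations");
    }
}
```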
Re: [DRLVM] General stability
Weldon Washburn wrote: Folks, I have spent the last two months committing patches to the VM. While we have added a ton of much needed functionality, the stability of the system has been ignored. By chance, I looked at thread synchronization design problems this week. Its very apparent that we lack the regression testing to really find threading bugs, test the fixes and test against regression. No doubt there are similar problems in other VM subsystems. "build test" is necessary but not sufficient for where we need to go. In a sense, committing code with only "build test" to prevent regression is the equivalent to flying in the fog without instrumentation. So that we can get engineers focused on stability, I am thinking of coding the JIRAs that involve new features as "later" or even "won't fix". Please feel free to comment. We also need to restart the old email threads on regression tests. For example, we need some sort of automated test script that runs Eclipse and tomcat, etc. in a deterministic fashion so that we can compare test results. It does not have to be perfect for starts, just repeatable and easy to use. Feel free to beat me to starting these threads :) In my experience on working with drlvm, stability problems are often discovered on the existing VM acceptance tests. Big applications like eclipse or tomcat with long workloads usually reveal problems like lack of class unloading unless they crash on something like threading problems. The acceptance VM tests that we have already are a good start to test stability if they are ran nonstop many times. I don't say that we shouldn't have real application workloads. I just want to say that acceptance tests already usually reveal threading problems quite well if they are ran many times, and race conditions happen in some circumstances. However at the moment we already have failing tests, some of them like gc.LOS on WinXP don't need multiple times to make them fail. 
There's also java.lang.ThreadTest which fails for me on Windows 2003 server SP1 and now started to fail on Linux as well. -- Gregory
Re: [DRLVM] General stability
Weldon Washburn wrote: Folks, I have spent the last two months committing patches to the VM. While we have added a ton of much needed functionality, the stability of the system has been ignored. By chance, I looked at thread synchronization design problems this week. Its very apparent that we lack the regression testing to really find threading bugs, test the fixes and test against regression. No doubt there are similar problems in other VM subsystems. "build test" is necessary but not sufficient for where we need to go. In a sense, committing code with only "build test" to prevent regression is the equivalent to flying in the fog without instrumentation. So that we can get engineers focused on stability, I am thinking of coding the JIRAs that involve new features as "later" or even "won't fix". Please feel free to comment. Please don't w/o discussion on each. I see no reason to do that right now, as while I 100% agree that stabilization is a necessary goal, we're not anywhere near the point where we need to call a halt to adding new stuff that's missing. We also need to restart the old email threads on regression tests. For example, we need some sort of automated test script that runs Eclipse and tomcat, etc. in a deterministic fashion so that we can compare test results. It does not have to be perfect for starts, just repeatable and easy to use. Feel free to beat me to starting these threads :) Feel free to add a script to the build-test framework :) geir
[DRLVM] General stability
Folks, I have spent the last two months committing patches to the VM. While we have added a ton of much needed functionality, the stability of the system has been ignored. By chance, I looked at thread synchronization design problems this week. It's very apparent that we lack the regression testing to really find threading bugs, test the fixes and test against regression. No doubt there are similar problems in other VM subsystems. "build test" is necessary but not sufficient for where we need to go. In a sense, committing code with only "build test" to prevent regression is the equivalent to flying in the fog without instrumentation. So that we can get engineers focused on stability, I am thinking of coding the JIRAs that involve new features as "later" or even "won't fix". Please feel free to comment. We also need to restart the old email threads on regression tests. For example, we need some sort of automated test script that runs Eclipse and Tomcat, etc. in a deterministic fashion so that we can compare test results. It does not have to be perfect for a start, just repeatable and easy to use. Feel free to beat me to starting these threads :) -- Weldon Washburn Intel Enterprise Solutions Software Division
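The "compare test results" idea - refined elsewhere in this thread into flagging tests whose execution time grew by more than a 10% threshold - can be sketched as a small comparison routine. The test names and timings below are invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of comparing two test runs: given per-test execution times
// (seconds) from a baseline run and a new run, flag tests whose time
// grew by more than a threshold (10% by default, as suggested in the
// thread). The data in main() is hypothetical.
public class RunComparator {

    static Map<String, Double> regressions(Map<String, Double> baseline,
                                           Map<String, Double> current,
                                           double threshold) {
        Map<String, Double> out = new HashMap<String, Double>();
        for (Map.Entry<String, Double> e : current.entrySet()) {
            Double before = baseline.get(e.getKey());
            if (before == null || before == 0.0) continue;  // new test: skip
            double increase = (e.getValue() - before) / before;
            if (increase > threshold) out.put(e.getKey(), increase);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> base = new HashMap<String, Double>();
        base.put("ThreadTest", 1.0);
        base.put("gc.LOS", 2.0);
        Map<String, Double> now = new HashMap<String, Double>();
        now.put("ThreadTest", 1.5);   // +50% -> flagged as regression
        now.put("gc.LOS", 2.1);       // +5%  -> within threshold
        System.out.println(regressions(base, now, 0.10).keySet());
    }
}
```

A real implementation would read the times from JUnit report files rather than hard-coded maps, and could sort the flagged tests by increase rate descending as proposed in the thread.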