Re: [Pharo-dev] The 6.1 Space Monkey is unhappy

Nicolas Anquetil Mon, 05 Mar 2018 05:41:50 -0800

This sounds a bit like SmartTest (which tries to identify the tests to re-run after a method is changed).


SmartTest can work:

- on dynamic data: first run the tests to learn which test causes which methods to be executed; then, if such a method is modified, re-run the associated tests

- on static data: analyse the senders of a method recursively to find the tests that could possibly call it

1- we are now running an experiment to find out which approach (static or dynamic) is best. Surprisingly, the static approach gives very good results (often better than the dynamic one), even though we all know that senders-of is not always very precise in Smalltalk

2- maybe something similar could be used in the monkey to find out which failing tests are caused by a new slice? (going back from the changes in the slice to the tests, or forward from the failing test toward the changes in the slice ...)


nicolas


On 05/03/2018 14:29, Pavel Krivanek wrote:

2018-03-05 14:14 GMT+01:00 Stephane Ducasse <[email protected]>:

Hi pavel

when I'm back can you explain to me because I did not get it :).

:) it is simple. If you want to ignore issues that the slice is not
adding, you need to know which of them to ignore. That's why the
original monkey ran all the validations twice - the first time to
collect a list of failing tests in the fresh unchanged image and then
with the slice or configuration loaded. So it doubled the issue
validation time.
There are several alternative strategies, like caching the failing
test results for every build and using them for validations, but the
best strategy is simply to keep the number of failing tests in the
clean image at zero and force people to keep the system clean. The
original monkey didn't expose the list of ignored tests. So it was
possible that some test was ignored because of a temporary network
issue but the second time it failed for a good reason and you didn't
even know.
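
A minimal sketch of that two-run comparison, in Python and under the assumption that each run yields a plain list of failing test names (split_failures and the test names are hypothetical, this is not the monkey's code). Unlike the original monkey, it returns the ignored tests too, so they can be reported:

```python
def split_failures(baseline_failures, slice_failures):
    """Separate failures introduced by the slice from pre-existing ones.

    baseline_failures: tests failing in the fresh unchanged image.
    slice_failures: tests failing with the slice or configuration loaded.
    """
    baseline = set(baseline_failures)
    with_slice = set(slice_failures)
    return {
        # Failing only with the slice loaded: blame the slice.
        "new": sorted(with_slice - baseline),
        # Failing in both runs: ignored, but listed so nothing is hidden.
        "ignored": sorted(with_slice & baseline),
    }


# Hypothetical test names, for illustration only.
report = split_failures(
    ["ATest>>testOld", "NetTest>>testFetch"],
    ["ATest>>testOld", "BTest>>testNew"],
)
```

Note that a test in the "ignored" list may still be failing for a different reason in the second run, which is exactly the case described above.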

Cheers,
-- Pavel

Stef


On Mon, Mar 5, 2018 at 11:07 AM, Pavel Krivanek
<[email protected]> wrote:

2018-03-05 10:54 GMT+01:00 Marcus Denker <[email protected]>:

On 5 Mar 2018, at 10:27, Alistair Grant <[email protected]> wrote:

Hi Marcus,

On 5 March 2018 at 09:23, Marcus Denker <[email protected]> wrote:

On 5 Mar 2018, at 09:16, Alistair Grant <[email protected]> wrote:

Hi Esteban & Marcus,

I'm getting repeated validation failures for:

https://pharo.manuscript.com/f/cases/21431


It's the same set of tests that fail each time, and as far as I can
tell they have nothing to do with the patch I submitted.

Do you know if this is happening on other tests?

I saw that Saturday but decided to wait till Monday (weekends are important..).

So: no, I have *no* idea what happened. From one CI run to the next,
suddenly around 160 tests related to Calypso started failing due to a
missing method.

Now, starting from sometime today, this problem stopped. The last
failing PR checks fail due to different reasons…

And I have no idea why.

(And yes, we all know that
1) the PR checks need more compute power, too slow
2) we *need* to track down the reason why still *a lot* of times the PR fails
   even though it should not.

The problem is that just keeping a build of this kind alive is a full
time job.. that we have nobody doing, so many many people do as much
as they can and we hope it will get better….)

Thanks for the update.

Oh, and it was completely unrelated. Your change is for Pharo6...

I took a look at the failures and it appears that

BehaviorTest>>testBehaviorRespectsPolymorphismWithTraitBehavior
ClassDescriptionTest>>testClassDescriptionRespectsPolymorphismWithTraitDescription
ClassTest>>testClassRespectsPolymorphismWithTrait

are all failing due to changes in Fuel - methods were changed from
traits to local methods.

Yes, the problem is that the monkey (the contribution checker) fails as soon
as there are errors even in the main image.

The last Pharo6 has these tests failing, so now all contribution checks for
Pharo6 fail.

What needs to be done?

-> your change can be accepted as we know it does not make more tests fail
-> then we need to fix the tests in Pharo6
-> in a perfect world we would update the slice checker to only fail for
newly failing tests… (it used to be like that…).

I must say that it made the validation two times slower, fragile and
led to the hiding of problems instead of solving them...

-- Pavel

As I said: this is a full time job…

         Marcus


--
Nicolas Anquetil
RMod team -- Inria Lille
