The TestNegativeCliDriver tests are still failing with java.lang.OutOfMemoryError: GC overhead limit exceeded. Can we increase the amount of memory for tests?
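The memory bump asked for here (and the dump-on-OOM Sergey suggests below) usually maps onto standard HotSpot flags. A hedged sketch, assuming a Maven Surefire build: the `-XX` flags are standard JVM options, but a pom that sets its own `<argLine>` will override the property used here, and the exact wiring in this project's build may differ.

```shell
# Sketch only: raise the test JVM heap and write a heap dump on OOM.
# -DargLine is Surefire's standard hook for extra JVM args; the test name
# and dump path below are illustrative.
mvn test -Dtest=TestNegativeCliDriver \
  -DargLine="-Xmx2g -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=target/tmp"
```

The resulting `.hprof` file can then be archived with the test logs and opened in any heap analyzer to see what filled the heap during setup.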
Vineet

On Mar 5, 2018, at 11:35 AM, Sergey Shelukhin <ser...@hortonworks.com> wrote:

On a semi-related note, I noticed recently that negative tests seem to OOM in setup from time to time. Can we increase the amount of memory for the tests a little bit, and/or maybe add the dump-on-OOM flag to them, saved to the test logs directory, so we could investigate?

On 18/3/5, 11:07, "Vineet Garg" <vg...@hortonworks.com> wrote:

+1 for a nightly build. We could generate reports to identify both frequent and sporadic test failures, plus other interesting bits like average build time, Yetus failures, etc. It would also help narrow the range of culprit commits down to one day. If you decide to go ahead with this, I would like to help.

Vineet

On Mar 5, 2018, at 8:50 AM, Sahil Takiar <takiar.sa...@gmail.com> wrote:

Wow, that HBase UI looks super useful. +1 to having something like that. If not, +1 to having a proper nightly build; it would help devs identify which commits break which tests. I find that git-bisect can take a long time to run and can be difficult to use (e.g. finding a known-good commit isn't always easy).

On Mon, Mar 5, 2018 at 9:03 AM, Peter Vary <pv...@cloudera.com> wrote:

Without a nightly build, and with this many flaky tests, it is very hard to identify the breaking commits. We can use something like bisect and multiple test runs, but there is a more elegant way to do this with nightly test runs:

https://issues.apache.org/jira/browse/HBASE-15917
https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html

This also helps to identify the flaky tests, and creates a continuously updated list of them.
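The reporting idea above (separating frequent from sporadic failures) needs nothing fancy once each nightly run archives a list of its failing tests. A toy sketch with fabricated file names and data, just to show the shape of the aggregation:

```shell
# Toy sketch: rank tests by how many nightly runs they failed in.
# Assumes each nightly run leaves one file listing failed tests, one per line;
# the runs/ layout and test names here are made up for illustration.
runs=$(mktemp -d)
printf 'TestA\nTestB\n' > "$runs/run1.txt"   # night 1: two failures
printf 'TestA\n'        > "$runs/run2.txt"   # night 2: one failure
printf 'TestA\nTestC\n' > "$runs/run3.txt"   # night 3: two failures
# Count occurrences per test across all runs, most frequent first.
cat "$runs"/*.txt | sort | uniq -c | sort -rn
```

A test failing in nearly every run (TestA here) is a candidate for a broken commit; one failing in a small fraction of runs looks flaky, which is roughly the split the HBase dashboard linked above automates.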
On Feb 23, 2018, at 6:55 PM, Sahil Takiar <takiar.sa...@gmail.com> wrote:

+1. Does anyone have suggestions on how to efficiently identify which commit is breaking a test? Is it just git-bisect, or is there an easier way? Hive QA isn't always that helpful: it will say a test has been failing for the past "x" builds, but that doesn't help much since Hive QA isn't a nightly build.

On Thu, Feb 22, 2018 at 10:31 AM, Vihang Karajgaonkar <vih...@cloudera.com> wrote:

+1. Commenting on the JIRA and giving a 24-hour heads-up (excluding weekends) would be good.

On Thu, Feb 22, 2018 at 10:19 AM, Alan Gates <alanfga...@gmail.com> wrote:

+1.

Alan

On Thu, Feb 22, 2018 at 8:25 AM, Thejas Nair <thejas.n...@gmail.com> wrote:

+1, I agree, this makes sense. The number of failures keeps increasing. In either case, a 24-hour heads-up before the revert would be good.

On Thu, Feb 22, 2018 at 2:45 AM, Peter Vary <pv...@cloudera.com> wrote:

I agree with Zoltan. The continuously breaking tests make it very hard to spot real issues. Any thoughts on doing it automatically?

On Feb 22, 2018, at 10:47 AM, Zoltan Haindrich <k...@rxd.hu> wrote:

Hello,

In the last couple of weeks the number of broken tests has started to go up... and even though I run bisect etc. from time to time, sometimes people don't react to my comments/tickets. Keeping this many failing tests around makes it easier for a new one to slip in, so I think reverting the patch that introduced the failures would also help in some cases. To prevent further test breaks, I think it would help a lot to revert a patch if either of the following conditions is met:

C1) the notification/comment pointing out that the patch broke a test has gone unanswered for at least 24 hours.
C2) the patch has been in for 7 days but the test failure is still not addressed (note that in this case there might be a conversation about fixing it... but enabling other people to work in a cleaner environment is more important than a single patch, and if it can't be fixed in 7 days, well, it might not get fixed in a month).

I would also like to note that I've seen a few of these tickets picked up by people who were not involved in the original change; although the intention was good, they may miss the context of the original patch and "fix" the tests in the wrong way: accepting an inappropriate q.out, or ignoring the test...

Would it be OK to implement this from now on? It makes my efforts practically useless if people are not reacting...

Note: just so we are on the same page, this is only about a single test that fails on its own; I feel that flaky tests are an entirely different topic.

Cheers,
Zoltan

--
Sahil Takiar
Software Engineer
takiar.sa...@gmail.com | (510) 673-0309
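Sahil's question earlier in the thread about efficiently finding the breaking commit can, once a known-good commit is identified, be automated with `git bisect run`. A self-contained toy demo (the throwaway repository and its "test" are fabricated purely for illustration; in practice the test command would be the actual failing test invocation):

```shell
# Toy demo of `git bisect run`: build a throwaway repo where "commit 4"
# introduces a failure, then let bisect find it without manual stepping.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email dev@example.com
git config user.name dev
for i in 1 2 3 4 5 6; do
  # commits 4..6 carry the "broken" state our fake test checks for
  if [ "$i" -ge 4 ]; then echo "broken $i" > state.txt; else echo "ok $i" > state.txt; fi
  git add state.txt
  git commit -qm "commit $i"
done
# bad = HEAD (commit 6), good = HEAD~5 (commit 1)
git bisect start HEAD HEAD~5 > /dev/null
# the "test": exits non-zero on the broken state, so bisect marks that commit bad
git bisect run sh -c 'grep -q "^ok" state.txt' > /dev/null
# the bisect log ends with a line naming the first bad commit
culprit=$(git bisect log | sed -n 's/^# first bad commit: \[[0-9a-f]*\] //p')
echo "first bad commit: $culprit"
git bisect reset -q > /dev/null 2>&1 || true
```

On a linear history like this one, bisect lands on "commit 4" in a handful of steps; the manual pain Sahil describes is mostly in picking the initial good commit, which a nightly build would pin down to a one-day window.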