On Mon, Nov 23, 2015 at 1:53 PM, Colin P. McCabe <cmcc...@apache.org> wrote:
> I agree that our tests are in a bad state.  It would help if we could
> maintain a list of "flaky tests" somewhere in git and have Yetus
> consider the flakiness of a test before -1ing a patch.  Right now, we
> pretty much all have that list in our heads, and we're not applying it
> very consistently.  Having this list would also let us know where to
> concentrate our efforts to fix things.
>
> On Sun, Nov 22, 2015 at 4:21 AM, Steve Loughran <ste...@hortonworks.com> 
> wrote:
>>
>> Jenkins is pretty much dead in the water these days; a test run that works 
>> is a rare miracle rather than the default state. Which also means most 
>> patches are being +1'd in even though their test runs are failing, with 
>> comments like "the test failures are probably unrelated".
>>
>>
>> I think everyone has to be grateful that I'm not volunteering to be release 
>> manager for 2.8, as if I were I'd have already imposed a block on any 
>> patches going in until Jenkins was stable. That is: nothing but test fixes 
>> would go in.
>>
>> As it is, at least for the next couple of weeks, I'm going to experiment 
>> with reverting patches which break the build. Usually those breakages are 
>> being fixed, eventually, with followup patches. With a "patches which break 
>> the build get reverted" policy, whoever submitted that first patch gets to 
>> write the fix *and test it again*. This should encourage people to be more 
>> rigorous first time round.
>>
>>
>>   1.  Yes, I'm going to have to be ruthless and do this for myself too. Or 
>> others can. I'm not doing much (any?) core Hadoop coding right now, so I'm 
>> more isolated.
>>   2.  No, I don't plan to show favouritism: break the build and it gets 
>> rolled back.
>>   3.  We can review this in a week or two to see how it goes. And someone 
>> else can volunteer to keep Jenkins happy.
>>   4.  I'll get a smaller fix for HDFS-9263 in.
>>   5.  I've also started running Slider 0.90-SNAPSHOT test runs with Hadoop 
>> 2.8.0-SNAPSHOT, so I'm being the first to find problems beyond Jenkins. So 
>> far HADOOP-12050 is the first blocker. It went in in August, which shows we 
>> aren't doing enough cross-version testing beyond just Jenkins. That breakage 
>> (HADOOP-12587) is stopping my test code working against secure clusters. If 
>> I were being really harsh I'd have reverted that too, but it's been in long 
>> enough that I think a fix is probably the best solution.
>
> Well, this is already directly contradicting point #2, isn't it? :)

Just to be clear, I'm not trying to imply that this was favoritism (I
don't think it was) but just that a revert is not always the right
solution.  A short discussion usually helps to find the right
solution, which could be a revert, a follow-on fix, or something else.

best,
Colin

>
> I am open to being more critical about patches going in, but I think
> we should have some very minimal discussion before reverting things.
> It's just polite.
>
> Colin
>
>
>>   6.  Finally: everyone should feel free to fix tests. Don't be shy now!
>>
>> Given this is a US vacation week, it should be a quieter week for breakages.
>>
>> Sorry, but if we can't even get Jenkins stable, then what hope do we have 
>> for a 2.8 release working?
>>
>> -Steve
>>
>>
