Re: Flume failing travis builds

Ralph Goers Tue, 25 Jan 2022 11:03:43 -0800

There seem to be two builds running and both fail but fail in different places.

The first build seems to be failing in a way it shouldn’t. The test covers
the case where no Kafka partitions are specified. How Kafka handles this
changed in version 2.4, so the test should only be checking that it received
all the events, but it appears it is somehow ending up in the logic that
checks that all the partitions have an equal number of events. I’ve added
more info to the assert message to help diagnose this.
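
A minimal sketch of the kind of check described above, not the actual Flume
test code: it asserts only on the total number of events received and puts
the per-partition counts into the failure message. The class and method names
are hypothetical, and the map of partition number to received events is an
assumed shape for the test data.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class PartitionEventCheck {

    // Sums the events seen across all partitions.
    static int totalEvents(Map<Integer, List<String>> eventsByPartition) {
        int total = 0;
        for (List<String> events : eventsByPartition.values()) {
            total += events.size();
        }
        return total;
    }

    // Builds a failure message that lists the count for each partition,
    // sorted by partition number so the output is stable.
    static String describe(int expected, Map<Integer, List<String>> eventsByPartition) {
        StringBuilder sb = new StringBuilder();
        sb.append("expected ").append(expected)
          .append(" events, got ").append(totalEvents(eventsByPartition))
          .append("; per partition:");
        for (Map.Entry<Integer, List<String>> e
                : new TreeMap<>(eventsByPartition).entrySet()) {
            sb.append(" p").append(e.getKey()).append('=').append(e.getValue().size());
        }
        return sb.toString();
    }

    // Passes as long as every event arrived, however Kafka spread them
    // across partitions.
    static void assertAllReceived(int expected, Map<Integer, List<String>> eventsByPartition) {
        if (totalEvents(eventsByPartition) != expected) {
            throw new AssertionError(describe(expected, eventsByPartition));
        }
    }
}
```

With an uneven distribution such as three events on partition 0 and one on
partition 1, assertAllReceived(4, ...) passes, while a check for equal counts
per partition would not.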

The second build is failing in changes I just made to upgrade Netty & Avro.
It appears to be failing while checking the local host name. I will have to
add some info to the error to determine what it is getting for a hostname.
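
As an illustration of the extra diagnostic info mentioned above, here is a
hedged sketch using the standard java.net.InetAddress API. It is not the
change that was committed; the class and method names are hypothetical, and
where the real test compares host names is an assumption.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class HostnameDiagnostics {

    // Reports what the JVM resolves for the local host, or why it could not.
    static String describeLocalHost() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return "hostName=" + local.getHostName()
                + ", address=" + local.getHostAddress();
        } catch (UnknownHostException e) {
            return "local host lookup failed: " + e.getMessage();
        }
    }

    // Compares host names and includes the resolved local host details in
    // the failure message, so the build log shows what the CI machine saw.
    static void assertHostName(String expected, String actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected host name '" + expected
                + "' but was '" + actual + "' (" + describeLocalHost() + ")");
        }
    }
}
```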

I then ran the build in an Ubuntu VM on my MacBook and it got an error in
TestExecSource (which hasn’t been changed). It seems it is calling
process.waitFor() and getting a returned value of 1. I changed the test to
call waitFor() before calling destroy() and it passed. It then failed in
TestFileChannelRestart, giving me IOExceptions saying the checkpoint hadn’t
completed and the checkpoint interval should be increased. I added logic to
retry in this situation, but there is a unit test that tries to force that
error, so I had to have it bypass the fix in that case.
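
The retry-with-bypass idea can be sketched roughly as follows. This is an
assumption about the general shape, not the code that was committed: the
class name, the IoAction interface, and the retryEnabled flag are all
hypothetical, and the real change may decide when to retry differently.

```java
import java.io.IOException;

public class CheckpointRetry {

    // An operation that may fail with an IOException, such as one that
    // requires a completed checkpoint.
    interface IoAction<T> {
        T run() throws IOException;
    }

    // Runs the action, retrying after a delay when it throws IOException.
    // When retryEnabled is false the action is attempted exactly once, which
    // lets a test that deliberately forces the error still observe it.
    static <T> T callWithRetry(IoAction<T> action, int maxAttempts,
                               long delayMillis, boolean retryEnabled)
            throws IOException, InterruptedException {
        int attempts = retryEnabled ? Math.max(1, maxAttempts) : 1;
        IOException last = null;
        for (int i = 1; i <= attempts; i++) {
            try {
                return action.run();
            } catch (IOException e) {
                last = e;
                if (i < attempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last;
    }
}
```

A test that wants to see the failure would pass retryEnabled as false; every
other caller would leave it on and get up to maxAttempts tries.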

I committed those changes and will look at the results of the next Travis
build to see what additional info it can provide.

Ralph

> On Jan 24, 2022, at 12:18 AM, Tristan Stevens <tris...@apache.org> wrote:
> 
> Hi all,
> It seems that for some reason the Travis builds are failing again. One of 
> them has been since the Log4j and SLF4J bump (odd) and the other since the 
> Kafka upgrade.
> 
> Anybody got some cycles to investigate whether these are just flaky tests 
> and/or whether there’s something more sinister in there?
> 
> Thanks
> Tristan
>
