Hi Aljoscha, I think you are right: increasing the timeout config should fix
this issue, since it depends on the resources available on Travis. I would
like to share some observations from my testing (not Flink problems):  :-)

During my testing, `mvn clean verify` and the `nightly end-to-end test` both
consumed a lot of machine resources (especially memory and network), and the
network bandwidth requirements of the `nightly end-to-end test` are also very
high. In China I need to use VPN acceleration (100~200Kb before
acceleration, 3~4Mb after acceleration). Without it I encountered: ['Avro
Confluent Schema Registry nightly end-to-end test' failed after 18 minutes
and 15 seconds! Test exited with exit Code 1]. The test ran for more than 18
minutes and the download failed because the network bandwidth was not
sufficient; it runs smoothly with VPN acceleration. The overall end-to-end
run passed twice. The Docker resource configuration was: CPUs 7, Mem 28.7G,
Swap 3.5G. See the detailed log here
<https://docs.google.com/document/d/1CcyTCyZmMmP57pkKv4drjSuxW61_u78HR3q1fJJODMw/edit?usp=sharing>
.

Just now I checked the Travis build for your last commit (Increase startup
timeout in end-to-end tests): apart from the Cleanup phase, all other
phases were successful. See here
<https://travis-ci.org/apache/flink/builds/511071777>

To verify that our speculation is accurate, I am running builds with the 10
and 20 second timeout configs on my own repo to see whether the timeout
problem is 100% reproducible. Both builds are already running and we are
waiting for the results.
10 seconds: <https://travis-ci.org/sunjincheng121/flink/builds/511235749>
20 seconds: <https://travis-ci.org/sunjincheng121/flink/builds/511235598>
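
To illustrate why the timeout value matters here, below is a minimal sketch of a poll-until-ready loop with a deadline. It is not the actual Flink end-to-end test script: `wait_until`, `FakeClock` and `simulate` are hypothetical names, and the startup times passed in are made-up values chosen only to show the effect of a 10 second versus a 20 second timeout.

```python
import time
from typing import Callable


def wait_until(is_ready: Callable[[], bool], timeout_s: float,
               interval_s: float = 1.0,
               clock: Callable[[], float] = time.monotonic,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll is_ready() until it returns True or timeout_s has elapsed."""
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


class FakeClock:
    """Simulated clock so the example runs instantly (illustration only)."""

    def __init__(self) -> None:
        self.now = 0.0

    def time(self) -> float:
        return self.now

    def sleep(self, seconds: float) -> None:
        self.now += seconds


def simulate(startup_s: float, timeout_s: float) -> bool:
    """Pretend the service becomes ready startup_s seconds after start."""
    fake = FakeClock()
    return wait_until(lambda: fake.time() >= startup_s, timeout_s,
                      interval_s=1.0, clock=fake.time, sleep=fake.sleep)


if __name__ == "__main__":
    # Hypothetical startup times, not measurements.
    for startup in (8, 12):
        for timeout in (10, 20):
            print(f"startup={startup}s timeout={timeout}s ->",
                  "ready" if simulate(startup, timeout) else "timed out")
```

With these assumed numbers, a service that needs 12 seconds on a slow machine times out under the 10 second limit but passes under the 20 second limit, which is the behaviour the two Travis builds above are meant to check.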

Best,
Jincheng

Aljoscha Krettek <aljos...@apache.org> wrote on Tue, Mar 26, 2019 at 1:04 AM:

> Thanks for the testing done so far!
>
> There has been quite some flakiness on Travis lately, see here:
> https://travis-ci.org/apache/flink/branches. I’m a bit hesitant to
> release in this state. Looking at the tests you can see that all of the
> end-to-end tests fail because waiting for the dispatcher to come up times
> out. I also noticed that this usually takes about 5-8 seconds on Travis, so
> a 10 second timeout might be a bit low. I pushed commits to increase that
> to 20 secs. Let’s see what will happen.
>
> I’ll keep you posted!
> Aljoscha
>
> > On 25. Mar 2019, at 13:13, jincheng sun <sunjincheng...@gmail.com> wrote:
> >
> > Great thanks for preparing the RC4 of Flink 1.8.0, Aljoscha!
> >
> > +1 (non-binding)
> >
> > I checked the following functional items (without performance
> > verification):
> >
> > 1. Checking Artifacts:
> >
> >    1). Download the release source code -  SUCCESS
> >    2). Check Source release flink-1.8.0-src.tgz.sha512 - SUCCESS
> >    3). Download the released JAR - SUCCESS
> >    4). Check if checksums and GPG files match the corresponding release
> > files - SUCCESS.
> >    5). Verify that the source archives do not contain any binaries -
> > SUCCESS.
> >    6). Build the source with `mvn clean verify -DskipTests` to ensure all
> > source files have Apache headers - SUCCESS
> >    7). Check that all POM files point to the same version - SUCCESS
> >    8). Read the `README.md` file to ensure there is nothing unexpected -
> > SUCCESS
> >
> > 2. Testing Larger Setups
> >
> >   Cluster Environment: 7 nodes, jm 1024m, tm 4096m
> >   Testing Jobs: WordCount (Batch & Streaming), DataStreamAllroundTestProgram
> >
> >   1). Use local & hdfs file systems for checkpoints - SUCCESS
> >   2). Use hdfs file systems for input/output - SUCCESS
> >   3). Run examples on YARN (with or without session) - SUCCESS
> >   4). Test failover and recovery - SUCCESS
> >   5). Test incremental & non-incremental checkpoints - SUCCESS
> >   6). Test connector - kafka - SUCCESS
> >
> > 3. Testing Functionality
> >
> >   1). Built-in tests(linux&mac os)
> >      - `mvn clean verify` (some test timeout errors and a test case bug,
> > see FLINK-12001 <https://issues.apache.org/jira/browse/FLINK-12001>; none
> > of them is a blocker)
> >      -  build for scala 2.11(mvn clean install -P scala-2.11 -DskipTests)
> > - SUCCESS
> >      -  Run the scripted nightly end-to-end test  - SUCCESS
> >
> >   2). Quickstarts
> >      - Verify the quickstarts for Scala with the staging repository
> > in IntelliJ - SUCCESS
> >      - Verify the quickstarts for Java with the staging repository
> > in IntelliJ - SUCCESS
> >
> >   3). Simple Starter Experience and Use Cases
> >
> >      - run all examples from IntelliJ IDE  -  SUCCESS
> >      - Start a local cluster and verify the processes -  SUCCESS
> >        a. Examine the *.out files (should be empty) and the log files
> > (should contain no exceptions)
> >        b. Test for Linux, MacOS
> >        c. Shutdown and verify there are no exceptions in the log output
> > (after shutdown)
> >
> >      - Verify that the examples run from both ./bin/flink and from
> > the web-based job submission tool (following items)   -  SUCCESS
> >        a. Start multiple task managers in the local cluster
> >        b. Change the flink-conf.yaml to define more than one task slot (2)
> >        c. Run the examples with a parallelism > 1
> >        d. Examine the log output - no error messages should be
> > encountered
> >
> > 4. Review the PR
> >     - [Add 1.8 Release Blog Post] - Just a reminder: update the release
> > date to the correct date before merging.
> >
> > Cheers,
> > Jincheng
> >
> > Piotr Nowojski <pi...@ververica.com> wrote on Mon, Mar 25, 2019 at 4:11 PM:
> >
> >> +1 from my side. Previously spotted performance regression seems to be
> >> gone, or mostly gone.
> >>
> >> Piotrek
> >>
> >>> On 21 Mar 2019, at 17:52, Aljoscha Krettek <aljos...@apache.org> wrote:
> >>>
> >>> Hi everyone,
> >>> Please review and vote on the release candidate 4 for Flink 1.8.0, as
> >>> follows:
> >>> [ ] +1, Approve the release
> >>> [ ] -1, Do not approve the release (please provide specific comments)
> >>>
> >>>
> >>> The complete staging area is available for your review, which includes:
> >>> * JIRA release notes [1],
> >>> * the official Apache source release and binary convenience releases to
> >>>   be deployed to dist.apache.org [2], which are signed with the key with
> >>>   fingerprint F2A67A8047499BBB3908D17AA8F4FD97121D7293 [3],
> >>> * all artifacts to be deployed to the Maven Central Repository [4],
> >>> * source code tag "release-1.8.0-rc4" [5],
> >>> * website pull request listing the new release [6]
> >>> * website pull request adding announcement blog post [7].
> >>>
> >>> The vote will be open for at least 72 hours. It is adopted by majority
> >>> approval, with at least 3 PMC affirmative votes.
> >>>
> >>> Thanks,
> >>> Aljoscha
> >>>
> >>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315522&version=12344274
> >>> [2] https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc4/
> >>> [3] https://dist.apache.org/repos/dist/release/flink/KEYS
> >>> [4] https://repository.apache.org/content/repositories/orgapacheflink-1215
> >>> [5] https://gitbox.apache.org/repos/asf?p=flink.git;a=tag;h=c650befc10c8bb6cc4b007ae250b7b2173046145
> >>> [6] https://github.com/apache/flink-web/pull/180
> >>> [7] https://github.com/apache/flink-web/pull/179
> >>>
> >>> P.S. The difference to the previous RCs is small; you can fetch the
> >>> tags and do a "git log release-1.8.0-rc1..release-1.8.0-rc4" to see the
> >>> difference in commits. These are fixes for the issues that led to the
> >>> cancellation of the previous RCs, plus smaller fixes. Most
> >>> verification/testing that was carried out should apply as is to this RC.
> >>> Any functional verification that you did on previous RCs should
> >>> therefore easily carry over to this one.
> >>
> >>
>
>
