Re: [DISCUSS] Project build time and possible restructuring

2017-03-31 Thread Robert Metzger
Hi Greg, no, I still think we should split the repos. But it makes the whole thing a bit easier, because we don't need to introduce Jenkins at the same time. If we do the change, I'm going to ask Travis to extend the limit for all forks of the Flink repo (I hope that's possible). Here is how it

Re: [DISCUSS] Project build time and possible restructuring

2017-03-31 Thread Greg Hogan
Thanks for pursuing this, Robert. I appreciate their receptiveness to increasing the time and memory limits, but we’ll still be bound by the old limits for our personal repos. Does this change any of the proposed actions for splitting the repo? Has anyone looked into why we see many jobs time out

Re: [DISCUSS] Project build time and possible restructuring

2017-03-31 Thread Robert Metzger
Good news :) A few weeks ago, I got an email from Travis asking for feedback. I filled out the form and said that the 50-minute build time limit is a showstopper for us. And now, a few weeks later, they got back to me and told me that they have increased the build time for "apache/flink" to 120

Re: [DISCUSS] Project build time and possible restructuring

2017-03-28 Thread Robert Metzger
I think your selection of modules is okay. Moving out storm and the scala shell would be nice as well. But storm is not really maintained, so maybe we should consider moving it out of the Flink repo entirely. And the scala shell is not a library, but it also doesn't really belong in the main

Re: [DISCUSS] Project build time and possible restructuring

2017-03-21 Thread Timo Walther
So what do we want to move to the libraries repository? I would propose to move these modules first: flink-cep-scala flink-cep flink-gelly-examples flink-gelly-scala flink-gelly flink-ml All other modules (e.g. in flink-contrib) are rather connectors. I think it would be better to move those

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Aljoscha Krettek
The Beam Jenkins jobs are configured inside the Beam src repo itself. For example: https://github.com/apache/beam/blob/master/.jenkins/job_beam_PostCommit_Java_RunnableOnService_Flink.groovy For initial setup of the seed job you need admin rights on Jenkins, as described here:

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Greg Hogan
We can add cluster tests using the distribution jar, and will need to do so to remove Flink’s dependency on Hadoop. The YARN and Mesos tests would still run nightly and running cluster tests should be much faster. As troublesome as TravisCI has been, a major driver for this change has been

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Robert Metzger
It looks like Jetbrains TeamCity supports something in that direction: https://blog.jetbrains.com/teamcity/2012/03/incremental-building-with-maven-and-teamcity/ On Mon, Mar 20, 2017 at 2:40 PM, Timo Walther wrote: > Another solution would be to make the Travis builds more

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Timo Walther
Another solution would be to make the Travis builds more efficient. For example, we could write a script that determines the modified Maven module and only run the test for this module (and maybe transitive dependencies). PRs for libraries such as Gelly, Table, CEP or connectors would not
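
Editor's sketch of the script idea above, in Python. It is only an illustration under stated assumptions, not something proposed in the thread: the module directories and file paths are hypothetical examples, and the list of changed files is assumed to come from elsewhere (for instance a `git diff --name-only` against the target branch). The Maven options `-pl` (select projects) and `-amd` (also build the modules that depend on the selected ones) are existing Maven CLI flags; use `-am` instead to build the modules the selection depends on.

```python
from pathlib import PurePosixPath


def modules_for_changes(changed_files, module_dirs):
    """Map each changed path to the deepest known module directory containing it."""
    modules = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        # Try the longest directory prefix first so nested modules win over parents.
        for end in range(len(parts) - 1, 0, -1):
            candidate = "/".join(parts[:end])
            if candidate in module_dirs:
                modules.add(candidate)
                break
        else:
            # The change is outside every known module (e.g. the root pom.xml).
            modules.add(".")
    return sorted(modules)


def maven_command(modules):
    """Build a Maven invocation restricted to the given modules and their dependents."""
    if not modules or "." in modules:
        # A change at the root can affect everything: fall back to a full build.
        return ["mvn", "verify"]
    return ["mvn", "verify", "-pl", ",".join(modules), "-amd"]


if __name__ == "__main__":
    # Hypothetical module layout and change set, for illustration only.
    known = {
        "flink-libraries",
        "flink-libraries/flink-gelly",
        "flink-libraries/flink-cep",
    }
    changed = [
        "flink-libraries/flink-gelly/src/main/java/Foo.java",
        "flink-libraries/flink-cep/pom.xml",
    ]
    print(" ".join(maven_command(modules_for_changes(changed, known))))
```

Falling back to a full build whenever a file outside a known module changes keeps the sketch conservative: a library-only PR would run a small subset, while anything touching shared build files still gets the complete test run.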

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Robert Metzger
Aljoscha, do you know how to configure Jenkins? Is Apache INFRA doing that, or are the Beam people doing that themselves? One downside of Jenkins is that we probably need some machines that execute the tests. A Travis container has 2 CPU cores and 4 GB main memory. We currently have 10 such

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Timo Walther
I agree with Aljoscha that we might consider moving from Travis to Jenkins. Is there any disadvantage in using Jenkins? I think we should structure the project according to release management (e.g. more frequent releases of libraries) or other criteria (e.g. core and non-core) instead of

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Aljoscha Krettek
I prefer Jenkins to Travis by far. Working on Beam, where we have good Jenkins integration, has opened my eyes to what is possible with good CI integration. For example, look at this recent Beam PR: https://github.com/apache/beam/pull/2263 . The

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Robert Metzger
Thank you for looking into the build times. I didn't know that the build time situation is so bad. Even with yarn, mesos, connectors and libraries removed, we are still running into the build timeout :( Aljoscha told me that the Beam community is using Jenkins for running the tests, and they are

Re: [DISCUSS] Project build time and possible restructuring

2017-03-20 Thread Stephan Ewen
@Greg I am personally in favor of splitting "connectors" and "contrib" out as well. I know that @rmetzger has some reservations about the connectors, but we may be able to convince him. For the cluster tests (yarn / mesos) - in the past there were many cases where these tests caught cases that

Re: [DISCUSS] Project build time and possible restructuring

2017-03-17 Thread Greg Hogan
I’d like to use this refactoring opportunity to unsplit the Travis tests. With 51 builds queued up for the weekend (some of which may fail or have been force pushed) we are at the limit of the number of contributions we can process. Fixing this requires 1) splitting the project, 2)

Re: [DISCUSS] Project build time and possible restructuring

2017-03-15 Thread Stephan Ewen
@Robert - I think once we know that a separate git repo works well, and that it actually solves problems, I see no reason to not create a connectors repository later. The infrastructure changes should be identical for two or more repositories. On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann

Re: [DISCUSS] Project build time and possible restructuring

2017-03-15 Thread Robert Metzger
"flink-core" means the main repository, not the "flink-core" module. When doing a release, we need to build the flink main code first, because the flink-libraries depend on that. Once the "flink-libraries" are built, we need to run the main build again (at least the flink-dist module), so that it

Re: [DISCUSS] Project build time and possible restructuring

2017-03-15 Thread Till Rohrmann
I'm ok with point 3. Concerning point 8: Why do we have to build flink-core twice after having it built as a dependency for flink-libraries? This seems wrong to me. Cheers, Till On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger wrote: > Thank you. Running on AWS is a good

Re: [DISCUSS] Project build time and possible restructuring

2017-03-15 Thread Robert Metzger
Thank you. Running on AWS is a good idea! Let me know if you (or anybody else) want to help me with the infrastructure work! Any help is much appreciated (as I've said before, I don't really have time for doing this, but it has to be done :) ) I'm against creating two new repositories. I fear

Re: [DISCUSS] Project build time and possible restructuring

2017-03-15 Thread Greg Hogan
Robert, appreciate your kickstarting this task. We should compare the verification time with and without the listed modules. I’ll try to run this by tomorrow on AWS and on Travis. Should we maintain separate repos for flink-contrib and flink-libraries? Are you intending that we move

Re: [DISCUSS] Project build time and possible restructuring

2017-03-15 Thread Robert Metzger
Thank you for looking into this Till. I think we then have to split the repositories. My main motivation for doing this is that it seems to be the only feasible way of scaling the community to allow more committers working on the libraries. I'll take care of getting things started. As the next

Re: [DISCUSS] Project build time and possible restructuring

2017-03-15 Thread Till Rohrmann
In theory we could have a merging bot which solves the problem of the "commit window". Once the PR passes all tests and has enough +1s, the bot could do the merging and, thus, it effectively linearizes the merge process. I think the second point is actually a disadvantage because there is not
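
Editor's sketch of how such a merging bot could linearize merges, in Python. It is a simplified model under stated assumptions, not a description of any existing bot: the `PullRequest` type, the `required_approvals` threshold, and the `tests_pass` callback (which stands in for re-running the test suite against the current head) are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class PullRequest:
    number: int      # hypothetical PR identifier
    approvals: int   # number of +1 votes received


def run_merge_queue(
    queue: Iterable[PullRequest],
    tests_pass: Callable[[PullRequest, Tuple[int, ...]], bool],
    required_approvals: int = 1,
) -> List[int]:
    """Merge PRs strictly one at a time, re-testing each against the current head."""
    merged: List[int] = []
    for pr in queue:
        if pr.approvals < required_approvals:
            continue  # not enough +1s yet, leave it for a later pass
        # The head consists of everything merged so far, so nothing can land
        # between this test run and the merge.
        if tests_pass(pr, tuple(merged)):
            merged.append(pr.number)
    return merged


if __name__ == "__main__":
    queue = [PullRequest(101, 2), PullRequest(102, 0), PullRequest(103, 1)]

    # Hypothetical outcome: PR 103 breaks once PR 101 has been merged.
    def tests_pass(pr, head):
        return not (pr.number == 103 and 101 in head)

    print(run_merge_queue(queue, tests_pass))
```

Because the bot is the only writer and handles one PR at a time, each test run is valid for exactly the head it merges onto, which is what removes the rebase-and-retest race that the "commit window" refers to.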

Re: [DISCUSS] Project build time and possible restructuring

2017-03-14 Thread Stephan Ewen
Some other thoughts on how a repository split would help. I am not sure about all of them, so please comment: - There is less competition for a "commit window". It happens a lot already that you run all tests and want to commit, but there was a commit in the meantime. You rebase, need to re-test,

Re: [DISCUSS] Project build time and possible restructuring

2017-03-10 Thread Till Rohrmann
Thanks for all your input. In order to wrap the discussion up I'd like to summarize the mentioned points: The problem of increasing build times and complexity of the project has been acknowledged. Ideally we would have everything in one repository using an incremental build tool. Since Maven does

Re: [DISCUSS] Project build time and possible restructuring

2017-02-24 Thread Robert Metzger
@Jin Mingjian: You cannot use the paid Travis version for open source projects. It only works for private repositories (at least back then when we asked them about that). @Stephan: I don't think that incremental builds will be available with Maven anytime soon. I agree that we need to fix

Re: [DISCUSS] Project build time and possible restructuring

2017-02-23 Thread Stephan Ewen
If we can get incremental builds to work, that would actually be the preferred solution in my opinion. Many companies have invested heavily in making a "single repository" code base work, because it has the advantage of not having to update/publish several repositories first. However, the

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Greg Hogan
An additional option for reducing time to build and test is parallel execution. This would help users more than on TravisCI since we're generally running on multi-core machines rather than VM slices. Is the idea that each user would only check out the modules that he or she is developing with?

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Fabian Hueske
Hi everybody, I think this should be a discussion about the benefits and drawbacks of separating the code into distinct repositories from a development point of view. So I agree with Stephan that we should not divide the community by creating separate groups of committers. Also the discussion

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Gábor Hermann
@Stephan: Although I tried to raise some issues about splitting committers, I'm still strongly in favor of some kind of restructuring. We just have to be conscious about the disadvantages. Not splitting the committers could leave the libraries in the same stalling status, described by Till.

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Stephan Ewen
Hi all! Thanks for kicking this off, Till, it is a good discussion to have. A few thoughts from my side: - From what I get from the first responses, from a development convenience point the split in repositories would be desirable. - The biggest obstacles on that way are probably the

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Aljoscha Krettek
I'm not against splitting but I want to highlight that there are other options: - We could split the tests run on Travis logically. For example, run unit tests and integration tests separately. This would have the benefit that you would see early on if the (fast) unit tests fail. We could also

Re: [DISCUSS] Project build time and possible restructuring

2017-02-22 Thread Gábor Hermann
Hi all, I'm also in favor of splitting, but only in terms of committers. I agree with Theodore, that async releases would cause confusion. With time based releases [1] it should be easy to sync release. Even if it's possible to add committers to different components, should we do a more

Re: [DISCUSS] Project build time and possible restructuring

2017-02-21 Thread Jin Mingjian
The repo splitting is the result of the grown code base, so it will happen eventually. The question is when and how. When: the timing seems fine. How: is the schema good? I assume we cannot add committers per project (or is the committer just a logical concept?). So just splitting into

Re: [DISCUSS] Project build time and possible restructuring

2017-02-21 Thread Theodore Vasiloudis
Hello all, From a library developer POV I think splitting up the project will have more advantages than disadvantages. API-breaking changes should become the responsibility of library developers, and with automated tests they shouldn't be too hard to catch. I think I'm more in favor of

[DISCUSS] Project build time and possible restructuring

2017-02-21 Thread Till Rohrmann
Hi Flink community, I'd like to revive a discussion about Flink's build time and project structure which we already had in some other mailing thread [1] and which we wanted to do after the 1.2 release. Recently, we can see that Flink is more and more often exceeding Travis's maximum build time of 50