Re: [DISCUSS] Project build time and possible restructuring

Aljoscha Krettek Mon, 20 Mar 2017 07:10:46 -0700

The Beam Jenkins jobs are configured inside the Beam src repo itself. For 
example: 
https://github.com/apache/beam/blob/master/.jenkins/job_beam_PostCommit_Java_RunnableOnService_Flink.groovy


For initial setup of the seed job you need admin rights on Jenkins, as 
described here: https://cwiki.apache.org/confluence/display/INFRA/Jenkins.

The somewhat annoying thing is setting up our own “flink” build slaves and 
maintaining them. There are some general purpose build slaves but 
high-throughput projects usually have their own build slaves to ensure speedy 
processing of Jenkins jobs: 
https://cwiki.apache.org/confluence/display/INFRA/Jenkins+node+labels

> On 20 Mar 2017, at 14:40, Timo Walther <[email protected]> wrote:
> 
> Another solution would be to make the Travis builds more efficient. For 
> example, we could write a script that determines the modified Maven module 
> and runs only the tests for this module (and maybe its transitive dependencies). 
> PRs for libraries such as Gelly, Table, CEP or connectors would not trigger a 
> compilation of the entire stack anymore. Of course this would not solve all 
> problems, but many of them.
> 
> What do you think about this?
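
A minimal sketch of such a script, in Python, under stated assumptions: the list of changed files would come from something like `git diff --name-only`, the module directories used in the example are illustrative rather than taken from the actual build, and the helper names are hypothetical. It maps each changed file to the deepest enclosing module and builds a Maven command restricted to those modules with -pl, adding -amd so that modules depending on them are tested as well. A change outside every listed module falls back to a full build.

```python
from pathlib import PurePosixPath


def owning_module(path, module_dirs):
    """Return the deepest directory in module_dirs that contains path, or None."""
    # PurePosixPath.parents yields the closest parent first, so the first
    # match is the deepest enclosing module.
    for parent in PurePosixPath(path).parents:
        if str(parent) in module_dirs:
            return str(parent)
    return None


def modules_to_test(changed_files, module_dirs):
    """Map changed files to the sorted list of modules they belong to.

    Returns None when a file lies outside every known module (for example
    the root pom.xml), which signals that a full build is needed.
    """
    found = set()
    for path in changed_files:
        module = owning_module(path, module_dirs)
        if module is None:
            return None
        found.add(module)
    return sorted(found)


def maven_command(modules):
    """Build the Maven invocation as an argument list."""
    if not modules:
        # Unknown or empty change set: run the whole build.
        return ["mvn", "verify"]
    # -pl restricts the reactor to the listed modules,
    # -amd also builds the modules that depend on them.
    return ["mvn", "verify", "-pl", ",".join(modules), "-amd"]


if __name__ == "__main__":
    # Hypothetical module layout and change set, for illustration only.
    modules = {"flink-core", "flink-libraries/flink-gelly", "flink-libraries/flink-cep"}
    changed = ["flink-libraries/flink-gelly/src/main/java/Example.java"]
    print(" ".join(maven_command(modules_to_test(changed, modules))))
```

Returning an argument list rather than a shell string keeps the sketch easy to hand to a process runner without quoting issues.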
> 
> 
> 
> Am 20/03/17 um 14:02 schrieb Robert Metzger:
>> Aljoscha, do you know how to configure Jenkins?
>> Is Apache INFRA doing that, or are the Beam people doing that themselves?
>> 
>> One downside of Jenkins is that we probably need some machines that execute
>> the tests. A Travis container has 2 CPU cores and 4 GB main memory. We
>> currently have 10 such containers available on Travis concurrently. I think
>> we would need at least the same amount on Jenkins.
>> 
>> 
>> On Mon, Mar 20, 2017 at 1:48 PM, Timo Walther <[email protected]> wrote:
>> 
>>> I agree with Aljoscha that we might consider moving from Travis to
>>> Jenkins. Is there any disadvantage in using Jenkins?
>>> 
>>> I think we should structure the project according to release management
>>> (e.g. more frequent releases of libraries) or other criteria (e.g. core and
>>> non-core) instead of build time. What would happen if the build of another
>>> submodule became too long? Would we split/restructure again and
>>> again? If Jenkins solves all our problems we should use it.
>>> 
>>> Regards,
>>> Timo
>>> 
>>> 
>>> 
>>> Am 20/03/17 um 12:21 schrieb Aljoscha Krettek:
>>> 
>>>> I prefer Jenkins to Travis by far. Working on Beam, where we have good
>>>> Jenkins integration, has opened my eyes to what is possible with good CI
>>>> integration.
>>>> 
>>>> For example, look at this recent Beam PR:
>>>> https://github.com/apache/beam/pull/2263. The
>>>> Jenkins-Github integration will tell you exactly which tests failed and if
>>>> you click on the links you can look at the log output/std out of the tests
>>>> in question.
>>>> 
>>>> This is the overview page of one of the Jenkins jobs that we have in
>>>> Beam:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/
>>>> This is an example of a stable build:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastStableBuild/
>>>> Notice how it gives you fine-grained information about the Maven run.
>>>> This is an unstable run:
>>>> https://builds.apache.org/job/beam_PostCommit_Java_RunnableOnService_Flink/lastUnstableBuild/
>>>> There you can see which tests failed and you can easily drill down.
>>>> 
>>>> Best,
>>>> Aljoscha
>>>> 
>>>> On 20 Mar 2017, at 11:46, Robert Metzger <[email protected]> wrote:
>>>>> Thank you for looking into the build times.
>>>>> 
>>>>> I didn't know that the build time situation is so bad. Even with yarn,
>>>>> mesos, connectors and libraries removed, we are still running into the
>>>>> build timeout :(
>>>>> 
>>>>> Aljoscha told me that the Beam community is using Jenkins for running
>>>>> the tests, and they are planning to completely move away from Travis. I
>>>>> wonder whether we should do the same, as having our own Jenkins servers
>>>>> would allow us to run tests for more than 50 minutes.
>>>>> 
>>>>> I agree with Stephan that we should keep the yarn and mesos tests in the
>>>>> core for stability / testing quality purposes.
>>>>> 
>>>>> 
>>>>> On Mon, Mar 20, 2017 at 11:27 AM, Stephan Ewen <[email protected]
>>>>> <mailto:[email protected]>> wrote:
>>>>> @Greg
>>>>> 
>>>>> I am personally in favor of splitting "connectors" and "contrib" out as
>>>>> well. I know that @rmetzger has some reservations about the connectors,
>>>>> but
>>>>> we may be able to convince him.
>>>>> 
>>>>> For the cluster tests (yarn / mesos) - in the past there were many cases
>>>>> where these tests caught cases that other tests did not, because they are
>>>>> the only tests that actually use the "flink-dist.jar" and thus discover
>>>>> many dependency and configuration issues. For that reason, my feeling
>>>>> would
>>>>> be that they are valuable in the core repository.
>>>>> 
>>>>> I would actually suggest to do only the library split initially, to see
>>>>> what the challenges are in setting up the multi-repo build and release
>>>>> tooling. Once we gathered experience there, we can probably easily see
>>>>> what
>>>>> else we can split out.
>>>>> 
>>>>> Stephan
>>>>> 
>>>>> 
>>>>> On Fri, Mar 17, 2017 at 8:37 PM, Greg Hogan <[email protected] <mailto:
>>>>> [email protected]>> wrote:
>>>>> 
>>>>> I’d like to use this refactoring opportunity to unsplit the Travis tests.
>>>>>> With 51 builds queued up for the weekend (some of which may fail or have
>>>>>> been force pushed) we are at the limit of the number of contributions we
>>>>>> can process. Fixing this requires 1) splitting the project, 2)
>>>>>> investigating speedups for long-running tests, and 3) staying cognizant
>>>>>> of
>>>>>> test performance when accepting new code.
>>>>>> 
>>>>>> I’d like to add one to Stephan’s list of module groups. I like that the
>>>>>> modules are generic (“libraries”) so that no one module is alone and
>>>>>> independent.
>>>>>> 
>>>>>> Flink has three “libraries”: cep, ml, and gelly.
>>>>>> 
>>>>>> “connectors” is a hotspot due to the long-running Kafka tests (and
>>>>>> connectors for three Kafka versions).
>>>>>> 
>>>>>> Both flink-storm and flink-python have a modest number of tests
>>>>>> and could live with the miscellaneous modules in “contrib”.
>>>>>> 
>>>>>> The YARN tests are long-running and problematic (I am unable to
>>>>>> successfully run these locally). A “cluster” module could host
>>>>>> flink-mesos,
>>>>>> flink-yarn, and flink-yarn-tests.
>>>>>> 
>>>>>> That gets us close to running all tests in a single Travis build.
>>>>>>    https://travis-ci.org/greghogan/flink/builds/212122590
>>>>>> 
>>>>>> I also tested (https://github.com/greghogan/flink/commits/core_build)
>>>>>> with a maven parallelism of 2 and 4, with the latter a 6.4% drop in
>>>>>> build time.
>>>>>>    https://travis-ci.org/greghogan/flink/builds/212137659
>>>>>>    https://travis-ci.org/greghogan/flink/builds/212154470
>>>>>> 
>>>>>> We can run Travis CI builds nightly to guard against breaking changes.
>>>>>> 
>>>>>> I also wanted to get an idea of how disruptive it would be to developers
>>>>>> to divide the project into multiple git repos. I wrote a simple python
>>>>>> script and configured it with the module partitions listed above. The
>>>>>> usage
>>>>>> string from the top of the file lists commits with files from multiple
>>>>>> partitions as well as the modified files.
>>>>>>    https://gist.github.com/greghogan/f38a8efe6b6dd5a162a6b43335ac4897
>>>>>> 
>>>>>> Accounting for the merging of the batch and streaming connector modules,
>>>>>> and assuming that the project structure has not changed much over the
>>>>>> past
>>>>>> 15 months, for the following date ranges the listed number of commits
>>>>>> would
>>>>>> have been split across repositories.
>>>>>> 
>>>>>> since "2017-01-01"
>>>>>> 36 of 571 commits were mixed
>>>>>> 
>>>>>> since "2016-07-01"
>>>>>> 155 of 1607 commits were mixed
>>>>>> 
>>>>>> since "2016-01-01"
>>>>>> 272 of 2561 commits were mixed
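
The counting described above can be sketched roughly as follows. This is not the script from the gist: the partition prefixes, the function names, and the choice to treat every other path as "core" are assumptions for illustration. A commit counts as mixed when its files fall into more than one partition, and the last helper turns the listed counts into percentages.

```python
# Hypothetical partition layout, loosely following the module groups
# described in the message above. The directory names are assumptions.
PARTITIONS = {
    "libraries": ["flink-libraries"],
    "connectors": ["flink-connectors"],
    "contrib": ["flink-contrib", "flink-storm", "flink-python"],
    "cluster": ["flink-mesos", "flink-yarn", "flink-yarn-tests"],
}


def partition_of(path, partitions, default="core"):
    """Return the partition whose directory prefix contains path."""
    for name, prefixes in partitions.items():
        # Match whole path components so "flink-yarn" does not
        # accidentally match "flink-yarn-tests/...".
        if any(path == p or path.startswith(p + "/") for p in prefixes):
            return name
    return default


def is_mixed(files, partitions):
    """A commit is mixed if its files touch more than one partition."""
    return len({partition_of(f, partitions) for f in files}) > 1


def mixed_share(mixed, total):
    """Percentage of mixed commits, rounded to one decimal place."""
    return round(100.0 * mixed / total, 1)


if __name__ == "__main__":
    # Counts as reported above for the three date ranges.
    for since, mixed, total in [
        ("2017-01-01", 36, 571),
        ("2016-07-01", 155, 1607),
        ("2016-01-01", 272, 2561),
    ]:
        print(since, mixed_share(mixed, total))
```

With the reported counts this gives roughly 6.3%, 9.6% and 10.6% mixed commits for the three ranges.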
>>>>>> 
>>>>>> Greg
>>>>>> 
>>>>>> 
>>>>>> On Mar 15, 2017, at 1:13 PM, Stephan Ewen <[email protected] <mailto:
>>>>>>> [email protected]>> wrote:
>>>>>>> 
>>>>>>> @Robert - I think once we know that a separate git repo works well, and
>>>>>>> that it actually solves problems, I see no reason to not create a
>>>>>>> connectors repository later. The infrastructure changes should be
>>>>>>> identical for two or more repositories.
>>>>>>> 
>>>>>>> On Wed, Mar 15, 2017 at 5:22 PM, Till Rohrmann <[email protected]
>>>>>>> <mailto:[email protected]>> wrote:
>>>>>>> 
>>>>>>> I think it should not be at least the flink-dist but exactly the
>>>>>>> remaining flink-dist module. Otherwise we do redundant work.
>>>>>>>> On Wed, Mar 15, 2017 at 5:03 PM, Robert Metzger <[email protected]
>>>>>>>> <mailto:[email protected]>>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> "flink-core" means the main repository, not the "flink-core" module.
>>>>>>>>> When doing a release, we need to build the flink main code first,
>>>>>>>>> because the flink-libraries depend on that.
>>>>>>>>> Once the "flink-libraries" are built, we need to run the main build
>>>>>>>>> again (at least the flink-dist module), so that it is pulling the
>>>>>>>>> artifacts from the flink-libraries to put them into the opt/ folder of
>>>>>>>>> the final artifact.
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> On Wed, Mar 15, 2017 at 4:44 PM, Till Rohrmann <[email protected]
>>>>>>>>> <mailto:[email protected]>>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> I'm ok with point 3.
>>>>>>>>>> Concerning point 8: Why do we have to build flink-core twice after
>>>>>>>>>> having it built as a dependency for flink-libraries? This seems wrong
>>>>>>>>>> to me.
>>>>>>>>>> Cheers,
>>>>>>>>>> Till
>>>>>>>>>> 
>>>>>>>>>> On Wed, Mar 15, 2017 at 4:23 PM, Robert Metzger <
>>>>>>>>>> [email protected] <mailto:[email protected]>>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>> Thank you. Running on AWS is a good idea!
>>>>>>>>>>> Let me know if you (or anybody else) wants to help me with the
>>>>>>>>>>> infrastructure work! Any help is much appreciated (as I've said
>>>>>>>>>>> 
>>>>>>>>>> before, I
>>>>>>>>>> don't really have time for doing this, but it has to be done :) )
>>>>>>>>>>> I'm against creating two new repositories. I fear that this
>>>>>>>>>>> 
>>>>>>>>>> introduces
>>>>>>>>> too
>>>>>>>>>>> much complexity and too many repositories.
>>>>>>>>>>> "flink" and "flink-libraries" are hopefully enough to get the build
>>>>>>>>>>> 
>>>>>>>>>> time
>>>>>>>>>> significantly down.
>>>>>>>>>>> We can also consider putting the connectors into the
>>>>>>>>>>> 
>>>>>>>>>> "flink-libraries"
>>>>>>>>> repo
>>>>>>>>>>> if we need to further reduce the build time.
>>>>>>>>>>> 
>>>>>>>>>>> We should probably move "flink-table" out of "flink-libraries" if
>>>>>>>>>>> we want to keep "flink-table" in the main repo. (This would eliminate
>>>>>>>>>>> the "flink-libraries" module from main.)
>>>>>>>>>>> 
>>>>>>>>>>> Also, I agree that "flink-statebackend-rocksdb" is not correctly
>>>>>>>>>>> 
>>>>>>>>>> placed
>>>>>>>>> in
>>>>>>>>>>> contrib anymore.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, Mar 15, 2017 at 4:07 PM, Greg Hogan <[email protected]
>>>>>>>>>>> <mailto:[email protected]>>
>>>>>>>>>>> 
>>>>>>>>>> wrote:
>>>>>>>>>> Robert, appreciate your kickstarting this task.
>>>>>>>>>>>> We should compare the verification time with and without the
>>>>>>>>>>>> listed
>>>>>>>>>>>> modules. I’ll try to run this by tomorrow on AWS and on Travis.
>>>>>>>>>>>> 
>>>>>>>>>>>> Should we maintain separate repos for flink-contrib and
>>>>>>>>>>>> 
>>>>>>>>>>> flink-libraries?
>>>>>>>>>>> Are you intending that we move flink-table out of flink-libraries
>>>>>>>>>>> (and
>>>>>>>>>> perhaps flink-statebackend-rocksdb out of flink-contrib)?
>>>>>>>>>>>> Greg
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Mar 15, 2017, at 9:55 AM, Robert Metzger <[email protected]
>>>>>>>>>>>>> <mailto:[email protected]>
>>>>>>>>>>>>> 
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> Thank you for looking into this Till.
>>>>>>>>>>>>> I think we then have to split the repositories.
>>>>>>>>>>>>> My main motivation for doing this is that it seems to be the only
>>>>>>>>>>>>> 
>>>>>>>>>>>> feasible
>>>>>>>>>>>> 
>>>>>>>>>>>>> way of scaling the community to allow more committers working on
>>>>>>>>>>>>> 
>>>>>>>>>>>> the
>>>>>>>>>> libraries.
>>>>>>>>>>>>> I'll take care of getting things started.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> As the next steps I propose to:
>>>>>>>>>>>>> 1. Ask INFRA to rename
>>>>>>>>>>>>> https://git-wip-us.apache.org/repos/asf?p=flink-connectors.git;a=summary
>>>>>>>>>>>>> to "flink-libraries"
>>>>>>>>>>>>> 2. Ask INFRA to set up GitHub and Travis integration for
>>>>>>>>>>>>> "flink-libraries"
>>>>>>>>>>>>> 3. Put the code of "flink-ml", "flink-gelly", "flink-python",
>>>>>>>>>>>>> "flink-cep", "flink-scala-shell", "flink-storm" into the new
>>>>>>>>>>>>> repository. (I decided against moving flink-contrib there, because
>>>>>>>>>>>>> rocksdb is in the contrib module. For flink-table, I'm undecided,
>>>>>>>>>>>>> but I kept it in the main repo because it's probably going to
>>>>>>>>>>>>> interact more with the core code in the future.)
>>>>>>>>>>>>> I'll try to preserve the history of those modules when splitting
>>>>>>>>>>>>> them into the new repo.
>>>>>>>>>>>>> 4. I'll close all pull requests against those modules in the main
>>>>>>>>>>>>> repo.
>>>>>>>>>>>>> 5. I'll set up a minimal documentation page for the library
>>>>>>>>>>>>> repository, similar to the main documentation.
>>>>>>>>>>>>> 6. I'll update the documentation build process to build both
>>>>>>>>>>>>> documentations & link them to each other
>>>>>>>>>>>>> 7. I'll update the nightly deployment process to include both
>>>>>>>>>>>>> repositories
>>>>>>>>>>>>> 8. I'll update the release script to create the Flink release out
>>>>>>>>>>>>> of both repositories. In order to put the libraries into the opt/
>>>>>>>>>>>>> dir of the release, I'll need to change the build of "flink-dist"
>>>>>>>>>>>>> so that it first builds flink core, then the libraries and then the
>>>>>>>>>>>>> core again with the libraries as an additional dependency.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The main question for the community is: do you agree with point 3?
>>>>>>>>>>>>> Would you like to include more or less?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I'll start with 1. and 2. tomorrow morning.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Wed, Mar 15, 2017 at 1:48 PM, Till Rohrmann <
>>>>>>>>>>>>> 
>>>>>>>>>>>> [email protected] <mailto:[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>>>> In theory we could have a merging bot which solves the problem
>>>>>>>>>>>>> of
>>>>>>>>> the
>>>>>>>>>>> "commit window". Once the PR passes all tests and has enough
>>>>>>>>>>>>> +1s,
>>>>>>>>> the
>>>>>>>>>>> bot
>>>>>>>>>>>>> could do the merging and, thus, it effectively linearizes the
>>>>>>>>>>>>> merge
>>>>>>>>>> process.
>>>>>>>>>>>>>> I think the second point is actually a disadvantage because
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> there
>>>>>>>>> is
>>>>>>>>> 
>>>>>>>>>> not
>>>>>>>>>>>> such an immediate incentive/pressure to fix the broken module if
>>>>>>>>>>>>> it
>>>>>>>>>> lives
>>>>>>>>>>>>> in a separate repository. Furthermore, breaking API changes in
>>>>>>>>>>>>> the
>>>>>>>>> core
>>>>>>>>>>>> will most likely go unnoticed for some time in other modules
>>>>>>>>>>>>> which
>>>>>>>>> are
>>>>>>>>>>> not
>>>>>>>>>>>>> developed so actively. In the worst case these things will only
>>>>>>>>>>>>> be
>>>>>>>>> noticed
>>>>>>>>>>>>> when we try to make a release.
>>>>>>>>>>>>>> But I also agree that we are not Google and we don't have the
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> capacities to
>>>>>>>>>>>>> maintain such a smooth build process that we can keep all the
>>>>>>>>>>>>> code
>>>>>>>>>> in
>>>>>>>>>>>> a
>>>>>>>>>>>> 
>>>>>>>>>>>>> single repository.
>>>>>>>>>>>>>> I looked a bit into Gradle and as far as I can tell it offers
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> some
>>>>>>>>> nice
>>>>>>>>>>>> features wrt incrementally building projects. This would be
>>>>>>>>>>>>> beneficial
>>>>>>>>>>> for
>>>>>>>>>>>>> local development but it would not solve our build time problems
>>>>>>>>>>>>> on
>>>>>>>>>> Travis.
>>>>>>>>>>>>> Gradle intends to introduce a task result cache which allows us to
>>>>>>>>>>>>> reuse
>>>>>>>>>>> results across builds. This could help when building on Travis,
>>>>>>>>>>>>> however, it
>>>>>>>>>>>>> is not yet fully implemented. Moreover, migrating from Maven to
>>>>>>>>>>>>> Gradle
>>>>>>>>>>> won't come for free (there's simply no free lunch out there) and
>>>>>>>>>>>>> we
>>>>>>>>>> might
>>>>>>>>>>>>> risk introducing new bugs. Therefore, I would vote to split the
>>>>>>>>>>>>> repository
>>>>>>>>>>>>> in order to mitigate our current problems with Travis and the
>>>>>>>>>>>>> build
>>>>>>>>>> time in
>>>>>>>>>>>>> general. Whether to use a different build system or not can then
>>>>>>>>>>>>> be
>>>>>>>>>> discussed as an orthogonal question.
>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Tue, Mar 14, 2017 at 8:05 PM, Stephan Ewen <[email protected]
>>>>>>>>>>>>>> <mailto:[email protected]>
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>> Some other thoughts on how repository split would help. I am
>>>>>>>>>>>>>> not
>>>>>>>>> sure
>>>>>>>>>>> for
>>>>>>>>>>>>> all of them, so please comment:
>>>>>>>>>>>>>>> - There is less competition for a "commit window". It happens
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> a
>>>>>>>>> lot
>>>>>>>>>>> already that you run all tests and want to commit, but there
>>>>>>>>>>>>>> was
>>>>>>>>> a
>>>>>>>>> 
>>>>>>>>>> commit
>>>>>>>>>>>>> in the meantime. You rebase, need to re-test, again commit in
>>>>>>>>>>>>>> the
>>>>>>>>> meantime.
>>>>>>>>>>>>>>>    For a "linear" commit history, this may become a bottleneck
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> eventually
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> as well.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> - There is less risk of broken master. If one
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> repository/modules
>>>>>>>>> breaks
>>>>>>>>>>>>> its master, the others can still continue.
>>>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Fri, Mar 10, 2017 at 12:20 PM, Till Rohrmann <
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> [email protected] <mailto:[email protected]>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> Thanks for all your input. In order to wrap the discussion up
>>>>>>>>>>>>>>> I'd
>>>>>>>>>> like
>>>>>>>>>>>> to
>>>>>>>>>>>>>>> summarize the mentioned points:
>>>>>>>>>>>>>>>> The problem of increasing build times and complexity of the
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> project
>>>>>>>>>>> has
>>>>>>>>>>>>> been acknowledged. Ideally we would have everything in one
>>>>>>>>>>>>>>> repository
>>>>>>>>>>>> using
>>>>>>>>>>>>>>>> an incremental build tool. Since Maven does not properly
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> support
>>>>>>>>> this
>>>>>>>>>>>> we
>>>>>>>>>>>>>>> would have to switch our build tool to something like Gradle,
>>>>>>>>>>>>>>> for
>>>>>>>>>> example.
>>>>>>>>>>>>>>>> Another option is introducing build profiles for different
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> sets
>>>>>>>>> of
>>>>>>>>> 
>>>>>>>>>> modules
>>>>>>>>>>>>>>>> as well as separating integration and unit tests. The third
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> alternative
>>>>>>>>>>>>> would be creating sub-projects with their own repositories. I
>>>>>>>>>>>>>>> actually
>>>>>>>>>>>> think that these two proposals are not necessarily exclusive
>>>>>>>>>>>>>>> and
>>>>>>>>> it
>>>>>>>>> 
>>>>>>>>>> would
>>>>>>>>>>>>>>> also make sense to have a separation between unit and
>>>>>>>>>>>>>>> integration
>>>>>>>>>> tests
>>>>>>>>>>>>> if
>>>>>>>>>>>>>>>> we split the repository.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> The overall consensus seems to be that we don't want to split
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> the
>>>>>>>>>> community
>>>>>>>>>>>>>>>> and want to keep everything under the same umbrella. I think
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> this
>>>>>>>>>> is
>>>>>>>>>> 
>>>>>>>>>>> the
>>>>>>>>>>>>>>> right way to go, because otherwise some parts of the project
>>>>>>>>>>>>>>> could
>>>>>>>>>> become
>>>>>>>>>>>>>>> second class citizens. Given that and that we continue using
>>>>>>>>>>>>>>> Maven,
>>>>>>>>>>> I
>>>>>>>>>>> 
>>>>>>>>>>>> still
>>>>>>>>>>>>>>>> think that creating sub-projects for the libraries, for
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> example,
>>>>>>>>> could
>>>>>>>>>>>> be
>>>>>>>>>>>>>>> beneficial. A split could reduce the project's complexity and
>>>>>>>>>>>>>>> make
>>>>>>>>>> it
>>>>>>>>>>>> potentially easier for libraries to get actively developed.
>>>>>>>>>>>>>>> The
>>>>>>>>> main
>>>>>>>>>>> concern is setting up the build infrastructure to aggregate
>>>>>>>>>>>>>>> docs
>>>>>>>>> from
>>>>>>>>>>>> multiple repositories and making them publicly available.
>>>>>>>>>>>>>>>> Since I started this thread and I would really like to see
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Flink's
>>>>>>>>>> ML
>>>>>>>>>>>> library being revived again, I'd volunteer investigating first
>>>>>>>>>>>>>>> whether
>>>>>>>>>>>> it
>>>>>>>>>>>>>>> is doable establishing a proper incremental build for Flink.
>>>>>>>>>>>>>>> If
>>>>>>>>> that
>>>>>>>>>>> should
>>>>>>>>>>>>>>>> not be possible, I will look into splitting the repository,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> first
>>>>>>>>>> only
>>>>>>>>>>>> for
>>>>>>>>>>>>>>>> the libraries. I'll share my results with the community once
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I'm
>>>>>>>>> done
>>>>>>>>>>>> with
>>>>>>>>>>>>>>>> the investigation.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 3:50 PM, Robert Metzger <
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> [email protected] <mailto:[email protected]>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>> @Jin Mingjian: You can not use the paid travis version for
>>>>>>>>>>>>>>>> open
>>>>>>>>> source
>>>>>>>>>>>>>>> projects. It only works for private repositories (at least
>>>>>>>>>>>>>>>> back
>>>>>>>>> then
>>>>>>>>>>>> when
>>>>>>>>>>>>>>>> we've asked them about that).
>>>>>>>>>>>>>>>>> @Stephan: I don't think that incremental builds will be
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> available
>>>>>>>>>> with
>>>>>>>>>>>>>>> Maven anytime soon.
>>>>>>>>>>>>>>>>> I agree that we need to fix the build time issue on Travis.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I've
>>>>>>>>>> recently
>>>>>>>>>>>>>>>> pushed a commit to use now three instead of two test groups.
>>>>>>>>>>>>>>>>> But I don't think that this is a feasible long-term solution.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> If this discussion is only about reducing the build and test
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> time,
>>>>>>>>>>> introducing build profiles for different components as
>>>>>>>>>>>>>>>> Aljoscha
>>>>>>>>> suggested
>>>>>>>>>>>>>>>> would solve the problem Till mentioned.
>>>>>>>>>>>>>>>>> Also, if we decide that travis is not a good tool anymore for
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> the
>>>>>>>>>> testing,
>>>>>>>>>>>>>>>>> I guess we can find a different solution. There are now
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> competitors
>>>>>>>>>>> to
>>>>>>>>>>>>>>> Travis that might be willing to offer a paid plan for an open
>>>>>>>>>>>>>>>> source
>>>>>>>>>>>> project, or we set up our own infra on a server sponsored by
>>>>>>>>>>>>>>>> one
>>>>>>>>>> of
>>>>>>>>>> 
>>>>>>>>>>> the
>>>>>>>>>>>>>>> contributing companies.
>>>>>>>>>>>>>>>>> If we want to solve "community issues" with the change as
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> well,
>>>>>>>>> then
>>>>>>>>>>>> I
>>>>>>>>>>>>>>> think it's worth the effort of splitting up Flink into
>>>>>>>>>>>>>>>> different
>>>>>>>>> repositories.
>>>>>>>>>>>>>>>>> Splitting up repositories is not a trivial task in my
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> opinion.
>>>>>>>>> As
>>>>>>>>> 
>>>>>>>>>> others
>>>>>>>>>>>>>>>> have mentioned before, we need to consider the following
>>>>>>>>>>>>>>>> things:
>>>>>>>>>> - How are we going to build the documentation? Ideally every
>>>>>>>>>>>>>>>> repo
>>>>>>>>>> should
>>>>>>>>>>>>>>>> contain its docs, so we would need to pull them together when
>>>>>>>>>>>>>>>> building
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> main docs.
>>>>>>>>>>>>>>>>> - How do we organize the dependencies? If we have library
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> repository
>>>>>>>>>> depend
>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> snapshot Flink versions, we need to make sure that the
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> snapshot
>>>>>>>>> deployment
>>>>>>>>>>>>>>>>> always works. This also means that people working on a
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> library
>>>>>>>>> repository
>>>>>>>>>>>>>>>> will pull from snapshot OR need to build first locally.
>>>>>>>>>>>>>>>>> - We need to update the release scripts
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> If we commit to do these changes, we need to assign at least
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> one
>>>>>>>>>> committer
>>>>>>>>>>>>>>>>> (yes, in this case we need somebody who can commit, for
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> example
>>>>>>>>> for
>>>>>>>>>>> updating the buildbot stuff) who volunteers to do the change.
>>>>>>>>>>>>>>>>> I've done a lot of infrastructure work in the past, but I'm
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> currently
>>>>>>>>>>>> pretty booked with many other things, so I don't
>>>>>>>>>>>>>>>> realistically
>>>>>>>>> see
>>>>>>>>>>> myself
>>>>>>>>>>>>>>>> doing that. Max who used to work on these things is taking
>>>>>>>>>>>>>>>> some
>>>>>>>>> time
>>>>>>>>>>>> off.
>>>>>>>>>>>>>>>> I think we need, best case 3 days for the change, worst case
>>>>>>>>>>>>>>>> 5
>>>>>>>>> days.
>>>>>>>>>>>> The
>>>>>>>>>>>>>>>> problem is that there are no "unit tests" for the infra
>>>>>>>>>>>>>>>> stuff,
>>>>>>> 
>
