A Confluence page sounds good to me. I'll create it.

On Wed, Oct 10, 2018 at 8:06 PM Wes McKinney <wesmck...@gmail.com> wrote:
> How would you all like to manage this project? Maybe we should create
> a Confluence wiki page to enumerate the different facets of the effort
> and make sure we create JIRA issues to plot a course to where we want
> to go. Would someone like to take point on this?
>
> Thanks
> Wes
>
> On Tue, Oct 9, 2018 at 2:33 PM Krisztián Szűcs
> <szucs.kriszt...@gmail.com> wrote:
> >
> > On Tue, Oct 9, 2018 at 6:02 PM Antoine Pitrou <anto...@python.org>
> > wrote:
> > >
> > > On 09/10/2018 at 17:54, Wes McKinney wrote:
> > > > hi folks,
> > > >
> > > > After the packaging automation work for 0.10 was completed, we have
> > > > stalled out a bit on one of the objectives of this framework, which
> > > > is to allow contributors to define and add new tasks that can be run
> > > > on demand or as part of a nightly job.
> > > >
> > > > So we have some problems to solve:
> > > >
> > > > * How to define a task we wish to validate (like building the API
> > > > documentation, or building Arrow with some particular build
> > > > parameters) as a new Crossbow task -- document this well so that
> > > > people have some instructions to follow
> > >
> > Crossbow does indeed lack documentation in that regard. Defining a task
> > requires a CI configuration and commands per platform, plus a section
> > in tasks.yml. However, I think this is not straightforward enough (it
> > is not as simple as just creating a bash/batch script): we still need
> > to define config management stuff, which makes user friendliness
> > harder to achieve.
> >
> > > > * How to add a task to some kind of a nightly build manifest
> > > >
> > > > * Where to schedule and run the nightly jobs
> > >
> > Currently nightly builds are submitted by this nightly Travis script:
> > https://github.com/kszucs/crossbow/blob/trigger-nightly-builds/.travis.yml
> > We can have an arbitrary number of branches to trigger custom jobs;
> > however, that requires manual Travis setup, and the ergonomics are
> > still unsatisfying.
> >
> > > > * Reporting nightly build failures to the mailing list
> > >
> > I regularly check the nightly builds, which occasionally fail, mostly
> > due to transient failures. For example, the last conda nightlies failed
> > because conda-build has some issues with libarchive; during the
> > feedstock updates I couldn't even rerender them locally.
> > BTW, to send the errors to the mailing list we need to set the
> > CROSSBOW_EMAIL env variable:
> > https://github.com/apache/arrow/blob/master/dev/tasks/crossbow.py#L475
> > (We might want to use a centralized crossbow repository though, with
> > proper permissions.)
> >
> > > >
> > > > In terms of scalability requirements, this needs to accommodate
> > > > 50-100 tasks.
> > >
> > The current tasks.yml contains a lot of duplication, which bothers me,
> > but it provides more flexibility than having another "matrix"
> > definition and implementation. I don't have a user-friendly solution
> > for that yet.
> > Parallelization is another question: a single crossbow repo can run ~5
> > Travis jobs and a single AppVeyor job simultaneously; however, we can
> > improve that by introducing more CI services, e.g. pipelines and/or
> > CircleCI.
> >
> > CI service agnostic?
> > Ideally we should abstract away the CI service (the worker itself),
> > where we do the configuration management right now; see the
> > ".<service>.yml" files:
> > https://github.com/apache/arrow/tree/master/dev/tasks/conda-recipes
> > But then we need to create another, custom (I hope not yml) "dialect"
> > to define build requirements (e.g. node, python, ruby, clang, etc.).
> > It's quite hard to plan an easy and flexible interface for that.
> >
> > > >
> > > > This won't be the last time we need to do some infrastructure work
> > > > to scale our testing process, but this will help with testing
> > > > things that we want to make sure work but without having to
> > > > increase the size of our CI matrix.
> > > >
> > > One question which came to my mind is how to develop, debug and
> > > maintain the nightly tasks without waiting for the nightly Travis run
> > > for validation. It doesn't seem easy to trigger a "nightly" build
> > > from the Travis UI.
> > >
> > Good point! Triggering is not the actual issue; evaluating the outcome
> > is. We can submit builds if the PR touches e.g. the task definitions,
> > but we cannot really wait for the results, so triggering builds could
> > be useless.
> >
> > Actually, this can be solved by the GitHub integration bot Wes has
> > mentioned, with manual triggering and approval.
> >
> > >
> > > Regards
> > >
> > > Antoine.
> > >
> > All in all, I feel usability is crucial here. A couple of examples of
> > what a straightforward task definition should look like would be
> > handy. Handling and defining task dependencies is another question too
> > (I'm experimenting with a prototype, though).
> >
> > Regards, Krisztian
>
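
As a rough illustration of the tasks.yml duplication versus "matrix"
trade-off raised in the thread, here is a minimal Python sketch. It is not
Crossbow's actual implementation or file format: the function
expand_matrix, the task-name template, and the "ci", "script" and "params"
keys are all hypothetical names chosen for this example. It only shows how
one declarative template could be expanded into several concrete task
entries instead of writing each one out by hand.

```python
from itertools import product


def expand_matrix(name_template, axes, base):
    """Expand one task template into a concrete task per axis combination.

    All names here are hypothetical; this is not the Crossbow API.
    """
    tasks = {}
    keys = sorted(axes)  # fixed key order so the expansion is deterministic
    for values in product(*(axes[key] for key in keys)):
        params = dict(zip(keys, values))
        name = name_template.format(**params)
        # Every generated task shares the base settings and records the
        # parameter combination it was generated from.
        tasks[name] = {**base, "params": params}
    return tasks


# Hypothetical example: two platforms times two Python versions.
tasks = expand_matrix(
    "conda-{platform}-py{python}",
    axes={"platform": ["linux", "osx"], "python": ["27", "36"]},
    base={"ci": "travis", "script": "build.sh"},
)

for name in sorted(tasks):
    print(name, tasks[name]["params"])
```

The cost of this approach is the one mentioned in the thread: a matrix is
compact, but a task that needs to deviate from the template (a different
script on one platform, say) needs some override mechanism, whereas
explicit entries can be edited independently.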