On 13 Mar 2017, at 13:24, Sam Elamin <hussam.ela...@gmail.com> wrote:
> Hi Jorn
>
> Thanks for the prompt reply. Really we have 2 main concerns with CD: ensuring tests pass and linting the code.

I'd add "providing diagnostics when tests fail", which is a combination of tests providing useful information and CI tooling collecting all those results and presenting them meaningfully.

The hard parts are invariably (at least for me):

- what to do about the intermittent failures
- the tradeoff between thorough testing and fast testing, especially when thorough means "better/larger datasets"

You can consider the output of Jenkins & tests as data sources for your own analysis too: track failure rates over time, test runs over time, etc. Could be interesting. If you want to go there, then the question becomes "which CI tooling produces the most interesting machine-parseable results, above and beyond the classic Ant-originated XML test run reports?"

I have mixed feelings about ScalaTest there: I think the expression language is good, but the Maven test runner doesn't report that well, at least for me: https://steveloughran.blogspot.co.uk/2016/09/scalatest-thoughts-and-ideas.html

> I think all platforms should handle this with ease, I was just wondering what people are using.
>
> Jenkins seems to have the best Spark plugins, so we are investigating that as well as a variety of other hosted CI tools.
>
> Happy to write a blog post detailing our findings and share it here if people are interested.
>
> Regards
> Sam
>
> On Mon, Mar 13, 2017 at 1:18 PM, Jörn Franke <jornfra...@gmail.com> wrote:
>> Hi,
>>
>> Jenkins also now supports pipeline as code and multibranch pipelines. Thus you are not so dependent on the UI, and you no longer need a long list of jobs for different branches. Additionally it has a new UI (beta) called Blue Ocean, which is a little bit nicer. You may also check GoCD. Aside from this you have a huge variety of commercial tools, e.g. Bamboo.
>>
>> In the cloud, I use Travis CI for my open source GitHub projects, but there are also a lot of alternatives, e.g. Distelli. It really depends on what you expect, e.g. whether you want to version the build pipeline in Git, whether you need Docker deployment, etc. I am not sure if new starters should be responsible for the build pipeline, thus I am not sure that I understand your concern in this area.
>>
>> From my experience, integration tests for Spark can be run on any of these platforms.
>>
>> Best regards
>>
>>> On 13 Mar 2017, at 10:55, Sam Elamin <hussam.ela...@gmail.com> wrote:
>>>
>>> Hi Folks
>>>
>>> This is more of a general question. What's everyone using for their CI/CD when it comes to Spark?
>>>
>>> We are using PySpark but potentially looking to move to Spark Scala and sbt in the future.
>>>
>>> One of the suggestions was Jenkins, but I know the UI isn't great for new starters so I'd rather avoid it. I've used TeamCity but that was more focused on .NET development.
>>>
>>> What are people using?
>>>
>>> Kind Regards
>>> Sam
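
To make the "test output as a data source" idea from this thread a bit more concrete, here is a rough sketch in Python (since PySpark is in use). It assumes the CI job archives JUnit-style XML reports with <testcase> elements carrying <failure>, <error> or <skipped> children, which is the common layout but may differ per test runner. The function names and the sample reports are made up for illustration; nothing here is tied to a particular CI tool.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def parse_report(xml_text):
    """Return {test_id: passed} for one JUnit-style XML report."""
    root = ET.fromstring(xml_text)
    results = {}
    # iter() also finds <testcase> elements nested under a <testsuites> wrapper.
    for case in root.iter("testcase"):
        if case.find("skipped") is not None:
            continue  # a skipped test says nothing about pass/fail
        test_id = "{}.{}".format(case.get("classname", ""), case.get("name", ""))
        failed = case.find("failure") is not None or case.find("error") is not None
        results[test_id] = not failed
    return results


def failure_rates(reports):
    """Given XML report strings (one per CI run), return {test_id: failure fraction}."""
    runs = defaultdict(int)
    failures = defaultdict(int)
    for xml_text in reports:
        for test_id, passed in parse_report(xml_text).items():
            runs[test_id] += 1
            if not passed:
                failures[test_id] += 1
    return {test_id: failures[test_id] / runs[test_id] for test_id in runs}


def intermittent(rates):
    """Tests that failed in some runs but not all: candidates for flakiness."""
    return sorted(t for t, rate in rates.items() if 0 < rate < 1)


# Hypothetical sample data: two runs of the same two tests.
RUN_1 = """<testsuite name="etl" tests="2">
  <testcase classname="etl.LoadTest" name="test_schema"/>
  <testcase classname="etl.LoadTest" name="test_join"><failure message="timeout"/></testcase>
</testsuite>"""
RUN_2 = """<testsuite name="etl" tests="2">
  <testcase classname="etl.LoadTest" name="test_schema"/>
  <testcase classname="etl.LoadTest" name="test_join"/>
</testsuite>"""

if __name__ == "__main__":
    rates = failure_rates([RUN_1, RUN_2])
    for test_id, rate in sorted(rates.items()):
        print("{}: {:.0%} failed".format(test_id, rate))
    print("intermittent:", intermittent(rates))
```

In practice the report strings would be read from whatever the CI server archives per build. A test that fails in some runs and passes in others on unchanged code is only a candidate for "intermittent"; it still needs a human look.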