On 13 Mar 2017, at 13:24, Sam Elamin 
<hussam.ela...@gmail.com> wrote:

Hi Jorn

Thanks for the prompt reply. We really have two main concerns with CD: ensuring 
tests pass and linting the code.

I'd add "providing diagnostics when tests fail", which is a combination of: 
tests providing useful information and CI tooling collecting all those results 
and presenting them meaningfully. The hard parts are invariably (at least for 
me):

- what to do about the intermittent failures
- the tradeoff between thorough testing and fast testing, especially when 
  thorough means "better/larger datasets"

You can treat the output of Jenkins and the tests as data sources for your own 
analysis too: track failure rates over time, test runs over time, etc. That could 
be interesting. If you want to go there, then the question becomes: which CI 
tools produce the most interesting machine-parseable results, above and beyond 
the classic Ant-originated XML test run reports?
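
A minimal sketch of the "test results as a data source" idea, assuming the CI 
job archives JUnit-style XML reports (the Ant-originated format mentioned 
above). The function names and the sample reports are hypothetical, made up 
for illustration; only the Python standard library is used.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def parse_junit_report(xml_text):
    """Yield (test_id, failed) for each executed <testcase> in one report."""
    root = ET.fromstring(xml_text)
    # iter() handles both a bare <testsuite> root and a <testsuites> wrapper.
    for case in root.iter("testcase"):
        if case.find("skipped") is not None:
            continue  # skipped tests did not run, so they are not counted
        test_id = f"{case.get('classname', '')}.{case.get('name', '')}"
        failed = (case.find("failure") is not None
                  or case.find("error") is not None)
        yield test_id, failed


def failure_rates(reports):
    """Map each test id to (failures, runs, failure rate) across reports."""
    failures = defaultdict(int)
    runs = defaultdict(int)
    for xml_text in reports:
        for test_id, failed in parse_junit_report(xml_text):
            runs[test_id] += 1
            failures[test_id] += int(failed)
    return {t: (failures[t], runs[t], failures[t] / runs[t]) for t in runs}


# Two hypothetical runs of the same suite; test names are invented.
RUN_1 = """<testsuite name="etl" tests="2">
  <testcase classname="etl.LoadTest" name="reads_parquet" time="1.2"/>
  <testcase classname="etl.LoadTest" name="joins_tables" time="3.4">
    <failure message="timeout">stack trace here</failure>
  </testcase>
</testsuite>"""

RUN_2 = """<testsuite name="etl" tests="2">
  <testcase classname="etl.LoadTest" name="reads_parquet" time="1.1"/>
  <testcase classname="etl.LoadTest" name="joins_tables" time="2.9"/>
</testsuite>"""

if __name__ == "__main__":
    rates = failure_rates([RUN_1, RUN_2])
    for test_id, (failed, total, rate) in sorted(rates.items()):
        print(f"{test_id}: {failed}/{total} failed ({rate:.0%})")
```

A test whose failure rate sits strictly between 0% and 100% over otherwise 
identical runs is a candidate intermittent failure, which is one way to start 
on the first hard part listed above.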

I have mixed feelings about ScalaTest there: I think the expression language is 
good, but the Maven test runner doesn't report that well, at least for me:

https://steveloughran.blogspot.co.uk/2016/09/scalatest-thoughts-and-ideas.html



I think all platforms should handle this with ease; I was just wondering what 
people are using.

Jenkins seems to have the best Spark plugins, so we are investigating that as 
well as a variety of other hosted CI tools.

Happy to write a blog post detailing our findings and share it here if people 
are interested.


Regards
Sam

On Mon, Mar 13, 2017 at 1:18 PM, Jörn Franke 
<jornfra...@gmail.com> wrote:
Hi,

Jenkins now also supports pipeline as code and multibranch pipelines, so you 
are less dependent on the UI and no longer need a long list of jobs 
for different branches. Additionally, it has a new UI (beta) called Blue Ocean, 
which is a little nicer. You may also check out GoCD. Aside from this, there is 
a huge variety of commercial tools, e.g. Bamboo.
In the cloud, I use Travis CI for my open source GitHub projects, but there are 
also a lot of alternatives, e.g. Distelli.

It really depends on what you expect, e.g. whether you want to version the 
build pipeline in Git, whether you need Docker deployment, etc. I am not sure 
that new starters should be responsible for the build pipeline, so I am not 
sure that I understand your concern in this area.

From my experience, integration tests for Spark can be run on any of these 
platforms.

Best regards

> On 13 Mar 2017, at 10:55, Sam Elamin 
> <hussam.ela...@gmail.com> wrote:
>
> Hi Folks
>
> This is more of a general question. What's everyone using for their CI/CD 
> when it comes to Spark?
>
> We are using PySpark but are potentially looking to move to Spark with Scala 
> and sbt in the future.
>
>
> One of the suggestions was Jenkins, but I know the UI isn't great for new 
> starters so I'd rather avoid it. I've used TeamCity, but that was more 
> focused on .NET development.
>
>
> What are people using?
>
> Kind Regards
> Sam

