[VOTE][RESULT] Spark 2.4.5 (RC2)

2020-02-05 Thread Dongjoon Hyun
Hi, All.

The vote passes. Thanks to all who helped with this release 2.4.5! I'll follow up later with a release announcement once everything is published.

+1 (* = binding):
- Dongjoon Hyun *
- Wenchen Fan *
- Hyukjin Kwon *
- Takeshi Yamamuro
- Maxim Gekk
- Sean Owen *

+0: None

-1: None

Re: Apache Spark Docker image repository

2020-02-05 Thread shane knapp ☠
> (This can be used in GitHub Action Jobs and Jenkins K8s
> Integration Tests to speed up jobs and to have more stable environments)

yep! not only that, if we ever get around (hopefully this year) to containerizing (the majority of) the master and branch builds, i think it'd be nice to

Re: Apache Spark Docker image repository

2020-02-05 Thread Jiaxin Shan
I will vote for this. It's pretty helpful to have managed Spark images. Currently, users have to download Spark binaries and build their own images. With this supported, the user journey will be simplified, and we will only need to build an application image on top of the base image provided by the community. Do we have

subscribe

2020-02-05 Thread Cool Joe
subscribe

Re: Apache Spark Docker image repository

2020-02-05 Thread Sean Owen
What would the images have - just the image for a worker? We wouldn't want to publish N permutations of Python, R, OS, Java, etc. But if we don't then we make one or a few choices of that combo, and then I wonder how many people find the image useful. If the goal is just to support Spark testing,

Apache Spark Docker image repository

2020-02-05 Thread Dongjoon Hyun
Hi, All.

From 2020, shall we have an official Docker image repository as an additional distribution channel? I'm considering the following images.

- Public binary release (no snapshot image)
- Public non-Spark base image (OS + R + Python) (This can be used in GitHub Action Jobs

Re: [spark-packages.org] Jenkins down

2020-02-05 Thread Xiao Li
@Cheng Lian just recreated the Jenkins service. The service is up now. Thank you for your patience,

Xiao

Dongjoon Hyun wrote on Fri, Jan 24, 2020 at 10:32 AM:

> Thank you for updating!
>
> On Fri, Jan 24, 2020 at 10:29 AM Xiao Li wrote:
>
>> It does not block any Spark release. Reduced the priority to

Re: [SQL] Is it worth it (and advisable) to implement native UDFs?

2020-02-05 Thread Walaa Eldin Moustafa
For a general-purpose code example, you may take a look at the class we defined in Transport UDFs to express all Expression UDFs [1]. This is an internal class though and not a user-facing API. User-facing UDF example is in [2]. It leverages [1] behind the scenes. [1]

Re: Spark 3.0 branch cut and code freeze on Jan 31?

2020-02-05 Thread Hyukjin Kwon
Awesome Shane.

On Wed, Feb 5, 2020 at 7:29 AM, Xiao Li wrote:

> Thank you, Shane!
>
> Xiao
>
> On Tue, Feb 4, 2020 at 2:16 PM Dongjoon Hyun wrote:
>
>> Thank you, Shane! :D
>>
>> Bests,
>> Dongjoon
>>
>> On Tue, Feb 4, 2020 at 13:28 shane knapp ☠ wrote:
>>
>>> all the 3.0 builds have been created

Re: [SQL] Is it worth it (and advisable) to implement native UDFs?

2020-02-05 Thread Wenchen Fan
This is really a hack, and we don't recommend that users access internal classes directly. That's why there is no public documentation. If you really need to do it and are aware of the risks, you can read the source code. All expressions (the so-called "native UDFs") extend the base class `Expression`.