Re: [VOTE] Apache Spark 2.1.1 (RC3)

2017-04-18 Thread Nicholas Chammas
I had trouble starting up a shell with the AWS package loaded (specifically, org.apache.hadoop:hadoop-aws:2.7.3): [NOT FOUND ] com.sun.jersey#jersey-server;1.9!jersey-server.jar(bundle) (0ms) local-m2-cache: tried

[DStream][Kinesis] Requesting review for spark-kinesis retries

2017-04-18 Thread Yash Sharma
Hi Fellow Devs, Please share your thoughts on the pull request that allows Spark to retry more gracefully with Kinesis streaming. The patch removes hard-coded values in the code and allows users to pass the values in config. This will help users cope with Kinesis throttling errors and

Re: [VOTE] Apache Spark 2.1.1 (RC2)

2017-04-18 Thread Michael Armbrust
In case it wasn't obvious by the appearance of RC3, this vote failed. On Thu, Mar 30, 2017 at 4:09 PM, Michael Armbrust wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.1.1. The vote is open until Sun, April 2nd, 2017 at 16:30 PST and

[VOTE] Apache Spark 2.1.1 (RC3)

2017-04-18 Thread Michael Armbrust
Please vote on releasing the following candidate as Apache Spark version 2.1.1. The vote is open until Fri, April 21st, 2017 at 13:00 PST and passes if a majority of at least 3 +1 PMC votes are cast. [ ] +1 Release this package as Apache Spark 2.1.1 [ ] -1 Do not release this package because ...

Re: RDD functions using GUI

2017-04-18 Thread Reynold Xin
This is not really a dev list question ... I'm sure some tools exist out there, e.g. Talend, Alteryx. On Tue, Apr 18, 2017 at 10:35 AM, Ke Yang (Conan) wrote: > Ping… wonder why there aren’t any such drag-n-drop GUI tools for creating > batch query scripts? > > Thanks > > >

CfP - VHPC at ISC extension - Papers due May 2

2017-04-18 Thread VHPC 17
CALL FOR PAPERS 12th Workshop on Virtualization in High-Performance Cloud Computing (VHPC '17) held in conjunction with the International Supercomputing Conference - High Performance, June 18-22, 2017, Frankfurt, Germany.

RE: RDD functions using GUI

2017-04-18 Thread Ke Yang (Conan)
Ping... wonder why there aren't any such drag-n-drop GUI tools for creating batch query scripts? Thanks From: Ke Yang (Conan) Sent: Monday, April 17, 2017 5:31 PM To: 'dev@spark.apache.org' Subject: RDD functions using GUI Hi, Are there drag and drop GUI (code-free) for

branch-2.2 has been cut

2017-04-18 Thread Michael Armbrust
I just cut the release branch for Spark 2.2. If you are merging important bug fixes, please backport as appropriate. If you have doubts if something should be backported, please ping me. I'll follow with an RC later this week.

Re: distributed computation of median

2017-04-18 Thread pavan adukuri
Do you know of any Python implementation for the same? thanks pavan On 4/17/17, 9:54 AM, svjk24 wrote: Hello, Is there any interest in an efficient distributed algorithm for computing the median? A Google search pulls up some Stack Overflow discussion, but it would be good to have one provided.