Re: Spark 1.6.1

2016-02-02 Thread Mingyu Kim
Cool, thanks! Mingyu From: Michael Armbrust Date: Tuesday, February 2, 2016 at 10:48 AM To: Mingyu Kim Cc: Romi Kuntsman , Hamel Kothari , Ted Yu , "dev@spark.apache.org"

Re: Spark 1.6.1

2016-02-02 Thread Mingyu Kim
Hi all, Is there an estimated timeline for 1.6.1 release? Just wanted to check how the release is coming along. Thanks! Mingyu From: Romi Kuntsman Date: Tuesday, February 2, 2016 at 3:16 AM To: Michael Armbrust Cc: Hamel Kothari

Re: Spark 1.6.1

2016-02-02 Thread Michael Armbrust
I'm waiting for a few last fixes to be merged. Hoping to cut an RC in the next few days. On Tue, Feb 2, 2016 at 10:43 AM, Mingyu Kim wrote: > Hi all, > > Is there an estimated timeline for 1.6.1 release? Just wanted to check how > the release is coming along. Thanks! > >

Re: Spark 1.6.1

2016-02-02 Thread Michael Armbrust
> > What about the memory leak bug? > https://issues.apache.org/jira/browse/SPARK-11293 > Even after the memory rewrite in 1.6.0, it still happens in some cases. > Will it be fixed for 1.6.1? > I think we have enough issues queued up that I would not hold the release for that, but if there is a

Re: [ANNOUNCE] New SAMBA Package = Spark + AWS Lambda

2016-02-02 Thread David Russell
Hi Ben, > My company uses Lambda to do simple data moving and processing using Python > scripts. I can see using Spark instead for the data processing would make it > into a real production-level platform. That may be true. Spark has first-class support for Python which should make your life

Re: Encrypting jobs submitted by the client

2016-02-02 Thread Ted Yu
For #1, a brief search landed the following: core/src/main/scala/org/apache/spark/SparkConf.scala: DeprecatedConfig("spark.rpc", "2.0", "Not used any more.") core/src/main/scala/org/apache/spark/SparkConf.scala: "spark.rpc.numRetries" -> Seq(

Re: Encrypting jobs submitted by the client

2016-02-02 Thread eugene miretsky
Thanks Steve! 1. spark-submit submitting the YARN app for launch? You get that if you turn Hadoop IPC encryption on, by setting hadoop.rpc.protection=privacy across the cluster. > That's what I meant: Is there something similar for standalone or Mesos? 2. communications between Spark driver

Spark saveAsHadoopFile stage fails with ExecutorLostFailure

2016-02-02 Thread Prabhu Joseph
Hi All, A Spark job stage that calls saveAsHadoopFile fails with ExecutorLostFailure whenever the executor is run with more cores. The stage is not memory intensive; each executor has 20GB memory. For example: 6 executors each with 6 cores, ExecutorLostFailure happens; 10 executors each with 2 cores,

Re: Spark 1.6.0 Streaming + Persistence Bug?

2016-02-02 Thread mkhaitman
Actually, disregard! I forgot that spark.dynamicAllocation.cachedExecutorIdleTimeout defaults to infinity, so lowering it should solve the problem :) Mark.
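For readers who want to try the suggestion above, here is a minimal sketch of passing a lower timeout through `spark-submit --conf`. The property name comes from the message. The `600s` value, the helper name `spark_submit_args`, and the application file `streaming_app.py` are illustrative assumptions, not anything stated in the thread.

```python
# Sketch: build spark-submit arguments that lower the cached-executor
# idle timeout. The "600s" value and the application file name are
# placeholders chosen for illustration only.

def spark_submit_args(app, conf):
    """Return a spark-submit command line as a list of arguments."""
    args = ["spark-submit"]
    # Emit one "--conf key=value" pair per setting, in a stable order.
    for key in sorted(conf):
        args += ["--conf", "{}={}".format(key, conf[key])]
    args.append(app)
    return args


conf = {
    "spark.dynamicAllocation.enabled": "true",
    "spark.dynamicAllocation.cachedExecutorIdleTimeout": "600s",
}
command = spark_submit_args("streaming_app.py", conf)
print(" ".join(command))
```

Whether a finite timeout actually releases the cached blocks in the streaming case described in this thread would still need to be checked against a running job.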

Re: Spark 1.6.1

2016-02-02 Thread Romi Kuntsman
Hi Michael, What about the memory leak bug? https://issues.apache.org/jira/browse/SPARK-11293 Even after the memory rewrite in 1.6.0, it still happens in some cases. Will it be fixed for 1.6.1? Thanks, *Romi Kuntsman*, *Big Data Engineer* http://www.totango.com On Mon, Feb 1, 2016 at 9:59 PM,

Spark 1.6.0 Streaming + Persistence Bug?

2016-02-02 Thread mkhaitman
Calling unpersist on an RDD in a Spark Streaming application does not actually unpersist the blocks from memory and/or disk. After the RDD has been processed in a .foreach(rdd) call, I attempt to unpersist the RDD since it is no longer useful to store in memory/disk. This mainly causes a problem

Launch dev/run-tests on Windows

2016-02-02 Thread Wen Pei Yu
Hi all, Has anyone tried launching dev/run-tests on Windows? I'm facing some issues: 1. The `which` function doesn't support checking for files without an extension, like "java" vs "java.exe", "R" vs "R.exe". 2. I get the error below in the `run_cmd` function; the major issue is that some script files fail to run on Windows.
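The first issue above (a lookup for "java" that misses "java.exe") can be illustrated with a small extension-aware search. This is only a sketch and not the actual `which` helper used by dev/run-tests. The function name `which_with_extensions` and its parameters are hypothetical, and it assumes the Windows `PATHEXT` variable lists the executable extensions.

```python
import os


def which_with_extensions(program, search_dirs=None, extensions=None):
    """Return the first matching executable path, or None.

    Tries the bare name first, then the name with each extension
    appended, so "java" can resolve to "java.exe".
    """
    if search_dirs is None:
        path = os.environ.get("PATH", "")
        search_dirs = [d for d in path.split(os.pathsep) if d]
    if extensions is None:
        # PATHEXT is normally set on Windows; elsewhere this is empty.
        extensions = os.environ.get("PATHEXT", "").split(os.pathsep)
    suffixes = [""] + [ext for ext in extensions if ext]
    for directory in search_dirs:
        for suffix in suffixes:
            candidate = os.path.join(directory, program + suffix)
            if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
                return candidate
    return None
```

On Python 3, the standard library's `shutil.which` already takes `PATHEXT` into account on Windows, so it may be a simpler replacement where it is available.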

Re: Encrypting jobs submitted by the client

2016-02-02 Thread Steve Loughran
> On 1 Feb 2016, at 20:48, eugene miretsky wrote: > > Spark supports client authentication via shared secret or kerberos (on YARN). > However, the job itself is sent unencrypted over the network. Is there a way > to encrypt the jobs the client submits to cluster?