Re: Reply: trouble understanding data frame memory usage "java.io.IOException: Unable to acquire memory"

2015-12-30 Thread SparkUser
Sounds like you guys are on the right track. This is purely FYI because I haven't seen it posted; I'm just responding to the line in the original post saying that your data structure should fit in memory. OK, two more disclaimers: "FWIW" and "maybe this is not relevant or already covered". OK, here goes...

Re: Can't submit job to stand alone cluster

2015-12-30 Thread SparkUser
Sorry, need to clarify. When you say: When the docs say "If your application is launched through Spark submit, then the application jar is automatically distributed to all worker nodes," it is actually saying that your executors get their jars from the driver. This is true

Re: map spark.driver.appUIAddress IP to different IP

2015-12-28 Thread SparkUser
Wouldn't Amazon Elastic IP do this for you? http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/elastic-ip-addresses-eip.html On 12/28/2015 10:58 PM, Divya Gehlot wrote: Hi, I have an HDP 2.3.2 cluster installed in Amazon EC2. I want to update the IP address of spark.driver.appUIAddress, which is

How to calculate percentiles with Spark?

2014-10-21 Thread sparkuser
Hi, What would be the best way to get percentiles from a Spark RDD? I can see JavaDoubleRDD or MLlib's MultivariateStatisticalSummary https://spark.apache.org/docs/latest/mllib-statistics.html provide the mean() but not percentiles. Thank you! Horace
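
For readers landing on this question, a minimal sketch of what "percentile" means computationally may help. It is plain Python with no Spark dependency and uses the nearest-rank definition, which is only one of several common percentile definitions. The function name nearest_rank_percentile is hypothetical and not part of any Spark API. On an RDD the same idea would presumably be sort, index, then look up the rank, but that translation is an assumption and is not shown in the thread.

```python
import math


def nearest_rank_percentile(values, p):
    """Return the p-th percentile of values using the nearest-rank method.

    Hypothetical helper for illustration only; p must satisfy 0 < p <= 100.
    """
    if not values:
        raise ValueError("values must be non-empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")

    # Sort a copy so the caller's data is left untouched.
    ordered = sorted(values)

    # 1-based rank: the smallest rank covering at least p percent of the data.
    # Multiplying before dividing keeps whole-number cases exact.
    rank = math.ceil(p * len(ordered) / 100)

    return ordered[rank - 1]


if __name__ == "__main__":
    sample = [15, 20, 35, 40, 50]
    for p in (30, 50, 100):
        print(p, nearest_rank_percentile(sample, p))
```

Sorting the full data set is exact but expensive on large inputs, so an approximate method may be a better fit when the data does not fit comfortably on one machine.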