Re: @scala.annotation.varargs or @_root_.scala.annotation.varargs?

2016-09-23 Thread Jacek Laskowski
On Sat, Sep 24, 2016 at 5:27 AM, Hyukjin Kwon wrote: > Then, are we going to submit a PR and fix this maybe? https://issues.apache.org/jira/browse/SPARK-17656 Thanks Hyukjin! Unless someone beats me to it, I'm going to have a PR over the weekend. Jacek -

Re: Spark job fails as soon as it starts. Driver requested a total number of 168510 executor

2016-09-23 Thread Yash Sharma
Hi Dhruve, thanks. I've solved the issue by setting max executors. I wanted to find some place where I can add this behavior in Spark so that users do not have to worry about max executors. Cheers - Thanks, via mobile, excuse brevity. On Sep 24, 2016 1:15 PM, "dhruve ashar" wrote: > F

Re: @scala.annotation.varargs or @_root_.scala.annotation.varargs?

2016-09-23 Thread Hyukjin Kwon
Then, are we going to submit a PR and fix this maybe? On 9 Sep 2016 9:30 p.m., "Sean Owen" wrote: > Oh I get it now. It was necessary in the past. Sure, seems like it > could be standardized now. > > On Fri, Sep 9, 2016 at 1:13 AM, Reynold Xin wrote: > > Yea but the earlier email was asking they

Re: Spark job fails as soon as it starts. Driver requested a total number of 168510 executor

2016-09-23 Thread Yash Sharma
Is there anywhere I can help fix this? I can see the requests being made in the YARN allocator. What should be the upper limit on the requests made? https://github.com/apache/spark/blob/master/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L222 On Sat, Sep 24, 2016 at 10:2

Re: Spark job fails as soon as it starts. Driver requested a total number of 168510 executor

2016-09-23 Thread Yash Sharma
I have been playing around with configs to crack this. Adding them here in case they are helpful to others :) The number of executors and the timeout seemed to be the core issue. {code} --driver-memory 4G \ --conf spark.dynamicAllocation.enabled=true \ --conf spark.dynamicAllocation.maxExecutors=500 \ --con
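As a rough illustration of why a cap matters here, the sketch below models how a dynamic allocator might derive its executor target from outstanding tasks and then clamp it. The function name, its parameters, and the one-task-per-executor figure are assumptions made for illustration; this is not Spark's allocator code. The 500 cap mirrors the spark.dynamicAllocation.maxExecutors value in the message above, and 168510 is the request count from the thread subject.

```python
import math


def target_executors(pending_tasks, running_tasks, tasks_per_executor,
                     min_executors=0, max_executors=math.inf):
    """Hypothetical model of a dynamic-allocation target, clamped to bounds."""
    # Executors needed to run every outstanding task at once.
    needed = math.ceil((pending_tasks + running_tasks) / tasks_per_executor)
    # Clamp: never above the cap, never below the floor.
    return int(max(min(needed, max_executors), min_executors))


# Without a cap, the request grows with the task backlog
# (assuming, for illustration only, one task slot per executor).
uncapped = target_executors(168510, 0, 1)

# With a cap of 500, the request is bounded regardless of the backlog.
capped = target_executors(168510, 0, 1, max_executors=500)

print(uncapped, capped)
```

Under this model, an unbounded maximum lets the request track the backlog directly, which would be consistent with the very large number reported in the thread.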

Re: Why Expression.deterministic method and Nondeterministic trait?

2016-09-23 Thread Reynold Xin
The deterministic method describes whether this instance of the expression tree is deterministic, whereas the Nondeterministic trait is about a class. On Fri, Sep 23, 2016 at 10:46 AM, Jacek Laskowski wrote: > Hi Herman, > > That helps to know that someone can explain why we've got the two > nondetermi
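The distinction can be sketched with a toy expression tree. This is a minimal model, not Catalyst's code: the class names echo the thread, but initialize, eval_internal, and the seed parameter are hypothetical. It shows how a node whose class is not marked Nondeterministic can still report deterministic as false, because the flag is computed over the whole tree.

```python
import random


class Expression:
    """Toy expression node (illustrative only, not Spark's class)."""

    def __init__(self, *children):
        self.children = children

    @property
    def deterministic(self):
        # Instance-level: true only if every child in this tree is deterministic.
        return all(child.deterministic for child in self.children)


class Nondeterministic(Expression):
    """Class-level marker for expressions that hold state needing setup."""

    initialized = False

    @property
    def deterministic(self):
        return False

    def initialize(self, seed):
        # Hypothetical hook: prepare per-instance state before evaluation.
        self.rng = random.Random(seed)
        self.initialized = True

    def eval(self):
        if not self.initialized:
            raise RuntimeError("initialize() must be called before eval()")
        return self.eval_internal()


class Literal(Expression):
    def __init__(self, value):
        super().__init__()
        self.value = value

    def eval(self):
        return self.value


class Rand(Nondeterministic):
    def eval_internal(self):
        return self.rng.random()


class Add(Expression):
    def eval(self):
        left, right = self.children
        return left.eval() + right.eval()


# Add is not a Nondeterministic class, yet this tree is non-deterministic.
tree = Add(Literal(1), Rand())
print(isinstance(tree, Nondeterministic), tree.deterministic)
```

In this sketch the marker class exists so that stateful nodes get a well-defined initialization step, while the flag answers a question about one concrete tree.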

Re: [VOTE] Release Apache Spark 2.0.1 (RC2)

2016-09-23 Thread Jacek Laskowski
Hi, Not that it could fix the issue but no -Pmesos? Jacek On 24 Sep 2016 12:08 a.m., "Sean Owen" wrote: > +1 Signatures and hashes check out. I checked that the Kinesis > assembly artifacts are not present. > > I compiled and tested on Java 8 / Ubuntu 16 with -Pyarn -Phive > -Phive-thriftserve

Re: [VOTE] Release Apache Spark 2.0.1 (RC2)

2016-09-23 Thread vaquar khan
+1 (non-binding) No issue found. Regards, Vaquar khan On 23 Sep 2016 17:25, "Mark Hamstra" wrote: Similar but not identical configuration (Java 8/macOS 10.12 with build/mvn -Phive -Phive-thriftserver -Phadoop-2.7 -Pyarn clean install); Similar but not identical failure: ... - line wrapper only i

Re: [VOTE] Release Apache Spark 2.0.1 (RC2)

2016-09-23 Thread Mark Hamstra
Similar but not identical configuration (Java 8/macOS 10.12 with build/mvn -Phive -Phive-thriftserver -Phadoop-2.7 -Pyarn clean install); Similar but not identical failure: ... - line wrapper only initialized once when used as encoder outer scope Spark context available as 'sc' (master = local-c

Re: [VOTE] Release Apache Spark 2.0.1 (RC2)

2016-09-23 Thread Sean Owen
+1 Signatures and hashes check out. I checked that the Kinesis assembly artifacts are not present. I compiled and tested on Java 8 / Ubuntu 16 with -Pyarn -Phive -Phive-thriftserver -Phadoop-2.7 -Psparkr and only saw one test problem. This test never completed. If nobody else sees it, +1, assuming

Re: [VOTE] Release Apache Spark 2.0.1 (RC2)

2016-09-23 Thread Ricardo Almeida
+1 (non-binding) Build: OK, but can no longer use the "--tgz" option when calling make-distribution.sh (maybe a problem on my side?) Run: No regressions from 2.0.0 detected. Tested our pipelines on a standalone cluster (Python API) On 23 September 2016 at 08:01, Reynold Xin wrote: > Please v

Re: [VOTE] Release Apache Spark 2.0.1 (RC2)

2016-09-23 Thread Luciano Resende
+1 (non-binding) also verified that the assembly files with license issues are not being published to maven staging repositories. On Thu, Sep 22, 2016 at 11:01 PM, Reynold Xin wrote: > Please vote on releasing the following candidate as Apache Spark version > 2.0.1. The vote is open until Sunday

Re: Why Expression.deterministic method and Nondeterministic trait?

2016-09-23 Thread Jacek Laskowski
Hi Herman, That helps to know that someone can explain why we've got the two nondeterministic states. It's not possible to say...a non-Nondeterministic expression can be non-deterministic (the former is the trait while the latter is the method) #strange Pozdrawiam, Jacek Laskowski https://m

Re: Why Expression.deterministic method and Nondeterministic trait?

2016-09-23 Thread Herman van Hövell tot Westerflier
Jacek, A non-deterministic expression usually holds some state. The Nondeterministic trait makes sure a user can initialize this state properly. Take a look at InterpretedProjection

Why Expression.deterministic method and Nondeterministic trait?

2016-09-23 Thread Jacek Laskowski
Hi, Just came across the Expression trait [1] that can be checked for determinism by the method deterministic [2] and the trait Nondeterministic [3]. Why both? [1] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Expression.scala#L53 [2]

Re: [SPARK-15717][GraphX] status

2016-09-23 Thread Asher Krim
Thanks Anderson! I have not tried the fix yet due to the way we currently build spark (we don't really yet :-(). Once we build internally, I can give it a whirl. On Thu, Sep 22, 2016 at 6:03 PM, Anderson de Andrade wrote: > Done. > > On Thu, Sep 22, 2016 at 5:53 PM, Anderson de Andrade < > adea

Re: Spark job fails as soon as it starts. Driver requested a total number of 168510 executor

2016-09-23 Thread aditya . calangutkar
For testing purposes, can you run with a fixed number of executors? Maybe try 12 executors and let us know the status. On Fri, Sep 23, 2016 at 3:13 PM +0530, "Yash Sharma" wrote: Thanks Aditya, appreciate the help. I had the exact thought about

Re: Spark job fails as soon as it starts. Driver requested a total number of 168510 executor

2016-09-23 Thread Yash Sharma
Thanks Aditya, appreciate the help. I had the exact thought about the huge number of executors requested. I am going with the dynamic executors and not specifying the number of executors. Are you suggesting that I should limit the number of executors when the dynamic allocator requests for more nu

Re: [VOTE] Release Apache Spark 2.0.1 (RC2)

2016-09-23 Thread Jacek Laskowski
+1 Pozdrawiam, Jacek Laskowski https://medium.com/@jaceklaskowski/ Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark Follow me at https://twitter.com/jaceklaskowski On Fri, Sep 23, 2016 at 8:01 AM, Reynold Xin wrote: > Please vote on releasing the following candidate as Apach

Re: Spark Yarn Cluster with Reference File

2016-09-23 Thread Aditya
Hi Abhishek, From your spark-submit it seems you're passing the file as a parameter to the driver program. So now it depends on what exactly you are doing with that parameter. Using the --files option it will be available to all the worker nodes, but if in your code you are referencing using the spe

Re: Spark job fails as soon as it starts. Driver requested a total number of 168510 executor

2016-09-23 Thread Aditya
Hi Yash, What is your total cluster memory and number of cores? The problem might be with the number of executors you are allocating. The logs show it as 168510, which is on the very high side. Try reducing your executors. On Friday 23 September 2016 12:34 PM, Yash Sharma wrote: Hi All, I have a spa

Spark job fails as soon as it starts. Driver requested a total number of 168510 executor

2016-09-23 Thread Yash Sharma
Hi All, I have a Spark job which runs over a huge bulk of data with dynamic allocation enabled. The job takes some 15 minutes to start up and fails as soon as it starts*. Is there anything I can check to debug this problem? There is not a lot of information in the logs for the exact cause but here is