Re: zinc invocation examples
fwiw I've been using `zinc -scala-home $SCALA_HOME -nailed -start`, which:
- starts a nailgun server as well,
- uses my installed Scala 2.{10,11}, as opposed to zinc's default 2.9.2.
Per https://github.com/typesafehub/zinc#scala: "If no options are passed to locate a version of Scala then Scala 2.9.2 is used by default (which is bundled with zinc)." The latter seems like it might be especially important.
On Thu Dec 04 2014 at 4:25:32 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Oh, derp. I just assumed from looking at all the options that there was something to it. Thanks Sean.
On Thu Dec 04 2014 at 7:47:33 AM Sean Owen so...@cloudera.com wrote: You just run it once with `zinc -start` and leave it running as a background process on your build machine. You don't have to do anything for each build.
On Wed, Dec 3, 2014 at 3:44 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: https://github.com/apache/spark/blob/master/docs/building-spark.md#speeding-up-compilation-with-zinc Could someone summarize how they invoke zinc as part of a regular build-test-etc. cycle? I'll add it to the aforelinked page if appropriate. Nick
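Putting the thread's answers together, a minimal sketch of the cycle being described (assuming zinc is on your PATH and $SCALA_HOME points at an installed Scala 2.10/2.11):

    # One-time: start a long-lived zinc/nailgun server using the system Scala
    zinc -scala-home $SCALA_HOME -nailed -start

    # Every build thereafter picks up the running server automatically;
    # no per-build zinc invocation is needed
    mvn -DskipTests clean package

    # When done (e.g. before switching Scala versions), stop the server
    zinc -shutdown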
Re: Unit tests in 5 minutes
@Patrick and Josh, actually we went even further than that. We simply disable the UI for most tests, and it used to be the single largest source of port conflicts.
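For anyone replicating this in their own suites, a minimal sketch (the app name and master URL are placeholders; `spark.ui.enabled` is the standard config key):

    import org.apache.spark.{SparkConf, SparkContext}

    // Disable the web UI so parallel test suites don't race for ports 4040+
    val conf = new SparkConf()
      .setMaster("local[2]")
      .setAppName("MySuite")
      .set("spark.ui.enabled", "false")
    val sc = new SparkContext(conf)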
Re: zinc invocation examples
One thing I created a JIRA for a while back was to have a similar script to sbt/sbt that transparently downloads Zinc, Scala, and Maven in a subdirectory of Spark and sets it up correctly, i.e. build/mvn. Outside of brew for Mac OS there aren't good Zinc packages, and it's a pain to figure out how to set it up. https://issues.apache.org/jira/browse/SPARK-4501 Prashant Sharma looked at this for a bit but I don't think he's working on it actively any more, so if someone wanted to do this, I'd be extremely grateful. - Patrick
On Fri, Dec 5, 2014 at 11:05 AM, Ryan Williams ryan.blake.willi...@gmail.com wrote: fwiw I've been using `zinc -scala-home $SCALA_HOME -nailed -start` ...
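A rough sketch of what such a build/mvn bootstrap wrapper could look like; the zinc version, download URL, and tarball layout here are assumptions for illustration, not what SPARK-4501 actually shipped:

    #!/usr/bin/env bash
    # Hypothetical bootstrap: fetch zinc into build/ on first use, then run mvn.
    set -e
    BUILD_DIR="$(cd "$(dirname "$0")" && pwd)"
    ZINC_VERSION=0.3.5.3   # assumed version, for illustration only
    ZINC_HOME="$BUILD_DIR/zinc-$ZINC_VERSION"

    if [ ! -x "$ZINC_HOME/bin/zinc" ]; then
      # assumed download location; substitute wherever zinc tarballs actually live
      curl -fL "http://downloads.typesafe.com/zinc/$ZINC_VERSION/zinc-$ZINC_VERSION.tgz" \
        | tar xz -C "$BUILD_DIR"
    fi

    # Start the nailgun server if it isn't already running, then delegate to mvn
    "$ZINC_HOME/bin/zinc" -start >/dev/null 2>&1 || true
    exec mvn "$@"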
Re: drop table if exists throws exception
The command runs fine for me on master. Note that Hive does print an exception in the logs, but that exception does not propagate to user code.
On Thu, Dec 4, 2014 at 11:31 PM, Jianshi Huang jianshi.hu...@gmail.com wrote: Hi, I got an exception saying Hive: NoSuchObjectException(message:table table not found) when running DROP TABLE IF EXISTS table. Looks like a new regression in the Hive module. Can anyone confirm this? Thanks, -- Jianshi Huang LinkedIn: jianshi Twitter: @jshuang Github Blog: http://huangjs.github.com/
Re: drop table if exists throws exception
And that is no different from how Hive has worked for a long time.
On Fri, Dec 5, 2014 at 11:42 AM, Michael Armbrust mich...@databricks.com wrote: The command runs fine for me on master. Note that Hive does print an exception in the logs, but that exception does not propagate to user code. ...
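To see the behavior being described, a minimal sketch (the table name is a placeholder, and `hiveContext` is assumed to be an existing HiveContext):

    // Succeeds whether or not the table exists; Hive may still log a
    // NoSuchObjectException internally, but nothing is thrown to user code.
    hiveContext.sql("DROP TABLE IF EXISTS some_table")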
CREATE TABLE AS SELECT does not work with temp tables in 1.2.0
I am having trouble getting CREATE TABLE AS SELECT or saveAsTable from a HiveContext to work with temp tables in Spark 1.2. No issues in 1.1.0 or 1.1.1. Simple modification to a test case in the Hive SQLQuerySuite.scala:

  test("double nested data") {
    sparkContext.parallelize(Nested1(Nested2(Nested3(1))) :: Nil).registerTempTable("nested")
    checkAnswer(
      sql("SELECT f1.f2.f3 FROM nested"),
      1)
    checkAnswer(sql("CREATE TABLE test_ctas_1234 AS SELECT * from nested"),
      Seq.empty[Row])
    checkAnswer(
      sql("SELECT * FROM test_ctas_1234"),
      sql("SELECT * FROM nested").collect().toSeq)
  }

output:

  11:57:15.974 ERROR org.apache.hadoop.hive.ql.parse.SemanticAnalyzer:
  org.apache.hadoop.hive.ql.parse.SemanticException: Line 1:45 Table not found 'nested'
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1243)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:1192)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:9209)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:327)
    at org.apache.spark.sql.hive.execution.CreateTableAsSelect.metastoreRelation$lzycompute(CreateTableAsSelect.scala:59)
    at org.apache.spark.sql.hive.execution.CreateTableAsSelect.metastoreRelation(CreateTableAsSelect.scala:55)
    at org.apache.spark.sql.hive.execution.CreateTableAsSelect.sideEffectResult$lzycompute(CreateTableAsSelect.scala:82)
    at org.apache.spark.sql.hive.execution.CreateTableAsSelect.sideEffectResult(CreateTableAsSelect.scala:70)
    at org.apache.spark.sql.hive.execution.CreateTableAsSelect.execute(CreateTableAsSelect.scala:89)
    at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
    at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
    at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
    at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:105)
    at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:103)
    at org.apache.spark.sql.hive.execution.SQLQuerySuite$$anonfun$4.apply$mcV$sp(SQLQuerySuite.scala:122)
    at org.apache.spark.sql.hive.execution.SQLQuerySuite$$anonfun$4.apply(SQLQuerySuite.scala:117)
    at org.apache.spark.sql.hive.execution.SQLQuerySuite$$anonfun$4.apply(SQLQuerySuite.scala:117)
    ...
Re: CREATE TABLE AS SELECT does not work with temp tables in 1.2.0
Thanks for reporting. This looks like a regression related to: https://github.com/apache/spark/pull/2570 I've filed it here: https://issues.apache.org/jira/browse/SPARK-4769
On Fri, Dec 5, 2014 at 12:03 PM, kb kend...@hotmail.com wrote: I am having trouble getting CREATE TABLE AS SELECT or saveAsTable from a HiveContext to work with temp tables in Spark 1.2. No issues in 1.1.0 or 1.1.1. ...
Protobuf version in mvn vs sbt
Hi devs, I have been playing with your amazing Spark here in Prague for some time, and I have stumbled on a thing which I'd like to ask about. I create assembly jars from source and then use them to run simple jobs on our 2.3.0-cdh5.1.3 cluster using yarn; see [1] for an example of my usage. Formerly I had started to use sbt for creating assemblies like [2], which runs just fine. Then, reading those maven-preferred stories here on the dev list, I found the make-distribution.sh script in the root of the codebase and wanted to give it a try. I used it to create an assembly by both [3] and [4]. But I am not able to use the assemblies created by make-distribution because they refuse to be submitted to the cluster. Here is what happens:
- run [3] or [4]
- recompile app against new assembly
- submit job using new assembly with a command like [1]
- submit fails, with the important parts of the stack trace being [5]
My guess is that this is due to an improper version of protobuf included in the assembly jar. My questions are:
- Can you confirm this hypothesis?
- What is the difference between the sbt and mvn ways of creating an assembly? I mean, sbt works and mvn does not...
- What additional option do I need to pass to make-distribution to make it work?
Any help/explanation here would be appreciated. Jakub
--
[1] ./bin/spark-submit --num-executors 200 --master yarn-cluster --conf spark.yarn.jar=assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.3.0-cdh5.1.3.jar --class org.apache.spark.mllib.CreateGuidDomainDictionary root-0.1.jar ${args}
[2] ./sbt/sbt -Dhadoop.version=2.3.0-cdh5.1.3 -Pyarn -Phive assembly/assembly
[3] ./make-distribution.sh -Dhadoop.version=2.3.0-cdh5.1.3 -Pyarn -Phive -DskipTests
[4] ./make-distribution.sh -Dyarn.version=2.3.0 -Dhadoop.version=2.3.0-cdh5.1.3 -Pyarn -Phive -DskipTests
[5] Exception in thread main org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.lang.reflect.InvocationTargetException
  at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:79)
  at org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC.getProxy(HadoopYarnProtoRPC.java:48)
  at org.apache.hadoop.yarn.client.RMProxy$1.run(RMProxy.java:134)
  ...
Caused by: java.lang.reflect.InvocationTargetException
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
  at org.apache.hadoop.yarn.factories.impl.pb.RpcClientFactoryPBImpl.getClient(RpcClientFactoryPBImpl.java:76)
  ... 27 more
Caused by: java.lang.VerifyError: class org.apache.hadoop.yarn.proto.YarnServiceProtos$SubmitApplicationRequestProto overrides final method getUnknownFields.()Lcom/google/protobuf/UnknownFieldSet;
  at java.lang.ClassLoader.defineClass1(Native Method)
  at java.lang.ClassLoader.defineClassCond(ClassLoader.java:631)
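One way to check the protobuf hypothesis directly (a sketch; the jar path is the one from [1]) is to list which protobuf classes ended up in each assembly and compare the sbt- and maven-built jars:

    # A stale com/google/protobuf (pre-2.5) bundled into the assembly, run
    # against a protobuf-2.5 YARN, would explain the VerifyError on
    # getUnknownFields() shown in [5].
    jar tf assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.3.0-cdh5.1.3.jar \
      | grep 'com/google/protobuf' | head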
Re: Protobuf version in mvn vs sbt
When building against Hadoop 2.x, you need to enable the appropriate profile, aside from just specifying the version, e.g. -Phadoop-2.3 for Hadoop 2.3.
On Fri, Dec 5, 2014 at 12:51 PM, spark.dubovsky.ja...@seznam.cz wrote: Hi devs, I have been playing with your amazing Spark here in Prague for some time ... My guess is that this is due to an improper version of protobuf included in the assembly jar. ...
-- Marcelo
Re: Protobuf version in mvn vs sbt
As Marcelo said, CDH5.3 is based on hadoop 2.3, so please try

  ./make-distribution.sh -Pyarn -Phive -Phadoop-2.3 -Dhadoop.version=2.3.0-cdh5.1.3 -DskipTests

See the details of how to change the profile at https://spark.apache.org/docs/latest/building-with-maven.html Sincerely, DB Tsai --- My Blog: https://www.dbtsai.com LinkedIn: https://www.linkedin.com/in/dbtsai
On Fri, Dec 5, 2014 at 12:54 PM, Marcelo Vanzin van...@cloudera.com wrote: When building against Hadoop 2.x, you need to enable the appropriate profile, aside from just specifying the version, e.g. -Phadoop-2.3 for Hadoop 2.3. ...
Re: Protobuf version in mvn vs sbt
(Nit: CDH *5.1.x*, including 5.1.3, is derived from Hadoop 2.3.x. 5.3 is based on 2.5.x.)
On Fri, Dec 5, 2014 at 3:29 PM, DB Tsai dbt...@dbtsai.com wrote: As Marcelo said, CDH5.3 is based on hadoop 2.3, so please try ...
Re: Protobuf version in mvn vs sbt
Oh, I meant to say that the cdh5.1.3 used by Jakub's company is based on 2.3. You can see it from the first part of Cloudera's version number: 2.3.0-cdh5.1.3. Sincerely, DB Tsai --- My Blog: https://www.dbtsai.com LinkedIn: https://www.linkedin.com/in/dbtsai
On Fri, Dec 5, 2014 at 1:38 PM, Sean Owen so...@cloudera.com wrote: (Nit: CDH *5.1.x*, including 5.1.3, is derived from Hadoop 2.3.x. 5.3 is based on 2.5.x.)
build in IntelliJ IDEA
Hi everyone, Have a newbie question on using IntelliJ to build and debug. I followed this wiki to set up IntelliJ: https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-BuildingSparkinIntelliJIDEA Afterward I tried to build via the toolbar (Build > Rebuild Project). The action fails with the error message: "Cannot start compiler: the SDK is not specified." What SDK do I need to specify to get the build working? Thanks, Judy
Re: [VOTE] Release Apache Spark 1.2.0 (RC1)
Hey All, Thanks all for the continued testing! The issue I mentioned earlier, SPARK-4498, was fixed earlier this week (hat tip to Mark Hamstra, who contributed the fix). In the interim, a few smaller blocker-level issues with Spark SQL were found and fixed (SPARK-4753, SPARK-4552, SPARK-4761). There is currently an outstanding issue (SPARK-4740 [1]) in Spark core that needs to be fixed. I want to thank in particular Shopify and Intel China, who have identified and helped test blocker issues with the release. This type of workload testing around releases is really helpful for us. Once things stabilize I will cut RC2. I think we're pretty close with this one. - Patrick
On Wed, Dec 3, 2014 at 5:38 PM, Takeshi Yamamuro linguin@gmail.com wrote: +1 (non-binding) Checked on CentOS 6.5, compiled from the source. Ran various examples in stand-alone master and three slaves, and browsed the web UI.
On Sat, Nov 29, 2014 at 2:16 PM, Patrick Wendell pwend...@gmail.com wrote:
Please vote on releasing the following candidate as Apache Spark version 1.2.0!
The tag to be voted on is v1.2.0-rc1 (commit 1056e9ec1): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=1056e9ec13203d0c51564265e94d77a054498fdb
The release files, including signatures, digests, etc. can be found at: http://people.apache.org/~pwendell/spark-1.2.0-rc1/
Release artifacts are signed with the following key: https://people.apache.org/keys/committer/pwendell.asc
The staging repository for this release can be found at: https://repository.apache.org/content/repositories/orgapachespark-1048/
The documentation corresponding to this release can be found at: http://people.apache.org/~pwendell/spark-1.2.0-rc1-docs/
Please vote on releasing this package as Apache Spark 1.2.0! The vote is open until Tuesday, December 02, at 05:15 UTC and passes if a majority of at least 3 +1 PMC votes are cast.
[ ] +1 Release this package as Apache Spark 1.2.0
[ ] -1 Do not release this package because ...
To learn more about Apache Spark, please see http://spark.apache.org/
== What justifies a -1 vote for this release? ==
This vote is happening very late into the QA period compared with previous votes, so -1 votes should only occur for significant regressions from 1.0.2. Bugs already present in 1.1.X, minor regressions, or bugs related to new features will not block this release.
== What default changes should I be aware of? ==
1. The default value of spark.shuffle.blockTransferService has been changed to netty -- old behavior can be restored by switching to nio.
2. The default value of spark.shuffle.manager has been changed to sort -- old behavior can be restored by setting spark.shuffle.manager to hash.
== Other notes ==
Because this vote is occurring over a weekend, I will likely extend the vote if this RC survives until the end of the vote period. - Patrick
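For anyone testing the RC who needs the pre-1.2 defaults mentioned above, a sketch of restoring both at submit time (the application class and jar are placeholders):

    # Revert the 1.2.0 shuffle defaults to the 1.1.x behavior
    ./bin/spark-submit \
      --conf spark.shuffle.blockTransferService=nio \
      --conf spark.shuffle.manager=hash \
      --class com.example.MyApp myapp.jar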
Re: build in IntelliJ IDEA
If you go to "File > Project Structure" and click on "Project" under the "Project Settings" heading, do you see an entry for "Project SDK"? If not, you should click "New..." and configure a JDK; by default, I think IntelliJ should figure out a correct path to your system JDK, so you should just be able to hit "OK" and then rebuild your project. For reference, here's a screenshot showing what my version of that window looks like: http://i.imgur.com/hRfQjIi.png
On December 5, 2014 at 1:52:35 PM, Judy Nash (judyn...@exchange.microsoft.com) wrote: Hi everyone, Have a newbie question on using IntelliJ to build and debug. ... What SDK do I need to specify to get the build working?