Is it just a typo in the email or are you missing a space after your --master argument?
The logs here actually don't say much but "something went wrong". It seems fairly low-level, like the gateway process failed or didn't start, rather than a problem with the program. It's hard to say more unless you can dig out any more logs, like from the worker, executor? On Sun, Oct 16, 2016 at 4:24 AM Tobi Bosede <ani.to...@gmail.com> wrote: Hi Mekal, thanks for wanting to help. I have attached the python script as well as the different exceptions here. I have also pasted the cluster exception below so I can highlight the relevant parts. [abosede2@badboy ~]$ spark-submit --master spark://10.160.5.48:7077 trade_data_count.py Ivy Default Cache set to: /home/abosede2/.ivy2/cache The jars for the packages stored in: /home/abosede2/.ivy2/jars :: loading settings :: url = jar:file:/usr/local/spark-1.6.1/assembly/target/scala-2.11/spark-assembly-1.6.1-hre/settings/ivysettings.xml com.databricks#spark-csv_2.11 added as a dependency :: resolving dependencies :: org.apache.spark#spark-submit-parent;1.0 confs: [default] found com.databricks#spark-csv_2.11;1.3.0 in central found org.apache.commons#commons-csv;1.1 in central found com.univocity#univocity-parsers;1.5.1 in central :: resolution report :: resolve 160ms :: artifacts dl 7ms :: modules in use: com.databricks#spark-csv_2.11;1.3.0 from central in [default] com.univocity#univocity-parsers;1.5.1 from central in [default] org.apache.commons#commons-csv;1.1 from central in [default] --------------------------------------------------------------------- | | modules || artifacts | | conf | number| search|dwnlded|evicted|| number|dwnlded| --------------------------------------------------------------------- | default | 3 | 0 | 0 | 0 || 3 | 0 | --------------------------------------------------------------------- :: retrieving :: org.apache.spark#spark-submit-parent confs: [default] 0 artifacts copied, 3 already retrieved (0kB/6ms) [Stage 0:========================> (104 + 8) /235]16/10/15 19:42:37 ERROR TaskScadboy.win.ad.jhu.edu <http://taskscadboy.win.ad.jhu.edu/>: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or netwoWARN messages. 16/10/15 19:42:37 ERROR SparkDeploySchedulerBackend: Application has been killed. Reason: Master removed our a [Stage 0:=======================> (104 + -28) / 235]Traceback (most recent call la File "/home/abosede2/trade_data_count.py", line 79, in <module> print("Raw data is %d rows." % data.count()) File "/usr/local/spark-1.6.1/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 269, in count File "/usr/lib/python2.7/site-packages/py4j-0.9.2-py2.7.egg/py4j/java_gateway.py", line 836, in __call__ answer, self.gateway_client, self.target_id, self.name) File "/usr/local/spark-1.6.1/python/lib/pyspark.zip/pyspark/sql/utils.py", line 45, in deco File "/usr/lib/python2.7/site-packages/py4j-0.9.2-py2.7.egg/py4j/protocol.py", line 310, in get_return_value format(target_id, ".", name), value) py4j.protocol.Py4JJavaError: An error occurred while calling o6867.count. : org.apache.spark.SparkException: Job aborted due to stage failure: Master removed our application: FAILED at org.apache.spark.scheduler.DAGScheduler.org <http://org.apache.spark.scheduler.dagscheduler.org/> $apache$spark$scheduler$DAGScheduler$$failJobAndIndepend) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1419) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1418) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1418) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799 at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:799 at scala.Option.foreach(Option.scala:257) at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:799) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1640) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599) at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588) at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48) at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:620) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1832) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1845) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1858) at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929) at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150) at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111) at org.apache.spark.rdd.RDD.withScope(RDD.scala:316) at org.apache.spark.rdd.RDD.collect(RDD.scala:926) at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:166) at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174) at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:56) at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2086) at org.apache.spark.sql.DataFrame.org <http://org.apache.spark.sql.dataframe.org/> $apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1498) at org.apache.spark.sql.DataFrame.org <http://org.apache.spark.sql.dataframe.org/> $apache$spark$sql$DataFrame$$collect(DataFrame.scala:1505) at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1515) at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1514) at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2099) at org.apache.spark.sql.DataFrame.count(DataFrame.scala:1514) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source) at java.lang.reflect.Method.invoke(Unknown Source) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381) at py4j.Gateway.invoke(Gateway.java:259) at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133) at py4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:209) at java.lang.Thread.run(Unknown Source )