Have you verified that the Spark master and slaves started correctly? Please check with netstat which ports are open, whether the processes are listening, and which address they bind to.
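A quick way to run that check on the master node (a sketch; the port numbers are assumptions based on standalone-mode defaults: 7077 for the master RPC, 8080 for the master web UI, 9000 for the HDFS NameNode):

```shell
# List listening TCP sockets and filter for the expected Spark/HDFS ports.
# Falls back to ss on systems without net-tools installed.
netstat -tlnp 2>/dev/null | grep -E ':(7077|8080|9000)' \
  || ss -tlnp | grep -E ':(7077|8080|9000)'

# A listener bound to 127.0.0.1 is unreachable from the slave nodes;
# it should show the master's private IP (here 172.31.28.196) or 0.0.0.0.
```

If nothing matches for a port, the corresponding daemon is not running at all, which distinguishes a startup failure from a firewall/security-group problem.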
From: Oleg Ruchovets [mailto:oruchov...@gmail.com]
Sent: 19 January 2016 11:24
To: Peter Zhang <zhangju...@gmail.com>
Cc: Daniel Darabos <daniel.dara...@lynxanalytics.com>; user <user@spark.apache.org>
Subject: Re: spark 1.6.0 on ec2 doesn't work

I am running from $SPARK_HOME. It looks like a connection problem to port 9000 on the master machine. What is the process that Spark tries to connect to? Should I start any framework or processes before running Spark?

Thanks,
Oleg.

16/01/19 03:17:56 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
[identical retry messages for attempts 2 through 8, one second apart, elided]
16/01/19 03:18:04 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 9 time(s); retry

On Tue, Jan 19, 2016 at 1:13 PM, Peter Zhang <zhangju...@gmail.com> wrote:

Could you run spark-shell in the $SPARK_HOME directory? Try running your command from $SPARK_HOME, or point to README.md with its full path.

Peter Zhang
--
Google
Sent with Airmail

On January 19, 2016 at 11:26:14, Oleg Ruchovets (oruchov...@gmail.com) wrote:

It looks like Spark is not working. I followed this guide (http://spark.apache.org/docs/latest/ec2-scripts.html) and I can see the spot instances running on EC2. From spark-shell I am counting lines and got a connection exception.
scala> val lines = sc.textFile("README.md")
16/01/19 03:17:35 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 26.5 KB, free 26.5 KB)
16/01/19 03:17:35 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.6 KB, free 32.1 KB)
16/01/19 03:17:35 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.31.28.196:44028 (size: 5.6 KB, free: 511.5 MB)
16/01/19 03:17:35 INFO spark.SparkContext: Created broadcast 0 from textFile at <console>:21
lines: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:21

scala> lines.count()
16/01/19 03:17:55 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
[identical retry messages for attempts 1 through 8, one second apart, elided]
16/01/19 03:18:04 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
java.lang.RuntimeException: java.net.ConnectException: Call to ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:567)
        at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
        at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:291)
        at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
        at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
        at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
        at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
        at scala.Option.map(Option.scala:145)
        at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:195)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
        at org.apache.spark.rdd.RDD.count(RDD.scala:1143)
        at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:24)
        at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
        at $iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
        at $iwC$$iwC$$iwC.<init>(<console>:33)
        at $iwC$$iwC.<init>(<console>:35)
        at $iwC.<init>(<console>:37)
        at <init>(<console>:39)
        at .<init>(<console>:43)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
        at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
        at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
        at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
        at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.ConnectException: Call to ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000 failed on connection exception: java.net.ConnectException: Connection refused
        at org.apache.hadoop.ipc.Client.wrapException(Client.java:1142)
        at org.apache.hadoop.ipc.Client.call(Client.java:1118)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
        at com.sun.proxy.$Proxy15.getProtocolVersion(Unknown Source)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
        at com.sun.proxy.$Proxy15.getProtocolVersion(Unknown Source)
        at org.apache.hadoop.ipc.RPC.checkVersion(RPC.java:422)
        at org.apache.hadoop.hdfs.DFSClient.createNamenode(DFSClient.java:183)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:281)
        at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:245)
        at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:124)
        at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:563)
        ... 64 more
Caused by: java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
        at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
        at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
        at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:457)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:583)
        at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:205)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1249)
        at org.apache.hadoop.ipc.Client.call(Client.java:1093)
        ... 84 more

scala>

On Tue, Jan 19, 2016 at 1:22 AM, Daniel Darabos <daniel.dara...@lynxanalytics.com> wrote:

On Mon, Jan 18, 2016 at 5:24 PM, Oleg Ruchovets <oruchov...@gmail.com> wrote:

I thought the script also tries to install Hadoop/HDFS, and it looks like that failed; the installation is only standalone Spark without Hadoop. Is that the correct behaviour?

Yes, it also sets up two HDFS clusters. Are they not working? Try to see if Spark is working by running some simple jobs on it. (See http://spark.apache.org/docs/latest/ec2-scripts.html.)
There is no program called "Hadoop". If you mean YARN, then indeed the script does not set up YARN. It sets up standalone Spark.

Also errors in the log:
ERROR: Unknown Tachyon version
Error: Could not find or load main class crayondata.com.log

As long as Spark is working fine, you can ignore all output from the EC2 script :).
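Port 9000 in the logs above is conventionally the HDFS NameNode RPC port, so the "Connection refused" suggests the HDFS cluster that spark-ec2 installs was never started, and that `sc.textFile("README.md")` with a relative path is being resolved against HDFS. A sketch of two workarounds, assuming the spark-ec2 default layout under /root on the master (the `ephemeral-hdfs` path and the README location are assumptions; check your AMI):

```shell
# On the master node. spark-ec2 typically installs an "ephemeral" HDFS
# under /root/ephemeral-hdfs; start its daemons if the NameNode is down.
/root/ephemeral-hdfs/bin/start-dfs.sh   # assumed path; adjust for your AMI

# Alternatively, bypass HDFS entirely: an explicit file:// URI makes
# sc.textFile() read from the local filesystem instead.
cd /root/spark
echo 'sc.textFile("file:///root/spark/README.md").count()' | ./bin/spark-shell
```

Note that with a `file://` URI every worker must be able to read that same local path, which holds for the README because spark-ec2 copies the Spark distribution to all nodes.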