Re: spark 1.6.0 on ec2 doesn't work

Peter Zhang Mon, 18 Jan 2016 21:13:53 -0800

Could you run spark-shell at $SPARK_HOME DIR?

You can try to change you command run at $SPARK_HOME or, point to README.md 
with full path.



Peter Zhang
-- 
Google
Sent with Airmail

On January 19, 2016 at 11:26:14, Oleg Ruchovets (oruchov...@gmail.com) wrote:

It looks spark is not working fine : 
 
I followed this link ( http://spark.apache.org/docs/latest/ec2-scripts.html. ) 
and I see spot instances installed on EC2.

from spark shell I am counting lines and got connection exception.
scala> val lines = sc.textFile("README.md")
scala> lines.count()



scala> val lines = sc.textFile("README.md")

16/01/19 03:17:35 INFO storage.MemoryStore: Block broadcast_0 stored as values 
in memory (estimated size 26.5 KB, free 26.5 KB)
16/01/19 03:17:35 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as 
bytes in memory (estimated size 5.6 KB, free 32.1 KB)
16/01/19 03:17:35 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in 
memory on 172.31.28.196:44028 (size: 5.6 KB, free: 511.5 MB)
16/01/19 03:17:35 INFO spark.SparkContext: Created broadcast 0 from textFile at 
<console>:21
lines: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at 
<console>:21

scala> lines.count()

16/01/19 03:17:55 INFO ipc.Client: Retrying connect to server: 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 0 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1 SECONDS)
16/01/19 03:17:56 INFO ipc.Client: Retrying connect to server: 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 1 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1 SECONDS)
16/01/19 03:17:57 INFO ipc.Client: Retrying connect to server: 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 2 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1 SECONDS)
16/01/19 03:17:58 INFO ipc.Client: Retrying connect to server: 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 3 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1 SECONDS)
16/01/19 03:17:59 INFO ipc.Client: Retrying connect to server: 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 4 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1 SECONDS)
16/01/19 03:18:00 INFO ipc.Client: Retrying connect to server: 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 5 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1 SECONDS)
16/01/19 03:18:01 INFO ipc.Client: Retrying connect to server: 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 6 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1 SECONDS)
16/01/19 03:18:02 INFO ipc.Client: Retrying connect to server: 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 7 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1 SECONDS)
16/01/19 03:18:03 INFO ipc.Client: Retrying connect to server: 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 8 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1 SECONDS)
16/01/19 03:18:04 INFO ipc.Client: Retrying connect to server: 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 9 
time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, 
sleepTime=1 SECONDS)
java.lang.RuntimeException: java.net.ConnectException: Call to 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000 failed on 
connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:567)
at 
org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
at 
org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:291)
at 
org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
at 
org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
at 
org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
at 
org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
at scala.Option.map(Option.scala:145)
at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:195)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at 
org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
at org.apache.spark.rdd.RDD.count(RDD.scala:1143)
at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:24)
at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
at $iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
at $iwC$$iwC$$iwC.<init>(<console>:33)
at $iwC$$iwC.<init>(<console>:35)
at $iwC.<init>(<console>:37)
at <init>(<console>:39)
at .<init>(<console>:43)
at .<clinit>(<console>)
at .<init>(<console>:7)
at .<clinit>(<console>)
at $print(<console>)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
at 
org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
at 
org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
at 
org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at 
org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
at 
scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
at 
org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
at org.apache.spark.repl.Main$.main(Main.scala:31)
at org.apache.spark.repl.Main.main(Main.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.ConnectException: Call to 
ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000 failed on 
connection exception: java.net.ConnectException: Connection refused
at org.apache.hadoop.ipc.Client.wrapException(Client.java:1142)
at org.apache.hadoop.ipc.Client.call(Client.java:1118)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
at com.sun.proxy.$Proxy15.getProtocolVersion(Unknown Source)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
at 
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
at com.sun.proxy.$Proxy15.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.checkVersion(RPC.java:422)
at org.apache.hadoop.hdfs.DFSClient.createNamenode(DFSClient.java:183)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:281)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:245)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:124)
at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:563)
... 64 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
at 
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:457)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:583)
at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:205)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1249)
at org.apache.hadoop.ipc.Client.call(Client.java:1093)
... 84 more


scala> 




On Tue, Jan 19, 2016 at 1:22 AM, Daniel Darabos 
<daniel.dara...@lynxanalytics.com> wrote:

On Mon, Jan 18, 2016 at 5:24 PM, Oleg Ruchovets <oruchov...@gmail.com> wrote:
I thought script tries to install hadoop / hdfs also. And it looks like it 
failed. Installation is only standalone spark without hadoop. Is it correct 
behaviour?

Yes, it also sets up two HDFS clusters. Are they not working? Try to see if 
Spark is working by running some simple jobs on it. (See 
http://spark.apache.org/docs/latest/ec2-scripts.html.)

There is no program called Hadoop. If you mean YARN, then indeed the script 
does not set up YARN. It sets up standalone Spark.
 
Also errors in the log:
   ERROR: Unknown Tachyon version
   Error: Could not find or load main class crayondata.com.log

As long as Spark is working fine, you can ignore all output from the EC2 script 
:).

Re: spark 1.6.0 on ec2 doesn't work

Reply via email to