It looks like Spark is not working correctly. I followed this guide (http://spark.apache.org/docs/latest/ec2-scripts.html) and I can see the spot instances running on EC2.
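For reference, I launched the cluster with a standard spark-ec2 invocation, roughly like the one below (the key pair, identity file, spot price, and slave count here are placeholders, not my exact values):

    ./spark-ec2 --key-pair=my-keypair --identity-file=my-keypair.pem \
        --slaves=2 --spot-price=0.05 launch my-spark-cluster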
From the Spark shell I am counting the lines of a file and I get a connection exception:

scala> val lines = sc.textFile("README.md")
16/01/19 03:17:35 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 26.5 KB, free 26.5 KB)
16/01/19 03:17:35 INFO storage.MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 5.6 KB, free 32.1 KB)
16/01/19 03:17:35 INFO storage.BlockManagerInfo: Added broadcast_0_piece0 in memory on 172.31.28.196:44028 (size: 5.6 KB, free: 511.5 MB)
16/01/19 03:17:35 INFO spark.SparkContext: Created broadcast 0 from textFile at <console>:21
lines: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[1] at textFile at <console>:21

scala> lines.count()
16/01/19 03:17:55 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
16/01/19 03:17:56 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
16/01/19 03:17:57 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
16/01/19 03:17:58 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
16/01/19 03:17:59 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
16/01/19 03:18:00 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
16/01/19 03:18:01 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
16/01/19 03:18:02 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
16/01/19 03:18:03 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
16/01/19 03:18:04 INFO ipc.Client: Retrying connect to server: ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1 SECONDS)
java.lang.RuntimeException: java.net.ConnectException: Call to ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:567)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:318)
    at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:291)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
    at org.apache.spark.SparkContext$$anonfun$hadoopFile$1$$anonfun$33.apply(SparkContext.scala:1015)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$6.apply(HadoopRDD.scala:176)
    at scala.Option.map(Option.scala:145)
    at org.apache.spark.rdd.HadoopRDD.getJobConf(HadoopRDD.scala:176)
    at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:195)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
    at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
    at scala.Option.getOrElse(Option.scala:120)
    at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:1929)
    at org.apache.spark.rdd.RDD.count(RDD.scala:1143)
    at $iwC$$iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:24)
    at $iwC$$iwC$$iwC$$iwC$$iwC.<init>(<console>:29)
    at $iwC$$iwC$$iwC$$iwC.<init>(<console>:31)
    at $iwC$$iwC$$iwC.<init>(<console>:33)
    at $iwC$$iwC.<init>(<console>:35)
    at $iwC.<init>(<console>:37)
    at <init>(<console>:39)
    at .<init>(<console>:43)
    at .<clinit>(<console>)
    at .<init>(<console>:7)
    at .<clinit>(<console>)
    at $print(<console>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
    at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
    at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
    at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
    at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:857)
    at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:902)
    at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:814)
    at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:657)
    at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:665)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$loop(SparkILoop.scala:670)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply$mcZ$sp(SparkILoop.scala:997)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoop$$anonfun$org$apache$spark$repl$SparkILoop$$process$1.apply(SparkILoop.scala:945)
    at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
    at org.apache.spark.repl.SparkILoop.org$apache$spark$repl$SparkILoop$$process(SparkILoop.scala:945)
    at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1059)
    at org.apache.spark.repl.Main$.main(Main.scala:31)
    at org.apache.spark.repl.Main.main(Main.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.ConnectException: Call to ec2-54-88-242-197.compute-1.amazonaws.com/172.31.28.196:9000 failed on connection exception: java.net.ConnectException: Connection refused
    at org.apache.hadoop.ipc.Client.wrapException(Client.java:1142)
    at org.apache.hadoop.ipc.Client.call(Client.java:1118)
    at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:229)
    at com.sun.proxy.$Proxy15.getProtocolVersion(Unknown Source)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:85)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:62)
    at com.sun.proxy.$Proxy15.getProtocolVersion(Unknown Source)
    at org.apache.hadoop.ipc.RPC.checkVersion(RPC.java:422)
    at org.apache.hadoop.hdfs.DFSClient.createNamenode(DFSClient.java:183)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:281)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:245)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:100)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1446)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
    at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1464)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:263)
    at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:124)
    at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:563)
    ... 64 more
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:511)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:481)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:457)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:583)
    at org.apache.hadoop.ipc.Client$Connection.access$2200(Client.java:205)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1249)
    at org.apache.hadoop.ipc.Client.call(Client.java:1093)
    ... 84 more

scala>

On Tue, Jan 19, 2016 at 1:22 AM, Daniel Darabos <daniel.dara...@lynxanalytics.com> wrote:

> On Mon, Jan 18, 2016 at 5:24 PM, Oleg Ruchovets <oruchov...@gmail.com> wrote:
>
>> I thought the script tries to install Hadoop / HDFS as well, and it looks
>> like that failed. The installation is only standalone Spark without Hadoop.
>> Is that the correct behaviour?
>
> Yes, it also sets up two HDFS clusters. Are they not working? Try to see
> if Spark is working by running some simple jobs on it. (See
> http://spark.apache.org/docs/latest/ec2-scripts.html.)
>
> There is no program called Hadoop. If you mean YARN, then indeed the
> script does not set up YARN. It sets up standalone Spark.
>
>> Also errors in the log:
>>
>> ERROR: Unknown Tachyon version
>> Error: Could not find or load main class crayondata.com.log
>
> As long as Spark is working fine, you can ignore all output from the EC2
> script :).
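P.S. Following Daniel's point that the script also sets up two HDFS clusters: the trace above shows the HDFS NameNode on port 9000 refusing connections, so in the meantime here is what I plan to check on the master node. This is a sketch assuming the default spark-ec2 layout (HDFS under /root/ephemeral-hdfs, Hadoop 1.x-style scripts in bin/); the paths may differ on other setups:

    # Is the NameNode process running at all?
    jps | grep -i namenode

    # Is anything listening on port 9000, the port from the stack trace?
    netstat -tlnp | grep 9000

    # If not, try starting the ephemeral HDFS cluster that spark-ec2 sets up
    /root/ephemeral-hdfs/bin/start-dfs.sh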
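To check whether Spark itself is healthy independently of HDFS, I will also re-run the count with an explicit file:// scheme, so the path is not resolved through the default hdfs://...:9000 filesystem. The path below assumes the spark-ec2 default install location of /root/spark; since spark-ec2 copies that directory to the slaves, the file should exist at the same path on every worker:

    scala> val lines = sc.textFile("file:///root/spark/README.md")
    scala> lines.count()

If this count succeeds, the problem is only that HDFS is down, not Spark itself.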