Re: spark-submit failing but job running from scala ide

2016-09-26 Thread Marco Mistroni
Hi Vr,

Your code works fine for me, running on Windows 10 against Spark 1.6.1,
so I'm guessing your Spark installation could be busted. That would
explain why it works in your IDE, as there you are just importing jars
into your project.

The "java.io.IOException: Failed to connect to" error is misleading; I
have seen similar errors in two or three completely different use cases.

I'd suggest you either:
- move down to Spark 1.4.0 or 1.5.2 (there are subtle differences between
these older versions and Spark 1.6.1)
- reinstall Spark 1.6.1 and start from running the Spark examples via
spark-submit
- run spark-shell and enter your SimpleApp line by line, to see if you can
get better debugging info; see the snippet below
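
Something like this, pasted line by line into spark-shell (sc is already
provided by the shell; I am reusing the file path from your SimpleApp, so
adjust it if the README lives elsewhere on your machine):

  val logFile = "/Users/vttrich/Downloads/spark-2.0.0/README.md"
  val logData = sc.textFile(logFile, 2).cache()
  val numAs = logData.filter(line => line.contains("a")).count()
  val numBs = logData.filter(line => line.contains("b")).count()
  println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))

If that runs cleanly in the shell but the same logic fails under
spark-submit, the problem is almost certainly the installation mix rather
than your code.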

hth
 marco.



On Mon, Sep 26, 2016 at 5:22 PM, vr spark  wrote:

> Hi Jacek/All,
>
> I restarted my terminal and tried spark-submit again, and I am still
> getting those errors. How do I see how many "runtimes" are running, and
> how do I keep only one? Somehow my Spark 1.6 and Spark 2.0 are
> conflicting. How do I fix it?
>
> I installed Spark 1.6 earlier using these steps:
> http://genomegeek.blogspot.com/2014/11/how-to-install-apache-spark-on-mac-os-x.html
> I installed Spark 2.0 using these steps:
> http://blog.weetech.co/2015/08/light-learning-apache-spark.html

Re: spark-submit failing but job running from scala ide

2016-09-26 Thread vr spark
Hi Jacek/All,

I restarted my terminal and tried spark-submit again, and I am still
getting those errors. How do I see how many "runtimes" are running, and
how do I keep only one? Somehow my Spark 1.6 and Spark 2.0 are
conflicting. How do I fix it?

I installed Spark 1.6 earlier using these steps:
http://genomegeek.blogspot.com/2014/11/how-to-install-apache-spark-on-mac-os-x.html
I installed Spark 2.0 using these steps:
http://blog.weetech.co/2015/08/light-learning-apache-spark.html

Here is the log for run-example:

m-C02KL0B1FFT4:bin vr$ ./run-example SparkPi
Using Spark's default log4j profile:
org/apache/spark/log4j-defaults.properties
16/09/26 09:11:00 INFO SparkContext: Running Spark version 2.0.0
16/09/26 09:11:00 WARN NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
16/09/26 09:11:00 INFO SecurityManager: Changing view acls to: vr
16/09/26 09:11:00 INFO SecurityManager: Changing modify acls to: vr
16/09/26 09:11:00 INFO SecurityManager: Changing view acls groups to:
16/09/26 09:11:00 INFO SecurityManager: Changing modify acls groups to:
16/09/26 09:11:00 INFO SecurityManager: SecurityManager: authentication
disabled; ui acls disabled; users  with view permissions: Set(vr); groups
with view permissions: Set(); users  with modify permissions: Set(vr);
groups with modify permissions: Set()
16/09/26 09:11:01 INFO Utils: Successfully started service 'sparkDriver' on
port 59323.
16/09/26 09:11:01 INFO SparkEnv: Registering MapOutputTracker
16/09/26 09:11:01 INFO SparkEnv: Registering BlockManagerMaster
16/09/26 09:11:01 INFO DiskBlockManager: Created local directory at
/private/var/folders/23/ycbtxh8s551gzlsgj8q647d88gsjgb/T/blockmgr-d0d6dfea-2c97-4337-8e7d-0bbcb141f4c9
16/09/26 09:11:01 INFO MemoryStore: MemoryStore started with capacity 366.3
MB
16/09/26 09:11:01 INFO SparkEnv: Registering OutputCommitCoordinator
16/09/26 09:11:01 WARN Utils: Service 'SparkUI' could not bind on port
4040. Attempting port 4041.
16/09/26 09:11:01 INFO Utils: Successfully started service 'SparkUI' on
port 4041.
16/09/26 09:11:01 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at
http://192.168.1.3:4041
16/09/26 09:11:01 INFO SparkContext: Added JAR
file:/Users/vr/Downloads/spark-2.0.0/examples/target/scala-2.11/jars/scopt_2.11-3.3.0.jar
at spark://192.168.1.3:59323/jars/scopt_2.11-3.3.0.jar with timestamp
1474906261472
16/09/26 09:11:01 INFO SparkContext: Added JAR
file:/Users/vr/Downloads/spark-2.0.0/examples/target/scala-2.11/jars/spark-examples_2.11-2.0.0.jar
at spark://192.168.1.3:59323/jars/spark-examples_2.11-2.0.0.jar with
timestamp 1474906261473
16/09/26 09:11:01 INFO Executor: Starting executor ID driver on host
localhost
16/09/26 09:11:01 INFO Utils: Successfully started service
'org.apache.spark.network.netty.NettyBlockTransferService' on port 59324.
16/09/26 09:11:01 INFO NettyBlockTransferService: Server created on
192.168.1.3:59324
16/09/26 09:11:01 INFO BlockManagerMaster: Registering BlockManager
BlockManagerId(driver, 192.168.1.3, 59324)
16/09/26 09:11:01 INFO BlockManagerMasterEndpoint: Registering block
manager 192.168.1.3:59324 with 366.3 MB RAM, BlockManagerId(driver,
192.168.1.3, 59324)
16/09/26 09:11:01 INFO BlockManagerMaster: Registered BlockManager
BlockManagerId(driver, 192.168.1.3, 59324)
16/09/26 09:11:01 WARN SparkContext: Use an existing SparkContext, some
configuration may not take effect.
16/09/26 09:11:01 INFO SharedState: Warehouse path is
'file:/Users/vr/Downloads/spark-2.0.0/bin/spark-warehouse'.
16/09/26 09:11:01 INFO SparkContext: Starting job: reduce at
SparkPi.scala:38
16/09/26 09:11:02 INFO DAGScheduler: Got job 0 (reduce at SparkPi.scala:38)
with 2 output partitions
16/09/26 09:11:02 INFO DAGScheduler: Final stage: ResultStage 0 (reduce at
SparkPi.scala:38)
16/09/26 09:11:02 INFO DAGScheduler: Parents of final stage: List()
16/09/26 09:11:02 INFO DAGScheduler: Missing parents: List()
16/09/26 09:11:02 INFO DAGScheduler: Submitting ResultStage 0
(MapPartitionsRDD[1] at map at SparkPi.scala:34), which has no missing
parents
16/09/26 09:11:02 INFO MemoryStore: Block broadcast_0 stored as values in
memory (estimated size 1832.0 B, free 366.3 MB)
16/09/26 09:11:02 INFO MemoryStore: Block broadcast_0_piece0 stored as
bytes in memory (estimated size 1169.0 B, free 366.3 MB)
16/09/26 09:11:02 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory
on 192.168.1.3:59324 (size: 1169.0 B, free: 366.3 MB)
16/09/26 09:11:02 INFO SparkContext: Created broadcast 0 from broadcast at
DAGScheduler.scala:1012
16/09/26 09:11:02 INFO DAGScheduler: Submitting 2 missing tasks from
ResultStage 0 (MapPartitionsRDD[1] at map at SparkPi.scala:34)
16/09/26 09:11:02 INFO TaskSchedulerImpl: Adding task set 0.0 with 2 tasks
16/09/26 09:11:02 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID
0, localhost, partition 0, PROCESS_LOCAL, 5474 bytes)
16/09/26 09:11:02 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID
1, localhost, partition 1, 

Re: spark-submit failing but job running from scala ide

2016-09-25 Thread Jacek Laskowski
Hi,

How did you install Spark 1.6? It's usually as simple as rm -rf
$SPARK_1.6_HOME, but it really depends on how you installed it in the
first place.

Pozdrawiam,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Sun, Sep 25, 2016 at 4:32 PM, vr spark  wrote:
> Yes, I have both Spark 1.6 and Spark 2.0.
> I unset the SPARK_HOME environment variable and pointed spark-submit to
> 2.0. It's working now.
>
> How do I uninstall/remove Spark 1.6 from my Mac?
>
> Thanks

Re: spark-submit failing but job running from scala ide

2016-09-25 Thread vr spark
Yes, I have both Spark 1.6 and Spark 2.0.
I unset the SPARK_HOME environment variable and pointed spark-submit to
2.0. It's working now.

How do I uninstall/remove Spark 1.6 from my Mac?

Thanks


On Sun, Sep 25, 2016 at 4:28 AM, Jacek Laskowski  wrote:

> Hi,
>
> Can you execute run-example SparkPi with your Spark installation?
>
> Also, see the logs:
>
> 16/09/24 23:15:15 WARN Utils: Service 'SparkUI' could not bind on port
> 4040. Attempting port 4041.
>
> 16/09/24 23:15:15 INFO Utils: Successfully started service 'SparkUI'
> on port 4041.
>
> You've got two Spark runtimes up that may or may not contribute to the
> issue.

Re: spark-submit failing but job running from scala ide

2016-09-25 Thread Jacek Laskowski
Hi,

Can you execute run-example SparkPi with your Spark installation?

Also, see the logs:

16/09/24 23:15:15 WARN Utils: Service 'SparkUI' could not bind on port
4040. Attempting port 4041.

16/09/24 23:15:15 INFO Utils: Successfully started service 'SparkUI'
on port 4041.

You've got two Spark runtimes up that may or may not contribute to the issue.
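
If you want to confirm which runtime a given session actually picked up,
a quick check from inside spark-shell (sc is the shell-provided
SparkContext; version and master are public accessors on SparkContext):

println(sc.version)  // e.g. "2.0.0" vs "1.6.x"
println(sc.master)   // the master URL this session is bound to

Run it from each installation's bin directory and compare.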

Pozdrawiam,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Sun, Sep 25, 2016 at 8:36 AM, vr spark  wrote:
> Hi,
> I have this simple Scala app which works fine when I run it as a Scala
> application from the Scala IDE for Eclipse. But when I export it as a
> jar and run it with spark-submit, I get the error below. Please suggest.
>
> bin/spark-submit --class com.x.y.vr.spark.first.SimpleApp test.jar

spark-submit failing but job running from scala ide

2016-09-25 Thread vr spark
Hi,
I have this simple Scala app which works fine when I run it as a Scala
application from the Scala IDE for Eclipse. But when I export it as a
jar and run it with spark-submit, I get the error below. Please suggest.

*bin/spark-submit --class com.x.y.vr.spark.first.SimpleApp test.jar*

16/09/24 23:15:15 WARN Utils: Service 'SparkUI' could not bind on port
4040. Attempting port 4041.

16/09/24 23:15:15 INFO Utils: Successfully started service 'SparkUI' on
port 4041.

16/09/24 23:15:15 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at
http://192.168.1.3:4041

16/09/24 23:15:15 INFO SparkContext: Added JAR
file:/Users/vr/Downloads/spark-2.0.0/test.jar at
spark://192.168.1.3:59263/jars/test.jar with timestamp 1474784115210

16/09/24 23:15:15 INFO Executor: Starting executor ID driver on host
localhost

16/09/24 23:15:15 INFO Utils: Successfully started service
'org.apache.spark.network.netty.NettyBlockTransferService' on port 59264.

16/09/24 23:15:15 INFO NettyBlockTransferService: Server created on
192.168.1.3:59264

16/09/24 23:15:16 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID
0, localhost, partition 0, PROCESS_LOCAL, 5354 bytes)

16/09/24 23:15:16 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID
1, localhost, partition 1, PROCESS_LOCAL, 5354 bytes)

16/09/24 23:15:16 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)

16/09/24 23:15:16 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)

16/09/24 23:15:16 INFO Executor: Fetching
spark://192.168.1.3:59263/jars/test.jar with timestamp 1474784115210

16/09/24 23:16:31 INFO Executor: Fetching
spark://192.168.1.3:59263/jars/test.jar with timestamp 1474784115210

16/09/24 23:16:31 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)

java.io.IOException: Failed to connect to /192.168.1.3:59263

at
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:228)

at
org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:179)

at
org.apache.spark.rpc.netty.NettyRpcEnv.downloadClient(NettyRpcEnv.scala:358)

at org.apache.spark.rpc.netty.NettyRpcEnv.openChannel(NettyRpcEnv.scala:324)

at org.apache.spark.util.Utils$.doFetchFile(Utils.scala:633)

at org.apache.spark.util.Utils$.fetchFile(Utils.scala:459)

at
org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:488)

at
org.apache.spark.executor.Executor$$anonfun$org$apache$spark$executor$Executor$$updateDependencies$5.apply(Executor.scala:480)

at
scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:733)

at
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)

at
scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)

at
scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)

at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)

at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)

at
scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:732)

at
org.apache.spark.executor.Executor.org$apache$spark$executor$Executor$$updateDependencies(Executor.scala:480)

at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:252)

at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)

at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)

at java.lang.Thread.run(Thread.java:745)




*My Scala code*


package com.x.y.vr.spark.first

/* SimpleApp.scala */
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

object SimpleApp {

  def main(args: Array[String]) {
    val logFile = "/Users/vttrich/Downloads/spark-2.0.0/README.md" // Should be some file on your system
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext("local[*]", "RatingsCounter")
    //val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
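
For reference, a sketch of the same app reworked so that the master comes
from spark-submit instead of being hardcoded in the constructor (the file
path and app name are carried over from the code above; treat this as a
sketch, not a guaranteed fix for the connection error):

package com.x.y.vr.spark.first

import org.apache.spark.{SparkConf, SparkContext}

object SimpleApp {

  def main(args: Array[String]): Unit = {
    // Should be some file on your system
    val logFile = "/Users/vttrich/Downloads/spark-2.0.0/README.md"
    // No master set here: spark-submit supplies it (e.g. --master "local[*]"),
    // so the jar no longer overrides whatever the launcher was given.
    val conf = new SparkConf().setAppName("Simple Application")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
    sc.stop()
  }
}

When launching from the IDE rather than spark-submit, you would then need
to supply the master yourself, e.g. conf.setMaster("local[*]").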