Re: how can I run spark job in my environment which is a single Ubuntu host with no hadoop installed
Maybe your application is overriding the master variable when it creates its SparkContext. I see you are still passing "yarn-client" as an argument to it later in your command.

> On Jun 17, 2018, at 11:53 AM, Raymond Xie wrote:
>
> Thank you Subhash.
>
> Here is the new command:
>
> spark-submit --master local[*] --class retail_db.GetRevenuePerOrder --conf spark.ui.port=12678 spark2practice_2.11-0.1.jar yarn-client /public/retail_db/order_items /home/rxie/output/revenueperorder
>
> Still seeing the same issue here:
>
> 2018-06-17 11:51:25 INFO RMProxy:98 - Connecting to ResourceManager at /0.0.0.0:8032
> 2018-06-17 11:51:27 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
> [the same retry line repeats once a second through "Already tried 9 time(s)"]
>
> Sincerely yours,
>
> Raymond
>
> On Sun, Jun 17, 2018 at 2:36 PM, Subhash Sriram wrote:
>> Hi Raymond,
>>
>> If you set your master to local[*] instead of yarn-client, it should run on your local machine.
>>
>> Thanks,
>> Subhash
>>
>> Sent from my iPhone
>>
>> On Jun 17, 2018, at 2:32 PM, Raymond Xie wrote:
>>> Hello,
>>>
>>> I am wondering how I can run a Spark job in my environment, which is a single Ubuntu host with no Hadoop installed. If I run my job as below, it ends up in an infinite loop at the end. Thank you very much.
>>>
>>> rxie@ubuntu:~/data$ spark-submit --class retail_db.GetRevenuePerOrder --conf spark.ui.port=12678 spark2practice_2.11-0.1.jar yarn-client /public/retail_db/order_items /home/rxie/output/revenueperorder
>>> 2018-06-17 11:19:36 WARN Utils:66 - Your hostname, ubuntu resolves to a loopback address: 127.0.1.1; using 192.168.112.141 instead (on interface ens33)
>>> 2018-06-17 11:19:36 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
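The diagnosis above can be sketched in code. GetRevenuePerOrder's source isn't shown in the thread, so the object and method names below are hypothetical; the sketch assumes the common tutorial pattern where the job treats its first program argument as the master and calls SparkConf.setMaster with it. In Spark, a master set directly on the SparkConf in application code takes precedence over the --master flag given to spark-submit, which would explain why the leftover "yarn-client" argument wins over --master local[*]:

```scala
// Hypothetical sketch of the suspected pattern inside GetRevenuePerOrder.
// A master set in code (e.g. SparkConf.setMaster(args(0))) overrides the
// --master flag passed to spark-submit, so a stray "yarn-client" in args(0)
// silently wins over "local[*]".
object MasterPrecedenceSketch {
  // Returns the master the job would actually use: the in-code value
  // (taken from args) if present, otherwise the spark-submit --master value.
  def effectiveMaster(submitMaster: String, args: Array[String]): String =
    if (args.nonEmpty) args(0) else submitMaster

  def main(args: Array[String]): Unit = {
    // Mirrors the invocation in this thread: --master local[*] on the
    // command line, but "yarn-client" still passed as the first argument.
    println(effectiveMaster("local[*]", Array("yarn-client"))) // prints yarn-client
  }
}
```

If the job follows this pattern, either drop the yarn-client positional argument or pass local[*] there as well, so the in-code master and the --master flag agree.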
On Jun 17, 2018, at 11:53 AM, Raymond Xie wrote:
Thank you Subhash.

Here is the new command:

spark-submit --master local[*] --class retail_db.GetRevenuePerOrder --conf spark.ui.port=12678 spark2practice_2.11-0.1.jar yarn-client /public/retail_db/order_items /home/rxie/output/revenueperorder

Still seeing the same issue here:

2018-06-17 11:51:25 INFO RMProxy:98 - Connecting to ResourceManager at /0.0.0.0:8032
2018-06-17 11:51:27 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-06-17 11:51:28 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-06-17 11:51:29 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-06-17 11:51:30 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-06-17 11:51:31 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 4 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-06-17 11:51:32 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-06-17 11:51:33 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 6 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-06-17 11:51:34 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 7 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-06-17 11:51:35 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2018-06-17 11:51:36 INFO Client:871 - Retrying connect to server: 0.0.0.0/0.0.0.0:8032. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)

Sincerely yours,

Raymond

On Sun, Jun 17, 2018 at 2:36 PM, Subhash Sriram wrote:
> Hi Raymond,
>
> If you set your master to local[*] instead of yarn-client, it should run on your local machine.
>
> Thanks,
> Subhash
>
> Sent from my iPhone
>
> On Jun 17, 2018, at 2:32 PM, Raymond Xie wrote:
>> Hello,
>>
>> I am wondering how I can run a Spark job in my environment, which is a single Ubuntu host with no Hadoop installed. If I run my job as below, it ends up in an infinite loop at the end. Thank you very much.
>>
>> rxie@ubuntu:~/data$ spark-submit --class retail_db.GetRevenuePerOrder --conf spark.ui.port=12678 spark2practice_2.11-0.1.jar yarn-client /public/retail_db/order_items /home/rxie/output/revenueperorder
>> 2018-06-17 11:19:36 WARN Utils:66 - Your hostname, ubuntu resolves to a loopback address: 127.0.1.1; using 192.168.112.141 instead (on interface ens33)
>> 2018-06-17 11:19:36 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
>> 2018-06-17 11:19:37 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> 2018-06-17 11:19:38 INFO SparkContext:54 - Running Spark version 2.3.1
>> 2018-06-17 11:19:38 WARN SparkConf:66 - spark.master yarn-client is deprecated in Spark 2.0+, please instead use "yarn" with specified deploy mode.
>> [the remainder of the startup log repeats as in the original message below]
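As an aside, the two WARN lines about the loopback hostname in the quoted log are harmless for a single-host setup, but Spark reads the address the driver should bind to from the SPARK_LOCAL_IP environment variable. A config fragment, using the non-loopback address Spark itself picked in the log as an example:

```shell
# Example only: pin the Spark driver to a specific local address
# (the log shows Spark falling back to 192.168.112.141 on ens33).
export SPARK_LOCAL_IP=192.168.112.141
```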
On Sun, Jun 17, 2018 at 2:36 PM, Subhash Sriram wrote:
Hi Raymond,

If you set your master to local[*] instead of yarn-client, it should run on your local machine.

Thanks,
Subhash

Sent from my iPhone

> On Jun 17, 2018, at 2:32 PM, Raymond Xie wrote:
>
> Hello,
>
> I am wondering how I can run a Spark job in my environment, which is a single Ubuntu host with no Hadoop installed. If I run my job as below, it ends up in an infinite loop at the end. Thank you very much.
>
> rxie@ubuntu:~/data$ spark-submit --class retail_db.GetRevenuePerOrder --conf spark.ui.port=12678 spark2practice_2.11-0.1.jar yarn-client /public/retail_db/order_items /home/rxie/output/revenueperorder
> 2018-06-17 11:19:36 WARN Utils:66 - Your hostname, ubuntu resolves to a loopback address: 127.0.1.1; using 192.168.112.141 instead (on interface ens33)
> 2018-06-17 11:19:36 WARN Utils:66 - Set SPARK_LOCAL_IP if you need to bind to another address
> 2018-06-17 11:19:37 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 2018-06-17 11:19:38 INFO SparkContext:54 - Running Spark version 2.3.1
> 2018-06-17 11:19:38 WARN SparkConf:66 - spark.master yarn-client is deprecated in Spark 2.0+, please instead use "yarn" with specified deploy mode.
> 2018-06-17 11:19:38 INFO SparkContext:54 - Submitted application: Get Revenue Per Order
> 2018-06-17 11:19:38 INFO SecurityManager:54 - Changing view acls to: rxie
> 2018-06-17 11:19:38 INFO SecurityManager:54 - Changing modify acls to: rxie
> 2018-06-17 11:19:38 INFO SecurityManager:54 - Changing view acls groups to:
> 2018-06-17 11:19:38 INFO SecurityManager:54 - Changing modify acls groups to:
> 2018-06-17 11:19:38 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(rxie); groups with view permissions: Set(); users with modify permissions: Set(rxie); groups with modify permissions: Set()
> 2018-06-17 11:19:39 INFO Utils:54 - Successfully started service 'sparkDriver' on port 44709.
> 2018-06-17 11:19:39 INFO SparkEnv:54 - Registering MapOutputTracker
> 2018-06-17 11:19:39 INFO SparkEnv:54 - Registering BlockManagerMaster
> 2018-06-17 11:19:39 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
> 2018-06-17 11:19:39 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
> 2018-06-17 11:19:39 INFO DiskBlockManager:54 - Created local directory at /tmp/blockmgr-69a8a12d-0881-4454-96ab-6a45d5c58bfe
> 2018-06-17 11:19:39 INFO MemoryStore:54 - MemoryStore started with capacity 413.9 MB
> 2018-06-17 11:19:39 INFO SparkEnv:54 - Registering OutputCommitCoordinator
> 2018-06-17 11:19:40 INFO log:192 - Logging initialized @7035ms
> 2018-06-17 11:19:40 INFO Server:346 - jetty-9.3.z-SNAPSHOT
> 2018-06-17 11:19:40 INFO Server:414 - Started @7383ms
> 2018-06-17 11:19:40 INFO AbstractConnector:278 - Started ServerConnector@51ad75c2{HTTP/1.1,[http/1.1]}{0.0.0.0:12678}
> 2018-06-17 11:19:40 INFO Utils:54 - Successfully started service 'SparkUI' on port 12678.
> 2018-06-17 11:19:40 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@50b8ae8d{/jobs,null,AVAILABLE,@Spark}
> [a dozen more ContextHandler lines for the other Spark UI endpoints follow; the pasted log ends mid-line]
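Putting the thread's advice together, a corrected invocation might look like the sketch below. It assumes the job does not actually need the yarn-client positional argument (if the application reads its master from that argument, it must be removed or changed to local[*] as well, as discussed above):

```shell
spark-submit \
  --master local[*] \
  --class retail_db.GetRevenuePerOrder \
  --conf spark.ui.port=12678 \
  spark2practice_2.11-0.1.jar \
  /public/retail_db/order_items /home/rxie/output/revenueperorder
```

With local[*], Spark runs entirely on the single Ubuntu host using all available cores, so no Hadoop or YARN ResourceManager (the 0.0.0.0:8032 endpoint in the retry loop) is needed.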