Hi Alex,

I managed to make it work! I'm finally running both the Mesos master and slave on my laptop and picking up the Spark jar from an HDFS installed in a VM. I've just launched a Spark job and it is working fine!

Thank you very much for your help.

2015-05-28 16:20 GMT+02:00 Alberto Rodriguez <[email protected]>:

> Hi Alex,
>
> see following an extract of the chronos log (not sure whether this is the
> log you were talking about):
>
> 2015-05-28_14:18:28.49322 [2015-05-28 14:18:28,491] INFO No tasks scheduled! Declining offers (com.airbnb.scheduler.mesos.MesosJobFramework:106)
> 2015-05-28_14:18:34.49896 [2015-05-28 14:18:34,497] INFO Received resource offers
> 2015-05-28_14:18:34.49903 (com.airbnb.scheduler.mesos.MesosJobFramework:87)
> 2015-05-28_14:18:34.50036 [2015-05-28 14:18:34,498] INFO No tasks scheduled! Declining offers (com.airbnb.scheduler.mesos.MesosJobFramework:106)
> 2015-05-28_14:18:40.50442 [2015-05-28 14:18:40,503] INFO Received resource offers
> 2015-05-28_14:18:40.50444 (com.airbnb.scheduler.mesos.MesosJobFramework:87)
> 2015-05-28_14:18:40.50506 [2015-05-28 14:18:40,503] INFO No tasks scheduled! Declining offers (com.airbnb.scheduler.mesos.MesosJobFramework:106)
>
> I'm using 0.20.1 because I'm using this vagrant machine:
> https://github.com/Banno/vagrant-mesos
>
> Kind regards and thank you again for your help
>
> 2015-05-28 14:09 GMT+02:00 Alex Rukletsov <[email protected]>:
>
>> Alberto,
>>
>> it looks like Spark scheduler disconnects right after establishing the
>> connection. Would you mind sharing scheduler logs as well? Also I see
>> that you haven't specified the failover_timeout, try setting this value
>> to something meaningful (several hours for test purposes).
>>
>> And by the way, any reason you're still on Mesos 0.20.1?
>>
>> On Wed, May 27, 2015 at 5:32 PM, Alberto Rodriguez <[email protected]> wrote:
>>
>> > Hi Alex,
>> >
>> > I do not know what's going on, now I'm unable to access the spark
>> > console again, it's hanging up in the same point as before. See
>> > following the master logs:
>> >
>> > 2015-05-27_15:30:53.68764 I0527 15:30:53.687494 944 master.cpp:3760] Sending 1 offers to framework 20150527-100126-169978048-5050-1851-0001 (chronos-2.3.0_mesos-0.20.1-SNAPSHOT) at scheduler-be29901f-39ab-4bdf [email protected]:32768
>> > 2015-05-27_15:30:53.69032 I0527 15:30:53.690196 942 master.cpp:2273] Processing ACCEPT call for offers: [ 20150527-152023-169978048-5050-876-O241 ] on slave 20150527-152023-169978048-5050-876-S0 at slave(1)@192.168.33.11:5051 (mesos-slave1) for framework 20150527-100126-169978048-5050-1851-0001 (chronos-2.3.0_mesos-0.20.1-SNAPSHOT) at [email protected]:32768
>> > 2015-05-27_15:30:53.69038 I0527 15:30:53.690300 942 hierarchical.hpp:648] Recovered mem(*):1024; cpus(*):2; disk(*):33375; ports(*):[31000-32000] (total allocatable: mem(*):1024; cpus(*):2; disk(*):33375; ports(*):[31000-32000]) on slave 20150527-152023-169978048-5050-876-S0 from framework 20150527-100126-169978048-5050-1851-0001
>> > 2015-05-27_15:30:54.00952 I0527 15:30:54.009363 937 master.cpp:1574] Received registration request for framework 'Spark shell' at [email protected]:55562
>> > 2015-05-27_15:30:54.00957 I0527 15:30:54.009461 937 master.cpp:1638] Registering framework 20150527-152023-169978048-5050-876-0026 (Spark shell) at [email protected]:55562
>> > 2015-05-27_15:30:54.00994 I0527 15:30:54.009703 937 hierarchical.hpp:321] Added framework 20150527-152023-169978048-5050-876-0026
>> > 2015-05-27_15:30:54.00996 I0527 15:30:54.009826 937 master.cpp:3760] Sending 1 offers to framework 20150527-152023-169978048-5050-876-0026 (Spark shell) at [email protected]:55562
>> > 2015-05-27_15:30:54.01035 I0527 15:30:54.010267 944 master.cpp:878] Framework 20150527-152023-169978048-5050-876-0026 (Spark shell) at [email protected]:55562 disconnected
>> > 2015-05-27_15:30:54.01037 I0527 15:30:54.010308 944 master.cpp:1948] Disconnecting framework 20150527-152023-169978048-5050-876-0026 (Spark shell) at [email protected]:55562
>> > 2015-05-27_15:30:54.01038 I0527 15:30:54.010326 944 master.cpp:1964] Deactivating framework 20150527-152023-169978048-5050-876-0026 (Spark shell) at [email protected]:55562
>> > 2015-05-27_15:30:54.01053 I0527 15:30:54.010447 939 hierarchical.hpp:400] Deactivated framework 20150527-152023-169978048-5050-876-0026
>> > 2015-05-27_15:30:54.01055 I0527 15:30:54.010459 944 master.cpp:900] Giving framework 20150527-152023-169978048-5050-876-0026 (Spark shell) at [email protected]:55562 0ns to failover
>> >
>> > Kind regards and thank you very much for your help!!
>> >
>> > 2015-05-27 16:28 GMT+02:00 Alex Rukletsov <[email protected]>:
>> >
>> > > Alberto,
>> > >
>> > > would you mind providing slave and master logs (or appropriate
>> > > parts of them)? Have you specified the --work_dir flag for your
>> > > Mesos Workers?
>> > >
>> > > On Wed, May 27, 2015 at 3:56 PM, Alberto Rodriguez <[email protected]> wrote:
>> > >
>> > > > Hi Alex,
>> > > >
>> > > > Thank you for replying. I managed to fix the first problem but
>> > > > now when I launch a spark job through my console mesos is losing
>> > > > all the tasks. I can see them all in my mesos slave but their
>> > > > status is LOST. The stderr & stdout files of the tasks are both
>> > > > empty.
>> > > >
>> > > > Any ideas?
>> > > >
>> > > > 2015-05-26 17:35 GMT+02:00 Alex Rukletsov <[email protected]>:
>> > > >
>> > > > > Alberto,
>> > > > >
>> > > > > What may be happening in your case is that Master is not able
>> > > > > to talk to your scheduler. When responding to a scheduler,
>> > > > > Mesos Master doesn't use the IP from which a request came, but
>> > > > > rather an IP set in the "Libprocess-from" field instead. That's
>> > > > > exactly what you specify in LIBPROCESS_IP env var prior to
>> > > > > starting your scheduler. Could you please double check that it
>> > > > > is set up correctly and that IP is reachable for Mesos Master?
>> > > > >
>> > > > > In case you are not able to solve the problem, please provide
>> > > > > scheduler and Master logs together with master, zookeeper, and
>> > > > > scheduler configurations.
>> > > > >
>> > > > > On Mon, May 25, 2015 at 6:30 PM, Alberto Rodriguez <[email protected]> wrote:
>> > > > >
>> > > > > > Hi all,
>> > > > > >
>> > > > > > I managed to get a mesos cluster up & running on a Ubuntu VM.
>> > > > > > I've been also able to run and connect a spark-shell from
>> > > > > > this machine and it works properly.
>> > > > > >
>> > > > > > Unfortunately, I'm trying to connect from the host machine
>> > > > > > where the VM is running to launch spark jobs and I can not.
>> > > > > >
>> > > > > > See below the spark console output:
>> > > > > >
>> > > > > > Using Scala version 2.10.4 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_75)
>> > > > > > Type in expressions to have them evaluated.
>> > > > > > Type :help for more information.
>> > > > > > 15/05/25 18:13:00 INFO SecurityManager: Changing view acls to: arodriguez
>> > > > > > 15/05/25 18:13:00 INFO SecurityManager: Changing modify acls to: arodriguez
>> > > > > > 15/05/25 18:13:00 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(arodriguez); users with modify permissions: Set(arodriguez)
>> > > > > > 15/05/25 18:13:01 INFO Slf4jLogger: Slf4jLogger started
>> > > > > > 15/05/25 18:13:01 INFO Remoting: Starting remoting
>> > > > > > 15/05/25 18:13:01 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://[email protected]:47229]
>> > > > > > 15/05/25 18:13:01 INFO Utils: Successfully started service 'sparkDriver' on port 47229.
>> > > > > > 15/05/25 18:13:01 INFO SparkEnv: Registering MapOutputTracker
>> > > > > > 15/05/25 18:13:01 INFO SparkEnv: Registering BlockManagerMaster
>> > > > > > 15/05/25 18:13:01 INFO DiskBlockManager: Created local directory at /tmp/spark-local-20150525181301-7fa8
>> > > > > > 15/05/25 18:13:01 INFO MemoryStore: MemoryStore started with capacity 265.4 MB
>> > > > > > 15/05/25 18:13:01 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>> > > > > > 15/05/25 18:13:01 INFO HttpFileServer: HTTP File server directory is /tmp/spark-1249c23f-adc8-4fcd-a044-b65a80f40e16
>> > > > > > 15/05/25 18:13:01 INFO HttpServer: Starting HTTP Server
>> > > > > > 15/05/25 18:13:01 INFO Utils: Successfully started service 'HTTP file server' on port 51659.
>> > > > > > 15/05/25 18:13:01 INFO Utils: Successfully started service 'SparkUI' on port 4040.
>> > > > > > 15/05/25 18:13:01 INFO SparkUI: Started SparkUI at http://localhost.localdomain:4040
>> > > > > > WARNING: Logging before InitGoogleLogging() is written to STDERR
>> > > > > > W0525 18:13:01.749449 10908 sched.cpp:1323]
>> > > > > > **************************************************
>> > > > > > Scheduler driver bound to loopback interface! Cannot communicate with remote master(s). You might want to set 'LIBPROCESS_IP' environment variable to use a routable IP address.
>> > > > > > **************************************************
>> > > > > > 2015-05-25 18:13:01,749:10746(0x7fd4b1ffb700):ZOO_INFO@log_env@712: Client environment:zookeeper.version=zookeeper C client 3.4.6
>> > > > > > 2015-05-25 18:13:01,749:10746(0x7fd4b1ffb700):ZOO_INFO@log_env@716: Client environment:host.name=localhost.localdomain
>> > > > > > 2015-05-25 18:13:01,749:10746(0x7fd4b1ffb700):ZOO_INFO@log_env@723: Client environment:os.name=Linux
>> > > > > > 2015-05-25 18:13:01,749:10746(0x7fd4b1ffb700):ZOO_INFO@log_env@724: Client environment:os.arch=3.19.7-200.fc21.x86_64
>> > > > > > 2015-05-25 18:13:01,749:10746(0x7fd4b1ffb700):ZOO_INFO@log_env@725: Client environment:os.version=#1 SMP Thu May 7 22:00:21 UTC 2015
>> > > > > > 2015-05-25 18:13:01,749:10746(0x7fd4b1ffb700):ZOO_INFO@log_env@733: Client environment:user.name=arodriguez
>> > > > > > I0525 18:13:01.749791 10908 sched.cpp:157] Version: 0.22.1
>> > > > > > 2015-05-25 18:13:01,749:10746(0x7fd4b1ffb700):ZOO_INFO@log_env@741: Client environment:user.home=/home/arodriguez
>> > > > > > 2015-05-25 18:13:01,749:10746(0x7fd4b1ffb700):ZOO_INFO@log_env@753: Client environment:user.dir=/home/arodriguez/dev/spark-1.2.0-bin-hadoop2.4/bin
>> > > > > > 2015-05-25 18:13:01,749:10746(0x7fd4b1ffb700):ZOO_INFO@zookeeper_init@786: Initiating client connection, host=10.141.141.10:2181 sessionTimeout=10000 watcher=0x7fd4c2f0d5b0 sessionId=0 sessionPasswd=<null> context=0x7fd3d40063c0 flags=0
>> > > > > > 2015-05-25 18:13:01,750:10746(0x7fd4ab7fe700):ZOO_INFO@check_events@1705: initiated connection to server [10.141.141.10:2181]
>> > > > > > 2015-05-25 18:13:01,752:10746(0x7fd4ab7fe700):ZOO_INFO@check_events@1752: session establishment complete on server [10.141.141.10:2181], sessionId=0x14d8babef360022, negotiated timeout=10000
>> > > > > > I0525 18:13:01.752760 10913 group.cpp:313] Group process (group(1)@127.0.0.1:48557) connected to ZooKeeper
>> > > > > > I0525 18:13:01.752787 10913 group.cpp:790] Syncing group operations: queue size (joins, cancels, datas) = (0, 0, 0)
>> > > > > > I0525 18:13:01.752807 10913 group.cpp:385] Trying to create path '/mesos' in ZooKeeper
>> > > > > > I0525 18:13:01.754317 10909 detector.cpp:138] Detected a new leader: (id='16')
>> > > > > > I0525 18:13:01.754408 10913 group.cpp:659] Trying to get '/mesos/info_0000000016' in ZooKeeper
>> > > > > > I0525 18:13:01.755056 10913 detector.cpp:452] A new leading master ([email protected]:5050) is detected
>> > > > > > I0525 18:13:01.755113 10911 sched.cpp:254] New master detected at [email protected]:5050
>> > > > > > I0525 18:13:01.755345 10911 sched.cpp:264] No credentials provided. Attempting to register without authentication
>> > > > > >
>> > > > > > It hangs up in the last line.
>> > > > > >
>> > > > > > I've tried to set the LIBPROCESS_IP env variable with no luck.
>> > > > > >
>> > > > > > Any advice?
>> > > > > >
>> > > > > > Thank you in advance.
>> > > > > >
>> > > > > > Kind regards,
>> > > > > >
>> > > > > > Alberto

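For anyone hitting the same "Scheduler driver bound to loopback interface" warning: the LIBPROCESS_IP check Alex suggested in his 2015-05-26 reply can be sketched roughly as below. This is only an illustration, not something posted in the thread. The helper names are hypothetical, and it assumes the Mesos master runs on the same host as the ZooKeeper address seen in the log (10.141.141.10) and listens on port 5050.

```python
import ipaddress
import socket


def routable_ip_towards(master_host: str, master_port: int = 5050) -> str:
    """Return the local IP the OS would pick to reach the given master."""
    # Connecting a UDP socket sends no packets; it only asks the kernel
    # which local address it would use for that destination.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((master_host, master_port))
        return sock.getsockname()[0]


def is_loopback(ip: str) -> bool:
    """True for 127.0.0.0/8 (or ::1), which a remote master cannot reach."""
    return ipaddress.ip_address(ip).is_loopback


def libprocess_export_line(ip: str) -> str:
    """Build the shell line to run before starting the scheduler."""
    if is_loopback(ip):
        raise ValueError(f"{ip} is a loopback address; the master cannot reply to it")
    return f"export LIBPROCESS_IP={ip}"


if __name__ == "__main__":
    try:
        # Assumption: master address taken from the ZooKeeper host in the log.
        local_ip = routable_ip_towards("10.141.141.10")
        print(libprocess_export_line(local_ip))
    except (OSError, ValueError) as err:
        print(f"Could not determine a routable IP: {err}")
```

Run it on the machine that starts spark-shell, then paste the printed export line into that shell before launching Spark. The master still has to be able to route back to that address, which the script cannot verify.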