Hi, a very good morning, and wishing you all a wonderful New Year ahead.
Sorry to bother you on New Year's Eve, but I would really appreciate any hints.

*1. Setting up and running the basic setup (on macOS High Sierra) (solution found):*

I remember having done this successfully before, but it behaved strangely this time.

a. Cloned the amaterasu and amaterasu-vagrant repos.
b. Built amaterasu using ./gradlew buildHomeDir test (and later tried with buildDistribution).
c. Have a vagrant box up and running (I modified the location of the sync folder to point to amaterasu's build/amaterasu directory).
d. Installed mesos locally using brew and set MESOS_NATIVE_JAVA_LIBRARY to point to /usr/local/lib/libmesos.dylib.
e. Ran ama-start.sh --repo="https://github.com/shintoio/amaterasu-job-sample.git" --branch="master" --env="test" --report="code"

I found from mesos that zookeeper wasn't running, and from the zookeeper logs that java wasn't installed. I manually installed java (sudo yum install java-1.7.0-openjdk-devel) and then started zookeeper (service zookeeper-server start). Mesos came up.

*Question 1 : Is it okay if I add the java installation to provision.sh, or did I miss an earlier step that would have avoided this issue?*

After submitting the job, I saw that it did not run successfully.

*Client logs (./ama-start.sh)*

I1231 13:24:01.548068 193015808 sched.cpp:232] Version: 1.3.0
I1231 13:24:01.554518 226975744 sched.cpp:336] New master detected at master@192.168.33.11:5050
I1231 13:24:01.554733 226975744 sched.cpp:352] No credentials provided. Attempting to register without authentication
I1231 13:24:01.558101 226975744 sched.cpp:759] Framework registered with 70566146-bf07-4515-aa0c-ed7fd597efe3-0019
===> moving to err action null
2017-12-31 13:24:08.884:INFO:oejs.ServerConnector:Thread-22: Stopped ServerConnector@424e1977{HTTP/1.1}{0.0.0.0:8000}
2017-12-31 13:24:08.886:INFO:oejsh.ContextHandler:Thread-22: Stopped o.e.j.s.ServletContextHandler@7bedc48a{/,file:/Users/arun/IdeaProjects/amaterasu/build/amaterasu/dist/,UNAVAILABLE}
2017-12-31 13:24:08.886:INFO:oejsh.ContextHandler:Thread-22: Stopped o.e.j.s.h.ContextHandler@4802796d{/,null,UNAVAILABLE}
I1231 13:24:08.887260 229122048 sched.cpp:2021] Asked to stop the driver
I1231 13:24:08.887711 229122048 sched.cpp:1203] Stopping framework 70566146-bf07-4515-aa0c-ed7fd597efe3-0019

*(Mesos logs are attached as MesosMaster-INFO.log)*

*Question 2 : Hints please*

*2. Dev setup*

I would like to debug the program to understand the flow of code within Amaterasu. I have attempted the following, and it gives the same result as above.

I created a main program that invokes the job launcher with the following parameters (this class is in the test directory, primarily to bring in the "provided" libraries. Not sure if that makes sense).

*Program arguments :* -Djava.library.path=/usr/lib

*Environment variables : (not sure why System.setProperty isn't picked up)*

AMA_NODE = MacBook-Pro.local
MESOS_NATIVE_JAVA_LIBRARY = /usr/local/lib/libmesos.dylib

    object JobLauncherDebug extends App {
      JobLauncher.main(Array(
        "--home", "/Users/arun/IdeaProjects/amaterasu/build/amaterasu",
        "--repo", "https://github.com/shintoio/amaterasu-job-sample.git",
        "--branch", "master",
        "--env", "test",
        "--report", "code"
      ))
    }

*Question 3 : Would this be a good way to go about debugging the code?*

*3. Pipelining two Spark jobs*

My ultimate goal is to run the following use case with Amaterasu. It is very similar to what I do at work.

a. Create a bunch of Hive tables (and hdfs directories) using a Spark job. This will be used as a deployment script: a one-time setup, but there is no harm in running it again since it has the "if not exists" clause. (https://github.com/arunma/ama_schemapreparer)
b. Run another Spark job that populates the data. This job runs at regular intervals throughout the day. (https://github.com/arunma/ama_datapopulator)
c. Run a different Spark job that reconciles the populated data.

I have yet to create the "job" project for this one, which I intend to do once I have the default test case running.

*Question 4 :* A couple of hurdles I expect are that Amaterasu, at the moment:

a. Expects the Spark jobs to be in the same repository.
b. Requires the file that instantiates the Spark session, context etc. to be given explicitly as a ".scala" file (we then use the IMain interpreter to inject the AmaContext?)

Now, with two repositories in play and only the binaries/repository name given for the repo, would it be a good idea to achieve the AmaContext insertion using a compiler plugin? I am pretty sure this has been discussed before, and it would be great if you could share your views on this. I can come up with a POC PR of sorts if I get some ideas.

Best Regards,
Arun
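
P.S. To make Question 1 concrete, the provision.sh addition could look roughly like the sketch below. It only wraps the two commands I ran by hand. The function names and the PROVISION guard variable are hypothetical, it assumes the box is yum-based, and none of it is taken from the existing provision.sh.

```shell
#!/usr/bin/env bash
# Sketch only: a possible addition to provision.sh, assuming a yum-based box.
# ensure_java, start_zookeeper and PROVISION are hypothetical names.

# Install the JDK only when no java binary is already on the PATH.
ensure_java() {
  if command -v java >/dev/null 2>&1; then
    echo "java already installed"
  else
    sudo yum install -y java-1.7.0-openjdk-devel
  fi
}

# Start ZooKeeper, which Mesos needs before it can come up.
start_zookeeper() {
  sudo service zookeeper-server start
}

# Run the steps only when PROVISION=1, so the file can be sourced safely.
if [ "${PROVISION:-0}" = "1" ]; then
  ensure_java && start_zookeeper
fi
```

Running it as PROVISION=1 bash provision.sh would perform both steps. The java check keeps a re-provision from reinstalling the package.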