Re: ama-start and JobLauncher stuck (Solution found) + Running Spark jobs across several repo

Arun Manivannan Sun, 31 Dec 2017 00:31:15 -0800

As a side note, I am not yet very familiar with Mesos.
I also tried to run this job
https://github.com/shintoio/amaterasu-job-py-sample but got the "===>
moving to err action null" message again. Could this be a problem with my
local environment or build?


Cheers,
Arun

On Sun, Dec 31, 2017 at 1:35 PM Arun Manivannan <a...@arunma.com> wrote:

> Hi,
>
> A very good morning, and I wish you all a wonderful New Year ahead.
>
> Sorry to bother you on New Year's Eve, but I would really appreciate any
> hints.
>
> *1. Setting up and running the basic setup (on MacOS High Sierra)
> (Solution found): *
>
> I remember having done this successfully before but it was strange this
> time.
>
> a. Cloned amaterasu and amaterasu-vagrant repos.
> b. Built amaterasu using ./gradlew buildHomeDir test (and later tried with
> buildDistribution)
> c. Have a vagrant box up and running (have modified the location of the
> sync folder to point to amaterasu's build/amaterasu directory)
> d. Have installed mesos locally using brew and set the
> MESOS_NATIVE_JAVA_LIBRARY to point to /usr/local/lib/libmesos.dylib
> e. Did a
>
> ama-start.sh --repo="https://github.com/shintoio/amaterasu-job-sample.git"
> --branch="master" --env="test" --report="code"
>
>
> I found from mesos that zookeeper wasn't running and from zookeeper logs
> that java wasn't installed.  I manually installed java (sudo yum install
> java-1.7.0-openjdk-devel) and then started zookeeper (service
> zookeeper-server start). Mesos came up.
>
> *Question 1 : Is it okay if I add the Java installation to provision.sh,
> or did I miss an earlier step that would have avoided this issue?*
>
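
For Question 1, the change I have in mind is roughly the sketch below. It
is a hypothetical addition to provision.sh, not something taken from the
amaterasu-vagrant repo: the helper names (run, ensure_java,
start_zookeeper) and the DRY_RUN switch are my own assumptions, and only
the package and service names come from the steps quoted above.

```shell
#!/bin/sh
# Hypothetical addition to provision.sh (assumption, not from the repo).
# DRY_RUN defaults to 1 so the sketch only prints what it would run;
# set DRY_RUN=0 to actually execute the commands.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Install the JDK only when no java binary is on the PATH.
ensure_java() {
  if command -v java >/dev/null 2>&1; then
    echo "java already installed"
  else
    run sudo yum install -y java-1.7.0-openjdk-devel
  fi
}

# Start ZooKeeper, which needs Java to be present first.
start_zookeeper() {
  run service zookeeper-server start
}

ensure_java
start_zookeeper
```

The java check keeps the step safe to re-run when the box is provisioned
again, and -y is there because provisioning is non-interactive.
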
> 2. Upon submitting the job, I saw that it did not run successfully.
>
>
> *Client logs (./ama-start.sh)*
>
> I1231 13:24:01.548068 193015808 sched.cpp:232] Version: 1.3.0
> I1231 13:24:01.554518 226975744 sched.cpp:336] New master detected at
> master@192.168.33.11:5050
> I1231 13:24:01.554733 226975744 sched.cpp:352] No credentials provided.
> Attempting to register without authentication
> I1231 13:24:01.558101 226975744 sched.cpp:759] Framework registered with
> 70566146-bf07-4515-aa0c-ed7fd597efe3-0019
> ===> moving to err action null
> 2017-12-31 13:24:08.884:INFO:oejs.ServerConnector:Thread-22: Stopped
> ServerConnector@424e1977{HTTP/1.1}{0.0.0.0:8000}
> 2017-12-31 13:24:08.886:INFO:oejsh.ContextHandler:Thread-22: Stopped
> o.e.j.s.ServletContextHandler@7bedc48a
> {/,file:/Users/arun/IdeaProjects/amaterasu/build/amaterasu/dist/,UNAVAILABLE}
> 2017-12-31 13:24:08.886:INFO:oejsh.ContextHandler:Thread-22: Stopped
> o.e.j.s.h.ContextHandler@4802796d{/,null,UNAVAILABLE}
> I1231 13:24:08.887260 229122048 sched.cpp:2021] Asked to stop the driver
> I1231 13:24:08.887711 229122048 sched.cpp:1203] Stopping framework
> 70566146-bf07-4515-aa0c-ed7fd597efe3-0019
>
> *(MESOS LOGS as ATTACHMENT MesosMaster-INFO.log)*
>
> *Question 2 :  Hints please*
>
> *2. Dev setup*
> I would like to debug the program to understand the flow of code within
> Amaterasu. I attempted the following, which gives the same result as
> above: I created a main program that invokes the job launcher with the
> following parameters (this class is in the test directory, primarily to
> bring in the "provided" libraries; I am not sure whether that makes sense).
>
> *Program arguments : *
> -Djava.library.path=/usr/lib
> *Environment variables : (not sure why System.setProperty isn't picked
> up)*
> AMA_NODE = MacBook-Pro.local
> MESOS_NATIVE_JAVA_LIBRARY = /usr/local/lib/libmesos.dylib
>
> object JobLauncherDebug extends App {
>   JobLauncher.main(Array(
>     "--home", "/Users/arun/IdeaProjects/amaterasu/build/amaterasu",
>     "--repo", "https://github.com/shintoio/amaterasu-job-sample.git",
>     "--branch", "master",
>     "--env", "test",
>     "--report", "code"
>   ))
> }
>
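
On the environment variables: as far as I understand, System.setProperty
only sets JVM system properties and does not change the process
environment, which would explain why these two values have to come from
the environment itself. A minimal sketch of exporting them in the shell
before launching is below. The values are the ones listed above;
check_mesos_lib is a hypothetical helper of mine, not part of Amaterasu or
Mesos.

```shell
#!/bin/sh
# Values copied from the message above; adjust them for your machine.
export AMA_NODE="MacBook-Pro.local"
export MESOS_NATIVE_JAVA_LIBRARY="/usr/local/lib/libmesos.dylib"

# Hypothetical helper: report whether the given path points at a file.
check_mesos_lib() {
  if [ -f "$1" ]; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Warn early instead of failing later inside the launcher.
check_mesos_lib "$MESOS_NATIVE_JAVA_LIBRARY" \
  || echo "fix MESOS_NATIVE_JAVA_LIBRARY before launching"
```

When launching from the IDE instead, the same two values go into the
environment section of the run configuration.
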
> *Question 3 : Would this be a good way to go about debugging the
> code?*
>
> *3. Pipelining two Spark jobs*
>
> My ultimate goal would be to run the following use-case using Amaterasu.
> This is very similar to what I do at work.
>
> a. Create a bunch of Hive tables (and hdfs directories) using a Spark job
> (that will be used as a deployment script - one time setup but no harm in
> running it again since it has the "if not exists" clause) (
> https://github.com/arunma/ama_schemapreparer)
> b. Run another Spark job that populates the data (this job runs on regular
> intervals throughout the day) (https://github.com/arunma/ama_datapopulator)
>
> c. Run a different Spark job that reconciles the populated data.
>
> I am yet to create the "job" project for this one which I intend to do
> once I have the default testcase running.
>
> *Question 4 :*
> A couple of hurdles that I believe I will face are that Amaterasu, at the
> moment,
>
> a. Expects the Spark jobs to be in the same repository.
> b. The file that instantiates the Spark session, context etc has to be
> explicitly given as a ".scala" file (we then use the IMain interpreter to
> inject the AmaContext?)
>
> Now, with two repositories in play and only the binaries/repository name
> given for the repo, would it be a good idea to achieve the AmaContext
> insertion using a compiler plugin?  I am pretty sure this has been
> discussed before and it would be great if you could share your views on
> this. I can come up with a POC PR of sorts if I get some ideas.
>
> Best Regards,
> Arun
>
>
