Hi Armen,

I remember having to make sure that /usr/lib/zeppelin/local-repo was owned by user zeppelin:
sudo chown zeppelin /usr/lib/zeppelin/local-repo

Asif

On Sat, Dec 5, 2015 at 10:43 AM, armen donigian <[email protected]> wrote:

> Follow up to my previous email regarding loading of external jars & Null
> Pointer Exception (NPE).
>
> '/usr/lib/zeppelin/local-repo' doesn't exist for user 'hadoop' on
> master node. Is it supposed to?
> I created '/var/lib/zeppelin/local-repo', then
> 'ln -s /var/lib/zeppelin/local-repo /usr/lib/zeppelin/local-repo'...but still
> getting NPE error. Any suggestions?
>
> Btw, in an unrelated topic, does zeppelin support a feature to email a
> user the output of a note? Like unix processes would return a status code,
> a zeppelin note can return at minimum true (success) or false (failure).
>
>
> On Sat, Dec 5, 2015 at 12:18 AM Work <[email protected]> wrote:
>
>> 1. EMR does not currently provide anything like this for Zeppelin. (Good
>> idea though!) Zeppelin's built-in S3 notebook storage might help you,
>> especially if you turn on bucket versioning, I suppose, but I have not
>> tried this.
>>
>> 2. Yes, if you go to the ResourceManager on port 8088 then click the
>> ApplicationMaster link next to the Zeppelin app, you can get to the Spark
>> UI associated with the Zeppelin SparkContext (assuming you have first run a
>> notebook containing Spark code, otherwise the Zeppelin YARN app won't exist
>> yet).
>>
>> 3. Sorry, I have not tried using Zeppelin's notebook scheduler, but yes,
>> DataPipelines would probably provide you more reliability for production
>> batch ETL jobs. I don't know what your use case is, but maybe you could use
>> DataPipelines to generate some dataset that you store in S3 and can query
>> via Zeppelin?
>>
>> 4. This is a limitation of Zeppelin (really though, of Spark), not
>> specifically of Zeppelin on EMR, in that you must load any dependencies
>> before running any Spark code because the dependencies can only be loaded
>> once. However, once you solve this issue, you will run into a known issue
>> with Zeppelin on EMR where you hit a weird NPE that is caused by the
>> zeppelin user not having write access to /usr/lib/zeppelin/local-repo. I
>> would suggest creating /var/lib/zeppelin/local-repo then creating a symlink
>> from /usr/lib/zeppelin/local-repo to /var/lib/zeppelin/local-repo. We will
>> fix this in emr-4.3.0.
>>
>> ~ Jonathan
>>
>>
>> On Fri, Dec 4, 2015 at 11:18 PM, armen donigian <[email protected]>
>> wrote:
>>
>>> Hi all,
>>> Installed Zeppelin on Amazon EMR and it's running swell. Had a few
>>> questions...
>>>
>>> 1. How do we version control Zeppelin notes?
>>>
>>> 2. How do you check for status of a long running Zeppelin task? Is there
>>> a web UI for this or do you simply check the Resource Manager UI
>>> @master-node:8088 (in case of AWS)?
>>>
>>> 3. Are there any known issues/limitations of running Zeppelin note
>>> scheduler in production for batch ETL jobs? Trying to assess it vs Amazon
>>> Data Pipelines.
>>>
>>> 4. When trying to add an external jar, I'm getting this error.
>>> %dep
>>> z.reset()
>>> z.load("com.databricks:spark-redshift_2.10:0.5.2")
>>> Must be used before SparkInterpreter (%spark) initialized
>>>
>>> Thanks
>>>
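
For anyone finding this thread later: the two suggestions in it (Jonathan's symlink from /usr/lib/zeppelin/local-repo to /var/lib/zeppelin/local-repo, and Asif's chown to the zeppelin user) can probably be combined into one small shell helper. This is only a sketch. The function name `fix_local_repo` and its argument order are made up for illustration, and it assumes the paths and the `zeppelin` user mentioned in the thread.

```shell
#!/bin/sh
# Sketch of the local-repo workaround discussed in this thread.
# fix_local_repo is a hypothetical helper name, not part of Zeppelin or EMR.
#   $1 = real directory to create    (thread: /var/lib/zeppelin/local-repo)
#   $2 = path Zeppelin writes to     (thread: /usr/lib/zeppelin/local-repo)
#   $3 = user that must own the repo (thread: zeppelin)
fix_local_repo() {
    target=$1
    link=$2
    owner=$3

    if [ -e "$link" ]; then
        # The path already exists (directory or working symlink):
        # only make sure the Zeppelin user owns it.
        chown "$owner" "$link" || return 1
    else
        # Create the real directory, hand it to the Zeppelin user,
        # then point the expected path at it.
        mkdir -p "$target" || return 1
        chown "$owner" "$target" || return 1
        ln -s "$target" "$link" || return 1
    fi
}
```

On the master node this would need to run as root, for example `fix_local_repo /var/lib/zeppelin/local-repo /usr/lib/zeppelin/local-repo zeppelin`. As noted earlier in the thread, the `%dep` paragraph still has to run before any `%spark` paragraph.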
