Re: User Impersonation Configuration
From the exception, it looks like you are on 0.7.1, which has the above-mentioned patch (https://github.com/apache/zeppelin/pull/1840). Without ZEPPELIN_IMPERSONATE_CMD and ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER set, it should work just fine. However, if you want to set up password-less login, you can use this doc; it was written for Zeppelin 0.6.0 but it works fine: https://community.hortonworks.com/content/kbentry/81069/how-to-enable-user-impersonation-for-sh-interprete.html

Also, can you quickly check whether you are able to connect to Spark with user impersonation?

On 10 May 2017 at 23:05, Yeshwanth Jagini wrote:
> Hi Prabhjyot,
> thanks for your reply.
>
> I am using Zeppelin 0.7.0.
> When I do not specify the impersonation config in zeppelin-env.sh and only in the interpreter setting, it throws the following exception:
>
> ERROR [2017-05-10 17:26:30,551] ({pool-2-thread-3} Job.java[run]:188) - Job failed
> org.apache.zeppelin.interpreter.InterpreterException: Host key verification failed.
>   at org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:143)
>   at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.reference(RemoteInterpreterProcess.java:73)
>   at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:258)
>   at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:423)
>   at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:106)
>   at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:387)
>   at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
>   at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
>
> I am running Zeppelin as the root user. The root user doesn't have password-less SSH set up, whereas the end web user *user1* does.
>
> How should I proceed now?
>
> Thanks,
> Yeshwanth Jagini
>
> On Wed, May 10, 2017 at 1:45 AM, Prabhjyot Singh <prabhjyotsi...@apache.org> wrote:
>> Hi Yeshwant,
>>
>> Which version of Zeppelin are you on?
>>
>> If you are on the latest, then you don't need to set ZEPPELIN_IMPERSONATE_CMD or ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER. Just enabling the User Impersonation check-box should be sufficient.
>>
>> Can you confirm with `ps aux | grep spark`? This is what I see on my machine:
>>
>> prabhjyotsingh@MACHINE:~/ps-zeppelin/logs$ ps aux | grep spark
>> prabhjyotsingh 2496 0.2 3.9 5179540 657660 s000 S 12:08PM 0:30.68 /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/bin/java -cp /Users/prabhjyotsingh/ps-zeppelin/interpreter/spark/*:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/lib/*:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/classes/:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/test-classes/:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-zengine/target/test-classes/:/Users/prabhjyotsingh/ps-zeppelin/interpreter/spark/zeppelin-spark_2.10-0.8.0-SNAPSHOT.jar:/Users/prabhjyotsingh/spark-2.0.0-bin-hadoop2.7/conf/:/Users/prabhjyotsingh/spark-2.0.0-bin-hadoop2.7/jars/* -Xmx1g -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///Users/prabhjyotsingh/ps-zeppelin/conf/log4j.properties -Dzeppelin.log.file=/Users/prabhjyotsingh/ps-zeppelin/logs/zeppelin-interpreter-spark-user1-spark-prabhjyotsingh-HW11610.local.log org.apache.spark.deploy.SparkSubmit --conf spark.driver.extraClassPath=:/Users/prabhjyotsingh/ps-zeppelin/interpreter/spark/*:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/lib/*::/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/classes:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/test-classes:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-zengine/target/test-classes:/Users/prabhjyotsingh/ps-zeppelin/interpreter/spark/zeppelin-spark_2.10-0.8.0-SNAPSHOT.jar --conf spark.driver.extraJavaOptions= -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///Users/prabhjyotsingh/ps-zeppelin/conf/log4j.properties -Dzeppelin.log.file=/Users/prabhjyotsingh/ps-zeppelin/logs/zeppelin-interpreter-spark-user1-spark-prabhjyotsingh-HW11610.local.log --class
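The "Host key verification failed" error above comes from the ssh step Zeppelin uses to launch the interpreter as the impersonated user: the account running Zeppelin has no entry for the target host in its known_hosts, so ssh stops at the interactive prompt. A minimal setup sketch for the user running Zeppelin follows; the hostname `localhost` is an illustrative assumption (use whatever host the interpreter is launched on), and `user1` is the impersonated user from the thread:

```shell
# Sketch, assuming the interpreter is launched over ssh to localhost.
# Record the target host's key so ssh never prompts interactively:
ssh-keyscan -H localhost >> ~/.ssh/known_hosts

# Set up password-less login to the impersonated account:
ssh-copy-id user1@localhost
```

Alternatively, `StrictHostKeyChecking no` in the Zeppelin user's ~/.ssh/config suppresses the check entirely, but that also disables protection against host spoofing, so pre-populating known_hosts is the safer choice.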
Modularization of notebooks
How can I modularize a big notebook? Is it possible to import other notebooks? Can Zeppelin interoperate with a regular SBT-scala project? http://stackoverflow.com/questions/43796688/zeppelin-load-full-project-external-files
Re: NullPointerException at org.apache.zeppelin.spark.Utils.buildJobGroupId
It is fixed here: https://github.com/apache/zeppelin/pull/2334

Ruslan Dautkhanov wrote on Wed, May 10, 2017 at 12:46 PM:
> Has anyone experienced the below exception?
> It started happening inconsistently after upgrading to last week's master snapshot of Zeppelin.
> Multiple users have reported the same issue.
>
> java.lang.NullPointerException
>   at org.apache.zeppelin.spark.Utils.buildJobGroupId(Utils.java:112)
>   at org.apache.zeppelin.spark.SparkZeppelinContext.showData(SparkZeppelinContext.java:100)
>   at org.apache.zeppelin.spark.SparkSqlInterpreter.interpret(SparkSqlInterpreter.java:129)
>   at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:101)
>   at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:500)
>   at org.apache.zeppelin.scheduler.Job.run(Job.java:181)
>   at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:262)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
>   at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>   at java.lang.Thread.run(Thread.java:745)
>
> Thanks,
> Ruslan
NullPointerException at org.apache.zeppelin.spark.Utils.buildJobGroupId
Has anyone experienced the below exception? It started happening inconsistently after upgrading to last week's master snapshot of Zeppelin. Multiple users have reported the same issue.

java.lang.NullPointerException
  at org.apache.zeppelin.spark.Utils.buildJobGroupId(Utils.java:112)
  at org.apache.zeppelin.spark.SparkZeppelinContext.showData(SparkZeppelinContext.java:100)
  at org.apache.zeppelin.spark.SparkSqlInterpreter.interpret(SparkSqlInterpreter.java:129)
  at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:101)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:500)
  at org.apache.zeppelin.scheduler.Job.run(Job.java:181)
  at org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)

Thanks,
Ruslan
Re: User Impersonation Configuration
Hi Prabhjyot,
thanks for your reply.

I am using Zeppelin 0.7.0.
When I do not specify the impersonation config in zeppelin-env.sh and only in the interpreter setting, it throws the following exception:

ERROR [2017-05-10 17:26:30,551] ({pool-2-thread-3} Job.java[run]:188) - Job failed
org.apache.zeppelin.interpreter.InterpreterException: Host key verification failed.
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:143)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.reference(RemoteInterpreterProcess.java:73)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:258)
  at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:423)
  at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:106)
  at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:387)
  at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
  at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:473)
  at java.util.concurrent.FutureTask.run(FutureTask.java:262)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
  at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
  at java.lang.Thread.run(Thread.java:745)

I am running Zeppelin as the root user. The root user doesn't have password-less SSH set up, whereas the end web user *user1* does.

How should I proceed now?

Thanks,
Yeshwanth Jagini

On Wed, May 10, 2017 at 1:45 AM, Prabhjyot Singh wrote:
> Hi Yeshwant,
>
> Which version of Zeppelin are you on?
>
> If you are on the latest, then you don't need to set ZEPPELIN_IMPERSONATE_CMD or ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER. Just enabling the User Impersonation check-box should be sufficient.
>
> Can you confirm with `ps aux | grep spark`? This is what I see on my machine:
>
> prabhjyotsingh@MACHINE:~/ps-zeppelin/logs$ ps aux | grep spark
> prabhjyotsingh 2496 0.2 3.9 5179540 657660 s000 S 12:08PM 0:30.68 /Library/Java/JavaVirtualMachines/jdk1.8.0_102.jdk/Contents/Home/bin/java -cp /Users/prabhjyotsingh/ps-zeppelin/interpreter/spark/*:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/lib/*:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/classes/:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/test-classes/:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-zengine/target/test-classes/:/Users/prabhjyotsingh/ps-zeppelin/interpreter/spark/zeppelin-spark_2.10-0.8.0-SNAPSHOT.jar:/Users/prabhjyotsingh/spark-2.0.0-bin-hadoop2.7/conf/:/Users/prabhjyotsingh/spark-2.0.0-bin-hadoop2.7/jars/* -Xmx1g -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///Users/prabhjyotsingh/ps-zeppelin/conf/log4j.properties -Dzeppelin.log.file=/Users/prabhjyotsingh/ps-zeppelin/logs/zeppelin-interpreter-spark-user1-spark-prabhjyotsingh-HW11610.local.log org.apache.spark.deploy.SparkSubmit --conf spark.driver.extraClassPath=:/Users/prabhjyotsingh/ps-zeppelin/interpreter/spark/*:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/lib/*::/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/classes:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-interpreter/target/test-classes:/Users/prabhjyotsingh/ps-zeppelin/zeppelin-zengine/target/test-classes:/Users/prabhjyotsingh/ps-zeppelin/interpreter/spark/zeppelin-spark_2.10-0.8.0-SNAPSHOT.jar --conf spark.driver.extraJavaOptions= -Dfile.encoding=UTF-8 -Dlog4j.configuration=file:///Users/prabhjyotsingh/ps-zeppelin/conf/log4j.properties -Dzeppelin.log.file=/Users/prabhjyotsingh/ps-zeppelin/logs/zeppelin-interpreter-spark-user1-spark-prabhjyotsingh-HW11610.local.log --class org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer *--proxy-user user1* /Users/prabhjyotsingh/ps-zeppelin/interpreter/spark/zeppelin-spark_2.10-0.8.0-SNAPSHOT.jar 50911
> prabhjyotsingh 2508 0.0 0.0 2445100 860 s000 S+ 12:08PM 0:00.00 grep spark
> prabhjyotsingh 2495 0.0 0.0 2465144 764 s000 S 12:08PM 0:00.00 /bin/bash /Users/prabhjyotsingh/ps-zeppelin/bin/interpreter.sh -d /Users/prabhjyotsingh/ps-zeppelin/interpreter/spark -p 50911 -u user1 -l /Users/prabhjyotsingh/ps-zeppelin/local-repo/2CEZC4JXN -g spark
> prabhjyotsingh 2484 0.0 0.0 2465144 1368 s000 S 12:08PM 0:00.01 /bin/bash /Users/prabhjyotsingh/ps-zeppelin/bin/interpreter.sh -d /Users/prabhjyotsingh/ps-zeppelin/interpreter/spark -p 50911 -u user1 -l
Re: Spark-CSV - Zeppelin tries to read CSV locally in Standalone mode
I've put the CSV on the worker node, since the job runs on the worker. I didn't put the CSV on the master because I believe it doesn't run jobs.

If I put the CSV on the Zeppelin node at the same path as on the worker, it reads the CSV and writes a _SUCCESS file locally. The job runs on the worker too but doesn't terminate; the result is saved under a _temporary directory on the worker.

worker - ls -laRt /data/02.csv/

02.csv/:
total 0
drwxr-xr-x. 3 root root 24 Apr 28 09:55 .
drwxr-xr-x. 3 root root 15 Apr 28 09:55 _temporary
drwxr-xr-x. 3 root root 64 Apr 28 09:55 ..

02.csv/_temporary:
total 0
drwxr-xr-x. 5 root root 106 Apr 28 09:56 0
drwxr-xr-x. 3 root root 15 Apr 28 09:55 .
drwxr-xr-x. 3 root root 24 Apr 28 09:55 ..

02.csv/_temporary/0:
total 0
drwxr-xr-x. 5 root root 106 Apr 28 09:56 .
drwxr-xr-x. 2 root root 6 Apr 28 09:56 _temporary
drwxr-xr-x. 2 root root 129 Apr 28 09:56 task_20170428095632_0005_m_00
drwxr-xr-x. 2 root root 129 Apr 28 09:55 task_20170428095516_0002_m_00
drwxr-xr-x. 3 root root 15 Apr 28 09:55 ..

02.csv/_temporary/0/_temporary:
total 0
drwxr-xr-x. 2 root root 6 Apr 28 09:56 .
drwxr-xr-x. 5 root root 106 Apr 28 09:56 ..

02.csv/_temporary/0/task_20170428095632_0005_m_00:
total 52
drwxr-xr-x. 5 root root 106 Apr 28 09:56 ..
-rw-r--r--. 1 root root 376 Apr 28 09:56 .part-0-e39ebc76-5343-407e-b42e-c33e69b8fd1a.csv.crc
-rw-r--r--. 1 root root 46605 Apr 28 09:56 part-0-e39ebc76-5343-407e-b42e-c33e69b8fd1a.csv
drwxr-xr-x. 2 root root 129 Apr 28 09:56 .

02.csv/_temporary/0/task_20170428095516_0002_m_00:
total 52
drwxr-xr-x. 5 root root 106 Apr 28 09:56 ..
-rw-r--r--. 1 root root 376 Apr 28 09:55 .part-0-c2ac5299-26f6-4b23-a74b-b3dc96464271.csv.crc
-rw-r--r--. 1 root root 46605 Apr 28 09:55 part-0-c2ac5299-26f6-4b23-a74b-b3dc96464271.csv

zeppelin - ls -laRt 02.csv/

02.csv/:
total 12
drwxr-sr-x 2 root 1700 4096 Apr 28 09:56 .
-rw-r--r-- 1 root 1700 8 Apr 28 09:56 ._SUCCESS.crc
-rw-r--r-- 1 root 1700 0 Apr 28 09:56 _SUCCESS
drwxrwsr-x 5 root 1700 4096 Apr 28 09:56 ..

On Wed, May 10, 2017 at 14:06, Meethu Mathew wrote:
> Try putting the CSV at the same path on all the nodes, or on a mount point path which is accessible by all the nodes.
>
> Regards,
> Meethu Mathew
>
> On Wed, May 10, 2017 at 3:36 PM, Sofiane Cherchalli wrote:
>> Yes, I already tested with spark-shell and pyspark, with the same result.
>>
>> Can't I use the Linux filesystem to read the CSV, such as file:///data/file.csv? My understanding is that the job is sent and interpreted on the worker, isn't it?
>>
>> Thanks.
>>
>> On Tue, May 9, 2017 at 20:23, Jongyoul Lee wrote:
>>> Could you test if it works with spark-shell?
>>>
>>> On Sun, May 7, 2017 at 5:22 PM, Sofiane Cherchalli wrote:
>>>> Hi,
>>>>
>>>> I have a standalone cluster, one master and one worker, running on separate nodes. Zeppelin is running on a separate node too, in client mode. When I run a notebook that reads a CSV file located on the worker node with the Spark-CSV package, Zeppelin tries to read the CSV locally and fails because the CSV is on the worker node and not on the Zeppelin node.
>>>>
>>>> Is this the expected behavior?
>>>>
>>>> Thanks.
>>>
>>> --
>>> 이종열, Jongyoul Lee, 李宗烈
>>> http://madeng.net
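The behavior in this thread is expected with plain local paths: every Spark component resolves file:/// against its own filesystem, so the driver (on the Zeppelin node, in client mode) and the executors each need the file at the same local path. One way to follow the "same path on all nodes" advice from the thread is shared storage such as HDFS; the sketch below is illustrative only — the local path and HDFS layout are assumptions, not taken from the thread:

```shell
# Sketch: put the CSV on HDFS so the driver and every executor
# read and write the same data, instead of node-local paths.
hdfs dfs -mkdir -p /data
hdfs dfs -put /local/path/02.csv /data/02.csv
# In the notebook, read hdfs:///data/02.csv instead of file:///data/02.csv
```

An NFS mount at an identical path on all nodes achieves the same effect without HDFS, but shared distributed storage also avoids the half-written _temporary output seen above, since all task attempts commit to one filesystem.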
Re: Hive Reserve Keyword support
Right, the backticks worked.

On Wed, May 10, 2017 at 8:51 AM, Felix Cheung wrote:
> I think you can put backticks around the name date:
>
> https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL
>
> *From:* Jongyoul Lee
> *Sent:* Tuesday, May 9, 2017 10:33:50 AM
> *To:* users@zeppelin.apache.org
> *Subject:* Re: Hive Reserve Keyword support
>
> If it's possible for you to pass that property when you create a connection, you can pass it by setting it in the interpreter setting.
>
> On Sat, Apr 29, 2017 at 4:25 PM, Dibyendu Bhattacharya <dibyendu.bhattach...@gmail.com> wrote:
>> Hi,
>>
>> I have a Hive table which has a column named date. When I tried to query it using the Zeppelin %jdbc interpreter, I got the below error:
>>
>> Error while compiling statement: FAILED: ParseException line 1:312 Failed to recognize predicate 'date'. Failed rule: 'identifier' in expression specification
>> class org.apache.hive.service.cli.HiveSQLException
>>   org.apache.hive.jdbc.Utils.verifySuccess(Utils.java:231)
>>   org.apache.hive.jdbc.Utils.verifySuccessWithInfo(Utils.java:217)
>>   org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:254)
>>   org.apache.zeppelin.jdbc.JDBCInterpreter.executeSql(JDBCInterpreter.java:322)
>>   org.apache.zeppelin.jdbc.JDBCInterpreter.interpret(JDBCInterpreter.java:408)
>>   org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:94)
>>   org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:341)
>>   org.apache.zeppelin.scheduler.Job.run(Job.java:176)
>>   org.apache.zeppelin.scheduler.ParallelScheduler$JobRunner.run(ParallelScheduler.java:162)
>>
>> My query looks like this:
>>
>> select x,y,z from mytable where date = '2017-04-28'
>>
>> I believe it is failing because date is a reserved keyword. Is there any way I can set hive.support.sql11.reserved.keywords=false in Zeppelin?
>>
>> regards,
>> Dibyendu
>
> --
> 이종열, Jongyoul Lee, 李宗烈
> http://madeng.net
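The backtick fix from this thread can be tried without a Hive cluster. The sketch below uses Python's built-in sqlite3 purely for illustration, since it ships with Python and accepts the same MySQL-style backtick quoting as Hive; the table contents are made up, not from the thread:

```python
import sqlite3

# A column named after a keyword is backtick-quoted in the statement,
# just as in the Hive query from the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (x INT, y INT, z INT, `date` TEXT)")
conn.execute("INSERT INTO mytable VALUES (1, 2, 3, '2017-04-28')")
rows = conn.execute(
    "SELECT x, y, z FROM mytable WHERE `date` = '2017-04-28'"
).fetchall()
print(rows)  # [(1, 2, 3)]
```

Quoting the identifier is generally preferable to flipping hive.support.sql11.reserved.keywords, since the query then keeps working when that flag is removed or locked down server-side.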