[jira] [Created] (ZEPPELIN-3829) Allow to specify a custom interpreter running directory
Jhon Cardenas created ZEPPELIN-3829:
---
Summary: Allow to specify a custom interpreter running directory
Key: ZEPPELIN-3829
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3829
Project: Zeppelin
Issue Type: Improvement
Components: Interpreters
Affects Versions: 0.8.0
Reporter: Jhon Cardenas
Fix For: 0.9.0

It would be very useful to be able to specify in Zeppelin the directory where the interpreters run. One use case is the generation of large core dumps. With this feature, the user could change the default execution directory (currently $ZEPPELIN_HOME) to another folder and have the generated core dumps written there.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
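To illustrate the request, here is a minimal Python sketch of launching a child process in a configurable working directory. It is not Zeppelin code: the environment variable name INTERPRETER_RUN_DIR and both helper functions are hypothetical, chosen only for this example. The relevance to core dumps rests on the assumption that the system writes them relative to the process working directory, which depends on the host configuration.

```python
import os
import subprocess
import sys

# Hypothetical setting name, used only for this illustration.
RUN_DIR_ENV = "INTERPRETER_RUN_DIR"


def launch_in_run_dir(cmd, default_dir):
    """Run cmd with its working directory taken from RUN_DIR_ENV, else default_dir."""
    run_dir = os.environ.get(RUN_DIR_ENV) or default_dir
    os.makedirs(run_dir, exist_ok=True)
    # cwd= sets the child's working directory, so files the child writes
    # with relative paths land in run_dir.
    return subprocess.run(cmd, cwd=run_dir, capture_output=True, text=True, check=True)


def child_cwd(default_dir):
    """Return the working directory that a child Python process reports."""
    probe = [sys.executable, "-c", "import os; print(os.getcwd())"]
    return launch_in_run_dir(probe, default_dir).stdout.strip()
```

With the variable unset the child runs in the default directory (the role $ZEPPELIN_HOME plays today); setting it redirects the child without changing the launcher's own directory.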
[jira] [Created] (ZEPPELIN-3763) Broken link to Writing a new Visualization documentation
Jhon Cardenas created ZEPPELIN-3763:
---
Summary: Broken link to Writing a new Visualization documentation
Key: ZEPPELIN-3763
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3763
Project: Zeppelin
Issue Type: Bug
Components: helium
Affects Versions: 0.8.0
Reporter: Jhon Cardenas
Fix For: 0.8.1

Steps to reproduce:
1. Go to the Helium page.
2. Find a VISUALIZATION plugin in the list and click "Enable".
3. Click the VISUALIZATION link in the Type field.
4. You are redirected to [a broken link|https://zeppelin.apache.org/docs/0.8.0/development/helium/writing_visualization.html].

Expected behavior: You should be redirected to the correct [Writing a new Visualization link|https://zeppelin.apache.org/docs/0.8.0/development/helium/writing_visualization_basic.html].
[jira] [Created] (ZEPPELIN-3749) New Spark interpreter has to be restarted two times in order to work fine for different users
Jhon Cardenas created ZEPPELIN-3749:
---
Summary: New Spark interpreter has to be restarted two times in order to work fine for different users
Key: ZEPPELIN-3749
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3749
Project: Zeppelin
Issue Type: Bug
Components: Interpreters
Affects Versions: 0.8.0, 0.8.1
Environment: *Spark interpreter property to reproduce:* zeppelin.spark.useNew -> true
*Spark interpreter instantiation mode*: per user - scoped
*Zeppelin version:* branch-0.8 (until July 23)
Reporter: Jhon Cardenas
Attachments: first_error.txt, second_error.txt

The new Spark interpreter has to be restarted twice before it works for a different user. To reproduce this, configure Zeppelin to use the new interpreter:

zeppelin.spark.useNew -> true

and set the instantiation mode:

per user - scoped

*Steps to reproduce:*
1. User A logs in to Zeppelin and runs a Spark paragraph. It should work fine.
2. User B logs in to Zeppelin and runs a Spark paragraph, for example:
{code:java}
%spark
println(sc.version)
println(scala.util.Properties.versionString)
{code}
3. This error appears (see the entire log trace in [^first_error.txt]):
{{java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext. This stopped SparkContext was created at: .}}
4. User B restarts the Spark interpreter from the notebook page and then runs a paragraph that launches a job, for example:
{code:java}
import sqlContext.implicits._
import org.apache.commons.io.IOUtils
import java.net.URL
import java.nio.charset.Charset

// Zeppelin creates and injects sc (SparkContext) and sqlContext (HiveContext or SqlContext)
// So you don't need create them manually

// load bank data
val bankText = sc.parallelize(
    IOUtils.toString(
        new URL("https://s3.amazonaws.com/apache-zeppelin/tutorial/bank/bank.csv"),
        Charset.forName("utf8")).split("\n"))

sc.parallelize(1 to 100).foreach(n => print((java.lang.Math.random() * 100) + n))

case class Bank(age: Integer, job: String, marital: String, education: String, balance: Integer)

val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
    s => Bank(s(0).toInt,
            s(1).replaceAll("\"", ""),
            s(2).replaceAll("\"", ""),
            s(3).replaceAll("\"", ""),
            s(5).replaceAll("\"", "").toInt
        )
).toDF()
bank.registerTempTable("bank")
{code}
5. This error appears (see the entire log trace in [^second_error.txt]):
{{org.apache.spark.SparkException: Job aborted due to stage failure: Task 6 in stage 0.0 failed 4 times, most recent failure: Lost task 6.3 in stage 0.0 (TID 36, 100.96.85.172, executor 2): java.lang.ClassNotFoundException: $anonfun$1
 at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:82)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:348)
 at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67) .}}
6. User B restarts the Spark interpreter from the notebook page again, and now it works.

*Actual Behavior:*
User B has to restart the Spark interpreter twice before it works.

*Expected Behavior:*
Spark should work for other users without any restart.
[jira] [Created] (ZEPPELIN-3641) Impersonation for spark without native proxy user can potentially fail
Jhon Cardenas created ZEPPELIN-3641:
---
Summary: Impersonation for spark without native proxy user can potentially fail
Key: ZEPPELIN-3641
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3641
Project: Zeppelin
Issue Type: Bug
Components: zeppelin-interpreter
Affects Versions: 0.8.0
Reporter: Jhon Cardenas

Impersonation for Spark without the native proxy user can potentially fail. When impersonation is on for Spark without the native proxy user (ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false) and the Spark interpreter has a property whose value contains a blank space, Spark paragraphs fail.

How to reproduce:
# Turn on impersonation for Spark.
# Disable the native proxy user (ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false).
# Specify the impersonation command: ZEPPELIN_IMPERSONATE_CMD='sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c ' (the same error should happen with the default command).
# Add a property whose value contains at least one blank space, for example:
||Name||Value||
|spark.executor.extraJavaOptions|-Dpro1=val1 -Dprop1=val2|
# Run a Spark paragraph.
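The report does not show how Zeppelin assembles the command passed to `bash -c`, so the following Python sketch only illustrates the general failure mode under an assumption: if a property value containing a space is concatenated unquoted into a command string that a shell parses again, the value is split into separate words. The function `build_inner_command` and the `spark-submit --conf` layout are illustrative, not taken from Zeppelin's launcher script.

```python
import shlex


def build_inner_command(launcher, props, quote):
    """Join the launcher and its --conf arguments into one string for `bash -c`."""
    parts = [launcher]
    for name, value in props.items():
        arg = f"{name}={value}"
        # shlex.quote wraps the argument in single quotes when it contains
        # characters the shell would otherwise treat specially (such as spaces).
        parts += ["--conf", shlex.quote(arg) if quote else arg]
    return " ".join(parts)


props = {"spark.executor.extraJavaOptions": "-Dpro1=val1 -Dprop1=val2"}

# shlex.split approximates how a POSIX shell splits the string into words.
unquoted = shlex.split(build_inner_command("spark-submit", props, quote=False))
quoted = shlex.split(build_inner_command("spark-submit", props, quote=True))

print(unquoted)  # the second -D option is detached from its --conf argument
print(quoted)    # the whole property stays one argument
```

In the unquoted case the launcher would see `-Dprop1=val2` as a stray argument of its own, which is consistent with the symptom described above; quoting each argument before joining keeps the property value intact.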
[jira] [Created] (ZEPPELIN-3542) Stop a pyspark paragraph cancels all the pyspark jobs of other users
Jhon Cardenas created ZEPPELIN-3542:
---
Summary: Stop a pyspark paragraph cancels all the pyspark jobs of other users
Key: ZEPPELIN-3542
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3542
Project: Zeppelin
Issue Type: Bug
Components: pySpark
Affects Versions: 0.8.0
Environment: This happens when the Spark context is shared between users (scoped).
Reporter: Jhon Cardenas

Stopping a pyspark paragraph cancels all the pyspark jobs of other users: the Cancel button in a pyspark paragraph cancels Spark jobs for all users. This happens when the Spark context is shared between users (scoped).

It seems to be related to [the solution|https://github.com/apache/zeppelin/commit/9f22db91c279b7daf6a13b2d805a874074b070fd] for the task [ZEPPELIN-2075|https://issues.apache.org/jira/browse/ZEPPELIN-2075]. With this solution, when one user cancels their pyspark job, the pyspark jobs of all users are cancelled.

When a pyspark job is cancelled, the method PySparkInterpreter interrupt() is invoked and SIGINT is raised. The SIGINT handler then cancels all the jobs in the same Spark context:

context.py:

    # create a signal handler which would be invoked on receiving SIGINT
    def signal_handler(signal, frame):
        self.cancelAllJobs()
        raise KeyboardInterrupt()
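The difference between cancelling everything and cancelling one user's work can be shown with a toy model in Python. `ToyContext` below is a hypothetical stand-in and not the PySpark API; it only mimics the idea behind `SparkContext.cancelAllJobs()` versus `SparkContext.cancelJobGroup(groupId)`. Using one job group per user is an assumption of this sketch, not the fix proposed in the issue.

```python
class ToyContext:
    """Stand-in for a Spark context shared by several users (NOT the PySpark API)."""

    def __init__(self):
        self.running = {}  # job id -> group id (here: the owning user)

    def submit(self, job_id, group_id):
        """Register a running job under the given group."""
        self.running[job_id] = group_id

    def cancel_all_jobs(self):
        """Analogue of cancelAllJobs(): drops every job, whoever owns it."""
        cancelled = sorted(self.running)
        self.running.clear()
        return cancelled

    def cancel_job_group(self, group_id):
        """Analogue of cancelJobGroup(): drops only the jobs in one group."""
        cancelled = sorted(j for j, g in self.running.items() if g == group_id)
        for job_id in cancelled:
            del self.running[job_id]
        return cancelled


ctx = ToyContext()
ctx.submit("a1", "userA")
ctx.submit("b1", "userB")

# Cancelling by group leaves user B's job untouched.
print(ctx.cancel_job_group("userA"))
print(ctx.running)
```

A handler that calls the equivalent of `cancel_all_jobs()` has no notion of who pressed Cancel, which is the behavior described in this report.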
[jira] [Created] (ZEPPELIN-3228) Interpreter dependencies are not downloaded on zeppelin start - regression issue
Jhon Cardenas created ZEPPELIN-3228:
---
Summary: Interpreter dependencies are not downloaded on zeppelin start - regression issue
Key: ZEPPELIN-3228
URL: https://issues.apache.org/jira/browse/ZEPPELIN-3228
Project: Zeppelin
Issue Type: Bug
Components: zeppelin-interpreter
Affects Versions: 0.8.0
Reporter: Jhon Cardenas
Fix For: 0.8.0

Same issue as ZEPPELIN-1143. It seems to have been caused by code refactoring. When Zeppelin is started or restarted, the server should try to download the interpreter dependencies.