[jira] [Created] (ZEPPELIN-3829) Allow to specify a custom interpreter running directory

2018-10-26 Thread Jhon Cardenas (JIRA)
Jhon Cardenas created ZEPPELIN-3829:
---

 Summary: Allow to specify a custom interpreter running directory
 Key: ZEPPELIN-3829
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3829
 Project: Zeppelin
  Issue Type: Improvement
  Components: Interpreters
Affects Versions: 0.8.0
Reporter: Jhon Cardenas
 Fix For: 0.9.0


It would be very useful to be able to specify in Zeppelin the directory where the 
interpreters run.
A use case for this is the generation of large core dumps. With this feature the 
user could change the default execution directory (currently $ZEPPELIN_HOME) 
to another folder, and have the generated core dumps written there.
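
To illustrate the request above, here is a minimal sketch of starting a child process in a chosen working directory. It is not how Zeppelin launches interpreters; the helper `run_in_directory` and the temporary directory are hypothetical stand-ins. On many Linux configurations a core dump is written relative to the crashing process's current directory, which is why the working directory matters for this use case.

```python
import os
import subprocess
import sys
import tempfile


def run_in_directory(command, working_dir):
    """Run `command` with `working_dir` as its current directory; return stdout."""
    result = subprocess.run(
        command,
        cwd=working_dir,      # the child starts here instead of inheriting our cwd
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Hypothetical stand-in for a custom interpreter running directory.
custom_dir = tempfile.mkdtemp(prefix="interpreter-run-")

# The child process reports its own working directory.
child_cwd = run_in_directory(
    [sys.executable, "-c", "import os; print(os.getcwd())"],
    custom_dir,
)
```

The child sees `custom_dir` as its current directory, while the parent's working directory is unchanged.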



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (ZEPPELIN-3763) Broken link to Writing a new Visualization documentation

2018-09-03 Thread Jhon Cardenas (JIRA)
Jhon Cardenas created ZEPPELIN-3763:
---

 Summary: Broken link to Writing a new Visualization documentation
 Key: ZEPPELIN-3763
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3763
 Project: Zeppelin
  Issue Type: Bug
  Components: helium
Affects Versions: 0.8.0
Reporter: Jhon Cardenas
 Fix For: 0.8.1


Steps to reproduce:
1. Go to Helium page.
2. Look for a VISUALIZATION plugin on the list, and click on "Enable".
3. Click on the VISUALIZATION link, located in the Type field.
4. You will be redirected to [a broken 
link|https://zeppelin.apache.org/docs/0.8.0/development/helium/writing_visualization.html].

Expected behavior: You should be redirected to the right [Writing a new 
Visualization 
link|https://zeppelin.apache.org/docs/0.8.0/development/helium/writing_visualization_basic.html].





[jira] [Created] (ZEPPELIN-3749) New Spark interpreter has to be restarted two times in order to work fine for different users

2018-08-29 Thread Jhon Cardenas (JIRA)
Jhon Cardenas created ZEPPELIN-3749:
---

 Summary: New Spark interpreter has to be restarted two times in 
order to work fine for different users
 Key: ZEPPELIN-3749
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3749
 Project: Zeppelin
  Issue Type: Bug
  Components: Interpreters
Affects Versions: 0.8.0, 0.8.1
 Environment: *Spark interpreter property to reproduce:*

zeppelin.spark.useNew -> true

*Spark interpreter instantiation mode*:

per user - scoped

*Zeppelin version:*

branch-0.8 (Until july 23)
Reporter: Jhon Cardenas
 Attachments: first_error.txt, second_error.txt

New Spark interpreter has to be restarted two times in order to work fine for 
different users.

To reproduce this you have to configure zeppelin to use the new interpreter:
zeppelin.spark.useNew -> true

And the instantiation mode: per user - scoped

*Steps to reproduce:*
1. User A logs in to Zeppelin and runs a Spark paragraph. It should work fine.
2. User B logs in to Zeppelin and runs a Spark paragraph, for example
{code:java}
%spark
println(sc.version)
println(scala.util.Properties.versionString)
{code}
3. This error appears (see the entire log trace in [^first_error.txt]):
{{java.lang.IllegalStateException: Cannot call methods on a stopped 
SparkContext. This stopped SparkContext was created at: .}}
4. User B restarts the Spark interpreter from the notebook page and then runs 
a paragraph that launches a job, for example:

{code:java}
import sqlContext.implicits._
import org.apache.commons.io.IOUtils
import java.net.URL
import java.nio.charset.Charset

// Zeppelin creates and injects sc (SparkContext) and sqlContext (HiveContext
// or SqlContext), so you don't need to create them manually.

// load bank data
val bankText = sc.parallelize(
    IOUtils.toString(
        new URL("https://s3.amazonaws.com/apache-zeppelin/tutorial/bank/bank.csv"),
        Charset.forName("utf8")).split("\n"))

sc.parallelize(1 to 100).foreach(n => print((java.lang.Math.random() * 100) + n))

case class Bank(age: Integer, job: String, marital: String, education: String,
    balance: Integer)

val bank = bankText.map(s => s.split(";")).filter(s => s(0) != "\"age\"").map(
    s => Bank(s(0).toInt, 
    s(1).replaceAll("\"", ""),
    s(2).replaceAll("\"", ""),
    s(3).replaceAll("\"", ""),
    s(5).replaceAll("\"", "").toInt
    )
).toDF()
bank.registerTempTable("bank")
{code}
5. This error appears (see the entire log trace in [^second_error.txt]):
{{org.apache.spark.SparkException: Job aborted due to stage failure: Task 6 in 
stage 0.0 failed 4 times, most recent failure: Lost task 6.3 in stage 0.0 (TID 
36, 100.96.85.172, executor 2): java.lang.ClassNotFoundException: $anonfun$1
 at org.apache.spark.repl.ExecutorClassLoader.findClass(ExecutorClassLoader.scala:82)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
 at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
 at java.lang.Class.forName0(Native Method)
 at java.lang.Class.forName(Class.java:348)
 at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:67)
 .}}
6. User B restarts the Spark interpreter from the notebook page again, and now it works.

*Actual Behavior:*
User B has to restart the Spark interpreter twice before it works.

*Expected Behavior:*
Spark should work for other users without any restart.





[jira] [Created] (ZEPPELIN-3641) Impersonation for spark without native proxy user can potentially fail

2018-07-19 Thread Jhon Cardenas (JIRA)
Jhon Cardenas created ZEPPELIN-3641:
---

 Summary: Impersonation for spark without native proxy user can 
potentially fail
 Key: ZEPPELIN-3641
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3641
 Project: Zeppelin
  Issue Type: Bug
  Components: zeppelin-interpreter
Affects Versions: 0.8.0
Reporter: Jhon Cardenas


Impersonation for spark without native proxy user can potentially fail.
When impersonation is on for Spark without native proxy user 
(ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false) and the Spark interpreter has a 
property value containing a blank space, Spark paragraphs fail.

How to reproduce:
# Turn on impersonation for spark.
# Disable native proxy user (ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false).
# Specify the impersonation command: ZEPPELIN_IMPERSONATE_CMD='sudo -H -u 
${ZEPPELIN_IMPERSONATE_USER} bash -c ' (With the default command the same error 
should happen)
# Put a property with at least one blank space, for example:
||Name||Value||
|spark.executor.extraJavaOptions|-Dpro1=val1 -Dprop1=val2|
# Run a spark paragraph.
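
The report does not state the root cause. One plausible mechanism, shown here only as a sketch, is shell word splitting: when a property value containing a space is embedded unquoted in a command string, the shell sees two arguments. The `--conf` fragment below is a hypothetical example, not the command Zeppelin actually builds; `shlex.split` is used to emulate POSIX shell tokenization.

```python
import shlex

prop_name = "spark.executor.extraJavaOptions"
prop_value = "-Dpro1=val1 -Dprop1=val2"   # value taken from the table above

# Hypothetical command fragments (assumption: not Zeppelin's real command line).
unquoted = f"--conf {prop_name}={prop_value}"
quoted = f"--conf {shlex.quote(prop_name + '=' + prop_value)}"

# Emulate how a POSIX shell would split each fragment into arguments.
unquoted_args = shlex.split(unquoted)
quoted_args = shlex.split(quoted)

print(unquoted_args)  # the value is broken into two separate arguments
print(quoted_args)    # the value survives as a single argument
```

If this is indeed the cause, quoting each property before it is passed through `bash -c` would keep the value intact.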








[jira] [Created] (ZEPPELIN-3542) Stop a pyspark paragraph cancels all the pyspark jobs of other users

2018-06-14 Thread Jhon Cardenas (JIRA)
Jhon Cardenas created ZEPPELIN-3542:
---

 Summary: Stop a pyspark paragraph cancels all the pyspark jobs of 
other users
 Key: ZEPPELIN-3542
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3542
 Project: Zeppelin
  Issue Type: Bug
  Components: pySpark
Affects Versions: 0.8.0
 Environment: This happens when the Spark context is shared between 
users (scoped).
Reporter: Jhon Cardenas


Stopping a pyspark paragraph cancels all the pyspark jobs of other users.

The Cancel button in a pyspark paragraph cancels Spark jobs for all users.

This happens when the Spark context is shared between users (scoped).

It seems to be related to [the 
solution|https://github.com/apache/zeppelin/commit/9f22db91c279b7daf6a13b2d805a874074b070fd]
 for the task 
[ZEPPELIN-2075|https://issues.apache.org/jira/browse/ZEPPELIN-2075].

With this solution, when one user cancels their pyspark job, the pyspark jobs 
of all users are cancelled.

When a pyspark job is cancelled, the PySparkInterpreter interrupt() method is 
invoked, and then a SIGINT is raised, which causes all the jobs in the same 
Spark context to be cancelled:

context.py:

    # create a signal handler which would be invoked on receiving SIGINT
    def signal_handler(signal, frame):
        self.cancelAllJobs()
        raise KeyboardInterrupt()
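
The difference between cancelling everything and cancelling one user's jobs can be sketched with a toy model. The `SharedContext` class below is hypothetical and is not the PySpark API; it only mimics a context shared by several users. PySpark itself offers `SparkContext.setJobGroup` and `SparkContext.cancelJobGroup` for group-scoped cancellation, but whether they fit a fix in Zeppelin is an assumption here.

```python
from collections import defaultdict


class SharedContext:
    """Toy stand-in for a Spark context shared by several users (not PySpark)."""

    def __init__(self):
        self._jobs = defaultdict(set)   # group id -> names of running jobs

    def submit(self, group, job):
        self._jobs[group].add(job)

    def cancel_all_jobs(self):
        """Mirrors the reported effect: every user's jobs are cancelled."""
        cancelled = sorted(j for jobs in self._jobs.values() for j in jobs)
        self._jobs.clear()
        return cancelled

    def cancel_job_group(self, group):
        """Cancels only the jobs tagged with `group`."""
        return sorted(self._jobs.pop(group, set()))

    def running(self):
        return sorted(j for jobs in self._jobs.values() for j in jobs)


ctx = SharedContext()
ctx.submit("userA", "a-job-1")
ctx.submit("userB", "b-job-1")

# Group-scoped cancel: only user A's job is removed, user B keeps running.
cancelled_by_group = ctx.cancel_job_group("userA")
still_running = ctx.running()

# Cancel-all: jobs from both users are removed.
ctx.submit("userA", "a-job-2")
cancelled_by_all = ctx.cancel_all_jobs()
```

In this model, a handler that calls the group-scoped method would leave other users' jobs untouched, whereas the cancel-all call reproduces the behavior described in this issue.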





[jira] [Created] (ZEPPELIN-3228) Interpreter dependencies are not downloaded on zeppelin start - regression issue

2018-02-13 Thread Jhon Cardenas (JIRA)
Jhon Cardenas created ZEPPELIN-3228:
---

 Summary: Interpreter dependencies are not downloaded on zeppelin 
start - regression issue
 Key: ZEPPELIN-3228
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3228
 Project: Zeppelin
  Issue Type: Bug
  Components: zeppelin-interpreter
Affects Versions: 0.8.0
Reporter: Jhon Cardenas
 Fix For: 0.8.0


Same issue as ZEPPELIN-1143. It seems to have been caused by code 
refactoring.

When Zeppelin is started or restarted, the server should try to download the 
interpreter dependencies.


