[jira] [Created] (ZEPPELIN-3770) zeppelin.spark.uiWebUrl is ignored.

2018-09-07 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3770:


 Summary: zeppelin.spark.uiWebUrl is ignored.
 Key: ZEPPELIN-3770
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3770
 Project: Zeppelin
  Issue Type: Bug
Affects Versions: 0.8.0
 Environment: Spark 2.3.1 (Standalone Cluster)

Zeppelin 0.8
Reporter: Mathew


Since Zeppelin 0.8, `zeppelin.spark.uiWebUrl` seems to be ignored.

Even if you set `zeppelin.spark.uiWebUrl` to an arbitrary value such as 
`http://example.com/endpoint`, clicking "spark ui" in the interpreter settings, or 
the new "spark jobs" button in a Spark paragraph, still takes you to the host 
address of the Spark driver.





[jira] [Created] (ZEPPELIN-3762) Notebook folders with number names are displayed out of order.

2018-09-03 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3762:


 Summary: Notebook folders with number names are displayed out of 
order.
 Key: ZEPPELIN-3762
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3762
 Project: Zeppelin
  Issue Type: Bug
Affects Versions: 0.8.0
 Environment: Zeppelin 0.8
Reporter: Mathew
 Attachments: image-2018-09-04-10-24-49-807.png

If you name a notebook folder using only numeric characters, e.g. "12345", it is 
displayed in the wrong order.

 

`/111/LAYER_2/333/LAYER_3` is displayed as: !image-2018-09-04-10-24-49-807.png!
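A guess at the cause (not verified against the Zeppelin source): folder names appear 
to be compared as plain strings, so purely numeric names sort lexicographically 
rather than numerically. A small Python illustration:

{code:python}
# Illustrative only: lexicographic vs. numeric ordering of folder names.
names = ["111", "23", "4", "LAYER_2"]

print(sorted(names))
# ['111', '23', '4', 'LAYER_2']   <- plain string sort puts "111" before "23"

print(sorted(names, key=lambda n: (not n.isdigit(), int(n) if n.isdigit() else 0, n)))
# ['4', '23', '111', 'LAYER_2']   <- numeric names first, in numeric order
{code}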





[jira] [Created] (ZEPPELIN-3759) Spark SQL Blocks incorrectly displays strings with leading 0's

2018-09-03 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3759:


 Summary: Spark SQL Blocks incorrectly displays strings with 
leading 0's
 Key: ZEPPELIN-3759
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3759
 Project: Zeppelin
  Issue Type: Bug
Affects Versions: 0.8.0
 Environment: Zeppelin 0.8

Spark 2.2.2
Reporter: Mathew
 Attachments: spark_sql.PNG

When a Spark DataFrame has a string column whose values have leading 0's, those 
zeros are not displayed in %spark.sql paragraphs.

 

*As displayed in this image:*

!spark_sql.PNG!
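A minimal reproduction sketch (the table and column names below are made up for 
illustration):

{code:python}
%spark.pyspark
# String column whose values have leading zeros: df.show() keeps them,
# but the %spark.sql table display reportedly drops them.
df = spark.createDataFrame([("007",), ("042",)], ["code"])
df.show()
df.createOrReplaceTempView("leading_zeros")
# Then, in a %sql paragraph:  SELECT code FROM leading_zeros
{code}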

 





[jira] [Created] (ZEPPELIN-3721) Documentation misnames PYSPARK_PYTHON as PYSPARKPYTHON

2018-08-15 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3721:


 Summary: Documentation misnames PYSPARK_PYTHON as PYSPARKPYTHON
 Key: ZEPPELIN-3721
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3721
 Project: Zeppelin
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Mathew


On the documentation page 
[https://zeppelin.apache.org/docs/0.8.0/interpreter/spark.html], we misspecify 
the following configs:
 * PYSPARK_PYTHON
 * PYSPARK_DRIVER_PYTHON

It seems that the underscores are confusing the conversion to HTML, which treats 
them as italics markers, so the names render as PYSPARKPYTHON and 
PYSPARKDRIVERPYTHON.





[jira] [Created] (ZEPPELIN-3720) PYSPARK_PYTHON / PYSPARK_DRIVER_PYTHON ignored.

2018-08-15 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3720:


 Summary: PYSPARK_PYTHON / PYSPARK_DRIVER_PYTHON ignored.
 Key: ZEPPELIN-3720
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3720
 Project: Zeppelin
  Issue Type: Bug
Affects Versions: 0.8.0
 Environment: Spark 2.2.2 (Cloudera)

Zeppelin 0.8.0

RHEL 7.4
Reporter: Mathew


I am not 100% sure whether the issue is on my end, but no matter which combination 
of the following Spark interpreter configs I use, the PYSPARK_PYTHON set in my 
spark-env.sh takes precedence:
 * PYSPARK_PYTHON
 * PYSPARK_DRIVER_PYTHON
 * spark.pyspark.python
 * spark.pyspark.driver.python
 * spark.yarn.appMasterEnv.PYSPARK_PYTHON
 * spark.yarn.appMasterEnv.PYSPARK_DRIVER_PYTHON

 

Note that this only happens in yarn-client mode; in yarn-cluster mode, whatever is 
in spark-env.sh is ignored.

When I use spark-submit directly, it works fine and picks up my custom Python 
executable path. A quick check is sketched below.
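A quick way to see which Python the driver and the executors actually picked up 
(a plain PySpark sketch, nothing Zeppelin-specific):

{code:python}
%spark.pyspark
import sys

# Python binary used by the driver process
print("driver:   ", sys.executable)

# Python binary used by an executor (runs one tiny job to find out)
def executor_python(_):
    import sys
    return sys.executable

print("executor: ", sc.parallelize([0], 1).map(executor_python).first())
{code}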

 





[jira] [Created] (ZEPPELIN-3579) z.load( fails for yarn-cluster.

2018-07-02 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3579:


 Summary: z.load( fails for yarn-cluster.
 Key: ZEPPELIN-3579
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3579
 Project: Zeppelin
  Issue Type: Bug
Reporter: Mathew








[jira] [Created] (ZEPPELIN-3578) Log spam from interpreter % declaration

2018-07-02 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3578:


 Summary: Log spam from interpreter % declaration 
 Key: ZEPPELIN-3578
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3578
 Project: Zeppelin
  Issue Type: Bug
Affects Versions: 0.8.0
Reporter: Mathew


As you type out an interpreter name, say "%spark", an error is thrown for every 
character you type until you finish typing it.

*LOGS:*
{code:java}
INFO [2018-07-03 02:41:28,654] ({qtp1144648478-26} 
NotebookServer.java[broadcastNewParagraph]:674) - Broadcasting paragraph on run 
call instead of note.
ERROR [2018-07-03 02:41:31,526] ({qtp1144648478-26} 
NotebookServer.java[onMessage]:365) - Can't handle message: 
{"op":"EDITOR_SETTING","data":{"paragraphId":" 
XXX","magic":"s"},"principal":" XXX","ticket":" XXX","roles":"[]"}
java.io.IOException: Fail to get interpreter: s
at 
org.apache.zeppelin.socket.NotebookServer.getEditorSetting(NotebookServer.java:2454)
at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:347)
at 
org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:59)
at 
org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
at 
org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
at 
org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
at 
org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)
at 
org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:161)
at 
org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:309)
at 
org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:214)
at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:220)
at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:258)
at 
org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
at 
org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.zeppelin.interpreter.InterpreterNotFoundException: Either 
no interpreter named s or it is not binded to this note
at 
org.apache.zeppelin.interpreter.InterpreterFactory.getInterpreter(InterpreterFactory.java:101)
at 
org.apache.zeppelin.socket.NotebookServer.getEditorSetting(NotebookServer.java:2452)
... 17 more
ERROR [2018-07-03 02:41:31,689] ({qtp1144648478-50} 
NotebookServer.java[onMessage]:365) - Can't handle message: 
{"op":"EDITOR_SETTING","data":{"paragraphId":" 
XXX","magic":"sp"},"principal":" XXX","ticket":" XXX","roles":"[]"}
java.io.IOException: Fail to get interpreter: sp
at 
org.apache.zeppelin.socket.NotebookServer.getEditorSetting(NotebookServer.java:2454)
at org.apache.zeppelin.socket.NotebookServer.onMessage(NotebookServer.java:347)
at 
org.apache.zeppelin.socket.NotebookSocket.onWebSocketText(NotebookSocket.java:59)
at 
org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextMessage(JettyListenerEventDriver.java:128)
at 
org.eclipse.jetty.websocket.common.message.SimpleTextMessage.messageComplete(SimpleTextMessage.java:69)
at 
org.eclipse.jetty.websocket.common.events.AbstractEventDriver.appendMessage(AbstractEventDriver.java:65)
at 
org.eclipse.jetty.websocket.common.events.JettyListenerEventDriver.onTextFrame(JettyListenerEventDriver.java:122)
at 
org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:161)
at 
org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:309)
at 
org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:214)
at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:220)
at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:258)
at 
org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.readParse(AbstractWebSocketConnection.java:632)
at 
org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onFillable(AbstractWebSocketConnection.java:480)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:544)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at 
org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
Caused by: 

[jira] [Created] (ZEPPELIN-3577) Notebook Save Button

2018-07-02 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3577:


 Summary: Notebook Save Button
 Key: ZEPPELIN-3577
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3577
 Project: Zeppelin
  Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Mathew


While 0.8.0 has made massive improvements in actually saving code after you 
click off the paragraph (rather than only when you run the paragraph), there are 
still improvements to make.

I think we should implement something like Jupyter's save button, where 
intermediate changes are cached where possible (similar to current behaviour), 
but there is a dedicated save button which is guaranteed to save all current 
code and results. 

Mixing this with Zeppelin's current Git/version support requires some 
discussion, not to mention real-time collaboration and whose version would take 
precedence.





[jira] [Created] (ZEPPELIN-3576) Implement project abstraction

2018-07-02 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3576:


 Summary: Implement project abstraction
 Key: ZEPPELIN-3576
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3576
 Project: Zeppelin
  Issue Type: New Feature
Affects Versions: 0.8.0
Reporter: Mathew


This is sort of a wild idea, but hear me out.

I think we should implement a project abstraction.

A 'project' could be a collection of notes all relating to one analytics 
project, with corresponding interpreter settings, access permissions and jobs. 
Ideally this would become the main 'container' for notes, with the main page 
displaying the 'projects' you have read permission on.

This would open up the possibility of storing these projects in separate 
storage locations, e.g. each project connects to a different Git repository, or 
has its own HDFS/local storage location. Additionally, in multi-user 
environments this would make collaboration much easier.

I believe this is needed, as most real-world analytics projects involve many 
notes and multiple people working on them, with a need to change-manage (Git) 
them separately.

Comments are very welcome. 





[jira] [Created] (ZEPPELIN-3561) Allow username variable in isolated interpreter configs (Kerberos)

2018-06-24 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3561:


 Summary: Allow username variable in isolated interpreter configs 
(Kerberos)
 Key: ZEPPELIN-3561
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3561
 Project: Zeppelin
  Issue Type: Improvement
Affects Versions: 0.8.0
Reporter: Mathew


Currently we have configs like:
* spark.yarn.principal
* spark.yarn.keytab
* zeppelin.jdbc.principal
* zeppelin.jdbc.keytab.location

Can we support specifying variables like {LOGGED_IN_USER} in these configs for 
"Isolated per user" interpreters? This would allow us to specify the keytab 
location per user, rather than giving Zeppelin itself a single keytab.
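For illustration only (this is the requested behaviour, not existing Zeppelin code, 
and the keytab path is a made-up placeholder), the substitution could work roughly 
like this:

{code:python}
# Sketch of the requested {LOGGED_IN_USER} substitution in interpreter properties.
def resolve(value, logged_in_user):
    return value.replace("{LOGGED_IN_USER}", logged_in_user)

print(resolve("/etc/security/keytabs/{LOGGED_IN_USER}.keytab", "alice"))
# -> /etc/security/keytabs/alice.keytab
{code}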





[jira] [Created] (ZEPPELIN-3462) DataFrames with tabs get corrupted in SQL interpreter.

2018-05-15 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3462:


 Summary: DataFrames with tabs get corrupted in SQL interpreter.
 Key: ZEPPELIN-3462
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3462
 Project: Zeppelin
  Issue Type: Bug
  Components: Interpreters
Affects Versions: 0.7.3
Reporter: Mathew
 Fix For: 0.8.0
 Attachments: image-2018-05-16-09-49-44-647.png

If a value in a DataFrame contains a tab, the SQL interpreter will interpret it 
as a column separator, causing the table display to cut off some of the 
following columns.

 

*Steps to Reproduce:*

Create dataframe with tab:
{code:java}
%spark.pyspark
from pyspark.sql import Row

# Create dataframe with 3 cols
df = sc.parallelize([
    Row(u'First col, \u0009 still first col.', 'Second col', 'Third col')
]).toDF()

# Display table
df.show()

# Register table for SQL
df.registerTempTable("df")
{code}
 

Query in SQL interpreter:
{code:java}
%sql
SELECT * FROM df
{code}
Output:

!image-2018-05-16-09-49-44-647.png!

 

 

 





[jira] [Created] (ZEPPELIN-3456) Livy Interpreter incompatible with Spark 2.2+

2018-05-13 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3456:


 Summary: Livy Interpreter incompatible with Spark 2.2+
 Key: ZEPPELIN-3456
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3456
 Project: Zeppelin
  Issue Type: Bug
  Components: Interpreters, livy-interpreter
Affects Versions: 0.7.3
Reporter: Mathew
 Fix For: 0.8.0


*Issue:*

Zeppelin's Livy interpreter does not support Spark 2.2+: the job fails to 
initialize because we call a method that was removed in Spark 2.2.

This is very similar to ZEPPELIN-2150 for the main Spark interpreter. 

 

*Environment:*
 * Livy 0.5
 * Spark 2.2+

 

*Specifics:*

We still try to call "appUIAddress", which was removed as of Spark 2.2:
[https://github.com/apache/zeppelin/blob/master/livy/src/main/java/org/apache/zeppelin/livy/LivySparkInterpreter.java#L51]





[jira] [Created] (ZEPPELIN-3455) Zeppelin crashes if Spark interpreter is restarted in a hanging state (And no hang timeout)

2018-05-13 Thread Mathew (JIRA)
Mathew created ZEPPELIN-3455:


 Summary: Zeppelin crashes if Spark interpreter is restarted in a 
hanging state (And no hang timeout)
 Key: ZEPPELIN-3455
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3455
 Project: Zeppelin
  Issue Type: Bug
  Components: Interpreters
Affects Versions: 0.7.3
Reporter: Mathew
 Fix For: 0.8.0


*Issue:*

If a user has not kinit'd their keytab and attempts to run a %spark paragraph, 
it will hang indefinitely, and if they then try to restart the Spark 
interpreter, the whole of Zeppelin crashes.

 

 *Environment:*
 * Zeppelin 0.7.3
 * Spark 2.2
 * Yarn-Client (Kerberized) 
 * User Impersonation Enabled (Per user, in isolated process) 
 * Shiro authenticating users through AD
 * ZEPPELIN_IMPERSONATE_SPARK_PROXY_USER=false
 * ZEPPELIN_IMPERSONATE_CMD='sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c '

While this might seem like a crazy setup, it is extremely common in enterprise 
settings, as users have differing permissions in the Hadoop environment.

(I am aware that Zeppelin can proxy users if it has its own keytab, but many 
Zeppelin users cannot do that for now.)

 

*Things to Fix:*
 * Firstly, there is seemingly no timeout for failing to initialize the Spark 
interpreter (meaning it hangs forever).
 * Secondly, while it is in the hanging state, restarting the Spark interpreter 
will crash Zeppelin for everyone (sometimes it comes back after 20+ minutes). 





[jira] [Created] (ZEPPELIN-3126) More than 2 notebooks in R failing with error sparkr interpreter not responding

2018-01-04 Thread Meethu Mathew (JIRA)
Meethu Mathew created ZEPPELIN-3126:
---

 Summary: More than 2 notebooks in R failing with error sparkr 
interpreter not responding
 Key: ZEPPELIN-3126
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-3126
 Project: Zeppelin
  Issue Type: Bug
  Components: r-interpreter
Affects Versions: 0.7.2
 Environment: spark version 1.6.2


Reporter: Meethu Mathew
Priority: Critical


The Spark interpreter is in per-note scoped mode.
Please find the steps below to reproduce the issue:
1. Create a notebook (Note1) and run any R code in a paragraph. I ran the 
following code:
%r
rdf <- data.frame(c(1,2,3,4))
colnames(rdf) <- c("myCol")
sdf <- createDataFrame(sqlContext, rdf)  
withColumn(sdf, "newCol", sdf$myCol * 2.0)

2. Create another notebook (Note2) and run any R code in a paragraph. I ran the 
same code as above.

Till now everything works fine.

3. Create a third notebook (Note3) and run any R code in a paragraph. I ran the 
same code. This notebook fails with the error: 
org.apache.zeppelin.interpreter.InterpreterException: sparkr is not responding

The problem is resolved by restarting the sparkr interpreter, and another two 
notebooks can then be executed successfully. But again, on the third run using 
the sparkr interpreter, the error is thrown.
Once a notebook throws the error, all further notebooks throw the same error. 
Each time we run one of those failed notebooks, a new R shell process is 
started, and these processes are not killed even if we delete the failed 
notebook, i.e. the original R shell is not reused after a failure.





[jira] [Created] (ZEPPELIN-2313) Run-a-paragraph-synchronously response documented incorrectly

2017-03-23 Thread Meethu Mathew (JIRA)
Meethu Mathew created ZEPPELIN-2313:
---

 Summary: Run-a-paragraph-synchronously response documented 
incorrectly
 Key: ZEPPELIN-2313
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2313
 Project: Zeppelin
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.7.0
Reporter: Meethu Mathew


The documentation at 
https://zeppelin.apache.org/docs/0.7.0/rest-api/rest-notebook.html#run-a-paragraph-synchronously
gives the sample error response as:
{
  "status": "INTERNAL_SERVER_ERROR",
  "body": {
    "code": "ERROR",
    "type": "TEXT",
    "msg": "bash: -c: line 0: unexpected EOF while looking for matching ``'\nbash: -c: line 1: syntax error: unexpected end of file\nExitValue: 2"
  }
}

But the actual response looks like:
{
  "status": "OK",
  "body": {
    "code": "SUCCESS",
    "msg": [
      {
        "type": "TEXT",
        "data": "hello world"
      }
    ]
  }
}
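For reference, a minimal way to fetch the actual response for comparison (a sketch; 
host, note id and paragraph id are placeholders, and the path follows the 
"run/notebookId/paragraphId" form noted in ZEPPELIN-1562):

{code:python}
import json
import requests

# Placeholders: adjust host/port and substitute a real note id and paragraph id.
base = "http://localhost:8080/api/notebook"
resp = requests.post(base + "/run/NOTE_ID/PARAGRAPH_ID")
print(json.dumps(resp.json(), indent=2))   # actual body, to compare with the docs
{code}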







[jira] [Created] (ZEPPELIN-2312) Allow to undo edits in a paragraph once it's executed and undo a deleted paragraph

2017-03-23 Thread Meethu Mathew (JIRA)
Meethu Mathew created ZEPPELIN-2312:
---

 Summary: Allow to undo edits in a paragraph once it's executed and 
undo a deleted paragraph
 Key: ZEPPELIN-2312
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2312
 Project: Zeppelin
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Meethu Mathew
Priority: Minor


It's not possible to undo edits in a paragraph once it's executed, but this was 
possible in 0.6.0.

There should also be an option to undo a deleted paragraph.





[jira] [Created] (ZEPPELIN-2305) Overall experience of auto-completion needs to improve.

2017-03-23 Thread Meethu Mathew (JIRA)
Meethu Mathew created ZEPPELIN-2305:
---

 Summary: Overall experience of auto-completion needs to improve.
 Key: ZEPPELIN-2305
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2305
 Project: Zeppelin
  Issue Type: Improvement
  Components: Core
Affects Versions: 0.7.0
Reporter: Meethu Mathew


There is no auto-completion or suggestion for defined variable names, which is 
available in other frameworks. Also, Ctrl+. gives awkward suggestions for 
related functions; for example, the relevant functions for a Spark RDD or 
DataFrame are not available in the suggestions list. The overall experience of 
auto-completion is something that Zeppelin needs to improve.





[jira] [Created] (ZEPPELIN-2141) sc.addPyFile("hdfs://path/to file) in Zeppelin causing UnknownHostException

2017-02-20 Thread Meethu Mathew (JIRA)
Meethu Mathew created ZEPPELIN-2141:
---

 Summary: sc.addPyFile("hdfs://path/to file) in Zeppelin causing 
UnknownHostException
 Key: ZEPPELIN-2141
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2141
 Project: Zeppelin
  Issue Type: Bug
  Components: pySpark
Affects Versions: 0.6.0
Reporter: Meethu Mathew
Priority: Minor


The documentation of sc.addPyFile() says: "Add a .py or .zip dependency for all 
tasks to be executed on this SparkContext in the future. The path passed can be 
either a local file, a file in HDFS (or other Hadoop-supported filesystems), or 
an HTTP, HTTPS or FTP URI."

But when I pass an HDFS path to this method in Zeppelin, it results in the 
following exception:
Py4JJavaError: An error occurred while calling 
z:org.apache.spark.api.python.PythonRDD.runJob.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in 
stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 
3, demo-node4.flytxt.com): java.lang.IllegalArgumentException: 
java.net.UnknownHostException: flycluster

The Spark version used is 1.6.2. The same command works fine in the pyspark 
shell, hence I think something is wrong with Zeppelin.
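For reference, a minimal paragraph that exercises this call (the HDFS path and 
module name below are hypothetical placeholders):

{code:python}
%pyspark
# Hypothetical placeholders: an HDFS path to a .zip and a module it contains.
sc.addPyFile("hdfs:///user/me/deps/mylib.zip")

def use_dep(x):
    import mylib          # resolved from the shipped archive on each executor
    return mylib.__name__

print(sc.parallelize([1], 1).map(use_dep).collect())
{code}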





[jira] [Created] (ZEPPELIN-2136) --files in SPARK_SUBMIT_OPTIONS not working

2017-02-19 Thread Meethu Mathew (JIRA)
Meethu Mathew created ZEPPELIN-2136:
---

 Summary: --files in SPARK_SUBMIT_OPTIONS not working 
 Key: ZEPPELIN-2136
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-2136
 Project: Zeppelin
  Issue Type: Bug
  Components: pySpark
Affects Versions: 0.6.0
Reporter: Meethu Mathew


According to the Zeppelin documentation, to pass a Python package to the 
Zeppelin pyspark interpreter, you can export it through the --files option in 
SPARK_SUBMIT_OPTIONS in conf/zeppelin-env.sh. 

When I add a .egg file through the --files option in SPARK_SUBMIT_OPTIONS, the 
Zeppelin notebook does not throw an error, but I am not able to import the 
module inside the notebook.

The Spark version is 1.6.2 and the zeppelin-env.sh file looks like:

export SPARK_HOME=/home/me/spark-1.6.1-bin-hadoop2.6
export SPARK_SUBMIT_OPTIONS="--jars /home/me/spark-csv-1.5.0-s_2.10.jar,/home/me/commons-csv-1.4.jar --files /home/me/models/Churn/package/build/dist/fly_libs-1.1-py2.7.egg"

My workaround for this problem was to add the .egg file using sc.addPyFile() 
inside the notebook, as sketched below.
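A sketch of that workaround, reusing the .egg path from the zeppelin-env.sh above 
(assuming the egg provides a module named fly_libs, which is only a guess from the 
file name):

{code:python}
%pyspark
# Workaround: ship the egg from the notebook instead of SPARK_SUBMIT_OPTIONS.
sc.addPyFile("/home/me/models/Churn/package/build/dist/fly_libs-1.1-py2.7.egg")

import fly_libs   # guessed module name; adjust to whatever the egg actually provides
print(fly_libs.__file__)
{code}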





[jira] [Created] (ZEPPELIN-1562) Wrong documentation in 'Run a paragraph synchronously' rest api

2016-10-18 Thread Meethu Mathew (JIRA)
Meethu Mathew created ZEPPELIN-1562:
---

 Summary: Wrong documentation in 'Run a paragraph synchronously' 
rest api
 Key: ZEPPELIN-1562
 URL: https://issues.apache.org/jira/browse/ZEPPELIN-1562
 Project: Zeppelin
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.7.0
Reporter: Meethu Mathew
 Fix For: 0.7.0


The URL for running a paragraph synchronously using the REST API is given as 
"http://[zeppelin-server]:[zeppelin-port]/api/notebook/job/[notebookId]/[paragraphId]" 
in the documentation:
https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/rest-api/rest-notebook.html#run-a-paragraph-synchronously.

When I searched for the same in the GitHub code, 
https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/rest-api/rest-notebook.html#run-a-paragraph-synchronously
, the URL is given as "run/notebookId/paragraphId".


