[ https://issues.apache.org/jira/browse/SPARK-13973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15249628#comment-15249628 ]

Paul Shearer edited comment on SPARK-13973 at 4/20/16 10:48 AM:
----------------------------------------------------------------

The problem with this change is that it breaks pyspark for users who simply 
want the IPython interactive shell, as opposed to the notebook. `ipython` with 
no arguments starts the IPython shell, but `jupyter` with no arguments fails 
with the following error:

{noformat}
usage: jupyter [-h] [--version] [--config-dir] [--data-dir] [--runtime-dir]
               [--paths] [--json]
               [subcommand]
jupyter: error: one of the arguments --version subcommand --config-dir --data-dir --runtime-dir --paths is required
{noformat}
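
For what it's worth, the shell use case still works under the new config 
style; a minimal sketch, assuming `ipython` is on the PATH and the command is 
run from the Spark home directory:

{code:none}
# Use plain IPython (not Jupyter) as the PySpark driver shell.
export PYSPARK_DRIVER_PYTHON=ipython
./bin/pyspark
{code}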

I can't speak for the general Python community, but as a data scientist I find 
the IPython notebook suitable only for very basic exploratory analysis; any 
sort of application development is much better served by the IPython shell, so 
I use the shell constantly and the notebook rarely. For that reason I prefer 
the old script.

Perhaps the best answer is to stop maintaining a backwards compatibility that 
is no longer sustainable. The committed change broke the pyspark startup 
script in my case, and the old startup path will break anyway once the 
deprecated `ipython notebook` subcommand is removed. So perhaps `IPYTHON=1` 
should simply produce an error message prompting the user to switch to the new 
PYSPARK_DRIVER_PYTHON config style; see the sketch below. Most Spark users 
know the installation process is not seamless and requires mucking about with 
environment variables, so they might as well be told to do it in a way that is 
convenient for the development team.
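
A sketch of what that guard might look like near the top of bin/pyspark (the 
variable names match the existing script; the message wording is illustrative, 
not a committed design):

{code:none}
# Fail fast on the removed IPYTHON config style instead of half-supporting it.
if [[ -n "$IPYTHON" || -n "$IPYTHON_OPTS" ]]; then
  echo "Error: IPYTHON and IPYTHON_OPTS are no longer supported." 1>&2
  echo "Set PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS instead." 1>&2
  exit 1
fi
{code}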


> `ipython notebook` is going away...
> -----------------------------------
>
>                 Key: SPARK-13973
>                 URL: https://issues.apache.org/jira/browse/SPARK-13973
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>         Environment: spark-1.6.1-bin-hadoop2.6
> Anaconda2-2.5.0-Linux-x86_64
>            Reporter: Bogdan Pirvu
>            Assignee: Rekha Joshi
>            Priority: Trivial
>             Fix For: 2.0.0
>
>
> Starting {{pyspark}} with following environment variables:
> {code:none}
> export IPYTHON=1
> export IPYTHON_OPTS="notebook --no-browser"
> {code}
> yields this warning
> {code:none}
> [TerminalIPythonApp] WARNING | Subcommand `ipython notebook` is deprecated and will be removed in future versions.
> [TerminalIPythonApp] WARNING | You likely want to use `jupyter notebook`... continue in 5 sec. Press Ctrl-C to quit now.
> {code}
> Changing line 52 from
> {code:none}
> PYSPARK_DRIVER_PYTHON="ipython"
> {code}
> to
> {code:none}
> PYSPARK_DRIVER_PYTHON="jupyter"
> {code}
> in https://github.com/apache/spark/blob/master/bin/pyspark solves this issue 
> for me, but I'm not sure whether it's sustainable, as I'm not familiar with 
> the rest of the code...
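>
> Alternatively, setting the documented driver variables avoids patching the 
> script entirely; a sketch, assuming Jupyter is installed in the active 
> environment:
> {code:none}
> export PYSPARK_DRIVER_PYTHON=jupyter
> export PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser"
> {code}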
> This is the relevant part of my Python environment:
> {code:none}
> ipython                   4.1.2                    py27_0  
> ipython-genutils          0.1.0                     <pip>
> ipython_genutils          0.1.0                    py27_0  
> ipywidgets                4.1.1                    py27_0  
> ...
> jupyter                   1.0.0                    py27_1  
> jupyter-client            4.2.1                     <pip>
> jupyter-console           4.1.1                     <pip>
> jupyter-core              4.1.0                     <pip>
> jupyter_client            4.2.1                    py27_0  
> jupyter_console           4.1.1                    py27_0  
> jupyter_core              4.1.0                    py27_0
> {code}


