[jira] [Updated] (SPARK-24668) PySpark crashes when getting the webui url if the webui is disabled

2018-07-02 Thread holdenk (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

holdenk updated SPARK-24668:

 Shepherd: holdenk
Affects Version/s: 2.4.0

> PySpark crashes when getting the webui url if the webui is disabled
> ---
>
> Key: SPARK-24668
> URL: https://issues.apache.org/jira/browse/SPARK-24668
> Project: Spark
> Issue Type: Bug
> Components: PySpark
> Affects Versions: 2.3.0, 2.4.0
> Environment:
> * Spark 2.3.0
> * Spark-on-YARN
> * Java 8
> * Python 3.6.5
> * Jupyter 4.4.0
> Reporter: Karthik Palaniappan
> Priority: Minor
>
> Repro:
>  
> Evaluate `sc` in a Jupyter notebook:
>  
>  
> {{---}}
> {{Py4JJavaError                             Traceback (most recent call 
> last)}}
> {{/opt/conda/lib/python3.6/site-packages/IPython/core/formatters.py in 
> __call__(self, obj)}}
> {{    343             method = get_real_method(obj, self.print_method)}}
> {{    344             if method is not None:}}
> {{--> 345                 return method()}}
> {{    346             return None}}
> {{    347         else:}}
> {{/usr/lib/spark/python/pyspark/context.py in _repr_html_(self)}}
> {{    261         }}
> {{    262         """.format(}}
> {{--> 263             sc=self}}
> {{    264         )}}
> {{    265 }}
> {{/usr/lib/spark/python/pyspark/context.py in uiWebUrl(self)}}
> {{    373     def uiWebUrl(self):}}
> {{    374         """Return the URL of the SparkUI instance started by this 
> SparkContext"""}}
> {{--> 375         return self._jsc.sc().uiWebUrl().get()}}
> {{    376 }}
> {{    377     @property}}
> {{/usr/lib/spark/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py in 
> __call__(self, *args)}}
> {{   1158         answer = self.gateway_client.send_command(command)}}
> {{   1159         return_value = get_return_value(}}
> {{-> 1160             answer, self.gateway_client, self.target_id, self.name)}}
> {{   1161 }}
> {{   1162         for temp_arg in temp_args:}}
> {{/usr/lib/spark/python/pyspark/sql/utils.py in deco(*a, **kw)}}
> {{     61     def deco(*a, **kw):}}
> {{     62         try:}}
> {{---> 63             return f(*a, **kw)}}
> {{     64         except py4j.protocol.Py4JJavaError as e:}}
> {{     65             s = e.java_exception.toString()}}
> {{/usr/lib/spark/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py in 
> get_return_value(answer, gateway_client, target_id, name)}}
> {{    318                 raise Py4JJavaError(}}
> {{    319                     "An error occurred while calling {0}{1}{2}.\n".}}
> {{--> 320                     format(target_id, ".", name), value)}}
> {{    321             else:}}
> {{    322                 raise Py4JError(}}
> {{Py4JJavaError: An error occurred while calling o80.get.}}
> {{: java.util.NoSuchElementException: None.get}}
> {{        at scala.None$.get(Option.scala:347)}}
> {{        at scala.None$.get(Option.scala:345)}}
> {{        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)}}
> {{        at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)}}
> {{        at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)}}
> {{        at java.lang.reflect.Method.invoke(Method.java:498)}}
> {{        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)}}
> {{        at 
> py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)}}
> {{        at py4j.Gateway.invoke(Gateway.java:282)}}
> {{        at 
> py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)}}
> {{        at py4j.commands.CallCommand.execute(CallCommand.java:79)}}
> {{        at py4j.GatewayConnection.run(GatewayConnection.java:214)}}
> {{        at java.lang.Thread.run(Thread.java:748)}}
>  
> PySpark only prints the web UI URL in `_repr_html_`, not `__repr__`, so 
> this only happens in notebooks that render HTML, not in the pyspark shell. 
> [https://github.com/apache/spark/commit/f654b39a63d4f9b118733733c7ed2a1b58649e3d]
>  
> Disabling Spark's UI by setting `spark.ui.enabled` to false *is* valuable 
> outside of tests. A couple of reasons come to mind:
> 1) If you run multiple Spark applications on one machine, each new 
> application first tries the same port (4040) as the first application, then 
> increments (4041, 4042, etc.) until it finds an open port. If you are already 
> running 10 Spark apps, the 11th prints 10 warnings about ports being taken 
> before it finally finds one.
> 2) You can serve the spark web ui from a dedicated spark history server 
> 
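
For illustration only, here is a minimal sketch of the kind of guard that would avoid the `None.get` failure in the traceback above. It is not the actual patch. `ScalaOptionStub`, `safe_ui_web_url` and `repr_html` are hypothetical names, and the stub merely imitates the `isDefined()`/`get()` methods of a Scala `Option` as seen through a py4j proxy, so the snippet runs without a Spark installation.

```python
class ScalaOptionStub:
    """Hypothetical stand-in for a py4j proxy of a scala.Option[String]."""

    def __init__(self, value=None):
        self._value = value

    def isDefined(self):
        # Mirrors Option.isDefined: True only when a value is present.
        return self._value is not None

    def get(self):
        # Mirrors Option.get: fails on an empty Option, like None.get in the trace.
        if self._value is None:
            raise LookupError("None.get")
        return self._value


def safe_ui_web_url(option):
    """Return the URL held by the option, or None when the UI is disabled."""
    return option.get() if option.isDefined() else None


def repr_html(option):
    """Render a small HTML snippet that tolerates a missing web UI URL."""
    url = safe_ui_web_url(option)
    if url is None:
        link = "Spark UI disabled"
    else:
        link = '<a href="{0}">Spark UI</a>'.format(url)
    return "<div>{0}</div>".format(link)


if __name__ == "__main__":
    print(repr_html(ScalaOptionStub()))
    print(repr_html(ScalaOptionStub("http://localhost:4040")))
```

Whether the real property should return `None`, an empty string, or something else when the UI is disabled is a design choice this sketch does not settle; it only shows that checking the option before calling `get()` keeps the HTML rendering from raising.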

[jira] [Updated] (SPARK-24668) PySpark crashes when getting the webui url if the webui is disabled

2018-06-27 Thread Karthik Palaniappan (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-24668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Karthik Palaniappan updated SPARK-24668:

Environment: 
* Spark 2.3.0
 * Spark-on-YARN
 * Java 8
 * Python 3.6.5
 * Jupyter 4.4.0

  was:
* Spark 2.3.0
 * Spark-on-YARN
 * Java 8
 * Python 2
 * Jupyter 

