[jira] [Updated] (SPARK-24668) PySpark crashes when getting the webui url if the webui is disabled
[ https://issues.apache.org/jira/browse/SPARK-24668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

holdenk updated SPARK-24668:
    Shepherd: holdenk
    Affects Version/s: 2.4.0

> PySpark crashes when getting the webui url if the webui is disabled
> -------------------------------------------------------------------
>
>                 Key: SPARK-24668
>                 URL: https://issues.apache.org/jira/browse/SPARK-24668
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 2.3.0, 2.4.0
>         Environment: * Spark 2.3.0
> * Spark-on-YARN
> * Java 8
> * Python 3.6.5
> * Jupyter 4.4.0
>            Reporter: Karthik Palaniappan
>            Priority: Minor
>
> Repro: evaluate `sc` in a Jupyter notebook:
>
> {code}
> ---------------------------------------------------------------------------
> Py4JJavaError                             Traceback (most recent call last)
> /opt/conda/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
>     343             method = get_real_method(obj, self.print_method)
>     344             if method is not None:
> --> 345                 return method()
>     346             return None
>     347         else:
>
> /usr/lib/spark/python/pyspark/context.py in _repr_html_(self)
>     261
>     262         """.format(
> --> 263             sc=self
>     264         )
>     265
>
> /usr/lib/spark/python/pyspark/context.py in uiWebUrl(self)
>     373     def uiWebUrl(self):
>     374         """Return the URL of the SparkUI instance started by this SparkContext"""
> --> 375         return self._jsc.sc().uiWebUrl().get()
>     376
>     377     @property
>
> /usr/lib/spark/python/lib/py4j-0.10.6-src.zip/py4j/java_gateway.py in __call__(self, *args)
>    1158         answer = self.gateway_client.send_command(command)
>    1159         return_value = get_return_value(
> -> 1160             answer, self.gateway_client, self.target_id, self.name)
>    1161
>    1162         for temp_arg in temp_args:
>
> /usr/lib/spark/python/pyspark/sql/utils.py in deco(*a, **kw)
>      61     def deco(*a, **kw):
>      62         try:
> ---> 63             return f(*a, **kw)
>      64         except py4j.protocol.Py4JJavaError as e:
>      65             s = e.java_exception.toString()
>
> /usr/lib/spark/python/lib/py4j-0.10.6-src.zip/py4j/protocol.py in get_return_value(answer, gateway_client, target_id, name)
>     318                 raise Py4JJavaError(
>     319                     "An error occurred while calling {0}{1}{2}.\n".
> --> 320                     format(target_id, ".", name), value)
>     321             else:
>     322                 raise Py4JError(
>
> Py4JJavaError: An error occurred while calling o80.get.
> : java.util.NoSuchElementException: None.get
>     at scala.None$.get(Option.scala:347)
>     at scala.None$.get(Option.scala:345)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
>     at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
>     at py4j.Gateway.invoke(Gateway.java:282)
>     at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
>     at py4j.commands.CallCommand.execute(CallCommand.java:79)
>     at py4j.GatewayConnection.run(GatewayConnection.java:214)
>     at java.lang.Thread.run(Thread.java:748)
> {code}
>
> PySpark only prints the web UI URL in `_repr_html_`, not `__repr__`, so this only happens in notebooks that render HTML, not in the pyspark shell.
> [https://github.com/apache/spark/commit/f654b39a63d4f9b118733733c7ed2a1b58649e3d]
>
> Disabling Spark's UI with `spark.ui.enabled` *is* valuable outside of tests. A couple of reasons that come to mind:
> 1) If you run multiple Spark applications from one machine, Spark irritatingly starts by picking the same port (4040) as the first application, then increments (4041, 4042, etc.) until it finds an open port. If you are running 10 Spark apps, the 11th prints 10 warnings about ports being taken until it finally finds one.
> 2) You can serve the Spark web UI from a dedicated Spark history server.
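The `None.get` in the traceback comes from line 375 of `context.py` calling `.get()` unconditionally on the Scala `Option` returned by `sc.uiWebUrl()`, which is empty when `spark.ui.enabled` is false. A minimal sketch of the defensive pattern a fix could use, checking `isDefined()` before `get()` (the `Option` class below is an illustrative stand-in for the Java-side object reached over Py4J, not Spark's or Py4J's actual API):

```python
class Option:
    """Illustrative stand-in for a scala.Option as seen over Py4J."""

    def __init__(self, value=None):
        self._value = value

    def isDefined(self):
        return self._value is not None

    def get(self):
        # Mirrors java.util.NoSuchElementException: None.get
        if self._value is None:
            raise Exception("None.get")
        return self._value


def ui_web_url(java_option):
    """Return the UI URL, or None when the UI is disabled (empty Option)."""
    return java_option.get() if java_option.isDefined() else None


# With the UI enabled the URL comes through; with it disabled we get
# None instead of a NoSuchElementException bubbling up through py4j.
print(ui_web_url(Option("http://host:4040")))  # → http://host:4040
print(ui_web_url(Option()))                    # → None
```

With this pattern, `_repr_html_` could render the notebook banner with the URL omitted rather than crashing the whole repr.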
[jira] [Updated] (SPARK-24668) PySpark crashes when getting the webui url if the webui is disabled
[ https://issues.apache.org/jira/browse/SPARK-24668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Karthik Palaniappan updated SPARK-24668:
    Environment:
* Spark 2.3.0
* Spark-on-YARN
* Java 8
* Python 3.6.5
* Jupyter 4.4.0

was:
* Spark 2.3.0
* Spark-on-YARN
* Java 8
* Python 2
* Jupyter