[GitHub] spark pull request #21648: [SPARK-24665][PySpark] Use SQLConf in PySpark to ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21648

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199387596

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Accessor for the JVM SQL-specific configurations"""
+        return self.sparkSession._jsparkSession.sessionState().conf()
```

Got it :)
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199386302

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Accessor for the JVM SQL-specific configurations"""
+        return self.sparkSession._jsparkSession.sessionState().conf()
```

I am okay too, but let's just leave it. It's private and not a big deal.
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199385685

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Accessor for the JVM SQL-specific configurations"""
+        return self.sparkSession._jsparkSession.sessionState().conf()
```

I'm fine with it.
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199385323

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Accessor for the JVM SQL-specific configurations"""
+        return self.sparkSession._jsparkSession.sessionState().conf()
```

Thanks for your advice. I want to change the comment to:
```
"""Accessor for the JVM SQL-specific configurations. This property returns a SQLConf
object that keeps the same behavior as the Scala SQLContext; we mainly use it to call
the helper methods in SQLConf.
"""
```
Do you think that's reasonable?
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199380206

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Accessor for the JVM SQL-specific configurations"""
+        return self.sparkSession._jsparkSession.sessionState().conf()
```

Ok. Makes sense to me.
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199379979

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Accessor for the JVM SQL-specific configurations"""
+        return self.sparkSession._jsparkSession.sessionState().conf()
```

This one should be considered an internal API, used only to reduce the hardcoded config keys, and the leading underscore implies that. I think we are fine with it.
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199379810

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Accessor for the JVM SQL-specific configurations"""
+        return self.sparkSession._jsparkSession.sessionState().conf()
```

Btw, `_conf` sounds like a property internal to the class/object; why do we have an underscore there?
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199379854

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Accessor for the JVM SQL-specific configurations"""
+        return self.sparkSession._jsparkSession.sessionState().conf()
```

> Maybe we should add more comments to explain this? AFAIC it's more natural than getting the conf with a hardcoded config key.

Yeah, I think this is a good idea.
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199379417

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Accessor for the JVM SQL-specific configurations"""
+        return self.sparkSession._jsparkSession.sessionState().conf()
```

```
What's the difference between this and SQLContext.getConf ?
```
`getConf` returns the value of a Spark SQL conf for a given key, while we add this `_conf` to access the helper methods in SQLConf. This actually follows the behavior in SQLContext.scala: https://github.com/apache/spark/blob/3c0c2d09ca89c6b6247137823169db17847dfae3/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala#L80
```
Can users be confused by two similar APIs?
```
Maybe we should add more comments to explain this? AFAIC it's more natural than getting the conf with a hardcoded config key.
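To make the distinction above concrete, here is a small self-contained sketch of the two access styles using plain-Python stand-ins. `FakeSQLConf`, `FakeSQLContext`, and the helper name `isReplEagerEvalEnabled` are illustrative assumptions for this sketch, not the real PySpark or Py4J objects:

```python
class FakeSQLConf:
    """Stand-in for the JVM SQLConf object, assumed to expose typed helper methods."""

    def __init__(self, settings):
        self._settings = settings

    def isReplEagerEvalEnabled(self):
        # The key string and the string-to-bool parsing live in one place.
        return self._settings.get("spark.sql.repl.eagerEval.enabled", "false") == "true"


class FakeSQLContext:
    """Stand-in for SQLContext, offering both access styles."""

    def __init__(self, settings):
        self._settings = settings

    @property
    def _conf(self):
        return FakeSQLConf(self._settings)

    def getConf(self, key, default=None):
        # Returns the raw string value; every caller must know the key and parse it.
        return self._settings.get(key, default)


ctx = FakeSQLContext({"spark.sql.repl.eagerEval.enabled": "true"})

# Style 1: hardcoded key, raw string result that each call site must parse.
raw = ctx.getConf("spark.sql.repl.eagerEval.enabled", "false")

# Style 2: typed helper reached through the _conf accessor.
typed = ctx._conf.isReplEagerEvalEnabled()

print(raw, typed)  # → true True
```

The point being made in the thread is visible here: the typed helper centralizes both the config key and its parsing, so call sites cannot drift apart in how they interpret the value.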
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199378189

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Accessor for the JVM SQL-specific configurations"""
+        return self.sparkSession._jsparkSession.sessionState().conf()
```

What's the difference between this and `SQLContext.getConf`? Can users be confused by two similar APIs?
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199373295

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Get SQLConf from current SqlContext"""
```

Thanks, done in the next commit.
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199370180

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,11 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    @property
+    def _conf(self):
+        """Get SQLConf from current SqlContext"""
```

nit: `Accessor for the JVM SQL-specific configurations`
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199316817

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,10 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    def conf(self):
```

Thanks, done in 4fc0ae4.
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199316401

Diff: python/pyspark/sql/context.py
```
@@ -93,6 +93,10 @@ def _ssql_ctx(self):
         """
         return self._jsqlContext
 
+    def conf(self):
```

Let's make this `_conf` with a property.
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199316328

Diff: python/pyspark/sql/dataframe.py
```
@@ -358,22 +360,19 @@ def show(self, n=20, truncate=True, vertical=False):
     def _eager_eval(self):
```

Thanks, done in 453004a.
Github user xuanyuanking commented on a diff in the pull request: https://github.com/apache/spark/pull/21648#discussion_r199316323

Diff: python/pyspark/sql/conf.py
```
@@ -64,6 +64,97 @@ def _checkType(self, obj, identifier):
             (identifier, obj, type(obj).__name__))
 
 
+class ConfigEntry(object):
+    """An entry containing all meta information for a configuration"""
+
+    def __init__(self, confKey):
+        """Create a new ConfigEntry with the given config key"""
+        self.confKey = confKey
+        self.converter = None
+        self.default = _NoValue
+
+    def boolConf(self):
+        """Designate the current config entry as a boolean config"""
+        self.converter = lambda x: str(x).lower() == "true"
+        return self
+
+    def intConf(self):
+        """Designate the current config entry as an integer config"""
+        self.converter = lambda x: int(x)
+        return self
+
+    def stringConf(self):
+        """Designate the current config entry as a string config"""
+        self.converter = lambda x: str(x)
+        return self
+
+    def withDefault(self, default):
+        """Give a default value for the current config entry; the default is
+        _NoValue when absent"""
+        self.default = default
+        return self
+
+    def read(self, ctx):
+        """Read the value of this config entry through the sql context"""
+        return self.converter(ctx.getConf(self.confKey, self.default))
+
+
+class SQLConf(object):
+    """A class that enables getting SQL config parameters in pyspark"""
+
+    REPL_EAGER_EVAL_ENABLED = ConfigEntry("spark.sql.repl.eagerEval.enabled")\
```

Looking at the current implementation, maybe we can directly use the SQLConf in SessionState? I did this in the last commit, 453004a. Please have a look when you have time, thanks :)
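Although this `ConfigEntry` builder was ultimately dropped in favor of reading the JVM SQLConf through SessionState, the pattern in the diff above can be exercised in isolation. The sketch below trims the builder to the boolean case; `StubContext` is a hypothetical test double standing in for `SQLContext.getConf`, not part of PySpark:

```python
_NoValue = object()  # sentinel mirroring pyspark's _NoValue


class ConfigEntry(object):
    """Typed config entry built with a fluent interface (as in the diff above)."""

    def __init__(self, confKey):
        self.confKey = confKey
        self.converter = None
        self.default = _NoValue

    def boolConf(self):
        # Parse the raw string value into a bool.
        self.converter = lambda x: str(x).lower() == "true"
        return self

    def withDefault(self, default):
        self.default = default
        return self

    def read(self, ctx):
        # Fetch the raw value through the context, then apply the converter.
        return self.converter(ctx.getConf(self.confKey, self.default))


class StubContext:
    """Hypothetical stand-in for SQLContext.getConf, for illustration only."""

    def __init__(self, settings):
        self._settings = settings

    def getConf(self, key, default=_NoValue):
        return self._settings.get(key, default)


# Declared once; callers never repeat the key string or the parsing logic.
REPL_EAGER_EVAL_ENABLED = ConfigEntry("spark.sql.repl.eagerEval.enabled") \
    .boolConf() \
    .withDefault("false")

ctx = StubContext({"spark.sql.repl.eagerEval.enabled": "true"})
print(REPL_EAGER_EVAL_ENABLED.read(ctx))  # → True
```

Each builder method returns `self`, so an entry is declared in a single chained expression; this mirrors the `ConfigBuilder` style used by Spark's Scala SQLConf.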