[GitHub] spark pull request #22858: [SPARK-24709][SQL][2.4] use str instead of basest...
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22858#discussion_r228731178

    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2326,7 +2326,7 @@ def schema_of_json(json):
         >>> df.select(schema_of_json('{"a": 0}').alias("json")).collect()
         [Row(json=u'struct')]
         """
    -    if isinstance(json, basestring):
    +    if isinstance(json, str):
    --- End diff --

    Yea, we should. They are put in only when needed because there are so many cases like that (for instance, imap in Python 2 and map in Python 3). It looks like that was added in another PR in the master branch only.

---
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #22858: [SPARK-24709][SQL][2.4] use str instead of basest...
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22858#discussion_r228730582

    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2326,7 +2326,7 @@ def schema_of_json(json):
         >>> df.select(schema_of_json('{"a": 0}').alias("json")).collect()
         [Row(json=u'struct')]
         """
    -    if isinstance(json, basestring):
    +    if isinstance(json, str):
    --- End diff --

    Shall we apply it to 2.4? I'm not aware of the background: why did we not put
    ```
    if sys.version >= '3':
        basestring = str
    ```
    in 2.4?
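For readers following the thread, here is a minimal sketch of how the quoted shim behaves. It is an illustration under stated assumptions, not the code in `functions.py`: the helper name `is_string_argument` is hypothetical, and the version test uses `sys.version_info` rather than the string comparison quoted above.

```python
import sys

# Assumption: same intent as the shim quoted in the comment. On Python 3 the
# built-in name `basestring` does not exist, so it is aliased to `str`.
# On Python 2 the built-in `basestring` is left untouched.
if sys.version_info[0] >= 3:
    basestring = str


def is_string_argument(value):
    """Hypothetical helper: True if `value` is any kind of text string."""
    return isinstance(value, basestring)


literal_ok = is_string_argument('{"a": 0}')  # a JSON string literal
number_ok = is_string_argument(0)            # not a string at all
```

With the alias in place, the original `isinstance(json, basestring)` check can run on both interpreter lines without a `NameError`.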
[GitHub] spark pull request #22858: [SPARK-24709][SQL][2.4] use str instead of basest...
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22858#discussion_r228713086

    --- Diff: python/pyspark/sql/functions.py ---
    @@ -2326,7 +2326,7 @@ def schema_of_json(json):
         >>> df.select(schema_of_json('{"a": 0}').alias("json")).collect()
         [Row(json=u'struct')]
         """
    -    if isinstance(json, basestring):
    +    if isinstance(json, str):
    --- End diff --

    The problem here is that we will not support unicode in Python 2.
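The concern about unicode can be illustrated with a small comparison. This is only a sketch: both function names are hypothetical and neither is taken from PySpark. On Python 2, `str` and `unicode` are distinct types sharing the base `basestring`, so a check against `str` alone rejects unicode values; on Python 3 the two checks agree.

```python
import sys

PY2 = sys.version_info[0] == 2


def accepts_with_str_check(value):
    # Same shape as the patched line: only the built-in `str` type passes.
    return isinstance(value, str)


def accepts_with_basestring_check(value):
    # The conditional expression evaluates only the selected branch, so the
    # name `basestring` is never looked up on Python 3.
    string_type = basestring if PY2 else str  # noqa: F821
    return isinstance(value, string_type)


unicode_literal = u'{"a": 0}'
str_check_result = accepts_with_str_check(unicode_literal)
basestring_check_result = accepts_with_basestring_check(unicode_literal)
```

On a Python 2 interpreter `str_check_result` would be `False` while `basestring_check_result` stays `True`, which is the gap the comment points out.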
[GitHub] spark pull request #22858: [SPARK-24709][SQL][2.4] use str instead of basest...
GitHub user cloud-fan opened a pull request:

    https://github.com/apache/spark/pull/22858

    [SPARK-24709][SQL][2.4] use str instead of basestring

    ## What changes were proposed in this pull request?

    After backporting https://github.com/apache/spark/pull/22775 to 2.4, the 2.4 sbt Jenkins QA job is broken; see https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test/job/spark-branch-2.4-test-sbt-hadoop-2.7/147/console

    I checked all the `isinstance` calls in `functions.py`; all of them use `str` to check for the string type. I don't know why `basestring` works in master and in the 2.4 maven build, but it's safer to follow the existing code.

    ## How was this patch tested?

    Existing tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark python

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22858.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22858

----
commit 2917acd18994c3901c8c5b562cf87964bca879d9
Author: Wenchen Fan
Date:   2018-10-27T11:12:10Z

    use str instead of basestring

----