[GitHub] [spark] HyukjinKwon edited a comment on issue #25130: [SPARK-28359][test-maven][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making UDFs (virtually) no-op

2019-07-14 Thread GitBox
HyukjinKwon edited a comment on issue #25130: 
[SPARK-28359][test-maven][SQL][PYTHON][TESTS] Make integrated UDF tests robust 
by making UDFs (virtually) no-op
URL: https://github.com/apache/spark/pull/25130#issuecomment-511181960
 
 
   @chitralverma, do you mean the comment about supporting complex types 
(https://github.com/apache/spark/pull/25130#discussion_r303227812)? Yes. What 
I mean is that we can use `from_json`/`to_json` expressions to keep the 
string representation identical. See the examples below for details.
   
   
   **Python:**
   
   ```python
   from pyspark.sql.functions import to_json, from_json
   df = spark.range(3).selectExpr("struct(1, id) as col")
   df.select(to_json("col")).show()
   ```
   ```
   +-----------------+
   |     to_json(col)|
   +-----------------+
   |{"col1":1,"id":0}|
   |{"col1":1,"id":1}|
   |{"col1":1,"id":2}|
   +-----------------+
   ```
   
   ```python
   df.select(from_json(to_json("col"), df.schema["col"].dataType)).show()
   ```
   ```
   +-----------------------+
   |from_json(to_json(col))|
   +-----------------------+
   |                 [1, 0]|
   |                 [1, 1]|
   |                 [1, 2]|
   +-----------------------+
   ```
   
   **Scala:**
   
   ```scala
   import org.apache.spark.sql.functions.{to_json, from_json}
   val df = spark.range(3).selectExpr("struct(1, id) as col")
   df.select(to_json($"col")).show()
   ```
   
   ```
   +-----------------+
   |     to_json(col)|
   +-----------------+
   |{"col1":1,"id":0}|
   |{"col1":1,"id":1}|
   |{"col1":1,"id":2}|
   +-----------------+
   ```
   
   ```scala
   df.select(from_json(to_json($"col"), df("col").expr.dataType)).show()
   ```
   ```
   +-----------------------+
   |from_json(to_json(col))|
   +-----------------------+
   |                 [1, 0]|
   |                 [1, 1]|
   |                 [1, 2]|
   +-----------------------+
   ```
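   The reason the round trip above works is that serializing a value to JSON and 
parsing it back yields an equal value, so a UDF built this way is observably a 
no-op. A minimal plain-Python sketch of that identity, using the standard 
`json` module rather than Spark (the helper name `noop_via_json` is 
hypothetical, just for illustration):

   ```python
   import json

   def noop_via_json(value):
       # Round-trip a value through its JSON string representation.
       # This mirrors the from_json(to_json(col)) trick above: the
       # result equals the input, so the function is (virtually) a no-op.
       return json.loads(json.dumps(value))

   row = {"col1": 1, "id": 0}  # stand-in for the struct(1, id) column value
   assert noop_via_json(row) == row
   ```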
   
   
   
   
   Let's do this one later when we meet many complex type test cases. Looks 
like there are not a lot, and we can just skip it for now since the integrated 
UDF tests currently don't target value conversions.


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] HyukjinKwon edited a comment on issue #25130: [SPARK-28359][test-maven][SQL][PYTHON][TESTS] Make integrated UDF tests robust by making UDFs (virtually) no-op

2019-07-11 Thread GitBox
HyukjinKwon edited a comment on issue #25130: 
[SPARK-28359][test-maven][SQL][PYTHON][TESTS] Make integrated UDF tests robust 
by making UDFs (virtually) no-op
URL: https://github.com/apache/spark/pull/25130#issuecomment-510763658
 
 
   The array issue still stands, but I think this can address most of our 
cases. I would like to avoid adding all combinations of Python / Scala UDFs 
for tests that mainly target plans.
   
   Let's work around the array ones in those tests specifically; that set of 
tests should really target plans.
   
   I will comment in that PR.

