GitHub user ueshin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20306#discussion_r162293629
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala ---
    @@ -838,6 +839,7 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
                  |$evPrim = $buffer.build();
                """.stripMargin
             }
    +      case pudt: PythonUserDefinedType => castToStringCode(pudt.sqlType, ctx)
    --- End diff --
    
    Yes, casting to string works.
    
    By the way, for `VectorUDT`, it seems that `DenseVector` and `SparseVector` override `toString()` on purpose(?), at least for `show()`:
    
https://github.com/apache/spark/blob/74c17353bb6372b123c5aee1b6d58a21de36f99a/python/pyspark/ml/classification.py#L1497-L1503
    
    If we also use the cast to string for `show()`, the result would look like:
    
    ```
    +-----------------+----------+
    |         features|prediction|
    +-----------------+----------+
    |[1,,, [1.0, 0.0]]|       1.0|
    |[1,,, [0.0, 0.0]]|       0.0|
    +-----------------+----------+
    ```
    
    I'm not sure we can change the string representation here.
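
    To illustrate where a string like `[1,,, [1.0, 0.0]]` comes from, here is a minimal pure-Python sketch. It assumes that a dense vector is stored under `VectorUDT.sqlType` as a four-field struct `(type, size, indices, values)` with `size` and `indices` left null, and that the struct/array cast joins fields with commas and renders nulls as empty. The helper names `render` and `dense_vector_as_struct` are hypothetical; this mimics the output shown above and is not Spark's actual `Cast` code.

```python
def render(value):
    """Render nested tuples (structs) and lists (arrays) as bracketed strings.

    Assumed rule, inferred from the table above: fields are separated by ",",
    a non-null field after the first is preceded by a space, and a null field
    contributes nothing.
    """
    if isinstance(value, (tuple, list)):
        out = []
        for i, item in enumerate(value):
            if i > 0:
                out.append(",")
            if item is not None:
                if i > 0:
                    out.append(" ")
                out.append(render(item))
        return "[" + "".join(out) + "]"
    return str(value)


def dense_vector_as_struct(values):
    """Assumed struct layout for a dense vector: (type, size, indices, values)."""
    # type 1 marks a dense vector; size and indices are unused for it (null).
    return (1, None, None, [float(v) for v in values])


if __name__ == "__main__":
    for vec in ([1.0, 0.0], [0.0, 0.0]):
        print(render(dense_vector_as_struct(vec)))
```

    Under these assumptions the two null fields account for the run of commas in the `features` column, which is why the cast output differs from the vector's own `toString()` form.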

