Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/20306#discussion_r162293629
--- Diff:
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
---
@@ -838,6 +839,7 @@ case class Cast(child: Expression, dataType: DataType, timeZoneId: Option[String
|$evPrim = $buffer.build();
""".stripMargin
}
+ case pudt: PythonUserDefinedType => castToStringCode(pudt.sqlType, ctx)
--- End diff --
Yes, casting to string works.
Btw, as for `VectorUDT`, it seems that `DenseVector` and `SparseVector`
override `toString()` on purpose(?), at least for `show()`:
https://github.com/apache/spark/blob/74c17353bb6372b123c5aee1b6d58a21de36f99a/python/pyspark/ml/classification.py#L1497-L1503
If we also used cast to string for `show()`, the result would look like:
```
+-----------------+----------+
|         features|prediction|
+-----------------+----------+
|[1,,, [1.0, 0.0]]|       1.0|
|[1,,, [0.0, 0.0]]|       0.0|
+-----------------+----------+
```
I'm not sure we can change the string here.
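To make the difference concrete, here is a rough pure-Python sketch (no Spark needed) of why the struct form looks so odd. The `render` helper is hypothetical and only mimics the bracketed rendering seen in the output above; the `(type, size, indices, values)` layout for a dense vector and the compact vector-style string are assumptions for illustration, not something confirmed in this thread.

```python
# Hypothetical helper, not a Spark API: it mimics the bracketed rendering
# seen in the output above for nested structs and arrays.
def render(value):
    if value is None:
        return ""  # a null field shows up as an empty slot
    if isinstance(value, (list, tuple)):
        parts = []
        for i, item in enumerate(value):
            text = render(item)
            if i == 0:
                parts.append(text)
            else:
                # a comma always separates fields; a space precedes non-null ones
                parts.append("," + (" " + text if item is not None else ""))
        return "[" + "".join(parts) + "]"
    return str(value)


# Assumed internal layout of a dense vector: (type, size, indices, values).
dense_row = (1, None, None, [1.0, 0.0])

# What casting the underlying struct to string would produce.
struct_text = render(dense_row)

# Assumed compact form that a vector's own string method might produce.
vector_text = "[" + ",".join(str(v) for v in dense_row[3]) + "]"

print(struct_text)
print(vector_text)
```

The first line exposes the internal fields (including the empty slots for the null size and indices), while the second shows only the values, which is presumably what users expect from `show()`.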