Hi All,

I came across the two types MatrixUDT and VectorUDT in Spark ML while
doing feature extraction and preprocessing with PySpark. However, when
I tried some basic operations on them, such as vector multiplication and
matrix multiplication, I had to fall back to a Python UDF.

It seems to me that it would be very useful to have built-in operators on
these types, just as for first-class Spark SQL types, e.g.,

df.withColumn('v', df.matrix_column * df.vector_column)

What are other people's thoughts on this?

Li
