Hi Filipp,
your solution worked very well: thanks a lot!
Davide
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
Hi,
you can combine multiple columns using
org.apache.spark.sql.functions.struct and invoke the UDF on the resulting
column.
In that case your UDF has to accept a Row as an argument.
See VectorAssembler's sources for an example:
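A minimal sketch of this approach, assuming a DataFrame df with numeric columns "a" and "b" (the column names and the combining logic are hypothetical, just to illustrate the struct-plus-Row pattern):

```scala
import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.{struct, udf}

// Because struct() packs the input columns into a single struct column,
// the UDF receives them as a Row and can model interactions between them.
val combine = udf { (r: Row) =>
  r.getDouble(0) * r.getDouble(1) // hypothetical interaction term
}

// Apply the UDF to the struct of the two columns.
val result = df.withColumn("interaction", combine(struct(df("a"), df("b"))))
```

Field order inside the Row matches the order of the columns passed to struct(), so positional getters like getDouble(0) refer to the first column listed.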
Hi,
I am trying to write a custom ml.Transformer. It's a very simple row-by-row
transformation, but it takes into account multiple columns of the DataFrame
(and sometimes, interactions between columns).
I was wondering what the best way to achieve this is. I have used a UDF in
the Transformer
Hi Cesar,
these methods will remain private until the new ML API stabilizes (approx.
in Spark 1.4). My workaround for the same issue was to create an
org.apache.spark.ml package in my own project and extend/implement
everything there.
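As a sketch of that workaround (the class name is hypothetical, and the exact abstract methods to override vary across these early Spark versions): declaring your class inside Spark's own package makes members marked private[ml] visible to it.

```scala
// Declared in Spark's ML package, even though this file lives in your
// own project, so private[ml] members become accessible.
package org.apache.spark.ml

import org.apache.spark.ml.param.ParamMap
import org.apache.spark.sql.DataFrame

// Hypothetical custom transformer relying on package-private access.
class MyRowWiseTransformer extends Transformer {
  override def transform(dataset: DataFrame, paramMap: ParamMap): DataFrame = {
    // row-by-row logic goes here; returning the input unchanged as a stub
    dataset
  }
  // plus transformSchema and any other abstract members required
  // by the Spark version you compile against
}
```

The obvious caveat is that this couples your code to Spark internals, so it can break on any minor upgrade; it is a stopgap until the API opens up.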
Thanks,
Peter Rudenko
On 2015-02-18 22:17, Cesar Flores wrote:
I am working right now with the ML pipeline, which I really like.
However, in order to make real use of it, I would like to create my own
transformers that implement org.apache.spark.ml.Transformer. In order to
do that, a method from PipelineStage needs to be implemented. But this
method is
Hi Cesar,
Thanks for trying out Pipelines and bringing up this issue! It's been an
experimental API, but feedback like this will help us prepare it for
becoming non-Experimental. I've made a JIRA, and will vote for this being
protected (instead of private[ml]) for Spark 1.3: