Re: ML Transformer: create feature that uses multiple columns

2017-12-11 Thread davideanastasia
Hi Filipp, your solution worked very well: thanks a lot! Davide

Re: ML Transformer: create feature that uses multiple columns

2017-12-09 Thread Filipp Zhinkin
Hi, you can combine multiple columns using org.apache.spark.sql.functions.struct and invoke a UDF on the resulting column. In that case your UDF has to accept a Row as its argument. See VectorAssembler's sources for an example:
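The following is not VectorAssembler's source, only a minimal sketch of the struct-plus-UDF approach described in this message. It assumes Spark 2.x or later with a local SparkSession; the column names `width`, `height` and `area` and the object name `StructUdfSketch` are hypothetical and chosen purely for illustration.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.{col, struct, udf}

object StructUdfSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (an assumption, not from the thread).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("struct-udf-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical input columns.
    val df = Seq((1.0, 2.0), (3.0, 4.0)).toDF("width", "height")

    // The UDF receives the struct column as a Row and reads its fields by name.
    val area = udf { (r: Row) =>
      r.getAs[Double]("width") * r.getAs[Double]("height")
    }

    // struct(...) packs several columns into one, so a single-argument UDF
    // can see all of them at once.
    df.withColumn("area", area(struct(col("width"), col("height")))).show()

    spark.stop()
  }
}
```

Inside a custom Transformer, the same `withColumn` call would presumably live in the `transform` method; that placement is an assumption, since the thread does not show the final code.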

ML Transformer: create feature that uses multiple columns

2017-12-09 Thread davideanastasia
Hi, I am trying to write a custom ml.Transformer. It's a very simple row-by-row transformation, but it takes into account multiple columns of the DataFrame (and sometimes, interactions between columns). I was wondering what the best way to achieve this is. I have used a udf in the Transformer

Re: ML Transformer

2015-02-19 Thread Peter Rudenko
Hi Cesar, these methods will remain private until the new ML API stabilizes (approx. in Spark 1.4). My solution for the same issue was to create an org.apache.spark.ml package in my project and extend/implement everything there. Thanks, Peter Rudenko On 2015-02-18 22:17, Cesar Flores wrote: I
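The same-package workaround mentioned in this reply relies on Scala's qualified `private[ml]` visibility, which admits any code that declares itself inside a package named `ml` under the same path. Below is a minimal sketch of that mechanism only. It uses a hypothetical stand-in package `lib.ml` with made-up classes `Stage`, `AddColumn` and `Demo`; none of these are Spark's actual classes or signatures.

```scala
// Hypothetical stand-in for the library side; not Spark's actual classes.
package lib.ml {
  abstract class Stage {
    // Accessible only from code located inside the package lib.ml.
    private[ml] def transformSchema(columns: Seq[String]): Seq[String]
  }
}

// "User project" code: declaring the same package name grants access
// to the package-private member, so it can be implemented and called.
package lib.ml {
  class AddColumn(name: String) extends Stage {
    override private[ml] def transformSchema(columns: Seq[String]): Seq[String] =
      columns :+ name
  }

  object Demo {
    def main(args: Array[String]): Unit =
      println(new AddColumn("feature").transformSchema(Seq("id", "label")))
  }
}
```

The trade-off is that such code depends on internals that may change between releases, which is why relaxing the modifier to protected, as proposed elsewhere in this thread, would be the cleaner fix.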

ML Transformer

2015-02-18 Thread Cesar Flores
I am working right now with the ML pipeline, which I really like. However, in order to make real use of it, I would like to create my own transformers that implement org.apache.spark.ml.Transformer. In order to do that, a method from PipelineStage needs to be implemented. But this method is

Re: ML Transformer

2015-02-18 Thread Joseph Bradley
Hi Cesar, Thanks for trying out Pipelines and bringing up this issue! It's been an experimental API, but feedback like this will help us prepare it for becoming non-Experimental. I've made a JIRA, and will vote for this being protected (instead of private[ml]) for Spark 1.3: