[ https://issues.apache.org/jira/browse/SPARK-22021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-22021.
-------------------------------
    Resolution: Won't Fix

Agree, just can't imagine supporting JavaScript in just one corner of Spark.

> Add a feature transformation to accept a function and apply it on all rows of dataframe
> ---------------------------------------------------------------------------------------
>
>                 Key: SPARK-22021
>                 URL: https://issues.apache.org/jira/browse/SPARK-22021
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.3.0
>            Reporter: Hosur Narahari
>
> We often generate derived features in an ML pipeline by applying a
> mathematical or other kind of operation to columns of a dataframe, such as
> summing a few columns into a new column or, when there is a text field such
> as a message, taking the length of that message. We currently don't have an
> efficient way to handle such scenarios in an ML pipeline.
> A transformer that accepts a function and applies it to the specified
> columns to generate an output column of numerical type would give the user
> the flexibility to derive features by applying any domain-specific logic.
> Example:
> val function = "function(a,b) { return a+b;}"
> val transformer = new GenFuncTransformer().setInputCols(Array("v1", "v2")).setOutputCol("result").setFunction(function)
> val df = Seq((1.0, 2.0), (3.0, 4.0)).toDF("v1", "v2")
> val result = transformer.transform(df)
> result.show
> v1   v2   result
> 1.0  2.0  3.0
> 3.0  4.0  7.0
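
For comparison, the column sum from the example in the quoted issue can be sketched with Spark's existing `SQLTransformer`, which takes a SQL statement instead of a JavaScript function. This is only an illustration, not something proposed or endorsed in this thread; the object name `DerivedFeatureSketch`, the helper `derive`, and the local `SparkSession` settings are assumptions made for the example.

```scala
import org.apache.spark.ml.feature.SQLTransformer
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical wrapper object; the name is an assumption for this sketch.
object DerivedFeatureSketch {

  // Adds a "result" column holding v1 + v2, using the built-in SQLTransformer.
  // __THIS__ is the placeholder SQLTransformer substitutes with the input dataframe.
  def derive(df: DataFrame): DataFrame = {
    val transformer = new SQLTransformer()
      .setStatement("SELECT *, (v1 + v2) AS result FROM __THIS__")
    transformer.transform(df)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("derived-feature-sketch")
      .getOrCreate()
    import spark.implicits._

    // Same input data as the example in the issue description.
    val df = Seq((1.0, 2.0), (3.0, 4.0)).toDF("v1", "v2")
    derive(df).show()

    spark.stop()
  }
}
```

Because `SQLTransformer` is an ordinary `Transformer`, it can be placed in a `Pipeline` stage list like any other feature transformer. Logic that is awkward to express in SQL would need a different approach, such as a user-defined function.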