Since N is decided at runtime, the first idea that comes to mind is to transform
the columns into one vector column (VectorAssembler can do that) and then let a
UDF handle the vector, just like many ML transformers do.

 

From: anup ahire <ahirea...@gmail.com>
Date: Wednesday, March 15, 2017 at 2:04 PM
To: <user@spark.apache.org>
Subject: apply UDFs to N columns dynamically in dataframe

 

Hello,

 

I have a schema and the names of the columns to apply a UDF to. The column names
are user input, and their number can differ for each input.

 

Is there a way to apply UDFs to N columns in a dataframe?

 

 

Thanks !
