Re: UDF on multiple columns

2016-10-12 Thread Meeraj Kunnumpurath
This is what I do at the moment:

def build(path: String, spark: SparkSession) = {
  val toDouble = udf((x: String) => x.toDouble)
  val df = spark.read.
    option("header", "true").
    csv(path).
    withColumn("sqft_living", toDouble('sqft_living)).
    withColumn("price", toDouble('price)).

UDF on multiple columns

2016-10-12 Thread Meeraj Kunnumpurath
Hello, how do I write a UDF that operates on two columns? For example, how do I introduce a new column that is the product of two columns already in the DataFrame? Many thanks, Meeraj
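
For reference, one possible way to do this is sketched below. It is only an illustration, not taken from the thread: the object name ProductColumn, the helper names, the output column name "product", and the sample rows are all hypothetical, and the columns are assumed to already be of type Double. The idea is that udf accepts a function of two arguments, and the resulting UDF is then applied to two columns.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

object ProductColumn {
  // A UDF built from a two-argument function; it is applied to two columns.
  val multiply = udf((a: Double, b: Double) => a * b)

  // Adds a "product" column (hypothetical name) computed by the UDF.
  // Assumes both input columns are already of type Double.
  def withProduct(df: DataFrame): DataFrame =
    df.withColumn("product", multiply(col("sqft_living"), col("price")))

  // The same column computed with a built-in column expression, no UDF.
  def withProductExpr(df: DataFrame): DataFrame =
    df.withColumn("product", col("sqft_living") * col("price"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("product-udf-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample rows, for illustration only.
    val df = Seq((1180.0, 221900.0), (2570.0, 538000.0))
      .toDF("sqft_living", "price")

    withProduct(df).show()
    withProductExpr(df).show()

    spark.stop()
  }
}
```

For a plain product, the built-in expression in withProductExpr is probably the simpler choice, since Spark can optimize column expressions but treats a UDF as a black box. The two-argument UDF form is more useful when the combining logic cannot be written with the built-in column functions.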