Hi Divya,
You can use the withColumn method from the DataFrame API. Here is the method 
signature:

def withColumn(colName: String, col: Column): DataFrame

(Column: http://spark.apache.org/docs/latest/api/scala/org/apache/spark/sql/Column.html)
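For your two-file use case, a minimal sketch of how this could fit together, assuming Spark 1.5 with spark-csv and a SQLContext — the file paths, the join key `id`, and the condition column `flag` are made-up placeholders, not from your data:

```scala
import org.apache.spark.sql.functions.{col, lit, when}

// Read both input files with spark-csv (paths are hypothetical).
val df1 = sqlContext.read.format("com.databricks.spark.csv")
  .option("header", "true").load("input1.csv")
val df2 = sqlContext.read.format("com.databricks.spark.csv")
  .option("header", "true").load("input2.csv")

// Bring the condition column from the second file alongside the first
// via a join, then derive the new column with withColumn.
val joined = df1.join(df2, df1("id") === df2("id"), "left_outer")
val result = joined.withColumn("newCol",
  when(col("flag") === "Y", lit("matched")).otherwise(lit("unmatched")))

// Save the result back out with spark-csv.
result.write.format("com.databricks.spark.csv")
  .option("header", "true").save("output")
```

The key idea is that withColumn only adds a column derived from the same DataFrame, so data from the second file has to be joined in first.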


Mohammed
Author: Big Data Analytics with 
Spark<http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>

From: Divya Gehlot [mailto:divya.htco...@gmail.com]
Sent: Thursday, February 4, 2016 1:29 AM
To: user @spark
Subject: add new column in the schema + Dataframe

Hi,
I am a beginner in Spark, using Spark 1.5.2 on YARN (HDP 2.3.4).
I have a use case where I have to read two input files and, based on certain
conditions in the second input file, add a new column to the first input
file and save it.

I am using spark-csv to read my input files.
I would really appreciate it if somebody would share their thoughts on the
best/feasible way of doing this (using the DataFrame API).


Thanks,
Divya
