Hi,

I am a beginner in Spark and am using Spark 1.5.2 on YARN (HDP 2.3.4). I have a use case where I need to read two input files and, based on certain conditions in the second input file, add a new column to the first input file and save the result.
I am using spark-csv to read my input files. I would really appreciate it if somebody could share their thoughts on the best or most feasible way of doing this using the DataFrame API.

Thanks,
Divya