Re: adding rows to a DataFrame

2016-03-11 Thread Bijay Pathak
Here is another way you can achieve that (in Python):
base_df.withColumn("column_name","column_expression_for_new_column")
# To add a new row, create the data frame containing the new row and do the unionAll()
base_df.unionAll(new_df)
# Another approach: convert to rdd, add required fields and convert

Re: adding rows to a DataFrame

2016-03-11 Thread Jan Štěrba
It very much depends on the logic that generates the new rows. If it is per row (i.e. without context), then you can just convert to RDD and perform a map operation on each row.
JavaPairRDD grouped = dataFrame.javaRDD().groupBy( group by what you need, probably ID ); return

Re: adding rows to a DataFrame

2016-03-11 Thread Michael Armbrust
Or look at explode on DataFrame.
On Fri, Mar 11, 2016 at 10:45 AM, Stefan Panayotov wrote:
> Hi,
>
> I have a problem that requires me to go through the rows in a DataFrame
> (or possibly through rows in a JSON file) and conditionally add rows
> depending on a value in one of

Re: adding rows to a DataFrame

2016-03-11 Thread Jacek Laskowski
Just a guess... flatMap?
Jacek
11.03.2016 7:46 PM "Stefan Panayotov" wrote:
> Hi,
>
> I have a problem that requires me to go through the rows in a DataFrame
> (or possibly through rows in a JSON file) and conditionally add rows
> depending on a value in one of the
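A minimal sketch of the flatMap idea, for illustration only. The original question is cut off before it states the actual condition, so the rule here (a hypothetical `THRESHOLD` on the third column) and the names `expand_row` and `flat_map` are assumptions, not anything from the thread. The `flat_map` helper is a plain-Python stand-in so the logic can be checked without a cluster.

```python
from itertools import chain

THRESHOLD = 10  # hypothetical cutoff, not taken from the thread


def expand_row(row):
    """Return the original row plus any extra rows it triggers."""
    out = [tuple(row)]
    # Hypothetical rule: if the third column exceeds THRESHOLD,
    # emit one additional derived row.
    if row[2] > THRESHOLD:
        out.append((row[0], row[1], row[2] - THRESHOLD))
    return out


def flat_map(func, rows):
    """Local stand-in for RDD.flatMap: apply func, then flatten one level."""
    return list(chain.from_iterable(func(r) for r in rows))


rows = [(1, "a", 5), (2, "b", 12)]
expanded = flat_map(expand_row, rows)

# With a Spark DataFrame `df`, the same per-row function could be passed
# as df.rdd.flatMap(expand_row) and the result converted back to a DataFrame.
```

Because each input row maps to a list of zero or more output rows, this fits only the per-row case; logic that needs to see several rows at once would call for grouping first.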

adding rows to a DataFrame

2016-03-11 Thread Stefan Panayotov
Hi, I have a problem that requires me to go through the rows in a DataFrame (or possibly through rows in a JSON file) and conditionally add rows depending on a value in one of the columns in each existing row. So, for example, if I have:
+---+---+---+
| _1| _2| _3|
+---+---+---+