Hello Victor,

I want to do this on multiple columns. With Sean's help I was able to do it on one column, using the code below:

    val matData = file.map(_.split(";"))
    val stats = matData.map(_(2).toDouble).stats()
    stats.mean
    stats.max

Thank you
Vineet

From: Victor Tso-Guillen [mailto:v...@paxata.com]
Sent: Monday, 25 August 2014 18:34
To: Hingorani, Vineet
Cc: user@spark.apache.org
Subject: Re: Manipulating columns in CSV file or Transpose of Array[Array[String]] RDD

Do you want to do this on one column or all numeric columns?

On Mon, Aug 25, 2014 at 7:09 AM, Hingorani, Vineet <vineet.hingor...@sap.com> wrote:

Hello all,

Could someone help me with manipulating CSV file data? I have semicolon-separated CSV data containing doubles and strings, and I want to calculate the maximum/average of a column. When I read the file using sc.textFile("test.csv").map(_.split(";")), each field is read as a string. Could someone explain how to do the above calculation? Or is there perhaps some way to take the transpose of the data and then manipulate the rows?

Thank you in advance; I have been struggling with this for quite some time.

Regards,
Vineet
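
The single-column approach quoted in this thread can be repeated per column index to cover several columns. The following is only a minimal sketch, not a reply from anyone on the thread. It assumes a spark-shell session (so `sc` is predefined), the file name test.csv from the original question, and that columns 2 and 3 hold the numeric data; `numericCols` and `statsByCol` are hypothetical names introduced for illustration.

```scala
// Assumes a spark-shell session, where `sc` (the SparkContext) is predefined.
import org.apache.spark.util.StatCounter

// Hypothetical: zero-based indices of the numeric columns in test.csv.
val numericCols = Seq(2, 3)

// Split each line on ';' and cache the result, because every column
// below triggers its own pass over the data.
val matData = sc.textFile("test.csv").map(_.split(";")).cache()

// Compute one StatCounter per column, keyed by column index.
val statsByCol: Map[Int, StatCounter] =
  numericCols.map(i => i -> matData.map(_(i).toDouble).stats()).toMap

// Print the mean and maximum of each selected column.
statsByCol.foreach { case (i, s) =>
  println(s"column $i: mean=${s.mean} max=${s.max}")
}
```

This runs one Spark job per column, which keeps the code close to the single-column version at the cost of extra passes; with many columns, a single aggregation that merges values into one StatCounter per column would likely be cheaper. A header row or a non-numeric field in a selected column would make `toDouble` throw, so such rows would need filtering first.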