Hello Victor,

I want to do it on multiple columns. With Sean's help I was able to do it on 
one column, using the code below.


val matData = file.map(_.split(";"))

val stats = matData.map(_(2).toDouble).stats()

stats.mean
stats.max

Thank you

Vineet
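
One possible way to extend the single-column code above to several columns is 
to keep one StatCounter per column and fill them all in a single pass with 
RDD.aggregate. This is only a sketch under assumptions: the object name 
MultiColumnStats, the helper columnStats, the column indices and the three 
sample rows are made up for illustration, and every selected column is assumed 
to parse as a Double on every row (a header line or an empty field would throw 
a NumberFormatException).

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD
import org.apache.spark.util.StatCounter

// Hypothetical helper object; the names here are not from the thread.
object MultiColumnStats {

  // Returns one StatCounter per entry of `cols`, in the same order.
  // Assumes every selected field parses as a Double.
  def columnStats(rows: RDD[Array[String]], cols: Seq[Int]): Array[StatCounter] =
    rows.aggregate(Array.fill(cols.length)(new StatCounter()))(
      // seqOp: fold one row into the per-column counters of a partition
      (acc, row) => {
        cols.indices.foreach(i => acc(i).merge(row(cols(i)).toDouble))
        acc
      },
      // combOp: merge the counters coming from two partitions
      (a, b) => {
        a.indices.foreach(i => a(i).merge(b(i)))
        a
      }
    )

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("MultiColumnStats").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample rows standing in for sc.textFile("test.csv")
      val file = sc.parallelize(Seq("a;1.0;10.0", "b;2.0;30.0", "c;3.0;20.0"))
      val matData = file.map(_.split(";"))

      val cols = Seq(1, 2) // indices of the numeric columns (assumed)
      val stats = columnStats(matData, cols)

      cols.zip(stats).foreach { case (c, s) =>
        println(s"column $c: mean=${s.mean} max=${s.max}")
      }
    } finally {
      sc.stop()
    }
  }
}
```

With this approach no transpose is needed: each row updates all the selected 
columns at once, and stats(i).mean / stats(i).max play the same role as 
stats.mean / stats.max in the one-column version.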

From: Victor Tso-Guillen [mailto:v...@paxata.com]
Sent: Monday, 25 August 2014 18:34
To: Hingorani, Vineet
Cc: user@spark.apache.org
Subject: Re: Manipulating columns in CSV file or Transpose of 
Array[Array[String]] RDD

Do you want to do this on one column or all numeric columns?

On Mon, Aug 25, 2014 at 7:09 AM, Hingorani, Vineet 
<vineet.hingor...@sap.com> wrote:

Hello all,

Could someone help me manipulate CSV file data? I have semicolon-separated 
CSV data containing doubles and strings, and I want to calculate the 
maximum/average of a column. When I read the file using 
sc.textFile("test.csv").map(_.split(";")), each field is read as a string. 
How can I convert the fields and do this calculation?

Or is there some way to take the transpose of the data and then manipulate 
the rows instead?

Thank you in advance. I have been struggling with this for quite some time.

Regards,
Vineet
