Hello all,
Could someone help me with manipulating CSV file data? I have
semicolon-separated CSV data containing both doubles and strings. I want to
calculate the maximum/average of a column. When I read the file using
sc.textFile("test.csv").map(_.split(";")), each field is read as a string.
Do you want to do this on one column or all numeric columns?
On Mon, Aug 25, 2014 at 7:09 AM, Hingorani, Vineet vineet.hingor...@sap.com
wrote:
Hello Victor,
I want to do it on multiple columns. I was able to do it on one column with
Sean's help, using the code below.
val matData = file.map(_.split(";"))
val stats = matData.map(_(2).toDouble).stats()
stats.mean
stats.max
Thank you
Vineet
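For reference, here is a minimal sketch of how the one-column snippet above might extend to several columns. It is not code from this thread: ColumnStats, Stats, pick and columnStats are hypothetical names, and a plain Scala Seq stands in for the RDD so the example runs without a Spark cluster. It assumes every selected field parses as a double.

```scala
// Hypothetical helper, not from the thread. A plain Seq stands in for the RDD.
object ColumnStats {
  // Running count, sum and max for a single column.
  final case class Stats(count: Long, sum: Double, max: Double) {
    // Fold one more value into the running statistics.
    def add(x: Double): Stats =
      Stats(count + 1, sum + x, math.max(max, x))
    // Combine two partial results (e.g. from two partitions).
    def merge(o: Stats): Stats =
      Stats(count + o.count, sum + o.sum, math.max(max, o.max))
    def mean: Double = if (count == 0) Double.NaN else sum / count
  }

  val empty: Stats = Stats(0L, 0.0, Double.NegativeInfinity)

  // Split a line on ';' and keep only the requested columns as doubles.
  // Assumption: each selected field is numeric; toDouble throws otherwise.
  def pick(line: String, cols: Seq[Int]): Array[Double] = {
    val fields = line.split(";")
    cols.map(i => fields(i).trim.toDouble).toArray
  }

  // Fold every row into one Stats per selected column.
  def columnStats(lines: Seq[String], cols: Seq[Int]): Array[Stats] =
    lines.map(pick(_, cols)).foldLeft(Array.fill(cols.length)(empty)) {
      (acc, row) => acc.zip(row).map { case (s, x) => s.add(x) }
    }
}
```

For example, ColumnStats.columnStats(Seq("a;1.5;10.0", "b;2.5;30.0", "c;3.5;20.0"), Seq(1, 2)) gives a mean of 2.5 for column 1 and a max of 30.0 for column 2. On an actual RDD, the add/merge pair has the shape that RDD.aggregate expects for its two functions, so the same idea should carry over, though that part is untested here.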
From: Victor Tso-Guillen
Assuming the CSV is well-formed (every row has the same number of columns)
and every column is a number, this is how you can do it. You can adjust so
that you pick just the columns you want, of course, by mapping each row to
a new Array that contains just the column values you want. Just be sure