Hi! I'm new to Spark and trying to write my first Spark job on some data I have. The data is in this Parquet format:
Code, timestamp, value
A, 2017-01-01, 123
A, 2017-01-02, 124
A, 2017-01-03, 126
B, 2017-01-01, 127
B, 2017-01-02, 126
B, 2017-01-03, 123

I want to write a little map-reduce application that must be run per 'code'. So I would need to group the data on the 'code' column and then execute the map and reduce steps for each code; twice in this example, once for A and once for B. But when I group the data (with the groupBy function), it returns a RelationalGroupedDataset, and I cannot apply the map and reduce functions to that. I have the feeling that I am heading in the wrong direction. Does anyone know how to approach this? (I hope I explained it well enough to be understood :))

Regards,
Marco
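To make the intended per-code map/reduce concrete, here is a minimal plain-Python sketch (no Spark) of the logic I'm after; the sample rows mirror the data above, and the sum is just a placeholder reduce:

```python
from itertools import groupby
from operator import itemgetter

# Sample rows mirroring the Parquet data: (code, timestamp, value).
rows = [
    ("A", "2017-01-01", 123),
    ("A", "2017-01-02", 124),
    ("A", "2017-01-03", 126),
    ("B", "2017-01-01", 127),
    ("B", "2017-01-02", 126),
    ("B", "2017-01-03", 123),
]

# Group on the 'code' column, then run a map step and a reduce step per group.
results = {}
for code, group in groupby(sorted(rows, key=itemgetter(0)), key=itemgetter(0)):
    values = [value for _, _, value in group]  # map step: extract the value column
    results[code] = sum(values)                # reduce step: placeholder aggregation

print(results)  # {'A': 373, 'B': 376}
```

This is the behaviour I want to reproduce in Spark, with the map and reduce applied independently within each group.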