Hi,

What is the most efficient way to perform a group-by operation in Spark and
merge the rows of each group into a single comma-separated string?

Here is the current RDD:
-----------------
ID   STATE
-----------------
1    TX
1    NY
1    FL
2    CA
2    OH
-----------------

This is the required output:
-------------------------
ID    CSV_STATE
-------------------------
1     TX,NY,FL
2     CA,OH
-------------------------
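
For reference, one possible approach is sketched below in PySpark. It assumes the RDD holds (ID, STATE) tuples; the function names add_state, merge_lists and csv_states_by_id are made up for illustration. It uses aggregateByKey so that states are collected into per-partition lists before the shuffle, then joins each list with commas. This is only a sketch, not a benchmarked answer to the "most efficient" part.

```python
def add_state(acc, state):
    """seqFunc: fold one state into the list built within a partition."""
    return acc + [state]


def merge_lists(left, right):
    """combFunc: merge lists for the same ID built on different partitions."""
    return left + right


def csv_states_by_id(pairs):
    """Turn an RDD of (ID, STATE) tuples into an RDD of (ID, "S1,S2,...").

    `pairs` is assumed to be a pyspark RDD; aggregateByKey and mapValues
    are standard RDD methods.
    """
    return pairs.aggregateByKey([], add_state, merge_lists).mapValues(",".join)
```

With an existing SparkContext sc, this could be called as csv_states_by_id(sc.parallelize([(1, "TX"), (1, "NY"), (1, "FL"), (2, "CA"), (2, "OH")])).collect(). Note that the order of states within each string is not guaranteed once the data has been shuffled across partitions. If the data is a DataFrame rather than an RDD, grouping by ID and applying concat_ws(",", collect_list("STATE")) from pyspark.sql.functions is a comparable option.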
