Hi, what is the most efficient way to perform a group-by operation in Spark and merge the grouped rows into a single comma-separated (CSV) string per key?
Here is the current RDD:

-----------------
ID    STATE
-----------------
1     TX
1     NY
1     FL
2     CA
2     OH
-----------------

This is the required output:

-------------------------
ID    CSV_STATE
-------------------------
1     TX,NY,FL
2     CA,OH
-------------------------
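To make the transformation concrete, here is a minimal PySpark sketch of one possible approach, not necessarily the most efficient one. It assumes the RDD holds (ID, STATE) tuples with string states; the helper names append_state, merge_states and to_csv_by_id are made up for illustration. It uses the RDD methods aggregateByKey and mapValues to build a list of states per ID and then join it with commas.

```python
from functools import reduce


def append_state(states, state):
    """Within a partition: add one state to the running list for an ID."""
    states.append(state)
    return states


def merge_states(left, right):
    """Across partitions: concatenate two partial lists for the same ID."""
    left.extend(right)
    return left


def to_csv_by_id(pairs_rdd):
    """Assumes pairs_rdd is an RDD of (ID, STATE) tuples (hypothetical input).

    Returns an RDD of (ID, "STATE1,STATE2,...") tuples.
    """
    return (pairs_rdd
            .aggregateByKey([], append_state, merge_states)
            .mapValues(",".join))


# Local check of the two helpers without a Spark cluster:
# pretend ID 1 is split across two partitions.
partition_a = reduce(append_state, ["TX", "NY"], [])
partition_b = reduce(append_state, ["FL"], [])
csv_state = ",".join(merge_states(partition_a, partition_b))
```

With an existing SparkContext (assumed here to be called sc), this could be run as to_csv_by_id(sc.parallelize([(1, "TX"), (1, "NY"), (1, "FL"), (2, "CA"), (2, "OH")])).collect(). Note that Spark does not guarantee the order of values within a key after a shuffle, so the states in each CSV string may not appear in the original row order. Because every value is kept, this moves about as much data as groupByKey would; it is a sketch of the shape of the solution rather than a benchmarked answer.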