Hi Daniel,
This is the user mailing list for Apache Hadoop, not Apache Spark.
Please use the Spark user mailing list instead:
https://spark.apache.org/community.html
-Akira
On Tue, Dec 3, 2019 at 1:00 AM Daniel Zhang wrote:
Hi, Spark Users:
I have a question about how I use the Spark Dataset API for my case.
Suppose the "ds_old" dataset has 100 records with 10 unique $"col1" values,
and consider the following pseudo-code:
val ds_new = ds_old
  .repartition(5, $"col1")
  .sortWithinPartitions($"col2")
  .mapPartitions(new
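A minimal runnable sketch of the pattern above, assuming a local SparkSession and toy data standing in for "ds_old" (the object name, column types, and the tuple-returning mapPartitions body are illustrative assumptions, since the original message is truncated):

```scala
import org.apache.spark.sql.SparkSession

object RepartitionSketch {
  // Returns the number of partitions of the transformed dataset.
  def run(): Int = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("repartition-sketch")
      .getOrCreate()
    import spark.implicits._

    // Toy stand-in for ds_old: 100 rows, 10 distinct col1 values.
    val ds_old = (1 to 100).map(i => (i % 10, i)).toDF("col1", "col2")

    // repartition(5, $"col1") hash-partitions by col1, so every row sharing
    // a col1 value lands in the same one of the 5 partitions (several col1
    // values may share a partition). sortWithinPartitions($"col2") then
    // orders each partition by col2 without a global sort.
    val ds_new = ds_old
      .repartition(5, $"col1")
      .sortWithinPartitions($"col2")
      .mapPartitions { rows =>
        // Each iterator sees one whole partition, already sorted by col2.
        rows.map(r => (r.getInt(0), r.getInt(1)))
      }

    val numPartitions = ds_new.rdd.getNumPartitions
    spark.stop()
    numPartitions
  }

  def main(args: Array[String]): Unit = {
    println(run())
  }
}
```

Note this guarantees only co-location per col1 value within a partition, not one col1 value per partition, since 10 keys hash into 5 partitions.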