repartition() puts all values with the same key in one partition, but
other keys can end up in that same partition too. It sounds like you want
groupBy, not repartition, if you want to handle each key separately.
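
To make the difference concrete, here is a minimal sketch, not a definitive implementation. It assumes a simplified Data class (the extra `value` field is hypothetical, standing in for the fields elided in the question), a local SparkSession, and the typed `groupByKey`/`mapGroups` calls as the Dataset counterpart of groupBy. The helper name `countsPerKey` is made up for illustration.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Simplified stand-in for the Data type in the question; `value` is a
// hypothetical field replacing the ones elided there.
case class Data(col: String, value: Int)

object GroupByKeySketch {
  // Handle each key's rows as its own group, regardless of how the
  // rows are laid out across partitions.
  def countsPerKey(ds: Dataset[Data]): Dataset[(String, Int)] = {
    import ds.sparkSession.implicits._
    ds.groupByKey(_.col)
      .mapGroups { (key, rows) => (key, rows.size) }
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("groupByKey-sketch")
      .getOrCreate()
    import spark.implicits._

    val ds = Seq(Data("a", 1), Data("b", 2), Data("a", 3), Data("c", 4)).toDS()

    // repartition by column: equal keys land in the same partition, but with
    // 3 distinct keys and 2 partitions, at least one partition must hold
    // more than one key.
    ds.repartition(2, $"col")
      .mapPartitions(it => Iterator(it.map(_.col).toSet.toSeq.sorted.mkString(",")))
      .show()

    // groupByKey: the function passed to mapGroups sees exactly one key at a time.
    countsPerKey(ds).show()

    spark.stop()
  }
}
```

The function passed to `mapGroups` is where per-key processing would go; counting rows is only a placeholder for whatever needs to happen to all the 'a' rows or all the 'b' rows.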
On Mon, Jun 20, 2022 at 8:26 AM DESCOTTE Loic - externe
wrote:
Hi,
I have a data type like this:
case class Data(col: String, ...)
and a Dataset[Data] ds. Some rows have col set to 'a', others to 'b', etc.
I want to process all rows with 'a' separately from all rows with 'b'. But
I also need to have all the 'a' in the