Hello, I'm trying to assign a unique (and deterministic) ID to a globally sorted DataSet.
Given a DataSet of String, I'm computing the frequency of each label as follows: val env = ExecutionEnvironment.getExecutionEnvironment val data = env.fromCollection(List("a","b","c","a","a","d","a","a","a","b","b","c","a","c","b","c")) val mapping = data.map(s => (s,1)) .groupBy(0) .reduce((a,b) => (a._1, a._2 + b._2)) .partitionByRange(1) .sortPartition(1, Order.DESCENDING) Then I want the most frequent label to be ID 0 and so on *in a decreasing order*. My idea was to use zipWithIndex. val result = mapping.zipWithIndex But this does not guarantee that the global order will be preserved right ? What can I do to get such mapping ? Thanks Regards Thomas