Are there plans to add reduceByKey to DataFrames? Since switching over to
Spark 2, I find myself increasingly dissatisfied with converting
DataFrames to RDDs to do procedural programming on grouped data, both
from an ease-of-programming standpoint and a performance standpoint. So
I've been using the experimental groupByKey and flatMapGroups on
DataFrames, which perform extremely well (I'm guessing because of the
encoders), but the amount of data being transferred is a little
excessive. Are there any plans to port reduceByKey (and additionally a
reduceByKeyLeft and reduceByKeyRight)?
