Re: Renaming a DataFrame column makes Spark lose partitioning information

2020-08-05 Thread Antoine Wendlinger
AS c#11]
> +- Exchange hashpartitioning(b#8, 200), false, [id=#12]
>    +- LocalTableScan [a#7, b#8]
>
> Thanks,
> Terry
>
> On Tue, Aug 4, 2020 at 6:26 AM Antoine Wendlinger <awendlin...@mytraffic.fr> wrote:
>
>> Hi,
>>
>> When ren

Renaming a DataFrame column makes Spark lose partitioning information

2020-08-04 Thread Antoine Wendlinger
Hi,

When renaming a DataFrame column, it looks like Spark is forgetting the partition information:

    Seq((1, 2))
      .toDF("a", "b")
      .repartition($"b")
      .withColumnRenamed("b", "c")
      .repartition($"c")
      .explain()

Gives the following plan:

== Physical Plan ==