Github user BryanCutler commented on the issue:

    https://github.com/apache/spark/pull/20280

Also, this will cause a breaking change if `Row`s are defined with kwargs and the schema changes the field names, like this:

```
data = [Row(key=i, value=str(i)) for i in range(100)]
rdd = self.sc.parallelize(data, 5)
df = rdd.toDF("a: int, b: string")
```

This version would still work, but might be slower depending on how complicated the schema is, because the field names are now looked up by name instead of matched by position:

```
df = rdd.toDF("key: int, value: string")
```

So if we go forward with this fix, I should probably add something to the migration guide.
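To make the trade-off concrete, here is a minimal plain-Python sketch (no Spark required) of positional versus name-based field matching. The helpers `to_tuple_by_position` and `to_tuple_by_name` are hypothetical names for illustration, not PySpark APIs; the dicts stand in for kwargs-defined `Row`s, whose fields PySpark sorts alphabetically.

```
# Rows defined with kwargs, analogous to Row(key=i, value=str(i)).
rows = [dict(key=i, value=str(i)) for i in range(3)]

def to_tuple_by_position(row, schema_fields):
    # Positional matching: take values in row order, ignoring names.
    # A schema that renames fields ("a", "b") still "works", silently
    # pairing a <- key and b <- value.
    return tuple(row.values())

def to_tuple_by_name(row, schema_fields):
    # Name-based matching: look each schema field up by name. This costs
    # a lookup per field and raises KeyError if the schema renames fields.
    return tuple(row[name] for name in schema_fields)

print(to_tuple_by_position(rows[0], ["a", "b"]))    # (0, '0')
print(to_tuple_by_name(rows[0], ["key", "value"]))  # (0, '0')
try:
    to_tuple_by_name(rows[0], ["a", "b"])
except KeyError as e:
    print("schema rename breaks name-based matching:", e)
```

This is why the renamed-schema example above would break under the fix, while the matching-names version keeps working at the cost of per-field lookups.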