GitHub user BryanCutler commented on the issue: https://github.com/apache/spark/pull/20280

Let me restate what I think the intended behavior of `Row` is:

If a `Row` is made from kwargs, then the order of the fields cannot be relied upon, and data must always be accessed by field name, as with a dict. Because of this, when applying a schema to the data, the schema fields must also be fields in the `Row` objects. Field position can change as long as the name matches.

If a `Row` is made by generating a custom class, like `TestRow = Row("key", "value")` then `row = TestRow('a', 1)`, then the schema will be applied based on position, and the elements in the `Row` objects are accessed by index. The name of each field in the schema can differ as long as the element at that index can be converted to the specified schema type.
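
To make the two cases concrete, here is a minimal pure-Python sketch of the behavior as I understand it. It does not use PySpark and is not how Spark implements this: `apply_schema_by_name` and `apply_schema_by_position` are hypothetical helpers, a dict stands in for a `Row` made from kwargs, a tuple stands in for a `Row` made from a custom class, and the schema is modeled as a list of `(name, converter)` pairs.

```python
# Hypothetical model of the two Row behaviors; not PySpark code.

def apply_schema_by_name(record, schema):
    """Model of a kwargs Row: values are looked up by field name.

    record: dict mapping field name -> value (field order is irrelevant).
    schema: list of (name, converter) pairs.
    """
    missing = [name for name, _ in schema if name not in record]
    if missing:
        # Every schema field must also be a field of the record.
        raise KeyError(f"schema fields missing from record: {missing}")
    return tuple(convert(record[name]) for name, convert in schema)


def apply_schema_by_position(values, schema):
    """Model of a custom-class Row: values are matched by index.

    values: tuple of elements in declaration order.
    schema: list of (name, converter) pairs; the names are ignored.
    """
    if len(values) != len(schema):
        raise ValueError("number of values does not match number of schema fields")
    return tuple(convert(value) for value, (_, convert) in zip(values, schema))


schema = [("key", str), ("value", int)]

# Case 1: fields given in a different order still line up by name.
by_name = apply_schema_by_name({"value": 1, "key": "a"}, schema)

# Case 2: schema names differ from the Row's, but positions and types match.
renamed_schema = [("k", str), ("v", int)]
by_position = apply_schema_by_position(("a", 1), renamed_schema)
```

In this model, applying `renamed_schema` by name to the same dict would fail, because neither `k` nor `v` is a field of the record. That is the asymmetry between the two cases.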