Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20280#discussion_r161964785
  
    --- Diff: python/pyspark/sql/tests.py ---
    @@ -2306,18 +2306,20 @@ def test_toDF_with_schema_string(self):
             self.assertEqual(df.schema.simpleString(), "struct<key:string,value:string>")
             self.assertEqual(df.collect(), [Row(key=str(i), value=str(i)) for i in range(100)])
     
    -        # field names can differ.
    -        df = rdd.toDF(" a: int, b: string ")
    --- End diff --
    
    Yeah, I think you're right. The schema should be able to override names no
    matter what. I was thinking the test was flawed because it relied on the Row
    fields being sorted internally, so just changing the kwarg names (not their
    order) could cause it to fail. That seems a little strange, but maybe it is
    the intent?
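
    To illustrate the concern, here is a minimal sketch in plain Python. It does
    not use Spark: `make_row` and `apply_schema_by_position` are hypothetical
    helpers that only mimic the behavior described above, on the assumption that
    a Row built from kwargs stores its fields sorted by name and that a schema
    string is applied to those fields by position.

```python
def make_row(**kwargs):
    """Hypothetical stand-in for a Row built from kwargs.

    Assumption: fields are stored sorted by name, not in call order.
    """
    names = sorted(kwargs)
    return names, tuple(kwargs[n] for n in names)


def apply_schema_by_position(row, schema_names):
    """Hypothetical helper: pair schema names with row values by position."""
    _, values = row
    return dict(zip(schema_names, values))


schema = ["a", "b"]  # stands in for the schema string " a: int, b: string "

# Same kwarg order in both calls; only the first kwarg's name differs.
original = make_row(key=1, value="1")
renamed = make_row(zkey=1, value="1")

by_pos_original = apply_schema_by_position(original, schema)
by_pos_renamed = apply_schema_by_position(renamed, schema)

print(by_pos_original)  # the int lands under "a", the string under "b"
print(by_pos_renamed)   # the values swap because "zkey" sorts after "value"
```

    Under these assumptions, renaming `key` to `zkey` moves the int into the
    second position, so a positional `a: int, b: string` schema no longer
    matches the data even though the caller's kwarg order never changed.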


