[ https://issues.apache.org/jira/browse/SPARK-38839 ]
Daniel deCordoba updated SPARK-38839:
-------------------------------------
Description:

When creating a DataFrame with createDataFrame from data that contains a float inside a struct, the float is set to null. This only happens when the data is a list of dictionaries; with a list of Rows it works fine:

{code:python}
from pyspark.sql import Row

# `spark` is an existing SparkSession (e.g., the pyspark shell).
data = [{"MyStruct": {"MyInt": 10, "MyFloat": 10.1}, "MyFloat": 10.1}]
spark.createDataFrame(data).show()
# +-------+------------------------------+
# |MyFloat|MyStruct                      |
# +-------+------------------------------+
# |10.1   |{MyInt -> 10, MyFloat -> null}|
# +-------+------------------------------+

data = [Row(MyStruct=Row(MyInt=10, MyFloat=10.1), MyFloat=10.1)]
spark.createDataFrame(data).show()
# +-------+------------------------------+
# |MyFloat|MyStruct                      |
# +-------+------------------------------+
# |10.1   |{MyInt -> 10, MyFloat -> 10.1}|
# +-------+------------------------------+
{code}

Note that MyFloat inside MyStruct is set to null in the first example. Interestingly, the same data as a list of Rows (second example), or with an explicit schema, does not have this problem.

> Creating a struct with a float inside
> --------------------------------------
>
>                 Key: SPARK-38839
>                 URL: https://issues.apache.org/jira/browse/SPARK-38839
>             Project: Spark
>          Issue Type: Bug
>          Components: PySpark
>    Affects Versions: 3.2.1
>            Reporter: Daniel deCordoba
>            Priority: Minor
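The map-style rendering in the first output above ({MyInt -> 10, MyFloat -> null}) hints at the cause: without a schema, PySpark appears to infer a nested dictionary as a MapType rather than a StructType, taking the map's value type from the first entry (long, from MyInt), so 10.1 cannot be represented and comes back as null. A quick way to check, as a sketch against the dict-based data (assuming the same `spark` session as above):

{code:python}
# Sketch: inspect the schema PySpark infers for the dict-based data.
data = [{"MyStruct": {"MyInt": 10, "MyFloat": 10.1}, "MyFloat": 10.1}]
spark.createDataFrame(data).printSchema()
# Expected on Spark 3.2: the nested dict becomes map<string,long>,
# not a struct, so the 10.1 value cannot fit the long value type.
# root
#  |-- MyFloat: double (nullable = true)
#  |-- MyStruct: map (nullable = true)
#  |    |-- key: string
#  |    |-- value: long (valueContainsNull = true)
{code}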
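Given that, the explicit-schema workaround the description mentions sidesteps the inference entirely. A minimal sketch, assuming a standalone local SparkSession (the field names simply mirror the example above):

{code:python}
from pyspark.sql import SparkSession
from pyspark.sql.types import DoubleType, LongType, StructField, StructType

spark = SparkSession.builder.master("local[*]").getOrCreate()

# Declare MyStruct explicitly as a struct, so the nested dict is not
# inferred as a map with a single value type.
schema = StructType([
    StructField("MyStruct", StructType([
        StructField("MyInt", LongType()),
        StructField("MyFloat", DoubleType()),
    ])),
    StructField("MyFloat", DoubleType()),
])

data = [{"MyStruct": {"MyInt": 10, "MyFloat": 10.1}, "MyFloat": 10.1}]
spark.createDataFrame(data, schema).show(truncate=False)
# Expected:
# +----------+-------+
# |MyStruct  |MyFloat|
# +----------+-------+
# |{10, 10.1}|10.1   |
# +----------+-------+
{code}

Until the inference itself is fixed, either the Row form or an explicit schema preserves the nested float.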