[ https://issues.apache.org/jira/browse/SPARK-43341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated SPARK-43341:
-----------------------------------
    Labels: pull-request-available  (was: )

> StructType.toDDL does not pick up on non-nullability of column in nested struct
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-43341
>                 URL: https://issues.apache.org/jira/browse/SPARK-43341
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.3.0, 3.3.1, 3.3.2
>            Reporter: Bram Boogaarts
>            Priority: Major
>              Labels: pull-request-available
>
> h2. The problem
> When converting a StructType instance containing a nested StructType column
> which in turn contains a column for which {{nullable = false}} to a DDL
> string using {{.toDDL}}, the resulting DDL string does not include this
> non-nullability. For example:
> {code:java}
> val testschema = StructType(List(
>   StructField("key", IntegerType, false),
>   StructField("value", StringType, true),
>   StructField("nestedCols", StructType(List(
>     StructField("nestedKey", IntegerType, false),
>     StructField("nestedValue", StringType, true)
>   )), false)
> ))
> println(testschema.toDDL)
> println(StructType.fromDDL(testschema.toDDL)){code}
> gives:
> {code:java}
> key INT NOT NULL,value STRING,nestedCols STRUCT<nestedKey: INT, nestedValue: STRING> NOT NULL
> StructType(
>   StructField(key,IntegerType,false),
>   StructField(value,StringType,true),
>   StructField(nestedCols,StructType(
>     StructField(nestedKey,IntegerType,true),
>     StructField(nestedValue,StringType,true)
>   ),false)
> ){code}
>
> This is due to the fact that {{StructType.toDDL}} calls {{StructField.toDDL}}
> for its fields, which in turn calls {{.sql}} for its {{dataType}}. If
> {{dataType}} is a {{StructType}}, the call to {{.sql}} in turn calls
> {{.sql}} for all the nested fields, and this last method does not include the
> nullability of the field in its output.
> h2. Proposed solution
> {{StructField.toDDL}} should call {{dataType.toDDL}} for a
> {{StructType}}, since this will include information about nullability of
> nested columns.
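
The call chain described in the report can be illustrated with a small, self-contained Scala sketch. It uses simplified stand-ins for the Spark types and is not the actual org.apache.spark.sql.types implementation. The names NestedNullabilitySketch, ddlOf, nestedDDL and fixedToDDL are hypothetical, and the nested "name: TYPE NOT NULL" output format is an assumption about what a fix might emit, not a statement of what Spark produces.

```scala
// Toy model of the behaviour described in SPARK-43341.
// These classes only mimic the Spark names; they are NOT Spark's classes.
object NestedNullabilitySketch {
  sealed trait DataType { def sql: String }
  case object IntegerType extends DataType { val sql = "INT" }
  case object StringType extends DataType { val sql = "STRING" }

  final case class StructField(name: String, dataType: DataType, nullable: Boolean) {
    private def nullSuffix: String = if (nullable) "" else " NOT NULL"

    // As reported: the nested rendering omits nullability.
    def sql: String = s"$name: ${dataType.sql}"

    // As reported: toDDL delegates to dataType.sql, so nested NOT NULL is lost.
    def toDDL: String = s"$name ${dataType.sql}$nullSuffix"

    // Hypothetical fix: render nested fields with their nullability.
    def nestedDDL: String = s"$name: ${ddlOf(dataType)}$nullSuffix"
    def fixedToDDL: String = s"$name ${ddlOf(dataType)}$nullSuffix"
  }

  final case class StructType(fields: List[StructField]) extends DataType {
    def sql: String = fields.map(_.sql).mkString("STRUCT<", ", ", ">")
    def toDDL: String = fields.map(_.toDDL).mkString(",")
    def fixedToDDL: String = fields.map(_.fixedToDDL).mkString(",")
  }

  // Hypothetical helper: recurse into structs, fall back to .sql for leaf types.
  def ddlOf(dt: DataType): String = dt match {
    case StructType(fields) => fields.map(_.nestedDDL).mkString("STRUCT<", ", ", ">")
    case other              => other.sql
  }

  // Same schema as in the report.
  val testSchema: StructType = StructType(List(
    StructField("key", IntegerType, false),
    StructField("value", StringType, true),
    StructField("nestedCols", StructType(List(
      StructField("nestedKey", IntegerType, false),
      StructField("nestedValue", StringType, true)
    )), false)
  ))

  def main(args: Array[String]): Unit = {
    println(testSchema.toDDL)      // nestedKey loses NOT NULL, as in the report
    println(testSchema.fixedToDDL) // nestedKey keeps NOT NULL in this sketch
  }
}
```

In this model the only difference between toDDL and fixedToDDL is which method renders the nested data type, which is the change the proposed solution asks for. Whether a real fix belongs in StructField.toDDL or in the nested .sql rendering is left open here.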