Hi,
I have a deeply nested StructType whose fields may themselves be
structs. I want to update a field at the lowest level of this struct.
I tried withField, but it has no effect if any of the parent structs is
null. I would appreciate any help with this.

The example schema is:

import org.apache.spark.sql.Row
import org.apache.spark.sql.types._

val schema = new StructType()
  .add("key", StringType)
  .add(
    "cells",
    ArrayType(
      new StructType()
        .add("family", StringType)
        .add("qualifier", StringType)
        .add("timestamp", LongType)
        .add("nestStruct", new StructType()
          .add("id1", LongType)
          .add("id2", StringType)
          .add("id3", new StructType()
            .add("id31", LongType)
            .add("id32", StringType)))
    )
  )

val data = Seq(
      Row(
        "1235321863",
        Array(
          Row("a", "b", 1L,  null)
        )
      )
    )


val df_test = spark
  .createDataFrame(spark.sparkContext.parallelize(data), schema)

import org.apache.spark.sql.functions._
import spark.implicits._

val result = df_test.withColumn(
  "cell1",
  transform($"cells", cell => {
    // This line doesn't do anything if nestStruct is null.
    cell.withField("nestStruct.id3.id31", lit(40))
  }))
result.show(false)
result.printSchema
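
One workaround that might be worth trying, sketched here under
assumptions rather than as a confirmed fix: since withField leaves a
null struct as null, replace each null parent with a struct of typed
nulls (via coalesce) before setting the field. The names emptyId3,
emptyNest and fixed are made up for illustration. The snippet assumes
Spark 3.1+ (where Column.withField is available) and reuses spark and
df_test from above.

```scala
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types._
import spark.implicits._

// Hypothetical helpers: structs shaped like the schema, with null leaf fields.
val emptyId3 = struct(
  lit(null).cast(LongType).as("id31"),
  lit(null).cast(StringType).as("id32"))
val emptyNest = struct(
  lit(null).cast(LongType).as("id1"),
  lit(null).cast(StringType).as("id2"),
  emptyId3.as("id3"))

val fixed = df_test.withColumn(
  "cell1",
  transform($"cells", cell => {
    // Fall back to the empty structs when a parent level is null.
    val nest = coalesce(cell.getField("nestStruct"), emptyNest)
    val id3 = coalesce(nest.getField("id3"), emptyId3)
    // lit(40L) rather than lit(40), so that id31 stays a LongType.
    cell.withField(
      "nestStruct",
      nest.withField("id3", id3.withField("id31", lit(40L))))
  }))
fixed.show(false)
fixed.printSchema
```

Note that this turns a null nestStruct into a non-null struct whose
other fields (id1, id2, id32) are null, which may or may not be
acceptable for the use case.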




Thanks
