Hi,

I have a requirement to create types dynamically in Spark and then
instantiate those types from Spark SQL via a UDF.

I tried doing the following:

import org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema
import org.apache.spark.sql.types.{DataTypes, StructField, StructType}

// Build the struct type at runtime instead of declaring a case class.
val addressType = StructType(List(
  StructField("state", DataTypes.StringType),
  StructField("zipcode", DataTypes.IntegerType)
))

// Wrap the arguments in a Row that carries the runtime schema.
sqlContext.udf.register("Address", (state: String, zipcode: Int) =>
  new GenericRowWithSchema(Array(state, zipcode), addressType))

sqlContext.sql("SELECT Address('NY', 12345)").show(10)

This seems reasonable to me, but it fails with:

Exception in thread "main" java.lang.UnsupportedOperationException: Schema for type org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema is not supported
  at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:755)
  at org.apache.spark.sql.catalyst.ScalaReflection$.schemaFor(ScalaReflection.scala:685)
  at org.apache.spark.sql.UDFRegistration.register(UDFRegistration.scala:130)

It looks like it would be simple to update ScalaReflection to be able to
infer the schema from a GenericRowWithSchema, but before I file a JIRA and
submit a patch I wanted to see if there is already a way of achieving this.
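
For what it's worth, the closest I've come is sidestepping the inference entirely by handing the UDF its return type myself. If your Spark build has the untyped udf(f: AnyRef, dataType: DataType) overload in org.apache.spark.sql.functions (I haven't checked how far back it goes), a sketch like the following seems to work for the DataFrame API, where df stands in for any existing DataFrame:

import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.{lit, udf}

// Pass the return schema explicitly so ScalaReflection is never asked
// to derive one from GenericRowWithSchema.
val makeAddress = udf((state: String, zipcode: Int) => Row(state, zipcode),
  addressType)

df.select(makeAddress(lit("NY"), lit(12345)).as("address"))

But that only covers the DataFrame side; I still haven't found a way to register such a function under a name callable from SQL, which is what I actually need.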

Thanks,

Andy.
