Yong,

Thanks for the response. While those are good examples, they are able to leverage the keytype/valuetype structure of Maps to specify an explicit return type.
I guess the more fundamental issue is that I want to support the heterogeneous maps/arrays allowed by JSON: [1, "str", 2.345] or {"name":"Chris","value":123}. Given the Spark SQL constraint that ArrayType and MapType require explicit, consistent element types, I don't see any way to support this in the current type system short of falling back to binary data.

Open to other suggestions,
Chris

On Tue, Jul 26, 2016 at 9:42 AM Yong Zhang <java8...@hotmail.com> wrote:

> I don't know if "ANY" will work or not, but did you take a look at how the
> "map_values" UDF is implemented in Spark? It returns the map values as an
> array/seq of arbitrary type.
>
> https://issues.apache.org/jira/browse/SPARK-16279
>
> Yong
>
> ------------------------------
> *From:* Chris Beavers <cbeav...@trifacta.com>
> *Sent:* Monday, July 25, 2016 10:32 PM
> *To:* user@spark.apache.org
> *Subject:* UDF returning generic Seq
>
> Hey there,
>
> I'm interested in writing a UDF that returns an ArrayType column of
> unknown subtype. My understanding is that, JVM-type-wise, this translates
> to a Seq of generic type: Seq[Any]. I seem to be hitting the constraint at
> https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala:657
> which basically necessitates a fully qualified schema on the return type
> (i.e. the generic Any hits the default exception-throwing case at the end
> of schemaFor).
>
> Is there any more canonical way to have a UDF produce an ArrayType column
> of unknown type? Or is my only alternative here to reduce this to
> BinaryType and use whatever encoding/data structures I want under the
> covers there and in subsequent UDFs?
>
> Thanks,
> Chris
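(A sketch of one possible workaround, not something proposed in the thread: since Spark SQL cannot assign a Catalyst type to a heterogeneous Seq[Any], one option short of BinaryType is to serialize the heterogeneous value to a JSON string and have the UDF return StringType, which needs no element schema. The `toJsonLiteral` helper below is hypothetical and covers only a few primitive cases; in practice a real JSON library would be used.)

```scala
// Hypothetical helper: encode a heterogeneous Scala value as a JSON string.
// A Spark UDF wrapping this could declare StringType as its return type,
// sidestepping the ArrayType/MapType element-type constraint in schemaFor.
def toJsonLiteral(v: Any): String = v match {
  case null        => "null"
  case s: String   => "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\""
  case b: Boolean  => b.toString
  case n: Int      => n.toString
  case n: Long     => n.toString
  case d: Double   => d.toString
  // Nested heterogeneous arrays and string-keyed maps recurse.
  case xs: Seq[_]  => xs.map(toJsonLiteral).mkString("[", ",", "]")
  case m: Map[_, _] =>
    m.map { case (k, x) => "\"" + k.toString + "\":" + toJsonLiteral(x) }
      .mkString("{", ",", "}")
  case other       => toJsonLiteral(other.toString)
}

// The heterogeneous examples from the mail:
val arr = toJsonLiteral(Seq(1, "str", 2.345))                      // [1,"str",2.345]
val obj = toJsonLiteral(Map("name" -> "Chris", "value" -> 123))    // {"name":"Chris","value":123}
```

In Spark this might be registered as something like `udf((s: String) => toJsonLiteral(parse(s)))` over a string input column, so both input and output stay StringType; downstream UDFs would then re-parse the JSON, trading type safety for flexibility much as the BinaryType fallback would.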