[
https://issues.apache.org/jira/browse/SPARK-42579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17697216#comment-17697216
]
Yang Jie commented on SPARK-42579:
--
[~hvanhovell] In fact, if we compare with `org.apache.spark.sql.functions#lit` in the sql module, I think this function has already been implemented there.
[https://github.com/apache/spark/blob/201e08c03a31c763e3120540ac1b1ca8ef252e6b/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L116-L126]
{code:java}
def lit(literal: Any): Column = literal match {
  case c: Column => c
  case s: Symbol => new ColumnName(s.name)
  case _ =>
    // This is different from `typedlit`. `typedlit` calls `Literal.create` to use
    // `ScalaReflection` to get the type of `literal`. However, since we use `Any` in this method,
    // `typedLit[Any](literal)` will always fail and fallback to `Literal.apply`. Hence, we can
    // just manually call `Literal.apply` to skip the expensive `ScalaReflection` code. This is
    // significantly better when there are many threads calling `lit` concurrently.
    Column(Literal(literal))
}
{code}
[https://github.com/apache/spark/blob/201e08c03a31c763e3120540ac1b1ca8ef252e6b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/literals.scala#L64-L102]
{code:java}
def apply(v: Any): Literal = v match {
  case i: Int => Literal(i, IntegerType)
  case l: Long => Literal(l, LongType)
  case d: Double => Literal(d, DoubleType)
  case f: Float => Literal(f, FloatType)
  case b: Byte => Literal(b, ByteType)
  case s: Short => Literal(s, ShortType)
  case s: String => Literal(UTF8String.fromString(s), StringType)
  case s: UTF8String => Literal(s, StringType)
  case c: Char => Literal(UTF8String.fromString(c.toString), StringType)
  case ac: Array[Char] => Literal(UTF8String.fromString(String.valueOf(ac)), StringType)
  case b: Boolean => Literal(b, BooleanType)
  case d: BigDecimal =>
    val decimal = Decimal(d)
    Literal(decimal, DecimalType.fromDecimal(decimal))
  case d: JavaBigDecimal =>
    val decimal = Decimal(d)
    Literal(decimal, DecimalType.fromDecimal(decimal))
  case d: Decimal => Literal(d, DecimalType(Math.max(d.precision, d.scale), d.scale))
  case i: Instant => Literal(instantToMicros(i), TimestampType)
  case t: Timestamp => Literal(DateTimeUtils.fromJavaTimestamp(t), TimestampType)
  case l: LocalDateTime => Literal(DateTimeUtils.localDateTimeToMicros(l), TimestampNTZType)
  case ld: LocalDate => Literal(ld.toEpochDay.toInt, DateType)
  case d: Date => Literal(DateTimeUtils.fromJavaDate(d), DateType)
  case d: Duration => Literal(durationToMicros(d), DayTimeIntervalType())
  case p: Period => Literal(periodToMonths(p), YearMonthIntervalType())
  case a: Array[Byte] => Literal(a, BinaryType)
  case a: collection.mutable.WrappedArray[_] => apply(a.array)
  case a: Array[_] =>
    val elementType = componentTypeToDataType(a.getClass.getComponentType())
    val dataType = ArrayType(elementType)
    val convert = CatalystTypeConverters.createToCatalystConverter(dataType)
    Literal(convert(a), dataType)
  case i: CalendarInterval => Literal(i, CalendarIntervalType)
  case null => Literal(null, NullType)
  case v: Literal => v
  case _ =>
    throw QueryExecutionErrors.literalTypeUnsupportedError(v)
}
{code}
For nested structures, `org.apache.spark.sql.functions#lit` only supports `Array[_]`. For example, running
{code:java}
val column = org.apache.spark.sql.functions.lit(Map(1 -> "v1", 2 -> "v2"))
{code}
fails with the following error:
{code:java}
[UNSUPPORTED_FEATURE.LITERAL_TYPE] The feature is not supported: Literal for 'Map(1 -> v1, 2 -> v2)' of class scala.collection.immutable.Map$Map2.
org.apache.spark.SparkRuntimeException: [UNSUPPORTED_FEATURE.LITERAL_TYPE] The feature is not supported: Literal for 'Map(1 -> v1, 2 -> v2)' of class scala.collection.immutable.Map$Map2.
  at org.apache.spark.sql.errors.QueryExecutionErrors$.literalTypeUnsupportedError(QueryExecutionErrors.scala:327)
  at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:101)
  at org.apache.spark.sql.functions$.lit(functions.scala:125)
{code}
And, as the code comments suggest, for other types we should use `typedLit` instead of `lit`.
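For illustration, a minimal sketch of how the Map case from the failing example can be expressed with `typedLit`, which goes through `Literal.create` and `ScalaReflection` and so can derive the nested type (assumes a standard Spark 3.x classpath; the inferred type shown in the comment is my reading of the API, not verified output):
{code:java}
import org.apache.spark.sql.functions.{lit, typedLit}

// lit(...) throws UNSUPPORTED_FEATURE.LITERAL_TYPE here, because
// Literal.apply has no case for scala.collection.immutable.Map.
// typedLit uses ScalaReflection to infer the Catalyst type
// (MapType(IntegerType, StringType)) before building the Literal.
val mapCol = typedLit(Map(1 -> "v1", 2 -> "v2"))

// The same applies to other nested structures, e.g. Seq:
val seqCol = typedLit(Seq(1, 2, 3))
{code}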
> Extend function.lit() to match Literal.apply()
> --
>
> Key: SPARK-42579
> URL: https://issues.apache.org/jira/browse/SPARK-42579
> Project: Spark
> Issue Type: New Feature
> Components: Connect
> Affects Versions: 3.4.0
> Reporter: Herman van Hövell
> Priority: Major
> Attachments: image-2023-03-07-11-00-39-429.png,
> image-2023-03-07-11-00-49-379.png
>
>
> function.lit should support the same conversions as the original.
> This requires an addition to the connect protocol, since it does not support
> nested type literals at the