[ https://issues.apache.org/jira/browse/SPARK-34484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Apache Spark reassigned SPARK-34484:
------------------------------------

    Assignee: Kousuke Saruta  (was: Apache Spark)

> Introduce a new syntax to represent attributes with the Catalyst DSL
> --------------------------------------------------------------------
>
>                 Key: SPARK-34484
>                 URL: https://issues.apache.org/jira/browse/SPARK-34484
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: Kousuke Saruta
>            Assignee: Kousuke Saruta
>            Priority: Major
>
> With the Catalyst DSL (dsl/package.scala), we have two ways to represent attributes:
> 1. Symbol literals (the `'` syntax)
> 2. The `$""` syntax, defined in the `sql/catalyst` module using a string context
> Both have problems.
> Regarding symbol literals, the Scala community deprecated them in Scala 2.13. We could use the `Symbol` constructor instead, but what is worse, Scala will remove `Symbol` entirely in the future
> (https://scalacenter.github.io/scala-3-migration-guide/docs/incompatibilities/dropped-features.html).
> {code}
> Although scala.Symbol is useful for migration, beware that it is deprecated
> and that it will be removed from the scala-library. You are recommended, as a
> second step, to replace them with plain string literals "xwy" or a dedicated
> class.
> {code}
> Regarding the `$""` syntax, there are two problems.
> The first problem is that the syntax conflicts with another `$""` syntax defined in the `sql/core` module.
> You can easily see the problem with the Spark Shell.
> {code}
> import org.apache.spark.sql.catalyst.dsl.expressions._
> val attr1 = $"attr1"
> error: type mismatch;
>  found   : StringContext
>  required: ?{def $: ?}
> Note that implicit conversions are not applicable because they are ambiguous:
>  both method StringToColumn in class SQLImplicits of type (sc: StringContext): spark.implicits.StringToColumn
>  and method StringToAttributeConversionHelper in trait ExpressionConversions of type (sc: StringContext): org.apache.spark.sql.catalyst.dsl.expressions.StringToAttributeConversionHelper
>  are possible conversion functions from StringContext to ?{def $: ?}
> {code}
> The second problem is that we can't write `$"attr".map(StringType, StringType)`, though we can write `'attr.map(StringType, StringType)`.
> This seems to be a bug in the Scala compiler, and it will be fixed in neither 2.12 nor 2.13 (https://github.com/scala/scala/pull/7396).
> Actually, I'm working on replacing all the symbol literals with the `$""` syntax in SPARK-34443, and I hit this problem in the following test files:
> * EncoderResolutionSuite.scala
> * ComplexTypeSuite.scala
> * ObjectExpressionsSuite.scala
> * NestedColumnAliasingSuite.scala
> * ReplaceNullWithFalseInPredicateSuite.scala
> * SimplifyCastsSuite.scala
> * SimplifyConditionalSuite.scala
> {code}
> [error] /home/kou/work/oss/spark-scala-2.13/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/EncoderResolutionSuite.scala:212:28: too many arguments (found 2, expected 1) for method map: (f: org.apache.spark.sql.catalyst.expressions.Expression => A): Seq[A]
> [error]     $"a".map(StringType, StringType)).foreach { attr =>
> {code}
> So, it's better to have another way to represent attributes with the DSL.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
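The ambiguity reported above arises because two implicit conversions both add a method named `$` to `StringContext`, so the compiler cannot pick one. A minimal, self-contained sketch of one possible direction is an interpolator with a dedicated, non-conflicting name; note that the interpolator name `attr` and the `UnresolvedAttribute` stand-in below are hypothetical illustrations for this sketch, not the syntax the issue actually adopted:

```scala
// Stand-in for Catalyst's unresolved attribute (hypothetical, for illustration).
case class UnresolvedAttribute(name: String)

object AttrDsl {
  // An interpolator named `attr` cannot clash with the `$` interpolators in
  // sql/core and sql/catalyst, because implicit resolution only becomes
  // ambiguous when two conversions provide a method with the SAME name.
  implicit class AttrStringContext(val sc: StringContext) extends AnyVal {
    def attr(args: Any*): UnresolvedAttribute =
      UnresolvedAttribute(sc.s(args: _*))
  }
}

object Demo extends App {
  import AttrDsl._
  val a = attr"id"
  println(a.name) // prints "id"
}
```

Because `attr"..."` resolves through a single implicit with a unique method name, it also sidesteps the interpolator-plus-varargs compiler bug that breaks `$"attr".map(StringType, StringType)` only when `$` itself is the ambiguous entry point.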