lgbo-ustc commented on code in PR #12349:
URL: https://github.com/apache/gluten/pull/12349#discussion_r3479724351
##########
gluten-substrait/src/main/scala/org/apache/gluten/expression/ExpressionConverter.scala:
##########
@@ -336,10 +336,16 @@ object ExpressionConverter extends SQLConfHelper with Logging {
           replaceWithExpressionTransformer0(m.child, attributeSeq, expressionsMap),
           m)
       case m: MapFromEntries =>
+        val mapKeyDedupPolicy = SQLConf.get.getConf(SQLConf.MAP_KEY_DEDUP_POLICY)
         BackendsApiManager.getSparkPlanExecApiInstance.genMapFromEntriesTransformer(
-          substraitExprName,
+          if (mapKeyDedupPolicy.toString == SQLConf.MapKeyDedupPolicy.LAST_WIN.toString) {
Review Comment:
I am a bit concerned about putting the LAST_WIN function-name rewrite in the
common ExpressionConverter. This changes the Substrait function name for all
backends when `spark.sql.mapKeyDedupPolicy=LAST_WIN`, but
`map_from_entries_last_win` is currently a ClickHouse-specific function
mapping. Other backends may receive this new function name without supporting
it.

Could we keep the common converter backend-agnostic and move this
policy-to-function-name decision into
`CHSparkPlanExecApi.genMapFromEntriesTransformer` instead? For example,
`ExpressionConverter` can continue passing `substraitExprName`:
```scala
case m: MapFromEntries =>
  BackendsApiManager.getSparkPlanExecApiInstance.genMapFromEntriesTransformer(
    substraitExprName,
    replaceWithExpressionTransformer0(m.child, attributeSeq, expressionsMap),
    m)
```
Then ClickHouse can make the backend-specific function-name choice locally:
```scala
override def genMapFromEntriesTransformer(
    substraitExprName: String,
    child: ExpressionTransformer,
    expr: Expression): ExpressionTransformer = {
  val mapKeyDedupPolicy = SQLConf.get.getConf(SQLConf.MAP_KEY_DEDUP_POLICY)
  // Compare by name, as the PR diff does, in case the config value is a
  // String rather than an enum value in some supported Spark versions.
  val functionName =
    if (mapKeyDedupPolicy.toString == SQLConf.MapKeyDedupPolicy.LAST_WIN.toString) {
      ExpressionNames.MAP_FROM_ENTRIES_LAST_WIN
    } else {
      substraitExprName
    }
  GenericExpressionTransformer(functionName, Seq(child), expr)
}
```
This keeps the common converter backend-agnostic and limits the new
`map_from_entries_last_win` function name to the backend that registers it.
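
To make the intended effect concrete, here is a minimal, self-contained sketch of the same pattern outside of Gluten. Every type in it (`DedupPolicy`, `Transformer`, `PlanExecApi`, `DefaultBackendApi`, `ClickHouseLikeApi`) is a hypothetical stand-in invented for illustration, not the real Gluten or Spark API; the only thing it is meant to show is that the default path passes the name through unchanged while a single backend overrides the choice.

```scala
// Hypothetical stand-ins for illustration only; not the real Gluten/Spark types.
object DedupPolicy extends Enumeration {
  val EXCEPTION, LAST_WIN = Value
}

// Stand-in for an expression transformer: it only records the function name.
final case class Transformer(functionName: String)

trait PlanExecApi {
  // Default behavior: keep whatever name the common converter passed in.
  def genMapFromEntriesTransformer(
      substraitExprName: String,
      policy: DedupPolicy.Value): Transformer =
    Transformer(substraitExprName)
}

// A backend that does not register the LAST_WIN variant uses the default.
object DefaultBackendApi extends PlanExecApi

// A ClickHouse-like backend picks the backend-specific name locally.
object ClickHouseLikeApi extends PlanExecApi {
  val MapFromEntriesLastWin = "map_from_entries_last_win"

  override def genMapFromEntriesTransformer(
      substraitExprName: String,
      policy: DedupPolicy.Value): Transformer = {
    val functionName =
      if (policy == DedupPolicy.LAST_WIN) MapFromEntriesLastWin
      else substraitExprName
    Transformer(functionName)
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    val name = "map_from_entries"
    // The default backend never sees the backend-specific name.
    println(DefaultBackendApi.genMapFromEntriesTransformer(name, DedupPolicy.LAST_WIN))
    // The ClickHouse-like backend rewrites the name only under LAST_WIN.
    println(ClickHouseLikeApi.genMapFromEntriesTransformer(name, DedupPolicy.LAST_WIN))
    println(ClickHouseLikeApi.genMapFromEntriesTransformer(name, DedupPolicy.EXCEPTION))
  }
}
```

With this shape, a backend that has no mapping for `map_from_entries_last_win` keeps receiving the original name regardless of the dedup policy setting.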