jinchengchenghh commented on PR #11211:
URL:
https://github.com/apache/incubator-gluten/pull/11211#issuecomment-3586503750
Maybe all the DataFrame stat functions have a similar issue, since the
information is native information. We could supply a utility class to
replace the current df.stat.xxx calls.
```
✅ Option 1 (recommended): use an implicit class to extend df.stat
No need to modify the Spark source code, and the API is not broken.
import org.apache.spark.sql.DataFrame

// Plain wrapper rather than a subclass: in Spark 3.x, DataFrameStatFunctions is
// final with a private[sql] constructor, so user code cannot extend it.
class GlutenDataFrameStatFunctions(df: DataFrame) {
  def glutenApproxQuantile(cols: Seq[String]): DataFrame = {
    // your gluten implementation
    df // return something
  }
}

object GlutenStatImplicits {
  implicit class GlutenStatOps(df: DataFrame) {
    def glutenStat: GlutenDataFrameStatFunctions =
      new GlutenDataFrameStatFunctions(df)
  }
}
Usage:
import GlutenStatImplicits._
df.glutenStat.glutenApproxQuantile(Seq("col1"))
✔ No changes to Spark
✔ Compatible across Spark versions
✔ Any Gluten optimization can be added
```
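If call sites are to switch from df.stat.xxx with minimal changes, the utility could mirror the existing Spark signatures. Below is a minimal sketch of that idea for the single-column approxQuantile overload. The names GlutenStatSketch, GlutenStatSketchImplicits, and GlutenStatSketchDemo are hypothetical, and the body only delegates to vanilla Spark as a placeholder; where a Gluten-specific path would plug in is an assumption, not something this PR defines.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical wrapper: same signature as Spark's
// DataFrameStatFunctions.approxQuantile(col, probabilities, relativeError),
// so callers only change df.stat to df.glutenStat.
class GlutenStatSketch(df: DataFrame) {
  def approxQuantile(
      col: String,
      probabilities: Array[Double],
      relativeError: Double): Array[Double] = {
    // Assumption: a Gluten-aware implementation would go here.
    // This sketch simply falls back to vanilla Spark.
    df.stat.approxQuantile(col, probabilities, relativeError)
  }
}

object GlutenStatSketchImplicits {
  implicit class GlutenStatSketchOps(df: DataFrame) {
    def glutenStat: GlutenStatSketch = new GlutenStatSketch(df)
  }
}

object GlutenStatSketchDemo {
  import GlutenStatSketchImplicits._

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("glutenStat-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = (1 to 100).map(_.toDouble).toDF("col1")
    // relativeError = 0.0 asks Spark for exact quantiles.
    val median = df.glutenStat.approxQuantile("col1", Array(0.5), 0.0)
    println(median.mkString(", "))

    spark.stop()
  }
}
```

Keeping the signatures identical would let the fallback path stay a one-line delegation, and callers could migrate method by method.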