advancedxy commented on code in PR #100:
URL:
https://github.com/apache/arrow-datafusion-comet/pull/100#discussion_r1504405184
##########
spark/src/main/scala/org/apache/comet/shims/ShimCometSparkSessionExtensions.scala:
##########
@@ -32,4 +34,19 @@ trait ShimCometSparkSessionExtensions {
.map { a => a.setAccessible(true); a }
.flatMap(_.get(scan).asInstanceOf[Option[Aggregation]])
.headOption
+
+ /**
+ * TODO: delete after dropping Spark 3.2 and 3.3 support
+ */
+ def getOffset(limit: LimitExec): Int = getOffsetOpt(limit).getOrElse(0)
Review Comment:
How do you like this? I think we should expose the `getOffset` method to
accept `LimitExec` only and it could return `Int` directly.
The actual implementation could be generic.
##########
spark/src/main/scala/org/apache/spark/sql/comet/CometTakeOrderedAndProjectExec.scala:
##########
@@ -77,11 +77,7 @@ case class CometTakeOrderedAndProjectExec(
childRDD
} else {
val localTopK = if (orderingSatisfies) {
- childRDD.mapPartitionsInternal { iter =>
- val limitOp =
- CometExecUtils.getLimitNativePlan(output, limit).get
- CometExec.getCometIterator(Seq(iter), limitOp)
- }
+ CometExecUtils.toNativeLimitedPerPartition(childRDD, output, limit)
Review Comment:
refactor to use the utility method.
If not appropriate, I can revert this.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]