This is an automated email from the ASF dual-hosted git repository.

srowen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git


The following commit(s) were added to refs/heads/master by this push:
     new 11320f38897 [SPARK-39137][SQL] Use `slice` instead of `take and drop`
11320f38897 is described below

commit 11320f38897be05ff31d0e2d0c2943112b0df24b
Author: yangjie01 <yangji...@baidu.com>
AuthorDate: Tue May 10 22:07:37 2022 -0500

    [SPARK-39137][SQL] Use `slice` instead of `take and drop`
    
    ### What changes were proposed in this pull request?
    This PR is a minor code simplification:
    
    - `seq.drop(n).take(m)` -> `seq.slice(n, n + m)`
    - `seq.take(m).drop(n)` -> `seq.slice(n, m)`
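    
    As a quick equivalence check (illustrative only, not part of the patch;
    `seq`, `n`, and `m` are hypothetical names), both rewrites can be verified
    in plain Scala:
    
        object SliceEquivalence extends App {
          val seq = Seq(0, 1, 2, 3, 4, 5, 6, 7, 8, 9)
          val n = 3 // elements to skip
          val m = 4 // elements to keep
    
          // drop-then-take: skip the first n elements, keep the next m.
          assert(seq.drop(n).take(m) == seq.slice(n, n + m))
    
          // take-then-drop: keep the first m elements, then skip the first n of those.
          assert(seq.take(m).drop(n) == seq.slice(n, m))
    
          println("both rewrites hold")
        }
    
    The same `slice(from, until)` method is defined on `Iterator` as well,
    which is what the `mapPartitionsInternal` closures in this patch operate on.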
    
    ### Why are the changes needed?
    Code simplification
    
    ### Does this PR introduce _any_ user-facing change?
    No
    
    ### How was this patch tested?
    Passes GA (existing GitHub Actions checks).
    
    Closes #36494 from LuciferYang/droptake-2-slice.
    
    Authored-by: yangjie01 <yangji...@baidu.com>
    Signed-off-by: Sean Owen <sro...@gmail.com>
---
 sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala b/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala
index f79361ff1c5..caffe3ff855 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/execution/limit.scala
@@ -98,7 +98,7 @@ case class CollectLimitExec(limit: Int = -1, child: SparkPlan, offset: Int = 0)
       }
       if (limit >= 0) {
         if (offset > 0) {
-          singlePartitionRDD.mapPartitionsInternal(_.drop(offset).take(limit))
+          singlePartitionRDD.mapPartitionsInternal(_.slice(offset, offset + limit))
         } else {
           singlePartitionRDD.mapPartitionsInternal(_.take(limit))
         }
@@ -238,7 +238,7 @@ case class GlobalLimitAndOffsetExec(
   override def requiredChildDistribution: List[Distribution] = AllTuples :: Nil
 
   override def doExecute(): RDD[InternalRow] = if (limit >= 0) {
-    child.execute().mapPartitionsInternal(iter => iter.take(limit + offset).drop(offset))
+    child.execute().mapPartitionsInternal(iter => iter.slice(offset, limit + offset))
   } else {
     child.execute().mapPartitionsInternal(iter => iter.drop(offset))
   }

