(kyuubi) branch master updated: [KYUUBI #5831] Pre-allocate array list capacity for TColumns and TColumnValues in TRowSet generation

bowenliang Thu, 07 Dec 2023 21:16:37 -0800

This is an automated email from the ASF dual-hosted git repository.

bowenliang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/kyuubi.git



The following commit(s) were added to refs/heads/master by this push:
     new 1d6e65356 [KYUUBI #5831] Pre-allocate array list capacity for TColumns and TColumnValues in TRowSet generation
1d6e65356 is described below

commit 1d6e65356e6eab6656be6245de89a02e910304ec
Author: Bowen Liang <[email protected]>
AuthorDate: Fri Dec 8 13:16:21 2023 +0800

    [KYUUBI #5831] Pre-allocate array list capacity for TColumns and TColumnValues in TRowSet generation
    
    # :mag: Description
    ## Issue References 🔗
    
    Subtask of #5808.
    
    ## Describe Your Solution 🔧
    
    To avoid unnecessary array copying when an array list grows to fit new elements, pre-allocate the array list capacity for TColumns and TColumnValues in TRowSet generation.
    
    ## Types of changes :bookmark:
    
    - [ ] Bugfix (non-breaking change which fixes an issue)
    - [ ] New feature (non-breaking change which adds functionality)
    - [ ] Breaking change (fix or feature that would cause existing functionality to change)
    
    ## Test Plan 🧪
    
    #### Behavior Without This Pull Request :coffin:
    
    #### Behavior With This Pull Request :tada:
    No behaviour changes.
    
    #### Related Unit Tests
    RowSetSuite of Spark Engine.
    
    ---
    
    # Checklists
    ## 📝 Author Self Checklist
    
    - [x] My code follows the [style guidelines](https://kyuubi.readthedocs.io/en/master/contributing/code/style.html) of this project
    - [ ] I have performed a self-review
    - [ ] I have commented my code, particularly in hard-to-understand areas
    - [ ] I have made corresponding changes to the documentation
    - [ ] My changes generate no new warnings
    - [ ] I have added tests that prove my fix is effective or that my feature works
    - [ ] New and existing unit tests pass locally with my changes
    - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
    
    ## 📝 Committer Pre-Merge Checklist
    
    - [x] Pull request title is okay.
    - [x] No license issues.
    - [x] Milestone correctly set?
    - [x] Test coverage is ok
    - [x] Assignees are selected.
    - [x] Minimum number of approvals
    - [ ] No changes are requested
    
    **Be nice. Be informative.**
    
    Closes #5831 from bowenliang123/rowset-inplace.
    
    Closes #5831
    
    c1c6c0f84 [liangbowen] avoid possible array copying with growing of array list in TRowSet generation
    
    Lead-authored-by: Bowen Liang <[email protected]>
    Co-authored-by: liangbowen <[email protected]>
    Signed-off-by: Bowen Liang <[email protected]>
---
 .../scala/org/apache/kyuubi/engine/spark/schema/RowSet.scala     | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/schema/RowSet.scala b/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/schema/RowSet.scala
index b9e9f6411..806451907 100644
--- a/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/schema/RowSet.scala
+++ b/externals/kyuubi-spark-sql-engine/src/main/scala/org/apache/kyuubi/engine/spark/schema/RowSet.scala
@@ -77,15 +77,16 @@ object RowSet {
     var i = 0
     while (i < rowSize) {
       val row = rows(i)
-      val tRow = new TRow()
       var j = 0
       val columnSize = row.length
+      val tColumnValues = new java.util.ArrayList[TColumnValue](columnSize)
       while (j < columnSize) {
         val columnValue = toTColumnValue(j, row, schema, timeFormatters)
-        tRow.addToColVals(columnValue)
+        tColumnValues.add(columnValue)
         j += 1
       }
       i += 1
+      val tRow = new TRow(tColumnValues)
       tRows.add(tRow)
     }
     new TRowSet(0, tRows)
@@ -97,12 +98,14 @@ object RowSet {
     val timeFormatters = HiveResult.getTimeFormatters
     var i = 0
     val columnSize = schema.length
+    val tColumns = new java.util.ArrayList[TColumn](columnSize)
     while (i < columnSize) {
       val field = schema(i)
       val tColumn = toTColumn(rows, i, field.dataType, timeFormatters)
-      tRowSet.addToColumns(tColumn)
+      tColumns.add(tColumn)
       i += 1
     }
+    tRowSet.setColumns(tColumns)
     tRowSet
   }
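
The pattern applied in both hunks can be illustrated outside of Kyuubi. The following is a minimal sketch, not the project's code: `PreallocateSketch`, `toCell`, and `toCells` are hypothetical names, and plain strings stand in for the Thrift `TColumnValue` objects so the example runs without any dependencies. It relies only on the `java.util.ArrayList(int initialCapacity)` constructor, which sizes the backing array up front so that later `add` calls within that capacity do not need to grow and copy it.

```scala
import java.util.{ArrayList => JArrayList}

object PreallocateSketch {
  // Hypothetical stand-in for toTColumnValue: converts one cell to a string.
  def toCell(j: Int, row: IndexedSeq[Int]): String = row(j).toString

  // Builds one output row. The list is created with the final size as its
  // initial capacity, so the loop below never triggers a resize of the
  // backing array.
  def toCells(row: IndexedSeq[Int]): JArrayList[String] = {
    val columnSize = row.length
    val cells = new JArrayList[String](columnSize)
    var j = 0
    while (j < columnSize) {
      cells.add(toCell(j, row))
      j += 1
    }
    cells
  }

  def main(args: Array[String]): Unit = {
    // Assumed sample input, for illustration only.
    println(toCells(Vector(1, 2, 3)))
  }
}
```

The resulting list has the same contents as one built with the default constructor; only the allocation behaviour differs, which matches the "No behaviour changes" note in the test plan.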
