This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/spark.git
The following commit(s) were added to refs/heads/master by this push:
     new 20c9b3dc4fac [SPARK-46328][SQL] Allocate capacity of array list of TColumns by columns size in TRowSet generation

20c9b3dc4fac is described below

commit 20c9b3dc4fac283f895c8d860b4c6e0144697302
Author: liangbowen <liangbo...@gf.com.cn>
AuthorDate: Fri Dec 8 11:24:35 2023 -0800

    [SPARK-46328][SQL] Allocate capacity of array list of TColumns by columns size in TRowSet generation

    ### What changes were proposed in this pull request?

    Allocate sufficient capacity, based on the column count, for the array lists of TColumns assembled during TRowSet generation.

    ### Why are the changes needed?

    In RowSetUtils, ArrayLists are created to hold the TColumn value collections during TRowSet generation. Currently they are created with the JDK's default capacity of 10 rather than being sized by the number of columns, which can trigger repeated backing-array copies while assembling each TColumn collection whenever the column count exceeds the default capacity.

    ### Does this PR introduce _any_ user-facing change?

    No.

    ### How was this patch tested?

    GA tests.

    ### Was this patch authored or co-authored using generative AI tooling?

    No.

    Closes #44258 from bowenliang123/rowset-cap.
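The change is an instance of a standard pre-sizing idiom for `java.util.ArrayList`. A minimal, self-contained sketch (plain Scala, not Spark code; `columnSize` and the string values are made up for illustration) contrasts a default-capacity list, which grows by allocating a larger backing array and copying the old one, with a list whose capacity is hinted up front:

```scala
import java.util.ArrayList

object PresizeSketch {
  def main(args: Array[String]): Unit = {
    val columnSize = 100 // hypothetical column count, larger than the JDK default capacity of 10

    // Without a capacity hint, the list starts small and must repeatedly
    // allocate a larger backing array and copy the old contents as it grows.
    val grown = new ArrayList[String]()

    // With the capacity hint, the backing array is allocated once up front,
    // so the adds below never trigger a resize-and-copy.
    val presized = new ArrayList[String](columnSize)

    var j = 0
    while (j < columnSize) {
      grown.add(s"col-$j")
      presized.add(s"col-$j")
      j += 1
    }

    // Both lists end up with identical contents; only the allocation
    // pattern along the way differs.
    assert(grown.size == presized.size && grown == presized)
    println(s"both lists hold ${presized.size} values")
  }
}
```

The observable result is identical either way; the saving is purely in avoided intermediate allocations and copies, which is why the patch needs no behavioral tests beyond the existing GA runs.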
    Authored-by: liangbowen <liangbo...@gf.com.cn>
    Signed-off-by: Dongjoon Hyun <dh...@apple.com>
---
 .../org/apache/spark/sql/hive/thriftserver/RowSetUtils.scala | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/RowSetUtils.scala b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/RowSetUtils.scala
index 94046adca0d8..502e29619027 100644
--- a/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/RowSetUtils.scala
+++ b/sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/RowSetUtils.scala
@@ -57,15 +57,16 @@ object RowSetUtils {
     val tRows = new java.util.ArrayList[TRow](rowSize)
     while (i < rowSize) {
       val row = rows(i)
-      val tRow = new TRow()
       var j = 0
       val columnSize = row.length
+      val tColumnValues = new java.util.ArrayList[TColumnValue](columnSize)
       while (j < columnSize) {
         val columnValue = toTColumnValue(j, row, schema(j), timeFormatters)
-        tRow.addToColVals(columnValue)
+        tColumnValues.add(columnValue)
         j += 1
       }
       i += 1
+      val tRow = new TRow(tColumnValues)
       tRows.add(tRow)
     }
     new TRowSet(startRowOffSet, tRows)
@@ -80,11 +81,13 @@ object RowSetUtils {
     val tRowSet = new TRowSet(startRowOffSet, new java.util.ArrayList[TRow](rowSize))
     var i = 0
     val columnSize = schema.length
+    val tColumns = new java.util.ArrayList[TColumn](columnSize)
     while (i < columnSize) {
       val tColumn = toTColumn(rows, i, schema(i), timeFormatters)
-      tRowSet.addToColumns(tColumn)
+      tColumns.add(tColumn)
       i += 1
     }
+    tRowSet.setColumns(tColumns)
     tRowSet
   }

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscr...@spark.apache.org
For additional commands, e-mail: commits-h...@spark.apache.org