zhengruifeng commented on code in PR #51473:
URL: https://github.com/apache/spark/pull/51473#discussion_r2212974849
##########
sql/connect/common/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -215,13 +215,19 @@ message Expression {
}
message Array {
- DataType element_type = 1;
+ // (Optional) The element type of the array. Only need to set this when the elements are
+ // empty, since spark 4.1+ supports inferring the element type from the elements.
+ optional DataType element_type = 1;
repeated Literal elements = 2;
Review Comment:
I am not sure whether it is worthwhile to optimize out just the `element_type`.
For large arrays of primitive types, e.g. a large dense matrix for ML, we
already introduced `SpecializedArray`.
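
For context, the behaviour described in the diff comment (an optional `element_type` that only has to be set for empty arrays) could look roughly like the sketch below. This is a minimal illustration, not the Spark Connect implementation: `ArrayLiteral`, `infer_element_type`, `resolve_element_type`, and the type-name strings are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ArrayLiteral:
    """Hypothetical stand-in for the `Array` message in the diff above."""
    elements: List[object] = field(default_factory=list)
    # Only needs to be set when `elements` is empty.
    element_type: Optional[str] = None


def infer_element_type(value: object) -> str:
    """Map a Python value to an illustrative type name (assumed mapping)."""
    # bool is checked before int because bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"cannot infer element type from {type(value).__name__}")


def resolve_element_type(arr: ArrayLiteral) -> str:
    """Prefer an explicit element_type; otherwise infer it from the first element."""
    if arr.element_type is not None:
        return arr.element_type
    if not arr.elements:
        # Nothing to infer from, so the sender must have set the type.
        raise ValueError("element_type must be set for an empty array")
    return infer_element_type(arr.elements[0])
```

The saving here is a single type field per array, which is small next to the per-element `Literal` overhead on a large primitive array.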
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]