Github user kiszk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21912#discussion_r209877426
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ArrayData.scala ---
    @@ -34,6 +36,37 @@ object ArrayData {
         case a: Array[Double] => UnsafeArrayData.fromPrimitiveArray(a)
         case other => new GenericArrayData(other)
       }
    +
    +
    +  /**
    +   * Allocate [[UnsafeArrayData]] or [[GenericArrayData]] based on given parameters.
    +   *
    +   * @param elementSize a size of an element in bytes
    +   * @param numElements the number of elements the array should contain
    +   * @param isPrimitiveType whether the type of an element is primitive type
    +   * @param additionalErrorMessage string to include in the error message
    +   */
    +  def allocateArrayData(
    +      elementSize: Int,
    +      numElements : Long,
    +      isPrimitiveType: Boolean,
    +      additionalErrorMessage: String) : ArrayData = {
    +    val arraySize = UnsafeArrayData.calculateSizeOfUnderlyingByteArray(numElements, elementSize)
    +    if (isPrimitiveType && !UnsafeArrayData.shouldUseGenericArrayData(elementSize, numElements)) {
    --- End diff --
    
    Whenever `UnsafeArrayData` can be used, `GenericArrayData` can be used as well. However, if the element size is large, `UnsafeArrayData` cannot be used, and `GenericArrayData` should be used instead.
    
    Thus, I think that it would be good to keep the current name, `shouldUseGenericArrayData`.
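
To illustrate the reasoning, here is a minimal, self-contained sketch of the selection rule. It does not use Spark's classes: `Backing`, `MaxUnsafeBytes`, `shouldUseGeneric`, and `choose` are hypothetical names, and the size limit is an assumed stand-in for whatever `UnsafeArrayData.shouldUseGenericArrayData` actually checks.

```scala
object ArrayChoiceSketch {
  // Hypothetical tags standing in for UnsafeArrayData and GenericArrayData.
  sealed trait Backing
  case object Unsafe extends Backing
  case object Generic extends Backing

  // Assumed upper bound on the bytes an unsafe array may occupy.
  // The real limit is decided inside UnsafeArrayData; this value is illustrative only.
  val MaxUnsafeBytes: Long = Int.MaxValue.toLong

  // Hypothetical stand-in for shouldUseGenericArrayData: true when the
  // elements would not fit under the assumed limit.
  def shouldUseGeneric(elementSize: Int, numElements: Long): Boolean = {
    require(elementSize > 0 && numElements >= 0)
    // Compare by division so that the check itself cannot overflow.
    numElements > MaxUnsafeBytes / elementSize
  }

  // Unsafe is chosen only for primitive elements that fit; Generic is always
  // a valid fallback, which is why the predicate is named after Generic.
  def choose(elementSize: Int, numElements: Long, isPrimitiveType: Boolean): Backing =
    if (isPrimitiveType && !shouldUseGeneric(elementSize, numElements)) Unsafe else Generic
}
```

Under these assumptions, `ArrayChoiceSketch.choose(8, 10L, isPrimitiveType = true)` selects `Unsafe`, while a non-primitive element type or an array that is too large falls back to `Generic`.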


---
