[GitHub] spark pull request #20116: [SPARK-20960][SQL] make ColumnVector public

gatorsmile Wed, 03 Jan 2018 08:43:49 -0800

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20116#discussion_r159469632
  
    --- Diff: sql/core/src/main/java/org/apache/spark/sql/vectorized/ColumnarBatch.java ---
    @@ -14,26 +14,18 @@
      * See the License for the specific language governing permissions and
      * limitations under the License.
      */
    -package org.apache.spark.sql.execution.vectorized;
    +package org.apache.spark.sql.vectorized;
     
     import java.util.*;
     
     import org.apache.spark.sql.catalyst.InternalRow;
    +import org.apache.spark.sql.execution.vectorized.MutableColumnarRow;
     import org.apache.spark.sql.types.StructType;
     
     /**
    - * This class is the in memory representation of rows as they are streamed through operators. It
    - * is designed to maximize CPU efficiency and not storage footprint. Since it is expected that
    - * each operator allocates one of these objects, the storage footprint on the task is negligible.
    - *
    - * The layout is a columnar with values encoded in their native format. Each RowBatch contains
    - * a horizontal partitioning of the data, split into columns.
    - *
    - * The ColumnarBatch supports either on heap or offheap modes with (mostly) the identical API.
    - *
    - * TODO:
    - *  - There are many TODOs for the existing APIs. They should throw a not implemented exception.
    - *  - Compaction: The batch and columns should be able to compact based on a selection vector.
    + * This class is a wrapper of multiple ColumnVectors and represents a logical table-like data
    --- End diff --
    
    How about?
    > This class wraps multiple ColumnVectors as a row-wise table



