[ https://issues.apache.org/jira/browse/TAJO-30?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
David Chen updated TAJO-30:
---------------------------
Description:
Parquet is a columnar storage format developed by Twitter. Implement Parquet
(http://parquet.io/) support for Tajo.
The implementation consists of the following:
* {{TajoParquetReader}} and {{TajoParquetWriter}} - Top-level reader and
writer that deserialize Parquet records into Tajo Tuples and serialize Tajo
Tuples into Parquet records.
* {{TajoReadSupport}} and {{TajoWriteSupport}} - Abstractions to perform
conversion between Parquet and Tajo records.
* {{TajoRecordMaterializer}} - Materializes Tajo Tuples from Parquet's
internal representation.
* {{TajoRecordConverter}} - Used by {{TajoRecordMaterializer}} to materialize
a Tajo Tuple.
* {{TajoSchemaConverter}} - Converts between Tajo and Parquet schemas.
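As a rough illustration of the schema-conversion piece listed above, the
following self-contained Java sketch maps a few Tajo column types to Parquet
primitive types and renders the result in Parquet's message-type text syntax.
It is not the actual {{TajoSchemaConverter}}: the class
{{SchemaConverterSketch}}, the {{TajoType}} enum, and the specific type mapping
are assumptions made for illustration only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical sketch of a Tajo-to-Parquet schema conversion.
 * This is NOT the real TajoSchemaConverter; names and mapping are assumed.
 */
public class SchemaConverterSketch {

    /** Stand-in for a small subset of Tajo column types (illustrative only). */
    enum TajoType { BOOLEAN, INT2, INT4, INT8, FLOAT4, FLOAT8, TEXT, BLOB }

    /** Maps a stand-in Tajo type to a Parquet primitive type name (assumed mapping). */
    static String toParquetPrimitive(TajoType type) {
        switch (type) {
            case BOOLEAN: return "boolean";
            case INT2:
            case INT4:    return "int32";   // Parquet has no 16-bit primitive
            case INT8:    return "int64";
            case FLOAT4:  return "float";
            case FLOAT8:  return "double";
            case TEXT:
            case BLOB:    return "binary";
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    /** Renders the columns as a Parquet message type, one optional field per column. */
    static String toParquetMessage(String name, Map<String, TajoType> columns) {
        StringBuilder sb = new StringBuilder("message ").append(name).append(" {\n");
        for (Map.Entry<String, TajoType> column : columns.entrySet()) {
            TajoType type = column.getValue();
            sb.append("  optional ")
              .append(toParquetPrimitive(type))
              .append(' ')
              .append(column.getKey());
            if (type == TajoType.TEXT) {
                sb.append(" (UTF8)");       // mark text columns as UTF-8 strings
            }
            sb.append(";\n");
        }
        return sb.append("}").toString();
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps the column order stable.
        Map<String, TajoType> columns = new LinkedHashMap<>();
        columns.put("id", TajoType.INT8);
        columns.put("name", TajoType.TEXT);
        System.out.println(toParquetMessage("table_schema", columns));
    }
}
```

In this sketch every column is emitted as {{optional}} so that NULL values can
be represented; whether the real converter makes the same choice is not stated
in this issue.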
was:
Parquet is a very promising file format developed by Twitter. We need to
investigate the applicability of Parquet. If possible, we will implement a
Parquet port.
http://parquet.io/
> Parquet Integration
> -------------------
>
> Key: TAJO-30
> URL: https://issues.apache.org/jira/browse/TAJO-30
> Project: Tajo
> Issue Type: New Feature
> Reporter: Hyunsik Choi
> Assignee: David Chen
> Labels: Parquet
>
> Parquet is a columnar storage format developed by Twitter. Implement Parquet
> (http://parquet.io/) support for Tajo.
> The implementation consists of the following:
> * {{TajoParquetReader}} and {{TajoParquetWriter}} - Top-level reader and
> writer that deserialize Parquet records into Tajo Tuples and serialize Tajo
> Tuples into Parquet records.
> * {{TajoReadSupport}} and {{TajoWriteSupport}} - Abstractions to perform
> conversion between Parquet and Tajo records.
> * {{TajoRecordMaterializer}} - Materializes Tajo Tuples from Parquet's
> internal representation.
> * {{TajoRecordConverter}} - Used by {{TajoRecordMaterializer}} to
> materialize a Tajo Tuple.
> * {{TajoSchemaConverter}} - Converts between Tajo and Parquet schemas.