[ https://issues.apache.org/jira/browse/CRUNCH-450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zhong Wang updated CRUNCH-450:
------------------------------
Description:
This JIRA adds ORC (Optimized Row Columnar) file format support in Crunch.
Three models are supported for ORC serialization/deserialization:
--
1) Orcs.orcs(): using OrcStructs as the deserialized objects to provide high
performance
2) Orcs.reflects(): using Java reflection to support POJOs as the deserialized
objects
3) Orcs.tuples(): using Crunch Tuples as the deserialized objects to balance
performance and user-friendliness
was:
This JIRA adds ORC file format support in Crunch by:
--
1. Adding input source and output target for ORC
2. Adding a new type family - OrcTypeFamily to serialize / deserialize objects
into OrcStruct
3. Supporting column pruning optimization
> Adding ORC file format support in Crunch
> ----------------------------------------
>
> Key: CRUNCH-450
> URL: https://issues.apache.org/jira/browse/CRUNCH-450
> Project: Crunch
> Issue Type: New Feature
> Components: Core, IO
> Reporter: Zhong Wang
> Assignee: Josh Wills
> Fix For: 0.11.0
>
> Attachments: CRUNCH-450-final.patch, CRUNCH-450-newapi.patch,
> CRUNCH-450-submodule.1.patch, CRUNCH-450-submodule.2.patch,
> CRUNCH-450-submodule.patch, CRUNCH-450.patch
>
>
> This JIRA adds ORC (Optimized Row Columnar) file format support in Crunch.
> Three models are supported for ORC serialization/deserialization:
> --
> 1) Orcs.orcs(): using OrcStructs as the deserialized objects to provide high
> performance
> 2) Orcs.reflects(): using Java reflection to support POJOs as the
> deserialized objects
> 3) Orcs.tuples(): using Crunch Tuples as the deserialized objects to balance
> performance and user-friendliness
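
To illustrate the idea behind the reflection-based model in item 2, here is a minimal, self-contained Java sketch of mapping a POJO to a positional row of column values and back. It is only a conceptual illustration and does not use any Crunch or ORC classes. ReflectRowSketch, Person, columns, toRow and fromRow are hypothetical names, and ordering the fields by name is an assumption made here for determinism, not something the issue describes.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class ReflectRowSketch {

    // Hypothetical POJO used only for illustration.
    public static class Person {
        public String name;
        public int age;
    }

    // Collect the instance fields of a class, sorted by name so that the
    // column order is stable (an assumption of this sketch).
    static List<Field> columns(Class<?> clazz) {
        List<Field> fields = new ArrayList<>();
        for (Field f : clazz.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue; // skip anything that is not a plain instance field
            }
            f.setAccessible(true);
            fields.add(f);
        }
        fields.sort(Comparator.comparing(Field::getName));
        return fields;
    }

    // "Serialize": read each field reflectively into a positional row.
    public static List<Object> toRow(Object pojo) {
        List<Object> row = new ArrayList<>();
        try {
            for (Field f : columns(pojo.getClass())) {
                row.add(f.get(pojo));
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return row;
    }

    // "Deserialize": create an instance via its no-arg constructor and
    // write each column value back into the matching field.
    public static <T> T fromRow(Class<T> clazz, List<Object> row) {
        try {
            T obj = clazz.getDeclaredConstructor().newInstance();
            List<Field> fields = columns(clazz);
            for (int i = 0; i < fields.size(); i++) {
                fields.get(i).set(obj, row.get(i));
            }
            return obj;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Person p = new Person();
        p.name = "Ada";
        p.age = 36;
        List<Object> row = toRow(p);
        Person copy = fromRow(Person.class, row);
        System.out.println(row + " -> " + copy.name + ", " + copy.age);
    }
}
```

The per-field reflective access in this sketch hints at why a struct-based model (item 1) may be faster, while a POJO-based model is more convenient to program against.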