[ https://issues.apache.org/jira/browse/HIVE-553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Zheng Shao updated HIVE-553:
----------------------------

    Summary: Add BinarySortableSerDe to Hive  (was: Add LazyBinarySerDe to Hive)

> Add BinarySortableSerDe to Hive
> -------------------------------
>
>                 Key: HIVE-553
>                 URL: https://issues.apache.org/jira/browse/HIVE-553
>             Project: Hadoop Hive
>          Issue Type: New Feature
>    Affects Versions: 0.4.0
>            Reporter: Zheng Shao
>         Attachments: HIVE-553.2.patch
>
>
> Currently the most popular SerDe in Hive is LazySimpleSerDe. LazySimpleSerDe
> has the benefit of being simple (it stores data in a text format), but its
> performance may suffer in the following cases:
> 1. Double values are stored in text format, which is very space-inefficient,
> and both serialization and deserialization are slow;
> 2. For complex-typed columns with many levels of nesting, the buffer is
> scanned once per level, which is very inefficient.
> We should add a binary serde format that stores the data in binary format.
> The format should have the following properties:
> 1. Compact: it should be space-efficient;
> 2. Fast: it should be efficient to deserialize the data, especially for
> double values and complex types;
> 3. It should support serializing NULL values.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
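
The contrast the description draws between text and binary storage of double values, and the requirement to serialize NULL, can be illustrated with a minimal sketch. It is not the format of the attached patch or of any Hive SerDe: the class name, the method names, and the layout (one marker byte, then the 8 big-endian bytes of the IEEE 754 value) are assumptions chosen only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not Hive code and not the HIVE-553 format.
class DoubleEncodingSketch {

    // Text encoding: the decimal string of the value, as a text-based
    // format would store it. Length varies with the number of digits.
    static byte[] encodeText(double v) {
        return Double.toString(v).getBytes(StandardCharsets.UTF_8);
    }

    // Assumed binary layout: 1 marker byte (0 = NULL, 1 = value present),
    // followed by the 8 big-endian bytes of the double when present.
    static byte[] encodeBinary(Double v) {
        if (v == null) {
            return new byte[] {0};
        }
        return ByteBuffer.allocate(9).put((byte) 1).putDouble(v).array();
    }

    // Decoding reads the marker byte, then the fixed-width payload;
    // no digit parsing is needed.
    static Double decodeBinary(byte[] b) {
        if (b[0] == 0) {
            return null;
        }
        return ByteBuffer.wrap(b, 1, 8).getDouble();
    }

    public static void main(String[] args) {
        double v = 0.1234567890123456;
        System.out.println("text bytes:   " + encodeText(v).length);
        System.out.println("binary bytes: " + encodeBinary(v).length);
        System.out.println("round trip:   " + decodeBinary(encodeBinary(v)));
        System.out.println("null decoded: " + decodeBinary(encodeBinary(null)));
    }
}
```

In this sketch a non-NULL double always occupies 9 bytes, while the text form grows with the number of significant digits, and decoding avoids string parsing entirely. A real format would also need to handle complex types and, for a sortable variant, byte-order concerns that this sketch does not address.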