[ https://issues.apache.org/jira/browse/HIVE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661380#action_12661380 ]
Jeff Hammerbacher commented on HIVE-207:
----------------------------------------

I learned a lot, too. Could someone with a handle on Hive compress this discussion into reusable documentation and post it to the Hive site?

> Change SerDe API to allow skipping unused columns
> -------------------------------------------------
>
>                 Key: HIVE-207
>                 URL: https://issues.apache.org/jira/browse/HIVE-207
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor, Serializers/Deserializers
>            Reporter: David Phillips
>
> A deserializer shouldn't have to deserialize columns that are never used by
> the query processor. A serializer shouldn't have to examine unused columns
> that are known to always be null.
> As an example, we store data as a Protocol Buffer structure with ~60 fields.
> Running a "select count(1)" currently requires deserializing all fields,
> which includes checking if they exist and formatting the data appropriately.
> This is expensive and unnecessary.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.