[ 
https://issues.apache.org/jira/browse/HIVE-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661380#action_12661380
 ] 

Jeff Hammerbacher commented on HIVE-207:
----------------------------------------

I learned a lot, too. Could someone with a handle on Hive compress this 
discussion into reusable documentation and post it to the Hive site?

> Change SerDe API to allow skipping unused columns
> -------------------------------------------------
>
>                 Key: HIVE-207
>                 URL: https://issues.apache.org/jira/browse/HIVE-207
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor, Serializers/Deserializers
>            Reporter: David Phillips
>
> A deserializer shouldn't have to deserialize columns that are never used by 
> the query processor.  A serializer shouldn't have to examine unused columns 
> that are known to always be null.
> As an example, we store data as a Protocol Buffer structure with ~60 fields.  
> Running a "select count(1)" currently requires deserializing all fields, 
> which includes checking if they exist and formatting the data appropriately.  
> This is expensive and unnecessary.
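
A rough illustration of the idea in the description: the row object decodes a
column only when the query processor asks for it, so a row count touches no
column at all. Everything here (LazyRow, LazyRowDemo, getField, decodeCount,
the comma-delimited text format) is a hypothetical stand-in chosen for the
sketch. It is not Hive's SerDe API and not the reporter's Protocol Buffer
setup.

```java
import java.util.List;

// Hypothetical sketch: these names are assumptions, not part of Hive.
final class LazyRow {
    private final String[] rawFields;   // undecoded text, one entry per column
    private final Object[] decoded;     // cache of values decoded so far
    private final boolean[] isDecoded;  // which columns have been decoded
    private int decodeCount = 0;        // number of columns actually decoded

    LazyRow(String rawLine) {
        // Splitting only locates field boundaries; no value is converted yet.
        this.rawFields = rawLine.split(",", -1);
        this.decoded = new Object[rawFields.length];
        this.isDecoded = new boolean[rawFields.length];
    }

    int columnCount() {
        return rawFields.length;
    }

    // Decodes the requested column on first access and caches the result.
    Object getField(int index) {
        if (!isDecoded[index]) {
            decoded[index] = decode(rawFields[index]);
            isDecoded[index] = true;
            decodeCount++;
        }
        return decoded[index];
    }

    int decodeCount() {
        return decodeCount;
    }

    // Stand-in for the expensive part: existence checks and type conversion.
    private static Object decode(String raw) {
        if (raw.isEmpty()) {
            return null;
        }
        try {
            return Long.valueOf(raw);
        } catch (NumberFormatException e) {
            return raw;
        }
    }
}

final class LazyRowDemo {
    // Analogue of "select count(1)": no column is ever requested.
    static long countRows(List<LazyRow> rows) {
        return rows.size();
    }

    // Analogue of summing one column: only that column is decoded per row.
    static long sumColumn(List<LazyRow> rows, int column) {
        long total = 0;
        for (LazyRow row : rows) {
            Object value = row.getField(column);
            if (value instanceof Long) {
                total += (Long) value;
            }
        }
        return total;
    }
}
```

With rows built from "1,alice,10" and "2,bob,32", countRows leaves every
row's decodeCount at zero, while sumColumn on the last column decodes exactly
one field per row. An eager deserializer would have decoded all three fields
of every row in both cases.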
