[jira] Commented: (HIVE-625) Use of BinarySortableSerDe for serialization of the value between map and reduce boundary

Zheng Shao (JIRA) Fri, 10 Jul 2009 11:59:39 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729772#action_12729772
 ]


Zheng Shao commented on HIVE-625:
---------------------------------

For this particular case, I think predicate pushdown will push the filter to 
the mapper side, and the column pruner will prune out all columns that are not 
accessed.
So the reducer will probably read all of the columns that are passed across 
the map/reduce boundary.

I agree there can still be opposite cases, but they won't appear often. 
I can also make this SerDe configurable if that's a better idea.

What do you think?


> Use of BinarySortableSerDe for serialization of the value between map and 
> reduce boundary
> -----------------------------------------------------------------------------------------
>
>                 Key: HIVE-625
>                 URL: https://issues.apache.org/jira/browse/HIVE-625
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-625.1.patch
>
>
> We currently use LazySimpleSerDe, which serializes doubles to text format. 
> Until we have LazyBinarySerDe, we should switch to BinarySortableSerDe, 
> because that's still much faster than LazySimpleSerDe.
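
To make the text-versus-binary point in the description concrete, here is a 
minimal Java sketch of the two ways a double can be encoded. It is not Hive 
code: the class name DoubleEncodingSketch and its methods are hypothetical, 
and the sort-preserving bit flip is the general technique for order-preserving 
double encoding, which I am assuming is close to what a binary sortable format 
does rather than quoting from any SerDe source.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not taken from the Hive code base.
final class DoubleEncodingSketch {

    // Text encoding: variable length, requires formatting on write.
    static byte[] toTextBytes(double d) {
        return Double.toString(d).getBytes(StandardCharsets.UTF_8);
    }

    // Text decoding: requires parsing the decimal string on read.
    static double fromTextBytes(byte[] bytes) {
        return Double.parseDouble(new String(bytes, StandardCharsets.UTF_8));
    }

    // Binary encoding: always 8 bytes, big-endian. The bit flip makes the
    // unsigned byte-wise order of the output match the numeric order.
    static byte[] toSortableBytes(double d) {
        long bits = Double.doubleToLongBits(d);
        if (bits < 0) {
            bits = ~bits;            // negative: invert every bit
        } else {
            bits ^= Long.MIN_VALUE;  // non-negative: set the sign bit
        }
        return ByteBuffer.allocate(8).putLong(bits).array();
    }

    // Binary decoding: undo the bit flip; no string parsing involved.
    static double fromSortableBytes(byte[] bytes) {
        long bits = ByteBuffer.wrap(bytes).getLong();
        if (bits < 0) {
            bits ^= Long.MIN_VALUE;  // was non-negative: clear the sign bit
        } else {
            bits = ~bits;            // was negative: invert every bit back
        }
        return Double.longBitsToDouble(bits);
    }

    // Compares two equal-length byte arrays as unsigned bytes.
    static int compareUnsigned(byte[] a, byte[] b) {
        for (int i = 0; i < a.length; i++) {
            int cmp = Integer.compare(a[i] & 0xff, b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        double value = -1234.5678;
        System.out.println("text bytes:   " + toTextBytes(value).length);
        System.out.println("binary bytes: " + toSortableBytes(value).length);
        System.out.println("round trip:   " + fromSortableBytes(toSortableBytes(value)));
    }
}
```

The binary path avoids number formatting and parsing entirely, which is the 
kind of saving the description refers to. How large the gain is in practice 
depends on the data and is not measured here.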

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
