[ https://issues.apache.org/jira/browse/HIVE-6262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13879369#comment-13879369 ]
Gunther Hagleitner commented on HIVE-6262:
------------------------------------------
The tests ran successfully:
http://bigtop01.cloudera.org:8080/job/PreCommit-HIVE-Build/994/testReport/ (the
7 failures are unrelated). Unfortunately, JIRA was down when the tests completed,
so there was no automatic update.
> Remove unnecessary copies of schema + table desc from serialized plan
> ---------------------------------------------------------------------
>
> Key: HIVE-6262
> URL: https://issues.apache.org/jira/browse/HIVE-6262
> Project: Hive
> Issue Type: Bug
> Reporter: Gunther Hagleitner
> Assignee: Gunther Hagleitner
> Attachments: HIVE-6262.1.patch
>
>
> Currently for a partitioned table the following are true:
> - for each partitiondesc we send a copy of the corresponding tabledesc
> - for each partitiondesc we send two copies of the schema (in different
> formats).
> Obviously we need to send different schemas if they are required by schema
> evolution, but in our case we'll always end up with multiple copies.
> The effect can be dramatic. Removing those copies for partitioned tables can
> easily reduce plan size by 8-10x. Plans themselves can be 10s to 100s of
> MB (even with Kryo). The size difference also plays out in every task we run
> on the cluster.
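
To make the size argument above concrete, here is a rough sketch in Python. It
is not Hive code and does not reflect the actual patch: the dict layouts, the
function names, and the use of JSON in place of Kryo are all assumptions made
for illustration. It also ignores the schema evolution case, where partitions
really do need their own schema. It only compares the serialized size of a
plan that embeds the table descriptor in every partition entry against one
that stores the descriptor once.

```python
import json


def make_table_desc(num_columns):
    # Hypothetical stand-in for a table descriptor: a name plus column schema.
    return {
        "name": "sales",
        "columns": [
            {"name": f"col_{i}", "type": "string"} for i in range(num_columns)
        ],
    }


def plan_with_copies(table_desc, num_partitions):
    # Each partition entry embeds the full table descriptor, so the
    # serialized form repeats it once per partition.
    return {
        "partitions": [
            {"spec": f"ds={i}", "table": table_desc}
            for i in range(num_partitions)
        ]
    }


def plan_deduplicated(table_desc, num_partitions):
    # The table descriptor is stored once; partitions keep only their spec.
    return {
        "table": table_desc,
        "partitions": [{"spec": f"ds={i}"} for i in range(num_partitions)],
    }


def serialized_size(plan):
    # JSON is used here only as a simple, deterministic serializer.
    return len(json.dumps(plan).encode("utf-8"))


table = make_table_desc(num_columns=50)
dup_size = serialized_size(plan_with_copies(table, num_partitions=1000))
dedup_size = serialized_size(plan_deduplicated(table, num_partitions=1000))
print(dup_size, dedup_size, round(dup_size / dedup_size, 1))
```

The ratio printed by this toy says nothing about real Hive plans, which carry
much more than table and partition descriptors; it only shows that the
duplicated form grows with (partitions x descriptor size) while the
deduplicated form grows with (partitions + descriptor size).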
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)