[
https://issues.apache.org/jira/browse/HIVE-4078?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13620884#comment-13620884
]
Gopal V commented on HIVE-4078:
-------------------------------
No, [~namit], cloneBean() does not perform a proper deep copy. I dropped
that approach during the third iteration.

SerializationUtils.clone() from Apache Commons does perform a deep clone, as
does uk.com.robust-it's cloning library. Neither works here, though, because
some of the tree items don't implement Serializable or depend on the
getter/setter actions.
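
To illustrate the Serializable problem, here is a minimal sketch using only the JDK. SerializationUtils.clone() relies on standard Java serialization, so the round trip below is assumed to behave the same way for this purpose. The class names (PlanNode, OpaqueDesc) are hypothetical stand-ins, not Hive classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

class SerializationCloneDemo {

    // Hypothetical stand-in for a tree item that does NOT implement Serializable.
    static class OpaqueDesc {
        String name = "desc";
    }

    // Hypothetical tree node: Serializable itself, but its payload may not be.
    static class PlanNode implements Serializable {
        private static final long serialVersionUID = 1L;
        final String label;
        final Object payload;

        PlanNode(String label, Object payload) {
            this.label = label;
            this.payload = payload;
        }
    }

    // Deep copy by writing the object graph to bytes and reading it back.
    static <T extends Serializable> T cloneViaSerialization(T obj)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(obj);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }

    // Returns false when any object reachable from obj is not Serializable.
    static boolean canClone(Serializable obj) {
        try {
            cloneViaSerialization(obj);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // A String payload is Serializable, so the whole tree clones.
        System.out.println(canClone(new PlanNode("ok", "a string payload")));  // prints true
        // One non-Serializable item anywhere in the tree makes the clone fail.
        System.out.println(canClone(new PlanNode("bad", new OpaqueDesc())));   // prints false
    }
}
```

A single non-Serializable item anywhere in the graph is enough to break the whole clone, which is why this approach cannot be applied to the plan tree as it stands.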

I updated the patch to skip the serialize/deserialize step when the tasks are
non-conditional, since the conversion doesn't need to be reversible in that
case.

Skipping that step speeds up query27, but the conditional map-joins still need
to go through the slow serialize/deserialize pair inside the for loop.
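
The idea can be sketched roughly as follows. This is not the patch itself: JoinWork, convert() and deepCopy() are hypothetical names, Java serialization stands in for the XML round trip that Hive uses, and picking the first alias as the big table in the non-conditional branch is an arbitrary choice for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class LazyCloneSketch {

    // Hypothetical, heavily simplified stand-in for a join plan.
    static class JoinWork implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<String> aliases;
        String bigTableAlias; // null until converted to a map-join

        JoinWork(List<String> aliases) {
            this.aliases = new ArrayList<>(aliases);
        }
    }

    // Counts how often the expensive copy actually runs.
    static int deepCopies = 0;

    // Expensive deep copy via a serialize/deserialize round trip.
    static JoinWork deepCopy(JoinWork work) {
        deepCopies++;
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(work);
            }
            try (ObjectInputStream in =
                    new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (JoinWork) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("deep copy failed", e);
        }
    }

    static List<JoinWork> convert(JoinWork original, boolean conditional) {
        List<JoinWork> result = new ArrayList<>();
        if (!conditional) {
            // Non-conditional: the conversion never has to be undone,
            // so mutate the original in place and skip the copy.
            original.bigTableAlias = original.aliases.get(0);
            result.add(original);
            return result;
        }
        // Conditional: each candidate needs its own independent copy,
        // so the round trip still runs once per loop iteration.
        for (String alias : original.aliases) {
            JoinWork candidate = deepCopy(original);
            candidate.bigTableAlias = alias;
            result.add(candidate);
        }
        return result;
    }

    public static void main(String[] args) {
        convert(new JoinWork(Arrays.asList("a", "b", "c")), false);
        System.out.println(deepCopies); // prints 0
        convert(new JoinWork(Arrays.asList("a", "b", "c")), true);
        System.out.println(deepCopies); // prints 3
    }
}
```

In the conditional branch the original stays untouched, which is what makes the expensive copy unavoidable there.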
> Delay the serialize-deserialize pair in CommonJoinTaskDispatcher
> ----------------------------------------------------------------
>
> Key: HIVE-4078
> URL: https://issues.apache.org/jira/browse/HIVE-4078
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Reporter: Gopal V
> Assignee: Gopal V
> Labels: client, perfomance
> Attachments: HIVE-4078-20130305.2.patch, HIVE-4078-20130305.patch,
> HIVE-4078-20130406.patch
>
>
> CommonJoinProcessor tries to clone a MapredWork while attempting a conversion
> to a map-join
> {code}
> // deep copy a new mapred work from xml
> InputStream in = new ByteArrayInputStream(xml.getBytes("UTF-8"));
> MapredWork newWork = Utilities.deserializeMapRedWork(in,
> physicalContext.getConf());
> {code}
> which is a very heavy operation, both memory-wise and CPU-wise.
> It would be better to do this only if a conditional task is required,
> resulting in a copy of the task.