[ 
https://issues.apache.org/jira/browse/HIVE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13718628#comment-13718628
 ] 

Xuefu Zhang commented on HIVE-4885:
-----------------------------------

[~appodictic] Thanks for your input. I originally thought it was only a matter
of attribute ordering, but while fixing it I found that the serialized output
also differs structurally. For instance:

JDK6:
{code}
<object id="GenericUDAFEvaluator$Mode0" 
class="org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$Mode" 
method="valueOf"> 
  <string>PARTIAL1</string> 
</object> 
{code}

JDK7:
{code}
<object id="GenericUDAFEvaluator$Mode0" class="java.lang.Enum" method="valueOf">
  <class>org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$Mode</class>
  <string>PARTIAL1</string> 
</object> 
{code}

I'm not sure a DOM comparison can treat these two forms as equal without 
customization.
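
To make the concern concrete, here is a minimal sketch of a plain DOM comparison using the standard `Node.isEqualNode` from `org.w3c.dom`. The class `PlanDomCompare` and its helpers are hypothetical names for illustration only; they are not part of Hive or of the attached patch. The sketch assumes whitespace-only text nodes can be dropped before comparing.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper, not Hive code: compares two XML strings as DOM trees.
public class PlanDomCompare {

    // Parse an XML string and drop whitespace-only text nodes so that
    // indentation and line wrapping do not affect the comparison.
    static Document parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            stripBlankText(doc.getDocumentElement());
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Recursively remove text nodes that contain only whitespace.
    static void stripBlankText(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling();
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripBlankText(child);
            }
            child = next;
        }
    }

    // isEqualNode compares attributes as an unordered map, but it still
    // requires the same attribute values and the same child elements.
    static boolean sameDom(String a, String b) {
        return parse(a).getDocumentElement()
                .isEqualNode(parse(b).getDocumentElement());
    }

    public static void main(String[] args) {
        String mode = "org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator$Mode";

        // Same element, attributes written in a different order.
        String a = "<object id=\"MapRedTask0\" class=\"org.apache.hadoop.hive.ql.exec.MapRedTask\"/>";
        String b = "<object class=\"org.apache.hadoop.hive.ql.exec.MapRedTask\" id=\"MapRedTask0\"/>";
        System.out.println("attribute order only: " + sameDom(a, b));

        // The JDK6 and JDK7 enum encodings quoted above.
        String jdk6 = "<object id=\"GenericUDAFEvaluator$Mode0\" class=\"" + mode
                + "\" method=\"valueOf\"><string>PARTIAL1</string></object>";
        String jdk7 = "<object id=\"GenericUDAFEvaluator$Mode0\" class=\"java.lang.Enum\""
                + " method=\"valueOf\"><class>" + mode
                + "</class><string>PARTIAL1</string></object>";
        System.out.println("JDK6 vs JDK7 enum: " + sameDom(jdk6, jdk7));
    }
}
```

In this sketch the attribute-order case compares as equal, while the two enum encodings do not, because both the `class` attribute value and the child elements differ. Making those two forms equal would need some custom normalization step on top of the DOM comparison.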

My patch doesn't elegantly solve the problem, but it fixes the test failures 
while allowing us to look more into the serialization/comparison strategy.
                
> Alternative object serialization for execution plan in hive testing 
> --------------------------------------------------------------------
>
>                 Key: HIVE-4885
>                 URL: https://issues.apache.org/jira/browse/HIVE-4885
>             Project: Hive
>          Issue Type: Improvement
>          Components: CLI
>    Affects Versions: 0.10.0, 0.11.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>             Fix For: 0.12.0
>
>         Attachments: HIVE-4885.patch
>
>
> Currently, many test cases compare execution plans, such as those in the 
> TestParse suite. XMLEncoder is used to serialize the plan generated by Hive 
> and store it in a file for file-diff comparison. However, XMLEncoder's output 
> is tied to the JDK, whose implementation may change from version to version. 
> Thus, upgrading the JDK can produce many spurious test failures. The 
> following is an example of the diff generated when running Hive with JDK7:
> {code}
> Begin query: case_sensitivity.q
> diff -a 
> /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.out
>  
> /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/parse/case_sensitivity.q.out
> diff -a -b 
> /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.xml
>  
> /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/plan/case_sensitivity.q.xml
> 3c3
> <  <object class="org.apache.hadoop.hive.ql.exec.MapRedTask" id="MapRedTask0">
> ---
> >  <object id="MapRedTask0" 
> > class="org.apache.hadoop.hive.ql.exec.MapRedTask"> 
> 12c12
> <        <object class="java.util.ArrayList" id="ArrayList0">
> ---
> >        <object id="ArrayList0" class="java.util.ArrayList"> 
> 14c14
> <          <object class="org.apache.hadoop.hive.ql.exec.MoveTask" 
> id="MoveTask0">
> ---
> >          <object id="MoveTask0" 
> > class="org.apache.hadoop.hive.ql.exec.MoveTask"> 
> 18c18
> <              <object class="org.apache.hadoop.hive.ql.exec.MoveTask" 
> id="MoveTask1">
> ---
> >              <object id="MoveTask1" 
> > class="org.apache.hadoop.hive.ql.exec.MoveTask"> 
> 22c22
> <                  <object class="org.apache.hadoop.hive.ql.exec.StatsTask" 
> id="StatsTask0">
> ---
> >                  <object id="StatsTask0" 
> > class="org.apache.hadoop.hive.ql.exec.StatsTask"> 
> 60c60
> <                  <object class="org.apache.hadoop.hive.ql.exec.MapRedTask" 
> id="MapRedTask1">
> ---
> >                  <object id="MapRedTask1" 
> > class="org.apache.hadoop.hive.ql.exec.MapRedTask"> 
> {code}
> As can be seen, the only difference is the order of the attributes in the 
> serialized XML document, yet it causes 50+ test failures in Hive.
> We need a better plan comparison, or a different object serialization, to 
> improve the situation.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
