[ 
https://issues.apache.org/jira/browse/HIVE-4885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737311#comment-13737311
 ] 

Brock Noland commented on HIVE-4885:
------------------------------------

Sounds good. I am +1 on the patch as well.
                
> Alternative object serialization for execution plan in hive testing 
> --------------------------------------------------------------------
>
>                 Key: HIVE-4885
>                 URL: https://issues.apache.org/jira/browse/HIVE-4885
>             Project: Hive
>          Issue Type: Improvement
>          Components: CLI
>    Affects Versions: 0.10.0, 0.11.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>             Fix For: 0.12.0
>
>         Attachments: HIVE-4885.patch
>
>
> Currently, many test cases compare execution plans, such as those in the 
> TestParse suite. XMLEncoder is used to serialize the plan generated by Hive 
> and store it in a file for diff-based comparison. However, XMLEncoder's 
> output is tied to the JDK, whose implementation may change from version to 
> version. Thus, upgrading the JDK can cause many spurious test failures. The 
> following is an example of the diff generated when running Hive with JDK7:
> {code}
> Begin query: case_sensitivity.q
> diff -a /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.out /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/parse/case_sensitivity.q.out
> diff -a -b /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/build/ql/test/logs/positive/case_sensitivity.q.xml /data/4/hive-local/a2307.halxg.cloudera.com-hiveptest-2/cdh-source/ql/src/test/results/compiler/plan/case_sensitivity.q.xml
> 3c3
> <  <object class="org.apache.hadoop.hive.ql.exec.MapRedTask" id="MapRedTask0">
> ---
> >  <object id="MapRedTask0" class="org.apache.hadoop.hive.ql.exec.MapRedTask">
> 12c12
> <        <object class="java.util.ArrayList" id="ArrayList0">
> ---
> >        <object id="ArrayList0" class="java.util.ArrayList">
> 14c14
> <          <object class="org.apache.hadoop.hive.ql.exec.MoveTask" id="MoveTask0">
> ---
> >          <object id="MoveTask0" class="org.apache.hadoop.hive.ql.exec.MoveTask">
> 18c18
> <              <object class="org.apache.hadoop.hive.ql.exec.MoveTask" id="MoveTask1">
> ---
> >              <object id="MoveTask1" class="org.apache.hadoop.hive.ql.exec.MoveTask">
> 22c22
> <                  <object class="org.apache.hadoop.hive.ql.exec.StatsTask" id="StatsTask0">
> ---
> >                  <object id="StatsTask0" class="org.apache.hadoop.hive.ql.exec.StatsTask">
> 60c60
> <                  <object class="org.apache.hadoop.hive.ql.exec.MapRedTask" id="MapRedTask1">
> ---
> >                  <object id="MapRedTask1" class="org.apache.hadoop.hive.ql.exec.MapRedTask">
> {code}
> As can be seen, the only difference is the order of the attributes in the 
> serialized XML document, yet it causes 50+ test failures in Hive.
> We need a better plan comparison, or an alternative object serialization, to 
> improve the situation.
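
To illustrate the "better plan comparison" option, here is a minimal sketch that compares two serialized plans as DOM trees instead of as text, so that attribute order no longer matters. It is not taken from the attached patch; the class name PlanXmlCompare and the method samePlan are hypothetical, and it relies only on the standard JDK DOM parser and Node.isEqualNode, which compares attribute sets without regard to their order.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Hive: structural comparison of two plan XML strings.
public class PlanXmlCompare {

    // Parse an XML string into a DOM document using the JDK's built-in parser.
    private static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setIgnoringComments(true);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse plan XML", e);
        }
    }

    // True when both documents have the same elements, attributes, and text.
    // Attribute order is ignored; whitespace-only text nodes are still compared.
    public static boolean samePlan(String generatedXml, String expectedXml) {
        return parse(generatedXml).getDocumentElement()
                .isEqualNode(parse(expectedXml).getDocumentElement());
    }

    public static void main(String[] args) {
        // The two attribute orders seen in the diff above.
        String generated =
            "<object class=\"org.apache.hadoop.hive.ql.exec.MapRedTask\" id=\"MapRedTask0\"/>";
        String expected =
            "<object id=\"MapRedTask0\" class=\"org.apache.hadoop.hive.ql.exec.MapRedTask\"/>";
        System.out.println(samePlan(generated, expected));
    }
}
```

A comparison like this would still report real differences, such as a changed id or class value. Its main cost is that a failing test no longer yields a readable line-by-line diff, which is one reason an alternative serialization may be preferable.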

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
