[ 
https://issues.apache.org/jira/browse/HIVE-8720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14196510#comment-14196510
 ] 

Sergey Shelukhin commented on HIVE-8720:
----------------------------------------

+1

> Update orc_merge tests to make it consistent across OS'es
> ---------------------------------------------------------
>
>                 Key: HIVE-8720
>                 URL: https://issues.apache.org/jira/browse/HIVE-8720
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>         Attachments: HIVE-8720.1.patch, orc_merge5_filedump_macosx.txt, 
> orc_merge5_filedump_opensuse.txt
>
>
> The orc_merge*.q test cases fail with qfile diffs related to file size on
> different OSes. I have seen failures on openSUSE and CentOS. The order in
> which rows are inserted into an ORC table affects the file size because of
> run-length encoding. Since the row order is not guaranteed during insertion,
> we may get different file sizes. We cannot add ORDER BY to the insert
> queries, as that would force insertion through a single reducer, which
> disables the ORC merge file optimization. Since these test cases only test
> whether the files are merged, it is sufficient to know the number of files
> after merging. Instead of DESCRIBE FORMATTED (which shows both numFiles and
> fileSize), we can use "dfs -ls" to count the files.
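
To illustrate why row order can change the encoded size, here is a minimal sketch using a toy run-length encoder. It is not ORC's actual integer encoding, which is more elaborate; the function name and the sample rows are assumptions made up for this example. It only shows that the same set of values can compress to a different number of runs depending on the order in which they arrive.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive equal values into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# The same nine values, arriving in two different orders (hypothetical data).
rows_sorted = [1, 1, 1, 2, 2, 2, 3, 3, 3]
rows_interleaved = [1, 2, 3, 1, 2, 3, 1, 2, 3]

sorted_runs = rle_encode(rows_sorted)
interleaved_runs = rle_encode(rows_interleaved)

# Fewer runs means fewer encoded entries, and so a smaller encoded column.
print(len(sorted_runs), len(interleaved_runs))
```

Because the number of runs depends on arrival order, a byte-level file size is not a stable value to assert on in a test, whereas the number of files after merging is.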



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
