[ 
https://issues.apache.org/jira/browse/HIVE-12877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

yangfang updated HIVE-12877:
----------------------------
    Description: 
When a file is compressed, Hive builds the index using the extracted 
(uncompressed) file length. But when the data is divided into splits for 
MapReduce, Hive compares the file length with the extracted file length. 
If the two lengths do not match, it filters out the file, so the query 
loses some data.
I modified the source code so that the Hive index can be used when the files 
are compressed. Please test it.

  was:
When a file is compressed, Hive builds the index using the extracted 
(uncompressed) file length. But when the data is divided into splits for 
MapReduce, Hive compares the file length with the extracted file length. 
If the two lengths do not match, it filters out the file, so the query 
loses some data.


> Queries that use a Hive index lose some data if the queried file is 
> compressed.
> -------------------------------------------------------------------------------
>
>                 Key: HIVE-12877
>                 URL: https://issues.apache.org/jira/browse/HIVE-12877
>             Project: Hive
>          Issue Type: Bug
>          Components: Indexing
>    Affects Versions: 1.2.1
>         Environment: This problem exists in all Hive versions, regardless of 
> platform.
>            Reporter: yangfang
>         Attachments: HIVE-12877.patch
>
>
> When a file is compressed, Hive builds the index using the extracted 
> (uncompressed) file length. But when the data is divided into splits for 
> MapReduce, Hive compares the file length with the extracted file length. 
> If the two lengths do not match, it filters out the file, so the query 
> loses some data.
> I modified the source code so that the Hive index can be used when the files 
> are compressed. Please test it.
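
The failure mode described above can be illustrated with a small sketch. 
This is not Hive's code and not the attached patch, whose approach may 
differ. IndexEntry, select_files_strict, select_files_tolerant, and the 
paths and lengths are all hypothetical names and values chosen for 
illustration. The sketch only shows why comparing an on-disk (compressed) 
length against an indexed (uncompressed) length drops compressed files.

```python
from dataclasses import dataclass


@dataclass
class IndexEntry:
    """Hypothetical index record: a file path plus the length stored at index build time."""
    path: str
    recorded_length: int  # per the report, the extracted (uncompressed) length


def select_files_strict(entries, on_disk_lengths):
    """Keep a file only if its on-disk length equals the indexed length.

    This mirrors the behaviour described in the report: a compressed file
    never matches, so it is filtered out and its rows are lost.
    """
    return [e.path for e in entries if on_disk_lengths[e.path] == e.recorded_length]


def select_files_tolerant(entries, on_disk_lengths, compressed_paths):
    """Assumed workaround: skip the length comparison for compressed files."""
    return [
        e.path
        for e in entries
        if e.path in compressed_paths or on_disk_lengths[e.path] == e.recorded_length
    ]


# Hypothetical data: part-0.gz is 300 bytes on disk but was indexed as 1000 bytes.
entries = [
    IndexEntry("/warehouse/t/part-0.gz", 1000),
    IndexEntry("/warehouse/t/part-1", 500),
]
on_disk = {"/warehouse/t/part-0.gz": 300, "/warehouse/t/part-1": 500}
compressed = {"/warehouse/t/part-0.gz"}

print(select_files_strict(entries, on_disk))
print(select_files_tolerant(entries, on_disk, compressed))
```

With the strict check, only the uncompressed file survives filtering. The 
tolerant variant keeps both files, at the cost of no longer validating the 
length of compressed files.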



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
