[ 
https://issues.apache.org/jira/browse/HIVE-18563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Deepak Jaiswal reassigned HIVE-18563:
-------------------------------------

    Assignee: Deepak Jaiswal

> "Load data into table" behavior is different between 1.2.1 and 1.2.1000
> -----------------------------------------------------------------------
>
>                 Key: HIVE-18563
>                 URL: https://issues.apache.org/jira/browse/HIVE-18563
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive, HiveServer2
>         Environment: * OS : CentOS6
>  * JDK : 1.8.0_152(Oracle)
>  * HDP : 2.3.2.0 and 2.6.2.0
>  * Hive : 1.2.1.2.3.2.0-2950 and 1.2.1000.2.6.2.0-205
>            Reporter: Junichi Oda
>            Assignee: Deepak Jaiswal
>            Priority: Major
>
> After upgrading HDP from 2.3.2.0 to 2.6.2.0, the "load data into table" 
> behavior changed.
> Data arrives hourly, and all files have the same name.
> {code:java}
> /user/user1/logs/yyyymmdd/00/part-r-00000.gz
> /user/user1/logs/yyyymmdd/01/part-r-00000.gz
> /user/user1/logs/yyyymmdd/02/part-r-00000.gz
> /user/user1/logs/yyyymmdd/03/part-r-00000.gz
> ・・・・・・・・・・・・・・・・・・・・・・・
> /user/user1/logs/yyyymmdd/22/part-r-00000.gz
> /user/user1/logs/yyyymmdd/23/part-r-00000.gz
> {code}
> Before upgrade (HDP 2.3.2.0)
> {code:java}
> HQL
> hive> load data inpath '/user/user1/logs/yyyymmdd/*/*.gz' into table 
> sample_db.sample_tbl partition (dt='yyyymmdd');
>  
>  
> Result
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_1.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_10.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_11.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_12.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_13.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_14.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_15.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_16.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_17.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_18.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_19.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_2.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_20.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_21.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_22.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_23.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_3.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_4.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_5.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_6.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_7.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_8.gz
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000_copy_9.gz
> {code}
> All files except part-r-00000.gz were renamed to 
> part-r-00000_copy_*.gz.
> After upgrade (HDP 2.6.2.0)
> {code:java}
> HQL
> hive> load data inpath '/user/user1/logs/yyyymmdd/*/*.gz' into table 
> sample_db.sample_tbl partition (dt='yyyymmdd');
>  
> Result
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd
> /hive/warehouse/sample_db.db/sample_tbl/dt=yyyymmdd/part-r-00000.gz
> {code}
> There is only part-r-00000.gz.
> This file is the same as part-r-00000_copy_23.gz from the earlier result.
> When the files are loaded one by one, all of them are loaded, as in the 
> HDP 2.3.2.0 environment.
> Why is the behavior different between 2.3.2.0 and 2.6.2.0?
> Thanks in advance
>  
> https://community.hortonworks.com/questions/158176/load-data-into-table-behavior-is-different-between.html
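
The report notes that loading the files one by one still brings in every file on HDP 2.6.2.0. A minimal sketch of that workaround follows. It assumes the 24 hourly directories and the single file name shown above, and it only prints one load statement per hour. The function name gen_load_statements and the DT variable are hypothetical, and yyyymmdd is the same placeholder the report uses.

```shell
#!/bin/sh
# Sketch only: emit one LOAD DATA statement per hourly directory (00..23),
# mirroring the "load files one by one" workaround described in the report.
# DT is the report's placeholder; substitute a real date before use.
DT="yyyymmdd"

gen_load_statements() {
  hour=0
  while [ "$hour" -lt 24 ]; do
    # Zero-pad the hour to match the directory names 00..23.
    hh=$(printf '%02d' "$hour")
    printf "load data inpath '/user/user1/logs/%s/%s/part-r-00000.gz' into table sample_db.sample_tbl partition (dt='%s');\n" "$DT" "$hh" "$DT"
    hour=$((hour + 1))
  done
}

gen_load_statements
```

The output could be redirected to a file and run with hive -f. Whether each single-file load then gets a _copy_N name on 2.6.2.0 rests on the reporter's observation and is not verified here.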



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
