[ 
https://issues.apache.org/jira/browse/HIVE-16784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Barna Zsombor Klara updated HIVE-16784:
---------------------------------------
    Attachment: HIVE-16784.02.patch

Uploading a new patch. It seems we cannot simply rewrite the path for the 
lineage information, as it may already have been used. Instead, we will 
duplicate it. Any cleaner suggestions would be much appreciated.
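
To illustrate the difference between rewriting and duplicating, here is a minimal sketch. It is not the patch itself: it assumes, purely for illustration, that lineage is tracked in a plain map keyed by output path. The class name, the method name, and the paths are all hypothetical and are not Hive's actual API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a lineage index keyed by output path.
// None of these names are Hive's real classes or methods.
final class LineageDuplicationSketch {

    // Copies the lineage entries registered under oldPath to newPath.
    // The oldPath entry is left in place, so anything that already
    // looked it up under the old path keeps working.
    static void duplicate(Map<String, List<String>> index, String oldPath, String newPath) {
        List<String> existing = index.get(oldPath);
        if (existing == null) {
            return; // nothing registered under the old path
        }
        index.put(newPath, new ArrayList<>(existing));
    }

    public static void main(String[] args) {
        Map<String, List<String>> index = new HashMap<>();
        // Made-up paths, for illustration only.
        index.put("/tmp/staging/ds=2010-01-01", new ArrayList<>(Arrays.asList("key", "value")));
        duplicate(index, "/tmp/staging/ds=2010-01-01", "/warehouse/add_part_test/ds=2010-01-01");
        System.out.println(index.keySet());
    }
}
```

The copy is a new list, so later changes to one entry do not leak into the other. Whether that independence is desirable in the real code is an open question.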

> Missing lineage information when hive.blobstore.optimizations.enabled is true
> -----------------------------------------------------------------------------
>
>                 Key: HIVE-16784
>                 URL: https://issues.apache.org/jira/browse/HIVE-16784
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Marta Kuczora
>            Assignee: Barna Zsombor Klara
>             Fix For: 3.0.0
>
>         Attachments: HIVE-16784.01.patch, HIVE-16784.02.patch, 
> HIVE-16784.02.patch
>
>
> Running the commands of the add_part_multiple.q test on S3 with 
> hive.blobstore.optimizations.enabled=true fails because of missing lineage 
> information.
> Running the command on HDFS
> {noformat}
> from src TABLESAMPLE (1 ROWS)
> insert into table add_part_test PARTITION (ds='2010-01-01') select 100,100
> insert into table add_part_test PARTITION (ds='2010-02-01') select 200,200
> insert into table add_part_test PARTITION (ds='2010-03-01') select 400,300
> insert into table add_part_test PARTITION (ds='2010-04-01') select 500,400;
> {noformat}
> results in the following posthook output:
> {noformat}
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-01-01).key EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-01-01).value EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-02-01).key EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-02-01).value EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-03-01).key EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-03-01).value EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-04-01).key EXPRESSION []
> POSTHOOK: Lineage: add_part_test2 PARTITION(ds=2010-04-01).value EXPRESSION []
> {noformat}
> These lines are not printed when running the command on the table located in 
> S3.
> If hive.blobstore.optimizations.enabled=false, the lineage information is 
> printed.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
