[jira] [Updated] (HIVE-22077) Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata
[ https://issues.apache.org/jira/browse/HIVE-22077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui An updated HIVE-22077:
--------------------------
    Attachment: HIVE-22077.patch.1
        Status: Patch Available  (was: Open)

> Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-22077
>                 URL: https://issues.apache.org/jira/browse/HIVE-22077
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 2.3.4, 1.1.1, 4.0.0
>            Reporter: Hui An
>            Assignee: Hui An
>            Priority: Major
>         Attachments: HIVE-22077.patch.1
>
>
> Inserting overwrite static partitions may not clean related HDFS location if partitions' info is not stored in metadata.
> Steps to reproduce this issue :
>
> 1. Create a managed table :
>
> {code:sql}
> CREATE TABLE `test`(
>   `id` string)
> PARTITIONED BY (
>   `dayno` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
> STORED AS INPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
> OUTPUTFORMAT
>   'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
> LOCATION
>   'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'
> TBLPROPERTIES (
>   'transient_lastDdlTime'='1564731656')
> {code}
>
> 2. Create partition's directory and put some data in it
>
> {code:java}
> hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
> hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
> {code}
>
> 3. Insert overwrite partition dayno=20190802
>
> {code:sql}
> INSERT OVERWRITE TABLE test PARTITION(dayno='20190802')
> SELECT "some value";
> {code}
>
> 4. We could see the test.data under partition directory is not deleted.
>

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
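Editor's sketch of the reported behaviour: the reproduction steps above can be mimicked with a small in-memory model. This is only an illustration of what the report describes, not Hive's implementation. The class `ToyWarehouse`, the function `reproduce`, the flag `clean_unregistered`, and the output file name `000000_0` are all hypothetical names introduced here; the flag does not correspond to any Hive setting.

```python
from typing import Dict, List, Set


class ToyWarehouse:
    """Toy stand-in for a metastore plus a file system. Not Hive code."""

    def __init__(self) -> None:
        # Partitions the pretend metastore knows about.
        self.registered: Set[str] = set()
        # Partition directory name -> file names stored in it.
        self.files: Dict[str, List[str]] = {}

    def put_file(self, partition: str, name: str) -> None:
        """Mimic `hdfs dfs -put`: add a file without touching the metastore."""
        self.files.setdefault(partition, []).append(name)

    def insert_overwrite(
        self, partition: str, new_file: str, clean_unregistered: bool
    ) -> None:
        """Mimic INSERT OVERWRITE into a static partition.

        `clean_unregistered` is a hypothetical switch: False models the
        reported behaviour (cleanup only for partitions in the metastore),
        True models the behaviour the reporter expects.
        """
        if partition in self.registered or clean_unregistered:
            self.files[partition] = []  # drop the old contents
        self.files.setdefault(partition, []).append(new_file)
        self.registered.add(partition)  # the partition is now in metadata


def reproduce(clean_unregistered: bool) -> List[str]:
    """Run steps 2 to 4 of the report against the toy model."""
    wh = ToyWarehouse()
    wh.put_file("dayno=20190802", "test.data")  # step 2
    wh.insert_overwrite("dayno=20190802", "000000_0", clean_unregistered)  # step 3
    return sorted(wh.files["dayno=20190802"])  # step 4: list the directory


if __name__ == "__main__":
    print(reproduce(clean_unregistered=False))
    print(reproduce(clean_unregistered=True))
```

With `clean_unregistered=False` the listing still contains `test.data` next to the newly written file, which matches step 4 of the report; with `True` only the new file remains.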
[jira] [Updated] (HIVE-22077) Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata
[ https://issues.apache.org/jira/browse/HIVE-22077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui An updated HIVE-22077:
--------------------------
    Description:
Inserting overwrite static partitions may not clean related HDFS location if partitions' info is not stored in metadata.
Steps to reproduce this issue :

1. Create a managed table :

{code:sql}
CREATE TABLE `test`(
  `id` string)
PARTITIONED BY (
  `dayno` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION
  'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1564731656')
{code}

2. Create partition's directory and put some data in it

{code:java}
hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
{code}

3. Insert overwrite partition dayno=20190802

{code:sql}
INSERT OVERWRITE TABLE test PARTITION(dayno='20190802')
SELECT "some value";
{code}

4. We could see the test.data under partition directory is not deleted.

  was:
Inserting overwrite static partitions may not clean related HDFS location if partitions' info is not stored in metadata.
Steps to reproduce this issue :

1. Create a managed table :

{code:sql}
CREATE TABLE `test`(
  `id` string)
PARTITIONED BY (
  `dayno` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION |
  'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1564731656')
{code}

2. Create partition's directory and put some data in it

{code:java}
hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
{code}

3. Insert overwrite partition dayno=20190802

{code:sql}
INSERT OVERWRITE TABLE test PARTITION(dayno='20190802')
SELECT "some value";
{code}

4. We could see the test.data under partition directory is not deleted.


> Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-22077
>                 URL: https://issues.apache.org/jira/browse/HIVE-22077
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.1.1, 4.0.0, 2.3.4
>            Reporter: Hui An
>            Assignee: Hui An
>            Priority: Major
>
> Inserting overwrite static partitions may not clean related HDFS location if partitions' info is not stored in metadata.
> Steps to reproduce this issue :
>
> 1. Create a managed table :
>
> {code:sql}
> CREATE TABLE `test`(
>   `id` string)
> PARTITIONED BY (
>   `dayno` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.
[jira] [Updated] (HIVE-22077) Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata
[ https://issues.apache.org/jira/browse/HIVE-22077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui An updated HIVE-22077:
--------------------------
    Description:
Inserting overwrite static partitions may not clean related HDFS location if partitions' info is not stored in metadata.
Steps to reproduce this issue :

1. Create a managed table :

{code:sql}
CREATE TABLE `test`(
  `id` string)
PARTITIONED BY (
  `dayno` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION |
  'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1564731656')
{code}

2. Create partition's directory and put some data in it

{code:java}
hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
{code}

3. Insert overwrite partition dayno=20190802

{code:sql}
INSERT OVERWRITE TABLE test PARTITION(dayno='20190802')
SELECT "some value";
{code}

4. We could see the test.data under partition directory is not deleted.

  was:
Inserting overwrite static partitions may not clean related HDFS location if partitions' info is not stored in metadata.
Steps to Reproduce this issue :

1. Create a managed table :

{code:sql}
CREATE TABLE `test`(
  `id` string)
PARTITIONED BY (
  `dayno` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION |
  'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1564731656')
{code}

2. Create partition's directory and put some data under it

{code:java}
hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
{code}

3. Insert overwrite partition dayno=20190802

{code:sql}
INSERT OVERWRITE TABLE test PARTITION(dayno='20190802')
SELECT "some value";
{code}

4. We could see the test.data under partition directory is not deleted.


> Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-22077
>                 URL: https://issues.apache.org/jira/browse/HIVE-22077
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.1.1, 4.0.0, 2.3.4
>            Reporter: Hui An
>            Assignee: Hui An
>            Priority: Major
>
> Inserting overwrite static partitions may not clean related HDFS location if partitions' info is not stored in metadata.
> Steps to reproduce this issue :
>
> 1. Create a managed table :
>
> {code:sql}
> CREATE TABLE `test`(
>   `id` string)
> PARTITIONED BY (
>   `dayno` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive
[jira] [Updated] (HIVE-22077) Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata
[ https://issues.apache.org/jira/browse/HIVE-22077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hui An updated HIVE-22077:
--------------------------
    Description:
Inserting overwrite static partitions may not clean related HDFS location if partitions' info is not stored in metadata.
Steps to Reproduce this issue :

1. Create a managed table :

{code:sql}
CREATE TABLE `test`(
  `id` string)
PARTITIONED BY (
  `dayno` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION |
  'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1564731656')
{code}

2. Create partition's directory and put some data under it

{code:java}
hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
{code}

3. Insert overwrite partition dayno=20190802

{code:sql}
INSERT OVERWRITE TABLE test PARTITION(dayno='20190802')
SELECT "some value";
{code}

4. We could see the test.data under partition directory is not deleted.

  was:
Inserting overwrite static partitions may not clean related HDFS location if partitions' info is not stored in metadata.
Steps to Reproduce this issue :

1. Create a managed table :

{code:sql}
CREATE TABLE `test`(
  `id` string)
PARTITIONED BY (
  `dayno` string)
ROW FORMAT SERDE
  'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
STORED AS INPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
OUTPUTFORMAT
  'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
LOCATION |
  'hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test'
TBLPROPERTIES (
  'transient_lastDdlTime'='1564731656')
{code}

2. Create partition's directory and put some data under it

{code:java}
hdfs dfs -mkdir hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
hdfs dfs -put test.data hdfs://test-dev-hdfs/user/hive/warehouse/test.db/test/dayno=20190802
{code}

3. Insert overwrite partition dayno=20190802

{code:sql}
INSERT OVERWRITE TABLE test PARTITION(dayno='20190802')
SELECT 1;
{code}

4. We could see the test.data under partition directory is not deleted.


> Inserting overwrite partitions clause does not clean directories while partitions' info is not stored in metadata
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-22077
>                 URL: https://issues.apache.org/jira/browse/HIVE-22077
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>    Affects Versions: 1.1.1, 4.0.0, 2.3.4
>            Reporter: Hui An
>            Assignee: Hui An
>            Priority: Major
>
> Inserting overwrite static partitions may not clean related HDFS location if partitions' info is not stored in metadata.
> Steps to Reproduce this issue :
>
> 1. Create a managed table :
>
> {code:sql}
> CREATE TABLE `test`(
>   `id` string)
> PARTITIONED BY (
>   `dayno` string)
> ROW FORMAT SERDE
>   'org.apache.hadoop.hive.ql.io.o