Re: ALTER TABLE DROP PARTITION not working on S3

Furcy Pin Thu, 05 Apr 2018 07:03:35 -0700

¯\_(ツ)_/¯

On 5 April 2018 at 16:02, Richard A. Bross <[email protected]> wrote:


> I can't duplicate the issue now.  Works like it always has . .
>
> ----- Original Message -----
> From: "Furcy Pin" <[email protected]>
> To: [email protected]
> Sent: Thursday, April 5, 2018 9:57:39 AM
> Subject: Re: ALTER TABLE DROP PARTITION not working on S3
>
>
> Indeed.
>
>
> If I remember correctly, s3 does not really have the concept of "folder"
> like HDFS has, and Hive sort of makes up for it by creating a descriptor
> file
> where the partition "folder" is supposed to be. Maybe this is what is
> missing here.
>
>
> Perhaps you could try doing a "MSCK REPAIR TABLE tablename" to make sure
> that the partitions are correctly loaded and then try again dropping that
> particular partition?
>
>
> Or look at your s3 folder if you see any such "partition folder file" and
> check if it is missing for this particular partition?
>
>
>
>
> On 5 April 2018 at 15:40, Richard A. Bross < [email protected] > wrote:
>
>
> Leaving the column list out, here you go:
>
> # Detailed Table Information
> Database: default
> Owner: hadoop
> CreateTime: Thu Apr 05 13:24:33 UTC 2018
> LastAccessTime: UNKNOWN
> Retention: 0
> Location: s3://zoomi-proto-warehouse-measurements/
> Table Type: MANAGED_TABLE
> Table Parameters:
> COLUMN_STATS_ACCURATE {\"BASIC_STATS\":\"true\"}
> numFiles 10
> numPartitions 7
> numRows 153
> orc.compress ZLIB
> orc.create.index true
> rawDataSize 113563
> totalSize 37801
> transient_lastDdlTime 1522934673
>
> # Storage Information
> SerDe Library: org.apache.hadoop.hive.ql.io .orc.OrcSerde
> InputFormat: org.apache.hadoop.hive.ql.io .orc.OrcInputFormat
> OutputFormat: org.apache.hadoop.hive.ql.io .orc.OrcOutputFormat
> Compressed: No
> Num Buckets: 61
> Bucket Columns: [crs_id]
> Sort Columns: []
> Storage Desc Params:
> serialization.format 1
> Time taken: 0.467 seconds, Fetched: 98 row(s)
>
>
> ----- Original Message -----
> From: "Furcy Pin" < [email protected] >
> To: [email protected]
> Sent: Thursday, April 5, 2018 9:21:06 AM
>
>
> Subject: Re: ALTER TABLE DROP PARTITION not working on S3
>
>
> Hi Richard,
>
>
> could you please check if your table is EXTERNAL?
> You can see it with a "DESCRIBE FORMATTED table_name ;"
>
>
> That's what external tables are for, they don't delete underlying data
> when you drop them.
>
>
>
>
> On 5 April 2018 at 15:18, Richard A. Bross < [email protected] > wrote:
>
>
> I think that someone put a file in there manually. Would that prevent Hive
> from dropping the partition. I also did a "drop table" and the s3 object
> keys persisted.
>
>
>
> ----- Original Message -----
> From: "Richard A. Bross" < [email protected] >
> To: [email protected]
> Sent: Thursday, April 5, 2018 9:14:52 AM
> Subject: ALTER TABLE DROP PARTITION not working on S3
>
> Hi,
>
> I have a Hive managed table on S3, "api_measurements". I've tried dropping
> a partition like so:
>
> hive> alter table api_measurements drop if exists
> partition(daydate='2018-04-04', epoch=1522876500);
> Dropped the partition daydate=2018-04-04/epoch=1522876500
> OK
> Time taken: 2.109 seconds
>
> Yet the data is still on S3. Because object keys on S3 are always strings,
> I also tried this:
>
> hive> alter table api_measurements drop partition(daydate='2018-04-04',
> epoch='1522876500');
> OK
> Time taken: 0.135 seconds
>
> Yet the object keys and data are still there. I assume that Im missing
> something really simple. Can anyone shed some light on this?
>
> Thanks
>
>
>
>

Re: ALTER TABLE DROP PARTITION not working on S3

Reply via email to