[ https://issues.apache.org/jira/browse/HIVE-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764283#action_12764283 ]
Cyrus Katrak commented on HIVE-493:
-----------------------------------

Ping? I was halfway through writing a script to do this when I found this thread.

bq. You can do 'alter table <tbl> add partition <partition spec>' at the end of the map-reduce job that creates the partition. You don't really need 'automatic inference' unless you do not have any control over the partition creation process

In one of my use cases I don't have control over partition creation (intra-cluster copying of a Hive table).

Ignoring the other issues (indices/compaction), I think an HQL solution would be useful, e.g.:
"ALTER TABLE <tbl> ADD PARTITION AUTOSCAN"
This would add entries for partitions on HDFS that don't exist in the metastore.

> automatically infer existing partitions of table from HDFS files.
> -----------------------------------------------------------------
>
>                 Key: HIVE-493
>                 URL: https://issues.apache.org/jira/browse/HIVE-493
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Metastore, Query Processor
>    Affects Versions: 0.3.0, 0.3.1, 0.4.0
>            Reporter: Prasad Chakka
>
> Initially, the partition list for a table was inferred from the HDFS directory
> structure instead of being looked up in the metastore (where partitions are
> created using 'alter table ... add partition'). This automatic inference was
> removed in favor of the latter approach when the metastore checker feature was
> checked in, and also to facilitate external partitions.
> Joydeep and Frederick mentioned that it would be simpler for users to create
> the HDFS directory and let Hive infer the partition rather than explicitly add
> it. But doing that raises the following issues:
> 1) External partitions -- we would have to mix both approaches, so the
> partition list becomes a merged list of inferred and registered partitions,
> and duplicates have to be resolved.
> 2) Partition-level schemas can't be supported. Which schema should be chosen
> for an inferred partition: the table schema at the time the inferred
> partition was created, or the latest table schema? And how do we know the
> table schema at the time the inferred partition was created?
> 3) If partitions have to be registered, a partition can be disabled without
> actually deleting its data. This feature is not currently supported and may
> not be that useful, but it cannot be supported at all with inferred
> partitions.
> 4) Indexes are being added. If partitions are not registered, indexes for
> such partitions cannot be maintained automatically.
> I would like to know the general thinking about this among users of Hive. If
> inferred partitions are preferred, can we live with the restricted
> functionality that this imposes?
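
For illustration, the AUTOSCAN behaviour proposed in the comment (registering partitions that exist on HDFS but not in the metastore) could be approximated outside Hive by a script along the following lines. This is only a rough sketch, not the script mentioned in the comment. It assumes the usual key=value partition directory layout under the table location, and it assumes the caller has already collected the HDFS directory listing and the registered partition names (in the 'ds=.../hr=...' form that SHOW PARTITIONS prints). The function names, the table name and the paths are all hypothetical.

```python
def partition_spec(relative_path):
    """Parse 'ds=2009-10-01/hr=12' into (('ds', '2009-10-01'), ('hr', '12')).

    Returns None if any path component is not of the form key=value.
    """
    parts = []
    for component in relative_path.strip("/").split("/"):
        key, sep, value = component.partition("=")
        if not sep or not key:
            return None  # not a key=value directory, so not a partition
        parts.append((key, value))
    return tuple(parts)


def missing_partition_statements(table, table_location, partition_keys,
                                 hdfs_dirs, registered):
    """Build ADD PARTITION statements for directories the metastore lacks.

    hdfs_dirs:  absolute directory paths found under the table location
    registered: partition names already in the metastore ('ds=.../hr=...')
    """
    known = {partition_spec(name) for name in registered}
    root = table_location.rstrip("/")
    wanted_keys = tuple(partition_keys)
    statements = []
    for directory in sorted(hdfs_dirs):
        if not directory.startswith(root + "/"):
            continue  # outside the table location
        spec = partition_spec(directory[len(root) + 1:])
        if spec is None or spec in known:
            continue  # not a partition directory, or already registered
        if tuple(key for key, _ in spec) != wanted_keys:
            continue  # e.g. an intermediate level of a multi-key partition
        # Note: values are not escaped here; a real script would need to.
        clause = ", ".join("%s='%s'" % (key, value) for key, value in spec)
        statements.append("ALTER TABLE %s ADD PARTITION (%s) LOCATION '%s'"
                          % (table, clause, directory))
    return statements


if __name__ == "__main__":
    for statement in missing_partition_statements(
            "logs", "/warehouse/logs", ["ds"],
            ["/warehouse/logs/ds=2009-10-01", "/warehouse/logs/ds=2009-10-02"],
            ["ds=2009-10-01"]):
        print(statement)
```

A script like this only addresses the registration gap. It does nothing about the partition-level schema, disabled-partition or index concerns raised in the issue description, which is part of why a solution inside Hive may still be preferable.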