Tagar commented on PR #39115: URL: https://github.com/apache/spark/pull/39115#issuecomment-1366276253
I had a customer who runs MSCK on really huge tables, and the operation takes them hours to complete. So this looks the same as the 2nd bullet point in @wecharyu's "why are the changes needed?". That particular customer uses Athena and wants to populate the Glue Catalog with the list of newly added partitions.

It would be great for the MSCK operation to have an optional partition clause, so that it would not rescan the whole table but only those partitions matching the PARTITION pattern. For example:

```sql
MSCK TABLE [db_name.]table_name PARTITION '202107%'
```
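To make the intended matching semantics concrete, here is a rough sketch of what a `'202107%'` filter could mean, assuming the pattern follows SQL `LIKE` rules (`%` for any run of characters, `_` for one character). The partition column `dt`, the sample values, and both helper functions are hypothetical and not part of the PR; the sketch only builds the existing `ALTER TABLE ... ADD IF NOT EXISTS PARTITION` statements for already-discovered partition values and does not touch storage or the catalog.

```python
import re


def like_to_regex(pattern: str) -> re.Pattern:
    """Translate a SQL LIKE pattern (% and _) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")   # % matches any run of characters
        elif ch == "_":
            parts.append(".")    # _ matches exactly one character
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def add_partition_statements(table, column, values, pattern):
    """Build ADD PARTITION statements only for values matching the pattern."""
    rx = like_to_regex(pattern)
    return [
        f"ALTER TABLE {table} ADD IF NOT EXISTS PARTITION ({column}='{v}')"
        for v in values
        if rx.fullmatch(v)  # the whole value must match, as with LIKE
    ]


# Hypothetical partition values, e.g. taken from a directory listing.
found = ["20210630", "20210701", "20210715", "20210801"]
stmts = add_partition_statements("db_name.table_name", "dt", found, "202107%")
for s in stmts:
    print(s)
```

Only the two July 2021 values produce statements here, which is the saving the proposed clause is after: the rest of the table is never considered.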