Hi, good morning.
I am using Spark batch jobs to process extracts of several RDBMS tables and file-based systems, which arrive at regular intervals, and to ingest them into a data lake as ORC-backed Hive tables. Because the input file sizes, file counts, row counts, and feature (column) counts vary quite a lot between extracts, I have not been able to settle on a good value for coalesce. "alter table concatenate" looked like an easy way to work around the small-files pressure we are seeing on the NameNode.

Sorry about the long story. I ran into this issue earlier today: ALTER TABLE ... CONCATENATE does not work as expected (SPARK-20592 <https://issues.apache.org/jira/browse/SPARK-20592>). After some digging in the sql module, I found that the concatenate operation is deliberately listed under unsupportedHiveNativeCommands in the Antlr grammar.

Are there strong reservations against enabling this? If not, I can take a stab at it and put up a PR for review.

Cheers,
Arun
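
P.S. For concreteness, here is a minimal sketch of the pattern I mean. The table name, path, schema, and coalesce factor are placeholders rather than our actual job; it just assumes a Hive-enabled SparkSession writing an ORC-backed table.

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .appName("concatenate-repro")
    .enableHiveSupport()
    .getOrCreate()

  // One incoming extract; size/row/column counts differ from run to run.
  val extract = spark.read.orc("/landing/events/2017-05-03")

  // Today's workaround: guess an output file count up front. The "right"
  // number changes with every extract, which is what makes it hard to tune.
  extract.coalesce(8)
    .write
    .format("orc")
    .mode("append")
    .saveAsTable("demo_db.events")

  // What I would like to run instead, as Hive does, to merge the small ORC
  // files after the fact. Spark's parser currently rejects it because the
  // CONCATENATE clause is listed under unsupportedHiveNativeCommands in the
  // Antlr grammar, so it fails with roughly:
  //   ParseException: Operation not allowed: ALTER TABLE ... CONCATENATE
  spark.sql("ALTER TABLE demo_db.events CONCATENATE")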