Hi,

I'm writing a custom Storage Handler and need to run some custom code at the end of an INSERT query.

With the "mr" execution engine, I can easily do that by providing a custom OutputCommitter class and overriding the commitJob() method. However, commitJob() is never called when using Tez.

With Tez, I managed to get it to work partially by providing a custom HiveMetaHook class and overriding the commitInsertTable() method. However, that method only gets called at the end of an "INSERT INTO TABLE" query. It never gets called at the end of an "INSERT INTO TABLE PARTITION (...)" query.

After a bit of troubleshooting, it looks like Tez uses the "DDLTask" class (which later calls the commitInsertTable() method) only for an "INSERT INTO TABLE" query. When inserting into a specific partition, the "DDLTask" class doesn't seem to be used at all.

Is there a way for me to override some type of Tez hook to run custom code at the end of an "INSERT INTO TABLE PARTITION (...)" query? Maybe by somehow hooking into the TezTask or TezWork classes?

Any tips would be very welcome.

Thanks!
Julien
