Hi, I have a DataFrame that I compute from a long chain of transformations. I cache it, and then derive two further DataFrames from it. I use two Futures, where each Future inserts the contents of one of those DataFrames into a different Hive table. One Future must run with `SET hive.exec.dynamic.partition=true` and the other with `false`.
How can I run both INSERT commands in parallel, but guarantee that each runs with its own settings? If I use two separate HiveContexts, the cached result of the initial long chain of transformations is not reusable between them. If I use the same HiveContext, a race condition between the threads may cause one INSERT to execute with the wrong config.
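To make the problem concrete, here is a minimal sketch of what I mean (names like `longChainOfTransformations`, `table_a`, and `table_b` are illustrative placeholders, and I'm on the Spark 1.x-style HiveContext API):

```scala
import scala.concurrent.{ExecutionContext, Future}
import ExecutionContext.Implicits.global

// Expensive pipeline, computed once and cached for reuse.
val base = longChainOfTransformations(hiveContext).cache()

// Two derived DataFrames, each destined for a different Hive table.
val dfA = transformA(base)
val dfB = transformB(base)

val insertA = Future {
  // This INSERT needs dynamic partitioning ON.
  hiveContext.sql("SET hive.exec.dynamic.partition=true")
  dfA.write.mode("append").insertInto("table_a")
}

val insertB = Future {
  // This INSERT needs dynamic partitioning OFF.
  hiveContext.sql("SET hive.exec.dynamic.partition=false")
  dfB.write.mode("append").insertInto("table_b")
}

// Race: both Futures share one HiveContext, so one thread's SET can
// take effect between the other thread's SET and its insertInto.
```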