I have to run several processes in parallel, each on a different input dataset. In each process, one step is an insertion into a Hive table that is shared by all of the processes.
Would I get conflicts if the insertions run in parallel? It's fine if they block each other, but I need to guarantee the correctness of the data. If there are conflicts, would "INSERT OVERWRITE PARTITION" get conflicts too? The different processes do write to different partitions.

Thanks,
Yang