I have to run several processes in parallel, one per input dataset. In
each process, one step is an insertion into a Hive table that is shared by
all of these processes.

Would I get conflicts if the insertions run in parallel? It's fine if
they block each other, but I need to guarantee the correctness of the data.

If there are conflicts, would "INSERT OVERWRITE PARTITION" conflict as well?
The different processes each write to a different partition.
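
For concreteness, each process would run something like the statements
below. This is only a sketch: the table, column, and partition names
(shared_table, staging_a, staging_b, ds) are made up for illustration, and
only the partition value differs between processes.

```sql
-- Process A: hypothetical names, writes only partition ds='a'
INSERT OVERWRITE TABLE shared_table PARTITION (ds = 'a')
SELECT id, value FROM staging_a;

-- Process B: runs at the same time, writes only partition ds='b'
INSERT OVERWRITE TABLE shared_table PARTITION (ds = 'b')
SELECT id, value FROM staging_b;
```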

thanks
Yang
