Hi, I need to run a batch job written in Java that executes several SQL statements on different Hive tables and then processes each partition of the result sets in a foreachPartition() operator. I'd like to run these actions in parallel. I see two approaches for achieving this:
1. Using the java.util.concurrent package, e.g. Future/ForkJoinPool.
2. Transforming my Dataset to JavaRDD<Row> and calling foreachPartitionAsync() on the RDD.

Can you recommend the best way to achieve this using one of these options, or suggest a better approach?

Thanks,
Guy
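For context on option 1, here is a minimal sketch of the java.util.concurrent pattern: an ExecutorService submits one task per table and the Futures are collected afterwards. The Spark calls are replaced by a placeholder string so the sketch is self-contained; in the real job each task body would run the SQL for its table and call foreachPartition() on the resulting Dataset (Spark actions submitted from separate threads do run as concurrent jobs on the same SparkSession). The class and method names here are illustrative, not from the original message.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelJobs {
    // Runs one "job" per Hive table name in parallel and returns the results.
    // In the real batch job, each task would execute its SQL statement and
    // process partitions via dataset.foreachPartition(...) instead of
    // returning this placeholder string.
    public static List<String> runAll(List<String> tables) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(tables.size());
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String table : tables) {
                futures.add(pool.submit(() -> "processed:" + table));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> f : futures) {
                results.add(f.get()); // blocks until that table's job finishes
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runAll(Arrays.asList("t1", "t2", "t3")));
    }
}
```

Option 2's foreachPartitionAsync() returns a JavaFutureAction, so it achieves a similar effect from a single driver thread, but it forces the Dataset-to-RDD conversion and gives up Dataset/Catalyst optimizations, which is why many jobs stick with plain threads around Dataset actions.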