I have 7 Spark workers and set SPARK_WORKER_CORES=12, so 84 tasks in
one job can run simultaneously.
I call the set of tasks in a job that start almost simultaneously a wave.
While inserting, there is only one job running on Spark; we are not inserting
from multiple programs concurrently.
—
Best Regards!
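For illustration, the slot and wave arithmetic described above can be sketched as follows. The worker and core counts are the ones quoted in this thread; the task counts in the examples are made up:

```python
# Cluster figures from this thread: 7 workers, SPARK_WORKER_CORES=12.
workers = 7
cores_per_worker = 12
slots = workers * cores_per_worker  # 84 tasks can run at once

def waves(num_tasks, slots):
    """Number of scheduling 'waves' a job needs: ceil(num_tasks / slots)."""
    return -(-num_tasks // slots)  # ceiling division without math.ceil

print(waves(84, slots))   # fits in 1 wave
print(waves(200, slots))  # needs 3 waves
```

A job with at most 84 tasks finishes in a single wave; anything above 168 tasks needs three or more waves on this cluster.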
When only a few waves (1 or 2) are needed for a job, LoadApp can finish after
a few failures and retries.
But when more waves (3) are involved in a job, the job terminates
abnormally.
Can you clarify what you mean by waves? Are you inserting from multiple
programs concurrently?
Hi,
I am using the spark-1.3 prebuilt release with Hadoop 2.4 support, running on
Hadoop 2.4.0.
I wrote a Spark application (LoadApp) that generates data in each task and
loads the data into HDFS as Parquet files (using saveAsParquet() in Spark SQL).
When only a few waves (1 or 2) are needed for a job, LoadApp can finish after
a few failures and retries.
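For reference, a minimal PySpark sketch of what a LoadApp-style job might look like on Spark 1.3. All names, row counts, and the output path are illustrative assumptions, not the actual application; only saveAsParquet() and the 84-slot figure come from the thread:

```python
def gen_rows(task_id, rows_per_task=1000):
    # Pure-Python data generator executed inside each Spark task.
    # Row shape (id, payload) is a made-up example.
    return [(task_id * 100000 + i, "row-%d" % i) for i in range(rows_per_task)]

def main():
    # Spark 1.3 entry point (requires a running cluster; sketched, not executed here).
    from pyspark import SparkConf, SparkContext
    from pyspark.sql import SQLContext

    sc = SparkContext(conf=SparkConf().setAppName("LoadApp"))
    sqlContext = SQLContext(sc)

    num_tasks = 200  # roughly 3 waves on 84 slots
    rdd = sc.parallelize(range(num_tasks), num_tasks).flatMap(gen_rows)
    df = sqlContext.createDataFrame(rdd, ["id", "payload"])

    # Spark 1.3 Spark SQL writer; later releases replace this with df.write.parquet().
    df.saveAsParquet("hdfs:///tmp/loadapp-output")  # hypothetical output path
    sc.stop()
```

One partition per task means each of the 200 tasks runs gen_rows() once, so on this cluster the job schedules in three waves, matching the failing case described above.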