Re: Spark SQL saveAsParquet failed after a few waves

2015-04-01 Thread Yijie Shen
I have 7 workers for Spark and set SPARK_WORKER_CORES=12, so up to 84 tasks in one job can run simultaneously; I call the tasks in a job that start almost simultaneously a wave. While inserting, there is only one job on Spark; I am not inserting from multiple programs concurrently.

Best Regards!
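For concreteness, here is a minimal sketch of the wave arithmetic described above; the 252-partition job size is a hypothetical illustration, not a figure from the thread:

    // Total task slots across the cluster: 7 workers x 12 cores each.
    val totalCores = 7 * 12                                  // = 84 slots per wave
    val numPartitions = 252                                  // hypothetical job size
    // One "wave" is one batch of tasks running simultaneously, so a job
    // needs ceil(numPartitions / totalCores) waves to run all of its tasks.
    val waves = math.ceil(numPartitions.toDouble / totalCores).toInt  // = 3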

Re: Spark SQL saveAsParquet failed after a few waves

2015-04-01 Thread Michael Armbrust
> When only a few waves (1 or 2) are used in a job, LoadApp can finish after a few failures and retries, but when more waves (3) are involved, the job terminates abnormally.

Can you clarify what you mean by waves? Are you inserting from multiple programs concurrently?

Spark SQL saveAsParquet failed after a few waves

2015-03-31 Thread Yijie Shen
Hi, I am using the spark-1.3 prebuilt release with hadoop2.4 support against Hadoop 2.4.0. I wrote a Spark application (LoadApp) to generate data in each task and load the data into HDFS as Parquet files (using "saveAsParquet()" in Spark SQL). When only a few waves (1 or 2) are used in a job, LoadApp can finish after a few failures and retries, but when more waves (3) are involved, the job terminates abnormally.
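For reference, a minimal sketch of what a loader in the spirit of LoadApp might look like on Spark 1.3; the case class, row counts, partition count, and output path are hypothetical placeholders (in the 1.3 DataFrame API the Parquet writer is spelled saveAsParquetFile):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    // Sketch: generate rows in each task, then write them to HDFS as Parquet.
    object LoadApp {
      case class Record(id: Long, payload: String)

      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("LoadApp"))
        val sqlContext = new SQLContext(sc)
        import sqlContext.implicits._

        val numPartitions = 252  // hypothetical; with 84 slots this is 3 waves
        val df = sc
          .parallelize(1 to numPartitions, numPartitions)
          .flatMap(p => (1 to 100000).map(i => Record(p * 100000L + i, s"row-$i")))
          .toDF()

        // Spark SQL 1.3 writes a DataFrame to Parquet via saveAsParquetFile.
        df.saveAsParquetFile("hdfs:///tmp/loadapp-output")

        sc.stop()
      }
    }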