Re: Custom return code

2016-09-02 Thread Pierre Villard
Any hint?

2016-08-31 20:40 GMT+02:00 Pierre Villard <pierre.villard...@gmail.com>:
> Hi,
>
> I am using Spark 1.5.2 and I am submitting a job (jar file) using the
> spark-submit command in YARN cluster mode. I'd like the command to return
> a custom return code.
>

Custom return code

2016-08-31 Thread Pierre Villard
Hi, I am using Spark 1.5.2 and I am submitting a job (jar file) using the spark-submit command in YARN cluster mode. I'd like the command to return a custom return code. In the run method, if I call sys.exit(myCode), the command always returns 0. The only way to have something not equal to 0 is
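For readers following the question, here is a minimal sketch of how a calling shell observes a command's return code. It does not invoke Spark: `fake_submit` is a hypothetical stand-in for the real spark-submit invocation, and the value 3 is an arbitrary illustrative code. It shows only the shell side of the problem, not how YARN cluster mode propagates the driver's exit status.

```shell
#!/bin/sh
# Hypothetical stand-in for `spark-submit ...`: finishes with the status given as $1.
fake_submit() {
  return "$1"
}

# Capture the status right away; `|| rc=$?` records a non-zero status
# without aborting the script if it runs under `set -e`.
rc=0
fake_submit 3 || rc=$?
echo "return code: $rc"

# Branch on the captured status, as a scheduler or wrapper script would.
if [ "$rc" -ne 0 ]; then
  echo "job reported failure"
fi
```

If the real command always returns 0, as described above, the branch is never taken, which is why the custom code matters to any wrapper script.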

Spark driver memory keeps growing

2016-08-08 Thread Pierre Villard
Hi, I'm running a job on Spark 1.5.2 and I get an OutOfMemoryError when accessing broadcast variables. I am not sure I understand why the broadcast keeps growing, or why it happens at this point in the code. Basically, I have a large input file in which each line has a key. I group my lines by key to

Re: Does saveAsHadoopFile depend on master?

2016-06-23 Thread Pierre Villard
zjf...@gmail.com> wrote:
>
>> Please check the driver and executor log, there should be logs about
>> where the data is written.
>>
>> On Wed, Jun 22, 2016 at 2:03 AM, Pierre Villard <
>> pierre.villard...@gmail.com> wrote:
>>
>>>

Does saveAsHadoopFile depend on master?

2016-06-21 Thread Pierre Villard
Hi, I have a Spark job writing files to HDFS using the .saveAsHadoopFile method. If I run my job in local/client mode, it works as expected and all my files are written to HDFS. However, if I switch to yarn/cluster mode, I don't see any error logs (the job is successful) and there are no files

MetadataFetchFailedException: Missing an output location for shuffle 0

2016-03-03 Thread Pierre Villard
Hi, I have set up a Spark job and it keeps failing even though I tried a lot of different configurations for the memory parameters (as suggested in other threads I read).

My configuration:
Cluster of 4 machines: 4 vCPU, 16 GB RAM.
YARN version: 2.7.1
Spark version: 1.5.2

I tried a lot of