Re: Task failures and other problems

2017-11-09 Thread Jörn Franke
Sorry, I thought that with InfiniBand it was their appliance :)

> On 9. Nov 2017, at 23:38, Vadim Semenov wrote:
>
> Probably not Oracle but Cloudera 🙂
>
> Jan, I think your DataNodes might be overloaded, I'd suggest reducing
> `spark.executor.cores` if you run executors alongside DataNodes, so the

Re: Task failures and other problems

2017-11-09 Thread Vadim Semenov
Probably not Oracle but Cloudera 🙂

Jan, I think your DataNodes might be overloaded. I'd suggest reducing `spark.executor.cores` if you run executors alongside DataNodes, so that the DataNode process gets some resources. The other thing you can do is to increase `dfs.client.socket-timeout` in hado
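A minimal sketch of how the two settings above could be tried per job, under stated assumptions: it passes the HDFS client timeout through Spark's `spark.hadoop.` property prefix rather than through a Hadoop configuration file, which may not be the route the message had in mind. The helper name `spark_submit_conf_args` and the values 2 cores and 120000 ms are hypothetical and only illustrate the shape of the change.

```python
def spark_submit_conf_args(executor_cores, socket_timeout_ms):
    """Build `--conf key=value` arguments for spark-submit.

    Hypothetical helper for illustration; the values passed in are
    assumptions, not recommendations from the thread.
    """
    settings = {
        # Fewer cores per executor leaves CPU for a co-located DataNode.
        "spark.executor.cores": str(executor_cores),
        # Spark copies spark.hadoop.* properties into the Hadoop
        # configuration it uses; the timeout is in milliseconds.
        "spark.hadoop.dfs.client.socket-timeout": str(socket_timeout_ms),
    }
    args = []
    for key, value in settings.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    # Print the flags so they can be pasted into a spark-submit command.
    print(" ".join(spark_submit_conf_args(executor_cores=2,
                                          socket_timeout_ms=120000)))
```

Setting the values per job this way makes it easy to test whether the task failures change before touching cluster-wide configuration.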

Re: Task failures and other problems

2017-11-09 Thread Jan-Hendrik Zab
Jörn Franke writes:

> Maybe contact Oracle support?

Something like that would be the last option I guess, university money is usually hard to come by for such things.

> Do you have maybe accidentally configured some firewall rules? Routing
> issues? Maybe only one of the nodes...

All systems

Re: Task failures and other problems

2017-11-09 Thread Jörn Franke
Maybe contact Oracle support?

Do you have maybe accidentally configured some firewall rules? Routing issues? Maybe only one of the nodes...

> On 9. Nov 2017, at 20:04, Jan-Hendrik Zab wrote:
>
>
> Hello!
>
> This might not be the perfect list for the issue, but I tried user@
> previously

Task failures and other problems

2017-11-09 Thread Jan-Hendrik Zab
Hello!

This might not be the perfect list for the issue, but I tried user@ previously with the same issue, though with a bit less information, to no avail. So I'm hoping someone here can point me in the right direction.

We're using Spark 2.2 on CDH 5.13 (Hadoop 2.6 with patches) and a lot of our