Re: Filter cannot be pushed via a Join

2019-06-18 Thread William Wong
Hi Xiao, I have just reported this as JIRA SPARK-28103. https://issues.apache.org/jira/browse/SPARK-28103 Thanks and Regards, William On Wed, 19 Jun 2019 at 1:35 AM, Xiao Li wrote: > Hi, William, > > Thanks for reporting it. Could you open a JIRA? > > Cheers, > > Xiao > > William Wong

RE: [VOTE][SPARK-25299] SPIP: Shuffle Storage API

2019-06-18 Thread Guo, Chenzhao
Cool : ) +1 (non-binding) Chenzhao From: dhruve ashar [mailto:dhruveas...@gmail.com] Sent: Wednesday, June 19, 2019 2:58 AM To: John Zhuge Cc: Vinoo Ganesh ; Felix Cheung ; Yinan Li ; rb...@netflix.com; Dongjoon Hyun ; Saisai Shao ; Imran Rashid ; Ilan Filonenko ; bo yang ; Matt Cheah ;

Re: Detect executor core count

2019-06-18 Thread Andrew Melo
On Tue, Jun 18, 2019 at 5:40 PM Steve Loughran wrote: > be aware that older java 8 versions count the #of cores in the host, not > those allocated for the container they run in > https://bugs.openjdk.java.net/browse/JDK-8140793 > > Ergh, that's good to know. I suppose, though, that in any case,

Re: Detect executor core count

2019-06-18 Thread Steve Loughran
Be aware that older Java 8 versions count the number of cores in the host, not those allocated to the container they run in: https://bugs.openjdk.java.net/browse/JDK-8140793 On Tue, Jun 18, 2019 at 8:13 PM Ilya Matiach wrote: > Hi Andrew, > > I tried to do something similar to that in the LightGBM >

RE: Detect executor core count

2019-06-18 Thread Ilya Matiach
Hi Andrew, I tried to do something similar in the LightGBM classifier/regressor/ranker in the mmlspark package. I try to use the Spark conf and, if it is not configured, I get the processors from the JVM directly:
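The fallback described above could look roughly like the following Java sketch. This is not the mmlspark code; the class name ExecutorCores and the plain Map standing in for the Spark configuration are assumptions made for illustration. It prefers a positive integer value of spark.executor.cores and otherwise asks the JVM. Note that on older Java 8 builds the JVM value may reflect the host's cores rather than the container's allocation (see JDK-8140793).

```java
import java.util.Map;

// Hypothetical helper: resolves the executor core count from a config map,
// falling back to the JVM's view of available processors.
final class ExecutorCores {
    private ExecutorCores() {}

    static int resolve(Map<String, String> conf) {
        String configured = conf.get("spark.executor.cores");
        if (configured != null) {
            try {
                int cores = Integer.parseInt(configured.trim());
                if (cores > 0) {
                    return cores; // explicit configuration wins
                }
            } catch (NumberFormatException e) {
                // Unparseable value: fall through to the JVM-reported count.
            }
        }
        // May report host cores inside a container on older Java 8 builds.
        return Runtime.getRuntime().availableProcessors();
    }
}
```

Inside a real executor, the map would be replaced by whatever configuration object is reachable from the reader; how to obtain it there is left open in this sketch.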

Re: [VOTE][SPARK-25299] SPIP: Shuffle Storage API

2019-06-18 Thread dhruve ashar
+1 (non-binding) On Tue, Jun 18, 2019 at 12:12 PM John Zhuge wrote: > +1 (non-binding) Great work! > > On Tue, Jun 18, 2019 at 6:22 AM Vinoo Ganesh wrote: > >> +1 (non-binding). >> >> >> >> Thanks for pushing this forward, Matt and Yifei. >> >> >> >> *From: *Felix Cheung >> *Date: *Tuesday,

Re: Filter cannot be pushed via a Join

2019-06-18 Thread Xiao Li
Hi, William, Thanks for reporting it. Could you open a JIRA? Cheers, Xiao On Tue, Jun 18, 2019 at 8:57 AM William Wong wrote: > BTW, I noticed a workaround is creating a custom rule to remove 'empty > local relation' from a union table. However, I am not 100% sure if it is > the right approach. > > On Tue,

Re: [VOTE][SPARK-25299] SPIP: Shuffle Storage API

2019-06-18 Thread John Zhuge
+1 (non-binding) Great work! On Tue, Jun 18, 2019 at 6:22 AM Vinoo Ganesh wrote: > +1 (non-binding). > > > > Thanks for pushing this forward, Matt and Yifei. > > > > *From: *Felix Cheung > *Date: *Tuesday, June 18, 2019 at 00:01 > *To: *Yinan Li , "rb...@netflix.com" < > rb...@netflix.com> >

Re: Filter cannot be pushed via a Join

2019-06-18 Thread William Wong
BTW, I noticed a workaround is creating a custom rule to remove 'empty local relation' from a union table. However, I am not 100% sure if it is the right approach. On Tue, Jun 18, 2019 at 11:53 PM William Wong wrote: > Dear all, > > I am not sure if it is something expected or not, and should I

Re: Filter cannot be pushed via a Join

2019-06-18 Thread William Wong
Dear all, I am not sure whether this is expected or whether I should report it as a bug. Basically, the constraints of a union table can become empty if any subtable is turned into an empty local relation. The side effect is that the filter cannot be inferred correctly (by

Re: [External Sender] Re: Spark 2.4.1 on Kubernetes - DNS resolution of driver fails

2019-06-18 Thread Jose Luis Pedrosa
Hi! I am assuming you’re running it in cluster mode. The Service should be created by the submit binary, in this file: org/apache/spark/deploy/k8s/submit/KubernetesClientApplication.scala Do you have any failure logs from where spark-submit was launched? JL From: "Prudhvi Chennuru (CONT)"

Detect executor core count

2019-06-18 Thread Andrew Melo
Hello, Is there a way to detect the number of cores allocated for an executor within a java-based InputPartitionReader? Thanks! Andrew

unsubscribe

2019-06-18 Thread Matteo Bovetti
Matteo Bovetti Big Data Engineer & DevOps Mobile: +39 333 290 1242 Email: matteo.bove...@agilelab.it Site: www.agilelab.it

Re: Spark 2.4.3 source download is a dead link

2019-06-18 Thread Sean Owen
Huh, I don't know how long that's been a bug, but the JS that creates the filename with .replace doesn't seem to have ever worked? https://github.com/apache/spark-website/pull/207 On Tue, Jun 18, 2019 at 4:07 AM Olivier Girardot wrote: > > Hi everyone, > FYI the spark source download link on

Re: [VOTE][SPARK-25299] SPIP: Shuffle Storage API

2019-06-18 Thread Vinoo Ganesh
+1 (non-binding). Thanks for pushing this forward, Matt and Yifei. From: Felix Cheung Date: Tuesday, June 18, 2019 at 00:01 To: Yinan Li , "rb...@netflix.com" Cc: Dongjoon Hyun , Saisai Shao , Imran Rashid , Ilan Filonenko , bo yang , Matt Cheah , Spark Dev List , "Yifei Huang (PD)" ,

Re: [External Sender] Re: Spark 2.4.1 on Kubernetes - DNS resolution of driver fails

2019-06-18 Thread Jose Luis Pedrosa
Hi guys, There’s also an interesting one that we found in a similar case. In our case the service IP ranges take more time to become reachable, so DNS was timing out. The approach that I was suggesting was: 1. Add retries in the connection from the executor to the driver:
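As an illustration of the retry idea in step 1, here is a minimal Java sketch that retries hostname resolution a fixed number of times with a pause between attempts. It is not taken from Spark or from the proposed change; DriverLookup, Resolver and resolveWithRetry are hypothetical names, and the resolver is injected so the behaviour can be exercised without a real cluster.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper: retries DNS resolution of the driver host.
final class DriverLookup {
    private DriverLookup() {}

    // Abstraction over the lookup, e.g. InetAddress::getByName in production.
    interface Resolver {
        InetAddress resolve(String host) throws UnknownHostException;
    }

    static InetAddress resolveWithRetry(String host, int maxAttempts,
                                        long delayMillis, Resolver resolver)
            throws UnknownHostException, InterruptedException {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        UnknownHostException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return resolver.resolve(host);
            } catch (UnknownHostException e) {
                last = e; // remember the failure and retry if attempts remain
                if (attempt < maxAttempts) {
                    Thread.sleep(delayMillis);
                }
            }
        }
        throw last; // every attempt failed
    }
}
```

A caller could pass InetAddress::getByName as the resolver; a fixed delay is used here for brevity, and a backoff schedule would be an equally reasonable choice.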

Spark 2.4.3 source download is a dead link

2019-06-18 Thread Olivier Girardot
Hi everyone, FYI the Spark source download link on spark.apache.org is dead: https://archive.apache.org/dist/spark/spark-2.4.3/spark-2.4.3-bin-sources.tgz Regards, -- *Olivier Girardot*

Re: [External Sender] Re: Spark 2.4.1 on Kubernetes - DNS resolution of driver fails

2019-06-18 Thread Olivier Girardot
Hi Prudhvi, Not really, but we took a drastic approach to mitigate this by modifying the bundled launch script to be more resilient. In kubernetes/dockerfiles/spark/entrypoint.sh, in the executor case, we added something like this: executor) DRIVER_HOST=$(echo $SPARK_DRIVER_URL | cut -d "@"