Re: Persisting driver logs in yarn client mode (SPARK-25118)

2018-08-27 Thread Henry Robinson
On Mon, 27 Aug 2018 at 13:04, Ankur Gupta wrote: > Thanks all for your responses. > > So I believe a solution that accomplishes the following will be a good > solution: > > 1. Writes logs to HDFS asynchronously > In the limit, this could perform just as slowly at shutdown time as synchronous log

Re: [VOTE] SPIP: Standardize SQL logical plans

2018-07-18 Thread Henry Robinson
+1 (non-binding) On Wed, Jul 18, 2018 at 9:12 AM Reynold Xin wrote: > +1 on this, on the condition that we can come up with a design that will > remove the existing plans. > > > On Tue, Jul 17, 2018 at 11:00 AM Ryan Blue wrote: > >> Hi everyone, >> >> From discussion on the proposal doc and the

Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-04 Thread Henry Robinson
+1 (non-binding) On 4 June 2018 at 11:15, Bryan Cutler wrote: > +1 > > On Mon, Jun 4, 2018 at 10:18 AM, Joseph Bradley > wrote: > >> +1 >> >> On Mon, Jun 4, 2018 at 10:16 AM, Mark Hamstra >> wrote: >> >>> +1 >>> >>> On Fri, Jun 1, 2018 at 3:29 PM Marcelo Vanzin >>> wrote: >>> Please vote

Re: [VOTE] [SPARK-24374] SPIP: Support Barrier Scheduling in Apache Spark

2018-06-04 Thread Henry Robinson
+1 (I hope there will be a fuller design document to review, since the SPIP is really light on details). On 4 June 2018 at 10:17, Joseph Bradley wrote: > +1 > > On Sun, Jun 3, 2018 at 9:59 AM, Weichen Xu > wrote: > >> +1 >> >> On Fri, Jun 1, 2018 at 3:41 PM, Xiao Li wrote: >> >>> +1 >>> >>> 2

Re: Time for 2.3.1?

2018-05-11 Thread Henry Robinson
https://github.com/apache/spark/pull/21302 On 11 May 2018 at 11:47, Henry Robinson wrote: > I was planning to do so shortly. > > Henry > > On 11 May 2018 at 11:45, Ryan Blue wrote: > >> The Parquet Java 1.8.3 release is out. Has anyone started a PR to update, >>

Re: Time for 2.3.1?

2018-05-11 Thread Henry Robinson
dd SPARK-24067 today assuming there's no >> objections >> >> On Thu, May 10, 2018 at 1:22 PM, Henry Robinson wrote: >> > +1, I'd like to get a release out with SPARK-23852 fixed. The Parquet >> > community are about to release 1.8.3 - the voting perio

Re: Time for 2.3.1?

2018-05-10 Thread Henry Robinson
+1, I'd like to get a release out with SPARK-23852 fixed. The Parquet community are about to release 1.8.3 - the voting period closes tomorrow - and I've tested it with Spark 2.3 and confirmed the bug is fixed. Hopefully it is released and I can post the version change to branch-2.3 before you star

Re: Maintenance releases for SPARK-23852?

2018-04-16 Thread Henry Robinson
2018 at 1:23 PM, Reynold Xin wrote: > >> Seems like this would make sense... we usually make maintenance releases >> for bug fixes after a month anyway. >> >> >> On Wed, Apr 11, 2018 at 12:52 PM, Henry Robinson >> wrote: >> >>> >>>

Re: Maintenance releases for SPARK-23852?

2018-04-11 Thread Henry Robinson
I don't know about parquet-cpp, but yeah, the only implementation I've seen writing the half-completed stats is Impala. (as you know, that's compliant with the spec, just an unusual choice). > > On Wed, Apr 11, 2018 at 12:35 PM, Henry Robinson wrote: > >> Hi all - >>

Maintenance releases for SPARK-23852?

2018-04-11 Thread Henry Robinson
Hi all - SPARK-23852 (where a query can silently give wrong results thanks to a predicate pushdown bug in Parquet) is a fairly bad bug. In other projects I've been involved with, we've released maintenance releases for bugs of this severity. Since Spark 2.4.0 is probably a while away, I wanted to

Re: [VOTE] Spark 2.2.1 (RC2)

2017-11-28 Thread Henry Robinson
(My vote is non-binding, of course). On 28 November 2017 at 14:53, Henry Robinson wrote: > +1, tests all pass for me on Ubuntu 16.04. > > On 28 November 2017 at 10:36, Herman van Hövell tot Westerflier < > hvanhov...@databricks.com> wrote: > >> +1 >> >>

Re: [VOTE] Spark 2.2.1 (RC2)

2017-11-28 Thread Henry Robinson
+1, tests all pass for me on Ubuntu 16.04. On 28 November 2017 at 10:36, Herman van Hövell tot Westerflier < hvanhov...@databricks.com> wrote: > +1 > > On Tue, Nov 28, 2017 at 7:35 PM, Felix Cheung > wrote: > >> +1 >> >> Thanks Sean. Please vote! >> >> Tested various scenarios with R package. Ub

SPARK-22211: Removing an incorrect FOJ optimization

2017-11-01 Thread Henry Robinson
Hi - I'm digging into some Spark SQL tickets, and wanted to ask a procedural question about SPARK-22211 and optimizer changes in general. To summarise the JIRA, Catalyst appears to be incorrectly pushing a limit down below a FULL OUTER JOIN, risking possibly incorrect results. I don't believe the
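A toy illustration of why a limit below a FULL OUTER JOIN is risky, assuming the limit is applied to one join input before the join runs. This is a plain-Python sketch, not Catalyst code: full_outer_join and the sample rows are hypothetical names and data chosen for illustration only.

```python
def full_outer_join(left, right):
    """Full outer join of two lists of (key, value) pairs on key.

    Hypothetical helper for illustration; unmatched sides are padded with None.
    """
    rows = []
    matched_right = set()
    for lk, lv in left:
        hit = False
        for i, (rk, rv) in enumerate(right):
            if lk == rk:
                rows.append((lk, lv, rv))
                matched_right.add(i)
                hit = True
        if not hit:
            # Left row with no partner on the right.
            rows.append((lk, lv, None))
    for i, (rk, rv) in enumerate(right):
        if i not in matched_right:
            # Right row with no partner on the left.
            rows.append((rk, None, rv))
    return rows


left = [(1, "a"), (2, "b")]
right = [(1, "x"), (2, "y")]

# Reference semantics: join first, then take the first 2 rows.
correct = full_outer_join(left, right)
limit_after = correct[:2]

# Assumed pushdown: limit the left input to 1 row, join, then limit again.
pushed = full_outer_join(left[:1], right)[:2]

# Rows produced by the pushed-down plan that the true join never contains.
spurious = [row for row in pushed if row not in correct]
print(spurious)
```

In this sketch the right-hand row with key 2 loses its partner once the left input is truncated, so the pushed-down plan emits a NULL-padded row that does not exist in the unlimited join result. That goes beyond choosing a different subset of rows, which a limit is allowed to do; the row itself is wrong.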