Re: [VOTE] SPARK 2.4.0 (RC1)

2018-09-20 Thread Jungtaek Lim
OK, got it. Thanks for clarifying. I can help check and modify the versions, but I'm not sure about the case where both versions are specified, like "2.4.0/3.0.0". Would removing 3.0.0 work in this case? On Fri, Sep 21, 2018 at 2:29 PM, Wenchen Fan wrote: > There is an issue in the merge script, when resolving a

Re: [VOTE] SPARK 2.4.0 (RC1)

2018-09-20 Thread Wenchen Fan
There is an issue in the merge script, when resolving a ticket, the default fixed version is 3.0.0. I guess someone forgot to type the fixed version, which led to this mistake. On Fri, Sep 21, 2018 at 1:15 PM Jungtaek Lim wrote: > Ah, these issues were resolved before branch-2.4 was cut, like

Re: [VOTE] SPARK 2.4.0 (RC1)

2018-09-20 Thread Jungtaek Lim
Ah, these issues were resolved before branch-2.4 was cut, like SPARK-24441 https://github.com/apache/spark/blob/v2.4.0-rc1/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/HDFSBackedStateStoreProvider.scala SPARK-24441 is included in Spark 2.4.0 RC1 but its fix version is set to 3.0.0. I heard

Re: [VOTE] SPARK 2.4.0 (RC1)

2018-09-20 Thread Holden Karau
So normally during the release process, if it's in branch-2.4 but not part of the current RC, we set the resolved version to 2.4.1, and then if we roll a new RC we switch the 2.4.1 issues to 2.4.0. On Thu, Sep 20, 2018 at 9:55 PM Jungtaek Lim wrote: > I also noticed there're some fixed issues which
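The versioning rule described in the message above can be sketched in a few lines of Python. This is only an illustration of the stated convention, not the actual merge script or any JIRA tooling; the function names `fix_version` and `retarget_for_new_rc` and the two constants are hypothetical.

```python
from typing import List

# Assumed for illustration: the release under vote and the next patch release.
RELEASE = "2.4.0"
NEXT_PATCH = "2.4.1"


def fix_version(in_current_rc: bool) -> str:
    """Fix version for an issue merged to branch-2.4 during the vote.

    Issues that are part of the current RC get the release version;
    issues merged after the RC was cut are parked on the next patch version.
    """
    return RELEASE if in_current_rc else NEXT_PATCH


def retarget_for_new_rc(fix_versions: List[str]) -> List[str]:
    """When a new RC is rolled, issues parked on 2.4.1 move to 2.4.0."""
    return [RELEASE if v == NEXT_PATCH else v for v in fix_versions]


if __name__ == "__main__":
    print(fix_version(in_current_rc=False))
    print(retarget_for_new_rc(["2.4.1", "3.0.0"]))
```

Note that this sketch leaves a stray "3.0.0" untouched; whether to drop it when an issue carries both versions is the open question raised elsewhere in the thread.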

Re: [VOTE] SPARK 2.4.0 (RC1)

2018-09-20 Thread Jungtaek Lim
I also noticed there're some fixed issues which are included in branch-2.4 but whose versions are still 3.0.0. Do we want to update their versions to 2.4.0? If we are not planning to run some automation to correct this, I'm happy to fix them. On Thu, Sep 20, 2018 at 9:22 PM, Weichen Xu wrote: > We need to

Re: [VOTE] SPARK 2.3.2 (RC6)

2018-09-20 Thread Ryan Blue
Changing my vote to +1 with this fixed. Here's what was going on -- and thanks to Owen O'Malley for debugging: The problem was that Iceberg contained a fix for a JVM bug in which timestamps before the Unix epoch were off by 1s. Owen moved this code into ORC as well and using the
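The message above is cut off before it explains the mechanism, so the following is only a generic illustration of one common way pre-epoch timestamps end up off by one second, not the actual Iceberg or ORC code. It assumes the timestamp is held as milliseconds since the epoch and contrasts division that rounds toward zero (the behavior of Java's `/` on integers) with floor division (the behavior of Java's `Math.floorDiv`). The function names are hypothetical.

```python
def seconds_truncating(millis: int) -> int:
    """Whole seconds, rounding toward zero (like Java's integer `/`)."""
    quotient = abs(millis) // 1000
    return quotient if millis >= 0 else -quotient


def seconds_flooring(millis: int) -> int:
    """Whole seconds, rounding toward negative infinity (like Math.floorDiv)."""
    return millis // 1000


if __name__ == "__main__":
    before_epoch = -1500  # 1.5 seconds before 1970-01-01T00:00:00Z
    print(seconds_truncating(before_epoch), seconds_flooring(before_epoch))
```

For non-negative inputs the two functions agree; for negative inputs that are not an exact multiple of 1000 they differ by exactly one second, which is why a fix applied in one library but not the other can surface as a one-second mismatch in tests.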

2.4.0 Blockers, Critical, etc

2018-09-20 Thread Sean Owen
Because we're into 2.4 release candidates, I thought I'd look at what's still open and targeted at 2.4.0. I presume the Blockers are the usual umbrellas that don't themselves block anything, but, confirming, there is nothing left to do there? I think that's mostly a question for Joseph and

Re: [VOTE] SPARK 2.3.2 (RC6)

2018-09-20 Thread Dongjoon Hyun
Hi, Ryan. Could you share the result on 2.3.1 since this is 2.3.2 RC? That would be helpful to narrow down the scope. Bests, Dongjoon. On Thu, Sep 20, 2018 at 11:56 Ryan Blue wrote: > -0 > > My DataSourceV2 implementation for Iceberg is failing ORC tests when I run > with the 2.3.2 RC that

Re: [VOTE] SPARK 2.3.2 (RC6)

2018-09-20 Thread Ryan Blue
-0 My DataSourceV2 implementation for Iceberg is failing ORC tests when I run with the 2.3.2 RC that pass when I run with 2.3.0. I'm tracking down the cause and will report back, but I'm -0 on the release because there may be a behavior change. On Thu, Sep 20, 2018 at 10:37 AM Denny Lee wrote:

Re: [VOTE] SPARK 2.3.2 (RC6)

2018-09-20 Thread Denny Lee
+1 On Thu, Sep 20, 2018 at 9:55 AM Xiao Li wrote: > +1 > > > On Wed, Sep 19, 2018 at 1:17 PM, John Zhuge wrote: > >> +1 (non-binding) >> >> Built on Ubuntu 16.04 with Maven flags: -Phadoop-2.7 -Pmesos -Pyarn >> -Phive-thriftserver -Psparkr -Pkinesis-asl -Phadoop-provided >> >> java version "1.8.0_181" >>

Re: [VOTE] SPARK 2.3.2 (RC6)

2018-09-20 Thread Xiao Li
+1 On Wed, Sep 19, 2018 at 1:17 PM, John Zhuge wrote: > +1 (non-binding) > > Built on Ubuntu 16.04 with Maven flags: -Phadoop-2.7 -Pmesos -Pyarn > -Phive-thriftserver -Psparkr -Pkinesis-asl -Phadoop-provided > > java version "1.8.0_181" > Java(TM) SE Runtime Environment (build 1.8.0_181-b13) > Java

unsubscribe

2018-09-20 Thread Praveen Srivastava
unsubscribe Praveen Srivastava praveen.s.srivast...@oracle.com

unsubscribe

2018-09-20 Thread Ryan Adams
unsubscribe Ryan Adams radams...@gmail.com

Re: [Discuss] Datasource v2 support for manipulating partitions

2018-09-20 Thread Thakrar, Jayesh
Here’s what can be done in PostgreSQL. You can create a partitioned table with a partition clause, e.g. CREATE TABLE measurement (.) PARTITION BY RANGE (logdate). You can create a partitioned table by creating tables as partitions of a partitioned table, e.g. CREATE TABLE measurement_y2006m02

Checkpointing clarifications

2018-09-20 Thread Alessandro Liparoti
Good morning, I have a large-scale job that breaks for certain input sizes, so I am trying to play with checkpointing to split the DAG and find the problematic point. I have some questions about checkpointing: 1. What is the utility of non-eager checkpointing? 2. How checkpointing

Re: [VOTE] SPARK 2.4.0 (RC1)

2018-09-20 Thread Weichen Xu
We need to merge this: https://github.com/apache/spark/pull/22492 Otherwise MLeap cannot build against Spark 2.4.0. Thanks! On Wed, Sep 19, 2018 at 1:16 PM Yinan Li wrote: > FYI: SPARK-23200 has been resolved. > > On Tue, Sep 18, 2018 at 8:49 AM Felix Cheung > wrote: > >> If we could work on

Re: [Feedback Requested] SPARK-25299: Using Distributed Storage for Persisting Shuffle Data

2018-09-20 Thread Felix Cheung
Hi +baibing3 +huangtao6 Came across your presentation on Alluxio - including shuffling - would you be interested in this? From: Matt Cheah Sent: Tuesday, September 4, 2018 2:54 PM To: Yuanjian Li Cc: Spark dev list Subject: Re: [Feedback Requested]

Re: [DISCUSS] PySpark Window UDF

2018-09-20 Thread Felix Cheung
Definitely! numba numbers are amazing From: Wes McKinney Sent: Saturday, September 8, 2018 7:46 AM To: Li Jin Cc: dev@spark.apache.org Subject: Re: [DISCUSS] PySpark Window UDF hi Li, These results are very cool. I'm excited to see you continuing to push this