Re: [VOTE] SPARK 2.4.0 (RC4)

2018-10-24 Thread Dongjoon Hyun
the bug > in `Dataset.collect` as I mentioned above. > > I think map_filter is implemented correctly. map(1,2,1,3) is actually > map(1,2) according to the "earlier entry wins" semantic. I don't think > this will change in 2.4.1. > > On Thu, Oct 25, 2018 at 8:56 AM

Re: [VOTE] SPARK 2.4.0 (RC4)

2018-10-24 Thread Dongjoon Hyun
lect map(1,2,1,3)").collect > res14: Array[org.apache.spark.sql.Row] = Array([Map(1 -> 3)]) > Same bug happens at `collect`. No ticket yet. > > I'll create tickets and list all of them as known issues in 2.4.0. > > It's arguable if the "earlier entry wins" semantic is reasonable.
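
The two behaviours discussed in this thread ("earlier entry wins" versus the `Map(1 -> 3)` result returned by `collect`) can be illustrated outside Spark. This is a minimal Python sketch, not Spark code: `build_map` and its `policy` argument are hypothetical names introduced only for illustration.

```python
def build_map(flat, policy="first"):
    """Build a dict from a flat [k1, v1, k2, v2, ...] list.

    policy="first": an earlier entry wins; later duplicates are ignored.
    policy="last":  a later entry overwrites an earlier one.
    (Both policy names are assumptions made for this sketch.)
    """
    if policy not in ("first", "last"):
        raise ValueError("policy must be 'first' or 'last'")
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of elements")

    result = {}
    # Pair up keys (even positions) with values (odd positions).
    for key, value in zip(flat[0::2], flat[1::2]):
        if policy == "first" and key in result:
            continue  # keep the value that was inserted first
        result[key] = value
    return result


first_wins = build_map([1, 2, 1, 3], policy="first")
last_wins = build_map([1, 2, 1, 3], policy="last")
```

With the input from the thread, the "first" policy keeps the pair 1 -> 2, while the "last" policy keeps 1 -> 3. That is the mismatch being reported.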

Re: [VOTE] SPARK 2.4.0 (RC4)

2018-10-24 Thread Dongjoon Hyun
Hi, All. -0 due to the following issue. From Spark 2.4.0, users may get an incorrect result when they use new `map_filter` with `map_concat` functions. https://issues.apache.org/jira/browse/SPARK-25823 SPARK-25823 is only aiming to fix the data correctness issue from `map_filter`. PMC members

Re: [VOTE] SPARK 2.4.0 (RC5)

2018-10-31 Thread Dongjoon Hyun
+1 Cheers, Dongjoon.

Re: Drop support for old Hive in Spark 3.0?

2018-10-26 Thread Dongjoon Hyun
Hi, Sean and All. For the first question, we support only Hive Metastore from 1.x ~ 2.x. And, we can support Hive Metastore 3.0 simultaneously. Spark is designed like that. I don't think we need to drop old Hive Metastore Support. Is it for avoiding Hive Metastore sharing between Spark2 and

Re: [VOTE] SPARK 2.4.0 (RC4)

2018-10-25 Thread Dongjoon Hyun
s, Dongjoon. PS. Also, there is a PR to completely remove them, too. https://github.com/cloud-fan/spark/pull/11 On Wed, Oct 24, 2018 at 10:14 PM Xiao Li wrote: > @Dongjoon Hyun Thanks! This is a blocking > ticket. It returns a wrong result due to our undefined behavior. I ag

Re: DataSourceV2 hangouts sync

2018-10-25 Thread Dongjoon Hyun
+1. Thank you for volunteering, Ryan! Bests, Dongjoon. On Thu, Oct 25, 2018 at 4:19 PM Xiao Li wrote: > +1 > > On Thu, Oct 25, 2018 at 4:16 PM, Reynold Xin wrote: > >> +1 >> >> >> >> On Thu, Oct 25, 2018 at 4:12 PM Li Jin wrote: >> >>> Although I am not specifically involved in DSv2, I think having this

Re: [VOTE] SPARK 2.4.0 (RC4)

2018-10-23 Thread Dongjoon Hyun
Ur, Wenchen. Source distribution seems to fail by default. https://dist.apache.org/repos/dist/dev/spark/v2.4.0-rc4-bin/spark-2.4.0.tgz $ dev/make-distribution.sh -Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive -Phive-thriftserver ... + cp /spark-2.4.0/LICENSE-binary /spark-2.4.0/dist/LICENSE cp:

Re: Make Scala 2.12 as default Scala version in Spark 3.0

2018-11-06 Thread Dongjoon Hyun
+1 for making Scala 2.12 as default for Spark 3.0. Bests, Dongjoon. On Tue, Nov 6, 2018 at 11:13 AM DB Tsai wrote: > We made Scala 2.11 as default Scala version in Spark 2.0. Now, the next > Spark version will be 3.0, so it's a great time to discuss should we make > Scala 2.12 as default

Re: [ANNOUNCE] Announcing Apache Spark 2.4.0

2018-11-08 Thread Dongjoon Hyun
Finally, thank you all. Especially, thanks to the release manager, Wenchen! Bests, Dongjoon. On Thu, Nov 8, 2018 at 11:24 AM Wenchen Fan wrote: > + user list > > On Fri, Nov 9, 2018 at 2:20 AM Wenchen Fan wrote: > >> resend >> >> On Thu, Nov 8, 2018 at 11:02 PM Wenchen Fan wrote: >> >>> >>>

Re: [CRAN-pretest-archived] CRAN submission SparkR 2.4.0

2018-11-05 Thread Dongjoon Hyun
I'm wondering if we should change the order of publishing next time. Although it's not announced, we already have uploaded artifacts for (1), (2), (3). 1. Download: https://www-us.apache.org/dist/spark/spark-2.4.0/ 2. Maven:

Re: [VOTE] SPARK 2.4.0 (RC3)

2018-10-10 Thread Dongjoon Hyun
For now, you can see generated release notes. Official one will be posted on the website when the official 2.4.0 is out. https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12315420&version=12342385 Bests, Dongjoon. On Wed, Oct 10, 2018 at 11:29 AM Jean Georges Perrin wrote: > Hi, > >

Re: [VOTE] SPARK 2.3.2 (RC6)

2018-09-20 Thread Dongjoon Hyun
>>> On Wed, Sep 19, 2018 at 2:31 AM Takeshi Yamamuro >>>> wrote: >>>> >>>>> +1 >>>>> >>>>> I also checked `-Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive >>>>> -Phive-thriftserver` on the openjdk below/ma

Re: [VOTE] SPARK 2.4.0 (RC2)

2018-09-28 Thread Dongjoon Hyun
Hi, Wenchen. The current issue link seems to be out of order for me. The list of bug fixes going into 2.4.0 can be found at the following URL: https://issues.apache.org/jira/projects/SPARK/versions/2.4.0 Could you send out with the following issue link for next RCs?

Apache Spark 2.2.3 ?

2019-01-01 Thread Dongjoon Hyun
Hi, All. Apache Spark community has a policy maintaining the feature branch for 18 months. I think it's time for the 2.2.3 release since 2.2.0 was released in July 2017. http://spark.apache.org/versioning-policy.html After 2.2.2 (July 2018), `branch-2.2` has 40 patches (including
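
The timing argument in this message is simple month arithmetic: 18 months after a July 2017 release falls in January 2019, which is when the thread was started. This is a small Python sketch of that calculation; `add_months` is a hypothetical helper, and the day of the month is fixed to 1 because the message only gives month-level dates.

```python
from datetime import date


def add_months(start, months):
    """Return the first day of the month that is `months` after `start`."""
    # Count months since year 0 so that year rollover is handled uniformly.
    total = start.year * 12 + (start.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


# Assumption: the release date is represented at month granularity only.
release_2_2_0 = date(2017, 7, 1)
maintenance_months = 18  # the policy length quoted in the message

end_of_maintenance = add_months(release_2_2_0, maintenance_months)
print(end_of_maintenance.isoformat())
```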

Re: Apache Spark 2.2.3 ?

2019-01-02 Thread Dongjoon Hyun
nce 2.3.2... (Sept 2018) >>> >>> And 2 months since 2.4.0 (Nov 2018) - does the community feel 2.4 branch >>> is stabilizing? >>> >>> >>> -- >>> *From:* Sean Owen >>> *Sent:* Tuesday, January 1, 2019 8:30

Re: Spark Packaging Jenkins

2019-01-05 Thread Dongjoon Hyun
y push in to early next week... these builds were set up before >> my time, and i'm currently unraveling how they all work before pushing a >> commit to fix stuff. >> >> nothing like some code archaeology to make my friday more exciting! :) >> >> shane >> >> On F

Re: Spark Packaging Jenkins

2019-01-05 Thread Dongjoon Hyun
er. > > > > On Sun, Jan 6, 2019 at 6:34 AM Dongjoon Hyun > wrote: > >> Hi, All. >> >> It turns out that `gpg signing` is the next hurdle in Spark Packaging >> Jenkins. >> Since 2.4.0 release, is there something changed in our Jenkins machine? >>

Re: Spark Packaging Jenkins

2019-01-06 Thread Dongjoon Hyun
:42 AM Felix Cheung wrote: > Awesome Shane! > > > -- > *From:* shane knapp > *Sent:* Sunday, January 6, 2019 11:38 AM > *To:* Felix Cheung > *Cc:* Dongjoon Hyun; Wenchen Fan; dev > *Subject:* Re: Spark Packaging Jenkins > > no

[ANNOUNCE] Announcing Apache Spark 2.2.3

2019-01-14 Thread Dongjoon Hyun
We are happy to announce the availability of Spark 2.2.3! Apache Spark 2.2.3 is a maintenance release, based on the branch-2.2 maintenance branch of Spark. We strongly recommend all 2.2.x users to upgrade to this stable release. To download Spark 2.2.3, head over to the download page:

Removing old HiveMetastore(0.12~0.14) from Spark 3.0.0?

2019-01-22 Thread Dongjoon Hyun
Hi, All. Currently, Apache Spark supports Hive Metastore(HMS) 0.12 ~ 2.3. Among them, HMS 0.x releases look very old since we are in 2019. If these are not used in the production any more, can we drop HMS 0.x supports in 3.0.0? hive-0.12.0 2013-10-10 hive-0.13.0

Re: Removing old HiveMetastore(0.12~0.14) from Spark 3.0.0?

2019-01-23 Thread Dongjoon Hyun
very easy to maintain. >> >> On Jan 22, 2019, at 11:13 PM, Hyukjin Kwon wrote: >> >> Yea, I was thinking about that too. They are too old to keep. +1 for >> removing them out. >> >> On Wed, Jan 23, 2019 at 11:30 AM, Dongjoon Hyun wrote: >> >>> Hi

Re: GitHub sync

2018-12-11 Thread Dongjoon Hyun
https://issues.apache.org/jira/browse/INFRA-17401 is filed. Dongjoon. On Tue, Dec 11, 2018 at 12:49 PM Dongjoon Hyun wrote: > Hi, All. > > Currently, GitHub `spark:branch-2.4` is out of sync (with two commits). > > > https://gitbox.apache.org/repos/asf?p=spark.git;a=sho

Re: GitHub sync

2018-12-11 Thread Dongjoon Hyun
Now, it's recovered. Dongjoon. On Tue, Dec 11, 2018 at 2:15 PM Dongjoon Hyun wrote: > https://issues.apache.org/jira/browse/INFRA-17401 is filed. > > Dongjoon. > > On Tue, Dec 11, 2018 at 12:49 PM Dongjoon Hyun > wrote: > >> Hi, All. >> >> Currently, G

GitHub sync

2018-12-11 Thread Dongjoon Hyun
Hi, All. Currently, GitHub `spark:branch-2.4` is out of sync (with two commits). https://gitbox.apache.org/repos/asf?p=spark.git;a=shortlog;h=refs/heads/branch-2.4 https://github.com/apache/spark/commits/branch-2.4 I did the followings already. 1. Wait for the next commit. 2. Trigger

Re: [NOTICE] Mandatory relocation of Apache git repositories on git-wip-us.apache.org

2018-12-07 Thread Dongjoon Hyun
+1 for moving to new git infra. Bests, Dongjoon. On Fri, Dec 7, 2018 at 12:53 PM shane knapp wrote: > no concerns, and this sounds great to me! > > On Fri, Dec 7, 2018 at 12:41 PM Sean Owen wrote: > >> See below: Apache projects are migrating to a new git infrastructure, >> and are seeking

Re: Double pass over ORC data files even after supplying schema and setting inferSchema = false

2018-11-21 Thread Dongjoon Hyun
Hi, Thakrar. Which version are you using now? If it's below Spark 2.4.0, please try to use 2.4.0. There was an improvement related to that. https://issues.apache.org/jira/browse/SPARK-25126 Bests, Dongjoon. On Wed, Nov 21, 2018 at 6:17 AM Thakrar, Jayesh < jthak...@conversantmedia.com>

[VOTE] SPARK 2.2.3 (RC1)

2019-01-08 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 2.2.3. The vote is open until January 11 11:30AM (PST) and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 2.2.3 [ ] -1 Do not release this package

Re: Apache Spark 2.2.3 ?

2019-01-08 Thread Dongjoon Hyun
Great! Thank you, Takeshi! :D Bests, Dongjoon. On Tue, Jan 8, 2019 at 8:47 PM Takeshi Yamamuro wrote: > If there is no other volunteer for the release of 2.3.3, I'd like to. > > best, > takeshi > > On Fri, Jan 4, 2019 at 11:49 AM Dongjoon Hyun > wrote: > >>

Re: [VOTE] SPARK 2.2.3 (RC1)

2019-01-09 Thread Dongjoon Hyun
ect Answer - 1 == == Spark Answer - 0 == > !struct<_1:int,_2:int> struct<> > ![10,5] > > > > On Tue, Jan 8, 2019 at 1:14 PM Dongjoon Hyun > wrote: > > > > Please vote on releasing the following candidate as Apache Spark version > 2.2.3. > >

Re: [VOTE] SPARK 2.2.3 (RC1)

2019-01-09 Thread Dongjoon Hyun
and it built correctly, at least. > > On Wed, Jan 9, 2019 at 5:09 PM Dongjoon Hyun > wrote: > > > > Hi, Sean. > > > > It looks strange. I didn't hit them. I'm not sure but it looks like some > flakiness at 2.2.x era. > > For me, those test passes. (I ran twi

Re: [VOTE] SPARK 2.2.3 (RC1)

2019-01-10 Thread Dongjoon Hyun
Hi, Takeshi. Yep. It's not a release blocker. We don't need that as Sean mentioned already. Since you are the release manager of 2.3.3, you may include that in the scope of Spark 2.3.3 before it starts. Bests, Dongjoon. On Thu, Jan 10, 2019 at 5:44 AM Sean Owen wrote: > Is that the right

Spark Packaging Jenkins

2019-01-04 Thread Dongjoon Hyun
Hi, All As a part of release process, we need to check Packaging/Compile/Test Jenkins status. http://spark.apache.org/release-process.html 1. Spark Packaging: https://amplab.cs.berkeley.edu/jenkins/view/Spark%20Packaging/ 2. Spark QA Compile:

Re: Apache Spark 2.2.3 ?

2019-01-03 Thread Dongjoon Hyun
>> +1 on 2.2.3 of course >> >> >> ------ >> *From:* Dongjoon Hyun >> *Sent:* Wednesday, January 2, 2019 12:21 PM >> *To:* Saisai Shao >> *Cc:* Xiao Li; Felix Cheung; Sean Owen; dev >> *Subject:* Re: Apache Spark 2.2.3 ? >>

Re: Spark Packaging Jenkins

2019-01-04 Thread Dongjoon Hyun
Thank you, Shane! Bests, Dongjoon. On Fri, Jan 4, 2019 at 10:50 AM shane knapp wrote: > yeah, i'll get on that today. thanks for the heads up. > > On Fri, Jan 4, 2019 at 10:46 AM Dongjoon Hyun > wrote: > >> Hi, All >> >> As a part of release process, we nee

[VOTE][RESULT] Spark 2.2.3 (RC1)

2019-01-11 Thread Dongjoon Hyun
Hi, All. The vote passes. Thanks to all who helped with this release 2.2.3 (the final 2.2.x)! I'll follow up later with a release announcement once everything is published. +1 (* = binding): DB Tsai* Wenchen Fan* Dongjoon Hyun Denny Lee Sean Owen* Hyukjin Kwon John Zhuge +0: None -1: None
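
The pass condition quoted in the [VOTE] mails ("a majority +1 PMC votes are cast, with a minimum of 3 +1 votes") can be checked against the tally above. This is a hedged Python sketch: `vote_passes` is a hypothetical helper, and it assumes that "majority" means more binding +1 than binding -1 votes and that only the names marked `*` count as binding.

```python
def vote_passes(binding_plus_one, binding_minus_one, minimum_plus_one=3):
    """Return True if a release vote passes under the assumed reading.

    Assumption: a vote passes when there are at least `minimum_plus_one`
    binding +1 votes and strictly more binding +1 than binding -1 votes.
    """
    return (binding_plus_one >= minimum_plus_one
            and binding_plus_one > binding_minus_one)


# +1 voters as listed in the result mail; True marks a binding (*) vote.
plus_one_votes = [
    ("DB Tsai", True),
    ("Wenchen Fan", True),
    ("Dongjoon Hyun", False),
    ("Denny Lee", False),
    ("Sean Owen", True),
    ("Hyukjin Kwon", False),
    ("John Zhuge", False),
]

binding_plus = sum(1 for _, binding in plus_one_votes if binding)
binding_minus = 0  # the mail lists "-1: None"

result = vote_passes(binding_plus, binding_minus)
```

Under these assumptions the 2.2.3 (RC1) tally has three binding +1 votes and no -1 votes, which is consistent with the "vote passes" statement in the mail.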

Re: Apache Spark 2.2.3 ?

2019-01-12 Thread Dongjoon Hyun
t all the tests passed in branch-2.3 and > # there is no problem by the release scripts with dry-run. > > If there is any problem, please ping me. > > Best, > Takeshi > > > On Wed, Jan 9, 2019 at 3:16 PM Xiao Li wrote: > >> Thank you, Takeshi! >> >>

Re: Clean out https://dist.apache.org/repos/dist/dev/spark/ ?

2019-01-12 Thread Dongjoon Hyun
+1 for removing old docs there. It seems that we need to upgrade our build script to maintain only one published snapshot doc. Bests, Dongjoon. On Sat, Jan 12, 2019 at 2:18 PM Sean Owen wrote: > I'm not sure it matters a whole lot, but we are encouraged to keep > dist.apache.org free of old

Re: [VOTE] SPARK 2.2.3 (RC1)

2019-01-09 Thread Dongjoon Hyun
tps://www.dbtsai.com >> PGP Key ID: 0x5CED8B896A6BDFA0 >> >> On Tue, Jan 8, 2019 at 11:14 AM Dongjoon Hyun >> wrote: >> > >> > Please vote on releasing the following candidate as Apache Spark >> version 2.2.3. >> > >> > The vote is

Re: Apache Spark 2.2.3 ?

2019-01-03 Thread Dongjoon Hyun
this year. > > On Thu, Jan 3, 2019 at 2:31 PM Dongjoon Hyun > wrote: > >> Thank you for additional support for 2.2.3, Felix and Takeshi! >> >> >> The following is the update for Apache Spark 2.2.3 release. >> >> For correctness issues, two more pa

Re: Metastore problem on Spark2.3 with Hive3.0

2018-09-17 Thread Dongjoon Hyun
Hi, Jerry. There is a JIRA issue for that, https://issues.apache.org/jira/browse/SPARK-24360 . So far, it's in progress for Hive 3.1.0 Metastore for Apache Spark 2.5.0. You can track that issue there. Bests, Dongjoon. On Mon, Sep 17, 2018 at 7:01 PM 白也诗无敌 <445484...@qq.com> wrote: > Hi, guys

Re: [VOTE] SPARK 2.3.2 (RC6)

2018-09-18 Thread Dongjoon Hyun
+1. I tested with `-Pyarn -Phadoop-2.7 -Pkinesis-asl -Phive -Phive-thriftserver` on OpenJDK(1.8.0_181)/CentOS 7.5. I hit the following test case failure once during testing, but it's not persistent. KafkaContinuousSourceSuite ... subscribing topic by name from earliest offsets

Re: [VOTE] SPIP: Identifiers for multi-catalog Spark

2019-02-18 Thread Dongjoon Hyun
+1 Dongjoon. On 2019/02/19 04:12:23, Wenchen Fan wrote: > +1 > > On Tue, Feb 19, 2019 at 10:50 AM Ryan Blue > wrote: > > > Hi everyone, > > > > It looks like there is consensus on the proposal, so I'd like to start a > > vote thread on the SPIP for identifiers in multi-catalog Spark. > > >

Re: [VOTE] Release Apache Spark 2.3.3 (RC2)

2019-02-07 Thread Dongjoon Hyun
+1 for 2.3.3 RC2. Thank you, Takeshi. And, +1 for 2.3.4 as 2.3.x EOL release. Cheers, Dongjoon. On Thu, Feb 7, 2019 at 6:48 AM Sean Owen wrote: > It wouldn't be wasted effort, as there is probably going to be a 2.3.4 > release before 2.3.x is EOL. At least, having reliable tests on > Jenkins

Re: Time to cut an Apache 2.4.1 release?

2019-02-11 Thread Dongjoon Hyun
Thank you, DB. +1, Yes. It's time for preparing 2.4.1 release. Bests, Dongjoon. On 2019/02/12 03:16:05, Sean Owen wrote: > I support a 2.4.1 release now, yes. > > SPARK-23539 is a non-trivial improvement, so probably would not be > back-ported to 2.4.x.SPARK-26154 does look like a bug whose

Re: Welcome Jose Torres as a Spark committer

2019-01-29 Thread Dongjoon Hyun
Congrats, Jose! :) Bests, Dongjoon. On Tue, Jan 29, 2019 at 11:41 AM Arun Mahadevan wrote: > Congrats Jose! Well deserved. > > On Tue, 29 Jan 2019 at 11:15, Jules Damji wrote: > >> Congrats Jose! >> >> Sent from my iPhone >> Pardon the dumb thumb typos :) >> >> On Jan 29, 2019, at 11:07 AM,

Re: [VOTE] [SPARK-25994] SPIP: DataFrame-based Property Graphs, Cypher Queries, and Algorithms

2019-01-29 Thread Dongjoon Hyun
Hi, Xiangrui Meng. +1 for the proposal. However, please update the following section for this vote. As we see, it seems to be inaccurate because today is Jan. 29th. (Almost February). (Since I cannot comment on the SPIP, I replied here.) Q7. How long will it take? - If accepted by the

Re: [VOTE][SPARK-27396] SPIP: Public APIs for extended Columnar Processing Support

2019-05-25 Thread Dongjoon Hyun
+1 Thanks, Dongjoon. On Fri, May 24, 2019 at 17:03 DB Tsai wrote: > +1 on exposing the APIs for columnar processing support. > > I understand that the scope of this SPIP doesn't cover AI / ML > use-cases. But I saw a good performance gain when I converted data > from rows to columns to

Re: [DISCUSS] Increasing minimum supported version of Pandas

2019-06-14 Thread Dongjoon Hyun
+1 Thank you for this effort, Bryan! Bests, Dongjoon. On Fri, Jun 14, 2019 at 4:24 AM Holden Karau wrote: > I’m +1 for upgrading, although since this is probably the last easy chance > we’ll have to bump version numbers easily I’d suggest 0.24.2 > > > On Fri, Jun 14, 2019 at 4:38 AM Hyukjin

Re: [build system] upcoming jenkins downtime: august 3rd 2019

2019-06-14 Thread Dongjoon Hyun
Thank you for the early notice, Shane! :) Dongjoon On Fri, Jun 14, 2019 at 9:13 AM shane knapp wrote: > the campus colo will be performing some electrical maintenance, which > means that they'll be powering off the entire building. > > since the jenkins cluster is located in that colo, we are

Re: Exposing JIRA issue types at GitHub PRs

2019-06-14 Thread Dongjoon Hyun
Now, you can see the exposed component labels (ordered by the number of PRs) here and click the component to search. https://github.com/apache/spark/labels?sort=count-desc Dongjoon. On Fri, Jun 14, 2019 at 1:15 AM Dongjoon Hyun wrote: > Hi, All. > > JIRA and PR is ready fo

Re: Exposing JIRA issue types at GitHub PRs

2019-06-22 Thread Dongjoon Hyun
community doesn't allow the bot has committer's API key for security reason. Sorry for missing this policy from the beginning. I'll post again after rethinking about this migration for a while. Bests, Dongjoon. On Wed, Jun 19, 2019 at 11:03 AM Dongjoon Hyun wrote: > Thank you for feedb

Re: Exposing JIRA issue types at GitHub PRs

2019-06-13 Thread Dongjoon Hyun
y be updated later: so keeping them in sync may be > an extra effort.. > > On Thu, 13 Jun 2019, 08:09 Reynold Xin, wrote: > >> Seems like a good idea. Can we test this with a component first? >> >> On Thu, Jun 13, 2019 at 6:17 AM Dongjoon Hyun >> wrote: >>

Re: Exposing JIRA issue types at GitHub PRs

2019-06-14 Thread Dongjoon Hyun
Hi, All. JIRA and PR is ready for reviews. https://issues.apache.org/jira/browse/SPARK-28051 (Exposing JIRA issue component types at GitHub PRs) https://github.com/apache/spark/pull/24871 Bests, Dongjoon. On Thu, Jun 13, 2019 at 10:48 AM Dongjoon Hyun wrote: > Thank you for the feedba

Jenkins Jobs for Hadoop-3.2 profile

2019-06-19 Thread Dongjoon Hyun
Hi, All. So far, we have only `hadoop-2.7` profile jobs. - SBT with hadoop-2.7 - Maven with hadoop-2.7 (on JDK8 and JDK11) Can we have a `Hadoop-3.2` profile Jenkins job for Spark 3.0.0? Bests, Dongjoon.

Re: Exposing JIRA issue types at GitHub PRs

2019-06-19 Thread Dongjoon Hyun
he github_jira_sync script is fully automated, >> should contributors skip adding the duplicated labels in new PR titles? >> >> >> On Jun 17, 2019, at 4:21 PM, Gabor Somogyi >> wrote: >> >> Dongjoon, I think it's useful. Thanks for adding it! >> >>

Exposing JIRA issue types at GitHub PRs

2019-06-12 Thread Dongjoon Hyun
Hi, All. Since we use both Apache JIRA and GitHub actively for Apache Spark contributions, we have lots of JIRAs and PRs consequently. One specific thing I've been longing to see is `Jira Issue Type` in GitHub. How about exposing JIRA issue types at GitHub PRs as GitHub `Labels`? There are two

Re: [VOTE][SPARK-25299] SPIP: Shuffle Storage API

2019-06-17 Thread Dongjoon Hyun
+1 Bests, Dongjoon. On Sun, Jun 16, 2019 at 9:41 PM Saisai Shao wrote: > +1 (binding) > > Thanks > Saisai > > On Sat, Jun 15, 2019 at 3:46 AM, Imran Rashid wrote: > >> +1 (binding) >> >> I think this is a really important feature for spark. >> >> First, there is already a lot of interest in alternative

Re: Exposing JIRA issue types at GitHub PRs

2019-06-17 Thread Dongjoon Hyun
Thank you, Hyukjin ! On Sun, Jun 16, 2019 at 4:12 PM Hyukjin Kwon wrote: > Labels look good and useful. > > On Sat, 15 Jun 2019, 02:36 Dongjoon Hyun, wrote: > >> Now, you can see the exposed component labels (ordered by the number of >> PRs) here and click

Re: Resolving all JIRAs affecting EOL releases

2019-05-17 Thread Dongjoon Hyun
+1, too. Thank you, Hyukjin! Bests, Dongjoon. On Fri, May 17, 2019 at 9:07 AM Imran Rashid wrote: > +1, thanks for taking this on > > On Wed, May 15, 2019 at 7:26 PM Hyukjin Kwon wrote: > >> oh, wait. 'Incomplete' can still make sense in this way then. >> Yes, I am good with 'Incomplete'

Re: [VOTE] Release Apache Spark 2.4.2

2019-04-29 Thread Dongjoon Hyun
Hi, All and Xiao (as a next release manager). In any case, can the release manager include the information about the used release script as a part of VOTE email officially? That information will be very helpful to reproduce Spark build (in the downstream environment) Currently, it's not clearly

Re: Release Apache Spark 2.4.4 before 3.0.0

2019-07-11 Thread Dongjoon Hyun
Additionally, one more correctness patch landed yesterday. - SPARK-28015 Check stringToDate() consumes entire input for the yyyy and yyyy-[m]m formats Bests, Dongjoon. On Tue, Jul 9, 2019 at 10:11 AM Dongjoon Hyun wrote: > Thank you for the reply, Sean. Sure. 2.4.x should be a

Re: Release Apache Spark 2.4.4 before 3.0.0

2019-07-12 Thread Dongjoon Hyun
(if we are on schedule). - 2.4.4 at the end of July - 2.3.4 at the end of August (since 2.3.0 was released at the end of February 2018) - 3.0.0 (possibly September?) - 3.1.0 (January 2020?) Bests, Dongjoon. On Thu, Jul 11, 2019 at 1:30 PM Jacek Laskowski wrote: > Hi, > > Thanks Dong

Re: Opinions wanted: how much to match PostgreSQL semantics?

2019-07-09 Thread Dongjoon Hyun
Thank you, Sean and all. One decision was made swiftly today. I believe that we can move forward case-by-case for the others until the feature freeze (3.0 branch cut). Bests, Dongjoon. On Mon, Jul 8, 2019 at 13:03 Marco Gaido wrote: > Hi Sean, > > Thanks for bringing this up. Honestly, my

Release Apache Spark 2.4.4 before 3.0.0

2019-07-09 Thread Dongjoon Hyun
Hi, All. Spark 2.4.3 was released two months ago (8th May). As of today (9th July), there exist 45 fixes in `branch-2.4` including the following correctness or blocker issues. - SPARK-26038 Decimal toScalaBigInt/toJavaBigInteger not work for decimals not fitting in long - SPARK-26045

Re: Spark SQL upgrade / migration guide: discoverability and content organization

2019-07-14 Thread Dongjoon Hyun
Thank you, Josh and Xiao. That sounds great. Do you think we can have some parts of that improvement in `2.4.4` document first since that is the very next release? Bests, Dongjoon. On Sun, Jul 14, 2019 at 4:25 PM Xiao Li wrote: > Yeah, Josh! All these ideas sound good to me. All the top

Re: Re: Release Apache Spark 2.4.4 before 3.0.0

2019-07-16 Thread Dongjoon Hyun
oon for being a release manager. > > If the assumed dates are ok, I would like to volunteer for an 2.3.4 > release manager. > > Best Regards, > Kazuaki Ishizaki, > > > > From:Dongjoon Hyun > To:dev , "user @spark" < > u...@spark.apache.

Disabling `Merge Commits` from GitHub Merge Button

2019-07-01 Thread Dongjoon Hyun
Hi, Apache Spark PMC members and committers. We are using GitHub `Merge Button` in `spark-website` repository because it's very convenient. 1. https://github.com/apache/spark-website/commits/asf-site 2. https://github.com/apache/spark/commits/master In order to be consistent with our

Re: Disabling `Merge Commits` from GitHub Merge Button

2019-07-01 Thread Dongjoon Hyun
e, Jul 2, 2019 at 5:58 AM Sean Owen wrote: >> >>> I'm using the merge script in both repos. I think that was the best >>> practice? >>> So, sure, I'm fine with disabling it. >>> >>> On Mon, Jul 1, 2019 at 3:53 PM Dongjoon Hyun >>> wrote

Release Apache Spark 2.4.4

2019-08-13 Thread Dongjoon Hyun
Hi, All. Spark 2.4.3 was released three months ago (8th May). As of today (13th August), there are 112 commits (75 JIRAs) in `branch-2.4` since 2.4.3. It would be great if we can have Spark 2.4.4. Shall we start `2.4.4 RC1` next Monday (19th August)? Last time, there was a request for K8s issue

Re: Release Apache Spark 2.4.4

2019-08-16 Thread Dongjoon Hyun
Thank you, Kazuaki. Bests, Dongjoon. On Fri, Aug 16, 2019 at 3:10 AM Kazuaki Ishizaki wrote: > Sure, I will launch a separate e-mail thread for discussing 2.3.4 later. > > Regards, > Kazuaki Ishizaki, Ph.D. > > > > From:Dongjoon Hyun > To:Sean Ow

Re: Release Apache Spark 2.4.4

2019-08-14 Thread Dongjoon Hyun
; >>>>> On Tue, Aug 13, 2019 at 5:22 PM Sean Owen wrote: >>>>> >>>>>> Seems fine to me if there are enough valuable fixes to justify another >>>>>> release. If there are any other important fixes imminent, it's fin

Re: Release Spark 2.3.4

2019-08-16 Thread Dongjoon Hyun
+1 for 2.3.4 release as the last release for `branch-2.3` EOL. Also, +1 for next week release. Bests, Dongjoon. On Fri, Aug 16, 2019 at 8:19 AM Sean Owen wrote: > I think it's fine to do these in parallel, yes. Go ahead if you are > willing. > > On Fri, Aug 16, 2019 at 9:48 AM Kazuaki

Re: Release Apache Spark 2.4.4

2019-08-15 Thread Dongjoon Hyun
2.3 since 2.3.3 was released in February: > https://issues.apache.org/jira/projects/SPARK/versions/12344844 > > Some look moderately important. > > Should we also, or first, cut 2.3.4 to end the 2.3.x line? > > On Tue, Aug 13, 2019 at 6:16 PM Dongjoon Hyun > wrote: > > >

[VOTE] Release Apache Spark 2.4.4 (RC1)

2019-08-19 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 2.4.4. The vote is open until August 22nd 10AM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 2.4.4 [ ] -1 Do not release this package because

Re: [VOTE] Release Apache Spark 2.4.4 (RC1)

2019-08-19 Thread Dongjoon Hyun
6e9e157e6c855718df972efad >> for a fix for a similar type of issue. >> >> This may be quite specific to a particular version of Java 8, but I'm >> testing on the latest (1.8.0_222). We can 'patch' it by allowing for >> multiple correct answers here. >> It may not ho

Re: Unmarking most things as experimental, evolving for 3.0?

2019-08-22 Thread Dongjoon Hyun
+1 for unmarking old ones (made in `2.3.x` and before). Thank you, Sean. Bests, Dongjoon. On Wed, Aug 21, 2019 at 6:46 PM Sean Owen wrote: > There are currently about 130 things marked as 'experimental' in > Spark, and some have been around since Spark 1.x. A few may be > legitimately still

Re: [VOTE] Release Apache Spark 2.4.4 (RC1)

2019-08-22 Thread Dongjoon Hyun
thub.com/apache/spark/pull/25498 > > > > I think we should have this fix in 2.3 and 2.4. > > > > Thanks, > > Wenchen > > > > On Tue, Aug 20, 2019 at 7:32 AM Dongjoon Hyun > wrote: > >> > >> Thank you for testing, Sean and Herman. >

Re: JDK11 Support in Apache Spark

2019-08-26 Thread Dongjoon Hyun
we'll have to inherit it. > ;) > >michael > > > On Aug 26, 2019, at 12:32 PM, Dongjoon Hyun > wrote: > > As Shane wrote, not yet. > > `one build for works for both` is our aspiration and the next step > mentioned in the first email. > > > T

JDK11 Support in Apache Spark

2019-08-24 Thread Dongjoon Hyun
Hi, All. Thanks to your many many contributions, Apache Spark master branch starts to pass on JDK11 as of today. (with `hadoop-3.2` profile: Apache Hadoop 3.2 and Hive 2.3.6)

Re: [VOTE] Release Apache Spark 2.4.4 (RC1)

2019-08-24 Thread Dongjoon Hyun
issues like `[SPARK-28778][MESOS] Fixed executors advertised address ...`, we can have it if it lands before `2.4.4-rc2` tag creation. I'll make `2.4.4-rc2` tag tomorrow. Please let me know if there is blocker issues. Bests, Dongjoon. On Thu, Aug 22, 2019 at 9:28 AM Dongjoon Hyun wrote: > Hi,

Re: JDK11 Support in Apache Spark

2019-08-27 Thread Dongjoon Hyun
Hi, All. Thank you for your attention! UPDATE: We succeeded to build with JDK8 and test with JDK11. - https://github.com/apache/spark/pull/25587 - https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4842 (Scala/Java/Python/R) We are ready to release Maven artifacts as

Re: JDK11 Support in Apache Spark

2019-08-26 Thread Dongjoon Hyun
g 25, 2019 at 6:03 AM Xiao Li wrote: >> >>> Thank you for your contributions! This is a great feature for Spark >>> 3.0! We finally achieve it! >>> >>> Xiao >>> >>> On Sat, Aug 24, 2019 at 12:18 PM Felix Cheung >>> wrote: >

[VOTE] Release Apache Spark 2.4.4 (RC3)

2019-08-27 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 2.4.4. The vote is open until August 30th 5PM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 2.4.4 [ ] -1 Do not release this package because

Re: Apply for JIRA permission

2019-08-27 Thread Dongjoon Hyun
Hi, Hefei. You can file a JIRA issue already. And, you will be added to the Apache Spark Contributor Jira group when the committers merge your PR and assign an JIRA issue to you. For more information, please see https://spark.apache.org/contributing.html . You can make a PR to Apache Spark

Re: [VOTE] Release Apache Spark 2.4.4 (RC3)

2019-08-27 Thread Dongjoon Hyun
-2.11/Scala-2.12 and both Python2/3. Python 2.7.15 with numpy 1.16.4, scipy 1.2.2, pandas 0.19.2, pyarrow 0.8.0 Python 3.6.4 with numpy 1.16.4, scipy 1.2.2, pandas 0.23.2, pyarrow 0.11.0 - Tested JDBC IT. Bests, Dongjoon. On Tue, Aug 27, 2019 at 4:05 PM Dongjoon Hyun wrote: > Please v

[VOTE][RESULT] Spark 2.4.4 (RC3)

2019-08-30 Thread Dongjoon Hyun
Hi, All. The vote passes. Thanks to all who helped with this release 2.4.4! It was very intensive vote with +11 (including +8 PMC votes) and no -1. I'll follow up later with a release announcement once everything is published. +1 (* = binding): Dongjoon Hyun Kazuaki Ishizaki Sean Owen* Wenchen

[VOTE] Release Apache Spark 2.4.4 (RC2)

2019-08-26 Thread Dongjoon Hyun
Please vote on releasing the following candidate as Apache Spark version 2.4.4. The vote is open until August 29th 1AM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 2.4.4 [ ] -1 Do not release this package because

[ANNOUNCE] Announcing Apache Spark 2.4.4

2019-09-01 Thread Dongjoon Hyun
all community members for contributing to this release. This release would not have been possible without you. Dongjoon Hyun

Re: [VOTE] Release Apache Spark 2.3.4 (RC1)

2019-08-28 Thread Dongjoon Hyun
Ya. It looks like that, but it seems to be a long standing issue. (2.3.3 / 2.4.0 / 2.4.1 / 2.4.2 / 2.4.3 are the same). $ bin/spark-submit --version Welcome to __ / __/__ ___ _/ /__ _\ \/ _ \/ _ `/ __/ '_/ /___/ .__/\_,_/_/ /_/\_\ version 2.3.3

Re: [VOTE] Release Apache Spark 2.3.4 (RC1)

2019-08-28 Thread Dongjoon Hyun
https://issues.apache.org/jira/browse/SPARK-28906 is filed for 2.3.2 ~ 2.4.4. Bests, Dongjoon.

Re: [VOTE] Release Apache Spark 2.3.4 (RC1)

2019-08-27 Thread Dongjoon Hyun
+1. I also verified SHA/GPG and tested UTs on AdoptOpenJDKu8_222/CentOS6.9 with profile "-Pyarn -Phadoop-2.7 -Pkubernetes -Pkinesis-asl -Phive -Phive-thriftserver" Additionally, JDBC IT also is tested. Thank you, Kazuaki! Bests, Dongjoon. On Tue, Aug 27, 2019 at 11:20 AM Sean Owen wrote: >

Re: [VOTE] Release Apache Spark 2.4.4 (RC2)

2019-08-27 Thread Dongjoon Hyun
181-b13, mixed mode) >> >> Bests, >> Takeshi >> >> >> On Tue, Aug 27, 2019 at 11:06 AM Sean Owen wrote: >> >>> +1 as per response to RC1. The existing issues identified there seem >>> to have been fixed. >>> >>> >>

Re: Resolving all JIRAs affecting EOL releases

2019-09-08 Thread Dongjoon Hyun
Thank you, Hyukjin. +1 for closing according to 2.3.x EOL. For the timing, please do that after the official 2.3.4 release announcement. Bests, Dongjoon. On Sun, Sep 8, 2019 at 16:27 Sean Owen wrote: > I think simply closing old issues with no activity in a long time is > OK. The "Affected

Re: [build system] weird mvn errors post-cache cleaning

2019-09-17 Thread Dongjoon Hyun
Oh, thank you for fixing that! :) Bests, Dongjoon. On Tue, Sep 17, 2019 at 12:57 PM Shane Knapp wrote: > > ah, i found this sucker on amp-jenkins-worker-02: > > s/02/06 > > - > To unsubscribe e-mail:

Re: Ask for ARM CI for spark

2019-09-18 Thread Dongjoon Hyun
Hi, Tianhua. Could you summarize the detail on the JIRA once more? It will be very helpful for the community. Also, I've been waiting on that JIRA. :) Bests, Dongjoon. On Mon, Sep 16, 2019 at 11:48 PM Tianhua huang wrote: > @shane knapp thank you very much, I opened an issue > for this

Re: Thoughts on Spark 3 release, or a preview release

2019-09-13 Thread Dongjoon Hyun
>>> >> >>>> There're some more new features/improvements items in SS, but given >> we're talking about ramping-down, above list might be realistic one. >> >>>> >> >>>> >> >>>> >> >>>> On Thu, Sep

Re: [VOTE] [SPARK-27495] SPIP: Support Stage level resource configuration and scheduling

2019-09-13 Thread Dongjoon Hyun
+1 Thanks, Dongjoon. On Fri, Sep 13, 2019 at 9:59 AM Thomas Graves wrote: > Thanks everyone so far for the voting and the feedback, bumping this > up as vote is scheduling to end today. > > Tom > > On Wed, Sep 11, 2019 at 1:10 PM Bryan Cutler wrote: > > > > +1 (non-binding), looks good! > > >

Re: In Apache Spark JIRA, spark/dev/github_jira_sync.py not running properly

2019-07-19 Thread Dongjoon Hyun
Hi, Hyukjin. In short, there are two bots. And, the current situation happens when only one bot with `dev/github_jira_sync.py` works. And, `dev/github_jira_sync.py` is irrelevant to the JIRA status change because it only use `add_remote_link` and `add_comment` API. I know only this bot (in

Re: Release Apache Spark 2.4.4 before 3.0.0

2019-07-15 Thread Dongjoon Hyun
Hi, Apache Spark PMC members. Can we cut Apache Spark 2.4.4 next Monday (22nd July)? Bests, Dongjoon. On Fri, Jul 12, 2019 at 3:18 PM Dongjoon Hyun wrote: > Thank you, Jacek. > > BTW, I added `@private` since we need PMC's help to make an Apache Spark > release. > > Can I
