Re: Apache Spark 3.2 Expectation

2021-03-11 Thread Hyukjin Kwon
Just for an update, I will send a discussion email about my idea late this week or early next week. On Thu, Mar 11, 2021 at 7:00 PM Wenchen Fan wrote: > There are many projects going on right now, such as new DS v2 APIs, ANSI > interval types, join improvement, disaggregated shuffle, etc. I don't >

Re: Apache Spark 2.4.8 (and EOL of 2.4)

2021-03-04 Thread Hyukjin Kwon
releasing 2.4.8 and thanks, Liang-chi, for volunteering. > > Btw, anyone roughly know how many v2.4 users still are based on some > stats > > (e.g., # of v2.4.7 downloads from the official repos)? > > Most users have started using v3.x? > > > > On Thu, Mar 4, 2021 at

Re: [ANNOUNCE] Announcing Apache Spark 3.1.1

2021-03-03 Thread Hyukjin Kwon
>>>>>> Greenplum >>>>>> with Spark SQL and DataFrames, 10~100x faster.* >>>>>> *spark-func-extras <https://github.com/yaooqinn/spark-func-extras>A >>>>>> library that brings excellent and useful functions from var

Re: Apache Spark 2.4.8 (and EOL of 2.4)

2021-03-03 Thread Hyukjin Kwon
Yeah, I would prefer to have a 2.4.8 release as an EOL too. I don't mind having 2.4.9 as EOL too if that's preferred by more people. On Thu, Mar 4, 2021 at 4:01 AM Sean Owen wrote: > Sure, I'm even arguing that 2.4.8 could possibly be the final release. No > objection of course to continuing to

[ANNOUNCE] Announcing Apache Spark 3.1.1

2021-03-02 Thread Hyukjin Kwon
We are excited to announce Spark 3.1.1 today. Apache Spark 3.1.1 is the second release of the 3.x line. This release adds Python type annotations and Python dependency management support as part of Project Zen. Other major updates include improved ANSI SQL compliance support, history server

Re: Please take a look at the draft of the Spark 3.1.1 release notes

2021-03-01 Thread Hyukjin Kwon
section" ? Currently, they refer to " > https://spark.apache.org/docs/3.0.0/.. <https://spark.apache.org/docs>.". > I think that they should refer to "https://spark.apache.org/docs/3.1.1/.. > <https://spark.apache.org/docs>." > > Regards, >

Re: Please take a look at the draft of the Spark 3.1.1 release notes

2021-03-01 Thread Hyukjin Kwon
aring, Hyukjin! > > Dongjoon. > > On Sat, Feb 27, 2021 at 12:36 AM Hyukjin Kwon wrote: > > Hi all, > > I am preparing to publish and announce Spark 3.1.1. > This is the draft of the release note, and I plan to edit a bit more and > use it as the final release note. >

Please take a look at the draft of the Spark 3.1.1 release notes

2021-02-27 Thread Hyukjin Kwon
Hi all, I am preparing to publish and announce Spark 3.1.1. This is the draft of the release note, and I plan to edit a bit more and use it as the final release note. Please take a look and let me know if I missed any major changes or something.

Re: Apache Spark 3.2 Expectation

2021-02-26 Thread Hyukjin Kwon
I have an idea that I'll send an email to discuss next week or the week after. I did not have enough bandwidth to drive both at the same time. I would appreciate it if we had some more time for 3.2. In addition, it would also be great if we follow the schedule and catch potential

[VOTE][RESULT] Release Spark 3.1.1 (RC3)

2021-02-26 Thread Hyukjin Kwon
The vote passes with 15 +1s (6 binding +1s). (* = binding) +1 - Hyukjin Kwon * - Jungtaek Lim - Herman van Hovell * - Sean Owen * - Yuming Wang - Gengliang Wang - John Zhuge - Takeshi Yamamuro - Cheng Su - Maxim Gekk - Gabor Somogyi - Dongjoon Hyun * - Terry Kim - Mridul Muralidharan * - Xiao Li

Re: [VOTE] Release Spark 3.1.1 (RC3)

2021-02-25 Thread Hyukjin Kwon
> >>>>> -1 Could we extend the voting deadline? >>>>> >>>>> A few TPC-DS queries (q17, q18, q39a, q39b) are returning different >>>>> results between Spark 3.0 and Spark 3.1. We need a few more days to >>>>> un

Re: [VOTE] Release Spark 3.1.1 (RC3)

2021-02-24 Thread Hyukjin Kwon
Mridul > > > On Mon, Feb 22, 2021 at 12:57 AM Hyukjin Kwon wrote: > >> Please vote on releasing the following candidate as Apache Spark version >> 3.1.1. >> >> The vote is open until February 24th 11PM PST and passes if a majority +1 >> PMC votes are cast

Re: [VOTE] Release Spark 3.1.1 (RC3)

2021-02-21 Thread Hyukjin Kwon
Starting with my +1 (binding). On Mon, Feb 22, 2021 at 3:56 PM Hyukjin Kwon wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.1.1. > > The vote is open until February 24th 11PM PST and passes if a majority +1 > PMC votes are cast, with a minimu

[VOTE] Release Spark 3.1.1 (RC3)

2021-02-21 Thread Hyukjin Kwon
Please vote on releasing the following candidate as Apache Spark version 3.1.1. The vote is open until February 24th 11PM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.1.1 [ ] -1 Do not release this package because

Re: [VOTE] Release Spark 3.1.1 (RC2)

2021-02-19 Thread Hyukjin Kwon
> > Regards, > Jacek Laskowski > > https://about.me/JacekLaskowski > "The Internals Of" Online Books <https://books.japila.pl/> > Follow me on https://twitter.com/jaceklaskowski > > <https://twitter.com/jaceklaskowski> > > > On Sat, Feb 1

Re: Please use Jekyll via "bundle exec" from now on

2021-02-18 Thread Hyukjin Kwon
Thanks Attila for fixing and sharing this. On Thu, Feb 18, 2021 at 6:17 PM Attila Zsolt Piros wrote: > Hello everybody, > > To pin the exact same version of Jekyll across all the contributors, Ruby > Bundler is introduced. > This way the differences in the generated documentation, which were caused >

Re: [DISCUSS] SPIP: FunctionCatalog

2021-02-16 Thread Hyukjin Kwon
urce v2 distribution requirements on the write path, etc. I >>> like Ryan's proposals which look simple and elegant, with nice support on >>> function overloading and variadic arguments. On the other hand, I think >>> Wenchen made a very good point about performan

Re: [VOTE] Release Spark 3.0.2 (RC1)

2021-02-16 Thread Hyukjin Kwon
+1 On Tue, Feb 16, 2021 at 5:10 PM Prashant Sharma wrote: > +1 > > On Tue, Feb 16, 2021 at 1:22 PM Dongjoon Hyun > wrote: > >> Please vote on releasing the following candidate as Apache Spark version >> 3.0.2. >> >> The vote is open until February 19th 9AM (PST) and passes if a majority >> +1 PMC

Re: [DISCUSS] assignee practice on committers+ (possible issue on preemption)

2021-02-15 Thread Hyukjin Kwon
I remember I raised a similar issue a long time ago in the dev mailing list. I agree that setting no assignee makes sense in most of the cases, and also think we share similar thoughts about the assignee on umbrella JIRAs, followup tasks, the case when it's clear with a design doc, etc. It makes

Re: [DISCUSS] SPIP: FunctionCatalog

2021-02-12 Thread Hyukjin Kwon
method signature > >>> searches. > >>> The merits on both sides can hopefully be more properly examined with > >>> code, > >>> so I look forward to seeing an implementation of Wenchen's ideas to > >>> provide > >>> a more co

Re: [VOTE] Release Spark 3.1.1 (RC2)

2021-02-12 Thread Hyukjin Kwon
>> I keep getting test failures >> with org.apache.spark.sql.kafka010.KafkaDelegationTokenSuite: removing this >> suite gets the build through though - does anyone have suggestions on how >> to fix it ? >> Perhaps a local problem at my end ? >> >> >> Regards, >>

Re: Apache Spark 3.0.2 Release ?

2021-02-12 Thread Hyukjin Kwon
Yeah, +1 too. On Sat, Feb 13, 2021 at 4:49 AM Dongjoon Hyun wrote: > Thank you, Sean! > > On Fri, Feb 12, 2021 at 11:41 AM Sean Owen wrote: > >> Sounds like a fine time to me, sure. >> >> On Fri, Feb 12, 2021 at 1:39 PM Dongjoon Hyun >> wrote: >> >>> Hi, All. >>> >>> As of today, `branch-3.0` has 307

Re: [DISCUSS] SPIP: FunctionCatalog

2021-02-09 Thread Hyukjin Kwon
Just dropping a few lines. I remember that one of the goals in DSv2 is to correct the mistakes we made in the current Spark code. There would not be much point if we just follow and mimic what Spark currently does. It might just end up with another copy of Spark APIs, e.g.

Re: [DISCUSS] Add RocksDB StateStore

2021-02-09 Thread Hyukjin Kwon
I mean I am okay with adding it as an external module for the extra clarification :-) On Tue, Feb 9, 2021 at 11:10 PM Hyukjin Kwon wrote: > I'm good with this too. > > On Tue, Feb 9, 2021 at 4:16 PM DB Tsai wrote: > >> +1 to add it as an external module so people can test it out and give

Re: [DISCUSS] Add RocksDB StateStore

2021-02-09 Thread Hyukjin Kwon
I'm good with this too. On Tue, Feb 9, 2021 at 4:16 PM DB Tsai wrote: > +1 to add it as an external module so people can test it out and give > feedback easier. > > On Mon, Feb 8, 2021 at 10:22 PM Gabor Somogyi > wrote: > > > > +1 adding it any way. > > > > On Mon, 8 Feb 2021, 21:54 Holden Karau,

Re: [VOTE] Release Spark 3.1.1 (RC2)

2021-02-08 Thread Hyukjin Kwon
losed)%20AND%20fixVersion%20in%20(3.1.0%2C%203.1.1)%20AND%20(assignee%20is%20EMPTY%20or%20assignee%20%3D%20apachespark) > > > On Tue, Feb 9, 2021 at 9:05 AM Hyukjin Kwon wrote: > >> +1 (binding) from myself too. >> >> On Tue, Feb 9, 2021 at 9:28 AM Kent Yao wrote: >>

Re: [VOTE] Release Spark 3.1.1 (RC2)

2021-02-08 Thread Hyukjin Kwon
k-func-extras>A > library that brings excellent and useful functions from various modern > database management systems to Apache Spark <http://spark.apache.org/>.* > > > > On 02/9/2021 08:24,Hyukjin Kwon > wrote: > > Please vote on releasing the following candidat

[VOTE] Release Spark 3.1.1 (RC2)

2021-02-08 Thread Hyukjin Kwon
Please vote on releasing the following candidate as Apache Spark version 3.1.1. The vote is open until February 15th 5PM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. Note that it is 7 days this time because it is a holiday season in several countries

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-02-06 Thread Hyukjin Kwon
(q88 for instance) that is caused by a recent commit in 3.1 ... > > I have found that the perf regression is caused by the Hadoop config: > io.file.buffer.size = 4096 > Before the commit > https://github.com/apache/spark/commit/278f6f45f46ccafc7a31007d51ab9cb720c9cb14, > we had: > i

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-02-02 Thread Hyukjin Kwon
. > > Tom > On Tuesday, February 2, 2021, 05:12:24 PM CST, Hyukjin Kwon < > gurwls...@gmail.com> wrote: > > > There is one here: https://github.com/apache/spark/pull/31440. There look > several issues being identified (to confirm that this is an issue in OSS > too)

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-02-02 Thread Hyukjin Kwon
as soon as I can confirm. On Wed, Feb 3, 2021 at 2:36 AM Tom Graves wrote: > Just curious if we have an update on next rc? is there a jira for the > tpcds issue? > > Thanks, > Tom > > On Wednesday, January 27, 2021, 05:46:27 PM CST, Hyukjin Kwon < > gurwls...@gmail.com>

Re: [Spark SQL]: SQL, Python, Scala and R API Consistency

2021-01-28 Thread Hyukjin Kwon
FYI exposing methods with Column signature only is already documented on the top of functions.scala, and I believe that has been the current dev direction if I am not mistaken. Another point is that we should rather expose commonly used expressions. It's best if it considers language specific

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-27 Thread Hyukjin Kwon
te: > >> If were ok waiting for it, I’d like to get >> https://github.com/apache/spark/pull/31298 in as well (it’s not a >> regression but it is a bug fix). >> >> On Tue, Jan 26, 2021 at 6:38 AM Hyukjin Kwon wrote: >> >>> It looks like a cool one but it's

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-26 Thread Hyukjin Kwon
e > since 3.0 when Dynamic Partition Pruning was added. > So it is not a regression from 3.0 to 3.1.1, but in some cases (like TPCDS > q23b) it is causing performance regression from 2.4 to 3.x. > > Thanks, > Peter > > On Tue, Jan 26, 2021 at 6:30 AM Hyukjin Kwon wrote: > >

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-25 Thread Hyukjin Kwon
sure we're safe instead of rushing an RC without finishing the investigation. Thanks all. On Fri, Jan 22, 2021 at 6:19 PM Hyukjin Kwon wrote: > Sure, thanks guys. I'll start another RC after the fixes. Looks like we're > almost there. > > On Fri, 22 Jan 2021, 17:47 Wenchen Fan, wro

Re: Should 3.1.0 config props be 3.1.1 (as s.k.executor.missingPodDetectDelta)?

2021-01-23 Thread Hyukjin Kwon
I think we can just leave it as is. We have the unofficial 3.1.0 release with its corresponding git tag so 3.1.0 mark isn't completely useless. Also changing means we should go through JIRAs and change the version properties, fixing conflict, etc. which I don't think is worthwhile. On Sat, 23

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-22 Thread Hyukjin Kwon
nd a regression in 3.1. A self-join query works well in >>> 3.0 but fails in 3.1. It's being fixed at >>> https://github.com/apache/spark/pull/31287 >>> >>> On Fri, Jan 22, 2021 at 4:34 AM Tom Graves >>> wrote: >>> >>>> +1 >>

Re: [VOTE] Release Spark 3.1.1 (RC1)

2021-01-18 Thread Hyukjin Kwon
I forgot to say :). I'll start with my +1. On Mon, 18 Jan 2021, 21:06 Hyukjin Kwon, wrote: > Please vote on releasing the following candidate as Apache Spark version > 3.1.1. > > The vote is open until January 22nd 4PM PST and passes if a majority +1 > PMC votes are cast, with a

[VOTE] Release Spark 3.1.1 (RC1)

2021-01-18 Thread Hyukjin Kwon
Please vote on releasing the following candidate as Apache Spark version 3.1.1. The vote is open until January 22nd 4PM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.1.0 [ ] -1 Do not release this package because

Re: [VOTE] Release Spark 3.1.0 (RC1)

2021-01-13 Thread Hyukjin Kwon
I plan to cut RC 1 for Spark 3.1.1 this Friday or next Monday. Let me know if there are any blockers. Thanks guys. On Tue, Jan 12, 2021 at 1:22 AM Dongjoon Hyun wrote: > Thank you, Hyukjin! > > Bests, > Dongjoon. > > On Mon, Jan 11, 2021 at 7:24 AM Hyukjin Kwon wrote: > >

Re: [VOTE] Release Spark 3.1.0 (RC1)

2021-01-11 Thread Hyukjin Kwon
I had a response from the INFRA team and Sonatype. Just to share, the removal is possible as an exception, but it's best to go ahead for 3.1.1 for safety as we all discussed. There are several non-regression but correctness issues being tracked under

Re: [FYI] CI Infra issues (in both GitHub Action and Jenkins)

2021-01-09 Thread Hyukjin Kwon
too who were involved. On Sat, Jan 9, 2021 at 8:34 AM Hyukjin Kwon wrote: > For GitHub resources of ASF repo, I have been contacting GitHub to address > the issue few days ago. This is not a repo level problem cc @Sean Owen > . > > ASF organisation in GitHub has already too many repos, and

Re: [FYI] CI Infra issues (in both GitHub Action and Jenkins)

2021-01-08 Thread Hyukjin Kwon
For the GitHub resources of the ASF repo, I contacted GitHub a few days ago to address the issue. This is not a repo-level problem, cc @Sean Owen. The ASF organisation in GitHub already has too many repos, and we should have a way to increase the limit, or set a separate limit specifically for the

Re: [VOTE] Release Spark 3.1.0 (RC1)

2021-01-06 Thread Hyukjin Kwon
voting is already affected by the fact that >> 3.1.0 is already in maven central. Skipping 3.1.0 sounds better to me. >> >> On Thu, Jan 7, 2021 at 12:54 PM Hyukjin Kwon wrote: >> >>> Okay, let me just start to prepare 3.1.1. I think that will address all >>

Re: [VOTE] Release Spark 3.1.0 (RC1)

2021-01-06 Thread Hyukjin Kwon
. Are we all good with this? On Thu, Jan 7, 2021 at 1:11 PM Hyukjin Kwon wrote: > I think that it would be great though if we have a clear blocker that > makes the release pointless if we want to drop this RC practically given > that we will schedule 3.1.1 faster - non-regression

Re: [VOTE] Release Spark 3.1.0 (RC1)

2021-01-06 Thread Hyukjin Kwon
I think that it would be great though if we have a clear blocker that makes the release pointless if we want to drop this RC practically given that we will schedule 3.1.1 faster - non-regression bug fixes will be delivered to end users relatively fast. That would make it clear which option we

Re: [VOTE] Release Spark 3.1.0 (RC1)

2021-01-06 Thread Hyukjin Kwon
> One thing I didn't follow was the comment: "release 3.1.1 fast that > exceptionally allows a bit of breaking changes" - what do you mean by that? > > if there is anything we can add to our release process documentation to > prevent in the future that would be great as well.

Re: [VOTE] Release Spark 3.1.0 (RC1)

2021-01-06 Thread Hyukjin Kwon
Yes, it was my mistake. I faced the same issue as INFRA-20651, and it is worse in my case because I mistakenly thought that RCs and releases are published separately. Right after this, I filed an INFRA JIRA to revert this at INFRA-21266

Re: [VOTE] Release Spark 3.1.0 (RC1)

2021-01-05 Thread Hyukjin Kwon
Seems like we have two PRs for both blockers, and one is already merged, nice. I will wait for a couple of days more before starting a new RC to make sure we catch more regressions before the new RC. Please keep testing this RC. I would appreciate it :-). On Wed, Jan 6, 2021 at 2:28 PM Hyukjin Kwon

Re: [VOTE] Release Spark 3.1.0 (RC1)

2021-01-05 Thread Hyukjin Kwon
which sounds like a blocker. I'll mark > this as a blocker, unless anyone has different opinions. > > 1. https://issues.apache.org/jira/browse/SPARK-33635 > > On Wed, Jan 6, 2021 at 9:01 AM Hyukjin Kwon wrote: > >> Please vote on releasing the following candidate as

[VOTE] Release Spark 3.1.0 (RC1)

2021-01-05 Thread Hyukjin Kwon
Please vote on releasing the following candidate as Apache Spark version 3.1.0. The vote is open until January 8th 4PM PST and passes if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes. [ ] +1 Release this package as Apache Spark 3.1.0 [ ] -1 Do not release this package because

Re: Recovering SparkR on CRAN?

2020-12-30 Thread Hyukjin Kwon
I just double checked. 3.0.1 is the latest one that has my fix. On Thu, Dec 31, 2020 at 9:21 AM Hyukjin Kwon wrote: > Nice, yeah, 3.0.1 should have all fixes needed. > > On Thu, Dec 31, 2020 at 5:23 AM Felix Cheung wrote: > >> We could just submit the latest release with the

Re: Recovering SparkR on CRAN?

2020-12-30 Thread Hyukjin Kwon
o through a release vote. > > What is the latest release with your fix? 3.0.1? I can put it in but will > need to make sure we can get hold of Shivaram. > > > On Tue, Dec 29, 2020 at 11:05 PM Hyukjin Kwon wrote: > >> Let me try in this release - I will have to ask some questions

Re: Recovering SparkR on CRAN?

2020-12-29 Thread Hyukjin Kwon
wrote: > Ah, I don’t recall actually - maybe it was just missed? > > The last message I had, was in June when it was broken by R 4.0.1, which > was fixed. > > > On Tue, Dec 29, 2020 at 7:21 PM Hyukjin Kwon wrote: > >> BTW, I remember I fixed all standing issues at >

Re: Recovering SparkR on CRAN?

2020-12-29 Thread Hyukjin Kwon
n Tue, Dec 22, 2020 at 7:48 PM Felix Cheung > wrote: > >> Ok - it took many years to get it first published, so it was hard to get >> there. >> >> >> On Tue, Dec 22, 2020 at 5:45 PM Hyukjin Kwon wrote: >> >>> Adding @Shivaram Venkataraman and

Re: Recovering SparkR on CRAN?

2020-12-22 Thread Hyukjin Kwon
Adding @Shivaram Venkataraman and @Felix Cheung FYI. On Wed, Dec 23, 2020 at 9:22 AM Michael Heuer wrote: > Anecdotally, as a project downstream of Spark, we've been prevented from > pushing to CRAN because of this > > https://github.com/bigdatagenomics/adam/issues/1851 > > We've given up and marked

Re: Spark branch-3.1

2020-12-04 Thread Hyukjin Kwon
Shane, do you mind setting up Jenkins jobs for branch-3.2 please? On Sat, 5 Dec 2020, 08:14 Hyukjin Kwon, wrote: > Great, thank you for doing this. > > On Sat, 5 Dec 2020, 02:02 Dongjoon Hyun, wrote: > >> Thank you so much, Hyukjin Kwon. >> >> I made a PR fo

Re: Spark branch-3.1

2020-12-04 Thread Hyukjin Kwon
Great, thank you for doing this. On Sat, 5 Dec 2020, 02:02 Dongjoon Hyun, wrote: > Thank you so much, Hyukjin Kwon. > > I made a PR for updating the `master` branch to 3.2.0-SNAPSHOT. > > https://github.com/apache/spark/pull/30606 > [SPARK-33662][BUILD] Setting version

Spark branch-3.1

2020-12-04 Thread Hyukjin Kwon
Hi all, It’s 4th PDT and branch-3.1 is cut out now as planned. Mid Dec 2020 QA period. Focus on bug fixes, tests, stability and docs. Generally, no new features merged Now we’re in the QA period. Please focus on testing, polishing, stability and docs for Spark 3.1.0, and hope we can have a nice

Re: Apache ORC 1.6.6 Release

2020-12-03 Thread Hyukjin Kwon
It's still good to know since Spark uses ORC :-) On Fri, Dec 4, 2020 at 3:34 AM Dongjoon Hyun wrote: > Oh, my bad. The previous email was written for `d...@orc.apache.org`. > > Apache ORC 1.6.6 is not for Apache Spark 3.1. > > It's prepared for Apache Spark 3.2 (2020 Summer) to provide mainly >

Re: Spark 3.1 branch cut 4th Dec?

2020-11-26 Thread Hyukjin Kwon
2021 as planned: Early Jan 2021 Release candidates (RC), voting, etc. until final release passes I know this is Thanksgiving day now in the US. Hope you guys enjoy the rest of the holidays. Thanks! On Sat, Nov 21, 2020 at 8:15 AM Hyukjin Kwon wrote: > Just for the record, I'll stick to the date

Re: [build system] IMPORTANT UPDATE

2020-11-25 Thread Hyukjin Kwon
Thanks Shane. On Thu, 26 Nov 2020, 10:19 shane knapp ☠, wrote: > alright, builds are looking solid except for SBT... if someone here could > take a look at those failures i'd be most appreciative. > > the important ones: PRB, PRB-K8s, k8s, snapshot and maven builds all > green! > > i'm

Re: Spark 3.1 branch cut 4th Dec?

2020-11-20 Thread Hyukjin Kwon
wrote: >>>>>>> >>>>>>>> Correction: >>>>>>>> >>>>>>>> Merging the feature work after the branch cut should not be >>>>>>>> encouraged in general, although some committers did make some >>>>>>

Spark 3.1 branch cut 4th Dec?

2020-11-19 Thread Hyukjin Kwon
Hi all, I think we haven’t decided yet the exact branch-cut, code freeze and release manager. As we planned in https://spark.apache.org/versioning-policy.html Early Dec 2020 Code freeze. Release branch cut Code freeze and branch cutting is coming. Therefore, we should finish if there are any

Re: [DISCUSS] Review/merge phase, and post-review

2020-11-14 Thread Hyukjin Kwon
In practice, I usually wait some more when the changes look complicated, when there are many reviews/discussions, when the change can potentially be controversial, etc. When I think it's pretty clear to go, for example, multiple approvals from committers, when the changes look pretty clear and

Re: I'm going to be out starting Nov 5th

2020-10-31 Thread Hyukjin Kwon
Oh, take care Holden! On Sun, 1 Nov 2020, 03:04 Denny Lee, wrote: > Best wishes Holden! :) > > On Sat, Oct 31, 2020 at 11:00 Dongjoon Hyun > wrote: > >> Take care, Holden! I believe everything goes well. >> >> Bests, >> Dongjoon. >> >> On Sat, Oct 31, 2020 at 10:24 AM Reynold Xin wrote: >>

Re: [DISCUSS][SPIP] Standardize Spark Exception Messages

2020-10-26 Thread Hyukjin Kwon
Thanks for pointing this out, Nicholas. This SPIP seems focused on the Scala side, grouping the exception handling and providing some guidance about error messages. Yes, I think we can refer to it on the PySpark side. Probably I will follow up and file some JIRAs based on how this SPIP goes, and

Re: Mu-L/spark Github actions emails

2020-10-09 Thread Hyukjin Kwon
Yeah, looks like I received emails from the fork as well. I label and filter the emails from GitHub so I didn't notice. I'll take a closer look tomorrow or next Monday and take action. Thanks for the heads up. On Thu, 8 Oct 2020, 23:41 Sean Owen, wrote: > I'm getting emails from a repo called

Re: Broken rlang installation on AppVeyor

2020-10-09 Thread Hyukjin Kwon
JIRA ticket, right? > > On 10/9/20 1:48 PM, Hyukjin Kwon wrote: > > Thanks for reporting this. I think we should change to "x64". Can you open > a PR to change? > > On Fri, Oct 9, 2020 at 4:36 AM Maciej wrote: > >> Hi Everyone, >> >> I've been digging into Ap

Re: Broken rlang installation on AppVeyor

2020-10-09 Thread Hyukjin Kwon
Thanks for reporting this. I think we should change to "x64". Can you open a PR to change? On Fri, Oct 9, 2020 at 4:36 AM Maciej wrote: > Hi Everyone, > > I've been digging into AppVeyor test failures for > https://github.com/apache/spark/pull/29978 > > > I see the following error > > [00:01:48]

Re: Apache Spark 3.1 Preparation Status (Oct. 2020)

2020-10-03 Thread Hyukjin Kwon
Nice summary. Thanks Dongjoon. One minor correction -> I believe we dropped R 3.5 and below at branch 2.4 as well. On Sun, 4 Oct 2020, 09:17 Dongjoon Hyun, wrote: > Hi, All. > > As of today, master branch (Apache Spark 3.1.0) resolved > 852+ JIRA issues and 606+ issues are 3.1.0-only patches. >

Re: Running K8s integration tests for changes in core?

2020-09-24 Thread Hyukjin Kwon
+1 On Fri, 25 Sep 2020, 02:21 Holden Karau, wrote: > Thanks Shane! > > On Thu, Sep 24, 2020 at 10:17 AM shane knapp ☠ > wrote: > >> just revisiting this thread... >> >> re presubmit strategy: i don't think this would be easy to set up... >> and i'm not sure what benefit it will give us. >> >>

Re: [VOTE] Release Spark 3.0.1 (RC3)

2020-09-02 Thread Hyukjin Kwon
For a quick correction: > - For Apache Spark 3.1, we are testing R 4.0 on `master` branch, > but we don't have test coverage on `branch-3.0`. > So, I'm wondering if Spark 3.0.1 supports R 4.0 without any issue. I believe we now test SparkR at branch-3.0 with R 4.0 after

Re: pip/conda distribution headless mode

2020-08-30 Thread Hyukjin Kwon
I am going to take a look if nobody is interested in it. On Mon, Aug 31, 2020 at 1:48 PM Georg Heiler wrote: > Many thanks. > > Best, > Georg > > On Mon, Aug 31, 2020 at 1:12 AM Xiao Li > wrote: > >> Hi, Georg, >> >> This is being tracked by >> https://issues.apache.org/jira/browse/SPARK-32017

Re: [PySpark] Revisiting PySpark type annotations

2020-08-27 Thread Hyukjin Kwon
Thanks Maciej and Fokko. On Fri, Aug 28, 2020 at 6:09 AM Maciej wrote: > On my side, I'll try to identify any possible problems by the end of the > week or so (at somewhat crude inspection there is nothing unexpected or > particularly hard to resolve, but sometimes problem occur when you try to >

Re: [PySpark] Revisiting PySpark type annotations

2020-08-27 Thread Hyukjin Kwon
8:39 PM, Driesprong, Fokko wrote: > No worries, thanks for the update! > > On Thu, Aug 20, 2020 at 12:50 PM Hyukjin Kwon wrote: > >> Yeah, we had a short meeting. I had to check a few other things so some >> delays happened. I will share soon. >> >> On Thu, Aug 20, 2020 at

Re: [PySpark] Revisiting PySpark type annotations

2020-08-20 Thread Hyukjin Kwon
oes not annotate types in some other APIs (by using Any). Correct me if I >> am wrong, Maciej. >> >> For me, it is a bit like code coverage. You want this to be high to make >> sure that you cover most of the APIs, but it will take some time to make it >> complete. >

Re: Contributing to JIRA Maintenance

2020-08-06 Thread Hyukjin Kwon
ofile.jspa?name=rohitmishr1484> I will keep monitoring it too. Thanks. On Sat, Aug 1, 2020 at 8:05 PM Hyukjin Kwon wrote: > Thank you! > > On Sat, 1 Aug 2020, 19:31 Takeshi Yamamuro, wrote: > >> Great work and thanks for your JIRA maintenance and this heads-up (sorry >> for my late

Re: Need some help and contributions in PySpark API documentation

2020-08-05 Thread Hyukjin Kwon
> Rohit Mishra > > On Wed, Aug 5, 2020 at 12:12 PM Hyukjin Kwon wrote: > >> Hi all, >> >> I am trying to redesign the PySpark documentation at SPARK-31851 >> <https://issues.apache.org/jira/browse/SPARK-31851>. >> Basically from: >

Need some help and contributions in PySpark API documentation

2020-08-05 Thread Hyukjin Kwon
Hi all, I am trying to redesign the PySpark documentation at SPARK-31851. Basically from: - https://spark.apache.org/docs/latest/api/python/index.html to: - https://hyukjin-spark.readthedocs.io/en/latest/index.html (draft) The base

Re: [PySpark] Revisiting PySpark type annotations

2020-08-04 Thread Hyukjin Kwon
bably something that should be discussed here. > On 8/4/20 11:06 PM, Felix Cheung wrote: > > So IMO maintaining outside in a separate repo is going to be harder. That > was why I asked. > > > > -- > *From:* Maciej Szymkiewicz > > *Sent:* Tu

Re: [PySpark] Revisiting PySpark type annotations

2020-08-03 Thread Hyukjin Kwon
idering typing is arguably premature yet. >> >> >> This feels a bit weird to me, since you want to keep this in sync right? >> Do you provide different stubs for different versions of Python? I had to >> look up the literals: https://www.python.org/dev/peps/pep-0586/ >

PySpark documentation main page

2020-08-01 Thread Hyukjin Kwon
Hi all, I am trying to write up the main page of PySpark documentation at https://github.com/apache/spark/pull/29320. While I think the current proposal might be good enough, I would like to collect more feedback about the contents, structure and image since this is the entrance page of PySpark

Re: Contributing to JIRA Maintenance

2020-08-01 Thread Hyukjin Kwon
it from now on for the community's help. > > On Wed, Jul 29, 2020 at 10:52 AM Hyukjin Kwon wrote: > >> Yeah, to contribute to JIRA maintenance, it does not need a lot of codes >> given my experience. >> >> Just to share my own story: >> 4 years ago when I was one of contr

[OSS DIGEST] The major changes of Apache Spark from June 17 to June 30

2020-07-30 Thread Hyukjin Kwon
Hi all, This is the bi-weekly Apache Spark digest from the Databricks OSS team. For each API/configuration/behavior change, an *[API] *tag is added in the title. CORE

Re: Contributing to JIRA Maintenance

2020-07-28 Thread Hyukjin Kwon
l say this is a great way for anyone >> >> out there to contribute directly to the project. Issue trackers need >> >> maintenance too. It's not that hard to spot basic problems with JIRAs >> >> and request fixes, as a way to engage the reporter usefully. >> >>

Contributing to JIRA Maintenance

2020-07-27 Thread Hyukjin Kwon
Hi all, I would like to ask for some help with JIRA maintenance contributions in Apache Spark. I tend to see fewer and fewer people active in JIRA maintenance contributions. I have regularly checked all JIRAs and monitored them continuously for the last 4 years. For the last week, I didn't have

Re: Re: request the contributor permission

2020-07-27 Thread Hyukjin Kwon
ssue.jspa?atl_token=A5KQ-2QAV-T4JA-FDED_d58408eb41144d9970c56fbb41300f40176aadfc_lin=13275018=linshan>to > me . I'm logged in. As shown in figure > > > > > On 2020-07-27 17:04:54, "Hyukjin Kwon" wrote: > > Once you contribute (e.g., your PR is merged to the codebase), you

Re: request the contributor permission

2020-07-27 Thread Hyukjin Kwon
Once you contribute (e.g., your PR is merged to the codebase), you will be able to get the permission. BTW, you are already able to do most of the work as a contributor regardless of the permission. Would you mind if I ask what you specifically want to do? On Mon, Jul 27, 2020 at 5:11 PM linshan

Re: [DISCUSS] Amend the commiter guidelines on the subject of -1s & how we expect PR discussion to be treated.

2020-07-25 Thread Hyukjin Kwon
+1 thanks Holden. On Fri, 24 Jul 2020, 22:34 Tom Graves, wrote: > +1 > > Tom > > On Tuesday, July 21, 2020, 03:35:18 PM CDT, Holden Karau < > hol...@pigscanfly.ca> wrote: > > > Hi Spark Developers, > > There has been a rather active discussion regarding the specific vetoes > that occured during

Re: Python xmlrunner being used?

2020-07-24 Thread Hyukjin Kwon
It's used in Jenkins IIRC. On Fri, Jul 24, 2020 at 11:43 PM Driesprong, Fokko wrote: > I found this ticket: https://issues.apache.org/jira/browse/SPARK-7021 > > Is anybody actually using this? > > Cheers, Fokko > > On Fri, Jul 24, 2020 at 4:27 PM Driesprong, Fokko > wrote: > >> Hi all, >> >> Does anyone

Re: [PSA] Apache Spark uses GitHub Actions to run the tests

2020-07-23 Thread Hyukjin Kwon
ub.blog/changelog/2020-07-06-github-actions-manual-triggers-with-workflow_dispatch/ > > thanks, > Imran > > On Tue, Jul 14, 2020 at 1:18 AM Hyukjin Kwon wrote: > >> Hi dev, >> >> Github Actions build was introduced to run the regular Spark test cases >> at https://gith

Re: [PySpark] Revisiting PySpark type annotations

2020-07-21 Thread Hyukjin Kwon
Yeah, I tend to be positive about leveraging the Python type hints in general. However, just to clarify, I don’t think we should just port the type hints into the main codes yet but maybe think about having/porting Maciej's work, pyi files as stubs. For now, I tend to think adding type hints to

Re: Welcoming some new Apache Spark committers

2020-07-17 Thread Hyukjin Kwon
, *Takeshi Yamamuro,* *Sean Owen*, *Dongjoon > hyun*, *Hyukjin Kwon, *and *Liang-Chi Hsieh,* who all helped review the > majority of my PRs allowing me to grow technically. > > Thanks again and looking forward to working with you all. > > Regards, > Dilip > > On Thu, Jul 16, 2

Re: Welcoming some new Apache Spark committers

2020-07-14 Thread Hyukjin Kwon
Congrats! On Wed, Jul 15, 2020 at 7:56 AM Takeshi Yamamuro wrote: > Congrats, all! > > On Wed, Jul 15, 2020 at 5:15 AM Takuya UESHIN > wrote: > >> Congrats and welcome! >> >> On Tue, Jul 14, 2020 at 1:07 PM Bryan Cutler wrote: >> >>> Congratulations and welcome! >>> >>> On Tue, Jul 14, 2020 at 12:36

Re: [PSA] Apache Spark uses GitHub Actions to run the tests

2020-07-14 Thread Hyukjin Kwon
g? > > On Tue, Jul 14, 2020 at 2:18 PM Hyukjin Kwon wrote: > >> Hi dev, >> >> Github Actions build was introduced to run the regular Spark test cases >> at https://github.com/apache/spark/pull/29057 and >> https://github.com/apache/spark/pull/29086.

[PSA] Apache Spark uses GitHub Actions to run the tests

2020-07-14 Thread Hyukjin Kwon
Hi dev, Github Actions build was introduced to run the regular Spark test cases at https://github.com/apache/spark/pull/29057 and https://github.com/apache/spark/pull/29086. This is virtually the duplication of default Jenkins PR builder at this moment. The only differences are: - Github Actions

Re: [PSA] Python 2, 3.4 and 3.5 are now dropped

2020-07-13 Thread Hyukjin Kwon
cc user mailing list too. On Tue, Jul 14, 2020 at 11:27 AM Hyukjin Kwon wrote: > I am sending another email to make sure dev people know. Python 2, 3.4 and > 3.5 are now dropped at https://github.com/apache/spark/pull/28957. > > >

[PSA] Python 2, 3.4 and 3.5 are now dropped

2020-07-13 Thread Hyukjin Kwon
I am sending another email to make sure dev people know. Python 2, 3.4 and 3.5 are now dropped at https://github.com/apache/spark/pull/28957.

Re: [DISCUSS] Drop Python 2, 3.4 and 3.5

2020-07-13 Thread Hyukjin Kwon
Thank you all. Python 2, 3.4 and 3.5 are dropped now in the master branch at https://github.com/apache/spark/pull/28957 On Fri, Jul 3, 2020 at 10:01 AM Hyukjin Kwon wrote: > Thanks Dongjoon. That makes much more sense now! > > On Fri, Jul 3, 2020 at 12:11 AM Dongjoon Hyun wrote: > >>
