Please note that the context is TRIM/LTRIM/RTRIM with two parameters and
the TRIM(trimStr FROM str) syntax.
This thread is not about the one-parameter TRIM/LTRIM/RTRIM.
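For readers unfamiliar with the two-parameter forms: they remove any character in trimStr from the end(s) of str. A plain-Python sketch of the semantics (illustrative only, not Spark code; note that the argument order of the two-parameter form is itself part of what was under discussion, so the (trimStr, str) order below is just for illustration):

```python
def sql_trim(trim_str, s):
    """TRIM(trimStr FROM str): strip any char in trim_str from both ends."""
    return s.strip(trim_str)

def sql_ltrim(trim_str, s):
    """Two-parameter LTRIM: strip from the left end only."""
    return s.lstrip(trim_str)

def sql_rtrim(trim_str, s):
    """Two-parameter RTRIM: strip from the right end only."""
    return s.rstrip(trim_str)
```

For example, `sql_trim("xy", "xyaxy")` yields `"a"`, since any mix of the listed characters is stripped, not the exact substring.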
On Fri, Feb 14, 2020 at 11:35 AM Dongjoon Hyun
wrote:
> Hi, All.
>
> I'm sending this email because the Apache Spark c
Hi, All.
I'm sending this email because the Apache Spark committers had better have
a consistent point of view for the upcoming PRs. And, a community policy
is the way to lead the community members transparently and clearly for the
long-term good.
First of all, I want to emphasize that, like
Thank you for raising the issue, Hyukjin.
According to the current status of the discussion, it seems that we can
agree on updating the non-structured configurations and keeping the
structured configurations AS-IS.
I'm +1 for revisiting the configurations if that is our direction. If
e best way to deprecate an SQL function. Runtime
> warning can be annoying if it keeps coming out. Maybe we should only log
> the warning once per Spark application.
>
> On Tue, Feb 18, 2020 at 3:45 PM Dongjoon Hyun
> wrote:
>
>> Thank you for feedback, Wenchen, Maxim, and T
Hi, Karen.
Are you saying that Spark 3 has to keep all deprecated 2.x APIs?
Could you tell us your criteria for `unnecessarily` or
`necessarily`?
> the migration process from Spark 2 to Spark 3 unnecessarily painful.
Bests,
Dongjoon.
On Tue, Feb 18, 2020 at 4:55 PM Karen Feng
wrote:
n 2.4 this
>> way libraries and programs can dual target during the migration process.
>>
>> Now that isn’t always going to be doable, but certainly worth looking at
>> the situations where we aren’t providing a smooth migration path and making
>> sure it’s the best t
Thank you, Wenchen.
The new policy looks clear to me. +1 for the explicit policy.
So, are we going to revise the existing conf names before the 3.0.0 release?
Or does it apply only to new upcoming configurations from now on?
Bests,
Dongjoon.
On Wed, Feb 12, 2020 at 7:43 AM Wenchen Fan wrote:
> Hi
too much, I think it's fine to give a shot.
>
>
> On Sat, Feb 8, 2020 at 6:51 AM, Dongjoon Hyun wrote:
>
>> Thank you, Sean, Jiaxin, Shane, and Tom, for feedbacks.
>>
>> 1. For legal questions, please see the following three Apache-approved
>> approaches. We
at official
> releases
> > 2) There was some ambiguity about whether or not a container image that
> included GPL'ed packages (spark images do) might trip over the GPL "viral
> propagation" due to integrating ASL and GPL in a "binary release". The
> "air gap&
es,
> 2. `interval` -> CalendarIntervalType support in the parser
>
> Thanks
>
> *Kent Yao*
> Data Science Center, Hangzhou Research Institute, Netease Corp.
> PHONE: (86) 186-5715-3499
> EMAIL: hzyao...@corp.netease.com
>
> On 01/11/2020 01:57,Dongjoon Hyun
>
t;>>>> PGP Key ID: 42E5B25A8F7A82C1
>>>>>
>>>>> On Tue, Jan 14, 2020 at 11:08 AM Sean Owen wrote:
>>>>> >
>>>>> > Yeah it's something about the env I spun up, but I don't know what.
>>>>> It
>>
out it fixed a
> regression, a long-lasting one (broken at 2.3.0). The link refers to the PR
> for the 2.4 branch.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Thu, Jan 16, 2020 at 12:56 PM Dongjoon Hyun
> wrote:
>
>> Sure. Wenchen and Hyukjin.
>>
>> I observ
Hi, Saurabh.
It seems that you are hitting
https://issues.apache.org/jira/browse/SPARK-26095 .
And, we disabled the parallel build via
https://github.com/apache/spark/pull/23061 at 3.0.0.
According to the stack trace in JIRA and PR description,
`maven-shade-plugin` seems to be the root cause.
Hi, All.
According to our policy, "Correctness and data loss issues should be
considered Blockers".
- http://spark.apache.org/contributing.html
Since we are close to branch-3.0 cut,
I want to ask your opinions on the following correctness and data loss
issues.
SPARK-30218 Columns used
Please vote on releasing the following candidate as Apache Spark version
2.4.5.
The vote is open until January 16th 5AM PST and passes if a majority +1 PMC
votes are cast, with a minimum of 3 +1 votes.
[ ] +1 Release this package as Apache Spark 2.4.5
[ ] -1 Do not release this package because
://issues.apache.org/jira/browse/SPARK-28344
>
> On Mon, Jan 20, 2020 at 2:07 PM Dongjoon Hyun
> wrote:
>
>> Hi, All.
>>
>> According to our policy, "Correctness and data loss issues should be
>> considered Blockers".
>>
>> - http://
ut that it feels like
> it can wait for 3.0 but would be good to get others input and I'm not an
> expert on SQL standard and what do the other sql engines do in this case.
>
> Tom
>
> On Monday, January 20, 2020, 12:07:54 AM CST, Dongjoon Hyun <
> dongjoon.h...@gmail.c
.
The remaining things are the followings:
1. Revisit `3.0.0`-only correctness patches?
2. Set the target version to `2.4.5`? (Specifically, is this feasible
in terms of timeline?)
Bests,
Dongjoon.
On Wed, Jan 22, 2020 at 9:43 AM Dongjoon Hyun
wrote:
> Hi, Tom.
>
> Th
Hi, All.
As of now, Apache Spark sbt build is broken by the Maven Central repository
policy.
-
https://stackoverflow.com/questions/59764749/requests-to-http-repo1-maven-org-maven2-return-a-501-https-required-status-an
> Effective January 15, 2020, The Central Maven Repository no longer
supports
Hi, Tom and Shane.
It looks like an old `sbt` bug. Maven Central seems to have started banning
`http` access recently.
If you use Maven, it's okay because it goes to `https`.
$ build/sbt clean
[error] org.apache.maven.model.building.ModelBuildingException: 1 problem
was encountered while building the
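For older sbt versions hit by this, one common workaround is to point the resolvers at an HTTPS mirror through sbt's repositories file. This is a sketch assuming sbt's standard `~/.sbt/repositories` override mechanism; run sbt with `-Dsbt.override.build.repos=true` so the file takes precedence over build-defined resolvers:

```ini
[repositories]
  local
  central-https: https://repo1.maven.org/maven2/
```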
Hi, All.
RC2 was scheduled for today, and all RC1 feedback seems to be addressed.
However, I'm waiting for another ongoing correctness PR.
https://github.com/apache/spark/pull/27233
[SPARK-29701][SQL] Correct behaviours of group analytical queries when
empty input given
Unlike the other
+1, I'm supporting the following proposal.
> this mirror as the primary repo in the build, falling back to Central if
needed.
Thanks,
Dongjoon.
On Tue, Jan 21, 2020 at 14:37 Sean Owen wrote:
> See https://github.com/apache/spark/pull/27307 for some context. We've
> had to add, in at least
Hi, Kent.
Thank you for the proposal.
Does your proposal need to revert something from the master branch?
I'm just asking because it's not clear in the proposal document.
Bests,
Dongjoon.
On Fri, Jan 10, 2020 at 5:31 AM Dr. Kent Yao wrote:
> Hi, Devs
>
> I’d like to propose to add two new
Server Version: v1.14.9-eks-c0eccc
Bests,
Dongjoon.
On Mon, Jan 13, 2020 at 4:27 AM Dongjoon Hyun
wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 2.4.5.
>
> The vote is open until January 16th 5AM PST and passes if a majority +1
> PMC
+1 for January 31st.
Bests,
Dongjoon.
On Tue, Dec 24, 2019 at 7:11 AM Xiao Li wrote:
> Jan 31 is pretty reasonable. Happy Holidays!
>
> Xiao
>
> On Tue, Dec 24, 2019 at 5:52 AM Sean Owen wrote:
>
>> Yep, always happens. Is earlier realistic, like Jan 15? it's all
>> arbitrary but indeed this
Indeed! Thank you again, Yuming and all.
Bests,
Dongjoon.
On Tue, Dec 24, 2019 at 13:38 Takeshi Yamamuro
wrote:
> Great work, Yuming!
>
> Bests,
> Takeshi
>
> On Wed, Dec 25, 2019 at 6:00 AM Xiao Li wrote:
>
>> Thank you all. Happy Holidays!
>>
>> Xiao
>>
>> On Tue, Dec 24, 2019 at 12:53 PM
Hi, All.
Happy New Year (2020)!
Although we slightly missed the timeline for the 3.0 branch cut last month,
it seems that we are keeping the 2.4.x timeline on track.
https://spark.apache.org/versioning-policy.html
As of today, `branch-2.4` has 154 patches since v2.4.4.
$ git log --oneline
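The patch counting above can be sketched with a standard git invocation (the helper name is mine; the tag and branch names are the ones from the message):

```shell
# Count commits reachable from a branch but not from a release tag,
# i.e. the patches landed since that release.
count_patches() {
  git log --oneline "$1..$2" | wc -l
}
# e.g. count_patches v2.4.4 branch-2.4
```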
>>>>>>> +1 (non-binding)
>>>>>>>
>>>>>>> Bests,
>>>>>>> Takeshi
>>>>>>>
>>>>>>> On Mon, Mar 9, 2020 at 4:52 PM Gengliang Wang <
>>>>>>> gengliang.w...@dat
[SPARK-24640][SQL] Return `NULL` from `size(NULL)` by default
Bests,
Dongjoon.
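The behavior change in SPARK-24640 can be sketched in plain Python (an illustration of the SQL semantics only, not Spark code; the `legacy` flag stands in for the legacy configuration):

```python
def size(collection, legacy=False):
    """Sketch of SQL size(): with the new default, size(NULL) returns NULL
    (None here); with the legacy behavior it returned -1."""
    if collection is None:
        return -1 if legacy else None
    return len(collection)
```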
On Thu, Mar 5, 2020 at 9:08 PM Dongjoon Hyun
wrote:
> Hi, All.
>
> There is an ongoing PR from Xiao referencing this email.
>
> https://github.com/apache/spark/pull/27821
>
> Bests,
> Dongjoon.
>
repo and have tons of mails. Compared to the popularity on Github PRs,
>>> dev@ mailing list is not that crowded so less chance of missing the
>>> critical changes, and not quickly decided by only a couple of committers.
>>>
>>> These suggestions would slow
This new policy has a good intention, but can we narrow it down to the
migration from Apache Spark 2.4.5 to Apache Spark 3.0+?
I saw that there already exists a reverting PR to bring back Spark 1.4 and
1.5 APIs based on this AS-IS suggestion.
The AS-IS policy clearly mentions that
ments turn
> around 'commonly used' but can we know that more concretely?
>
> Otherwise I think we'll back into implementing personal interpretations of
> general principles, which is arguably the issue in the first place, even
> when everyone believes in good faith in the same princip
Hi, All.
Apache Spark has suffered from a known consistency issue in `CHAR` type
behavior among its usages and configurations. However, the evolution
direction has been gradually moving toward consistency inside Apache
Spark because we don't have `CHAR` officially. The following is the
Thank you, Alex, Nicholas, and Holden.
I filed an INFRA issue for Apache Spark like Zeppelin.
https://issues.apache.org/jira/browse/INFRA-19957
Bests,
Dongjoon.
On Tue, Mar 10, 2020 at 12:03 PM Alex Ott wrote:
> yes - it's https://issues.apache.org/jira/browse/INFRA-19934
>
> Nicholas
Hi, All.
Autolinking from PRs to JIRA has started.
*Inside PR*
https://github.com/apache/spark/pull/27881
*Inside commit log*
https://github.com/apache/spark/commits/master
You don't need to add a hyperlink to `SPARK-XXX` manually from now on.
Bests,
Dongjoon.
>
code that was working for char(3) would now stop
> working.
>
> For new users, depending on whether the underlying metastore char(3) is
> either supported but different from ansi Sql (which is not that big of a
> deal if we explain it) or not supported.
>
> On Sat, Mar 14, 2020 at 3
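The `char(3)` padding behavior debated above can be sketched in plain Python (illustrative only; whether and where Spark should pad is exactly the open question in this thread):

```python
def char_store(value: str, n: int) -> str:
    """Sketch of ANSI CHAR(n) semantics: values are right-padded with
    spaces to exactly n characters; longer values are an error."""
    if len(value) > n:
        raise ValueError(f"value too long for CHAR({n})")
    return value.ljust(n)
```

Engines that follow this pad on write, which is why code relying on exact string comparison can break when padding behavior changes.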
Hi, All.
The RC1 tag was created yesterday, and traditionally we hold off on all
backporting activities to give the release manager some time. I'm also
holding two commits on the master branch.
https://github.com/apache/spark/tree/v3.0.0-rc1
However, I'm still seeing some commits land on `branch-3.0`.
Hi, All.
First of all, always "Community Over Code"!
I wish you the best health and happiness.
As we know, we are still in the QA period and didn't reach the RC stage. It
seems that we need to make the website up-to-date once more.
https://spark.apache.org/versioning-policy.html
If possible,
Thank you so much, Shane!
On Thu, May 14, 2020 at 9:51 AM Xiao Li wrote:
> Thank you, Shane!
>
> On Thu, May 14, 2020 at 9:50 AM shane knapp ☠ wrote:
>
>> we're back. doesn't seem to have fixed the issue of the workers
>> connecting to repository.apache.org but i'm still investigating.
>>
>>
I confirmed and updated the JIRA. SPARK-31663 is a correctness issue since
Apache Spark 2.4.0.
Bests,
Dongjoon.
On Fri, May 8, 2020 at 10:26 AM Holden Karau wrote:
> Can you provide a bit more context (is it a regression?)
>
> On Fri, May 8, 2020 at 9:33 AM Yuanjian Li wrote:
>
>> Hi Holden,
nsistently everywhere.
>
>
> Cheers,
>
> Steve C
>
> On 17 Mar 2020, at 10:01 am, Dongjoon Hyun
> wrote:
>
> Hi, Reynold.
> (And +Michael Armbrust)
>
> If you think so, do you think it's okay that we change the return value
> silently? Then, I'm wondering why we r
PM, Reynold Xin wrote:
>
>> I looked up our usage logs (sorry I can't share this publicly) and trim
>> has at least four orders of magnitude higher usage than char.
>>
>>
>> On Mon, Mar 16, 2020 at 5:27 PM, Dongjoon Hyun
>> wrote:
>>
>>> T
(honestly negligible). I was comparing select vs
> select.
>
>
>
> On Mon, Mar 16, 2020 at 5:40 PM, Dongjoon Hyun
> wrote:
>
>> Ur, are you comparing the number of SELECT statement with TRIM and CREATE
>> statements with `CHAR`?
>>
>> > I looked up our usage
+1 for Wenchen's suggestion.
I believe that the differences and effects have been shared widely and
discussed in many ways, twice.
First, this was shared last December.
"FYI: SPARK-30098 Use default datasource as provider for CREATE TABLE
syntax", 2019/12/06
hread.html/493f88c10169680191791f9f6962fd16cd0ffa3b06726e92ed04cbe1%40%3Cdev.spark.apache.org%3E
>
> (Yes it talked about changing the default data source provider, but that's
> just one of the ways we are exposing this char/varchar issue).
>
>
>
> On Thu, Mar 19, 2020 at 8:41 PM, Dongjoon Hyun
0the%20default.=Snowflake%20currently%20deviates%20from%20common,space%2Dpadded%20at%20the%20end.>
>> :
>> "Snowflake currently deviates from common CHAR semantics in that strings
>> shorter than the maximum length are not space-padded at the end."
>>
>> MyS
+1
Thanks,
Dongjoon.
On Tue, Mar 24, 2020 at 14:49 Reynold Xin wrote:
> I actually think we should start cutting RCs. We can cut RCs even with
> blockers.
>
>
> On Tue, Mar 24, 2020 at 12:51 PM, Dongjoon Hyun
> wrote:
>
>> Hi, All.
>>
>> First of all,
Hi, Holden.
The following link looks outdated. It was the link used for Spark 2.4.5 RC2.
- https://repository.apache.org/content/repositories/orgapachespark-1340/
Instead, in the Apache repo, there are three candidates. Is 1343 the one we
vote on?
-
+1.
Thank you all.
I tested the following additionally with OpenJDK 11.0.8.
- PySpark UT on Python 3.7.7 with Pandas 0.23.2 / PyArrow 0.15.1.
- JDBC integration suite
- K8s integration suite (except SparkR test)
(Minikube: K8s Client v1.18.8, K8s Server v1.17.11)
For SparkR,
> So, I'm wondering if Spark 3.0.1 supports R 4.0 without any issue.
>
> I believe we now test SparkR at branch-3.0 with R 4.0 after
> https://github.com/apache/spark/commit/56ec5ddcac8233011c17fc7d120a284707f0f712
>
>
> On Wed, Sep 2, 2020 at 12:47 PM, Dongjoon Hyun wrote:
It's great. Thank you, Ruifeng!
Bests,
Dongjoon.
On Fri, Sep 11, 2020 at 1:54 AM 郑瑞峰 wrote:
> Hi all,
>
> We are happy to announce the availability of Spark 3.0.1!
> Spark 3.0.1 is a maintenance release containing stability fixes. This
> release is based on the branch-3.0 maintenance branch of
Thank you, Prashant!
Bests,
Dongjoon.
On Fri, Sep 11, 2020 at 7:02 PM Prashant Sharma
wrote:
> The vote passes. Thanks to all who helped with the release!
>
> (* = binding)
> +1:
> - Sean Owen *
> - Wenchen Fan *
> - Dongjoon Hyun *
> - Mridul *
>
> +0: None
>
> -1: None
>
>
>
>
+1
Bests,
Dongjoon.
On Mon, Sep 14, 2020 at 9:19 PM kalyan wrote:
> +1
>
> Will positively improve the performance and reliability of spark...
> Looking fwd to this..
>
> Regards
> Kalyan.
>
> On Tue, Sep 15, 2020, 9:26 AM Joseph Torres
> wrote:
>
>> +1
>>
>> On Mon, Sep 14, 2020 at 6:39 PM
Hi, Igor.
The first RC is scheduled for early December.
Please see the website for Apache Spark release cadence.
- https://spark.apache.org/versioning-policy.html
Date Event
Early Nov 2020 Code freeze. Release branch cut.
Mid Nov 2020 QA period. Focus on bug fixes, tests, stability and
4, 2020 at 10:53 AM Dongjoon Hyun
wrote:
> Thank you all.
>
> BTW, Xiao and Mridul, I'm wondering what date you have in your mind
> specifically.
>
> Usually, `Christmas and New Year season` doesn't give us much additional
> time.
>
> If you think so, could you make a
browse/SPARK-31800
>
> Note that the title shouldn't be "Unable to disable Kerberos when
> submitting jobs to Kubernetes" (based on the comments) and something
> more related with the spark.kubernetes.file.upload.path property
>
> Should we add it too?
>
> On Wed
le syntax:
>>https://issues.apache.org/jira/browse/SPARK-31257
>>- Bloom filter join: https://issues.apache.org/jira/browse/SPARK-32268
>>
>> Thanks,
>>
>> Xiao
>>
>>
>> On Sat, Oct 3, 2020 at 5:41 PM, Hyukjin Kwon wrote:
>>
>>> Nice summa
Hi, All.
Apache Spark 3.1.0 Release Window is adjusted like the following today.
Please check the latest information on the official website.
-
https://github.com/apache/spark-website/commit/0cd0bdc80503882b4737db7e77cc8f9d17ec12ca
- https://spark.apache.org/versioning-policy.html
Hi, Koert.
We know, welcome, and believe it. However, it's only the Scala community's
roadmap so far. It doesn't mean Apache Spark supports Scala 3 officially.
For example, Apache Spark 3.0.1 supports Scala 2.12.10 but not 2.12.12 due
to a Scala issue.
In the Apache Spark community, we had better focus
Hi, Denis.
We are currently moving toward Scala 3 together by focusing on completing
SPARK-25075 first as a stepping stone.
https://issues.apache.org/jira/browse/SPARK-25075
Build and test Spark against Scala 2.13
We haven't finished it yet. We need to have Jenkins jobs with Scala 2.13.
me (using older spark version to extract
> out of hive, then switch to newer spark version) so i am not too worried
> about this. just making sure i understand.
>
> thanks
>
> On Sat, Oct 3, 2020 at 8:17 PM Dongjoon Hyun
> wrote:
>
>> Hi, All.
>>
>> As of today,
tle confused about this. i assumed spark would no longer make a
> distribution with hive 1.x, but the hive-1.2 profile remains.
>
> yet i see the hive-1.2 profile has been removed from pom.xml?
>
> On Wed, Sep 23, 2020 at 6:58 PM Dongjoon Hyun
> wrote:
>
>> Hi, All.
nresolved bugs raised
> against 3.0.0, but conversely there were quite a few critical correctness
> fixes waiting to be released.
>
>
>
> Cheers,
>
> Jason.
>
>
>
> *From: *Takeshi Yamamuro
> *Date: *Wednesday, 15 July 2020 at 9:00 am
> *To: *Shivaram Venkataraman
Thank you!
Bests,
Dongjoon
On Mon, Sep 28, 2020 at 8:07 PM Dr. Kent Yao wrote:
> Thanks, Dongjoon,
>
>    I pinned two long-standing issues to the umbrella.
>
>
>
>https://issues.apache.org/jira/browse/SPARK-28895
>
>https://issues.apache.org/jira/browse/SPARK-28992
>
>
>
>This helps
Hi, Steve.
Sure, you can suggest, but I'm wondering how the suggested namespaces are
able to satisfy the existing visibility rules. Could you give us some
examples specifically?
> Can I suggest some common prefix for third-party-classes put into the
spark package tree, just to make clear that
Hi, All.
Since Apache Spark 3.0.0, Apache Hive 2.3.7 is the default
Hive execution library. The forked Hive 1.2.1 library is not
recommended because it's not maintained properly.
In Apache Spark 3.1, in December 2020, we are going to
remove it from our official distribution.
Hi, All.
K8s GA preparation is on the way like the following.
https://issues.apache.org/jira/browse/SPARK-33005
Apache Spark 3.1/3.2 is scheduled for December 2020
and mid of 2021 (TBD). If you hit K8s issues, please
file a JIRA issue. To give more visibility to your issue,
you can create
Hi, All.
As of today, the master branch (Apache Spark 3.1.0) has resolved
852+ JIRA issues, and 606+ of them are 3.1.0-only patches.
According to the 3.1.0 release window, branch-3.1 will be
created on November 1st and enters QA period.
Here are some notable updates I've been monitoring.
*Language*
01.
Hi, All.
Unfortunately, there is an ongoing discussion about the new decimal
correctness.
Although we fixed one correctness issue at master and backported it
partially to 3.0/2.4, it turns out that it needs more patches to be
complete.
Please see https://github.com/apache/spark/pull/29125 for
the priority of SPARK-31703 to `Blocker` for both Apache Spark
2.4.7 and 3.0.1.
Bests,
Dongjoon.
On Sat, Aug 8, 2020 at 6:10 AM Holden Karau wrote:
> I'm going to go ahead and vote -0 then based on that then.
>
> On Fri, Aug 7, 2020 at 11:36 PM Dongjoon Hyun
> wrote:
+1.
Thank you, Holden.
Bests,
Dongjoon.
On Thu, Jul 2, 2020 at 6:43 AM wuyi wrote:
> +1 for having this feature in Spark
>
>
>
> --
> Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/
>
> -
> To
Thank you, Hyukjin.
According to the Python community, Python 3.5 is also EOL at 2020-09-13
(only two months left).
- https://www.python.org/downloads/
So, targeting live Python versions at Apache Spark 3.1.0 (December 2020)
looks reasonable to me.
For old Python versions, we still have Apache
GA
>
> --
> *From:* Holden Karau
> *Sent:* Monday, June 29, 2020 9:33 AM
> *To:* Maxim Gekk
> *Cc:* Dongjoon Hyun; dev
> *Subject:* Re: Apache Spark 3.1 Feature Expectation (Dec. 2020)
>
> Should we also consider the shuffle service refactoring
Thank you always, Shane!
Bests,
Dongjoon.
On Thu, Jul 9, 2020 at 9:30 AM shane knapp ☠ wrote:
> this is happening now.
>
> On Wed, Jul 8, 2020 at 9:07 AM shane knapp ☠ wrote:
>
>> this will be happening tomorrow... today is Meeting Hell Day[tm].
>>
>> On Tue, Jul 7, 2020 at 1:59 PM shane
Welcome everyone! :D
Bests,
Dongjoon.
On Tue, Jul 14, 2020 at 11:21 AM Xiao Li wrote:
> Welcome, Dilip, Huaxin and Jungtaek!
>
> Xiao
>
> On Tue, Jul 14, 2020 at 11:02 AM Holden Karau
> wrote:
>
>> So excited to have our committer pool growing with these awesome folks,
>> welcome y'all!
>>
>>
Hi, Yi.
Could you explain why you think that is a blocker? For the given example
from the JIRA description,
spark.udf.register("key", udf((m: Map[String, String]) => m.keys.head.toInt))
Seq(Map("1" -> "one", "2" -> "two")).toDF("a").createOrReplaceTempView("t")
checkAnswer(sql("SELECT key(a)
Hi, Alex and Michel.
I removed the `Stale` label and reopened it for now. You may want to ping
the original author because the last update of that PR was one year ago and
it has many conflicts as of today.
Bests,
Dongjoon.
On Tue, Jun 30, 2020 at 10:56 AM Alex Scammon <
Hi, All.
Now, the AmpLab Jenkins farm is back online.
https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/
Also, many PRBuilder jobs were re-started 10 minutes ago.
Bests,
Dongjoon.
On Fri, Jul 3, 2020 at 4:43 AM Hyukjin Kwon wrote:
> Hi all and Shane,
>
> Is there
alizing.
>
> PS: There are two critical problems I've seen with the release (Spark UI
> is virtually unusable in some cases, and streaming issues). I will
> highlight them in the release notes and link to the JIRA tickets. But I
> think we should make 3.0.1 ASAP to follow up.
>
>
of Hive
>>> thriftserver.
>>>
>>> To reduce the risk, I would like to keep the current default version
>>> unchanged. When it becomes stable, we can change the default profile to
>>> Hadoop-3.2.
>>>
>>> Cheers,
>>>
>>> X
Thanks, Xiao, Sean, Nicholas.
To Xiao,
> it sounds like Hadoop 3.x is not as popular as Hadoop 2.7.
If you say so,
- Apache Hadoop 2.6.0 is the most popular one with 156 dependencies.
- Apache Spark 2.2.0 is the most popular one with 264 dependencies.
As we know, it doesn't make sense. Are we
To Xiao.
Why should Apache project releases be blocked by PyPI / CRAN? It's
completely optional, isn't it?
> let me repeat my opinion: the top priority is to provide two options
for PyPi distribution
IIRC, Apache Spark 3.0.0 failed to upload to CRAN, and this is not the first
incident. Apache
Hi, All.
After a short celebration of Apache Spark 3.0, I'd like to ask for the
community's opinion on Apache Spark 3.1 feature expectations.
First of all, Apache Spark 3.1 is scheduled for December 2020.
- https://spark.apache.org/versioning-policy.html
I'm expecting the following items:
1.
ia (binding)
> Jungtaek Lim
> Denny Lee
> Russell Spitzer
> Dongjoon Hyun (binding)
> DB Tsai (binding)
> Michael Armbrust (binding)
> Tom Graves (binding)
> Bryan Cutler
> Huaxin Gao
> Jiaxin Shan
> Xingbo Jiang
> Xiao Li (binding)
> Hyukjin Kwon (binding)
> Kent
as the default?
>
> How to explain this to the community? I would not change the default for
> consistency.
>
> Xiao
>
>
>
> On Tue, Jun 23, 2020 at 7:18 PM Dongjoon Hyun
> wrote:
>
>> Thanks. Uploading PySpark to PyPI is a simple manual step and our rele
> Please correct me if my concern is not valid.
>
> Xiao
>
>
> On Tue, Jun 23, 2020 at 12:04 AM Dongjoon Hyun
> wrote:
>
>> Hi, All.
>>
>> I bump up this thread again with the title "Use Hadoop-3.2 as a default
>> Hadoop profile in 3.1.0?"
>>
+1
Bests,
Dongjoon.
On Tue, Jun 23, 2020 at 1:19 PM Jungtaek Lim
wrote:
> +1 on a 3.0.1 soon.
>
> Probably it would be nice if some Scala experts can take a look at
> https://issues.apache.org/jira/browse/SPARK-32051 and include the fix
> into 3.0.1 if possible.
> Looks like APIs designed to
+1
Thanks,
Dongjoon.
On Mon, Jun 8, 2020 at 6:37 AM Russell Spitzer
wrote:
> +1 (non-binding) ran the new SCC DSV2 suite and all other tests, no issues
>
> On Sun, Jun 7, 2020 at 11:12 PM Yin Huai wrote:
>
>> Hello everyone,
>>
>> I am wondering if it makes more sense to not count Saturday
+1
Bests,
Dongjoon
On Wed, Jun 3, 2020 at 5:59 AM Tom Graves
wrote:
> +1
>
> Tom
>
> On Sunday, May 31, 2020, 06:47:09 PM CDT, Holden Karau <
> hol...@pigscanfly.ca> wrote:
>
>
> Please vote on releasing the following candidate as Apache Spark
> version 2.4.6.
>
> The vote is open until June
Thank you so much, Holden! :)
On Wed, Jun 10, 2020 at 6:59 PM Hyukjin Kwon wrote:
> Yay!
>
> On Thu, Jun 11, 2020 at 10:38 AM, Holden Karau wrote:
>
>> We are happy to announce the availability of Spark 2.4.6!
>>
>> Spark 2.4.6 is a maintenance release containing stability, correctness,
>> and
Thank you so much, Sean!
Bests,
Dongjoon.
On Fri, Jul 24, 2020 at 8:56 AM Sean Owen wrote:
> Status update - we should have Scala 2.13 compiling, with the
> exception of the REPL.
> Looks like 99% or so of tests pass too, but the remaining ones might
> be hard to debug. I haven't looked hard
Thank you, Hyukjin!
Bests,
Dongjoon.
On Mon, Jan 11, 2021 at 7:24 AM Hyukjin Kwon wrote:
> I had a response from the INFRA team and Sonatype. Just to share, the
> removal is possible as an exception, but it's best to go ahead for 3.1.1
> for safety as we all discussed.
> There are several
Thank you, Jacek, Sean, and Hyukjin.
The release is a human-driven process. Everyone can make mistakes.
For example, I released Apache Spark 2.2.3 with a missing pandoc, but we
didn't touch it because it's a community-blessed official version.
https://pypi.org/project/pyspark/2.2.3/
For
Before we discovered the pre-uploaded artifacts, both Jungtaek and Hyukjin
had already shared two blockers here.
IIUC, that implicitly meant RC1 failure at that time.
In addition to that, there are two correctness issues. So, I made up my
mind to cast -1 for this RC1 before joining this thread.
Yay! Thanks!
Bests,
Dongjoon
On Tue, Dec 1, 2020 at 5:31 PM Takeshi Yamamuro
wrote:
> Many thanks, guys!
> I've checked I can re-trigger Jenkins tests.
>
> Bests,
> Takeshi
>
> On Wed, Dec 2, 2020 at 9:55 AM shane knapp ☠ wrote:
>
>> https://amplab.cs.berkeley.edu/jenkins/
>>
>> i cleared the
etStripeStatistics back for backward compatibility
> ORC-669. Reduce breaking changes in ReaderImpl.java
>
> As of today, the snapshot release passed Apache Spark and Apache Iceberg
> UTs.
>
> https://github.com/dongjoon-hyun/spark/pull/41
> https://github.com/dongjoon-hyun/iceberg/pull/1
changes in ReaderImpl.java
As of today, the snapshot release passed Apache Spark and Apache Iceberg
UTs.
https://github.com/dongjoon-hyun/spark/pull/41
https://github.com/dongjoon-hyun/iceberg/pull/1
I'll start to roll 1.6.6-rc0. After the 1.6.6 release, 1.6.7 will focus on
Apache Hive.
Thanks,
Dongjoon.
Thank you so much, Hyukjin Kwon.
I made a PR for updating the `master` branch to 3.2.0-SNAPSHOT.
https://github.com/apache/spark/pull/30606
[SPARK-33662][BUILD] Setting version to 3.2.0-SNAPSHOT
Bests,
Dongjoon.
On Fri, Dec 4, 2020 at 7:05 AM Tom Graves
wrote:
> Can we update the
Thank you, Shane. :)
Bests,
Dongjoon.
On Mon, Nov 30, 2020 at 10:05 AM shane knapp ☠ wrote:
> hey all!
>
> the Great Jenkins Migration[tm] is well under way, and we will be
> sunsetting the old amp-jenkins-master server and moving to a new one.
>
> i've put jenkins in to quiet mode so that it