>>>>> >
>>>>> > The vote is open until April 17th 1AM (PST) and passes
>>>>> > if a majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>>>>> >
>>>>> > [ ] +1 Use ANSI SQL mode by default
>>>>> > [ ] -1 Do not use ANSI SQL mode by default because ...
>>>>> >
>>>>> > Thank you in advance.
>>>>> >
>>>>> > Dongjoon
>>>>> >
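For archive readers, a minimal sketch of the behavior difference the vote is about (assuming the `spark.sql.ansi.enabled` configuration used by existing Spark releases; exact error messages vary by version):

```sql
-- Legacy (non-ANSI) mode: an invalid cast silently returns NULL
SET spark.sql.ansi.enabled = false;
SELECT CAST('not-a-number' AS INT);   -- NULL

-- ANSI mode: the same cast raises a runtime error instead
SET spark.sql.ansi.enabled = true;
SELECT CAST('not-a-number' AS INT);   -- runtime error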
>>>>>
>>>>> -
>>>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>>>
>>>>>
--
Takuya UESHIN
ull/45053>
> SPIP doc
> <https://docs.google.com/document/d/1Pund40wGRuB72LX6L7cliMDVoXTPR-xx4IkPmMLaZXk/edit?usp=sharing>
>
> Please vote on the SPIP for the next 72 hours:
>
> [ ] +1: Accept the proposal as an official SPIP
> [ ] +0
> [ ] -1: I don’t think this is a good idea because …
>
> Thanks.
>
--
Takuya UESHIN
> +1.
>>>>>>>>>>
>>>>>>>>>> See https://youtu.be/yj7XlTB1Jvc?t=604 :-).
>>>>>>>>>>
>>>>>>>>>> On Thu, 6 Jul 2023 at 09:15, Allison Wang
>>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi all,
>>>>>>>>>>>
>>>>>>>>>>> I'd like to start the vote for SPIP: Python Data Source API.
>>>>>>>>>>>
>>>>>>>>>>> The high-level summary for the SPIP is that it aims to
>>>>>>>>>>> introduce a simple API in Python for Data Sources. The idea is to
>>>>>>>>>>> enable
>>>>>>>>>>> Python developers to create data sources without learning Scala or
>>>>>>>>>>> dealing
>>>>>>>>>>> with the complexities of the current data source APIs. This would
>>>>>>>>>>> make
>>>>>>>>>>> Spark more accessible to the wider Python developer community.
>>>>>>>>>>>
>>>>>>>>>>> References:
>>>>>>>>>>>
>>>>>>>>>>>- SPIP doc
>>>>>>>>>>>
>>>>>>>>>>> <https://docs.google.com/document/d/1oYrCKEKHzznljYfJO4kx5K_Npcgt1Slyfph3NEk7JRU/edit?usp=sharing>
>>>>>>>>>>>- JIRA ticket
>>>>>>>>>>><https://issues.apache.org/jira/browse/SPARK-44076>
>>>>>>>>>>>- Discussion thread
>>>>>>>>>>>
>>>>>>>>>>> <https://lists.apache.org/thread/w621zn14ho4rw61b0s139klnqh900s8y>
>>>>>>>>>>>
>>>>>>>>>>>
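To make the goal concrete, here is a rough pure-Python sketch of the kind of interface the SPIP proposes; the class names echo the SPIP doc, but everything below is illustrative, not the actual proposed API:

```python
class DataSourceReader:
    """Toy reader: yields rows for a data source (a real one would pull from storage)."""

    def __init__(self, rows):
        self._rows = rows

    def read(self):
        for row in self._rows:
            yield row


class InMemoryDataSource:
    """Toy data source a Python developer could implement without touching Scala."""

    def __init__(self, rows):
        self._rows = rows

    def schema(self):
        # Schema expressed as a simple DDL-style string.
        return "id INT, name STRING"

    def reader(self):
        return DataSourceReader(self._rows)


source = InMemoryDataSource([(1, "a"), (2, "b")])
rows = list(source.reader().read())
print(rows)  # [(1, 'a'), (2, 'b')]
```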
>>>>>>>>>>> Please vote on the SPIP for the next 72 hours:
>>>>>>>>>>>
>>>>>>>>>>> [ ] +1: Accept the proposal as an official SPIP
>>>>>>>>>>> [ ] +0
>>>>>>>>>>> [ ] -1: I don’t think this is a good idea because __.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Allison
>>>>>>>>>>>
>>>>>>>>>>
>>>>
>>>> --
>>>> Twitter: https://twitter.com/holdenkarau
>>>> Books (Learning Spark, High Performance Spark, etc.):
>>>> https://amzn.to/2MaRAG9 <https://amzn.to/2MaRAG9>
>>>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>>>
>>>
--
Takuya UESHIN
add three new PMC members. Join me in
>>>> welcoming them to their new roles!
>>>>
>>>> New PMC members: Huaxin Gao, Gengliang Wang and Maxim Gekk
>>>>
>>>> The Spark PMC
>>>>
>>>
--
Takuya UESHIN
13, Hyukjin Kwon wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> The Spark PMC recently added Xinrong Meng as a committer on the
>>>>>> project. Xinrong is a major contributor to PySpark, especially the
>>>>>> Pandas API on Spark. She has guided many new contributors enthusiastically.
>>>>>> Please
>>>>>> join me in welcoming Xinrong!
>>>>>>
>>>>>>
--
Takuya UESHIN
-1
I found a correctness issue in ArrayAggregate; the fix was merged after
the RC3 was cut.
- https://issues.apache.org/jira/browse/SPARK-39293
- https://github.com/apache/spark/pull/36674
Thanks.
On Tue, May 24, 2022 at 10:21 AM Maxim Gekk
wrote:
> Please vote on releasing the following
I'm afraid I'm also against the proposal so far.
What's wrong with going with "1. Functions" and using `transform`, which
allows chaining functions?
I was not sure what you meant by "manage the namespaces", though.
def with_price(df, factor: float = 2.0):
    return df.withColumn("price", df["price"] * factor)
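A self-contained sketch of the chaining pattern referred to above (`ToyFrame` is a hypothetical stand-in for a DataFrame; PySpark's real `DataFrame.transform` has the same shape):

```python
class ToyFrame:
    """Minimal stand-in for a DataFrame, just enough to show .transform chaining."""

    def __init__(self, data):
        self.data = dict(data)

    def withColumn(self, name, value):
        new = dict(self.data)
        new[name] = value
        return ToyFrame(new)

    def transform(self, func, *args, **kwargs):
        # Applies a function that takes and returns a frame, enabling chaining.
        return func(self, *args, **kwargs)


def with_price(df, factor=2.0):
    return df.withColumn("price", df.data["base"] * factor)


df = ToyFrame({"base": 10})
result = df.transform(with_price).transform(with_price, factor=3.0)
print(result.data["price"])  # 30.0
```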
AM Liang-Chi Hsieh
> >>> >> wrote:
> >>> >> +1 (non-binding)
> >>> >>
> >>> >> rxin wrote
> >>> >>> +1. Would open up a huge persona for Spark.
> >>> >>>
> >>> >>> On Fri, Mar 26 2021 at 11:30 AM, Bryan Cutler <
> >>> >>
> >>> >>> cutlerb@
> >>> >>
> >>> >>>> wrote:
> >>> >>>
> >>> >>>>
> >>> >>>> +1 (non-binding)
> >>> >>>>
> >>> >>>>
> >>> >>>> On Fri, Mar 26, 2021 at 9:49 AM Maciej <
> >>> >>
> >>> >>> mszymkiewicz@
> >>> >>
> >>> >>>> wrote:
> >>> >>>>
> >>> >>>>
> >>> >>>>> +1 (nonbinding)
> >>> >>
> >>> >> --
> >>> >> Sent from:
> >>> >> http://apache-spark-developers-list.1001551.n3.nabble.com/
> >>> >>
> >>> >>
> >>> >>
> >>> >> --
> >>> >>
> >>> >> ---
> >>> >> Takeshi Yamamuro
> >>>
> >>>
>
>
>
--
Takuya UESHIN
>>> - Jungtaek Lim
>>> - Dilip Biswal
>>>
>>> All three of them contributed to Spark 3.0 and we’re excited to have
>>> them join the project.
>>>
>>> Matei and the Spark PMC
>>>
>>>
--
Takuya UESHIN
Hi all,
This is the bi-weekly Apache Spark digest from the Databricks OSS team.
For each API/configuration/behavior change, there will be an *[API]* tag in
the title.
CORE
ng about what the cost will be:
>>>>>>>> >> >>
>>>>>>>> >> >> Usage - an API that is actively used in many different
>>>>>>>> places is always very costly to break. While it is hard to know usage
>>>>>>>> for sure, there are a bunch of ways that we can estimate:
>>>>>>>> >> >>
>>>>>>>> >> >> How long has the API been in Spark?
>>>>>>>> >> >>
>>>>>>>> >> >> Is the API common even for basic programs?
>>>>>>>> >> >>
>>>>>>>> >> >> How often do we see recent questions in JIRA or mailing lists?
>>>>>>>> >> >>
>>>>>>>> >> >> How often does it appear in StackOverflow or blogs?
>>>>>>>> >> >>
>>>>>>>> >> >> Behavior after the break - How will a program that works
>>>>>>>> today work after the break? The following are listed roughly in order
>>>>>>>> of increasing severity:
>>>>>>>> >> >>
>>>>>>>> >> >> Will there be a compiler or linker error?
>>>>>>>> >> >>
>>>>>>>> >> >> Will there be a runtime exception?
>>>>>>>> >> >>
>>>>>>>> >> >> Will that exception happen after significant processing has
>>>>>>>> been done?
>>>>>>>> >> >>
>>>>>>>> >> >> Will we silently return different answers? (very hard to
>>>>>>>> debug, might not even notice!)
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >> Cost of Maintaining an API
>>>>>>>> >> >>
>>>>>>>> >> >> Of course, the above does not mean that we will never break
>>>>>>>> any APIs. We must also consider the cost both to the project and to our
>>>>>>>> users of keeping the API in question.
>>>>>>>> >> >>
>>>>>>>> >> >> Project Costs - Every API we have needs to be tested and
>>>>>>>> needs to keep working as other parts of the project change. These
>>>>>>>> costs
>>>>>>>> are significantly exacerbated when external dependencies change (the
>>>>>>>> JVM,
>>>>>>>> Scala, etc). In some cases, while not completely technically
>>>>>>>> infeasible,
>>>>>>>> the cost of maintaining a particular API can become too high.
>>>>>>>> >> >>
>>>>>>>> >> >> User Costs - APIs also have a cognitive cost to users
>>>>>>>> learning Spark or trying to understand Spark programs. This cost
>>>>>>>> becomes
>>>>>>>> even higher when the API in question has confusing or undefined
>>>>>>>> semantics.
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >> Alternatives to Breaking an API
>>>>>>>> >> >>
>>>>>>>> >> >> In cases where there is a "Bad API", but where the cost of
>>>>>>>> removal is also high, there are alternatives that should be considered
>>>>>>>> that
>>>>>>>> do not hurt existing users but do address some of the maintenance
>>>>>>>> costs.
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >> Avoid Bad APIs - While this is a bit obvious, it is an
>>>>>>>> important point. Anytime we are adding a new interface to Spark we
>>>>>>>> should
>>>>>>>> consider that we might be stuck with this API forever. Think deeply
>>>>>>>> about
>>>>>>>> how new APIs relate to existing ones, as well as how you expect them to
>>>>>>>> evolve over time.
>>>>>>>> >> >>
>>>>>>>> >> >> Deprecation Warnings - All deprecation warnings should point
>>>>>>>> to a clear alternative and should never just say that an API is
>>>>>>>> deprecated.
>>>>>>>> >> >>
>>>>>>>> >> >> Updated Docs - Documentation should point to the "best"
>>>>>>>> recommended way of performing a given task. In the cases where we
>>>>>>>> maintain
>>>>>>>> legacy documentation, we should clearly point to newer APIs and
>>>>>>>> suggest to
>>>>>>>> users the "right" way.
>>>>>>>> >> >>
>>>>>>>> >> >> Community Work - Many people learn Spark by reading blogs and
>>>>>>>> other sites such as StackOverflow. However, many of these resources
>>>>>>>> are out
>>>>>>>> of date. Update them, to reduce the cost of eventually removing
>>>>>>>> deprecated
>>>>>>>> APIs.
>>>>>>>> >> >>
>>>>>>>> >> >>
>>>>>>>> >> >>
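The deprecation guidance above can be sketched in a few lines (Python shown for illustration; `old_sum` is a hypothetical API):

```python
import warnings


def old_sum(xs):
    # Good practice per the guidance: name the replacement explicitly,
    # never just say "deprecated".
    warnings.warn(
        "old_sum is deprecated and will be removed; use the builtin sum() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return sum(xs)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    total = old_sum([1, 2, 3])

print(total, caught[0].category.__name__)  # 6 DeprecationWarning
```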
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>> >>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> ---
>>>>> Takeshi Yamamuro
>>>>>
>>>>
>>
>>
>
--
Takuya UESHIN
http://twitter.com/ueshin
ache/spark/pull/20280
> >>> [2] https://www.python.org/dev/peps/pep-0468/
> >>> [3] https://issues.apache.org/jira/browse/SPARK-29748
>
>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>
>
>
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
;>>>>>>>
>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>
>>>>>>>>>>> While deprecation of Python 2 in 3.0.0 has been announced
>>>>>>>>>>> <https://spark.apache.org/new
s including ML, SQL, and
>> data sources, so it’s great to have them here. All the best,
>>
>> Matei and the Spark PMC
>>
>>
ak Yavuz
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Congrats Jose!
>>>>>>>>
>>>>>>>> On Tue, Jan 29, 2019 at 10:50 AM Xiao Li
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Congratulations!
>>>>>>>>>
>>>>>>>>> Xiao
>>>>>>>>>
>>>>>>>>> On Tue, Jan 29, 2019 at 10:48 AM, Shixiong Zhu wrote:
>>>>>>>>>
>>>>>>>>>> Hi all,
>>>>>>>>>>
>>>>>>>>>> The Apache Spark PMC recently added Jose Torres as a committer on
>>>>>>>>>> the project. Jose has been a major contributor to Structured
>>>>>>>>>> Streaming.
>>>>>>>>>> Please join me in welcoming him!
>>>>>>>>>>
>>>>>>>>>> Best Regards,
>>>>>>>>>>
>>>>>>>>>> Shixiong Zhu
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Shane Knapp
>>>>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>>>>> https://rise.cs.berkeley.edu
>>>>>>
>>>>>>
>>
>> --
>> ---
>> Takeshi Yamamuro
>>
>
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
rk SQL)
>> >
>> > Please join me in welcoming them!
>>
>>
>>
>>
>>
>>
>>
>>
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
expr, Literal(value))
> }
> }
>
>
> It does pattern matching to detect whether value is of type Column. If so,
> it will use the .expr of the column; otherwise it will work as it used to.
>
> Any suggestions or opinions on the proposal?
>
>
> Kind regards,
> Chongguang LIU
>
>
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
ributing across several areas of Spark for a while, focusing
>> especially
>> > on analyzer, optimizer in Spark SQL. Please join me in welcoming
>> Zhenhua!
>> >
>> > Wenchen
>>
>>
>>
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
to Kubernetes support and other parts of
>>>>> Spark)
>>>>> > - Seth Hendrickson (contributor to MLlib and PySpark)
>>>>> >
>>>>> > Please join me in welcoming Anirudh, Bryan, Cody, Erik, Matt and
>>>>> Seth as committers!
>>>>> >
>>>>> > Matei
>>>>> >
>>>>> >
>>>>>
>>>>>
>>>>>
>>>>
>>>
>
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
that impact compatibility should be
>> worked on immediately. Everything else please retarget to 2.3.1 or 2.4.0 as
>> appropriate.
>>
>> ===
>> Why is my bug not fixed?
>> ===
>>
>> In order to make timely releases, we will typically not hold the release
>> unless the bug in question is a regression from 2.2.0. That being said, if
>> there is something which is a regression from 2.2.0 and has not been
>> correctly targeted please ping me or a committer to help target the issue
>> (you can see the open issues listed as impacting Spark 2.3.0 at
>> https://s.apache.org/WmoI).
>>
>
>
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
la you
>>>>> can add the staging repository to your project's resolvers and test with
>>>>> the RC (make sure to clean up the artifact cache before/after so you don't
>>>>> end up building with an out-of-date RC going forward).
>>>>>
is less than I expected, I definitely support it. It should
>> speed up those cool changes.
>>
>>
>> On 14 Nov 2017 7:14 pm, "Takuya UESHIN" <ues...@happy-camper.st> wrote:
>>
>> Hi all,
>>
>> I'd like to raise a discussion about Pa
ame from pandas DataFrame with Arrow
- https://github.com/apache/spark/pull/19646
Any comments are welcome!
Thanks.
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
ently added Tejas Patil as a committer on the
>> project. Tejas has been contributing across several areas of Spark for
>> a while, focusing especially on scalability issues and SQL. Please
>> join me in welcoming Tejas!
>>
>> Matei
>>
t; > :
> >> > +1
> >> >
> >> > On Mon, Sep 11, 2017 at 5:47 PM, Sameer Agarwal
>
> > sameer@
>
> >
> >> wrote:
> >> > +1 (non-binding)
> >> >
> >> > On Thu, Sep 7, 2017 at 9:10 PM, Bryan Cut
orator name so that it
>> could also be usable for other efficient vectorized formats in the future?
>> Or do we anticipate the decorator to be format specific and will have more
>> in the future?
>>
>> --
>> *From:* Reynold Xin <r...@
e current effort and we will
> be adding those later?
>
> On Fri, Sep 1, 2017 at 8:01 AM Takuya UESHIN <ues...@happy-camper.st>
> wrote:
>
>> Hi all,
>>
>> We've been discussing supporting vectorized UDFs in Python, and we almost
>> got a consensus about the
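For archive readers, a toy illustration of the vectorized-UDF idea (the actual proposal batches values via Arrow and pandas, which this sketch deliberately omits):

```python
def plus_one_scalar(x):
    # Row-at-a-time UDF: one Python call (and serialization round trip) per value.
    return x + 1


def plus_one_vectorized(batch):
    # Vectorized UDF: one Python call per batch, amortizing the overhead.
    return [x + 1 for x in batch]


data = list(range(5))
scalar_result = [plus_one_scalar(x) for x in data]
vectorized_result = plus_one_vectorized(data)
print(scalar_result == vectorized_result)  # True
```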
forward and implement the SPIP.
+0: Don't really care.
-1: I don't think this is a good idea because of the following technical
reasons.
Thanks!
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
> >
>
>
>
>
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
ark PMC recently voted to add Hyukjin Kwon and Sameer Agarwal
>>>>> as committers. Join me in congratulating both of them and thanking them
>>>>> for
>>>>> their contributions to the project!
>>>>> >
>>>>> > Matei
Congrats!
>>
>> Kazuaki Ishizaki
>>
>>
>>
>> From:Reynold Xin <r...@databricks.com>
>> To:"dev@spark.apache.org" <dev@spark.apache.org>
>> Date:2017/02/14 04:18
>> Subject:welcoming Takuya Uesh
> If you are a Spark user, you can help us test this release by taking an
> existing Apache Spark workload and running on this candidate, then
> reporting any regressions.
>
>
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin
Thank you for your reply.
I've sent pull requests.
Thanks.
2014-06-05 3:16 GMT+09:00 Patrick Wendell pwend...@gmail.com:
It should be 1.1-SNAPSHOT. Feel free to submit a PR to clean up any
inconsistencies.
On Tue, Jun 3, 2014 at 8:33 PM, Takuya UESHIN ues...@happy-camper.st wrote:
Hi all
(d96794132e37cf57f8dd945b9d11f8adcfc30490):
- pom.xml: 1.0.1-SNAPSHOT
- SparkBuild.scala: 1.0.0
Should it be 1.0.1-SNAPSHOT?
Thanks.
--
Takuya UESHIN
Tokyo, Japan
http://twitter.com/ueshin