Re: Write Spark Connection client application in Go

2023-09-12 Thread Holden Karau
That’s so cool! Great work y’all :)

On Tue, Sep 12, 2023 at 8:14 PM bo yang  wrote:

> Hi Spark Friends,
>
> Is anyone interested in using Go to write Spark applications? We created a
> Spark Connect Go client library and would love to hear feedback and thoughts
> from the community.
>
> Please see the quick start guide for how to use it. The following is a very
> short Spark Connect application in Go:
>
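> // Assumes the Spark Connect Go client's sql package is imported.
> func main() {
>   // Connect to a Spark Connect server; errors are ignored for brevity.
>   spark, _ := sql.SparkSession.Builder.Remote("sc://localhost:15002").Build()
>   defer spark.Stop()
>
>   // Build a small two-row DataFrame from SQL.
>   df, _ := spark.Sql("select 'apple' as word, 123 as count union all select 'orange' as word, 456 as count")
>   df.Show(100, false)
>   df.Collect()
>
>   // Write the DataFrame out as Parquet, replacing any existing output.
>   df.Write().Mode("overwrite").
>     Format("parquet").
>     Save("file:///tmp/spark-connect-write-example-output.parquet")
>
>   // Read the Parquet output back in.
>   df, _ = spark.Read().Format("parquet").
>     Load("file:///tmp/spark-connect-write-example-output.parquet")
>   df.Show(100, false)
>
>   // Register a temp view and query it with SQL.
>   df.CreateTempView("view1", true, false)
>   df, _ = spark.Sql("select count, word from view1 order by count")
> }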
>
>
> Many thanks to Martin, Hyukjin, Ruifeng and Denny for creating and working
> together on this repo! Welcome more people to contribute :)
>
> Best,
> Bo
>
>


unsubscribe

2023-09-12 Thread 杨军
unsubscribe

Write Spark Connection client application in Go

2023-09-12 Thread bo yang
Hi Spark Friends,

Is anyone interested in using Go to write Spark applications? We created a
Spark Connect Go client library and would love to hear feedback and thoughts
from the community.

Please see the quick start guide for how to use it. The following is a very
short Spark Connect application in Go:

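// Assumes the Spark Connect Go client's sql package is imported.
func main() {
  // Connect to a Spark Connect server; errors are ignored for brevity.
  spark, _ := sql.SparkSession.Builder.Remote("sc://localhost:15002").Build()
  defer spark.Stop()

  // Build a small two-row DataFrame from SQL.
  df, _ := spark.Sql("select 'apple' as word, 123 as count union all select 'orange' as word, 456 as count")
  df.Show(100, false)
  df.Collect()

  // Write the DataFrame out as Parquet, replacing any existing output.
  df.Write().Mode("overwrite").
    Format("parquet").
    Save("file:///tmp/spark-connect-write-example-output.parquet")

  // Read the Parquet output back in.
  df, _ = spark.Read().Format("parquet").
    Load("file:///tmp/spark-connect-write-example-output.parquet")
  df.Show(100, false)

  // Register a temp view and query it with SQL.
  df.CreateTempView("view1", true, false)
  df, _ = spark.Sql("select count, word from view1 order by count")
}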


Many thanks to Martin, Hyukjin, Ruifeng and Denny for creating and working
together on this repo! Welcome more people to contribute :)

Best,
Bo
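
A small aside on the "sc://localhost:15002" address used above: the following
is a minimal sketch, using only the Go standard library, of how such an address
splits into host and port. parseRemote is a hypothetical helper written for
illustration; it is not part of the Spark Connect Go client, and the client's
own parsing may well differ.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseRemote is a hypothetical helper (not part of the Spark Connect Go
// client) that splits an sc:// address into its host and port.
func parseRemote(remote string) (host, port string, err error) {
	u, err := url.Parse(remote)
	if err != nil {
		return "", "", err
	}
	// Reject anything that is not an sc:// address.
	if u.Scheme != "sc" {
		return "", "", fmt.Errorf("unexpected scheme %q, want \"sc\"", u.Scheme)
	}
	return u.Hostname(), u.Port(), nil
}

func main() {
	host, port, err := parseRemote("sc://localhost:15002")
	if err != nil {
		fmt.Println("invalid remote:", err)
		return
	}
	fmt.Println(host, port)
}
```

Checking the address up front like this gives a clearer error than a failed
connection attempt, which matters more once the Build() error is no longer
ignored.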


Re: [VOTE] Release Apache Spark 3.5.0 (RC5)

2023-09-12 Thread XiDuo You
+1 (non-binding)

On Tue, Sep 12, 2023 at 3:14 PM Jungtaek Lim wrote:
>
> +1 (non-binding)
>
> Thanks for driving this release and the patience on multiple RCs!
>
> On Tue, Sep 12, 2023 at 10:00 AM Yuanjian Li  wrote:
>>
>> +1 (non-binding)
>>
>> On Mon, Sep 11, 2023 at 9:36 AM Yuanjian Li wrote:
>>>
>>> @Peter Toth I've looked into the details of this issue, and it appears that 
>>> it's neither a regression in version 3.5.0 nor a correctness issue. It's a 
>>> bug related to a new feature. I think we can fix this in 3.5.1 and list it 
>>> as a known issue of the Scala client of Spark Connect in 3.5.0.
>>>
>>> On Sun, Sep 10, 2023 at 4:12 AM Mridul Muralidharan wrote:


 +1

 Signatures, digests, etc check out fine.
 Checked out tag and build/tested with -Phive -Pyarn -Pmesos -Pkubernetes

 Regards,
 Mridul

 On Sat, Sep 9, 2023 at 10:02 AM Yuanjian Li  wrote:
>
> Please vote on releasing the following candidate(RC5) as Apache Spark 
> version 3.5.0.
>
>
> The vote is open until 11:59pm Pacific time Sep 11th and passes if a 
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
>
> [ ] +1 Release this package as Apache Spark 3.5.0
>
> [ ] -1 Do not release this package because ...
>
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
>
> The tag to be voted on is v3.5.0-rc5 (commit 
> ce5ddad990373636e94071e7cef2f31021add07b):
>
> https://github.com/apache/spark/tree/v3.5.0-rc5
>
>
> The release files, including signatures, digests, etc. can be found at:
>
> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc5-bin/
>
>
> Signatures used for Spark RCs can be found in this file:
>
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
>
> The staging repository for this release can be found at:
>
> https://repository.apache.org/content/repositories/orgapachespark-1449
>
>
> The documentation corresponding to this release can be found at:
>
> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc5-docs/
>
>
> The list of bug fixes going into 3.5.0 can be found at the following URL:
>
> https://issues.apache.org/jira/projects/SPARK/versions/12352848
>
>
> This release is using the release script of the tag v3.5.0-rc5.
>
>
>
> FAQ
>
> =========================
> How can I help test this release?
> =========================
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running it on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===========================================
> What should happen to JIRA tickets still targeting 3.5.0?
> ===========================================
> The current list of open tickets targeted at 3.5.0 can be found at:
> https://issues.apache.org/jira/projects/SPARK and search for "Target
> Version/s" = 3.5.0
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==================
> But my bug isn't fixed?
> ==================
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted, please ping me or a committer to
> help target the issue.
>
>
> Thanks,
>
> Yuanjian Li

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



[VOTE][RESULT] Release Apache Spark 3.5.0 (RC5)

2023-09-12 Thread Yuanjian Li
The vote passes with 13 +1s (8 binding +1s).
Thank you all who helped with the release!

(* = binding)
+1:
- Mridul Muralidharan (*)
- Yuanjian Li
- Xiao Li (*)
- Gengliang Wang (*)
- Hyukjin Kwon (*)
- Ruifeng Zheng (*)
- Jungtaek Lim
- Wenchen Fan (*)
- Jia Fan
- Jie Yang
- Yuming Wang (*)
- Kent Yao
- Dongjoon Hyun (*)

+0: None

-1: None
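
The counts above can be re-derived from the list. The sketch below is only an
illustration in Go: the names and binding flags are copied from this message,
while the passes function encodes one reading of the rule in the vote email
(at least 3 binding +1s, and more binding +1s than binding -1s), which is an
assumption rather than an official definition.

```go
package main

import "fmt"

// vote records one +1 and whether it is binding (cast by a PMC member).
type vote struct {
	name    string
	binding bool
}

// votes mirrors the +1 list in the result email; (*) entries are binding.
var votes = []vote{
	{"Mridul Muralidharan", true},
	{"Yuanjian Li", false},
	{"Xiao Li", true},
	{"Gengliang Wang", true},
	{"Hyukjin Kwon", true},
	{"Ruifeng Zheng", true},
	{"Jungtaek Lim", false},
	{"Wenchen Fan", true},
	{"Jia Fan", false},
	{"Jie Yang", false},
	{"Yuming Wang", true},
	{"Kent Yao", false},
	{"Dongjoon Hyun", true},
}

// tally returns the total number of +1 votes and how many are binding.
func tally(vs []vote) (total, binding int) {
	for _, v := range vs {
		total++
		if v.binding {
			binding++
		}
	}
	return total, binding
}

// passes encodes an assumed reading of the rule: at least 3 binding +1s
// and more binding +1s than binding -1s.
func passes(bindingPlus, bindingMinus int) bool {
	return bindingPlus >= 3 && bindingPlus > bindingMinus
}

func main() {
	total, binding := tally(votes)
	// There were no -1 votes, so bindingMinus is 0.
	fmt.Printf("%d +1s (%d binding), passes: %t\n", total, binding, passes(binding, 0))
}
```

Run as written, this prints "13 +1s (8 binding), passes: true", matching the
summary line at the top of the result.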


Re: [VOTE] Release Apache Spark 3.5.0 (RC5)

2023-09-12 Thread Dongjoon Hyun
+1

Dongjoon.

On 2023/09/12 03:38:37 Kent Yao wrote:
> +1 (non-binding), great work!
> 
> Kent Yao
> 
> On Tue, Sep 12, 2023 at 11:32 AM Yuming Wang wrote:
> >
> > +1.
> >
> > On Tue, Sep 12, 2023 at 10:57 AM yangjie01 wrote:
> >>
> >> +1
> >>
> >>
> >>
> >> From: Jia Fan
> >> Date: Tuesday, September 12, 2023 at 10:08
> >> To: Ruifeng Zheng
> >> Cc: Hyukjin Kwon, Xiao Li, Mridul Muralidharan, Peter Toth,
> >> Spark dev list, Yuanjian Li
> >> Subject: Re: [VOTE] Release Apache Spark 3.5.0 (RC5)
> >>
> >>
> >>
> >> +1
> >>
> >>
> >>
> >> On Tue, Sep 12, 2023 at 8:46 AM Ruifeng Zheng wrote:
> >>
> >> +1
> >>
> >>
> >>
> >> On Tue, Sep 12, 2023 at 7:24 AM Hyukjin Kwon  wrote:
> >>
> >> +1
> >>
> >>
> >>
> >> On Tue, Sep 12, 2023 at 7:05 AM Xiao Li  wrote:
> >>
> >> +1
> >>
> >>
> >>
> >> Xiao
> >>
> >>
> >>
> >> On Mon, Sep 11, 2023 at 10:53 AM Yuanjian Li wrote:
> >>
> >> @Peter Toth I've looked into the details of this issue, and it appears 
> >> that it's neither a regression in version 3.5.0 nor a correctness issue. 
> >> It's a bug related to a new feature. I think we can fix this in 3.5.1 and 
> >> list it as a known issue of the Scala client of Spark Connect in 3.5.0.
> >>
> >> On Sun, Sep 10, 2023 at 4:12 AM Mridul Muralidharan wrote:
> >>
> >>
> >>
> >> +1
> >>
> >>
> >>
> >> Signatures, digests, etc check out fine.
> >>
> >> Checked out tag and build/tested with -Phive -Pyarn -Pmesos -Pkubernetes
> >>
> >>
> >>
> >> Regards,
> >>
> >> Mridul
> >>
> >>
> >>
> >> On Sat, Sep 9, 2023 at 10:02 AM Yuanjian Li  wrote:
> >>
> >> Please vote on releasing the following candidate(RC5) as Apache Spark 
> >> version 3.5.0.
> >>
> >>
> >>
> >> The vote is open until 11:59pm Pacific time Sep 11th and passes if a 
> >> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
> >>
> >>
> >>
> >> [ ] +1 Release this package as Apache Spark 3.5.0
> >>
> >> [ ] -1 Do not release this package because ...
> >>
> >>
> >>
> >> To learn more about Apache Spark, please see http://spark.apache.org/
> >>
> >>
> >>
> >> The tag to be voted on is v3.5.0-rc5 (commit 
> >> ce5ddad990373636e94071e7cef2f31021add07b):
> >>
> >> https://github.com/apache/spark/tree/v3.5.0-rc5
> >>
> >>
> >>
> >> The release files, including signatures, digests, etc. can be found at:
> >>
> >> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc5-bin/
> >>
> >>
> >>
> >> Signatures used for Spark RCs can be found in this file:
> >>
> >> https://dist.apache.org/repos/dist/dev/spark/KEYS
> >>
> >>
> >>
> >> The staging repository for this release can be found at:
> >>
> >> https://repository.apache.org/content/repositories/orgapachespark-1449
> >>
> >>
> >>
> >> The documentation corresponding to this release can be found at:
> >>
> >> https://dist.apache.org/repos/dist/dev/spark/v3.5.0-rc5-docs/
> >>
> >>
> >>
> >> The list of bug fixes going into 3.5.0 can be found at the following URL:
> >>
> >> https://issues.apache.org/jira/projects/SPARK/versions/12352848
> >>
> >>
> >>
> >> This release is using the release script of the tag v3.5.0-rc5.
> >>
> >>
> >>
> >> FAQ
> >>
> >> =========================
> >> How can I help test this release?
> >> =========================
> >> If you are a Spark user, you can help us test this release by taking
> >> an existing Spark workload and running it on this release candidate, then
> >> reporting any regressions.
> >>
> >> If you're working in PySpark, you can set up a virtual env, install
> >> the current RC, and see if anything important breaks. In Java/Scala,
> >> you can add the staging repository to your project's resolvers and test
> >> with the RC (make sure to clean up the artifact cache before/after so
> >> you don't end up building with an out-of-date RC going forward).
> >>
> >> ===========================================
> >> What should happen to JIRA tickets still targeting 3.5.0?
> >> ===========================================
> >> The current list of open tickets targeted at 3.5.0 can be found at:
> >> https://issues.apache.org/jira/projects/SPARK and search for "Target
> >> Version/s" = 3.5.0
> >>
> >> Committers should look at those and triage. Extremely important bug
> >> fixes, documentation, and API tweaks that impact compatibility should
> >> be worked on immediately. Everything else please retarget to an
> >> appropriate release.
> >>
> >> ==================
> >> But my bug isn't fixed?
> >> ==================
> >> In order to make timely releases, we will typically not hold the
> >> release unless the bug in question is a regression from the previous
> >> release. That being said, if there is something which is a regression
> >> that has not been correctly targeted, please ping me or a committer to
> >> help target the issue.
> >>
> >>
> >>
> >> Thanks,
> >>
> >> Yuanjian Li