Re: [VOTE] Document and Feature Preview via GitHub Pages

2024-09-11 Thread Martin Grund
+1 On Wed, Sep 11, 2024 at 9:39 AM Kent Yao wrote: > Hi all, > > Following the discussion[1], I'd like to start the vote for 'Document and > Feature Preview via GitHub Pages' > > > Please vote for the next 72 hours (excluding next weekend): > > [ ] +1: Accept the proposal > [ ] +0 > [ ] -1: I

Re: [VOTE] Deprecate SparkR

2024-08-21 Thread Martin Grund
+1 On Wed, Aug 21, 2024 at 20:26 Xiangrui Meng wrote: > +1 > > On Wed, Aug 21, 2024, 10:24 AM Mridul Muralidharan > wrote: > >> +1 >> >> >> Regards, >> Mridul >> >> >> On Wed, Aug 21, 2024 at 11:46 AM Reynold Xin >> wrote: >> >>> +1 >>> >>> On Wed, Aug 21, 2024 at 6:42 PM Shivaram Venkataraman

Re: [VOTE] Using Github Issues for Spark-Connect-Go _only_ issues.

2024-08-15 Thread Martin Grund
; >>>>>>view my Linkedin profile >>>>>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> >>>>>> >>>>>> >>>>>> https://en.everybodywiki.com/Mich_Talebzadeh >>>>>> >>>

Re: [DISCUSS] Deprecating SparkR

2024-08-13 Thread Martin Grund
+1 On Tue, Aug 13, 2024 at 7:26 AM Ruifeng Zheng wrote: > +1 > > On Tue, Aug 13, 2024 at 1:08 PM Holden Karau > wrote: > >> +1 >> >> Are the sparklyr folks on this list? >> >> Twitter: https://twitter.com/holdenkarau >> Books (Learning Spark, High Performance Spark, etc.): >> https://amzn.to/2M

Re: Welcoming a new PMC member

2024-08-13 Thread Martin Grund
Congratulations! On Tue, Aug 13, 2024 at 9:37 AM Peter Toth wrote: > Congratulations! > > Mridul Muralidharan wrote (on Aug 13, 2024, > Tue, 8:46): > >> >> Congratulations Kent! >> >> Regards, >> Mridul >> >> On Mon, Aug 12, 2024 at 8:46 PM Dongjoon Hyun >> wrote: >> >>> Congratulati

[VOTE] Using Github Issues for Spark-Connect-Go _only_ issues.

2024-08-12 Thread Martin Grund
Hi Folks, following the discussion in the previous thread, I would like to open the vote on the following proposal: Enabling the use of GitHub Issues for issue tracking for the Spark Connect Go repository. This will support the tracking of issues that are exclusively relevant to this project. For

Re: [External Email] Re: [DISCUSS] Using Github Issues for Spark-Connect-Go _only_ issues.

2024-08-11 Thread Martin Grund
> > On Fri, Aug 9, 2024 at 11:56 PM bo yang wrote: > > +1 to start small as an experiment to see how people use GitHub issue... > > > > On Thu, Aug 8, 2024 at 11:54 PM Kent Yao wrote: > > +1 > > On 2024/08/08 23:24:32 Hyukjin Kwon wrote: > > SGTM > > >

Re: [DISCUSS] Using Github Issues for Spark-Connect-Go _only_ issues.

2024-08-08 Thread Martin Grund
_London> >> London, United Kingdom >> >> *Disclaimer:* The information provided is correct to the best of my >> knowledge but of course cannot be guaranteed . It is essential to note >> that, as with any advice, quote "one test result is worth one-thousand &

[DISCUSS] Using Github Issues for Spark-Connect-Go _only_ issues.

2024-08-07 Thread Martin Grund
Hi folks, I wanted to start a discussion for the following proposal: To make it easier for folks to contribute to the Spark Connect Go client, I was contemplating not requiring them to deal with two accounts (one for Jira and one for GitHub) but instead allowing GitHub Issues for bugs and issues that a

Re: [External Email] [VOTE] Differentiate Spark without Spark Connect from Spark Connect

2024-07-23 Thread Martin Grund
+1 On Tue, Jul 23, 2024 at 07:06 Dongjoon Hyun wrote: > +1 for the proposed definition. > > Thanks, > Dongjoon > > > On Tue, Jul 23, 2024 at 6:42 AM Xianjin YE wrote: > >> +1 (non-binding) >> >> On Jul 23, 2024, at 16:16, Jungtaek Lim >> wrote: >> >> +1 (non-binding) >> >> On Tue, Jul 23, 2024

Re: [DISCUSS] Differentiate Spark without Spark Connect from Spark Connect

2024-07-22 Thread Martin Grund
+1 for classic. It's simple, easy to understand, and it doesn't carry negative connotations the way "legacy" does, for example. On Sun, Jul 21, 2024 at 23:48 Wenchen Fan wrote: > Classic SGTM. > > On Mon, Jul 22, 2024 at 1:12 PM Jungtaek Lim > wrote: > >> I'd propose not to change the name of "Spark Connect

Re: [DISCUSS] Why do we remove RDD usage and RDD-backed code?

2024-07-12 Thread Martin Grund
Mridul, I really just wanted to understand the concern from Dongjoon. What you're pointing at is a slightly different concern. So what I see is the following: > [...] they can initialize a SparkContext and work with RDD api: The current PR uses a potentially optional value without checking that i

Re: [DISCUSS] Why do we remove RDD usage and RDD-backed code?

2024-07-12 Thread Martin Grund
I took a quick look at the PR and would like to understand your concern better about: > SparkSession is heavier than SparkContext It looks like the PR is using the active SparkSession, not creating a new one etc. I would highly appreciate it if you could help me understand this situation better.

Re: [VOTE] Allow GitHub Actions runs for contributors' PRs without approvals in apache/spark-connect-go

2024-07-04 Thread Martin Grund
+1 (non-binding) On Thu, Jul 4, 2024 at 7:15 PM Holden Karau wrote: > +1 > > Although given it's a US holiday maybe keep the vote open for an extra day? > > Twitter: https://twitter.com/holdenkarau > Books (Learning Spark, High Performance Spark, etc.): > https://amzn.to/2MaRAG9

Re: [DISCUSS] Allow GitHub Actions runs for contributors' PRs without approvals in apache/spark-connect-go

2024-07-03 Thread Martin Grund
Absolutely, we should do that. I thought that the default rule was inclusive already, so that once folks have their first contribution it would automatically allow kicking off the workflows. On Thu, Jul 4, 2024 at 04:20 Matthew Powers wrote: > Yea, this would be great. > > spark-connect-go is still

Re: [External Email] Re: [VOTE] Move Spark Connect server to builtin package (Client API layer stays external)

2024-07-02 Thread Martin Grund
+1 (non-binding) On Wed, Jul 3, 2024 at 07:25 Holden Karau wrote: > +1 > > Twitter: https://twitter.com/holdenkarau > Books (Learning Spark, High Performance Spark, etc.): > https://amzn.to/2MaRAG9 > YouTube Live Streams: https://www.youtube.com/user/holdenkarau > > >

Re: [External Email] [DISCUSS] Move Spark Connect server to builtin package (Client API layer stays external)

2024-07-02 Thread Martin Grund
+1 On Tue, Jul 2, 2024 at 7:19 AM yangjie01 wrote: > I have manually attempted to only modify the `assembly/pom.xml` and > examined the results of executing `dev/make-distribution.sh --tgz`. The > `spark-connect_2.13-4.0.0-SNAPSHOT.jar` is indeed included in the jars > directory. However, if rea

Re: Write Spark Connection client application in Go

2023-09-13 Thread Martin Grund
This is absolutely awesome! Thank you so much for dedicating your time to this project! On Wed, Sep 13, 2023 at 6:04 AM Holden Karau wrote: > That’s so cool! Great work y’all :) > > On Tue, Sep 12, 2023 at 8:14 PM bo yang wrote: > >> Hi Spark Friends, >> >> Anyone interested in using Golang to

Re: [VOTE] Release Apache Spark 3.5.0 (RC3)

2023-08-29 Thread Martin Grund
+1 (non-binding) Tested Spark Connect fully isolated and with the PySpark build. Also tested some of the new PySpark ML Connect features. On Tue 29. Aug 2023 at 18:25 Yuanjian Li wrote: > Please vote on releasing the following candidate(RC3) as Apache Spark > version 3.5.0. > > The vote is open u

Re: Spark Connect: API mismatch in SparkSession#execute

2023-08-28 Thread Martin Grund
Hi Stefan, There are some current limitations around how protobuf is embedded in Spark Connect. One of the challenges there is that for compatibility reasons we currently shade protobuf, which in turn shades the `protobuf.GeneratedMessage` class. The way to work around this is to shade the protobuf

Re: [VOTE][SPIP] Python Data Source API

2023-07-07 Thread Martin Grund
+1 (non-binding) On Fri, Jul 7, 2023 at 12:05 AM Denny Lee wrote: > +1 (non-binding) > > On Fri, Jul 7, 2023 at 00:50 Maciej wrote: > >> +0 >> >> Best regards, >> Maciej Szymkiewicz >> >> Web: https://zero323.net >> PGP: A30CEF0C31A501EC >> >> On 7/6/23 17:41, Xiao Li wrote: >> >> +1 >> >> Xiao

Re: [DISCUSS] SPIP: Python Data Source API

2023-06-24 Thread Martin Grund
Hey, I would like to express my strong support for Python Data Sources even though they might not be immediately as powerful as Scala-based data sources. One element that is easily lost in this discussion is how much faster the iteration speed is with Python compared to Scala. Due to the dynamic n

Re: [CONNECT] New Clients for Go and Rust

2023-06-01 Thread Martin Grund
> > > On 5/30/23 11:50, Martin Grund wrote: > > I think it makes sense to split this discussion into two pieces. On > > the contribution side, my personal perspective is that these new > clients > are explicitly marked as experimental and unsupported until > we deem th

Re: [CONNECT] New Clients for Go and Rust

2023-06-01 Thread Martin Grund
ark/pull/41036)! > > Great to see we have a new repo for Spark Golang Connect client. Thanks > Hyukjin! > I am thinking to migrate my PR to this new repo. Would like to hear any > feedback or suggestion before I make the new PR :) > > Thanks, > Bo > > > > On

Re: [CONNECT] New Clients for Go and Rust

2023-05-30 Thread Martin Grund
EADME file that this is an experimental client. Looking forward to all your contributions! On Tue, May 30, 2023 at 11:50 AM Martin Grund wrote: > I think it makes sense to split this discussion into two pieces. On the > contribution side, my personal perspective is that these new

Re: [CONNECT] New Clients for Go and Rust

2023-05-30 Thread Martin Grund
> Also, an elephant in the room is the future of the current API in Spark 4 > and onwards. As useful as connect is, it is not exactly a replacement for > many existing deployments. Furthermore, it doesn't make extending Spark > much easier and the current ecosystem is, subjectively sp

Re: [CONNECT] New Clients for Go and Rust

2023-05-25 Thread Martin Grund
>> > >> > >> > >> 1. Different repository can maintain independent versions, different > release times, and faster bug fix releases. > >> > >> > >> > >> 2. Different languages have different build tools. Putting them in one > repository w

[CONNECT] New Clients for Go and Rust

2023-05-19 Thread Martin Grund
Hi folks, When Bo (thanks for the time and contribution) started the work on https://github.com/apache/spark/pull/41036 he started the Go client directly in the Spark repository. In the meantime, I was approached by other engineers who are willing to contribute to working on a Rust client for Spar

Re: Enforcing scalafmt on Spark Connect - connector/connect

2022-10-14 Thread Martin Grund
we now do this in PySpark, and it's >> pretty nice that you can just forget about formatting it manually by >> yourself. >> >> On Fri, 14 Oct 2022 at 16:37, Martin Grund >> wrote: >> >>> Hi folks, >>> >>> I'm reaching out t

Enforcing scalafmt on Spark Connect - connector/connect

2022-10-14 Thread Martin Grund
Hi folks, I'm reaching out to gather input and consensus on the following proposal: Since Spark Connect is effectively new code, I would like to enforce scalafmt explicitly *only* on this module by adding a check in `dev/lint-scala` that checks if there is a diff after running ./build/mvn -

Re: [VOTE][RESULT] SPIP: Spark Connect

2022-06-16 Thread Martin Grund
Thanks everyone for your votes and thanks Herman for being the shepherd. On Fri 17. Jun 2022 at 02:23 Hyukjin Kwon wrote: > Awesome, I am excited to see this in Apache Spark. > > On Fri, 17 Jun 2022 at 08:37, Herman van Hovell > wrote: > >> The vote passes with 17 +1s (10 binding +1s). >> +1: >

Re: [DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-07 Thread Martin Grund
On Tue, Jun 7, 2022 at 3:54 PM Steve Loughran wrote: > > > On Fri, 3 Jun 2022 at 18:46, Martin Grund > wrote: > >> Hi Everyone, >> >> We would like to start a discussion on the "Spark Connect" proposal. >> Please find the links below: >>

Re: [DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-06 Thread Martin Grund
es the > API to various databases, for example Google BigQuery is very efficient. I > am not sure what this proposal is trying to address? > > HTH > > On Fri, 3 Jun 2022 at 18:46, Martin Grund > wrote: > >> Hi Everyone, >> >> We would l

Re: [DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-04 Thread Martin Grund
be supported in this? > > On Fri, Jun 3, 2022 at 1:52 PM Martin Grund > wrote: > >> Hi Everyone, >> >> We would like to start a discussion on the "Spark Connect" proposal. >> Please find the links below: >> >> *JIRA* - https://issues.apa

[DISCUSS] SPIP: Spark Connect - A client and server interface for Apache Spark.

2022-06-03 Thread Martin Grund
Hi Everyone, We would like to start a discussion on the "Spark Connect" proposal. Please find the links below: *JIRA* - https://issues.apache.org/jira/browse/SPARK-39375 *SPIP Document* - https://docs.google.com/document/d/1Mnl6jmGszixLW4KcJU5j9IgpG9-UabS0dcM6PM2XGDc/edit#heading=h.wmsrrfealhrj