from:"Santosh Pingale"

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-01 Thread Santosh Pingale

Thanks Wenchen for starting this! How do we define "the user" for Spark? 1. End users: There are some users that use Spark as a service from a provider 2. Providers/Operators: There are some users that provide Spark as a service for their internal (on-prem setup with yarn/k8s)/external (Something

Re: [FYI] SPARK-47993: Drop Python 3.8

2024-04-25 Thread Santosh Pingale

+1 On Thu, Apr 25, 2024, 5:41 PM Dongjoon Hyun wrote: > FYI, there is a proposal to drop Python 3.8 because its EOL is October > 2024. > > https://github.com/apache/spark/pull/46228 > [SPARK-47993][PYTHON] Drop Python 3.8 > > Since it's still alive and there will be an overlap between the

Re: Re: [DISCUSS] Release Spark 3.5.1?

2024-02-04 Thread Santosh Pingale

+1 On Sun, Feb 4, 2024, 8:18 PM Xiao Li wrote: > +1 > > On Sun, Feb 4, 2024 at 6:07 AM beliefer wrote: > >> +1 >> >> >> >> On 2024-02-04 15:26:13, "Dongjoon Hyun" wrote: >> >> +1 >> >> On Sat, Feb 3, 2024 at 9:18 PM yangjie01 >> wrote: >> >>> +1 >>> >>> On 2024/2/4 13:13, "Kent

Spark 3.5.1

2024-01-30 Thread Santosh Pingale

Hey there, the Spark 3.5 branch has accumulated 199 commits, with quite a few bug fixes related to correctness. Are there any plans for releasing 3.5.1? Kind regards, Santosh

Re: Apache Spark 3.4.2 (?)

2023-11-06 Thread Santosh Pingale

Makes sense given the nature of those commits. On Mon, Nov 6, 2023, 7:52 PM Dongjoon Hyun wrote: > Hi, All. > > Apache Spark 3.4.1 tag was created on Jun 19th and `branch-3.4` has 103 > commits including important security and correctness patches like > SPARK-44251, SPARK-44805, and

Making spark plan UI interactive

2023-09-06 Thread Santosh Pingale

Hey community Spark UI with the plan visualisation is an excellent resource for finding out crucial information about how your application is doing and what parts of the execution can still be optimized to fulfill time/resource constraints. The graph in its current form is sufficient for simpler

Re: Volcano in spark distro

2023-08-23 Thread Santosh Pingale

> In any way, I'd like to say that the root cause of the difference is those scheduler designs instead of Apache Spark itself. For example, Apache YuniKorn doesn't force us to add a new dependency at all while Volcano did. This makes sense! > In these day, I prefer and invest more Apache

Volcano in spark distro

2023-08-22 Thread Santosh Pingale

Hey all, it would be useful to support Volcano in the Spark distro itself, just like YuniKorn. So I am wondering what the reason is behind the decision not to package it already.

Re: [DISCUSS] Show Python code examples first in Spark documentation

2023-02-23 Thread Santosh Pingale

before users take an action). > > > On Fri, 24 Feb 2023 at 15:35, Santosh Pingale > wrote: > >> Very interesting and user focused discussion, thanks for the proposal. >> >> Would it be better if we rather let users set the preference about the >> l

Re: [DISCUSS] Show Python code examples first in Spark documentation

2023-02-23 Thread Santosh Pingale

Very interesting and user-focused discussion, thanks for the proposal. Would it be better if we instead let users set their preference for the language they want to see first in the code examples? This preference can easily be stored on the browser side and used to decide the ordering. This is in line

Re: Pandas UDF cogroup.applyInPandas with multiple dataframes

2023-02-22 Thread Santosh Pingale

is. > > On Mon, Feb 6, 2023 at 5:29 AM Santosh Pingale > wrote: > Created a PR: https://github.com/apache/spark/pull/39902 > <https://github.com/apache/spark/pull/39902> > > >> On 24 Jan 2023, at 15:04, Santosh Pingale > <mailto:santosh.ping...@adyen.c

Re: Pandas UDF cogroup.applyInPandas with multiple dataframes

2023-02-06 Thread Santosh Pingale

Created a PR: https://github.com/apache/spark/pull/39902 <https://github.com/apache/spark/pull/39902> > On 24 Jan 2023, at 15:04, Santosh Pingale wrote: > > Hey all > > I have an interesting problem in hand. We have cases where we want to pass > multip

Pandas UDF cogroup.applyInPandas with multiple dataframes

2023-01-24 Thread Santosh Pingale

Hey all, I have an interesting problem at hand. We have cases where we want to pass multiple (20 to 30) data frames to the cogroup.applyInPandas function. RDD currently supports cogroup with up to 4 dataframes (ZippedPartitionsRDD4), whereas cogroup with pandas can handle only 2 dataframes (with


13 matches
