Re: [DISCUSS] clarify the definition of behavior changes

2024-05-02 Thread Will Raschkowski
To add some user perspective, I wanted to share our experience from automatically upgrading tens of thousands of jobs from Spark 2 to 3 at Palantir: We didn't mind "loud" changes that threw exceptions. We have some infra to try running jobs with Spark 3 and fall back to Spark 2 if there's an

Re: Plans for built-in v2 data sources in Spark 4

2023-09-20 Thread Will Raschkowski
ust my understanding – curious if I’m thinking about this correctly). Anyway, thank you for the pointer. From: Dongjoon Hyun Date: Friday, 15 September 2023 at 05:36 To: Will Raschkowski Cc: dev@spark.apache.org Subject: Re: Plans for built-in v2 data sources in Spark 4

Re: Plans for built-in v2 data sources in Spark 4

2023-09-20 Thread Will Raschkowski
closer to supporting bucketing and partitioning in v2 and then defaulting to v2.

Plans for built-in v2 data sources in Spark 4

2023-09-13 Thread Will Raschkowski
Hey everyone, I was wondering what the plans are for Spark's built-in v2 file data sources in Spark 4. Concretely, is the plan for Spark 4 to continue defaulting to the built-in v1 data sources? And if yes, what are the blockers for defaulting to v2? I see, just as an example, that writing

Re: Bridging gap between Spark UI and Code

2021-05-24 Thread Will Raschkowski
This would be great. At least for logical nodes, would it be possible to re-use the existing Utils.getCallSite to populate a field when nodes are created? I suppose most value would come