[DISCUSS] Spark - How to improve our release processes

2024-05-09 Thread Nimrod Ofek
Following the conversation started with the Spark 4.0.0 release, this is a thread to discuss improvements to our release processes. I'll start by raising some questions that probably should have answers to start the discussion: 1. What is currently running in GitHub Actions? 2. Who currently

Re: [DISCUSS] Spark 4.0.0 release

2024-05-08 Thread Nimrod Ofek
Twitter: https://twitter.com/holdenkarau > Books (Learning Spark, High Performance Spark, etc.): > https://amzn.to/2MaRAG9 > YouTube Live Streams: https://www.youtube.com/user/holdenkarau > > > On Tue, May 7, 2024 at 9:43 PM Nimrod Ofek wrote: > >

Re: [DISCUSS] Spark 4.0.0 release

2024-05-07 Thread Nimrod Ofek
https://amzn.to/2MaRAG9 > YouTube Live Streams: https://www.youtube.com/user/holdenkarau > > > On Tue, May 7, 2024 at 10:55 AM Nimrod Ofek wrote: > >> Hi, >> >> Sorry for the novice question, Wenchen - the release is done manually >> fro

Re: [DISCUSS] Spark 4.0.0 release

2024-05-07 Thread Nimrod Ofek
Hi, Sorry for the novice question, Wenchen - is the release done manually from a laptop? Not using a CI/CD process on a build server? Thanks, Nimrod On Tue, May 7, 2024 at 8:50 PM Wenchen Fan wrote: > UPDATE: > > Unfortunately, it took me quite some time to set up my laptop and get it > ready

Re: [DISCUSS] clarify the definition of behavior changes

2024-05-02 Thread Nimrod Ofek
Hi Erik and Wenchen, I think that a good practice with a public API, and with an internal API that has a big impact and a lot of usage, is to ease in changes by giving new parameters defaults that keep the former behaviour, in a method with the previous signature and a deprecation notice, and

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread Nimrod Ofek
+1 (non-binding) P.S. How do I become binding? Thanks, Nimrod On Tue, Apr 30, 2024 at 10:53 AM Ye Xianjin wrote: > +1 > Sent from my iPhone > > On Apr 30, 2024, at 3:23 PM, DB Tsai wrote: > > +1 > > On Apr 29, 2024, at 8:01 PM, Wenchen Fan wrote: > > To add more color: > > Spark data

Re: [DISCUSS] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-25 Thread Nimrod Ofek
>>> >>> view my LinkedIn profile >>> <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/> >>> >>> >>> https://en.everybodywiki.com/Mich_Talebzadeh >>> >>> >>> >>> *Disclaimer:* The informat

Re: [DISCUSS] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-25 Thread Nimrod Ofek
e, quote "one test result is worth one-thousand > expert opinions (Wernher von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)". > > > On Thu, 25 Apr 2024 at 15:39, Nimrod Ofek wrote: > >> Yes

Re: [DISCUSS] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-25 Thread Nimrod Ofek
ith any advice, quote "one test result is worth one-thousand > expert opinions (Wernher von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)". > > > On Thu, 25 Apr 2024 at 14:38, Nimrod Ofek wrote: > >>

Re: [DISCUSS] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-25 Thread Nimrod Ofek
result is worth one-thousand > expert opinions (Wernher von Braun <https://en.wikipedia.org/wiki/Wernher_von_Braun>)". > > > On Thu, 25 Apr 2024 at 12:30, Nimrod Ofek wrote: > >> I will also appreciate so

Re: [DISCUSS] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-25 Thread Nimrod Ofek
I would also appreciate some material that describes the differences between Spark native tables and Hive tables, and why each should be used... Thanks Nimrod On Thu, 25 Apr 2024, 14:27, Mich Talebzadeh <mich.talebza...@gmail.com> wrote: > I see a statement made as below and I quote > >

Support Avro rolling version upgrades using schema manager

2024-04-13 Thread Nimrod Ofek
Hi, Currently, Avro records are supported in Spark - but with the limitation that we must specify the input and output schema versions. For writing out an Avro record that is fine - but for reading Avro records, that is usually a problem since there are upgrades and changes - and the current