Re: Re: [VOTE] Release Apache Spark 3.4.0 (RC4)

2023-03-23 Thread Xinrong Meng
Hi, Considering that https://issues.apache.org/jira/browse/SPARK-42693 is a release blocker, I would suggest we postpone v3.4.0-rc5 until next week. I appreciate the ongoing efforts on API auditing. Please feel free to participate in the auditing if you are interested! Please refer to the

Re: please help the problem of big parquet file can not be splitted to read

2023-03-23 Thread Alfie Davidson
I’m pretty sure a snappy file is not splittable. That’s why you have a single task (and most likely a single core) reading the 1.9GB snappy file. > On 23 Mar 2023, at 07:36, yangjie01 wrote: > > Is there only one RowGroup for this file? You can check this by printing the > file's

Re: Ammonite as REPL for Spark Connect

2023-03-23 Thread John Zhuge
+1 on better notebook and other REPL experience. On Thu, Mar 23, 2023 at 9:17 AM Dongjoon Hyun wrote: > I also support Herman's `SPARK-42884 Add Ammonite REPL integration` PR. > > Thanks, > Dongjoon. > > On Thu, Mar 23, 2023 at 7:51 AM Mridul Muralidharan wrote: >> Sounds good, thanks

Re: Ammonite as REPL for Spark Connect

2023-03-23 Thread Dongjoon Hyun
I also support Herman's `SPARK-42884 Add Ammonite REPL integration` PR. Thanks, Dongjoon. On Thu, Mar 23, 2023 at 7:51 AM Mridul Muralidharan wrote: > Sounds good, thanks for clarifying! > > Regards, > Mridul > > On Thu, Mar 23, 2023 at 9:09 AM Herman van Hovell wrote: >> The goal of

Re: Ammonite as REPL for Spark Connect

2023-03-23 Thread Mridul Muralidharan
Sounds good, thanks for clarifying! Regards, Mridul On Thu, Mar 23, 2023 at 9:09 AM Herman van Hovell wrote: > The goal of adding this is to make it easy for a user to connect a Scala > REPL to a Spark Connect server, just as Spark shell makes it easy to work > with a regular Spark

Re: Ammonite as REPL for Spark Connect

2023-03-23 Thread Herman van Hovell
The goal of adding this is to make it easy for a user to connect a Scala REPL to a Spark Connect server, just as Spark shell makes it easy to work with a regular Spark environment. It is not meant as a Spark shell replacement. They represent two different modes of working with Spark, and they

Re: Ammonite as REPL for Spark Connect

2023-03-23 Thread Mridul Muralidharan
What is unclear to me is why we are introducing this integration and how users will leverage it. * Are we replacing spark-shell with it? Given the existing gaps, this is not the case. * Is it an example to showcase how to build an integration? That could be interesting, and we can add it to

Re: please help the problem of big parquet file can not be splitted to read

2023-03-23 Thread yangjie01
Is there only one RowGroup for this file? You can check this by printing the file's metadata using the `meta` command of `parquet-cli`. Yang Jie From: zhangliyun Date: Thursday, March 23, 2023, 15:16 To: Spark Dev List Subject: please help the problem of big parquet file can not be splitted to read hi all i
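
To illustrate why the RowGroup count matters for the question above, here is a minimal sketch in plain Python. It assumes that a Parquet reader hands each row group to the byte-range split containing that row group's midpoint, and that the file is cut into fixed 128 MB splits; both are simplifying assumptions for illustration, not a description of Spark's exact planning logic. The function name `assign_row_groups` and the offsets used are hypothetical.

```python
from typing import List, Tuple

MB = 1024 * 1024


def assign_row_groups(
    row_groups: List[Tuple[int, int]],  # (start_offset, size_in_bytes) per row group
    file_size: int,
    split_size: int,
) -> List[List[int]]:
    """Return, for each fixed-size byte-range split, the row group indices it reads.

    Assumption: a row group is read by the split that contains its midpoint.
    """
    n_splits = -(-file_size // split_size)  # ceiling division
    splits: List[List[int]] = [[] for _ in range(n_splits)]
    for idx, (start, size) in enumerate(row_groups):
        midpoint = start + size // 2
        splits[min(midpoint // split_size, n_splits - 1)].append(idx)
    return splits


# Hypothetical 1.9 GB file written as one huge row group.
one_group = assign_row_groups([(4, 1900 * MB - 1024)], 1900 * MB, 128 * MB)
print(sum(1 for s in one_group if s))  # number of splits with any work

# Hypothetical file of the same order of size written as 15 row groups of 128 MB.
many_groups = assign_row_groups(
    [(i * 128 * MB, 128 * MB) for i in range(15)], 15 * 128 * MB, 128 * MB
)
print(sum(1 for s in many_groups if s))
```

Under these assumptions the single-row-group file yields 15 splits of which only one has data to read, while the 15-row-group file gives every split one row group. That is consistent with seeing a single busy task, and it is why checking the row group count in the file's metadata is a useful first step.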