Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread Kent Yao
+1 Kent Yao On 2024/04/30 09:07:21 Yuming Wang wrote: > +1 > > On Tue, Apr 30, 2024 at 3:31 PM Ye Xianjin wrote: > > > +1 > > Sent from my iPhone > > > > On Apr 30, 2024, at 3:23 PM, DB Tsai wrote: > > > > > > +1 > > > > On Apr 29, 2024, at 8:01 PM, Wenchen Fan wrote: > > > > > > To add

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread Yuming Wang
+1 On Tue, Apr 30, 2024 at 3:31 PM Ye Xianjin wrote: > +1 > Sent from my iPhone > > On Apr 30, 2024, at 3:23 PM, DB Tsai wrote: > > > +1 > > On Apr 29, 2024, at 8:01 PM, Wenchen Fan wrote: > > > To add more color: > > Spark data source table and Hive Serde table are both stored in the

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread Nimrod Ofek
+1 (non-binding) P.S. How do I become binding? Thanks, Nimrod On Tue, Apr 30, 2024 at 10:53 AM Ye Xianjin wrote: > +1 > Sent from my iPhone > > On Apr 30, 2024, at 3:23 PM, DB Tsai wrote: > > > +1 > > On Apr 29, 2024, at 8:01 PM, Wenchen Fan wrote: > > > To add more color: > > Spark data

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread XiDuo You
+1 On Sat, Apr 27, 2024 at 03:50, Dongjoon Hyun wrote: > > Please vote on SPARK-46122 to set spark.sql.legacy.createHiveTableByDefault > to `false` by default. The technical scope is defined in the following PR. > > - DISCUSSION: https://lists.apache.org/thread/ylk96fg4lvn6klxhj6t6yh42lyqb8wmd > - JIRA:

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread Ye Xianjin
+1 Sent from my iPhone On Apr 30, 2024, at 3:23 PM, DB Tsai wrote: +1 On Apr 29, 2024, at 8:01 PM, Wenchen Fan wrote: To add more color: Spark data source table and Hive Serde table are both stored in the Hive metastore and keep the data files in the table directory. The only difference is they

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread DB Tsai
+1 On Apr 29, 2024, at 8:01 PM, Wenchen Fan wrote: To add more color: Spark data source table and Hive Serde table are both stored in the Hive metastore and keep the data files in the table directory. The only difference is they have different "table provider", which means Spark will use different

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-29 Thread Wenchen Fan
To add more color: Spark data source table and Hive Serde table are both stored in the Hive metastore and keep the data files in the table directory. The only difference is they have different "table provider", which means Spark will use different reader/writer. Ideally the Spark native data
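
To make the "table provider" distinction above concrete, here is a minimal sketch, not taken from the thread. It assumes the usual Spark SQL DDL forms: `USING <provider>` requests a Spark native data source table, `STORED AS <format>` requests a Hive SerDe table, and a statement with neither clause is the case governed by spark.sql.legacy.createHiveTableByDefault. The helper name `create_table_ddl` and the table and column names are hypothetical and only build the statement text; nothing is executed against a metastore.

```python
def create_table_ddl(name, columns, provider=None, hive_format=None):
    """Build a CREATE TABLE statement string (hypothetical helper).

    provider    -> appends "USING <provider>"      (Spark native data source table)
    hive_format -> appends "STORED AS <format>"    (Hive SerDe table)
    neither     -> bare CREATE TABLE; the resulting table type then depends on
                   spark.sql.legacy.createHiveTableByDefault
    """
    if provider and hive_format:
        raise ValueError("choose either provider or hive_format, not both")
    cols = ", ".join(f"{col} {dtype}" for col, dtype in columns)
    ddl = f"CREATE TABLE {name} ({cols})"
    if provider:
        ddl += f" USING {provider}"
    elif hive_format:
        ddl += f" STORED AS {hive_format}"
    return ddl


if __name__ == "__main__":
    schema = [("id", "INT"), ("name", "STRING")]
    print(create_table_ddl("t_native", schema, provider="parquet"))
    print(create_table_ddl("t_hive", schema, hive_format="PARQUET"))
    print(create_table_ddl("t_default", schema))
```

Statements that spell out `USING` or `STORED AS` are unaffected by the proposed default; only the third, bare form changes meaning.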

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-29 Thread Wenchen Fan
@Mich Talebzadeh there seems to be a misunderstanding here. The Spark native data source table is still stored in the Hive metastore; it's just that Spark will use a different (and faster) reader/writer for it. `hive-site.xml` should work as it is today. On Tue, Apr 30, 2024 at 5:23 AM Hyukjin

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-29 Thread Hyukjin Kwon
+1 It's a legacy conf that we should eventually remove. Spark should create Spark tables by default, not Hive tables. Mich, for your workload, you can simply switch that conf off if it concerns you. We enabled ANSI as well (that you agreed on). It's a bit awkward to stop in the middle
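
For workloads that rely on the previous behavior, the flag can be pinned explicitly rather than left to the default. A minimal sketch, assuming the standard `--conf key=value` form of spark-submit and the SQL `SET key=value` statement; the helper names `submit_conf_args` and `set_statement` are hypothetical and only format strings. Keeping bare `CREATE TABLE` statements producing Hive tables means setting the flag to `true`.

```python
# Configuration key under vote in this thread.
LEGACY_KEY = "spark.sql.legacy.createHiveTableByDefault"


def submit_conf_args(create_hive_by_default):
    """Return spark-submit style arguments that pin the flag explicitly."""
    value = "true" if create_hive_by_default else "false"
    return ["--conf", f"{LEGACY_KEY}={value}"]


def set_statement(create_hive_by_default):
    """Return the equivalent SQL SET statement for a running session."""
    value = "true" if create_hive_by_default else "false"
    return f"SET {LEGACY_KEY}={value}"


if __name__ == "__main__":
    # Keep the legacy behavior: bare CREATE TABLE yields a Hive table.
    print(" ".join(submit_conf_args(True)))
    print(set_statement(True))
```

In a PySpark session the same setting could be applied with `spark.conf.set(LEGACY_KEY, "true")`, assuming `spark` is an existing SparkSession.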

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-28 Thread Mich Talebzadeh
Hi @Wenchen Fan Thanks for your response. I believe we have not had enough time to "DISCUSS" this matter. Currently, in order to make Spark take advantage of Hive, I create a soft link in $SPARK_HOME/conf. FYI, my Spark version is 3.4.0 and Hive is 3.1.1. /opt/spark/conf/hive-site.xml ->

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-28 Thread Wenchen Fan
@Mich Talebzadeh thanks for sharing your concern! Note: creating Spark native data source tables is usually Hive compatible as well, unless we use features that Hive does not support (TIMESTAMP_NTZ, ANSI INTERVAL, etc.). I think it's a better default to create Spark native table in this case,

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-26 Thread Cheng Pan
+1 (non-binding) Thanks, Cheng Pan On Sat, Apr 27, 2024 at 9:29 AM Holden Karau wrote: > > +1 > > Twitter: https://twitter.com/holdenkarau > Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 > YouTube Live Streams: https://www.youtube.com/user/holdenkarau > > > On

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-26 Thread Zhou Jiang
+1 (non-binding) On Fri, Apr 26, 2024 at 10:01 AM Dongjoon Hyun wrote: > I'll start with my +1. > > Dongjoon. > > On 2024/04/26 16:45:51 Dongjoon Hyun wrote: > > Please vote on SPARK-46122 to set > spark.sql.legacy.createHiveTableByDefault > > to `false` by default. The technical scope is

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-26 Thread Mich Talebzadeh
-1 for me Do not change spark.sql.legacy.createHiveTableByDefault because: 1. We have not had enough time to "DISCUSS" this matter. The discussion thread was opened almost 24 hours ago. 2. Compatibility: Changing the default behavior could potentially break existing workflows or

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-26 Thread Holden Karau
+1 Twitter: https://twitter.com/holdenkarau Books (Learning Spark, High Performance Spark, etc.): https://amzn.to/2MaRAG9 YouTube Live Streams: https://www.youtube.com/user/holdenkarau On Fri, Apr 26, 2024 at 12:06 PM L. C. Hsieh wrote: > +1 > > On Fri, Apr 26, 2024

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-26 Thread L. C. Hsieh
+1 On Fri, Apr 26, 2024 at 10:01 AM Dongjoon Hyun wrote: > > I'll start with my +1. > > Dongjoon. > > On 2024/04/26 16:45:51 Dongjoon Hyun wrote: > > Please vote on SPARK-46122 to set spark.sql.legacy.createHiveTableByDefault > > to `false` by default. The technical scope is defined in the

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-26 Thread Gengliang Wang
+1 On Fri, Apr 26, 2024 at 10:01 AM Dongjoon Hyun wrote: > I'll start with my +1. > > Dongjoon. > > On 2024/04/26 16:45:51 Dongjoon Hyun wrote: > > Please vote on SPARK-46122 to set > spark.sql.legacy.createHiveTableByDefault > > to `false` by default. The technical scope is defined in the

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-26 Thread Dongjoon Hyun
I'll start with my +1. Dongjoon. On 2024/04/26 16:45:51 Dongjoon Hyun wrote: > Please vote on SPARK-46122 to set spark.sql.legacy.createHiveTableByDefault > to `false` by default. The technical scope is defined in the following PR. > > - DISCUSSION: >

[VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-26 Thread Dongjoon Hyun
Please vote on SPARK-46122 to set spark.sql.legacy.createHiveTableByDefault to `false` by default. The technical scope is defined in the following PR. - DISCUSSION: https://lists.apache.org/thread/ylk96fg4lvn6klxhj6t6yh42lyqb8wmd - JIRA: https://issues.apache.org/jira/browse/SPARK-46122 - PR: