To add more color:
Spark data source tables and Hive SerDe tables are both stored in the Hive
metastore and keep their data files in the table directory. The only
difference is that they have different "table providers", which means Spark
will use a different reader/writer. Ideally the Spark native data
@Mich Talebzadeh there seems to be a
misunderstanding here. The Spark native data source table is still stored
in the Hive metastore; Spark just uses a different (and faster)
reader/writer for it. `hive-site.xml` should work as it does today.
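
As a rough illustration of the provider distinction described above (a sketch, not taken from the thread): in Spark SQL, `CREATE TABLE ... USING <provider>` creates a data source table, while `CREATE TABLE ... STORED AS <format>` creates a Hive SerDe table. The helper names below are hypothetical, and the config key `spark.sql.legacy.createHiveTableByDefault` is assumed to be the legacy config under discussion; the thread itself does not name it. `show_providers` assumes a `SparkSession` built with `enableHiveSupport()`.

```python
# Hypothetical helpers for illustration only. The SQL syntax is Spark SQL;
# the config key is an assumption about which legacy config the thread means.
LEGACY_CONF = "spark.sql.legacy.createHiveTableByDefault"


def data_source_ddl(table: str, columns: str, provider: str = "parquet") -> str:
    """Build CREATE TABLE ... USING <provider> (a Spark data source table)."""
    return f"CREATE TABLE {table} ({columns}) USING {provider}"


def hive_serde_ddl(table: str, columns: str, file_format: str = "PARQUET") -> str:
    """Build CREATE TABLE ... STORED AS <format> (a Hive SerDe table)."""
    return f"CREATE TABLE {table} ({columns}) STORED AS {file_format}"


def show_providers(spark) -> None:
    """Create one table of each kind and print the Provider row of each.

    `spark` is assumed to be a SparkSession with Hive support enabled, so
    both tables are registered in the same Hive metastore.
    """
    spark.sql(data_source_ddl("t_native", "id INT"))
    spark.sql(hive_serde_ddl("t_hive", "id INT"))
    for name in ("t_native", "t_hive"):
        rows = spark.sql(f"DESCRIBE TABLE EXTENDED {name}").collect()
        provider = [r.data_type for r in rows if r.col_name == "Provider"]
        print(name, provider)
```

A `CREATE TABLE` statement with neither `USING` nor `STORED AS` is the case the default affects. If the assumed key is right, a workload that depends on Hive SerDe tables could presumably set it with `spark.conf.set(LEGACY_CONF, "true")`, but that should be checked against the Spark version in use.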
On Tue, Apr 30, 2024 at 5:23 AM Hyukjin
Mich, it is a legacy config we should get rid of in the end, and it has
been tested in production for a very long time. Spark should create a Spark
table by default.
On Tue, Apr 30, 2024 at 5:38 AM Mich Talebzadeh
wrote:
> Your point
>
> "... It's a surprise to me to see that someone has different
+1
It's a legacy conf that we should eventually remove. Spark should
create a Spark table by default, not a Hive table.
Mich, for your workload, you can simply switch that conf off if it concerns
you. We also enabled ANSI (which you agreed on). It's a bit awkward
to stop in the middle
? I'm not sure why you think in that direction.
What I wrote was the following.
- You voted +1 for SPARK-4 on April 14th
(https://lists.apache.org/thread/tp92yzf8y4yjfk6r3dkqjtlb060g82sy)
- You voted -1 for SPARK-46122 on April 26th.
Your point
"... It's a surprise to me to see that someone has different positions in a
very short period of time in the community"
Well, I have been with Spark since 2015, and this is the article on
Medium, dated February 7, 2016, regarding both Hive and Spark, which was also
presented in
It's a surprise to me to see that someone has different positions
in a very short period of time in the community.
Mich cast +1 for SPARK-4 and -1 for SPARK-46122.
- https://lists.apache.org/thread/4cbkpvc3vr3b6k0wp6lgsw37spdpnqrc
-