Hi all,

I now have some more input related to the issues I face at the moment:

When I try to UPDATE an external table via JDBC connection to HiveThrift2 
server I get the following exception:

java.lang.UnsupportedOperationException: UPDATE TABLE is not supported 
temporarily.

When doing a DELETE I see:

org.apache.spark.sql.AnalysisException: DELETE is only supported with v2 tables.

INSERT is working as expected.

We are using Spark 3.1.2 with Hadoop 3.2.0 and an external Hive 3.0.0 metastore 
on K8S.
The warehouse dir is located on AWS S3, attached via the s3a protocol.

So far I have learned that we need to use an ACID-compatible file format for 
external tables, such as ORC or DELTA.
In addition to that we would need to set some ACID related properties either as 
first commands after session creation or via appropriate configuration files:

SET hive.support.concurrency=true;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
SET hive.enforce.sorting=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nostrict;
SET hive.compactor.initiator.on=true;
SET hive.compactor.worker.threads=1;
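
To make clear what I mean by "first commands after session creation", here is a 
minimal sketch in Python. It assumes a generic DB-API style cursor with an 
execute() method (the same pattern would apply to a JDBC Statement). The helper 
names are hypothetical, and I am also assuming the driver expects one statement 
per call without a trailing semicolon. Whether the Spark Thrift server actually 
honours these Hive properties is exactly what I am unsure about.

```python
from typing import List, Mapping

# The ACID-related properties from the list above, in the same order.
ACID_SESSION_PROPERTIES = {
    "hive.support.concurrency": "true",
    "hive.txn.manager": "org.apache.hadoop.hive.ql.lockmgr.DbTxnManager",
    "hive.enforce.sorting": "true",
    "hive.enforce.bucketing": "true",
    "hive.exec.dynamic.partition.mode": "nostrict",
    "hive.compactor.initiator.on": "true",
    "hive.compactor.worker.threads": "1",
}


def build_set_statements(properties: Mapping[str, str]) -> List[str]:
    """Turn a property mapping into one SET statement per entry."""
    # Assumption: no trailing semicolon, one statement per execute() call.
    return [f"SET {key}={value}" for key, value in properties.items()]


def apply_session_properties(cursor, properties=ACID_SESSION_PROPERTIES) -> List[str]:
    """Run the SET statements on a freshly opened session.

    `cursor` is any object with an execute(str) method (hypothetical here;
    e.g. a DB-API cursor). Returns the statements that were sent.
    """
    statements = build_set_statements(properties)
    for statement in statements:
        cursor.execute(statement)
    return statements
```

The idea is to call apply_session_properties(cursor) once, directly after 
opening the connection and before any CREATE TABLE, UPDATE or DELETE.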

Now, when I try to create the following table:

create external table acidtab (id string, val string)
            stored as ORC location '/data/acidtab.orc'
            tblproperties ('transactional'='true');

I see the following exception:

org.apache.spark.sql.AnalysisException: 
org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:The 
table must be stored using an ACID compliant format (such as ORC): 
default.acidtab)

Even though the table is declared as STORED AS ORC, the exception still 
suggests using ORC because it is required for ACID compliance.

Another point is that external tables are not deleted by the DROP TABLE 
command. They are only removed from the metastore, but their files remain 
physically present in the S3 bucket.

I tried with:

SET `hive.metastore.thrift.delete-files-on-drop`=true;

And also by setting:

TBLPROPERTIES ('external.table.purge'='true')


Any help on these issues would be much appreciated!

Many thanks,
Meikel Bode

From: Bode, Meikel, NMA-CFD <meikel.b...@bertelsmann.de>
Sent: Wednesday, 10 November 2021 08:23
To: user <user@spark.apache.org>; dev <d...@spark.apache.org>
Subject: HiveThrift2 ACID Transactions?

Hi all,

We want to apply INSERT, UPDATE, and DELETE operations on tables based on 
Parquet or ORC files served by thrift2.
At the moment it is unclear whether we can enable them, and where.

At the moment, UPDATE and DELETE operations are blocked when we execute them.

Is anyone out there using ACID transactions in combination with thrift2?

Best,
Meikel
