Hi,
I am trying to understand how transactions behave in Iceberg when
"commit.retry.num-retries" is set to zero. My requirement is that a
transaction must fail if the table is updated by any concurrent
transaction after that transaction was opened.
I wrote the following unit test in TestHadoopTables.java to verify the
behaviour. I am noticing that both transactions commit one after the
other, leading to an unexpected table state. Could anyone please confirm
whether I am doing anything wrong, or whether the Iceberg transaction
commit logic needs any change?
The test is very simple. It opens two transactions one after another,
appends a file in each transaction, and then commits them one after the
other. My requirement is that the second transaction must fail with
CommitFailedException. However, it commits successfully.

@Test
public void testSimpleConcurrentTransaction() {
  PartitionSpec spec = PartitionSpec.builderFor(SCHEMA)
      .build();

  // set table property to avoid retries during commit
  final Map<String, String> tableProperties = Stream.of(new String[][] {
      { TableProperties.COMMIT_NUM_RETRIES, "0" }
  }).collect(Collectors.toMap(d -> d[0], d -> d[1]));

  final DataFile FILE_A = DataFiles.builder(spec)
      .withPath("/path/to/data-a.parquet")
      .withFileSizeInBytes(10)
      .withRecordCount(1)
      .build();

  Table table = TABLES.create(SCHEMA, spec, tableProperties,
      tableDir.toURI().toString());

  // It is an empty table, so there is no snapshot yet
  Assert.assertEquals("Current snapshot must be null", null,
      table.currentSnapshot());

  // start transaction t1
  Transaction t1 = table.newTransaction();
  // start transaction t2
  Transaction t2 = table.newTransaction();

  // t1 is adding a data file
  t1.newAppend()
      .appendFile(FILE_A)
      .commit();

  // t2 is adding a data file
  t2.newAppend()
      .appendFile(FILE_A)
      .commit();

  // commit transaction t1
  t1.commitTransaction();

  // commit transaction t2: my requirement is that the following commit
  // must fail
  t2.commitTransaction();

  table.refresh();
  List<ManifestFile> manifests = table.currentSnapshot().allManifests();

  // The following assert fails since both transactions added one
  // manifest file each
  Assert.assertEquals("Should have 1 manifest file", 1, manifests.size());
}

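To make the requirement concrete, here is a minimal standalone sketch of
the semantics I am hoping for. It is plain Java and not Iceberg code:
ToyTable, ToyTransaction and ToyCommitFailedException are hypothetical
names used only for illustration, and the "version" counter is my own
assumption about how a strict check could be modelled. Each transaction
remembers the table version it started from, and a commit is rejected if
the table has moved on since then, with no retry.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: none of these classes exist in Iceberg.
class ToyCommitFailedException extends RuntimeException {
  ToyCommitFailedException(String message) {
    super(message);
  }
}

class ToyTable {
  private long version = 0;
  private final List<String> files = new ArrayList<>();

  synchronized long version() {
    return version;
  }

  // Returns a copy so callers cannot mutate the table state directly.
  synchronized List<String> files() {
    return new ArrayList<>(files);
  }

  // A transaction captures the table version at the time it is opened.
  ToyTransaction newTransaction() {
    return new ToyTransaction(this, version());
  }

  // Strict compare-and-swap: succeed only if no other commit happened
  // since baseVersion. There is deliberately no retry.
  synchronized void commit(long baseVersion, List<String> pending) {
    if (baseVersion != version) {
      throw new ToyCommitFailedException(
          "Table changed: base version " + baseVersion
              + ", current version " + version);
    }
    files.addAll(pending);
    version++;
  }
}

class ToyTransaction {
  private final ToyTable table;
  private final long baseVersion;
  private final List<String> pending = new ArrayList<>();

  ToyTransaction(ToyTable table, long baseVersion) {
    this.table = table;
    this.baseVersion = baseVersion;
  }

  // Stages a file; nothing is visible in the table until commit.
  ToyTransaction appendFile(String path) {
    pending.add(path);
    return this;
  }

  void commitTransaction() {
    table.commit(baseVersion, pending);
  }
}

class StrictCommitDemo {
  public static void main(String[] args) {
    ToyTable table = new ToyTable();
    ToyTransaction t1 = table.newTransaction();
    ToyTransaction t2 = table.newTransaction();
    t1.appendFile("/path/to/data-a.parquet");
    t2.appendFile("/path/to/data-a.parquet");

    t1.commitTransaction();
    try {
      t2.commitTransaction();
      System.out.println("t2 committed (not what I want)");
    } catch (ToyCommitFailedException e) {
      System.out.println("t2 rejected: " + e.getMessage());
    }
  }
}
```

In this toy model t2 is rejected because it was opened against version 0
and t1 has already moved the table to version 1. This is the outcome I
would like to get from the Iceberg test above.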
Please suggest whether there is a way to commit transactions such that the
second one fails. Thank you so much.