jacques-n commented on a change in pull request #1849:
URL: https://github.com/apache/iceberg/pull/1849#discussion_r536468975



##########
File path: api/src/main/java/org/apache/iceberg/catalog/TransactionalCatalog.java
##########
@@ -0,0 +1,107 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.catalog;
+
+import org.apache.iceberg.catalog.SupportsCatalogTransactions.IsolationLevel;
+import org.apache.iceberg.catalog.SupportsCatalogTransactions.LockingMode;
+import org.apache.iceberg.exceptions.CommitFailedException;
+
+/**
+ * A {@link Catalog} that applies all mutations within a single transaction.
+ *
+ * <p>A TransactionalCatalog can spawn child transactions for multiple operations on different
+ * tables. All operations will be done within the context of a single Catalog-level transaction
+ * and they will either all be successful or all fail.
+ *
+ * <p>A TransactionalCatalog is initially active upon creation and will remain so until one of
+ * the following terminal actions occurs:
+ * <ul>
+ * <li>{@link rollback} is called.
+ * <li>{@link commit} is called.
+ * <li>The transaction expires while using Pessimistic {@link LockingMode}.
+ * <li>The transaction is terminated externally (for example, when a locking arbitrator
+ *     determines a deadlock between two transactions has occurred).
+ * <li>The underlying implementation determines that the transaction can no longer complete
+ *     successfully.
+ * </ul>
+ *
+ * <p>When one of the items above occurs, the transaction is no longer valid. Further use
+ * of the transaction will result in a {@link IllegalStateException} being thrown.
+ *
+ * <p>Nested transactions such as creating a new table may fail. Those failures alone do
+ * not necessarily result in a failure of the catalog-level transaction.
+ *
+ */
+public interface TransactionalCatalog extends Catalog, AutoCloseable {
+
+  /**
+   * An internal identifier associated with this transaction.
+   * @return An internal identifier.
+   */
+  String transactionId();
+
+  /**
+   * Return the current {@code IsolationLevel} for this transaction.
+   * @return The IsolationLevel for this transaction.
+   */
+  IsolationLevel isolationLevel();
+
+  /**
+   * Return the {@link LockingMode} for this transaction.
+   * @return The LockingMode for this transaction.
+   */
+  LockingMode lockingMode();
+
+  /**
+   * Whether the current transaction is still active/open.
+   * @return True until a terminal action occurs.
+   */
+  boolean active();
+
+  /**
+   * Aborts the set of operations here and makes this TransactionalCatalog inoperable.
+   *
+   * <p>Once called, no further operations can be done against this catalog. If any
+   * operations are attempted, {@link IllegalStateException} will be thrown.
+   */
+  void rollback();
+
+  /**
+   * Commit the pending changes from all nested transactions against the Catalog.
+   *
+   * <p>Once called, no further operations can be done against this catalog. If any
+   * operations are attempted, {@link IllegalStateException} will be thrown.
+   *
+   * @throws CommitFailedException If the updates cannot be committed due to conflicts.
+   */
+  void commit();
+
+  /**
+   * A shortcut for {@link commit} that allows users to use this catalog in try-with-resources
+   * block.
+   *
+   * @throws CommitFailedException If the updates cannot be committed due to conflicts.
+   */
+  @Override
+  default void close() {

Review comment:
       Updated to use rollback closeable pattern.
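
A minimal sketch of what a rollback-on-close pattern could look like, assuming `close()` rolls back any transaction that was not explicitly committed. `SketchTransaction` and `RollbackOnCloseDemo` are hypothetical stand-ins for illustration, not the code in this pull request.

```java
// Hypothetical stand-in for a catalog-level transaction; not the Iceberg API.
final class SketchTransaction implements AutoCloseable {
  private boolean active = true;
  private boolean committed = false;

  void commit() {
    checkActive();
    committed = true;
    active = false;
  }

  void rollback() {
    checkActive();
    active = false;
  }

  boolean active() {
    return active;
  }

  boolean committed() {
    return committed;
  }

  // close() never commits: it rolls back only if the transaction is still open.
  @Override
  public void close() {
    if (active) {
      rollback();
    }
  }

  private void checkActive() {
    if (!active) {
      throw new IllegalStateException("Transaction is no longer active");
    }
  }
}

final class RollbackOnCloseDemo {
  // Runs one transaction and returns it so the caller can inspect its final state.
  static SketchTransaction run(boolean failBeforeCommit) {
    SketchTransaction seen = null;
    try (SketchTransaction txn = new SketchTransaction()) {
      seen = txn;
      if (failBeforeCommit) {
        throw new RuntimeException("simulated failure");
      }
      txn.commit();
    } catch (RuntimeException e) {
      // By the time this handler runs, close() has already rolled the transaction back.
    }
    return seen;
  }

  public static void main(String[] args) {
    System.out.println("committed on success: " + run(false).committed());
    System.out.println("committed on failure: " + run(true).committed());
  }
}
```

With this shape, an exception thrown inside the try block cannot lead to a partial commit, because the only path to a commit is the explicit `commit()` call.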

##########
File path: api/src/main/java/org/apache/iceberg/catalog/SupportsCatalogTransactions.java
##########
@@ -0,0 +1,162 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
+package org.apache.iceberg.catalog;
+
+import java.util.Set;
+
+/**
+ * Catalog methods for working with catalog-level transactions.
+ *
+ * <p>Catalog implementations are not required to support catalog-level transactional state.
+ * If they do, they may support one or more {@code IsolationLevel}s and one or more
+ * {@code LockingMode}s.
+ */
+public interface SupportsCatalogTransactions {
+
+  /**
+   * The level of isolation for a catalog-level transaction.
+   *
+   * <p>Isolation covers both what data is read and what data can be written.
+   *
+   * <p>At all levels, data is only visible if it is either committed by another transaction or
+   * committed by a nested transaction within this catalog-level transaction.
+   *
+   * <p>Individual nested Table transactions may be "rebased" to expose updated versions of a
+   * table if the isolation level allows that behavior.
+   *
+   * <p>In the definitions of each isolation level, the concept of conflicting writes is
+   * referenced. Conflicting writes are two mutations to the same object that happen concurrently.
+   * Depending on the particular implementation, the coarseness of this conflict may vary. The
+   * most coarse conflict is any two mutations to the same table. However, some implementations
+   * may consider some of these "absolute" conflicts as allowable by using finer-grained conflict
+   * resolution. For example, two different operations that both append new files to a table may
+   * be in "absolute" conflict but could be resolved automatically as a "safe conflict" by using
+   * a set of automatic implementation-defined conflict resolution rules.
+   */
+  enum IsolationLevel {
+
+    /**
+     * Reading the same table multiple times may result in different versions read of the same
+     * table. A commit can be completed as long as any tables changed externally do not conflict
+     * with any writes within this transaction.
+     */
+    READ_COMMITTED,
+
+    /**
+     * Reading the same table multiple times will result in the same view of that table.
+     * Different tables may come from different snapshots. A commit can be completed as
+     * long as any tables changed externally do not conflict with any writes within this
+     * transaction.
+     */
+    REPEATED_READ,
+
+    /**
+     * A commit will only succeed if there have been no meaningful changes to data read during
+     * the course of this transaction prior to commit. This imposes stricter read guarantees than
+     * {@code REPEATED_READ} (consistent reads per table) as it requires that the reads are
+     * consistent for all tables to a single point in time (or single snapshot of the database).
+     * Additionally, it implies additional requirements around the successful completion of a
+     * write. In order for a write to complete, any entities read during this transaction are also
+     * blocked from changing (via another transaction) post-read in ways that would influence the
+     * writes of this operation. This is also sometimes called snapshot isolation.

Review comment:
       Snapshot isolation in general is stickier from my point of view, as I
don't believe there is a canonical definition of it. Serializable has a very
clear definition from SQL-92. People were previously confused by this
definition and by the absence of snapshot isolation, which is what led me to
add this sentence.
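
One way the two terms are often told apart is the write-skew case. The sketch below assumes one common reading, in which a snapshot-style check validates only the keys a transaction wrote, while a serializable-style check also validates the keys it read. All class names are hypothetical, and versions are recorded at first read rather than at transaction start to keep the example short.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory store that tracks one version counter per key.
final class SketchStore {
  private final Map<String, Integer> versions = new HashMap<>();

  int version(String key) {
    return versions.getOrDefault(key, 0);
  }

  void bump(String key) {
    versions.merge(key, 1, Integer::sum);
  }
}

// Hypothetical optimistic transaction with a configurable commit-time check.
final class SketchTxn {
  private final SketchStore store;
  private final boolean validateReads;
  private final Map<String, Integer> readVersions = new HashMap<>();
  private final Set<String> writes = new HashSet<>();

  SketchTxn(SketchStore store, boolean validateReads) {
    this.store = store;
    this.validateReads = validateReads;
  }

  void read(String key) {
    readVersions.putIfAbsent(key, store.version(key));
  }

  void write(String key) {
    read(key); // remember the version this write is based on
    writes.add(key);
  }

  boolean tryCommit() {
    Set<String> toCheck = new HashSet<>(writes);
    if (validateReads) {
      toCheck.addAll(readVersions.keySet());
    }
    for (String key : toCheck) {
      if (store.version(key) != readVersions.get(key)) {
        return false; // a checked key changed since this transaction saw it
      }
    }
    writes.forEach(store::bump);
    return true;
  }
}

final class WriteSkewDemo {
  // Both transactions read "a" and "b"; the first writes "a", the second writes "b".
  static boolean secondCommitSucceeds(boolean validateReads) {
    SketchStore store = new SketchStore();
    SketchTxn first = new SketchTxn(store, validateReads);
    SketchTxn second = new SketchTxn(store, validateReads);
    first.read("a");
    first.read("b");
    second.read("a");
    second.read("b");
    first.write("a");
    second.write("b");
    return first.tryCommit() && second.tryCommit();
  }

  public static void main(String[] args) {
    System.out.println("write-set check only: " + secondCommitSucceeds(false));
    System.out.println("read-set check too:   " + secondCommitSucceeds(true));
  }
}
```

Under the write-only check both transactions commit even though each based its write on data the other changed. Under the read-validating check the second commit is rejected.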

##########
File path: api/src/main/java/org/apache/iceberg/Table.java
##########
@@ -41,6 +41,13 @@ default String name() {
 
   /**
    * Refresh the current table metadata.
+   *
+   * <p>If this table is associated with a TransactionalCatalog, this refresh will be bounded by
+   * the visibility that the {@code IsolationLevel} of that transaction exposes. For example, if
+   * we are in a context of {@code READ_COMMITTED}, this refresh will update to the latest state
+   * of the table. However, in the case of {@code SERIALIZABLE} where this table hasn't mutated
+   * within this transaction, calling refresh will have no impact as the isolation level

Review comment:
       Correct.
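
The refresh behavior described in the Javadoc above can be sketched as follows, assuming a table view that tracks a single version number. `SketchIsolation`, `SketchTableView`, and `RefreshDemo` are hypothetical names for illustration and are not part of the proposed API.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical names throughout; this is not the Iceberg API.
enum SketchIsolation { READ_COMMITTED, SERIALIZABLE }

final class SketchTableView {
  private final AtomicLong latestCommitted; // newest committed table version, shared with other writers
  private final SketchIsolation isolation;
  private long visibleVersion;              // version this transaction currently sees

  SketchTableView(AtomicLong latestCommitted, SketchIsolation isolation) {
    this.latestCommitted = latestCommitted;
    this.isolation = isolation;
    this.visibleVersion = latestCommitted.get();
  }

  // Returns the table version visible to this transaction after the refresh.
  long refresh() {
    if (isolation == SketchIsolation.READ_COMMITTED) {
      visibleVersion = latestCommitted.get();
    }
    // SERIALIZABLE with no local mutation: the view stays pinned, so refresh is a no-op.
    return visibleVersion;
  }
}

final class RefreshDemo {
  static long versionAfterExternalCommit(SketchIsolation isolation) {
    AtomicLong latest = new AtomicLong(1);
    SketchTableView view = new SketchTableView(latest, isolation);
    latest.incrementAndGet(); // another transaction commits version 2
    return view.refresh();
  }

  public static void main(String[] args) {
    System.out.println(versionAfterExternalCommit(SketchIsolation.READ_COMMITTED));
    System.out.println(versionAfterExternalCommit(SketchIsolation.SERIALIZABLE));
  }
}
```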

##########
File path: api/src/main/java/org/apache/iceberg/catalog/SupportsCatalogTransactions.java
##########
@@ -0,0 +1,162 @@
+    /**
+     * A commit will only succeed if there have been no meaningful changes to data read during
+     * the course of this transaction prior to commit. This imposes stricter read guarantees than

Review comment:
       This is defined in more detail further down in this paragraph. This is a
simplified introductory sentence to help guide users:
   > any entities read during this transaction are also blocked from changing
   > (via another transaction) post-read in ways that would influence the writes
   > of this operation

##########
File path: api/src/main/java/org/apache/iceberg/catalog/SupportsCatalogTransactions.java
##########
@@ -0,0 +1,162 @@
+  /**
+   * The level of isolation for a catalog-level transaction.
+   *
+   * <p>Isolation covers both what data is read and what data can be written.
+   *
+   * <p>At all levels, data is only visible if it is either committed by another transaction or
+   * committed by a nested transaction within this catalog-level transaction.
+   *
+   * <p>Individual nested Table transactions may be "rebased" to expose updated versions of a
+   * table if the isolation level allows that behavior.
+   *
+   * <p>In the definitions of each isolation level, the concept of conflicting writes is

Review comment:
       It's more complex than that, and I think different implementations will
do it differently, which is why I state that this will be implementation
dependent. The types of legal conflict resolution change depending on the
isolation level and on how much work implementers want to put into things. My
expectation is that initially the conflict resolution will be fairly minimal
and people will mostly use serializable, which basically disallows conflict
resolution. In time, as we can reliably distinguish open-for-modify from
open-for-scan, we will be able to get more sophisticated, but I don't want to
set such a high bar at the API to begin with that no one can (or is willing to)
implement the interface.
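
To make the coarse versus finer-grained distinction concrete, here is a minimal sketch assuming a rule in which two concurrent appends to the same table are treated as a "safe conflict". `SketchOp` and `SketchConflicts` are hypothetical names, and the rule is only one example of what an implementation might choose.

```java
// Hypothetical operation kinds; a real implementation may distinguish many more.
enum SketchOp { APPEND, OVERWRITE, DELETE }

final class SketchConflicts {
  private SketchConflicts() {
  }

  // Coarsest rule: any two concurrent mutations to the same table conflict.
  static boolean coarseConflict(String tableA, String tableB) {
    return tableA.equals(tableB);
  }

  // Finer-grained rule (an assumption for illustration): two appends to the same
  // table are resolved automatically, every other same-table pair still conflicts.
  static boolean fineConflict(String tableA, SketchOp opA, String tableB, SketchOp opB) {
    if (!tableA.equals(tableB)) {
      return false;
    }
    boolean bothAppends = opA == SketchOp.APPEND && opB == SketchOp.APPEND;
    return !bothAppends;
  }

  public static void main(String[] args) {
    System.out.println(coarseConflict("db.events", "db.events"));
    System.out.println(fineConflict("db.events", SketchOp.APPEND, "db.events", SketchOp.APPEND));
    System.out.println(fineConflict("db.events", SketchOp.APPEND, "db.events", SketchOp.DELETE));
  }
}
```

An implementation that starts with the coarse rule can later adopt finer rules without changing the interface, which matches the low initial bar described above.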

##########
File path: api/src/main/java/org/apache/iceberg/catalog/SupportsCatalogTransactions.java
##########
@@ -0,0 +1,162 @@
+    /**
+     * Reading the same table multiple times may result in different versions read of the same

Review comment:
       All isolation level concepts apply within the same transaction, so yes.

##########
File path: api/src/main/java/org/apache/iceberg/catalog/SupportsCatalogTransactions.java
##########
@@ -0,0 +1,162 @@
+    /**
+     * Reading the same table multiple times will result in the same view of that table.
+     * Different tables may come from different snapshots. A commit can be completed as
+     * long as any tables changed externally do not conflict with any writes within this
+     * transaction.
+     */
+    REPEATED_READ,

Review comment:
       Agreed, will update.






