mchades commented on code in PR #9127:
URL: https://github.com/apache/gravitino/pull/9127#discussion_r2745465044
##########
catalogs/catalog-lakehouse-generic/src/main/java/org/apache/gravitino/catalog/lakehouse/lance/LanceTableOperations.java:
##########
@@ -332,11 +331,48 @@ private org.apache.arrow.vector.types.pojo.Schema
convertColumnsToArrowSchema(Co
return new org.apache.arrow.vector.types.pojo.Schema(fields);
}
- private void addLanceIndex(Table table, List<Index> addedIndexes) {
+ // Note: this method can't guarantee the atomicity of the operations on
Lance dataset. For
+ // example, only a subset of changes may be applied if an exception occurs
during the process.
+ /**
+ * Handle the table changes on the underlying Lance dataset.
+ *
+ * @param table the table to be altered
+ * @param changes the changes to be applied
+ * @return the new version id of the Lance dataset after applying the changes
+ */
+ private long handleLanceTableChange(Table table, TableChange[] changes) {
+ List<String> dropColumns = Lists.newArrayList();
+ List<Index> indexToAdd = Lists.newArrayList();
+ List<ColumnAlteration> renameColumns = Lists.newArrayList();
Review Comment:
small suggestion: align the parameter names like `columnsToDrop`,
`indexesToAdd`, `columnsToRename`
##########
catalogs/catalog-lakehouse-generic/src/main/java/org/apache/gravitino/catalog/lakehouse/lance/LanceTableOperations.java:
##########
@@ -332,11 +331,48 @@ private org.apache.arrow.vector.types.pojo.Schema
convertColumnsToArrowSchema(Co
return new org.apache.arrow.vector.types.pojo.Schema(fields);
}
- private void addLanceIndex(Table table, List<Index> addedIndexes) {
+ // Note: this method can't guarantee the atomicity of the operations on
Lance dataset. For
+ // example, only a subset of changes may be applied if an exception occurs
during the process.
+ /**
+ * Handle the table changes on the underlying Lance dataset.
+ *
+ * @param table the table to be altered
+ * @param changes the changes to be applied
+ * @return the new version id of the Lance dataset after applying the changes
+ */
+ private long handleLanceTableChange(Table table, TableChange[] changes) {
+ List<String> dropColumns = Lists.newArrayList();
+ List<Index> indexToAdd = Lists.newArrayList();
+ List<ColumnAlteration> renameColumns = Lists.newArrayList();
+
+ for (TableChange change : changes) {
+ if (change instanceof TableChange.DeleteColumn deleteColumn) {
+ dropColumns.add(String.join(".", deleteColumn.fieldName()));
+ } else if (change instanceof TableChange.AddIndex addIndex) {
+ indexToAdd.add(
+ Indexes.IndexImpl.builder()
+ .withIndexType(addIndex.getType())
+ .withName(addIndex.getName())
+ .withFieldNames(addIndex.getFieldNames())
+ .build());
+ } else if (change instanceof TableChange.RenameColumn renameColumn) {
+ // Currently, only renaming columns is supported.
+ // TODO: Support change column type once we have a clear knowledge
about the means of
+ // castTo in Lance.
Review Comment:
The comment here seems unnecessary because it is `renameColumn`, not
`updateDataType`
##########
catalogs/catalog-lakehouse-generic/src/main/java/org/apache/gravitino/catalog/lakehouse/lance/LanceTableOperations.java:
##########
@@ -332,11 +331,48 @@ private org.apache.arrow.vector.types.pojo.Schema
convertColumnsToArrowSchema(Co
return new org.apache.arrow.vector.types.pojo.Schema(fields);
}
- private void addLanceIndex(Table table, List<Index> addedIndexes) {
+ // Note: this method can't guarantee the atomicity of the operations on
Lance dataset. For
+ // example, only a subset of changes may be applied if an exception occurs
during the process.
Review Comment:
this note can also move the the java doc
##########
catalogs/catalog-lakehouse-generic/src/main/java/org/apache/gravitino/catalog/lakehouse/lance/LanceTableOperations.java:
##########
@@ -161,31 +164,27 @@ public Table createTable(
@Override
public Table alterTable(NameIdentifier ident, TableChange... changes)
throws NoSuchSchemaException, TableAlreadyExistsException {
- // Lance only supports adding indexes for now.
- boolean onlyAddIndex =
- Arrays.stream(changes).allMatch(change -> change instanceof
TableChange.AddIndex);
- Preconditions.checkArgument(onlyAddIndex, "Only adding indexes is
supported for Lance tables");
-
- List<Index> addedIndexes =
- Arrays.stream(changes)
- .filter(change -> change instanceof TableChange.AddIndex)
- .map(
- change -> {
- TableChange.AddIndex addIndexChange = (TableChange.AddIndex)
change;
- return Indexes.IndexImpl.builder()
- .withIndexType(addIndexChange.getType())
- .withName(addIndexChange.getName())
- .withFieldNames(addIndexChange.getFieldNames())
- .build();
- })
- .collect(Collectors.toList());
Table loadedTable = super.loadTable(ident);
- addLanceIndex(loadedTable, addedIndexes);
- // After adding the index to the Lance dataset, we need to update the
table metadata in
+ long version = handleLanceTableChange(loadedTable, changes);
+ // After making changes to the Lance dataset, we need to update the table
metadata in
// Gravitino. If there's any failure during this process, the code will
throw an exception
// and the update won't be applied in Gravitino.
- return super.alterTable(ident, changes);
+ GenericTable table = (GenericTable) super.alterTable(ident, changes);
+ Map<String, String> updatedProperties = new
java.util.HashMap<>(table.properties());
Review Comment:
plz remove the package name
##########
catalogs/catalog-lakehouse-generic/src/main/java/org/apache/gravitino/catalog/lakehouse/lance/LanceTableOperations.java:
##########
@@ -349,9 +385,21 @@ private void addLanceIndex(Table table, List<Index>
addedIndexes) {
indexParams,
true);
}
+
+ if (!dropColumns.isEmpty()) {
+ dataset.dropColumns(dropColumns);
+ }
+
+ if (!renameColumns.isEmpty()) {
Review Comment:
Should we consider the execution order of `DROP` and `ALTER`?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]