Copilot commented on code in PR #6267:
URL: https://github.com/apache/hive/pull/6267#discussion_r2700776326
##########
ql/src/java/org/apache/hadoop/hive/ql/ddl/database/create/CreateDatabaseAnalyzer.java:
##########
@@ -51,10 +52,9 @@ public CreateDatabaseAnalyzer(QueryState queryState) throws
SemanticException {
@Override
public void analyzeInternal(ASTNode root) throws SemanticException {
Pair<String, String> catDbNamePair = DDLUtils.getCatDbNamePair((ASTNode)
root.getChild(0));
- String catalogName = catDbNamePair.getLeft();
- if (catalogName != null && getCatalog(catalogName) == null) {
- throw new SemanticException(ErrorMsg.CATALOG_NOT_EXISTS, catalogName);
- }
+ String catalogName = Optional.ofNullable(catDbNamePair.getLeft())
+ .orElse(HiveUtils.getCurrentCatalogOrDefault(conf));
+
Review Comment:
The removed catalog existence validation (checking if getCatalog returns
null) has been replaced with simply using the current catalog as default when
catalog name is not specified. However, this removes an important validation
step that would catch attempts to create databases in non-existent catalogs
early in the analysis phase. Consider adding validation to verify that the
catalog exists before proceeding with database creation, either here or in a
later stage.
```suggestion
Object catalog = getCatalog(catalogName);
if (catalog == null) {
throw new SemanticException("Catalog " + catalogName + " does not
exist");
}
```
##########
ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java:
##########
@@ -388,14 +389,34 @@ public static void createDbExportDump(FileSystem fs, Path
metadataPath, Database
private static void updateIfCustomDbLocations(Database database,
Configuration conf) throws SemanticException {
try {
- String whLocatoion = MetastoreConf.getVar(conf,
MetastoreConf.ConfVars.WAREHOUSE_EXTERNAL,
- MetastoreConf.getVar(conf, MetastoreConf.ConfVars.WAREHOUSE));
- Path dbDerivedLoc = new Path(whLocatoion,
database.getName().toLowerCase() + DATABASE_PATH_SUFFIX);
+ String catName = database.getCatalogName();
+ String dbName = database.getName().toLowerCase();
+ boolean isDefaultCatalog =
Warehouse.DEFAULT_CATALOG_NAME.equals(catName);
+
+ // external warehouse root
+ String whLocation = MetastoreConf.getVar(conf,
+ isDefaultCatalog ? MetastoreConf.ConfVars.WAREHOUSE_EXTERNAL :
MetastoreConf.ConfVars.WAREHOUSE_CATALOG_EXTERNAL,
+ MetastoreConf.getVar(conf,
+ isDefaultCatalog ? MetastoreConf.ConfVars.WAREHOUSE :
MetastoreConf.ConfVars.WAREHOUSE_CATALOG));
+
+ if (!isDefaultCatalog) {
+ whLocation += "/" + catName;
+ }
+
+ Path dbDerivedLoc = new Path(whLocation, dbName + DATABASE_PATH_SUFFIX);
String defaultDbLoc = Utilities.getQualifiedPath((HiveConf) conf,
dbDerivedLoc);
database.putToParameters(ReplConst.REPL_IS_CUSTOM_DB_LOC,
Boolean.toString(!defaultDbLoc.equals(database.getLocationUri())));
- String whManagedLocatoion = MetastoreConf.getVar(conf,
MetastoreConf.ConfVars.WAREHOUSE);
- Path dbDerivedManagedLoc = new Path(whManagedLocatoion,
database.getName().toLowerCase()
+
+ // managed warehouse root
+ String whManagedLocatoion = MetastoreConf.getVar(conf,
+ isDefaultCatalog ? MetastoreConf.ConfVars.WAREHOUSE
+ : MetastoreConf.ConfVars.WAREHOUSE_CATALOG);
+
+ if (!isDefaultCatalog) {
+ whManagedLocatoion += "/" + catName;
Review Comment:
String concatenation is used to build paths instead of using the Path API.
Replace 'whManagedLocatoion += "/" + catName' with proper Path usage:
'whManagedLocatoion = new Path(whManagedLocatoion, catName).toString()' for
consistency and correct path separator handling.
##########
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##########
@@ -922,6 +922,15 @@ public static enum ConfVars {
"Default location for external tables created in the warehouse. " +
"If not set or null, then the normal warehouse location will be used
as the default location."),
+ METASTORE_CATALOG_WAREHOUSE("hive.metastore.warehouse.catalog.dir",
"/user/hive/catalog/warehouse",
+ "location of default database for the warehouse. If the name of
the catalog to which the db belongs is testcat," +
+ "then the default path prefix for the db is
/user/hive/catalog/warehouse/testcat"),
Review Comment:
The description reads 'location of default database for the warehouse' which
is identical to the WAREHOUSE configuration description. This should be
clarified to specify that this is the base location for databases in
non-default catalogs. Consider rewording to: 'Base location for databases in
non-default catalogs. The actual database path will be
{this_value}/{catalog_name}/{database_name}.db'
```suggestion
"Base location for databases in non-default catalogs. " +
"The actual database path will be
{this_value}/{catalog_name}/{database_name}.db"),
```
##########
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java:
##########
@@ -1783,6 +1783,15 @@ public enum ConfVars {
"hive.metastore.warehouse.external.dir", "",
"Default location for external tables created in the warehouse. " +
"If not set or null, then the normal warehouse location will be used
as the default location."),
+ WAREHOUSE_CATALOG("metastore.warehouse.catalog.dir",
+ "hive.metastore.warehouse.catalog.dir",
"/user/hive/catalog/warehouse",
+ "location of default database for the warehouse. If the name of
the catalog to which the db belongs is testcat," +
+ "then the default path prefix for the db is
/user/hive/catalog/warehouse/testcat."),
Review Comment:
The documentation for WAREHOUSE_CATALOG is missing a space after the period
in the description. The text 'If the name of the catalog to which the db
belongs is testcat,then the default path' should be 'If the name of the catalog
to which the db belongs is testcat, then the default path' (note the space
after the comma).
```suggestion
" then the default path prefix for the db is
/user/hive/catalog/warehouse/testcat."),
```
##########
ql/src/java/org/apache/hadoop/hive/ql/ddl/catalog/create/CreateCatalogAnalyzer.java:
##########
@@ -67,10 +72,23 @@ public void analyzeInternal(ASTNode root) throws
SemanticException {
}
}
+ assert props != null;
+ checkCatalogType(props);
+
CreateCatalogDesc desc = new CreateCatalogDesc(catalogName, comment,
locationUrl, ifNotExists, props);
Catalog catalog = new Catalog(catalogName, locationUrl);
rootTasks.add(TaskFactory.get(new DDLWork(getInputs(), getOutputs(),
desc)));
outputs.add(new WriteEntity(catalog, WriteEntity.WriteType.DDL_NO_LOCK));
}
+
+ private static void checkCatalogType(Map<String, String> props) throws
SemanticException {
+ String catalogType = props.get("type");
+ if (Strings.isNullOrEmpty(catalogType)) {
+ throw new SemanticException("'type' can not be null or empty");
+ }
+ if (!CatalogUtil.isValidCatalogType(catalogType)) {
+ throw new SemanticException(String.format("type '%s' is not valid",
catalogType));
Review Comment:
The CatalogUtil.isValidCatalogType method converts the input to uppercase
before validation, but the test queries use both 'NATIVE' and 'native'. While
this works correctly, it would be clearer to document in the checkCatalogType
method that the type parameter is case-insensitive. Additionally, consider
whether 'ICEBERG' catalog type requires any special handling or validation
beyond just being a valid enum value.
```suggestion
/**
* Validates the mandatory catalog {@code type} property.
* <p>
* The value of {@code type} is validated in a case-insensitive manner.
This method
* delegates to {@link CatalogUtil#isValidCatalogType(String)}, which
normalizes the
* supplied value to upper case before checking it against the supported
catalog types
* (for example, {@code NATIVE}, {@code HIVE}, {@code ICEBERG}, etc.).
* <p>
* At present, catalog types such as {@code ICEBERG} do not receive any
additional
* validation in this analyzer beyond being recognized as a valid catalog
type. Any
* ICEBERG-specific configuration or validation must be performed
elsewhere.
*/
private static void checkCatalogType(Map<String, String> props) throws
SemanticException {
String catalogType = props.get("type");
if (Strings.isNullOrEmpty(catalogType)) {
throw new SemanticException("'type' can not be null or empty");
}
if (!CatalogUtil.isValidCatalogType(catalogType)) {
// Note: validation is case-insensitive; values are normalized inside
CatalogUtil.
throw new SemanticException(String.format(
"type '%s' is not valid (catalog type comparison is
case-insensitive)", catalogType));
```
##########
common/src/java/org/apache/hadoop/hive/conf/HiveConf.java:
##########
@@ -922,6 +922,15 @@ public static enum ConfVars {
"Default location for external tables created in the warehouse. " +
"If not set or null, then the normal warehouse location will be used
as the default location."),
+ METASTORE_CATALOG_WAREHOUSE("hive.metastore.warehouse.catalog.dir",
"/user/hive/catalog/warehouse",
+ "location of default database for the warehouse. If the name of
the catalog to which the db belongs is testcat," +
+ "then the default path prefix for the db is
/user/hive/catalog/warehouse/testcat"),
Review Comment:
The documentation for METASTORE_CATALOG_WAREHOUSE has the same spacing
issue. The text 'If the name of the catalog to which the db belongs is
testcat,then the default path' should be 'If the name of the catalog to which
the db belongs is testcat, then the default path' (note the space after the
comma).
```suggestion
" then the default path prefix for the db is
/user/hive/catalog/warehouse/testcat"),
```
##########
ql/src/java/org/apache/hadoop/hive/ql/parse/EximUtil.java:
##########
@@ -388,14 +389,34 @@ public static void createDbExportDump(FileSystem fs, Path
metadataPath, Database
private static void updateIfCustomDbLocations(Database database,
Configuration conf) throws SemanticException {
try {
- String whLocatoion = MetastoreConf.getVar(conf,
MetastoreConf.ConfVars.WAREHOUSE_EXTERNAL,
- MetastoreConf.getVar(conf, MetastoreConf.ConfVars.WAREHOUSE));
- Path dbDerivedLoc = new Path(whLocatoion,
database.getName().toLowerCase() + DATABASE_PATH_SUFFIX);
+ String catName = database.getCatalogName();
+ String dbName = database.getName().toLowerCase();
+ boolean isDefaultCatalog =
Warehouse.DEFAULT_CATALOG_NAME.equals(catName);
+
+ // external warehouse root
+ String whLocation = MetastoreConf.getVar(conf,
+ isDefaultCatalog ? MetastoreConf.ConfVars.WAREHOUSE_EXTERNAL :
MetastoreConf.ConfVars.WAREHOUSE_CATALOG_EXTERNAL,
+ MetastoreConf.getVar(conf,
+ isDefaultCatalog ? MetastoreConf.ConfVars.WAREHOUSE :
MetastoreConf.ConfVars.WAREHOUSE_CATALOG));
+
+ if (!isDefaultCatalog) {
+ whLocation += "/" + catName;
Review Comment:
String concatenation is used to build paths instead of using the Path API.
Replace 'whLocation += "/" + catName' with proper Path usage: 'whLocation = new
Path(whLocation, catName).toString()' for consistency and correct path
separator handling.
##########
parser/src/java/org/apache/hadoop/hive/ql/parse/HiveParser.g:
##########
@@ -1131,10 +1131,10 @@ createCatalogStatement
: KW_CREATE KW_CATALOG
ifNotExists?
name=identifier
- catLocation
+ catLocation?
catalogComment?
- (KW_PROPERTIES catprops=properties)?
- -> ^(TOK_CREATECATALOG $name catLocation ifNotExists? catalogComment?
$catprops?)
+ KW_PROPERTIES catprops=properties
+ -> ^(TOK_CREATECATALOG $name catLocation? ifNotExists? catalogComment?
$catprops)
Review Comment:
In the grammar change, the PROPERTIES clause is now mandatory (no '?' after
'KW_PROPERTIES catprops=properties'). However, this is a breaking change from
the previous grammar where properties were optional. This means existing CREATE
CATALOG statements without PROPERTIES will now fail to parse. Consider whether
this breaking change is intentional or if PROPERTIES should remain optional
with validation added in the analyzer to ensure the 'type' property is present.
```suggestion
(KW_PROPERTIES catprops=properties)?
-> ^(TOK_CREATECATALOG $name catLocation? ifNotExists? catalogComment?
$catprops?)
```
##########
standalone-metastore/metastore-common/src/main/java/org/apache/hadoop/hive/metastore/conf/MetastoreConf.java:
##########
@@ -1783,6 +1783,15 @@ public enum ConfVars {
"hive.metastore.warehouse.external.dir", "",
"Default location for external tables created in the warehouse. " +
"If not set or null, then the normal warehouse location will be used
as the default location."),
+ WAREHOUSE_CATALOG("metastore.warehouse.catalog.dir",
+ "hive.metastore.warehouse.catalog.dir",
"/user/hive/catalog/warehouse",
+ "location of default database for the warehouse. If the name of
the catalog to which the db belongs is testcat," +
+ "then the default path prefix for the db is
/user/hive/catalog/warehouse/testcat."),
Review Comment:
The description reads 'location of default database for the warehouse' which
is the same as the WAREHOUSE configuration. This should be clarified to specify
that this is the base location for databases in non-default catalogs. Consider
rewording to: 'Base location for databases in non-default catalogs. The actual
database path will be {this_value}/{catalog_name}/{database_name}.db'
```suggestion
"Base location for databases in non-default catalogs. " +
"The actual database path will be
{this_value}/{catalog_name}/{database_name}.db"),
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]