lirui-apache commented on a change in pull request #13434:
URL: https://github.com/apache/flink/pull/13434#discussion_r495519793
##########
File path:
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/util/HiveTableUtil.java
##########
@@ -421,6 +424,31 @@ public static void checkAcidTable(CatalogTable catalogTable, ObjectPath tablePat
 		}
 	}
+	/**
+	 * Returns a new Hadoop Configuration object using the path to the hadoop conf configured.
+	 *
+	 * @param hadoopConfDir Hadoop conf directory path.
+	 * @return A Hadoop configuration instance.
+	 */
+	public static Configuration getHadoopConfiguration(String hadoopConfDir) {
+		Configuration hadoopConfiguration = new Configuration();
+		if (new File(hadoopConfDir).exists()) {
+			if (new File(hadoopConfDir + "/core-site.xml").exists()) {
+				hadoopConfiguration.addResource(new Path(hadoopConfDir + "/core-site.xml"));
+			}
+			if (new File(hadoopConfDir + "/hdfs-default.xml").exists()) {
Review comment:
Why do we need hdfs-default.xml?
##########
File path:
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/util/HiveTableUtil.java
##########
@@ -421,6 +424,31 @@ public static void checkAcidTable(CatalogTable catalogTable, ObjectPath tablePat
 		}
 	}
+	/**
+	 * Returns a new Hadoop Configuration object using the path to the hadoop conf configured.
+	 *
+	 * @param hadoopConfDir Hadoop conf directory path.
+	 * @return A Hadoop configuration instance.
+	 */
+	public static Configuration getHadoopConfiguration(String hadoopConfDir) {
+		Configuration hadoopConfiguration = new Configuration();
+		if (new File(hadoopConfDir).exists()) {
+			if (new File(hadoopConfDir + "/core-site.xml").exists()) {
+				hadoopConfiguration.addResource(new Path(hadoopConfDir + "/core-site.xml"));
+			}
+			if (new File(hadoopConfDir + "/hdfs-default.xml").exists()) {
+				hadoopConfiguration.addResource(new Path(hadoopConfDir + "/hdfs-default.xml"));
+			}
+			if (new File(hadoopConfDir + "/hdfs-site.xml").exists()) {
+				hadoopConfiguration.addResource(new Path(hadoopConfDir + "/hdfs-site.xml"));
+			}
+			if (new File(hadoopConfDir + "/mapred-site.xml").exists()) {
Review comment:
We also need yarn-site.xml, which is required to generate Parquet splits in a kerberized environment.
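For illustration, the repeated per-file checks in the hunk above could be folded into a loop over the expected file names, with `yarn-site.xml` added to the list. This is only a sketch using plain `java.io.File`: the class `HadoopConfFiles` and helper `collectConfFiles` are hypothetical names, and the real code would call `hadoopConfiguration.addResource(new Path(...))` for each file found.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class HadoopConfFiles {

    // Config files to load, in order; yarn-site.xml included per this review comment.
    static final String[] CONF_FILES = {
            "core-site.xml", "hdfs-site.xml", "mapred-site.xml", "yarn-site.xml"};

    // Returns the subset of CONF_FILES that exist under hadoopConfDir.
    // In the real code, each hit would become an addResource() call.
    static List<File> collectConfFiles(String hadoopConfDir) {
        List<File> found = new ArrayList<>();
        File dir = new File(hadoopConfDir);
        if (dir.isDirectory()) {
            for (String name : CONF_FILES) {
                File f = new File(dir, name);
                if (f.exists()) {
                    found.add(f);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) throws Exception {
        // Demo with a throwaway directory containing two of the expected files.
        File dir = java.nio.file.Files.createTempDirectory("hadoop-conf").toFile();
        new File(dir, "core-site.xml").createNewFile();
        new File(dir, "yarn-site.xml").createNewFile();
        for (File f : collectConfFiles(dir.getAbsolutePath())) {
            System.out.println(f.getName());
        }
    }
}
```

Keeping the file list in one constant also makes it obvious which files are (and are not) loaded, which is exactly the question raised about `hdfs-default.xml` above.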
##########
File path:
flink-connectors/flink-connector-hive/src/main/java/org/apache/flink/table/catalog/hive/HiveCatalog.java
##########
@@ -196,15 +204,21 @@ private static HiveConf createHiveConf(@Nullable String hiveConfDir) {
 		}
 		// create HiveConf from hadoop configuration
-		Configuration hadoopConf = HadoopUtils.getHadoopConfiguration(new org.apache.flink.configuration.Configuration());
+		Configuration hadoopConf;
-		// Add mapred-site.xml. We need to read configurations like compression codec.
-		for (String possibleHadoopConfPath : HadoopUtils.possibleHadoopConfPaths(new org.apache.flink.configuration.Configuration())) {
-			File mapredSite = new File(new File(possibleHadoopConfPath), "mapred-site.xml");
-			if (mapredSite.exists()) {
-				hadoopConf.addResource(new Path(mapredSite.getAbsolutePath()));
-				break;
+		if (isNullOrWhitespaceOnly(hadoopConfDir)) {
+			hadoopConf = HadoopUtils.getHadoopConfiguration(new org.apache.flink.configuration.Configuration());
Review comment:
Let's not rely on `HadoopUtils.getHadoopConfiguration` to get the
configuration. We can just call `HadoopUtils.possibleHadoopConfPaths` to get
the candidate paths and load the files ourselves, and the loading logic should
be consistent with `HiveTableUtil.getHadoopConfiguration`.
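A rough, JDK-only sketch of that suggestion follows. The class `ConfDirScanner` and method `firstDirWithConf` are hypothetical names: the real code would iterate the directories returned by `HadoopUtils.possibleHadoopConfPaths` and call `addResource()` on each hit, with the same file list as `HiveTableUtil.getHadoopConfiguration` so both call sites stay consistent.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ConfDirScanner {

    // Same file list as the HiveTableUtil side, to keep the two code paths consistent.
    static final String[] CONF_FILES = {
            "core-site.xml", "hdfs-site.xml", "mapred-site.xml", "yarn-site.xml"};

    // Walks the candidate directories (as possibleHadoopConfPaths would supply them)
    // and returns the conf files found in the first directory that contains any.
    static List<File> firstDirWithConf(List<String> candidateDirs) {
        for (String dir : candidateDirs) {
            List<File> found = new ArrayList<>();
            for (String name : CONF_FILES) {
                File f = new File(dir, name);
                if (f.exists()) {
                    found.add(f);
                }
            }
            if (!found.isEmpty()) {
                return found; // stop at the first directory that has any config
            }
        }
        return new ArrayList<>();
    }

    public static void main(String[] args) throws IOException {
        // Demo: an empty candidate directory is skipped in favor of a populated one.
        File empty = Files.createTempDirectory("empty-conf").toFile();
        File real = Files.createTempDirectory("real-conf").toFile();
        new File(real, "core-site.xml").createNewFile();
        List<File> found = firstDirWithConf(
                Arrays.asList(empty.getAbsolutePath(), real.getAbsolutePath()));
        for (File f : found) {
            System.out.println(f.getName());
        }
    }
}
```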
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]