Hi Qian,

Thanks for your questions.

For (1) -> The Spark properties are currently picked up from SPARK_CONF_DIR, so if you define all these configs in the spark-defaults.conf file there, HDFSParquetImporter will pick them up.

For (2) -> The HiveSyncTool now has a useJDBC flag that lets you use either JDBC or the HiveMetastoreClient. If you provide the right URL to connect to the metastore (essentially, provide the Hive Kerberos principal) and set useJDBC to false, you will be able to talk to the Hive Metastore via the HiveMetastoreClient.

Thanks,
Nishith

On Sun, Oct 6, 2019 at 9:15 AM Qian Wang <[email protected]> wrote:

> Hi,
>
> I have some questions while trying to use Hudi in my company's prod env:
>
> 1. When I migrate history tables in HDFS, I tried the hudi-cli and the
> HDFSParquetImporter tool. How can I specify Spark parameters in this
> tool, such as the YARN queue, etc.?
> 2. Hudi needs to write metadata to Hive, and it uses HiveMetastoreClient
> and Hive JDBC. What can I do if Hive has Kerberos authentication?
>
> Thanks.
>
> Best,
> Qian
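For (1), a minimal sketch of what such a spark-defaults.conf under SPARK_CONF_DIR might look like; the queue name and resource sizes below are illustrative placeholders, not recommendations:

```
# $SPARK_CONF_DIR/spark-defaults.conf -- picked up by Spark-based tools
# such as HDFSParquetImporter (values are illustrative)
spark.master           yarn
spark.yarn.queue       my_prod_queue
spark.executor.memory  4g
spark.executor.cores   2
```

Any property you would otherwise pass with --conf on spark-submit can be set here once and reused by the tool.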
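For (2), a sketch of the metastore-side Kerberos settings the HiveMetastoreClient path relies on. These are standard Hive properties placed in the hive-site.xml on the client's classpath; the host and principal values are illustrative, and you would additionally set useJDBC to false when invoking HiveSyncTool (check the exact flag spelling against your Hudi version):

```
<!-- hive-site.xml: metastore Kerberos settings (values illustrative) -->
<property>
  <name>hive.metastore.uris</name>
  <value>thrift://metastore-host:9083</value>
</property>
<property>
  <name>hive.metastore.sasl.enabled</name>
  <value>true</value>
</property>
<property>
  <name>hive.metastore.kerberos.principal</name>
  <value>hive/_HOST@EXAMPLE.COM</value>
</property>
```

With these in place (and a valid Kerberos ticket, e.g. via kinit or a keytab), the HiveMetastoreClient authenticates to the secured metastore directly, without going through HiveServer2 JDBC.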
