Hi Qian,

Thanks for your questions.

For (1) -> The Spark properties are currently picked up from
SPARK_CONF_DIR, so if you define all of these configs in the
spark-defaults.conf file there, HDFSParquetImporter will pick them up.
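
For example, to run HDFSParquetImporter on a specific Yarn queue, you
could put something like this in $SPARK_CONF_DIR/spark-defaults.conf
(the queue name and sizes below are just placeholders for your cluster):

    spark.master                yarn
    spark.yarn.queue            your_prod_queue
    spark.executor.memory       4g
    spark.executor.instances    10

Anything you would normally pass to spark-submit via --conf can be set
the same way.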

For (2) -> The HiveSyncTool now has a useJDBC flag that lets you choose
between JDBC and the HiveMetastoreClient. If you provide the right URL to
connect to the metastore (essentially, provide the Hive Kerberos
principal) and set useJDBC to false, you will be able to talk to the Hive
metastore via the HiveMetastoreClient.
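
As a rough sketch (host, port, and principal are placeholders; check the
hive-site.xml on your cluster for the real values), a Kerberized
metastore connection would need entries like:

    <property>
      <name>hive.metastore.uris</name>
      <value>thrift://metastore-host:9083</value>
    </property>
    <property>
      <name>hive.metastore.sasl.enabled</name>
      <value>true</value>
    </property>
    <property>
      <name>hive.metastore.kerberos.principal</name>
      <value>hive/_HOST@YOUR_REALM</value>
    </property>

With useJDBC set to false, the sync goes through HiveMetastoreClient,
which picks up these settings, assuming the process already has a valid
Kerberos ticket (via kinit or a keytab login).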

Thanks,
Nishith

On Sun, Oct 6, 2019 at 9:15 AM Qian Wang <[email protected]> wrote:

> Hi,
>
> I have some questions when I try to use Hudi in my company’s prod env:
>
> 1. When migrating history tables in HDFS, I tried using hudi-cli and the
> HDFSParquetImporter tool. How can I specify Spark parameters for this
> tool, such as the Yarn queue, etc.?
> 2. Hudi needs to write metadata to Hive, and it uses HiveMetastoreClient
> and Hive JDBC. What should I do if Hive has Kerberos authentication
> enabled?
>
> Thanks.
>
> Best,
> Qian
>
