Hi,

Can someone guide me or share some documentation on how to use
HiveIncrementalPuller? I have already gone through the documentation at
https://hudi.apache.org/querying_data.html. I tried running the puller with
the command below and hit the exception shown after it.

Any leads are appreciated.

Command -
spark-submit --name incremental-puller --queue etl \
  --files incremental_sql.txt \
  --master yarn --deploy-mode cluster \
  --driver-memory 4g --executor-memory 4g --num-executors 2 \
  --class org.apache.hudi.utilities.HiveIncrementalPuller \
  hudi-utilities-bundle-0.5.1-SNAPSHOT.jar \
  --hiveUrl jdbc:hive2://HOST:PORT/ --hiveUser <user> --hivePass <pass> \
  --extractSQLFile incremental_sql.txt \
  --sourceDb <source_db> --sourceTable <src_table> \
  --targetDb tmp --targetTable tempTable \
  --fromCommitTime 0 --maxCommits 1

Error -

java.lang.NullPointerException
    at org.apache.hudi.common.util.FileIOUtils.copy(FileIOUtils.java:73)
    at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:66)
    at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:61)
    at org.apache.hudi.utilities.HiveIncrementalPuller.<init>(HiveIncrementalPuller.java:113)
    at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:343)
