Hi Vinoth,

Sorry, my bad. I did not realise earlier that Spark is not needed for this class. I tried running it with the below command and got the mentioned exception:
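As background on the NPE being discussed: `Class.getResourceAsStream` returns null (rather than throwing) when the named resource is not visible on the classpath, and the null stream is then dereferenced by whatever reads it. A minimal JDK-only sketch of guarding against that — the helper name `openTemplate` is hypothetical, not Hudi's API:

```java
import java.io.InputStream;

public class ResourceCheck {

    // Opens a classpath resource, failing loudly with a descriptive error
    // instead of returning null and letting a later read throw an NPE.
    // Note: a relative name is resolved against the package of `cls`.
    static InputStream openTemplate(Class<?> cls, String name) {
        InputStream in = cls.getResourceAsStream(name);
        if (in == null) {
            throw new IllegalStateException(
                "Resource '" + name + "' not found on the classpath (looked up via "
                + cls.getName() + ")");
        }
        return in;
    }

    public static void main(String[] args) {
        // "IncrementalPull.sqltemplate" is the resource name from the thread;
        // it is not bundled with this sketch, so the lookup fails by design.
        try {
            openTemplate(ResourceCheck.class, "IncrementalPull.sqltemplate");
            System.out.println("found");
        } catch (IllegalStateException e) {
            System.out.println("missing: " + e.getMessage());
        }
    }
}
```

A check like this would turn the bare `NullPointerException` in the stack trace below into an error message that names the missing template.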
Command -

java -cp /path/to/hive-jdbc-2.3.1.jar:packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar org.apache.hudi.utilities.HiveIncrementalPuller --help

Exception -

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/LogManager
    at org.apache.hudi.utilities.HiveIncrementalPuller.<clinit>(HiveIncrementalPuller.java:64)
Caused by: java.lang.ClassNotFoundException: org.apache.log4j.LogManager
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 1 more

I was able to fix it by including the corresponding jar in the bundle. After fixing the above, I am still getting the NPE even though the template is bundled in the jar.

On Mon, Dec 23, 2019 at 10:45 PM Vinoth Chandar <vin...@apache.org> wrote:

> Hi Pratyaksh,
>
> HiveIncrementalPuller is just a Java program. It does not need Spark, since
> it just runs a HiveQL query remotely.
>
> On the error you specified, it seems like it can't find the template. Can
> you check whether the bundle is missing the template file? Maybe this got
> broken during the bundling changes (since it is no longer part of the
> resources folder of the bundle module). We should also probably throw a
> better error than an NPE.
>
> We can raise a JIRA once you confirm.
>
> String templateContent =
>     FileIOUtils.readAsUTFString(this.getClass().getResourceAsStream("IncrementalPull.sqltemplate"));
>
> On Mon, Dec 23, 2019 at 6:02 AM Pratyaksh Sharma <pratyaks...@gmail.com>
> wrote:
>
> > Hi,
> >
> > Can someone guide me or share some documentation regarding how to use
> > HiveIncrementalPuller? I have already gone through the documentation at
> > https://hudi.apache.org/querying_data.html. I tried running the puller
> > with the below command and am facing the given exception.
> >
> > Any leads are appreciated.
> >
> > Command -
> >
> > spark-submit --name incremental-puller --queue etl --files incremental_sql.txt \
> >   --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 4g \
> >   --num-executors 2 --class org.apache.hudi.utilities.HiveIncrementalPuller \
> >   hudi-utilities-bundle-0.5.1-SNAPSHOT.jar \
> >   --hiveUrl jdbc:hive2://HOST:PORT/ --hiveUser <user> --hivePass <pass> \
> >   --extractSQLFile incremental_sql.txt --sourceDb <source_db> --sourceTable <src_table> \
> >   --targetDb tmp --targetTable tempTable --fromCommitTime 0 --maxCommits 1
> >
> > Error -
> >
> > java.lang.NullPointerException
> >     at org.apache.hudi.common.util.FileIOUtils.copy(FileIOUtils.java:73)
> >     at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:66)
> >     at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:61)
> >     at org.apache.hudi.utilities.HiveIncrementalPuller.<init>(HiveIncrementalPuller.java:113)
> >     at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:343)
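One way to confirm whether the template actually made it into the bundle is to scan the jar's entries directly. A small JDK-only sketch — the jar path and entry name are illustrative, taken from the commands in this thread:

```java
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class BundleCheck {

    // Returns true if any entry in the jar contains the given name fragment.
    static boolean containsEntry(String jarPath, String fragment) throws Exception {
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                if (entries.nextElement().getName().contains(fragment)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Path is illustrative; point it at the actual bundle jar under test.
        String jar = args.length > 0 ? args[0]
            : "packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar";
        System.out.println(containsEntry(jar, "IncrementalPull.sqltemplate")
            ? "template bundled" : "template missing from bundle");
    }
}
```

The same check can be done from the shell with `jar tf <bundle>.jar`; the point is to verify the entry exists at the package path where `getResourceAsStream` will look for it, not just anywhere in the jar.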