Hi Vinoth,

Sorry, my bad. I did not realise earlier that Spark is not needed for this class. I tried running it with the below command and got the mentioned exception:
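As background on the NPE being discussed: `Class.getResourceAsStream` returns null (rather than throwing) when the named resource is not visible on the classpath, and the null stream is then dereferenced by whatever reads it. A minimal JDK-only sketch of guarding against that — the helper name `openTemplate` is hypothetical, not Hudi's API:

```java
import java.io.InputStream;

public class ResourceCheck {

    // Opens a classpath resource, failing loudly with a descriptive error
    // instead of returning null and letting a later read throw an NPE.
    // Note: a relative name is resolved against the package of `cls`.
    static InputStream openTemplate(Class<?> cls, String name) {
        InputStream in = cls.getResourceAsStream(name);
        if (in == null) {
            throw new IllegalStateException(
                "Resource '" + name + "' not found on the classpath (looked up via "
                + cls.getName() + ")");
        }
        return in;
    }

    public static void main(String[] args) {
        // "IncrementalPull.sqltemplate" is the resource name from the thread;
        // it is not bundled with this sketch, so the lookup fails by design.
        try {
            openTemplate(ResourceCheck.class, "IncrementalPull.sqltemplate");
            System.out.println("found");
        } catch (IllegalStateException e) {
            System.out.println("missing: " + e.getMessage());
        }
    }
}
```

A check like this would turn the bare `NullPointerException` in the stack trace below into an error message that names the missing template.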
Command -

java -cp /path/to/hive-jdbc-2.3.1.jar:packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar org.apache.hudi.utilities.HiveIncrementalPuller --help

Exception -

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/LogManager
    at org.apache.hudi.utilities.HiveIncrementalPuller.<clinit>(HiveIncrementalPuller.java:64)
Caused by: java.lang.ClassNotFoundException: org.apache.log4j.LogManager
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 1 more

I was able to fix it by including the corresponding jar in the bundle. After fixing the above, I am still getting the NPE even though the template is bundled in the jar.

On Mon, Dec 23, 2019 at 10:45 PM Vinoth Chandar <vin...@apache.org> wrote:

> Hi Pratyaksh,
>
> HiveIncrementalPuller is just a Java program. It does not need Spark, since
> it just runs a HiveQL query remotely.
>
> On the error you specified, it seems like it can't find the template. Can
> you check whether the bundle is missing the template file? Maybe this got
> broken during the bundling changes (since it is no longer part of the
> resources folder of the bundle module). We should also probably throw a
> better error than an NPE.
>
> We can raise a JIRA once you confirm.
>
> String templateContent =
>     FileIOUtils.readAsUTFString(this.getClass().getResourceAsStream("IncrementalPull.sqltemplate"));
>
> On Mon, Dec 23, 2019 at 6:02 AM Pratyaksh Sharma <pratyaks...@gmail.com>
> wrote:
>
> > Hi,
> >
> > Can someone guide me or share some documentation regarding how to use
> > HiveIncrementalPuller? I have already gone through the documentation at
> > https://hudi.apache.org/querying_data.html. I tried running the puller
> > with the below command and am facing the given exception.
> >
> > Any leads are appreciated.
> >
> > Command -
> >
> > spark-submit --name incremental-puller --queue etl --files incremental_sql.txt \
> >   --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 4g \
> >   --num-executors 2 --class org.apache.hudi.utilities.HiveIncrementalPuller \
> >   hudi-utilities-bundle-0.5.1-SNAPSHOT.jar \
> >   --hiveUrl jdbc:hive2://HOST:PORT/ --hiveUser <user> --hivePass <pass> \
> >   --extractSQLFile incremental_sql.txt --sourceDb <source_db> --sourceTable <src_table> \
> >   --targetDb tmp --targetTable tempTable --fromCommitTime 0 --maxCommits 1
> >
> > Error -
> >
> > java.lang.NullPointerException
> >     at org.apache.hudi.common.util.FileIOUtils.copy(FileIOUtils.java:73)
> >     at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:66)
> >     at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:61)
> >     at org.apache.hudi.utilities.HiveIncrementalPuller.<init>(HiveIncrementalPuller.java:113)
> >     at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:343)
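One way to confirm whether the template actually made it into the bundle is to scan the jar's entries directly. A small JDK-only sketch — the jar path and entry name are illustrative, taken from the commands in this thread:

```java
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

public class BundleCheck {

    // Returns true if any entry in the jar contains the given name fragment.
    static boolean containsEntry(String jarPath, String fragment) throws Exception {
        try (JarFile jar = new JarFile(jarPath)) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                if (entries.nextElement().getName().contains(fragment)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws Exception {
        // Path is illustrative; point it at the actual bundle jar under test.
        String jar = args.length > 0 ? args[0]
            : "packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar";
        System.out.println(containsEntry(jar, "IncrementalPull.sqltemplate")
            ? "template bundled" : "template missing from bundle");
    }
}
```

The same check can be done from the shell with `jar tf <bundle>.jar`; the point is to verify the entry exists at the package path where `getResourceAsStream` will look for it, not just anywhere in the jar.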