Hi @Pratyaksh Sharma,


Okay, all right. BTW, thanks for raising this issue.


best,
lamber-ken


On 01/2/2020 13:47,Pratyaksh Sharma<pratyaks...@gmail.com> wrote:
Hi Lamberken,

I am also trying to fix this issue. Please let us know if you come up with
anything.

On Thu, Jan 2, 2020 at 11:12 AM lamberken <lamber...@163.com> wrote:



Hi @Vinoth,


Got it, thank you for reminding me. I made a mistake earlier.


best,
lamber-ken


On 01/2/2020 13:08,Vinoth Chandar<vin...@apache.org> wrote:
Hi Lamber,

utilities-bundle has always been a fat jar.. I was talking about
hudi-utilities.
Sure. take a swing at it. Happy to help as needed

On Wed, Jan 1, 2020 at 8:57 PM lamberken <lamber...@163.com> wrote:



Hi @Vinoth,


I'm willing to solve this problem. I'm trying to find out from the history
when hudi-utilities-bundle stopped being a fat jar.



Git History
2019-08-29 FAT-JAR ---> 5f9fa82f47e1cc14a22b869250fe23c8f9c033cd
2019-09-14 NOT-FATJAR ---> d2525c31b7dad7bae2d4899d8df2a353ca39af50


best,
lamber-ken


At 2020-01-01 09:15:01, "Vinoth Chandar" <vin...@apache.org> wrote:
This does sound like a fair bit of pain.
I am wondering if it makes sense to change the integ-test setup/docker demo
to use the incremental puller. A bunch of the packaging issues around jars
seem like regressions from hudi-utilities no longer being a fat jar?

If there aren't any takers, I can also try my hand at fixing this once I
get done with a few things on my end. Left a comment on HUDI-485.



On Tue, Dec 31, 2019 at 4:19 PM lamberken <lamber...@163.com> wrote:



Hi @Pratyaksh Sharma,


Thanks for your detailed stack trace and steps to reproduce. Your
suggestion is reasonable.


1. For the NPE issue, please track PR #1167
   <https://github.com/apache/incubator-hudi/pull/1167>.
2. For the TTransportException issue, I have a question: can statements
   other than the create statement be executed?


best,
lamber-ken

At 2019-12-30 23:11:17, "Pratyaksh Sharma" <pratyaks...@gmail.com> wrote:
Thank you Lamberken, the above issue is resolved with what you suggested.
However, HiveIncrementalPuller is still not working.
Subsequently I found and fixed a bug, raised here -
https://issues.apache.org/jira/browse/HUDI-485.

Currently I am facing the below exception when trying to run the create
table statement on the docker cluster. Any leads for solving this are
welcome -

6811 [main] ERROR org.apache.hudi.utilities.HiveIncrementalPuller  - Exception when executing SQL
java.sql.SQLException: org.apache.thrift.transport.TTransportException
    at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:399)
    at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:254)
    at org.apache.hudi.utilities.HiveIncrementalPuller.executeStatement(HiveIncrementalPuller.java:233)
    at org.apache.hudi.utilities.HiveIncrementalPuller.executeIncrementalSQL(HiveIncrementalPuller.java:200)
    at org.apache.hudi.utilities.HiveIncrementalPuller.saveDelta(HiveIncrementalPuller.java:157)
    at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:345)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
    at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
    at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
    at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:38)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
    at org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_GetOperationStatus(TCLIService.java:467)
    at org.apache.hive.service.rpc.thrift.TCLIService$Client.GetOperationStatus(TCLIService.java:454)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1524)
    at com.sun.proxy.$Proxy5.GetOperationStatus(Unknown Source)
    at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:367)
    ... 5 more

6812 [main] ERROR org.apache.hudi.utilities.HiveIncrementalPuller  - Could not close the resultset opened
java.sql.SQLException: org.apache.thrift.transport.TTransportException
    at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:214)
    at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:231)
    at org.apache.hudi.utilities.HiveIncrementalPuller.saveDelta(HiveIncrementalPuller.java:165)
    at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:345)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
    at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
    at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
    at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:38)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
    at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
    at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
    at org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:513)
    at org.apache.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:500)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1524)
    at com.sun.proxy.$Proxy5.CloseOperation(Unknown Source)
    at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:208)
    ... 3 more

Also, the documentation does not mention the jars which need to be passed
externally on the classpath for executing the above tool. We should update
the documentation to list the jars so that it becomes easier for a new
user to use this tool. I spent a lot of time adding all the jars
incrementally. This jira (https://issues.apache.org/jira/browse/HUDI-486)
tracks this.

On Mon, Dec 30, 2019 at 5:35 PM lamberken <lamber...@163.com> wrote:



Hi @Pratyaksh Sharma


Thanks for the steps to reproduce this issue. Try modifying the code below
and test again.


In org.apache.hudi.utilities.HiveIncrementalPuller#HiveIncrementalPuller:

/ --------------------------------- /
String templateContent =
    FileIOUtils.readAsUTFString(this.getClass().getResourceAsStream("IncrementalPull.sqltemplate"));

Changed to

/ --------------------------------- /
String templateContent =
    FileIOUtils.readAsUTFString(this.getClass().getResourceAsStream("/IncrementalPull.sqltemplate"));

best,
lamber-ken
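
For readers following the fix above: Class.getResourceAsStream resolves a name without a leading "/" relative to the package of the class, and a name with a leading "/" from the classpath root. The sketch below only illustrates that rule. ResourceLookupDemo, resolvedPath and exists are hypothetical names and not part of Hudi, and the class file of java.lang.String is used as a stand-in resource because it is always available.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical demo class, not part of Hudi.
class ResourceLookupDemo {

    // Mirrors the documented lookup rule of Class.getResourceAsStream:
    // a leading '/' means "absolute from the classpath root"; otherwise the
    // package of the owner class (with '.' replaced by '/') is prepended.
    static String resolvedPath(Class<?> owner, String name) {
        if (name.startsWith("/")) {
            return name.substring(1);
        }
        String className = owner.getName();
        int lastDot = className.lastIndexOf('.');
        if (lastDot < 0) {
            return name; // default package: nothing to prepend
        }
        return className.substring(0, lastDot).replace('.', '/') + "/" + name;
    }

    // Returns true if the resource can actually be opened through the owner class.
    static boolean exists(Class<?> owner, String name) {
        try (InputStream in = owner.getResourceAsStream(name)) {
            return in != null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] names = {"String.class", "/String.class", "/java/lang/String.class"};
        for (String name : names) {
            System.out.println(name + " -> " + resolvedPath(String.class, name)
                    + " (found: " + exists(String.class, name) + ")");
        }
    }
}
```

Applied to the puller, the relative name "IncrementalPull.sqltemplate" would be looked up under org/apache/hudi/utilities/, while "/IncrementalPull.sqltemplate" is looked up at the root of the jar, which matches the change suggested above if the template sits at the root of the bundle.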





At 2019-12-30 19:25:08, "Pratyaksh Sharma" <pratyaks...@gmail.com> wrote:
Hi Vinoth,

I am able to reproduce this error on the docker setup and have filed a jira -
https://issues.apache.org/jira/browse/HUDI-484.

Steps to reproduce are mentioned in the jira description itself.

On Thu, Dec 26, 2019 at 12:42 PM Pratyaksh Sharma <pratyaks...@gmail.com> wrote:

Hi Vinoth,

I will try to reproduce the error on the docker cluster and keep you updated.

On Tue, Dec 24, 2019 at 11:23 PM Vinoth Chandar <vin...@apache.org> wrote:

Pratyaksh,

If you are still having this issue, could you try reproducing this on the
docker setup, similar to
https://hudi.apache.org/docker_demo.html#step-7--incremental-query-for-copy-on-write-table
and raise a JIRA?
Happy to look into it and get it fixed if needed.

Thanks
Vinoth

On Tue, Dec 24, 2019 at 8:43 AM lamberken <lamber...@163.com> wrote:



Hi, @Pratyaksh Sharma


The log4j-1.2.17.jar lib also needs to be added to the classpath, for example:

java -cp /path/to/hive-jdbc-2.3.1.jar:/path/to/log4j-1.2.17.jar:packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar \
  org.apache.hudi.utilities.HiveIncrementalPuller --help


best,
lamber-ken
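
When jars have to be added to the classpath one at a time like this, a small probe can report which required classes are still missing before the real tool is launched. This is only a sketch under assumptions: ClasspathCheck and isLoadable are hypothetical names, not part of Hudi, and the class names probed in main are the two that appear in this thread.

```java
// Hypothetical helper, not part of Hudi.
class ClasspathCheck {

    // Returns true if the named class can be found by this class's loader.
    // The class is located but not initialized (second argument is false).
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class names taken from the errors discussed in this thread.
        String[] required = {
            "org.apache.log4j.LogManager",
            "org.apache.hive.jdbc.HiveStatement"
        };
        for (String name : required) {
            System.out.println(name + " loadable: " + isLoadable(name));
        }
    }
}
```

Running it with the same -cp value as the puller command shows whether that classpath already provides each class.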

At 2019-12-24 17:23:20, "Pratyaksh Sharma" <pratyaks...@gmail.com> wrote:
Hi Vinoth,

Sorry, my bad. I did not realise earlier that Spark is not needed for this
class. I tried running it with the command below and got the following
exception -

Command -

java -cp /path/to/hive-jdbc-2.3.1.jar:packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar \
  org.apache.hudi.utilities.HiveIncrementalPuller --help

Exception -
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/LogManager
    at org.apache.hudi.utilities.HiveIncrementalPuller.<clinit>(HiveIncrementalPuller.java:64)
Caused by: java.lang.ClassNotFoundException: org.apache.log4j.LogManager
    at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    ... 1 more

I was able to fix it by including the corresponding jar in the bundle.

After fixing the above, I am still getting the NPE even though the template
is bundled in the jar.

On Mon, Dec 23, 2019 at 10:45 PM Vinoth Chandar <vin...@apache.org> wrote:

Hi Pratyaksh,

HiveIncrementalPuller is just a Java program. It does not need Spark, since it
just runs HiveQL remotely.

On the error you specified, it seems like it can't find the template? Can you
check whether the bundle is missing the template file? Maybe this got broken
during the bundling changes (since it's no longer part of the resources
folder of the bundle module). We should also probably be throwing a better
error than an NPE.

We can raise a JIRA once you confirm.

String templateContent =
    FileIOUtils.readAsUTFString(this.getClass().getResourceAsStream("IncrementalPull.sqltemplate"));
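
On the point about throwing a better error than an NPE: getResourceAsStream returns null when the resource is not found, so a null check before reading can name the missing resource directly. The following is a minimal sketch of that idea, assuming nothing about how Hudi eventually addressed it. TemplateLoader and readResource are hypothetical names and do not reflect the FileIOUtils API.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Hudi.
class TemplateLoader {

    // Reads a classpath resource as UTF-8 text and fails with a descriptive
    // message, instead of a NullPointerException, when the resource is missing.
    static String readResource(Class<?> owner, String name) {
        try (InputStream in = owner.getResourceAsStream(name)) {
            if (in == null) {
                throw new IllegalStateException("Resource '" + name
                        + "' not found on the classpath (looked up via " + owner.getName() + ")");
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        try {
            readResource(TemplateLoader.class, "/IncrementalPull.sqltemplate");
            System.out.println("Template found");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```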


On Mon, Dec 23, 2019 at 6:02 AM Pratyaksh Sharma <pratyaks...@gmail.com> wrote:

Hi,

Can someone guide me or share some documentation on how to use
HiveIncrementalPuller? I already went through the documentation on
https://hudi.apache.org/querying_data.html. I tried using this puller with
the command below and am facing the given exception.

Any leads are appreciated.

Command -
spark-submit --name incremental-puller --queue etl --files incremental_sql.txt \
  --master yarn --deploy-mode cluster --driver-memory 4g \
  --executor-memory 4g --num-executors 2 \
  --class org.apache.hudi.utilities.HiveIncrementalPuller \
  hudi-utilities-bundle-0.5.1-SNAPSHOT.jar --hiveUrl jdbc:hive2://HOST:PORT/ \
  --hiveUser <user> --hivePass <pass> --extractSQLFile incremental_sql.txt \
  --sourceDb <source_db> --sourceTable <src_table> --targetDb tmp \
  --targetTable tempTable --fromCommitTime 0 --maxCommits 1

Error -

java.lang.NullPointerException
    at org.apache.hudi.common.util.FileIOUtils.copy(FileIOUtils.java:73)
    at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:66)
    at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:61)
    at org.apache.hudi.utilities.HiveIncrementalPuller.<init>(HiveIncrementalPuller.java:113)
    at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:343)








