Hi @Pratyaksh Sharma,


Thanks for the detailed stack trace and reproduction steps. Your suggestion is 
reasonable.


1. For the NPE issue, please track PR #1167 
<https://github.com/apache/incubator-hudi/pull/1167>.
2. For the TTransportException issue, one question: can statements other than 
the create statement be executed successfully?
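For context on point 1: the NPE happens because Class#getResourceAsStream returns null when the resource path does not resolve, and that null is then passed into the reader. A standalone sketch (not Hudi code; it uses java.lang.Object purely for illustration) of the relative-vs-absolute path behaviour:

```java
import java.io.InputStream;

// Sketch: how Class#getResourceAsStream resolves resource names.
public class ResourcePathDemo {
    public static void main(String[] args) {
        // Without a leading slash, the name is resolved relative to the
        // class's own package (here java.lang, Object's package).
        InputStream rel = Object.class.getResourceAsStream("Object.class");

        // With a leading slash, the name is resolved from the classpath root.
        InputStream abs = Object.class.getResourceAsStream("/java/lang/Object.class");

        // A name that resolves to nothing yields null; passing that null
        // into a reader (e.g. FileIOUtils.readAsUTFString) throws an NPE.
        InputStream missing = Object.class.getResourceAsStream("NoSuchResource.tmpl");

        System.out.println("relative resolved: " + (rel != null));
        System.out.println("absolute resolved: " + (abs != null));
        System.out.println("missing resolved: " + (missing != null));
    }
}
```

So when the template sits at the classpath root of the bundle but is requested with a package-relative name, the lookup returns null, which matches the NPE fixed above.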


best,
lamber-ken

At 2019-12-30 23:11:17, "Pratyaksh Sharma" <pratyaks...@gmail.com> wrote:
>Thank you Lamberken, the above issue got resolved with what you suggested.
>However, HiveIncrementalPuller is still not working.
>Subsequently I found and fixed a bug raised here -
>https://issues.apache.org/jira/browse/HUDI-485.
>
>Currently I am facing the below exception when trying to run the create
>table statement on the docker cluster. Any leads for solving this are welcome -
>
>6811 [main] ERROR org.apache.hudi.utilities.HiveIncrementalPuller  - Exception when executing SQL
>java.sql.SQLException: org.apache.thrift.transport.TTransportException
>        at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:399)
>        at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:254)
>        at org.apache.hudi.utilities.HiveIncrementalPuller.executeStatement(HiveIncrementalPuller.java:233)
>        at org.apache.hudi.utilities.HiveIncrementalPuller.executeIncrementalSQL(HiveIncrementalPuller.java:200)
>        at org.apache.hudi.utilities.HiveIncrementalPuller.saveDelta(HiveIncrementalPuller.java:157)
>        at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:345)
>Caused by: org.apache.thrift.transport.TTransportException
>        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>        at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
>        at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
>        at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
>        at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:38)
>        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
>        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
>        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
>        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
>        at org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_GetOperationStatus(TCLIService.java:467)
>        at org.apache.hive.service.rpc.thrift.TCLIService$Client.GetOperationStatus(TCLIService.java:454)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>        at java.lang.reflect.Method.invoke(Method.java:498)
>        at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1524)
>        at com.sun.proxy.$Proxy5.GetOperationStatus(Unknown Source)
>        at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:367)
>        ... 5 more
>
>6812 [main] ERROR org.apache.hudi.utilities.HiveIncrementalPuller  - Could not close the resultset opened
>java.sql.SQLException: org.apache.thrift.transport.TTransportException
>        at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:214)
>        at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:231)
>        at org.apache.hudi.utilities.HiveIncrementalPuller.saveDelta(HiveIncrementalPuller.java:165)
>        at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:345)
>Caused by: org.apache.thrift.transport.TTransportException
>        at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>        at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
>        at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
>        at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
>        at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:38)
>        at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>        at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
>        at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
>        at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
>        at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
>        at org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:513)
>        at org.apache.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:500)
>        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>        at java.lang.reflect.Method.invoke(Method.java:498)
>        at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1524)
>        at com.sun.proxy.$Proxy5.CloseOperation(Unknown Source)
>        at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:208)
>        ... 3 more
>
>Also the documentation does not mention the jars which need to be added
>to the classpath for executing the above tool. We should update the
>documentation to list these jars so that it becomes easier for a new
>user to use this tool; I spent a lot of time adding all the jars
>incrementally. This jira (https://issues.apache.org/jira/browse/HUDI-486)
>tracks this.
>
>On Mon, Dec 30, 2019 at 5:35 PM lamberken <lamber...@163.com> wrote:
>
>>
>>
>> Hi @Pratyaksh Sharma
>>
>>
>> Thanks for the steps to reproduce this issue. Try modifying the code
>> below, and test again.
>>
>> In org.apache.hudi.utilities.HiveIncrementalPuller#HiveIncrementalPuller:
>>
>> String templateContent =
>> FileIOUtils.readAsUTFString(this.getClass().getResourceAsStream("IncrementalPull.sqltemplate"));
>>
>> changed to:
>>
>> String templateContent =
>> FileIOUtils.readAsUTFString(this.getClass().getResourceAsStream("/IncrementalPull.sqltemplate"));
>> best,
>> lamber-ken
>>
>>
>>
>>
>>
>> At 2019-12-30 19:25:08, "Pratyaksh Sharma" <pratyaks...@gmail.com> wrote:
>> >Hi Vinoth,
>> >
>> >I am able to reproduce this error on docker setup and have filed a jira -
>> >https://issues.apache.org/jira/browse/HUDI-484.
>> >
>> >Steps to reproduce are mentioned in the jira description itself.
>> >
>> >On Thu, Dec 26, 2019 at 12:42 PM Pratyaksh Sharma <pratyaks...@gmail.com>
>> >wrote:
>> >
>> >> Hi Vinoth,
>> >>
>> >> I will try to reproduce the error on docker cluster and keep you
>> updated.
>> >>
>> >> On Tue, Dec 24, 2019 at 11:23 PM Vinoth Chandar <vin...@apache.org>
>> wrote:
>> >>
>> >>> Pratyaksh,
>> >>>
>> >>> If you are still having this issue, could you try reproducing this on
>> the
>> >>> docker setup
>> >>>
>> >>>
>> https://hudi.apache.org/docker_demo.html#step-7--incremental-query-for-copy-on-write-table
>> >>> similar to this and raise a JIRA.
>> >>> Happy to look into it and get it fixed if needed
>> >>>
>> >>> Thanks
>> >>> Vinoth
>> >>>
>> >>> On Tue, Dec 24, 2019 at 8:43 AM lamberken <lamber...@163.com> wrote:
>> >>>
>> >>> >
>> >>> >
>> >>> > Hi, @Pratyaksh Sharma
>> >>> >
>> >>> >
>> >>> > The log4j-1.2.17.jar lib also needs to be added to the classpath, for
>> >>> example:
>> >>> > java -cp
>> >>> >
>> >>>
>> /path/to/hive-jdbc-2.3.1.jar:/path/to/log4j-1.2.17.jar:packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar
>> >>> > org.apache.hudi.utilities.HiveIncrementalPuller --help
>> >>> >
>> >>> >
>> >>> > best,
>> >>> > lamber-ken
>> >>> >
>> >>> > At 2019-12-24 17:23:20, "Pratyaksh Sharma" <pratyaks...@gmail.com>
>> >>> wrote:
>> >>> > >Hi Vinoth,
>> >>> > >
>> >>> > >Sorry my bad, I did not realise earlier that spark is not needed for
>> >>> this
>> >>> > >class. I tried running it with the below command to get the
>> mentioned
>> >>> > >exception -
>> >>> > >
>> >>> > >Command -
>> >>> > >
>> >>> > >java -cp
>> >>> >
>> >>> >
>> >>>
>> >/path/to/hive-jdbc-2.3.1.jar:packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar
>> >>> > >org.apache.hudi.utilities.HiveIncrementalPuller --help
>> >>> > >
>> >>> > >Exception -
>> >>> > >Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/LogManager
>> >>> > >        at org.apache.hudi.utilities.HiveIncrementalPuller.<clinit>(HiveIncrementalPuller.java:64)
>> >>> > >Caused by: java.lang.ClassNotFoundException: org.apache.log4j.LogManager
>> >>> > >        at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>> >>> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>> >>> > >        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>> >>> > >        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>> >>> > >        ... 1 more
>> >>> > >
>> >>> > >I was able to fix it by including the corresponding jar in the
>> bundle.
>> >>> > >
>> >>> > >After fixing the above, still I am getting the NPE even though the
>> >>> > template
>> >>> > >is bundled in the jar.
>> >>> > >
>> >>> > >On Mon, Dec 23, 2019 at 10:45 PM Vinoth Chandar <vin...@apache.org>
>> >>> > wrote:
>> >>> > >
>> >>> > >> Hi Pratyaksh,
>> >>> > >>
>> >>> > >> HiveIncrementalPuller is just a java program. It does not need
>> >>> > >> Spark, since it just runs a HiveQL remotely..
>> >>> > >>
>> >>> > >> On the error you specified, seems like it can't find the template?
>> >>> > >> Can you see if the bundle does not have the template file.. Maybe
>> >>> > >> this got broken during the bundling changes.. (since it's no longer
>> >>> > >> part of the resources folder of the bundle module).. We should also
>> >>> > >> probably be throwing a better error than NPE..
>> >>> > >>
>> >>> > >> We can raise a JIRA, once you confirm.
>> >>> > >>
>> >>> > >> String templateContent =
>> >>> > >>
>> >>> > >>
>> >>> >
>> >>>
>> FileIOUtils.readAsUTFString(this.getClass().getResourceAsStream("IncrementalPull.sqltemplate"));
>> >>> > >>
>> >>> > >>
>> >>> > >> On Mon, Dec 23, 2019 at 6:02 AM Pratyaksh Sharma <
>> >>> pratyaks...@gmail.com
>> >>> > >
>> >>> > >> wrote:
>> >>> > >>
>> >>> > >> > Hi,
>> >>> > >> >
>> >>> > >> > Can someone guide me or share some documentation regarding how
>> to
>> >>> use
>> >>> > >> > HiveIncrementalPuller. I already went through the documentation
>> on
>> >>> > >> > https://hudi.apache.org/querying_data.html. I tried using this
>> >>> puller
>> >>> > >> > using
>> >>> > >> > the below command and facing the given exception.
>> >>> > >> >
>> >>> > >> > Any leads are appreciated.
>> >>> > >> >
>> >>> > >> > Command -
>> >>> > >> > spark-submit --name incremental-puller --queue etl --files
>> >>> > >> > incremental_sql.txt --master yarn --deploy-mode cluster
>> >>> > --driver-memory
>> >>> > >> 4g
>> >>> > >> > --executor-memory 4g --num-executors 2 --class
>> >>> > >> > org.apache.hudi.utilities.HiveIncrementalPuller
>> >>> > >> > hudi-utilities-bundle-0.5.1-SNAPSHOT.jar --hiveUrl
>> >>> > >> > jdbc:hive2://HOST:PORT/ --hiveUser <user> --hivePass <pass>
>> >>> > >> > --extractSQLFile incremental_sql.txt --sourceDb <source_db>
>> >>> > --sourceTable
>> >>> > >> > <src_table> --targetDb tmp --targetTable tempTable
>> >>> --fromCommitTime 0
>> >>> > >> > --maxCommits 1
>> >>> > >> >
>> >>> > >> > Error -
>> >>> > >> >
>> >>> > >> > java.lang.NullPointerException
>> >>> > >> > at org.apache.hudi.common.util.FileIOUtils.copy(FileIOUtils.java:73)
>> >>> > >> > at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:66)
>> >>> > >> > at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:61)
>> >>> > >> > at org.apache.hudi.utilities.HiveIncrementalPuller.<init>(HiveIncrementalPuller.java:113)
>> >>> > >> > at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:343)
>> >>> > >> >
>> >>> > >>
>> >>> >
>> >>>
>> >>
>>
