Hi Vinoth/Lamberken,

The issue seems to be due to the ulimit settings in the hiveserver
container (core dumps were disabled). I checked the hiveserver
container's logs while executing the HiveIncrementalPuller script and
got the below error -
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGBUS (0x7) at pc=0x00007f522a4ede6d, pid=438, tid=0x00007f5204ffe700
#
# JRE version: OpenJDK Runtime Environment (8.0_212-b01) (build 1.8.0_212-8u212-b01-1~deb9u1-b01)
# Java VM: OpenJDK 64-Bit Server VM (25.212-b01 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# C  [libzip.so+0x4e6d]
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# An error report file with more information is saved as:
# /opt/hive/bin/hs_err_pid438.log

I was able to fix this by running the command ulimit -c unlimited in
the hiveserver container. Now the task is only to set the ulimits in
the docker-compose file itself for hiveserver. Will be doing that and
will raise a PR. Will keep you guys posted if I face any further
issues.
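For reference, docker-compose supports per-service ulimits, so I am
planning something along these lines (just a sketch; the service name
is from our demo compose file and the exact values still need to be
verified):

  services:
    hiveserver:
      ulimits:
        core: -1    # -1 = unlimited core dump size (sets soft and hard limits)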
On Thu, Jan 2, 2020 at 11:27 AM lamberken <lamber...@163.com> wrote:
> Hi @Pratyaksh Sharma,
>
> Okay, all right. BTW, thanks for raising this issue.
>
> best,
> lamber-ken
>
> On 01/2/2020 13:47, Pratyaksh Sharma <pratyaks...@gmail.com> wrote:
> Hi Lamberken,
>
> I am also trying to fix this issue. Please let us know if you come up
> with anything.
>
> On Thu, Jan 2, 2020 at 11:12 AM lamberken <lamber...@163.com> wrote:
> Hi @Vinoth,
>
> Got it, thank you for reminding me. I just made a mistake just now.
>
> best,
> lamber-ken
>
> On 01/2/2020 13:08, Vinoth Chandar <vin...@apache.org> wrote:
> Hi Lamber,
>
> utilities-bundle has always been a fat jar.. I was talking about
> hudi-utilities. Sure, take a swing at it. Happy to help as needed.
>
> On Wed, Jan 1, 2020 at 8:57 PM lamberken <lamber...@163.com> wrote:
> Hi @Vinoth,
>
> I'm willing to solve this problem. I'm trying to find out from the
> git history when hudi-utilities-bundle stopped being a fat jar:
>
> 2019-08-29 FAT JAR     ---> 5f9fa82f47e1cc14a22b869250fe23c8f9c033cd
> 2019-09-14 NOT FAT JAR ---> d2525c31b7dad7bae2d4899d8df2a353ca39af50
>
> best,
> lamber-ken
>
> At 2020-01-01 09:15:01, "Vinoth Chandar" <vin...@apache.org> wrote:
> This does sound like a fair bit of pain. I am wondering if it makes
> sense to change the integ-test setup/docker demo to use the
> incremental puller. A bunch of the packaging issues around jars seem
> like regressions from hudi-utilities no longer being a fat jar.
>
> If there aren't any takers, I can also try my hand at fixing this,
> once I get done with a few things on my end. Left a comment on
> HUDI-485.
>
> On Tue, Dec 31, 2019 at 4:19 PM lamberken <lamber...@163.com> wrote:
> Hi @Pratyaksh Sharma,
>
> Thanks for the detailed stacktrace and reproduction steps. Your
> suggestion is reasonable.
>
> 1. For the NPE issue, please track PR #1167
>    <https://github.com/apache/incubator-hudi/pull/1167>.
> 2. For the TTransportException issue, one question: can statements
>    other than the create statement be executed?
>
> best,
> lamber-ken
>
> At 2019-12-30 23:11:17, "Pratyaksh Sharma" <pratyaks...@gmail.com> wrote:
> Thank you Lamberken, the above issue gets resolved with what you
> suggested. However, HiveIncrementalPuller is still not working.
> Subsequently I found and fixed a bug raised here -
> https://issues.apache.org/jira/browse/HUDI-485.
>
> Currently I am facing the below exception when trying to run the
> create table statement on the docker cluster. Any leads for solving
> this are welcome -
>
> 6811 [main] ERROR org.apache.hudi.utilities.HiveIncrementalPuller - Exception when executing SQL
> java.sql.SQLException: org.apache.thrift.transport.TTransportException
>   at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:399)
>   at org.apache.hive.jdbc.HiveStatement.execute(HiveStatement.java:254)
>   at org.apache.hudi.utilities.HiveIncrementalPuller.executeStatement(HiveIncrementalPuller.java:233)
>   at org.apache.hudi.utilities.HiveIncrementalPuller.executeIncrementalSQL(HiveIncrementalPuller.java:200)
>   at org.apache.hudi.utilities.HiveIncrementalPuller.saveDelta(HiveIncrementalPuller.java:157)
>   at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:345)
> Caused by: org.apache.thrift.transport.TTransportException
>   at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>   at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
>   at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
>   at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
>   at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:38)
>   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>   at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
>   at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
>   at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
>   at org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_GetOperationStatus(TCLIService.java:467)
>   at org.apache.hive.service.rpc.thrift.TCLIService$Client.GetOperationStatus(TCLIService.java:454)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1524)
>   at com.sun.proxy.$Proxy5.GetOperationStatus(Unknown Source)
>   at org.apache.hive.jdbc.HiveStatement.waitForOperationToComplete(HiveStatement.java:367)
>   ... 5 more
>
> 6812 [main] ERROR org.apache.hudi.utilities.HiveIncrementalPuller - Could not close the resultset opened
> java.sql.SQLException: org.apache.thrift.transport.TTransportException
>   at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:214)
>   at org.apache.hive.jdbc.HiveStatement.close(HiveStatement.java:231)
>   at org.apache.hudi.utilities.HiveIncrementalPuller.saveDelta(HiveIncrementalPuller.java:165)
>   at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:345)
> Caused by: org.apache.thrift.transport.TTransportException
>   at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>   at org.apache.thrift.transport.TSaslTransport.readLength(TSaslTransport.java:374)
>   at org.apache.thrift.transport.TSaslTransport.readFrame(TSaslTransport.java:451)
>   at org.apache.thrift.transport.TSaslTransport.read(TSaslTransport.java:433)
>   at org.apache.thrift.transport.TSaslClientTransport.read(TSaslClientTransport.java:38)
>   at org.apache.thrift.transport.TTransport.readAll(TTransport.java:86)
>   at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:425)
>   at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:321)
>   at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:225)
>   at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:77)
>   at org.apache.hive.service.rpc.thrift.TCLIService$Client.recv_CloseOperation(TCLIService.java:513)
>   at org.apache.hive.service.rpc.thrift.TCLIService$Client.CloseOperation(TCLIService.java:500)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.hive.jdbc.HiveConnection$SynchronizedHandler.invoke(HiveConnection.java:1524)
>   at com.sun.proxy.$Proxy5.CloseOperation(Unknown Source)
>   at org.apache.hive.jdbc.HiveStatement.closeClientOperation(HiveStatement.java:208)
>   ... 3 more
>
> Also, the documentation does not mention the jars which need to be
> passed externally on the classpath for executing the above tool. We
> should update the documentation to list those jars so that it becomes
> easier for a new user to use this tool; I spent a lot of time adding
> all the jars incrementally. This jira
> (https://issues.apache.org/jira/browse/HUDI-486) tracks this.
>
> On Mon, Dec 30, 2019 at 5:35 PM lamberken <lamber...@163.com> wrote:
> Hi @Pratyaksh Sharma,
>
> Thanks for your steps to reproduce this issue. Try modifying the code
> below and test again.
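> The change is just a leading slash: without it, getResourceAsStream
> resolves the name relative to the class's package
> (org/apache/hudi/utilities/), while with it the name is resolved from
> the root of the classpath, which should be where the template sits in
> the bundle. You can also confirm the template actually made it into
> the jar with something like this (path is from my local build, adjust
> as needed):
>
>   jar tf packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar | grep IncrementalPull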
>
> org.apache.hudi.utilities.HiveIncrementalPuller#HiveIncrementalPuller
>
>   /* --------------------------------- */
>   String templateContent = FileIOUtils.readAsUTFString(
>       this.getClass().getResourceAsStream("IncrementalPull.sqltemplate"));
>
> changed to
>
>   /* --------------------------------- */
>   String templateContent = FileIOUtils.readAsUTFString(
>       this.getClass().getResourceAsStream("/IncrementalPull.sqltemplate"));
>
> best,
> lamber-ken
>
> At 2019-12-30 19:25:08, "Pratyaksh Sharma" <pratyaks...@gmail.com> wrote:
> Hi Vinoth,
>
> I am able to reproduce this error on the docker setup and have filed
> a jira - https://issues.apache.org/jira/browse/HUDI-484. Steps to
> reproduce are mentioned in the jira description itself.
>
> On Thu, Dec 26, 2019 at 12:42 PM Pratyaksh Sharma <pratyaks...@gmail.com> wrote:
> Hi Vinoth,
>
> I will try to reproduce the error on the docker cluster and keep you
> updated.
>
> On Tue, Dec 24, 2019 at 11:23 PM Vinoth Chandar <vin...@apache.org> wrote:
> Pratyaksh,
>
> If you are still having this issue, could you try reproducing it on
> the docker setup, similar to
> https://hudi.apache.org/docker_demo.html#step-7--incremental-query-for-copy-on-write-table,
> and raise a JIRA? Happy to look into it and get it fixed if needed.
>
> Thanks,
> Vinoth
>
> On Tue, Dec 24, 2019 at 8:43 AM lamberken <lamber...@163.com> wrote:
> Hi, @Pratyaksh Sharma
>
> The log4j-1.2.17.jar lib also needs to be added to the classpath, for
> example:
>
>   java -cp /path/to/hive-jdbc-2.3.1.jar:/path/to/log4j-1.2.17.jar:packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar org.apache.hudi.utilities.HiveIncrementalPuller --help
>
> best,
> lamber-ken
>
> At 2019-12-24 17:23:20, "Pratyaksh Sharma" <pratyaks...@gmail.com> wrote:
> Hi Vinoth,
>
> Sorry, my bad, I did not realise earlier that Spark is not needed for
> this class. I tried running it with the below command and got the
> mentioned exception -
>
> Command -
>
>   java -cp /path/to/hive-jdbc-2.3.1.jar:packaging/hudi-utilities-bundle/target/hudi-utilities-bundle-0.5.1-SNAPSHOT.jar org.apache.hudi.utilities.HiveIncrementalPuller --help
>
> Exception -
>
>   Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/log4j/LogManager
>     at org.apache.hudi.utilities.HiveIncrementalPuller.<clinit>(HiveIncrementalPuller.java:64)
>   Caused by: java.lang.ClassNotFoundException: org.apache.log4j.LogManager
>     at java.net.URLClassLoader.findClass(URLClassLoader.java:382)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
>     at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
>     ... 1 more
>
> I was able to fix it by including the corresponding jar in the
> bundle. After fixing the above, I am still getting the NPE even
> though the template is bundled in the jar.
>
> On Mon, Dec 23, 2019 at 10:45 PM Vinoth Chandar <vin...@apache.org> wrote:
> Hi Pratyaksh,
>
> HiveIncrementalPuller is just a java program. It does not need Spark,
> since it just runs a HiveQL remotely.
>
> On the error you specified, seems like it can't find the template?
> Can you check whether the bundle is missing the template file? Maybe
> this got broken during the bundling changes (since it is no longer
> part of the resources folder of the bundle module). We should also
> probably be throwing a better error than an NPE. We can raise a JIRA,
> once you confirm.
>
>   String templateContent = FileIOUtils.readAsUTFString(
>       this.getClass().getResourceAsStream("IncrementalPull.sqltemplate"));
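> A more defensive version might look something like this (just a
> sketch, not tested; uses java.io.InputStream):
>
>   // Resolve the template from the classpath root and fail fast with a
>   // descriptive message if it is missing, instead of an NPE further down.
>   InputStream in = this.getClass().getResourceAsStream("/IncrementalPull.sqltemplate");
>   if (in == null) {
>     throw new IllegalStateException("IncrementalPull.sqltemplate not found on the classpath; "
>         + "check that it is packaged into hudi-utilities-bundle");
>   }
>   String templateContent = FileIOUtils.readAsUTFString(in);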
>
> On Mon, Dec 23, 2019 at 6:02 AM Pratyaksh Sharma <pratyaks...@gmail.com> wrote:
> Hi,
>
> Can someone guide me or share some documentation regarding how to use
> HiveIncrementalPuller? I already went through the documentation on
> https://hudi.apache.org/querying_data.html. I tried using this puller
> with the below command and am facing the given exception. Any leads
> are appreciated.
>
> Command -
>
>   spark-submit --name incremental-puller --queue etl --files incremental_sql.txt \
>     --master yarn --deploy-mode cluster --driver-memory 4g --executor-memory 4g \
>     --num-executors 2 --class org.apache.hudi.utilities.HiveIncrementalPuller \
>     hudi-utilities-bundle-0.5.1-SNAPSHOT.jar --hiveUrl jdbc:hive2://HOST:PORT/ \
>     --hiveUser <user> --hivePass <pass> --extractSQLFile incremental_sql.txt \
>     --sourceDb <source_db> --sourceTable <src_table> --targetDb tmp \
>     --targetTable tempTable --fromCommitTime 0 --maxCommits 1
>
> Error -
>
>   java.lang.NullPointerException
>     at org.apache.hudi.common.util.FileIOUtils.copy(FileIOUtils.java:73)
>     at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:66)
>     at org.apache.hudi.common.util.FileIOUtils.readAsUTFString(FileIOUtils.java:61)
>     at org.apache.hudi.utilities.HiveIncrementalPuller.<init>(HiveIncrementalPuller.java:113)
>     at org.apache.hudi.utilities.HiveIncrementalPuller.main(HiveIncrementalPuller.java:343)