[GitHub] [incubator-hudi] vinothchandar commented on issue #894: Getting java.lang.NoSuchMethodError while doing Hive sync
vinothchandar commented on issue #894: Getting java.lang.NoSuchMethodError while doing Hive sync
URL: https://github.com/apache/incubator-hudi/issues/894#issuecomment-533596837

Got it. To run from the IDE, what I do is add the Spark jars to the classpath of my module in IntelliJ. You don't have to mess with your sbt build and bring in Avro etc. The idea here is that when you actually submit your application using spark-submit, these jars are already there.

Whether transitive dependencies get pulled in depends on their scope. Avro, Parquet etc. are in provided scope, so the expectation is that they are supplied by the actual runtime (IDE or spark-submit). This way we can keep Hudi thinner and support multiple Spark, Hive, and Hadoop versions.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services
[GitHub] [incubator-hudi] vinothchandar commented on issue #894: Getting java.lang.NoSuchMethodError while doing Hive sync
vinothchandar commented on issue #894: Getting java.lang.NoSuchMethodError while doing Hive sync
URL: https://github.com/apache/incubator-hudi/issues/894#issuecomment-532347935

>> Am I supposed to add hudi-hive jars separately?

No, it's all there in the bundle.

Some general context: if you are writing a Spark job, it's better to just depend on `hudi-spark`, which will pull in hudi-hive. That way you have control over which versions to exclude and bring in. With a bundled jar (true for any bundled/fat/uber jar), you don't have control to, say, tell Hudi not to bring its version of Hive.

Can you try building a fat jar and running your job once via `spark-submit` locally? I imagine you added the Spark jars to your IntelliJ module to be able to run the program locally. I want to see if the jar conflict is coming from that. I still can't understand where `Hive 2.3.2-amzn-2` comes from, based on what you shared.
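One way to see where a conflicting class comes from is to ask the class loader for every location that supplies it. The following is a minimal sketch, not part of Hudi, Hive, or Spark: `ClasspathProbe` and `locate` are hypothetical names introduced here, and the default class name is the `PerfLogger` class from the stack trace reported on this issue. It only uses standard JDK calls (`ClassLoader.getResources`).

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (an assumption for illustration, not a Hudi API):
// lists every classpath location that can supply a given class.
final class ClasspathProbe {

    static List<URL> locate(String className) {
        // Class files are addressable as resources: a.b.C -> a/b/C.class
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Default is the class named in the reported stack trace;
        // pass a different fully qualified name as the first argument.
        String name = args.length > 0 ? args[0] : "org.apache.hadoop.hive.ql.log.PerfLogger";
        List<URL> hits = locate(name);
        if (hits.isEmpty()) {
            System.out.println(name + ": not found on the classpath");
        }
        for (URL hit : hits) {
            System.out.println(hit);
        }
    }
}
```

Run with the same classpath as the failing job (the IntelliJ module or the fat jar). More than one printed location means several jars supply the class, and the printed jar names usually show which distribution each copy belongs to.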
[GitHub] [incubator-hudi] vinothchandar commented on issue #894: Getting java.lang.NoSuchMethodError while doing Hive sync
vinothchandar commented on issue #894: Getting java.lang.NoSuchMethodError while doing Hive sync
URL: https://github.com/apache/incubator-hudi/issues/894#issuecomment-531988780

Hmmm. I tested with

Hive version - Hive 2.3.3
Spark version - 2.4.4

by simply using Spark 2.4.4 to do this step on the docker demo. I had the Spark installation unzipped onto `docker` and did something like:

```
root@adhoc-2:/opt# /var/hoodie/ws/docker/spark-2.4.4-bin-hadoop2.7/bin/spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  /var/hoodie/ws/docker/hoodie/hadoop/hive_base/target/hoodie-utilities.jar \
  --storage-type COPY_ON_WRITE \
  --source-class org.apache.hudi.utilities.sources.JsonDFSSource \
  --source-ordering-field ts \
  --target-base-path /user/hive/warehouse/stock_ticks_cow \
  --target-table stock_ticks_cow \
  --props /var/demo/config/dfs-source.properties \
  --schemaprovider-class org.apache.hudi.utilities.schema.FilebasedSchemaProvider \
  --enable-hive-sync \
  --hoodie-conf hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://hiveserver:1 \
  --hoodie-conf hoodie.datasource.hive_sync.username=hive \
  --hoodie-conf hoodie.datasource.hive_sync.password=hive \
  --hoodie-conf hoodie.datasource.hive_sync.partition_fields=dt \
  --hoodie-conf hoodie.datasource.hive_sync.database=default \
  --hoodie-conf hoodie.datasource.hive_sync.table=stock_ticks_cow
```

That seems to work. So I suspect it has something to do with the Amazon-specific version? Can you try building a local docker image with that Hive version and see if you can reproduce this? https://hudi.apache.org/docker_demo.html#building-local-docker-containers
[GitHub] [incubator-hudi] vinothchandar commented on issue #894: Getting java.lang.NoSuchMethodError while doing Hive sync
vinothchandar commented on issue #894: Getting java.lang.NoSuchMethodError while doing Hive sync
URL: https://github.com/apache/incubator-hudi/issues/894#issuecomment-531825236

Seems like a jar mismatch issue.

```
    at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:193)
Caused by: MetaException(message:org.apache.hadoop.hive.ql.log.PerfLogger.getPerfLogger(Lorg/apache/hudi/org/apache/hadoop_hive/conf/HiveConf;Z)Lorg/apache/hadoop/hive/ql/log/PerfLogger;)
    at org.apache.hudi.org.apache.hadoop_hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:83)
    at org.apache.hudi.org.apache.hadoop_hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
    at org.apache.hudi.org.apache.hadoop_hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6889)
    at org.apache.hudi.org.apache.hadoop_hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:159)
    at org.apache.hudi.org.apache.hadoop_hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:128)
    at org.apache.hudi.hive.HoodieHiveClient.<init>(HoodieHiveClient.java:99)
    ... 50 more
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.hive.ql.log.PerfLogger.getPerfLogger(Lorg/apache/hudi/org/apache/hadoop_hive/conf/HiveConf;Z)Lorg/apache/hadoop/hive/ql/log/PerfLogger;
    at org.apache.hudi.org.apache.hadoop_hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:104)
    at org.apache.hudi.org.apache.hadoop_hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
    ... 55 more
```

I will try to repro this and get it working. Thanks for reporting this. We stopped bundling the standalone Hive jars, which may have been providing this previously (just guessing; I will need to repro and understand).
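The `NoSuchMethodError` in the trace asks for `PerfLogger.getPerfLogger` with a relocated `org.apache.hudi.org.apache.hadoop_hive.conf.HiveConf` parameter, while `PerfLogger` itself is referenced under its original package. A quick way to compare that against what is actually loaded is to print the overloads the runtime class offers and the jar it came from. This is a minimal sketch under stated assumptions: `MethodProbe`, `overloads`, and `origin` are hypothetical names introduced here, not Hudi or Hive APIs, and only standard JDK reflection calls are used.

```java
import java.lang.reflect.Method;
import java.security.CodeSource;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic (an assumption for illustration, not a Hudi API).
final class MethodProbe {

    // Loads the class without running its static initializers.
    static Class<?> load(String className) {
        try {
            return Class.forName(className, false, MethodProbe.class.getClassLoader());
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("class not on classpath: " + className, e);
        }
    }

    // Full signatures of every public method with the given name.
    static List<String> overloads(String className, String methodName) {
        List<String> signatures = new ArrayList<>();
        for (Method m : load(className).getMethods()) {
            if (m.getName().equals(methodName)) {
                signatures.add(m.toString());
            }
        }
        return signatures;
    }

    // Location (usually a jar) the class was loaded from, if known.
    static String origin(String className) {
        CodeSource source = load(className).getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source recorded)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // Defaults are the class and method named in the reported stack trace.
        String cls = args.length > 0 ? args[0] : "org.apache.hadoop.hive.ql.log.PerfLogger";
        String method = args.length > 1 ? args[1] : "getPerfLogger";
        System.out.println("loaded from: " + origin(cls));
        for (String signature : overloads(cls, method)) {
            System.out.println(signature);
        }
    }
}
```

If none of the printed signatures takes the relocated `HiveConf` type, the loaded `PerfLogger` comes from a jar that was not relocated together with the calling metastore classes, which would be consistent with a jar mismatch.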