[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-663688727
@bvaradar I am getting the same exception. I added the jars via the --jars option of spark-submit, so they are available to both the driver and the executors.
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
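For context, a minimal sketch of one way to make an extra jar visible on both sides (the jar path is a placeholder, not from this thread): --jars distributes a jar to the executors, but classes loaded early by driver-side embedded Hive sometimes also need to be on the driver's launch classpath.

```python
# Sketch only: /opt/jars/json-1.8.jar is a hypothetical local path on the
# submitting machine; substitute wherever the jar actually lives.
extra_jar = "/opt/jars/json-1.8.jar"

spark_confs = {
    # Ships the jar to executors and adds it to their classpaths.
    "spark.jars": extra_jar,
    # Puts the jar on the driver JVM's classpath at launch, which classes
    # loaded during session initialization (e.g. by embedded Hive) may need.
    "spark.driver.extraClassPath": extra_jar,
    "spark.executor.extraClassPath": extra_jar,
}

# Equivalent spark-submit flags:
#   --jars /opt/jars/json-1.8.jar \
#   --conf spark.driver.extraClassPath=/opt/jars/json-1.8.jar \
#   --conf spark.executor.extraClassPath=/opt/jars/json-1.8.jar
```

These could also be applied via `SparkSession.builder.config(k, v)` for each entry before the session is created.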
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-662610773
@bvaradar I have used the --jars option of spark-submit, and the other jars are picked up as well. I can also see that the class is there, but I am still getting the same error.
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-662525238
@bvaradar I added https://mvnrepository.com/artifact/com.tdunning/json/1.8/json-1.8.jar to the Spark jars but am still facing the same issue:
An error occurred while calling o179.save.
: java.lang.NoClassDefFoundError: org/json/JSONException
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:10847)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10047)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10128)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
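Since the error persists even with the jar supposedly on the classpath, one quick sanity check is to confirm the jar actually contains the missing class. A self-contained sketch (a jar is just a zip archive, and `org/json/JSONException.class` is the entry the JVM would look up; the demo jar below is synthetic):

```python
import io
import zipfile

def jar_contains(jar, class_entry):
    """Return True if the jar (any zip archive path or file object)
    contains the given class entry."""
    with zipfile.ZipFile(jar) as jf:
        return class_entry in jf.namelist()

# Demonstrated against a synthetic in-memory jar; in practice pass the real
# path, e.g. jar_contains("json-1.8.jar", "org/json/JSONException.class").
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as jf:
    jf.writestr("org/json/JSONException.class", b"")
buf.seek(0)
print(jar_contains(buf, "org/json/JSONException.class"))  # → True
```

If the check passes on the real jar but the error remains, the jar is likely missing from the specific classloader (driver vs. executor) where Hive's SemanticAnalyzer runs.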
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-661874955
@bvaradar any recommendation on this, please?
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-661196097
@bvaradar I am running hudi-spark-bundle
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-660175460
@bvaradar please let me know if anything else needs to be done to disable the JDBC interface.
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-659690154
@bvaradar @bhasudha I tried using the following:
"hoodie.datasource.hive_sync.use_jdbc": False,
"hoodie.datasource.hive_sync.enable": True,
My Spark is already configured with the thrift URL of the Hive metastore. Will Hudi use that thrift instance now that I have disabled JDBC? I get this error:
An error occurred while calling o175.save.
: java.lang.NoClassDefFoundError: org/json/JSONException
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:10847)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10047)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10128)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
	at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQLs(HoodieHiveClient.java:515)
	at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQLUsingHiveDriver(HoodieHiveClient.java:498)
	at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQL(HoodieHiveClient.java:488)
	at org.apache.hudi.hive.HoodieHiveClient.createTable(HoodieHiveClient.java:273)
	at org.apache.hudi.hive.HiveSyncTool.syncSchema(HiveSyncTool.java:146)
	at
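For reference, a hedged sketch of how the hive-sync options above would typically be passed as Hudi writer options. The option keys come from this thread; the table and database names are placeholders, and the values are strings since the options are passed through as string properties:

```python
# Placeholders: "test", "hoodie_test2", and the s3a path are assumptions
# borrowed from elsewhere in this thread, not confirmed settings.
hudi_options = {
    "hoodie.table.name": "hoodie_test2",
    "hoodie.datasource.hive_sync.enable": "true",
    # Disables the HiveServer2 JDBC path; sync then goes through the
    # Hive metastore client configured for the Spark session.
    "hoodie.datasource.hive_sync.use_jdbc": "false",
    "hoodie.datasource.hive_sync.database": "test",
    "hoodie.datasource.hive_sync.table": "hoodie_test2",
}

# With a DataFrame `df`, roughly:
#   df.write.format("hudi").options(**hudi_options) \
#     .mode("append").save("s3a://tempwrite/hudi/")
```

The trace above shows the failure inside Hudi's HiveSyncTool calling the embedded Hive Driver, which is the code path that pulls in the org.json dependency.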
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-658191364
@bhasudha it works with Presto; I am able to query the data fine, and the data seems to be correct based on my queries. My only concern is whether this setup is missing anything that might hurt us in the long run.
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-657806405
@bhasudha in my setup we are not running Hive; we are only using the Hive metastore. So what I did was register an external table like:
CREATE TABLE test.hoodie_test2 (
  "_hoodie_commit_time" varchar,
  "_hoodie_commit_seqno" varchar,
  "_hoodie_record_key" varchar,
  "_hoodie_partition_path" varchar,
  "_hoodie_file_name" varchar,
  "column" varchar,
  "data_type" varchar,
  "is_data_type_inferred" varchar,
  "completeness" double,
  "approximate_num_distinct_values" bigint,
  "histogram" array(row(count bigint, ratio double, value varchar)),
  "mean" double,
  "maximum" double,
  "minimum" double,
  "sum" double,
  "std_dev" double,
  approx_percentiles ARRAY
)
WITH (
  format = 'parquet',
  external_location = 's3a://tempwrite/hudi/'
)
I just wanted to know whether this is the right way of doing it, and whether it will lose any of the functionality.
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-656772241
@vinothchandar sorry for the delay; I will try to pull the logs and attach them. Another quick question: I have created an external table using Presto for the data I wrote to S3, using:
CREATE TABLE test.hoodie_test2 (
  "_hoodie_commit_time" varchar,
  "_hoodie_commit_seqno" varchar,
  "_hoodie_record_key" varchar,
  "_hoodie_partition_path" varchar,
  "_hoodie_file_name" varchar,
  "column" varchar,
  "data_type" varchar,
  "is_data_type_inferred" varchar,
  "completeness" double,
  "approximate_num_distinct_values" bigint,
  "histogram" array(row(count bigint, ratio double, value varchar)),
  "mean" double,
  "maximum" double,
  "minimum" double,
  "sum" double,
  "std_dev" double,
  approx_percentiles ARRAY
)
WITH (
  format = 'parquet',
  external_location = 's3a://tempwrite/hudi/'
)
It worked fine and I am able to query it with Presto. I haven't added the jar <presto_install>/plugin/hive-hadoop2/hudi-presto-bundle.jar, yet it still works; I think it is reading the parquet files directly? Is this the right way to do it, or does it need to be done differently?
asheeshgarg commented on issue #1787: URL: https://github.com/apache/hudi/issues/1787#issuecomment-653641776
@leesf after adding the options it works fine. Does setting the option to false have any impact?