[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-24 Thread GitBox


asheeshgarg commented on issue #1787:
URL: https://github.com/apache/hudi/issues/1787#issuecomment-663688727


   @bvaradar I am getting the same exception. I had added the jars to the --jars 
option of spark-submit, so they are available to both the driver and the executors.
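
   As a hedged sketch (the jar filenames, bundle version, and application 
script name here are assumptions, not taken from the thread), a spark-submit 
invocation that ships json-1.8.jar to both the driver and the executors could 
be assembled like this:

```python
# Hypothetical sketch: building a spark-submit command that distributes the
# org.json jar (json-1.8.jar) to executors via --jars and also puts it on the
# driver classpath, since the Hive-sync code runs in the driver.
# All jar names and the script name are assumptions for illustration.
import shlex

jars = [
    "hudi-spark-bundle_2.11-0.5.3.jar",  # assumed bundle artifact/version
    "json-1.8.jar",                      # provides org.json.JSONException
]

cmd = [
    "spark-submit",
    "--jars", ",".join(jars),            # shipped to executors
    "--driver-class-path", ":".join(jars),  # visible to driver-side Hive sync
    "write_job.py",                      # hypothetical application script
]

print(shlex.join(cmd))
```

   One design note: --jars alone adds entries to the executor classpath and 
the driver's runtime classloader, but code that uses the system classloader 
(as some Hive classes do) may additionally need --driver-class-path.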



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-22 Thread GitBox


asheeshgarg commented on issue #1787:
URL: https://github.com/apache/hudi/issues/1787#issuecomment-662610773


   @bvaradar I have used the --jars option in spark-submit; the other jars are 
picked up as well. I can also see that the class is present, but I am still 
getting the same error.







[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-22 Thread GitBox


asheeshgarg commented on issue #1787:
URL: https://github.com/apache/hudi/issues/1787#issuecomment-662525238


   @bvaradar I added 
https://mvnrepository.com/artifact/com.tdunning/json/1.8/json-1.8.jar to the 
Spark jars but am still facing the same issue:
An error occurred while calling o179.save.
: java.lang.NoClassDefFoundError: org/json/JSONException
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:10847)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10047)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10128)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)







[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-21 Thread GitBox


asheeshgarg commented on issue #1787:
URL: https://github.com/apache/hudi/issues/1787#issuecomment-661874955


   @bvaradar any recommendation on this, please?







[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-20 Thread GitBox


asheeshgarg commented on issue #1787:
URL: https://github.com/apache/hudi/issues/1787#issuecomment-661196097


   @bvaradar I am running the hudi-spark-bundle.







[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-17 Thread GitBox


asheeshgarg commented on issue #1787:
URL: https://github.com/apache/hudi/issues/1787#issuecomment-660175460


   @bvaradar please let me know if anything else needs to be done to disable 
the JDBC interface?







[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-16 Thread GitBox


asheeshgarg commented on issue #1787:
URL: https://github.com/apache/hudi/issues/1787#issuecomment-659690154


   @bvaradar @bhasudha I tried using the following:
   "hoodie.datasource.hive_sync.use_jdbc": False,
   "hoodie.datasource.hive_sync.enable": True,
    
   My Spark is already configured with the thrift URL of the Hive metastore. 
Will Hudi use that thrift instance now that I have disabled JDBC?
   
   I get the error:
An error occurred while calling o175.save.
: java.lang.NoClassDefFoundError: org/json/JSONException
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:10847)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10047)
	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10128)
	at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209)
	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
	at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
	at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
	at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
	at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQLs(HoodieHiveClient.java:515)
	at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQLUsingHiveDriver(HoodieHiveClient.java:498)
	at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQL(HoodieHiveClient.java:488)
	at org.apache.hudi.hive.HoodieHiveClient.createTable(HoodieHiveClient.java:273)
	at org.apache.hudi.hive.HiveSyncTool.syncSchema(HiveSyncTool.java:146)
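
   The sync options under discussion can be sketched as a full set of write 
options. This is a minimal, hypothetical sketch: the table and database names 
are made up, and note that with use_jdbc disabled, HiveSyncTool drives Hive 
directly in-process (HoodieHiveClient.updateHiveSQLs in the trace above), 
which is exactly the code path that needs org.json on the driver classpath:

```python
# Hypothetical sketch of Hudi write options with JDBC-based Hive sync
# disabled. Database/table names are assumptions for illustration; Hudi
# accepts option values as strings (Python booleans are stringified too).
hudi_options = {
    "hoodie.table.name": "hoodie_test2",                  # assumed
    "hoodie.datasource.hive_sync.enable": "true",
    "hoodie.datasource.hive_sync.use_jdbc": "false",      # use metastore path
    "hoodie.datasource.hive_sync.database": "test",       # assumed
    "hoodie.datasource.hive_sync.table": "hoodie_test2",  # assumed
}

# Typical usage (requires a running SparkSession with the Hudi bundle):
# df.write.format("hudi").options(**hudi_options).mode("append").save(path)
```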
   







[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-14 Thread GitBox


asheeshgarg commented on issue #1787:
URL: https://github.com/apache/hudi/issues/1787#issuecomment-658191364


   @bhasudha it works with Presto and I am able to query the data fine; the 
data looks correct based on my queries. The only concern I have is whether it 
is missing anything that might hurt in the long run.







[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-13 Thread GitBox


asheeshgarg commented on issue #1787:
URL: https://github.com/apache/hudi/issues/1787#issuecomment-657806405


   @bhasudha in my setup we are not running Hive; we are only using the Hive 
metastore. So what I did was register an external table, like:
   CREATE TABLE test.hoodie_test2(
   "_hoodie_commit_time" varchar,
   "_hoodie_commit_seqno" varchar,
   "_hoodie_record_key" varchar,
   "_hoodie_partition_path" varchar,
   "_hoodie_file_name" varchar,
   "column" varchar,
   "data_type" varchar,
   "is_data_type_inferred" varchar,
   "completeness" double,
   "approximate_num_distinct_values" bigint,
   "histogram" array(row(count bigint, ratio double, value varchar)),
   "mean" double,
   "maximum" double,
   "minimum" double,
   "sum" double,
   "std_dev" double,
   approx_percentiles ARRAY )
   WITH (
   format='parquet',
   external_location='s3a://tempwrite/hudi/'
   )
   I just wanted to know whether this is the right way of doing it, or whether 
it will lose any functionality.







[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-10 Thread GitBox


asheeshgarg commented on issue #1787:
URL: https://github.com/apache/hudi/issues/1787#issuecomment-656772241


   @vinothchandar sorry for the delay; I will try to pull the logs and attach 
them.
   
   Another quick question: I have created an external table using Presto for 
the data I have written to S3, using:
   CREATE TABLE test.hoodie_test2(
   "_hoodie_commit_time" varchar,
   "_hoodie_commit_seqno" varchar,
   "_hoodie_record_key" varchar,
   "_hoodie_partition_path" varchar,
   "_hoodie_file_name" varchar,
   "column" varchar,
   "data_type" varchar,
   "is_data_type_inferred" varchar,
   "completeness" double,
   "approximate_num_distinct_values" bigint,
   "histogram" array(row(count bigint, ratio double, value varchar)),
   "mean" double,
   "maximum" double,
   "minimum" double,
   "sum" double,
   "std_dev" double,
   approx_percentiles ARRAY  )
   WITH (
   format='parquet',
   external_location='s3a://tempwrite/hudi/'
   )
   
   It worked fine and I am able to query it with Presto.
   I haven't added the jar 
presto_install>/plugin/hive-hadoop2/hudi-presto-bundle.jar, yet it still works 
fine; I think it is reading the parquet files directly? Is this the right way 
to do it, or does it need to be done differently?
   







[GitHub] [hudi] asheeshgarg commented on issue #1787: Exception During Insert

2020-07-03 Thread GitBox


asheeshgarg commented on issue #1787:
URL: https://github.com/apache/hudi/issues/1787#issuecomment-653641776


   @leesf after adding the options it works fine. Does setting the option to 
false have any impact?


