[GitHub] [hudi] FelixKJose opened a new issue #1875: EMR + Spark Batch job + HUDI + Hive external Metastore (MySQL RDS Instance) failed with No Suitable Driver

GitBox Fri, 24 Jul 2020 15:27:34 -0700


FelixKJose opened a new issue #1875:
URL: https://github.com/apache/hudi/issues/1875



   Hello,
   
   I am getting following error while I am using external RDS instance as Hive 
Metastore. 
   
   **My configuration:**
   
   
   'hoodie.datasource.hive_sync.enable': 'true',
   'hoodie.datasource.hive_sync.database': 'hive_metastore',
   'hoodie.datasource.hive_sync.table': 'calculations',
   'hoodie.datasource.hive_sync.username': 'spark',
   'hoodie.datasource.hive_sync.password': 'password123',
   'hoodie.datasource.hive_sync.jdbcurl': 
'jdbc:mysql://*******************.us-east-1.rds.amazonaws.com:3306'
   
   Error Stacktrace:
   `org.apache.hudi.hive.HoodieHiveSyncException: Cannot create hive connection 
jdbc:mysql://******************.us-east-1.rds.amazonaws.com:3306/
        at 
org.apache.hudi.hive.HoodieHiveClient.createHiveConnection(HoodieHiveClient.java:559)
        at 
org.apache.hudi.hive.HoodieHiveClient.<init>(HoodieHiveClient.java:108)
        at org.apache.hudi.hive.HiveSyncTool.<init>(HiveSyncTool.java:60)
        at 
org.apache.hudi.HoodieSparkSqlWriter$.syncHive(HoodieSparkSqlWriter.scala:236)
        at 
org.apache.hudi.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:169)
        at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:108)
        at 
org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:46)
        at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
        at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
        at 
org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
        at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$execute$1(SparkPlan.scala:131)
        at 
org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:156)
        at 
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
        at 
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
        at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
        at 
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83)
        at 
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:83)
        at 
org.apache.spark.sql.DataFrameWriter.$anonfun$runCommand$1(DataFrameWriter.scala:676)
        at 
org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:84)
        at 
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:165)
        at 
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:74)
        at 
org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
        at 
org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:290)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
        at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:229)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
        at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
        at py4j.Gateway.invoke(Gateway.java:282)
        at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
        at py4j.commands.CallCommand.execute(CallCommand.java:79)
        at py4j.GatewayConnection.run(GatewayConnection.java:238)
        at java.lang.Thread.run(Thread.java:748)
   **Caused by: java.sql.SQLException: No suitable driver found for 
jdbc:mysql://******************.us-east-1.rds.amazonaws.com:3306**
        at java.sql.DriverManager.getConnection(DriverManager.java:689)
        at java.sql.DriverManager.getConnection(DriverManager.java:247)
        at 
org.apache.hudi.hive.HoodieHiveClient.createHiveConnection(HoodieHiveClient.java:556)
        ... 35 more`
   
   
   **Environment Description**
    * EMR: 6.0.0
   
   * Hudi version : Custom HUDI Jar (provided by **Udit Mehrotra** for EMR 
6.0.0 with performance fixes)
   
   * Spark version : 2.4.4
   
   * Storage (HDFS/S3/GCS..) : S3
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [hudi] FelixKJose opened a new issue #1875: EMR + Spark Batch job + HUDI + Hive external Metastore (MySQL RDS Instance) failed with No Suitable Driver

Reply via email to