Hi,

I'm trying to write data from Spark to Hive in JDBC mode. Below is the sample code and the spark-shell session:
Spark standalone 2.4.7:

21/07/15 08:04:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Spark context Web UI available at http://localhost:4040
Spark context available as 'sc' (master = spark://localhost:7077, app id = app-20210715080414-0817).
Spark session available as 'spark'.
Welcome to Spark version 2.4.7
Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java 1.8.0_292)
Type in expressions to have them evaluated.
Type :help for more information.

scala> :paste
// Entering paste mode (ctrl-D to finish)

val df = Seq(
  ("John", "Smith", "London"),
  ("David", "Jones", "India"),
  ("Michael", "Johnson", "Indonesia"),
  ("Chris", "Lee", "Brazil"),
  ("Mike", "Brown", "Russia")
).toDF("first_name", "last_name", "country")

df.write
  .format("jdbc")
  .option("url", "jdbc:hive2://localhost:10000/foundation;AuthMech=2;UseNativeQuery=0")
  .option("dbtable", "test.test")
  .option("user", "admin")
  .option("password", "admin")
  .option("driver", "com.cloudera.hive.jdbc41.HS2Driver")
  .mode("overwrite")
  .save

// Exiting paste mode, now interpreting.

java.sql.SQLException: [Cloudera][HiveJDBCDriver](500051) ERROR processing query/statement.
Error Code: 40000, SQL state: TStatus(statusCode:ERROR_STATUS, infoMessages:[
  *org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: ParseException line 1:39 cannot recognize input near '"first_name"' 'TEXT' ',' in column name or primary key or foreign key
    org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:329
    org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:207
    org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:504
    org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:507
    ...
  *org.apache.hadoop.hive.ql.parse.ParseException: line 1:39 cannot recognize input near '"first_name"' 'TEXT' ',' in column name or primary key or foreign key
    org.apache.hadoop.hive.ql.parse.ParseDriver:parse:ParseDriver.java:221
    org.apache.hadoop.hive.ql.Driver:compile:Driver.java:564
    org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:205
], sqlState:42000, errorCode:40000, errorMessage:Error while compiling statement: FAILED: ParseException line 1:39 cannot recognize input near '"first_name"' 'TEXT' ',' in column name or primary key or foreign key),
Query: CREATE TABLE test.test ("first_name" TEXT , "last_name" TEXT , "country" TEXT )
  at com.cloudera.hiveserver2.hivecommon.api.HS2Client.executeStatementInternal(Unknown Source)
  at com.cloudera.hiveserver2.hivecommon.api.HS2Client.executeStatement(Unknown Source)
  at com.cloudera.hiveserver2.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeHelper(Unknown Source)
  at com.cloudera.hiveserver2.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.execute(Unknown Source)
  at com.cloudera.hiveserver2.jdbc.common.SStatement.executeNoParams(Unknown Source)
  at com.cloudera.hiveserver2.jdbc.common.SStatement.executeAnyUpdate(Unknown Source)
  at com.cloudera.hiveserver2.jdbc.common.SStatement.executeUpdate(Unknown Source)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:863)
  at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:81)
  at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
  at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
  at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
  at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81)
  at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:696)
  at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:305)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:291)
  ... 48 elided
Caused by: com.cloudera.hiveserver2.support.exceptions.GeneralException: [Cloudera][HiveJDBCDriver](500051) ERROR processing query/statement. Error Code: 40000, SQL state: TStatus(... FAILED: ParseException line 1:39 cannot recognize input near '"first_name"' 'TEXT' ',' in column name or primary key or foreign key ...),
Query: CREATE TABLE profile_test.person_test ("first_name" TEXT , "last_name" TEXT , "country" TEXT )
  ... 77 more

I found a similar issue in Jira: https://issues.apache.org/jira/browse/SPARK-31614. There are no comments on it, and its resolution is "Incomplete". Is there any way in Spark to write data into Hive in JDBC mode?

Thanks for any help.

Thanks,
Badrinath.
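From the trace, the failing statement is the CREATE TABLE that Spark's JDBC writer generates: it double-quotes the column names and maps StringType to TEXT, neither of which HiveQL's parser accepts. One direction I've been considering (a sketch only — it assumes the Cloudera driver passes backtick-quoted identifiers and Hive's STRING type through unchanged, which I haven't verified) is to register a custom JdbcDialect before the write:

```scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects, JdbcType}
import org.apache.spark.sql.types.{DataType, StringType}

// Sketch of a Hive-friendly dialect: backtick-quote identifiers and
// emit Hive's STRING type instead of the default "TEXT".
object HiveDialect extends JdbcDialect {
  // Apply this dialect to any hive2 JDBC URL.
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:hive2")

  // HiveQL quotes identifiers with backticks, not double quotes.
  override def quoteIdentifier(colName: String): String =
    s"`$colName`"

  // Map Spark's StringType to Hive's STRING in generated DDL.
  override def getJDBCType(dt: DataType): Option[JdbcType] = dt match {
    case StringType => Some(JdbcType("STRING", java.sql.Types.VARCHAR))
    case _          => None // fall back to Spark's default mappings
  }
}

JdbcDialects.registerDialect(HiveDialect)
// ...then run the df.write.format("jdbc")... call as above.
```

With this registered, the generated DDL should come out as CREATE TABLE test.test (`first_name` STRING, ...), which Hive can parse — though I don't know whether the subsequent batched INSERTs would then succeed through this driver.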