Re: Unable to write data into hive table using Spark via Hive JDBC driver Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED

Artemis User Mon, 19 Jul 2021 06:33:53 -0700

As Mich mentioned, no need to use jdbc API, using the DataFrameWriter'ssaveAsTable method is the way to go. JDBC Driver is for a JDBC client(a Java client for instance) to access the Hive tables in Spark via theThrift server interface.


-- ND


On 7/19/21 2:42 AM, Badrinath Patchikolla wrote:

I have trying to create table in hive from spark itself,

And using local mode it will work what I am trying here is from sparkstandalone I want to create the manage table in hive (another sparkcluster basically CDH) using jdbc mode.


When I try that below are the error I am facing.

On Thu, 15 Jul, 2021, 9:55 pm Mich Talebzadeh,<mich.talebza...@gmail.com <mailto:mich.talebza...@gmail.com>> wrote:


    Have you created that table in Hive or are you trying to create it
    from Spark itself.

    You Hive is local. In this case you don't need a JDBC connection.
    Have you tried:

    df2.write.mode("overwrite").saveAsTable(mydb.mytable)

    HTH




    **view my Linkedin profile
    <https://www.linkedin.com/in/mich-talebzadeh-ph-d-5205b2/>

    *Disclaimer:* Use it at your own risk.Any and all responsibility
    for any loss, damage or destruction of data or any other property
    which may arise from relying on this email's technical content is
    explicitly disclaimed. The author will in no case be liable for
    any monetary damages arising from such loss, damage or destruction.



    On Thu, 15 Jul 2021 at 12:51, Badrinath Patchikolla
    <pbadrinath1...@gmail.com <mailto:pbadrinath1...@gmail.com>> wrote:

        Hi,

        Trying to write data in spark to the hive as JDBC mode below 
        is the sample code:

        spark standalone 2.4.7 version

        21/07/15 08:04:07 WARN util.NativeCodeLoader: Unable to load
        native-hadoop library for your platform... using builtin-java
        classes where applicable
        Setting default log level to "WARN".
        To adjust logging level use sc.setLogLevel(newLevel). For
        SparkR, use setLogLevel(newLevel).
        Spark context Web UI available at http://localhost:4040
        <http://localhost:4040>
        Spark context available as 'sc' (master =
        spark://localhost:7077, app id = app-20210715080414-0817).
        Spark session available as 'spark'.
        Welcome to
              ____              __
             / __/__  ___ _____/ /__
            _\ \/ _ \/ _ `/ __/  '_/
           /___/ .__/\_,_/_/ /_/\_\   version 2.4.7
              /_/

        Using Scala version 2.11.12 (OpenJDK 64-Bit Server VM, Java
        1.8.0_292)
        Type in expressions to have them evaluated.
        Type :help for more information.

        scala> :paste
        // Entering paste mode (ctrl-D to finish)

        val df = Seq(
            ("John", "Smith", "London"),
            ("David", "Jones", "India"),
            ("Michael", "Johnson", "Indonesia"),
            ("Chris", "Lee", "Brazil"),
            ("Mike", "Brown", "Russia")
          ).toDF("first_name", "last_name", "country")


         df.write
          .format("jdbc")
          .option("url",
        "jdbc:hive2://localhost:10000/foundation;AuthMech=2;UseNativeQuery=0")
          .option("dbtable", "test.test")
          .option("user", "admin")
          .option("password", "admin")
          .option("driver", "com.cloudera.hive.jdbc41.HS2Driver")
          .mode("overwrite")
          .save


        // Exiting paste mode, now interpreting.

        java.sql.SQLException: [Cloudera][HiveJDBCDriver](500051)
        ERROR processing query/statement. Error Code: 40000, SQL
        state: TStatus(statusCode:ERROR_STATUS,
        infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Error
        while compiling statement: FAILED: ParseException line 1:39
        cannot recognize input near '"first_name"' 'TEXT' ',' in
        column name or primary key or foreign key:28:27,
        
org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:329,
        
org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:207,
        
org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:290,
        org.apache.hive.service.cli.operation.Operation:run:Operation.java:260,
        
org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:504,
        
org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementAsync:HiveSessionImpl.java:490,
        sun.reflect.GeneratedMethodAccessor13:invoke::-1,
        
sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43,
        java.lang.reflect.Method:invoke:Method.java:498,
        
org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78,
        
org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36,
        
org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63,
        java.security.AccessController:doPrivileged:AccessController.java:-2,
        javax.security.auth.Subject:doAs:Subject.java:422,
        
org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1875,
        
org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59,
        com.sun.proxy.$Proxy35:executeStatementAsync::-1,
        
org.apache.hive.service.cli.CLIService:executeStatementAsync:CLIService.java:295,
        
org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:507,
        
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1437,
        
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1422,
        org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39,
        org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39,
        
org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56,
        
org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286,
        
java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149,
        
java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624,
        java.lang.Thread:run:Thread.java:748,
        *org.apache.hadoop.hive.ql.parse.ParseException:line 1:39
        cannot recognize input near '"first_name"' 'TEXT' ',' in
        column name or primary key or foreign key:33:6,
        org.apache.hadoop.hive.ql.parse.ParseDriver:parse:ParseDriver.java:221,
        org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:75,
        org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:68,
        org.apache.hadoop.hive.ql.Driver:compile:Driver.java:564,
        org.apache.hadoop.hive.ql.Driver:compileInternal:Driver.java:1425,
        org.apache.hadoop.hive.ql.Driver:compileAndRespond:Driver.java:1398,
        
org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:205],
        sqlState:42000, errorCode:40000, errorMessage:Error while
        compiling statement: FAILED: ParseException line 1:39 cannot
        recognize input near '"first_name"' 'TEXT' ',' in column name
        or primary key or foreign key), Query: CREATE TABLE
        test.test("first_name" TEXT , "last_name" TEXT , "country" TEXT ).
          at
        
com.cloudera.hiveserver2.hivecommon.api.HS2Client.executeStatementInternal(Unknown
        Source)
          at
        
com.cloudera.hiveserver2.hivecommon.api.HS2Client.executeStatement(Unknown
        Source)
          at
        
com.cloudera.hiveserver2.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.executeHelper(Unknown
        Source)
          at
        
com.cloudera.hiveserver2.hivecommon.dataengine.HiveJDBCNativeQueryExecutor.execute(Unknown
        Source)
          at
        com.cloudera.hiveserver2.jdbc.common.SStatement.executeNoParams(Unknown
        Source)
          at
        com.cloudera.hiveserver2.jdbc.common.SStatement.executeAnyUpdate(Unknown
        Source)
          at
        com.cloudera.hiveserver2.jdbc.common.SStatement.executeUpdate(Unknown
        Source)
          at
        
org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.createTable(JdbcUtils.scala:863)
          at
        
org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:81)
          at
        
org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
          at
        
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
          at
        
org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
          at
        
org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
          at
        
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
          at
        
org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
          at
        
org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
          at
        
org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
          at
        
org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
          at
        org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
          at
        
org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:83)
          at
        
org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:81)
          at
        
org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
          at
        
org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:696)
          at
        
org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:80)
          at
        
org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:127)
          at
        
org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:75)
          at
        
org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:696)
          at
        
org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:305)
          at
        org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:291)
          ... 48 elided
        Caused by:
        com.cloudera.hiveserver2.support.exceptions.GeneralException:
        [Cloudera][HiveJDBCDriver](500051) ERROR processing
        query/statement. Error Code: 40000, SQL state:
        TStatus(statusCode:ERROR_STATUS,
        infoMessages:[*org.apache.hive.service.cli.HiveSQLException:Error
        while compiling statement: FAILED: ParseException line 1:39
        cannot recognize input near '"first_name"' 'TEXT' ',' in
        column name or primary key or foreign key:28:27,
        
org.apache.hive.service.cli.operation.Operation:toSQLException:Operation.java:329,
        
org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:207,
        
org.apache.hive.service.cli.operation.SQLOperation:runInternal:SQLOperation.java:290,
        org.apache.hive.service.cli.operation.Operation:run:Operation.java:260,
        
org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementInternal:HiveSessionImpl.java:504,
        
org.apache.hive.service.cli.session.HiveSessionImpl:executeStatementAsync:HiveSessionImpl.java:490,
        sun.reflect.GeneratedMethodAccessor13:invoke::-1,
        
sun.reflect.DelegatingMethodAccessorImpl:invoke:DelegatingMethodAccessorImpl.java:43,
        java.lang.reflect.Method:invoke:Method.java:498,
        
org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:78,
        
org.apache.hive.service.cli.session.HiveSessionProxy:access$000:HiveSessionProxy.java:36,
        
org.apache.hive.service.cli.session.HiveSessionProxy$1:run:HiveSessionProxy.java:63,
        java.security.AccessController:doPrivileged:AccessController.java:-2,
        javax.security.auth.Subject:doAs:Subject.java:422,
        
org.apache.hadoop.security.UserGroupInformation:doAs:UserGroupInformation.java:1875,
        
org.apache.hive.service.cli.session.HiveSessionProxy:invoke:HiveSessionProxy.java:59,
        com.sun.proxy.$Proxy35:executeStatementAsync::-1,
        
org.apache.hive.service.cli.CLIService:executeStatementAsync:CLIService.java:295,
        
org.apache.hive.service.cli.thrift.ThriftCLIService:ExecuteStatement:ThriftCLIService.java:507,
        
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1437,
        
org.apache.hive.service.rpc.thrift.TCLIService$Processor$ExecuteStatement:getResult:TCLIService.java:1422,
        org.apache.thrift.ProcessFunction:process:ProcessFunction.java:39,
        org.apache.thrift.TBaseProcessor:process:TBaseProcessor.java:39,
        
org.apache.hive.service.auth.TSetIpAddressProcessor:process:TSetIpAddressProcessor.java:56,
        
org.apache.thrift.server.TThreadPoolServer$WorkerProcess:run:TThreadPoolServer.java:286,
        
java.util.concurrent.ThreadPoolExecutor:runWorker:ThreadPoolExecutor.java:1149,
        
java.util.concurrent.ThreadPoolExecutor$Worker:run:ThreadPoolExecutor.java:624,
        java.lang.Thread:run:Thread.java:748,
        *org.apache.hadoop.hive.ql.parse.ParseException:line 1:39
        cannot recognize input near '"first_name"' 'TEXT' ',' in
        column name or primary key or foreign key:33:6,
        org.apache.hadoop.hive.ql.parse.ParseDriver:parse:ParseDriver.java:221,
        org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:75,
        org.apache.hadoop.hive.ql.parse.ParseUtils:parse:ParseUtils.java:68,
        org.apache.hadoop.hive.ql.Driver:compile:Driver.java:564,
        org.apache.hadoop.hive.ql.Driver:compileInternal:Driver.java:1425,
        org.apache.hadoop.hive.ql.Driver:compileAndRespond:Driver.java:1398,
        
org.apache.hive.service.cli.operation.SQLOperation:prepare:SQLOperation.java:205],
        sqlState:42000, errorCode:40000, errorMessage:Error while
        compiling statement: FAILED: ParseException line 1:39 cannot
        recognize input near '"first_name"' 'TEXT' ',' in column name
        or primary key or foreign key), Query: CREATE TABLE
        profile_test.person_test ("first_name" TEXT , "last_name" TEXT
        , "country" TEXT ).
          ... 77 more


        Found similar issue ion Jira :

        https://issues.apache.org/jira/browse/SPARK-31614
        <https://issues.apache.org/jira/browse/SPARK-31614>

        There no comments in that and the resolution is Incomplete, is
        there any way we can do in spark to write data into the hive
        as JDBC mode.

        Thanks for any help.


        Thanks,
        Badrinath.

Re: Unable to write data into hive table using Spark via Hive JDBC driver Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED

Reply via email to