[ https://issues.apache.org/jira/browse/SPARK-20808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018366#comment-16018366 ]
lyc commented on SPARK-20808:
-----------------------------

Since 2.2.0, `createExternalTable` is deprecated and you should use `createTable` instead. I ran your code with `createTable` against the latest master (88e6d7, 2.3.0): there is no warning and the table can be queried correctly. Can you check whether the problem still occurs against the latest master?


> External Table unnecessarily not created in Hive-compatible way
> ----------------------------------------------------------------
>
>                 Key: SPARK-20808
>                 URL: https://issues.apache.org/jira/browse/SPARK-20808
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.1.0, 2.1.1
>            Reporter: Joachim Hereth
>            Priority: Minor
>
> In Spark 2.1.0 and 2.1.1, {{spark.catalog.createExternalTable}} unnecessarily creates tables in a Hive-incompatible way.
>
> For instance, executing in a Spark shell
> {code}
> val database = "default"
> val table = "table_name"
> val path = "/user/daki/" + database + "/" + table
> val data = Array(("Alice", 23), ("Laura", 33), ("Peter", 54))
> val df = sc.parallelize(data).toDF("name", "age")
> df.write.mode(org.apache.spark.sql.SaveMode.Overwrite).parquet(path)
> spark.sql("DROP TABLE IF EXISTS " + database + "." + table)
> spark.catalog.createExternalTable(database + "." + table, path)
> {code}
> issues the warning
> {code}
> Search Subject for Kerberos V5 INIT cred (<<DEF>>, sun.security.jgss.krb5.Krb5InitCredential)
> 17/05/19 11:01:17 WARN hive.HiveExternalCatalog: Could not persist `default`.`table_name` in a Hive compatible way. Persisting it into Hive metastore in Spark SQL specific format.
> org.apache.hadoop.hive.ql.metadata.HiveException: MetaException(message:User daki does not have privileges for CREATETABLE)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.createTable(Hive.java:720)
> ...
> {code}
> The exception (user does not have privileges for CREATETABLE) is misleading: I do have the CREATE TABLE privilege.
>
> Querying the table with Hive does not return any result, while Spark can access the data.
>
> The following code creates the table correctly (workaround):
> {code}
> def sqlStatement(df: org.apache.spark.sql.DataFrame, database: String, table: String, path: String): String = {
>   val rows = (for (col <- df.schema)
>     yield "`" + col.name + "` " + col.dataType.simpleString).mkString(",\n")
>   val sqlStmnt = ("CREATE EXTERNAL TABLE `%s`.`%s` (%s) " +
>     "STORED AS PARQUET " +
>     "LOCATION 'hdfs://nameservice1%s'").format(database, table, rows, path)
>   sqlStmnt
> }
> spark.sql("DROP TABLE IF EXISTS " + database + "." + table)
> spark.sql(sqlStatement(df, database, table, path))
> {code}
> The code is executed via YARN against a Cloudera CDH 5.7.5 cluster with Sentry enabled (in case this matters regarding the privilege warning). Spark was built against the CDH libraries.
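
For readers on 2.2+, below is a minimal sketch of the reproduction rewritten to use the non-deprecated `createTable` call suggested in the comment above. The database, table, and path names are simply reused from the report, the snippet assumes a Spark shell (so `spark` and `sc` are already in scope), and it has not been verified against the reporter's CDH/Sentry setup.

{code}
// Spark shell, 2.2+. Reuses the names from the reproduction above.
val database = "default"
val table = "table_name"
val path = "/user/daki/" + database + "/" + table

// Write some parquet data to the external location.
val df = sc.parallelize(Array(("Alice", 23), ("Laura", 33), ("Peter", 54))).toDF("name", "age")
df.write.mode(org.apache.spark.sql.SaveMode.Overwrite).parquet(path)

spark.sql("DROP TABLE IF EXISTS " + database + "." + table)

// createTable replaces the deprecated createExternalTable: given a path, it
// registers the existing files as an external table using the default data
// source (parquet unless spark.sql.sources.default is changed).
spark.catalog.createTable(database + "." + table, path)

// Sanity check: the table should be queryable from Spark SQL.
spark.sql("SELECT * FROM " + database + "." + table).show()
{code}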