[
https://issues.apache.org/jira/browse/SPARK-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen updated SPARK-8435:
-----------------------------
Component/s: SQL
> Cannot create tables in a specific database using a provider
> ------------------------------------------------------------
>
> Key: SPARK-8435
> URL: https://issues.apache.org/jira/browse/SPARK-8435
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Environment: Spark SQL 1.4.0 (Spark-Shell), Hive metastore, MySQL
> Driver, Linux
> Reporter: Cristian
>
> Hello,
> I've been trying to create tables in different databases using a Hive
> metastore, but when I execute the CREATE statement, the table is created
> in the default database instead.
> This is what I'm trying:
> {quote}
> scala> sqlContext.sql("CREATE DATABASE IF NOT EXISTS testmetastore COMMENT
> 'Testing catalogs' ")
> scala> sqlContext.sql("USE testmetastore")
> scala> sqlContext.sql("CREATE TABLE students USING
> org.apache.spark.sql.parquet OPTIONS (path '/user/hive', highavailability
> 'true', DefaultLimit '1000')")
> {quote}
> And this is what I get. It is partially working: when it checks whether the
> table already exists, it looks in the correct database (testmetastore). But
> when it actually creates the table, it uses the default database.
> {quote}
> scala> sqlContext.sql("CREATE TABLE students USING a OPTIONS
> (highavailability 'true', DefaultLimit '1000')").show
> 15/06/18 10:28:48 INFO HiveMetaStore: 0: get_table : db=*testmetastore*
> tbl=students
> 15/06/18 10:28:48 INFO audit: ugi=ccaballero ip=unknown-ip-addr
> cmd=get_table : db=testmetastore tbl=students
> 15/06/18 10:28:48 INFO Persistence: Request to load fields
> "comment,name,type" of class
> org.apache.hadoop.hive.metastore.model.MFieldSchema but object is embedded,
> so ignored
> 15/06/18 10:28:48 INFO Persistence: Request to load fields
> "comment,name,type" of class
> org.apache.hadoop.hive.metastore.model.MFieldSchema but object is embedded,
> so ignored
> 15/06/18 10:28:48 INFO HiveMetaStore: 0: create_table:
> Table(tableName:students, dbName:*default*, owner:ccaballero,
> createTime:1434616128, lastAccessTime:0, retention:0,
> sd:StorageDescriptor(cols:[FieldSchema(name:col, type:array<string>,
> comment:from deserializer)], location:null,
> inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat,
> outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat,
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
> serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe,
> parameters:{DefaultLimit=1000, serialization.format=1,
> highavailability=true}), bucketCols:[], sortCols:[], parameters:{},
> skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[],
> skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{EXTERNAL=TRUE,
> spark.sql.sources.provider=a}, viewOriginalText:null, viewExpandedText:null,
> tableType:MANAGED_TABLE)
> 15/06/18 10:28:48 INFO audit: ugi=ccaballero ip=unknown-ip-addr
> cmd=create_table: Table(tableName:students, dbName:default, owner:ccaballero,
> createTime:1434616128, lastAccessTime:0, retention:0,
> sd:StorageDescriptor(cols:[FieldSchema(name:col, type:array<string>,
> comment:from deserializer)], location:null,
> inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat,
> outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat,
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null,
> serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe,
> parameters:{DefaultLimit=1000, serialization.format=1,
> highavailability=true}), bucketCols:[], sortCols:[], parameters:{},
> skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[],
> skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{EXTERNAL=TRUE,
> spark.sql.sources.provider=a}, viewOriginalText:null, viewExpandedText:null,
> tableType:MANAGED_TABLE)
> 15/06/18 10:28:49 INFO SparkContext: Starting job: show at <console>:20
> 15/06/18 10:28:49 INFO DAGScheduler: Got job 2 (show at <console>:20) with 1
> output partitions (allowLocal=false)
> 15/06/18 10:28:49 INFO DAGScheduler: Final stage: ResultStage 2(show at
> <console>:20)
> 15/06/18 10:28:49 INFO DAGScheduler: Parents of final stage: List()
> 15/06/18 10:28:49 INFO DAGScheduler: Missing parents: List()
> 15/06/18 10:28:49 INFO DAGScheduler: Submitting ResultStage 2
> (MapPartitionsRDD[6] at show at <console>:20), which has no missing parents
> 15/06/18 10:28:49 INFO MemoryStore: ensureFreeSpace(1792) called with
> curMem=0, maxMem=278302556
> 15/06/18 10:28:49 INFO MemoryStore: Block broadcast_2 stored as values in
> memory (estimated size 1792.0 B, free 265.4 MB)
> 15/06/18 10:28:49 INFO MemoryStore: ensureFreeSpace(1139) called with
> curMem=1792, maxMem=278302556
> 15/06/18 10:28:49 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes
> in memory (estimated size 1139.0 B, free 265.4 MB)
> 15/06/18 10:28:49 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory
> on localhost:59110 (size: 1139.0 B, free: 265.4 MB)
> 15/06/18 10:28:49 INFO SparkContext: Created broadcast 2 from broadcast at
> DAGScheduler.scala:874
> 15/06/18 10:28:49 INFO DAGScheduler: Submitting 1 missing tasks from
> ResultStage 2 (MapPartitionsRDD[6] at show at <console>:20)
> 15/06/18 10:28:49 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
> 15/06/18 10:28:49 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2,
> localhost, PROCESS_LOCAL, 1379 bytes)
> 15/06/18 10:28:49 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
> 15/06/18 10:28:49 INFO Executor: Finished task 0.0 in stage 2.0 (TID 2). 628
> bytes result sent to driver
> 15/06/18 10:28:49 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2)
> in 10 ms on localhost (1/1)
> 15/06/18 10:28:49 INFO DAGScheduler: ResultStage 2 (show at <console>:20)
> finished in 0.010 s
> 15/06/18 10:28:49 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks
> have all completed, from pool
> 15/06/18 10:28:49 INFO DAGScheduler: Job 2 finished: show at <console>:20,
> took 0.016204 s
> ++
> ||
> ++
> ++
> {quote}
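> A quick way to confirm where the table actually ended up (this assumes the
> same spark-shell session, with sqlContext backed by the same Hive metastore)
> is to list the tables registered in each database:
> {quote}
> scala> sqlContext.tableNames("default")
> scala> sqlContext.tableNames("testmetastore")
> {quote}
> In my case "students" appears under default rather than testmetastore, which
> is consistent with the create_table line in the log above.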
> Any suggestions would be appreciated.
> Thank you.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)