[ 
https://issues.apache.org/jira/browse/SPARK-8435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sean Owen updated SPARK-8435:
-----------------------------
    Component/s: SQL

> Cannot create tables in a specific database using a provider
> ------------------------------------------------------------
>
>                 Key: SPARK-8435
>                 URL: https://issues.apache.org/jira/browse/SPARK-8435
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>         Environment: Spark SQL 1.4.0 (Spark-Shell), Hive metastore, MySQL 
> Driver, Linux
>            Reporter: Cristian
>
> Hello,
> I have been trying to create tables in different catalogs using a Hive 
> metastore, but when I execute the CREATE TABLE statement, the table is 
> created in the default catalog.
> This is what I am running:
> {quote}
> scala> sqlContext.sql("CREATE DATABASE IF NOT EXISTS testmetastore COMMENT 
> 'Testing catalogs' ")
> scala> sqlContext.sql("USE testmetastore")
> scala> sqlContext.sql("CREATE TABLE students USING 
> org.apache.spark.sql.parquet OPTIONS (path '/user/hive', highavailability 
> 'true', DefaultLimit '1000')")
> {quote}
> And this is what I get. It appears to work partially: when it checks whether 
> the table exists, it searches in the correct catalog (testmetastore). But 
> when it finally creates the table, it uses the default catalog.
> {quote}
> scala> sqlContext.sql("CREATE TABLE students USING a OPTIONS 
> (highavailability 'true', DefaultLimit '1000')").show
> 15/06/18 10:28:48 INFO HiveMetaStore: 0: get_table : db=*testmetastore* 
> tbl=students
> 15/06/18 10:28:48 INFO audit: ugi=ccaballero  ip=unknown-ip-addr      
> cmd=get_table : db=testmetastore tbl=students   
> 15/06/18 10:28:48 INFO Persistence: Request to load fields 
> "comment,name,type" of class 
> org.apache.hadoop.hive.metastore.model.MFieldSchema but object is embedded, 
> so ignored
> 15/06/18 10:28:48 INFO Persistence: Request to load fields 
> "comment,name,type" of class 
> org.apache.hadoop.hive.metastore.model.MFieldSchema but object is embedded, 
> so ignored
> 15/06/18 10:28:48 INFO HiveMetaStore: 0: create_table: 
> Table(tableName:students, dbName:*default*, owner:ccaballero, 
> createTime:1434616128, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:col, type:array<string>, 
> comment:from deserializer)], location:null, 
> inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe, 
> parameters:{DefaultLimit=1000, serialization.format=1, 
> highavailability=true}), bucketCols:[], sortCols:[], parameters:{}, 
> skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{EXTERNAL=TRUE, 
> spark.sql.sources.provider=a}, viewOriginalText:null, viewExpandedText:null, 
> tableType:MANAGED_TABLE)
> 15/06/18 10:28:48 INFO audit: ugi=ccaballero  ip=unknown-ip-addr      
> cmd=create_table: Table(tableName:students, dbName:default, owner:ccaballero, 
> createTime:1434616128, lastAccessTime:0, retention:0, 
> sd:StorageDescriptor(cols:[FieldSchema(name:col, type:array<string>, 
> comment:from deserializer)], location:null, 
> inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
> outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
> compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
> serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe, 
> parameters:{DefaultLimit=1000, serialization.format=1, 
> highavailability=true}), bucketCols:[], sortCols:[], parameters:{}, 
> skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
> skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{EXTERNAL=TRUE, 
> spark.sql.sources.provider=a}, viewOriginalText:null, viewExpandedText:null, 
> tableType:MANAGED_TABLE)      
> 15/06/18 10:28:49 INFO SparkContext: Starting job: show at <console>:20
> 15/06/18 10:28:49 INFO DAGScheduler: Got job 2 (show at <console>:20) with 1 
> output partitions (allowLocal=false)
> 15/06/18 10:28:49 INFO DAGScheduler: Final stage: ResultStage 2(show at 
> <console>:20)
> 15/06/18 10:28:49 INFO DAGScheduler: Parents of final stage: List()
> 15/06/18 10:28:49 INFO DAGScheduler: Missing parents: List()
> 15/06/18 10:28:49 INFO DAGScheduler: Submitting ResultStage 2 
> (MapPartitionsRDD[6] at show at <console>:20), which has no missing parents
> 15/06/18 10:28:49 INFO MemoryStore: ensureFreeSpace(1792) called with 
> curMem=0, maxMem=278302556
> 15/06/18 10:28:49 INFO MemoryStore: Block broadcast_2 stored as values in 
> memory (estimated size 1792.0 B, free 265.4 MB)
> 15/06/18 10:28:49 INFO MemoryStore: ensureFreeSpace(1139) called with 
> curMem=1792, maxMem=278302556
> 15/06/18 10:28:49 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes 
> in memory (estimated size 1139.0 B, free 265.4 MB)
> 15/06/18 10:28:49 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory 
> on localhost:59110 (size: 1139.0 B, free: 265.4 MB)
> 15/06/18 10:28:49 INFO SparkContext: Created broadcast 2 from broadcast at 
> DAGScheduler.scala:874
> 15/06/18 10:28:49 INFO DAGScheduler: Submitting 1 missing tasks from 
> ResultStage 2 (MapPartitionsRDD[6] at show at <console>:20)
> 15/06/18 10:28:49 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
> 15/06/18 10:28:49 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, 
> localhost, PROCESS_LOCAL, 1379 bytes)
> 15/06/18 10:28:49 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
> 15/06/18 10:28:49 INFO Executor: Finished task 0.0 in stage 2.0 (TID 2). 628 
> bytes result sent to driver
> 15/06/18 10:28:49 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) 
> in 10 ms on localhost (1/1)
> 15/06/18 10:28:49 INFO DAGScheduler: ResultStage 2 (show at <console>:20) 
> finished in 0.010 s
> 15/06/18 10:28:49 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks 
> have all completed, from pool 
> 15/06/18 10:28:49 INFO DAGScheduler: Job 2 finished: show at <console>:20, 
> took 0.016204 s
> ++
> ||
> ++
> ++
> {quote}
> Any suggestions would be appreciated.
> Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
