Cristian created SPARK-8435:
-------------------------------

             Summary: Cannot create tables in a specific database using a provider
                 Key: SPARK-8435
                 URL: https://issues.apache.org/jira/browse/SPARK-8435
             Project: Spark
          Issue Type: Bug
         Environment: Spark SQL 1.4.0 (Spark-Shell), Hive metastore, MySQL 
Driver, Linux
            Reporter: Cristian


Hello,

I've been trying to create tables in different databases (catalogs) using a Hive metastore, and when I execute the CREATE statement the table is created in the default database instead.

This is what I'm trying:
{quote}
scala> sqlContext.sql("CREATE DATABASE IF NOT EXISTS testmetastore COMMENT 
'Testing catalogs' ")
scala> sqlContext.sql("USE testmetastore")
scala> sqlContext.sql("CREATE TABLE students USING org.apache.spark.sql.parquet OPTIONS (path '/user/hive', highavailability 'true', DefaultLimit '1000')")
{quote}
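
To see where the table actually lands, the tables in each database can be listed right from the shell (a quick check; SQLContext.tableNames takes a database name):

{quote}
scala> sqlContext.tableNames("testmetastore")   // expected to contain "students"
scala> sqlContext.tableNames("default")         // "students" actually shows up here
{quote}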

And this is what I get. It is partially working: when it checks whether the table already exists, it searches in the correct database (testmetastore). But when it finally creates the table, it uses the default database.

{quote}
scala> sqlContext.sql("CREATE TABLE students USING a OPTIONS (highavailability 
'true', DefaultLimit '1000')").show
15/06/18 10:28:48 INFO HiveMetaStore: 0: get_table : db=*testmetastore* 
tbl=students
15/06/18 10:28:48 INFO audit: ugi=ccaballero    ip=unknown-ip-addr      
cmd=get_table : db=testmetastore tbl=students   
15/06/18 10:28:48 INFO Persistence: Request to load fields "comment,name,type" 
of class org.apache.hadoop.hive.metastore.model.MFieldSchema but object is 
embedded, so ignored
15/06/18 10:28:48 INFO Persistence: Request to load fields "comment,name,type" 
of class org.apache.hadoop.hive.metastore.model.MFieldSchema but object is 
embedded, so ignored
15/06/18 10:28:48 INFO HiveMetaStore: 0: create_table: 
Table(tableName:students, dbName:*default*, owner:ccaballero, 
createTime:1434616128, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[FieldSchema(name:col, type:array<string>, 
comment:from deserializer)], location:null, 
inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe, 
parameters:{DefaultLimit=1000, serialization.format=1, highavailability=true}), 
bucketCols:[], sortCols:[], parameters:{}, 
skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{EXTERNAL=TRUE, 
spark.sql.sources.provider=a}, viewOriginalText:null, viewExpandedText:null, 
tableType:MANAGED_TABLE)
15/06/18 10:28:48 INFO audit: ugi=ccaballero    ip=unknown-ip-addr      
cmd=create_table: Table(tableName:students, dbName:default, owner:ccaballero, 
createTime:1434616128, lastAccessTime:0, retention:0, 
sd:StorageDescriptor(cols:[FieldSchema(name:col, type:array<string>, 
comment:from deserializer)], location:null, 
inputFormat:org.apache.hadoop.mapred.SequenceFileInputFormat, 
outputFormat:org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat, 
compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, 
serializationLib:org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe, 
parameters:{DefaultLimit=1000, serialization.format=1, highavailability=true}), 
bucketCols:[], sortCols:[], parameters:{}, 
skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], 
skewedColValueLocationMaps:{})), partitionKeys:[], parameters:{EXTERNAL=TRUE, 
spark.sql.sources.provider=a}, viewOriginalText:null, viewExpandedText:null, 
tableType:MANAGED_TABLE)      
15/06/18 10:28:49 INFO SparkContext: Starting job: show at <console>:20
15/06/18 10:28:49 INFO DAGScheduler: Got job 2 (show at <console>:20) with 1 
output partitions (allowLocal=false)
15/06/18 10:28:49 INFO DAGScheduler: Final stage: ResultStage 2(show at 
<console>:20)
15/06/18 10:28:49 INFO DAGScheduler: Parents of final stage: List()
15/06/18 10:28:49 INFO DAGScheduler: Missing parents: List()
15/06/18 10:28:49 INFO DAGScheduler: Submitting ResultStage 2 
(MapPartitionsRDD[6] at show at <console>:20), which has no missing parents
15/06/18 10:28:49 INFO MemoryStore: ensureFreeSpace(1792) called with curMem=0, 
maxMem=278302556
15/06/18 10:28:49 INFO MemoryStore: Block broadcast_2 stored as values in 
memory (estimated size 1792.0 B, free 265.4 MB)
15/06/18 10:28:49 INFO MemoryStore: ensureFreeSpace(1139) called with 
curMem=1792, maxMem=278302556
15/06/18 10:28:49 INFO MemoryStore: Block broadcast_2_piece0 stored as bytes in 
memory (estimated size 1139.0 B, free 265.4 MB)
15/06/18 10:28:49 INFO BlockManagerInfo: Added broadcast_2_piece0 in memory on 
localhost:59110 (size: 1139.0 B, free: 265.4 MB)
15/06/18 10:28:49 INFO SparkContext: Created broadcast 2 from broadcast at 
DAGScheduler.scala:874
15/06/18 10:28:49 INFO DAGScheduler: Submitting 1 missing tasks from 
ResultStage 2 (MapPartitionsRDD[6] at show at <console>:20)
15/06/18 10:28:49 INFO TaskSchedulerImpl: Adding task set 2.0 with 1 tasks
15/06/18 10:28:49 INFO TaskSetManager: Starting task 0.0 in stage 2.0 (TID 2, 
localhost, PROCESS_LOCAL, 1379 bytes)
15/06/18 10:28:49 INFO Executor: Running task 0.0 in stage 2.0 (TID 2)
15/06/18 10:28:49 INFO Executor: Finished task 0.0 in stage 2.0 (TID 2). 628 
bytes result sent to driver
15/06/18 10:28:49 INFO TaskSetManager: Finished task 0.0 in stage 2.0 (TID 2) 
in 10 ms on localhost (1/1)
15/06/18 10:28:49 INFO DAGScheduler: ResultStage 2 (show at <console>:20) 
finished in 0.010 s
15/06/18 10:28:49 INFO TaskSchedulerImpl: Removed TaskSet 2.0, whose tasks have 
all completed, from pool 
15/06/18 10:28:49 INFO DAGScheduler: Job 2 finished: show at <console>:20, took 
0.016204 s
++
||
++
++

{quote}
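
The trace above shows the mismatch directly: the existence check (get_table) runs with db=testmetastore, while create_table is issued with db=default, even though USE testmetastore was executed first. The session itself still reports the expected database (a sketch; current_database() is a standard Hive UDF, which I assume the HiveContext resolves here):

{quote}
scala> sqlContext.sql("SELECT current_database()").show()
// reports testmetastore, yet the table was created in default
{quote}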

Any suggestions would be appreciated.

Thank you.
