[jira] [Created] (SPARK-6644) [SPARK-SQL]when the partition schema does not match table schema(ADD COLUMN), new column is NULL

dongxu (JIRA) Tue, 31 Mar 2015 22:16:39 -0700

dongxu created SPARK-6644:
-----------------------------

             Summary: [SPARK-SQL]when the partition schema does not match table 
schema(ADD COLUMN), new column is NULL
                 Key: SPARK-6644
                 URL: https://issues.apache.org/jira/browse/SPARK-6644
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.3.0
            Reporter: dongxu



In hive,the schema of partition may be difference from the table schema. For 
example, we add new column. When we use spark-sql to query the data of 
partition which schema is difference from the table schema.
some problems is solved(https://github.com/apache/spark/pull/4289), 
but if you add a new column,put new data into the old partition,new column 
value is NULL

[According to the following steps]:

case class TestData(key: Int, value: String)
val testData = TestHive.sparkContext.parallelize(
      (1 to 10).map(i => TestData(i, i.toString))).toDF()
testData.registerTempTable("testData")
 sql("DROP TABLE IF EXISTS table_with_partition ")
 sql(s"CREATE  TABLE  IF NOT EXISTS  table_with_partition(key int,value string) 
PARTITIONED by (ds string) location '${tmpDir.toURI.toString}' ")
 sql("INSERT OVERWRITE TABLE table_with_partition  partition (ds='1') SELECT 
key,value FROM testData")
    // add column to table
 sql("ALTER TABLE table_with_partition ADD COLUMNS(key1 string)")
 sql("ALTER TABLE table_with_partition ADD COLUMNS(destlng double)") 
 sql("INSERT OVERWRITE TABLE table_with_partition  partition (ds='1') SELECT 
key,value,'test',1.11 FROM testData")
 sql("select * from table_with_partition where ds='1' 
").collect().foreach(println)      
 
result : 
[1,1,null,null,1]
[2,2,null,null,1]
 
result we expect:
[1,1,test,1.11,1]
[2,2,test,1.11,1]




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

[jira] [Created] (SPARK-6644) [SPARK-SQL]when the partition schema does not match table schema(ADD COLUMN), new column is NULL

Reply via email to