Brian Cho created SPARK-16926:
---------------------------------

             Summary: Partition columns are present in columns metadata for partition but not table
                 Key: SPARK-16926
                 URL: https://issues.apache.org/jira/browse/SPARK-16926
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Brian Cho


A change introduced in SPARK-14388 removes partition columns from the column
metadata of tables, but not from the column metadata of partitions. This causes
TableReader to believe that the table and partition schemas differ and to
create an unnecessary conversion object inspector, taking the else codepath in
TableReader below:

{code}
    val soi = if (rawDeser.getObjectInspector.equals(tableDeser.getObjectInspector)) {
      rawDeser.getObjectInspector.asInstanceOf[StructObjectInspector]
    } else {
      ObjectInspectorConverters.getConvertedOI(
        rawDeser.getObjectInspector,
        tableDeser.getObjectInspector).asInstanceOf[StructObjectInspector]
    }
{code}
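For illustration only, here is a minimal standalone sketch (not taken from the
issue) of why the equality check above fails. It assumes Hive's
LazySimpleSerDe and uses hypothetical column names (c1..c6, p1..p3): the
partition-side deserializer is initialized with the partition properties (nine
column types) while the table-side deserializer only sees six, so their object
inspectors are expected to compare unequal.

{code}
import java.util.Properties

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

// Table-side properties: data columns only (matches the table debug output).
val tableProps = new Properties()
tableProps.setProperty("columns", "c1,c2,c3,c4,c5,c6")
tableProps.setProperty("columns.types",
  "string:bigint:string:bigint:bigint:array<string>")

// Partition-side properties: data columns plus the three partition columns
// (matches the partition debug output).
val partProps = new Properties()
partProps.setProperty("columns", "c1,c2,c3,c4,c5,c6,p1,p2,p3")
partProps.setProperty("columns.types",
  "string:bigint:string:bigint:bigint:array<string>:string:string:string")

val conf = new Configuration()

val tableDeser = new LazySimpleSerDe()
tableDeser.initialize(conf, tableProps)

val rawDeser = new LazySimpleSerDe()
rawDeser.initialize(conf, partProps)

// With six fields on one side and nine on the other, the struct object
// inspectors differ, so TableReader takes the getConvertedOI branch.
println(rawDeser.getObjectInspector.equals(tableDeser.getObjectInspector))  // expected: false
{code}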

Printing the properties as debug output confirms the difference for the Hive 
table.

Table properties (tableDesc.getProperties):
{code}
16/08/04 20:36:58 DEBUG HadoopTableReader: columns.types, string:bigint:string:bigint:bigint:array<string>
{code}

Partition properties (partProps):
{code}
16/08/04 20:36:58 DEBUG HadoopTableReader: columns.types, string:bigint:string:bigint:bigint:array<string>:string:string:string
{code}

The final three string columns are the partition columns; they appear in the
partition properties but not in the table properties.
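For reference, a hypothetical reproduction sketch (the table name, columns,
and partition value below are made up, not from the issue). With DEBUG logging
enabled for HadoopTableReader, plus the extra property printing mentioned
above, reading any partitioned Hive table should show the mismatch:

{code}
import org.apache.log4j.{Level, Logger}
import org.apache.spark.sql.SparkSession

// Surface HadoopTableReader's debug output (the property dump quoted above was
// added locally; this only enables the logger for the class).
Logger.getLogger("org.apache.spark.sql.hive.HadoopTableReader").setLevel(Level.DEBUG)

val spark = SparkSession.builder()
  .appName("SPARK-16926-repro")
  .enableHiveSupport()
  .getOrCreate()

// Hypothetical partitioned Hive table; any table with partition columns should do.
spark.sql("CREATE TABLE IF NOT EXISTS repro_t (id BIGINT, name STRING) PARTITIONED BY (ds STRING)")
spark.sql("INSERT OVERWRITE TABLE repro_t PARTITION (ds='2016-08-04') SELECT 1L, 'a'")

// Reading the partition goes through HadoopTableReader; because the partition
// properties still list the partition column, the converted-OI branch is taken.
spark.sql("SELECT * FROM repro_t WHERE ds = '2016-08-04'").show()
{code}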



