[GitHub] spark pull request #14572: [SPARK-17192] [SQL] Store the Inferred Schemas in...

gatorsmile Mon, 22 Aug 2016 13:44:46 -0700

Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/14572#discussion_r75754936
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala ---
    @@ -72,29 +72,19 @@ case class PreprocessDDL(conf: SQLConf) extends Rule[LogicalPlan] {
     
       def apply(plan: LogicalPlan): LogicalPlan = plan transform {
         // When we CREATE TABLE without specifying the table schema, we should fail the query if
    -    // bucketing information is specified, as we can't infer bucketing from data files currently,
    -    // and we should ignore the partition columns if it's specified, as we will infer it later, at
    -    // runtime.
    +    // bucketing information is specified, as we can't infer bucketing from data files currently.
    +    // Since the runtime inferred partition columns could be different from what user specified,
    +    // we fail the query if the partitioning information is specified.
         case c @ CreateTable(tableDesc, _, None) if tableDesc.schema.isEmpty =>
           if (tableDesc.bucketSpec.isDefined) {
             failAnalysis("Cannot specify bucketing information if the table schema is not specified " +
               "when creating and will be inferred at runtime")
           }
    -
    -      val partitionColumnNames = tableDesc.partitionColumnNames
    -      if (partitionColumnNames.nonEmpty) {
    -        // The table does not have a specified schema, which means that the schema will be inferred
    -        // at runtime. So, we are not expecting partition columns and we will discover partitions
    -        // at runtime. However, if there are specified partition columns, we simply ignore them and
    -        // provide a warning message.
    -        logWarning(
    -          s"Specified partition columns (${partitionColumnNames.mkString(",")}) will be " +
    -            s"ignored. The schema and partition columns of table ${tableDesc.identifier} will " +
    -            "be inferred.")
    -        c.copy(tableDesc = tableDesc.copy(partitionColumnNames = Nil))
    -      } else {
    -        c
    +      if (tableDesc.partitionColumnNames.nonEmpty) {
    +        failAnalysis("Cannot specify partition information if the table schema is not specified " +
    +          "when creating and will be inferred at runtime")
           }
    +      c
    --- End diff --
    
    A new JIRA has been created and the PR title has been updated. Thanks!
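
The behavior change in the diff above (failing the query instead of logging a warning and dropping the user-specified partition columns) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code in rules.scala: `TableDesc`, `AnalysisError`, `PreprocessSketch`, and `check` are hypothetical stand-ins for Spark's catalog and analysis types, and the schema is reduced to a list of column names.

```scala
// Hypothetical, simplified stand-ins; these are NOT Spark's real classes.
final case class TableDesc(
    schema: Seq[String],               // column names; empty means "infer at runtime"
    partitionColumnNames: Seq[String], // user-specified partition columns
    bucketSpec: Option[Int])           // number of buckets, if the user specified bucketing

// Stand-in for the exception an analyzer rule would raise.
final class AnalysisError(message: String) extends Exception(message)

object PreprocessSketch {
  private def failAnalysis(msg: String): Nothing = throw new AnalysisError(msg)

  // Returns the descriptor unchanged when it is acceptable; throws otherwise.
  def check(tableDesc: TableDesc): TableDesc = {
    if (tableDesc.schema.isEmpty) {
      // Bucketing cannot be inferred from data files, so reject it outright.
      if (tableDesc.bucketSpec.isDefined) {
        failAnalysis("Cannot specify bucketing information if the table schema is not specified " +
          "when creating and will be inferred at runtime")
      }
      // Inferred partition columns may differ from the user's, so reject
      // them as well instead of silently ignoring them.
      if (tableDesc.partitionColumnNames.nonEmpty) {
        failAnalysis("Cannot specify partition information if the table schema is not specified " +
          "when creating and will be inferred at runtime")
      }
    }
    tableDesc
  }
}
```

In this sketch a descriptor with an explicit schema passes through untouched whatever its partitioning or bucketing, while a schema-less descriptor is accepted only when it specifies neither.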


