Aniket Kulkarni created SPARK-17550:
---------------------------------------

             Summary: DataFrameWriter.partitionBy() should throw exception if 
column is not present in Dataframe
                 Key: SPARK-17550
                 URL: https://issues.apache.org/jira/browse/SPARK-17550
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
            Reporter: Aniket Kulkarni
            Priority: Minor


I have a Spark job which performs certain computations on event data and 
eventually persists the results to Hive. 

I was trying to write to Hive using the code snippet shown below: 
dataframe.write.format("orc").partitionBy(col1,col2).options(options).mode(SaveMode.Append).saveAsTable(hiveTable)

The write to Hive failed because col2 in the above example was not present 
in the DataFrame. This was tedious to debug because no exception or message 
pointing to the missing column showed up in the logs. All I saw were repeated 
executor lost failures and nothing more.

I think an exception should be thrown when one tries to write to Hive with a 
partitioning column that does not exist in the DataFrame.
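
As a rough sketch of the kind of check I have in mind: the object and method 
names below are hypothetical and are not existing Spark API, and the 
case-insensitive comparison is an assumption about how column names should be 
matched. It works on plain column-name lists so that it stands alone.

```scala
// Hypothetical helper, not part of Spark: validates partition columns
// against the column names of the data being written.
object PartitionColumnCheck {

  // Returns the partition columns that do not appear in the schema.
  // Assumption: names are compared case-insensitively.
  def missingColumns(schemaColumns: Seq[String],
                     partitionColumns: Seq[String]): Seq[String] = {
    val known = schemaColumns.map(_.toLowerCase).toSet
    partitionColumns.filterNot(c => known.contains(c.toLowerCase))
  }

  // Fails fast with a descriptive message instead of letting the write
  // proceed with a partition column that does not exist.
  def requirePartitionColumns(schemaColumns: Seq[String],
                              partitionColumns: Seq[String]): Unit = {
    val missing = missingColumns(schemaColumns, partitionColumns)
    if (missing.nonEmpty) {
      throw new IllegalArgumentException(
        s"Partition column(s) not found: ${missing.mkString(", ")}. " +
          s"Available columns: ${schemaColumns.mkString(", ")}")
    }
  }
}
```

Until something like this exists in Spark itself, a job could call 
PartitionColumnCheck.requirePartitionColumns(dataframe.columns.toSeq, Seq(col1, col2)) 
just before dataframe.write, so that a missing column is reported on the 
driver before any executors are involved.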

If this is indeed something that needs to be fixed, I would like to volunteer 
to fix it in the spark-core code base.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

