GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/16313

[SPARK-18899][SQL] append a bucketed table using DataFrameWriter with mismatched bucketing should fail

## What changes were proposed in this pull request?

When we append data to an existing table with `DataFrameWriter.saveAsTable`, we check the schema and partition columns for a mismatch. However, we forgot to check bucketing, which may lead to a problematic table that has different bucketing in different data files. This PR cleans up the checking logic to fix this bug, and also adds the schema check for non-file-based data sources.

## How was this patch tested?

New regression test.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cloud-fan/spark bug1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16313.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16313

----
commit 370bdc9c15bb865869e2c2e10e60dd501ad8b2f0
Author: Wenchen Fan <wenc...@databricks.com>
Date:   2016-12-16T16:40:16Z

    append a bucketed table using DataFrameWriter with mismatched bucketing should fail

----
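
For readers skimming the PR summary, the bucketing check it describes can be sketched roughly as follows. This is a minimal, standalone illustration and not the code in the patch: `BucketSpec` and `check_append_bucketing` are hypothetical names introduced here, and treating omitted bucketing as a mismatch against a bucketed table is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class BucketSpec:
    """Hypothetical stand-in for a table's bucketing definition."""
    num_buckets: int
    bucket_columns: Tuple[str, ...]
    sort_columns: Tuple[str, ...] = ()


def check_append_bucketing(existing: Optional[BucketSpec],
                           specified: Optional[BucketSpec]) -> None:
    """Reject an append whose bucketing differs from the existing table's.

    Assumption of this sketch: any difference counts as a mismatch,
    including one side having no bucketing at all (None).
    """
    if existing != specified:
        raise ValueError(
            f"Specified bucketing {specified} does not match "
            f"that of the existing table: {existing}"
        )


# The existing table is bucketed into 8 buckets on "id".
table_spec = BucketSpec(num_buckets=8, bucket_columns=("id",))

# Appending with identical bucketing passes the check silently.
check_append_bucketing(table_spec, BucketSpec(8, ("id",)))

# Appending with a different bucket count is rejected before any data is
# written, so data files with mixed bucketing are never produced.
try:
    check_append_bucketing(table_spec, BucketSpec(4, ("id",)))
except ValueError as err:
    print(f"append rejected: {err}")
```

Failing before the write starts is the point of the sketch: once files with a different bucketing land in the table directory, the table metadata no longer describes all of its data.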