[ 
https://issues.apache.org/jira/browse/SPARK-13207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated SPARK-13207:
-----------------------------
    Description: 
Partitioning discovery fails in the following case:
{code}
test("_SUCCESS should not break partitioning discovery") {
    withTempPath { dir =>
      val tablePath = new File(dir, "table")
      val df = (1 to 3).map(i => (i, i, i, i)).toDF("a", "b", "c", "d")

      df.write
        .format("parquet")
        .partitionBy("b", "c", "d")
        .save(tablePath.getCanonicalPath)

      Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1", "_SUCCESS"))
      Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1/c=1", "_SUCCESS"))
      Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1/c=1/d=1", "_SUCCESS"))

      checkAnswer(sqlContext.read.format("parquet").load(tablePath.getCanonicalPath), df)
    }
  }
{code}

Because {{_SUCCESS}} files are present in the inner partitioning dirs, 
partitioning discovery will fail.

  was:
Partitioning discovery will fail with the following case
{code}
test("_SUCCESS should not break partitioning discovery") {
    withTempPath { dir =>
      val tablePath = new File(dir, "table")
      val df = (1 to 3).map(i => (i, i, i, i)).toDF("a", "b", "c", "d")

      df.write
        .format("parquet")
        .partitionBy("b", "c", "d")
        .save(tablePath.getCanonicalPath)

      Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1", "_SUCCESS"))
      Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1/c=1", "_SUCCESS"))
      Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1/c=1/d=1", "_SUCCESS"))

      checkAnswer(sqlContext.read.format("parquet").load(tablePath.getCanonicalPath), df)
    }
  }
{code}

{{_SUCCESS}} is the cause of this problem.


> _SUCCESS should not break partition discovery
> ---------------------------------------------
>
>                 Key: SPARK-13207
>                 URL: https://issues.apache.org/jira/browse/SPARK-13207
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>
> Partitioning discovery fails in the following case:
> {code}
> test("_SUCCESS should not break partitioning discovery") {
>     withTempPath { dir =>
>       val tablePath = new File(dir, "table")
>       val df = (1 to 3).map(i => (i, i, i, i)).toDF("a", "b", "c", "d")
>       df.write
>         .format("parquet")
>         .partitionBy("b", "c", "d")
>         .save(tablePath.getCanonicalPath)
>       Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1", "_SUCCESS"))
>       Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1/c=1", "_SUCCESS"))
>       Files.touch(new File(s"${tablePath.getCanonicalPath}/b=1/c=1/d=1", "_SUCCESS"))
>
>       checkAnswer(sqlContext.read.format("parquet").load(tablePath.getCanonicalPath), df)
>     }
>   }
> {code}
> Because {{_SUCCESS}} files are present in the inner partitioning dirs, 
> partitioning discovery will fail.
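
To illustrate why a {{_SUCCESS}} file in an inner directory can matter, here is a minimal sketch in plain Scala with no Spark dependency. It is not Spark's implementation. It assumes that discovery derives the partition columns from the parent directories of every listed file and fails when files imply different column lists; that failure mode is an assumption, not something stated in this issue. The names PartitionDiscoverySketch, partitionColumns, isDataFile, discover and exampleFiles, and the part-0.parquet file names, are hypothetical.

```scala
// Hypothetical sketch, NOT Spark's code: all names here are illustrative.
object PartitionDiscoverySketch {

  // Partition column names implied by a file path relative to the table root,
  // e.g. "b=1/c=1/d=1/part-0.parquet" implies Seq("b", "c", "d").
  def partitionColumns(relativeFilePath: String): Seq[String] =
    relativeFilePath.split("/").dropRight(1).toSeq.collect {
      case segment if segment.contains("=") => segment.takeWhile(_ != '=')
    }

  // Assumption: names starting with "_" or "." are metadata, not data files.
  def isDataFile(relativeFilePath: String): Boolean = {
    val name = relativeFilePath.split("/").last
    !name.startsWith("_") && !name.startsWith(".")
  }

  // Discovery succeeds only if every file implies the same column list.
  def discover(files: Seq[String]): Either[String, Seq[String]] = {
    val layouts = files.map(partitionColumns).distinct
    layouts match {
      case Seq()       => Right(Seq.empty)
      case Seq(single) => Right(single)
      case _ =>
        val shown = layouts.map(_.mkString("[", ", ", "]")).mkString(" vs ")
        Left(s"Conflicting partition column names: $shown")
    }
  }

  // Layout mirroring the test above: three partitions plus the touched
  // _SUCCESS files under b=1, b=1/c=1 and b=1/c=1/d=1.
  val exampleFiles: Seq[String] = Seq(
    "b=1/c=1/d=1/part-0.parquet",
    "b=2/c=2/d=2/part-0.parquet",
    "b=3/c=3/d=3/part-0.parquet",
    "b=1/_SUCCESS",
    "b=1/c=1/_SUCCESS",
    "b=1/c=1/d=1/_SUCCESS"
  )
}
```

In this sketch, discover(exampleFiles) returns a Left because b=1/_SUCCESS implies only column b while the data files imply b, c and d. Filtering with isDataFile before calling discover yields the single layout b, c, d.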


