[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r44612323 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -124,34 +132,40 @@ private[sql] object PartitioningUtils { * Literal.create("hello", StringType), * Literal.create(3.14, FloatType))) * }}} + * and the base path: + * {{{ + * /path/to/partition + * }}} */ private[sql] def parsePartition( path: Path, defaultPartitionName: String, - typeInference: Boolean): Option[PartitionValues] = { + typeInference: Boolean): (Option[PartitionValues], Option[Path]) = { val columns = ArrayBuffer.empty[(String, Literal)] // Old Hadoop versions don't have `Path.isRoot` var finished = path.getParent == null var chopped = path +var basePath = path while (!finished) { // Sometimes (e.g., when speculative task is enabled), temporary directories may be left // uncleaned. Here we simply ignore them. if (chopped.getName.toLowerCase == "_temporary") { -return None +return (None, None) } val maybeColumn = parsePartitionColumn(chopped.getName, defaultPartitionName, typeInference) maybeColumn.foreach(columns += _) + basePath = chopped chopped = chopped.getParent - finished = maybeColumn.isEmpty || chopped.getParent == null + finished = (maybeColumn.isEmpty && !columns.isEmpty) || chopped.getParent == null --- End diff -- oh i see. It is for something like `table/a=1/_temporary/something`, right? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r44611718 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -124,34 +132,40 @@ private[sql] object PartitioningUtils { * Literal.create("hello", StringType), * Literal.create(3.14, FloatType))) * }}} + * and the base path: + * {{{ + * /path/to/partition + * }}} */ private[sql] def parsePartition( path: Path, defaultPartitionName: String, - typeInference: Boolean): Option[PartitionValues] = { + typeInference: Boolean): (Option[PartitionValues], Option[Path]) = { val columns = ArrayBuffer.empty[(String, Literal)] // Old Hadoop versions don't have `Path.isRoot` var finished = path.getParent == null var chopped = path +var basePath = path while (!finished) { // Sometimes (e.g., when speculative task is enabled), temporary directories may be left // uncleaned. Here we simply ignore them. if (chopped.getName.toLowerCase == "_temporary") { -return None +return (None, None) } val maybeColumn = parsePartitionColumn(chopped.getName, defaultPartitionName, typeInference) maybeColumn.foreach(columns += _) + basePath = chopped chopped = chopped.getParent - finished = maybeColumn.isEmpty || chopped.getParent == null + finished = (maybeColumn.isEmpty && !columns.isEmpty) || chopped.getParent == null --- End diff -- Why do we need `!columns.isEmpty`? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153283225 **[Test build #44906 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44906/consoleFull)** for PR 8840 at commit [`3db4226`](https://github.com/apache/spark/commit/3db42268b7948102278427f73d48e3ebb3196924). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153302126 **[Test build #44906 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44906/consoleFull)** for PR 8840 at commit [`3db4226`](https://github.com/apache/spark/commit/3db42268b7948102278427f73d48e3ebb3196924). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153302250 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153302252 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44906/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153393850 Should this one be backported to branch-1.5? cc @liancheng @yhuai --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r43775159 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -77,9 +77,11 @@ private[sql] object PartitioningUtils { defaultPartitionName: String, typeInference: Boolean): PartitionSpec = { // First, we need to parse every partition's path and see if we can find partition values. -val pathsWithPartitionValues = paths.flatMap { path => - parsePartition(path, defaultPartitionName, typeInference).map(path -> _) -} +val (partitionValues, optBasePaths) = paths.map { path => + parsePartition(path, defaultPartitionName, typeInference) +}.unzip + +val pathsWithPartitionValues = paths.zip(partitionValues).flatMap(x => x._2.map(x._1 -> _)) --- End diff -- Seems it is not very obvious what we are doing at here. Maybe a comment can help. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153419110 @rxin I think it is find to not backport it. Without this, a user will get the DataFrame and when he/she queries it, it will fail (and the error message does not say what's the cause). Since, it does not really let users make any real progress, I think we do not really need to backport it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r43775309 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetPartitionDiscoverySuite.scala --- @@ -58,14 +58,46 @@ class ParquetPartitionDiscoverySuite extends QueryTest with ParquetTest with Sha check(defaultPartitionName, Literal.create(null, NullType)) } + test("parse invalid partitioned directories") { +// Invalid +var paths = Seq( + "hdfs://host:9000/invalidPath", + "hdfs://host:9000/path/a=10/b=20", + "hdfs://host:9000/path/a=10.5/b=hello") + +var exception = intercept[AssertionError] { + parsePartitions(paths.map(new Path(_)), defaultPartitionName, true) +} +assert(exception.getMessage().contains("Conflicting directory structures detected")) + +// Valid +paths = Seq( + "hdfs://host:9000/path/_temporary", + "hdfs://host:9000/path/a=10/b=20", + "hdfs://host:9000/path/_temporary/path") + +parsePartitions(paths.map(new Path(_)), defaultPartitionName, true) + +// Invalid +paths = Seq( + "hdfs://host:9000/path/_temporary", + "hdfs://host:9000/path/a=10/b=20", + "hdfs://host:9000/path/path1") + +exception = intercept[AssertionError] { + parsePartitions(paths.map(new Path(_)), defaultPartitionName, true) +} +assert(exception.getMessage().contains("Conflicting directory structures detected")) + } --- End diff -- Which case is for the one I mentioned in the jira? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user yhuai commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r43775098 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -87,6 +89,12 @@ private[sql] object PartitioningUtils { } else { // This dataset is partitioned. We need to check whether all partitions have the same // partition columns and resolve potential type conflicts. + val basePaths = optBasePaths.flatMap(x => x) + assert( +basePaths.distinct.size == 1, +"Conflicting directory structures detected. Suspicious paths:\b" + + basePaths.mkString("\n\t", "\n\t", "\n\n")) --- End diff -- Can you give a case that we will hit this branch? What will `basePaths` be at here? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153419572 OK great. @viirya can you submit a followup pr to address @yhuai's feedback. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153592786 @rxin @yhuai OK. Submitted a following pr at #9459. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153259988 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user davies commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153261282 LGTM --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153259912 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153260381 **[Test build #44888 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44888/consoleFull)** for PR 8840 at commit [`3db4226`](https://github.com/apache/spark/commit/3db42268b7948102278427f73d48e3ebb3196924). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153280347 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44888/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153280346 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153280321 **[Test build #44888 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44888/consoleFull)** for PR 8840 at commit [`3db4226`](https://github.com/apache/spark/commit/3db42268b7948102278427f73d48e3ebb3196924). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153282428 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153282411 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r43719832 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -184,7 +204,8 @@ private[sql] object PartitioningUtils { * }}} */ private[sql] def resolvePartitions( - pathsWithPartitionValues: Seq[(Path, PartitionValues)]): Seq[PartitionValues] = { + pathsWithPartitionValues: Seq[(Path, PartitionValues)], + basePaths: Seq[Path]): Seq[PartitionValues] = { --- End diff -- Removed. Thanks. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-153282032 retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r43699130 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -77,8 +77,16 @@ private[sql] object PartitioningUtils { defaultPartitionName: String, typeInference: Boolean): PartitionSpec = { // First, we need to parse every partition's path and see if we can find partition values. -val pathsWithPartitionValues = paths.flatMap { path => - parsePartition(path, defaultPartitionName, typeInference).map(path -> _) +val partitionValuesWithBasePaths = paths.map { path => + path -> parsePartition(path, defaultPartitionName, typeInference) +} + +val pathsWithPartitionValues = partitionValuesWithBasePaths.flatMap { pathAndPart => + pathAndPart._2._1.map(part => pathAndPart._1 -> part) +} + +val basePaths = partitionValuesWithBasePaths.flatMap { pathAndPart => + pathAndPart._2._2 --- End diff -- Can we writhe these like this? ``` val (basePaths, pathsWithPartitionValues) = paths.flatMap { }.unzip ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r43699404 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -184,7 +204,8 @@ private[sql] object PartitioningUtils { * }}} */ private[sql] def resolvePartitions( - pathsWithPartitionValues: Seq[(Path, PartitionValues)]): Seq[PartitionValues] = { + pathsWithPartitionValues: Seq[(Path, PartitionValues)], + basePaths: Seq[Path]): Seq[PartitionValues] = { --- End diff -- Where is `basePaths` used? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-151132730 @chenghao-intel @JoshRosen any comments? Is this patch ready? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150900314 **[Test build #44316 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44316/consoleFull)** for PR 8840 at commit [`cdf6dc4`](https://github.com/apache/spark/commit/cdf6dc424abba99a7fd091fca5ce2af56255f69a). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r42942564 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -128,30 +136,32 @@ private[sql] object PartitioningUtils { private[sql] def parsePartition( path: Path, defaultPartitionName: String, - typeInference: Boolean): Option[PartitionValues] = { + typeInference: Boolean): (Option[PartitionValues], Option[Path]) = { --- End diff -- A base path is not always associated with a `PartitionValues`. If there is no partition, we can still have a base path. That is why I don't make `case class PartitionValues(columnNames: Seq[String], literals: Seq[Literal])` to something like `case class PartitionValues(path: String, columnNames: Seq[String], literals: Seq[Literal])`. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150900104 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150900105 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150911391 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44316/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150911390 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150911363 **[Test build #44316 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44316/consoleFull)** for PR 8840 at commit [`cdf6dc4`](https://github.com/apache/spark/commit/cdf6dc424abba99a7fd091fca5ce2af56255f69a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150814991 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44295/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150814989 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150814759 **[Test build #44295 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44295/consoleFull)** for PR 8840 at commit [`779fbd2`](https://github.com/apache/spark/commit/779fbd264092e8b4ff0ab5472d34db2d01a971f3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150802465 **[Test build #44295 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44295/consoleFull)** for PR 8840 at commit [`779fbd2`](https://github.com/apache/spark/commit/779fbd264092e8b4ff0ab5472d34db2d01a971f3). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150799912 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-150799847 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r42694917 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -128,30 +136,32 @@ private[sql] object PartitioningUtils { private[sql] def parsePartition( path: Path, defaultPartitionName: String, - typeInference: Boolean): Option[PartitionValues] = { + typeInference: Boolean): (Option[PartitionValues], Option[Path]) = { --- End diff -- But it is possible that we only have Path without corresponding PartitionValues. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-149656467 **[Test build #44001 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44001/consoleFull)** for PR 8840 at commit [`2d13156`](https://github.com/apache/spark/commit/2d13156c32002f65acc1bfccf09e08ef0063aa73). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-149654462 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-149654228 Jenkins, retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user JoshRosen commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-149655120 @chenghao-intel, it looks like this PR's error message improvements are similar to the ones that you added as part of #8026. @viirya, could you take a look at @chenghao-intel's PR to see which approach you like better? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-149692945 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44001/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-149692943 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-149692790 **[Test build #44001 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44001/consoleFull)** for PR 8840 at commit [`2d13156`](https://github.com/apache/spark/commit/2d13156c32002f65acc1bfccf09e08ef0063aa73). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-149753146 @JoshRosen ok, I will take a look at it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r42577464 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -207,6 +222,12 @@ private[sql] object PartitioningUtils { } } + private[sql] def throwErrorForInvalidPartition( --- End diff -- This function seems used only once, can we inline it? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r42577386 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -128,30 +136,32 @@ private[sql] object PartitioningUtils { private[sql] def parsePartition( path: Path, defaultPartitionName: String, - typeInference: Boolean): Option[PartitionValues] = { + typeInference: Boolean): (Option[PartitionValues], Option[Path]) = { --- End diff -- I am wondering if it's possible to make the return type as `Option[(PartitionValues, Path)]`, and can we simply ignore the path whose Column / Value is empty? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r42577425 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -128,30 +136,32 @@ private[sql] object PartitioningUtils { private[sql] def parsePartition( path: Path, defaultPartitionName: String, - typeInference: Boolean): Option[PartitionValues] = { + typeInference: Boolean): (Option[PartitionValues], Option[Path]) = { --- End diff -- BTW, you need to update this function description also for its return type. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r42576934 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -193,6 +204,10 @@ private[sql] object PartitioningUtils { distinctPartColNames.size == 1, listConflictingPartitionColumns(pathsWithPartitionValues)) + assert( --- End diff -- Move this `assert` out of the `resolvePartitions`, since it's not so tight with the original purpose of this function, and we can avoid the code change --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/8840#discussion_r42577632 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PartitioningUtils.scala --- @@ -128,30 +136,32 @@ private[sql] object PartitioningUtils { private[sql] def parsePartition( path: Path, defaultPartitionName: String, - typeInference: Boolean): Option[PartitionValues] = { + typeInference: Boolean): (Option[PartitionValues], Option[Path]) = { --- End diff -- Or change the `case class PartitionValues(columnNames: Seq[String], literals: Seq[Literal])` as `case class PartitionValues(path: String, columnNames: Seq[String], literals: Seq[Literal])`? Then the code probably much simple and readable. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141700875 [Test build #42708 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42708/console) for PR 8840 at commit [`2d13156`](https://github.com/apache/spark/commit/2d13156c32002f65acc1bfccf09e08ef0063aa73). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141709318 retest this please. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141700888 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42708/ Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141700887 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141709793 [Test build #42718 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42718/consoleFull) for PR 8840 at commit [`2d13156`](https://github.com/apache/spark/commit/2d13156c32002f65acc1bfccf09e08ef0063aa73). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141720466 [Test build #42718 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42718/console) for PR 8840 at commit [`2d13156`](https://github.com/apache/spark/commit/2d13156c32002f65acc1bfccf09e08ef0063aa73). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141720580 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42718/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141720579 Merged build finished. Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/8840 [SPARK-10304][SQL] Partition discovery should throw an exception if the dir structure is invalid JIRA: https://issues.apache.org/jira/browse/SPARK-10304 This patch detects if the structure of partition directories is not valid. The test cases are from #8547. Thanks @zhzhan. cc @liancheng You can merge this pull request into a Git repository by running: $ git pull https://github.com/viirya/spark-1 detect_invalid_part_dir Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/8840.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #8840 commit 2d13156c32002f65acc1bfccf09e08ef0063aa73 Author: Liang-Chi HsiehDate: 2015-09-19T13:01:01Z Detect invalid partition directory. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141691809 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141691798 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141692703 [Test build #42708 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/42708/consoleFull) for PR 8840 at commit [`2d13156`](https://github.com/apache/spark/commit/2d13156c32002f65acc1bfccf09e08ef0063aa73). --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141709394 Merged build triggered. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-10304][SQL] Partition discovery should ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/8840#issuecomment-141709421 Merged build started. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org