[GitHub] spark pull request #14139: [SPARK-16313][SQL][BRANCH-1.6] Spark should not silently drop exceptions in file listing

2016-07-14 Thread yhuai
Github user yhuai closed the pull request at:

https://github.com/apache/spark/pull/14139


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14139: [SPARK-16313][SQL][BRANCH-1.6] Spark should not silently drop exceptions in file listing

2016-07-14 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14139#discussion_r70843685
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -273,6 +273,20 @@ private[hive] class HiveMetastoreCatalog(val client: ClientInterface, hive: Hive
     serdeProperties = options)
 }

+def hasPartitionColumns(relation: HadoopFsRelation): Boolean = {
+  try {
+    // Calling hadoopFsRelation.partitionColumns will trigger the refresh call of
--- End diff --

Done





[GitHub] spark pull request #14139: [SPARK-16313][SQL][BRANCH-1.6] Spark should not silently drop exceptions in file listing

2016-07-14 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14139#discussion_r70841770
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -273,6 +273,20 @@ private[hive] class HiveMetastoreCatalog(val client: ClientInterface, hive: Hive
     serdeProperties = options)
 }

+def hasPartitionColumns(relation: HadoopFsRelation): Boolean = {
+  try {
+    // Calling hadoopFsRelation.partitionColumns will trigger the refresh call of
--- End diff --

I'd add to the comment that this is a hack for [SPARK-16313][SQL][BRANCH-1.6] Spark should not silently drop exceptions in file listing.





[GitHub] spark pull request #14139: [SPARK-16313][SQL][BRANCH-1.6] Spark should not silently drop exceptions in file listing

2016-07-13 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14139#discussion_r70727924
  
--- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala ---
@@ -273,6 +273,22 @@ private[hive] class HiveMetastoreCatalog(val client: ClientInterface, hive: Hive
     serdeProperties = options)
 }

+def hasPartitionColumns(relation: BaseRelation): Boolean = relation match {
+  case hadoopFsRelation: HadoopFsRelation =>
+    try {
+      // Calling hadoopFsRelation.partitionColumns will trigger the refresh call of
+      // the HadoopFsRelation, which will validate input paths. However, when we create
+      // an empty table, the dir of the table has not been created, which will
+      // cause a FileNotFoundException. So, at here we will catch the FileNotFoundException
+      // and return false.
+      hadoopFsRelation.partitionColumns.nonEmpty
+    } catch {
+      case _: java.io.FileNotFoundException =>
+        false
+    }
+  case _ => false
+}
--- End diff --

This function is equivalent to `val resolvedRelation = dataSource.resolveRelation(checkPathExist = false)` in 2.0 (https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala#L427).
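For readers following along, the pattern in the diff above can be sketched in isolation. This is a minimal, standalone illustration, not the actual Spark code: `partitionColumnsOf` is a hypothetical stand-in for `hadoopFsRelation.partitionColumns`, which may throw when the table's directory has not been created yet.

```scala
import java.io.FileNotFoundException

// Compute a property that requires listing a directory, treating a
// not-yet-created directory as "no partition columns" rather than a failure.
def hasPartitionColumns(partitionColumnsOf: () => Seq[String]): Boolean =
  try {
    partitionColumnsOf().nonEmpty
  } catch {
    // An empty table's directory may not exist yet; that simply means
    // there are no partition columns to report.
    case _: FileNotFoundException => false
  }
```

Note that only `FileNotFoundException` is caught; any other exception raised while resolving the columns still propagates, in keeping with the intent of this PR.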







[GitHub] spark pull request #14139: [SPARK-16313][SQL][BRANCH-1.6] Spark should not s...

2016-07-11 Thread yhuai
GitHub user yhuai opened a pull request:

https://github.com/apache/spark/pull/14139

[SPARK-16313][SQL][BRANCH-1.6] Spark should not silently drop exceptions in file listing

## What changes were proposed in this pull request?
Spark silently drops exceptions during file listing. This is very bad behavior, because it can mask legitimate errors and cause the resulting plan to silently return 0 rows. This patch changes the listing code so that such exceptions are no longer silently dropped.
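The behavioral change described above can be sketched as follows. The names here are illustrative only, not the actual Spark listing internals:

```scala
import java.io.FileNotFoundException

object FileListing {
  // Old behavior: any failure while listing a path was swallowed, so a bad
  // path silently contributed zero files and the query returned zero rows.
  def listLeniently(paths: Seq[String], list: String => Seq[String]): Seq[String] =
    paths.flatMap { p => try list(p) catch { case _: Exception => Nil } }

  // New behavior: exceptions propagate to the caller, surfacing the real error
  // (e.g. a mistyped path or a permissions problem) instead of an empty scan.
  def listStrictly(paths: Seq[String], list: String => Seq[String]): Seq[String] =
    paths.flatMap(list)
}
```

With `listLeniently`, a misspelled input path is indistinguishable from an empty directory; with `listStrictly`, it fails loudly at planning time.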

## How was this patch tested?
Manually tested.

**Note: This is a backport of https://github.com/apache/spark/pull/13987**

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/yhuai/spark SPARK-16313-branch-1.6

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14139.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14139


commit 674999df12973c5b5a6e4cc3446babe70fd26568
Author: Yin Huai 
Date:   2016-07-11T19:59:01Z

Spark should not silently drop exceptions in file listing



