[GitHub] spark pull request #21190: [SPARK-22938][SQL][followup] Assert that SQLConf....

2018-05-10 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/21190


---


[GitHub] spark pull request #21190: [SPARK-22938][SQL][followup] Assert that SQLConf....

2018-05-03 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/21190#discussion_r185750124
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ---
@@ -102,22 +105,32 @@ object TypeCoercion {
     case (_: TimestampType, _: DateType) | (_: DateType, _: TimestampType) =>
       Some(TimestampType)

-    case (t1 @ StructType(fields1), t2 @ StructType(fields2)) if t1.sameType(t2) =>
-      Some(StructType(fields1.zip(fields2).map { case (f1, f2) =>
-        // Since `t1.sameType(t2)` is true, two StructTypes have the same DataType
-        // except `name` (in case of `spark.sql.caseSensitive=false`) and `nullable`.
-        // - Different names: use f1.name
-        // - Different nullabilities: `nullable` is true iff one of them is nullable.
-        val dataType = findTightestCommonType(f1.dataType, f2.dataType).get
-        StructField(f1.name, dataType, nullable = f1.nullable || f2.nullable)
-      }))
+    case (t1 @ StructType(fields1), t2 @ StructType(fields2)) =>
+      val isSameType = if (caseSensitive) {
+        t1.sameType(t2)
--- End diff --

Call `DataType.equalsIgnoreNullability` here instead? It would better show the 
contrast with the call to `DataType.equalsIgnoreCaseAndNullability` below.


---


[GitHub] spark pull request #21190: [SPARK-22938][SQL][followup] Assert that SQLConf....

2018-04-28 Thread cloud-fan
GitHub user cloud-fan opened a pull request:

https://github.com/apache/spark/pull/21190

[SPARK-22938][SQL][followup] Assert that SQLConf.get is accessed only on the driver

## What changes were proposed in this pull request?

This is a followup of https://github.com/apache/spark/pull/20136 . #20136 
didn't really work because the tests use the local backend, which shares the 
driver-side `SparkEnv`, so `SparkEnv.get.executorId == 
SparkContext.DRIVER_IDENTIFIER` holds even inside tasks and the assertion never fires.
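
A quick illustration of why the old check is vacuous under the local backend 
(Spark-internal code; `DRIVER_IDENTIFIER` is `private[spark]`):

    import org.apache.spark.{SparkContext, SparkEnv}

    // The local backend runs tasks in the driver JVM and shares its SparkEnv,
    // so even inside a task the executor id is still "driver" and this is true:
    val onDriver = SparkEnv.get.executorId == SparkContext.DRIVER_IDENTIFIER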

This PR changes the check to `TaskContext.get != null`, moves it into 
`SQLConf.get`, and fixes all the places that violated it (a sketch of the new 
guard follows the list):
* `InMemoryTableScanExec#createAndDecompressColumn` is executed inside 
`rdd.map`, so we can't access `conf.offHeapColumnVectorEnabled` there.
* `DataType#sameType` may be executed on the executor side, e.g. for JSON 
schema inference, so we can't call `conf.caseSensitiveAnalysis` there. This 
accounts for most of the code changes, as we need to add a `caseSensitive` 
parameter to many methods.
* `ParquetFilters` is used in the file scan function, which is executed on the 
executor side, so we can't call `conf.parquetFilterPushDownDate` there.
* `WindowExec#createBoundOrdering` is called on the executor side, so we can't 
use `conf.sessionLocalTimezone` there.
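
A minimal sketch of the new guard and of the fix pattern applied at the call 
sites above; the exact message and placement inside `SQLConf.get` are 
assumptions, and `process` is a hypothetical stand-in for the executor-side work:

    import org.apache.spark.TaskContext

    // Hypothetical sketch of the driver-only guard: TaskContext.get returns
    // null on the driver and a TaskContext inside a running task, so a
    // non-null value means SQLConf is being read on an executor.
    def assertOnDriver(): Unit = {
      if (TaskContext.get != null) {
        throw new IllegalStateException(
          "SQLConf should only be accessed on the driver")
      }
    }

    // Fix pattern for the call sites above: read the conf value once on the
    // driver, then close over the captured value in the executor-side lambda.
    val caseSensitive = conf.caseSensitiveAnalysis  // driver side
    rdd.map { row =>
      process(row, caseSensitive)  // uses the captured Boolean on the executor
    }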

## How was this patch tested?

Existing tests.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/cloud-fan/spark minor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/21190.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21190


commit fc679098d917d226a834a8ab6d08c23dbe5bf7db
Author: Wenchen Fan 
Date:   2018-04-29T01:15:14Z

SQLConf should not be accessed in executor




---
