[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146470802
  
@cloud-fan 
Thanks a lot. I have implemented the review comments. Please take a look. I 
looked at the optimizer code; we already seem to be transforming NULL IN (...) 
to "Filter null" in the ConstantFolding rule. So we should be okay here, right?


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/8983


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146631777
  
I only found one rule in `Optimizer` that optimizes `In` to `InSet`, and it 
applies only when the values in `list` are all literals.
I think it's better to add a case in `NullPropagation` that optimizes `In` 
with a literal null `key` to a literal null, even when the values in `list` 
are not all literals.

Anyway, that should be another PR; this PR LGTM.
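
For illustration, a minimal sketch of such a `NullPropagation` case, written 
against simplified stand-in expression classes rather than the real Catalyst 
API (the `Expr`, `Literal`, `AttributeRef`, and `In` classes below are 
hypothetical):

```
// Stand-in expression tree, not the Catalyst classes.
sealed trait Expr
case class Literal(value: Any, dataType: String) extends Expr
case class AttributeRef(name: String) extends Expr
case class In(key: Expr, list: Seq[Expr]) extends Expr

// Proposed rule: a literal-null key makes the whole IN expression null,
// regardless of whether the values in `list` are literals.
def propagateNullIn(e: Expr): Expr = e match {
  case In(Literal(null, _), _) => Literal(null, "boolean")
  case other                   => other
}

// NULL IN (1, col) folds to a boolean null even though `col` is not a literal.
propagateNullIn(In(Literal(null, "null"), Seq(Literal(1, "int"), AttributeRef("col"))))
// => Literal(null,boolean)
```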


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146634337
  
Thanks, merging to master.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146685208
  
@cloud-fan 
I didn't see a re-open option on this pull request. Do I have to create a 
new pull request?
Please let me know.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146682761
  
Just reopen this PR and we will trigger a test on our Jenkins for it.

For local testing, you can run `./build/sbt catalyst/test`.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146680375
  
Sorry, this never passed tests and broke something.  I'm going to revert.  
Please reopen the PR.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146682265
  
@marmbrus, sorry about that. Is there a way I can look at the list of 
failures?
I had run:

build/mvn -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0 -Dmaven.test.failure.ignore=true test

and it reported success. But this is my first time, so I may not have the 
right configuration.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146679919
  
test this please


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146688197
  
Opening a new one is also OK.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146645486
  
Thanks a lot, @marmbrus.

Many thanks to @cloud-fan for his help.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-07 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146404345
  
@cloud-fan
Hi Wenchen, can you please take a look at the changes and let me know what 
you think?


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/8983#discussion_r41475635
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala ---
@@ -135,4 +135,26 @@ class AnalysisSuite extends AnalysisTest {
     plan = testRelation.select(CreateStructUnsafe(Seq(a, (a + 1).as("a+1"))).as("col"))
     checkAnalysis(plan, plan)
   }
+
+  test("SPARK-8654: invalid CAST in NULL IN(...) expression") {
+    val plan = Project(Alias(In(Literal(null), Seq(Literal(1), Literal(2))), "a")() :: Nil,
+      LocalRelation()
+    )
+    assertAnalysisSuccess(plan)
+  }
+
+  test("SPARK-8654: different types in inlist but can be converted to a common type") {
+    val plan = Project(Alias(In(Literal(null), Seq(Literal(1), Literal(1.2345))), "a")() :: Nil,
+      LocalRelation()
+    )
+    assertAnalysisSuccess(plan)
+  }
+
+  test("SPARK-8654: check type compatibility error") {
+    val plan = Project(Alias(In(Literal(null), Seq(Literal(true), Literal(1))), "a")() :: Nil,
+      LocalRelation()
+    )
+    assertAnalysisError(plan, Seq("cannot resolve 'null IN (true,1)' due to data " +
+      "type mismatch: Arguments must be same type"))
--- End diff --

This can be simplified to `assertAnalysisError(plan, Seq("data type mismatch: Arguments must be same type"))`.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-07 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/8983#discussion_r41475573
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -305,12 +305,22 @@ object HiveTypeCoercion {
 
   /**
    * Convert all expressions in in() list to the left operator type
+   * except when the left operator type is NullType. In case when left hand
+   * operator type is NullType create a Literal(Null).
    */
   object InConversion extends Rule[LogicalPlan] {
     def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
       // Skip nodes who's children have not been resolved yet.
       case e if !e.childrenResolved => e
 
+      case i @ In(a, b) if (a.dataType == NullType) =>
+        var inTypes: Seq[DataType] = Seq.empty
+        b.foreach(e => inTypes = inTypes ++ Seq(e.dataType))
+        findWiderCommonType(inTypes) match {
+          case Some(finalDataType) => Literal.create(null, BooleanType)
+          case None => i
+        }
--- End diff --

Instead of returning a literal null, I think we should just widen the types, 
as that also fixes the bug (we can add a rule in `Optimizer` to return null 
for this case).

The code can be:
```
case i @ In(a, b) if b.exists(_.dataType != a.dataType) =>
  findWiderCommonType(i.children.map(_.dataType)) match {
    case Some(finalDataType) => i.withNewChildren(i.children.map(Cast(_, finalDataType)))
    case None => i
  }
```


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-07 Thread rick-ibm
Github user rick-ibm commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146271123
  
Hi Michael,

Postgres and Derby raise an error if the expressions in the IN list can't 
be implicitly cast to a common type. MySQL is more forgiving.

Thanks,
-Rick

-

MySQL:

mysql> SELECT NULL IN ( 1, 'abc' );
SELECT NULL IN ( 1, 'abc' );
+----------------------+
| NULL IN ( 1, 'abc' ) |
+----------------------+
|                 NULL |
+----------------------+
1 row in set (0.00 sec)


-

Postgres:

mydb=# SELECT NULL IN ( 1, 'abc' );
SELECT NULL IN ( 1, 'abc' );
ERROR:  invalid input syntax for integer: "abc"
LINE 1: SELECT NULL IN ( 1, 'abc' );


-

Derby:

ij> VALUES CAST (NULL AS INT) IN (1, 'abc' );
ERROR 42818: Comparisons between 'INTEGER' and 'CHAR (UCS_BASIC)' are not 
supported. Types must be comparable. String types must also have matching 
collation. If collation does not match, a possible solution is to cast operands 
to force them to the default collation (e.g. SELECT tablename FROM 
sys.systables WHERE CAST(tablename AS VARCHAR(128)) = 'T1')



[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-07 Thread rick-ibm
Github user rick-ibm commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146244545
  
According to my reading of the SQL Standard,

  NULL IN (expr1, ...)

should always evaluate to NULL. Here is my reasoning:

The 2011 SQL Standard, part 2, section 8.4 (in predicate), syntax rule 5 
says that

  expr IN (expr1, ...)

is equivalent to

  expr = ANY (expr1, ...)

Section 8.9 (quantified comparison predicate), general rule 2, subrules (c) 
and (d), say that

  expr = ANY (expr1, ...)

evaluates to the following:

  TRUE if (expr = exprN) is TRUE for at least one of the expressions on the 
right side

  FALSE if the right side is an empty list or if (expr = exprN) is FALSE 
for every exprN on the right side

  UNKNOWN (NULL) otherwise

Since (NULL = exprN) is always UNKNOWN and since an IN list must be 
non-empty (see the BNF in section 8.4), it follows that

  NULL IN (expr1, ...)

always evaluates to UNKNOWN (NULL). So Dilip's transformation of

  NULL IN (expr1, ...) -> NULL

looks correct to me. There is no need to cast the expressions on the right 
side to a common type. That is, not unless you want to raise syntax errors in 
situations where there is no implicit conversion to a common type.
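
For illustration only, here is a small self-contained Scala sketch of those 
quantified-comparison rules, using `Option` for nullable values and a made-up 
three-valued `Tri` type (this is not the Standard's notation, nor Spark code):

```
// Three-valued truth type for SQL comparisons.
sealed trait Tri
case object True extends Tri
case object False extends Tri
case object Unknown extends Tri

// expr = exprN: UNKNOWN whenever either side is NULL.
def eq3(a: Option[Int], b: Option[Int]): Tri = (a, b) match {
  case (Some(x), Some(y)) => if (x == y) True else False
  case _                  => Unknown
}

// expr = ANY (list), per general rule 2(c)/(d): TRUE if some comparison is
// TRUE; FALSE if the list is empty or every comparison is FALSE; UNKNOWN
// otherwise.
def eqAny(lhs: Option[Int], list: Seq[Option[Int]]): Tri = {
  val results = list.map(eq3(lhs, _))
  if (results.contains(True)) True
  else if (results.contains(Unknown)) Unknown
  else False
}

// eq3(None, _) is always Unknown and an IN list is non-empty, so
// NULL IN (expr1, ...) is always UNKNOWN:
eqAny(None, Seq(Some(1), Some(2)))  // => Unknown
```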

As the following examples show, Postgres, MySQL, and Derby all exhibit the 
correct Standard behavior.

Thanks,
-Rick



MySQL behavior:

mysql> SELECT NULL IN (1, 2, 3);
SELECT NULL IN (1, 2, 3);
+-------------------+
| NULL IN (1, 2, 3) |
+-------------------+
|              NULL |
+-------------------+
1 row in set (0.00 sec)

mysql> SELECT NULL IN (1, 2, NULL);
SELECT NULL IN (1, 2, NULL);
+----------------------+
| NULL IN (1, 2, NULL) |
+----------------------+
|                 NULL |
+----------------------+
1 row in set (0.00 sec)

mysql> SELECT NULL IN ();
SELECT NULL IN ();
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual 
that corresponds to your MySQL server version for the right syntax to use near 
')' at line 1





Postgres behavior:

mydb=# SELECT NULL IN (1, 2, 3);
SELECT NULL IN (1, 2, 3);
 ?column? 
----------
 
(1 row)

mydb=# SELECT NULL IN (1, 2, NULL);
SELECT NULL IN (1, 2, NULL);
 ?column? 
----------
 
(1 row)

mydb=# SELECT NULL IN ();
SELECT NULL IN ();
ERROR:  syntax error at or near ")"
LINE 1: SELECT NULL IN ();





Derby behavior:

ij> VALUES CAST (NULL AS INT) IN (1, 2, 3);
1
-----------
NULL 

1 row selected
ij> VALUES CAST (NULL AS INT) IN (1, 2, CAST (NULL AS INT));
1
-----------
NULL 

1 row selected
ij> VALUES CAST (NULL AS INT) IN ();
ERROR 42X01: Syntax error: Encountered ")" at line 1, column 31.



[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-07 Thread rick-ibm
Github user rick-ibm commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146280859
  
I don't find any guidance in the Standard for what should be done if the 
left side of the IN operator is an untyped NULL literal. Technically, there is 
no such thing in the Standard. The NULL needs to be cast to a legal type.

Section 8.4 provides no guidance about the type correspondence of the IN 
list values. However, section 8.9 implies that the IN list is equivalent to the 
result of a subquery, which means that we must be able to cast all of the 
values on the right side to a common type.

Thanks,
-Rick



[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-07 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146287283
  
Thank you @marmbrus @rick-ibm @cloud-fan 

I checked the behavior of DB2. It also raises an error if the IN-list types 
are not compatible.

db2 => select * from f1 where NULL in (1, true)
SQL0401N  The data types of the operands for the operation "IN" are not 
compatible or comparable.  SQLSTATE=42818

I am studying the code now to figure out how to detect this and raise an 
error.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-07 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146266287
  
I think we are all in agreement that `null IN (...)` should return null.  

The only question here is whether we should throw an error when the values in 
`(...)` cannot be coerced to a common type. It seems to me that the user must 
be confused if they are attempting to compare a value with an incompatible set 
of options. It would be good to see what other systems do here.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146077599
  
If `null in (true, array(2,3))` returns null in Hive, then we probably should 
first cast `key` and all values in `list` to a common type. If there is no 
such common type, throw an exception.
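
A minimal sketch of that coercion idea, with hypothetical stand-in types and a 
made-up widening table (not Catalyst's `findWiderCommonType`):

```
sealed trait DType
case object IntType extends DType
case object DoubleType extends DType
case object BoolType extends DType

// Hypothetical widening table: int widens to double; otherwise a type only
// widens to itself.
def widen(a: DType, b: DType): Option[DType] = (a, b) match {
  case (x, y) if x == y => Some(x)
  case (IntType, DoubleType) | (DoubleType, IntType) => Some(DoubleType)
  case _ => None
}

// Fold the key's type together with every list element's type; None means
// there is no common type, and analysis should fail with a type error.
def commonType(types: Seq[DType]): Option[DType] = types match {
  case head +: tail => tail.foldLeft(Option(head))((acc, t) => acc.flatMap(widen(_, t)))
  case _            => None
}

commonType(Seq(IntType, DoubleType))  // Some(DoubleType): coerce and proceed
commonType(Seq(BoolType, IntType))    // None: report a type-mismatch error
```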


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145948447
  
@cloud-fan, please confirm my understanding of the code (I'm fairly new to the 
codebase :-)).
In the code we go through the entire in-list and evaluate each expression, 
flagging hasNull when we see a null. We continue with the next items and 
return true if we see a match. If we haven't seen a match, we then look at the 
hasNull flag and return null or false.
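
In other words (a minimal `Option`-based sketch of those semantics, for 
illustration only, not the actual Catalyst `In.eval` code):

```
// null is modelled as None; the result is Option[Boolean], where None stands
// for the SQL NULL/UNKNOWN outcome.
def evalIn(key: Option[Int], list: Seq[Option[Int]]): Option[Boolean] = key match {
  case None => None                          // null key => null result
  case Some(k) =>
    if (list.contains(Some(k))) Some(true)   // a match => true
    else if (list.contains(None)) None       // no match, but a null in the list => null
    else Some(false)                         // no match and no nulls => false
}

evalIn(Some(1), Seq(Some(1), Some(2), None))  // Some(true)
evalIn(Some(3), Seq(Some(1), Some(2), None))  // None
evalIn(Some(3), Seq(Some(1), Some(2)))        // Some(false)
```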

To confirm whether there is an issue, I tried to run the following two queries 
again. The output looks OK to me.

select * from inttab where 1 in (1,2,NULL)
var2: org.apache.spark.sql.DataFrame = [c1: int]
+---+
| c1|
+---+
|  1|
|  2|
|  3|
|  4|
|  5|
+---+

== Parsed Logical Plan ==
'Project [unresolvedalias(*)]
 'Filter 1 IN (1,2,null)
  'UnresolvedRelation [inttab], None

== Analyzed Logical Plan ==
c1: int
Project [c1#0]
 Filter 1 IN (cast(1 as int),cast(2 as int),cast(null as int))
  Subquery inttab
   LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at 
<console>:26

== Optimized Logical Plan ==
LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at 
<console>:26

== Physical Plan ==
Scan PhysicalRDD[c1#0]

Code Generation: true

select * from inttab where 1 in (NULL,1,2)
var2: org.apache.spark.sql.DataFrame = [c1: int]
+---+
| c1|
+---+
|  1|
|  2|
|  3|
|  4|
|  5|
+---+

== Parsed Logical Plan ==
'Project [unresolvedalias(*)]
 'Filter 1 IN (null,1,2)
  'UnresolvedRelation [inttab], None

== Analyzed Logical Plan ==
c1: int
Project [c1#0]
 Filter 1 IN (cast(null as int),cast(1 as int),cast(2 as int))
  Subquery inttab
   LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at 
<console>:26

== Optimized Logical Plan ==
LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at 
<console>:26

== Physical Plan ==
Scan PhysicalRDD[c1#0]

Code Generation: true

Please let me know your thoughts.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145958925
  
Thanks a lot @cloud-fan. Sure, I will look into it. When you say another PR, 
do you mean another JIRA?

Asking as I am new to the process.

One other question: what is the process to get this change integrated? Do I 
need to initiate any action from my end?



[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145977094
  
Yup, another JIRA please.

You need to ask some committers like @marmbrus to review your PR to get it 
merged.

My final thoughts on this PR:
When the types conflict, for example `null in (true, array(2,3))`, should we 
return null or report a type-conflict error during analysis? What is Hive's 
behavior for this case?


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146003985
  
Very good point, thanks. Actually, Hive reports an error in this case:

hive> select * from tnull where array(2,3) in (1, array(2,3));
FAILED: SemanticException [Error 10014]: Line 1:37 Wrong arguments '3': The arguments for IN should be the same type! Types are: {array<int> IN (int, array<int>)}

I am not sure what the right thing to do is here.

Any comments, @marmbrus?


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146016743
  
Let's follow Hive.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145953466
  
OK, I was misled by the imperative code style, sorry about that...
So we will return null if the `key` is null, or if we can't find a value in 
`list` matching our `key` and there is a null value inside `list`.

This PR makes the `In` operation return null when `key` is of NullType, which 
is right. But I think we can take it further by returning null for any type of 
`key` that is a literal null (probably in another PR).


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146047895
  
@marmbrus

Thanks a lot, Michael, for looking into this. I debugged Hive to understand 
the behaviour and would like to share my findings. I wanted to make sure we 
are doing the right thing here. Here are the comments at the top of 
GenericUDFIn() in Hive:

/**
 * GenericUDFIn
 *
 * Example usage:
 * SELECT key FROM src WHERE key IN ("238", "1");
 *
 * From MySQL page on IN(): To comply with the SQL standard, IN returns NULL
 * not only if the expression on the left hand side is NULL, but also if no
 * match is found in the list and one of the expressions in the list is 
NULL.
 *
 * Also noteworthy: type conversion behavior is different from MySQL. With
 * expr IN expr1, expr2... in MySQL, exprN will each be converted into the 
same
 * type as expr. In the Hive implementation, all expr(N) will be converted 
into
 * a common type for conversion consistency with other UDF's, and to prevent
 * conversions from a big type to a small type (e.g. int to tinyint)
 */

**(case 1) expr in (expr1, ..., exprN)**
* **1.1** Per the SQL standard, if expr is NULL then IN should return NULL
(with this PR we are attempting to achieve this).
* **1.2** If any of the expressions on the right-hand side is NULL and no
match is found in the list, then IN should also return NULL.
We enforce these semantics as well in our
[implementation](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L124-L144).

**(case 2) Type conversion semantics.**
* **2.1** In MySQL, all the expressions on the right-hand side are converted
to the left-hand side's type. Our behaviour matches these semantics; I am not
sure whether this is the standard, though.
* **2.2** In Hive, they seem to find a common (probably wider) type and
promote both the left-hand side and the right-hand side to that common type.
I believe this is where it throws the SemanticException.

Our behaviour currently matches MySQL's more closely. Do we want to change 
this? Also, about case 1, it is not clear from the Hive comments what they 
intended to do vs. what the external behaviour is. Please let me know what 
you think.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145930866
  
Looks like our 
[implementation](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L124-L144) 
is wrong: we return null if any value in the list is null.
Would you like to send another PR to fix it?


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/8983

[SPARK-8654][SQL] Fix Analysis exception when using NULL IN (...)

In the analysis phase, while processing the rules for the IN predicate, we
compare the in-list types to the LHS expression type and generate a cast
operation if necessary. In the case of NULL [NOT] IN expr1, we end up
generating a cast from the in-list types to NULL, like cast(1 as NULL), which
is not a valid cast.

The fix is to not generate such a cast when the LHS type is NullType; instead
we translate the expression to Literal(null).

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark spark_8654

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8983.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8983


commit 38f973bb124c63c1caabe14ee6e5cca7b764b15a
Author: Dilip Biswal 
Date:   2015-10-02T23:20:56Z

[SPARK-8654] Analysis exception when using NULL IN (...) : invalid cast

In the analysis phase, while processing the rules for the IN predicate, we
compare the in-list types to the LHS expression type and generate a cast
operation if necessary. In the case of NULL [NOT] IN expr1, we end up
generating a cast from the in-list types to NULL, like cast(1 as NULL), which
is not a valid cast.

The fix is to not generate such a cast when the LHS type is NullType; instead
we translate the expression to Literal(null).




[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145630551
  
Can one of the admins verify this patch?


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/8983#discussion_r41183045
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala ---
@@ -135,4 +135,11 @@ class AnalysisSuite extends AnalysisTest {
     plan = testRelation.select(CreateStructUnsafe(Seq(a, (a + 1).as("a+1"))).as("col"))
     checkAnalysis(plan, plan)
   }
+
+  test("SPARK-8654: invalid CAST in NULL IN(...) expression") {
+    val plan = Project(Alias(In(Literal(null), Seq(Literal(1), Literal(2))), "a")() :: Nil,
+      LocalRelation()
+    )
+    assertAnalysisSuccess(plan, false)
--- End diff --

why change the default value of `caseSensitive`?


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/8983#discussion_r41188938
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -305,12 +305,17 @@ object HiveTypeCoercion {
 
   /**
    * Convert all expressions in in() list to the left operator type
+   * except when the left operator type is NullType. In case when left hand
+   * operator type is NullType create a Literal(Null).
    */
   object InConversion extends Rule[LogicalPlan] {
     def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
       // Skip nodes who's children have not been resolved yet.
       case e if !e.childrenResolved => e
 
+      case i @ In(a, b) if (a.dataType == NullType) =>
+        Literal.create(null, BooleanType)
--- End diff --

Instead of just casting null to boolean, can we come up with a better idea 
based on the data types of `b`?


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145711558
  
Hi Wenchen,

Here is the link I could find; it is a bit confusing about the equality 
operator:


https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-RelationalOperators

To test it out, I tried the following queries on Hive:

hive> select * from tnull ;
OK
1
2
NULL


hive> select null = 1 from tnull ;
OK
NULL
NULL
NULL
Time taken: 0.118 seconds, Fetched: 3 row(s)
hive> select null in (1,2) from tnull ;
OK
NULL
NULL
NULL

Please let me know what you think and thanks again for your help.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145712756
  
Can you try `select null in (1,2,null)`? I want to make sure Hive doesn't 
execute the `In` operation when `value` is of null type.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145713214
  
hive> select null in (1,2,null) from tnull; 
OK
NULL
NULL
NULL
Time taken: 0.139 seconds, Fetched: 3 row(s)



[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145718166
  
BTW, you can send another PR to add a rule in `Optimizer` so that if `value` 
is a literal null, we can just return null.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145717934
  
I checked our 
[implementation](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L126-L127); 
we should return null for this case.

Can you also add a rule in `Optimizer` so that if `value` is a literal null, 
we can just return null?


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread cloud-fan
Github user cloud-fan commented on a diff in the pull request:

https://github.com/apache/spark/pull/8983#discussion_r41212281
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -305,12 +305,17 @@ object HiveTypeCoercion {
 
   /**
    * Convert all expressions in in() list to the left operator type
+   * except when the left operator type is NullType. In case when left hand
+   * operator type is NullType create a Literal(Null).
    */
   object InConversion extends Rule[LogicalPlan] {
     def apply(plan: LogicalPlan): LogicalPlan = plan resolveExpressions {
       // Skip nodes who's children have not been resolved yet.
       case e if !e.childrenResolved => e
 
+      case i @ In(a, b) if (a.dataType == NullType) =>
+        Literal.create(null, BooleanType)
--- End diff --

Ah, sorry for my mistake. I thought you were casting `a` to boolean type, but 
actually you just turn the result into a boolean null.

Can you reference a Hive doc that says something like "if the value is null, 
the In operation will always return null"? I think we should follow Hive 
semantics here.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145639934
  
Thanks for reviewing the code, Wenchen. I was trying to model the test case 
on what was put in the JIRA, which used caseInsensitiveAnalyze. I have fixed 
it now.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145664718
  
Thanks!!

Do we need to look at the in-list types in this case? The in-list values 
could be literals of different types, right? For example, NULL NOT IN (1, 'a').

Since the result of the IN predicate is a boolean type, I thought it would be 
safe to transform it to a null literal of boolean type. Are you thinking of a 
case where this would not work?

Thanks a lot in advance for your help.



[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145696762
  
Thanks, Wenchen. You are right that not all types can be cast to boolean.

However, in this case we are not trying to cast the in-list types to the LHS 
type (null type in our case), because we know that in this special case the 
predicate would always evaluate to NULL. That is why we simply transform the 
IN predicate to a null literal and drop the in-list altogether.

== Parsed Logical Plan ==
'Project [unresolvedalias(*)]
 'Filter NOT null IN (1,2,3,4) => original one
  'UnresolvedRelation [inttab], None

== Analyzed Logical Plan ==
c1: int
Project [c1#0]
 Filter NOT null=> rewritten one
  Subquery inttab
   LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at 
<console>:26

Please let me know what you think. If you have a test case in mind that would 
exhibit a problem, I would like to try it out.

Thanks a lot for your help.


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread cloud-fan
Github user cloud-fan commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145689983
  
if you change the `value` type to boolean, then we have to change each 
element in `list` to boolean type too, which maybe dangerous as a lot of types 
can't be casted to boolean. See 
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-AllowedImplicitConversions


[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145734618
  
Can you please help clarify? Are you referring to the case where one of the 
values in the in-list is a literal null, like 1 in (1, NULL)? If so, I don't 
think we can evaluate this to NULL...

Ran the following two queries on Hive:
hive> select 1 in (1,NULL) from tnull;
OK
true
true
true
Time taken: 1.21 seconds, Fetched: 3 row(s)
hive> select 1 in (NULL,1) from tnull;
OK
true
true
true
Time taken: 0.168 seconds, Fetched: 3 row(s)

We have already taken care of the case where the LHS type is null. Let me 
know what you think.

