[GitHub] spark pull request: [SPARK-11024][SQL] Optimize NULL in

2015-10-30 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9348#discussion_r43538406
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeInSuite.scala ---
@@ -82,4 +83,35 @@ class OptimizeInSuite extends PlanTest {
 
     comparePlans(optimized, correctAnswer)
   }
+
+  test("OptimizedIn test: NULL IN (expr1, ..., exprN) gets transformed to Filter(null)") {
+    val originalQuery =
+      testRelation
+        .where(In(Literal.create(null, NullType), Seq(Literal(1), Literal(2))))
+        .analyze
+
+    val optimized = Optimize.execute(originalQuery.analyze)
+    val correctAnswer =
+      testRelation
+        .where(Literal.create(null, BooleanType))
+        .analyze
+
+    comparePlans(optimized, correctAnswer)
+  }
+
+  test("OptimizedIn test: Inset optimization disabled as " +
+    "list expression contains attribute)") {
+    val originalQuery =
+      testRelation
+        .where(In(Literal.create(null, StringType), Seq(Literal(1), UnresolvedAttribute("b"))))
+        .analyze
+
+    val optimized = Optimize.execute(originalQuery.analyze)
+    val correctAnswer =
+      testRelation
+        .where(Literal.create(null, BooleanType))
+        .analyze
+
+    comparePlans(optimized, correctAnswer)
+  }
--- End diff --

@yhuai 
Sure, I have added the test case. Thanks a lot for reviewing.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11188][SQL] Elide stacktraces in bin/sp...

2015-10-30 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9374#issuecomment-152467840
  
@marmbrus This is for the 1.5.2 branch.





[GitHub] spark pull request: [SPARK-11188][SQL] Elide stacktraces in bin/sp...

2015-10-30 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9375#issuecomment-152467732
  
@marmbrus This is for the 1.4.2 branch.





[GitHub] spark pull request: [SPARK-11188][SQL] Elide stacktraces in bin/sp...

2015-10-30 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/9375

[SPARK-11188][SQL] Elide stacktraces in bin/spark-sql for AnalysisExceptions

Only print the error message to the console for AnalysisExceptions in the sql-shell.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark spark-11188-v142

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9375.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9375


commit ed395be288b800b2d1fd6b669a13570381ba8b54
Author: Dilip Biswal 
Date:   2015-10-30T04:57:52Z

[SPARK-11188][SQL] Elide stacktraces in bin/spark-sql for AnalysisExceptions







[GitHub] spark pull request: [SPARK-11188][SQL] Elide stacktraces in bin/sp...

2015-10-30 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/9374

[SPARK-11188][SQL] Elide stacktraces in bin/spark-sql for AnalysisExceptions

Only print the error message to the console for AnalysisExceptions in the sql-shell.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark dkb-11188-v152

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9374.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9374


commit a58cedc9ee8971fe8ec758cfd1321a61cf0fcb4f
Author: Dilip Biswal 
Date:   2015-10-29T17:29:50Z

[SPARK-11188][SQL] Elide stacktraces in bin/spark-sql for AnalysisExceptions







[GitHub] spark pull request: [SPARK-11024][SQL] Optimize NULL in

2015-10-29 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9348#issuecomment-152422271
  
@cloud-fan Thanks, I have updated the test case name.





[GitHub] spark pull request: [SPARK-11024][SQL] Optimize NULL in

2015-10-29 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9348#issuecomment-152305661
  
@cloud-fan Hi Wenchen, this failure seems unrelated to my change. Should we retest?





[GitHub] spark pull request: [SPARK-11188][SQL] Elide stacktraces in bin/sp...

2015-10-29 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9194#issuecomment-152263073
  
@marmbrus Thanks a LOT for your comments and guidance. I am still a little new to the process.

So do I have to open two JIRAs, one for 1.4 and the other for 1.5, and associate the new pull requests with them?





[GitHub] spark pull request: [SPARK-11024][SQL] Optimize NULL in

2015-10-29 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9348#discussion_r43419241
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/OptimizeInSuite.scala ---
@@ -36,6 +36,7 @@ class OptimizeInSuite extends PlanTest {
   Batch("AnalysisNodes", Once,
 EliminateSubQueries) ::
   Batch("ConstantFolding", Once,
--- End diff --

@cloud-fan Thank you very, very much. It works!





[GitHub] spark pull request: [SPARK-11188][SQL] Elide stacktraces in bin/sp...

2015-10-29 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9194#issuecomment-152253043
  
@marmbrus Took out "WIP" and also changed the description.





[GitHub] spark pull request: [SPARK-11024][SQL] Optimize NULL in

2015-10-29 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9348#discussion_r43355402
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -417,6 +417,14 @@ object NullPropagation extends Rule[LogicalPlan] {
 case left :: Literal(null, _) :: Nil => Literal.create(null, e.dataType)
 case _ => e
   }
+
+  // If the value expression is NULL then transform the In expression to
+  // Literal(null)
+  case e @ In(v, list) => v match {
--- End diff --

@cloud-fan If I reorder the rules to move ConstantFolding ahead of NullPropagation, then it works fine.
I don't know the impact of doing this, so I need your advice.
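
The ordering question above can be illustrated with a toy model. This is only a sketch: the `Expr` types and the `foldCast` and `nullIn` rules are hypothetical stand-ins rather than Catalyst's classes, and it assumes the NULL value is hidden behind a cast until a folding step runs. Under that assumption, a single pass only folds the predicate when the folding rule runs first, while re-running the rules until nothing changes is insensitive to the order.

```scala
object RuleOrderSketch {
  // Toy expression tree; hypothetical stand-ins, not Catalyst classes.
  sealed trait Expr
  case object NullLit extends Expr
  final case class IntLit(value: Int) extends Expr
  final case class Cast(child: Expr) extends Expr
  final case class In(value: Expr, list: Seq[Expr]) extends Expr

  // Stand-in for constant folding: a cast of NULL is NULL.
  def foldCast(e: Expr): Expr = e match {
    case Cast(child) =>
      foldCast(child) match {
        case NullLit => NullLit
        case c       => Cast(c)
      }
    case In(v, list) => In(foldCast(v), list.map(foldCast))
    case other       => other
  }

  // Stand-in for the new null-propagation case: NULL IN (...) is NULL.
  def nullIn(e: Expr): Expr = e match {
    case In(NullLit, _) => NullLit
    case other          => other
  }

  // Apply each rule exactly once, in the given order.
  def runOnce(rules: Seq[Expr => Expr])(e: Expr): Expr =
    rules.foldLeft(e)((acc, rule) => rule(acc))

  // Re-run the whole batch until the expression stops changing.
  def runToFixedPoint(rules: Seq[Expr => Expr])(e: Expr): Expr = {
    val next = runOnce(rules)(e)
    if (next == e) e else runToFixedPoint(rules)(next)
  }
}
```

With `In(Cast(NullLit), Seq(IntLit(1)))` as input, `runOnce` with `nullIn` first leaves an `In` node behind, whereas `runOnce` with `foldCast` first, or `runToFixedPoint` in either order, yields `NullLit`.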





[GitHub] spark pull request: [SPARK-11024][SQL] Optimize NULL in

2015-10-29 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9348#discussion_r43354942
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -417,6 +417,14 @@ object NullPropagation extends Rule[LogicalPlan] {
 case left :: Literal(null, _) :: Nil => Literal.create(null, e.dataType)
 case _ => e
   }
+
+  // If the value expression is NULL then transform the In expression to
+  // Literal(null)
+  case e @ In(v, list) => v match {
--- End diff --

@cloud-fan Thanks, Wenchen, for your prompt input as always. I have a couple of questions:

- The ConstantFolding rule seems to come after NullPropagation. Would this still work?
- The second test case I added is failing:

=== FAIL: Plans do not match ===
!Filter null IN (1,cast(b#0 as string))   Filter null
 LocalRelation [a#0,b#0,c#0]              LocalRelation [a#0,b#0,c#0]

I am debugging it now, but thought I should check with you first.

Thanks again.





[GitHub] spark pull request: [SPARK-11024][SQL] Optimize NULL in

2015-10-28 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/9348

[SPARK-11024][SQL] Optimize NULL in  by folding it to 
Literal(null)

Add a rule in the optimizer to convert NULL [NOT] IN (expr1,...,exprN) to Literal(null).

This is a follow up defect to SPARK-8654
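
As a rough illustration of the rewrite described above, here is a minimal sketch on a toy expression tree. The `Expr` types and `fold` function are hypothetical stand-ins, not Catalyst's `In`, `Not`, or `Literal`, and the sketch assumes a non-empty IN list. It relies on SQL three-valued logic: comparing NULL with any value is NULL, so `NULL IN (...)` is NULL, and `NOT NULL` is NULL as well.

```scala
object NullInFolding {
  // Toy expression tree; hypothetical stand-ins, not Catalyst classes.
  sealed trait Expr
  case object NullLit extends Expr
  final case class IntLit(value: Int) extends Expr
  final case class Attr(name: String) extends Expr
  final case class In(value: Expr, list: Seq[Expr]) extends Expr
  final case class Not(child: Expr) extends Expr

  // Bottom-up rewrite: `NULL IN (...)` and `NOT NULL` both fold to NullLit.
  def fold(e: Expr): Expr = e match {
    case In(v, list) =>
      fold(v) match {
        case NullLit => NullLit
        case fv      => In(fv, list.map(fold))
      }
    case Not(child) =>
      fold(child) match {
        case NullLit => NullLit
        case fc      => Not(fc)
      }
    case other => other
  }
}
```

Note that the list contents do not matter once the value is NULL, which is why the sketch folds even when the list contains an attribute.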

@cloud-fan Can you please take a look?

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark spark_11024

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9348.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9348


commit 8e77826c885cdc5aefcdf2a2049cf94f3c3abff6
Author: Dilip Biswal 
Date:   2015-10-09T07:34:38Z

[SPARK-11024] Optimize NULL in  by folding it to 
Literal(null)







[GitHub] spark pull request: [SPARK-11188][SQL][WIP] Elide stacktraces in b...

2015-10-28 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9194#issuecomment-152001423
  
@marmbrus 
Michael, I will work on your suggestion about having a different log4j default for the spark-sql binary in another PR. Thanks a lot for suggesting it.





[GitHub] spark pull request: [SPARK-11188][SQL][WIP] Elide stacktraces in b...

2015-10-28 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9194#issuecomment-152000584
  
@marmbrus 
Thank you very much for your comments. I did think about users having their own log4j configuration. My thinking was that if they don't specify a console appender, then I would get null when querying for it and so would not attempt to remove it. However, it is possible for them to write their own console appender and give it their own alias. I was hoping that this would be a rare case.

However, I like your suggestion of logging it at DEBUG level. It's much simpler, and if users have chosen to log at this level, then it is OK for them to see this exception on the console (if they have selected a console appender). I will make the change per your suggestion.






[GitHub] spark pull request: [SPARK-11188][SQL][WIP] Elide stacktraces in b...

2015-10-27 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9194#discussion_r43159249
  
--- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala ---
@@ -219,4 +219,11 @@ class CliSuite extends SparkFunSuite with BeforeAndAfter with Logging {
       -> "OK"
     )
   }
+
+  test("SPARK-11188 Analysis error reporting") {
+    runCliWithin(2.minute)(
+      "select * from nonexistent_table;"
+        -> "Error in query: Table not found: nonexistent_table;"
--- End diff --

@marmbrus 
Hi Michael, given that logging to the console is the default, I have made changes to suppress console logging for analysis exceptions. One option is to stop logging these kinds of exceptions completely, but I have tried to log them to non-console appenders instead. Please let me know what you think.





[GitHub] spark pull request: [SPARK-11188][SQL][WIP] Elide stacktraces in b...

2015-10-26 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9194#discussion_r43011192
  
--- Diff: sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala ---
@@ -219,4 +219,11 @@ class CliSuite extends SparkFunSuite with BeforeAndAfter with Logging {
       -> "OK"
     )
   }
+
+  test("SPARK-11188 Analysis error reporting") {
+    runCliWithin(2.minute)(
+      "select * from nonexistent_table;"
+        -> "Error in query: Table not found: nonexistent_table;"
--- End diff --

@marmbrus 
Thanks for your quick feedback. Actually, this is one of the questions I was asking earlier:

2) The actual exception is also logged in the **run** method of SparkSQLDriver. If logging is set to CONSOLE, then the exception is printed on the screen. I hope this is OK?

I think logging goes to the console or stdout by default. If I change log4j.properties to direct logging to a file, then we should not see a stack trace.

Michael, do you want this logging completely suppressed for AnalysisException?







[GitHub] spark pull request: [SPARK-11188][SQL][WIP] Elide stacktraces in b...

2015-10-22 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9194#issuecomment-150333267
  
@marmbrus 
Michael, I tried putting "retest this please" in a comment, but it does not seem to start the tests. Is there a way to restart the tests?





[GitHub] spark pull request: [SPARK-11188][SQL][WIP] Elide stacktraces in b...

2015-10-22 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9194#issuecomment-150259879
  
retest this please.





[GitHub] spark pull request: [SPARK-11188][SQL][WIP] Elide stacktraces in b...

2015-10-22 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9194#issuecomment-150156202
  
retest this please





[GitHub] spark pull request: [SPARK-11188][SQL][WIP] Elide stacktraces in b...

2015-10-21 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9194#issuecomment-150124281
  
@marmbrus 
Michael, I need your advice here. It seems that when CliSuite is run, logging is set to the console, and that's why I see the AnalysisException in the test output. I had run it in my environment with logging directed to a file.

What can we do here? Please advise.





[GitHub] spark pull request: [SPARK-11188][SQL][WIP] Elide stacktraces in b...

2015-10-21 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9194#discussion_r42654708
  
--- Diff: sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala ---
@@ -308,7 +310,15 @@ private[hive] class SparkSQLCLIDriver extends CliDriver with Logging {
 
       ret = rc.getResponseCode
       if (ret != 0) {
-        console.printError(rc.getErrorMessage())
+        // For analysis exception, only the error is printed out to the console.
+        rc.getException() match {
+          case e : AnalysisException =>
+            console.printError(
+              s"""Failed with exception ${e.getClass.getName}: ${e.getMessage}""")
--- End diff --

Sure Michael. I will make the change.





[GitHub] spark pull request: [SPARK-11188][SQL][WIP] Elide stacktraces in b...

2015-10-21 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/9194

[SPARK-11188][SQL][WIP] Elide stacktraces in bin/spark-sql for AnalysisExceptions

I have a couple of questions.

1) I have used a newly added constructor in Hive that takes an additional
Throwable parameter to construct a CommandProcessorResponse, which I match on
in the caller to find out whether the exception is an AnalysisException.

 Question: Can this code run under an earlier version of Hive? If so, this
 will trigger a NoSuchMethodError at runtime.

 If the answer to the above question is yes, then I can think of a couple
 of options. Please let me know if there is a preference or a different way.

 1) We create a wrapper class that holds the CommandProcessorResponse and a
 Throwable that we use to match later. We also create a wrapper method around
 Driver.run() that returns a CommandProcessorResponseWrapper. I prefer this
 option as it seems cleaner.
 2) We search for AnalysisException in the exception string.

2) The actual exception is also logged in the run method of SparkSQLDriver.
If logging is set to CONSOLE, then the exception is printed on the screen.
I hope this is OK?

3) Is there a way to test this? Since we only suppress printing of the
stack trace, I didn't find a way to hook into any existing test suite.
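
The wrapper-class option from question 1 might look roughly like the sketch below. All names here (`Response`, `ResponseWrapper`, `consoleError`, and the local `AnalysisException`) are hypothetical stand-ins chosen for illustration; they are not the Hive or Spark classes, and the real change may be structured differently. The idea is only that the original throwable travels next to the response, so the caller can match on its type instead of searching the message string.

```scala
object ResponseWrapperSketch {
  // Hypothetical stand-ins for the real Spark and Hive classes.
  final class AnalysisException(message: String) extends Exception(message)
  final case class Response(code: Int, errorMessage: String)

  // Carry the original throwable alongside the response.
  final case class ResponseWrapper(response: Response, cause: Option[Throwable])

  // Text to print on the console for a failed command:
  // analysis errors get a one-line message, everything else is unchanged.
  def consoleError(w: ResponseWrapper): String = w.cause match {
    case Some(e: AnalysisException) => s"Error in query: ${e.getMessage}"
    case _                          => w.response.errorMessage
  }
}
```

Matching on the type keeps the check robust if the message wording changes, which is the main drawback of the string-search alternative.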

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark spark-11188

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9194.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9194


commit 0c68926935eec48cb4b61175400694d8ff783564
Author: Dilip Biswal 
Date:   2015-10-20T21:15:59Z

[SPARK-11188] Elide stacktraces in bin/spark-sql for AnalysisExceptions







[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-20 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-149665013
  
@marmbrus @rxin 





[GitHub] spark pull request: [SPARK-10534][SQL] ORDER BY clause allows only...

2015-10-15 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9123#issuecomment-148314200
  
@cloud-fan 
Hi Wenchen, I have fixed the code based on your comments.





[GitHub] spark pull request: [SPARK-10534][SQL] ORDER BY clause allows only...

2015-10-15 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9123#discussion_r42092689
  
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/AnalysisSuite.scala ---
@@ -135,4 +135,21 @@ class AnalysisSuite extends AnalysisTest {
     plan = testRelation.select(CreateStructUnsafe(Seq(a, (a + 1).as("a+1"))).as("col"))
     checkAnalysis(plan, plan)
   }
+
+  test("SPARK-10534: resolve attribute references in order by clause") {
+
+    val a = testRelation2.output.head
+    val c = testRelation2.output.toArray.apply(2)
+
+    val sortProjected = Floor(Cast(Floor(c), DoubleType))
+    val projected = Alias(a, "a")()
+    val plan = testRelation2.select(a).orderBy(SortOrder(Floor(Floor(c)), Ascending))
+
+    val expected =
+      Project(Seq(a),
+        Sort(Seq(SortOrder(sortProjected, Ascending)), true,
+          Project(Seq(a, c), testRelation2)))
+    checkAnalysis(plan, expected)
+
+  }
--- End diff --

Thanks a LOT. I will make the change.





[GitHub] spark pull request: [SPARK-10534][SQL] ORDER BY clause allows only...

2015-10-15 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9123#discussion_r42092649
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -482,7 +482,12 @@ class Analyzer(
   val newOrdering = resolveSortOrders(ordering, grandchild, throws = true)
   // Construct a set that contains all of the attributes that we need to evaluate the
   // ordering.
-  val requiredAttributes = AttributeSet(newOrdering.filter(_.resolved))
+
+  val resolvedAttributes =
+    newOrdering.flatMap(_.collect {case a : AttributeReference if a.resolved => a})
+
+  val requiredAttributes = AttributeSet(resolvedAttributes)
--- End diff --

Thanks. Much simpler :-). I will make the change and test.





[GitHub] spark pull request: [SPARK-10534][SQL] ORDER BY clause allows only...

2015-10-14 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9123#issuecomment-148220723
  
@cloud-fan 
Can you please take a look? Thanks.





[GitHub] spark pull request: [SPARK-10534][SQL] ORDER BY clause allows only...

2015-10-14 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/9123

[SPARK-10534][SQL] ORDER BY clause allows only columns that are present in 
S…

Find the missing attributes by recursively walking the sort order
expressions; the rest of the code takes care of projecting them out.
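
A minimal sketch of that idea, assuming a toy expression tree rather than Catalyst's real classes. The names `Expr`, `Attr`, `Lit`, `Add`, `references` and `missing` are hypothetical and only illustrate how attributes can be collected recursively from sort expressions and compared with what the SELECT list already outputs:

```scala
// Toy expression tree; these types are illustrative, not Catalyst classes.
object MissingSortAttributes {
  sealed trait Expr
  case class Attr(name: String) extends Expr
  case class Lit(value: Int) extends Expr
  case class Add(left: Expr, right: Expr) extends Expr

  // Recursively collect every attribute referenced anywhere in the expression.
  def references(e: Expr): Set[String] = e match {
    case Attr(n)   => Set(n)
    case Lit(_)    => Set.empty
    case Add(l, r) => references(l) ++ references(r)
  }

  // Attributes the ORDER BY needs that the SELECT list does not already output.
  def missing(sortExprs: Seq[Expr], projected: Set[String]): Set[String] =
    sortExprs.flatMap(references).toSet -- projected
}
```

For `SELECT a FROM t ORDER BY b + 1`, `missing` would report `b`, which is the attribute that has to be projected in addition to `a`.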

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark SPARK-10534

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9123.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9123


commit ac78af14508ac02a74202de564253b927f06c237
Author: Dilip Biswal 
Date:   2015-10-14T21:44:10Z

[SPARK-10534] ORDER BY clause allows only columns that are present in 
SELECT statement







[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-14 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-148102351
  
@marmbrus 
Hi Michael, the tests have passed. Please let me know if anything is 
pending before it gets merged.
Thanks a lot.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-12 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-147537472
  
@mambrus
Hi Michael, the tests have passed. Please let me know if anything is 
pending before it gets merged.
Thanks a lot.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146757252
  
@cloud-fan
Hi Wenchen, the test failed, and the failure looks unrelated. I tried to
trigger a retest, but it does not seem to be starting. Can you please help?





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146756933
  
retest this please





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146756112
  
retest this please





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146751654
  
retest this please





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/9036#discussion_r41595658
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala ---
@@ -304,7 +304,10 @@ object HiveTypeCoercion {
   }
 
   /**
-   * Convert all expressions in in() list to the left operator type
+   * Convert the value and in list expressions to the common operator type
+   * by looking at all the argument types and finding the closest one that
+   * all the arguments can be cast to. When no common operator type is found
+   * an Analysis Exception is raised.
--- End diff --

@cloud-fan 
You are right, Wenchen. I just wanted to mention that an exception will be
raised on a data type mismatch. I will reword it the way you suggest: "the
original one will be returned and an Analysis Exception will be raised at
type checking phase". Let me push a commit.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146728774
  
@cloud-fan 
Thanks. Fixed it.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146728816
  
retest this please





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146703685
  
retest this please





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146701712
  
@cloud-fan 
It seems to fail now with the following exception:
hudson.plugins.git.GitException: Failed to fetch from https://github.com/apache/spark.git
at hudson.plugins.git.GitSCM.fetchFrom(GitSCM.java:735)
at hudson.plugins.git.GitSCM.retrieveChanges(GitSCM.java:983)
at hudson.plugins.git.GitSCM.checkout(GitSCM.java:1016)
at hudson.scm.SCM.checkout(SCM.java:485)
at hudson.model.AbstractProject.checkout(AbstractProject.java:1277)
at hudson.model.AbstractBuild$AbstractBuildExecution.defaultCheckout(AbstractBuild.java:610)
at jenkins.scm.SCMCheckoutStrategy.checkout(SCMCheckoutStrategy.java:86)
at hudson.model.AbstractBuild$AbstractBuildExecution.run(AbstractBuild.java:532)
at hudson.model.Run.execute(Run.java:1741)
at hudson.model.FreeStyleBuild.run(FreeStyleBuild.java:43)
at hudson.model.ResourceController.execute(ResourceController.java:98)
at hudson.model.Executor.run(Executor.java:408)
Caused by: hudson.plugins.git.GitException: Command "git fetch --tags --progress https://github.com/apache/spark.git +refs/pull/9036/*:refs/remotes/origin/pr/9036/*" returned status code 143:
stdout: 
stderr: error: RPC failed; result=18, HTTP code = 200

What do we do here?





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146696603
  
@cloud-fan Hi, how do we trigger a retest?





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/9036#issuecomment-146692581
  
@cloud-fan 
Hi Wenchen, can you please review the update to the test case? Thank you.





[GitHub] spark pull request: [SPARK-8654] [SQL] Analysis exception when usi...

2015-10-08 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/9036

[SPARK-8654] [SQL] Analysis exception when using NULL IN (...) : invalid cast

In the analysis phase, while processing the rules for the IN predicate, we
compare the in-list types to the LHS expression type and generate a cast
operation if necessary. In the case of NULL [NOT] IN expr1, we end up
generating a cast from the in-list types to NULL, such as cast (1 as NULL),
which is not a valid cast.

The fix is to find a common type between the LHS and RHS expressions and cast
all the expressions to the common type.
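
A rough sketch of the common-type idea, under heavy simplification. The `DataType` objects, the numeric widening order and the `widen`/`commonType` helpers below are assumptions made for illustration; they are not the actual HiveTypeCoercion rules:

```scala
object InCoercionSketch {
  sealed trait DataType
  case object NullType extends DataType
  case object IntType extends DataType
  case object LongType extends DataType
  case object DoubleType extends DataType
  case object BooleanType extends DataType

  // Simplified numeric widening order (an assumption for this sketch).
  private val numericOrder: Seq[DataType] = Seq(IntType, LongType, DoubleType)

  // Widen two types: NullType yields to anything, numerics widen to the larger one.
  def widen(a: DataType, b: DataType): Option[DataType] = (a, b) match {
    case (x, y) if x == y => Some(x)
    case (NullType, y)    => Some(y)
    case (x, NullType)    => Some(x)
    case (x, y) if numericOrder.contains(x) && numericOrder.contains(y) =>
      Some(numericOrder(math.max(numericOrder.indexOf(x), numericOrder.indexOf(y))))
    case _ => None
  }

  // Common type of the value and the whole in-list, or None if they are incompatible.
  def commonType(value: DataType, list: Seq[DataType]): Option[DataType] =
    list.foldLeft(Option(value)) { (acc, t) => acc.flatMap(widen(_, t)) }
}
```

With this model, `NULL IN (1, 2L)` resolves to a common `LongType` so no cast to NULL is ever generated, while an incompatible list yields `None`, which a later type-checking step could report as an error.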

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark spark_8654_new

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/9036.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #9036


commit a1ee62904c0bed7aebd429e84f6f14212bd3c097
Author: Dilip Biswal 
Date:   2015-10-02T23:20:56Z

[SPARK-8654] Analysis exception when using NULL IN (...) : invalid cast

In the analysis phase, while processing the rules for the IN predicate, we
compare the in-list types to the LHS expression type and generate a cast
operation if necessary. In the case of NULL [NOT] IN expr1, we end up
generating a cast from the in-list types to NULL, such as cast (1 as NULL),
which is not a valid cast.

The fix is to not generate such a cast if the LHS type is NullType; instead,
we translate the expression to Literal(Null).







[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146685208
  
@cloud-fan 
I didn't see a re-open option on this pull request. Do I have to create a
new pull request?
Please let me know.





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146682265
  
@marmbrus Sorry about that. Is there a way I can look at the list of
failures?
I had run:
build/mvn -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver -Dhadoop.version=2.6.0 -Dmaven.test.failure.ignore=true test

and it reported success. But this is my first time, so I may not have the
right configuration.





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146645486
  
Thanks a lot @marmbrus .

Many thanks to @cloud-fan for his help.





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-08 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146470802
  
@cloud-fan 
Thanks a lot. I have implemented the review comments. Please take a look. I
looked at the optimizer code, and we already seem to be transforming
NULL in (...) to "Filter null" in the ConstantFolding rule. So we should be
okay here, right?





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-07 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146404345
  
@cloud-fan
Hi Wenchen, can you please take a look at the changes and let me know what
you think?





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-07 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146287283
  
Thank you @marmbrus @rick-ibm @cloud-fan 

I checked the behavior of DB2. It also raises an error if the in-list types
are not compatible.

db2 => select * from f1 where NULL in (1, true)
SQL0401N  The data types of the operands for the operation "IN" are not compatible or comparable.  SQLSTATE=42818

I am studying the code now to figure out how to detect this and raise an
error.





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146047895
  
@marmbrus

Thanks a lot, Michael, for looking into this. I debugged Hive to understand
the behaviour and would like to share my findings. I wanted to make sure we
are doing the right thing here. Here are the comments at the top of
GenericUDFIn() in Hive.

/**
 * GenericUDFIn
 *
 * Example usage:
 * SELECT key FROM src WHERE key IN ("238", "1");
 *
 * From MySQL page on IN(): To comply with the SQL standard, IN returns NULL
 * not only if the expression on the left hand side is NULL, but also if no
 * match is found in the list and one of the expressions in the list is NULL.
 *
 * Also noteworthy: type conversion behavior is different from MySQL. With
 * expr IN expr1, expr2... in MySQL, exprN will each be converted into the same
 * type as expr. In the Hive implementation, all expr(N) will be converted into
 * a common type for conversion consistency with other UDF's, and to prevent
 * conversions from a big type to a small type (e.g. int to tinyint)
 */

**(case 1) expr in (expr1,... exprN)**
* **1.1** Per the SQL standard, if expr is NULL then IN should return NULL
(with this PR we are attempting to achieve this).
* **1.2** If any of the expressions on the right-hand side is NULL and no
match is found in the list, then IN should also return NULL.
We also enforce these semantics in the
[implementation](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala#L124-L144).

**(case 2) Type conversion semantics.**
* **2.1** In MySQL, all the expressions on the right-hand side are converted
to the left-hand side type. Our behaviour matches these semantics. I am
not sure if this is the standard though.
* **2.2** In Hive, they seem to find a common type (probably the larger type)
and promote both the left-hand side and the right-hand side to that common
type. I believe this is where it throws the SemanticException.

Our behaviour seems to match that of MySQL more at the present time. Do we
want to change this?
Also, about case 1, it is not clear from the Hive comments what they
intended to do versus what the external behaviour is. Please let me know
what you think.





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-146003985
  
Very good point. Thanks. Actually, Hive reports an error in this case.

hive> select * from tnull where array(2,3) in (1, array(2,3));
FAILED: SemanticException [Error 10014]: Line 1:37 Wrong arguments '3': The arguments for IN should be the same type! Types are: {array IN (int, array)}

I am not sure what the right thing to do here is.

Any comments, @marmbrus?





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145958925
  
Thanks a lot, @cloud-fan. Sure, I will look into it. When you say another
PR, do you mean another JIRA?

I am asking because I am new to the process.

One other question: what is the process to get this change integrated? Do
I need to initiate any action from my end?






[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-06 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145948447
  
@cloud-fan, please confirm my understanding of the code (I am fairly new to
the codebase :-)).
In the code we go through the in list and evaluate each expression, setting
the hasNull flag when one evaluates to NULL. We still continue with the next
items and return true if we see a match. If we have not seen a match, we
look at the hasNull flag and return NULL if it is set, or false otherwise.
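
The evaluation described above can be sketched roughly as follows. This is not the code in predicates.scala; `evalIn` is a hypothetical helper, and `Option` is used here to model SQL NULL (`None`) for both the operands and the result:

```scala
object InEvalSketch {
  // None models SQL NULL; Some(b) models a definite boolean.
  def evalIn(value: Option[Int], list: Seq[Option[Int]]): Option[Boolean] =
    value match {
      case None => None // NULL IN (...) is always NULL
      case Some(v) =>
        var hasNull = false
        var found = false
        val it = list.iterator
        while (!found && it.hasNext) {
          it.next() match {
            case None    => hasNull = true // remember the NULL, keep scanning
            case Some(x) => if (x == v) found = true
          }
        }
        // A match wins; otherwise a NULL in the list makes the result NULL.
        if (found) Some(true) else if (hasNull) None else Some(false)
    }
}
```

Under this model, `1 IN (1, 2, NULL)` and `1 IN (NULL, 1, 2)` both evaluate to true regardless of where the NULL appears, while `3 IN (1, NULL)` evaluates to NULL.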

To confirm whether there is an issue, I ran the following two queries
again. The output looks OK to me.

select * from inttab where 1 in (1,2,NULL)
var2: org.apache.spark.sql.DataFrame = [c1: int]
+---+
| c1|
+---+
|  1|
|  2|
|  3|
|  4|
|  5|
+---+

== Parsed Logical Plan ==
'Project [unresolvedalias(*)]
 'Filter 1 IN (1,2,null)
  'UnresolvedRelation [inttab], None

== Analyzed Logical Plan ==
c1: int
Project [c1#0]
 Filter 1 IN (cast(1 as int),cast(2 as int),cast(null as int))
  Subquery inttab
   LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at :26

== Optimized Logical Plan ==
LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at :26

== Physical Plan ==
Scan PhysicalRDD[c1#0]

Code Generation: true

select * from inttab where 1 in (NULL,1,2)
var2: org.apache.spark.sql.DataFrame = [c1: int]
+---+
| c1|
+---+
|  1|
|  2|
|  3|
|  4|
|  5|
+---+

== Parsed Logical Plan ==
'Project [unresolvedalias(*)]
 'Filter 1 IN (null,1,2)
  'UnresolvedRelation [inttab], None

== Analyzed Logical Plan ==
c1: int
Project [c1#0]
 Filter 1 IN (cast(null as int),cast(1 as int),cast(2 as int))
  Subquery inttab
   LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at :26

== Optimized Logical Plan ==
LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at :26

== Physical Plan ==
Scan PhysicalRDD[c1#0]

Code Generation: true

Please let me know your thoughts.





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145734618
  
Can you please help clarify? Are you referring to the case where one of the
values in the in list is a literal null, like 1 in (1, NULL)? If so, I don't
think we can evaluate this to NULL.

I ran the following two queries on Hive:
hive> select 1 in (1,NULL) from tnull;
OK
true
true
true
Time taken: 1.21 seconds, Fetched: 3 row(s)
hive> select 1 in (NULL,1) from tnull;
OK
true
true
true
Time taken: 0.168 seconds, Fetched: 3 row(s)

We have already taken care of the case where the LHS type is null. Let me
know what you think.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145713214
  
hive> select null in (1,2,null) from tnull; 
OK
NULL
NULL
NULL
Time taken: 0.139 seconds, Fetched: 3 row(s)






[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145711558
  
Hi Wenchen,

Here is the link I could find; it is a bit confusing about the equality
operator.


https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF#LanguageManualUDF-RelationalOperators

To test it out, I tried the following queries on Hive:

hive> select * from tnull ;
OK
1
2
NULL


hive> select null = 1 from tnull ;
OK
NULL
NULL
NULL
Time taken: 0.118 seconds, Fetched: 3 row(s)
hive> select null in (1,2) from tnull ;
OK
NULL
NULL
NULL

Please let me know what you think and thanks again for your help.





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145696762
  
Thanks Wenchen. You are right that not all types can be cast to boolean.

However, in this case we are not trying to cast the in-list types to the
LHS type (null type in our case), because we know that this is a special
case where the predicate would always evaluate to NULL. That is why we are
simply transforming the IN predicate to a NULL literal and dropping the in
list altogether.

== Parsed Logical Plan ==
'Project [unresolvedalias(*)]
 'Filter NOT null IN (1,2,3,4) => original one
  'UnresolvedRelation [inttab], None

== Analyzed Logical Plan ==
c1: int
Project [c1#0]
 Filter NOT null=> rewritten one
  Subquery inttab
   LogicalRDD [c1#0], MapPartitionsRDD[4] at rddToDataFrameHolder at :26
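
The rewrite shown in the plans above can be sketched on a toy expression tree. The `Expr` hierarchy and the `rewrite` function below are hypothetical stand-ins for illustration, not the Catalyst rule from the PR:

```scala
object NullInRewriteSketch {
  sealed trait Expr
  case object NullLit extends Expr     // untyped NULL literal
  case object NullBoolean extends Expr // NULL literal of boolean type
  case class IntLit(v: Int) extends Expr
  case class In(value: Expr, list: Seq[Expr]) extends Expr
  case class Not(child: Expr) extends Expr

  // Replace `NULL IN (...)` with a boolean NULL, dropping the in list entirely.
  def rewrite(e: Expr): Expr = e match {
    case In(NullLit, _) => NullBoolean
    case In(v, list)    => In(rewrite(v), list.map(rewrite))
    case Not(c)         => Not(rewrite(c))
    case other          => other
  }
}
```

Applied to `NOT (NULL IN (1, 2, 3, 4))`, this yields `NOT NULL`, mirroring the analyzed plan; a predicate whose left-hand side is not a NULL literal is left unchanged.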

Please let me know what you think. If you have a test case in mind that
would exhibit a problem, I would like to try it out.

Thanks a lot for your help.





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145664718
  
Thanks!!

Do we need to look at the in-list types in this case? The in-list items
could be literals of different types, right? For example: NULL NOT IN (1, 'a').

Since the result of an IN predicate is of boolean type, I thought it would be
safe to transform it to a NULL literal of boolean type. Are you thinking of a
case where this would not work?

Thanks a lot in advance for your help.






[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/8983#issuecomment-145639934
  
Thanks for reviewing the code, Wenchen. I was modeling the test case on the
one in the JIRA, which used caseInsensitiveAnalyze. I have fixed it now.





[GitHub] spark pull request: [SPARK-8654][SQL] Fix Analysis exception when ...

2015-10-05 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/8983

[SPARK-8654][SQL] Fix Analysis exception when using NULL IN (...)

In the analysis phase, while processing the rules for the IN predicate, we
compare the in-list types to the LHS expression type and generate a
cast operation if necessary. In the case of NULL [NOT] IN expr1, we end up
generating a cast from the in-list types to NULL, such as cast (1 as NULL),
which is not a valid cast.

The fix is to not generate such a cast if the LHS type is NullType; instead,
we translate the expression to Literal(Null).
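
The coercion step described above can be sketched as follows. This is a toy model under stated assumptions, not the analyzer code from this pull request: InListCoercion, coerce and the small type hierarchy are hypothetical names, and the sketch assumes any cast to a non-null target type is acceptable.

```scala
// Toy model of the IN-list coercion step; every name here is hypothetical
// and does not mirror Spark's analyzer code.
object InListCoercion {
  sealed trait DataType
  case object NullType extends DataType
  case object BooleanType extends DataType
  case object IntegerType extends DataType
  case object LongType extends DataType

  sealed trait Expr { def dataType: DataType }
  case class Literal(value: Any, dataType: DataType) extends Expr
  case class Cast(child: Expr, dataType: DataType) extends Expr
  case class In(value: Expr, list: Seq[Expr]) extends Expr {
    val dataType: DataType = BooleanType
  }

  def coerce(in: In): Expr =
    if (in.value.dataType == NullType) {
      // Casting list items to NullType would be invalid, so fold the whole
      // predicate to a boolean-typed NULL literal instead.
      Literal(null, BooleanType)
    } else {
      // Cast only the items whose type differs from the left-hand side.
      val target = in.value.dataType
      In(in.value, in.list.map(e => if (e.dataType == target) e else Cast(e, target)))
    }
}
```

The guard on NullType comes first so that the cast-generating branch is never reached for NULL [NOT] IN (...), which is the situation that produced cast (1 as NULL).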

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark spark_8654

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/8983.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #8983


commit 38f973bb124c63c1caabe14ee6e5cca7b764b15a
Author: Dilip Biswal 
Date:   2015-10-02T23:20:56Z

[SPARK-8654] Analysis exception when using NULL IN (...) : invalid cast

In the analysis phase , while processing the rules for IN predicate, we
compare the in-list types to the lhs expression type and generate
cast operation if necessary. In the case of NULL [NOT] IN expr1 , we end up
generating cast between in list types to NULL like cast (1 as NULL) which
is not a valid cast.

The fix is to not generate such a cast if the lhs type is a NullType instead
we translate the expression to Literal(Null).






