[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-197720740
  
**[Test build #53397 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53397/consoleFull)** for PR 11297 at commit [`29a4f59`](https://github.com/apache/spark/commit/29a4f59dcdbe24666a5ff2905f50895d6232bf2d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-19 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-197560109
  
cc @marmbrus 





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-197750038
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53397/
Test PASSed.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-19 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-197974133
  
LGTM.  Thanks!

Merging to master.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-19 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56426148
  
--- Diff: sql/catalyst/src/main/antlr3/org/apache/spark/sql/catalyst/parser/FromClauseParser.g ---
@@ -91,10 +91,17 @@ fromClause
 joinSource
 @init { gParent.pushMsg("join source", state); }
 @after { gParent.popMsg(state); }
-    : fromSource ( joinToken^ fromSource ( KW_ON! expression {$joinToken.start.getType() != COMMA}? )? )*
+    : fromSource ( joinToken^ fromSource ( joinCond {$joinToken.start.getType() != COMMA}? )? )*
     | uniqueJoinToken^ uniqueJoinSource (COMMA! uniqueJoinSource)+
     ;
 
+joinCond
+@init { gParent.pushMsg("join expression list", state); }
+@after { gParent.popMsg(state); }
+    : KW_ON! expression
+    | KW_USING LPAREN columnNameList RPAREN -> ^(TOK_USING columnNameList)
+    ;
--- End diff --

This looks pretty good.
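
For readers less familiar with the clause that the new `joinCond` rule accepts: `JOIN ... USING (c1)` behaves like an equi-join on `c1` whose output carries `c1` only once. The following is a minimal plain-Scala sketch of that behaviour for the inner-join case only. `UsingJoinSketch`, `Row`, and `innerJoinUsing` are hypothetical names chosen for illustration, not Spark APIs, and the column order (join columns, then remaining left columns, then remaining right columns) is assumed from the `SQLQuerySuite` expectations quoted elsewhere in this thread.

```scala
object UsingJoinSketch {
  // A row is modelled as column name -> value (illustrative only).
  type Row = Map[String, String]

  // Inner join on equality of every column in `usingCols`. Each result row
  // lists the join columns once, then the remaining left columns, then the
  // remaining right columns.
  def innerJoinUsing(
      left: Seq[Row],
      right: Seq[Row],
      usingCols: Seq[String],
      leftCols: Seq[String],
      rightCols: Seq[String]): Seq[Seq[String]] =
    for {
      l <- left
      r <- right
      if usingCols.forall(c => l(c) == r(c))
    } yield {
      usingCols.map(c => l(c)) ++
        leftCols.filterNot(c => usingCols.contains(c)).map(c => l(c)) ++
        rightCols.filterNot(c => usingCols.contains(c)).map(c => r(c))
    }
}
```

With the `t1` and `t2` data from the test, `USING (c1)` yields five columns per row and `USING (c1, c2)` yields four, because each listed column is emitted a single time.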





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-19 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-197978987
  
@marmbrus Thank you very much, Michael!





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-19 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56459052
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1329,48 +1329,72 @@ class Analyzer(
   }
 
   /**
-   * Removes natural joins by calculating output columns based on output from two sides,
-   * Then apply a Project on a normal Join to eliminate natural join.
+   * Removes natural or using joins by calculating output columns based on output from two sides,
+   * Then apply a Project on a normal Join to eliminate natural or using join.
    */
-  object ResolveNaturalJoin extends Rule[LogicalPlan] {
+  object ResolveNaturalAndUsingJoin extends Rule[LogicalPlan] {
     override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
+      case j @ Join(left, right, UsingJoin(joinType, usingCols), condition)
+          if left.resolved && right.resolved && j.duplicateResolved =>
+        // Resolve the column names referenced in using clause from both the legs of join.
+        val lCols = usingCols.flatMap(col => left.resolveQuoted(col.name, resolver))
+        val rCols = usingCols.flatMap(col => right.resolveQuoted(col.name, resolver))
+        if ((lCols.length == usingCols.length) && (rCols.length == usingCols.length)) {
+          val joinNames = lCols.map(exp => exp.name)
+          commonNaturalJoinProcessing(left, right, joinType, joinNames, None)
+        } else {
+          j
+        }
       case j @ Join(left, right, NaturalJoin(joinType), condition) if j.resolvedExceptNatural =>
         // find common column names from both sides
         val joinNames = left.output.map(_.name).intersect(right.output.map(_.name))
-        val leftKeys = joinNames.map(keyName => left.output.find(_.name == keyName).get)
-        val rightKeys = joinNames.map(keyName => right.output.find(_.name == keyName).get)
-        val joinPairs = leftKeys.zip(rightKeys)
-
-        // Add joinPairs to joinConditions
-        val newCondition = (condition ++ joinPairs.map {
-          case (l, r) => EqualTo(l, r)
-        }).reduceOption(And)
-
-        // columns not in joinPairs
-        val lUniqueOutput = left.output.filterNot(att => leftKeys.contains(att))
-        val rUniqueOutput = right.output.filterNot(att => rightKeys.contains(att))
-
-        // the output list looks like: join keys, columns from left, columns from right
-        val projectList = joinType match {
-          case LeftOuter =>
-            leftKeys ++ lUniqueOutput ++ rUniqueOutput.map(_.withNullability(true))
-          case RightOuter =>
-            rightKeys ++ lUniqueOutput.map(_.withNullability(true)) ++ rUniqueOutput
-          case FullOuter =>
-            // in full outer join, joinCols should be non-null if there is.
-            val joinedCols = joinPairs.map { case (l, r) => Alias(Coalesce(Seq(l, r)), l.name)() }
-            joinedCols ++
-              lUniqueOutput.map(_.withNullability(true)) ++
-              rUniqueOutput.map(_.withNullability(true))
-          case Inner =>
-            rightKeys ++ lUniqueOutput ++ rUniqueOutput
-          case _ =>
-            sys.error("Unsupported natural join type " + joinType)
-        }
-        // use Project to trim unnecessary fields
-        Project(projectList, Join(left, right, joinType, newCondition))
+        commonNaturalJoinProcessing(left, right, joinType, joinNames, condition)
+    }
+  }
+
+  private def commonNaturalJoinProcessing(
+      left: LogicalPlan,
+      right: LogicalPlan,
+      joinType: JoinType,
+      joinNames: Seq[String],
+      condition: Option[Expression]) = {
+    val leftKeys = joinNames.map(keyName => left.output.find(_.name == keyName).get)
+    val rightKeys = joinNames.map(keyName => right.output.find(_.name == keyName).get)
+    val joinPairs = leftKeys.zip(rightKeys)
+
+    // Add joinPairs to joinConditions
+    val newCondition = (condition ++ joinPairs.map {
--- End diff --

@hvanhovell Thank you for your review! I have made the change.
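
The removed block in the hunk above chooses the projected columns per join type: join keys first, then the left-only columns, then the right-only columns, with the keys coalesced for a full outer join. A rough plain-Scala sketch of just that ordering follows. `JoinOutputSketch`, `JoinKind`, and `outputColumns` are hypothetical stand-ins rather than Catalyst classes, columns are represented as descriptive strings, and nullability handling is deliberately left out.

```scala
object JoinOutputSketch {
  sealed trait JoinKind
  case object Inner extends JoinKind
  case object LeftOuter extends JoinKind
  case object RightOuter extends JoinKind
  case object FullOuter extends JoinKind

  // Describes each output column as "side.name"; a coalesced key is written
  // as "coalesce(left.k, right.k)".
  def outputColumns(
      kind: JoinKind,
      leftCols: Seq[String],
      rightCols: Seq[String],
      joinNames: Seq[String]): Seq[String] = {
    val lUnique = leftCols.filterNot(c => joinNames.contains(c)).map(c => "left." + c)
    val rUnique = rightCols.filterNot(c => joinNames.contains(c)).map(c => "right." + c)
    val keys = kind match {
      case LeftOuter          => joinNames.map(k => "left." + k)
      // The quoted hunk takes the right-side keys for Inner as well.
      case RightOuter | Inner => joinNames.map(k => "right." + k)
      case FullOuter          => joinNames.map(k => s"coalesce(left.$k, right.$k)")
    }
    keys ++ lUnique ++ rUnique
  }
}
```

For an inner join either side's key holds the same value, so the choice of side only matters for the outer variants.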





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-19 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-197750031
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-19 Thread hvanhovell
Github user hvanhovell commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56421281
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1329,48 +1329,72 @@ class Analyzer(
   }
 
   /**
-   * Removes natural joins by calculating output columns based on output from two sides,
-   * Then apply a Project on a normal Join to eliminate natural join.
+   * Removes natural or using joins by calculating output columns based on output from two sides,
+   * Then apply a Project on a normal Join to eliminate natural or using join.
    */
-  object ResolveNaturalJoin extends Rule[LogicalPlan] {
+  object ResolveNaturalAndUsingJoin extends Rule[LogicalPlan] {
     override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
+      case j @ Join(left, right, UsingJoin(joinType, usingCols), condition)
+          if left.resolved && right.resolved && j.duplicateResolved =>
+        // Resolve the column names referenced in using clause from both the legs of join.
+        val lCols = usingCols.flatMap(col => left.resolveQuoted(col.name, resolver))
+        val rCols = usingCols.flatMap(col => right.resolveQuoted(col.name, resolver))
+        if ((lCols.length == usingCols.length) && (rCols.length == usingCols.length)) {
+          val joinNames = lCols.map(exp => exp.name)
+          commonNaturalJoinProcessing(left, right, joinType, joinNames, None)
+        } else {
+          j
+        }
       case j @ Join(left, right, NaturalJoin(joinType), condition) if j.resolvedExceptNatural =>
         // find common column names from both sides
         val joinNames = left.output.map(_.name).intersect(right.output.map(_.name))
-        val leftKeys = joinNames.map(keyName => left.output.find(_.name == keyName).get)
-        val rightKeys = joinNames.map(keyName => right.output.find(_.name == keyName).get)
-        val joinPairs = leftKeys.zip(rightKeys)
-
-        // Add joinPairs to joinConditions
-        val newCondition = (condition ++ joinPairs.map {
-          case (l, r) => EqualTo(l, r)
-        }).reduceOption(And)
-
-        // columns not in joinPairs
-        val lUniqueOutput = left.output.filterNot(att => leftKeys.contains(att))
-        val rUniqueOutput = right.output.filterNot(att => rightKeys.contains(att))
-
-        // the output list looks like: join keys, columns from left, columns from right
-        val projectList = joinType match {
-          case LeftOuter =>
-            leftKeys ++ lUniqueOutput ++ rUniqueOutput.map(_.withNullability(true))
-          case RightOuter =>
-            rightKeys ++ lUniqueOutput.map(_.withNullability(true)) ++ rUniqueOutput
-          case FullOuter =>
-            // in full outer join, joinCols should be non-null if there is.
-            val joinedCols = joinPairs.map { case (l, r) => Alias(Coalesce(Seq(l, r)), l.name)() }
-            joinedCols ++
-              lUniqueOutput.map(_.withNullability(true)) ++
-              rUniqueOutput.map(_.withNullability(true))
-          case Inner =>
-            rightKeys ++ lUniqueOutput ++ rUniqueOutput
-          case _ =>
-            sys.error("Unsupported natural join type " + joinType)
-        }
-        // use Project to trim unnecessary fields
-        Project(projectList, Join(left, right, joinType, newCondition))
+        commonNaturalJoinProcessing(left, right, joinType, joinNames, condition)
+    }
+  }
+
+  private def commonNaturalJoinProcessing(
+      left: LogicalPlan,
+      right: LogicalPlan,
+      joinType: JoinType,
+      joinNames: Seq[String],
+      condition: Option[Expression]) = {
+    val leftKeys = joinNames.map(keyName => left.output.find(_.name == keyName).get)
+    val rightKeys = joinNames.map(keyName => right.output.find(_.name == keyName).get)
+    val joinPairs = leftKeys.zip(rightKeys)
+
+    // Add joinPairs to joinConditions
+    val newCondition = (condition ++ joinPairs.map {
--- End diff --

`joinPairs.map(EqualTo)`?
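
One caveat with the suggestion as literally written: a two-argument case-class constructor is a `Function2`, while `map` over a sequence of pairs expects a function of one tuple, so in Scala 2 some adaptation such as `.tupled` is usually required. The sketch below illustrates this with hypothetical stand-in classes (`Expr`, `Attr`, `EqualTo`, `And` here are local definitions, not the Catalyst ones). It is an assumption about what a shortened form could look like; this thread does not show which form the PR finally used.

```scala
object JoinConditionSketch {
  sealed trait Expr
  case class Attr(name: String) extends Expr
  case class EqualTo(left: Expr, right: Expr) extends Expr
  case class And(left: Expr, right: Expr) extends Expr

  // Combines an optional existing condition with one equality per key pair.
  def buildCondition(existing: Option[Expr], joinPairs: Seq[(Attr, Attr)]): Option[Expr] = {
    // Long form, as in the hunk: destructure each pair explicitly.
    val viaCase = joinPairs.map { case (l, r) => EqualTo(l, r) }
    // Short form: adapt the two-argument constructor to take one tuple.
    val viaTupled = joinPairs.map((EqualTo.apply _).tupled)
    assert(viaCase == viaTupled) // both forms build the same expressions

    // reduceOption yields None when there is nothing to combine.
    (existing.toSeq ++ viaTupled).reduceOption((a, b) => And(a, b))
  }
}
```

Using `reduceOption` rather than `reduce` keeps the no-condition, no-key case well defined: the result is simply `None`.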





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-19 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-197749675
  
**[Test build #53397 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53397/consoleFull)** for PR 11297 at commit [`29a4f59`](https://github.com/apache/spark/commit/29a4f59dcdbe24666a5ff2905f50895d6232bf2d).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/11297





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196660958
  
Merged build finished. Test PASSed.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196660960
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53154/
Test PASSed.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196660444
  
**[Test build #53154 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53154/consoleFull)** for PR 11297 at commit [`f39f547`](https://github.com/apache/spark/commit/f39f54791098578a2a676f599040263086fb4b19).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196635689
  
**[Test build #53154 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53154/consoleFull)** for PR 11297 at commit [`f39f547`](https://github.com/apache/spark/commit/f39f54791098578a2a676f599040263086fb4b19).





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196601404
  
Merged build finished. Test FAILed.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196601382
  
**[Test build #53145 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53145/consoleFull)** for PR 11297 at commit [`f7bb37e`](https://github.com/apache/spark/commit/f7bb37eb066bb60265297f9aa7d2373038d853d3).
 * This patch **fails MiMa tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196601409
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53145/
Test FAILed.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread SparkQA
Github user SparkQA commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196598285
  
**[Test build #53145 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53145/consoleFull)** for PR 11297 at commit [`f7bb37e`](https://github.com/apache/spark/commit/f7bb37eb066bb60265297f9aa7d2373038d853d3).





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196597773
  
ok to test





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56101664
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1339,43 +1339,68 @@ class Analyzer(
    */
   object ResolveNaturalJoin extends Rule[LogicalPlan] {
--- End diff --

OK.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56101632
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala ---
@@ -458,43 +458,14 @@ class DataFrame private[sql](
       Join(logicalPlan, right.logicalPlan, joinType = JoinType(joinType), None))
       .analyzed.asInstanceOf[Join]
 
-    val condition = usingColumns.map { col =>
-      catalyst.expressions.EqualTo(
-        withPlan(joined.left).resolve(col),
-        withPlan(joined.right).resolve(col))
-    }.reduceLeftOption[catalyst.expressions.BinaryExpression] { (cond, eqTo) =>
-      catalyst.expressions.And(cond, eqTo)
-    }
-
-    // Project only one of the join columns.
-    val joinedCols = JoinType(joinType) match {
-      case Inner | LeftOuter | LeftSemi =>
-        usingColumns.map(col => withPlan(joined.left).resolve(col))
-      case RightOuter =>
-        usingColumns.map(col => withPlan(joined.right).resolve(col))
-      case FullOuter =>
-        usingColumns.map { col =>
-          val leftCol = withPlan(joined.left).resolve(col).toAttribute.withNullability(true)
-          val rightCol = withPlan(joined.right).resolve(col).toAttribute.withNullability(true)
-          Alias(Coalesce(Seq(leftCol, rightCol)), col)()
-        }
-      case NaturalJoin(_) => sys.error("NaturalJoin with using clause is not supported.")
-    }
-    // The nullability of output of joined could be different than original column,
-    // so we can only compare them by exprId
-    val joinRefs = AttributeSet(condition.toSeq.flatMap(_.references))
-    val resultCols = joinedCols ++ joined.output.filterNot(joinRefs.contains(_))
     withPlan {
-      Project(
-        resultCols,
-        Join(
-          joined.left,
-          joined.right,
-          joinType = JoinType(joinType),
-          condition)
-      )
+      Join(
+        joined.left,
+        joined.right,
+        UsingJoin(JoinType(joinType), usingColumns.map(UnresolvedAttribute(_))),
+        None)
     }
-  }
+   }
--- End diff --

I will fix it.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56101653
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1339,43 +1339,68 @@ class Analyzer(
    */
   object ResolveNaturalJoin extends Rule[LogicalPlan] {
     override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
+      case j @ Join(left, right, UsingJoin(joinType, usingCols), condition)
+        if left.resolved && right.resolved =>
+        // Resolve the column names referenced in using clause from both the legs of join.
+        val lCols = usingCols.flatMap(col => left.resolveQuoted(col.name, resolver))
+        val rCols = usingCols.flatMap(col => right.resolveQuoted(col.name, resolver))
+        if ((lCols.length == usingCols.length) && (rCols.length == usingCols.length))
+        {

ok. Will make the change.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56101613
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -2178,4 +2178,62 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
         Row(4) :: Nil)
     }
   }
-}
+
+  test("join with using clause") {
+    val df1 = Seq(("r1c1", "r1c2", "t1r1c3"),
+      ("r2c1", "r2c2", "t1r2c3"), ("r3c1x", "r3c2", "t1r3c3")).toDF("c1", "c2", "c3")
+    val df2 = Seq(("r1c1", "r1c2", "t2r1c3"),
+      ("r2c1", "r2c2", "t2r2c3"), ("r3c1y", "r3c2", "t2r3c3")).toDF("c1", "c2", "c3")
+    val df3 = Seq((null, "r1c2", "t3r1c3"),
+      ("r2c1", "r2c2", "t3r2c3"), ("r3c1y", "r3c2", "t3r3c3")).toDF("c1", "c2", "c3")
+    withTempTable("t1", "t2", "t3") {
+      df1.registerTempTable("t1")
+      df2.registerTempTable("t2")
+      df3.registerTempTable("t3")
+      // inner join with one using column
+      checkAnswer(
+        sql("SELECT * FROM t1 join t2 using (c1)"),
+        Row("r1c1", "r1c2", "t1r1c3", "r1c2", "t2r1c3") ::
+          Row("r2c1", "r2c2", "t1r2c3", "r2c2", "t2r2c3") :: Nil)
+
+      // inner join with two using columns
+      checkAnswer(
+        sql("SELECT * FROM t1 join t2 using (c1, c2)"),
+        Row("r1c1", "r1c2", "t1r1c3", "t2r1c3") ::
+          Row("r2c1", "r2c2", "t1r2c3", "t2r2c3") :: Nil)
+
+      // Left outer join with one using column.
+      checkAnswer(
+        sql("SELECT * FROM t1 left join t2 using (c1)"),
+        Row("r1c1", "r1c2", "t1r1c3", "r1c2", "t2r1c3") ::
+          Row("r2c1", "r2c2", "t1r2c3", "r2c2", "t2r2c3") ::
+          Row("r3c1x", "r3c2", "t1r3c3", null, null) :: Nil)
+
+      // Right outer join with one using column.
+      checkAnswer(
+        sql("SELECT * FROM t1 right join t2 using (c1)"),
+        Row("r1c1", "r1c2", "t1r1c3", "r1c2", "t2r1c3") ::
+          Row("r2c1", "r2c2", "t1r2c3", "r2c2", "t2r2c3") ::
+          Row("r3c1y", null, null, "r3c2", "t2r3c3") :: Nil)
+
+      // Full outer join with one using column.
+      checkAnswer(
+        sql("SELECT * FROM t1 full outer join t2 using (c1)"),
+        Row("r1c1", "r1c2", "t1r1c3", "r1c2", "t2r1c3") ::
+          Row("r2c1", "r2c2", "t1r2c3", "r2c2", "t2r2c3") ::
+          Row("r3c1x", "r3c2", "t1r3c3", null, null) ::
+          Row("r3c1y", null,
+            null, "r3c2", "t2r3c3") :: Nil)
+
+      // Full outer join with null value in join column.
+      checkAnswer(
+        sql("SELECT * FROM t1 full outer join t3 using (c1)"),
+        Row("r1c1", "r1c2", "t1r1c3", null, null) ::
+          Row("r2c1", "r2c2", "t1r2c3", "r2c2", "t3r2c3") ::
+          Row("r3c1x", "r3c2", "t1r3c3", null, null) ::
+          Row("r3c1y", null, null, "r3c2", "t3r3c3") ::
+          Row(null, null, null, "r1c2", "t3r1c3") :: Nil)
+    }
+  }
+  }
--- End diff --

Will fix it.
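
The last expectation in the test above (full outer join against `t3`, whose first row has a null `c1`) depends on two rules: a null key never matches anything, and the single output key is the coalesce of the two sides. A small plain-Scala sketch of that behaviour follows, reduced to one key plus one payload column per side. `FullOuterUsingSketch`, `Row`, and `fullOuterUsing` are hypothetical names for illustration, and `None` stands in for SQL NULL.

```scala
object FullOuterUsingSketch {
  // (join key, payload); a None key models SQL NULL.
  type Row = (Option[String], String)
  // (coalesced key, left payload, right payload)
  type Out = (Option[String], Option[String], Option[String])

  def fullOuterUsing(left: Seq[Row], right: Seq[Row]): Seq[Out] = {
    // SQL equality is never true when either side is NULL.
    def matches(l: Row, r: Row): Boolean = l._1.isDefined && l._1 == r._1

    val matched: Seq[Out] =
      for { l <- left; r <- right if matches(l, r) }
        yield (l._1.orElse(r._1), Option(l._2), Option(r._2))
    // Unmatched rows keep their own key and get NULLs for the other side.
    val leftOnly: Seq[Out] = left
      .filterNot(l => right.exists(r => matches(l, r)))
      .map(l => (l._1, Option(l._2), Option.empty[String]))
    val rightOnly: Seq[Out] = right
      .filterNot(r => left.exists(l => matches(l, r)))
      .map(r => (r._1, Option.empty[String], Option(r._2)))

    matched ++ leftOnly ++ rightOnly
  }
}
```

Applied to the `c1` and `c3` columns of `t1` and `t3`, this gives one matched row for `r2c1` and four unmatched rows, including one whose key is null, which corresponds to the five rows the test expects.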





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread adrian-wang
Github user adrian-wang commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56101451
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1169,6 +1170,7 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
   Join(newLeft, newRight, LeftOuter, newJoinCond)
 case FullOuter => f
 case NaturalJoin(_) => sys.error("Untransformed NaturalJoin node")
+case UsingJoin(_, _) => sys.error("Untransformed Using join node")
--- End diff --

JoinType is sealed, so we need to put something in this pattern match
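
To illustrate the point about sealed types, here is a minimal, self-contained sketch. The types below are toy stand-ins that only borrow names from the thread; they are not Catalyst's actual definitions, and `describe` is a hypothetical helper. Because the trait is sealed, the compiler can check the match for exhaustiveness, so adding a new subtype such as `UsingJoin` produces a "match may not be exhaustive" warning until a case is added.

```scala
// Toy stand-ins for illustration only; not Spark's real JoinType hierarchy.
sealed trait JoinType
case object Inner extends JoinType
case object LeftOuter extends JoinType
case object FullOuter extends JoinType
final case class NaturalJoin(tpe: JoinType) extends JoinType
final case class UsingJoin(tpe: JoinType, usingColumns: Seq[String]) extends JoinType

object SealedMatchDemo {
  // Hypothetical helper. Dropping the UsingJoin case here would make the
  // compiler warn that the match is not exhaustive.
  def describe(joinType: JoinType): String = joinType match {
    case Inner => "inner"
    case LeftOuter => "left outer"
    case FullOuter => "full outer"
    case NaturalJoin(_) => sys.error("Untransformed NaturalJoin node")
    case UsingJoin(_, _) => sys.error("Untransformed Using join node")
  }
}
```

Whether the extra case should throw or simply fall through is a design choice; the sketch only shows why some case has to be present.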





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196479791
  
Oh, you already did :)  Ignore that.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196479175
  
This looks pretty good.  Thanks for working on it!  Can you also update the 
function in 
[Dataset](https://github.com/apache/spark/blob/6a4bfcd62b7effcfbb865bdd301d41a0ba6e5c94/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala#L476)
 to use this code path instead of doing resolution there?





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56057338
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1339,43 +1339,68 @@ class Analyzer(
*/
   object ResolveNaturalJoin extends Rule[LogicalPlan] {
--- End diff --

Update the comments, and probably the class name.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56057244
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
@@ -1339,43 +1339,68 @@ class Analyzer(
*/
   object ResolveNaturalJoin extends Rule[LogicalPlan] {
 override def apply(plan: LogicalPlan): LogicalPlan = plan resolveOperators {
+  case j @ Join(left, right, UsingJoin(joinType, usingCols), condition)
+if left.resolved && right.resolved =>
+// Resolve the column names referenced in using clause from both the legs of join.
+val lCols = usingCols.flatMap(col => left.resolveQuoted(col.name, resolver))
+val rCols = usingCols.flatMap(col => right.resolveQuoted(col.name, resolver))
+if ((lCols.length == usingCols.length) && (rCols.length == usingCols.length))
+{
--- End diff --

Nit: don't wrap `{}` for `if`.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56057116
  
--- Diff: sql/catalyst/src/main/antlr3/org/apache/spark/sql/catalyst/parser/FromClauseParser.g ---
@@ -91,10 +91,17 @@ fromClause
 joinSource
 @init { gParent.pushMsg("join source", state); }
 @after { gParent.popMsg(state); }
-: fromSource ( joinToken^ fromSource ( KW_ON! expression {$joinToken.start.getType() != COMMA}? )? )*
+: fromSource ( joinToken^ fromSource ( joinCond {$joinToken.start.getType() != COMMA}? )? )*
 | uniqueJoinToken^ uniqueJoinSource (COMMA! uniqueJoinSource)+
 ;
 
+joinCond
+@init { gParent.pushMsg("join expression list", state); }
+@after { gParent.popMsg(state); }
+: KW_ON! expression
+| KW_USING LPAREN columnNameList RPAREN -> ^(TOK_USING columnNameList)
+;
--- End diff --

/cc @hvanhovell





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-196475952
  
We should also modify the [`resolved` attribute of a 
join](https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicOperators.scala#L303)
 to return false until `UsingJoin` is resolved.
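
As a rough illustration of this suggestion, the sketch below uses a toy `Join` with a simplified `resolved` flag. Everything here is hypothetical: the real Catalyst `Join` has children, a condition, and more checks, none of which are modelled. The only idea shown is that a join still carrying a `UsingJoin` type reports itself as unresolved, so later phases cannot mistake it for a finished plan.

```scala
// Toy model for illustration; these are not Catalyst's classes.
sealed trait JoinType
case object Inner extends JoinType
final case class UsingJoin(tpe: JoinType, usingColumns: Seq[String]) extends JoinType

// childrenResolved is a hypothetical stand-in for the real resolution checks.
final case class Join(joinType: JoinType, childrenResolved: Boolean) {
  // A join that still has a UsingJoin type has not been rewritten by the
  // analyzer yet, so it is treated as unresolved regardless of its children.
  def resolved: Boolean = joinType match {
    case UsingJoin(_, _) => false
    case _ => childrenResolved
  }
}
```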





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56056572
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala ---
@@ -458,43 +458,14 @@ class DataFrame private[sql](
   Join(logicalPlan, right.logicalPlan, joinType = JoinType(joinType), None))
   .analyzed.asInstanceOf[Join]
 
-val condition = usingColumns.map { col =>
-  catalyst.expressions.EqualTo(
-withPlan(joined.left).resolve(col),
-withPlan(joined.right).resolve(col))
-}.reduceLeftOption[catalyst.expressions.BinaryExpression] { (cond, eqTo) =>
-  catalyst.expressions.And(cond, eqTo)
-}
-
-// Project only one of the join columns.
-val joinedCols = JoinType(joinType) match {
-  case Inner | LeftOuter | LeftSemi =>
-usingColumns.map(col => withPlan(joined.left).resolve(col))
-  case RightOuter =>
-usingColumns.map(col => withPlan(joined.right).resolve(col))
-  case FullOuter =>
-usingColumns.map { col =>
-  val leftCol = withPlan(joined.left).resolve(col).toAttribute.withNullability(true)
-  val rightCol = withPlan(joined.right).resolve(col).toAttribute.withNullability(true)
-  Alias(Coalesce(Seq(leftCol, rightCol)), col)()
-}
-  case NaturalJoin(_) => sys.error("NaturalJoin with using clause is not supported.")
-}
-// The nullability of output of joined could be different than original column,
-// so we can only compare them by exprId
-val joinRefs = AttributeSet(condition.toSeq.flatMap(_.references))
-val resultCols = joinedCols ++ joined.output.filterNot(joinRefs.contains(_))
 withPlan {
-  Project(
-resultCols,
-Join(
-  joined.left,
-  joined.right,
-  joinType = JoinType(joinType),
-  condition)
-  )
+  Join(
+joined.left,
+joined.right,
+UsingJoin(JoinType(joinType), usingColumns.map(UnresolvedAttribute(_))),
+None)
 }
-  }
+   }
--- End diff --

Is this right?





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56056454
  
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala ---
@@ -2178,4 +2178,62 @@ class SQLQuerySuite extends QueryTest with SharedSQLContext {
 Row(4) :: Nil)
 }
   }
-}
+
+  test("join with using clause") {
+val df1 = Seq(("r1c1", "r1c2", "t1r1c3"),
+  ("r2c1", "r2c2", "t1r2c3"), ("r3c1x", "r3c2", "t1r3c3")).toDF("c1", "c2", "c3")
+val df2 = Seq(("r1c1", "r1c2", "t2r1c3"),
+  ("r2c1", "r2c2", "t2r2c3"), ("r3c1y", "r3c2", "t2r3c3")).toDF("c1", "c2", "c3")
+val df3 = Seq((null, "r1c2", "t3r1c3"),
+  ("r2c1", "r2c2", "t3r2c3"), ("r3c1y", "r3c2", "t3r3c3")).toDF("c1", "c2", "c3")
+withTempTable("t1", "t2", "t3") {
+  df1.registerTempTable("t1")
+  df2.registerTempTable("t2")
+  df3.registerTempTable("t3")
+  // inner join with one using column
+  checkAnswer(
+sql("SELECT * FROM t1 join t2 using (c1)"),
+Row("r1c1", "r1c2", "t1r1c3", "r1c2", "t2r1c3") ::
+  Row("r2c1", "r2c2", "t1r2c3", "r2c2", "t2r2c3") :: Nil)
+
+  // inner join with two using columns
+  checkAnswer(
+sql("SELECT * FROM t1 join t2 using (c1, c2)"),
+Row("r1c1", "r1c2", "t1r1c3", "t2r1c3") ::
+  Row("r2c1", "r2c2", "t1r2c3", "t2r2c3") :: Nil)
+
+  // Left outer join with one using column.
+  checkAnswer(
+sql("SELECT * FROM t1 left join t2 using (c1)"),
+Row("r1c1", "r1c2", "t1r1c3", "r1c2", "t2r1c3") ::
+  Row("r2c1", "r2c2", "t1r2c3", "r2c2", "t2r2c3") ::
+  Row("r3c1x", "r3c2", "t1r3c3", null, null) :: Nil)
+
+  // Right outer join with one using column.
+  checkAnswer(
+sql("SELECT * FROM t1 right join t2 using (c1)"),
+Row("r1c1", "r1c2", "t1r1c3", "r1c2", "t2r1c3") ::
+  Row("r2c1", "r2c2", "t1r2c3", "r2c2", "t2r2c3") ::
+  Row("r3c1y", null, null, "r3c2", "t2r3c3") :: Nil)
+
+  // Full outer join with one using column.
+  checkAnswer(
+sql("SELECT * FROM t1 full outer join t2 using (c1)"),
+Row("r1c1", "r1c2", "t1r1c3", "r1c2", "t2r1c3") ::
+  Row("r2c1", "r2c2", "t1r2c3", "r2c2", "t2r2c3") ::
+  Row("r3c1x", "r3c2", "t1r3c3", null, null) ::
+  Row("r3c1y", null,
+null, "r3c2", "t2r3c3") :: Nil)
+
+  // Full outer join with null value in join column.
+  checkAnswer(
+sql("SELECT * FROM t1 full outer join t3 using (c1)"),
+Row("r1c1", "r1c2", "t1r1c3", null, null) ::
+  Row("r2c1", "r2c2", "t1r2c3", "r2c2", "t3r2c3") ::
+  Row("r3c1x", "r3c2", "t1r3c3", null, null) ::
+  Row("r3c1y", null, null, "r3c2", "t3r3c3") ::
+  Row(null, null, null, "r1c2", "t3r1c3") :: Nil)
+}
+  }
+  }
--- End diff --

Nit: indent.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r56056326
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala ---
@@ -1169,6 +1170,7 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
   Join(newLeft, newRight, LeftOuter, newJoinCond)
 case FullOuter => f
 case NaturalJoin(_) => sys.error("Untransformed NaturalJoin node")
+case UsingJoin(_, _) => sys.error("Untransformed Using join node")
--- End diff --

Nit: I'm not really sure what the point of these extra checks is.  Is it 
only to remove a warning?  All kinds of things break in the optimizer if the 
plan is unresolved.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-10 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-195015526
  
@gatorsmile Thanks !! I have rebased the code now.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-10 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-195014561
  
@marmbrus Hi Michael, I have changed the code per your feedback. As part of 
this change, I have also cleaned up DataFrame's using join to make use of the 
new code. Please let me know your comments. Thanks!






[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-09 Thread dilipbiswal
Github user dilipbiswal commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r55589677
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala ---
@@ -72,6 +72,19 @@ case class UnresolvedAttribute(nameParts: Seq[String]) extends Attribute with Un
   override def sql: String = quoteIdentifier(name)
 }
 
+/**
+ * Holds the name of columns referenced in an USING clause of JOIN source.
+ */
+case class UnresolvedUsingAttributes(usingCols: Seq[String])
--- End diff --

@marmbrus Thank you, Michael. Good idea. I will change it.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-09 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/11297#discussion_r55569500
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala ---
@@ -72,6 +72,19 @@ case class UnresolvedAttribute(nameParts: Seq[String]) extends Attribute with Un
   override def sql: String = quoteIdentifier(name)
 }
 
+/**
+ * Holds the name of columns referenced in an USING clause of JOIN source.
+ */
+case class UnresolvedUsingAttributes(usingCols: Seq[String])
--- End diff --

Why not store the information in the join type like we do for `NaturalJoin`?





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-03-09 Thread gatorsmile
Github user gatorsmile commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-194451801
  
@dilipbiswal Could you please resolve the conflicts? Thanks!





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-02-29 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-190338300
  
@rxin Hi Reynold, would appreciate your feedback on this PR. Thanks a lot.





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-02-21 Thread dilipbiswal
Github user dilipbiswal commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-187014299
  
@rxin @adrian-wang Can you please review the implementation and let me know 
your comments? Thanks!





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-02-21 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/11297#issuecomment-186992843
  
Can one of the admins verify this patch?





[GitHub] spark pull request: [SPARK-13427][SQL] Support USING clause in JOI...

2016-02-21 Thread dilipbiswal
GitHub user dilipbiswal opened a pull request:

https://github.com/apache/spark/pull/11297

[SPARK-13427][SQL] Support USING clause in JOIN.

## What changes were proposed in this pull request?

Support queries that JOIN tables with a USING clause.
SELECT * from table1 JOIN table2 USING 
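
For readers unfamiliar with USING, the following is a minimal sketch of inner-join semantics on plain Scala collections, with no Spark involved. `UsingJoinSketch`, `Record`, and `innerJoinUsing` are hypothetical names introduced for illustration, and null handling and outer joins are deliberately left out. The point it shows is that each USING column appears once in the output, followed by the remaining left columns and then the remaining right columns.

```scala
object UsingJoinSketch {
  // A row is modelled as column name -> value; this is not Spark's Row.
  type Record = Map[String, String]

  // Inner join on equality of the using columns. Output column order:
  // using columns, then other left columns, then other right columns.
  def innerJoinUsing(
      left: Seq[Record], leftCols: Seq[String],
      right: Seq[Record], rightCols: Seq[String],
      usingCols: Seq[String]): Seq[Seq[String]] =
    for {
      l <- left
      r <- right
      if usingCols.forall(c => l(c) == r(c))
    } yield {
      val joinValues = usingCols.map(c => l(c))
      val leftRest = leftCols.filterNot(c => usingCols.contains(c)).map(c => l(c))
      val rightRest = rightCols.filterNot(c => usingCols.contains(c)).map(c => r(c))
      joinValues ++ leftRest ++ rightRest
    }
}
```

With two tables that both have columns `c1`, `c2`, `c3`, joining on `c1` yields five output columns per row, while joining on `c1` and `c2` yields four.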


## How was this patch tested?

Added unit tests in SQLQuerySuite, CatalystQlSuite, and 
ResolveNaturalJoinSuite.





You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dilipbiswal/spark spark-13427

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/11297.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #11297


commit bdfe5ed32c3a1fa148b467723018de9746a26d73
Author: Dilip Biswal 
Date:   2016-02-19T23:23:07Z

[SPARK-13427] Support USING clause in JOIN.



