[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-08-01 Thread dongjoon-hyun
Github user dongjoon-hyun closed the pull request at:

https://github.com/apache/spark/pull/14353


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-27 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14353#discussion_r72571804
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
 
   override def foldable: Boolean = children.forall(_.foldable)
 
-  override def checkInputDataTypes(): TypeCheckResult =
-TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+  TypeCheckResult.TypeCheckSuccess
+} else {
+  TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+}
+  }
 
   override def dataType: DataType = {
-ArrayType(
-  children.headOption.map(_.dataType).getOrElse(NullType),
-  containsNull = children.exists(_.nullable))
+var elementType: DataType = 
children.headOption.map(_.dataType).getOrElse(NullType)
+if (elementType.isInstanceOf[DecimalType]) {
+  children.foreach { child =>
+if 
(elementType.asInstanceOf[DecimalType].isTighterThan(child.dataType)) {
--- End diff --

Thank you, @rxin .
Yep. I've read you comment about the lose.
I'll check that and revise.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-27 Thread rxin
Github user rxin commented on a diff in the pull request:

https://github.com/apache/spark/pull/14353#discussion_r72570595
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
 
   override def foldable: Boolean = children.forall(_.foldable)
 
-  override def checkInputDataTypes(): TypeCheckResult =
-TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+  TypeCheckResult.TypeCheckSuccess
+} else {
+  TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+}
+  }
 
   override def dataType: DataType = {
-ArrayType(
-  children.headOption.map(_.dataType).getOrElse(NullType),
-  containsNull = children.exists(_.nullable))
+var elementType: DataType = 
children.headOption.map(_.dataType).getOrElse(NullType)
+if (elementType.isInstanceOf[DecimalType]) {
+  children.foreach { child =>
+if 
(elementType.asInstanceOf[DecimalType].isTighterThan(child.dataType)) {
--- End diff --

i think this suffers from the same issue as the map pr.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-26 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14353#discussion_r72375060
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
 
   override def foldable: Boolean = children.forall(_.foldable)
 
-  override def checkInputDataTypes(): TypeCheckResult =
-TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+  TypeCheckResult.TypeCheckSuccess
--- End diff --

Hi, @yhuai .
Could you give me some advice?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-25 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14353#discussion_r72186040
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
 
   override def foldable: Boolean = children.forall(_.foldable)
 
-  override def checkInputDataTypes(): TypeCheckResult =
-TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+  TypeCheckResult.TypeCheckSuccess
--- End diff --

Is there anything to check more?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-25 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14353#discussion_r72186016
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
 
   override def foldable: Boolean = children.forall(_.foldable)
 
-  override def checkInputDataTypes(): TypeCheckResult =
-TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+  TypeCheckResult.TypeCheckSuccess
--- End diff --

In short, those are recognized correctly in the Analyzed Logical Plan. As a 
result, the codegen correctly writes it with the unified precision and scale.
```
== Analyzed Logical Plan ==
a[0]: decimal(3,3), a[1]: decimal(3,3)
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-25 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14353#discussion_r72185905
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
 
   override def foldable: Boolean = children.forall(_.foldable)
 
-  override def checkInputDataTypes(): TypeCheckResult =
-TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+  TypeCheckResult.TypeCheckSuccess
--- End diff --

And Finally, the following is the codegen result. Please see the line 29.
```scala
scala> sql("explain codegen select array(0.001, 
0.02)[1]").collect().foreach(println)
[Found 1 WholeStageCodegen subtrees.
== Subtree 1 / 1 ==
*Project [0.02 AS array(0.001, 0.02)[1]#75]
+- Scan OneRowRelation[]

Generated code:
/* 001 */ public Object generate(Object[] references) {
/* 002 */   return new GeneratedIterator(references);
/* 003 */ }
/* 004 */
/* 005 */ final class GeneratedIterator extends 
org.apache.spark.sql.execution.BufferedRowIterator {
/* 006 */   private Object[] references;
/* 007 */   private scala.collection.Iterator inputadapter_input;
/* 008 */   private UnsafeRow project_result;
/* 009 */   private 
org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder project_holder;
/* 010 */   private 
org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter 
project_rowWriter;
/* 011 */
/* 012 */   public GeneratedIterator(Object[] references) {
/* 013 */ this.references = references;
/* 014 */   }
/* 015 */
/* 016 */   public void init(int index, scala.collection.Iterator inputs[]) 
{
/* 017 */ partitionIndex = index;
/* 018 */ inputadapter_input = inputs[0];
/* 019 */ project_result = new UnsafeRow(1);
/* 020 */ this.project_holder = new 
org.apache.spark.sql.catalyst.expressions.codegen.BufferHolder(project_result, 
0);
/* 021 */ this.project_rowWriter = new 
org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter(project_holder,
 1);
/* 022 */   }
/* 023 */
/* 024 */   protected void processNext() throws java.io.IOException {
/* 025 */ while (inputadapter_input.hasNext()) {
/* 026 */   InternalRow inputadapter_row = (InternalRow) 
inputadapter_input.next();
/* 027 */   Object project_obj = ((Expression) 
references[0]).eval(null);
/* 028 */   Decimal project_value = (Decimal) project_obj;
/* 029 */   project_rowWriter.write(0, project_value, 3, 3);
/* 030 */   append(project_result);
/* 031 */   if (shouldStop()) return;
/* 032 */ }
/* 033 */   }
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-25 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14353#discussion_r72185594
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
 
   override def foldable: Boolean = children.forall(_.foldable)
 
-  override def checkInputDataTypes(): TypeCheckResult =
-TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+  TypeCheckResult.TypeCheckSuccess
--- End diff --

```scala
scala> sql("create table d1(a DECIMAL(3,2))")
scala> sql("create table d2(a DECIMAL(2,1))")
scala> sql("insert into d1 values(1.0)")
scala> sql("insert into d2 values(1.0)")
scala> sql("select * from d1, d2").show()
++---+
|   a|  a|
++---+
|1.00|1.0|
++---+

scala> sql("select array(d1.a,d2.a),array(d2.a,d1.a),* from d1, d2")
res5: org.apache.spark.sql.DataFrame = [array(a, a): array, 
array(a, a): array ... 2 more fields]

scala> sql("select array(d1.a,d2.a),array(d2.a,d1.a),* from d1, d2").show()
++++---+
| array(a, a)| array(a, a)|   a|  a|
++++---+
|[1.00, 1.00]|[1.00, 1.00]|1.00|1.0|
++++---+
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-25 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14353#discussion_r72185372
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
 
   override def foldable: Boolean = children.forall(_.foldable)
 
-  override def checkInputDataTypes(): TypeCheckResult =
-TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+  TypeCheckResult.TypeCheckSuccess
--- End diff --

Hi, @yhuai . I check the following.

```scala
scala> sql("select a[0], a[1] from (select array(0.001, 0.02) a) T")
res4: org.apache.spark.sql.DataFrame = [a[0]: decimal(3,3), a[1]: 
decimal(3,3)]

scala> sql("select a[0], a[1] from (select array(0.001, 0.02) a) T").show()
+-+-+
| a[0]| a[1]|
+-+-+
|0.001|0.020|
+-+-+

scala> sql("select a[0], a[1] from (select array(0.001, 0.02) a) 
T").explain(true)
== Parsed Logical Plan ==
'Project [unresolvedalias('a[0], None), unresolvedalias('a[1], None)]
+- 'SubqueryAlias T
   +- 'Project ['array(0.001, 0.02) AS a#54]
  +- OneRowRelation$

== Analyzed Logical Plan ==
a[0]: decimal(3,3), a[1]: decimal(3,3)
Project [a#54[0] AS a[0]#61, a#54[1] AS a[1]#62]
+- SubqueryAlias T
   +- Project [array(0.001, 0.02) AS a#54]
  +- OneRowRelation$
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-25 Thread dongjoon-hyun
Github user dongjoon-hyun commented on a diff in the pull request:

https://github.com/apache/spark/pull/14353#discussion_r72182807
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
 
   override def foldable: Boolean = children.forall(_.foldable)
 
-  override def checkInputDataTypes(): TypeCheckResult =
-TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+  TypeCheckResult.TypeCheckSuccess
--- End diff --

Thank you for review, @yhuai .
I see. I'll check that more. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14353#discussion_r72182390
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
 
   override def foldable: Boolean = children.forall(_.foldable)
 
-  override def checkInputDataTypes(): TypeCheckResult =
-TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+  TypeCheckResult.TypeCheckSuccess
--- End diff --

For example, if we access a single element, its data type actually may not 
be the one shown as the array's datatype.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-25 Thread yhuai
Github user yhuai commented on a diff in the pull request:

https://github.com/apache/spark/pull/14353#discussion_r72182316
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/complexTypeCreator.scala
 ---
@@ -33,13 +33,24 @@ case class CreateArray(children: Seq[Expression]) 
extends Expression {
 
   override def foldable: Boolean = children.forall(_.foldable)
 
-  override def checkInputDataTypes(): TypeCheckResult =
-TypeUtils.checkForSameTypeInputExpr(children.map(_.dataType), 
"function array")
+  override def checkInputDataTypes(): TypeCheckResult = {
+if (children.map(_.dataType).forall(_.isInstanceOf[DecimalType])) {
+  TypeCheckResult.TypeCheckSuccess
--- End diff --

I think we cannot just make the check pass. We need to need to actually 
cast those element to the same prevision and scale.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #14353: [SPARK-16714][SQL] `array` should create a decima...

2016-07-25 Thread dongjoon-hyun
GitHub user dongjoon-hyun opened a pull request:

https://github.com/apache/spark/pull/14353

[SPARK-16714][SQL] `array` should create a decimal array from decimals with 
different precisions and scales

## What changes were proposed in this pull request?

In Spark 2.0, we will parse float literals as decimals. However, it 
introduces a side-effect, which is described below.
```scala
scala> select array(0.001, 0.02)
org.apache.spark.sql.AnalysisException: cannot resolve 'array(CAST(0.001 AS 
DECIMAL(3,3)), CAST(0.02 AS DECIMAL(2,2)))' due to data type mismatch: input to 
function array should all be the same type, but it's [decimal(3,3), 
decimal(2,2)]; line 1 pos 7
```

## How was this patch tested?

Pass the Jenkins tests.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dongjoon-hyun/spark SPARK-16714

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/14353.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #14353


commit d3dd7fb03c7bd96d7ce1ab7ed505d137775062f2
Author: Dongjoon Hyun 
Date:   2016-07-25T22:36:43Z

[SPARK-16714][SQL] Fail to create a decimal arrays with literals having 
different inferred precessions and scales




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org