[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-06-08 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21215
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21215
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90137/
Test PASSed.


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-03 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21215
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21215
  
**[Test build #90137 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90137/testReport)**
 for PR 21215 at commit 
[`e151ab7`](https://github.com/apache/spark/commit/e151ab7475fed32b2baaca5c0cbcf427a5c09ad3).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-03 Thread mn-mikke
Github user mn-mikke commented on the issue:

https://github.com/apache/spark/pull/21215
  
@maropu Really nice idea to create typed empty arrays via a `Literal` 
expression! On the other hand, I feel that the end user shouldn't work with 
classes from Catalyst internals if we consider that the creation of typed empty 
arrays is an elementary operation. 

I've tailored the solution according to your suggestion, but still think 
that some function should be introduced. What do you think?


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-03 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21215
  
**[Test build #90137 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90137/testReport)**
 for PR 21215 at commit 
[`e151ab7`](https://github.com/apache/spark/commit/e151ab7475fed32b2baaca5c0cbcf427a5c09ad3).


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-03 Thread lokm01
Github user lokm01 commented on the issue:

https://github.com/apache/spark/pull/21215
  
@maropu Thanks! Didn't know about creating a literal this way.

Don't you feel that the suggested change is way more elegant?


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-03 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21215
  
Like this?
```
scala> val structTy = StructType.fromDDL("a ARRAY<STRUCT<b: INT, c: STRING>>")
structTy: org.apache.spark.sql.types.StructType = StructType(StructField(a,ArrayType(StructType(StructField(b,IntegerType,true), StructField(c,StringType,true)),true),true))

scala> val newCol = new Column(Literal.create(Seq.empty[Inner], structTy.head.dataType))
newCol: org.apache.spark.sql.Column = []

scala> val df = Seq(1, 2, 3).toDF("a").withColumn("b", newCol)
df: org.apache.spark.sql.DataFrame = [a: int, b: array<struct<b:int,c:string>>]

scala> df.show
+---+---+
|  a|  b|
+---+---+
|  1| []|
|  2| []|
|  3| []|
+---+---+


scala> df.printSchema
root
 |-- a: integer (nullable = false)
 |-- b: array (nullable = false)
 |    |-- element: struct (containsNull = true)
 |    |    |-- b: integer (nullable = true)
 |    |    |-- c: string (nullable = true)
```
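
The transcript above still names the case class `Inner` when building the empty sequence. A schema-only variant might look like the sketch below. It assumes a Spark 2.3 to 3.x classpath where `new Column(expression)` is available; the helper name `emptyArray` and the object name are hypothetical, and `Literal` remains a Catalyst-internal class, which is the concern raised elsewhere in this thread.

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.catalyst.expressions.Literal
import org.apache.spark.sql.types._

object EmptyTypedArray {
  // Hypothetical helper: builds an empty-array column from a Spark ArrayType alone,
  // without needing a Scala case class for the element type.
  def emptyArray(arrayType: ArrayType): Column =
    new Column(Literal.create(Seq.empty[Any], arrayType))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("empty-typed-array")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Element type described purely as a Spark schema.
    val elementType = StructType(Seq(
      StructField("b", IntegerType, nullable = true),
      StructField("c", StringType, nullable = true)))
    val arrayType = ArrayType(elementType, containsNull = true)

    val df = Seq(1, 2, 3).toDF("a").withColumn("b", emptyArray(arrayType))
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```

Because the element type is passed in as a `DataType`, this form should also fit a generic framework that only has schemas at hand.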


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-03 Thread lokm01
Github user lokm01 commented on the issue:

https://github.com/apache/spark/pull/21215
  
@maropu That would work if you had Scala case classes for all the types. In 
our case, we're working on a generic framework where we only have Spark 
schemas (and I'd rather not generate case classes at runtime).

Can you suggest an existing way to do this using Spark's `DataType`, please?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-03 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21215
  
How about this?
```

scala> val df = Seq(Outer(Seq.empty[Inner]), Outer(Seq.empty[Inner])).toDF("a")
df: org.apache.spark.sql.DataFrame = [a: array<struct<b:int,c:string>>]

scala> df.printSchema
root
 |-- a: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- b: integer (nullable = false)
 |    |    |-- c: string (nullable = true)

scala> df.show
+---+
|  a|
+---+
| []|
| []|
+---+


scala> val df = Seq(1, 2, 3).toDF("a").withColumn("b", typedLit(Seq.empty[Inner]))
df: org.apache.spark.sql.DataFrame = [a: int, b: array<struct<b:int,c:string>>]

scala> df.printSchema
root
 |-- a: integer (nullable = false)
 |-- b: array (nullable = false)
 |    |-- element: struct (containsNull = true)
 |    |    |-- b: integer (nullable = false)
 |    |    |-- c: string (nullable = true)

scala> df.show
+---+---+
|  a|  b|
+---+---+
|  1| []|
|  2| []|
|  3| []|
+---+---+
```
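
One detail worth noting in the `typedLit` output above: field `b` is reported as `nullable = false`, because the type is derived from the Scala `Int` in the case class. A schema written by hand may declare the same field as nullable. The sketch below compares the two; it assumes Spark 2.3 to 3.x (where `Column.expr` is available), and the object name `TypedLitNullability` is hypothetical.

```scala
import org.apache.spark.sql.functions.typedLit
import org.apache.spark.sql.types._

// Same shape as the Inner case class used in this thread.
case class Inner(b: Int, c: String)

object TypedLitNullability {
  // Array type derived by reflection from the case class (Int becomes non-nullable).
  val fromCaseClass: DataType = typedLit(Seq.empty[Inner]).expr.dataType

  // Array type written out by hand, with both fields declared nullable.
  val fromSchema: DataType = ArrayType(
    StructType(Seq(
      StructField("b", IntegerType, nullable = true),
      StructField("c", StringType, nullable = true))),
    containsNull = true)

  def main(args: Array[String]): Unit = {
    // Both render the same short form, but they are not equal as DataTypes.
    println(fromCaseClass.simpleString)
    println(fromSchema.simpleString)
    println(fromCaseClass == fromSchema)
  }
}
```

If downstream code compares schemas strictly, this nullability difference may matter when choosing between `typedLit` and a schema-driven literal.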


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-03 Thread lokm01
Github user lokm01 commented on the issue:

https://github.com/apache/spark/pull/21215
  
Hey @maropu,

So we've encountered a number of issues with casting:
1. Casting an empty array to an array of primitive types caused an 
exception on 2.2.1, but works on 2.3.0+ so that's sorted

2. We're still facing an issue on 2.3.0 when we try to cast an empty array 
to an array of complex types. See the following example:

```
import org.apache.spark.sql.SparkSession

case class Outer(a: List[Inner])
case class Inner(b: Int, c: String)

object App4 extends App {
  val spark = SparkSession.builder().appName("").master("local[*]").getOrCreate()

  import spark.implicits._
  import org.apache.spark.sql.functions._

  // Empty DataFrame, used only to obtain the Spark schema of Outer.
  val df = spark.createDataFrame(Seq[Outer]())

  // Cast an empty array to the array-of-struct type of column "a".
  val r = spark.range(100).select(array().cast(df.schema("a").dataType))

  r.printSchema()
  r.show
}
```

This code produces 

> Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot 
resolve 'array()' due to data type mismatch: cannot cast array<string> to 
array<struct<b:int,c:string>>;;


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21215
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90086/
Test FAILed.


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21215
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21215
  
**[Test build #90086 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90086/testReport)**
 for PR 21215 at commit 
[`9c12457`](https://github.com/apache/spark/commit/9c124574a3fefe2e63dcd95bd03e47f1f8d5071a).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21215
  
Do you wanna do this?
```
scala> sql("select array()").printSchema
root
 |-- array(): array (nullable = false)
 |    |-- element: string (containsNull = false)


scala> sql("select CAST(array() AS ARRAY<INT>) c").printSchema
root
 |-- c: array (nullable = false)
 |    |-- element: integer (containsNull = true)


scala> sql("select CAST(array() AS ARRAY<INT>) c").show
+---+
|  c|
+---+
| []|
+---+
```
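
For reference, the same cast to an array of integers can be written with the DataFrame API instead of SQL. This is a minimal sketch, assuming a Spark version where the cast is accepted (this thread reports that it fails on 2.2.1 and works on 2.3.0+); the object and method names are hypothetical.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.array
import org.apache.spark.sql.types.{ArrayType, IntegerType}

object CastEmptyArray {
  // Hypothetical helper: one row holding an empty array cast to array<int>.
  def emptyIntArrays(spark: SparkSession): DataFrame =
    spark.range(1).select(array().cast(ArrayType(IntegerType)).as("c"))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cast-empty-array")
      .master("local[*]")
      .getOrCreate()

    val df = emptyIntArrays(spark)
    df.printSchema()
    df.show()

    spark.stop()
  }
}
```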


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21215
  
**[Test build #90086 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90086/testReport)**
 for PR 21215 at commit 
[`9c12457`](https://github.com/apache/spark/commit/9c124574a3fefe2e63dcd95bd03e47f1f8d5071a).


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/21215
  
retest this please


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21215
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90079/
Test FAILed.


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21215
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21215
  
**[Test build #90079 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90079/testReport)**
 for PR 21215 at commit 
[`9c12457`](https://github.com/apache/spark/commit/9c124574a3fefe2e63dcd95bd03e47f1f8d5071a).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21215
  
**[Test build #90079 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90079/testReport)**
 for PR 21215 at commit 
[`9c12457`](https://github.com/apache/spark/commit/9c124574a3fefe2e63dcd95bd03e47f1f8d5071a).


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21215
  
**[Test build #90073 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90073/testReport)**
 for PR 21215 at commit 
[`44b1852`](https://github.com/apache/spark/commit/44b18520dcf8e3e3639756cd8a12f75ea1080bee).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds the following public classes _(experimental)_:
  * `case class CreateArray(children: Seq[Expression], defaultElementType: 
DataType = StringType)`


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21215
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/90073/
Test FAILed.


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21215
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/21215
  
**[Test build #90073 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/90073/testReport)**
 for PR 21215 at commit 
[`44b1852`](https://github.com/apache/spark/commit/44b18520dcf8e3e3639756cd8a12f75ea1080bee).


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/21215
  
ok to test


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21215
  
Can one of the admins verify this patch?


---




[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread mn-mikke
Github user mn-mikke commented on the issue:

https://github.com/apache/spark/pull/21215
  
@lokm01 @gatorsmile @maropu @ueshin 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21215: [SPARK-24148][SQL] Overloading array function to support...

2018-05-02 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/21215
  
Can one of the admins verify this patch?


---
