[GitHub] spark issue #20056: [SPARK-22878] [CORE] Count totalDroppedEvents for LiveLi...

2018-01-06 Thread Ngone51
Github user Ngone51 commented on the issue:

https://github.com/apache/spark/pull/20056
  
@vanzin @squito Could you take a look at this PR? Thanks!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20056: [SPARK-22878] [CORE] Count totalDroppedEvents for LiveLi...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20056
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85766/
Test PASSed.


---




[GitHub] spark issue #20056: [SPARK-22878] [CORE] Count totalDroppedEvents for LiveLi...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20056
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20056: [SPARK-22878] [CORE] Count totalDroppedEvents for LiveLi...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20056
  
**[Test build #85766 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85766/testReport)**
 for PR 20056 at commit 
[`20ad65b`](https://github.com/apache/spark/commit/20ad65ba5af5ea6eb6b6e5e9fc625a30059c97fe).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark pull request #20174: [SPARK-22951][SQL] aggregate should not produce e...

2018-01-06 Thread liufengdb
Github user liufengdb commented on a diff in the pull request:

https://github.com/apache/spark/pull/20174#discussion_r160042482
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ---
@@ -245,11 +252,15 @@ case class HashAggregateExec(
|   $doAggFuncName();
|   $aggTime.add((System.nanoTime() - $beforeAgg) / 100);
|
-   |   // output the result
-   |   ${genResult.trim}
+   |   if (!$hasInput && ${resultVars.isEmpty}) {
--- End diff --

I think it hurts code readability if the code for the two cases is defined 
separately. For the regular case, the generated code will look like 
`if (false && !hasInput) ... else ...`. The JIT should optimize this pattern 
easily, so we don't need to worry too much about performance.
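
As an illustration of the point above, here is a minimal sketch of how a Scala boolean that is known at planning time ends up as a literal in the generated Java source. The names `ConstantGuardSketch` and `genGuard` and the placeholder comments are hypothetical; this is not Spark's code generator. The operand order follows the diff (`!$hasInput && ...`).

```scala
object ConstantGuardSketch {
  // Builds a fragment of Java source as a string. `resultVarsEmpty` is
  // evaluated in Scala, so it appears in the output as `true` or `false`.
  def genGuard(hasInputVar: String, resultVarsEmpty: Boolean): String =
    s"""
       |if (!$hasInputVar && $resultVarsEmpty) {
       |  // empty input and nothing to output: emit no row
       |} else {
       |  // output the result
       |}
     """.stripMargin.trim
}
```

For the regular case, `ConstantGuardSketch.genGuard("hasInput", false)` starts with `if (!hasInput && false) {`, a condition containing a constant `false` that a JIT compiler can normally eliminate.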


---




[GitHub] spark issue #20174: [SPARK-22951][SQL] aggregate should not produce empty ro...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20174
  
**[Test build #85769 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85769/testReport)**
 for PR 20174 at commit 
[`7bbcdc0`](https://github.com/apache/spark/commit/7bbcdc0179b0c053416c91a0f00795c21c4b0aff).


---




[GitHub] spark issue #20174: [SPARK-22951][SQL] aggregate should not produce empty ro...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20174
  
**[Test build #85768 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85768/testReport)**
 for PR 20174 at commit 
[`2c3516d`](https://github.com/apache/spark/commit/2c3516d57e3ed231d8644d646ca262c9041d4276).


---




[GitHub] spark issue #20176: [SPARK-22981][SQL] Fix incorrect results of Casting Stru...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20176
  
**[Test build #85767 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85767/testReport)**
 for PR 20176 at commit 
[`10285d0`](https://github.com/apache/spark/commit/10285d0ee8ba037809a9d15409a5c5055cd5be84).


---




[GitHub] spark pull request #20176: [SPARK-22981][SQL] Fix incorrect results of Casti...

2018-01-06 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20176#discussion_r160041886
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala
 ---
@@ -259,6 +259,29 @@ case class Cast(child: Expression, dataType: DataType, 
timeZoneId: Option[String
 builder.append("]")
 builder.build()
   })
+case StructType(fields) =>
+  buildCast[InternalRow](_, row => {
+val builder = new UTF8StringBuilder
+builder.append("[")
+if (row.numFields > 0) {
--- End diff --

We probably never hit `row.numFields == 0` here, but I left the check in 
to be strict.


---




[GitHub] spark pull request #20176: [SPARK-22981][SQL] Fix incorrect results of Casti...

2018-01-06 Thread maropu
GitHub user maropu opened a pull request:

https://github.com/apache/spark/pull/20176

[SPARK-22981][SQL] Fix incorrect results of Casting Struct to String

## What changes were proposed in this pull request?
This PR fixes incorrect results when casting structs to strings:
```
scala> val df = Seq(((1, "a"), 0), ((2, "b"), 0)).toDF("a", "b")
scala> df.write.saveAsTable("t")
scala> sql("SELECT CAST(a AS STRING) FROM t").show
+---------------+
|              a|
+---------------+
|[0,1,180001,61]|
|[0,2,180001,62]|
+---------------+
```
This PR changes the result to:
```
+------+
|     a|
+------+
|[1, a]|
|[2, b]|
+------+
```
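
The expected `[1, a]` format can be sketched without Spark as follows. This is only an approximation under stated assumptions: a plain `Seq[Any]` stands in for `InternalRow`, `StringBuilder` stands in for `UTF8StringBuilder`, null fields are not modelled, and the names `StructToStringSketch` and `structToString` are hypothetical.

```scala
object StructToStringSketch {
  // `fields` stands in for the values of one struct row.
  def structToString(fields: Seq[Any]): String = {
    val builder = new StringBuilder
    builder.append("[")
    // Guard against an empty struct before reading the first field.
    if (fields.nonEmpty) {
      builder.append(String.valueOf(fields.head))
      // Every later field is preceded by a ", " separator.
      fields.tail.foreach { field =>
        builder.append(", ")
        builder.append(String.valueOf(field))
      }
    }
    builder.append("]")
    builder.toString
  }
}
```

With this sketch, `StructToStringSketch.structToString(Seq(1, "a"))` yields `[1, a]`, and an empty struct yields `[]`.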

## How was this patch tested?
Added tests in `CastSuite`.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/maropu/spark SPARK-22981

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20176.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20176


commit 10285d0ee8ba037809a9d15409a5c5055cd5be84
Author: Takeshi Yamamuro 
Date:   2018-01-06T09:58:20Z

Cast structs to strings




---




[GitHub] spark issue #20166: [SPARK-22973][SQL] Fix incorrect results of Casting Map ...

2018-01-06 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/20166
  
OK, I'll fix structs in a follow-up PR first.


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-01-06 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20146
  
also cc @jkbradley @MLnick for review.


---




[GitHub] spark pull request #20166: [SPARK-22973][SQL] Fix incorrect results of Casti...

2018-01-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20166


---




[GitHub] spark issue #20166: [SPARK-22973][SQL] Fix incorrect results of Casting Map ...

2018-01-06 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20166
  
merging to master/2.3!


---




[GitHub] spark issue #20166: [SPARK-22973][SQL] Fix incorrect results of Casting Map ...

2018-01-06 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/20166
  
I think this is a bug. In `Dataset.showString` I see code like
```
case seq: Seq[_] => seq.mkString("[", ", ", "]")
```
This means we do want to show strings like `[[1, 2], [3], [4, 5, 6]]`.

Anyway, let's fix it in another PR. I'm merging this PR first.
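
To make the intended nested output concrete, here is a small sketch. It assumes the bracketed rendering is applied recursively to inner sequences; `ShowStringSketch` and `render` are hypothetical names, and this is not the `Dataset.showString` implementation.

```scala
object ShowStringSketch {
  // Renders a value the way the quoted `mkString` call does, but applies
  // the same rule to nested sequences as well (an assumption of this sketch).
  def render(value: Any): String = value match {
    case seq: Seq[_] => seq.map(render).mkString("[", ", ", "]")
    case other       => String.valueOf(other)
  }
}
```

Under that assumption, `ShowStringSketch.render(Seq(Seq(1, 2), Seq(3), Seq(4, 5, 6)))` produces `[[1, 2], [3], [4, 5, 6]]`.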


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20146
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20146
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85765/
Test PASSed.


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20146
  
**[Test build #85765 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85765/testReport)**
 for PR 20146 at commit 
[`26cc94b`](https://github.com/apache/spark/commit/26cc94bb335cf0ba3bcdbc2b78effd447026792c).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20013: [SPARK-20657][core] Speed up rendering of the stages pag...

2018-01-06 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/20013
  
The code looks good, but it changes a lot in the SHS, so I suggest running 
more tests (real workloads) before merging.


---




[GitHub] spark issue #20056: [SPARK-22878] [CORE] Count totalDroppedEvents for LiveLi...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20056
  
**[Test build #85766 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85766/testReport)**
 for PR 20056 at commit 
[`20ad65b`](https://github.com/apache/spark/commit/20ad65ba5af5ea6eb6b6e5e9fc625a30059c97fe).


---




[GitHub] spark issue #20174: [SPARK-22951][SQL] aggregate should not produce empty ro...

2018-01-06 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/20174
  
(This is unrelated to this PR and fairly trivial, but I'll leave a note.) 
`PropagateEmptyRelation` does not collapse 
`spark.emptyDataFrame.dropDuplicates` because `spark.emptyDataFrame` uses 
`ExistingRDD` instead of an empty `LocalRelation`:

```
scala> spark.emptyDataFrame.dropDuplicates.explain(true)
== Parsed Logical Plan ==
Deduplicate
+- AnalysisBarrier LogicalRDD false

== Analyzed Logical Plan ==
Deduplicate
+- LogicalRDD false

== Optimized Logical Plan ==
Aggregate
+- LogicalRDD false

== Physical Plan ==
*HashAggregate(keys=[], functions=[], output=[])
+- Exchange SinglePartition
   +- *HashAggregate(keys=[], functions=[], output=[])
      +- Scan ExistingRDD[]

scala> Seq.empty[Tuple2[Int, Int]].toDF("a", "b").dropDuplicates.explain(true)
== Parsed Logical Plan ==
Deduplicate [a#8, b#9]
+- AnalysisBarrier Project [_1#5 AS a#8, _2#6 AS b#9]

== Analyzed Logical Plan ==
a: int, b: int
Deduplicate [a#8, b#9]
+- Project [_1#5 AS a#8, _2#6 AS b#9]
   +- LocalRelation <empty>, [_1#5, _2#6]

== Optimized Logical Plan ==
LocalRelation <empty>, [a#8, b#9]

== Physical Plan ==
LocalTableScan <empty>, [a#8, b#9]
```


---




[GitHub] spark pull request #20174: [SPARK-22951][SQL] aggregate should not produce e...

2018-01-06 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20174#discussion_r160040260
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -666,4 +666,16 @@ class DataFrameAggregateSuite extends QueryTest with 
SharedSQLContext {
   assert(exchangePlans.length == 1)
 }
   }
+
+  test("SPARK-22951: aggregation on empty data frame should only return 
initial values") {
+// non code gen
+withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "false") {
+  assert(spark.emptyDataFrame.dropDuplicates.count == 0)
+}
+
+// code gen
+withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true") {
+  assert(spark.emptyDataFrame.dropDuplicates.count == 0)
+}
+  }
--- End diff --

cc: @gatorsmile 


---




[GitHub] spark pull request #20174: [SPARK-22951][SQL] aggregate should not produce e...

2018-01-06 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20174#discussion_r160040102
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ---
@@ -106,6 +106,8 @@ case class HashAggregateExec(
 // This is a grouped aggregate and the input iterator is empty,
 // so return an empty iterator.
 Iterator.empty
+  } else if (!hasInput && resultExpressions.isEmpty) {
--- End diff --

How about changing line 105 to 
`val res = if (!hasInput && (groupingExpressions.nonEmpty || resultExpressions.isEmpty)) {`? 
We would also need to update the comment on line 106.
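
The combined condition can be read as a small truth table. The sketch below restates it as a pure function, using plain booleans in place of the `groupingExpressions` and `resultExpressions` collections; `EmptyInputSketch` and `returnsEmptyIterator` are hypothetical names.

```scala
object EmptyInputSketch {
  // True when the aggregate should return an empty iterator: there is no
  // input, and either the aggregate is grouped or there is nothing to output.
  def returnsEmptyIterator(
      hasInput: Boolean,
      hasGroupingKeys: Boolean,
      hasResultExpressions: Boolean): Boolean =
    !hasInput && (hasGroupingKeys || !hasResultExpressions)
}
```

A global aggregate with result expressions on empty input (`returnsEmptyIterator(false, false, true)`) is the only empty-input case that still produces a row, which matches the test title's "should only return initial values".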


---




[GitHub] spark pull request #20174: [SPARK-22951][SQL] aggregate should not produce e...

2018-01-06 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20174#discussion_r160040256
  
--- Diff: 
sql/core/src/test/scala/org/apache/spark/sql/DataFrameAggregateSuite.scala ---
@@ -666,4 +666,16 @@ class DataFrameAggregateSuite extends QueryTest with 
SharedSQLContext {
   assert(exchangePlans.length == 1)
 }
   }
+
+  test("SPARK-22951: aggregation on empty data frame should only return 
initial values") {
+// non code gen
+withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "false") {
+  assert(spark.emptyDataFrame.dropDuplicates.count == 0)
+}
+
+// code gen
+withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> "true") {
+  assert(spark.emptyDataFrame.dropDuplicates.count == 0)
+}
+  }
--- End diff --

```
Seq("true", "false").foreach { codegen =>
  withSQLConf(SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key -> codegen) {
    assert(spark.emptyDataFrame.dropDuplicates.count == 0)
  }
}
```
BTW, checking both the codegen and non-codegen paths is a common pattern, 
so it might be better to add a helper function to a test utility class, like:
```
checkExecution {
  assert(spark.emptyDataFrame.dropDuplicates.count == 0)
}
```
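
One possible shape for such a helper is sketched below. It runs without Spark: a mutable map stands in for the session configuration, and `withConf` is a simplified stand-in for `withSQLConf`. The names `CodegenToggleSketch`, `withConf`, and `checkBothCodegenModes` are hypothetical; the key string is the one behind `SQLConf.WHOLESTAGE_CODEGEN_ENABLED`.

```scala
import scala.collection.mutable

object CodegenToggleSketch {
  // Stand-in for the session configuration; a real test would use withSQLConf.
  val conf: mutable.Map[String, String] = mutable.Map.empty

  // Key behind SQLConf.WHOLESTAGE_CODEGEN_ENABLED.
  val CodegenKey = "spark.sql.codegen.wholeStage"

  // Sets the given entries, runs `body`, then restores the previous values.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (key, _) => key -> conf.get(key) }
    pairs.foreach { case (key, value) => conf(key) = value }
    try body
    finally previous.foreach {
      case (key, Some(old)) => conf(key) = old
      case (key, None)      => conf.remove(key)
    }
  }

  // Runs `body` once with whole-stage codegen on and once with it off.
  def checkBothCodegenModes(body: => Unit): Unit =
    Seq("true", "false").foreach { enabled =>
      withConf(CodegenKey -> enabled)(body)
    }
}
```

A test body passed to `checkBothCodegenModes` is evaluated twice, and the configuration is restored afterwards even if an assertion inside the body throws.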


---




[GitHub] spark pull request #20174: [SPARK-22951][SQL] aggregate should not produce e...

2018-01-06 Thread maropu
Github user maropu commented on a diff in the pull request:

https://github.com/apache/spark/pull/20174#discussion_r160040214
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/HashAggregateExec.scala
 ---
@@ -245,11 +252,15 @@ case class HashAggregateExec(
|   $doAggFuncName();
|   $aggTime.add((System.nanoTime() - $beforeAgg) / 100);
|
-   |   // output the result
-   |   ${genResult.trim}
+   |   if (!$hasInput && ${resultVars.isEmpty}) {
--- End diff --

We need this check only if `resultExpressions` is empty, so I think we can 
drop it in the non-empty (regular) case.


---




[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19943
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19943
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85763/
Test PASSed.


---




[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19943
  
**[Test build #85763 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85763/testReport)**
 for PR 19943 at commit 
[`0a44d7d`](https://github.com/apache/spark/commit/0a44d7d20e2a0df71fb499db67e0e4779fa46874).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20171
  
```
from pyspark.sql.functions import pandas_udf
from pyspark.sql.functions import col, lit
from pyspark.sql.types import LongType
df = spark.range(3)
f = pandas_udf(lambda x, y: len(x) + y, LongType())
df.select(f(lit('text'), col('id'))).show()
```

The result is wrong. cc @icexelloss @BryanCutler @taku-k @cloud-fan 
```
+----------+
|(text, id)|
+----------+
|         1|
|         2|
|         3|
+----------+
```




---




[GitHub] spark pull request #20146: [SPARK-11215][ML] Add multiple columns support to...

2018-01-06 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/20146#discussion_r160039165
  
--- Diff: R/pkg/tests/fulltests/test_mllib_classification.R ---
@@ -348,12 +348,12 @@ test_that("spark.mlp", {
 
   # Test random seed
   # default seed
-  model <- spark.mlp(df, label ~ features, layers = c(4, 5, 4, 3), maxIter 
= 10)
+  model <- spark.mlp(df, label ~ features, layers = c(4, 5, 4, 3), maxIter 
= 100)
--- End diff --

```R
> start.time <- Sys.time()
> model <- spark.mlp(df, label ~ features, layers = c(4, 5, 4, 3), maxIter = 10)
> end.time <- Sys.time()
> time.taken <- end.time - start.time
> time.taken
Time difference of 1.780564 secs
```

```R
> start.time <- Sys.time()
> model <- spark.mlp(df, label ~ features, layers = c(4, 5, 4, 3), maxIter = 100)
> end.time <- Sys.time()
> time.taken <- end.time - start.time
> time.taken
Time difference of 5.728089 secs
```


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-01-06 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20146
  
OK. I used a small handcrafted dataset to replace iris in the failing test.


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20146
  
**[Test build #85765 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85765/testReport)**
 for PR 20146 at commit 
[`26cc94b`](https://github.com/apache/spark/commit/26cc94bb335cf0ba3bcdbc2b78effd447026792c).


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20171
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20171
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85764/
Test FAILed.


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20171
  
**[Test build #85764 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85764/testReport)**
 for PR 20171 at commit 
[`3c08f3d`](https://github.com/apache/spark/commit/3c08f3d6b7ec58735260de687bb74b104e6f7009).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20171
  
**[Test build #85764 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85764/testReport)**
 for PR 20171 at commit 
[`3c08f3d`](https://github.com/apache/spark/commit/3c08f3d6b7ec58735260de687bb74b104e6f7009).


---




[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19943
  
**[Test build #85763 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85763/testReport)**
 for PR 19943 at commit 
[`0a44d7d`](https://github.com/apache/spark/commit/0a44d7d20e2a0df71fb499db67e0e4779fa46874).


---




[GitHub] spark issue #16209: [SPARK-10849][SQL] Adds option to the JDBC data source w...

2018-01-06 Thread sureshthalamati
Github user sureshthalamati commented on the issue:

https://github.com/apache/spark/pull/16209
  
@cbyn The specified types must be valid Spark SQL data types. LONGTEXT is 
probably not a type supported by Spark SQL syntax. 

@robbyki As you noticed, the problem with a dialect is that the mapping is the 
same for all columns. The only workaround is to create the table explicitly in 
Netezza and then save to it. There is also a truncate option if you need to 
empty the table before saving; that typically keeps the table as you created 
it.

Please post questions to the Spark user list; you will get answers quickly 
from other users and developers. People will not notice comments on closed 
PRs.




---




[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2018-01-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19943
  
I'm still testing some other things on this PR.


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20072
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20072
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85762/
Test PASSed.


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20072
  
**[Test build #85762 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85762/testReport)**
 for PR 20072 at commit 
[`291ce3a`](https://github.com/apache/spark/commit/291ce3a70d903c6d4608c0a1b6f0a27f5526a79f).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20056: [SPARK-22878] [CORE] Count totalDroppedEvents for LiveLi...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20056
  
Build finished. Test PASSed.


---




[GitHub] spark issue #20056: [SPARK-22878] [CORE] Count totalDroppedEvents for LiveLi...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20056
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85756/
Test PASSed.


---




[GitHub] spark issue #20056: [SPARK-22878] [CORE] Count totalDroppedEvents for LiveLi...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20056
  
**[Test build #85756 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85756/testReport)**
 for PR 20056 at commit 
[`213320e`](https://github.com/apache/spark/commit/213320ec33c33e640776d46df2fbd101a0084fd0).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---




[GitHub] spark issue #18576: [SPARK-21351][SQL] Update nullability based on children'...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18576
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85761/
Test FAILed.


---




[GitHub] spark issue #18576: [SPARK-21351][SQL] Update nullability based on children'...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/18576
  
Build finished. Test FAILed.


---




[GitHub] spark issue #18576: [SPARK-21351][SQL] Update nullability based on children'...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18576
  
**[Test build #85761 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85761/testReport)**
 for PR 18576 at commit 
[`84d32ef`](https://github.com/apache/spark/commit/84d32efbc05d614ce8b9f942623b05bec67d3cfc).
 * This patch **fails Spark unit tests**.
 * This patch **does not merge cleanly**.
 * This patch adds the following public classes _(experimental)_:
  * `case class FilterExec(condition: Expression, child: SparkPlan, 
outputAttrs: Seq[Attribute])`


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20072
  
Build finished. Test PASSed.


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20072
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85758/
Test PASSed.


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20072
  
**[Test build #85758 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85758/testReport)**
 for PR 20072 at commit 
[`670a6c0`](https://github.com/apache/spark/commit/670a6c062a10ff775d1e3d6533702e2cc3cb34da).
 * This patch passes all tests.
 * This patch **does not merge cleanly**.
 * This patch adds no public classes.


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20072
  
**[Test build #85762 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85762/testReport)**
 for PR 20072 at commit 
[`291ce3a`](https://github.com/apache/spark/commit/291ce3a70d903c6d4608c0a1b6f0a27f5526a79f).


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread CodingCat
Github user CodingCat commented on the issue:

https://github.com/apache/spark/pull/20072
  
retest this please


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-01-06 Thread felixcheung
Github user felixcheung commented on the issue:

https://github.com/apache/spark/pull/20146
  
I think any dataset with a string column gets indexed, as far as I recall?

Picking an existing R dataset is just a convenience; we can also make up a 
few lines of data if that works out better.

As a separate note, though, the difference in sort order is potentially 
something we should document, especially if it goes beyond glm, for example in 
SQL functions too.




---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20072
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20072
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85760/
Test FAILed.


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20072
  
**[Test build #85760 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85760/testReport)**
 for PR 20072 at commit 
[`291ce3a`](https://github.com/apache/spark/commit/291ce3a70d903c6d4608c0a1b6f0a27f5526a79f).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20171
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20171
  
**[Test build #85759 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85759/testReport)**
 for PR 20171 at commit 
[`b801e70`](https://github.com/apache/spark/commit/b801e7094c7c730adb53a25e13363973930a0b42).
 * This patch **fails PySpark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20171
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85759/
Test FAILed.


---




[GitHub] spark issue #18576: [SPARK-21351][SQL] Update nullability based on children'...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/18576
  
**[Test build #85761 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85761/testReport)**
 for PR 18576 at commit 
[`84d32ef`](https://github.com/apache/spark/commit/84d32efbc05d614ce8b9f942623b05bec67d3cfc).


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20171
  
**[Test build #85759 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85759/testReport)**
 for PR 20171 at commit 
[`b801e70`](https://github.com/apache/spark/commit/b801e7094c7c730adb53a25e13363973930a0b42).


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20072
  
**[Test build #85760 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85760/testReport)**
 for PR 20072 at commit 
[`291ce3a`](https://github.com/apache/spark/commit/291ce3a70d903c6d4608c0a1b6f0a27f5526a79f).


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20072
  
**[Test build #85758 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85758/testReport)**
 for PR 20072 at commit 
[`670a6c0`](https://github.com/apache/spark/commit/670a6c062a10ff775d1e3d6533702e2cc3cb34da).


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20171
  
retest this please 


---




[GitHub] spark pull request #20175: [HOTFIX] Fix style checking failure

2018-01-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20175


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20072
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20072
  
**[Test build #85757 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85757/testReport)**
 for PR 20072 at commit 
[`2f6e3c9`](https://github.com/apache/spark/commit/2f6e3c9da482e85f6ee8046c7e7e2f9f6194ece2).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20072
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85757/
Test FAILed.


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20072
  
**[Test build #85757 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85757/testReport)**
 for PR 20072 at commit 
[`2f6e3c9`](https://github.com/apache/spark/commit/2f6e3c9da482e85f6ee8046c7e7e2f9f6194ece2).


---




[GitHub] spark issue #20072: [SPARK-22790][SQL] add a configurable factor to describe...

2018-01-06 Thread CodingCat
Github user CodingCat commented on the issue:

https://github.com/apache/spark/pull/20072
  
@cloud-fan @rxin @wzhfy @felixcheung @gatorsmile thanks for the review; the 
parameter has been renamed and a test has been added.


---




[GitHub] spark issue #20175: [HOTFIX] Fix style checking failure

2018-01-06 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20175
  
Thanks! Merged to master/2.3


---




[GitHub] spark issue #20175: [HOTFIX] Fix style checking failure

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20175
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85755/
Test PASSed.


---




[GitHub] spark issue #20175: [HOTFIX] Fix style checking failure

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20175
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20175: [HOTFIX] Fix style checking failure

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20175
  
**[Test build #85755 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85755/testReport)**
 for PR 20175 at commit 
[`6b4ddcc`](https://github.com/apache/spark/commit/6b4ddcc4a1ba2c36475f8b1fa281c434a1138002).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20056: [SPARK-22878] [CORE] Count totalDroppedEvents for LiveLi...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20056
  
**[Test build #85756 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85756/testReport)**
 for PR 20056 at commit 
[`213320e`](https://github.com/apache/spark/commit/213320ec33c33e640776d46df2fbd101a0084fd0).


---




[GitHub] spark pull request #20173: [SPARK-22901][PYTHON][FOLLOWUP] Adds the doc for ...

2018-01-06 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/20173


---




[GitHub] spark issue #20146: [SPARK-11215][ML] Add multiple columns support to String...

2018-01-06 Thread viirya
Github user viirya commented on the issue:

https://github.com/apache/spark/pull/20146
  
Hmm, I reconsidered 
https://github.com/apache/spark/pull/20146#pullrequestreview-87070102. Even if 
we use a dataset without duplicate values, if the string indexer order from R 
glm is different from the index used by RFormula, we still can't get the same 
results, because it looks like R glm doesn't follow frequency/alphabet order.

For example, I've tried the dataset Puromycin:

```R
> training <- suppressWarnings(createDataFrame(Puromycin))  
 
> stats <- summary(spark.glm(training, conc ~ rate + state))
> rStats <- summary(glm(conc ~ rate + state, data = Puromycin))
> rStats$coefficients
                   Estimate  Std. Error   t value     Pr(>|t|)
(Intercept)    -0.595461828 0.157462177 -3.781618 1.171709e-03
rate            0.006642461 0.001022196  6.498228 2.464757e-06
stateuntreated  0.136323828 0.095090605  1.433620 1.671302e-01
> stats$coefficients
                  Estimate  Std. Error   t value     Pr(>|t|)
(Intercept)   -0.459138000 0.130420375 -3.520447 2.150817e-03
rate           0.006642461 0.001022196  6.498228 2.464757e-06
state_treated -0.136323828 0.095090605 -1.433620 1.671302e-01
```

You can see that, because the string index of the state column still differs 
between R glm and RFormula, we can't get the same results.

A workaround is to use a dataset that doesn't need string indexing. What do 
you think, @felixcheung?
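
To make the difference concrete: with a two-level factor, the two coefficient tables above look like the same model written against a different reference level. Below is a minimal Python sketch of that relationship. It assumes R's glm uses `treated` as the reference level and RFormula ends up with `untreated` as the reference; that is inferred from the coefficient names, not checked against the Spark source. `switch_reference_level` is a hypothetical helper written only for this illustration.

```python
# Coefficients reported by R's glm above (assumed reference level: "treated").
r_fit = {
    "(Intercept)": -0.595461828,
    "rate": 0.006642461,
    "stateuntreated": 0.136323828,
}


def switch_reference_level(fit, dummy_name, new_dummy_name):
    """Re-express a fit with one two-level dummy using the other level as reference.

    Hypothetical helper: the intercept absorbs the old dummy coefficient and
    the new dummy coefficient is its negation. Other terms are unchanged.
    """
    out = dict(fit)
    beta = out.pop(dummy_name)
    out["(Intercept)"] = fit["(Intercept)"] + beta
    out[new_dummy_name] = -beta
    return out


spark_like = switch_reference_level(r_fit, "stateuntreated", "state_treated")
for name, value in spark_like.items():
    print(f"{name:15s} {value: .9f}")
```

Under that assumption the intercept shifts by the dummy coefficient (to roughly -0.459138) and the dummy coefficient flips sign, which matches the `spark.glm` numbers above; the `rate` coefficient is the same in both fits.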


---




[GitHub] spark issue #20175: [HOTFIX] Fix style checking failure

2018-01-06 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/20175
  
@dongjoon-hyun The JAVA style check is disabled.


---




[GitHub] spark issue #20175: [HOTFIX] Fix style checking failure

2018-01-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/20175
  
@gatorsmile . Could you check Java style, too?
```
$ dev/lint-java
Using `mvn` from path: /usr/local/maven-3.5.2/bin/mvn
Checkstyle checks failed at following occurrences:
[ERROR] src/main/java/org/apache/spark/launcher/InProcessAppHandle.java:[20,8] (imports) UnusedImports: Unused import - java.io.IOException.
[ERROR] src/test/java/test/org/apache/spark/sql/JavaDataFrameSuite.java:[464] (sizes) LineLength: Line is longer than 100 characters (found 102).
```
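
For anyone who wants to triage output like this in bulk, here is a rough Python sketch that pulls the file, position, and rule out of each `[ERROR]` line. The pattern is inferred only from the two lines quoted above, so other Checkstyle output may not match it; `parse_checkstyle` is a hypothetical helper, not part of `dev/lint-java`.

```python
import re

# Pattern inferred from the two [ERROR] lines quoted above (an assumption).
CHECKSTYLE_LINE = re.compile(
    r"^\[ERROR\]\s+(?P<path>\S+?):\[(?P<line>\d+)(?:,(?P<col>\d+))?\]\s+"
    r"\((?P<category>\w+)\)\s+(?P<rule>\w+):\s+(?P<message>.*)$"
)


def parse_checkstyle(line):
    """Return a dict describing one violation, or None if the line does not match."""
    match = CHECKSTYLE_LINE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])
    # The column is optional: LineLength reports only a line number.
    fields["col"] = int(fields["col"]) if fields["col"] is not None else None
    return fields


report = [
    "Checkstyle checks failed at following occurrences:",
    "[ERROR] src/main/java/org/apache/spark/launcher/InProcessAppHandle.java:[20,8] "
    "(imports) UnusedImports: Unused import - java.io.IOException.",
    "[ERROR] src/test/java/test/org/apache/spark/sql/JavaDataFrameSuite.java:[464] "
    "(sizes) LineLength: Line is longer than 100 characters (found 102).",
]

violations = [v for v in map(parse_checkstyle, report) if v is not None]
for v in violations:
    print(v["rule"], v["path"], v["line"], v["col"])
```

Lines that are not violations, such as the summary header, are skipped rather than treated as errors.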


---




[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2018-01-06 Thread dongjoon-hyun
Github user dongjoon-hyun commented on the issue:

https://github.com/apache/spark/pull/19943
  
Thank you, @HyukjinKwon .


---




[GitHub] spark issue #20166: [SPARK-22973][SQL] Fix incorrect results of Casting Map ...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20166
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85751/
Test PASSed.


---




[GitHub] spark issue #20166: [SPARK-22973][SQL] Fix incorrect results of Casting Map ...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20166
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #20166: [SPARK-22973][SQL] Fix incorrect results of Casting Map ...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20166
  
**[Test build #85751 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85751/testReport)**
 for PR 20166 at commit 
[`fb11796`](https://github.com/apache/spark/commit/fb1179698c5a4cdfd13bbc4fa2a0ceda07fe43c9).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20175: [HOTFIX] Fix style checking failure

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20175
  
**[Test build #85755 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85755/testReport)**
 for PR 20175 at commit 
[`6b4ddcc`](https://github.com/apache/spark/commit/6b4ddcc4a1ba2c36475f8b1fa281c434a1138002).


---




[GitHub] spark pull request #20175: [HOTFIX] Fix style checking failure

2018-01-06 Thread gatorsmile
GitHub user gatorsmile opened a pull request:

https://github.com/apache/spark/pull/20175

[HOTFIX] Fix style checking failure

## What changes were proposed in this pull request?
This PR fixes the style checking failure.

## How was this patch tested?
N/A

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gatorsmile/spark stylefix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/20175.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #20175


commit 6b4ddcc4a1ba2c36475f8b1fa281c434a1138002
Author: gatorsmile 
Date:   2018-01-06T12:52:03Z

fix




---




[GitHub] spark issue #20166: [SPARK-22973][SQL] Fix incorrect results of Casting Map ...

2018-01-06 Thread maropu
Github user maropu commented on the issue:

https://github.com/apache/spark/pull/20166
  
BTW, the current `Dataset.showString` prints rows through `RowEncoder` 
deserializers, like this:
```
scala> Seq(Seq(Seq(1, 2), Seq(3), Seq(4, 5, 6))).toDF("a").show(false)
+------------------------------------------------------------+
|a                                                           |
+------------------------------------------------------------+
|[WrappedArray(1, 2), WrappedArray(3), WrappedArray(4, 5, 6)]|
+------------------------------------------------------------+
```
If [we cast them before 
printing](https://github.com/apache/spark/compare/master...maropu:CastToStringInShowString),
 we could get simpler forms, like this:
```
scala> Seq(Seq(Seq(1, 2), Seq(3), Seq(4, 5, 6))).toDF("a").show(false)
+------------------------+
|a                       |
+------------------------+
|[[1, 2], [3], [4, 5, 6]]|
+------------------------+
```
I'm not sure, though: is this acceptable? (We might need to add an option to 
keep the old behaviour.)
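
As a language-neutral illustration of the proposed output, here is a small Python sketch that renders nested sequences in the bracketed form and draws a `show(false)`-style table around them. It only mimics the target formatting; `render` and `show` are hypothetical names, and this is not how Spark's cast-to-string is implemented.

```python
def render(value):
    """Render nested lists/tuples as [a, b, ...]; fall back to str() for leaves."""
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(render(v) for v in value) + "]"
    return str(value)


def show(column, rows):
    """Build an untruncated single-column table, left-aligned like show(false)."""
    cells = [render(row) for row in rows]
    width = max([len(column)] + [len(cell) for cell in cells])
    border = "+" + "-" * width + "+"
    lines = [border, "|" + column.ljust(width) + "|", border]
    lines += ["|" + cell.ljust(width) + "|" for cell in cells]
    lines.append(border)
    return "\n".join(lines)


table = show("a", [[[1, 2], [3], [4, 5, 6]]])
print(table)
```

The recursion is what keeps inner collections from leaking their container type name into the output, which is the readability gain being proposed.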


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20171
  
Merged build finished. Test FAILed.


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20171
  
**[Test build #85754 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85754/testReport)**
 for PR 20171 at commit 
[`b801e70`](https://github.com/apache/spark/commit/b801e7094c7c730adb53a25e13363973930a0b42).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20171
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85754/
Test FAILed.


---




[GitHub] spark issue #20172: [SPARK-22979][PYTHON][SQL] Avoid per-record type dispatc...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20172
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85746/
Test PASSed.


---




[GitHub] spark issue #20171: [SPARK-22978] [PySpark] Register Vectorized UDFs for SQL...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20171
  
**[Test build #85754 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85754/testReport)**
 for PR 20171 at commit 
[`b801e70`](https://github.com/apache/spark/commit/b801e7094c7c730adb53a25e13363973930a0b42).


---




[GitHub] spark issue #20172: [SPARK-22979][PYTHON][SQL] Avoid per-record type dispatc...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20172
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20172: [SPARK-22979][PYTHON][SQL] Avoid per-record type dispatc...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20172
  
**[Test build #85746 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85746/testReport)**
 for PR 20172 at commit 
[`83c9b58`](https://github.com/apache/spark/commit/83c9b58670ab01c8abc11ffca08938b0a8189aee).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19943
  
Merged build finished. Test PASSed.


---




[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19943
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85745/
Test PASSed.


---




[GitHub] spark issue #19943: [SPARK-16060][SQL] Support Vectorized ORC Reader

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19943
  
**[Test build #85745 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85745/testReport)**
 for PR 19943 at commit 
[`aeb6abd`](https://github.com/apache/spark/commit/aeb6abd66ee3338635edf9dca85894c14a05fb72).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20174: S[SPARK-22951][SQL] aggregate should not produce empty r...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20174
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20174: S[SPARK-22951][SQL] aggregate should not produce empty r...

2018-01-06 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/20174
  
**[Test build #85753 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/85753/testReport)**
 for PR 20174 at commit 
[`89d6b75`](https://github.com/apache/spark/commit/89d6b7504239211cd64188be50ae32a199f1ddd8).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---




[GitHub] spark issue #20174: S[SPARK-22951][SQL] aggregate should not produce empty r...

2018-01-06 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/20174
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/85753/
Test FAILed.


---



