[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...

2018-10-31 Thread rednaxelafx
Github user rednaxelafx commented on the issue:

https://github.com/apache/spark/pull/22847
  
Just in case people wonder, the following is the hack patch that I used for 
stress testing code splitting before this PR:
```diff
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/CodeGenerator.scala
@@ -647,11 +647,13 @@ class CodegenContext(val useStreamlining: Boolean) {
* Returns a term name that is unique within this instance of a 
`CodegenContext`.
*/
   def freshName(name: String): String = synchronized {
-val fullName = if (freshNamePrefix == "") {
+// hack: intentionally add a very long prefix (length=300 characters) 
to
+// trigger code splitting more frequently
+val fullName = ("averylongprefix" * 20) + (if (freshNamePrefix == "") {
   name
 } else {
   s"${freshNamePrefix}_$name"
-}
+})
 if (freshNameIds.contains(fullName)) {
   val id = freshNameIds(fullName)
   freshNameIds(fullName) = id + 1
```
Of course, now with this PR, we can simply set the split threshold to a 
very low value (e.g. `1`) to force split.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22847: [SPARK-25850][SQL] Make the split threshold for t...

2018-10-31 Thread rednaxelafx
Github user rednaxelafx commented on a diff in the pull request:

https://github.com/apache/spark/pull/22847#discussion_r229943260
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -812,6 +812,17 @@ object SQLConf {
 .intConf
 .createWithDefault(65535)
 
+  val CODEGEN_METHOD_SPLIT_THRESHOLD = 
buildConf("spark.sql.codegen.methodSplitThreshold")
+.internal()
+.doc("The threshold of source code length without comment of a single 
Java function by " +
+  "codegen to be split. When the generated Java function source code 
exceeds this threshold" +
+  ", it will be split into multiple small functions. We can't know how 
many bytecode will " +
+  "be generated, so use the code length as metric. A function's 
bytecode should not go " +
+  "beyond 8KB, otherwise it will not be JITted; it also should not be 
too small, otherwise " +
+  "there will be many function calls.")
+.intConf
--- End diff --

Oh I see, you're using the column name...that's not the right place to put 
the "prefix". Column names are almost never carried over to the generated code 
in the current framework (the only exception is the lambda variable name).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19927: [SPARK-22737][ML][WIP] OVR transform optimization

2018-10-31 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/19927
  
@srowen How do you think about this? Current OVR model's transform is too 
slow.  Thanks.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22847: [SPARK-25850][SQL] Make the split threshold for t...

2018-10-31 Thread rednaxelafx
Github user rednaxelafx commented on a diff in the pull request:

https://github.com/apache/spark/pull/22847#discussion_r229942325
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -812,6 +812,17 @@ object SQLConf {
 .intConf
 .createWithDefault(65535)
 
+  val CODEGEN_METHOD_SPLIT_THRESHOLD = 
buildConf("spark.sql.codegen.methodSplitThreshold")
+.internal()
+.doc("The threshold of source code length without comment of a single 
Java function by " +
+  "codegen to be split. When the generated Java function source code 
exceeds this threshold" +
+  ", it will be split into multiple small functions. We can't know how 
many bytecode will " +
+  "be generated, so use the code length as metric. A function's 
bytecode should not go " +
+  "beyond 8KB, otherwise it will not be JITted; it also should not be 
too small, otherwise " +
+  "there will be many function calls.")
+.intConf
--- End diff --

The "freshNamePrefix" prefix is only applied in whole-stage codegen, 

https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala#L87

https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala#L169

It doesn't take any effect in non-whole-stage codegen.

If you intend to stress test expression codegen but don't see the prefix 
being prepended, you're probably not adding it in the right place. Where did 
you add it?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread zhengruifeng
Github user zhengruifeng commented on the issue:

https://github.com/apache/spark/pull/22087
  
@imatiach-msft  Updated according to your comments! Thanks for your 
reviewing!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22898: [SPARK-25746][SQL][followup] do not add unnecessa...

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22898


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22898: [SPARK-25746][SQL][followup] do not add unnecessary If e...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22898
  
thanks, merging to master!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22087
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98345/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22087
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22626
  
This needs to be rebased.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22087
  
**[Test build #98345 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98345/testReport)**
 for PR 22087 at commit 
[`2d6594e`](https://github.com/apache/spark/commit/2d6594e37ab6968fc13add89cdec2fd42f2b799b).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-10-31 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/22914
  
> May be we can highlight above the table, that "Invalid page number, 
falling back to first page"

Yes, that's what I mean. 
No big deal but falling back to the first page seems unnecessary, we can 
return the same page again.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22087
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22087
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98344/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22087
  
**[Test build #98344 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98344/testReport)**
 for PR 22087 at commit 
[`f428a5d`](https://github.com/apache/spark/commit/f428a5ddef242bbcccb189ae62259a9ced6e80de).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22908
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98343/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22908
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22908
  
**[Test build #98343 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98343/testReport)**
 for PR 22908 at commit 
[`743b821`](https://github.com/apache/spark/commit/743b821ab7aab0d88cf52c13b98fb690f7b80836).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-10-31 Thread shahidki31
Github user shahidki31 commented on the issue:

https://github.com/apache/spark/pull/22914
  
The current behavior is, If we enter a value more than the maximum page 
number, the page navigation bar shows the user is in first page and throws an 
exception. So, if we really want to throw exception, then page navigation bar 
also need to change accordingly.

But still I believe, falling back to the first page is better. May be we 
can highlight above the table, that "Invalid page number, falling back to first 
page"



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22912
  
cc @jiangxb1987 and @mengxr 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...

2018-10-31 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22905#discussion_r229933689
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ColumnarFileFormat.scala
 ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.StructType
+
+/**
+ * An optional mix-in for columnar [[FileFormat]]s. This trait provides 
some helpful metadata when
+ * debugging a physical query plan.
+ */
+private[sql] trait ColumnarFileFormat {
--- End diff --

If it's supposed to be exposed as an interface to external datasources, 
then I wouldn't even add this one. It looks a rough guess that it can be 
generalised.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...

2018-10-31 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22905#discussion_r229933544
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala 
---
@@ -306,7 +306,15 @@ case class FileSourceScanExec(
   withOptPartitionCount
 }
 
-withSelectedBucketsCount
+val withOptColumnCount = relation.fileFormat match {
+  case columnar: ColumnarFileFormat =>
+val sqlConf = relation.sparkSession.sessionState.conf
+val columnCount = columnar.columnCountForSchema(sqlConf, 
requiredSchema)
+withSelectedBucketsCount + ("ColumnCount" -> columnCount.toString)
--- End diff --

Is this something we really should include in the metadata? If the purpose 
of this is to check if the column pruning works or not, logging should be good 
enough. Adding a trait for it sounds an overkill for the current status. Let's 
not add an abstraction just for rough guess that it can be generalised.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...

2018-10-31 Thread viirya
Github user viirya commented on a diff in the pull request:

https://github.com/apache/spark/pull/22905#discussion_r229932838
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ColumnarFileFormat.scala
 ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.StructType
+
+/**
+ * An optional mix-in for columnar [[FileFormat]]s. This trait provides 
some helpful metadata when
+ * debugging a physical query plan.
+ */
+private[sql] trait ColumnarFileFormat {
--- End diff --

Don't we allow other non built-in datasource to use this trait?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...

2018-10-31 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22905#discussion_r229932338
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ColumnarFileFormat.scala
 ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.StructType
+
+/**
+ * An optional mix-in for columnar [[FileFormat]]s. This trait provides 
some helpful metadata when
+ * debugging a physical query plan.
+ */
+private[sql] trait ColumnarFileFormat {
--- End diff --

and I would actually make it `pricate[datasources]` since that's only 
currently used in ParquetFileFormat.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-10-31 Thread gengliangwang
Github user gengliangwang commented on the issue:

https://github.com/apache/spark/pull/22914
  
I prefer to  just highlight the invalid output. E.g.


![image](https://user-images.githubusercontent.com/1097932/47831557-0e6ea800-ddcc-11e8-9fd1-c4d29f944c9d.png)


![image](https://user-images.githubusercontent.com/1097932/47831561-11699880-ddcc-11e8-9d83-20e49a0c517d.png)



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22905: [SPARK-25894][SQL] Add a ColumnarFileFormat type ...

2018-10-31 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22905#discussion_r229932234
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ColumnarFileFormat.scala
 ---
@@ -0,0 +1,32 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.datasources
+
+import org.apache.spark.sql.internal.SQLConf
+import org.apache.spark.sql.types.StructType
+
+/**
+ * An optional mix-in for columnar [[FileFormat]]s. This trait provides 
some helpful metadata when
+ * debugging a physical query plan.
+ */
+private[sql] trait ColumnarFileFormat {
--- End diff --

It's already in a private package. `private[sql]` can be removed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22087
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4691/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #20899: Bug fix in sendMessage() of pregel implementation in Pag...

2018-10-31 Thread WenqianZhao
Github user WenqianZhao commented on the issue:

https://github.com/apache/spark/pull/20899
  
> @WenqianZhao I think the point of sending deltas instead of absolute 
ranks was that, as parts of the graph converge, their deltas would go to zero. 
GraphX would then be able to compress those zero messages more efficiently.

Got it! Thank you!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22087
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22087
  
**[Test build #98345 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98345/testReport)**
 for PR 22087 at commit 
[`2d6594e`](https://github.com/apache/spark/commit/2d6594e37ab6968fc13add89cdec2fd42f2b799b).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-10-31 Thread shahidki31
Github user shahidki31 commented on the issue:

https://github.com/apache/spark/pull/22914
  
@gengliangwang IMHO, We should try to avoid exceptions in the WEBUI. 

User will come to know which page he is, from the page navigation bar. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22087
  
**[Test build #98344 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98344/testReport)**
 for PR 22087 at commit 
[`f428a5d`](https://github.com/apache/spark/commit/f428a5ddef242bbcccb189ae62259a9ced6e80de).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22087
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22087: [SPARK-25097][ML] Support prediction on single instance ...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22087
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4690/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98342/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
**[Test build #98342 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98342/testReport)**
 for PR 19045 at commit 
[`ca448d1`](https://github.com/apache/spark/commit/ca448d13c523e4658720ed3bf9b7cfa9f03ec260).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22892: [SPARK-25884][SQL] Add TBLPROPERTIES and COMMENT,...

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22892


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22847: [SPARK-25850][SQL] Make the split threshold for t...

2018-10-31 Thread yucai
Github user yucai commented on a diff in the pull request:

https://github.com/apache/spark/pull/22847#discussion_r229919857
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -812,6 +812,17 @@ object SQLConf {
 .intConf
 .createWithDefault(65535)
 
+  val CODEGEN_METHOD_SPLIT_THRESHOLD = 
buildConf("spark.sql.codegen.methodSplitThreshold")
+.internal()
+.doc("The threshold of source code length without comment of a single 
Java function by " +
+  "codegen to be split. When the generated Java function source code 
exceeds this threshold" +
+  ", it will be split into multiple small functions. We can't know how 
many bytecode will " +
+  "be generated, so use the code length as metric. A function's 
bytecode should not go " +
+  "beyond 8KB, otherwise it will not be JITted; it also should not be 
too small, otherwise " +
+  "there will be many function calls.")
+.intConf
--- End diff --

Seems like long alias names have no influence.
```
[info] Java HotSpot(TM) 64-Bit Server VM 1.8.0_162-b12 on Mac OS X 10.13.6
[info] Intel(R) Core(TM) i7-7820HQ CPU @ 2.90GHz
[info] projection on wide table:Best/Avg Time(ms)
Rate(M/s)   Per Row(ns)   Relative
[info] 

[info] split threshold 106512 / 6736  
0.26210.4   1.0X
[info] split threshold 100   5730 / 6329  
0.25464.9   1.1X
[info] split threshold 1024  3119 / 3184  
0.32974.6   2.1X
[info] split threshold 2048  2981 / 3100  
0.42842.9   2.2X
[info] split threshold 4096  3289 / 3379  
0.33136.6   2.0X
[info] split threshold 8196  4307 / 4338  
0.24108.0   1.5X
[info] split threshold 65536   29147 / 30212  
0.0   27797.0   0.2X
```

No `averylongprefixrepeatedmultipletimes` in the **expression code gen**:
```
/* 047 */   private void createExternalRow_0_8(InternalRow i, Object[] 
values_0) {
/* 048 */
/* 049 */ // input[80, bigint, false]
/* 050 */ long value_81 = i.getLong(80);
/* 051 */ if (false) {
/* 052 */   values_0[80] = null;
/* 053 */ } else {
/* 054 */   values_0[80] = value_81;
/* 055 */ }
/* 056 */
/* 057 */ // input[81, bigint, false]
/* 058 */ long value_82 = i.getLong(81);
/* 059 */ if (false) {
/* 060 */   values_0[81] = null;
/* 061 */ } else {
/* 062 */   values_0[81] = value_82;
/* 063 */ }
/* 064 */
/* 065 */ // input[82, bigint, false]
/* 066 */ long value_83 = i.getLong(82);
/* 067 */ if (false) {
/* 068 */   values_0[82] = null;
/* 069 */ } else {
/* 070 */   values_0[82] = value_83;
/* 071 */ }
/* 072 */
...
```

My benchmark:
```
object WideTableBenchmark extends SqlBasedBenchmark {

  override def runBenchmarkSuite(mainArgs: Array[String]): Unit = {
runBenchmark("projection on wide table") {
  val N = 1 << 20
  val df = spark.range(N)
  val columns = (0 until 400).map{ i => s"id as 
averylongprefixrepeatedmultipletimes_id$i"}
  val benchmark = new Benchmark("projection on wide table", N, output = 
output)
  Seq("10", "100", "1024", "2048", "4096", "8196", "65536").foreach { n 
=>
benchmark.addCase(s"split threshold $n", numIters = 5) { iter =>
  withSQLConf("spark.testing.codegen.splitThreshold" -> n) {
df.selectExpr(columns: _*).foreach(identity(_))
  }
}
  }
  benchmark.run()
}
  }
}
```

Will keep benchmarking for the complex expression.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22892: [SPARK-25884][SQL] Add TBLPROPERTIES and COMMENT, and us...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22892
  
thanks, merging to master!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22847: [SPARK-25850][SQL] Make the split threshold for the code...

2018-10-31 Thread cloud-fan
Github user cloud-fan commented on the issue:

https://github.com/apache/spark/pull/22847
  
@rednaxelafx ah good point! It's hardcoded as 1024 too, and it's also doing 
method splitting. Let's apply the config there too.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22912
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22912
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98341/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22912
  
**[Test build #98341 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98341/testReport)**
 for PR 22912 at commit 
[`2f2de0b`](https://github.com/apache/spark/commit/2f2de0bb381c6b6bed65f5371aa001dd84aff3fe).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22912
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22912
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98340/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22912
  
**[Test build #98340 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98340/testReport)**
 for PR 22912 at commit 
[`e7ad8ab`](https://github.com/apache/spark/commit/e7ad8abc2c3505b7fab81cfa79f009a8808bb2b8).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22918: [SPARK-25902][SQL]Change AttributeReference.withMetadata...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22918
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #21860: [SPARK-24901][SQL]Merge the codegen of RegularHashMap an...

2018-10-31 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/21860
  
thanks, @cloud-fan, @maropu, @kiszk


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22918: [SPARK-25902][SQL]Change AttributeReference.withMetadata...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22918
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

2018-10-31 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22666
  
Argh, sorry, it was my mistake.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22918: [SPARK-25902][SQL]Change AttributeReference.withMetadata...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22918
  
Can one of the admins verify this patch?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22918: [SPARK-25902][SQL]Change AttributeReference.withM...

2018-10-31 Thread kevinyu98
GitHub user kevinyu98 opened a pull request:

https://github.com/apache/spark/pull/22918

[SPARK-25902][SQL]Change AttributeReference.withMetadata's return type to 
AttributeReference

## What changes were proposed in this pull request?

Currently the `AttributeReference.withMetadata` method have return type 
`Attribute,` the rest of with methods in the `AttributeReference` return type 
are `AttributeReference`, as the 
[spark-25902](https://issues.apache.org/jira/browse/SPARK-25892?jql=project%20%3D%20SPARK%20AND%20component%20in%20(ML%2C%20PySpark%2C%20SQL))
 mentioned. 

## How was this patch tested?

Run all `sql/test,` `catalyst/test` and 
`org.apache.spark.sql.execution.streaming.*`

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kevinyu98/spark spark-25892

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22918.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22918


commit 7aa2e9f9113ace0106ed1de31bad5997d600f03b
Author: Kevin Yu 
Date:   2018-10-31T22:40:04Z

return AttributeReference type for AttributeReference.withMetadata




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22666: [SPARK-25672][SQL] schema_of_csv() - schema infer...

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22666


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

2018-10-31 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22666
  
Ah no I am sorry @MaxGekk. I made the primary author as me mistakenly. 
I showed my email first.

```
=== Pull Request #22666 ===
title   [SPARK-25672][SQL] schema_of_csv() - schema inference from an 
example
source  MaxGekk/schema_of_csv-function
target  master
url https://api.github.com/repos/apache/spark/pulls/22666

Proceed with merging pull request #22666? (y/n): y
git fetch apache-github pull/22666/head:PR_TOOL_MERGE_PR_22666
From https://github.com/apache/spark
 * [new ref] refs/pull/22666/head -> PR_TOOL_MERGE_PR_22666
git fetch apache master:PR_TOOL_MERGE_PR_22666_MASTER
remote: Counting objects: 303, done.
remote: Compressing objects: 100% (153/153), done.
remote: Total 209 (delta 91), reused 0 (delta 0)
Receiving objects: 100% (209/209), 91.89 KiB | 445.00 KiB/s, done.
Resolving deltas: 100% (91/91), completed with 65 local objects.
From https://git-wip-us.apache.org/repos/asf/spark
 * [new branch]  master -> PR_TOOL_MERGE_PR_22666_MASTER
   57eddc7182e..c5ef477d2f6  master -> apache/master
git checkout PR_TOOL_MERGE_PR_22666_MASTER
Switched to branch 'PR_TOOL_MERGE_PR_22666_MASTER'
['git', 'merge', 'PR_TOOL_MERGE_PR_22666', '--squash']
Automatic merge went well; stopped before committing as requested
['git', 'log', 'HEAD..PR_TOOL_MERGE_PR_22666', '--pretty=format:%an <%ae>']
Enter primary author in the format of "name " [hyukjinkwon 
]: hyukjinkwon 
['git', 'log', 'HEAD..PR_TOOL_MERGE_PR_22666', '--pretty=format:%h [%an] 
%s']
```

Looks the commit order affects the name appearing for `Enter primary author 
in the format of "name "`.



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22914: [SPARK-25900][WEBUI]When the page number is more than th...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22914
  
**[Test build #4400 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4400/testReport)**
 for PR 22914 at commit 
[`5a31350`](https://github.com/apache/spark/commit/5a313506b63b870d8de37348f97cf5d67ed52ff6).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22666: [SPARK-25672][SQL] schema_of_csv() - schema inference fr...

2018-10-31 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22666
  
Merged to master.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22845: [SPARK-25848][SQL][TEST] Refactor CSVBenchmarks to use m...

2018-10-31 Thread heary-cao
Github user heary-cao commented on the issue:

https://github.com/apache/spark/pull/22845
  
thanks,@dongjoon-hyum


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22908
  
**[Test build #98343 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98343/testReport)**
 for PR 22908 at commit 
[`743b821`](https://github.com/apache/spark/commit/743b821ab7aab0d88cf52c13b98fb690f7b80836).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22908
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22908: [MONOR][SQL] Replace all TreeNode's node name in the sim...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22908
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4689/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22626
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22626
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98336/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22626: [SPARK-25638][SQL] Adding new function - to_csv()

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22626
  
**[Test build #98336 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98336/testReport)**
 for PR 22626 at commit 
[`39f6899`](https://github.com/apache/spark/commit/39f689932ee2df194420fc63c7c5d9e351b09b86).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22860: Branch 2.4

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22860


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22897: [SPARK-25875][k8s] Merge code to set up driver command i...

2018-10-31 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/22897
  
@mccheah @liyinan926 

(I'm kinda assuming you guys monitor github / jira instead of relying on 
pings.)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.

2018-10-31 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/22917
  
The title seems to describe the problem, can you describe the solution 
instead?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22901: [SPARK-25891][PYTHON] Upgrade to Py4J 0.10.8.1

2018-10-31 Thread HyukjinKwon
Github user HyukjinKwon commented on the issue:

https://github.com/apache/spark/pull/22901
  
Late LGTM!


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22883
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98331/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22883
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22883
  
**[Test build #98331 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98331/testReport)**
 for PR 22883 at commit 
[`82fef86`](https://github.com/apache/spark/commit/82fef8686f4b71d94b3bfe44ea809b4d88f5fe70).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22910: [SPARK-25899][TESTS]Fix flaky CoarseGrainedSchedu...

2018-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/22910


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4688/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/19045
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
Kubernetes integration test status failure
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4688/



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
Kubernetes integration test starting
URL: 
https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/4688/



---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22910: [SPARK-25899][TESTS]Fix flaky CoarseGrainedSchedulerBack...

2018-10-31 Thread gatorsmile
Github user gatorsmile commented on the issue:

https://github.com/apache/spark/pull/22910
  
Thanks! Merged to master/2.4


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #19045: [WIP][SPARK-20628][CORE][K8S] Keep track of nodes (/ spo...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/19045
  
**[Test build #98342 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98342/testReport)**
 for PR 19045 at commit 
[`ca448d1`](https://github.com/apache/spark/commit/ca448d13c523e4658720ed3bf9b7cfa9f03ec260).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22883
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98329/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22883
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22883: [SPARK-25837] [Core] Fix potential slowdown in AppStatus...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22883
  
**[Test build #98329 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98329/testReport)**
 for PR 22883 at commit 
[`178f7c3`](https://github.com/apache/spark/commit/178f7c3bf82f93177fce086037ece6ebf09bb350).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22910: [SPARK-25899][TESTS]Fix flaky CoarseGrainedSchedulerBack...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22910
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98328/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22910: [SPARK-25899][TESTS]Fix flaky CoarseGrainedSchedulerBack...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22910
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22910: [SPARK-25899][TESTS]Fix flaky CoarseGrainedSchedulerBack...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22910
  
**[Test build #98328 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98328/testReport)**
 for PR 22910 at commit 
[`0a3095f`](https://github.com/apache/spark/commit/0a3095fd5610810004ef6a0d1e02581b78bfdea4).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22847: [SPARK-25850][SQL] Make the split threshold for t...

2018-10-31 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22847#discussion_r229879855
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ---
@@ -812,6 +812,17 @@ object SQLConf {
 .intConf
 .createWithDefault(65535)
 
+  val CODEGEN_METHOD_SPLIT_THRESHOLD = 
buildConf("spark.sql.codegen.methodSplitThreshold")
+.internal()
+.doc("The threshold of source code length without comment of a single 
Java function by " +
+  "codegen to be split. When the generated Java function source code 
exceeds this threshold" +
+  ", it will be split into multiple small functions. We can't know how 
many bytecode will " +
+  "be generated, so use the code length as metric. A function's 
bytecode should not go " +
+  "beyond 8KB, otherwise it will not be JITted; it also should not be 
too small, otherwise " +
+  "there will be many function calls.")
+.intConf
--- End diff --

Could you try some very long alias names or complex expressions? You will 
get different number, right? 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread yogeshg
Github user yogeshg commented on the issue:

https://github.com/apache/spark/pull/22912
  
In an offline discussion with @MrBago , we noted that there's at most as 
many (non-cancelled) `timerTasks` on the `timer` as there are slots. So, one 
thread for managing logging is probably fine, in fact if anything, we should 
also think about how we can just use the main thread. Also, this means that the 
`timer.purge()` call in the finally block is also `O(n + log c) \in 
O(constant)` where `n` is the total number of tasks and `c` is the number of 
cancelled tasks. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22912
  
**[Test build #98341 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98341/testReport)**
 for PR 22912 at commit 
[`2f2de0b`](https://github.com/apache/spark/commit/2f2de0bb381c6b6bed65f5371aa001dd84aff3fe).


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22912
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4687/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22912: [SPARK-25901][CORE] Use only one thread in BarrierTaskCo...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22912
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22917
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22917
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 

https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/4686/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22917
  
Merged build finished. Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22911: [SPARK-25815][k8s] Support kerberos in client mode, keyt...

2018-10-31 Thread vanzin
Github user vanzin commented on the issue:

https://github.com/apache/spark/pull/22911
  
It is actually not needed for client mode because only the driver needs the 
keytab.

But whether to store it in secrets is not a question. You either store it 
in a secret or you don't support the keytab/principal feature in Spark at all, 
and we can delete a bunch of code here.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22917
  
Test FAILed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98339/
Test FAILed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22917: [SPARK-25827][CORE] Encrypted blocks can be over 2GB.

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22917
  
**[Test build #98339 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98339/testReport)**
 for PR 22917 at commit 
[`5b432e7`](https://github.com/apache/spark/commit/5b432e7184130c0264d330f3068a80f1b3a4bb61).
 * This patch **fails Scala style tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22429
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98324/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22909: [SPARK-25897][k8s] Hook up k8s integration tests to sbt ...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22909
  
Test PASSed.
Refer to this link for build results (access rights to CI server needed): 
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/98327/
Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22429: [SPARK-25440][SQL] Dumping query execution info to a fil...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22429
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22909: [SPARK-25897][k8s] Hook up k8s integration tests to sbt ...

2018-10-31 Thread AmplabJenkins
Github user AmplabJenkins commented on the issue:

https://github.com/apache/spark/pull/22909
  
Merged build finished. Test PASSed.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark issue #22909: [SPARK-25897][k8s] Hook up k8s integration tests to sbt ...

2018-10-31 Thread SparkQA
Github user SparkQA commented on the issue:

https://github.com/apache/spark/pull/22909
  
**[Test build #98327 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/98327/testReport)**
 for PR 22909 at commit 
[`0d8c7f6`](https://github.com/apache/spark/commit/0d8c7f6d883818cbdd2b830d1829b505e8263e2a).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



  1   2   3   4   5   6   >