[GitHub] AmplabJenkins removed a comment on issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23272: [SPARK-26265][Core] Fix 
deadlock in BytesToBytesMap.MapIterator when locking both 
BytesToBytesMap.MapIterator and TaskMemoryManager
URL: https://github.com/apache/spark/pull/23272#issuecomment-445909870
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99915/
   Test PASSed.


This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] AmplabJenkins commented on issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23272: [SPARK-26265][Core] Fix deadlock in 
BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and 
TaskMemoryManager
URL: https://github.com/apache/spark/pull/23272#issuecomment-445909870
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99915/
   Test PASSed.





[GitHub] AmplabJenkins commented on issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23272: [SPARK-26265][Core] Fix deadlock in 
BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and 
TaskMemoryManager
URL: https://github.com/apache/spark/pull/23272#issuecomment-445909862
 
 
   Merged build finished. Test PASSed.





[GitHub] SparkQA removed a comment on issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

2018-12-10 Thread GitBox
SparkQA removed a comment on issue #23272: [SPARK-26265][Core] Fix deadlock in 
BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and 
TaskMemoryManager
URL: https://github.com/apache/spark/pull/23272#issuecomment-445815525
 
 
   **[Test build #99915 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99915/testReport)** for PR 23272 at commit [`9d52320`](https://github.com/apache/spark/commit/9d52320e24077a8c94639aad6b21a4af5d3e83d9).





[GitHub] SparkQA commented on issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

2018-12-10 Thread GitBox
SparkQA commented on issue #23272: [SPARK-26265][Core] Fix deadlock in 
BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and 
TaskMemoryManager
URL: https://github.com/apache/spark/pull/23272#issuecomment-445909073
 
 
   **[Test build #99915 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99915/testReport)** for PR 23272 at commit [`9d52320`](https://github.com/apache/spark/commit/9d52320e24077a8c94639aad6b21a4af5d3e83d9).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] AmplabJenkins removed a comment on issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23272: [SPARK-26265][Core] Fix 
deadlock in BytesToBytesMap.MapIterator when locking both 
BytesToBytesMap.MapIterator and TaskMemoryManager
URL: https://github.com/apache/spark/pull/23272#issuecomment-445906449
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99914/
   Test PASSed.





[GitHub] AmplabJenkins removed a comment on issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23272: [SPARK-26265][Core] Fix 
deadlock in BytesToBytesMap.MapIterator when locking both 
BytesToBytesMap.MapIterator and TaskMemoryManager
URL: https://github.com/apache/spark/pull/23272#issuecomment-445906442
 
 
   Merged build finished. Test PASSed.





[GitHub] AmplabJenkins commented on issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23272: [SPARK-26265][Core] Fix deadlock in 
BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and 
TaskMemoryManager
URL: https://github.com/apache/spark/pull/23272#issuecomment-445906442
 
 
   Merged build finished. Test PASSed.





[GitHub] AmplabJenkins commented on issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23272: [SPARK-26265][Core] Fix deadlock in 
BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and 
TaskMemoryManager
URL: https://github.com/apache/spark/pull/23272#issuecomment-445906449
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99914/
   Test PASSed.





[GitHub] SparkQA removed a comment on issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

2018-12-10 Thread GitBox
SparkQA removed a comment on issue #23272: [SPARK-26265][Core] Fix deadlock in 
BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and 
TaskMemoryManager
URL: https://github.com/apache/spark/pull/23272#issuecomment-445815520
 
 
   **[Test build #99914 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99914/testReport)** for PR 23272 at commit [`4c621d2`](https://github.com/apache/spark/commit/4c621d2bd36c50a10591d93ccd77bd7c0432a873).





[GitHub] SparkQA commented on issue #23272: [SPARK-26265][Core] Fix deadlock in BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and TaskMemoryManager

2018-12-10 Thread GitBox
SparkQA commented on issue #23272: [SPARK-26265][Core] Fix deadlock in 
BytesToBytesMap.MapIterator when locking both BytesToBytesMap.MapIterator and 
TaskMemoryManager
URL: https://github.com/apache/spark/pull/23272#issuecomment-445905638
 
 
   **[Test build #99914 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99914/testReport)** for PR 23272 at commit [`4c621d2`](https://github.com/apache/spark/commit/4c621d2bd36c50a10591d93ccd77bd7c0432a873).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] AmplabJenkins removed a comment on issue #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23278: [SPARK-24920][Core] Allow 
sharing Netty's memory pool allocators
URL: https://github.com/apache/spark/pull/23278#issuecomment-445905073
 
 
   Merged build finished. Test PASSed.





[GitHub] SparkQA commented on issue #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
SparkQA commented on issue #23278: [SPARK-24920][Core] Allow sharing Netty's 
memory pool allocators
URL: https://github.com/apache/spark/pull/23278#issuecomment-445905307
 
 
   **[Test build #99928 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99928/testReport)** for PR 23278 at commit [`f73bc8f`](https://github.com/apache/spark/commit/f73bc8fde7208c6256303c850c49ffbe22feda07).





[GitHub] AmplabJenkins removed a comment on issue #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23278: [SPARK-24920][Core] Allow 
sharing Netty's memory pool allocators
URL: https://github.com/apache/spark/pull/23278#issuecomment-445905081
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5934/
   Test PASSed.





[GitHub] AmplabJenkins commented on issue #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23278: [SPARK-24920][Core] Allow sharing 
Netty's memory pool allocators
URL: https://github.com/apache/spark/pull/23278#issuecomment-445905081
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5934/
   Test PASSed.





[GitHub] AmplabJenkins commented on issue #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23278: [SPARK-24920][Core] Allow sharing 
Netty's memory pool allocators
URL: https://github.com/apache/spark/pull/23278#issuecomment-445905073
 
 
   Merged build finished. Test PASSed.





[GitHub] AmplabJenkins removed a comment on issue #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23278: [SPARK-24920][Core] Allow 
sharing Netty's memory pool allocators
URL: https://github.com/apache/spark/pull/23278#issuecomment-445902818
 
 
   Can one of the admins verify this patch?





[GitHub] vanzin commented on issue #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
vanzin commented on issue #23278: [SPARK-24920][Core] Allow sharing Netty's 
memory pool allocators
URL: https://github.com/apache/spark/pull/23278#issuecomment-445903549
 
 
   add to whitelist





[GitHub] SparkQA commented on issue #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
SparkQA commented on issue #23278: [SPARK-24920][Core] Allow sharing Netty's 
memory pool allocators
URL: https://github.com/apache/spark/pull/23278#issuecomment-445903508
 
 
   **[Test build #99927 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99927/testReport)** for PR 23278 at commit [`f73bc8f`](https://github.com/apache/spark/commit/f73bc8fde7208c6256303c850c49ffbe22feda07).





[GitHub] AmplabJenkins removed a comment on issue #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23278: [SPARK-24920][Core] Allow 
sharing Netty's memory pool allocators
URL: https://github.com/apache/spark/pull/23278#issuecomment-445902681
 
 
   Can one of the admins verify this patch?





[GitHub] AmplabJenkins commented on issue #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23278: [SPARK-24920][Core] Allow sharing 
Netty's memory pool allocators
URL: https://github.com/apache/spark/pull/23278#issuecomment-445902681
 
 
   Can one of the admins verify this patch?





[GitHub] AmplabJenkins commented on issue #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23278: [SPARK-24920][Core] Allow sharing 
Netty's memory pool allocators
URL: https://github.com/apache/spark/pull/23278#issuecomment-445902818
 
 
   Can one of the admins verify this patch?





[GitHub] vanzin commented on a change in pull request #22904: [SPARK-25887][K8S] Configurable K8S context support

2018-12-10 Thread GitBox
vanzin commented on a change in pull request #22904: [SPARK-25887][K8S] 
Configurable K8S context support
URL: https://github.com/apache/spark/pull/22904#discussion_r240308164
 
 

 ##
 File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ##
 @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" +
+  s" from users K8S config file")
+
+// Start from an auto-configured config with the desired context
+// Fabric 8 uses null to indicate that the users current context should be used so if no
+// explicit setting pass null
+val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null)))
 
 Review comment:
   > What does client mode mean to you?
   
   Client mode means that the driver process / container is not started by 
Spark. It's started directly by the user.
   
   > Also - how should one interpret this paragraph in the docs?
   
   I have no idea.
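
   For reference, the context selection in the quoted hunk can be sketched on its own. This is a minimal, hypothetical sketch: `normalize`, `describeContext` and `contextArg` are illustrative names that do not appear in the PR. It assumes the log message is meant to name the context only when one is configured; as quoted, the `kubeContext.isEmpty` branch calls `.get` on an empty `Option`, so that condition looks inverted.

```scala
// Hypothetical helpers for illustration only; not taken from the PR.
object KubeContextSketch {
  // Treat null or blank strings as "no explicit context configured".
  def normalize(raw: Option[String]): Option[String] =
    raw.filter(c => c != null && c.trim.nonEmpty)

  // Name the context only when one is actually set.
  def describeContext(kubeContext: Option[String]): String =
    kubeContext.map(c => s"context $c").getOrElse("current context")

  // The quoted code comment says null means "use the user's current context",
  // so an empty Option is passed through as null.
  def contextArg(kubeContext: Option[String]): String = kubeContext.orNull
}
```

   With this shape, `describeContext(None)` yields "current context" and never dereferences an empty `Option`.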





[GitHub] vanzin commented on a change in pull request #22904: [SPARK-25887][K8S] Configurable K8S context support

2018-12-10 Thread GitBox
vanzin commented on a change in pull request #22904: [SPARK-25887][K8S] 
Configurable K8S context support
URL: https://github.com/apache/spark/pull/22904#discussion_r240308164
 
 

 ##
 File path: resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/SparkKubernetesClientFactory.scala
 ##
 @@ -67,8 +66,16 @@ private[spark] object SparkKubernetesClientFactory {
 val dispatcher = new Dispatcher(
   ThreadUtils.newDaemonCachedThreadPool("kubernetes-dispatcher"))
 
-// TODO [SPARK-25887] Create builder in a way that respects configurable context
-val config = new ConfigBuilder()
+// Allow for specifying a context used to auto-configure from the users K8S config file
+val kubeContext = sparkConf.get(KUBERNETES_CONTEXT).filter(c => StringUtils.isNotBlank(c))
+logInfo(s"Auto-configuring K8S client using " +
+  s"${if (kubeContext.isEmpty) s"context ${kubeContext.get}" else "current context"}" +
+  s" from users K8S config file")
+
+// Start from an auto-configured config with the desired context
+// Fabric 8 uses null to indicate that the users current context should be used so if no
+// explicit setting pass null
+val config = new ConfigBuilder(autoConfigure(kubeContext.getOrElse(null)))
 
 Review comment:
   > What does client mode mean to you?
   
   Client mode means that the driver is not started by Spark. It's started 
directly by the user.
   
   > Also - how should one interpret this paragraph in the docs?
   
   I have no idea.





[GitHub] AmplabJenkins removed a comment on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23068: [SPARK-26098][WebUI] Show 
associated SQL query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-445901339
 
 
   Merged build finished. Test PASSed.





[GitHub] AmplabJenkins removed a comment on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23068: [SPARK-26098][WebUI] Show 
associated SQL query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-445901351
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5933/
   Test PASSed.





[GitHub] AmplabJenkins commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23068: [SPARK-26098][WebUI] Show associated 
SQL query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-445901351
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5933/
   Test PASSed.





[GitHub] attilapiros opened a new pull request #23278: [SPARK-24920][Core] Allow sharing Netty's memory pool allocators

2018-12-10 Thread GitBox
attilapiros opened a new pull request #23278: [SPARK-24920][Core] Allow sharing 
Netty's memory pool allocators
URL: https://github.com/apache/spark/pull/23278
 
 
   ## What changes were proposed in this pull request?
   
   Introducing shared pooled ByteBuf allocators. 
   This feature can be enabled via the "spark.network.sharedByteBufAllocators" 
configuration. 
   
   When it is on, only two pooled ByteBuf allocators are created: 
   - one for transport servers where caching is allowed and
   - one for transport clients where caching is disabled
   
   This way the cache allowance remains as before. 
   Both shareable pools are created with the numCores parameter set to 0 (which 
defaults to the available processors), as conf.serverThreads() and 
conf.clientThreads() are module dependent and the lazy creation of these 
allocators would lead to unpredictable behaviour.
   
   When "spark.network.sharedByteBufAllocators" is false, a new allocator 
is created for every transport client and server separately, as before this 
PR.
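
   The sharing scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `Allocator` is a hypothetical stand-in for a pooled ByteBuf allocator (it is not Netty's class), and `allocatorFor` and its parameters are made-up names.

```scala
// Hypothetical stand-in for a pooled ByteBuf allocator; not Netty's class.
final class Allocator(val allowCache: Boolean, val numCores: Int)

object AllocatorSketch {
  // numCores = 0 mirrors the description: the pool size is left to default
  // to the available processors instead of a module-dependent thread count.
  private lazy val sharedServer = new Allocator(allowCache = true, numCores = 0)
  private lazy val sharedClient = new Allocator(allowCache = false, numCores = 0)

  // shared = true models "spark.network.sharedByteBufAllocators" being on.
  def allocatorFor(isServer: Boolean, shared: Boolean, threads: Int): Allocator =
    if (shared) {
      // Exactly two instances exist: one for servers, one for clients.
      if (isServer) sharedServer else sharedClient
    } else {
      // A fresh allocator per transport client or server.
      new Allocator(allowCache = isServer, numCores = threads)
    }
}
```

   In this sketch the caching rule (allowed for servers, disabled for clients) is the same on both paths, which is what keeps the cache allowance unchanged.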
   
   ## How was this patch tested?
   
   Existing unit tests.
   





[GitHub] AmplabJenkins commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23068: [SPARK-26098][WebUI] Show associated 
SQL query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-445901339
 
 
   Merged build finished. Test PASSed.





[GitHub] gatorsmile commented on a change in pull request #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-10 Thread GitBox
gatorsmile commented on a change in pull request #23068: [SPARK-26098][WebUI] 
Show associated SQL query in Job page
URL: https://github.com/apache/spark/pull/23068#discussion_r240306597
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
 ##
 @@ -56,6 +56,11 @@ private[spark] class AppStatusStore(
 store.read(classOf[JobDataWrapper], jobId).info
   }
 
+  def jobWithAssociatedSql(jobId: Int): (v1.JobData, Option[Long]) = {
 
 Review comment:
   Add a function description above this line.
   






[GitHub] gatorsmile commented on a change in pull request #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-10 Thread GitBox
gatorsmile commented on a change in pull request #23068: [SPARK-26098][WebUI] 
Show associated SQL query in Job page
URL: https://github.com/apache/spark/pull/23068#discussion_r240305948
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/status/AppStatusStore.scala
 ##
 @@ -56,6 +56,11 @@ private[spark] class AppStatusStore(
 store.read(classOf[JobDataWrapper], jobId).info
   }
 
+  def jobWithAssociatedSql(jobId: Int): (v1.JobData, Option[Long]) = {
 
 Review comment:
   Add a function description above this line. 





[GitHub] gatorsmile commented on a change in pull request #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-10 Thread GitBox
gatorsmile commented on a change in pull request #23068: [SPARK-26098][WebUI] 
Show associated SQL query in Job page
URL: https://github.com/apache/spark/pull/23068#discussion_r240305749
 
 

 ##
 File path: core/src/main/scala/org/apache/spark/ui/jobs/JobPage.scala
 ##
 @@ -189,14 +189,19 @@ private[ui] class JobPage(parent: JobsTab, store: AppStatusStore) extends WebUIP
 require(parameterId != null && parameterId.nonEmpty, "Missing id parameter")
 
 val jobId = parameterId.toInt
-val jobData = store.asOption(store.job(jobId)).getOrElse {
+val (jobData, sqlExecutionId) = store.asOption(store.jobWithAssociatedSql(jobId)).getOrElse {
   val content =
 
   No information to display for job {jobId}
 
   return UIUtils.headerSparkPage(
 request, s"Details for Job $jobId", content, parent)
 }
+val sqlDetailUrl = sqlExecutionId.map { id =>
 
 Review comment:
   Add a code comment to explain when `sqlExecutionId` can be None.
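
The diff above is cut off right after `sqlExecutionId.map { id =>`, so the following is only a sketch of the general `Option` pattern it appears to use, not the code from the pull request. The object name `SqlLinkSketch`, the `basePath` parameter, and the `/SQL/execution/?id=` path segment are assumptions made for illustration. The sketch does not say when the id is absent; it only shows that no link is produced in that case.

```scala
object SqlLinkSketch {
  // Hypothetical helper: the URL layout below is an assumption,
  // not taken from the quoted diff.
  def sqlDetailUrl(basePath: String, sqlExecutionId: Option[Long]): Option[String] =
    // map runs only for Some(id); a None input stays None,
    // so callers can skip rendering the link entirely.
    sqlExecutionId.map { id => s"$basePath/SQL/execution/?id=$id" }
}
```

A caller would typically render the link with `foreach` or `map` on the result and render nothing for `None`.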





[GitHub] SparkQA commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-10 Thread GitBox
SparkQA commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL 
query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-445899666
 
 
   **[Test build #99926 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99926/testReport)** for PR 23068 at commit [`e7c2ebb`](https://github.com/apache/spark/commit/e7c2ebbda949918034cb9cb92ac6ef30af17d943).





[GitHub] gatorsmile commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL query in Job page

2018-12-10 Thread GitBox
gatorsmile commented on issue #23068: [SPARK-26098][WebUI] Show associated SQL 
query in Job page
URL: https://github.com/apache/spark/pull/23068#issuecomment-445899184
 
 
   retest this please





[GitHub] AmplabJenkins commented on issue #21881: [SPARK-24930][SQL] Improve exception information when using LOAD DATA LOCAL INPATH

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #21881: [SPARK-24930][SQL] Improve exception 
information when using LOAD DATA LOCAL INPATH
URL: https://github.com/apache/spark/pull/21881#issuecomment-445898969
 
 
   Can one of the admins verify this patch?





[GitHub] AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] 
Add test to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445898081
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99925/
   Test FAILed.





[GitHub] SparkQA removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
SparkQA removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] Add test 
to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445889277
 
 
   **[Test build #99925 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99925/testReport)** for PR 22273 at commit [`8574291`](https://github.com/apache/spark/commit/8574291a0b84574626ca213bc6f95dc0db73b0ef).





[GitHub] AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] 
Add test to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445897956
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99924/
   Test PASSed.





[GitHub] AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] 
Add test to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445898071
 
 
   Merged build finished. Test FAILed.





[GitHub] AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] 
Add test to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445897948
 
 
   Merged build finished. Test PASSed.





[GitHub] SparkQA removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
SparkQA removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] Add test 
to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445887183
 
 
   **[Test build #99924 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99924/testReport)** for PR 22273 at commit [`8574291`](https://github.com/apache/spark/commit/8574291a0b84574626ca213bc6f95dc0db73b0ef).





[GitHub] AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test 
to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445898081
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99925/
   Test FAILed.





[GitHub] AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test 
to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445898071
 
 
   Merged build finished. Test FAILed.





[GitHub] SparkQA commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
SparkQA commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to 
better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445897833
 
 
   **[Test build #99925 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99925/testReport)** for PR 22273 at commit [`8574291`](https://github.com/apache/spark/commit/8574291a0b84574626ca213bc6f95dc0db73b0ef).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class HaveArrowTests(unittest.TestCase):`





[GitHub] AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test 
to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445897956
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99924/
   Test PASSed.





[GitHub] AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test 
to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445897948
 
 
   Merged build finished. Test PASSed.





[GitHub] SparkQA commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
SparkQA commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to 
better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445897623
 
 
   **[Test build #99924 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99924/testReport)** for PR 22273 at commit [`8574291`](https://github.com/apache/spark/commit/8574291a0b84574626ca213bc6f95dc0db73b0ef).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds the following public classes _(experimental)_:
 * `class HaveArrowTests(unittest.TestCase):`





[GitHub] AmplabJenkins removed a comment on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445896751
 
 
   Merged build finished. Test PASSed.





[GitHub] AmplabJenkins removed a comment on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445896763
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99913/
   Test PASSed.





[GitHub] rezasafi commented on issue #22612: [SPARK-24958][CORE] Add memory from procfs to executor metrics.

2018-12-10 Thread GitBox
rezasafi commented on issue #22612: [SPARK-24958][CORE] Add memory from procfs 
to executor metrics.
URL: https://github.com/apache/spark/pull/22612#issuecomment-445897006
 
 
   Thank you very much @squito 





[GitHub] AmplabJenkins commented on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445896763
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99913/
   Test PASSed.





[GitHub] AmplabJenkins commented on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445896751
 
 
   Merged build finished. Test PASSed.





[GitHub] squito commented on issue #22612: [SPARK-24958][CORE] Add memory from procfs to executor metrics.

2018-12-10 Thread GitBox
squito commented on issue #22612: [SPARK-24958][CORE] Add memory from procfs to 
executor metrics.
URL: https://github.com/apache/spark/pull/22612#issuecomment-445895902
 
 
   merged to master, thanks @rezasafi 





[GitHub] SparkQA removed a comment on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
SparkQA removed a comment on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445815532
 
 
   **[Test build #99913 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99913/testReport)** for PR 23262 at commit [`9758534`](https://github.com/apache/spark/commit/9758534ef28109df25d4ef9155c54f09ac58a45c).





[GitHub] SparkQA commented on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
SparkQA commented on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445895749
 
 
   **[Test build #99913 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99913/testReport)** for PR 23262 at commit [`9758534`](https://github.com/apache/spark/commit/9758534ef28109df25d4ef9155c54f09ac58a45c).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] srowen closed pull request #23048: transform DenseVector x DenseVector sqdist from imperativ to function…

2018-12-10 Thread GitBox
srowen closed pull request #23048: transform DenseVector x DenseVector sqdist 
from imperativ to function…
URL: https://github.com/apache/spark/pull/23048
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala b/mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala
index 6e950f968a65d..42364fe132dd5 100644
--- a/mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala
+++ b/mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala
@@ -370,14 +370,19 @@ object Vectors {
   case (v1: DenseVector, v2: SparseVector) =>
 squaredDistance = sqdist(v2, v1)
 
-  case (DenseVector(vv1), DenseVector(vv2)) =>
-var kv = 0
+  case (DenseVector(vv1), DenseVector(vv2)) => {
 val sz = vv1.length
-while (kv < sz) {
-  val score = vv1(kv) - vv2(kv)
-  squaredDistance += score * score
-  kv += 1
+@annotation.tailrec
+def go(d: Double, kv: Int): Double = {
+  if (kv < sz) {
+val score = vv1(kv) - vv2(kv)
+go(d + score * score, kv + 1)
+  }
+  else d
 }
+go(0D, 0)
+  }
+
   case _ =>
 throw new IllegalArgumentException("Do not support vector type " + v1.getClass +
   " and " + v2.getClass)

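For comparison, the two forms in the diff above can be written as a standalone sketch. This is not the code from the pull request: the object name `SqdistSketch` and the method names `sqdistLoop` and `sqdistRec` are hypothetical, and plain `Array[Double]` values stand in for the arrays unpacked from `DenseVector`. The excerpt does not show how the result of `go(0D, 0)` reaches `squaredDistance`, so the sketch simply returns it.

```scala
import scala.annotation.tailrec

object SqdistSketch {
  // Imperative form, mirroring the lines removed in the diff.
  def sqdistLoop(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    var squaredDistance = 0.0
    var kv = 0
    val sz = a.length
    while (kv < sz) {
      val score = a(kv) - b(kv)
      squaredDistance += score * score
      kv += 1
    }
    squaredDistance
  }

  // Tail-recursive form, mirroring the lines added in the diff.
  def sqdistRec(a: Array[Double], b: Array[Double]): Double = {
    require(a.length == b.length, "vectors must have the same length")
    val sz = a.length
    // @tailrec makes the compiler reject the method if the
    // recursive call is not in tail position.
    @tailrec
    def go(d: Double, kv: Int): Double =
      if (kv < sz) {
        val score = a(kv) - b(kv)
        go(d + score * score, kv + 1)
      } else d
    go(0.0, 0)
  }
}
```

Both methods accumulate the same terms in the same order, so they should return identical results for the same inputs.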

 





[GitHub] AmplabJenkins removed a comment on issue #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23249: [SPARK-26297][SQL] improve the 
doc of Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#issuecomment-445894455
 
 
   Merged build finished. Test PASSed.





[GitHub] hvanhovell commented on a change in pull request #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-10 Thread GitBox
hvanhovell commented on a change in pull request #23249: [SPARK-26297][SQL] 
improve the doc of Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#discussion_r240300150
 
 

 ##
 File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
 ##
 @@ -241,12 +240,12 @@ case class HashPartitioning(expressions: Seq[Expression], numPartitions: Int)
 
 /**
  * Represents a partitioning where rows are split across partitions based on some total ordering of
- * the expressions specified in `ordering`.  When data is partitioned in this manner the following
- * two conditions are guaranteed to hold:
- *  - All row where the expressions in `ordering` evaluate to the same values will be in the same
- *partition.
- *  - Each partition will have a `min` and `max` row, relative to the given ordering.  All rows
- *that are in between `min` and `max` in this `ordering` will reside in this partition.
+ * the expressions specified in `ordering`.  When data is partitioned in this manner, it guarantees:
+ *   - Given any 2 adjacent partitions, all the rows of the second partition must be larger than
 
 Review comment:
   Nit: don't use bullets if you have only one of them.
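
The guarantee in the added doc line can be illustrated with a small check. The quoted sentence is cut off by the diff context, so this sketch assumes it continues with "the rows of the first partition" and that the comparison is strict; both are assumptions. The object name `RangePartitionCheck`, the method name, and the use of `Seq[Seq[T]]` to stand in for partitions are hypothetical and unrelated to Spark's own classes.

```scala
object RangePartitionCheck {
  // Returns true when every row of each partition is strictly larger
  // than every row of the partition before it (assumed reading of the doc).
  def adjacentPartitionsOrdered[T](partitions: Seq[Seq[T]])(implicit ord: Ordering[T]): Boolean = {
    // Empty partitions impose no constraint, so they are skipped.
    val nonEmpty = partitions.filter(_.nonEmpty)
    // Comparing the max of one partition with the min of the next
    // is enough to cover all pairs of rows in the two partitions.
    nonEmpty.zip(nonEmpty.drop(1)).forall { case (first, second) =>
      ord.lt(first.max, second.min)
    }
  }
}
```

For example, `Seq(Seq(1, 2), Seq(3, 5), Seq(8))` satisfies the check, while `Seq(Seq(1, 4), Seq(3, 5))` does not because 4 is larger than 3.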





[GitHub] AmplabJenkins removed a comment on issue #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23249: [SPARK-26297][SQL] improve the 
doc of Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#issuecomment-445894461
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99912/
   Test PASSed.





[GitHub] AmplabJenkins commented on issue #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23249: [SPARK-26297][SQL] improve the doc of 
Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#issuecomment-445894461
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99912/
   Test PASSed.





[GitHub] AmplabJenkins commented on issue #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23249: [SPARK-26297][SQL] improve the doc of 
Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#issuecomment-445894455
 
 
   Merged build finished. Test PASSed.





[GitHub] AmplabJenkins removed a comment on issue #23277: [SPARK-26327][SQL] Metrics in FileSourceScanExec not update correctly

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23277: [SPARK-26327][SQL] Metrics in 
FileSourceScanExec not update correctly
URL: https://github.com/apache/spark/pull/23277#issuecomment-445893199
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99922/
   Test FAILed.





[GitHub] SparkQA removed a comment on issue #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-10 Thread GitBox
SparkQA removed a comment on issue #23249: [SPARK-26297][SQL] improve the doc 
of Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#issuecomment-445815528
 
 
   **[Test build #99912 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99912/testReport)**
 for PR 23249 at commit 
[`adfcec4`](https://github.com/apache/spark/commit/adfcec41adbffbef2e33fb85db5ad48eba5f3d71).





[GitHub] AmplabJenkins removed a comment on issue #23277: [SPARK-26327][SQL] Metrics in FileSourceScanExec not update correctly

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23277: [SPARK-26327][SQL] Metrics in 
FileSourceScanExec not update correctly
URL: https://github.com/apache/spark/pull/23277#issuecomment-445893192
 
 
   Merged build finished. Test FAILed.





[GitHub] hvanhovell commented on a change in pull request #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-10 Thread GitBox
hvanhovell commented on a change in pull request #23249: [SPARK-26297][SQL] 
improve the doc of Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#discussion_r240299365
 
 

 ##
 File path: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala
 ##
 @@ -118,10 +115,12 @@ case class HashClusteredDistribution(
 
 /**
  * Represents data where tuples have been ordered according to the `ordering`
- * [[Expression Expressions]].  This is a strictly stronger guarantee than
- * [[ClusteredDistribution]] as an ordering will ensure that tuples that share the
- * same value for the ordering expressions are contiguous and will never be split across
- * partitions.
+ * [[Expression Expressions]]. Its requirement is defined as the following:
+ *   - Given any 2 adjacent partitions, all the rows of the second partition must be larger than or
+ * equal to any row in the first partition, according to the `ordering` expressions.
 
 Review comment:
   Global sort (actually the `RangePartitioner`) currently guarantees that all 
rows in partition `p + 1` are larger than the rows in partition `p`. I don't 
think we should relax this; besides collect limit, there aren't any use cases I 
can think of that could work with this relaxed requirement.
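
To make the difference between the two guarantees concrete, here is a minimal sketch in plain Scala with no Spark dependency. Partitions are modeled as `Seq[Seq[Int]]`; the names `strictlyOrdered` and `relaxedOrdered` are hypothetical helpers for illustration only and are not part of the Spark API. Skipping empty partitions is also an assumption of this sketch.

```scala
object AdjacentPartitionOrdering {
  // Drop empty partitions so every remaining partition has a min and a max,
  // then pair each partition with its successor.
  private def adjacentPairs(parts: Seq[Seq[Int]]): Seq[(Seq[Int], Seq[Int])] = {
    val nonEmpty = parts.filter(_.nonEmpty)
    nonEmpty.zip(nonEmpty.drop(1))
  }

  // Strict form: every row in partition p + 1 is larger than every row in partition p.
  def strictlyOrdered(parts: Seq[Seq[Int]]): Boolean =
    adjacentPairs(parts).forall { case (p, next) => p.max < next.min }

  // Relaxed form from the proposed doc text: rows in partition p + 1 are
  // larger than or equal to rows in partition p.
  def relaxedOrdered(parts: Seq[Seq[Int]]): Boolean =
    adjacentPairs(parts).forall { case (p, next) => p.max <= next.min }
}
```

For `Seq(Seq(1, 2), Seq(2, 3))` the value 2 is split across two partitions, so the relaxed check accepts the layout while the strict check rejects it.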





[GitHub] srowen closed pull request #22723: [SPARK-25729][CORE]It is better to replace `minPartitions` with `defaultParallelism` , when `minPartitions` is less than `defaultParallelism`

2018-12-10 Thread GitBox
srowen closed pull request #22723: [SPARK-25729][CORE]It is better to replace 
`minPartitions` with `defaultParallelism` , when `minPartitions` is less than 
`defaultParallelism`
URL: https://github.com/apache/spark/pull/22723
 
 
   

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

As this is a foreign pull request (from a fork), the diff is supplied
below (as it won't show otherwise due to GitHub magic):

diff --git a/core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala b/core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala
index 04c5c4b90e8a1..9400879f27048 100644
--- a/core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala
+++ b/core/src/main/scala/org/apache/spark/input/WholeTextFileInputFormat.scala
@@ -46,13 +46,15 @@ private[spark] class WholeTextFileInputFormat
 
   /**
* Allow minPartitions set by end-user in order to keep compatibility with old Hadoop API,
-   * which is set through setMaxSplitSize
+   * which is set through setMaxSplitSize. But when minPartitions is less than defaultParallelism,
+   * it is better to replace minPartitions with defaultParallelism, because this can improve
+   * parallelism.
*/
-  def setMinPartitions(context: JobContext, minPartitions: Int) {
+  def setMinPartitions(defaultParallelism: Int, context: JobContext, minPartitions: Int) {
 val files = listStatus(context).asScala
 val totalLen = files.map(file => if (file.isDirectory) 0L else file.getLen).sum
-val maxSplitSize = Math.ceil(totalLen * 1.0 /
-  (if (minPartitions == 0) 1 else minPartitions)).toLong
+val minPartNum = Math.max(defaultParallelism, minPartitions)
+val maxSplitSize = Math.ceil(totalLen * 1.0 / minPartNum).toLong
 
 // For small files we need to ensure the min split size per node & rack <= maxSplitSize
 val config = context.getConfiguration
diff --git a/core/src/main/scala/org/apache/spark/rdd/WholeTextFileRDD.scala b/core/src/main/scala/org/apache/spark/rdd/WholeTextFileRDD.scala
index 9f3d0745c33c9..6377b677ed10c 100644
--- a/core/src/main/scala/org/apache/spark/rdd/WholeTextFileRDD.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/WholeTextFileRDD.scala
@@ -30,7 +30,7 @@ import org.apache.spark.input.WholeTextFileInputFormat
  * An RDD that reads a bunch of text files in, and each text file becomes one record.
  */
 private[spark] class WholeTextFileRDD(
-sc : SparkContext,
+@transient private val sc: SparkContext,
 inputFormatClass: Class[_ <: WholeTextFileInputFormat],
 keyClass: Class[Text],
 valueClass: Class[Text],
@@ -51,7 +51,7 @@ private[spark] class WholeTextFileRDD(
   case _ =>
 }
 val jobContext = new JobContextImpl(conf, jobId)
-inputFormat.setMinPartitions(jobContext, minPartitions)
+inputFormat.setMinPartitions(sc.defaultParallelism, jobContext, minPartitions)
 val rawSplits = inputFormat.getSplits(jobContext).toArray
 val result = new Array[Partition](rawSplits.size)
 for (i <- 0 until rawSplits.size) {
diff --git a/core/src/test/scala/org/apache/spark/input/WholeTextFileInputFormatSuite.scala b/core/src/test/scala/org/apache/spark/input/WholeTextFileInputFormatSuite.scala
index 817dc082b7d38..531ac936a4d5d 100644
--- a/core/src/test/scala/org/apache/spark/input/WholeTextFileInputFormatSuite.scala
+++ b/core/src/test/scala/org/apache/spark/input/WholeTextFileInputFormatSuite.scala
@@ -38,7 +38,7 @@ class WholeTextFileInputFormatSuite extends SparkFunSuite with BeforeAndAfterAll
   override def beforeAll() {
 super.beforeAll()
 val conf = new SparkConf()
-sc = new SparkContext("local", "test", conf)
+sc = new SparkContext("local[2]", "test", conf)
   }
 
   override def afterAll() {
@@ -79,6 +79,22 @@ class WholeTextFileInputFormatSuite extends SparkFunSuite with BeforeAndAfterAll
   Utils.deleteRecursively(dir)
 }
   }
+
+  test("Test the number of partitions for WholeTextFileRDD") {
+var dir: File = null
+try {
+  dir = Utils.createTempDir()
+  WholeTextFileInputFormatSuite.files.foreach { case (filename, contents) =>
+createNativeFile(dir, filename, contents, true)
+  }
+  // set `minPartitions = 1`
+  val rdd = sc.wholeTextFiles(dir.toString, 1)
+  // The number of partitions is equal to 2, not equal to 1, because the defaultParallelism is 2
+  assert(rdd.getNumPartitions === 2)
+} finally {
+  Utils.deleteRecursively(dir)
+}
+  }
 }
 
 /**
@@ -88,7 +104,7 @@ object WholeTextFileInputFormatSuite {
   private val testWords: IndexedSeq[Byte] = "Spark is easy to use.\n".map(_.toByte)
 
   private val fileNames = Array("part-0", "part-1", "part-2")
-  private val fileLengths = Array(10, 100, 1000)
+  private val fileLengths = Array(10, 100, 100)
 
   private val files = fileLengths.zip(fileNames).map { case (upperBound, 
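
The split-size arithmetic changed by this diff can be sketched on its own. The following is a standalone illustration in plain Scala, not the Spark code itself: the object name `SplitSizeSketch` is hypothetical, and it assumes `defaultParallelism` is at least 1, since the patched code no longer guards against a zero divisor the way the old `minPartitions == 0` check did.

```scala
object SplitSizeSketch {
  // Mirrors the patched computation: the effective partition count is the
  // larger of defaultParallelism and minPartitions, and the maximum split
  // size is the total input length divided by that count, rounded up.
  // Assumption: defaultParallelism >= 1, so the divisor is never zero.
  def maxSplitSize(totalLen: Long, defaultParallelism: Int, minPartitions: Int): Long = {
    val minPartNum = math.max(defaultParallelism, minPartitions)
    math.ceil(totalLen * 1.0 / minPartNum).toLong
  }
}
```

With the test suite's file lengths of 10, 100 and 100 bytes (210 in total), `defaultParallelism = 2` and `minPartitions = 1`, the divisor becomes 2 and the maximum split size is 105 bytes, which is consistent with the test expecting two partitions rather than one.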

[GitHub] srowen commented on a change in pull request #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
srowen commented on a change in pull request #23275: [SPARK-26323][SQL] Scala 
UDF should still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#discussion_r240298939
 
 

 ##
 File path: 
sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala
 ##
 @@ -88,68 +88,49 @@ sealed trait UserDefinedFunction {
 private[sql] case class SparkUserDefinedFunction(
 f: AnyRef,
 dataType: DataType,
-inputTypes: Option[Seq[DataType]],
-nullableTypes: Option[Seq[Boolean]],
+inputSchemas: Seq[Option[ScalaReflection.Schema]],
 name: Option[String] = None,
 nullable: Boolean = true,
 deterministic: Boolean = true) extends UserDefinedFunction {
 
   @scala.annotation.varargs
-  override def apply(exprs: Column*): Column = {
-// TODO: make sure this class is only instantiated through `SparkUserDefinedFunction.create()`
-// and `nullableTypes` is always set.
-if (inputTypes.isDefined) {
-  assert(inputTypes.get.length == nullableTypes.get.length)
-}
-
-val inputsNullSafe = nullableTypes.getOrElse {
-  ScalaReflection.getParameterTypeNullability(f)
-}
+  override def apply(cols: Column*): Column = {
+Column(createScalaUDF(cols.map(_.expr)))
+  }
 
-Column(ScalaUDF(
+  private[sql] def createScalaUDF(exprs: Seq[Expression]): ScalaUDF = {
+// It's possible that some of the inputs don't have a specific type(e.g. `Any`),  skip type
+// check and null check for them.
+val inputTypes = inputSchemas.map(_.map(_.dataType).getOrElse(AnyDataType))
+val inputsNullSafe = inputSchemas.map(_.map(_.nullable).getOrElse(true))
 
 Review comment:
   Ah right. I'm neutral on whether it's clearer than getOrElse; I think we end 
up using the latter in the code more. I know IJ suggests forall though.
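
For reference, the two spellings being compared behave identically on an `Option`. Below is a minimal sketch in plain Scala; the `Schema` case class here is a hypothetical stand-in with a `String` data type, not Spark's `ScalaReflection.Schema`, and the helper names are for illustration only.

```scala
object OptionForallSketch {
  // Hypothetical stand-in for a schema entry: a type name plus nullability.
  final case class Schema(dataType: String, nullable: Boolean)

  // Spelling used in the diff: an unknown schema (None) is treated as nullable.
  def viaGetOrElse(schema: Option[Schema]): Boolean =
    schema.map(_.nullable).getOrElse(true)

  // Equivalent spelling: forall is true for None and applies the predicate for Some.
  def viaForall(schema: Option[Schema]): Boolean =
    schema.forall(_.nullable)
}
```

Both helpers return `true` for `None`, so the choice between them is purely one of readability.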





[GitHub] AmplabJenkins commented on issue #23277: [SPARK-26327][SQL] Metrics in FileSourceScanExec not update correctly

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23277: [SPARK-26327][SQL] Metrics in 
FileSourceScanExec not update correctly
URL: https://github.com/apache/spark/pull/23277#issuecomment-445893199
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99922/
   Test FAILed.





[GitHub] SparkQA commented on issue #23249: [SPARK-26297][SQL] improve the doc of Distribution/Partitioning

2018-12-10 Thread GitBox
SparkQA commented on issue #23249: [SPARK-26297][SQL] improve the doc of 
Distribution/Partitioning
URL: https://github.com/apache/spark/pull/23249#issuecomment-445893422
 
 
   **[Test build #99912 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99912/testReport)**
 for PR 23249 at commit 
[`adfcec4`](https://github.com/apache/spark/commit/adfcec41adbffbef2e33fb85db5ad48eba5f3d71).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] AmplabJenkins commented on issue #23277: [SPARK-26327][SQL] Metrics in FileSourceScanExec not update correctly

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23277: [SPARK-26327][SQL] Metrics in 
FileSourceScanExec not update correctly
URL: https://github.com/apache/spark/pull/23277#issuecomment-445893192
 
 
   Merged build finished. Test FAILed.





[GitHub] SparkQA removed a comment on issue #23277: [SPARK-26327][SQL] Metrics in FileSourceScanExec not update correctly

2018-12-10 Thread GitBox
SparkQA removed a comment on issue #23277: [SPARK-26327][SQL] Metrics in 
FileSourceScanExec not update correctly
URL: https://github.com/apache/spark/pull/23277#issuecomment-445854780
 
 
   **[Test build #99922 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99922/testReport)**
 for PR 23277 at commit 
[`0e00aa7`](https://github.com/apache/spark/commit/0e00aa7a219805f3d14ca4d222df4a922a34d825).





[GitHub] SparkQA commented on issue #23277: [SPARK-26327][SQL] Metrics in FileSourceScanExec not update correctly

2018-12-10 Thread GitBox
SparkQA commented on issue #23277: [SPARK-26327][SQL] Metrics in 
FileSourceScanExec not update correctly
URL: https://github.com/apache/spark/pull/23277#issuecomment-445892950
 
 
   **[Test build #99922 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99922/testReport)**
 for PR 23277 at commit 
[`0e00aa7`](https://github.com/apache/spark/commit/0e00aa7a219805f3d14ca4d222df4a922a34d825).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] AmplabJenkins commented on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445891136
 
 
   Merged build finished. Test PASSed.





[GitHub] AmplabJenkins removed a comment on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445891136
 
 
   Merged build finished. Test PASSed.





[GitHub] AmplabJenkins removed a comment on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445891152
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99910/
   Test PASSed.





[GitHub] AmplabJenkins commented on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445891152
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99910/
   Test PASSed.





[GitHub] SparkQA commented on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
SparkQA commented on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445890454
 
 
   **[Test build #99910 has 
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99910/testReport)**
 for PR 23262 at commit 
[`56cf4e5`](https://github.com/apache/spark/commit/56cf4e5f079c6ddd36de197eecc3b51393a5859b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] SparkQA removed a comment on issue #23262: [SPARK-26312][SQL]Replace RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance

2018-12-10 Thread GitBox
SparkQA removed a comment on issue #23262: [SPARK-26312][SQL]Replace 
RDDConversions.rowToRowRdd with RowEncoder to improve its conversion performance
URL: https://github.com/apache/spark/pull/23262#issuecomment-445815524
 
 
   **[Test build #99910 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99910/testReport)**
 for PR 23262 at commit 
[`56cf4e5`](https://github.com/apache/spark/commit/56cf4e5f079c6ddd36de197eecc3b51393a5859b).





[GitHub] AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] 
Add test to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445889234
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5932/
   Test PASSed.





[GitHub] AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #22273: [SPARK-25272][PYTHON][TEST] 
Add test to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445889224
 
 
   Merged build finished. Test PASSed.





[GitHub] SparkQA commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
SparkQA commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to 
better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445889277
 
 
   **[Test build #99925 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99925/testReport)**
 for PR 22273 at commit 
[`8574291`](https://github.com/apache/spark/commit/8574291a0b84574626ca213bc6f95dc0db73b0ef).





[GitHub] AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test 
to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445889234
 
 
   Test PASSed.
   Refer to this link for build results (access rights to CI server needed): 
   
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/5932/
   Test PASSed.





[GitHub] AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test 
to better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445889224
 
 
   Merged build finished. Test PASSed.





[GitHub] SparkQA commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
SparkQA commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to 
better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445887183
 
 
   **[Test build #99924 has 
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99924/testReport)**
 for PR 22273 at commit 
[`8574291`](https://github.com/apache/spark/commit/8574291a0b84574626ca213bc6f95dc0db73b0ef).





[GitHub] BryanCutler commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
BryanCutler commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to 
better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445887027
 
 
   retest this please





[GitHub] BryanCutler commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to better indicate pyarrow is installed and related tests will run

2018-12-10 Thread GitBox
BryanCutler commented on issue #22273: [SPARK-25272][PYTHON][TEST] Add test to 
better indicate pyarrow is installed and related tests will run
URL: https://github.com/apache/spark/pull/22273#issuecomment-445886897
 
 
   Hey @holdenk, yeah we could do that, but I'm OK with the current way, which 
is to only print if the tests are skipped and assume they ran otherwise. I just 
wanted to make sure that the weird behavior I saw earlier wasn't happening 
anymore. Looks good so far, but I was trying to hit one more worker to check; 
then I'll close this out.





[GitHub] AmplabJenkins removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF 
should still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445882601
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99920/
   Test FAILed.





[GitHub] AmplabJenkins removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF 
should still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445882594
 
 
   Merged build finished. Test FAILed.





[GitHub] AmplabJenkins commented on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23275: [SPARK-26323][SQL] Scala UDF should 
still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445882601
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99920/
   Test FAILed.





[GitHub] AmplabJenkins commented on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23275: [SPARK-26323][SQL] Scala UDF should 
still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445882594
 
 
   Merged build finished. Test FAILed.





[GitHub] SparkQA removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
SparkQA removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF should 
still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445840083
 
 
   **[Test build #99920 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99920/testReport)** for PR 23275 at commit [`92466d4`](https://github.com/apache/spark/commit/92466d486734f3904be31e45b85e49654eb39255).





[GitHub] SparkQA commented on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
SparkQA commented on issue #23275: [SPARK-26323][SQL] Scala UDF should still 
check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445882225
 
 
   **[Test build #99920 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99920/testReport)** for PR 23275 at commit [`92466d4`](https://github.com/apache/spark/commit/92466d486734f3904be31e45b85e49654eb39255).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] AmplabJenkins removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF 
should still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445878940
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99919/
   Test FAILed.





[GitHub] SparkQA removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
SparkQA removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF should 
still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445836177
 
 
   **[Test build #99919 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99919/testReport)** for PR 23275 at commit [`8582607`](https://github.com/apache/spark/commit/8582607195f12a4c133fb28b59e8a7fce7a97fbb).





[GitHub] AmplabJenkins removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
AmplabJenkins removed a comment on issue #23275: [SPARK-26323][SQL] Scala UDF 
should still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445878926
 
 
   Merged build finished. Test FAILed.





[GitHub] AmplabJenkins commented on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23275: [SPARK-26323][SQL] Scala UDF should 
still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445878926
 
 
   Merged build finished. Test FAILed.





[GitHub] AmplabJenkins commented on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
AmplabJenkins commented on issue #23275: [SPARK-26323][SQL] Scala UDF should 
still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445878940
 
 
   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/99919/
   Test FAILed.





[GitHub] SparkQA commented on issue #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
SparkQA commented on issue #23275: [SPARK-26323][SQL] Scala UDF should still 
check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#issuecomment-445878673
 
 
   **[Test build #99919 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/99919/testReport)** for PR 23275 at commit [`8582607`](https://github.com/apache/spark/commit/8582607195f12a4c133fb28b59e8a7fce7a97fbb).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.





[GitHub] mgaido91 commented on a change in pull request #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
mgaido91 commented on a change in pull request #23275: [SPARK-26323][SQL] Scala 
UDF should still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#discussion_r240278143
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala
 ##
 @@ -88,68 +88,49 @@ sealed trait UserDefinedFunction {
 private[sql] case class SparkUserDefinedFunction(
 f: AnyRef,
 dataType: DataType,
-inputTypes: Option[Seq[DataType]],
-nullableTypes: Option[Seq[Boolean]],
+inputSchemas: Seq[Option[ScalaReflection.Schema]],
 name: Option[String] = None,
 nullable: Boolean = true,
 deterministic: Boolean = true) extends UserDefinedFunction {
 
   @scala.annotation.varargs
-  override def apply(exprs: Column*): Column = {
-// TODO: make sure this class is only instantiated through `SparkUserDefinedFunction.create()`
-// and `nullableTypes` is always set.
-if (inputTypes.isDefined) {
-  assert(inputTypes.get.length == nullableTypes.get.length)
-}
-
-val inputsNullSafe = nullableTypes.getOrElse {
-  ScalaReflection.getParameterTypeNullability(f)
-}
+  override def apply(cols: Column*): Column = {
+Column(createScalaUDF(cols.map(_.expr)))
+  }
 
-Column(ScalaUDF(
+  private[sql] def createScalaUDF(exprs: Seq[Expression]): ScalaUDF = {
+// It's possible that some of the inputs don't have a specific type(e.g. `Any`),  skip type
+// check and null check for them.
+val inputTypes = inputSchemas.map(_.map(_.dataType).getOrElse(AnyDataType))
+val inputsNullSafe = inputSchemas.map(_.map(_.nullable).getOrElse(true))
 
 Review comment:
   I mean `inputSchemas.map(_.forall(_.nullable))`
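
The suggestion rests on `Option.forall` returning `true` for `None` and applying the predicate to the value inside `Some`, which is the same result as `map(...).getOrElse(true)`. Below is a minimal standalone sketch of that equivalence. The `Schema` case class is a hypothetical stand-in for `ScalaReflection.Schema` (with `dataType` simplified to a `String`), and the object and method names are illustrative only, not taken from the patch.

```scala
// Hypothetical stand-in for ScalaReflection.Schema; only `nullable` matters here.
final case class Schema(dataType: String, nullable: Boolean)

object ForallSketch {
  // Form used in the diff: unwrap the Option, defaulting to true when no schema is known.
  def viaMapGetOrElse(schemas: Seq[Option[Schema]]): Seq[Boolean] =
    schemas.map(_.map(_.nullable).getOrElse(true))

  // Suggested form: Option.forall is true for None and tests the value inside Some.
  def viaForall(schemas: Seq[Option[Schema]]): Seq[Boolean] =
    schemas.map(_.forall(_.nullable))

  def main(args: Array[String]): Unit = {
    val inputSchemas: Seq[Option[Schema]] = Seq(
      Some(Schema("int", nullable = false)),
      Some(Schema("string", nullable = true)),
      None // an input without a specific type, e.g. `Any`
    )
    println(viaMapGetOrElse(inputSchemas)) // List(false, true, true)
    println(viaForall(inputSchemas))       // List(false, true, true)
  }
}
```

Both forms agree on every input, so choosing between them is a readability question rather than a behavioral one.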





[GitHub] srowen commented on a change in pull request #23275: [SPARK-26323][SQL] Scala UDF should still check input types even if some inputs are of type Any

2018-12-10 Thread GitBox
srowen commented on a change in pull request #23275: [SPARK-26323][SQL] Scala 
UDF should still check input types even if some inputs are of type Any
URL: https://github.com/apache/spark/pull/23275#discussion_r240275041
 
 

 ##
 File path: sql/core/src/main/scala/org/apache/spark/sql/expressions/UserDefinedFunction.scala
 ##
 @@ -88,68 +88,49 @@ sealed trait UserDefinedFunction {
 private[sql] case class SparkUserDefinedFunction(
 f: AnyRef,
 dataType: DataType,
-inputTypes: Option[Seq[DataType]],
-nullableTypes: Option[Seq[Boolean]],
+inputSchemas: Seq[Option[ScalaReflection.Schema]],
 name: Option[String] = None,
 nullable: Boolean = true,
 deterministic: Boolean = true) extends UserDefinedFunction {
 
   @scala.annotation.varargs
-  override def apply(exprs: Column*): Column = {
-// TODO: make sure this class is only instantiated through `SparkUserDefinedFunction.create()`
-// and `nullableTypes` is always set.
-if (inputTypes.isDefined) {
-  assert(inputTypes.get.length == nullableTypes.get.length)
-}
-
-val inputsNullSafe = nullableTypes.getOrElse {
-  ScalaReflection.getParameterTypeNullability(f)
-}
+  override def apply(cols: Column*): Column = {
+Column(createScalaUDF(cols.map(_.expr)))
+  }
 
-Column(ScalaUDF(
+  private[sql] def createScalaUDF(exprs: Seq[Expression]): ScalaUDF = {
+// It's possible that some of the inputs don't have a specific type(e.g. `Any`),  skip type
+// check and null check for them.
+val inputTypes = inputSchemas.map(_.map(_.dataType).getOrElse(AnyDataType))
+val inputsNullSafe = inputSchemas.map(_.map(_.nullable).getOrElse(true))
 
 Review comment:
   I'm missing it: how could you write this more simply with `forall` to get 
from `Option[Schema]` to `Boolean`?




