[GitHub] spark pull request: [SPARK-10827] [CORE] AppClient should not use ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9317#issuecomment-151744807 Merged build finished. Test FAILed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-11332] [ML] Refactored to use ml.featur...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9325#issuecomment-151744714 Merged build started.
[GitHub] spark pull request: [SPARK-10827] [CORE] AppClient should not use ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9317#issuecomment-151744815 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44495/ Test FAILed.
[GitHub] spark pull request: [SPARK-11332] [ML] Refactored to use ml.featur...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9325#issuecomment-151744634 Merged build triggered.
[GitHub] spark pull request: [SPARK-10827] [CORE] AppClient should not use ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9317#issuecomment-151744460 **[Test build #44495 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44495/consoleFull)** for PR 9317 at commit [`5e155cc`](https://github.com/apache/spark/commit/5e155cc2f43d98f365524ec6bac81bb6206a780e).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11332] [ML] Refactored to use ml.featur...
Github user dbtsai commented on the pull request: https://github.com/apache/spark/pull/9325#issuecomment-151744040 Jenkins, test this please.
[GitHub] spark pull request: [SPARK-11362][SQL] Use Spark BitSet in Broadca...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9316#issuecomment-151742912 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44489/ Test FAILed.
[GitHub] spark pull request: [SPARK-11362][SQL] Use Spark BitSet in Broadca...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9316#issuecomment-151742907 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11362][SQL] Use Spark BitSet in Broadca...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9316#issuecomment-151742496 **[Test build #44489 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44489/consoleFull)** for PR 9316 at commit [`c6751f7`](https://github.com/apache/spark/commit/c6751f73b745a1209dd19488a3e12464459adb77).
* This patch **fails PySpark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11332] [ML] Refactored to use ml.featur...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9325#issuecomment-151741814 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-11332] [ML] Refactored to use ml.featur...
GitHub user nakul02 opened a pull request: https://github.com/apache/spark/pull/9325

[SPARK-11332] [ML] Refactored to use ml.feature.Instance instead of WeightedLeastSquare.Instance

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nakul02/spark SPARK-11332_refactor_WeightedLeastSquares_dot_Instance

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9325.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #9325

commit f05bb3f5e74f8522af50ce08ff5ccd1692939798
Author: Nakul Jindal
Date: 2015-10-28T05:15:38Z

    Refactor - ml.feature.Instance instead of WeightedLeastSquare.Instance
[GitHub] spark pull request: [SPARK-9034] [SQL] Reflect field names defined...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/8456#issuecomment-151741657 LGTM, /cc @liancheng @cloud-fan for taking another look.
[GitHub] spark pull request: [SPARK-11358] [MLLIB] deprecate runs in k-mean...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9322#issuecomment-151741695 **[Test build #44507 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44507/consoleFull)** for PR 9322 at commit [`138fd14`](https://github.com/apache/spark/commit/138fd1400c4ed8c7221f7125b64d87cc86d9fc5d).
[GitHub] spark pull request: [SPARK-11358] [MLLIB] deprecate runs in k-mean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9322#issuecomment-151741284 Merged build started.
[GitHub] spark pull request: [SPARK-11358] [MLLIB] deprecate runs in k-mean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9322#issuecomment-151741274 Merged build triggered.
[GitHub] spark pull request: [SPARK-11315] [YARN] WiP Add YARN extension se...
Github user jerryshao commented on the pull request: https://github.com/apache/spark/pull/8744#issuecomment-151740814 Hi @steveloughran, this patch looks quite large. Can we just: 1. Remove some unnecessary getter/setter methods; they are very Java-style and verbose in Scala. 2. Group and separate some class parameters into subclasses; it is easily error-prone to track state across a large amount of mutable parameters.
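The second suggestion above can be sketched as follows. This is a hypothetical illustration, not code from the patch — the class and parameter names are invented: instead of tracking many independent mutable fields through Java-style getters and setters, related parameters are grouped into one immutable case class that is replaced atomically.

```scala
// Hypothetical sketch of the review suggestion; names are invented, not from the PR.

// Java-style: one mutable field plus getter/setter per parameter.
class VerboseService {
  private var batchSize: Int = 100
  def getBatchSize: Int = batchSize
  def setBatchSize(n: Int): Unit = { batchSize = n }
  // ...repeated for every other parameter, each mutated independently.
}

// Scala-style: related parameters grouped into one immutable value that is
// swapped atomically, so there is a single piece of state to reason about.
case class PostingParams(batchSize: Int = 100, retryIntervalMs: Long = 1000L)

class GroupedService {
  @volatile private var params = PostingParams()
  def update(p: PostingParams): Unit = { params = p }
  def batchSize: Int = params.batchSize
}
```

With the grouped form, a consistent set of parameters can be updated in one assignment rather than through several setter calls that may interleave.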
[GitHub] spark pull request: [MINOR] [ML] fix compile warns
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9319
[GitHub] spark pull request: [SPARK-11364][SQL] Always load the latest hado...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9323#issuecomment-151740758 **[Test build #44506 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44506/consoleFull)** for PR 9323 at commit [`479db63`](https://github.com/apache/spark/commit/479db63248d55e2df447480d1001e8d5ec19aa3d).
[GitHub] spark pull request: [SPARK-11358] [MLLIB] deprecate runs in k-mean...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9322#issuecomment-151740698 test this please
[GitHub] spark pull request: [MINOR] [ML] fix compile warns
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9319#issuecomment-151740659 Merged into master.
[GitHub] spark pull request: [SPARK-11313][SQL] implement cogroup on DataSe...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9324#issuecomment-151740066 **[Test build #44505 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44505/consoleFull)** for PR 9324 at commit [`499454d`](https://github.com/apache/spark/commit/499454ddc2718c106a395c2f55c43a09a4d42554).
[GitHub] spark pull request: [SPARK-11364][SQL] Always load the latest hado...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/9323#issuecomment-151739573 cc @liancheng
[GitHub] spark pull request: [SPARK-11315] [YARN] WiP Add YARN extension se...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/8744#discussion_r43221885

--- Diff: yarn/src/history/main/scala/org/apache/spark/deploy/history/yarn/YarnTimelineUtils.scala ---
@@ -0,0 +1,757 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.deploy.history.yarn
+
+import java.io.IOException
+import java.net.{InetSocketAddress, NoRouteToHostException, URI, URL}
+import java.text.DateFormat
+import java.util.concurrent.atomic.AtomicLong
+import java.util.{ArrayList => JArrayList, Collection => JCollection, Date, HashMap => JHashMap, Map => JMap}
+import java.{lang, util}
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+import scala.util.control.NonFatal
+
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.yarn.api.records.timeline.TimelinePutResponse.TimelinePutError
+import org.apache.hadoop.yarn.api.records.timeline.{TimelineEntity, TimelineEvent, TimelinePutResponse}
+import org.apache.hadoop.yarn.api.records.{ApplicationAttemptId, ApplicationId}
+import org.apache.hadoop.yarn.client.api.TimelineClient
+import org.apache.hadoop.yarn.conf.YarnConfiguration
+import org.apache.spark
+import org.json4s.JsonAST.JObject
+import org.json4s._
+import org.json4s.jackson.JsonMethods._
+
+import org.apache.spark.deploy.history.yarn.YarnHistoryService._
+import org.apache.spark.scheduler.{SparkListenerApplicationEnd, SparkListenerApplicationStart, SparkListenerEvent, SparkListenerExecutorAdded, SparkListenerExecutorRemoved, SparkListenerJobEnd, SparkListenerJobStart, SparkListenerStageCompleted, SparkListenerStageSubmitted}
+import org.apache.spark.util.{JsonProtocol, Utils}
+import org.apache.spark.{Logging, SparkContext}
+
+/**
+ * Utility methods for timeline classes.
+ */
+private[spark] object YarnTimelineUtils extends Logging {
+
+  /**
+   * What attempt ID to use as the attempt ID field (not the entity ID) when
+   * there is no attempt info.
+   */
+  val SINGLE_ATTEMPT = "1"
+
+  /**
+   * Exception text when there is no event info data to unmarshall.
+   */
+  val E_NO_EVENTINFO = "No 'eventinfo' entry"
+
+  /**
+   * Exception text when there is an event info entry in the timeline event,
+   * but it is empty.
+   */
+  val E_EMPTY_EVENTINFO = "Empty 'eventinfo' entry"
+
+  /**
+   * Counter incremented on every spark event to timeline event creation,
+   * so guaranteeing uniqueness of event IDs across a single application attempt
+   * (which is, implicitly, one per JVM).
+   */
+  val uid = new AtomicLong(System.currentTimeMillis())
+
+  /**
+   * Converts a Java object to its equivalent json4s representation.
+   */
+  def toJValue(obj: Object): JValue = {
+    obj match {
+      case str: String => JString(str)
+      case dbl: java.lang.Double => JDouble(dbl)
+      case dec: java.math.BigDecimal => JDecimal(dec)
+      case int: java.lang.Integer => JInt(BigInt(int))
+      case long: java.lang.Long => JInt(BigInt(long))
+      case bool: java.lang.Boolean => JBool(bool)
+      case map: JMap[_, _] =>
+        val jmap = map.asInstanceOf[JMap[String, Object]]
+        JObject(jmap.entrySet().asScala.map { e => e.getKey -> toJValue(e.getValue) }.toList)
+      case array: JCollection[_] =>
+        JArray(array.asInstanceOf[JCollection[Object]].asScala.map(o => toJValue(o)).toList)
+      case null => JNothing
+    }
+  }
+
+  /**
+   * Converts a JValue into its Java equivalent.
+   */
+  def toJavaObject(v: JValue): Object = {
+    v match {
+      case JNothing => null
+      case JNull => null
+      case JString(s) => s
+      case JDouble(num) => java.lang.Double.valueOf(num)
+      case JDecimal(num) => num.bigDecimal
+      case JInt(num) => java.lang.Long.valueOf(num.long
[GitHub] spark pull request: [SPARK-11313][SQL] implement cogroup on DataSe...
Github user cloud-fan closed the pull request at: https://github.com/apache/spark/pull/9279
[GitHub] spark pull request: [SPARK-11313][SQL] implement cogroup on DataSe...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/9279#issuecomment-151739494 will open it again when we need to support cogroup on more than 2 datasets.
[GitHub] spark pull request: [SPARK-11315] [YARN] WiP Add YARN extension se...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/8744#discussion_r43221841

--- Diff: yarn/src/history/main/scala/org/apache/spark/deploy/history/yarn/YarnTimelineUtils.scala --- (quotes the same hunk of YarnTimelineUtils.scala as the previous comment)
[GitHub] spark pull request: [SPARK-11313][SQL] implement cogroup on DataSe...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9324#issuecomment-151739410 Merged build triggered.
[GitHub] spark pull request: [SPARK-11313][SQL] implement cogroup on DataSe...
Github user cloud-fan commented on the pull request: https://github.com/apache/spark/pull/9324#issuecomment-151739416 cc @marmbrus
[GitHub] spark pull request: [SPARK-11364][SQL] Always load the latest hado...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9323#issuecomment-151739426 Merged build started.
[GitHub] spark pull request: [SPARK-11313][SQL] implement cogroup on DataSe...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9324#issuecomment-151739421 Merged build started.
[GitHub] spark pull request: [SPARK-11364][SQL] Always load the latest hado...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9323#issuecomment-151739414 Merged build triggered.
[GitHub] spark pull request: [SPARK-11364][SQL] Always load the latest hado...
GitHub user chenghao-intel opened a pull request: https://github.com/apache/spark/pull/9323

[SPARK-11364][SQL] Always load the latest hadoop configuration

We didn't propagate the hadoop configuration to the Data Source, as we always try to load the default hadoop configuration. A real case description can be found at: https://www.mail-archive.com/user@spark.apache.org/msg39706.html

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chenghao-intel/spark hadoopConf

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9323.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #9323

commit 479db63248d55e2df447480d1001e8d5ec19aa3d
Author: Cheng Hao
Date: 2015-10-28T06:21:04Z

    always load the latest hadoop configuration
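The issue described above can be sketched roughly as follows. This is an assumption-laden illustration, not the actual patch — the helper name `resolveHadoopConf` is invented: a data source that builds a fresh `new Configuration()` sees only hadoop's default XML resources, while reading the configuration from the active `SparkContext` picks up settings the user applied at runtime.

```scala
// Hypothetical sketch of the problem, not the actual patch;
// `resolveHadoopConf` is an invented name.
import org.apache.hadoop.conf.Configuration
import org.apache.spark.SparkContext

// Loads only hadoop's default resources; anything the user set on the
// SparkContext at runtime (e.g. filesystem credentials) is silently missing.
def defaultConf(): Configuration = new Configuration()

// Propagates the session's latest configuration instead.
def resolveHadoopConf(sc: SparkContext): Configuration = sc.hadoopConfiguration
```

The sketch assumes Spark and Hadoop on the classpath; the point is only the direction of the fix, passing the context's configuration down rather than reconstructing a default one.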
[GitHub] spark pull request: [SPARK-11313][SQL] implement cogroup on DataSe...
GitHub user cloud-fan opened a pull request: https://github.com/apache/spark/pull/9324 [SPARK-11313][SQL] implement cogroup on DataSets (support 2 datasets) A simpler version of https://github.com/apache/spark/pull/9279, supporting only 2 datasets. You can merge this pull request into a Git repository by running: $ git pull https://github.com/cloud-fan/spark cogroup2 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9324.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9324 commit 499454ddc2718c106a395c2f55c43a09a4d42554 Author: Wenchen Fan Date: 2015-10-28T06:30:35Z support cogroup on dataset
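The cogroup semantics the PR implements — group two datasets by key, then apply a function to each key's pair of groups — can be sketched outside Spark with plain collections. This is an illustrative sketch, not Spark's Dataset API; the `cogroup` helper and `TriFunction` interface here are hypothetical:

```java
import java.util.*;

public class CogroupSketch {
    interface TriFunction<A, B, C, R> { R apply(A a, B b, C c); }

    // Group both inputs by key, then apply f to (key, leftGroup, rightGroup)
    // for every key that appears in either input.
    static <K, V, W, R> List<R> cogroup(
            List<Map.Entry<K, V>> left,
            List<Map.Entry<K, W>> right,
            TriFunction<K, List<V>, List<W>, R> f) {
        Map<K, List<V>> l = new LinkedHashMap<>();
        Map<K, List<W>> r = new LinkedHashMap<>();
        for (Map.Entry<K, V> e : left)
            l.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        for (Map.Entry<K, W> e : right)
            r.computeIfAbsent(e.getKey(), k -> new ArrayList<>()).add(e.getValue());
        Set<K> keys = new LinkedHashSet<>(l.keySet());
        keys.addAll(r.keySet());
        List<R> out = new ArrayList<>();
        for (K k : keys)
            out.add(f.apply(k, l.getOrDefault(k, List.of()), r.getOrDefault(k, List.of())));
        return out;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> left =
            List.of(Map.entry("a", 1), Map.entry("a", 2), Map.entry("b", 3));
        List<Map.Entry<String, Integer>> right =
            List.of(Map.entry("a", 10), Map.entry("c", 20));
        List<String> merged = cogroup(left, right,
            (k, ls, rs) -> k + ":" + ls + rs);
        System.out.println(merged); // [a:[1, 2][10], b:[3][], c:[][20]]
    }
}
```

Note that, unlike a join, keys present in only one side still produce a result with one empty group.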
[GitHub] spark pull request: [SPARK-11215] [ML] Add multiple columns suppor...
Github user yanboliang commented on a diff in the pull request: https://github.com/apache/spark/pull/9183#discussion_r43221732 --- Diff: R/pkg/inst/tests/test_mllib.R --- @@ -56,14 +56,3 @@ test_that("feature interaction vs native glm", { rVals <- predict(glm(Sepal.Width ~ Species:Sepal.Length, data = iris), iris) expect_true(all(abs(rVals - vals) < 1e-6), rVals - vals) }) - -test_that("summary coefficients match with native glm", { --- End diff -- It's not removed, just temporarily disabled. This PR changes the semantics of ```StringIndexer``` slightly, which causes this test case to produce an indeterminate result. We should first discuss whether the semantics change is necessary, and then update the test case to produce a deterministic result.
[GitHub] spark pull request: [SPARK-11215] [ML] Add multiple columns suppor...
Github user felixcheung commented on a diff in the pull request: https://github.com/apache/spark/pull/9183#discussion_r43221520 --- Diff: R/pkg/inst/tests/test_mllib.R --- @@ -56,14 +56,3 @@ test_that("feature interaction vs native glm", { rVals <- predict(glm(Sepal.Width ~ Species:Sepal.Length, data = iris), iris) expect_true(all(abs(rVals - vals) < 1e-6), rVals - vals) }) - -test_that("summary coefficients match with native glm", { --- End diff -- why is this removed?
[GitHub] spark pull request: [SPARK-11315] [YARN] WiP Add YARN extension se...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/8744#discussion_r43221438 --- Diff: yarn/src/history/main/scala/org/apache/spark/deploy/history/yarn/YarnHistoryService.scala --- @@ -0,0 +1,1328 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.deploy.history.yarn + +import java.io.InterruptedIOException +import java.net.{ConnectException, URI} +import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger} +import java.util.concurrent.{LinkedBlockingDeque, TimeUnit} + +import scala.collection.JavaConverters._ +import scala.collection.mutable + +import com.codahale.metrics.{Counter, MetricRegistry, Timer} +import org.apache.hadoop.security.UserGroupInformation +import org.apache.hadoop.yarn.api.records.timeline.{TimelineDomain, TimelineEntity, TimelineEvent} +import org.apache.hadoop.yarn.api.records.{ApplicationAttemptId, ApplicationId} +import org.apache.hadoop.yarn.client.api.TimelineClient +import org.apache.hadoop.yarn.conf.YarnConfiguration + +import org.apache.spark.deploy.history.yarn.YarnTimelineUtils._ +import org.apache.spark.metrics.source.Source +import org.apache.spark.scheduler._ +import org.apache.spark.scheduler.cluster.{SchedulerExtensionService, SchedulerExtensionServiceBinding} +import org.apache.spark.util.{SystemClock, Utils} +import org.apache.spark.{Logging, SparkContext} + +/** + * A Yarn Extension Service to post lifecycle events to a registered YARN Timeline Server. + * + * Posting algorithm + * + * 1. The service subscribes to all events coming from the Spark Context. + * 1. These events are serialized into JSON objects for publishing to the timeline service through + * HTTP(S) posts. + * 1. Events are buffered into `pendingEvents` until a batch is aggregated into a + * [[TimelineEntity]] for posting. + * 1. That aggregation happens when a lifecycle event (application start/stop) takes place, + * or the number of pending events in a running application exceeds the limit set in + * `spark.hadoop.yarn.timeline.batch.size`. + * 1. Posting operations take place in a separate thread from the spark event listener. + * 1. 
If an attempt to post to the timeline server fails, the service sleeps and then + * it is re-attempted after the retry period defined by + * `spark.hadoop.yarn.timeline.post.retry.interval`. + * 1. If the number of events buffered in the history service exceed the limit set in + * `spark.hadoop.yarn.timeline.post.limit`, then further events other than application start/stop + * are dropped. + * 1. When the service is stopped, it will make a best-effort attempt to post all queued events. + * the call of [[stop()]] can block up to the duration of + * `spark.hadoop.yarn.timeline.shutdown.waittime` for this to take place. + * 1. No events are posted until the service receives a [[SparkListenerApplicationStart]] event. + * + * If the spark context has a metrics registry, then the internal counters of queued entities, + * post failures and successes, and the performance of the posting operation are all registered + * as metrics. + * + * The shutdown logic is somewhat convoluted, as the posting thread may be blocked on HTTP IO + * when the shutdown process begins. In this situation, the thread continues to be blocked, and + * will be interrupted once the wait time has expired. All time consumed during the ongoing + * operation will be counted as part of the shutdown time period. + */ +private[spark] class YarnHistoryService extends SchedulerExtensionService + with Logging with Source { + + import org.apache.spark.deploy.history.yarn.YarnHistoryService._ + + /** Simple state model implemented in an atomic integer */ + private val _serviceState = new AtomicInteger(CreatedState) + + /** Get the current state */ + def serviceState: Int = { +_serviceState.get()
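The posting algorithm in the scaladoc above — buffer events, flush a batch when a lifecycle event arrives or the pending count reaches the batch size, and drop non-lifecycle events once the buffer exceeds the post limit — can be sketched with stdlib types. This is an illustrative sketch under those stated rules, not the PR's actual implementation; class and method names are hypothetical, and the real service posts asynchronously with retries:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingDeque;

public class EventBatcher {
    private final LinkedBlockingDeque<String> pending = new LinkedBlockingDeque<>();
    private final int batchSize;   // stand-in for spark.hadoop.yarn.timeline.batch.size
    private final int postLimit;   // stand-in for spark.hadoop.yarn.timeline.post.limit
    private final List<List<String>> posted = new ArrayList<>();

    EventBatcher(int batchSize, int postLimit) {
        this.batchSize = batchSize;
        this.postLimit = postLimit;
    }

    void enqueue(String event, boolean lifecycle) {
        // Over the limit: drop ordinary events, but never lifecycle events.
        if (!lifecycle && pending.size() >= postLimit) {
            return;
        }
        pending.add(event);
        // A lifecycle event or a full buffer triggers aggregation into a batch.
        if (lifecycle || pending.size() >= batchSize) {
            flush();
        }
    }

    void flush() {
        List<String> batch = new ArrayList<>();
        pending.drainTo(batch);
        if (!batch.isEmpty()) posted.add(batch);
    }

    List<List<String>> posted() { return posted; }

    public static void main(String[] args) {
        EventBatcher b = new EventBatcher(3, 5);
        b.enqueue("appStart", true);              // lifecycle: posts immediately
        for (int i = 0; i < 4; i++) b.enqueue("task" + i, false);
        b.enqueue("appEnd", true);                // posts the remaining events
        System.out.println(b.posted()); // [[appStart], [task0, task1, task2], [task3, appEnd]]
    }
}
```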
[GitHub] spark pull request: [SPARK-11358] [MLLIB] deprecate runs in k-mean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9322#issuecomment-151738114 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44501/ Test FAILed.
[GitHub] spark pull request: [SPARK-11358] [MLLIB] deprecate runs in k-mean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9322#issuecomment-151738110 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-10342] [SPARK-10309] [SQL] [WIP] Cooper...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9241#issuecomment-151736450 **[Test build #44504 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44504/consoleFull)** for PR 9241 at commit [`8470fc9`](https://github.com/apache/spark/commit/8470fc9ddd37c525cf648c00e019d968410fa66f).
[GitHub] spark pull request: [SPARK-11315] [YARN] WiP Add YARN extension se...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/8744#discussion_r43221207 --- Diff: yarn/src/history/main/scala/org/apache/spark/deploy/history/yarn/YarnHistoryService.scala --- @@ -0,0 +1,1328 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.spark.deploy.history.yarn + +import java.io.InterruptedIOException +import java.net.{ConnectException, URI} +import java.util.concurrent.atomic.{AtomicBoolean, AtomicInteger} +import java.util.concurrent.{LinkedBlockingDeque, TimeUnit} + +import scala.collection.JavaConverters._ +import scala.collection.mutable + +import com.codahale.metrics.{Counter, MetricRegistry, Timer} +import org.apache.hadoop.security.UserGroupInformation +import org.apache.hadoop.yarn.api.records.timeline.{TimelineDomain, TimelineEntity, TimelineEvent} +import org.apache.hadoop.yarn.api.records.{ApplicationAttemptId, ApplicationId} +import org.apache.hadoop.yarn.client.api.TimelineClient +import org.apache.hadoop.yarn.conf.YarnConfiguration + +import org.apache.spark.deploy.history.yarn.YarnTimelineUtils._ +import org.apache.spark.metrics.source.Source +import org.apache.spark.scheduler._ +import org.apache.spark.scheduler.cluster.{SchedulerExtensionService, SchedulerExtensionServiceBinding} +import org.apache.spark.util.{SystemClock, Utils} +import org.apache.spark.{Logging, SparkContext} + +/** + * A Yarn Extension Service to post lifecycle events to a registered YARN Timeline Server. + * + * Posting algorithm + * + * 1. The service subscribes to all events coming from the Spark Context. + * 1. These events are serialized into JSON objects for publishing to the timeline service through + * HTTP(S) posts. + * 1. Events are buffered into `pendingEvents` until a batch is aggregated into a + * [[TimelineEntity]] for posting. + * 1. That aggregation happens when a lifecycle event (application start/stop) takes place, + * or the number of pending events in a running application exceeds the limit set in + * `spark.hadoop.yarn.timeline.batch.size`. + * 1. Posting operations take place in a separate thread from the spark event listener. + * 1. 
If an attempt to post to the timeline server fails, the service sleeps and then + * it is re-attempted after the retry period defined by + * `spark.hadoop.yarn.timeline.post.retry.interval`. + * 1. If the number of events buffered in the history service exceed the limit set in + * `spark.hadoop.yarn.timeline.post.limit`, then further events other than application start/stop + * are dropped. + * 1. When the service is stopped, it will make a best-effort attempt to post all queued events. + * the call of [[stop()]] can block up to the duration of + * `spark.hadoop.yarn.timeline.shutdown.waittime` for this to take place. + * 1. No events are posted until the service receives a [[SparkListenerApplicationStart]] event. + * + * If the spark context has a metrics registry, then the internal counters of queued entities, + * post failures and successes, and the performance of the posting operation are all registered + * as metrics. + * + * The shutdown logic is somewhat convoluted, as the posting thread may be blocked on HTTP IO + * when the shutdown process begins. In this situation, the thread continues to be blocked, and + * will be interrupted once the wait time has expired. All time consumed during the ongoing + * operation will be counted as part of the shutdown time period. + */ +private[spark] class YarnHistoryService extends SchedulerExtensionService + with Logging with Source { --- End diff -- Can we move the `Source`-related things into a sub-class or a separate class? This class has so many class parameters that it is not easy to understand and track the state.
[GitHub] spark pull request: [MINOR] [ML] fix compile warns
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9319#issuecomment-151735782 Merged build finished. Test PASSed.
[GitHub] spark pull request: [MINOR] [ML] fix compile warns
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9319#issuecomment-151735783 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44498/ Test PASSed.
[GitHub] spark pull request: [MINOR] [ML] fix compile warns
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9319#issuecomment-151735707 **[Test build #44498 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44498/consoleFull)** for PR 9319 at commit [`ca3c5b1`](https://github.com/apache/spark/commit/ca3c5b12ecc6abe4c13bb36139ec0f55d843c5be). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11336] Add links to example codes
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9320#issuecomment-151735689 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44500/ Test PASSed.
[GitHub] spark pull request: [SPARK-11336] Add links to example codes
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9320#issuecomment-151735616 **[Test build #44500 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44500/consoleFull)** for PR 9320 at commit [`a556c50`](https://github.com/apache/spark/commit/a556c5081638733fe0da3110470cc32bc42aa695). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11336] Add links to example codes
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9320#issuecomment-151735688 Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151735661 **[Test build #44503 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44503/consoleFull)** for PR 9297 at commit [`0af5afe`](https://github.com/apache/spark/commit/0af5afeafe894614e7c1cb83f343db0a0869ad77).
[GitHub] spark pull request: [SPARK-11302] [MLLIB] (2) Multivariate Gaussia...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/9309#issuecomment-151735507 LGTM. Merged into master, 1.5, 1.4, and 1.3. Thanks! Please close #9293.
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151735350 Merged build started.
[GitHub] spark pull request: [SPARK-10342] [SPARK-10309] [SQL] [WIP] Cooper...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9241#issuecomment-151735339 Merged build triggered.
[GitHub] spark pull request: [SPARK-10342] [SPARK-10309] [SQL] [WIP] Cooper...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9241#issuecomment-151735357 Merged build started.
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151735338 Merged build triggered.
[GitHub] spark pull request: [SPARK-11141][STREAMING] Batch ReceivedBlockTr...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9143#issuecomment-151735132 **[Test build #44502 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44502/consoleFull)** for PR 9143 at commit [`80a0b8d`](https://github.com/apache/spark/commit/80a0b8d9e994ccf5c9381e12dae4c736ad6c3800).
[GitHub] spark pull request: [SPARK-11302] [MLLIB] (2) Multivariate Gaussia...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/9309
[GitHub] spark pull request: [SPARK-11141][STREAMING] Batch ReceivedBlockTr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9143#issuecomment-151734788 Merged build started.
[GitHub] spark pull request: [SPARK-11358] [MLLIB] deprecate runs in k-mean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9322#issuecomment-151734769 Merged build triggered.
[GitHub] spark pull request: [SPARK-11358] [MLLIB] deprecate runs in k-mean...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9322#issuecomment-151734779 Merged build started.
[GitHub] spark pull request: [SPARK-11315] [YARN] WiP Add YARN extension se...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/8744#discussion_r43220805 --- Diff: yarn/pom.xml --- @@ -164,6 +164,113 @@ + + --- End diff -- Does this profile work under SBT?
[GitHub] spark pull request: [SPARK-11141][STREAMING] Batch ReceivedBlockTr...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9143#issuecomment-151734773 Merged build triggered.
[GitHub] spark pull request: [SPARK-10342] [SPARK-10309] [SQL] [WIP] Cooper...
Github user davies commented on a diff in the pull request: https://github.com/apache/spark/pull/9241#discussion_r43220734 --- Diff: core/src/main/java/org/apache/spark/memory/MemoryConsumer.java --- @@ -0,0 +1,100 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.memory; + + +import java.io.IOException; + +import org.apache.spark.unsafe.memory.MemoryBlock; + + +/** + * An memory consumer of TaskMemoryManager, which support spilling. + */ +public class MemoryConsumer { --- End diff -- Yes, we could make it abstract
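The shape davies agrees to — an abstract `MemoryConsumer` whose subclasses must implement spilling — can be sketched in simplified form. This is an illustrative, self-contained sketch, not Spark's actual `MemoryConsumer` API; the class and method signatures here are hypothetical:

```java
import java.util.ArrayList;
import java.util.List;

// Base class tracks how much memory a consumer holds; each concrete
// consumer must implement spill() to release memory under pressure.
abstract class MemoryConsumer {
    protected long used = 0;

    long getUsed() { return used; }

    // Release at least `required` bytes if possible; return bytes freed.
    abstract long spill(long required);
}

// A consumer backed by a list of fixed-size pages (8 bytes per long word).
class ListConsumer extends MemoryConsumer {
    private final List<long[]> pages = new ArrayList<>();

    void allocate(int words) {
        pages.add(new long[words]);
        used += words * 8L;
    }

    @Override
    long spill(long required) {
        long freed = 0;
        // Drop pages from the tail until enough memory has been released.
        while (!pages.isEmpty() && freed < required) {
            freed += pages.remove(pages.size() - 1).length * 8L;
        }
        used -= freed;
        return freed;
    }
}

public class MemoryConsumerSketch {
    public static void main(String[] args) {
        ListConsumer c = new ListConsumer();
        c.allocate(128);
        c.allocate(128);
        System.out.println(c.getUsed());               // 2048
        System.out.println(c.spill(1024) + " " + c.getUsed()); // 1024 1024
    }
}
```

Making the base class abstract forces every consumer to define a spill policy, which is the point of the review comment.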
[GitHub] spark pull request: [SPARK-11354] [Web UI] expose custom log4j to ...
Github user yongjiaw commented on the pull request: https://github.com/apache/spark/pull/9307#issuecomment-151734523 https://github.com/apache/spark/pull/9321
[GitHub] spark pull request: [SPARK-11358] [MLLIB] deprecate runs in k-mean...
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/9322 [SPARK-11358] [MLLIB] deprecate runs in k-means This PR deprecates `runs` in k-means. `runs` introduces extra complexity and overhead in MLlib's k-means implementation. I haven't seen much usage with `runs` not equal to `1`. We don't have a unit test for it either. We can deprecate this method in 1.6, and void it in 1.7. It helps us simplify the implementation. cc: @srowen You can merge this pull request into a Git repository by running: $ git pull https://github.com/mengxr/spark SPARK-11358 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9322.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9322 commit 138fd1400c4ed8c7221f7125b64d87cc86d9fc5d Author: Xiangrui Meng Date: 2015-10-28T05:46:55Z deprecate runs in k-means
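The PR above describes a deprecate-first, remove-later plan for the `runs` parameter. A hedged sketch of that general pattern, on a hypothetical class (`KMeansTrainer` and `setRuns` are illustrations, not Spark's actual MLlib API): the setter is annotated deprecated but keeps working, so callers compile with a warning until the parameter is finally ignored.

```java
// Illustrative only: the deprecation-then-removal pattern described in
// the PR, shown on a hypothetical trainer class.
class KMeansTrainer {
    private int runs = 1;

    /**
     * @deprecated multiple runs add complexity and overhead for little
     * observed benefit; kept for source compatibility, to be ignored in
     * a later release.
     */
    @Deprecated
    public KMeansTrainer setRuns(int runs) {
        // Clamp to at least one run while the parameter still has effect.
        this.runs = Math.max(1, runs);
        return this;
    }

    public int getRuns() {
        return runs;
    }
}
```

Callers see a compile-time deprecation warning but no behavior change yet, which is what lets the implementation be simplified in a later release without breaking existing code immediately.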
[GitHub] spark pull request: [SPARK-11354] [Web UI] expose custom log4j to ...
Github user yongjiaw closed the pull request at: https://github.com/apache/spark/pull/9307
[GitHub] spark pull request: [SPARK-11354] [Web UI] expose custom log4j to ...
Github user yongjiaw commented on the pull request: https://github.com/apache/spark/pull/9307#issuecomment-151734456 this PR had some issue, I created another one. Closing this one.
[GitHub] spark pull request: [SPARK-11354] [Web UI] Expose custom log4j fil...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9321#issuecomment-151734301 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-11336] Add links to example codes
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9320#issuecomment-151734093 **[Test build #44500 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44500/consoleFull)** for PR 9320 at commit [`a556c50`](https://github.com/apache/spark/commit/a556c5081638733fe0da3110470cc32bc42aa695).
[GitHub] spark pull request: [SPARK-11354] [Web UI] Expose custom log4j fil...
GitHub user yongjiaw opened a pull request: https://github.com/apache/spark/pull/9321 [SPARK-11354] [Web UI] Expose custom log4j files on executor page for standalone cluster. You can merge this pull request into a Git repository by running: $ git pull https://github.com/yongjiaw/spark log4j Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9321.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9321 commit 48be2219fd09c73662f4087f921aaf9cc08e4125 Author: Yongjia Wang Date: 2015-10-28T03:41:11Z Expose custom log4j files on executor page for standalone cluster.
[GitHub] spark pull request: [SPARK-11363][SQL] LeftSemiJoin should be Left...
Github user chenghao-intel commented on the pull request: https://github.com/apache/spark/pull/9318#issuecomment-151733662 Oh, that's a good catch! Thank you @viirya
[GitHub] spark pull request: [SPARK-11336] Add links to example codes
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9320#issuecomment-15177 Merged build started.
[GitHub] spark pull request: [SPARK-11336] Add links to example codes
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9320#issuecomment-151733327 Merged build triggered.
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151733311 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151733309 **[Test build #44499 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44499/consoleFull)** for PR 9297 at commit [`caab0ba`](https://github.com/apache/spark/commit/caab0bab0299d4eb985b2e5e68cc5813faac6dfb). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151733313 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44499/ Test FAILed.
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151732842 **[Test build #44499 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44499/consoleFull)** for PR 9297 at commit [`caab0ba`](https://github.com/apache/spark/commit/caab0bab0299d4eb985b2e5e68cc5813faac6dfb).
[GitHub] spark pull request: [SPARK-11336] Add links to example codes
GitHub user yinxusen opened a pull request: https://github.com/apache/spark/pull/9320 [SPARK-11336] Add links to example codes https://issues.apache.org/jira/browse/SPARK-11336 @mengxr I add a hyperlink of Spark on Github and a hint of their existences in Spark code repo in each code example. I remove the config key for changing the example code dir, since we assume all examples should be in spark/examples. The hyperlink, though we cannot use it now, since the Spark v1.6.0 has not been released yet, can be used after the release. So it is not a problem. I add some screen shots, so you can get an instant feeling. https://cloud.githubusercontent.com/assets/2637239/10780634/bd20e072-7cfc-11e5-8960-def4fc62a8ea.png https://cloud.githubusercontent.com/assets/2637239/10780636/c3f6e180-7cfc-11e5-80b2-233589f4a9a3.png You can merge this pull request into a Git repository by running: $ git pull https://github.com/yinxusen/spark SPARK-11336 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9320.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9320 commit a556c5081638733fe0da3110470cc32bc42aa695 Author: Xusen Yin Date: 2015-10-28T05:43:04Z add links to example codes
[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/9182#discussion_r43220093 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/SchedulerExtensionService.scala --- @@ -0,0 +1,137 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler.cluster + +import java.util.concurrent.atomic.AtomicBoolean + +import org.apache.hadoop.yarn.api.records.{ApplicationAttemptId, ApplicationId} + +import org.apache.spark.util.Utils +import org.apache.spark.{Logging, SparkContext} + +/** + * An extension service that can be loaded into a Spark YARN scheduler. + * A Service that can be started and stopped + * + * The `stop()` operation MUST be idempotent, and succeed even if `start()` was + * never invoked. + */ +trait SchedulerExtensionService { + + /** + * Start the extension service. This should be a no-op if + * called more than once. + * @param binding binding to the spark application and YARN + */ + def start(binding: SchedulerExtensionServiceBinding): Unit + + /** + * Stop the service + * The `stop()` operation MUST be idempotent, and succeed even if `start()` was + * never invoked. 
+ */ + def stop(): Unit +} + +/** + * Binding information for a [[SchedulerExtensionService]] + * @param sparkContext current spark context + * @param applicationId YARN application ID + * @param attemptId optional AttemptID. + */ +case class SchedulerExtensionServiceBinding( +sparkContext: SparkContext, +applicationId: ApplicationId, +attemptId: Option[ApplicationAttemptId] = None) + +/** + * Container for [[SchedulerExtensionService]] instances. + * + * Loads Extension Services from the configuration property + * `"spark.yarn.services"`, instantiates and starts them. + * When stopped, it stops all child entries. + * + * The order in which child extension services are started and stopped + * is undefined. + * + */ +private[spark] class SchedulerExtensionServices extends SchedulerExtensionService +with Logging { + private var services: List[SchedulerExtensionService] = Nil + private var sparkContext: SparkContext = _ + private var appId: ApplicationId = _ + private var attemptId: Option[ApplicationAttemptId] = _ + private val started = new AtomicBoolean(false) + private var binding: SchedulerExtensionServiceBinding = _ + + /** + * Binding operation will load the named services and call bind on them too; the + * entire set of services are then ready for `init()` and `start()` calls + + * @param binding binding to the spark application and YARN + */ + def start(binding: SchedulerExtensionServiceBinding): Unit = { +if (started.getAndSet(true)) { + logWarning("Ignoring re-entrant start operation") + return +} +require(binding.sparkContext != null, "Null context parameter") +require(binding.applicationId != null, "Null appId parameter") +this.binding = binding --- End diff -- Here `binding` is actually duplicated with below 3 parameters, from my understanding in this code, we could choose either.
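The trait quoted in the review above documents a lifecycle contract: repeated `start()` calls must be no-ops, and `stop()` must be idempotent and succeed even if `start()` was never invoked. A minimal Java sketch of that contract (the `IdempotentService` class and its counters are illustrative, not Spark code), using an `AtomicBoolean` guard the same way the quoted Scala does:

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Minimal sketch of the documented lifecycle contract: re-entrant
// start() is ignored, and stop() is a safe no-op on a service that was
// never started (or was already stopped).
class IdempotentService {
    private final AtomicBoolean started = new AtomicBoolean(false);
    int startCount = 0; // times start() actually took effect
    int stopCount = 0;  // times stop() actually took effect

    void start() {
        if (started.getAndSet(true)) {
            return; // re-entrant start: ignore
        }
        startCount++;
    }

    void stop() {
        // getAndSet(false) is false when never started or already
        // stopped, so repeated stop() calls do nothing extra.
        if (started.getAndSet(false)) {
            stopCount++;
        }
    }
}
```

The `AtomicBoolean` makes the guard safe even if start and stop race from different threads, which is why the quoted Scala code prefers it over a plain `var`.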
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151730887 Merged build triggered.
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151730942 Merged build started.
[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/9182#discussion_r43219717 --- Diff: yarn/src/test/scala/org/apache/spark/scheduler/cluster/StubApplicationAttemptId.scala --- @@ -0,0 +1,50 @@ +/* --- End diff -- Can we put these fake stub class into one file like: `SparkYarnTestHelper` or something else? That will possibly reduce the file number.
[GitHub] spark pull request: [SPARK-10827] [CORE] AppClient should not use ...
Github user BryanCutler commented on a diff in the pull request: https://github.com/apache/spark/pull/9317#discussion_r43219668 --- Diff: core/src/test/scala/org/apache/spark/deploy/client/AppClientSuite.scala --- @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.deploy.client + +import org.apache.spark._ +import org.apache.spark.deploy.{Command, ApplicationDescription} +import org.apache.spark.deploy.DeployMessages.{MasterStateResponse, RequestMasterState} +import org.apache.spark.deploy.master.{ApplicationInfo, Master} +import org.apache.spark.deploy.worker.Worker +import org.apache.spark.rpc.RpcEnv +import org.apache.spark.util.Utils +import org.scalatest.BeforeAndAfterAll +import org.scalatest.concurrent.Eventually._ + +import scala.collection.mutable.{SynchronizedBuffer, ArrayBuffer} +import scala.concurrent.duration._ + +/** + * End-to-end tests for application client in standalone mode. 
+ */ +class AppClientSuite + extends SparkFunSuite + with LocalSparkContext + with BeforeAndAfterAll { + + private val numWorkers = 2 + private val conf = new SparkConf() + private val securityManager = new SecurityManager(conf) + + private var masterRpcEnv: RpcEnv = null + private var workerRpcEnvs: Seq[RpcEnv] = null + private var master: Master = null + private var workers: Seq[Worker] = null + + /** + * Start the local cluster. + * Note: local-cluster mode is insufficient because we want a reference to the Master. + */ + override def beforeAll(): Unit = { +super.beforeAll() +masterRpcEnv = RpcEnv.create(Master.SYSTEM_NAME, "localhost", 0, conf, securityManager) +workerRpcEnvs = (0 until numWorkers).map { i => + RpcEnv.create(Worker.SYSTEM_NAME + i, "localhost", 0, conf, securityManager) +} +master = makeMaster() +workers = makeWorkers(10, 2048) +// Wait until all workers register with master successfully +eventually(timeout(60.seconds), interval(10.millis)) { + assert(getMasterState.workers.size === numWorkers) +} + } + + override def afterAll(): Unit = { +workerRpcEnvs.foreach(_.shutdown()) +masterRpcEnv.shutdown() +workers.foreach(_.stop()) +master.stop() +workerRpcEnvs = null +masterRpcEnv = null +workers = null +master = null +super.afterAll() + } + + test("interface methods of AppClient using local Master") { +val ci = new AppClientInst(masterRpcEnv.address.toSparkURL) + +ci.client.start() + +// Client should connect with one Master which registers the application +eventually(timeout(10.seconds), interval(10.millis)) { + val apps = getApplications() + assert(ci.listener.connectedIdList.size === 1, "client listener should have one connection") + assert(apps.size === 1, "master should have 1 registered app") +} + +// Send message to Master to request Executors, verify request by change in executor limit +val numExecutorsRequested = 1 +assert( ci.client.requestTotalExecutors(numExecutorsRequested) ) + +eventually(timeout(10.seconds), interval(10.millis)) { + 
val apps = getApplications() + assert(apps.head.getExecutorLimit === numExecutorsRequested, s"executor request failed") +} + +// Send request to kill executor, verify request was made +assert { + val apps = getApplications() + val executorId: String = apps.head.executors.head._2.fullId + ci.client.killExecutors(Seq(executorId)) +} + +// Issue stop command for Client to disconnect from Master +ci.client.stop() + +// Verify Client is marked dead and unregistered from Master +eventually(timeout(10.seconds), interval(10.millis)) { + val apps = getApplications() + assert(ci.l
[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/9182#discussion_r43219638 --- Diff: yarn/src/test/scala/org/apache/spark/scheduler/cluster/SimpleExtensionService.scala --- @@ -0,0 +1,39 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.scheduler.cluster + +import java.util.concurrent.atomic.AtomicBoolean + + --- End diff -- nit: one more empty line.
[GitHub] spark pull request: [MINOR] [ML] fix compile warns
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9319#issuecomment-151729199 **[Test build #44498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44498/consoleFull)** for PR 9319 at commit [`ca3c5b1`](https://github.com/apache/spark/commit/ca3c5b12ecc6abe4c13bb36139ec0f55d843c5be).
[GitHub] spark pull request: [MINOR] [ML] fix compile warns
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9319#issuecomment-151729123 Merged build triggered.
[GitHub] spark pull request: [MINOR] [ML] fix compile warns
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9319#issuecomment-151729137 Merged build started.
[GitHub] spark pull request: [MINOR] [ML] fix compile warns
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/9319 [MINOR] [ML] fix compile warns This fixes some compile time warnings. You can merge this pull request into a Git repository by running: $ git pull https://github.com/mengxr/spark mllib-compile-warn-20151027 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/9319.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #9319 commit ca3c5b12ecc6abe4c13bb36139ec0f55d843c5be Author: Xiangrui Meng Date: 2015-10-28T04:49:57Z fix compile warns
[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/9182#discussion_r43219575 --- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnSchedulerBackend.scala --- @@ -51,6 +51,38 @@ private[spark] abstract class YarnSchedulerBackend( private implicit val askTimeout = RpcUtils.askRpcTimeout(sc.conf) + /** Application ID. Must be set by a subclass before starting the service */ + private var appId: ApplicationId = null + + /** Attempt ID. This is unset for client-side schedulers */ + private var attemptId: Option[ApplicationAttemptId] = None + + /** Scheduler extension services */ + private val services: SchedulerExtensionServices = new SchedulerExtensionServices() + + /** +* Bind to YARN. This *must* be done before calling [[start()]]. +* +* @param appId YARN application ID +* @param attemptId Optional YARN attempt ID +*/ + protected def bindToYARN(appId: ApplicationId, attemptId: Option[ApplicationAttemptId]): Unit = { --- End diff -- `bindToYarn`?
[GitHub] spark pull request: [SPARK-10658][PYSPARK][WIP] Provide add jars t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9313#issuecomment-151729038 **[Test build #44497 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44497/consoleFull)** for PR 9313 at commit [`b8b8d72`](https://github.com/apache/spark/commit/b8b8d72c26401a9d26a86f4829f8720b3192c1cc).
[GitHub] spark pull request: [SPARK-11363][SQL] LeftSemiJoin should be Left...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9318#issuecomment-151728910 **[Test build #44496 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44496/consoleFull)** for PR 9318 at commit [`ede6192`](https://github.com/apache/spark/commit/ede619229898c3496adf3e7bc569a9cd86c5b6c1).
[GitHub] spark pull request: [SPARK-10658][PYSPARK][WIP] Provide add jars t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9313#issuecomment-151728646 Merged build triggered.
[GitHub] spark pull request: [SPARK-10658][PYSPARK][WIP] Provide add jars t...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9313#issuecomment-151728658 Merged build started.
[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/9182#discussion_r43219421

--- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala ---
@@ -22,6 +22,7 @@
 import org.apache.hadoop.yarn.conf.YarnConfiguration
 
 import org.apache.spark.SparkContext
 import org.apache.spark.deploy.yarn.YarnSparkHadoopUtil
+import org.apache.spark.deploy.yarn.ApplicationMaster
--- End diff --

nit: this import can be merged with above one.
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151728573 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151728574 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44493/
[GitHub] spark pull request: [SPARK-11206] Support SQL UI on the history se...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/9297#issuecomment-151728571 **[Test build #44493 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44493/consoleFull)** for PR 9297 at commit [`7a2aced`](https://github.com/apache/spark/commit/7a2acedfc524e5c5887bd783e6fbbe289313306a).
 * This patch **fails to build**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/9182#discussion_r43219322

--- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/SchedulerExtensionService.scala ---
@@ -0,0 +1,137 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+import java.util.concurrent.atomic.AtomicBoolean
+
+import org.apache.hadoop.yarn.api.records.{ApplicationAttemptId, ApplicationId}
+
+import org.apache.spark.util.Utils
+import org.apache.spark.{Logging, SparkContext}
+
+/**
+ * An extension service that can be loaded into a Spark YARN scheduler.
+ * A Service that can be started and stopped.
+ *
+ * The `stop()` operation MUST be idempotent, and succeed even if `start()` was
+ * never invoked.
+ */
+trait SchedulerExtensionService {
+
+  /**
+   * Start the extension service. This should be a no-op if
+   * called more than once.
+   * @param binding binding to the spark application and YARN
+   */
+  def start(binding: SchedulerExtensionServiceBinding): Unit
+
+  /**
+   * Stop the service.
+   * The `stop()` operation MUST be idempotent, and succeed even if `start()` was
+   * never invoked.
+   */
+  def stop(): Unit
+}
+
+/**
+ * Binding information for a [[SchedulerExtensionService]].
+ * @param sparkContext current spark context
+ * @param applicationId YARN application ID
+ * @param attemptId optional AttemptID
+ */
+case class SchedulerExtensionServiceBinding(
+    sparkContext: SparkContext,
+    applicationId: ApplicationId,
+    attemptId: Option[ApplicationAttemptId] = None)
+
+/**
+ * Container for [[SchedulerExtensionService]] instances.
+ *
+ * Loads Extension Services from the configuration property
+ * `"spark.yarn.services"`, instantiates and starts them.
+ * When stopped, it stops all child entries.
+ *
+ * The order in which child extension services are started and stopped
+ * is undefined.
+ */
+private[spark] class SchedulerExtensionServices extends SchedulerExtensionService
+    with Logging {
+  private var services: List[SchedulerExtensionService] = Nil
+  private var sparkContext: SparkContext = _
+  private var appId: ApplicationId = _
+  private var attemptId: Option[ApplicationAttemptId] = _
+  private val started = new AtomicBoolean(false)
+  private var binding: SchedulerExtensionServiceBinding = _
+
+  /**
+   * Binding operation will load the named services and call bind on them too; the
+   * entire set of services are then ready for `init()` and `start()` calls.
+   * @param binding binding to the spark application and YARN
+   */
+  def start(binding: SchedulerExtensionServiceBinding): Unit = {
+    if (started.getAndSet(true)) {
+      logWarning("Ignoring re-entrant start operation")
+      return
+    }
+    require(binding.sparkContext != null, "Null context parameter")
+    require(binding.applicationId != null, "Null appId parameter")
+    this.binding = binding
+    sparkContext = binding.sparkContext
+    appId = binding.applicationId
+    attemptId = binding.attemptId
+    logInfo(s"Starting Yarn extension services with app ${binding.applicationId}" +
+      s" and attemptId $attemptId")
+
+    services = sparkContext.getConf.getOption(SchedulerExtensionServices.SPARK_YARN_SERVICES)
+      .map { s =>
+        s.split(",").map(_.trim()).filter(!_.isEmpty)
+          .map { sClass =>
+            val instance = Utils.classForName(sClass)
+              .newInstance()
--- End diff --

Do we need to try catch some exceptions like `ClassNotFound` here?
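The guard the reviewer asks about could look roughly like the sketch below. This is a hedged illustration, not Spark's actual code: `loadServices` and its error handling are hypothetical, and plain `Class.forName` stands in for Spark's `Utils.classForName`.

```scala
// Hypothetical sketch: load extension-service classes by name, skipping
// (with a logged error) any class that is missing or cannot be instantiated,
// instead of letting the exception abort scheduler startup.
def loadServices(classNames: Seq[String]): List[AnyRef] =
  classNames.flatMap { name =>
    try {
      // Reflectively instantiate via the no-arg constructor.
      Some(Class.forName(name).newInstance().asInstanceOf[AnyRef])
    } catch {
      case _: ClassNotFoundException =>
        Console.err.println(s"Extension service class not found: $name")
        None
      case e @ (_: InstantiationException | _: IllegalAccessException) =>
        Console.err.println(s"Could not instantiate $name: $e")
        None
    }
  }.toList
```

A stricter alternative is to let the exception propagate so that a typo in `spark.yarn.services` fails fast at startup rather than silently dropping a service; which behavior is preferable is exactly the design question the review comment raises.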
[GitHub] spark pull request: [SPARK-11314] [YARN] add service API and test ...
Github user jerryshao commented on a diff in the pull request: https://github.com/apache/spark/pull/9182#discussion_r43219282

--- Diff: yarn/src/main/scala/org/apache/spark/scheduler/cluster/SchedulerExtensionService.scala ---
[GitHub] spark pull request: [SPARK-11363][SQL] LeftSemiJoin should be Left...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9318#issuecomment-151728188 Merged build triggered.
[GitHub] spark pull request: [SPARK-11340][SPARKR] Support setting driver p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9290#issuecomment-151728226 Merged build finished. Test FAILed.
[GitHub] spark pull request: [SPARK-11340][SPARKR] Support setting driver p...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9290#issuecomment-151728227 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44492/
[GitHub] spark pull request: [SPARK-11363][SQL] LeftSemiJoin should be Left...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/9318#issuecomment-151728197 Merged build started.