[GitHub] incubator-spark pull request: For SPARK-1082, Use Curator for ZK i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/611#issuecomment-35474696 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12775/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. To do so, please top-post your response. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
Re: coding style discussion: explicit return type in public APIs
You are right. A degenerate case would be: def createFoo = new FooImpl() vs def createFoo: Foo = new FooImpl() The former will cause API instability. Reynold, maybe this is already avoided - and I understood it wrong? Thanks, Mridul On Wed, Feb 19, 2014 at 12:44 PM, Christopher Nguyen c...@adatao.com wrote: Mridul, IIUUC, what you've mentioned did come to mind, but I deemed it orthogonal to the stylistic issue Reynold is talking about. I believe you're referring to the case where there is a specific desired return type by API design, but the implementation does not return it directly, in which case, of course, one must define the return type. That's an API requirement and not just a matter of readability. We could add this as an NB in the proposed guideline. -- Christopher T. Nguyen Co-founder & CEO, Adatao http://adatao.com linkedin.com/in/ctnguyen On Tue, Feb 18, 2014 at 10:40 PM, Reynold Xin r...@databricks.com wrote: +1 Christopher's suggestion. Mridul, How would that happen? Case 3 requires the method to be invoking the constructor directly. It was implicit in my email, but the return type should be the same as the class itself. On Tue, Feb 18, 2014 at 10:37 PM, Mridul Muralidharan mri...@gmail.com wrote: Case 3 can be a potential issue. The current implementation might be returning a concrete class which we might want to change later - making it a type change. The intention might be to return an RDD (for example), but the inferred type might be a subclass of RDD - and future changes will cause a signature change. Regards, Mridul On Wed, Feb 19, 2014 at 11:52 AM, Reynold Xin r...@databricks.com wrote: Hi guys, Want to bring this issue to the table to see what other members of the community think, and then we can codify it in the Spark coding style guide. The topic is declaring return types explicitly in public APIs. In general I think we should favor explicit type declaration in public APIs.
However, I do think there are 3 cases where we can avoid the public API definition, because in these 3 cases the types are self-evident and repetitive. Case 1. toString Case 2. A method returning a string, or a val defining a string: def name = "abcd" // this is so obvious that it is a string val name = "edfg" // this too Case 3. The method or variable is invoking the constructor of a class and returns that immediately. For example: val a = new SparkContext(...) implicit def rddToAsyncRDDActions[T: ClassTag](rdd: RDD[T]) = new AsyncRDDActions(rdd) Thoughts?
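Mridul's degenerate case can be shown concretely. A minimal sketch - Foo, FooImpl, and ApiExample are illustrative names, not from the Spark codebase:

```scala
// Illustrative sketch of the inferred-return-type pitfall in Case 3.
trait Foo
class FooImpl extends Foo

object ApiExample {
  // Inferred return type is FooImpl. If a later change returns a different
  // Foo subclass, the method signature changes and binary compatibility breaks.
  def createFooInferred = new FooImpl

  // Explicit return type: the implementation can be swapped freely.
  def createFoo: Foo = new FooImpl
}
```

With the explicit annotation, replacing `new FooImpl` by any other `Foo` leaves the public signature unchanged.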
[GitHub] incubator-spark pull request: MLLIB-24: url of Collaborative Filt...
Github user CrazyJvm commented on the pull request: https://github.com/apache/incubator-spark/pull/619#issuecomment-35476826 There seems to be no problem using the Yahoo link. Or are you worried that the link might become invalid again? @mengxr
[GitHub] incubator-spark pull request: Spark 1095 : Adding explicit return ...
Github user NirmalReddy commented on the pull request: https://github.com/apache/incubator-spark/pull/610#issuecomment-35480180 @aarondav With this last commit I suppose I have completed the issue (SPARK-1095).
Spark 0.9.0
Hi, I am trying to use Apache Spark on a standalone cluster. After downloading Spark I tried to build the package. However, I am getting the following errors for the normal build using the default Hadoop version: gino@gino008:~/Downloads/spark-0.9.0-incubating$ sbt assembly Loading /usr/share/sbt/bin/sbt-launch-lib.bash [info] Loading project definition from /home/gino/Downloads/spark-0.9.0-incubating/project/project [info] Updating {file:/home/gino/Downloads/spark-0.9.0-incubating/project/project/}default-5f2b58... [info] Resolving org.scala-lang#scala-library;2.9.2 ... [error] Server access Error: Connection reset url=http://repo.typesafe.com/typesafe/ivy-releases/org.scala-lang/scala-library/2.9.2/jars/scala-library.jar [error] Server access Error: Connection reset url=http://scalasbt.artifactoryonline.com/scalasbt/sbt-plugin-releases/org.scala-lang/scala-library/2.9.2/jars/scala-library.jar [error] Server access Error: Connection reset url=http://repo1.maven.org/maven2/org/scala-lang/scala-library/2.9.2/scala-library-2.9.2.jar [info] Resolving org.scala-sbt#control;0.12.4 ... truncated- I am getting the following errors for the build using Hadoop 2.2.0: gino@gino008:~/Downloads/spark-0.9.0-incubating$ SPARK_HADOOP_VERSION=2.2.0 sbt assembly Loading /usr/share/sbt/bin/sbt-launch-lib.bash [info] Loading project definition from /home/gino/Downloads/spark-0.9.0-incubating/project/project [info] Updating {file:/home/gino/Downloads/spark-0.9.0-incubating/project/project/}default-5f2b58... [info] Resolving org.scala-lang#scala-compiler;2.9.2 ...
[error] Server access Error: Connection reset url=http://repo.typesafe.com/typesafe/ivy-releases/org.scala-lang/scala-compiler/2.9.2/jars/scala-compiler.jar [error] Server access Error: Connection reset url=http://scalasbt.artifactoryonline.com/scalasbt/sbt-plugin-releases/org.scala-lang/scala-compiler/2.9.2/jars/scala-compiler.jar [error] Server access Error: Connection reset url=http://repo1.maven.org/maven2/org/scala-lang/scala-compiler/2.9.2/scala-compiler-2.9.2.jar [info] Resolving org.sonatype.oss#oss-parent;7 ... [error] Server access Error: Connection reset url=http://repo.typesafe.com/typesafe/ivy-releases/org.sonatype.oss/oss-parent/7/jars/oss-parent.jar [error] Server access Error: Connection reset url=http://scalasbt.artifactoryonline.com/scalasbt/sbt-plugin-releases/org.sonatype.oss/oss-parent/7/jars/oss-parent.jar [error] Server access Error: Connection reset url=http://repo1.maven.org/maven2/org/sonatype/oss/oss-parent/7/oss-parent-7.jar [error] Server access Error: Connection reset url=http://repo.typesafe.com/typesafe/ivy-releases/jline/jline/1.0/jars/jline.jar [error] Server access Error: Connection reset url=http://scalasbt.artifactoryonline.com/scalasbt/sbt-plugin-releases/jline/jline/1.0/jars/jline.jar [error] Server access Error: Connection reset url=http://repo1.maven.org/maven2/jline/jline/1.0/jline-1.0.jar [info] Resolving org.scala-sbt#api;0.12.4 ... --truncated- Please advise how to download the Maven repositories. Thanks in advance, Gino Mathews K
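The repeated "Connection reset" errors usually point to a network problem between the build machine and the public repositories rather than to Spark itself. A sketch of one common fix, assuming the machine sits behind an HTTP proxy (proxy.example.com:8080 is a placeholder for your own proxy; the properties are the standard JVM networking settings, passed to sbt's JVM via SBT_OPTS):

```shell
# Point sbt's JVM at the local proxy before retrying the build.
export SBT_OPTS="-Dhttp.proxyHost=proxy.example.com -Dhttp.proxyPort=8080 \
-Dhttps.proxyHost=proxy.example.com -Dhttps.proxyPort=8080"
sbt assembly
```

If there is no proxy involved, the resets may simply be transient; retrying the build lets Ivy resume downloading the missing artifacts.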
[GitHub] incubator-spark pull request: [java8API] SPARK-964 Investigate the...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/539#issuecomment-35491849 Merged build started.
[GitHub] incubator-spark pull request: [SPARK-1094] Support MiMa for report...
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/585#discussion_r9863061 --- Diff: project/MimaBuild.scala --- @@ -0,0 +1,115 @@ +import com.typesafe.tools.mima.plugin.MimaKeys.{binaryIssueFilters, previousArtifact} +import com.typesafe.tools.mima.plugin.MimaPlugin.mimaDefaultSettings + +object MimaBuild { + + val ignoredABIProblems = { +import com.typesafe.tools.mima.core._ +import com.typesafe.tools.mima.core.ProblemFilters._ +/** + * A: Detections likely to become semi private at some point. + */ + Seq(exclude[MissingClassProblem]("org.apache.spark.util.XORShiftRandom"), + exclude[MissingClassProblem]("org.apache.spark.util.XORShiftRandom$"), + exclude[MissingMethodProblem]("org.apache.spark.util.Utils.cloneWritables"), + exclude[MissingMethodProblem]("org.apache.spark.util.collection.ExternalAppendOnlyMap#DiskMapIterator.nextItem_="), --- End diff -- exclude for a class does not work, I suppose.
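ScrapCodes's remark that a per-class exclude may not work suggests a coarser filter. A build-file sketch, under the assumption that the sbt-mima-plugin version in use provides `ProblemFilters.excludePackage` alongside the per-problem `exclude` shown in the PR (worth verifying against the plugin's documentation):

```scala
// Sketch of MiMa binary-compatibility filters for an sbt build file.
import com.typesafe.tools.mima.core._
import com.typesafe.tools.mima.core.ProblemFilters._

val ignoredABIProblems = Seq(
  // One package-level filter instead of listing a class and its companion
  // object (Foo and Foo$) separately.
  excludePackage("org.apache.spark.util.collection"),
  // A per-member filter, as in the PR.
  exclude[MissingMethodProblem]("org.apache.spark.util.Utils.cloneWritables")
)
```

The resulting sequence would be wired into `binaryIssueFilters` in the MiMa settings.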
[GitHub] incubator-spark pull request: Deprecated and added a few java api ...
Github user ScrapCodes commented on the pull request: https://github.com/apache/incubator-spark/pull/402#issuecomment-35494734 @pwendell are you okay with the changes?
[GitHub] incubator-spark pull request: [java8API] SPARK-964 Investigate the...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/539#issuecomment-35496756 Merged build started.
[GitHub] incubator-spark pull request: [java8API] SPARK-964 Investigate the...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/539#issuecomment-35496755 Merged build triggered.
[GitHub] incubator-spark pull request: [SPARK-1094] Support MiMa for report...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/585#issuecomment-35496818 Merged build finished.
[GitHub] incubator-spark pull request: [SPARK-1094] Support MiMa for report...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/585#issuecomment-35496819 One or more automated tests failed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12778/
[GitHub] incubator-spark pull request: [SPARK-1094] Support MiMa for report...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/585#issuecomment-35496946 Merged build triggered.
[GitHub] incubator-spark pull request: [SPARK-1094] Support MiMa for report...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/585#issuecomment-35496947 Merged build started.
[GitHub] incubator-spark pull request: MLLIB-24: url of Collaborative Filt...
Github user CodingCat commented on the pull request: https://github.com/apache/incubator-spark/pull/619#issuecomment-35497321 @mengxr The DOI link may not be accessible to non-paying users; I think the Yahoo Research link is relatively stable.
[GitHub] incubator-spark pull request: [java8API] SPARK-964 Investigate the...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/539#issuecomment-35501417 Merged build finished.
[GitHub] incubator-spark pull request: [SPARK-1094] Support MiMa for report...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/585#issuecomment-35501415 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12780/
[GitHub] incubator-spark pull request: Add Security to Spark - Akka, Http, ...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/332#discussion_r9868173 --- Diff: core/src/main/java/org/apache/spark/SparkSaslServer.java --- @@ -0,0 +1,189 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.spark; + +import org.apache.commons.net.util.Base64; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Map; +import java.util.TreeMap; + +import javax.security.auth.callback.Callback; +import javax.security.auth.callback.CallbackHandler; +import javax.security.auth.callback.NameCallback; +import javax.security.auth.callback.PasswordCallback; +import javax.security.auth.callback.UnsupportedCallbackException; +import javax.security.sasl.AuthorizeCallback; +import javax.security.sasl.RealmCallback; +import javax.security.sasl.Sasl; +import javax.security.sasl.SaslException; +import javax.security.sasl.SaslServer; +import java.io.IOException; + +/** + * Encapsulates SASL server logic for Server + */ +public class SparkSaslServer { + /** Logger */ + private static Logger LOG = LoggerFactory.getLogger(SparkSaslServer.class); + + /** + * Actual SASL work done by this object from javax.security.sasl.
+ * Initialized below in constructor. + */ + private SaslServer saslServer; + + public static final String SASL_DEFAULT_REALM = "default"; --- End diff -- Yeah, that code was specifically copied from Hadoop 0.23. I'll leave it for now and we can make it configurable in the next round of changes.
[GitHub] incubator-spark pull request: SPARK-1059. Now that we submit core ...
Github user tgravescs commented on the pull request: https://github.com/apache/incubator-spark/pull/555#issuecomment-35513822 @sryza can this be closed then? I think the important note you added to the running-on-YARN docs about the cores will suffice, along with my security PR.
[GitHub] incubator-spark pull request: [java8API] SPARK-964 Investigate the...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/539#issuecomment-35514615 Merged build triggered.
[GitHub] incubator-spark pull request: [SPARK-1105] fix site scala version ...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/618#discussion_r9870730 --- Diff: docs/index.md --- @@ -19,7 +19,7 @@ Spark uses [Simple Build Tool](http://www.scala-sbt.org), which is bundled with sbt/sbt assembly -For its Scala API, Spark {{site.SPARK_VERSION}} depends on Scala {{site.SCALA_VERSION}}. If you write applications in Scala, you will need to use this same version of Scala in your own program -- newer major versions may not work. You can get the right version of Scala from [scala-lang.org](http://www.scala-lang.org/download/). +For its Scala API, Spark {{site.SPARK_VERSION}} depends on Scala {{site.SCALA_BINARY_VERSION}}. If you write applications in Scala, you will need to use this same version of Scala in your own program -- newer major versions may not work. You can get the right version of Scala from [scala-lang.org](http://www.scala-lang.org/download/). --- End diff -- To make this more clear, it might be good to say: If you write applications in Scala, you will need to use a compatible Scala version (e.g. {{site.SCALA_BINARY_VERSION}}.X) -- newer major versions may not work.
[GitHub] incubator-spark pull request: [SPARK-1105] fix site scala version ...
Github user pwendell commented on the pull request: https://github.com/apache/incubator-spark/pull/618#issuecomment-35516252 LGTM pending a small fix -- @aarondav want to take a look?
Re: coding style discussion: explicit return type in public APIs
+1 overall. Christopher - I agree that once the number of rules becomes large, it's more efficient to pursue a use-your-judgement approach. However, since this is only 3 cases, I'd prefer to wait and see if it grows. The concern with this approach is that for newer people, contributors, etc., it's hard for them to understand what good judgement is. Many are new to Scala, so explicit rules are generally better. - Patrick On Wed, Feb 19, 2014 at 12:19 AM, Reynold Xin r...@databricks.com wrote: Yes, the case you brought up is not a matter of readability or style. If it returns a different type, it should be declared (otherwise it is just wrong).
[GitHub] incubator-spark pull request: [java8API] SPARK-964 Investigate the...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/539#issuecomment-35521749 Merged build finished.
[GitHub] incubator-spark pull request: For SPARK-1082, Use Curator for ZK i...
Github user aarondav commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/611#discussion_r9875482 --- Diff: core/src/main/scala/org/apache/spark/deploy/master/ZooKeeperLeaderElectionAgent.scala --- @@ -18,105 +18,73 @@ package org.apache.spark.deploy.master import akka.actor.ActorRef -import org.apache.zookeeper._ -import org.apache.zookeeper.Watcher.Event.EventType import org.apache.spark.{SparkConf, Logging} import org.apache.spark.deploy.master.MasterMessages._ +import org.apache.curator.framework.CuratorFramework +import org.apache.curator.framework.recipes.leader.{LeaderLatchListener, LeaderLatch} private[spark] class ZooKeeperLeaderElectionAgent(val masterActor: ActorRef, masterUrl: String, conf: SparkConf) - extends LeaderElectionAgent with SparkZooKeeperWatcher with Logging { + extends LeaderElectionAgent with LeaderLatchListener with Logging { val WORKING_DIR = conf.get("spark.deploy.zookeeper.dir", "/spark") + "/leader_election" - private val watcher = new ZooKeeperWatcher() - private val zk = new SparkZooKeeperSession(this, conf) + private var zk: CuratorFramework = _ + private var leaderLatch: LeaderLatch = _ private var status = LeadershipStatus.NOT_LEADER - private var myLeaderFile: String = _ - private var leaderUrl: String = _ override def preStart() { + logInfo("Starting ZooKeeper LeaderElection agent") -zk.connect() - } +zk = SparkCuratorUtil.newClient(conf) +leaderLatch = new LeaderLatch(zk, WORKING_DIR) +leaderLatch.addListener(this) - override def zkSessionCreated() { -synchronized { - zk.mkdirRecursive(WORKING_DIR) - myLeaderFile = -zk.create(WORKING_DIR + "/master_", masterUrl.getBytes, CreateMode.EPHEMERAL_SEQUENTIAL) - self ! CheckLeader -} +leaderLatch.start() } override def preRestart(reason: scala.Throwable, message: scala.Option[scala.Any]) { -logError("LeaderElectionAgent failed, waiting " + zk.ZK_TIMEOUT_MILLIS + "...", reason) -Thread.sleep(zk.ZK_TIMEOUT_MILLIS) +logError("LeaderElectionAgent failed...", reason) super.preRestart(reason, message) } - override def zkDown() { -logError("ZooKeeper down! LeaderElectionAgent shutting down Master.") -System.exit(1) - } - override def postStop() { +leaderLatch.close() zk.close() } override def receive = { -case CheckLeader => checkLeader() +case _ => } - private class ZooKeeperWatcher extends Watcher { -def process(event: WatchedEvent) { - if (event.getType == EventType.NodeDeleted) { -logInfo("Leader file disappeared, a master is down!") -self ! CheckLeader + override def isLeader() { +// In case that leadship gain and lost in a short time. +Thread.sleep(1000) --- End diff -- Ah, sorry if I was unclear, but I was just joking about putting a sleep(1000) in here. The real solution is to add a synchronized block to isLeader and notLeader -- I was just making a point that we're not concerned with the overhead of synchronization in this code path. (The synchronized block is not needed with the current implementation and use of Curator, but I think it makes the code clearer without a real downside.)
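aarondav's suggested fix can be sketched independently of Curator. A minimal, self-contained illustration of the pattern (class and method names here are illustrative, not Spark's actual code): both callbacks share one monitor, so a rapid gain-then-loss of leadership cannot interleave the status updates, and no sleep is needed.

```scala
// Sketch of synchronized LeaderLatchListener-style callbacks.
class LeaderStatus {
  private var leader = false

  // Curator may invoke these from its own threads; synchronizing both
  // callbacks on the same object serializes the updates.
  def isLeader(): Unit = synchronized { leader = true }

  def notLeader(): Unit = synchronized { leader = false }

  def currentlyLeader: Boolean = synchronized { leader }
}
```

In the real agent, the bodies would update the master's leadership status instead of a boolean flag.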
[GitHub] incubator-spark pull request: [SPARK-1105] fix site scala version ...
Github user aarondav commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/618#discussion_r9875739 --- Diff: docs/scala-programming-guide.md --- @@ -17,12 +17,12 @@ This guide shows each of these features and walks through some samples. It assum # Linking with Spark -Spark {{site.SPARK_VERSION}} uses Scala {{site.SCALA_VERSION}}. If you write applications in Scala, you'll need to use this same version of Scala in your program -- newer major versions may not work. +Spark {{site.SPARK_VERSION}} uses Scala {{site.SCALA_BINARY_VERSION}}. If you write applications in Scala, you'll need to use this same version of Scala in your program -- newer major versions may not work. --- End diff -- I suppose we should repeat Patrick's comment here.
[GitHub] incubator-spark pull request: Add Security to Spark - Akka, Http, ...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/332#discussion_r9875857

--- Diff: core/src/main/scala/org/apache/spark/SecurityManager.scala ---
@@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+import org.apache.hadoop.io.Text
+import org.apache.hadoop.security.Credentials
+import org.apache.hadoop.security.UserGroupInformation
+
+import org.apache.spark.deploy.SparkHadoopUtil
+
+/**
+ * Spark class responsible for security.
+ */
+private[spark] class SecurityManager extends Logging {
+
+  private val isAuthOn = System.getProperty("spark.authenticate", "false").toBoolean
+  private val isUIAuthOn = System.getProperty("spark.authenticate.ui", "false").toBoolean
+  private val viewAcls = System.getProperty("spark.ui.view.acls", "").split(',').map(_.trim()).toSet
+  private val secretKey = generateSecretKey()
+  logDebug("is auth enabled = " + isAuthOn + " is uiAuth enabled = " + isUIAuthOn)
+
+  /**
+   * In Yarn mode it uses Hadoop UGI to pass the secret as that
+   * will keep it protected. For a standalone SPARK cluster
+   * use an environment variable SPARK_SECRET to specify the secret.
+   * This probably isn't ideal but only the user who starts the process
+   * should have access to view the variable (at least on Linux).
+   * Since we can't set the environment variable we set the
+   * java system property SPARK_SECRET so it will automatically
+   * generate a secret if one is not specified. This definitely is not
+   * ideal since users can see it. We should switch to put it in
+   * a config.
+   */
+  private def generateSecretKey(): String = {
+    if (!isAuthenticationEnabled) return null
+    // first check to see if secret already set, else generate it
+    if (SparkHadoopUtil.get.isYarnMode) {
+      val credentials = SparkHadoopUtil.get.getCurrentUserCredentials()
+      if (credentials != null) {
+        val secretKey = credentials.getSecretKey(new Text("akkaCookie"))
+        if (secretKey != null) {
+          logDebug("in yarn mode, getting secret from credentials")
+          return new Text(secretKey).toString
+        } else {
+          logDebug("getSecretKey: yarn mode, secret key from credentials is null")
+        }
+      } else {
+        logDebug("getSecretKey: yarn mode, credentials are null")
+      }
+    }
+    val secret = System.getProperty("SPARK_SECRET", System.getenv("SPARK_SECRET"))
+    if (secret != null && !secret.isEmpty()) return secret
+    // generate one
+    val sCookie = akka.util.Crypt.generateSecureCookie
+
+    // if we generate we must be the first so lets set it so its used by everyone else
+    if (SparkHadoopUtil.get.isYarnMode) {
+      val creds = new Credentials()
+      creds.addSecretKey(new Text("akkaCookie"), sCookie.getBytes())
+      SparkHadoopUtil.get.addCurrentUserCredentials(creds)
+      logDebug("adding secret to credentials yarn mode")
+    } else {
+      System.setProperty("SPARK_SECRET", sCookie)
+      logDebug("adding secret to java property")
+    }
+    return sCookie
+  }
+
+  def isUIAuthenticationEnabled(): Boolean = isUIAuthOn
+
+  // allow anyone in the acl list and the application owner
+  def checkUIViewPermissions(user: String): Boolean = {
+    if (isUIAuthenticationEnabled() && (user != null)) {
+      if ((!viewAcls.contains(user)) && (user != System.getProperty("user.name"))) {
---
End diff --

Good idea to just prepopulate it. I assume it's safer to just add both user.name and SPARK_USER to the acl list if they are set?

---
If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. To do so, please top-post your response. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.
---
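The prepopulation idea discussed above can be sketched roughly as follows. This is a hypothetical standalone sketch, not the actual SecurityManager code: it builds the UI view ACL set so that both the JVM's `user.name` property and the `SPARK_USER` environment variable are included whenever they are set.

```scala
// Hypothetical sketch (not the actual SecurityManager implementation):
// prepopulate the view ACL set with the process owner identities.
object AclSketch {
  def buildViewAcls(configured: Set[String]): Set[String] = {
    // Option(...) guards against either value being absent (null).
    val owners = Set(
      Option(System.getProperty("user.name")),
      Option(System.getenv("SPARK_USER"))
    ).flatten
    configured ++ owners
  }

  def main(args: Array[String]): Unit = {
    val acls = buildViewAcls(Set("alice", "bob"))
    // The user who started the JVM can always view the UI.
    println(acls.contains(System.getProperty("user.name")))
  }
}
```

With this shape, the permission check can remain a plain set-membership test instead of special-casing the application owner.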
[GitHub] incubator-spark pull request: Add Security to Spark - Akka, Http, ...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/332#discussion_r9875875

--- Diff: core/src/main/scala/org/apache/spark/SecurityManager.scala ---
+    if (SparkHadoopUtil.get.isYarnMode) {
+      val creds = new Credentials()
+      creds.addSecretKey(new Text("akkaCookie"), sCookie.getBytes())
---
End diff --

yep, I'll update.
---
[GitHub] incubator-spark pull request: Add Security to Spark - Akka, Http, ...
Github user tgravescs commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/332#discussion_r9875896

--- Diff: core/src/main/scala/org/apache/spark/network/Connection.scala ---
@@ -431,6 +466,7 @@ private[spark] class ReceivingConnection(channel_ : SocketChannel, selector_ : S
         val newMessage = Message.create(header).asInstanceOf[BufferMessage]
         newMessage.started = true
         newMessage.startTime = System.currentTimeMillis
+        newMessage.isSecurityNeg = if (header.securityNeg == 1) true else false
---
End diff --

ah, ok.
---
[GitHub] incubator-spark pull request: [SPARK-1105] fix site scala version ...
Github user CodingCat commented on the pull request: https://github.com/apache/incubator-spark/pull/618#issuecomment-35530650

thank you very much for your comments @pwendell @aarondav
---
Re: coding style discussion: explicit return type in public APIs
Patrick, I sympathize with your sensibility here, and at face value there's very little daylight between (a) a rule comprising a small set of enumerated items and (b) a guideline followed by the same set as examples. My suggestion had a non-obvious tl;dr thesis behind it, so allow me to show my cards :)

First, rules can be costly for the rule makers to create and maintain to ensure necessity and sufficiency, and can unintentionally encourage mischievous, often tedious arguments to work around those rules. In the area of coding style, even at Google (at least when I was there), we had guides rather than rules. It turns out that guidelines are also easier to socialize and enforce than enumerated rules. <strawman_humor>A Google search for "coding style guide" returns 15.7 million results, while that for "coding style rule" has 6.6M, and most of *those* are articles about "coding style guide".</strawman_humor>

More importantly, I've found Spark's to be one of the best socially-engineered communities I've participated in. It is quite helpful and welcoming to newcomers while (not paradoxically) comprising one of the highest median qualities of participants, per my calibration of, e.g., the various meetups I've gone to in the SF Bay Area. This community friendliness and mutual regard are not accidental and have contributed in part to Spark's success to date. It seems quite tolerant of newbies and implicitly recognizes that there may be a lot of valuable expertise and interesting use cases we can learn from the person behind that idiotic-sounding question, who might go on to contribute valuable PRs. I've yet to see the acronym RTFM used in anger here.

Now, rules don't automatically negate that, but they can be discouraging to navigate ("Have I broken some rule?") and misused as devices to shoot others ("You've just broken our rule #178.S4.P2"). I'd rather see those things kept to a minimum, in locked cabinets.
For the above reasons, I would suggest, for Spark, guidelines over rules whenever feasible/tolerable, certainly in the area of coding style.

Cheers,
--
Christopher T. Nguyen
Co-founder & CEO, Adatao
http://adatao.com
linkedin.com/in/ctnguyen

On Wed, Feb 19, 2014 at 8:37 AM, Patrick Wendell pwend...@gmail.com wrote:

+1 overall. Christopher - I agree that once the number of rules becomes large it's more efficient to pursue a "use your judgement" approach. However, since this is only 3 cases I'd prefer to wait to see if it grows. The concern with this approach is that for newer people, contributors, etc. it's hard for them to understand what good judgement is. Many are new to Scala, so explicit rules are generally better.

- Patrick
Re: coding style discussion: explicit return type in public APIs
Without bikeshedding this too much ... It is likely "incorrect" (not "wrong") - and rules like this potentially cause things to slip through. Explicit return type strictly specifies what is being exposed (think in face of impl change - createFoo changes in future from Foo to Foo1 or Foo2) .. being conservative about how to specify exposed interfaces, imo, outweighs potential gains in brevity of code. Btw this is a degenerate contrived example already stretching its use ...

Regards,
Mridul

On Feb 19, 2014 1:49 PM, Reynold Xin r...@databricks.com wrote:

Yes, the case you brought up is not a matter of readability or style. If it returns a different type, it should be declared (otherwise it is just wrong).

On Wed, Feb 19, 2014 at 12:17 AM, Mridul Muralidharan mri...@gmail.com wrote:

You are right. A degenerate case would be:

def createFoo = new FooImpl()

vs

def createFoo: Foo = new FooImpl()

Former will cause api instability. Reynold, maybe this is already avoided - and I understood it wrong?

Thanks,
Mridul

On Wed, Feb 19, 2014 at 12:44 PM, Christopher Nguyen c...@adatao.com wrote:

Mridul, IIUUC, what you've mentioned did come to mind, but I deemed it orthogonal to the stylistic issue Reynold is talking about. I believe you're referring to the case where there is a specific desired return type by API design, but the implementation does not, in which case, of course, one must define the return type. That's an API requirement and not just a matter of readability. We could add this as an NB in the proposed guideline.

--
Christopher T. Nguyen
Co-founder & CEO, Adatao
http://adatao.com
linkedin.com/in/ctnguyen

On Tue, Feb 18, 2014 at 10:40 PM, Reynold Xin r...@databricks.com wrote:

+1 Christopher's suggestion.
Mridul, How would that happen? Case 3 requires the method to be invoking the constructor directly. It was implicit in my email, but the return type should be the same as the class itself.

On Tue, Feb 18, 2014 at 10:37 PM, Mridul Muralidharan mri...@gmail.com wrote:

Case 3 can be a potential issue. Current implementation might be returning a concrete class which we might want to change later - making it a type change. The intention might be to return an RDD (for example), but the inferred type might be a subclass of RDD - and future changes will cause signature change.

Regards,
Mridul

On Wed, Feb 19, 2014 at 11:52 AM, Reynold Xin r...@databricks.com wrote:

Hi guys,

Want to bring to the table this issue to see what other members of the community think, and then we can codify it in the Spark coding style guide. The topic is about declaring return types explicitly in public APIs. In general I think we should favor explicit type declaration in public APIs. However, I do think there are 3 cases where we can avoid the type in the public API definition, because in these 3 cases the types are self-evident and repetitive.

Case 1. toString

Case 2. A method returning a string or a val defining a string

def name = "abcd"  // this is so obvious that it is a string
val name = "edfg"  // this too

Case 3. The method or variable is invoking the constructor of a class and returning that immediately. For example:

val a = new SparkContext(...)
implicit def rddToAsyncRDDActions[T: ClassTag](rdd: RDD[T]) = new AsyncRDDActions(rdd)

Thoughts?
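The three proposed exceptions can be sketched in one place. This is an illustrative example with a hypothetical `Engine` class (not Spark code), contrasting the inference-allowed cases with the general rule of explicit return types in public APIs.

```scala
// Hypothetical class illustrating the three cases where type inference
// is proposed to be acceptable, plus the general explicit-type rule.
class Engine {
  // Case 1: toString -- obviously a String.
  override def toString = "Engine"

  // Case 2: a def or val that is a string literal.
  def name = "engine"
  val label = "default"

  // Case 3: directly invoking a constructor and returning the result;
  // the type is understood to be the class itself.
  def copy() = new Engine

  // General rule: everything else in a public API declares its type.
  def describe(verbose: Boolean): String =
    if (verbose) s"$name ($label)" else name
}

object CasesSketch {
  def main(args: Array[String]): Unit = {
    val e = new Engine
    println(e.name)
    println(e.describe(verbose = true))  // prints "engine (default)"
  }
}
```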
Re: coding style discussion: explicit return type in public APIs
One slight concern regarding primitive types -- in particular, Ints and Longs can have semantic differences when it comes to overflow, so it's often good to know what type of variable you're returning. Perhaps it is sufficient to say that Int is the default numeric type, and that other types should be specified explicitly.

On Wed, Feb 19, 2014 at 8:37 AM, Patrick Wendell pwend...@gmail.com wrote:

+1 overall. Christopher - I agree that once the number of rules becomes large it's more efficient to pursue a "use your judgement" approach. However, since this is only 3 cases I'd prefer to wait to see if it grows. The concern with this approach is that for newer people, contributors, etc. it's hard for them to understand what good judgement is. Many are new to Scala, so explicit rules are generally better.

- Patrick
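The overflow concern above can be made concrete with a small sketch (hypothetical functions, for illustration only): with an inferred return type, a reader cannot tell from the call site whether arithmetic wraps at 32 bits or 64 bits.

```scala
// Sketch of the Int-vs-Long overflow concern with inferred return types.
object OverflowSketch {
  // Inferred as Int: silently wraps past Int.MaxValue.
  def nextIdInferred(id: Int) = id + 1

  // Explicit Long: the widening intent is visible in the signature.
  def nextIdExplicit(id: Int): Long = id.toLong + 1L

  def main(args: Array[String]): Unit = {
    println(nextIdInferred(Int.MaxValue))  // wraps to -2147483648
    println(nextIdExplicit(Int.MaxValue))  // 2147483648
  }
}
```

Both definitions look identical at the call site, which is exactly why an explicit annotation helps here.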
Re: coding style discussion: explicit return type in public APIs
Mridul, Can you be more specific in the createFoo example?

def myFunc = createFoo

is disallowed in my guideline. It is invoking a function createFoo, not the constructor of Foo.

On Wed, Feb 19, 2014 at 10:39 AM, Mridul Muralidharan mri...@gmail.com wrote:

Without bikeshedding this too much ... It is likely "incorrect" (not "wrong") - and rules like this potentially cause things to slip through. Explicit return type strictly specifies what is being exposed (think in face of impl change - createFoo changes in future from Foo to Foo1 or Foo2) .. being conservative about how to specify exposed interfaces, imo, outweighs potential gains in brevity of code. Btw this is a degenerate contrived example already stretching its use ...

Regards,
Mridul
Re: coding style discussion: explicit return type in public APIs
I found Haskell's convention of including type signatures as documentation to be worthwhile. http://www.haskell.org/haskellwiki/Type_signatures_as_good_style

I'd support a guideline to include type signatures where they're unclear, but would prefer to leave it quite vague. In my experience, the lightest process is the best process for contributions. Strict rules here _will_ drive away contributors.

On Wed, Feb 19, 2014 at 10:42 AM, Reynold Xin r...@databricks.com wrote:

Mridul, Can you be more specific in the createFoo example?

def myFunc = createFoo

is disallowed in my guideline. It is invoking a function createFoo, not the constructor of Foo.
Re: coding style discussion: explicit return type in public APIs
My initial mail had it listed; adding more details here since I assume I am missing something or not being clear - please note, this is just illustrative and my scala knowledge is bad :-) (I am trying to draw parallels from mistakes in the java world)

def createFoo = new Foo()

to

def createFoo = new Foo1()

to

def createFoo = new Foo2()

(appropriate inheritance applied - parent Foo). I am thinking from the api evolution and binary compatibility point of view.

Regards,
Mridul

On Feb 20, 2014 12:12 AM, Reynold Xin r...@databricks.com wrote:

Mridul, Can you be more specific in the createFoo example?

def myFunc = createFoo

is disallowed in my guideline. It is invoking a function createFoo, not the constructor of Foo.
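Mridul's evolution scenario can be sketched concretely (a hypothetical Foo hierarchy, for illustration): with an inferred return type the compiled signature of `createFoo` is whatever concrete class the body happens to construct, so changing the implementation later changes the method signature; an explicit supertype pins the API down.

```scala
// Hypothetical Foo hierarchy illustrating the binary-compatibility point.
trait Foo
class Foo1 extends Foo
class Foo2 extends Foo

object Factory {
  // Inferred: the signature returns Foo1. Swapping in Foo2 later is a
  // source-compatible but binary-incompatible change for compiled callers.
  def createFooInferred = new Foo1

  // Explicit: the signature is pinned to Foo; the implementation can
  // move from Foo1 to Foo2 without changing the public API.
  def createFoo: Foo = new Foo1
}

object CompatSketch {
  def main(args: Array[String]): Unit = {
    println(Factory.createFoo.isInstanceOf[Foo])              // true
    println(Factory.createFooInferred.getClass.getSimpleName) // Foo1
  }
}
```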
Re: coding style discussion: explicit return type in public APIs
I agree, makes sense. Please note I was referring only to exposed user api in my comments - not other code! Regards, Mridul

On Feb 20, 2014 12:15 AM, Andrew Ash and...@andrewash.com wrote: I found Haskell's convention of including type signatures as documentation to be worthwhile. http://www.haskell.org/haskellwiki/Type_signatures_as_good_style I'd support a guideline to include type signatures where they're unclear but would prefer to leave it quite vague. In my experience, the lightest process is the best process for contributions. Strict rules here _will_ drive away contributors.

On Wed, Feb 19, 2014 at 10:42 AM, Reynold Xin r...@databricks.com wrote: Mridul, Can you be more specific in the createFoo example? def myFunc = createFoo is disallowed in my guideline. It is invoking a function createFoo, not the constructor of Foo.

On Wed, Feb 19, 2014 at 10:39 AM, Mridul Muralidharan mri...@gmail.com wrote: Without bikeshedding this too much ... It is likely incorrect (not wrong) - and rules like this potentially cause things to slip through. An explicit return type strictly specifies what is being exposed (think in face of impl change - createFoo changes in future from Foo to Foo1 or Foo2) .. being conservative about how we specify exposed interfaces, imo, outweighs potential gains in brevity of code. Btw this is a degenerate contrived example already stretching its use ... Regards, Mridul

On Feb 19, 2014 1:49 PM, Reynold Xin r...@databricks.com wrote: Yes, the case you brought up is not a matter of readability or style. If it returns a different type, it should be declared (otherwise it is just wrong).

On Wed, Feb 19, 2014 at 12:17 AM, Mridul Muralidharan mri...@gmail.com wrote: You are right. A degenerate case would be: def createFoo = new FooImpl() vs def createFoo: Foo = new FooImpl() The former will cause api instability. Reynold, maybe this is already avoided - and I understood it wrong? Thanks, Mridul

On Wed, Feb 19, 2014 at 12:44 PM, Christopher Nguyen c...@adatao.com wrote: Mridul, IIUUC, what you've mentioned did come to mind, but I deemed it orthogonal to the stylistic issue Reynold is talking about. I believe you're referring to the case where there is a specific desired return type by API design, but the implementation does not return it, in which case, of course, one must define the return type. That's an API requirement and not just a matter of readability. We could add this as an NB in the proposed guideline. -- Christopher T. Nguyen Co-founder CEO, Adatao http://adatao.com linkedin.com/in/ctnguyen

On Tue, Feb 18, 2014 at 10:40 PM, Reynold Xin r...@databricks.com wrote: +1 Christopher's suggestion. Mridul, How would that happen? Case 3 requires the method to be invoking the constructor directly. It was implicit in my email, but the return type should be the same as the class itself.

On Tue, Feb 18, 2014 at 10:37 PM, Mridul Muralidharan mri...@gmail.com wrote: Case 3 can be a potential issue. The current implementation might be returning a concrete class which we might want to change later - making it a type change. The intention might be to return an RDD (for example), but the inferred type might be a subclass of RDD - and future changes will cause a signature change. Regards, Mridul

On Wed, Feb 19, 2014 at 11:52 AM, Reynold Xin r...@databricks.com wrote: Hi guys, Want to bring to the table this issue to see what other members of the community think and then we can codify it in the Spark coding style guide. The topic is about declaring return types explicitly in public APIs. In general I think we should favor explicit type declaration in public APIs. However, I do think there are 3 cases where we can avoid the explicit declaration, because in these 3 cases the types are self-evident and repetitive.

Case 1. toString

Case 2. A method returning a string or a val defining a string

def name = "abcd" // this is so obvious that it is a string
val name = "edfg" // this too

Case 3. The method or variable is invoking the constructor of a class and returning that immediately. For example:

val a = new SparkContext(...)
implicit def rddToAsyncRDDActions[T: ClassTag](rdd: RDD[T]) = new
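The instability discussed in the thread above can be sketched in a few lines. The `Foo`/`FooImpl` names are the hypothetical ones from the emails, not real Spark types; this is only an illustration of how an inferred result type leaks the concrete class into the public signature:

```scala
// Hypothetical types, mirroring the createFoo example in the thread.
trait Foo { def value: Int }
class FooImpl extends Foo { def value = 42 }
class FooImpl2 extends Foo { def value = 43 }

object Factory {
  // Inferred result type is FooImpl, not Foo. If the body is later
  // changed to `new FooImpl2()`, the method's signature changes and
  // callers compiled against the old jar break at link time.
  def createFoo = new FooImpl()

  // Explicit result type: the implementation can change freely
  // without affecting the exposed API.
  def createFooStable: Foo = new FooImpl()
}
```

Calling either method looks identical at the use site; the difference only shows up in the emitted bytecode signature, which is why it is easy to miss in review.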
[GitHub] incubator-spark pull request: Add Security to Spark - Akka, Http, ...
Github user mridulm commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/332#discussion_r9878293 --- Diff: core/src/main/scala/org/apache/spark/SecurityManager.scala --- @@ -0,0 +1,112 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark
+
+import org.apache.hadoop.io.Text
+import org.apache.hadoop.security.Credentials
+import org.apache.hadoop.security.UserGroupInformation
+
+import org.apache.spark.deploy.SparkHadoopUtil
+
+/**
+ * Spark class responsible for security.
+ */
+private[spark] class SecurityManager extends Logging {
+
+  private val isAuthOn = System.getProperty("spark.authenticate", "false").toBoolean
+  private val isUIAuthOn = System.getProperty("spark.authenticate.ui", "false").toBoolean
+  private val viewAcls = System.getProperty("spark.ui.view.acls", "").split(',').map(_.trim()).toSet
+  private val secretKey = generateSecretKey()
+  logDebug("is auth enabled = " + isAuthOn + " is uiAuth enabled = " + isUIAuthOn)
+
+  /**
+   * In Yarn mode it uses Hadoop UGI to pass the secret as that
+   * will keep it protected. For a standalone SPARK cluster
+   * use an environment variable SPARK_SECRET to specify the secret.
+   * This probably isn't ideal but only the user who starts the process
+   * should have access to view the variable (at least on Linux).
+   * Since we can't set the environment variable we set the
+   * java system property SPARK_SECRET so it will automatically
+   * generate a secret if one is not specified. This definitely is not
+   * ideal since users can see it. We should switch to put it in
+   * a config.
+   */
+  private def generateSecretKey(): String = {
+    if (!isAuthenticationEnabled) return null
+    // first check to see if secret already set, else generate it
+    if (SparkHadoopUtil.get.isYarnMode) {
+      val credentials = SparkHadoopUtil.get.getCurrentUserCredentials()
+      if (credentials != null) {
+        val secretKey = credentials.getSecretKey(new Text("akkaCookie"))
+        if (secretKey != null) {
+          logDebug("in yarn mode, getting secret from credentials")
+          return new Text(secretKey).toString
+        } else {
+          logDebug("getSecretKey: yarn mode, secret key from credentials is null")
+        }
+      } else {
+        logDebug("getSecretKey: yarn mode, credentials are null")
+      }
+    }
+    val secret = System.getProperty("SPARK_SECRET", System.getenv("SPARK_SECRET"))
+    if (secret != null && !secret.isEmpty()) return secret
+    // generate one
+    val sCookie = akka.util.Crypt.generateSecureCookie
+
+    // if we generate we must be the first so lets set it so its used by everyone else
+    if (SparkHadoopUtil.get.isYarnMode) {
+      val creds = new Credentials()
+      creds.addSecretKey(new Text("akkaCookie"), sCookie.getBytes())
+      SparkHadoopUtil.get.addCurrentUserCredentials(creds)
+      logDebug("adding secret to credentials yarn mode")
+    } else {
+      System.setProperty("SPARK_SECRET", sCookie)
+      logDebug("adding secret to java property")
+    }
+    return sCookie
+  }
+
+  def isUIAuthenticationEnabled(): Boolean = return isUIAuthOn
+
+  // allow anyone in the acl list and the application owner
+  def checkUIViewPermissions(user: String): Boolean = {
+    if (isUIAuthenticationEnabled() && (user != null)) {
+      if ((!viewAcls.contains(user)) && (user != System.getProperty("user.name"))) { ---
End diff -- Agree, that sounds fine. Regards, Mridul On Feb 19, 2014 11:43 PM, Tom Graves notificati...@github.com wrote: In core/src/main/scala/org/apache/spark/SecurityManager.scala:
+      creds.addSecretKey(new Text("akkaCookie"), sCookie.getBytes())
+      SparkHadoopUtil.get.addCurrentUserCredentials(creds)
+      logDebug("adding secret to credentials yarn
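The secret-resolution order discussed in the diff above (credentials in YARN mode, then an explicitly supplied SPARK_SECRET, then a freshly generated cookie cached in a system property) can be sketched as a small standalone function. This is a simplified illustration, not the Spark implementation: it omits the Hadoop/YARN credentials path entirely and uses `SecureRandom` in place of Akka's cookie generator.

```scala
import java.security.SecureRandom

object SecretSketch {
  // Resolution order: an explicitly supplied secret (system property,
  // falling back to the environment variable) wins; otherwise a random
  // cookie is generated and cached in the system property so that later
  // callers in the same JVM reuse the same secret.
  def resolveSecret(): String = {
    val supplied = System.getProperty("SPARK_SECRET", System.getenv("SPARK_SECRET"))
    if (supplied != null && !supplied.isEmpty) return supplied
    val bytes = new Array[Byte](32)
    new SecureRandom().nextBytes(bytes)
    val cookie = bytes.map("%02x".format(_)).mkString
    System.setProperty("SPARK_SECRET", cookie) // first generator wins
    cookie
  }
}
```

As the diff's own comment notes, caching the generated secret in a JVM-visible system property is a stopgap; the thread agrees a proper config entry would be better.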
[GitHub] incubator-spark pull request: MLLIB-24: url of Collaborative Filt...
Github user mengxr commented on the pull request: https://github.com/apache/incubator-spark/pull/619#issuecomment-35535759 DOI links are permanent so we don't need to worry about the link becoming invalid again. People will do a search and find the pdf easily if they don't have access to IEEE. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. To do so, please top-post your response. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-spark pull request: [Proposal] Adding sparse data suppor...
Github user fommil commented on the pull request: https://github.com/apache/incubator-spark/pull/575#issuecomment-35546981 @mengxr consider this message to be proof that jniloader is distributed under the Apache license. I'll update the build files next time I need a code change. If you want it quicker, issue a PR (and add it as a dual license) ;-)
[GitHub] incubator-spark pull request: [Proposal] Adding sparse data suppor...
Github user fommil commented on the pull request: https://github.com/apache/incubator-spark/pull/575#issuecomment-35547276 @srowen "The LGPL is ineligible primarily due to the restrictions it places on larger works, violating the third license criterion. Therefore, LGPL-licensed works must not be included in Apache products." where the third license criterion is "The license must not place restrictions on the distribution of larger works, other than to require that the covered component still complies with the conditions of its license." I do not see any violation here.
[GitHub] incubator-spark pull request: [Proposal] Adding sparse data suppor...
Github user fommil commented on the pull request: https://github.com/apache/incubator-spark/pull/575#issuecomment-35548061 @srowen I've asked the question. I'm interested to see the response: https://issues.apache.org/jira/browse/LEGAL-192
[GitHub] incubator-spark pull request: [Proposal] Adding sparse data suppor...
Github user mengxr commented on the pull request: https://github.com/apache/incubator-spark/pull/575#issuecomment-35557645 @fommil Thanks a lot! The license JIRA is also interesting to follow ~ :)
[GitHub] incubator-spark pull request: [SPARK-1105] fix site scala version ...
Github user pwendell commented on the pull request: https://github.com/apache/incubator-spark/pull/618#issuecomment-35567163 Thanks guys I put this in master and 0.9.
[GitHub] incubator-spark pull request: [SPARK-1105] fix site scala version ...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-spark/pull/618
[GitHub] incubator-spark pull request: [SPARK-1094] Support MiMa for report...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/585#discussion_r9890431 --- Diff: project/MimaBuild.scala --- @@ -0,0 +1,105 @@
+import com.typesafe.tools.mima.plugin.MimaKeys.{binaryIssueFilters, previousArtifact}
+import com.typesafe.tools.mima.plugin.MimaPlugin.mimaDefaultSettings
+
+object MimaBuild {
+
+  val ignoredABIProblems = {
+    import com.typesafe.tools.mima.core._
+    import com.typesafe.tools.mima.core.ProblemFilters._
+    /**
+     * A: Detections are semi private or likely to become semi private at some point.
+     */
+    Seq(exclude[MissingClassProblem]("org.apache.spark.util.XORShiftRandom"),
+      exclude[MissingClassProblem]("org.apache.spark.util.XORShiftRandom$"),
+      exclude[MissingMethodProblem]("org.apache.spark.util.Utils.cloneWritables"),
+      // Scheduler is not considered a public API.
+      excludePackage("org.apache.spark.deploy"),
+      // Was made private in 1.0 --- End diff -- Ah darn, seems like this doesn't work.
[GitHub] incubator-spark pull request: For SPARK-1082, Use Curator for ZK i...
Github user colorant commented on the pull request: https://github.com/apache/incubator-spark/pull/611#issuecomment-35570472 ah, so the sleep is removed ;) and the synchronization block is already there, is that ok?
[GitHub] incubator-spark pull request: For SPARK-1082, Use Curator for ZK i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/611#issuecomment-35572288 Build started.
[GitHub] incubator-spark pull request: For SPARK-1082, Use Curator for ZK i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/611#issuecomment-35572287 Build triggered.
[GitHub] incubator-spark pull request: SPARK-929: Fully deprecate usage of ...
Github user hsaputra commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/615#discussion_r9891770 --- Diff: core/src/main/scala/org/apache/spark/SparkContext.scala --- @@ -165,19 +165,20 @@ class SparkContext(
   jars.foreach(addJar)
 }
+
+  def warnSparkMem(value: String): String = {
+    logWarning("Using SPARK_MEM to set amount of memory to use per executor process is " +
+      "deprecated, please use instead spark.executor.memory")
--- End diff -- Small nit on the warning wording: "deprecated, please use spark.executor.memory instead."
[GitHub] incubator-spark pull request: MLLIB-24: url of Collaborative Filt...
Github user CrazyJvm commented on the pull request: https://github.com/apache/incubator-spark/pull/619#issuecomment-35575435 take permanent valid url into consideration, change url from yahoo to ieee. thx @mengxr . http://dx.doi.org/10.1109/ICDM.2008.22
[GitHub] incubator-spark pull request: For SPARK-1082, Use Curator for ZK i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/611#issuecomment-35579235 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/12782/
[GitHub] incubator-spark pull request: For SPARK-1082, Use Curator for ZK i...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/611#issuecomment-35579234 Build finished.
[GitHub] incubator-spark pull request: Add a environment variable that allo...
Github user pwendell commented on the pull request: https://github.com/apache/incubator-spark/pull/192#issuecomment-35580209 See SPARK-1110... I took down some notes there relevant to this: https://spark-project.atlassian.net/browse/SPARK-1110
[GitHub] incubator-spark pull request: SPARK-929: Fully deprecate usage of ...
Github user pwendell commented on the pull request: https://github.com/apache/incubator-spark/pull/615#issuecomment-35581271 @sryza - I don't think this is relevant to the YARN codepath. AFAIK YARN doesn't use the ./spark-class script to launch the YARN application master (which embeds the driver program). I'm not totally sure how that JVM is actually launched though... couldn't figure it out on a quick glance at that code.
[GitHub] incubator-spark pull request: SPARK-929: Fully deprecate usage of ...
Github user pwendell commented on the pull request: https://github.com/apache/incubator-spark/pull/615#issuecomment-35581371 It looks like there is a separate variable called `amMemory` that deals with this in YARN. The command for launching that JVM gets set-up in: common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala
[GitHub] incubator-spark pull request: Adding an option to persist Spark RD...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/468#issuecomment-35582252 Build triggered.
[GitHub] incubator-spark pull request: [java8API] SPARK-964 Investigate the...
Github user ScrapCodes commented on the pull request: https://github.com/apache/incubator-spark/pull/539#issuecomment-35586860 Hey Matei, I feel this is better than before overall. One thing I was not very sure about is putting a couple of implicits in JavaPairRDD, but this was already being done. There is no way I know of that our users from previous versions can avoid a recompile as such.
[GitHub] incubator-spark pull request: Adding an option to persist Spark RD...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/468#issuecomment-35587409 Build finished.
[GitHub] incubator-spark pull request: [SPARK-1094] Support MiMa for report...
Github user ScrapCodes commented on a diff in the pull request: https://github.com/apache/incubator-spark/pull/585#discussion_r9895285 --- Diff: project/MimaBuild.scala --- @@ -0,0 +1,105 @@
+import com.typesafe.tools.mima.plugin.MimaKeys.{binaryIssueFilters, previousArtifact}
+import com.typesafe.tools.mima.plugin.MimaPlugin.mimaDefaultSettings
+
+object MimaBuild {
+
+  val ignoredABIProblems = {
+    import com.typesafe.tools.mima.core._
+    import com.typesafe.tools.mima.core.ProblemFilters._
+    /**
+     * A: Detections are semi private or likely to become semi private at some point.
+     */
+    Seq(exclude[MissingClassProblem]("org.apache.spark.util.XORShiftRandom"),
+      exclude[MissingClassProblem]("org.apache.spark.util.XORShiftRandom$"),
+      exclude[MissingMethodProblem]("org.apache.spark.util.Utils.cloneWritables"),
+      // Scheduler is not considered a public API.
+      excludePackage("org.apache.spark.deploy"),
+      // Was made private in 1.0 --- End diff -- you are right.
[GitHub] incubator-spark pull request: [SPARK-1094] Support MiMa for report...
Github user ScrapCodes commented on the pull request: https://github.com/apache/incubator-spark/pull/585#issuecomment-35593470 Hey @pwendell, Not sure how - I cleared ivy and m2 for spark but it is not possible to get rid of these. I am trying it with jenkins once, since you could remove them w/o errors.
[GitHub] incubator-spark pull request: [SPARK-1094] Support MiMa for report...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/585#issuecomment-35594451 Merged build triggered.
[GitHub] incubator-spark pull request: [SPARK-1094] Support MiMa for report...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/incubator-spark/pull/585#issuecomment-35594452 Merged build started.
[GitHub] incubator-spark pull request: MLLIB-22. Support negative implicit ...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-spark/pull/500