[GitHub] spark pull request: [SPARK-1516]Throw exception in yarn client ins...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1099#issuecomment-47190528 No, just want to see Jenkins happy. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] spark pull request: [SPARK-1516]Throw exception in yarn client ins...
Github user mengxr commented on the pull request: https://github.com/apache/spark/pull/1099#issuecomment-47190546 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-1516]Throw exception in yarn client ins...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1099#issuecomment-47190746 Merged build started.
[GitHub] spark pull request: [SPARK-1516]Throw exception in yarn client ins...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1099#issuecomment-47190739 Merged build triggered.
[GitHub] spark pull request: [SPARK-2172] PySpark cannot import mllib modul...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1223
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1225
[SPARK-2286][UI] Report exception/errors for failed tasks that are not ExceptionFailure
Also added inline doc for each TaskEndReason.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/rxin/spark SPARK-2286
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/1225.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #1225
commit 38c73916e28c97cee49b2dab03f4dcc76d6da82f
Author: Reynold Xin r...@apache.org
Date: 2014-06-26T06:27:36Z
    [SPARK-2286][UI] Report exception/errors for failed tasks that are not ExceptionFailure. Also added inline doc for each TaskEndReason.
[GitHub] spark pull request: SPARK-2186: Spark SQL DSL support for simple a...
Github user edrevo commented on the pull request: https://github.com/apache/spark/pull/1211#issuecomment-47191917 I have added DSL support (`avg()`, `count(distinct())`...) following @rxin's suggestion. I'm happy to change the DSL style if you end up favoring the other approach.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1225#discussion_r14226878
--- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala ---
@@ -30,27 +30,67 @@ import org.apache.spark.storage.BlockManagerId
 @DeveloperApi
 sealed trait TaskEndReason
+/**
+ * :: DeveloperApi ::
+ * Task succeeded.
+ */
 @DeveloperApi
 case object Success extends TaskEndReason
--- End diff --
Want to rename this while you're at it? (It's developer API)
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47191925 Merged build triggered.
[GitHub] spark pull request: [SPARK-1516]Throw exception in yarn client ins...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1099#issuecomment-47191952 Merged build finished.
[GitHub] spark pull request: [SPARK-1516]Throw exception in yarn client ins...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1099#issuecomment-47191953 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16138/
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1225#discussion_r14226907
--- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala ---
@@ -30,27 +30,67 @@ import org.apache.spark.storage.BlockManagerId
 @DeveloperApi
 sealed trait TaskEndReason
+/**
+ * :: DeveloperApi ::
+ * Task succeeded.
+ */
 @DeveloperApi
 case object Success extends TaskEndReason
--- End diff --
I can do that if we are not going to backport this into branch-1.0. I think we should though ...
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
GitHub user ueshin opened a pull request: https://github.com/apache/spark/pull/1226
[SPARK-2287] [SQL] Make ScalaReflection be able to handle Generic case classes.
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/ueshin/apache-spark issues/SPARK-2287
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/1226.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #1226
commit 7de570641d5bd099ed0e4b68f513aeb2b7ea3f1a
Author: Takuya UESHIN ues...@happy-camper.st
Date: 2014-06-24T08:08:48Z
    Make ScalaReflection be able to handle Generic case classes.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user andrewor14 commented on a diff in the pull request: https://github.com/apache/spark/pull/1225#discussion_r14227125
--- Diff: core/src/main/scala/org/apache/spark/TaskEndReason.scala ---
@@ -58,10 +98,19 @@ case class ExceptionFailure(
  * it was fetched.
  */
 @DeveloperApi
-case object TaskResultLost extends TaskEndReason
+case object TaskResultLost extends TaskFailedReason {
+  override def toErrorString: String =
+    "TaskResultLost (result lost from block manager)"
--- End diff --
nit: move to line above
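For readers following the review, here is a minimal sketch of the pattern this diff moves toward: failure reasons share a `toErrorString` method so a UI can print any of them uniformly. The types below are simplified stand-ins that reuse the names from the diff; the `ExceptionFailure` fields and the `errorMessage` helper are assumptions for illustration and not the Spark source.

```scala
// Simplified stand-ins for the types discussed in the diff; not the Spark source.
sealed trait TaskEndReason

case object Success extends TaskEndReason

// Assumed shape: every failure reason can render itself as a printable message.
sealed trait TaskFailedReason extends TaskEndReason {
  def toErrorString: String
}

case object TaskResultLost extends TaskFailedReason {
  override def toErrorString: String = "TaskResultLost (result lost from block manager)"
}

// Hypothetical two-field version; the real class carries more information.
final case class ExceptionFailure(className: String, description: String)
  extends TaskFailedReason {
  override def toErrorString: String = s"$className ($description)"
}

object TaskEndReasonDemo {
  // None for a successful task, otherwise the failure's message.
  def errorMessage(reason: TaskEndReason): Option[String] = reason match {
    case Success => None
    case failed: TaskFailedReason => Some(failed.toErrorString)
  }
}
```

Because both traits are sealed, the compiler can check that the `match` covers every end reason, which is what lets a page render non-exception failures without special cases.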
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47192750 Merged build started.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47192746 Merged build triggered.
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47192751 Merged build started.
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47192918 This will cause a merge conflict with #1193, but I can fix it soon.
[GitHub] spark pull request: [SPARK-2254] [SQL] ScalaRefection should mark ...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1193#issuecomment-47193347 Merging this one so you can fix the conflict with the other one.
[GitHub] spark pull request: [SPARK-2254] [SQL] ScalaRefection should mark ...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/1193
[GitHub] spark pull request: [SPARK-2254] [SQL] ScalaRefection should mark ...
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/1193#issuecomment-47193596 @rxin Thanks! I will fix soon.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47193665 I spent some time verifying the math behind PageRank (in particular the starting values) to ensure that the delta formulation behaves identically to the static formulation, which matches other reference implementations of PageRank. One of the key changes is that I have added an extra normalization step at the end of the calculation to address a discrepancy in how we handle dangling vertices.
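To illustrate why dangling vertices call for a normalization step, here is a minimal single-machine sketch. It is an assumed toy formulation, not the code in this pull request: the object name, the `resetProb` default, and the choice to rescale ranks to sum to 1 are all illustrative assumptions.

```scala
object PageRankSketch {
  // Vertices are 0 until n; edges are (src, dst) pairs. Toy formulation, not the PR's code.
  def pageRank(
      n: Int,
      edges: Seq[(Int, Int)],
      resetProb: Double = 0.15,
      iters: Int = 50): Vector[Double] = {
    val outDegree: Map[Int, Int] = edges.groupBy(_._1).map { case (src, es) => src -> es.size }
    var ranks = Vector.fill(n)(1.0 / n)
    for (_ <- 0 until iters) {
      val incoming = Array.fill(n)(0.0)
      // A vertex with out-edges splits its rank evenly among them. A dangling
      // vertex (no out-edges) sends nothing, so its rank mass leaks away.
      for ((src, dst) <- edges) incoming(dst) += ranks(src) / outDegree(src)
      ranks = Vector.tabulate(n)(v => resetProb / n + (1.0 - resetProb) * incoming(v))
    }
    // Final normalization: rescale so the ranks sum to 1 despite the leaked mass.
    val total = ranks.sum
    ranks.map(_ / total)
  }
}
```

On the chain `0 -> 1 -> 2`, vertex 2 is dangling, so the unnormalized ranks sum to less than 1; the final rescaling restores a proper distribution without changing the relative ordering of the vertices.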
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47193690 Merged build started.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47193686 Merged build triggered.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/1225#discussion_r14227540
--- Diff: core/src/main/scala/org/apache/spark/ui/jobs/StagePage.scala ---
@@ -283,12 +283,7 @@ private[ui] class StagePage(parent: JobProgressTab) extends WebUIPage("stage") {
           </td>
         }}
         <td>
-          {exception.map { e =>
-            <span>
-              {e.className} ({e.description})<br/>
-              {fmtStackTrace(e.stackTrace)}
-            </span>
-          }.getOrElse("")}
+          {errorMessage.map { e => <pre>{e}</pre> }.getOrElse("")}
--- End diff --
It does. Looks like the stack trace in the details thing on the stage list.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user andrewor14 commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47193773 This is a pretty straightforward change. Looks good to me, provided that the stack trace looks all right and the tests pass.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47193791 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16142/
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/1217#discussion_r14227560
--- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala ---
@@ -158,4 +169,125 @@ object Pregel extends Logging {
     g
   } // end of apply
+  /**
+   * Execute a Pregel-like iterative vertex-parallel abstraction. The
+   * user-defined vertex-program `vprog` is executed in parallel on
+   * each vertex receiving any inbound messages and computing a new
+   * value for the vertex. The `sendMsg` function is then invoked on
+   * all out-edges and is used to compute an optional message to the
+   * destination vertex. The `mergeMsg` function is a commutative
+   * associative function used to combine messages destined to the
+   * same vertex.
+   *
+   * On the first iteration all vertices receive the `initialMsg` and
+   * on subsequent iterations if a vertex does not receive a message
+   * then the vertex-program is not invoked.
+   *
+   * This function iterates until there are no remaining messages, or
+   * for `maxIterations` iterations.
+   *
+   * @tparam VD the vertex data type
+   * @tparam ED the edge data type
+   * @tparam A the Pregel message type
+   *
+   * @param graph the input graph.
+   *
+   * @param initialMsg the message each vertex will receive at the on
+   * the first iteration
+   *
+   * @param maxIterations the maximum number of iterations to run for
+   *
+   * @param activeDirection the direction of edges incident to a vertex that received a message in
+   * the previous round on which to run `sendMsg`. For example, if this is `EdgeDirection.Out`, only
+   * out-edges of vertices that received a message in the previous round will run. The default is
+   * `EdgeDirection.Either`, which will run `sendMsg` on edges where either side received a message
+   * in the previous round. If this is `EdgeDirection.Both`, `sendMsg` will only run on edges where
+   * *both* vertices received a message.
+   *
+   * @param vprog the user-defined vertex program which runs on each
+   * vertex and receives the inbound message and computes a new vertex
+   * value. On the first iteration the vertex program is invoked on
+   * all vertices and is passed the default message. On subsequent
+   * iterations the vertex program is only invoked on those vertices
+   * that receive messages.
+   *
+   * @param sendMsg a user supplied function that is applied to out
+   * edges of vertices that received messages in the current
+   * iteration
+   *
+   * @param mergeMsg a user supplied function that takes two incoming
+   * messages of type A and merges them into a single message of type
+   * A. ''This function must be commutative and associative and
+   * ideally the size of A should not increase.''
+   *
+   * @return the resulting graph at the end of the computation
+   *
+   */
+  def run[VD: ClassTag, ED: ClassTag, A: ClassTag]
+    (graph: Graph[VD, ED],
+     maxIterations: Int = Int.MaxValue,
+     activeDirection: EdgeDirection = EdgeDirection.Either)
+    (vertexProgram: (VertexId, VD, Option[A], VertexContext) => VD,
+     sendMsg: (EdgeTriplet[VD, ED], EdgeContext) => Iterator[(VertexId, A)],
+     mergeMsg: (A, A) => A)
+  : Graph[VD, ED] =
+  {
+    // Initialize the graph with all vertices active
+    var g: Graph[(VD, Boolean), ED] = graph.mapVertices { (vid, vdata) => (vdata, true) }.cache()
+    // Determine the set of vertices that did not vote to halt
+    var activeVertices = g.vertices
+    var numActive = activeVertices.count()
+    var i = 0
+    while (numActive > 0 && i < maxIterations) {
+      // The send message wrapper removes the active fields from the triplet and places them in the edge context.
+      def sendMessageWrapper(triplet: EdgeTriplet[(VD, Boolean),ED]): Iterator[(VertexId, A)] = {
+        val simpleTriplet = new EdgeTriplet[VD, ED]()
+        simpleTriplet.set(triplet)
+        simpleTriplet.srcAttr = triplet.srcAttr._1
+        simpleTriplet.dstAttr = triplet.dstAttr._1
+        val ctx = new EdgeContext(i, triplet.srcAttr._2, triplet.dstAttr._2)
+        sendMsg(simpleTriplet, ctx)
+      }
+
+      // Compute the messages for all the active vertices
+      val messages = g.mapReduceTriplets(sendMessageWrapper, mergeMsg, Some((activeVertices, activeDirection)))
+
+      // get a reference to the current graph so that we can unpersist it once the new graph is created.
+      val prevG = g
+
+      // Receive the messages to the subset of active vertices
+      g = g.outerJoinVertices(messages){ (vid, dataAndActive, msgOpt) =>
+        val
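The iteration described in the scaladoc of this diff can be sketched on plain Scala collections. This is a hypothetical single-machine analogue, not the GraphX implementation: `LocalPregelSketch`, its simplified signature (no `EdgeContext`, no `VertexContext`, no `activeDirection`), and the `hopCounts` example are assumptions made for illustration. It fixes the behavior to the `EdgeDirection.Either` case, where `sendMsg` runs on an edge if either endpoint received a message in the previous round.

```scala
object LocalPregelSketch {
  type VertexId = Long

  // Hypothetical local analogue of the loop in the diff; signature simplified.
  def run[VD, A](
      vertices: Map[VertexId, VD],
      edges: Seq[(VertexId, VertexId)],
      initialMsg: A,
      maxIterations: Int = Int.MaxValue)(
      vprog: (VertexId, VD, A) => VD,
      sendMsg: (VertexId, VD, VertexId, VD) => Iterator[(VertexId, A)],
      mergeMsg: (A, A) => A): Map[VertexId, VD] = {
    // First iteration: every vertex receives initialMsg.
    var state = vertices.map { case (id, vd) => id -> vprog(id, vd, initialMsg) }
    var active: Set[VertexId] = state.keySet
    var i = 0
    while (active.nonEmpty && i < maxIterations) {
      // Run sendMsg on edges with an active endpoint, then merge per destination.
      val messages: Map[VertexId, A] = edges.iterator
        .filter { case (src, dst) => active(src) || active(dst) }
        .flatMap { case (src, dst) => sendMsg(src, state(src), dst, state(dst)) }
        .toSeq
        .groupBy(_._1)
        .map { case (id, msgs) => id -> msgs.map(_._2).reduce(mergeMsg) }
      // Only vertices that received a message run the vertex program.
      state = state.map { case (id, vd) =>
        id -> messages.get(id).map(m => vprog(id, vd, m)).getOrElse(vd)
      }
      active = messages.keySet
      i += 1
    }
    state
  }

  // Usage: hop-count distances from vertex 1 along the chain 1 -> 2 -> 3 -> 4.
  def hopCounts(): Map[VertexId, Double] = {
    val inf = Double.PositiveInfinity
    val start = Map(1L -> 0.0, 2L -> inf, 3L -> inf, 4L -> inf)
    run(start, Seq((1L, 2L), (2L, 3L), (3L, 4L)), inf)(
      (_, dist, msg) => math.min(dist, msg),
      (_, srcDist, dst, dstDist) =>
        if (srcDist + 1.0 < dstDist) Iterator((dst, srcDist + 1.0)) else Iterator.empty,
      (a, b) => math.min(a, b))
  }
}
```

The `mergeMsg` function here is `min`, which is commutative and associative as the scaladoc requires, so the order in which messages to the same vertex are combined does not affect the result. The loop stops on its own once no edge produces a message.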
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user jegonzal commented on a diff in the pull request: https://github.com/apache/spark/pull/1217#discussion_r14227573 --- Diff: graphx/src/main/scala/org/apache/spark/graphx/Pregel.scala --- @@ -158,4 +169,125 @@ object Pregel extends Logging {
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47194199 Merged build finished.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47194200 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16139/
[GitHub] spark pull request: SPARK-1782: svd for sparse matrix using ARPACK
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/964#issuecomment-47194339 Merged build started.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47194334 Merged build started.
[GitHub] spark pull request: SPARK-1782: svd for sparse matrix using ARPACK
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/964#issuecomment-47194330 Merged build triggered.
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47194382 Fixed merge conflict with #1193.
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47194327 Merged build triggered.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47194328 Merged build triggered.
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47194335 Merged build started.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47194448 Merged build finished.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47194450 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16144/
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47196045 Merged build triggered.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47196054 Merged build started.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47196162 Merged build finished.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47196163 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16146/
[GitHub] spark pull request: Fix JIRA-983 and support exteranl sort for sor...
Github user xiajunluan commented on the pull request: https://github.com/apache/spark/pull/931#issuecomment-47196343 @pwendell I will update the code soon; thanks for the reminder.
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47197537 Merged build started.
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47197529 Merged build triggered.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47198182 Jenkins, retest this please.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47198281 Merged build triggered.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47198289 Merged build started.
[GitHub] spark pull request: Removed throwable field from FetchFailedExcept...
GitHub user rxin opened a pull request: https://github.com/apache/spark/pull/1227
Removed throwable field from FetchFailedException and added MetadataFetchFailedException
FetchFailedException used to have a Throwable field, but in reality we never propagate any of the throwables/exceptions back to the driver, because Executor explicitly looks for FetchFailedException and then sends FetchFailed as the TaskEndReason. This pull request removes the throwable and adds a MetadataFetchFailedException that extends FetchFailedException (so MapOutputTracker now throws MetadataFetchFailedException instead).
You can merge this pull request into a Git repository by running: $ git pull https://github.com/rxin/spark metadataFetchException
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1227.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1227
commit 8861ee2b0cb8f78776c688d369cb8a4fe8a83b6d Author: Reynold Xin r...@apache.org Date: 2014-06-26T07:38:30Z Throw MetadataFetchFailedException in MapOutputTracker.
commit 5cb1e0ac6a910877488a004590865324e5145a05 Author: Reynold Xin r...@apache.org Date: 2014-06-26T08:07:31Z MetadataFetchFailedException extends FetchFailedException.
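As a rough illustration of the hierarchy this pull request describes, the sketch below models a FetchFailedException that carries no Throwable field and a MetadataFetchFailedException that extends it. The constructor parameters, the simplified `TaskEndReason` types, and the `classify` helper are assumptions made for this example; they are not Spark's actual signatures.

```scala
import scala.util.control.NonFatal

object FetchFailureSketch {
  // Hypothetical, simplified stand-ins for Spark's task end reasons.
  sealed trait TaskEndReason
  final case class FetchFailed(shuffleId: Int, reduceId: Int, message: String)
    extends TaskEndReason
  final case class ExceptionFailure(description: String) extends TaskEndReason

  // No Throwable field: only the data needed to build a FetchFailed reason.
  class FetchFailedException(val shuffleId: Int, val reduceId: Int, message: String)
    extends Exception(message) {
    def toTaskEndReason: TaskEndReason = FetchFailed(shuffleId, reduceId, getMessage)
  }

  // In this sketch, raised when map output metadata cannot be found.
  class MetadataFetchFailedException(shuffleId: Int, reduceId: Int, message: String)
    extends FetchFailedException(shuffleId, reduceId, message)

  // Executor-style handling: match FetchFailedException explicitly, so the
  // subclass is reported as FetchFailed too.
  def classify(body: => Unit): Option[TaskEndReason] =
    try { body; None }
    catch {
      case ffe: FetchFailedException => Some(ffe.toTaskEndReason)
      case NonFatal(t) => Some(ExceptionFailure(t.toString))
    }
}
```

Because the subclass is matched by the `FetchFailedException` case, a metadata failure needs no separate handling branch in this sketch.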
[GitHub] spark pull request: Removed throwable field from FetchFailedExcept...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1227#issuecomment-47198650 Merged build started.
[GitHub] spark pull request: Removed throwable field from FetchFailedExcept...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1227#issuecomment-47198635 Merged build triggered.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47198651 Merged build started.
[GitHub] spark pull request: SPARK-1782: svd for sparse matrix using ARPACK
Github user vrilleup commented on the pull request: https://github.com/apache/spark/pull/964#issuecomment-47198628 The multiply function in RowMatrix was calling a dense matrix - sparse vector multiplication, but the current Breeze version has no implementation for this combination. As a result, a dense matrix - dense vector function was actually invoked, which is very slow for a large, sparse RowMatrix. I changed it to use sparse vector - dense vector multiplication in Breeze, which greatly improved the multiply function.
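To make the cost difference concrete, here is a minimal sketch of a row-wise sparse vector - dense vector product in plain Scala. It does not use Breeze, and `SparseVec`, `sparseDot`, and `multiply` are hypothetical names introduced only for illustration; the pull request's actual code may look quite different.

```scala
object SparseMultiplySketch {
  // Hypothetical sparse vector: stored indices with their matching values.
  final case class SparseVec(size: Int, indices: Array[Int], values: Array[Double])

  // Sparse-dense dot product: visits only the stored entries, so the cost
  // is proportional to the number of non-zeros rather than to the dimension.
  def sparseDot(row: SparseVec, v: Array[Double]): Double = {
    require(row.size == v.length, "dimension mismatch")
    var sum = 0.0
    var k = 0
    while (k < row.indices.length) {
      sum += row.values(k) * v(row.indices(k))
      k += 1
    }
    sum
  }

  // Multiply a matrix given as sparse rows by a dense vector, one row at a time.
  def multiply(rows: Seq[SparseVec], v: Array[Double]): Array[Double] =
    rows.map(sparseDot(_, v)).toArray
}
```

A dense fallback would instead touch every column of every row, which is where the slowdown on large sparse matrices comes from.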
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47198638 Merged build triggered.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47198800 Merged build finished.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47198803 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16150/
[GitHub] spark pull request: [SPARK-1477]: Add the lifecycle interface
Github user rxin commented on a diff in the pull request: https://github.com/apache/spark/pull/991#discussion_r14229594
--- Diff: core/src/main/java/org/apache/spark/Service.java ---
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark;
+
+import java.io.Closeable;
+import java.io.IOException;
+
+// copy from hadoop
+public interface Service extends Closeable {
--- End diff --
Java code can actually extend Scala traits as well. Do you have other concerns?
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47199290 Merged build finished.
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47199289 Merged build finished.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47199293 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16141/
[GitHub] spark pull request: SPARK-1782: svd for sparse matrix using ARPACK
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/964#issuecomment-47199297 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16145/
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47199288 Merged build finished.
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47199294 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16143/
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47199296 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16140/
[GitHub] spark pull request: Improved GraphX PageRank Test Coverage
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1228#issuecomment-47199424 Merged build started.
[GitHub] spark pull request: Improved GraphX PageRank Test Coverage
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1228#issuecomment-47199418 Merged build triggered.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user rxin commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47200164 There was a unit test failure that my latest push fixed.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47200218 Merged build started.
[GitHub] spark pull request: Improved GraphX PageRank Test Coverage
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1228#issuecomment-47200276 @ankurdave thanks for pointing out this bug!
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47200208 Merged build triggered.
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47201129 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16147/
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47201127 Merged build finished.
[GitHub] spark pull request: [SQL]Extract the joinkeys from join condition
Github user egraldlo commented on a diff in the pull request: https://github.com/apache/spark/pull/1190#discussion_r14230460
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkStrategies.scala ---
@@ -65,7 +64,7 @@ private[sql] abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
     def broadcastTables: Seq[String] = sqlContext.joinBroadcastTables.split(",").toBuffer

     def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match {
-      case HashFilteredJoin(
+      case ExtractEquiJoinKeys(
--- End diff --
Maybe the annotation on line 48 can be updated as well.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47201642 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16148/
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47201641 Merged build finished.
[GitHub] spark pull request: Removed throwable field from FetchFailedExcept...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1227#issuecomment-47202409 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16149/
[GitHub] spark pull request: Removed throwable field from FetchFailedExcept...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1227#issuecomment-47202408 Merged build finished. All automated tests passed.
[GitHub] spark pull request: Improved GraphX PageRank Test Coverage
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1228#issuecomment-47202811 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16151/
[GitHub] spark pull request: Improved GraphX PageRank Test Coverage
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1228#issuecomment-47202809 Merged build finished.
[GitHub] spark pull request: [SPARK-2251] fix concurrency issues in random ...
GitHub user mengxr opened a pull request: https://github.com/apache/spark/pull/1229
[SPARK-2251] fix concurrency issues in random sampler
The following code is very likely to throw an exception:
~~~
val rdd = sc.parallelize(0 until 111, 10).sample(false, 0.1)
rdd.zip(rdd).count()
~~~
because the same random number generator is used when computing the partitions.
You can merge this pull request into a Git repository by running: $ git pull https://github.com/mengxr/spark fix-sample
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/1229.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1229
commit e7f5f5a4a81ee546fe157846b8cf527e1fba9a63 Author: Xiangrui Meng m...@databricks.com Date: 2014-06-26T08:45:46Z fix concurrency issues in random sampler
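The failure mode described above can be sketched outside Spark. In the hypothetical model below, `SharedSampler` keeps one generator whose state advances on every call, so two passes over the same data may keep different elements, while `PerPartitionSampler` creates a fresh generator per call from a fixed seed. Both classes and the `seed + partition` scheme are assumptions for illustration, not the change made in this pull request.

```scala
import java.util.Random

object SamplerSketch {
  // Shared generator: every call advances the same state, so sampling the
  // same items twice can select different elements (and different counts).
  class SharedSampler(fraction: Double, seed: Long) {
    private val rng = new Random(seed)
    def sample(items: Seq[Int]): Seq[Int] =
      items.filter(_ => rng.nextDouble() < fraction)
  }

  // Fresh generator per call, seeded from (seed, partition index), so
  // recomputing a partition reproduces exactly the same sample.
  class PerPartitionSampler(fraction: Double, seed: Long) {
    def sample(partition: Int, items: Seq[Int]): Seq[Int] = {
      val rng = new Random(seed + partition)
      items.filter(_ => rng.nextDouble() < fraction)
    }
  }
}
```

An operation like `zip` expects both sides of a partition to have the same number of elements, which is why a reproducible per-partition sample matters in this sketch.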
[GitHub] spark pull request: [SPARK-2251] fix concurrency issues in random ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1229#issuecomment-47203216 Merged build started.
[GitHub] spark pull request: [SPARK-2251] fix concurrency issues in random ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1229#issuecomment-47203207 Merged build triggered.
[GitHub] spark pull request: [SPARK-2251] fix concurrency issues in random ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1229#issuecomment-47203638 Merged build started.
[GitHub] spark pull request: [SPARK-2251] fix concurrency issues in random ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1229#issuecomment-47203628 Merged build triggered.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47203899 Merged build finished. All automated tests passed.
[GitHub] spark pull request: [SPARK-2286][UI] Report exception/errors for f...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1225#issuecomment-47203901 All automated tests passed. Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16152/
[GitHub] spark pull request: [SPARK-1946] Submit tasks after (configured ra...
Github user li-zhihui commented on a diff in the pull request: https://github.com/apache/spark/pull/900#discussion_r14231480
--- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ---
@@ -244,6 +255,17 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, actorSystem: A
         throw new SparkException("Error notifying standalone scheduler's driver actor", e)
     }
   }
+
+  override def isReady(): Boolean = {
+    if (ready) {
--- End diff --
Thanks @pwendell @kayousterhout, I was overthinking the performance of this code. ^_^ But we can't simply inline the code, because executorActor is a member of the inner class DriverActor. We could get at the member by adding some code, but I'm not sure it is worth the cost.
[GitHub] spark pull request: Introducing an Improved Pregel API
Github user jegonzal commented on the pull request: https://github.com/apache/spark/pull/1217#issuecomment-47204112
@ankurdave and @rxin there is an issue with the current API. The `sendMessage` function pulls the active field out of the vertex value here: https://github.com/apache/spark/pull/1217/files#diff-e399679417ffa6eeedf26a7630baca16R243
```scala
def sendMessageWrapper(triplet: EdgeTriplet[(VD, Boolean),ED]): Iterator[(VertexId, A)] = {
  val simpleTriplet = new EdgeTriplet[VD, ED]()
  simpleTriplet.set(triplet)
  simpleTriplet.srcAttr = triplet.srcAttr._1
  simpleTriplet.dstAttr = triplet.dstAttr._1
  val ctx = new EdgeContext(i, triplet.srcAttr._2, triplet.dstAttr._2)
  sendMsg(simpleTriplet, ctx)
}
// Compute the messages for all the active vertices
val messages = g.mapReduceTriplets(sendMessageWrapper, mergeMsg, Some((activeVertices, activeDirection)))
```
thereby allowing the user a simple `sendMsg` interface:
```scala
sendMsg: (EdgeTriplet[VD, ED], EdgeContext) => Iterator[(VertexId, A)]
```
However, because the wrapper accesses the source and destination vertex attributes, the bytecode inspection will force a full 3-way join even if the user doesn't actually read the fields. The simplest solution would be to change the send message interface to operate on the extended vertex attribute (which contains the active field):
```scala
sendMsg: (EdgeTriplet[(VD, Boolean), ED], EdgeContext) => Iterator[(VertexId, A)]
```
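To illustrate the shape of the proposed interface, here is a simplified sketch in which the user function receives the `(value, active)` pair directly, so no wrapper has to read `srcAttr` and `dstAttr` on the user's behalf. `Triplet` is a hypothetical stand-in, not GraphX's `EdgeTriplet`, and the `EdgeContext` argument is omitted. The sketch does not exercise the join elimination itself, which depends on GraphX's bytecode inspection.

```scala
object PregelSketch {
  // Hypothetical, simplified stand-in for an edge triplet; not GraphX's EdgeTriplet.
  final case class Triplet[VD, ED](
      srcId: Long, dstId: Long, srcAttr: VD, dstAttr: VD, attr: ED)

  // User-supplied function in the proposed style: the vertex attribute is the
  // extended pair (value, active). This example only reads the source attribute
  // and sends a message when the source vertex is active.
  def sendMsg(t: Triplet[(Int, Boolean), Double]): Iterator[(Long, Int)] =
    if (t.srcAttr._2) Iterator((t.dstId, t.srcAttr._1 + 1)) else Iterator.empty
}
```

Because this `sendMsg` never touches `dstAttr`, an inspection of the user function alone could in principle conclude that the destination attribute is not needed.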
[GitHub] spark pull request: [SPARK-2251] fix concurrency issues in random ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1229#issuecomment-47204535 Merged build finished.
[GitHub] spark pull request: [SPARK-2251] fix concurrency issues in random ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1229#issuecomment-47204537 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16154/
[GitHub] spark pull request: [SPARK-1946] Submit tasks after (configured ra...
Github user li-zhihui commented on the pull request: https://github.com/apache/spark/pull/900#issuecomment-47204723
@tgravescs @kayousterhout I added a new commit:
* Move waitBackendReady to TaskSchedulerImpl.start
* Refactor the code per @kayousterhout's comments
[GitHub] spark pull request: [SPARK-2251] fix concurrency issues in random ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1229#issuecomment-47204966 Merged build triggered.
[GitHub] spark pull request: [SPARK-2251] fix concurrency issues in random ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1229#issuecomment-47204976 Merged build started.
[GitHub] spark pull request: [SPARK-1946] Submit tasks after (configured ra...
Github user li-zhihui commented on a diff in the pull request: https://github.com/apache/spark/pull/900#discussion_r14232018
--- Diff: yarn/common/src/main/scala/org/apache/spark/scheduler/cluster/YarnClusterSchedulerBackend.scala ---
@@ -0,0 +1,51 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.scheduler.cluster
+
+
+import org.apache.spark.{Logging, SparkContext}
+import org.apache.spark.deploy.yarn.ApplicationMasterArguments
+import org.apache.spark.scheduler.TaskSchedulerImpl
+
+import scala.collection.mutable.ArrayBuffer
+
+private[spark] class YarnClusterSchedulerBackend(
+    scheduler: TaskSchedulerImpl,
+    sc: SparkContext)
+  extends CoarseGrainedSchedulerBackend(scheduler, sc.env.actorSystem)
+  with Logging {
+
+  private[spark] def addArg(optionName: String, envVar: String, sysProp: String,
+      arrayBuf: ArrayBuffer[String]) {
+    if (System.getenv(envVar) != null) {
+      arrayBuf += (optionName, System.getenv(envVar))
+    } else if (sc.getConf.contains(sysProp)) {
+      arrayBuf += (optionName, sc.getConf.get(sysProp))
+    }
+  }
+
+  override def start() {
+    super.start()
+    val argsArrayBuf = new ArrayBuffer[String]()
+    List(("--num-executors", "SPARK_EXECUTOR_INSTANCES", "spark.executor.instances"),
+      ("--num-executors", "SPARK_WORKER_INSTANCES", "spark.worker.instances"))
+      .foreach { case (optName, envVar, sysProp) => addArg(optName, envVar, sysProp, argsArrayBuf) }
+    val args = new ApplicationMasterArguments(argsArrayBuf.toArray)
+    totalExecutors.set(args.numExecutors)
--- End diff --
@kayousterhout Done. As for the constants, maybe we can open a separate PR to manage constants for the whole project.
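The `addArg` helper in the diff above gives the environment variable precedence over the configuration property. A minimal, self-contained sketch of that lookup order follows. It assumes plain `Map`s in place of `System.getenv` and `SparkConf` so it can run without Spark, and `ArgSketch` and its `env`/`conf` parameters are hypothetical names, not part of the pull request.

```scala
import scala.collection.mutable.ArrayBuffer

object ArgSketch {
  // Appends "optionName value" to buf, taking the value from env if present,
  // otherwise from conf. Appends nothing when neither source defines it.
  def addArg(
      optionName: String,
      envVar: String,
      sysProp: String,
      env: Map[String, String],   // stand-in for the process environment
      conf: Map[String, String],  // stand-in for SparkConf
      buf: ArrayBuffer[String]): Unit = {
    env.get(envVar).orElse(conf.get(sysProp)).foreach { value =>
      buf += optionName
      buf += value
    }
  }
}
```

Passing the two sources in explicitly keeps the precedence rule testable without mutating the real process environment.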
[GitHub] spark pull request: [SPARK-2287] [SQL] Make ScalaReflection be abl...
Github user ueshin commented on the pull request: https://github.com/apache/spark/pull/1226#issuecomment-47206743 Could you please retest this? The previous test failure looked like something was wrong with the Hive metastore. (Am I able to ask Jenkins to retest?)
[GitHub] spark pull request: [SPARK-1477]: Add the lifecycle interface
Github user witgo commented on a diff in the pull request: https://github.com/apache/spark/pull/991#discussion_r14232502
--- Diff: core/src/main/java/org/apache/spark/Service.java ---
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark;
+
+import java.io.Closeable;
+import java.io.IOException;
+
+// copy from hadoop
+public interface Service extends Closeable {
--- End diff --
Done
[GitHub] spark pull request: [SPARK-2251] fix concurrency issues in random ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1229#issuecomment-47206910 Merged build finished.
[GitHub] spark pull request: [SPARK-2251] fix concurrency issues in random ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/1229#issuecomment-47206911 Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16153/