spark git commit: [SPARK-6636] Use public DNS hostname everywhere in spark_ec2.py

2015-04-07 Thread joshrosen
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 1cde04f21 -> ab1b8edb8


[SPARK-6636] Use public DNS hostname everywhere in spark_ec2.py

The spark_ec2.py script uses public_dns_name everywhere except when testing 
SSH availability, which is done using the public IP address of the instances. 
This breaks the script for users who deploy the cluster with a 
private-network-only security group. The fix is to use public_dns_name in the 
remaining place.

Author: Matt Aasted aas...@twitch.tv

Closes #5302 from aasted/master and squashes the following commits:

60cf6ee [Matt Aasted] [SPARK-6636] Use public DNS hostname everywhere in 
spark_ec2.py

(cherry picked from commit 6f0d55d76f758d217fd18ffa0ccf273d7ab0377b)
Signed-off-by: Josh Rosen joshro...@databricks.com


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ab1b8edb
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ab1b8edb
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ab1b8edb

Branch: refs/heads/branch-1.3
Commit: ab1b8edb81feb2ad5cd28611f809c57cb61db9fb
Parents: 1cde04f
Author: Matt Aasted aas...@twitch.tv
Authored: Mon Apr 6 23:50:48 2015 -0700
Committer: Josh Rosen joshro...@databricks.com
Committed: Mon Apr 6 23:52:46 2015 -0700

--
 ec2/spark_ec2.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/ab1b8edb/ec2/spark_ec2.py
--
diff --git a/ec2/spark_ec2.py b/ec2/spark_ec2.py
index 95082b5..e121dad 100755
--- a/ec2/spark_ec2.py
+++ b/ec2/spark_ec2.py
@@ -715,7 +715,7 @@ def is_cluster_ssh_available(cluster_instances, opts):
     Check if SSH is available on all the instances in a cluster.
     """
     for i in cluster_instances:
-        if not is_ssh_available(host=i.ip_address, opts=opts):
+        if not is_ssh_available(host=i.public_dns_name, opts=opts):
             return False
     else:
         return True





spark git commit: [SPARK-6636] Use public DNS hostname everywhere in spark_ec2.py

2015-04-07 Thread joshrosen
Repository: spark
Updated Branches:
  refs/heads/master a0846c4b6 -> 6f0d55d76


[SPARK-6636] Use public DNS hostname everywhere in spark_ec2.py

The spark_ec2.py script uses public_dns_name everywhere except when testing 
SSH availability, which is done using the public IP address of the instances. 
This breaks the script for users who deploy the cluster with a 
private-network-only security group. The fix is to use public_dns_name in the 
remaining place.

Author: Matt Aasted aas...@twitch.tv

Closes #5302 from aasted/master and squashes the following commits:

60cf6ee [Matt Aasted] [SPARK-6636] Use public DNS hostname everywhere in 
spark_ec2.py


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/6f0d55d7
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/6f0d55d7
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/6f0d55d7

Branch: refs/heads/master
Commit: 6f0d55d76f758d217fd18ffa0ccf273d7ab0377b
Parents: a0846c4
Author: Matt Aasted aas...@twitch.tv
Authored: Mon Apr 6 23:50:48 2015 -0700
Committer: Josh Rosen joshro...@databricks.com
Committed: Mon Apr 6 23:51:47 2015 -0700

--
 ec2/spark_ec2.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/6f0d55d7/ec2/spark_ec2.py
--
diff --git a/ec2/spark_ec2.py b/ec2/spark_ec2.py
index 5507a9c..879a52c 100755
--- a/ec2/spark_ec2.py
+++ b/ec2/spark_ec2.py
@@ -809,7 +809,7 @@ def is_cluster_ssh_available(cluster_instances, opts):
     Check if SSH is available on all the instances in a cluster.
     """
     for i in cluster_instances:
-        if not is_ssh_available(host=i.ip_address, opts=opts):
+        if not is_ssh_available(host=i.public_dns_name, opts=opts):
             return False
     else:
         return True





spark git commit: [SPARK-6716] Change SparkContext.DRIVER_IDENTIFIER from "<driver>" to "driver"

2015-04-07 Thread joshrosen
Repository: spark
Updated Branches:
  refs/heads/master e40ea8742 -> a0846c4b6


[SPARK-6716] Change SparkContext.DRIVER_IDENTIFIER from "<driver>" to "driver"

Currently, the driver's executorId is set to `<driver>`. This choice of ID was 
present in older Spark versions, but it has started to cause problems now that 
executorIds are used in more contexts, such as Ganglia metric names or driver 
thread-dump links in the web UI. The angle brackets must be escaped when embedding 
this ID in XML or as part of URLs, and this has led to multiple problems:

- https://issues.apache.org/jira/browse/SPARK-6484
- https://issues.apache.org/jira/browse/SPARK-4313

The simplest solution seems to be to change this id to something that does not 
contain any special characters, such as `driver`.
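
For context, a quick sketch (plain JDK and scala-xml calls, not part of this 
patch) of how the old ID mangles when embedded, while the new one passes through 
unchanged:

```scala
// Illustrative only: the kinds of escaping "<driver>" forces on callers.
println(java.net.URLEncoder.encode("<driver>", "UTF-8"))  // %3Cdriver%3E
println(scala.xml.Utility.escape("<driver>"))             // &lt;driver&gt;
println(java.net.URLEncoder.encode("driver", "UTF-8"))    // driver (unchanged)
```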

I'm not sure whether we can perform this change in a patch release, since this 
ID may be considered a stable API by metrics users, but it's probably okay to 
do this in a major release as long as we document it in the release notes.

Author: Josh Rosen joshro...@databricks.com

Closes #5372 from JoshRosen/driver-id-fix and squashes the following commits:

42d3c10 [Josh Rosen] Clarify comment
0c5d04b [Josh Rosen] Add backwards-compatibility in BlockManagerId.isDriver
7ff12e0 [Josh Rosen] Change SparkContext.DRIVER_IDENTIFIER from "<driver>" to "driver"


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a0846c4b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a0846c4b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a0846c4b

Branch: refs/heads/master
Commit: a0846c4b635eac8d8637c83d490177f881952d27
Parents: e40ea87
Author: Josh Rosen joshro...@databricks.com
Authored: Mon Apr 6 23:33:16 2015 -0700
Committer: Josh Rosen joshro...@databricks.com
Committed: Mon Apr 6 23:33:16 2015 -0700

--
 core/src/main/scala/org/apache/spark/SparkContext.scala | 12 +++-
 .../scala/org/apache/spark/storage/BlockManagerId.scala |  5 -
 .../org/apache/spark/storage/BlockManagerSuite.scala|  6 ++
 3 files changed, 21 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a0846c4b/core/src/main/scala/org/apache/spark/SparkContext.scala
--
diff --git a/core/src/main/scala/org/apache/spark/SparkContext.scala 
b/core/src/main/scala/org/apache/spark/SparkContext.scala
index 942c597..3f1a7dd 100644
--- a/core/src/main/scala/org/apache/spark/SparkContext.scala
+++ b/core/src/main/scala/org/apache/spark/SparkContext.scala
@@ -1901,7 +1901,17 @@ object SparkContext extends Logging {
 
   private[spark] val SPARK_JOB_INTERRUPT_ON_CANCEL = "spark.job.interruptOnCancel"
 
-  private[spark] val DRIVER_IDENTIFIER = "<driver>"
+  /**
+   * Executor id for the driver.  In earlier versions of Spark, this was `<driver>`, but this was
+   * changed to `driver` because the angle brackets caused escaping issues in URLs and XML (see
+   * SPARK-6716 for more details).
+   */
+  private[spark] val DRIVER_IDENTIFIER = "driver"
+
+  /**
+   * Legacy version of DRIVER_IDENTIFIER, retained for backwards-compatibility.
+   */
+  private[spark] val LEGACY_DRIVER_IDENTIFIER = "<driver>"
 
   // The following deprecated objects have already been copied to `object AccumulatorParam` to
   // make the compiler find them automatically. They are duplicate codes only for backward

http://git-wip-us.apache.org/repos/asf/spark/blob/a0846c4b/core/src/main/scala/org/apache/spark/storage/BlockManagerId.scala
--
diff --git a/core/src/main/scala/org/apache/spark/storage/BlockManagerId.scala 
b/core/src/main/scala/org/apache/spark/storage/BlockManagerId.scala
index a6f1ebf..69ac375 100644
--- a/core/src/main/scala/org/apache/spark/storage/BlockManagerId.scala
+++ b/core/src/main/scala/org/apache/spark/storage/BlockManagerId.scala
@@ -60,7 +60,10 @@ class BlockManagerId private (
 
   def port: Int = port_
 
-  def isDriver: Boolean = { executorId == SparkContext.DRIVER_IDENTIFIER }
+  def isDriver: Boolean = {
+    executorId == SparkContext.DRIVER_IDENTIFIER ||
+      executorId == SparkContext.LEGACY_DRIVER_IDENTIFIER
+  }
 
   override def writeExternal(out: ObjectOutput): Unit = Utils.tryOrIOException {
     out.writeUTF(executorId_)

http://git-wip-us.apache.org/repos/asf/spark/blob/a0846c4b/core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala 
b/core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala
index 283090e..6dc5bc4 100644
--- a/core/src/test/scala/org/apache/spark/storage/BlockManagerSuite.scala
+++ 

spark git commit: [SPARK-6736][GraphX][Doc] Example of Graph#aggregateMessages has an error

2015-04-07 Thread ankurdave
Repository: spark
Updated Branches:
  refs/heads/master 6f0d55d76 -> ae980eb41


[SPARK-6736][GraphX][Doc] Example of Graph#aggregateMessages has an error

The example for Graph#aggregateMessages has an error: since aggregateMessages 
is a method of Graph, it should be written as `rawGraph.aggregateMessages`.

Author: Sasaki Toru sasaki...@nttdata.co.jp

Closes #5388 from sasakitoa/aggregateMessagesExample and squashes the following 
commits:

b1d631b [Sasaki Toru] Example of Graph#aggregateMessages has error


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ae980eb4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ae980eb4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ae980eb4

Branch: refs/heads/master
Commit: ae980eb41c00b5f1f64c650f267b884e864693f0
Parents: 6f0d55d
Author: Sasaki Toru sasaki...@nttdata.co.jp
Authored: Tue Apr 7 01:55:32 2015 -0700
Committer: Ankur Dave ankurd...@gmail.com
Committed: Tue Apr 7 01:55:32 2015 -0700

--
 graphx/src/main/scala/org/apache/spark/graphx/Graph.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/ae980eb4/graphx/src/main/scala/org/apache/spark/graphx/Graph.scala
--
diff --git a/graphx/src/main/scala/org/apache/spark/graphx/Graph.scala 
b/graphx/src/main/scala/org/apache/spark/graphx/Graph.scala
index 8494d06..36dc7b0 100644
--- a/graphx/src/main/scala/org/apache/spark/graphx/Graph.scala
+++ b/graphx/src/main/scala/org/apache/spark/graphx/Graph.scala
@@ -409,7 +409,7 @@ abstract class Graph[VD: ClassTag, ED: ClassTag] protected () extends Serializable {
    * {{{
    * val rawGraph: Graph[_, _] = Graph.textFile("twittergraph")
    * val inDeg: RDD[(VertexId, Int)] =
-   *   aggregateMessages[Int](ctx => ctx.sendToDst(1), _ + _)
+   *   rawGraph.aggregateMessages[Int](ctx => ctx.sendToDst(1), _ + _)
    * }}}
    *
    * @note By expressing computation at the edge level we achieve





spark git commit: [SPARK-3591][YARN] Fire and forget for YARN cluster mode

2015-04-07 Thread tgraves
Repository: spark
Updated Branches:
  refs/heads/master ae980eb41 -> b65bad65c


[SPARK-3591][YARN] Fire and forget for YARN cluster mode

https://issues.apache.org/jira/browse/SPARK-3591

The output after this patch:
doggie153:/opt/oss/spark-1.3.0-bin-hadoop2.4/bin # ./spark-submit  --class 
org.apache.spark.examples.SparkPi --master yarn-cluster 
../lib/spark-examples*.jar
15/03/31 21:15:25 WARN NativeCodeLoader: Unable to load native-hadoop library 
for your platform... using builtin-java classes where applicable
15/03/31 21:15:25 INFO RMProxy: Connecting to ResourceManager at 
doggie153/10.177.112.153:8032
15/03/31 21:15:25 INFO Client: Requesting a new application from cluster with 4 
NodeManagers
15/03/31 21:15:25 INFO Client: Verifying our application has not requested more 
than the maximum memory capability of the cluster (8192 MB per container)
15/03/31 21:15:25 INFO Client: Will allocate AM container, with 896 MB memory 
including 384 MB overhead
15/03/31 21:15:25 INFO Client: Setting up container launch context for our AM
15/03/31 21:15:25 INFO Client: Preparing resources for our AM container
15/03/31 21:15:26 INFO Client: Uploading resource 
file:/opt/oss/spark-1.3.0-bin-hadoop2.4/lib/spark-assembly-1.4.0-SNAPSHOT-hadoop2.4.1.jar
 -> 
hdfs://doggie153:9000/user/root/.sparkStaging/application_1427257505534_0016/spark-assembly-1.4.0-SNAPSHOT-hadoop2.4.1.jar
15/03/31 21:15:27 INFO Client: Uploading resource 
file:/opt/oss/spark-1.3.0-bin-hadoop2.4/lib/spark-examples-1.3.0-hadoop2.4.0.jar
 -> 
hdfs://doggie153:9000/user/root/.sparkStaging/application_1427257505534_0016/spark-examples-1.3.0-hadoop2.4.0.jar
15/03/31 21:15:28 INFO Client: Setting up the launch environment for our AM 
container
15/03/31 21:15:28 INFO SecurityManager: Changing view acls to: root
15/03/31 21:15:28 INFO SecurityManager: Changing modify acls to: root
15/03/31 21:15:28 INFO SecurityManager: SecurityManager: authentication 
disabled; ui acls disabled; users with view permissions: Set(root); users with 
modify permissions: Set(root)
15/03/31 21:15:28 INFO Client: Submitting application 16 to ResourceManager
15/03/31 21:15:28 INFO YarnClientImpl: Submitted application 
application_1427257505534_0016
15/03/31 21:15:28 INFO Client: ... waiting before polling ResourceManager for 
application state
15/03/31 21:15:33 INFO Client: ... polling ResourceManager for application state
15/03/31 21:15:33 INFO Client: Application report for 
application_1427257505534_0016 (state: RUNNING)
15/03/31 21:15:33 INFO Client:
 client token: N/A
 diagnostics: N/A
 ApplicationMaster host: doggie157
 ApplicationMaster RPC port: 0
 queue: default
 start time: 1427807728307
 final status: UNDEFINED
 tracking URL: 
http://doggie153:8088/proxy/application_1427257505534_0016/
 user: root

/cc andrewor14

Author: WangTaoTheTonic wangtao...@huawei.com

Closes #5297 from WangTaoTheTonic/SPARK-3591 and squashes the following commits:

c76d232 [WangTaoTheTonic] wrap lines
16c90a8 [WangTaoTheTonic] move up lines to avoid duplicate
fea390d [WangTaoTheTonic] log failed/killed report, style and comment
be1cc2e [WangTaoTheTonic] reword
f0bc54f [WangTaoTheTonic] minor: expose appid in exception messages
ba9b22b [WangTaoTheTonic] wrong config name
e1a4013 [WangTaoTheTonic] revert to the old version and do some robust
19706c0 [WangTaoTheTonic] add a config to control whether to forget
0cbdce8 [WangTaoTheTonic] fire and forget for YARN cluster mode


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b65bad65
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b65bad65
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b65bad65

Branch: refs/heads/master
Commit: b65bad65c3500475b974ca0219f218eef296db2c
Parents: ae980eb
Author: WangTaoTheTonic wangtao...@huawei.com
Authored: Tue Apr 7 08:36:25 2015 -0500
Committer: Thomas Graves tgra...@apache.org
Committed: Tue Apr 7 08:36:25 2015 -0500

--
 .../scala/org/apache/spark/deploy/Client.scala  |  2 +-
 .../deploy/rest/StandaloneRestClient.scala  |  2 +-
 docs/running-on-yarn.md |  9 +++
 .../org/apache/spark/deploy/yarn/Client.scala   | 83 
 4 files changed, 61 insertions(+), 35 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/b65bad65/core/src/main/scala/org/apache/spark/deploy/Client.scala
--
diff --git a/core/src/main/scala/org/apache/spark/deploy/Client.scala 
b/core/src/main/scala/org/apache/spark/deploy/Client.scala
index 65238af..8d13b2a 100644
--- a/core/src/main/scala/org/apache/spark/deploy/Client.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/Client.scala
@@ -89,7 +89,7 @@ private class 

spark git commit: Replace use of .size with .length for Arrays

2015-04-07 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master 7162ecf88 -> 2c32bef17


Replace use of .size with .length for Arrays

Invoking .size on arrays is valid, but requires an implicit conversion to 
SeqLike. This incurs a compile time overhead and more importantly a runtime 
overhead, as the Array must be wrapped before the method can be invoked. For 
example, the difference in generated byte code is:

  public int withSize();
    Code:
       0: getstatic     #23  // Field scala/Predef$.MODULE$:Lscala/Predef$;
       3: aload_0
       4: invokevirtual #25  // Method array:()[I
       7: invokevirtual #29  // Method scala/Predef$.intArrayOps:([I)Lscala/collection/mutable/ArrayOps;
      10: invokeinterface #34,  1  // InterfaceMethod scala/collection/mutable/ArrayOps.size:()I
      15: ireturn

  public int withLength();
    Code:
       0: aload_0
       1: invokevirtual #25  // Method array:()[I
       4: arraylength
       5: ireturn
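
For reference, a minimal Scala sketch (illustrative, not taken from the patch) 
of the kind of class that produces the two listings above:

```scala
class SizeVsLength {
  private val array: Array[Int] = Array(1, 2, 3)

  def withSize(): Int = array.size      // implicit wrap in ArrayOps, then its size()
  def withLength(): Int = array.length  // a single arraylength bytecode instruction
}
```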

Author: sksamuel s...@sksamuel.com

Closes #5376 from sksamuel/master and squashes the following commits:

77ec261 [sksamuel] Replace use of .size with .length for Arrays.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2c32bef1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2c32bef1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2c32bef1

Branch: refs/heads/master
Commit: 2c32bef1790dac6f77ef9674f6106c2e24ea0338
Parents: 7162ecf
Author: sksamuel s...@sksamuel.com
Authored: Tue Apr 7 10:43:22 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Tue Apr 7 10:43:22 2015 -0700

--
 .../apache/spark/network/nio/Connection.scala   |  2 +-
 .../org/apache/spark/rdd/AsyncRDDActions.scala  | 10 -
 .../scala/org/apache/spark/rdd/BlockRDD.scala   |  2 +-
 .../org/apache/spark/rdd/CartesianRDD.scala |  4 ++--
 .../org/apache/spark/rdd/CheckpointRDD.scala|  2 +-
 .../org/apache/spark/rdd/CoGroupedRDD.scala |  4 ++--
 .../org/apache/spark/rdd/CoalescedRDD.scala |  2 +-
 .../apache/spark/rdd/DoubleRDDFunctions.scala   |  4 ++--
 .../apache/spark/rdd/OrderedRDDFunctions.scala  |  2 +-
 .../org/apache/spark/rdd/PairRDDFunctions.scala |  2 +-
 .../main/scala/org/apache/spark/rdd/RDD.scala   | 22 ++--
 .../apache/spark/rdd/RDDCheckpointData.scala|  6 +++---
 .../org/apache/spark/rdd/SubtractedRDD.scala|  2 +-
 .../scala/org/apache/spark/rdd/UnionRDD.scala   |  6 +++---
 .../apache/spark/rdd/ZippedPartitionsRDD.scala  |  4 ++--
 .../apache/spark/rdd/ZippedWithIndexRDD.scala   |  2 +-
 .../org/apache/spark/storage/RDDInfo.scala  |  2 +-
 .../apache/spark/ui/ConsoleProgressBar.scala|  4 ++--
 .../apache/spark/util/collection/BitSet.scala   |  2 +-
 19 files changed, 42 insertions(+), 42 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/2c32bef1/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
--
diff --git a/core/src/main/scala/org/apache/spark/network/nio/Connection.scala 
b/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
index 04eb2bf..6b898bd 100644
--- a/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
+++ b/core/src/main/scala/org/apache/spark/network/nio/Connection.scala
@@ -181,7 +181,7 @@ abstract class Connection(val channel: SocketChannel, val selector: Selector,
     buffer.get(bytes)
     bytes.foreach(x => print(x + " "))
     buffer.position(curPosition)
-    print(" (" + bytes.size + ")")
+    print(" (" + bytes.length + ")")
   }
 
   def printBuffer(buffer: ByteBuffer, position: Int, length: Int) {

http://git-wip-us.apache.org/repos/asf/spark/blob/2c32bef1/core/src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala
--
diff --git a/core/src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala 
b/core/src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala
index 646df28..3406a7e 100644
--- a/core/src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala
+++ b/core/src/main/scala/org/apache/spark/rdd/AsyncRDDActions.scala
@@ -45,7 +45,7 @@ class AsyncRDDActions[T: ClassTag](self: RDD[T]) extends Serializable with Logging {
         }
         result
       },
-      Range(0, self.partitions.size),
+      Range(0, self.partitions.length),
       (index: Int, data: Long) => totalCount.addAndGet(data),
       totalCount.get())
   }
@@ -54,8 +54,8 @@ class AsyncRDDActions[T: ClassTag](self: RDD[T]) extends Serializable with Logging {
    * Returns a future for retrieving all elements of this RDD.
    */
   def collectAsync(): FutureAction[Seq[T]] = {
-    val results = new Array[Array[T]](self.partitions.size)
-

spark git commit: [SPARK-6733][Scheduler] Added scala.language.existentials

2015-04-07 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master b65bad65c -> 7162ecf88


[SPARK-6733][Scheduler] Added scala.language.existentials

Author: Vinod K C vinod...@huawei.com

Closes #5384 from vinodkc/Suppression_Scala_existential_code and squashes the 
following commits:

82a3a1f [Vinod K C] Added scala.language.existentials


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7162ecf8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7162ecf8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7162ecf8

Branch: refs/heads/master
Commit: 7162ecf88624615c78a332de482f5defd297e415
Parents: b65bad6
Author: Vinod K C vinod...@huawei.com
Authored: Tue Apr 7 10:42:08 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Tue Apr 7 10:42:08 2015 -0700

--
 core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala   | 1 +
 .../test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala| 1 +
 2 files changed, 2 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/7162ecf8/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
--
diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala 
b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
index 917cce1..c82ae4b 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
@@ -24,6 +24,7 @@ import java.util.concurrent.atomic.AtomicInteger
 
 import scala.collection.mutable.{ArrayBuffer, HashMap, HashSet, Map, Stack}
 import scala.concurrent.duration._
+import scala.language.existentials
 import scala.language.postfixOps
 import scala.util.control.NonFatal
 

http://git-wip-us.apache.org/repos/asf/spark/blob/7162ecf8/mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala 
b/mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala
index 29d4ec5..fc73493 100644
--- a/mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/ml/recommendation/ALSSuite.scala
@@ -22,6 +22,7 @@ import java.util.Random
 
 import scala.collection.mutable
 import scala.collection.mutable.ArrayBuffer
+import scala.language.existentials
 
 import com.github.fommil.netlib.BLAS.{getInstance => blas}
 import org.scalatest.FunSuite





spark git commit: [SPARK-6750] Upgrade ScalaStyle to 0.7.

2015-04-07 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master 2c32bef17 -> 123221591


[SPARK-6750] Upgrade ScalaStyle to 0.7.

0.7 fixes a pretty useful bug: inline functions no longer require an explicit 
return type definition.

Author: Reynold Xin r...@databricks.com

Closes #5399 from rxin/style0.7 and squashes the following commits:

54c41b2 [Reynold Xin] Actually update the version.
09c759c [Reynold Xin] [SPARK-6750] Upgrade ScalaStyle to 0.7.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/12322159
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/12322159
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/12322159

Branch: refs/heads/master
Commit: 12322159147581602978f7f5a6b33b887ef781a1
Parents: 2c32bef
Author: Reynold Xin r...@databricks.com
Authored: Tue Apr 7 12:37:33 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Tue Apr 7 12:37:33 2015 -0700

--
 project/plugins.sbt |  2 +-
 project/project/SparkPluginBuild.scala  | 16 +---
 .../scalastyle/NonASCIICharacterChecker.scala   | 39 
 3 files changed, 2 insertions(+), 55 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/12322159/project/plugins.sbt
--
diff --git a/project/plugins.sbt b/project/plugins.sbt
index ee45b6a..7096b0d 100644
--- a/project/plugins.sbt
+++ b/project/plugins.sbt
@@ -19,7 +19,7 @@ addSbtPlugin("com.github.mpeltonen" % "sbt-idea" % "1.6.0")
 
 addSbtPlugin("net.virtual-void" % "sbt-dependency-graph" % "0.7.4")
 
-addSbtPlugin("org.scalastyle" %% "scalastyle-sbt-plugin" % "0.6.0")
+addSbtPlugin("org.scalastyle" %% "scalastyle-sbt-plugin" % "0.7.0")
 
 addSbtPlugin("com.typesafe" % "sbt-mima-plugin" % "0.1.6")
 

http://git-wip-us.apache.org/repos/asf/spark/blob/12322159/project/project/SparkPluginBuild.scala
--
diff --git a/project/project/SparkPluginBuild.scala 
b/project/project/SparkPluginBuild.scala
index 8863f27..471d00b 100644
--- a/project/project/SparkPluginBuild.scala
+++ b/project/project/SparkPluginBuild.scala
@@ -24,20 +24,6 @@ import sbt.Keys._
  * becomes available for scalastyle sbt plugin.
  */
 object SparkPluginDef extends Build {
-  lazy val root = Project("plugins", file(".")) dependsOn(sparkStyle, sbtPomReader)
-  lazy val sparkStyle = Project("spark-style", file("spark-style"), settings = styleSettings)
+  lazy val root = Project("plugins", file(".")) dependsOn(sbtPomReader)
   lazy val sbtPomReader = uri("https://github.com/ScrapCodes/sbt-pom-reader.git#ignore_artifact_id")
-
-  // There is actually no need to publish this artifact.
-  def styleSettings = Defaults.defaultSettings ++ Seq (
-    name                 :=  "spark-style",
-    organization         :=  "org.apache.spark",
-    scalaVersion         :=  "2.10.4",
-    scalacOptions        :=  Seq("-unchecked", "-deprecation"),
-    libraryDependencies  ++= Dependencies.scalaStyle
-  )
-
-  object Dependencies {
-    val scalaStyle = Seq("org.scalastyle" %% "scalastyle" % "0.4.0")
-  }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/12322159/project/spark-style/src/main/scala/org/apache/spark/scalastyle/NonASCIICharacterChecker.scala
--
diff --git 
a/project/spark-style/src/main/scala/org/apache/spark/scalastyle/NonASCIICharacterChecker.scala
 
b/project/spark-style/src/main/scala/org/apache/spark/scalastyle/NonASCIICharacterChecker.scala
deleted file mode 100644
index 3d43c35..000
--- 
a/project/spark-style/src/main/scala/org/apache/spark/scalastyle/NonASCIICharacterChecker.scala
+++ /dev/null
@@ -1,39 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-
-package org.apache.spark.scalastyle
-
-import java.util.regex.Pattern
-
-import org.scalastyle.{PositionError, ScalariformChecker, ScalastyleError}
-
-import scalariform.lexer.Token
-import scalariform.parser.CompilationUnit
-
-class NonASCIICharacterChecker extends 

spark git commit: [SPARK-6568] spark-shell.cmd --jars option does not accept the jar that has space in its path

2015-04-07 Thread meng
Repository: spark
Updated Branches:
  refs/heads/master 123221591 -> 596ba77c5


[SPARK-6568] spark-shell.cmd --jars option does not accept the jar that has 
space in its path

Escape spaces in the arguments.
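
A minimal sketch (plain JDK `URI`, illustrative path) of the failure mode and 
the escaping this patch applies:

```scala
val raw = """C:\path to\jar4.jar"""
// The unescaped form is not a valid URI:
// new java.net.URI("file:/C:/path to/jar4.jar")  // throws URISyntaxException
val formatted = raw.replace("\\", "/").replace(" ", "%20")
new java.net.URI("file:/" + formatted)  // parses as file:/C:/path%20to/jar4.jar
```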

Author: Masayoshi TSUZUKI tsudu...@oss.nttdata.co.jp

Closes #5347 from tsudukim/feature/SPARK-6568 and squashes the following 
commits:

9180aaf [Masayoshi TSUZUKI] [SPARK-6568] spark-shell.cmd --jars option does not 
accept the jar that has space in its path


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/596ba77c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/596ba77c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/596ba77c

Branch: refs/heads/master
Commit: 596ba77c5fdca79486396989e549632153055caf
Parents: 1232215
Author: Masayoshi TSUZUKI tsudu...@oss.nttdata.co.jp
Authored: Tue Apr 7 14:29:53 2015 -0700
Committer: Xiangrui Meng m...@databricks.com
Committed: Tue Apr 7 14:29:53 2015 -0700

--
 core/src/main/scala/org/apache/spark/util/Utils.scala  | 2 +-
 core/src/test/scala/org/apache/spark/util/UtilsSuite.scala | 6 --
 2 files changed, 5 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/596ba77c/core/src/main/scala/org/apache/spark/util/Utils.scala
--
diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala 
b/core/src/main/scala/org/apache/spark/util/Utils.scala
index 0fdfaf3..25ae6ee 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -1661,7 +1661,7 @@ private[spark] object Utils extends Logging {
   /**
    * Format a Windows path such that it can be safely passed to a URI.
    */
-  def formatWindowsPath(path: String): String = path.replace("\\", "/")
+  def formatWindowsPath(path: String): String = path.replace("\\", "/").replace(" ", "%20")
 
   /**
    * Indicates whether Spark is currently running unit tests.

http://git-wip-us.apache.org/repos/asf/spark/blob/596ba77c/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
--
diff --git a/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala 
b/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
index 5d93086..b7cc840 100644
--- a/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
+++ b/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
@@ -241,6 +241,7 @@ class UtilsSuite extends FunSuite with ResetSystemProperties {
     assertResolves("C:/path/to/file.txt", "file:/C:/path/to/file.txt", testWindows = true)
     assertResolves("C:\\path\\to\\file.txt", "file:/C:/path/to/file.txt", testWindows = true)
     assertResolves("file:/C:/path/to/file.txt", "file:/C:/path/to/file.txt", testWindows = true)
+    assertResolves("file:/C:/path to/file.txt", "file:/C:/path%20to/file.txt", testWindows = true)
     assertResolves("file:///C:/path/to/file.txt", "file:/C:/path/to/file.txt", testWindows = true)
     assertResolves("file:/C:/file.txt#alias.txt", "file:/C:/file.txt#alias.txt", testWindows = true)
     intercept[IllegalArgumentException] { Utils.resolveURI("file:foo") }
@@ -264,8 +265,9 @@ class UtilsSuite extends FunSuite with ResetSystemProperties {
     assertResolves("hdfs:/jar1,file:/jar2,jar3", s"hdfs:/jar1,file:/jar2,file:$cwd/jar3")
     assertResolves("hdfs:/jar1,file:/jar2,jar3,jar4#jar5",
       s"hdfs:/jar1,file:/jar2,file:$cwd/jar3,file:$cwd/jar4#jar5")
-    assertResolves("hdfs:/jar1,file:/jar2,jar3,C:\\pi.py#py.pi",
-      s"hdfs:/jar1,file:/jar2,file:$cwd/jar3,file:/C:/pi.py#py.pi", testWindows = true)
+    assertResolves("""hdfs:/jar1,file:/jar2,jar3,C:\pi.py#py.pi,C:\path to\jar4.jar""",
+      s"hdfs:/jar1,file:/jar2,file:$cwd/jar3,file:/C:/pi.py#py.pi,file:/C:/path%20to/jar4.jar",
+      testWindows = true)
   }
 
   test("nonLocalPaths") {





spark git commit: Revert "[SPARK-6568] spark-shell.cmd --jars option does not accept the jar that has space in its path"

2015-04-07 Thread meng
Repository: spark
Updated Branches:
  refs/heads/master 596ba77c5 -> e6f08fb42


Revert "[SPARK-6568] spark-shell.cmd --jars option does not accept the jar that 
has space in its path"

This reverts commit 596ba77c5fdca79486396989e549632153055caf.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e6f08fb4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e6f08fb4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e6f08fb4

Branch: refs/heads/master
Commit: e6f08fb42fda35952ea8b005170750ae551dc7d9
Parents: 596ba77
Author: Xiangrui Meng m...@databricks.com
Authored: Tue Apr 7 14:34:15 2015 -0700
Committer: Xiangrui Meng m...@databricks.com
Committed: Tue Apr 7 14:34:15 2015 -0700

--
 core/src/main/scala/org/apache/spark/util/Utils.scala  | 2 +-
 core/src/test/scala/org/apache/spark/util/UtilsSuite.scala | 6 ++
 2 files changed, 3 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/e6f08fb4/core/src/main/scala/org/apache/spark/util/Utils.scala
--
diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala 
b/core/src/main/scala/org/apache/spark/util/Utils.scala
index 25ae6ee..0fdfaf3 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -1661,7 +1661,7 @@ private[spark] object Utils extends Logging {
   /**
    * Format a Windows path such that it can be safely passed to a URI.
    */
-  def formatWindowsPath(path: String): String = path.replace("\\", "/").replace(" ", "%20")
+  def formatWindowsPath(path: String): String = path.replace("\\", "/")
 
   /**
    * Indicates whether Spark is currently running unit tests.

http://git-wip-us.apache.org/repos/asf/spark/blob/e6f08fb4/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
--
diff --git a/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala 
b/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
index b7cc840..5d93086 100644
--- a/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
+++ b/core/src/test/scala/org/apache/spark/util/UtilsSuite.scala
@@ -241,7 +241,6 @@ class UtilsSuite extends FunSuite with ResetSystemProperties {
     assertResolves("C:/path/to/file.txt", "file:/C:/path/to/file.txt", testWindows = true)
     assertResolves("C:\\path\\to\\file.txt", "file:/C:/path/to/file.txt", testWindows = true)
     assertResolves("file:/C:/path/to/file.txt", "file:/C:/path/to/file.txt", testWindows = true)
-    assertResolves("file:/C:/path to/file.txt", "file:/C:/path%20to/file.txt", testWindows = true)
     assertResolves("file:///C:/path/to/file.txt", "file:/C:/path/to/file.txt", testWindows = true)
     assertResolves("file:/C:/file.txt#alias.txt", "file:/C:/file.txt#alias.txt", testWindows = true)
     intercept[IllegalArgumentException] { Utils.resolveURI("file:foo") }
@@ -265,9 +264,8 @@ class UtilsSuite extends FunSuite with ResetSystemProperties {
     assertResolves("hdfs:/jar1,file:/jar2,jar3", s"hdfs:/jar1,file:/jar2,file:$cwd/jar3")
     assertResolves("hdfs:/jar1,file:/jar2,jar3,jar4#jar5",
       s"hdfs:/jar1,file:/jar2,file:$cwd/jar3,file:$cwd/jar4#jar5")
-    assertResolves("""hdfs:/jar1,file:/jar2,jar3,C:\pi.py#py.pi,C:\path to\jar4.jar""",
-      s"hdfs:/jar1,file:/jar2,file:$cwd/jar3,file:/C:/pi.py#py.pi,file:/C:/path%20to/jar4.jar",
-      testWindows = true)
+    assertResolves("hdfs:/jar1,file:/jar2,jar3,C:\\pi.py#py.pi",
+      s"hdfs:/jar1,file:/jar2,file:$cwd/jar3,file:/C:/pi.py#py.pi", testWindows = true)
   }
 
   test("nonLocalPaths") {





spark git commit: [SPARK-6720][MLLIB] PySpark MultivariateStatisticalSummary unit test for normL1...

2015-04-07 Thread meng
Repository: spark
Updated Branches:
  refs/heads/master e6f08fb42 -> fc957dc78


[SPARK-6720][MLLIB] PySpark MultivariateStatisticalSummary unit test for 
normL1...

... and normL2.
Add test cases for the insufficiently tested `normL1` and `normL2`.

Ref: https://github.com/apache/spark/pull/5359

Author: lewuathe lewua...@me.com

Closes #5374 from Lewuathe/SPARK-6720 and squashes the following commits:

5541b24 [lewuathe] More accurate tests
dc5718c [lewuathe] [SPARK-6720] PySpark MultivariateStatisticalSummary unit 
test for normL1 and normL2


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fc957dc7
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/fc957dc7
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/fc957dc7

Branch: refs/heads/master
Commit: fc957dc78138e72036dbbadc9a54f155d318c038
Parents: e6f08fb
Author: lewuathe lewua...@me.com
Authored: Tue Apr 7 14:36:57 2015 -0700
Committer: Xiangrui Meng m...@databricks.com
Committed: Tue Apr 7 14:36:57 2015 -0700

--
 python/pyspark/mllib/tests.py | 7 +++
 1 file changed, 7 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/fc957dc7/python/pyspark/mllib/tests.py
--
diff --git a/python/pyspark/mllib/tests.py b/python/pyspark/mllib/tests.py
index 47dad7d..61ef398 100644
--- a/python/pyspark/mllib/tests.py
+++ b/python/pyspark/mllib/tests.py
@@ -363,6 +363,13 @@ class StatTests(PySparkTestCase):
         self.assertEqual(10, len(summary.normL1()))
         self.assertEqual(10, len(summary.normL2()))
 
+        data2 = self.sc.parallelize(xrange(10)).map(lambda x: Vectors.dense(x))
+        summary2 = Statistics.colStats(data2)
+        self.assertEqual(array([45.0]), summary2.normL1())
+        import math
+        expectedNormL2 = math.sqrt(sum(map(lambda x: x*x, xrange(10))))
+        self.assertTrue(math.fabs(summary2.normL2()[0] - expectedNormL2) < 1e-14)
+
 
 class VectorUDTTests(PySparkTestCase):
 





spark git commit: [SPARK-6748] [SQL] Makes QueryPlan.schema a lazy val

2015-04-07 Thread lian
Repository: spark
Updated Branches:
  refs/heads/master fc957dc78 -> 77bcceb9f


[SPARK-6748] [SQL] Makes QueryPlan.schema a lazy val

`DataFrame.collect()` calls `SparkPlan.executeCollect()`, which consists of a 
single line:

```scala
execute().map(ScalaReflection.convertRowToScala(_, schema)).collect()
```

The problem is that `QueryPlan.schema` is a method (a `def`), and since 1.3.0 
`convertRowToScala` returns a `GenericRowWithSchema`. Thus, every 
`GenericRowWithSchema` instance holds a separate copy of the schema object. 
Also, YJP profiling of the following simple micro-benchmark (executed in the 
Spark shell) shows that constructing the schema object takes up to ~35% of CPU 
time.

```scala
sc.parallelize(1 to 1000).
  map(i => (i, s"val_$i")).
  toDF("key", "value").
  saveAsParquetFile("file:///tmp/src.parquet")

// Profiling started from this line
sqlContext.parquetFile("file:///tmp/src.parquet").collect()
```
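
A minimal illustration (not Spark's actual classes) of the difference the 
one-line fix relies on:

```scala
class PlanLike(output: Seq[String]) {
  def schemaAsDef: Vector[String] = output.toVector          // a fresh object on every call
  lazy val schemaAsLazyVal: Vector[String] = output.toVector // computed once, shared thereafter
}
```

With a `def`, every collected row ends up holding its own freshly built schema 
object; with a `lazy val`, all rows share the single cached instance.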


Author: Cheng Lian l...@databricks.com

Closes #5398 from liancheng/spark-6748 and squashes the following commits:

3159469 [Cheng Lian] Makes QueryPlan.schema a lazy val


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/77bcceb9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/77bcceb9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/77bcceb9

Branch: refs/heads/master
Commit: 77bcceb9f01e97cb6f41791f2167b40c4311f701
Parents: fc957dc
Author: Cheng Lian l...@databricks.com
Authored: Wed Apr 8 07:00:56 2015 +0800
Committer: Cheng Lian l...@databricks.com
Committed: Wed Apr 8 07:00:56 2015 +0800

--
 .../main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/77bcceb9/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
index 02f7c26..7967189 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
+++ 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala
@@ -150,7 +150,7 @@ abstract class QueryPlan[PlanType <: TreeNode[PlanType]] extends TreeNode[PlanType] {
     }.toSeq
   }
 
-  def schema: StructType = StructType.fromAttributes(output)
+  lazy val schema: StructType = StructType.fromAttributes(output)
 
   /** Returns the output schema in the tree format. */
   def schemaString: String = schema.treeString





spark git commit: [SPARK-6737] Fix memory leak in OutputCommitCoordinator

2015-04-07 Thread joshrosen
Repository: spark
Updated Branches:
  refs/heads/master 77bcceb9f -> c83e03948


[SPARK-6737] Fix memory leak in OutputCommitCoordinator

This patch fixes a memory leak in the DAGScheduler, which caused us to leak a 
map entry per submitted stage.  The problem is that the OutputCommitCoordinator 
needs to be informed when stages end in order to remove entries from its 
`authorizedCommitters` map, but the DAGScheduler only called it in one of the 
four code paths that are used to mark stages as completed.

This patch fixes this issue by consolidating the processing of stage completion 
into a new `markStageAsFinished` method and updates DAGSchedulerSuite's 
`assertDataStructuresEmpty` assertion to also check the OutputCommitCoordinator 
data structures.  I've also added a comment at the top of DAGScheduler so that 
we remember to update this test when adding new data structures.
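
One detail of the diff below worth illustrating: mutating a collection while 
iterating over it. A minimal sketch (illustrative collection, not the 
scheduler's):

```scala
import scala.collection.mutable

val running = mutable.HashSet("stage1", "stage2", "stage3")
// Removing elements while iterating over a mutable set is unsafe:
// running.foreach(s => running -= s)
// Iterating over a snapshot instead is safe, which is why the patch adds toArray:
running.toArray.foreach(s => running -= s)
```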

Author: Josh Rosen joshro...@databricks.com

Closes #5397 from JoshRosen/SPARK-6737 and squashes the following commits:

af3b02f [Josh Rosen] Consolidate stage completion handling code in a single 
method.
e96ce3a [Josh Rosen] Consolidate stage completion handling code in a single 
method.
3052aea [Josh Rosen] Comment update
7896899 [Josh Rosen] Fix SPARK-6737 by informing OutputCommitCoordinator of all 
stage end events.
4ead1dc [Josh Rosen] Add regression tests for SPARK-6737


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c83e0394
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c83e0394
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c83e0394

Branch: refs/heads/master
Commit: c83e03948b184ffb3a9418fecc4d2c26ae33b057
Parents: 77bcceb
Author: Josh Rosen joshro...@databricks.com
Authored: Tue Apr 7 16:18:55 2015 -0700
Committer: Josh Rosen joshro...@databricks.com
Committed: Tue Apr 7 16:18:55 2015 -0700

--
 .../apache/spark/scheduler/DAGScheduler.scala   | 63 +++-
 .../scheduler/OutputCommitCoordinator.scala |  7 +++
 .../spark/scheduler/DAGSchedulerSuite.scala |  1 +
 3 files changed, 42 insertions(+), 29 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/c83e0394/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
--
diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala 
b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
index c82ae4b..c912520 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
@@ -50,6 +50,10 @@ import 
org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
  * not caused by shuffle file loss are handled by the TaskScheduler, which 
will retry each task
  * a small number of times before cancelling the whole stage.
  *
+ * Here's a checklist to use when making or reviewing changes to this class:
+ *
+ *  - When adding a new data structure, update 
`DAGSchedulerSuite.assertDataStructuresEmpty` to
+ *include the new structure. This will help to catch memory leaks.
  */
 private[spark]
 class DAGScheduler(
@@ -111,6 +115,8 @@ class DAGScheduler(
   //   stray messages to detect.
   private val failedEpoch = new HashMap[String, Long]
 
+  private [scheduler] val outputCommitCoordinator = env.outputCommitCoordinator
+
   // A closure serializer that we reuse.
   // This is only safe because DAGScheduler runs in a single thread.
   private val closureSerializer = SparkEnv.get.closureSerializer.newInstance()
@@ -128,8 +134,6 @@ class DAGScheduler(
   private[scheduler] val eventProcessLoop = new 
DAGSchedulerEventProcessLoop(this)
   taskScheduler.setDAGScheduler(this)
 
-  private val outputCommitCoordinator = env.outputCommitCoordinator
-
   // Called by TaskScheduler to report task's starting.
   def taskStarted(task: Task[_], taskInfo: TaskInfo) {
 eventProcessLoop.post(BeginEvent(task, taskInfo))
@@ -710,9 +714,10 @@ class DAGScheduler(
       // cancelling the stages because if the DAG scheduler is stopped, the entire application
       // is in the process of getting stopped.
       val stageFailedMessage = "Stage cancelled because SparkContext was shut down"
-      runningStages.foreach { stage =>
-        stage.latestInfo.stageFailed(stageFailedMessage)
-        listenerBus.post(SparkListenerStageCompleted(stage.latestInfo))
+      // The `toArray` here is necessary so that we don't iterate over `runningStages` while
+      // mutating it.
+      runningStages.toArray.foreach { stage =>
+        markStageAsFinished(stage, Some(stageFailedMessage))
       }
       listenerBus.post(SparkListenerJobEnd(job.jobId, clock.getTimeMillis(), JobFailed(error)))
     }
@@ -887,10 +892,9 @@ class DAGScheduler(
 

spark git commit: [SPARK-6737] Fix memory leak in OutputCommitCoordinator

2015-04-07 Thread joshrosen
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 ab1b8edb8 -> 277733b1d


[SPARK-6737] Fix memory leak in OutputCommitCoordinator

This patch fixes a memory leak in the DAGScheduler, which caused us to leak a 
map entry per submitted stage.  The problem is that the OutputCommitCoordinator 
needs to be informed when stages end in order to remove entries from its 
`authorizedCommitters` map, but the DAGScheduler only called it in one of the 
four code paths that are used to mark stages as completed.

This patch fixes this issue by consolidating the processing of stage completion 
into a new `markStageAsFinished` method and updates DAGSchedulerSuite's 
`assertDataStructuresEmpty` assertion to also check the OutputCommitCoordinator 
data structures.  I've also added a comment at the top of DAGScheduler so that 
we remember to update this test when adding new data structures.

Author: Josh Rosen joshro...@databricks.com

Closes #5397 from JoshRosen/SPARK-6737 and squashes the following commits:

af3b02f [Josh Rosen] Consolidate stage completion handling code in a single 
method.
e96ce3a [Josh Rosen] Consolidate stage completion handling code in a single 
method.
3052aea [Josh Rosen] Comment update
7896899 [Josh Rosen] Fix SPARK-6737 by informing OutputCommitCoordinator of all 
stage end events.
4ead1dc [Josh Rosen] Add regression tests for SPARK-6737

(cherry picked from commit c83e03948b184ffb3a9418fecc4d2c26ae33b057)
Signed-off-by: Josh Rosen joshro...@databricks.com

Conflicts:
core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/277733b1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/277733b1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/277733b1

Branch: refs/heads/branch-1.3
Commit: 277733b1daa16fcd39e1ab30ba73b040affc3810
Parents: ab1b8ed
Author: Josh Rosen joshro...@databricks.com
Authored: Tue Apr 7 16:18:55 2015 -0700
Committer: Josh Rosen joshro...@databricks.com
Committed: Tue Apr 7 16:26:07 2015 -0700

--
 .../apache/spark/scheduler/DAGScheduler.scala   | 63 +++-
 .../scheduler/OutputCommitCoordinator.scala |  7 +++
 .../spark/scheduler/DAGSchedulerSuite.scala |  1 +
 3 files changed, 42 insertions(+), 29 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/277733b1/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
--
diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala 
b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
index c10873e..4a79eba 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
@@ -54,6 +54,10 @@ import 
org.apache.spark.storage.BlockManagerMessages.BlockManagerHeartbeat
  * not caused by shuffle file loss are handled by the TaskScheduler, which 
will retry each task
  * a small number of times before cancelling the whole stage.
  *
+ * Here's a checklist to use when making or reviewing changes to this class:
+ *
+ *  - When adding a new data structure, update 
`DAGSchedulerSuite.assertDataStructuresEmpty` to
+ *include the new structure. This will help to catch memory leaks.
  */
 private[spark]
 class DAGScheduler(
@@ -115,6 +119,8 @@ class DAGScheduler(
   //   stray messages to detect.
   private val failedEpoch = new HashMap[String, Long]
 
+  private [scheduler] val outputCommitCoordinator = env.outputCommitCoordinator
+
   // A closure serializer that we reuse.
   // This is only safe because DAGScheduler runs in a single thread.
   private val closureSerializer = SparkEnv.get.closureSerializer.newInstance()
@@ -132,8 +138,6 @@ class DAGScheduler(
   private[scheduler] val eventProcessLoop = new 
DAGSchedulerEventProcessLoop(this)
   taskScheduler.setDAGScheduler(this)
 
-  private val outputCommitCoordinator = env.outputCommitCoordinator
-
   // Called by TaskScheduler to report task's starting.
   def taskStarted(task: Task[_], taskInfo: TaskInfo) {
 eventProcessLoop.post(BeginEvent(task, taskInfo))
@@ -698,9 +702,10 @@ class DAGScheduler(
       // cancelling the stages because if the DAG scheduler is stopped, the entire application
       // is in the process of getting stopped.
       val stageFailedMessage = "Stage cancelled because SparkContext was shut down"
-      runningStages.foreach { stage =>
-        stage.latestInfo.stageFailed(stageFailedMessage)
-        listenerBus.post(SparkListenerStageCompleted(stage.latestInfo))
+      // The `toArray` here is necessary so that we don't iterate over `runningStages` while
+      // mutating it.
+      runningStages.toArray.foreach { stage =>
+  

spark git commit: [SPARK-6754] Remove unnecessary TaskContextHelper

2015-04-07 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master d138aa8ee -> 8d2a36c0f


[SPARK-6754] Remove unnecessary TaskContextHelper

The TaskContextHelper was originally necessary because TaskContext was written 
in Java, which does
not have a way to specify that classes are package-private, so 
TaskContextHelper existed to work
around this. Now that TaskContext has been re-written in Scala, this class is 
no longer necessary.
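
A minimal sketch of the Scala visibility feature that makes the helper 
redundant (illustrative object, not Spark's):

```scala
package org.apache.spark

object VisibilityDemo {
  // Accessible from anywhere inside the org.apache.spark package, but hidden
  // outside it; Java source has no equivalent modifier, hence the old helper.
  private[spark] def packagePrivateSetter(): Unit = ()
}
```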

rxin can you look at this? It looks like you missed this bit of cleanup when 
you moved TaskContext from Java to Scala in #4324

cc ScrapCodes and pwendell who added this originally.

Author: Kay Ousterhout kayousterh...@gmail.com

Closes #5402 from kayousterhout/SPARK-6754 and squashes the following commits:

f089800 [Kay Ousterhout] [SPARK-6754] Remove unnecessary TaskContextHelper


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/8d2a36c0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/8d2a36c0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/8d2a36c0

Branch: refs/heads/master
Commit: 8d2a36c0fdfbea9f58271ef6aeb89bb79b22cf62
Parents: d138aa8
Author: Kay Ousterhout kayousterh...@gmail.com
Authored: Tue Apr 7 22:40:42 2015 -0700
Committer: Reynold Xin r...@databricks.com
Committed: Tue Apr 7 22:40:42 2015 -0700

--
 .../org/apache/spark/TaskContextHelper.scala| 29 
 .../apache/spark/scheduler/DAGScheduler.scala   |  4 +--
 .../scala/org/apache/spark/scheduler/Task.scala |  6 ++--
 3 files changed, 5 insertions(+), 34 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/8d2a36c0/core/src/main/scala/org/apache/spark/TaskContextHelper.scala
--
diff --git a/core/src/main/scala/org/apache/spark/TaskContextHelper.scala 
b/core/src/main/scala/org/apache/spark/TaskContextHelper.scala
deleted file mode 100644
index 4636c46..000
--- a/core/src/main/scala/org/apache/spark/TaskContextHelper.scala
+++ /dev/null
@@ -1,29 +0,0 @@
-/*
- * Licensed to the Apache Software Foundation (ASF) under one or more
- * contributor license agreements.  See the NOTICE file distributed with
- * this work for additional information regarding copyright ownership.
- * The ASF licenses this file to You under the Apache License, Version 2.0
- * (the "License"); you may not use this file except in compliance with
- * the License.  You may obtain a copy of the License at
- *
- *http://www.apache.org/licenses/LICENSE-2.0
- *
- * Unless required by applicable law or agreed to in writing, software
- * distributed under the License is distributed on an "AS IS" BASIS,
- * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
- * See the License for the specific language governing permissions and
- * limitations under the License.
- */
-
-package org.apache.spark
-
-/**
- * This class exists to restrict the visibility of TaskContext setters.
- */
-private [spark] object TaskContextHelper {
-
-  def setTaskContext(tc: TaskContext): Unit = TaskContext.setTaskContext(tc)
-
-  def unset(): Unit = TaskContext.unset()
-  
-}

http://git-wip-us.apache.org/repos/asf/spark/blob/8d2a36c0/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
--
diff --git a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala 
b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
index c912520..508fe7b 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala
@@ -645,13 +645,13 @@ class DAGScheduler(
   val split = rdd.partitions(job.partitions(0))
   val taskContext = new TaskContextImpl(job.finalStage.id, 
job.partitions(0), taskAttemptId = 0,
 attemptNumber = 0, runningLocally = true)
-  TaskContextHelper.setTaskContext(taskContext)
+  TaskContext.setTaskContext(taskContext)
   try {
 val result = job.func(taskContext, rdd.iterator(split, taskContext))
 job.listener.taskSucceeded(0, result)
   } finally {
 taskContext.markTaskCompleted()
-TaskContextHelper.unset()
+TaskContext.unset()
   }
 } catch {
       case e: Exception =>

http://git-wip-us.apache.org/repos/asf/spark/blob/8d2a36c0/core/src/main/scala/org/apache/spark/scheduler/Task.scala
--
diff --git a/core/src/main/scala/org/apache/spark/scheduler/Task.scala 
b/core/src/main/scala/org/apache/spark/scheduler/Task.scala
index 4d9f940..8b59286 100644
--- a/core/src/main/scala/org/apache/spark/scheduler/Task.scala
+++ b/core/src/main/scala/org/apache/spark/scheduler/Task.scala
@@ -22,7 +22,7 @@ import 

spark git commit: [SPARK-6705][MLLIB] Add fit intercept api to ml logisticregression

2015-04-07 Thread jkbradley
Repository: spark
Updated Branches:
  refs/heads/master c83e03948 -> d138aa8ee


[SPARK-6705][MLLIB] Add fit intercept api to ml logisticregression

I have the fit intercept enabled by default for logistic regression; I wonder 
what others think here. I understand that it adds an allocation by default, 
which is undesirable, but one needs a very strong reason for not having an 
intercept term enabled, so it is the safer default from a statistical sense.

Explicitly modeling the intercept by adding a column of all 1s does not
work. I believe the reason is that since the API for
LogisticRegressionWithLBFGS forces column normalization, and a column of all
1s has 0 variance so dividing by 0 kills it.
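
Usage sketch of the new setter (illustrative values; the param defaults to 
true per the diff below):

```scala
import org.apache.spark.ml.classification.LogisticRegression

// Opting out of the intercept term via the param added by this patch:
val lr = new LogisticRegression()
  .setFitIntercept(false)
  .setMaxIter(50)
```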

Author: Omede Firouz ofir...@palantir.com

Closes #5301 from oefirouz/addIntercept and squashes the following commits:

9f1286b [Omede Firouz] [SPARK-6705][MLLIB] Add fitInterceptTerm to 
LogisticRegression
1d6bd6f [Omede Firouz] [SPARK-6705][MLLIB] Add a fit intercept term to ML 
LogisticRegression
9963509 [Omede Firouz] [MLLIB] Add fitIntercept to LogisticRegression
2257fca [Omede Firouz] [MLLIB] Add fitIntercept param to logistic regression
329c1e2 [Omede Firouz] [MLLIB] Add fit intercept term
bd9663c [Omede Firouz] [MLLIB] Add fit intercept api to ml logisticregression


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d138aa8e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d138aa8e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d138aa8e

Branch: refs/heads/master
Commit: d138aa8ee23f4450242da3ac70a493229a90c76b
Parents: c83e039
Author: Omede Firouz ofir...@palantir.com
Authored: Tue Apr 7 23:36:31 2015 -0400
Committer: Joseph K. Bradley jos...@databricks.com
Committed: Tue Apr 7 23:36:31 2015 -0400

--
 .../spark/ml/classification/LogisticRegression.scala|  8 ++--
 .../scala/org/apache/spark/ml/param/sharedParams.scala  | 12 
 .../ml/classification/LogisticRegressionSuite.scala |  9 +
 3 files changed, 27 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/d138aa8e/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
 
b/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
index 49c00f7..3462574 100644
--- 
a/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
+++ 
b/mllib/src/main/scala/org/apache/spark/ml/classification/LogisticRegression.scala
@@ -31,7 +31,7 @@ import org.apache.spark.storage.StorageLevel
  * Params for logistic regression.
  */
 private[classification] trait LogisticRegressionParams extends ProbabilisticClassifierParams
-  with HasRegParam with HasMaxIter with HasThreshold
+  with HasRegParam with HasMaxIter with HasFitIntercept with HasThreshold
 
 
 /**
@@ -56,6 +56,9 @@ class LogisticRegression
   def setMaxIter(value: Int): this.type = set(maxIter, value)
 
   /** @group setParam */
+  def setFitIntercept(value: Boolean): this.type = set(fitIntercept, value)
+
+  /** @group setParam */
   def setThreshold(value: Double): this.type = set(threshold, value)
 
   override protected def train(dataset: DataFrame, paramMap: ParamMap): LogisticRegressionModel = {
@@ -67,7 +70,8 @@ class LogisticRegression
 }
 
 // Train model
-val lr = new LogisticRegressionWithLBFGS
+val lr = new LogisticRegressionWithLBFGS()
+  .setIntercept(paramMap(fitIntercept))
 lr.optimizer
   .setRegParam(paramMap(regParam))
   .setNumIterations(paramMap(maxIter))
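
With this patch, disabling the intercept becomes a one-line builder call.
A minimal usage sketch; the training DataFrame `training` is assumed:

    import org.apache.spark.ml.classification.LogisticRegression

    // fitIntercept defaults to true per this patch; disable explicitly if needed.
    val lr = new LogisticRegression()
      .setMaxIter(100)
      .setFitIntercept(false)
    val model = lr.fit(training) // `training`: assumed DataFrame of (label, features)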

http://git-wip-us.apache.org/repos/asf/spark/blob/d138aa8e/mllib/src/main/scala/org/apache/spark/ml/param/sharedParams.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/param/sharedParams.scala 
b/mllib/src/main/scala/org/apache/spark/ml/param/sharedParams.scala
index 5d660d1..0739fdb 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/param/sharedParams.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/param/sharedParams.scala
@@ -106,6 +106,18 @@ private[ml] trait HasProbabilityCol extends Params {
   def getProbabilityCol: String = get(probabilityCol)
 }
 
+private[ml] trait HasFitIntercept extends Params {
+  /**
+   * param for fitting the intercept term, defaults to true
+   * @group param
+   */
+  val fitIntercept: BooleanParam =
+    new BooleanParam(this, "fitIntercept", "indicates whether to fit an intercept term", Some(true))
+
+  /** @group getParam */
+  def getFitIntercept: Boolean = get(fitIntercept)
+}
+
 private[ml] trait HasThreshold extends Params {
   /**
* param for threshold in 
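
The HasFitIntercept trait above follows MLlib's shared-param pattern: a small
trait owns the param, its default, and a typed getter, and an estimator simply
mixes it in. A stripped-down sketch of that pattern, using simplified
stand-ins rather than the real org.apache.spark.ml.param machinery:

    // Simplified stand-ins; the real Params/BooleanParam live in
    // org.apache.spark.ml.param and carry more structure than this.
    trait SimpleParams {
      private val values = scala.collection.mutable.Map.empty[String, Any]
      def set(name: String, v: Any): this.type = { values(name) = v; this }
      def get[T](name: String, default: T): T =
        values.getOrElse(name, default).asInstanceOf[T]
    }

    trait SimpleHasFitIntercept extends SimpleParams {
      // Defaults to true, matching the patch's Some(true) default.
      def getFitIntercept: Boolean = get("fitIntercept", default = true)
    }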

[2/2] spark git commit: Revert "Preparing Spark release v1.3.1-rc1"

2015-04-07 Thread pwendell
Revert "Preparing Spark release v1.3.1-rc1"

This reverts commit 0dcb5d9f31b713ed90bcec63ebc4e530cbb69851.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/00837ccd
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/00837ccd
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/00837ccd

Branch: refs/heads/branch-1.3
Commit: 00837ccd02000b6bb5994a0d7ab8a41d11efe5a3
Parents: 333b473
Author: Patrick Wendell patr...@databricks.com
Authored: Tue Apr 7 22:31:25 2015 -0400
Committer: Patrick Wendell patr...@databricks.com
Committed: Tue Apr 7 22:31:25 2015 -0400

--
 assembly/pom.xml  | 2 +-
 bagel/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 examples/pom.xml  | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml| 2 +-
 external/mqtt/pom.xml | 2 +-
 external/twitter/pom.xml  | 2 +-
 external/zeromq/pom.xml   | 2 +-
 extras/java8-tests/pom.xml| 2 +-
 extras/kinesis-asl/pom.xml| 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml| 2 +-
 mllib/pom.xml | 2 +-
 network/common/pom.xml| 2 +-
 network/shuffle/pom.xml   | 2 +-
 network/yarn/pom.xml  | 2 +-
 pom.xml   | 2 +-
 repl/pom.xml  | 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 yarn/pom.xml  | 2 +-
 28 files changed, 28 insertions(+), 28 deletions(-)
--
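
Each of these release-prep commits applies the same mechanical edit: flip the
parent <version> string in every module's pom.xml. A naive sketch of such a
bulk rewrite (hypothetical bumpVersion helper; Spark's actual release tooling
is Maven/shell based and is not part of this commit):

    import java.nio.file.{Files, Paths}

    // Naive flip of the version string across pom files; note it would also
    // touch any dependency that happened to share the same version literal.
    def bumpVersion(poms: Seq[String], from: String, to: String): Unit =
      for (path <- poms) {
        val text = new String(Files.readAllBytes(Paths.get(path)), "UTF-8")
        val updated = text.replace(s"<version>$from</version>", s"<version>$to</version>")
        Files.write(Paths.get(path), updated.getBytes("UTF-8"))
      }

    // e.g. bumpVersion(Seq("assembly/pom.xml", "bagel/pom.xml"), "1.3.1", "1.3.1-SNAPSHOT")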


http://git-wip-us.apache.org/repos/asf/spark/blob/00837ccd/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 67bebfc..114dde7 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/00837ccd/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index c7750a2..dea41f8 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/00837ccd/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index a3dc28f..9a79d70 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/00837ccd/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index 9f03cbd..73ab234 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.1-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/00837ccd/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index 46adbe2..1a5aaf5 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.1-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/00837ccd/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 5fc589f..d5539d9 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.1-SNAPSHOT</version>
 

[1/2] spark git commit: Revert "Preparing development version 1.3.2-SNAPSHOT"

2015-04-07 Thread pwendell
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 277733b1d -> 00837ccd0


Revert "Preparing development version 1.3.2-SNAPSHOT"

This reverts commit 728c1f927822eb6b12f04dc47109feb6fbe02ec2.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/333b4732
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/333b4732
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/333b4732

Branch: refs/heads/branch-1.3
Commit: 333b47325ce22dbc7670f0499e43f02b82c0c797
Parents: 277733b
Author: Patrick Wendell patr...@databricks.com
Authored: Tue Apr 7 22:31:22 2015 -0400
Committer: Patrick Wendell patr...@databricks.com
Committed: Tue Apr 7 22:31:22 2015 -0400

--
 assembly/pom.xml  | 2 +-
 bagel/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 examples/pom.xml  | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml| 2 +-
 external/mqtt/pom.xml | 2 +-
 external/twitter/pom.xml  | 2 +-
 external/zeromq/pom.xml   | 2 +-
 extras/java8-tests/pom.xml| 2 +-
 extras/kinesis-asl/pom.xml| 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml| 2 +-
 mllib/pom.xml | 2 +-
 network/common/pom.xml| 2 +-
 network/shuffle/pom.xml   | 2 +-
 network/yarn/pom.xml  | 2 +-
 pom.xml   | 2 +-
 repl/pom.xml  | 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 yarn/pom.xml  | 2 +-
 28 files changed, 28 insertions(+), 28 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/333b4732/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 0952cd2..67bebfc 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.2-SNAPSHOT</version>
+    <version>1.3.1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/333b4732/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index 602cc7b..c7750a2 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.2-SNAPSHOT</version>
+    <version>1.3.1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/333b4732/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index 5971d05..a3dc28f 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.2-SNAPSHOT</version>
+    <version>1.3.1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/333b4732/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index e1a3ecc..9f03cbd 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.2-SNAPSHOT</version>
+    <version>1.3.1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/333b4732/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index f46a2a0..46adbe2 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.2-SNAPSHOT</version>
+    <version>1.3.1</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/333b4732/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 02331e8..5fc589f 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
 

[1/2] spark git commit: Preparing Spark release v1.3.1-rc2

2015-04-07 Thread pwendell
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 00837ccd0 -> cdef7d080


Preparing Spark release v1.3.1-rc2


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7c4473aa
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7c4473aa
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7c4473aa

Branch: refs/heads/branch-1.3
Commit: 7c4473aa5a7f5de0323394aaedeefbf9738e8eb5
Parents: 00837cc
Author: Patrick Wendell patr...@databricks.com
Authored: Wed Apr 8 02:30:52 2015 +
Committer: Patrick Wendell patr...@databricks.com
Committed: Wed Apr 8 02:30:52 2015 +

--
 assembly/pom.xml  | 2 +-
 bagel/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 examples/pom.xml  | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml| 2 +-
 external/mqtt/pom.xml | 2 +-
 external/twitter/pom.xml  | 2 +-
 external/zeromq/pom.xml   | 2 +-
 extras/java8-tests/pom.xml| 2 +-
 extras/kinesis-asl/pom.xml| 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml| 2 +-
 mllib/pom.xml | 2 +-
 network/common/pom.xml| 2 +-
 network/shuffle/pom.xml   | 2 +-
 network/yarn/pom.xml  | 2 +-
 pom.xml   | 2 +-
 repl/pom.xml  | 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 yarn/pom.xml  | 2 +-
 28 files changed, 28 insertions(+), 28 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/7c4473aa/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 114dde7..67bebfc 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1-SNAPSHOT</version>
+    <version>1.3.1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/7c4473aa/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index dea41f8..c7750a2 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1-SNAPSHOT</version>
+    <version>1.3.1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/7c4473aa/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index 9a79d70..a3dc28f 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1-SNAPSHOT</version>
+    <version>1.3.1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/7c4473aa/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index 73ab234..9f03cbd 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1-SNAPSHOT</version>
+    <version>1.3.1</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/7c4473aa/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index 1a5aaf5..46adbe2 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1-SNAPSHOT</version>
+    <version>1.3.1</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/7c4473aa/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index d5539d9..5fc589f 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1-SNAPSHOT</version>
+    <version>1.3.1</version>

Git Push Summary

2015-04-07 Thread pwendell
Repository: spark
Updated Tags:  refs/tags/v1.3.1-rc2 [created] 7c4473aa5




[2/2] spark git commit: Preparing development version 1.3.2-SNAPSHOT

2015-04-07 Thread pwendell
Preparing development version 1.3.2-SNAPSHOT


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/cdef7d08
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/cdef7d08
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/cdef7d08

Branch: refs/heads/branch-1.3
Commit: cdef7d080aa3f473f5ea06ba816c01b41a0239eb
Parents: 7c4473a
Author: Patrick Wendell patr...@databricks.com
Authored: Wed Apr 8 02:30:53 2015 +
Committer: Patrick Wendell patr...@databricks.com
Committed: Wed Apr 8 02:30:53 2015 +

--
 assembly/pom.xml  | 2 +-
 bagel/pom.xml | 2 +-
 core/pom.xml  | 2 +-
 examples/pom.xml  | 2 +-
 external/flume-sink/pom.xml   | 2 +-
 external/flume/pom.xml| 2 +-
 external/kafka-assembly/pom.xml   | 2 +-
 external/kafka/pom.xml| 2 +-
 external/mqtt/pom.xml | 2 +-
 external/twitter/pom.xml  | 2 +-
 external/zeromq/pom.xml   | 2 +-
 extras/java8-tests/pom.xml| 2 +-
 extras/kinesis-asl/pom.xml| 2 +-
 extras/spark-ganglia-lgpl/pom.xml | 2 +-
 graphx/pom.xml| 2 +-
 mllib/pom.xml | 2 +-
 network/common/pom.xml| 2 +-
 network/shuffle/pom.xml   | 2 +-
 network/yarn/pom.xml  | 2 +-
 pom.xml   | 2 +-
 repl/pom.xml  | 2 +-
 sql/catalyst/pom.xml  | 2 +-
 sql/core/pom.xml  | 2 +-
 sql/hive-thriftserver/pom.xml | 2 +-
 sql/hive/pom.xml  | 2 +-
 streaming/pom.xml | 2 +-
 tools/pom.xml | 2 +-
 yarn/pom.xml  | 2 +-
 28 files changed, 28 insertions(+), 28 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/cdef7d08/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 67bebfc..0952cd2 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.2-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/cdef7d08/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index c7750a2..602cc7b 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.2-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/cdef7d08/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index a3dc28f..5971d05 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.2-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/cdef7d08/examples/pom.xml
--
diff --git a/examples/pom.xml b/examples/pom.xml
index 9f03cbd..e1a3ecc 100644
--- a/examples/pom.xml
+++ b/examples/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.2-SNAPSHOT</version>
     <relativePath>../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/cdef7d08/external/flume-sink/pom.xml
--
diff --git a/external/flume-sink/pom.xml b/external/flume-sink/pom.xml
index 46adbe2..f46a2a0 100644
--- a/external/flume-sink/pom.xml
+++ b/external/flume-sink/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.2-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>
 

http://git-wip-us.apache.org/repos/asf/spark/blob/cdef7d08/external/flume/pom.xml
--
diff --git a/external/flume/pom.xml b/external/flume/pom.xml
index 5fc589f..02331e8 100644
--- a/external/flume/pom.xml
+++ b/external/flume/pom.xml
@@ -21,7 +21,7 @@
   <parent>
     <groupId>org.apache.spark</groupId>
     <artifactId>spark-parent_2.10</artifactId>
-    <version>1.3.1</version>
+    <version>1.3.2-SNAPSHOT</version>
     <relativePath>../../pom.xml</relativePath>
   </parent>