spark git commit: [SPARK-8060] Improve DataFrame Python test coverage and documentation.

2015-06-03 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master 452eb82dd -> ce320cb2d


[SPARK-8060] Improve DataFrame Python test coverage and documentation.

Author: Reynold Xin 

Closes #6601 from rxin/python-read-write-test-and-doc and squashes the 
following commits:

baa8ad5 [Reynold Xin] Code review feedback.
f081d47 [Reynold Xin] More documentation updates.
c9902fa [Reynold Xin] [SPARK-8060] Improve DataFrame Python reader/writer 
interface doc and testing.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ce320cb2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ce320cb2
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ce320cb2

Branch: refs/heads/master
Commit: ce320cb2dbf28825f80795ce569735888f98d6e8
Parents: 452eb82
Author: Reynold Xin 
Authored: Wed Jun 3 00:23:34 2015 -0700
Committer: Reynold Xin 
Committed: Wed Jun 3 00:23:34 2015 -0700

--
 .rat-excludes   |   1 +
 python/pyspark/sql/__init__.py  |  13 +-
 python/pyspark/sql/context.py   |  89 
 python/pyspark/sql/dataframe.py |  82 +++
 python/pyspark/sql/readwriter.py| 217 +--
 python/pyspark/sql/tests.py |   2 +
 .../sql/parquet_partitioned/_SUCCESS|   0
 .../sql/parquet_partitioned/_common_metadata| Bin 0 -> 210 bytes
 .../sql/parquet_partitioned/_metadata   | Bin 0 -> 743 bytes
 .../month=9/day=1/.part-r-8.gz.parquet.crc  | Bin 0 -> 12 bytes
 .../month=9/day=1/part-r-8.gz.parquet   | Bin 0 -> 322 bytes
 .../day=25/.part-r-2.gz.parquet.crc | Bin 0 -> 12 bytes
 .../day=25/.part-r-4.gz.parquet.crc | Bin 0 -> 12 bytes
 .../month=10/day=25/part-r-2.gz.parquet | Bin 0 -> 343 bytes
 .../month=10/day=25/part-r-4.gz.parquet | Bin 0 -> 343 bytes
 .../day=26/.part-r-5.gz.parquet.crc | Bin 0 -> 12 bytes
 .../month=10/day=26/part-r-5.gz.parquet | Bin 0 -> 333 bytes
 .../month=9/day=1/.part-r-7.gz.parquet.crc  | Bin 0 -> 12 bytes
 .../month=9/day=1/part-r-7.gz.parquet   | Bin 0 -> 343 bytes
 python/test_support/sql/people.json |   3 +
 20 files changed, 180 insertions(+), 227 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/ce320cb2/.rat-excludes
--
diff --git a/.rat-excludes b/.rat-excludes
index c0f81b5..8f2722c 100644
--- a/.rat-excludes
+++ b/.rat-excludes
@@ -82,3 +82,4 @@ local-1426633911242/*
 local-1430917381534/*
 DESCRIPTION
 NAMESPACE
+test_support/*

http://git-wip-us.apache.org/repos/asf/spark/blob/ce320cb2/python/pyspark/sql/__init__.py
--
diff --git a/python/pyspark/sql/__init__.py b/python/pyspark/sql/__init__.py
index 726d288..ad9c891 100644
--- a/python/pyspark/sql/__init__.py
+++ b/python/pyspark/sql/__init__.py
@@ -45,11 +45,20 @@ from __future__ import absolute_import
 
 
 def since(version):
+"""
+A decorator that annotates a function to append the version of Spark the 
function was added.
+"""
+import re
+indent_p = re.compile(r'\n( +)')
+
 def deco(f):
-f.__doc__ = f.__doc__.rstrip() + "\n\n.. versionadded:: %s" % version
+indents = indent_p.findall(f.__doc__)
+indent = ' ' * (min(len(m) for m in indents) if indents else 0)
+f.__doc__ = f.__doc__.rstrip() + "\n\n%s.. versionadded:: %s" % 
(indent, version)
 return f
 return deco
 
+
 from pyspark.sql.types import Row
 from pyspark.sql.context import SQLContext, HiveContext
 from pyspark.sql.column import Column
@@ -58,7 +67,9 @@ from pyspark.sql.group import GroupedData
 from pyspark.sql.readwriter import DataFrameReader, DataFrameWriter
 from pyspark.sql.window import Window, WindowSpec
 
+
 __all__ = [
 'SQLContext', 'HiveContext', 'DataFrame', 'GroupedData', 'Column', 'Row',
 'DataFrameNaFunctions', 'DataFrameStatFunctions', 'Window', 'WindowSpec',
+'DataFrameReader', 'DataFrameWriter'
 ]

http://git-wip-us.apache.org/repos/asf/spark/blob/ce320cb2/python/pyspark/sql/context.py
--
diff --git a/python/pyspark/sql/context.py b/python/pyspark/sql/context.py
index 22f6257..9fdf43c 100644
--- a/python/pyspark/sql/context.py
+++ b/python/pyspark/sql/context.py
@@ -124,7 +124,10 @@ class SQLContext(object):
 @property
 @since("1.3.1")
 def udf(self):
-"""Returns a :class:`UDFRegistration` for UDF registration."""
+"""Returns a :class:`UDFRegistration` for UDF registration.
+
+:return: :class:`UDFRegistration`
+"""
 return UDFRegistration(self)
 
 @since(1.4)
@

spark git commit: [SPARK-8060] Improve DataFrame Python test coverage and documentation.

2015-06-03 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 bd57af387 -> ee7f365bd


[SPARK-8060] Improve DataFrame Python test coverage and documentation.

Author: Reynold Xin 

Closes #6601 from rxin/python-read-write-test-and-doc and squashes the 
following commits:

baa8ad5 [Reynold Xin] Code review feedback.
f081d47 [Reynold Xin] More documentation updates.
c9902fa [Reynold Xin] [SPARK-8060] Improve DataFrame Python reader/writer 
interface doc and testing.

(cherry picked from commit ce320cb2dbf28825f80795ce569735888f98d6e8)
Signed-off-by: Reynold Xin 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ee7f365b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ee7f365b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ee7f365b

Branch: refs/heads/branch-1.4
Commit: ee7f365bd0f8822b213a3f434bc958d9eba8db3c
Parents: bd57af3
Author: Reynold Xin 
Authored: Wed Jun 3 00:23:34 2015 -0700
Committer: Reynold Xin 
Committed: Wed Jun 3 00:23:42 2015 -0700

--
 .rat-excludes   |   1 +
 python/pyspark/sql/__init__.py  |  13 +-
 python/pyspark/sql/context.py   |  89 
 python/pyspark/sql/dataframe.py |  82 +++
 python/pyspark/sql/readwriter.py| 217 +--
 python/pyspark/sql/tests.py |   2 +
 .../sql/parquet_partitioned/_SUCCESS|   0
 .../sql/parquet_partitioned/_common_metadata| Bin 0 -> 210 bytes
 .../sql/parquet_partitioned/_metadata   | Bin 0 -> 743 bytes
 .../month=9/day=1/.part-r-8.gz.parquet.crc  | Bin 0 -> 12 bytes
 .../month=9/day=1/part-r-8.gz.parquet   | Bin 0 -> 322 bytes
 .../day=25/.part-r-2.gz.parquet.crc | Bin 0 -> 12 bytes
 .../day=25/.part-r-4.gz.parquet.crc | Bin 0 -> 12 bytes
 .../month=10/day=25/part-r-2.gz.parquet | Bin 0 -> 343 bytes
 .../month=10/day=25/part-r-4.gz.parquet | Bin 0 -> 343 bytes
 .../day=26/.part-r-5.gz.parquet.crc | Bin 0 -> 12 bytes
 .../month=10/day=26/part-r-5.gz.parquet | Bin 0 -> 333 bytes
 .../month=9/day=1/.part-r-7.gz.parquet.crc  | Bin 0 -> 12 bytes
 .../month=9/day=1/part-r-7.gz.parquet   | Bin 0 -> 343 bytes
 python/test_support/sql/people.json |   3 +
 20 files changed, 180 insertions(+), 227 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/ee7f365b/.rat-excludes
--
diff --git a/.rat-excludes b/.rat-excludes
index c0f81b5..8f2722c 100644
--- a/.rat-excludes
+++ b/.rat-excludes
@@ -82,3 +82,4 @@ local-1426633911242/*
 local-1430917381534/*
 DESCRIPTION
 NAMESPACE
+test_support/*

http://git-wip-us.apache.org/repos/asf/spark/blob/ee7f365b/python/pyspark/sql/__init__.py
--
diff --git a/python/pyspark/sql/__init__.py b/python/pyspark/sql/__init__.py
index 726d288..ad9c891 100644
--- a/python/pyspark/sql/__init__.py
+++ b/python/pyspark/sql/__init__.py
@@ -45,11 +45,20 @@ from __future__ import absolute_import
 
 
 def since(version):
+"""
+A decorator that annotates a function to append the version of Spark the 
function was added.
+"""
+import re
+indent_p = re.compile(r'\n( +)')
+
 def deco(f):
-f.__doc__ = f.__doc__.rstrip() + "\n\n.. versionadded:: %s" % version
+indents = indent_p.findall(f.__doc__)
+indent = ' ' * (min(len(m) for m in indents) if indents else 0)
+f.__doc__ = f.__doc__.rstrip() + "\n\n%s.. versionadded:: %s" % 
(indent, version)
 return f
 return deco
 
+
 from pyspark.sql.types import Row
 from pyspark.sql.context import SQLContext, HiveContext
 from pyspark.sql.column import Column
@@ -58,7 +67,9 @@ from pyspark.sql.group import GroupedData
 from pyspark.sql.readwriter import DataFrameReader, DataFrameWriter
 from pyspark.sql.window import Window, WindowSpec
 
+
 __all__ = [
 'SQLContext', 'HiveContext', 'DataFrame', 'GroupedData', 'Column', 'Row',
 'DataFrameNaFunctions', 'DataFrameStatFunctions', 'Window', 'WindowSpec',
+'DataFrameReader', 'DataFrameWriter'
 ]

http://git-wip-us.apache.org/repos/asf/spark/blob/ee7f365b/python/pyspark/sql/context.py
--
diff --git a/python/pyspark/sql/context.py b/python/pyspark/sql/context.py
index 22f6257..9fdf43c 100644
--- a/python/pyspark/sql/context.py
+++ b/python/pyspark/sql/context.py
@@ -124,7 +124,10 @@ class SQLContext(object):
 @property
 @since("1.3.1")
 def udf(self):
-"""Returns a :class:`UDFRegistration` for UDF registration."""
+"""Returns a :class:`UDFRegistration` for UDF registration.
+
+   

spark git commit: [SPARK-7562][SPARK-6444][SQL] Improve error reporting for expression data type mismatch

2015-06-03 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master ce320cb2d -> d38cf217e


[SPARK-7562][SPARK-6444][SQL] Improve error reporting for expression data type 
mismatch

It is hard to find a single common pattern for type checking in `Expression`. 
Sometimes we know exactly which input types we need (for `And`, we need two 
booleans); sometimes we only have a rule (for `Add`, we need two numeric types 
that are equal). So I defined a general interface `checkInputDataTypes` in 
`Expression` which returns a `TypeCheckResult`. A `TypeCheckResult` tells 
whether the expression passes type checking and, if not, what the type mismatch is.

This PR mainly applies input type checking to arithmetic and predicate 
expressions.

TODO: apply the type checking interface to more expressions.
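
For illustration only, here is a minimal, self-contained Scala sketch of the
pattern described above. The names `checkInputDataTypes` and `TypeCheckResult`
come from this patch, but everything else below (the toy `DataType`,
`Expression`, `Literal` and `And` classes) is a simplified stand-in, not
Spark's actual implementation:

    // Simplified sketch of per-expression input type checking; not Spark's real classes.
    object TypeCheckSketch {
      sealed trait DataType
      case object BooleanType extends DataType
      case object IntegerType extends DataType

      sealed trait TypeCheckResult
      case object TypeCheckSuccess extends TypeCheckResult
      final case class TypeCheckFailure(message: String) extends TypeCheckResult

      trait Expression {
        def dataType: DataType
        // Each expression reports whether its inputs are acceptable, instead of
        // relying on ad-hoc checks scattered through the analyzer.
        def checkInputDataTypes(): TypeCheckResult
      }

      final case class Literal(value: Any, dataType: DataType) extends Expression {
        def checkInputDataTypes(): TypeCheckResult = TypeCheckSuccess
      }

      // And knows its exact input types: two booleans.
      final case class And(left: Expression, right: Expression) extends Expression {
        val dataType: DataType = BooleanType
        def checkInputDataTypes(): TypeCheckResult =
          if (left.dataType == BooleanType && right.dataType == BooleanType) TypeCheckSuccess
          else TypeCheckFailure(
            s"And expects two booleans, got ${left.dataType} and ${right.dataType}")
      }

      def main(args: Array[String]): Unit = {
        // Prints: TypeCheckFailure(And expects two booleans, got BooleanType and IntegerType)
        println(And(Literal(true, BooleanType), Literal(1, IntegerType)).checkInputDataTypes())
      }
    }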

Author: Wenchen Fan 

Closes #6405 from cloud-fan/6444 and squashes the following commits:

b5ff31b [Wenchen Fan] address comments
b917275 [Wenchen Fan] rebase
39929d9 [Wenchen Fan] add todo
0808fd2 [Wenchen Fan] make constrcutor of TypeCheckResult private
3bee157 [Wenchen Fan] and decimal type coercion rule for binary comparison
8883025 [Wenchen Fan] apply type check interface to CaseWhen
cffb67c [Wenchen Fan] to have resolved call the data type check function
6eaadff [Wenchen Fan] add equal type constraint to EqualTo
3affbd8 [Wenchen Fan] more fixes
654d46a [Wenchen Fan] improve tests
e0a3628 [Wenchen Fan] improve error message
1524ff6 [Wenchen Fan] fix style
69ca3fe [Wenchen Fan] add error message and tests
c71d02c [Wenchen Fan] fix hive tests
6491721 [Wenchen Fan] use value class TypeCheckResult
7ae76b9 [Wenchen Fan] address comments
cb77e4f [Wenchen Fan] Improve error reporting for expression data type mismatch


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d38cf217
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d38cf217
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d38cf217

Branch: refs/heads/master
Commit: d38cf217e0c6bfbf451c659675280b43a08bc70f
Parents: ce320cb
Author: Wenchen Fan 
Authored: Wed Jun 3 00:47:52 2015 -0700
Committer: Reynold Xin 
Committed: Wed Jun 3 00:47:52 2015 -0700

--
 .../scala/org/apache/spark/SparkFunSuite.scala  |   4 +-
 .../sql/catalyst/analysis/CheckAnalysis.scala   |  12 +-
 .../catalyst/analysis/HiveTypeCoercion.scala| 132 
 .../sql/catalyst/analysis/TypeCheckResult.scala |  45 +++
 .../sql/catalyst/expressions/Expression.scala   |  28 +-
 .../sql/catalyst/expressions/arithmetic.scala   | 308 ---
 .../catalyst/expressions/mathfuncs/binary.scala |  17 +-
 .../sql/catalyst/expressions/predicates.scala   | 226 +++---
 .../sql/catalyst/optimizer/Optimizer.scala  |   4 +
 .../spark/sql/catalyst/util/DateUtils.scala |   2 +-
 .../spark/sql/catalyst/util/TypeUtils.scala |  56 
 .../org/apache/spark/sql/types/DataType.scala   |   2 +-
 .../analysis/DecimalPrecisionSuite.scala|   6 +-
 .../analysis/HiveTypeCoercionSuite.scala|  15 +-
 .../ExpressionTypeCheckingSuite.scala   | 143 +
 .../org/apache/spark/sql/json/InferSchema.scala |   2 +-
 .../org/apache/spark/sql/json/JsonRDD.scala |   2 +-
 17 files changed, 583 insertions(+), 421 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/d38cf217/core/src/test/scala/org/apache/spark/SparkFunSuite.scala
--
diff --git a/core/src/test/scala/org/apache/spark/SparkFunSuite.scala 
b/core/src/test/scala/org/apache/spark/SparkFunSuite.scala
index 8cb3443..9be9db0 100644
--- a/core/src/test/scala/org/apache/spark/SparkFunSuite.scala
+++ b/core/src/test/scala/org/apache/spark/SparkFunSuite.scala
@@ -30,8 +30,8 @@ private[spark] abstract class SparkFunSuite extends FunSuite 
with Logging {
* Log the suite name and the test name before and after each test.
*
* Subclasses should never override this method. If they wish to run
-   * custom code before and after each test, they should should mix in
-   * the {{org.scalatest.BeforeAndAfter}} trait instead.
+   * custom code before and after each test, they should mix in the
+   * {{org.scalatest.BeforeAndAfter}} trait instead.
*/
   final protected override def withFixture(test: NoArgTest): Outcome = {
 val testName = test.text

http://git-wip-us.apache.org/repos/asf/spark/blob/d38cf217/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
--
diff --git 
a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
 
b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala
index 193dc6b..c0695ae 100644
--- 
a/sql/catalyst/src/main/scala/org/apache/spa

spark git commit: [SPARK-7983] [MLLIB] Add require for one-based indices in loadLibSVMFile

2015-06-03 Thread srowen
Repository: spark
Updated Branches:
  refs/heads/master d38cf217e -> 28dbde387


[SPARK-7983] [MLLIB] Add require for one-based indices in loadLibSVMFile

jira: https://issues.apache.org/jira/browse/SPARK-7983

Customers frequently use zero-based indices in their LIBSVM files. Spark reports 
no warnings or errors during the subsequent computation, and this usually leads 
to weird results for many algorithms (such as GBDT).

This change adds a quick check; a standalone sketch of the idea follows.
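
As a rough, self-contained illustration (names below are made up for the
example; the real change lives in `MLUtils.loadLibSVMFile`, shown in the diff
that follows): LIBSVM feature indices are one-based on disk, are converted to
zero-based internally, and must be strictly ascending, so a zero-based or
out-of-order index now trips the `require` instead of silently producing a
corrupted sparse vector.

    // Standalone sketch of the validation added to loadLibSVMFile (illustrative names).
    object LibSVMIndexCheckSketch {
      /** Parse "index:value" features, converting one-based indices to zero-based. */
      def parseIndices(features: Seq[String]): Array[Int] = {
        val indices = features.map(_.split(':')(0).toInt - 1).toArray
        var previous = -1
        var i = 0
        while (i < indices.length) {
          require(indices(i) > previous, "indices should be one-based and in ascending order")
          previous = indices(i)
          i += 1
        }
        indices
      }

      def main(args: Array[String]): Unit = {
        println(parseIndices(Seq("1:4.0", "4:5.0", "6:6.0")).mkString(","))  // fine: prints 0,3,5
        parseIndices(Seq("0:4.0", "4:5.0", "6:6.0"))  // throws IllegalArgumentException
      }
    }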

Author: Yuhao Yang 

Closes #6538 from hhbyyh/loadSVM and squashes the following commits:

79d9c11 [Yuhao Yang] optimization as respond to comments
4310710 [Yuhao Yang] merge conflict
96460f1 [Yuhao Yang] merge conflict
20a2811 [Yuhao Yang] use require
6e4f8ca [Yuhao Yang] add check for ascending order
9956365 [Yuhao Yang] add ut for 0-based loadlibsvm exception
5bd1f9a [Yuhao Yang] add require for one-based in loadLIBSVM


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/28dbde38
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/28dbde38
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/28dbde38

Branch: refs/heads/master
Commit: 28dbde3874ccdd44b73675938719b69336d23dac
Parents: d38cf21
Author: Yuhao Yang 
Authored: Wed Jun 3 13:15:57 2015 +0200
Committer: Sean Owen 
Committed: Wed Jun 3 13:15:57 2015 +0200

--
 .../org/apache/spark/mllib/util/MLUtils.scala   | 12 +++
 .../apache/spark/mllib/util/MLUtilsSuite.scala  | 35 
 2 files changed, 47 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/28dbde38/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala 
b/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
index 541f328..52d6468 100644
--- a/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
+++ b/mllib/src/main/scala/org/apache/spark/mllib/util/MLUtils.scala
@@ -82,6 +82,18 @@ object MLUtils {
   val value = indexAndValue(1).toDouble
   (index, value)
 }.unzip
+
+// check if indices are one-based and in ascending order
+var previous = -1
+var i = 0
+val indicesLength = indices.length
+while (i < indicesLength) {
+  val current = indices(i)
+  require(current > previous, "indices should be one-based and in 
ascending order" )
+  previous = current
+  i += 1
+}
+
 (label, indices.toArray, values.toArray)
   }
 

http://git-wip-us.apache.org/repos/asf/spark/blob/28dbde38/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala 
b/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala
index 734b7ba..70219e9 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/util/MLUtilsSuite.scala
@@ -25,6 +25,7 @@ import breeze.linalg.{squaredDistance => 
breezeSquaredDistance}
 import com.google.common.base.Charsets
 import com.google.common.io.Files
 
+import org.apache.spark.SparkException
 import org.apache.spark.SparkFunSuite
 import org.apache.spark.mllib.linalg.{DenseVector, SparseVector, Vectors}
 import org.apache.spark.mllib.regression.LabeledPoint
@@ -108,6 +109,40 @@ class MLUtilsSuite extends SparkFunSuite with 
MLlibTestSparkContext {
 Utils.deleteRecursively(tempDir)
   }
 
+  test("loadLibSVMFile throws IllegalArgumentException when indices is 
zero-based") {
+val lines =
+  """
+|0
+|0 0:4.0 4:5.0 6:6.0
+  """.stripMargin
+val tempDir = Utils.createTempDir()
+val file = new File(tempDir.getPath, "part-0")
+Files.write(lines, file, Charsets.US_ASCII)
+val path = tempDir.toURI.toString
+
+intercept[SparkException] {
+  loadLibSVMFile(sc, path).collect()
+}
+Utils.deleteRecursively(tempDir)
+  }
+
+  test("loadLibSVMFile throws IllegalArgumentException when indices is not in 
ascending order") {
+val lines =
+  """
+|0
+|0 3:4.0 2:5.0 6:6.0
+  """.stripMargin
+val tempDir = Utils.createTempDir()
+val file = new File(tempDir.getPath, "part-0")
+Files.write(lines, file, Charsets.US_ASCII)
+val path = tempDir.toURI.toString
+
+intercept[SparkException] {
+  loadLibSVMFile(sc, path).collect()
+}
+Utils.deleteRecursively(tempDir)
+  }
+
   test("saveAsLibSVMFile") {
 val examples = sc.parallelize(Seq(
   LabeledPoint(1.1, Vectors.sparse(3, Seq((0, 1.23), (2, 4.56,


-

spark git commit: [SPARK-7973] [SQL] Increase the timeout of two CliSuite tests.

2015-06-03 Thread yhuai
Repository: spark
Updated Branches:
  refs/heads/master 28dbde387 -> f1646e102


[SPARK-7973] [SQL] Increase the timeout of two CliSuite tests.

https://issues.apache.org/jira/browse/SPARK-7973

Author: Yin Huai 

Closes #6525 from yhuai/SPARK-7973 and squashes the following commits:

763b821 [Yin Huai] Also change the timeout of "Single command with -e" to 2 
minutes.
e598a08 [Yin Huai] Increase the timeout to 3 minutes.
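
For readers puzzled by the `1.minute` / `2.minute` / `3.minute` values in the
diff below: they come from Scala's standard duration DSL, roughly as sketched
here (the object name is illustrative).

    import scala.concurrent.duration._

    // scala.concurrent.duration enriches Int with minute/minutes and friends,
    // producing the FiniteDuration timeouts that CliSuite passes to runCliWithin.
    object DurationSketch {
      def main(args: Array[String]): Unit = {
        val before: FiniteDuration = 1.minute
        val after: FiniteDuration = 3.minutes
        println(s"$before -> $after (${after.toMillis} ms)")
      }
    }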


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f1646e10
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f1646e10
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f1646e10

Branch: refs/heads/master
Commit: f1646e1023bd03e27268a8aa2ea11b6cc284075f
Parents: 28dbde3
Author: Yin Huai 
Authored: Wed Jun 3 09:26:21 2015 -0700
Committer: Yin Huai 
Committed: Wed Jun 3 09:26:21 2015 -0700

--
 .../scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala  | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f1646e10/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
--
diff --git 
a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
 
b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
index 3732af7..13b0c59 100644
--- 
a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
+++ 
b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
@@ -133,7 +133,7 @@ class CliSuite extends SparkFunSuite with BeforeAndAfter 
with Logging {
   }
 
   test("Single command with -e") {
-runCliWithin(1.minute, Seq("-e", "SHOW DATABASES;"))("" -> "OK")
+runCliWithin(2.minute, Seq("-e", "SHOW DATABASES;"))("" -> "OK")
   }
 
   test("Single command with --database") {
@@ -165,7 +165,7 @@ class CliSuite extends SparkFunSuite with BeforeAndAfter 
with Logging {
 val dataFilePath =
   
Thread.currentThread().getContextClassLoader.getResource("data/files/small_kv.txt")
 
-runCliWithin(1.minute, Seq("--jars", s"$jarFile"))(
+runCliWithin(3.minute, Seq("--jars", s"$jarFile"))(
   """CREATE TABLE t1(key string, val string)
 |ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
   """.stripMargin





spark git commit: [SPARK-7973] [SQL] Increase the timeout of two CliSuite tests.

2015-06-03 Thread yhuai
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 ee7f365bd -> 54a4ea407


[SPARK-7973] [SQL] Increase the timeout of two CliSuite tests.

https://issues.apache.org/jira/browse/SPARK-7973

Author: Yin Huai 

Closes #6525 from yhuai/SPARK-7973 and squashes the following commits:

763b821 [Yin Huai] Also change the timeout of "Single command with -e" to 2 
minutes.
e598a08 [Yin Huai] Increase the timeout to 3 minutes.

(cherry picked from commit f1646e1023bd03e27268a8aa2ea11b6cc284075f)
Signed-off-by: Yin Huai 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/54a4ea40
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/54a4ea40
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/54a4ea40

Branch: refs/heads/branch-1.4
Commit: 54a4ea4078b2a423f2b20e0dbd2290004e251e0d
Parents: ee7f365
Author: Yin Huai 
Authored: Wed Jun 3 09:26:21 2015 -0700
Committer: Yin Huai 
Committed: Wed Jun 3 09:26:30 2015 -0700

--
 .../scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala  | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/54a4ea40/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
--
diff --git 
a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
 
b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
index cc07db8..eb3a315 100644
--- 
a/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
+++ 
b/sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala
@@ -133,7 +133,7 @@ class CliSuite extends FunSuite with BeforeAndAfter with 
Logging {
   }
 
   test("Single command with -e") {
-runCliWithin(1.minute, Seq("-e", "SHOW DATABASES;"))("" -> "OK")
+runCliWithin(2.minute, Seq("-e", "SHOW DATABASES;"))("" -> "OK")
   }
 
   test("Single command with --database") {
@@ -165,7 +165,7 @@ class CliSuite extends FunSuite with BeforeAndAfter with 
Logging {
 val dataFilePath =
   
Thread.currentThread().getContextClassLoader.getResource("data/files/small_kv.txt")
 
-runCliWithin(1.minute, Seq("--jars", s"$jarFile"))(
+runCliWithin(3.minute, Seq("--jars", s"$jarFile"))(
   """CREATE TABLE t1(key string, val string)
 |ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe';
   """.stripMargin





spark git commit: [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0

2015-06-03 Thread pwendell
Repository: spark
Updated Branches:
  refs/heads/master f1646e102 -> 2c4d550ed


[SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0

Author: Patrick Wendell 

Closes #6328 from pwendell/spark-1.5-update and squashes the following commits:

2f42d02 [Patrick Wendell] A few more excludes
4bebcf0 [Patrick Wendell] Update to RC4
61aaf46 [Patrick Wendell] Using new release candidate
55f1610 [Patrick Wendell] Another exclude
04b4f04 [Patrick Wendell] More issues with transient 1.4 changes
36f549b [Patrick Wendell] [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2c4d550e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2c4d550e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2c4d550e

Branch: refs/heads/master
Commit: 2c4d550eda0e6f33d2d575825c3faef4c9217067
Parents: f1646e1
Author: Patrick Wendell 
Authored: Wed Jun 3 10:11:27 2015 -0700
Committer: Patrick Wendell 
Committed: Wed Jun 3 10:11:27 2015 -0700

--
 assembly/pom.xml   |  2 +-
 bagel/pom.xml  |  2 +-
 core/pom.xml   |  2 +-
 core/src/main/scala/org/apache/spark/package.scala |  2 +-
 docs/_config.yml   |  4 ++--
 examples/pom.xml   |  2 +-
 external/flume-sink/pom.xml|  2 +-
 external/flume/pom.xml |  2 +-
 external/kafka-assembly/pom.xml|  2 +-
 external/kafka/pom.xml |  2 +-
 external/mqtt/pom.xml  |  2 +-
 external/twitter/pom.xml   |  2 +-
 external/zeromq/pom.xml|  2 +-
 extras/java8-tests/pom.xml |  2 +-
 extras/kinesis-asl/pom.xml |  2 +-
 extras/spark-ganglia-lgpl/pom.xml  |  2 +-
 graphx/pom.xml |  2 +-
 launcher/pom.xml   |  2 +-
 mllib/pom.xml  |  2 +-
 network/common/pom.xml |  2 +-
 network/shuffle/pom.xml|  2 +-
 network/yarn/pom.xml   |  2 +-
 pom.xml| 14 +-
 project/MimaBuild.scala|  3 ++-
 project/MimaExcludes.scala | 14 ++
 repl/pom.xml   |  2 +-
 sql/catalyst/pom.xml   |  2 +-
 sql/core/pom.xml   |  2 +-
 sql/hive-thriftserver/pom.xml  |  2 +-
 sql/hive/pom.xml   |  2 +-
 streaming/pom.xml  |  2 +-
 tools/pom.xml  |  2 +-
 unsafe/pom.xml |  2 +-
 yarn/pom.xml   |  2 +-
 34 files changed, 61 insertions(+), 34 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/2c4d550e/assembly/pom.xml
--
diff --git a/assembly/pom.xml b/assembly/pom.xml
index 626c857..e9c6d26 100644
--- a/assembly/pom.xml
+++ b/assembly/pom.xml
@@ -21,7 +21,7 @@
   
 org.apache.spark
 spark-parent_2.10
-1.4.0-SNAPSHOT
+1.5.0-SNAPSHOT
 ../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/spark/blob/2c4d550e/bagel/pom.xml
--
diff --git a/bagel/pom.xml b/bagel/pom.xml
index 132cd43..ed5c37e 100644
--- a/bagel/pom.xml
+++ b/bagel/pom.xml
@@ -21,7 +21,7 @@
   
 org.apache.spark
 spark-parent_2.10
-1.4.0-SNAPSHOT
+1.5.0-SNAPSHOT
 ../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/spark/blob/2c4d550e/core/pom.xml
--
diff --git a/core/pom.xml b/core/pom.xml
index a021842..e35694e 100644
--- a/core/pom.xml
+++ b/core/pom.xml
@@ -21,7 +21,7 @@
   
 org.apache.spark
 spark-parent_2.10
-1.4.0-SNAPSHOT
+1.5.0-SNAPSHOT
 ../pom.xml
   
 

http://git-wip-us.apache.org/repos/asf/spark/blob/2c4d550e/core/src/main/scala/org/apache/spark/package.scala
--
diff --git a/core/src/main/scala/org/apache/spark/package.scala 
b/core/src/main/scala/org/apache/spark/package.scala
index 2ab41ba..8ae76c5 100644
--- a/core/src/main/scala/org/apache/spark/package.scala
+++ b/core/src/main/scala/org/apache/spark/package.scala
@@ -43,5 +43,5 @@ package org.apache
 
 package object spark {

svn commit: r1683391 - in /spark: examples.md site/examples.html

2015-06-03 Thread srowen
Author: srowen
Date: Wed Jun  3 17:14:40 2015
New Revision: 1683391

URL: http://svn.apache.org/r1683391
Log:
Fix two Java example typos

Modified:
spark/examples.md
spark/site/examples.html

Modified: spark/examples.md
URL: 
http://svn.apache.org/viewvc/spark/examples.md?rev=1683391&r1=1683390&r2=1683391&view=diff
==
--- spark/examples.md (original)
+++ spark/examples.md Wed Jun  3 17:14:40 2015
@@ -137,7 +137,7 @@ In this example, we search through the e
 JavaPairRDD<String, Integer> pairs = words.mapToPair(new PairFunction<String, String, Integer>() {
   public Tuple2<String, Integer> call(String s) { return new Tuple2<String, Integer>(s, 1); }
 });
-JavaPairRDD<String, Integer> counts = pairs.reduceByKey(new Function2<Integer, Integer>() {
+JavaPairRDD<String, Integer> counts = pairs.reduceByKey(new Function2<Integer, Integer, Integer>() {
   public Integer call(Integer a, Integer b) { return a + b; }
 });
 counts.saveAsTextFile("hdfs://...");
@@ -178,7 +178,7 @@ In this example, we search through the e
   
 
 int count = spark.parallelize(makeRange(1, NUM_SAMPLES)).filter(new Function<Integer, Boolean>() {
-  public Integer call(Integer i) {
+  public Boolean call(Integer i) {
 double x = Math.random();
 double y = Math.random();
 return x*x + y*y < 1;

Modified: spark/site/examples.html
URL: 
http://svn.apache.org/viewvc/spark/site/examples.html?rev=1683391&r1=1683390&r2=1683391&view=diff
==
--- spark/site/examples.html (original)
+++ spark/site/examples.html Wed Jun  3 17:14:40 2015
@@ -297,7 +297,7 @@ previous ones, and actions, whi
 JavaPairRDD pairs = words.mapToPair(new 
PairFunction() {
   public Tuple2 call(String s) { return 
new Tuple2(s, 1); }
 });
-JavaPairRDD counts = pairs.reduceByKey(new 
Function2() {
+JavaPairRDD counts = pairs.reduceByKey(new 
Function2() {
   public Integer call(Integer a, Integer b) { return a + b; }
 });
 counts.saveAsTextFile("hdfs://...");
@@ -338,7 +338,7 @@ previous ones, and actions, whi
   
 
 int count = spark.parallelize(makeRange(1, 
NUM_SAMPLES)).filter(new 
Function() {
-  public Integer call(Integer i) {
+  public Boolean call(Integer i) {
 double x = Math.random();
 double y = Math.random();
 return x*x + y*y < 1;






[1/5] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.3)

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 bbd377228 -> e5747ee3a


http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/streaming/src/test/scala/org/apache/spark/streaming/UISeleniumSuite.scala
--
diff --git 
a/streaming/src/test/scala/org/apache/spark/streaming/UISeleniumSuite.scala 
b/streaming/src/test/scala/org/apache/spark/streaming/UISeleniumSuite.scala
index 87a0395..9711161 100644
--- a/streaming/src/test/scala/org/apache/spark/streaming/UISeleniumSuite.scala
+++ b/streaming/src/test/scala/org/apache/spark/streaming/UISeleniumSuite.scala
@@ -32,7 +32,12 @@ import org.apache.spark._
 /**
  * Selenium tests for the Spark Web UI.
  */
-class UISeleniumSuite extends FunSuite with WebBrowser with Matchers with 
BeforeAndAfterAll with TestSuiteBase {
+class UISeleniumSuite
+  extends SparkFunSuite
+  with WebBrowser
+  with Matchers
+  with BeforeAndAfterAll
+  with TestSuiteBase {
 
   implicit var webDriver: WebDriver = _
 

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/streaming/src/test/scala/org/apache/spark/streaming/rdd/WriteAheadLogBackedBlockRDDSuite.scala
--
diff --git 
a/streaming/src/test/scala/org/apache/spark/streaming/rdd/WriteAheadLogBackedBlockRDDSuite.scala
 
b/streaming/src/test/scala/org/apache/spark/streaming/rdd/WriteAheadLogBackedBlockRDDSuite.scala
index 7a6a2f3..569a415 100644
--- 
a/streaming/src/test/scala/org/apache/spark/streaming/rdd/WriteAheadLogBackedBlockRDDSuite.scala
+++ 
b/streaming/src/test/scala/org/apache/spark/streaming/rdd/WriteAheadLogBackedBlockRDDSuite.scala
@@ -21,14 +21,17 @@ import java.io.File
 import scala.util.Random
 
 import org.apache.hadoop.conf.Configuration
-import org.scalatest.{BeforeAndAfterEach, BeforeAndAfterAll, FunSuite}
+import org.scalatest.{BeforeAndAfterEach, BeforeAndAfterAll}
 
-import org.apache.spark.{SparkConf, SparkContext}
+import org.apache.spark.{SparkConf, SparkContext, SparkFunSuite}
 import org.apache.spark.storage.{BlockId, BlockManager, StorageLevel, 
StreamBlockId}
 import org.apache.spark.streaming.util.{WriteAheadLogFileSegment, 
WriteAheadLogWriter}
 import org.apache.spark.util.Utils
 
-class WriteAheadLogBackedBlockRDDSuite extends FunSuite with BeforeAndAfterAll 
with BeforeAndAfterEach {
+class WriteAheadLogBackedBlockRDDSuite
+  extends SparkFunSuite
+  with BeforeAndAfterAll
+  with BeforeAndAfterEach {
   val conf = new SparkConf()
 .setMaster("local[2]")
 .setAppName(this.getClass.getSimpleName)

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/streaming/src/test/scala/org/apache/spark/streaming/util/RateLimitedOutputStreamSuite.scala
--
diff --git 
a/streaming/src/test/scala/org/apache/spark/streaming/util/RateLimitedOutputStreamSuite.scala
 
b/streaming/src/test/scala/org/apache/spark/streaming/util/RateLimitedOutputStreamSuite.scala
index 9ebf7b4..78fc344 100644
--- 
a/streaming/src/test/scala/org/apache/spark/streaming/util/RateLimitedOutputStreamSuite.scala
+++ 
b/streaming/src/test/scala/org/apache/spark/streaming/util/RateLimitedOutputStreamSuite.scala
@@ -20,9 +20,9 @@ package org.apache.spark.streaming.util
 import java.io.ByteArrayOutputStream
 import java.util.concurrent.TimeUnit._
 
-import org.scalatest.FunSuite
+import org.apache.spark.SparkFunSuite
 
-class RateLimitedOutputStreamSuite extends FunSuite {
+class RateLimitedOutputStreamSuite extends SparkFunSuite {
 
   private def benchmark[U](f: => U): Long = {
 val start = System.nanoTime

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/streaming/src/test/scala/org/apache/spark/streaming/util/WriteAheadLogSuite.scala
--
diff --git 
a/streaming/src/test/scala/org/apache/spark/streaming/util/WriteAheadLogSuite.scala
 
b/streaming/src/test/scala/org/apache/spark/streaming/util/WriteAheadLogSuite.scala
index 8335659..d040ec1 100644
--- 
a/streaming/src/test/scala/org/apache/spark/streaming/util/WriteAheadLogSuite.scala
+++ 
b/streaming/src/test/scala/org/apache/spark/streaming/util/WriteAheadLogSuite.scala
@@ -26,11 +26,12 @@ import scala.language.{implicitConversions, postfixOps}
 import WriteAheadLogSuite._
 import org.apache.hadoop.conf.Configuration
 import org.apache.hadoop.fs.Path
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.util.{ManualClock, Utils}
-import org.scalatest.{BeforeAndAfter, FunSuite}
+import org.scalatest.BeforeAndAfter
 import org.scalatest.concurrent.Eventually._
 
-class WriteAheadLogSuite extends FunSuite with BeforeAndAfter {
+class WriteAheadLogSuite extends SparkFunSuite with BeforeAndAfter {
 
   val hadoopConf = new Configuration()
   var tempDir: File = null

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/yarn/pom.xml
-

[3/5] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.3)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/external/flume/src/test/scala/org/apache/spark/streaming/flume/FlumeStreamSuite.scala
--
diff --git 
a/external/flume/src/test/scala/org/apache/spark/streaming/flume/FlumeStreamSuite.scala
 
b/external/flume/src/test/scala/org/apache/spark/streaming/flume/FlumeStreamSuite.scala
index 51d273a..0fb3781 100644
--- 
a/external/flume/src/test/scala/org/apache/spark/streaming/flume/FlumeStreamSuite.scala
+++ 
b/external/flume/src/test/scala/org/apache/spark/streaming/flume/FlumeStreamSuite.scala
@@ -35,15 +35,15 @@ import org.jboss.netty.channel.ChannelPipeline
 import org.jboss.netty.channel.socket.SocketChannel
 import org.jboss.netty.channel.socket.nio.NioClientSocketChannelFactory
 import org.jboss.netty.handler.codec.compression._
-import org.scalatest.{BeforeAndAfter, FunSuite, Matchers}
+import org.scalatest.{BeforeAndAfter, Matchers}
 import org.scalatest.concurrent.Eventually._
 
-import org.apache.spark.{Logging, SparkConf}
+import org.apache.spark.{Logging, SparkConf, SparkFunSuite}
 import org.apache.spark.storage.StorageLevel
 import org.apache.spark.streaming.{Milliseconds, StreamingContext, 
TestOutputStream}
 import org.apache.spark.util.Utils
 
-class FlumeStreamSuite extends FunSuite with BeforeAndAfter with Matchers with 
Logging {
+class FlumeStreamSuite extends SparkFunSuite with BeforeAndAfter with Matchers 
with Logging {
   val conf = new 
SparkConf().setMaster("local[4]").setAppName("FlumeStreamSuite")
 
   var ssc: StreamingContext = null

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/external/kafka/pom.xml
--
diff --git a/external/kafka/pom.xml b/external/kafka/pom.xml
index 5b8f4a3..72d207f 100644
--- a/external/kafka/pom.xml
+++ b/external/kafka/pom.xml
@@ -42,6 +42,13 @@
   provided
 
 
+  org.apache.spark
+  spark-core_${scala.binary.version}
+  ${project.version}
+  test-jar
+  test
+
+
   org.apache.kafka
   kafka_${scala.binary.version}
   0.8.1.1

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/KafkaStreamSuite.scala
--
diff --git 
a/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/KafkaStreamSuite.scala
 
b/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/KafkaStreamSuite.scala
index e4966ee..6a259e0 100644
--- 
a/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/KafkaStreamSuite.scala
+++ 
b/external/kafka/src/test/scala/org/apache/spark/streaming/kafka/KafkaStreamSuite.scala
@@ -34,10 +34,10 @@ import kafka.server.{KafkaConfig, KafkaServer}
 import kafka.utils.ZKStringSerializer
 import org.I0Itec.zkclient.ZkClient
 import org.apache.zookeeper.server.{NIOServerCnxnFactory, ZooKeeperServer}
-import org.scalatest.{BeforeAndAfter, FunSuite}
+import org.scalatest.BeforeAndAfter
 import org.scalatest.concurrent.Eventually
 
-import org.apache.spark.{Logging, SparkConf}
+import org.apache.spark.{Logging, SparkConf, SparkFunSuite}
 import org.apache.spark.storage.StorageLevel
 import org.apache.spark.streaming.{Milliseconds, StreamingContext}
 import org.apache.spark.util.Utils
@@ -46,7 +46,7 @@ import org.apache.spark.util.Utils
  * This is an abstract base class for Kafka testsuites. This has the 
functionality to set up
  * and tear down local Kafka servers, and to push data using Kafka producers.
  */
-abstract class KafkaStreamSuiteBase extends FunSuite with Eventually with 
Logging {
+abstract class KafkaStreamSuiteBase extends SparkFunSuite with Eventually with 
Logging {
 
   private val zkHost = "localhost"
   private var zkPort: Int = 0

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/external/mqtt/pom.xml
--
diff --git a/external/mqtt/pom.xml b/external/mqtt/pom.xml
index 51e49ed..500e0a7 100644
--- a/external/mqtt/pom.xml
+++ b/external/mqtt/pom.xml
@@ -42,6 +42,13 @@
   provided
 
 
+  org.apache.spark
+  spark-core_${scala.binary.version}
+  ${project.version}
+  test-jar
+  test
+
+
   org.eclipse.paho
   org.eclipse.paho.client.mqttv3
   1.0.1

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/external/mqtt/src/test/scala/org/apache/spark/streaming/mqtt/MQTTStreamSuite.scala
--
diff --git 
a/external/mqtt/src/test/scala/org/apache/spark/streaming/mqtt/MQTTStreamSuite.scala
 
b/external/mqtt/src/test/scala/org/apache/spark/streaming/mqtt/MQTTStreamSuite.scala
index 05b7147..86e5ef3 100644
--- 
a/external/mqtt/src/test/scala/org/apache/spark/streaming/mqtt/MQTTStreamSuite.scala
+++ 
b/external/mqtt/src/test/scala/org/apache/spark

[5/5] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.3)

2015-06-03 Thread andrewor14
[SPARK-7558] Demarcate tests in unit-tests.log (1.3)

This includes the following commits:

original: 9eb222c
hotfix1: 8c99793
hotfix2: a4f2412
scalastyle check: 609c492

---
Original patch #6441
Branch-1.4 patch #6598
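
For context, here is a minimal, self-contained sketch (not Spark's actual
`SparkFunSuite`, whose logging and naming differ) of the idea: a base suite
overrides ScalaTest's `withFixture` hook to print the suite and test name
before and after each test, and every concrete suite extends that base class
instead of `FunSuite` directly, which is what most of the diff below does.

    import org.scalatest.{FunSuite, Outcome}

    // Minimal sketch of a base suite that demarcates each test in the log.
    abstract class DemarcatedFunSuite extends FunSuite {
      final protected override def withFixture(test: NoArgTest): Outcome = {
        val suiteName = this.getClass.getName
        val testName = test.text
        try {
          println(s"\n===== TEST OUTPUT FOR $suiteName: '$testName' =====\n")
          test()  // run the actual test body
        } finally {
          println(s"\n===== FINISHED $suiteName: '$testName' =====\n")
        }
      }
    }

    // Concrete suites extend the base class instead of FunSuite directly.
    class ExampleSuite extends DemarcatedFunSuite {
      test("addition") {
        assert(1 + 1 === 2)
      }
    }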

Author: Andrew Or 

Closes #6602 from andrewor14/demarcate-tests-1.3 and squashes the following 
commits:

a75ff8f [Andrew Or] Fix hive-thrift server log4j problem
f782edd [Andrew Or] [SPARK-7558] Guard against direct uses of FunSuite / 
FunSuiteLike
2b7a4f4 [Andrew Or] Fix tests?
fec05c2 [Andrew Or] Fix tests
5342d50 [Andrew Or] Various whitespace changes (minor)
9af2756 [Andrew Or] Make all test suites extend SparkFunSuite instead of 
FunSuite
192a47c [Andrew Or] Fix log message
95ff5eb [Andrew Or] Add core tests as dependencies in all modules
8dffa0e [Andrew Or] Introduce base abstract class for all test suites


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e5747ee3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e5747ee3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e5747ee3

Branch: refs/heads/branch-1.3
Commit: e5747ee3abe3ccf7988042e2408492153ce19eea
Parents: bbd3772
Author: Andrew Or 
Authored: Wed Jun 3 10:38:56 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 10:38:56 2015 -0700

--
 bagel/pom.xml   |  7 +++
 .../org/apache/spark/bagel/BagelSuite.scala |  4 +-
 core/pom.xml|  6 +++
 .../org/apache/spark/AccumulatorSuite.scala |  3 +-
 .../org/apache/spark/CacheManagerSuite.scala|  4 +-
 .../org/apache/spark/CheckpointSuite.scala  |  4 +-
 .../org/apache/spark/ContextCleanerSuite.scala  |  4 +-
 .../org/apache/spark/DistributedSuite.scala |  3 +-
 .../scala/org/apache/spark/DriverSuite.scala|  3 +-
 .../spark/ExecutorAllocationManagerSuite.scala  |  8 +++-
 .../scala/org/apache/spark/FailureSuite.scala   |  4 +-
 .../org/apache/spark/FileServerSuite.scala  |  3 +-
 .../test/scala/org/apache/spark/FileSuite.scala |  3 +-
 .../org/apache/spark/FutureActionSuite.scala|  8 +++-
 .../apache/spark/ImplicitOrderingSuite.scala| 30 ++--
 .../org/apache/spark/JobCancellationSuite.scala |  4 +-
 .../apache/spark/MapOutputTrackerSuite.scala|  3 +-
 .../org/apache/spark/PartitioningSuite.scala|  4 +-
 .../org/apache/spark/SSLOptionsSuite.scala  |  4 +-
 .../org/apache/spark/SecurityManagerSuite.scala |  4 +-
 .../scala/org/apache/spark/ShuffleSuite.scala   |  3 +-
 .../scala/org/apache/spark/SparkConfSuite.scala |  3 +-
 .../apache/spark/SparkContextInfoSuite.scala|  4 +-
 .../SparkContextSchedulerCreationSuite.scala|  4 +-
 .../org/apache/spark/SparkContextSuite.scala|  4 +-
 .../scala/org/apache/spark/SparkFunSuite.scala  | 48 
 .../org/apache/spark/StatusTrackerSuite.scala   |  6 +--
 .../scala/org/apache/spark/ThreadingSuite.scala |  4 +-
 .../scala/org/apache/spark/UnpersistSuite.scala |  3 +-
 .../spark/api/python/PythonBroadcastSuite.scala |  6 +--
 .../spark/api/python/PythonRDDSuite.scala   |  4 +-
 .../spark/api/python/SerDeUtilSuite.scala   |  6 +--
 .../apache/spark/broadcast/BroadcastSuite.scala |  6 +--
 .../org/apache/spark/deploy/ClientSuite.scala   |  5 +-
 .../apache/spark/deploy/CommandUtilsSuite.scala |  5 +-
 .../apache/spark/deploy/JsonProtocolSuite.scala |  5 +-
 .../spark/deploy/LogUrlsStandaloneSuite.scala   |  6 +--
 .../apache/spark/deploy/PythonRunnerSuite.scala |  4 +-
 .../apache/spark/deploy/SparkSubmitSuite.scala  |  8 +++-
 .../spark/deploy/SparkSubmitUtilsSuite.scala|  6 ++-
 .../deploy/history/FsHistoryProviderSuite.scala |  6 +--
 .../deploy/history/HistoryServerSuite.scala |  4 +-
 .../spark/deploy/master/MasterSuite.scala   |  6 +--
 .../deploy/rest/StandaloneRestSubmitSuite.scala |  4 +-
 .../deploy/rest/SubmitRestProtocolSuite.scala   |  5 +-
 .../spark/deploy/worker/DriverRunnerTest.scala  |  5 +-
 .../deploy/worker/ExecutorRunnerTest.scala  |  6 +--
 .../deploy/worker/WorkerArgumentsTest.scala |  5 +-
 .../spark/deploy/worker/WorkerSuite.scala   |  6 +--
 .../deploy/worker/WorkerWatcherSuite.scala  |  5 +-
 .../spark/deploy/worker/ui/LogPageSuite.scala   |  6 ++-
 .../spark/executor/TaskMetricsSuite.scala   |  4 +-
 .../input/WholeTextFileRecordReaderSuite.scala  |  5 +-
 .../apache/spark/io/CompressionCodecSuite.scala |  6 +--
 .../spark/metrics/InputOutputMetricsSuite.scala |  6 +--
 .../spark/metrics/MetricsConfigSuite.scala  |  6 ++-
 .../spark/metrics/MetricsSystemSuite.scala  |  6 +--
 .../netty/NettyBlockTransferSecuritySuite.scala |  6 +--
 .../network/nio/ConnectionManagerSuite.scala|  6 +--
 .../apache/spark/rdd/AsyncRDDActionsSuite.scala |  6 +--
 .../org/apache/spark/rdd/DoubleRDDSuite.scala   |  4 +-
 .../org/apache/spark/rdd/JdbcRDDSuite.scala |  6 +--
 .../spark/rdd/Pa

[4/5] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.3)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/core/src/test/scala/org/apache/spark/rdd/DoubleRDDSuite.scala
--
diff --git a/core/src/test/scala/org/apache/spark/rdd/DoubleRDDSuite.scala 
b/core/src/test/scala/org/apache/spark/rdd/DoubleRDDSuite.scala
index b1605da..7ef18d1 100644
--- a/core/src/test/scala/org/apache/spark/rdd/DoubleRDDSuite.scala
+++ b/core/src/test/scala/org/apache/spark/rdd/DoubleRDDSuite.scala
@@ -17,11 +17,9 @@
 
 package org.apache.spark.rdd
 
-import org.scalatest.FunSuite
-
 import org.apache.spark._
 
-class DoubleRDDSuite extends FunSuite with SharedSparkContext {
+class DoubleRDDSuite extends SparkFunSuite with SharedSparkContext {
   test("sum") {
 assert(sc.parallelize(Seq.empty[Double]).sum() === 0.0)
 assert(sc.parallelize(Seq(1.0)).sum() === 1.0)

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/core/src/test/scala/org/apache/spark/rdd/JdbcRDDSuite.scala
--
diff --git a/core/src/test/scala/org/apache/spark/rdd/JdbcRDDSuite.scala 
b/core/src/test/scala/org/apache/spark/rdd/JdbcRDDSuite.scala
index 6138d0b..9deae42 100644
--- a/core/src/test/scala/org/apache/spark/rdd/JdbcRDDSuite.scala
+++ b/core/src/test/scala/org/apache/spark/rdd/JdbcRDDSuite.scala
@@ -19,11 +19,11 @@ package org.apache.spark.rdd
 
 import java.sql._
 
-import org.scalatest.{BeforeAndAfter, FunSuite}
+import org.scalatest.BeforeAndAfter
 
-import org.apache.spark.{LocalSparkContext, SparkContext}
+import org.apache.spark.{LocalSparkContext, SparkContext, SparkFunSuite}
 
-class JdbcRDDSuite extends FunSuite with BeforeAndAfter with LocalSparkContext 
{
+class JdbcRDDSuite extends SparkFunSuite with BeforeAndAfter with 
LocalSparkContext {
 
   before {
 Class.forName("org.apache.derby.jdbc.EmbeddedDriver")

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala 
b/core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala
index 108f70a..56713ac 100644
--- a/core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala
+++ b/core/src/test/scala/org/apache/spark/rdd/PairRDDFunctionsSuite.scala
@@ -28,12 +28,10 @@ import org.apache.hadoop.conf.{Configurable, Configuration}
 import org.apache.hadoop.mapreduce.{JobContext => NewJobContext, 
OutputCommitter => NewOutputCommitter,
 OutputFormat => NewOutputFormat, RecordWriter => NewRecordWriter,
 TaskAttemptContext => NewTaskAttempContext}
-import org.apache.spark.{Partitioner, SharedSparkContext}
+import org.apache.spark.{Partitioner, SharedSparkContext, SparkFunSuite}
 import org.apache.spark.util.Utils
 
-import org.scalatest.FunSuite
-
-class PairRDDFunctionsSuite extends FunSuite with SharedSparkContext {
+class PairRDDFunctionsSuite extends SparkFunSuite with SharedSparkContext {
   test("aggregateByKey") {
 val pairs = sc.parallelize(Array((1, 1), (1, 1), (3, 2), (5, 1), (5, 3)), 
2)
 

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/core/src/test/scala/org/apache/spark/rdd/ParallelCollectionSplitSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/rdd/ParallelCollectionSplitSuite.scala 
b/core/src/test/scala/org/apache/spark/rdd/ParallelCollectionSplitSuite.scala
index cd193ae..4021b97 100644
--- 
a/core/src/test/scala/org/apache/spark/rdd/ParallelCollectionSplitSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/rdd/ParallelCollectionSplitSuite.scala
@@ -22,10 +22,11 @@ import scala.collection.immutable.NumericRange
 import org.scalacheck.Arbitrary._
 import org.scalacheck.Gen
 import org.scalacheck.Prop._
-import org.scalatest.FunSuite
 import org.scalatest.prop.Checkers
 
-class ParallelCollectionSplitSuite extends FunSuite with Checkers {
+import org.apache.spark.SparkFunSuite
+
+class ParallelCollectionSplitSuite extends SparkFunSuite with Checkers {
   test("one element per slice") {
 val data = Array(1, 2, 3)
 val slices = ParallelCollectionRDD.slice(data, 3)

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/core/src/test/scala/org/apache/spark/rdd/PartitionPruningRDDSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/rdd/PartitionPruningRDDSuite.scala 
b/core/src/test/scala/org/apache/spark/rdd/PartitionPruningRDDSuite.scala
index 8408d7e..740c9f0 100644
--- a/core/src/test/scala/org/apache/spark/rdd/PartitionPruningRDDSuite.scala
+++ b/core/src/test/scala/org/apache/spark/rdd/PartitionPruningRDDSuite.scala
@@ -17,11 +17,9 @@
 
 package org.apache.spark.rdd
 
-import org.scalatest.FunSuite
+import org.apache.spark.{Partition, SharedSparkContext

[2/5] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.3)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/mllib/src/test/scala/org/apache/spark/mllib/regression/LassoSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/mllib/regression/LassoSuite.scala 
b/mllib/src/test/scala/org/apache/spark/mllib/regression/LassoSuite.scala
index c9f5dc0..f415fbc 100644
--- a/mllib/src/test/scala/org/apache/spark/mllib/regression/LassoSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/mllib/regression/LassoSuite.scala
@@ -19,8 +19,7 @@ package org.apache.spark.mllib.regression
 
 import scala.util.Random
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.mllib.linalg.Vectors
 import org.apache.spark.mllib.util.{LocalClusterSparkContext, 
LinearDataGenerator,
   MLlibTestSparkContext}
@@ -32,7 +31,7 @@ private object LassoSuite {
   val model = new LassoModel(weights = Vectors.dense(0.1, 0.2, 0.3), intercept 
= 0.5)
 }
 
-class LassoSuite extends FunSuite with MLlibTestSparkContext {
+class LassoSuite extends SparkFunSuite with MLlibTestSparkContext {
 
   def validatePrediction(predictions: Seq[Double], input: Seq[LabeledPoint]) {
 val numOffPredictions = predictions.zip(input).count { case (prediction, 
expected) =>
@@ -141,7 +140,7 @@ class LassoSuite extends FunSuite with 
MLlibTestSparkContext {
   }
 }
 
-class LassoClusterSuite extends FunSuite with LocalClusterSparkContext {
+class LassoClusterSuite extends SparkFunSuite with LocalClusterSparkContext {
 
   test("task size should be small in both training and prediction") {
 val m = 4

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/mllib/src/test/scala/org/apache/spark/mllib/regression/LinearRegressionSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/mllib/regression/LinearRegressionSuite.scala
 
b/mllib/src/test/scala/org/apache/spark/mllib/regression/LinearRegressionSuite.scala
index 3781931..f88a1c3 100644
--- 
a/mllib/src/test/scala/org/apache/spark/mllib/regression/LinearRegressionSuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/mllib/regression/LinearRegressionSuite.scala
@@ -19,8 +19,7 @@ package org.apache.spark.mllib.regression
 
 import scala.util.Random
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.mllib.linalg.Vectors
 import org.apache.spark.mllib.util.{LocalClusterSparkContext, 
LinearDataGenerator,
   MLlibTestSparkContext}
@@ -32,7 +31,7 @@ private object LinearRegressionSuite {
   val model = new LinearRegressionModel(weights = Vectors.dense(0.1, 0.2, 
0.3), intercept = 0.5)
 }
 
-class LinearRegressionSuite extends FunSuite with MLlibTestSparkContext {
+class LinearRegressionSuite extends SparkFunSuite with MLlibTestSparkContext {
 
   def validatePrediction(predictions: Seq[Double], input: Seq[LabeledPoint]) {
 val numOffPredictions = predictions.zip(input).count { case (prediction, 
expected) =>
@@ -150,7 +149,7 @@ class LinearRegressionSuite extends FunSuite with 
MLlibTestSparkContext {
   }
 }
 
-class LinearRegressionClusterSuite extends FunSuite with 
LocalClusterSparkContext {
+class LinearRegressionClusterSuite extends SparkFunSuite with 
LocalClusterSparkContext {
 
   test("task size should be small in both training and prediction") {
 val m = 4

http://git-wip-us.apache.org/repos/asf/spark/blob/e5747ee3/mllib/src/test/scala/org/apache/spark/mllib/regression/RidgeRegressionSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/mllib/regression/RidgeRegressionSuite.scala
 
b/mllib/src/test/scala/org/apache/spark/mllib/regression/RidgeRegressionSuite.scala
index 43d6115..dbfe6de 100644
--- 
a/mllib/src/test/scala/org/apache/spark/mllib/regression/RidgeRegressionSuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/mllib/regression/RidgeRegressionSuite.scala
@@ -20,8 +20,8 @@ package org.apache.spark.mllib.regression
 import scala.util.Random
 
 import org.jblas.DoubleMatrix
-import org.scalatest.FunSuite
 
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.mllib.linalg.Vectors
 import org.apache.spark.mllib.util.{LocalClusterSparkContext, 
LinearDataGenerator,
   MLlibTestSparkContext}
@@ -33,7 +33,7 @@ private object RidgeRegressionSuite {
   val model = new RidgeRegressionModel(weights = Vectors.dense(0.1, 0.2, 0.3), 
intercept = 0.5)
 }
 
-class RidgeRegressionSuite extends FunSuite with MLlibTestSparkContext {
+class RidgeRegressionSuite extends SparkFunSuite with MLlibTestSparkContext {
 
   def predictionError(predictions: Seq[Double], input: Seq[LabeledPoint]) = {
 predictions.zip(input).map { case (prediction, expected) =>
@@ -101,7 +101,7 @@ class RidgeRegressionSuite extends FunSuite with 
MLlibTestSparkContext {
   }
 }
 
-class RidgeRegressionClusterSuite extends FunSuit

spark git commit: [SPARK-7980] [SQL] Support SQLContext.range(end)

2015-06-03 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 54a4ea407 -> 0a1dad6cd


[SPARK-7980] [SQL] Support SQLContext.range(end)

1. range() overloaded in SQLContext.scala
2. range() modified in python sql context.py
3. Tests added accordingly in DataFrameSuite.scala and python sql tests.py
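
As a rough usage sketch of the resulting Scala API (the PySpark doctest in the
diff shows the Python side; the local-mode setup below is just for the example):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.sql.SQLContext

    // Sketch of the new single-argument overload next to the existing one.
    object RangeExample {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setMaster("local[2]").setAppName("range-example"))
        val sqlContext = new SQLContext(sc)

        sqlContext.range(3).show()           // id column: 0, 1, 2 (same as range(0, 3))
        sqlContext.range(1, 7, 2, 2).show()  // id column: 1, 3, 5, over 2 partitions

        sc.stop()
      }
    }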

Author: animesh 

Closes #6609 from animeshbaranawal/SPARK-7980 and squashes the following 
commits:

935899c [animesh] SPARK-7980:python+scala changes

(cherry picked from commit d053a31be93d789e3f26cf55d747ecf6ca386c29)
Signed-off-by: Reynold Xin 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0a1dad6c
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0a1dad6c
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0a1dad6c

Branch: refs/heads/branch-1.4
Commit: 0a1dad6cd43a9f02acb83f03e2e20b2c8205c171
Parents: 54a4ea4
Author: animesh 
Authored: Wed Jun 3 11:28:18 2015 -0700
Committer: Reynold Xin 
Committed: Wed Jun 3 11:28:38 2015 -0700

--
 python/pyspark/sql/context.py   | 12 ++--
 python/pyspark/sql/tests.py |  2 ++
 .../main/scala/org/apache/spark/sql/SQLContext.scala| 11 +++
 .../scala/org/apache/spark/sql/DataFrameSuite.scala |  8 
 4 files changed, 31 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/0a1dad6c/python/pyspark/sql/context.py
--
diff --git a/python/pyspark/sql/context.py b/python/pyspark/sql/context.py
index 9fdf43c..1bebfc4 100644
--- a/python/pyspark/sql/context.py
+++ b/python/pyspark/sql/context.py
@@ -131,7 +131,7 @@ class SQLContext(object):
 return UDFRegistration(self)
 
 @since(1.4)
-def range(self, start, end, step=1, numPartitions=None):
+def range(self, start, end=None, step=1, numPartitions=None):
 """
 Create a :class:`DataFrame` with single LongType column named `id`,
 containing elements in a range from `start` to `end` (exclusive) with
@@ -145,10 +145,18 @@ class SQLContext(object):
 
 >>> sqlContext.range(1, 7, 2).collect()
 [Row(id=1), Row(id=3), Row(id=5)]
+
+>>> sqlContext.range(3).collect()
+[Row(id=0), Row(id=1), Row(id=2)]
 """
 if numPartitions is None:
 numPartitions = self._sc.defaultParallelism
-jdf = self._ssql_ctx.range(int(start), int(end), int(step), 
int(numPartitions))
+
+if end is None:
+jdf = self._ssql_ctx.range(0, int(start), int(step), 
int(numPartitions))
+else:
+jdf = self._ssql_ctx.range(int(start), int(end), int(step), 
int(numPartitions))
+
 return DataFrame(jdf, self)
 
 @ignore_unicode_prefix

http://git-wip-us.apache.org/repos/asf/spark/blob/0a1dad6c/python/pyspark/sql/tests.py
--
diff --git a/python/pyspark/sql/tests.py b/python/pyspark/sql/tests.py
index 6e498f0..a6fce50 100644
--- a/python/pyspark/sql/tests.py
+++ b/python/pyspark/sql/tests.py
@@ -131,6 +131,8 @@ class SQLTests(ReusedPySparkTestCase):
 self.assertEqual(self.sqlCtx.range(1, 1).count(), 0)
 self.assertEqual(self.sqlCtx.range(1, 0, -1).count(), 1)
 self.assertEqual(self.sqlCtx.range(0, 1 << 40, 1 << 39).count(), 2)
+self.assertEqual(self.sqlCtx.range(-2).count(), 0)
+self.assertEqual(self.sqlCtx.range(3).count(), 3)
 
 def test_explode(self):
 from pyspark.sql.functions import explode

http://git-wip-us.apache.org/repos/asf/spark/blob/0a1dad6c/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
index 91e6385..f08fb4f 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
@@ -720,6 +720,17 @@ class SQLContext(@transient val sparkContext: SparkContext)
   /**
* :: Experimental ::
* Creates a [[DataFrame]] with a single [[LongType]] column named `id`, 
containing elements
+   * in an range from 0 to `end`(exclusive) with step value 1.
+   *
+   * @since 1.4.0
+   * @group dataframe
+   */
+  @Experimental
+  def range(end: Long): DataFrame = range(0, end)
+
+  /**
+   * :: Experimental ::
+   * Creates a [[DataFrame]] with a single [[LongType]] column named `id`, 
containing elements
* in an range from `start` to `end`(exclusive) with an step value, with 
partition number
* specified.
*

http://git-wip-us.apache.org/repos/asf/spark/blob/0a1dad6c/sql/core/src/test/scala/org/apache/spar

spark git commit: [SPARK-7980] [SQL] Support SQLContext.range(end)

2015-06-03 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master 2c4d550ed -> d053a31be


[SPARK-7980] [SQL] Support SQLContext.range(end)

1. range() overloaded in SQLContext.scala
2. range() modified in python sql context.py
3. Tests added accordingly in DataFrameSuite.scala and python sql tests.py

Author: animesh 

Closes #6609 from animeshbaranawal/SPARK-7980 and squashes the following 
commits:

935899c [animesh] SPARK-7980:python+scala changes


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d053a31b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d053a31b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d053a31b

Branch: refs/heads/master
Commit: d053a31be93d789e3f26cf55d747ecf6ca386c29
Parents: 2c4d550
Author: animesh 
Authored: Wed Jun 3 11:28:18 2015 -0700
Committer: Reynold Xin 
Committed: Wed Jun 3 11:28:18 2015 -0700

--
 python/pyspark/sql/context.py   | 12 ++--
 python/pyspark/sql/tests.py |  2 ++
 .../main/scala/org/apache/spark/sql/SQLContext.scala| 11 +++
 .../scala/org/apache/spark/sql/DataFrameSuite.scala |  8 
 4 files changed, 31 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/d053a31b/python/pyspark/sql/context.py
--
diff --git a/python/pyspark/sql/context.py b/python/pyspark/sql/context.py
index 9fdf43c..1bebfc4 100644
--- a/python/pyspark/sql/context.py
+++ b/python/pyspark/sql/context.py
@@ -131,7 +131,7 @@ class SQLContext(object):
 return UDFRegistration(self)
 
 @since(1.4)
-def range(self, start, end, step=1, numPartitions=None):
+def range(self, start, end=None, step=1, numPartitions=None):
 """
 Create a :class:`DataFrame` with single LongType column named `id`,
 containing elements in a range from `start` to `end` (exclusive) with
@@ -145,10 +145,18 @@ class SQLContext(object):
 
 >>> sqlContext.range(1, 7, 2).collect()
 [Row(id=1), Row(id=3), Row(id=5)]
+
+>>> sqlContext.range(3).collect()
+[Row(id=0), Row(id=1), Row(id=2)]
 """
 if numPartitions is None:
 numPartitions = self._sc.defaultParallelism
-jdf = self._ssql_ctx.range(int(start), int(end), int(step), 
int(numPartitions))
+
+if end is None:
+jdf = self._ssql_ctx.range(0, int(start), int(step), 
int(numPartitions))
+else:
+jdf = self._ssql_ctx.range(int(start), int(end), int(step), 
int(numPartitions))
+
 return DataFrame(jdf, self)
 
 @ignore_unicode_prefix

http://git-wip-us.apache.org/repos/asf/spark/blob/d053a31b/python/pyspark/sql/tests.py
--
diff --git a/python/pyspark/sql/tests.py b/python/pyspark/sql/tests.py
index 6e498f0..a6fce50 100644
--- a/python/pyspark/sql/tests.py
+++ b/python/pyspark/sql/tests.py
@@ -131,6 +131,8 @@ class SQLTests(ReusedPySparkTestCase):
 self.assertEqual(self.sqlCtx.range(1, 1).count(), 0)
 self.assertEqual(self.sqlCtx.range(1, 0, -1).count(), 1)
 self.assertEqual(self.sqlCtx.range(0, 1 << 40, 1 << 39).count(), 2)
+self.assertEqual(self.sqlCtx.range(-2).count(), 0)
+self.assertEqual(self.sqlCtx.range(3).count(), 3)
 
 def test_explode(self):
 from pyspark.sql.functions import explode

http://git-wip-us.apache.org/repos/asf/spark/blob/d053a31b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
index 91e6385..f08fb4f 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
@@ -720,6 +720,17 @@ class SQLContext(@transient val sparkContext: SparkContext)
   /**
* :: Experimental ::
* Creates a [[DataFrame]] with a single [[LongType]] column named `id`, 
containing elements
+   * in an range from 0 to `end`(exclusive) with step value 1.
+   *
+   * @since 1.4.0
+   * @group dataframe
+   */
+  @Experimental
+  def range(end: Long): DataFrame = range(0, end)
+
+  /**
+   * :: Experimental ::
+   * Creates a [[DataFrame]] with a single [[LongType]] column named `id`, 
containing elements
* in an range from `start` to `end`(exclusive) with an step value, with 
partition number
* specified.
*

http://git-wip-us.apache.org/repos/asf/spark/blob/d053a31b/sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala
--
diff --g

spark git commit: [SPARK-7161] [HISTORY SERVER] Provide REST api to download event logs fro...

2015-06-03 Thread irashid
Repository: spark
Updated Branches:
  refs/heads/master d053a31be -> d2a86eb8f


[SPARK-7161] [HISTORY SERVER] Provide REST api to download event logs fro...

...m History Server

This PR adds a new API that allows the user to download event logs for an 
application as a zip file. APIs have been added to download all logs for a 
given application or just for a specific attempt.

This also adds a method to the ApplicationHistoryProvider to get the raw files, zipped.
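
As a hedged illustration (plain Scala, not code from this patch), a client could
fetch the zipped logs over HTTP once the History Server is running; the
/api/v1/applications/[app-id]/logs path and the 18080 port are assumptions based
on the documented REST layout, and the application id is hypothetical:

  import java.io.{BufferedInputStream, FileOutputStream}
  import java.net.URL

  val appId = "local-1430917381535"   // hypothetical application id
  val url = new URL(s"http://localhost:18080/api/v1/applications/$appId/logs")
  val in = new BufferedInputStream(url.openStream())
  val out = new FileOutputStream(s"$appId-eventLogs.zip")
  val buf = new Array[Byte](8192)
  Iterator.continually(in.read(buf)).takeWhile(_ != -1).foreach(n => out.write(buf, 0, n))
  in.close(); out.close()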

Author: Hari Shreedharan 

Closes #5792 from harishreedharan/eventlog-download and squashes the following 
commits:

221cc26 [Hari Shreedharan] Update docs with new API information.
a131be6 [Hari Shreedharan] Fix style issues.
5528bd8 [Hari Shreedharan] Merge branch 'master' into eventlog-download
6e8156e [Hari Shreedharan] Simplify tests, use Guava stream copy methods.
d8ddede [Hari Shreedharan] Remove unnecessary case in EventLogDownloadResource.
b53 [Hari Shreedharan] Changed interface to use zip stream. Added more 
tests.
1100b40 [Hari Shreedharan] Ensure that `Path` does not appear in interfaces, by refactoring interfaces.
5a5f3e2 [Hari Shreedharan] Fix test ordering issue.
0b66948 [Hari Shreedharan] Minor formatting/import fixes.
4fc518c [Hari Shreedharan] Fix rat failures.
a48b91f [Hari Shreedharan] Refactor to make attemptId optional in the API. Also 
added tests.
0fc1424 [Hari Shreedharan] File download now works for individual attempts and 
the entire application.
350d7e8 [Hari Shreedharan] Merge remote-tracking branch 'asf/master' into 
eventlog-download
fd6ab00 [Hari Shreedharan] Fix style issues
32b7662 [Hari Shreedharan] Use UIRoot directly in ApiRootResource. Also, use 
`Response` class to set headers.
7b362b2 [Hari Shreedharan] Almost working.
3d18ebc [Hari Shreedharan] [WIP] Try getting the event log download to work.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d2a86eb8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d2a86eb8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d2a86eb8

Branch: refs/heads/master
Commit: d2a86eb8f0fcc02304604da56c589ea58c77587a
Parents: d053a31
Author: Hari Shreedharan 
Authored: Wed Jun 3 13:43:13 2015 -0500
Committer: Imran Rashid 
Committed: Wed Jun 3 13:43:13 2015 -0500

--
 .rat-excludes   |  2 +
 .../history/ApplicationHistoryProvider.scala| 11 +++
 .../deploy/history/FsHistoryProvider.scala  | 63 +-
 .../spark/deploy/history/HistoryServer.scala|  8 ++
 .../spark/status/api/v1/ApiRootResource.scala   | 20 +
 .../api/v1/EventLogDownloadResource.scala   | 70 
 .../application_list_json_expectation.json  | 16 
 .../completed_app_list_json_expectation.json| 16 
 .../minDate_app_list_json_expectation.json  | 34 ++--
 .../spark-events/local-1430917381535_1  |  5 ++
 .../spark-events/local-1430917381535_2  |  5 ++
 .../deploy/history/FsHistoryProviderSuite.scala | 40 -
 .../deploy/history/HistoryServerSuite.scala | 88 ++--
 docs/monitoring.md  |  8 ++
 14 files changed, 367 insertions(+), 19 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/d2a86eb8/.rat-excludes
--
diff --git a/.rat-excludes b/.rat-excludes
index 8f2722c..994c7e8 100644
--- a/.rat-excludes
+++ b/.rat-excludes
@@ -80,6 +80,8 @@ local-1425081759269/*
 local-1426533911241/*
 local-1426633911242/*
 local-1430917381534/*
+local-1430917381535_1
+local-1430917381535_2
 DESCRIPTION
 NAMESPACE
 test_support/*

http://git-wip-us.apache.org/repos/asf/spark/blob/d2a86eb8/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
 
b/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
index 298a820..5f5e0fe 100644
--- 
a/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
+++ 
b/core/src/main/scala/org/apache/spark/deploy/history/ApplicationHistoryProvider.scala
@@ -17,6 +17,9 @@
 
 package org.apache.spark.deploy.history
 
+import java.util.zip.ZipOutputStream
+
+import org.apache.spark.SparkException
 import org.apache.spark.ui.SparkUI
 
 private[spark] case class ApplicationAttemptInfo(
@@ -62,4 +65,12 @@ private[history] abstract class ApplicationHistoryProvider {
*/
   def getConfig(): Map[String, String] = Map()
 
+  /**
+   * Writes out the event logs to the output stream provided. The logs will be 
compressed into a
+   * single zip file and written out.
+   * @throws SparkExcep

spark git commit: [SPARK-8063] [SPARKR] Spark master URL conflict between MASTER env variable and --master command line option.

2015-06-03 Thread shivaram
Repository: spark
Updated Branches:
  refs/heads/master d2a86eb8f -> 708c63bbb


[SPARK-8063] [SPARKR] Spark master URL conflict between MASTER env variable and 
--master command line option.

Author: Sun Rui 

Closes #6605 from sun-rui/SPARK-8063 and squashes the following commits:

51ca48b [Sun Rui] [SPARK-8063][SPARKR] Spark master URL conflict between MASTER 
env variable and --master command line option.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/708c63bb
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/708c63bb
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/708c63bb

Branch: refs/heads/master
Commit: 708c63bbbe9580eb774fe47e23ef61338103afda
Parents: d2a86eb
Author: Sun Rui 
Authored: Wed Jun 3 11:56:35 2015 -0700
Committer: Shivaram Venkataraman 
Committed: Wed Jun 3 11:56:35 2015 -0700

--
 R/pkg/inst/profile/shell.R | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/708c63bb/R/pkg/inst/profile/shell.R
--
diff --git a/R/pkg/inst/profile/shell.R b/R/pkg/inst/profile/shell.R
index ca94f1d..773b6ec 100644
--- a/R/pkg/inst/profile/shell.R
+++ b/R/pkg/inst/profile/shell.R
@@ -24,7 +24,7 @@
   old <- getOption("defaultPackages")
   options(defaultPackages = c(old, "SparkR"))
 
-  sc <- SparkR::sparkR.init(Sys.getenv("MASTER", unset = ""))
+  sc <- SparkR::sparkR.init()
   assign("sc", sc, envir=.GlobalEnv)
   sqlContext <- SparkR::sparkRSQL.init(sc)
   assign("sqlContext", sqlContext, envir=.GlobalEnv)





spark git commit: [SPARK-8063] [SPARKR] Spark master URL conflict between MASTER env variable and --master command line option.

2015-06-03 Thread shivaram
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 0a1dad6cd -> f67a27d02


[SPARK-8063] [SPARKR] Spark master URL conflict between MASTER env variable and 
--master command line option.

Author: Sun Rui 

Closes #6605 from sun-rui/SPARK-8063 and squashes the following commits:

51ca48b [Sun Rui] [SPARK-8063][SPARKR] Spark master URL conflict between MASTER 
env variable and --master command line option.

(cherry picked from commit 708c63bbbe9580eb774fe47e23ef61338103afda)
Signed-off-by: Shivaram Venkataraman 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f67a27d0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f67a27d0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f67a27d0

Branch: refs/heads/branch-1.4
Commit: f67a27d02699af24d5a2ccb843954a643a7ba078
Parents: 0a1dad6
Author: Sun Rui 
Authored: Wed Jun 3 11:56:35 2015 -0700
Committer: Shivaram Venkataraman 
Committed: Wed Jun 3 11:57:00 2015 -0700

--
 R/pkg/inst/profile/shell.R | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f67a27d0/R/pkg/inst/profile/shell.R
--
diff --git a/R/pkg/inst/profile/shell.R b/R/pkg/inst/profile/shell.R
index ca94f1d..773b6ec 100644
--- a/R/pkg/inst/profile/shell.R
+++ b/R/pkg/inst/profile/shell.R
@@ -24,7 +24,7 @@
   old <- getOption("defaultPackages")
   options(defaultPackages = c(old, "SparkR"))
 
-  sc <- SparkR::sparkR.init(Sys.getenv("MASTER", unset = ""))
+  sc <- SparkR::sparkR.init()
   assign("sc", sc, envir=.GlobalEnv)
   sqlContext <- SparkR::sparkRSQL.init(sc)
   assign("sqlContext", sqlContext, envir=.GlobalEnv)





spark git commit: [SPARK-8074] Parquet should throw AnalysisException during setup for data type/name related failures.

2015-06-03 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master 708c63bbb -> 939e4f3d8


[SPARK-8074] Parquet should throw AnalysisException during setup for data 
type/name related failures.

Author: Reynold Xin 

Closes #6608 from rxin/parquet-analysis and squashes the following commits:

b5dc8e2 [Reynold Xin] Code review feedback.
5617cf6 [Reynold Xin] [SPARK-8074] Parquet should throw AnalysisException 
during setup for data type/name related failures.
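
A hedged sketch of what this looks like from user code (Scala; df and the output
path are assumed to exist, and the invalid column name is just one example of a
setup-time failure): the error now surfaces as a catchable AnalysisException
rather than a generic RuntimeException from sys.error.

  import org.apache.spark.sql.AnalysisException

  try {
    df.withColumnRenamed("id", "bad name")   // spaces are not allowed in Parquet field names
      .write.parquet("/tmp/bad-name.parquet")
  } catch {
    case e: AnalysisException => println(s"rejected during setup: ${e.getMessage}")
  }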


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/939e4f3d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/939e4f3d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/939e4f3d

Branch: refs/heads/master
Commit: 939e4f3d8def16dfe03f0196be8e1c218a9daa32
Parents: 708c63b
Author: Reynold Xin 
Authored: Wed Jun 3 13:57:57 2015 -0700
Committer: Reynold Xin 
Committed: Wed Jun 3 13:57:57 2015 -0700

--
 .../apache/spark/sql/parquet/ParquetTypes.scala | 20 +---
 .../apache/spark/sql/parquet/newParquet.scala   | 14 --
 2 files changed, 17 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/939e4f3d/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala
index 6698b19..f8a5d84 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala
@@ -19,7 +19,7 @@ package org.apache.spark.sql.parquet
 
 import java.io.IOException
 
-import scala.collection.mutable.ArrayBuffer
+import scala.collection.JavaConversions._
 import scala.util.Try
 
 import org.apache.hadoop.conf.Configuration
@@ -33,12 +33,11 @@ import parquet.schema.PrimitiveType.{PrimitiveTypeName => 
ParquetPrimitiveTypeNa
 import parquet.schema.Type.Repetition
 import parquet.schema.{ConversionPatterns, DecimalMetadata, GroupType => 
ParquetGroupType, MessageType, OriginalType => ParquetOriginalType, 
PrimitiveType => ParquetPrimitiveType, Type => ParquetType, Types => 
ParquetTypes}
 
+import org.apache.spark.Logging
+import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.expressions.{Attribute, 
AttributeReference}
 import org.apache.spark.sql.types._
-import org.apache.spark.{Logging, SparkException}
 
-// Implicits
-import scala.collection.JavaConversions._
 
 /** A class representing Parquet info fields we care about, for passing back 
to Parquet */
 private[parquet] case class ParquetTypeInfo(
@@ -73,13 +72,12 @@ private[parquet] object ParquetTypesConverter extends 
Logging {
   case ParquetPrimitiveTypeName.INT96 if int96AsTimestamp => TimestampType
   case ParquetPrimitiveTypeName.INT96 =>
 // TODO: add BigInteger type? TODO(andre) use DecimalType instead
-sys.error("Potential loss of precision: cannot convert INT96")
+throw new AnalysisException("Potential loss of precision: cannot 
convert INT96")
   case ParquetPrimitiveTypeName.FIXED_LEN_BYTE_ARRAY
 if (originalType == ParquetOriginalType.DECIMAL && 
decimalInfo.getPrecision <= 18) =>
   // TODO: for now, our reader only supports decimals that fit in a 
Long
   DecimalType(decimalInfo.getPrecision, decimalInfo.getScale)
-  case _ => sys.error(
-s"Unsupported parquet datatype $parquetType")
+  case _ => throw new AnalysisException(s"Unsupported parquet datatype 
$parquetType")
 }
   }
 
@@ -371,7 +369,7 @@ private[parquet] object ParquetTypesConverter extends 
Logging {
 parquetKeyType,
 parquetValueType)
 }
-case _ => sys.error(s"Unsupported datatype $ctype")
+case _ => throw new AnalysisException(s"Unsupported datatype $ctype")
   }
 }
   }
@@ -403,7 +401,7 @@ private[parquet] object ParquetTypesConverter extends 
Logging {
   def convertFromString(string: String): Seq[Attribute] = {
 
Try(DataType.fromJson(string)).getOrElse(DataType.fromCaseClassString(string)) 
match {
   case s: StructType => s.toAttributes
-  case other => sys.error(s"Can convert $string to row")
+  case other => throw new AnalysisException(s"Can convert $string to row")
 }
   }
 
@@ -411,8 +409,8 @@ private[parquet] object ParquetTypesConverter extends 
Logging {
 // ,;{}()\n\t= and space character are special characters in Parquet schema
 schema.map(_.name).foreach { name =>
   if (name.matches(".*[ ,;{}()\n\t=].*")) {
-sys.error(
-  s"""Attribute name "$name" contains invalid character(s) among " 
,;{}()\n\t=".
+throw new AnalysisExceptio

spark git commit: [SPARK-8074] Parquet should throw AnalysisException during setup for data type/name related failures.

2015-06-03 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 f67a27d02 -> 1f90a06bd


[SPARK-8074] Parquet should throw AnalysisException during setup for data 
type/name related failures.

Author: Reynold Xin 

Closes #6608 from rxin/parquet-analysis and squashes the following commits:

b5dc8e2 [Reynold Xin] Code review feedback.
5617cf6 [Reynold Xin] [SPARK-8074] Parquet should throw AnalysisException 
during setup for data type/name related failures.

(cherry picked from commit 939e4f3d8def16dfe03f0196be8e1c218a9daa32)
Signed-off-by: Reynold Xin 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1f90a06b
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1f90a06b
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1f90a06b

Branch: refs/heads/branch-1.4
Commit: 1f90a06bda985ae8508e0439d11405d294fde2ec
Parents: f67a27d
Author: Reynold Xin 
Authored: Wed Jun 3 13:57:57 2015 -0700
Committer: Reynold Xin 
Committed: Wed Jun 3 13:58:15 2015 -0700

--
 .../apache/spark/sql/parquet/ParquetTypes.scala | 20 +---
 .../apache/spark/sql/parquet/newParquet.scala   | 14 --
 2 files changed, 17 insertions(+), 17 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/1f90a06b/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala
--
diff --git 
a/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala
index 6698b19..f8a5d84 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTypes.scala
@@ -19,7 +19,7 @@ package org.apache.spark.sql.parquet
 
 import java.io.IOException
 
-import scala.collection.mutable.ArrayBuffer
+import scala.collection.JavaConversions._
 import scala.util.Try
 
 import org.apache.hadoop.conf.Configuration
@@ -33,12 +33,11 @@ import parquet.schema.PrimitiveType.{PrimitiveTypeName => 
ParquetPrimitiveTypeNa
 import parquet.schema.Type.Repetition
 import parquet.schema.{ConversionPatterns, DecimalMetadata, GroupType => 
ParquetGroupType, MessageType, OriginalType => ParquetOriginalType, 
PrimitiveType => ParquetPrimitiveType, Type => ParquetType, Types => 
ParquetTypes}
 
+import org.apache.spark.Logging
+import org.apache.spark.sql.AnalysisException
 import org.apache.spark.sql.catalyst.expressions.{Attribute, 
AttributeReference}
 import org.apache.spark.sql.types._
-import org.apache.spark.{Logging, SparkException}
 
-// Implicits
-import scala.collection.JavaConversions._
 
 /** A class representing Parquet info fields we care about, for passing back 
to Parquet */
 private[parquet] case class ParquetTypeInfo(
@@ -73,13 +72,12 @@ private[parquet] object ParquetTypesConverter extends 
Logging {
   case ParquetPrimitiveTypeName.INT96 if int96AsTimestamp => TimestampType
   case ParquetPrimitiveTypeName.INT96 =>
 // TODO: add BigInteger type? TODO(andre) use DecimalType instead
-sys.error("Potential loss of precision: cannot convert INT96")
+throw new AnalysisException("Potential loss of precision: cannot 
convert INT96")
   case ParquetPrimitiveTypeName.FIXED_LEN_BYTE_ARRAY
 if (originalType == ParquetOriginalType.DECIMAL && 
decimalInfo.getPrecision <= 18) =>
   // TODO: for now, our reader only supports decimals that fit in a 
Long
   DecimalType(decimalInfo.getPrecision, decimalInfo.getScale)
-  case _ => sys.error(
-s"Unsupported parquet datatype $parquetType")
+  case _ => throw new AnalysisException(s"Unsupported parquet datatype 
$parquetType")
 }
   }
 
@@ -371,7 +369,7 @@ private[parquet] object ParquetTypesConverter extends 
Logging {
 parquetKeyType,
 parquetValueType)
 }
-case _ => sys.error(s"Unsupported datatype $ctype")
+case _ => throw new AnalysisException(s"Unsupported datatype $ctype")
   }
 }
   }
@@ -403,7 +401,7 @@ private[parquet] object ParquetTypesConverter extends 
Logging {
   def convertFromString(string: String): Seq[Attribute] = {
 
Try(DataType.fromJson(string)).getOrElse(DataType.fromCaseClassString(string)) 
match {
   case s: StructType => s.toAttributes
-  case other => sys.error(s"Can convert $string to row")
+  case other => throw new AnalysisException(s"Can convert $string to row")
 }
   }
 
@@ -411,8 +409,8 @@ private[parquet] object ParquetTypesConverter extends 
Logging {
 // ,;{}()\n\t= and space character are special characters in Parquet schema
 schema.map(_.name).foreach { name =>
   if (name.matches(".*[ ,;{}()\n\t=].*")) {
-sys.error(
-  s"""Attri

spark git commit: Update documentation for [SPARK-7980] [SQL] Support SQLContext.range(end)

2015-06-03 Thread rxin
Repository: spark
Updated Branches:
  refs/heads/master 939e4f3d8 -> 2c5a06caf


Update documentation for [SPARK-7980] [SQL] Support SQLContext.range(end)


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/2c5a06ca
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/2c5a06ca
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/2c5a06ca

Branch: refs/heads/master
Commit: 2c5a06cafd2885ff5431fa96485db2564ae1cce3
Parents: 939e4f3
Author: Reynold Xin 
Authored: Wed Jun 3 14:19:10 2015 -0700
Committer: Reynold Xin 
Committed: Wed Jun 3 14:20:27 2015 -0700

--
 python/pyspark/sql/context.py   |  2 ++
 .../scala/org/apache/spark/sql/SQLContext.scala | 20 ++--
 2 files changed, 12 insertions(+), 10 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/2c5a06ca/python/pyspark/sql/context.py
--
diff --git a/python/pyspark/sql/context.py b/python/pyspark/sql/context.py
index 1bebfc4..599c9ac 100644
--- a/python/pyspark/sql/context.py
+++ b/python/pyspark/sql/context.py
@@ -146,6 +146,8 @@ class SQLContext(object):
 >>> sqlContext.range(1, 7, 2).collect()
 [Row(id=1), Row(id=3), Row(id=5)]
 
+If only one argument is specified, it will be used as the end value.
+
 >>> sqlContext.range(3).collect()
 [Row(id=0), Row(id=1), Row(id=2)]
 """

http://git-wip-us.apache.org/repos/asf/spark/blob/2c5a06ca/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
--
diff --git a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala 
b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
index f08fb4f..0aab7fa 100644
--- a/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
+++ b/sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala
@@ -705,33 +705,33 @@ class SQLContext(@transient val sparkContext: 
SparkContext)
   /**
* :: Experimental ::
* Creates a [[DataFrame]] with a single [[LongType]] column named `id`, 
containing elements
-   * in an range from `start` to `end`(exclusive) with step value 1.
+   * in an range from 0 to `end` (exclusive) with step value 1.
*
-   * @since 1.4.0
+   * @since 1.4.1
* @group dataframe
*/
   @Experimental
-  def range(start: Long, end: Long): DataFrame = {
-createDataFrame(
-  sparkContext.range(start, end).map(Row(_)),
-  StructType(StructField("id", LongType, nullable = false) :: Nil))
-  }
+  def range(end: Long): DataFrame = range(0, end)
 
   /**
* :: Experimental ::
* Creates a [[DataFrame]] with a single [[LongType]] column named `id`, 
containing elements
-   * in an range from 0 to `end`(exclusive) with step value 1.
+   * in an range from `start` to `end` (exclusive) with step value 1.
*
* @since 1.4.0
* @group dataframe
*/
   @Experimental
-  def range(end: Long): DataFrame = range(0, end)
+  def range(start: Long, end: Long): DataFrame = {
+createDataFrame(
+  sparkContext.range(start, end).map(Row(_)),
+  StructType(StructField("id", LongType, nullable = false) :: Nil))
+  }
 
   /**
* :: Experimental ::
* Creates a [[DataFrame]] with a single [[LongType]] column named `id`, 
containing elements
-   * in an range from `start` to `end`(exclusive) with an step value, with 
partition number
+   * in an range from `start` to `end` (exclusive) with an step value, with 
partition number
* specified.
*
* @since 1.4.0





spark git commit: [SPARK-8054] [MLLIB] Added several Java-friendly APIs + unit tests

2015-06-03 Thread jkbradley
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 1f90a06bd -> bfab61f39


[SPARK-8054] [MLLIB] Added several Java-friendly APIs + unit tests

Java-friendly APIs added:
* GaussianMixture.run()
* GaussianMixtureModel.predict()
* DistributedLDAModel.javaTopicDistributions()
* StreamingKMeans: trainOn, predictOn, predictOnValues
* Statistics.corr
* params
  * added doc to w() since Java docs do not inherit doc
  * removed non-Java-friendly w() from StringArrayParam and DoubleArrayParam
  * made DoubleArrayParam Java-friendly w() actually Java-friendly

I generated the doc and verified all changes.
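
For reference, a brief sketch of the Scala call that the new Java-friendly
Statistics.corr overloads mirror (sc is assumed to be an existing SparkContext;
this is illustrative, not code from this patch):

  import org.apache.spark.mllib.stat.Statistics
  import org.apache.spark.rdd.RDD

  val x: RDD[Double] = sc.parallelize(Seq(1.0, 2.0, 3.0, 4.0))
  val y: RDD[Double] = sc.parallelize(Seq(2.0, 4.0, 6.0, 8.0))
  val pearson  = Statistics.corr(x, y)              // defaults to Pearson
  val spearman = Statistics.corr(x, y, "spearman")  // rank correlation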

CC: mengxr

Author: Joseph K. Bradley 

Closes #6562 from jkbradley/java-api-1.4 and squashes the following commits:

c16821b [Joseph K. Bradley] Small fixes based on code review.
d955581 [Joseph K. Bradley] unit test fixes
29b6b0d [Joseph K. Bradley] small fixes
fe6dcfe [Joseph K. Bradley] Added several Java-friendly APIs + unit tests: 
NaiveBayes, GaussianMixture, LDA, StreamingKMeans, Statistics.corr, params

(cherry picked from commit 20a26b595c74ac41cf7c19e6091d7e675e503321)
Signed-off-by: Joseph K. Bradley 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bfab61f3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bfab61f3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bfab61f3

Branch: refs/heads/branch-1.4
Commit: bfab61f39c07252ad938e55a655d65370ad29ca1
Parents: 1f90a06
Author: Joseph K. Bradley 
Authored: Wed Jun 3 14:34:20 2015 -0700
Committer: Joseph K. Bradley 
Committed: Wed Jun 3 14:34:31 2015 -0700

--
 .../org/apache/spark/ml/param/params.scala  | 20 +++--
 .../mllib/clustering/GaussianMixture.scala  |  4 +
 .../mllib/clustering/GaussianMixtureModel.scala |  7 +-
 .../spark/mllib/clustering/LDAModel.scala   |  6 ++
 .../mllib/clustering/StreamingKMeans.scala  | 18 +
 .../apache/spark/mllib/stat/Statistics.scala|  9 +++
 .../JavaStreamingLogisticRegressionSuite.java   | 82 
 .../apache/spark/ml/param/JavaParamsSuite.java  |  1 +
 .../apache/spark/ml/param/JavaTestParams.java   | 29 +--
 .../JavaStreamingLogisticRegressionSuite.java   | 81 +++
 .../clustering/JavaGaussianMixtureSuite.java| 64 +++
 .../spark/mllib/clustering/JavaLDASuite.java|  4 +
 .../clustering/JavaStreamingKMeansSuite.java| 82 
 .../spark/mllib/stat/JavaStatisticsSuite.java   | 56 +
 14 files changed, 364 insertions(+), 99 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/bfab61f3/mllib/src/main/scala/org/apache/spark/ml/param/params.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/param/params.scala 
b/mllib/src/main/scala/org/apache/spark/ml/param/params.scala
index 473488d..ba94d6a 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/param/params.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/param/params.scala
@@ -69,14 +69,10 @@ class Param[T](val parent: String, val name: String, val 
doc: String, val isVali
 }
   }
 
-  /**
-   * Creates a param pair with the given value (for Java).
-   */
+  /** Creates a param pair with the given value (for Java). */
   def w(value: T): ParamPair[T] = this -> value
 
-  /**
-   * Creates a param pair with the given value (for Scala).
-   */
+  /** Creates a param pair with the given value (for Scala). */
   def ->(value: T): ParamPair[T] = ParamPair(this, value)
 
   override final def toString: String = s"${parent}__$name"
@@ -190,6 +186,7 @@ class DoubleParam(parent: String, name: String, doc: 
String, isValid: Double =>
 
   def this(parent: Identifiable, name: String, doc: String) = this(parent.uid, 
name, doc)
 
+  /** Creates a param pair with the given value (for Java). */
   override def w(value: Double): ParamPair[Double] = super.w(value)
 }
 
@@ -209,6 +206,7 @@ class IntParam(parent: String, name: String, doc: String, 
isValid: Int => Boolea
 
   def this(parent: Identifiable, name: String, doc: String) = this(parent.uid, 
name, doc)
 
+  /** Creates a param pair with the given value (for Java). */
   override def w(value: Int): ParamPair[Int] = super.w(value)
 }
 
@@ -228,6 +226,7 @@ class FloatParam(parent: String, name: String, doc: String, 
isValid: Float => Bo
 
   def this(parent: Identifiable, name: String, doc: String) = this(parent.uid, 
name, doc)
 
+  /** Creates a param pair with the given value (for Java). */
   override def w(value: Float): ParamPair[Float] = super.w(value)
 }
 
@@ -247,6 +246,7 @@ class LongParam(parent: String, name: String, doc: String, 
isValid: Long => Bool
 
   def this(parent: Identifiable, name: String, doc: String) = this(parent.uid, 
name, doc)
 
+  /** Creates a param pair with the 

spark git commit: [SPARK-8054] [MLLIB] Added several Java-friendly APIs + unit tests

2015-06-03 Thread jkbradley
Repository: spark
Updated Branches:
  refs/heads/master 2c5a06caf -> 20a26b595


[SPARK-8054] [MLLIB] Added several Java-friendly APIs + unit tests

Java-friendly APIs added:
* GaussianMixture.run()
* GaussianMixtureModel.predict()
* DistributedLDAModel.javaTopicDistributions()
* StreamingKMeans: trainOn, predictOn, predictOnValues
* Statistics.corr
* params
  * added doc to w() since Java docs do not inherit doc
  * removed non-Java-friendly w() from StringArrayParam and DoubleArrayParam
  * made DoubleArrayParam Java-friendly w() actually Java-friendly

I generated the doc and verified all changes.

CC: mengxr

Author: Joseph K. Bradley 

Closes #6562 from jkbradley/java-api-1.4 and squashes the following commits:

c16821b [Joseph K. Bradley] Small fixes based on code review.
d955581 [Joseph K. Bradley] unit test fixes
29b6b0d [Joseph K. Bradley] small fixes
fe6dcfe [Joseph K. Bradley] Added several Java-friendly APIs + unit tests: 
NaiveBayes, GaussianMixture, LDA, StreamingKMeans, Statistics.corr, params


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/20a26b59
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/20a26b59
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/20a26b59

Branch: refs/heads/master
Commit: 20a26b595c74ac41cf7c19e6091d7e675e503321
Parents: 2c5a06c
Author: Joseph K. Bradley 
Authored: Wed Jun 3 14:34:20 2015 -0700
Committer: Joseph K. Bradley 
Committed: Wed Jun 3 14:34:20 2015 -0700

--
 .../org/apache/spark/ml/param/params.scala  | 20 +++--
 .../mllib/clustering/GaussianMixture.scala  |  4 +
 .../mllib/clustering/GaussianMixtureModel.scala |  7 +-
 .../spark/mllib/clustering/LDAModel.scala   |  6 ++
 .../mllib/clustering/StreamingKMeans.scala  | 18 +
 .../apache/spark/mllib/stat/Statistics.scala|  9 +++
 .../JavaStreamingLogisticRegressionSuite.java   | 82 
 .../apache/spark/ml/param/JavaParamsSuite.java  |  1 +
 .../apache/spark/ml/param/JavaTestParams.java   | 29 +--
 .../JavaStreamingLogisticRegressionSuite.java   | 81 +++
 .../clustering/JavaGaussianMixtureSuite.java| 64 +++
 .../spark/mllib/clustering/JavaLDASuite.java|  4 +
 .../clustering/JavaStreamingKMeansSuite.java| 82 
 .../spark/mllib/stat/JavaStatisticsSuite.java   | 56 +
 14 files changed, 364 insertions(+), 99 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/20a26b59/mllib/src/main/scala/org/apache/spark/ml/param/params.scala
--
diff --git a/mllib/src/main/scala/org/apache/spark/ml/param/params.scala 
b/mllib/src/main/scala/org/apache/spark/ml/param/params.scala
index 473488d..ba94d6a 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/param/params.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/param/params.scala
@@ -69,14 +69,10 @@ class Param[T](val parent: String, val name: String, val 
doc: String, val isVali
 }
   }
 
-  /**
-   * Creates a param pair with the given value (for Java).
-   */
+  /** Creates a param pair with the given value (for Java). */
   def w(value: T): ParamPair[T] = this -> value
 
-  /**
-   * Creates a param pair with the given value (for Scala).
-   */
+  /** Creates a param pair with the given value (for Scala). */
   def ->(value: T): ParamPair[T] = ParamPair(this, value)
 
   override final def toString: String = s"${parent}__$name"
@@ -190,6 +186,7 @@ class DoubleParam(parent: String, name: String, doc: 
String, isValid: Double =>
 
   def this(parent: Identifiable, name: String, doc: String) = this(parent.uid, 
name, doc)
 
+  /** Creates a param pair with the given value (for Java). */
   override def w(value: Double): ParamPair[Double] = super.w(value)
 }
 
@@ -209,6 +206,7 @@ class IntParam(parent: String, name: String, doc: String, 
isValid: Int => Boolea
 
   def this(parent: Identifiable, name: String, doc: String) = this(parent.uid, 
name, doc)
 
+  /** Creates a param pair with the given value (for Java). */
   override def w(value: Int): ParamPair[Int] = super.w(value)
 }
 
@@ -228,6 +226,7 @@ class FloatParam(parent: String, name: String, doc: String, 
isValid: Float => Bo
 
   def this(parent: Identifiable, name: String, doc: String) = this(parent.uid, 
name, doc)
 
+  /** Creates a param pair with the given value (for Java). */
   override def w(value: Float): ParamPair[Float] = super.w(value)
 }
 
@@ -247,6 +246,7 @@ class LongParam(parent: String, name: String, doc: String, 
isValid: Long => Bool
 
   def this(parent: Identifiable, name: String, doc: String) = this(parent.uid, 
name, doc)
 
+  /** Creates a param pair with the given value (for Java). */
   override def w(value: Long): ParamPair[Long] = super.w(value)
 }
 
@@ -260,6 +260,

spark git commit: [MINOR] [UI] Improve confusing message on log page

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master 20a26b595 -> c6a6dd0d0


[MINOR] [UI] Improve confusing message on log page

It's good practice to check if the input path is in the directory
we expect to avoid potentially confusing error messages.
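
A hedged sketch of the check this adds (plain Scala with assumed paths, not the
actual LogPage code): the requested log directory is URI-normalized and then
verified to sit inside the worker's work directory, so traversal attempts like
"../.." are rejected with a clear error instead of a confusing read failure.

  import java.io.File
  import java.net.URI

  val workDir = new File("/tmp/spark-work")                // hypothetical work dir
  val requested = "file:/tmp/spark-work/app-1/../../etc"   // hypothetical request parameter
  val normalized = new File(new URI(requested).normalize().getPath)
  // Utils.isInDirectory(workDir, normalized) now returns false here,
  // so the page replies with "Error: invalid log directory ...".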


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c6a6dd0d
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c6a6dd0d
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c6a6dd0d

Branch: refs/heads/master
Commit: c6a6dd0d0736d548ff9f255e5ed5df45b29c46c1
Parents: 20a26b5
Author: Andrew Or 
Authored: Wed Jun 3 12:10:12 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 14:47:09 2015 -0700

--
 .../apache/spark/deploy/worker/ui/LogPage.scala |  9 +++
 .../scala/org/apache/spark/util/Utils.scala | 16 +
 .../spark/deploy/worker/ui/LogPageSuite.scala   | 36 +++
 .../org/apache/spark/util/UtilsSuite.scala  | 65 
 4 files changed, 115 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/c6a6dd0d/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala 
b/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
index dc2bee6..53f8f9a 100644
--- a/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
@@ -17,6 +17,8 @@
 
 package org.apache.spark.deploy.worker.ui
 
+import java.io.File
+import java.net.URI
 import javax.servlet.http.HttpServletRequest
 
 import scala.xml.Node
@@ -135,6 +137,13 @@ private[ui] class LogPage(parent: WorkerWebUI) extends 
WebUIPage("logPage") with
   return ("Error: Log type must be one of " + 
supportedLogTypes.mkString(", "), 0, 0, 0)
 }
 
+// Verify that the normalized path of the log directory is in the working 
directory
+val normalizedUri = new URI(logDirectory).normalize()
+val normalizedLogDir = new File(normalizedUri.getPath)
+if (!Utils.isInDirectory(workDir, normalizedLogDir)) {
+  return ("Error: invalid log directory " + logDirectory, 0, 0, 0)
+}
+
 try {
   val files = RollingFileAppender.getSortedRolledOverFiles(logDirectory, 
logType)
   logDebug(s"Sorted log files of type $logType in 
$logDirectory:\n${files.mkString("\n")}")

http://git-wip-us.apache.org/repos/asf/spark/blob/c6a6dd0d/core/src/main/scala/org/apache/spark/util/Utils.scala
--
diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala 
b/core/src/main/scala/org/apache/spark/util/Utils.scala
index 693e1a0..5f13241 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -2227,6 +2227,22 @@ private[spark] object Utils extends Logging {
 }
   }
 
+  /**
+   * Return whether the specified file is a parent directory of the child file.
+   */
+  def isInDirectory(parent: File, child: File): Boolean = {
+if (child == null || parent == null) {
+  return false
+}
+if (!child.exists() || !parent.exists() || !parent.isDirectory()) {
+  return false
+}
+if (parent.equals(child)) {
+  return true
+}
+isInDirectory(parent, child.getParentFile)
+  }
+
 }
 
 private [util] class SparkShutdownHookManager {

http://git-wip-us.apache.org/repos/asf/spark/blob/c6a6dd0d/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala 
b/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
index 572360d..72eaffb 100644
--- a/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
@@ -19,7 +19,7 @@ package org.apache.spark.deploy.worker.ui
 
 import java.io.{File, FileWriter}
 
-import org.mockito.Mockito.mock
+import org.mockito.Mockito.{mock, when}
 import org.scalatest.PrivateMethodTester
 
 import org.apache.spark.SparkFunSuite
@@ -28,33 +28,47 @@ class LogPageSuite extends SparkFunSuite with 
PrivateMethodTester {
 
   test("get logs simple") {
 val webui = mock(classOf[WorkerWebUI])
+val tmpDir = new File(sys.props("java.io.tmpdir"))
+val workDir = new File(tmpDir, "work-dir")
+workDir.mkdir()
+when(webui.workDir).thenReturn(workDir)
 val logPage = new LogPage(webui)
 
 // Prepare some fake log files to read later
 val out = "some stdout here"
 val err = "some stderr here"
-val t

spark git commit: [MINOR] [UI] Improve confusing message on log page

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 bfab61f39 -> 31e0ae9e1


[MINOR] [UI] Improve confusing message on log page

It's good practice to check if the input path is in the directory
we expect to avoid potentially confusing error messages.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/31e0ae9e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/31e0ae9e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/31e0ae9e

Branch: refs/heads/branch-1.4
Commit: 31e0ae9e1df8500f6c3abf39c9aa61a67d34f5f3
Parents: bfab61f
Author: Andrew Or 
Authored: Wed Jun 3 12:10:12 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 14:48:15 2015 -0700

--
 .../apache/spark/deploy/worker/ui/LogPage.scala |  9 +++
 .../scala/org/apache/spark/util/Utils.scala | 16 +
 .../spark/deploy/worker/ui/LogPageSuite.scala   | 36 +++
 .../org/apache/spark/util/UtilsSuite.scala  | 65 
 4 files changed, 115 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/31e0ae9e/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala 
b/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
index dc2bee6..53f8f9a 100644
--- a/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
@@ -17,6 +17,8 @@
 
 package org.apache.spark.deploy.worker.ui
 
+import java.io.File
+import java.net.URI
 import javax.servlet.http.HttpServletRequest
 
 import scala.xml.Node
@@ -135,6 +137,13 @@ private[ui] class LogPage(parent: WorkerWebUI) extends 
WebUIPage("logPage") with
   return ("Error: Log type must be one of " + 
supportedLogTypes.mkString(", "), 0, 0, 0)
 }
 
+// Verify that the normalized path of the log directory is in the working 
directory
+val normalizedUri = new URI(logDirectory).normalize()
+val normalizedLogDir = new File(normalizedUri.getPath)
+if (!Utils.isInDirectory(workDir, normalizedLogDir)) {
+  return ("Error: invalid log directory " + logDirectory, 0, 0, 0)
+}
+
 try {
   val files = RollingFileAppender.getSortedRolledOverFiles(logDirectory, 
logType)
   logDebug(s"Sorted log files of type $logType in 
$logDirectory:\n${files.mkString("\n")}")

http://git-wip-us.apache.org/repos/asf/spark/blob/31e0ae9e/core/src/main/scala/org/apache/spark/util/Utils.scala
--
diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala 
b/core/src/main/scala/org/apache/spark/util/Utils.scala
index 693e1a0..5f13241 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -2227,6 +2227,22 @@ private[spark] object Utils extends Logging {
 }
   }
 
+  /**
+   * Return whether the specified file is a parent directory of the child file.
+   */
+  def isInDirectory(parent: File, child: File): Boolean = {
+if (child == null || parent == null) {
+  return false
+}
+if (!child.exists() || !parent.exists() || !parent.isDirectory()) {
+  return false
+}
+if (parent.equals(child)) {
+  return true
+}
+isInDirectory(parent, child.getParentFile)
+  }
+
 }
 
 private [util] class SparkShutdownHookManager {

http://git-wip-us.apache.org/repos/asf/spark/blob/31e0ae9e/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala 
b/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
index 2d02122..da53214 100644
--- a/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
@@ -19,40 +19,54 @@ package org.apache.spark.deploy.worker.ui
 
 import java.io.{File, FileWriter}
 
-import org.mockito.Mockito.mock
+import org.mockito.Mockito.{mock, when}
 import org.scalatest.{FunSuite, PrivateMethodTester}
 
 class LogPageSuite extends FunSuite with PrivateMethodTester {
 
   test("get logs simple") {
 val webui = mock(classOf[WorkerWebUI])
+val tmpDir = new File(sys.props("java.io.tmpdir"))
+val workDir = new File(tmpDir, "work-dir")
+workDir.mkdir()
+when(webui.workDir).thenReturn(workDir)
 val logPage = new LogPage(webui)
 
 // Prepare some fake log files to read later
 val out = "some stdout here"
 val err = "some stderr here"
-val tmpDir = new File(sys.props("java.io.tmpdir

spark git commit: [MINOR] [UI] Improve confusing message on log page

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 e5747ee3a -> 744599609


[MINOR] [UI] Improve confusing message on log page

It's good practice to check if the input path is in the directory
we expect to avoid potentially confusing error messages.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/74459960
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/74459960
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/74459960

Branch: refs/heads/branch-1.3
Commit: 744599609dcb940df309c019bb57b98d4ba5d151
Parents: e5747ee
Author: Andrew Or 
Authored: Wed Jun 3 12:10:12 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 14:48:39 2015 -0700

--
 .../apache/spark/deploy/worker/ui/LogPage.scala |  9 +++
 .../scala/org/apache/spark/util/Utils.scala | 16 +
 .../spark/deploy/worker/ui/LogPageSuite.scala   | 36 +++
 .../org/apache/spark/util/UtilsSuite.scala  | 65 
 4 files changed, 115 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/74459960/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala 
b/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
index 999eb52..41a2b5c 100644
--- a/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/worker/ui/LogPage.scala
@@ -17,6 +17,8 @@
 
 package org.apache.spark.deploy.worker.ui
 
+import java.io.File
+import java.net.URI
 import javax.servlet.http.HttpServletRequest
 
 import scala.xml.Node
@@ -135,6 +137,13 @@ private[spark] class LogPage(parent: WorkerWebUI) extends 
WebUIPage("logPage") w
   return ("Error: Log type must be one of " + 
supportedLogTypes.mkString(", "), 0, 0, 0)
 }
 
+// Verify that the normalized path of the log directory is in the working 
directory
+val normalizedUri = new URI(logDirectory).normalize()
+val normalizedLogDir = new File(normalizedUri.getPath)
+if (!Utils.isInDirectory(workDir, normalizedLogDir)) {
+  return ("Error: invalid log directory " + logDirectory, 0, 0, 0)
+}
+
 try {
   val files = RollingFileAppender.getSortedRolledOverFiles(logDirectory, 
logType)
   logDebug(s"Sorted log files of type $logType in 
$logDirectory:\n${files.mkString("\n")}")

http://git-wip-us.apache.org/repos/asf/spark/blob/74459960/core/src/main/scala/org/apache/spark/util/Utils.scala
--
diff --git a/core/src/main/scala/org/apache/spark/util/Utils.scala 
b/core/src/main/scala/org/apache/spark/util/Utils.scala
index a57049d..c030986 100644
--- a/core/src/main/scala/org/apache/spark/util/Utils.scala
+++ b/core/src/main/scala/org/apache/spark/util/Utils.scala
@@ -2011,6 +2011,22 @@ private[spark] object Utils extends Logging {
   .getOrElse(UserGroupInformation.getCurrentUser().getShortUserName())
   }
 
+  /**
+   * Return whether the specified file is a parent directory of the child file.
+   */
+  def isInDirectory(parent: File, child: File): Boolean = {
+if (child == null || parent == null) {
+  return false
+}
+if (!child.exists() || !parent.exists() || !parent.isDirectory()) {
+  return false
+}
+if (parent.equals(child)) {
+  return true
+}
+isInDirectory(parent, child.getParentFile)
+  }
+
 }
 
 /**

http://git-wip-us.apache.org/repos/asf/spark/blob/74459960/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala 
b/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
index 572360d..72eaffb 100644
--- a/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/worker/ui/LogPageSuite.scala
@@ -19,7 +19,7 @@ package org.apache.spark.deploy.worker.ui
 
 import java.io.{File, FileWriter}
 
-import org.mockito.Mockito.mock
+import org.mockito.Mockito.{mock, when}
 import org.scalatest.PrivateMethodTester
 
 import org.apache.spark.SparkFunSuite
@@ -28,33 +28,47 @@ class LogPageSuite extends SparkFunSuite with 
PrivateMethodTester {
 
   test("get logs simple") {
 val webui = mock(classOf[WorkerWebUI])
+val tmpDir = new File(sys.props("java.io.tmpdir"))
+val workDir = new File(tmpDir, "work-dir")
+workDir.mkdir()
+when(webui.workDir).thenReturn(workDir)
 val logPage = new LogPage(webui)
 
 // Prepare some fake log files to read later
 val out = "some stdout here"
 val er

spark git commit: [SPARK-8083] [MESOS] Use the correct base path in mesos driver page.

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master c6a6dd0d0 -> bfbf12b34


[SPARK-8083] [MESOS] Use the correct base path in mesos driver page.

Author: Timothy Chen 

Closes #6615 from tnachen/mesos_driver_path and squashes the following commits:

4f47b7c [Timothy Chen] Use the correct base path in mesos driver page.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bfbf12b3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bfbf12b3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bfbf12b3

Branch: refs/heads/master
Commit: bfbf12b349e998c7e674649a07b88c4658ae0711
Parents: c6a6dd0
Author: Timothy Chen 
Authored: Wed Jun 3 14:57:23 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 14:57:23 2015 -0700

--
 .../main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/bfbf12b3/core/src/main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala 
b/core/src/main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala
index be8560d..e8ef60b 100644
--- a/core/src/main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala
@@ -68,7 +68,7 @@ private[ui] class DriverPage(parent: MesosClusterUI) extends 
WebUIPage("driver")
 retryHeaders, retryRow, 
Iterable.apply(driverState.description.retryState))
 val content =
   Driver state information for driver id {driverId}
-Back to Drivers
+Back to Drivers
 
   
 Driver state: {driverState.state}





spark git commit: [SPARK-8083] [MESOS] Use the correct base path in mesos driver page.

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 31e0ae9e1 -> 59399a8f0


[SPARK-8083] [MESOS] Use the correct base path in mesos driver page.

Author: Timothy Chen 

Closes #6615 from tnachen/mesos_driver_path and squashes the following commits:

4f47b7c [Timothy Chen] Use the correct base path in mesos driver page.

(cherry picked from commit bfbf12b349e998c7e674649a07b88c4658ae0711)
Signed-off-by: Andrew Or 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/59399a8f
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/59399a8f
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/59399a8f

Branch: refs/heads/branch-1.4
Commit: 59399a8f0cbb962c896ade567093500af039444c
Parents: 31e0ae9
Author: Timothy Chen 
Authored: Wed Jun 3 14:57:23 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 14:58:33 2015 -0700

--
 .../main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala   | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/59399a8f/core/src/main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala 
b/core/src/main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala
index be8560d..e8ef60b 100644
--- a/core/src/main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala
+++ b/core/src/main/scala/org/apache/spark/deploy/mesos/ui/DriverPage.scala
@@ -68,7 +68,7 @@ private[ui] class DriverPage(parent: MesosClusterUI) extends 
WebUIPage("driver")
 retryHeaders, retryRow, 
Iterable.apply(driverState.description.retryState))
 val content =
   Driver state information for driver id {driverId}
-Back to Drivers
+Back to Drivers
 
   
 Driver state: {driverState.state}





spark git commit: [SPARK-8059] [YARN] Wake up allocation thread when new requests arrive.

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master bfbf12b34 -> aa40c4420


[SPARK-8059] [YARN] Wake up allocation thread when new requests arrive.

This should help reduce latency for new executor allocations.
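
A minimal, self-contained sketch of the wait/notify pattern the patch introduces
(plain Scala, assumed names, not the actual ApplicationMaster code): the
allocation thread parks on a lock with a timeout instead of calling Thread.sleep,
so an incoming executor request can wake it immediately via notifyAll().

  object AllocatorWakeupSketch {
    private val allocatorLock = new Object()
    @volatile private var running = true

    private val allocatorThread = new Thread(new Runnable {
      override def run(): Unit = {
        while (running) {
          allocatorLock.synchronized {
            allocatorLock.wait(3000)          // stand-in for the heartbeat interval
          }
          println("allocating containers")    // stand-in for allocator.allocateResources()
        }
      }
    })

    def requestExecutors(): Unit = allocatorLock.synchronized {
      allocatorLock.notifyAll()               // wake the allocation thread right away
    }

    def main(args: Array[String]): Unit = {
      allocatorThread.start()
      Thread.sleep(500)
      requestExecutors()                      // observed as an early wake-up
      running = false
      allocatorLock.synchronized { allocatorLock.notifyAll() }
      allocatorThread.join()
    }
  }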

Author: Marcelo Vanzin 

Closes #6600 from vanzin/SPARK-8059 and squashes the following commits:

8387a3a [Marcelo Vanzin] [SPARK-8059] [yarn] Wake up allocation thread when new 
requests arrive.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/aa40c442
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/aa40c442
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/aa40c442

Branch: refs/heads/master
Commit: aa40c4420717aa06a7964bd30b428fb73548beb2
Parents: bfbf12b
Author: Marcelo Vanzin 
Authored: Wed Jun 3 14:59:30 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 14:59:30 2015 -0700

--
 .../spark/deploy/yarn/ApplicationMaster.scala   | 16 +---
 .../apache/spark/deploy/yarn/YarnAllocator.scala|  7 ++-
 2 files changed, 19 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/aa40c442/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
--
diff --git 
a/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala 
b/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
index 760e458..002d7b6 100644
--- a/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
+++ b/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
@@ -67,6 +67,7 @@ private[spark] class ApplicationMaster(
 
   @volatile private var reporterThread: Thread = _
   @volatile private var allocator: YarnAllocator = _
+  private val allocatorLock = new Object()
 
   // Fields used in client mode.
   private var rpcEnv: RpcEnv = null
@@ -359,7 +360,9 @@ private[spark] class ApplicationMaster(
   }
 logDebug(s"Number of pending allocations is $numPendingAllocate. " 
+
  s"Sleeping for $sleepInterval.")
-Thread.sleep(sleepInterval)
+allocatorLock.synchronized {
+  allocatorLock.wait(sleepInterval)
+}
   } catch {
 case e: InterruptedException =>
   }
@@ -546,8 +549,15 @@ private[spark] class ApplicationMaster(
 override def receiveAndReply(context: RpcCallContext): 
PartialFunction[Any, Unit] = {
   case RequestExecutors(requestedTotal) =>
 Option(allocator) match {
-  case Some(a) => a.requestTotalExecutors(requestedTotal)
-  case None => logWarning("Container allocator is not ready to request 
executors yet.")
+  case Some(a) =>
+allocatorLock.synchronized {
+  if (a.requestTotalExecutors(requestedTotal)) {
+allocatorLock.notifyAll()
+  }
+}
+
+  case None =>
+logWarning("Container allocator is not ready to request executors 
yet.")
 }
 context.reply(true)
 

http://git-wip-us.apache.org/repos/asf/spark/blob/aa40c442/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
--
diff --git 
a/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala 
b/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
index 21193e7..940873f 100644
--- a/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
+++ b/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala
@@ -146,11 +146,16 @@ private[yarn] class YarnAllocator(
* Request as many executors from the ResourceManager as needed to reach the 
desired total. If
* the requested total is smaller than the current number of running 
executors, no executors will
* be killed.
+   *
+   * @return Whether the new requested total is different than the old value.
*/
-  def requestTotalExecutors(requestedTotal: Int): Unit = synchronized {
+  def requestTotalExecutors(requestedTotal: Int): Boolean = synchronized {
 if (requestedTotal != targetNumExecutors) {
   logInfo(s"Driver requested a total number of $requestedTotal 
executor(s).")
   targetNumExecutors = requestedTotal
+  true
+} else {
+  false
 }
   }
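
In short, the change replaces a fixed Thread.sleep in the allocation reporter loop with a timed wait on a shared lock, and notifies that lock only when a request actually changes the executor target. A minimal Scala sketch of the pattern (object, field, and method names below are illustrative, not the actual ApplicationMaster code):

object AllocationWakeupSketch {
  private val lock = new Object()
  @volatile private var target = 0
  @volatile private var running = true

  // Polling thread: wakes up after intervalMs at the latest, or immediately when notified.
  def pollLoop(intervalMs: Long): Unit = {
    while (running) {
      // ... allocate containers until `target` is reached ...
      lock.synchronized { lock.wait(intervalMs) }
    }
  }

  // Request-handler side: only notify when the target actually changed,
  // so no-op requests do not wake the polling thread.
  def requestTotal(newTotal: Int): Unit = lock.synchronized {
    if (newTotal != target) {
      target = newTotal
      lock.notifyAll()
    }
  }

  def stop(): Unit = { running = false; lock.synchronized(lock.notifyAll()) }
}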
 





spark git commit: [SPARK-8001] [CORE] Make AsynchronousListenerBus.waitUntilEmpty throw TimeoutException if timeout

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master aa40c4420 -> 1d8669f15


[SPARK-8001] [CORE] Make AsynchronousListenerBus.waitUntilEmpty throw 
TimeoutException if timeout

Some places forget to call `assert` to check the return value of 
`AsynchronousListenerBus.waitUntilEmpty`. Instead of adding `assert` in these 
places, I think it's better to make `AsynchronousListenerBus.waitUntilEmpty` 
throw `TimeoutException`.

Author: zsxwing 

Closes #6550 from zsxwing/SPARK-8001 and squashes the following commits:

607674a [zsxwing] Make AsynchronousListenerBus.waitUntilEmpty throw 
TimeoutException if timeout


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/1d8669f1
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/1d8669f1
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/1d8669f1

Branch: refs/heads/master
Commit: 1d8669f15c136cd81f494dd487400c62c9498602
Parents: aa40c44
Author: zsxwing 
Authored: Wed Jun 3 15:03:07 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 15:03:07 2015 -0700

--
 .../spark/util/AsynchronousListenerBus.scala| 11 +-
 .../spark/deploy/LogUrlsStandaloneSuite.scala   |  4 ++--
 .../spark/scheduler/DAGSchedulerSuite.scala | 18 
 .../spark/scheduler/SparkListenerSuite.scala| 22 ++--
 .../SparkListenerWithClusterSuite.scala |  2 +-
 .../spark/deploy/yarn/YarnClusterSuite.scala|  2 +-
 6 files changed, 30 insertions(+), 29 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/1d8669f1/core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala 
b/core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala
index 1861d38..61b5a4c 100644
--- a/core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala
+++ b/core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala
@@ -120,21 +120,22 @@ private[spark] abstract class AsynchronousListenerBus[L 
<: AnyRef, E](name: Stri
 
   /**
* For testing only. Wait until there are no more events in the queue, or 
until the specified
-   * time has elapsed. Return true if the queue has emptied and false is the 
specified time
-   * elapsed before the queue emptied.
+   * time has elapsed. Throw `TimeoutException` if the specified time elapsed 
before the queue
+   * emptied.
*/
   @VisibleForTesting
-  def waitUntilEmpty(timeoutMillis: Int): Boolean = {
+  @throws(classOf[TimeoutException])
+  def waitUntilEmpty(timeoutMillis: Long): Unit = {
 val finishTime = System.currentTimeMillis + timeoutMillis
 while (!queueIsEmpty) {
   if (System.currentTimeMillis > finishTime) {
-return false
+throw new TimeoutException(
+  s"The event queue is not empty after $timeoutMillis milliseconds")
   }
   /* Sleep rather than using wait/notify, because this is used only for 
testing and
* wait/notify add overhead in the general case. */
   Thread.sleep(10)
 }
-true
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/spark/blob/1d8669f1/core/src/test/scala/org/apache/spark/deploy/LogUrlsStandaloneSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/deploy/LogUrlsStandaloneSuite.scala 
b/core/src/test/scala/org/apache/spark/deploy/LogUrlsStandaloneSuite.scala
index c215b05..ddc9281 100644
--- a/core/src/test/scala/org/apache/spark/deploy/LogUrlsStandaloneSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/LogUrlsStandaloneSuite.scala
@@ -41,7 +41,7 @@ class LogUrlsStandaloneSuite extends SparkFunSuite with 
LocalSparkContext {
 // Trigger a job so that executors get added
 sc.parallelize(1 to 100, 4).map(_.toString).count()
 
-assert(sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS))
+sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS)
 listener.addedExecutorInfos.values.foreach { info =>
   assert(info.logUrlMap.nonEmpty)
   // Browse to each URL to check that it's valid
@@ -71,7 +71,7 @@ class LogUrlsStandaloneSuite extends SparkFunSuite with 
LocalSparkContext {
 // Trigger a job so that executors get added
 sc.parallelize(1 to 100, 4).map(_.toString).count()
 
-assert(sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS))
+sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS)
 val listeners = sc.listenerBus.findListenersByClass[SaveExecutorInfo]
 assert(listeners.size === 1)
 val listener = listeners(0)

http://git-wip-us.apache.org/repos/asf/spark/blob/1d8669f1/core/src/test/scala/org/apache/spark/scheduler/DAGSchedulerSuite.scala
---
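
The new contract turns a silently ignorable `false` into an exception, so a test that forgets to assert on the return value can no longer pass spuriously. A minimal sketch of the poll-or-throw shape of the method (an illustrative standalone helper, not the Spark class itself):

import java.util.concurrent.TimeoutException

// Block until `condition` holds, or fail loudly once the deadline passes.
def waitUntil(condition: () => Boolean, timeoutMillis: Long): Unit = {
  val deadline = System.currentTimeMillis() + timeoutMillis
  while (!condition()) {
    if (System.currentTimeMillis() > deadline) {
      throw new TimeoutException(s"condition not met after $timeoutMillis ms")
    }
    Thread.sleep(10)  // a plain sleep is acceptable for a test-only helper
  }
}

Callers then simply invoke the method, as the updated suites above do, instead of wrapping it in assert().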

spark git commit: [SPARK-8001] [CORE] Make AsynchronousListenerBus.waitUntilEmpty throw TimeoutException if timeout

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 59399a8f0 -> 306837e4e


[SPARK-8001] [CORE] Make AsynchronousListenerBus.waitUntilEmpty throw 
TimeoutException if timeout

Some places forget to call `assert` to check the return value of 
`AsynchronousListenerBus.waitUntilEmpty`. Instead of adding `assert` in these 
places, I think it's better to make `AsynchronousListenerBus.waitUntilEmpty` 
throw `TimeoutException`.

Author: zsxwing 

Closes #6550 from zsxwing/SPARK-8001 and squashes the following commits:

607674a [zsxwing] Make AsynchronousListenerBus.waitUntilEmpty throw 
TimeoutException if timeout

(cherry picked from commit 1d8669f15c136cd81f494dd487400c62c9498602)
Signed-off-by: Andrew Or 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/306837e4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/306837e4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/306837e4

Branch: refs/heads/branch-1.4
Commit: 306837e4e31e95ef9b66356ea13f17c2e2b2e7bd
Parents: 59399a8
Author: zsxwing 
Authored: Wed Jun 3 15:03:07 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 15:03:15 2015 -0700

--
 .../spark/util/AsynchronousListenerBus.scala| 11 +-
 .../spark/deploy/LogUrlsStandaloneSuite.scala   |  4 ++--
 .../spark/scheduler/DAGSchedulerSuite.scala | 18 
 .../spark/scheduler/SparkListenerSuite.scala| 22 ++--
 .../SparkListenerWithClusterSuite.scala |  2 +-
 .../spark/deploy/yarn/YarnClusterSuite.scala|  2 +-
 6 files changed, 30 insertions(+), 29 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/306837e4/core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala 
b/core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala
index 1861d38..61b5a4c 100644
--- a/core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala
+++ b/core/src/main/scala/org/apache/spark/util/AsynchronousListenerBus.scala
@@ -120,21 +120,22 @@ private[spark] abstract class AsynchronousListenerBus[L 
<: AnyRef, E](name: Stri
 
   /**
* For testing only. Wait until there are no more events in the queue, or 
until the specified
-   * time has elapsed. Return true if the queue has emptied and false is the 
specified time
-   * elapsed before the queue emptied.
+   * time has elapsed. Throw `TimeoutException` if the specified time elapsed 
before the queue
+   * emptied.
*/
   @VisibleForTesting
-  def waitUntilEmpty(timeoutMillis: Int): Boolean = {
+  @throws(classOf[TimeoutException])
+  def waitUntilEmpty(timeoutMillis: Long): Unit = {
 val finishTime = System.currentTimeMillis + timeoutMillis
 while (!queueIsEmpty) {
   if (System.currentTimeMillis > finishTime) {
-return false
+throw new TimeoutException(
+  s"The event queue is not empty after $timeoutMillis milliseconds")
   }
   /* Sleep rather than using wait/notify, because this is used only for 
testing and
* wait/notify add overhead in the general case. */
   Thread.sleep(10)
 }
-true
   }
 
   /**

http://git-wip-us.apache.org/repos/asf/spark/blob/306837e4/core/src/test/scala/org/apache/spark/deploy/LogUrlsStandaloneSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/deploy/LogUrlsStandaloneSuite.scala 
b/core/src/test/scala/org/apache/spark/deploy/LogUrlsStandaloneSuite.scala
index c93d16f..82f506c 100644
--- a/core/src/test/scala/org/apache/spark/deploy/LogUrlsStandaloneSuite.scala
+++ b/core/src/test/scala/org/apache/spark/deploy/LogUrlsStandaloneSuite.scala
@@ -43,7 +43,7 @@ class LogUrlsStandaloneSuite extends FunSuite with 
LocalSparkContext {
 // Trigger a job so that executors get added
 sc.parallelize(1 to 100, 4).map(_.toString).count()
 
-assert(sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS))
+sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS)
 listener.addedExecutorInfos.values.foreach { info =>
   assert(info.logUrlMap.nonEmpty)
   // Browse to each URL to check that it's valid
@@ -73,7 +73,7 @@ class LogUrlsStandaloneSuite extends FunSuite with 
LocalSparkContext {
 // Trigger a job so that executors get added
 sc.parallelize(1 to 100, 4).map(_.toString).count()
 
-assert(sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS))
+sc.listenerBus.waitUntilEmpty(WAIT_TIMEOUT_MILLIS)
 val listeners = sc.listenerBus.findListenersByClass[SaveExecutorInfo]
 assert(listeners.size === 1)
 val listener = listeners(0)

http://git-wip-us.apache.org/repos/asf/spark/blob/306837e4/

spark git commit: [SPARK-7989] [CORE] [TESTS] Fix flaky tests in ExternalShuffleServiceSuite and SparkListenerWithClusterSuite

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master 1d8669f15 -> f27134782


[SPARK-7989] [CORE] [TESTS] Fix flaky tests in ExternalShuffleServiceSuite and 
SparkListenerWithClusterSuite

The flaky tests in ExternalShuffleServiceSuite and 
SparkListenerWithClusterSuite will fail if there are not enough executors up 
before running the jobs.

This PR adds `JobProgressListener.waitUntilExecutorsUp`. The tests for the 
cluster mode can use it to wait until the expected executors are up.

Author: zsxwing 

Closes #6546 from zsxwing/SPARK-7989 and squashes the following commits:

5560e09 [zsxwing] Fix a typo
3b69840 [zsxwing] Fix flaky tests in ExternalShuffleServiceSuite and 
SparkListenerWithClusterSuite


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f2713478
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/f2713478
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/f2713478

Branch: refs/heads/master
Commit: f27134782ebb61c360330e2d6d5bb1aa02be3fb6
Parents: 1d8669f
Author: zsxwing 
Authored: Wed Jun 3 15:04:20 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 15:04:20 2015 -0700

--
 .../spark/ui/jobs/JobProgressListener.scala | 30 
 .../spark/ExternalShuffleServiceSuite.scala |  8 ++
 .../apache/spark/broadcast/BroadcastSuite.scala | 10 +--
 .../SparkListenerWithClusterSuite.scala | 10 +--
 4 files changed, 46 insertions(+), 12 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/f2713478/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala 
b/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala
index f39e961..1d31fce 100644
--- a/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala
+++ b/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala
@@ -17,8 +17,12 @@
 
 package org.apache.spark.ui.jobs
 
+import java.util.concurrent.TimeoutException
+
 import scala.collection.mutable.{HashMap, HashSet, ListBuffer}
 
+import com.google.common.annotations.VisibleForTesting
+
 import org.apache.spark._
 import org.apache.spark.annotation.DeveloperApi
 import org.apache.spark.executor.TaskMetrics
@@ -526,4 +530,30 @@ class JobProgressListener(conf: SparkConf) extends 
SparkListener with Logging {
   override def onApplicationStart(appStarted: SparkListenerApplicationStart) {
 startTime = appStarted.time
   }
+
+  /**
+   * For testing only. Wait until at least `numExecutors` executors are up, or 
throw
+   * `TimeoutException` if the waiting time elapsed before `numExecutors` 
executors up.
+   *
+   * @param numExecutors the number of executors to wait at least
+   * @param timeout time to wait in milliseconds
+   */
+  @VisibleForTesting
+  private[spark] def waitUntilExecutorsUp(numExecutors: Int, timeout: Long): 
Unit = {
+val finishTime = System.currentTimeMillis() + timeout
+while (System.currentTimeMillis() < finishTime) {
+  val numBlockManagers = synchronized {
+blockManagerIds.size
+  }
+  if (numBlockManagers >= numExecutors + 1) {
+// Need to count the block manager in driver
+return
+  }
+  // Sleep rather than using wait/notify, because this is used only for 
testing and wait/notify
+  // add overhead in the general case.
+  Thread.sleep(10)
+}
+throw new TimeoutException(
+  s"Can't find $numExecutors executors before $timeout milliseconds 
elapsed")
+  }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/f2713478/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala 
b/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
index bac6fdb..5b127a0 100644
--- a/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
+++ b/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
@@ -55,6 +55,14 @@ class ExternalShuffleServiceSuite extends ShuffleSuite with 
BeforeAndAfterAll {
 sc.env.blockManager.externalShuffleServiceEnabled should equal(true)
 sc.env.blockManager.shuffleClient.getClass should 
equal(classOf[ExternalShuffleClient])
 
+// In a slow machine, one slave may register hundreds of milliseconds 
ahead of the other one.
+// If we don't wait for all salves, it's possible that only one executor 
runs all jobs. Then
+// all shuffle blocks will be in this executor, 
ShuffleBlockFetcherIterator will directly fetch
+// local blocks from the local BlockManager and won't send req
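
The new helper is private[spark] and intended for Spark's own cluster-mode suites: block until the expected number of executors have registered, then run the job the test depends on. A hedged usage sketch (executor count, timeout, and the job itself are illustrative values):

// Inside a Spark test (org.apache.spark package), with an existing SparkContext `sc`:
sc.jobProgressListener.waitUntilExecutorsUp(numExecutors = 2, timeout = 10000L)
val counts = sc.parallelize(0 until 1000, 10).map(i => (i % 10, 1)).reduceByKey(_ + _).collect()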

spark git commit: [SPARK-7989] [CORE] [TESTS] Fix flaky tests in ExternalShuffleServiceSuite and SparkListenerWithClusterSuite

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 306837e4e -> 7e46ea022


[SPARK-7989] [CORE] [TESTS] Fix flaky tests in ExternalShuffleServiceSuite and 
SparkListenerWithClusterSuite

The flaky tests in ExternalShuffleServiceSuite and 
SparkListenerWithClusterSuite will fail if there are not enough executors up 
before running the jobs.

This PR adds `JobProgressListener.waitUntilExecutorsUp`. The tests for the 
cluster mode can use it to wait until the expected executors are up.

Author: zsxwing 

Closes #6546 from zsxwing/SPARK-7989 and squashes the following commits:

5560e09 [zsxwing] Fix a typo
3b69840 [zsxwing] Fix flaky tests in ExternalShuffleServiceSuite and 
SparkListenerWithClusterSuite

(cherry picked from commit f27134782ebb61c360330e2d6d5bb1aa02be3fb6)
Signed-off-by: Andrew Or 

Conflicts:
core/src/test/scala/org/apache/spark/broadcast/BroadcastSuite.scala

core/src/test/scala/org/apache/spark/scheduler/SparkListenerWithClusterSuite.scala


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/7e46ea02
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/7e46ea02
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/7e46ea02

Branch: refs/heads/branch-1.4
Commit: 7e46ea0228f142f6b384331d62cec8f86e61c9d1
Parents: 306837e
Author: zsxwing 
Authored: Wed Jun 3 15:04:20 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 15:05:49 2015 -0700

--
 .../spark/ui/jobs/JobProgressListener.scala | 30 
 .../spark/ExternalShuffleServiceSuite.scala |  8 ++
 .../apache/spark/broadcast/BroadcastSuite.scala |  9 +-
 .../SparkListenerWithClusterSuite.scala | 10 +--
 4 files changed, 46 insertions(+), 11 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/7e46ea02/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala 
b/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala
index f39e961..1d31fce 100644
--- a/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala
+++ b/core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala
@@ -17,8 +17,12 @@
 
 package org.apache.spark.ui.jobs
 
+import java.util.concurrent.TimeoutException
+
 import scala.collection.mutable.{HashMap, HashSet, ListBuffer}
 
+import com.google.common.annotations.VisibleForTesting
+
 import org.apache.spark._
 import org.apache.spark.annotation.DeveloperApi
 import org.apache.spark.executor.TaskMetrics
@@ -526,4 +530,30 @@ class JobProgressListener(conf: SparkConf) extends 
SparkListener with Logging {
   override def onApplicationStart(appStarted: SparkListenerApplicationStart) {
 startTime = appStarted.time
   }
+
+  /**
+   * For testing only. Wait until at least `numExecutors` executors are up, or 
throw
+   * `TimeoutException` if the waiting time elapsed before `numExecutors` 
executors up.
+   *
+   * @param numExecutors the number of executors to wait at least
+   * @param timeout time to wait in milliseconds
+   */
+  @VisibleForTesting
+  private[spark] def waitUntilExecutorsUp(numExecutors: Int, timeout: Long): 
Unit = {
+val finishTime = System.currentTimeMillis() + timeout
+while (System.currentTimeMillis() < finishTime) {
+  val numBlockManagers = synchronized {
+blockManagerIds.size
+  }
+  if (numBlockManagers >= numExecutors + 1) {
+// Need to count the block manager in driver
+return
+  }
+  // Sleep rather than using wait/notify, because this is used only for 
testing and wait/notify
+  // add overhead in the general case.
+  Thread.sleep(10)
+}
+throw new TimeoutException(
+  s"Can't find $numExecutors executors before $timeout milliseconds 
elapsed")
+  }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/7e46ea02/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala 
b/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
index bac6fdb..5b127a0 100644
--- a/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
+++ b/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
@@ -55,6 +55,14 @@ class ExternalShuffleServiceSuite extends ShuffleSuite with 
BeforeAndAfterAll {
 sc.env.blockManager.externalShuffleServiceEnabled should equal(true)
 sc.env.blockManager.shuffleClient.getClass should 
equal(classOf[ExternalShuffleClient])
 
+// In a slow machine, one slave may register hundreds of milliseconds 
ahead of 

spark git commit: [HOTFIX] Fix Hadoop-1 build caused by #5792.

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master f27134782 -> a8f1f1543


[HOTFIX] Fix Hadoop-1 build caused by #5792.

Replaced `fs.listFiles` with Hadoop-1 friendly `fs.listStatus` method.

Author: Hari Shreedharan 

Closes #6619 from harishreedharan/evetlog-hadoop-1-fix and squashes the 
following commits:

6192078 [Hari Shreedharan] [HOTFIX] Fix Hadoop-1 build caused by #5972.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/a8f1f154
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/a8f1f154
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/a8f1f154

Branch: refs/heads/master
Commit: a8f1f1543e29fb2897e9ae6940581b9e4a3a13fb
Parents: f271347
Author: Hari Shreedharan 
Authored: Wed Jun 3 15:11:02 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 15:11:02 2015 -0700

--
 .../org/apache/spark/deploy/history/FsHistoryProvider.scala  | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/a8f1f154/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala 
b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
index 52b149b..5427a88 100644
--- 
a/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
+++ 
b/core/src/main/scala/org/apache/spark/deploy/history/FsHistoryProvider.scala
@@ -255,12 +255,12 @@ private[history] class FsHistoryProvider(conf: SparkConf, 
clock: Clock)
 // If this is a legacy directory, then add the directory to the 
zipStream and add
 // each file to that directory.
 if (isLegacyLogDirectory(fs.getFileStatus(logPath))) {
-  val files = fs.listFiles(logPath, false)
+  val files = fs.listStatus(logPath)
   zipStream.putNextEntry(new ZipEntry(attempt.logPath + "/"))
   zipStream.closeEntry()
-  while (files.hasNext) {
-val file = files.next().getPath
-zipFileToStream(file, attempt.logPath + Path.SEPARATOR + 
file.getName, zipStream)
+  files.foreach { file =>
+val path = file.getPath
+zipFileToStream(path, attempt.logPath + Path.SEPARATOR + 
path.getName, zipStream)
   }
 } else {
   zipFileToStream(new Path(logDir, attempt.logPath), 
attempt.logPath, zipStream)
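
Per the commit message, the incompatibility is that fs.listFiles (which returns a RemoteIterator) is not available on Hadoop 1.x, while fs.listStatus is available on both lines. A small sketch of the compatible, non-recursive listing (a hypothetical helper, not the FsHistoryProvider code):

import org.apache.hadoop.fs.{FileSystem, Path}

// Non-recursive listing that compiles against both Hadoop 1.x and 2.x.
def filesUnder(fs: FileSystem, dir: Path): Seq[Path] =
  fs.listStatus(dir).map(_.getPath).toSeq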





spark git commit: [SPARK-3674] [EC2] Clear SPARK_WORKER_INSTANCES when using YARN

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master a8f1f1543 -> d3e026f87


[SPARK-3674] [EC2] Clear SPARK_WORKER_INSTANCES when using YARN

cc andrewor14

Author: Shivaram Venkataraman 

Closes #6424 from shivaram/spark-worker-instances-yarn-ec2 and squashes the 
following commits:

db244ae [Shivaram Venkataraman] Make Python Lint happy
0593d1b [Shivaram Venkataraman] Clear SPARK_WORKER_INSTANCES when using YARN


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d3e026f8
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d3e026f8
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d3e026f8

Branch: refs/heads/master
Commit: d3e026f8798f9875b90e8c372056ee3d71489be5
Parents: a8f1f15
Author: Shivaram Venkataraman 
Authored: Wed Jun 3 15:14:38 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 15:14:38 2015 -0700

--
 ec2/spark_ec2.py | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/d3e026f8/ec2/spark_ec2.py
--
diff --git a/ec2/spark_ec2.py b/ec2/spark_ec2.py
index ee0904c..84629cb 100755
--- a/ec2/spark_ec2.py
+++ b/ec2/spark_ec2.py
@@ -219,7 +219,8 @@ def parse_args():
  "(default: %default).")
 parser.add_option(
 "--hadoop-major-version", default="1",
-help="Major version of Hadoop (default: %default)")
+help="Major version of Hadoop. Valid options are 1 (Hadoop 1.0.4), 2 
(CDH 4.2.0), yarn " +
+ "(Hadoop 2.4.0) (default: %default)")
 parser.add_option(
 "-D", metavar="[ADDRESS:]PORT", dest="proxy_port",
 help="Use SSH dynamic port forwarding to create a SOCKS proxy at " +
@@ -271,7 +272,8 @@ def parse_args():
 help="Launch fresh slaves, but use an existing stopped master if 
possible")
 parser.add_option(
 "--worker-instances", type="int", default=1,
-help="Number of instances per worker: variable SPARK_WORKER_INSTANCES 
(default: %default)")
+help="Number of instances per worker: variable SPARK_WORKER_INSTANCES. 
Not used if YARN " +
+ "is used as Hadoop major version (default: %default)")
 parser.add_option(
 "--master-opts", type="string", default="",
 help="Extra options to give to master through SPARK_MASTER_OPTS 
variable " +
@@ -761,6 +763,10 @@ def setup_cluster(conn, master_nodes, slave_nodes, opts, 
deploy_ssh_key):
 if opts.ganglia:
 modules.append('ganglia')
 
+# Clear SPARK_WORKER_INSTANCES if running on YARN
+if opts.hadoop_major_version == "yarn":
+opts.worker_instances = ""
+
 # NOTE: We should clone the repository before running deploy_files to
 # prevent ec2-variables.sh from being overwritten
 print("Cloning spark-ec2 scripts from {r}/tree/{b} on master...".format(
@@ -998,6 +1004,7 @@ def deploy_files(conn, root_dir, opts, master_nodes, 
slave_nodes, modules):
 
 master_addresses = [get_dns_name(i, opts.private_ips) for i in 
master_nodes]
 slave_addresses = [get_dns_name(i, opts.private_ips) for i in slave_nodes]
+worker_instances_str = "%d" % opts.worker_instances if 
opts.worker_instances else ""
 template_vars = {
 "master_list": '\n'.join(master_addresses),
 "active_master": active_master,
@@ -1011,7 +1018,7 @@ def deploy_files(conn, root_dir, opts, master_nodes, 
slave_nodes, modules):
 "spark_version": spark_v,
 "tachyon_version": tachyon_v,
 "hadoop_major_version": opts.hadoop_major_version,
-"spark_worker_instances": "%d" % opts.worker_instances,
+"spark_worker_instances": worker_instances_str,
 "spark_master_opts": opts.master_opts
 }
 





spark git commit: [SPARK-3674] [EC2] Clear SPARK_WORKER_INSTANCES when using YARN

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 7e46ea022 -> ca21fff7d


[SPARK-3674] [EC2] Clear SPARK_WORKER_INSTANCES when using YARN

cc andrewor14

Author: Shivaram Venkataraman 

Closes #6424 from shivaram/spark-worker-instances-yarn-ec2 and squashes the 
following commits:

db244ae [Shivaram Venkataraman] Make Python Lint happy
0593d1b [Shivaram Venkataraman] Clear SPARK_WORKER_INSTANCES when using YARN

(cherry picked from commit d3e026f8798f9875b90e8c372056ee3d71489be5)
Signed-off-by: Andrew Or 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ca21fff7
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ca21fff7
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ca21fff7

Branch: refs/heads/branch-1.4
Commit: ca21fff7dad14da9549dfdfcb35de627dad99ff8
Parents: 7e46ea0
Author: Shivaram Venkataraman 
Authored: Wed Jun 3 15:14:38 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 15:14:44 2015 -0700

--
 ec2/spark_ec2.py | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/ca21fff7/ec2/spark_ec2.py
--
diff --git a/ec2/spark_ec2.py b/ec2/spark_ec2.py
index 2154046..4e0d7cd 100755
--- a/ec2/spark_ec2.py
+++ b/ec2/spark_ec2.py
@@ -220,7 +220,8 @@ def parse_args():
  "(default: %default).")
 parser.add_option(
 "--hadoop-major-version", default="1",
-help="Major version of Hadoop (default: %default)")
+help="Major version of Hadoop. Valid options are 1 (Hadoop 1.0.4), 2 
(CDH 4.2.0), yarn " +
+ "(Hadoop 2.4.0) (default: %default)")
 parser.add_option(
 "-D", metavar="[ADDRESS:]PORT", dest="proxy_port",
 help="Use SSH dynamic port forwarding to create a SOCKS proxy at " +
@@ -272,7 +273,8 @@ def parse_args():
 help="Launch fresh slaves, but use an existing stopped master if 
possible")
 parser.add_option(
 "--worker-instances", type="int", default=1,
-help="Number of instances per worker: variable SPARK_WORKER_INSTANCES 
(default: %default)")
+help="Number of instances per worker: variable SPARK_WORKER_INSTANCES. 
Not used if YARN " +
+ "is used as Hadoop major version (default: %default)")
 parser.add_option(
 "--master-opts", type="string", default="",
 help="Extra options to give to master through SPARK_MASTER_OPTS 
variable " +
@@ -762,6 +764,10 @@ def setup_cluster(conn, master_nodes, slave_nodes, opts, 
deploy_ssh_key):
 if opts.ganglia:
 modules.append('ganglia')
 
+# Clear SPARK_WORKER_INSTANCES if running on YARN
+if opts.hadoop_major_version == "yarn":
+opts.worker_instances = ""
+
 # NOTE: We should clone the repository before running deploy_files to
 # prevent ec2-variables.sh from being overwritten
 print("Cloning spark-ec2 scripts from {r}/tree/{b} on master...".format(
@@ -995,6 +1001,7 @@ def deploy_files(conn, root_dir, opts, master_nodes, 
slave_nodes, modules):
 
 master_addresses = [get_dns_name(i, opts.private_ips) for i in 
master_nodes]
 slave_addresses = [get_dns_name(i, opts.private_ips) for i in slave_nodes]
+worker_instances_str = "%d" % opts.worker_instances if 
opts.worker_instances else ""
 template_vars = {
 "master_list": '\n'.join(master_addresses),
 "active_master": active_master,
@@ -1008,7 +1015,7 @@ def deploy_files(conn, root_dir, opts, master_nodes, 
slave_nodes, modules):
 "spark_version": spark_v,
 "tachyon_version": tachyon_v,
 "hadoop_major_version": opts.hadoop_major_version,
-"spark_worker_instances": "%d" % opts.worker_instances,
+"spark_worker_instances": worker_instances_str,
 "spark_master_opts": opts.master_opts
 }
 





spark git commit: [SPARK-8051] [MLLIB] make StringIndexerModel silent if input column does not exist

2015-06-03 Thread jkbradley
Repository: spark
Updated Branches:
  refs/heads/master d3e026f87 -> 26c9d7a0f


[SPARK-8051] [MLLIB] make StringIndexerModel silent if input column does not 
exist

This is just a workaround to a bigger problem. Some pipeline stages may not be 
effective during prediction, and they should not complain about missing 
required columns, e.g. `StringIndexerModel`. jkbradley

Author: Xiangrui Meng 

Closes #6595 from mengxr/SPARK-8051 and squashes the following commits:

b6a36b9 [Xiangrui Meng] add doc
f143fd4 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into 
SPARK-8051
8ee7c7e [Xiangrui Meng] use SparkFunSuite
e112394 [Xiangrui Meng] make StringIndexerModel silent if input column does not 
exist


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/26c9d7a0
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/26c9d7a0
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/26c9d7a0

Branch: refs/heads/master
Commit: 26c9d7a0f975009e22ec91e5c0b5cfcada79b35e
Parents: d3e026f
Author: Xiangrui Meng 
Authored: Wed Jun 3 15:16:24 2015 -0700
Committer: Joseph K. Bradley 
Committed: Wed Jun 3 15:16:24 2015 -0700

--
 .../org/apache/spark/ml/feature/StringIndexer.scala | 16 +++-
 .../spark/ml/feature/StringIndexerSuite.scala   |  8 
 2 files changed, 23 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/26c9d7a0/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
index a2dc8a8..f4e2507 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
@@ -88,6 +88,9 @@ class StringIndexer(override val uid: String) extends 
Estimator[StringIndexerMod
 /**
  * :: Experimental ::
  * Model fitted by [[StringIndexer]].
+ * NOTE: During transformation, if the input column does not exist,
+ * [[StringIndexerModel.transform]] would return the input dataset unmodified.
+ * This is a temporary fix for the case when target labels do not exist during 
prediction.
  */
 @Experimental
 class StringIndexerModel private[ml] (
@@ -112,6 +115,12 @@ class StringIndexerModel private[ml] (
   def setOutputCol(value: String): this.type = set(outputCol, value)
 
   override def transform(dataset: DataFrame): DataFrame = {
+if (!dataset.schema.fieldNames.contains($(inputCol))) {
+  logInfo(s"Input column ${$(inputCol)} does not exist during 
transformation. " +
+"Skip StringIndexerModel.")
+  return dataset
+}
+
 val indexer = udf { label: String =>
   if (labelToIndex.contains(label)) {
 labelToIndex(label)
@@ -128,6 +137,11 @@ class StringIndexerModel private[ml] (
   }
 
   override def transformSchema(schema: StructType): StructType = {
-validateAndTransformSchema(schema)
+if (schema.fieldNames.contains($(inputCol))) {
+  validateAndTransformSchema(schema)
+} else {
+  // If the input column does not exist during transformation, we skip 
StringIndexerModel.
+  schema
+}
   }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/26c9d7a0/mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala 
b/mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala
index cbf1e8d..5f557e1 100644
--- a/mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala
@@ -60,4 +60,12 @@ class StringIndexerSuite extends SparkFunSuite with 
MLlibTestSparkContext {
 val expected = Set((0, 0.0), (1, 2.0), (2, 1.0), (3, 0.0), (4, 0.0), (5, 
1.0))
 assert(output === expected)
   }
+
+  test("StringIndexerModel should keep silent if the input column does not 
exist.") {
+val indexerModel = new StringIndexerModel("indexer", Array("a", "b", "c"))
+  .setInputCol("label")
+  .setOutputCol("labelIndex")
+val df = sqlContext.range(0L, 10L)
+assert(indexerModel.transform(df).eq(df))
+  }
 }
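
The practical effect is that a Pipeline fitted with a label indexer can now score unlabeled data: when the fitted StringIndexerModel sees no input column it passes the DataFrame through untouched instead of failing schema validation. A hedged sketch of that scenario (column names and data are illustrative, and `sqlContext` is assumed to be an existing SQLContext, e.g. the spark-shell one):

import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.{StringIndexer, VectorAssembler}
import sqlContext.implicits._

val trainingDf  = Seq(("a", 1.0, 2.0), ("b", 3.0, 4.0)).toDF("label", "f1", "f2")
val unlabeledDf = Seq((5.0, 6.0), (7.0, 8.0)).toDF("f1", "f2")   // no "label" column

val indexer   = new StringIndexer().setInputCol("label").setOutputCol("labelIndex")
val assembler = new VectorAssembler().setInputCols(Array("f1", "f2")).setOutputCol("features")
val model     = new Pipeline().setStages(Array(indexer, assembler)).fit(trainingDf)

// Before this change the fitted indexer's schema check failed here; now it is skipped
// and only the assembler runs.
val scored = model.transform(unlabeledDf)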





spark git commit: [SPARK-8051] [MLLIB] make StringIndexerModel silent if input column does not exist

2015-06-03 Thread jkbradley
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 ca21fff7d -> b2a22a651


[SPARK-8051] [MLLIB] make StringIndexerModel silent if input column does not 
exist

This is just a workaround to a bigger problem. Some pipeline stages may not be 
effective during prediction, and they should not complain about missing 
required columns, e.g. `StringIndexerModel`. jkbradley

Author: Xiangrui Meng 

Closes #6595 from mengxr/SPARK-8051 and squashes the following commits:

b6a36b9 [Xiangrui Meng] add doc
f143fd4 [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into 
SPARK-8051
8ee7c7e [Xiangrui Meng] use SparkFunSuite
e112394 [Xiangrui Meng] make StringIndexerModel silent if input column does not 
exist

(cherry picked from commit 26c9d7a0f975009e22ec91e5c0b5cfcada79b35e)
Signed-off-by: Joseph K. Bradley 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/b2a22a65
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/b2a22a65
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/b2a22a65

Branch: refs/heads/branch-1.4
Commit: b2a22a651f9d86ba85c78058c42402e7fdb3c4f1
Parents: ca21fff
Author: Xiangrui Meng 
Authored: Wed Jun 3 15:16:24 2015 -0700
Committer: Joseph K. Bradley 
Committed: Wed Jun 3 15:16:36 2015 -0700

--
 .../org/apache/spark/ml/feature/StringIndexer.scala | 16 +++-
 .../spark/ml/feature/StringIndexerSuite.scala   |  8 
 2 files changed, 23 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/b2a22a65/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala 
b/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
index a2dc8a8..f4e2507 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala
@@ -88,6 +88,9 @@ class StringIndexer(override val uid: String) extends 
Estimator[StringIndexerMod
 /**
  * :: Experimental ::
  * Model fitted by [[StringIndexer]].
+ * NOTE: During transformation, if the input column does not exist,
+ * [[StringIndexerModel.transform]] would return the input dataset unmodified.
+ * This is a temporary fix for the case when target labels do not exist during 
prediction.
  */
 @Experimental
 class StringIndexerModel private[ml] (
@@ -112,6 +115,12 @@ class StringIndexerModel private[ml] (
   def setOutputCol(value: String): this.type = set(outputCol, value)
 
   override def transform(dataset: DataFrame): DataFrame = {
+if (!dataset.schema.fieldNames.contains($(inputCol))) {
+  logInfo(s"Input column ${$(inputCol)} does not exist during 
transformation. " +
+"Skip StringIndexerModel.")
+  return dataset
+}
+
 val indexer = udf { label: String =>
   if (labelToIndex.contains(label)) {
 labelToIndex(label)
@@ -128,6 +137,11 @@ class StringIndexerModel private[ml] (
   }
 
   override def transformSchema(schema: StructType): StructType = {
-validateAndTransformSchema(schema)
+if (schema.fieldNames.contains($(inputCol))) {
+  validateAndTransformSchema(schema)
+} else {
+  // If the input column does not exist during transformation, we skip 
StringIndexerModel.
+  schema
+}
   }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/b2a22a65/mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala 
b/mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala
index 89c2fe4..5184863 100644
--- a/mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala
@@ -61,4 +61,12 @@ class StringIndexerSuite extends FunSuite with 
MLlibTestSparkContext {
 val expected = Set((0, 0.0), (1, 2.0), (2, 1.0), (3, 0.0), (4, 0.0), (5, 
1.0))
 assert(output === expected)
   }
+
+  test("StringIndexerModel should keep silent if the input column does not 
exist.") {
+val indexerModel = new StringIndexerModel("indexer", Array("a", "b", "c"))
+  .setInputCol("label")
+  .setOutputCol("labelIndex")
+val df = sqlContext.range(0L, 10L)
+assert(indexerModel.transform(df).eq(df))
+  }
 }





spark git commit: [HOTFIX] Unbreak build from backporting #6546

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 b2a22a651 -> d0be9508f


[HOTFIX] Unbreak build from backporting #6546

This is caused by 7e46ea0228f142f6b384331d62cec8f86e61c9d1.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d0be9508
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d0be9508
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d0be9508

Branch: refs/heads/branch-1.4
Commit: d0be9508f50883c25456034d08343488928e6565
Parents: b2a22a6
Author: Andrew Or 
Authored: Wed Jun 3 15:25:35 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 15:25:35 2015 -0700

--
 .../org/apache/spark/scheduler/SparkListenerWithClusterSuite.scala | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/d0be9508/core/src/test/scala/org/apache/spark/scheduler/SparkListenerWithClusterSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/scheduler/SparkListenerWithClusterSuite.scala
 
b/core/src/test/scala/org/apache/spark/scheduler/SparkListenerWithClusterSuite.scala
index 2ce1b5b..1b4257d 100644
--- 
a/core/src/test/scala/org/apache/spark/scheduler/SparkListenerWithClusterSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/scheduler/SparkListenerWithClusterSuite.scala
@@ -21,7 +21,7 @@ import scala.collection.mutable
 
 import org.scalatest.{FunSuite, BeforeAndAfter, BeforeAndAfterAll}
 
-import org.apache.spark.{LocalSparkContext, SparkContext, SparkFunSuite}
+import org.apache.spark.{LocalSparkContext, SparkContext}
 import org.apache.spark.scheduler.cluster.ExecutorInfo
 
 /**





spark git commit: [SPARK-6164] [ML] CrossValidatorModel should keep stats from fitting

2015-06-03 Thread jkbradley
Repository: spark
Updated Branches:
  refs/heads/master 26c9d7a0f -> d8662cd90


[SPARK-6164] [ML] CrossValidatorModel should keep stats from fitting

Added stats from cross validation as a val in the cross validation model to 
save them for user access.

Author: leahmcguire 

Closes #5915 from leahmcguire/saveCVmetrics and squashes the following commits:

49b507b [leahmcguire] fixed tyle error
67537b1 [leahmcguire] rebased
85907f0 [leahmcguire] fixed name
59987cc [leahmcguire] changed param name and test according to comments
36e71e3 [leahmcguire] rebasing
4b8223e [leahmcguire] fixed name
4ddffc6 [leahmcguire] changed param name and test according to comments
3a995da [leahmcguire] Added stats from cross validation as a val in the cross 
validation model to save them for user access


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d8662cd9
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d8662cd9
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d8662cd9

Branch: refs/heads/master
Commit: d8662cd909a41575df6e0ea1630d2386d3711240
Parents: 26c9d7a
Author: leahmcguire 
Authored: Wed Jun 3 15:46:38 2015 -0700
Committer: Joseph K. Bradley 
Committed: Wed Jun 3 15:46:38 2015 -0700

--
 .../scala/org/apache/spark/ml/tuning/CrossValidator.scala | 10 +++---
 .../org/apache/spark/ml/tuning/CrossValidatorSuite.scala  |  1 +
 2 files changed, 8 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/d8662cd9/mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala
--
diff --git 
a/mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala 
b/mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala
index 6434b64..cb29392 100644
--- a/mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala
+++ b/mllib/src/main/scala/org/apache/spark/ml/tuning/CrossValidator.scala
@@ -135,7 +135,7 @@ class CrossValidator(override val uid: String) extends 
Estimator[CrossValidatorM
 logInfo(s"Best set of parameters:\n${epm(bestIndex)}")
 logInfo(s"Best cross-validation metric: $bestMetric.")
 val bestModel = est.fit(dataset, epm(bestIndex)).asInstanceOf[Model[_]]
-copyValues(new CrossValidatorModel(uid, bestModel).setParent(this))
+copyValues(new CrossValidatorModel(uid, bestModel, 
metrics).setParent(this))
   }
 
   override def transformSchema(schema: StructType): StructType = {
@@ -158,7 +158,8 @@ class CrossValidator(override val uid: String) extends 
Estimator[CrossValidatorM
 @Experimental
 class CrossValidatorModel private[ml] (
 override val uid: String,
-val bestModel: Model[_])
+val bestModel: Model[_],
+val avgMetrics: Array[Double])
   extends Model[CrossValidatorModel] with CrossValidatorParams {
 
   override def validateParams(): Unit = {
@@ -175,7 +176,10 @@ class CrossValidatorModel private[ml] (
   }
 
   override def copy(extra: ParamMap): CrossValidatorModel = {
-val copied = new CrossValidatorModel(uid, 
bestModel.copy(extra).asInstanceOf[Model[_]])
+val copied = new CrossValidatorModel(
+  uid,
+  bestModel.copy(extra).asInstanceOf[Model[_]],
+  avgMetrics.clone())
 copyValues(copied, extra)
   }
 }

http://git-wip-us.apache.org/repos/asf/spark/blob/d8662cd9/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala 
b/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
index 5ba469c..9b3619f 100644
--- a/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
@@ -56,6 +56,7 @@ class CrossValidatorSuite extends SparkFunSuite with 
MLlibTestSparkContext {
 val parent = cvModel.bestModel.parent.asInstanceOf[LogisticRegression]
 assert(parent.getRegParam === 0.001)
 assert(parent.getMaxIter === 10)
+assert(cvModel.avgMetrics.length === lrParamMaps.length)
   }
 
   test("validateParams should check estimatorParamMaps") {





spark git commit: [HOTFIX] [TYPO] Fix typo in #6546

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master d8662cd90 -> bfbdab12d


[HOTFIX] [TYPO] Fix typo in #6546


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bfbdab12
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bfbdab12
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bfbdab12

Branch: refs/heads/master
Commit: bfbdab12dd37587e5518dcbb76507b752759cace
Parents: d8662cd
Author: Andrew Or 
Authored: Wed Jun 3 16:04:02 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 16:04:02 2015 -0700

--
 .../scala/org/apache/spark/ExternalShuffleServiceSuite.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/bfbdab12/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala 
b/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
index 5b127a0..1400122 100644
--- a/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
+++ b/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
@@ -56,11 +56,11 @@ class ExternalShuffleServiceSuite extends ShuffleSuite with 
BeforeAndAfterAll {
 sc.env.blockManager.shuffleClient.getClass should 
equal(classOf[ExternalShuffleClient])
 
 // In a slow machine, one slave may register hundreds of milliseconds 
ahead of the other one.
-// If we don't wait for all salves, it's possible that only one executor 
runs all jobs. Then
+// If we don't wait for all slaves, it's possible that only one executor 
runs all jobs. Then
 // all shuffle blocks will be in this executor, 
ShuffleBlockFetcherIterator will directly fetch
 // local blocks from the local BlockManager and won't send requests to 
ExternalShuffleService.
 // In this case, we won't receive FetchFailed. And it will make this test 
fail.
-// Therefore, we should wait until all salves are up
+// Therefore, we should wait until all slaves are up
 sc.jobProgressListener.waitUntilExecutorsUp(2, 1)
 
 val rdd = sc.parallelize(0 until 1000, 10).map(i => (i, 1)).reduceByKey(_ 
+ _)





spark git commit: [HOTFIX] [TYPO] Fix typo in #6546

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 d0be9508f -> 0bc9a3ec4


[HOTFIX] [TYPO] Fix typo in #6546


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0bc9a3ec
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0bc9a3ec
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0bc9a3ec

Branch: refs/heads/branch-1.4
Commit: 0bc9a3ec42b31d50f25b9307630b281fd7122418
Parents: d0be950
Author: Andrew Or 
Authored: Wed Jun 3 16:04:02 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 16:04:35 2015 -0700

--
 .../scala/org/apache/spark/ExternalShuffleServiceSuite.scala | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/0bc9a3ec/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala 
b/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
index 5b127a0..1400122 100644
--- a/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
+++ b/core/src/test/scala/org/apache/spark/ExternalShuffleServiceSuite.scala
@@ -56,11 +56,11 @@ class ExternalShuffleServiceSuite extends ShuffleSuite with 
BeforeAndAfterAll {
 sc.env.blockManager.shuffleClient.getClass should 
equal(classOf[ExternalShuffleClient])
 
 // In a slow machine, one slave may register hundreds of milliseconds 
ahead of the other one.
-// If we don't wait for all salves, it's possible that only one executor 
runs all jobs. Then
+// If we don't wait for all slaves, it's possible that only one executor 
runs all jobs. Then
 // all shuffle blocks will be in this executor, 
ShuffleBlockFetcherIterator will directly fetch
 // local blocks from the local BlockManager and won't send requests to 
ExternalShuffleService.
 // In this case, we won't receive FetchFailed. And it will make this test 
fail.
-// Therefore, we should wait until all salves are up
+// Therefore, we should wait until all slaves are up
 sc.jobProgressListener.waitUntilExecutorsUp(2, 1)
 
 val rdd = sc.parallelize(0 until 1000, 10).map(i => (i, 1)).reduceByKey(_ 
+ _)





spark git commit: [HOTFIX] History Server API docs error fix.

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master bfbdab12d -> 566cb5947


[HOTFIX] History Server API docs error fix.

Minor error in the monitoring docs. Also made indentation changes in 
`ApiRootResource`

Author: Hari Shreedharan 

Closes #6628 from harishreedharan/eventlog-formatting and squashes the 
following commits:

a12553d [Hari Shreedharan] Javadoc updates.
ca399b6 [Hari Shreedharan] [HOTFIX] History Server API docs error fix.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/566cb594
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/566cb594
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/566cb594

Branch: refs/heads/master
Commit: 566cb5947925c79ef90af72346672ab7d27bf4df
Parents: bfbdab1
Author: Hari Shreedharan 
Authored: Wed Jun 3 16:53:57 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 16:53:57 2015 -0700

--
 .../org/apache/spark/status/api/v1/ApiRootResource.scala  | 10 +++---
 .../spark/status/api/v1/EventLogDownloadResource.scala|  2 +-
 docs/monitoring.md|  2 +-
 3 files changed, 9 insertions(+), 5 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/566cb594/core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala 
b/core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala
index 9af90ee..50b6ba6 100644
--- a/core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala
+++ b/core/src/main/scala/org/apache/spark/status/api/v1/ApiRootResource.scala
@@ -167,14 +167,14 @@ private[v1] class ApiRootResource extends 
UIRootFromServletContext {
 
   @Path("applications/{appId}/logs")
   def getEventLogs(
-@PathParam("appId") appId: String): EventLogDownloadResource = {
+  @PathParam("appId") appId: String): EventLogDownloadResource = {
 new EventLogDownloadResource(uiRoot, appId, None)
   }
 
   @Path("applications/{appId}/{attemptId}/logs")
   def getEventLogs(
-@PathParam("appId") appId: String,
-@PathParam("attemptId") attemptId: String): EventLogDownloadResource = {
+  @PathParam("appId") appId: String,
+  @PathParam("attemptId") attemptId: String): EventLogDownloadResource = {
 new EventLogDownloadResource(uiRoot, appId, Some(attemptId))
   }
 }
@@ -206,6 +206,10 @@ private[spark] trait UIRoot {
   def getSparkUI(appKey: String): Option[SparkUI]
   def getApplicationInfoList: Iterator[ApplicationInfo]
 
+  /**
+   * Write the event logs for the given app to the [[ZipOutputStream]] 
instance. If attemptId is
+   * [[None]], event logs for all attempts of this application will be written 
out.
+   */
   def writeEventLogs(appId: String, attemptId: Option[String], zipStream: 
ZipOutputStream): Unit = {
 Response.serverError()
   .entity("Event logs are only available through the history server.")

http://git-wip-us.apache.org/repos/asf/spark/blob/566cb594/core/src/main/scala/org/apache/spark/status/api/v1/EventLogDownloadResource.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/status/api/v1/EventLogDownloadResource.scala
 
b/core/src/main/scala/org/apache/spark/status/api/v1/EventLogDownloadResource.scala
index d416dba..22e21f0 100644
--- 
a/core/src/main/scala/org/apache/spark/status/api/v1/EventLogDownloadResource.scala
+++ 
b/core/src/main/scala/org/apache/spark/status/api/v1/EventLogDownloadResource.scala
@@ -44,7 +44,7 @@ private[v1] class EventLogDownloadResource(
   }
 
   val stream = new StreamingOutput {
-override def write(output: OutputStream) = {
+override def write(output: OutputStream): Unit = {
   val zipStream = new ZipOutputStream(output)
   try {
 uIRoot.writeEventLogs(appId, attemptId, zipStream)

http://git-wip-us.apache.org/repos/asf/spark/blob/566cb594/docs/monitoring.md
--
diff --git a/docs/monitoring.md b/docs/monitoring.md
index 31ecddc..bcf885f 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -233,7 +233,7 @@ for a running application, at 
`http://localhost:4040/api/v1`.
 Download the event logs for all attempts of the given application as a 
zip file
   
   
-/applications/[app-id]/[attempt-id/logs
+/applications/[app-id]/[attempt-id]/logs
 Download the event logs for the specified attempt of the given 
application as a zip file
   
 



spark git commit: [SPARK-8088] don't attempt to lower number of executors by 0

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 0bc9a3ec4 -> 16748694b


[SPARK-8088] don't attempt to lower number of executors by 0

Author: Ryan Williams 

Closes #6624 from ryan-williams/execs and squashes the following commits:

b6f71d4 [Ryan Williams] don't attempt to lower number of executors by 0

(cherry picked from commit 51898b5158ac7e7e67b0539bc062c9c16ce9a7ce)
Signed-off-by: Andrew Or 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/16748694
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/16748694
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/16748694

Branch: refs/heads/branch-1.4
Commit: 16748694b8e80c1b4927d88be21068d70ed59b9a
Parents: 0bc9a3e
Author: Ryan Williams 
Authored: Wed Jun 3 16:54:46 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 16:54:52 2015 -0700

--
 .../org/apache/spark/ExecutorAllocationManager.scala  | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/16748694/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala 
b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
index 9514604..f7323a4 100644
--- a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
+++ b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
@@ -266,10 +266,14 @@ private[spark] class ExecutorAllocationManager(
   // executors and inform the cluster manager to cancel the extra pending 
requests
   val oldNumExecutorsTarget = numExecutorsTarget
   numExecutorsTarget = math.max(maxNeeded, minNumExecutors)
-  client.requestTotalExecutors(numExecutorsTarget)
   numExecutorsToAdd = 1
-  logInfo(s"Lowering target number of executors to $numExecutorsTarget 
because " +
-s"not all requests are actually needed (previously 
$oldNumExecutorsTarget)")
+
+  // If the new target has not changed, avoid sending a message to the 
cluster manager
+  if (numExecutorsTarget < oldNumExecutorsTarget) {
+client.requestTotalExecutors(numExecutorsTarget)
+logInfo(s"Lowering target number of executors to $numExecutorsTarget 
(previously " +
+  s"$oldNumExecutorsTarget) because not all requested executors are 
actually needed")
+  }
   numExecutorsTarget - oldNumExecutorsTarget
 } else if (addTime != NOT_SET && now >= addTime) {
   val delta = addExecutors(maxNeeded)
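
The guard added above only contacts the cluster manager when the target actually decreases, so a no-op "lower by 0" request is never sent. A simplified sketch of the same logic follows; the Client trait and TargetTracker class are hypothetical stand-ins, not the real ExecutorAllocationClient or ExecutorAllocationManager APIs.

// A hypothetical, simplified model of the guarded update above; the real
// ExecutorAllocationManager tracks far more state than this.
trait Client {
  def requestTotalExecutors(numExecutors: Int): Boolean
}

class TargetTracker(client: Client, minNumExecutors: Int, initialTarget: Int) {
  private var numExecutorsTarget = initialTarget

  /** Lower the target to what is actually needed, skipping no-op requests. */
  def lowerTarget(maxNeeded: Int): Int = {
    val oldNumExecutorsTarget = numExecutorsTarget
    numExecutorsTarget = math.max(maxNeeded, minNumExecutors)

    // Only contact the cluster manager when the target really decreased;
    // re-requesting the same total would be a pointless round trip.
    if (numExecutorsTarget < oldNumExecutorsTarget) {
      client.requestTotalExecutors(numExecutorsTarget)
      println(s"Lowering target number of executors to $numExecutorsTarget " +
        s"(previously $oldNumExecutorsTarget)")
    }
    numExecutorsTarget - oldNumExecutorsTarget
  }
}

object TargetTrackerDemo {
  def main(args: Array[String]): Unit = {
    val client = new Client {
      def requestTotalExecutors(numExecutors: Int): Boolean = {
        println(s"cluster manager asked for $numExecutors executors")
        true
      }
    }
    val tracker = new TargetTracker(client, minNumExecutors = 1, initialTarget = 10)
    println(tracker.lowerTarget(maxNeeded = 4)) // sends a request, returns -6
    println(tracker.lowerTarget(maxNeeded = 4)) // target unchanged, no request, returns 0
  }
}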





spark git commit: [SPARK-8088] don't attempt to lower number of executors by 0

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master 566cb5947 -> 51898b515


[SPARK-8088] don't attempt to lower number of executors by 0

Author: Ryan Williams 

Closes #6624 from ryan-williams/execs and squashes the following commits:

b6f71d4 [Ryan Williams] don't attempt to lower number of executors by 0


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/51898b51
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/51898b51
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/51898b51

Branch: refs/heads/master
Commit: 51898b5158ac7e7e67b0539bc062c9c16ce9a7ce
Parents: 566cb59
Author: Ryan Williams 
Authored: Wed Jun 3 16:54:46 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 16:54:46 2015 -0700

--
 .../org/apache/spark/ExecutorAllocationManager.scala  | 10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/51898b51/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
--
diff --git 
a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala 
b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
index 9514604..f7323a4 100644
--- a/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
+++ b/core/src/main/scala/org/apache/spark/ExecutorAllocationManager.scala
@@ -266,10 +266,14 @@ private[spark] class ExecutorAllocationManager(
   // executors and inform the cluster manager to cancel the extra pending 
requests
   val oldNumExecutorsTarget = numExecutorsTarget
   numExecutorsTarget = math.max(maxNeeded, minNumExecutors)
-  client.requestTotalExecutors(numExecutorsTarget)
   numExecutorsToAdd = 1
-  logInfo(s"Lowering target number of executors to $numExecutorsTarget 
because " +
-s"not all requests are actually needed (previously 
$oldNumExecutorsTarget)")
+
+  // If the new target has not changed, avoid sending a message to the 
cluster manager
+  if (numExecutorsTarget < oldNumExecutorsTarget) {
+client.requestTotalExecutors(numExecutorsTarget)
+logInfo(s"Lowering target number of executors to $numExecutorsTarget 
(previously " +
+  s"$oldNumExecutorsTarget) because not all requested executors are 
actually needed")
+  }
   numExecutorsTarget - oldNumExecutorsTarget
 } else if (addTime != NOT_SET && now >= addTime) {
   val delta = addExecutors(maxNeeded)





spark git commit: [SPARK-8084] [SPARKR] Make SparkR scripts fail on error

2015-06-03 Thread shivaram
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 16748694b -> c2c129073


[SPARK-8084] [SPARKR] Make SparkR scripts fail on error

cc shaneknapp pwendell JoshRosen

Author: Shivaram Venkataraman 

Closes #6623 from shivaram/SPARK-8084 and squashes the following commits:

0ec5b26 [Shivaram Venkataraman] Make SparkR scripts fail on error

(cherry picked from commit 0576c3c4ff9d9bbff208e915bee1ac0d4956548c)
Signed-off-by: Shivaram Venkataraman 


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/c2c12907
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/c2c12907
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/c2c12907

Branch: refs/heads/branch-1.4
Commit: c2c129073f97de5c35532177b0811ff0892429b2
Parents: 1674869
Author: Shivaram Venkataraman 
Authored: Wed Jun 3 17:02:16 2015 -0700
Committer: Shivaram Venkataraman 
Committed: Wed Jun 3 17:02:29 2015 -0700

--
 R/create-docs.sh | 3 +++
 R/install-dev.sh | 2 ++
 2 files changed, 5 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/c2c12907/R/create-docs.sh
--
diff --git a/R/create-docs.sh b/R/create-docs.sh
index 4194172..af47c08 100755
--- a/R/create-docs.sh
+++ b/R/create-docs.sh
@@ -23,6 +23,9 @@
 # After running this script the html docs can be found in 
 # $SPARK_HOME/R/pkg/html
 
+set -o pipefail
+set -e
+
 # Figure out where the script is
 export FWDIR="$(cd "`dirname "$0"`"; pwd)"
 pushd $FWDIR

http://git-wip-us.apache.org/repos/asf/spark/blob/c2c12907/R/install-dev.sh
--
diff --git a/R/install-dev.sh b/R/install-dev.sh
index 55ed6f4..b9e2527 100755
--- a/R/install-dev.sh
+++ b/R/install-dev.sh
@@ -26,6 +26,8 @@
 # NOTE(shivaram): Right now we use $SPARK_HOME/R/lib to be the installation 
directory
 # to load the SparkR package on the worker nodes.
 
+set -o pipefail
+set -e
 
 FWDIR="$(cd `dirname $0`; pwd)"
 LIB_DIR="$FWDIR/lib"





spark git commit: [SPARK-8084] [SPARKR] Make SparkR scripts fail on error

2015-06-03 Thread shivaram
Repository: spark
Updated Branches:
  refs/heads/master 51898b515 -> 0576c3c4f


[SPARK-8084] [SPARKR] Make SparkR scripts fail on error

cc shaneknapp pwendell JoshRosen

Author: Shivaram Venkataraman 

Closes #6623 from shivaram/SPARK-8084 and squashes the following commits:

0ec5b26 [Shivaram Venkataraman] Make SparkR scripts fail on error


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/0576c3c4
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/0576c3c4
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/0576c3c4

Branch: refs/heads/master
Commit: 0576c3c4ff9d9bbff208e915bee1ac0d4956548c
Parents: 51898b5
Author: Shivaram Venkataraman 
Authored: Wed Jun 3 17:02:16 2015 -0700
Committer: Shivaram Venkataraman 
Committed: Wed Jun 3 17:02:16 2015 -0700

--
 R/create-docs.sh | 3 +++
 R/install-dev.sh | 2 ++
 2 files changed, 5 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/0576c3c4/R/create-docs.sh
--
diff --git a/R/create-docs.sh b/R/create-docs.sh
index 4194172..af47c08 100755
--- a/R/create-docs.sh
+++ b/R/create-docs.sh
@@ -23,6 +23,9 @@
 # After running this script the html docs can be found in 
 # $SPARK_HOME/R/pkg/html
 
+set -o pipefail
+set -e
+
 # Figure out where the script is
 export FWDIR="$(cd "`dirname "$0"`"; pwd)"
 pushd $FWDIR

http://git-wip-us.apache.org/repos/asf/spark/blob/0576c3c4/R/install-dev.sh
--
diff --git a/R/install-dev.sh b/R/install-dev.sh
index 55ed6f4..b9e2527 100755
--- a/R/install-dev.sh
+++ b/R/install-dev.sh
@@ -26,6 +26,8 @@
 # NOTE(shivaram): Right now we use $SPARK_HOME/R/lib to be the installation 
directory
 # to load the SparkR package on the worker nodes.
 
+set -o pipefail
+set -e
 
 FWDIR="$(cd `dirname $0`; pwd)"
 LIB_DIR="$FWDIR/lib"





spark git commit: [BUILD] Increase Jenkins test timeout

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master 0576c3c4f -> e35cd36e0


[BUILD] Increase Jenkins test timeout

Currently the Hive tests alone take 40m. The right thing to do is
to reduce the test time. However, that is a bigger project, and
we currently have PRs blocked on tests that are timing out.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e35cd36e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e35cd36e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e35cd36e

Branch: refs/heads/master
Commit: e35cd36e08faa43466759c412c420a9d8901d368
Parents: 0576c3c
Author: Andrew Or 
Authored: Wed Jun 3 17:40:14 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 17:40:14 2015 -0700

--
 dev/run-tests-jenkins | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/e35cd36e/dev/run-tests-jenkins
--
diff --git a/dev/run-tests-jenkins b/dev/run-tests-jenkins
index 8b2a44f..3cbd866 100755
--- a/dev/run-tests-jenkins
+++ b/dev/run-tests-jenkins
@@ -47,7 +47,9 @@ 
COMMIT_URL="https://github.com/apache/spark/commit/${ghprbActualCommit}";
 # GitHub doesn't auto-link short hashes when submitted via the API, 
unfortunately. :(
 SHORT_COMMIT_HASH="${ghprbActualCommit:0:7}"
 
-TESTS_TIMEOUT="150m" # format: http://linux.die.net/man/1/timeout
+# format: http://linux.die.net/man/1/timeout
+# must be less than the timeout configured on Jenkins (currently 180m)
+TESTS_TIMEOUT="175m"
 
 # Array to capture all tests to run on the pull request. These tests are held 
under the
 #+ dev/tests/ directory.





spark git commit: [BUILD] Increase Jenkins test timeout

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 c2c129073 -> 96f71b105


[BUILD] Increase Jenkins test timeout

Currently the Hive tests alone take 40m. The right thing to do is
to reduce the test time. However, that is a bigger project, and
we currently have PRs blocked on tests that are timing out.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/96f71b10
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/96f71b10
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/96f71b10

Branch: refs/heads/branch-1.4
Commit: 96f71b105ac9a3e965c61ef4c93f502a718e6332
Parents: c2c1290
Author: Andrew Or 
Authored: Wed Jun 3 17:40:14 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 17:45:07 2015 -0700

--
 dev/run-tests-jenkins | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/96f71b10/dev/run-tests-jenkins
--
diff --git a/dev/run-tests-jenkins b/dev/run-tests-jenkins
index 8b2a44f..3cbd866 100755
--- a/dev/run-tests-jenkins
+++ b/dev/run-tests-jenkins
@@ -47,7 +47,9 @@ 
COMMIT_URL="https://github.com/apache/spark/commit/${ghprbActualCommit}";
 # GitHub doesn't auto-link short hashes when submitted via the API, 
unfortunately. :(
 SHORT_COMMIT_HASH="${ghprbActualCommit:0:7}"
 
-TESTS_TIMEOUT="150m" # format: http://linux.die.net/man/1/timeout
+# format: http://linux.die.net/man/1/timeout
+# must be less than the timeout configured on Jenkins (currently 180m)
+TESTS_TIMEOUT="175m"
 
 # Array to capture all tests to run on the pull request. These tests are held 
under the
 #+ dev/tests/ directory.





spark git commit: [BUILD] Use right branch when checking against Hive

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master e35cd36e0 -> 9cf740f35


[BUILD] Use right branch when checking against Hive

Right now we always run Hive tests in branch-1.4 PRs because we check whether 
the diff against master involves Hive changes. Really we should be comparing 
against the target branch itself.

Author: Andrew Or 

Closes #6629 from andrewor14/build-check-hive and squashes the following 
commits:

450fbbd [Andrew Or] [BUILD] Use right branch when checking against Hive


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9cf740f3
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9cf740f3
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9cf740f3

Branch: refs/heads/master
Commit: 9cf740f357fef00b5251618b20501774852f8a28
Parents: e35cd36
Author: Andrew Or 
Authored: Wed Jun 3 18:08:53 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 18:08:53 2015 -0700

--
 dev/run-tests | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/9cf740f3/dev/run-tests
--
diff --git a/dev/run-tests b/dev/run-tests
index 7dd8d31..d178e2a 100755
--- a/dev/run-tests
+++ b/dev/run-tests
@@ -80,18 +80,19 @@ export SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS 
-Pkinesis-asl"
 # Only run Hive tests if there are SQL changes.
 # Partial solution for SPARK-1455.
 if [ -n "$AMPLAB_JENKINS" ]; then
-  git fetch origin master:master
+  target_branch="$ghprbTargetBranch"
+  git fetch origin "$target_branch":"$target_branch"
 
   # AMP_JENKINS_PRB indicates if the current build is a pull request build.
   if [ -n "$AMP_JENKINS_PRB" ]; then
 # It is a pull request build.
 sql_diffs=$(
-  git diff --name-only master \
+  git diff --name-only "$target_branch" \
   | grep -e "^sql/" -e "^bin/spark-sql" -e "^sbin/start-thriftserver.sh"
 )
 
 non_sql_diffs=$(
-  git diff --name-only master \
+  git diff --name-only "$target_branch" \
   | grep -v -e "^sql/" -e "^bin/spark-sql" -e "^sbin/start-thriftserver.sh"
 )
 





spark git commit: [BUILD] Use right branch when checking against Hive (1.4)

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 96f71b105 -> 584a2ba21


[BUILD] Use right branch when checking against Hive (1.4)

For branch-1.4.

This is identical to #6629 and is not strictly necessary. I'm opening it as a 
PR since it changes Jenkins test behavior and I want to test it out here.

Author: Andrew Or 

Closes #6630 from andrewor14/build-check-hive-1.4 and squashes the following 
commits:

186ec65 [Andrew Or] [BUILD] Use right branch when checking against Hive


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/584a2ba2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/584a2ba2
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/584a2ba2

Branch: refs/heads/branch-1.4
Commit: 584a2ba21cd176af0c7b6eb51acf5f2ffe0d332c
Parents: 96f71b1
Author: Andrew Or 
Authored: Wed Jun 3 18:09:14 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 18:09:14 2015 -0700

--
 dev/run-tests | 7 ---
 1 file changed, 4 insertions(+), 3 deletions(-)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/584a2ba2/dev/run-tests
--
diff --git a/dev/run-tests b/dev/run-tests
index ba1eee6..d38b4e1 100755
--- a/dev/run-tests
+++ b/dev/run-tests
@@ -80,18 +80,19 @@ export SBT_MAVEN_PROFILES_ARGS="$SBT_MAVEN_PROFILES_ARGS 
-Pkinesis-asl"
 # Only run Hive tests if there are SQL changes.
 # Partial solution for SPARK-1455.
 if [ -n "$AMPLAB_JENKINS" ]; then
-  git fetch origin master:master
+  target_branch="$ghprbTargetBranch"
+  git fetch origin "$target_branch":"$target_branch"
 
   # AMP_JENKINS_PRB indicates if the current build is a pull request build.
   if [ -n "$AMP_JENKINS_PRB" ]; then
 # It is a pull request build.
 sql_diffs=$(
-  git diff --name-only master \
+  git diff --name-only "$target_branch" \
   | grep -e "^sql/" -e "^bin/spark-sql" -e "^sbin/start-thriftserver.sh"
 )
 
 non_sql_diffs=$(
-  git diff --name-only master \
+  git diff --name-only "$target_branch" \
   | grep -v -e "^sql/" -e "^bin/spark-sql" -e "^sbin/start-thriftserver.sh"
 )
 





spark git commit: [BUILD] Fix Maven build for Kinesis

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/master 9cf740f35 -> 984ad6014


[BUILD] Fix Maven build for Kinesis

A necessary dependency that is transitively referenced is not
provided, causing compilation failures in builds that enable
the kinesis-asl profile.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/984ad601
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/984ad601
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/984ad601

Branch: refs/heads/master
Commit: 984ad60147c933f2d5a2040c87ae687c14eb1724
Parents: 9cf740f
Author: Andrew Or 
Authored: Wed Jun 3 20:45:31 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 20:45:31 2015 -0700

--
 extras/kinesis-asl/pom.xml | 7 +++
 pom.xml| 2 ++
 2 files changed, 9 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/984ad601/extras/kinesis-asl/pom.xml
--
diff --git a/extras/kinesis-asl/pom.xml b/extras/kinesis-asl/pom.xml
index 4787991..c6f60bc 100644
--- a/extras/kinesis-asl/pom.xml
+++ b/extras/kinesis-asl/pom.xml
@@ -42,6 +42,13 @@
 
 
   org.apache.spark
+  spark-core_${scala.binary.version}
+  ${project.version}
+  test-jar
+  test
+
+
+  org.apache.spark
   spark-streaming_${scala.binary.version}
   ${project.version}
   test-jar

http://git-wip-us.apache.org/repos/asf/spark/blob/984ad601/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 0b1aaad..d03d33b 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1438,6 +1438,8 @@
 2.3
 
   false
+  
+  false
   
 
   





[1/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.4)

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 584a2ba21 -> bfe74b34a


http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientDistributedCacheManagerSuite.scala
--
diff --git 
a/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientDistributedCacheManagerSuite.scala
 
b/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientDistributedCacheManagerSuite.scala
index 24a53f9..804dfec 100644
--- 
a/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientDistributedCacheManagerSuite.scala
+++ 
b/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientDistributedCacheManagerSuite.scala
@@ -19,7 +19,6 @@ package org.apache.spark.deploy.yarn
 
 import java.net.URI
 
-import org.scalatest.FunSuite
 import org.scalatest.mock.MockitoSugar
 import org.mockito.Mockito.when
 
@@ -36,8 +35,10 @@ import org.apache.hadoop.yarn.util.{Records, ConverterUtils}
 import scala.collection.mutable.HashMap
 import scala.collection.mutable.Map
 
+import org.apache.spark.SparkFunSuite
 
-class ClientDistributedCacheManagerSuite extends FunSuite with MockitoSugar {
+
+class ClientDistributedCacheManagerSuite extends SparkFunSuite with 
MockitoSugar {
 
   class MockClientDistributedCacheManager extends 
ClientDistributedCacheManager {
 override def getVisibility(conf: Configuration, uri: URI, statCache: 
Map[URI, FileStatus]):

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala
--
diff --git a/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala 
b/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala
index 6da3e82..01d33c9 100644
--- a/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala
+++ b/yarn/src/test/scala/org/apache/spark/deploy/yarn/ClientSuite.scala
@@ -33,12 +33,12 @@ import org.apache.hadoop.yarn.api.records._
 import org.apache.hadoop.yarn.conf.YarnConfiguration
 import org.mockito.Matchers._
 import org.mockito.Mockito._
-import org.scalatest.{BeforeAndAfterAll, FunSuite, Matchers}
+import org.scalatest.{BeforeAndAfterAll, Matchers}
 
-import org.apache.spark.{SparkException, SparkConf}
+import org.apache.spark.{SparkConf, SparkException, SparkFunSuite}
 import org.apache.spark.util.Utils
 
-class ClientSuite extends FunSuite with Matchers with BeforeAndAfterAll {
+class ClientSuite extends SparkFunSuite with Matchers with BeforeAndAfterAll {
 
   override def beforeAll(): Unit = {
 System.setProperty("SPARK_YARN_MODE", "true")

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
--
diff --git 
a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala 
b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
index 455f101..29b58e3 100644
--- a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
+++ b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnAllocatorSuite.scala
@@ -26,13 +26,13 @@ import org.apache.hadoop.yarn.api.records._
 import org.apache.hadoop.yarn.client.api.AMRMClient
 import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest
 
-import org.apache.spark.SecurityManager
+import org.apache.spark.{SecurityManager, SparkFunSuite}
 import org.apache.spark.SparkConf
 import org.apache.spark.deploy.yarn.YarnSparkHadoopUtil._
 import org.apache.spark.deploy.yarn.YarnAllocator._
 import org.apache.spark.scheduler.SplitInfo
 
-import org.scalatest.{BeforeAndAfterEach, FunSuite, Matchers}
+import org.scalatest.{BeforeAndAfterEach, Matchers}
 
 class MockResolver extends DNSToSwitchMapping {
 
@@ -46,7 +46,7 @@ class MockResolver extends DNSToSwitchMapping {
   def reloadCachedMappings(names: JList[String]) {}
 }
 
-class YarnAllocatorSuite extends FunSuite with Matchers with 
BeforeAndAfterEach {
+class YarnAllocatorSuite extends SparkFunSuite with Matchers with 
BeforeAndAfterEach {
   val conf = new Configuration()
   conf.setClass(
 CommonConfigurationKeysPublic.NET_TOPOLOGY_NODE_SWITCH_MAPPING_IMPL_KEY,

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
--
diff --git 
a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala 
b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
index 8943616..2e2aace 100644
--- a/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
+++ b/yarn/src/test/scala/org/apache/spark/deploy/yarn/YarnClusterSuite.scala
@@ -29,9 +29,9 @@ import com.google.common.io.ByteStreams
 import com.google.common.io.Files
 import org.apache.hadoop.yarn

[3/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.4)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala
 
b/mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala
index 50a7863..732e2c4 100644
--- 
a/mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/ml/regression/LinearRegressionSuite.scala
@@ -17,14 +17,13 @@
 
 package org.apache.spark.ml.regression
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.mllib.linalg.DenseVector
 import org.apache.spark.mllib.util.{LinearDataGenerator, MLlibTestSparkContext}
 import org.apache.spark.mllib.util.TestingUtils._
 import org.apache.spark.sql.{DataFrame, Row}
 
-class LinearRegressionSuite extends FunSuite with MLlibTestSparkContext {
+class LinearRegressionSuite extends SparkFunSuite with MLlibTestSparkContext {
 
   @transient var dataset: DataFrame = _
 

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/mllib/src/test/scala/org/apache/spark/ml/regression/RandomForestRegressorSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/ml/regression/RandomForestRegressorSuite.scala
 
b/mllib/src/test/scala/org/apache/spark/ml/regression/RandomForestRegressorSuite.scala
index fe81e84..b24ecaa 100644
--- 
a/mllib/src/test/scala/org/apache/spark/ml/regression/RandomForestRegressorSuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/ml/regression/RandomForestRegressorSuite.scala
@@ -17,8 +17,7 @@
 
 package org.apache.spark.ml.regression
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.ml.impl.TreeTests
 import org.apache.spark.mllib.regression.LabeledPoint
 import org.apache.spark.mllib.tree.{EnsembleTestHelper, RandomForest => 
OldRandomForest}
@@ -31,7 +30,7 @@ import org.apache.spark.sql.DataFrame
 /**
  * Test suite for [[RandomForestRegressor]].
  */
-class RandomForestRegressorSuite extends FunSuite with MLlibTestSparkContext {
+class RandomForestRegressorSuite extends SparkFunSuite with 
MLlibTestSparkContext {
 
   import RandomForestRegressorSuite.compareAPIs
 
@@ -98,7 +97,7 @@ class RandomForestRegressorSuite extends FunSuite with 
MLlibTestSparkContext {
   */
 }
 
-private object RandomForestRegressorSuite extends FunSuite {
+private object RandomForestRegressorSuite extends SparkFunSuite {
 
   /**
* Train 2 models on the given dataset, one using the old API and one using 
the new API.

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala 
b/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
index 60d8bfe..6fef0b6 100644
--- a/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
+++ b/mllib/src/test/scala/org/apache/spark/ml/tuning/CrossValidatorSuite.scala
@@ -17,8 +17,7 @@
 
 package org.apache.spark.ml.tuning
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.ml.{Estimator, Model}
 import org.apache.spark.ml.classification.LogisticRegression
 import org.apache.spark.ml.evaluation.{BinaryClassificationEvaluator, 
Evaluator}
@@ -29,7 +28,7 @@ import org.apache.spark.mllib.util.MLlibTestSparkContext
 import org.apache.spark.sql.{DataFrame, SQLContext}
 import org.apache.spark.sql.types.StructType
 
-class CrossValidatorSuite extends FunSuite with MLlibTestSparkContext {
+class CrossValidatorSuite extends SparkFunSuite with MLlibTestSparkContext {
 
   @transient var dataset: DataFrame = _
 

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/mllib/src/test/scala/org/apache/spark/ml/tuning/ParamGridBuilderSuite.scala
--
diff --git 
a/mllib/src/test/scala/org/apache/spark/ml/tuning/ParamGridBuilderSuite.scala 
b/mllib/src/test/scala/org/apache/spark/ml/tuning/ParamGridBuilderSuite.scala
index 20aa100..810b700 100644
--- 
a/mllib/src/test/scala/org/apache/spark/ml/tuning/ParamGridBuilderSuite.scala
+++ 
b/mllib/src/test/scala/org/apache/spark/ml/tuning/ParamGridBuilderSuite.scala
@@ -19,11 +19,10 @@ package org.apache.spark.ml.tuning
 
 import scala.collection.mutable
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.apache.spark.ml.param.{ParamMap, TestParams}
 
-class ParamGridBuilderSuite extends FunSuite {
+class ParamGridBuilderSuite extends SparkFunSuite {
 
   val solver = new TestParams()
   import solver.{inputCol, maxIter}

http:/

[5/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.4)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferSecuritySuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferSecuritySuite.scala
 
b/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferSecuritySuite.scala
index 46d2e51..3940527 100644
--- 
a/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferSecuritySuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferSecuritySuite.scala
@@ -31,12 +31,12 @@ import org.apache.spark.network.buffer.{ManagedBuffer, 
NioManagedBuffer}
 import org.apache.spark.network.shuffle.BlockFetchingListener
 import org.apache.spark.network.{BlockDataManager, BlockTransferService}
 import org.apache.spark.storage.{BlockId, ShuffleBlockId}
-import org.apache.spark.{SecurityManager, SparkConf}
+import org.apache.spark.{SecurityManager, SparkConf, SparkFunSuite}
 import org.mockito.Mockito._
 import org.scalatest.mock.MockitoSugar
-import org.scalatest.{FunSuite, ShouldMatchers}
+import org.scalatest.ShouldMatchers
 
-class NettyBlockTransferSecuritySuite extends FunSuite with MockitoSugar with 
ShouldMatchers {
+class NettyBlockTransferSecuritySuite extends SparkFunSuite with MockitoSugar 
with ShouldMatchers {
   test("security default off") {
 val conf = new SparkConf()
   .set("spark.app.id", "app-id")

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferServiceSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferServiceSuite.scala
 
b/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferServiceSuite.scala
index a41f8b7..6f8e8a7 100644
--- 
a/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferServiceSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/network/netty/NettyBlockTransferServiceSuite.scala
@@ -18,11 +18,15 @@
 package org.apache.spark.network.netty
 
 import org.apache.spark.network.BlockDataManager
-import org.apache.spark.{SecurityManager, SparkConf}
+import org.apache.spark.{SecurityManager, SparkConf, SparkFunSuite}
 import org.mockito.Mockito.mock
 import org.scalatest._
 
-class NettyBlockTransferServiceSuite extends FunSuite with BeforeAndAfterEach 
with ShouldMatchers {
+class NettyBlockTransferServiceSuite
+  extends SparkFunSuite
+  with BeforeAndAfterEach
+  with ShouldMatchers {
+
   private var service0: NettyBlockTransferService = _
   private var service1: NettyBlockTransferService = _
 

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache/spark/network/nio/ConnectionManagerSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/network/nio/ConnectionManagerSuite.scala 
b/core/src/test/scala/org/apache/spark/network/nio/ConnectionManagerSuite.scala
index 02424c5..5e364cc 100644
--- 
a/core/src/test/scala/org/apache/spark/network/nio/ConnectionManagerSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/network/nio/ConnectionManagerSuite.scala
@@ -24,15 +24,13 @@ import scala.concurrent.duration._
 import scala.concurrent.{Await, TimeoutException}
 import scala.language.postfixOps
 
-import org.scalatest.FunSuite
-
-import org.apache.spark.{SecurityManager, SparkConf}
+import org.apache.spark.{SecurityManager, SparkConf, SparkFunSuite}
 import org.apache.spark.util.Utils
 
 /**
   * Test the ConnectionManager with various security settings.
   */
-class ConnectionManagerSuite extends FunSuite {
+class ConnectionManagerSuite extends SparkFunSuite {
 
   test("security default off") {
 val conf = new SparkConf

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache/spark/rdd/AsyncRDDActionsSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/rdd/AsyncRDDActionsSuite.scala 
b/core/src/test/scala/org/apache/spark/rdd/AsyncRDDActionsSuite.scala
index f2b0ea1..ec99f2a 100644
--- a/core/src/test/scala/org/apache/spark/rdd/AsyncRDDActionsSuite.scala
+++ b/core/src/test/scala/org/apache/spark/rdd/AsyncRDDActionsSuite.scala
@@ -23,13 +23,13 @@ import scala.concurrent.{Await, TimeoutException}
 import scala.concurrent.duration.Duration
 import scala.concurrent.ExecutionContext.Implicits.global
 
-import org.scalatest.{BeforeAndAfterAll, FunSuite}
+import org.scalatest.BeforeAndAfterAll
 import org.scalatest.concurrent.Timeouts
 import org.scalatest.time.SpanSugar._
 
-import org.apache.spark.{SparkContext, SparkException, LocalSparkContext}
+import org.apache.spark.{LocalSparkContext, SparkContext, SparkException, 
SparkFunSuite}
 
-class AsyncRDDAction

[4/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.4)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache/spark/util/collection/AppendOnlyMapSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/util/collection/AppendOnlyMapSuite.scala 
b/core/src/test/scala/org/apache/spark/util/collection/AppendOnlyMapSuite.scala
index cb99d14..a2a6d70 100644
--- 
a/core/src/test/scala/org/apache/spark/util/collection/AppendOnlyMapSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/util/collection/AppendOnlyMapSuite.scala
@@ -21,9 +21,9 @@ import java.util.Comparator
 
 import scala.collection.mutable.HashSet
 
-import org.scalatest.FunSuite
+import org.apache.spark.SparkFunSuite
 
-class AppendOnlyMapSuite extends FunSuite {
+class AppendOnlyMapSuite extends SparkFunSuite {
   test("initialization") {
 val goodMap1 = new AppendOnlyMap[Int, Int](1)
 assert(goodMap1.size === 0)

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala 
b/core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala
index ffc2069..69dbfa9 100644
--- a/core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala
+++ b/core/src/test/scala/org/apache/spark/util/collection/BitSetSuite.scala
@@ -17,9 +17,9 @@
 
 package org.apache.spark.util.collection
 
-import org.scalatest.FunSuite
+import org.apache.spark.SparkFunSuite
 
-class BitSetSuite extends FunSuite {
+class BitSetSuite extends SparkFunSuite {
 
   test("basic set and get") {
 val setBits = Seq(0, 9, 1, 10, 90, 96)

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache/spark/util/collection/ChainedBufferSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/util/collection/ChainedBufferSuite.scala 
b/core/src/test/scala/org/apache/spark/util/collection/ChainedBufferSuite.scala
index c0c38cd..05306f4 100644
--- 
a/core/src/test/scala/org/apache/spark/util/collection/ChainedBufferSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/util/collection/ChainedBufferSuite.scala
@@ -19,10 +19,11 @@ package org.apache.spark.util.collection
 
 import java.nio.ByteBuffer
 
-import org.scalatest.FunSuite
 import org.scalatest.Matchers._
 
-class ChainedBufferSuite extends FunSuite {
+import org.apache.spark.SparkFunSuite
+
+class ChainedBufferSuite extends SparkFunSuite {
   test("write and read at start") {
 // write from start of source array
 val buffer = new ChainedBuffer(8)

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache/spark/util/collection/CompactBufferSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/util/collection/CompactBufferSuite.scala 
b/core/src/test/scala/org/apache/spark/util/collection/CompactBufferSuite.scala
index 6c956d9..bc54799 100644
--- 
a/core/src/test/scala/org/apache/spark/util/collection/CompactBufferSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/util/collection/CompactBufferSuite.scala
@@ -17,9 +17,9 @@
 
 package org.apache.spark.util.collection
 
-import org.scalatest.FunSuite
+import org.apache.spark.SparkFunSuite
 
-class CompactBufferSuite extends FunSuite {
+class CompactBufferSuite extends SparkFunSuite {
   test("empty buffer") {
 val b = new CompactBuffer[Int]
 assert(b.size === 0)

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
--
diff --git 
a/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
 
b/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
index dff8f3d..79eba61 100644
--- 
a/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
+++ 
b/core/src/test/scala/org/apache/spark/util/collection/ExternalAppendOnlyMapSuite.scala
@@ -19,12 +19,10 @@ package org.apache.spark.util.collection
 
 import scala.collection.mutable.ArrayBuffer
 
-import org.scalatest.FunSuite
-
 import org.apache.spark._
 import org.apache.spark.io.CompressionCodec
 
-class ExternalAppendOnlyMapSuite extends FunSuite with LocalSparkContext {
+class ExternalAppendOnlyMapSuite extends SparkFunSuite with LocalSparkContext {
   private val allCompressionCodecs = CompressionCodec.ALL_COMPRESSION_CODECS
   private def createCombiner[T](i: T) = ArrayBuffer[T](i)
   private def mergeValue[T](buffer: ArrayBuffer[T], i: T): ArrayBuffer[T] = 
buffer += i

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/core/src/test/scala/org/apache

[6/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.4)

2015-06-03 Thread andrewor14
[SPARK-7558] Demarcate tests in unit-tests.log (1.4)

This includes the following commits:

original: 9eb222c
hotfix1: 8c99793
hotfix2: a4f2412
scalastyle check: 609c492

---
Original patch #6441
Branch-1.3 patch #6602

Author: Andrew Or 

Closes #6598 from andrewor14/demarcate-tests-1.4 and squashes the following 
commits:

4c3c566 [Andrew Or] Merge branch 'branch-1.4' of github.com:apache/spark into 
demarcate-tests-1.4
e217b78 [Andrew Or] [SPARK-7558] Guard against direct uses of FunSuite / 
FunSuiteLike
46d4361 [Andrew Or] Various whitespace changes (minor)
3d9bf04 [Andrew Or] Make all test suites extend SparkFunSuite instead of 
FunSuite
eaa520e [Andrew Or] Fix tests?
b4d93de [Andrew Or] Fix tests
634a777 [Andrew Or] Fix log message
a932e8d [Andrew Or] Fix manual things that cannot be covered through automation
8bc355d [Andrew Or] Add core tests as dependencies in all modules
75d361f [Andrew Or] Introduce base abstract class for all test suites
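
The bulk of this patch is the mechanical migration of every suite from FunSuite to the new base class; the interesting part is the base class itself, which logs a marker before and after each test so that suites can be told apart in unit-tests.log. A rough sketch of that idea is below, assuming a ScalaTest 2.x dependency; DemarcatedFunSuite and ExampleSuite are illustrative names, and the real SparkFunSuite additionally mixes in Spark's Logging trait rather than using println.

import org.scalatest.{FunSuite, Outcome}

// Illustrative base class: wrap every test in begin/end markers.
abstract class DemarcatedFunSuite extends FunSuite {
  override protected def withFixture(test: NoArgTest): Outcome = {
    val suiteName = this.getClass.getName
    try {
      println(s"\n===== TEST OUTPUT FOR $suiteName: '${test.name}' =====\n")
      test() // run the actual test body
    } finally {
      println(s"\n===== FINISHED $suiteName: '${test.name}' =====\n")
    }
  }
}

// Suites then extend the base class instead of FunSuite directly, which is
// what the per-module changes in this commit do.
class ExampleSuite extends DemarcatedFunSuite {
  test("addition works") {
    assert(1 + 1 === 2)
  }
}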


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/bfe74b34
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/bfe74b34
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/bfe74b34

Branch: refs/heads/branch-1.4
Commit: bfe74b34a6ac6dada8749ffd9bcdc5f228741ea7
Parents: 584a2ba
Author: Andrew Or 
Authored: Wed Jun 3 20:46:44 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 20:46:44 2015 -0700

--
 bagel/pom.xml   |  7 +++
 .../org/apache/spark/bagel/BagelSuite.scala |  4 +-
 core/pom.xml|  6 +++
 .../org/apache/spark/AccumulatorSuite.scala |  3 +-
 .../org/apache/spark/CacheManagerSuite.scala|  4 +-
 .../org/apache/spark/CheckpointSuite.scala  |  4 +-
 .../org/apache/spark/ContextCleanerSuite.scala  |  4 +-
 .../org/apache/spark/DistributedSuite.scala |  3 +-
 .../scala/org/apache/spark/DriverSuite.scala|  3 +-
 .../spark/ExecutorAllocationManagerSuite.scala  |  8 +++-
 .../scala/org/apache/spark/FailureSuite.scala   |  4 +-
 .../org/apache/spark/FileServerSuite.scala  |  3 +-
 .../test/scala/org/apache/spark/FileSuite.scala |  3 +-
 .../org/apache/spark/FutureActionSuite.scala|  8 +++-
 .../apache/spark/HeartbeatReceiverSuite.scala   |  3 +-
 .../apache/spark/ImplicitOrderingSuite.scala|  4 +-
 .../org/apache/spark/JobCancellationSuite.scala |  4 +-
 .../apache/spark/MapOutputTrackerSuite.scala|  3 +-
 .../org/apache/spark/PartitioningSuite.scala|  4 +-
 .../org/apache/spark/SSLOptionsSuite.scala  |  4 +-
 .../org/apache/spark/SecurityManagerSuite.scala |  4 +-
 .../scala/org/apache/spark/ShuffleSuite.scala   |  3 +-
 .../scala/org/apache/spark/SparkConfSuite.scala |  3 +-
 .../apache/spark/SparkContextInfoSuite.scala|  4 +-
 .../SparkContextSchedulerCreationSuite.scala|  4 +-
 .../org/apache/spark/SparkContextSuite.scala|  4 +-
 .../scala/org/apache/spark/SparkFunSuite.scala  | 48 
 .../org/apache/spark/StatusTrackerSuite.scala   |  4 +-
 .../scala/org/apache/spark/ThreadingSuite.scala |  3 +-
 .../scala/org/apache/spark/UnpersistSuite.scala |  3 +-
 .../spark/api/python/PythonBroadcastSuite.scala |  6 +--
 .../spark/api/python/PythonRDDSuite.scala   |  4 +-
 .../spark/api/python/SerDeUtilSuite.scala   |  6 +--
 .../apache/spark/broadcast/BroadcastSuite.scala |  6 +--
 .../org/apache/spark/deploy/ClientSuite.scala   |  5 +-
 .../apache/spark/deploy/JsonProtocolSuite.scala |  5 +-
 .../spark/deploy/LogUrlsStandaloneSuite.scala   |  6 +--
 .../apache/spark/deploy/PythonRunnerSuite.scala |  5 +-
 .../apache/spark/deploy/SparkSubmitSuite.scala  |  8 +++-
 .../spark/deploy/SparkSubmitUtilsSuite.scala|  5 +-
 .../deploy/history/FsHistoryProviderSuite.scala |  6 +--
 .../deploy/history/HistoryServerSuite.scala |  6 +--
 .../spark/deploy/master/MasterSuite.scala   |  6 +--
 .../deploy/rest/StandaloneRestSubmitSuite.scala |  4 +-
 .../deploy/rest/SubmitRestProtocolSuite.scala   |  5 +-
 .../spark/deploy/worker/CommandUtilsSuite.scala |  5 +-
 .../spark/deploy/worker/DriverRunnerTest.scala  |  5 +-
 .../deploy/worker/ExecutorRunnerTest.scala  |  6 +--
 .../deploy/worker/WorkerArgumentsTest.scala |  5 +-
 .../spark/deploy/worker/WorkerSuite.scala   |  6 +--
 .../deploy/worker/WorkerWatcherSuite.scala  |  5 +-
 .../spark/deploy/worker/ui/LogPageSuite.scala   |  6 ++-
 .../spark/executor/TaskMetricsSuite.scala   |  4 +-
 .../input/WholeTextFileRecordReaderSuite.scala  |  5 +-
 .../apache/spark/io/CompressionCodecSuite.scala |  5 +-
 .../spark/metrics/InputOutputMetricsSuite.scala |  6 +--
 .../spark/metrics/MetricsConfigSuite.scala  |  6 ++-
 .../spark/metrics/MetricsSystemSuite.scala  |  6 +--
 .../netty/NettyBlockTransferSecuritySuite.scala |  6 +--
 .../netty/NettyBlockTransferServiceSuite.scala  |  8 +++-
 .../network/nio/Conne

[2/6] spark git commit: [SPARK-7558] Demarcate tests in unit-tests.log (1.4)

2015-06-03 Thread andrewor14
http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/repl/scala-2.10/src/test/scala/org/apache/spark/repl/ReplSuite.scala
--
diff --git 
a/repl/scala-2.10/src/test/scala/org/apache/spark/repl/ReplSuite.scala 
b/repl/scala-2.10/src/test/scala/org/apache/spark/repl/ReplSuite.scala
index 934daae..50fd43a 100644
--- a/repl/scala-2.10/src/test/scala/org/apache/spark/repl/ReplSuite.scala
+++ b/repl/scala-2.10/src/test/scala/org/apache/spark/repl/ReplSuite.scala
@@ -22,13 +22,12 @@ import java.net.URLClassLoader
 
 import scala.collection.mutable.ArrayBuffer
 
-import org.scalatest.FunSuite
-import org.apache.spark.SparkContext
+import org.apache.spark.{SparkContext, SparkFunSuite}
 import org.apache.commons.lang3.StringEscapeUtils
 import org.apache.spark.util.Utils
 
 
-class ReplSuite extends FunSuite {
+class ReplSuite extends SparkFunSuite {
 
   def runInterpreter(master: String, input: String): String = {
 val CONF_EXECUTOR_CLASSPATH = "spark.executor.extraClassPath"

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
--
diff --git 
a/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala 
b/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
index 14f5e9e..9ecc7c2 100644
--- a/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
+++ b/repl/scala-2.11/src/test/scala/org/apache/spark/repl/ReplSuite.scala
@@ -24,14 +24,13 @@ import scala.collection.mutable.ArrayBuffer
 import scala.concurrent.duration._
 import scala.tools.nsc.interpreter.SparkILoop
 
-import org.scalatest.FunSuite
 import org.apache.commons.lang3.StringEscapeUtils
-import org.apache.spark.SparkContext
+import org.apache.spark.{SparkContext, SparkFunSuite}
 import org.apache.spark.util.Utils
 
 
 
-class ReplSuite extends FunSuite {
+class ReplSuite extends SparkFunSuite {
 
   def runInterpreter(master: String, input: String): String = {
 val CONF_EXECUTOR_CLASSPATH = "spark.executor.extraClassPath"

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/repl/src/test/scala/org/apache/spark/repl/ExecutorClassLoaderSuite.scala
--
diff --git 
a/repl/src/test/scala/org/apache/spark/repl/ExecutorClassLoaderSuite.scala 
b/repl/src/test/scala/org/apache/spark/repl/ExecutorClassLoaderSuite.scala
index c709cde..a58eda1 100644
--- a/repl/src/test/scala/org/apache/spark/repl/ExecutorClassLoaderSuite.scala
+++ b/repl/src/test/scala/org/apache/spark/repl/ExecutorClassLoaderSuite.scala
@@ -25,7 +25,6 @@ import scala.language.implicitConversions
 import scala.language.postfixOps
 
 import org.scalatest.BeforeAndAfterAll
-import org.scalatest.FunSuite
 import org.scalatest.concurrent.Interruptor
 import org.scalatest.concurrent.Timeouts._
 import org.scalatest.mock.MockitoSugar
@@ -35,7 +34,7 @@ import org.apache.spark._
 import org.apache.spark.util.Utils
 
 class ExecutorClassLoaderSuite
-  extends FunSuite
+  extends SparkFunSuite
   with BeforeAndAfterAll
   with MockitoSugar
   with Logging {

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/scalastyle-config.xml
--
diff --git a/scalastyle-config.xml b/scalastyle-config.xml
index dd4eb8c..e0bafa1 100644
--- a/scalastyle-config.xml
+++ b/scalastyle-config.xml
@@ -159,4 +159,11 @@
 
   
   
+  
+  
+   
+^FunSuite[A-Za-z]*$
+   
+   Tests must extend org.apache.spark.SparkFunSuite 
instead.
+  
 

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/sql/catalyst/pom.xml
--
diff --git a/sql/catalyst/pom.xml b/sql/catalyst/pom.xml
index 5c322d0..d9e1cdb 100644
--- a/sql/catalyst/pom.xml
+++ b/sql/catalyst/pom.xml
@@ -52,6 +52,13 @@
 
 
   org.apache.spark
+  spark-core_${scala.binary.version}
+  ${project.version}
+  test-jar
+  test
+
+
+  org.apache.spark
   spark-unsafe_${scala.binary.version}
   ${project.version}
 

http://git-wip-us.apache.org/repos/asf/spark/blob/bfe74b34/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala
--
diff --git 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala
 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala
index ea82cd2..c046dbf 100644
--- 
a/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala
+++ 
b/sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/DistributionSuite.scala
@@ -17,14 +17,13 @@
 
 package org.apache.spark.sql.catalyst
 
-import org.scalatest.FunSuite
-
+import org.apache.spark.SparkFunSuite
 import org.a

spark git commit: [BUILD] Fix Maven build for Kinesis

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.4 bfe74b34a -> 84da65319


[BUILD] Fix Maven build for Kinesis

A necessary dependency that is transitively referenced is not
provided, causing compilation failures in builds that enable
the kinesis-asl profile.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/84da6531
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/84da6531
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/84da6531

Branch: refs/heads/branch-1.4
Commit: 84da653192a2d9edb82d0dbe50f577c4dc6a0c78
Parents: bfe74b3
Author: Andrew Or 
Authored: Wed Jun 3 20:45:31 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 20:47:53 2015 -0700

--
 extras/kinesis-asl/pom.xml | 7 +++
 pom.xml| 2 ++
 2 files changed, 9 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/84da6531/extras/kinesis-asl/pom.xml
--
diff --git a/extras/kinesis-asl/pom.xml b/extras/kinesis-asl/pom.xml
index 25847a1..f05643bfc 100644
--- a/extras/kinesis-asl/pom.xml
+++ b/extras/kinesis-asl/pom.xml
@@ -42,6 +42,13 @@
 
 
   org.apache.spark
+  spark-core_${scala.binary.version}
+  ${project.version}
+  test-jar
+  test
+
+
+  org.apache.spark
   spark-streaming_${scala.binary.version}
   ${project.version}
   test-jar

http://git-wip-us.apache.org/repos/asf/spark/blob/84da6531/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 9f99e28..c873f68 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1426,6 +1426,8 @@
 2.3
 
   false
+  
+  false
   
 
   





spark git commit: [BUILD] Fix Maven build for Kinesis

2015-06-03 Thread andrewor14
Repository: spark
Updated Branches:
  refs/heads/branch-1.3 744599609 -> ce137b8ed


[BUILD] Fix Maven build for Kinesis

A necessary dependency that is transitively referenced is not
provided, causing compilation failures in builds that enable
the kinesis-asl profile.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/ce137b8e
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/ce137b8e
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/ce137b8e

Branch: refs/heads/branch-1.3
Commit: ce137b8ed3b240b7516046699ac96daa55ddc129
Parents: 7445996
Author: Andrew Or 
Authored: Wed Jun 3 20:45:31 2015 -0700
Committer: Andrew Or 
Committed: Wed Jun 3 20:48:09 2015 -0700

--
 extras/kinesis-asl/pom.xml | 7 +++
 pom.xml| 2 ++
 2 files changed, 9 insertions(+)
--


http://git-wip-us.apache.org/repos/asf/spark/blob/ce137b8e/extras/kinesis-asl/pom.xml
--
diff --git a/extras/kinesis-asl/pom.xml b/extras/kinesis-asl/pom.xml
index 3ac273d..4eea37d 100644
--- a/extras/kinesis-asl/pom.xml
+++ b/extras/kinesis-asl/pom.xml
@@ -42,6 +42,13 @@
 
 
   org.apache.spark
+  spark-core_${scala.binary.version}
+  ${project.version}
+  test-jar
+  test
+
+
+  org.apache.spark
   spark-streaming_${scala.binary.version}
   ${project.version}
   test-jar

http://git-wip-us.apache.org/repos/asf/spark/blob/ce137b8e/pom.xml
--
diff --git a/pom.xml b/pom.xml
index 788201b..70d4d52 100644
--- a/pom.xml
+++ b/pom.xml
@@ -1350,6 +1350,8 @@
 2.2
 
   false
+  
+  false
   
 
   





spark git commit: [MAINTENANCE] Closes #2854

2015-06-03 Thread pwendell
Repository: spark
Updated Branches:
  refs/heads/branch-0.9 2f03fc17c -> e63783a23


[MAINTENANCE] Closes #2854

This commit exists to close a pull request on GitHub.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/e63783a2
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/e63783a2
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/e63783a2

Branch: refs/heads/branch-0.9
Commit: e63783a23f49fafc5d9f464ecfd107d19bd87787
Parents: 2f03fc1
Author: Patrick Wendell 
Authored: Wed Jun 3 23:35:16 2015 -0700
Committer: Patrick Wendell 
Committed: Wed Jun 3 23:35:16 2015 -0700

--

--






spark git commit: MAINTENANCE: Automated closing of pull requests.

2015-06-03 Thread pwendell
Repository: spark
Updated Branches:
  refs/heads/master 984ad6014 -> 9982d453c


MAINTENANCE: Automated closing of pull requests.

This commit exists to close the following pull requests on GitHub:

Closes #5976 (close requested by 'JoshRosen')
Closes #4576 (close requested by 'pwendell')
Closes #3430 (close requested by 'pwendell')
Closes #2495 (close requested by 'pwendell')


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/9982d453
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/9982d453
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/9982d453

Branch: refs/heads/master
Commit: 9982d453c39e50aedae7d01e4c38fab1b2bc6be0
Parents: 984ad60
Author: Patrick Wendell 
Authored: Wed Jun 3 23:45:06 2015 -0700
Committer: Patrick Wendell 
Committed: Wed Jun 3 23:45:06 2015 -0700

--

--


